
Advanced Linear Algebra

Foundations to Frontiers

Robert van de Geijn


The University of Texas at Austin

Margaret Myers
The University of Texas at Austin

August 17, 2020


Edition: Draft Edition 2019–2020
Website: ulaff.net
©2019–2020 Robert van de Geijn and Margaret Myers
Permission is granted to copy, distribute and/or modify this document under the terms of
the GNU Free Documentation License, Version 1.2 or any later version published by the Free
Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
Texts. A copy of the license is included in the appendix entitled “GNU Free Documentation
License.” All trademarks™ are the registered® marks of their respective owners.
Acknowledgements

We would like to thank the people who created PreTeXt, the authoring system used to
typeset these materials. We applaud you!

Preface

Robert van de Geijn


Maggie Myers
Austin, 2020

Contents

Acknowledgements vii

Preface viii

0 Getting Started 1

I Orthogonality

1 Norms 12

2 The Singular Value Decomposition 89

3 The QR Decomposition 151

4 Linear Least Squares 212

II Solving Linear Systems

5 The LU and Cholesky Factorizations 255

6 Numerical Stability 331

7 Solving Sparse Linear Systems 382


8 Descent Methods 416

III The Algebraic Eigenvalue Problem

9 Eigenvalues and Eigenvectors 460

10 Practical Solution of the Hermitian Eigenvalue Problem 514

11 Computing the SVD 556

12 Attaining High Performance 585

A Are you ready? 621

B Notation 622

C Knowledge from Numerical Analysis 623

D GNU Free Documentation License 625

References 633

Index 637
Week 0

Getting Started

0.1 Opening Remarks


0.1.1 Welcome

YouTube: https://www.youtube.com/watch?v=KzCTM vxtQA


Linear algebra is one of the fundamental tools for computational and data scientists. In
Advanced Linear Algebra: Foundations to Frontiers (ALAFF), you build your knowledge,
understanding, and skills in linear algebra, practical algorithms for matrix computations,
and how floating-point arithmetic, as performed by computers, affects correctness.
The materials are organized into Weeks that correspond to a chunk of information that
is covered in a typical on-campus week. These weeks are arranged into three parts:
Part I: Orthogonality
The Singular Value Decomposition (SVD) is
possibly the most important result in linear
algebra, yet too advanced to cover in an in-
troductory undergraduate course. To be able
to get to this topic as quickly as possible, we
start by focusing on orthogonality, which is
at the heart of image compression, Google’s
page rank algorithm, and linear least-squares
approximation.


Part II: Solving Linear Systems


Solving linear systems, via direct or iterative
methods, is at the core of applications in com-
putational science and machine learning. We
also leverage these topics to introduce numer-
ical stability of algorithms: the classical study
that qualifies and quantifies the "correctness"
of an algorithm in the presence of floating
point computation and approximation. Along
the way, we discuss how to restructure algo-
rithms so that they can attain high perfor-
mance on modern CPUs.

Part III: Eigenvalues and Eigenvectors


Many problems in science have the property
that if one looks at them in just the right way
(in the right basis), they greatly simplify and/
or decouple into simpler subproblems. Eigenvalues and eigenvectors are the key to discovering how to view a linear transformation, represented by a matrix, in that special way. Algorithms for computing them also are the key to practical algorithms for computing the SVD.

In this week (Week 0), we walk you through some of the basic course information and
help you set up for learning. The week itself is structured like future weeks, so that you
become familiar with that structure.

0.1.2 Outline Week 0


Each week is structured so that we give the outline for the week immediately after the
"launch:"
• 0.1 Opening Remarks

◦ 0.1.1 Welcome
◦ 0.1.2 Outline Week 0
◦ 0.1.3 What you will learn

• 0.2 Setting Up For ALAFF

◦ 0.2.1 Accessing these notes
◦ 0.2.2 Cloning the ALAFF repository
◦ 0.2.3 MATLAB
◦ 0.2.4 Setting up to implement in C (optional)

• 0.3 Enrichments

◦ 0.3.1 Ten surprises from numerical linear algebra
◦ 0.3.2 Best algorithms of the 20th century

• 0.4 Wrap Up

◦ 0.4.1 Additional Homework
◦ 0.4.2 Summary

0.1.3 What you will learn


The third unit of each week informs you of what you will learn. This describes the knowledge
and skills that you can expect to acquire. If you return to this unit after you complete the
week, you will be able to use the below to self-assess.
Upon completion of this week, you should be able to

• Navigate the materials.

• Access additional materials from GitHub.

• Track your homework and progress.

• Register for MATLAB online.

• Recognize the structure of a typical week.

0.2 Setting Up For ALAFF


0.2.1 Accessing these notes
For information regarding these and our other materials, visit ulaff.net.
These notes are available in a number of formats:

• As an online book authored with PreTeXt at http://www.cs.utexas.edu/users/flame/laff/alaff/.

• As a PDF at http://www.cs.utexas.edu/users/flame/laff/alaff/ALAFF.pdf.
If you download this PDF and place it in just the right folder of the materials you will clone from GitHub (see next unit), the links in the PDF to the downloaded material will work.
We will be updating the materials frequently as people report typos and we receive feedback from learners. Please consider the environment before you print a copy...

• Eventually, if we perceive there is demand, we may offer a printed copy of these notes from Lulu.com, a self-publishing service. This will not happen until Summer 2020, at the earliest.

0.2.2 Cloning the ALAFF repository


We have placed all materials on GitHub, a development environment for software projects.
In our case, we use it to disseminate the various activities associated with this course.
On the computer on which you have chosen to work, "clone" the GitHub repository for
this course:

• Visit https://github.com/ULAFF/ALAFF

• Click on the clone button and copy https://github.com/ULAFF/ALAFF.git.

• On the computer where you intend to work, in a terminal session on the command line in the directory where you would like to place the materials, execute

git clone https://github.com/ULAFF/ALAFF.git

This will create a local copy (clone) of the materials.

• Sometimes we will update some of the files from the repository. When this happens you will want to execute, in the cloned directory,

git stash save

which saves any local changes you have made, followed by

git pull

which updates your local copy of the repository, followed by

git stash pop

which restores local changes you made. This last step may require you to "merge" files that were changed in the repository that conflict with local changes.

Upon completion of the cloning, you will have a directory structure similar to that given in
Figure 0.2.2.1.

Figure 0.2.2.1 Directory structure for your ALAFF materials. In this example, we cloned
the repository in Robert’s home directory, rvdg.

0.2.3 MATLAB
We will use Matlab to translate algorithms into code and to experiment with linear algebra.
There are a number of ways in which you can use Matlab:

• Via MATLAB that is installed on the same computer as you will execute your perfor-
mance experiments. This is usually called a "desktop installation of Matlab."

• Via MATLAB Online. You will have to transfer files from the computer where you are
performing your experiments to MATLAB Online. You could try to set up MATLAB
Drive, which allows you to share files easily between computers and with MATLAB
Online. Be warned that there may be a delay in when files show up, and as a result
you may be using old data to plot if you aren’t careful!

If you are using these materials as part of an offering of the Massive Open Online Course
(MOOC) titled "Advanced Linear Algebra: Foundations to Frontiers," you will be given a
temporary license to Matlab, courtesy of MathWorks. In this case, there will be additional
instructions on how to set up MATLAB Online, in the Unit on edX that corresponds to this
section.
You need relatively little familiarity with MATLAB in order to learn what we want you
to learn in this course. So, you could just skip these tutorials altogether, and come back to
them if you find you want to know more about MATLAB and its programming language
(M-script).
Below you find a few short videos that introduce you to MATLAB. For a more compre-
hensive tutorial, you may want to visit MATLAB Tutorials at MathWorks and click "Launch
Tutorial".

What is MATLAB?

https://www.youtube.com/watch?v=2sB-NMD9Qhk

Getting Started with MATLAB Online

https://www.youtube.com/watch?v=4shp284pGc8

MATLAB Variables

https://www.youtube.com/watch?v=gPIsIzHJA9I

MATLAB as a Calculator

https://www.youtube.com/watch?v=K9xy5kQHDBo

Managing Files with MATLAB Online

https://www.youtube.com/watch?v=mqYwMnM-x5Q
Remark 0.2.3.1 Some of you may choose to use MATLAB on your personal computer while
others may choose to use MATLAB Online. Those who use MATLAB Online will need to
transfer some of the downloaded materials to that platform.

0.2.4 Setting up to implement in C (optional)


You may want to return to this unit later in the course. We are still working on adding
programming exercises that require C implementation.
In some of the enrichments in these notes and the final week on how to attain performance, we suggest implementing algorithms that are encountered in C. Those who intend to pursue these activities will want to install a Basic Linear Algebra Subprograms (BLAS) library and our libflame library (which not only provides higher level linear algebra functionality, but also allows one to program in a manner that mirrors how we present algorithms).

0.2.4.1 Installing the BLAS


The Basic Linear Algebra Subprograms (BLAS) are an interface to fundamental linear alge-
bra operations. The idea is that if we write our software in terms of calls to these routines
and vendors optimize an implementation of the BLAS, then our software can be easily ported
to different computer architectures while achieving reasonable performance.
A popular and high-performing open source implementation of the BLAS is provided by
our BLAS-like Library Instantiation Software (BLIS). The following steps will install BLIS
if you are using the Linux OS (on a Mac, there may be a few more steps, which are discussed
later in this unit.)

• Visit the BLIS Github repository.

• Click on the clone button and copy https://github.com/flame/blis.git.

• In a terminal session, in your home directory, enter

git clone https://github.com/flame/blis.git

(to make sure you get the address right, you will want to paste the address you copied in the last step.)

• Change directory to blis:

cd blis

• Indicate a specific version of BLIS so that we all are using the same release:

git checkout pfhp

• Configure, build, and install with OpenMP turned on.

./configure -p ~/blis auto

make -j8
make check -j8
make install

The -p ~/blis installs the library in the subdirectory ~/blis of your home directory, which is where the various exercises in the course expect it to reside.

• If you run into a problem while installing BLIS, you may want to consult https://github.com/flame/blis/blob/master/docs/BuildSystem.md.

On Mac OS-X

• You may need to install Homebrew, a program that helps you install various software on your Mac. Warning: you may need "root" privileges to do so.

$ /usr/bin/ruby -e $(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/maste

Keep an eye on the output to see if the "Command Line Tools" get installed. This may not be installed if you already have Xcode Command line tools installed. If this happens, post in the "Discussion" for this unit, and see if someone can help you out.

• Use Homebrew to install the gcc compiler:

$ brew install gcc

Check if gcc installation overrides clang:

$ which gcc

The output should be /usr/local/bin. If it isn't, you may want to add /usr/local/bin to "the path." I did so by inserting

export PATH=/usr/local/bin:$PATH

into the file .bash_profile in my home directory. (Notice the "period" before "bash_profile".)

• Now you can go back to the beginning of this unit, and follow the instructions to install BLIS.

0.2.4.2 Installing libflame


Higher level linear algebra functionality, such as the various decompositions we will discuss
in this course, is supported by the LAPACK library [1]. Our libflame library is an imple-
mentation of LAPACK that also exports an API for representing algorithms in code in a
way that closely reflects the FLAME notation to which you will be introduced in the course.
The libflame library can be cloned from

• https://github.com/flame/libflame.

by executing

git clone https://github.com/flame/libflame.git

in the command window.

Instructions on how to install it are at

• https://github.com/flame/libflame/blob/master/INSTALL.

Here is what I had to do on my MacBook Pro (OSX Catalina):

./configure --disable-autodetect-f77-ldflags --disable-autodetect-f77-name-mangling --prefix=$

make -j8
make install

This will take a while!

0.3 Enrichments
In each week, we include "enrichments" that allow the participant to go beyond.

0.3.1 Ten surprises from numerical linear algebra


You may find the following list of insights regarding numerical linear algebra, compiled by
John D. Cook, interesting:

• John D. Cook. Ten surprises from numerical linear algebra. 2010.

0.3.2 Best algorithms of the 20th century


An article published in SIAM News, a publication of the Society for Industrial and Applied
Mathematics, lists the ten most important algorithms of the 20th century [10]:

1. 1946: John von Neumann, Stan Ulam, and Nick Metropolis, all at the Los Alamos
Scientific Laboratory, cook up the Metropolis algorithm, also known as the Monte Carlo
method.

2. 1947: George Dantzig, at the RAND Corporation, creates the simplex method for
linear programming.

3. 1950: Magnus Hestenes, Eduard Stiefel, and Cornelius Lanczos, all from the Institute
for Numerical Analysis at the National Bureau of Standards, initiate the development
of Krylov subspace iteration methods.

4. 1951: Alston Householder of Oak Ridge National Laboratory formalizes the decompo-
sitional approach to matrix computations.

5. 1957: John Backus leads a team at IBM in developing the Fortran optimizing compiler.

6. 1959–61: J.G.F. Francis of Ferranti Ltd., London, finds a stable method for computing
eigenvalues, known as the QR algorithm.

7. 1962: Tony Hoare of Elliott Brothers, Ltd., London, presents Quicksort.

8. 1965: James Cooley of the IBM T.J. Watson Research Center and John Tukey of
Princeton University and AT&T Bell Laboratories unveil the fast Fourier transform.

9. 1977: Helaman Ferguson and Rodney Forcade of Brigham Young University advance
an integer relation detection algorithm.

10. 1987: Leslie Greengard and Vladimir Rokhlin of Yale University invent the fast mul-
tipole algorithm.

Of these, we will explicitly cover three: the decompositional approach to matrix computations,
Krylov subspace methods, and the QR algorithm. Although not explicitly covered, your
understanding of numerical linear algebra will also be a first step towards understanding
some of the other numerical algorithms listed.

0.4 Wrap Up
0.4.1 Additional Homework
For a typical week, additional assignments may be given in this unit.

0.4.2 Summary
In a typical week, we provide a quick summary of the highlights in this unit.
Part I

Orthogonality

Week 1

Norms

1.1 Opening
1.1.1 Why norms?

YouTube: https://www.youtube.com/watch?v=DKX3TdQWQ90
The following exercises expose some of the issues that we encounter when computing.
We start by computing b = U x, where U is upper triangular.
Homework 1.1.1.1 Compute
$$\left(\begin{array}{rrr} 1 & -2 & 1 \\ 0 & -1 & -1 \\ 0 & 0 & 2 \end{array}\right) \left(\begin{array}{r} -1 \\ 2 \\ 1 \end{array}\right) = $$

Solution.
$$\left(\begin{array}{rrr} 1 & -2 & 1 \\ 0 & -1 & -1 \\ 0 & 0 & 2 \end{array}\right) \left(\begin{array}{r} -1 \\ 2 \\ 1 \end{array}\right) = \left(\begin{array}{r} -4 \\ -3 \\ 2 \end{array}\right)$$
Next, let’s examine the slightly more difficult problem of finding a vector x that satisfies
U x = b.


Homework 1.1.1.2 Solve
$$\left(\begin{array}{rrr} 1 & -2 & 1 \\ 0 & -1 & -1 \\ 0 & 0 & 2 \end{array}\right) \left(\begin{array}{c} \chi_0 \\ \chi_1 \\ \chi_2 \end{array}\right) = \left(\begin{array}{r} -4 \\ -3 \\ 2 \end{array}\right)$$

Solution. We can recognize the relation between this problem and Homework 1.1.1.1 and hence deduce the answer without computation:
$$\left(\begin{array}{c} \chi_0 \\ \chi_1 \\ \chi_2 \end{array}\right) = \left(\begin{array}{r} -1 \\ 2 \\ 1 \end{array}\right)$$
The point of these two homework exercises is that if one creates a (nonsingular) $n \times n$ matrix $U$ and vector $x$ of size $n$, then computing $b = Ux$ followed by solving $U\hat{x} = b$ should leave us with a vector $\hat{x}$ such that $x = \hat{x}$.
Remark 1.1.1.1 We don’t "teach" Matlab in this course. Instead, we think that Matlab is
intuitive enough that we can figure out what the various commands mean. We can always
investigate them by typing
help <command>

in the command window. For example, for this unit you may want to execute

help format
help rng
help rand
help triu
help *
help \
help diag
help abs
help min
help max

If you want to learn more about Matlab, you may want to take some of the tutorials offered by Mathworks at https://www.mathworks.com/support/learn-with-matlab-tutorials.html.
Let us see if Matlab can compute the solution of a triangular matrix correctly.
Homework 1.1.1.3 In Matlab’s command window, create a random upper triangular matrix
U:
format long                % Report results in long format.
rng( 0 );                  % Seed the random number generator so that we all create
                           % the same random matrix U and vector x.
n = 3
U = triu( rand( n,n ) )
x = rand( n,1 )
b = U * x;                 % Compute right-hand side b from known solution x.
xhat = U \ b;              % Solve U xhat = b.
xhat - x                   % Report the difference between xhat and x.

What do we notice?
Next, check how close $U\hat{x}$ is to $b = Ux$:

b - U * xhat

This is known as the residual.
What do we notice?
Solution. A script with the described commands can be found in Assignments/Week01/matlab/Test_Upper_triangular_solve_3.m.
Some things we observe:

• $\hat{x} - x$ does not equal zero. This is due to the fact that the computer stores floating point numbers and computes with floating point arithmetic, and as a result roundoff error happens.

• The difference is small (notice the 1.0e-15 before the vector, which shows that each entry in $\hat{x} - x$ is around $10^{-15}$).

• The residual $b - U\hat{x}$ is small.

• Repeating this with a much larger n makes things cumbersome since very long vectors are then printed.

To be able to compare more easily, we will compute the Euclidean length of $\hat{x} - x$ instead, using the Matlab command norm( xhat - x ). By adding a semicolon at the end of Matlab commands, we suppress output.
Homework 1.1.1.4 Execute

format long                % Report results in long format.
rng( 0 );                  % Seed the random number generator so that we all create
                           % the same random matrix U and vector x.
n = 100;
U = triu( rand( n,n ) );
x = rand( n,1 );
b = U * x;                 % Compute right-hand side b from known solution x.
xhat = U \ b;              % Solve U xhat = b.
norm( xhat - x )           % Report the Euclidean length of the difference between xhat and x.

What do we notice?
Next, check how close $U\hat{x}$ is to $b = Ux$, again using the Euclidean length:

norm( b - U * xhat )

What do we notice?
Solution. A script with the described commands can be found in Assignments/Week01/matlab/Test_Upper_triangular_solve_100.m.
Some things we observe:

• norm( xhat - x ), the Euclidean length of $\hat{x} - x$, is huge. Matlab computed the wrong answer!

• However, the computed $\hat{x}$ solves a problem that corresponds to a slightly different right-hand side. Thus, $\hat{x}$ appears to be the solution to an only slightly changed problem.
The next exercise helps us gain insight into what is going on.
Homework 1.1.1.5 Continuing with the U, x, b, and xhat from Homework 1.1.1.4, consider

• When is an upper triangular matrix singular?

• How large is the smallest element on the diagonal of the U from Homework 1.1.1.4? (min( abs( diag( U ) ) ) returns it!)

• If U were singular, how many solutions to $U\hat{x} = b$ would there be? How can we characterize them?

• What is the relationship between $\hat{x} - x$ and U?

What have we learned?

Solution.

• When is an upper triangular matrix singular?
Answer: If and only if there is a zero on its diagonal.

• How large is the smallest element on the diagonal of the U from Homework 1.1.1.4? (min( abs( diag( U ) ) ) returns it!)
Answer: It is small in magnitude. This is not surprising, since it is a random number and hence as the matrix size increases, the chance of placing a small entry (in magnitude) on the diagonal increases.

• If U were singular, how many solutions to $U\hat{x} = b$ would there be? How can we characterize them?
Answer: An infinite number. Any vector in the null space can be added to a specific solution to create another solution.

• What is the relationship between $\hat{x} - x$ and U?
Answer: It maps almost to the zero vector. In other words, it is close to a vector in the null space of the matrix U that has its smallest entry (in magnitude) on the diagonal changed to a zero.

What have we learned? The "wrong" answer that Matlab computed was due to the fact that matrix U was almost singular.
To mathematically qualify and quantify all this, we need to be able to talk about "small" and "large" vectors, and "small" and "large" matrices. For that, we need to generalize the notion of length. By the end of this week, this will give us some of the tools to more fully understand what we have observed.
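The observations above can be reproduced with a few lines of MATLAB. The sketch below simply consolidates the commands from Homework 1.1.1.4 and Homework 1.1.1.5 into one script; no new commands are introduced.

% Minimal sketch consolidating the experiments from Homework 1.1.1.4/1.1.1.5.
format long
rng( 0 );                        % same seed as in the homework
n = 100;
U = triu( rand( n,n ) );         % random upper triangular matrix
x = rand( n,1 );
b = U * x;                       % right-hand side from known solution
xhat = U \ b;                    % computed solution

norm( xhat - x )                 % large: xhat is far from x
norm( b - U * xhat )             % small: yet the residual is tiny
min( abs( diag( U ) ) )          % small: U is nearly singular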

YouTube: https://www.youtube.com/watch?v=2ZEtcnaynnM

1.1.2 Overview
• 1.1 Opening

◦ 1.1.1 Why norms?
◦ 1.1.2 Overview
◦ 1.1.3 What you will learn

• 1.2 Vector Norms

◦ 1.2.1 Absolute value
◦ 1.2.2 What is a vector norm?
◦ 1.2.3 The vector 2-norm (Euclidean length)
◦ 1.2.4 The vector p-norms
◦ 1.2.5 Unit ball
◦ 1.2.6 Equivalence of vector norms

• 1.3 Matrix Norms

◦ 1.3.1 Of linear transformations and matrices
◦ 1.3.2 What is a matrix norm?
◦ 1.3.3 The Frobenius norm
◦ 1.3.4 Induced matrix norms
◦ 1.3.5 The matrix 2-norm
◦ 1.3.6 Computing the matrix 1-norm and ∞-norm
◦ 1.3.7 Equivalence of matrix norms
◦ 1.3.8 Submultiplicative norms
◦ 1.3.9 Summary

• 1.4 Condition Number of a Matrix

◦ 1.4.1 Conditioning of a linear system
◦ 1.4.2 Loss of digits of accuracy
◦ 1.4.3 The conditioning of an upper triangular matrix

• 1.5 Enrichments

◦ 1.5.1 Condition number estimation

• 1.6 Wrap Up

◦ 1.6.1 Additional homework
◦ 1.6.2 Summary

1.1.3 What you will learn


Numerical analysis is the study of how the perturbation of a problem or data affects the
accuracy of computation. This inherently means that you have to be able to measure whether
changes are large or small. That, in turn, means we need to be able to quantify whether
vectors or matrices are large or small. Norms are a tool for measuring magnitude.
Upon completion of this week, you should be able to
• Prove or disprove that a function is a norm.
• Connect linear transformations to matrices.
• Recognize, compute, and employ different measures of length, which differ and yet are
equivalent.
• Exploit the benefits of examining vectors on the unit ball.
• Categorize different matrix norms based on their properties.
• Describe, in words and mathematically, how the condition number of a matrix affects
how a relative change in the right-hand side can amplify into relative change in the
solution of a linear system.
• Use norms to quantify the conditioning of solving linear systems.

1.2 Vector Norms


1.2.1 Absolute value
Remark 1.2.1.1 Don’t Panic!
In this course, we mostly allow scalars, vectors, and matrices to be complex-valued. This
means we will use terms like "conjugate" and "Hermitian" quite liberally. You will think this
is a big deal, but actually, if you just focus on the real case, you will notice that the complex
case is just a natural extension of the real case.

YouTube: https://www.youtube.com/watch?v=V5ZQmR4zTeU
Recall that $|\cdot| : \mathbb{C} \rightarrow \mathbb{R}$ is the function that returns the absolute value of the input. In other words, if $\alpha = \alpha_r + \alpha_c i$, where $\alpha_r$ and $\alpha_c$ are the real and imaginary parts of $\alpha$, respectively, then
$$|\alpha| = \sqrt{\alpha_r^2 + \alpha_c^2}.$$
The absolute value (magnitude) of a complex number can also be thought of as the (Euclidean) distance from the point in the complex plane to the origin of that plane, as illustrated in the accompanying figure for the number $3 + 2i$.
Alternatively, we can compute the absolute value as
$$|\alpha| = \sqrt{\alpha_r^2 + \alpha_c^2} = \sqrt{\alpha_r^2 - \alpha_c\alpha_r i + \alpha_r\alpha_c i + \alpha_c^2} = \sqrt{(\alpha_r - \alpha_c i)(\alpha_r + \alpha_c i)} = \sqrt{\bar{\alpha}\alpha},$$
where $\bar{\alpha}$ denotes the complex conjugate of $\alpha$:
$$\bar{\alpha} = \overline{\alpha_r + \alpha_c i} = \alpha_r - \alpha_c i.$$

The absolute value function has the following properties:

• $\alpha \neq 0 \Rightarrow |\alpha| > 0$ ($|\cdot|$ is positive definite),

• $|\alpha\beta| = |\alpha||\beta|$ ($|\cdot|$ is homogeneous), and

• $|\alpha + \beta| \leq |\alpha| + |\beta|$ ($|\cdot|$ obeys the triangle inequality).

Norms are functions from a domain to the real numbers that are positive definite, homogeneous, and obey the triangle inequality. This makes the absolute value function an example of a norm.
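As a quick numerical sanity check, the following MATLAB snippet verifies these identities for the number 3 + 2i used in the illustration above together with a second number of our own choosing (the names alpha and beta are ours, not the text's):

% Check |alpha| = sqrt(conj(alpha)*alpha) and the homogeneity |alpha*beta| = |alpha|*|beta|.
alpha = 3 + 2i;
beta  = 1 - 1i;

abs( alpha )                      % built-in absolute value (magnitude)
sqrt( conj( alpha ) * alpha )     % same value (may display a 0.0000i imaginary part)
abs( alpha * beta )               % homogeneity: equals the product on the next line
abs( alpha ) * abs( beta )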
The below exercises help refresh your fluency with complex arithmetic.
Homework 1.2.1.1
1. $(1 + i)(2 - i) =$
2. $(2 - i)(1 + i) =$
3. $\overline{(1 - i)}\,(2 - i) =$
4. $\overline{\overline{(1 - i)}\,(2 - i)} =$
5. $\overline{(2 - i)}\,(1 - i) =$
6. $(1 - i)\,\overline{(2 - i)} =$

Solution.
1. $(1 + i)(2 - i) = 2 + 2i - i - i^2 = 2 + i + 1 = 3 + i$
2. $(2 - i)(1 + i) = 2 - i + 2i - i^2 = 2 + i + 1 = 3 + i$
3. $\overline{(1 - i)}\,(2 - i) = (1 + i)(2 - i) = 2 - i + 2i - i^2 = 3 + i$
4. $\overline{\overline{(1 - i)}\,(2 - i)} = \overline{(1 + i)(2 - i)} = \overline{2 - i + 2i - i^2} = \overline{2 + i + 1} = \overline{3 + i} = 3 - i$
5. $\overline{(2 - i)}\,(1 - i) = (2 + i)(1 - i) = 2 - 2i + i - i^2 = 2 - i + 1 = 3 - i$
6. $(1 - i)\,\overline{(2 - i)} = (1 - i)(2 + i) = 2 + i - 2i - i^2 = 2 - i + 1 = 3 - i$
Homework 1.2.1.2 Let $\alpha, \beta \in \mathbb{C}$.
1. ALWAYS/SOMETIMES/NEVER: $\alpha\beta = \beta\alpha$.

2. ALWAYS/SOMETIMES/NEVER: $\bar{\alpha}\beta = \bar{\beta}\alpha$.

Hint. Let $\alpha = \alpha_r + \alpha_c i$ and $\beta = \beta_r + \beta_c i$, where $\alpha_r, \alpha_c, \beta_r, \beta_c \in \mathbb{R}$.

Answer.

1. ALWAYS: $\alpha\beta = \beta\alpha$.

2. SOMETIMES: $\bar{\alpha}\beta = \bar{\beta}\alpha$.

Solution.

1. ALWAYS: $\alpha\beta = \beta\alpha$.
Proof:

$\alpha\beta$
  = < substitute >
$(\alpha_r + \alpha_c i)(\beta_r + \beta_c i)$
  = < multiply out >
$\alpha_r\beta_r + \alpha_r\beta_c i + \alpha_c\beta_r i - \alpha_c\beta_c$
  = < commutativity of real multiplication >
$\beta_r\alpha_r + \beta_r\alpha_c i + \beta_c\alpha_r i - \beta_c\alpha_c$
  = < factor >
$(\beta_r + \beta_c i)(\alpha_r + \alpha_c i)$
  = < substitute >
$\beta\alpha$.

2. SOMETIMES: $\bar{\alpha}\beta = \bar{\beta}\alpha$.
An example where it is true: $\alpha = \beta = 0$.
An example where it is false: $\alpha = 1$ and $\beta = i$. Then $\bar{\alpha}\beta = 1 \times i = i$ and $\bar{\beta}\alpha = -i \times 1 = -i$.

Homework 1.2.1.3 Let $\alpha, \beta \in \mathbb{C}$.
ALWAYS/SOMETIMES/NEVER: $\overline{\bar{\alpha}\beta} = \bar{\beta}\alpha$.
Hint. Let $\alpha = \alpha_r + \alpha_c i$ and $\beta = \beta_r + \beta_c i$, where $\alpha_r, \alpha_c, \beta_r, \beta_c \in \mathbb{R}$.
Answer. ALWAYS
Now prove it!
Solution 1.

$\overline{\bar{\alpha}\beta}$
  = < $\alpha = \alpha_r + \alpha_c i$; $\beta = \beta_r + \beta_c i$ >
$\overline{\overline{(\alpha_r + \alpha_c i)}(\beta_r + \beta_c i)}$
  = < conjugate $\alpha$ >
$\overline{(\alpha_r - \alpha_c i)(\beta_r + \beta_c i)}$
  = < multiply out >
$\overline{\alpha_r\beta_r - \alpha_c\beta_r i + \alpha_r\beta_c i + \alpha_c\beta_c}$
  = < conjugate >
$\alpha_r\beta_r + \alpha_c\beta_r i - \alpha_r\beta_c i + \alpha_c\beta_c$
  = < rearrange >
$\beta_r\alpha_r + \beta_r\alpha_c i - \beta_c\alpha_r i + \beta_c\alpha_c$
  = < factor >
$(\beta_r - \beta_c i)(\alpha_r + \alpha_c i)$
  = < definition of conjugation >
$\overline{(\beta_r + \beta_c i)}(\alpha_r + \alpha_c i)$
  = < $\alpha = \alpha_r + \alpha_c i$; $\beta = \beta_r + \beta_c i$ >
$\bar{\beta}\alpha$

Solution 2. Proofs in mathematical textbooks seem to always be wonderfully smooth arguments that lead from the left-hand side of an equivalence to the right-hand side. In practice, you may want to start on the left-hand side, and apply a few rules:

$\overline{\bar{\alpha}\beta}$
  = < $\alpha = \alpha_r + \alpha_c i$; $\beta = \beta_r + \beta_c i$ >
$\overline{\overline{(\alpha_r + \alpha_c i)}(\beta_r + \beta_c i)}$
  = < conjugate $\alpha$ >
$\overline{(\alpha_r - \alpha_c i)(\beta_r + \beta_c i)}$
  = < multiply out >
$\overline{\alpha_r\beta_r - \alpha_c\beta_r i + \alpha_r\beta_c i + \alpha_c\beta_c}$
  = < conjugate >
$\alpha_r\beta_r + \alpha_c\beta_r i - \alpha_r\beta_c i + \alpha_c\beta_c$

and then move on to the right-hand side, applying a few rules:

$\bar{\beta}\alpha$
  = < $\alpha = \alpha_r + \alpha_c i$; $\beta = \beta_r + \beta_c i$ >
$\overline{(\beta_r + \beta_c i)}(\alpha_r + \alpha_c i)$
  = < conjugate $\beta$ >
$(\beta_r - \beta_c i)(\alpha_r + \alpha_c i)$
  = < multiply out >
$\beta_r\alpha_r + \beta_r\alpha_c i - \beta_c\alpha_r i + \beta_c\alpha_c$.

At that point, you recognize that
$$\alpha_r\beta_r + \alpha_c\beta_r i - \alpha_r\beta_c i + \alpha_c\beta_c = \beta_r\alpha_r + \beta_r\alpha_c i - \beta_c\alpha_r i + \beta_c\alpha_c$$
since the second is a rearrangement of the terms of the first. Optionally, you then go back and present these insights as a smooth argument that leads from the expression on the left-hand side to the one on the right-hand side:

$\overline{\bar{\alpha}\beta}$
  = < $\alpha = \alpha_r + \alpha_c i$; $\beta = \beta_r + \beta_c i$ >
$\overline{\overline{(\alpha_r + \alpha_c i)}(\beta_r + \beta_c i)}$
  = < conjugate $\alpha$ >
$\overline{(\alpha_r - \alpha_c i)(\beta_r + \beta_c i)}$
  = < multiply out >
$\overline{\alpha_r\beta_r - \alpha_c\beta_r i + \alpha_r\beta_c i + \alpha_c\beta_c}$
  = < conjugate >
$\alpha_r\beta_r + \alpha_c\beta_r i - \alpha_r\beta_c i + \alpha_c\beta_c$
  = < rearrange >
$\beta_r\alpha_r + \beta_r\alpha_c i - \beta_c\alpha_r i + \beta_c\alpha_c$
  = < factor >
$(\beta_r - \beta_c i)(\alpha_r + \alpha_c i)$
  = < definition of conjugation >
$\overline{(\beta_r + \beta_c i)}(\alpha_r + \alpha_c i)$
  = < $\alpha = \alpha_r + \alpha_c i$; $\beta = \beta_r + \beta_c i$ >
$\bar{\beta}\alpha$.

Solution 3. Yet another way of presenting the proof uses an "equivalence style proof." The idea is to start with the equivalence you wish to prove correct:
$$\overline{\bar{\alpha}\beta} = \bar{\beta}\alpha$$
and through a sequence of equivalent statements argue that this evaluates to TRUE:

$\overline{\bar{\alpha}\beta} = \bar{\beta}\alpha$
  ⇔ < $\alpha = \alpha_r + \alpha_c i$; $\beta = \beta_r + \beta_c i$ >
$\overline{\overline{(\alpha_r + \alpha_c i)}(\beta_r + \beta_c i)} = \overline{(\beta_r + \beta_c i)}(\alpha_r + \alpha_c i)$
  ⇔ < conjugate × 2 >
$\overline{(\alpha_r - \alpha_c i)(\beta_r + \beta_c i)} = (\beta_r - \beta_c i)(\alpha_r + \alpha_c i)$
  ⇔ < multiply out × 2 >
$\overline{\alpha_r\beta_r + \alpha_r\beta_c i - \alpha_c\beta_r i + \alpha_c\beta_c} = \beta_r\alpha_r + \beta_r\alpha_c i - \beta_c\alpha_r i + \beta_c\alpha_c$
  ⇔ < conjugate >
$\alpha_r\beta_r - \alpha_r\beta_c i + \alpha_c\beta_r i + \alpha_c\beta_c = \beta_r\alpha_r + \beta_r\alpha_c i - \beta_c\alpha_r i + \beta_c\alpha_c$
  ⇔ < subtract equivalent terms from left-hand side and right-hand side >
$0 = 0$
  ⇔ < algebra >
TRUE.

By transitivity of equivalence, we conclude that $\overline{\bar{\alpha}\beta} = \bar{\beta}\alpha$ is TRUE.


Homework 1.2.1.4 Let $\alpha \in \mathbb{C}$.
ALWAYS/SOMETIMES/NEVER: $\bar{\alpha}\alpha \in \mathbb{R}$
Answer. ALWAYS.
Now prove it!
Solution. Let $\alpha = \alpha_r + \alpha_c i$. Then

$\bar{\alpha}\alpha$
  = < instantiate >
$\overline{(\alpha_r + \alpha_c i)}(\alpha_r + \alpha_c i)$
  = < conjugate >
$(\alpha_r - \alpha_c i)(\alpha_r + \alpha_c i)$
  = < multiply out >
$\alpha_r^2 + \alpha_c^2$,

which is a real number.

Homework 1.2.1.5 Prove that the absolute value function is homogeneous: $|\alpha\beta| = |\alpha||\beta|$ for all $\alpha, \beta \in \mathbb{C}$.
Solution.

$|\alpha\beta| = |\alpha||\beta|$
  ⇔ < squaring both sides simplifies >
$|\alpha\beta|^2 = |\alpha|^2|\beta|^2$
  ⇔ < instantiate >
$|(\alpha_r + \alpha_c i)(\beta_r + \beta_c i)|^2 = |\alpha_r + \alpha_c i|^2 |\beta_r + \beta_c i|^2$
  ⇔ < algebra >
$|(\alpha_r\beta_r - \alpha_c\beta_c) + (\alpha_r\beta_c + \alpha_c\beta_r)i|^2 = (\alpha_r^2 + \alpha_c^2)(\beta_r^2 + \beta_c^2)$
  ⇔ < algebra >
$(\alpha_r\beta_r - \alpha_c\beta_c)^2 + (\alpha_r\beta_c + \alpha_c\beta_r)^2 = (\alpha_r^2 + \alpha_c^2)(\beta_r^2 + \beta_c^2)$
  ⇔ < algebra >
$\alpha_r^2\beta_r^2 - 2\alpha_r\alpha_c\beta_r\beta_c + \alpha_c^2\beta_c^2 + \alpha_r^2\beta_c^2 + 2\alpha_r\alpha_c\beta_r\beta_c + \alpha_c^2\beta_r^2 = \alpha_r^2\beta_r^2 + \alpha_r^2\beta_c^2 + \alpha_c^2\beta_r^2 + \alpha_c^2\beta_c^2$
  ⇔ < subtract equivalent terms from both sides >
$0 = 0$
  ⇔ < algebra >
TRUE.

Homework 1.2.1.6 Let $\alpha \in \mathbb{C}$.
ALWAYS/SOMETIMES/NEVER: $|\bar{\alpha}| = |\alpha|$.
Answer. ALWAYS
Now prove it!
Solution. Let $\alpha = \alpha_r + \alpha_c i$.

$|\bar{\alpha}|$
  = < instantiate >
$|\overline{\alpha_r + \alpha_c i}|$
  = < conjugate >
$|\alpha_r - \alpha_c i|$
  = < definition of $|\cdot|$ >
$\sqrt{\alpha_r^2 + \alpha_c^2}$
  = < definition of $|\cdot|$ >
$|\alpha_r + \alpha_c i|$
  = < instantiate >
$|\alpha|$

1.2.2 What is a vector norm?

YouTube: https://www.youtube.com/watch?v=CTrUVfLGcNM
A vector norm extends the notion of an absolute value to vectors. It allows us to measure
the magnitude (or length) of a vector. In different situations, a different measure may be
more appropriate.
Definition 1.2.2.1 Vector norm. Let $\nu : \mathbb{C}^m \rightarrow \mathbb{R}$. Then $\nu$ is a (vector) norm if for all $x, y \in \mathbb{C}^m$ and all $\alpha \in \mathbb{C}$

• $x \neq 0 \Rightarrow \nu(x) > 0$ ($\nu$ is positive definite),

• $\nu(\alpha x) = |\alpha|\nu(x)$ ($\nu$ is homogeneous), and

• $\nu(x + y) \leq \nu(x) + \nu(y)$ ($\nu$ obeys the triangle inequality).

Homework 1.2.2.1 TRUE/FALSE: If $\nu : \mathbb{C}^m \rightarrow \mathbb{R}$ is a norm, then $\nu(0) = 0$.

Hint. From context, you should be able to tell which of these 0's denotes the zero vector of a given size and which is the scalar 0.
$0x = 0$ (multiplying any vector $x$ by the scalar 0 results in a vector of zeroes).
Answer. TRUE.
Now prove it.
Solution. Let $x \in \mathbb{C}^m$ and, just for clarity this first time, $\vec{0}$ be the zero vector of size $m$ so that 0 is the scalar zero. Then

$\nu(\vec{0})$
  = < $0 \cdot x = \vec{0}$ >
$\nu(0 \cdot x)$
  = < $\nu(\cdots)$ is homogeneous >
$0\,\nu(x)$
  = < algebra >
$0$

Remark 1.2.2.2 We typically use $\|\cdot\|$ instead of $\nu(\cdot)$ for a function that is a norm.

1.2.3 The vector 2-norm (Euclidean length)

YouTube: https://www.youtube.com/watch?v=bxDDpUZEqBs
The length of a vector is most commonly measured by the "square root of the sum of the
squares of the elements," also known as the Euclidean norm. It is called the 2-norm because
it is a member of a class of norms known as p-norms, discussed in the next unit.
Definition 1.2.3.1 Vector 2-norm. The vector 2-norm $\|\cdot\|_2 : \mathbb{C}^m \rightarrow \mathbb{R}$ is defined for $x \in \mathbb{C}^m$ by
$$\|x\|_2 = \sqrt{|\chi_0|^2 + \cdots + |\chi_{m-1}|^2} = \sqrt{\sum_{i=0}^{m-1}|\chi_i|^2}.$$
Equivalently, it can be defined by
$$\|x\|_2 = \sqrt{x^H x}$$
or
$$\|x\|_2 = \sqrt{\bar{\chi}_0\chi_0 + \cdots + \bar{\chi}_{m-1}\chi_{m-1}} = \sqrt{\sum_{i=0}^{m-1}\bar{\chi}_i\chi_i}.$$

Remark 1.2.3.2 The notation $x^H$ requires a bit of explanation. If
$$x = \left(\begin{array}{c} \chi_0 \\ \vdots \\ \chi_{m-1} \end{array}\right)$$
then the row vector
$$x^H = \left(\begin{array}{ccc} \bar{\chi}_0 & \cdots & \bar{\chi}_{m-1} \end{array}\right)$$
is the Hermitian transpose of $x$ (or, equivalently, the Hermitian transpose of the vector $x$ that is viewed as a matrix) and $x^H y$ can be thought of as the dot product of $x$ and $y$ or, equivalently, as the matrix-vector multiplication of the matrix $x^H$ times the vector $y$.
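In MATLAB, the equivalent formulations in Definition 1.2.3.1 can be checked numerically. The sketch below uses a random complex vector of our own choosing:

% Three equivalent ways of computing the vector 2-norm of a complex vector.
rng( 0 );
x = rand( 5,1 ) + 1i * rand( 5,1 );   % a random complex vector (our test data)

norm( x, 2 )                 % built-in 2-norm
sqrt( x' * x )               % sqrt( x^H x ); ' is the Hermitian transpose in MATLAB
                             % (may display a 0.0000i imaginary part)
sqrt( sum( abs( x ).^2 ) )   % square root of the sum of squared magnitudes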
To prove that the 2-norm is a norm (just calling it a norm doesn't mean it is, after all), we need a result known as the Cauchy-Schwarz inequality. This inequality relates the magnitude of the dot product of two vectors to the product of their 2-norms: if $x, y \in \mathbb{R}^m$, then $|x^T y| \leq \|x\|_2\|y\|_2$. To motivate this result before we rigorously prove it, recall from your undergraduate studies that the component of $x$ in the direction of a vector $y$ of unit length is given by $(y^T x)y$, as illustrated in the accompanying figure.
The length of the component of $x$ in the direction of $y$ then equals

$\|(y^T x)y\|_2$
  = < definition >
$\sqrt{((y^T x)y)^T (y^T x)y}$
  = < $z\alpha = \alpha z$ >
$\sqrt{(x^T y)^2\, y^T y}$
  = < $y$ has unit length >
$|y^T x|$
  = < definition >
$|x^T y|$.

Thus $|x^T y| \leq \|x\|_2$ (since a component should be shorter than the whole). If $y$ is not of unit length (but a nonzero vector), then $\left|x^T \frac{y}{\|y\|_2}\right| \leq \|x\|_2$ or, equivalently, $|x^T y| \leq \|x\|_2\|y\|_2$.
We now state this result as a theorem, generalized to complex valued vectors:
Theorem 1.2.3.3 Cauchy-Schwarz inequality. Let $x, y \in \mathbb{C}^m$. Then $|x^H y| \leq \|x\|_2\|y\|_2$.
Proof. Assume that $x \neq 0$ and $y \neq 0$, since otherwise the inequality is trivially true. We can then choose $\hat{x} = x/\|x\|_2$ and $\hat{y} = y/\|y\|_2$. This leaves us to prove that $|\hat{x}^H\hat{y}| \leq 1$ since $\|\hat{x}\|_2 = \|\hat{y}\|_2 = 1$.
Pick
$$\alpha = \begin{cases} 1 & \text{if } x^H y = 0 \\ \hat{y}^H\hat{x}/|\hat{x}^H\hat{y}| & \text{otherwise} \end{cases}$$
so that $|\alpha| = 1$ and $\alpha\hat{x}^H\hat{y}$ is real and nonnegative. Note that since it is real we also know that

$\alpha\hat{x}^H\hat{y}$
  = < $\beta = \bar{\beta}$ if $\beta$ is real >
$\overline{\alpha\hat{x}^H\hat{y}}$
  = < property of complex conjugation >
$\bar{\alpha}\hat{y}^H\hat{x}$.

Now,

$0$
  ≤ < $\|\cdot\|_2$ is nonnegative definite >
$\|\hat{x} - \alpha\hat{y}\|_2^2$
  = < $\|z\|_2^2 = z^Hz$ >
$(\hat{x} - \alpha\hat{y})^H(\hat{x} - \alpha\hat{y})$
  = < multiplying out >
$\hat{x}^H\hat{x} - \bar{\alpha}\hat{y}^H\hat{x} - \alpha\hat{x}^H\hat{y} + \bar{\alpha}\alpha\hat{y}^H\hat{y}$
  = < above assumptions and observations >
$1 - 2\alpha\hat{x}^H\hat{y} + |\alpha|^2$
  = < $\alpha\hat{x}^H\hat{y} = |\hat{x}^H\hat{y}|$; $|\alpha| = 1$ >
$2 - 2|\hat{x}^H\hat{y}|$.

Thus $|\hat{x}^H\hat{y}| \leq 1$ and therefore $|x^H y| \leq \|x\|_2\|y\|_2$. ∎
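The inequality is easy to spot-check numerically. The MATLAB sketch below uses random complex vectors of our own choosing:

% Numerical spot-check of the Cauchy-Schwarz inequality.
rng( 0 );
x = rand( 6,1 ) + 1i * rand( 6,1 );
y = rand( 6,1 ) + 1i * rand( 6,1 );

abs( x' * y )                  % |x^H y|
norm( x, 2 ) * norm( y, 2 )    % ||x||_2 ||y||_2, always at least as large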


The proof of Theorem 1.2.3.3 does not employ any of the intuition we used to motivate it in the real valued case just before its statement. We leave it to the reader to prove the Cauchy-Schwarz inequality for real-valued vectors by modifying (simplifying) the proof of Theorem 1.2.3.3.
Ponder This 1.2.3.1 Let $x, y \in \mathbb{R}^m$. Prove that $|x^T y| \leq \|x\|_2\|y\|_2$ by specializing the proof of Theorem 1.2.3.3.
The following theorem states that the 2-norm is indeed a norm:
Theorem 1.2.3.4 The vector 2-norm is a norm.
We leave its proof as an exercise.
Homework 1.2.3.2 Prove Theorem 1.2.3.4.
Solution. To prove this, we merely check whether the three conditions are met:
Let $x, y \in \mathbb{C}^m$ and $\alpha \in \mathbb{C}$ be arbitrarily chosen. Then

• $x \neq 0 \Rightarrow \|x\|_2 > 0$ ($\|\cdot\|_2$ is positive definite):
Notice that $x \neq 0$ means that at least one of its components is nonzero. Let's assume that $\chi_j \neq 0$. Then
$$\|x\|_2 = \sqrt{|\chi_0|^2 + \cdots + |\chi_{m-1}|^2} \geq \sqrt{|\chi_j|^2} = |\chi_j| > 0.$$

• $\|\alpha x\|_2 = |\alpha|\|x\|_2$ ($\|\cdot\|_2$ is homogeneous):

$\|\alpha x\|_2$
  = < scaling a vector scales its components; definition >
$\sqrt{|\alpha\chi_0|^2 + \cdots + |\alpha\chi_{m-1}|^2}$
  = < algebra >
$\sqrt{|\alpha|^2|\chi_0|^2 + \cdots + |\alpha|^2|\chi_{m-1}|^2}$
  = < algebra >
$\sqrt{|\alpha|^2(|\chi_0|^2 + \cdots + |\chi_{m-1}|^2)}$
  = < algebra >
$|\alpha|\sqrt{|\chi_0|^2 + \cdots + |\chi_{m-1}|^2}$
  = < definition >
$|\alpha|\|x\|_2$.

• $\|x + y\|_2 \leq \|x\|_2 + \|y\|_2$ ($\|\cdot\|_2$ obeys the triangle inequality):

$\|x + y\|_2^2$
  = < $\|z\|_2^2 = z^Hz$ >
$(x + y)^H(x + y)$
  = < distribute >
$x^Hx + y^Hx + x^Hy + y^Hy$
  = < $\beta + \bar{\beta} = 2\,\mathrm{Real}(\beta)$ >
$x^Hx + 2\,\mathrm{Real}(x^Hy) + y^Hy$
  ≤ < algebra >
$x^Hx + 2|\mathrm{Real}(x^Hy)| + y^Hy$
  ≤ < algebra >
$x^Hx + 2|x^Hy| + y^Hy$
  ≤ < algebra; Cauchy-Schwarz >
$\|x\|_2^2 + 2\|x\|_2\|y\|_2 + \|y\|_2^2$
  = < algebra >
$(\|x\|_2 + \|y\|_2)^2$.

Taking the square root (an increasing function that hence maintains the inequality) of both sides yields the desired result.
Throughout this course, we will reason about subvectors and submatrices. Let’s get some
practice:
Homework 1.2.3.3 Partition $x \in \mathbb{C}^m$ into subvectors:
$$x = \left(\begin{array}{c} x_0 \\ x_1 \\ \vdots \\ x_{M-1} \end{array}\right).$$
ALWAYS/SOMETIMES/NEVER: $\|x_i\|_2 \leq \|x\|_2$.

Answer. ALWAYS
Now prove it!
Solution.

$\|x\|_2^2$
  = < partition vector >
$\left\| \left(\begin{array}{c} x_0 \\ x_1 \\ \vdots \\ x_{M-1} \end{array}\right) \right\|_2^2$
  = < equivalent definition >
$\left(\begin{array}{c} x_0 \\ x_1 \\ \vdots \\ x_{M-1} \end{array}\right)^H \left(\begin{array}{c} x_0 \\ x_1 \\ \vdots \\ x_{M-1} \end{array}\right)$
  = < dot product of partitioned vectors >
$x_0^Hx_0 + x_1^Hx_1 + \cdots + x_{M-1}^Hx_{M-1}$
  = < equivalent definition >
$\|x_0\|_2^2 + \|x_1\|_2^2 + \cdots + \|x_{M-1}\|_2^2$
  ≥ < algebra >
$\|x_i\|_2^2$

so that $\|x_i\|_2^2 \leq \|x\|_2^2$. Taking the square root of both sides shows that $\|x_i\|_2 \leq \|x\|_2$.

1.2.4 The vector p-norms

YouTube: https://www.youtube.com/watch?v=WGBMnmgJek8
A vector norm is a measure of the magnitude of a vector. The Euclidean norm (length) is
merely the best known such measure. There are others. A simple alternative is the 1-norm.
Definition 1.2.4.1 Vector 1-norm. The vector 1-norm, $\|\cdot\|_1 : \mathbb{C}^m \rightarrow \mathbb{R}$, is defined for $x \in \mathbb{C}^m$ by
$$\|x\|_1 = |\chi_0| + |\chi_1| + \cdots + |\chi_{m-1}| = \sum_{i=0}^{m-1}|\chi_i|.$$

Homework 1.2.4.1 Prove that the vector 1-norm is a norm.

Solution. We show that the three conditions are met:
Let $x, y \in \mathbb{C}^m$ and $\alpha \in \mathbb{C}$ be arbitrarily chosen. Then

• $x \neq 0 \Rightarrow \|x\|_1 > 0$ ($\|\cdot\|_1$ is positive definite):
Notice that $x \neq 0$ means that at least one of its components is nonzero. Let's assume that $\chi_j \neq 0$. Then
$$\|x\|_1 = |\chi_0| + \cdots + |\chi_{m-1}| \geq |\chi_j| > 0.$$

• $\|\alpha x\|_1 = |\alpha|\|x\|_1$ ($\|\cdot\|_1$ is homogeneous):

$\|\alpha x\|_1$
  = < scaling a vector scales its components; definition >
$|\alpha\chi_0| + \cdots + |\alpha\chi_{m-1}|$
  = < algebra >
$|\alpha||\chi_0| + \cdots + |\alpha||\chi_{m-1}|$
  = < algebra >
$|\alpha|(|\chi_0| + \cdots + |\chi_{m-1}|)$
  = < definition >
$|\alpha|\|x\|_1$.

• $\|x + y\|_1 \leq \|x\|_1 + \|y\|_1$ ($\|\cdot\|_1$ obeys the triangle inequality):

$\|x + y\|_1$
  = < vector addition; definition of 1-norm >
$|\chi_0 + \psi_0| + |\chi_1 + \psi_1| + \cdots + |\chi_{m-1} + \psi_{m-1}|$
  ≤ < algebra >
$|\chi_0| + |\psi_0| + |\chi_1| + |\psi_1| + \cdots + |\chi_{m-1}| + |\psi_{m-1}|$
  = < commutativity >
$|\chi_0| + |\chi_1| + \cdots + |\chi_{m-1}| + |\psi_0| + |\psi_1| + \cdots + |\psi_{m-1}|$
  = < associativity; definition >
$\|x\|_1 + \|y\|_1$.
The vector 1-norm is sometimes referred to as the "taxi-cab norm". It is the distance
that a taxi travels, from one point on a street to another such point, along the streets of a
city that has square city blocks.
Another alternative is the infinity norm.
Definition 1.2.4.2 Vector ∞-norm. The vector ∞-norm, $\|\cdot\|_\infty : \mathbb{C}^m \rightarrow \mathbb{R}$, is defined for $x \in \mathbb{C}^m$ by
$$\|x\|_\infty = \max(|\chi_0|, \ldots, |\chi_{m-1}|) = \max_{i=0}^{m-1}|\chi_i|.$$


The infinity norm simply measures how large the vector is by the magnitude of its largest
entry.
Homework 1.2.4.2 Prove that the vector ∞-norm is a norm.

Solution. We show that the three conditions are met:
Let $x, y \in \mathbb{C}^m$ and $\alpha \in \mathbb{C}$ be arbitrarily chosen. Then

• $x \neq 0 \Rightarrow \|x\|_\infty > 0$ ($\|\cdot\|_\infty$ is positive definite):
Notice that $x \neq 0$ means that at least one of its components is nonzero. Let's assume that $\chi_j \neq 0$. Then
$$\|x\|_\infty = \max_{i=0}^{m-1}|\chi_i| \geq |\chi_j| > 0.$$

• $\|\alpha x\|_\infty = |\alpha|\|x\|_\infty$ ($\|\cdot\|_\infty$ is homogeneous):
$$\|\alpha x\|_\infty = \max_{i=0}^{m-1}|\alpha\chi_i| = \max_{i=0}^{m-1}|\alpha||\chi_i| = |\alpha|\max_{i=0}^{m-1}|\chi_i| = |\alpha|\|x\|_\infty.$$

• $\|x + y\|_\infty \leq \|x\|_\infty + \|y\|_\infty$ ($\|\cdot\|_\infty$ obeys the triangle inequality):
$$\|x + y\|_\infty = \max_{i=0}^{m-1}|\chi_i + \psi_i| \leq \max_{i=0}^{m-1}(|\chi_i| + |\psi_i|) \leq \max_{i=0}^{m-1}|\chi_i| + \max_{i=0}^{m-1}|\psi_i| = \|x\|_\infty + \|y\|_\infty.$$
In this course, we will primarily use the vector 1-norm, 2-norm, and ∞-norms. For completeness, we briefly discuss their generalization: the vector p-norm.
Definition 1.2.4.3 Vector p-norm. Given $p \geq 1$, the vector p-norm $\|\cdot\|_p : \mathbb{C}^m \rightarrow \mathbb{R}$ is defined for $x \in \mathbb{C}^m$ by
$$\|x\|_p = \sqrt[p]{|\chi_0|^p + \cdots + |\chi_{m-1}|^p} = \left(\sum_{i=0}^{m-1}|\chi_i|^p\right)^{1/p}.$$

Theorem 1.2.4.4 The vector p-norm is a norm.
The proof of this result is very similar to the proof of the fact that the 2-norm is a norm. It depends on Hölder's inequality, which is a generalization of the Cauchy-Schwarz inequality:
Theorem 1.2.4.5 Hölder's inequality. Let $1 \leq p, q \leq \infty$ with $\frac{1}{p} + \frac{1}{q} = 1$. If $x, y \in \mathbb{C}^m$ then $|x^H y| \leq \|x\|_p\|y\|_q$.
We skip the proof of Hölder's inequality and Theorem 1.2.4.4. You can easily find proofs for these results, should you be interested.

Remark 1.2.4.6 The vector 1-norm and 2-norm are obviously special cases of the vector p-norm. It can be easily shown that the vector ∞-norm is also related:
$$\lim_{p\to\infty}\|x\|_p = \|x\|_\infty.$$
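The norms discussed in this unit can all be computed with MATLAB's norm command. The sketch below uses the vector (1, -2, -1) that also appears in Homework 1.2.6.1 below, and illustrates the limit in Remark 1.2.4.6:

% Compare the 1-, 2-, and infinity-norms of a vector, and observe that
% norm(x,p) approaches norm(x,inf) as p grows.
x = [ 1; -2; -1 ];

norm( x, 1 )                % 4
norm( x, 2 )                % sqrt(6)
norm( x, inf )              % 2

for p = [ 1 2 4 8 16 32 ]
    fprintf( 'p = %2d: ||x||_p = %f\n', p, norm( x, p ) );
end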

Ponder This 1.2.4.3 Consider Homework 1.2.3.3. Try to elegantly formulate this question
in the most general way you can think of. How do you prove the result?
Ponder This 1.2.4.4 Consider the vector norm $\|\cdot\| : \mathbb{C}^m \rightarrow \mathbb{R}$, the matrix $A \in \mathbb{C}^{m \times n}$ and the function $f : \mathbb{C}^n \rightarrow \mathbb{R}$ defined by $f(x) = \|Ax\|$. For what matrices $A$ is the function $f$ a norm?

1.2.5 Unit ball

YouTube: https://www.youtube.com/watch?v=aJgrpp7uscw
In 3-dimensional space, the notion of the unit ball is intuitive: the set of all points that are a (Euclidean) distance of one from the origin. Vectors have no position and can have more than three components. Still, the unit ball for the 2-norm is a straightforward extension to the set of all vectors with length (2-norm) one. More generally, the unit ball for any norm can be defined:
Definition 1.2.5.1 Unit ball. Given norm $\|\cdot\| : \mathbb{C}^m \rightarrow \mathbb{R}$, the unit ball with respect to $\|\cdot\|$ is the set $\{x \mid \|x\| = 1\}$ (the set of all vectors with norm equal to one). We will use $\|x\| = 1$ as shorthand for $\{x \mid \|x\| = 1\}$.
Homework 1.2.5.1 Although vectors have no position, it is convenient to visualize a vector $x \in \mathbb{R}^2$ by the point in the plane to which it extends when rooted at the origin. For example, the vector $x = \left(\begin{array}{c} 2 \\ 1 \end{array}\right)$ can be so visualized with the point $(2, 1)$. With this in mind, match the pictures on the right corresponding to the sets on the left:
(a) $\|x\|_2 = 1$. (1)
(b) $\|x\|_1 = 1$. (2)
(c) $\|x\|_\infty = 1$. (3)

Solution.
(a) $\|x\|_2 = 1$. (3)
(b) $\|x\|_1 = 1$. (1)
(c) $\|x\|_\infty = 1$. (2)
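If you want to reproduce pictures like the ones referenced in this homework yourself, the MATLAB sketch below (our own construction, not part of the course scripts) traces the unit balls of the 2-, 1-, and ∞-norms in the plane by normalizing points on the unit circle:

% Plot the unit balls for the 2-, 1-, and infinity-norms in R^2.
theta = linspace( 0, 2*pi, 400 );
d = [ cos( theta ); sin( theta ) ];      % directions around the unit circle

n1   = sum( abs( d ), 1 );               % 1-norm of each column
ninf = max( abs( d ), [], 1 );           % infinity-norm of each column

figure; hold on; axis equal;
plot( d(1,:), d(2,:) );                  % ||x||_2   = 1 (circle)
plot( d(1,:)./n1,   d(2,:)./n1 );        % ||x||_1   = 1 (diamond)
plot( d(1,:)./ninf, d(2,:)./ninf );      % ||x||_inf = 1 (square)
legend( '2-norm', '1-norm', 'infinity-norm' );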


YouTube: https://www.youtube.com/watch?v=Ov77sE90P58

1.2.6 Equivalence of vector norms

YouTube: https://www.youtube.com/watch?v=qjZyKHvL13E
Homework 1.2.6.1 Fill out the following table:

$x = \left(\begin{array}{r} 1 \\ 0 \\ 0 \end{array}\right)$:  $\|x\|_1 = \;?$,  $\|x\|_\infty = \;?$,  $\|x\|_2 = \;?$

$x = \left(\begin{array}{r} 1 \\ 1 \\ 1 \end{array}\right)$:  $\|x\|_1 = \;?$,  $\|x\|_\infty = \;?$,  $\|x\|_2 = \;?$

$x = \left(\begin{array}{r} 1 \\ -2 \\ -1 \end{array}\right)$:  $\|x\|_1 = \;?$,  $\|x\|_\infty = \;?$,  $\|x\|_2 = \;?$

Solution.

$x = \left(\begin{array}{r} 1 \\ 0 \\ 0 \end{array}\right)$:  $\|x\|_1 = 1$,  $\|x\|_\infty = 1$,  $\|x\|_2 = 1$

$x = \left(\begin{array}{r} 1 \\ 1 \\ 1 \end{array}\right)$:  $\|x\|_1 = 3$,  $\|x\|_\infty = 1$,  $\|x\|_2 = \sqrt{3}$

$x = \left(\begin{array}{r} 1 \\ -2 \\ -1 \end{array}\right)$:  $\|x\|_1 = 4$,  $\|x\|_\infty = 2$,  $\|x\|_2 = \sqrt{1^2 + (-2)^2 + (-1)^2} = \sqrt{6}$

In this course, norms are going to be used to reason that vectors are "small" or "large".
It would be unfortunate if a vector were small in one norm yet large in another norm.
Fortunately, the following theorem excludes this possibility:
Theorem 1.2.6.1 Equivalence of vector norms. Let $\|\cdot\| : \mathbb{C}^m \rightarrow \mathbb{R}$ and $|||\cdot||| : \mathbb{C}^m \rightarrow \mathbb{R}$ both be vector norms. Then there exist positive scalars $\sigma$ and $\tau$ such that for all $x \in \mathbb{C}^m$
$$\sigma\|x\| \leq |||x||| \leq \tau\|x\|.$$

Proof. The proof depends on a result from real analysis (sometimes called "advanced calculus") that states that $\sup_{x \in S} f(x)$ is attained for some vector $x \in S$ as long as $f$ is continuous and $S$ is a compact (closed and bounded) set. For any norm $\|\cdot\|$, the unit ball $\|x\| = 1$ is a compact set. When a supremum is an element in $S$, it is called the maximum instead and $\sup_{x \in S} f(x)$ can be restated as $\max_{x \in S} f(x)$.
Those who have not studied real analysis (which is not a prerequisite for this course) have to take this on faith. It is a result that we will use a few times in our discussion.
We prove that there exists a $\tau$ such that for all $x \in \mathbb{C}^m$
$$|||x||| \leq \tau\|x\|,$$
leaving the rest of the proof as an exercise.

Let $x \in \mathbb{C}^m$ be an arbitrary vector. W.l.o.g. assume that $x \neq 0$. Then

$|||x|||$
  = < algebra >
$\frac{|||x|||}{\|x\|}\|x\|$
  ≤ < algebra >
$\left(\sup_{z \neq 0}\frac{|||z|||}{\|z\|}\right)\|x\|$
  = < change of variables: $y = z/\|z\|$ >
$\left(\sup_{\|y\|=1}|||y|||\right)\|x\|$
  = < the set $\|y\| = 1$ is compact >
$\left(\max_{\|y\|=1}|||y|||\right)\|x\|$

The desired $\tau$ can now be chosen to equal $\max_{\|y\|=1}|||y|||$. ∎

YouTube: https://www.youtube.com/watch?v=I1W6ErdEyoc

Homework 1.2.6.2 Complete the proof of Theorem 1.2.6.1.

Solution. We need to prove that
$$\sigma\|x\| \leq |||x|||.$$
From the first part of the proof of Theorem 1.2.6.1, we know that there exists a $\rho > 0$ such that
$$\|x\| \leq \rho|||x|||$$
and hence
$$\frac{1}{\rho}\|x\| \leq |||x|||.$$
We conclude that
$$\sigma\|x\| \leq |||x|||$$
where $\sigma = 1/\rho$.
Example 1.2.6.2
• Let $x \in \mathbb{R}^2$. Use the picture to determine the constant $C$ such that $\|x\|_1 \leq C\|x\|_\infty$. Give a vector $x$ for which $\|x\|_1 = C\|x\|_\infty$.

• For $x \in \mathbb{R}^2$ and the $C$ you determined in the first part of this problem, prove that $\|x\|_1 \leq C\|x\|_\infty$.

• Let $x \in \mathbb{C}^m$. Extrapolate from the last part the constant $C$ such that $\|x\|_1 \leq C\|x\|_\infty$ and then prove the inequality. Give a vector $x$ for which $\|x\|_1 = C\|x\|_\infty$.

Solution.

• Consider the picture.

◦ The red square represents all vectors such that $\|x\|_\infty = 1$ and the white square represents all vectors such that $\|x\|_1 = 2$.
◦ All points on or outside the red square represent vectors $y$ such that $\|y\|_\infty \geq 1$. Hence if $\|y\|_1 = 2$ then $\|y\|_\infty \geq 1$.
◦ Now, pick any $z \neq 0$. Then $\|2z/\|z\|_1\|_1 = 2$. Hence
$$\|2z/\|z\|_1\|_\infty \geq 1,$$
which can be rewritten as
$$\|z\|_1 \leq 2\|z\|_\infty.$$
Thus, $C = 2$ works.
◦ Now, from the picture it is clear that $x = \left(\begin{array}{c} 1 \\ 1 \end{array}\right)$ has the property that $\|x\|_1 = 2\|x\|_\infty$. Thus, the inequality is "tight."

• We now prove that $\|x\|_1 \leq 2\|x\|_\infty$ for $x \in \mathbb{R}^2$:

$\|x\|_1$
  = < definition >
$|\chi_0| + |\chi_1|$
  ≤ < algebra >
$\max(|\chi_0|, |\chi_1|) + \max(|\chi_0|, |\chi_1|)$
  = < algebra >
$2\max(|\chi_0|, |\chi_1|)$
  = < definition >
$2\|x\|_\infty$.

• From the last part we extrapolate that $\|x\|_1 \leq m\|x\|_\infty$.

$\|x\|_1$
  = < definition >
$\sum_{i=0}^{m-1}|\chi_i|$
  ≤ < algebra >
$\sum_{i=0}^{m-1}\left(\max_{j=0}^{m-1}|\chi_j|\right)$
  = < algebra >
$m\max_{j=0}^{m-1}|\chi_j|$
  = < definition >
$m\|x\|_\infty$.

Equality holds (i.e., $\|x\|_1 = m\|x\|_\infty$) for $x = \left(\begin{array}{c} 1 \\ 1 \\ \vdots \\ 1 \end{array}\right)$.

Some will be able to go straight for the general result, while others will want to seek inspiration from the picture and/or the specialized case where $x \in \mathbb{R}^2$.
Homework 1.2.6.3 Let $x \in \mathbb{C}^m$. The following table organizes the various bounds:

$\|x\|_1 \leq C_{1,2}\|x\|_2$        $\|x\|_1 \leq C_{1,\infty}\|x\|_\infty$
$\|x\|_2 \leq C_{2,1}\|x\|_1$        $\|x\|_2 \leq C_{2,\infty}\|x\|_\infty$
$\|x\|_\infty \leq C_{\infty,1}\|x\|_1$        $\|x\|_\infty \leq C_{\infty,2}\|x\|_2$

For each, determine the constant $C_{x,y}$ and prove the inequality, including that it is a tight inequality.
Hint: look at the hint!

Hint. $\|x\|_1 \leq \sqrt{m}\|x\|_2$:
This is the hardest one to prove. Do it last and use the following hint:
Consider $y = \left(\begin{array}{c} \chi_0/|\chi_0| \\ \vdots \\ \chi_{m-1}/|\chi_{m-1}| \end{array}\right)$ and employ the Cauchy-Schwarz inequality.

Solution 1 ($\|x\|_1 \leq C_{1,2}\|x\|_2$). $\|x\|_1 \leq \sqrt{m}\|x\|_2$:
Consider $y = \left(\begin{array}{c} \chi_0/|\chi_0| \\ \vdots \\ \chi_{m-1}/|\chi_{m-1}| \end{array}\right)$. Then
$$|x^Hy| = \left|\sum_{i=0}^{m-1}\bar{\chi}_i\chi_i/|\chi_i|\right| = \left|\sum_{i=0}^{m-1}|\chi_i|^2/|\chi_i|\right| = \left|\sum_{i=0}^{m-1}|\chi_i|\right| = \|x\|_1.$$
We also notice that $\|y\|_2 = \sqrt{m}$.
From the Cauchy-Schwarz inequality we know that
$$\|x\|_1 = |x^Hy| \leq \|x\|_2\|y\|_2 = \sqrt{m}\|x\|_2.$$
If we now choose
$$x = \left(\begin{array}{c} 1 \\ \vdots \\ 1 \end{array}\right)$$
then $\|x\|_1 = m$ and $\|x\|_2 = \sqrt{m}$ so that $\|x\|_1 = \sqrt{m}\|x\|_2$.

Solution 2 ($\|x\|_1 \leq C_{1,\infty}\|x\|_\infty$). $\|x\|_1 \leq m\|x\|_\infty$:
See Example 1.2.6.2.

Solution 3 ($\|x\|_2 \leq C_{2,1}\|x\|_1$). $\|x\|_2 \leq \|x\|_1$:

$\|x\|_2^2$
  = < definition >
$\sum_{i=0}^{m-1}|\chi_i|^2$
  ≤ < algebra >
$\left(\sum_{i=0}^{m-1}|\chi_i|\right)^2$
  = < definition >
$\|x\|_1^2$.

Taking the square root of both sides yields $\|x\|_2 \leq \|x\|_1$.
If we now choose
$$x = \left(\begin{array}{c} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{array}\right)$$
then $\|x\|_2 = \|x\|_1$.

Solution 4 ($\|x\|_2 \leq C_{2,\infty}\|x\|_\infty$). $\|x\|_2 \leq \sqrt{m}\|x\|_\infty$:

$\|x\|_2^2$
  = < definition >
$\sum_{i=0}^{m-1}|\chi_i|^2$
  ≤ < algebra >
$\sum_{i=0}^{m-1}\left(\max_{j=0}^{m-1}|\chi_j|\right)^2$
  = < definition >
$\sum_{i=0}^{m-1}\|x\|_\infty^2$
  = < algebra >
$m\|x\|_\infty^2$.

Taking the square root of both sides yields $\|x\|_2 \leq \sqrt{m}\|x\|_\infty$.
Consider
$$x = \left(\begin{array}{c} 1 \\ \vdots \\ 1 \end{array}\right).$$
Then $\|x\|_2 = \sqrt{m}$ and $\|x\|_\infty = 1$ so that $\|x\|_2 = \sqrt{m}\|x\|_\infty$.

Solution 5 ($\|x\|_\infty \leq C_{\infty,1}\|x\|_1$). $\|x\|_\infty \leq \|x\|_1$:

$\|x\|_\infty$
  = < definition >
$\max_{i=0}^{m-1}|\chi_i|$
  ≤ < algebra >
$\sum_{i=0}^{m-1}|\chi_i|$
  = < definition >
$\|x\|_1$.

Consider
$$x = \left(\begin{array}{c} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{array}\right).$$
Then $\|x\|_\infty = 1 = \|x\|_1$.

Solution 6 ($\|x\|_\infty \leq C_{\infty,2}\|x\|_2$). $\|x\|_\infty \leq \|x\|_2$:

$\|x\|_\infty^2$
  = < definition >
$\left(\max_{i=0}^{m-1}|\chi_i|\right)^2$
  = < algebra >
$\max_{i=0}^{m-1}|\chi_i|^2$
  ≤ < algebra >
$\sum_{i=0}^{m-1}|\chi_i|^2$
  = < definition >
$\|x\|_2^2$.

Taking the square root of both sides yields $\|x\|_\infty \leq \|x\|_2$.
Consider
$$x = \left(\begin{array}{c} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{array}\right).$$
Then $\|x\|_\infty = 1 = \|x\|_2$.

Solution 7 (Table of constants).

$\|x\|_1 \leq \sqrt{m}\|x\|_2$        $\|x\|_1 \leq m\|x\|_\infty$
$\|x\|_2 \leq \|x\|_1$        $\|x\|_2 \leq \sqrt{m}\|x\|_\infty$
$\|x\|_\infty \leq \|x\|_1$        $\|x\|_\infty \leq \|x\|_2$
Remark 1.2.6.3 The bottom line is that, modulo a constant factor, if a vector is "small"
in one norm, it is "small" in all other norms. If it is "large" in one norm, it is "large" in all
other norms.
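The constants in Solution 7 can be spot-checked numerically. The MATLAB sketch below draws a random vector of our own choosing and verifies that every bound holds (each line prints logical 1):

% Spot-check the norm equivalence bounds from Homework 1.2.6.3.
rng( 0 );
m = 10;
x = rand( m,1 ) - 0.5;

n1 = norm( x, 1 );  n2 = norm( x, 2 );  ninf = norm( x, inf );

n1   <= sqrt( m ) * n2      % ||x||_1   <= sqrt(m) ||x||_2
n1   <= m * ninf            % ||x||_1   <= m ||x||_inf
n2   <= n1                  % ||x||_2   <= ||x||_1
n2   <= sqrt( m ) * ninf    % ||x||_2   <= sqrt(m) ||x||_inf
ninf <= n1                  % ||x||_inf <= ||x||_1
ninf <= n2                  % ||x||_inf <= ||x||_2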

1.3 Matrix Norms


1.3.1 Of linear transformations and matrices

YouTube: https://www.youtube.com/watch?v=x kiZEbYh38


We briefly review the relationship between linear transformations and matrices, which is
key to understanding why linear algebra is all about matrices and vectors.
Definition 1.3.1.1 Linear transformations and matrices. Let $L : \mathbb{C}^n \rightarrow \mathbb{C}^m$. Then $L$ is said to be a linear transformation if for all $\alpha \in \mathbb{C}$ and $x, y \in \mathbb{C}^n$

• $L(\alpha x) = \alpha L(x)$. That is, scaling first and then transforming yields the same result as transforming first and then scaling.

• $L(x + y) = L(x) + L(y)$. That is, adding first and then transforming yields the same result as transforming first and then adding.

The importance of linear transformations comes in part from the fact that many problems in science boil down to, given a function $F : \mathbb{C}^n \rightarrow \mathbb{C}^m$ and vector $y \in \mathbb{C}^m$, find $x$ such that $F(x) = y$. This is known as an inverse problem. Under mild conditions, $F$ can be locally approximated with a linear transformation $L$ and then, as part of a solution method, one would want to solve $Lx = y$.
The following theorem provides the link between linear transformations and matrices:
Theorem 1.3.1.2 Let $L : \mathbb{C}^n \rightarrow \mathbb{C}^m$ be a linear transformation, $v_0, v_1, \cdots, v_{k-1} \in \mathbb{C}^n$, and $x \in \mathbb{C}^k$. Then
$$L(\chi_0 v_0 + \chi_1 v_1 + \cdots + \chi_{k-1}v_{k-1}) = \chi_0 L(v_0) + \chi_1 L(v_1) + \cdots + \chi_{k-1}L(v_{k-1}),$$
where
$$x = \left(\begin{array}{c} \chi_0 \\ \vdots \\ \chi_{k-1} \end{array}\right).$$
Proof. A simple inductive proof yields the result. For details, see Week 2 of Linear Algebra: Foundations to Frontiers (LAFF) [26]. ∎
The following set of vectors ends up playing a crucial role throughout this course:
Definition 1.3.1.3 Standard basis vector. In this course, we will use $e_j \in \mathbb{C}^m$ to denote the standard basis vector with a "1" in the position indexed with $j$. So,
$$e_j = \left(\begin{array}{c} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{array}\right) \longleftarrow j$$

Key is the fact that any vector $x \in \mathbb{C}^n$ can be written as a linear combination of the standard basis vectors of $\mathbb{C}^n$:
$$x = \left(\begin{array}{c} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{n-1} \end{array}\right) = \chi_0\left(\begin{array}{c} 1 \\ 0 \\ \vdots \\ 0 \end{array}\right) + \chi_1\left(\begin{array}{c} 0 \\ 1 \\ \vdots \\ 0 \end{array}\right) + \cdots + \chi_{n-1}\left(\begin{array}{c} 0 \\ 0 \\ \vdots \\ 1 \end{array}\right) = \chi_0 e_0 + \chi_1 e_1 + \cdots + \chi_{n-1}e_{n-1}.$$
Hence, if $L$ is a linear transformation,
$$L(x) = L(\chi_0 e_0 + \chi_1 e_1 + \cdots + \chi_{n-1}e_{n-1}) = \chi_0\underbrace{L(e_0)}_{a_0} + \chi_1\underbrace{L(e_1)}_{a_1} + \cdots + \chi_{n-1}\underbrace{L(e_{n-1})}_{a_{n-1}}.$$
If we now let $a_j = L(e_j)$ (the vector $a_j$ is the transformation of the standard basis vector $e_j$) and collect these vectors into a two-dimensional array of numbers:
$$A = \left(\begin{array}{cccc} a_0 & a_1 & \cdots & a_{n-1} \end{array}\right) \qquad (1.3.1)$$
then we notice that information for evaluating $L(x)$ can be found in this array, since $L$ can then alternatively be computed by
$$L(x) = \chi_0 a_0 + \chi_1 a_1 + \cdots + \chi_{n-1}a_{n-1}.$$
The array $A$ in (1.3.1) we call a matrix and the operation $Ax = \chi_0 a_0 + \chi_1 a_1 + \cdots + \chi_{n-1}a_{n-1}$ we call matrix-vector multiplication. Clearly
$$Ax = L(x).$$
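The "matrix-vector multiplication as a linear combination of the columns of A" view can be checked directly in MATLAB. The small sketch below reuses the matrix and vector from Homework 1.1.1.1 and compares A*x with the explicit linear combination:

% Matrix-vector multiplication as a linear combination of the columns of A.
A = [ 1 -2  1
      0 -1 -1
      0  0  2 ];           % the upper triangular matrix from Homework 1.1.1.1
x = [ -1; 2; 1 ];

b1 = A * x;                                          % built-in matrix-vector product
b2 = x(1) * A(:,1) + x(2) * A(:,2) + x(3) * A(:,3);  % chi_0 a_0 + chi_1 a_1 + chi_2 a_2

[ b1 b2 ]                   % both columns equal ( -4; -3; 2 )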
Remark 1.3.1.4 Notation. In these notes, as a rule,
• Roman upper case letters are used to denote matrices.

• Roman lower case letters are used to denote vectors.

• Greek lower case letters are used to denote scalars.

Corresponding letters from these three sets are used to refer to a matrix, the rows or columns of that matrix, and the elements of that matrix. If $A \in \mathbb{C}^{m \times n}$ then, partitioning $A$ by columns and rows and then exposing its elements,
$$A = \left(\begin{array}{cccc} a_0 & a_1 & \cdots & a_{n-1} \end{array}\right) = \left(\begin{array}{c} \widetilde{a}_0^T \\ \widetilde{a}_1^T \\ \vdots \\ \widetilde{a}_{m-1}^T \end{array}\right) = \left(\begin{array}{cccc} \alpha_{0,0} & \alpha_{0,1} & \cdots & \alpha_{0,n-1} \\ \alpha_{1,0} & \alpha_{1,1} & \cdots & \alpha_{1,n-1} \\ \vdots & \vdots & & \vdots \\ \alpha_{m-1,0} & \alpha_{m-1,1} & \cdots & \alpha_{m-1,n-1} \end{array}\right).$$
We now notice that the standard basis vector $e_j \in \mathbb{C}^m$ equals the column of the $m \times m$ identity matrix indexed with $j$:
$$I = \left(\begin{array}{cccc} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{array}\right) = \left(\begin{array}{cccc} e_0 & e_1 & \cdots & e_{m-1} \end{array}\right) = \left(\begin{array}{c} \widetilde{e}_0^T \\ \widetilde{e}_1^T \\ \vdots \\ \widetilde{e}_{m-1}^T \end{array}\right).$$

Remark 1.3.1.5 The important thing to note is that a matrix is a convenient represen-
tation of a linear transformation and matrix-vector multiplication is an alternative way for
evaluating that linear transformation.

YouTube: https://www.youtube.com/watch?v=cCFAnQmwwIw
Let's investigate matrix-matrix multiplication and its relationship to linear transformations. Consider two linear transformations
$$L_A : \mathbb{C}^k \rightarrow \mathbb{C}^m \text{ represented by matrix } A,$$
$$L_B : \mathbb{C}^n \rightarrow \mathbb{C}^k \text{ represented by matrix } B,$$
and define
$$L_C(x) = L_A(L_B(x)),$$
as the composition of $L_A$ and $L_B$. Then it can be easily shown that $L_C$ is also a linear transformation. Let $m \times n$ matrix $C$ represent $L_C$. How are $A$, $B$, and $C$ related? If we let $c_j$ equal the column of $C$ indexed with $j$, then because of the link between matrices, linear transformations, and standard basis vectors
$$c_j = L_C(e_j) = L_A(L_B(e_j)) = L_A(b_j) = Ab_j,$$
where $b_j$ equals the column of $B$ indexed with $j$. Now, we say that $C = AB$ is the product of $A$ and $B$ defined by
$$\left(\begin{array}{cccc} c_0 & c_1 & \cdots & c_{n-1} \end{array}\right) = A\left(\begin{array}{cccc} b_0 & b_1 & \cdots & b_{n-1} \end{array}\right) = \left(\begin{array}{cccc} Ab_0 & Ab_1 & \cdots & Ab_{n-1} \end{array}\right)$$
and define the matrix-matrix multiplication as the operation that computes
$$C := AB,$$
which you will want to pronounce "C becomes A times B" to distinguish assignment from equality. If you think carefully how individual elements of C are computed, you will realize that they equal the usual "dot product of rows of A with columns of B."
YouTube: https://www.youtube.com/watch?v=g_9RbA5EOIc
As already mentioned, throughout this course it will be important that you can think about matrices in terms of their columns and rows, and matrix-matrix multiplication (and other operations with matrices and vectors) in terms of columns and rows. It is also important to be able to think about matrix-matrix multiplication in three different ways. If we partition each matrix by rows and by columns:
\[
C = \left(\, c_0 \;\cdots\; c_{n-1} \,\right) = \begin{pmatrix} \widetilde c_0^T \\ \vdots \\ \widetilde c_{m-1}^T \end{pmatrix}, \quad
A = \left(\, a_0 \;\cdots\; a_{k-1} \,\right) = \begin{pmatrix} \widetilde a_0^T \\ \vdots \\ \widetilde a_{m-1}^T \end{pmatrix}, \quad
B = \left(\, b_0 \;\cdots\; b_{n-1} \,\right) = \begin{pmatrix} \widetilde b_0^T \\ \vdots \\ \widetilde b_{k-1}^T \end{pmatrix},
\]
then C := AB can be computed in the following ways (a MATLAB sketch of all three follows below):
1. By columns:
\[ \left(\, c_0 \;\cdots\; c_{n-1} \,\right) := A \left(\, b_0 \;\cdots\; b_{n-1} \,\right) = \left(\, A b_0 \;\cdots\; A b_{n-1} \,\right) . \]
In other words, c_j := A b_j for all columns of C.
2. By rows:
\[ \begin{pmatrix} \widetilde c_0^T \\ \vdots \\ \widetilde c_{m-1}^T \end{pmatrix} := \begin{pmatrix} \widetilde a_0^T \\ \vdots \\ \widetilde a_{m-1}^T \end{pmatrix} B = \begin{pmatrix} \widetilde a_0^T B \\ \vdots \\ \widetilde a_{m-1}^T B \end{pmatrix} . \]
In other words, \widetilde c_i^T := \widetilde a_i^T B for all rows of C.
3. One you may not have thought about much before:
\[ C := \left(\, a_0 \;\cdots\; a_{k-1} \,\right) \begin{pmatrix} \widetilde b_0^T \\ \vdots \\ \widetilde b_{k-1}^T \end{pmatrix} = a_0 \widetilde b_0^T + \cdots + a_{k-1} \widetilde b_{k-1}^T , \]
which should be thought of as a sequence of rank-1 updates, since each term is an outer product and an outer product has rank of at most one.
These three cases are special cases of the more general observation that, if we can partition C, A, and B by blocks (submatrices),
\[
C = \begin{pmatrix} C_{0,0} & \cdots & C_{0,N-1} \\ \vdots & & \vdots \\ C_{M-1,0} & \cdots & C_{M-1,N-1} \end{pmatrix}, \quad
A = \begin{pmatrix} A_{0,0} & \cdots & A_{0,K-1} \\ \vdots & & \vdots \\ A_{M-1,0} & \cdots & A_{M-1,K-1} \end{pmatrix}, \quad
B = \begin{pmatrix} B_{0,0} & \cdots & B_{0,N-1} \\ \vdots & & \vdots \\ B_{K-1,0} & \cdots & B_{K-1,N-1} \end{pmatrix},
\]
where the partitionings are "conformal", then
\[ C_{i,j} = \sum_{p=0}^{K-1} A_{i,p} B_{p,j} . \]
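To make the three formulations concrete, here is a minimal MATLAB sketch (ours, not part of the original notes; sizes and variable names are arbitrary) that computes C := AB by columns, by rows, and as a sequence of rank-1 updates, so that the results can be compared against MATLAB's built-in A * B.

    % Three ways of computing C := A B (our own illustrative sketch).
    m = 4; k = 3; n = 5;
    A = rand( m, k );  B = rand( k, n );

    C_by_columns = zeros( m, n );
    for j = 1:n
      C_by_columns( :, j ) = A * B( :, j );       % c_j := A b_j
    end

    C_by_rows = zeros( m, n );
    for i = 1:m
      C_by_rows( i, : ) = A( i, : ) * B;          % i-th row of C := (i-th row of A) B
    end

    C_rank1 = zeros( m, n );
    for p = 1:k
      C_rank1 = C_rank1 + A( :, p ) * B( p, : );  % rank-1 update with an outer product
    end
    % All three agree with A * B up to roundoff.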

Remark 1.3.1.6 If the above review of linear transformations, matrices, matrix-vector


multiplication, and matrix-matrix multiplication makes you exclaim "That is all a bit too
fast for me!" then it is time for you to take a break and review Weeks 2-5 of our introductory
linear algebra course "Linear Algebra: Foundations to Frontiers." Information, including
notes [26] (optionally downloadable for free) and a link to the course on edX [27] (which can
be audited for free) can be found at http://ulaff.net.

1.3.2 What is a matrix norm?

YouTube: https://www.youtube.com/watch?v=6DsBTz1eU7E
A matrix norm extends the notions of an absolute value and vector norm to matrices:
Definition 1.3.2.1 Matrix norm. Let \nu : \mathbb{C}^{m \times n} \to \mathbb{R}. Then \nu is a (matrix) norm if for all A, B \in \mathbb{C}^{m \times n} and all \alpha \in \mathbb{C}
• A \neq 0 \Rightarrow \nu(A) > 0 (\nu is positive definite),
• \nu(\alpha A) = |\alpha| \, \nu(A) (\nu is homogeneous), and
• \nu(A + B) \leq \nu(A) + \nu(B) (\nu obeys the triangle inequality).


Homework 1.3.2.1 Let \nu : \mathbb{C}^{m \times n} \to \mathbb{R} be a matrix norm.
ALWAYS/SOMETIMES/NEVER: \nu(0) = 0.
Hint. Review the proof on Homework 1.2.2.1.
Answer. ALWAYS.
Now prove it.

Solution. Let A œ Cm◊n . Then


‹(0)
= <0·A=0>
‹(0 · A)
= < Î · ΋ is homogeneous >
0‹(A)
= < algebra >
0
Remark 1.3.2.2 As we do with vector norms, we will typically use \| \cdot \| instead of \nu(\cdot) for a function that is a matrix norm.

1.3.3 The Frobenius norm

YouTube: https://www.youtube.com/watch?v=0ZHnGgrJXa4
Definition 1.3.3.1 The Frobenius norm. The Frobenius norm \| \cdot \|_F : \mathbb{C}^{m \times n} \to \mathbb{R} is defined for A \in \mathbb{C}^{m \times n} by
\[ \|A\|_F = \sqrt{ \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} |\alpha_{i,j}|^2 }
          = \sqrt{ |\alpha_{0,0}|^2 + \cdots + |\alpha_{0,n-1}|^2 + \cdots + |\alpha_{m-1,0}|^2 + \cdots + |\alpha_{m-1,n-1}|^2 } . \]
One can think of the Frobenius norm as taking the columns of the matrix, stacking them on top of each other to create a vector of size mn, and then taking the vector 2-norm of the result.
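A minimal MATLAB sketch (our own illustration, not from the original notes) of that observation: stacking the columns of A with A(:) and taking the vector 2-norm gives the same value as the built-in Frobenius norm.

    % The Frobenius norm equals the 2-norm of the "stacked columns" vector.
    A = rand( 4, 3 );
    fro_via_stacking = norm( A( : ), 2 );   % A(:) stacks the columns of A into one vector
    fro_builtin      = norm( A, 'fro' );
    % The two values agree up to roundoff.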
Homework 1.3.3.1 Partition the m \times n matrix A by columns:
\[ A = \left(\, a_0 \;\; \cdots \;\; a_{n-1} \,\right) . \]
Show that
\[ \|A\|_F^2 = \sum_{j=0}^{n-1} \|a_j\|_2^2 . \]

Solution.
ÎAÎF
Ò
= < definition >
qm≠1 qn≠1 2
i=0 j=0 |–i,j |
=
Òq
< commutativity of addition >
n≠1 qm≠1 2
j=0 i=0 |–i,j |
=
Òq
< definition of vector 2-norm >
n≠1 2
j=0 Îaj Î2

Homework 1.3.3.2 Prove that the Frobenius norm is a norm.


Solution. Establishing that this function is positive definite and homogeneous 1 is straight 2
forward. To show that the triangle inequality holds it helps to realize that if A = a0 a1 · · · an≠1
then
ÎAÎF
Òq
= < definition >
m≠1 qn≠1 2
i=0 j=0 |–i,j |

Òq
= < commutativity of addition >
n≠1 qm≠1 2
j=0 i=0 |–i,j |

Òq
= < definition of vector 2-norm >
n≠1 2
j=0 Îaj Î2
= < definition of vector 2-norm >
ı̂.Q R.2
ı. a0 .
ı. .
ı.c d.
ı.c
ı.c
a1 d.
ı.c .. d. .
d.
ı.a
Ù. . b.
.
. an≠1 .2
In other words, it equals the vector 2-norm of the vector that is created by stacking the
columns of A on top of each other. One can then exploit the fact that the vector 2-norm
obeys the triangle inequality.
Homework 1.3.3.3 Partition m ◊ n matrix A by rows:
Q R
aÂT0
c
A=c .. d
a . d
b.
T
am≠1
Â

Show that
m≠1
ÿ
ÎAÎ2F = ÎaÂi Î22 ,
i=0

where aÂi = aÂTi .


T

Solution.
ÎAÎF
Ò
= < definition >
qm≠1 qn≠1 2
i=0 j=0 |–i,j |
=
Òq
< definition of vector 2-norm >
m≠1 Â 2
i=0 Îai Î2 .

Let us review the definition of the transpose of a matrix (which we have already used
when defining the dot product of two real-valued vectors and when identifying a row in a
matrix):
Definition 1.3.3.2 Transpose. If A œ Cm◊n and
Q R
–0,0 –0,1 · · · –0,n≠1
c d
c
c –1,0 –1,1 · · · –1,n≠1 d
d
c .. .. .. d
A= c
c . . . d
d
c .. d
c
a . d
b
–m≠1,0 –m≠1,1 · · · –m≠1,n≠1

then its transpose is defined by


Q R
–0,0 –1,0 · · · –m≠1,0
c d
c
c –0,1 –1,1 · · · –m≠1,1 d
d
c .. .. .. d
AT = c
c . . . d.
d
c .. d
c
a . d
b
–0,n≠1 –1,n≠1 · · · –m≠1,n≠1


For complex-valued matrices, it is important to also define the Hermitian transpose
of a matrix:
Definition 1.3.3.3 Hermitian transpose. If A œ Cm◊n and
Q R
–0,0 –0,1 · · · –0,n≠1
c d
c
c –1,0 –1,1 · · · –1,n≠1 d
d
c .. .. .. d
A= c
c . . . d
d
c .. d
c
a . d
b
–m≠1,0 –m≠1,1 · · · –m≠1,n≠1

then its Hermitian transpose is defined by


Q R
–0,0 –1,0 · · · –m≠1,0
c d
c
c –0,1 –1,1 · · · –m≠1,1 d
d
T c .. .. .. d
AH = A cc . . . d,
d
c .. d
c
a . d
b
–0,n≠1 –1,n≠1 · · · –m≠1,n≠1

where A denotes the conjugate of a matrix, in which each element of the matrix is
conjugated. ⌃
We note that
• A = AT .
T

• If A œ Rm◊n , then AH = AT .
• If x œ Cm , then xH is defined consistent with how we have used it before.
• If – œ C, then –H = –.
(If you view the scalar as a matrix and then Hermitian transpose it, you get the matrix
with as only element –.)
Don’t Panic!. While working with complex-valued scalars, vectors, and matrices may appear
a bit scary at first, you will soon notice that it is not really much more complicated than
working with their real-valued counterparts.
Homework 1.3.3.4 Let A œ Cm◊k and B œ Ck◊n . Using what you once learned about
matrix transposition and matrix-matrix multiplication, reason that (AB)H = B H AH .
Solution.
(AB)H
= < XH = XT >
(AB) T

= < you once discovered that (AB)T = B T AT >


T
B A T

= < you may check separately that XY = XY >


T
B A T

= < XT = X >
T

B H AH
Definition 1.3.3.4 Hermitian. A matrix A œ Cm◊m is Hermitian if and only if A = AH .

Obviously, if A œ Rm◊m , then A is a Hermitian matrix if and only if A is a symmetric
matrix.
Homework 1.3.3.5 Let A œ Cm◊n .
ALWAYS/SOMETIMES/NEVER: ÎAH ÎF = ÎAÎF .

Answer. ALWAYS
Solution.
ÎAÎF
Ò
= < definition >
qm≠1 qn≠1 2
i=0 j=0 |–i,j |

Òq
= < commutativity of addition >
n≠1 qm≠1 2
j=0 i=0 |–i,j |

Òq
= < change of variables >
n≠1 qm≠1 2
i=0 j=0 |–j,i |

Òq
= < algebra >
n≠1 qm≠1 2
i=0 j=0 |–j,i |
= < definition >
H
ÎA ÎF
Similarly, other matrix norms can be created from vector norms by viewing the matrix
as a vector. It turns out that, other than the Frobenius norm, these aren’t particularly
interesting in practice. An example can be found in Homework 1.6.1.6.
Remark 1.3.3.5 The Frobenius norm of an m \times n matrix is easy to compute (requiring O(mn) computations). The functions f(A) = \|A\|_F and f(A) = \|A\|_F^2 are also differentiable. However, you'd be hard-pressed to find a meaningful way of linking the definition of the Frobenius norm to a measure of an underlying linear transformation (other than by first transforming that linear transformation into a matrix).

1.3.4 Induced matrix norms

YouTube: https://www.youtube.com/watch?v=M6ZVBRFnYcU
Recall from Subsection 1.3.1 that a matrix, A œ Cm◊n , is a 2-dimensional array of
numbers that represents a linear transformation, L : Cn æ Cm , such that for all x œ Cn the
matrix-vector multiplication Ax yields the same result as does L(x).
The question "What is the norm of matrix A?" or, equivalently, "How ’large’ is A?" is the
same as asking the question "How ’large’ is L?" What does this mean? It suggests that what
we really want is a measure of how much linear transformation L or, equivalently, matrix A
"stretches" (magnifies) the "length" of a vector. This observation motivates a class of matrix
norms known as induced matrix norms.

Definition 1.3.4.1 Induced matrix norm. Let Î · ε : Cm æ R and Î · ΋ : Cn æ R be


vector norms. Define Î · ε,‹ : Cm◊n æ R by

ÎAxε
ÎAε,‹ = sup .
x œ Cn Îx΋
x ”= 0


Matrix norms that are defined in this way are said to be induced matrix norms.
Remark 1.3.4.2 In context, it is obvious (from the column size of the matrix) what the
size of vector x is. For this reason, we will write

ÎAxε ÎAxε
ÎAε,‹ = sup as ÎAε,‹ = sup .
x œ Cn Îx΋ x”=0 Îx΋
x ”= 0
Let us start by interpreting this. How "large" A is, as measured by \|A\|_{\mu,\nu}, is defined as the most that A magnifies the length of nonzero vectors, where the length of the vector, x, is measured with norm \| \cdot \|_\nu and the length of the transformed vector, Ax, is measured with norm \| \cdot \|_\mu.
Two comments are in order. First,
\[ \sup_{x \neq 0} \frac{\|Ax\|_\mu}{\|x\|_\nu} = \sup_{\|x\|_\nu = 1} \|Ax\|_\mu . \]
This follows from the following sequence of equivalences:
\[
\sup_{x \neq 0} \frac{\|Ax\|_\mu}{\|x\|_\nu}
= \sup_{x \neq 0} \left\| \frac{Ax}{\|x\|_\nu} \right\|_\mu \;\;(\text{homogeneity})
= \sup_{x \neq 0} \left\| A \frac{x}{\|x\|_\nu} \right\|_\mu
= \sup_{\|y\|_\nu = 1} \|Ay\|_\mu \;\;(\text{substitute } y = x / \|x\|_\nu).
\]
Second, the "sup" (which stands for supremum) is used because we can't claim yet that there is a nonzero vector x for which
\[ \sup_{x \neq 0} \frac{\|Ax\|_\mu}{\|x\|_\nu} \]
is attained or, alternatively, a vector x with \|x\|_\nu = 1 for which
\[ \sup_{\|x\|_\nu = 1} \|Ax\|_\mu \]
is attained. In words, it is not immediately obvious that there is a vector for which the
supremum is attained. The fact is that there is always such a vector x. The proof again
depends on a result from real analysis, also employed in Proof 1.2.6.1, that states that
supxœS f (x) is attained for some vector x œ S as long as f is continuous and S is a compact
set. For any norm, ÎxÎ = 1 is a compact set. Thus, we can replace sup by max from here
on in our discussion.
We conclude that the following two definitions are equivalent definitions to the one we
already gave:
Definition 1.3.4.3 Let \| \cdot \|_\mu : \mathbb{C}^m \to \mathbb{R} and \| \cdot \|_\nu : \mathbb{C}^n \to \mathbb{R} be vector norms. Define \| \cdot \|_{\mu,\nu} : \mathbb{C}^{m \times n} \to \mathbb{R} by
\[ \|A\|_{\mu,\nu} = \max_{x \neq 0} \frac{\|Ax\|_\mu}{\|x\|_\nu}
   \quad \text{or, equivalently,} \quad
   \|A\|_{\mu,\nu} = \max_{\|x\|_\nu = 1} \|Ax\|_\mu . \]


Remark 1.3.4.4 In this course, we will often encounter proofs involving norms. Such
proofs are much cleaner if one starts by strategically picking the most convenient of these
two definitions. Until you gain the intuition needed to pick which one is better, you may
have to start your proof using one of them and then switch to the other one if the proof
becomes unwieldy.
Theorem 1.3.4.5 \| \cdot \|_{\mu,\nu} : \mathbb{C}^{m \times n} \to \mathbb{R} is a norm.
Proof. To prove this, we merely check whether the three conditions are met:
Let A, B œ Cm◊n and – œ C be arbitrarily chosen. Then

• A ”= 0 ∆ ÎAε,‹ > 0 (Î · ε,‹ is positive definite):


Notice that A ”= 0 means that at least one of its columns is not a zero vector (since at
least one element is nonzero). Let us assume it is the jth column, aj , that is nonzero.
Let ej equal the column of I (the identity matrix) indexed with j. Then

ÎAε,‹
= < definition >
maxx”=0 Îx΋
ÎAxε

Ø < ej is a specific vector >


ÎAej ε
Îej ΋
= < Aej = aj >
Îaj ε
Îej ΋
> < we assumed that aj ”= 0 >
0.

• ΖAε,‹ = |–|ÎAε,‹ (Î · ε,‹ is homogeneous):



ΖAε,‹
= < definition >
maxx”=0 ΖAxÎ
Îx΋
µ

= < homogeneity >


maxx”=0 |–| ÎAxÎ
Îx΋
µ

= < algebra >


|–| maxx”=0 ÎAxÎ
Îx΋
µ

= < definition >


|–|ÎAε,‹ .

• ÎA + Bε,‹ Æ ÎAε,‹ + ÎBε,‹ (Î · ε,‹ obeys the triangle inequality).

ÎA + Bε,‹
= < definition >
maxx”=0 Îx΋
Î(A+B)xε

= < distribute >


maxx”=0 ÎAx+BxÎ
Îx΋
µ

Æ < triangle inequality >


ÎAxε +ÎBxε
maxx”=0 Îx΋
Æ 1 < algebra > 2
maxx”=0 ÎAxÎ
Îx΋
µ
+ ÎBxÎ
Îx΋
µ

Æ < algebra >


maxx”=0 ÎAxÎ
Îx΋
µ
+ maxx”=0 ÎBxÎ
Îx΋
µ

= < definition >


ÎAε,‹ + ÎBε,‹ .


When Î · ε and Î · ΋ are the same norm (but possibly for different sizes of vectors), the
induced norm becomes
Definition 1.3.4.6 Define Î · ε : Cm◊n æ R by

ÎAxε
ÎAε = max
x”=0 Îxε

or, equivalently,
ÎAε = max ÎAxε .
Îxε =1


Homework 1.3.4.1 Consider the vector p-norm Î · Îp : Cn æ R and let us denote the
induced matrix norm by ||| · ||| : Cm◊n æ R for this exercise: |||A||| = maxx”=0 ÎAxÎ
ÎxÎp
p
.
ALWAYS/SOMETIMES/NEVER: |||y||| = ÎyÎp for y œ C . m

Answer. ALWAYS
Solution.
|||y|||
= < definition >
maxx”=0 ÎxÎp
ÎyxÎp
Ò
= < x is a scalar since y is a matrix with one column. Then ÎxÎp = Î(‰0 )Îp = p
|‰0 |p = |‰0 | >
max‰0 ”=0 |‰0 | ÎyÎ p
|‰0 |
= < algebra >
max‰0 ”=0 ÎyÎp
= < algebra >
ÎyÎp
This last exercise is important. One can view a vector x œ Cm as an m ◊ 1 matrix. What
this last exercise tells us is that regardless of whether we view x as a matrix or a vector,
ÎxÎp is the same.
We already encountered the vector p-norms as an important class of vector norms. The
matrix p-norm is induced by the corresponding vector norm, as defined by
Definition 1.3.4.7 Matrix p-norm. For any vector p-norm, define the corresponding
matrix p-norm Î · Îp : Cm◊n æ R by

ÎAxÎp
ÎAÎp = max or, equivalently, ÎAÎp = max ÎAxÎp .
x”=0 ÎxÎp ÎxÎp =1


Remark 1.3.4.8 The matrix p-norms with p œ {1, 2, Œ} will play an important role in our
course, as will the Frobenius norm. As the course unfolds, we will realize that in practice the
matrix 2-norm is of great theoretical importance but difficult to evaluate, except for special
matrices. The 1-norm, Œ-norm, and Frobenius norms are straightforward and relatively
cheap to compute (for an m ◊ n matrix, computing these costs O(mn) computation).

1.3.5 The matrix 2-norm

YouTube: https://www.youtube.com/watch?v=wZA H_K9XeI


Let us instantiate the definition of the vector p norm for the case where p = 2, giving us
a matrix norm induced by the vector 2-norm or Euclidean norm:
Definition 1.3.5.1 Matrix 2-norm. Define the matrix 2-norm \| \cdot \|_2 : \mathbb{C}^{m \times n} \to \mathbb{R} by
\[ \|A\|_2 = \max_{x \neq 0} \frac{\|Ax\|_2}{\|x\|_2} = \max_{\|x\|_2 = 1} \|Ax\|_2 . \]


Remark 1.3.5.2 The problem with the matrix 2-norm is that it is hard to compute. At some point later in this course, you will find out that if A is a Hermitian matrix (A = A^H), then \|A\|_2 = |\lambda_0|, where \lambda_0 equals the eigenvalue of A that is largest in magnitude. You may recall from your prior linear algebra experience that computing eigenvalues involves computing the roots of polynomials, and for polynomials of degree three or greater, this is a nontrivial task. We will see that the matrix 2-norm plays an important role in the theory of linear algebra, but less so in practical computation.
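The following MATLAB sketch (ours, not from the original notes) illustrates why the 2-norm is considered expensive: the obvious ways to compute it go through a singular value or eigenvalue computation, facts that are only established later in the course.

    % Three equivalent ways of obtaining the matrix 2-norm (our own illustration).
    A = rand( 5, 3 );
    two_norm = norm( A, 2 );                    % built-in matrix 2-norm
    via_svd  = max( svd( A ) );                 % largest singular value (Week 2)
    via_eig  = sqrt( max( eig( A' * A ) ) );    % square root of largest eigenvalue of A'*A
    % All three agree up to roundoff, but each requires a nontrivial computation.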
Example 1.3.5.3 Show that
.A B.
.
. ”0 0 .
.
. . = max(|”0 |, |”1 |).
. 0 ”1 .
2

Solution.

YouTube: https://www.youtube.com/watch?v=B2rz0i5BB3A
[slides (PDF)] [LaTeX source] ⇤
Remark 1.3.5.4 The proof of the last example builds on a general principle: Showing that
maxxœD f (x) = – for some function f : D æ R can be broken down into showing that both

max f (x) Æ –
xœD

and
max f (x) Ø –.
xœD

In turn, showing that maxxœD f (x) Ø – can often be accomplished by showing that there
exists a vector y œ D such that f (y) = – since then

– = f (y) Æ max f (x).


xœD

We will use this technique in future proofs involving matrix norms.



Homework 1.3.5.1 Let D œ Cm◊m be a diagonal matrix with diagonal entries ”0 , . . . , ”m≠1 .
Show that
ÎDÎ2 = max |”j |.
m≠1
j=0

Solution. First, we show that ÎDÎ2 = maxÎxÎ2 =1 ÎDxÎ2 Æ maxm≠1


i=0 |”i |:

ÎDÎ22
= < definition >
maxÎxÎ2 =1 ÎDxÎ22
= < diagonal vector multiplication >
.Q R.2
. ”0 ‰0 .
. .
.c
maxÎxÎ2 =1 ..c .. d.
a . d.
b.
. .
.
”m≠1 ‰m≠1 .2
= < definition >
q
maxÎxÎ2 =1 m≠1
i=0 |”i ‰i |
2

= < homogeneity >


q
maxÎxÎ2 =1 m≠1 2
i=0 |”i | |‰i |
2

Æ < algebra >


q Ë È2
maxÎxÎ2 =1 m≠1
i=0 maxj=0 |”j | |‰i |
m≠1 2

= < algebra >


Ë È2 qm≠1
maxm≠1
j=0 |”j | maxÎxÎ2 =1 i=0 |‰i |2
= < ÎxÎ2 = 1 >
Ë È2
maxm≠1
j=0 |”j | .

Next, we show that there is a vector y with ÎyÎ2 = 1 such that ÎDyÎ2 = maxm≠1
i=0 |”i |:
Let j be such that |”j | = maxm≠1
i=0 |”i | and choose y = ej . Then

ÎDyÎ2
= < y = ej >
ÎDej Î2
= < D = diag(”0 , . . . , ”m≠1 ) >
Δj ej Î2
= < homogeneity >
|”j |Îej Î2
= < Îej |2 = 1 >
|”j |
= < choice of j >
maxm≠1i=0 |”i|

Hence ÎDÎ2 = maxm≠1


j=0 |”j |.

Homework 1.3.5.2 Let y œ Cm and x œ Cn .


ALWAYS/SOMETIMES/NEVER: ÎyxH Î2 = ÎyÎ2 ÎxÎ2 .
Hint. Prove that ÎyxH Î2 Æ ÎyÎ2 ÎxÎ2 and that there exists a vector z so that ÎyxH zÎ2
ÎzÎ2
=
ÎyÎ2 ÎxÎ2 .
Answer. ALWAYS
Now prove it!
Solution. W.l.o.g. assume that x ”= 0.
We know by the Cauchy-Schwarz inequality that |xH z| Æ ÎxÎ2 ÎzÎ2 . Hence

ÎyxH Î2
= < definition >
maxÎzÎ2 =1 ÎyxH zÎ2
= < Î · Î2 is homogenius >
maxÎzÎ2 =1 |xH z|ÎyÎ2
Æ < Cauchy-Schwarz inequality >
maxÎzÎ2 =1 ÎxÎ2 ÎzÎ2 ÎyÎ2
= < ÎzÎ2 = 1 >
ÎxÎ2 ÎyÎ2 .

But also
ÎyxH Î2
= < definition >
maxz”=0 ÎyxH zÎ2 /ÎzÎ2
Ø < specific z >
H
Îyx xÎ2 /ÎxÎ2
= < xH x = ÎxÎ22 ; homogeneity >
2
ÎxÎ2 ÎyÎ2 /ÎxÎ2
= < algebra >
ÎyÎ2 ÎxÎ2 .
Hence
ÎyxH Î2 = ÎyÎ2 ÎxÎ2 .
Homework 1.3.5.3 Let A œ Cm◊n and aj its column indexed with j. ALWAYS/SOMETIMES/
NEVER: Îaj Î2 Æ ÎAÎ2 .
Hint. What vector has the property that aj = Ax?
Answer. ALWAYS.
Now prove it!

Solution.
Îaj Î2
=
ÎAej Î2
Æ
maxÎxÎ2 =1 ÎAxÎ2
=
ÎAÎ2 .
Homework 1.3.5.4 Let A œ Cm◊n . Prove that
• ÎAÎ2 = maxÎxÎ2 =ÎyÎ2 =1 |y H Ax|.

• ÎAH Î2 = ÎAÎ2 .

• ÎAH AÎ2 = ÎAÎ22 .

Hint. Proving ÎAÎ2 = maxÎxÎ2 =ÎyÎ2 =1 |y H Ax| requires you to invoke the Cauchy-Schwarz
inequality from Theorem 1.2.3.3.
Solution.

• ÎAÎ2 = maxÎxÎ2 =ÎyÎ2 =1 |y H Ax|:

maxÎxÎ2 =ÎyÎ2 =1 |y H Ax|


Æ < Cauchy-Schwarz >
maxÎxÎ2 =ÎyÎ2 =1 ÎyÎ2 ÎAxÎ2
= < ÎyÎ2 = 1 >
maxÎxÎ2 =1 ÎAxÎ2
= < definition >
ÎAÎ2 .
Also, we know there exists x with ÎxÎ2 = 1 such that ÎAÎ2 = ÎAxÎ2 . Let y =
Ax/ÎAxÎ2 . Then

|y H Ax|
- = - instantiate >
<
- (Ax)H (Ax) -
- ÎAxÎ -
2

-
= - < z H z = ÎzÎ22 >
- ÎAxÎ2 -
- ÎAxÎ2 -
2
= < algebra >
ÎAxÎ2
= < x was chosen so that ÎAxÎ2 = ÎAÎ2 >
ÎAÎ2

Hence the bound is attained. We conclude that ÎAÎ2 = maxÎxÎ2 =ÎyÎ2 =1 |y H Ax|.

• ÎAH Î2 = ÎAÎ2 :

ÎAH Î2
= < first part of homework >
maxÎxÎ2 =ÎyÎ2 =1 |y H AH x|
= < |–| = |–| >
maxÎxÎ2 =ÎyÎ2 =1 |xH Ay|
= < first part of homework >
ÎAÎ2 .

• ÎAH AÎ2 = ÎAÎ22 :

ÎAH AÎ2
= < first part of homework >
maxÎxÎ2 =ÎyÎ2 =1 |y H AH Ax|
Ø < restricts choices of y >
maxÎxÎ2 =1 |xH AH Ax|
= < z H z = ÎzÎ22 >
maxÎxÎ2 =1 ÎAxÎ22
= < algebra >
1 22
maxÎxÎ2 =1 ÎAxÎ2
= < definition >
ÎAÎ22 .
So, ÎAH AÎ2 Ø ÎAÎ22 .
Now, let’s show that ÎAH AÎ2 Æ ÎAÎ22 . This would be trivial if we had already discussed
the fact that Î · · · Î2 is a submultiplicative norm (which we will in a future unit). But
let’s do it from scratch. First, we show that ÎAxÎ2 Æ ÎAÎ2 ÎxÎ2 for all (appropriately
sized) matrices A and x:

ÎAxÎ2
= < norms are homogeneus >
x
ÎA ÎxÎ 2
Î 2 ÎxÎ2
Æ < algebra >
maxÎyÎ2 =1 ÎAyÎ2 ÎxÎ2
= < definition of 2-norm
ÎAÎ2 ÎxÎ2 .

With this, we can then show that

ÎAH AÎ2
= < definition of 2-norm >
maxÎxÎ2 =1 ÎAH AxÎ2
Æ < ÎAzÎ2 Æ ÎAÎ2 ÎzÎ2 >
maxÎxÎ2 =1 (ÎAH Î2 ÎAxÎ2 )
= < algebra >
ÎAH Î2 maxÎxÎ2 =1 ÎAxÎ2 )
= < definition of 2-norm >
ÎAH Î2 ÎAÎ2
= < ÎAH Î2 = ÎAÎ >
2
ÎAÎ2

Alternatively, as suggested by one of the learners in the course, we can use the Cauchy-
Schwarz inequality:

ÎAH AÎ2
= < part (a) of this homework >
maxÎxÎ2 =ÎyÎ2 =1 |xH AH Ay|
= < simple manipulation >
maxÎxÎ2 =ÎyÎ2 =1 |(Ax)H Ay|
Æ < Cauchy-Schwarz inequality >
maxÎxÎ2 =ÎyÎ2 =1 ÎAxÎ2 ÎAyÎ2
= < algebra >
maxÎxÎ2 =1 ÎAxÎ2 maxÎyÎ2 =1 ÎAyÎ2
= < definition >
ÎAÎ2 ÎAÎ2
= < algebra >
ÎAÎ22
Q R
A0,0 ··· A0,N ≠1
Homework 1.3.5.5 Partition A = c
c .. .. d
a . . d.
b
AM ≠1,0 · · · AM ≠1,N ≠1
ALWAYS/SOMETIMES/NEVER: ÎAi,j Î2 Æ ÎAÎ2 .
Hint. Using Homework 1.3.5.4 choose vj and wi such that ÎAi,j Î2 = |wiH Ai,j vj |.
Solution. Choose vj and wi such that ÎAi,j Î2 = |wiH Ai,j vj |. Next, choose v and w such

that Q R Q R
0 0
c . d c . d
c . d c . d
c . d c . d
c d c d
c
c 0 d
d
c
c 0 d
d
c d c d
v= c vj d, w= c wi d.
c d c d
c
c 0 d
d
c
c 0 d
d
c .. d c .. d
c
a . d
b
c
a . d
b
0 0
You can check (using partitioned multiplication and the last homework) that wH Av =
wiH Ai,j vj . Then, by Homework 1.3.5.4

ÎAÎ2
= < last homework >
maxÎxÎ2 =ÎyÎ2 =1 |y H Ax|
Ø < w and v are specific vectors >
|wH Av|
= < partitioned multiplication >
|wiH Ai,j vj |
= < how wi and vj were chosen >
ÎAi,j Î2 .

1.3.6 Computing the matrix 1-norm and ∞-norm

YouTube: https://www.youtube.com/watch?v=QTKZdGQ2C6w
The matrix 1-norm and matrix ∞-norm are of great importance because, unlike the matrix 2-norm, they are easy and relatively cheap to compute. The following exercises show how to practically compute the matrix 1-norm and ∞-norm.
Homework 1.3.6.1 Let A \in \mathbb{C}^{m \times n} and partition A = \left(\, a_0 \;\; a_1 \;\; \cdots \;\; a_{n-1} \,\right).
ALWAYS/SOMETIMES/NEVER: \|A\|_1 = \max_{0 \le j < n} \|a_j\|_1.
Hint. Prove it for the real valued case first.
Answer. ALWAYS

Solution. Let J be chosen so that max0Æj<n Îaj Î1 = ÎaJ Î1 . Then

ÎAÎ1
= < definition >
maxÎxÎ1 =1 ÎAxÎ1
= <. expose the columns of A and elements of x >
Q R.
. ‰ 0
.
. .
.1 2c d.
. c ‰1 d.
maxÎxÎ1 =1 .. a0 a1 · · · an≠1 cc .. d .
d.
.
.
a . b. .
. ‰n≠1 .1
= < definition of matrix-vector multiplication >
maxÎxÎ1 =1 Ή0 a0 + ‰1 a1 + · · · + ‰n≠1 an≠1 Î1
Æ < triangle inequality >
maxÎxÎ1 =1 (Ή0 a0 Î1 + Ή1 a1 Î1 + · · · + Ήn≠1 an≠1 Î1 )
= < homogeneity >
maxÎxÎ1 =1 (|‰0 |Îa0 Î1 + |‰1 |Îa1 Î1 + · · · + |‰n≠1 |Îan≠1 Î1 )
Æ < choice of aJ >
maxÎxÎ1 =1 (|‰0 |ÎaJ Î1 + |‰1 |ÎaJ Î1 + · · · + |‰n≠1 |ÎaJ Î1 )
= < factor out ÎaJ Î1 >
maxÎxÎ1 =1 (|‰0 | + |‰1 | + · · · + |‰n≠1 |) ÎaJ Î1
= < algebra >
ÎaJ Î1 .

Also,
ÎaJ Î1
= < eJ picks out column J >
ÎAeJ Î1
Æ < eJ is a specific choice of x >
maxÎxÎ1 =1 ÎAxÎ1 .
Hence
ÎaJ Î1 Æ max ÎAxÎ1 Æ ÎaJ Î1
ÎxÎ1 =1

which implies that


max ÎAxÎ1 = ÎaJ Î1 = max Îaj Î1 .
ÎxÎ1 =1 0Æj<n
Q R
aÂT0
c aÂT1 d
c d
Homework 1.3.6.2 Let A œ C m◊n
and partition A = c
c .. d.
d
a . b
aÂTm≠1
ALWAYS/SOMETIMES/NEVER:
ÎAÎŒ = max ÎaÂi Î1 (= max (|–i,0 | + |–i,1 | + · · · + |–i,n≠1 |))
0Æi<m 0Æi<m

Notice that in this exercise aÂi is really (aÂTi )T since aÂTi is the label for the ith row of matrix

A.
Hint. Prove it for the real valued case first.
Answer. ALWAYS Q R
aÂT0
Solution. Partition A = c
c .. d
a b. Then
. d
T
aÂm≠1

ÎAÎŒ
= < definition >
maxÎxÎŒ =1 ÎAxÎŒ
= < .Q expose rows R .
>
. a T .
. 0 .
.c
maxÎxÎŒ =1 ..c .. d .
a . b .
d x.
. .
. a  Tm≠1 .
Œ
= < .Q matrix-vectorR.
multiplication >
. aÂT0 x .
. .
.c
maxÎxÎŒ =1 ..c .. d.
a . d.
b.
. .
. a  Tm≠1 x .
Œ
= < 1 definition of Î · 2· · ÎŒ >
maxÎxÎŒ =1 max0Æi<m |aÂTi x|
= < expose aÂTi x >
q
maxÎxÎŒ =1 max0Æi<m | n≠1 p=0 –i,p ‰p |
Æ < triangle inequality >
q
maxÎxÎŒ =1 max0Æi<m n≠1 p=0 |–i,p ‰p |
= < algebra >
q
maxÎxÎŒ =1 max0Æi<m n≠1 p=0 (|–i,p ||‰p |)
Æ < algebra >
q
maxÎxÎŒ =1 max0Æi<m n≠1 p=0 (|–i,p |(maxk |‰k |))
= < definition of Î · ÎŒ >
q
maxÎxÎŒ =1 max0Æi<m n≠1 p=0 (|–i,p |ÎxÎŒ )
= < ÎxÎŒ = 1 >
q
max0Æi<m n≠1 p=0 |–i,p |
= < definition of Î · Î1 >
max0Æi<m ÎaÂi Î1

so that ÎAÎŒ Æ max0Æi<m ÎaÂi Î1 .


We also want to show
Q thatRÎAÎŒ Ø max0Æi<m Îai Î1 . Let k be such that max0Æi<m Îai Î1 =
 Â
Â0
ÎaÂk Î1 and pick y = c
c .. d
a b so that a
. d  Tk y = |–k,0 | + |–k,1 | + · · · + |–k,n≠1 | = Îa
 k Î1 . (This
Ân≠1
is a matter of picking Âj = |–k,j |/–k,j if –k,j ”= 0 and Âj = 1 otherwise. Then |Âj | = 1, and

hence ÎyÎŒ = 1 and Âj –k,j = |–k,j |.) Then

ÎAÎŒ
= < definition >
maxÎxÎ1 =1 ÎAxÎŒ
= <.Qexpose rows R .
>
. a T .
. 0 .
maxÎxÎ1 =1 ..c
.c .. d .
a . b .
d x.
. .
. a  Tm≠1 .
Œ
.Q
Ø <
R .
y is a specific x>
. aÂ0T .
. .
.c .. d .
.c
.a . b . d y.
. .
. a Tm≠1 .
Œ
.Q
= <Rmatrix-vector
.
multiplication >
. a T
y .
. 0 .
.c .. d.
.c
.a . d.
b.
. .
. aÂTm≠1 y .Œ
Ø < algebra >
|aÂTk y|
= < choice of y >
Î ak Î 1 .
Â
= < choice of k >
max0Æi<m ÎaÂi Î1
Remark 1.3.6.1 The last homework provides a hint as to how to remember how to compute the matrix 1-norm and ∞-norm: since \|x\|_1 must result in the same value whether x is considered a vector or a matrix, we can remember that the matrix 1-norm equals the maximum of the 1-norms of the columns of the matrix. Similarly, considering \|x\|_\infty as a vector norm or as a matrix norm reminds us that the matrix ∞-norm equals the maximum of the 1-norms of the vectors that become the rows of the matrix.
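A minimal MATLAB sketch (ours) of these observations: the 1-norm is the maximum column sum of absolute values and the ∞-norm is the maximum row sum, each costing only O(mn) operations.

    % Cheap computation of the matrix 1-norm and infinity-norm (our own illustration).
    A = rand( 5, 4 ) + 1i * rand( 5, 4 );
    norm1   = max( sum( abs( A ), 1 ) );    % maximum over columns of the column 1-norms
    norminf = max( sum( abs( A ), 2 ) );    % maximum over rows of the row 1-norms
    assert( abs( norm1   - norm( A, 1 )   ) < 1e-12 );
    assert( abs( norminf - norm( A, inf ) ) < 1e-12 );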

1.3.7 Equivalence of matrix norms

YouTube: https://www.youtube.com/watch?v=Csqd4AnH7ws
Homework 1.3.7.1 Fill out the following table:

    A                                        \|A\|_1   \|A\|_\infty   \|A\|_F   \|A\|_2
    [ 1 0 0 ; 0 1 0 ; 0 0 1 ]
    [ 1 1 1 ; 1 1 1 ; 1 1 1 ; 1 1 1 ]
    [ 0 1 0 ; 0 1 0 ; 0 1 0 ]

Hint. For the second and third, you may want to use Homework 1.3.5.2 when computing
the 2-norm.
Solution.

    A                                        \|A\|_1   \|A\|_\infty   \|A\|_F      \|A\|_2
    [ 1 0 0 ; 0 1 0 ; 0 0 1 ]                  1          1           \sqrt{3}      1
    [ 1 1 1 ; 1 1 1 ; 1 1 1 ; 1 1 1 ]          4          3           2\sqrt{3}     2\sqrt{3}
    [ 0 1 0 ; 0 1 0 ; 0 1 0 ]                  3          1           \sqrt{3}      \sqrt{3}

To compute the 2-norm of I, notice that
\[ \|I\|_2 = \max_{\|x\|_2 = 1} \|Ix\|_2 = \max_{\|x\|_2 = 1} \|x\|_2 = 1 . \]
Next, notice that
\[
\begin{pmatrix} 1&1&1 \\ 1&1&1 \\ 1&1&1 \\ 1&1&1 \end{pmatrix}
 = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 0&1&0 \\ 0&1&0 \\ 0&1&0 \end{pmatrix}
 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \begin{pmatrix} 0 & 1 & 0 \end{pmatrix},
\]
which allows us to invoke the result from Homework 1.3.5.2.
We saw that vector norms are equivalent in the sense that if a vector is "small" in one

norm, it is "small" in all other norms, and if it is "large" in one norm, it is "large" in all other
norms. The same is true for matrix norms.
Theorem 1.3.7.1 Equivalence of matrix norms. Let \| \cdot \| : \mathbb{C}^{m \times n} \to \mathbb{R} and ||| \cdot ||| : \mathbb{C}^{m \times n} \to \mathbb{R} both be matrix norms. Then there exist positive scalars \sigma and \tau such that for all A \in \mathbb{C}^{m \times n}
\[ \sigma \|A\| \leq |||A||| \leq \tau \|A\| . \]
Proof. The proof again builds on the fact that the supremum over a compact set is achieved and can be replaced by the maximum.
We will prove that there exists a \tau such that for all A \in \mathbb{C}^{m \times n}
\[ |||A||| \leq \tau \|A\| , \]
leaving the rest of the proof to the reader.
Let A \in \mathbb{C}^{m \times n} be an arbitrary matrix. W.l.o.g. assume that A \neq 0 (the zero matrix). Then
\[
|||A||| = \frac{|||A|||}{\|A\|} \|A\|
\leq \left( \sup_{Z \neq 0} \frac{|||Z|||}{\|Z\|} \right) \|A\|
= \left( \sup_{Z \neq 0} \left|\left|\left| \frac{Z}{\|Z\|} \right|\right|\right| \right) \|A\|
= \left( \sup_{\|B\| = 1} |||B||| \right) \|A\|
= \left( \max_{\|B\| = 1} |||B||| \right) \|A\| ,
\]
where the first inequality uses homogeneity, the change of variables B = Z / \|Z\| is used next, and the supremum can be replaced by the maximum because the set \|B\| = 1 is compact. The desired \tau can now be chosen to equal \max_{\|B\| = 1} |||B|||.
Remark 1.3.7.2 The bottom line is that, modulo a constant factor, if a matrix is "small"
in one norm, it is "small" in any other norm.
Homework 1.3.7.2 Given A œ Cm◊n show that ÎAÎ2 Æ ÎAÎF . For what matrix is equality
attained?
Hmmm, actually, this is really easy to prove once we know about the SVD... Hard to
prove without it. So, this problem will be moved...
Solution. Next week, we will learn about the SVD. Let us go ahead and insert that proof here, for future reference.
Let A = U \Sigma V^H be the Singular Value Decomposition of A, where U and V are unitary and \Sigma = \mathrm{diag}( \sigma_0, \ldots, \sigma_{\min(m,n)-1} ) with \sigma_0 \geq \sigma_1 \geq \cdots \geq \sigma_{\min(m,n)-1} \geq 0. Then
\[ \|A\|_2 = \|U \Sigma V^H\|_2 = \sigma_0 \]
and
\[ \|A\|_F = \|U \Sigma V^H\|_F = \|\Sigma\|_F = \sqrt{ \sigma_0^2 + \cdots + \sigma_{\min(m,n)-1}^2 } . \]
Hence, \|A\|_2 \leq \|A\|_F.
Hence, ÎAÎ2 Æ ÎAÎF .


Homework 1.3.7.3 Let A \in \mathbb{C}^{m \times n}. The following table summarizes the equivalence of various matrix norms:

    \|A\|_1 \leq \sqrt{m} \|A\|_2       \|A\|_1 \leq m \|A\|_\infty              \|A\|_1 \leq \sqrt{m} \|A\|_F
    \|A\|_2 \leq \sqrt{n} \|A\|_1       \|A\|_2 \leq \sqrt{m} \|A\|_\infty       \|A\|_2 \leq \|A\|_F
    \|A\|_\infty \leq n \|A\|_1         \|A\|_\infty \leq \sqrt{n} \|A\|_2       \|A\|_\infty \leq \sqrt{n} \|A\|_F
    \|A\|_F \leq \sqrt{n} \|A\|_1       \|A\|_F \leq \; ? \; \|A\|_2             \|A\|_F \leq \sqrt{m} \|A\|_\infty

For each, prove the inequality, including that it is a tight inequality for some nonzero A.
(Skip \|A\|_F \leq \; ? \; \|A\|_2; we will revisit it in Week 2.)
Solution.
Ô
• ÎAÎ1 Æ mÎAÎ2 :

ÎAÎ1
= < definition >
maxx”=0 ÎAxÎ
ÎxÎ1
1

Ô
Æ Ô< ÎzÎ1 Æ mÎzÎ2 >
maxx”=0 mÎAxÎ
ÎxÎ1
2

Æ Ô< ÎzÎ1 Ø ÎzÎ2 >


maxx”=0 mÎAxÎ
ÎxÎ2
2

Ô = < algebra; definition >


mÎAÎ2
Q R
1
c
c 1 d
d
Equality is attained for A = c
c .. d.
d
a . b
1

• ÎAÎ1 Æ mÎAÎŒ :

ÎAÎ1
= < definition >
maxx”=0 ÎxÎ1
ÎAxÎ1

Æ < ÎzÎ1 Æ mÎzÎŒ >


maxx”=0 ÎxÎ1
mÎAxÎŒ

Æ < ÎzÎ1 Ø ÎzÎŒ >


maxx”=0 mÎAxÎ
ÎxÎŒ
Œ

= < algebra; definition >


mÎAÎŒ
Q R
1
c
c 1 d
d
Equality is attained for A = c
c .. d.
d
a . b
1
Ô
• ÎAÎ1 Æ mÎAÎF :
It pays to show that ÎAÎ2 Æ ÎAÎF first. Then

ÎAÎ1
Ô Æ < last part >
mÎAÎ2
Ô Æ < some other part:ÎAÎ2 Æ ÎAÎF >
mÎAÎF .

Q R
1
c
c 1 d
d
Equality is attained for A = c
c .. d.
d
a . b
1
Ô
• ÎAÎ2 Æ nÎAÎ1 :

ÎAÎ2
= < definition >
maxx”=0 ÎAxÎ
ÎxÎ2
2

Æ < ÎzÎ2 Æ ÎzÎ1 >


maxx”=0 ÎxÎ2
ÎAxÎ1
Ô
Æ Ô< mÎzÎ2 Ø ÎzÎ1 when z is of size n >
maxx”=0 nÎAxÎ
ÎxÎ1
1

Ô = < algebra; definition >


nÎAÎ1 .
1 2
Equality is attained for A = 1 1 ··· 1 .
Ô
• ÎAÎ2 Æ mÎAÎŒ :

ÎAÎ2
= < definition >
maxx”=0 ÎAxÎ
ÎxÎ2
2

Ô
Æ Ô< ÎzÎ2 Æ mÎzÎŒ >
maxx”=0 mÎAxÎ
ÎxÎ2
Œ

Æ Ô< ÎzÎ2 Ø ÎzÎŒ >


maxx”=0 mÎAxÎ
ÎxÎŒ
Œ

Ô = < algebra; definition >


mÎAÎŒ .
Q R
1
c
c 1 d
d
Equality is attained for A = c
c .. d.
d
a . b
1

• ÎAÎ2 Æ ÎAÎF :
(See Homework 1.3.7.2, which requires the SVD, as mentioned...)

• Please share more solutions!
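The inequalities in the table of Homework 1.3.7.3 can be sanity-checked numerically. The following MATLAB sketch (ours; the small tolerances only guard against roundoff) does so for a random matrix.

    % Numerical sanity check of several norm-equivalence inequalities (our own sketch).
    m = 6;  n = 4;
    A = rand( m, n );
    assert( norm( A, 1 )     <= sqrt( m ) * norm( A, 2 )   + 1e-12 );   % ||A||_1   <= sqrt(m) ||A||_2
    assert( norm( A, 1 )     <= m * norm( A, inf )         + 1e-12 );   % ||A||_1   <= m ||A||_inf
    assert( norm( A, 2 )     <= norm( A, 'fro' )           + 1e-12 );   % ||A||_2   <= ||A||_F
    assert( norm( A, inf )   <= sqrt( n ) * norm( A, 2 )   + 1e-12 );   % ||A||_inf <= sqrt(n) ||A||_2
    assert( norm( A, 'fro' ) <= sqrt( n ) * norm( A, 1 )   + 1e-12 );   % ||A||_F   <= sqrt(n) ||A||_1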

1.3.8 Submultiplicative norms

YouTube: https://www.youtube.com/watch?v=TvthvYGt9x8
There are a number of properties that we would like for a matrix norm to have (but not all norms do have). Recalling that we would like for a matrix norm to measure by how much a vector is "stretched," it would be good if for a given matrix norm, \| \cdot \| : \mathbb{C}^{m \times n} \to \mathbb{R}, there are vector norms \| \cdot \|_\mu : \mathbb{C}^m \to \mathbb{R} and \| \cdot \|_\nu : \mathbb{C}^n \to \mathbb{R} such that, for arbitrary nonzero x \in \mathbb{C}^n, the matrix norm bounds by how much the vector is stretched:
\[ \frac{\|Ax\|_\mu}{\|x\|_\nu} \leq \|A\| \]
or, equivalently,
\[ \|Ax\|_\mu \leq \|A\| \|x\|_\nu , \]
where this second formulation has the benefit that it also holds if x = 0. When this relationship between the involved norms holds, the matrix norm is said to be subordinate to the vector norms:
Definition 1.3.8.1 Subordinate matrix norm. A matrix norm \| \cdot \| : \mathbb{C}^{m \times n} \to \mathbb{R} is said to be subordinate to vector norms \| \cdot \|_\mu : \mathbb{C}^m \to \mathbb{R} and \| \cdot \|_\nu : \mathbb{C}^n \to \mathbb{R} if, for all x \in \mathbb{C}^n,
\[ \|Ax\|_\mu \leq \|A\| \|x\|_\nu . \]
If \| \cdot \|_\mu and \| \cdot \|_\nu are the same norm (but perhaps for different m and n), then \| \cdot \| is said to be subordinate to the given vector norm.
Fortunately, all the norms that we will employ in this course are subordinate matrix
norms.
Homework 1.3.8.1 ALWAYS/SOMETIMES/NEVER: The Frobenius norm is subordinate
to the vector 2-norm.
Answer. TRUE
Now prove it.
Solution. W.l.o.g., assume x ”= 0.

ÎAxÎ2 ÎAyÎ2
ÎAxÎ2 = ÎxÎ2 Æ max ÎxÎ2 = max ÎAyÎ2 ÎxÎ2 = ÎAÎ2 ÎxÎ2 .
ÎxÎ2 y”=0 ÎyÎ2 ÎyÎ2 =1

So, it suffices to show that ÎAÎ2 Æ ÎAÎF . But we showed that in Homework 1.3.7.2.
Theorem 1.3.8.2 Induced matrix norms, \| \cdot \|_{\mu,\nu} : \mathbb{C}^{m \times n} \to \mathbb{R}, are subordinate to the norms, \| \cdot \|_\mu and \| \cdot \|_\nu, that induce them.
Proof. W.l.o.g. assume x \neq 0. Then
\[ \|Ax\|_\mu = \frac{\|Ax\|_\mu}{\|x\|_\nu} \|x\|_\nu \leq \left( \max_{y \neq 0} \frac{\|Ay\|_\mu}{\|y\|_\nu} \right) \|x\|_\nu = \|A\|_{\mu,\nu} \|x\|_\nu . \]


Corollary 1.3.8.3 Any matrix p-norm is subordinate to the corresponding vector p-norm.
Another desirable property that not all norms have is that
\[ \|AB\| \leq \|A\| \|B\| . \]
This requires the given norm to be defined for all matrix sizes.
Definition 1.3.8.4 Consistent matrix norm. A matrix norm \| \cdot \| : \mathbb{C}^{m \times n} \to \mathbb{R} is said to be a consistent matrix norm if it is defined for all m and n, using the same formula for all m and n.
Obviously, this definition is a bit vague. Fortunately, it is pretty clear that all the matrix
norms we will use in this course, the Frobenius norm and the p-norms, are all consistently
defined for all matrix sizes.
Definition 1.3.8.5 Submultiplicative matrix norm. A consistent matrix norm \| \cdot \| : \mathbb{C}^{m \times n} \to \mathbb{R} is said to be submultiplicative if it satisfies
\[ \|AB\| \leq \|A\| \|B\| . \]

Theorem 1.3.8.6 Let \| \cdot \| : \mathbb{C}^n \to \mathbb{R} be a vector norm defined for all n. Define the corresponding induced matrix norm as
\[ \|A\| = \max_{x \neq 0} \frac{\|Ax\|}{\|x\|} = \max_{\|x\| = 1} \|Ax\| . \]
Then for any A \in \mathbb{C}^{m \times k} and B \in \mathbb{C}^{k \times n} the inequality \|AB\| \leq \|A\| \|B\| holds.
In other words, induced matrix norms are submultiplicative. To prove this theorem, it
helps to first prove a simpler result:
Lemma 1.3.8.7 Let \| \cdot \| : \mathbb{C}^n \to \mathbb{R} be a vector norm defined for all n and let \| \cdot \| : \mathbb{C}^{m \times n} \to \mathbb{R} be the matrix norm it induces. Then \|Ax\| \leq \|A\| \|x\|.
Proof. If x = 0, the result obviously holds since then \|Ax\| = 0 and \|x\| = 0. Let x \neq 0. Then
\[ \|A\| = \max_{y \neq 0} \frac{\|Ay\|}{\|y\|} \geq \frac{\|Ax\|}{\|x\|} . \]
Rearranging this yields \|Ax\| \leq \|A\| \|x\|.
We can now prove the theorem:
Proof.
\[
\begin{array}{rcll}
\|AB\| & = & \max_{\|x\|=1} \|ABx\| & \text{(definition of induced matrix norm)} \\
       & = & \max_{\|x\|=1} \|A(Bx)\| & \text{(associativity)} \\
       & \leq & \max_{\|x\|=1} \left( \|A\| \, \|Bx\| \right) & \text{(lemma)} \\
       & \leq & \max_{\|x\|=1} \left( \|A\| \, \|B\| \, \|x\| \right) & \text{(lemma)} \\
       & = & \|A\| \, \|B\| & (\|x\| = 1).
\end{array}
\]
Homework 1.3.8.2 Show that ÎAxε Æ ÎAε,‹ Îx΋ .
Solution. W.l.o.g. assume that x ”= 0.

ÎAyε ÎAxε
ÎAε,‹ = max Ø .
y”=0 Îy΋ Îx΋

Rearranging this establishes the result.



Homework 1.3.8.3 Show that ÎABε Æ ÎAε,‹ ÎB΋,µ .


Solution.
ÎABε
= < definition >
maxÎxε =1 ÎABxε
Æ < last homework >
maxÎxε =1 ÎAε,‹ ÎBx΋
= < algebra >
ÎAε,‹ maxÎxε =1 ÎBx΋
= < definition >
ÎAε,‹ ÎB΋,µ
Homework 1.3.8.4 Show that the Frobenius norm, \| \cdot \|_F, is submultiplicative.
Solution.
ÎABÎ2F
= < partition >
.Q R .2
. a
 T .
. 0 .
.c  T d 1 2.
.c a1 d .
.c
.c .. d
d b0 b1 · · · bn≠1
.
.
.a
.
. b .
.
. aÂTm≠1 .
F
= < partitioned matrix-matrix multiplication >
.Q R.2
.
. aÂT0 b0 aÂT0 b1 · · · aÂT0 bn≠1 .
.
.c a T
aÂ0 b1 · · · aÂ0 bn≠1 d..
T T d
.c  0 b0
.c
.c .. .. .. d.
d.
.a
.
. . . b.
.
. a T
 m≠1 b0 a T
 m≠1 b1 · · · a m≠1 bn≠1 .
T
F
= < definition of Frobenius norm >
q q
i j |a Ti bj |2
= < definition of Hermitian transpose vs transpose >
q q H 2
i j |ai bj |
Â
Æ < Cauchy-Schwarz inequality >
q q 2 2
i j Î a
 i Î2 Îbj Î2
= <1 algebra 2and ÎxÎ2 = ÎxÎ2 >
q q
( i ÎaÂi Î22 ) 2
j Îbj Î2
= < previous observations about the Frobenius norm >
ÎAÎ2F ÎBÎ2F

Hence ÎABÎ2F Æ ÎAÎ2F ÎBÎ2F . Taking the square root of both sides leaves us with ÎABÎF Æ
ÎAÎF ÎBÎF .
This proof brings to the forefront that the notation aÂTi leads to some possible confusion.
In this particular situation, it is best to think of aÂi as a vector that, when transposed,
becomes the row of A indexed with i. In this case, aÂTi = aÂi and (aÂTi )H = aÂi (where, recall,
H

x equals the vector with all its entries conjugated). Perhaps it is best to just work through

this problem for the case where A and B are real-valued, and not worry too much about the
details related to the complex-valued case...
Homework 1.3.8.5 For A œ Cm◊n define

ÎAÎ = max max |–i,j |.


m≠1 n≠1
i=0 j=0

1. TRUE/FALSE: This is a norm.

2. TRUE/FALSE: This is a consistent norm.

Answer.

1. TRUE

2. TRUE

Solution.

1. This is a norm. You can prove this by checking the three conditions.

2. It is a consistent norm since it is defined for all m and n.


Remark 1.3.8.8 The important take-away: The norms we tend to use in this course, the
p-norms and the Frobenius norm, are all submultiplicative.
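A quick MATLAB check (our own sketch) of this take-away on a random example:

    % Submultiplicativity of the p-norms and the Frobenius norm (illustration only).
    A = rand( 4, 3 );  B = rand( 3, 5 );
    assert( norm( A*B, 1 )     <= norm( A, 1 )     * norm( B, 1 )     + 1e-12 );
    assert( norm( A*B, 2 )     <= norm( A, 2 )     * norm( B, 2 )     + 1e-12 );
    assert( norm( A*B, inf )   <= norm( A, inf )   * norm( B, inf )   + 1e-12 );
    assert( norm( A*B, 'fro' ) <= norm( A, 'fro' ) * norm( B, 'fro' ) + 1e-12 );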
Homework 1.3.8.6 Let A œ Cm◊n .
ALWAYS/SOMETIMES/NEVER: There exists a vector
Q R
‰0
x = a ... d
c d
b with |‰i | = 1 for i = 0, . . . , n ≠ 1
c
‰n≠1
such that ÎAÎŒ = ÎAxÎŒ .
Answer. ALWAYS
Now prove it!
Solution. Partition A by rows: Q R
aÂT0
c
A=c .. d
a . d
b.
T
aÂm≠1
We know that there exists k such that ÎaÂk Î1 = ÎAÎŒ . Now
ÎaÂk Î1
= < definition of 1-norm >
|–k,0 | + · · · + |–k,n≠1 |
= < algebra >
+ · · · + –k,n≠1
|–k,0 | |– |

–k,0 k,0 k,n≠1
–k,n≠1 .

where we take = 1 whenever –k,j = 0. Vector


|–k,j |
–k,j

Q |–k,0 | R
c –k,0 d
x=c
c .. d
a . d
b
|–k,n≠1 |
–k,n≠1

has the desired property.

1.3.9 Summary

YouTube: https://www.youtube.com/watch?v=DyoT2tJhxIs

1.4 Condition Number of a Matrix


1.4.1 Conditioning of a linear system

YouTube: https://www.youtube.com/watch?v=QwFQNAPKIwk
A question we will run into later in the course asks how accurate we can expect the
solution of a linear system to be if the right-hand side of the system has error in it.
Formally, this can be stated as follows: We wish to solve Ax = b, where A \in \mathbb{C}^{m \times m}, but the right-hand side has been perturbed by a small vector so that it becomes b + \delta b.
Remark 1.4.1.1 Notice how the \delta touches the b. This is meant to convey that \delta b is a single symbol that represents a vector, rather than the vector b multiplied by a scalar \delta.
The question now is how a relative error in b is amplified into a relative error in the solution x.
This is summarized as follows:
\[
\begin{array}{ll}
Ax = b & \text{exact equation} \\
A( x + \delta x ) = b + \delta b & \text{perturbed equation}
\end{array}
\]
We would like to determine a formula, \kappa(A, b, \delta b), that gives us a bound on how much a relative error in b is potentially amplified into a relative error in the solution x:
\[ \frac{\|\delta x\|}{\|x\|} \leq \kappa(A, b, \delta b) \frac{\|\delta b\|}{\|b\|} . \]
We assume that A has an inverse since otherwise there may be no solution or there may be an infinite number of solutions. To find an expression for \kappa(A, b, \delta b), we notice that subtracting Ax = b from Ax + A \delta x = b + \delta b leaves A \delta x = \delta b, and from this we deduce that
\[ Ax = b \quad\text{and}\quad \delta x = A^{-1} \delta b . \]
If we now use a vector norm \| \cdot \| and its induced matrix norm \| \cdot \|, then
\[ \|b\| = \|Ax\| \leq \|A\| \|x\| \quad\text{and}\quad \|\delta x\| = \|A^{-1} \delta b\| \leq \|A^{-1}\| \|\delta b\| , \]
since induced matrix norms are subordinate. From this we conclude that
\[ \frac{1}{\|x\|} \leq \|A\| \frac{1}{\|b\|} \quad\text{and}\quad \|\delta x\| \leq \|A^{-1}\| \|\delta b\| , \]
so that
\[ \frac{\|\delta x\|}{\|x\|} \leq \|A\| \|A^{-1}\| \frac{\|\delta b\|}{\|b\|} . \]
Thus, the desired expression \kappa(A, b, \delta b) doesn't depend on anything but the matrix A:
\[ \frac{\|\delta x\|}{\|x\|} \leq \underbrace{\|A\| \|A^{-1}\|}_{\kappa(A)} \frac{\|\delta b\|}{\|b\|} . \]
Definition 1.4.1.2 Condition number of a nonsingular matrix. The value \kappa(A) = \|A\| \|A^{-1}\| is called the condition number of a nonsingular matrix A.
A question becomes whether this is a pessimistic result or whether there are examples of b and \delta b for which the relative error in b is amplified by exactly \kappa(A). The answer is that, unfortunately, the bound is tight.
• There is an \hat x for which
\[ \|A\| = \max_{\|x\| = 1} \|Ax\| = \|A \hat x\| , \]
namely the x for which the maximum is attained. This is the direction of maximal magnification. Pick \hat b = A \hat x.
• There is a \delta \hat b for which
\[ \|A^{-1}\| = \max_{x \neq 0} \frac{\|A^{-1} x\|}{\|x\|} = \frac{\|A^{-1} \delta \hat b\|}{\|\delta \hat b\|} , \]
again, the x for which the maximum is attained.
It is when solving the perturbed system
\[ A( x + \delta x ) = \hat b + \delta \hat b \]
that the maximal magnification by \kappa(A) is observed.
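The following MATLAB sketch (our own construction, using the SVD that is only discussed in Week 2, and our own variable names) builds such a worst-case b and \delta b and compares the observed amplification of the relative error with the condition number.

    % Worst-case amplification of a relative error in b (illustrative sketch).
    m = 5;
    A = rand( m, m ) + m * eye( m );     % (strictly diagonally dominant, hence nonsingular)
    [ U, S, V ] = svd( A );
    b  = A * V( :, 1 );                  % right-hand side in the direction of maximal magnification by A
    db = 1e-8 * U( :, m );               % perturbation in the direction of maximal magnification by inv(A)
    x  = A \ b;
    dx = A \ ( b + db ) - x;
    amplification = ( norm( dx ) / norm( x ) ) / ( norm( db ) / norm( b ) )
    kappa2 = cond( A, 2 )                % the two values (approximately) agree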


Homework 1.4.1.1 Let Î · Î be a vector norm and corresponding induced matrix norm.
TRUE/FALSE: ÎIÎ = 1.
Answer. TRUE
Solution.
ÎIÎ = max ÎIxÎ = max ÎxÎ = 1
ÎxÎ=1 ÎxÎ=1

Homework 1.4.1.2 Let Î · Î be a vector norm and corresponding induced matrix norm,
and A a nonsingular matrix.
TRUE/FALSE: Ÿ(A) = ÎAÎÎA≠1 Î Ø 1.
Answer. TRUE
Solution.
1
= < last homework >
ÎIÎ
= < A is invertible >
ÎAA≠1 Î
Æ < Î · Î is submultiplicative >
≠1
ÎAÎÎA Î.
Remark 1.4.1.3 This last exercise shows that, since \kappa(A) \geq 1, the best we can hope for is that a relative error in b is translated into an equal relative error in the solution, which happens when \kappa(A) = 1.

1.4.2 Loss of digits of accuracy

YouTube: https://www.youtube.com/watch?v=-5 9Ov5RXYo


Homework 1.4.2.1 Let – = ≠14.24123 and –
‚ = ≠14.24723. Compute

• |–| =

• |– ≠ –|
‚ =

• |–≠–‚|
|–|
=
1 2
• log10 |–≠–‚|
|–|
=

Solution. Let – = ≠14.24123 and –


‚ = ≠14.24723. Compute

• |–| = 14.24123

• |– ≠ –|
‚ = 0.006

• |–≠–‚|
|–|
¥ 0.00042
1 2
• log10 |–≠–‚|
|–|
¥ ≠3.4

The point of this exercise is as follows:
• If you compare \alpha = -14.24123 and \hat\alpha = -14.24723, and you consider \hat\alpha to be an approximation of \alpha, then \hat\alpha is accurate to four digits: -14.24 is accurate.
• Computing \log_{10}( |\alpha - \hat\alpha| / |\alpha| ) tells you approximately how many decimal digits are accurate: 3.4 digits.
Be sure to read the solution to the last homework!
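In MATLAB, the computation from the last homework is a one-liner (our own sketch):

    % Approximate number of correct decimal digits in an approximation (illustration).
    alpha    = -14.24123;
    alphahat = -14.24723;
    digits_correct = -log10( abs( alpha - alphahat ) / abs( alpha ) )   % roughly 3.4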

1.4.3 The conditioning of an upper triangular matrix

YouTube: https://www.youtube.com/watch?v=LGBFyjhjt6U
We now revisit the material from the launch for the semester. We understand that when
solving Lx = b, even a small relative change to the right-hand side b can amplify into a large
relative change in the solution x̂ if the condition number of the matrix is large.
Homework 1.4.3.1 Change the script Assignments/Week01/matlab/Test_Upper_triangular_solve_100.m to also compute the condition number of matrix U, \kappa(U). Investigate what happens to the condition number as you change the problem size n.
Since in the example the upper triangular matrix is generated to have random values as
its entries, chances are that at least one element on its diagonal is small. If that element
were zero, then the triangular matrix would be singular. Even if it is not exactly zero, the
condition number of U becomes very large if the element is small.
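A quick experiment (our own sketch, separate from the course's script) that mirrors Homework 1.4.3.1: generate random upper triangular matrices of growing size and observe how the condition number behaves.

    % Condition numbers of random upper triangular matrices (illustration only).
    for n = [ 100 200 400 800 ]
      U = triu( rand( n, n ) );                       % random upper triangular matrix
      fprintf( 'n = %4d   kappa_1(U) = %e\n', n, cond( U, 1 ) );
    end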

1.5 Enrichments
1.5.1 Condition number estimation
It has been observed that high-quality numerical software should not only provide routines
for solving a given problem, but, when possible, should also (optionally) provide the user
with feedback on the conditioning (sensitivity to changes in the input) of the problem. In
this enrichment, we relate this to what you have learned this week.
Given a vector norm \| \cdot \| and induced matrix norm \| \cdot \|, the condition number of matrix A using that norm is given by \kappa(A) = \|A\| \|A^{-1}\|. When trying to practically compute the condition number, this leads to two issues:
• Which norm should we use? A case has been made in this week that the 1-norm and ∞-norm are candidates since they are easy and cheap to compute.
• It appears that A^{-1} needs to be computed. We will see in future weeks that this is costly: O(m^3) computation when A is m \times m. This is generally considered to be expensive.

This leads to the question "Can a reliable estimate of the condition number be cheaply
computed?" In this unit, we give a glimpse of how this can be achieved and then point the
interested learner to related papers.
Partition the m \times m matrix A by rows:
\[ A = \begin{pmatrix} \widetilde a_0^T \\ \vdots \\ \widetilde a_{m-1}^T \end{pmatrix} . \]
We recall that
• The ∞-norm is defined by
\[ \|A\|_\infty = \max_{\|x\|_\infty = 1} \|Ax\|_\infty . \]
• From Homework 1.3.6.2, we know that the ∞-norm can be practically computed as
\[ \|A\|_\infty = \max_{0 \le i < m} \|\widetilde a_i\|_1 , \]
where \widetilde a_i = ( \widetilde a_i^T )^T. This means that \|A\|_\infty can be computed in O(m^2) operations.
• From the solution to Homework 1.3.6.2, we know that there is a vector x with |\chi_j| = 1 for 0 \le j < m such that \|A\|_\infty = \|Ax\|_\infty. This x satisfies \|x\|_\infty = 1.
More precisely: \|A\|_\infty = \|\widetilde a_k\|_1 for some k. For simplicity, assume A is real valued. Then
\[ \|A\|_\infty = |\alpha_{k,0}| + \cdots + |\alpha_{k,m-1}| = \alpha_{k,0} \chi_0 + \cdots + \alpha_{k,m-1} \chi_{m-1} , \]
where each \chi_j = \pm 1 is chosen so that \chi_j \alpha_{k,j} = |\alpha_{k,j}|. That vector x then has the property that \|A\|_\infty = \|\widetilde a_k\|_1 = \|Ax\|_\infty.
From this we conclude that
\[ \|A\|_\infty = \max_{x \in S} \|Ax\|_\infty , \]
where S is the set of all vectors x with |\chi_j| = 1, 0 \le j < m.


We will illustrate the techniques that underly efficient condition number estimation by
looking at the simpler case where we wish to estimate the condition number of a real-valued
nonsingular upper triangular m ◊ m matrix U , using the Œ-norm. Since U is real-valued,
|‰i | = 1 means ‰i = ±1. The problem is that it appears we must compute ÎU ≠1 ÎŒ .
Computing U ≠1 when U is dense requires O(m3 ) operations (a topic we won’t touch on until
much later in the course).
Our observations tell us that

ÎU ≠1 ÎŒ = max ÎU ≠1 xÎŒ ,
xœS

where S is the set of all vectors x with elements ‰i œ {≠1, 1}. This is equivalent to

ÎU ≠1 ÎŒ = max ÎzÎŒ ,
zœT

where T is the set of all vectors z that satisfy U z = y for some y with elements Âi œ {≠1, 1}.
So, we could solve U z = y for all vectors y œ S, compute the Œ-norm for all those vectors
z, and pick the maximum of those values. But that is not practical.
One simple solution is to try to construct a vector y that results in a large amplification (in the ∞-norm) when solving Uz = y, and to then use that amplification as an estimate for \|U^{-1}\|_\infty. So how do we do this? Consider
\[
\underbrace{\begin{pmatrix} \ddots & \vdots & \vdots \\ 0 & \upsilon_{m-2,m-2} & \upsilon_{m-2,m-1} \\ 0 & 0 & \upsilon_{m-1,m-1} \end{pmatrix}}_{U}
\underbrace{\begin{pmatrix} \vdots \\ \zeta_{m-2} \\ \zeta_{m-1} \end{pmatrix}}_{z}
= \underbrace{\begin{pmatrix} \vdots \\ \psi_{m-2} \\ \psi_{m-1} \end{pmatrix}}_{y} .
\]
Here is a heuristic for picking y \in S:
• We want to pick \psi_{m-1} \in \{-1, 1\} in order to construct a vector y \in S. We can pick \psi_{m-1} = 1 since picking it equal to -1 will simply carry through negation in the appropriate way in the scheme we are describing.
From this \psi_{m-1} we can compute \zeta_{m-1}.
• Now,
\[ \upsilon_{m-2,m-2} \zeta_{m-2} + \upsilon_{m-2,m-1} \zeta_{m-1} = \psi_{m-2} , \]
where \zeta_{m-1} is known and \psi_{m-2} can be strategically chosen. We want z to have a large ∞-norm and hence a heuristic is to now pick \psi_{m-2} \in \{-1, 1\} in such a way that \zeta_{m-2} is as large as possible in magnitude.
With this \psi_{m-2} we can compute \zeta_{m-2}.
• And so forth!
When done, the magnification equals \|z\|_\infty = |\zeta_k|, where \zeta_k is the element of z with largest magnitude. This approach provides an estimate for \|U^{-1}\|_\infty with O(m^2) operations.
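A MATLAB sketch of this heuristic (our own naming and implementation; it produces a lower bound, not the exact value):

    % Estimate norm(inv(U),inf) for upper triangular U by growing a +-1 right-hand side
    % from the bottom so that back substitution produces a large vector z (our sketch).
    function est = CondEstUpperTri( U )
      m = size( U, 1 );
      z = zeros( m, 1 );
      for i = m:-1:1
        s = U( i, i+1:m ) * z( i+1:m );          % contribution of already-computed entries of z
        zplus  = (  1 - s ) / U( i, i );         % zeta_i if psi_i = +1
        zminus = ( -1 - s ) / U( i, i );         % zeta_i if psi_i = -1
        if abs( zplus ) >= abs( zminus )         % keep the choice that makes zeta_i largest in magnitude
          z( i ) = zplus;
        else
          z( i ) = zminus;
        end
      end
      est = norm( z, inf );                      % lower bound on norm( inv( U ), inf ) since norm(y,inf) = 1
    end

    % Possible use: kappa_est = norm( U, inf ) * CondEstUpperTri( U );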
The described method underlies the condition number estimator for LINPACK, developed
in the 1970s [16], as described in [11]:
• A.K. Cline, C.B. Moler, G.W. Stewart, and J.H. Wilkinson, An estimate for the con-
dition number of a matrix, SIAM J. Numer. Anal., 16 (1979).
The method discussed in that paper yields a lower bound on ÎA≠1 ÎŒ and with that on
ŸŒ (A).
Remark 1.5.1.1 Alan Cline has his office on our floor at UT-Austin. G.W. (Pete) Stewart
was Robert’s Ph.D. advisor. Cleve Moler is the inventor of Matlab. John Wilkinson received
the Turing Award for his contributions to numerical linear algebra.
More sophisticated methods are discussed in [21]:
• N. Higham, A Survey of Condition Number Estimates for Triangular Matrices, SIAM
Review, 1987.
His methods underlie the LAPACK [1] condition number estimator and are remarkably accurate: most of the time they provide an almost exact estimate of the actual condition number.

1.6 Wrap Up
1.6.1 Additional homework
Homework 1.6.1.1 For ej œ Rn (a standard basis vector), compute
• Îej Î2 =

• Îej Î1 =

• Îej ÎŒ =

• Îej Îp =
Homework 1.6.1.2 For I œ Rn◊n (the identity matrix), compute
• ÎIÎ1 =

• ÎIÎŒ =

• ÎIÎ2 =

• ÎIÎp =

• ÎIÎF =
Q R
”0 0 ··· 0
c
c 0 ”1 ··· 0 d
d
Homework 1.6.1.3 Let D = c .. . . .. d (a diagonal matrix). Compute
c
a . . . 0
d
b
0 0 · · · ”n≠1
• ÎDÎ1 =

• ÎDÎŒ =

• ÎDÎp =

• ÎDÎF =
Q R
x0
c x1 d
c d
Homework 1.6.1.4 Let x = c
c .. d
d and 1 Æ p < Œ or p = Œ.
a . b
xN ≠1
ALWAYS/SOMETIMES/NEVER: Îxi Îp Æ ÎxÎp .
Homework 1.6.1.5 For A B
1 2 ≠1
A= .
≠1 1 0
compute
WEEK 1. NORMS 83

• ÎAÎ1 =

• ÎAÎŒ =

• ÎAÎF =
Homework 1.6.1.6 For A œ Cm◊n define
Q R
|–0,0 |, ··· ,
|–0,n≠1 |,
.. ..
m≠1
ÿ n≠1
ÿ ÿc d
ÎAÎ = |–i,j | = c
a . . d.
b
i=0 j=0
|–m≠1,0 |, · · · , |–m≠1,n≠1 |

• TRUE/FALSE: This function is a matrix norm.

• How can you relate this norm to the vector 1-norm?

• TRUE/FALSE: For this norm, ÎAÎ = ÎAH Î.

• TRUE/FALSE: This norm is submultiplicative.


Homework 1.6.1.7 Let A œ Cm◊n . Partition
Q R
aÂT0
c d
1 2 c aÂT1 d
A= a0 a1 · · · an≠1 = c
c .. d.
d
a . b
aÂTm≠1

Prove that
• ÎAÎF = ÎAT ÎF .
Ò
• ÎAÎF = Îa0 Î22 + Îa1 Î22 + · · · + Îan≠1 Î22 .
Ò
• ÎAÎF = ÎaÂ0 Î22 + ÎaÂ1 Î22 + · · · + ÎaÂm≠1 Î22 .

Note that here aÂi = (aÂTi )T .


Homework 1.6.1.8 Let x œ Rm with ÎxÎ1 = 1.
TRUE/FALSE: ÎxÎ2 = 1 if and only if x = ±ej for some j.
Solution. Obviously, if x = ej then ÎxÎ1 = Îx|2 = 1. Ò
Assume x ”= ej . Then |‰i | < 1 for all i. But then ÎxÎ2 = |‰0 |2 + · · · + |‰m≠1 |2 <
Ò Ô
|‰0 | + · · · + |‰m≠1 | = 1 = 1.

Homework 1.6.1.9 Prove that if Îx΋ Æ —Îxε is true for all x, then ÎA΋ Æ —ÎAε,‹ .

1.6.2 Summary
If –, — œ C with – = –r + –c i and — = —r + i—c , where –r , –c , —r , —c œ R, then
• Conjugate: – = –r ≠ –c i.
• Product: –— = (–r —r ≠ –c —c ) + (–r —c + –c —r )i.
Ò Ô
• Absolute value: |–| = –r2 + –c2 = ––.
Q R Q R
‰0 Â0
Let x, y œ Cm with x = c
c .. d
and y = c
c .. d
Then
a . d
b a . d.
b
‰m≠1 Âm≠1
• Conjugate: Q R
‰0
x=c
c .. d
a . d.
b
‰m≠1
• Transpose of vector: 1 2
xT = ‰0 · · · ‰m≠1

• Hermitian transpose (conjugate transpose) of vector:


1 2
xH = xT = xT = ‰0 · · · ‰m≠1 .
qm≠1
• Dot product (inner product): xH y = xT y = xT y = ‰0 Â0 +· · ·+‰m≠1 Âm≠1 = i=0 ‰i Âi .
Definition 1.6.2.1 Vector norm. Let Î · Î : Cm æ R. Then Î · Î is a (vector) norm if
for all x, y œ Cm and all – œ C
• x ”= 0 ∆ ÎxÎ > 0 (Î · Î is positive definite),

• ΖxÎ = |–|ÎxÎ (Î · Î is homogeneous), and

• Îx + yÎ Æ ÎxÎ + ÎyÎ (Î · Î obeys the triangle inequality).


Ô Ò Ô
• 2-norm (Euclidean length): ÎxÎ2 = xH x = |‰0 |2 + · · · + |‰m≠1 |2 = ‰0 ‰0 + · · · + ‰m≠1 ‰m≠1
Òq
= m≠1
i=0 |‰i |2 .
Ò Òq
• p-norm: ÎxÎp = p
|‰0 |p + · · · + |‰m≠1 |p = p m≠1
i=0 |‰i |p .
qm≠1
• 1-norm: ÎxÎ1 = |‰0 | + · · · + |‰m≠1 | = i=0 |‰i |.
• Œ-norm: ÎxÎŒ = max(|‰0 |, . . . , ‰m≠1 |) = maxm≠1
i=0 |‰i | = limpæŒ ÎxÎp .

• Unit ball: Set of all vectors with norm equal to one. Notation: ÎxÎ = 1.
Theorem 1.6.2.2 Equivalence of vector norms. Let \| \cdot \| : \mathbb{C}^m \to \mathbb{R} and ||| \cdot ||| : \mathbb{C}^m \to \mathbb{R} both be vector norms. Then there exist positive scalars \sigma and \tau such that for all x \in \mathbb{C}^m
\[ \sigma \|x\| \leq |||x||| \leq \tau \|x\| . \]

    \|x\|_1 \leq \sqrt{m} \|x\|_2        \|x\|_1 \leq m \|x\|_\infty
    \|x\|_2 \leq \|x\|_1                 \|x\|_2 \leq \sqrt{m} \|x\|_\infty
    \|x\|_\infty \leq \|x\|_1            \|x\|_\infty \leq \|x\|_2
Definition 1.6.2.3 Linear transformations and matrices. Let L : Cn æ Cm . Then L
is said to be a linear transformation if for all – œ C and x, y œ Cn
• L(–x) = –L(x). That is, scaling first and then transforming yields the same result as
transforming first and then scaling.

• L(x + y) = L(x) + L(y). That is, adding first and then transforming yields the same
result as transforming first and then adding.


Definition 1.6.2.4 Standard basis vector. In this course, we will use ej œ C to denote m

the standard basis vector with a "1" in the position indexed with j. So,
Q R
0
c . d
c .. d
c d
c d
c
c
0 d
d
ej = c
c 1 d
d Ω≠ j
c
c 0 d
d
c
c .. d
d
a . b
0


If L is a linear transformation and we let aj = L(ej ) then
1 2
A= a0 a1 · · · an≠1

is the matrix that represents L in the sense that Ax = L(x).


Partition C, A, and B by rows and columns
Q R Q R
1 2
cÂT0 1 2
aÂT0
C=
c
=c .. d , A = a · · · a
d c
=c .. d
c0 · · · cn≠1 a . b 0 k≠1 a . d
b,
T T
cÂm≠1 aÂm≠1
and Q R
Â
bT
1 2 0
B=
c
=c .. d
b0 · · · bn≠1 a . d,
b
Â
bT
k≠1
then C := AB can be computed in the following ways:

1. By columns:
1 2 1 2 1 2
c0 · · · cn≠1 := A b0 · · · bn≠1 = Ab0 · · · Abn≠1 .

In other words, cj := Abj for all columns of C.

2. By rows: Q R Q R Q R
cÂT0 aÂT0 aÂT0 B
c .. d c
d := c .. d B = c
d c .. d
c
a . b a . b a . d.
b
cÂTm≠1 aÂTm≠1 aÂTm≠1 B
In other words, cÂTi = aÂTi B for all rows of C.

3. As the sum of outer products:


Q R
Â
bT
1 2c 0
C := .. d
= a0ÂbT0 + · · · + ak≠1ÂbTk≠1 ,
a0 · · · ak≠1 c
a . d
b
Â
bTk≠1

which should be thought of as a sequence of rank-1 updates, since each term is an


outer product and an outer product has rank of at most one.
Partition C, A, and B by blocks (submatrices),
Q R Q R
C0,0 ··· C0,N ≠1 A0,0 ··· A0,K≠1
C=c
c .. .. d c .. .. d
a . . d,c
b a . . d,
b
CM ≠1,0 · · · CM ≠1,N ≠1 AM ≠1,0 · · · AM ≠1,K≠1

and Q R
B0,0 ··· B0,N ≠1
c .. .. d
c
a . . d,
b
BK≠1,0 · · · BK≠1,N ≠1
where the partitionings are "conformal." Then
K≠1
ÿ
Ci,j = Ai,p Bp,j .
p=0

Definition 1.6.2.5 Matrix norm. Let Î · Î : Cm◊n æ R. Then Î · Î is a (matrix) norm if


for all A, B œ Cm◊n and all – œ C
• A ”= 0 ∆ ÎAÎ > 0 (Î · Î is positive definite),

• ΖAÎ = |–|ÎAÎ (Î · Î is homogeneous), and

• ÎA + BÎ Æ ÎAÎ + ÎBÎ (Î · Î obeys the triangle inequality).



Let A œ Cm◊n and


Q R Q R
–0,0 · · · –0,n≠1 1 2
aÂT0
A=c
c .. .. d
=
c
=c .. d
a . . d
b a0 · · · an≠1 a . d
b.
–m≠1,0 · · · –m≠1,n≠1 T
aÂm≠1

Then

• Conjugate of matrix: Q R
–0,0 ··· –0,n≠1
A=c
c .. .. d
a . . d.
b
–m≠1,0 · · · –m≠1,n≠1

• Transpose of matrix: Q R
–0,0 · · · –m≠1,0
c
A =a .. ...
d
.
T c d.
b
–0,n≠1 · · · –m≠1,n≠1

• Conjugate transpose (Hermitian transpose) of matrix:


Q R
–0,0 · · · –m≠1,0
c
A =A =A =c
T T .. .. d
. .
H d.
a b
–0,n≠1 · · · –m≠1,n≠1
Òq qn≠1 Òq Òq
• Frobenius norm: ÎAÎF = m≠1
i=0 j=0 |–i,j |2 = n≠1
j=0 Îaj Î22 = m≠1
i=0 ÎaÂi Î22

• matrix p-norm: ÎAÎp = maxx”=0 ÎAxÎp


ÎxÎp
= maxÎxÎp =1 ÎAxÎp .

• matrix 2-norm: ÎAÎ2 = maxx”=0 ÎAxÎ2


ÎxÎ2
= maxÎxÎ2 =1 ÎAxÎ2 = ÎAH Î2 .

• matrix 1-norm: ÎAÎ1 = maxx”=0 ÎAxÎ1


ÎxÎ1
= maxÎxÎ1 =1 ÎAxÎ1 = max0Æj<n Îaj Î1 = ÎAH ÎŒ .

• matrix Œ-norm: ÎAÎŒ = maxx”=0 ÎAxÎŒ


ÎxÎŒ
= maxÎxÎŒ =1 ÎAxÎŒ = max0Æi<m ÎaÂi Î1 =
ÎAH Î1 .
Theorem 1.6.2.6 Equivalence of matrix norms. Let \| \cdot \| : \mathbb{C}^{m \times n} \to \mathbb{R} and ||| \cdot ||| : \mathbb{C}^{m \times n} \to \mathbb{R} both be matrix norms. Then there exist positive scalars \sigma and \tau such that for all A \in \mathbb{C}^{m \times n}
\[ \sigma \|A\| \leq |||A||| \leq \tau \|A\| . \]

    \|A\|_1 \leq \sqrt{m} \|A\|_2      \|A\|_1 \leq m \|A\|_\infty                   \|A\|_1 \leq \sqrt{m} \|A\|_F
    \|A\|_2 \leq \sqrt{n} \|A\|_1      \|A\|_2 \leq \sqrt{m} \|A\|_\infty            \|A\|_2 \leq \|A\|_F
    \|A\|_\infty \leq n \|A\|_1        \|A\|_\infty \leq \sqrt{n} \|A\|_2            \|A\|_\infty \leq \sqrt{n} \|A\|_F
    \|A\|_F \leq \sqrt{n} \|A\|_1      \|A\|_F \leq \sqrt{\mathrm{rank}(A)} \|A\|_2  \|A\|_F \leq \sqrt{m} \|A\|_\infty

Definition 1.6.2.7 Subordinate matrix norm. A matrix norm Î · Î : Cm◊n æ R is said


to be subordinate to vector norms Î · ε : Cm æ R and Î · ΋ : Cn æ R if, for all x œ Cn ,

ÎAxε Æ ÎAÎÎx΋ .

If Î · ε and Î · ΋ are the same norm (but perhaps for different m and n), then Î · Î is said
to be subordinate to the given vector norm. ⌃
Definition 1.6.2.8 Consistent matrix norm. A matrix norm Î · Î : Cm◊n æ R is said
to be a consistent matrix norm if it is defined for all m and n, using the same formula for
all m and n. ⌃
Definition 1.6.2.9 Submultiplicative matrix norm. A consistent matrix norm Î · Î :
Cm◊n æ R is said to be submultiplicative if it satisfies

ÎABÎ Æ ÎAÎÎBÎ.


Let A, A œ Cm◊m , x, ”x, b, ”b œ Cm , A be nonsingular, and Î · Î be a vector norm and
corresponding subordinate matrix norm. Then

ΔxΠΔbÎ
Æ ÎAÎÎA≠1 Î .
ÎxÎ ¸ ˚˙ ˝ ÎbÎ
Ÿ(A)

Definition 1.6.2.10 Condition number of a nonsingular matrix. The value Ÿ(A) =


ÎAÎÎA≠1 Î is called the condition number of a nonsingular matrix A. ⌃
Week 2

The Singular Value Decomposition

2.1 Opening Remarks


2.1.1 Low rank approximation

YouTube: https://www.youtube.com/watch?v=12K5aydB9cQ
Consider this picture of the Gates Dell Complex that houses our Department of Computer
Science:

It consists of an m \times n array of pixels, each of which is a numerical value. Think of the jth column of pixels as a vector of values, b_j, so that the whole picture is represented by columns as
\[ B = \left(\, b_0 \;\; b_1 \;\; \cdots \;\; b_{n-1} \,\right) , \]

where we recognize that we can view the picture as a matrix. What if we want to store
this picture with fewer than m ◊ n data? In other words, what if we want to compress the
picture? To do so, we might identify a few of the columns in the picture to be the "chosen
ones" that are representative of the other columns in the following sense: All columns in the
picture are approximately linear combinations of these chosen columns.
Let’s let linear algebra do the heavy lifting: what if we choose k roughly equally spaced
columns in the picture:
a0 = b0
a1 = bn/k≠1
.. ..
. .
ak≠1 = b(k≠1)n/k≠1 ,
where for illustration purposes we assume that n is an integer multiple of k. (We could
instead choose them randomly or via some other method. This detail is not important as
we try to gain initial insight.) We could then approximate each column of the picture, bj , as
a linear combination of a0 , . . . , ak≠1 :
Q R
1 2c
‰0,j
bj ¥ ‰0,j a0 + ‰1,j a1 + · · · + ‰k≠1,j ak≠1 = .. d
a0 · · · ak≠1 c
a . d.
b
‰k≠1,j

We can write this more concisely by viewing these chosen columns as the columns of matrix
A so that
Q R
1 2
‰0,j
where A =
c
and xj = c .. d
bj ¥ Axj , a0 · · · ak≠1 a . d.
b
‰k≠1,j

If A has linearly independent columns, the best such approximation (in the linear least
squares sense) is obtained by choosing

xj = (AT A)≠1 AT bj ,

where you may recognize (AT A)≠1 AT as the (left) pseudo-inverse of A, leaving us with

bj ¥ A(AT A)≠1 AT bj .

This approximates bj with the orthogonal projection of bj onto the column space of A. Doing
this for every column bj leaves us with the following approximation to the picture:
Q R
c d
c d
B ¥ c A (AT A)≠1 AT b0 · · · A (AT A)≠1 AT bn≠1 d ,
a ¸ ˚˙ ˝ ¸ ˚˙ ˝ b
x0 xn≠1
which is equivalent to
\[ B \approx A \underbrace{( A^T A )^{-1} A^T \left(\, b_0 \;\cdots\; b_{n-1} \,\right)}_{X} = A ( A^T A )^{-1} A^T B = A X . \]
Importantly, instead of requiring m \times n data to store B, we now need only store A and X.
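A hedged MATLAB sketch (our own variable names; the commented-out line shows where an actual image, with a hypothetical file name, would be read) of the compression scheme just described:

    % Rank-k approximation of a picture by picking k roughly equally spaced columns (sketch).
    % B = double( rgb2gray( imread( 'GDC.png' ) ) );  % hypothetical image file
    B = rand( 480, 640 );                             % stand-in for the picture as a matrix
    [ m, n ] = size( B );
    k = 25;
    cols = round( linspace( 1, n, k ) );              % indices of k roughly equally spaced columns
    A = B( :, cols );
    X = ( A' * A ) \ ( A' * B );                      % X = (A^T A)^{-1} A^T B (least squares coefficients)
    Bapprox = A * X;                                  % approximation with rank at most k
    relerr  = norm( B - Bapprox, 'fro' ) / norm( B, 'fro' )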
Homework 2.1.1.1 If B is m ◊ n and A is m ◊ k, how many entries are there in A and X ?
Solution.

• A is m ◊ k.

• X is k ◊ n.

A total of (m + n)k entries are in A and X.


Homework 2.1.1.2 AX is called a rank-k approximation of B. Why?
Solution. The matrix AX has rank at most equal to k (it is a rank-k matrix) since each of its columns can be written as a linear combination of the columns of A and hence it has at most k linearly independent columns.
Let's have a look at how effective this approach is for our picture. (Figure: the original picture alongside its rank-k approximations for k = 1, 2, 10, 25, and 50.)
k = 25 k = 50

Now, there is no reason to believe that picking equally spaced columns (or restricting
ourselves to columns in B) will yield the best rank-k approximation for the picture. It yields
a pretty good result here in part because there is quite a bit of repetition in the picture,
from column to column. So, the question can be asked: How do we find the best rank-k
approximation for a picture or, more generally, a matrix? This would allow us to get the
most from the data that needs to be stored. It is the Singular Value Decomposition (SVD),
possibly the most important result in linear algebra, that provides the answer.
Remark 2.1.1.1 Those who need a refresher on this material may want to review Week 11
of Linear Algebra: Foundations to Frontiers [26]. We will discuss solving linear least squares
problems further in Week 4.

2.1.2 Overview
• 2.1 Opening Remarks

¶ 2.1.1 Low rank approximation


¶ 2.1.2 Overview
¶ 2.1.3 What you will learn

• 2.2 Orthogonal Vectors and Matrices

¶ 2.2.1 Orthogonal vectors


¶ 2.2.2 Component in the direction of a vector
¶ 2.2.3 Orthonormal vectors and matrices
¶ 2.2.4 Unitary matrices
¶ 2.2.5 Examples of unitary matrices
¶ 2.2.6 Change of orthonormal basis
¶ 2.2.7 Why we love unitary matrices

• 2.3 The Singular Value Decomposition

¶ 2.3.1 The Singular Value Decomposition Theorem


¶ 2.3.2 Geometric interpretation

¶ 2.3.3 An "algorithm" for computing the SVD


¶ 2.3.4 The Reduced Singular Value Decomposition
¶ 2.3.5 The SVD of nonsingular matrices
¶ 2.3.6 Best rank-k approximation

• 2.4 Enrichments

¶ 2.4.1 Principal Component Analysis (PCA)

• 2.5 Wrap Up

¶ 2.5.1 Additional homework


¶ 2.5.2 Summary

2.1.3 What you will learn


This week introduces two concepts that have theoretical and practical importance: unitary
matrices and the Singular Value Decomposition (SVD).
Upon completion of this week, you should be able to

• Determine whether vectors are orthogonal.

• Compute the component of a vector in the direction of another vector.

• Relate sets of orthogonal vectors to orthogonal and unitary matrices.

• Connect unitary matrices to the changing of orthonormal basis.

• Identify transformations that can be represented by unitary matrices.

• Prove that multiplying with unitary matrices does not amplify relative error.

• Use norms to quantify the conditioning of solving linear systems.

• Prove and interpret the Singular Value Decomposition.

• Link the Reduced Singular Value Decomposition to the rank of the matrix and deter-
mine the best rank-k approximation to a matrix.

• Determine whether a matrix is close to being nonsingular by relating the Singular


Value Decomposition to the condition number.

2.2 Orthogonal Vectors and Matrices


2.2.1 Orthogonal vectors

YouTube: https://www.youtube.com/watch?v=3zpdTfwZSEo
At some point in your education you were told that vectors are orthogonal (perpendicular)
if and only if their dot product (inner product) equals zero. Let’s review where this comes
from. Given two vectors u, v œ Rm , those two vectors, and their sum all exist in the same
two dimensional (2D) subspace. So, they can be visualized as

where the page on which they are drawn is that 2D subspace. Now, if they are, as drawn,
perpendicular and we consider the lengths of the sides of the triangle that they define

then we can employ the first theorem you were probably ever exposed to, the Pythagorean
Theorem, to find that
\[
\|u\|_2^2 + \|v\|_2^2 = \|u + v\|_2^2 .
\]
Using what we know about the relation between the 2-norm and the dot product, we find
that
\[
\begin{array}{l}
u^T u + v^T v = (u + v)^T (u + v) \\
\quad \Leftrightarrow \quad \mbox{< multiply out >} \\
u^T u + v^T v = u^T u + u^T v + v^T u + v^T v \\
\quad \Leftrightarrow \quad \mbox{< } u^T v = v^T u \mbox{ if } u \mbox{ and } v \mbox{ are real-valued >} \\
u^T u + v^T v = u^T u + 2 u^T v + v^T v \\
\quad \Leftrightarrow \quad \mbox{< delete common terms >} \\
0 = 2 u^T v
\end{array}
\]
so that we can conclude that $u^T v = 0$.
While we already encountered the notation $\sqrt{x^H x}$ as an alternative way of expressing the
length of a vector, $\|x\|_2 = \sqrt{x^H x}$, we have not formally defined the inner product (dot
product) for complex-valued vectors:

Definition 2.2.1.1 Dot product (Inner product). Given $x, y \in \mathbb{C}^m$ their dot product
(inner product) is defined as
\[
x^H y = \bar{x}^T y = \bar{\chi}_0 \psi_0 + \bar{\chi}_1 \psi_1 + \cdots + \bar{\chi}_{m-1} \psi_{m-1}
      = \sum_{i=0}^{m-1} \bar{\chi}_i \psi_i .
\]
The notation $x^H$ is short for $\bar{x}^T$, where $\bar{x}$ equals the vector $x$ with all its entries conjugated.
So,
\[
\begin{array}{rcl}
x^H y
& = & \mbox{< expose the elements of the vectors >} \\
& & \left( \begin{array}{c} \chi_0 \\ \vdots \\ \chi_{m-1} \end{array} \right)^H
    \left( \begin{array}{c} \psi_0 \\ \vdots \\ \psi_{m-1} \end{array} \right) \\
& = & \mbox{< } x^H = \bar{x}^T \mbox{ >} \\
& & \overline{\left( \begin{array}{c} \chi_0 \\ \vdots \\ \chi_{m-1} \end{array} \right)}^{\,T}
    \left( \begin{array}{c} \psi_0 \\ \vdots \\ \psi_{m-1} \end{array} \right) \\
& = & \mbox{< conjugate the elements of } x \mbox{ >} \\
& & \left( \begin{array}{c} \bar{\chi}_0 \\ \vdots \\ \bar{\chi}_{m-1} \end{array} \right)^T
    \left( \begin{array}{c} \psi_0 \\ \vdots \\ \psi_{m-1} \end{array} \right) \\
& = & \mbox{< view } x^H \mbox{ as a } 1 \times m \mbox{ matrix and perform matrix-vector multiplication >} \\
& & \sum_{i=0}^{m-1} \bar{\chi}_i \psi_i .
\end{array}
\]

Homework 2.2.1.1 Let $x, y \in \mathbb{C}^m$.
ALWAYS/SOMETIMES/NEVER: $x^H y = \overline{y^H x}$.
Answer. ALWAYS
Now prove it!
Solution.
\[
\overline{x^H y} = \overline{\sum_{i=0}^{m-1} \bar{\chi}_i \psi_i}
 = \sum_{i=0}^{m-1} \chi_i \bar{\psi}_i
 = \sum_{i=0}^{m-1} \bar{\psi}_i \chi_i
 = y^H x.
\]

Homework 2.2.1.2 Let $x \in \mathbb{C}^m$.
ALWAYS/SOMETIMES/NEVER: $x^H x$ is real-valued.
Answer. ALWAYS
Now prove it!
Solution. By the last homework,
\[
x^H x = \overline{x^H x} .
\]
A complex number is equal to its conjugate only if it is real-valued.
The following defines orthogonality of two vectors with complex-valued elements:
Definition 2.2.1.2 Orthogonal vectors. Let x, y œ Cm . These vectors are said to be
orthogonal (perpendicular) iff xH y = 0. ⌃
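In MATLAB, the operator ' computes the conjugate transpose, so x' * y evaluates the dot product defined above. A quick check with made-up complex vectors:

x = [ 1; 1i ];          % 1i denotes the imaginary unit
y = [ 1; 1i ];
x' * y                  % x^H y = 2, so x and y are not orthogonal
z = [ 1i; 1 ];
x' * z                  % x^H z = 0, so x and z are orthogonal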

2.2.2 Component in the direction of a vector

YouTube: https://www.youtube.com/watch?v=CqcJ6Nh1QWg
In a previous linear algebra course, you may have learned that if $a, b \in \mathbb{R}^m$ then
\[
\hat{b} = \frac{a^T b}{a^T a} a = \frac{a a^T}{a^T a} b
\]
equals the component of b in the direction of a and
\[
b^\perp = b - \hat{b} = \left( I - \frac{a a^T}{a^T a} \right) b
\]
equals the component of b orthogonal to a, since $b = \hat{b} + b^\perp$ and $\hat{b}^T b^\perp = 0$. Similarly, if
$a, b \in \mathbb{C}^m$ then
\[
\hat{b} = \frac{a^H b}{a^H a} a = \frac{a a^H}{a^H a} b
\]
equals the component of b in the direction of a and
\[
b^\perp = b - \hat{b} = \left( I - \frac{a a^H}{a^H a} \right) b
\]
equals the component of b orthogonal to a.
Remark 2.2.2.1 The matrix that (orthogonally) projects the vector to which it is applied
onto the vector a is given by
\[
\frac{a a^H}{a^H a}
\]
while
\[
I - \frac{a a^H}{a^H a}
\]
is the matrix that (orthogonally) projects the vector to which it is applied onto the space
orthogonal to the vector a.
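A minimal MATLAB sketch of these formulas, using randomly generated complex vectors, verifies that the two components add up to b and are orthogonal to each other:

m = 5;
a = rand( m, 1 ) + 1i * rand( m, 1 );
b = rand( m, 1 ) + 1i * rand( m, 1 );
bhat  = ( a' * b ) / ( a' * a ) * a;    % component of b in the direction of a
bperp = b - bhat;                       % component of b orthogonal to a
norm( b - ( bhat + bperp ) )            % should be (close to) zero
bhat' * bperp                           % should be (close to) zero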
Homework 2.2.2.1 Let $a \in \mathbb{C}^m$.
ALWAYS/SOMETIMES/NEVER:
\[
\left( \frac{a a^H}{a^H a} \right) \left( \frac{a a^H}{a^H a} \right) = \frac{a a^H}{a^H a} .
\]
Interpret what this means about a matrix that projects onto a vector.
Answer. ALWAYS.
Now prove it.
Solution. 1 H21 H2
aa aa
aH a aH a
= < multiply numerators and denominators >
aaH aaH
(aH a)(aH a)
= < associativity >
a(aH a)aH
(aH a)(aH a)
= < aH a is a scalar and hence commutes to front >
aH aaaH
(aH a)(aH a)
= < scalar division >
aaH
aH a
.
Interpretation: orthogonally projecting the orthogonal projection of a vector yields the
orthogonal projection of the vector.
Homework 2.2.2.2 Let $a \in \mathbb{C}^m$.
ALWAYS/SOMETIMES/NEVER:
\[
\left( \frac{a a^H}{a^H a} \right) \left( I - \frac{a a^H}{a^H a} \right) = 0
\]
(the zero matrix). Interpret what this means.



Answer. ALWAYS.
Now prove it.
Solution. 1 21 2
aaH H
aH a
I ≠ aa
aH a

1 = 2 1< Hdistribute
21 H2 >
aaH aa aa
aH a
≠ aH a aH a

1 H 2 1 H 2 homework >
= < last
aa
aH a
≠ aaaH a
=
0.
Interpretation: first orthogonally projecting onto the space orthogonal to vector a and
then orthogonally projecting the resulting vector onto that a leaves you with the zero vector.
Homework 2.2.2.3 Let $a, b \in \mathbb{C}^n$, $\hat{b} = \frac{a a^H}{a^H a} b$, and $b^\perp = b - \hat{b}$.
ALWAYS/SOMETIMES/NEVER: $\hat{b}^H b^\perp = 0$.
Answer. ALWAYS.
Now prove it.
Solution.

bH b‹
= < substitute ‚b and b‹ >
1 2H
aaH
aH a
b (b ≠ ‚b)
= < (Ax)H = xH AH ; substitute b ≠ ‚b >
1 2H
(I ≠ aa
aaH
)b
H
bH aH a aH a
= < (((xy H )/–)H = yxH /– if – is real >
(I ≠ aa )b
H H
bH aa
aH a aH a
= < last homework >
bH 0b
= < 0x = 0; y H 0 = 0 >
0.

2.2.3 Orthonormal vectors and matrices

YouTube: https://www.youtube.com/watch?v=GFfvDpj5dzw
A lot of the formulae in the last unit become simpler if the length of the vector equals
one: If $\|u\|_2 = 1$ then

• the component of v in the direction of u equals
\[
\frac{u^H v}{u^H u} u = u^H v \, u .
\]

• the matrix that projects a vector onto the vector u is given by
\[
\frac{u u^H}{u^H u} = u u^H .
\]

• the component of v orthogonal to u equals
\[
v - \frac{u^H v}{u^H u} u = v - u^H v \, u .
\]

• the matrix that projects a vector onto the space orthogonal to u is given by
\[
I - \frac{u u^H}{u^H u} = I - u u^H .
\]
Homework 2.2.3.1 Let $0 \neq u \in \mathbb{C}^m$.
ALWAYS/SOMETIMES/NEVER: $u / \|u\|_2$ has unit length.
Answer. ALWAYS.
Now prove it.
Solution.
\[
\left\| \frac{u}{\|u\|_2} \right\|_2
\;\; = \;\; \mbox{< homogeneity of norms >} \;\;
\frac{\|u\|_2}{\|u\|_2}
\;\; = \;\; \mbox{< algebra >} \;\;
1 .
\]
This last exercise shows that any nonzero vector can be scaled (normalized) to have unit
length.
Definition 2.2.3.1 Orthonormal vectors. Let $u_0, u_1, \ldots, u_{n-1} \in \mathbb{C}^m$. These vectors are
said to be mutually orthonormal if for all $0 \leq i, j < n$
\[
u_i^H u_j = \left\{ \begin{array}{ll} 1 & \mbox{if } i = j \\ 0 & \mbox{otherwise.} \end{array} \right.
\]
The definition implies that $\|u_i\|_2 = \sqrt{u_i^H u_i} = 1$ and hence each of the vectors is of unit
length in addition to being orthogonal to each other.
The standard basis vectors (Definition 1.3.1.3)
\[
\{ e_j \}_{j=0}^{m-1} \subset \mathbb{C}^m ,
\]
where
\[
e_j = \left( \begin{array}{c} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{array} \right)
\longleftarrow \mbox{entry indexed with } j
\]
are mutually orthonormal since, clearly,
\[
e_i^H e_j = \left\{ \begin{array}{ll} 1 & \mbox{if } i = j \\ 0 & \mbox{otherwise.} \end{array} \right.
\]
Naturally, any subset of the standard basis vectors is a set of mutually orthonormal vectors.
Remark 2.2.3.2 For n vectors of size m to be mutually orthonormal, n must be less than
or equal to m. This is because n mutually orthonormal vectors are linearly independent and
there can be at most m linearly independent vectors of size m.
A very concise way of indicating that a set of vectors are mutually orthonormal is to view
them as the columns of a matrix, which then has a very special property:
Definition 2.2.3.3 Orthonormal matrix. Let Q œ Cm◊n (with n Æ m). Then Q is said
to be an orthonormal matrix iff QH Q = I. ⌃
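One quick way to create a matrix with orthonormal columns in MATLAB is to orthonormalize the columns of a random matrix with the built-in function orth, after which the defining property can be checked. A minimal sketch:

m = 6; n = 3;
Q = orth( rand( m, n ) );       % m x n matrix with orthonormal columns
norm( Q' * Q - eye( n ) )       % should be (close to) zero
norm( Q * Q' - eye( m ) )       % in general NOT (close to) zero when n < m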
The subsequent exercise makes the connection between mutually orthonormal vectors
and an orthonormal matrix.
1 2
Homework 2.2.3.2 Let Q œ Cm◊n (with n Æ m). Partition Q = q0 q1 · · · qn≠1 .
TRUE/FALSE: Q is an orthonormal matrix if and only if q0 , q1 , . . . , qn≠1 are mutually
orthonormal.
Answer. TRUE
Now prove it! 1 2
Solution. Let Q œ Cm◊n (with n Æ m). Partition Q = q0 q1 · · · qn≠1 . Then
1 2H 1 2
QH Q = q0 q1 · · · qn≠1 q0 q1 · · · qn≠1
Q H R
q0
c H d1 2
c q1 d
= c
c .. d q0 q1 · · · qn≠1
d
a . b
H
qn≠1
Q R
q0H q0 q0H q1 ··· q0H qn≠1
c q1H q0 q1H q1 ··· q1H qn≠1 d
c d
= c
c .. .. .. d.
d
a . . . b
H H H
qn≠1 q0 qn≠1 q1 · · · qn≠1 qn≠1

Now consider that QH Q = I:


Q R Q R
q0H q0 q0H q1 ··· q0H qn≠1 1 0 ··· 0
c
c q1H q0 q1H q1 ··· q1H qn≠1 d
d
c
c 0 1 ··· 0 d
d
c
c .. .. .. d
d = c
c .. .. .. d.
d
a . . . b a . . . b
H
qn≠1 H
q0 qn≠1 H
q1 · · · qn≠1 qn≠1 0 0 ··· 1

Clearly Q is orthonormal if and only if q0 , q1 , . . . , qn≠1 are mutually orthonormal.


Homework 2.2.3.3 Let Q œ Cm◊n .
ALWAYS/SOMETIMES/NEVER: If QH Q = I then QQH = I.
Answer. SOMETIMES.
Now explain why.
Solution.

• If Q is a square matrix (m = n) then QH Q = I means Q≠1 = QH . But then QQ≠1 = I


and hence QQH = I.

• If Q is not square, then QH Q = I means m > n. Hence Q has rank equal to n which in
turn means QQH is a matrix with rank at most equal to n. (Actually, its rank equals
n.). Since I has rank equal to m (it is an m ◊ m matrix with linearly independent
columns), QQH cannot equal I.
1 2
More concretely: let m > 1 and n = 1. Choose Q = e0 . Then QH Q = eH
0 e0 = 1 =
I. But Q R
1 0 ···
c d
QQ = e0 e0 = c
H H 0 0 ··· d.
a
.. .. b
. .

2.2.4 Unitary matrices

YouTube: https://www.youtube.com/watch?v=izONEmO9uqw
Homework 2.2.4.1 Let Q œ Cm◊n be an orthonormal matrix.
ALWAYS/SOMETIMES/NEVER: Q≠1 = QH and QQH = I.
Answer. SOMETIMES
Now explain it!

Solution. If Q is unitary, then it is an orthonormal matrix and square. Because it is an
orthonormal matrix, $Q^H Q = I$. If $A, B \in \mathbb{C}^{m \times m}$, the matrix B such that $BA = I$ is the
inverse of A. Hence $Q^{-1} = Q^H$. Also, if $BA = I$ then $AB = I$ and hence $Q Q^H = I$.
However, an orthonormal matrix is not necessarily square. For example, the matrix
\[
Q = \left( \begin{array}{c} \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} \end{array} \right)
\]
is an orthonormal matrix: $Q^T Q = I$. However, it doesn't have an inverse
because it is not square.
If an orthonormal matrix is square, then it is called a unitary matrix.
Definition 2.2.4.1 Unitary matrix. Let U œ Cm◊m . Then U is said to be a unitary
matrix if and only if U H U = I (the identity). ⌃
Remark 2.2.4.2 Unitary matrices are always square. Sometimes the term orthogonal
matrix is used instead of unitary matrix, especially if the matrix is real valued.
Unitary matrices have some very nice properties, as captured by the following exercises.
Homework 2.2.4.2 Let Q œ Cm◊m be a unitary matrix.
ALWAYS/SOMETIMES/NEVER: Q≠1 = QH and QQH = I.
Answer. ALWAYS
Now explain it!
Solution. If Q is unitary, then it is square and QH Q = I. Hence Q≠1 = QH and QQH = I.
Homework 2.2.4.3 TRUE/FALSE: If U is unitary, so is U H .
Answer. TRUE
Now prove it!
Solution. Clearly, U H is square. Also, (U H )H U H = (U U H )H = I by the last homework.
Homework 2.2.4.4 Let U0 , U1 œ Cm◊m both be unitary.
ALWAYS/SOMETIMES/NEVER: U0 U1 , is unitary.
Answer. ALWAYS
Now prove it!
Solution. Obviously, U0 U1 is a square matrix.
Now,
(U0 U1 )H (U0 U1 ) = U1H U0H U0 U1 = U1H U1 = I.
¸ ˚˙ ˝ ¸ ˚˙ ˝
I I
Hence U0 U1 is unitary.
Homework 2.2.4.5 Let U0 , U1 , . . . , Uk≠1 œ Cm◊m all be unitary.
ALWAYS/SOMETIMES/NEVER: Their product, U0 U1 · · · Uk≠1 , is unitary.
Answer. ALWAYS
Now prove it!
Solution. Strictly speaking, we should do a proof by induction. But instead we will make

the more informal argument that

(U0 U1 · · · Uk≠1 )H U0 U1 · · · Uk≠1 = Uk≠1


H
· · · U1H U0H U0 U1 · · · Uk≠1
= Uk≠1 · · · U1H U0H U0 U1 · · · Uk≠1 = I.
H
¸ ˚˙ ˝
I
¸ ˚˙ ˝
I
¸ ˚˙ ˝
I
¸ ˚˙ ˝
I

(When you see a proof that involves $\cdots$, it would be more rigorous to use a proof by
induction.)
Remark 2.2.4.3 Many algorithms that we encounter in the future will involve the applica-
tion of a sequence of unitary matrices, which is why the result in this last exercise is of great
importance.
Perhaps the most important property of a unitary matrix is that it preserves length.
Homework 2.2.4.6 Let U œ Cm◊m be a unitary matrix and x œ Cm . Prove that ÎU xÎ2 =
ÎxÎ2 .
Solution.
ÎU xÎ22
= < alternative definition >
(U x)H U x
= < (Az)H = z H AH >
H H
x U Ux
= < U is unitary >
xH x
= < alternative definition >
ÎxÎ22 .
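A quick numerical illustration in MATLAB, where a unitary matrix is generated from the QR factorization of a random complex matrix (a tool we will study in detail later):

m = 5;
[ U, R ] = qr( rand( m, m ) + 1i * rand( m, m ) );   % U is (numerically) unitary
norm( U' * U - eye( m ) )                            % should be (close to) zero
x = rand( m, 1 ) + 1i * rand( m, 1 );
norm( U * x ) - norm( x )                            % should be (close to) zero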
The converse is true as well:
Theorem 2.2.4.4 If A œ Cm◊m preserves length (ÎAxÎ2 = ÎxÎ2 for all x œ Cm ), then A is
unitary.
Proof. We first prove that (Ax)H (Ay) = xH y for all x, y by considering Îx ≠ yÎ22 = ÎA(x ≠
y)Î22 . We then use that to evaluate eH
i A Aej .
H

Let x, y œ Cm . Then

Îx ≠ yÎ22 = ÎA(x ≠ y)Î22


… < alternative definition >
(x ≠ y)H (x ≠ y) = (A(x ≠ y))H A(x ≠ y)
= < (Bz)H = z H B H >
(x ≠ y)H (x ≠ y) = (x ≠ y)H AH A(x ≠ y)
… < multiply out >
xH x ≠ xH y ≠ y H x + y H y = xH AH Ax ≠ xH AH Ay ≠ y H AH Ax + y H AH Ay
… < alternative definition ; xH y = y H x >
ÎxÎ2 ≠ (x y + xH y) + ÎyÎ22 = ÎAxÎ22 ≠ (xH AH Ay + xH AH Ay) + ÎAyÎ22
2 H


1 1 2 = ÎxÎ2 2and ÎAyÎ2 = ÎyÎ2 ; – + – = 2Re(–) >
2 < ÎAxÎ
Re xH y = Re (Ax)H Ay
1 2 1 2
One can similarly show that Im xH y = Im (Ax)H Ay by considering A(ix ≠ y).
Conclude that (Ax)H (Ay) = xH y.
We now use this to show that AH A = I by using the fact that the standard basis vectors
have the property that I
1 if i = j
ei ej =
H
0 otherwise
and that the i, j entry in AH A equals eH
i A Aej .
H

Note: I think the above can be made much more elegant by choosing $\alpha$ such that $\alpha x^H y$
is real and then looking at $\|x + \alpha y\|_2 = \|A(x + \alpha y)\|_2$ instead, much like we did in the proof
of the Cauchy-Schwarz inequality. Try and see if you can work out the details.
Homework 2.2.4.7 Prove that if U is unitary then ÎU Î2 = 1.
Solution.
ÎU Î2
= < definition >
maxÎxÎ2 =1 ÎU xÎ2
= < unitary matrices preserve length >
maxÎxÎ2 =1 ÎxÎ2
= < algebra >
1
(The above can be really easily proven with the SVD. Let’s point that out later.)
Homework 2.2.4.8 Prove that if U is unitary then Ÿ2 (U ) = 1.

Solution.
Ÿ2 U
= < definition >
≠1
ÎU Î2 ÎU Î2
= < both U and U ≠1 are unitary ; last homework >
1◊1
= < arithmetic >
1
The preservation of length extends to the preservation of norms that have a relation to
the 2-norm:
Homework 2.2.4.9 Let U œ Cm◊m and V œ Cn◊n be unitary and A œ Cm◊n . Show that
• ÎU H AÎ2 = ÎAÎ2 .
• ÎAV Î2 = ÎAÎ2 .
• ÎU H AV Î2 = ÎAÎ2 .

Hint. Exploit the definition of the 2-norm:


ÎAÎ2 = max ÎAxÎ2 .
ÎxÎ2 =1

Solution.

ÎU H AÎ2
= < definition of 2-norm >
maxÎxÎ2 =1 ÎU H AxÎ2
= < U is unitary and unitary matrices preserve length >
maxÎxÎ2 =1 ÎAxÎ2
= < definition of 2-norm >
ÎAÎ2 .

ÎAV Î2
= < definition of 2-norm >
maxÎxÎ2 =1 ÎAV xÎ2
= < V H is unitary and unitary matrices preserve length >
maxÎV xÎ2 =1 ÎA(V x)Î2
= < substitute y = V x >
maxÎyÎ2 =1 ÎAyÎ2
= < definition of 2-norm >
ÎAÎ2 .
• The last part follows immediately from the previous two:
ÎU H AV Î2 = ÎU H (AV )Î2 = ÎAV Î2 = ÎAÎ2 .

Homework 2.2.4.10 Let U œ Cm◊m and V œ Cn◊n be unitary and A œ Cm◊n . Show that
• ÎU H AÎF = ÎAÎF .

• ÎAV ÎF = ÎAÎF .

• ÎU H AV ÎF = ÎAÎF .

Hint. How does ÎAÎF relate to the 2-norms of its columns?


Solution.

• Partition 1 2
A= a0 · · · an≠1 .
qn≠1
Then we saw in Subsection 1.3.3 that ÎAÎ2F = j=0 ÎaÎ22 .
Now,
ÎU H AÎ2F
= 1 < partition 2A by columns >
ÎU H a0 · · · an≠1 Î2F
1= < property of matrix-vector
2 multiplication >
H H 2
Î U a0 · · · U an≠1 ÎF
= < exercise in Chapter 1 >
qn≠1
j=0 ÎU H
aj Î22
= < unitary matrices preserve length >
qn≠1 2
j=0 Îa Î
j 2
= < exercise in Chapter 1 >
ÎAÎ2F .

• To prove that ÎAV ÎF = ÎAÎF recall that ÎAH ÎF = ÎAÎF .

• The last part follows immediately from the first two parts.
In the last two exercises we consider U H AV rather than U AV because it sets us up better
for future discussion.

2.2.5 Examples of unitary matrices


In this unit, we will discuss a few situations where you may have encountered unitary matri-
ces without realizing. Since few of us walk around pointing out to each other "Look, another
matrix!", we first consider if a transformation (function) might be a linear transformation.
This allows us to then ask the question "What kind of transformations we see around us pre-
serve length?" After that, we discuss how those transformations are represented as matrices.
That leaves us to then check whether the resulting matrix is unitary.

2.2.5.1 Rotations

YouTube: https://www.youtube.com/watch?v=C0m DZ28Ohc


A rotation in 2D, R◊ : R2 æ R2 , takes a vector and rotates that vector through the angle
◊:

If you think about it,

• If you scale a vector first and then rotate it, you get the same result as if you rotate it
first and then scale it.

• If you add two vectors first and then rotate, you get the same result as if you rotate
them first and then add them.

Thus, a rotation is a linear transformation. Also, the above picture captures that a rotation
preserves the length of the vector to which it is applied. We conclude that the matrix that
represents a rotation should be a unitary matrix.
Let us compute the matrix that represents the rotation through an angle ◊. Recall that
if L : Cn æ Cm is a linear transformation and A is the matrix that represents it, then the
jth column of A, aj , equals L(ej ). The pictures

and

illustrate that
\[
R_\theta(e_0) = \left( \begin{array}{c} \cos(\theta) \\ \sin(\theta) \end{array} \right)
\quad \mbox{and} \quad
R_\theta(e_1) = \left( \begin{array}{c} -\sin(\theta) \\ \cos(\theta) \end{array} \right).
\]
Thus,
\[
R_\theta(x) = \left( \begin{array}{cc} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{array} \right)
\left( \begin{array}{c} \chi_0 \\ \chi_1 \end{array} \right).
\]
Homework 2.2.5.1 Show that
\[
\left( \begin{array}{cc} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{array} \right)
\]
is a unitary matrix. (Since it is real valued, it is usually called an orthogonal matrix instead.)
Hint. Use c for $\cos(\theta)$ and s for $\sin(\theta)$ to save yourself a lot of writing!
Solution.
A BH A B
cos(◊) ≠ sin(◊) cos(◊) ≠ sin(◊)
sin(◊) cos(◊) sin(◊) cos(◊)
= < the matrix is real valued >
A BT A B
cos(◊) ≠ sin(◊) cos(◊) ≠ sin(◊)
sin(◊) cos(◊) sin(◊) cos(◊)
A = < transpose BA > B
cos(◊) sin(◊) cos(◊) ≠ sin(◊)
≠ sin(◊) cos(◊) sin(◊) cos(◊)
A = < multiply > B
cos2 (◊) + sin2 (◊) ≠ cos(◊) sin(◊) + sin(◊) cos(◊)
≠ sin(◊) cos(◊) + cos(◊) sin(◊) sin2 (◊) + cos2 (◊)
A = B< geometry; algebra >
1 0
0 1
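A brief MATLAB check of this, for an arbitrarily chosen angle:

theta = 0.3;
R = [ cos( theta ), -sin( theta );
      sin( theta ),  cos( theta ) ];
norm( R' * R - eye( 2 ) )       % should be (close to) zero: R is orthogonal
x = [ -2; 1 ];
norm( R * x ) - norm( x )       % rotations preserve length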

Homework 2.2.5.2 Prove, without relying on geometry but using what you just discovered,
that cos(≠◊) = cos(◊) and sin(≠◊) = ≠ sin(◊)
Solution. Undoing a rotation by an angle ◊ means rotating in the opposite direction
through angle ◊ or, equivalently, rotating through angle ≠◊. Thus, the inverse of R◊ is
R≠◊ . The matrix that represents R◊ is given by
A B
cos(◊) ≠ sin(◊)
sin(◊) cos(◊)

and hence the matrix that represents R≠◊ is given by


A B
cos(≠◊) ≠ sin(≠◊)
.
sin(≠◊) cos(≠◊)

Since R≠◊ is the inverse of R◊ we conclude that


A B≠1 A B
cos(◊) ≠ sin(◊) cos(≠◊) ≠ sin(≠◊)
= .
sin(◊) cos(◊) sin(≠◊) cos(≠◊)

But we just discovered that


A B≠1 A BT A B
cos(◊) ≠ sin(◊) cos(◊) ≠ sin(◊) cos(◊) sin(◊)
= = .
sin(◊) cos(◊) sin(◊) cos(◊) ≠ sin(◊) cos(◊)

Hence A B A B
cos(≠◊) ≠ sin(≠◊) cos(◊) sin(◊)
.=
sin(≠◊) cos(≠◊) ≠ sin(◊) cos(◊)
from which we conclude that cos(≠◊) = cos(◊) and sin(≠◊) = ≠ sin(◊).

2.2.5.2 Reflections

YouTube: https://www.youtube.com/watch?v=r8S04qqcc-o

Picture a mirror with its orientation defined


by a unit length vector, u, that is orthogonal
to it.

We will consider how a vector, x, is reflected


by this mirror.

The component of x orthogonal to the mirror


equals the component of x in the direction of
u, which equals (uT x)u.

The orthogonal projection of x onto the mir-


ror is then given by the dashed vector, which
equals x ≠ (uT x)u.

To get to the reflection of x, we now need to


go further yet by ≠(uT x)u.

We conclude that the transformation that mir-


rors (reflects) x with respect to the mirror is
given by M (x) = x ≠ 2(uT x)u.

The transformation described above preserves the length of the vector to which it is
applied.
Homework 2.2.5.3 (Verbally) describe why reflecting a vector as described above is a linear
transformation.
Solution.
• If you scale a vector first and then reflect it, you get the same result as if you reflect it
first and then scale it.
• If you add two vectors first and then reflect, you get the same result as if you reflect
them first and then add them.

Homework 2.2.5.4 Show that the matrix that represents M : R3 æ R3 in the above
example is given by I ≠ 2uuT .
Hint. Rearrange x ≠ 2(uT x)u.
Solution. We notice that
x ≠ 2(uT x)u
= < –x = x– >
x ≠ 2u(uT x)
= < associativity >
Ix ≠ 2uuT x
= < distributivity >
(I ≠ 2uuT )x.

Hence M (x) = (I ≠ 2uuT )x and the matrix that represents M is given by I ≠ 2uuT .
Homework 2.2.5.5 (Verbally) describe why (I≠2uuT )≠1 = I≠2uuT if u œ R3 and ÎuÎ2 = 1.
Solution. If you take a vector, x, and reflect it with respect to the mirror defined by u,
and you then reflect the result with respect to the same mirror, you should get the original
vector x back. Hence, the matrix that represents the reflection should be its own inverse.
Homework 2.2.5.6 Let M : R3 æ R3 be defined by M (x) = (I ≠ 2uuT )x, where ÎuÎ2 = 1.
Show that the matrix that represents it is unitary (or, rather, orthogonal since it is in R3◊3 ).
Solution. Pushing through the math we find that

(I ≠ 2uuT )T (I ≠ 2uuT )
= < (A + B)T = AT + B T >
(I ≠ (2uuT )T )(I ≠ 2uuT )
T

= < (–AB T )T = –BAT >


(I ≠ 2uu )(I ≠ 2uuT )
T

= < distributivity >


(I ≠ 2uuT ) ≠ (I ≠ 2uuT )(2uuT )
= < distributivity >
I ≠ 2uu ≠ 2uuT + 2uuT 2uuT
T

= < uT u = 1 >
I ≠ 4uu + 4uuT
T

= <A≠A=0>
I.
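A matrix of the form $I - 2uu^T$ with $\|u\|_2 = 1$ is often called a (Householder) reflector, a tool we will meet again later in the course. A minimal MATLAB check that it is orthogonal and its own inverse:

u = rand( 3, 1 );
u = u / norm( u );              % normalize so that ||u||_2 = 1
M = eye( 3 ) - 2 * u * u';      % the reflector
norm( M' * M - eye( 3 ) )       % should be (close to) zero: M is orthogonal
norm( M * M - eye( 3 ) )        % should be (close to) zero: M is its own inverse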
Remark 2.2.5.1 Unitary matrices in general, and rotations and reflections in particular,
will play a key role in many of the practical algorithms we will develop in this course.

2.2.6 Change of orthonormal basis

YouTube: https://www.youtube.com/watch?v=DwTVkdQKJK4
Homework 2.2.6.1 Consider the vector $x = \left( \begin{array}{r} -2 \\ 1 \end{array} \right)$ and the following picture that depicts
a rotated basis with basis vectors $u_0$ and $u_1$.

What are the coordinates of the vector x in this rotated system? In other words, find
$\hat{x} = \left( \begin{array}{c} \hat{\chi}_0 \\ \hat{\chi}_1 \end{array} \right)$ such that $\hat{\chi}_0 u_0 + \hat{\chi}_1 u_1 = x$.
Solution. There are a number of approaches to this. One way is to try to remember the
formula you may have learned in a pre-calculus course about change of coordinates. Let’s
instead start by recognizing (from geometry or by applying the Pythagorean Theorem) that
\[
u_0 = \left( \begin{array}{c} \sqrt{2}/2 \\ \sqrt{2}/2 \end{array} \right)
    = \frac{\sqrt{2}}{2} \left( \begin{array}{c} 1 \\ 1 \end{array} \right)
\quad \mbox{and} \quad
u_1 = \left( \begin{array}{r} -\sqrt{2}/2 \\ \sqrt{2}/2 \end{array} \right)
    = \frac{\sqrt{2}}{2} \left( \begin{array}{r} -1 \\ 1 \end{array} \right).
\]

Here are two ways in which you can employ what you have discovered in this course:

• Since u0 and u1 are orthonormal vectors, you know that

x
= < u0 and u1 are orthonormal >
(uT0 x)u0 + (uT1 x)u1
¸ ˚˙ ˝ ¸ ˚˙ ˝
component in the component in the
direction of u0 direction of u1
Q = < instantiate and
R 0 Q u1 >
u
A BT A B A BT A BR
Ô
2 1 ≠2 Ô
2 ≠1 ≠2
a b u0 +a b u1
2 1 1 2 1 1
= Ô
<Ô evaluate >
≠ 2 u0 + 3 2 2 u 1 .
2

• An alternative
1 2 way to arrive at the same answer that provides more insight. Let
U = u0 u1 . Then

x
= < U is unitary (or orthogonal since it is real valued) >
UUT x
= < Ainstantiate
B U>
1 2 uT
0
u0 u1 x
uT1
= < Amatrix-vector
B multiplication >
1 2 uT x
0
u0 u1
uT1 x
= < instantiate >
Q A BT A B R
Ô
2 1 ≠2
1 2cc 2 1 1
d
d
u0 u1 c c A BT A B d
Ô d
a 2 ≠1 ≠2 b
2 1 1
= < evaluate >
A Ô B
1 2 ≠ 2
u0 u1 Ô2
3 2
2
= < Asimplify
A >BB
1 2 Ô ≠1
2
u0 u1 2 3
Below we compare side-by-side how to describe a vector x using the standard basis vectors
e0 , . . . , em≠1 (on the left) and vectors u0 , . . . , um≠1 (on the right):
Q R Q R
‰0 uT0 x
The vector x = a ... d ..
c d c d
b describes the The vector x‚ = c d describes the
c
a . b
‰m≠1 uTm≠1 x
vector x in terms of the standard basis vec- vector x in terms of the orthonormal basis
tors e0 , . . . , em≠1 : u0 , . . . , um≠1 :

x x
= < x = Ix = IIx = II T x > = < x = Ix = U U H x >
T
II x UUHx
= < exposeQcolumns R
of I > = < exposeQcolumnsRof U >
T
1 2c
e 0 1 2c
uH0
.. d .. d
e0 · · · em≠1 c a . d bx u0 · · · um≠1 c a . d bx
T H
em≠1 um≠1
= < evaluate Q
> R
= < evaluate
Q
> R
1 2c
eT0 x 1 2c
uH0 x
.. d .. d
e0 · · · em≠1 c a . d
b u0 · · · um≠1 c a . d
b
eTm≠1 x uH
m≠1 x
= < eTj x = Q ‰j > R
1 2c
‰ 0
.. d
e0 · · · em≠1 c a . d b
‰m≠1
= < evaluate > = < evaluate >
‰0 e0 + ‰1 e1 + · · · + ‰m≠1 em≠1 . uH
0 xu 0 + 1 xu1 + · · · + um≠1 xum≠1 .
uH H

Illustration:
Illustration:
Another way of looking at this is that if u0 , u1 , . . . , um≠1 is an orthonormal basis for Cm ,
then any x œ Cm can be written as a linear combination of these vectors:

x = –0 u0 + –1 u1 + · · · + –m≠1 um≠1 .

Now,

i x = ui (–0 u0 + –1 u1 + · · · + –i≠1 ui≠1 + –i ui + –i+1 ui+1 + · · · + –m≠1 um≠1 )


uH H

= –0 uHi u0 + –1 ui u1 + · · · + –i≠1 ui ui≠1


H H
¸ ˚˙ ˝ ¸ ˚˙ ˝ ¸ ˚˙ ˝
0 0 0
+ –i ui ui + –i+1 ui ui+1 + · · · + –m≠1 uH
H H
i um≠1
¸ ˚˙ ˝ ¸ ˚˙ ˝ ¸ ˚˙ ˝
1 0 0
= –i .

Thus uH
i x = –i , the coefficient that multiplies ui .

Remark 2.2.6.1 The point is that given vector x and unitary matrix U , U H x computes
the coefficients for the orthonormal basis consisting of the columns of matrix U . Unitary
matrices allow one to elegantly change between orthonormal bases.
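In MATLAB terms, this change of basis is a one-liner; a sketch with a randomly generated orthogonal matrix:

m = 4;
[ U, R ] = qr( rand( m, m ) );   % the columns of U form an orthonormal basis for R^m
x = rand( m, 1 );
xhat = U' * x;                   % coefficients of x with respect to the columns of U
norm( U * xhat - x )             % reconstruct x from its coefficients: should be (close to) zero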

2.2.7 Why we love unitary matrices

YouTube: https://www.youtube.com/watch?v=d8-AeC3Q8Cw
In Subsection 1.4.1, we looked at how sensitive solving
\[
A x = b
\]
is to a change in the right-hand side
\[
A ( x + \delta\!x ) = b + \delta\!b
\]
when A is nonsingular. We concluded that
\[
\frac{\| \delta\!x \|}{\| x \|} \leq \underbrace{\| A \| \, \| A^{-1} \|}_{\kappa(A)} \frac{\| \delta\!b \|}{\| b \|},
\]
when an induced matrix norm is used. Let's look instead at how sensitive matrix-vector
multiplication is.
Homework 2.2.7.1 Let A œ Cn◊n be nonsingular and x œ Cn a nonzero vector. Consider

y = Ax and y + ”y = A(x + ”x).



Show that
\[
\frac{\| \delta\!y \|}{\| y \|} \leq \underbrace{\| A \| \, \| A^{-1} \|}_{\kappa(A)} \frac{\| \delta\!x \|}{\| x \|},
\]
where Î · Î is an induced matrix norm.
Solution. Since x = A≠1 y we know that

ÎxÎ Æ ÎA≠1 ÎÎyÎ

and hence
1 1
Æ ÎA≠1 Î . (2.2.1)
ÎyÎ ÎxÎ
Subtracting y = Ax from y + ”y = A(x + ”x) yields

”y = A”x

and hence
ΔyÎ Æ ÎAÎΔxÎ. (2.2.2)
Combining (2.2.1) and (2.2.2) yields the desired result.
There are choices of x and ”x for which the bound is tight.
What does this mean? It means that if, as part of an algorithm, we use matrix-vector
or matrix-matrix multiplication, we risk amplifying relative error by the condition number
of the matrix by which we multiply. Now, we saw in Section 1.4 that $1 \leq \kappa(A)$. So, if there
are algorithms that only use matrices for which $\kappa(A) = 1$, then those algorithms don't
amplify relative error.
Remark 2.2.7.1 We conclude that unitary matrices, which do not amplify the 2-norm of a
vector or matrix, should be our tool of choice, whenever practical.

2.3 The Singular Value Decomposition


2.3.1 The Singular Value Decomposition Theorem

YouTube: https://www.youtube.com/watch?v=uBo3XAGt24Q
The following is probably the most important result in linear algebra:

Theorem 2.3.1.1 Singular Value Decomposition Theorem. Given $A \in \mathbb{C}^{m \times n}$ there
exist unitary $U \in \mathbb{C}^{m \times m}$, unitary $V \in \mathbb{C}^{n \times n}$, and $\Sigma \in \mathbb{R}^{m \times n}$ such that $A = U \Sigma V^H$. Here
\[
\Sigma = \left( \begin{array}{cc} \Sigma_{TL} & 0 \\ 0 & 0 \end{array} \right)
\quad \mbox{with} \quad
\Sigma_{TL} = \left( \begin{array}{cccc}
  \sigma_0 & 0 & \cdots & 0 \\
  0 & \sigma_1 & \cdots & 0 \\
  \vdots & \vdots & \ddots & \vdots \\
  0 & 0 & \cdots & \sigma_{r-1}
\end{array} \right)
\hspace{2em} (2.3.1)
\]
and $\sigma_0 \geq \sigma_1 \geq \cdots \geq \sigma_{r-1} > 0$. The values $\sigma_0, \ldots, \sigma_{r-1}$ are called the singular values of
matrix A. The columns of U and V are called the left and right singular vectors, respectively.
Recall that in our notation a 0 indicates a matrix or vector "of appropriate size" and that
in this setting the zero matrices in (2.3.1) may be $0 \times 0$, $(m - r) \times 0$, and/or $0 \times (n - r)$.
Before proving this theorem, we are going to put some intermediate results in place.
Remark 2.3.1.2 As the course progresses, we will notice that there is a conflict between
the notation that explicitly exposes indices, e.g.,
\[
U = \left( u_0 \;\; u_1 \;\; \cdots \;\; u_{n-1} \right)
\]
and the notation we use to hide such explicit indexing, which we call the FLAME notation,
e.g.,
\[
U = \left( U_0 \;|\; u_1 \;|\; U_2 \right).
\]
The two are linked by
\[
\left( \underbrace{u_0 \; \cdots \; u_{k-1}}_{U_0} \;|\; \underbrace{u_k}_{u_1} \;|\;
       \underbrace{u_{k+1} \; \cdots \; u_{n-1}}_{U_2} \right).
\]
In algorithms that use explicit indexing, k often is the loop index that identifies where in the
matrix or vector the algorithm currently has reached. In the FLAME notation, the index 1
identifies that place. This creates a conflict for the two distinct items that are both indexed
with 1, e.g., u1 in our example here. It is our experience that learners quickly adapt to this
and hence have not tried to introduce even more notation that avoids this conflict. In other
words: you will almost always be able to tell from context what is meant. The following
lemma and its proof illustrate this further.
Lemma 2.3.1.3 Given $A \in \mathbb{C}^{m \times n}$, with $1 \leq n \leq m$ and $A \neq 0$ (the zero matrix), there exist
unitary matrices $\tilde{U} \in \mathbb{C}^{m \times m}$ and $\tilde{V} \in \mathbb{C}^{n \times n}$ such that
\[
A = \tilde{U} \left( \begin{array}{cc} \sigma_1 & 0 \\ 0 & B \end{array} \right) \tilde{V}^H ,
\quad \mbox{where } \sigma_1 = \|A\|_2 .
\]
Proof. In the proof below, it is really important to keep track of when a line is part of the
partitioning of a matrix or vector, and when it denotes scalar division.
Choose $\sigma_1$ and $\tilde{v}_1 \in \mathbb{C}^n$ such that

• $\| \tilde{v}_1 \|_2 = 1$; and

• $\sigma_1 = \| A \tilde{v}_1 \|_2 = \| A \|_2$.

In other words, vÂ1 is the vector that maximizes maxÎxÎ2 =1 ÎAxÎ2 .


Let uÂ1 = AvÂ1 /‡1 . Then

ÎuÂ1 Î2 = ÎAvÂ1 Î2 /‡1 = ÎAvÂ1 Î2 /ÎAÎ2 = ÎAÎ2 /ÎAÎ2 = 1.

Choose UÂ2 œ Cm◊(m≠1) and VÂ2 œ Cn◊(n≠1) so that


1 2 1 2
UÂ = uÂ1 UÂ2 and VÂ = vÂ1 VÂ2

are unitary. Then

UÂ H AVÂ
= < instantiate >
1 2H 1 2
uÂ1 UÂ 2 A vÂ1 VÂ2
A
= < multiplyBout >
uÂH A v uÂH Â
1 Â 1 1 AV2
UÂ2H AvÂ1 UÂ2H AVÂ2
A
= < AvÂ1 = ‡1 uB Â1 >
H Â
‡1 uÂH u
Â
1 1 u 1 A V 2
‡1 UÂ2H uÂ1 UÂ2H AVÂ2
A = < BuÂH1 uÂ1 = 1; U 2 Â1 = 0; pick w = V2 A u
 Hu  H H  and B = U
1
 H AV >
2 2
H
‡1 w
,
0 B

where w = VÂ2H AH uÂ1 and B = UÂ2H AVÂ2 .



We will now argue that w = 0, the zero vector of appropriate size:

‡12
= < assumption >
ÎAÎ22
= < 2-norm is invariant under multiplication by unitary matrix >
 H  2
ÎU AV Î2
= < definition of Î · Î2 >
ÂH AVÂ xÎ2
maxx”=0 ÎxÎ2 2
ÎU
2
= .< see above .>
A B 2
. ‡ H .
. 1 w .
. x.
. 0 B .
maxx”=0 ÎxÎ22
2

.A
Ø < x replaced by specific vector >
BA B.2
. ‡ H
‡1 ..
. 1 w
. .
. 0 B w .2
.A B.2
.
‡1 ..
.
..
.
w .2
.A
= < multiply out numerator >
B.
. ‡ 2 + w H w .2
. 1 .
. .
. Bw .
.A 2 B.2
. ‡1 .
. .
. .
. w .
.A2 B.2 .A B.2
Â1 ... . ‡1 .
. . .
Ø . = ÎÂ1 Î22 + Îy2 Î22 Ø ÎÂ1 Î22 ; .
<. . = ‡12 + wH w >
y2 .2. . w .
2
(‡1 + w w) /(‡1 + w w)
2 H 2 2 H

= < algebra >


‡1 + w w.
2 H

A B
‡1 0
Thus ‡12 Ø ‡12 + w w which means that w = 0 (the zero vector) and
H
UÂ H AVÂ =
0 B
A B
‡1 0
so that A = UÂ VÂ H . ⌅
0 B
Hopefully you can see where this is going: If one can recursively find that B = UB B VB ,
H

then A B
‡1 0
A = UÂ VÂ H
A
0 B B
‡ 1 0
= UÂ VÂ H
A
0 UBB BAVBH BA B
1 0 ‡1 0 1 0 .
= UÂ VÂ H
0 UB 0 B 0 VBH
A B A B A A BBH
1 0 ‡1 0 1 0
= UÂ Â
V .
0 UB 0 B 0 VB
¸ ˚˙ ˝ ¸ ˚˙ ˝ ¸ ˚˙ ˝
U VH
The next exercise provides the insight that the values on the diagonal of will be ordered
from largest to smallest.
A B
‡1 0
Homework 2.3.1.1 Let A œ C with A =
m◊n
and assume that ÎAÎ2 = ‡1 .
0 B
ALWAYS/SOMETIMES/NEVER: ÎBÎ2 Æ ‡1 .
Solution. We will employ a proof by contradiction. Assume that ÎBÎ2 > ‡1 . Then there
exists a vector z with ÎzÎ2 = 1 such that ÎBÎ2 = ÎBzÎ2 = maxÎxÎ2 =1 ÎBxÎ2 . But then

ÎAÎ2
= < definition >
maxÎxÎ2 =1 ÎAxÎ2
. Ø A < pick a specific vector with 2 ≠ norm equal to one >
B.
.
. 0 .
.
.A .
. z .2
.A = <Binstantiate
A B. A>
. ‡
. 1 0 0 .
.
. .
. 0 B z .2
.A = B. < partitioned matrix-vector multiplication >
.
. 0 ..
. .
. Bz .
2 .A B.2
. y .
. 0 .
= < .. . = Îy0 Î22 + Îy1 Î22 >
y1 .2
ÎBzÎ2
= < assumption about z >
ÎBÎ2
> < assumption >
‡1 .

which is a contradiction.
Hence ÎBÎ2 Æ ‡1 .
We are now ready to prove the Singular Value Decomposition Theorem.

Proof of Singular Value Decomposition Theorem for n Æ m. We will prove this for m Ø n,
leaving the case where m Æ n as an exercise.
Proof by induction: Since m Ø n, we select m to be arbritary and induct on n.

• Base case: n = 1.
1 2
In this case A = a1 where a1 œ Cm is its only column.
Case 1: a1 = 0 (the zero vector).
Then A B
1 2
A= 0 = Im◊m I1◊1
¸ ˚˙ ˝ 0 ¸ ˚˙ ˝
U VH
so that U = Im◊m , V = I1◊1 , and TL is an empty matrix.
Case 2: a1 ”= 0.
Then 1 2 1 2
A= a1 = u1 (Îa1 Î2 )
1 2
where u1 = a1 /Îa1 Î2 . Choose U2 œ Cm◊(m≠1) so that U = u1 U2 is unitary. Then
1 2
A = a1
1 2
= (Îa1 Î2 )
u1
A B
1 2 Îa Î 1 2H
1 2
= u1 U2 1
0
= U V H,

where
1 2
¶ U= u0 U1 ,
A B
1 2
¶ = TL
with = ‡1 and ‡1 = Îa1 Î2 = ÎAÎ2
0 TL

1 2
¶ V = 1 .

• Inductive step:
Assume the result is true for matrices with 1 Æ k columns. Show that it is true for
matrices with k + 1 columns.
Let A œ Cm◊(k+1) with 1 Æ k < n.
Case 1: A = 0 (the zero matrix)
Then A B
A = Im◊m I(k+1)◊(k+1)
0m◊(k+1)
so that U = Im◊m , V = I(k+1)◊(k+1) , and TL is an empty matrix.

Case 2: A ”= 0.
Then ÎAÎ2 ”= 0. By Lemma 2.3.1.3, A we know
B that there exist unitary U œ C
 m◊m
and
‡1 0
V œ C(k+1)◊(k+1) such that A = U V with ‡1 = ÎAÎ2 .
0 B
By the inductive hypothesis, there exist unitary ǓB œ C(m≠1)◊(m≠1)
A , unitary
B V̌B œ
ˇ 0
Ck◊k , and ˇ B œ R(m≠1)◊k such that B = ǓB ˇ B V̌BH where ˇ B = TL
, ˇTL =
0 0
diag(‡2 , · · · , ‡r≠1 ), and ‡2 Ø · · · Ø ‡r≠1 > 0.
Now, let
A B A B A B
1 0 1 0 ‡1 0
U= UÂ ,V = VÂ , and = ˇB .
0 ǓB 0 V̌B 0

(There are some really tough to see "checks" in the definition of U , V , and !!) Then
A = U V H where U , V , and have the desired properties. Key here is that ‡1 =
ÎAÎ2 Ø ÎBÎ2 which means that ‡1 Ø ‡2 .

• By the Principle of Mathematical Induction the result holds for all matrices A œ Cm◊n
with m Ø n.


Homework 2.3.1.2 Let $\Sigma = \mbox{diag}( \sigma_0, \ldots, \sigma_{n-1} )$.
ALWAYS/SOMETIMES/NEVER: $\| \Sigma \|_2 = \max_{i=0}^{n-1} | \sigma_i |$.
Answer. ALWAYS
Now prove it.
Solution. Yes, you have seen this before, in Homework 1.3.5.1. We repeat it here because
of its importance to this topic.

Î Î22 = maxÎxÎ2 =1 Î xÎ22


.Q RQ R.2
. ‡0 0 ··· 0 ‰0 .
. .
.c 0 ‡1 0
.c dc d.
··· dc ‰1 d.
= maxÎxÎ2 =1 ..c
c . .. .. .. dc .. d.
.a .. . dc d.
.
. . ba . b.
.
. 0 0 · · · ‡n≠1 ‰n≠1 .2
.Q R.2
. ‡0 ‰0 .
. .
.c d.
.c ‡1 ‰1 d.
= maxÎxÎ2 =1 ..c
c .. d.
d.
.a
.
. b.
.
. ‡n≠1 ‰n≠1 .
Ëq È 2
= maxÎxÎ2 =1 n≠1 |‡ ‰ |2
Ëqj=0 j j È
= maxÎxÎ2 =1 [|‡ |
n≠1
|‰j |2 ]
2
Ëqj=0 Ë j ÈÈ
Æ maxÎxÎ2 =1 n≠1 maxn≠1
i=0 |‡ i |2
|‰ j |2
Ë j=0 È
2 qn≠1
= maxÎxÎ2 =1 maxn≠1
i=0 |‡i | j=0 |‰j |
2
1 22
= maxn≠1
i=0 |‡i | maxÎxÎ2 =1 ÎxÎ22
1 22
= maxn≠1
i=0 |‡i | .
so that Î Î2 Æ maxn≠1i=0 |‡i |.
Also, choose j so that |‡j | = maxn≠1
i=0 |‡i |. Then

Î Î2 = maxÎxÎ2 =1 Î xÎ2 Ø Î ej Î2 = ·j ej Î2 = |‡j |Îej Î2 = |‡j | = maxn≠1


i=0 |‡i |.

so that maxn≠1
i=0 |‡i | Æ Î Î2 Æ maxi=0 |‡i |, which implies that Î Î2 = maxi=0 |‡i |.
n≠1 n≠1

Homework 2.3.1.3 Assume that U œ Cm◊m and V œ Cn◊n are unitary matrices. Let
A, B œ Cm◊n with B = U AV H . Show that the singular values of A equal the singular values
of B.
Solution. Let A = UA A VAH be the SVD of A. Then B = U UA A VAH V H = (U UA ) A (V VA )H
where both U UA and V VA are unitary. This gives us the SVD for B and it shows that the
singular values of B equal the singular values of A.
Homework 2.3.1.4 Let $A \in \mathbb{C}^{m \times n}$ with $n \leq m$ and $A = U \Sigma V^H$ be its SVD.
ALWAYS/SOMETIMES/NEVER: $A^H = V \Sigma^T U^H$.
Answer. ALWAYS
Solution.
\[
A^H = ( U \Sigma V^H )^H = ( V^H )^H \Sigma^T U^H = V \Sigma^T U^H
\]
since $\Sigma$ is real valued. Notice that $\Sigma$ is only "sort of diagonal" (it is possibly rectangular),
which is why $\Sigma^T \neq \Sigma$.
Homework 2.3.1.5 Prove the Singular Value Decomposition Theorem for $m \leq n$.
Hint. Consider the SVD of $B = A^H$.
Solution. Let $B = A^H$. Since it is $n \times m$ with $n \geq m$, its SVD exists: $B = U_B \Sigma_B V_B^H$.
Then $A = B^H = V_B \Sigma_B^T U_B^H$ and hence $A = U \Sigma V^H$ with $U = V_B$, $\Sigma = \Sigma_B^T$, and $V = U_B$.
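In MATLAB, the SVD is computed with the built-in function svd. A quick check, with a random matrix, that the computed factors have the advertised properties:

m = 5; n = 3;
A = rand( m, n );
[ U, Sigma, V ] = svd( A );
norm( A - U * Sigma * V' )       % should be (close to) zero
norm( U' * U - eye( m ) )        % U is unitary
norm( V' * V - eye( n ) )        % V is unitary
diag( Sigma )                    % singular values, ordered from largest to smallest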


I believe the following video has material that is better presented in second
video of 2.3.2.

YouTube: https://www.youtube.com/watch?v=ZYzqTC5LeLs

2.3.2 Geometric interpretation

YouTube: https://www.youtube.com/watch?v=XKhCTtX1z6A
We will now illustrate what the SVD Theorem tells us about matrix-vector multiplication
(linear transformations) by examining the case where $A \in \mathbb{R}^{2 \times 2}$. Let $A = U \Sigma V^T$ be its SVD.
(Notice that all matrices are now real valued, and hence $V^H = V^T$.) Partition
\[
A = \left( u_0 \;\; u_1 \right)
\left( \begin{array}{cc} \sigma_0 & 0 \\ 0 & \sigma_1 \end{array} \right)
\left( v_0 \;\; v_1 \right)^T .
\]

Since U and V are unitary matrices, {u0 , u1 } and {v0 , v1 } form orthonormal bases for the
range and domain of A, respectively:
R2 : Domain of A: R2 : Range (codomain) of A:

Let us manipulate the decomposition a little:


A B
1 ‡0 0 1
2 2T
A = u 0 u1 v0 v1
C A
0 ‡1 BD
1 2 ‡
0 0 1 2T
= u0 u1 v0 v1
0 ‡1
1 21 2T
= ‡0 u0 ‡1 u1 v0 v1 .

Now let us look at how A transforms v0 and v1 :


A B
1 21 2T 1 2 1
Av0 = ‡0 u0 ‡1 u1 v0 v1 v0 = ‡0 u0 ‡1 u1 = ‡0 u0
0

and similarly Av1 = ‡1 u1 . This motivates the pictures in Figure 2.3.2.1.



R2 : Domain of A: R2 : Range (codomain) of A:

R2 : Domain of A: R2 : Range (codomain) of A:

Figure 2.3.2.1 Illustration of how orthonormal vectors v0 and v1 are transformed by matrix
A=U V.
Next, Alet usBlook at how A transforms any vector with (Euclidean) unit length. Notice
‰0
that x = means that
‰1
x = ‰0 e0 + ‰1 e1 ,

where e0 and e1 are the unit basis vectors. Thus, ‰0 and ‰1 are the coefficients when x is
expressed using e0 and e1 as basis. However, we can also express x in the basis given by v0
and v1 : A B
1 21 2T 1 2 vT x
0
x = V V ˝ x = v0 v1
T
v0 v1 x = v0 v1
¸ ˚˙ v1T x
I A B
1 2 –
0
= v0 x v0 + v1 x v1 = –0 v0 + –0 v1 = v0 v1
T T
.
¸˚˙˝ ¸˚˙˝ –1
–0 –1
Thus, in the basis formed by v0 and v1 , its coefficients are –0 and –1 . Now,
1 21 2T
Ax = ‡0 u0 ‡1 u1 v0 v1 x
A B
1 21 2T 1 2 –0
= ‡0 u0 ‡1 u1 v0 v1 v0 v1
A B
–1
1 2 –0
= ‡0 u0 ‡1 u1 = –0 ‡0 u0 + –1 ‡1 u1 .
–1

This is illustrated by the following picture, which also captures the fact that the unit ball
is mapped to an oval with major axis equal to ‡0 = ÎAÎ2 and minor axis equal to ‡1 , as
illustrated in Figure 2.3.2.1 (bottom).
Finally, we show the same insights for general vector x (not necessarily of unit length):
R2 : Domain of A: R2 : Range (codomain) of A:

Another observation is that if one picks the right basis for the domain and codomain,
then the computation Ax simplifies to a matrix multiplication with a diagonal matrix. Let

us again illustrate this for nonsingular A œ R2◊2 with


A B
1 2 ‡0 0 1 2 T
A= u0 u1 v0 v1 .
¸ ˚˙ ˝ 0 ‡1 ¸ ˚˙ ˝
¸ ˚˙ ˝
U V

Now, if we chose to express y using u0 and u1 as the basis and express x using v0 and v1 as
the basis, then
U U T˝ y = U U T y = (uT0 y)u0 + (uT1 y)u1
¸ ˚˙ ¸ ˚˙ ˝
I y‚ A B A B
1 2 uT y ‚0
0
= u0 u1 =U
uT1 y ‚1
¸ ˚˙ ˝
y‚
V V T˝ x = V V
¸ ˚˙ ¸ ˚˙x˝ = (v0 x)v0 + (v1 x)v1
T T T

I x‚ A B A B
1 2 vT x ‰‚0
0
= v0 v1 =V .
v1T x ‰‚1 .
¸ ˚˙ ˝
x‚
If y = Ax then
U UT y = U V T x˝ = U x‚
¸ ˚˙
¸ ˚˙ ˝
y‚ Ax
so that
y‚ = x‚
and A B A B
‚0 ‡0 ‰‚0
= .
‚1 . ‡1 ‰‚1 .
Remark 2.3.2.2 The above discussion shows that if one transforms the input vector x
and output vector y into the right bases, then the computation y := Ax can be computed
with a diagonal matrix instead: y‚ := x‚. Also, solving Ax = y for x can be computed by
multiplying with the inverse of the diagonal matrix: x‚ := ≠1 y‚.
These observations generalize to A œ Cm◊n : If

y = Ax

then
U H y = U H A ¸V ˚˙
V H˝ x
I
so that
UHy = V H
¸ ˚˙ x˝
¸ ˚˙ ˝
y‚ x‚

($\Sigma$ is a rectangular "diagonal" matrix.)

YouTube: https://www.youtube.com/watch?v=1LpK0dbFX1g
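The following MATLAB sketch visualizes this geometric interpretation for a small, arbitrarily chosen 2 x 2 matrix: it maps points on the unit circle and plots the resulting oval together with the vectors $\sigma_0 u_0$ and $\sigma_1 u_1$:

A = [ 2, 1; 0, 1 ];                        % an arbitrary 2 x 2 example
[ U, Sigma, V ] = svd( A );
t = linspace( 0, 2*pi, 200 );
X = [ cos( t ); sin( t ) ];                % points on the unit circle
Y = A * X;                                 % their images under A
plot( Y( 1,: ), Y( 2,: ) ); hold on
quiver( 0, 0, Sigma(1,1)*U(1,1), Sigma(1,1)*U(2,1) );   % sigma_0 u_0
quiver( 0, 0, Sigma(2,2)*U(1,2), Sigma(2,2)*U(2,2) );   % sigma_1 u_1
axis equal; hold off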

2.3.3 An "algorithm" for computing the SVD


We really should have created a video for this section. Those who have taken our "Program-
ming for Correctness" course will recognize what we are trying to describe here. Regardless,
you can safely skip this unit without permanent (or even temporary) damage to your linear
algebra understanding.
In this unit, we show how the insights from the last unit can be molded into an "algorithm"
for computing the SVD. We put algorithm in quotes because while the details of the algorithm
mathematically exist, they are actually very difficult to compute in practice. So, this is not
a practical algorithm. We will not discuss a practical algorithm until the very end of the
course, in (((section to be determined))).
We observed that, starting with matrix A, we can compute one step towards the SVD.
If we overwrite A with the intermediate results, this means that after one step
A B A B A B
–11 aT12 1 2H – ‚ T12
‚ 11 a 1 2 ‡11 0
= uÂ1 UÂ 2 vÂ1 VÂ
2 = ,
a21 A22 a‚21 A‚22 0 B

where A‚ allows us to refer to the original contents of A.


In our proof of Theorem 2.3.1.1, we then said that the SVD of B, B = UB B VBH could
be computed, and the desired U and V can then be created by computing U = UÂ UB and
V = VÂ VB .
Alternatively, one can accumulate U and V every time a new singular value is exposed.
In this approach, you start by setting U = Im◊m and V = In◊n . Upon completing the first
step (which computes the first singular value), one multiplies U and V from the right with
the computed UÂ and VÂ :
U := U UÂ
V := V VÂ .
Now, every time another singular value is computed in future steps, the corresponding uni-
tary matrices are similarly accumulated into U and V .
To explain this more completely, assume that the process has proceeded for k steps to

the point where


1 2
U = UL UR œ Cm◊m with UL œ Cm◊k
1 2
V = V VR œ Cm◊m with VL œ Cn◊k
A L B
AT L AT R
A = with AT L œ Ck◊k ,
ABL ABR

where the current contents of A are


A B A B
AT L AT R 1 2H A‚T L A‚T R 1 2
= UL UR VL VR
ABL ABR A B A‚BL A‚BR
0
= TL
.
0 B

This means that in the current step we need to update the contents of ABR with
A B
‡11 0
UÂ H AVÂ =
0 BÂ

and update A B
1 2 1 2
Ik◊k 0
UL UR := UL UR
A 0 UÂ B
1 2 1 2 I 0
:=
k◊k
VL VR VL VR ,
0 VÂ
which simplify to
UBR := UBR UÂ and VBR := VBR VÂ .
At that point, AT L is expanded by one row and column, and the left-most columns of UR and
VR are moved to UL and VL , respectively. If ABR ever contains a zero matrix, the process
completes with A overwritten with = U H V‚ . These observations, with all details, are
captured in Figure 2.3.3.1. In that figure, the boxes in yellow are assertions that capture the
current contents of the variables. Those familiar with proving loops correct will recognize
the first and last such box as the precondition and postcondition for the operation and
A B A B
AT L AT R 1 2H A‚T L A‚T R 1 2
= UL UR VL VR
ABL ABR A B A‚BL A‚BR
0
= TL
0 B

as the loop-invariant that can be used to prove the correctness of the loop via a proof by
induction.

Figure 2.3.3.1 Algorithm for computing the SVD of A, overwriting A with $\Sigma$. In the yellow
boxes are assertions regarding the contents of the various matrices.

The reason this algorithm is not practical is that many of the steps are easy to state
mathematically, but difficult (computationally expensive) to compute in practice. In partic-
ular:

• Computing ÎABR Î2 is tricky and as a result, so is computing vÂ1 .

• Given a vector, determining a unitary matrix with that vector as its first column is
computationally expensive.

• Assuming for simplicity that m = n, even if all other computations were free, comput-
ing the product A22 := UÂ2H ABR VÂ2 requires O((m ≠ k)3 ) operations. This means that
the entire algorithm requires O(m4 ) computations, which is prohibitively expensive
when n gets large. (We will see that most practical algorithms discussed in this course
cost O(m3 ) operations or less.)

Later in this course, we will discuss an algorithm that has an effective cost of O(m3 ) (when
m = n).
Ponder This 2.3.3.1 An implementation of the "algorithm" in Figure 2.3.3.1, using our
FLAME API for Matlab (FLAME@lab) [5] that allows the code to closely resemble the
algorithm as we present it, is given in mySVD.m (Assignments/Week02/matlab/mySVD.m).
This implementation depends on routines in subdirectory Assignments/flameatlab being in the
path. Examine this code. What do you notice? Execute it with
m = 5;
n = 4;
A = rand( m, n ); % create m x n random matrix
[ U, Sigma, V ] = mySVD( A )

Then check whether the resulting matrices form the SVD:

norm( A - U * Sigma * V' )

and whether U and V are unitary

norm( eye( n,n ) - V' * V )

norm( eye( m,m ) - U' * U )

2.3.4 The Reduced Singular Value Decomposition



YouTube: https://www.youtube.com/watch?v=HAAh4IsIdsY
Corollary 2.3.4.1 Reduced Singular Value Decomposition. Let $A \in \mathbb{C}^{m \times n}$ and
$r = \mbox{rank}(A)$. There exist orthonormal matrix $U_L \in \mathbb{C}^{m \times r}$, orthonormal matrix $V_L \in \mathbb{C}^{n \times r}$,
and matrix $\Sigma_{TL} \in \mathbb{R}^{r \times r}$ with $\Sigma_{TL} = \mbox{diag}( \sigma_0, \ldots, \sigma_{r-1} )$ and $\sigma_0 \geq \sigma_1 \geq \cdots \geq \sigma_{r-1} > 0$ such
that $A = U_L \Sigma_{TL} V_L^H$.
Homework 2.3.4.1 Prove the above corollary.
Solution. Let
\[
A = U \Sigma V^H
  = \left( U_L \;\; U_R \right)
    \left( \begin{array}{cc} \Sigma_{TL} & 0 \\ 0 & 0 \end{array} \right)
    \left( V_L \;\; V_R \right)^H
\]
be the SVD of A, where $U_L \in \mathbb{C}^{m \times r}$, $V_L \in \mathbb{C}^{n \times r}$, and $\Sigma_{TL} \in \mathbb{R}^{r \times r}$ with
$\Sigma_{TL} = \mbox{diag}( \sigma_0, \sigma_1, \cdots, \sigma_{r-1} )$ and $\sigma_0 \geq \sigma_1 \geq \cdots \geq \sigma_{r-1} > 0$. Then
\[
\begin{array}{rcl}
A
& = & \mbox{< SVD of A >} \\
& & U \Sigma V^H \\
& = & \mbox{< partitioning >} \\
& & \left( U_L \;\; U_R \right)
    \left( \begin{array}{cc} \Sigma_{TL} & 0 \\ 0 & 0 \end{array} \right)
    \left( V_L \;\; V_R \right)^H \\
& = & \mbox{< partitioned matrix-matrix multiplication >} \\
& & U_L \Sigma_{TL} V_L^H .
\end{array}
\]
Corollary 2.3.4.2 Let $A = U_L \Sigma_{TL} V_L^H$ be the Reduced SVD with $U_L = \left( u_0 \; \cdots \; u_{r-1} \right)$,
$V_L = \left( v_0 \; \cdots \; v_{r-1} \right)$, and
$\Sigma_{TL} = \left( \begin{array}{ccc} \sigma_0 & & \\ & \ddots & \\ & & \sigma_{r-1} \end{array} \right)$. Then
\[
A = \sigma_0 u_0 v_0^H + \cdots + \sigma_{r-1} u_{r-1} v_{r-1}^H .
\]
Remark 2.3.4.3 This last result establishes that any matrix A with rank r can be written
as a linear combination of r outer products:
\[
A = \sigma_0 u_0 v_0^H + \sigma_1 u_1 v_1^H + \cdots + \sigma_{r-1} u_{r-1} v_{r-1}^H .
\]
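In MATLAB, svd( A, 'econ' ) computes a version of the decomposition in which U has only min(m,n) columns; keeping the first r = rank(A) columns then yields the Reduced SVD of this corollary. A sketch, with a matrix constructed to have low rank:

m = 6; n = 4; r = 2;
A = rand( m, r ) * rand( r, n );                 % a matrix of rank (at most) r
[ U, Sigma, V ] = svd( A, 'econ' );
UL = U( :, 1:r ); SigmaTL = Sigma( 1:r, 1:r ); VL = V( :, 1:r );
norm( A - UL * SigmaTL * VL' )                   % should be (close to) zero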

2.3.5 SVD of nonsingular matrices

YouTube: https://www.youtube.com/watch?v=5Gvmt 5T3k


Homework 2.3.5.1 Let $A \in \mathbb{C}^{m \times m}$ and $A = U \Sigma V^H$ be its SVD.
TRUE/FALSE: A is nonsingular if and only if $\Sigma$ is nonsingular.
Answer. TRUE
Solution. $\Sigma = U^H A V$. The product of square matrices is nonsingular if and only if each
individual matrix is nonsingular. Since U and V are unitary, they are nonsingular.
Homework 2.3.5.2 Let $A \in \mathbb{C}^{m \times m}$ and $A = U \Sigma V^H$ be its SVD with
\[
\Sigma = \left( \begin{array}{cccc}
  \sigma_0 & 0 & \cdots & 0 \\
  0 & \sigma_1 & \cdots & 0 \\
  \vdots & \vdots & \ddots & \vdots \\
  0 & 0 & \cdots & \sigma_{m-1}
\end{array} \right).
\]
TRUE/FALSE: A is nonsingular if and only if $\sigma_{m-1} \neq 0$.
Answer. TRUE
Solution. By the last homework, A is nonsingular if and only if $\Sigma$ is nonsingular. A
diagonal matrix is nonsingular if and only if its diagonal elements are all nonzero. Since
$\sigma_0 \geq \cdots \geq \sigma_{m-1} \geq 0$, the diagonal elements of $\Sigma$ are nonzero if and only if $\sigma_{m-1} \neq 0$.
Homework 2.3.5.3 Let $A \in \mathbb{C}^{m \times m}$ be nonsingular and $A = U \Sigma V^H$ be its SVD.
ALWAYS/SOMETIMES/NEVER: The SVD of $A^{-1}$ equals $V \Sigma^{-1} U^H$.
Answer. SOMETIMES
Explain it!
Solution. It would seem that the answer is ALWAYS:
$A^{-1} = ( U \Sigma V^H )^{-1} = ( V^H )^{-1} \Sigma^{-1} U^{-1} = V \Sigma^{-1} U^H$ with
\[
\Sigma^{-1}
= \left( \begin{array}{cccc}
  \sigma_0 & 0 & \cdots & 0 \\
  0 & \sigma_1 & \cdots & 0 \\
  \vdots & \vdots & \ddots & \vdots \\
  0 & 0 & \cdots & \sigma_{m-1}
\end{array} \right)^{-1}
= \left( \begin{array}{cccc}
  1/\sigma_0 & 0 & \cdots & 0 \\
  0 & 1/\sigma_1 & \cdots & 0 \\
  \vdots & \vdots & \ddots & \vdots \\
  0 & 0 & \cdots & 1/\sigma_{m-1}
\end{array} \right).
\]
However, the SVD requires the diagonal elements to be positive and ordered from largest to
smallest.
So, only if $\sigma_0 = \sigma_1 = \cdots = \sigma_{m-1}$ is it the case that $V \Sigma^{-1} U^H$ is the SVD of $A^{-1}$. In other
words, when $\Sigma = \sigma_0 I$.
Homework 2.3.5.4 Let A œ Cm◊m be nonsingular and
A = U VH Q R
1
‡ ···
2c 0
0
d1 2H
= c .. ... ..
a . .
u0 · · · um≠1 d v0 · · · vm≠1
b
0 · · · ‡m≠1
be its SVD.
The SVD of A≠1 is given by (indicate all correct answers):
1. V ≠1
UH.
Q R
1 2c
1/‡0 · · · 0
. . .. d1 2H
2. c . ..
a . .
v0 · · · vm≠1 d u0 · · · um≠1
b
0 · · · 1/‡m≠1
Q R
1 2c
1/‡m≠1 · · · 0 1 2H
3. .. .. .. d
vm≠1 · · · v0 c
a . . . d
b um≠1 · · · u0 .
0 · · · 1/‡0
Q R
0 ··· 0 1
c
c 0 ··· 1 0 d
d
4. (V P )(PH ≠1
P )(U P )
H H H
where P = c
c .. .. .. d
d
a . . . b
1 ··· 0 0
Answer. 3. and 4.
Explain it!
Solution. This question is a bit tricky.

1. It is the case that A≠1 = V ≠1 U H . However, the diagonal elements of ≠1


are ordered
from smallest to largest, and hence this is not its SVD.

2. This is just Answer 1. but with the columns of U and V , and the elements of ,
exposed.

3. This answer corrects the problems with the previous two answers: it reorders colums
of U and V so that the diagonal elements of end up ordered from largest to smallest.

4. This answer is just a reformulation of the last answer.


Homework 2.3.5.5 Let A œ Cm◊m be nonsingular. TRUE/FALSE: ÎA≠1 Î2 = 1/ minÎxÎ2 =1 ÎAxÎ2 .
Answer. TRUE
Solution.
ÎA≠1 Î2
= < definition >
maxx”=0 ÎxÎ2
ÎA≠1 xÎ2

= < algebra >


maxx”=0 ÎxÎ1 2
ÎA≠1 xÎ2
= < algebra >
1
ÎxÎ2
minx”=0
ÎA≠1 xÎ2
= < substitute z = A≠1 x >
1
ÎAzÎ2
minAz”=0 ÎzÎ2
= < A is nonsingular >
1
ÎAzÎ2
minz”=0 ÎzÎ2
= < x = z/ÎzÎ2 >
1
minÎxÎ2 =1 ÎAxÎ2
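These observations are easily checked numerically. A MATLAB sketch, with a randomly generated (almost surely nonsingular) matrix:

m = 4;
A = rand( m, m ) + m * eye( m );
sigma = svd( A );                          % singular values, largest to smallest
norm( A, 2 ) - sigma( 1 )                  % ||A||_2 = sigma_0: (close to) zero
norm( inv( A ), 2 ) - 1 / sigma( m )       % ||A^{-1}||_2 = 1/sigma_{m-1}: (close to) zero
cond( A ) - sigma( 1 ) / sigma( m )        % kappa_2(A) = sigma_0/sigma_{m-1}: (close to) zero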

In Subsection 2.3.2, we discussed the case where A œ R2◊2 . Letting A = U V T and


partitioning A B
1 2 ‡
0 0 1 2T
A = u0 u1 v0 v1
0 ‡1
yielded the pictures
R2 : Domain of A: R2 : Range (codomain) of A:

This illustrates what the condition number $\kappa_2(A) = \sigma_0/\sigma_{n-1}$ captures: how elongated the
oval that equals the image of the unit ball is. The more elongated, the greater the ratio
$\sigma_0/\sigma_{n-1}$, and the worse the condition number of the matrix. In the limit, when $\sigma_{n-1} = 0$,
the unit ball is mapped to a lower dimensional set, meaning that the transformation cannot
be "undone."
Ponder This 2.3.5.6 For the 2D problem discussed in this unit, what would the image of
the unit ball look like as Ÿ2 (A) æ Œ? When is Ÿ2 (A) = Œ?

2.3.6 Best rank-k approximation

YouTube: https://www.youtube.com/watch?v=sN0DKG8vPhQ
We are now ready to answer the question "How do we find the best rank-k approximation
for a picture (or, more generally, a matrix)? " posed in Subsection 2.1.1.
Theorem 2.3.6.1 Given $A \in \mathbb{C}^{m \times n}$, let $A = U \Sigma V^H$ be its SVD. Assume the entries on the
main diagonal of $\Sigma$ are $\sigma_0, \cdots, \sigma_{\min(m,n)-1}$ with $\sigma_0 \geq \cdots \geq \sigma_{\min(m,n)-1} \geq 0$. Given k such
that $0 \leq k \leq \min(m, n)$, partition
\[
U = \left( U_L \;\; U_R \right), \quad
V = \left( V_L \;\; V_R \right), \quad \mbox{and} \quad
\Sigma = \left( \begin{array}{cc} \Sigma_{TL} & 0 \\ 0 & \Sigma_{BR} \end{array} \right),
\]
where $U_L \in \mathbb{C}^{m \times k}$, $V_L \in \mathbb{C}^{n \times k}$, and $\Sigma_{TL} \in \mathbb{R}^{k \times k}$. Then
\[
B = U_L \Sigma_{TL} V_L^H
\]
is the matrix in $\mathbb{C}^{m \times n}$ closest to A in the following sense:
\[
\| A - B \|_2 =
\min_{\scriptstyle C \in \mathbb{C}^{m \times n} \atop \scriptstyle \mbox{rank}(C) \leq k} \| A - C \|_2 .
\]
In other words, B is the matrix with rank at most k that is closest to A as measured by the
2-norm. Also, for this B,
\[
\| A - B \|_2 = \left\{ \begin{array}{ll}
  \sigma_k & \mbox{if } k < \min(m, n) \\
  0 & \mbox{otherwise.}
\end{array} \right.
\]
The proof of this theorem builds on the following insight:
Homework 2.3.6.1 Given A œ Cm◊n , let A = U V H be its SVD. Show that

Avj = ‡j uj for 0 Æ j < min(m, n),

where uj and vj equal the columns of U and V indexed by j, and ‡j equals the diagonal
element of indexed with j.
Solution. W.l.o.g. assume n Æ m. Rewrite A = U V H as AV = U . Then

AV1 = U = 2< partition >


A v0 · · · vn≠1
Q R
‡0 · · · 0
c . ... .. d
c .. . d
c d
1 2c
c 0 · · · ‡n≠1
d
d
= u0 · · · un≠1 un · · · um≠1 c
c 0 ···
d
c 0 d
d
c . .. d
c . d
a . . b
0 0
1 = < multiply2 out1 > 2
Av0 · · · Avn≠1 = ‡0 u0 · · · ‡n≠1 un≠1 .

Hence Avj = ‡j uj for 0 Æ j < n.



Proof of Theorem 2.3.6.1. First, if B is as defined, then ÎA ≠ BÎ2 = ‡k :

ÎA ≠ BÎ2
= < multiplication with unitary matrices preserves 2-norm >
ÎU H (A ≠ B)V Î2
= < distribute >
ÎU H AV ≠ U H BV Î2
. = < use SVD of A and partition
1 2H .
1
>
2.
.
. ≠ UL UR B VL VR ..
.
2
.A = < howBB A was chosenB.>
.
. TL 0 TL 0
.
.
. ≠ .
. 0 BR 0 0 .2
.A = partitioned subtraction >
< B.
. 0 0 .
. .
. .
. 0 BR .
2
= <>
Î BR Î2
= < TL is k ◊ k >
‡k

(Obviously, this needs to be tidied up for the case where k > rank(A).)
Next, assume that C has rank r Æ k and ÎA ≠ CÎ2 < ÎA ≠ BÎ2 . We will show that this
leads to a contradiction.

• The null space of C has dimension at least n ≠ k since dim(N (C)) = n ≠ r.

• If x œ N (C) then

ÎAxÎ2 = Î(A ≠ C)xÎ2 Æ ÎA ≠ CÎ2 ÎxÎ2 < ‡k ÎxÎ2 .


1 2 1 2
• Partition U = u0 · · · um≠1 and V = v0 · · · vn≠1 . Then ÎAvj Î2 = ·j uj Î2 =
‡j Ø ‡k for j = 0, . . . , k.

• Now, let y be any linear combination of v0 , . . . , vk : y = –0 v0 + · · · + –k vk . Notice that

ÎyÎ22 = Ζ0 v0 + · · · + –k vk Î22 = |–0 |2 + · · · |–k |2



since the vectors vj are orthonormal. Then

ÎAyÎ22
= < y = –0 v0 + · · · + –k vk >
ÎA(–0 v0 + · · · + –k vk )Î22
= < distributivity >
Ζ0 Av0 + · · · + –k Avk Î22
= < Avj = ‡j uj >
Ζ0 ‡0 u0 + · · · + –k ‡k uk Î22
= < this works because the uj are orthonormal >
Ζ0 ‡0 u0 Î22 + · · · + Ζk ‡k uk Î22
= < norms are homogeneous and Îuj Î2 = 1 >
|–0 | ‡0 + · · · + |–k |2 ‡k2
2 2

Ø < ‡0 Ø ‡1 Ø · · · Ø ‡k Ø 0 >
(|–0 |2 + · · · + |–k |2 )‡k2
= < ÎyÎ22 = |–0 |2 + · · · + |–k |2 >
2 2
‡k ÎyÎ2 .

so that ÎAyÎ2 Ø ‡k ÎyÎ2 . In other words, vectors in the subspace of all linear combina-
tions of {v0 , . . . , vk } satisfy ÎAxÎ2 Ø ‡k ÎxÎ2 . The dimension of this subspace is k + 1
(since {v0 , · · · , vk } form an orthonormal basis).

• Both these subspaces are subspaces of Cn . One has dimension k + 1 and the other
n ≠ k. This means that if you take a basis for one (which consists of n ≠ k linearly
independent vectors) and add it to a basis for the other (which has k + 1 linearly
independent vectors), you end up with n + 1 vectors. Since these cannot all be linearly
independent in Cn , there must be at least one nonzero vector z that satisfies both
ÎAzÎ2 < ‡k ÎzÎ2 and ÎAzÎ2 Ø ‡k ÎzÎ2 , which is a contradiction.


Theorem 2.3.6.1 tells us how to pick the best approximation to a given matrix of a given
desired rank. In Subsection 2.1.1 we discussed how a low rank matrix can be used
to compress data. The SVD thus gives the best such rank-k approximation. Let us revisit
this.
Let A œ Rm◊n be a matrix that, for example, stores a picture. In this case, the i, j entry
in A is, for example, a number that represents the grayscale value of pixel (i, j).
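Before turning to the picture, the following MATLAB sketch verifies the theorem numerically for a small random matrix: the 2-norm error of the rank-k truncation equals the next singular value (remember that MATLAB indexes from 1, while the theorem indexes from 0):

m = 8; n = 6; k = 3;
A = rand( m, n );
[ U, Sigma, V ] = svd( A );
B = U( :, 1:k ) * Sigma( 1:k, 1:k ) * V( :, 1:k )';   % best rank-k approximation
norm( A - B, 2 ) - Sigma( k+1, k+1 )                  % should be (close to) zero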
Homework 2.3.6.2 In Assignments/Week02/matlab execute
IMG = imread( 'Frida.jpg' );
A = double( IMG( :,:,1 ) );
imshow( uint8( A ) )
size( A )

to generate the picture of Mexican artist Frida Kahlo



Although the picture is black and white, it was read as if it is a color image, which means
a m ◊ n ◊ 3 array of pixel information is stored. Setting A = IMG( :,:,1 ) extracts a single
matrix of pixel information. (If you start with a color picture, you will want to approximate
IMG( :,:,1), IMG( :,:,2), and IMG( :,:,3) separately.)
Next, compute the SVD of matrix A
[ U, Sigma, V ] = svd( A );
and approximate the picture with a rank-k update, starting with k = 1:
k = 1
B = uint8( U( :, 1:k ) * Sigma( 1:k,1:k ) * V( :, 1:k )' );
imshow( B );

Repeat this with increasing k.


r = min( size( A ) );
for k=1:r
  imshow( uint8( U( :, 1:k ) * Sigma( 1:k,1:k ) * V( :, 1:k )' ) );
  input( strcat( num2str( k ), ' press return' ) );
end

To determine a reasonable value for k, it helps to graph the singular values:

figure
r = min( size( A ) );
plot( [ 1:r ], diag( Sigma ), 'x' );

Since the singular values span a broad range, we may want to plot them with a log-log plot

loglog( [ 1:r ], diag( Sigma ), 'x' );

For this particular matrix (picture), there is no dramatic drop in the singular values that
makes it obvious what k is a natural choice.
Solution.

Figure 2.3.6.2 Distribution of singular values for the picture.



[Figure 2.3.6.3 Multiple pictures as generated by the code, for k = 1, 2, 5, 10, 25, together with the original picture.]
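By Theorem 2.3.6.1, the relative 2-norm error of the best rank-k approximation equals σ_k/σ_0, so the plot of singular values tells us directly how good a rank-k approximation can be. The following is a minimal Matlab sketch (ours, not part of the assignment) that plots this relative error as a function of k, reusing the Sigma computed above:

s = diag( Sigma );                  % singular values, largest first
relerr = s( 2:end ) / s( 1 );       % || A - B_k ||_2 / || A ||_2 for k = 1, ..., r-1
loglog( 1:length( relerr ), relerr, 'x' );
xlabel( 'k' ); ylabel( 'relative 2-norm error' );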



2.4 Enrichments
2.4.1 Principal Component Analysis (PCA)
Principal Component Analysis (PCA) is a standard technique in data science related to the
SVD. You may enjoy the article

• [30] J. Novembre, T. Johnson, K. Bryc, Z. Kutalik, A.R. Boyko, A. Auton, A. Indap,
  K.S. King, S. Bergmann, M. Nelson, M. Stephens, C.D. Bustamante, Nature, 2008.
  https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2735096/.

In that article, PCA is cast as an eigenvalue problem rather than a singular value problem.
Later in the course, in Week 11, we will link these.

2.5 Wrap Up
2.5.1 Additional homework
Homework 2.5.1.1 U œ Cm◊m is unitary if and only if (U x)H (U y) = xH y for all x, y œ Cm .
Hint. Revisit the proof of Homework 2.2.4.6.
Homework 2.5.1.2 Let A, B œ Cm◊n . Furthermore, let U œ Cm◊m and V œ Cn◊n be
unitary.
TRUE/FALSE: U AV H = B iff U H BV = A.
Answer. TRUE
Now prove it!
Homework 2.5.1.3 Prove that nonsingular A œ Cn◊n has condition number Ÿ2 (A) = 1 if
and only if A = ‡Q where Q is unitary and ‡ œ R is positive.
Hint. Use the SVD of A.
Homework 2.5.1.4 Let U ∈ C^{m×m} and V ∈ C^{n×n} be unitary.
ALWAYS/SOMETIMES/NEVER: The matrix
\[
\begin{pmatrix} U & 0 \\ 0 & V \end{pmatrix}
\]
is unitary.
Answer. ALWAYS
Now prove it!
Homework 2.5.1.5 Matrix A œ Rm◊m is a stochastic matrix if and only if it is nonnegative
q
(all its entries are nonnegative) and the entries in its columns sum to one: 0Æi<m –i,j = 1.
Such matrices are at the core of Markov processes. Show that a matrix A is both unitary
matrix and a stochastic matrix if and only if it is a permutation matrix.
Homework 2.5.1.6 Show that if Î · · · Î is a norm and A is nonsingular, then Î · · · ÎA≠1
defined by ÎxÎA≠1 = ÎA≠1 xÎ is a norm.
Interpret this result in terms of the change of basis of a vector.

Homework 2.5.1.7 Let A ∈ C^{m×m} be nonsingular and A = U Σ V^H be its SVD with
\[
\Sigma =
\begin{pmatrix}
\sigma_0 & 0 & \cdots & 0 \\
0 & \sigma_1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma_{m-1}
\end{pmatrix}.
\]
The condition number of A is given by (mark all correct answers):

1. κ_2(A) = ‖A‖_2 ‖A^{-1}‖_2.

2. κ_2(A) = σ_0 / σ_{m-1}.

3. κ_2(A) = u_0^H A v_0 / u_{m-1}^H A v_{m-1}.

4. κ_2(A) = max_{‖x‖_2=1} ‖Ax‖_2 / min_{‖x‖_2=1} ‖Ax‖_2.

(Mark all correct answers.)


Homework 2.5.1.8 Theorem 2.2.4.4 stated: If A ∈ C^{m×m} preserves length (‖Ax‖_2 = ‖x‖_2
for all x ∈ C^m), then A is unitary. Give an alternative proof using the SVD.
Homework 2.5.1.9 In Homework 1.3.7.2 you were asked to prove that ‖A‖_2 ≤ ‖A‖_F given
A ∈ C^{m×n}. Give an alternative proof that leverages the SVD.
Homework 2.5.1.10 In Homework 1.3.7.3, we skipped how the 2-norm bounds the Frobe-
nius norm. We now have the tools to do so elegantly: Prove that, given A ∈ C^{m×n},

    ‖A‖_F ≤ √r ‖A‖_2,

where r is the rank of matrix A.

2.5.2 Summary
Given x, y ∈ C^m

• their dot product (inner product) is defined as
\[
x^H y = \bar{x}^T y = \bar{\chi}_0 \psi_0 + \bar{\chi}_1 \psi_1 + \cdots + \bar{\chi}_{m-1} \psi_{m-1} = \sum_{i=0}^{m-1} \bar{\chi}_i \psi_i.
\]

• These vectors are said to be orthogonal (perpendicular) iff x^H y = 0.

• The component of y in the direction of x is given by
\[
\frac{x^H y}{x^H x}\, x = \frac{x x^H}{x^H x}\, y.
\]

  The matrix that projects a vector onto the space spanned by x is given by
\[
\frac{x x^H}{x^H x}.
\]

• The component of y orthogonal to x is given by
\[
y - \frac{x^H y}{x^H x}\, x = \left( I - \frac{x x^H}{x^H x} \right) y.
\]

  Thus, the matrix that projects a vector onto the space orthogonal to x is given by
\[
I - \frac{x x^H}{x^H x}.
\]

Given u, v ∈ C^m with u of unit length

• The component of v in the direction of u is given by

    (u^H v) u = u u^H v.

• The matrix that projects a vector onto the space spanned by u is given by

    u u^H.

• The component of v orthogonal to u is given by

    v − (u^H v) u = ( I − u u^H ) v.

• The matrix that projects a vector onto the space that is orthogonal to u is given by

    I − u u^H.

A small numerical check of these formulas is sketched below.
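The following Matlab sketch (ours, using a real-valued example) projects a vector v onto span(u) and onto its orthogonal complement and verifies that the two components add up to v and are mutually orthogonal:

% Project a random v onto span(u) and onto its orthogonal complement.
m = 5;
u = randn( m, 1 );  u = u / norm( u );   % unit-length u
v = randn( m, 1 );
vu    = ( u' * v ) * u;                  % component of v in the direction of u
vperp = v - vu;                          % component of v orthogonal to u
disp( norm( v - ( vu + vperp ) ) );      % ~ 0: the two components add up to v
disp( abs( u' * vperp ) );               % ~ 0: vperp is orthogonal to u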

Let u_0, u_1, . . . , u_{n-1} ∈ C^m. These vectors are said to be mutually orthonormal if for all
0 ≤ i, j < n
\[
u_i^H u_j = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{otherwise.} \end{cases}
\]
Let Q ∈ C^{m×n} (with n ≤ m). Then Q is said to be

• an orthonormal matrix iff Q^H Q = I.

• a unitary matrix iff Q^H Q = I and m = n.

• an orthogonal matrix iff it is a unitary matrix and is real-valued.

Let Q ∈ C^{m×n} (with n ≤ m). Then Q = ( q_0 · · · q_{n-1} ) is orthonormal iff {q_0, . . . , q_{n-1}}
are mutually orthonormal.

Definition 2.5.2.1 Unitary matrix. Let U œ Cm◊m . Then U is said to be a unitary


matrix if and only if U H U = I (the identity). ⌃
If U, V œ C m◊m
are unitary, then
• U H U = I.

• U U H = I.

• U ≠1 = U H .

• U H is unitary.

• U V is unitary.
If U œ Cm◊m and V œ Cn◊n are unitary, x œ Cm , and A œ Cm◊n , then
• ÎU xÎ2 = ÎxÎ2 .

• ÎU H AÎ2 = ÎU AÎ2 = ÎAV Î2 = ÎAV H Î2 = ÎU H AV Î2 = ÎU AV H Î2 = ÎAÎ2 .

• ÎU H AÎF = ÎU AÎF = ÎAV ÎF = ÎAV H ÎF = ÎU H AV ÎF = ÎU AV H ÎF = ÎAÎF .

• ÎU Î2 = 1

• Ÿ2 (U ) = 1
Examples of unitary matrices:

• Rotation in 2D:
\[
\begin{pmatrix} c & -s \\ s & c \end{pmatrix}.
\]

• Reflection: I − 2uu^H where u ∈ C^m and ‖u‖_2 = 1.


Change of orthonormal basis: If x ∈ C^m and U = ( u_0 · · · u_{m-1} ) is unitary, then
\[
x = (u_0^H x) u_0 + \cdots + (u_{m-1}^H x) u_{m-1}
  = \begin{pmatrix} u_0 & \cdots & u_{m-1} \end{pmatrix}
    \underbrace{\begin{pmatrix} u_0^H x \\ \vdots \\ u_{m-1}^H x \end{pmatrix}}_{U^H x}
  = U U^H x.
\]

Let A ∈ C^{n×n} be nonsingular and x ∈ C^n a nonzero vector. Consider

    y = Ax and y + δy = A(x + δx).

Then
\[
\frac{\|\delta y\|}{\|y\|} \leq \underbrace{\|A\|\,\|A^{-1}\|}_{\kappa(A)} \frac{\|\delta x\|}{\|x\|},
\]
where ‖ · ‖ is an induced matrix norm.

Theorem 2.5.2.2 Singular Value Decomposition Theorem. Given A ∈ C^{m×n} there
exist unitary U ∈ C^{m×m}, unitary V ∈ C^{n×n}, and Σ ∈ R^{m×n} such that A = U Σ V^H. Here
\[
\Sigma = \begin{pmatrix} \Sigma_{TL} & 0 \\ 0 & 0 \end{pmatrix}
\quad \text{with} \quad
\Sigma_{TL} =
\begin{pmatrix}
\sigma_0 & 0 & \cdots & 0 \\
0 & \sigma_1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma_{r-1}
\end{pmatrix}
\quad \text{and} \quad \sigma_0 \geq \sigma_1 \geq \cdots \geq \sigma_{r-1} > 0.
\]
The values σ_0, . . . , σ_{r-1} are called the singular values of matrix A. The columns of U and V
are called the left and right singular vectors, respectively.
Let A ∈ C^{m×n} and A = U Σ V^H its SVD with
\[
U = \begin{pmatrix} U_L & U_R \end{pmatrix} = \begin{pmatrix} u_0 & \cdots & u_{m-1} \end{pmatrix},
\qquad
V = \begin{pmatrix} V_L & V_R \end{pmatrix} = \begin{pmatrix} v_0 & \cdots & v_{n-1} \end{pmatrix},
\]
and
\[
\Sigma = \begin{pmatrix} \Sigma_{TL} & 0 \\ 0 & 0 \end{pmatrix},
\quad \text{where} \quad
\Sigma_{TL} =
\begin{pmatrix}
\sigma_0 & 0 & \cdots & 0 \\
0 & \sigma_1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma_{r-1}
\end{pmatrix}
\quad \text{and} \quad \sigma_0 \geq \sigma_1 \geq \cdots \geq \sigma_{r-1} > 0.
\]
Here U_L ∈ C^{m×r}, V_L ∈ C^{n×r} and Σ_TL ∈ R^{r×r}. Then

• ‖A‖_2 = σ_0. (The 2-norm of a matrix equals the largest singular value.)

• rank(A) = r.

• C(A) = C(U_L).

• N(A) = C(V_R).

• R(A) = C(V_L).

• Left null-space of A = C(U_R).

• A^H = V Σ^T U^H.

• SVD: A^H = V Σ^T U^H.

• Reduced SVD: A = U_L Σ_TL V_L^H.

• A = σ_0 u_0 v_0^H + σ_1 u_1 v_1^H + · · · + σ_{r-1} u_{r-1} v_{r-1}^H.

• Reduced SVD: A^H = V_L Σ_TL U_L^H.

• If the m × m matrix A is nonsingular: A^{-1} = V Σ^{-1} U^H.

• If A ∈ C^{m×m} then A is nonsingular if and only if σ_{m-1} ≠ 0.

• If A ∈ C^{m×m} is nonsingular then κ_2(A) = σ_0 / σ_{m-1}.

• (Left) pseudo inverse: if A has linearly independent columns, then A^† = (A^H A)^{-1} A^H =
  V_L Σ_TL^{-1} U_L^H.

• v_0 is the direction of maximal magnification.

• v_{n-1} is the direction of minimal magnification.

• If n ≤ m, then A v_j = σ_j u_j, for 0 ≤ j < n.


Theorem 2.5.2.3 Given A ∈ C^{m×n}, let A = U Σ V^H be its SVD. Assume the entries on the
main diagonal of Σ are σ_0, · · · , σ_{min(m,n)-1} with σ_0 ≥ · · · ≥ σ_{min(m,n)-1} ≥ 0. Given k such
that 0 ≤ k ≤ min(m, n), partition
\[
U = \begin{pmatrix} U_L & U_R \end{pmatrix}, \quad
V = \begin{pmatrix} V_L & V_R \end{pmatrix}, \quad \text{and} \quad
\Sigma = \begin{pmatrix} \Sigma_{TL} & 0 \\ 0 & \Sigma_{BR} \end{pmatrix},
\]
where U_L ∈ C^{m×k}, V_L ∈ C^{n×k}, and Σ_TL ∈ R^{k×k}. Then

    B = U_L Σ_TL V_L^H

is the matrix in C^{m×n} closest to A in the following sense:
\[
\|A - B\|_2 = \min_{\substack{C \in \mathbb{C}^{m\times n} \\ \mathrm{rank}(C) \leq k}} \|A - C\|_2.
\]
In other words, B is the matrix with rank at most k that is closest to A as measured by the
2-norm. Also, for this B,
\[
\|A - B\|_2 = \begin{cases} \sigma_k & \text{if } k < \min(m,n) \\ 0 & \text{otherwise.} \end{cases}
\]
Week 3

The QR Decomposition

3.1 Opening
3.1.1 Choosing the right basis

YouTube: https://www.youtube.com/watch?v=5 Em5gZo27g


A classic problem in numerical analysis is the approximation of a function, f : R → R,
with a polynomial of degree n−1. (The n−1 seems cumbersome. Think of it as a polynomial
with n terms.)

    f(χ) ≈ γ_0 + γ_1 χ + · · · + γ_{n-1} χ^{n-1}.

Now, often we know f only "sampled" at points χ_0, . . . , χ_{m-1}:

    f(χ_0)     = φ_0
       ⋮           ⋮
    f(χ_{m-1}) = φ_{m-1}.

In other words, input to the process are the points

    (χ_0, φ_0), · · · , (χ_{m-1}, φ_{m-1})

and we want to determine the polynomial that approximately fits these points. This means
that

    γ_0 + γ_1 χ_0     + · · · + γ_{n-1} χ_0^{n-1}     ≈ φ_0
      ⋮        ⋮                      ⋮                   ⋮
    γ_0 + γ_1 χ_{m-1} + · · · + γ_{n-1} χ_{m-1}^{n-1} ≈ φ_{m-1}.


This can be reformulated as the approximate linear system
\[
\begin{pmatrix}
1 & \chi_0 & \cdots & \chi_0^{n-1} \\
1 & \chi_1 & \cdots & \chi_1^{n-1} \\
\vdots & \vdots &  & \vdots \\
1 & \chi_{m-1} & \cdots & \chi_{m-1}^{n-1}
\end{pmatrix}
\begin{pmatrix}
\gamma_0 \\ \gamma_1 \\ \vdots \\ \gamma_{n-1}
\end{pmatrix}
\approx
\begin{pmatrix}
\phi_0 \\ \phi_1 \\ \vdots \\ \phi_{m-1}
\end{pmatrix},
\]
which can be solved using the techniques for linear least-squares in Week 4. The matrix in
the above equation is known as a Vandermonde matrix.
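To make this concrete, here is a minimal Matlab sketch (ours; the function f and the sizes m and n are illustrative choices, not from the text) that builds the Vandermonde matrix for sampled data and solves the least-squares problem with Matlab's backslash operator:

% Fit a degree n-1 polynomial to samples (chi_i, phi_i) via the Vandermonde
% least-squares system.  f, m, n are illustrative choices (assumptions).
f = @( x ) sin( 2 * pi * x );          % example function
m = 50;  n = 5;
chi = linspace( 0, 1, m )';            % sample points chi_0, ..., chi_{m-1}
phi = f( chi );                        % sampled values
X = ones( m, n );
for j = 2:n
    X( :, j ) = chi .* X( :, j-1 );    % next column: chi.^(j-1)
end
gamma = X \ phi;                       % least-squares solution of X * gamma ~ phi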
Homework 3.1.1.1 Choose χ_0, χ_1, · · · , χ_{m-1} to be equally spaced in the interval [0, 1]: for
i = 0, . . . , m − 1, χ_i = ih, where h = 1/(m − 1). Write Matlab code to create the matrix
\[
X =
\begin{pmatrix}
1 & \chi_0 & \cdots & \chi_0^{n-1} \\
1 & \chi_1 & \cdots & \chi_1^{n-1} \\
\vdots & \vdots &  & \vdots \\
1 & \chi_{m-1} & \cdots & \chi_{m-1}^{n-1}
\end{pmatrix}
\]
as a function of n with m = 5000. Plot the condition number of X, κ_2(X), as a function of
n. (Matlab's function for computing κ_2(X) is cond( X ).)
Hint. You may want to use the recurrence x_{j+1} = x .* x_j and the fact that the .* operator in
Matlab performs an element-wise multiplication.
Solution.

• Here is our implementation: Assignments/Week03/answers/Vandermonde.m.

• The graph of the condition number, κ_2(X), as a function of n is given by



• The parent functions 1, x, x2 , . . . on the interval [0, 1] are visualized as

Notice that the curves for xj and xj+1 quickly start to look very similar, which explains
why the columns of the Vandermonde matrix quickly become approximately linearly
dependent.

Think about how this extends to even more columns of A.

YouTube: https://www.youtube.com/watch?v=cBFt2dmXbu4
An alternative set of polynomials that can be used are known as Legendre polynomials.
A shifted version (appropriate for the interval [0, 1]) can be inductively defined by

    P_0(χ) = 1
    P_1(χ) = 2χ − 1
       ⋮
    P_{n+1}(χ) = ( (2n + 1)(2χ − 1) P_n(χ) − n P_{n-1}(χ) ) / (n + 1).

The polynomials have the property that
\[
\int_0^1 P_s(\chi) P_t(\chi)\, d\chi = \begin{cases} C_s & \text{if } s = t, \text{ for some nonzero constant } C_s \\ 0 & \text{otherwise,} \end{cases}
\]
which is an orthogonality condition on the polynomials.
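As a concrete illustration, the recurrence translates directly into code. The following Matlab sketch (ours, with illustrative choices of n and the sample points) evaluates P_0, . . . , P_{n-1} at a vector of points, returning them as the columns of a matrix:

% Evaluate shifted Legendre polynomials P_0, ..., P_{n-1} at the points in x,
% returning them as the columns of X (based on the recurrence above).
n = 5;
x = linspace( 0, 1, 100 )';
X = zeros( length( x ), n );
X( :, 1 ) = 1;                          % P_0( x ) = 1
if n > 1
    X( :, 2 ) = 2 * x - 1;              % P_1( x ) = 2 x - 1
end
for k = 1:n-2                           % P_{k+1} from P_k and P_{k-1}
    X( :, k+2 ) = ( ( 2*k + 1 ) * ( 2*x - 1 ) .* X( :, k+1 ) ...
                    - k * X( :, k ) ) / ( k + 1 );
end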
The function f : R → R can now instead be approximated by

    f(χ) ≈ γ_0 P_0(χ) + γ_1 P_1(χ) + · · · + γ_{n-1} P_{n-1}(χ),

and hence given points

    (χ_0, φ_0), · · · , (χ_{m-1}, φ_{m-1})

we can determine the polynomial from

    γ_0 P_0(χ_0)     + γ_1 P_1(χ_0)     + · · · + γ_{n-1} P_{n-1}(χ_0)     = φ_0
       ⋮                   ⋮                           ⋮                       ⋮
    γ_0 P_0(χ_{m-1}) + γ_1 P_1(χ_{m-1}) + · · · + γ_{n-1} P_{n-1}(χ_{m-1}) = φ_{m-1}.

This can be reformulated as the approximate linear system
\[
\begin{pmatrix}
1 & P_1(\chi_0) & \cdots & P_{n-1}(\chi_0) \\
1 & P_1(\chi_1) & \cdots & P_{n-1}(\chi_1) \\
\vdots & \vdots &  & \vdots \\
1 & P_1(\chi_{m-1}) & \cdots & P_{n-1}(\chi_{m-1})
\end{pmatrix}
\begin{pmatrix}
\gamma_0 \\ \gamma_1 \\ \vdots \\ \gamma_{n-1}
\end{pmatrix}
\approx
\begin{pmatrix}
\phi_0 \\ \phi_1 \\ \vdots \\ \phi_{m-1}
\end{pmatrix},
\]
which can also be solved using the techniques for linear least-squares in Week 4. Notice that
now the columns of the matrix are (approximately) orthogonal: if we "sample"
x as χ_0, . . . , χ_{n-1}, then
\[
\int_0^1 P_s(\chi) P_t(\chi)\, d\chi \approx \sum_{i=0}^{n-1} P_s(\chi_i) P_t(\chi_i),
\]

which equals the dot product of the columns indexed with s and t.
Homework 3.1.1.2 Choose χ_0, χ_1, · · · , χ_{m-1} to be equally spaced in the interval [0, 1]: for
i = 0, . . . , m − 1, χ_i = ih, where h = 1/(m − 1). Write Matlab code to create the matrix
\[
X =
\begin{pmatrix}
1 & P_1(\chi_0) & \cdots & P_{n-1}(\chi_0) \\
1 & P_1(\chi_1) & \cdots & P_{n-1}(\chi_1) \\
\vdots & \vdots &  & \vdots \\
1 & P_1(\chi_{m-1}) & \cdots & P_{n-1}(\chi_{m-1})
\end{pmatrix}
\]
as a function of n with m = 5000. Plot κ_2(X) as a function of n. To check whether the
columns of X are mutually orthogonal, report ‖X^T X − D‖_2 where D equals the diagonal of
X^T X.
Solution.

• Here is our implementation: ShiftedLegendre.m. (Assignments/Week03/answers/ShiftedLegendre.m)

• The graph of the condition number, as a function of n, is given by

  We notice that the matrices created from shifted Legendre polynomials have very
  good condition numbers.

• The shifted Legendre polynomials are visualized as



• The columns of the matrix X are now reasonably orthogonal:

X^T * X for n=5:

ans =

5000 0 1 0 1
0 1667 0 1 0
1 0 1001 0 1
0 1 0 715 0
1 0 1 0 556

YouTube: https://www.youtube.com/watch?v=syq-jOKWqTQ
Remark 3.1.1.1 The point is that one ideally formulates a problem in a way that already
captures orthogonality, so that when the problem is discretized ("sampled"), the matrices
that arise will likely inherit that orthogonality, which we will see again and again is a good
thing. In this chapter, we discuss how orthogonality can be exposed if it is not already part
of the underlying formulation of the problem.

3.1.2 Overview Week 3


• 3.1 Opening Remarks

¶ 3.1.1 Choosing the right basis


¶ 3.1.2 Overview Week 3
¶ 3.1.3 What you will learn

• 3.2 Gram-Schmidt Orthogonalization

¶ 3.2.1 Classical Gram-Schmidt (CGS)


¶ 3.2.2 Gram-Schmidt and the QR factorization
¶ 3.2.3 Classical Gram-Schmidt algorithm
¶ 3.2.4 Modified Gram-Schmidt (MGS)
¶ 3.2.5 In practice, MGS is more accurate
¶ 3.2.6 Cost of Gram-Schmidt algorithms

• 3.3 Householder QR Factorization

¶ 3.3.1 Using unitary matrices


¶ 3.3.2 Householder transformation
¶ 3.3.3 Practical computation of the Householder vector
¶ 3.3.4 Householder QR factorization algorithm
¶ 3.3.5 Forming Q
¶ 3.3.6 Applying QH
¶ 3.3.7 Orthogonality of resulting Q

• 3.4 Enrichments

¶ 3.4.1 Blocked Householder QR factorization

• 3.5 Wrap Up

¶ 3.5.1 Additional homework


¶ 3.5.2 Summary

3.1.3 What you will learn


This chapter focuses on the QR factorization as a method for computing an orthonormal
basis for the column space of a matrix.
Upon completion of this week, you should be able to

• Relate Gram-Schmidt orthogonalization of vectors to the QR factorization of a matrix.

• Show that Classical Gram-Schmidt and Modified Gram-Schmidt yield the same result
(in exact arithmetic).

• Compare and contrast the Classical Gram-Schmidt and Modified Gram-Schmidt meth-
ods with regard to cost and robustness in the presence of roundoff error.

• Derive and explain the Householder transformations (reflections).

• Decompose a matrix to its QR factorization via the application of Householder trans-


formations.

• Analyze the cost of the Householder QR factorization algorithm.

• Explain why Householder QR factorization yields a matrix Q with high quality or-
thonormal columns, even in the presence of roundoff error.

3.2 Gram-Schmidt Orthogonalization


3.2.1 Classical Gram-Schmidt (CGS)

YouTube: https://www.youtube.com/watch?v=CWhBZB-3kg4
Given a set of linearly independent vectors {a0 , . . . , an≠1 } µ Cm , the Gram-Schmidt
process computes an orthonormal basis {q0 , . . . , qn≠1 } that spans the same subspace as the
original vectors, i.e.

Span({a0 , . . . , an≠1 }) = Span({q0 , . . . , qn≠1 }).

The process proceeds as follows:

• Compute vector q0 of unit length so that Span({a0 }) = Span({q0 }):



¶ fl0,0 = Îa0 Î2
Computes the length of vector a0 .
¶ q0 = a0 /fl0,0
Sets q0 to a unit vector in the direction of a0 .
Notice that a0 = q0 fl0,0
• Compute vector q1 of unit length so that Span({a0 , a1 }) = Span({q0 , q1 }):
¶ fl0,1 = q0H a1
Computes fl0,1 so that fl0,1 q0 = q0H a1 q0 equals the component of a1 in the direction
of q0 .
1 = a1 ≠ fl0,1 q0
¶ a‹
Computes the component of a1 that is orthogonal to q0 .
¶ fl1,1 = Îa‹
1 Î2
Computes the length of vector a‹
1.
¶ q 1 = a‹
1 /fl1,1
Sets q1 to a unit vector in the direction of a‹
1.

Notice that
\[
\begin{pmatrix} a_0 & a_1 \end{pmatrix}
=
\begin{pmatrix} q_0 & q_1 \end{pmatrix}
\begin{pmatrix} \rho_{0,0} & \rho_{0,1} \\ 0 & \rho_{1,1} \end{pmatrix}.
\]

• Compute vector q2 of unit length so that Span({a0 , a1 , a2 }) = Span({q0 , q1 , q2 }):


A B
fl = q0H a2 fl0,2 1 2H
¶ 0,2 or, equivalently, = q0 q1 a2
fl1,2 = q1 a2
H
fl1,2
Computes fl0,2 so that fl0,2 q0 = q0H a2 q0 and fl1,2 q1 = q1H a2 q1 equal the components
of a2 in the directions of q0 A
and q1 .B
1 2 fl
0,2
Or, equivalently, q0 q1 is the component in Span({q0 , q1 }).
fl1,2
A B
1fl0,2 2
¶ a‹
2= a2 ≠ fl0,2 q0 ≠ fl1,2 q1 = a2 ≠ q0 q1
fl1,2
Computes the component of a2 that is orthogonal to q0 and q1 .
¶ fl2,2 = Îa‹
2 Î2
Computes the length of vector a‹
2.
¶ q 2 = a‹
2 /fl2,2
Sets q2 to a unit vector in the direction of a‹
2.

Notice that
\[
\begin{pmatrix} a_0 & a_1 & a_2 \end{pmatrix}
=
\begin{pmatrix} q_0 & q_1 & q_2 \end{pmatrix}
\begin{pmatrix}
\rho_{0,0} & \rho_{0,1} & \rho_{0,2} \\
0 & \rho_{1,1} & \rho_{1,2} \\
0 & 0 & \rho_{2,2}
\end{pmatrix}.
\]

• And so forth.

YouTube: https://www.youtube.com/watch?v=AvXe0MfK _0
Yet another way of looking at this problem is as follows.

YouTube: https://www.youtube.com/watch?v=OZe M7YUwZo


Consider the matrices
1 2
A= a0 · · · ak≠1 ak ak+1 · · · an≠1

and 1 2
Q= q0 · · · qk≠1 qk qk+1 · · · qn≠1
We observe that
• Span({a0 }) = Span({q0 })
Hence a0 = fl0,0 q0 for some scalar fl0,0 .

• Span({a0 , a1 }) = Span({q0 , q1 })
Hence
a1 = fl0,1 q0 + fl1,1 q1
for some scalars fl0,1 , fl1,1 .

• In general, Span({a0 , . . . , ak≠1 , ak }) = Span({q0 , . . . , qk≠1 , qk })


Hence
ak = fl0,k q0 + · · · + flk≠1,k qk≠1 + flk,k qk
for some scalars fl0,k , · · · , flk,k .
Let’s assume that q0 , . . . , qk≠1 have already been computed and are mutually orthonormal.
Consider
ak = fl0,k q0 + · · · + flk≠1,k qk≠1 + flk,k qk .

Notice that, for i = 0, . . . , k − 1,
\[
q_i^H a_k = q_i^H ( \rho_{0,k} q_0 + \cdots + \rho_{k-1,k} q_{k-1} + \rho_{k,k} q_k )
= \rho_{0,k}\, q_i^H q_0 + \cdots + \rho_{i,k}\, \underbrace{q_i^H q_i}_{1} + \cdots + \rho_{k,k}\, q_i^H q_k,
\]
where all terms except the one involving q_i^H q_i = 1 vanish, so that

    ρ_{i,k} = q_i^H a_k,

for i = 0, . . . , k − 1. Next, we can compute

    a_k^⊥ = a_k − ρ_{0,k} q_0 − · · · − ρ_{k-1,k} q_{k-1}

and, since ρ_{k,k} q_k = a_k^⊥, we can choose

    ρ_{k,k} = ‖a_k^⊥‖_2

and

    q_k = a_k^⊥ / ρ_{k,k}.
Remark 3.2.1.1 For a review of Gram-Schmidt orthogonalization and exercises orthogonal-
izing real-valued vectors, you may want to look at Linear Algebra: Foundations to Frontiers
(LAFF) [26] Week 11.
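The process described above translates almost directly into code. Below is a minimal Matlab sketch of Classical Gram-Schmidt (ours, for illustration only; the FLAME-notation algorithm and the assignment implementation follow in the next units). It assumes the columns of A are linearly independent:

% Classical Gram-Schmidt sketch: A (m x n) with linearly independent columns.
% Returns Q with orthonormal columns and upper triangular R with A = Q * R.
function [ Q, R ] = CGS_sketch( A )
    [ m, n ] = size( A );
    Q = zeros( m, n );
    R = zeros( n, n );
    for k = 1:n
        R( 1:k-1, k ) = Q( :, 1:k-1 )' * A( :, k );            % rho_{i,k} = q_i^H a_k
        aperp         = A( :, k ) - Q( :, 1:k-1 ) * R( 1:k-1, k );
        R( k, k )     = norm( aperp );                         % rho_{k,k}
        Q( :, k )     = aperp / R( k, k );                     % q_k
    end
end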

3.2.2 Gram-Schmidt and the QR factorization

YouTube: https://www.youtube.com/watch?v=tHj20PSBCek
The discussion in the last unit motivates the following theorem:
Theorem 3.2.2.1 QR Decomposition Theorem. Let A œ Cm◊n have linearly indepen-
dent columns. Then there exists an orthonormal matrix Q and upper triangular matrix R
such that A = QR, its QR decomposition. If the diagonal elements of R are taken to be real
and positive, then the decomposition is unique.
In order to prove this theorem elegantly, we will first present the Gram-Schmidt orthog-
onalization algorithm using FLAME notation, in the next unit.
Ponder This 3.2.2.1 What happens in the Gram-Schmidt algorithm if the columns of A are
NOT linearly independent? How might one fix this? How can the Gram-Schmidt algorithm
be used to identify which columns of A are linearly independent?
Solution. If aj is the first column such that {a0 , . . . , aj } are linearly dependent, then a‹
j
will equal the zero vector and the process breaks down.

When a vector with a‹ j equal to the zero vector is encountered, the columns can be
rearranged (permuted) so that that column (or those columns) come last.
Again, if a‹
j = 0 for some j, then the columns are linearly dependent since then aj can
be written as a linear combination of the previous columns.

3.2.3 Classical Gram-Schmidt algorithm

YouTube: https://www.youtube.com/watch?v=YEEEJYp8snQ
Remark 3.2.3.1 If the FLAME notation used in this unit is not intuitively obvious, you may
want to review some of the materials in Weeks 3-5 of Linear Algebra: Foundations to Frontiers
(http://www.ulaff.net).
An alternative for motivating that algorithm is as follows:

• Consider A = QR.

• Partition A, Q, and R to yield
\[
\begin{pmatrix} A_0 & a_1 & A_2 \end{pmatrix}
=
\begin{pmatrix} Q_0 & q_1 & Q_2 \end{pmatrix}
\begin{pmatrix}
R_{00} & r_{01} & R_{02} \\
0 & \rho_{11} & r_{12}^T \\
0 & 0 & R_{22}
\end{pmatrix}.
\]

• Assume that Q_0 and R_00 have already been computed.

• Since corresponding columns of both sides must be equal, we find that

    a_1 = Q_0 r_01 + q_1 ρ_11.    (3.2.1)

  Also, Q_0^H Q_0 = I and Q_0^H q_1 = 0, since the columns of Q are mutually orthonormal.

• Hence

    Q_0^H a_1 = Q_0^H Q_0 r_01 + Q_0^H q_1 ρ_11 = r_01.

• This shows how r_01 can be computed from Q_0 and a_1, which are already known:

    r_01 := Q_0^H a_1.

• Next,

    a_1^⊥ := a_1 − Q_0 r_01

  is computed from (3.2.1). This is the component of a_1 that is perpendicular (orthogo-
  nal) to the columns of Q_0. We know it is nonzero since the columns of A are linearly
  independent.

• Since ρ_11 q_1 = a_1^⊥ and we know that q_1 has unit length, we now compute

    ρ_11 := ‖a_1^⊥‖_2

  and

    q_1 := a_1^⊥ / ρ_11.

These insights are summarized in the algorithm in Figure 3.2.3.2.


[Q, R] = CGS-QR(A)
Partition A → ( A_L  A_R ), Q → ( Q_L  Q_R ), R → ( R_TL  R_TR ; 0  R_BR )
  where A_L and Q_L have 0 columns and R_TL is 0 × 0
while n(A_L) < n(A)
  Repartition ( A_L  A_R ) → ( A_0  a_1  A_2 ), ( Q_L  Q_R ) → ( Q_0  q_1  Q_2 ),
    and R correspondingly, exposing R_00, r_01, R_02, ρ_11, r_12^T, R_22
  r_01 := Q_0^H a_1
  a_1^⊥ := a_1 − Q_0 r_01
  ρ_11 := ‖a_1^⊥‖_2
  q_1 := a_1^⊥ / ρ_11
  Continue with ( A_L  A_R ) ← ( A_0  a_1  A_2 ), ( Q_L  Q_R ) ← ( Q_0  q_1  Q_2 ),
    and R correspondingly
endwhile

Figure 3.2.3.2 (Classical) Gram-Schmidt (CGS) algorithm for computing the QR factor-
ization of a matrix A.
Having presented the algorithm in FLAME notation, we can provide a formal proof of
Theorem 3.2.2.1.
Proof of Theorem 3.2.2.1. Informal proof: The process described earlier in this unit con-
structs the QR decomposition. The computation of flj,j is unique if it is restricted to be a
real and positive number. This then prescribes all other results along the way.
Formal proof:
(By induction). Note that n Æ m since A has linearly independent columns.
1 2
• Base case: n = 1. In this case A = A0 a1 , where A0 has no columns. Since A has

linearly independent columns, a1 ”= 0. Then


1 2
A= a1 = (q1 ) (fl11 ) ,

where fl11 = Îa1 Î2 and q1 = a1 /fl11 , so that Q = (q1 ) and R = (fl11 ).

• Inductive step: Assume that the result is true for all A0 with k linearly independent
columns. We will show it is true for A with k + 1 linearly independent columns.
1 2
Let A œ Cm◊(k+1) . Partition A æ A 0 a1 .
By the induction hypothesis, there exist Q0 and R00 such that QH 0 Q0 = I, R00 is
upper triangular with nonzero diagonal entries and A0 = Q0 R00 . Also, by induction
hypothesis, if the elements on the diagonal of R00 are chosen to be positive, then the
factorization A0 = Q0 R00 is unique.
We are looking for A B
1 2 Â
R 00 r01
Â
Q
0 q1 and
0 fl11
so that A B
1 2 1 2 Â
R 00 r01
A0 a1 = Â
Q 0 q1 .
0 fl11
This means that

¶ A0 = QÂ RÂ
0 00 ,
We choose Q Â = Q and R
0 0
 = R . If we insist that the elements on the diagonal
00 00
be positive, this choice is unique. Otherwise, it is a choice that allows us to prove
existence.
¶ a1 = Q0 r01 + fl11 q1 which is the unique choice if we insist on positive elements on
the diagonal.
a1 = Q0 r01 + fl11 q1 . Multiplying both sides by QH 0 we find that r01 must equal
Q0 a1 (and is uniquely determined by this if we insist on positive elements on the
H

diagonal).
¶ Letting a‹1 = a1 ≠ Q0 r01 (which equals the component of a1 orthogonal to C(Q0 )),
we find that fl11 q1 = a‹
1 . Since q1 has unit length, we can choose fl11 = Îa1 Î2 . If

we insist on positive elements on the diagonal, then this choice is unique.


¶ Finally, we let q1 = a‹
1 /fl11 .

• By the Principle of Mathematical Induction the result holds for all matrices A œ Cm◊n
with m Ø n.


Homework 3.2.3.1 Implement the algorithm given in Figure 3.2.3.2 as
function [ Q, R ] = CGS_QR( A )

by completing the code in Assignments/Week03/matlab/CGS_QR.m. Input is an m × n matrix
A. Output is the matrix Q and the upper triangular matrix R. You may want to use
Assignments/Week03/matlab/test_CGS_QR.m to check your implementation.
Solution. See Assignments/Week03/answers/CGS_QR.m.

3.2.4 Modified Gram-Schmidt (MGS)

YouTube: https://www.youtube.com/watch?v=pOBJHhV3TKY
In the video, we reasoned that the following two algorithms compute the same values,
except that the columns of Q overwrite the corresponding columns of A:

(a) MGS algorithm that computes Q and R from A:

    for j = 0, . . . , n − 1
      a_j^⊥ := a_j
      for k = 0, . . . , j − 1
        ρ_{k,j} := q_k^H a_j^⊥
        a_j^⊥ := a_j^⊥ − ρ_{k,j} q_k
      end
      ρ_{j,j} := ‖a_j^⊥‖_2
      q_j := a_j^⊥ / ρ_{j,j}
    end

(b) MGS algorithm that computes Q and R from A, overwriting A with Q:

    for j = 0, . . . , n − 1
      for k = 0, . . . , j − 1
        ρ_{k,j} := a_k^H a_j
        a_j := a_j − ρ_{k,j} a_k
      end
      ρ_{j,j} := ‖a_j‖_2
      a_j := a_j / ρ_{j,j}
    end
Homework 3.2.4.1 Assume that q_0, . . . , q_{k-1} are mutually orthonormal. Let ρ_{j,k} = q_j^H y for
j = 0, . . . , i − 1. Show that
\[
\underbrace{q_i^H y}_{\rho_{i,k}} = q_i^H ( y - \rho_{0,k} q_0 - \cdots - \rho_{i-1,k} q_{i-1} )
\]
for i = 0, . . . , k − 1.

Solution.

    q_i^H ( y − ρ_{0,k} q_0 − · · · − ρ_{i-1,k} q_{i-1} )
       =   < distribute >
    q_i^H y − q_i^H ρ_{0,k} q_0 − · · · − q_i^H ρ_{i-1,k} q_{i-1}
       =   < ρ_{j,k} is a scalar >
    q_i^H y − ρ_{0,k} (q_i^H q_0) − · · · − ρ_{i-1,k} (q_i^H q_{i-1})
       =   < the q_j are mutually orthonormal, so q_i^H q_j = 0 for j ≠ i >
    q_i^H y.

YouTube: https://www.youtube.com/watch?v=0ooNPondq5M
This homework illustrates how, given a vector y ∈ C^m and a matrix Q ∈ C^{m×k}, the
component orthogonal to the column space of Q, given by (I − QQ^H)y, can be computed by
either of the two algorithms given in Figure 3.2.4.1. The one on the left, Proj⊥Q_CGS(Q, y),
projects y onto the space perpendicular to the column space of Q as did the Gram-Schmidt algorithm
with which we started. The one on the right successively subtracts out the component in the
direction of q_i using a vector that has been updated in previous iterations (and hence is
already orthogonal to q_0, . . . , q_{i-1}). The algorithm on the right is one variant of the Modified
Gram-Schmidt (MGS) algorithm.

[y^⊥, r] = Proj⊥Q_CGS(Q, y)          [y^⊥, r] = Proj⊥Q_MGS(Q, y)
(used by CGS)                        (used by MGS)
y^⊥ = y                              y^⊥ = y
for i = 0, . . . , k − 1             for i = 0, . . . , k − 1
  ρ_i := q_i^H y                       ρ_i := q_i^H y^⊥
  y^⊥ := y^⊥ − ρ_i q_i                 y^⊥ := y^⊥ − ρ_i q_i
endfor                               endfor

Figure 3.2.4.1 Two different ways of computing y^⊥ = (I − QQ^H)y = y − Qr, where r = Q^H y.
The computed y^⊥ is the component of y orthogonal to C(Q), where Q has k orthonormal
columns. (Notice the y on the left versus the y^⊥ on the right in the computation of ρ_i.)
These insights allow us to present CGS and this variant of MGS in FLAME notation, in
Figure 3.2.4.2 (left and middle).

[A, R] := GS(A) (overwrites


A
A with B
Q)
1 2 RT L RT R
A æ AL AR , R æ
0 RBR
AL has 0 columns and RT L is 0 ◊ 0
while n(AL ) < n(A) Q R
A B R00 r01 R02
1 2 1 2 RT L RT R c T d
AL AR æ A0 a1 A2 , æ a 0 fl11 r12 b
0 RBR
0 0 R22
CGS MGS MGS (alternative)
r01 := AH0 a1
a1 := a1 ≠ A0 r01
[a1 , r01 ] = Proj‹toQMGS (A0 , a1 )
fl11 := Îa1 Î2 fl11 := Îa1 Î2 fl11 := Îa1 Î2
a1 := a1 /fl11 a1 := a1 /fl11 a1 := a1 /fl11
T
r12 := aH
1 A2
A2 := A2 ≠ a1 r12
T
Q R
A B R00 r01 R02
1 2 1 2 RT L RT R
Ω a 0 fl11 r12
c T d
AL AR Ω A0 a1 A2 , b
0 RBR
0 0 R22
endwhile
Figure 3.2.4.2 Left: Classical Gram-Schmidt algorithm. Middle: Modified Gram-Schmidt
algorithm. Right: Alternative Modified Gram-Schmidt algorithm. In this last algorithm,
every time a new column, q1 , of Q is computed, each column of A2 is updated so that its
component in the direction of q1 is is subtracted out. This means that at the start and finish
of the current iteration, the columns of AL are mutually orthonormal and the columns of AR
are orthogonal to the columns of AL .
Next, we massage the MGS algorithm into the alternative MGS algorithmic variant given
in Figure 3.2.4.2 (right).

YouTube: https://www.youtube.com/watch?v=3XzHFWzV5iE
The video discusses how MGS can be rearranged so that every time a new vector qk
is computed (overwriting ak ), the remaining vectors, {ak+1 , . . . , an≠1 }, can be updated by
subtracting out the component in the direction of qk . This is also illustrated through the
next sequence of equivalent algorithms.

for j = 0, . . . , n ≠ 1 for j = 0, . . . , n ≠ 1
flj,j := Îaj Î2 flj,j := Îaj Î2
aj := aj /flj,j aj := aj /flj,j
for k = j + 1, . . . , n ≠ 1 for k = j + 1, . . . , n ≠ 1
flj,k := aH j ak flj,k := aH j ak
ak := ak ≠ flj,k aj end
end for k = j + 1, . . . , n ≠ 1
end ak := ak ≠ flj,k aj
end
end
(c) MGS algorithm that normalizes the (d) Slight modification of the algorithm in
jth column to have unit length to com- (c) that computes flj,k in a separate loop.
pute qj (overwriting aj with the result)
and then subtracts the component in the
direction of qj off the rest of the columns
(aj+1 , . . . , an≠1 ).
for j = 0, . . . , n ≠ 1 for j = 0, . . . , n ≠ 1
flj,j := Îaj Î2 flj,j := Îaj Î2
aj 1:= aj /flj,j 2 a1j := aj /flj,j 2
flj,j+1 · · · flj,n≠1 := flj,j+1 · · · flj,n≠1 :=
1 2 1 2
aHj aj+1 · · · an≠1 aHj aj+1 · · · an≠1
1 2 1 2
aj+1 · · · an≠1 := aj+1 · · · an≠1 :=
1 2 1 2
aj+1 ≠ flj,j+1 aj · · · an≠1 ≠ flj,n≠1 aj aj+1 · · · an≠1
1 2
end ≠ aj flj,j+1 · · · flj,n≠1
end
(e) Algorithm in (d) rewritten to expose (f) Algorithm in (e) rewritten to expose
only the outer loop. the row-vector-times
1 matrix 2 multi-
plication aH j a j+1 · · · an≠1 and
1 2
rank-1 update aj+1 · · · an≠1 ≠
1 2
aj flj,j+1 · · · flj,n≠1 .

Figure 3.2.4.3 Various equivalent MGS algorithms.


This discussion shows that the updating of future columns by subtracting out the compo-
nent in the direction of the latest column of Q to be computed can be cast in terms of a rank-1
update. This is also captured, using FLAME notation, in the algorithm in Figure 3.2.4.2, as
is further illustrated in Figure 3.2.4.4:

Figure 3.2.4.4 Alternative Modified Gram-Schmidt algorithm for computing the QR fac-
torization of a matrix A.
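For illustration, the following Matlab sketch (ours) implements the alternative MGS variant of Figure 3.2.4.4 with explicit indexing: A is overwritten with Q and, each time a new column q_k is computed, the remaining columns are updated with a rank-1 update:

% MGS sketch (alternative variant): overwrite A with Q; also return R.
function [ A, R ] = MGS_sketch( A )
    [ m, n ] = size( A );
    R = zeros( n, n );
    for k = 1:n
        R( k, k ) = norm( A( :, k ) );
        A( :, k ) = A( :, k ) / R( k, k );                    % new column q_k
        R( k, k+1:n ) = A( :, k )' * A( :, k+1:n );           % r_12^T
        A( :, k+1:n ) = A( :, k+1:n ) - A( :, k ) * R( k, k+1:n );  % rank-1 update
    end
end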

YouTube: https://www.youtube.com/watch?v=e wc14-1WF0


Ponder This 3.2.4.2 Let A have linearly independent columns and let A = QR be a QR
factorization of A. Partition
\[
A \rightarrow \begin{pmatrix} A_L & A_R \end{pmatrix}, \quad
Q \rightarrow \begin{pmatrix} Q_L & Q_R \end{pmatrix}, \quad \text{and} \quad
R \rightarrow \begin{pmatrix} R_{TL} & R_{TR} \\ 0 & R_{BR} \end{pmatrix},
\]
where A_L and Q_L have k columns and R_TL is k × k.

As you prove the following insights, relate each to the algorithm in Figure 3.2.4.4. In
particular, at the top of the loop of a typical iteration, how have the different parts of A and
R been updated?

1. AL = QL RT L .
(QL RT L equals the QR factorization of AL .)

2. C(AL ) = C(QL ).
(The first k columns of Q form an orthonormal basis for the space spanned by the first
k columns of A.)

3. RT R = QH
L AR .

4. (AR ≠ QL RT R )H QL = 0.
(Each column in AR ≠ QL RT R equals the component of the corresponding column of
AR that is orthogonal to Span(QL ).)

5. C(AR ≠ QL RT R ) = C(QR ).

6. AR ≠ QL RT R = QR RBR .
(The columns of QR form an orthonormal basis for the column space of AR ≠ QL RT R .)

Solution. Consider the fact that A = QR. Then, multiplying the partitioned matrices,
A B
1 2 1 2 RT L RT R
AL AR = QL QR
1
0 RBR 2
= QL RT L QL RT R + QR RBR .

Hence
AL = QL RT L and AR = QL RT R + QR RBR . (3.2.2)

1. The left equality in (3.2.2) answers 1.

2. C(AL ) = C(QL ) can be shown by noting that R is upper triangular and nonsingular and
hence RT L is upper triangular and nonsingular, and using this to show that C(AL ) µ
C(QL ) and C(QL ) µ C(AL ):

• C(AL ) µ C(QL ): Let y œ C(AL ). Then there exists x such that AL x = y. But
then QL RT L x = y and hence QL (RT L x) = y which means that y œ C(QL ).
• C(QL ) µ C(AL ): Let y œ C(QL ). Then there exists x such that QL x = y. But
then AL RT≠1L x = y and hence AL (RT≠1L x) = y which means that y œ C(AL ).

This answers 2.

3. Take AR ≠ QL RT R = QR RBR and multiply both side by QH


L:

L (AR ≠ QL RT R ) = QL QR RBR
QH H

is equivalent to
L AR ≠ QL QL RT R = QL QR RBR = 0.
QH H H
¸ ˚˙ ˝ ¸ ˚˙ ˝
I 0
Rearranging yields 3.

4. Since AR ≠ QL RT R = QR RBR we find that (AR ≠ QL RT R )H QL = (QR RBR )H QL and

(AR ≠ QL RT R )H QL = RBR
H
R QL = 0.
QH

5. Similar to the proof of 2.

6. Rearranging the right equality in (3.2.2) yields AR ≠QL RT R = QR RBR ., which answers
5.

7. Letting A‚ denote the original contents of A, at a typical point,

• AL has been updated with QL .


• RT L and RT R have been computed.
• AR = A‚R ≠ QL RT R .
Homework 3.2.4.3 Implement the algorithm in Figure 3.2.4.4 as
function [ Aout, Rout ] = MGS_QR( A, R )

Input is an m × n matrix A and an n × n matrix R. Output is the matrix Q, which has
overwritten matrix A, and the upper triangular matrix R. (The values below the diagonal
can be arbitrary.) You may want to use Assignments/Week03/matlab/test_MGS_QR.m to check
your implementation.
Solution. See Assignments/Week03/answers/MGS_QR.m.

3.2.5 In practice, MGS is more accurate

YouTube: https://www.youtube.com/watch?v=7ArZnHE0PIw
In theory, all Gram-Schmidt algorithms discussed in the previous sections are equivalent

in the sense that they compute the exact same QR factorizations when exact arithmetic is
employed. In practice, in the presence of round-off error, the orthonormal columns of Q
computed by MGS are often "more orthonormal" than those computed by CGS. We will
analyze how round-off error affects linear algebra computations in the second part of the
ALAFF. For now you will investigate it with a classic example.
When storing real (or complex) valued numbers in a computer, a limited accuracy can be
maintained, leading to round-off error when a number is stored and/or when computation
with numbers is performed. Informally, the machine epsilon (also called the unit roundoff
error) is defined as the largest positive number, ‘mach , such that the stored value of 1 + ‘mach
is rounded back to 1.
Now, let us consider a computer where the only error that is ever incurred is when

1 + ‘mach

is computed and rounded to 1.


Homework 3.2.5.1 Let ε = √ε_mach and consider the matrix
\[
A =
\begin{pmatrix}
1 & 1 & 1 \\
\varepsilon & 0 & 0 \\
0 & \varepsilon & 0 \\
0 & 0 & \varepsilon
\end{pmatrix}
=
\begin{pmatrix} a_0 & a_1 & a_2 \end{pmatrix}. \qquad (3.2.3)
\]
By hand, apply both the CGS and the MGS algorithms with this matrix, rounding 1 + ε_mach
to 1 whenever encountered in the calculation.
Upon completion, check whether the columns of Q that are computed are (approximately)
orthonormal.
Solution. The complete calculation is given by

[Figure: the complete CGS and MGS hand calculations.]

CGS yields the approximate matrix
\[
Q \approx
\begin{pmatrix}
1 & 0 & 0 \\
\varepsilon & -\frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\
0 & \frac{\sqrt{2}}{2} & 0 \\
0 & 0 & \frac{\sqrt{2}}{2}
\end{pmatrix}
\]
while MGS yields
\[
Q \approx
\begin{pmatrix}
1 & 0 & 0 \\
\varepsilon & -\frac{\sqrt{2}}{2} & -\frac{\sqrt{6}}{6} \\
0 & \frac{\sqrt{2}}{2} & -\frac{\sqrt{6}}{6} \\
0 & 0 & \frac{\sqrt{6}}{3}
\end{pmatrix}.
\]
Clearly, they don’t compute the same answer.
If we now ask the question "Are the columns of Q orthonormal?" we can check this by
computing QH Q, which should equal I, the identity.

• For CGS:
\[
Q^H Q \approx
\begin{pmatrix}
1 + \varepsilon_{\rm mach} & -\frac{\sqrt{2}}{2}\varepsilon & -\frac{\sqrt{2}}{2}\varepsilon \\
-\frac{\sqrt{2}}{2}\varepsilon & 1 & \frac{1}{2} \\
-\frac{\sqrt{2}}{2}\varepsilon & \frac{1}{2} & 1
\end{pmatrix}.
\]
Clearly, the computed second and third columns of Q are not mutually orthonormal.
  What is going on? The answer lies with how a_2^⊥ is computed in the last step, a_2^⊥ :=
  a_2 − (q_0^H a_2) q_0 − (q_1^H a_2) q_1. Now, q_0 has a relatively small error in it and hence (q_0^H a_2) q_0
  has a relatively small error in it. It is likely that a part of that error is in the direction
  of q_1. Relative to (q_0^H a_2) q_0, that error in the direction of q_1 is small, but relative to
  a_2 − (q_0^H a_2) q_0 it is not. The point is that a_2 − (q_0^H a_2) q_0 then has a relatively large error
  in it in the direction of q_1. Subtracting (q_1^H a_2) q_1 does not fix this and since in the end
  a_2^⊥ is small, it has a relatively large error in the direction of q_1. This error is amplified
  when q_2 is computed by normalizing a_2^⊥.

• For MGS:
\[
Q^H Q \approx
\begin{pmatrix}
1 + \varepsilon_{\rm mach} & -\frac{\sqrt{2}}{2}\varepsilon & -\frac{\sqrt{6}}{6}\varepsilon \\
-\frac{\sqrt{2}}{2}\varepsilon & 1 & 0 \\
-\frac{\sqrt{6}}{6}\varepsilon & 0 & 1
\end{pmatrix}.
\]

  Why is the orthogonality better? Consider the computation of a_2^⊥ := a_2^⊥ − (q_1^H a_2^⊥) q_1:

    a_2^⊥ := a_2^⊥ − (q_1^H a_2^⊥) q_1 = [ a_2 − (q_0^H a_2) q_0 ] − ( q_1^H [ a_2 − (q_0^H a_2) q_0 ] ) q_1.

  This time, if a_2 − (q_0^H a_2) q_0 has an error in the direction of q_1, this error is subtracted out
  when (q_1^H a_2^⊥) q_1 is subtracted from a_2^⊥. This explains the better orthogonality between
  the computed vectors q_1 and q_2.

YouTube: https://www.youtube.com/watch?v=OT4Yd-eVMSo
We have argued via an example that MGS is more accurate than CGS. A more thorough
analysis is needed to explain why this is generally so.
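You can reproduce the experiment above in Matlab. The following sketch (ours) assumes the routines CGS_QR and MGS_QR from the homeworks in this chapter are on the path and compares the loss of orthogonality, ‖Q^H Q − I‖, for the matrix in (3.2.3):

% Measure loss of orthogonality for CGS vs MGS on the matrix in (3.2.3).
e = sqrt( eps );                     % eps is Matlab's machine epsilon
A = [ 1 1 1
      e 0 0
      0 e 0
      0 0 e ];
[ Qcgs, Rcgs ] = CGS_QR( A );
[ Qmgs, Rmgs ] = MGS_QR( A, zeros( 3 ) );
disp( norm( Qcgs' * Qcgs - eye( 3 ) ) );   % large for CGS
disp( norm( Qmgs' * Qmgs - eye( 3 ) ) );   % much smaller for MGS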

3.2.6 Cost of Gram-Schmidt algorithms


(No video for this unit.)
Homework 3.2.6.1 Analyze the cost of the CGS algorithm in Figure 3.2.4.2 (left) assuming
that A œ Cm◊n .
Solution. During the kth iteration (0 Æ k < n), A0 has k columns and A2 has n ≠ k ≠ 1

columns. In each iteration


Operation Approximate cost
(in flops)
r01 := AH0 a1 2mk
a1 := a1 ≠ A0 r01 2mk
fl11 := Îa1 Î2 2m
a1 := a1 /fl11 m

Thus, the total cost is (approximately)


qn≠1
k=0 [2mk + 2mk + 2m + m]
=
qn≠1
k=0 [3m + 4mk]
=
q
3mn + 4m n≠1 k
qk=0
k=0 k = n(n ≠ 1)/2 ¥ n /2 >
2
¥ < n≠1
2
3mn + 4m n2
=
3mn + 2mn2
¥ < 3mn is of lower order >
2mn2
Homework 3.2.6.2 Analyze the cost of the MGS algorithm in Figure 3.2.4.2 (right) assum-
ing that A œ Cm◊n .
Solution. During the kth iteration (0 Æ k < n), A0 has k columns. and A2 has n ≠ k ≠ 1
columns. In each iteration
Operation Approximate cost
(in flops)
fl11 := Îa1 Î2 2m
a1 := a1 /fl11 m
T
r12 := aH
1 A2 2m(n ≠ k ≠ 1)
A2 := A2 ≠ a1 r12
T
2m(n ≠ k ≠ 1)

Thus, the total cost is (approximately)


qn≠1
k=0 [2m(n ≠ k ≠ 1) + 2m(n ≠ k ≠ 1) + 2m + m]
=
qn≠1
k=0 [3m + 4m(n ≠ k ≠ 1)]
=
q
3mn + 4m n≠1
k=0 (n ≠ k ≠ 1)
= < Substitute j = (n ≠ k ≠ 1) >
q
3mn + 4m n≠1 j
qj=0
j=0 j = n(n ≠ 1)/2 ¥ n /2 >
2
¥ < n≠1
n2
3mn + 4m 2
=
3mn + 2mn2
¥ < 3mn is of lower order >
2mn 2

Homework 3.2.6.3 Which algorithm requires more flops?


Solution. They require the approximately same number of flops.
A more careful analysis shows that, in exact arithmetic, they perform exactly the same
computations, but in a different order. Hence the number of flops is exactly the same.

3.3 Householder QR Factorization


3.3.1 Using unitary matrices

YouTube: https://www.youtube.com/watch?v=NAdMU_1ZANk
A fundamental problem to avoid in numerical codes is the situation where one starts
with large values and one ends up with small values with large relative errors in them. This
is known as catastrophic cancellation. The Gram-Schmidt algorithms can inherently fall
victim to this: column aj is successively reduced in length as components in the directions of
{q0 , . . . , qj≠1 } are subtracted, leaving a small vector if aj was almost in the span of the first
j columns of A. Application of a unitary transformation to a matrix or vector inherently
preserves length. Thus, it would be beneficial if the QR factorization can be implemented
as the successive application of unitary transformations. The Householder QR factorization
accomplishes this.
The first fundamental insight is that the product of unitary matrices is itself unitary. If,

given A ∈ C^{m×n} (with m ≥ n), one could find a sequence of unitary matrices, {H_0, . . . , H_{n-1}},
such that
\[
H_{n-1} \cdots H_0 A = \begin{pmatrix} R \\ 0 \end{pmatrix},
\]
where R ∈ C^{n×n} is upper triangular, then
\[
A = \underbrace{H_0^H \cdots H_{n-1}^H}_{Q} \begin{pmatrix} R \\ 0 \end{pmatrix},
\]
which is closely related to the QR factorization of A.


Homework 3.3.1.1 Show that if A ∈ C^{m×n} and
\[
A = Q \begin{pmatrix} R \\ 0 \end{pmatrix},
\]
where Q ∈ C^{m×m} is unitary and R is upper triangular, then there exists Q_L ∈ C^{m×n} such
that A = Q_L R is the QR factorization of A.
Solution.
\[
Q \begin{pmatrix} R \\ 0 \end{pmatrix}
= \begin{pmatrix} Q_L & Q_R \end{pmatrix} \begin{pmatrix} R \\ 0 \end{pmatrix}
= Q_L R.
\]
The second fundamental insight will be that the desired unitary transformations {H0 , . . . , Hn≠1 }
can be computed and applied cheaply, as we will discover in the remainder of this section.

3.3.2 Householder transformation

YouTube: https://www.youtube.com/watch?v=6TIVIw4B5VA
What we have discovered in this first video is how to construct a Householder trans-
formation, also referred to as a reflector, since it acts like a mirroring with respect to the
subspace orthogonal to the vector u, as illustrated in Figure 3.3.2.1.

PowerPoint source (Resources/Week03/


HouseholderTransformation.pptx).
Figure 3.3.2.1 Given vector x and unit length vector u, the subspace orthogonal to u
becomes a mirror for reflecting x represented by the transformation (I ≠ 2uuH ).

Definition 3.3.2.2 Let u ∈ C^n be a vector of unit length (‖u‖_2 = 1). Then H = I − 2uu^H
is said to be a Householder transformation or (Householder) reflector.
We observe:

• Any vector z that is perpendicular to u is left unchanged:

    (I − 2uu^H) z = z − 2u(u^H z) = z.

• Any vector x can be written as x = z + (u^H x) u where z is perpendicular to u and (u^H x) u
  is the component of x in the direction of u. Then
\[
(I - 2uu^H)x = (I - 2uu^H)(z + u^H x\, u)
= z + u^H x\, u - 2u \underbrace{u^H z}_{0} - 2uu^H u^H x\, u
= z + u^H x\, u - 2 u^H x\, u \underbrace{u^H u}_{1} = z - u^H x\, u.
\]

These observations can be interpreted as follows: The space perpendicular to u acts as


a "mirror": a vector that is an element in that space (along the mirror) is not reflected.
However, if a vector has a component that is orthogonal to the mirror, that component is
reversed in direction, as illustrated in Figure 3.3.2.1. Notice that a reflection preserves the
length of a vector.

Homework 3.3.2.1 Show that if H is a reflector, then

• HH = I (reflecting the reflection of a vector results in the original vector).

• H = H^H.

• H^H H = H H^H = I (a reflector is unitary).

Solution.

• HH = I:

    (I − 2uu^H)(I − 2uu^H)
       = I − 2uu^H − 2uu^H + 4u (u^H u) u^H      < u^H u = 1 >
       = I − 4uu^H + 4uu^H = I.

• H = H^H:

    (I − 2uu^H)^H = I − 2(u^H)^H u^H = I − 2uu^H.

• H^H H = I:

    H^H H = H H = I.

YouTube: https://www.youtube.com/watch?v=wmjUHak9yHU

PowerPoint source (Resources/Week03/


HouseholderTransformationAsUsed.pptx)
Figure 3.3.2.3 How to compute u given vectors x and y with ÎxÎ2 = ÎyÎ2 .
Next, let us ask the question of how to reflect a given x œ Cn into another vector y œ Cn
with ÎxÎ2 = ÎyÎ2 . In other words, how do we compute vector u so that

(I ≠ 2uuH )x = y.

From our discussion above, we need to find a vector u that is perpendicular to the space
with respect to which we will reflect. From Figure 3.3.2.3 we notice that the vector from y
to x, v = x ≠ y, is perpendicular to the desired space. Thus, u must equal a unit vector in
the direction v: u = v/ÎvÎ2 .
Remark 3.3.2.4 In subsequent discussion we will prefer to give Householder transformations
as I − uu^H/τ, where τ = u^H u/2, so that u no longer needs to be a unit vector, just a direction.
The reason for this will become obvious later.
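Note that applying a reflector never requires forming I − uu^H/τ explicitly. Below is a small Matlab sketch (ours, with an arbitrary example vector) that applies it to a vector and checks that reflecting twice returns the original vector:

% Apply the reflector ( I - u u'/tau ) to a vector x without forming it.
% Here u is any nonzero "direction" vector and tau = u' * u / 2.
u   = [ 3; 1; 2 ];
tau = ( u' * u ) / 2;
x   = [ 1; -1; 4 ];
y   = x - u * ( ( u' * x ) / tau );   % y = ( I - u u'/tau ) * x
% Reflecting twice gives back x:
disp( norm( x - ( y - u * ( ( u' * y ) / tau ) ) ) );   % ~ 0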
When employing Householder transformations as part of a QR factorization algorithm,
we need to introduce zeroes below the diagonal of our matrix. This requires a very special
case of Householder transformation.

YouTube: https://www.youtube.com/watch?v=iMrgPGCWZ_o

As we compute the QR factorization via Householder transformations, we will need to
find a Householder transformation H that maps a vector x to a multiple of the first unit
basis vector (e_0). We discuss first how to find H in the case where x ∈ R^n. We seek v so
that (I − (2/(v^T v)) v v^T) x = ±‖x‖_2 e_0. Since the resulting vector that we want is y = ±‖x‖_2 e_0, we
must choose v = x − y = x ∓ ‖x‖_2 e_0.
Example 3.3.2.5 Show that if x ∈ R^n, v = x ∓ ‖x‖_2 e_0, and τ = v^T v/2 then (I − (1/τ) v v^T) x =
±‖x‖_2 e_0.
Solution. This is surprisingly messy... It is easier to derive the formula than it is to check
it. So, we won't check it!
In practice, we choose v = x + sign(χ_1) ‖x‖_2 e_0 where χ_1 denotes the first element of x.
The reason is as follows: the first element of v, ν_1, will be ν_1 = χ_1 ∓ ‖x‖_2. If χ_1 is positive
and ‖x‖_2 is almost equal to χ_1, then χ_1 − ‖x‖_2 is a small number and if there is error in χ_1
and/or ‖x‖_2, this error becomes large relative to the result χ_1 − ‖x‖_2, due to catastrophic
cancellation. Regardless of whether χ_1 is positive or negative, we can avoid this by choosing
v = x + sign(χ_1) ‖x‖_2 e_0:
\[
v := x + \mathrm{sign}(\chi_1)\|x\|_2 e_0
= \begin{pmatrix} \chi_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} \mathrm{sign}(\chi_1)\|x\|_2 \\ 0 \end{pmatrix}
= \begin{pmatrix} \chi_1 + \mathrm{sign}(\chi_1)\|x\|_2 \\ x_2 \end{pmatrix}.
\]

Remark 3.3.2.6 This is a good place to clarify how we index in this course. Here we
label the first element of the vector x as ‰1 , despite the fact that we have advocated in
favor of indexing starting with zero. In our algorithms that leverage the FLAME notation
(partitioning/repartitioning), you may have noticed that a vector or scalar indexed with
1 refers to the "current column/row" or "current element". In preparation of using the
computation of the vectors v and u in the setting of such an algorithm, we use ‰1 here for
the first element from which these vectors will be computed, which tends to be an element
that is indexed with 1. So, there is reasoning behind the apparent insanity.
Ponder This 3.3.2.2 Consider x œ R2 as drawn below:

and let u be the vector


A such that (I ≠ uuH /· ) is a Householder transformation that maps
B
1
x to a vector fle0 = fl .
0

• Draw a vector fle0 to which x is "mirrored."

• Draw the line that "mirrors."

• Draw the vector v from which u is computed.

• Repeat for the "other" vector fle0 .

Computationally, which choice of mirror is better than the other? Why?

3.3.3 Practical computation of the Householder vector

YouTube: https://www.youtube.com/watch?v=UX_QBt90jf8

3.3.3.1 The real case


Next, we discuss a slight variant on the above discussion that is used in practice. To do so,
we view x as a vector that consists of its first element, ‰1 , and the rest of the vector, x2 :
More precisely, partition A B
‰1
x= ,
x2
where ‰1 equals the firstAelement
B of x and x2 is the rest of x. Then we will wish to find a
1
Householder vector u = so that
u2
Q A BA BT R A B A B
1 1 1 ‰1 ±ÎxÎ2
aI ≠ b = .
· u2 u2 x2 0
A B
±ÎxÎ2
Notice that y in the previous discussion equals the vector , so the direction of u
0
is given by A B
‰1 û ÎxÎ2
v= .
x2
We now wish to normalize this vector so its first entry equals "1":
A B A B
v 1 ‰1 û ÎxÎ2 1
u= = = .
‹1 ‰1 û ÎxÎ2 x2 x2 /‹1

where ‹1 = ‰1 û ÎxÎ2 equals the first element of v. (Note that if ‹1 = 0 then u2 can be set
to 0.)

3.3.3.2 The complex case (optional)


Let us work out the complex case, dealing explicitly with x as a vector that consists of its
first element, ‰1 , and the rest of the vector, x2 : More precisely, partition
A B
‰1
x= ,
x2
where ‰1 equals the firstAelement
B of x and x2 is the rest of x. Then we will wish to find a
1
Householder vector u = so that
u2
Q A BA BH R A B A B
1 1 1 ‰1 ± ÎxÎ2
aI ≠ b = .
· u2 u2 x2 0

Here ± denotes a complex scalar on the complex unit circle. By the same argument as
before A B
‰1 ≠ ± ÎxÎ2
v= .
x2

We now wish to normalize this vector so its first entry equals "1":
A B A B
v 1 ‰1 ≠ ± ÎxÎ2 1
u= = = .
‹1 ‰1 ≠ ± ÎxÎ2 x2 x2 /‹1

where ‹1 = ‰1 ≠ ± ÎxÎ2 . (If ‹1 = 0 then we set u2 to 0.)


As was the case for the real-valued case, the choice ± is important. We choose ± =
≠sign(‰1 ) = ≠ |‰‰11 | .

3.3.3.3 A routine for computing the Householder vector


The vector
\[
\begin{pmatrix} 1 \\ u_2 \end{pmatrix}
\]
is the Householder vector that reflects x into ±‖x‖_2 e_0. The notation
\[
\left[ \begin{pmatrix} \rho \\ u_2 \end{pmatrix}, \tau \right] := \mathrm{Housev}\!\left( \begin{pmatrix} \chi_1 \\ x_2 \end{pmatrix} \right)
\]
represents the computation of the above mentioned vector u_2, and scalars ρ and τ, from
vector x. We will use the notation H(x) for the transformation I − (1/τ) uu^H where u and τ are
computed by Housev(x).

Algorithm: [ (ρ; u_2), τ ] = Housev( (χ_1; x_2) )

    (simple formulation)                     (efficient computation)
                                             χ_2 := ‖x_2‖_2
                                             α := ‖ (χ_1; χ_2) ‖_2   (= ‖x‖_2)
    ρ = −sign(χ_1) ‖x‖_2                     ρ := −sign(χ_1) α
    ν_1 = χ_1 + sign(χ_1) ‖x‖_2              ν_1 := χ_1 − ρ
    u_2 = x_2 / ν_1                          u_2 := x_2 / ν_1
                                             χ_2 := χ_2 / |ν_1|   (= ‖u_2‖_2)
    τ = (1 + u_2^H u_2)/2                    τ := (1 + χ_2^2)/2

Figure 3.3.3.1 Computing the Householder transformation. Left: simple formulation.
Right: efficient computation. Note: I have not completely double-checked these formulas
for the complex case. They work for the real case.
Remark 3.3.3.2 The function

function [ rho, ...
           u2, tau ] = Housev( chi1, ...
                               x2 )

implements the function Housev. It can be found in Assignments/Week03/matlab/Housev.m
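For readers who want to see the steps of Figure 3.3.3.1 (left) outside of the provided routine, here is a standalone Matlab sketch (ours; the assignment's Housev.m remains the authoritative version, and this sketch assumes the real case with chi1 nonzero):

% Sketch of Housev, simple formulation (Figure 3.3.3.1, left), real case:
% given x = [ chi1; x2 ], compute rho, u2, tau so that
% ( I - [1; u2]*[1; u2]'/tau ) * x = [ rho; 0 ].  (Assumes chi1 ~= 0.)
function [ rho, u2, tau ] = Housev_sketch( chi1, x2 )
    normx = norm( [ chi1; x2 ] );
    rho   = -sign( chi1 ) * normx;
    nu1   = chi1 + sign( chi1 ) * normx;
    u2    = x2 / nu1;
    tau   = ( 1 + u2' * u2 ) / 2;
end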



Homework 3.3.3.1 Function Assignments/Week03/matlab/Housev.m implements the steps in
Figure 3.3.3.1 (left). Update this implementation with the equivalent steps in Figure 3.3.3.1
(right), which is closer to how it is implemented in practice.
Solution. Assignments/Week03/answers/Housev-alt.m

3.3.4 Householder QR factorization algorithm

YouTube: https://www.youtube.com/watch?v=5MeeuSoFBdY
Let A be an m ◊ n with m Ø n. We will now show how to compute A æ QR, the QR
factorization, as a sequence of Householder transformations applied to A, which eventually
zeroes out all elements of that matrix below the diagonal. The process is illustrated in
Figure 3.3.4.1.
[Figure 3.3.4.1 Illustration of Householder QR factorization. In each iteration, the steps are:
\[
\left[ \begin{pmatrix} \rho_{11} \\ u_{21} \end{pmatrix}, \tau_1 \right] := \mathrm{Housev}\!\left( \begin{pmatrix} \alpha_{11} \\ a_{21} \end{pmatrix} \right),
\qquad
\begin{pmatrix} \alpha_{11} & a_{12}^T \\ a_{21} & A_{22} \end{pmatrix}
:=
\begin{pmatrix} \rho_{11} & a_{12}^T - w_{12}^T \\ 0 & A_{22} - u_{21} w_{12}^T \end{pmatrix},
\]
followed by "move forward." The accompanying × patterns (omitted here) show how the entries below the diagonal are zeroed out one column at a time.]
In the first iteration, we partition
\[
A \rightarrow \begin{pmatrix} \alpha_{11} & a_{12}^T \\ a_{21} & A_{22} \end{pmatrix}.
\]
Let
\[
\left[ \begin{pmatrix} \rho_{11} \\ u_{21} \end{pmatrix}, \tau_1 \right] = \mathrm{Housev}\!\left( \begin{pmatrix} \alpha_{11} \\ a_{21} \end{pmatrix} \right)
\]
be the Householder transform computed from the first column of A. Then applying this
Householder transform to A yields
\[
\begin{pmatrix} \alpha_{11} & a_{12}^T \\ a_{21} & A_{22} \end{pmatrix}
:=
\left( I - \frac{1}{\tau_1}
\begin{pmatrix} 1 \\ u_{21} \end{pmatrix}
\begin{pmatrix} 1 \\ u_{21} \end{pmatrix}^{\!H} \right)
\begin{pmatrix} \alpha_{11} & a_{12}^T \\ a_{21} & A_{22} \end{pmatrix}
=
\begin{pmatrix} \rho_{11} & a_{12}^T - w_{12}^T \\ 0 & A_{22} - u_{21} w_{12}^T \end{pmatrix},
\]
where w_{12}^T = (a_{12}^T + u_{21}^H A_{22})/τ_1. Computation of a full QR factorization of A will now proceed
with the updated matrix A_{22}.

YouTube: https://www.youtube.com/watch?v=WWe8yVccZy0
Homework 3.3.4.1 Show that
Q R Q Q RQ RH R
I 0 0 0
c A BA BH d c
d = cI ≠
1 c dc d d
c
1 1 1 a 1 ba 1 b d b.
a
0 I≠ b a ·1
·1 u21 u21 u21 u21

Solution.
Q R Q R
I 0 0 0
c A BA BH d c A BA BH d
c
1 1 1 d = I ≠c 1 1 1 d
a
0 I≠ ·1
b a
0 ·1
b
u21 u21 Q
u21 u21 R
0 0
1
c A BA BH d
= I≠ c 1 1 d
·1 a
0 b
u21 u21
Q R
0 0 0
= I ≠ ·11 c
a 0 1 uH2
d
b
Q 0 u2 u2 u2H
Q RQ RH R
c
0 0
1 c dc d d
= c
aI ≠ ·1 a 1 b a 1 b b .
d
u21 u21
More generally, let us assume that after k iterations of the algorithm matrix A contains
Q R
A B R00 r01 R02
RT L RT R
Aæ =c
a 0
d
–11 aT12 b ,
0 ABR
0 a21 A22

where RT L and R00 are k ◊ k upper triangular matrices. Let


CA B D AA BB
fl11 –11
, ·1 = Housev .
u21 a21

and update
Q RQ
I Q 0 R00 r01 R02
R
c A BA BH R d c d
A := c
a 0 aI ≠ 1 1 1 d
b ba 0 –11 aT12 b
·1 u21 u21 0 a21 A22
Q Q RQ RH R Q R
c
0 0 dc
R00 r01 R02
1 c dc d d
= cI
a ≠ ·1 a 1 ba 1 b da
b 0 –11 aT12 b
u21 u21 0 a21 A22
Q R
R00 r01 R02
c d
= a 0 fl11 aT12 ≠ w12
T
b,
0 0 A22 ≠ u21 w12 T

where, again, w12


T
= (aT12 + uH
21 A22 )/·1 .
Let Q Q RQ RH R
0 0
c 1 c k dc k d d
Hk = aI ≠ a 1 b a 1 b d
c
b
·1
u21 u21
be the Householder transform so computed during the (k + 1)st iteration. Then upon com-
pletion matrix A contains
A B
RT L
R= = Hn≠1 · · · H1 H0 Â
0

where  denotes the original contents of A and RT L is an upper triangular matrix. Rear-
ranging this we find that
 = H0 H1 · · · Hn≠1 R
which shows that if Q = H0 H1 · · · Hn≠1 then  = QR.
Typically, the algorithm overwrites the original matrix A with the upper triangular ma-
trix, and at each step u21 is stored over the elements that become zero, thus overwriting a21 .
(It is for this reason that the first element of u was normalized to equal "1".) In this case Q
is usually not explicitly formed as it can be stored as the separate Householder vectors below
the diagonal of the overwritten matrix. The algorithm that overwrites A in this manner is
given in Figure 3.3.4.2.

[A, t] A
= HQR_unb_var1(A) B A B
AT L AT R tT
Aæ and t æ
ABL ABR tB
AT L is 0 ◊ 0 and tT has 0 elements
while n(ABR ) > 0 Q R Q R
A B A00 a01 A02 A B t0
AT L AT R c d tT c d
æ a aT10 –11 aT12 b and æ a ·1 b
ABL ABR tB
CA B D
A a
CA 20 B 21 D 22
A A B
t2
–11 fl11 –11
, · := , ·1 = Housev
a21 A 1 B u
A 21 A B
a21 B A B
aT12 1 1 1 2 aT12
Update := I ≠ ·1 1 u21H
A22 u21 A22
via the steps
A12 := (a B 12 +Aa21 A22 )/·1
T T H
w B
aT12 aT12 ≠ w12
T
:= T
A22 A22 ≠ a21 w12
Q R Q R
A B A00 a01 A02 A B t0
AT L AT R c T T d tT c d
Ω a a10 –11 a12 b and Ω a ·1 b
ABL ABR tB
A20 a21 A22 t2
endwhile
Figure 3.3.4.2 Unblocked Householder transformation based QR factorization.
In that figure,
[A, t] = HQR_unb_var1(A)
denotes the operation that computes the QR factorization of m ◊ n matrix A, with m Ø n,
via Householder transformations. It returns the Householder vectors and matrix R in the
first argument and the vector of scalars "·i " that are computed as part of the Householder
transformations in t.
Homework 3.3.4.2 Given A œ Rm◊n show that the cost of the algorithm in Figure 3.3.4.2
is given by
2
CHQR (m, n) ¥ 2mn2 ≠ n3 flops.
3
Solution. The bulk of the computation is in
T
w12 = (aT12 + uH
21 A22 )/·1

and
T
A22 ≠ u21 w12 .
During the kth iteration (when RT L is k ◊ k), this means a matrix-vector multiplication
(uH
21 A22 ) and rank-1 update with matrix A22 which is of size approximately (m ≠ k) ◊ (n ≠ k)

for a cost of 4(m ≠ k)(n ≠ k) flops. Thus the total cost is approximately
qn≠1
k=0 4(m ≠ k)(n ≠ k)
=
qn≠1
4 j=0 (m ≠ n + j)j
=
q qn≠1 2
4(m ≠ n) n≠1j=0 j + 4 j=0 j
=
q
2(m ≠ n)n(n ≠ 1) + 4 n≠1j=0 j
2

¥ s
2(m ≠ n)n2 + 4 0n x2 dx
=
2mn2 ≠ 2n3 + 43 n3
=
2mn2 ≠ 23 n3 .
Homework 3.3.4.3 Implement the algorithm given in Figure 3.3.4.2 as
function [ A_out, t ] = HQR( A )

by completing the code in Assignments/Week03/matlab/HQR.m. Input is an m × n matrix A.
Output is the matrix A_out with the Householder vectors below its diagonal and R in its
upper triangular part. You may want to use Assignments/Week03/matlab/test_HQR.m to check
your implementation.
Solution. See Assignments/Week03/answers/HQR.m. Warning: it only checks if R is computed
correctly.
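For reference, the same factorization can also be written with plain loops and explicit indexing instead of FLAME notation. The following Matlab sketch (ours, assuming m > n and using the routine Housev from earlier in this chapter) overwrites A in the same way:

% Householder QR sketch: overwrite A with R (upper triangle) and the
% Householder vectors (below the diagonal); t holds the scalars tau.
% Uses Housev from Assignments/Week03/matlab/Housev.m.  (Assumes m > n.)
function [ A, t ] = HQR_sketch( A )
    [ m, n ] = size( A );
    t = zeros( n, 1 );
    for k = 1:n
        [ rho, u2, tau ] = Housev( A( k, k ), A( k+1:m, k ) );
        A( k, k )     = rho;
        A( k+1:m, k ) = u2;
        t( k )        = tau;
        % Update the trailing columns: w' = ( a12' + u2' * A22 ) / tau
        w = ( A( k, k+1:n ) + u2' * A( k+1:m, k+1:n ) ) / tau;
        A( k, k+1:n )     = A( k, k+1:n )     - w;
        A( k+1:m, k+1:n ) = A( k+1:m, k+1:n ) - u2 * w;
    end
end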

3.3.5 Forming Q

YouTube: https://www.youtube.com/watch?v=cFWMsVNBzDY
Given A œ Cm◊n , let [A, t] = HQR_unb_var1(A) yield the matrix A with the House-
holder vectors stored below the diagonal, R stored on and above the diagonal, and the
scalars ·i , 0 Æ i < n, stored in vector t. We now discuss how to form the first n columns of
Q = H0 H1 · · · Hn≠1 . The computation is illustrated in Figure 3.3.5.1.
[Figure 3.3.5.1 Illustration of the computation of Q. In each (backward) iteration, the update
\[
\begin{pmatrix} \alpha_{11} & a_{12}^T \\ a_{21} & A_{22} \end{pmatrix}
:=
\begin{pmatrix} 1 - 1/\tau_1 & -(u_{21}^H A_{22})/\tau_1 \\ -u_{21}/\tau_1 & A_{22} + u_{21} a_{12}^T \end{pmatrix}
\]
is applied and we "move forward." The 0/1/× patterns (omitted here) show how Q is built up from the identity, starting from the bottom-right corner.]
Notice that to pick out the first n columns we must form

Q ( I_{n×n} ; 0 ) = H_0 · · · H_{n-1} ( I_{n×n} ; 0 ) = H_0 · · · H_{k-1} B_k, where B_k = H_k · · · H_{n-1} ( I_{n×n} ; 0 ),

so that Q ( I_{n×n} ; 0 ) = B_0.
Homework 3.3.5.1 ALWAYS/SOMETIMES/NEVER:

B_k = H_k · · · H_{n-1} ( I_{n×n} ; 0 ) = ( I_{k×k}  0 ; 0  B̃_k )

for some (m − k) × (n − k) matrix B̃_k.
Answer. ALWAYS
Solution. The proof of this is by induction on k:
• Base case: k = n. Then B_n = ( I_{n×n} ; 0 ), which has the desired form.
• Inductive step: Assume the result is true for B_k. We show it is true for B_{k-1}:

B_{k-1}
= H_{k-1} H_k · · · H_{n-1} ( I_{n×n} ; 0 )
= H_{k-1} B_k
= H_{k-1} ( I_{k×k}  0 ; 0  B̃_k )
= ( I_{(k-1)×(k-1)}  0 ; 0  I − (1/τ_k)( 1 ; u_k )( 1 ; u_k )^H ) ( I_{(k-1)×(k-1)}  0  0 ; 0  1  0 ; 0  0  B̃_k )
= < choose y_k^T = u_k^H B̃_k / τ_k >
( I_{(k-1)×(k-1)}  0  0 ; 0  1 − 1/τ_k  −y_k^T ; 0  −u_k/τ_k  B̃_k − u_k y_k^T )
= ( I_{(k-1)×(k-1)}  0 ; 0  B̃_{k-1} ).

• By the Principle of Mathematical Induction the result holds for B_0, . . . , B_n.



YouTube: https://www.youtube.com/watch?v=pNEp5X sZ4k


The last exercise justifies the algorithm in Figure 3.3.5.2.
Figure 3.3.5.2 Algorithm for overwriting A with Q from the Householder transformations stored as Householder vectors below the diagonal of A (as produced by [A, t] = HQR_unb_var1(A)): [A] = FormQ(A, t). The loop marches from the bottom-right corner of A back to the top-left; in each iteration it exposes α_11, a_21, a_12^T, A_22, and τ_1 and performs the updates

    α_11 := 1 − 1/τ_1
    a_12^T := −(a_21^H A_22)/τ_1
    A_22 := A_22 + a_21 a_12^T
    a_21 := −a_21/τ_1
which, given [A, t] = HQR_unb_var1(A) from Figure 3.3.4.2, overwrites A with the first
n = n(A) columns of Q.
Homework 3.3.5.2 Implement the algorithm in Figure 3.3.5.2 as

function [ A_out ] = FormQ( A, t )

by completing the code in Assignments/Week03/matlab/FormQ.m. Input is the m × n matrix A and the vector t that resulted from [ A, t ] = HQR( A ). Output is the matrix Q of the QR factorization. You may want to use Assignments/Week03/matlab/test_FormQ.m to check your implementation.
Solution. See Assignments/Week03/answers/FormQ.m
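A minimal MATLAB sketch of the updates in Figure 3.3.5.2, for the real-valued case, might look as follows. The routine name FormQ_sketch is made up and this is not the course's official solution.

    function [ A ] = FormQ_sketch( A, t )
    % Overwrites A with the first n columns of Q, given the output of the
    % Householder QR factorization (Householder vectors below the diagonal
    % of A, scalars tau in t).
      [ m, n ] = size( A );
      for k = n:-1:1                         % march from last reflector to first
        tau = t( k );
        u2  = A( k+1:m, k );                 % u21 (the leading 1 is not stored)
        A( k, k )     = 1 - 1/tau;                                  % alpha11
        A( k, k+1:n ) = -( u2' * A( k+1:m, k+1:n ) ) / tau;         % a12^T
        A( k+1:m, k+1:n ) = A( k+1:m, k+1:n ) + u2 * A( k, k+1:n ); % A22
        A( k+1:m, k ) = -u2 / tau;                                  % a21
      end
    end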
Homework 3.3.5.3 Given A ∈ C^{m×n}, show that the cost of the algorithm in Figure 3.3.5.2 is given by

C_FormQ(m, n) ≈ 2mn² − (2/3)n³ flops.
Hint. Modify the answer for Homework 3.3.4.2.
Solution. When computing the Householder QR factorization, the bulk of the cost is in the computations

w_12^T := (a_12^T + u_21^H A_22)/τ_1   and   A_22 − u_21 w_12^T.

When forming Q, the cost is in computing

a_12^T := −(u_21^H A_22)/τ_1   and   A_22 := A_22 + u_21 a_12^T.

During the iteration when A_TL is k × k, these represent essentially identical costs: the matrix-vector multiplication (u_21^H A_22) and a rank-1 update with matrix A_22, which is of size approximately (m − k) × (n − k), for a cost of 4(m − k)(n − k) flops. Thus the total cost is approximately

Σ_{k=n−1}^{0} 4(m − k)(n − k)
= < reverse the order of the summation >
Σ_{k=0}^{n−1} 4(m − k)(n − k)
= 4 Σ_{j=1}^{n} (m − n + j) j
= 4(m − n) Σ_{j=1}^{n} j + 4 Σ_{j=1}^{n} j²
= 2(m − n) n(n + 1) + 4 Σ_{j=1}^{n} j²
≈ 2(m − n) n² + 4 ∫_0^n x² dx
= 2mn² − 2n³ + (4/3)n³
= 2mn² − (2/3)n³.
Ponder This 3.3.5.4 If m = n then Q could be accumulated by the sequence

Q = (· · · ((I H_0) H_1) · · · H_{n-1}).

Give a high-level reason why this would be (much) more expensive than the algorithm in Figure 3.3.5.2.

3.3.6 Applying QH

YouTube: https://www.youtube.com/watch?v=BfK3DVgfxIM
In a future chapter, we will see that the QR factorization is used to solve the linear least-squares problem. To do so, we need to be able to compute ŷ = Q^H y where Q^H = H_{n-1} · · · H_0.
Let us start by computing H_0 y:

( I − (1/τ_1) ( 1 ; u_2 ) ( 1 ; u_2 )^H ) ( ψ_1 ; y_2 )
= ( ψ_1 ; y_2 ) − ω_1 ( 1 ; u_2 ),   where ω_1 = ( ( 1 ; u_2 )^H ( ψ_1 ; y_2 ) )/τ_1,
= ( ψ_1 − ω_1 ; y_2 − ω_1 u_2 ).

More generally, let us compute H_k y:

( I − (1/τ_1) ( 0 ; 1 ; u_2 ) ( 0 ; 1 ; u_2 )^H ) ( y_0 ; ψ_1 ; y_2 ) = ( y_0 ; ψ_1 − ω_1 ; y_2 − ω_1 u_2 ),

where ω_1 = (ψ_1 + u_2^H y_2)/τ_1. This motivates the algorithm in Figure 3.3.6.1 for computing y := H_{n-1} · · · H_0 y given the output matrix A and vector t from routine HQR_unb_var1.
Figure 3.3.6.1 Algorithm for computing y := Q^H y (= H_{n-1} · · · H_0 y) given the output from the algorithm HQR_unb_var1: [y] = Apply_QH(A, t, y). The loop marches through A, t, and y; in each iteration it exposes a_21, τ_1, ψ_1, and y_2 and performs the updates

    ω_1 := (ψ_1 + a_21^H y_2)/τ_1
    ψ_1 := ψ_1 − ω_1
    y_2 := y_2 − ω_1 a_21

Homework 3.3.6.1 What is the approximate cost of the algorithm in Figure 3.3.6.1 if Q (stored as Householder vectors in A) is m × n?
Solution. The cost of this algorithm can be analyzed as follows: when y_T is of length k, the bulk of the computation is in a dot product with vectors of length m − k − 1 (to compute ω_1) and an axpy operation with vectors of length m − k − 1 to subsequently update ψ_1 and y_2. Thus, the cost is approximately given by

Σ_{k=0}^{n−1} 4(m − k − 1) = 4mn − 4 Σ_{k=0}^{n−1} (k + 1) ≈ 4mn − 2n².

Notice that this is much cheaper than forming Q and then multiplying Q^H y.
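A minimal MATLAB sketch of this algorithm, for the real-valued case, is given below. The routine name ApplyQH_sketch is made up; inputs A and t are assumed to be the output of the Householder QR factorization.

    function [ y ] = ApplyQH_sketch( A, t, y )
    % Overwrites y with Q^T y = H_{n-1} ... H_0 y.
      [ m, n ] = size( A );
      for k = 1:n
        u2 = A( k+1:m, k );                           % u21
        omega = ( y( k ) + u2' * y( k+1:m ) ) / t( k );
        y( k )     = y( k )     - omega;
        y( k+1:m ) = y( k+1:m ) - omega * u2;
      end
    end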

3.3.7 Orthogonality of resulting Q



Homework 3.3.7.1 Previous programming assignments have the following routines for
computing the QR factorization of a given matrix A:
• Classical Gram-Schmidt (CGS) Homework 3.2.3.1:
[ A_out, R_out ] = CGS_QR( A ).
• Modified Gram-Schmidt (MGS) Homework 3.2.4.3:
[ A_out, R_out ] = MGS_QR( A ).
• Householder QR factorization (HQR) Homework 3.3.4.3:
[ A_out, t_out ] = HQR( A ).
• Form Q from Householder QR factorization Homework 3.3.5.2:
Q = FormQ( A, t ).
Use these to examine the orthogonality of the computed Q by writing the Matlab script Assignments/Week03/matlab/test_orthogonality.m for the matrix

( 1 1 1 ; ε 0 0 ; 0 ε 0 ; 0 0 ε ).

Solution. Try Assignments/Week03/answers/test_orthogonality.m.
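A minimal sketch of such an experiment, assuming the routines CGS_QR, MGS_QR, HQR, and FormQ from the homeworks are on the path (the value of epsilon below is just an illustrative choice):

    epsilon = 1e-8;
    A = [ 1       1       1
          epsilon 0       0
          0       epsilon 0
          0       0       epsilon ];

    [ Q_CGS, R_CGS ] = CGS_QR( A );
    [ Q_MGS, R_MGS ] = MGS_QR( A );
    [ A_HQR, t ]     = HQR( A );
    Q_HQR = FormQ( A_HQR, t );

    % A matrix Q with orthonormal columns satisfies Q'*Q = I, so the size
    % of Q'*Q - I measures the loss of orthogonality.
    disp( norm( Q_CGS' * Q_CGS - eye( 3 ) ) );
    disp( norm( Q_MGS' * Q_MGS - eye( 3 ) ) );
    disp( norm( Q_HQR' * Q_HQR - eye( 3 ) ) );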



Ponder This 3.3.7.2 In the last homework, we examined the orthogonality of the computed
matrix Q for a very specific kind of matrix. The problem with that matrix is that the columns
are nearly linearly dependent (the smaller ε is).
How can you quantify how close to being linearly dependent the columns of a matrix are?
How could you create a matrix of arbitrary size in such a way that you can control how
close to being linearly dependent the columns are?
Homework 3.3.7.3 (Optional). Program up your solution to Ponder This 3.3.7.2 and
use it to compare how mutually orthonormal the columns of the computed matrices Q are.

3.4 Enrichments
3.4.1 Blocked Householder QR factorization
3.4.1.1 Casting computation in terms of matrix-matrix multiplication
Modern processors have very fast floating point units (which perform the multiply/adds that are the bread and butter of our computations), but very slow
memory. Without getting into details, the reason is that modern memories are large and
hence are physically far from the processor, with limited bandwidth between the two. To
overcome this, smaller "cache" memories are closer to the CPU of the processor. In order to
achieve high performance (efficient use of the fast processor), the strategy is to bring data
into such a cache and perform a lot of computations with this data before writing a result
out to memory.
Operations like a dot product of vectors or an "axpy" (y := αx + y) perform O(m) compu-
tation with O(m) data and hence don’t present much opportunity for reuse of data. Similarly,
matrix-vector multiplication and rank-1 update operations perform O(m2 ) computation with
O(m2 ) data, again limiting the opportunity for reuse. In contrast, matrix-matrix multipli-
cation performs O(m3 ) computation with O(m2 ) data, and hence there is an opportunity to
reuse data.
The goal becomes to rearrange computation so that most computation is cast in terms of
matrix-matrix multiplication-like operations. Algorithms that achieve this are called blocked
algorithms.
It is probably best to return to this enrichment after you have encountered simpler
algorithms and their blocked variants later in the course, since Householder QR factorization
is one of the more difficult operations to cast in terms of matrix-matrix multiplication.

3.4.1.2 Accumulating Householder transformations


Given a sequence of Householder transformations, computed as part of Householder QR factorization, these Householder transformations can be accumulated into a new transformation: If H_0, · · · , H_{k-1} are Householder transformations, then

H_0 H_1 · · · H_{k-1} = I − U T^{-1} U^H,

where T is an upper triangular matrix. If U stores the Householder vectors that define H_0, . . . , H_{k-1} (with "1"s explicitly on its diagonal) and t holds the scalars τ_0, . . . , τ_{k-1}, then

T := FormT( U, t )

computes the desired matrix T. Now, applying this UT transformation to a matrix B yields

(I − U T^{-1} U^H) B = B − U (T^{-1} (U^H B)),

which demonstrates that this operation requires the matrix-matrix multiplication W := U^H B, the triangular matrix-matrix multiplication W := T^{-1} W, and the matrix-matrix multiplication B − U W, each of which can attain high performance.
In [23] we call the transformation I − U T^{-1} U^H that equals the accumulated Householder transformations the UT transform and prove that T can instead be computed as

T = triu(U^H U)

(the upper triangular part of U^H U) followed by either dividing the diagonal elements by two or setting them to τ_0, . . . , τ_{k-1} (in order). In that paper, we point out similar published results [8] [35] [45] [32].
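A minimal MATLAB sketch of forming T this way, for the real-valued case (the routine name FormT_sketch is made up; U is assumed to store the Householder vectors with explicit unit diagonal and t the corresponding scalars τ):

    function [ T ] = FormT_sketch( U, t )
    % T = triu( U' * U ) with the diagonal replaced by tau_0, ..., tau_{k-1}.
      T = triu( U' * U );
      T( 1:size(T,1)+1:end ) = t;    % overwrite diagonal with the taus
    end

Applying the accumulated transformation to a matrix B then amounts to B - U * ( T \ ( U' * B ) ) in the real case.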

3.4.1.3 A blocked algorithm


A QR factorization that exploits the insights that yielded the UT transform can now be described:

• Partition

A = ( A_11  A_12 ; A_21  A_22 ),

where A_11 is b × b.

• We can use the unblocked algorithm in Subsection 3.3.4 to factor the panel ( A_11 ; A_21 ):

[ ( A_11 ; A_21 ), t_1 ] := HouseQR_unb_var1( ( A_11 ; A_21 ) ),

overwriting the entries below the diagonal with the Householder vectors ( U_11 ; U_21 ) (with the ones on the diagonal implicitly stored) and the upper triangular part with R_11.

• Form T_11 from the Householder vectors using the procedure described earlier in this unit:

T_11 := FormT( ( A_11 ; A_21 ), t_1 ).

• Now we need to also apply the Householder transformations to the rest of the columns:

( A_12 ; A_22 ) := ( I − ( U_11 ; U_21 ) T_11^{-1} ( U_11 ; U_21 )^H )^H ( A_12 ; A_22 )
               = ( A_12 − U_11 W_12 ; A_22 − U_21 W_12 ),

where W_12 = T_11^{-H} ( U_11^H A_12 + U_21^H A_22 ).

This motivates the blocked algorithm in Figure 3.4.1.1.


Figure 3.4.1.1 Blocked Householder transformation based QR factorization, [A, t] := HouseQR_blk_var1(A, t). Each iteration chooses a block size b, factors the current panel ( A_11 ; A_21 ) with HQR_unb_var1, forms T_11 with FormT, computes W_12 := T_11^{-H} ( U_11^H A_12 + U_21^H A_22 ), and updates A_12 := A_12 − U_11 W_12 and A_22 := A_22 − U_21 W_12.
Details can be found in [23].
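A minimal MATLAB sketch of such a blocked factorization, for the real-valued case, is given below. It assumes the homework routine HQR and the FormT_sketch routine sketched above; the routine name HQR_blk_sketch is made up and this is not the implementation from [23].

    function [ A, t ] = HQR_blk_sketch( A, b )
      [ m, n ] = size( A );
      t = zeros( n, 1 );
      for j = 1:b:n
        jb   = min( b, n-j+1 );                  % width of the current panel
        rows = j:m;  cols = j:j+jb-1;  rest = j+jb:n;
        [ A( rows, cols ), t( cols ) ] = HQR( A( rows, cols ) );   % factor panel
        if ~isempty( rest )
          U = tril( A( rows, cols ), -1 ) + eye( m-j+1, jb );      % Householder vectors
          T = FormT_sketch( U, t( cols ) );
          W = T' \ ( U' * A( rows, rest ) );     % W12 := T11^{-H} * (U^T * A_rest)
          A( rows, rest ) = A( rows, rest ) - U * W;
        end
      end
    end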

3.4.1.4 The WY transform


An alternative (and more usual) way of expressing a Householder transform is

I − β v v^H,

where β = 2/(v^H v) (= 1/τ, where τ is as discussed before). This leads to an alternative accumulation of Householder transforms known as the compact WY transform [35]:

I − U S U^H,

where upper triangular matrix S relates to the matrix T in the UT transform via S = T^{-1}. Obviously, T can be computed first and then inverted. Alternatively, inversion of matrix T can be incorporated into the algorithm that computes T (which is what is done in the implementation in LAPACK [1]).

3.4.2 Systematic derivation of algorithms


We have described two algorithms for Gram-Schmidt orthogonalization: the Classical Gram-Schmidt (CGS) and the Modified Gram-Schmidt (MGS) algorithms. In this section we use this operation to introduce our FLAME methodology for systematically deriving algorithms hand-in-hand with their proof of correctness. Those who want to see the finer points of this methodology may want to consider taking our Massive Open Online Course titled "LAFF-On: Programming for Correctness," offered on edX.
The idea is as follows: We first specify the input (the precondition) and output (the postcondition) for the QR factorization algorithm.

• The precondition for the QR factorization is

A = Â.

A contains the original matrix, which we denote by Â since A will be overwritten as the algorithm proceeds.

• The postcondition for the QR factorization is

A = Q ∧ Â = QR ∧ Q^H Q = I.        (3.4.1)

This specifies that A is to be overwritten by an orthonormal matrix Q and that QR equals the original matrix Â. We will not explicitly specify that R is upper triangular, but keep that in mind as well.

Now, we know that we march through the matrices in a consistent way. At some point in the algorithm we will have divided them as follows:

A → ( A_L  A_R ), Q → ( Q_L  Q_R ), R → ( R_TL  R_TR ; R_BL  R_BR ),

where these partitionings are "conformal" (they have to fit in context). To come up with algorithms, we now ask the question "What are the contents of A and R at a typical stage of the loop?" To answer this, we instead first ask the question "In terms of the parts of the matrices that are naturally exposed by the loop, what is the final goal?" To answer that question, we take the partitioned matrices, and enter them in the postcondition (3.4.1):

( A_L  A_R ) = ( Q_L  Q_R )
∧ ( Â_L  Â_R ) = ( Q_L  Q_R ) ( R_TL  R_TR ; 0  R_BR )
∧ ( Q_L  Q_R )^H ( Q_L  Q_R ) = ( I  0 ; 0  I ).

(Notice that R_BL becomes a zero matrix since R is upper triangular.) Applying the rules of linear algebra (multiplying out the various expressions) yields

( A_L  A_R ) = ( Q_L  Q_R )
∧ ( Â_L  Â_R ) = ( Q_L R_TL   Q_L R_TR + Q_R R_BR )        (3.4.2)
∧ ( Q_L^H Q_L  Q_L^H Q_R ; Q_R^H Q_L  Q_R^H Q_R ) = ( I  0 ; 0  I ).
We call this the Partitioned Matrix Expression (PME). It is a recursive definition of the
operation to be performed.
The different algorithms differ in what is in the matrices A and R as the loop iterates.
Can we systematically come up with an expression for their contents at a typical point in
the iteration? The observation is that when the loop has not finished, only part of the final
result has been computed. So, we should be able to take the PME in (3.4.2) and remove
terms to come up with partial results towards the final result. There are some dependencies
(some parts of matrices must be computed before others). Taking this into account gives us
two loop invariants:
• Loop invariant 1:

( A_L  A_R ) = ( Q_L  Â_R )
∧ Â_L = Q_L R_TL        (3.4.3)
∧ Q_L^H Q_L = I

• Loop invariant 2:

( A_L  A_R ) = ( Q_L  Â_R − Q_L R_TR )
∧ ( Â_L  Â_R ) = ( Q_L R_TL   Q_L R_TR + Q_R R_BR )
∧ Q_L^H Q_L = I

We note that our knowledge of linear algebra allows us to manipulate this into

( A_L  A_R ) = ( Q_L  Â_R − Q_L R_TR )
∧ Â_L = Q_L R_TL ∧ Q_L^H Â_R = R_TR ∧ Q_L^H Q_L = I.        (3.4.4)

The idea now is that we derive the loop that computes the QR factorization by systematically
deriving the algorithm that maintains the state of the variables described by a chosen loop
invariant. If you use (3.4.3), then you end up with CGS. If you use (3.4.4), then you end up
with MGS.
Interested in details? We have a MOOC for that: LAFF-On Programming for Correct-
ness.

3.5 Wrap Up
3.5.1 Additional homework
Homework 3.5.1.1 Consider the matrix ( A ; B ) where A has linearly independent columns. Let
• A = Q_A R_A be the QR factorization of A.
• ( R_A ; B ) = Q_B R_B be the QR factorization of ( R_A ; B ).
• ( A ; B ) = Q R be the QR factorization of ( A ; B ).
Assume that the diagonal entries of R_A, R_B, and R are all positive. Show that R = R_B.
Solution.

( A ; B ) = ( Q_A  0 ; 0  I ) ( R_A ; B ) = ( Q_A  0 ; 0  I ) Q_B R_B.

Also, ( A ; B ) = Q R. By the uniqueness of the QR factorization (when the diagonal elements of the triangular matrix are restricted to be positive), Q = ( Q_A  0 ; 0  I ) Q_B and R = R_B.
Remark 3.5.1.1 This last exercise gives a key insight that is explored in the paper
• [20] Brian C. Gunter, Robert A. van de Geijn, Parallel out-of-core computation and up-
dating of the QR factorization, ACM Transactions on Mathematical Software (TOMS),
2005.

3.5.2 Summary
Classical Gram-Schmidt orthogonalization: Given a set of linearly independent vectors {a_0, . . . , a_{n-1}} ⊂ C^m, the Gram-Schmidt process computes an orthonormal basis {q_0, . . . , q_{n-1}} that spans the same subspace as the original vectors, i.e.

Span({a_0, . . . , a_{n-1}}) = Span({q_0, . . . , q_{n-1}}).
The process proceeds as follows:
• Compute vector q_0 of unit length so that Span({a_0}) = Span({q_0}):
    ◦ ρ_{0,0} = ‖a_0‖2: computes the length of vector a_0.
    ◦ q_0 = a_0/ρ_{0,0}: sets q_0 to a unit vector in the direction of a_0.
  Notice that a_0 = q_0 ρ_{0,0}.
• Compute vector q_1 of unit length so that Span({a_0, a_1}) = Span({q_0, q_1}):
    ◦ ρ_{0,1} = q_0^H a_1: computes ρ_{0,1} so that ρ_{0,1} q_0 = (q_0^H a_1) q_0 equals the component of a_1 in the direction of q_0.
    ◦ a_1^⊥ = a_1 − ρ_{0,1} q_0: computes the component of a_1 that is orthogonal to q_0.
    ◦ ρ_{1,1} = ‖a_1^⊥‖2: computes the length of vector a_1^⊥.
    ◦ q_1 = a_1^⊥/ρ_{1,1}: sets q_1 to a unit vector in the direction of a_1^⊥.
  Notice that ( a_0  a_1 ) = ( q_0  q_1 ) ( ρ_{0,0}  ρ_{0,1} ; 0  ρ_{1,1} ).
• Compute vector q_2 of unit length so that Span({a_0, a_1, a_2}) = Span({q_0, q_1, q_2}):
    ◦ ρ_{0,2} = q_0^H a_2 and ρ_{1,2} = q_1^H a_2 or, equivalently, ( ρ_{0,2} ; ρ_{1,2} ) = ( q_0  q_1 )^H a_2: computes ρ_{0,2} and ρ_{1,2} so that ρ_{0,2} q_0 and ρ_{1,2} q_1 equal the components of a_2 in the directions of q_0 and q_1. Or, equivalently, ( q_0  q_1 ) ( ρ_{0,2} ; ρ_{1,2} ) is the component in Span({q_0, q_1}).
    ◦ a_2^⊥ = a_2 − ρ_{0,2} q_0 − ρ_{1,2} q_1 = a_2 − ( q_0  q_1 ) ( ρ_{0,2} ; ρ_{1,2} ): computes the component of a_2 that is orthogonal to q_0 and q_1.
    ◦ ρ_{2,2} = ‖a_2^⊥‖2: computes the length of vector a_2^⊥.
    ◦ q_2 = a_2^⊥/ρ_{2,2}: sets q_2 to a unit vector in the direction of a_2^⊥.
  Notice that ( a_0  a_1  a_2 ) = ( q_0  q_1  q_2 ) ( ρ_{0,0}  ρ_{0,1}  ρ_{0,2} ; 0  ρ_{1,1}  ρ_{1,2} ; 0  0  ρ_{2,2} ).
• And so forth.
Theorem 3.5.2.1 QR Decomposition Theorem. Let A ∈ C^{m×n} have linearly independent columns. Then there exists an orthonormal matrix Q and upper triangular matrix R such that A = QR, its QR decomposition. If the diagonal elements of R are taken to be real and positive, then the decomposition is unique.
Projection of a vector y onto the orthonormal columns of Q ∈ C^{m×n}:

[y^⊥, r] = Proj⊥QCGS(Q, y) (used by CGS):
    y^⊥ = y
    for i = 0, . . . , k − 1
        ρ_i := q_i^H y
        y^⊥ := y^⊥ − ρ_i q_i
    endfor

[y^⊥, r] = Proj⊥QMGS(Q, y) (used by MGS):
    y^⊥ = y
    for i = 0, . . . , k − 1
        ρ_i := q_i^H y^⊥
        y^⊥ := y^⊥ − ρ_i q_i
    endfor

Gram-Schmidt orthogonalization algorithms, [A, R] := GS(A) (overwrites A with Q): the loop exposes the current column a_1 together with r_01, ρ_11, and r_12^T, and the three variants update them as follows.

    CGS:               r_01 := A_0^H a_1;  a_1 := a_1 − A_0 r_01;  ρ_11 := ‖a_1‖2;  a_1 := a_1/ρ_11
    MGS:               [a_1, r_01] = Proj⊥toQMGS(A_0, a_1);  ρ_11 := ‖a_1‖2;  a_1 := a_1/ρ_11
    MGS (alternative): ρ_11 := ‖a_1‖2;  a_1 := a_1/ρ_11;  r_12^T := a_1^H A_2;  A_2 := A_2 − a_1 r_12^T

Classic example that shows that the columns of Q, computed by MGS, are "more orthogonal" than those computed by CGS:

A = ( 1 1 1 ; ε 0 0 ; 0 ε 0 ; 0 0 ε ) = ( a_0  a_1  a_2 ).

Cost of Gram-Schmidt algorithms: approximately 2mn² flops.


Definition 3.5.2.2 Let u ∈ C^n be a vector of unit length (‖u‖2 = 1). Then H = I − 2uu^H is said to be a Householder transformation or (Householder) reflector. ⌃
If H is a Householder transformation (reflector), then
• HH = I.
• H = H^H.
• H^H H = H H^H = I.
• H^{-1} = H^H = H.
Computing a Householder transformation I − 2uu^H:
• Real case:
    ◦ v = x ∓ ‖x‖2 e_0.
      v = x + sign(χ_1)‖x‖2 e_0 avoids catastrophic cancellation.
    ◦ u = v/‖v‖2
• Complex case:
    ◦ v = x ∓ ‖x‖2 e_0.
      (Picking ∓ carefully avoids catastrophic cancellation.)
    ◦ u = v/‖v‖2
Practical computation of u and τ so that I − uu^H/τ is a Householder transformation (reflector):

Algorithm: [ ( ρ ; u_2 ), τ ] = Housev( ( χ_1 ; x_2 ) )
    χ_2 := ‖x_2‖2
    α := ‖ ( χ_1 ; χ_2 ) ‖2   (= ‖x‖2)
    ρ := −sign(χ_1) α   (= −sign(χ_1)‖x‖2)
    ν_1 := χ_1 − ρ   (= χ_1 + sign(χ_1)‖x‖2)
    u_2 := x_2/ν_1
    χ_2 := χ_2/|ν_1|   (= ‖u_2‖2)
    τ := (1 + χ_2²)/2   (= (1 + u_2^H u_2)/2)
Householder QR factorization algorithm, [A, t] = HQR_unb_var1(A): in each iteration, expose α_11, a_21, a_12^T, A_22, and τ_1 and update

    [ ( ρ_11 ; u_21 ), τ_1 ] := Housev( ( α_11 ; a_21 ) )   (overwriting α_11 with ρ_11 and a_21 with u_21)
    w_12^T := (a_12^T + u_21^H A_22)/τ_1
    a_12^T := a_12^T − w_12^T
    A_22 := A_22 − a_21 w_12^T

Cost: approximately 2mn² − (2/3)n³ flops.


Algorithm for forming Q given output of Householder QR factorization algorithm, [A] = FormQ(A, t): marching from the bottom-right corner of A back to the top-left, in each iteration expose α_11, a_21, a_12^T, A_22, and τ_1 and update

    α_11 := 1 − 1/τ_1
    a_12^T := −(a_21^H A_22)/τ_1
    A_22 := A_22 + a_21 a_12^T
    a_21 := −a_21/τ_1

Cost: approximately 2mn² − (2/3)n³ flops.


Algorithm for applying Q^H given output of Householder QR factorization algorithm, [y] = Apply_QH(A, t, y): in each iteration, expose a_21, τ_1, ψ_1, and y_2 and update

    ω_1 := (ψ_1 + a_21^H y_2)/τ_1
    ψ_1 := ψ_1 − ω_1
    y_2 := y_2 − ω_1 a_21

Cost: approximately 4mn − 2n² flops.


Week 4

Linear Least Squares

4.1 Opening
4.1.1 Fitting the best line

YouTube: https://www.youtube.com/watch?v=LPfdOYoQQU0
A classic problem is to fit the "best" line through a given set of points: Given

{(χ_i, ψ_i)}_{i=0}^{m−1},

we wish to fit the line f(χ) = γ_0 + γ_1 χ to these points, meaning that the coefficients γ_0 and γ_1 are to be determined. Now, in the end we want to formulate this as approximately solving Ax = b and for that reason, we change the labels we use: Starting with points

{(α_i, β_i)}_{i=0}^{m−1},

we wish to fit the line f(α) = χ_0 + χ_1 α through these points so that

χ_0 + χ_1 α_0 ≈ β_0
χ_0 + χ_1 α_1 ≈ β_1
   ...
χ_0 + χ_1 α_{m−1} ≈ β_{m−1},

which we can instead write as

Ax ≈ b,

where

A = ( 1 α_0 ; 1 α_1 ; ... ; 1 α_{m−1} ),   x = ( χ_0 ; χ_1 ),   and   b = ( β_0 ; β_1 ; ... ; β_{m−1} ).
Homework 4.1.1.1 Use the script in Assignments/Week04/matlab/LineFittingExercise.m to fit a line to the given data by guessing the coefficients χ_0 and χ_1.
Ponder This 4.1.1.2 Rewrite the script for Homework 4.1.1.1 to be a bit more engaging.
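As a minimal MATLAB sketch of the setup described above (the data vectors alpha and beta below are made up; MATLAB's backslash is used here only as a black box to be replaced by the methods developed in this week):

    alpha = [ 0; 1; 2; 3; 4 ];
    beta  = [ 1.1; 1.9; 3.2; 3.9; 5.1 ];

    A = [ ones( length( alpha ), 1 ), alpha ];   % the m x 2 matrix from the text
    x = A \ beta;                                % least-squares solve
    chi0 = x( 1 );  chi1 = x( 2 );

    plot( alpha, beta, 'o', alpha, chi0 + chi1 * alpha, '-' );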

4.1.2 Overview
• 4.1 Opening
    ◦ 4.1.1 Fitting the best line
    ◦ 4.1.2 Overview
    ◦ 4.1.3 What you will learn
• 4.2 Solution via the Method of Normal Equations
    ◦ 4.2.1 The four fundamental spaces of a matrix
    ◦ 4.2.2 The Method of Normal Equations
    ◦ 4.2.3 Solving the normal equations
    ◦ 4.2.4 Conditioning of the linear least squares problem
    ◦ 4.2.5 Why using the Method of Normal Equations could be bad
• 4.3 Solution via the SVD
    ◦ 4.3.1 The SVD and the four fundamental spaces
    ◦ 4.3.2 Case 1: A has linearly independent columns
    ◦ 4.3.3 Case 2: General case
• 4.4 Solution via the QR factorization
    ◦ 4.4.1 A has linearly independent columns
    ◦ 4.4.2 Via Gram-Schmidt QR factorization
    ◦ 4.4.3 Via the Householder QR factorization
    ◦ 4.4.4 A has linearly dependent columns
• 4.5 Enrichments
    ◦ 4.5.1 Rank Revealing QR (RRQR) via MGS
    ◦ 4.5.2 Rank Revealing Householder QR factorization
• 4.6 Wrap Up
    ◦ 4.6.1 Additional homework
    ◦ 4.6.2 Summary

4.1.3 What you will learn


This week is all about solving linear least squares, a fundamental problem encountered when
fitting data or approximating matrices.
Upon completion of this week, you should be able to

• Formulate a linear least squares problem.

• Transform the least squares problem into normal equations.

• Relate the solution of the linear least squares problem to the four fundamental spaces.

• Describe the four fundamental spaces of a matrix using its singular value decomposi-
tion.

• Solve the linear least squares problem via the Normal Equations, the Singular Value Decomposition, and the QR decomposition.

• Compare and contrast the accuracy and cost of the different approaches for solving the
linear least squares problem.

4.2 Solution via the Method of Normal Equations


4.2.1 The four fundamental spaces of a matrix

YouTube: https://www.youtube.com/watch?v=9mdDqC1SChg
We assume that the reader remembers theory related to (vector) subspaces. If a review
is in order, we suggest Weeks 9 and 10 of Linear Algebra: Foundations to Frontiers (LAFF)
[26].

At some point in your linear algebra education, you should also have learned about the
four fundamental spaces of a matrix A ∈ C^{m×n} (although perhaps only for the real-valued
case):

• The column space, C(A), which is equal to the set of all vectors that are linear combi-
nations of the columns of A
{y | y = Ax}.

• The null space, N (A), which is equal to the set of all vectors that are mapped to the
zero vector by A
{x | Ax = 0}.

• The row space, R(A), which is equal to the set

{y | y H = xH A}.

Notice that R(A) = C(AH ).

• The left null space, which is equal to the set of all vectors

{x | xH A = 0}.

Notice that this set is equal to N (AH ).


Definition 4.2.1.1 Orthogonal subspaces. Two subspaces S, T ⊂ C^n are orthogonal if any two arbitrary vectors (and hence all vectors) x ∈ S and y ∈ T are orthogonal: x^H y = 0.

The following exercises help you refresh your skills regarding these subspaces.
Homework 4.2.1.1 Let A ∈ C^{m×n}. Show that its row space, R(A), and null space, N(A), are orthogonal.
Solution. Pick arbitrary x ∈ R(A) and y ∈ N(A). We need to show that these two vectors are orthogonal. Then

x^H y
= < x ∈ R(A) iff there exists z s.t. x = A^H z >
(A^H z)^H y
= < transposition of product >
z^H A y
= < y ∈ N(A) >
z^H 0 = 0.

Homework 4.2.1.2 Let A ∈ C^{m×n}. Show that its column space, C(A), and left null space, N(A^H), are orthogonal.
Solution. Pick arbitrary x ∈ C(A) and y ∈ N(A^H). Then

x^H y
= < x ∈ C(A) iff there exists z s.t. x = Az >
(Az)^H y
= < transposition of product >
z^H A^H y
= < y ∈ N(A^H) >
z^H 0 = 0.

Homework 4.2.1.3 Let {s_0, · · · , s_{r−1}} be a basis for subspace S ⊂ C^n and {t_0, · · · , t_{k−1}} be a basis for subspace T ⊂ C^n. Show that the following are equivalent statements:
1. Subspaces S, T are orthogonal.
2. The vectors in {s_0, · · · , s_{r−1}} are orthogonal to the vectors in {t_0, · · · , t_{k−1}}.
3. s_i^H t_j = 0 for all 0 ≤ i < r and 0 ≤ j < k.
4. ( s_0 · · · s_{r−1} )^H ( t_0 · · · t_{k−1} ) = 0, the zero matrix of appropriate size.
Solution. We are going to prove the equivalence of all the statements by showing that 1. implies 2., 2. implies 3., 3. implies 4., and 4. implies 1.
• 1. implies 2.
Subspaces S and T are orthogonal if any vectors x ∈ S and y ∈ T are orthogonal. Obviously, this means that s_i is orthogonal to t_j for 0 ≤ i < r and 0 ≤ j < k.
• 2. implies 3.
This is true by definition of what it means for two sets of vectors to be orthogonal.
• 3. implies 4.

( s_0 · · · s_{r−1} )^H ( t_0 · · · t_{k−1} ) = ( s_0^H t_0  s_0^H t_1  · · · ; s_1^H t_0  s_1^H t_1  · · · ; ...  ... ) = 0,

since each entry s_i^H t_j = 0.
• 4. implies 1.
We need to show that if x ∈ S and y ∈ T then x^H y = 0. Notice that

x = ( s_0 · · · s_{r−1} ) ( χ̂_0 ; ... ; χ̂_{r−1} )   and   y = ( t_0 · · · t_{k−1} ) ( ψ̂_0 ; ... ; ψ̂_{k−1} )

for appropriate choices of x̂ and ŷ. But then

x^H y = x̂^H ( s_0 · · · s_{r−1} )^H ( t_0 · · · t_{k−1} ) ŷ = x̂^H 0_{r×k} ŷ = 0.
Homework 4.2.1.4 Let A ∈ C^{m×n}. Show that any vector x ∈ C^n can be written as x = x_r + x_n, where x_r ∈ R(A) and x_n ∈ N(A), and x_r^H x_n = 0.
Hint. Let r be the rank of matrix A. In a basic linear algebra course you learned that then the dimension of the row space, R(A), is r and the dimension of the null space, N(A), is n − r. Let {w_0, · · · , w_{r−1}} be a basis for R(A) and {w_r, · · · , w_{n−1}} be a basis for N(A).
Answer. TRUE
Now prove it!
Solution. Let r be the rank of matrix A. In a basic linear algebra course you learned that then the dimension of the row space, R(A), is r and the dimension of the null space, N(A), is n − r.
Let {w_0, · · · , w_{r−1}} be a basis for R(A) and {w_r, · · · , w_{n−1}} be a basis for N(A). Since we know that these two spaces are orthogonal, we know that {w_0, · · · , w_{r−1}} are orthogonal to {w_r, · · · , w_{n−1}}. Hence {w_0, · · · , w_{n−1}} are linearly independent and form a basis for C^n. Thus, there exist coefficients {α_0, · · · , α_{n−1}} such that

x = α_0 w_0 + · · · + α_{n−1} w_{n−1}
  = < split the summation >
  (α_0 w_0 + · · · + α_{r−1} w_{r−1}) + (α_r w_r + · · · + α_{n−1} w_{n−1}),

where the first group of terms equals x_r and the second equals x_n.

YouTube: https://www.youtube.com/watch?v=Zd raR_7cMA


Figure 4.2.1.2 captures the insights so far.
Figure 4.2.1.2 Illustration of the four fundamental spaces and the mapping of a vector x ∈ C^n by matrix A ∈ C^{m×n}.
That figure also captures that if r is the rank of the matrix, then
• dim(R(A)) = dim(C(A)) = r;
• dim(N(A)) = n − r;
• dim(N(A^H)) = m − r.

Proving this is a bit cumbersome given the knowledge we have so far, but becomes very easy
once we relate the various spaces to the SVD, in Subsection 4.3.1. So, we just state it for
now.

4.2.2 The Method of Normal Equations

YouTube: https://www.youtube.com/watch?v=oT4KIOxx-f4
Consider again the LLS problem: Given A ∈ C^{m×n} and b ∈ C^m find x̂ ∈ C^n such that

‖b − Ax̂‖2 = min_{x∈C^n} ‖b − Ax‖2.

We list a sequence of observations that you should have been exposed to in previous study of linear algebra:
• b̂ = Ax̂ is in the column space of A.
• b̂ equals the member of the column space of A that is closest to b, making it the orthogonal projection of b onto the column space of A.
• Hence the residual, b − b̂, is orthogonal to the column space of A.
• From Figure 4.2.1.2 we deduce that b − b̂ = b − Ax̂ is in N(A^H), the left null space of A.
• Hence A^H(b − Ax̂) = 0 or, equivalently,

A^H A x̂ = A^H b.

This linear system of equations is known as the normal equations.
• If A has linearly independent columns, then rank(A) = n, N(A) = {0}, and A^H A is nonsingular. In this case,

x̂ = (A^H A)^{-1} A^H b.

Obviously, this solution is in the row space, since R(A) = C^n.
With this, we have discovered what is known as the Method of Normal Equations. These steps are summarized in Figure 4.2.2.1.
steps are summarized in Figure 4.2.2.1
WEEK 4. LINEAR LEAST SQUARES 220

PowerPoint Source
Figure 4.2.2.1 Solving LLS via the Method of Normal Equations when A has linearly
independent columns (and hence the row space of A equals Cn ).

Definition 4.2.2.2 (Left) pseudo inverse. Let A ∈ C^{m×n} have linearly independent columns. Then

A† = (A^H A)^{-1} A^H

is its (left) pseudo inverse. ⌃
Homework 4.2.2.1 Let A ∈ C^{m×m} be nonsingular. Then A^{-1} = A†.
Solution.

A A† = A (A^H A)^{-1} A^H = A A^{-1} A^{-H} A^H = I I = I.
Homework 4.2.2.2 Let A ∈ C^{m×n} have linearly independent columns. ALWAYS/SOMETIMES/NEVER: A A† = I.
Hint. Consider A = ( e_0 ).
Answer. SOMETIMES
Solution. An example where A A† = I is the case where m = n and hence A is nonsingular.
An example where A A† ≠ I is A = e_0 for m > 1. Then

A A†
= < instantiate >
e_0 ( e_0^H e_0 )^{-1} e_0^H
= < simplify >
e_0 e_0^H
= < multiply out >
( 1 0 · · · ; 0 0 · · · ; ... ... )
≠ I, since m > 1.
Ponder This 4.2.2.3 The last exercise suggests there is also a right pseudo inverse. How
would you define it?

4.2.3 Solving the normal equations

YouTube: https://www.youtube.com/watch?v= n4XogsWcOE


Let us review a method you have likely seen before for solving the LLS problem when matrix A has linearly independent columns. We already used these results in Subsection 2.1.1.
We wish to solve A^H A x̂ = A^H b, where A has linearly independent columns. If we form B = A^H A and y = A^H b, we can instead solve B x̂ = y. Some observations:
• Since A has linearly independent columns, B is nonsingular. Hence, x̂ is unique.

• B is Hermitian since B^H = (A^H A)^H = A^H (A^H)^H = A^H A = B.


• B is Hermitian Positive Definite (HPD): x ≠ 0 implies that x^H B x > 0. This follows from the fact that

x^H B x = x^H A^H A x = (Ax)^H (Ax) = ‖Ax‖₂².

Since A has linearly independent columns, x ≠ 0 implies that Ax ≠ 0 and hence ‖Ax‖₂² > 0.

In Section 5.4, you will find out that since B is HPD, there exists a lower triangular matrix
L such that B = LLH . This is known as the Cholesky factorization of B. The steps for
solving the normal equations then become

• Compute B = AH A.
Notice that since B is Hermitian symmetric, only the lower or upper triangular part
needs to be computed. This is known as a Hermitian rank-k update (where in this
case k = n). The cost is, approximately, mn2 flops. (See Subsection C.0.1.)

• Compute y = AH b.
The cost of this matrix-vector multiplication is, approximately, 2mn flops. (See Sub-
section C.0.1.)

• Compute the Cholesky factorization B → L L^H.
Later we will see that this costs, approximately, (1/3)n³ flops. (See Subsection 5.4.3.)

• Solve
Lz = y
(solve with a lower triangular matrix) followed by

LH x̂ = z

(solve with an upper triangular matrix).


Together, these triangular solves cost, approximately, 2n2 flops. (See Subsection C.0.1.)

We will revisit this in Section 5.4.
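A minimal MATLAB sketch of these steps, using MATLAB's built-in chol in place of the factorization routine developed later in the course (A and b are assumed given, with A having linearly independent columns):

    B = A' * A;                % B = A^H A (Hermitian positive definite)
    y = A' * b;                % y = A^H b
    L = chol( B, 'lower' );    % B = L * L'
    z    = L  \ y;             % solve L z = y
    xhat = L' \ z;             % solve L' xhat = z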

4.2.4 Conditioning of the linear least squares problem



YouTube: https://www.youtube.com/watch?v=etx_1VZ4VFk
Given A ∈ C^{m×n} with linearly independent columns and b ∈ C^m, consider the linear least squares (LLS) problem

‖b − A x̂‖2 = min_x ‖b − A x‖2        (4.2.1)

and the perturbed problem

‖(b + δb) − A(x̂ + δx̂)‖2 = min_x ‖(b + δb) − A(x + δx)‖2.        (4.2.2)

The question we want to examine is by how much the relative error in b is amplified into a relative error in x̂. We will restrict our discussion to the case where A has linearly independent columns.
Now, we discovered that b̂, the projection of b onto the column space of A, satisfies

b̂ = A x̂        (4.2.3)

and the projection of b + δb satisfies

b̂ + δb̂ = A(x̂ + δx̂),        (4.2.4)

where δb̂ equals the projection of δb onto the column space of A.
Let θ equal the angle between vector b and its projection b̂ (which equals the angle between b and the column space of A). Then

cos(θ) = ‖b̂‖2/‖b‖2

and hence

cos(θ) ‖b‖2 = ‖b̂‖2 = ‖A x̂‖2 ≤ ‖A‖2 ‖x̂‖2 = σ_0 ‖x̂‖2,

which (as long as x̂ ≠ 0) can be rewritten as

1/‖x̂‖2 ≤ ( σ_0 / cos(θ) ) ( 1/‖b‖2 ).        (4.2.5)

Subtracting (4.2.3) from (4.2.4) yields

δb̂ = A δx̂

or, equivalently, A δx̂ = δb̂, which is solved by

δx̂ = A† δb̂ = A† A (A^H A)^{-1} A^H δb = (A^H A)^{-1} A^H A (A^H A)^{-1} A^H δb = A† δb,

where A† = (A^H A)^{-1} A^H is the pseudo inverse of A and we recall that δb̂ = A (A^H A)^{-1} A^H δb. Hence

‖δx̂‖2 ≤ ‖A†‖2 ‖δb‖2.        (4.2.6)
Homework 4.2.4.1 Let A ∈ C^{m×n} have linearly independent columns. Show that

‖(A^H A)^{-1} A^H‖2 = 1/σ_{n−1},

where σ_{n−1} equals the smallest singular value of A.
Hint. Use the reduced SVD of A.
Solution. Let A = U_L Σ_TL V^H be the reduced SVD of A, where V is square because A has linearly independent columns. Then

‖(A^H A)^{-1} A^H‖2
= ‖( (U_L Σ_TL V^H)^H U_L Σ_TL V^H )^{-1} (U_L Σ_TL V^H)^H‖2
= ‖( V Σ_TL U_L^H U_L Σ_TL V^H )^{-1} V Σ_TL U_L^H‖2
= ‖( V^{-H} Σ_TL^{-1} Σ_TL^{-1} V^{-1} ) V Σ_TL U_L^H‖2
= ‖V Σ_TL^{-1} U_L^H‖2
= ‖Σ_TL^{-1} U_L^H‖2
= 1/σ_{n−1}.

This last step needs some more explanation: Clearly ‖Σ_TL^{-1} U_L^H‖2 ≤ ‖Σ_TL^{-1}‖2 ‖U_L^H‖2 ≤ 1/σ_{n−1}. We need to show that there exists a vector x with ‖x‖2 = 1 such that ‖Σ_TL^{-1} U_L^H x‖2 = 1/σ_{n−1}. If we pick x = u_{n−1} (the last column of U_L), then ‖Σ_TL^{-1} U_L^H u_{n−1}‖2 = ‖Σ_TL^{-1} e_{n−1}‖2 = ‖(1/σ_{n−1}) e_{n−1}‖2 = 1/σ_{n−1}.
Combining (4.2.5), (4.2.6), and the result in this last homework yields

‖δx̂‖2/‖x̂‖2 ≤ (1/cos(θ)) (σ_0/σ_{n−1}) ‖δb‖2/‖b‖2.        (4.2.7)

Notice the effect of the cos(θ). If b is almost perpendicular to C(A), then its projection b̂ is small and cos θ is small. Hence a small relative change in b can be greatly amplified. This makes sense: if b is almost perpendicular to C(A), then x̂ ≈ 0, and any small δb ∈ C(A) can yield a relatively large change δx.
Definition 4.2.4.1 Condition number of matrix with linearly independent columns. Let A ∈ C^{m×n} have linearly independent columns (and hence n ≤ m). Then its condition number (with respect to the 2-norm) is defined by

κ_2(A) = ‖A‖2 ‖A†‖2 = σ_0/σ_{n−1}.

It is informative to explicitly expose cos(θ) = ‖b̂‖2/‖b‖2 in (4.2.7):

‖δx̂‖2/‖x̂‖2 ≤ (‖b‖2/‖b̂‖2) (σ_0/σ_{n−1}) ‖δb‖2/‖b‖2.

Notice that the ratio

‖δb‖2/‖b‖2

can be made smaller by adding a component, b_r, to b that is orthogonal to C(A) (and hence does not change the projection onto the column space, b̂):

‖δb‖2/‖b + b_r‖2.

The factor 1/cos(θ) ensures that this does not magically reduce the relative error in x̂:

‖δx̂‖2/‖x̂‖2 ≤ (‖b + b_r‖2/‖b̂‖2) (σ_0/σ_{n−1}) ‖δb‖2/‖b + b_r‖2.

4.2.5 Why using the Method of Normal Equations could be bad

YouTube: https://www.youtube.com/watch?v=W-HnQDsZsOw
Homework 4.2.5.1 Show that κ_2(A^H A) = (κ_2(A))².
Hint. Use the SVD of A.
Solution. Let A = U Σ V^H be the reduced SVD of A. Then

κ_2(A^H A) = ‖A^H A‖2 ‖(A^H A)^{-1}‖2
= ‖(U Σ V^H)^H U Σ V^H‖2 ‖((U Σ V^H)^H U Σ V^H)^{-1}‖2
= ‖V Σ² V^H‖2 ‖V (Σ^{-1})² V^H‖2
= ‖Σ²‖2 ‖(Σ^{-1})²‖2
= σ_0²/σ_{n−1}²
= (σ_0/σ_{n−1})²
= κ_2(A)².
Let A ∈ C^{m×n} have linearly independent columns. If one uses the Method of Normal Equations to solve the linear least squares problem min_x ‖b − Ax‖2 via the steps
• Compute B = A^H A.
• Compute y = A^H b.
• Solve B x̂ = y.
the condition number of B equals the square of the condition number of A. So, while the sensitivity of the LLS problem is captured by

‖δx̂‖2/‖x̂‖2 ≤ (1/cos(θ)) κ_2(A) ‖δb‖2/‖b‖2,

the sensitivity of computing x̂ from B x̂ = y is captured by

‖δx̂‖2/‖x̂‖2 ≤ κ_2(A)² ‖δy‖2/‖y‖2.

If κ_2(A) is relatively small (meaning that A is not close to a matrix with linearly dependent columns), then this may not be a problem. But if the columns of A are nearly linearly dependent, or high accuracy is desired, alternatives to the Method of Normal Equations should be employed.
Remark 4.2.5.1 It is important to realize that this squaring of the condition number is an
artifact of the chosen algorithm rather than an inherent sensitivity to change of the problem.
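The squaring of the condition number is easy to observe in MATLAB. A minimal sketch, with a made-up nearly rank-deficient matrix and a made-up value of epsilon:

    epsilon = 1e-6;
    A = [ 1       1
          epsilon 0
          0       epsilon ];
    disp( cond( A ) );          % kappa_2(A)
    disp( cond( A' * A ) );     % kappa_2(A^H A), roughly kappa_2(A)^2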

4.3 Solution via the SVD


4.3.1 The SVD and the four fundamental spaces

YouTube: https://www.youtube.com/watch?v=Zj72oRSSsH8
Theorem 4.3.1.1 Given A ∈ C^{m×n}, let A = U_L Σ_TL V_L^H equal its Reduced SVD and

A = ( U_L  U_R ) ( Σ_TL  0 ; 0  0 ) ( V_L  V_R )^H

its SVD. Then
• C(A) = C(U_L),
• N(A) = C(V_R),
• R(A) = C(A^H) = C(V_L), and
• N(A^H) = C(U_R).
Proof. We prove that C(A) = C(U_L), leaving the other parts as exercises.
Let A = U_L Σ_TL V_L^H be the Reduced SVD of A. Then
• U_L^H U_L = I (U_L is orthonormal),
• V_L^H V_L = I (V_L is orthonormal), and
• Σ_TL is nonsingular because it is diagonal and the diagonal elements are all nonzero.
We will show that C(A) = C(U_L) by showing that C(A) ⊂ C(U_L) and C(U_L) ⊂ C(A).
• C(A) ⊂ C(U_L):
Let z ∈ C(A). Then there exists a vector x ∈ C^n such that z = Ax. But then z = Ax = U_L (Σ_TL V_L^H x) = U_L x̂. Hence z ∈ C(U_L).
• C(U_L) ⊂ C(A):
Let z ∈ C(U_L). Then there exists a vector x ∈ C^r such that z = U_L x. But then z = U_L x = U_L Σ_TL V_L^H V_L Σ_TL^{-1} x = A (V_L Σ_TL^{-1} x) = A x̂. Hence z ∈ C(A).
We leave the other parts as exercises for the learner. ⌅
Homework 4.3.1.1 For the last theorem, prove that R(A) = C(A^H) = C(V_L).
Solution. R(A) = C(V_L):
The slickest way to do this is to recognize that if A = U_L Σ_TL V_L^H is the Reduced SVD of A then A^H = V_L Σ_TL U_L^H is the Reduced SVD of A^H. One can then invoke the fact that C(A) = C(U_L) where in this case A is replaced by A^H and U_L by V_L.
Ponder This 4.3.1.2 For the last theorem, prove that N(A^H) = C(U_R).
Homework 4.3.1.3 Given A ∈ C^{m×n}, let A = U_L Σ_TL V_L^H equal its Reduced SVD, A = ( U_L  U_R ) ( Σ_TL  0 ; 0  0 ) ( V_L  V_R )^H its SVD, and r = rank(A).
• ALWAYS/SOMETIMES/NEVER: r = rank(A) = dim(C(A)) = dim(C(U_L)),
• ALWAYS/SOMETIMES/NEVER: r = dim(R(A)) = dim(C(V_L)),
• ALWAYS/SOMETIMES/NEVER: n − r = dim(N(A)) = dim(C(V_R)), and
• ALWAYS/SOMETIMES/NEVER: m − r = dim(N(A^H)) = dim(C(U_R)).
Answer.
• ALWAYS: r = rank(A) = dim(C(A)) = dim(C(U_L)),
• ALWAYS: r = dim(R(A)) = dim(C(V_L)),
• ALWAYS: n − r = dim(N(A)) = dim(C(V_R)), and
• ALWAYS: m − r = dim(N(A^H)) = dim(C(U_R)).
Now prove it.

Solution.

• ALWAYS: r = rank(A) = dim(C(A)) = dim(C(UL )),


The dimension of a space equals the number of vectors in a basis. A basis is any set
of linearly independent vectors such that the entire set can be created by taking linear
combinations of those vectors. The rank of a matrix is equal to the dimension of its
columns space which is equal to the dimension of its row space.
Now, clearly the columns of UL are linearly independent (since they are orthonormal)
and form a basis for C(UL ). This, together with Theorem 4.3.1.1, yields the fact that
r = rank(A) = dim(C(A)) = dim(C(UL )).

• ALWAYS: r = dim(R(A)) = dim(C(VL )),


There are a number of ways of reasoning this. One is a small modification of the proof
that r = rank(A) = dim(C(A)) = dim(C(UL )). Another is to look at AH and to apply
the last subproblem.

• ALWAYS: n ≠ r = dim(N (A)) = dim(C(VR )).


We know that dim(N (A)) + dim(R(A)) = n. The answer follows directly from this
and the last subproblem.

• ALWAYS: m ≠ r = dim(N (AH )) = dim(C(UR )).


We know that dim(N (AH )) + dim(C(A)) = m. The answer follows directly from this
and the first subproblem.
Homework 4.3.1.4 Given A ∈ C^{m×n}, let A = U_L Σ_TL V_L^H equal its Reduced SVD and A = ( U_L  U_R ) ( Σ_TL  0 ; 0  0 ) ( V_L  V_R )^H its SVD.
Any vector x ∈ C^n can be written as x = x_r + x_n where x_r ∈ C(V_L) and x_n ∈ C(V_R).
TRUE/FALSE
Answer. TRUE
Now prove it!
Solution.

x = I x = V V^H x
  = ( V_L  V_R ) ( V_L  V_R )^H x
  = ( V_L  V_R ) ( V_L^H x ; V_R^H x )
  = V_L V_L^H x + V_R V_R^H x,

where the first term equals x_r and the second equals x_n.

PowerPoint Source
Figure 4.3.1.2 Illustration of relationship between the SVD of matrix A and the four
fundamental spaces.

4.3.2 Case 1: A has linearly independent columns

YouTube: https://www.youtube.com/watch?v=wLCN0yOLFkM
Let us start by discussing how to use the SVD to find x̂ that satisfies

Îb ≠ Ax̂Î2 = min Îb ≠ AxÎ2 ,


x

for the case where A œ Cm◊n has linearly independent columns (in other words, rank(A) =
n).
Let A = UL T L V H be its reduced SVD decomposition. (Notice that VL = V since A has
linearly independent columns and hence VL is n ◊ n and equals V .)
Here is a way to find the solution based on what we encountered before: Since A has
linearly independent columns, the solution is given by x̂ = (AH A)≠1 AH b (the solution to the
normal equations). Now,


= < solution to the normal equations >
(AH A)≠1 AH b
= < A = UL T L V H >
Ë È≠1
(UL T LV ) (UL T L V H )
H H
(UL T L V H )H b
= < (BCD)H = (DH C H B H ) and H TL = TL >
Ë È≠1
(V T L ULH )(UL T L V H ) (V T L UL )b
H

= < ULH UL = I >


Ë È≠1
V T L T LV H V T L ULH b
= = V and (BCD)≠1 = D≠1 C ≠1 B ≠1 >
≠1
<V H
≠1 ≠1 H H
V TL T LV V
T L UL b
= < V V = I and ≠1
H
TL TL = I >
≠1 H
V T L UL b
PowerPoint Source
Figure 4.3.2.1 Solving LLS via the SVD when A has linearly independent columns (and hence the row space of A equals C^n).
Alternatively, we can come to the same conclusion without depending on the Method of
Normal Equations, in preparation for the more general case discussed in the next subsection.
The derivation is captured in Figure 4.3.2.1.

min_{x∈C^n} ‖b − Ax‖₂²
= < substitute the SVD A = U Σ V^H >
min_{x∈C^n} ‖b − U Σ V^H x‖₂²
= < substitute I = U U^H and factor out U >
min_{x∈C^n} ‖U (U^H b − Σ V^H x)‖₂²
= < multiplication by a unitary matrix preserves the two-norm >
min_{x∈C^n} ‖U^H b − Σ V^H x‖₂²
= < partition, partitioned matrix-matrix multiplication >
min_{x∈C^n} ‖ ( U_L^H b ; U_R^H b ) − ( Σ_TL V^H x ; 0 ) ‖₂²
= < partitioned matrix-matrix multiplication and addition >
min_{x∈C^n} ‖ ( U_L^H b − Σ_TL V^H x ; U_R^H b ) ‖₂²
= < ‖ ( v_T ; v_B ) ‖₂² = ‖v_T‖₂² + ‖v_B‖₂² >
min_{x∈C^n} ( ‖U_L^H b − Σ_TL V^H x‖₂² + ‖U_R^H b‖₂² ).
The x that solves Σ_TL V^H x = U_L^H b minimizes the expression. That x is given by

x̂ = V Σ_TL^{-1} U_L^H b,

since Σ_TL is a diagonal matrix with only nonzeroes on its diagonal and V is unitary.
Here is yet another way of looking at this: we wish to compute x̂ that satisfies

‖b − A x̂‖2 = min_x ‖b − A x‖2,

for the case where A ∈ C^{m×n} has linearly independent columns. We know that A = U_L Σ_TL V^H, its Reduced SVD. To find the x that minimizes, we first project b onto the column space of A. Since the column space of A is identical to the column space of U_L, we can project onto the column space of U_L instead:

b̂ = U_L U_L^H b.

(Notice that this is not because U_L is unitary, since it isn’t. It is because the matrix U_L U_L^H projects onto the column space of U_L since U_L is orthonormal.) Now, we wish to find x̂ that exactly solves A x̂ = b̂. Substituting in the Reduced SVD, this means that

U_L Σ_TL V^H x̂ = U_L U_L^H b.

Multiplying both sides by U_L^H yields

Σ_TL V^H x̂ = U_L^H b,

and hence

x̂ = V Σ_TL^{-1} U_L^H b.

We believe this last explanation probably leverages the Reduced SVD in a way that provides the most insight, and it nicely motivates how to find solutions to the LLS problem when rank(A) < n.
The steps for solving the linear least squares problem via the SVD, when A ∈ C^{m×n} has linearly independent columns, and the costs of those steps are given by
• Compute the Reduced SVD A = U_L Σ_TL V^H.
We will not discuss practical algorithms for computing the SVD until much later. We will see that the cost is O(mn²) with a large constant.
• Compute x̂ = V Σ_TL^{-1} U_L^H b.
The cost of this is, approximately,
    ◦ Form y_T = U_L^H b: 2mn flops.
    ◦ Scale the individual entries in y_T by dividing by the corresponding singular values: n divides, overwriting y_T := Σ_TL^{-1} y_T. The cost of this is negligible.
    ◦ Compute x̂ = V y_T: 2n² flops.
The devil is in the details of how the SVD is computed and whether the matrices U_L and/or V are explicitly formed.
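A minimal MATLAB sketch of these steps, using MATLAB's built-in svd for the Reduced SVD (A, with linearly independent columns, and b are assumed given):

    [ U, Sigma, V ] = svd( A, 'econ' );   % reduced SVD: A = U * Sigma * V'
    yT   = U' * b;                        % project onto the column space of A
    yT   = yT ./ diag( Sigma );           % scale by 1/sigma_i
    xhat = V * yT;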

4.3.3 Case 2: General case

YouTube: https://www.youtube.com/watch?v=qhsPHQk1id8
Now we show how to use the SVD to find x̂ that satisfies

‖b − A x̂‖2 = min_x ‖b − A x‖2,

where rank(A) = r, with no assumptions about the relative size of m and n. In our discussion, we let A = U_L Σ_TL V_L^H equal its Reduced SVD and

A = ( U_L  U_R ) ( Σ_TL  0 ; 0  0 ) ( V_L  V_R )^H

its SVD.
The first observation is, once more, that an x̂ that minimizes ‖b − Ax‖2 satisfies

A x̂ = b̂,

where b̂ = U_L U_L^H b, the orthogonal projection of b onto the column space of A. Notice our use of "an x̂" since the solution won’t be unique if r < n and hence the null space of A is not trivial. Substituting in the SVD this means that

( U_L  U_R ) ( Σ_TL  0 ; 0  0 ) ( V_L  V_R )^H x̂ = U_L U_L^H b.

Multiplying both sides by U_L^H yields

( I  0 ) ( Σ_TL  0 ; 0  0 ) ( V_L  V_R )^H x̂ = U_L^H b

or, equivalently,

Σ_TL V_L^H x̂ = U_L^H b.        (4.3.1)

Any solution to this can be written as the sum of a vector in the row space of A with a vector in the null space of A:

x̂ = V z = ( V_L  V_R ) ( z_T ; z_B ) = V_L z_T + V_R z_B,

where x_r = V_L z_T and x_n = V_R z_B. Substituting this into (4.3.1) we get

Σ_TL V_L^H (V_L z_T + V_R z_B) = U_L^H b,

which leaves us with

Σ_TL z_T = U_L^H b.

Thus, the solution in the row space is given by

x_r = V_L z_T = V_L Σ_TL^{-1} U_L^H b

and the general solution is given by

x̂ = V_L Σ_TL^{-1} U_L^H b + V_R z_B,

where z_B is any vector in C^{n−r}. This reasoning is captured in Figure 4.3.3.1.

PowerPoint Source
Figure 4.3.3.1 Solving LLS via the SVD of A.
Homework 4.3.3.1 Reason that

x̂ = V_L Σ_TL^{-1} U_L^H b

is the solution to the LLS problem with minimal length (2-norm). In other words, if x* satisfies

‖b − A x*‖2 = min_x ‖b − A x‖2

then ‖x̂‖2 ≤ ‖x*‖2.
Solution. The important insight is that

x* = V_L Σ_TL^{-1} U_L^H b + V_R z_B = x̂ + V_R z_B

and that V_L Σ_TL^{-1} U_L^H b and V_R z_B are orthogonal to each other (since V_L^H V_R = 0). If u^H v = 0 then ‖u + v‖₂² = ‖u‖₂² + ‖v‖₂². Hence

‖x*‖₂² = ‖x̂ + V_R z_B‖₂² = ‖x̂‖₂² + ‖V_R z_B‖₂² ≥ ‖x̂‖₂²

and hence ‖x̂‖2 ≤ ‖x*‖2.

4.4 Solution via the QR factorization


4.4.1 A has linearly independent columns

YouTube: https://www.youtube.com/watch?v=mKAZjYX656Y
Theorem 4.4.1.1 Assume A ∈ C^{m×n} has linearly independent columns and let A = QR be its QR factorization with orthonormal matrix Q ∈ C^{m×n} and upper triangular matrix R ∈ C^{n×n}. Then the LLS problem

Find x̂ ∈ C^n such that ‖b − A x̂‖2 = min_{x∈C^n} ‖b − A x‖2

is solved by the unique solution of

R x̂ = Q^H b.
Proof 1. Since A = QR, minimizing ‖b − Ax‖2 means minimizing

‖b − Q z‖2, where z = R x.

Since R is nonsingular, we can first find z that minimizes

‖b − Q z‖2

after which we can solve R x = z for x. But from the Method of Normal Equations we know that the minimizing z solves

Q^H Q z = Q^H b.

Since Q has orthonormal columns, we thus deduce that

z = Q^H b.

Hence, the desired x̂ must satisfy

R x̂ = Q^H b.

Proof 2. Let A = Q_L R_TL be the QR factorization of A. We know that then there exists a matrix Q_R such that Q = ( Q_L  Q_R ) is unitary: Q_R is an orthonormal basis for the space orthogonal to the space spanned by Q_L. Now,

min_{x∈C^n} ‖b − Ax‖₂²
= < substitute A = Q_L R_TL >
min_{x∈C^n} ‖b − Q_L R_TL x‖₂²
= < the two-norm is preserved since Q^H is unitary >
min_{x∈C^n} ‖Q^H (b − Q_L R_TL x)‖₂²
= < partitioning; distributing >
min_{x∈C^n} ‖ ( Q_L^H b ; Q_R^H b ) − ( Q_L^H Q_L R_TL x ; Q_R^H Q_L R_TL x ) ‖₂²
= < partitioned matrix-matrix multiplication >
min_{x∈C^n} ‖ ( Q_L^H b ; Q_R^H b ) − ( R_TL x ; 0 ) ‖₂²
= < partitioned matrix addition >
min_{x∈C^n} ‖ ( Q_L^H b − R_TL x ; Q_R^H b ) ‖₂²
= < property of the 2-norm: ‖ ( u ; v ) ‖₂² = ‖u‖₂² + ‖v‖₂² >
min_{x∈C^n} ( ‖Q_L^H b − R_TL x‖₂² + ‖Q_R^H b‖₂² )
= < Q_R^H b is independent of x >
min_{x∈C^n} ‖Q_L^H b − R_TL x‖₂² + ‖Q_R^H b‖₂²
= < minimized by x̂ that satisfies R_TL x̂ = Q_L^H b >
‖Q_R^H b‖₂².

Thus, the desired x̂ that minimizes the linear least squares problem solves R_TL x̂ = Q_L^H b. The solution is unique because R_TL is nonsingular (because A has linearly independent columns).

Homework 4.4.1.1 Yet another alternative proof for Theorem 4.4.1.1 starts with the observation that the solution is given by x̂ = (A^H A)^{-1} A^H b and then substitutes in A = QR. Give a proof that builds on this insight.
Solution. Recall that we saw in Subsection 4.2.2 that, if A has linearly independent columns, the LLS solution is given by x̂ = (A^H A)^{-1} A^H b (the solution to the normal equations). Also, if A has linearly independent columns and A = QR is its QR factorization, then the upper triangular matrix R is nonsingular (and hence has no zeroes on its diagonal). Now,

x̂
= < solution to the normal equations >
(A^H A)^{-1} A^H b
= < A = QR >
[ (QR)^H (QR) ]^{-1} (QR)^H b
= < (BC)^H = C^H B^H >
[ R^H Q^H Q R ]^{-1} R^H Q^H b
= < Q^H Q = I >
[ R^H R ]^{-1} R^H Q^H b
= < (BC)^{-1} = C^{-1} B^{-1} >
R^{-1} R^{-H} R^H Q^H b
= < R^{-H} R^H = I >
R^{-1} Q^H b.

Thus, the x̂ that solves R x̂ = Q^H b solves the LLS problem.
Ponder This 4.4.1.2 Create a picture similar to Figure 4.3.2.1 that uses the QR factoriza-
tion rather than the SVD.

4.4.2 Via Gram-Schmidt QR factorization


In Section 3.2, you were introduced to the (Classical and Modified) Gram-Schmidt process
and how it was equivalent to computing a QR factorization of the matrix, A, that has as
columns the linearly independent vectors being orthonormalized. The resulting Q and R
can be used to solve the linear least squares problem by first computing y = QH b and next
solving Rx̂ = y.
Starting with A ∈ C^{m×n}, let’s explicitly state the steps required to solve the LLS problem via either CGS or MGS and analyze the cost:

• From Homework 3.2.6.1 or Homework 3.2.6.2, factoring A = QR via CGS or MGS


costs, approximately, 2mn2 flops.

• Compute y = QH b: 2mn flops.

• Solve Rx̂ = y: n2 flops.

Total: 2mn2 + 2mn + n2 flops.
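A minimal MATLAB sketch of these steps, assuming the routine MGS_QR from Homework 3.2.4.3 is available and A has linearly independent columns:

    [ Q, R ] = MGS_QR( A );
    y    = Q' * b;
    xhat = R \ y;        % R is upper triangular, so this is a triangular solve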



4.4.3 Via the Householder QR factorization

YouTube: https://www.youtube.com/watch?v=Mk-Y_15aGGc
Given A ∈ C^{m×n} with linearly independent columns, the Householder QR factorization yields n Householder transformations, H_0, . . . , H_{n−1}, so that

H_{n−1} · · · H_0 A = ( R_TL ; 0 ),   where Q^H = H_{n−1} · · · H_0.

[A, t] = HouseQR_unb_var1(A) overwrites A with the Householder vectors that define H_0, · · · , H_{n−1} below the diagonal and R_TL in the upper triangular part.
Rather than explicitly computing Q and then computing y := Q^H y, we can instead apply the Householder transformations:

y := H_{n−1} · · · H_0 y,

overwriting y with ŷ. After this, the vector y is partitioned as y = ( y_T ; y_B ) and the triangular system R_TL x̂ = y_T yields the desired solution.
The steps and their costs of this approach are
• From Subsection 3.3.4, factoring A = QR via the Householder QR factorization costs, approximately, 2mn² − (2/3)n³ flops.
• From Homework 3.3.6.1, applying Q as a sequence of Householder transformations costs, approximately, 4mn − 2n² flops.
• Solve R_TL x̂ = y_T: n² flops.
Total: 2mn² − (2/3)n³ + 4mn − n² ≈ 2mn² − (2/3)n³ flops.
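A minimal MATLAB sketch of these steps, assuming HQR from Homework 3.3.4.3 and an implementation of Apply_QH following Figure 3.3.6.1 are available:

    [ m, n ]      = size( A );
    [ A_fact, t ] = HQR( A );
    y    = Apply_QH( A_fact, t, b );                 % y := H_{n-1} ... H_0 b
    xhat = triu( A_fact( 1:n, 1:n ) ) \ y( 1:n );    % solve R_TL xhat = y_T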

4.4.4 A has linearly dependent columns


Let us now consider the case where A œ Cm◊n has rank r Æ n. In other words, it has r
linearly independent columns. Let p œ Rn be a permutation vector, by which we mean a
permutation of the vector Q R
0
c
c 1 d d
c
c .. dd
a . b
n≠1
WEEK 4. LINEAR LEAST SQUARES 239

And P (p) be the matrix that, when applied to a vector x œ Cn permutes the entries of x
according to the vector p:
Q R Q R Q R
eTfi0 eTfi0 x ‰ fi0
c eTfi1 d c eTfi1 x d c d
c d c d c ‰ fi1 d
P (p)x = c
c .. d
d x=c
c .. d
d =c
c .. d.
d
a . b a . b a . b
eTfin≠1 eTfin≠1 x ‰fin≠1
¸ ˚˙ ˝
P (p)

where ej equals the columns of I œ Rn◊n indexed with j (and hence the standard basis vector
indexed with j).
If we apply P(p)^T to A ∈ C^{m×n} from the right, we get

      A P(p)^T
  =   < definition of P(p) >
      A ( e_{π_0}^T ; . . . ; e_{π_{n−1}}^T )^T
  =   < transpose >
      A ( e_{π_0} | · · · | e_{π_{n−1}} )
  =   < matrix multiplication by columns >
      ( A e_{π_0} | · · · | A e_{π_{n−1}} )
  =   < B e_j = b_j >
      ( a_{π_0} | · · · | a_{π_{n−1}} ).

In other words, applying the transpose of the permutation matrix to A from the right permutes its columns as indicated by the permutation vector p.
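In MATLAB, this column permutation is most naturally expressed with an index vector rather than by forming P(p) explicitly. A small sketch of our own (the +1 shifts account for MATLAB's 1-based indexing):

    p = [ 2 0 1 ];                      % permutation vector (0-based, as in the text)
    A = rand( 4, 3 );
    P = eye( 3 );  P = P( p+1, : );     % P(p): the rows of the identity, reordered
    norm( A * P' - A( :, p+1 ) )        % A P(p)^T equals A with its columns permuted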
The discussion about permutation matrices gives us the ability to rearrange the columns of A so that the first r = rank(A) columns are linearly independent.
Theorem 4.4.4.1 Assume A ∈ C^{m×n} and that r = rank(A). Then there exist a permutation vector p ∈ R^n, orthonormal matrix Q_L ∈ C^{m×r}, upper triangular matrix R_TL ∈ C^{r×r}, and R_TR ∈ C^{r×(n−r)} such that

    A P(p)^T = Q_L ( R_TL | R_TR ).

Proof. Let p be the permutation vector such that the first r columns of A^P = A P(p)^T are linearly independent. Partition

    A^P = A P(p)^T = ( A^P_L | A^P_R ),

where A^P_L ∈ C^{m×r}. Since A^P_L has linearly independent columns, its QR factorization, A^P_L = Q_L R_TL, exists. Since all the linearly independent columns of matrix A were permuted to the left, the remaining columns, now part of A^P_R, are in the column space of A^P_L and hence in the column space of Q_L. Hence A^P_R = Q_L R_TR for some matrix R_TR, which then must satisfy Q_L^H A^P_R = R_TR, giving us a means by which to compute it. We conclude that

    A^P = A P(p)^T = ( A^P_L | A^P_R ) = Q_L ( R_TL | R_TR ).


Let us examine how this last theorem can help us solve the LLS problem

    Find x̂ ∈ C^n such that ‖b − A x̂‖_2 = min_{x ∈ C^n} ‖b − A x‖_2

when rank(A) ≤ n:

      min_{x ∈ C^n} ‖b − A x‖_2
  =   < P(p)^T P(p) = I >
      min_{x ∈ C^n} ‖b − A P(p)^T P(p) x‖_2
  =   < A P(p)^T = Q_L ( R_TL | R_TR ) >
      min_{x ∈ C^n} ‖b − Q_L ( R_TL | R_TR ) P(p) x‖_2
  =   < substitute w = ( R_TL | R_TR ) P(p) x >
      min_{w ∈ C^r} ‖b − Q_L w‖_2,

which is minimized when w = Q_L^H b. Thus, we are looking for a vector x̂ such that

    ( R_TL | R_TR ) P(p) x̂ = Q_L^H b.

Substituting

    z = ( z_T ; z_B )

for P(p) x̂ we find that

    ( R_TL | R_TR ) ( z_T ; z_B ) = Q_L^H b.

Now, we can pick z_B ∈ C^{n−r} to be an arbitrary vector, and determine a corresponding z_T by solving

    R_TL z_T = Q_L^H b − R_TR z_B.

A convenient choice is z_B = 0, so that z_T solves

    R_TL z_T = Q_L^H b.

Regardless of the choice of z_B, the solution x̂ is given by

    x̂ = P(p)^T ( R_TL^{−1} ( Q_L^H b − R_TR z_B ) ; z_B )

(a permutation of the vector z). This defines an infinite number of solutions if rank(A) < n.
The problem is that we don't know which columns are linearly independent in advance. In the enrichments in Subsection 4.5.1 and Subsection 4.5.2, rank-revealing QR factorization algorithms are discussed that overcome this problem.
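MATLAB's built-in column-pivoted QR factorization can be used to act out this computation. The sketch below is our own: the tolerance used to estimate the numerical rank is an arbitrary choice (and assumes A is nonzero), and the sketch computes the particular solution corresponding to z_B = 0.

    % Basic solution of min_x || b - A x ||_2 when A may have linearly dependent columns.
    [ Q, R, P ] = qr( A );                                   % A * P = Q * R, with column pivoting
    r = sum( abs( diag( R ) ) > 1e-12 * abs( R( 1, 1 ) ) );  % crude numerical rank estimate
    zT = R( 1:r, 1:r ) \ ( Q( :, 1:r )' * b );               % solve R_TL z_T = Q_L^H b
    xhat = P * [ zT; zeros( size( A, 2 ) - r, 1 ) ];         % undo the column permutation (z_B = 0)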

4.5 Enrichments
4.5.1 Rank-Revealing QR (RRQR) via MGS
The discussion in Subsection 4.4.4 falls short of being a practical algorithm for at least two reasons:
• One needs to be able to determine in advance what columns of A are linearly independent; and
• Due to roundoff error or error in the data from which the matrix was created, a column may be linearly independent of other columns when for practical purposes it should be considered dependent.
We now discuss how the MGS algorithm can be modified so that appropriate linearly independent columns can be determined "on the fly" as well as the de facto rank of the matrix. The result is known as the Rank-Revealing QR factorization (RRQR). It is also known as the QR factorization with column pivoting. We are going to give a modification of the MGS algorithm for computing the RRQR.
For our discussion, we introduce an elementary pivot matrix, P̃(j) ∈ C^{n×n}, that swaps the first element of the vector to which it is applied with the element indexed with j:

    P̃(j) x = ( e_j^T ; e_1^T ; . . . ; e_{j−1}^T ; e_0^T ; e_{j+1}^T ; . . . ; e_{n−1}^T ) x
            = ( χ_j ; χ_1 ; . . . ; χ_{j−1} ; χ_0 ; χ_{j+1} ; . . . ; χ_{n−1} ).

Another way of stating this is that

    P̃(j) = ( 0   0                  1   0
              0   I_{(j−1)×(j−1)}    0   0
              1   0                  0   0
              0   0                  0   I_{(n−j−1)×(n−j−1)} ),

where I_{k×k} equals the k × k identity matrix. When applying P̃(j) from the right to a matrix, it swaps the first column and the column indexed with j. Notice that P̃(j)^T = P̃(j) and P̃(j) = P̃(j)^{−1}.
Remark 4.5.1.1 For a more detailed discussion of permutation matrices, you may want to
consult Week 7 of "Linear Algebra: Foundations to Frontiers" (LAFF) [26]. We also revisit
this in Section 5.3 when discussing LU factorization with partial pivoting.
Here is an outline of the algorithm:

• Determine the index π_1 such that the column of A indexed with π_1 has the largest 2-norm (is the longest).

• Permute A := A P̃(π_1), swapping the first column with the column that is longest.

• Partition

    A → ( a_1 | A_2 ),   Q → ( q_1 | Q_2 ),   R → ( ρ_11  r_12^T ; 0  R_22 ),   p → ( π_1 ; p_2 ).

• Compute ρ_11 := ‖a_1‖_2.

• q_1 := a_1 / ρ_11.

• Compute r_12^T := q_1^T A_2.

• Update A_2 := A_2 − q_1 r_12^T.
  This subtracts the component of each column that is in the direction of q_1.

• Continue the process with the updated matrix A_2.

The complete algorithm, which overwrites A with Q, is given in Figure 4.5.1.2. Observe that the elements on the diagonal of R will be positive and in non-increasing order because updating A_2 := A_2 − q_1 r_12^T inherently does not increase the length of the columns of A_2. After all, the component in the direction of q_1 is being subtracted from each column of A_2, leaving the component orthogonal to q_1.
leaving the component orthogonal to q1 .
[A, R, p] := RRQR_MGS_simple(A, R, p)
  A → ( A_L | A_R ),  R → ( R_TL  R_TR ; R_BL  R_BR ),  p → ( p_T ; p_B )
  A_L has 0 columns, R_TL is 0 × 0, p_T has 0 rows
  while n(A_L) < n(A)
    ( A_L | A_R ) → ( A_0 | a_1 | A_2 ),
    ( R_TL  R_TR ; R_BL  R_BR ) → ( R_00  r_01  R_02 ; r_10^T  ρ_11  r_12^T ; R_20  r_21  R_22 ),
    ( p_T ; p_B ) → ( p_0 ; π_1 ; p_2 )
    π_1 = DetermineColumnIndex( ( a_1 | A_2 ) )
    ( a_1 | A_2 ) := ( a_1 | A_2 ) P̃(π_1)
    ρ_11 := ‖a_1‖_2
    a_1 := a_1 / ρ_11
    r_12^T := a_1^T A_2
    A_2 := A_2 − a_1 r_12^T
    continue with ( A_L | A_R ) ← ( A_0 | a_1 | A_2 ), and similarly for R and p
  endwhile

Figure 4.5.1.2 Simple implementation of RRQR via MGS. Incorporating a stopping criterion that checks whether ρ_11 is small would allow the algorithm to determine the effective rank of the input matrix.
The problem with the algorithm in Figure 4.5.1.2 is that determining the index π_1 requires the 2-norm of all columns in A_R to be computed, which costs O(m(n − j)) flops when A_L has j columns (and hence A_R has n − j columns). The following insight reduces this cost:

Let A = ( a_0 | a_1 | · · · | a_{n−1} ),

    v = ( ν_0 ; ν_1 ; . . . ; ν_{n−1} ) = ( ‖a_0‖_2^2 ; ‖a_1‖_2^2 ; . . . ; ‖a_{n−1}‖_2^2 ),

q^T q = 1 (here q is of the same size as the columns of A), and r = A^T q = ( ρ_0 ; ρ_1 ; . . . ; ρ_{n−1} ). Compute B := A − q r^T with B = ( b_0 | b_1 | · · · | b_{n−1} ). Then

    ( ‖b_0‖_2^2 ; ‖b_1‖_2^2 ; . . . ; ‖b_{n−1}‖_2^2 ) = ( ν_0 − ρ_0^2 ; ν_1 − ρ_1^2 ; . . . ; ν_{n−1} − ρ_{n−1}^2 ).

To verify this, notice that

    a_i = (a_i − a_i^T q q) + a_i^T q q

and

    (a_i − a_i^T q q)^T q = a_i^T q − a_i^T q q^T q = a_i^T q − a_i^T q = 0.

This means that

    ‖a_i‖_2^2 = ‖(a_i − a_i^T q q) + a_i^T q q‖_2^2 = ‖a_i − a_i^T q q‖_2^2 + ‖a_i^T q q‖_2^2 = ‖a_i − ρ_i q‖_2^2 + ‖ρ_i q‖_2^2 = ‖b_i‖_2^2 + ρ_i^2

so that

    ‖b_i‖_2^2 = ‖a_i‖_2^2 − ρ_i^2 = ν_i − ρ_i^2.

Building on this insight, we make an important observation that greatly reduces the cost of determining the column that is longest. Let us start by computing v as the vector such that the i-th entry in v equals the square of the length of the i-th column of A. In other words, the i-th entry of v equals the dot product of the i-th column of A with itself. In the above outline for the MGS with column pivoting, we can then also partition

    v → ( ν_1 ; v_2 ).

The question becomes how v_2 before the update A_2 := A_2 − q_1 r_12^T compares to v_2 after that update. The answer is that the i-th entry of v_2 must be updated by subtracting off the square of the i-th entry of r_12^T.
Let us introduce the functions v = ComputeWeights( A ) and v = UpdateWeights( v, r ) to compute the described weight vector v and to update a weight vector v by subtracting from its elements the squares of the corresponding entries of r. Also, the function DeterminePivot returns the index of the largest entry in the vector, and swaps that entry with the first entry. An optimized RRQR via MGS algorithm, RRQR-MGS, is now given in Figure 4.5.1.3. In that algorithm, A is overwritten with Q.
[A, R, p] := RRQR_MGS(A, R, p)
  v := ComputeWeights(A)
  A → ( A_L | A_R ),  R → ( R_TL  R_TR ; R_BL  R_BR ),  p → ( p_T ; p_B ),  v → ( v_T ; v_B )
  A_L has 0 columns, R_TL is 0 × 0, p_T has 0 rows, v_T has 0 rows
  while n(A_L) < n(A)
    ( A_L | A_R ) → ( A_0 | a_1 | A_2 ),
    ( R_TL  R_TR ; R_BL  R_BR ) → ( R_00  r_01  R_02 ; r_10^T  ρ_11  r_12^T ; R_20  r_21  R_22 ),
    ( p_T ; p_B ) → ( p_0 ; π_1 ; p_2 ),  ( v_T ; v_B ) → ( v_0 ; ν_1 ; v_2 )
    [ ( ν_1 ; v_2 ), π_1 ] = DeterminePivot( ( ν_1 ; v_2 ) )
    ( a_1 | A_2 ) := ( a_1 | A_2 ) P̃(π_1)^T
    ρ_11 := ‖a_1‖_2
    a_1 := a_1 / ρ_11
    r_12^T := q_1^T A_2
    A_2 := A_2 − q_1 r_12^T
    v_2 := UpdateWeights( v_2, r_12 )
    continue with ( A_L | A_R ) ← ( A_0 | a_1 | A_2 ), and similarly for R, p, and v
  endwhile

Figure 4.5.1.3 RRQR via MGS, with optimization. Incorporating a stopping criterion that checks whether ρ_11 is small would allow the algorithm to determine the effective rank of the input matrix.
Let us revisit the fact that the diagonal elements of R are positive and in nonincreasing order. This upper triangular matrix is singular if a diagonal element equals zero (and hence all subsequent diagonal elements equal zero). Hence, if ρ_11 becomes small relative to prior diagonal elements, the remaining columns of the (updated) A_R are essentially zero vectors, and the original matrix can be approximated with

    A ≈ Q_L ( R_TL | R_TR ).

If Q_L has k columns, then this becomes a rank-k approximation.

Remark 4.5.1.4 Notice that in updating the weight vector v, the accuracy of the entries may progressively deteriorate due to catastrophic cancellation. Since these values are only used to determine the order of the columns and, importantly, when they become very small the rank of the matrix has revealed itself, this is in practice not a problem.
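To make the bookkeeping concrete, here is a minimal MATLAB sketch of the three helper routines used above. These are our own illustrative versions (with MATLAB's 1-based indexing), not the routines distributed with the course materials.

    function v = ComputeWeights( A )
      % v( i ) = square of the 2-norm of the i-th column of A.
      v = sum( abs( A ).^2, 1 )';
    end

    function v = UpdateWeights( v, r )
      % Subtract the squares of the entries of r from the corresponding entries of v.
      v = v - abs( r( : ) ).^2;
    end

    function [ v, pi1 ] = DeterminePivot( v )
      % Return the index of the largest entry and swap that entry with the first one.
      [ ~, pi1 ] = max( v );
      v( [ 1, pi1 ] ) = v( [ pi1, 1 ] );
    end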

4.5.2 Rank Revealing Householder QR factorization


The unblocked QR factorization discussed in Section 3.3 can be supplemented with column pivoting, yielding HQRP_unb_var1 in Figure 4.5.2.1. In that algorithm, we incorporate the idea that the weights that are used to determine how to pivot can be updated at each step by using information in the partial row r_12^T, which overwrites a_12^T, just like it was in Subsection 4.5.1.
[A, t, p] = HQRP_unb_var1(A)
  v := ComputeWeights(A)
  A → ( A_TL  A_TR ; A_BL  A_BR ),  t → ( t_T ; t_B ),  p → ( p_T ; p_B ),  v → ( v_T ; v_B )
  A_TL is 0 × 0 and t_T has 0 elements
  while n(A_TL) < n(A)
    ( A_TL  A_TR ; A_BL  A_BR ) → ( A_00  a_01  A_02 ; a_10^T  α_11  a_12^T ; A_20  a_21  A_22 ),
    ( t_T ; t_B ) → ( t_0 ; τ_1 ; t_2 ),  ( p_T ; p_B ) → ( p_0 ; π_1 ; p_2 ),  ( v_T ; v_B ) → ( v_0 ; ν_1 ; v_2 )
    [ ( ν_1 ; v_2 ), π_1 ] = DeterminePivot( ( ν_1 ; v_2 ) )
    ( a_01  A_02 ; α_11  a_12^T ; a_21  A_22 ) := ( a_01  A_02 ; α_11  a_12^T ; a_21  A_22 ) P(π_1)^T
    [ ( α_11 ; a_21 ), τ_1 ] := Housev( ( α_11 ; a_21 ) )      (overwrites α_11 with ρ_11 and a_21 with u_21)
    w_12^T := ( a_12^T + a_21^H A_22 ) / τ_1
    ( a_12^T ; A_22 ) := ( a_12^T − w_12^T ; A_22 − a_21 w_12^T )
    v_2 = UpdateWeight( v_2, a_12 )
    · · ·
  endwhile

Figure 4.5.2.1 Rank Revealing Householder QR factorization algorithm.
Combining a blocked Householder QR factorization algorithm, as discussed in Subsub-
section 3.4.1.3, with column pivoting is tricky, since half the computational cost is inherently

in computing the parts of R that are needed to update the weights and that stands in the
way of a true blocked algorithm (that casts most computation in terms of matrix-matrix
multiplication). The following papers are related to this:
• [33] Gregorio Quintana-Orti, Xioabai Sun, and Christof H. Bischof, A BLAS-3 version
of the QR factorization with column pivoting, SIAM Journal on Scientific Computing,
19, 1998.
discusses how to cast approximately half the computation in terms of matrix-matrix
multiplication.
• [25] Per-Gunnar Martinsson, Gregorio Quintana-Orti, Nathan Heavner, Robert van
de Geijn, Householder QR Factorization With Randomization for Column Pivoting
(HQRRP), SIAM Journal on Scientific Computing, Vol. 39, Issue 2, 2017.
shows how a randomized algorithm can be used to cast most computation in terms of
matrix-matrix multiplication.

4.6 Wrap Up
4.6.1 Additional homework
We start with some concrete problems from our undergraduate course titled "Linear Algebra:
Foundations to Frontiers" [26]. If you have trouble with these, we suggest you look at Chapter
11 of that course.
Homework 4.6.1.1 Consider

    A = ( 1  0 ; 0  1 ; 1  1 )   and   b = ( 1 ; 1 ; 0 ).

• Compute an orthonormal basis for C(A).

• Use the method of normal equations to compute the vector x̂ that minimizes min_x ‖b − Ax‖_2.

• Compute the orthogonal projection of b onto C(A).

• Compute the QR factorization of matrix A.

• Use the QR factorization of matrix A to compute the vector x̂ that minimizes min_x ‖b − Ax‖_2.

Homework 4.6.1.2 The vectors

    q_0 = (√2/2) ( 1 ; 1 ) = ( √2/2 ; √2/2 ),   q_1 = (√2/2) ( −1 ; 1 ) = ( −√2/2 ; √2/2 ).

• TRUE/FALSE: These vectors are mutually orthonormal.

• Write the vector ( 4 ; 2 ) as a linear combination of vectors q_0 and q_1.

4.6.2 Summary
The LLS problem can be stated as: Given A ∈ C^{m×n} and b ∈ C^m, find x̂ ∈ C^n such that

    ‖b − A x̂‖_2 = min_{x ∈ C^n} ‖b − A x‖_2.

Given A ∈ C^{m×n},

• The column space, C(A), which is equal to the set of all vectors that are linear combinations of the columns of A:
    { y | y = Ax }.

• The null space, N(A), which is equal to the set of all vectors that are mapped to the zero vector by A:
    { x | Ax = 0 }.

• The row space, R(A), which is equal to the set
    { y | y^H = x^H A }.
  Notice that R(A) = C(A^H).

• The left null space, which is equal to the set of all vectors
    { x | x^H A = 0 }.
  Notice that this set is equal to N(A^H).

• If Ax = b, then there exist x_r ∈ R(A) and x_n ∈ N(A) such that x = x_r + x_n.

These insights are summarized in the following picture, which also captures the orthogonality of the spaces.

If A has linearly independent columns, then the solution of LLS, x̂, equals the solution of the normal equations

    (A^H A) x̂ = A^H b,

as summarized in the accompanying picture. The (left) pseudo inverse of A is given by A† = (A^H A)^{−1} A^H so that the solution of LLS is given by x̂ = A† b.
Definition 4.6.2.1 Condition number of matrix with linearly independent columns.
Let A ∈ C^{m×n} have linearly independent columns (and hence n ≤ m). Then its condition number (with respect to the 2-norm) is defined by

    κ_2(A) = ‖A‖_2 ‖A†‖_2 = σ_0 / σ_{n−1}.

Assuming A has linearly independent columns, let b̂ = A x̂ where b̂ is the projection of b onto the column space of A (in other words, x̂ solves the LLS problem), cos(θ) = ‖b̂‖_2 / ‖b‖_2, and b̂ + δb̂ = A(x̂ + δx̂), where δb̂ equals the projection of δb onto the column space of A. Then

    ‖δx̂‖_2 / ‖x̂‖_2 ≤ ( 1 / cos(θ) ) ( σ_0 / σ_{n−1} ) ‖δb‖_2 / ‖b‖_2

captures the sensitivity of the LLS problem to changes in the right-hand side.
Theorem 4.6.2.2 Given A ∈ C^{m×n}, let A = U_L Σ_TL V_L^H equal its Reduced SVD and

    A = ( U_L | U_R ) ( Σ_TL  0 ; 0  0 ) ( V_L | V_R )^H

its SVD. Then

• C(A) = C(U_L),

• N(A) = C(V_R),

• R(A) = C(A^H) = C(V_L), and

• N(A^H) = C(U_R).

If A has linearly independent columns and A = U_L Σ_TL V_L^H is its Reduced SVD, then

    x̂ = V_L Σ_TL^{−1} U_L^H b

solves LLS.
Given A ∈ C^{m×n}, let A = U_L Σ_TL V_L^H equal its Reduced SVD and A = ( U_L | U_R ) ( Σ_TL  0 ; 0  0 ) ( V_L | V_R )^H its SVD. Then

    x̂ = V_L Σ_TL^{−1} U_L^H b + V_R z_b

is the general solution to LLS, where z_b is any vector in C^{n−r}.
Theorem 4.6.2.3 Assume A ∈ C^{m×n} has linearly independent columns and let A = QR be its QR factorization with orthonormal matrix Q ∈ C^{m×n} and upper triangular matrix R ∈ C^{n×n}. Then the LLS problem

    Find x̂ ∈ C^n such that ‖b − A x̂‖_2 = min_{x ∈ C^n} ‖b − A x‖_2

is solved by the unique solution of

    R x̂ = Q^H b.

Solving LLS via Gram-Schmidt QR factorization for A ∈ C^{m×n}:

• Compute QR factorization via (Classical or Modified) Gram-Schmidt: approximately 2mn^2 flops.

• Compute y = Q^H b: approximately 2mn flops.

• Solve R x̂ = y: approximately n^2 flops.

Solving LLS via Householder QR factorization for A ∈ C^{m×n}:

• Householder QR factorization: approximately 2mn^2 − (2/3)n^3 flops.

• Compute y_T = Q^H b by applying Householder transformations: approximately 4mn − 2n^2 flops.

• Solve R_TL x̂ = y_T: approximately n^2 flops.


Part II

Solving Linear Systems

Week 5

The LU and Cholesky Factorizations

5.1 Opening
5.1.1 Of Gaussian elimination and LU factorization

YouTube: https://www.youtube.com/watch?v=fszE2KNxTmo
Homework 5.1.1.1 Reduce the appended system

     2  −1   1 |  1
    −2   2   1 | −1
     4  −4   1 |  5

to upper triangular form, overwriting the zeroes that are introduced with the multipliers.
Solution.

     2  −1   1 |  1
    −1   1   2 |  0
     2  −2   3 |  3

YouTube: https://www.youtube.com/watch?v=Tt0OQikd-nI

A = LU(A)
  A → ( A_TL  A_TR ; A_BL  A_BR )
  A_TL is 0 × 0
  while n(A_TL) < n(A)
    ( A_TL  A_TR ; A_BL  A_BR ) → ( A_00  a_01  A_02 ; a_10^T  α_11  a_12^T ; A_20  a_21  A_22 )
    a_21 := a_21 / α_11
    A_22 := A_22 − a_21 a_12^T
    ( A_TL  A_TR ; A_BL  A_BR ) ← ( A_00  a_01  A_02 ; a_10^T  α_11  a_12^T ; A_20  a_21  A_22 )
  endwhile

Figure 5.1.1.1 Algorithm that overwrites A with its LU factorization.
Homework 5.1.1.2 The execution of the LU factorization algorithm with

    A = (  2  −1   1
          −2   2   1
           4  −4   1 )

in the video overwrites A with

    (  2  −1   1
      −1   1   2
       2  −2   3 ).

Multiply the L and U stored in that matrix and compare the result with the original matrix, let's call it Â.
Solution.

    L = (  1   0   0            U = ( 2  −1   1
          −1   1   0                  0   1   2
           2  −2   1 )   and          0   0   3 ).

    LU = (  1   0   0   ( 2  −1   1       (  2  −1   1
           −1   1   0     0   1   2   =     −2   2   1    = Â.
            2  −2   1 )   0   0   3 )        4  −4   1 )
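This check is easy to script. A small MATLAB sketch of our own that extracts L and U from the overwritten matrix and verifies that their product reproduces Â:

    Ahat = [ 2 -1 1; -2 2 1; 4 -4 1 ];     % original matrix
    A    = [ 2 -1 1; -1 1 2; 2 -2 3 ];     % matrix overwritten with { L \ U }
    L = tril( A, -1 ) + eye( 3 );          % unit lower triangular part, implicit ones restored
    U = triu( A );                         % upper triangular part
    norm( L * U - Ahat )                   % should be (exactly) zero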

5.1.2 Overview
• 5.1 Opening

¶ 5.1.1 Of Gaussian elimination and LU factorization


¶ 5.1.2 Overview
¶ 5.1.3 What you will learn

• 5.2 From Gaussian elimination to LU factorization

¶ 5.2.1 Gaussian elimination


¶ 5.2.2 LU factorization: The right-looking algorithm
¶ 5.2.3 Existence of the LU factorization
¶ 5.2.4 Gaussian elimination via Gauss transforms

• 5.3 LU factorization with (row) pivoting

¶ 5.3.1 Gaussian elimination with row exchanges


¶ 5.3.2 Permutation matrices
¶ 5.3.3 LU factorization with partial pivoting
¶ 5.3.4 Solving A x = y via LU factorization with pivoting
¶ 5.3.5 Solving with a triangular matrix
¶ 5.3.6 LU factorization with complete pivoting
¶ 5.3.7 Improving accuracy via iterative refinement

• 5.4 Cholesky factorization

¶ 5.4.1 Hermitian Positive Definite matrices


¶ 5.4.2 The Cholesky Factorization Theorem
¶ 5.4.3 Cholesky factorization algorithm (right-looking variant)
¶ 5.4.4 Proof of the Cholesky Factorizaton Theorem
¶ 5.4.5 Cholesky factorization and solving LLS
¶ 5.4.6 Implementation with the classical BLAS

• 5.5 Enrichments

¶ 5.5.1 Other LU factorization algorithms

• 5.6 Wrap Up

¶ 5.6.1 Additional homework


¶ 5.6.2 Summary

5.1.3 What you will learn


This week is all about solving nonsingular linear systems via LU (with or without pivoting)
and Cholesky factorization. In practice, solving Ax = b is not accomplished by forming the
inverse explicitly and then computing x = A≠1 b. Instead, the matrix A is factored into the
product of triangular matrices and it is these triangular matrices that are employed to solve
the system. This requires fewer computations.
Upon completion of this week, you should be able to
• Link Gaussian elimination to LU factorization.

• View LU factorization in different ways: as Gaussian elimination, as the application of


a sequence of Gauss transforms, and the operation that computes L and U such that
A = LU .

• State and prove necessary conditions for the existence of the LU factorization.

• Extend the ideas behind Gaussian elimination and LU factorization to include pivoting.

• Derive different algorithms for LU factorization and for solving the resulting triangular
systems.

• Employ the LU factorization, with or without pivoting, to solve Ax = b.

• Identify, prove, and apply properties of Hermitian Positive Definite matrices.

• State and prove conditions related to the existence of the Cholesky factorization.

• Derive Cholesky factorization algorithms.

• Analyze the cost of the different factorization algorithms and related algorithms for
solving triangular systems.

5.2 From Gaussian elimination to LU factorization


5.2.1 Gaussian elimination

YouTube: https://www.youtube.com/watch?v=UdN0W8Czj8c

Homework 5.2.1.1 Solve

    (  2  −1   1   ( χ_0     ( −6
      −4   0   1     χ_1  =     2
       4   0  −2 )   χ_2 )      0 ).

Answer.

    ( χ_0     ( −1
      χ_1  =     2
      χ_2 )    −2 ).

Solution. We employ Gaussian elimination applied to an appended system:

•    2  −1   1 | −6
    −4   0   1 |  2
     4   0  −2 |  0

• Compute the multiplier λ_{1,0} = (−4)/(2) = −2.

• Subtract λ_{1,0} = −2 times the first row from the second row, yielding

     2  −1   1 |  −6
     0  −2   3 | −10
     4   0  −2 |   0

• Compute the multiplier λ_{2,0} = (4)/(2) = 2.

• Subtract λ_{2,0} = 2 times the first row from the third row, yielding

     2  −1   1 |  −6
     0  −2   3 | −10
     0   2  −4 |  12

• Compute the multiplier λ_{2,1} = (2)/(−2) = −1.

• Subtract λ_{2,1} = −1 times the second row from the third row, yielding

     2  −1   1 |  −6
     0  −2   3 | −10
     0   0  −1 |   2

• Solve the triangular system

    ( 2  −1   1   ( χ_0     (  −6
      0  −2   3     χ_1  =    −10
      0   0  −1 )   χ_2 )       2 )

  to yield

    ( χ_0     ( −1
      χ_1  =     2
      χ_2 )    −2 ).
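The same eliminations can be scripted in MATLAB. A small sketch of our own that applies the multipliers to the appended system and then solves the resulting upper triangular system with the backslash operator:

    Ab = [ 2 -1 1 -6; -4 0 1 2; 4 0 -2 0 ];        % appended system [ A b ]
    for j = 1:2
      for i = j+1:3
        lambda = Ab( i, j ) / Ab( j, j );          % multiplier
        Ab( i, : ) = Ab( i, : ) - lambda * Ab( j, : );
      end
    end
    x = triu( Ab( :, 1:3 ) ) \ Ab( :, 4 )          % yields x = [ -1; 2; -2 ]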

The exercise in Homework 5.2.1.1 motivates the following algorithm, which reduces the linear system Ax = b stored in n × n matrix A and right-hand side vector b of size n to an upper triangular system.

    for j := 0, . . . , n − 1
      for i := j + 1, . . . , n − 1
        λ_{i,j} := α_{i,j} / α_{j,j}
        α_{i,j} := 0
        for k = j + 1, . . . , n − 1
          α_{i,k} := α_{i,k} − λ_{i,j} α_{j,k}      (subtract λ_{i,j} times row j from row i)
        endfor
        β_i := β_i − λ_{i,j} β_j
      endfor
    endfor

This algorithm completes as long as no divide by zero is encountered.
Let us manipulate this a bit. First, we notice that we can first reduce the matrix to an upper triangular matrix, and then update the right-hand side using the multipliers that were computed along the way (if these are stored):

    (reduce A to upper triangular form)
    for j := 0, . . . , n − 1
      for i := j + 1, . . . , n − 1
        λ_{i,j} := α_{i,j} / α_{j,j}
        α_{i,j} := 0
        for k = j + 1, . . . , n − 1
          α_{i,k} := α_{i,k} − λ_{i,j} α_{j,k}      (subtract λ_{i,j} times row j from row i)
        endfor
      endfor
    endfor

    (update b using the multipliers; forward substitution)
    for j := 0, . . . , n − 1
      for i := j + 1, . . . , n − 1
        β_i := β_i − λ_{i,j} β_j
      endfor
    endfor

Ignoring the updating of the right-hand side (a process known as forward substitution), for each iteration we can first compute the multipliers and then update the matrix:

    for j := 0, . . . , n − 1
      for i := j + 1, . . . , n − 1
        λ_{i,j} := α_{i,j} / α_{j,j}                (compute multipliers)
        α_{i,j} := 0
      endfor
      for i := j + 1, . . . , n − 1
        for k = j + 1, . . . , n − 1
          α_{i,k} := α_{i,k} − λ_{i,j} α_{j,k}      (subtract λ_{i,j} times row j from row i)
        endfor
      endfor
    endfor

Since we know that α_{i,j} is set to zero, we can use its location to store the multiplier:

    for j := 0, . . . , n − 1
      for i := j + 1, . . . , n − 1
        α_{i,j} := λ_{i,j} = α_{i,j} / α_{j,j}      (compute all multipliers)
      endfor
      for i := j + 1, . . . , n − 1
        for k = j + 1, . . . , n − 1
          α_{i,k} := α_{i,k} − α_{i,j} α_{j,k}      (subtract λ_{i,j} times row j from row i)
        endfor
      endfor
    endfor

Finally, we can cast the computation in terms of operations with vectors and submatrices:

    for j := 0, . . . , n − 1
      ( α_{j+1,j} ; . . . ; α_{n−1,j} ) := ( α_{j+1,j} ; . . . ; α_{n−1,j} ) / α_{j,j}

      ( α_{j+1,j+1} · · · α_{j+1,n−1}        ( α_{j+1,j+1} · · · α_{j+1,n−1}        ( α_{j+1,j}
             ...               ...      :=          ...               ...      −         ...      ) ( α_{j,j+1} · · · α_{j,n−1} )
        α_{n−1,j+1} · · · α_{n−1,n−1} )        α_{n−1,j+1} · · · α_{n−1,n−1} )        α_{n−1,j}
    endfor

In Figure 5.2.1.1 this algorithm is presented with our FLAME notation.



A = GE(A)
  A → ( A_TL  A_TR ; A_BL  A_BR )
  A_TL is 0 × 0
  while n(A_TL) < n(A)
    ( A_TL  A_TR ; A_BL  A_BR ) → ( A_00  a_01  A_02 ; a_10^T  α_11  a_12^T ; A_20  a_21  A_22 )
    a_21 := l_21 = a_21 / α_11
    A_22 := A_22 − a_21 a_12^T
    ( A_TL  A_TR ; A_BL  A_BR ) ← ( A_00  a_01  A_02 ; a_10^T  α_11  a_12^T ; A_20  a_21  A_22 )
  endwhile

Figure 5.2.1.1 Gaussian elimination algorithm that reduces a matrix A to upper triangular form, storing the multipliers below the diagonal.
Homework 5.2.1.2 Apply the algorithm Figure 5.2.1.1 to the matrix
Q R
2 ≠1 1
c d
a ≠4 0 1 b
4 0 ≠2

and report the resulting matrix. Compare the contents of that matrix to the upper triangular
matrix computed in the solution of Homework 5.2.1.1.
Answer. Q R
2 ≠1 1
c
a ≠2 ≠2 3 d
b
2 ≠1 ≠1
Solution. Partition: Q R
2 ≠1 1
c d
a ≠4 0 1 b
4 0 ≠2
• First iteration:

¶ –21 := ⁄21 = –21 /–11 : Q R


2 ≠1 1
c d
a ≠2 0 1 b
2 0 ≠2
¶ A22 := A22 ≠ a21 aT12 : Q R
2 ≠1 1
c d
a ≠2 ≠2 3 b
2 2 ≠4

¶ State at bottom of iteration:


Q R
2 ≠1 1
c d
a ≠2 ≠2 3 b
2 2 ≠4
• Second iteration:
¶ –21 := ⁄21 = –21 /–11 : Q R
2 ≠1 1
c d
a ≠2 ≠2 3 b
2 ≠1 ≠4
¶ A22 := A22 ≠ a21 aT12 : Q R
2 ≠1 1
c d
a ≠2 ≠2 3 b
2 ≠1 ≠1
¶ State at bottom of iteration:
Q R
2 ≠1 1
c
a ≠2 ≠2 3 d
b
2 ≠1 ≠1
• Third iteration:
¶ –21 := ⁄21 = –21 /–11 : Q R
2 ≠1 1
c
a ≠2 ≠2 3 d
b
2 ≠1 ≠1
(computation with empty vector).
¶ A22 := A22 ≠ a21 aT12 : Q R
2 ≠1 1
c
a ≠2 ≠2 3 d
b
2 ≠1 ≠1
(update of empty matrix)
¶ State at bottom of iteration:
Q R
2 ≠1 1
c
a ≠2 ≠2 3 d
b
2 ≠1 ≠1
The upper triangular matrix computed in Homework 5.2.1.1 was
Q R
2 ≠1 1
a 0 ≠2 3 d
c
b
0 0 ≠1
which can be found in the upper triangular part of the updated matrix A.

Homework 5.2.1.3 Applying Figure 5.2.1.1 to the matrix


Q R
2 ≠1 1
c
A = a ≠4 0 1 d
b
4 0 ≠2

yielded Q R
2 ≠1 1
c
a ≠2 ≠2 3 d
b.
2 ≠1 ≠1
This can be thought of as an array that stores the unit lower triangular matrix L below the
diagonal (with implicit ones on its diagonal) and upper triangular matrix U on and above
its diagonal: Q R Q R
1 0 0 2 ≠1 1
c
L = a ≠2 1 0 d c
b and U = a 0 ≠2 3 d
b
2 ≠1 1 0 0 ≠1
Compute B = LU and compare it to A.
Answer. Magic! B = A!
Solution.
Q RQ R Q R
1 0 0 2 ≠1 1 2 ≠1 1
c dc d c
B = LU = a ≠2 1 0 b a 0 ≠2 3 b = a ≠4 0 1 d
b = A.
2 ≠1 1 0 0 ≠1 4 0 ≠2

5.2.2 LU factorization: The right-looking algorithm

YouTube: https://www.youtube.com/watch?v=GfpB_RU8pIo
In the launch of this week, we mentioned an algorithm that computes the LU factorization
of a given matrix A so that
A = LU,
where L is a unit lower triangular matrix and U is an upper triangular matrix. We now
derive that algorithm, which is often called the right-looking algorithm for computing the
LU factorization.

Partition A, L, and U as follows:

    A → ( α_11  a_12^T        L → ( 1     0            U → ( υ_11  u_12^T
          a_21  A_22 ),             l_21  L_22 ),   and        0    U_22 ).

Then A = LU means that

    ( α_11  a_12^T     ( 1     0      ( υ_11  u_12^T     ( υ_11       u_12^T
      a_21  A_22 )  =    l_21  L_22 )   0     U_22 )  =    l_21 υ_11  l_21 u_12^T + L_22 U_22 ).

Hence

    α_11 = υ_11           a_12^T = u_12^T
    a_21 = υ_11 l_21      A_22 = l_21 u_12^T + L_22 U_22

or, equivalently,

    α_11 = υ_11           a_12^T = u_12^T
    a_21 = υ_11 l_21      A_22 − l_21 u_12^T = L_22 U_22.

If we overwrite the upper triangular part of A with U and the strictly lower triangular part of A with the strictly lower triangular part of L (since we know that its diagonal consists of ones), we deduce that we must perform the computations

• a_21 := l_21 = a_21 / α_11.

• A_22 := A_22 − l_21 a_12^T = A_22 − a_21 a_12^T.

• Continue by computing the LU factorization of the updated A_22.

The resulting algorithm is given in Figure 5.2.2.1.

A = LU-right-looking(A)
  A → ( A_TL  A_TR ; A_BL  A_BR )
  A_TL is 0 × 0
  while n(A_TL) < n(A)
    ( A_TL  A_TR ; A_BL  A_BR ) → ( A_00  a_01  A_02 ; a_10^T  α_11  a_12^T ; A_20  a_21  A_22 )
    a_21 := a_21 / α_11
    A_22 := A_22 − a_21 a_12^T
    ( A_TL  A_TR ; A_BL  A_BR ) ← ( A_00  a_01  A_02 ; a_10^T  α_11  a_12^T ; A_20  a_21  A_22 )
  endwhile

Figure 5.2.2.1 Right-looking LU factorization algorithm.
Before we discuss the cost of this algorithm, let us discuss a trick that is often used in the analysis of the cost of algorithms in linear algebra. We can approximate sums with integrals:

    Σ_{k=0}^{n−1} k^p ≈ ∫_0^n x^p dx = [ x^{p+1}/(p+1) ]_0^n = n^{p+1}/(p+1).

Homework 5.2.2.1 Give the approximate cost incurred by the algorithm in Figure 5.2.2.1
when applied to an n ◊ n matrix.
Answer. Approximately 23 n3 flops.
Solution. Consider the iteration where AT L is (initially) k ◊ k. Then

• a21 is of size n ≠ k ≠ 1. Thus a21 := a21 /–11 is typically computed by first computing
1/–11 and then a21 := (1/–11 )a21 , which requires (n ≠ k ≠ 1) flops. (The cost of
computing 1/–11 is inconsequential when n is large, so it is usually ignored.)

• A22 is of size (n ≠ k ≠ 1) ◊ (n ≠ k ≠ 1) and hence the rank-1 update A22 := A22 ≠ a21 aT12
requires 2(n ≠ k ≠ 1)(n ≠ k ≠ 1) flops.

Now, the cost of updating a21 is small relative to that of the update of A22 and hence will
be ignored. Thus, the total cost is given by, approximately,
n≠1
ÿ
2(n ≠ k ≠ 1)2 flops.
k=0

Let us now simplify this:


qn≠1
k=0 2(n ≠ k ≠ 1)2
= < change of variable: j = n ≠ k ≠ 1 >
qn≠1 2
j=0 2j
= < algebra >
qn≠1 2
2 j=0 j
q sn 2
j=0 j ¥ 0 x dx = n /3 >
2 3
¥ < n≠1
2 3
3
n
Homework 5.2.2.2 Give the approximate cost incurred by the algorithm in Figure 5.2.2.1
when applied to an m ◊ n matrix.
Answer. Approximately mn2 ≠ 13 n3 flops.
Solution. Consider the iteration where AT L is (initially) k ◊ k. Then

• a21 is of size m ≠ k ≠ 1. Thus a21 := a21 /–11 is typically computed by first computing
1/–11 and then a21 := (1/–11 )a21 , which requires (m ≠ k ≠ 1) flops. (The cost of
computing 1/–11 is inconsequential when m is large.)

• A22 is of size (m ≠ k ≠ 1) ◊ (n ≠ k ≠ 1) and hence the rank-1 update A22 := A22 ≠ a21 aT12
requires 2(m ≠ k ≠ 1)(n ≠ k ≠ 1) flops.

Now, the cost of updating a21 is small relative to that of the update of A22 and hence will
be ignored. Thus, the total cost is given by, approximately,
n≠1
ÿ
2(m ≠ k ≠ 1)(n ≠ k ≠ 1) flops.
k=0

Let us now simplify this:


qn≠1
k=0 2(m ≠ k ≠ 1)(n ≠ k ≠ 1)
= < change of variable: j = n ≠ k ≠ 1 >
qn≠1
j=0 2(m ≠ (n ≠ j ≠ 1) ≠ 1)j
= < simplify >
qn≠1
j=0 2(m ≠ n + j)j
= < algebra >
q qn≠1 2
2(m ≠ n) n≠1 j=0 j + 2 j=0 j
q q
¥ < j=0 j ¥ n2 /2 and n≠1
n≠1 2 3
j=0 j ¥ n /3 >
2 3
(m ≠ n)n + 3 n
2

= < simplify >


2 1 3
mn ≠ 3 n
Remark 5.2.2.2 In a practical application of LU factorization, it is uncommon to factor
a non-square matrix. However, high-performance implementations of the LU factorization
that use "blocked" algorithms perform a factorization of a rectangular submatrix of A, which
is why we generalize beyond the square case.
Homework 5.2.2.3 It is a good idea to perform a "git pull" in the Assignments directory to update with the latest files before you start new programming assignments.
Implement the algorithm given in Figure 5.2.2.1 as

    function [ A_out ] = LU_right_looking( A )

by completing the code in Assignments/Week05/matlab/LU_right_looking.m. Input is an m × n matrix A. Output is the matrix A that has been overwritten by the LU factorization. You may want to use Assignments/Week05/matlab/test_LU_right_looking.m to check your implementation.
Solution. See Assignments/Week05/answers/LU_right_looking.m.
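For reference, a minimal MATLAB sketch of the right-looking algorithm follows. It is our own deliberately bare-bones version, named differently so as not to be confused with the official answer in the repository:

    function A = LU_right_looking_sketch( A )
      % Overwrite A with its LU factorization: U in the upper triangular part,
      % the multipliers (the strictly lower triangular part of L) below the diagonal.
      [ m, n ] = size( A );
      for j = 1:min( m, n )
        A( j+1:m, j ) = A( j+1:m, j ) / A( j, j );                              % a21 := a21 / alpha11
        A( j+1:m, j+1:n ) = A( j+1:m, j+1:n ) - A( j+1:m, j ) * A( j, j+1:n );  % A22 := A22 - a21 * a12^T
      end
    end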

5.2.3 Existence of the LU factorization

YouTube: https://www.youtube.com/watch?v=Aaa9n97N1qc
Now that we have an algorithm for computing the LU factorization, it is time to talk
about when this LU factorization exists (in other words: when we can guarantee that the
algorithm completes).

We would like to talk about the existence of the LU factorization for the more general case where A is an m × n matrix, with m ≥ n. What does this mean?
Definition 5.2.3.1 Given a matrix A ∈ C^{m×n} with m ≥ n, its LU factorization is given by A = LU where L ∈ C^{m×n} is unit lower trapezoidal and U ∈ C^{n×n} is upper triangular with nonzeroes on its diagonal.
The first question we will ask is when the LU factorization exists. For this, we need another definition.
Definition 5.2.3.2 Principal leading submatrix. For k ≤ n, the k × k principal leading submatrix of a matrix A is defined to be the square matrix A_TL ∈ C^{k×k} such that

    A = ( A_TL  A_TR ; A_BL  A_BR ).

This definition allows us to state necessary and sufficient conditions for when a matrix with n linearly independent columns has an LU factorization:
Lemma 5.2.3.3 Let L ∈ C^{n×n} be a unit lower triangular matrix and U ∈ C^{n×n} be an upper triangular matrix. Then A = LU is nonsingular if and only if U has no zeroes on its diagonal.
Homework 5.2.3.1 Prove Lemma 5.2.3.3.
Hint. You may use the fact that a triangular matrix has an inverse if and only if it has no
zeroes on its diagonal.
Solution. The proof hinges on the fact that a triangular matrix is nonsingular if and only
if it doesn’t have any zeroes on its diagonal. Hence we can instead prove that A = LU is
nonsingular if and only if U is nonsingular ( since L is unit lower triangular and hence has
no zeroes on its diagonal).

• (∆): Assume A = LU is nonsingular. Since L is nonsingular, U = L≠1 A. We can


show that U is nonsingular in a number of ways:

¶ We can explicitly give its inverse:

U (A≠1 L) = L≠1 AA≠1 L = I.

Hence U has an inverse and is thus nonsingular.


¶ Alternatively, we can reason that the product of two nonsingular matrices, namely
L≠1 and A, is nonsingular.

• (≈): Assume A = LU and U has no zeroes on its diagonal. We then know that both
L≠1 and U ≠1 exist. Again, we can either explicitly verify a known inverse of A:

A(U ≠1 L≠1 ) = LU U ≠1 L≠1 = I

or we can recall that the product of two nonsingular matrices, namely U ≠1 and L≠1 ,
is nonsingular.

Theorem 5.2.3.4 Existence of the LU factorization. Let A œ Cm◊n and m Ø n have


linearly independent columns. Then A has a (unique) LU factorization if and only if all its
principal leading submatrices are nonsingular.

YouTube: https://www.youtube.com/watch?v=SP E5xJF9hY


Proof.
• (∆): Let nonsingular A have a (unique) LU factorization. We will show that its
principal leading submatrices are nonsingular.
Let A B A B A B
AT L AT R LT L 0 UT L UT R
=
ABL ABR LBL LBR 0 UBR
¸ ˚˙ ˝ ¸ ˚˙ ˝ ¸ ˚˙ ˝
A L U
be the LU factorization of A, where AT L , LT L , UT L œ Ck◊k . By the assumption that
LU is the LU factorization of A, we know that U cannot have a zero on the diagonal
and hence is nonsingular. Now, since
A B A B A B
AT L AT R LT L 0 UT L UT R
=
ABL ABR LBL LBR 0 UBR
¸ ˚˙ ˝ ¸ ˚˙ ˝ ¸ ˚˙ ˝
A A L U B
LT L UT L LT L UT R
= ,
LBL UT L LBL UT R + LBL UBR

the k ◊ k principal leading submatrix AT L equals AT L = LT L UT L , which is nonsingular


since LT L has a unit diagonal and UT L has no zeroes on the diagonal. Since k was
chosen arbitrarily, this means that all principal leading submatrices are nonsingular.

• (≈): We will do a proof by induction on n.


A B
–11
¶ Base Case: n = 1. Then A has the form A = where –11 is a scalar.
a21
Since
A the principal
B leading submatrices are nonsingular –11 ”= 0. Hence A =
1
–11 is the LU factorization of A. This LU factorization is unique
a21 /–11 ¸˚˙˝
¸ ˚˙ ˝ U
L
because the first element of L must be 1.

¶ Inductive Step: Assume the result is true for all matrices with n = k. Show it is
true for matrices with n = k + 1.
Let A of size n = k + 1 have nonsingular principal leading submatrices. Now, if
an LU factorization of A exists, A = LU , then it would have to form
Q R Q R
A00 a01 L00 0 A B
c T d c T d U00 u01
a a10 –11 b = a l10 1 b . (5.2.1)
0 ‚11
A20 a21 L20 l21 ¸ ˚˙ ˝
¸ ˚˙ ˝ ¸ ˚˙ ˝
U
A L

If we can show that the different parts of L and U exist, are unique, and ‚11 ”= 0,
we are done (since then U is nonsingular). (5.2.1) can be rewritten as
Q R Q R Q R Q R
A00 L00 a01 L00 u01
c T d c T d c d c d
a a10 b = a l10 b U00 and a –11 b = a l10
T
u01 + ‚11 b ,
A20 L20 a21 L20 u01 + l21 ‚11

or, equivalently, Y
_
] L00 u01 = a01
‚11 = –11 ≠ l10
T
u01
_
[
l21 = (a21 ≠ L20 u01 )/‚11
Now, by the Inductive Hypothesis L00 , l10 T
, and L20 exist and are unique. So the
question is whether u01 , ‚11 , and l21 exist and are unique:
⌅ u01 exists and is unique. Since L00 is nonsingular (it has ones on its diagonal)
L00 u01 = a01 has a solution that is unique.
⌅ ‚11 exists, is unique, and is nonzero. Since l10
T
and u01 exist and are unique,
‚11 = –11 ≠ l10 u01 exists and is unique. It is also nonzero since the principal
T

leading submatrix of A given by


A B A BA B
A00 a01 L00 0 U00 u01
= ,
aT10 –11 T
l10 1 0 ‚11

is nonsingular by assumption and therefore ‚11 must be nonzero.


⌅ l21 exists and is unique. Since ‚11 exists, is unique, and is nonzero,

l21 = (a21 ≠ L20 a01 )/‚11

exists and is uniquely determined.


Thus the m ◊ (k + 1) matrix A has a unique LU factorization.
¶ By the Principal of Mathematical Induction the result holds.


The formulas in the inductive step of the proof of Theorem 5.2.3.4 suggest an alternative

algorithm for computing the LU factorization of a m ◊ n matrix A with m Ø n, given in


Figure 5.2.3.5. This algorithm is often referred to as the (unblocked) left-looking algorithm.
A = LU-left-looking(A)
A B
AT L AT R

ABL ABR
AT L is 0 ◊ 0
while n(AT L ) < n(A) Q R
A B A00 a01 A02
AT L AT R
æc a aT10 –11
d
aT12 b
ABL ABR
A20 a21 A22
Solve L00 u01 = a01 overwriting a01 with u01
–11 := ‚11 = –11 ≠ aT10 a01
a21 := a21 ≠ A20 a01
a21 := l21 = a21 /–11
Q R
A B A00 a01 A02
AT L AT R c T
Ω a a10 –11 aT12 d
b
ABL ABR
A20 a21 A22
endwhile
Figure 5.2.3.5 Left-looking LU factorization algorithm. L00 is the unit lower triangular
matrix stored in the strictly lower triangular part of A00 (with the diagonal implicitly stored).
Homework 5.2.3.2 Show that if the left-looking algorithm in Figure 5.2.3.5 is applied to
an m ◊ n matrix, with m Ø n, the cost is approximately mn2 ≠ 13 n3 flops (just like the
right-looking algorithm).
Solution. Consider the iteration where AT L is (initially) k ◊ k. Then

• Solving L00 u01 = a21 requires approximately k 2 flops.

• Updating –11 := –11 ≠ aT10 a01 requires approximately 2k flops, which we will ignore.

• Updating a21 := a21 ≠ A20 a01 requires approximately 2(m ≠ k ≠ 1)k flops.

• Updating a21 := a21 /–11 requires approximately (m ≠ k ≠ 1) flops, which we will ignore.

Thus, the total cost is given by, approximately,

ÿ1
n≠1 2
k 2 + 2(m ≠ k ≠ 1)k flops.
k=0

Let us now simplify this:


qn≠1
k=0 (k 2 + 2(m ≠ k ≠ 1)k)
= < algebra >
qn≠1 2 q
k=0 k + 2 n≠1
k=0 (m ≠ k ≠ 1)k
= < algebra >
qn≠1 q
k=0 2(m ≠q 1)k ≠ n≠1 2
k=0 k q
¥ < j=0 j ¥ n /2 and n≠1
n≠1 2 2 3
j=0 j ¥ n /3 >
1 3
(m ≠ 1)n ≠ 3 n
2

Had we not ignored the cost of –11 := –11 ≠ aT10 a01 , which approximately 2k, then the result
would have been approximately
1
mn2 ≠ n3
3
1 3
instead of (m ≠ 1)n ≠ 3 n , which is identical to that of the right-looking algorithm in
2

Figure 5.2.2.1. This makes sense, since the two algorithms perform the same operations in a
different order.
Of course, regardless,
1 1
(m ≠ 1)n2 ≠ n3 ¥ mn2 ≠ n3
3 3
if m is large.
Remark 5.2.3.6 A careful analysis would show that the left- and right-looking algorithms
perform the exact same operations with the same elements of A, except in a different order.
Thus, it is no surprise that the costs of these algorithms are the same.
Ponder This 5.2.3.3 If A is m ◊ m (square!), then yet another algorithm can be derived
by partitioning A, L, and U so that
A B A B A B
A00 a01 L00 0 U00 u01
A= ,L = ,U = .
aT10 –11 T
l10 1 0 ‚11

Assume that L00 and U00 have already been computed in previous iterations, and determine

how to compute u01 , l10


T
, and ‚11 in the current iteration. Then fill in the algorithm:

A = LU-bordered(A)
A B
AT L AT R

ABL ABR
AT L is 0 ◊ 0
while n(AT L ) < n(A) Q R
A B A00 a01 A02
AT L AT R c d
æ a aT10 –11 aT12 b
ABL ABR
A20 a21 A22

Q R
A B A00 a01 A02
AT L AT R c d
Ω a aT10 –11 aT12 b
ABL ABR
A20 a21 A22
endwhile

This algorithm is often called the bordered LU factorization algorithm.


Next, modify the proof of Theorem 5.2.3.4 to show the existence of the LU factorization
when A is square and has nonsingular leading principal submatrices.
Finally, show that this bordered algorithm also requires approximately 2m3 /3 flops.
Homework 5.2.3.4 Implement the algorithm given in Figure 5.2.3.5 as

    function [ A_out ] = LU_left_looking( A )

by completing the code in Assignments/Week05/matlab/LU_left_looking.m. Input is an m × n matrix A. Output is the matrix A that has been overwritten by the LU factorization. You may want to use Assignments/Week05/matlab/test_LU_left_looking.m to check your implementation.
Solution. See Assignments/Week05/answers/LU_left_looking.m.

5.2.4 Gaussian elimination via Gauss transforms



YouTube: https://www.youtube.com/watch?v=YDtynD4iAVM
Definition 5.2.4.1 A matrix L_k of the form

    L_k = ( I_k  0     0
            0    1     0
            0    l_21  I ),

where I_k is the k × k identity matrix and I is an identity matrix "of appropriate size", is called a Gauss transform.

Gauss transforms, when applied to a matrix, take multiples of the row indexed with k and add these multiples to other rows. In our use of Gauss transforms to explain the LU factorization, we subtract instead:
Example 5.2.4.2 Evaluate

    ( 1   0      0  0     ( ã_0^T
      0   1      0  0       ã_1^T
      0  −λ_21   1  0       ã_2^T
      0  −λ_31   0  1 )     ã_3^T )  =

Solution.

    ( 1   0      0  0     ( ã_0^T        ( ã_0^T
      0   1      0  0       ã_1^T          ã_1^T
      0  −λ_21   1  0       ã_2^T    =     ã_2^T − λ_21 ã_1^T
      0  −λ_31   0  1 )     ã_3^T )        ã_3^T − λ_31 ã_1^T ).

Notice the similarity with what one does in Gaussian elimination: take multiples of one row and subtract these from other rows.
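The action of a Gauss transform is easy to experiment with in MATLAB. A small sketch of our own (the sizes and multipliers are arbitrary) that subtracts multiples of the second row from the rows below it:

    A = rand( 4, 3 );
    l21 = [ 0.5; -2 ];                        % multipliers for rows 3 and 4
    Lk = eye( 4 );  Lk( 3:4, 2 ) = -l21;      % Gauss transform that subtracts multiples of row 2
    B = Lk * A;                               % rows 3 and 4 of B: rows of A minus l21 times row 2
    norm( B( 3:4, : ) - ( A( 3:4, : ) - l21 * A( 2, : ) ) )   % should be zero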
Homework 5.2.4.1 Evaluate
Q RQ R
Ik 0 0 A00 a01 A02
c dc d
a 0 1 0 b a 0 –11 aT12 b
0 ≠l21 I 0 a21 A22

where Ik is the k ◊ k identity matrix and A0 has k rows. If we compute


Q R Q RQ R
A00 a01 A02 Ik 0 0 A00 a01 A02
a 0
c
–11 aT12 d c
b := a 0 1
dc d
0 b a 0 –11 aT12 b
0 a‚21 A‚22 0 ≠l21 I 0 a21 A22

how should l21 be chosen if we want a‚21 to be a zero vector?



Solution. Q RQ R
Ik 0 0 A00 a01 A02
c dc d
a 0 1 0 b a 0 –11 aT12 b
0 Q≠l21 I 0 a21 A22 R
A00 a01 A02
c d
=a 0 –11 aT12 b
Q
0 ≠l21 –11 + a21 ≠l21 aT12 + AR22
A00 a01 A02
c d
=a 0 –11 aT12 b
0 a21 ≠ –11 l21 A22 ≠ l21 aT12
If l21 = a21 /–11 then aÂ21 = a21 ≠ –11 a21 /–11 = 0.
Hopefully you notice the parallels between the computation in the last homework, and
the algorithm in Figure 5.2.1.1.
Now, assume that the right-looking LU factorization has proceeded to where A contains
Q R
A00 a01 A02
c d
a 0 –11 aT12 b ,
0 a21 A22

where A00 is upper triangular (recall: it is being overwritten by U !). What 1we would like
2 to
do is eliminate the elements in a21 by taking multiples of the "current row"’ –11 aT12 and
1 2
subtract these from the rest of the rows: a21 A22 in order to introduce zeroes below
–11 . The vehicle is an appropriately chosen Gauss transform, inspired by Homework 5.2.4.1.
We must determine l21 so that
Q RQ R Q R
I 0 0 A00 a01 A02 A00 a01 A02
c dc d c d
a 0 1 0 b a 0 –11 aT12 b = a 0 –11 aT12 b.
0 ≠l21 I 0 a21 A22 0 0 A22 ≠ l21 a12
T

As we saw in Homework 5.2.4.1, this means we must pick l21 = a21 /–11 . The resulting
algorithm is summarized in Figure 5.2.4.3. Notice that this algorithm is, once again, identical
to the algorithm in Figure 5.2.1.1 (except that it does not overwrite the lower triangular
matrix).

A = GE-via-Gauss-transforms(A)
A B
AT L AT R

ABL ABR
AT L is 0 ◊ 0
while n(AT L ) < n(A) Q R
A B A00 a01 A02
AT L AT R c d
æ a aT10 –11 aT12 b
ABL ABR
A20 a21 A22
l21 := a21 /–11
Q R Z
A00 a01 A02 _
_
c _
_
T d
a 0 –11 a12 b _
_
_
_
_
0 a
Q 21 A 22 RQ
_
_
R _
_
_
I 0 0 A00 a01 A02 _
_
^ a := 0
1 0 b a 0 –11 a12 b _ 21
c dc T d
:= a 0
A22 := A22 ≠ l21 aT12
Q
0 ≠l 21 0 0 a 21
R
A22
_
_
_
_
_
_
A00 a01 A02 _
_
c d _
_
= a 0 –11 T
a12 b
_
_
_
_
_
0 0 A22 ≠ l21 a12
T \
Q R
A B A00 a01 A02
AT L AT R c d
Ω a aT10 –11 aT12 b
ABL ABR
A20 a21 A22
endwhile
Figure 5.2.4.3 Gaussian elimination, formulated as a sequence of applications of Gauss
transforms.
Homework 5.2.4.2 Show that
Q R≠1 Q R
Ik 0 0 Ik 0 0
c d
a 0 1 0 b =c
a 0 1 0 b
d

0 ≠l21 I 0 l21 I
where Ik denotes the k ◊ k identity matrix.
Hint. To show that B = A≠1 , it suffices to show that BA = I (if A and B are square).
Solution. Q RQ R
Ik 0 0 Ik 0 0
c dc d
a 0 1 0 ba 0 1 0 b
0 Q≠l21 I(n≠k≠1)◊(n≠k≠1)
R
0 l21 I
Ik 0 0
=ca 0 1
d
0 b
Q
0 ≠l21 + l
R 21
I
Ik 0 0
=ca 0 1 0 b
d

0 0 I

Starting with an m ◊ m matrix A, the algorithm computes a sequence of m Gauss


transforms L0 , . . . , Lm≠1 , each of the form
Q R
Ik 0 0
Lk = c
a 0 1
d
0 b, (5.2.2)
0 ≠l21 I

such that Lm≠1 Lm≠2 · · · L1 L0 A = U . Equivalently, A = L≠1


0 L1 · · · Lm≠2 Lm≠1 U , where
≠1 ≠1 ≠1

Q R
Ik 0 0
L≠1
k =c
a 0
d
1 0 b.
0 l21 I

It is easy to show that the product of unit lower triangular matrices is itself unit lower
triangular. Hence
L = L≠1 ≠1 ≠1 ≠1
0 L1 · · · Ln≠2 Ln≠1

is unit lower triangular. However, it turns out that this L is particularly easy to compute,
as the following homework suggests.
Homework 5.2.4.3 Let
Q R Q R
L00 0 0 Ik 0 0
c T d c d
L̃k≠1 = L≠1 ≠1 ≠1
0 L1 . . . Lk≠1 = a l10 1 0 b and L≠1
k = a 0 1 0 b,
L20 0 I 0 l21 I

where L00 is a k ◊ k unit lower triangular matrix. Show that


Q R
L00 0 0
= a l10 1 0 d
c T
L̃k = L̃≠1 ≠1
k≠1 Lk b.
L20 l21 I

Solution.
L̃k = L
Q0
≠1 ≠1
L1 · · · Lk≠1 L≠1 = L̃k≠1 LR
R Qk
≠1
k
L00 0 0 Ik 0 0
c T dc d
= a l10 1 0 ba 0 1 0 b
L
Q 20
0 I R 0 l21 I
L00 0 0
c T d
= a l10 1 0 b.
L20 l21 I
What this exercise shows is that L = L≠1
0 L1 · · · Ln≠2 Ln≠1 is the triangular matrix that
≠1 ≠1 ≠1

is created by simply placing the computed vectors l21 below the diagonal of a unit lower
triangular matrix. This insight explains the "magic" observed in Homework 5.2.1.3. We
conclude that the algorithm in Figure 5.2.1.1 overwrites n ◊ n matrix A with unit lower
triangular matrix L and upper triangular matrix U such that A = LU . This is known as
the LU factorization or LU decomposition of A.

Ponder This 5.2.4.4 Let Q R


Ik◊k 0 0
c d
Lk = a 0 1 0 b.
0 ≠l21 I
Show that
Ÿ2 (Lk ) Ø Îl21 Î22 .
What does this mean about how error in A may be amplified if the pivot (the –11 by which
entries in a21 are divided to compute l21 ) encountered in the right-looking LU factorization
algorithm is small in magnitude relative to the elements below it? How can we chose which
row to swap so as to minimize Îl21 Î2 ?
Hint. Revisit Homework 1.3.5.5.

5.3 LU factorization with (row) pivoting


5.3.1 Gaussian elimination with row exchanges

YouTube: https://www.youtube.com/watch?v=t6cK75IE6d8
Homework 5.3.1.1 Perform Gaussian elimination as explained in Subsection 5.2.1 to solve

    ( 0  1   ( χ_0     ( 2
      1  0 )   χ_1 )  =  1 ).

Solution. The appended system is given by

    ( 0  1 | 2
      1  0 | 1 ).

In the first step, the multiplier is computed as λ_{1,0} = 1/0 and the algorithm fails. Yet, it is clear that the (unique) solution is

    ( χ_0     ( 1
      χ_1 )  =  2 ).
The point of the exercise: Gaussian elimination and, equivalently, LU factorization as we have discussed so far can fail if a "divide by zero" is encountered. The element on the

diagonal used to compute the multipliers in a current iteration of the outer-most loop is
called the pivot (element). Thus, if a zero pivot is encountered, the algorithms fail. Even if
the pivot is merely small (in magnitude), as we will discuss in a future week, roundoff error
encountered when performing floating point operations will likely make the computation
"numerically unstable," which is the topic of next week’s material.
The simple observation is that the rows of the matrix (and corresponding right-hand side
element) correspond to linear equations that must be simultaneously solved. Reordering
these does not change the solution. Reordering in advance so that no zero pivot is encoun-
tered is problematic, since pivots are generally updated by prior computation. However,
when a zero pivot is encountered, the row in which it appears can simply be swapped with
another row so that the pivot is replaced with a nonzero element (which then becomes the
pivot). In exact arithmetic, it suffices to ensure that the pivot is nonzero after swapping.
As mentioned, in the presence of roundoff error, any element that is small in magnitude can
create problems. For this reason, we will swap rows so that the element with the largest
magnitude (among the elements in the "current" column below the diagonal) becomes the
pivot. This is known as partial pivoting or row pivoting.
Homework 5.3.1.2 When performing Gaussian elimination as explained in Subsection 5.2.1
to solve A BA B A B
10≠k 1 ‰0 1
= ,
1 0 ‰1 1
set
1 ≠ 10k
to
≠10k
(since we will assume k to be large and hence 1 is very small to relative to 10k ). With
this modification (which simulates roundoff error that may be encountered when performing
floating point computation), what is the answer?
Next, solve A BA B A B
1 0 ‰0 1
= .
10≠k 1 ‰1 1
What do you observe?
Solution. The appended system is given by
A B
10≠k 1 1
.
1 0 1
In the first step, the multiplier is computed as ⁄1,0 = 10k and the updated appended system
becomes A B
10≠k 1 1
0 ≠10k 1 ≠ 10k
which is rounded to A B
10≠k 1 1
.
0 ≠10 ≠10k
k

We then compute
‰1 = (≠10k )/(≠10k ) = 1
and
‰0 = (1 ≠ ‰1 )/10≠k = (1 ≠ 1)/10≠k = 0.
If we instead start with the equivalent system
A B
1 0 1
.
10≠k 1 1

the appended system after one step becomes


A B
1 0 1
0 1 1 ≠ 10≠k

which yields the solution A B A B


‰0 1
= .
‰1 1 ≠ 10≠k
which becomes A B A B
‰0 1
= .
‰1 1
as k gets large.
What this illustrates is how a large multiple of a row being added to another row can
wipe out information in that second row. After one step of Gaussian elimination, the system
becomes equivalent to one that started with
A B
10≠k 1 1
.
1 0 0

5.3.2 Permutation matrices

YouTube: https://www.youtube.com/watch?v=4 RnLbvrdtg


Recall that we already discussed permutation in Subsection 4.4.4 in the setting of column
pivoting when computing the QR factorization.

Definition 5.3.2.1 Given

    p = ( π_0 ; . . . ; π_{n−1} ),

where {π_0, π_1, . . . , π_{n−1}} is a permutation (rearrangement) of the integers {0, 1, . . . , n − 1}, we define the permutation matrix P(p) by

    P(p) = ( e_{π_0}^T ; . . . ; e_{π_{n−1}}^T ).

Homework 5.3.2.1 Let
Q R Q R
fi0 ‰0
c . d c .. d
a .. b and x = a
p=c . d
d c
b.
fin≠1 ‰n≠1

Evaluate P (p)x.
Solution.
P (p)x
Q
= < definition >
R
eTfi0
c .. d
c
a . dx
b
eTfin≠1
Q
= <R matrix-vector multiplication by rows >
eTfi0 x
c .. d
c
a . d
b
eTfin≠1 x
Q
= < eT x = xj >
R j
‰ fi0
c .. d
c
a . d b
‰fin≠1
The last homework shows that applying P (p) to a vector x rearranges the elements of
that vector according to the permutation indicated by the vector p.
Homework 5.3.2.2 Let
Q R Q R
fi0 aÂT0
c . d c . d
a .. b and A = a .. b .
p=c d c d
fin≠1 aÂTn≠1

Evaluate P (p)A.

Solution.
P (p)A
Q
= < definition >
R
eTfi0
c .. d
c
a . dA
b
eTfin≠1
Q
= <R matrix-matrix multiplication by rows >
eTfi0 A
c .. d
c
a . d
b
eTfin≠1 A
Q
= < eTj A = aÂTj >
R
aÂTfi0
c .. d
c
a . db
aÂTfin≠1
The last homework shows that applying P (p) to a matrix A rearranges the rows of that
matrix according to the permutation indicated by the vector p.
Homework 5.3.2.3 Let
Q R
fi0 1 2
c . d
p = a .. d
c
b and A = a 0 · · · a n≠1 .
fin≠1

Evaluate AP (p)T .
Solution.
AP (p)T
= < definition >
Q T
RT
efi0
c .. d
Aca . d
b
T
efin≠1
1= < transpose
2 P (p) >
A efi0 · · · efin≠1
1 = < matrix-matrix
2 multiplication by columns >
Aefi0 · · · Aefin≠1
1 = < Aej = 2aj >
afi0 · · · afin≠1
The last homework shows that applying P (p)T from the right to a matrix A rearranges
the columns of that matrix according to the permutation indicated by the vector p.
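In MATLAB, these permutations are usually applied through index vectors rather than by forming P(p) explicitly. A small sketch of our own (the +1 accounts for MATLAB's 1-based indexing) that illustrates the preceding homeworks:

    p = [ 2 0 1 ];                      % permutation vector (0-based, as in the text)
    P = eye( 3 );  P = P( p+1, : );     % P(p): the rows of the identity, reordered
    x = rand( 3, 1 );  A = rand( 3, 4 );  B = rand( 4, 3 );
    norm( P * x - x( p+1 ) )            % P(p) x rearranges the entries of x
    norm( P * A - A( p+1, : ) )         % P(p) A rearranges the rows of A
    norm( B * P' - B( :, p+1 ) )        % B P(p)^T rearranges the columns of B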
Homework 5.3.2.4 Evaluate P (p)P (p)T .
Answer. P (p)P (p)T = I

Solution.
P P (p)T
= < definition >
Q RQ RT
eTfi0 eTfi0
c .. dc .. d
c
a . dc
ba . db
eTfin≠1 eTfin≠1
Q
= < transpose P (p) >
R
eTfi0
c .. d1 2
c
a . d
b efi0 · · · efin≠1
eTfin≠1
= < evaluate >
Q R
eTfi0 efi0 eTfi0 efi1 · · · eTfi0 efin≠1
c eTfi1 efi0 eTfi1 efi1 · · · eTfi1 efin≠1 d
c d
c
c .. .. .. d
d
a . . . b
eTfin≠1 efi0 eTfin≠1 efi1 · · · eTfin≠1 efin≠1
= < eTi ej = · · · >
Q R
1 0 ··· 0
c
c 0 1 ··· 0 d d
c
c .. .. .. d
d
a . . . b
0 0 ··· 1

YouTube: https://www.youtube.com/watch?v=1qvS n65Ws


We will see that when discussing the LU factorization with partial pivoting, a permutation
matrix that swaps the first element of a vector with the fi-th element of that vector is a
fundamental tool.
Definition 5.3.2.2 Elementary pivot matrix. Given π ∈ {0, . . . , n − 1} define the elementary pivot matrix

    P̃(π) = ( e_π^T ; e_1^T ; . . . ; e_{π−1}^T ; e_0^T ; e_{π+1}^T ; . . . ; e_{n−1}^T )

or, equivalently,

    P̃(π) = I_n   if π = 0,   and otherwise   P̃(π) = ( 0  0        1  0
                                                        0  I_{π−1}  0  0
                                                        1  0        0  0
                                                        0  0        0  I_{n−π−1} ),

where n is the size of the permutation matrix.
When P̃(π) is applied to a vector, it swaps the top element with the element indexed with π. When it is applied to a matrix, it swaps the top row with the row indexed with π. The size of matrix P̃(π) is determined by the size of the vector or the row size of the matrix to which it is applied.
In discussing LU factorization with pivoting, we will use elementary pivot matrices in a very specific way, which necessitates the definition of how a sequence of such pivots is applied. Let p be a vector of integers satisfying the conditions

    p = ( π_0 ; . . . ; π_{k−1} ),   where 1 ≤ k ≤ n and 0 ≤ π_i < n − i,     (5.3.1)

then P̃(p) will denote the sequence of pivots

    P̃(p) = ( I_{k−1}  0             ( I_{k−2}  0                    ( 1  0
              0        P̃(π_{k−1}) )   0        P̃(π_{k−2}) )  · · ·    0  P̃(π_1) )  P̃(π_0).

(Here P̃(·) is always an elementary pivot matrix "of appropriate size.") What this exactly does is best illustrated through an example:
Example 5.3.2.3 Let
Q R
Q R 0.0 0.1 0.2
2 c 1.0 1.1 1.2 d
p=a 1 d
c
b and A = c
c
d
d.
a 2.0 2.1 2.2 b
1
3.0 3.1 3.2

Evaluate PÂ (p)A.

Solution.

PÂ (p)A
= < instantiate >
Q R
Q R 0.0 0.1 0.2
2
d c 1.0 1.1 1.2 d
c d
PÂ (c
a 1 b) c d
a 2.0 2.1 2.2 b
1
3.0 3.1 3.2
= < definition of PÂ (·) >
Q R
Q R 0.0 0.1 0.2
1 A
0 B d c 1.0 1.1 1.2 d
c c d
a 1 b PÂ (2) c d
0 P(Â ) a 2.0 2.1 2.2 b
1
3.0 3.1 3.2
= < swap first row with row indexed with 2 >
Q R
Q R 2.0 2.1 2.2
1 0 B
d c 1.0 1.1 1.2 d
A c d
c
a 1 b c d
0 PÂ ( ) a 0.0 0.1 0.2 b
1
3.0 3.1 3.2
Q
= 1 < partitioned matrix-matrix
2 R
multiplication >
2.0 2.1 2.2
c Q R d
c A B 1.0 1.1 1.2 d
= < swap current first row with row indexed with 1 relative to that
c
c  1 c d d
d
a P( ) a 0.0 0.1 0.2 b b
1
Q A 3.0 3.1 B 3.2 R
2.0 2.1 2.2
c d
c
c
0.0A 0.1 0.2 B d
d
c 1 2 1.0 1.1 1.2 d
a Â
P( 1 ) b
3.0 3.1 3.2
Q Q
= < swap current
R R
first row with row indexed with 1 relative to that row >
2.0 2.1 2.2
c c d d
c a 0.0 0.1 0.2 b d
c d
a 1 3.0 3.1 3.2 2 b
c d
1.0 1.1 1.2
=
Q R
2.0 2.1 2.2
c 0.0 0.1 0.2 d
c d
c d
a 3.0 3.1 3.2 b
1.0 1.1 1.2



The relation between P̃(·) and P(·) is tricky to specify:

    P̃( ( π_0 ; π_1 ; ··· ; π_{k−1} ) )
      = P( ( I_{k−1}  0           ) ··· ( 1  0       ) P̃(π_0) ( 0 ; 1 ; ··· ; k−1 ) ).
           ( 0   P̃(π_{k−1})      )     ( 0  P̃(π_1) )
5.3.3 LU factorization with partial pivoting

YouTube: https://www.youtube.com/watch?v=QSnoqrsQNag
Having introduced our notation for permutation matrices, we can now define the LU factorization with partial pivoting: Given an m × n matrix A, we wish to compute
• a vector p of n integers that indicates how rows are pivoted as the algorithm proceeds,
• a unit lower trapezoidal matrix L, and
• an upper triangular matrix U
so that P̃(p)A = LU. We represent this operation by

    [A, p] := LUpiv(A),

where upon completion A has been overwritten by {L\U}, which indicates that U overwrites the upper triangular part of A and L is stored in the strictly lower triangular part of A.
Let us start with revisiting the derivation of the right-looking LU factorization in Subsection 5.2.2. The first step is to find a first permutation matrix P̃(π_1) such that the element on the diagonal in the first column is maximal in value. (Mathematically, any nonzero value works. We will see that ensuring that the multiplier is less than one in magnitude reduces the potential for accumulation of error.) For this, we will introduce the function

    maxi(x)

which, given a vector x, returns the index of the element in x with maximal magnitude (absolute value). The algorithm then proceeds as follows:
• Partition A and L as follows:

      A → ( α11 a12^T ; a21 A22 )  and  L → ( 1 0 ; l21 L22 ).

• Compute π_1 = maxi( ( α11 ; a21 ) ).

• Permute the rows:  ( α11 a12^T ; a21 A22 ) := P̃(π_1) ( α11 a12^T ; a21 A22 ).

• Compute l21 := a21/α11.

• Update A22 := A22 − l21 a12^T.

This completes the introduction of zeroes below the diagonal of the first column, as illustrated in the sketch below.
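The following minimal MATLAB sketch (an illustration under our own naming, not the course's maxi.m and Swap.m utilities) carries out this first step on a stored matrix A with at least two rows:

    % One step of LU factorization with partial pivoting, applied to the first column.
    [ ~, i ] = max( abs( A( :, 1 ) ) );              % 1-based index of the entry of largest magnitude
    A( [ 1 i ], : ) = A( [ i 1 ], : );               % swap the top row with row i
    l21 = A( 2:end, 1 ) / A( 1, 1 );                 % multipliers, at most one in magnitude
    A( 2:end, 1 ) = l21;                             % store l21 over the zeroed entries
    A( 2:end, 2:end ) = A( 2:end, 2:end ) - l21 * A( 1, 2:end );   % A22 := A22 - l21 * a12^T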
Now, more generally, assume that the computation has proceeded to the point where matrix A has been overwritten by

    ( A00 a01 A02 ; 0 α11 a12^T ; 0 a21 A22 )

where A00 is upper triangular. If no pivoting was added, one would compute l21 := a21/α11 followed by the update

    ( A00 a01 A02 ; 0 α11 a12^T ; 0 a21 A22 )
      := ( I 0 0 ; 0 1 0 ; 0 −l21 I ) ( A00 a01 A02 ; 0 α11 a12^T ; 0 a21 A22 )
       = ( A00 a01 A02 ; 0 α11 a12^T ; 0 0 A22 − l21 a12^T ).

Now, instead one performs the steps

• Compute
      π_1 := maxi( ( α11 ; a21 ) ).

• Permute the rows:

      ( A00 a01 A02 ; 0 α11 a12^T ; 0 a21 A22 )
        := ( I 0 ; 0 P̃(π_1) ) ( A00 a01 A02 ; 0 α11 a12^T ; 0 a21 A22 )

• Update
      l21 := a21/α11.

• Update
      ( A00 a01 A02 ; 0 α11 a12^T ; 0 a21 A22 )
        := ( I 0 0 ; 0 1 0 ; 0 −l21 I ) ( A00 a01 A02 ; 0 α11 a12^T ; 0 a21 A22 )
         = ( A00 a01 A02 ; 0 α11 a12^T ; 0 0 A22 − l21 a12^T ).

This algorithm is summarized in Figure 5.3.3.1. In that algorithm, the lower triangular
matrix L is accumulated below the diagonal.
[A, p] = LUpiv-right-looking(A)
  A → ( ATL ATR ; ABL ABR ),  p → ( pT ; pB )
  ATL is 0 × 0, pT has 0 elements
  while n(ATL) < n(A)
    ( ATL ATR ; ABL ABR ) → ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 ),
      ( pT ; pB ) → ( p0 ; π1 ; p2 )
    π1 := maxi( ( α11 ; a21 ) )
    ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
      := ( I 0 ; 0 P̃(π1) ) ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
    a21 := a21/α11
    A22 := A22 − a21 a12^T
    ( ATL ATR ; ABL ABR ) ← ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 ),
      ( pT ; pB ) ← ( p0 ; π1 ; p2 )
  endwhile
Figure 5.3.3.1 Right-looking LU factorization algorithm with partial pivoting.

YouTube: https://www.youtube.com/watch?v=n-K 62HrYhM


What this algorithm computes is a sequence of Gauss transforms L0, . . . , L_{n−1} and permutations P0, . . . , P_{n−1} such that

    L_{n−1} P_{n−1} ··· L0 P0 A = U

or, equivalently,

    A = P0^T L0^{−1} ··· P_{n−1}^T L_{n−1}^{−1} U.

Actually, since Pk = ( I_{k×k} 0 ; 0 P̃(π) ) for some π, we know that Pk^T = Pk and hence

    A = P0 L0^{−1} ··· P_{n−1} L_{n−1}^{−1} U.

What we will finally show is that there are Gauss transforms L*0, . . . , L*_{n−1} such that

    A = P0 ··· P_{n−1} L*0 ··· L*_{n−1} U   (where L = L*0 ··· L*_{n−1})

or, equivalently,

    P̃(p)A = P_{n−1} ··· P0 A = L*0 ··· L*_{n−1} U = LU,

which is what we set out to compute.
Here is the insight. If only we knew how to order the rows of A and the right-hand side b correctly, then we would not have to pivot. But we only know how to pivot as the computation unfolds. Recall that the multipliers can overwrite the elements they zero in Gaussian elimination, and do so when we formulate it as an LU factorization. By not only pivoting the elements of

    ( α11 a12^T ; a21 A22 )

but also all of

    ( a10^T α11 a12^T ; A20 a21 A22 ),

we are moving the computed multipliers with the rows that are being swapped. It is for this reason that we end up computing the LU factorization of the permuted matrix: P̃(p)A.
Homework 5.3.3.1 Implement the algorithm given in Figure 5.3.3.1 as

    function [ A_out ] = LUpiv_right_looking( A )

by completing the code in Assignments/Week05/matlab/LUpiv_right_looking.m. Input is an m × n matrix A. Output is the matrix A that has been overwritten by the LU factorization, and the pivot vector p. You may want to use Assignments/Week05/matlab/test_LUpiv_right_looking.m to check your implementation.
The following utility functions may come in handy:

• Assignments/Week05/matlab/maxi.m

• Assignments/Week05/matlab/Swap.m

which we hope are self explanatory.

Solution. See Assignments/Week05/answers/LUpiv_right_looking.m.

5.3.4 Solving A x = y via LU factorization with pivoting

YouTube: https://www.youtube.com/watch?v=kqj3n1EUCkw
Given nonsingular matrix A ∈ C^{m×n}, the above discussions have yielded an algorithm for computing a permutation matrix P, unit lower triangular matrix L, and upper triangular matrix U such that PA = LU. We now discuss how these can be used to solve the system of linear equations Ax = y.
Starting with

    Ax = b,

where nonsingular matrix A is n × n (and hence square),

• Overwrite A with its LU factorization, accumulating the pivot information in vector p:

      [A, p] := LUpiv(A).

  A now contains L and U, and P̃(p)A = LU.

• We notice that Ax = b is equivalent to P̃(p)Ax = P̃(p)b. Thus, we compute y := P̃(p)b. Usually, y overwrites b.

• Next, we recognize that P̃(p)Ax = y is equivalent to L(Ux) = y, where we call z = Ux. Hence, we can compute z by solving the unit lower triangular system

      Lz = y

  and next compute x by solving the upper triangular system

      Ux = z.
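A minimal MATLAB sketch of this driver is given below. The routine names LUpiv, apply_pivots, trsv_lower_unit, and trsv_upper are placeholders for the routines developed in this and the following units, under the stated assumption that LUpiv overwrites A with {L\U} and returns the pivot vector p:

    [ A, p ] = LUpiv( A );              % Ptilde(p) A = L U, with L and U stored in A
    b = apply_pivots( p, b );           % b := Ptilde(p) b
    z = trsv_lower_unit( A, b );        % solve L z = b  (unit lower triangular part of A)
    x = trsv_upper( A, z );             % solve U x = z  (upper triangular part of A)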

5.3.5 Solving with a triangular matrix


We are left to discuss how to solve Lz = y and U x = z.

5.3.5.1 Algorithmic Variant 1

YouTube: https://www.youtube.com/watch?v=qc_4NsNp3q0
Consider Lz = y, where L is unit lower triangular. Partition

    L → ( 1 0 ; l21 L22 ),  z → ( ζ1 ; z2 )  and  y → ( ψ1 ; y2 ).

Then

    ( 1 0 ; l21 L22 ) ( ζ1 ; z2 ) = ( ψ1 ; y2 ).

Multiplying out the left-hand side yields

    ( ζ1 ; ζ1 l21 + L22 z2 ) = ( ψ1 ; y2 )

and the equalities

    ζ1 = ψ1
    ζ1 l21 + L22 z2 = y2,

which can be rearranged as

    ζ1 = ψ1
    L22 z2 = y2 − ζ1 l21.

We conclude that in the current iteration

• ψ1 need not be updated.

• y2 := y2 − ψ1 l21,

so that in future iterations L22 z2 = y2 (updated!) will be solved, updating z2.
These insights justify the algorithm in Figure 5.3.5.1, which overwrites y with the solution to Lz = y.

Solve Lz = y, overwriting y with z (Variant 1)
  L → ( LTL LTR ; LBL LBR ),  y → ( yT ; yB )
  LTL is 0 × 0 and yT has 0 elements
  while n(LTL) < n(L)
    ( LTL LTR ; LBL LBR ) → ( L00 l01 L02 ; l10^T λ11 l12^T ; L20 l21 L22 ),
      ( yT ; yB ) → ( y0 ; ψ1 ; y2 )
    y2 := y2 − ψ1 l21
    ( LTL LTR ; LBL LBR ) ← ( L00 l01 L02 ; l10^T λ11 l12^T ; L20 l21 L22 ),
      ( yT ; yB ) ← ( y0 ; ψ1 ; y2 )
  endwhile

Figure 5.3.5.1 Lower triangular solve (with unit lower triangular matrix), Variant 1
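For concreteness, a minimal index-based MATLAB sketch of this variant (ours, not part of the assignments) overwrites y with the solution of Lz = y:

    function y = trsv_lower_unit_var1( L, y )
    % Overwrite y with the solution of L z = y, L unit lower triangular (axpy-based).
      m = length( y );
      for j = 1:m
        % y(j) already equals zeta_j; update the rest of y with an axpy.
        y( j+1:m ) = y( j+1:m ) - y( j ) * L( j+1:m, j );
      end
    end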
Homework 5.3.5.1 Derive a similar algorithm for solving Ux = z. Update the below skeleton algorithm with the result. (Don't forget to put in the lines that indicate how you "partition and repartition" through the matrix.)

Solve Ux = z, overwriting z with x (Variant 1)
  U → ( UTL UTR ; UBL UBR ),  z → ( zT ; zB )
  UBR is 0 × 0 and zB has 0 elements
  while n(UBR) < n(U)
    ( UTL UTR ; UBL UBR ) → ( U00 u01 U02 ; u10^T υ11 u12^T ; U20 u21 U22 ),
      ( zT ; zB ) → ( z0 ; ζ1 ; z2 )

    ( UTL UTR ; UBL UBR ) ← ( U00 u01 U02 ; u10^T υ11 u12^T ; U20 u21 U22 ),
      ( zT ; zB ) ← ( z0 ; ζ1 ; z2 )
  endwhile

Hint: Partition

    ( U00 u01 ; 0 υ11 ) ( x0 ; χ1 ) = ( z0 ; ζ1 ).

Solution. Multiplying this out yields

    ( U00 x0 + u01 χ1 ; υ11 χ1 ) = ( z0 ; ζ1 ).

So, χ1 = ζ1/υ11, after which x0 can be computed by solving U00 x0 = z0 − χ1 u01. The resulting algorithm is then given by

Solve Ux = z, overwriting z with x (Variant 1)
  U → ( UTL UTR ; UBL UBR ),  z → ( zT ; zB )
  UBR is 0 × 0 and zB has 0 elements
  while n(UBR) < n(U)
    ( UTL UTR ; UBL UBR ) → ( U00 u01 U02 ; u10^T υ11 u12^T ; U20 u21 U22 ),
      ( zT ; zB ) → ( z0 ; ζ1 ; z2 )
    ζ1 := ζ1/υ11
    z0 := z0 − ζ1 u01
    ( UTL UTR ; UBL UBR ) ← ( U00 u01 U02 ; u10^T υ11 u12^T ; U20 u21 U22 ),
      ( zT ; zB ) ← ( z0 ; ζ1 ; z2 )
  endwhile
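A minimal index-based MATLAB sketch corresponding to this algorithm (ours, for illustration) overwrites z with the solution of Ux = z, proceeding from the bottom-right corner:

    function z = trsv_upper_var1( U, z )
    % Overwrite z with the solution of U x = z, U upper triangular (axpy-based).
      n = length( z );
      for j = n:-1:1
        z( j ) = z( j ) / U( j, j );                        % chi_1 := zeta_1 / upsilon_11
        z( 1:j-1 ) = z( 1:j-1 ) - z( j ) * U( 1:j-1, j );   % z_0 := z_0 - chi_1 * u_01
      end
    end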

5.3.5.2 Algorithmic Variant 2

YouTube: https://www.youtube.com/watch?v=2tvfYnD9NrQ
An alternative algorithm can be derived as follows: Partition

    L → ( L00 0 ; l10^T 1 ),  z → ( z0 ; ζ1 )  and  y → ( y0 ; ψ1 ).

Then

    ( L00 0 ; l10^T 1 ) ( z0 ; ζ1 ) = ( y0 ; ψ1 ).

Multiplying out the left-hand side yields

    ( L00 z0 ; l10^T z0 + ζ1 ) = ( y0 ; ψ1 )

and the equalities

    L00 z0 = y0
    l10^T z0 + ζ1 = ψ1.

The idea now is as follows: Assume that the elements of z0 were computed in previous iterations of the algorithm in Figure 5.3.5.2, overwriting y0. Then in the current iteration we must compute ζ1 := ψ1 − l10^T z0, overwriting ψ1.
Solve Lz = y, overwriting y with z (Variant 2)
  L → ( LTL LTR ; LBL LBR ),  y → ( yT ; yB )
  LTL is 0 × 0 and yT has 0 elements
  while n(LTL) < n(L)
    ( LTL LTR ; LBL LBR ) → ( L00 l01 L02 ; l10^T λ11 l12^T ; L20 l21 L22 ),
      ( yT ; yB ) → ( y0 ; ψ1 ; y2 )
    ψ1 := ψ1 − l10^T y0
    ( LTL LTR ; LBL LBR ) ← ( L00 l01 L02 ; l10^T λ11 l12^T ; L20 l21 L22 ),
      ( yT ; yB ) ← ( y0 ; ψ1 ; y2 )
  endwhile

Figure 5.3.5.2 Lower triangular solve (with unit lower triangular matrix), Variant 2
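A minimal index-based MATLAB sketch of this second variant (ours, for illustration), which casts the work in terms of dot products:

    function y = trsv_lower_unit_var2( L, y )
    % Overwrite y with the solution of L z = y, L unit lower triangular (dot-based).
      m = length( y );
      for j = 1:m
        y( j ) = y( j ) - L( j, 1:j-1 ) * y( 1:j-1 );   % psi_1 := psi_1 - l10^T y_0
      end
    end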
Homework 5.3.5.2 Derive a similar algorithm for solving Ux = z. Update the below skeleton algorithm with the result. (Don't forget to put in the lines that indicate how you "partition and repartition" through the matrix.)

Solve Ux = z, overwriting z with x (Variant 2)
  U → ( UTL UTR ; UBL UBR ),  z → ( zT ; zB )
  UBR is 0 × 0 and zB has 0 elements
  while n(UBR) < n(U)
    ( UTL UTR ; UBL UBR ) → ( U00 u01 U02 ; u10^T υ11 u12^T ; U20 u21 U22 ),
      ( zT ; zB ) → ( z0 ; ζ1 ; z2 )

    ( UTL UTR ; UBL UBR ) ← ( U00 u01 U02 ; u10^T υ11 u12^T ; U20 u21 U22 ),
      ( zT ; zB ) ← ( z0 ; ζ1 ; z2 )
  endwhile

Hint: Partition

    U → ( υ11 u12^T ; 0 U22 ).

Solution. Partition

    ( υ11 u12^T ; 0 U22 ) ( χ1 ; x2 ) = ( ζ1 ; z2 ).

Multiplying this out yields

    ( υ11 χ1 + u12^T x2 ; U22 x2 ) = ( ζ1 ; z2 ).

So, if we assume that x2 has already been computed and has overwritten z2, then χ1 can be computed as

    χ1 = ( ζ1 − u12^T x2 )/υ11,

which can then overwrite ζ1. The resulting algorithm is given by

Solve Ux = z, overwriting z with x (Variant 2)
  U → ( UTL UTR ; UBL UBR ),  z → ( zT ; zB )
  UBR is 0 × 0 and zB has 0 elements
  while n(UBR) < n(U)
    ( UTL UTR ; UBL UBR ) → ( U00 u01 U02 ; u10^T υ11 u12^T ; U20 u21 U22 ),
      ( zT ; zB ) → ( z0 ; ζ1 ; z2 )
    ζ1 := ζ1 − u12^T z2
    ζ1 := ζ1/υ11
    ( UTL UTR ; UBL UBR ) ← ( U00 u01 U02 ; u10^T υ11 u12^T ; U20 u21 U22 ),
      ( zT ; zB ) ← ( z0 ; ζ1 ; z2 )
  endwhile
Homework 5.3.5.3 Let L be an m × m unit lower triangular matrix. If a multiply and an add each require one flop, what is the approximate cost of solving Lx = y?
Solution. Let us analyze Variant 1.
Let L00 be k × k in a typical iteration. Then y2 is of size m − k − 1 and y2 := y2 − ψ1 l21 requires 2(m − k − 1) flops. Summing this over all iterations requires

    Σ_{k=0}^{m−1} 2(m − k − 1) flops.

The change of variables j = m − k − 1 yields

    Σ_{k=0}^{m−1} 2(m − k − 1) = 2 Σ_{j=0}^{m−1} j ≈ m^2.

Thus, the cost is approximately m^2 flops.



5.3.5.3 Discussion
Computation tends to be more efficient when matrices are accessed by column, since in scientific computing applications matrices tend to be stored by columns (in column-major order). This dates back to the days when Fortran ruled supreme. Accessing memory consecutively improves performance, so computing with columns tends to be more efficient than computing with rows.
Variant 1 for each of the algorithms casts computation in terms of columns of the matrix that is involved:

    y2 := y2 − ψ1 l21

and

    z0 := z0 − ζ1 u01.

These are called axpy operations:

    y := αx + y,

for "alpha times x plus y." In contrast, Variant 2 casts computation in terms of rows of the matrix that is involved:

    ψ1 := ψ1 − l10^T y0

and

    ζ1 := ζ1 − u12^T z2

perform dot products.

5.3.6 LU factorization with complete pivoting


LU factorization with partial pivoting builds on the insight that pivoting (rearranging) rows
in a linear system does not change the solution: if Ax = b then P (p)Ax = P (p)b, where
p is a pivot vector. Now, if r is another pivot vector, then notice that P (r)T P (r) = I (a
simple property of pivot matrices) and AP (r)T permutes the columns of A in exactly the
same order as P (r)A permutes the rows of A.
What this means is that if Ax = b then P (p)AP (r)T (P (r)x) = P (p)b. This supports
the idea that one might want to not only permute rows of A, as in partial pivoting, but
also columns of A. This is done in a variation on LU factorization that is known as LU
factorization with complete pivoting.
The idea is as follows: Given matrix A, partition
A B
–11 aT12
A= .
a21 A22

Now, instead of finding the largest element in magnitude in the first column, find the largest
element in magnitude in the entire matrix. Let’s say it is element (fi1 , fl1 ). Then, one
permutes A B A B
–11 aT12 –11 aT12
:= P (fi1 ) P (fl1 )T ,
a21 A22 a21 A22
WEEK 5. THE LU AND CHOLESKY FACTORIZATIONS 297

making –11 the largest element in magnitude. We will later see that the magnitude of
–11 impacts element growth in the remaining matrix (A22 ) and that in turn impacts the
numerical stability (accuracy) of the algorithm. By choosing –11 to be as large as possible
in magnitude, the magnitude of multipliers is reduced as is element growth.
The problem is that complete pivoting requires O(n2 ) comparisons per iteration. Thus,
the number of comparisons is of the same order as the number of floating point operations.
Worse, it completely destroys the ability to cast most computation in terms of matrix-matrix
multiplication, thus impacting the ability to attain much greater performance.
In practice LU with complete pivoting is not used.

5.3.7 Improving accuracy via iterative refinement


When solving Ax = b on a computer, error is inherently incurred. Instead of the exact solution x, an approximate solution x̂ is computed, which instead solves Ax̂ = b̂. The difference between x and x̂ satisfies

    A(x − x̂) = b − b̂.

We can compute b̂ = Ax̂ and hence we can compute δb = b − b̂. We can then solve Aδx = δb. If this computation is completed without error, then x = x̂ + δx and we are left with the exact solution. Obviously, there is error in δx as well, and hence we have merely computed an improved approximate solution to Ax = b. This process can be repeated. As long as solving with A yields at least one digit of accuracy, this process can be used to improve the computed result, limited by the accuracy in the right-hand side b and the condition number of A.
This process is known as iterative refinement.
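A minimal MATLAB sketch of iterative refinement, under the assumption that the factorization P̃(p)A = LU has already been computed; solve_with_factors, maxits, and tol are placeholder names for a routine that applies the pivoting and the two triangular solves and for chosen iteration and tolerance parameters:

    x = solve_with_factors( LU, p, b );           % initial approximate solution xhat
    for it = 1:maxits
      r  = b - A * x;                             % residual b - A*xhat
      dx = solve_with_factors( LU, p, r );        % solve A dx = r, reusing the existing factors
      x  = x + dx;                                % improved approximation
      if norm( dx ) <= tol * norm( x )            % stop once the correction is negligible
        break;
      end
    end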

5.4 Cholesky factorization


5.4.1 Hermitian Positive Definite matrices

YouTube: https://www.youtube.com/watch?v=nxGR8NgXYxg
Hermitian Positive Definite (HPD) matrices are a special class of matrices that are frequently encountered in practice.
Definition 5.4.1.1 Hermitian positive definite matrix. A matrix A ∈ C^{n×n} is Hermitian positive definite (HPD) if and only if it is Hermitian (A^H = A) and for all nonzero vectors x ∈ C^n it is the case that x^H A x > 0. If in addition A ∈ R^{n×n} then A is said to be symmetric positive definite (SPD). ⌃
If you feel uncomfortable with complex arithmetic, just replace the word "Hermitian" with "symmetric" in this document and the Hermitian transpose operation, ^H, with the transpose operation, ^T.
Example 5.4.1.2 Consider the case where n = 1 so that A is a real scalar, α. Notice that then A is SPD if and only if α > 0. This is because then for all nonzero χ ∈ R it is the case that αχ^2 > 0. ⇤
Let’s get some practice with reasoning about Hermitian positive definite matrices.
Homework 5.4.1.1 Let B ∈ C^{m×n} have linearly independent columns.
ALWAYS/SOMETIMES/NEVER: A = B^H B is HPD.
Answer. ALWAYS
Now prove it!
Solution. Let x ∈ C^n be a nonzero vector. Then x^H B^H B x = (Bx)^H (Bx). Since B has linearly independent columns we know that Bx ≠ 0. Hence (Bx)^H Bx > 0.
Homework 5.4.1.2 Let A ∈ C^{m×m} be HPD.
ALWAYS/SOMETIMES/NEVER: The diagonal elements of A are real and positive.
Hint. Consider the standard basis vector e_j.
Answer. ALWAYS
Now prove it!
Solution. Let e_j be the jth unit basis vector. Then 0 < e_j^H A e_j = α_{j,j}.
Homework 5.4.1.3 Let A œ Cm◊m be HPD. Partition


A B
–11 aH
21
A= .
a21 A22
ALWAYS/SOMETIMES/NEVER: A22 is HPD.
Answer. ALWAYS
Now prove it!
Solution. We need to show that xH 2 A22 x2 > 0 for anyAnonzero
B x2 œ C
m≠1
.
0
Let x2 œ Cm≠1 be a nonzero vector and choose x = . Then
x2

0
< < A is HPD >
H
x Ax
= < partition >
A BH A BA B
0 –11 aH 21 0
x2 a21 A22 x2
= < multiply out >
xH
2 A 22 x 2 .
WEEK 5. THE LU AND CHOLESKY FACTORIZATIONS 299

We conclude that A22 is HPD.

5.4.2 The Cholesky Factorization Theorem

YouTube: https://www.youtube.com/watch?v=w8a9xVHVmAI
We will prove the following theorem in Subsection 5.4.4.
Theorem 5.4.2.1 Cholesky Factorization Theorem. Given an HPD matrix A there exists a lower triangular matrix L such that A = LL^H. If the diagonal elements of L are restricted to be positive, L is unique.
Obviously, there similarly exists an upper triangular matrix U such that A = U^H U, since we can choose U^H = L.
The lower triangular matrix L is known as the Cholesky factor and LL^H is known as the Cholesky factorization of A. It is unique if the diagonal elements of L are restricted to be positive. Typically, only the lower (or upper) triangular part of A is stored, and it is that part that is then overwritten with the result. In our discussions, we will assume that the lower triangular part of A is stored and overwritten.

5.4.3 Cholesky factorization algorithm (right-looking variant)

YouTube: https://www.youtube.com/watch?v=x4grvf-MfTk
The most common algorithm for computing the Cholesky factorization of a given HPD matrix A is derived as follows:

• Consider A = LL^H, where L is lower triangular. Partition

      A = ( α11 ⋆ ; a21 A22 )  and  L = ( λ11 0 ; l21 L22 ).        (5.4.1)

  Since A is HPD, we know that

    ◦ α11 is a positive number (Homework 5.4.1.2).
    ◦ A22 is HPD (Homework 5.4.1.3).

• By substituting these partitioned matrices into A = LL^H we find that

      ( α11 ⋆ ; a21 A22 ) = ( λ11 0 ; l21 L22 ) ( λ11 0 ; l21 L22 )^H
                          = ( λ11 0 ; l21 L22 ) ( λ̄11 l21^H ; 0 L22^H )
                          = ( |λ11|^2  ⋆ ; λ̄11 l21  l21 l21^H + L22 L22^H ),

  from which we conclude that

      α11 = |λ11|^2          ⋆
      a21 = λ̄11 l21          A22 = l21 l21^H + L22 L22^H

  or, equivalently,

      λ11 = ±√α11            ⋆
      l21 = a21/λ̄11          L22 = Chol( A22 − l21 l21^H ).

• These equalities motivate the following algorithm for overwriting the lower triangular part of A with the Cholesky factor of A:

    ◦ Partition A → ( α11 ⋆ ; a21 A22 ).
    ◦ Overwrite α11 := λ11 = √α11. (Picking λ11 = √α11 makes it positive and real, and ensures uniqueness.)
    ◦ Overwrite a21 := l21 = a21/λ11.
    ◦ Overwrite A22 := A22 − l21 l21^H (updating only the lower triangular part of A22). This operation is called a symmetric rank-1 update.
    ◦ Continue by computing the Cholesky factor of A22.

The resulting algorithm is often called the "right-looking" variant and is summarized in Figure 5.4.3.1.

A = Chol-right-looking(A)
  A → ( ATL ATR ; ABL ABR )
  ATL is 0 × 0
  while n(ATL) < n(A)
    ( ATL ATR ; ABL ABR ) → ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
    α11 := λ11 = √α11
    a21 := l21 = a21/α11
    A22 := A22 − a21 a21^H   (syr: update only lower triangular part)
    ( ATL ATR ; ABL ABR ) ← ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
  endwhile

Figure 5.4.3.1 Cholesky factorization algorithm (right-looking variant). The operation "syr" refers to "symmetric rank-1 update", which performs a rank-1 update, updating only the lower triangular part of the matrix in this algorithm.
Homework 5.4.3.1 Give the approximate cost incurred by the algorithm in Figure 5.4.3.1 when applied to an n × n matrix.
Answer. (1/3)n^3 flops.
Solution.

YouTube: https://www.youtube.com/watch?v=6twDI6QhqCY

The cost of the Cholesky factorization of A ∈ C^{n×n} can be analyzed as follows: In Figure 5.4.3.1 during the kth iteration (starting k at zero) A00 is k × k. Thus, the operations in that iteration cost

• α11 := √α11: this cost is negligible when k is large.

• a21 := a21/α11: approximately (n − k − 1) flops. This operation is typically implemented as (1/α11)a21.

• A22 := A22 − a21 a21^H (updating only the lower triangular part of A22): approximately (n − k − 1)^2 flops.

Thus, the total cost in flops is given by

    C_Chol(n)
      ≈   < sum over all iterations >
    Σ_{k=0}^{n−1} (n − k − 1)^2  +  Σ_{k=0}^{n−1} (n − k − 1)
    (due to the update of A22)      (due to the update of a21)
      =   < change of variables j = n − k − 1 >
    Σ_{j=0}^{n−1} j^2 + Σ_{j=0}^{n−1} j
      ≈   < Σ_{j=0}^{n−1} j^2 ≈ n^3/3;  Σ_{j=0}^{n−1} j ≈ n^2/2 >
    (1/3)n^3 + (1/2)n^2
      ≈   < remove lower order term >
    (1/3)n^3.

Remark 5.4.3.2 Comparing the cost of the Cholesky factorization to that of the LU factorization in Homework 5.2.2.1, we see that taking advantage of symmetry cuts the cost approximately in half.
Homework 5.4.3.2 Implement the algorithm given in Figure 5.4.3.1 as

    function [ A_out ] = Chol_right_looking( A )

by completing the code in Assignments/Week05/matlab/Chol_right_looking.m. Input is an HPD m × n matrix A with only the lower triangular part stored. Output is the matrix A that has its lower triangular part overwritten with the Cholesky factor. You may want to use Assignments/Week05/matlab/test_Chol_right_looking.m to check your implementation.
Solution. See Assignments/Week05/answers/Chol_right_looking.m.

5.4.4 Proof of the Cholesky Factorization Theorem

YouTube: https://www.youtube.com/watch?v=unpQfRgIHOg
Partition, once again,

    A → ( α11 a21^H ; a21 A22 ).

The following lemmas are key to the proof of the Cholesky Factorization Theorem:
Lemma 5.4.4.1 Let A ∈ C^{n×n} be HPD. Then α11 is real and positive.
Proof. This is a special case of Homework 5.4.1.2. ∎

Lemma 5.4.4.2 Let A ∈ C^{n×n} be HPD and l21 = a21/√α11. Then A22 − l21 l21^H is HPD.
Proof. Since A is Hermitian, so are A22 and A22 − l21 l21^H.
Let x2 ∈ C^{n−1} be an arbitrary nonzero vector. Define x = ( χ1 ; x2 ), where χ1 = −a21^H x2/α11. Then, since clearly x ≠ 0,

    0
  <   < A is HPD >
    x^H A x
  =   < partition >
    ( χ1 ; x2 )^H ( α11 a21^H ; a21 A22 ) ( χ1 ; x2 )
  =   < partitioned multiplication >
    ( χ1 ; x2 )^H ( α11 χ1 + a21^H x2 ; a21 χ1 + A22 x2 )
  =   < partitioned multiplication >
    α11 χ̄1 χ1 + χ̄1 a21^H x2 + x2^H a21 χ1 + x2^H A22 x2
  =   < substitute χ1 = −a21^H x2/α11 and χ̄1 = −x2^H a21/α11 >
    α11 (x2^H a21)(a21^H x2)/α11^2 − (x2^H a21)(a21^H x2)/α11 − (x2^H a21)(a21^H x2)/α11 + x2^H A22 x2
  =   < x2^H a21 and a21^H x2 are scalars and hence can be moved around; α11/α11^2 = 1/α11 >
    x2^H A22 x2 − (x2^H a21)(a21^H x2)/α11
  =   < cancel terms; factor out x2^H and x2 >
    x2^H ( A22 − a21 a21^H/α11 ) x2
  =   < simplify >
    x2^H ( A22 − l21 l21^H ) x2.

We conclude that A22 − l21 l21^H is HPD. ∎
Proof of the Cholesky Factorization Theorem. Proof by induction.

1. Base case: n = 1. Clearly the result is true for a 1 × 1 matrix A = α11: In this case, the fact that A is HPD means that α11 is real and positive and a Cholesky factor is then given by λ11 = √α11, with uniqueness if we insist that λ11 is positive.

2. Inductive step: Assume the result is true for n = k. We will show that it holds for n = k + 1.
Let A ∈ C^{(k+1)×(k+1)} be HPD. Partition

    A = ( α11 a21^H ; a21 A22 )  and  L = ( λ11 0 ; l21 L22 ).

Let

• λ11 = √α11 (which is well-defined by Lemma 5.4.4.1),
• l21 = a21/λ11,
• A22 − l21 l21^H = L22 L22^H (which exists as a consequence of the Inductive Hypothesis and Lemma 5.4.4.2).

Then L is the desired Cholesky factor of A.

3. By the Principle of Mathematical Induction, the theorem holds.

5.4.5 Cholesky factorization and solving LLS

YouTube: https://www.youtube.com/watch?v=C7LEuhS4H94
Recall from Section 4.2 that the solution x̂ ∈ C^n to the linear least-squares (LLS) problem

    ‖b − Ax̂‖_2 = min_{x ∈ C^n} ‖b − Ax‖_2        (5.4.2)

equals the solution to the normal equations

    A^H A x̂ = A^H b,   where we write B = A^H A and y = A^H b.

Since A^H A is Hermitian, it would be good to take advantage of that special structure to factor it more cheaply. If A^H A were HPD, then the Cholesky factorization could be employed. Fortunately, from Homework 5.4.1.1 we know that if A has linearly independent columns, then A^H A is HPD. Thus, the steps required to solve the LLS problem (5.4.2) when A ∈ C^{m×n} are

• Form B = A^H A. Cost: approximately mn^2 flops.
• Factor B = LL^H (Cholesky factorization). Cost: approximately n^3/3 flops.
• Compute y = A^H b. Cost: approximately 2mn flops.
• Solve Lz = y. Cost: approximately n^2 flops.
• Solve L^H x̂ = z. Cost: approximately n^2 flops.

for a total of, approximately, mn^2 + n^3/3 flops.
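A minimal MATLAB sketch of these steps (an illustration only; note that MATLAB's chol returns an upper triangular R with B = R' * R, so that L = R'):

    B = A' * A;          % form the normal-equations matrix B = A^H A
    y = A' * b;          % y = A^H b
    R = chol( B );       % B = R' * R, i.e., the Cholesky factor is L = R'
    z = R' \ y;          % solve L z = y
    x = R \ z;           % solve L^H xhat = z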

Ponder This 5.4.5.1 Consider A ∈ C^{m×n} with linearly independent columns. Recall that A has a QR factorization, A = QR, where Q has orthonormal columns and R is an upper triangular matrix with positive diagonal elements. How are the Cholesky factorization of A^H A and the QR factorization of A related?

5.4.6 Implementation with the classical BLAS


The Basic Linear Algebra Subprograms (BLAS) are an interface to commonly used fundamental linear algebra operations. In this section, we illustrate how the unblocked and blocked Cholesky factorization algorithms can be implemented in terms of the BLAS. The explanation draws from the entry we wrote for the BLAS in the Encyclopedia of Parallel Computing [38].

5.4.6.1 What are the BLAS?


The BLAS interface [24] [15] [14] was proposed to support portable high-performance implementation of applications that are matrix and/or vector computation intensive. The idea is that one casts computation in terms of the BLAS interface, leaving the architecture-specific optimization of that interface to an expert.

5.4.6.2 Implementation in Fortran


We start with a simple implementation in Fortran. A simple algorithm that exposes three
loops and the corresponding code in Fortran are given by
    for j := 0, . . . , n − 1
      αj,j := √αj,j
      for i := j + 1, . . . , n − 1
        αi,j := αi,j/αj,j
      endfor
      for k := j + 1, . . . , n − 1
        for i := k, . . . , n − 1
          αi,k := αi,k − αi,j αk,j
        endfor
      endfor
    endfor

    do j=1, n
      A(j,j) = sqrt(A(j,j))
      do i=j+1,n
        A(i,j) = A(i,j) / A(j,j)
      enddo
      do k=j+1,n
        do i=k,n
          A(i,k) = A(i,k) - A(i,j) * A(k,j)
        enddo
      enddo
    enddo
Notice that Fortran starts indexing at one when addressing an array.
Next, exploit the fact that the BLAS interface supports a number of "vector-vector" operations known as the Level-1 BLAS. Of these, we will use

    dscal( n, alpha, x, incx )

which updates the vector x of size n, stored in memory starting at address x with increment incx between entries, with αx, where α is stored in alpha, and

    daxpy( n, alpha, x, incx, y, incy )

which updates the vector y of size n, stored in memory starting at address y with increment incy between entries, with αx + y, where x is stored at address x with increment incx and α is stored in alpha. With these, the implementation becomes

    for j := 0, . . . , n − 1
      αj,j := √αj,j
      αj+1:n−1,j := αj+1:n−1,j/αj,j
      for k := j + 1, . . . , n − 1
        αk:n−1,k := αk:n−1,k − αk,j αk:n−1,j
      endfor
    endfor

    do j=1, n
      A(j,j) = sqrt(A(j,j))
      call dscal( n-j, 1.0d00 / A(j,j), A(j+1,j), 1 )
      do k=j+1,n
        call daxpy( n-k+1, -A(k,j), A(k,j), 1, A(k,k), 1 )
      enddo
    enddo

Here αj+1:n−1,j = ( αj+1,j ; ··· ; αn−1,j ).
The entire update A22 := A22 − a21 a21^T can be cast in terms of a matrix-vector operation (level-2 BLAS call) to

    dsyr( uplo, n, alpha, x, incx, A, ldA )

which updates the matrix A of size n by n, stored in memory starting at address A with leading dimension ldA, with αxx^T + A, where x is stored at address x with increment incx and α is stored in alpha. Since both A and αxx^T + A are symmetric, only the triangular part indicated by uplo is updated. This is captured by the below algorithm and implementation.

    for j := 0, . . . , n − 1
      αj,j := √αj,j
      αj+1:n−1,j := αj+1:n−1,j/αj,j
      αj+1:n−1,j+1:n−1 := αj+1:n−1,j+1:n−1 − tril( αj+1:n−1,j αj+1:n−1,j^T )
    endfor

    do j=1, n
      A(j,j) = sqrt(A(j,j))
      call dscal( n-j, 1.0d00 / A(j,j), A(j+1,j), 1 )
      call dsyr( 'Lower triangular', n-j, -1.0d00, A(j+1,j), 1, A(j+1,j+1), ldA )
    enddo

Notice how the code that casts computation in terms of the BLAS uses a higher level of abstraction, through routines that implement the linear algebra operations that are encountered in the algorithms.

A = Chol-blocked-right-looking(A)
  A → ( ATL ATR ; ABL ABR )
  ATL is 0 × 0
  while n(ATL) < n(A)
    ( ATL ATR ; ABL ABR ) → ( A00 A01 A02 ; A10 A11 A12 ; A20 A21 A22 )
    A11 := L11 = Chol(A11)
    Solve L21 L11^H = A21, overwriting A21 with L21
    A22 := A22 − A21 A21^H   (syrk: update only lower triangular part)
    ( ATL ATR ; ABL ABR ) ← ( A00 A01 A02 ; A10 A11 A12 ; A20 A21 A22 )
  endwhile

Figure 5.4.6.1 Blocked Cholesky factorization Variant 3 (right-looking) algorithm. The operation "syrk" refers to "symmetric rank-k update", which performs a rank-k update (matrix-matrix multiplication with a small "k" size), updating only the lower triangular part of the matrix in this algorithm.
Finally, a blocked right-looking Cholesky factorization algorithm, which casts most computation in terms of a matrix-matrix multiplication operation referred to as a "symmetric rank-k update," is given in Figure 5.4.6.1. There, we use FLAME notation to present the algorithm. It translates into Fortran code that exploits the BLAS given below.

    do j=1, n, nb
      jb = min( nb, n-j+1 )
      Chol( jb, A( j, j ) )
      call dtrsm( 'Right', 'Lower triangular', 'Transpose', 'Nonunit diag', &
                  n-j-jb+1, jb, 1.0d00, A( j, j ), LDA, A( j+jb, j ), LDA )
      call dsyrk( 'Lower triangular', 'No transpose', n-j-jb+1, jb, -1.0d00, &
                  A( j+jb, j ), LDA, 1.0d00, A( j+jb, j+jb ), LDA )
    enddo

The routines dtrsm and dsyrk are Level-3 BLAS routines:

• The call to dtrsm implements A21 := L21, where L21 L11^T = A21.

• The call to dsyrk implements A22 := −A21 A21^T + A22, updating only the lower triangular part of the matrix.

The bulk of the computation is now cast in terms of matrix-matrix operations which can achieve high performance.

5.5 Enrichments
5.5.1 Other LU factorization algorithms
There are actually five different (unblocked) algorithms for computing the LU factorization
that were discovered over the course of the centuries. Here we show how to systematically
derive all five. For details, we suggest Week 6 of our Massive Open Online Course titled
"LAFF-On Programming for Correctness" [28].
Remark 5.5.1.1 To put yourself in the right frame of mind, we highly recommend you
spend about an hour reading the paper
• [31] Devangi N. Parikh, Margaret E. Myers, Richard Vuduc, Robert A. van de Geijn,
A Simple Methodology for Computing Families of Algorithms, FLAME Working Note
#87, The University of Texas at Austin, Department of Computer Science, Technical
Report TR-18-06. arXiv:1808.07832.
A = LU-var1(A)
  A → ( ATL ATR ; ABL ABR )
  ATL is 0 × 0
  while n(ATL) < n(A)
    ( ATL ATR ; ABL ABR ) → ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
    ...
    ( ATL ATR ; ABL ABR ) ← ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
  endwhile
Figure 5.5.1.2 LU factorization algorithm skeleton.
Finding the algorithms starts with the following observations.

• Our algorithms will overwrite the matrix A, and hence we introduce Â to denote the original contents of A. We will say that the precondition for the algorithm is that

      A = Â

  (A starts by containing the original contents of A.)

• We wish to overwrite A with L and U. Thus, the postcondition for the algorithm (the state in which we wish to exit the algorithm) is that

      A = L\U  ∧  LU = Â

  (A is overwritten by L below the diagonal and U on and above the diagonal, where multiplying L and U yields the original matrix Â.)

• All the algorithms will march through the matrices from top-left to bottom-right, giving us the code skeleton in Figure 5.5.1.2. Since the computed L and U overwrite A, throughout they are partitioned conformal to (in the same way as) A.

• Thus, before and after each iteration of the loop the matrices are viewed as quadrants:

      A → ( ATL ATR ; ABL ABR ),  L → ( LTL 0 ; LBL LBR ),  and  U → ( UTL UTR ; 0 UBR ),

  where ATL, LTL, and UTL are all square and equally sized.

• In terms of these exposed quadrants, in the end we wish for matrix A to contain

      ( ATL ATR ; ABL ABR ) = ( L\UTL  UTR ; LBL  L\UBR )
      ∧  ( LTL 0 ; LBL LBR ) ( UTL UTR ; 0 UBR ) = ( ÂTL ÂTR ; ÂBL ÂBR ).

• Manipulating this yields what we call the Partitioned Matrix Expression (PME), which can be viewed as a recursive definition of the LU factorization:

      ( ATL ATR ; ABL ABR ) = ( L\UTL  UTR ; LBL  L\UBR )
      ∧  LTL UTL = ÂTL        LTL UTR = ÂTR
         LBL UTL = ÂBL        LBR UBR = ÂBR − LBL UTR.

• Now, consider the code skeleton for the LU factorization in Figure 5.5.1.2. At the top of the loop (right after the while), we want to maintain certain contents in matrix A. Since we are in a loop, we haven't yet overwritten A with the final result. Instead, some progress toward this final result has been made. The way we can find what state of A we would like to maintain is to take the PME and delete subexpressions. For example, consider the following condition on the contents of A:

      ( ATL ATR ; ABL ABR ) = ( L\UTL  UTR ; LBL  ÂBR − LBL UTR )
      ∧  LTL UTL = ÂTL        LTL UTR = ÂTR
         LBL UTL = ÂBL.

  What we are saying is that ATL, ATR, and ABL have been completely updated with the corresponding parts of L and U, and ABR has been partially updated. This is exactly the state that the right-looking algorithm that we discussed in Subsection 5.2.2 maintains! What is left is to factor ABR, since it contains ÂBR − LBL UTR, and ÂBR − LBL UTR = LBR UBR.

• By carefully analyzing the order in which computation must occur (in compiler lingo: by performing a dependence analysis), we can identify five states that can be maintained at the top of the loop, by deleting subexpressions from the PME. These are called loop invariants. There are five for LU factorization:

      Invariant 1: ( ATL ATR ; ABL ABR ) = ( L\UTL  ÂTR ; ÂBL  ÂBR )
      Invariant 2: ( ATL ATR ; ABL ABR ) = ( L\UTL  UTR ; ÂBL  ÂBR )
      Invariant 3: ( ATL ATR ; ABL ABR ) = ( L\UTL  ÂTR ; LBL  ÂBR )
      Invariant 4: ( ATL ATR ; ABL ABR ) = ( L\UTL  UTR ; LBL  ÂBR )
      Invariant 5: ( ATL ATR ; ABL ABR ) = ( L\UTL  UTR ; LBL  ÂBR − LBL UTR )

• Key to figuring out what updates must occur in the loop for each of the variants is to look at how the matrices are repartitioned at the top and bottom of the loop body.

For each of the five algorithms for LU factorization, we will derive the loop invariant, and then derive the algorithm from the loop invariant.

5.5.1.1 Variant 1: Bordered algorithm


Consider the loop invariant:

    ( ATL ATR ; ABL ABR ) = ( L\UTL  ÂTR ; ÂBL  ÂBR )  ∧  LTL UTL = ÂTL,

meaning that the leading principal submatrix ATL has been overwritten with its LU factorization, and the remainder of the matrix has not yet been touched.
At the top of the loop, after repartitioning, A then contains

    ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
      = ( L\U00 â01 Â02 ; â10^T α̂11 â12^T ; Â20 â21 Â22 )  ∧  L00 U00 = Â00

while after updating A it must contain

    ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
      = ( L\U00 u01 Â02 ; l10^T υ11 â12^T ; Â20 â21 Â22 )
    ∧  ( L00 0 ; l10^T 1 ) ( U00 u01 ; 0 υ11 ) = ( Â00 â01 ; â10^T α̂11 ),

that is,

    L00 U00 = Â00        L00 u01 = â01
    l10^T U00 = â10^T    l10^T u01 + υ11 = α̂11,

for the loop invariant to again hold after the iteration. Here the entries in red are known (in addition to the ones marked with a "hat") and the entries in blue are to be computed. With this, we can compute the desired parts of L and U:

• Solve L00 u01 = a01, overwriting a01 with the result. (Notice that a01 = â01 before this update.)

• Solve l10^T U00 = a10^T (or, equivalently, U00^T (l10^T)^T = (a10^T)^T for (l10^T)^T), overwriting a10^T with the result. (Notice that a10^T = â10^T before this update.)

• Update α11 := υ11 = α11 − l10^T u01. (Notice that by this computation, a10^T = l10^T and a01 = u01.)

The resulting algorithm is captured in Figure 5.5.1.3.

A = LU-var1(A)
  A → ( ATL ATR ; ABL ABR )
  ATL is 0 × 0
  while n(ATL) < n(A)
    ( ATL ATR ; ABL ABR ) → ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
    Solve L00 u01 = a01, overwriting a01 with the result
    Solve l10^T U00 = a10^T, overwriting a10^T with the result
    α11 := υ11 = α11 − a10^T a01
    ( ATL ATR ; ABL ABR ) ← ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
  endwhile

Figure 5.5.1.3 Variant 1 (bordered) LU factorization algorithm. Here A00 stores L\U00.
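A minimal index-based MATLAB sketch of the bordered variant (ours, without pivoting), in which the triangular solves are written with backslash for clarity:

    function A = LU_bordered( A )
    % Overwrite A with {L\U}, computed by the bordered (Variant 1) algorithm.
      n = size( A, 1 );
      for j = 1:n
        L00 = tril( A( 1:j-1, 1:j-1 ), -1 ) + eye( j-1 );       % unit lower triangular part
        U00 = triu( A( 1:j-1, 1:j-1 ) );                        % upper triangular part
        A( 1:j-1, j ) = L00 \ A( 1:j-1, j );                    % u01 := L00^{-1} a01
        A( j, 1:j-1 ) = ( U00' \ A( j, 1:j-1 )' )';             % l10^T := a10^T U00^{-1}
        A( j, j ) = A( j, j ) - A( j, 1:j-1 ) * A( 1:j-1, j );  % upsilon11 := alpha11 - l10^T u01
      end
    end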

Homework 5.5.1.1 If A is n × n, show that the cost of Variant 1 is approximately (2/3)n^3 flops.
Solution. During the kth iteration, A00 is k × k, for k = 0, . . . , n − 1. Then the (approximate) cost of each of the steps is given by

• Solve L00 u01 = a01, overwriting a01 with the result. Cost: approximately k^2 flops.

• Solve l10^T U00 = a10^T (or, equivalently, U00^T (l10^T)^T = (a10^T)^T for (l10^T)^T), overwriting a10^T with the result. Cost: approximately k^2 flops.

• Compute υ11 = α11 − l10^T u01, overwriting α11 with the result. Cost: 2k flops.

Thus, the total cost is given by

    Σ_{k=0}^{n−1} ( k^2 + k^2 + 2k ) ≈ 2 Σ_{k=0}^{n−1} k^2 ≈ (2/3)n^3.

5.5.1.2 Variant 2: Up-looking algorithm


Consider next the loop invariant:

    ( ATL ATR ; ABL ABR ) = ( L\UTL  UTR ; ÂBL  ÂBR )  ∧  LTL UTL = ÂTL,  LTL UTR = ÂTR,

meaning that the leading principal submatrix ATL has been overwritten with its LU factorization and UTR has overwritten ATR.
At the top of the loop, after repartitioning, A then contains

    ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
      = ( L\U00 u01 U02 ; â10^T α̂11 â12^T ; Â20 â21 Â22 )
    ∧  L00 U00 = Â00,  L00 u01 = â01,  L00 U02 = Â02

while after updating A it must contain

    ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
      = ( L\U00 u01 U02 ; l10^T υ11 u12^T ; Â20 â21 Â22 )
    ∧  ( L00 0 ; l10^T 1 ) ( U00 u01 ; 0 υ11 ) = ( Â00 â01 ; â10^T α̂11 )  and
       ( L00 0 ; l10^T 1 ) ( U02 ; u12^T ) = ( Â02 ; â12^T ),

that is,

    L00 U00 = Â00        L00 u01 = â01             L00 U02 = Â02
    l10^T U00 = â10^T    l10^T u01 + υ11 = α̂11    l10^T U02 + u12^T = â12^T,

for the loop invariant to again hold after the iteration. Here, again, the entries in red are known (in addition to the ones marked with a "hat") and the entries in blue are to be computed. With this, we can compute the desired parts of L and U:

• Solve l10^T U00 = a10^T, overwriting a10^T with the result.

• Update α11 := υ11 = α11 − l10^T u01 = α11 − a10^T a01.

• Update a12^T := u12^T = a12^T − l10^T U02 = a12^T − a10^T A02.

The resulting algorithm is captured in Figure 5.5.1.4.



A = LU-var2(A)
  A → ( ATL ATR ; ABL ABR )
  ATL is 0 × 0
  while n(ATL) < n(A)
    ( ATL ATR ; ABL ABR ) → ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
    Solve l10^T U00 = a10^T, overwriting a10^T with the result
    α11 := υ11 = α11 − a10^T a01
    a12^T := u12^T = a12^T − a10^T A02
    ( ATL ATR ; ABL ABR ) ← ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
  endwhile

Figure 5.5.1.4 Variant 2 (up-looking) LU factorization algorithm. Here A00 stores L\U00.

Homework 5.5.1.2 If A is n × n, show that the cost of Variant 2 is approximately (2/3)n^3 flops.
Solution. During the kth iteration, A00 is k × k, for k = 0, . . . , n − 1. Then the (approximate) cost of each of the steps is given by

• Solve l10^T U00 = a10^T, overwriting a10^T with the result. Approximate cost: k^2 flops.

• Update α11 := υ11 = α11 − l10^T u01 = α11 − a10^T a01. Approximate cost: 2k flops.

• Update a12^T := u12^T = a12^T − l10^T U02 = a12^T − a10^T A02. Approximate cost: 2k(n − k − 1) flops.

Thus, the total cost is approximately given by

    Σ_{k=0}^{n−1} ( k^2 + 2k + 2k(n − k − 1) )
      =   < simplify >
    Σ_{k=0}^{n−1} ( 2kn − k^2 )
      =   < algebra >
    2n Σ_{k=0}^{n−1} k − Σ_{k=0}^{n−1} k^2
      ≈   < Σ_{k=0}^{n−1} k ≈ n^2/2;  Σ_{k=0}^{n−1} k^2 ≈ n^3/3 >
    n^3 − n^3/3
      =   < simplify >
    (2/3)n^3.

5.5.1.3 Variant 3: Left-looking algorithm


Consider the loop invariant:

    ( ATL ATR ; ABL ABR ) = ( L\UTL  ÂTR ; LBL  ÂBR )  ∧  LTL UTL = ÂTL,  LBL UTL = ÂBL.

At the top of the loop, after repartitioning, A then contains

    ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
      = ( L\U00 â01 Â02 ; l10^T α̂11 â12^T ; L20 â21 Â22 )
    ∧  L00 U00 = Â00,  l10^T U00 = â10^T,  L20 U00 = Â20

while after updating A it must contain

    ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
      = ( L\U00 u01 Â02 ; l10^T υ11 â12^T ; L20 l21 Â22 )
    ∧  ( L00 0 ; l10^T 1 ) ( U00 u01 ; 0 υ11 ) = ( Â00 â01 ; â10^T α̂11 )  and
       ( L20 l21 ) ( U00 u01 ; 0 υ11 ) = ( Â20 â21 ),

that is,

    L00 U00 = Â00        L00 u01 = â01
    l10^T U00 = â10^T    l10^T u01 + υ11 = α̂11
    L20 U00 = Â20        L20 u01 + l21 υ11 = â21,

for the loop invariant to again hold after the iteration. With this, we can compute the desired parts of L and U:

• Solve L00 u01 = a01, overwriting a01 with the result.

• Update α11 := υ11 = α11 − l10^T u01 = α11 − a10^T a01.

• Update a21 := l21 = (a21 − L20 u01)/υ11 = (a21 − A20 a01)/α11.

The resulting algorithm is captured in Figure 5.5.1.5.



A = LU-var3(A)
  A → ( ATL ATR ; ABL ABR )
  ATL is 0 × 0
  while n(ATL) < n(A)
    ( ATL ATR ; ABL ABR ) → ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
    Solve L00 u01 = a01, overwriting a01 with the result
    α11 := υ11 = α11 − a10^T a01
    a21 := l21 = (a21 − A20 a01)/α11
    ( ATL ATR ; ABL ABR ) ← ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
  endwhile

Figure 5.5.1.5 Variant 3 (left-looking) LU factorization algorithm. Here A00 stores L\U00.

Homework 5.5.1.3 If A is n × n, show that the cost of Variant 3 is approximately (2/3)n^3 flops.
Solution. During the kth iteration, A00 is k × k, for k = 0, . . . , n − 1. Then the (approximate) cost of each of the steps is given by

• Solve L00 u01 = a01, overwriting a01 with the result. Approximate cost: k^2 flops.

• Update α11 := υ11 = α11 − l10^T u01 = α11 − a10^T a01. Approximate cost: 2k flops.

• Update a21 := l21 = (a21 − L20 u01)/υ11 = (a21 − A20 a01)/α11. Approximate cost: 2k(n − k − 1) flops.

Thus, the total cost is approximately given by

    Σ_{k=0}^{n−1} ( k^2 + 2k + 2k(n − k − 1) )
      =   < simplify >
    Σ_{k=0}^{n−1} ( 2kn − k^2 )
      =   < algebra >
    2n Σ_{k=0}^{n−1} k − Σ_{k=0}^{n−1} k^2
      ≈   < Σ_{k=0}^{n−1} k ≈ n^2/2;  Σ_{k=0}^{n−1} k^2 ≈ n^3/3 >
    n^3 − n^3/3
      =   < simplify >
    (2/3)n^3.

5.5.1.4 Variant 4: Crout variant


Consider next the loop invariant:

    ( ATL ATR ; ABL ABR ) = ( L\UTL  UTR ; LBL  ÂBR )
    ∧  LTL UTL = ÂTL,  LTL UTR = ÂTR,  LBL UTL = ÂBL.

At the top of the loop, after repartitioning, A then contains

    ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
      = ( L\U00 u01 U02 ; l10^T α̂11 â12^T ; L20 â21 Â22 )
    ∧  L00 U00 = Â00,  L00 u01 = â01,  L00 U02 = Â02,  l10^T U00 = â10^T,  L20 U00 = Â20

while after updating A it must contain

    ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
      = ( L\U00 u01 U02 ; l10^T υ11 u12^T ; L20 l21 Â22 )
    ∧  L00 U00 = Â00        L00 u01 = â01             L00 U02 = Â02
       l10^T U00 = â10^T    l10^T u01 + υ11 = α̂11    l10^T U02 + u12^T = â12^T
       L20 U00 = Â20        L20 u01 + l21 υ11 = â21,

for the loop invariant to again hold after the iteration. With this, we can compute the desired parts of L and U:

• Update α11 := υ11 = α11 − l10^T u01 = α11 − a10^T a01.

• Update a12^T := u12^T = a12^T − l10^T U02 = a12^T − a10^T A02.

• Update a21 := l21 = (a21 − L20 u01)/υ11 = (a21 − A20 a01)/α11.

The resulting algorithm is captured in Figure 5.5.1.6.



A = LU-var4(A)
  A → ( ATL ATR ; ABL ABR )
  ATL is 0 × 0
  while n(ATL) < n(A)
    ( ATL ATR ; ABL ABR ) → ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
    α11 := υ11 = α11 − a10^T a01
    a12^T := u12^T = a12^T − a10^T A02
    a21 := l21 = (a21 − A20 a01)/α11
    ( ATL ATR ; ABL ABR ) ← ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
  endwhile

Figure 5.5.1.6 Variant 4 (Crout) LU factorization algorithm.

Homework 5.5.1.4 If A is n × n, show that the cost of Variant 4 is approximately (2/3)n^3 flops.
Solution. During the kth iteration, A00 is k × k, for k = 0, . . . , n − 1. Then the (approximate) cost of each of the steps is given by

• Update α11 := υ11 = α11 − l10^T u01 = α11 − a10^T a01. Approximate cost: 2k flops.

• Update a12^T := u12^T = a12^T − l10^T U02 = a12^T − a10^T A02. Approximate cost: 2k(n − k − 1) flops.

• Update a21 := l21 = (a21 − L20 u01)/υ11 = (a21 − A20 a01)/α11. Approximate cost: 2k(n − k − 1) + (n − k − 1) flops.

Thus, ignoring the 2k flops for the dot product and the n − k − 1 flops for multiplying with 1/α11 in each iteration, the total cost is approximately given by

    Σ_{k=0}^{n−1} 4k(n − k − 1)
      ≈   < remove lower order term >
    Σ_{k=0}^{n−1} 4k(n − k)
      =   < algebra >
    4n Σ_{k=0}^{n−1} k − 4 Σ_{k=0}^{n−1} k^2
      ≈   < Σ_{k=0}^{n−1} k ≈ n^2/2;  Σ_{k=0}^{n−1} k^2 ≈ n^3/3 >
    2n^3 − (4/3)n^3
      =   < simplify >
    (2/3)n^3.

5.5.1.5 Variant 5: Classical Gaussian elimination


Consider the final loop invariant:

    ( ATL ATR ; ABL ABR ) = ( L\UTL  UTR ; LBL  ÂBR − LBL UTR )
    ∧  LTL UTL = ÂTL,  LTL UTR = ÂTR,  LBL UTL = ÂBL.

At the top of the loop, after repartitioning, A then contains

    ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
      = ( L\U00   u01                   U02
          l10^T   α̂11 − l10^T u01       â12^T − l10^T U02
          L20     â21 − L20 u01         Â22 − L20 U02 )
    ∧  L00 U00 = Â00,  L00 u01 = â01,  L00 U02 = Â02,  l10^T U00 = â10^T,  L20 U00 = Â20

while after updating A it must contain

    ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
      = ( L\U00 u01 U02 ; l10^T υ11 u12^T ; L20 l21 Â22 − L20 U02 − l21 u12^T )
    ∧  L00 U00 = Â00        L00 u01 = â01             L00 U02 = Â02
       l10^T U00 = â10^T    l10^T u01 + υ11 = α̂11    l10^T U02 + u12^T = â12^T
       L20 U00 = Â20        L20 u01 + l21 υ11 = â21,

for the loop invariant to again hold after the iteration. With this, we can compute the desired parts of L and U:

• α11 := υ11 = α̂11 − l10^T u01 = α11 (no-op). (α11 already equals α̂11 − l10^T u01.)

• a12^T := u12^T = â12^T − l10^T U02 = a12^T (no-op). (a12^T already equals â12^T − l10^T U02.)

• Update a21 := (â21 − L20 u01)/υ11 = a21/α11. (a21 already equals â21 − L20 u01.)

• Update A22 := Â22 − L20 U02 − l21 u12^T = A22 − a21 a12^T. (A22 already equals Â22 − L20 U02.)

The resulting algorithm is captured in Figure 5.5.1.7.


A = LU-var5(A)
  A → ( ATL ATR ; ABL ABR )
  ATL is 0 × 0
  while n(ATL) < n(A)
    ( ATL ATR ; ABL ABR ) → ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
    a21 := l21 = a21/α11
    A22 := A22 − a21 a12^T
    ( ATL ATR ; ABL ABR ) ← ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
  endwhile

Figure 5.5.1.7 Variant 5 (classical Gaussian elimination) LU factorization algorithm.

Homework 5.5.1.5 If A is n × n, show that the cost of Variant 5 is approximately (2/3)n^3 flops.
Solution. During the kth iteration, A00 is k × k, for k = 0, . . . , n − 1. Then the (approximate) cost of each of the steps is given by

• Update a21 := l21 = a21/α11. Approximate cost: n − k − 1 flops.

• Update A22 := A22 − l21 u12^T = A22 − a21 a12^T. Approximate cost: 2(n − k − 1)(n − k − 1) flops.

Thus, ignoring the n − k − 1 flops for multiplying with 1/α11 in each iteration, the total cost is approximately given by

    Σ_{k=0}^{n−1} 2(n − k − 1)^2
      =   < change of variable j = n − k − 1 >
    2 Σ_{j=0}^{n−1} j^2
      ≈   < Σ_{j=0}^{n−1} j^2 ≈ n^3/3 >
    (2/3)n^3.

5.5.1.6 Discussion
Remark 5.5.1.8 For a discussion of the different LU factorization algorithms that also gives a historic perspective, we recommend "Matrix Algorithms Volume 1" by G.W. Stewart [37].

5.5.2 Blocked LU factorization


Recall from Subsection 3.3.4 that casting computation in terms of matrix-matrix multiplication facilitates high performance. In this unit we very briefly illustrate how the right-looking LU factorization can be reformulated as such a "blocked" algorithm. For details on other blocked LU factorization algorithms and blocked Cholesky factorization algorithms, we once again refer the interested reader to our Massive Open Online Course titled "LAFF-On Programming for Correctness" [28]. We will revisit these kinds of issues in the final week of this course.
Consider A = LU and partition these matrices as

    A → ( A11 A12 ; A21 A22 ),  L → ( L11 0 ; L21 L22 ),  U → ( U11 U12 ; 0 U22 ),

where A11, L11, and U11 are b × b submatrices. Then

    ( A11 A12 ; A21 A22 ) = ( L11 0 ; L21 L22 ) ( U11 U12 ; 0 U22 )
                          = ( L11 U11  L11 U12 ; L21 U11  L21 U12 + L22 U22 ).

From this we conclude that

    A11 = L11 U11        A12 = L11 U12
    A21 = L21 U11        A22 − L21 U12 = L22 U22.

This suggests the following steps:

• Compute the LU factorization of A11 (e.g., using any of the "unblocked" algorithms from Subsection 5.5.1),

      A11 = L11 U11,

  overwriting A11 with the factors.

• Solve

      L11 U12 = A12

  for U12, overwriting A12 with the result. This is known as a "triangular solve with multiple right-hand sides." This comes from the fact that solving

      LX = B,

  where L is lower triangular, can be reformulated by partitioning X and B by columns,

      L ( x0 | x1 | ··· ) = ( Lx0 | Lx1 | ··· ) = ( b0 | b1 | ··· ),

  which exposes that for each pair of columns we must solve the unit lower triangular system Lxj = bj.

• Solve

      L21 U11 = A21

  for L21, overwriting A21 with the result. This is also a "triangular solve with multiple right-hand sides," since we can instead view it as solving the lower triangular system with multiple right-hand sides

      U11^T L21^T = A21^T.

  (In practice, the matrices are not transposed.)

• Update

      A22 := A22 − L21 U12.

• Proceed by computing the LU factorization of the updated A22.

This motivates the algorithm in Figure 5.5.2.1.


A = LU-blk-var5(A)
  A → ( ATL ATR ; ABL ABR )
  ATL is 0 × 0
  while n(ATL) < n(A)
    ( ATL ATR ; ABL ABR ) → ( A00 A01 A02 ; A10 A11 A12 ; A20 A21 A22 )
    A11 := LU(A11)   (L11 and U11 overwrite A11)
    Solve L11 U12 = A12, overwriting A12 with U12
    Solve L21 U11 = A21, overwriting A21 with L21
    A22 := A22 − A21 A12
    ( ATL ATR ; ABL ABR ) ← ( A00 A01 A02 ; A10 A11 A12 ; A20 A21 A22 )
  endwhile

Figure 5.5.2.1 Blocked Variant 5 (classical Gaussian elimination) LU factorization algorithm.
The important observation is that if A is m × m and b is much smaller than m, then most of the computation is in the matrix-matrix multiplication A22 := A22 − A21 A12.
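A minimal MATLAB sketch of the blocked algorithm (ours, without pivoting); LU_unblocked is a placeholder for any of the unblocked variants of Subsection 5.5.1, assumed to overwrite its input with {L\U}:

    function A = LU_blocked_var5( A, nb )
    % Blocked right-looking LU factorization with block size nb (no pivoting).
      n = size( A, 1 );
      for j = 1:nb:n
        jb = min( nb, n-j+1 );                          % size of the current diagonal block
        J  = j:j+jb-1;                                  % index range of A11
        R  = j+jb:n;                                    % index range of the remainder
        A( J, J ) = LU_unblocked( A( J, J ) );          % A11 := {L11\U11}
        L11 = tril( A( J, J ), -1 ) + eye( jb );
        U11 = triu( A( J, J ) );
        A( J, R ) = L11 \ A( J, R );                    % U12 := L11^{-1} A12
        A( R, J ) = A( R, J ) / U11;                    % L21 := A21 U11^{-1}
        A( R, R ) = A( R, R ) - A( R, J ) * A( J, R );  % A22 := A22 - L21 U12
      end
    end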
Remark 5.5.2.2 For each (unblocked) algorithm in Subsection 5.5.1, there is a corresponding blocked algorithm.

5.6 Wrap Up
5.6.1 Additional homework
In this chapter, we discussed how the LU factorization (with pivoting) can be used to solve Ax = y. Why don't we instead discuss how to compute the inverse of the matrix A and compute x = A^{−1}y? Through a sequence of exercises, we illustrate why one should (almost) never compute the inverse of a matrix.
Homework 5.6.1.1 Let A ∈ C^{m×m} be nonsingular and B its inverse. We know that AB = I and hence

    A ( b0 | ··· | b_{m−1} ) = ( e0 | ··· | e_{m−1} ),

where ej can be thought of as the standard basis vector indexed with j or the column of I indexed with j.

1. Justify the following algorithm for computing B:

    for j = 0, . . . , m − 1
      Compute the LU factorization with pivoting: P(p)A = LU
      Solve Lz = P(p)ej
      Solve U bj = z
    endfor

2. What is the cost, in flops, of the above algorithm?

3. How can we reduce the cost in the most obvious way, and what is the cost of this better algorithm?

4. If we want to solve Ax = y we can now instead compute x = By. What is the cost of this multiplication and how does this cost compare with the cost of computing it via the LU factorization, once the LU factorization has already been computed:

    Solve Lz = P(p)y
    Solve Ux = z

What do we conclude about the wisdom of computing the inverse?

Homework 5.6.1.2 Let L be a unit lower triangular matrix. Partition

    L = ( 1 0 ; l21 L22 ).

1. Show that

    L^{−1} = ( 1 0 ; −L22^{−1} l21  L22^{−1} ).

2. Use the insight from the last part to complete the following algorithm for computing the inverse of a unit lower triangular matrix:

    [L] = inv(L)
      L → ( LTL LTR ; LBL LBR )
      LTL is 0 × 0
      while n(LTL) < n(L)
        ( LTL LTR ; LBL LBR ) → ( L00 l01 L02 ; l10^T λ11 l12^T ; L20 l21 L22 )

        l21 :=

        ( LTL LTR ; LBL LBR ) ← ( L00 l01 L02 ; l10^T λ11 l12^T ; L20 l21 L22 )
      endwhile

3. The correct algorithm in the last part will avoid inverting matrices and will require, approximately, (1/3)m^3 flops. Analyze the cost of your algorithm.
Homework 5.6.1.3 LINPACK, the first software package for computing various operations related to solving (dense) linear systems, includes routines for inverting a matrix. When a survey was conducted to see what routines were in practice most frequently used, to the dismay of the developers, it was discovered that the routine for inverting matrices was among them. To solve Ax = y users were inverting A and then computing x = A^{−1}y. For this reason, the successor to LINPACK, LAPACK, does not even include a routine for inverting a matrix. Instead, if a user wants to compute the inverse, the user must go through the steps

    Compute the LU factorization with pivoting: P(p)A = LU
    Invert L, overwriting L with the result
    Solve UX = L for X
    Compute A^{−1} := X P(p) (permuting the columns of X)

1. Justify that the described steps compute A^{−1}.

2. Propose an algorithm for computing X that solves UX = L. Be sure to take advantage of the triangular structure of U and L.

3. Analyze the cost of the algorithm in the last part of this question. If you did it right, it should require, approximately, m^3 operations.

4. What is the total cost of inverting the matrix?



5.6.2 Summary
The process known as Gaussian elimination is equivalent to computing the LU factorization of the matrix A ∈ C^{m×m}:

    A = LU,

where L is a unit lower triangular matrix and U is an upper triangular matrix.
Definition 5.6.2.1 Given a matrix A ∈ C^{m×n} with m ≥ n, its LU factorization is given by A = LU where L ∈ C^{m×n} is unit lower trapezoidal and U ∈ C^{n×n} is upper triangular with nonzeroes on its diagonal. ⌃
Definition 5.6.2.2 Principal leading submatrix. For k ≤ n, the k × k principal leading submatrix of a matrix A is defined to be the square matrix ATL ∈ C^{k×k} such that A = ( ATL ATR ; ABL ABR ). ⌃
Lemma 5.6.2.3 Let L ∈ C^{n×n} be a unit lower triangular matrix and U ∈ C^{n×n} be an upper triangular matrix. Then A = LU is nonsingular if and only if U has no zeroes on its diagonal.
Theorem 5.6.2.4 Existence of the LU factorization. Let A ∈ C^{m×n} with m ≥ n have linearly independent columns. Then A has a (unique) LU factorization if and only if all its principal leading submatrices are nonsingular.
A = LU-right-looking(A)
  Partition A → ( A_TL  A_TR
                  A_BL  A_BR )  with A_TL 0 × 0
  while n(A_TL) < n(A)
     Repartition
        ( A_TL  A_TR )      ( A_00    a_01   A_02
        ( A_BL  A_BR )  →   ( a_10^T  α_11   a_12^T
                            ( A_20    a_21   A_22  )
     a_21 := a_21 / α_11
     A_22 := A_22 − a_21 a_12^T
     Continue with the repartitioned blocks
  endwhile
Figure 5.6.2.5 Right-looking LU factorization algorithm.

The right-looking algorithm performs the same computations as the algorithm

  for j := 0, . . . , n − 1
     for i := j + 1, . . . , n − 1
        λ_{i,j} := α_{i,j} / α_{j,j}   }
        α_{i,j} := 0                   }  compute multipliers
     endfor
     for i := j + 1, . . . , n − 1
        for k := j + 1, . . . , n − 1
           α_{i,k} := α_{i,k} − λ_{i,j} α_{j,k}   }  subtract λ_{i,j} times row j from row i
        endfor
     endfor
  endfor
A = LU-left-looking(A)
  Partition A → ( A_TL  A_TR
                  A_BL  A_BR )  with A_TL 0 × 0
  while n(A_TL) < n(A)
     Repartition
        ( A_TL  A_TR )      ( A_00    a_01   A_02
        ( A_BL  A_BR )  →   ( a_10^T  α_11   a_12^T
                            ( A_20    a_21   A_22  )
     Solve L_00 u_01 = a_01, overwriting a_01 with u_01
     α_11 := υ_11 = α_11 − a_10^T a_01
     a_21 := a_21 − A_20 a_01
     a_21 := l_21 = a_21 / α_11
     Continue with the repartitioned blocks
  endwhile
Figure 5.6.2.6 Left-looking LU factorization algorithm. L_00 is the unit lower triangular
matrix stored in the strictly lower triangular part of A_00 (with the diagonal implicitly stored).

Solving Ax = b via LU factorization:
  • Compute the LU factorization A = LU.
  • Solve Lz = b.
  • Solve U x = z.

Cost of LU factorization: Starting with an m × n matrix A, LU factorization requires
approximately m n^2 − n^3/3 flops. If m = n this becomes (2/3) n^3 flops.
Definition 5.6.2.7 A matrix L_k of the form
    L_k = ( I_k  0     0
            0    1     0
            0    l_21  I ),
where I_k is the k × k identity matrix and I is an identity matrix "of appropriate size", is
called a Gauss transform.

    L_k^{-1} = ( I_k  0     0 )^{-1}    ( I_k   0      0
                 0    1     0        =    0     1      0
                 0    l_21  I )           0    −l_21   I ).

Definition 5.6.2.8 Given
    p = ( π_0
           ⋮
          π_{n−1} ),
where {π_0, π_1, . . . , π_{n−1}} is a permutation (rearrangement) of the integers {0, 1, . . . , n − 1},
we define the permutation matrix P(p) by
    P(p) = ( e_{π_0}^T
               ⋮
             e_{π_{n−1}}^T ).

If P is a permutation matrix then P^{-1} = P^T.

Definition 5.6.2.9 Elementary pivot matrix. Given π ∈ {0, . . . , n − 1} define the
elementary pivot matrix
    P̃(π) = ( e_π^T
              e_1^T
               ⋮
              e_{π−1}^T
              e_0^T
              e_{π+1}^T
               ⋮
              e_{n−1}^T )
or, equivalently,
    P̃(π) = I_n   if π = 0,   and otherwise
    P̃(π) = ( 0   0         1   0
              0   I_{π−1}   0   0
              1   0         0   0
              0   0         0   I_{n−π−1} ),
where n is the size of the permutation matrix.
[A, p] = LUpiv-right-looking(A)
  Partition A → ( A_TL  A_TR ),  p → ( p_T
                ( A_BL  A_BR )        p_B )
  with A_TL 0 × 0 and p_T with 0 elements
  while n(A_TL) < n(A)
     Repartition
        ( A_TL  A_TR )      ( A_00    a_01   A_02            ( p_0
        ( A_BL  A_BR )  →   ( a_10^T  α_11   a_12^T ,   p →  ( π_1
                            ( A_20    a_21   A_22  )         ( p_2 )
     π_1 := maxi( α_11
                  a_21 )
     ( a_10^T  α_11  a_12^T )  :=  P̃(π_1) ( a_10^T  α_11  a_12^T )
     ( A_20    a_21  A_22   )             ( A_20    a_21  A_22   )
     a_21 := a_21 / α_11
     A_22 := A_22 − a_21 a_12^T
     Continue with the repartitioned blocks
  endwhile
Figure 5.6.2.10 Right-looking LU factorization algorithm with partial pivoting.
Solving Ax = b via LU factorization with row pivoting:
  • Compute the LU factorization with pivoting: P A = LU.
  • Apply the row exchanges to the right-hand side: y = P b.
  • Solve Lz = y.
  • Solve U x = z.
Solve Lz = y, overwriting y with z (Variant 1)
  Partition L → ( L_TL  L_TR ),  y → ( y_T
                ( L_BL  L_BR )        y_B )
  with L_TL 0 × 0 and y_T with 0 elements
  while n(L_TL) < n(L)
     Repartition
        ( L_TL  L_TR )      ( L_00    l_01   L_02           ( y_0
        ( L_BL  L_BR )  →   ( l_10^T  λ_11   l_12^T ,  y →  ( ψ_1
                            ( L_20    l_21   L_22  )        ( y_2 )
     y_2 := y_2 − ψ_1 l_21
     Continue with the repartitioned blocks
  endwhile
Figure 5.6.2.11 Lower triangular solve (with unit lower triangular matrix), Variant 1

Solve Lz = y, overwriting y with z (Variant 2)
  Partition L and y as in Variant 1
  while n(L_TL) < n(L)
     Repartition as in Variant 1
     ψ_1 := ψ_1 − l_10^T y_0
     Continue with the repartitioned blocks
  endwhile
Figure 5.6.2.12 Lower triangular solve (with unit lower triangular matrix), Variant 2

Solve U x = z, overwriting z with x (Variant 1)
  Partition U → ( U_TL  U_TR ),  z → ( z_T
                ( U_BL  U_BR )        z_B )
  with U_BR 0 × 0 and z_B with 0 elements
  while n(U_BR) < n(U)
     Repartition
        ( U_TL  U_TR )      ( U_00    u_01   U_02           ( z_0
        ( U_BL  U_BR )  →   ( u_10^T  υ_11   u_12^T ,  z →  ( ζ_1
                            ( U_20    u_21   U_22  )        ( z_2 )
     ζ_1 := ζ_1 / υ_11
     z_0 := z_0 − ζ_1 u_01
     Continue with the repartitioned blocks
  endwhile
Figure 5.6.2.13 Upper triangular solve, Variant 1

Solve U x = z, overwriting z with x (Variant 2)
  Partition U and z as in Variant 1
  while n(U_BR) < n(U)
     Repartition as in Variant 1
     ζ_1 := ζ_1 − u_12^T z_2
     ζ_1 := ζ_1 / υ_11
     Continue with the repartitioned blocks
  endwhile
Figure 5.6.2.14 Upper triangular solve, Variant 2
Cost of triangular solve: Starting with an n × n (upper or lower) triangular matrix T, solving
T x = b requires approximately n^2 flops.
Provided the solution of Ax = b yields some accuracy in the solution, that accuracy can
be improved through a process known as iterative refinement:
  • Let x̂ be an approximate solution to Ax = b.
  • Let δx̂ be an approximate solution to A δx = b − A x̂.
  • Then x̂ + δx̂ is an improved approximation.
  • This process can be repeated until the accuracy in the computed solution is as good
    as warranted by the conditioning of A and the accuracy in b, as sketched below.
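The following is a minimal MATLAB sketch of this idea (an illustration only; A and b are made up, and the LU factors are reused so that each refinement sweep costs only two triangular solves).

    % A minimal MATLAB sketch of iterative refinement (illustration only).
    n = 500;
    A = rand(n) + n*eye(n);  b = rand(n,1);
    [L,U,p] = lu(A,'vector');           % P(p) A = L U, computed once
    xhat = U \ (L \ b(p));              % initial approximate solution
    for k = 1:3                         % a few refinement sweeps
        r     = b - A*xhat;             % residual of the current approximation
        dxhat = U \ (L \ r(p));         % approximate solution of A dx = r
        xhat  = xhat + dxhat;           % improved approximation
    end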
Definition 5.6.2.15 Hermitian positive definite matrix. A matrix A ∈ C^{n×n} is
Hermitian positive definite (HPD) if and only if it is Hermitian (A^H = A) and for all
nonzero vectors x ∈ C^n it is the case that x^H A x > 0. If in addition A ∈ R^{n×n} then A is said
to be symmetric positive definite (SPD).
Some insights regarding HPD matrices:
  • B has linearly independent columns if and only if A = B^H B is HPD.
  • A diagonal matrix has only positive values on its diagonal if and only if it is HPD.
  • If A is HPD, then its diagonal elements are all real-valued and positive.
  • If A = ( A_TL  A_TR
            ( A_BL  A_BR ), where A_TL is square, is HPD, then A_TL and A_BR are HPD.
Theorem 5.6.2.16 Cholesky Factorization Theorem. Given an HPD matrix A there
exists a lower triangular matrix L such that A = L L^H. If the diagonal elements of L are
restricted to be positive, L is unique.


A = Chol-right-looking(A)
  Partition A → ( A_TL  A_TR
                  A_BL  A_BR )  with A_TL 0 × 0
  while n(A_TL) < n(A)
     Repartition
        ( A_TL  A_TR )      ( A_00    a_01   A_02
        ( A_BL  A_BR )  →   ( a_10^T  α_11   a_12^T
                            ( A_20    a_21   A_22  )
     α_11 := λ_11 = √α_11
     a_21 := l_21 = a_21 / α_11
     A_22 := A_22 − a_21 a_21^H   (syr: update only lower triangular part)
     Continue with the repartitioned blocks
  endwhile
Figure 5.6.2.17 Cholesky factorization algorithm (right-looking variant). The operation
"syr" refers to "symmetric rank-1 update", which performs a rank-1 update, updating only
the lower triangular part of the matrix in this algorithm.

Lemma 5.6.2.18 Let A = ( α_11   a_21^H
                         a_21   A_22  ) ∈ C^{n×n} be HPD and l_21 = a_21/√α_11. Then
A_22 − l_21 l_21^H is HPD.
Let x̂ ∈ C^n equal the solution to the linear least-squares (LLS) problem

    ‖b − A x̂‖_2 = min_{x ∈ C^n} ‖b − A x‖_2,                               (5.6.1)

where A has linearly independent columns. It equals the solution to the normal equations

    (A^H A) x̂ = A^H b,   i.e.,   B x̂ = y   with B = A^H A and y = A^H b.

This solution can be computed via the steps

  • Form B = A^H A. Cost: approximately m n^2 flops.
  • Factor B = L L^H (Cholesky factorization). Cost: approximately n^3/3 flops.
  • Compute y = A^H b. Cost: approximately 2 m n flops.
  • Solve L z = y. Cost: approximately n^2 flops.
  • Solve L^H x̂ = z. Cost: approximately n^2 flops.

for a total of, approximately, m n^2 + n^3/3 flops. A brief sketch of these steps follows.
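Here is a minimal MATLAB sketch of the five steps (an illustration only; m, n, A, and b are made up). The last line compares against MATLAB's QR-based backslash solve.

    % A minimal MATLAB sketch of solving the LLS problem via the normal equations.
    m = 2000;  n = 50;
    A = rand(m,n);  b = rand(m,1);
    B = A'*A;               % Form B = A^H A            (~ m n^2 flops)
    L = chol(B,'lower');    % Cholesky: B = L L^H       (~ n^3/3 flops)
    y = A'*b;               % y = A^H b                 (~ 2 m n flops)
    z = L \ y;              % Solve L z = y             (~ n^2 flops)
    xhat = L' \ z;          % Solve L^H xhat = z        (~ n^2 flops)
    disp( norm(xhat - A\b)/norm(xhat) )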


Week 6

Numerical Stability

The material in this chapter has been adapted from


• [6] Paolo Bientinesi, Robert A. van de Geijn, Goal-Oriented and Modular Stability
Analysis, SIAM Journal on Matrix Analysis and Applications , Volume 32 Issue 1,
February 2011.
and the technical report version of that paper (which includes exercises)
• [7] Paolo Bientinesi, Robert A. van de Geijn, The Science of Deriving Stability Anal-
yses, FLAME Working Note #33. Aachen Institute for Computational Engineering
Sciences, RWTH Aachen. TR AICES-2008-2. November 2008.
We recommend the technical report version for those who want to gain a deep understanding.
In this chapter, we focus on computation with real-valued scalars, vectors, and matrices.

6.1 Opening Remarks


6.1.1 Whose problem is it anyway?
Ponder This 6.1.1.1 Suppose we solve Ax = b on a computer and the result is an approxi-
mate solution x̂ due to the roundoff error that is incurred. If we don't know x, how do we check
that x̂ approximates x with a small relative error? Should we check the residual b − A x̂?
Solution.
  • If ‖b − A x̂‖ / ‖b‖ is small, then we cannot necessarily conclude that ‖x̂ − x‖ / ‖x‖ is
    small (in other words: that x̂ is relatively close to x).


  • If ‖b − A x̂‖ / ‖b‖ is small, then we can conclude that x̂ solves a nearby problem, provided
    we trust whatever routine computes A x̂. After all, it solves
        A x̂ = b̂,
    where ‖b − b̂‖ / ‖b‖ is small.

So, ‖b − A x̂‖/‖b‖ being small is a necessary condition, but not a sufficient condition. If
‖b − A x̂‖/‖b‖ is small, then x̂ is as good an answer as the problem warrants, since a small
error in the right-hand side is to be expected either because data inherently has error in it
or because in storing the right-hand side the input was inherently rounded.
In the presence of roundoff error, it is hard to determine whether an implementation is
correct. Let's examine a few scenarios.
Homework 6.1.1.2 You use some linear system solver and it gives the wrong answer. In
other words, you solve Ax = b on a computer, computing x̂, and somehow you determine
that
‖x − x̂‖
is large. Which of the following is a possible cause (identify all):
• There is a bug in the code. In other words, the algorithm that is used is sound (gives
the right answer in exact arithmetic) but its implementation has an error in it.

• The linear system is ill-conditioned. A small relative error in the right-hand side can
amplify into a large relative error in the solution.

• The algorithm you used accumulates a significant roundoff error.

• All is well: ‖x̂ − x‖ is large, but the relative error ‖x̂ − x‖/‖x‖ is small.

Solution. All are possible causes. This week, we will delve into this.

6.1.2 Overview
• 6.1 Opening Remarks

¶ 6.1.1 Whose problem is it anyway?


¶ 6.1.2 Overview
¶ 6.1.3 What you will learn

• 6.2 Floating Point Arithmetic

¶ 6.2.1 Storing real numbers as floating point numbers


¶ 6.2.2 Error in storing a real number as a floating point number
¶ 6.2.3 Models of floating point computation
¶ 6.2.4 Stability of a numerical algorithm
¶ 6.2.5 Conditioning versus stability
¶ 6.2.6 Absolute value of vectors and matrices

• 6.3 Error Analysis for Basic Linear Algebra Algorithms

¶ 6.3.1 Initial insights


¶ 6.3.2 Backward error analysis of dot product: general case
¶ 6.3.3 Dot product: error results
¶ 6.3.4 Matrix-vector multiplication
¶ 6.3.5 Matrix-matrix multiplication

• 6.4 Error Analysis for Solving Linear Systems

¶ 6.4.1 Numerical stability of triangular solve


¶ 6.4.2 Numerical stability of LU factorization
¶ 6.4.3 Numerical stability of linear solve via LU factorization
¶ 6.4.4 Numerical stability of linear solve via LU factorization with partial pivoting
¶ 6.4.5 Is LU with Partial Pivoting Stable?

• 6.5 Enrichments

¶ 6.5.1 Systematic derivation of backward error analyses


¶ 6.5.2 LU factorization with pivoting can fail in practice

• 6.6 Wrap Up

¶ 6.6.1 Additional homework


¶ 6.6.2 Summary

6.1.3 What you will learn


This week, you explore how the roundoff error incurred when employing floating point
computation affects correctness.
Upon completion of this week, you should be able to

• Recognize how floating point numbers are stored.



• Employ strategies for avoiding unnecessary overflow and underflow that can occur in
intermediate computations.

• Compute the machine epsilon (also called the unit roundoff) for a given floating point
representation.

• Quantify errors in storing real numbers as floating point numbers and bound the in-
curred relative error in terms of the machine epsilon.

• Analyze error incurred in floating point computation using the Standard Computation
Model (SCM) and the Alternative Computation Model (ACM) to determine their
forward and backward results.

• Distinguish between conditioning of a problem and stability of an algorithm.

• Derive error results for simple linear algebra computations.

• State and interpret error results for solving linear systems.

• Argue how backward error can affect the relative error in the solution of a linear system.

6.2 Floating Point Arithmetic


6.2.1 Storing real numbers as floating point numbers

YouTube: https://www.youtube.com/watch?v=sWcdwmCdVOU
Only a finite number of (binary) digits can be used to store a real number in the memory
of a computer. For so-called single-precision and double-precision floating point numbers,
32 bits and 64 bits are typically employed, respectively.
Recall that any real number can be written as μ × β^e, where β is the base (an integer
greater than one), μ ∈ [−1, 1] is the mantissa, and e is the exponent (an integer). For our
discussion, we will define the set of floating point numbers, F, as the set of all numbers
χ = μ × β^e such that

  • β = 2,

  • μ = ±.δ_0 δ_1 · · · δ_{t−1} (μ has only t (binary) digits), where δ_j ∈ {0, 1},

  • δ_0 = 0 iff μ = 0 (the mantissa is normalized), and

  • −L ≤ e ≤ U.

With this, the elements in F can be stored with a finite number of (binary) digits.
Example 6.2.1.1 Let β = 2, t = 3, μ = .101, and e = 1. Then

    μ × β^e = .101 × 2^1
            = (1 × 2^{−1} + 0 × 2^{−2} + 1 × 2^{−3}) × 2^1
            = (1/2 + 0/4 + 1/8) × 2
            = 1.25.
Observe that
• There is a largest number (in absolute value) that can be stored. Any number with
larger magnitude "overflows". Typically, this causes a value that denotes a NaN (Not-
a-Number) to be stored.
• There is a smallest number (in absolute value) that can be stored. Any number that
is smaller in magnitude "underflows". Typically, this causes a zero to be stored.
In practice, one needs to be careful to consider overflow and underflow. The following
example illustrates the importance of paying attention to this.
Example 6.2.1.2 Computing the (Euclidean) length of a vector is an operation we will
frequently employ. Careful attention must be paid to overflow and underflow when computing
it.
Given x ∈ R^n, consider computing

    ‖x‖_2 = sqrt( Σ_{i=0}^{n−1} χ_i^2 ).                                   (6.2.1)

Notice that
    ‖x‖_2 ≤ √n  max_{i=0}^{n−1} |χ_i|
and hence, unless some χ_i is close to overflowing, the result will not overflow. The problem
is that if some element χ_i has the property that χ_i^2 overflows, intermediate results in the
computation in (6.2.1) will overflow. The solution is to determine k such that

    |χ_k| = max_{i=0}^{n−1} |χ_i|

and to then instead compute

    ‖x‖_2 = |χ_k| sqrt( Σ_{i=0}^{n−1} (χ_i / χ_k)^2 ).

It can be argued that the same approach also avoids underflow if underflow can be avoided.

In our discussion, we mostly ignore this aspect of floating point computation.
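Here is a minimal MATLAB sketch of that scaling trick (our illustration; the function name scaled_norm2 is ours, not the text's). For x = [1e200; 1e200], the naive sqrt(sum(x.^2)) overflows to Inf while the scaled version returns about 1.4142e200.

    % A minimal MATLAB sketch of the scaling trick in (6.2.1).  Save as scaled_norm2.m.
    function nrm = scaled_norm2( x )
        k = max( abs( x ) );                    % |chi_k| = max_i |chi_i|
        if k == 0
            nrm = 0;                            % x is the zero vector
        else
            nrm = k * sqrt( sum( (x/k).^2 ) );  % entries of x/k are at most one in magnitude
        end
    end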
Remark 6.2.1.3 Any time a real number is stored in our computer, it is stored as a nearby
floating point number (element in F ) (either through rounding or truncation). Nearby, of
course, could mean that it is stored as the exact number if it happens to also be a floating
point number.

6.2.2 Error in storing a real number as a floating point number

YouTube: https://www.youtube.com/watch?v=G2jawQW5WPc
Remark 6.2.2.1 We consider the case where a real number is truncated to become the
stored floating point number. This makes the discussion a bit simpler.
Let positive χ be represented by
    χ = .δ_0 δ_1 · · · × 2^e,
where the δ_i are binary digits and δ_0 = 1 (the mantissa is normalized). If t binary digits are
stored by our floating point system, then
    χ̌ = .δ_0 δ_1 · · · δ_{t−1} × 2^e
is stored (if truncation is employed). If we let δχ = χ − χ̌, then
    δχ = .δ_0 δ_1 · · · δ_{t−1} δ_t · · · × 2^e − .δ_0 δ_1 · · · δ_{t−1} × 2^e
       = .0 · · · 0 δ_t · · · × 2^e                  (the first t digits equal zero)
       < .0 · · · 01 × 2^e = 2^{−t} 2^e.             (the 1 is the t-th digit)
Since χ is positive and δ_0 = 1,
    χ = .δ_0 δ_1 · · · × 2^e ≥ (1/2) × 2^e.
Thus,
    δχ/χ ≤ 2^{−t} 2^e / ((1/2) 2^e) = 2^{−(t−1)},
which can also be written as
    δχ ≤ 2^{−(t−1)} χ.
A careful analysis of what happens when χ equals zero or is negative yields
    |δχ| ≤ 2^{−(t−1)} |χ|.
Example 6.2.2.2 The number 4/3 = 1.3333 · · · can be written as

    1.3333 · · · = 1 + 0/2 + 1/4 + 0/8 + 1/16 + · · ·
                 = 1.0101 · · · × 2^0        < convert to binary representation >
                 = .10101 · · · × 2^1        < normalize >

Now, if t = 4 then this would be truncated to
    .1010 × 2^1,
which equals the number
    .1010 × 2^1 = (1/2 + 0/4 + 1/8 + 0/16) × 2^1 = 0.625 × 2 = 1.25.
The relative error equals
    (1.333 · · · − 1.25) / 1.333 · · · = 0.0625.

If χ̌ is computed by rounding instead of truncating, then
    |δχ| ≤ 2^{−t} |χ|.
We can abstract away from the details of the base that is chosen and whether rounding or
truncation is used by stating that storing χ as the floating point number χ̌ obeys
    |δχ| ≤ ε_mach |χ|,
where ε_mach is known as the machine epsilon or unit roundoff. When single precision floating
point numbers are used, ε_mach ≈ 10^{−8}, yielding roughly eight decimal digits of accuracy in the
stored value. When double precision floating point numbers are used, ε_mach ≈ 10^{−16}, yielding
roughly sixteen decimal digits of accuracy in the stored value.
Example 6.2.2.3 The number 4/3 = 1.3333 · · · can be written as

    1.3333 · · · = 1 + 0/2 + 1/4 + 0/8 + 1/16 + · · ·
                 = 1.0101 · · · × 2^0        < convert to binary representation >
                 = .10101 · · · × 2^1        < normalize >

Now, if t = 4 then this would be rounded to
    .1011 × 2^1,
which equals the number
    .1011 × 2^1 = (1/2 + 0/4 + 1/8 + 1/16) × 2^1 = 0.6875 × 2 = 1.375.
The relative error equals
    |1.333 · · · − 1.375| / 1.333 · · · = 0.03125.

Definition 6.2.2.4 Machine epsilon (unit roundoff). The machine epsilon (unit round-
off), ε_mach, is defined as the smallest positive floating point number χ such that the floating
point number that represents 1 + χ is greater than one.
Remark 6.2.2.5 The quantity ε_mach is machine dependent. It is a function of the parameters
characterizing how a specific architecture converts reals to floating point numbers.
Homework 6.2.2.1 Assume a floating point number system with β = 2, a mantissa with t
digits, and truncation when storing.
  • Write the number 1 as a floating point number in this system.
  • What is the ε_mach for this system?
Solution.
  • Write the number 1 as a floating point number.
    Answer:
        .10···0 × 2^1    (t digits).
  • What is the ε_mach for this system?
    Answer: With t digits,
        .10···0 × 2^1 + .00···1 × 2^1 = .10···1 × 2^1,
    which is greater than one, while adding a smaller number, such as
        .00···011··· × 2^1    (the first t digits equal zero),
    yields .10···011··· × 2^1, which truncates back to one. Notice that
        .00···1 × 2^1    (the 1 in the last of t digits)
    can be represented as
        .10···0 × 2^{−(t−2)}
    and
        .00···011··· × 2^1
    as
        .11···1 × 2^{−(t−1)}.
    Hence ε_mach = 2^{−(t−1)}.
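In IEEE double precision this quantity can be found experimentally. The following minimal MATLAB sketch (ours, not the text's) finds the smallest power of two ε for which the stored value of 1 + ε exceeds one.

    % A minimal MATLAB sketch: find the smallest power of two eps_mach for which
    % the stored value of 1 + eps_mach is greater than one (IEEE double precision).
    eps_mach = 1;
    while 1 + eps_mach/2 > 1
        eps_mach = eps_mach/2;
    end
    disp( eps_mach )    % 2.2204e-16 = 2^(-52), which agrees with MATLAB's built-in eps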

6.2.3 Models of floating point computation


When computing with floating point numbers on a target computer, we will assume that all
(floating point) arithmetic that is performed is in terms of additions, subtractions, multipli-
cations, and divisions: {+, −, ×, /}.

6.2.3.1 Notation
In our discussions, we will distinguish between exact and computed quantities. The function
fl(expression) returns the result of the evaluation of expression, where every operation is
executed in floating point arithmetic. For example, given ‰, Â, ’, Ê œ F and assuming that
WEEK 6. NUMERICAL STABILITY 340

the expressions are evaluated from left to right and order of operations is obeyed,

fl(‰ + Â + ’/Ê)

is equivalent to
fl(fl(‰ + Â) + fl(’/Ê)).
Equality between the quantities lhs and rhs is denoted by lhs = rhs. Assignment of rhs to
lhs is denoted by lhs := rhs (lhs becomes rhs). In the context of a program, the statements
lhs := rhs and lhs := fl(rhs) are equivalent. Given an assignment
    κ := expression,
we use the notation κ̌ (pronounced "check kappa") to denote the quantity resulting from
fl(expression), which is actually stored in the variable κ:
    κ̌ = fl(expression).
Remark 6.2.3.1 In future discussion, we will use the notation [·] as shorthand for fl(·).

6.2.3.2 Standard Computational Model (SCM)

YouTube: https://www.youtube.com/watch?v=RIsLyjFbonU
The Standard Computational Model (SCM) assumes that, for any two floating point
numbers χ and ψ, the basic arithmetic operations satisfy the equality
    fl(χ op ψ) = (χ op ψ)(1 + ε),    |ε| ≤ ε_mach,  and  op ∈ {+, −, ∗, /}.
The quantity ε is a function of χ, ψ, and op. Sometimes we add a subscript (ε_+, ε_∗, · · ·) to
indicate what operation generated the (1 + ε) error factor. We always assume that all the
input variables to an operation are floating point numbers.
Remark 6.2.3.2 We can interpret the SCM as follows: These operations are performed
exactly and it is only in storing the result that a roundoff error occurs.
What really happens is that enough digits of the result are computed so that the net
effect is as if the result of the exact operation was stored.
Given χ, ψ ∈ F, performing any operation op ∈ {+, −, ∗, /} with χ and ψ in floating
point arithmetic, fl(χ op ψ) yields a result that is correct up to machine precision: Let
ζ = χ op ψ and ζ̌ = ζ + δζ = fl(χ op ψ). Then |δζ| ≤ ε_mach |ζ| and hence ζ̌ is close to ζ (it
has k correct binary digits).

Example 6.2.3.3 Consider the operation
    κ := 4/3,
where we notice that both 4 and 3 can be exactly represented in our floating point system
with β = 2 and t = 4. Recall that the real number 4/3 = 1.3333 · · · is stored as .1010 × 2^1, if
t = 4 and truncation is employed. This equals 1.25 in decimal representation. The relative
error was 0.0625. Now
    κ̌ = fl(4/3)
       = 1.25
       = 1.333 · · · + (−0.0833 · · ·)
       = 1.333 · · · × (1 + (−0.0833 · · ·)/1.333 · · ·)
       = 4/3 × (1 + (−0.0625))
       = κ(1 + ε_/),
where
    |ε_/| = 0.0625 ≤ 0.125 = 2^{−(t−1)} = ε_mach.

6.2.3.3 Alternative Computational Model (ACM)

YouTube: https://www.youtube.com/watch?v=6jBxznXcivg
For certain problems it is convenient to use the Alternative Computational Model (ACM)
[21], which also assumes for the basic arithmetic operations that
    fl(χ op ψ) = (χ op ψ)/(1 + ε),    |ε| ≤ ε_mach,  and  op ∈ {+, −, ∗, /}.
As for the standard computation model, the quantity ε is a function of χ, ψ, and op. Note
that the ε's produced using the standard and alternative models are generally not equal.
The Taylor series expansion of 1/(1 + ε) is given by
    1/(1 + ε) = 1 + (−ε) + O(ε^2),
which explains how the SCM and ACM are related.
The ACM is useful when analyzing algorithms that involve division. In this course,
we don’t analyze in detail any such algorithms. We include this discussion of ACM for
completeness.
Remark 6.2.3.4 Sometimes it is more convenient to use the SCM and sometimes the ACM.
Trial and error, and eventually experience, will determine which one to use.

6.2.4 Stability of a numerical algorithm

YouTube: https://www.youtube.com/watch?v=_Aoe pfTLhI


Correctness in the presence of error (e.g., when floating point computations are per-
formed) takes on a different meaning. For many problems for which computers are used,
there is one correct answer and we expect that answer to be computed by our program.
The problem is that most real numbers cannot be stored exactly in a computer memory.
They are stored as approximations, floating point numbers, instead. Hence storing them
and/or computing with them inherently incurs error. The question thus becomes "When is
a program correct in the presence of such errors?"
Let us assume that we wish to evaluate the mapping f : D → R where D ⊂ R^n is the
domain and R ⊂ R^m is the range (codomain). Now, we will let f̌ : D → R denote a computer
implementation of this function. Generally, for x ∈ D it is the case that f(x) ≠ f̌(x). Thus,
the computed value is not "correct". From earlier discussions about how the condition number
of a matrix can amplify relative error, we know that it may not be the case that f̌(x) is
"close to" f(x): even if f̌ is an exact implementation of f, the mere act of storing x may
introduce a small error δx and f(x + δx) may be far from f(x) if f is ill-conditioned.

Figure 6.2.4.1 In this illustration, f : D → R is a function to be evaluated. The function
f̌ represents the implementation of the function that uses floating point arithmetic, thus
incurring errors. The fact that for a nearby value, x̌, the computed value equals the exact
function applied to the slightly perturbed input x, that is,
    f(x̌) = f̌(x),
means that the error in the computation can be attributed to a small change in the input.
If this is true, then f̌ is said to be a (numerically) stable implementation of f for input x.
The following defines a property that captures correctness in the presence of the kinds of
errors that are introduced by computer arithmetic:
Definition 6.2.4.2 Backward stable implementation. Given the mapping f : D → R,
where D ⊂ R^n is the domain and R ⊂ R^m is the range (codomain), let f̌ : D → R be
a computer implementation of this function. We will call f̌ a backward stable (also called
"numerically stable") implementation of f on domain D if for all x ∈ D there exists a x̌
"close" to x such that f̌(x) = f(x̌).
In other words, f̌ is a stable implementation if the error that is introduced is similar
to that introduced when f is evaluated with a slightly changed input. This is illustrated
in Figure 6.2.4.1 for a specific input x. If an implementation is not stable, it is numerically
unstable.
The algorithm is said to be forward stable on domain D if for all x ∈ D it is the case
that f̌(x) ≈ f(x). In other words, the computed result equals a slight perturbation of the
exact result.
Example 6.2.4.3 Under the SCM from the last unit, floating point addition, κ := χ + ψ, is
a backward stable operation.
Solution.
    κ̌ = [χ + ψ]                       < computed value for κ >
       = (χ + ψ)(1 + ε_+)             < SCM >
       = χ(1 + ε_+) + ψ(1 + ε_+)      < distribute >
       = (χ + δχ) + (ψ + δψ),
where
  • |ε_+| ≤ ε_mach,
  • δχ = χ ε_+,
  • δψ = ψ ε_+.
Hence κ̌ equals the exact result when adding nearby inputs.


Homework 6.2.4.1
  • ALWAYS/SOMETIMES/NEVER: Under the SCM from the last unit, floating point
    subtraction, κ := χ − ψ, is a backward stable operation.
  • ALWAYS/SOMETIMES/NEVER: Under the SCM from the last unit, floating point
    multiplication, κ := χ × ψ, is a backward stable operation.
  • ALWAYS/SOMETIMES/NEVER: Under the SCM from the last unit, floating point
    division, κ := χ/ψ, is a backward stable operation.
Answer.
  • ALWAYS: Under the SCM from the last unit, floating point subtraction, κ := χ − ψ,
    is a backward stable operation.
  • ALWAYS: Under the SCM from the last unit, floating point multiplication, κ := χ × ψ,
    is a backward stable operation.
  • ALWAYS: Under the SCM from the last unit, floating point division, κ := χ/ψ, is a
    backward stable operation.

Now prove it!


Solution.
  • ALWAYS: Under the SCM from the last unit, floating point subtraction, κ := χ − ψ,
    is a backward stable operation.
        κ̌ = [χ − ψ]                          < computed value for κ >
           = (χ − ψ)(1 + ε_−)                < SCM >
           = χ(1 + ε_−) − ψ(1 + ε_−)         < distribute >
           = (χ + δχ) − (ψ + δψ),
    where |ε_−| ≤ ε_mach, δχ = χ ε_−, and δψ = ψ ε_−. Hence κ̌ equals the exact result when
    subtracting nearby inputs.
  • ALWAYS: Under the SCM from the last unit, floating point multiplication, κ := χ × ψ,
    is a backward stable operation.
        κ̌ = [χ × ψ]                          < computed value for κ >
           = (χ × ψ)(1 + ε_×)                < SCM >
           = χ × ψ(1 + ε_×)                  < associative property >
           = χ(ψ + δψ),
    where |ε_×| ≤ ε_mach and δψ = ψ ε_×. Hence κ̌ equals the exact result when multiplying
    nearby inputs.
  • ALWAYS: Under the SCM from the last unit, floating point division, κ := χ/ψ, is a
    backward stable operation.
        κ̌ = [χ/ψ]                            < computed value for κ >
           = (χ/ψ)(1 + ε_/)                  < SCM >
           = χ(1 + ε_/)/ψ                    < commutative property >
           = (χ + δχ)/ψ,
    where |ε_/| ≤ ε_mach and δχ = χ ε_/. Hence κ̌ equals the exact result when dividing
    nearby inputs.


Ponder This 6.2.4.2 In the last homework, we showed that floating point division is
backward stable by showing that [χ/ψ] = (χ + δχ)/ψ for suitably small δχ.
How would one show that [χ/ψ] = χ/(ψ + δψ) for suitably small δψ?

6.2.5 Conditioning versus stability

YouTube: https://www.youtube.com/watch?v=e29Yk4XCyLs
It is important to keep conditioning versus stability straight:

• Conditioning is a property of the problem you are trying to solve. A problem is well-
conditioned if a small change in the input is guaranteed to only result in a small change
in the output. A problem is ill-conditioned if a small change in the input can result in
a large change in the output.

• Stability is a property of an implementation. If the implementation, when executed


with an input always yields an output that can be attributed to slightly changed input,
then the implementation is backward stable.

In other words, in the presence of roundoff error, computing a wrong answer may be due
to the problem (if it is ill-conditioned), the implementation (if it is numerically unstable),
or a programming bug (if the implementation is sloppy). Obviously, it can be due to some
combination of these.
Now,

• If you compute the solution to a well-conditioned problem with a numerically stable


implementation, then you will get an answer that is close to the actual answer.

• If you compute the solution to a well-conditioned problem with a numerically unstable


implementation, then you may or may not get an answer that is close to the actual
answer.

• If you compute the solution to an ill-conditioned problem with a numerically stable


implementation, then you may or may not get an answer that is close to the actual
answer.

Yet another way to look at this: A numerically stable implementation will yield an answer
that is as accurate as the conditioning of the problem warrants.

6.2.6 Absolute value of vectors and matrices


In the above discussion of error, the vague notions of "near" and "slightly perturbed" are
used. Making these notions exact usually requires the introduction of measures of size for
vectors and matrices, i.e., norms. When analyzing the stability of algorithms, we instead give
all bounds in terms of the absolute values of the individual elements of the vectors and/or
matrices. While it is easy to convert such bounds to bounds involving norms, the converse
is not true.
Definition 6.2.6.1 Absolute value of vector and matrix. Given x ∈ R^n and A ∈ R^{m×n},
    |x| = ( |χ_0|                        ( |α_{0,0}|    . . .  |α_{0,n−1}|
            |χ_1|                          |α_{1,0}|    . . .  |α_{1,n−1}|
              ⋮          and    |A| =         ⋮          ⋱          ⋮
            |χ_{n−1}| )                    |α_{m−1,0}|  . . .  |α_{m−1,n−1}| ).
Definition 6.2.6.2 Let M ∈ {<, ≤, =, ≥, >} and x, y ∈ R^n. Then
    |x| M |y|   iff   |χ_i| M |ψ_i|,
for all i = 0, . . . , n − 1. Similarly, given A and B ∈ R^{m×n},
    |A| M |B|   iff   |α_{ij}| M |β_{ij}|,
for all i = 0, . . . , m − 1 and j = 0, . . . , n − 1.


The next Lemma is exploited in later sections:
Homework 6.2.6.1 Let A ∈ R^{m×k} and B ∈ R^{k×n}.
ALWAYS/SOMETIMES/NEVER: |AB| ≤ |A||B|.
Answer. ALWAYS
Now prove it.
Solution. Let C = AB. Then the (i, j) entry in |C| is given by
    |γ_{i,j}| = | Σ_{p=0}^{k−1} α_{i,p} β_{p,j} | ≤ Σ_{p=0}^{k−1} |α_{i,p} β_{p,j}| = Σ_{p=0}^{k−1} |α_{i,p}| |β_{p,j}|,
which equals the (i, j) entry of |A||B|. Thus |AB| ≤ |A||B|.


The fact that the bounds that we establish can be easily converted into bounds involving
norms is a consequence of the following theorem, where ‖·‖_F indicates the Frobenius matrix
norm.
Theorem 6.2.6.3 Let A, B ∈ R^{m×n}. If |A| ≤ |B| then ‖A‖_F ≤ ‖B‖_F, ‖A‖_1 ≤ ‖B‖_1, and
‖A‖_∞ ≤ ‖B‖_∞.
Homework 6.2.6.2 Prove Theorem 6.2.6.3
Solution.
  • Show that if |A| ≤ |B| then ‖A‖_F ≤ ‖B‖_F:
        ‖A‖_F^2 = Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} |α_{i,j}|^2 ≤ Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} |β_{i,j}|^2 = ‖B‖_F^2.
    Hence ‖A‖_F ≤ ‖B‖_F.
  • Show that if |A| ≤ |B| then ‖A‖_1 ≤ ‖B‖_1:
    Let
        A = ( a_0 · · · a_{n−1} )  and  B = ( b_0 · · · b_{n−1} ).
    Then
        ‖A‖_1 = max_{0≤j<n} ‖a_j‖_1                       < alternate way of computing 1-norm >
              = max_{0≤j<n} ( Σ_{i=0}^{m−1} |α_{i,j}| )   < expose individual entries of a_j >
              = Σ_{i=0}^{m−1} |α_{i,k}|                   < choose k to be the index that maximizes >
              ≤ Σ_{i=0}^{m−1} |β_{i,k}|                   < entries of B bound corresponding entries of A >
              = ‖b_k‖_1                                   < express sum as 1-norm of column indexed with k >
              ≤ max_{0≤j<n} ‖b_j‖_1                       < take max over all columns >
              = ‖B‖_1.                                    < definition of 1-norm >
  • Show that if |A| ≤ |B| then ‖A‖_∞ ≤ ‖B‖_∞:
    Note:
      ◦ ‖A‖_∞ = ‖A^T‖_1 and ‖B‖_∞ = ‖B^T‖_1.
      ◦ If |A| ≤ |B| then, clearly, |A^T| ≤ |B^T|.
    Hence
        ‖A‖_∞ = ‖A^T‖_1 ≤ ‖B^T‖_1 = ‖B‖_∞.

6.3 Error Analysis for Basic Linear Algebra Algorithms


6.3.1 Initial insights

YouTube: https://www.youtube.com/watch?v=OHqdJ3hjHFY
Before giving a general result, let us focus on the case where the vectors x and y have
only a few elements:

Example 6.3.1.1 Consider
    x = ( χ_0 )   and   y = ( ψ_0 )
        ( χ_1 )             ( ψ_1 )
and the computation
    κ := x^T y.
Under the SCM given in Subsubsection 6.2.3.2, the computed result, κ̌, satisfies
    κ̌ = ( χ_0 )^T ( (1 + ε_∗^{(0)})(1 + ε_+^{(1)})              0                ) ( ψ_0 )      (6.3.1)
         ( χ_1 )   (            0                (1 + ε_∗^{(1)})(1 + ε_+^{(1)})  ) ( ψ_1 )
Solution.
    κ̌ = [x^T y]
       = [χ_0 ψ_0 + χ_1 ψ_1]                                      < definition of x^T y >
       = [[χ_0 ψ_0] + [χ_1 ψ_1]]                                  < each suboperation is performed in floating point arithmetic >
       = [χ_0 ψ_0 (1 + ε_∗^{(0)}) + χ_1 ψ_1 (1 + ε_∗^{(1)})]      < apply SCM multiple times >
       = (χ_0 ψ_0 (1 + ε_∗^{(0)}) + χ_1 ψ_1 (1 + ε_∗^{(1)}))(1 + ε_+^{(1)})    < apply SCM >
       = χ_0 (1 + ε_∗^{(0)})(1 + ε_+^{(1)}) ψ_0 + χ_1 (1 + ε_∗^{(1)})(1 + ε_+^{(1)}) ψ_1   < distribute, commute >
       = the expression in (6.3.1),
where |ε_∗^{(0)}|, |ε_∗^{(1)}|, |ε_+^{(1)}| ≤ ε_mach.
An important insight from this example is that the result in (6.3.1) can be manipulated
to associate the accumulated error with vector x as in
    κ̌ = ( χ_0 (1 + ε_∗^{(0)})(1 + ε_+^{(1)}) )^T ( ψ_0 )
         ( χ_1 (1 + ε_∗^{(1)})(1 + ε_+^{(1)}) )   ( ψ_1 )
or with vector y
    κ̌ = ( χ_0 )^T ( ψ_0 (1 + ε_∗^{(0)})(1 + ε_+^{(1)}) )
         ( χ_1 )   ( ψ_1 (1 + ε_∗^{(1)})(1 + ε_+^{(1)}) ).
This will play a role when we later analyze algorithms that use the dot product.
Homework 6.3.1.1 Consider
    x = ( χ_0 )         y = ( ψ_0 )
        ( χ_1 )  and        ( ψ_1 )
        ( χ_2 )             ( ψ_2 )
and the computation
    κ := x^T y
computed in the order indicated by
    κ := (χ_0 ψ_0 + χ_1 ψ_1) + χ_2 ψ_2.
Employ the SCM given in Subsubsection 6.2.3.2 to derive a result similar to that given in
(6.3.1).
Answer.
    κ̌ = x^T diag( (1 + ε_∗^{(0)})(1 + ε_+^{(1)})(1 + ε_+^{(2)}),
                   (1 + ε_∗^{(1)})(1 + ε_+^{(1)})(1 + ε_+^{(2)}),
                   (1 + ε_∗^{(2)})(1 + ε_+^{(2)}) ) y,
where |ε_∗^{(0)}|, |ε_∗^{(1)}|, |ε_+^{(1)}|, |ε_∗^{(2)}|, |ε_+^{(2)}| ≤ ε_mach.
Solution. Here is a solution that builds on the last example and paves the path toward
the general solution presented in the next unit.
    κ̌ = [(χ_0 ψ_0 + χ_1 ψ_1) + χ_2 ψ_2]
       = [[χ_0 ψ_0 + χ_1 ψ_1] + [χ_2 ψ_2]]                        < each suboperation is performed in floating point arithmetic >
       = ( χ_0 (1 + ε_∗^{(0)})(1 + ε_+^{(1)}) ψ_0 + χ_1 (1 + ε_∗^{(1)})(1 + ε_+^{(1)}) ψ_1
           + χ_2 ψ_2 (1 + ε_∗^{(2)}) ) (1 + ε_+^{(2)})            < use Example 6.3.1.1; SCM twice >
       = χ_0 (1 + ε_∗^{(0)})(1 + ε_+^{(1)})(1 + ε_+^{(2)}) ψ_0
         + χ_1 (1 + ε_∗^{(1)})(1 + ε_+^{(1)})(1 + ε_+^{(2)}) ψ_1
         + χ_2 (1 + ε_∗^{(2)})(1 + ε_+^{(2)}) ψ_2,                < distribute, commute >
which is the answer given above, with |ε_∗^{(0)}|, |ε_∗^{(1)}|, |ε_+^{(1)}|, |ε_∗^{(2)}|, |ε_+^{(2)}| ≤ ε_mach.

6.3.2 Backward error analysis of dot product: general case

YouTube: https://www.youtube.com/watch?v=PmFUqJXogm8
Consider now
    κ := x^T y = (· · · ((χ_0 ψ_0 + χ_1 ψ_1) + χ_2 ψ_2) + · · ·) + χ_{n−1} ψ_{n−1}.
Under the computational model given in Subsection 6.2.3 the computed result, κ̌, satisfies
    κ̌ = ( · · · ( χ_0 ψ_0 (1 + ε_∗^{(0)}) + χ_1 ψ_1 (1 + ε_∗^{(1)}) )(1 + ε_+^{(1)}) + · · ·
          + χ_{n−1} ψ_{n−1} (1 + ε_∗^{(n−1)}) )(1 + ε_+^{(n−1)})
       = χ_0 ψ_0 (1 + ε_∗^{(0)})(1 + ε_+^{(1)})(1 + ε_+^{(2)}) · · · (1 + ε_+^{(n−1)})
         + χ_1 ψ_1 (1 + ε_∗^{(1)})(1 + ε_+^{(1)})(1 + ε_+^{(2)}) · · · (1 + ε_+^{(n−1)})
         + χ_2 ψ_2 (1 + ε_∗^{(2)})(1 + ε_+^{(2)}) · · · (1 + ε_+^{(n−1)})
         + · · ·
         + χ_{n−1} ψ_{n−1} (1 + ε_∗^{(n−1)})(1 + ε_+^{(n−1)})
so that
    κ̌ = Σ_{i=0}^{n−1} ( χ_i ψ_i (1 + ε_∗^{(i)}) Π_{j=i}^{n−1} (1 + ε_+^{(j)}) ),            (6.3.2)
where ε_+^{(0)} = 0 and |ε_∗^{(0)}|, |ε_∗^{(j)}|, |ε_+^{(j)}| ≤ ε_mach for j = 1, . . . , n − 1.
Clearly, a notation to keep expressions from becoming unreadable is desirable. For this
reason we introduce the symbol θ_j:
Clearly, a notation to keep expressions from becoming unreadable is desirable. For this
reason we introduce the symbol ◊j :

YouTube: https://www.youtube.com/watch?v=6qnYXaw4Bms
Lemma 6.3.2.1 Let ε_i ∈ R, 0 ≤ i ≤ n − 1, n ε_mach < 1, and |ε_i| ≤ ε_mach. Then ∃ θ_n ∈ R
such that
    Π_{i=0}^{n−1} (1 + ε_i)^{±1} = 1 + θ_n,
with |θ_n| ≤ n ε_mach/(1 − n ε_mach).


Here the ±1 means that on an individual basis, the term is either used in a multiplication
or a division. For example
(1 + ‘0 )±1 (1 + ‘1 )±1
might stand for

(1 + ‘0 ) (1 + ‘1 ) 1
(1 + ‘0 )(1 + ‘1 ) or or or
(1 + ‘1 ) (1 + ‘0 ) (1 + ‘1 )(1 + ‘0 )

so that this lemma can accommodate an analysis that involves a mixture of the Standard and
Alternative Computational Models (SCM and ACM).
Proof. By Mathematical Induction.
  • Base case. n = 1. Trivial.
  • Inductive Step. The Inductive Hypothesis (I.H.) tells us that for all ε_i ∈ R, 0 ≤ i ≤
    n − 1, n ε_mach < 1, and |ε_i| ≤ ε_mach, there exists a θ_n ∈ R such that
        Π_{i=0}^{n−1} (1 + ε_i)^{±1} = 1 + θ_n,  with |θ_n| ≤ n ε_mach/(1 − n ε_mach).
    We will show that if ε_i ∈ R, 0 ≤ i ≤ n, (n + 1) ε_mach < 1, and |ε_i| ≤ ε_mach, then there
    exists a θ_{n+1} ∈ R such that
        Π_{i=0}^{n} (1 + ε_i)^{±1} = 1 + θ_{n+1},  with |θ_{n+1}| ≤ (n + 1) ε_mach/(1 − (n + 1) ε_mach).
    ◦ Case 1: The last term comes from the application of the SCM.
      Then Π_{i=0}^{n} (1 + ε_i)^{±1} = ( Π_{i=0}^{n−1} (1 + ε_i)^{±1} )(1 + ε_n). See Ponder This 6.3.2.1.
    ◦ Case 2: The last term comes from the application of the ACM.
      Then Π_{i=0}^{n} (1 + ε_i)^{±1} = ( Π_{i=0}^{n−1} (1 + ε_i)^{±1} )/(1 + ε_n). By the I.H. there exists a θ_n
      such that (1 + θ_n) = Π_{i=0}^{n−1} (1 + ε_i)^{±1} and |θ_n| ≤ n ε_mach/(1 − n ε_mach). Then
          ( Π_{i=0}^{n−1} (1 + ε_i)^{±1} )/(1 + ε_n) = (1 + θ_n)/(1 + ε_n) = 1 + (θ_n − ε_n)/(1 + ε_n),
      which tells us how to pick θ_{n+1} = (θ_n − ε_n)/(1 + ε_n). Now
          |θ_{n+1}|
          = |(θ_n − ε_n)/(1 + ε_n)|                                < definition of θ_{n+1} >
          ≤ (|θ_n| + ε_mach)/(|1 + ε_n|)                           < |θ_n − ε_n| ≤ |θ_n| + |ε_n| ≤ |θ_n| + ε_mach >
          ≤ (|θ_n| + ε_mach)/(1 − ε_mach)                          < |1 + ε_n| ≥ 1 − |ε_n| ≥ 1 − ε_mach >
          ≤ ( n ε_mach/(1 − n ε_mach) + ε_mach )/(1 − ε_mach)      < bound |θ_n| using I.H. >
          = ( n ε_mach + (1 − n ε_mach) ε_mach )/((1 − n ε_mach)(1 − ε_mach))   < algebra >
          = ( (n + 1) ε_mach − n ε_mach^2 )/(1 − (n + 1) ε_mach + n ε_mach^2)   < algebra >
          ≤ ( (n + 1) ε_mach )/(1 − (n + 1) ε_mach).               < increase numerator; decrease denominator >
  • By the Principle of Mathematical Induction, the result holds.
Ponder This 6.3.2.1 Complete the proof of Lemma 6.3.2.1.
Remark 6.3.2.2 The quantity θ_n will be used throughout these notes. It is not intended
to be a specific number. Instead, it is an order of magnitude identified by the subscript n,
which indicates the number of error factors of the form (1 + ε_i) and/or (1 + ε_i)^{−1} that are
grouped together to form (1 + θ_n).
Since we will often encounter the bound on |θ_n| that appears in Lemma 6.3.2.1, we assign
it a symbol as follows:
Definition 6.3.2.3 For all n ≥ 1 and n ε_mach < 1, define
    γ_n = n ε_mach/(1 − n ε_mach).

With this notation, (6.3.2) simplifies to

Ÿ̌
=
‰0 Â0 (1 + ◊n ) + ‰1 Â1 (1 + ◊n ) + · · · + ‰n≠1 Ân≠1 (1 + ◊2 )
=
Q RT Q RQ R
‰0 (1 + ◊n ) 0 0 ··· 0 Â0
c d c dc d
c ‰1 d c
c d c
0 (1 + ◊n ) 0 ··· 0 dc
dc
Â1 d
d
c ‰
c 2
d c
d c 0 0 (1 + ◊ n≠1 ) · · · 0 dc
dc Â2 d
d
c .. d c .. .. .. .. .. dc .. d
c
a . b a
d c . . . . . dc
ba . d
b
‰n≠1 0 0 0 · · · (1 + ◊2 ) Ân≠1
= (6.3.3)
Q RT Q Q RR Q R
‰0 ◊n 0 0 ··· 0 Â0
c d c c dd c d
c
c
‰1 d
d
c
c
c
c
0 ◊n 0 ··· 0 dd
dd
c
c
Â1 d
d
c
c ‰2 d
d
c
cI
c
+c 0 0 ◊n≠1 ··· 0 dd
dd
c
c Â2 d
d
c .. d c c .. .. .. ... .. dd c .. d
c
a . d
b
c
a
c
a . . . . dd
bb
c
a . d
b
‰n≠1 0 0 0 · · · ◊2 Ân≠1
¸ ˚˙ ˝
I+ (n)

=
xT (I + (n)
)y,

where |◊j | Æ “j , j = 2, . . . , n.
Remark 6.3.2.4 Two instances of the symbol ◊n , appearing even in the same expression,
typically do not represent the same number. For example, in (6.3.3) a (1 + ◊n ) multiplies
each of the terms ‰0 Â0 and ‰1 Â1 , but these two instances of ◊n , as a rule, do not denote the
same quantity. In particular, one should be careful when factoring out such quantities.

YouTube: https://www.youtube.com/watch?v=Uc6NuDZMakE
As part of the analyses the following bounds will be useful to bound error that accumu-
lates:
Lemma 6.3.2.5 If n, b ≥ 1 then γ_n ≤ γ_{n+b} and γ_n + γ_b + γ_n γ_b ≤ γ_{n+b}.
This lemma will be invoked when, for example, we want to bound |ε| such that 1 + ε =
(1 + ε_1)(1 + ε_2) = 1 + (ε_1 + ε_2 + ε_1 ε_2), knowing that |ε_1| ≤ γ_n and |ε_2| ≤ γ_b.

Homework 6.3.2.2 Prove Lemma 6.3.2.5.
Solution.
    γ_n = (n ε_mach)/(1 − n ε_mach)                                < definition >
        ≤ ((n + b) ε_mach)/(1 − n ε_mach)                          < b ≥ 1 >
        ≤ ((n + b) ε_mach)/(1 − (n + b) ε_mach)                    < 1/(1 − n ε_mach) ≤ 1/(1 − (n + b) ε_mach) if (n + b) ε_mach < 1 >
        = γ_{n+b}                                                  < definition >
and
    γ_n + γ_b + γ_n γ_b
        = n ε_mach/(1 − n ε_mach) + b ε_mach/(1 − b ε_mach) + (n ε_mach b ε_mach)/((1 − n ε_mach)(1 − b ε_mach))   < definition >
        = ( n ε_mach (1 − b ε_mach) + (1 − n ε_mach) b ε_mach + b n ε_mach^2 )/((1 − n ε_mach)(1 − b ε_mach))      < algebra >
        = ( (n + b) ε_mach − b n ε_mach^2 )/(1 − (n + b) ε_mach + b n ε_mach^2)                                    < algebra >
        ≤ ( (n + b) ε_mach )/(1 − (n + b) ε_mach)                  < b n ε_mach^2 > 0 >
        = γ_{n+b}.                                                 < definition >

6.3.3 Dot product: error results

YouTube: https://www.youtube.com/watch?v=QxUCV4k8Gu8
It is of interest to accumulate the roundoff error encountered during computation as a
perturbation of input and/or output parameters:

• Ÿ̌ = (x + ”x)T y;
(Ÿ̌ is the exact output for a slightly perturbed x)

• Ÿ̌ = xT (y + ”y);
(Ÿ̌ is the exact output for a slightly perturbed y)

• Ÿ̌ = xT y + ”Ÿ.
(Ÿ̌ equals the exact result plus an error)

The first two are backward error results (error is accumulated onto input parameters, show-
ing that the algorithm is numerically stable since it yields the exact output for a slightly
perturbed input) while the last one is a forward error result (error is accumulated onto the
answer). We will see that in different situations, a different error result may be needed by
analyses of operations that require a dot product.
Let us focus on the second result. Ideally one would show that each of the entries of y is
slightly perturbed relative to that entry:
    δy = ( σ_0 ψ_0              ( σ_0   · · ·   0          ( ψ_0
              ⋮          =         ⋮     ⋱      ⋮              ⋮        = Σ y,
           σ_{n−1} ψ_{n−1} )      0    · · ·  σ_{n−1} )     ψ_{n−1} )
where each σ_i is "small" and Σ = diag(σ_0, . . . , σ_{n−1}). The following special structure of Σ,
inspired by (6.3.3), will be used in the remainder of this note:
    Σ^{(n)} =  the 0 × 0 matrix                             if n = 0
               θ_1                                          if n = 1
               diag(θ_n, θ_n, θ_{n−1}, . . . , θ_2)         otherwise.
Recall that θ_j is an order of magnitude variable with |θ_j| ≤ γ_j.


Homework 6.3.3.1 Let k ≥ 0 and assume that |ε_1|, |ε_2| ≤ ε_mach, with ε_1 = 0 if k = 0. Show
that
    ( I + Σ^{(k)}      0       ) (1 + ε_2) = (I + Σ^{(k+1)}).
    (     0        (1 + ε_1)   )
Hint: Reason the cases where k = 0 and k = 1 separately from the case where k > 1.
Solution. Case: k = 0.
Then
    ( I + Σ^{(k)}      0       ) (1 + ε_2)
    (     0        (1 + ε_1)   )
  = (1 + 0)(1 + ε_2)              < k = 0 means (I + Σ^{(k)}) is 0 × 0 and (1 + ε_1) = (1 + 0) >
  = (1 + ε_2)
  = (1 + θ_1)
  = (I + Σ^{(1)}).
Case: k = 1.
Then
    ( I + Σ^{(k)}      0       ) (1 + ε_2)  =  ( (1 + θ_1)(1 + ε_2)            0              )
    (     0        (1 + ε_1)   )              (          0            (1 + ε_1)(1 + ε_2)     )
  = ( (1 + θ_2)      0        )
    (     0      (1 + θ_2)    )
  = (I + Σ^{(2)}).
Case: k > 1.
Notice that
    (I + Σ^{(k)})(1 + ε_2) = diag(1 + θ_k, 1 + θ_k, 1 + θ_{k−1}, . . . , 1 + θ_2)(1 + ε_2)
                           = diag(1 + θ_{k+1}, 1 + θ_{k+1}, 1 + θ_k, . . . , 1 + θ_3).
Then
    ( I + Σ^{(k)}      0       ) (1 + ε_2)  =  ( (I + Σ^{(k)})(1 + ε_2)             0             )
    (     0        (1 + ε_1)   )              (           0              (1 + ε_1)(1 + ε_2)      )
  = ( diag(1 + θ_{k+1}, 1 + θ_{k+1}, 1 + θ_k, . . . , 1 + θ_3)      0          )
    (                         0                                 (1 + θ_2)     )
  = (I + Σ^{(k+1)}).
We state a theorem that captures how error is accumulated by the algorithm.
Theorem 6.3.3.1 Let x, y ∈ R^n and let κ := x^T y be computed in the order indicated by
    (· · · ((χ_0 ψ_0 + χ_1 ψ_1) + χ_2 ψ_2) + · · ·) + χ_{n−1} ψ_{n−1}.
Then
    κ̌ = [x^T y] = x^T (I + Σ^{(n)}) y.
Proof.
Proof by Mathematical Induction on n, the size of vectors x and y.
  • Base case. m(x) = m(y) = 0. Trivial!
  • Inductive Step. Inductive Hypothesis (I.H.): Assume that if x_T, y_T ∈ R^k, k > 0, then
        fl(x_T^T y_T) = x_T^T (I + Σ_T) y_T,  where Σ_T = Σ^{(k)}.
    We will show that when x_T, y_T ∈ R^{k+1}, the equality fl(x_T^T y_T) = x_T^T (I + Σ_T) y_T holds
    true again. Assume that x_T, y_T ∈ R^{k+1}, and partition x_T → ( x_0 ; χ_1 ) and y_T → ( y_0 ; ψ_1 ).
    Then
        fl( ( x_0 ; χ_1 )^T ( y_0 ; ψ_1 ) )
        = fl( fl(x_0^T y_0) + fl(χ_1 ψ_1) )                         < definition >
        = fl( x_0^T (I + Σ_0) y_0 + fl(χ_1 ψ_1) )                   < I.H. with x_T = x_0, y_T = y_0, and Σ_0 = Σ^{(k)} >
        = ( x_0^T (I + Σ_0) y_0 + χ_1 ψ_1 (1 + ε_∗) )(1 + ε_+)      < SCM, twice >
        = ( x_0 )^T ( I + Σ_0      0        ) (1 + ε_+) ( y_0 )     < rearrangement >
          ( χ_1 )   (   0      (1 + ε_∗)    )           ( ψ_1 )
        = x_T^T (I + Σ_T) y_T,                                      < renaming >
    where |ε_∗|, |ε_+| ≤ ε_mach, ε_+ = 0 if k = 0, and
        (I + Σ_T) = ( I + Σ_0      0       ) (1 + ε_+)
                    (   0      (1 + ε_∗)   )
    so that Σ_T = Σ^{(k+1)}.
  • By the Principle of Mathematical Induction, the result holds.


A number of useful consequences of Theorem 6.3.3.1 follow. These will be used later as
an inventory (library) of error results from which to draw when analyzing operations and
algorithms that utilize a dot product.
Corollary 6.3.3.2 Under the assumptions of Theorem 6.3.3.1 the following relations hold:
R-1B κ̌ = (x + δx)^T y, where |δx| ≤ γ_n |x|;
R-2B κ̌ = x^T (y + δy), where |δy| ≤ γ_n |y|;
R-1F κ̌ = x^T y + δκ, where |δκ| ≤ γ_n |x|^T |y|.
Proof. R-1B
We leave the proof of Corollary 6.3.3.2 R-1B as an exercise.
R-2B
The proof of Corollary 6.3.3.2 R-2B is, of course, just a minor modification of the proof
of Corollary 6.3.3.2 R-1B.
R-1F
For Corollary 6.3.3.2 R-1F, let δκ = x^T Σ^{(n)} y, where Σ^{(n)} is as in Theorem 6.3.3.1. Then
    |δκ| = |x^T Σ^{(n)} y|
         ≤ |χ_0||θ_n||ψ_0| + |χ_1||θ_n||ψ_1| + · · · + |χ_{n−1}||θ_2||ψ_{n−1}|
         ≤ γ_n |χ_0||ψ_0| + γ_n |χ_1||ψ_1| + · · · + γ_2 |χ_{n−1}||ψ_{n−1}|
         ≤ γ_n |x|^T |y|.

Homework 6.3.3.2 Prove Corollary 6.3.3.2 R-1B.
Solution. From Theorem 6.3.3.1 we know that
    κ̌ = x^T (I + Σ^{(n)}) y = (x + Σ^{(n)} x)^T y = (x + δx)^T y,   with δx = Σ^{(n)} x.
Then
    |δx| = |Σ^{(n)} x| = ( |θ_n||χ_0|                ( γ_n |χ_0|
                           |θ_n||χ_1|                  γ_n |χ_1|
                           |θ_{n−1}||χ_2|      ≤       γ_n |χ_2|       = γ_n |x|.
                              ⋮                           ⋮
                           |θ_2||χ_{n−1}| )            γ_n |χ_{n−1}| )
(Note: strictly speaking, one should probably treat the case n = 1 separately.)
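A quick numerical illustration (ours, not the text's): compute the dot product in single precision, compare against a double precision reference, and check the forward error bound R-1F with γ_n built from the unit roundoff of single precision.

    % Our illustration of R-1F: the single precision dot product stays within
    % gamma_n |x|^T |y| of a (much more accurate) double precision reference.
    n = 1000;
    x = rand(n,1);  y = rand(n,1);
    kappa_check = double( dot( single(x), single(y) ) );   % computed in single precision
    kappa       = dot( x, y );                             % reference
    u       = eps('single')/2;                             % unit roundoff of single precision
    gamma_n = n*u/(1 - n*u);
    disp( [ abs(kappa_check - kappa), gamma_n*( abs(x)'*abs(y) ) ] )   % first entry should not exceed the second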

6.3.4 Matrix-vector multiplication

YouTube: https://www.youtube.com/watch?v=q7rACPOu4ZQ
Assume A ∈ R^{m×n}, x ∈ R^n, and y ∈ R^m. Partition
    A = ( ã_0^T                    ( ψ_0
          ã_1^T                      ψ_1
            ⋮            and  y =     ⋮
          ã_{m−1}^T )                ψ_{m−1} ).
Then computing y := Ax can be orchestrated as
    ( ψ_0              ( ã_0^T x
      ψ_1                ã_1^T x
       ⋮         :=        ⋮                                     (6.3.4)
      ψ_{m−1} )          ã_{m−1}^T x ).
From R-1B 6.3.3.2 regarding the dot product we know that each entry satisfies
    ψ̌_i = (ã_i + δã_i)^T x,   with |δã_i| ≤ γ_n |ã_i|,  i = 0, . . . , m − 1,
so that, stacking the rows,
    y̌ = (A + ΔA)x,   where |ΔA| ≤ γ_n |A|.
Also, from Corollary 6.3.3.2 R-1F regarding the dot product we know that
    ψ̌_i = ã_i^T x + δψ_i,   with |δψ_i| ≤ γ_n |ã_i|^T |x|,
so that
    y̌ = Ax + δy,   where |δy| ≤ γ_n |A||x|.


The above observations can be summarized in the following theorem:
Theorem 6.3.4.1 Error results for matrix-vector multiplication. Let A ∈ R^{m×n}, x ∈ R^n,
y ∈ R^m and consider the assignment y := Ax implemented via dot products as expressed in
(6.3.4). Then these equalities hold:
R-1B y̌ = (A + ΔA)x, where |ΔA| ≤ γ_n |A|.
R-1F y̌ = Ax + δy, where |δy| ≤ γ_n |A||x|.


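The matrix-vector result can be probed the same way (our illustration, with single precision as the working precision and γ_n built from its unit roundoff).

    % Our illustration of R-1F for y := A x: the single precision result stays within
    % gamma_n |A||x| (elementwise) of a double precision reference.
    m = 500;  n = 300;
    A = randn(m,n);  x = randn(n,1);
    y_check = double( single(A) * single(x) );   % computed in single precision
    y       = A*x;                               % reference
    u       = eps('single')/2;
    gamma_n = n*u/(1 - n*u);
    disp( max( abs(y_check - y) - gamma_n*( abs(A)*abs(x) ) ) )   % should be <= 0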

Ponder This 6.3.4.1 In the above theorem, could one instead prove the result
    y̌ = A(x + δx),
where δx is "small"?
Solution. The answer is "sort of". The reason is that for each individual element of y
    ψ̌_i = ã_i^T (x + δx),
which would appear to support that
    y̌ = ( ã_0^T (x + δx)
             ⋮
          ã_{m−1}^T (x + δx) ).
However, the δx for each entry ψ̌_i is different, meaning that we cannot factor out x + δx to
find that y̌ = A(x + δx).
However, one could argue that we know that y̌ = Ax + δy where |δy| ≤ γ_n |A||x|. Hence
if A δx = δy then A(x + δx) = y̌. This would mean that δy is in the column space of A. (For
example, if A is nonsingular.) However, that is not quite what we are going for here.

6.3.5 Matrix-matrix multiplication

YouTube: https://www.youtube.com/watch?v=pvBMuIzIob8
The idea behind backward error analysis is that the computed result is the exact result
when computing with changed inputs. Let's consider matrix-matrix multiplication:
    C := AB.
What we would like to be able to show is that there exist ΔA and ΔB such that the computed
result, Č, satisfies
    Č := (A + ΔA)(B + ΔB).
Let's think about this...
Ponder This 6.3.5.1 Can one find matrices ΔA and ΔB such that
    Č = (A + ΔA)(B + ΔB)?

YouTube: https://www.youtube.com/watch?v=3d6kQ6rnRhA
For matrix-matrix multiplication, it is possible to "throw" the error onto the result, as
summarized by the following theorem:
Theorem 6.3.5.1 Forward error for matrix-matrix multiplication. Let C ∈ R^{m×n},
A ∈ R^{m×k}, and B ∈ R^{k×n} and consider the assignment C := AB implemented via matrix-
vector multiplication. Then there exists ΔC ∈ R^{m×n} such that
    Č = AB + ΔC,  where |ΔC| ≤ γ_k |A||B|.
Homework 6.3.5.2 Prove Theorem 6.3.5.1.
Solution. Partition
    C = ( c_0  c_1  · · ·  c_{n−1} )  and  B = ( b_0  b_1  · · ·  b_{n−1} ).
Then
    ( c_0  c_1  · · ·  c_{n−1} ) := ( A b_0  A b_1  · · ·  A b_{n−1} ).
From R-1F 6.3.4.1 regarding matrix-vector multiplication we know that
    ( č_0  č_1  · · ·  č_{n−1} ) = ( A b_0 + δc_0  A b_1 + δc_1  · · ·  A b_{n−1} + δc_{n−1} )
                                 = ( A b_0  A b_1  · · ·  A b_{n−1} ) + ( δc_0  δc_1  · · ·  δc_{n−1} )
                                 = AB + ΔC,
where |δc_j| ≤ γ_k |A||b_j|, j = 0, . . . , n − 1, and hence |ΔC| ≤ γ_k |A||B|.

YouTube: https://www.youtube.com/watch?v=rxKba-pnquQ
Remark 6.3.5.2 In practice, matrix-matrix multiplication is often the parameterized oper-
ation C := αAB + βC. A consequence of Theorem 6.3.5.1 is that for β ≠ 0, the error can be
attributed to a change in parameter C, which means the error has been "thrown back" onto
an input parameter.
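The same single-versus-double probe works for the matrix-matrix product (our illustration; γ_k is built from the unit roundoff of single precision, with k the inner dimension).

    % Our illustration of Theorem 6.3.5.1: |C_check - AB| <= gamma_k |A||B| elementwise,
    % with single precision playing the role of the working precision.
    m = 200;  k = 100;  n = 150;
    A = randn(m,k);  B = randn(k,n);
    C_check = double( single(A) * single(B) );   % computed product
    C       = A*B;                               % reference
    u       = eps('single')/2;
    gamma_k = k*u/(1 - k*u);
    disp( max(max( abs(C_check - C) - gamma_k*( abs(A)*abs(B) ) )) )   % should be <= 0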

6.4 Error Analysis for Solving Linear Systems


6.4.1 Numerical stability of triangular solve

YouTube: https://www.youtube.com/watch?v=ayj_rNkSMig
We now use the error results for the dot product to derive a backward error result
for solving Lx = y, where L is an n × n lower triangular matrix, via the algorithm in
Figure 6.4.1.1, a variation on the algorithm in Figure 5.3.5.1 that stores the result in vector
x and does not assume that L is unit lower triangular.

Solve Lx = y
  Partition L → ( L_TL  L_TR ),  x → ( x_T ),  y → ( y_T
                ( L_BL  L_BR )       ( x_B )        y_B )
  with L_TL 0 × 0 and x_T, y_T with 0 elements
  while n(L_TL) < n(L)
     Repartition
        ( L_TL  L_TR )      ( L_00    l_01   L_02            ( x_0          ( y_0
        ( L_BL  L_BR )  →   ( l_10^T  λ_11   l_12^T ,  x →   ( χ_1 ,  y →   ( ψ_1
                            ( L_20    l_21   L_22  )         ( x_2 )        ( y_2 )
     χ_1 := (ψ_1 − l_10^T x_0)/λ_11
     Continue with the repartitioned blocks
  endwhile
Figure 6.4.1.1 Dot product based lower triangular solve algorithm.
To establish the backward error result for this algorithm, we need to understand the error
incurred in the key computation
    χ_1 := (ψ_1 − l_10^T x_0)/λ_11.
The following lemma gives the required (forward error) result, abstracted away from the
specifics of how it occurs in the lower triangular solve algorithm.
Lemma 6.4.1.2 Let n ≥ 1, λ, α, ν ∈ R and x, y ∈ R^n. Assume λ ≠ 0 and consider the
computation
    ν := (α − x^T y)/λ.
Then
    (λ + δλ) ν̌ = α − (x + δx)^T y,   where |δx| ≤ γ_n |x| and |δλ| ≤ γ_2 |λ|.
Homework 6.4.1.1 Prove Lemma 6.4.1.2
Hint. Use the Alternative Computational Model (Subsubsection 6.2.3.3) appropriately.
Solution. We know that
  • From Corollary 6.3.3.2 R-1B: if β = x^T y then β̌ = (x + δx)^T y where |δx| ≤ γ_n |x|.
  • From the ACM (Subsubsection 6.2.3.3): If ν = (α − β)/λ then
        ν̌ = (α − β)/λ × 1/((1 + ε_−)(1 + ε_/)),
    where |ε_−| ≤ ε_mach and |ε_/| ≤ ε_mach.
Hence
    ν̌ = (α − (x + δx)^T y)/λ × 1/((1 + ε_−)(1 + ε_/)),
or, equivalently,
    λ(1 + ε_−)(1 + ε_/) ν̌ = α − (x + δx)^T y,
or,
    λ(1 + θ_2) ν̌ = α − (x + δx)^T y,
where |θ_2| ≤ γ_2, which can also be written as
    (λ + δλ) ν̌ = α − (x + δx)^T y,
where δλ = θ_2 λ and hence |δλ| ≤ γ_2 |λ|.


The error result for the algorithm in Figure 6.4.1.1 is given by
Theorem 6.4.1.3 Let L ∈ R^{n×n} be a nonsingular lower triangular matrix and let x̌ be the
computed result when executing Figure 6.4.1.1 to solve Lx = y under the computation model
from Subsection 6.2.3. Then there exists a matrix ΔL such that
    (L + ΔL) x̌ = y   where |ΔL| ≤ max(γ_2, γ_{n−1}) |L|.
The reasoning behind the result is that one expects the maximal error to be incurred
during the final iteration when computing χ_1 := (ψ_1 − l_10^T x_0)/λ_11. This fits Lemma 6.4.1.2,
except that this assignment involves a dot product with vectors of length n − 1 rather than
of length n.
You now prove Theorem 6.4.1.3 by first proving the special cases where n = 1 and n = 2,
and then the general case.
Homework 6.4.1.2 Prove Theorem 6.4.1.3 for the case where n = 1.
Solution. Case 1: n = 1.
The system looks like $\lambda_{11} \chi_1 = \psi_1$ so that
\[ \chi_1 = \psi_1 / \lambda_{11} \]
and
\[ \check\chi_1 = \frac{ \psi_1 / \lambda_{11} }{ 1 + \epsilon_/ }. \]
Rearranging gives us
\[ \lambda_{11} \check\chi_1 ( 1 + \epsilon_/ ) = \psi_1 \]
or
\[ ( \lambda_{11} + \delta\lambda_{11} ) \check\chi_1 = \psi_1, \]
where $\delta\lambda_{11} = \epsilon_/ \lambda_{11}$ and hence
\[ |\delta\lambda_{11}| = |\epsilon_/| |\lambda_{11}| \leq \gamma_1 |\lambda_{11}| \leq \gamma_2 |\lambda_{11}| \leq \max( \gamma_2, \gamma_{n-1} ) |\lambda_{11}|. \]
Homework 6.4.1.3 Prove Theorem 6.4.1.3 for the case where n = 2.
Solution. Case 2: n = 2.
The system now looks like
\[ \left(\begin{array}{cc} \lambda_{00} & 0 \\ \lambda_{10} & \lambda_{11} \end{array}\right) \left(\begin{array}{c} \chi_0 \\ \chi_1 \end{array}\right) = \left(\begin{array}{c} \psi_0 \\ \psi_1 \end{array}\right). \]
From the proof of Case 1 we know that
\[ ( \lambda_{00} + \delta\lambda_{00} ) \check\chi_0 = \psi_0, \quad\mbox{where } |\delta\lambda_{00}| \leq \gamma_1 |\lambda_{00}|. \quad\quad (6.4.1) \]
Since $\chi_1 = ( \psi_1 - \lambda_{10} \check\chi_0 ) / \lambda_{11}$, Lemma 6.4.1.2 tells us that
\[ ( \lambda_{10} + \delta\lambda_{10} ) \check\chi_0 + ( \lambda_{11} + \delta\lambda_{11} ) \check\chi_1 = \psi_1, \quad\quad (6.4.2) \]
where
\[ |\delta\lambda_{10}| \leq \gamma_1 |\lambda_{10}| \quad\mbox{and}\quad |\delta\lambda_{11}| \leq \gamma_2 |\lambda_{11}|. \]
(6.4.1) and (6.4.2) can be combined into
\[ \left(\begin{array}{cc} \lambda_{00} + \delta\lambda_{00} & 0 \\ \lambda_{10} + \delta\lambda_{10} & \lambda_{11} + \delta\lambda_{11} \end{array}\right) \left(\begin{array}{c} \check\chi_0 \\ \check\chi_1 \end{array}\right) = \left(\begin{array}{c} \psi_0 \\ \psi_1 \end{array}\right), \]
where
\[ \left(\begin{array}{cc} |\delta\lambda_{00}| & 0 \\ |\delta\lambda_{10}| & |\delta\lambda_{11}| \end{array}\right) \leq \left(\begin{array}{cc} \gamma_1 |\lambda_{00}| & 0 \\ \gamma_1 |\lambda_{10}| & \gamma_2 |\lambda_{11}| \end{array}\right). \]
Since $\gamma_1 \leq \gamma_2$,
\[ \left| \left(\begin{array}{cc} \delta\lambda_{00} & 0 \\ \delta\lambda_{10} & \delta\lambda_{11} \end{array}\right) \right| \leq \max( \gamma_2, \gamma_{n-1} ) \left| \left(\begin{array}{cc} \lambda_{00} & 0 \\ \lambda_{10} & \lambda_{11} \end{array}\right) \right|. \]
Homework 6.4.1.4 Prove Theorem 6.4.1.3 for $n \geq 1$.
Solution. We will utilize a proof by induction.

• Case 1: $n = 1$. See Homework 6.4.1.2.

• Case 2: $n = 2$. See Homework 6.4.1.3.

• Case 3: $n > 2$.
The system now looks like
\[ \left(\begin{array}{cc} L_{00} & 0 \\ l_{10}^T & \lambda_{11} \end{array}\right) \left(\begin{array}{c} x_0 \\ \chi_1 \end{array}\right) = \left(\begin{array}{c} y_0 \\ \psi_1 \end{array}\right), \quad\quad (6.4.3) \]
where $L_{00} \in \mathbb R^{(n-1)\times(n-1)}$, and the inductive hypothesis states that
\[ ( L_{00} + \Delta L_{00} ) \check x_0 = y_0 \quad\mbox{where } |\Delta L_{00}| \leq \max( \gamma_2, \gamma_{n-2} ) |L_{00}|. \]
Since $\chi_1 = ( \psi_1 - l_{10}^T \check x_0 ) / \lambda_{11}$, Lemma 6.4.1.2 tells us that
\[ ( l_{10} + \delta l_{10} )^T \check x_0 + ( \lambda_{11} + \delta\lambda_{11} ) \check\chi_1 = \psi_1, \quad\quad (6.4.4) \]
where $|\delta l_{10}| \leq \gamma_{n-1} |l_{10}|$ and $|\delta\lambda_{11}| \leq \gamma_2 |\lambda_{11}|$.
(6.4.3) and (6.4.4) can be combined into
\[ \left(\begin{array}{cc} L_{00} + \Delta L_{00} & 0 \\ ( l_{10} + \delta l_{10} )^T & \lambda_{11} + \delta\lambda_{11} \end{array}\right) \left(\begin{array}{c} \check x_0 \\ \check\chi_1 \end{array}\right) = \left(\begin{array}{c} y_0 \\ \psi_1 \end{array}\right), \]
where
\[ \left(\begin{array}{cc} |\Delta L_{00}| & 0 \\ |\delta l_{10}|^T & |\delta\lambda_{11}| \end{array}\right) \leq \left(\begin{array}{cc} \max( \gamma_2, \gamma_{n-2} ) |L_{00}| & 0 \\ \gamma_{n-1} |l_{10}|^T & \gamma_2 |\lambda_{11}| \end{array}\right) \]
and hence
\[ \left| \left(\begin{array}{cc} \Delta L_{00} & 0 \\ \delta l_{10}^T & \delta\lambda_{11} \end{array}\right) \right| \leq \max( \gamma_2, \gamma_{n-1} ) \left| \left(\begin{array}{cc} L_{00} & 0 \\ l_{10}^T & \lambda_{11} \end{array}\right) \right|. \]

• By the Principle of Mathematical Induction, the result holds for all $n \geq 1$.

YouTube: https://www.youtube.com/watch?v=GB7wj7_dhCE
A careful examination of the solution to Homework 6.4.1.2, together with the fact that $\gamma_{n-1} \leq \gamma_n$, allows us to state a slightly looser, but cleaner, result than Theorem 6.4.1.3:
Corollary 6.4.1.4 Let $L \in \mathbb R^{n \times n}$ be a nonsingular lower triangular matrix and let $\check x$ be the computed result when executing Figure 6.4.1.1 to solve $Lx = y$ under the computation model from Subsection 6.2.3. Then there exists a matrix $\Delta L$ such that
\[ ( L + \Delta L ) \check x = y \quad\mbox{where } |\Delta L| \leq \gamma_n |L|. \]

6.4.2 Numerical stability of LU factorization

YouTube: https://www.youtube.com/watch?v=fds-FeL28ok
The numerical stability of various LU factorization algorithms as well as the triangular
solve algorithms can be found in standard graduate level numerical linear algebra texts [19]
[21]. Of particular interest may be the analysis of the Crout variant of LU factorization 5.5.1.4
in

• [6] Paolo Bientinesi, Robert A. van de Geijn, Goal-Oriented and Modular Stability
Analysis, SIAM Journal on Matrix Analysis and Applications , Volume 32 Issue 1,
February 2011.

• [7] Paolo Bientinesi, Robert A. van de Geijn, The Science of Deriving Stability Anal-
yses, FLAME Working Note #33. Aachen Institute for Computational Engineering
Sciences, RWTH Aachen. TR AICES-2008-2. November 2008. (Technical report ver-
sion with exercises.)

since these papers use the same notation as we use in our notes. Here is the pertinent result
from those papers:
Theorem 6.4.2.1 Backward error of Crout variant for LU factorization. Let $A \in \mathbb R^{n \times n}$ and let the LU factorization of $A$ be computed via the Crout variant, yielding approximate factors $\check L$ and $\check U$. Then
\[ ( A + \Delta A ) = \check L \check U \quad\mbox{with } |\Delta A| \leq \gamma_n |\check L| |\check U|. \]



6.4.3 Numerical stability of linear solve via LU factorization

YouTube: https://www.youtube.com/watch?v=c1NsTSCpe1k
Let us now combine the results from Subsection 6.4.1 and Subsection 6.4.2 into a back-
ward error result for solving Ax = y via LU factorization and two triangular solves.
Theorem 6.4.3.1 Let $A \in \mathbb R^{n \times n}$ and $x, y \in \mathbb R^n$ with $Ax = y$. Let $\check x$ be the approximate solution computed via the following steps:
• Compute the LU factorization, yielding approximate factors $\check L$ and $\check U$.

• Solve $\check L z = y$, yielding approximate solution $\check z$.

• Solve $\check U x = \check z$, yielding approximate solution $\check x$.

Then
\[ ( A + \Delta A ) \check x = y \quad\mbox{with } |\Delta A| \leq ( 3 \gamma_n + \gamma_n^2 ) |\check L| |\check U|. \]
We refer the interested learner to the proof in the previously mentioned papers [6] [7].
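The following minimal MATLAB sketch (ours, not from the course materials) carries out the three steps of Theorem 6.4.3.1 for a given matrix A and right-hand side y of size n, and forms the entrywise quantity that appears in the backward error bound. MATLAB's lu also applies partial pivoting, which, as discussed in Subsection 6.4.4, does not change the nature of the bound.

    [ L, U, P ] = lu( A );      % P*A = L*U (approximately, in floating point)
    z = L \ ( P * y );          % solve L z = P y  (forward substitution)
    x = U \ z;                  % solve U x = z    (back substitution)
    % Entrywise backward error bound of Theorem 6.4.3.1 (eps used as a
    % stand-in for eps_mach):
    gamma_n = n * eps / ( 1 - n * eps );
    bound = ( 3 * gamma_n + gamma_n^2 ) * abs( L ) * abs( U );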
Homework 6.4.3.1 The question left is how a change in a nonsingular matrix affects the accuracy of the solution of a linear system that involves that matrix. We saw in Subsection 1.4.1 that if
\[ A x = y \quad\mbox{and}\quad A ( x + \delta x ) = y + \delta y \]
then
\[ \frac{\|\delta x\|}{\|x\|} \leq \kappa( A ) \frac{\|\delta y\|}{\|y\|} \]
when $\|\cdot\|$ is a subordinate norm. But what we want to know is how a change in $A$ affects the solution: if
\[ A x = y \quad\mbox{and}\quad ( A + \Delta A )( x + \delta x ) = y \]
then
\[ \frac{\|\delta x\|}{\|x\|} \leq \frac{ \kappa( A ) \frac{\|\Delta A\|}{\|A\|} }{ 1 - \kappa( A ) \frac{\|\Delta A\|}{\|A\|} }. \]
Prove this!
Solution.
\[ A x = y \quad\mbox{and}\quad ( A + \Delta A )( x + \delta x ) = y \]
implies that
\[ ( A + \Delta A )( x + \delta x ) = A x \]
or, equivalently,
\[ \Delta A\, x + A\, \delta x + \Delta A\, \delta x = 0. \]
We can rewrite this as
\[ \delta x = A^{-1}( - \Delta A\, x - \Delta A\, \delta x ) \]
so that
\[ \|\delta x\| = \| A^{-1}( - \Delta A\, x - \Delta A\, \delta x ) \| \leq \|A^{-1}\| \|\Delta A\| \|x\| + \|A^{-1}\| \|\Delta A\| \|\delta x\|. \]
This can be rewritten as
\[ ( 1 - \|A^{-1}\| \|\Delta A\| ) \|\delta x\| \leq \|A^{-1}\| \|\Delta A\| \|x\| \]
so that
\[ \frac{\|\delta x\|}{\|x\|} \leq \frac{ \|A^{-1}\| \|\Delta A\| }{ 1 - \|A^{-1}\| \|\Delta A\| } \]
and finally
\[ \frac{\|\delta x\|}{\|x\|} \leq \frac{ \|A\| \|A^{-1}\| \frac{\|\Delta A\|}{\|A\|} }{ 1 - \|A\| \|A^{-1}\| \frac{\|\Delta A\|}{\|A\|} }. \]

The last homework brings up a good question: If $A$ is nonsingular, how small does $\Delta A$ need to be so that $A + \Delta A$ is guaranteed to be nonsingular?
Theorem 6.4.3.2 Let $A$ be nonsingular, $\|\cdot\|$ be a subordinate norm, and
\[ \frac{\|\Delta A\|}{\|A\|} < \frac{1}{\kappa( A )}. \]
Then $A + \Delta A$ is nonsingular.
Proof. Proof by contradiction.
Assume that $A$ is nonsingular,
\[ \frac{\|\Delta A\|}{\|A\|} < \frac{1}{\kappa( A )}, \]
and $A + \Delta A$ is singular. We will show this leads to a contradiction.
Since $A + \Delta A$ is singular, there exists $x \neq 0$ such that $( A + \Delta A ) x = 0$. We can rewrite this as
\[ x = - A^{-1} \Delta A\, x \]
and hence
\[ \|x\| = \| A^{-1} \Delta A\, x \| \leq \|A^{-1}\| \|\Delta A\| \|x\|. \]
Dividing both sides by $\|x\|$ yields
\[ 1 \leq \|A^{-1}\| \|\Delta A\| \]
and hence $\frac{1}{\|A^{-1}\|} \leq \|\Delta A\|$ and finally
\[ \frac{1}{\|A\| \|A^{-1}\|} \leq \frac{\|\Delta A\|}{\|A\|}, \]
which is a contradiction. ⌅

6.4.4 Numerical stability of linear solve via LU factorization with


partial pivoting

YouTube: https://www.youtube.com/watch?v=n95C8qjMBcI
The analysis of LU factorization without partial pivoting is related to that of LU factor-
ization with partial pivoting as follows:
• We have shown that LU factorization with partial pivoting is equivalent to the LU
factorization without partial pivoting on a pre-permuted matrix: P A = LU , where P
is a permutation matrix.

• The permutation (exchanging of rows) doesn’t involve any floating point operations
and therefore does not generate error.
It can therefore be argued that, as a result, the error that is accumulated is equivalent with or without partial pivoting.
More slowly, what if we took the following approach to LU factorization with partial
pivoting:
• Compute the LU factorization with partial pivoting yielding the pivot matrix P , the
unit lower triangular matrix L, and the upper triangular matrix U . In exact arithmetic
this would mean these matrices are related by P A = LU.

• In practice, no error exists in P (except that a wrong index of a row with which to
pivot may result from roundoff error in the intermediate results in matrix A) and
approximate factors Ľ and Ǔ are computed.

• If we now took the pivot matrix P and formed B = P A (without incurring error since
rows are merely swapped) and then computed the LU factorization of B, then the
computed L and U would equal exactly the Ľ and Ǔ that resulted from computing
the LU factorization with row pivoting with A in floating point arithmetic. Why?
Because the exact same computations are performed although possibly with data that
is temporarily in a different place in the matrix at the time of that computation.

• We know that therefore $\check L$ and $\check U$ satisfy
\[ B + \Delta B = \check L \check U, \quad\mbox{where } |\Delta B| \leq \gamma_n |\check L| |\check U|. \]

We conclude that
\[ P A + \Delta B = \check L \check U, \quad\mbox{where } |\Delta B| \leq \gamma_n |\check L| |\check U| \]
or, equivalently,
\[ P ( A + \Delta A ) = \check L \check U, \quad\mbox{where } P |\Delta A| \leq \gamma_n |\check L| |\check U|, \]
where $\Delta B = P \Delta A$ and we note that $P |\Delta A| = |P \Delta A|$ (taking the absolute value of a matrix and then swapping rows yields the same matrix as when one first swaps the rows and then takes the absolute value).

6.4.5 Is LU with Partial Pivoting Stable?

YouTube: https://www.youtube.com/watch?v=TdLM41LCma4
The last unit gives a backward error result regarding LU factorization (and, by extension, LU factorization with pivoting):
\[ ( A + \Delta A ) = \check L \check U \quad\mbox{with } |\Delta A| \leq \gamma_n |\check L| |\check U|. \]
The question now is: does this mean that LU factorization with partial pivoting is stable? In other words, is $\Delta A$, which we bounded with $|\Delta A| \leq \gamma_n |\check L| |\check U|$, always small relative to the entries of $|A|$? The following exercise gives some insight:
Homework 6.4.5.1 Apply LU with partial pivoting to
\[ A = \left(\begin{array}{rrr} 1 & 0 & 1 \\ -1 & 1 & 1 \\ -1 & -1 & 1 \end{array}\right). \]
Pivot only when necessary.
Solution. Notice that no pivoting is necessary. Eliminating the entries below the diagonal in the first column yields:
\[ \left(\begin{array}{rrr} 1 & 0 & 1 \\ 0 & 1 & 2 \\ 0 & -1 & 2 \end{array}\right). \]
Eliminating the entries below the diagonal in the second column again does not require pivoting and yields:
\[ \left(\begin{array}{rrr} 1 & 0 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 4 \end{array}\right). \]

Homework 6.4.5.2 Generalize the insights from the last homework to an $n \times n$ matrix. What is the maximal element growth that is observed?
Solution. Consider
\[ A = \left(\begin{array}{rrrrrr}
1 & 0 & 0 & \cdots & 0 & 1 \\
-1 & 1 & 0 & \cdots & 0 & 1 \\
-1 & -1 & 1 & \cdots & 0 & 1 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
-1 & -1 & \cdots & -1 & 1 & 1 \\
-1 & -1 & \cdots & -1 & -1 & 1
\end{array}\right). \]
Notice that no pivoting is necessary when LU factorization with pivoting is performed. Eliminating the entries below the diagonal in the first column yields:
\[ \left(\begin{array}{rrrrrr}
1 & 0 & 0 & \cdots & 0 & 1 \\
0 & 1 & 0 & \cdots & 0 & 2 \\
0 & -1 & 1 & \cdots & 0 & 2 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & -1 & \cdots & -1 & 1 & 2 \\
0 & -1 & \cdots & -1 & -1 & 2
\end{array}\right). \]
Eliminating the entries below the diagonal in the second column again does not require pivoting and yields:
\[ \left(\begin{array}{rrrrrr}
1 & 0 & 0 & \cdots & 0 & 1 \\
0 & 1 & 0 & \cdots & 0 & 2 \\
0 & 0 & 1 & \cdots & 0 & 4 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & -1 & 1 & 4 \\
0 & 0 & \cdots & -1 & -1 & 4
\end{array}\right). \]
Continuing like this for the remaining columns, eliminating the entries below the diagonal leaves us with the upper triangular matrix
\[ \left(\begin{array}{rrrrrr}
1 & 0 & 0 & \cdots & 0 & 1 \\
0 & 1 & 0 & \cdots & 0 & 2 \\
0 & 0 & 1 & \cdots & 0 & 4 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 0 & 1 & 2^{n-2} \\
0 & 0 & \cdots & 0 & 0 & 2^{n-1}
\end{array}\right). \]
From these exercises we conclude that even LU factorization with partial pivoting can yield large (exponential) element growth in $U$.
In practice, this does not seem to happen and LU factorization is considered to be stable.
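A minimal MATLAB sketch (ours, not part of the course materials) that builds the $n \times n$ "growth" matrix above and reports the largest entry of $U$ computed by LU factorization with partial pivoting; the value of n is an arbitrary choice:

    n = 32;
    A = eye( n ) - tril( ones( n ), -1 );   % ones on the diagonal, -1 below it
    A( :, n ) = 1;                          % last column of ones
    [ L, U, P ] = lu( A );                  % LU with partial pivoting
    disp( max( max( abs( U ) ) ) )          % observe growth of about 2^(n-1)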

6.5 Enrichments
6.5.1 Systematic derivation of backward error analyses
Throughout the course, we have pointed out that the FLAME notation facilitates the sys-
tematic derivation of linear algebra algorithms. The papers
• [6] Paolo Bientinesi, Robert A. van de Geijn, Goal-Oriented and Modular Stability
Analysis, SIAM Journal on Matrix Analysis and Applications , Volume 32 Issue 1,
February 2011.
• [7] Paolo Bientinesi, Robert A. van de Geijn, The Science of Deriving Stability Anal-
yses, FLAME Working Note #33. Aachen Institute for Computational Engineering
Sciences, RWTH Aachen. TR AICES-2008-2. November 2008. (Technical report ver-
sion of the SIAM paper, but with exercises.)
extend this to the systematic derivation of the backward error analysis of algorithms. Other
publications and texts present error analyses on a case-by-case basis (much like we do in
these materials) rather than as a systematic and comprehensive approach.

6.5.2 LU factorization with pivoting can fail in practice


While LU factorization with pivoting is considered to be a numerically stable approach to
solving linear systems, the following paper discusses cases where it may fail in practice:
• [18] Leslie V. Foster, Gaussian elimination with partial pivoting can fail in practice,
SIAM Journal on Matrix Analysis and Applications, 15 (1994), pp. 1354–1362.
Also of interest may be the paper
• [46] Stephen J. Wright, A Collection of Problems for Which Gaussian Elimination with Partial Pivoting is Unstable, SIAM Journal on Scientific Computing, Vol. 14, No. 1, 1993.
which discusses a number of (not necessarily practical) examples where LU factorization
with pivoting fails.

6.6 Wrap Up
6.6.1 Additional homework
Homework 6.6.1.1 In Units 6.3.1-3 we analyzed how error accumulates when computing a dot product of $x$ and $y$ of size $m$ in the order indicated by
\[ \kappa = ( ( \cdots ( ( \chi_0 \psi_0 + \chi_1 \psi_1 ) + \chi_2 \psi_2 ) + \cdots ) + \chi_{m-1} \psi_{m-1} ). \]
Let's illustrate an alternative way of computing the dot product:

• For $m = 2$:
\[ \kappa = \chi_0 \psi_0 + \chi_1 \psi_1 \]

• For $m = 4$:
\[ \kappa = ( \chi_0 \psi_0 + \chi_1 \psi_1 ) + ( \chi_2 \psi_2 + \chi_3 \psi_3 ) \]

• For $m = 8$:
\[ \kappa = ( ( \chi_0 \psi_0 + \chi_1 \psi_1 ) + ( \chi_2 \psi_2 + \chi_3 \psi_3 ) ) + ( ( \chi_4 \psi_4 + \chi_5 \psi_5 ) + ( \chi_6 \psi_6 + \chi_7 \psi_7 ) ) \]

and so forth (a recursive sketch of this evaluation order appears below). Analyze how error accumulates under the SCM and state backward stability results. You may assume that m is a power of two.
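The following MATLAB sketch (our own illustration, not part of the course materials) makes the pairwise evaluation order explicit; it assumes the length of x and y is a power of two.

    function kappa = dot_pairwise( x, y )
    % Compute x' * y by recursively splitting the vectors in half and adding
    % the two partial results, the order considered in Homework 6.6.1.1.
      m = length( x );
      if m == 1
        kappa = x( 1 ) * y( 1 );
      else
        mid = m / 2;     % m is assumed to be a power of two
        kappa = dot_pairwise( x( 1:mid ), y( 1:mid ) ) + ...
                dot_pairwise( x( mid+1:end ), y( mid+1:end ) );
      end
    end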

6.6.2 Summary
In our discussions, the set of floating point numbers, $F$, is the set of all numbers $\chi = \mu \times \beta^e$ such that
• $\beta = 2$,
• $\mu = \pm . \delta_0 \delta_1 \cdots \delta_{t-1}$ ($\mu$ has only $t$ (binary) digits), where $\delta_j \in \{ 0, 1 \}$,
• $\delta_0 = 0$ iff $\mu = 0$ (the mantissa is normalized), and
• $-L \leq e \leq U$.
Definition 6.6.2.1 Machine epsilon (unit roundoff). The machine epsilon (unit roundoff), $\epsilon_{\rm mach}$, is defined as the smallest positive floating point number $\chi$ such that the floating point number that represents $1 + \chi$ is greater than one.
\[ {\rm fl}( {\rm expression} ) = [ {\rm expression} ] \]
equals the result when computing expression using floating point computation (rounding or truncating as every intermediate result is stored). If
\[ \kappa = {\rm expression} \]
in exact arithmetic, then we denote the associated floating point result with
\[ \check\kappa = [ {\rm expression} ]. \]
The Standard Computational Model (SCM) assumes that, for any two floating point numbers $\chi$ and $\psi$, the basic arithmetic operations satisfy the equality
\[ {\rm fl}( \chi \mathbin{\rm op} \psi ) = ( \chi \mathbin{\rm op} \psi )( 1 + \epsilon ), \quad |\epsilon| \leq \epsilon_{\rm mach}, \mbox{ and } {\rm op} \in \{ +, -, *, / \}. \]
The Alternative Computational Model (ACM) assumes for the basic arithmetic operations that
\[ {\rm fl}( \chi \mathbin{\rm op} \psi ) = \frac{ \chi \mathbin{\rm op} \psi }{ 1 + \epsilon }, \quad |\epsilon| \leq \epsilon_{\rm mach}, \mbox{ and } {\rm op} \in \{ +, -, *, / \}. \]

Definition 6.6.2.2 Backward stable implementation. Given the mapping $f : D \rightarrow R$, where $D \subset \mathbb R^n$ is the domain and $R \subset \mathbb R^m$ is the range (codomain), let $\check f : D \rightarrow R$ be a computer implementation of this function. We will call $\check f$ a backward stable (also called "numerically stable") implementation of $f$ on domain $D$ if for all $x \in D$ there exists a $\check x$ "close" to $x$ such that $\check f( x ) = f( \check x )$.

• Conditioning is a property of the problem you are trying to solve. A problem is well-
conditioned if a small change in the input is guaranteed to only result in a small change
in the output. A problem is ill-conditioned if a small change in the input can result in
a large change in the output.
• Stability is a property of an implementation. If the implementation, when executed
with an input always yields an output that can be attributed to slightly changed input,
then the implementation is backward stable.
Definition 6.6.2.3 Absolute value of vector and matrix. Given $x \in \mathbb R^n$ and $A \in \mathbb R^{m \times n}$,
\[
|x| = \left(\begin{array}{c} |\chi_0| \\ |\chi_1| \\ \vdots \\ |\chi_{n-1}| \end{array}\right)
\quad\mbox{and}\quad
|A| = \left(\begin{array}{cccc}
|\alpha_{0,0}| & |\alpha_{0,1}| & \cdots & |\alpha_{0,n-1}| \\
|\alpha_{1,0}| & |\alpha_{1,1}| & \cdots & |\alpha_{1,n-1}| \\
\vdots & \vdots & \ddots & \vdots \\
|\alpha_{m-1,0}| & |\alpha_{m-1,1}| & \cdots & |\alpha_{m-1,n-1}|
\end{array}\right).
\]
Definition 6.6.2.4 Let $\triangle \in \{ <, \leq, =, \geq, > \}$ and $x, y \in \mathbb R^n$. Then
\[ |x| \mathrel\triangle |y| \quad\mbox{iff}\quad |\chi_i| \mathrel\triangle |\psi_i|, \]
with $i = 0, \ldots, n-1$. Similarly, given $A$ and $B \in \mathbb R^{m \times n}$,
\[ |A| \mathrel\triangle |B| \quad\mbox{iff}\quad |\alpha_{ij}| \mathrel\triangle |\beta_{ij}|, \]
with $i = 0, \ldots, m-1$ and $j = 0, \ldots, n-1$.
Theorem 6.6.2.5 Let $A, B \in \mathbb R^{m \times n}$. If $|A| \leq |B|$ then $\|A\|_1 \leq \|B\|_1$, $\|A\|_\infty \leq \|B\|_\infty$, and $\|A\|_F \leq \|B\|_F$.
Consider
\[
\kappa := x^T y =
\left(\begin{array}{c} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{n-2} \\ \chi_{n-1} \end{array}\right)^T
\left(\begin{array}{c} \psi_0 \\ \psi_1 \\ \vdots \\ \psi_{n-2} \\ \psi_{n-1} \end{array}\right)
= \left( ( \chi_0 \psi_0 + \chi_1 \psi_1 ) + \cdots + \chi_{n-2} \psi_{n-2} \right) + \chi_{n-1} \psi_{n-1}.
\]
Under the computational model given in Subsection 6.2.3 the computed result, $\check\kappa$, satisfies
\[
\check\kappa = \sum_{i=0}^{n-1} \left( \chi_i \psi_i ( 1 + \epsilon_*^{(i)} ) \prod_{j=i}^{n-1} ( 1 + \epsilon_+^{(j)} ) \right),
\]
where $\epsilon_+^{(0)} = 0$ and $|\epsilon_*^{(0)}|, |\epsilon_*^{(j)}|, |\epsilon_+^{(j)}| \leq \epsilon_{\rm mach}$ for $j = 1, \ldots, n-1$.
Lemma 6.6.2.6 Let $\epsilon_i \in \mathbb R$, $0 \leq i \leq n-1$, $n \epsilon_{\rm mach} < 1$, and $|\epsilon_i| \leq \epsilon_{\rm mach}$. Then $\exists\, \theta_n \in \mathbb R$ such that
\[ \prod_{i=0}^{n-1} ( 1 + \epsilon_i )^{\pm 1} = 1 + \theta_n, \]
with $|\theta_n| \leq n \epsilon_{\rm mach} / ( 1 - n \epsilon_{\rm mach} )$.
Here the $\pm 1$ means that on an individual basis, the term is either used in a multiplication or a division. For example
\[ ( 1 + \epsilon_0 )^{\pm 1} ( 1 + \epsilon_1 )^{\pm 1} \]
might stand for
\[ ( 1 + \epsilon_0 )( 1 + \epsilon_1 ) \quad\mbox{or}\quad \frac{( 1 + \epsilon_0 )}{( 1 + \epsilon_1 )} \quad\mbox{or}\quad \frac{( 1 + \epsilon_1 )}{( 1 + \epsilon_0 )} \quad\mbox{or}\quad \frac{1}{( 1 + \epsilon_1 )( 1 + \epsilon_0 )} \]
so that this lemma can accommodate an analysis that involves a mixture of the Standard and Alternative Computational Models (SCM and ACM).
Definition 6.6.2.7 For all $n \geq 1$ and $n \epsilon_{\rm mach} < 1$, define
\[ \gamma_n = n \epsilon_{\rm mach} / ( 1 - n \epsilon_{\rm mach} ). \]

With this notation, $\check\kappa$ simplifies to
\[
\begin{array}{rcl}
\check\kappa & = & \chi_0 \psi_0 ( 1 + \theta_n ) + \chi_1 \psi_1 ( 1 + \theta_n ) + \cdots + \chi_{n-1} \psi_{n-1} ( 1 + \theta_2 ) \\[1ex]
& = & \left(\begin{array}{c} \chi_0 \\ \chi_1 \\ \chi_2 \\ \vdots \\ \chi_{n-1} \end{array}\right)^T
\left(\begin{array}{ccccc}
( 1 + \theta_n ) & 0 & 0 & \cdots & 0 \\
0 & ( 1 + \theta_n ) & 0 & \cdots & 0 \\
0 & 0 & ( 1 + \theta_{n-1} ) & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & ( 1 + \theta_2 )
\end{array}\right)
\left(\begin{array}{c} \psi_0 \\ \psi_1 \\ \psi_2 \\ \vdots \\ \psi_{n-1} \end{array}\right) \\[1ex]
& = & \left(\begin{array}{c} \chi_0 \\ \chi_1 \\ \chi_2 \\ \vdots \\ \chi_{n-1} \end{array}\right)^T
\left( I + \underbrace{\left(\begin{array}{ccccc}
\theta_n & 0 & 0 & \cdots & 0 \\
0 & \theta_n & 0 & \cdots & 0 \\
0 & 0 & \theta_{n-1} & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & \theta_2
\end{array}\right)}_{\Sigma^{(n)}} \right)
\left(\begin{array}{c} \psi_0 \\ \psi_1 \\ \psi_2 \\ \vdots \\ \psi_{n-1} \end{array}\right),
\end{array}
\]
where $|\theta_j| \leq \gamma_j$, $j = 2, \ldots, n$.
Lemma 6.6.2.8 If $n, b \geq 1$ then $\gamma_n \leq \gamma_{n+b}$ and $\gamma_n + \gamma_b + \gamma_n \gamma_b \leq \gamma_{n+b}$.
Theorem 6.6.2.9 Let $x, y \in \mathbb R^n$ and let $\kappa := x^T y$ be computed in the order indicated by
\[ ( \cdots ( ( \chi_0 \psi_0 + \chi_1 \psi_1 ) + \chi_2 \psi_2 ) + \cdots ) + \chi_{n-1} \psi_{n-1}. \]
Then
\[ \check\kappa = [ x^T y ] = x^T ( I + \Sigma^{(n)} ) y. \]
Corollary 6.6.2.10 Under the assumptions of Theorem 6.6.2.9 the following relations hold:
R-1B $\check\kappa = ( x + \delta x )^T y$, where $|\delta x| \leq \gamma_n |x|$;

R-2B $\check\kappa = x^T ( y + \delta y )$, where $|\delta y| \leq \gamma_n |y|$;

R-1F $\check\kappa = x^T y + \delta\kappa$, where $|\delta\kappa| \leq \gamma_n |x|^T |y|$.

Theorem 6.6.2.11 Error results for matrix-vector multiplication. Let $A \in \mathbb R^{m \times n}$, $x \in \mathbb R^n$, $y \in \mathbb R^m$ and consider the assignment $y := Ax$ implemented via dot products as expressed in (6.3.4). Then these equalities hold:
R-1B $\check y = ( A + \Delta A ) x$, where $|\Delta A| \leq \gamma_n |A|$.

R-1F $\check y = A x + \delta y$, where $|\delta y| \leq \gamma_n |A| |x|$.

Theorem 6.6.2.12 Forward error for matrix-matrix multiplication. Let $C \in \mathbb R^{m \times n}$, $A \in \mathbb R^{m \times k}$, and $B \in \mathbb R^{k \times n}$ and consider the assignment $C := AB$ implemented via matrix-vector multiplication. Then there exists $\Delta C \in \mathbb R^{m \times n}$ such that
\[ \check C = A B + \Delta C, \quad\mbox{where } |\Delta C| \leq \gamma_k |A| |B|. \]
Lemma 6.6.2.13 Let $n \geq 1$, $\alpha, \lambda, \nu \in \mathbb R$ and $x, y \in \mathbb R^n$. Assume $\lambda \neq 0$ and consider the computation
\[ \nu := ( \alpha - x^T y ) / \lambda. \]
Then
\[ ( \lambda + \delta\lambda ) \check\nu = \alpha - ( x + \delta x )^T y, \quad\mbox{where } |\delta\lambda| \leq \gamma_2 |\lambda| \mbox{ and } |\delta x| \leq \gamma_n |x|. \]
Theorem 6.6.2.14 Let $L \in \mathbb R^{n \times n}$ be a nonsingular lower triangular matrix and let $\check x$ be the computed result when executing Figure 6.4.1.1 to solve $Lx = y$ under the computation model from Subsection 6.2.3. Then there exists a matrix $\Delta L$ such that
\[ ( L + \Delta L ) \check x = y \quad\mbox{where } |\Delta L| \leq \max( \gamma_2, \gamma_{n-1} ) |L|. \]
Corollary 6.6.2.15 Let $L \in \mathbb R^{n \times n}$ be a nonsingular lower triangular matrix and let $\check x$ be the computed result when executing Figure 6.4.1.1 to solve $Lx = y$ under the computation model from Subsection 6.2.3. Then there exists a matrix $\Delta L$ such that
\[ ( L + \Delta L ) \check x = y \quad\mbox{where } |\Delta L| \leq \gamma_n |L|. \]
Theorem 6.6.2.16 Backward error of Crout variant for LU factorization. Let $A \in \mathbb R^{n \times n}$ and let the LU factorization of $A$ be computed via the Crout variant, yielding approximate factors $\check L$ and $\check U$. Then
\[ ( A + \Delta A ) = \check L \check U \quad\mbox{with } |\Delta A| \leq \gamma_n |\check L| |\check U|. \]


Theorem 6.6.2.17 Let $A \in \mathbb R^{n \times n}$ and $x, y \in \mathbb R^n$ with $Ax = y$. Let $\check x$ be the approximate solution computed via the following steps:
• Compute the LU factorization, yielding approximate factors $\check L$ and $\check U$.

• Solve $\check L z = y$, yielding approximate solution $\check z$.

• Solve $\check U x = \check z$, yielding approximate solution $\check x$.

Then
\[ ( A + \Delta A ) \check x = y \quad\mbox{with } |\Delta A| \leq ( 3 \gamma_n + \gamma_n^2 ) |\check L| |\check U|. \]
Theorem 6.6.2.18 Let $A$ and $A + \Delta A$ be nonsingular and
\[ A x = y \quad\mbox{and}\quad ( A + \Delta A )( x + \delta x ) = y. \]
Then
\[ \frac{\|\delta x\|}{\|x\|} \leq \frac{ \kappa( A ) \frac{\|\Delta A\|}{\|A\|} }{ 1 - \kappa( A ) \frac{\|\Delta A\|}{\|A\|} }. \]
Theorem 6.6.2.19 Let $A$ be nonsingular, $\|\cdot\|$ be a subordinate norm, and
\[ \frac{\|\Delta A\|}{\|A\|} < \frac{1}{\kappa( A )}. \]
Then $A + \Delta A$ is nonsingular.
An important example that demonstrates how LU with partial pivoting can incur "element growth":
\[ A = \left(\begin{array}{rrrrrr}
1 & 0 & 0 & \cdots & 0 & 1 \\
-1 & 1 & 0 & \cdots & 0 & 1 \\
-1 & -1 & 1 & \cdots & 0 & 1 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
-1 & -1 & \cdots & -1 & 1 & 1 \\
-1 & -1 & \cdots & -1 & -1 & 1
\end{array}\right). \]
Week 7

Solving Sparse Linear Systems

7.1 Opening Remarks


7.1.1 Where do sparse linear systems come from?

YouTube: https://www.youtube.com/watch?v=Qq_cQbVQA5Y
Many computational engineering and science applications start with some law of physics that applies to some physical problem. This is mathematically expressed as a Partial Differential Equation (PDE). We here will use one of the simplest of PDEs, Poisson's equation on the domain Ω in two dimensions:
\[ -\Delta u = f. \]
In two dimensions this is alternatively expressed as
\[ -\frac{\partial^2 u}{\partial x^2} - \frac{\partial^2 u}{\partial y^2} = f( x, y ) \quad\quad (7.1.1) \]
with Dirichlet boundary condition $u = 0$ on ∂Ω (meaning that $u( x, y ) = 0$ on the boundary of the domain Ω). For example, the domain Ω may be the square $0 \leq x, y \leq 1$, ∂Ω its boundary, and the problem may model a membrane with $f$ being some load from, for example, a sound wave. Since this course does not require a background in the mathematics of PDEs, let's explain the gist of all this in layman's terms.
• We want to find the function u that satisfies the conditions specified by (7.1.1). It is
assumed that u is appropriately differentiable.


• For simplicity, let's assume the domain is the square with $0 \leq x \leq 1$ and $0 \leq y \leq 1$ so that the boundary is the boundary of this square. We assume that on the boundary the function equals zero.
• It is usually difficult to analytically determine the continuous function u that solves such a "boundary value problem" (except for very simple examples).
• To solve the problem computationally, the problem is "discretized". What this means for our example is that a mesh is laid over the domain, values for the function u at the mesh points are approximated, and the operator is approximated. In other words, the continuous domain is viewed as a mesh instead, as illustrated in Figure 7.1.1.1 (Left). We will assume an $N \times N$ mesh of equally spaced points, where the distance between two adjacent points is $h = 1/(N+1)$. This means the mesh consists of points $\{ ( \chi_i, \psi_j ) \}$ with $\chi_i = ( i + 1 ) h$ for $i = 0, 1, \ldots, N-1$ and $\psi_j = ( j + 1 ) h$ for $j = 0, 1, \ldots, N-1$.

Figure 7.1.1.1 2D mesh.

• If you do the math, details of which can be found in Subsection 7.4.1, you find that the problem in (7.1.1) can be approximated with a linear equation at each mesh point:
\[ \frac{ -u( \chi_i, \psi_{j-1} ) - u( \chi_{i-1}, \psi_j ) + 4 u( \chi_i, \psi_j ) - u( \chi_{i+1}, \psi_j ) - u( \chi_i, \psi_{j+1} ) }{ h^2 } = f( \chi_i, \psi_j ). \]
The values in this equation come from the "five point stencil" illustrated in Figure 7.1.1.1 (Right).

YouTube: https://www.youtube.com/watch?v=GvdBA5emnSs
• If we number the values at the grid points, $u( \chi_i, \psi_j )$, in what is called the "natural ordering" as illustrated in Figure 7.1.1.1 (Middle), then we can write all these insights, together with the boundary condition, as
\[ -\upsilon_{i-N} - \upsilon_{i-1} + 4 \upsilon_i - \upsilon_{i+1} - \upsilon_{i+N} = h^2 \phi_i \]
or, equivalently,
\[ \upsilon_i = \frac{ h^2 \phi_i + \upsilon_{i-N} + \upsilon_{i-1} + \upsilon_{i+1} + \upsilon_{i+N} }{ 4 } \]
with appropriate modifications for the case where $i$ places the point that yielded the equation on the bottom, left, right, and/or top of the mesh.

YouTube: https://www.youtube.com/watch?v=VYMbSJAqaUM
All these insights can be put together into a system of linear equations:
\[ \begin{array}{rcl}
4\upsilon_0 - \upsilon_1 - \upsilon_4 & = & h^2 \phi_0 \\
-\upsilon_0 + 4\upsilon_1 - \upsilon_2 - \upsilon_5 & = & h^2 \phi_1 \\
-\upsilon_1 + 4\upsilon_2 - \upsilon_3 - \upsilon_6 & = & h^2 \phi_2 \\
-\upsilon_2 + 4\upsilon_3 - \upsilon_7 & = & h^2 \phi_3 \\
-\upsilon_0 + 4\upsilon_4 - \upsilon_5 - \upsilon_8 & = & h^2 \phi_4 \\
& \vdots &
\end{array} \]
where $\phi_i = f( \chi_i, \psi_j )$ if $( \chi_i, \psi_j )$ is the point associated with value $\upsilon_i$. In matrix notation this becomes
\[
\left(\begin{array}{cccccccccc}
4 & -1 & & & -1 & & & & & \\
-1 & 4 & -1 & & & -1 & & & & \\
& -1 & 4 & -1 & & & -1 & & & \\
& & -1 & 4 & & & & -1 & & \\
-1 & & & & 4 & -1 & & & -1 & \\
& -1 & & & -1 & 4 & -1 & & & \ddots \\
& & -1 & & & -1 & 4 & -1 & & \\
& & & -1 & & & -1 & 4 & & \\
& & & & -1 & & & & 4 & \\
& & & & & \ddots & & & & \ddots
\end{array}\right)
\left(\begin{array}{c} \upsilon_0 \\ \upsilon_1 \\ \upsilon_2 \\ \upsilon_3 \\ \upsilon_4 \\ \upsilon_5 \\ \upsilon_6 \\ \upsilon_7 \\ \upsilon_8 \\ \vdots \end{array}\right)
=
\left(\begin{array}{c} h^2\phi_0 \\ h^2\phi_1 \\ h^2\phi_2 \\ h^2\phi_3 \\ h^2\phi_4 \\ h^2\phi_5 \\ h^2\phi_6 \\ h^2\phi_7 \\ h^2\phi_8 \\ \vdots \end{array}\right).
\]
This demonstrates how solving the discretized Poisson's equation boils down to the solution of a linear system $A u = h^2 f$, where $A$ has a distinct sparsity pattern (pattern of nonzeroes).
Homework 7.1.1.1 The observations in this unit suggest the following way of solving (7.1.1):
• Discretize the domain $0 \leq \chi, \psi \leq 1$ by creating an $( N + 2 ) \times ( N + 2 )$ mesh of points.

• An $( N + 2 ) \times ( N + 2 )$ array U holds the values $u( \chi_i, \psi_j )$ plus the boundary around it.

• Create an $( N + 2 ) \times ( N + 2 )$ array F that holds the values $f( \chi_i, \psi_j )$ (plus, for convenience, extra values that correspond to the boundary).

• Set all values in U to zero. This initializes the last rows and columns to zero, which captures the boundary condition, and initializes the rest of the values at the mesh points to zero.

• Repeatedly update all interior points with the formula
\[ U( i, j ) = ( h^2 F( i, j ) + U( i, j-1 ) + U( i-1, j ) + U( i+1, j ) + U( i, j+1 ) ) / 4, \]
where the terms correspond to $f( \chi_i, \psi_j )$, $u( \chi_i, \psi_{j-1} )$, $u( \chi_{i-1}, \psi_j )$, $u( \chi_{i+1}, \psi_j )$, and $u( \chi_i, \psi_{j+1} )$, respectively, until the values converge.

• Bingo! You have written your first iterative solver for a sparse linear system.

• Test your solver with the problem where $f( \chi, \psi ) = ( \alpha + \beta ) \pi^2 \sin( \alpha \pi \chi ) \sin( \beta \pi \psi )$.

• Hint: if x and y are arrays with the vectors $x$ and $y$ (with entries $\chi_i$ and $\psi_j$), then mesh( x, y, U ) plots the values in U.

Hint. An outline for a matlab script can be found in Assignments/Week07/matlab/Poisson_Jacobi_iteration.m. When you execute the script, in the COMMAND WINDOW enter "RETURN" to advance to the next iteration.
Solution. Assignments/Week07/answers/Poisson_Jacobi_iteration.m. When you execute the script, in the COMMAND WINDOW enter "RETURN" to advance to the next iteration.
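The following minimal MATLAB sketch (ours, not the course's solution script) illustrates the update described in the homework; the values of N, alpha, beta, and the number of sweeps are arbitrary choices for illustration.

    N = 50;  h = 1 / ( N + 1 );
    x = h * ( 0:N+1 );  y = h * ( 0:N+1 );
    alpha = 2;  beta = 3;
    F = ( alpha + beta ) * pi^2 * sin( alpha * pi * x' ) * sin( beta * pi * y );
    U = zeros( N+2, N+2 );          % boundary stays zero
    for k = 1:500                   % fixed number of sweeps for simplicity
      Unew = U;
      for j = 2:N+1
        for i = 2:N+1
          Unew( i, j ) = ( h^2 * F( i, j ) + U( i, j-1 ) + U( i-1, j ) ...
                           + U( i+1, j ) + U( i, j+1 ) ) / 4;
        end
      end
      U = Unew;                     % all points updated from the old values
    end
    mesh( x, y, U )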
Remark 7.1.1.2 In Homework 7.2.1.4 we store the vectors u and f as they appear in
Figure 7.1.1.1 as 2D arrays. This captures the fact that a 2d array of numbers isn’t necessarily
a matrix. In this case, it is a vector that is stored as a 2D array because it better captures
how the values to be computed relate to the physical problem from which they arise.

YouTube: https://www.youtube.com/watch?v=j-ELcqx3bRo
Remark 7.1.1.3 The point of this launch is that many problems that arise in computational
science require the solution to a system of linear equations Ax = b where A is a (very) sparse
matrix. Often, the matrix does not even need to be explicitly formed and stored.
Remark 7.1.1.4 Wilkinson defined a sparse matrix as any matrix with enough zeros that
it pays to take advantage of them.

7.1.2 Overview
• 7.1 Opening

¶ 7.1.1 Where do sparse linear systems come from?


¶ 7.1.2 Overview
¶ 7.1.3 What you will learn

• 7.2 Direct Solution

¶ 7.2.1 Banded matrices


¶ 7.2.2 Nested dissection
¶ 7.2.3 Observations

• 7.3 Iterative Solution

¶ 7.3.1 Jacobi iteration


¶ 7.3.2 Gauss-Seidel iteration
¶ 7.3.3 Convergence of splitting methods
¶ 7.3.4 Successive Over-Relaxation (SOR)

• 7.4 Enrichments

¶ 7.4.1 Details!
¶ 7.4.2 Parallelism in splitting methods
¶ 7.4.3 Dr. SOR

• 7.5 Wrap Up

¶ 7.5.1 Additional homework


¶ 7.5.2 Summary
WEEK 7. SOLVING SPARSE LINEAR SYSTEMS 387

7.1.3 What you will learn


This week is all about solving nonsingular linear systems with matrices that are sparse (have
enough zero entries that it is worthwhile to exploit them).
Upon completion of this week, you should be able to

• Exploit sparsity when computing the Cholesky factorization and related triangular
solves of a banded matrix.

• Derive the cost for a Cholesky factorization and related triangular solves of a banded
matrix.

• Utilize nested dissection to reduce fill-in when computing the Cholesky factorization
and related triangular solves of a sparse matrix.

• Connect sparsity patterns in a matrix to the graph that describes that sparsity pattern.

• Relate computations over discretized domains to the Jacobi, Gauss-Seidel, Successive


Over-Relaxation (SOR) and Symmetric Successive Over-Relaxation (SSOR) iterations.

• Formulate the Jacobi, Gauss-Seidel, Successive Over-Relaxation (SOR) and Symmetric


Successive Over-Relaxation (SSOR) iterations as splitting methods.

• Analyze the convergence of splitting methods.

7.2 Direct Solution


7.2.1 Banded matrices

YouTube: https://www.youtube.com/watch?v=UX6Z6q1_prs
It is tempting to simply use a dense linear solver to compute the solution to Ax = b via,
for example, LU or Cholesky factorization, even when A is sparse. This would require O(n³)
operations, where n equals the size of matrix A. What we see in this unit is that we can
take advantage of a "banded" structure in the matrix to greatly reduce the computational
cost.
Homework 7.2.1.1 The 1D equivalent of the example from Subsection 7.1.1 is given by the
tridiagonal linear system
\[ A = \left(\begin{array}{ccccc}
2 & -1 & & & \\
-1 & 2 & -1 & & \\
& \ddots & \ddots & \ddots & \\
& & -1 & 2 & -1 \\
& & & -1 & 2
\end{array}\right). \quad\quad (7.2.1) \]

Prove that this linear system is nonsingular.


Hint. Consider Ax = 0. We need to prove that x = 0. If you instead consider the
equivalent problem
\[
\left(\begin{array}{ccccccc}
1 & 0 & & & & & 0 \\
-1 & 2 & -1 & \cdots & 0 & 0 & \\
0 & -1 & 2 & -1 & \cdots & 0 & 0 \\
\vdots & & \ddots & \ddots & \ddots & & \vdots \\
0 & & & -1 & 2 & -1 & 0 \\
& 0 & & & -1 & 2 & -1 \\
0 & & & & & 0 & 1
\end{array}\right)
\left(\begin{array}{c} \chi_{-1} \\ \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{n-2} \\ \chi_{n-1} \\ \chi_n \end{array}\right)
=
\left(\begin{array}{c} 0 \\ 0 \\ 0 \\ \vdots \\ 0 \\ 0 \\ 0 \end{array}\right)
\]
that introduces two extra variables $\chi_{-1} = 0$ and $\chi_n = 0$, the problem for all $\chi_i$, $0 \leq i < n$, becomes
\[ -\chi_{i-1} + 2 \chi_i - \chi_{i+1} = 0 \]
or, equivalently,
\[ \chi_i = \frac{ \chi_{i-1} + \chi_{i+1} }{ 2 }. \]
Reason through what would happen if any $\chi_i$ is not equal to zero.
Solution. Building on the hint: Let's say that $\chi_i \neq 0$ while $\chi_{-1}, \ldots, \chi_{i-1}$ all equal zero. Without loss of generality, assume $\chi_i > 0$. Then
\[ \chi_i = \frac{ \chi_{i-1} + \chi_{i+1} }{ 2 } = \frac{1}{2} \chi_{i+1} \]
and hence
\[ \chi_{i+1} = 2 \chi_i > 0. \]
Next,
\[ \chi_{i+1} = \frac{ \chi_i + \chi_{i+2} }{ 2 } = 2 \chi_i \]
and hence
\[ \chi_{i+2} = 4 \chi_i - \chi_i = 3 \chi_i > 0. \]
Continuing this argument, the solution to the recurrence relation is $\chi_n = ( n - i + 1 ) \chi_i$, and you find that $\chi_n > 0$, which contradicts $\chi_n = 0$.
This course covers topics in a "circular" way, where sometimes we introduce and use
results that we won’t formally cover until later in the course. Here is one such situation. In

a later week you will prove these relevant results involving eigenvalues:

• A symmetric matrix is symmetric positive definite (SPD) if and only if its eigenvalues
are positive.

• The Gershgorin Disk Theorem tells us that the matrix in (7.2.1) has nonnegative
eigenvalues.

• A matrix is singular if and only if it has zero as an eigenvalue.

These insights, together with Homework 7.2.1.1, tell us that the matrix in (7.2.1) is SPD.
Homework 7.2.1.2 Compute the Cholesky factor of
\[ A = \left(\begin{array}{rrrr} 4 & -2 & 0 & 0 \\ -2 & 5 & -2 & 0 \\ 0 & -2 & 10 & 6 \\ 0 & 0 & 6 & 5 \end{array}\right). \]
Answer.
\[ L = \left(\begin{array}{rrrr} 2 & 0 & 0 & 0 \\ -1 & 2 & 0 & 0 \\ 0 & -1 & 3 & 0 \\ 0 & 0 & 2 & 1 \end{array}\right). \]
Homework 7.2.1.3 Let $A \in \mathbb R^{n \times n}$ be tridiagonal and SPD so that
\[ A = \left(\begin{array}{ccccc}
\alpha_{0,0} & \alpha_{1,0} & & & \\
\alpha_{1,0} & \alpha_{1,1} & \alpha_{2,1} & & \\
& \ddots & \ddots & \ddots & \\
& & \alpha_{n-2,n-3} & \alpha_{n-2,n-2} & \alpha_{n-1,n-2} \\
& & & \alpha_{n-1,n-2} & \alpha_{n-1,n-1}
\end{array}\right). \quad\quad (7.2.2) \]
• Propose a Cholesky factorization algorithm that exploits the structure of this matrix.

• What is the cost? (Count square roots, divides, multiplies, and subtractions.)

• What would have been the (approximate) cost if we had not taken advantage of the tridiagonal structure?

Solution.

• If you play with a few smaller examples, you can conjecture that the Cholesky factor of (7.2.2) is a bidiagonal matrix (the main diagonal plus the first subdiagonal). Thus, $A = L L^T$ translates to
\[ A = \left(\begin{array}{cccc}
\lambda_{0,0} & & & \\
\lambda_{1,0} & \lambda_{1,1} & & \\
& \ddots & \ddots & \\
& & \lambda_{n-1,n-2} & \lambda_{n-1,n-1}
\end{array}\right)
\left(\begin{array}{cccc}
\lambda_{0,0} & \lambda_{1,0} & & \\
& \lambda_{1,1} & \ddots & \\
& & \ddots & \lambda_{n-1,n-2} \\
& & & \lambda_{n-1,n-1}
\end{array}\right). \]
Multiplying this out and equating entries shows that
\[ \lambda_{0,0} \lambda_{0,0} = \alpha_{0,0}, \quad \lambda_{i+1,i} \lambda_{i,i} = \alpha_{i+1,i}, \quad\mbox{and}\quad \lambda_{i,i-1} \lambda_{i,i-1} + \lambda_{i,i} \lambda_{i,i} = \alpha_{i,i}. \]
With this insight, the algorithm that overwrites $A$ with its Cholesky factor is given by

for $i = 0, \ldots, n-2$
  $\alpha_{i,i} := \sqrt{ \alpha_{i,i} }$
  $\alpha_{i+1,i} := \alpha_{i+1,i} / \alpha_{i,i}$
  $\alpha_{i+1,i+1} := \alpha_{i+1,i+1} - \alpha_{i+1,i} \alpha_{i+1,i}$
endfor
$\alpha_{n-1,n-1} := \sqrt{ \alpha_{n-1,n-1} }$

• A cost analysis shows that this requires $n$ square roots, $n-1$ divides, $n-1$ multiplies, and $n-1$ subtracts.

• The cost, had we not taken advantage of the special structure, would have been (approximately) $\frac{1}{3} n^3$ flops.
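A minimal MATLAB sketch (ours) of this tridiagonal Cholesky factorization, where the matrix is stored as two vectors, d (diagonal) and e (subdiagonal), which are overwritten with the diagonal and subdiagonal of L:

    function [ d, e ] = chol_tridiag( d, e )
    % Tridiagonal Cholesky factorization: n square roots, n-1 divides,
    % n-1 multiplies, n-1 subtracts.
      n = length( d );
      for i = 1:n-1
        d( i ) = sqrt( d( i ) );
        e( i ) = e( i ) / d( i );
        d( i+1 ) = d( i+1 ) - e( i ) * e( i );
      end
      d( n ) = sqrt( d( n ) );
    end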
Homework 7.2.1.4 Propose an algorithm for overwriting y with the solution to Ax = y for the SPD matrix in Homework 7.2.1.3.
Solution.
• Use the algorithm from Homework 7.2.1.3 to overwrite $A$ with its Cholesky factor.

• Since $A = L L^T$, we need to solve $L z = y$ and then $L^T x = z$.

  ◦ Overwriting y with the solution of $L z = y$ (forward substitution) is accomplished by the following algorithm (here $L$ has overwritten $A$):

    for $i = 0, \ldots, n-2$
      $\psi_i := \psi_i / \alpha_{i,i}$
      $\psi_{i+1} := \psi_{i+1} - \alpha_{i+1,i} \psi_i$
    endfor
    $\psi_{n-1} := \psi_{n-1} / \alpha_{n-1,n-1}$

  ◦ Overwriting y with the solution of $L^T x = z$ (where $z$ has overwritten $y$; back substitution) is accomplished by the following algorithm (here $L$ has overwritten $A$):

    for $i = n-1, \ldots, 1$
      $\psi_i := \psi_i / \alpha_{i,i}$
      $\psi_{i-1} := \psi_{i-1} - \alpha_{i,i-1} \psi_i$
    endfor
    $\psi_0 := \psi_0 / \alpha_{0,0}$
The last exercises illustrate how special structure (in terms of patterns of zeroes and
nonzeroes) can often be exploited to reduce the cost of factoring a matrix and solving a
linear system.

YouTube: https://www.youtube.com/watch?v=kugJ2N jC2U


The bandwidth of a matrix is defined as the smallest integer b such that all elements on
the jth superdiagonal and subdiagonal of the matrix equal zero if j > b.

• A diagonal matrix has bandwidth 1.

• A tridiagonal matrix has bandwidth 2.

• And so forth.

Let’s see how to take advantage of the zeroes in a matrix with bandwidth b, focusing on
SPD matrices.
Definition 7.2.1.1 The half-band width of a symmetric matrix equals the number of sub-
diagonals beyond which all the matrix contains only zeroes. For example, a diagonal matrix
has half-band width of zero and a tridiagonal matrix has a half-band width of one. ⌃
Homework 7.2.1.5 Assume the SPD matrix $A \in \mathbb R^{m \times m}$ has a bandwidth of $b$. Propose a modification of the right-looking Cholesky factorization from Figure 5.4.3.1

$A$ = Chol-right-looking($A$)
Partition $A \rightarrow \left(\begin{array}{c|c} A_{TL} & A_{TR} \\ \hline A_{BL} & A_{BR} \end{array}\right)$, where $A_{TL}$ is $0 \times 0$
while $n( A_{TL} ) < n( A )$
  Repartition
  $\left(\begin{array}{c|c} A_{TL} & A_{TR} \\ \hline A_{BL} & A_{BR} \end{array}\right) \rightarrow
   \left(\begin{array}{c|c|c} A_{00} & a_{01} & A_{02} \\ \hline a_{10}^T & \alpha_{11} & a_{12}^T \\ \hline A_{20} & a_{21} & A_{22} \end{array}\right)$

  $\alpha_{11} := \sqrt{ \alpha_{11} }$
  $a_{21} := a_{21} / \alpha_{11}$
  $A_{22} := A_{22} - a_{21} a_{21}^T$ (updating only the lower triangular part)

  Continue with
  $\left(\begin{array}{c|c} A_{TL} & A_{TR} \\ \hline A_{BL} & A_{BR} \end{array}\right) \leftarrow
   \left(\begin{array}{c|c|c} A_{00} & a_{01} & A_{02} \\ \hline a_{10}^T & \alpha_{11} & a_{12}^T \\ \hline A_{20} & a_{21} & A_{22} \end{array}\right)$
endwhile

that takes advantage of the zeroes in the matrix. (You will want to draw yourself a picture.) What is its approximate cost in flops (when $m$ is large)?
Solution. See the below video.

YouTube: https://www.youtube.com/watch?v=Ao dARtix5Q


Ponder This 7.2.1.6 Propose a modification of the FLAME notation that allows one to
elegantly express the algorithm you proposed for Homework 7.2.1.5
Ponder This 7.2.1.7 Another way of looking at an SPD matrix $A \in \mathbb R^{n \times n}$ with bandwidth $b$ is to block it
\[ A = \left(\begin{array}{ccccc}
A_{0,0} & A_{1,0}^T & & & \\
A_{1,0} & A_{1,1} & A_{2,1}^T & & \\
& \ddots & \ddots & \ddots & \\
& & A_{n-2,n-3} & A_{n-2,n-2} & A_{n-1,n-2}^T \\
& & & A_{n-1,n-2} & A_{n-1,n-1}
\end{array}\right), \]
where $A_{i,j} \in \mathbb R^{b \times b}$ and for simplicity we assume that $n$ is a multiple of $b$. Propose an algorithm for computing its Cholesky factorization that exploits this block structure. What special structure do the matrices $A_{i+1,i}$ have? Can you take advantage of this structure?
Analyze the cost of your proposed algorithm.

7.2.2 Nested dissection

YouTube: https://www.youtube.com/watch?v=r1P4Ze7Yqe0
The purpose of the game is to limit fill-in, which happens when zeroes turn into nonze-
roes. With an example that would result from, for example, Poisson’s equation, we will
illustrate the basic techniques, which are known as "nested dissection."
If you consider the mesh that results from the discretization of, for example, a square
domain, the numbering of the mesh points does not need to be according to the "natural
ordering" we chose to use before. As we number the mesh points, we reorder (permute) both
the columns of the matrix (which correspond to the elements ‚i to be computed) and the
equations that tell one how ‚i is computed from its neighbors. If we choose a separator, the
points highlighted in red in Figure 7.2.2.1 (Top-Left), and order the mesh points to its left
first, then the ones to its right, and finally the points in the separator, we create a pattern
of zeroes, as illustrated in Figure 7.2.2.1 (Top-Right).

Figure 7.2.2.1 An illustration of nested dissection.



Homework 7.2.2.1 Consider the SPD matrix
\[ A = \left(\begin{array}{ccc} A_{00} & 0 & A_{20}^T \\ 0 & A_{11} & A_{21}^T \\ A_{20} & A_{21} & A_{22} \end{array}\right). \]
• What special structure does the Cholesky factor of this matrix have?
• How can the different parts of the Cholesky factor be computed in a way that takes advantage of the zero blocks?
• How do you take advantage of the zero pattern when solving with the Cholesky factors?

Solution.
• The Cholesky factor of this matrix has the structure
\[ L = \left(\begin{array}{ccc} L_{00} & 0 & 0 \\ 0 & L_{11} & 0 \\ L_{20} & L_{21} & L_{22} \end{array}\right). \]

• We notice that $A = L L^T$ means that
\[ \left(\begin{array}{ccc} A_{00} & 0 & A_{20}^T \\ 0 & A_{11} & A_{21}^T \\ A_{20} & A_{21} & A_{22} \end{array}\right)
= \left(\begin{array}{ccc} L_{00} & 0 & 0 \\ 0 & L_{11} & 0 \\ L_{20} & L_{21} & L_{22} \end{array}\right)
\left(\begin{array}{ccc} L_{00} & 0 & 0 \\ 0 & L_{11} & 0 \\ L_{20} & L_{21} & L_{22} \end{array}\right)^T
= \left(\begin{array}{ccc} L_{00} L_{00}^T & 0 & \star \\ 0 & L_{11} L_{11}^T & \star \\ L_{20} L_{00}^T & L_{21} L_{11}^T & L_{20} L_{20}^T + L_{21} L_{21}^T + L_{22} L_{22}^T \end{array}\right), \]
where the $\star$s indicate "symmetric parts" that don't play a role. We deduce that the following steps will yield the Cholesky factor:
  ◦ Compute the Cholesky factor of $A_{00}$: $A_{00} = L_{00} L_{00}^T$, overwriting $A_{00}$ with the result.
  ◦ Compute the Cholesky factor of $A_{11}$: $A_{11} = L_{11} L_{11}^T$, overwriting $A_{11}$ with the result.
  ◦ Solve $X L_{00}^T = A_{20}$ for $X$, overwriting $A_{20}$ with the result. (This is a triangular solve with multiple right-hand sides in disguise.)
  ◦ Solve $X L_{11}^T = A_{21}$ for $X$, overwriting $A_{21}$ with the result. (This is a triangular solve with multiple right-hand sides in disguise.)
  ◦ Update the lower triangular part of $A_{22}$ with $A_{22} - L_{20} L_{20}^T - L_{21} L_{21}^T$.
  ◦ Compute the Cholesky factor of $A_{22}$: $A_{22} = L_{22} L_{22}^T$, overwriting $A_{22}$ with the result.

• If we now want to solve $Ax = y$, we can instead first solve $Lz = y$ and then $L^T x = z$. Consider
\[ \left(\begin{array}{ccc} L_{00} & 0 & 0 \\ 0 & L_{11} & 0 \\ L_{20} & L_{21} & L_{22} \end{array}\right)
\left(\begin{array}{c} z_0 \\ z_1 \\ z_2 \end{array}\right) = \left(\begin{array}{c} y_0 \\ y_1 \\ y_2 \end{array}\right). \]
This can be solved via the steps
  ◦ Solve $L_{00} z_0 = y_0$.
  ◦ Solve $L_{11} z_1 = y_1$.
  ◦ Solve $L_{22} z_2 = y_2 - L_{20} z_0 - L_{21} z_1$.
Similarly,
\[ \left(\begin{array}{ccc} L_{00}^T & 0 & L_{20}^T \\ 0 & L_{11}^T & L_{21}^T \\ 0 & 0 & L_{22}^T \end{array}\right)
\left(\begin{array}{c} x_0 \\ x_1 \\ x_2 \end{array}\right) = \left(\begin{array}{c} z_0 \\ z_1 \\ z_2 \end{array}\right) \]
can be solved via the steps
  ◦ Solve $L_{22}^T x_2 = z_2$.
  ◦ Solve $L_{11}^T x_1 = z_1 - L_{21}^T x_2$.
  ◦ Solve $L_{00}^T x_0 = z_0 - L_{20}^T x_2$.
YouTube: https://www.youtube.com/watch?v=mwX0wPRdw7U
WEEK 7. SOLVING SPARSE LINEAR SYSTEMS 396

Each of the three subdomains that were created in Figure 7.2.2.1 can themselves be
reordered by identifying separators. In Figure 7.2.2.2 we illustrate this only for the left and
right subdomains. This creates a recursive structure in the matrix. Hence, the name nested
dissection for this approach.

Figure 7.2.2.2 A second level of nested dissection.

7.2.3 Observations
Through an example, we have illustrated the following insights regarding the direct solution
of sparse linear systems:

• There is a one-to-one correspondence between links in the graph that shows how mesh
points are influenced by other mesh points (connectivity) and nonzeroes in the matrix.
If the graph is undirected, then the sparsity in the matrix is symmetric (provided the
unknowns are ordered in the same order as the equations that relate the unknowns to
their neighbors). If the graph is directed, then the matrix has a nonsymmetric sparsity
pattern.

• Renumbering the mesh points is equivalent to correspondingly permuting the columns


of the matrix and the solution vector. Reordering the corresponding equations is
equivalent to permuting the rows of the matrix.

These observations relate the problem of reducing fill-in to the problem of partitioning the graph by identifying a separator. The smaller the number of mesh points in the separator (the interface), the smaller the submatrix that corresponds to it and the less fill-in will occur related to this dissection.

Remark 7.2.3.1 Importantly: one can start with a mesh and manipulate it into a matrix
or one can start with a matrix and have its sparsity pattern prescribe the graph.

7.3 Iterative Solution


7.3.1 Jacobi iteration

YouTube: https://www.youtube.com/watch?v=OMbxk1ihIFo
Let's review what we saw in Subsection 7.1.1. The linear system $A u = f$
\[ \begin{array}{rcl}
4\upsilon_0 - \upsilon_1 - \upsilon_4 & = & h^2 \phi_0 \\
-\upsilon_0 + 4\upsilon_1 - \upsilon_2 - \upsilon_5 & = & h^2 \phi_1 \\
-\upsilon_1 + 4\upsilon_2 - \upsilon_3 - \upsilon_6 & = & h^2 \phi_2 \\
-\upsilon_2 + 4\upsilon_3 - \upsilon_7 & = & h^2 \phi_3 \\
-\upsilon_0 + 4\upsilon_4 - \upsilon_5 - \upsilon_8 & = & h^2 \phi_4 \\
& \vdots &
\end{array} \]
which can be written in matrix form as
\[
\left(\begin{array}{cccccccccc}
4 & -1 & & & -1 & & & & & \\
-1 & 4 & -1 & & & -1 & & & & \\
& -1 & 4 & -1 & & & -1 & & & \\
& & -1 & 4 & & & & -1 & & \\
-1 & & & & 4 & -1 & & & -1 & \\
& -1 & & & -1 & 4 & -1 & & & \ddots \\
& & -1 & & & -1 & 4 & -1 & & \\
& & & -1 & & & -1 & 4 & & \\
& & & & -1 & & & & 4 & \\
& & & & & \ddots & & & & \ddots
\end{array}\right)
\left(\begin{array}{c} \upsilon_0 \\ \upsilon_1 \\ \upsilon_2 \\ \upsilon_3 \\ \upsilon_4 \\ \upsilon_5 \\ \upsilon_6 \\ \upsilon_7 \\ \upsilon_8 \\ \vdots \end{array}\right)
=
\left(\begin{array}{c} h^2\phi_0 \\ h^2\phi_1 \\ h^2\phi_2 \\ h^2\phi_3 \\ h^2\phi_4 \\ h^2\phi_5 \\ h^2\phi_6 \\ h^2\phi_7 \\ h^2\phi_8 \\ \vdots \end{array}\right)
\]
was solved by repeatedly updating
\[ \upsilon_i = \frac{ h^2 \phi_i + \upsilon_{i-N} + \upsilon_{i-1} + \upsilon_{i+1} + \upsilon_{i+N} }{ 4 } \]

modified appropriately for points adjacent to the boundary. Let’s label the value of ‚i during
(k)
the kth iteration with ‚i and state the algorithm more explicitly as
for k = 0, . . . , convergence
for i = 0, . . . , N ◊ N ≠ 1
(k+1) (k) (k) (k) (k)
‚i = (h2 „i + ‚i≠N + ‚i≠1 + ‚i+1 + ‚i+N )/4
endfor
endfor
again, modified appropriately for points adjacent to the boundary. The superscripts are there
to emphasize the iteration during which a value is updated. In practice, only the values for
iteration k and k + 1 need to be stored. We can also capture the algorithm with a vector
and matrix as
(k+1) (k) (k)
4‚0 = ‚1 +‚4 +h2 „0
(k+1) (k) (k) (k)
4‚1 = ‚0 +‚2 +‚5 +h2 „1
(k+1) (k) (k) (k)
4‚2 = ‚1 +‚3 +‚6 +h2 „2
(k+1) (k) (k)
4‚3 = ‚2 +‚7 +h2 „3
(k+1) (k) (k) (k)
4‚4 = ‚0 +‚5 ≠‚8 +h2 „4
.. ... ... ... ... .. ..
. . .
which can be written in matrix form as
Q R
Q R (k+1)
‚0
4 c (k+1)
d
c dc ‚1 d
c
c
4 dc
dc (k+1)
d
d
c
c 4 dc
dc
‚2 d
d
dc (k+1) d
c
c 4 dc ‚3 d
c dc (k+1)
d
c
c 4 dc
dc ‚4
d
d
c 4 dc (k+1)
d
c dc ‚5 d
c dc d
c
c 4 dc
dc
(k+1)
‚6
d
d
c 4 dc d
c dc (k+1) d
c dc ‚7 d
c
a
4 dc
bc (k+1)
d
d
.. c ‚8 d
. a
.. b
.
Q R
Q
(k)
R (7.3.1)
0 1 1 c
‚0
d
Q 2
h „0
R
c 1 0 1 (k)
c 1 dc
dc ‚1 d c
d c h2 „1
d
d
c dc d c
c
c
1 0 1 1 dc
dc
(k)
‚2 d c
d c h2 „2
d
d
d
c
c 1 0 1 dc
dc (k)
‚3
d c
d c h2 „3 d
d
c dc d c
c 1 0 1 1 (k) d
c
dc
dc ‚4
d c
d c h2 „4 d
= c .. dc d+c d
c
c 1 1 0 1 . dc
dc
(k)
‚5 d c
d c h2 „5 d.
d
d
c
c 1 1 0 1 dc
dc (k)
‚6
d c
d c h2 „6 d
d
c dc d c
c
c
1 1 0 dc
dc (k) d c
d c
h2 „7 d
d
. ‚7 d
c
c 1 0 ..
dc
dc (k)
d c
d a h2 „8 d
a bc ‚8 d .. b
... ... ... a
.. b .
.

YouTube: https://www.youtube.com/watch?v=7rDvET9_nek
How can we capture this more generally?

• We wish to solve $Ax = y$.
We write $A$ as the difference of its diagonal, $M = D$, and the negative of its off-diagonal part, $N = D - A$, so that
\[ A = D - ( D - A ) = M - N. \]
In our example, $M = 4I$ and $N = 4I - A$.

• We then notice that
\[ A x = y \]
can be rewritten as
\[ ( M - N ) x = y \]
or, equivalently,
\[ M x = N x + y. \]
If you think about it carefully, this captures (7.3.1) for our example. Finally,
\[ x = M^{-1}( N x + y ). \]

• If we now let $x^{(k)}$ be the values of our vector $x$ in the current step, then the values after all elements have been updated are given by the vector
\[ x^{(k+1)} = M^{-1}( N x^{(k)} + y ). \]

• All we now need is an initial guess for the solution, $x^{(0)}$, and we are ready to iteratively solve the linear system by computing $x^{(1)}$, $x^{(2)}$, etc., until we (approximately) reach a fixed point where $x^{(k+1)} = M^{-1}( N x^{(k)} + y ) \approx x^{(k)}$.

The described method, where $M$ equals the diagonal of $A$ and $N = D - A$, is known as the Jacobi iteration.
Remark 7.3.1.1 The important observation is that the computation involves a matrix-vector multiplication with a sparse matrix, $N = D - A$, and a solve with a diagonal matrix, $M = D$.
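A minimal MATLAB sketch (ours) of the Jacobi iteration in this matrix form, for a (sparse) matrix A and right-hand side y; maxits and tol are parameters we choose:

    d = diag( A );                    % M = D,  N = D - A
    x = zeros( size( y ) );           % initial guess x^(0)
    for k = 1:maxits
      x = ( y - ( A * x - d .* x ) ) ./ d;     % x^(k+1) = D^{-1}( (D-A) x^(k) + y )
      if norm( y - A * x ) <= tol * norm( y )
        break;
      end
    end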

7.3.2 Gauss-Seidel iteration

YouTube: https://www.youtube.com/watch?v=ufMUhO1vDew
A variation on the Jacobi iteration is the Gauss-Seidel iteration. It recognizes that since
values at points are updated in some order, if a neighboring value has already been updated
earlier in the current step, then you might as well use that updated value. For our example
from Subsection 7.1.1 this is captured by the algorithm

for $k = 0, \ldots,$ until convergence
  for $i = 0, \ldots, N \times N - 1$
    $\upsilon_i^{(k+1)} = ( h^2 \phi_i + \upsilon_{i-N}^{(k+1)} + \upsilon_{i-1}^{(k+1)} + \upsilon_{i+1}^{(k)} + \upsilon_{i+N}^{(k)} ) / 4$
  endfor
endfor

modified appropriately for points adjacent to the boundary. This algorithm exploits the fact that $\upsilon_{i-N}^{(k+1)}$ and $\upsilon_{i-1}^{(k+1)}$ have already been computed by the time $\upsilon_i^{(k+1)}$ is updated. Once again, the superscripts are there to emphasize the iteration during which a value is updated. In practice, the superscripts can be dropped because of the order in which the computation happens.
Homework 7.3.2.1 Modify the code for Homework 7.1.1.1 ( what you now know as the
Jacobi iteration) to implement the Gauss-Seidel iteration.
Solution. Assignments/Week07/answers/Poisson_GS_iteration.m.
When you execute the script, in the COMMAND WINDOW enter "RETURN" to advance
to the next iteration.
You may also want to observe the Jacobi and Gauss-Seidel iterations in action side-by-side
in Assignments/Week07/answers/Poisson_Jacobi_vs_GS_iteration.m.
Homework 7.3.2.2 Here we repeat (7.3.1) for Jacobi’s iteration applied to the example in
Subsection 7.1.1:
\[
\left(\begin{array}{cccccccccc}
4 & & & & & & & & & \\
& 4 & & & & & & & & \\
& & 4 & & & & & & & \\
& & & 4 & & & & & & \\
& & & & 4 & & & & & \\
& & & & & 4 & & & & \\
& & & & & & 4 & & & \\
& & & & & & & 4 & & \\
& & & & & & & & 4 & \\
& & & & & & & & & \ddots
\end{array}\right)
\left(\begin{array}{c} \upsilon_0^{(k+1)} \\ \upsilon_1^{(k+1)} \\ \upsilon_2^{(k+1)} \\ \upsilon_3^{(k+1)} \\ \upsilon_4^{(k+1)} \\ \upsilon_5^{(k+1)} \\ \upsilon_6^{(k+1)} \\ \upsilon_7^{(k+1)} \\ \upsilon_8^{(k+1)} \\ \vdots \end{array}\right)
=
\left(\begin{array}{cccccccccc}
0 & 1 & & & 1 & & & & & \\
1 & 0 & 1 & & & 1 & & & & \\
& 1 & 0 & 1 & & & 1 & & & \\
& & 1 & 0 & & & & 1 & & \\
1 & & & & 0 & 1 & & & 1 & \\
& 1 & & & 1 & 0 & 1 & & & \ddots \\
& & 1 & & & 1 & 0 & 1 & & \\
& & & 1 & & & 1 & 0 & & \\
& & & & 1 & & & & 0 & \\
& & & & & \ddots & & & & \ddots
\end{array}\right)
\left(\begin{array}{c} \upsilon_0^{(k)} \\ \upsilon_1^{(k)} \\ \upsilon_2^{(k)} \\ \upsilon_3^{(k)} \\ \upsilon_4^{(k)} \\ \upsilon_5^{(k)} \\ \upsilon_6^{(k)} \\ \upsilon_7^{(k)} \\ \upsilon_8^{(k)} \\ \vdots \end{array}\right)
+
\left(\begin{array}{c} h^2\phi_0 \\ h^2\phi_1 \\ h^2\phi_2 \\ h^2\phi_3 \\ h^2\phi_4 \\ h^2\phi_5 \\ h^2\phi_6 \\ h^2\phi_7 \\ h^2\phi_8 \\ \vdots \end{array}\right). \quad\quad (7.3.2)
\]

Modify this to reflect the Gauss-Seidel iteration.


Solution.
\[
\left(\begin{array}{cccccccccc}
4 & & & & & & & & & \\
-1 & 4 & & & & & & & & \\
& -1 & 4 & & & & & & & \\
& & -1 & 4 & & & & & & \\
-1 & & & & 4 & & & & & \\
& -1 & & & -1 & 4 & & & & \\
& & -1 & & & -1 & 4 & & & \\
& & & -1 & & & -1 & 4 & & \\
& & & & -1 & & & & 4 & \\
& & & & & \ddots & & & \ddots & \ddots
\end{array}\right)
\left(\begin{array}{c} \upsilon_0^{(k+1)} \\ \upsilon_1^{(k+1)} \\ \upsilon_2^{(k+1)} \\ \upsilon_3^{(k+1)} \\ \upsilon_4^{(k+1)} \\ \upsilon_5^{(k+1)} \\ \upsilon_6^{(k+1)} \\ \upsilon_7^{(k+1)} \\ \upsilon_8^{(k+1)} \\ \vdots \end{array}\right)
:=
\left(\begin{array}{cccccccccc}
0 & 1 & & & 1 & & & & & \\
& 0 & 1 & & & 1 & & & & \\
& & 0 & 1 & & & 1 & & & \\
& & & 0 & & & & 1 & & \\
& & & & 0 & 1 & & & 1 & \\
& & & & & 0 & 1 & & & \ddots \\
& & & & & & 0 & 1 & & \\
& & & & & & & 0 & & \\
& & & & & & & & 0 & \\
& & & & & & & & & \ddots
\end{array}\right)
\left(\begin{array}{c} \upsilon_0^{(k)} \\ \upsilon_1^{(k)} \\ \upsilon_2^{(k)} \\ \upsilon_3^{(k)} \\ \upsilon_4^{(k)} \\ \upsilon_5^{(k)} \\ \upsilon_6^{(k)} \\ \upsilon_7^{(k)} \\ \upsilon_8^{(k)} \\ \vdots \end{array}\right)
+
\left(\begin{array}{c} h^2\phi_0 \\ h^2\phi_1 \\ h^2\phi_2 \\ h^2\phi_3 \\ h^2\phi_4 \\ h^2\phi_5 \\ h^2\phi_6 \\ h^2\phi_7 \\ h^2\phi_8 \\ \vdots \end{array}\right).
\]
This homework suggests the following:

• We wish to solve $Ax = y$.
We write symmetric $A$ as
\[ A = \underbrace{( D - L )}_{M} - \underbrace{( L^T )}_{N}, \]
where $-L$ equals the strictly lower triangular part of $A$ and $D$ is its diagonal.

• We then notice that
\[ A x = y \]
can be rewritten as
\[ ( D - L - L^T ) x = y \]
or, equivalently,
\[ ( D - L ) x = L^T x + y. \]
If you think about it carefully, this captures (7.3.2) for our example. Finally,
\[ x = ( D - L )^{-1}( L^T x + y ). \]

• If we now let $x^{(k)}$ be the values of our vector $x$ in the current step, then the values after all elements have been updated are given by the vector
\[ x^{(k+1)} = ( D - L )^{-1}( L^T x^{(k)} + y ). \]
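A minimal MATLAB sketch (ours) of the Gauss-Seidel iteration in this matrix form, for a symmetric (sparse) matrix A and right-hand side y; maxits and tol are parameters we choose:

    M = tril( A );                          % M = D - L
    x = zeros( size( y ) );                 % initial guess
    for k = 1:maxits
      x = M \ ( y - triu( A, 1 ) * x );     % (D-L) x^(k+1) = L^T x^(k) + y
      if norm( y - A * x ) <= tol * norm( y )
        break;
      end
    end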
Homework 7.3.2.3 When the Gauss-Seidel iteration is used to solve $Ax = y$, where $A \in \mathbb R^{n \times n}$, it computes entries of $x^{(k+1)}$ in the forward order $\chi_0^{(k+1)}, \chi_1^{(k+1)}, \ldots$. If $A = D - L - L^T$, this is captured by
\[ ( D - L ) x^{(k+1)} = L^T x^{(k)} + y. \quad\quad (7.3.3) \]
Modify (7.3.3) to yield a "reverse" Gauss-Seidel method that computes the entries of vector $x^{(k+1)}$ in the order $\chi_{n-1}^{(k+1)}, \chi_{n-2}^{(k+1)}, \ldots$.
Solution. The reverse order is given by $\chi_{n-1}^{(k+1)}, \chi_{n-2}^{(k+1)}, \ldots$. This corresponds to the splitting $M = D - L^T$ and $N = L$ so that
\[ ( D - L^T ) x^{(k+1)} = L x^{(k)} + y. \]
Homework 7.3.2.4 A "symmetric" Gauss-Seidel iteration to solve symmetric $Ax = y$, where $A \in \mathbb R^{n \times n}$, alternates between computing entries in forward and reverse order. In other words, if $A = M_F - N_F$ for the forward Gauss-Seidel method and $A = M_R - N_R$ for the reverse Gauss-Seidel method, then
\[ \begin{array}{rcl}
M_F x^{(k+\frac12)} & = & N_F x^{(k)} + y \\
M_R x^{(k+1)} & = & N_R x^{(k+\frac12)} + y
\end{array} \]
constitutes one iteration of this symmetric Gauss-Seidel iteration. Determine $M$ and $N$ such that
\[ M x^{(k+1)} = N x^{(k)} + y \]
equals one iteration of the symmetric Gauss-Seidel iteration.
(You may want to follow the hint...)
Hint.
• From this unit and the last homework, we know that $M_F = ( D - L )$, $N_F = L^T$, $M_R = ( D - L^T )$, and $N_R = L$.

• Show that
\[ ( D - L^T ) x^{(k+1)} = L ( D - L )^{-1} L^T x^{(k)} + ( I + L ( D - L )^{-1} ) y. \]

• Show that $I + L ( D - L )^{-1} = D ( D - L )^{-1}$.

• Use these insights to determine $M$ and $N$.

Solution.

• From this unit and the last homework, we know that $M_F = ( D - L )$, $N_F = L^T$, $M_R = ( D - L^T )$, and $N_R = L$.

• Show that
\[ ( D - L^T ) x^{(k+1)} = L ( D - L )^{-1} L^T x^{(k)} + ( I + L ( D - L )^{-1} ) y. \]
We show this by substituting $M_R$ and $N_R$:
\[ ( D - L^T ) x^{(k+1)} = L x^{(k+\frac12)} + y \]
and then substituting in for $x^{(k+\frac12)}$, $M_F$ and $N_F$:
\[ ( D - L^T ) x^{(k+1)} = L ( ( D - L )^{-1} ( L^T x^{(k)} + y ) ) + y. \]
Multiplying out the right-hand side and factoring out $y$ yields the desired result.

• Show that $I + L ( D - L )^{-1} = D ( D - L )^{-1}$.
We show this by noting that
\[ I + L ( D - L )^{-1} = ( D - L )( D - L )^{-1} + L ( D - L )^{-1} = ( D - L + L )( D - L )^{-1} = D ( D - L )^{-1}. \]

• Use these insights to determine $M$ and $N$.
We now notice that
\[ ( D - L^T ) x^{(k+1)} = L ( D - L )^{-1} L^T x^{(k)} + ( I + L ( D - L )^{-1} ) y \]
can be rewritten as
\[ ( D - L^T ) x^{(k+1)} = L ( D - L )^{-1} L^T x^{(k)} + D ( D - L )^{-1} y \]
and hence
\[ \underbrace{( D - L ) D^{-1} ( D - L^T )}_{M} \, x^{(k+1)} = \underbrace{( D - L ) D^{-1} L ( D - L )^{-1} L^T}_{N} \, x^{(k)} + y. \]

7.3.3 Convergence of splitting methods



YouTube: https://www.youtube.com/watch?v=L6PZhc-G7cE
The Jacobi and Gauss-Seidel iterations can be generalized as follows. Split matrix $A = M - N$ where $M$ is nonsingular. Now,
\[ ( M - N ) x = y \]
is equivalent to
\[ M x = N x + y \]
and
\[ x = M^{-1}( N x + y ). \]
This is an example of a fixed-point equation: plug $x$ into $M^{-1}( N x + y )$ and the result is again $x$. The iteration is then created by viewing the vector on the left as the next approximation to the solution given the current approximation $x$ on the right:
\[ x^{(k+1)} = M^{-1}( N x^{(k)} + y ). \]

Let $A = ( D - L - U )$ where $-L$, $D$, and $-U$ are the strictly lower triangular, diagonal, and strictly upper triangular parts of $A$.

• For the Jacobi iteration, $M = D$ and $N = ( L + U )$.

• For the Gauss-Seidel iteration, $M = ( D - L )$ and $N = U$.

In practice, $M$ is not inverted. Instead, the iteration is implemented as
\[ M x^{(k+1)} = N x^{(k)} + y, \]
with which we emphasize that we solve with $M$ rather than inverting it.
Homework 7.3.3.1 Why are the choices of M and N used by the Jacobi iteration and
Gauss-Seidel iteration convenient?
Solution. Both methods have two advantages:
• The multiplication $N x^{(k)}$ can exploit sparsity in the original matrix $A$.

• Solving with $M$ is relatively cheap. In the case of the Jacobi iteration ($M = D$) it is trivial. In the case of the Gauss-Seidel iteration ($M = D - L$), the lower triangular system inherits the sparsity pattern of the corresponding part of $A$.
Homework 7.3.3.2 Let $A = M - N$ be a splitting of matrix $A$. Let $x^{(k+1)} = M^{-1}( N x^{(k)} + y )$. Show that
\[ x^{(k+1)} = x^{(k)} + M^{-1} r^{(k)}, \quad\mbox{where } r^{(k)} = y - A x^{(k)}. \]
Solution.
\[ \begin{array}{rcl}
x^{(k)} + M^{-1} r^{(k)} & = & x^{(k)} + M^{-1}( y - A x^{(k)} ) \\
& = & x^{(k)} + M^{-1}( y - ( M - N ) x^{(k)} ) \\
& = & x^{(k)} + M^{-1} y - M^{-1}( M - N ) x^{(k)} \\
& = & x^{(k)} + M^{-1} y - ( I - M^{-1} N ) x^{(k)} \\
& = & M^{-1}( N x^{(k)} + y ).
\end{array} \]
This last exercise provides an important link between iterative refinement, discussed in Subsection 5.3.7, and splitting methods. Let us revisit this, using the notation from this section.
If $Ax = y$ and $x^{(k)}$ is a (current) approximation to $x$, then
\[ r^{(k)} = y - A x^{(k)} \]
is the (current) residual. If we solve
\[ A \, \delta x^{(k)} = r^{(k)} \]
or, equivalently, compute
\[ \delta x = A^{-1} r^{(k)}, \]
then
\[ x = x^{(k)} + \delta x \]
is the solution to $Ax = y$. Now, if we merely compute an approximation,
\[ \delta x^{(k)} \approx A^{-1} r^{(k)}, \]
then
\[ x^{(k+1)} = x^{(k)} + \delta x^{(k)} \]
is merely a (hopefully better) approximation to $x$. If $M \approx A$ then
\[ \delta x^{(k)} = M^{-1} r^{(k)} \approx A^{-1} r^{(k)}. \]
So, the better $M$ approximates $A$, the faster we can expect $x^{(k)}$ to converge to $x$.
With this in mind, we notice that if $A = D - L - U$, where $D$, $-L$, and $-U$ equal its diagonal, strictly lower triangular, and strictly upper triangular part, and we split $A = M - N$, then $M = D - L$ is a better approximation to matrix $A$ than is $M = D$.
Ponder This 7.3.3.3 Given these insights, why might the symmetric Gauss-Seidel method
discussed in Homework 7.3.2.4 have benefits over the regular Gauss-Seidel method?
Loosely speaking, a sequence of numbers, $\chi^{(k)}$, is said to converge to the number $\chi$ if $|\chi^{(k)} - \chi|$ eventually becomes arbitrarily close to zero. This is written as
\[ \lim_{k \rightarrow \infty} \chi^{(k)} = \chi. \]
A sequence of vectors, $x^{(k)}$, converges to the vector $x$ if for some norm $\|\cdot\|$
\[ \lim_{k \rightarrow \infty} \| x^{(k)} - x \| = 0. \]
Because of the equivalence of norms, if the sequence converges in one norm, it converges in all norms. In particular, it means it converges in the $\infty$-norm, which means that $\max_i |\chi_i^{(k)} - \chi_i|$ converges to zero, and hence for all entries $|\chi_i^{(k)} - \chi_i|$ eventually becomes arbitrarily small.
Finally, a sequence of matrices, $A^{(k)}$, converges to the matrix $A$ if for some norm $\|\cdot\|$
\[ \lim_{k \rightarrow \infty} \| A^{(k)} - A \| = 0. \]
Again, if it converges for one norm, it converges for all norms and the individual elements of $A^{(k)}$ converge to the corresponding elements of $A$.
Let's now look at the convergence of splitting methods. If $x$ solves $Ax = y$ and $x^{(k)}$ is the sequence of vectors generated starting with $x^{(0)}$, then
\[ \begin{array}{rcl} M x & = & N x + y \\ M x^{(k+1)} & = & N x^{(k)} + y \end{array} \]
so that
\[ M ( x^{(k+1)} - x ) = N ( x^{(k)} - x ) \]
or, equivalently,
\[ x^{(k+1)} - x = ( M^{-1} N )( x^{(k)} - x ). \]
This, in turn, means that
\[ x^{(k+1)} - x = ( M^{-1} N )^{k+1} ( x^{(0)} - x ). \]
If $\|\cdot\|$ is a vector norm and its induced matrix norm, then
\[ \| x^{(k+1)} - x \| = \| ( M^{-1} N )^{k+1} ( x^{(0)} - x ) \| \leq \| M^{-1} N \|^{k+1} \| x^{(0)} - x \|. \]
Hence, if $\| M^{-1} N \| < 1$ in that norm, then $\lim_{i \rightarrow \infty} \| M^{-1} N \|^i = 0$ and hence $x^{(k)}$ converges to $x$. We summarize this in the following theorem:
Theorem 7.3.3.1 Let $A \in \mathbb R^{n \times n}$ be nonsingular and $x, y \in \mathbb R^n$ so that $Ax = y$. Let $A = M - N$ be a splitting of $A$, $x^{(0)}$ be given (an initial guess), and $x^{(k+1)} = M^{-1}( N x^{(k)} + y )$. If $\| M^{-1} N \| < 1$ for some matrix norm induced by the $\|\cdot\|$ vector norm, then $x^{(k)}$ will converge to the solution $x$.
Because of the equivalence of matrix norms, if we can find any matrix norm $|||\cdot|||$ such that $||| M^{-1} N ||| < 1$, the sequence of vectors converges.

Ponder This 7.3.3.4 Contemplate the finer points of the last argument about the conver-
gence of (M ≠1 N )i

YouTube: https://www.youtube.com/watch?v=uv8cMeR9u_U
Understanding the following observation will have to wait until after we cover eigenvalues and eigenvectors, later in the course. For splitting methods, it is the spectral radius of a matrix (the magnitude of the eigenvalue with largest magnitude), ρ(B), that often gives us insight into whether the method converges. This, once again, requires us to use a result from a future week in this course: It can be shown that for all B ∈ R^{m×m} and ε > 0 there exists a norm ||| · |||_{B,ε} such that |||B|||_{B,ε} ≤ ρ(B) + ε. What this means is that if we can show that ρ(M^{−1} N) < 1, then the splitting method converges for the given matrix A.
Homework 7.3.3.5 Given nonsingular A ∈ R^{n×n}, what splitting A = M − N will give the fastest convergence to the solution of Ax = y?
Solution. M = A and N = 0. Then, regardless of the initial vector x^(0),

    x^(1) := M^{−1} (N x^(0) + y) = A^{−1} (0 x^(0) + y) = A^{−1} y.

Thus, convergence occurs after a single iteration.
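To make the role of ρ(M^{−1} N) concrete, here is a small MATLAB sketch (ours, not from the text) that estimates the spectral radii of the Jacobi and Gauss-Seidel iteration matrices for a simple tridiagonal test matrix; a value below one predicts convergence, and a smaller value predicts faster convergence.

    % A sketch (ours): spectral radii of M^{-1} N for two splittings of a test matrix.
    n = 10;
    A = full( spdiags( ones( n, 1 ) * [ -1 2 -1 ], -1:1, n, n ) );  % tridiag(-1,2,-1), SPD
    D = diag( diag( A ) );
    L = -tril( A, -1 );
    U = -triu( A, 1 );
    rho = @( M, N ) max( abs( eig( M \ N ) ) );   % spectral radius of the iteration matrix
    rho_Jacobi      = rho( D,     L + U )
    rho_GaussSeidel = rho( D - L, U )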

YouTube: https://www.youtube.com/watch?v= K sk0qdCu0

7.3.4 Successive Over-Relaxation (SOR)



YouTube: https://www.youtube.com/watch?v=t9B 9z7HPTQ


Recall that if A = D − L − U, where −L, D, and −U are the strictly lower triangular, diagonal, and strictly upper triangular parts of A, then the Gauss-Seidel iteration for solving Ax = y can be expressed as x^(k+1) = (D − L)^{−1} (U x^(k) + y) or, equivalently, χ_i^(k+1) solves

    ∑_{j=0}^{i−1} α_{i,j} χ_j^(k+1) + α_{i,i} χ_i^(k+1) = − ∑_{j=i+1}^{n−1} α_{i,j} χ_j^(k) + ψ_i,

where any term involving a zero is skipped. We label this χ_i^(k+1) with χ_i^{GS(k+1)} in our subsequent discussion.
What if we pick our next value a bit further:

    χ_i^(k+1) = ω χ_i^{GS(k+1)} + (1 − ω) χ_i^(k),

where ω ≥ 1? This is known as over-relaxation. Then

    χ_i^{GS(k+1)} = (1/ω) χ_i^(k+1) − ((1 − ω)/ω) χ_i^(k)

and

    ∑_{j=0}^{i−1} α_{i,j} χ_j^(k+1) + α_{i,i} [ (1/ω) χ_i^(k+1) − ((1 − ω)/ω) χ_i^(k) ] = − ∑_{j=i+1}^{n−1} α_{i,j} χ_j^(k) + ψ_i

or, equivalently,

    ∑_{j=0}^{i−1} α_{i,j} χ_j^(k+1) + (1/ω) α_{i,i} χ_i^(k+1) = ((1 − ω)/ω) α_{i,i} χ_i^(k) − ∑_{j=i+1}^{n−1} α_{i,j} χ_j^(k) + ψ_i.

This is equivalent to splitting

    A = M − N with M = (1/ω) D − L and N = ((1 − ω)/ω) D + U,

an iteration known as successive over-relaxation (SOR). The idea now is that the relaxation parameter ω can often be chosen to improve (reduce) the spectral radius of M^{−1} N, thus accelerating convergence.
We continue with A = D − L − U, where −L, D, and −U are the strictly lower triangular, diagonal, and strictly upper triangular parts of A. Building on SOR, where

    A = M_F − N_F with M_F = (1/ω) D − L and N_F = ((1 − ω)/ω) D + U,

and the F stands for "Forward," an alternative would be to compute the elements of x in reverse order, using the latest available values. This is equivalent to splitting

    A = M_R − N_R with M_R = (1/ω) D − U and N_R = ((1 − ω)/ω) D + L,

where the R stands for "Reverse." The symmetric successive over-relaxation (SSOR) iteration combines the "forward" SOR with a "reverse" SOR, much like the symmetric Gauss-Seidel does:

    x^(k+1/2) = M_F^{−1} (N_F x^(k) + y)
    x^(k+1)   = M_R^{−1} (N_R x^(k+1/2) + y).

This can be expressed as a splitting A = M − N. The details are a bit messy, and we will skip them.
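As a concrete illustration, the following MATLAB sketch (ours; the function name and interface are not from the text) performs k_max forward SOR sweeps for Ax = y. Setting omega = 1 recovers the Gauss-Seidel iteration.

    % A minimal sketch (ours) of k_max forward SOR sweeps, updating x in place.
    function x = SOR( A, y, x, omega, k_max )
      n = size( A, 1 );
      for k = 1:k_max
        for i = 1:n
          % Gauss-Seidel value for component i, using the latest entries of x
          chi_GS = ( y( i ) - A( i, [1:i-1, i+1:n] ) * x( [1:i-1, i+1:n] ) ) / A( i, i );
          % over-relax: step a bit further than Gauss-Seidel would
          x( i ) = omega * chi_GS + ( 1 - omega ) * x( i );
        end
      end
    end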

7.4 Enrichments
7.4.1 Details!
To solve the problem computationally, it is again discretized. Relating back to the
problem of the membrane on the unit square in the previous section, this means that the
continuous domain is viewed as a mesh instead, as illustrated in Figure 7.4.1.1.

Figure 7.4.1.1 2D mesh.


In that figure, υ_i equals, for example, the displacement from rest of the point on the membrane.
Now, let φ_i be the value of f(x, y) at mesh point i. One can approximate

    ∂²u(x, y)/∂x² ≈ [ u(x − h, y) − 2 u(x, y) + u(x + h, y) ] / h²

and

    ∂²u(x, y)/∂y² ≈ [ u(x, y − h) − 2 u(x, y) + u(x, y + h) ] / h²

so that

    − ∂²u/∂x² − ∂²u/∂y² = f(x, y)

becomes

    [ −u(x − h, y) + 2 u(x, y) − u(x + h, y) ] / h² + [ −u(x, y − h) + 2 u(x, y) − u(x, y + h) ] / h² = f(x, y)

or, equivalently,

    [ −u(x − h, y) − u(x, y − h) + 4 u(x, y) − u(x + h, y) − u(x, y + h) ] / h² = f(x, y).

If (x, y) corresponds to the point i in a mesh where the interior points form an N × N grid, this translates to the system of linear equations

    −υ_{i−N} − υ_{i−1} + 4 υ_i − υ_{i+1} − υ_{i+N} = h² φ_i.
This can be rewritten as

    υ_i = ( h² φ_i + υ_{i−N} + υ_{i−1} + υ_{i+1} + υ_{i+N} ) / 4

or

     4υ_0 −  υ_1               −  υ_4                               = h² φ_0
    − υ_0 + 4υ_1 −  υ_2               −  υ_5                        = h² φ_1
           − υ_1 + 4υ_2 −  υ_3               −  υ_6                 = h² φ_2
                  − υ_2 + 4υ_3                      −  υ_7          = h² φ_3
    − υ_0               + 4υ_4 −  υ_5                      −  υ_8   = h² φ_4
           − υ_1           ⋱      ⋱      ⋱                    ⋱       ⋮

In matrix notation this becomes

    ⎛  4 −1  0  0 −1  0  0  0  0     ⎞ ⎛ υ_0 ⎞   ⎛ h² φ_0 ⎞
    ⎜ −1  4 −1  0  0 −1  0  0  0     ⎟ ⎜ υ_1 ⎟   ⎜ h² φ_1 ⎟
    ⎜  0 −1  4 −1  0  0 −1  0  0     ⎟ ⎜ υ_2 ⎟   ⎜ h² φ_2 ⎟
    ⎜  0  0 −1  4  0  0  0 −1  0     ⎟ ⎜ υ_3 ⎟   ⎜ h² φ_3 ⎟
    ⎜ −1  0  0  0  4 −1  0  0 −1     ⎟ ⎜ υ_4 ⎟ = ⎜ h² φ_4 ⎟        (7.4.1)
    ⎜  0 −1  0  0 −1  4 −1  0  0  ⋱  ⎟ ⎜ υ_5 ⎟   ⎜ h² φ_5 ⎟
    ⎜  0  0 −1  0  0 −1  4 −1  0     ⎟ ⎜ υ_6 ⎟   ⎜ h² φ_6 ⎟
    ⎜  0  0  0 −1  0  0 −1  4  0     ⎟ ⎜ υ_7 ⎟   ⎜ h² φ_7 ⎟
    ⎜  0  0  0  0 −1  0  0  0  4  ⋱  ⎟ ⎜ υ_8 ⎟   ⎜ h² φ_8 ⎟
    ⎝        ⋱        ⋱        ⋱   ⋱ ⎠ ⎝  ⋮  ⎠   ⎝   ⋮    ⎠

This demonstrates how solving the discretized Poisson's equation boils down to the solution of a linear system A u = h² f, where A has a distinct sparsity pattern (pattern of nonzeros).
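For readers who want to experiment, the following MATLAB sketch (ours, not from the text) builds the matrix of (7.4.1) for an N × N interior mesh using Kronecker products; it is only one of several ways to construct it and is not meant to replace the sparse-storage exercise in Homework 8.5.1.1.

    % A sketch (ours) that builds the N^2 x N^2 matrix in (7.4.1).
    N = 4;                                                   % interior mesh points per direction (assumed)
    I = speye( N );
    T = spdiags( ones( N, 1 ) * [ -1 2 -1 ], -1:1, N, N );   % 1D second-difference matrix
    A = kron( I, T ) + kron( T, I );                         % 2D: 4 on the diagonal, -1 to each neighbor
    spy( A )                                                 % visualize the sparsity pattern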

7.4.2 Parallelism in splitting methods


One of the advantages of, for example, the Jacobi iteration over the Gauss-Seidel iteration is
that the values at all mesh points can be updated simultaneously. This comes at the expense
of slower convergence to the solution.

There is actually quite a bit of parallelism to be exploited in the Gauss-Seidel iteration as well. Consider our example of a mesh on a square domain.

• First, υ_0^(1) is computed from υ_1^(0) and υ_N^(0).

• Second, simultaneously,

  ◦ υ_1^(1) can be computed from υ_0^(1), υ_2^(0), and υ_{N+1}^(0).
  ◦ υ_N^(1) can be computed from υ_0^(1), υ_{N+1}^(0), and υ_{2N}^(0).

• Third, simultaneously,

  ◦ υ_2^(1) can be computed from υ_1^(1), υ_3^(0), and υ_{N+2}^(0).
  ◦ υ_{N+1}^(1) can be computed from υ_1^(1), υ_N^(1), υ_{N+2}^(0), and υ_{2N+1}^(0).
  ◦ υ_{2N}^(1) can be computed from υ_N^(1), υ_{2N+1}^(0), and υ_{3N}^(0).
  ◦ AND υ_0^(2) can be computed from υ_1^(1) and υ_N^(1), which starts a new "wave."

What we notice is that taking the opportunity to update when data is ready creates wavefronts through the mesh, where each wavefront corresponds to computation related to a different iteration.
Alternatively, extra parallelism can be achieved by ordering the mesh points using what is called a red-black ordering. Again focusing on our example of a mesh placed on a domain, the idea is to partition the mesh points into two groups, where each group consists of points that are not adjacent in the mesh: the red points and the black points.
The iteration then proceeds by alternating between (simultaneously) updating all values at the red points and (simultaneously) updating all values at the black points, always using the most updated values.

7.4.3 Dr. SOR

YouTube: https://www.youtube.com/watch?v=WDsF7gaj4E4
SOR was first proposed in 1950 by David M. Young and Stanley P. Frankel. David
Young (1923-2008) was a colleague of ours at UT-Austin. His vanity license plate read "Dr.
SOR."

7.5 Wrap Up
7.5.1 Additional homework
Homework 7.5.1.1 In Subsection 7.3.4 we discussed SOR and SSOR. Research how to choose the relaxation parameter ω and then modify your implementation of Gauss-Seidel from Homework 7.3.2.1 to investigate the benefits.

7.5.2 Summary
Let A ∈ R^{n×n} be tridiagonal and SPD so that

    A = ⎛ α_{0,0}   α_{1,0}                                      ⎞
        ⎜ α_{1,0}   α_{1,1}   α_{2,1}                            ⎟
        ⎜             ⋱         ⋱          ⋱                     ⎟
        ⎜                  α_{n−2,n−3}  α_{n−2,n−2}  α_{n−1,n−2} ⎟
        ⎝                               α_{n−1,n−2}  α_{n−1,n−1} ⎠.

Then its Cholesky factor is given by

    ⎛ λ_{0,0}                                          ⎞
    ⎜ λ_{1,0}   λ_{1,1}                                ⎟
    ⎜             ⋱         ⋱                          ⎟
    ⎜                  λ_{n−2,n−3}  λ_{n−2,n−2}        ⎟
    ⎝                               λ_{n−1,n−2}  λ_{n−1,n−1} ⎠.

An algorithm for computing it is given by

    for i = 0, . . . , n − 2
        α_{i,i} := √(α_{i,i})
        α_{i+1,i} := α_{i+1,i} / α_{i,i}
        α_{i+1,i+1} := α_{i+1,i+1} − α_{i+1,i} α_{i+1,i}
    endfor
    α_{n−1,n−1} := √(α_{n−1,n−1})

It requires n square roots, n − 1 divides, n − 1 multiplies, and n − 1 subtracts. An algorithm for overwriting y with the solution to Ax = y given its Cholesky factor is given by

• Overwrite y with the solution of L z = y (forward substitution), accomplished by the following algorithm (here L has overwritten A):

    for i = 0, . . . , n − 2
        ψ_i := ψ_i / α_{i,i}
        ψ_{i+1} := ψ_{i+1} − α_{i+1,i} ψ_i
    endfor
    ψ_{n−1} := ψ_{n−1} / α_{n−1,n−1}

• Overwrite y with the solution of L^T x = z, where z has overwritten y (back substitution):

    for i = n − 1, . . . , 1
        ψ_i := ψ_i / α_{i,i}
        ψ_{i−1} := ψ_{i−1} − α_{i,i−1} ψ_i
    endfor
    ψ_0 := ψ_0 / α_{0,0}
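The two algorithms above translate directly into MATLAB. The sketch below (ours; the function name and interface are our own) stores only the diagonal d and subdiagonal e of A, overwrites them with the Cholesky factor, and then solves Ax = y.

    % A minimal MATLAB sketch (ours) of the tridiagonal Cholesky factorization and solve.
    function x = TriDiag_Cholesky_Solve( d, e, y )
      n = length( d );
      for i = 1:n-1                  % factorization: d becomes diag(L), e becomes subdiag(L)
        d( i ) = sqrt( d( i ) );
        e( i ) = e( i ) / d( i );
        d( i+1 ) = d( i+1 ) - e( i ) * e( i );
      end
      d( n ) = sqrt( d( n ) );
      for i = 1:n-1                  % forward substitution: y := L \ y
        y( i ) = y( i ) / d( i );
        y( i+1 ) = y( i+1 ) - e( i ) * y( i );
      end
      y( n ) = y( n ) / d( n );
      for i = n:-1:2                 % back substitution: y := L' \ y
        y( i ) = y( i ) / d( i );
        y( i-1 ) = y( i-1 ) - e( i-1 ) * y( i );
      end
      y( 1 ) = y( 1 ) / d( 1 );
      x = y;
    end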
Definition 7.5.2.1 The half-band width of a symmetric matrix equals the number of subdiagonals beyond which the matrix contains only zeroes. For example, a diagonal matrix has a half-band width of zero and a tridiagonal matrix has a half-band width of one. ⌃
Nested dissection: a hierarchical partitioning of the graph that captures the sparsity of a matrix, in an effort to reorder the rows and columns of that matrix so as to reduce fill-in (the overwriting of zeroes in the matrix with nonzeroes).
Splitting methods: To solve the system of linear equations Ax = y, splitting methods view A as A = M − N and then, given an initial approximation x^(0), create a sequence of approximations, x^(k), that under mild conditions converge to x by solving

    M x^(k+1) = N x^(k) + y

or, equivalently, computing

    x^(k+1) = M^{−1} (N x^(k) + y).

This method converges to x if for some norm ‖ · ‖

    ‖M^{−1} N‖ < 1.

Given A = D − L − U, where −L, D, and −U equal the strictly lower triangular, diagonal, and strictly upper triangular parts of A, commonly used splitting methods are

• Jacobi iteration: A = D − (L + U), with M = D and N = L + U.

• Gauss-Seidel iteration: A = (D − L) − U, with M = D − L and N = U.

• Successive Over-Relaxation (SOR): A = ( (1/ω) D − L ) − ( ((1 − ω)/ω) D + U ), with M = (1/ω) D − L and N = ((1 − ω)/ω) D + U, where ω is the relaxation parameter.

• Symmetric Successive Over-Relaxation (SSOR).


Week 8

Descent Methods

8.1 Opening
8.1.1 Solving linear systems by solving a minimization problem

YouTube: https://www.youtube.com/watch?v=--WEfBpj1Ts
Consider the quadratic polynomial

    f(χ) = (1/2) α χ² − β χ.

Finding the value χ̂ that minimizes this polynomial can be accomplished via the steps:

• Compute the derivative and set it to zero:

    f′(χ̂) = α χ̂ − β = 0.

We notice that computing χ̂ is equivalent to solving the linear system (of one equation)

    α χ̂ = β.

• It is a minimum if α > 0 (the quadratic polynomial is concave up).

Obviously, you can turn this around: in order to solve α χ̂ = β where α > 0, we can instead minimize the polynomial

    f(χ) = (1/2) α χ² − β χ.


This course does not have multivariate calculus as a prerequisite, so we will walk you through the basic results we will employ. We will focus on finding a solution to Ax = b where A is symmetric positive definite (SPD). (In our discussions we will just focus on real-valued problems.) Now, if

    f(x) = (1/2) x^T A x − x^T b,

then its gradient equals

    ∇f(x) = A x − b.

The function f(x) is minimized (when A is SPD) when its gradient equals zero, which allows us to compute the vector for which the function achieves its minimum. The basic insight is that in order to solve A x̂ = b we can instead find the vector x̂ that minimizes the function f(x) = (1/2) x^T A x − x^T b.

YouTube: https://www.youtube.com/watch?v=rh9GhwU1fuU
Theorem 8.1.1.1 Let A be SPD and assume that A x̂ = b. Then the vector x̂ minimizes the function f(x) = (1/2) x^T A x − x^T b.
Proof. This proof does not employ multivariate calculus!
Let A x̂ = b. Then

    f(x)
      =    < definition of f(x) >
    (1/2) x^T A x − x^T b
      =    < A x̂ = b >
    (1/2) x^T A x − x^T A x̂
      =    < algebra >
    (1/2) x^T A x − x^T A x̂ + (1/2) x̂^T A x̂ − (1/2) x̂^T A x̂
      =    < factor out >
    (1/2) (x − x̂)^T A (x − x̂) − (1/2) x̂^T A x̂.

Since x̂^T A x̂ is independent of x, and A is SPD, this is clearly minimized when x = x̂. ⌅
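A quick numerical sanity check of this theorem (ours, not part of the text): minimize f with a general-purpose optimizer and compare the result with the solution of Ax = b. The matrix, right-hand side, and tolerances here are arbitrary choices for illustration.

    % A sketch (ours): the minimizer of f(x) = 1/2 x'*A*x - x'*b agrees with A\b.
    n = 5;
    B = randn( n );
    A = B' * B + n * eye( n );                 % manufacture an SPD matrix
    b = randn( n, 1 );
    f = @( x ) 0.5 * x' * A * x - x' * b;
    xhat = fminsearch( f, zeros( n, 1 ) );     % derivative-free minimization
    relative_error = norm( xhat - A \ b ) / norm( A \ b )   % should be small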

8.1.2 Overview
• 8.1 Opening

¶ 8.1.1 Solving linear systems by solving a minimization problem



¶ 8.1.2 Overview
¶ 8.1.3 What you will learn

• 8.2 Search directions

¶ 8.2.1 Basics of descent methods


¶ 8.2.2 Toward practical descent methods
¶ 8.2.3 Relation to Splitting Methods
¶ 8.2.4 Method of Steepest Descent
¶ 8.2.5 Preconditioning

• 8.3 The Conjugate Gradient Method

¶ 8.3.1 A-conjugate directions


¶ 8.3.2 Existence of A-conjugate search directions
¶ 8.3.3 Conjugate Gradient Method Basics
¶ 8.3.4 Technical details
¶ 8.3.5 Practical Conjugate Gradient Method algorithm
¶ 8.3.6 Final touches for the Conjugate Gradient Method

• 8.4 Enrichments

¶ 8.4.1 Conjugate Gradient Method: Variations on a theme

• 8.5 Wrap Up

¶ 8.5.1 Additional homework


¶ 8.5.2 Summary

8.1.3 What you will learn


This week, you are introduced to additional techniques for solving sparse linear systems (or
any linear system where computing a matrix-vector multiplication with the matrix is cheap).
We discuss descent methods in general and the Conjugate Gradient Method in particular,
which is the most important member of this family of algorithms.
Upon completion of this week, you should be able to
• Relate solving a linear system of equations Ax = b, where A is symmetric positive definite (SPD), to finding the minimum of the function f(x) = (1/2) x^T A x − x^T b.

• Solve Ax = b via descent methods including the Conjugate Gradient Method.

• Exploit properties of A-conjugate search directions to morph the Method of Steepest


Descent into a practical Conjugate Gradient Method.

• Recognize that while in exact arithmetic the Conjugate Gradient Method solves Ax = b
in a finite number of iterations, in practice it is an iterative method due to error
introduced by floating point arithmetic.

• Accelerate the Method of Steepest Descent and the Conjugate Gradient Method by applying a preconditioner, which implicitly defines a new problem with the same solution and a better condition number.

8.2 Search directions


8.2.1 Basics of descent methods

YouTube: https://www.youtube.com/watch?v=V7Cvihzs-n4
Remark 8.2.1.1 In the video, the quadratic polynomial pictured takes on the value −x̂^T A x̂ at x̂ and that minimum is below the x-axis. This does not change the conclusions that are drawn in the video.
The basic idea behind a descent method is that at the kth iteration one has an approximation to x, x^(k), and one would like to create a better approximation, x^(k+1). To do so, the method picks a search direction, p^(k), and chooses the next approximation by taking a step from the current approximate solution in the direction of p^(k):

    x^(k+1) := x^(k) + α_k p^(k).

In other words, one searches for a minimum along a line defined by the current iterate, x^(k), and the search direction, p^(k). One then picks α_k so that, preferably, f(x^(k+1)) ≤ f(x^(k)). This is summarized in Figure 8.2.1.2.

    Given: A, b, x^(0)
    r^(0) := b − A x^(0)
    k := 0
    while r^(k) ≠ 0
        p^(k) := next direction
        x^(k+1) := x^(k) + α_k p^(k) for some scalar α_k
        r^(k+1) := b − A x^(k+1)
        k := k + 1
    endwhile
Figure 8.2.1.2 Outline for a descent method.
To this goal, typically, an exact descent method picks –k to exactly minimize the
function along the line from the current approximate solution in the direction of p(k) .

YouTube: https://www.youtube.com/watch?v=O1S x 3oAc8


Now,

    f(x^(k+1))
      =    < x^(k+1) = x^(k) + α_k p^(k) >
    f(x^(k) + α_k p^(k))
      =    < evaluate >
    (1/2) (x^(k) + α_k p^(k))^T A (x^(k) + α_k p^(k)) − (x^(k) + α_k p^(k))^T b
      =    < multiply out >
    (1/2) x^(k)T A x^(k) + α_k p^(k)T A x^(k) + (1/2) α_k² p^(k)T A p^(k) − x^(k)T b − α_k p^(k)T b
      =    < rearrange >
    (1/2) x^(k)T A x^(k) − x^(k)T b + (1/2) α_k² p^(k)T A p^(k) + α_k p^(k)T A x^(k) − α_k p^(k)T b
      =    < substitute f(x^(k)) and factor out common terms >
    f(x^(k)) + (1/2) α_k² p^(k)T A p^(k) + α_k p^(k)T (A x^(k) − b)
      =    < substitute r^(k) and commute to expose polynomial in α_k >
    (1/2) p^(k)T A p^(k) α_k² − p^(k)T r^(k) α_k + f(x^(k)),

where r^(k) = b − A x^(k) is the residual. This is a quadratic polynomial in the scalar α_k (since this is the only free variable).

YouTube: https://www.youtube.com/watch?v=SA_VrhP7EZg
Minimizing

    f(x^(k+1)) = (1/2) p^(k)T A p^(k) α_k² − p^(k)T r^(k) α_k + f(x^(k))

exactly requires the derivative with respect to α_k to be zero:

    0 = d f(x^(k) + α_k p^(k)) / d α_k = p^(k)T A p^(k) α_k − p^(k)T r^(k).

Hence, for a given choice of p^(k),

    α_k = p^(k)T r^(k) / ( p^(k)T A p^(k) )   and   x^(k+1) = x^(k) + α_k p^(k)

provides the next approximation to the solution. This leaves us with the question of how to pick the search directions {p^(0), p^(1), . . .}.
A basic descent method based on these ideas is given in Figure 8.2.1.3.
    Given: A, b, x^(0)
    r^(0) := b − A x^(0)
    k := 0
    while r^(k) ≠ 0
        p^(k) := next direction
        α_k := p^(k)T r^(k) / ( p^(k)T A p^(k) )
        x^(k+1) := x^(k) + α_k p^(k)
        r^(k+1) := b − A x^(k+1)
        k := k + 1
    endwhile
Figure 8.2.1.3 Basic descent method.
Homework 8.2.1.1 The cost of an iterative method is a combination of how many iterations it takes to converge and the cost per iteration. For the loop in Figure 8.2.1.3, count the number of matrix-vector multiplications, dot products, and "axpy" operations (not counting the cost of determining the next descent direction).

Solution.

    α_k := p^(k)T r^(k) / ( p^(k)T A p^(k) )     1 mvmult, 2 dot products
    x^(k+1) := x^(k) + α_k p^(k)                 1 axpy
    r^(k+1) := b − A x^(k+1)                     1 mvmult

Total: 2 matrix-vector multiplies (mvmults), 2 dot products, 1 axpy.

8.2.2 Toward practical descent methods

YouTube: https://www.youtube.com/watch?v=aBTI_EEQNKE
Even though matrices are often highly sparse, a major part of the cost of solving Ax = b
via descent methods is in the matrix-vector multiplication (a cost that is proportional to the
number of nonzeroes in the matrix). For this reason, reducing the number of these is an
important part of the design of the algorithm.
Homework 8.2.2.1 Let

    x^(k+1) = x^(k) + α_k p^(k)
    r^(k) = b − A x^(k)
    r^(k+1) = b − A x^(k+1)

Show that

    r^(k+1) = r^(k) − α_k A p^(k).
Solution.

    r^(k+1) = b − A x^(k+1)
      =    < r^(k) = b − A x^(k) >
    r^(k+1) = r^(k) + A x^(k) − A x^(k+1)
      =    < rearrange, factor >
    r^(k+1) = r^(k) − A (x^(k+1) − x^(k))
      =    < x^(k+1) = x^(k) + α_k p^(k) >
    r^(k+1) = r^(k) − α_k A p^(k)

Alternatively:

    r^(k+1) = b − A x^(k+1)
      =    < x^(k+1) = x^(k) + α_k p^(k) >
    r^(k+1) = b − A (x^(k) + α_k p^(k))
      =    < distribute >
    r^(k+1) = b − A x^(k) − α_k A p^(k)
      =    < definition of r^(k) >
    r^(k+1) = r^(k) − α_k A p^(k)

YouTube: https://www.youtube.com/watch?v=j00GS9mTgd8
With the insights from this last homework, we can reformulate our basic descent method into one with only one matrix-vector multiplication, as illustrated in Figure 8.2.2.1.

Left (basic descent method from the last unit):

    Given: A, b, x^(0)
    r^(0) := b − A x^(0)
    k := 0
    while r^(k) ≠ 0
        p^(k) := next direction
        α_k := p^(k)T r^(k) / ( p^(k)T A p^(k) )
        x^(k+1) := x^(k) + α_k p^(k)
        r^(k+1) := b − A x^(k+1)
        k := k + 1
    endwhile

Middle (recasts the computation of the residual r^(k+1) as an update of the previous residual r^(k)):

    Given: A, b, x^(0)
    r^(0) := b − A x^(0)
    k := 0
    while r^(k) ≠ 0
        p^(k) := next direction
        α_k := p^(k)T r^(k) / ( p^(k)T A p^(k) )
        x^(k+1) := x^(k) + α_k p^(k)
        r^(k+1) := r^(k) − α_k A p^(k)
        k := k + 1
    endwhile

Right (reduces the number of matrix-vector multiplications by introducing the temporary vector q^(k)):

    Given: A, b, x^(0)
    r^(0) := b − A x^(0)
    k := 0
    while r^(k) ≠ 0
        p^(k) := next direction
        q^(k) := A p^(k)
        α_k := p^(k)T r^(k) / ( p^(k)T q^(k) )
        x^(k+1) := x^(k) + α_k p^(k)
        r^(k+1) := r^(k) − α_k q^(k)
        k := k + 1
    endwhile

Figure 8.2.2.1 Left: Basic descent method from the last unit. Middle: Minor modification that recasts the computation of the residual r^(k+1) as an update of the previous residual r^(k). Right: Modification that reduces the number of matrix-vector multiplications by introducing the temporary vector q^(k).

Homework 8.2.2.2 For the loop in the algorithm in Figure 8.2.2.1 (Right), count the number of matrix-vector multiplications, dot products, and "axpy" operations (not counting the cost of determining the next descent direction).
Solution.

    q^(k) := A p^(k)                             1 mvmult
    α_k := p^(k)T r^(k) / ( p^(k)T q^(k) )       2 dot products
    x^(k+1) := x^(k) + α_k p^(k)                 1 axpy
    r^(k+1) := r^(k) − α_k q^(k)                 1 axpy

Total: 1 mvmult, 2 dot products, 2 axpys.

YouTube: https://www.youtube.com/watch?v=OGqV_hfaxJA
We finish our discussion regarding basic descent methods by observing that we don't need to keep the history of vectors x^(k), p^(k), r^(k), q^(k), and scalars α_k that were computed, as long as they are not needed to compute the next search direction, leaving us with the algorithm

    Given: A, b, x
    r := b − A x
    while r ≠ 0
        p := next direction
        q := A p
        α := p^T r / ( p^T q )
        x := x + α p
        r := r − α q
    endwhile

Figure 8.2.2.2 The algorithm from Figure 8.2.2.1 (Right) storing only the most current vectors and scalar.
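A minimal MATLAB sketch (ours) of the loop in Figure 8.2.2.2 follows. The function handle next_direction, the tolerance tol, and the iteration cap max_its are our own additions to make the sketch runnable; the text's loop runs until the residual is exactly zero.

    % A sketch (ours) of a generic descent method with a pluggable search direction.
    function x = Descent( A, b, x, next_direction, tol, max_its )
      r = b - A * x;
      p = [];
      for k = 1:max_its
        if norm( r ) <= tol, break, end
        p = next_direction( A, r, p );   % e.g., @( A, r, p ) r for steepest descent
        q = A * p;
        alpha = ( p' * r ) / ( p' * q );
        x = x + alpha * p;
        r = r - alpha * q;
      end
    end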

8.2.3 Relation to Splitting Methods

YouTube: https://www.youtube.com/watch?v=ifwai OB1EI


Let us pick some really simple search directions in the right-most algorithm in Homework 8.2.2.2: p^(k) = e_{k mod n}, which cycles through the standard basis vectors.
Homework 8.2.3.1 For the right-most algorithm in Homework 8.2.2.2, show that if p^(0) = e_0, then

    χ_0^(1) = χ_0^(0) + (1/α_{0,0}) ( β_0 − ∑_{j=0}^{n−1} α_{0,j} χ_j^(0) ) = (1/α_{0,0}) ( β_0 − ∑_{j=1}^{n−1} α_{0,j} χ_j^(0) ).

Solution.

• p^(0) = e_0.

• p^(0)T A p^(0) = e_0^T A e_0 = α_{0,0} (the (0,0) element in A, not to be mistaken for α_0).

• r^(0) = b − A x^(0).

• p^(0)T r^(0) = e_0^T (b − A x^(0)) = e_0^T b − e_0^T A x^(0) = β_0 − ã_0^T x^(0), where ã_k^T denotes the kth row of A.

• x^(1) = x^(0) + α_0 p^(0) = x^(0) + ( p^(0)T r^(0) / ( p^(0)T A p^(0) ) ) e_0 = x^(0) + ( (β_0 − ã_0^T x^(0)) / α_{0,0} ) e_0. This means that only the first element of x^(0) changes, and it changes to

    χ_0^(1) = χ_0^(0) + (1/α_{0,0}) ( β_0 − ∑_{j=0}^{n−1} α_{0,j} χ_j^(0) ) = (1/α_{0,0}) ( β_0 − ∑_{j=1}^{n−1} α_{0,j} χ_j^(0) ).

This looks familiar...

YouTube: https://www.youtube.com/watch?v=karx3stbVdE
Careful contemplation of the last homework reveals that this is exactly how the first element in vector x, χ_0, is changed in the Gauss-Seidel method!
Ponder This 8.2.3.2 Continue the above argument to show that this choice of descent
directions yields the Gauss-Seidel iteration.

8.2.4 Method of Steepest Descent

YouTube: https://www.youtube.com/watch?v=tOqAd1OhIwc
For a function f : R^n → R that we are trying to minimize, for a given x, the direction in which the function most rapidly increases in value at x is given by its gradient,

    ∇f(x).

Thus, the direction in which it decreases most rapidly is

    −∇f(x).

For our function

    f(x) = (1/2) x^T A x − x^T b

this direction of steepest descent is given by

    −∇f(x) = −(A x − b) = b − A x,

which we recognize as the residual. Thus, recalling that r^(k) = b − A x^(k), the direction of steepest descent at x^(k) is given by p^(k) = r^(k) = b − A x^(k). These insights motivate the algorithms in Figure 8.2.4.1.
With indices:

    Given: A, b, x^(0)
    r^(0) := b − A x^(0)
    k := 0
    while r^(k) ≠ 0
        p^(k) := r^(k)
        q^(k) := A p^(k)
        α_k := p^(k)T r^(k) / ( p^(k)T q^(k) )
        x^(k+1) := x^(k) + α_k p^(k)
        r^(k+1) := r^(k) − α_k q^(k)
        k := k + 1
    endwhile

Without indices:

    Given: A, b, x
    k := 0
    r := b − A x
    while r ≠ 0
        p := r
        q := A p
        α := p^T r / ( p^T q )
        x := x + α p
        r := r − α q
        k := k + 1
    endwhile

Figure 8.2.4.1 Steepest descent algorithm, with indices and without indices.
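In MATLAB, the index-free algorithm might be implemented as in the following sketch (ours; the stopping criterion with tol and max_its anticipates the discussion in Subsection 8.3.6).

    % A minimal sketch (ours) of the Method of Steepest Descent in Figure 8.2.4.1.
    function x = Steepest_Descent( A, b, x, tol, max_its )
      r = b - A * x;
      for k = 1:max_its
        if norm( r ) <= tol, break, end
        p = r;                            % direction of steepest descent
        q = A * p;
        alpha = ( p' * r ) / ( p' * q );
        x = x + alpha * p;
        r = r - alpha * q;
      end
    end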

8.2.5 Preconditioning

YouTube: https://www.youtube.com/watch?v=i-83HdtrI1M
For a general (appropriately differentiable) nonlinear function f(x), using the direction of steepest descent as the search direction is often a reasonable choice. For our problem, especially if A is relatively ill-conditioned, we can do better.
Here is the idea: Let A = Q Σ Q^T be the SVD of the SPD matrix A (or, equivalently for SPD matrices, its spectral decomposition, which we will discuss in Subsection 9.2.4). Then

    f(x) = (1/2) x^T A x − x^T b = (1/2) x^T Q Σ Q^T x − x^T Q Q^T b.

Using the change of basis y = Q^T x and b̂ = Q^T b, then

    g(y) = (1/2) y^T Σ y − y^T b̂.

How this relates to the convergence of the Method of Steepest Descent is discussed (informally) in the video. The key insight is that if κ(A) = σ_0/σ_{n−1} (the ratio between the largest and smallest eigenvalues or, equivalently, the ratio between the largest and smallest singular values) is large, then convergence can take many iterations.
What would happen if instead σ_0 = · · · = σ_{n−1}? Then A = Q Σ Q^T is the SVD/spectral decomposition of A and A = Q (σ_0 I) Q^T. If we then perform the Method of Steepest Descent with y (the transformed vector x) and b̂ (the transformed right-hand side), then

    y^(1)
      =
    y^(0) + ( r^(0)T r^(0) / ( r^(0)T σ_0 I r^(0) ) ) r^(0)
      =
    y^(0) + (1/σ_0) r^(0)
      =
    y^(0) + (1/σ_0) ( b̂ − σ_0 y^(0) )
      =
    (1/σ_0) b̂,

which is the solution to σ_0 I y = b̂. Thus, the iteration converges in one step. The point we are trying to (informally) make is that if A is well-conditioned, then the Method of Steepest Descent converges faster.

Now, Ax = b is equivalent to M^{−1} A x = M^{−1} b. Hence, one can define a new problem with the same solution and, hopefully, a better condition number by letting Ã = M^{−1} A and b̃ = M^{−1} b. A better condition number results if M ≈ A, since then M^{−1} A ≈ A^{−1} A ≈ I. A constraint is that M should be chosen so that solving with it is easy/cheap. The matrix M is called a preconditioner.
A problem is that, in our discussion of descent methods, we restrict ourselves to the case where the matrix is SPD. Generally speaking, M^{−1} A will not be SPD. To fix this, choose M ≈ A to be SPD and let M = L_M L_M^T equal its Cholesky factorization. If A = L L^T is the Cholesky factorization of A, then L_M^{−1} A L_M^{−T} ≈ L_M^{−1} L L^T L_M^{−T} ≈ I. With this, we can transform our linear system Ax = b into one that has the same solution:

    ( L_M^{−1} A L_M^{−T} ) ( L_M^T x ) = L_M^{−1} b,   i.e.,   Ã x̃ = b̃.

We note that Ã is SPD and hence one can apply the Method of Steepest Descent to Ã x̃ = b̃, where Ã = L_M^{−1} A L_M^{−T}, x̃ = L_M^T x, and b̃ = L_M^{−1} b. Once the method converges to the solution x̃, one can transform that solution back to the solution of the original problem by solving L_M^T x = x̃. If M is chosen carefully, κ(L_M^{−1} A L_M^{−T}) can be greatly improved. The best choice would be M = A, of course, but that is not realistic. The point is that in our case where A is SPD, ideally the preconditioner should be SPD.
Some careful rearrangement takes the Method of Steepest Descent on the transformed problem to the much simpler preconditioned algorithm on the right in Figure 8.2.5.1.

Left:

    Given: A, b, x^(0)
    r^(0) := b − A x^(0)
    k := 0
    while r^(k) ≠ 0
        p^(k) := r^(k)
        q^(k) := A p^(k)
        α_k := p^(k)T r^(k) / ( p^(k)T q^(k) )
        x^(k+1) := x^(k) + α_k p^(k)
        r^(k+1) := r^(k) − α_k q^(k)
        k := k + 1
    endwhile

Middle:

    Given: A, b, x^(0), M = L L^T
    Ã = L^{−1} A L^{−T}
    b̃ = L^{−1} b
    x̃^(0) = L^T x^(0)
    r̃^(0) := b̃ − Ã x̃^(0)
    k := 0
    while r̃^(k) ≠ 0
        p̃^(k) := r̃^(k)
        q̃^(k) := Ã p̃^(k)
        α̃_k := p̃^(k)T r̃^(k) / ( p̃^(k)T q̃^(k) )
        x̃^(k+1) := x̃^(k) + α̃_k p̃^(k)
        r̃^(k+1) := r̃^(k) − α̃_k q̃^(k)
        x^(k+1) = L^{−T} x̃^(k+1)
        k := k + 1
    endwhile

Right:

    Given: A, b, x^(0), M
    r^(0) := b − A x^(0)
    k := 0
    while r^(k) ≠ 0
        p^(k) := M^{−1} r^(k)
        q^(k) := A p^(k)
        α_k := p^(k)T r^(k) / ( p^(k)T q^(k) )
        x^(k+1) := x^(k) + α_k p^(k)
        r^(k+1) := r^(k) − α_k q^(k)
        k := k + 1
    endwhile

Figure 8.2.5.1 Left: method of steepest descent. Middle: method of steepest descent with transformed problem. Right: preconditioned method of steepest descent. It can be checked that the x^(k) computed by the middle algorithm is exactly the x^(k) computed by the one on the right. Of course, the computation x^(k+1) = L^{−T} x̃^(k+1) needs only be done once, after convergence, in the algorithm in the middle. We state it this way to facilitate Homework 8.2.5.1.
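A sketch (ours) of the preconditioned algorithm on the right of Figure 8.2.5.1 in MATLAB follows. Here the preconditioner M is applied with backslash purely for illustration; in practice one applies a cheap solve with M. The tolerance and iteration cap are our own additions.

    % A minimal sketch (ours) of the preconditioned Method of Steepest Descent.
    function x = Preconditioned_Steepest_Descent( A, b, x, M, tol, max_its )
      r = b - A * x;
      for k = 1:max_its
        if norm( r ) <= tol, break, end
        p = M \ r;                        % preconditioned direction
        q = A * p;
        alpha = ( p' * r ) / ( p' * q );
        x = x + alpha * p;
        r = r - alpha * q;
      end
    end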
Homework 8.2.5.1 Show that the algorithm in Figure 8.2.5.1 (Middle) computes the same values for x^(k) as does the algorithm to its right.
Hint. You will want to do a proof by induction. To start, conjecture a relationship between r̃^(k) and r^(k) and then prove that that relationship, and the relationship x^(k) = L^{−T} x̃^(k), hold for all k, where r^(k) and x^(k) are as computed by the algorithm on the right.
Solution 1. Notice that à = L≠1 AL≠T implies that ÃLT = L≠1 A. We will show that for
all k Ø 0

• x̃(k) = LT x(k)

• r̃(k) = L≠1 r(k) ,

• p̃(k) = LT p(k) ,

• –
˜ k = –k

via a proof by induction.

• Base case: k = 0.

¶ x̃(0) is initialized as x̃(0) := LT x(0) .



¶ r̃(0)
= < algorithm on left >
(0)
b̃ ≠ Ãx̃
= < initialization of b̃ and x̃(0) >
L≠1 b ≠ ÃLT x(0)
= < initialization of à >
L≠1 b ≠ L≠1 Ax(0)
= < factor out and initialization of r(0) >
≠1 (0)
L r
¶ p̃(0)
= < initialization in algorithm >
(0)

= < r̃(0) = L≠1 r(0) >
≠1 (0)
L r
= < from right algorithm: r(k) = M p(k) and M = LLT >
≠1 T (0)
L LL p
= < L≠1 L = I >
= LT p(0) .
˜0
¶ –
= < middle algorithm >
p̃(0) T r̃(0)
p̃(0) T Ãp̃(0)
= < p̃(0) = LT p(0) etc. >
(LT p(0) )T L≠1 r (0)
(LT p(0) )T L≠1 AL≠T LT p(0)
= < transpose and cancel >
p(0) T r(0)
p(0) T Ap(0)
= < right algorithm >
–0 .

• Inductive Step: Assume that x̃(k) = LT x(k) , r̃(k) = L≠1 r(k) , p̃(k) = LT p(k) , and –˜ k = –k .
Show that x̃(k+1) = LT x(k+1) , r̃(k+1) = L≠1 r(k+1) , p̃(k+1) = LT p(k+1) , and –˜ k+1 = –k+1 .

¶ x̃(k+1)
= middle algorithm
x̃(k) + –˜ k p̃(k)
= < I.H. >
LT x(k) + –k LT p(k)
= < factor out; right algorithm >
LT x(k+1)

¶ r̃(k+1)
= < middle algorithm >
(k)
r̃ ≠ – ˜ k Ãp̃(k)
= < I.H. >
L r ≠ –k L≠1 AL≠T LT p(k)
≠1 (k)

= < L≠T LT = I; factor out; right algorithmn >


≠1 (k+1)
L r
¶ p̃(k+1)
= < middle algorithm >
(k+1)

= < r̃(k+1) = L≠1 r(k+1) >
≠1 (k+1)
L r
= < from right algorithm: r(k+1) = M p(k+1) and M = LLT >
≠1 T (k+1)
L LL p
= < L≠1 L = I >
= LT p(k+1) .
˜ k+1
¶ –
= < middle algorithm >
p̃(k+1) T r̃(k+1)
p̃(k+1) T Ãp̃(k+1)
= < p̃(k+1) = LT p(k+1) etc. >
(LT p(k+1) )T L≠1 r (k+1)
(LT p(k+1) )T L≠1 AL≠T LT p(k+1)
= < transpose and cancel >
p(k+1) T r(k+1)
p(k+1) T Ap(k+1)
= < right algorithm >
–k+1 .

• By the Principle of Mathematical Induction the result holds.



Solution 2 (Constructive solution). Let’s start with the algorithm in the middle:

Given : A, b, x(0) ,
M = LLT
à = L≠1 AL≠T
b̃ = L≠1 b
x̃(0) = LT x(0)
r̃(0) := b̃ ≠ Ãx̃(0)
k := 0
while r̃(k) ”= 0
p̃(k) := r̃(k)
q̃ (k) := Ãp̃(k)
(k) T (k)
˜ k := p̃p̃(k) T q̃r̃(k)

x̃(k+1) := x̃(k) + – ˜ k p̃(k)
r̃ (k+1)
:= r̃ ≠ –
(k)
˜ k q̃ (k)
x (k+1)
= L x̃ ≠T (k+1)

k := k + 1
endwhile

We now notice that à = L≠1 AL≠T and we can substitute this into the algorithm:

Given : A, b, x(0) ,
M = LLT
b̃ = L≠1 b
x̃(0) = LT x(0)
r̃(0) := b̃ ≠ L≠1 AL≠T x̃(0)
k := 0
while r̃(k) ”= 0
p̃(k) := r̃(k)
q̃ (k) := L≠1 AL≠T p̃(k)
(k) T (k)
˜ k := p̃p̃(k) T q̃r̃(k)

x̃(k+1) := x̃(k) + – ˜ k p̃(k)
r̃ (k+1)
:= r̃ ≠ –
(k)
˜ k q̃ (k)
x (k+1)
= L x̃ ≠T (k+1)

k := k + 1
endwhile

Next, we notice that x(k+1) = L≠T x̃(k+1) or, equivalently,

x̃(k) = LT x(k) .

We substitute that

Given : A, b, x(0) , or, equivalently Given : A, b, x(0) ,


M = LL T
M = LLT
b̃ = L b ≠1
b̃ = L≠1 b
LT x(0) = LT x(0) r̃(0) := b̃ ≠ L≠1 Ax(0)
r̃ := b̃ ≠ L AL L x
(0) ≠1 ≠T T (0)
k := 0
k := 0 while r̃(k) ”= 0
while r̃ ”= 0(k)
p̃(k) := r̃(k)
p̃ := r̃
(k) (k)
q̃ (k) := L≠1 AL≠T p̃(k)
q̃ (k) := L≠1 AL≠T p̃(k) (k) T (k)
˜ k := p̃p̃(k) T q̃r̃(k)

(k) T (k)
˜ k := p̃p̃(k) T q̃r̃(k)
– x(k+1) := x(k) + – ˜ k L≠T p̃(k)
L x T (k+1)
:= L x + – T (k)
˜ k p̃ (k)
r̃ (k+1)
:= r̃ ≠ –
(k)
˜ k q̃ (k)
r̃ (k+1)
:= r̃ ≠ –
(k)
˜ k q̃ (k)
k := k + 1
x (k+1)
= L x̃≠T (k+1) endwhile
k := k + 1
endwhile

Now, we exploit that b̃ = L≠1 b and r̃(k) equals the residual b̃≠Ãx̃(k) = L≠1 b≠L≠1 AL≠T LT x(k) =
L≠1 (b ≠ Ax(k) ) = L≠1 r(k) . Substituting these insights in gives us

Given : A, b, x(0) , or, equivalently Given : A, b, x(0) ,


M = LL T
M = LLT
L≠1 b = L≠1 b r(0) := b ≠ Ax(0)
L r := L (b ≠ Ax )
≠1 (0) ≠1 (0)
k := 0
k := 0 while r(k) ”= 0
while L≠1 r(k) ”= 0 p̃(k) := L≠1 r(k)
p̃ := L r
(k) ≠1 (k)
q̃ (k) := L≠1 AL≠T p̃(k)
q̃ := L AL p̃
(k) ≠1 ≠T (k)

(k) T ≠1 r (k)
˜ k := p̃ p̃(k)LT q̃(k)
(k) (k)
˜ k := p̃ p̃(k)LT q̃(k)
T ≠1
x(k+1) := x(k) + – ˜ k L≠T p̃(k)
r

x (k+1)
:= x + –
(k)
˜ k L p̃
≠T (k)
r (k+1)
:= r ≠ –
(k)
˜ k Lq̃ (k)
L r≠1 (k+1)
:= L r ≠ –
≠1 (k)
˜ k q̃ (k)
k := k + 1
k := k + 1 endwhile
endwhile

Now choose p̃(k) = LT p(k) so that AL≠T p̃(k) becomes Ap(k) :

Given : A, b, x(0) , or, equivalently Given : A, b, x(0) ,


M = LLT M = LLT
r := b ≠ Ax
(0) (0)
r := b ≠ Ax(0)
(0)

k := 0 k := 0
while r(k) ”= 0 while r(k) ”= 0
p(k) := L≠T L≠1 r(k) p(k) := M ≠1 r(k)
q̃ := L Ap
(k) ≠1 (k)
q̃ (k) := L≠1 Ap(k)
(k) (k) (k) T r (k)
˜ k := (L(LpT p(k)) ))LT q̃(k) ˜ k := pp(k) T Lq̃
T T ≠1 r
– – (k)

x (k+1)
:= x + –
(k)
˜k L L p≠T T (k) x (k+1)
:= x + –
(k)
˜ k p(k)
r (k+1)
:= r ≠ –
(k)
˜ k Lq̃ (k) r (k+1)
:= r ≠ –
(k)
˜ k Lq̃ (k)
k := k + 1 k := k + 1
endwhile endwhile

Finally, if we choose Lq̃ (k) = q (k) and –


˜ k = –k we end up with

Given : A, b, x(0) ,
M = LLT
r := b ≠ Ax(0)
(0)

k := 0
while r(k) ”= 0
p(k) := M ≠1 r(k)
q (k) := Ap(k)
(k) T (k)
–k := pp(k) T qr(k)
x(k+1) := x(k) + –k p(k)
r(k+1) := r(k) ≠ –k q (k)
k := k + 1
endwhile

8.3 The Conjugate Gradient Method


8.3.1 A-conjugate directions

YouTube: https://www.youtube.com/watch?v=9-SyyJv0XuU

Let's start our generic descent method algorithm with x^(0) = 0. Here we do not use the temporary vector q^(k) = A p^(k) so that later we can emphasize how to cast the Conjugate Gradient Method in terms of as few matrix-vector multiplications as possible (one, to be exact).

Left (with indices):

    Given: A, b
    x^(0) := 0
    r^(0) := b − A x^(0) (= b)
    k := 0
    while r^(k) ≠ 0
        p^(k) := next direction
        α_k := p^(k)T r^(k) / ( p^(k)T A p^(k) )
        x^(k+1) := x^(k) + α_k p^(k)
        r^(k+1) := r^(k) − α_k A p^(k)
        k := k + 1
    endwhile

Right (without indices):

    Given: A, b
    x := 0
    r := b
    while r ≠ 0
        p := next direction
        α := p^T r / ( p^T A p )
        x := x + α p
        r := r − α A p
    endwhile

Figure 8.3.1.1 Generic descent algorithm started with x^(0) = 0. Left: with indices. Right: without indices.
Now, since x^(0) = 0, clearly

    x^(k+1) = α_0 p^(0) + · · · + α_k p^(k).

Thus, x^(k+1) ∈ Span(p^(0), . . . , p^(k)).
It would be nice if after the kth iteration

    f(x^(k+1)) = min_{x ∈ Span(p^(0),...,p^(k))} f(x)        (8.3.1)

and the search directions were linearly independent. Then the resulting descent method, in exact arithmetic, is guaranteed to complete in at most n iterations. This is because then

    Span(p^(0), . . . , p^(n−1)) = R^n

so that

    f(x^(n)) = min_{x ∈ Span(p^(0),...,p^(n−1))} f(x) = min_{x ∈ R^n} f(x)

and hence A x^(n) = b.
Unfortunately, the Method of Steepest Descent does not have this property. The next approximation to the solution, x^(k+1), minimizes f(x) where x is constrained to be on the line x^(k) + α p^(k). Because in each step f(x^(k+1)) ≤ f(x^(k)), a slightly stronger result holds: It also minimizes f(x) where x is constrained to be on the union of lines x^(j) + α p^(j), j = 0, . . . , k. However, unless we pick the search directions very carefully, that is not the same as minimizing over all vectors in Span(p^(0), . . . , p^(k)).

YouTube: https://www.youtube.com/watch?v=j8uNP7zjdv8
We can write (8.3.1) more concisely: Let

    P^(k−1) = ( p^(0)  p^(1)  · · ·  p^(k−1) )

be the matrix that holds the history of all search directions so far (as its columns). Then, letting

    a^(k−1) = ( α_0, . . . , α_{k−1} )^T,

we notice that

    x^(k) = ( p^(0)  · · ·  p^(k−1) ) ( α_0, . . . , α_{k−1} )^T = P^(k−1) a^(k−1).        (8.3.2)
Homework 8.3.1.1 Let p^(k) be a new search direction that is linearly independent of the columns of P^(k−1), which themselves are linearly independent. Show that

    min_{x ∈ Span(p^(0),...,p^(k−1),p^(k))} f(x)
        = min_y f(P^(k) y)
        = min_y [ (1/2) y_0^T P^(k−1)T A P^(k−1) y_0 − y_0^T P^(k−1)T b
                  + ψ_1 y_0^T P^(k−1)T A p^(k) + (1/2) ψ_1² p^(k)T A p^(k) − ψ_1 p^(k)T b ],

where y = ( y_0 ; ψ_1 ) ∈ R^{k+1}.
Hint.

    x ∈ Span(p^(0), . . . , p^(k−1), p^(k))

if and only if there exists

    y = ( y_0 ; ψ_1 ) ∈ R^{k+1}  such that  x = ( P^(k−1)  p^(k) ) ( y_0 ; ψ_1 ).

Solution.

    min_{x ∈ Span(p^(0),...,p^(k−1),p^(k))} f(x)
      =    < equivalent formulation >
    min_y f( ( P^(k−1)  p^(k) ) y )
      =    < partition y = ( y_0 ; ψ_1 ) >
    min_y f( P^(k−1) y_0 + ψ_1 p^(k) )
      =    < instantiate f >
    min_y [ (1/2) ( P^(k−1) y_0 + ψ_1 p^(k) )^T A ( P^(k−1) y_0 + ψ_1 p^(k) ) − ( P^(k−1) y_0 + ψ_1 p^(k) )^T b ]
      =    < multiply out >
    min_y [ (1/2) y_0^T P^(k−1)T A P^(k−1) y_0 + ψ_1 y_0^T P^(k−1)T A p^(k) + (1/2) ψ_1² p^(k)T A p^(k) − y_0^T P^(k−1)T b − ψ_1 p^(k)T b ]
      =    < rearrange >
    min_y [ (1/2) y_0^T P^(k−1)T A P^(k−1) y_0 − y_0^T P^(k−1)T b + ψ_1 y_0^T P^(k−1)T A p^(k) + (1/2) ψ_1² p^(k)T A p^(k) − ψ_1 p^(k)T b ].

YouTube: https://www.youtube.com/watch?v=5eNmr776GJY
Now, if

    P^(k−1)T A p^(k) = 0

then

    min_{x ∈ Span(p^(0),...,p^(k−1),p^(k))} f(x)
      =    < from before; the term ψ_1 y_0^T P^(k−1)T A p^(k) is zero >
    min_y [ (1/2) y_0^T P^(k−1)T A P^(k−1) y_0 − y_0^T P^(k−1)T b + (1/2) ψ_1² p^(k)T A p^(k) − ψ_1 p^(k)T b ]
      =    < split into two terms that can be minimized separately >
    min_{y_0} [ (1/2) y_0^T P^(k−1)T A P^(k−1) y_0 − y_0^T P^(k−1)T b ] + min_{ψ_1} [ (1/2) ψ_1² p^(k)T A p^(k) − ψ_1 p^(k)T b ]
      =    < recognize the first set of terms as f(P^(k−1) y_0) >
    min_{x ∈ Span(p^(0),...,p^(k−1))} f(x) + min_{ψ_1} [ (1/2) ψ_1² p^(k)T A p^(k) − ψ_1 p^(k)T b ].

The minimizing ψ_1 is given by

    ψ_1 = p^(k)T b / ( p^(k)T A p^(k) ).

If we pick p^(k) to be such a direction and α_k = ψ_1, then

    x^(k+1) = P^(k−1) y_0 + ψ_1 p^(k) = α_0 p^(0) + · · · + α_{k−1} p^(k−1) + α_k p^(k) = x^(k) + α_k p^(k).

A sequence of such directions is said to be A-conjugate.
Definition 8.3.1.2 A-conjugate directions. Let A be SPD. A sequence p^(0), . . . , p^(k−1) ∈ R^n such that p^(j)T A p^(i) = 0 if and only if j ≠ i is said to be A-conjugate. ⌃

YouTube: https://www.youtube.com/watch?v=70t6zgeMHs8
Homework 8.3.1.2 Let A ∈ R^{n×n} be SPD.
ALWAYS/SOMETIMES/NEVER: The columns of P ∈ R^{n×k} are A-conjugate if and only if P^T A P = D where D is diagonal and has positive values on its diagonal.
Answer. ALWAYS
Now prove it.

Solution.

    P^T A P
      =    < partition P by columns >
    ( p_0  · · ·  p_{k−1} )^T A ( p_0  · · ·  p_{k−1} )
      =    < transpose and multiply out >
    ⎛ p_0^T A p_0        p_0^T A p_1        · · ·  p_0^T A p_{k−1}     ⎞
    ⎜ p_1^T A p_0        p_1^T A p_1        · · ·  p_1^T A p_{k−1}     ⎟
    ⎜      ⋮                  ⋮                         ⋮              ⎟
    ⎝ p_{k−1}^T A p_0    p_{k−1}^T A p_1    · · ·  p_{k−1}^T A p_{k−1} ⎠.

The columns of P are A-conjugate if and only if the off-diagonal entries p_i^T A p_j (i ≠ j) are all zero, in which case this equals

    ⎛ p_0^T A p_0       0              · · ·  0                   ⎞
    ⎜ 0                 p_1^T A p_1    · · ·  0                   ⎟
    ⎜ ⋮                      ⋮          ⋱      ⋮                   ⎟
    ⎝ 0                 0              · · ·  p_{k−1}^T A p_{k−1} ⎠,

which is a diagonal matrix and its diagonal elements are positive since A is SPD.
Homework 8.3.1.3 Let A ∈ R^{n×n} be SPD and the columns of P ∈ R^{n×k} be A-conjugate.
ALWAYS/SOMETIMES/NEVER: The columns of P are linearly independent.
Answer. ALWAYS
Now prove it!
Solution. We employ a proof by contradiction. Suppose the columns of P are not linearly independent. Then there exists y ≠ 0 such that P y = 0. Let D = P^T A P. From the last homework we know that D is diagonal and has positive diagonal elements. But then

    0
      =    < P y = 0 >
    (P y)^T A (P y)
      =    < multiply out >
    y^T P^T A P y
      =    < P^T A P = D >
    y^T D y
      >    < D is SPD >
    0,

which is a contradiction. Hence, the columns of P are linearly independent.

The above observations leaves us with a descent method that picks the search directions
to be A-conjugate, given in Figure 8.3.1.3.
Given : A, b
x(0) := 0
r(0) = b
k := 0
while r(k) ”= 0
Choose p(k) such that p(k) T AP (k≠1) = 0 and p(k) T r(k) ”= 0
(k) T r (k)
–k := pp(k) T Ap (k)

x(k+1) := x(k) + –k p(k)


r(k+1) := r(k) ≠ –k Ap(k)
k := k + 1
endwhile
Figure 8.3.1.3 Basic method that chooses the search directions to be A-conjugate.

Remark 8.3.1.4 The important observation is that if p(0) , . . . , p(k) are chosen to be A-
conjugate, then x(k+1) minimizes not only

f (x(k) + –p(k) )

but also
min f (x).
xœSpan(p(0) ,...,p(k≠1) )

8.3.2 Existence of A-conjugate search directions

YouTube: https://www.youtube.com/watch?v=yXfR71mJ64w
The big question left dangling at the end of the last unit was whether there exists a direction p^(k) that is A-orthogonal to all previous search directions and that is not orthogonal to r^(k). Let us examine this:

• Assume that all prior search directions p^(0), . . . , p^(k−1) were A-conjugate.

• Consider all vectors p ∈ R^n that are A-conjugate to p^(0), . . . , p^(k−1). A vector p has this property if and only if p ⊥ Span(A p^(0), . . . , A p^(k−1)).

• For p ⊥ Span(A p^(0), . . . , A p^(k−1)) we notice that

    p^T r^(k) = p^T ( b − A x^(k) ) = p^T ( b − A P^(k−1) a^(k−1) ),

where we recall from (8.3.2) that

    P^(k−1) = ( p^(0)  · · ·  p^(k−1) )  and  a^(k−1) = ( α_0, . . . , α_{k−1} )^T.

• If all vectors p that are A-conjugate to p^(0), . . . , p^(k−1) are orthogonal to the current residual, p^T r^(k) = 0 for all p with P^(k−1)T A p = 0, then

    0 = p^T b − p^T A P^(k−1) a^(k−1) = p^T b  for all p ⊥ Span(A p^(0), . . . , A p^(k−1)).

Let's think about this: b is orthogonal to all vectors that are orthogonal to Span(A p^(0), . . . , A p^(k−1)). This means that

    b ∈ Span(A p^(0), . . . , A p^(k−1)).

• Hence b = A P^(k−1) z for some z ∈ R^k. It also means that x = P^(k−1) z solves Ax = b.

• We conclude that our method must already have found the solution, since x^(k) minimizes f(x) over all vectors in Span(p^(0), . . . , p^(k−1)). Thus A x^(k) = b and r^(k) = 0.

We conclude that there exist descent methods that leverage A-conjugate search directions as described in Figure 8.3.1.3. The question now is how to find a new A-conjugate search direction at every step.

8.3.3 Conjugate Gradient Method Basics

YouTube: https://www.youtube.com/watch?v=OWnTq1PIFnQ
The idea behind the Conjugate Gradient Method is that in the current iteration we have an approximation, x^(k), to the solution of Ax = b. By construction, since x^(0) = 0,

    x^(k) = α_0 p^(0) + · · · + α_{k−1} p^(k−1).

Also, the residual

    r^(k)
      =
    b − A x^(k)
      =
    b − A ( α_0 p^(0) + · · · + α_{k−1} p^(k−1) )
      =
    b − α_0 A p^(0) − · · · − α_{k−1} A p^(k−1)
      =
    r^(k−1) − α_{k−1} A p^(k−1).

If r^(k) = 0, then we know that x^(k) solves Ax = b, and we are done.
Assume that r^(k) ≠ 0. The question now is "How should we construct a new p^(k) that is A-conjugate to the previous search directions and so that p^(k)T r^(k) ≠ 0?" Here are some thoughts:

• We like the direction of steepest descent, r^(k) = b − A x^(k), because it is the direction in which f(x) decreases most quickly.

• Let us choose p^(k) to be the vector that is A-conjugate to p^(0), . . . , p^(k−1) and closest to the direction of steepest descent, r^(k):

    ‖p^(k) − r^(k)‖₂ = min_{p ⊥ Span(A p^(0),...,A p^(k−1))} ‖r^(k) − p‖₂.

This yields the algorithm in Figure 8.3.3.1.

    Given: A, b
    x^(0) := 0
    r^(0) := b
    k := 0
    while r^(k) ≠ 0
        if k = 0
            p^(k) = r^(0)
        else
            p^(k) minimizes min_{p ⊥ Span(A p^(0),...,A p^(k−1))} ‖r^(k) − p‖₂
        endif
        α_k := p^(k)T r^(k) / ( p^(k)T A p^(k) )
        x^(k+1) := x^(k) + α_k p^(k)
        r^(k+1) := r^(k) − α_k A p^(k)
        k := k + 1
    endwhile

Figure 8.3.3.1 Basic Conjugate Gradient Method.

8.3.4 Technical details

This unit is probably the most technically difficult unit in the course. We give the details here for completeness, but you will likely live a happy and productive research life without worrying about them too much... The important part is the final observation: that the next search direction computed by the Conjugate Gradient Method is a linear combination of the current residual (the direction of steepest descent) and the last search direction.

YouTube: https://www.youtube.com/watch?v=i5MoVhNsXYU
Let's look more carefully at p^(k) that satisfies

    ‖r^(k) − p^(k)‖₂ = min_{p ⊥ Span(A p^(0),...,A p^(k−1))} ‖r^(k) − p‖₂.

Notice that

    r^(k) = v + p^(k),

where v is the orthogonal projection of r^(k) onto Span(A p^(0), . . . , A p^(k−1)):

    ‖r^(k) − v‖₂ = min_{w ∈ Span(A p^(0),...,A p^(k−1))} ‖r^(k) − w‖₂,

which can also be formulated as v = A P^(k−1) z^(k), where

    ‖r^(k) − A P^(k−1) z^(k)‖₂ = min_{z ∈ R^k} ‖r^(k) − A P^(k−1) z‖₂.

This can be recognized as a standard linear least squares problem. This allows us to make a few important observations:

YouTube: https://www.youtube.com/watch?v=ye1FuJixbHQ
Theorem 8.3.4.1 In Figure 8.3.3.1,

• P^(k−1)T r^(k) = 0.

• Span(p^(0), . . . , p^(k−1)) = Span(r^(0), . . . , r^(k−1)) = Span(b, A b, . . . , A^{k−1} b).


Proof.

• Proving that P^(k−1)T r^(k) = 0 starts by considering that

    f(P^(k−1) y) = (1/2) (P^(k−1) y)^T A (P^(k−1) y) − (P^(k−1) y)^T b = (1/2) y^T (P^(k−1)T A P^(k−1)) y − y^T P^(k−1)T b

is minimized by the y_0 that satisfies

    (P^(k−1)T A P^(k−1)) y_0 = P^(k−1)T b.

Since x^(k) minimizes

    min_{x ∈ Span(p^(0),...,p^(k−1))} f(x),

we conclude that x^(k) = P^(k−1) y_0. But then

    0 = P^(k−1)T b − P^(k−1)T A x^(k) = P^(k−1)T ( b − A x^(k) ) = P^(k−1)T r^(k).

• Show that Span(p^(0), . . . , p^(k−1)) = Span(r^(0), . . . , r^(k−1)) = Span(b, A b, . . . , A^{k−1} b).
Proof by induction on k.

  ◦ Base case: k = 1.
    The result clearly holds since p^(0) = r^(0) = b.

  ◦ Inductive Hypothesis: Assume the result holds for n ≤ k.
    Show that the result holds for k = n + 1.

    ▪ If k = n + 1 then r^(k−1) = r^(n) = r^(n−1) − α_{n−1} A p^(n−1). By the I.H.,

        r^(n−1) ∈ Span(b, A b, . . . , A^{n−1} b)  and  p^(n−1) ∈ Span(b, A b, . . . , A^{n−1} b).

      But then

        A p^(n−1) ∈ Span(A b, A² b, . . . , A^n b)

      and hence

        r^(n) ∈ Span(b, A b, A² b, . . . , A^n b).

    ▪ p^(n) = r^(n) − A P^(n−1) y_0 and hence

        p^(n) ∈ Span(b, A b, A² b, . . . , A^n b)

      since

        r^(n) ∈ Span(b, A b, A² b, . . . , A^n b)  and  A P^(n−1) y_0 ∈ Span(A b, A² b, . . . , A^n b).

  ◦ We complete the inductive step by noting that all three subspaces have the same dimension and hence must be the same subspace.

  ◦ By the Principle of Mathematical Induction, the result holds.


Definition 8.3.4.2 Krylov subspace. The subspace

    K_k(A, b) = Span(b, A b, . . . , A^{k−1} b)

is known as the order-k Krylov subspace. ⌃


The next technical detail regards the residuals that are computed by the Conjugate Gradient Method. They are mutually orthogonal, and hence we, once again, conclude that the method must compute the solution (in exact arithmetic) in at most n iterations. It will also play an important role in reducing the number of matrix-vector multiplications needed to implement the final version of the Conjugate Gradient Method.
Theorem 8.3.4.3 The residual vectors r^(k) are mutually orthogonal.
Proof. In Theorem 8.3.4.1 we established that

    Span(p^(0), . . . , p^(j−1)) = Span(r^(0), . . . , r^(j−1))

and hence

    Span(r^(0), . . . , r^(j−1)) ⊂ Span(p^(0), . . . , p^(k−1))

for j < k. Hence r^(j) = P^(k−1) t^(j) for some vector t^(j) ∈ R^k. Then

    r^(k)T r^(j) = r^(k)T P^(k−1) t^(j) = 0.

Since this holds for all k and j < k, the desired result is established. ⌅
Next comes the most important result. We established that

    p^(k) = r^(k) − A P^(k−1) z^(k−1),        (8.3.3)

where z^(k−1) solves

    min_{z ∈ R^k} ‖r^(k) − A P^(k−1) z‖₂.

What we are going to show is that in fact the next search direction equals a linear combination of the current residual and the previous search direction.

Theorem 8.3.4.4 For k ≥ 1, the search directions generated by the Conjugate Gradient Method satisfy

    p^(k) = r^(k) + γ_k p^(k−1)

for some constant γ_k.
Proof. This proof has a lot of very technical details. No harm done if you only pay cursory attention to those details.
Partition z^(k−1) = ( z_0 ; ζ_1 ) and recall that r^(k) = r^(k−1) − α_{k−1} A p^(k−1), so that A p^(k−1) = ( r^(k−1) − r^(k) ) / α_{k−1}. Then

    p^(k)
      =    < (8.3.3) >
    r^(k) − A P^(k−1) z^(k−1)
      =    < z^(k−1) = ( z_0 ; ζ_1 ) >
    r^(k) − A P^(k−2) z_0 − ζ_1 A p^(k−1)
      =    < A p^(k−1) = ( r^(k−1) − r^(k) ) / α_{k−1} >
    r^(k) − A P^(k−2) z_0 − (ζ_1/α_{k−1}) ( r^(k−1) − r^(k) )
      =    < rearrange; call the bracketed term s^(k) >
    ( 1 + ζ_1/α_{k−1} ) r^(k) + [ −(ζ_1/α_{k−1}) r^(k−1) − A P^(k−2) z_0 ]
      =
    ( 1 + ζ_1/α_{k−1} ) r^(k) + s^(k).

We notice that r^(k) and s^(k) are orthogonal. Hence

    ‖p^(k)‖₂² = ( 1 + ζ_1/α_{k−1} )² ‖r^(k)‖₂² + ‖s^(k)‖₂²

and minimizing p^(k) means minimizing the two separate parts. Since r^(k) is fixed, this means minimizing ‖s^(k)‖₂². An examination of s^(k) exposes that

    s^(k) = −(ζ_1/α_{k−1}) r^(k−1) − A P^(k−2) z_0 = −(ζ_1/α_{k−1}) ( r^(k−1) − A P^(k−2) w_0 ),

where w_0 = −(α_{k−1}/ζ_1) z_0. We recall that

    ‖r^(k−1) − p^(k−1)‖₂ = min_{p ⊥ Span(A p^(0),...,A p^(k−2))} ‖r^(k−1) − p‖₂

and hence we conclude that s^(k) is a vector in the direction of p^(k−1). Since we are only interested in the direction of p^(k), the scalar ζ_1/α_{k−1} is not relevant. The upshot of this lengthy analysis is that

    p^(k) = r^(k) + γ_k p^(k−1).

This implies that while the Conjugate Gradient Method is an A-conjugate method and hence leverages a "memory" of all previous search directions,

    f(x^(k)) = min_{x ∈ Span(p^(0),...,p^(k−1))} f(x),

only the last search direction is needed to compute the current one. This reduces the cost of computing the current search direction and means we don't have to store all previous ones.

YouTube: https://www.youtube.com/watch?v=jHBK1OQE01s
Remark 8.3.4.5 This is a very, very, very big deal...

8.3.5 Practical Conjugate Gradient Method algorithm

YouTube: https://www.youtube.com/watch?v=FVWgZKJQjz0
We have noted that p^(k) = r^(k) + γ_k p^(k−1). Since p^(k) is A-conjugate to p^(k−1) we find that

    p^(k−1)T A p^(k) = p^(k−1)T A r^(k) + γ_k p^(k−1)T A p^(k−1)

so that

    γ_k = − p^(k−1)T A r^(k) / ( p^(k−1)T A p^(k−1) ).

This yields the first practical instantiation of the Conjugate Gradient Method, given in Figure 8.3.5.1.

    Given: A, b
    x^(0) := 0
    r^(0) := b
    k := 0
    while r^(k) ≠ 0
        if k = 0
            p^(k) = r^(0)
        else
            γ_k := − p^(k−1)T A r^(k) / ( p^(k−1)T A p^(k−1) )
            p^(k) := r^(k) + γ_k p^(k−1)
        endif
        α_k := p^(k)T r^(k) / ( p^(k)T A p^(k) )
        x^(k+1) := x^(k) + α_k p^(k)
        r^(k+1) := r^(k) − α_k A p^(k)
        k := k + 1
    endwhile

Figure 8.3.5.1 Conjugate Gradient Method.
Homework 8.3.5.1 In Figure 8.3.5.1 we compute

    α_k := p^(k)T r^(k) / ( p^(k)T A p^(k) ).

Show that an alternative formula for α_k is given by

    α_k := r^(k)T r^(k) / ( p^(k)T A p^(k) ).

Hint. Use the fact that p^(k) = r^(k) + γ_k p^(k−1) and the fact that r^(k) is orthogonal to all previous search directions to show that p^(k)T r^(k) = r^(k)T r^(k).
Solution. We need to show that p^(k)T r^(k) = r^(k)T r^(k).

    p^(k)T r^(k)
      =    < p^(k) = r^(k) + γ_k p^(k−1) >
    ( r^(k) + γ_k p^(k−1) )^T r^(k)
      =    < distribute >
    r^(k)T r^(k) + γ_k p^(k−1)T r^(k)
      =    < p^(k−1)T r^(k) = 0 >
    r^(k)T r^(k).
The last homework justifies the refined Conjugate Gradient Method in Figure 8.3.5.2 (Left).

Left:

    Given: A, b
    x^(0) := 0
    r^(0) := b
    k := 0
    while r^(k) ≠ 0
        if k = 0
            p^(k) = r^(0)
        else
            γ_k := −( p^(k−1)T A r^(k) ) / ( p^(k−1)T A p^(k−1) )
            p^(k) := r^(k) + γ_k p^(k−1)
        endif
        α_k := ( r^(k)T r^(k) ) / ( p^(k)T A p^(k) )
        x^(k+1) := x^(k) + α_k p^(k)
        r^(k+1) := r^(k) − α_k A p^(k)
        k := k + 1
    endwhile

Right:

    Given: A, b
    x^(0) := 0
    r^(0) := b
    k := 0
    while r^(k) ≠ 0
        if k = 0
            p^(k) = r^(0)
        else
            γ_k := ( r^(k)T r^(k) ) / ( r^(k−1)T r^(k−1) )
            p^(k) := r^(k) + γ_k p^(k−1)
        endif
        α_k := ( r^(k)T r^(k) ) / ( p^(k)T A p^(k) )
        x^(k+1) := x^(k) + α_k p^(k)
        r^(k+1) := r^(k) − α_k A p^(k)
        k := k + 1
    endwhile

Figure 8.3.5.2 Alternative Conjugate Gradient Method algorithms.
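The algorithm on the right of Figure 8.3.5.2 can be translated into the following MATLAB sketch (ours). The residual tolerance and iteration cap anticipate the stopping criteria of Subsection 8.3.6 and are our own additions.

    % A minimal sketch (ours) of the Conjugate Gradient Method, Figure 8.3.5.2 (Right).
    function x = CG( A, b, tol, max_its )
      x = zeros( size( b ) );
      r = b;
      for k = 0:max_its-1
        if norm( r ) <= tol * norm( b ), break, end
        rTr = r' * r;
        if k == 0
          p = r;
        else
          gamma = rTr / rTr_old;
          p = r + gamma * p;
        end
        q = A * p;
        alpha = rTr / ( p' * q );
        x = x + alpha * p;
        r = r - alpha * q;
        rTr_old = rTr;
      end
    end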
Homework 8.3.5.2 For the Conjugate Gradient Method discussed so far,

• Show that

    r^(k)T r^(k) = −α_{k−1} r^(k)T A p^(k−1).

• Show that

    p^(k−1)T A p^(k−1) = r^(k−1)T r^(k−1) / α_{k−1}.

Hint. Recall that

    r^(k) = r^(k−1) − α_{k−1} A p^(k−1)        (8.3.4)

and rewrite (8.3.4) as

    A p^(k−1) = ( r^(k−1) − r^(k) ) / α_{k−1},

and recall that in the previous iteration

    p^(k−1) = r^(k−1) + γ_{k−1} p^(k−2).

Solution.

    r^(k)T r^(k) = r^(k)T r^(k−1) − α_{k−1} r^(k)T A p^(k−1) = −α_{k−1} r^(k)T A p^(k−1).

    p^(k−1)T A p^(k−1)
      =
    ( r^(k−1) + γ_{k−1} p^(k−2) )^T A p^(k−1)
      =    < p^(k−2)T A p^(k−1) = 0 >
    r^(k−1)T A p^(k−1)
      =
    r^(k−1)T ( r^(k−1) − r^(k) ) / α_{k−1}
      =
    r^(k−1)T r^(k−1) / α_{k−1}.

From the last homework we conclude that

    γ_k = −( p^(k−1)T A r^(k) ) / ( p^(k−1)T A p^(k−1) ) = r^(k)T r^(k) / ( r^(k−1)T r^(k−1) ).

This is summarized on the right in Figure 8.3.5.2.

8.3.6 Final touches for the Conjugate Gradient Method

YouTube: https://www.youtube.com/watch?v=f3rLky6mIA4
We finish our discussion of the Conjugate Gradient Method by revisiting the stopping
criteria and preconditioning.

8.3.6.1 Stopping criteria

In theory, the Conjugate Gradient Method requires at most n iterations to achieve the condition where the residual is zero, so that x^(k) equals the exact solution. In practice, it is an iterative method due to the error introduced by floating point arithmetic. For this reason, the iteration proceeds while ‖r^(k)‖₂ ≥ ε_mach ‖b‖₂ and some maximum number of iterations has not yet been performed.

8.3.6.2 Preconditioning
In Subsection 8.2.5 we noted that the Method of Steepest Descent can be greatly accelerated by employing a preconditioner. The Conjugate Gradient Method can be similarly accelerated. While in theory the method requires at most n iterations when A is n × n, in practice a preconditioned Conjugate Gradient Method requires very few iterations.
Homework 8.3.6.1 Add preconditioning to the algorithm in Figure 8.3.5.2 (right).

Solution. To add preconditioning to

Ax = b

we pick a SPD preconditioner M = L̃L̃T and instead solve the equivalent problem

¸ ˚˙x˝ = L̃
≠1 ≠T T ≠1

¸ A ˚˙L̃ ˝ L̃ ¸ ˚˙ b.˝
à x̃ b̃

This changes the algorithm in Figure 8.3.5.2 (right) to

Given : A, b, M = L̃L̃T
x̃(0) := 0
à = L̃≠1 AL̃≠T
r̃(0) := L̃≠1 b
k := 0
while r̃(k) ”= 0
if k = 0
p̃(k) = r̃(0)
else
“˜k := (r̃(k) T r̃(k) )/(r̃(k≠1) T r̃(k≠1) )
p̃(k) := r̃(k) + “˜k p̃(k≠1)
endif
(k) T r̃ (k)
˜ k := p̃r̃(k) T Ãp̃
– (k)

x̃(k+1) := x̃(k) + – ˜ k p̃(k)


r̃(k+1) := r̃(k) ≠ – ˜ k Ãp̃(k)
k := k + 1
endwhile

Now, much like we did in the constructive solution to Homework 8.2.5.1 we now morph
this into an algorithm that more directly computes x(k+1) . We start by substituting

à = L̃≠1 AL̃≠T , x̃(k) = L̃T x(k) , r̃(k) = L̃≠1 r(k) , p̃(k) = L̃T p(k) ,

which yields
Given : A, b, M = L̃L̃T
L̃T x(0) := 0
L̃≠1 r(0) := L̃≠1 b
k := 0
while L̃≠1 r(k) ”= 0
if k = 0
L̃T p(k) = L̃≠1 r(0)
else
“˜k := ((L̃≠1 r(k) )T L̃≠1 r(k) )/(L̃≠1 r(k≠1) )T L̃≠1 r(k≠1) )
L̃T p(k) := L̃≠1 r(k) + “˜k L̃T p(k≠1)
endif
≠1 (k) )T L̃≠1 r (k)
˜ k := ((L̃T (pL̃(k) )Tr L̃≠1
– AL̃≠T L̃T p(k)
L̃T x(k+1) := L̃T x(k) + – ˜ k L̃T p(k)
L̃≠1 r(k+1) := L̃≠1 r(k) ≠ – ˜ k L̃≠1 L̃≠1 AL̃≠T L̃≠T L̃T p(k)
k := k + 1
endwhile
If we now simplify and manipulate various parts of this algorithm we get

Given : A, b, M = L̃L̃T
x(0) := 0
r(0) := b
k := 0
while r(k) ”= 0
if k = 0
p(k) = M ≠1 r(0)
else
“˜k := (r(k) T M ≠1 r(k) )/(r(k≠1) T M ≠1 r(k≠1) )
p(k) := M ≠1 r(k) + “˜k p(k≠1)
endif
(k) T ≠1 r (k)
˜ k := r p(k) M
– T Ap(k)

x(k+1) := x(k) + – ˜ k p(k)


r (k+1)
:= r ≠ –
(k)
˜ k Ap(k)
k := k + 1
endwhile

Finally, we avoid the recomputing of M ≠1 r(k) and Ap(k) by introducing z (k) and q (k) :

Given : A, b, M = L̃L̃T
x(0) := 0
r(0) := b
k := 0
while r(k) ”= 0
z (k) := M ≠1 r(k)
if k = 0
p(k) = z (0)
else
“˜k := (r(k) T z (k) )/(r(k≠1) T z (k≠1) )
p(k) := z (k) + “˜k p(k≠1)
endif
q (k) := Ap(k)
(k) T (k)
˜ k := rp(k) T zq(k)

x(k+1) := x(k) + – ˜ k p(k)
r(k+1) := r(k) ≠ – ˜ k q (k)
k := k + 1
endwhile

(Obviously, there are a few other things that can be done to avoid unnecessary recomputa-
tions of r(k) T z (k) .)
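For completeness, a MATLAB sketch (ours) of the preconditioned Conjugate Gradient Method just derived; as before, M is applied with backslash only for illustration, and the stopping criterion is our own addition.

    % A minimal sketch (ours) of the preconditioned Conjugate Gradient Method.
    function x = Preconditioned_CG( A, b, M, tol, max_its )
      x = zeros( size( b ) );
      r = b;
      for k = 0:max_its-1
        if norm( r ) <= tol * norm( b ), break, end
        z = M \ r;                        % apply the preconditioner
        rTz = r' * z;
        if k == 0
          p = z;
        else
          gamma = rTz / rTz_old;
          p = z + gamma * p;
        end
        q = A * p;
        alpha = rTz / ( p' * q );
        x = x + alpha * p;
        r = r - alpha * q;
        rTz_old = rTz;
      end
    end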

8.4 Enrichments
8.4.1 Conjugate Gradient Method: Variations on a theme
Many variations on the Conjugate Gradient Method exist, which are employed in different
situations. A concise summary of these, including suggestions as to which one to use when,
can be found in

• [2] Richard Barrett, Michael Berry, Tony F. Chan, James Demmel, June M. Donato,
Jack Dongarra, Victor Eijkhout, Roldan Pozo, Charles Romine, and Henk Van der
Vorst, Templates for the Solution of Linear Systems: Building Blocks for Iterative
Methods, SIAM Press, 1993. [ PDF ]

8.5 Wrap Up
8.5.1 Additional homework
Homework 8.5.1.1 When using iterative methods, the matrices are typically very sparse.
The question then is how to store a sparse matrix and how to perform a matrix-vector
multiplication with it. One popular way is known as compressed row storage that involves
three arrays:
• 1D array nzA (nonzero A) which stores the nonzero elements of matrix A. In this array,
first all nonzero elements of the first row are stored, then the second row, etc. It has
size nnzeroes (number of nonzeroes).

• 1D array ir which is an integer array of size n + 1 such that ir( 1 ) equals the
index in array nzA where the first element of the first row is stored. ir( 2 ) then
gives the index where the first element of the second row is stored, and so forth. ir(
n+1 ) equals nnzeroes + 1. Having this entry is convenient when you implement a
matrix-vector multiplication with array nzA.

• 1D array ic of size nnzeroes which holds the column indices of the corresponding
elements in array nzA.

1. Write a function

    [ nzA, ir, ic ] = Create_Poisson_problem_nzA( N )

that creates the matrix A in this sparse format.

2. Write a function

    y = SparseMvMult( nzA, ir, ic, x )

that computes y = Ax with the matrix A stored in the sparse format.

8.5.2 Summary
Given a function f : R^n → R, its gradient is given by

    ∇f(x) = ( ∂f/∂χ_0 (x), ∂f/∂χ_1 (x), . . . , ∂f/∂χ_{n−1} (x) )^T.

∇f(x) equals the direction in which the function f increases most rapidly at the point x and −∇f(x) equals the direction of steepest descent (the direction in which the function f decreases most rapidly at the point x).

In this summary, we will assume that A ∈ R^{n×n} is symmetric positive definite (SPD) and

   f(x) = (1/2) x^T A x - x^T b.

The gradient of this function equals

   ∇f(x) = A x - b

and x̂ minimizes the function if and only if

   A x̂ = b.

If x^(k) is an approximation to x̂, then r^(k) = b - A x^(k) equals the corresponding residual.
Notice that r^(k) = -∇f(x^(k)) and hence r^(k) is the direction of steepest descent.
   A prototypical descent method is given by

   Given: A, b, x^(0)
   r^(0) := b - A x^(0)
   k := 0
   while r^(k) ≠ 0
      p^(k) := next direction
      x^(k+1) := x^(k) + α_k p^(k) for some scalar α_k
      r^(k+1) := b - A x^(k+1)
      k := k + 1
   endwhile

Here p^(k) is the "current" search direction and in each iteration we create the next approxi-
mation to x̂, x^(k+1), along the line x^(k) + α p^(k).
   If x^(k+1) minimizes f along that line, the method is an exact descent method and

   α_k = (p^(k)T r^(k)) / (p^(k)T A p^(k)),

so that a prototypical exact descent method is given by

   Given: A, b, x^(0)
   r^(0) := b - A x^(0)
   k := 0
   while r^(k) ≠ 0
      p^(k) := next direction
      α_k := (p^(k)T r^(k)) / (p^(k)T A p^(k))
      x^(k+1) := x^(k) + α_k p^(k)
      r^(k+1) := b - A x^(k+1)
      k := k + 1
   endwhile

Once α_k is determined,

   r^(k+1) = r^(k) - α_k A p^(k),

which saves a matrix-vector multiplication when incorporated into the prototypical exact
descent method:

   Given: A, b, x^(0)
   r^(0) := b - A x^(0)
   k := 0
   while r^(k) ≠ 0
      p^(k) := next direction
      q^(k) := A p^(k)
      α_k := (p^(k)T r^(k)) / (p^(k)T q^(k))
      x^(k+1) := x^(k) + α_k p^(k)
      r^(k+1) := r^(k) - α_k q^(k)
      k := k + 1
   endwhile

The steepest descent algorithm chooses p^(k) = -∇f(x^(k)) = b - A x^(k) = r^(k):

   Given: A, b, x^(0)
   r^(0) := b - A x^(0)
   k := 0
   while r^(k) ≠ 0
      p^(k) := r^(k)
      q^(k) := A p^(k)
      α_k := (p^(k)T r^(k)) / (p^(k)T q^(k))
      x^(k+1) := x^(k) + α_k p^(k)
      r^(k+1) := r^(k) - α_k q^(k)
      k := k + 1
   endwhile

Convergence can be greatly accelerated by incorporating a preconditioner, M, where,
ideally, M ≈ A is SPD and solving M z = y is easy (cheap):

   Given: A, b, x^(0), M
   r^(0) := b - A x^(0)
   k := 0
   while r^(k) ≠ 0
      p^(k) := M^{-1} r^(k)
      q^(k) := A p^(k)
      α_k := (p^(k)T r^(k)) / (p^(k)T q^(k))
      x^(k+1) := x^(k) + α_k p^(k)
      r^(k+1) := r^(k) - α_k q^(k)
      k := k + 1
   endwhile

Definition 8.5.2.1 A-conjugate directions. Let A be SPD. A sequence p^(0), . . . , p^(k-1) ∈
R^n such that p^(j)T A p^(i) = 0 if and only if j ≠ i is said to be A-conjugate. ⌃
   The columns of P ∈ R^{n×k} are A-conjugate if and only if P^T A P = D where D is diagonal
and has positive values on its diagonal.
   A-conjugate vectors are linearly independent.
   A descent method that chooses the search directions to be A-conjugate will find the
solution of Ax = b, where A ∈ R^{n×n} is SPD, in at most n iterations:

   Given: A, b
   x^(0) := 0
   r^(0) := b
   k := 0
   while r^(k) ≠ 0
      Choose p^(k) such that p^(k)T A P^(k-1) = 0 and p^(k)T r^(k) ≠ 0
      α_k := (p^(k)T r^(k)) / (p^(k)T A p^(k))
      x^(k+1) := x^(k) + α_k p^(k)
      r^(k+1) := r^(k) - α_k A p^(k)
      k := k + 1
   endwhile

   The Conjugate Gradient Method chooses the search direction to equal the vector p^(k)
that is A-conjugate to all previous search directions and is closest to the direction of steepest
descent:

   Given: A, b
   x^(0) := 0
   r^(0) := b
   k := 0
   while r^(k) ≠ 0
      if k = 0
         p^(k) = r^(0)
      else
         p^(k) minimizes min_{p ⊥ Span(A p^(0), . . . , A p^(k-1))} ‖r^(k) - p‖_2
      endif
      α_k := (p^(k)T r^(k)) / (p^(k)T A p^(k))
      x^(k+1) := x^(k) + α_k p^(k)
      r^(k+1) := r^(k) - α_k A p^(k)
      k := k + 1
   endwhile

   The various vectors that appear in the Conjugate Gradient Method have the following
properties: If P^(k-1) = ( p^(0)  · · ·  p^(k-1) ) then

   • P^(k-1)T r^(k) = 0.

   • Span(p^(0), . . . , p^(k-1)) = Span(r^(0), . . . , r^(k-1)) = Span(b, Ab, . . . , A^{k-1} b).

   • The residual vectors r^(k) are mutually orthogonal.

   • For k ≥ 1,
        p^(k) = r^(k) - γ_k p^(k-1).

Definition 8.5.2.2 Krylov subspace. The subspace

   K_k(A, b) = Span(b, Ab, . . . , A^{k-1} b)

is known as the order-k Krylov subspace. ⌃
   Alternative Conjugate Gradient Methods, which differ only in how γ_k is computed, are
given by

   Given: A, b
   x^(0) := 0
   r^(0) := b
   k := 0
   while r^(k) ≠ 0
      if k = 0
         p^(k) = r^(0)
      else
         γ_k := -(p^(k-1)T A r^(k)) / (p^(k-1)T A p^(k-1))
         p^(k) := r^(k) + γ_k p^(k-1)
      endif
      α_k := (r^(k)T r^(k)) / (p^(k)T A p^(k))
      x^(k+1) := x^(k) + α_k p^(k)
      r^(k+1) := r^(k) - α_k A p^(k)
      k := k + 1
   endwhile

and

   Given: A, b
   x^(0) := 0
   r^(0) := b
   k := 0
   while r^(k) ≠ 0
      if k = 0
         p^(k) = r^(0)
      else
         γ_k := (r^(k)T r^(k)) / (r^(k-1)T r^(k-1))
         p^(k) := r^(k) + γ_k p^(k-1)
      endif
      α_k := (r^(k)T r^(k)) / (p^(k)T A p^(k))
      x^(k+1) := x^(k) + α_k p^(k)
      r^(k+1) := r^(k) - α_k A p^(k)
      k := k + 1
   endwhile

   A practical stopping criterion for the Conjugate Gradient Method is to proceed until
‖r^(k)‖_2 ≤ ε_mach ‖b‖_2 or some maximum number of iterations has been performed.
   The Conjugate Gradient Method can be accelerated by incorporating a preconditioner,
M, where M ≈ A is SPD.
Part III

The Algebraic Eigenvalue Problem

Week 9

Eigenvalues and Eigenvectors

9.1 Opening
9.1.1 Relating diagonalization to eigenvalues and eigenvectors
You may want to start your exploration of eigenvalues and eigenvectors by watching the
video
• Eigenvectors and eigenvalues | Essence of linear algebra, chapter 14 from the 3Blue1Brown
series. (We don't embed the video because we are not quite sure what the rules about
doing so are.)
Here are the insights from that video in the terminology of this week.

YouTube: https://www.youtube.com/watch?v=S_OgLYAh2Jk
Homework 9.1.1.1 Eigenvalues and eigenvectors are all about finding scalars, λ, and
nonzero vectors, x, such that
   Ax = λx.
To help you visualize how a 2 × 2 real-valued matrix transforms a vector on the unit
circle in general, and eigenvectors of unit length in particular, we have created the function
Assignments/Week09/matlab/showeig.m (inspired by such a function that used to be part of
Matlab). You may now want to do a "git pull" to update your local copy of the Assignments
directory.
   Once you have uploaded this function to Matlab, in the command window, first create a
2 × 2 matrix in array A and then execute showeig( A ).


Here are some matrices to try:

A = [ 2 0
0 -0.5 ]

A = [ 2 1
0 -0.5 ]

A = [ 2 1
1 -0.5 ]

theta = pi/4;
A = [ cos( theta) -sin( theta )
sin( theta ) cos( theta ) ]

A = [ 2 1
0 2 ]

A = [ 2 -1
-1 0.5 ]

A = [ 2 1.5
1 -0.5 ]

A = [ 2 -1
1 0.5 ]

Can you explain some of what you observe?


Solution.

A = [ 2 0
0 -0.5 ]

A = [ 2 1
0 -0.5 ]

A = [ 2 1
1 -0.5 ]

If you try a few different symmetric matrices, you will notice that the eigenvectors are always
mutually orthogonal.

theta = pi/4;
A = [ cos( theta) -sin( theta )
sin( theta ) cos( theta ) ]

In the end, no vectors are displayed. This is because for real-valued vectors, there are no
vectors such that the rotated vector is in the same direction as the original vector: the
eigenvalues and eigenvectors of a real-valued rotation are complex-valued, unless θ is an
integer multiple of π.

A = [ 2 1
0 2 ]

We will see later that this is an example of a Jordan block. There is only one linearly
independent eigenvector associated with the eigenvalue 2. Notice that the two eigenvectors
that are displayed are not linearly independent (they point in opposite directions).

A = [ 2 -1
-1 0.5 ]

This matrix has linearly dependent columns (it has a nonzero vector in the null space and
hence 0 is an eigenvalue).

A = [ 2 1.5
1 -0.5 ]

If you look carefully, you notice that the eigenvectors are not mutually orthogonal.

A = [ 2 -1
1 0.5 ]

Another example of a matrix with complex-valued eigenvalues and eigenvectors.

9.1.2 Overview
• 9.1 Opening

¶ 9.1.1 Relating diagonalization to eigenvalues and eigenvectors


¶ 9.1.2 Overview
¶ 9.1.3 What you will learn

• 9.2 Basics

¶ 9.2.1 Singular matrices and the eigenvalue problem


¶ 9.2.2 The characteristic polynomial
¶ 9.2.3 More properties of eigenvalues and vectors
¶ 9.2.4 The Schur and Spectral Decompositions
¶ 9.2.5 Diagonalizing a matrix

¶ 9.2.6 Jordan Canonical Form


• 9.3 The Power Method and related approaches
¶ 9.3.1 The Power Method
¶ 9.3.2 The Power Method: Convergence
¶ 9.3.3 The Inverse Power Method
¶ 9.3.4 The Rayleigh Quotient Iteration
¶ 9.3.5 Discussion
• 9.4 Enrichments
• 9.5 Wrap Up
¶ 9.5.1 Additional homework
¶ 9.5.2 Summary

9.1.3 What you will learn


This week, you are reintroduced to the theory of eigenvalues, eigenvectors, and diagonaliza-
tion. Building on this, we start our discovery of practical algorithms.
Upon completion of this week, you should be able to
• Connect the algebraic eigenvalue problem to the various ways in which singular matri-
ces can be characterized.
• Relate diagonalization of a matrix to the eigenvalue problem.
• Link the eigenvalue problem to the Schur and Spectral Decompositions of a matrix.
• Translate theoretical insights into a practical Power Method and related methods.
• Investigate the convergence properties of practical algorithms.

9.2 Basics
9.2.1 Singular matrices and the eigenvalue problem

YouTube: https://www.youtube.com/watch?v=j85zII8u2-I
Definition 9.2.1.1 Eigenvalue, eigenvector, and eigenpair. Let A ∈ C^{m×m}. Then
λ ∈ C and nonzero x ∈ C^m are said to be an eigenvalue and corresponding eigenvector if
Ax = λx. The tuple (λ, x) is said to be an eigenpair. ⌃
Ax = ⁄x means that the action of A on an eigenvector x is as if it were multiplied by a
scalar. In other words, the direction does not change and only its length is scaled. "Scaling"
and "direction" should be taken loosely here: an eigenvalue can be negative (in which case
the vector ends up pointing in the opposite direction) or even complex-valued.
As part of an introductory course on linear algebra, you learned that the following state-
ments regarding an m ◊ m matrix A are all equivalent:

• A is nonsingular.

• A has linearly independent columns.

• There does not exist a nonzero vector x such that Ax = 0.

• N (A) = {0}. (The null space of A is trivial.)

• dim(N (A)) = 0.

• det(A) ”= 0.

Since Ax = ⁄x can be rewritten as (⁄I ≠ A)x = 0, we note that the following statements are
equivalent for a given m ◊ m matrix A:

• There exists a vector x ”= 0 such that (⁄I ≠ A)x = 0.

• (⁄I ≠ A) is singular.

• (⁄I ≠ A) has linearly dependent columns.

• The null space of ⁄I ≠ A is nontrivial.

• dim(N (⁄I ≠ A)) > 0.

• det(⁄I ≠ A) = 0.

It will become important in our discussions to pick the right equivalent statement in a given
situation.

YouTube: https://www.youtube.com/watch?v=K-yDVqijSYw
We will often talk about "the set of all eigenvalues." This set is called the spectrum of
a matrix.
Definition 9.2.1.2 Spectrum of a matrix. The set of all eigenvalues of A is denoted
by Λ(A) and is called the spectrum of A. ⌃
   The magnitude of the eigenvalue that is largest in magnitude is known as the spectral
radius. The reason is that all eigenvalues lie in the circle in the complex plane, centered at
the origin, with that radius.
Definition 9.2.1.3 Spectral radius. The spectral radius of A, ρ(A), equals the absolute
value of the eigenvalue with largest magnitude:

   ρ(A) = max_{λ ∈ Λ(A)} |λ|.        ⌃


In Subsection 7.3.3 we used the spectral radius to argue that the matrix that comes up
when finding the solution to Poisson’s equation is nonsingular. Key in that argument is a
result known as the Gershgorin Disk Theorem.

YouTube: https://www.youtube.com/watch?v=r a9Q4E6hVI


Theorem 9.2.1.4 Gershgorin Disk Theorem. Let A ∈ C^{m×m},

   A = ( α_{0,0}      α_{0,1}      · · ·   α_{0,m-1}   )
       ( α_{1,0}      α_{1,1}      · · ·   α_{1,m-1}   )
       (    .            .          .          .       )
       ( α_{m-1,0}    α_{m-1,1}    · · ·   α_{m-1,m-1} ),

   ρ_i(A) = Σ_{j≠i} |α_{i,j}|,

and

   R_i(A) = { x  s.t.  |x - α_{i,i}| ≤ ρ_i }.

In other words, ρ_i(A) equals the sum of the absolute values of the off-diagonal elements of A
in row i and R_i(A) equals the set of all points in the complex plane that are within a distance
ρ_i of diagonal element α_{i,i}. Then

   Λ(A) ⊂ ∪_i R_i(A).

In other words, the eigenvalues lie in the union of these disks.

Proof. Let λ ∈ Λ(A). Then (λI - A)x = 0 for some nonzero vector x. W.l.o.g. assume that
index i has the property that 1 = χ_i ≥ |χ_j| for j ≠ i. Then

   -α_{i,0} χ_0 - · · · - α_{i,i-1} χ_{i-1} + (λ - α_{i,i}) - α_{i,i+1} χ_{i+1} - · · · - α_{i,m-1} χ_{m-1} = 0

or, equivalently,

   λ - α_{i,i} = α_{i,0} χ_0 + · · · + α_{i,i-1} χ_{i-1} + α_{i,i+1} χ_{i+1} + · · · + α_{i,m-1} χ_{m-1}.

Hence

   |λ - α_{i,i}| = |α_{i,0} χ_0 + · · · + α_{i,i-1} χ_{i-1} + α_{i,i+1} χ_{i+1} + · · · + α_{i,m-1} χ_{m-1}|
                 ≤ |α_{i,0}||χ_0| + · · · + |α_{i,i-1}||χ_{i-1}| + |α_{i,i+1}||χ_{i+1}| + · · · + |α_{i,m-1}||χ_{m-1}|
                 ≤ |α_{i,0}| + · · · + |α_{i,i-1}| + |α_{i,i+1}| + · · · + |α_{i,m-1}|
                 = ρ_i(A).                                                                       ⌅

YouTube: https://www.youtube.com/watch?v=19FXch2X7sQ
It is important to note that it is not necessarily the case that each such disk has exactly
one eigenvalue in it. There is, however, a slightly stronger result than Theorem 9.2.1.4.
Corollary 9.2.1.5 Let A and R_i(A) be as defined in Theorem 9.2.1.4. Let K and K^C be
disjoint subsets of {0, . . . , m-1} such that K ∪ K^C = {0, . . . , m-1}. In other words, let
K and K^C partition {0, . . . , m-1}. If

   ( ∪_{k∈K} R_k(A) ) ∩ ( ∪_{j∈K^C} R_j(A) ) = ∅

then ∪_{k∈K} R_k(A) contains exactly |K| eigenvalues of A (multiplicity counted). In other words,
if ∪_{k∈K} R_k(A) does not intersect with any of the other disks, then it contains as many eigen-
values of A (multiplicity counted) as there are elements of K.
Proof. The proof splits A = D + (A - D), where D equals the diagonal of A, and considers
A_ω = D + ω(A - D), which varies continuously with ω. One can argue that the disks R_i(A_0)
start with only one eigenvalue each and only when they start intersecting can an eigenvalue
"escape" the disk in which it started. We skip the details since we won't need this result in
this course. ⌅
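As a quick numerical sanity check of the theorem (our own illustration; any square matrix
will do), the following Matlab lines compute the centers and radii of the Gershgorin disks
and verify that every computed eigenvalue lies in at least one of them:

   A = [ 4 -1  0
        -1  4 -1
         0 -1  4 ];
   centers = diag( A );
   radii   = sum( abs( A ), 2 ) - abs( centers );   % rho_i( A )
   lambda  = eig( A );
   for i = 1:length( lambda )
     % lambda( i ) must lie in the union of the disks R_i( A )
     assert( any( abs( lambda( i ) - centers ) <= radii + 1e-12 ) );
   end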
Through a few homeworks, let’s review basic facts about eigenvalues and eigenvectors.
Homework 9.2.1.1 Let A œ Cm◊m .
TRUE/FALSE: 0 œ (A) if and only A is singular.
Answer. TRUE
Now prove it!
Solution.

• (∆): Assume 0 œ (A). Let x be an eigenvector associated with eigenvalue 0. Then


Ax = 0x = 0. Hence there exists a nonzero vector x such that Ax = 0. This implies A
is singular.

• (≈): Assume A is singular. Then there exists x ”= 0 such that Ax = 0. Hence Ax = 0x


and 0 is an eigenvalue of A.
Homework 9.2.1.2 Let A ∈ C^{m×m} be Hermitian.
ALWAYS/SOMETIMES/NEVER: All eigenvalues of A are real-valued.
Answer. ALWAYS
Now prove it!
Solution. Let (λ, x) be an eigenpair of A. Then

   Ax = λx

and hence

   x^H A x = λ x^H x.

If we now take the conjugate transpose of both sides we find that

   (x^H A x)^H = (λ x^H x)^H,

which, since A is Hermitian, is equivalent to

   x^H A x = λ̄ x^H x.

We conclude that

   λ = (x^H A x) / (x^H x) = λ̄

(since x^H x ≠ 0), and hence λ is real-valued.

Homework 9.2.1.3 Let A œ Cm◊m be Hermitian positive definite (HPD).


ALWAYS/SOMETIMES/NEVER: All eigenvalues of A are positive.
Answer. ALWAYS
Now prove it!
Solution. Let (⁄, x) be an eigenpair of A. Then

Ax = ⁄x

and hence
xH Ax = ⁄xH x
and finally (since x ”= 0)
xH Ax
⁄= .
xH x
Since A is HPD, both xH Ax and xH x are positive, which means ⁄ is positive.
The converse is also always true, but we are not ready to prove that yet.
Homework 9.2.1.4 Let A œ Cm◊m be Hermitian, (⁄, x) and (µ, y) be eigenpairs associated
with A, and ⁄ ”= µ.
ALWAYS/SOMETIMES/NEVER: xH y = 0
Answer. ALWAYS
Now prove it!
Solution. Since
Ax = ⁄x and Ay = µy
we know that
y H Ax = ⁄y H x and xH Ay = µxH y
and hence (remembering that the eigenvalues of a Hermitian matrix are real-valued)

   λ y^H x = y^H A x = (x^H A y)^H = (µ x^H y)^H = µ y^H x.
⁄y H x = y H Ax = xH Ay = µxH y = µy H x.

We can rewrite this as


(⁄ ≠ µ)y H x = 0.
Since ⁄ ”= µ this implies that y H x = 0 and hence xH y = 0.
Homework 9.2.1.5 Let A œ Cm◊m , (⁄, x) and (µ, y) be eigenpairs, and ⁄ ”= µ. Prove that
x and y are linearly independent.
Solution. Proof by contradiction: Under the assumptions of the homework, we will show
that assuming that x and y are linearly dependent leads to a contradiction.
If nonzero x and nonzero y are linearly dependent, then there exists “ ”= 0 such that
y = “x. Then
Ay = µy
implies that
A(“x) = µ(“x)

and hence
“⁄x = µ“x.
Rewriting this we get that
(⁄ ≠ µ)“x = 0.
Since ⁄ ”= µ and “ ”= 0 this means that x = 0 which contradicts that x is an eigenvector.
We conclude that x and y are linearly independent.
We now generalize this insight.
Homework 9.2.1.6 Let A œ Cm◊m , k Æ m, and (⁄i , xi ) for 1 Æ i < k be eigenpairs of this
matrix. Prove that if ⁄i ”= ⁄j when i ”= j then the eigenvectors xi are linearly independent.
In other words, given a set of distinct eigenvalues, a set of vectors created by taking one
eigenvector per eigenvalue is linearly independent.
Hint. Prove by induction.
Solution. Proof by induction on k.

• Base Case: k = 1. This is trivially true.

• Assume the result holds for 1 Æ k < m. Show it holds for k + 1.


The I.H. means that x0 , . . . , xk≠1 are linearly independent. We need to show that xk
is not a linear combination of x0 , . . . , xk≠1 . We will do so via a proof by contradiction.
Assume xk is a linear combination of x0 , . . . , xk≠1 so that

xk = “0 x0 + · · · + “k≠1 xk≠1

with at least one “j ”= 0. We know that Axk = ⁄k xk and hence

A(“0 x0 + · · · + “k≠1 xk≠1 ) = ⁄k (“0 x0 + · · · + “k≠1 xk≠1 ).

Since Axi = ⁄i xi , we conclude that

“0 ⁄0 x0 + · · · + “k≠1 ⁄k xk≠1 = “0 ⁄k x0 + · · · + “k≠1 ⁄k xk≠1

or, equivalently,
“0 (⁄0 ≠ ⁄k )x0 + · · · + “k≠1 (⁄i ≠ ⁄k )xk≠1 = 0.
Since at least one “i ”= 0 and ⁄i ”= ⁄k for 0 Æ i < k, we conclude that x0 , . . . , xk≠1 are
linearly dependent, which is a contradiction.
Hence, x0 , . . . , xk are linearly independent.

• By the Principle of Mathematical Induction, the result holds for 1 Æ k Æ m.


We now return to some of the matrices we saw in Week 7.

Homework 9.2.1.7 Consider the matrices


Q R
2 ≠1
c d
c ≠1
c
2 ≠1 d
d
c
A=c . .. ... ... d
d
c d
c
a ≠1 2 ≠1 d
b
≠1 2

and Q R
4 ≠1 ≠1
c ≠1
c 4 ≠1 ≠1 d
d
c d
c
c
≠1 4 ≠1 ≠1 d
d
c
c ≠1 4 ≠1 d
d
c d
c ≠1
c
4 ≠1 ≠1 d
d
c .. d
c
c ≠1 ≠1 4 ≠1 . d
d
c
c ≠1 ≠1 4 ≠1 d
d
c d
c
c
≠1 ≠1 4 d
d
.
4 ..
c d
c ≠1 d
a b
.. .. ..
. . .
ALWAYS/SOMETIMES/NEVER: All eigenvalues of these matrices are nonnegative.
ALWAYS/SOMETIMES/NEVER: All eigenvalues of the first matrix are positive.
Answer. ALWAYS: All eigenvalues of these matrices are nonnegative.
ALWAYS: All eigenvalues of the first matrix are positive. (So are all the eigenvalues of
the second matrix, but proving that is a bit trickier.)
Now prove it!
Solution. For the first matrix, we can use the Gershgorin disk theorem to conclude that
all eigenvalues of the matrix lie in the set {x s.t. |x - 2| ≤ 2}. We also notice that the matrix
is symmetric, which means that its eigenvalues are real-valued. Hence the eigenvalues are
nonnegative. A similar argument can be used for the second matrix.
Now, in Homework 7.2.1.1 we showed that the first matrix is nonsingular. Hence, it
cannot have an eigenvalue equal to zero. We conclude that its eigenvalues are all positive.
It can be shown that the second matrix is also nonsingular, and hence has positive
eigenvalues. However, that is a bit nasty to prove...

9.2.2 The characteristic polynomial

YouTube: https://www.youtube.com/watch?v=NUvfjg-JUjg
We start by discussing how to further characterize eigenvalues of a given matrix A. We
say "characterize" because none of the discussed insights lead to practical algorithms for
computing them, at least for matrices larger than 4 ◊ 4.
Homework 9.2.2.1 Let

   A = ( α_{0,0}  α_{0,1} )
       ( α_{1,0}  α_{1,1} )

be a nonsingular matrix. Show that

   A^{-1} = 1/(α_{0,0} α_{1,1} - α_{1,0} α_{0,1}) (  α_{1,1}  -α_{0,1} )
                                                  ( -α_{1,0}   α_{0,0} ).

Solution. Recall that, for square matrices, B = A^{-1} if and only if AB = I. Indeed,

   ( α_{0,0}  α_{0,1} )  1/(α_{0,0} α_{1,1} - α_{1,0} α_{0,1}) (  α_{1,1}  -α_{0,1} )
   ( α_{1,0}  α_{1,1} )                                        ( -α_{1,0}   α_{0,0} )

      = 1/(α_{0,0} α_{1,1} - α_{1,0} α_{0,1}) ( α_{0,0} α_{1,1} - α_{0,1} α_{1,0}    -α_{0,0} α_{0,1} + α_{0,1} α_{0,0} )
                                              ( α_{1,0} α_{1,1} - α_{1,1} α_{1,0}    -α_{1,0} α_{0,1} + α_{1,1} α_{0,0} )

      = ( 1  0 )
        ( 0  1 ).

YouTube: https://www.youtube.com/watch?v=WvwcrDM_K3k
What we notice from the last exercises is that α_{0,0} α_{1,1} - α_{1,0} α_{0,1} characterizes whether
A has an inverse or not:

   A = ( α_{0,0}  α_{0,1} )
       ( α_{1,0}  α_{1,1} )

is nonsingular if and only if α_{0,0} α_{1,1} - α_{1,0} α_{0,1} ≠ 0. The expression α_{0,0} α_{1,1} - α_{1,0} α_{0,1}
is known as the determinant of the 2 × 2 matrix and denoted by det(A):
Definition 9.2.2.1 The determinant of

   A = ( α_{0,0}  α_{0,1} )
       ( α_{1,0}  α_{1,1} )

is given by

   det(A) = α_{0,0} α_{1,1} - α_{1,0} α_{0,1}.

   Now, λ is an eigenvalue of A if and only if λI - A is singular. For our 2 × 2 matrix,

   λI - A = ( λ - α_{0,0}     -α_{0,1}    )
            (   -α_{1,0}    λ - α_{1,1}   )

is singular if and only if

   (λ - α_{0,0})(λ - α_{1,1}) - (-α_{1,0})(-α_{0,1}) = 0.

In other words, λ is an eigenvalue of this matrix if and only if λ is a root of

   p_A(λ) = (λ - α_{0,0})(λ - α_{1,1}) - (-α_{1,0})(-α_{0,1})
          = λ^2 - (α_{0,0} + α_{1,1}) λ + (α_{0,0} α_{1,1} - α_{1,0} α_{0,1}),

which is a polynomial of degree two. A polynomial of degree two has two roots (counting
multiplicity). This polynomial is known as the characteristic polynomial of the 2 × 2 matrix.
   We now have a means for computing eigenvalues and eigenvectors of a 2 × 2 matrix,
sketched in Matlab code after this list:

• Form the characteristic polynomial p_A(λ).

• Solve for its roots, λ_0 and λ_1.

• Find nonzero vectors x_0 and x_1 in the null spaces of λ_0 I - A and λ_1 I - A, respectively.
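Here is a Matlab sketch of these three steps for a specific 2 × 2 matrix (the matrix is our own
example; for the eigenvectors we use the observation that, when α_{0,1} ≠ 0, the vector
( α_{0,1}, λ - α_{0,0} )^T lies in the null space of λI - A):

   A = [ 2  1
         1 -0.5 ];
   % Step 1: p_A( lambda ) = lambda^2 - ( alpha00 + alpha11 ) lambda + ( alpha00 alpha11 - alpha10 alpha01 )
   p = [ 1, -( A(1,1) + A(2,2) ), A(1,1) * A(2,2) - A(2,1) * A(1,2) ];
   % Step 2: solve for its roots
   lambda = roots( p );
   % Step 3: find nonzero vectors in the null spaces of lambda_i I - A
   x0 = [ A(1,2); lambda( 1 ) - A(1,1) ];
   x1 = [ A(1,2); lambda( 2 ) - A(1,1) ];
   norm( A * x0 - lambda( 1 ) * x0 )    % both residuals are (almost) zero
   norm( A * x1 - lambda( 2 ) * x1 )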

YouTube: https://www.youtube.com/watch?v=FjoULa2dMC8
The notion of a determinant of a matrix, det(A), generalizes to m◊m matrices as does the
fact that A is nonsingular if and only if det(A) ”= 0. Similarly, the notion of a characteristic
polynomial is then generalized to m ◊ m matrices:

Definition 9.2.2.2 Characteristic polynomial. The characteristic polynomial of m ◊ m


matrix A is given by
pA (⁄) = det(⁄I ≠ A).

At some point in your education, you may have been taught how to compute the deter-
minant of an arbitrary m ◊ m matrix. For this course, such computations have no practical
applicability, when matrices are larger than 3 ◊ 3 or so, and hence we don’t spend time on
how to compute determinants. What is important to our discussions is that for an m ◊ m
matrix A the characteristic polynomial is a polynomial of degree m, a result we formalize in
a theorem without giving a proof:
Theorem 9.2.2.3 If A œ Cm◊m then pA (⁄) = det(⁄I ≠ A) is a polynomial of degree m.
This insight now allows us to further characterize the set of all eigenvalues of a given
matrix:
Theorem 9.2.2.4 Let A œ Cm◊m . Then ⁄ œ (A) if and only if pA (⁄) = 0.
Proof. This follows from the fact that a matrix has a nontrivial null space if and only
if its determinant is zero. Hence, pA (⁄) = 0 if and only if there exists x ”= 0 such that
(⁄I ≠ A)x = 0. (⁄I ≠ A)x = 0 if and only if Ax = ⁄x. ⌅
Recall that a polynomial of degree m,

   p_m(χ) = χ^m + · · · + γ_1 χ + γ_0,

can be factored as

   p_m(χ) = (χ - χ_0)^{m_0} · · · (χ - χ_{k-1})^{m_{k-1}},

where the χ_i are distinct roots, m_i equals the multiplicity of the root, and m_0 + · · · + m_{k-1} =
m. The concept of (algebraic) multiplicity carries over to eigenvalues.
Definition 9.2.2.5 Algebraic multiplicity of an eigenvalue. Let A œ Cm◊m and pm (⁄)
its characteristic polynomial. Then the (algebraic) multiplicity of eigenvalue ⁄i equals the
multiplicity of the corresponding root of the polynomial. ⌃
Often we will list the eigenvalues of A œ Cm◊m as m eigenvalues ⁄0 , . . . , ⁄m≠1 even when
some are equal (have algebraic multiplicity greater than one). In this case we say that
we are counting multiplicity. In other words, we are counting each eigenvalue (root of the
characteristic polynomial) separately, even if they are equal.
An immediate consequence is that A has m eigenvalues (multiplicity counted), since a
polynomial of degree m has m roots (multiplicity counted), which is captured in the following
lemma.
Lemma 9.2.2.6 If A œ Cm◊m then A has m eigenvalues (multiplicity counted).
The relation between eigenvalues and the roots of the characteristic polynomial yields a
disconcerting insight: A general formula for the eigenvalues of an arbitrary m × m matrix
with m > 4 does not exist. The reason is that "Galois theory" tells us that there is no general
formula for the roots of a polynomial of degree m > 4 (details go beyond the scope of this
course). Given any polynomial p_m(χ) of degree m, an m × m matrix can be constructed
such that its characteristic polynomial is p_m(λ). In particular, if

   p_m(χ) = χ^m + α_{m-1} χ^{m-1} + · · · + α_1 χ + α_0

and

   A = ( -α_{m-1}  -α_{m-2}  -α_{m-3}  · · ·  -α_1  -α_0 )
       (     1         0         0     · · ·    0     0  )
       (     0         1         0     · · ·    0     0  )
       (     0         0         1     · · ·    0     0  )
       (     .         .         .      .       .     .  )
       (     0         0         0     · · ·    1     0  )

then

   p_m(λ) = det(λI - A).

(Since we don't discuss how to compute the determinant of a general matrix, you will have
to take our word for this fact.) Hence, we conclude that no general formula can be found for
the eigenvalues of m × m matrices when m > 4. What we will see is that we will instead
create algorithms that converge to the eigenvalues and/or eigenvectors of matrices.
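As an illustration of this construction (the polynomial is our own example), take
p(χ) = χ^3 - 6χ^2 + 11χ - 6 = (χ - 1)(χ - 2)(χ - 3), so that α_2 = -6, α_1 = 11, α_0 = -6:

   A = [ 6 -11  6      % first row: -alpha_2, -alpha_1, -alpha_0
         1   0  0
         0   1  0 ];
   eig( A )                    % returns 1, 2, 3 (in some order)
   roots( [ 1 -6 11 -6 ] )     % the roots of p

Matlab's roots is, in fact, implemented by forming a companion matrix like this one and
calling eig, which illustrates that root finding and eigenvalue computation are two sides of
the same coin.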
Corollary 9.2.2.7 If A œ Rm◊m is real-valued then some or all of its eigenvalues may be
complex-valued. If eigenvalue ⁄ is complex-valued, then its conjugate, ⁄̄, is also an eigenvalue.
Indeed, the complex eigenvalues of a real-valued matrix come in complex pairs.
Proof. It can be shown that if A is real-valued, then the coefficients of its characteristic
polynomial are all real -valued. Complex roots of a polynomial with real coefficients come
in conjugate pairs. ⌅
The last corollary implies that if m is odd, then at least one eigenvalue of a real-valued
m ◊ m matrix must be real-valued.
Corollary 9.2.2.8 If A œ Rm◊m is real-valued and m is odd, then at least one of the
eigenvalues of A is real-valued.

YouTube: https://www.youtube.com/watch?v=BVqdIKTK1SI
It would seem that the natural progression for computing eigenvalues and eigenvectors
would be

• Form the characteristic polynomial pA (⁄).

• Solve for its roots, ⁄0 , . . . , ⁄m≠1 , which give us the eigenvalues of A.



• Find eigenvectors associated with the eigenvalues by finding bases for the null spaces
of ⁄i I ≠ A.
However, as mentioned, finding the roots of a polynomial is a problem. Moreover, finding
vectors in the null space is also problematic in the presence of roundoff error. For this
reason, the strategy for computing eigenvalues and eigenvectors is going to be to compute
approximations of eigenvectors hand in hand with the eigenvalues.

9.2.3 More properties of eigenvalues and vectors


No video for this unit
This unit reminds us of various properties of eigenvalue and eigenvectors through a se-
quence of homeworks.
Homework 9.2.3.1 Let ⁄ be an eigenvalue of A œ Cm◊m and let

E⁄ (A) = {x œ Cm |Ax = ⁄x}.

be the set of all eigenvectors of A associated with ⁄ plus the zero vector (which is not
considered an eigenvector). Show that E⁄ (A) is a subspace.
Solution. A set S µ Cm is a subspace if and only if for all – œ C and x, y œ Cm two
conditions hold:

• x œ S implies that –x œ S.

• x, y œ S implies that x + y œ S.

• x œ E⁄ (A) implies –x œ E⁄ (A):


x ∈ E_λ(A) means that Ax = λx. If α ∈ C then αAx = αλx which, by commutativity
and associativity, means that A(αx) = λ(αx). Hence (αx) ∈ E_λ(A).

• x, y œ E⁄ (A) implies x + y œ E⁄ (A):

A(x + y) = Ax + Ay = ⁄x + ⁄y = ⁄(x + y).


While there are an infinite number of eigenvectors associated with an eigenvalue, the
fact that they form a subspace (provided the zero vector is added) means that they can be
described by a finite number of vectors, namely a basis for that space.
Homework 9.2.3.2 Let D œ Cm◊m be a diagonal matrix. Give all eigenvalues of D. For
each eigenvalue, give a convenient eigenvector.
Solution. Let Q R
”0 0 · · · 0
c 0 ”1 · · · 0
c d
d
D=c c .. .. . . .. d.
a . . . .
d
b
0 0 · · · ”m≠1

Then Q R
⁄ ≠ ”0 0 ··· 0
c
c 0 ⁄ ≠ ”1 ··· 0 d
d
⁄I ≠ D = c
c .. .. ... .. d
d
a . . . b
0 0 · · · ⁄ ≠ ”m≠1
is singular if and only if ⁄ = ”i for some i œ {0, . . . , m ≠ 1}. Hence (D) = {”0 , ”1 , . . . , ”m≠1 }.
Now,
Dej = the column of D indexed with j = ”j ej
and hence ej is an eigenvector associated with ”j .
Homework 9.2.3.3 Compute the eigenvalues and corresponding eigenvectors of

   A = ( -2  3  -7 )
       (  0  1   1 )
       (  0  0   2 )

(Recall: the solution is not unique.)
Solution. The eigenvalues can be found on the diagonal: {-2, 1, 2}.

• To find an eigenvector associated with -2, form

     (-2)I - A = ( 0  -3   7 )
                 ( 0  -3  -1 )
                 ( 0   0  -4 )

  and look for a vector in the null space of this matrix. By examination,

     ( 1 )
     ( 0 )
     ( 0 )

  is in the null space of this matrix and hence an eigenvector of A.

• To find an eigenvector associated with 1, form

     (1)I - A = ( 3  -3   7 )
                ( 0   0  -1 )
                ( 0   0  -1 )

  and look for a vector in the null space of this matrix. Given where the zero appears
  on the diagonal, we notice that a vector of the form

     ( χ_0 )
     (  1  )
     (  0  )

  is in the null space if χ_0 is chosen appropriately. This means that

     3 χ_0 - 3(1) = 0

  and hence χ_0 = 1, so that

     ( 1 )
     ( 1 )
     ( 0 )

  is in the null space of this matrix and hence an eigenvector of A.

• To find an eigenvector associated with 2, form

     (2)I - A = ( 4  -3   7 )
                ( 0   1  -1 )
                ( 0   0   0 )

  and look for a vector in the null space of this matrix. Given where the zero appears
  on the diagonal, we notice that a vector of the form

     ( χ_0 )
     ( χ_1 )
     (  1  )

  is in the null space if χ_0 and χ_1 are chosen appropriately. This means that

     χ_1 - 1(1) = 0

  and hence χ_1 = 1. Also,

     4 χ_0 - 3(1) + 7(1) = 0,

  so that χ_0 = -1, and

     ( -1 )
     (  1 )
     (  1 )

  is in the null space of this matrix and hence an eigenvector of A.
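These computations are easily checked numerically. In Matlab (our own verification; recall
that eigenvectors are unique only up to scaling):

   A = [ -2 3 -7
          0 1  1
          0 0  2 ];
   null( -2 * eye( 3 ) - A )    % spans the same direction as (  1, 0, 0 )^T
   null(  1 * eye( 3 ) - A )    % spans the same direction as (  1, 1, 0 )^T
   null(  2 * eye( 3 ) - A )    % spans the same direction as ( -1, 1, 1 )^T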
Homework 9.2.3.4 Let U œ Cm◊m be an upper triangular matrix. Give all eigenvalues of
U . For each eigenvalue, give a convenient eigenvector.
Solution. Let Q R
‚0,0 ‚0,1 · · · ‚0,m≠1
c 0
c d
‚1,1 · · · ‚1,m≠1 d
U =c c .. .. . . .. d.
a . . . .
d
b
0 0 · · · ‚m≠1,m≠1

Then Q R
⁄ ≠ ‚0,0 ≠‚0,1 ··· ≠‚0,m≠1
c
c 0 ⁄ ≠ ‚1,1 ··· ≠‚1,m≠1 d
d
⁄I ≠ U = c .. .. .. .. d.
c
a . . . .
d
b
0 0 · · · ⁄ ≠ ‚m≠1,m≠1
is singular if and only if ⁄ = ‚i,i for some i œ {0, . . . , m≠1}. Hence (U ) = {‚0,0 , ‚1,1 , . . . , ‚m≠1,m≠1 }.
Let ⁄ be an eigenvalue of U . Things get a little tricky if ⁄ has multiplicity greater than
one. Partition Q R
U00 u01 U02
c d
U = a 0 ‚11 uT12 b
0 0 U22
where ‚11 = ⁄. We are looking for x ”= 0 such that (⁄I ≠ U )x = 0 or, partitioning x,
Q RQ R Q R
‚11 I ≠ U00 ≠u01 ≠U02 x0 0
c dc d c d
a 0 0 ≠u12T
b a ‰1 b = a 0 b .
0 0 ‚11 I ≠ U22 x2 0

If we choose x2 = 0 and ‰1 = 1, then

(‚11 I ≠ U00 )x0 ≠ u01 = 0

and hence x0 must satisfy


(‚11 I ≠ U00 )x0 = u01 .
If ‚11 I ≠ U00 is nonsingular, then there is a unique solution to this equation, and
Q R
(‚11 I ≠ U00 )≠1 u01
c d
a 1 b
0

is the desired eigenvector. HOWEVER, this means that the partitioning


Q R
U00 u01 U02
c d
U = a 0 ‚11 uT12 b
0 0 U22

must be such that ‚11 is the FIRST diagonal element that equals ⁄.
In the next week, we will see that practical algorithms for computing the eigenvalues and
eigenvectors of a square matrix morph that matrix into an upper triangular matrix via a
sequence of transforms that preserve eigenvalues. The eigenvectors of that triangular matrix
can then be computed using techniques similar to those in the solution to the last homework.
Once those have been computed, they can be "back transformed" into the eigenvectors of
the original matrix.

9.2.4 The Schur and Spectral Decompositions

YouTube: https://www.youtube.com/watch?v=2AsK3KEtsso
Practical methods for computing eigenvalues and eigenvectors transform a given matrix
into a simpler matrix (diagonal or tridiagonal) via a sequence of transformations that preserve
eigenvalues known as similarity transformations.
Definition 9.2.4.1 Given a nonsingular matrix Y , the transformation Y ≠1 AY is called a
similarity transformation (applied to matrix A). ⌃
Definition 9.2.4.2 Matrices A and B are said to be similar if there exists a nonsingular
matrix Y such that B = Y ≠1 AY . ⌃
Homework 9.2.4.1 Let A, B, Y œ Cm◊m , where Y is nonsingular, and (⁄, x) an eigenpair
of A.
Which of the follow is an eigenpair of B = Y ≠1 AY :

• (⁄, x).

• (⁄, Y ≠1 x).

• (⁄, Y x).

• (1/⁄, Y ≠1 x).

Answer. (⁄, Y ≠1 x).


Now justify your answer.
Solution. Since Ax = ⁄x we know that

Y ≠1 AY Y ≠1 x = ⁄Y ≠1 x.

Hence (⁄, Y ≠1 x) is an eigenpair of B.


The observation is that similarity transformations preserve the eigenvalues of a matrix,
as summarized in the following theorem.
Theorem 9.2.4.3 Let A, Y, B œ Cm◊m , assume Y is nonsingular, and let B = Y ≠1 AY .
Then (A) = (B).
Proof. Let ⁄ œ (A) and x be an associated eigenvector. Then Ax = ⁄x if and only if
Y ≠1 AY Y ≠1 x = Y ≠1 ⁄x if and only if B(Y ≠1 x) = ⁄(Y ≠1 x). ⌅

It is not hard to expand the last proof to show that if A is similar to B and ⁄ œ (A)
has algebraic multiplicity k then ⁄ œ (B) has algebraic multiplicity k.

YouTube: https://www.youtube.com/watch?v=n02VjGJX5CQ
In Subsection 2.2.7, we argued that the application of unitary matrices is desirable, since
they preserve length and hence don’t amplify error. For this reason, unitary similarity trans-
formations are our weapon of choice when designing algorithms for computing eigenvalues
and eigenvectors.
Definition 9.2.4.4 Given a nonsingular matrix Q the transformation QH AQ is called a
unitary similarity transformation (applied to matrix A). ⌃

YouTube: https://www.youtube.com/watch?v=mJNM4EYB-9s
The following is a fundamental theorem for the algebraic eigenvalue problem that is key
to practical algorithms for finding eigenvalues and eigenvectors.
Theorem 9.2.4.5 Schur Decomposition Theorem. Let A œ Cm◊m . Then there exist a
unitary matrix Q and upper triangular matrix U such that A = QU QH . This decomposition
is called the Schur decomposition of matrix A.
Proof. We will outline how to construct Q so that Q^H A Q = U, an upper triangular matrix.
   Since a polynomial of degree m has at least one root, matrix A has at least one eigenvalue,
λ_1, and corresponding eigenvector q_1, where we normalize this eigenvector to have length one.
Thus A q_1 = λ_1 q_1. Choose Q_2 so that Q = ( q_1  Q_2 ) is unitary. Then

   Q^H A Q = ( q_1  Q_2 )^H A ( q_1  Q_2 )

           = ( q_1^H A q_1      q_1^H A Q_2 )
             ( Q_2^H A q_1      Q_2^H A Q_2 )

           = ( λ_1              q_1^H A Q_2 )
             ( λ_1 Q_2^H q_1    Q_2^H A Q_2 )

           = ( λ_1   w^T )
             (  0     B  ),

where w^T = q_1^H A Q_2 and B = Q_2^H A Q_2. This insight can be used to construct an inductive
proof. ⌅
In other words: Given matrix A, there exists a unitary matrix Q such that applying the
unitary similarity transformation Q^H A Q yields an upper triangular matrix U. Since then
Λ(A) = Λ(U), the eigenvalues of A can be found on the diagonal of U. The eigenvectors of
U can be computed and from those the eigenvectors of A can be recovered.
   One should not mistake the above theorem and its proof for a constructive way to compute
the Schur decomposition: finding an eigenvalue, λ_1, and/or the eigenvector associated with
it, q_1, is difficult. Also, completing the unitary matrix ( q_1  Q_2 ) is expensive (requiring
the equivalent of a QR factorization).
Homework 9.2.4.2 Let A ∈ C^{m×m}, A = Q U Q^H be its Schur decomposition, and X^{-1} U X =
Λ, where Λ is a diagonal matrix and X is nonsingular.

• How are the elements of Λ related to the elements of U?

• How are the columns of X related to the eigenvectors of A?

Solution.

• How are the elements of Λ related to the elements of U?
  The diagonal elements of U equal the diagonal elements of Λ.

• How are the columns of X related to the eigenvectors of A?

     A = Q U Q^H = Q X Λ X^{-1} Q^H = (QX) Λ (QX)^{-1}.

  Hence the columns of QX equal eigenvectors of A.
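In Matlab, a Schur decomposition can be computed with the built-in schur. The lines below
(our own illustration) verify, for a random complex matrix, that A = Q U Q^H with Q unitary
and U upper triangular, and that the eigenvalues of A appear on the diagonal of U:

   A = randn( 5 ) + 1i * randn( 5 );      % a random (generically non-Hermitian) matrix
   [ Q, U ] = schur( A );                 % for a complex matrix this is the triangular Schur form
   norm( Q * U * Q' - A ) / norm( A )     % of the order of the machine epsilon
   norm( Q' * Q - eye( 5 ) )              % Q is unitary
   sort( diag( U ) ), sort( eig( A ) )    % the same eigenvalues, possibly in a different order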

YouTube: https://www.youtube.com/watch?v=uV5-0O_LBkA
If the matrix is Hermitian, then the Schur decomposition has the added property that U
is diagonal. The resulting decomposition is known as the Spectral decomposition.
Theorem 9.2.4.6 Spectral Decomposition Theorem. Let A œ Cm◊m be Hermitian.
Then there exist a unitary matrix Q and diagonal matrix D œ Rm◊m such that A = QDQH .
This decomposition is called the spectral decomposition of matrix A.
Proof. Let A = Q U Q^H be the Schur decomposition of A. Then U = Q^H A Q. Since A is
Hermitian, so is U, since U^H = (Q^H A Q)^H = Q^H A^H Q = Q^H A Q = U. A triangular matrix
that is Hermitian is diagonal. Any Hermitian matrix has a real-valued diagonal and hence
D has real-valued entries on its diagonal. ⌅
In practical algorithms, it will often occur that an intermediate result can be partitioned
into smaller subproblems. This is known as deflating the problem and builds on the fol-
lowing insights.
A B
AT L AT R
Theorem 9.2.4.7 Let A œ C m◊m
be of form A = , where AT L and ABR are
0 ABR
square submatrices. Then (A) = (AT L ) fi (ABR ).
The proof of the above theorem follows from the next homework regarding how the Schur
decomposition of A can be computed from the Schur decompositions of AT L and ABR .
Homework 9.2.4.3 Let A œ Cm◊m be of form
A B
AT L AT R
A= ,
0 ABR

where AT L and ABR are square submatrices with Schur decompositions

AT L = QT L UT L QH
T L and ABR = QBR UBR QBR .
H

Give the Schur decomposition of A.



Solution.
A
A = B
AT L AT R
0 ABR
A = B
QT L UT L QH
TL AT R
0 QBR UBR QH
BR
=
A B A B A BH
QT L 0 UT L QH T L AT R QBR QT L 0
0 QBR 0 UBR 0 QBR
¸ ˚˙ ˝ ¸ ˚˙ ˝ ¸ ˚˙ ˝
Q U QH
Homework 9.2.4.4 Generalize the result in the last homework for block upper triangular
matrices: Q R
A0,0 A0,1 ··· A0,N ≠1
c 0 A1,1 ··· A1,N ≠1 d
c d
A=c c .. .. d.
d
a 0 0 . . b
0 0 · · · AN ≠1,N ≠1

Solution. For i = 0, . . . , N ≠ 1, let Ai.i = Qi Ui,i QH


i be the Schur decomposition of Ai,i .

Then
A
=
Q R
A0,0 A0,1 · · · A0,N ≠1
c 0 A1,1 · · · A1,N ≠1 d
c d
c
c .. .. d
d
a 0 0 . . b
0 0 · · · AN ≠1,N ≠1
=
Q R
Q0 U0,0 QH
0 A0,1 ··· A0,N ≠1
c
c 0 H
Q1 U11 Q1 · · · A1,N ≠1 d
d
c
c ... .. d
d
a 0 0 . b
0 0 · · · QN ≠1 UN ≠1,N ≠1 QH
N ≠1
=
Q R
Q0 0 · · · 0
c 0 Q 0 d
c 1 ··· d
c
c .. .. d
d
a 0 0 . . b
0 0 · · · QN ≠1
Q R
U0,0 QH T
0 A0,1 Q1 · · · Q0 A0,N ≠1 QN ≠1
c 0 U11 H
· · · Q1 A1,N ≠1 QN ≠1 d
c d
c
c .. .. d
d
a 0 0 . . b
0 0 ··· UN ≠1,N ≠1
Q RH
Q0 0 · · · 0
c 0 Q 0 d
c 1 ··· d
c
c .. .. d
d .
a 0 0 . . b
0 0 · · · QN ≠1

9.2.5 Diagonalizing a matrix

YouTube: https://www.youtube.com/watch?v=dLaFK2TJ7y8
The algebraic eigenvalue problem or, more simply, the computation of eigenvalues and
eigenvectors is often presented as the problem of diagonalizing a matrix. We make that link
in this unit.

Definition 9.2.5.1 Diagonalizable matrix. A matrix A œ Cm◊m is said to be diagonal-


izable if and only if there exists a nonsingular matrix X and diagonal matrix D such that
X ≠1 AX = D. ⌃

YouTube: https://www.youtube.com/watch?v=qW5 cD3K1RU


   Why is this important? Consider the equality w = Av. Notice that we can write w as a
linear combination of the columns of X:

   w = X (X^{-1} w).

In other words, X^{-1} w is the vector of coefficients when w is written in terms of the basis
that consists of the columns of X. Similarly, we can write v as a linear combination of the
columns of X:

   v = X (X^{-1} v).

Now, since X is nonsingular, w = Av is equivalent to X^{-1} w = X^{-1} A X X^{-1} v, and hence
X^{-1} w = D (X^{-1} v).
Remark 9.2.5.2 We conclude that if we view the matrices in the right basis (namely the
basis that consists of the columns of X), then the transformation w := Av simplifies to
ŵ := D v̂, where ŵ = X^{-1} w and v̂ = X^{-1} v. This is a really big deal.
   How is diagonalizing a matrix related to eigenvalues and eigenvectors? Let's assume that
X^{-1} A X = D. We can rewrite this as

   A X = X D

and partition

   A ( x_0  x_1  · · ·  x_{m-1} ) = ( x_0  x_1  · · ·  x_{m-1} ) ( δ_0   0    · · ·   0       )
                                                                 (  0    δ_1  · · ·   0       )
                                                                 (  .    .     .      .       )
                                                                 (  0    0    · · ·   δ_{m-1} ).

Multiplying this out yields

   ( A x_0  A x_1  · · ·  A x_{m-1} ) = ( δ_0 x_0  δ_1 x_1  · · ·  δ_{m-1} x_{m-1} ).

We conclude that

   A x_j = δ_j x_j,

which means that the entries on the diagonal of D are the eigenvalues of A and the corre-
sponding eigenvectors are found as columns of X.
Homework 9.2.5.1 In Homework 9.2.3.3, we computed the eigenvalues and corresponding
eigenvectors of

   A = ( -2  3  -7 )
       (  0  1   1 )
       (  0  0   2 ).

Use the answer to that question to give a matrix X such that X^{-1} A X = Λ. Check that
A X = X Λ.
Solution. The eigenpairs computed for Homework 9.2.3.3 were

   ( -2, ( 1 ) ),   ( 1, ( 1 ) ),   and   ( 2, ( -1 ) ).
         ( 0 )           ( 1 )             (  1 )
         ( 0 )           ( 0 )             (  1 )

Hence

   ( 1  1  -1 )^{-1} ( -2  3  -7 ) ( 1  1  -1 )   ( -2  0  0 )
   ( 0  1   1 )      (  0  1   1 ) ( 0  1   1 ) = (  0  1  0 ).
   ( 0  0   1 )      (  0  0   2 ) ( 0  0   1 )   (  0  0  2 )

We can check this:

   ( -2  3  -7 ) ( 1  1  -1 )   ( 1  1  -1 ) ( -2  0  0 )
   (  0  1   1 ) ( 0  1   1 ) = ( 0  1   1 ) (  0  1  0 ),
   (  0  0   2 ) ( 0  0   1 )   ( 0  0   1 ) (  0  0  2 )

since both sides equal

   ( -2  1  -2 )
   (  0  1   2 ).
   (  0  0   2 )
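The same check takes only a few lines of Matlab:

   A = [ -2 3 -7
          0 1  1
          0 0  2 ];
   X = [  1 1 -1
          0 1  1
          0 0  1 ];
   X \ A * X                          % equals diag( [ -2 1 2 ] ), up to roundoff
   A * X - X * diag( [ -2 1 2 ] )     % the zero matrix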
   Now assume that the eigenvalues of A ∈ C^{m×m} are given by {λ_0, λ_1, . . . , λ_{m-1}}, where
eigenvalues are repeated according to their algebraic multiplicity. Assume that there are m
linearly independent vectors {x_0, x_1, . . . , x_{m-1}} such that A x_j = λ_j x_j. Then

   A ( x_0  x_1  · · ·  x_{m-1} ) = ( x_0  x_1  · · ·  x_{m-1} ) diag(λ_0, λ_1, . . . , λ_{m-1}).

Hence, if X = ( x_0  x_1  · · ·  x_{m-1} ) and D = diag(λ_0, λ_1, . . . , λ_{m-1}), then X^{-1} A X = D. In
other words, if A has m linearly independent eigenvectors, then A is diagonalizable.
   These insights are summarized in the following theorem:
Theorem 9.2.5.3 A matrix A ∈ C^{m×m} is diagonalizable if and only if it has m linearly
independent eigenvectors.
Here are some classes of matrices that are diagonalizable:

• Diagonal matrices.
If A is diagonal, then choosing X = I and A = D yields X ≠1 AX = D.

• Hermitian matrices.
If A is Hermitian, then the spectral decomposition theorem tells us that there exists
unitary matrix Q and diagonal matrix D such that A = QDQH . Choosing X = Q
yields X ≠1 AX = D.

• Triangular matrices with distinct diagonal elements.


If U is upper triangular and has distinct diagonal elements, then by Homework 9.2.3.4
we know we can find an eigenvector associated with each diagonal element and by
design those eigenvectors are linearly independent. Obviously, this can be extended to
lower triangular matrices as well.
Homework 9.2.5.2 Let A œ Cm◊m have distinct eigenvalues.
ALWAYS/SOMETIMES/NEVER: A is diagonalizable.
Answer. ALWAYS
Now prove it!
Solution. Let A = QU QH be the Schur decomposition of matrix A. Since U is upper
triangular, and has the same eigenvalues as A, it has distinct entries along its diagonal.
Hence, by our earlier observations, there exists a nonsingular matrix X such that X ≠1 U X =
D, a diagonal matrix. Now,

X ≠1 QH AQX = X ≠1 U X = D

and hence Y = QX is the nonsingular matrix that diagonalizes A.

YouTube: https://www.youtube.com/watch?v=PMtZN 8CHzM



9.2.6 Jordan Canonical Form

YouTube: https://www.youtube.com/watch?v=amD2FOSXf s
Homework 9.2.6.1 Compute the eigenvalues of the k × k matrix

   J_k(µ) = ( µ  1  0  · · ·  0  0 )
            ( 0  µ  1  · · ·  0  0 )
            ( .  .  .    .    .  . )                                        (9.2.1)
            ( 0  0  0  · · ·  µ  1 )
            ( 0  0  0  · · ·  0  µ )

where k > 1. For each eigenvalue compute a basis for the subspace of its eigenvectors
(including the zero vector to make it a subspace).
Hint.

• How many linearly independent columns does λI - J_k(µ) have?

• What does this say about the dimension of the null space N(λI - J_k(µ))?

• You should be able to find eigenvectors by examination.

Solution. Since the matrix is upper triangular and all entries on its diagonal equal µ, its
only eigenvalue is µ. Now,

   µI - J_k(µ) = ( 0  -1   0  · · ·   0   0 )
                 ( 0   0  -1  · · ·   0   0 )
                 ( .   .   .    .     .   . )
                 ( 0   0   0  · · ·   0  -1 )
                 ( 0   0   0  · · ·   0   0 )

has k-1 linearly independent columns and hence its null space is one dimensional: dim(N(µI -
J_k(µ))) = 1. So, we are looking for one vector in the basis of N(µI - J_k(µ)). By examination,
J_k(µ) e_0 = µ e_0 and hence e_0 is an eigenvector associated with the only eigenvalue µ.
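A short Matlab experiment (our own, with k = 4 and µ = 2) illustrates both observations:
the null space of µI - J_k(µ) is one dimensional, and the matrix of "eigenvectors" returned by
eig is nearly singular because its columns all point in essentially the same direction:

   k = 4;  mu = 2;
   J = mu * eye( k ) + diag( ones( k-1, 1 ), 1 );   % the Jordan block J_k( mu )
   null( mu * eye( k ) - J )                        % a single basis vector: e_0
   [ X, D ] = eig( J );
   diag( D )      % mu, repeated k times (up to roundoff)
   svd( X )       % the smallest singular values are tiny: nearly parallel columns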

YouTube: https://www.youtube.com/watch?v=QEunPSiFZF0
The matrix in (9.2.1) is known as a Jordan block.
The point of the last exercise is to show that if A has an eigenvalue of algebraic multiplicity
k, then it does not necessarily have k linearly independent eigenvectors. That, in turn, means
there are matrices that do not have a full set of eigenvectors. We conclude that there are
matrices that are not diagonalizable. We call such matrices defective.
Definition 9.2.6.1 Defective matrix. A matrix A œ Cm◊m that does not have m linearly
independent eigenvectors is said to be defective. ⌃
Corollary 9.2.6.2 Matrix A œ Cm◊m is diagonalizable if and only if it is not defective.
Proof. This is an immediate consequence of Theorem 9.2.5.3. ⌅
Definition 9.2.6.3 Geometric multiplicity. Let ⁄ œ (A). Then the geometric multi-
plicity of ⁄ is defined to be the dimension of E⁄ (A) defined by

E⁄ (A) = {x œ Cm |Ax = ⁄x}.

In other words, the geometric multiplicity of ⁄ equals the number of linearly independent
eigenvectors that are associated with ⁄. ⌃
Homework 9.2.6.2 Let A œ Cm◊m have the form
A B
A00 0
A=
0 A11

where A00 and A11 are square. Show that


A B
x
• If (⁄, x) is an eigenpair of A00 then (⁄, ) is an eigenpair of A.
0
A B
0
• If (µ, y) is an eigenpair of A11 then (µ, ) is an eigenpair of A.
y
A B
x
• If (⁄, ) is an eigenpair of A then (⁄, x) is an eigenpair of A00 and (⁄, y) is an
y
eigenpair of A11 .

• (A) = (A00 ) fi (A11 ).



Solution. Let A œ Cm◊m have the form


A B
A00 0
A=
0 A11

where A00 and A11 are square. Show that


A B
x
• If (⁄, x) is an eigenpair of A00 then (⁄, ) is an eigenpair of A.
0
A BA B A B A B A B
A00 0 x A00 x ⁄x x
= = =⁄ .
0 A11 0 0 0 0
A B
0
• If (µ, y) is an eigenpair of A11 then (µ, ) is an eigenpair of A.
y
A BA B A B A B A B
A00 0 0 0 0 0
= = =µ .
0 A11 y A11 y µy y
A BA B A B
A00 0 x x
• =⁄ implies that
0 A11 y y
A B A B
A00 x ⁄x
= ,
A11 y ⁄y

and hence A00 x = ⁄x and A11 y = ⁄y.

• (A) = (A00 ) fi (A11 ).


This follows from the first three parts of this problem.
This last homework naturally extends to
Q R
A00 0 ··· 0
c
c 0 A11 ··· 0 d
d
A=c
c .. .. .. .. d
d
a . . . . b
0 0 · · · Akk
The following is a classic result in linear algebra theory that characterizes the relationship
between of a matrix and its eigenvectors:

YouTube: https://www.youtube.com/watch?v=RYg4xLKehDQ

Theorem 9.2.6.4 Jordan Canonical Form Theorem. Let the eigenvalues of A ∈ C^{m×m}
be given by λ_0, λ_1, · · · , λ_{k-1}, where an eigenvalue is listed as many times as its geometric
multiplicity. There exists a nonsingular matrix X such that

   X^{-1} A X = ( J_{m_0}(λ_0)       0          · · ·   0                )
                (     0         J_{m_1}(λ_1)    · · ·   0                )
                (     .              .           .      .                )
                (     0              0          · · ·   J_{m_{k-1}}(λ_{k-1}) ).
For our discussion, the sizes of the Jordan blocks Jmi (⁄i ) are not particularly important.
Indeed, this decomposition, known as the Jordan Canonical Form of matrix A, is not par-
ticularly interesting in practice. It is extremely sensitive to perturbation: even the smallest
random change to a matrix will make it diagonalizable. As a result, there is no practical
mathematical software library or tool that computes it. For this reason, we don’t give its
proof and don’t discuss it further.

9.3 The Power Method and related approaches


9.3.1 The Power Method

YouTube: https://www.youtube.com/watch?v=gbhHOR NxNM


   The Power Method is a simple method that under mild conditions yields a vector corre-
sponding to the eigenvalue that is largest in magnitude.
   Throughout this section we will assume that a given matrix A ∈ C^{m×m} is diagonalizable.
Thus, there exists a nonsingular matrix X and diagonal matrix Λ such that A = X Λ X^{-1}.
From the last section, we know that the columns of X equal eigenvectors of A and the
elements on the diagonal of Λ equal the eigenvalues:

   X = ( x_0  x_1  · · ·  x_{m-1} )   and   Λ = diag(λ_0, λ_1, . . . , λ_{m-1})

so that

   A x_i = λ_i x_i   for i = 0, . . . , m-1.

For most of this section we will assume that

   |λ_0| > |λ_1| ≥ · · · ≥ |λ_{m-1}|.

In particular, λ_0 is the eigenvalue with maximal absolute value.

9.3.1.1 First attempt

YouTube: https://www.youtube.com/watch?v=sX9pxaH7Wvs
   Let v^(0) ∈ C^m be an "initial guess". Our (first attempt at the) Power Method iterates
as follows:
Pick v (0)
for k = 0, . . .
v (k+1) = Av (k)
endfor
Clearly v (k) = Ak v (0) .
Let
v (0) = Â0 x0 + Â1 x1 + · · · + Âm≠1 xm≠1 .
Then
v (1) = Av (0) = A (Â0 x0 + Â1 x1 + · · · + Âm≠1 xm≠1 )
= Â0 Ax0 + Â1 Ax1 + · · · + Âm≠1 Axm≠1
= Â0 ⁄0 x0 + Â1 ⁄0 x1 + · · · + Âm≠1 ⁄m≠1 xm≠1 ,
v (2) = Av (1) = A (Â0 ⁄0 x0 + Â1 ⁄0 x1 + · · · + Âm≠1 ⁄m≠1 xm≠1 )
= Â0 ⁄0 Ax0 + Â1 ⁄0 Ax1 + · · · + Âm≠1 ⁄m≠1 Axm≠1
= Â0 ⁄20 x0 + Â1 ⁄21 x1 + · · · + Âm≠1 ⁄2m≠1 xm≠1 ,
..
.
v (k) = Av (k≠1) = Â0 ⁄k0 x0 + Â1 ⁄k1 x1 + · · · + Âm≠1 ⁄km≠1 xm≠1 .
Now, as long as Â0 ”= 0 clearly Â0 ⁄k0 x0 will eventually dominate since

|⁄i |/|⁄0 | < 1.

This means that v (k) will start pointing in the direction of x0 . In other words, it will start
pointing in the direction of an eigenvector corresponding to ⁄0 . The problem is that it will
become infinitely long if |⁄0 | > 1 or infinitesimally short if |⁄0 | < 1. All is good if |⁄0 | = 1.

YouTube: https://www.youtube.com/watch?v=LTON9qw8B0Y
An alternative way of looking at this is to exploit the fact that the eigenvectors, xi , equal
the columns of X. Then Q R
Â0
c d
c Â1 d ≠1 (0)
y = c .. d
c
d=X v
a . b
Âm≠1
and
v (0) = A0 v (0) = Xy
v (1) = Av (0) = AXy = X y
v (2) = Av (1) = AX y = X 2 y
..
.
v (k) = Av (k≠1) = AX k≠1
y=X k
y
Thus Q RQ R
⁄k0 0 · · · 0 Â0
1 2 c 0 ⁄k1 · · ·
c
0 dc
dc Â1 d
d
v (k) = x0 x1 · · · xm≠1 c c .. .. . . .. dc .. d
a . . . .
dc
ba .
d
b
0 0 · · · ⁄km≠1 Âm≠1
= Â0 ⁄k0 x0 + Â1 ⁄k1 x1 + · · · + Âm≠1 ⁄km≠1 xm≠1 .
Notice how looking at v (k) in the right basis (the eigenvectors) simplified the explanation.

9.3.1.2 Second attempt

YouTube: https://www.youtube.com/watch?v=MGTGo_TGpTM

Given an initial v (0) œ Cm , a second attempt at the Power Method iterates as follows:

Pick v (0)
for k = 0, . . .
v (k+1) = Av (k) /⁄0
endfor

It is not hard to see that then


v (k) = Av (k≠1) /⁄0 = Ak v (0) /⁄k0
1 2k 1 2k 1 2
⁄m≠1 k
= Â0 ⁄0
⁄0
x0 + Â1 ⁄1
⁄0
x1 + · · · + Âm≠1 xm≠1
1 2k 1 2k ⁄0
= Â0 x0 + Â1 ⁄1
⁄0
x1 + · · · + Âm≠1 ⁄m≠1
⁄0
xm≠1 .
- -
- -
Clearly limkæŒ v (k) = Â0 x0 , as long as Â0 ”= 0, since - ⁄⁄k0 - < 1 for k > 0.
Another way of stating this is to notice that

Ak = (AA · · · A) = (X X ≠1 )(X X ≠1 ) · · · (X X ≠1 ) = X k
X ≠1 .
¸ ˚˙ ˝ ¸ ˚˙ ˝
times k

so that
v (k) = Ak v (0) /⁄k0
= Ak Xy/⁄k0
= X k X ≠1 Xy/⁄k0
= X k y/⁄k0
Q R
1 0 ··· 0
c 1 2k d
1 2 c
c 0 ⁄1
⁄0
··· 0 d
d
= X k
/⁄k0 y = Xc
c .. .. .. .. d y.
d
c . . . . d
a 1 2k b
0 0 ··· ⁄m≠1
⁄0
- -
- -
Now, since - ⁄⁄k0 - < 1 for k > 1 we can argue that

limkæŒ v (k)
=
Q R
1 0 ··· 0
c 1 2k d
c
c 0 ⁄1
⁄0
··· 0 d
d
limkæŒ X c
c .. .. .. .. dy
d
c . . . . d
a 1 2 b
⁄m≠1 k
0 0 ··· ⁄0
=
Q R
1 0 ··· 0
c
c 0 0 ··· 0 d
d
Xc
c .... . . .. dy
d
a . . . . b
0 0 ··· 0
=
XÂ0 e0
=
Â0 Xe0 = Â0 x0 .
Thus, as long as Â0 ”= 0 (which means v (0) must have a component in the direction of x0 ) this
method will eventually yield a vector in the direction of x0 . However, this time the problem
is that we don’t know ⁄0 when we start.

9.3.1.3 A practical Power Method


The following algorithm, known as the Power Method, avoids the problem of v (k) growing
or shrinking in length without requiring ⁄0 to be known, by scaling it to be of unit length
at each step:
Pick v (0) of unit length
for k = 0, . . .
v (k+1) = Av (k)
v (k+1) = v (k+1) /Îv (k+1) Î
endfor
The idea is that we are only interested in the direction of the eigenvector, and hence it is
convenient to rescale the vector to have unit length at each step.

9.3.1.4 The Rayleigh quotient


A question is how to extract an approximation of ⁄0 given an approximation of x0 . The
following insights provide the answer:
Definition 9.3.1.1 Rayleigh quotient. If A ∈ C^{m×m} and x ≠ 0 ∈ C^m then

   (x^H A x) / (x^H x)

is known as the Rayleigh quotient. ⌃


Homework 9.3.1.1 Let x be an eigenvector of A.
ALWAYS/SOMETIMES/NEVER: ⁄ = xH Ax/(xH x) is the associated eigenvalue of A.
Answer. ALWAYS
Now prove it!
Solution. Let x be an eigenvector of A and ⁄ the associated eigenvalue. Then Ax = ⁄x.
Multiplying on the left by xH yields xH Ax = ⁄xH x which, since x ”= 0 means that ⁄ =
xH Ax/(xH x).
If x is an approximation of the eigenvector x0 associated with ⁄0 , then its Rayleigh
quotient is an approximation to ⁄0 .
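Putting these pieces together, here is a minimal Matlab sketch of the practical Power Method
with a Rayleigh quotient estimate of λ_0. The test matrix, the fixed number of iterations, and
the random starting vector are our own choices; a random v^(0) has a component in the
direction of x_0 with probability one.

   A = [ 2  1
         1 -0.5 ];               % any matrix with |lambda_0| > |lambda_1|
   v = randn( size( A, 1 ), 1 );
   v = v / norm( v );            % v^(0) of unit length
   for k = 1:100
     v = A * v;                  % v^(k+1) := A v^(k)
     v = v / norm( v );          % rescale to unit length
   end
   lambda0 = v' * A * v          % Rayleigh quotient ( v' * v = 1, so no division needed )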

9.3.2 The Power Method: Convergence

YouTube: https://www.youtube.com/watch?v=P-U4dfwHMwM
Before we make the algorithm practical, let us examine how fast the iteration converges.
This requires a few definitions regarding rates of convergence.
Definition 9.3.2.1 Convergence of a sequence of scalars. Let –0 , –1 , –2 , . . . œ R be
an infinite sequence of scalars. Then –k is said to converge to – if

lim |–k ≠ –| = 0.
kæŒ


Definition 9.3.2.2 Convergence of a sequence of vectors. Let x0 , x1 , x2 , . . . œ Cm be
an infinite sequence of vectors. Then xk is said to converge to x if for any norm Î · Î

lim Îxk ≠ xÎ = 0.
kæŒ


Because of the equivalence of norms, discussed in Subsection 1.2.6, if a sequence of vectors
converges in one norm, then it converges in all norms.
Definition 9.3.2.3 Rate of convergence. Let –0 , –1 , –2 , . . . œ R be an infinite sequence
of scalars that converges to –. Then

• –k is said to converge linearly to – if for sufficiently large k

|–k+1 ≠ –| Æ C|–k ≠ –|

for some constant C < 1. In other words, if

|–k+1 ≠ –|
lim = C < 1.
kæŒ |–k ≠ –|

• –k is said to converge superlinearly to – if for sufficiently large k

|–k+1 ≠ –| Æ Ck |–k ≠ –|

with Ck æ 0. In other words, if

|–k+1 ≠ –|
lim = 0.
kæŒ |–k ≠ –|

• –k is said to converge quadratically to – if for sufficiently large k

|–k+1 ≠ –| Æ C|–k ≠ –|2

for some constant C. In other words, if

|–k+1 ≠ –|
lim = C.
kæŒ |–k ≠ –|2

• –k is said to converge cubically to – if for large enough k

|–k+1 ≠ –| Æ C|–k ≠ –|3

for some constant C. In other words, if

|–k+1 ≠ –|
lim = C.
kæŒ |–k ≠ –|3


Linear convergence can be slow. Let’s say that for k Ø K we observe that

|–k+1 ≠ –| Æ C|–k ≠ –|.

Then, clearly, |–k+n ≠ –| Æ C n |–k ≠ –|. If C = 0.99, progress may be very, very slow. If

|–k ≠ –| = 1, then
|–k+1 ≠ –| Æ 0.99000
|–k+2 ≠ –| Æ 0.98010
|–k+3 ≠ –| Æ 0.97030
|–k+4 ≠ –| Æ 0.96060
|–k+5 ≠ –| Æ 0.95099
|–k+6 ≠ –| Æ 0.94148
|–k+7 ≠ –| Æ 0.93206
|–k+8 ≠ –| Æ 0.92274
|–k+9 ≠ –| Æ 0.91351
Quadratic convergence is fast. Now
|–k+1 ≠ –| Æ C|–k ≠ –|2
|–k+2 ≠ –| Æ C|–k+1 ≠ –|2 Æ C(C|–k ≠ –|2 )2 = C 3 |–k ≠ –|4
|–k+3 ≠ –| Æ C|–k+2 ≠ –|2 Æ C(C 3 |–k ≠ –|4 )2 = C 7 |–k ≠ –|8
..
.
|–k+n ≠ –| Æ C 2 |–k ≠ –|2
n ≠1 n

Even if C = 0.99 and |–k ≠ –| = 1, then


|–k+1 ≠ –| Æ 0.99000
|–k+2 ≠ –| Æ 0.970299
|–k+3 ≠ –| Æ 0.932065
|–k+4 ≠ –| Æ 0.860058
|–k+5 ≠ –| Æ 0.732303
|–k+6 ≠ –| Æ 0.530905
|–k+7 ≠ –| Æ 0.279042
|–k+8 ≠ –| Æ 0.077085
|–k+9 ≠ –| Æ 0.005882
|–k+10 ≠ –| Æ 0.000034
If we consider – the correct result then, eventually, the number of correct digits roughly
doubles in each iteration. This can be explained as follows: If |–k ≠ –| < 1, then the number
of correct decimal digits is given by
≠ log10 |–k ≠ –|.
Since log10 is a monotonically increasing function,
log10 |–k+1 ≠ –|
Æ
log10 C|–k ≠ –|2
=
log10 (C) + 2 log10 |–k ≠ –|
Æ
2 log10 |–k ≠ –|

and hence
≠ log10 |–k+1 ≠ –| Ø 2( ≠ log10 |–k ≠ –| ).
¸ ˚˙ ˝ ¸ ˚˙ ˝
number of correct number of correct
digits in –k+1 digits in –k
Cubic convergence is dizzyingly fast: Eventually the number of correct digits triples from
one iteration to the next.
For our analysis for the convergence of the Power Method, we define a convenient norm.
Homework 9.3.2.1 Let X œ Cm◊m be nonsingular. Define Î · ÎX ≠1 : Cm æ R by ÎyÎX ≠1 =
ÎX ≠1 yÎ for some given norm Î · Î : Cm æ R.
ALWAYS/SOMETIMES/NEVER: Î · ÎX ≠1 is a norm.
Solution. We need to show that

• If y ≠ 0 then ‖y‖X⁻¹ > 0:

  Let y ≠ 0 and z = X⁻¹y. Then z ≠ 0 since X is nonsingular. Hence

      ‖y‖X⁻¹ = ‖X⁻¹y‖ = ‖z‖ > 0.

• If α ∈ C and y ∈ Cm then ‖αy‖X⁻¹ = |α|‖y‖X⁻¹:

      ‖αy‖X⁻¹ = ‖X⁻¹(αy)‖ = ‖αX⁻¹y‖ = |α|‖X⁻¹y‖ = |α|‖y‖X⁻¹.

• If x, y ∈ Cm then ‖x + y‖X⁻¹ ≤ ‖x‖X⁻¹ + ‖y‖X⁻¹:

      ‖x + y‖X⁻¹ = ‖X⁻¹(x + y)‖ = ‖X⁻¹x + X⁻¹y‖ ≤ ‖X⁻¹x‖ + ‖X⁻¹y‖ = ‖x‖X⁻¹ + ‖y‖X⁻¹.
What do we learn from this exercise? Recall that a vector z can alternatively be written
as X(X⁻¹z) so that the vector ẑ = X⁻¹z tells you how to represent the vector z in the basis
given by the columns of X. What the exercise tells us is that if we measure a vector by
applying a known norm in a new basis, then that is also a norm.

With this insight, we can perform our convergence analysis:

    v(k) − ψ0 x0
        = Ak v(0)/λ0^k − ψ0 x0
        = X diag(1, (λ1/λ0)^k, . . . , (λm−1/λ0)^k) X⁻¹ v(0) − ψ0 x0
        = X diag(1, (λ1/λ0)^k, . . . , (λm−1/λ0)^k) y − ψ0 x0
        = X diag(0, (λ1/λ0)^k, . . . , (λm−1/λ0)^k) y.

Hence

    X⁻¹(v(k) − ψ0 x0) = diag(0, (λ1/λ0)^k, . . . , (λm−1/λ0)^k) y

and

    X⁻¹(v(k+1) − ψ0 x0) = diag(0, λ1/λ0, . . . , λm−1/λ0) X⁻¹(v(k) − ψ0 x0).
Now, let ‖ · ‖ be a p-norm and ‖ · ‖X⁻¹ as defined in Homework 9.3.2.1. Then

    ‖v(k+1) − ψ0 x0‖X⁻¹
        = ‖X⁻¹(v(k+1) − ψ0 x0)‖
        = ‖diag(0, λ1/λ0, . . . , λm−1/λ0) X⁻¹(v(k) − ψ0 x0)‖
        ≤ |λ1/λ0| ‖X⁻¹(v(k) − ψ0 x0)‖ = |λ1/λ0| ‖v(k) − ψ0 x0‖X⁻¹.

This shows that, in this norm, the convergence of v (k) to Â0 x0 is linear: The difference
between current approximation, v (k) , and the eventual vector in the direction of the desired
eigenvector, Âx0 , is reduced by at least a constant factor in each iteration.
Now, what if
|⁄0 | = · · · = |⁄n≠1 | > |⁄n | Ø . . . Ø |⁄m≠1 |?
By extending the above analysis one can easily show that v (k) will converge to a vector in
the subspace spanned by the eigenvectors associated with ⁄0 , . . . , ⁄n≠1 .
An important special case is when n = 2: if A is real-valued then ⁄0 may be complex-
valued in which case its conjugate, ⁄̄0 , is also an eigenvalue and hence has the same mag-
nitude as ⁄0 . We deduce that v (k) will always be in the space spanned by the eigenvectors
corresponding to ⁄0 and ⁄̄0 .

9.3.3 The Inverse Power Method

YouTube: https://www.youtube.com/watch?v=yr YmNdYBCs


Homework 9.3.3.1 Let A ∈ Cm×m be nonsingular, and (λ, x) an eigenpair of A.
Which of the following is an eigenpair of A⁻¹:

• (λ, x).

• (λ, A⁻¹x).

• (1/λ, A⁻¹x).

• (1/λ, x).

Answer. (1/λ, x).

Now justify your answer.
Solution. Since Ax = λx and A is nonsingular, we know that A⁻¹ exists and λ ≠ 0. Hence

    (1/λ) x = A⁻¹x,

which can be rewritten as

    A⁻¹x = (1/λ) x.

We conclude that (1/λ, x) is an eigenpair of A⁻¹.

YouTube: https://www.youtube.com/watch?v=pKPMuiCNC2s
The Power Method homes in on an eigenvector associated with the largest (in magnitude)
eigenvalue. The Inverse Power Method homes in on an eigenvector associated with the
smallest eigenvalue (in magnitude).
Once again, we assume that a given matrix A ∈ Cm×m is diagonalizable so that there
exist a matrix X and diagonal matrix Λ such that A = XΛX⁻¹. We further assume that
Λ = diag(λ0, · · · , λm−1) and

    |λ0| ≥ |λ1| ≥ · · · ≥ |λm−2| > |λm−1| > 0.

Notice that this means that A is nonsingular.
Clearly, if

    |λ0| ≥ |λ1| ≥ · · · ≥ |λm−2| > |λm−1| > 0,

then

    |1/λm−1| > |1/λm−2| ≥ |1/λm−3| ≥ · · · ≥ |1/λ0|.

Thus, an eigenvector associated with the smallest (in magnitude) eigenvalue of A is an
eigenvector associated with the largest (in magnitude) eigenvalue of A⁻¹.

YouTube: https://www.youtube.com/watch?v=D6KF28ycRB0
This suggests the following naive iteration (which mirrors the second attempt for the
Power Method in Subsubsection 9.3.1.2, but iterating with A⁻¹):

    for k = 0, . . .
        v(k+1) = A⁻¹v(k)
        v(k+1) = λm−1 v(k+1)
    endfor

From the analysis of the convergence in Subsection 9.3.2 for the Power Method algorithm,
we conclude that now

    ‖v(k+1) − ψm−1 xm−1‖X⁻¹ ≤ |λm−1/λm−2| ‖v(k) − ψm−1 xm−1‖X⁻¹.

A more practical Inverse Power Method algorithm is given by

    Pick v(0) of unit length
    for k = 0, . . .
        v(k+1) = A⁻¹v(k)
        v(k+1) = v(k+1)/‖v(k+1)‖
    endfor

We would probably want to factor P A = LU (LU factorization with partial pivoting) once
and solve L(U v(k+1)) = P v(k) rather than multiplying with A⁻¹.
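As a concrete illustration, the loop above might be coded in Matlab as follows. This is a
minimal sketch, not one of the course's routines; the function name InversePowerMethodSketch
and its arguments are made up for this illustration.

    function [ lambda, v ] = InversePowerMethodSketch( A, v, maxiters )
    % Minimal sketch of the practical Inverse Power Method.
    % Factor P A = L U once so that each iteration only requires two
    % triangular solves rather than a solve with A (or a multiply with inv(A)).
      [ L, U, P ] = lu( A );
      v = v / norm( v );
      for k = 1:maxiters
        v = U \ ( L \ ( P * v ) );   % v := A^{-1} v via the LU factors
        v = v / norm( v );           % normalize to have length one
      end
      lambda = v' * A * v;           % Rayleigh quotient approximates lambda_{m-1}
    end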

9.3.4 The Rayleigh Quotient Iteration

YouTube: https://www.youtube.com/watch?v=7OOJcvYxbxM
A basic idea that allows one to accelerate the convergence of the inverse iteration is
captured by the following exercise:
Homework 9.3.4.1 Let A ∈ Cm×m, ρ ∈ C, and (λ, x) an eigenpair of A.
Which of the following is an eigenpair of the shifted matrix A − ρI:

• (λ, x).

• (λ, A⁻¹x).

• (λ − ρ, x).

• (1/(λ − ρ), x).

Answer. (λ − ρ, x).
Now justify your answer.
Solution. Let Ax = λx. Then

    (A − ρI)x = Ax − ρx = λx − ρx = (λ − ρ)x.

We conclude that (λ − ρ, x) is an eigenpair of A − ρI.



YouTube: https://www.youtube.com/watch?v=btFWxkXkXZ8
The matrix A − ρI is referred to as the matrix A that has been "shifted" by ρ. What the
next lemma captures is that shifting A by ρ shifts the spectrum of A by ρ:
Lemma 9.3.4.1 Let A ∈ Cm×m, A = XΛX⁻¹ and ρ ∈ C. Then A − ρI = X(Λ − ρI)X⁻¹.
Homework 9.3.4.2 Prove Lemma 9.3.4.1.
Solution.

    A − ρI = XΛX⁻¹ − ρXX⁻¹ = X(Λ − ρI)X⁻¹.

This suggests the following (naive) iteration: Pick a value ρ close to λm−1. Iterate

    Pick v(0) of unit length
    for k = 0, . . .
        v(k+1) = (A − ρI)⁻¹v(k)
        v(k+1) = (λm−1 − ρ)v(k+1)
    endfor

Of course one would solve (A − ρI)v(k+1) = v(k) rather than computing and applying the
inverse of A − ρI.
If we index the eigenvalues so that

    |λm−1 − ρ| < |λm−2 − ρ| ≤ · · · ≤ |λ0 − ρ|

then

    ‖v(k+1) − ψm−1 xm−1‖X⁻¹ ≤ |(λm−1 − ρ)/(λm−2 − ρ)| ‖v(k) − ψm−1 xm−1‖X⁻¹.

The closer to λm−1 the shift ρ is chosen, the more favorable the ratio (constant) that dictates
the linear convergence of this modified Inverse Power Method.

YouTube: https://www.youtube.com/watch?v=fCDYbunugKk

A more practical algorithm is given by

    Pick v(0) of unit length
    for k = 0, . . .
        v(k+1) = (A − ρI)⁻¹v(k)
        v(k+1) = v(k+1)/‖v(k+1)‖
    endfor

where instead of multiplying by the inverse one would want to solve the linear system
(A − ρI)v(k+1) = v(k).
The question now becomes how to choose ρ so that it is a good guess for λm−1. Often
an application inherently supplies a reasonable approximation for the smallest eigenvalue or
an eigenvalue of particular interest. Alternatively, we know that eventually v(k) becomes a
good approximation for xm−1 and therefore the Rayleigh quotient gives us a way to find a
good approximation for λm−1. This suggests the (naive) Rayleigh quotient iteration:

    Pick v(0) of unit length
    for k = 0, . . .
        ρk = v(k)H Av(k) / (v(k)H v(k))
        v(k+1) = (A − ρk I)⁻¹v(k)
        v(k+1) = (λm−1 − ρk)v(k+1)
    endfor

Here λm−1 is the eigenvalue to which the method eventually converges.


Now

    ‖v(k+1) − ψm−1 xm−1‖X⁻¹ ≤ |(λm−1 − ρk)/(λm−2 − ρk)| ‖v(k) − ψm−1 xm−1‖X⁻¹

with

    lim_{k→∞} (λm−1 − ρk) = 0,

which means superlinear convergence is observed. In fact, it can be shown that once k is
large enough

    ‖v(k+1) − ψm−1 xm−1‖X⁻¹ ≤ C‖v(k) − ψm−1 xm−1‖²X⁻¹,

thus achieving quadratic convergence. Roughly speaking this means that every iteration
doubles the number of correct digits in the current approximation. To prove this, one shows
that |λm−1 − ρk| ≤ K‖v(k) − ψm−1 xm−1‖X⁻¹ for some constant K. Details go beyond this
discussion.
Better yet, it can be shown that if A is Hermitian, then (once k is large enough)

    ‖v(k+1) − ψm−1 xm−1‖ ≤ C‖v(k) − ψm−1 xm−1‖³

for some constant C and hence the naive Rayleigh Quotient Iteration achieves cubic conver-
gence for Hermitian matrices. Here our norm ‖ · ‖X⁻¹ becomes any p-norm since the Spectral
Decomposition Theorem tells us that for Hermitian matrices X can be taken to equal a uni-
tary matrix. Roughly speaking this means that every iteration triples the number of correct
digits in the current approximation. This is mind-bogglingly fast convergence!

A practical Rayleigh Quotient Iteration is given by

    v(0) = v(0)/‖v(0)‖2
    for k = 0, . . .
        ρk = v(k)H Av(k)             (Now ‖v(k)‖2 = 1)
        v(k+1) = (A − ρk I)⁻¹v(k)
        v(k+1) = v(k+1)/‖v(k+1)‖
    endfor
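In Matlab, these steps might be sketched as follows. This is a hedged illustration, not one of
the course's routines; the name RayleighQuotientIterationSketch is made up, and the shifted
system is solved with backslash rather than by forming an inverse.

    function [ rho, v ] = RayleighQuotientIterationSketch( A, v, maxiters )
    % Minimal sketch of the practical Rayleigh Quotient Iteration.
      m = size( A, 1 );
      v = v / norm( v );                      % now v' * v = 1
      for k = 1:maxiters
        rho = v' * A * v;                     % Rayleigh quotient (eigenvalue estimate)
        v   = ( A - rho * eye( m ) ) \ v;     % solve with the shifted matrix
        v   = v / norm( v );                  % normalize to have length one
      end
      rho = v' * A * v;
    end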
Remark 9.3.4.2 A concern with the (Shifted) Inverse Power Method and Rayleigh Quotient
Iteration is that the matrix with which one solves is likely nearly singular. It turns out that
this actually helps: the error that is amplified most is in the direction of the eigenvector
associated with the smallest eigenvalue (after shifting, if appropriate).

9.3.5 Discussion
To summarize this section:
• The Power Method finds the eigenvector associated with the largest eigenvalue (in
  magnitude). It requires a matrix-vector multiplication for each iteration, thus costing
  approximately 2m² flops per iteration if A is a dense m × m matrix. The convergence
  is linear.
• The Inverse Power Method finds the eigenvector associated with the smallest eigenvalue
  (in magnitude). It requires the solution of a linear system for each iteration. By
  performing an LU factorization with partial pivoting, the investment of an initial
  O(m³) expense reduces the cost to approximately 2m² flops per iteration if A is
  a dense m × m matrix. The convergence is linear.
• The Rayleigh Quotient Iteration finds an eigenvector, but with which eigenvalue it is
  associated is not clear from the start. It requires the solution of a linear system for
  each iteration. If computed via an LU factorization with partial pivoting, the cost is
  O(m³) per iteration if A is a dense m × m matrix. The convergence is
  quadratic if A is not Hermitian, and cubic if it is.
The cost of these methods is greatly reduced if the matrix is sparse, in which case each
iteration may require as little as O(m) flops.

9.4 Enrichments
9.4.1 How to compute the eigenvalues of a 2 × 2 matrix
We have noted that finding the eigenvalues of a 2 × 2 matrix requires the solution to the
characteristic polynomial. In particular, if a 2 × 2 matrix A is real-valued and

    A = ( α00  α01
          α10  α11 )

then

    det(λI − A) = (λ − α00)(λ − α11) − α10 α01 = λ² + β λ + γ,

where β = −(α00 + α11) and γ = α00 α11 − α10 α01.
It is then tempting to use the quadratic formula to find the roots:
\[ \lambda_0 = \frac{-\beta + \sqrt{ \beta^2 - 4 \gamma } }{2} \]
and
\[ \lambda_1 = \frac{-\beta - \sqrt{ \beta^2 - 4 \gamma } }{2}. \]
However, as discussed in Subsection C.0.2, one of these
formulae may cause catastrophic cancellation if γ is small. When is γ small? When α00 α11 −
α10 α01 is small. In other words, when the determinant of A is small or, equivalently, when
A has a small eigenvalue.
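A standard remedy, sketched below in Matlab for illustration (this is not code from the course
materials, and it assumes real eigenvalues so that β² − 4γ ≥ 0), is to compute the root that
does not suffer from cancellation first and to recover the other root from the product of the
roots, γ = λ0 λ1:

    % Eigenvalues of a real-valued 2 x 2 matrix A, avoiding catastrophic cancellation.
    beta  = -( A(1,1) + A(2,2) );            % lambda^2 + beta*lambda + gamma = 0
    gamma = A(1,1) * A(2,2) - A(2,1) * A(1,2);
    disc  = sqrt( beta^2 - 4 * gamma );      % assumed real
    if beta <= 0
      lambda0 = ( -beta + disc ) / 2;        % no cancellation in this branch
    else
      lambda0 = ( -beta - disc ) / 2;
    end
    lambda1 = gamma / lambda0;               % product of the roots equals gamma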
In the next week, we will discuss the QR algorithm for computing the Spectral Decompo-
sition of a Hermitian matrix. We do not discuss the QR algorithm for computing the Schur
Decomposition of an m × m non-Hermitian matrix, which uses the eigenvalues of

    ( αm−2,m−2  αm−2,m−1
      αm−1,m−2  αm−1,m−1 )

to "shift" the matrix. (What this means will become clear next week.) This happened to come
up in Robert's dissertation work. Making the "rookie mistake" of not avoiding catastrophic
cancellation when computing the roots of a quadratic polynomial cost him three weeks of
his life (debugging his code), since the algorithm that resulted did not converge correctly...
Don't repeat his mistakes!

9.5 Wrap Up
9.5.1 Additional homework
Homework 9.5.1.1 Let ‖ · ‖ be a matrix norm induced by a vector norm ‖ · ‖. Prove that for
any A ∈ Cm×m, the spectral radius, ρ(A), satisfies ρ(A) ≤ ‖A‖.
Some results in linear algebra depend on there existing a consistent matrix norm ‖ · ‖
such that ‖A‖ < 1. The following exercise implies that one can alternatively show that the
spectral radius is bounded by one: ρ(A) < 1.
Homework 9.5.1.2 Given a matrix A ∈ Cm×m and ε > 0, there exists a consistent matrix
norm ‖ · ‖ such that ‖A‖ ≤ ρ(A) + ε.

9.5.2 Summary
Definition 9.5.2.1 Eigenvalue, eigenvector, and eigenpair. Let A ∈ Cm×m. Then
λ ∈ C and nonzero x ∈ Cm are said to be an eigenvalue and corresponding eigenvector if
Ax = λx. The tuple (λ, x) is said to be an eigenpair. ⌃
For A ∈ Cm×m, the following are equivalent statements:

• A is nonsingular.

• A has linearly independent columns.

• There does not exist x ”= 0 such that Ax = 0.

• N (A) = {0}. (The null space of A is trivial.)

• dim(N (A)) = 0.

• det(A) ”= 0.

For A œ Cm◊m , the following are equivalent statements:

• ⁄ is an eigenvalue of A

• (⁄I ≠ A) is singular.

• (⁄I ≠ A) has linearly dependent columns.

• There exists x ”= 0 such that (⁄I ≠ A)x = 0.

• The null space of ⁄I ≠ A is nontrivial.

• dim(N (⁄I ≠ A)) > 0.

• det(⁄I ≠ A) = 0.
Definition 9.5.2.2 Spectrum of a matrix. The set of all eigenvalues of A is denoted by
Λ(A) and is called the spectrum of A. ⌃
Definition 9.5.2.3 Spectral radius. The spectral radius of A, ρ(A), equals the magnitude
of the largest eigenvalue, in magnitude:

    ρ(A) = max_{λ∈Λ(A)} |λ|.

Theorem 9.5.2.4 Gershgorin Disk Theorem. Let A ∈ Cm×m,

    A = ( α0,0     α0,1     · · ·  α0,m−1
          α1,0     α1,1     · · ·  α1,m−1
           ...      ...      ...     ...
          αm−1,0   αm−1,1   · · ·  αm−1,m−1 ),

    ρi(A) = Σ_{j≠i} |αi,j|,

and

    Ri(A) = {x s.t. |x − αi,i| ≤ ρi}.

In other words, fli (A) equals the sum of the absolute values of the off diagonal elements of A
in row i and Ri (A) equals the set of all points in the complex plane that are within a distance
fli of diagonal element –i,i . Then

(A) µ fii Ri (A).

In other words, every eigenvalue lies in one of the disks of radius fli (A) around diagonal
element –i,i .
Corollary 9.5.2.5 Let A and Ri (A) be as defined in Theorem 9.5.2.4. Let K and K C be
disjoint subsets of {0, . . . , m ≠ 1} such that K fi K C = {0, . . . , m ≠ 1}. In other words, let
K be a subset of {0, . . . , m ≠ 1 and K C its complement. If
1 2
(fikœK Rk (A)) fl fijœK C Rj (A) = ÿ

then fikœK Rk (A) contains exactly |K| eigenvalues of A (multiplicity counted). In other words,
if fikœK Rk (A) does not intersect with any of the other disks, then it contains as many eigen-
values of A (multiplicity counted) as there are elements of K.
Some useful facts for A ∈ Cm×m:
• 0 ∈ Λ(A) if and only if A is singular.
• The eigenvectors corresponding to distinct eigenvalues are linearly independent.
• Let A be nonsingular. Then (λ, x) is an eigenpair of A if and only if (1/λ, x) is an
  eigenpair of A⁻¹.
• (λ, x) is an eigenpair of A if and only if (λ − ρ, x) is an eigenpair of A − ρI.
Some useful facts for Hermitian A ∈ Cm×m:
• All eigenvalues are real-valued.
• A is HPD if and only if all its eigenvalues are positive.
• If (λ, x) and (μ, y) are both eigenpairs of Hermitian A with λ ≠ μ, then x and y are orthogonal.
Definition 9.5.2.6 The determinant of

    A = ( α00  α01
          α10  α11 )

is given by

    det(A) = α00 α11 − α10 α01.

The characteristic polynomial of

    A = ( α00  α01
          α10  α11 )

is given by

    det(λI − A) = (λ − α00)(λ − α11) − α10 α01.

This is a second degree polynomial in λ and has two roots (multiplicity counted). The
eigenvalues of A equal the roots of this characteristic polynomial.
The characteristic polynomial of A ∈ Cm×m is given by

    det(λI − A).

This is a polynomial in λ of degree m and has m roots (multiplicity counted). The eigen-
values of A equal the roots of this characteristic polynomial. Hence, A has m eigenvalues
(multiplicity counted).
Definition 9.5.2.7 Algebraic multiplicity of an eigenvalue. Let A ∈ Cm×m and pm(λ)
its characteristic polynomial. Then the (algebraic) multiplicity of eigenvalue λi equals the
multiplicity of the corresponding root of the polynomial. ⌃
If

    pm(χ) = α0 + α1 χ + · · · + αm−1 χ^(m−1) + χ^m

and

    A = ( −αm−1  −αm−2  −αm−3  · · ·  −α1  −α0
             1      0      0   · · ·    0    0
             0      1      0   · · ·    0    0
             0      0      1   · · ·    0    0
             .      .      .    . .     .    .
             0      0      0   · · ·    1    0 )

then

    pm(λ) = det(λI − A).
Corollary 9.5.2.8 If A ∈ Rm×m is real-valued then some or all of its eigenvalues may be
complex-valued. If eigenvalue λ is complex-valued, then its conjugate, λ̄, is also an eigenvalue.
Indeed, the complex eigenvalues of a real-valued matrix come in conjugate pairs.
Corollary 9.5.2.9 If A ∈ Rm×m is real-valued and m is odd, then at least one of the
eigenvalues of A is real-valued.
Let λ be an eigenvalue of A ∈ Cm×m and let

    Eλ(A) = {x ∈ Cm | Ax = λx}

be the set of all eigenvectors of A associated with λ plus the zero vector (which is not
considered an eigenvector). This set is a subspace.
The elements on the diagonal of a diagonal matrix are its eigenvalues. The elements on
the diagonal of a triangular matrix are its eigenvalues.
Definition 9.5.2.10 Given a nonsingular matrix Y , the transformation Y⁻¹AY is called a
similarity transformation (applied to matrix A). ⌃
Let A, B, Y ∈ Cm×m, where Y is nonsingular, B = Y⁻¹AY , and (λ, x) an eigenpair of A.
Then (λ, Y⁻¹x) is an eigenpair of B.

Theorem 9.5.2.11 Let A, Y, B œ Cm◊m , assume Y is nonsingular, and let B = Y ≠1 AY .


Then (A) = (B).
Definition 9.5.2.12 Given a nonsingular matrix Q the transformation QH AQ is called a
unitary similarity transformation (applied to matrix A). ⌃
Theorem 9.5.2.13 Schur Decomposition Theorem. Let A œ Cm◊m . Then there exist a
unitary matrix Q and upper triangular matrix U such that A = QU QH . This decomposition
is called the Schur decomposition of matrix A.
Theorem 9.5.2.14 Spectral Decomposition Theorem. Let A œ Cm◊m be Hermitian.
Then there exist a unitary matrix Q and diagonal matrix D œ Rm◊m such that A = QDQH .
This decomposition is called the spectral decomposition of matrix A.
A B
AT L AT R
Theorem 9.5.2.15 Let A œ C be of form A =
m◊m
, where AT L and ABR
0 ABR
are square submatrices. Then (A) = (AT L ) fi (ABR ).
Definition 9.5.2.16 Diagonalizable matrix. A matrix A œ Cm◊m is said to be diago-
nalizable if and only if there exists a nonsingular matrix X and diagonal matrix D such that
X ≠1 AX = D. ⌃
Theorem 9.5.2.17 A matrix A ∈ Cm×m is diagonalizable if and only if it has m linearly
independent eigenvectors.
If A ∈ Cm×m has distinct eigenvalues, then it is diagonalizable.
Definition 9.5.2.18 Defective matrix. A matrix A ∈ Cm×m that does not have m
linearly independent eigenvectors is said to be defective. ⌃
Corollary 9.5.2.19 Matrix A ∈ Cm×m is diagonalizable if and only if it is not defective.
Definition 9.5.2.20 Geometric multiplicity. Let λ ∈ Λ(A). Then the geometric
multiplicity of λ is defined to be the dimension of Eλ(A) defined by

    Eλ(A) = {x ∈ Cm | Ax = λx}.

In other words, the geometric multiplicity of λ equals the number of linearly independent
eigenvectors that are associated with λ. ⌃
Definition 9.5.2.21 Jordan Block. Define the k × k Jordan block with eigenvalue μ as

    Jk(μ) = ( μ  1  0  · · ·  0  0
              0  μ  1  · · ·  0  0
              .  .  .   . .   .  .
              0  0  0  · · ·  μ  1
              0  0  0  · · ·  0  μ ).


Theorem 9.5.2.22 Jordan Canonical Form Theorem. Let the eigenvalues of A ∈
Cm×m be given by λ0, λ1, · · · , λk−1, where an eigenvalue is listed as many times as its geo-
metric multiplicity. There exists a nonsingular matrix X such that

    X⁻¹AX = ( Jm0(λ0)     0       · · ·       0
                 0     Jm1(λ1)    · · ·       0
                 .        .        . .        .
                 0        0       · · ·  Jm(k−1)(λk−1) ).

A practical Power Method for finding the eigenvector associated with the largest eigen-
value (in magnitude):

    Pick v(0) of unit length
    for k = 0, . . .
        v(k+1) = Av(k)
        v(k+1) = v(k+1)/‖v(k+1)‖
    endfor
Definition 9.5.2.23 Rayleigh quotient. If A ∈ Cm×m and x ≠ 0 ∈ Cm then

    xH Ax / (xH x)

is known as the Rayleigh quotient. ⌃
If x is an eigenvector of A, then

    xH Ax / (xH x)

is the associated eigenvalue.
Definition 9.5.2.24 Convergence of a sequence of scalars. Let α0, α1, α2, . . . ∈ R be
an infinite sequence of scalars. Then αk is said to converge to α if

    lim_{k→∞} |αk − α| = 0.

Definition 9.5.2.25 Convergence of a sequence of vectors. Let x0, x1, x2, . . . ∈ Cm
be an infinite sequence of vectors. Then xk is said to converge to x if for any norm ‖ · ‖

    lim_{k→∞} ‖xk − x‖ = 0.

Definition 9.5.2.26 Rate of convergence. Let α0, α1, α2, . . . ∈ R be an infinite sequence
of scalars that converges to α. Then
• αk is said to converge linearly to α if for sufficiently large k

      |αk+1 − α| ≤ C|αk − α|

  for some constant C < 1.

• αk is said to converge superlinearly to α if for sufficiently large k

      |αk+1 − α| ≤ Ck|αk − α|

  with Ck → 0.

• αk is said to converge quadratically to α if for sufficiently large k

      |αk+1 − α| ≤ C|αk − α|²

  for some constant C.

• αk is said to converge superquadratically to α if for sufficiently large k

      |αk+1 − α| ≤ Ck|αk − α|²

  with Ck → 0.

• αk is said to converge cubically to α if for large enough k

      |αk+1 − α| ≤ C|αk − α|³

  for some constant C.
The convergence of the Power Method is linear.
A practical Inverse Power Method for finding the eigenvector associated with the smallest
eigenvalue (in magnitude):

    Pick v(0) of unit length
    for k = 0, . . .
        v(k+1) = A⁻¹v(k)
        v(k+1) = v(k+1)/‖v(k+1)‖
    endfor

The convergence of the Inverse Power Method is linear.
A practical Rayleigh quotient iteration for finding an eigenvector of A:

    Pick v(0) of unit length
    for k = 0, . . .
        ρk = v(k)H Av(k)
        v(k+1) = (A − ρk I)⁻¹v(k)
        v(k+1) = v(k+1)/‖v(k+1)‖
    endfor

The convergence of the Rayleigh Quotient Iteration is quadratic (eventually, the number
of correct digits doubles in each iteration). If A is Hermitian, the convergence is cubic
(eventually, the number of correct digits triples in each iteration).
Week 10

Practical Solution of the Hermitian Eigenvalue Problem

10.1 Opening
10.1.1 Subspace iteration with a Hermitian matrix

YouTube: https://www.youtube.com/watch?v=kwJ6HMSLv1U
The idea behind subspace iteration is to perform the Power Method with more than one
vector in order to converge to (a subspace spanned by) the eigenvectors associated with a
set of eigenvalues.
We continue our discussion by restricting ourselves to the case where A ∈ Cm×m is Her-
mitian. Why? Because the eigenvectors associated with distinct eigenvalues of a Hermitian
matrix are mutually orthogonal (and can be chosen to be orthonormal), which will simplify
our discussion. Here we repeat the Power Method:

    v0 := random vector
    v0(0) := v0/‖v0‖2                 normalize to have length one
    for k := 0, . . .
        v0 := Av0(k)
        v0(k+1) := v0/‖v0‖2           normalize to have length one
    endfor

In previous discussion, we used v(k) for the current approximation to the eigenvector. We
now add the subscript to it, v0(k), because we will shortly start iterating with multiple vectors.
Homework 10.1.1.1 You may want to start by executing git pull to update your
directory Assignments.
Examine Assignments/Week10/matlab/PowerMethod.m which implements

    [ lambda_0, v0 ] = PowerMethod( A, x, maxiters, illustrate, delay )

This routine implements the Power Method, starting with a vector x for a maximum number
of iterations maxiters or until convergence, whichever comes first. To test it, execute the
script in Assignments/Week10/matlab/test_SubspaceIteration.m which uses the Power Method
to compute the largest eigenvalue (in magnitude) and corresponding eigenvector for an m × m
Hermitian matrix A with eigenvalues 1, . . . , m.
Be sure to click on "Figure 1" to see the graph that is created.
Solution. Watch the video regarding this problem on YouTube: https://youtu.be/8Bgf1tJeMmg.
(embedding a video in a solution seems to cause PreTeXt trouble...)

YouTube: https://www.youtube.com/watch?v=wmuWfjwtgcI
Recall that when we analyzed the convergence of the Power Method, we commented on
the fact that the method converges to an eigenvector associated with the largest eigenvalue
(in magnitude) if two conditions are met:

• |λ0| > |λ1|.

• v0(0) has a component in the direction of the eigenvector, x0, associated with λ0.

A second initial vector, v1(0), does not have a component in the direction of x0 if it is
orthogonal to x0. So, if we know x0, then we can pick a random vector, subtract out the
component in the direction of x0, and make this our vector v1(0) with which we should be
able to execute the Power Method to find an eigenvector, x1, associated with the eigenvalue
that has the second largest magnitude, λ1. If we then start the Power Method with this
new vector (and don't introduce roundoff error in a way that introduces a component in the
direction of x0), then the iteration will home in on a vector associated with λ1 (provided A
is Hermitian, |λ1| > |λ2|, and v1(0) has a component in the direction of x1.) This iteration
would look like

    x0 := x0/‖x0‖2                   normalize known eigenvector x0 to have length one
    v1 := random vector
    v1 := v1 − x0H v1 x0             make sure the vector is orthogonal to x0
    v1(0) := v1/‖v1‖2                normalize to have length one
    for k := 0, . . .
        v1 := Av1(k)
        v1(k+1) := v1/‖v1‖2          normalize to have length one
    endfor

Homework 10.1.1.2 Copy Assignments/Week10/matlab/PowerMethod.m into PowerMethodLambda1.m.
Modify it by adding an input parameter x0, which is an eigenvector associated with λ0 (the
eigenvalue with largest magnitude).

    [ lambda_1, v1 ] = PowerMethodLambda1( A, x, x0, maxiters, illustrate, delay )

The new function should subtract out the component in the direction of this vector from
the initial random vector as in the above algorithm.
Modify the appropriate line in Assignments/Week10/matlab/test_SubspaceIteration.m, chang-
ing (0) to (1), and use it to examine the convergence of the method.
What do you observe?
Solution.

• Assignments/Week10/answers/PowerMethodLambda1.m

Watch the video regarding this problem on YouTube: https://youtu.be/48HnBJmQhX8.
(embedding a video in a solution seems to cause PreTeXt trouble...)

YouTube: https://www.youtube.com/watch?v=OvBFQ84jTMw
Because we should be concerned about the introduction of a component in the direction
of x0 due to roundoff error, we may want to reorthogonalize with respect to x0 in each
iteration:

    x0 := x0/‖x0‖2                   normalize known eigenvector x0 to have length one
    v1 := random vector
    v1 := v1 − x0H v1 x0             make sure the vector is orthogonal to x0
    v1(0) := v1/‖v1‖2                normalize to have length one
    for k := 0, . . .
        v1 := Av1(k)
        v1 := v1 − x0H v1 x0         make sure the vector is orthogonal to x0
        v1(k+1) := v1/‖v1‖2          normalize to have length one
    endfor

Homework 10.1.1.3 Copy PowerMethodLambda1.m into PowerMethodLambda1Reorth.m and
modify it to reorthogonalize with respect to x0:

    [ lambda_1, v1 ] = PowerMethodLambda1Reorth( A, x, v0, maxiters, illustrate, delay );

Modify the appropriate line in Assignments/Week10/matlab/test_SubspaceIteration.m, chang-
ing (0) to (1), and use it to examine the convergence of the method.
What do you observe?
Solution.

• Assignments/Week10/answers/PowerMethodLambda1Reorth.m

Watch the video regarding this problem on YouTube: https://youtu.be/YmZc2oq02kA.
(embedding a video in a solution seems to cause PreTeXt trouble...)

YouTube: https://www.youtube.com/watch?v=751FAyKch1s
We now observe that the steps that normalize x0 to have unit length and then subtract
out the component of v1 in the direction of x0, normalizing the result, are exactly those
performed by the Gram-Schmidt process. More generally, it is equivalent to computing
the QR factorization of the matrix ( x0  v1 ). This suggests the algorithm

    v1 := random vector
    (( x0  v1(0) ), R) := QR(( x0  v1 ))
    for k := 0, . . .
        (( x0  v1(k+1) ), R) := QR(( x0  Av1(k) ))
    endfor
Obviously, this redundantly normalizes x0. It puts us on the path of a practical algorithm
for computing the eigenvectors associated with λ0 and λ1.
The problem is that we typically don't know x0 up front. Rather than first using the
power method to compute it, we can instead iterate with two random vectors, where the
first converges to a vector associated with λ0 and the second to one associated with λ1:

    v0 := random vector
    v1 := random vector
    (( v0(0)  v1(0) ), R) := QR(( v0  v1 ))
    for k := 0, . . .
        (( v0(k+1)  v1(k+1) ), R) := QR(A( v0(k)  v1(k) ))
    endfor

We observe:

• If |λ0| > |λ1|, the vectors v0(k) will converge linearly to a vector in the direction of x0
  at a rate dictated by the ratio |λ1|/|λ0|.

• If |λ0| > |λ1| > |λ2|, the vectors v1(k) will converge linearly to a vector in the direction
  of x1 at a rate dictated by the ratio |λ2|/|λ1|.

• If |λ0| ≥ |λ1| > |λ2| then Span({v0(k), v1(k)}) will eventually start approximating the
  subspace Span({x0, x1}).

YouTube: https://www.youtube.com/watch?v= hBEjMmWLiA


What we have described is a special case of subspace iteration. The associated eigen-
values can be approximated via the Rayleigh quotient:

    λ0 ≈ λ0(k) = v0(k)H Av0(k)   and   λ1 ≈ λ1(k) = v1(k)H Av1(k).

Alternatively,

    A(k) = ( v0(k)  v1(k) )H A ( v0(k)  v1(k) )   converges to   ( λ0  0
                                                                   0   λ1 )

if A is Hermitian, |λ1| > |λ2|, and v(0) and v(1) have components in the directions of x0 and
x1, respectively.
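In Matlab-like notation (a small sketch for illustration; the variable names are not taken
from the course's files), these estimates can be read off of the projected matrix:

    % V has the current iteration vectors v0 and v1 as its orthonormal columns.
    Ak = V' * A * V;            % 2 x 2 Rayleigh quotient matrix
    lambda0_approx = Ak(1,1);   % approximates lambda_0
    lambda1_approx = Ak(2,2);   % approximates lambda_1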

The natural extension of these observations is to iterate with n vectors:

    V̂ := random m × n matrix
    (V̂(0), R) := QR(V̂)
    A(0) = V̂(0)H A V̂(0)
    for k := 0, . . .
        (V̂(k+1), R) := QR(AV̂(k))
        A(k+1) = V̂(k+1)H A V̂(k+1)
    endfor

By extending the reasoning given so far in this unit, if

• A is Hermitian,

• |λ0| > |λ1| > · · · > |λn−1| > |λn|, and

• each vj has a component in the direction of xj, an eigenvector associated with λj,

then each vj(k) will converge to a vector in the direction of xj. The rate with which the compo-
nent in the direction of xp, 0 ≤ p < n, is removed from vj(k), n ≤ j < m, is dictated by the
ratio |λp|/|λj|.
If some of the eigenvalues have equal magnitude, then the corresponding columns of V̂(k)
will eventually form a basis for the subspace spanned by the eigenvectors associated with
those eigenvalues.
Homework 10.1.1.4 Copy PowerMethodLambda1Reorth.m into SubspaceIteration.m and
modify it to work with an m × n matrix V :

    [ Lambda, V ] = SubspaceIteration( A, V, maxiters, illustrate, delay );

Modify the appropriate line in Assignments/Week10/matlab/test_SubspaceIteration.m, chang-
ing (0) to (1), and use it to examine the convergence of the method.
What do you observe?
Solution.

• Assignments/Week10/answers/SubspaceIteration.m

Watch the video regarding this problem on YouTube: https://youtu.be/Er7jGYs0HbE.
(embedding a video in a solution seems to cause PreTeXt trouble...)
10.1.2 Overview
• 10.1 Opening

¶ 10.1.1 Subspace iteration with a Hermitian matrix


¶ 10.1.2 Overview
¶ 10.1.3 What you will learn

• 10.2 From Power Method to a simple QR algorithm

¶ 10.2.1 A simple QR algorithm


¶ 10.2.2 A simple shifted QR algorithm
¶ 10.2.3 Deflating the problem
¶ 10.2.4 Cost of a simple QR algorithm

• 10.3 A Practical Hermitian QR Algorithm

¶ 10.3.1 Reducing the cost of the QR algorithm


¶ 10.3.2 Reduction to tridiagonal form
¶ 10.3.3 Givens’ rotations
¶ 10.3.4 Simple tridiagonal QR algorithm
¶ 10.3.5 The implicit Q theorem
¶ 10.3.6 The Francis implicit QR Step
¶ 10.3.7 A complete algorithm

• 10.4 Enrichments

¶ 10.4.1 QR algorithm among the most important algorithms of the 20th century
¶ 10.4.2 Who was John Francis
¶ 10.4.3 Casting the reduction to tridiagonal form in terms of matrix-matrix mul-
tiplication
¶ 10.4.4 Optimizing the tridiagonal QR algorithm
¶ 10.4.5 The Method of Multiple Relatively Robust Representations (MRRR)

• 10.5 Wrap Up

¶ 10.5.1 Additional homework


¶ 10.5.2 Summary

10.1.3 What you will learn


This week, you explore practical methods for finding all eigenvalues and eigenvectors of a
Hermitian matrix, building on the insights regarding the Power Method that you discovered
last week.
Upon completion of this week, you should be able to

• Formulate and analyze subspace iteration methods.

• Expose the relationship between subspace iteration and simple QR algorithms.



• Accelerate the convergence of QR algorithms by shifting the spectrum of the matrix.

• Lower the cost of QR algorithms by first reducing a Hermitian matrix to tridiagonal


form.

• Cast all computation for computing the eigenvalues and eigenvectors of a Hermitian
matrix in terms of unitary similarity transformations, yielding the Francis Implicit QR
Step.

• Exploit a block diagonal structure of a matrix to deflate the Hermitian eigenvalue


problem into smaller subproblems.

• Combine all these insights into a practical algorithm.

10.2 From Power Method to a simple QR algorithm


10.2.1 A simple QR algorithm

YouTube: https://www.youtube.com/watch?v=_v E7aJGoNE


We now morph the subspace iteration discussed in the last unit into a simple incarnation
of an algorithm known as the QR algorithm. We will relate this algorithm to performing
subspace iteration with an m × m (square) matrix so that the method finds all eigenvectors
simultaneously (under mild conditions). Rather than starting with a random matrix V , we
now start with the identity matrix. This yields the algorithm on the left in Figure 10.2.1.1.
We contrast it with the algorithm on the right.

    Left (subspace iteration):
        V̂ := I
        for k := 0, . . .
            (V̂, R̂) := QR(AV̂)
            Â = V̂H A V̂
        endfor

    Right (simple QR algorithm):
        V = I
        for k := 0, . . .
            (Q, R) := QR(A)
            A = RQ
            V = V Q
        endfor

Figure 10.2.1.1 Left: Subspace iteration started with V̂ = I. Right: Simple QR algorithm.
The magic lies in the fact that the matrices computed by the QR algorithm are identical
to those computed by the subspace iteration: Upon completion V̂ = V and the matrix Â on
the left equals the (updated) matrix A on the right. To be able to prove this, we annotate
the algorithm so we can reason about the contents of the matrices for each iteration.

    Left (subspace iteration with V̂ = I):
        Â(0) := A
        V̂(0) := I
        R̂(0) := I
        for k := 0, . . .
            (V̂(k+1), R̂(k+1)) := QR(AV̂(k))
            Â(k+1) := V̂(k+1)H A V̂(k+1)
        endfor

    Right (simple QR algorithm):
        A(0) := A
        V(0) := I
        R(0) := I
        for k := 0, . . .
            (Q(k+1), R(k+1)) := QR(A(k))
            A(k+1) := R(k+1) Q(k+1)
            V(k+1) := V(k) Q(k+1)
        endfor

Let's start by showing how the QR algorithm applies unitary similarity transformations
to the matrices A(k).
Homework 10.2.1.1 Show that for the algorithm on the right A(k+1) = Q(k+1)H A(k) Q(k+1).
Solution. The algorithm computes the QR factorization of A(k),

    A(k) = Q(k+1) R(k+1),

after which

    A(k+1) := R(k+1) Q(k+1).

Hence

    A(k+1) = R(k+1) Q(k+1) = Q(k+1)H A(k) Q(k+1).

This last homework shows that A(k+1) is derived from A(k) via a unitary similarity trans-
formation and hence has the same eigenvalues as does A(k). This means it also is derived from
A via a (sequence of) unitary similarity transformations and hence has the same eigenvalues
as does A.
We now prove these algorithms mathematically equivalent.
Homework 10.2.1.2 In the above algorithms, for all k,

• Â(k) = A(k).

• R̂(k) = R(k).

• V̂(k) = V(k).

Hint. The QR factorization is unique, provided the diagonal elements of R are taken to be
positive.
Solution. We will employ a proof by induction.

• Base case: k = 0
  This is trivially true:

  ◦ Â(0) = A = A(0).

  ◦ R̂(0) = I = R(0).
  ◦ V̂(0) = I = V(0).

• Inductive step: Assume that Â(k) = A(k), R̂(k) = R(k), and V̂(k) = V(k). Show that
  Â(k+1) = A(k+1), R̂(k+1) = R(k+1), and V̂(k+1) = V(k+1).
  From the algorithm on the left, we know that

      AV̂(k) = V̂(k+1) R̂(k+1)

  and

      A(k) = Â(k)                          < I.H. >
           = V̂(k)H A V̂(k)                  < algorithm on left >
           = V̂(k)H V̂(k+1) R̂(k+1)           < algorithm on left >
           = V(k)H V̂(k+1) R̂(k+1).          < I.H. >               (10.2.1)

  But from the algorithm on the right, we know that

      A(k) = Q(k+1) R(k+1).                                        (10.2.2)

  Both (10.2.1) and (10.2.2) are QR factorizations of A(k) and hence, by the uniqueness
  of the QR factorization,

      R̂(k+1) = R(k+1)   and   Q(k+1) = V(k)H V̂(k+1)

  or, equivalently and from the algorithm on the right,

      V(k+1) = V(k) Q(k+1) = V̂(k+1).

  This shows that

  ◦ R̂(k+1) = R(k+1) and
  ◦ V̂(k+1) = V(k+1).
  Also,

      Â(k+1) = V̂(k+1)H A V̂(k+1)                < algorithm on left >
             = V(k+1)H A V(k+1)                 < V̂(k+1) = V(k+1) >
             = Q(k+1)H V(k)H A V(k) Q(k+1)      < algorithm on right >
             = Q(k+1)H V̂(k)H A V̂(k) Q(k+1)      < I.H. >
             = Q(k+1)H Â(k) Q(k+1)              < algorithm on left >
             = Q(k+1)H A(k) Q(k+1)              < I.H. >
             = A(k+1).                          < last homework >

• By the Principle of Mathematical Induction, the result holds.


Homework 10.2.1.3 In the above algorithms, show that for all k

• V(k) = Q(0) Q(1) · · · Q(k).

• Ak = V(k) R(k) · · · R(1) R(0). (Note: Ak here denotes A raised to the kth power.)

Assume that Q(0) = I.
Solution. We will employ a proof by induction.

• Base case: k = 0
  Trivially, V(0) = I = Q(0) and A⁰ = I = V(0) R(0), since V(0) = I and R(0) = I.

• Inductive step: Assume that V(k) = Q(0) · · · Q(k) and Ak = V(k) R(k) · · · R(0). Show that
  V(k+1) = Q(0) · · · Q(k+1) and Ak+1 = V(k+1) R(k+1) · · · R(0).

      V(k+1) = V(k) Q(k+1) = Q(0) · · · Q(k) Q(k+1)

  by the inductive hypothesis.
  Also,

      Ak+1 = A Ak                              < definition >
           = A V(k) R(k) · · · R(0)            < inductive hypothesis >
           = A V̂(k) R(k) · · · R(0)            < inductive hypothesis >
           = V̂(k+1) R̂(k+1) R(k) · · · R(0)     < left algorithm >
           = V(k+1) R(k+1) R(k) · · · R(0).    < V(k+1) = V̂(k+1); R(k+1) = R̂(k+1) >

• By the Principle of Mathematical Induction, the result holds for all k.


This last exercise shows that

    Ak = (Q(0) Q(1) · · · Q(k)) (R(k) · · · R(1) R(0)) = V(k) R̃(k),

where V(k) = Q(0) Q(1) · · · Q(k) is unitary and R̃(k) = R(k) · · · R(1) R(0) is upper triangular,
which exposes a QR factorization of Ak. Partitioning V(k) by columns

    V(k) = ( v0(k)  · · ·  vm−1(k) )

we notice that applying k iterations of the Power Method to vector e0 yields

    Ak e0 = V(k) R̃(k) e0 = V(k) ρ̃0,0(k) e0 = ρ̃0,0(k) V(k) e0 = ρ̃0,0(k) v0(k),

where ρ̃0,0(k) is the (0, 0) entry in matrix R̃(k). Thus, the first column of V(k) equals a vector
that would result from k iterations of the Power Method. Similarly, the second column of
V(k) equals a vector that would result from k iterations of the Power Method, but orthogonal
to v0(k).

YouTube: https://www.youtube.com/watch?v=t51YqvNWa0Q
We make some final observations:

• A(k+1) = Q(k+1)H A(k) Q(k+1). This means we can think of A(k+1) as the matrix A(k) but
  viewed in a new basis (namely the basis that consists of the columns of Q(k+1)).

• A(k+1) = (Q(0) · · · Q(k+1))H A Q(0) · · · Q(k+1) = V(k+1)H A V(k+1). This means we can think of
  A(k+1) as the matrix A but viewed in a new basis (namely the basis that consists of
  the columns of V(k+1)).

• In each step, we compute

      (Q(k+1), R(k+1)) = QR(A(k))

  which we can think of as

      (Q(k+1), R(k+1)) = QR(A(k) × I).

  This suggests that in each iteration we perform one step of subspace iteration, but
  with matrix A(k) and V = I:

      (Q(k+1), R(k+1)) = QR(A(k) V).

• The insight is that the QR algorithm is identical to subspace iteration, except that at
  each step we reorient the problem (express it in a new basis) and we restart it with
  V = I.
Homework 10.2.1.4 Examine Assignments/Week10/matlab/SubspaceIterationAllVectors.m,
which implements the subspace iteration in Figure 10.2.1.1 (left). Examine it by executing
the script in Assignments/Week10/matlab/test_simple_QR_algorithm.m.
Solution. Discuss what you observe online with others!
Homework 10.2.1.5 Copy Assignments/Week10/matlab/SubspaceIterationAllVectors.m into
SimpleQRAlg.m and modify it to implement the algorithm in Figure 10.2.1.1 (right) as

    function [ Ak, V ] = SimpleQRAlg( A, maxits, illustrate, delay )

Modify the appropriate line in Assignments/Week10/matlab/test_simple_QR_algorithms.m, chang-
ing (0) to (1), and use it to examine the convergence of the method.
What do you observe?
Solution.

• Assignments/Week10/answers/SimpleQRAlg.m

Discuss what you observe online with others!

10.2.2 A simple shifted QR algorithm



YouTube: https://www.youtube.com/watch?v=HIxSCrFX1Ls
The equivalence of the subspace iteration and the QR algorithm tells us a lot about
convergence. Under mild conditions (|λ0| ≥ · · · ≥ |λn−1| > |λn| > · · · > |λm−1|),

• The first n columns of V(k) converge to a basis for the subspace spanned by the eigen-
  vectors associated with λ0, . . . , λn−1.

• The last m − n columns of V(k) converge to the subspace orthogonal to the subspace
  spanned by the eigenvectors associated with λ0, . . . , λn−1.

• If A is Hermitian, then the eigenvectors associated with λ0, . . . , λn−1 are orthogonal to
  those associated with λn, . . . , λm−1. Hence, the subspace spanned by the eigenvectors
  associated with λ0, . . . , λn−1 is orthogonal to the subspace spanned by the eigenvectors
  associated with λn, . . . , λm−1.

• The rate of convergence with which these subspaces become orthogonal to each other
  is linear with a constant |λn|/|λn−1|.

What if in this situation we focus on n = m − 1? Then

• The last column of V(k) converges to point in the direction of the eigenvector associated
  with λm−1, the smallest in magnitude.

• The rate of convergence of that vector is linear with a constant |λm−1|/|λm−2|.

In other words, the subspace iteration acts upon the last column of V(k) in the same way as
would an inverse iteration. This observation suggests that convergence can be greatly
accelerated by shifting the matrix by an estimate of the eigenvalue that is smallest in mag-
nitude.
Homework 10.2.2.1 Copy SimpleQRAlg.m into SimpleShiftedQRAlgConstantShift.m
and modify it to implement an algorithm that executes the QR algorithm with a shifted
matrix A − ρI:

    function [ Ak, V ] = SimpleShiftedQRAlgConstantShift( A, rho, maxits, illustrate, delay )

Modify the appropriate line in Assignments/Week10/matlab/test_simple_QR_algorithms.m, chang-
ing (0) to (1), and use it to examine the convergence of the method.
Try different values for ρ: 0.0, 0.9, 0.99, 1.0, 1.99, 1.5. What do you observe?
Solution.

• Assignments/Week10/answers/SimpleShiftedQRAlgConstantShift.m

Discuss what you observe online with others!


We could compute the Rayleigh quotient with the last column of V (k) but a moment of
reflection tells us that that estimate is already available as the last element on the diagonal
of A(k) , because the diagonal elements of A(k) converge to the eigenvalues. Thus, we arrive
upon a simple shifted QR algorithm in Figure 10.2.2.1. This algorithm inherits the cubic
convergence of the Rayleigh quotient iteration, for the last column of V .

    V = I
    for k := 0, . . .
        (Q, R) := QR(A − αm−1,m−1 I)
        A = RQ + αm−1,m−1 I
        V = V Q
    endfor
Figure 10.2.2.1 Simple shifted QR algorithm.

YouTube: https://www.youtube.com/watch?v=Fhk0e5JF sU
To more carefully examine this algorithm, let us annotate it as we did for the simple QR
algorithm in the last unit.

    A(0) = A
    V(0) = I
    R(0) = I
    for k := 0, . . .
        μ(k) = αm−1,m−1(k)
        (Q(k+1), R(k+1)) := QR(A(k) − μ(k) I)
        A(k+1) = R(k+1) Q(k+1) + μ(k) I
        V(k+1) = V(k) Q(k+1)
    endfor

The following exercises clarify some of the finer points.


Homework 10.2.2.2 For the above algorithm, show that

• A(k+1) = Q(k+1)H A(k) Q(k+1).

• A(k+1) = V(k+1)H A V(k+1).

Solution. The algorithm computes the QR factorization of A(k) − μ(k) I,

    A(k) − μ(k) I = Q(k+1) R(k+1),

after which

    A(k+1) := R(k+1) Q(k+1) + μ(k) I.

Hence

    A(k+1) = R(k+1) Q(k+1) + μ(k) I
           = Q(k+1)H (A(k) − μ(k) I) Q(k+1) + μ(k) I
           = Q(k+1)H A(k) Q(k+1) − μ(k) Q(k+1)H Q(k+1) + μ(k) I
           = Q(k+1)H A(k) Q(k+1).

This last exercise confirms that the eigenvalues of A(k) equal the eigenvalues of A.
Homework 10.2.2.3 For the above algorithm, show that

    (A − μk−1 I)(A − μk−2 I) · · · (A − μ1 I)(A − μ0 I)
        = Q(0) Q(1) · · · Q(k) R(k) · · · R(1) R(0),

where Q(0) Q(1) · · · Q(k) is unitary and R(k) · · · R(1) R(0) is upper triangular.
Solution. In this problem, we need to assume that Q(0) = I. Also, it helps to recognize
that V(k) = Q(0) · · · Q(k), which can be shown via a simple inductive proof.
This requires a proof by induction.

• Base case: k = 1.

      A − μ0 I = A(0) − μ0 I             < A(0) = A >
               = Q(1) R(1)               < algorithm >
               = Q(0) Q(1) R(1) R(0).    < Q(0) = R(0) = I >

• Inductive Step: Assume

      (A − μk−1 I)(A − μk−2 I) · · · (A − μ1 I)(A − μ0 I)
          = Q(0) Q(1) · · · Q(k) R(k) · · · R(1) R(0) = V(k) R(k) · · · R(1) R(0).

  Show that

      (A − μk I)(A − μk−1 I) · · · (A − μ1 I)(A − μ0 I)
          = Q(0) Q(1) · · · Q(k+1) R(k+1) · · · R(1) R(0) = V(k+1) R(k+1) · · · R(1) R(0).
  Notice that

      (A − μk I)(A − μk−1 I) · · · (A − μ0 I)
          = (V(k+1) A(k+1) V(k+1)H − μk I)(A − μk−1 I) · · · (A − μ0 I)     < last homework >
          = V(k+1) (A(k+1) − μk I) V(k+1)H (A − μk−1 I) · · · (A − μ0 I)    < I = V(k+1) V(k+1)H >
          = V(k+1) (A(k+1) − μk I) V(k+1)H V(k) R(k) · · · R(0)             < I.H. >
          = V(k+1) (A(k+1) − μk I) Q(k+1)H R(k) · · · R(0)                  < V(k+1)H = Q(k+1)H V(k)H >
          = V(k+1) R(k+1) Q(k+1) Q(k+1)H R(k) · · · R(0)                    < algorithm >
          = V(k+1) R(k+1) R(k) · · · R(0).                                  < Q(k+1) Q(k+1)H = I >

• By the Principle of Mathematical Induction, the result holds.


Homework 10.2.2.4 Copy SimpleShiftedQRAlgConstantShift.m into SimpleShiftedQRAlg.m
and modify it to implement an algorithm that executes the QR algorithm in Figure 10.2.2.1:

    function [ Ak, V ] = SimpleShiftedQRAlg( A, maxits, illustrate, delay )

Modify the appropriate line in Assignments/Week10/matlab/test_simple_QR_algorithms.m, chang-
ing (0) to (1), and use it to examine the convergence of the method.
What do you observe?
Solution.

• Assignments/Week10/answers/SimpleShiftedQRAlg.m

Discuss what you observe online with others!

10.2.3 Deflating the problem

YouTube: https://www.youtube.com/watch?v=rdWhh3 HuhY


Recall that if

    A = [ A00  0 ; 0  A11 ],   A00 x = λx,   and   A11 y = μy,

then

    [ A00  0 ; 0  A11 ] [ x ; 0 ] = λ [ x ; 0 ]   and   [ A00  0 ; 0  A11 ] [ 0 ; y ] = μ [ 0 ; y ].

In other words, Λ(A) = Λ(A00) ∪ Λ(A11) and eigenvectors of A can be easily constructed
from eigenvectors of A00 and A11.
This insight allows us to deflate a matrix when strategically placed zeroes (or, rather,
acceptably small entries) appear as part of the QR algorithm. Let us continue to focus on
the Hermitian eigenvalue problem.
Homework 10.2.3.1 Let A ∈ Cm×m be a Hermitian matrix and V ∈ Cm×m be a unitary
matrix such that

    V H A V = [ A00  0 ; 0  A11 ].

If V00 and V11 are unitary matrices such that V00H A00 V00 = Λ0 and V11H A11 V11 = Λ1 are
both diagonal, show that

    ( V [ V00  0 ; 0  V11 ] )H A ( V [ V00  0 ; 0  V11 ] ) = [ Λ0  0 ; 0  Λ1 ].

Solution.

    ( V [ V00  0 ; 0  V11 ] )H A ( V [ V00  0 ; 0  V11 ] )
        = [ V00  0 ; 0  V11 ]H V H A V [ V00  0 ; 0  V11 ]                < (XY)H = YH XH >
        = [ V00  0 ; 0  V11 ]H [ A00  0 ; 0  A11 ] [ V00  0 ; 0  V11 ]    < V H A V = diag(A00, A11) >
        = [ V00H A00 V00  0 ; 0  V11H A11 V11 ]                           < partitioned matrix-matrix multiplication >
        = [ Λ0  0 ; 0  Λ1 ].                                              < V00H A00 V00 = Λ0; V11H A11 V11 = Λ1 >
The point of this last exercise is that if at some point the QR algorithm yields a block
diagonal matrix, then the algorithm can proceed to find the spectral decompositions of the
blocks on the diagonal, updating the matrix, V , in which the eigenvectors are accumulated.
Now, since it is the last column of V(k) that converges fastest to an eigenvector, eventually
we expect A(k) computed as part of the QR algorithm to be of the form

    A(k) = [ A00(k)  f01(k) ; f01(k)T  αm−1,m−1(k) ],

where f01(k) is small. In other words,

    A(k) ≈ [ A00(k)  0 ; 0  αm−1,m−1(k) ].

Once f01(k) is small enough, the algorithm can continue with A00(k). The problem is thus
deflated to a smaller problem.
What criterion should we use to deflate? If the active matrix is m × m, for now we use the
criterion

    ‖f01(k)‖1 ≤ εmach (|α0,0(k)| + · · · + |αm−1,m−1(k)|).

The idea is that if the magnitudes of the off-diagonal elements of the last row are small
relative to the eigenvalues, then they can be considered to be zero. The sum of the absolute
values of the diagonal elements is an estimate of the sizes of the eigenvalues. We will refine
this criterion later.
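In Matlab-like notation, this test might look as follows (a hedged sketch; the variable names
are illustrative and eps denotes the machine epsilon):

    % A is the active m x m matrix in the QR algorithm.
    m   = size( A, 1 );
    f01 = A( m, 1:m-1 );                               % off-diagonal elements of the last row
    if norm( f01, 1 ) <= eps * sum( abs( diag( A ) ) )
      % deflate: continue with the leading (m-1) x (m-1) block A( 1:m-1, 1:m-1 )
    end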
Homework 10.2.3.2 Copy SimpleShiftedQRAlg.m into SimpleShiftedQRAlgWithDeflation.m
and modify it to add deflation.

    function [ Ak, V ] = SimpleShiftedQRAlgWithDeflation( A, maxits, illustrate, delay )

Modify the appropriate lines in Assignments/Week10/matlab/test_simple_QR_algorithm.m, chang-
ing (0) to (1), and use it to examine the convergence of the method.
Solution.

• Assignments/Week10/answers/SimpleShiftedQRAlgWithDeflation.m

Discuss what you observe online with others!


Remark 10.2.3.1 It is possible that deflation can happen anywhere in the matrix and one
should check for that. However, it is most likely to happen in the last row and column of
the active part of the matrix.

10.2.4 Cost of a simple QR algorithm

YouTube: https://www.youtube.com/watch?v=c1zJG3T0D44
The QR algorithms that we have discussed incur the following approximate costs per
iteration for an m × m Hermitian matrix.
• A → QR (QR factorization): (4/3)m³ flops.
• A := RQ. A naive implementation would take advantage of R being upper triangular,
  but not of the fact that A will again be Hermitian, at a cost of m³ flops. If one also
  takes advantage of the fact that A is Hermitian, that cost is reduced to (1/2)m³ flops.

• If the eigenvectors are desired (which is usually the case), the update V := V Q requires
  an additional 2m³ flops.

Any other costs, like shifting the matrix, are inconsequential.
Thus, the cost, per iteration, equals approximately (4/3 + 1/2)m³ = (11/6)m³ flops if only the
eigenvalues are to be computed. If the eigenvectors are also required, then the cost increases
by 2m³ flops to become (23/6)m³ flops.
Let us now consider adding deflation. The rule of thumb is that it takes a few iterations
per eigenvalue that is found. Let's say K iterations are needed. Every time an eigenvalue is
found, the problem deflates, decreasing in size by one. The cost then becomes

    Σ_{i=m}^{1} K (23/6) i³ ≈ K (23/6) ∫_0^m x³ dx = K (23/6)(1/4) x⁴ |_0^m = K (23/24) m⁴.

The bottom line is that the computation requires O(m⁴) flops. All other factorizations
we have encountered so far require at most O(m³) flops. Generally O(m⁴) is considered
prohibitively expensive. We need to do better!

10.3 A Practical Hermitian QR Algorithm


10.3.1 Reduction to tridiagonal form

YouTube: https://www.youtube.com/watch?v=ETQbxnweKok
In this section, we see that if A(0) is a tridiagonal matrix, then so are all A(k). This
reduces the cost of each iteration of the QR algorithm from O(m³) flops to O(m) flops if
only the eigenvalues are computed and O(m²) flops if the eigenvectors are also desired. Thus,
if matrix A is first reduced to a tridiagonal matrix via (unitary) similarity transformations,
then the cost of finding its eigenvalues and eigenvectors is reduced from O(m⁴) to O(m³)
flops. Fortunately, there is an algorithm for reducing a matrix to tridiagonal form that
requires O(m³) operations.
The basic algorithm for reducing a Hermitian matrix to tridiagonal form, overwriting the
original matrix with the result, can be explained as follows. We assume that A is stored only

in the lower triangular part of the matrix and that only the diagonal and subdiagonal of
the tridiagonal matrix is computed, overwriting those parts of A. Finally, the Householder
vectors used to zero out parts of A can overwrite the entries that they annihilate (set to
zero), much like we did when computing the Householder QR factorization.
Recall that in Subsubsection 3.3.3.3, we introduced the function

    [ ( ρ ; u2 ), τ ] := Housev( ( χ1 ; x2 ) )

to compute the vector u = ( 1 ; u2 ) that reflects x into ±‖x‖2 e0 so that

    ( I − (1/τ) ( 1 ; u2 ) ( 1 ; u2 )H ) ( χ1 ; x2 ) = ±‖x‖2 e0.
We are going to use a variation on this function:

    [u, τ] := Housev1(x)

implemented by the function

    function [ u, tau ] = Housev1( x )

We also reintroduce the notation H(x) for the transformation I − (1/τ) u uH where u and τ are
computed by Housev1(x).
We now describe an algorithm for reducing a Hermitian matrix to tridiagonal form:

• Partition

      A → ( α11  ⋆
            a21  A22 ).

  Here the ⋆ denotes a part of a matrix that is neither stored nor updated.

• Update [a21, τ] := Housev1(a21). This overwrites the first element of a21 with ±‖a21‖2
  and the remainder with all but the first element of the Householder vector u. Implicitly,
  the elements below the first element equal zero in the updated matrix A.

• Update

      A22 := H(a21) A22 H(a21).

  Since A22 is Hermitian both before and after the update, only the lower triangular part
  of the matrix needs to be updated.

• Continue this process with the updated A22.

This approach is illustrated in Figure 10.3.1.1.


    × × × × ×        × × 0 0 0        × × 0 0 0
    × × × × ×        × × × × ×        × × × 0 0
    × × × × ×   →    0 × × × ×   →    0 × × × ×
    × × × × ×        0 × × × ×        0 0 × × ×
    × × × × ×        0 × × × ×        0 0 × × ×
    Original matrix  First iteration  Second iteration

    × × 0 0 0        × × 0 0 0
    × × × 0 0        × × × 0 0
    0 × × × 0   →    0 × × × 0
    0 0 × × ×        0 0 × × ×
    0 0 0 × ×        0 0 0 × ×
    Third iteration
Figure 10.3.1.1 An illustration of the reduction of a Hermitian matrix to tridiagonal form.
The ×s denote nonzero elements in the matrix. The gray entries above the diagonal are not
actually updated.
The update of A22 warrants closer scrutiny:

    A22 := H(a21) A22 H(a21)
         = (I − (1/τ) u21 u21H) A22 (I − (1/τ) u21 u21H)
         = (A22 − (1/τ) u21 y21H)(I − (1/τ) u21 u21H)                         where y21H = u21H A22
         = A22 − (1/τ) u21 y21H − (1/τ) y21 u21H + (1/τ²) u21 (y21H u21) u21H  where y21 = A22 u21
         = A22 − (1/τ) u21 y21H − (1/τ) y21 u21H + (2β/τ²) u21 u21H            where 2β = y21H u21
         = A22 − u21 w21H − w21 u21H                                           where w21 = (y21 − (β/τ) u21)/τ,

which is a Hermitian rank-2 update.
This formulation has two advantages: it requires fewer computations and it does not generate
an intermediate result that is not Hermitian. An algorithm that implements all these insights
is given in Figure 10.3.1.2.
    [A, t] := TriRed-unb(A, t)
        Partition A → ( ATL  ATR ),  t → ( tT )
                      ( ABL  ABR )       ( tB )
            where ATL is 0 × 0 and tT has 0 elements
        while m(ATL) < m(A) − 2
            Repartition
                ( ATL  ATR ) → ( A00   a01  A02  )      ( tT ) → ( t0 )
                ( ABL  ABR )   ( a10T  α11  a12T ),     ( tB )   ( τ1 )
                               ( A20   a21  A22  )               ( t2 )
            [a21, τ1] := Housev1(a21)
            u21 = a21 with first element replaced with 1
            Update A22 := H(u21) A22 H(u21) via the steps
                y21 := A22 u21                        (Hermitian matrix-vector multiply!)
                β := u21H y21 / 2
                w21 := (y21 − β u21/τ1)/τ1
                A22 := A22 − tril(u21 w21H + w21 u21H)   (Hermitian rank-2 update)
            Continue with
                ( ATL  ATR ) ← ( A00   a01  A02  )      ( tT ) ← ( t0 )
                ( ABL  ABR )   ( a10T  α11  a12T ),     ( tB )   ( τ1 )
                               ( A20   a21  A22  )               ( t2 )
        endwhile
Figure 10.3.1.2 Basic algorithm for reduction of a Hermitian matrix to tridiagonal form.
During the first iteration, when updating the $(m-1) \times (m-1)$ matrix $A_{22}$, the bulk of the computation is in the computation of $y_{21} := A_{22} u_{21}$, at $2(m-1)^2$ flops, and $A_{22} := A_{22} - ( u_{21} w_{21}^H + w_{21} u_{21}^H )$, at $2(m-1)^2$ flops. The total cost for reducing the $m \times m$ matrix $A$ to tridiagonal form is therefore approximately
\[
\sum_{k=0}^{m-1} 4 ( m - k - 1 )^2 \mbox{ flops}.
\]
By substituting $j = m - k - 1$ we find that
\[
\sum_{k=0}^{m-1} 4 ( m - k - 1 )^2 \mbox{ flops} = 4 \sum_{j=0}^{m-1} j^2 \mbox{ flops} \approx 4 \int_0^m x^2 \, dx = \frac{4}{3} m^3 \mbox{ flops}.
\]
This equals, approximately, the cost of one QR factorization of matrix $A$.


Homework 10.3.1.1 A more straightforward way of updating $A_{22}$ is given by
\begin{align*}
A_{22} & := ( I - \tfrac{1}{\tau} u_{21} u_{21}^H ) A_{22} ( I - \tfrac{1}{\tau} u_{21} u_{21}^H ) \\
& = \underbrace{( A_{22} - \tfrac{1}{\tau} u_{21} \underbrace{u_{21}^H A_{22}}_{y_{21}^H} )}_{B_{22}} ( I - \tfrac{1}{\tau} u_{21} u_{21}^H ) \\
& = B_{22} - \tfrac{1}{\tau} \underbrace{B_{22} u_{21}}_{x_{21}} u_{21}^H .
\end{align*}

This suggests the steps

• Compute $y_{21} = A_{22} u_{21}$. (Hermitian matrix-vector multiplication.)

• Compute $B_{22} = A_{22} - \frac{1}{\tau} u_{21} y_{21}^H$. (Rank-1 update yielding a non-Hermitian intermediate matrix.)

• Compute $x_{21} = B_{22} u_{21}$. (Matrix-vector multiplication.)

• Compute $A_{22} = B_{22} - \frac{1}{\tau} x_{21} u_{21}^H$. (Rank-1 update yielding a Hermitian final matrix.)

Estimate the cost of this alternative approach. What other disadvantage(s) does this approach have?
Solution. During the $k$th iteration, for $k = 0, 1, \ldots, m-1$, the costs for the various steps are as follows:

• Compute $y_{21} = A_{22} u_{21}$. (Hermitian matrix-vector multiplication.) Cost: approximately $2(m-k-1)^2$ flops.

• Compute $B_{22} = A_{22} - \frac{1}{\tau} u_{21} y_{21}^H$. (Rank-1 update yielding a non-Hermitian intermediate matrix.) Cost: approximately $2(m-k-1)^2$ flops, since the intermediate matrix $B_{22}$ is not Hermitian.

• Compute $x_{21} = B_{22} u_{21}$. (Matrix-vector multiplication.) Cost: approximately $2(m-k-1)^2$ flops.

• Compute $A_{22} = B_{22} - \frac{1}{\tau} x_{21} u_{21}^H$. Only the lower triangular part of $A_{22}$ needs to be computed. Cost: approximately $(m-k-1)^2$ flops.

Thus, the total cost per iteration is, approximately, $7(m-k-1)^2$ flops. The total cost is then, approximately,
\[
\sum_{k=0}^{m-1} 7 ( m - k - 1 )^2 \mbox{ flops} = 7 \sum_{j=0}^{m-1} j^2 \mbox{ flops} \approx 7 \int_0^m x^2 \, dx = \frac{7}{3} m^3 \mbox{ flops}.
\]
This almost doubles the cost of the reduction to tridiagonal form.

An additional disadvantage is that the full (non-Hermitian) intermediate matrix $B_{22}$ must be stored.
The diagonal elements of a Hermitian matrix are real. Hence the tridiagonal matrix has real values on its diagonal. A postprocess (that follows the reduction to tridiagonal form) can be used to convert the elements of the subdiagonal to real values as well. The advantage of this is that the subsequent computation, which computes the eigenvalues of the tridiagonal matrix and accumulates the eigenvectors, only needs to perform real (floating point) arithmetic.

Ponder This 10.3.1.2 Propose a postprocess that converts the off-diagonal elements of a tridiagonal Hermitian matrix to real values. The postprocess must be equivalent to applying a unitary similarity transformation so that eigenvalues are preserved.
You may want to start by looking at
\[
A = \begin{pmatrix} \alpha_{0,0} & \overline\alpha_{1,0} \\ \alpha_{1,0} & \alpha_{1,1} \end{pmatrix},
\]
where the diagonal elements are real-valued and the off-diagonal elements are complex-valued. Then move on to
\[
A = \begin{pmatrix} \alpha_{0,0} & \overline\alpha_{1,0} & 0 \\ \alpha_{1,0} & \alpha_{1,1} & \overline\alpha_{2,1} \\ 0 & \alpha_{2,1} & \alpha_{2,2} \end{pmatrix}.
\]
What is the pattern?
Homework 10.3.1.3 You may want to start by executing git pull to update your directory Assignments.
In directory Assignments/Week10/matlab/, you will find the following files:
• Housev1.m: An implementation of the function Housev1, mentioned in the unit.
• TriRed.m: A code skeleton for a function that reduces a Hermitian matrix to a tridiagonal matrix. Only the lower triangular part of the input and output are stored.
[ T, t ] = TriRed( A, t )
returns the diagonal and first subdiagonal of the tridiagonal matrix in T, stores the Householder vectors below the first subdiagonal, and returns the scalars $\tau$ in vector t.
• TriFromBi.m: A function that takes the diagonal and first subdiagonal in the input matrix and returns the tridiagonal matrix that they define.
T = TriFromBi( A )
• test_TriRed.m: A script that tests TriRed.
With these resources, you are to complete TriRed by implementing the algorithm in Figure 10.3.1.2.
Be sure to look at the hint!
Hint. If array A holds Hermitian matrix $A$, storing only the lower triangular part, then $Ax$ is implemented in Matlab as
( tril( A ) + tril( A, -1 )' ) * x;
Updating only the lower triangular part of array A with $A := A - B$ is accomplished by
A = A - tril( B );
Solution.
• Assignments/Week10/answers/TriRed.m.

10.3.2 Givens’ rotations

YouTube: https://www.youtube.com/watch?v=XAvioT6ALAg
We now introduce another important class of orthogonal matrices known as Givens' rotations. Actually, we have seen these before, in Subsubsection 2.2.5.1, where we simply called them rotations. It is how they are used that makes them Givens' rotations.
Given a vector $x = \begin{pmatrix} \chi_1 \\ \chi_2 \end{pmatrix} \in \mathbb{R}^2$, there exists an orthogonal matrix $G$ such that $G^T x = \begin{pmatrix} \pm \| x \|_2 \\ 0 \end{pmatrix}$. The Householder transformation is one example of such a matrix $G$. An alternative is the Givens' rotation: $G = \begin{pmatrix} \gamma & -\sigma \\ \sigma & \gamma \end{pmatrix}$, where $\gamma^2 + \sigma^2 = 1$. (Notice that $\gamma$ and $\sigma$ can be thought of as the cosine and sine of an angle.) Then
\[
G^T G = \begin{pmatrix} \gamma & -\sigma \\ \sigma & \gamma \end{pmatrix}^T \begin{pmatrix} \gamma & -\sigma \\ \sigma & \gamma \end{pmatrix}
= \begin{pmatrix} \gamma & \sigma \\ -\sigma & \gamma \end{pmatrix} \begin{pmatrix} \gamma & -\sigma \\ \sigma & \gamma \end{pmatrix}
= \begin{pmatrix} \gamma^2 + \sigma^2 & -\gamma\sigma + \gamma\sigma \\ \gamma\sigma - \gamma\sigma & \gamma^2 + \sigma^2 \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},
\]
which means that a Givens' rotation is an orthogonal matrix.


Homework 10.3.2.1 Propose formulas for $\gamma$ and $\sigma$ such that
\[
\begin{pmatrix} \gamma & -\sigma \\ \sigma & \gamma \end{pmatrix}^T
\underbrace{\begin{pmatrix} \chi_1 \\ \chi_2 \end{pmatrix}}_{x}
= \begin{pmatrix} \| x \|_2 \\ 0 \end{pmatrix},
\]
where $\gamma^2 + \sigma^2 = 1$.
Solution. Take $\gamma = \chi_1 / \| x \|_2$ and $\sigma = \chi_2 / \| x \|_2$. Then $\gamma^2 + \sigma^2 = ( \chi_1^2 + \chi_2^2 ) / \| x \|_2^2 = 1$ and
\[
\begin{pmatrix} \gamma & -\sigma \\ \sigma & \gamma \end{pmatrix}^T \begin{pmatrix} \chi_1 \\ \chi_2 \end{pmatrix}
= \begin{pmatrix} \gamma & \sigma \\ -\sigma & \gamma \end{pmatrix} \begin{pmatrix} \chi_1 \\ \chi_2 \end{pmatrix}
= \begin{pmatrix} ( \chi_1^2 + \chi_2^2 ) / \| x \|_2 \\ ( \chi_1 \chi_2 - \chi_1 \chi_2 ) / \| x \|_2 \end{pmatrix}
= \begin{pmatrix} \| x \|_2 \\ 0 \end{pmatrix}.
\]
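To make this concrete, here is a minimal Matlab sketch of how such a Givens' rotation might be computed from a $2 \times 1$ vector x. The function name matches the routine Givens_rotation.m mentioned in the additional homework at the end of this week, but treat this version as an illustrative sketch under the formulas above, not as the official implementation (a robust implementation would guard against overflow).

function G = Givens_rotation( x )
% Compute a 2 x 2 Givens' rotation G = [ gamma -sigma; sigma gamma ]
% such that G' * x = [ norm( x ); 0 ].
  normx = norm( x );
  if normx == 0
    G = eye( 2 );        % x is the zero vector: any rotation works
  else
    gamma = x( 1 ) / normx;
    sigma = x( 2 ) / normx;
    G = [ gamma -sigma
          sigma  gamma ];
  end
end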
Remark 10.3.2.1 We only discuss real-valued Givens' rotations and how they transform real-valued vectors, since the output of our reduction to tridiagonal form, after postprocessing, yields a real-valued tridiagonal symmetric matrix.

Ponder This 10.3.2.2 One could use $2 \times 2$ Householder transformations (reflectors) instead of Givens' rotations. Why is it better to use Givens' rotations in this situation?

10.3.3 Simple tridiagonal QR algorithm

YouTube: https://www.youtube.com/watch?v=_IgDCL7OPdU
Now, consider the $4 \times 4$ tridiagonal matrix
\[
\begin{pmatrix}
\alpha_{0,0} & \alpha_{0,1} & 0 & 0 \\
\alpha_{1,0} & \alpha_{1,1} & \alpha_{1,2} & 0 \\
0 & \alpha_{2,1} & \alpha_{2,2} & \alpha_{2,3} \\
0 & 0 & \alpha_{3,2} & \alpha_{3,3}
\end{pmatrix}.
\]
From $\begin{pmatrix} \alpha_{0,0} \\ \alpha_{1,0} \end{pmatrix}$, one can compute $\gamma_{1,0}$ and $\sigma_{1,0}$ so that
\[
\begin{pmatrix} \gamma_{1,0} & -\sigma_{1,0} \\ \sigma_{1,0} & \gamma_{1,0} \end{pmatrix}^T
\begin{pmatrix} \alpha_{0,0} \\ \alpha_{1,0} \end{pmatrix}
= \begin{pmatrix} \widehat\alpha_{0,0} \\ 0 \end{pmatrix}.
\]
Then
\[
\begin{pmatrix}
\widehat\alpha_{0,0} & \widehat\alpha_{0,1} & \widehat\alpha_{0,2} & 0 \\
0 & \widehat\alpha_{1,1} & \widehat\alpha_{1,2} & 0 \\
0 & \alpha_{2,1} & \alpha_{2,2} & \alpha_{2,3} \\
0 & 0 & \alpha_{3,2} & \alpha_{3,3}
\end{pmatrix}
=
\begin{pmatrix}
\gamma_{1,0} & \sigma_{1,0} & 0 & 0 \\
-\sigma_{1,0} & \gamma_{1,0} & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
\alpha_{0,0} & \alpha_{0,1} & 0 & 0 \\
\alpha_{1,0} & \alpha_{1,1} & \alpha_{1,2} & 0 \\
0 & \alpha_{2,1} & \alpha_{2,2} & \alpha_{2,3} \\
0 & 0 & \alpha_{3,2} & \alpha_{3,3}
\end{pmatrix}.
\]
Next, from $\begin{pmatrix} \widehat\alpha_{1,1} \\ \alpha_{2,1} \end{pmatrix}$, one can compute $\gamma_{2,1}$ and $\sigma_{2,1}$ so that
\[
\begin{pmatrix} \gamma_{2,1} & -\sigma_{2,1} \\ \sigma_{2,1} & \gamma_{2,1} \end{pmatrix}^T
\begin{pmatrix} \widehat\alpha_{1,1} \\ \alpha_{2,1} \end{pmatrix}
= \begin{pmatrix} \widehat{\widehat\alpha}_{1,1} \\ 0 \end{pmatrix}.
\]
Then
\[
\begin{pmatrix}
\widehat\alpha_{0,0} & \widehat\alpha_{0,1} & \widehat\alpha_{0,2} & 0 \\
0 & \widehat{\widehat\alpha}_{1,1} & \widehat{\widehat\alpha}_{1,2} & \widehat\alpha_{1,3} \\
0 & 0 & \widehat\alpha_{2,2} & \widehat\alpha_{2,3} \\
0 & 0 & \alpha_{3,2} & \alpha_{3,3}
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \gamma_{2,1} & \sigma_{2,1} & 0 \\
0 & -\sigma_{2,1} & \gamma_{2,1} & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
\widehat\alpha_{0,0} & \widehat\alpha_{0,1} & \widehat\alpha_{0,2} & 0 \\
0 & \widehat\alpha_{1,1} & \widehat\alpha_{1,2} & 0 \\
0 & \alpha_{2,1} & \alpha_{2,2} & \alpha_{2,3} \\
0 & 0 & \alpha_{3,2} & \alpha_{3,3}
\end{pmatrix}.
\]
Finally, from $\begin{pmatrix} \widehat\alpha_{2,2} \\ \alpha_{3,2} \end{pmatrix}$, one can compute $\gamma_{3,2}$ and $\sigma_{3,2}$ so that
\[
\begin{pmatrix} \gamma_{3,2} & -\sigma_{3,2} \\ \sigma_{3,2} & \gamma_{3,2} \end{pmatrix}^T
\begin{pmatrix} \widehat\alpha_{2,2} \\ \alpha_{3,2} \end{pmatrix}
= \begin{pmatrix} \widehat{\widehat\alpha}_{2,2} \\ 0 \end{pmatrix}.
\]
Then
\[
\begin{pmatrix}
\widehat\alpha_{0,0} & \widehat\alpha_{0,1} & \widehat\alpha_{0,2} & 0 \\
0 & \widehat{\widehat\alpha}_{1,1} & \widehat{\widehat\alpha}_{1,2} & \widehat{\widehat\alpha}_{1,3} \\
0 & 0 & \widehat{\widehat\alpha}_{2,2} & \widehat{\widehat\alpha}_{2,3} \\
0 & 0 & 0 & \widehat\alpha_{3,3}
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & \gamma_{3,2} & \sigma_{3,2} \\
0 & 0 & -\sigma_{3,2} & \gamma_{3,2}
\end{pmatrix}
\begin{pmatrix}
\widehat\alpha_{0,0} & \widehat\alpha_{0,1} & \widehat\alpha_{0,2} & 0 \\
0 & \widehat{\widehat\alpha}_{1,1} & \widehat{\widehat\alpha}_{1,2} & \widehat\alpha_{1,3} \\
0 & 0 & \widehat\alpha_{2,2} & \widehat\alpha_{2,3} \\
0 & 0 & \alpha_{3,2} & \alpha_{3,3}
\end{pmatrix}.
\]

The matrix Q is the orthogonal matrix that results from multiplying the different Givens’
rotations together:
Q RQ RQ R
“1,0 ≠‡1,0 0 0 1 0 0 0 1 0 0 0
c
c ‡1,0 “1,0 0 0 dc
dc 0 “2,1 ≠‡2,1 0 dc
dc 0 1 0 0 d
d
Q=c dc dc d. (10.3.1)
a 0 0 1 0 ba 0 ‡2,1 “2,1 0 ba 0 0 “3,2 ≠‡3,2 b
0 0 0 1 0 0 0 1 0 0 ‡3,2 “3,2

However, it needs not be explicitly formed, as we exploit next.


The next question is how to compute $RQ$ given the QR factorization of the tridiagonal matrix. We notice that applying the Givens' rotations to $R$ from the right, one at a time, yields
\begin{align*}
& \begin{pmatrix}
\widehat\alpha_{0,0} & \widehat\alpha_{0,1} & \widehat\alpha_{0,2} & 0 \\
0 & \widehat{\widehat\alpha}_{1,1} & \widehat{\widehat\alpha}_{1,2} & \widehat{\widehat\alpha}_{1,3} \\
0 & 0 & \widehat{\widehat\alpha}_{2,2} & \widehat{\widehat\alpha}_{2,3} \\
0 & 0 & 0 & \widehat\alpha_{3,3}
\end{pmatrix}
\begin{pmatrix}
\gamma_{1,0} & -\sigma_{1,0} & 0 & 0 \\
\sigma_{1,0} & \gamma_{1,0} & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \gamma_{2,1} & -\sigma_{2,1} & 0 \\
0 & \sigma_{2,1} & \gamma_{2,1} & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & \gamma_{3,2} & -\sigma_{3,2} \\
0 & 0 & \sigma_{3,2} & \gamma_{3,2}
\end{pmatrix} \\
& \quad = \quad
\begin{pmatrix}
\widetilde\alpha_{0,0} & \widetilde\alpha_{0,1} & \widehat\alpha_{0,2} & 0 \\
\widetilde\alpha_{1,0} & \widehat{\widehat\alpha}_{1,1} & \widehat{\widehat\alpha}_{1,2} & \widehat{\widehat\alpha}_{1,3} \\
0 & 0 & \widehat{\widehat\alpha}_{2,2} & \widehat{\widehat\alpha}_{2,3} \\
0 & 0 & 0 & \widehat\alpha_{3,3}
\end{pmatrix}
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \gamma_{2,1} & -\sigma_{2,1} & 0 \\
0 & \sigma_{2,1} & \gamma_{2,1} & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & \gamma_{3,2} & -\sigma_{3,2} \\
0 & 0 & \sigma_{3,2} & \gamma_{3,2}
\end{pmatrix} \\
& \quad = \quad
\begin{pmatrix}
\widetilde\alpha_{0,0} & \widetilde\alpha_{0,1} & \widetilde\alpha_{0,2} & 0 \\
\widetilde\alpha_{1,0} & \widetilde\alpha_{1,1} & \widetilde\alpha_{1,2} & \widehat{\widehat\alpha}_{1,3} \\
0 & \widetilde\alpha_{2,1} & \widetilde\alpha_{2,2} & \widehat{\widehat\alpha}_{2,3} \\
0 & 0 & 0 & \widehat\alpha_{3,3}
\end{pmatrix}
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & \gamma_{3,2} & -\sigma_{3,2} \\
0 & 0 & \sigma_{3,2} & \gamma_{3,2}
\end{pmatrix} \\
& \quad = \quad
\begin{pmatrix}
\widetilde\alpha_{0,0} & \widetilde\alpha_{0,1} & \widetilde\alpha_{0,2} & 0 \\
\widetilde\alpha_{1,0} & \widetilde\alpha_{1,1} & \widetilde\alpha_{1,2} & \widetilde\alpha_{1,3} \\
0 & \widetilde\alpha_{2,1} & \widetilde\alpha_{2,2} & \widetilde\alpha_{2,3} \\
0 & 0 & \widetilde\alpha_{3,2} & \widetilde\alpha_{3,3}
\end{pmatrix}.
\end{align*}
A symmetry argument can be used to motivate that $\widetilde\alpha_{0,2} = \widetilde\alpha_{1,3} = 0$ (which is why they appear in gray, if you look carefully). This also explains why none of the elements above the first superdiagonal become nonzero.

Remark 10.3.3.1 An important observation is that if $A$ is tridiagonal, then $A \rightarrow QR$ (QR factorization) followed by $A := RQ$ again yields a tridiagonal matrix. In other words, any QR algorithm previously discussed (simple, shifted, with deflation) when started with a tridiagonal matrix will generate a succession of tridiagonal matrices.
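To illustrate this remark, here is a minimal Matlab sketch (not from the text's assignments) of one unshifted QR step $T \rightarrow QR$, $T := RQ$, carried out with Givens' rotations applied to the full array; the helper name givens_gamma_sigma is a hypothetical name for the rotation computation from Homework 10.3.2.1. Running it on a symmetric tridiagonal T returns a matrix that is again tridiagonal (up to roundoff).

function T = Tridiag_QR_step( T )
% One unshifted QR step: factor T = Q R via Givens' rotations, then form R Q.
  m = size( T, 1 );
  G = cell( m-1, 1 );
  for i = 1 : m-1                      % T := Q^T T  (this computes R)
    [ gamma, sigma ] = givens_gamma_sigma( T( i, i ), T( i+1, i ) );
    G{ i } = [ gamma -sigma; sigma gamma ];
    T( i:i+1, : ) = G{ i }' * T( i:i+1, : );
  end
  for i = 1 : m-1                      % T := T Q = R Q
    T( :, i:i+1 ) = T( :, i:i+1 ) * G{ i };
  end
end

function [ gamma, sigma ] = givens_gamma_sigma( chi1, chi2 )
% Compute gamma, sigma so that [ gamma -sigma; sigma gamma ]' * [chi1; chi2] = [ nrm; 0 ].
  nrm = norm( [ chi1; chi2 ] );
  if nrm == 0
    gamma = 1; sigma = 0;
  else
    gamma = chi1 / nrm;  sigma = chi2 / nrm;
  end
end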

10.3.4 The implicit Q theorem

YouTube: https://www.youtube.com/watch?v=w6c p9UqRRE


Definition 10.3.4.1 Upper Hessenberg matrix. A matrix is said to be upper Hessenberg if all entries below its first subdiagonal equal zero. ⌃
In other words, an $m \times m$ upper Hessenberg matrix looks like
\[
A = \begin{pmatrix}
\alpha_{0,0} & \alpha_{0,1} & \alpha_{0,2} & \cdots & \alpha_{0,m-2} & \alpha_{0,m-1} \\
\alpha_{1,0} & \alpha_{1,1} & \alpha_{1,2} & \cdots & \alpha_{1,m-2} & \alpha_{1,m-1} \\
0 & \alpha_{2,1} & \alpha_{2,2} & \cdots & \alpha_{2,m-2} & \alpha_{2,m-1} \\
\vdots & \ddots & \ddots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \ddots & \alpha_{m-2,m-2} & \alpha_{m-2,m-1} \\
0 & 0 & 0 & \cdots & \alpha_{m-1,m-2} & \alpha_{m-1,m-1}
\end{pmatrix}.
\]
Obviously, a tridiagonal matrix is a special case of an upper Hessenberg matrix.


The following theorem sets the stage for one of the most remarkable algorithms in numer-
ical linear algebra, which allows us to greatly streamline the implementation of the shifted
QR algorithm.
Theorem 10.3.4.2 Implicit Q Theorem. Let $A, B \in \mathbb{C}^{m \times m}$, where $B$ is upper Hessenberg and has only (real) positive elements on its first subdiagonal. Assume there exists a unitary matrix $Q$ such that $Q^H A Q = B$. Then $Q$ and $B$ are uniquely determined by $A$ and the first column of $Q$.
Proof. Partition
\[
Q = \begin{pmatrix} q_0 & q_1 & q_2 & \cdots & q_{m-2} & q_{m-1} \end{pmatrix}
\]
and
\[
B = \begin{pmatrix}
\beta_{0,0} & \beta_{0,1} & \beta_{0,2} & \cdots & \beta_{0,m-2} & \beta_{0,m-1} \\
\beta_{1,0} & \beta_{1,1} & \beta_{1,2} & \cdots & \beta_{1,m-2} & \beta_{1,m-1} \\
0 & \beta_{2,1} & \beta_{2,2} & \cdots & \beta_{2,m-2} & \beta_{2,m-1} \\
0 & 0 & \beta_{3,2} & \cdots & \beta_{3,m-2} & \beta_{3,m-1} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & \beta_{m-1,m-2} & \beta_{m-1,m-1}
\end{pmatrix}.
\]
Notice that $A Q = Q B$ and hence
\[
A \begin{pmatrix} q_0 & q_1 & \cdots & q_{m-1} \end{pmatrix}
= \begin{pmatrix} q_0 & q_1 & \cdots & q_{m-1} \end{pmatrix}
\begin{pmatrix}
\beta_{0,0} & \beta_{0,1} & \cdots & \beta_{0,m-1} \\
\beta_{1,0} & \beta_{1,1} & \cdots & \beta_{1,m-1} \\
0 & \beta_{2,1} & \cdots & \beta_{2,m-1} \\
\vdots & \ddots & \ddots & \vdots \\
0 & \cdots & \beta_{m-1,m-2} & \beta_{m-1,m-1}
\end{pmatrix}.
\]
Equating the first column on the left and right, we notice that
\[
A q_0 = \beta_{0,0} q_0 + \beta_{1,0} q_1 .
\]
Now, $q_0$ is given and $\| q_0 \|_2 = 1$ since $Q$ is unitary. Hence
\[
q_0^H A q_0 = \beta_{0,0} q_0^H q_0 + \beta_{1,0} q_0^H q_1 = \beta_{0,0} .
\]
Next,
\[
\beta_{1,0} q_1 = A q_0 - \beta_{0,0} q_0 = \tilde q_1 .
\]
Since $\| q_1 \|_2 = 1$ (it is a column of a unitary matrix) and $\beta_{1,0}$ is assumed to be positive, we know that
\[
\beta_{1,0} = \| \tilde q_1 \|_2 .
\]
Finally,
\[
q_1 = \tilde q_1 / \beta_{1,0} .
\]
The point is that the first column of $B$ and second column of $Q$ are prescribed by the first column of $Q$ and the fact that $B$ has positive elements on the first subdiagonal. In this way, it can be successively argued that, one by one, each column of $Q$ and each column of $B$ are prescribed. ⌅
Homework 10.3.4.1 Give all the details of the above proof.
Solution. Assume that $q_1, \ldots, q_k$ and the column indexed with $k-1$ of $B$ have been shown to be uniquely determined under the stated assumptions. We now show that then $q_{k+1}$ and the column indexed by $k$ of $B$ are uniquely determined. (This is the inductive step in the proof.) Then
\[
A q_k = \beta_{0,k} q_0 + \beta_{1,k} q_1 + \cdots + \beta_{k,k} q_k + \beta_{k+1,k} q_{k+1} .
\]
We can determine $\beta_{0,k}$ through $\beta_{k,k}$ by observing that
\[
q_j^H A q_k = \beta_{j,k}
\]
for $j = 0, \ldots, k$. Then
\[
\beta_{k+1,k} q_{k+1} = A q_k - ( \beta_{0,k} q_0 + \beta_{1,k} q_1 + \cdots + \beta_{k,k} q_k ) = \tilde q_{k+1} .
\]
Since it is assumed that $\beta_{k+1,k} > 0$, it can be determined as
\[
\beta_{k+1,k} = \| \tilde q_{k+1} \|_2
\]
and then
\[
q_{k+1} = \tilde q_{k+1} / \beta_{k+1,k} .
\]
This way, the columns of Q and B can be determined, one-by-one.
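The constructive nature of this proof can be illustrated numerically: starting from $A$ and the first column $q_0$ of $Q$, the remaining columns of $Q$ and $B$ can be built one by one, exactly as in the proof. Below is a minimal Matlab sketch of that construction (not part of the text's assignments); it assumes no breakdown occurs, i.e., no computed subdiagonal element is exactly zero, which holds for random inputs.

% Rebuild Q and upper Hessenberg B (with positive subdiagonal) from A and q0,
% so that A * Q = Q * B, following the inductive argument above.
m  = 5;
A  = randn( m );
q0 = randn( m, 1 );  q0 = q0 / norm( q0 );
Q  = q0;
B  = zeros( m );
for k = 1 : m-1
  v = A * Q( :, k );
  B( 1:k, k ) = Q( :, 1:k )' * v;          % beta_{0,k}, ..., beta_{k,k}
  qt = v - Q( :, 1:k ) * B( 1:k, k );
  B( k+1, k ) = norm( qt );                % positive subdiagonal element
  Q( :, k+1 ) = qt / B( k+1, k );
end
B( :, m ) = Q' * A * Q( :, m );            % last column of B
norm( A * Q - Q * B )                      % should be of the order of machine epsilon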
Ponder This 10.3.4.2 Notice the similarity between the above proof and the proof of the existence and uniqueness of the QR factorization!
This can be brought out by observing that
\[
\begin{pmatrix} q_0 & A \begin{pmatrix} q_0 & q_1 & \cdots & q_{m-1} \end{pmatrix} \end{pmatrix}
= \begin{pmatrix} q_0 & q_1 & \cdots & q_{m-1} \end{pmatrix}
\begin{pmatrix}
1 & \beta_{0,0} & \beta_{0,1} & \cdots & \beta_{0,m-1} \\
0 & \beta_{1,0} & \beta_{1,1} & \cdots & \beta_{1,m-1} \\
0 & 0 & \beta_{2,1} & \cdots & \beta_{2,m-1} \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & \beta_{m-1,m-2} & \beta_{m-1,m-1}
\end{pmatrix}.
\]
Puzzle through this observation and interpret what it means.

YouTube: https://www.youtube.com/watch?v=1uD TfWfH6s


Remark 10.3.4.3 In our case, A is symmetric tridiagonal, and so is B.

10.3.5 The Francis implicit QR Step

YouTube: https://www.youtube.com/watch?v=RSm_Mqi0aSA
In the last unit, we described how, when $A^{(k)}$ is tridiagonal, the steps
\[
\begin{array}{l}
A^{(k)} \rightarrow Q^{(k)} R^{(k)} \\
A^{(k+1)} := R^{(k)} Q^{(k)}
\end{array}
\]
of an unshifted QR algorithm can be staged as the computation and application of a sequence of Givens' rotations. Obviously, one could explicitly form $A^{(k)} - \mu_k I$, perform these computations with the resulting matrix, and then add $\mu_k I$ to the result to compute
\[
\begin{array}{l}
A^{(k)} - \mu_k I \rightarrow Q^{(k)} R^{(k)} \\
A^{(k+1)} := R^{(k)} Q^{(k)} + \mu_k I .
\end{array}
\]
The Francis QR Step combines these separate steps into a single one, in the process casting all computations in terms of unitary similarity transformations, which ensures numerical stability.
Consider the $4 \times 4$ tridiagonal matrix
\[
\begin{pmatrix}
\alpha_{0,0} & \alpha_{0,1} & 0 & 0 \\
\alpha_{1,0} & \alpha_{1,1} & \alpha_{1,2} & 0 \\
0 & \alpha_{2,1} & \alpha_{2,2} & \alpha_{2,3} \\
0 & 0 & \alpha_{3,2} & \alpha_{3,3}
\end{pmatrix} - \mu I .
\]
The first Givens' rotation is computed from $\begin{pmatrix} \alpha_{0,0} - \mu \\ \alpha_{1,0} \end{pmatrix}$, yielding $\gamma_{1,0}$ and $\sigma_{1,0}$ so that
\[
\begin{pmatrix} \gamma_{1,0} & -\sigma_{1,0} \\ \sigma_{1,0} & \gamma_{1,0} \end{pmatrix}^T
\begin{pmatrix} \alpha_{0,0} - \mu \\ \alpha_{1,0} \end{pmatrix}
\]
has a zero second entry. Now, to preserve eigenvalues, any orthogonal matrix that is applied from the left must also have its transpose applied from the right. Let us compute
\[
\begin{pmatrix}
\widetilde\alpha_{0,0} & \widehat\alpha_{1,0} & \widehat\alpha_{2,0} & 0 \\
\widehat\alpha_{1,0} & \widehat\alpha_{1,1} & \widehat\alpha_{1,2} & 0 \\
\widehat\alpha_{2,0} & \widehat\alpha_{2,1} & \alpha_{2,2} & \alpha_{2,3} \\
0 & 0 & \alpha_{3,2} & \alpha_{3,3}
\end{pmatrix}
=
\begin{pmatrix}
\gamma_{1,0} & \sigma_{1,0} & 0 & 0 \\
-\sigma_{1,0} & \gamma_{1,0} & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
\alpha_{0,0} & \alpha_{0,1} & 0 & 0 \\
\alpha_{1,0} & \alpha_{1,1} & \alpha_{1,2} & 0 \\
0 & \alpha_{2,1} & \alpha_{2,2} & \alpha_{2,3} \\
0 & 0 & \alpha_{3,2} & \alpha_{3,3}
\end{pmatrix}
\begin{pmatrix}
\gamma_{1,0} & -\sigma_{1,0} & 0 & 0 \\
\sigma_{1,0} & \gamma_{1,0} & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}.
\]

This is known as "introducing the bulge."
Next, from $\begin{pmatrix} \widehat\alpha_{1,0} \\ \widehat\alpha_{2,0} \end{pmatrix}$, one can compute $\gamma_{2,0}$ and $\sigma_{2,0}$ so that
\[
\begin{pmatrix} \gamma_{2,0} & -\sigma_{2,0} \\ \sigma_{2,0} & \gamma_{2,0} \end{pmatrix}^T
\begin{pmatrix} \widehat\alpha_{1,0} \\ \widehat\alpha_{2,0} \end{pmatrix}
= \begin{pmatrix} \widetilde\alpha_{1,0} \\ 0 \end{pmatrix}.
\]
Then
\[
\begin{pmatrix}
\widetilde\alpha_{0,0} & \widetilde\alpha_{1,0} & 0 & 0 \\
\widetilde\alpha_{1,0} & \widetilde\alpha_{1,1} & \widehat{\widehat\alpha}_{2,1} & \widehat\alpha_{3,1} \\
0 & \widehat{\widehat\alpha}_{2,1} & \widehat\alpha_{2,2} & \widehat\alpha_{2,3} \\
0 & \widehat\alpha_{3,1} & \widehat\alpha_{3,2} & \alpha_{3,3}
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \gamma_{2,0} & \sigma_{2,0} & 0 \\
0 & -\sigma_{2,0} & \gamma_{2,0} & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
\widetilde\alpha_{0,0} & \widehat\alpha_{1,0} & \widehat\alpha_{2,0} & 0 \\
\widehat\alpha_{1,0} & \widehat\alpha_{1,1} & \widehat\alpha_{1,2} & 0 \\
\widehat\alpha_{2,0} & \widehat\alpha_{2,1} & \alpha_{2,2} & \alpha_{2,3} \\
0 & 0 & \alpha_{3,2} & \alpha_{3,3}
\end{pmatrix}
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \gamma_{2,0} & -\sigma_{2,0} & 0 \\
0 & \sigma_{2,0} & \gamma_{2,0} & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\]
again preserves eigenvalues. Finally, from $\begin{pmatrix} \widehat{\widehat\alpha}_{2,1} \\ \widehat\alpha_{3,1} \end{pmatrix}$, one can compute $\gamma_{3,1}$ and $\sigma_{3,1}$ so that
\[
\begin{pmatrix} \gamma_{3,1} & -\sigma_{3,1} \\ \sigma_{3,1} & \gamma_{3,1} \end{pmatrix}^T
\begin{pmatrix} \widehat{\widehat\alpha}_{2,1} \\ \widehat\alpha_{3,1} \end{pmatrix}
= \begin{pmatrix} \widetilde\alpha_{2,1} \\ 0 \end{pmatrix}.
\]
Then
\[
\begin{pmatrix}
\widetilde\alpha_{0,0} & \widetilde\alpha_{1,0} & 0 & 0 \\
\widetilde\alpha_{1,0} & \widetilde\alpha_{1,1} & \widetilde\alpha_{2,1} & 0 \\
0 & \widetilde\alpha_{2,1} & \widetilde\alpha_{2,2} & \widetilde\alpha_{2,3} \\
0 & 0 & \widetilde\alpha_{3,2} & \widetilde\alpha_{3,3}
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & \gamma_{3,1} & \sigma_{3,1} \\
0 & 0 & -\sigma_{3,1} & \gamma_{3,1}
\end{pmatrix}
\begin{pmatrix}
\widetilde\alpha_{0,0} & \widetilde\alpha_{1,0} & 0 & 0 \\
\widetilde\alpha_{1,0} & \widetilde\alpha_{1,1} & \widehat{\widehat\alpha}_{2,1} & \widehat\alpha_{3,1} \\
0 & \widehat{\widehat\alpha}_{2,1} & \widehat\alpha_{2,2} & \widehat\alpha_{2,3} \\
0 & \widehat\alpha_{3,1} & \widehat\alpha_{3,2} & \alpha_{3,3}
\end{pmatrix}
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & \gamma_{3,1} & -\sigma_{3,1} \\
0 & 0 & \sigma_{3,1} & \gamma_{3,1}
\end{pmatrix},
\]
yielding a tridiagonal matrix. The process of transforming the matrix that results from introducing the bulge (the nonzero element $\widehat\alpha_{2,0}$) back into a tridiagonal matrix is commonly referred to as "chasing the bulge." Moving the bulge one row and column down the matrix is illustrated in Figure 10.3.5.1. The process of determining the first Givens' rotation, introducing the bulge, and chasing the bulge is known as a Francis Implicit QR step. An algorithm for this is given in Figure 10.3.5.2.

Figure 10.3.5.1 Illustration of how the bulge is chased one row and column forward in the
matrix.
T := ChaseBulge(T)
Partition T → ( TTL  ★   ★  ;  TML  TMM  ★  ;  0  TBM  TBR )
  where TTL is 0 × 0 and TMM is 3 × 3
while m(TBR) > 0
  Repartition
    ( TTL  ★   0  ;  TML  TMM  ★  ;  0  TBM  TBR ) →
    ( T00   ★    0    0    0  ;
      t10^T τ11  ★    0    0  ;
      0     t21  T22  ★    0  ;
      0     0    t32^T τ33 ★  ;
      0     0    0    t43  T44 )
  Compute (γ, σ) s.t. G_{γ,σ}^T t21 = ( τ21 ; 0 ), and assign t21 := ( τ21 ; 0 )
  T22 := G_{γ,σ}^T T22 G_{γ,σ}
  t32^T := t32^T G_{γ,σ}   (not performed during final step)
  Continue with
    ( TTL  ★   0  ;  TML  TMM  ★  ;  0  TBM  TBR ) ← (the repartitioned matrix above)
endwhile

Figure 10.3.5.2 Algorithm for "chasing the bulge" that, given a tridiagonal matrix with an additional nonzero $\alpha_{2,0}$ element, reduces the given matrix back to a tridiagonal matrix.
The described process has the net result of updating $A^{(k+1)} = Q^T A^{(k)} Q$, where $Q$ is the orthogonal matrix that results from multiplying the different Givens' rotations together:
\[
Q =
\begin{pmatrix}
\gamma_{1,0} & -\sigma_{1,0} & 0 & 0 \\
\sigma_{1,0} & \gamma_{1,0} & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \gamma_{2,0} & -\sigma_{2,0} & 0 \\
0 & \sigma_{2,0} & \gamma_{2,0} & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & \gamma_{3,1} & -\sigma_{3,1} \\
0 & 0 & \sigma_{3,1} & \gamma_{3,1}
\end{pmatrix}.
\]
Importantly, the first column of $Q$, given by
\[
\begin{pmatrix} \gamma_{1,0} \\ \sigma_{1,0} \\ 0 \\ 0 \end{pmatrix},
\]
is exactly the same as the first column $Q$ would have had if it had been computed as in Subsection 10.3.3, (10.3.1). Thus, by the Implicit Q Theorem, the tridiagonal matrix that results from this approach is equal to the tridiagonal matrix that would be computed by applying the QR factorization from that section to $A - \mu I$, $A - \mu I \rightarrow QR$, followed by the formation of $RQ + \mu I$ using the algorithm for computing $RQ$ in Subsection 10.3.3.

Remark 10.3.5.3 In Figure 10.3.5.2, we use a variation of the notation we have encountered when presenting many of our algorithms, including most recently the reduction to tridiagonal form. The fact is that when implementing the implicitly shifted QR algorithm, it is best to do so by explicitly indexing into the matrix. This tridiagonal matrix is typically stored as just two vectors: one for the diagonal and one for the subdiagonal.
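As a small illustration of this storage scheme (a sketch, not part of the text's assignments), a symmetric tridiagonal matrix stored as two vectors can be reconstituted in Matlab for testing purposes as follows:

% d holds the diagonal, e the subdiagonal, of a symmetric tridiagonal matrix.
d = randn( 5, 1 );
e = randn( 4, 1 );
T = diag( d ) + diag( e, -1 ) + diag( e, 1 );   % full matrix, for checking only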
Homework 10.3.5.1 A typical step when "chasing the bulge" one row and column further down the matrix involves the computation
\[
\begin{pmatrix}
\alpha_{i-1,i-1} & \times & \times & 0 \\
\widehat\alpha_{i,i-1} & \widehat\alpha_{i,i} & \times & 0 \\
0 & \widehat\alpha_{i+1,i} & \widehat\alpha_{i+1,i+1} & \times \\
0 & \widehat\alpha_{i+2,i} & \widehat\alpha_{i+2,i+1} & \alpha_{i+2,i+2}
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \gamma_i & \sigma_i & 0 \\
0 & -\sigma_i & \gamma_i & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
\alpha_{i-1,i-1} & \times & \times & 0 \\
\alpha_{i,i-1} & \alpha_{i,i} & \times & 0 \\
\alpha_{i+1,i-1} & \alpha_{i+1,i} & \alpha_{i+1,i+1} & \times \\
0 & 0 & \alpha_{i+2,i+1} & \alpha_{i+2,i+2}
\end{pmatrix}
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \gamma_i & -\sigma_i & 0 \\
0 & \sigma_i & \gamma_i & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}.
\]
Give a strategy (or formula) for computing
\[
\begin{pmatrix}
\widehat\alpha_{i,i-1} & \widehat\alpha_{i,i} \\
\widehat\alpha_{i+1,i} & \widehat\alpha_{i+1,i+1} \\
\widehat\alpha_{i+2,i} & \widehat\alpha_{i+2,i+1}
\end{pmatrix}.
\]
Solution. Since the subscripts will drive us crazy, let's relabel, add one of the entries above the diagonal, and drop the subscripts on $\gamma$ and $\sigma$:
\[
\begin{pmatrix}
\times & \times & \times & 0 \\
\widehat\epsilon & \widehat\kappa & \widehat\lambda & 0 \\
0 & \widehat\lambda & \widehat\mu & \times \\
0 & \widehat\chi & \widehat\psi & \times
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \gamma & \sigma & 0 \\
0 & -\sigma & \gamma & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
\times & \times & \times & 0 \\
\epsilon & \kappa & \lambda & 0 \\
\phi & \lambda & \mu & \times \\
0 & 0 & \psi & \times
\end{pmatrix}
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \gamma & -\sigma & 0 \\
0 & \sigma & \gamma & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}.
\]
With this, the way I would compute the desired results is via the steps
• $\widehat\epsilon := \gamma \epsilon + \sigma \phi$
• $\begin{pmatrix} \widehat\kappa & \widehat\lambda \\ \widehat\lambda & \widehat\mu \end{pmatrix} :=
\left[ \begin{pmatrix} \gamma & \sigma \\ -\sigma & \gamma \end{pmatrix}
\begin{pmatrix} \kappa & \lambda \\ \lambda & \mu \end{pmatrix} \right]
\begin{pmatrix} \gamma & -\sigma \\ \sigma & \gamma \end{pmatrix}$
• $\widehat\chi := \sigma \psi$
• $\widehat\psi := \gamma \psi$
Translating this to the update of the actual entries is straightforward.
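For concreteness, the relabeled steps in the solution above can be written out as a small Matlab function. This is a sketch with illustrative names (not from the text's assignments); epsilon and phi are the current subdiagonal entry and the bulge, and kappa, lambda, mu, psi are the remaining affected entries.

function [ epsilon_hat, kappa_hat, lambda_hat, mu_hat, chi_hat, psi_hat ] = ...
                chase_step( gamma, sigma, epsilon, phi, kappa, lambda, mu, psi )
% Apply one Givens' rotation of a bulge chase to the relabeled entries.
  epsilon_hat = gamma * epsilon + sigma * phi;
  K = [ gamma  sigma; -sigma gamma ] * [ kappa lambda; lambda mu ] ...
      * [ gamma -sigma; sigma gamma ];
  kappa_hat  = K( 1, 1 );
  lambda_hat = K( 2, 1 );
  mu_hat     = K( 2, 2 );
  chi_hat    = sigma * psi;     % new bulge, one row and column further down
  psi_hat    = gamma * psi;
end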

Ponder This 10.3.5.2 Write a routine that performs one Francis implicit QR step. Use it
to write an implicitly shifted QR algorithm.

10.3.6 A complete algorithm

YouTube: https://www.youtube.com/watch?v=fqiex-FQ-JU

YouTube: https://www.youtube.com/watch?v=53XcY9IQDU0
The last unit shows how one iteration of the QR algorithm can be performed on a
tridiagonal matrix by implicitly shifting and then "chasing the bulge." All that is left to
complete the algorithm is to note that

• The shift $\mu_k$ can be chosen to equal $\alpha_{m-1,m-1}$ (the last element on the diagonal, which tends to converge to the eigenvalue smallest in magnitude). In practice, choosing the shift to be an eigenvalue of the bottom-right $2 \times 2$ matrix works better. This is known as the Wilkinson shift.

• If $A = Q T Q^T$ reduced $A$ to the tridiagonal matrix $T$ before the QR algorithm commenced, then the Givens' rotations encountered as part of the implicitly shifted QR algorithm can be applied from the right to the appropriate columns of $Q$ so that upon completion $Q$ is left overwritten with the eigenvectors of $A$. Let's analyze this:

¶ Reducing the matrix to tridiagonal form requires $O(m^3)$ computations.

¶ Forming $Q$ from the Householder vectors requires $O(m^3)$ computations.

¶ Applying a Givens' rotation to a pair of columns of $Q$ requires $O(m)$ computation per Givens' rotation. For each Francis implicit QR step, $O(m)$ Givens' rotations are computed, making the application of Givens' rotations to $Q$ of cost $O(m^2)$ per iteration of the implicitly shifted QR algorithm. Typically a few (2-3) iterations are needed per eigenvalue that is uncovered (when deflation is incorporated), meaning that $O(m)$ iterations are needed. Thus, a QR algorithm with a tridiagonal matrix that accumulates eigenvectors requires $O(m^3)$ computation.
Thus, the total cost of computing the eigenvalues and eigenvectors is $O(m^3)$.

• If an element on the subdiagonal (and hence the corresponding element on the superdiagonal) becomes zero (or very small), then the matrix decouples,
\[
T = \begin{pmatrix} T_{00} & 0 \\ 0 & T_{11} \end{pmatrix},
\]
with both $T_{00}$ and $T_{11}$ tridiagonal. Then
¶ The computation can continue separately with $T_{00}$ and $T_{11}$.

¶ One can pick the shift from the bottom-right of $T_{00}$ as one continues finding the eigenvalues of $T_{00}$, thus accelerating that part of the computation.

¶ One can pick the shift from the bottom-right of $T_{11}$ as one continues finding the eigenvalues of $T_{11}$, thus accelerating that part of the computation.

¶ One must continue to accumulate the eigenvectors by applying the rotations to the appropriate columns of $Q$.

¶ Because of the connection between the QR algorithm and the Inverse Power Method, subdiagonal entries near the bottom-right of $T$ are more likely to converge to a zero, so most deflation will happen there.

¶ A question becomes when an element on the subdiagonal, $\tau_{i+1,i}$, can be considered to be zero. The answer is when $| \tau_{i+1,i} |$ is small relative to $| \tau_{i,i} |$ and $| \tau_{i+1,i+1} |$. A typical condition that is used is
\[
| \tau_{i+1,i} | \leq \epsilon_{\rm mach} \sqrt{ | \tau_{i,i} | \, | \tau_{i+1,i+1} | } .
\]
For details, see some of our papers mentioned in the enrichments.
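As a concrete illustration of this deflation test, here is a minimal Matlab sketch (not part of the text's assignments) that scans the subdiagonal of a tridiagonal matrix stored as vectors d (diagonal) and e (subdiagonal) and sets negligible entries to zero so that the matrix decouples.

function e = deflate_small_subdiagonal( d, e )
% Apply the deflation criterion described above to each subdiagonal entry.
  for i = 1 : length( e )
    if abs( e( i ) ) <= eps * sqrt( abs( d( i ) ) * abs( d( i+1 ) ) )
      e( i ) = 0;    % deflate: T decouples into two smaller tridiagonal matrices
    end
  end
end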

10.4 Enrichments
10.4.1 QR algorithm among the most important algorithms of the
20th century
An article published in SIAM News, a publication of the Society for Industrial and Applied Mathematics, lists the QR algorithm among the ten most important algorithms of the 20th century [10]:

• Barry A. Cipra, The Best of the 20th Century: Editors Name Top 10 Algorithms,
SIAM News, Volume 33, Number 4, 2000.

10.4.2 Who was John Francis


Below is a posting by the late Gene Golub in NA Digest Sunday, August 19, 2007 Volume
07 : Issue 34.

From: Gene H Golub
Date: Sun, 19 Aug 2007 13:54:47 -0700 (PDT)
Subject: John Francis, Co-Inventor of QR

Dear Colleagues,

For many years, I have been interested in meeting J G F Francis, one of
the co-inventors of the QR algorithm for computing eigenvalues of general
matrices. Through a lead provided by the late Erin Brent and with the aid
of Google, I finally made contact with him.

John Francis was born in 1934 in London and currently lives in Hove, near
Brighton. His residence is about a quarter mile from the sea; he is a
widower. In 1954, he worked at the National Research Development Corp
(NRDC) and attended some lectures given by Christopher Strachey.
In 1955/56 he was a student at Cambridge but did not complete a degree.
He then went back to NRDC as an assistant to Strachey where he got
involved in flutter computations and this led to his work on QR.

After leaving NRDC in 1961, he worked at the Ferranti Corp and then at the
University of Sussex. Subsequently, he had positions with various
industrial organizations and consultancies. He is now retired. His
interests were quite general and included Artificial Intelligence,
computer languages, systems engineering. He has not returned to numerical
computation.

He was surprised to learn there are many references to his work and
that the QR method is considered one of the ten most important
algorithms of the 20th century. He was unaware of such developments as
TeX and Math Lab. Currently he is working on a degree at the Open
University.

John Francis did remarkable work and we are all in his debt. Along with
the conjugate gradient method, it provided us with one of the basic tools
of numerical analysis.

Gene Golub

10.4.3 Casting the reduction to tridiagonal form in terms of matrix-


matrix multiplication
For many algorithms, we have discussed blocked versions that cast more computation in
terms of matrix-matrix multiplication, thus achieving portable high performance (when
linked to a high-performance implementation of matrix-matrix multiplication). The incon-
venient truth is that reduction to tridiagonal form can only be partially cast in terms of
matrix-matrix multiplication. This is a severe hindrance to high performance for that first
step towards computing all eigenvalues and eigenvectors of a Hermitian matrix. Worse, a
considerable fraction of the total cost of the computation is in that first step.
For a detailed discussion on the blocked algorithm for reduction to tridiagonal form, we
recommend
• [44] Field G. Van Zee, Robert A. van de Geijn, Gregorio Quintana-Ortí, G. Joseph
Elizondo, Families of Algorithms for Reducing a Matrix to Condensed Form, ACM
Transactions on Mathematical Software (TOMS) , Vol. 39, No. 1, 2012.
Tridiagonal form is one case of what is more generally referred to as "condensed form."

10.4.4 Optimizing the tridiagonal QR algorithm


As the Givens’ rotations are applied to the tridiagonal matrix, they are also applied to a
matrix in which eigenvectors are accumulated. While one Implicit Francis QR Step requires
O(n) computation for chasing the bulge, this accumulation of the eigenvectors requires O(n2 )
computation with O(n2 ) data per step. This inherently means the cost of accessing data
dominates on current architectures.
In a paper, we showed how accumulating the Givens’ rotations for several Francis Steps
before applying these to the matrix in which the eigenvectors are being computed allows one
to attain high performance similar to that attained by a matrix-matrix multiplication.
• [42] Field G. Van Zee, Robert A. van de Geijn, Gregorio Quintana-Ortí, Restructuring
the Tridiagonal and Bidiagonal QR Algorithms for Performance, ACM Transactions
on Mathematical Software (TOMS), Vol. 40, No. 3, 2014.
For computing all eigenvalues and eigenvectors of a dense Hermitian matrix, this approach
is competitive with the Method of Relatively Robust Representations (MRRR), which we
mention in Subsection 10.4.5.

10.4.5 The Method of Multiple Relatively Robust Representations


(MRRR)
The Method of Multiple Relatively Robust Representations (MRRR) computes the eigenvalues and eigenvectors of an $m \times m$ tridiagonal matrix in $O(m^2)$ time. It can be argued that this is within a constant factor of the lower bound for computing these eigenvectors, since the eigenvectors constitute $O(m^2)$ data that must be written upon the completion of the computation.
When computing the eigenvalues and eigenvectors of a dense Hermitian matrix, MRRR
can replace the implicitly shifted QR algorithm for finding the eigenvalues and eigenvectors
of the tridiagonal matrix. The overall steps then become

• Reduce matrix $A$ to tridiagonal form:
\[
A \rightarrow Q_A T Q_A^H ,
\]
where $T$ is a tridiagonal real-valued matrix. The matrix $Q_A$ is not explicitly formed but instead the Householder vectors that were computed as part of the reduction to tridiagonal form are stored.

• Compute the eigenvalues and eigenvectors of the tridiagonal matrix $T$:
\[
T \rightarrow Q_T D Q_T^T .
\]

• "Back transform" the eigenvectors by forming $Q_A Q_T$ (applying the Householder transformations that define $Q_A$ to $Q_T$).

The details of that method go beyond the scope of this note. We refer the interested reader
to

• [12] Inderjit S. Dhillon and Beresford N. Parlett, Multiple Representations to Compute


Orthogonal Eigenvectors of Symmetric Tridiagonal Matrices, Lin. Alg. Appl., Vol.
387, 2004.

• [3] Paolo Bientinesi, Inderjit S. Dhillon, Robert A. van de Geijn, A Parallel Eigensolver
for Dense Symmetric Matrices Based on Multiple Relatively Robust Representations,
SIAM Journal on Scientific Computing, 2005
Remark 10.4.5.1 An important feature of MRRR is that it can be used to find a subset of
eigenvectors. This is in contrast to the QR algorithm, which computes all eigenvectors.

10.5 Wrap Up
10.5.1 Additional homework
Homework 10.5.1.1 You may want to do a new "git pull" to update directory Assignments.
In Assignments/Week10/matlab you will find the files

• Givens_rotation.m: A function that computes a Givens' rotation from a $2 \times 1$ vector $x$.

• Francis_Step.m: A function that performs a Francis Implicit QR Step with a tridiagonal matrix $T$ (stored as the diagonal and subdiagonal of T).

• Test_Francis_Step.m: A very rudimentary script that performs a few calls to the function Francis_Step. Notice that our criterion for the routine being correct is that the matrix retains the correct eigenvalues.

With this,

1. Investigate the convergence of the $(m, m-1)$ element of matrix T1.

2. Write a function

function T = Spectral_Decomposition_Lambda( T )

that returns $\Lambda$ such that $T = Q \Lambda Q^T$ is the Spectral Decomposition of $T$. The input matrix $T$ is a tridiagonal matrix where only the lower triangular part of the matrix is stored in the diagonal and first subdiagonal of array T. The diagonal matrix $\Lambda$ is returned in T. The upper triangular part of the array should not change values. You are encouraged to call the function Francis_Step from the function Spectral_Decomposition_Lambda. Obviously, you need to incorporate deflation in your implementation. How to handle the final $2 \times 2$ matrix is an interesting question... (You may use the matlab function eig for this.)

10.5.2 Summary
We have noticed that typos are uncovered relatively quickly once we release the material.
Because we "cut and paste" the summary from the materials in this week, we are delaying
adding the summary until most of these typos have been identified.
Week 11

Computing the SVD

11.1 Opening
11.1.1 Linking the Singular Value Decomposition to the Spectral
Decomposition

YouTube: https://www.youtube.com/watch?v=LaYzn2x_Z8Q
Week 2 introduced us to the Singular Value Decomposition (SVD) of a matrix. For any matrix $A \in \mathbb{C}^{m \times n}$, there exist unitary matrix $U \in \mathbb{C}^{m \times m}$, unitary matrix $V \in \mathbb{C}^{n \times n}$, and $\Sigma \in \mathbb{R}^{m \times n}$ of the form
\[
\Sigma = \begin{pmatrix} \Sigma_{TL} & 0_{r \times (n-r)} \\ 0_{(m-r) \times r} & 0_{(m-r) \times (n-r)} \end{pmatrix},
\mbox{ with } \Sigma_{TL} = \mbox{diag}( \sigma_0, \ldots, \sigma_{r-1} )
\mbox{ and } \sigma_0 \geq \sigma_1 \geq \cdots \geq \sigma_{r-1} > 0,
\tag{11.1.1}
\]
such that $A = U \Sigma V^H$, the SVD of matrix $A$. We can correspondingly partition $U = ( U_L \;\; U_R )$ and $V = ( V_L \;\; V_R )$, where $U_L$ and $V_L$ have $r$ columns, in which case
\[
A = U_L \Sigma_{TL} V_L^H
\]
equals the Reduced Singular Value Decomposition. We did not present practical algorithms for computing this very important result in Week 2, because we did not have the theory and practical insights in place to do so. With our discussion of the QR algorithm in the last week, we can now return to the SVD and present the fundamentals that underlie its computation.


In Week 10, we discovered algorithms for computing the Spectral Decomposition of a


Hermitian matrix. The following exercises link the SVD of A to the Spectral Decomposition
of B = AH A, providing us with a first hint as to how to practically compute the SVD.
Homework 11.1.1.1 Let $A \in \mathbb{C}^{m \times n}$ and $A = U \Sigma V^H$ its SVD, where $\Sigma$ has the structure indicated in (11.1.1). Give the Spectral Decomposition of the matrix $A^H A$.
Solution.
\[
\begin{array}{l}
A^H A \\
~~~ = ~~~~ \langle \; A = U \Sigma V^H \; \rangle \\
( U \Sigma V^H )^H ( U \Sigma V^H ) \\
~~~ = ~~~~ \langle \; (BC)^H = C^H B^H ; \; U^H U = I \; \rangle \\
V \Sigma^T \Sigma V^H \\
~~~ = \\
V \begin{pmatrix} \Sigma_{TL}^2 & 0_{r \times (n-r)} \\ 0_{(n-r) \times r} & 0_{(n-r) \times (n-r)} \end{pmatrix} V^H .
\end{array}
\]
Homework 11.1.1.2 Let $A \in \mathbb{C}^{m \times n}$ and $A = U \Sigma V^H$ its SVD, where $\Sigma$ has the structure indicated in (11.1.1). Give the Spectral Decomposition of the matrix $A A^H$.
Solution.
\[
\begin{array}{l}
A A^H \\
~~~ = \\
( U \Sigma V^H ) ( U \Sigma V^H )^H \\
~~~ = \\
U \Sigma \Sigma^T U^H \\
~~~ = \\
U \begin{pmatrix} \Sigma_{TL}^2 & 0_{r \times (m-r)} \\ 0_{(m-r) \times r} & 0_{(m-r) \times (m-r)} \end{pmatrix} U^H .
\end{array}
\]
The last two homeworks expose how to compute the Spectral Decomposition of $A^H A$ or $A A^H$ from the SVD of matrix $A$. We already discovered practical algorithms for computing the Spectral Decomposition in the last week. What we really want to do is to turn this around: How do we compute the SVD of $A$ from the Spectral Decomposition of $A^H A$ and/or $A A^H$?

11.1.2 Overview
• 11.1 Opening

¶ 11.1.1 Linking the Singular Value Decomposition to the Spectral Decomposition


¶ 11.1.2 Overview
¶ 11.1.3 What you will learn

• 11.2 Practical Computation of the Singular Value Decomposition

¶ 11.2.1 Computing the SVD from the Spectral Decomposition


¶ 11.2.2 A strategy for computing the SVD

¶ 11.2.3 Reduction to bidiagonal form


¶ 11.2.4 Implicitly shifted bidiagonal QR algorithm

• 11.3 Jacobi’s Method

¶ 11.3.1 Jacobi rotation


¶ 11.3.2 Jacobi’s method for computing the Spectral Decomposition
¶ 11.3.3 Jacobi’s method for computing the Singular Value Decomposition

• 11.4 Enrichments

¶ 11.4.1 Casting the reduction to bidiagonal form in terms of matrix-matrix multi-


plication
¶ 11.4.2 Optimizing the bidiagonal QR algorithm

• 11.5 Wrap Up

¶ 11.5.1 Additional homework


¶ 11.5.2 Summary

11.1.3 What you will learn


This week, you finally discover practical algorithms for computing the Singular Value Decomposition.
Upon completion of this week, you should be able to

• Link the (Reduced) Singular Value Decomposition of $A$ to the Spectral Decomposition of $A^H A$.

• Reduce a matrix to bidiagonal form.

• Transform the implicitly shifted QR algorithm into the implicitly shifted bidiagonal
QR algorithm.

• Use Jacobi rotations to propose alternative algorithms, known as Jacobi’s Methods,


for computing the Spectral Decomposition of a symmetric matrix and Singular Value
Decomposition of a general real-valued matrix.

11.2 Practical Computation of the Singular Value De-


composition
11.2.1 Computing the SVD from the Spectral Decomposition

YouTube: https://www.youtube.com/watch?v=aSvGPY09b48
Let’s see if we can turn the discussion from Subsection 11.1.1 around: Given the Spectral
Decomposition of AH A, how can we extract the SVD of A?
Homework 11.2.1.1 Let $A \in \mathbb{C}^{m \times m}$ be nonsingular and $A^H A = Q D Q^H$, the Spectral Decomposition of $A^H A$. Give a formula for $U$, $V$, and $\Sigma$ so that $A = U \Sigma V^H$ is the SVD of $A$. (Notice that $A$ is square.)
Solution. Since $A$ is nonsingular, so is $A^H A$ and hence $D$ has positive real values on its diagonal. If we take $V = Q$ and $\Sigma = D^{1/2}$ then
\[
A = U \Sigma V^H = U D^{1/2} Q^H .
\]
This suggests that we choose
\[
U = A V \Sigma^{-1} = A Q D^{-1/2} .
\]
We can easily verify that $U$ is unitary:
\[
U^H U = ( A Q D^{-1/2} )^H ( A Q D^{-1/2} ) = D^{-1/2} Q^H A^H A Q D^{-1/2} = D^{-1/2} D D^{-1/2} = I .
\]
The final detail is that the Spectral Decomposition does not require the diagonal elements of $D$ to be ordered from largest to smallest. This can be easily fixed by permuting the columns of Q and, correspondingly, the diagonal elements of D.
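To make the recipe in this homework concrete, here is a minimal Matlab sketch (not part of the text) that forms the SVD of a square, nonsingular $A$ from the Spectral Decomposition of $A^H A$. As discussed at the end of this unit, explicitly forming $A^H A$ is not numerically sound, so this is for illustration only.

% Illustration only: A = U Sigma V' for square, nonsingular A, built from eig( A' * A ).
A = randn( 5, 5 );
[ Q, D ] = eig( A' * A );
[ d, p ] = sort( diag( D ), 'descend' );   % order the (squared) singular values
V     = Q( :, p );
Sigma = diag( sqrt( d ) );
U     = A * V / Sigma;                     % U = A V Sigma^{-1}
norm( A - U * Sigma * V' )                 % should be of the order of machine epsilon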

YouTube: https://www.youtube.com/watch?v=sAbXHD4TMSE
Not all matrices are square and nonsingular. In particular, we are typically interested in
the SVD of matrices where m > n. Let’s examine how to extract the SVD from the Spectral
Decomposition of AH A for such matrices.
Homework 11.2.1.2 Let $A \in \mathbb{C}^{m \times n}$ have full column rank and let $A^H A = Q D Q^H$, the Spectral Decomposition of $A^H A$. Give a formula for the Reduced SVD of $A$.
Solution. We notice that if $A$ has full column rank, then its Reduced Singular Value Decomposition is given by $A = U_L \Sigma V^H$, where $U_L \in \mathbb{C}^{m \times n}$, $\Sigma \in \mathbb{R}^{n \times n}$, and $V \in \mathbb{C}^{n \times n}$. Importantly, $A^H A$ is nonsingular, and $D$ has positive real values on its diagonal. If we take $V = Q$ and $\Sigma = D^{1/2}$ then
\[
A = U_L \Sigma V^H = U_L D^{1/2} Q^H .
\]
This suggests that we choose
\[
U_L = A V \Sigma^{-1} = A Q D^{-1/2} ,
\]
where, clearly, $\Sigma = D^{1/2}$ is nonsingular. We can easily verify that $U_L$ has orthonormal columns:
\[
U_L^H U_L = ( A Q D^{-1/2} )^H ( A Q D^{-1/2} ) = D^{-1/2} Q^H A^H A Q D^{-1/2} = D^{-1/2} D D^{-1/2} = I .
\]
As before, the final detail is that the Spectral Decomposition does not require the diagonal elements of $D$ to be ordered from largest to smallest. This can be easily fixed by permuting the columns of $Q$ and, correspondingly, the diagonal elements of $D$.
The last two homeworks give us a first glimpse at a practical procedure for computing
the (Reduced) SVD from the Spectral Decomposition, for the simpler case where A has full
column rank.
• Form $B = A^H A$.

• Compute the Spectral Decomposition $B = Q D Q^H$ via, for example, the QR algorithm.

• Permute the columns of $Q$ and diagonal elements of $D$ so that the diagonal elements are ordered from largest to smallest. If $P$ is the permutation matrix such that $P D P^T$ reorders the diagonal of $D$ appropriately, then
\[
\begin{array}{l}
A^H A \\
~~~ = ~~~~ \langle \mbox{ Spectral Decomposition } \rangle \\
Q D Q^H \\
~~~ = ~~~~ \langle \mbox{ insert identities } \rangle \\
Q \underbrace{P^T P}_{I} D \underbrace{P^T P}_{I} Q^H \\
~~~ = ~~~~ \langle \mbox{ associativity } \rangle \\
( Q P^T ) ( P D P^T ) ( P Q^H ) \\
~~~ = ~~~~ \langle \; (BC)^H = C^H B^H \; \rangle \\
( Q P^T ) ( P D P^T ) ( Q P^T )^H .
\end{array}
\]

• Let $V = Q P^T$, $\Sigma = ( P D P^T )^{1/2}$ (which is diagonal), and $U_L = A V \Sigma^{-1}$.

With these insights, we find the Reduced SVD of a matrix with linearly independent columns. If in addition $A$ is square (and hence nonsingular), then $U = U_L$ and $A = U \Sigma V^H$ is its SVD. Let us now treat the problem in full generality.
Homework 11.2.1.3 Let $A \in \mathbb{C}^{m \times n}$ be of rank $r$ and
\[
A^H A = \begin{pmatrix} Q_L & Q_R \end{pmatrix}
\begin{pmatrix} D_{TL} & 0_{r \times (n-r)} \\ 0_{(n-r) \times r} & 0_{(n-r) \times (n-r)} \end{pmatrix}
\begin{pmatrix} Q_L & Q_R \end{pmatrix}^H
\]
be the Spectral Decomposition of $A^H A$, where $Q_L \in \mathbb{C}^{n \times r}$ and, for simplicity, we assume the diagonal elements of $D_{TL}$ are ordered from largest to smallest. Give a formula for the Reduced SVD of $A$.
Solution. The Reduced SVD of $A$ is given by $A = U_L \Sigma_{TL} V_L^H$, where $\Sigma_{TL}$ is $r \times r$ diagonal with positive real values along its diagonal, ordered from largest to smallest. If we take $V_L = Q_L$ and $\Sigma_{TL} = D_{TL}^{1/2}$ then
\[
A = U_L \Sigma_{TL} V_L^H = U_L D_{TL}^{1/2} Q_L^H .
\]
This suggests that we choose
\[
U_L = A V_L \Sigma_{TL}^{-1} = A Q_L D_{TL}^{-1/2} .
\]
We can easily verify that $U_L$ has orthonormal columns:
\[
\begin{array}{l}
U_L^H U_L \\
~~~ = \\
( A Q_L D_{TL}^{-1/2} )^H ( A Q_L D_{TL}^{-1/2} ) \\
~~~ = \\
D_{TL}^{-1/2} Q_L^H A^H A Q_L D_{TL}^{-1/2} \\
~~~ = \\
D_{TL}^{-1/2} Q_L^H
\begin{pmatrix} Q_L & Q_R \end{pmatrix}
\begin{pmatrix} D_{TL} & 0_{r \times (n-r)} \\ 0_{(n-r) \times r} & 0_{(n-r) \times (n-r)} \end{pmatrix}
\begin{pmatrix} Q_L & Q_R \end{pmatrix}^H Q_L D_{TL}^{-1/2} \\
~~~ = \\
D_{TL}^{-1/2}
\begin{pmatrix} I & 0 \end{pmatrix}
\begin{pmatrix} D_{TL} & 0_{r \times (n-r)} \\ 0_{(n-r) \times r} & 0_{(n-r) \times (n-r)} \end{pmatrix}
\begin{pmatrix} I & 0 \end{pmatrix}^H
D_{TL}^{-1/2} \\
~~~ = \\
D_{TL}^{-1/2} D_{TL} D_{TL}^{-1/2} \\
~~~ = \\
I .
\end{array}
\]
Although the discussed approaches give us a means by which to compute the (Reduced) SVD that is mathematically sound, their Achilles heel is that they hinge on forming $A^H A$. While beyond the scope of this course, the conditioning of computing a Spectral
Decomposition of a Hermitian matrix is dictated by the condition number of the matrix,
much like solving a linear system is. We recall from Subsection 4.2.5 that we avoid using
the Method of Normal Equations to solve the linear least squares problem when a matrix
is ill-conditioned. Similarly, we try to avoid computing the SVD from AH A. The problem
here is even more acute: it is often the case that A is (nearly) rank deficient (for example,
in situations where we desire a low rank approximation of a given matrix) and hence it is
frequently the case that the condition number of A is very unfavorable. The question thus
becomes, how can we avoid computing AH A while still benefiting from the insights in this
unit?

11.2.2 A strategy for computing the SVD


Remark 11.2.2.1 In this section, we discuss both the QR factorization and the QR algo-
rithm. The QR factorization, discussed in Week 3, is given by A = QR. The QR algorithm,
which we discussed in Week 10, instead computes the Spectral Decomposition of a Hermitian
matrix. It can be modified to compute the Schur Decomposition instead, which we don’t
discuss in this course. It can also be modified to compute the SVD of a matrix, which we
discuss in this, and subsequent, units.

YouTube: https://www.youtube.com/watch?v=SXP12WwhtJA
The first observation that leads to a practical algorithm is that matrices for which we
wish to compute the SVD are often tall and skinny, by which we mean that they have many
more rows than they have columns, and it is the Reduced SVD of this matrix that is desired.
The methods we will develop for computing the SVD are based on the implicitly shifted QR
algorithm that was discussed in Subsection 10.3.5, which requires O(n3 ) computation when
applied to an n ◊ n matrix. Importantly, the leading n3 term has a very large constant
relative to, say, the cost of a QR factorization of that same matrix.
Rather than modifying the QR algorithm to work with a tall and skinny matrix, we start
by computing its QR factorization, A = QR. After this, the SVD of the smaller, n ◊ n sized,
matrix R is computed. The following homework shows how the Reduced SVD of A can be
extracted from Q and the SVD of R.
Homework 11.2.2.1 Let $A \in \mathbb{C}^{m \times n}$, with $m \geq n$, and $A = Q R$ be its QR factorization where, for simplicity, we assume that the $n \times n$ upper triangular matrix $R$ is nonsingular. If $R = \widehat U \widehat\Sigma \widehat V^H$ is the SVD of $R$, give the Reduced SVD of $A$.
Solution.
\[
A = Q R = Q \widehat U \widehat\Sigma \widehat V^H
= \underbrace{( Q \widehat U )}_{U_L} \underbrace{\widehat\Sigma}_{\Sigma} \underbrace{\widehat V^H}_{V^H} .
\]
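This homework translates directly into a short Matlab sketch (illustration only, not part of the text): compute the economy-size QR factorization of a tall-and-skinny matrix and then the SVD of the small triangular factor.

% Reduced SVD of a tall-and-skinny A via an initial QR factorization.
A = randn( 1000, 50 );
[ Q, R ] = qr( A, 0 );           % economy-size QR: Q is m x n, R is n x n
[ Uhat, Sigma, V ] = svd( R );   % SVD of the small n x n matrix R
U = Q * Uhat;                    % Reduced SVD: A = U * Sigma * V'
norm( A - U * Sigma * V' )       % should be of the order of machine epsilon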

YouTube: https://www.youtube.com/watch?v=T0NYsbdaC78
While it would be nice if the upper triangular structure of R was helpful in computing
its SVD, it is actually the fact that that matrix is square and small (if n π m) that is
significant. For this reason, we now assume that we are interested in finding the SVD of a
square matrix A, and ignore the fact that that matrix may be triangular.

YouTube: https://www.youtube.com/watch?v=sGBD0-PSMN8
Here are some more observations, details of which will become clear in the next units:

• In Subsection 10.3.1, we saw that an m ◊ m Hermitian matrix can be reduced to


tridiagonal form via a sequence of Householder transformations that are applied from
the left and the right to the matrix. This then greatly reduced the cost of the QR
algorithm that was used to compute the Spectral Decomposition.
In the next unit, we will see that one can similarly reduce a matrix to bidiagonal
form, a matrix that has only a nonzero diagonal and super diagonal. In other words,
there is a similarity transformation such that

A = QA BQH
A,

where B is bidiagonal. Conveniently, B can also be forced to be real-valued.

• The observation now is that B T B is a real-valued tridiagonal matrix. Thus, if we


explicitly form T = B T B, then we can employ the implicitly shifted QR algorithm
(or any other tridiagonal eigensolver) to compute its Spectral Decomposition and from
that construct the SVD of B, the SVD of the square matrix A, and the Reduced SVD
of whatever original m ◊ n matrix we started with.

• We don’t want to explicitly form B T B because the condition number of B equals the
condition number of the original problem (since they are related via unitary transfor-
mations).

• In the next units, we will find that we can again employ the Implicit Q Theorem to
compute the SVD of B, inspired by the implicitly shifted QR algorithm. The algorithm
we develop again casts all updates to B in terms of unitary transformations, yielding
a highly accurate algorithm.

Putting these observations together yields a practical methodology for computing the Re-
duced SVD of a matrix.

11.2.3 Reduction to bidiagonal form

YouTube: https://www.youtube.com/watch?v=2OW5Yi6QOdY
Homework 11.2.3.1 Let $B \in \mathbb{R}^{m \times m}$ be a bidiagonal matrix:
\[
B = \begin{pmatrix}
\beta_{0,0} & \beta_{0,1} & 0 & \cdots & 0 & 0 \\
0 & \beta_{1,1} & \beta_{1,2} & \cdots & 0 & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & \beta_{m-2,m-2} & \beta_{m-2,m-1} \\
0 & 0 & 0 & \cdots & 0 & \beta_{m-1,m-1}
\end{pmatrix}.
\]
Show that $T = B^T B$ is a tridiagonal symmetric matrix.
Solution.
\begin{align*}
B^T B & =
\begin{pmatrix}
\beta_{0,0} & \beta_{0,1} & 0 & \cdots \\
0 & \beta_{1,1} & \beta_{1,2} & \cdots \\
0 & 0 & \beta_{2,2} & \cdots \\
\vdots & \ddots & \ddots & \ddots
\end{pmatrix}^T
\begin{pmatrix}
\beta_{0,0} & \beta_{0,1} & 0 & \cdots \\
0 & \beta_{1,1} & \beta_{1,2} & \cdots \\
0 & 0 & \beta_{2,2} & \cdots \\
\vdots & \ddots & \ddots & \ddots
\end{pmatrix} \\
& =
\begin{pmatrix}
\beta_{0,0} & 0 & 0 & \cdots \\
\beta_{0,1} & \beta_{1,1} & 0 & \cdots \\
0 & \beta_{1,2} & \beta_{2,2} & \cdots \\
\vdots & \ddots & \ddots & \ddots
\end{pmatrix}
\begin{pmatrix}
\beta_{0,0} & \beta_{0,1} & 0 & \cdots \\
0 & \beta_{1,1} & \beta_{1,2} & \cdots \\
0 & 0 & \beta_{2,2} & \cdots \\
\vdots & \ddots & \ddots & \ddots
\end{pmatrix} \\
& =
\begin{pmatrix}
\beta_{0,0}^2 & \beta_{0,1} \beta_{0,0} & 0 & \cdots \\
\beta_{0,1} \beta_{0,0} & \beta_{0,1}^2 + \beta_{1,1}^2 & \beta_{1,2} \beta_{1,1} & \cdots \\
0 & \beta_{1,2} \beta_{1,1} & \beta_{1,2}^2 + \beta_{2,2}^2 & \cdots \\
\vdots & \ddots & \ddots & \ddots
\end{pmatrix},
\end{align*}
which is symmetric and tridiagonal.
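This homework can also be checked numerically with a few lines of Matlab (a quick sanity check, not part of the text): for a random bidiagonal matrix $B$, the product $B^T B$ is symmetric and tridiagonal.

% Quick numerical check: B' * B is symmetric and tridiagonal when B is bidiagonal.
m = 6;
B = diag( randn( m, 1 ) ) + diag( randn( m-1, 1 ), 1 );   % random bidiagonal matrix
T = B' * B;
norm( T - T' )                                   % zero: symmetric
norm( tril( T, -2 ) ) + norm( triu( T, 2 ) )     % zero: no entries beyond the tridiagonal band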
Given that we can preprocess our problem by computing its QR factorization, we focus
now on the case where $A \in \mathbb{C}^{m \times m}$. The next step is to reduce this matrix to bidiagonal form
by multiplying the matrix from the left and right by two sequences of unitary matrices.
Once again, we employ Householder transformations. In Subsubsection 3.3.3.3, we introduced the function
\[
\left[ \begin{pmatrix} \rho \\ u_2 \end{pmatrix}, \tau \right] := \mbox{Housev}\left( \begin{pmatrix} \chi_1 \\ x_2 \end{pmatrix} \right),
\]
implemented by
function [ rho, ...
           u2, tau ] = Housev( chi1, ...
                               x2 ),
to compute the vector $u = \begin{pmatrix} 1 \\ u_2 \end{pmatrix}$ that reflects $x$ into $\pm \| x \|_2 e_0$ so that
\[
\left( I - \frac{1}{\tau} \begin{pmatrix} 1 \\ u_2 \end{pmatrix} \begin{pmatrix} 1 \\ u_2 \end{pmatrix}^H \right)
\underbrace{\begin{pmatrix} \chi_1 \\ x_2 \end{pmatrix}}_{x} = \pm \| x \|_2 e_0 .
\]
In Subsection 10.3.1, we introduced a variation on this function:
\[
[ u, \tau ] := \mbox{Housev1}( x )
\]
implemented by the function
function [ u, tau ] = Housev1( x ).
They differ only in how the input and output are passed to and from the function. We also introduce the notation $H( u, \tau )$ for the transformation $I - \frac{1}{\tau} u u^H$.
We now describe an algorithm for reducing a square matrix to bidiagonal form:

• Partition $A \rightarrow \begin{pmatrix} \alpha_{11} & a_{12}^T \\ a_{21} & A_{22} \end{pmatrix}$.

• Update
\[
\left[ \begin{pmatrix} \alpha_{11} \\ a_{21} \end{pmatrix}, \tau_1 \right] := \mbox{Housev}\left( \begin{pmatrix} \alpha_{11} \\ a_{21} \end{pmatrix} \right).
\]
This overwrites $\alpha_{11}$ with $\pm \left\| \begin{pmatrix} \alpha_{11} \\ a_{21} \end{pmatrix} \right\|_2$ and $a_{21}$ with $u_{21}$. Implicitly, $a_{21}$ in the updated matrix equals the zero vector.

• Update
\[
\begin{pmatrix} a_{12}^T \\ A_{22} \end{pmatrix} := H\left( \begin{pmatrix} 1 \\ u_{21} \end{pmatrix}, \tau_1 \right) \begin{pmatrix} a_{12}^T \\ A_{22} \end{pmatrix}.
\]
This introduces zeroes below the first entry in the first column. The Householder vector that introduced the zeroes is stored over those zeroes.

Next, we introduce zeroes in the first row of this updated matrix.
• The matrix is still partitioned as $A \rightarrow \begin{pmatrix} \alpha_{11} & a_{12}^T \\ 0 & A_{22} \end{pmatrix}$, where the zeroes have been overwritten with $u_{21}$.

• We compute $[ u_{12}, \rho_1 ] := \mbox{Housev1}( ( a_{12}^T )^T )$. The first element of $u_{12}$ now holds $\pm \| ( a_{12}^T )^T \|_2$ and the rest of the elements define the Householder transformation that introduces zeroes in $( a_{12}^T )^T$ below the first element. We store $u_{12}^T$ in $a_{12}^T$.

• After setting the first entry of $u_{12}$ explicitly to one, we update $A_{22} := A_{22} H( u_{12}, \rho_1 )$. This introduces zeroes to the right of the first entry of $a_{12}^T$. The Householder vector that introduced the zeroes is stored over those zeroes.

The algorithm continues this with the updated $A_{22}$, as illustrated in Figure 11.2.3.1.

◊ ◊ ◊ ◊ ◊ ◊ ◊ 0 0 0 ◊ ◊ 0 0 0
◊ ◊ ◊ ◊ ◊ 0 ◊ ◊ ◊ ◊ 0 ◊ ◊ 0 0
◊ ◊ ◊ ◊ ◊ ≠æ 0 ◊ ◊ ◊ ◊ ≠æ 0 0 ◊ ◊ ◊
◊ ◊ ◊ ◊ ◊ 0 ◊ ◊ ◊ ◊ 0 0 ◊ ◊ ◊
◊ ◊ ◊ ◊ ◊ 0 ◊ ◊ ◊ ◊ 0 0 ◊ ◊ ◊
Original matrix First iteration Second iteration

◊ ◊ 0 0 0 ◊ ◊ 0 0 0
0 ◊ ◊ 0 0 0 ◊ ◊ 0 0
≠æ 0 0 ◊ ◊ 0 ≠æ 0 0 ◊ ◊ 0
0 0 0 ◊ ◊ 0 0 0 ◊ ◊
0 0 0 ◊ ◊ 0 0 0 0 ◊
Third iteration Fourth iteration
Figure 11.2.3.1 An illustration of the reduction of a square matrix to bidiagonal form. The
◊s denote nonzero elements in the matrix.
Ponder This 11.2.3.2 Fill in the details for the above described algorithm that reduces a square matrix to bidiagonal form. In particular:

• For the update
\[
\begin{pmatrix} a_{12}^T \\ A_{22} \end{pmatrix} := H\left( \begin{pmatrix} 1 \\ u_{21} \end{pmatrix}, \tau_1 \right) \begin{pmatrix} a_{12}^T \\ A_{22} \end{pmatrix},
\]
describe how all the different parts of
\[
\begin{pmatrix} a_{12}^T \\ A_{22} \end{pmatrix}
\]
are updated. (Hint: look at the QR factorization algorithm in Subsection 3.3.4.)

• For the update $A_{22} := A_{22} H( u_{12}, \rho_1 )$, describe explicitly how $A_{22}$ is updated. (Hint: look at Homework 10.3.1.1.)

Next, state the algorithm by completing the skeleton in Figure 11.2.3.2.

Finally, analyze the approximate cost of the algorithm, when started with an $m \times m$ matrix.
[A, t, r] := BiRed-unb(A)
Partition A → ( ATL ATR ; ABL ABR ), t → ( tT ; tB ), r → ( rT ; rB )
  where ATL is 0 × 0 and tT, rT have 0 elements
while m(ATL) < m(A)
  Repartition
    ( ATL ATR ; ABL ABR ) → ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 ),
    ( tT ; tB ) → ( t0 ; τ1 ; t2 ),   ( rT ; rB ) → ( r0 ; ρ1 ; r2 )

  Update ( α11 ; a21 ) and ( a12^T ; A22 ) via the steps
      (to be filled in)

  Update a12^T and A22 via the steps
      (to be filled in)

  Continue with
    ( ATL ATR ; ABL ABR ) ← ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 ),
    ( tT ; tB ) ← ( t0 ; τ1 ; t2 ),   ( rT ; rB ) ← ( r0 ; ρ1 ; r2 )
endwhile

Figure 11.2.3.2 Algorithm skeleton for reduction of a square matrix to bidiagonal form.
Ponder This 11.2.3.3 Once you have derived the algorithm in Ponder This 11.2.3.2, implement it.
You may want to start by executing git pull to update your directory Assignments.

In directory Assignments/Week11/matlab/, you will find the following files:

• Housev.m and Housev1.m: Implementations of the function Housev and Housev1.

• BiRed.m: A code skeleton for a function that reduces a square matrix to bidiagonal
form.

[ B, t, r ] = BiRed( A, t, r )

returns the diagonal and first superdiagonal of the bidiagonal matrix in B, stores the
Householder vectors below the subdiagonal and above the first superdiagonal, and
returns the scalars $\tau$ and $\rho$ in vectors t and r.

• BiFromB.m: A function that extracts the bidiagonal matrix from matrix B, which also
has the Householder vector information in it.

Bbi = BiFromB( B )

• test_BiRed.m: A script that tests BiRed.

These resources give you the tools to implement and test the reduction to bidiagonal form.

11.2.4 Implicitly shifted bidiagonal QR algorithm

YouTube: https://www.youtube.com/watch?v=V2PaGe52ImQ
Converting a (tridiagonal) implicitly shifted QR algorithm into a (bidiagonal) implicitly shifted QR algorithm now hinges on some key insights, which we will illustrate with a $4 \times 4$ example.

• We start with a bidiagonal matrix $B^{(k)}$,
\[
B^{(k)} = \begin{pmatrix}
\beta_{0,0} & \beta_{0,1} & 0 & 0 \\
0 & \beta_{1,1} & \beta_{1,2} & 0 \\
0 & 0 & \beta_{2,2} & \beta_{2,3} \\
0 & 0 & 0 & \beta_{3,3}
\end{pmatrix},
\]
which is our "current iteration."
• If we explicitly formed $T^{(k)} = {B^{(k)}}^T B^{(k)}$, then we would have to form
\[
T^{(k)} = \begin{pmatrix}
\tau_{0,0} & \tau_{1,0} & 0 & 0 \\
\tau_{1,0} & \tau_{1,1} & \tau_{2,1} & 0 \\
0 & \tau_{2,1} & \tau_{2,2} & \tau_{3,2} \\
0 & 0 & \tau_{3,2} & \tau_{3,3}
\end{pmatrix}
= \begin{pmatrix}
\beta_{0,0}^2 & \beta_{0,1} \beta_{0,0} & 0 & 0 \\
\beta_{0,1} \beta_{0,0} & \beta_{0,1}^2 + \beta_{1,1}^2 & \beta_{1,2} \beta_{1,1} & 0 \\
0 & \beta_{1,2} \beta_{1,1} & \beta_{1,2}^2 + \beta_{2,2}^2 & \beta_{2,3} \beta_{2,2} \\
0 & 0 & \beta_{2,3} \beta_{2,2} & \beta_{2,3}^2 + \beta_{3,3}^2
\end{pmatrix}.
\]

• The Francis Implicit QR Step would then compute a first Givens' rotation so that
\[
\underbrace{\begin{pmatrix} \gamma_0 & -\sigma_0 \\ \sigma_0 & \gamma_0 \end{pmatrix}^T}_{G_0^T}
\begin{pmatrix} \tau_{0,0} - \tau_{3,3} \\ \tau_{1,0} \end{pmatrix}
= \begin{pmatrix} \times \\ 0 \end{pmatrix}. \tag{11.2.1}
\]

• With this Givens' rotation, it would introduce a bulge:
\[
\begin{pmatrix}
\times & \times & \times & 0 \\
\times & \times & \times & 0 \\
\times & \times & \times & \times \\
0 & 0 & \times & \times
\end{pmatrix}
=
\begin{pmatrix} G_0^T & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix}
\tau_{0,0} & \tau_{1,0} & 0 & 0 \\
\tau_{1,0} & \tau_{1,1} & \tau_{2,1} & 0 \\
0 & \tau_{2,1} & \tau_{2,2} & \tau_{3,2} \\
0 & 0 & \tau_{3,2} & \tau_{3,3}
\end{pmatrix}
\begin{pmatrix} G_0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},
\]
where the $2 \times 2$ blocks $G_0$ are embedded in $4 \times 4$ matrices.

• Finally, the bulge would be chased out:
\[
T^{(k+1)} =
\begin{pmatrix}
\times & \times & 0 & 0 \\
\times & \times & \times & 0 \\
0 & \times & \times & \times \\
0 & 0 & \times & \times
\end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & G_2^T \end{pmatrix}
\left[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & G_1^T & 0 \\ 0 & 0 & 1 \end{pmatrix}
\underbrace{
\begin{pmatrix}
\times & \times & \times & 0 \\
\times & \times & \times & 0 \\
\times & \times & \times & \times \\
0 & 0 & \times & \times
\end{pmatrix}}_{\mbox{matrix with the bulge}}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & G_1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\right]
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & G_2 \end{pmatrix}.
\]

Obviously, this extends.


Let us examine what would happen if we instead apply these Givens' rotations to ${B^{(k)}}^T B^{(k)}$. Since $T^{(k)} = {B^{(k)}}^T B^{(k)}$, we find that
\begin{align*}
T^{(k+1)} & =
\begin{pmatrix}
\times & \times & 0 & 0 \\
\times & \times & \times & 0 \\
0 & \times & \times & \times \\
0 & 0 & \times & \times
\end{pmatrix} \\
& =
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & G_2^T \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & G_1^T & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} G_0^T & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
{B^{(k)}}^T B^{(k)}
\begin{pmatrix} G_0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & G_1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & G_2 \end{pmatrix} \\
& =
\left[ B^{(k)}
\begin{pmatrix} G_0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & G_1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & G_2 \end{pmatrix}
\right]^T
\left[ B^{(k)}
\begin{pmatrix} G_0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & G_1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & G_2 \end{pmatrix}
\right].
\end{align*}

The observation now is that if we can find two sequences of Givens' rotations such that
\[
B^{(k+1)} =
\begin{pmatrix}
\times & \times & 0 & 0 \\
0 & \times & \times & 0 \\
0 & 0 & \times & \times \\
0 & 0 & 0 & \times
\end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \widehat G_2^T \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & \widehat G_1^T & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \widehat G_0^T & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix}
\beta_{0,0} & \beta_{0,1} & 0 & 0 \\
0 & \beta_{1,1} & \beta_{1,2} & 0 \\
0 & 0 & \beta_{2,2} & \beta_{2,3} \\
0 & 0 & 0 & \beta_{3,3}
\end{pmatrix}
\underbrace{
\begin{pmatrix} G_0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & \widetilde G_1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \widetilde G_2 \end{pmatrix}
}_{Q}
\tag{11.2.2}
\]
then, by the Implicit Q Theorem,
\[
{B^{(k+1)}}^T B^{(k+1)} =
\begin{pmatrix}
\times & \times & 0 & 0 \\
0 & \times & \times & 0 \\
0 & 0 & \times & \times \\
0 & 0 & 0 & \times
\end{pmatrix}^T
\begin{pmatrix}
\times & \times & 0 & 0 \\
0 & \times & \times & 0 \\
0 & 0 & \times & \times \\
0 & 0 & 0 & \times
\end{pmatrix}
=
\begin{pmatrix}
\times & \times & 0 & 0 \\
\times & \times & \times & 0 \\
0 & \times & \times & \times \\
0 & 0 & \times & \times
\end{pmatrix}
= T^{(k+1)} = Q^T T^{(k)} Q .
\]
If we iterate in this way, we know that the $T^{(k)}$ converge to a diagonal matrix (under mild conditions). This means that the matrices $B^{(k)}$ converge to a diagonal matrix, $\Sigma_B$. If we accumulate all Givens' rotations into matrices $U_B$ and $V_B$, then we end up with the SVD of $B$:
\[
B = U_B \Sigma_B V_B^T ,
\]
modulo, most likely, a reordering of the diagonal elements of $\Sigma_B$ and a corresponding reordering of the columns of $U_B$ and $V_B$.
This leaves us with the question of how to find the two sequences of Givens' rotations mentioned in (11.2.2).

• We know $G_0$, which was computed from (11.2.1). Importantly, computing this first Givens' rotation requires only that the elements $\tau_{0,0}$, $\tau_{1,0}$, and $\tau_{m-1,m-1}$ of $T^{(k)}$ be explicitly formed.

• If we apply it to $B^{(k)}$ from the right, we introduce a bulge:
\[
\begin{pmatrix}
\times & \times & 0 & 0 \\
\times & \times & \times & 0 \\
0 & 0 & \times & \times \\
0 & 0 & 0 & \times
\end{pmatrix}
=
\begin{pmatrix}
\beta_{0,0} & \beta_{0,1} & 0 & 0 \\
0 & \beta_{1,1} & \beta_{1,2} & 0 \\
0 & 0 & \beta_{2,2} & \beta_{2,3} \\
0 & 0 & 0 & \beta_{3,3}
\end{pmatrix}
\begin{pmatrix} G_0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]

• We next compute a Givens' rotation, $\widehat G_0$, that changes the nonzero that was introduced below the diagonal back into a zero:
\[
\begin{pmatrix}
\times & \times & \times & 0 \\
0 & \times & \times & 0 \\
0 & 0 & \times & \times \\
0 & 0 & 0 & \times
\end{pmatrix}
=
\begin{pmatrix} \widehat G_0^T & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix}
\times & \times & 0 & 0 \\
\times & \times & \times & 0 \\
0 & 0 & \times & \times \\
0 & 0 & 0 & \times
\end{pmatrix}.
\]

• This means we now need to chase the bulge that has appeared above the superdiagonal:
\[
\begin{pmatrix}
\times & \times & 0 & 0 \\
0 & \times & \times & 0 \\
0 & \times & \times & \times \\
0 & 0 & 0 & \times
\end{pmatrix}
=
\begin{pmatrix}
\times & \times & \times & 0 \\
0 & \times & \times & 0 \\
0 & 0 & \times & \times \\
0 & 0 & 0 & \times
\end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & \widetilde G_1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]

• We continue like this until the bulge is chased out of the end of the matrix.

The net result is an implicitly shifted bidiagonal QR algorithm that is applied directly
to the bidiagonal matrix, maintains the bidiagonal form from one iteration to the next, and
converges to a diagonal matrix that has the singular values of B on its diagonal. Obviously,
deflation can be added to this scheme to further reduce its cost.

11.3 Jacobi’s Method


11.3.1 Jacobi rotation

YouTube: https://www.youtube.com/watch?v=OoMPkg994ZE
Given a symmetric 2 × 2 matrix A, with
$$ A = \begin{pmatrix} \alpha_{0,0} & \alpha_{0,1} \\ \alpha_{1,0} & \alpha_{1,1} \end{pmatrix}. $$
There exists a rotation,
$$ J = \begin{pmatrix} \gamma & -\sigma \\ \sigma & \gamma \end{pmatrix}, $$
where γ = cos(θ) and σ = sin(θ) for some angle θ, such that
$$
J^T A J =
\begin{pmatrix} \gamma & \sigma \\ -\sigma & \gamma \end{pmatrix}
\begin{pmatrix} \alpha_{0,0} & \alpha_{0,1} \\ \alpha_{1,0} & \alpha_{1,1} \end{pmatrix}
\begin{pmatrix} \gamma & -\sigma \\ \sigma & \gamma \end{pmatrix}
=
\begin{pmatrix} \lambda_0 & 0 \\ 0 & \lambda_1 \end{pmatrix}.
$$

We recognize that
$$ A = J \Lambda J^T $$
is the Spectral Decomposition of A. The columns of J are eigenvectors of length one and
the diagonal elements of Λ are the eigenvalues.

Ponder This 11.3.1.1 Give a geometric argument that Jacobi rotations exist.
Hint. For this exercise, you need to remember a few things:

• How is a linear transformation, L, translated into the matrix A that represents it,
Ax = L(x)?

• What do we know about the orthogonality of eigenvectors of a symmetric matrix?

• If A is not already diagonal, how can the eigenvectors be chosen so that they have unit
length, first one lies in Quadrant I of the plane, and the other one lies in Quadrant II?

• Draw a picture and deduce what the angle ◊ must be.


It is important to note that to determine J we do not need to compute θ. We merely need
to find one eigenvector of the 2 × 2 matrix from which we can then compute an eigenvector
that is orthogonal. These become the columns of J. So, the strategy is to
• Form the characteristic polynomial
$$
\det(\lambda I - A) = (\lambda - \alpha_{0,0})(\lambda - \alpha_{1,1}) - \alpha_{1,0}^2
= \lambda^2 - (\alpha_{0,0} + \alpha_{1,1})\lambda + (\alpha_{0,0}\alpha_{1,1} - \alpha_{1,0}^2),
$$
and solve for its roots, which give us the eigenvalues of A. Remember to use the
stable formula for computing the roots of a second degree polynomial, discussed in
Subsection 9.4.1.

• Find an eigenvector associated with one of the eigenvalues, scaling it to have unit
length and to lie in either Quadrant I or Quadrant II. This means that the eigenvector
has the form
$$ \begin{pmatrix} \gamma \\ \sigma \end{pmatrix} $$
if it lies in Quadrant I or
$$ \begin{pmatrix} -\sigma \\ \gamma \end{pmatrix} $$
if it lies in Quadrant II.
This gives us the γ and σ that define the Jacobi rotation.
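
For instance, a quick worked example of this recipe: take the symmetric matrix
$$ A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}. $$
The characteristic polynomial is $\det(\lambda I - A) = \lambda^2 - 4\lambda + 3 = (\lambda - 3)(\lambda - 1)$, so the eigenvalues are 3 and 1. An eigenvector associated with λ = 3 is $(1, 1)^T$; scaled to unit length it lies in Quadrant I, giving $\gamma = \sigma = 1/\sqrt{2}$ and
$$ J = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}, \qquad J^T A J = \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}. $$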
Homework 11.3.1.2 With Matlab, use the eig function to explore the eigenvalues and
eigenvectors of various symmetric matrices:
[ Q, Lambda ] = eig( A )

Try, for example

A = [
-1 2
2 3
]

A = [
2 -1
-1 -2
]

How does the matrix Q relate to a Jacobi rotation? How would Q need to be altered for it
to be a Jacobi rotation?
Solution.

>> A = [
-1 2
2 3
]

A =

-1 2
2 3

>> [ Q, Lambda ] = eig( A )

Q =

-0.9239 0.3827
0.3827 0.9239

Lambda =

-1.8284 0
0 3.8284

We notice that the columns of Q need to be swapped for it to become a Jacobi rotation:
$$
J =
\begin{pmatrix} 0.3827 & -0.9239 \\ 0.9239 & 0.3827 \end{pmatrix}
=
\begin{pmatrix} -0.9239 & 0.3827 \\ 0.3827 & 0.9239 \end{pmatrix}
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
= Q \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
$$

>> A = [
2 -1
-1 -2
]

A =

2 -1
-1 -2

>> [ Q, Lambda ] = eig( A )

Q =

-0.2298 -0.9732
-0.9732 0.2298

Lambda =

-2.2361 0
0 2.2361

We notice that the columns of Q need to be swapped for it to become a Jacobi rotation. If
we follow our "recipe", we also need to negate each column:
$$
J = \begin{pmatrix} 0.9732 & 0.2298 \\ -0.2298 & 0.9732 \end{pmatrix}.
$$

These solutions are not unique. Another way of creating a Jacobi rotation is to, for
example, scale the first column so that the diagonal elements have the same sign. Indeed,
perhaps that is the easier thing to do:
$$
Q = \begin{pmatrix} -0.9239 & 0.3827 \\ 0.3827 & 0.9239 \end{pmatrix}
\longrightarrow
J = \begin{pmatrix} 0.9239 & 0.3827 \\ -0.3827 & 0.9239 \end{pmatrix}.
$$

11.3.2 Jacobi’s method for computing the Spectral Decomposition

YouTube: https://www.youtube.com/watch?v=mBn7d9jUjcs
The oldest algorithm for computing the eigenvalues and eigenvectors of a (symmetric)
matrix is due to Jacobi and dates back to 1846.

• [22] C. G. J. Jacobi, Über ein leichtes Verfahren, die in der Theorie der Säkularstörungen
vorkommenden Gleichungen numerisch aufzulösen, Crelle's Journal 30, 51-94 (1846).

If we recall correctly (it has been 30 years since we read the paper in German), the paper
thanks Jacobi’s student Seidel for performing the calculations for a 5 ◊ 5 matrix, related to
the orbits of the planets, by hand...
This is a method that keeps resurfacing, since it parallelizes easily. The operation count
tends to be higher (by a constant factor) than that of reduction to tridiagonal form followed
by the tridiagonal QR algorithm.
Jacobi’s original idea went as follows:

• We start with a symmetric matrix:


$$
A = \begin{pmatrix}
\alpha_{0,0} & \alpha_{0,1} & \alpha_{0,2} & \alpha_{0,3} \\
\alpha_{1,0} & \alpha_{1,1} & \alpha_{1,2} & \alpha_{1,3} \\
\alpha_{2,0} & \alpha_{2,1} & \alpha_{2,2} & \alpha_{2,3} \\
\alpha_{3,0} & \alpha_{3,1} & \alpha_{3,2} & \alpha_{3,3}
\end{pmatrix}.
$$

• We find the off-diagonal entry with largest magnitude. Let's say it is α_{3,1}.

• We compute a Jacobi rotation so that
$$
\begin{pmatrix} \gamma_{3,1} & \sigma_{3,1} \\ -\sigma_{3,1} & \gamma_{3,1} \end{pmatrix}
\begin{pmatrix} \alpha_{1,1} & \alpha_{1,3} \\ \alpha_{3,1} & \alpha_{3,3} \end{pmatrix}
\begin{pmatrix} \gamma_{3,1} & -\sigma_{3,1} \\ \sigma_{3,1} & \gamma_{3,1} \end{pmatrix}
=
\begin{pmatrix} \times & 0 \\ 0 & \times \end{pmatrix},
$$
where the ×s denote nonzero entries.

• We now apply the rotation as a unitary similarity transformation from the left to the
rows of A indexed with 1 and 3, and from the right to columns 1 and 3:
$$
\begin{pmatrix}
\alpha_{0,0} & \times & \alpha_{0,2} & \times \\
\times & \times & \times & 0 \\
\alpha_{2,0} & \times & \alpha_{2,2} & \times \\
\times & 0 & \times & \times
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \gamma_{3,1} & 0 & \sigma_{3,1} \\
0 & 0 & 1 & 0 \\
0 & -\sigma_{3,1} & 0 & \gamma_{3,1}
\end{pmatrix}
\begin{pmatrix}
\alpha_{0,0} & \alpha_{0,1} & \alpha_{0,2} & \alpha_{0,3} \\
\alpha_{1,0} & \alpha_{1,1} & \alpha_{1,2} & \alpha_{1,3} \\
\alpha_{2,0} & \alpha_{2,1} & \alpha_{2,2} & \alpha_{2,3} \\
\alpha_{3,0} & \alpha_{3,1} & \alpha_{3,2} & \alpha_{3,3}
\end{pmatrix}
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \gamma_{3,1} & 0 & -\sigma_{3,1} \\
0 & 0 & 1 & 0 \\
0 & \sigma_{3,1} & 0 & \gamma_{3,1}
\end{pmatrix}.
$$
The ×s here denote elements of the matrix that are changed by the application of the
Jacobi rotation.

• This process repeats, reducing the off-diagonal element that is largest in magnitude to
zero in each iteration.

Notice that each application of the Jacobi rotation is a unitary similarity transformation,
and hence preserves the eigenvalues of the matrix. If this method eventually yields a diagonal

matrix, then the eigenvalues can be found on the diagonal of that matrix. We do not give a
proof of convergence here.

YouTube: https://www.youtube.com/watch?v= C5VKYR7sEM


Homework 11.3.2.1 Note: In the description of this homework, we index like Matlab does:
starting at one. This is in contrast to how we usually index in these notes, starting at zero.
In Assignments/Week11/matlab, you will find the following files:

• Jacobi_rotation.m: A function that computes a Jacobi rotation from a 2 × 2 symmetric
matrix.

• Seidel.m: A script that lets you apply Jacobi's method to a 5 × 5 matrix, much like
Seidel did by hand. Fortunately, you only need to indicate which off-diagonal element to zero out.
Matlab then does the rest.

Use this to gain insight into how Jacobi’s method works. You will notice that finding the
off-diagonal element that has largest magnitude is bothersome. You don’t need to get it
right every time.
Once you have found the diagonal matrix, restart the process. This time, zero out the
off-diagonal elements in systematic "sweeps," zeroing the elements in the order.

( 2,1 ) - ( 3,1 ) - ( 4,1 ) - ( 5,1 ) -


( 3,2 ) - ( 4,2 ) - ( 5,2 ) -
( 4,3 ) - ( 5,3 ) -
( 5,4 )

and repeating this until convergence. A sweep zeroes every off-diagonal element exactly once
(in symmetric pairs).
Solution. Hopefully you noticed that the matrix converges to a diagonal matrix, with the
eigenvalues on its diagonal.
The key insight is that applying a Jacobi rotation to zero an element, α_{i,j}, reduces the
square of the Frobenius norm of the off-diagonal elements of the matrix by 2α_{i,j}^2. In other
words, let off(A) equal the matrix A but with its diagonal elements set to zero. If J_{i,j} zeroes
out α_{i,j} (and α_{j,i}), then
$$ \| \mathrm{off}(J_{i,j}^T A J_{i,j}) \|_F^2 = \| \mathrm{off}(A) \|_F^2 - 2\alpha_{i,j}^2. $$

Homework 11.3.2.2 Partition matrix A like


$$
A = \begin{pmatrix}
A_{00} & a_{10} & A_{20}^T & a_{30} & A_{40}^T \\
a_{10}^T & \alpha_{11} & a_{21}^T & \alpha_{31} & a_{41}^T \\
A_{20} & a_{21} & A_{22} & a_{32} & A_{42}^T \\
a_{30}^T & \alpha_{31} & a_{32}^T & \alpha_{33} & a_{43}^T \\
A_{40} & a_{41} & A_{42} & a_{43} & A_{44}
\end{pmatrix}
$$

and let J equal the Jacobi rotation that zeroes the element denoted with α_{31}:
$$
J^T A J =
\begin{pmatrix}
I & 0 & 0 & 0 & 0 \\
0 & \gamma_{11} & 0 & \sigma_{31} & 0 \\
0 & 0 & I & 0 & 0 \\
0 & -\sigma_{31} & 0 & \gamma_{33} & 0 \\
0 & 0 & 0 & 0 & I
\end{pmatrix}
\begin{pmatrix}
A_{00} & a_{10} & A_{20}^T & a_{30} & A_{40}^T \\
a_{10}^T & \alpha_{11} & a_{21}^T & \alpha_{31} & a_{41}^T \\
A_{20} & a_{21} & A_{22} & a_{32} & A_{42}^T \\
a_{30}^T & \alpha_{31} & a_{32}^T & \alpha_{33} & a_{43}^T \\
A_{40} & a_{41} & A_{42} & a_{43} & A_{44}
\end{pmatrix}
\begin{pmatrix}
I & 0 & 0 & 0 & 0 \\
0 & \gamma_{11} & 0 & -\sigma_{31} & 0 \\
0 & 0 & I & 0 & 0 \\
0 & \sigma_{31} & 0 & \gamma_{33} & 0 \\
0 & 0 & 0 & 0 & I
\end{pmatrix}
=
\begin{pmatrix}
A_{00} & \hat a_{10} & A_{20}^T & \hat a_{30} & A_{40}^T \\
\hat a_{10}^T & \hat\alpha_{11} & \hat a_{21}^T & 0 & \hat a_{41}^T \\
A_{20} & \hat a_{21} & A_{22} & \hat a_{32} & A_{42}^T \\
\hat a_{30}^T & 0 & \hat a_{32}^T & \hat\alpha_{33} & \hat a_{43}^T \\
A_{40} & \hat a_{41} & A_{42} & \hat a_{43} & A_{44}
\end{pmatrix}
= \hat A.
$$
Show that $\|\mathrm{off}(\hat A)\|_F^2 = \|\mathrm{off}(A)\|_F^2 - 2\alpha_{31}^2$.
Solution. We notice that
$$
\begin{array}{rl}
\|\mathrm{off}(A)\|_F^2 = &
\|\mathrm{off}(A_{00})\|_F^2 + \|a_{10}\|_F^2 + \|A_{20}^T\|_F^2 + \|a_{30}\|_F^2 + \|A_{40}^T\|_F^2 \\
& + \|a_{10}^T\|_F^2 + \|a_{21}^T\|_F^2 + \alpha_{31}^2 + \|a_{41}^T\|_F^2 \\
& + \|A_{20}\|_F^2 + \|a_{21}\|_F^2 + \|\mathrm{off}(A_{22})\|_F^2 + \|a_{32}\|_F^2 + \|A_{42}^T\|_F^2 \\
& + \|a_{30}^T\|_F^2 + \alpha_{31}^2 + \|a_{32}^T\|_F^2 + \|a_{43}^T\|_F^2 \\
& + \|A_{40}\|_F^2 + \|a_{41}\|_F^2 + \|A_{42}\|_F^2 + \|a_{43}\|_F^2 + \|\mathrm{off}(A_{44})\|_F^2
\end{array}
$$
and
$$
\begin{array}{rl}
\|\mathrm{off}(\hat A)\|_F^2 = &
\|\mathrm{off}(A_{00})\|_F^2 + \|\hat a_{10}\|_F^2 + \|A_{20}^T\|_F^2 + \|\hat a_{30}\|_F^2 + \|A_{40}^T\|_F^2 \\
& + \|\hat a_{10}^T\|_F^2 + \|\hat a_{21}^T\|_F^2 + 0 + \|\hat a_{41}^T\|_F^2 \\
& + \|A_{20}\|_F^2 + \|\hat a_{21}\|_F^2 + \|\mathrm{off}(A_{22})\|_F^2 + \|\hat a_{32}\|_F^2 + \|A_{42}^T\|_F^2 \\
& + \|\hat a_{30}^T\|_F^2 + 0 + \|\hat a_{32}^T\|_F^2 + \|\hat a_{43}^T\|_F^2 \\
& + \|A_{40}\|_F^2 + \|\hat a_{41}\|_F^2 + \|A_{42}\|_F^2 + \|\hat a_{43}\|_F^2 + \|\mathrm{off}(A_{44})\|_F^2.
\end{array}
$$
The submatrices that are not touched by the rotation appear identically in both $\|\mathrm{off}(A)\|_F^2$ and
$\|\mathrm{off}(\hat A)\|_F^2$; what changes are the parts of the two rows and two columns modified by the rotation.
We argue that the sum of the modified row terms and the sum of the modified column terms are each the same for both.

Since a Jacobi rotation is unitary, it preserves the Frobenius norm of the matrix to which
it is applied. Thus, looking at the rows that are modified by applying $J^T$ from the left, we find
that
$$
\begin{array}{l}
\|a_{10}^T\|_F^2 + \|a_{21}^T\|_F^2 + \|a_{41}^T\|_F^2
+ \|a_{30}^T\|_F^2 + \|a_{32}^T\|_F^2 + \|a_{43}^T\|_F^2 \\
\quad = \left\| \begin{pmatrix} a_{10}^T & a_{21}^T & a_{41}^T \\ a_{30}^T & a_{32}^T & a_{43}^T \end{pmatrix} \right\|_F^2
= \left\| \begin{pmatrix} \gamma_{11} & \sigma_{31} \\ -\sigma_{31} & \gamma_{33} \end{pmatrix}
\begin{pmatrix} a_{10}^T & a_{21}^T & a_{41}^T \\ a_{30}^T & a_{32}^T & a_{43}^T \end{pmatrix} \right\|_F^2
= \left\| \begin{pmatrix} \hat a_{10}^T & \hat a_{21}^T & \hat a_{41}^T \\ \hat a_{30}^T & \hat a_{32}^T & \hat a_{43}^T \end{pmatrix} \right\|_F^2 \\
\quad = \|\hat a_{10}^T\|_F^2 + \|\hat a_{21}^T\|_F^2 + \|\hat a_{41}^T\|_F^2
+ \|\hat a_{30}^T\|_F^2 + \|\hat a_{32}^T\|_F^2 + \|\hat a_{43}^T\|_F^2.
\end{array}
$$
Similarly, looking at the columns that are modified by applying $J$ from the right, we find
that
$$
\begin{array}{l}
\|a_{10}\|_F^2 + \|a_{30}\|_F^2 + \|a_{21}\|_F^2 + \|a_{32}\|_F^2 + \|a_{41}\|_F^2 + \|a_{43}\|_F^2
= \left\| \begin{pmatrix} a_{10} & a_{30} \\ a_{21} & a_{32} \\ a_{41} & a_{43} \end{pmatrix} \right\|_F^2 \\
\quad = \left\| \begin{pmatrix} a_{10} & a_{30} \\ a_{21} & a_{32} \\ a_{41} & a_{43} \end{pmatrix}
\begin{pmatrix} \gamma_{11} & -\sigma_{31} \\ \sigma_{31} & \gamma_{33} \end{pmatrix} \right\|_F^2
= \left\| \begin{pmatrix} \hat a_{10} & \hat a_{30} \\ \hat a_{21} & \hat a_{32} \\ \hat a_{41} & \hat a_{43} \end{pmatrix} \right\|_F^2
= \|\hat a_{10}\|_F^2 + \|\hat a_{30}\|_F^2 + \|\hat a_{21}\|_F^2 + \|\hat a_{32}\|_F^2 + \|\hat a_{41}\|_F^2 + \|\hat a_{43}\|_F^2.
\end{array}
$$
We conclude that
$$ \|\mathrm{off}(\hat A)\|_F^2 = \|\mathrm{off}(A)\|_F^2 - 2\alpha_{31}^2. $$
From this exercise, we learn:

• The good news: every time a Jacobi rotation is used to zero an off-diagonal element,
$\|\mathrm{off}(A)\|_F^2$ decreases by twice the square of that element.

• The bad news: a previously introduced zero may become nonzero in the process.

The original algorithm developed by Jacobi searched for the largest (in absolute value)
off-diagonal element and zeroed it, repeating this process until all off-diagonal elements
were small. The problem with this is that searching for the largest off-diagonal element in an
m × m matrix requires O(m^2) comparisons. Computing and applying one Jacobi rotation as
a similarity transformation requires O(m) flops. For large m this is not practical. Instead, it
can be shown that zeroing the off-diagonal elements by columns (or rows) also converges to
a diagonal matrix. This is known as the column-cyclic Jacobi algorithm. Zeroing out every
pair of off-diagonal elements once is called a sweep. We illustrate this in Figure 11.3.2.1.
Typically only a few sweeps (on the order of five) are needed to converge sufficiently.
[Figure 11.3.2.1 illustrates two sweeps of the column-cyclic Jacobi algorithm on a 4 × 4 symmetric
matrix: within each sweep the off-diagonal elements are zeroed in the order (1,0), (2,0), (3,0),
(2,1), (3,1), (3,2), and elements zeroed earlier in the sweep may fill in again as later rotations are applied.]
Figure 11.3.2.1 Column-cyclic Jacobi algorithm.
We conclude by noting that the matrix Q such that A = QΛQ^H can be computed by
accumulating all the Jacobi rotations (applying them to the identity matrix).
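
To make one column-cyclic sweep concrete, the following is a minimal C sketch (the Matlab scripts in Assignments/Week11/matlab remain the reference for the homework). The routine name jacobi_sweep is an illustrative choice, and the rotation is computed from the standard closed-form cosine/sine formulas rather than from an explicit eigenvector as in the recipe above; either approach zeroes the targeted element.

#include <math.h>

#define alpha( i,j ) A[ (j)*ldA + (i) ]   /* column-major access to the array A */

/* One sweep of a column-cyclic Jacobi method for a dense symmetric m x m
   matrix A.  No convergence test or accumulation of Q is included. */
void jacobi_sweep( int m, double *A, int ldA )
{
  for ( int j=0; j<m; j++ )
    for ( int i=j+1; i<m; i++ ) {
      if ( alpha( i,j ) == 0.0 ) continue;          /* already zero */

      /* Rotation (gamma, sigma) that zeroes alpha( i,j ) and alpha( j,i ). */
      double tau = ( alpha( i,i ) - alpha( j,j ) ) / ( 2.0 * alpha( i,j ) );
      double t   = ( tau >= 0.0 ?  1.0 / (  tau + sqrt( 1.0 + tau*tau ) )
                                 : -1.0 / ( -tau + sqrt( 1.0 + tau*tau ) ) );
      double gamma = 1.0 / sqrt( 1.0 + t*t ), sigma = t * gamma;

      /* Apply the rotation to columns j and i (from the right) ... */
      for ( int p=0; p<m; p++ ) {
        double apj = alpha( p,j ), api = alpha( p,i );
        alpha( p,j ) = gamma * apj - sigma * api;
        alpha( p,i ) = sigma * apj + gamma * api;
      }
      /* ... and to rows j and i (from the left), completing J^T A J. */
      for ( int p=0; p<m; p++ ) {
        double ajp = alpha( j,p ), aip = alpha( i,p );
        alpha( j,p ) = gamma * ajp - sigma * aip;
        alpha( i,p ) = sigma * ajp + gamma * aip;
      }
    }
}

Accumulating Q amounts to applying the same column updates to a matrix that is initialized to the identity.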

11.3.3 Jacobi’s method for computing the Singular Value Decom-


position
WEEK 11. COMPUTING THE SVD 582

YouTube: https://www.youtube.com/watch?v=VUktLhUiR7w
Just like the QR algorithm for computing the Spectral Decomposition was modified to
compute the SVD, so can the Jacobi Method for computing the Spectral Decomposition.
The insight is very simple. Let $A \in \mathbb{R}^{m \times n}$ and partition it by columns:
$$ A = \begin{pmatrix} a_0 & a_1 & \cdots & a_{n-1} \end{pmatrix}. $$

One could form B = A^T A and then compute Jacobi rotations to diagonalize it:
$$ \underbrace{\cdots J_{3,1}^T J_{2,1}^T}_{Q^T} \, B \, \underbrace{J_{2,1} J_{3,1} \cdots}_{Q} = D. $$
We recall that if we order the columns of Q and diagonal elements of D appropriately, then
choosing V = Q and Σ = D^{1/2} yields
$$ A = U \Sigma V^T = U D^{1/2} Q^T $$
or, equivalently,
$$ A Q = U \Sigma = U D^{1/2}. $$
This means that if we apply the Jacobi rotations J_{2,1}, J_{3,1}, . . . from the right to A,
$$ U D^{1/2} = ((A J_{2,1}) J_{3,1}) \cdots, $$
then, once B has become (approximately) diagonal, the columns of $\hat A = ((A J_{2,1}) J_{3,1}) \cdots$ are
mutually orthogonal. By scaling them to have length one, setting $\Sigma = \mathrm{diag}(\|\hat a_0\|_2, \|\hat a_1\|_2, \ldots, \|\hat a_{n-1}\|_2)$,
we find that
$$ U = \hat A \Sigma^{-1} = A Q (D^{1/2})^{-1}. $$
The only problem is that in forming B, we may introduce unnecessary error since it squares
the condition number.
Here is a more practical algorithm. We notice that
$$
B = A^T A = \begin{pmatrix}
a_0^T a_0 & a_0^T a_1 & \cdots & a_0^T a_{n-1} \\
a_1^T a_0 & a_1^T a_1 & \cdots & a_1^T a_{n-1} \\
\vdots & \vdots & & \vdots \\
a_{n-1}^T a_0 & a_{n-1}^T a_1 & \cdots & a_{n-1}^T a_{n-1}
\end{pmatrix}.
$$
We observe that we don't need to form all of B. When it is time to compute J_{i,j}, we need
only compute
$$
\begin{pmatrix} \beta_{i,i} & \beta_{j,i} \\ \beta_{j,i} & \beta_{j,j} \end{pmatrix}
=
\begin{pmatrix} a_i^T a_i & a_j^T a_i \\ a_j^T a_i & a_j^T a_j \end{pmatrix},
$$
from which J_{i,j} can be computed. By instead applying this Jacobi rotation to B, we observe
that
$$ J_{i,j}^T B J_{i,j} = J_{i,j}^T A^T A J_{i,j} = (A J_{i,j})^T (A J_{i,j}) $$
and hence the Jacobi rotation can instead be used to take linear combinations of the ith and
jth columns of A:
$$
\begin{pmatrix} a_i & a_j \end{pmatrix} :=
\begin{pmatrix} a_i & a_j \end{pmatrix}
\begin{pmatrix} \gamma_{i,j} & -\sigma_{i,j} \\ \sigma_{i,j} & \gamma_{i,j} \end{pmatrix}.
$$
We have thus outlined an algorithm:

• Starting with matrix A, compute a sequence of Jacobi rotations (e.g., corresponding to
a column-cyclic Jacobi method) until the off-diagonal elements of A^T A (parts of which
are formed as Jacobi rotations are computed) become small. Every time a Jacobi
rotation is computed, it updates the appropriate columns of A.

• Accumulate the Jacobi rotations into matrix V, by applying them from the right to an
identity matrix:
$$ V = ((I\, J_{2,1}) J_{3,1}) \cdots $$

• Upon completion,
$$ \Sigma = \mathrm{diag}(\|a_0\|_2, \|a_1\|_2, \ldots, \|a_{n-1}\|_2) $$
and
$$ U = A \Sigma^{-1}, $$
meaning that each column of the updated A is divided by its length.

• If necessary, reorder the columns of U and V and the diagonal elements of Σ.

Obviously, there are variations on this theme. Such methods are known as one-sided Jacobi
methods.
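
The key computational step of such a one-sided method can be sketched in C as follows. The routine name and its calling sequence are illustrative choices; it forms the 2 × 2 Gram matrix for columns i and j and then replaces those columns by the rotated linear combinations, using the standard closed-form rotation formulas.

#include <math.h>

#define a( p,q ) A[ (q)*ldA + (p) ]   /* column-major access to the m x n matrix A */

/* One step of a one-sided Jacobi method: rotate columns i and j of A so that
   they become orthogonal (equivalently, so that the corresponding off-diagonal
   entry of A^T A becomes zero). */
void one_sided_jacobi_rotate( int m, double *A, int ldA, int i, int j )
{
  /* Entries of the 2x2 Gram matrix: beta_ii = a_i^T a_i, etc. */
  double beta_ii = 0.0, beta_jj = 0.0, beta_ji = 0.0;
  for ( int p=0; p<m; p++ ) {
    beta_ii += a( p,i ) * a( p,i );
    beta_jj += a( p,j ) * a( p,j );
    beta_ji += a( p,j ) * a( p,i );
  }
  if ( beta_ji == 0.0 ) return;                     /* columns already orthogonal */

  /* Rotation that diagonalizes the 2x2 Gram matrix. */
  double tau = ( beta_jj - beta_ii ) / ( 2.0 * beta_ji );
  double t   = ( tau >= 0.0 ?  1.0 / (  tau + sqrt( 1.0 + tau*tau ) )
                             : -1.0 / ( -tau + sqrt( 1.0 + tau*tau ) ) );
  double gamma = 1.0 / sqrt( 1.0 + t*t ), sigma = t * gamma;

  /* Take linear combinations of columns i and j of A. */
  for ( int p=0; p<m; p++ ) {
    double api = a( p,i ), apj = a( p,j );
    a( p,i ) = gamma * api - sigma * apj;
    a( p,j ) = sigma * api + gamma * apj;
  }
}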

11.4 Enrichments
11.4.1 Principal Component Analysis
The Spectral Decomposition and Singular Value Decomposition are fundamental to a tech-
nique in data sciences known as Principal Component Analysis (PCA). The following
tutorial makes that connection.

• [36] Jonathon Shlens, A Tutorial on Principal Component Analysis, arxiv 1404.1100,


2014.

11.4.2 Casting the reduction to bidiagonal form in terms of matrix-matrix multiplication
As was discussed in Subsection 10.4.3 for the reduction to tridiagonal form, reduction to
bidiagonal form can only be partly cast in terms of matrix-matrix multiplication. As for the
reduction to tridiagonal form, we recommend

• [44] Field G. Van Zee, Robert A. van de Geijn, Gregorio Quintana-Ortí, G. Joseph
Elizondo, Families of Algorithms for Reducing a Matrix to Condensed Form, ACM
Transactions on Mathematical Software (TOMS) , Vol. 39, No. 1, 2012.

Bidiagonal, tridiagonal, and upper Hessenberg form are together referred to as condensed
form.

11.4.3 Optimizing the bidiagonal QR algorithm


As the Givens’ rotations are applied to the bidiagonal matrix, they are also applied to
matrices in which the left and right singular vectors are accumulated (matrices U and V ).
If we start with an m ◊ m matrix, one step of introducing the bulge and chasing it out
the matrix requires O(m) computation. Accumulating the Givens’ rotations into U and
V requires O(m2 ) computation for each such step, with O(m2 ) data. As was discussed in
Subsection 10.4.4 for the implicitly shifted QR algorithm, this inherently means the cost of
accessing data dominates on current architectures.
The paper mentioned in Subsection 10.4.4 also describes techniques for applying
the Givens' rotations for several steps of the implicitly shifted bidiagonal QR algorithm at
the same time, which allows one to attain high performance similar to that attained by a
matrix-matrix multiplication.

• [42] Field G. Van Zee, Robert A. van de Geijn, Gregorio Quintana-Ortí, Restructuring
the Tridiagonal and Bidiagonal QR Algorithms for Performance, ACM Transactions
on Mathematical Software (TOMS), Vol. 40, No. 3, 2014.

To our knowledge, this yields the fastest implementation for finding the SVD of a bidiagonal
matrix.

11.5 Wrap Up
11.5.1 Additional homework
No additional homework yet.

11.5.2 Summary
We have noticed that typos are uncovered relatively quickly once we release the material.
Because we "cut and paste" the summary from the materials in this week, we are delaying
adding the summary until most of these typos have been identified.
Week 12

Attaining High Performance

12.1 Opening
12.1.1 Simple Implementation of matrix-matrix multiplication
The current coronavirus crisis hit UT-Austin on March 14, a day we spent quickly making
videos for Week 11. We have not been back to the office, to create videos for Week 12, since
then. We will likely add such videos as time goes on. For now, we hope that the notes
suffice.
Remark 12.1.1.1 The exercises in this unit assume that you have installed the BLAS-like
Library Instantiation Software (BLIS), as described in Subsection 0.2.4.
Let A, B, and C be m × k, k × n, and m × n matrices, respectively. We can expose their
individual entries as
$$
A = \begin{pmatrix}
\alpha_{0,0} & \alpha_{0,1} & \cdots & \alpha_{0,k-1} \\
\alpha_{1,0} & \alpha_{1,1} & \cdots & \alpha_{1,k-1} \\
\vdots & \vdots & & \vdots \\
\alpha_{m-1,0} & \alpha_{m-1,1} & \cdots & \alpha_{m-1,k-1}
\end{pmatrix},
\quad
B = \begin{pmatrix}
\beta_{0,0} & \beta_{0,1} & \cdots & \beta_{0,n-1} \\
\beta_{1,0} & \beta_{1,1} & \cdots & \beta_{1,n-1} \\
\vdots & \vdots & & \vdots \\
\beta_{k-1,0} & \beta_{k-1,1} & \cdots & \beta_{k-1,n-1}
\end{pmatrix},
$$
and
$$
C = \begin{pmatrix}
\gamma_{0,0} & \gamma_{0,1} & \cdots & \gamma_{0,n-1} \\
\gamma_{1,0} & \gamma_{1,1} & \cdots & \gamma_{1,n-1} \\
\vdots & \vdots & & \vdots \\
\gamma_{m-1,0} & \gamma_{m-1,1} & \cdots & \gamma_{m-1,n-1}
\end{pmatrix}.
$$
The computation C := AB + C, which adds the result of the matrix-matrix multiplication
AB to a matrix C, is defined element-wise as
$$ \gamma_{i,j} := \sum_{p=0}^{k-1} \alpha_{i,p}\beta_{p,j} + \gamma_{i,j} \tag{12.1.1} $$
for all 0 ≤ i < m and 0 ≤ j < n. We add to C because this will make it easier to play with
the orderings of the loops when implementing matrix-matrix multiplication. The following


pseudo-code computes C := AB + C:

for i := 0, . . . , m − 1
   for j := 0, . . . , n − 1
      for p := 0, . . . , k − 1
         γ_{i,j} := α_{i,p} β_{p,j} + γ_{i,j}
      end
   end
end

The outer two loops visit each element of C and the inner loop updates γ_{i,j} with (12.1.1). We
use C programming language macro definitions in order to explicitly index into the matrices,
which are passed as one-dimensional arrays in which the matrices are stored in column-major
order.
Remark 12.1.1.2 For a more complete discussion of how matrices are mapped to memory,
you may want to look at 1.2.1 Mapping matrices to memory in our MOOC titled LAFF-On
Programming for High Performance. If the discussion here is a bit too fast, you may want
to consult the entire Section 1.2 Loop orderings of that course.
#define alpha( i,j ) A[ (j)*ldA + i ]   // map alpha( i,j ) to array A
#define beta( i,j )  B[ (j)*ldB + i ]   // map beta( i,j )  to array B
#define gamma( i,j ) C[ (j)*ldC + i ]   // map gamma( i,j ) to array C

void MyGemm( int m, int n, int k, double *A, int ldA,
             double *B, int ldB, double *C, int ldC )
{
  for ( int i=0; i<m; i++ )
    for ( int j=0; j<n; j++ )
      for ( int p=0; p<k; p++ )
        gamma( i,j ) += alpha( i,p ) * beta( p,j );
}
Figure 12.1.1.3 Implementation, in the C programming language, of the IJP ordering for
computing matrix-matrix multiplication.
Homework 12.1.1.1 In the file Assignments/Week12/C/Gemm_IJP.c you will find the simple
implementation given in Figure 12.1.1.3 that computes C := AB + C. In a terminal window,
in the directory Assignments/Week12/C, execute

make IJP

to compile, link, and execute it. You can view the performance attained on your computer
with the Matlab Live Script in Assignments/Week12/C/data/Plot_IJP.mlx (Alternatively, read
and execute Assignments/Week12/C/data/Plot_IJP_m.m.)
On Robert’s laptop, Homework 12.1.1.1 yields the graph

as the curve labeled with IJP. The time, in seconds, required to compute matrix-matrix
multiplication as a function of the matrix size is plotted, where m = n = k (each matrix
is square). "Irregularities" in the time required to complete can be attributed to a number
of factors, including that other processes that are executing on the same processor may be
disrupting the computation. One should not be too concerned about those.
The performance of a matrix-matrix multiplication implementation is measured in billions
of floating point operations per second (GFLOPS). We know that it takes 2mnk flops to
compute C := AB + C when C is m × n, A is m × k, and B is k × n. If we measure the
time it takes to complete the computation, T(m, n, k), then the rate at which we compute
is given by
$$ \frac{2mnk}{T(m,n,k)} \times 10^{-9} \text{ GFLOPS}. $$
For our implementation, this yields

Again, don’t worry too much about the dips in the curves in this and future graphs. If we
controlled the environment in which we performed the experiments (for example, by making
sure no other compute-intensive programs are running at the time of the experiments), these
would largely disappear.
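
For reference, a bare-bones timing harness for MyGemm might look like the following sketch (it is not the driver used in the Assignments directory; the problem size, the use of the POSIX clock_gettime timer, and the trivial initialization are placeholder choices):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* MyGemm as defined earlier in this unit (IJP ordering). */
void MyGemm( int m, int n, int k, double *A, int ldA,
             double *B, int ldB, double *C, int ldC );

int main( void )
{
  int m = 1000, n = 1000, k = 1000;                 /* square problem */
  double *A = malloc( (size_t) m * k * sizeof( double ) );
  double *B = malloc( (size_t) k * n * sizeof( double ) );
  double *C = malloc( (size_t) m * n * sizeof( double ) );

  for ( int p=0; p<m*k; p++ ) A[p] = 1.0;           /* simple initialization */
  for ( int p=0; p<k*n; p++ ) B[p] = 1.0;
  for ( int p=0; p<m*n; p++ ) C[p] = 0.0;

  struct timespec t0, t1;
  clock_gettime( CLOCK_MONOTONIC, &t0 );
  MyGemm( m, n, k, A, m, B, k, C, m );
  clock_gettime( CLOCK_MONOTONIC, &t1 );

  double T = ( t1.tv_sec - t0.tv_sec ) + ( t1.tv_nsec - t0.tv_nsec ) * 1e-9;
  double gflops = 2.0 * m * n * k / T * 1e-9;       /* 2mnk / T x 10^-9 */
  printf( "time = %f s, rate = %f GFLOPS\n", T, gflops );

  free( A ); free( B ); free( C );
  return 0;
}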
Remark 12.1.1.4 The Gemm in the name of the routine stands for General Matrix-Matrix
multiplication. Gemm is an acronym that is widely used in scientific computing, with roots
in the Basic Linear Algebra Subprograms (BLAS) interface which we will discuss in Subsec-
tion 12.2.5.
Homework 12.1.1.2 The IJP ordering is one possible ordering of the loops. How many
distinct reorderings of those loops are there?
Answer.
3! = 6.

Solution.

• There are three choices for the outer-most loop: i, j, or p.

• Once a choice is made for the outer-most loop, there are two choices left for the second
loop.

• Once that choice is made, there is only one choice left for the inner-most loop.

Thus, there are 3! = 3 × 2 × 1 = 6 loop orderings.


Homework 12.1.1.3 In directory Assignments/Week12/C make copies of Assignments/
Week12/C/Gemm_IJP.c into files with names that reflect the different loop orderings (Gemm_IPJ.c,
etc.). Next, make the necessary changes to the loops in each file to reflect the ordering en-
coded in its name. Test the implementations by executing

make IPJ
make JIP
...

for each of the implementations and view the resulting performance by making the indicated
changes to the Live Script in Assignments/Week12/C/data/Plot_All_Orderings.mlx (Alterna-
tively, use the script in Assignments/Week12/C/data/Plot_All_Orderings_m.m). If you have
implemented them all, you can test them all by executing

make All_Orderings

Solution.

• Assignments/Week12/answers/Gemm_IPJ.c

• Assignments/Week12/answers/Gemm_JIP.c

• Assignments/Week12/answers/Gemm_JPI.c

• Assignments/Week12/answers/Gemm_PIJ.c

• Assignments/Week12/answers/Gemm_PJI.c
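
For concreteness, one of these files might look like the following sketch of the JPI ordering (assuming the same macros and calling sequence as Gemm_IJP.c):

#define alpha( i,j ) A[ (j)*ldA + i ]   // map alpha( i,j ) to array A
#define beta( i,j )  B[ (j)*ldB + i ]   // map beta( i,j )  to array B
#define gamma( i,j ) C[ (j)*ldC + i ]   // map gamma( i,j ) to array C

void MyGemm( int m, int n, int k, double *A, int ldA,
             double *B, int ldB, double *C, int ldC )
{
  for ( int j=0; j<n; j++ )
    for ( int p=0; p<k; p++ )
      for ( int i=0; i<m; i++ )   /* innermost loop strides through columns of C and A */
        gamma( i,j ) += alpha( i,p ) * beta( p,j );
}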

Figure 12.1.1.5 Performance comparison of all different orderings of the loops, on Robert’s
laptop.
Homework 12.1.1.4 In directory Assignments/Week12/C/ execute
make JPI

and view the results with the Live Script in Assignments/Week12/C/data/Plot_Opener.mlx.

(This may take a little while, since the Makefile now specifies that the largest problem to be
executed is m = n = k = 1500.)
Next, change that Live Script to also show the performance of the reference implemen-
tation provided by the BLAS-like Library Instantiation Software (BLIS): Change

% Optionally show the reference implementation performance data
if ( 0 )

to

% Optionally show the reference implementation performance data
if ( 1 )

and rerun the Live Script. This adds a plot to the graph for the reference implementation.
What do you observe? Now are you happy with the improvements you made by reordering
the loops?
Solution. On Robert’s laptop:

Left: Plotting only simple implementations. Right: Adding the performance of the
reference implementation provided by BLIS.
Note: the performance in the graph on the left may not exactly match that in the graph
earlier in this unit. My laptop does not always attain the same performance. When a
processor gets hot, it "clocks down." This means the attainable performance goes down.
A laptop is not easy to cool, so one would expect more fluctuation than when using, for
example, a desktop or a server.
Here is a video from our course "LAFF-On Programming for High Performance", which
explains what you observe. (It refers to "Week 1" of that course. It is part of the launch for
that course.)

YouTube: https://www.youtube.com/watch?v=eZaq451nuaE
Remark 12.1.1.6 There are a number of things to take away from the exercises in this unit.
• The ordering of the loops matters. Why? Because it changes the pattern of memory
access. Accessing memory contiguously (what is often called "with stride one") improves
performance.

• Compilers, which translate high-level code written in a language like C, are not the
answer. You would think that they would reorder the loops to improve performance.
The widely-used gcc compiler doesn’t do so very effectively. (Intel’s highly-acclaimed
icc compiler does a much better job, but still does not produce performance that rivals
the reference implementation.)

• Careful implementation can greatly improve performance.

• One solution is to learn how to optimize yourself. Some of the fundamentals can be
discovered in our MOOC LAFF-On Programming for High Performance. The other
solution is to use highly optimized libraries, some of which we will discuss later in this
week.

12.1.2 Overview
• 12.1 Opening

¶ 12.1.1 Simple Implementation of matrix-matrix multiplication


¶ 12.1.2 Overview
¶ 12.1.3 What you will learn

• 12.2 Linear Algebra Building Blocks

¶ 12.2.1 A simple model of the computer


¶ 12.2.2 Opportunities for optimization
¶ 12.2.3 Basics of optimizing matrix-matrix multiplication
¶ 12.2.4 Optimizing matrix-matrix multiplication, the real story
¶ 12.2.5 BLAS and BLIS

• 12.3 Casting Computation in Terms of Matrix-Matrix Multiplication



¶ 12.3.1 Blocked Cholesky factorization


¶ 12.3.2 Blocked LU factorization
¶ 12.3.3 Other high-performance dense linear algebra algorithms
¶ 12.3.4 Libraries for higher level dense linear algebra functionality
¶ 12.3.5 Sparse algorithms

• 12.4 Enrichments

¶ 12.4.1 Optimizing matrix-matrix multiplication - We’ve got a MOOC for that!


¶ 12.4.2 Deriving blocked algorithms - We’ve got a MOOC for that too!
¶ 12.4.3 Parallel high-performance algorithms

• 12.5 Wrap Up

¶ 12.5.1 Additional homework


¶ 12.5.2 Summary

12.1.3 What you will learn


This week, you explore how algorithms and architectures interact.
Upon completion of this week, you should be able to

• Amortize the movement of data between memory layers over useful computation to
overcome the discrepancy between the speed of floating point computation and the
speed with which data can be moved in and out of main memory.

• Block matrix-matrix multiplication for multiple levels of cache.

• Cast matrix factorizations as blocked algorithms that cast most computation in terms
of matrix-matrix operations.

• Recognize that the performance of some algorithms is inherently restricted by the


memory operations that must be performed.

• Utilize interfaces to high-performance software libraries.

• Find additional resources for further learning.



12.2 Linear Algebra Building Blocks


12.2.1 A simple model of the computer
The following is a relevant video from our course "LAFF-On Programming for High Perfor-
mance" (Unit 2.3.1).
The good news about modern processors is that they can perform floating point oper-
ations at very high rates. The bad news is that "feeding the beast" is often a bottleneck:
In order to compute with data, that data must reside in the registers of the processor and
moving data from main memory into a register requires orders of magnitude more time than
it does to then perform a floating point computation with that data.
In order to achieve high performance for the kinds of operations that we have encoun-
tered, one has to have a very high-level understanding of the memory hierarchy of a modern
processor. Modern architectures incorporate multiple (compute) cores in a single processor.
In our discussion, we blur this and will talk about the processor as if it has only one core.
It is useful to view the memory hierarchy of a processor as a pyramid.

At the bottom of the pyramid is the computer’s main memory. At the top are the
processor’s registers. In between are progressively larger cache memories: the L1, L2, and
L3 caches. (Some processors now have an L4 cache.) To compute, data must be brought
into registers, of which there are only a few. Main memory is very large and very slow. The
strategy for overcoming the cost (in time) of loading data is to amortise that cost over many

computations while it resides in a faster memory layer. The question, of course, is whether
an operation we wish to perform exhibits the opportunity for such amortization.
Ponder This 12.2.1.1 For the processor in your computer, research the number of registers
it has, and the sizes of the various caches.

12.2.2 Opportunities for optimization


We now examine the opportunity for the reuse of data for different linear algebra operations
that we have encountered. In our discussion, we assume that scalars, the elements of vectors,
and the elements of matrices are all stored as floating point numbers and that arithmetic
involves floating point computation.
Example 12.2.2.1 Consider the dot product
$$ \rho := x^T y + \rho, $$
where ρ is a scalar, and x and y are vectors of size m.
• How many floating point operations (flops) are required to compute this operation?
• If the scalar ρ and the vectors x and y are initially stored in main memory and are
written back to main memory, how many reads and writes (memops) are required?
(Give a reasonably tight lower bound.)
• What is the ratio of flops to memops?

Solution.
• How many floating point operations are required to compute this operation?
The dot product requires m multiplies and m additions, for a total of 2m flops.
• If the scalar ρ and the vectors x and y are initially stored in main memory and (if
necessary) are written back to main memory, how many reads and writes (memory
operations) are required? (Give a reasonably tight lower bound.)

◦ The scalar ρ is moved into a register and hence only needs to be read once and
written once.
◦ The m elements of x and m elements of y must be read (but not written).

Hence the number of memops is 2(m + 1) ≈ 2m.

• What is the ratio of flops to memops?
$$ \frac{2m \text{ flops}}{2(m+1) \text{ memops}} \approx 1 \frac{\text{flops}}{\text{memops}}. $$

We conclude that the dot product does not exhibit an opportunity for the reuse of most
data. ⇤

Homework 12.2.2.1 Consider the axpy operation
$$ y := \alpha x + y, $$
where α is a scalar, and x and y are vectors of size m.

• How many floating point operations (flops) are required to compute this operation?

• If the scalar α and the vectors x and y are initially stored in main memory and are
written back to main memory (if necessary), how many reads and writes (memops) are
required? (Give a reasonably tight lower bound.)

• What is the ratio of flops to memops?

Solution.

• How many floating point operations are required to compute this operation?
The axpy operation requires m multiplies and m additions, for a total of 2m flops.

• If the scalar α and the vectors x and y are initially stored in main memory and (if
necessary) are written back to main memory, how many reads and writes (memory
operations) are required? (Give a reasonably tight lower bound.)

◦ The scalar α is moved into a register, and hence only needs to be read once. It
does not need to be written back to memory.
◦ The m elements of x are read, and the m elements of y must be read and written.

Hence the number of memops is 3m + 1 ≈ 3m.

• What is the ratio of flops to memops?
$$ \frac{2m \text{ flops}}{3m+1 \text{ memops}} \approx \frac{2}{3} \frac{\text{flops}}{\text{memops}}. $$

We conclude that the axpy operation also does not exhibit an opportunity for the reuse of
most data.
The time for performing a floating point operation is orders of magnitude less than that
of moving a floating point number from and to memory. Thus, for an individual dot product
or axpy operation, essentially all time will be spent in moving the data and the attained
performance, in GFLOPS, will be horrible. The important point is that there just isn’t much
reuse of data when executing these kinds of "vector-vector" operations.
Example 12.2.2.1 and Homework 12.2.2.1 appear to suggest that, for example, when
computing a matrix-vector multiplication, one should do so by taking dot products of rows
with the vector rather than by taking linear combinations of the columns, which casts the
computation in terms of axpy operations. It is more complicated than that: the fact that
the algorithm that uses axpys computes with columns of the matrix, which are stored
contiguously when column-major order is employed, makes accessing memory cheaper.


Homework 12.2.2.2 Consider a matrix-vector multiplication

y := Ax + y,

where A is m × m and x and y are vectors of appropriate size.


• How many floating point operations (flops) are required to compute this operation?

• If the matrix and vectors are initially stored in main memory and are written back
to main memory (if necessary), how many reads and writes (memops) are required?
(Give a reasonably tight lower bound.)

• What is the ratio of flops to memops?

Solution.

• How many floating point operations are required to compute this operation?
y := Ax + y requires m^2 multiplies and m^2 additions, for a total of 2m^2 flops.

• If the matrix and vectors are initially stored in main memory and are written back to
main memory, how many reads and writes (memops) are required? (Give a reasonably
tight lower bound.)
To come up with a reasonably tight lower bound, we observe that every element of A
must be read (but not written). Thus, a lower bound is m^2 memops. The reading and
writing of x and y contribute a lower order term, which we tend to ignore.

• What is the ratio of flops to memops?
$$ \frac{2m^2 \text{ flops}}{m^2 \text{ memops}} \approx 2 \frac{\text{flops}}{\text{memops}}. $$

While this ratio is better than either the dot product’s or the axpy operation’s, it still does
not look good.
What we notice is that there is a (slightly) better opportunity for reuse of data when
computing a matrix-vector multiplication than there is when computing a dot product or
axpy operation. Can the lower bound on data movement that is given in the solution be
attained? If you bring y into, for example, the L1 cache, then it only needs to be read from
main memory once and is kept in a layer of the memory that is fast enough to keep up with
the speed of floating point computation for the duration of the matrix-vector multiplication.
Thus, it only needs to be read and written once from and to main memory. If we then
compute y := Ax + y by taking linear combinations of the columns of A, staged as axpy
operations, then at the appropriate moment an element of x with which an axpy is performed
can be moved into a register and reused. This approach requires each element of A to be
read once, each element of x to be read once, and each element of y to be read and written

(from and to main memory) once. If the vector y is too large for the L1 cache, then it
can be partitioned into subvectors that do fit. This would require the vector x to be read
into registers multiple times. However, x itself might then be reused from one of the cache
memories.
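
A sketch of this access pattern in C follows; the routine name and interface are illustrative, but the indexing conventions match those used earlier this week (column-major storage, leading dimension ldA):

#define chi( i )  x[ (i)*incx ]
#define psi( i )  y[ (i)*incy ]
#define a( i,j )  A[ (j)*ldA + i ]

/* y := A x + y computed as a sequence of axpy operations with the columns of A. */
void MyGemv_axpy( int m, int n, double *A, int ldA,
                  double *x, int incx, double *y, int incy )
{
  for ( int j=0; j<n; j++ ) {
    double chij = chi( j );      /* chi_j is loaded into a register once ...      */
    for ( int i=0; i<m; i++ )    /* ... and reused for the whole jth column of A. */
      psi( i ) += a( i,j ) * chij;
  }
}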
Homework 12.2.2.3 Consider the rank-1 update

$$ A := x y^T + A, $$
where A is m × m and x and y are vectors of appropriate size.


• How many floating point operations (flops) are required to compute this operation?

• If the matrix and vectors are initially stored in main memory and are written back
to main memory (if necessary), how many reads and writes (memops) are required?
(Give a reasonably tight lower bound.)

• What is the ratio of flops to memops?

Solution.

• How many floating point operations are required to compute this operation?
A := xy^T + A requires m^2 multiplies and m^2 additions, for a total of 2m^2 flops. (One
multiply and one add per element in A.)

• If the matrix and vectors are initially stored in main memory and are written back to
main memory, how many reads and writes (memops) are required? (Give a reasonably
tight lower bound.)
To come up with a reasonably tight lower bound, we observe that every element of A
must be read and written. Thus, a lower bound is 2m^2 memops. The reading of x and
y contribute a lower order term. These vectors need not be written, since they don't
change.

• What is the ratio of flops to memops?
$$ \frac{2m^2 \text{ flops}}{2m^2 \text{ memops}} \approx 1 \frac{\text{flops}}{\text{memops}}. $$
What we notice is that data reuse when performing a matrix-vector multiplication with
an m × m matrix is twice as favorable as is data reuse when performing a rank-1 update.
Regardless, there isn’t much reuse and since memory operations are much slower than floating
point operations in modern processors, none of the operations discussed so far in the unit
can attain high performance (if the data starts in main memory).
Homework 12.2.2.4 Consider the matrix-matrix multiplication
C := AB + C,
where A, B, and C are all m × m matrices.

• How many floating point operations (flops) are required to compute this operation?

• If the matrices are initially stored in main memory and (if necessary) are written back
to main memory, how many reads and writes (memops) are required? (Give a simple
lower bound.)

• What is the ratio of flops to memops for this lower bound?

Solution.

• How many floating point operations are required to compute this operation?
C := AB + C requires m^3 multiplies and m^3 additions, for a total of 2m^3 flops.

• If the matrices are initially stored in main memory and (if necessary) are written back
to main memory, how many reads and writes (memops) are required? (Give a simple
lower bound.)
To come up with a reasonably tight lower bound, we observe that every element of A,
B, and C must be read and every element of C must be written. Thus, a lower bound
is 4m^2 memops.

• What is the ratio of flops to memops?
$$ \frac{2m^3 \text{ flops}}{4m^2 \text{ memops}} \approx \frac{m}{2} \frac{\text{flops}}{\text{memops}}. $$
The lower bounds that we give in this unit are simple, but useful. There is actually a tight
lower bound on the number of reads and writes that must be performed by a matrix-matrix
multiplication. Details can be found in
• [50] Tyler Michael Smith, Bradley Lowery, Julien Langou, Robert A. van de Geijn, A
Tight I/O Lower Bound for Matrix Multiplication, arxiv.org:1702.02017v2, 2019. (To
appear in ACM Transactions on Mathematical Software.)
The bottom line: operations like matrix-matrix multiplication exhibit the opportunity
for reuse of data.

12.2.3 Basics of optimizing matrix-matrix multiplication


Let us again consider the computation

C := AB + C,

where A, B, and C are m × m matrices. If m is small enough, then we can read the three
matrices into the L1 cache, perform the operation, and write the updated matrix C back to
memory. In this case,
• During the computation, the matrices are in a fast memory (the L1 cache), which can
keep up with the speed of floating point computation and

• The cost of moving each floating point number from main memory into the L1 cache
is amortized over m/2 floating point computations.
If m is large enough, then the cost of moving the data becomes insignificant. (If carefully
orchestrated, some of the movement of data can even be overlapped with computation, but
that is beyond our discussion.)
We immediately notice there is a tension: m must be small so that all three matrices can
fit in the L1 cache. Thus, this only works for relatively small matrices. However, for small
matrices, the ratio m/2 may not be favorable enough to offset the very slow main memory.
Fortunately, matrix-matrix multiplication can be orchestrated by partitioning the matri-
ces that are involved into submatrices, and computing with these submatrices instead. We
recall that if we partition
$$
C = \begin{pmatrix}
C_{0,0} & C_{0,1} & \cdots & C_{0,N-1} \\
C_{1,0} & C_{1,1} & \cdots & C_{1,N-1} \\
\vdots & \vdots & & \vdots \\
C_{M-1,0} & C_{M-1,1} & \cdots & C_{M-1,N-1}
\end{pmatrix},
\quad
A = \begin{pmatrix}
A_{0,0} & A_{0,1} & \cdots & A_{0,K-1} \\
A_{1,0} & A_{1,1} & \cdots & A_{1,K-1} \\
\vdots & \vdots & & \vdots \\
A_{M-1,0} & A_{M-1,1} & \cdots & A_{M-1,K-1}
\end{pmatrix},
$$
and
$$
B = \begin{pmatrix}
B_{0,0} & B_{0,1} & \cdots & B_{0,N-1} \\
B_{1,0} & B_{1,1} & \cdots & B_{1,N-1} \\
\vdots & \vdots & & \vdots \\
B_{K-1,0} & B_{K-1,1} & \cdots & B_{K-1,N-1}
\end{pmatrix},
$$
where $C_{i,j}$ is $m_i \times n_j$, $A_{i,p}$ is $m_i \times k_p$, and $B_{p,j}$ is $k_p \times n_j$, with
$\sum_{i=0}^{M-1} m_i = m$, $\sum_{j=0}^{N-1} n_j = n$, and $\sum_{p=0}^{K-1} k_p = k$, then
$$ C_{i,j} := \sum_{p=0}^{K-1} A_{i,p} B_{p,j} + C_{i,j}. $$

If we choose each mi , nj , and kp small enough, then the submatrices fit in the L1 cache.
This still leaves us with the problem that these sizes must be reasonably small if the ratio
of flops to memops is to be sufficient. The answer to that is to block for multiple levels of
caches.
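
A minimal C sketch of this idea follows. The block sizes MB, NB, and KB are placeholder values; in a real implementation they are tuned so that the three submatrices involved in each inner update fit in a target cache level, and the inner update itself would be further optimized.

#define alpha( i,j ) A[ (j)*ldA + i ]
#define beta( i,j )  B[ (j)*ldB + i ]
#define gamma( i,j ) C[ (j)*ldC + i ]

#define MB 96
#define NB 96
#define KB 96

/* C := A B + C computed with submatrices: each pass of the inner loops updates
   one mb x nb block of C with an mb x kb times kb x nb product. */
void MyGemm_blocked( int m, int n, int k, double *A, int ldA,
                     double *B, int ldB, double *C, int ldC )
{
  for ( int i=0; i<m; i+=MB ) {
    int mb = ( m-i < MB ? m-i : MB );
    for ( int j=0; j<n; j+=NB ) {
      int nb = ( n-j < NB ? n-j : NB );
      for ( int p=0; p<k; p+=KB ) {
        int kb = ( k-p < KB ? k-p : KB );
        /* C(i:i+mb-1, j:j+nb-1) += A(i:i+mb-1, p:p+kb-1) * B(p:p+kb-1, j:j+nb-1) */
        for ( int ii=0; ii<mb; ii++ )
          for ( int jj=0; jj<nb; jj++ )
            for ( int pp=0; pp<kb; pp++ )
              gamma( i+ii, j+jj ) += alpha( i+ii, p+pp ) * beta( p+pp, j+jj );
      }
    }
  }
}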

12.2.4 Optimizing matrix-matrix multiplication, the real story


High-performance implementations of matrix-matrix multiplication (and related operations)
block the computation for multiple levels of caches. This greatly reduces the overhead related
to data movement between memory layers. In addition, at some level the implementation
pays very careful attention to the use of vector registers and vector instructions so as to

leverage parallelism in the architecture. This allows multiple floating point operations to be
performed simultaneous within a single processing core, which is key to high performance
on modern processors.
Current high-performance libraries invariably build upon the insights in the paper

• [49] Kazushige Goto and Robert van de Geijn, Anatomy of High-Performance Matrix
Multiplication, ACM Transactions on Mathematical Software, Vol. 34, No. 3: Article
12, May 2008.

This paper is generally considered a "must read" paper in high-performance computing.


The techniques in that paper were "refactored" (carefully layered so as to make it more
maintainable) in BLIS, as described in

• [52] Field G. Van Zee and Robert A. van de Geijn, BLIS: A Framework for Rapidly
Instantiating BLAS Functionality, ACM Journal on Mathematical Software, Vol. 41,
No. 3, June 2015.

The algorithm described in both these papers can be captured by the picture in Figure 12.2.4.1.

Picture adapted from [51].


Figure 12.2.4.1 Blocking for multiple levels of cache, with packing.
This is not the place to go into great detail. Once again, we point out that the interested
learner will want to consider our Massive Open Online Course titled "LAFF-On Programming
for High Performance" [39]. In that MOOC, we illustrate issues in high performance by
exploring how high performance matrix-matrix multiplication is implemented in practice.

12.2.5 BLAS and BLIS


To facilitate portable high performance, scientific and machine learning software is often
written in terms of the Basic Linear Algebra Subprograms (BLAS) interface [24] [15] [14].
This interface supports widely-used basic linear algebra functionality. The idea is that
if optimized implementations of this interface are available for different target computer
architectures, then a degree of portable high performance can be achieved by software that
is written in terms of this interface. The interface was designed for the Fortran programming
language starting in the 1970s. In our discussions, we will also show how to interface to it
from the C programming language. As we discuss the BLAS, it may be useful to keep the
"Quick Reference Guide" [9] to this interface handy.
In addition, our own BLAS-like Library Instantiation Software (BLIS) [52][47] is not only
a framework for the rapid instantiation of high-performing BLAS through the traditional
BLAS interface, but also an extended interface that, we believe, is more natural for C
programming. We refer to this interface as the BLIS typed interface [48], to distinguish
it from the traditional BLAS interface and the object based BLIS Object API.
A number of open source and vendor high-performance implementations of the BLAS
interface are available. For example,

• Our BLAS-like Library Instantiation Software (BLIS) is a widely-used open source


implementation of the BLAS for modern CPUs. It underlies AMD’s Optimizing CPU
Libraries (AOCL).

• Arm’s Arm Performance Libraries.

• Cray’s Cray Scientific and Math Libraries (CSML).

• IBM’s Engineering and Scientific Subroutine Library (ESSL).

• Intel’s Math Kernels Library (MKL).

• NVIDIA’s cuBLAS.

12.2.5.1 Level-1 BLAS (vector-vector functionality)


The original BLAS [24] interface was proposed in the 1970s, when vector supercomputers
like the Cray 1 and Cray 2 reigned supreme. On this class of computers, high performance
was achieved if computation could be cast in terms of vector-vector operations with vectors
that were stored contiguously in memory (with "stride one"). These are now called "level-1
BLAS" because they perform O(n) computation on O(n) data (when the vectors have size
n). The "1" refers to O(n) = O(n^1) computation.
We here list the vector-vector functionality that is of importance in this course, for the
case where we compute with double precision real-valued floating point numbers.

• DOT: Returns x^T y, the dot product of real-valued x and y.

◦ Traditional BLAS interface:
FUNCTION DDOT( N, X, INCX, Y, INCY )
◦ C:
double ddot_( int* n, double* x, int* incx, double* y, int* incy );
◦ BLIS typed interface for computing ρ := α x^T y + β ρ (details):
void bli_ddotxv( conj_t conjx, conj_t conjy, dim_t n,
                 double* alpha, double* x, inc_t incx, double* y, inc_t incy,
                 double* beta, double* rho );

• AXPY: Updates y := αx + y, the scaled vector addition of x and y.

◦ Traditional BLAS interface:
SUBROUTINE DAXPY( N, ALPHA, X, INCX, Y, INCY )
◦ C:
void daxpy_( int* n, double* alpha, double* x, int* incx,
             double* y, int* incy );
◦ BLIS typed interface (details):
void bli_daxpyf( conj_t conjx, dim_t n,
                 double* alpha, double* x, inc_t incx, double* y, inc_t incy );

• IAMAX: Returns the index of the element of x with maximal absolute value (indexing
starts at 1).

◦ Traditional BLAS interface:
FUNCTION IDAMAX( N, X, INCX )
◦ C:
int idamax_( int* n, double* x, int* incx );
◦ BLIS typed interface (details):
void bli_damaxv( dim_t n, double* x, inc_t incx, dim_t* index );

• NRM2: Returns ‖x‖_2, the 2-norm of real-valued x.

◦ Traditional BLAS interface:
FUNCTION DNRM2( N, X, INCX )
◦ C:
double dnrm2_( int* n, double* x, int* incx );
◦ BLIS typed interface (details):
void bli_dnormfv( dim_t n, double* x, inc_t incx, double* norm );
Versions of these interfaces for single precision real, single precision complex, and double
precision complex can be attained by replacing the appropriate D with S, C, or Z in the call
to the Fortran BLAS interface, or d with s, c, or z in the C and BLIS typed interfaces.
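
As a small illustration, the following sketch calls DDOT and DAXPY from C using the prototypes listed above (it assumes linking against a BLAS library, for example the compatibility layer provided by BLIS):

#include <stdio.h>

double ddot_( int* n, double* x, int* incx, double* y, int* incy );
void   daxpy_( int* n, double* alpha, double* x, int* incx, double* y, int* incy );

int main( void )
{
  int n = 3, incx = 1, incy = 1;
  double x[3] = { 1.0, 2.0, 3.0 }, y[3] = { 4.0, 5.0, 6.0 };
  double alpha = 2.0;

  double rho = ddot_( &n, x, &incx, y, &incy );   /* rho = x^T y = 32          */
  daxpy_( &n, &alpha, x, &incx, y, &incy );       /* y := 2 x + y = (6, 9, 12) */

  printf( "rho = %f, y = ( %f, %f, %f )\n", rho, y[0], y[1], y[2] );
  return 0;
}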

12.2.5.2 Level-2 BLAS (matrix-vector functionality)


In addition to providing portable high performance, a second purpose of the BLAS is to
improve readability of code. Many of the algorithms we have encountered are cast in terms
of matrix-vector operations like matrix-vector multiplication and the rank-1 update. By
writing the code in terms of routines that implement such operations, the code more closely
resembles the algorithm that is encoded. The level-2 BLAS [15] interface provides matrix-
vector functionality. The "level-2" captures that these routines perform O(n^2) computation (on O(n^2)
data), when the matrix involved is of size n × n.
We here list the matrix-vector operations that are of importance in this course, for
the case where we compute with double precision real-valued floating point numbers.

• GEMV: Updates y := α op(A) x + β y, where op(A) indicates whether A is to be transposed.

◦ Traditional BLAS interface:
SUBROUTINE DGEMV( TRANSA, M, N, ALPHA, A, LDA, X, INCX,
                  BETA, Y, INCY )
◦ C:
void dgemv_( char* transa, int* m, int* n,
             double* alpha, double* a, int* lda, double* x, int* incx,
             double* beta, double* y, int* incy );
◦ BLIS typed interface for computing y := α op_A(A) op_x(x) + β y (details):
void bli_dgemv( trans_t transa, conj_t conjx, dim_t m, dim_t n,
                double* alpha, double* A, inc_t rsa, inc_t csa,
                double* x, inc_t incx, double* beta, double* y, inc_t incy );

• SYMV: Updates y := αAx + βy, where A is symmetric and only the upper or lower
triangular part is stored.

◦ Traditional BLAS interface:
SUBROUTINE DSYMV( UPLO, N, ALPHA, A, LDA, X, INCX,
                  BETA, Y, INCY )
◦ C:
void dsymv_( char* uplo, int* n,
             double* alpha, double* a, int* lda, double* x, int* incx,
             double* beta, double* y, int* incy );
◦ BLIS typed interface (details):
void bli_dhemv( uplo_t uploa, conj_t conja, conj_t conjx, dim_t n,
                double* alpha, double* A, inc_t rsa, inc_t csa,
                double* x, inc_t incx, double* beta, double* y, inc_t incy );

• TRSV: Updates x := α op(A)^{-1} x, where op(A) indicates whether A is to be transposed
and A is either (unit) upper or lower triangular.

◦ Traditional BLAS interface:
SUBROUTINE DTRSV( UPLO, TRANSA, DIAG, N, A, LDA, X, INCX )
◦ C:
void dtrsv_( char* uplo, char* transa, char* diag, int* n,
             double* a, int* lda, double* x, int* incx )
◦ BLIS typed interface (details):
void bli_dtrsv( uplo_t uploa, trans_t transa, diag_t diag, dim_t n,
                double* alpha, double* A, inc_t rsa, inc_t csa,
                double* x, inc_t incx );

• GER: Updates A := α x y^T + A:

◦ Traditional BLAS interface:
SUBROUTINE DGER( M, N, ALPHA, X, INCX, Y, INCY, A, LDA )
◦ C:
void dger_( int* m, int* n, double* alpha, double* x, int* incx,
            double* y, int* incy, double* a, int* lda );
◦ BLIS typed interface (details):
void bli_dger( conj_t conjx, conj_t conjy, dim_t m, dim_t n,
               double* alpha, double* x, inc_t incx, double* y, inc_t incy,
               double* A, inc_t rsa, inc_t csa );

• SYR: Updates A := α x x^T + A, where A is symmetric and stored in only the upper or
lower triangular part of array A:

◦ Traditional BLAS interface:
SUBROUTINE DSYR( UPLO, N, ALPHA, X, INCX, A, LDA )
◦ C:
void dsyr_( char* uplo, int* n, double* alpha, double* x, int* incx,
            double* a, int* lda );
◦ BLIS typed interface (details):
void bli_dher( uplo_t uploa, conj_t conjx, dim_t n,
               double* alpha, double* x, inc_t incx, double* A, inc_t rsa, inc_t csa );

• SYR2: Updates A := α(x y^T + y x^T) + A, where A is symmetric and stored in only the
upper or lower triangular part of array A:

◦ Traditional BLAS interface:
SUBROUTINE DSYR2( UPLO, N, ALPHA, X, INCX, Y, INCY, A, LDA )
◦ C:
void dsyr2_( char* uplo, int* n, double* alpha, double* x, int* incx,
             double* y, int* incy, double* a, int* lda );
◦ BLIS typed interface (details):
void bli_dher2( uplo_t uploa, conj_t conjx, conj_t conjy, dim_t n,
                double* alpha, double* x, inc_t incx, double* y, inc_t incy,
                double* A, inc_t rsa, inc_t csa );
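
As a small illustration, the following sketch calls the traditional DGEMV interface from C for a 2 × 2 example, using the prototype given above (it assumes a linked BLAS library; the matrices and values are placeholder data):

#include <stdio.h>

void dgemv_( char* transa, int* m, int* n,
             double* alpha, double* a, int* lda, double* x, int* incx,
             double* beta, double* y, int* incy );

int main( void )
{
  int m = 2, n = 2, lda = 2, incx = 1, incy = 1;
  double A[4] = { 1.0, 3.0,     /* column 0 of A = [ 1 2; 3 4 ] */
                  2.0, 4.0 };   /* column 1 */
  double x[2] = { 1.0, 1.0 }, y[2] = { 0.0, 0.0 };
  double alpha = 1.0, beta = 0.0;
  char trans = 'N';

  dgemv_( &trans, &m, &n, &alpha, A, &lda, x, &incx, &beta, y, &incy );

  printf( "y = ( %f, %f )\n", y[0], y[1] );   /* expect ( 3.0, 7.0 ) */
  return 0;
}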

12.2.5.3 Level-3 BLAS (matrix-matrix functionality)


To attain high performance, computation has to be cast in terms of operations that reuse
data many times (for many floating point operations). Matrix-matrix operations that are
special cases of matrix-matrix multiplication fall into this category. The strategy is to implement
higher-level linear algebra functionality so that most computation is cast in
terms of matrix-matrix operations, by calling the level-3 BLAS routines [14]. The "level-3"
captures that these perform O(n^3) computation (on O(n^2) data), when the matrices involved
are of size n × n.
We here list the matrix-matrix operations that are of importance in this course, for
the case where we compute with double precision real-valued floating point numbers.

• GEMM: Updates C := α op_A(A) op_B(B) + β C, where op_A(A) and op_B(B) indicate whether
A and/or B are to be transposed.

◦ Traditional BLAS interface:
SUBROUTINE DGEMM( TRANSA, TRANSB, M, N, K, ALPHA, A, LDA, B, LDB,
                  BETA, C, LDC )
◦ C:
void dgemm_( char* transa, char* transb, int* m, int* n, int* k,
             double* alpha, double* a, int* lda, double* b, int* ldb,
             double* beta, double* c, int* ldc );
◦ BLIS typed interface for computing C := α op_A(A) op_B(B) + β C (details):
void bli_dgemm( trans_t transa, trans_t transb,
                dim_t m, dim_t n, dim_t k,
                double* alpha, double* A, inc_t rsa, inc_t csa,
                double* B, inc_t rsb, inc_t csb,
                double* beta, double* C, inc_t rsc, inc_t csc );

• SYMM: Updates C := αAB + βC or C := αBA + βC, where A is symmetric and only
the upper or lower triangular part is stored.

◦ Traditional BLAS interface:
SUBROUTINE DSYMM( SIDE, UPLO, M, N, ALPHA, A, LDA, B, LDB,
                  BETA, C, LDC )
◦ C:
void dsymm_( char* side, char* uplo, int* m, int* n,
             double* alpha, double* a, int* lda, double* b, int* ldb,
             double* beta, double* c, int* ldc );
◦ BLIS typed interface (details):
void bli_dhemm( side_t sidea, uplo_t uploa, conj_t conja, trans_t transb,
                dim_t m, dim_t n, double* alpha, double* A, inc_t rsa, inc_t csa,
                double* B, inc_t rsb, inc_t csb,
                double* beta, double* C, inc_t rsc, inc_t csc );

• TRSM: Updates B := α op(A)^{-1} B or B := α B op(A)^{-1}, where op(A) indicates whether
A is to be transposed and A is either (unit) upper or lower triangular.

◦ Traditional BLAS interface:
SUBROUTINE DTRSM( SIDE, UPLO, TRANSA, DIAG, M, N,
                  ALPHA, A, LDA, B, LDB )
◦ C:
void dtrsm_( char* side, char* uplo, char* transa, char* diag,
             int* m, int* n, double *alpha, double* a, int* lda,
             double* b, int* ldb )
◦ BLIS typed interface (details):
void bli_dtrsm( side_t sidea, uplo_t uploa, trans_t transa, diag_t diag,
                dim_t m, dim_t n, double* alpha, double* A, inc_t rsa, inc_t csa,
                double* B, inc_t rsb, inc_t csb );

• SYRK: Updates C := α A A^T + β C or C := α A^T A + β C, where C is symmetric and
stored in only the upper or lower triangular part of array C:

◦ Traditional BLAS interface:
SUBROUTINE DSYRK( UPLO, TRANS, N, K, ALPHA, A, LDA,
                  BETA, C, LDC )
◦ C:
void dsyrk_( char* uplo, char* trans, int* n, int* k,
             double* alpha, double* A, int* lda, double* beta, double* C, int* ldc );
◦ BLIS typed interface (details):
void bli_dherk( uplo_t uploc, trans_t transa, dim_t n, dim_t k,
                double* alpha, double* A, inc_t rsa, inc_t csa,
                double* beta, double* C, inc_t rsc, inc_t csc );

• SYR2K: Updates C := α(A B^T + B A^T) + β C or C := α(A^T B + B^T A) + β C, where C
is symmetric and stored in only the upper or lower triangular part of array C:

◦ Traditional BLAS interface:
SUBROUTINE DSYR2K( UPLO, TRANS, N, K, ALPHA, A, LDA, B, LDB,
                   BETA, C, LDC )
◦ C:
void dsyr2k_( char* uplo, char* trans, int* n, int* k,
              double* alpha, double* a, int* lda,
              double* b, int* ldb, double* beta, double* c, int* ldc );
◦ BLIS typed interface (details):
void bli_dher2k( uplo_t uploc, trans_t transab, dim_t n, dim_t k,
                 double* alpha, double* A, inc_t rsa, inc_t csa,
                 double* B, inc_t rsb, inc_t csb,
                 double* beta, double* C, inc_t rsc, inc_t csc );
These operations are often of direct importance to scientific or machine learning appli-
cations. In the next section, we show how higher-level linear algebra operations can be
cast in terms of this basic functionality.
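
For example, a thin C wrapper that uses DGEMM to compute C := AB + C for column-major matrices might look like the following sketch (it assumes the dgemm_ prototype listed above and linking against a BLAS or BLIS library):

void dgemm_( char* transa, char* transb, int* m, int* n, int* k,
             double* alpha, double* a, int* lda, double* b, int* ldb,
             double* beta, double* c, int* ldc );

/* C := A B + C, with A m x k, B k x n, C m x n, all in column-major order. */
void gemm_update( int m, int n, int k,
                  double *A, int ldA, double *B, int ldB, double *C, int ldC )
{
  char no_trans = 'N';
  double one = 1.0;
  dgemm_( &no_trans, &no_trans, &m, &n, &k,
          &one, A, &ldA, B, &ldB, &one, C, &ldC );
}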

12.3 Casting Computation in Terms of Matrix-Matrix Multiplication
12.3.1 Blocked Cholesky factorization
In the following video, we demonstrate how high-performance algorithms can be quickly
translated to code using the FLAME abstractions. It is a long video that was recorded in
a single sitting and has not been edited. (You need not watch the whole video if you "get
the point.") The purpose is to convey the importance of programming in a way that reflects
how one naturally derives and explains an algorithm. In the next unit, you will get to try
such implementation yourself, for the LU factorization.

YouTube: https://www.youtube.com/watch?v=PJ6ektH977o
• The notes to which this video refers can be found at http://www.cs.utexas.edu/users/
f ame/Notes/NotesOnCho .pdf.
WEEK 12. ATTAINING HIGH PERFORMANCE 609

• You can find all the implementations that are created during the video in the direc-
  tory Assignments/Week12/Chol/. They have been updated slightly since the video
  was created in 2011. In particular, the Makefile was changed so that now the BLIS
  implementation of the BLAS is used rather than OpenBLAS.

• The Spark tool that is used to generate code skeletons can be found at http://www.cs.
  utexas.edu/users/flame/Spark/.

• The following reference may be useful:

  ◦ A Quick Reference Guide to the FLAME API to BLAS functionality can be found
    at http://www.cs.utexas.edu/users/flame/pubs/FLAMEC-BLAS-Quickguide.pdf.

  ◦ [41] Field Van Zee, libflame: The Complete Reference, http://www.lulu.com, 2009.

12.3.2 Blocked LU factorization

Homework 12.3.2.1 Consider the LU factorization A = LU , where A is an m × m ma-
trix, discussed in Subsection 5.1.1 and subsequent units. The right-looking algorithm for
computing it is given by (";" separates the rows of a partitioned matrix):

    A = LU(A)
    A → ( AT L AT R ; ABL ABR ),  where AT L is 0 × 0
    while n(AT L) < n(A)
        ( AT L AT R ; ABL ABR ) → ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
        a21 := a21 / α11
        A22 := A22 − a21 a12^T
        ( AT L AT R ; ABL ABR ) ← ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
    endwhile

Figure 12.3.2.1 Right-looking LU factorization algorithm.

For simplicity, we do not consider pivoting yet.

• How many floating point operations (flops) are required to compute this operation?

• If the matrix is initially stored in main memory and is written back to main memory,
how many reads and writes (memops) are required? (Give a simple lower bound.)

• What is the ratio of flops to memops for this lower bound?

Solution.

• How many floating point operations are required to compute this operation?
  From Homework 5.2.2.1, we know that this right-looking algorithm requires (2/3)m^3 flops.

• If the matrix is initially stored in main memory and is written back to main memory,
  how many reads and writes (memops) are required? (Give a simple lower bound.)
  We observe that every element of A must be read and written. Thus, a lower bound is
  2m^2 memops.

• What is the ratio of flops to memops?

      ( (2/3)m^3 flops ) / ( 2m^2 memops ) ≈ (m/3) flops/memops.
The insight is that the ratio of computation to memory operations, m/3, does not preclude
high performance. However, the algorithm in Figure 12.3.2.1 casts most computation in
terms of rank-1 updates and hence will not achieve high performance. The reason is that
the performance of each of those individual updates is limited by the unfavorable ratio of
floating point operations to memory operations.
Just like it is possible to rearrange the computations for a matrix-matrix multiplication
in order to reuse data that is brought into caches, one could carefully rearrange the com-
putations needed to perform an LU factorization. While one can do so for an individual
operation, think of the effort that this would involve for every operation in (dense) linear
algebra. Not only that, this effort would likely have to be repeated for new architectures as
they become available.
Remark 12.3.2.2 The key observation is that if we can cast most computation for the
LU factorization in terms of matrix-matrix operations (level-3 BLAS functionality) and we
link an implementation to a high-performance library with BLAS functionality, then we can
achieve portable high performance for our LU factorization algorithm.
Let us examine how to cast most computation for the LU factorization in terms of matrix-
matrix operations. You may want to start by reviewing the discussion of how to derive the
(unblocked) algorithm in Figure 12.3.2.1, in Subsection 5.2.2, which we repeat first. The
derivation of the corresponding blocked algorithm follows.

Unblocked algorithm. Partition

    A → ( α11 a12^T ; a21 A22 ),   L → ( 1 0 ; l21 L22 ),   and   U → ( υ11 u12^T ; 0 U22 ).

Plug the partitioned matrices into A = LU :

    ( α11 a12^T ; a21 A22 ) = ( 1 0 ; l21 L22 ) ( υ11 u12^T ; 0 U22 )
                            = ( υ11 u12^T ; υ11 l21   l21 u12^T + L22 U22 ).

Equate the submatrices and manipulate:

• α11 := υ11 = α11 (no-op).

• a12^T := u12^T = a12^T (no-op).

• a21 := l21 = a21 / υ11 = a21 / α11 .

• A22 := A22 − l21 u12^T = A22 − a21 a12^T .

This derivation yields the algorithm in Figure 12.3.2.1.

Blocked algorithm. Partition

    A → ( A11 A12 ; A21 A22 ),   L → ( L11 0 ; L21 L22 ),   and   U → ( U11 U12 ; 0 U22 ),

where A11 , L11 , and U11 are b × b. Plug the partitioned matrices into A = LU :

    ( A11 A12 ; A21 A22 ) = ( L11 0 ; L21 L22 ) ( U11 U12 ; 0 U22 )
                          = ( L11 U11   L11 U12 ; L21 U11   L21 U12 + L22 U22 ).

Equate the submatrices and manipulate:

• A11 := L\U 11 (overwrite A11 with its LU factorization).

• A12 := U12 = L11^{-1} A12 (triangular solve with multiple right-hand sides).

• A21 := L21 = A21 U11^{-1} (triangular solve with multiple right-hand sides).

• A22 := A22 − L21 U12 = A22 − A21 A12 (matrix-matrix multiplication).

This derivation yields the algorithm in Figure 12.3.2.3.
    A = LU-blk(A, b)
    A → ( AT L AT R ; ABL ABR ),  where AT L is 0 × 0
    while n(AT L) < n(A)
        choose block size b
        ( AT L AT R ; ABL ABR ) → ( A00 A01 A02 ; A10 A11 A12 ; A20 A21 A22 ),
            where A11 is (at most) b × b
        A11 := L\U 11 = LU(A11)          LU factorization
        A12 := U12 = L11^{-1} A12        TRSM
        A21 := A21 U11^{-1}              TRSM
        A22 := A22 − A21 A12             GEMM
        ( AT L AT R ; ABL ABR ) ← ( A00 A01 A02 ; A10 A11 A12 ; A20 A21 A22 )
    endwhile

Figure 12.3.2.3 Blocked right-looking LU factorization algorithm.

Let us comment on each of the operations in Figure 12.3.2.3.

• A11 := L\U 11 = LU(A11) indicates that we need to compute the LU factorization
  of A11, overwriting that matrix with the unit lower triangular matrix L11 and upper
  triangular matrix U11. Since A11 is b × b and we usually take b ≪ m, not much of
  the total computation is in the operation, and it can therefore be computed with, for
  example, an unblocked algorithm. Also, if it is small enough, it will fit in one of the
  smaller caches and hence the memory overhead of performing the rank-1 updates will
  not be as dramatic.

• A12 := U12 = L11^{-1} A12 is an instance of solving LX = B with unit lower triangular
  matrix L for X. This is referred to as a triangular solve with multiple right-
  hand sides (TRSM) since we can partition B and X by columns so that

      L ( x0 x1 · · · ) = ( L x0  L x1  · · · ) = ( b0 b1 · · · ),

  and hence for each column of the right-hand side, bj, we need to solve a triangular
  system, L xj = bj.

• A21 := L21 = A21 U11^{-1} is an instance of solving XU = B, where U is upper triangular.
  We notice that if we partition X and B by rows, then

      ( x̃0^T ; x̃1^T ; · · · ) U = ( x̃0^T U ; x̃1^T U ; · · · ) = ( b̃0^T ; b̃1^T ; · · · ),

  and we recognize that each row, x̃i^T, is computed from x̃i^T U = b̃i^T or, equivalently,
  by solving U^T (x̃i^T)^T = (b̃i^T)^T. We observe it is also a triangular solve with multiple
  right-hand sides (TRSM).

• The update A22 := A22 − A21 A12 is an instance of C := αAB + βC, where the k (inner)
  size is small. This is often referred to as a rank-k update. (A plain-BLAS sketch of the
  resulting blocked algorithm is given after this list.)
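
To connect the operations in Figure 12.3.2.3 to the BLAS routines listed earlier this week, here is a plain C sketch of our own (illustration only: it assumes a Fortran BLAS is linked in, performs no pivoting, and is not how the libflame exercises below are structured; the names lu_unb and lu_blk are ours).

/* Fortran BLAS prototypes (assumed provided by whatever BLAS is linked in). */
void dtrsm_( char* side, char* uplo, char* transa, char* diag,
             int* m, int* n, double* alpha, double* a, int* lda,
             double* b, int* ldb );
void dgemm_( char* transa, char* transb, int* m, int* n, int* k,
             double* alpha, double* a, int* lda, double* b, int* ldb,
             double* beta, double* c, int* ldc );

/* Unblocked right-looking LU factorization (no pivoting) of the n x n block
   stored in column-major order in A with leading dimension lda. */
static void lu_unb( int n, double* A, int lda )
{
  for ( int j = 0; j < n; j++ ) {
    for ( int i = j + 1; i < n; i++ )                 /* a21 := a21 / alpha11     */
      A[ i + j * lda ] /= A[ j + j * lda ];
    for ( int jj = j + 1; jj < n; jj++ )              /* A22 := A22 - a21 * a12^T */
      for ( int i = j + 1; i < n; i++ )
        A[ i + jj * lda ] -= A[ i + j * lda ] * A[ j + jj * lda ];
  }
}

/* Blocked right-looking LU factorization (no pivoting) with block size b. */
void lu_blk( int m, double* A, int lda, int b )
{
  double one = 1.0, minus_one = -1.0;

  for ( int i = 0; i < m; i += b ) {
    int ib   = ( m - i < b ) ? m - i : b;   /* size of A11         */
    int rest = m - i - ib;                  /* row/col size of A22 */

    double* A11 = &A[ i      +   i        * lda ];
    double* A12 = &A[ i      + ( i + ib ) * lda ];
    double* A21 = &A[ i + ib +   i        * lda ];
    double* A22 = &A[ i + ib + ( i + ib ) * lda ];

    lu_unb( ib, A11, lda );                                    /* A11 := L\U11 */

    if ( rest > 0 ) {
      /* A12 := L11^{-1} A12: TRSM with unit lower triangular L11, from the left. */
      dtrsm_( "L", "L", "N", "U", &ib, &rest, &one, A11, &lda, A12, &lda );
      /* A21 := A21 U11^{-1}: TRSM with upper triangular U11, from the right. */
      dtrsm_( "R", "U", "N", "N", &rest, &ib, &one, A11, &lda, A21, &lda );
      /* A22 := A22 - A21 A12: rank-ib update via GEMM. */
      dgemm_( "N", "N", &rest, &rest, &ib,
              &minus_one, A21, &lda, A12, &lda, &one, A22, &lda );
    }
  }
}

As the analysis in the next homework shows, for b ≪ m almost all of the flops are performed by the dgemm_ call.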

In the following homework, you will determine that most computation is now cast in terms
of the rank-k update (matrix-matrix multiplication).
Homework 12.3.2.2 For the algorithm in Figure 12.3.2.3, analyze the (approximate) num-
ber of flops that are performed by the LU factorization of A11 and updates of A21 and A12 ,
aggregated over all iterations. You may assume that the size of A, m, is an integer multiple
of the block size b, so that m = Kb. Next, determine the ratio of the flops spent in the
indicated operations to the total flops.


Solution. During the kth iteration, when A00 is (kb) × (kb), we perform the following
number of flops in these operations:

• A11 := LU(A11): approximately (2/3)b^3 flops.

• A12 := L11^{-1} A12: During the kth iteration, A00 is (kb) × (kb), A11 is b × b, and A12 is
  b × ((K − k − 1)b). (It helps to draw a picture.) Hence, the total computation spent
  in the operation is approximately ((K − k − 1)b)b^2 = (K − k − 1)b^3 flops.

• A21 := A21 U11^{-1}: During the kth iteration, A00 is (kb) × (kb), A11 is b × b, and A21 is
  ((K − k − 1)b) × b. Hence, the total computation spent in the operation is approximately
  ((K − k − 1)b)b^2 = (K − k − 1)b^3 flops.

If we sum this over all K iterations, we find that the total equals

    Σ_{k=0}^{K−1} [ (2/3)b^3 + 2(K − k − 1)b^3 ]
        = [ (2/3)K + 2 Σ_{k=0}^{K−1} (K − k − 1) ] b^3
        = [ (2/3)K + 2 Σ_{j=0}^{K−1} j ] b^3
        ≈ [ (2/3)K + K^2 ] b^3
        = (2/3)b^2 m + b m^2 .

Thus, the ratio of time spent in these operations to the total cost of the LU factorization is

    ( (2/3)b^2 m + b m^2 ) / ( (2/3)m^3 ) = (b/m)^2 + 3b/(2m).

From this last exercise, we learn that if b is fixed and m gets large, essentially all compu-
tation is in the update A22 := A22 − A21 A12, which we know can attain high performance.
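For example (with illustrative numbers of our own choosing): for b = 128 and m = 4096 the
ratio equals (128/4096)^2 + 3 · 128/(2 · 4096) ≈ 0.001 + 0.047 ≈ 0.048, so that less than five
percent of the flops are performed outside of the rank-k update.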
Homework 12.3.2.3 In directory Assignments/Week12/matlab/, you will find the following
files:

• LU_right_looking.m: an implementation of the unblocked right-looking LU factor-
  ization.

• LU_blk_right_looking.m: A code skeleton for a function that implements a blocked
  LU factorization.

      [ A_out ] = LU_blk_right_looking( A, nb_alg )

  performs a blocked LU factorization with block size b equal to nb_alg, overwriting
  A_out with L and U .

• time_LU.m: A test routine that tests the various implementations.



These resources give you the tools to implement and test the blocked LU factorization.

1. Translate the blocked LU factorization in Figure 12.3.2.3 into code by completing
   LU_blk_right_looking.m.

2. Test your implementation by executing time_LU.m.

3. Once you get the right answer, try different problem sizes to see how the different
implementations perform. Try m = 500, 1000, 1500, 2000.

On the discussion forum, discuss what you think are some of the reasons behind the
performance you observe.
Hint. You can extract L11 and U11 from A11 with

    L11 = tril( A11, -1 ) + eye( size( A11 ) );
    A11 = triu( A11 );

Don't invert L11 and U11. In the command window execute

    help /
    help \

to read up on how those operators allow you to solve with a matrix.


Solution.

• LU_blk_right_looking.m.

Notice that your blocked algorithm gets MUCH better performance than does the unblocked
algorithm. However, the native LU factorization of Matlab does much better yet. The call
lu( A ) by Matlab links to a high performance implementation of the LAPACK interface,
which we will discuss later.
Homework 12.3.2.4 For this exercise you will implement the unblocked and blocked al-
gorithms in C. To do so, you need to install the BLAS-like Library Instantiation Software
(BLIS) (see Subsubsection 0.2.4.1) and the libflame library (see Subsubsection 0.2.4.2).
Even if you have not previously programmed in C, you should be able to follow along.
In Assignments/Week12/LU/FLAMEC/, you will find the following files:
• LU_unb_var5.c : a skeleton for implementing the unblocked right-looking LU factor-
ization (which we often call "Variant 5").

• LU_blk_var5.c : a skeleton for implementing the blocked right-looking LU factoriza-
  tion.

• REF_LU.c: A simple triple-nested loop implementation of the algorithm.

• driver.c: A "driver" for testing various implementations. This driver creates matrices
of different sizes and checks the performance and correctness of the result.

• Makefile: A "makefile" that compiles, links, and executes. To learn about Makefiles,
  you may want to check Wikipedia (search for "Make" and choose "Make (software)").

Our libflame library allows a simple translation from algorithms that have been typeset
with the FLAME notation (like those in Figure 12.3.2.1 and Figure 12.3.2.3). The BLAS
functionality discussed earlier in this week is made available through an interface that hides
details like size, stride, etc. A Quick Reference Guide for this interface can be found at
http://www.cs.utexas.edu/users/flame/pubs/FLAMEC-BLAS-Quickguide.pdf.
With these resources, complete

• LU_unb_var5.c and

• LU_blk_var5.c.

You can skip the testing of LU_blk_var5 by changing the appropriate TRUE to FALSE in
driver.c.
Once you have implemented one or both, you can test by executing make test in a
terminal session (provided you are in the correct directory). This will compile and link,
yielding the executable driver.x, and will then execute this driver. The output is redirected
to the file output.m. Here is typical output:

data_REF( 10, 1:2 ) = [ 2000 2.806644e+00 ];


data_FLAME( 10, 1:2 ) = [ 2000 4.565624e+01 ];
data_unb_var5( 10, 1:3 ) = [ 2000 3.002356e+00 4.440892e-16];
data_blk_var5( 10, 1:3 ) = [ 2000 4.422223e+01 7.730705e-12];

• The first line reports the performance of the reference implementation (a simple triple-
nested loop implementation of LU factorization without pivoting). The last number is
the rate of computation in GFLOPS.

• The second line reports the performance of the LU factorization without pivoting that
is part of the libflame library. Again, the last number reports the rate of computation
in GFLOPS.

• The last two lines report the rate of performance (middle number) and difference of the
result relative to the reference implementation (last number), for your unblocked and
blocked implementations. It is important to check that the last number is small. For
larger problem sizes, the reference implementation is not executed, and this difference
is not relevant.
The fact that the blocked version shows a larger difference than does the unblocked
is not significant here. Both have roundoff error in the result (as does the reference
implementation) and we cannot tell which is more accurate.

You can cut and paste the output.m file into Matlab to see the performance data presented
as a graph.

Ponder This 12.3.2.5 In Subsection 5.5.1, we discuss five unblocked algorithms for com-
puting LU factorization. Can you derive the corresponding blocked algorithms? You can use
the materials in Assignments/Week12/LU/FLAMEC/ to implement and test all five.

12.3.3 Other high-performance dense linear algebra algorithms


Throughout this course, we have pointed to papers that discuss the high-performance im-
plementation of various operations. Here we review some of these.

12.3.3.1 High-performance QR factorization


We saw in Week 3 that the algorithm of choice for computing the QR factorization is based
on Householder transformations. The reason is that this casts the computation in terms of
the application of unitary transformations, which do not amplify error. In Subsection 3.4.1,
we discussed how the Householder QR factorization algorithm can be cast in terms of matrix-
matrix operations.
An important point to note is that in order to cast the computation in terms of matrix-
matrix multiplication, one has to form the "block Householder transformation"

    I + U T^{-1} U^H .

When the original matrix is m × n, this requires O(mb^2) floating point operations to be
performed to compute the upper triangular matrix T in each iteration, which adds O(mnb)
to the total cost of the QR factorization. This is computation that an unblocked algorithm
does not perform. In return, the bulk of the computation is performed much faster, which
in the balance benefits performance. Details can be found in, for example,

• [23] Thierry Joffrain, Tze Meng Low, Enrique S. Quintana-Orti, Robert van de Geijn,
Field G. Van Zee, Accumulating Householder transformations, revisited, ACM Trans-
actions on Mathematical Software, Vol. 32, No 2, 2006.

Casting the Rank-Revealing QR factorization, discussed in Subsection 4.5.2, in terms


of matrix-matrix multiplications is trickier. In order to determine what column should be
swapped at each step, the rest of the matrix has to be at least partially updated. One
solution to this is to use a randomized algorithm, as discussed in

• [25] Per-Gunnar Martinsson, Gregorio Quintana-Orti, Nathan Heavner, Robert van


de Geijn, Householder QR Factorization With Randomization for Column Pivoting
(HQRRP), SIAM Journal on Scientific Computing, Vol. 39, Issue 2, 2017.

12.3.3.2 Optimizing reduction to condensed form


In Subsection 10.3.1 and Subsection 11.2.3, we discussed algorithms for reducing a matrix to
tridiagonal and bidiagonal form, respectively. These are special cases of the reduction of a
matrix to condensed form. The algorithms in those sections cast most of the computation in
terms of matrix-vector multiplication and rank-1 or rank-2 updates, which are matrix-vector
operations that do not attain high performance. In enrichments in those chapters, we point
to papers that cast some of the computation in terms of matrix-matrix multiplication. Here,
we discuss the basic issues.
When computing the LU, Cholesky, or QR factorization, it is possible to factor a panel
of columns before applying an accumulation of the encountered transformations to
the rest of the matrix. It is this that allows the computation to be mostly cast in terms of
matrix-matrix operations. When computing the reduction to tridiagonal or bidiagonal form,
this is not possible. The reason is that if we have just computed a unitary transformation,
this transformation must be applied to the rest of the matrix both from the left and from
the right. What this means is that the next transformation to be computed depends on an
update that involves the rest of the matrix. This in turn means that inherently a matrix-
vector operation (involving the "rest of the matrix") must be performed at every step at a
cost of O(m^2) computation per iteration (if the matrix is m × m). The insight is that O(m^3)
computation is cast in terms of matrix-vector operations, which is of the same order as the
total computation.
While this is bad news, there is still a way to cast about half the computation in terms of
matrix-matrix multiplication for the reduction to tridiagonal form. Notice that this means
the computation is sped up by at most a factor two, since even if the part that is cast in terms
of matrix-matrix multiplication takes no time at all relative to the rest of the computation,
this only cuts the time to completion in half.
The reduction to bidiagonal form is trickier yet. It requires the fusing of a matrix-vector
multiplication with a rank-1 update in order to reuse data that is already in cache.
Details can be found in, for example,
• [44] Field G. Van Zee, Robert A. van de Geijn, Gregorio Quintana-Ortí, G. Joseph
Elizondo, Families of Algorithms for Reducing a Matrix to Condensed Form, ACM
Transactions on Mathematical Software (TOMS) , Vol. 39, No. 1, 2012.

12.3.3.3 Optimizing the implicitly shifted QR algorithm


Optimizing the QR algorithm for computing the Spectral Decomposition or Singular Value
Decomposition gets even trickier. Part of the cost is in the reduction to condensed form,
which we already have noted exhibits limited opportunity for casting computation in terms
of matrix-matrix multiplication. Once the algorithm proceeds to the implicitly shifted QR
algorithm, most of the computation is in the accumulation of the eigenvectors or singular
vectors. In other words, it is in the application of the Givens’ rotations from the right to the
columns of a matrix Q in which the eigenvectors are being computed. Let us look at one
such application to two columns, qi and qj :
A B
1 2 1 2 “ ≠‡ 1 2
qi qj := qi qj = “qi + ‡qj ≠‡qi + “qj .
‡ “

The update of each column is a vector-vector operation, requiring O(m) computation with
O(m) data (if the vectors are of size m). We have reasoned that for such an operation it is
the cost of accessing memory that dominates. In

• [42] Field G. Van Zee, Robert A. van de Geijn, Gregorio Quintana-Ortí, Restructuring
the Tridiagonal and Bidiagonal QR Algorithms for Performance, ACM Transactions
on Mathematical Software (TOMS), Vol. 40, No. 3, 2014.
we discuss how the rotations from many Francis Steps can be saved up and applied to Q
at the same time. By carefully orchestrating this so that data in cache can be reused, the
performance can be improved to rival that attained by a matrix-matrix operation.
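
To make the memory-bound nature of a single such application concrete, here is a straightforward C sketch of our own (deliberately not cache-optimized; the function name is ours):

/* ( qi qj ) := ( qi qj ) ( gamma  -sigma ; sigma  gamma ):
   each element of qi and qj is read and written once, while only a handful of
   flops are performed per element, so memory traffic dominates. */
void apply_givens_from_right( int m, double gamma, double sigma,
                              double* qi, double* qj )
{
  for ( int k = 0; k < m; k++ ) {
    double tmp = qi[ k ];
    qi[ k ] =  gamma * tmp + sigma * qj[ k ];
    qj[ k ] = -sigma * tmp + gamma * qj[ k ];
  }
}

The restructuring described in [42] gains its speed by applying many such rotations while the columns still reside in cache.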

12.3.4 Libraries for higher level dense linear algebra functionality


The Linear Algebra Package (LAPACK) is the most widely used interface for higher level
linear algebra functionality like LU, Cholesky, and QR factorization and related solvers
as well as eigensolvers. LAPACK was developed as an open source linear algebra software
library which was then embraced as an implementation and/or interface by scientific software
libraries that are supported by vendors.
A number of open source and vendor high-performance implementations of the LAPACK
interface are available. For example,

• The original open source LAPACK implementation [1] available from netlib at
  http://www.netlib.org/lapack/.

• Our libflame library is an open source implementation of LAPACK functionality that
  leverages a programming style that is illustrated in Subsection 12.3.1 and Subsec-
  tion 12.3.2. It includes an LAPACK-compatible interface. It underlies AMD's Opti-
  mizing CPU Libraries (AOCL).
• Arm’s Arm Performance Libraries.
• Cray’s Cray Scientific and Math Libraries (CSML).
• IBM’s Engineering and Scientific Subroutine Library (ESSL).
• Intel’s Math Kernels Library (MKL).
• NVIDIA’s cuBLAS.
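
As a concrete illustration of how this interface is typically called from C, the following sketch of our own factors a small matrix with dgetrf_, LAPACK's LU factorization with partial pivoting. (The matrix and the link line, e.g. -llapack -lblas, are illustrative and depend on the installation.)

#include <stdio.h>

/* LAPACK's LU factorization with partial pivoting (Fortran interface). */
void dgetrf_( int* m, int* n, double* a, int* lda, int* ipiv, int* info );

int main( void )
{
  /* A 3 x 3 matrix stored in column-major order. */
  int m = 3, n = 3, lda = 3, info;
  int ipiv[ 3 ];
  double A[ 9 ] = { 4.0, 2.0, 1.0,     /* column 0 */
                    2.0, 5.0, 2.0,     /* column 1 */
                    1.0, 2.0, 6.0 };   /* column 2 */

  dgetrf_( &m, &n, A, &lda, ipiv, &info );

  if ( info == 0 )
    printf( "LU factorization succeeded; U(0,0) = %f\n", A[ 0 ] );
  else
    printf( "dgetrf_ returned info = %d\n", info );
  return 0;
}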

12.3.5 Sparse algorithms


Iterative methods inherently perform few floating point operations relative to the mem-
ory operations that need to be performed. For example, the Conjugate Gradient Method
discussed in Section 8.3 typically spends most of its time in a sparse matrix-vector multipli-
cation, where only two floating point operations are performed per nonzero element in the
matrix. As a result, attaining high performance with such algorithms is inherently difficult.
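
To see where that ratio comes from, consider a sparse matrix-vector multiplication with the matrix stored in compressed sparse row (CSR) format. The sketch below is our own illustration: each nonzero contributes one multiply and one add (two flops), yet its value, its column index, and an entry of x all have to be read.

/* y := A x for an m x m sparse matrix A in CSR format:
   row_ptr has m+1 entries, col_ind and val have one entry per nonzero. */
void csr_matvec( int m, const int* row_ptr, const int* col_ind,
                 const double* val, const double* x, double* y )
{
  for ( int i = 0; i < m; i++ ) {
    double sum = 0.0;
    for ( int p = row_ptr[ i ]; p < row_ptr[ i + 1 ]; p++ )
      sum += val[ p ] * x[ col_ind[ p ] ];   /* two flops per nonzero */
    y[ i ] = sum;
  }
}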
A (free) text that gives a nice treatment of high performance computing, including sparse
methods, is
• [17] Victor Eijkhout, Introduction to High-Performance Scientific Computing, lulu.com.
http://pages.tacc.utexas.edu/~eijkhout/istc/istc.htm

12.4 Enrichments
12.4.1 BLIS and beyond
One of the strengths of the approach to implementing matrix-matrix multiplication described
in Subsection 12.2.4 is that it can be applied to related operations. A recent talk discusses
some of these.
• Robert van de Geijn and Field Van Zee, "The BLIS Framework: Experiments in
  Portability," SIAM Conference on Parallel Processing for Scientific Computing (PP20).
  SIAM Activity Group on Supercomputing Best Paper Prize talk. 2020.
  https://www.youtube.com/watch?v=1biep1Rh_08

12.4.2 Optimizing matrix-matrix multiplication - We’ve got a MOOC for that!
We reiterate that we offer a Massive Open Online Course titled "LAFF-On Programming
for High Performance" in which we use matrix-matrix multiplication as an example through
which we illustrate how high performance can be achieved on modern CPUs.
• [39] "LAFF-On Programming for High Performance", a four week Massive Open Online
Course offered on the edX platform (free for auditors).

12.4.3 Deriving blocked algorithms - We’ve got a MOOC for that too!
Those who delve deeper into how to achieve high performance for matrix-matrix multiplica-
tion find out that it is specifically a rank-k update, the case of C := αAB + βC where the k
(inner) size is small, that achieves high performance. The blocked LU factorization that we
discussed in Subsection 5.5.2 takes advantage of this by casting most of its computation in
the matrix-matrix multiplication A22 := A22 ≠ A21 A12 . A question becomes: how do I find
blocked algorithms that cast most computation in terms of a rank-k updates?
The FLAME notation that we use in this course has made it possible for us to develop a
systematic methodology for discovering (high-performance) algorithms. This was published
in
• [4] Paolo Bientinesi, John A. Gunnels, Margaret E. Myers, Enrique S. Quintana-Orti,
Robert A. van de Geijn, The science of deriving dense linear algebra algorithms, ACM
Transactions on Mathematical Software (TOMS), 2005.
and various other publications that can be found on the FLAME project publication web
site http://www.cs.utexas.edu/~flame/web/FLAMEPublications.html.
You can learn these techniques, which derive algorithms hand-in-hand with their proofs
of correctness, through our Massive Open Online Course
• [28] "LAFF-On Programming for Correctness", a six week Massive Open Online Course
offered on the edX platform (free for auditors).

12.4.4 Parallel high-performance algorithms


Modern processors achieve high performance by extracting parallelism at the instruction level
and by incorporating multiple cores that can collaborate to compute an operation. Beyond
that, parallel supercomputers consist of computational nodes, each of which consists of
multiple processing cores and local memory, that can communicate through a communication
network.
Many of the issues encountered when mapping (dense) linear algebra algorithms to such
distributed memory computers can be illustrated by studying matrix-matrix multiplication.
A classic paper is

• [40] Robert van de Geijn and Jerrell Watts, SUMMA: Scalable Universal Matrix Mul-
tiplication Algorithm, Concurrency: Practice and Experience, Volume 9, Number 4,
1997.

The techniques described in that paper are generalized in the more recent paper

• [34] Martin D. Schatz, Robert A. van de Geijn, and Jack Poulson, Parallel Matrix
Multiplication: A Systematic Journey, SIAM Journal on Scientific Computing, Volume
38, Issue 6, 2016.

12.5 Wrap Up
12.5.1 Additional homework
No additional homework yet.

12.5.2 Summary
We have noticed that typos are uncovered relatively quickly once we release the material.
Because we "cut and paste" the summary from the materials in this week, we are delaying
adding the summary until most of these typos have been identified.
Appendix A

Are you ready?

We have created a document "Advanced Linear Algebra: Are You Ready?" that a learner
can use to self-assess their readiness for a course on numerical linear algebra.

Appendix B

Notation

B.0.1 Householder notation


Alston Householder introduced the convention of labeling matrices with upper case Roman
letters (A, B, etc.), vectors with lower case Roman letters (a, b, etc.), and scalars with lower
case Greek letters (α, β, etc.). When exposing columns or rows of a matrix, the columns of
that matrix are usually labeled with the corresponding Roman lower case letter, and the
individual elements of a matrix or vector are usually labeled with "the corresponding Greek
lower case letter," which we can capture with the triplets {A, a, α}, {B, b, β}, etc.:

    A = ( a0 a1 · · · an−1 ) = ( α0,0     α0,1     · · ·   α0,n−1   ;
                                 α1,0     α1,1     · · ·   α1,n−1   ;
                                   ⋮        ⋮                ⋮      ;
                                 αm−1,0   αm−1,1   · · ·   αm−1,n−1 )

and

    x = ( χ0 ; χ1 ; · · · ; χm−1 ),

where α and χ are the lower case Greek letters "alpha" and "chi," respectively. You will
also notice that in this course we start indexing at zero. We mostly adopt this convention
(exceptions include i, j, p, m, n, and k, which usually denote integer scalars).

Appendix C

Knowledge from Numerical Analysis

Typically, an undergraduate numerical analysis course is considered a prerequisite for a


graduate level course on numerical linear algebra. There are, however, relatively few concepts
from such a course that are needed to be successful in this course. In this appendix, we very
briefly discuss some of these concepts.

C.0.1 Cost of basic linear algebra operations


C.0.2 Catastrophic cancellation
Recall that if

    χ² + βχ + γ = 0

then the quadratic formula gives the largest root of this quadratic equation:

    χ = ( −β + √(β² − 4γ) ) / 2 .

Example C.0.2.1 We use the quadratic formula in the exact order indicated by the paren-
theses in

    χ = [ ( −β + [ √( [β²] − [4γ] ) ] ) / 2 ],

truncating every expression within square brackets to three significant digits, to solve

    χ² + 25χ + 1 = 0   (so that β = 25 and γ = 1):

    χ = [ ( −25 + [ √( [25²] − [4] ) ] ) / 2 ]
      = [ ( −25 + [ √( 625 − 4 ) ] ) / 2 ]
      = [ ( −25 + [ √621 ] ) / 2 ]
      = [ ( −25 + 24.9 ) / 2 ]
      = [ −0.1 / 2 ]
      = −0.05 .

Now, if you do this to the full precision of a typical calculator, the answer is instead
approximately −0.040064. The relative error we incurred is, approximately, 0.01/0.04 = 0.25.


What is going on here? The problem comes from the fact that there is error in the 24.9
that is encountered after the square root is taken. Since that number is close in magnitude,
but of opposite sign to the −25 to which it is added, the result of −25 + 24.9 is mostly error.
This is known as catastrophic cancelation: adding two nearly equal numbers of opposite
sign, at least one of which has some error in it related to roundoff, yields a result with large
relative error.
Now, one can use an alternative formula to compute the root:
    χ = ( −β + √(β² − 4γ) ) / 2
      = ( ( −β + √(β² − 4γ) ) / 2 ) × ( ( −β − √(β² − 4γ) ) / ( −β − √(β² − 4γ) ) ),

which yields

    χ = 2γ / ( −β − √(β² − 4γ) ) .
Carrying out the computations, rounding intermediate results, yields −.0401. The relative
error is now 0.00004/0.040064 ≈ .001. It avoids catastrophic cancellation because now the
two numbers of nearly equal magnitude are added instead. □
Remark C.0.2.2 The point is: if possible, avoid creating small intermediate results that
amplify into a large relative error in the final result.
Notice that in this example it is not inherently the case that a small relative change in
the input is amplified into a large relative change in the output (as is the case when solving a
linear system with a poorly conditioned matrix). The problem is with the standard formula
that was used. Later we will see that this is an example of an unstable algorithm.
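
The same phenomenon is easy to reproduce in code. The following sketch of our own evaluates both formulas in single precision (the coefficients are chosen purely for illustration), together with a double precision value for comparison.

#include <stdio.h>
#include <math.h>

int main( void )
{
  /* chi^2 + beta chi + gamma = 0 with beta = 10000, gamma = 1 (illustrative). */
  float beta = 10000.0f, gamma = 1.0f;

  /* Standard formula: -beta + sqrt(...) suffers catastrophic cancellation. */
  float x_standard = ( -beta + sqrtf( beta * beta - 4.0f * gamma ) ) / 2.0f;

  /* Rewritten formula: the two numbers of nearly equal magnitude are added. */
  float x_rewritten = 2.0f * gamma / ( -beta - sqrtf( beta * beta - 4.0f * gamma ) );

  /* Double precision value of the same root, for comparison. */
  double x_ref = 2.0 / ( -10000.0 - sqrt( 10000.0 * 10000.0 - 4.0 ) );

  printf( "standard  formula (float): %12.4e\n", x_standard );
  printf( "rewritten formula (float): %12.4e\n", x_rewritten );
  printf( "double precision         : %12.4e\n", x_ref );
  return 0;
}

With IEEE single precision arithmetic the standard formula loses essentially all significance in the subtraction (it typically prints 0.0), while the rewritten formula agrees with the double precision value to single precision accuracy.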
Appendix D

GNU Free Documentation License

Version 1.3, 3 November 2008


Copyright © 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc. <http://www.
fsf.org/>
Everyone is permitted to copy and distribute verbatim copies of this license document,
but changing it is not allowed.

0. PREAMBLE. The purpose of this License is to make a manual, textbook, or other


functional and useful document “free” in the sense of freedom: to assure everyone the effective
freedom to copy and redistribute it, with or without modifying it, either commercially or
noncommercially. Secondarily, this License preserves for the author and publisher a way to
get credit for their work, while not being considered responsible for modifications made by
others.
This License is a kind of “copyleft”, which means that derivative works of the document
must themselves be free in the same sense. It complements the GNU General Public License,
which is a copyleft license designed for free software.
We have designed this License in order to use it for manuals for free software, because free
software needs free documentation: a free program should come with manuals providing the
same freedoms that the software does. But this License is not limited to software manuals; it
can be used for any textual work, regardless of subject matter or whether it is published as a
printed book. We recommend this License principally for works whose purpose is instruction
or reference.

1. APPLICABILITY AND DEFINITIONS. This License applies to any manual or


other work, in any medium, that contains a notice placed by the copyright holder saying
it can be distributed under the terms of this License. Such a notice grants a world-wide,
royalty-free license, unlimited in duration, to use that work under the conditions stated
herein. The “Document”, below, refers to any such manual or work. Any member of the
public is a licensee, and is addressed as “you”. You accept the license if you copy, modify or
distribute the work in a way requiring permission under copyright law.


A “Modified Version” of the Document means any work containing the Document or a
portion of it, either copied verbatim, or with modifications and/or translated into another
language.
A “Secondary Section” is a named appendix or a front-matter section of the Document
that deals exclusively with the relationship of the publishers or authors of the Document
to the Document’s overall subject (or to related matters) and contains nothing that could
fall directly within that overall subject. (Thus, if the Document is in part a textbook of
mathematics, a Secondary Section may not explain any mathematics.) The relationship
could be a matter of historical connection with the subject or with related matters, or of
legal, commercial, philosophical, ethical or political position regarding them.
The “Invariant Sections” are certain Secondary Sections whose titles are designated, as
being those of Invariant Sections, in the notice that says that the Document is released under
this License. If a section does not fit the above definition of Secondary then it is not allowed
to be designated as Invariant. The Document may contain zero Invariant Sections. If the
Document does not identify any Invariant Sections then there are none.
The “Cover Texts” are certain short passages of text that are listed, as Front-Cover
Texts or Back-Cover Texts, in the notice that says that the Document is released under this
License. A Front-Cover Text may be at most 5 words, and a Back-Cover Text may be at
most 25 words.
A “Transparent” copy of the Document means a machine-readable copy, represented in
a format whose specification is available to the general public, that is suitable for revising
the document straightforwardly with generic text editors or (for images composed of pixels)
generic paint programs or (for drawings) some widely available drawing editor, and that is
suitable for input to text formatters or for automatic translation to a variety of formats
suitable for input to text formatters. A copy made in an otherwise Transparent file format
whose markup, or absence of markup, has been arranged to thwart or discourage subsequent
modification by readers is not Transparent. An image format is not Transparent if used for
any substantial amount of text. A copy that is not “Transparent” is called “Opaque”.
Examples of suitable formats for Transparent copies include plain ASCII without markup,
Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD,
and standard-conforming simple HTML, PostScript or PDF designed for human modifica-
tion. Examples of transparent image formats include PNG, XCF and JPG. Opaque formats
include proprietary formats that can be read and edited only by proprietary word processors,
SGML or XML for which the DTD and/or processing tools are not generally available, and
the machine-generated HTML, PostScript or PDF produced by some word processors for
output purposes only.
The “Title Page” means, for a printed book, the title page itself, plus such following
pages as are needed to hold, legibly, the material this License requires to appear in the title
page. For works in formats which do not have any title page as such, “Title Page” means
the text near the most prominent appearance of the work’s title, preceding the beginning of
the body of the text.
The “publisher” means any person or entity that distributes copies of the Document to
the public.

A section “Entitled XYZ” means a named subunit of the Document whose title either
is precisely XYZ or contains XYZ in parentheses following text that translates XYZ in
another language. (Here XYZ stands for a specific section name mentioned below, such
as “Acknowledgements”, “Dedications”, “Endorsements”, or “History”.) To “Preserve the
Title” of such a section when you modify the Document means that it remains a section
“Entitled XYZ” according to this definition.
The Document may include Warranty Disclaimers next to the notice which states that
this License applies to the Document. These Warranty Disclaimers are considered to be
included by reference in this License, but only as regards disclaiming warranties: any other
implication that these Warranty Disclaimers may have is void and has no effect on the
meaning of this License.

2. VERBATIM COPYING. You may copy and distribute the Document in any
medium, either commercially or noncommercially, provided that this License, the copyright
notices, and the license notice saying this License applies to the Document are reproduced
in all copies, and that you add no other conditions whatsoever to those of this License.
You may not use technical measures to obstruct or control the reading or further copying of
the copies you make or distribute. However, you may accept compensation in exchange for
copies. If you distribute a large enough number of copies you must also follow the conditions
in section 3.
You may also lend copies, under the same conditions stated above, and you may publicly
display copies.

3. COPYING IN QUANTITY. If you publish printed copies (or copies in media


that commonly have printed covers) of the Document, numbering more than 100, and the
Document’s license notice requires Cover Texts, you must enclose the copies in covers that
carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and
Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you
as the publisher of these copies. The front cover must present the full title with all words
of the title equally prominent and visible. You may add other material on the covers in
addition. Copying with changes limited to the covers, as long as they preserve the title
of the Document and satisfy these conditions, can be treated as verbatim copying in other
respects.
If the required texts for either cover are too voluminous to fit legibly, you should put the
first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto
adjacent pages.
If you publish or distribute Opaque copies of the Document numbering more than 100,
you must either include a machine-readable Transparent copy along with each Opaque copy,
or state in or with each Opaque copy a computer-network location from which the general
network-using public has access to download using public-standard network protocols a com-
plete Transparent copy of the Document, free of added material. If you use the latter option,
you must take reasonably prudent steps, when you begin distribution of Opaque copies in
quantity, to ensure that this Transparent copy will remain thus accessible at the stated lo-

cation until at least one year after the last time you distribute an Opaque copy (directly or
through your agents or retailers) of that edition to the public.
It is requested, but not required, that you contact the authors of the Document well
before redistributing any large number of copies, to give them a chance to provide you with
an updated version of the Document.

4. MODIFICATIONS. You may copy and distribute a Modified Version of the Docu-
ment under the conditions of sections 2 and 3 above, provided that you release the Modified
Version under precisely this License, with the Modified Version filling the role of the Doc-
ument, thus licensing distribution and modification of the Modified Version to whoever
possesses a copy of it. In addition, you must do these things in the Modified Version:

A. Use in the Title Page (and on the covers, if any) a title distinct from that of the
Document, and from those of previous versions (which should, if there were any, be
listed in the History section of the Document). You may use the same title as a previous
version if the original publisher of that version gives permission.

B. List on the Title Page, as authors, one or more persons or entities responsible for
authorship of the modifications in the Modified Version, together with at least five of
the principal authors of the Document (all of its principal authors, if it has fewer than
five), unless they release you from this requirement.

C. State on the Title page the name of the publisher of the Modified Version, as the
publisher.

D. Preserve all the copyright notices of the Document.

E. Add an appropriate copyright notice for your modifications adjacent to the other copy-
right notices.

F. Include, immediately after the copyright notices, a license notice giving the public
permission to use the Modified Version under the terms of this License, in the form
shown in the Addendum below.

G. Preserve in that license notice the full lists of Invariant Sections and required Cover
Texts given in the Document’s license notice.

H. Include an unaltered copy of this License.

I. Preserve the section Entitled “History”, Preserve its Title, and add to it an item stating
at least the title, year, new authors, and publisher of the Modified Version as given
on the Title Page. If there is no section Entitled “History” in the Document, create
one stating the title, year, authors, and publisher of the Document as given on its
Title Page, then add an item describing the Modified Version as stated in the previous
sentence.

J. Preserve the network location, if any, given in the Document for public access to a
Transparent copy of the Document, and likewise the network locations given in the
Document for previous versions it was based on. These may be placed in the “History”
section. You may omit a network location for a work that was published at least four
years before the Document itself, or if the original publisher of the version it refers to
gives permission.

K. For any section Entitled “Acknowledgements” or “Dedications”, Preserve the Title


of the section, and preserve in the section all the substance and tone of each of the
contributor acknowledgements and/or dedications given therein.

L. Preserve all the Invariant Sections of the Document, unaltered in their text and in
their titles. Section numbers or the equivalent are not considered part of the section
titles.

M. Delete any section Entitled “Endorsements”. Such a section may not be included in
the Modified Version.

N. Do not retitle any existing section to be Entitled “Endorsements” or to conflict in title


with any Invariant Section.

O. Preserve any Warranty Disclaimers.

If the Modified Version includes new front-matter sections or appendices that qualify as
Secondary Sections and contain no material copied from the Document, you may at your
option designate some or all of these sections as invariant. To do this, add their titles to
the list of Invariant Sections in the Modified Version’s license notice. These titles must be
distinct from any other section titles.
You may add a section Entitled “Endorsements”, provided it contains nothing but en-
dorsements of your Modified Version by various parties — for example, statements of peer
review or that the text has been approved by an organization as the authoritative definition
of a standard.
You may add a passage of up to five words as a Front-Cover Text, and a passage of up
to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified
Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added
by (or through arrangements made by) any one entity. If the Document already includes a
cover text for the same cover, previously added by you or by arrangement made by the same
entity you are acting on behalf of, you may not add another; but you may replace the old
one, on explicit permission from the previous publisher that added the old one.
The author(s) and publisher(s) of the Document do not by this License give permission to
use their names for publicity for or to assert or imply endorsement of any Modified Version.

5. COMBINING DOCUMENTS. You may combine the Document with other docu-
ments released under this License, under the terms defined in section 4 above for modified
versions, provided that you include in the combination all of the Invariant Sections of all of

the original documents, unmodified, and list them all as Invariant Sections of your combined
work in its license notice, and that you preserve all their Warranty Disclaimers.
The combined work need only contain one copy of this License, and multiple identical
Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections
with the same name but different contents, make the title of each such section unique by
adding at the end of it, in parentheses, the name of the original author or publisher of that
section if known, or else a unique number. Make the same adjustment to the section titles
in the list of Invariant Sections in the license notice of the combined work.
In the combination, you must combine any sections Entitled “History” in the various
original documents, forming one section Entitled “History”; likewise combine any sections
Entitled “Acknowledgements”, and any sections Entitled “Dedications”. You must delete all
sections Entitled “Endorsements”.

6. COLLECTIONS OF DOCUMENTS. You may make a collection consisting of


the Document and other documents released under this License, and replace the individual
copies of this License in the various documents with a single copy that is included in the
collection, provided that you follow the rules of this License for verbatim copying of each of
the documents in all other respects.
You may extract a single document from such a collection, and distribute it individually
under this License, provided you insert a copy of this License into the extracted document,
and follow this License in all other respects regarding verbatim copying of that document.

7. AGGREGATION WITH INDEPENDENT WORKS. A compilation of the Doc-


ument or its derivatives with other separate and independent documents or works, in or on
a volume of a storage or distribution medium, is called an “aggregate” if the copyright re-
sulting from the compilation is not used to limit the legal rights of the compilation’s users
beyond what the individual works permit. When the Document is included in an aggregate,
this License does not apply to the other works in the aggregate which are not themselves
derivative works of the Document.
If the Cover Text requirement of section 3 is applicable to these copies of the Document,
then if the Document is less than one half of the entire aggregate, the Document’s Cover
Texts may be placed on covers that bracket the Document within the aggregate, or the
electronic equivalent of covers if the Document is in electronic form. Otherwise they must
appear on printed covers that bracket the whole aggregate.

8. TRANSLATION. Translation is considered a kind of modification, so you may dis-


tribute translations of the Document under the terms of section 4. Replacing Invariant
Sections with translations requires special permission from their copyright holders, but you
may include translations of some or all Invariant Sections in addition to the original versions
of these Invariant Sections. You may include a translation of this License, and all the license
notices in the Document, and any Warranty Disclaimers, provided that you also include
the original English version of this License and the original versions of those notices and

disclaimers. In case of a disagreement between the translation and the original version of
this License or a notice or disclaimer, the original version will prevail.
If a section in the Document is Entitled “Acknowledgements”, “Dedications”, or “His-
tory”, the requirement (section 4) to Preserve its Title (section 1) will typically require
changing the actual title.

9. TERMINATION. You may not copy, modify, sublicense, or distribute the Document
except as expressly provided under this License. Any attempt otherwise to copy, modify,
sublicense, or distribute it is void, and will automatically terminate your rights under this
License.
However, if you cease all violation of this License, then your license from a particular
copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly
and finally terminates your license, and (b) permanently, if the copyright holder fails to notify
you of the violation by some reasonable means prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is reinstated permanently if
the copyright holder notifies you of the violation by some reasonable means, this is the first
time you have received notice of violation of this License (for any work) from that copyright
holder, and you cure the violation prior to 30 days after your receipt of the notice.
Termination of your rights under this section does not terminate the licenses of parties
who have received copies or rights from you under this License. If your rights have been
terminated and not permanently reinstated, receipt of a copy of some or all of the same
material does not give you any rights to use it.

10. FUTURE REVISIONS OF THIS LICENSE. The Free Software Foundation


may publish new, revised versions of the GNU Free Documentation License from time to
time. Such new versions will be similar in spirit to the present version, but may differ in
detail to address new problems or concerns. See http://www.gnu.org/copyleft/.
Each version of the License is given a distinguishing version number. If the Document
specifies that a particular numbered version of this License “or any later version” applies
to it, you have the option of following the terms and conditions either of that specified
version or of any later version that has been published (not as a draft) by the Free Software
Foundation. If the Document does not specify a version number of this License, you may
choose any version ever published (not as a draft) by the Free Software Foundation. If the
Document specifies that a proxy can decide which future versions of this License can be
used, that proxy’s public statement of acceptance of a version permanently authorizes you
to choose that version for the Document.

11. RELICENSING. “Massive Multiauthor Collaboration Site” (or “MMC Site”) means
any World Wide Web server that publishes copyrightable works and also provides prominent
facilities for anybody to edit those works. A public wiki that anybody can edit is an example
of such a server. A “Massive Multiauthor Collaboration” (or “MMC”) contained in the site
means any set of copyrightable works thus published on the MMC site.

“CC-BY-SA” means the Creative Commons Attribution-Share Alike 3.0 license published
by Creative Commons Corporation, a not-for-profit corporation with a principal place of
business in San Francisco, California, as well as future copyleft versions of that license
published by that same organization.
“Incorporate” means to publish or republish a Document, in whole or in part, as part of
another Document.
An MMC is “eligible for relicensing” if it is licensed under this License, and if all works
that were first published under this License somewhere other than this MMC, and subse-
quently incorporated in whole or in part into the MMC, (1) had no cover texts or invariant
sections, and (2) were thus incorporated prior to November 1, 2008.
The operator of an MMC Site may republish an MMC contained in the site under CC-
BY-SA on the same site at any time before August 1, 2009, provided the MMC is eligible
for relicensing.

ADDENDUM: How to use this License for your documents. To use this License
in a document you have written, include a copy of the License in the document and put the
following copyright and license notices just after the title page:

Copyright (C) YEAR YOUR NAME.


Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3
or any later version published by the Free Software Foundation;
with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
A copy of the license is included in the section entitled "GNU
Free Documentation License".

If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts, replace the “with. . .
Texts.” line with this:

with the Invariant Sections being LIST THEIR TITLES, with the
Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST.

If you have Invariant Sections without Cover Texts, or some other combination of the three,
merge those two alternatives to suit the situation.
If your document contains nontrivial examples of program code, we recommend releasing
these examples in parallel under your choice of free software license, such as the GNU General
Public License, to permit their use in free software.
References

[1] Ed Anderson, Zhaojun Bai, James Demmel, Jack J. Dongarra, Jeremy DuCroz, Ann
Greenbaum, Sven Hammarling, Alan E. McKenney, Susan Ostrouchov, and Danny
Sorensen, LAPACK Users’ Guide, SIAM, Philadelphia, 1992.
[2] Richard Barrett, Michael Berry, Tony F. Chan, James Demmel, June M. Donato, Jack
Dongarra, Victor Eijkhout, Roldan Pozo, Charles Romine, and Henk Van der Vorst,
Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods,
SIAM Press, 1993. [ PDF ]
[3] Paolo Bientinesi, Inderjit S. Dhillon, Robert A. van de Geijn, A Parallel Eigensolver
for Dense Symmetric Matrices Based on Multiple Relatively Robust Representations,
SIAM Journal on Scientific Computing, 2005
[4] Paolo Bientinesi, John A. Gunnels, Margaret E. Myers, Enrique S. Quintana-Orti,
Robert A. van de Geijn, The science of deriving dense linear algebra algorithms, ACM
Transactions on Mathematical Software (TOMS), 2005.
[5] Paolo Bientinesi, Enrique S. Quintana-Orti, Robert A. van de Geijn, Representing
linear algebra algorithms in code: the FLAME application program interfaces, ACM
Transactions on Mathematical Software (TOMS), 2005
[6] Paolo Bientinesi, Robert A. van de Geijn, Goal-Oriented and Modular Stability Analy-
sis, SIAM Journal on Matrix Analysis and Applications , Volume 32 Issue 1, February
2011.
[7] Paolo Bientinesi, Robert A. van de Geijn, The Science of Deriving Stability Analyses,
FLAME Working Note #33. Aachen Institute for Computational Engineering Sciences,
RWTH Aachen. TR AICES-2008-2. November 2008.
[8] Christian Bischof and Charles Van Loan, The WY Representation for Products of
Householder Matrices, SIAM Journal on Scientific and Statistical Computing, Vol. 8,
No. 1, 1987.
[9] Basic Linear Algebra Subprograms - A Quick Reference Guide, University of Tennessee,
Oak Ridge National Laboratory, Numerical Algorithms Group Ltd.
[10] Barry A. Cipra, The Best of the 20th Century: Editors Name Top 10 Algorithms,

SIAM News, Volume 33, Number 4, 2000. Available from https://archive.siam.org/


pdf/news/637.pdf.
[11] A.K. Cline, C.B. Moler, G.W. Stewart, and J.H. Wilkinson, An estimate for the con-
dition number of a matrix, SIAM J. Numer. Anal., 16 (1979).
[12] Inderjit S. Dhillon and Beresford N. Parlett, Multiple Representations to Compute
Orthogonal Eigenvectors of Symmetric Tridiagonal Matrices, Lin. Alg. Appl., Vol.
387, 2004.
[13] Jack J. Dongarra, Jeremy DuCroz, Ann Greenbaum, Sven Hammarling, Alan E.
McKenney, Susan Ostrouchov, and Danny Sorensen, LAPACK Users’ Guide, SIAM,
Philadelphia, 1992.
[14] Jack J. Dongarra, Jeremy Du Croz, Sven Hammarling, and Iain Duff, A Set of Level
3 Basic Linear Algebra Subprograms, ACM Transactions on Mathematical Software,
Vol. 16, No. 1, pp. 1-17, March 1990.
[15] Jack J. Dongarra, Jeremy Du Croz, Sven Hammarling, and Richard J. Hanson, An
Extended Set of {FORTRAN} Basic Linear Algebra Subprograms, ACM Transactions
on Mathematical Software, Vol. 14, No. 1, pp. 1-17, March 1988.
[16] J. J. Dongarra, C. B. Moler, J. R. Bunch, and G. W. Stewart, LINPACK Users’ Guide,
Society for Industrial and Applied Mathematics, 1979.
[17] Victor Eijkhout, Introduction to High-Performance Scientific Computing, lulu.com.
http://pages.tacc.utexas.edu/~eijkhout/istc/istc.htm
[18] Leslie V. Foster, Gaussian elimination with partial pivoting can fail in practice, SIAM
Journal on Matrix Analysis and Applications, 15, 1994.
[19] Gene H. Golub and Charles F. Van Loan, Matrix Computations, Fourth Edition, Johns
Hopkins Press, 2013.
[20] Brian C. Gunter, Robert A. van de Geijn, Parallel out-of-core computation and updat-
ing of the QR factorization, ACM Transactions on Mathematical Software (TOMS),
2005.
[21] N. Higham, A Survey of Condition Number Estimates for Triangular Matrices, SIAM
Review, 1987.
[22] C. G. J. Jacobi, Über ein leichtes Verfahren, die in der Theorie der Säkularstörungen
vorkommenden Gleichungen numerisch aufzulösen, Crelle’s Journal 30, 51-94, 1846.
[23] Thierry Joffrain, Tze Meng Low, Enrique S. Quintana-Orti, Robert van de Geijn, Field
G. Van Zee, Accumulating Householder transformations, revisited, ACM Transactions
on Mathematical Software, Vol. 32, No 2, 2006.
[24] C. L. Lawson, R. J. Hanson, D. R. Kincaid, and F. T. Krogh, Basic Linear Algebra
Subprograms for Fortran Usage, ACM Transactions on Mathematical Software, Vol. 5,
No. 3, pp. 308-323, Sept. 1979.
[25] Per-Gunnar Martinsson, Gregorio Quintana-Orti, Nathan Heavner, Robert van de
Geijn, Householder QR Factorization With Randomization for Column Pivoting (HQRRP),
SIAM Journal on Scientific Computing, Vol. 39, Issue 2, 2017.
[26] Margaret E. Myers, Pierce M. van de Geijn, and Robert A. van de Geijn, Linear
Algebra: Foundations to Frontiers - Notes to LAFF With, self-published at ulaff.net,
2014.
[27] Margaret E. Myers and Robert A. van de Geijn, Linear Algebra: Foundations to Fron-
tiers, ulaff.net, 2014. A Massive Open Online Course offered on edX.
[28] Margaret E. Myers and Robert A. van de Geijn, LAFF-On Programming for Correct-
ness, self-published at ulaff.net, 2017.
[29] Margaret E. Myers and Robert A. van de Geijn, LAFF-On Programming for Correct-
ness, A Massive Open Online Course offered on edX.
[30] J. Novembre, T. Johnson, K. Bryc, Z. Kutalik, A.R. Boyko, A. Auton, A. Indap,
K.S. King, S. Bergmann, M. Nelson, M. Stephens, C.D. Bustamante, Genes mirror
geography within Europe, Nature, 2008.
[31] Devangi N. Parikh, Margaret E. Myers, Richard Vuduc, Robert A. van de Geijn, A
Simple Methodology for Computing Families of Algorithms, FLAME Working Note
#87, The University of Texas at Austin, Department of Computer Science, Technical
Report TR-18-06. arXiv:1808.07832.
[32] C. Puglisi, Modification of the Householder method based on the compact WY repre-
sentation, SIAM Journal on Scientific Computing, Vol. 13, 1992.
[33] Gregorio Quintana-Orti, Xiaobai Sun, and Christof H. Bischof, A BLAS-3 version of
the QR factorization with column pivoting, SIAM Journal on Scientific Computing, 19,
1998.
[34] Martin D. Schatz, Robert A. van de Geijn, and Jack Poulson, Parallel Matrix Multi-
plication: A Systematic Journey, SIAM Journal on Scientific Computing, Volume 38,
Issue 6, 2016.
[35] Robert Schreiber and Charles Van Loan, A Storage-Efficient WY Representation for
Products of Householder Transformations, SIAM Journal on Scientific and Statistical
Computing, Vol. 10, No. 1, 1989.
[36] Jonathon Shlens, A Tutorial on Principal Component Analysis, arxiv 1404.1100, 2014.
[37] G.W. Stewart, Matrix Algorithms, Volume I: Basic Decompositions, SIAM Press, 2001.
[38] Robert van de Geijn and Kazushige Goto, BLAS (Basic Linear Algebra Subprograms),
Encyclopedia of Parallel Computing, Part 2, pp. 157-164, 2011. If you don’t have
access, you may want to read an advanced draft.
[39] Robert van de Geijn, Margaret Myers, and Devangi N. Parikh, LAFF-On Programming
for High Performance, ulaff.net, 2019.
[40] Robert van de Geijn and Jerrell Watts, SUMMA: Scalable Universal Matrix Multipli-
cation Algorithm, Concurrency: Practice and Experience, Volume 9, Number 4, 1997.
[41] Field G. Van Zee, libflame: The Complete Reference, http://www.lulu.com, 2009. [ free
PDF ]
[42] Field G. Van Zee, Robert A. van de Geijn, Gregorio Quintana-Ortí, Restructuring the
Tridiagonal and Bidiagonal QR Algorithms for Performance, ACM Transactions on
Mathematical Software (TOMS), Vol. 40, No. 3, 2014. Available free from
http://www.cs.utexas.edu/~flame/web/FLAMEPublications.htm, Journal Publication #33. Click on
the title of the paper.
[43] Field G. Van Zee, Robert A. van de Geijn, Gregorio Quintana-Ortí, Restructuring the
Tridiagonal and Bidiagonal QR Algorithms for Performance, ACM Transactions on
Mathematical Software (TOMS), 2014. Available free from
http://www.cs.utexas.edu/~flame/web/FLAMEPublications.htm, Journal Publication #33. Click on
the title of the paper.
[44] Field G. Van Zee, Robert A. van de Geijn, Gregorio Quintana-Ortí, G. Joseph Elizondo,
Families of Algorithms for Reducing a Matrix to Condensed Form, ACM Transactions
on Mathematical Software (TOMS), Vol., No. 1, 2012. Available free from
http://www.cs.utexas.edu/~flame/web/FLAMEPublications.htm, Journal Publication #26. Click on
the title of the paper.
[45] H. F. Walker, Implementation of the GMRES method using Householder transforma-
tions, SIAM Journal on Scientific and Statistical Computing, Vol. 9, No. 1, 1988.
[46] Stephen J. Wright, A Collection of Problems for Which Gaussian Elimination with
Partial Pivoting is Unstable, SIAM Journal on Scientific Computing, Vol. 14, No. 1,
1993.
[47] BLAS-like Library Instantiation Software Framework, GitHub repository.
[48] BLIS typed interface, https://github.com/flame/blis/blob/master/docs/BLISTypedAPI.md.
[49] Kazushige Goto and Robert van de Geijn, Anatomy of High-Performance Matrix Mul-
tiplication, ACM Transactions on Mathematical Software, Vol. 34, No. 3: Article 12,
May 2008.
[50] Tyler Michael Smith, Bradley Lowery, Julien Langou, Robert A. van de Geijn, A Tight
I/O Lower Bound for Matrix Multiplication, arxiv.org:1702.02017v2, 2019. (To appear
in ACM Transactions on Mathematical Software.)
[51] Field G. Van Zee and Tyler M. Smith, Implementing High-performance Complex Ma-
trix Multiplication via the 3M and 4M Methods, ACM Transactions on Mathematical
Software, Vol. 44, No. 1, pp. 7:1-7:36, July 2017.
[52] Field G. Van Zee and Robert A. van de Geijn, BLIS: A Framework for Rapidly In-
stantiating BLAS Functionality, ACM Journal on Mathematical Software, Vol. 41,
No. 3, June 2015. You can access this article for free by visiting the Science of High-
Performance Computing group webpage and clicking on the title of Journal Article
39.
Index

(Euclidean) length, 84
I, 43
[·], 340
εmach, 172
fl(·), 339
γn, 355, 379
∞-norm (vector), 84
∞-norm, vector, 30
κ(A), 76, 88
maxi(·), 286
A, 50
x, 95
θj, 353, 379
| · |, 18
ej, 100
p-norm (vector), 84
p-norm, matrix, 55
p-norm, vector, 31
1-norm (vector), 84
1-norm, vector, 29
2-norm (vector), 84
2-norm, matrix, 56
2-norm, vector, 25
absolute value, 18, 84
ACM, 341
Alternative Computational Model, 341
axpy, 296
backward stable implementation, 343
Basic Linear Algebra Subprograms, 305, 588
BLAS, 305, 588
blocked algorithm, 200
catastrophic cancellation, 623
Cauchy-Schwarz inequality, 26
CGS, 158
characteristic polynomial, 471, 473
chasing the bulge, 546
Cholesky decomposition, 255
Cholesky factor, 299
Cholesky factorization, 222, 255
Cholesky factorization theorem, 299, 329
Classical Gram-Schmidt, 158
complex conjugate, 19
complex product, 84
condition number, 76, 88, 224, 250
conjugate, 19, 84
conjugate (of matrix), 87
conjugate (of vector), 84
conjugate of a matrix, 50
conjugate transpose (of matrix), 87
conjugate transpose (of vector), 84
consistent matrix norm, 71, 88
cost of basic linear algebra operations, 623
cubic convergence, 497, 513
defective matrix, 489, 511
deflation, 530
descent methods, 416
determinant, 472
direction of maximal magnification, 77
distance, 18
dot product, 84, 95
eigenpair, 464, 507
eigenvalue, 464, 507
eigenvector, 464, 507
elementary pivot matrix, 241
equivalence style proof, 22
Euclidean distance, 18
exact descent method, 420
fill-in, 393
fixed-point equation, 405
FLAME notation, 118
floating point numbers, 334
flop, 594, 595, 597
forward substitution, 260
Frobenius norm, 47, 87
Gauss transform, 273
Gaussian elimination, 258
Gaussian elimination with row exchanges, 278
Geometric multiplicity, 489, 511
Givens’ rotation, 539
gradient, 417
Gram-Schmidt orthogonalization, 158
Hermitian, 50
Hermitian Positive Definite, 222, 297
Hermitian positive definite, 297
Hermitian transpose, 26, 49
Hermitian transpose (of matrix), 87
Hermitian transpose (of vector), 84
Hessenberg matrix, 542
homogeneity (of absolute value), 19
homogeneity (of matrix norm), 46, 86
homogeneity (of vector norm), 24, 84
Householder reflector, 179, 208
Householder transformation, 178, 179, 208
HPD, 222, 297
identity matrix, 43
Implicit Q Theorem, 542
induced matrix norm, 51, 52
infinity norm, 30
inner product, 84, 95
Jordan Canonical Form, 488
Krylov subspace, 445, 458
left pseudo inverse, 220
left pseudo-inverse, 90
left singular vector, 118, 149
Legendre polynomials, 154
linear convergence, 496, 497, 512
linear least squares, 212
linear transformation, 41
LLS, 212
LU decomposition, 255, 268, 277, 324
LU factorization, 255, 258, 268, 277, 324
LU factorization - existence, 269, 324
LU factorization algorithm (bordered), 273
LU factorization algorithm (left-looking), 271
LU factorization algorithm (right-looking), 264
LU factorization with complete pivoting, 296
LU factorization with partial pivoting, 286
LU factorization with partial pivoting (right-looking algorithm), 286
LU factorization with pivoting, 278
machine epsilon, 172, 337, 338, 377
magnitude, 18
matrix, 41, 43
matrix 1-norm, 87
matrix 2-norm, 56, 87
matrix ∞-norm, 87
matrix norm, 46, 86
matrix norm, 2-norm, 56
matrix norm, p-norm, 55
matrix norm, consistent, 71, 88
matrix norm, Frobenius, 47
matrix norm, induced, 51, 52
matrix norm, submultiplicative, 70, 71, 88
matrix norm, subordinate, 71, 88
matrix p-norm, 55, 87
matrix-vector multiplication, 43
memop, 594, 595, 598
Method of Multiple Relatively Robust Representations (MRRR), 553
Method of Normal Equations, 219
method of normal equations, 214
MRRR, 553
natural ordering, 384
nested dissection, 396
norm, 12
norm, Frobenius, 47
norm, infinity, 30
norm, matrix, 46, 86
norm, vector, 24, 84
normal equations, 214, 219
numerical stability, 331
orthogonal matrix, 102
orthogonal projection, 90
orthogonal vectors, 96
orthonormal matrix, 100
orthonormal vectors, 99
over-relaxation, 409
parent functions, 153
partial pivoting, 279, 286
pivot, 279
pivot element, 279
positive definite, 297
positive definiteness (of absolute value), 19
positive definiteness (of matrix norm), 46, 86
positive definiteness (of vector norm), 24, 84
precondition, 308
principal leading submatrix, 268, 324
pseudo inverse, 220, 223
pseudo-inverse, 90
QR algorithm, 521
QR decomposition, 151
QR Decomposition Theorem, 161, 207
QR factorization, 151
QR factorization with column pivoting, 241
quadratic convergence, 497, 513
Rank Revealing QR, 241
rank-k update, 612
Rayleigh quotient, 496, 512
Rayleigh Quotient Iteration, 503
reflector, 178, 179, 208
residual, 14
right pseudo inverse, 221
right singular vector, 118, 149
rotation, 107
row pivoting, 279
RRQR, 241
Schur decomposition, 479, 480, 511
Schur Decomposition Theorem, 480, 511
SCM, 340
separator, 393
shifted inverse power method, 503
shifted QR algorithm, 526
similarity transformation, 479, 510
Singular Value Decomposition, 89, 92
singular vector, 118, 149
solving triangular systems, 290
SOR, 409
sparse linear system, 382
Spectral decomposition, 479
spectral decomposition, 482, 511
Spectral Decomposition Theorem, 482, 511
spectral radius, 465, 508
spectrum, 465, 508
stability, 331
standard basis vector, 42, 85
Standard Computational Model, 340
submultiplicative matrix norm, 70, 71, 88
subordinate matrix norm, 71, 88
subspace iteration, 514, 518
successive over-relaxation, 409
superlinear convergence, 497, 513
superquadratic convergence, 513
SVD, 89, 92
symmetric positive definite, 298, 329
tall and skinny, 563
The Francis implicit QR Step, 545
The implicit Q theorem, 542
transpose, 49
transpose (of matrix), 87
transpose (of vector), 84
triangle inequality (for absolute value), 19
triangle inequality (for matrix norms), 46, 86
triangle inequality (for vector norms), 24, 84
triangular solve with multiple right-hand sides, 612
triangular system, 290
TRSM, 612
unit ball, 32, 84
unit roundoff, 337, 338, 377
unit roundoff error, 172
unitary matrix, 102, 148
unitary similarity transformation, 480, 511
upper Hessenberg matrix, 542
Vandermonde matrix, 152
vector 1-norm, 29, 84
vector 2-norm, 25, 84
vector ∞-norm, 30, 84
vector p-norm, 31, 84
vector norm, 24, 84
vector norm, 1-norm, 29
vector norm, 2-norm, 25
vector norm, ∞-norm, 30
vector norm, p-norm, 31
Wilkinson shift, 550
Colophon
This book was authored in, and produced with, PreTeXt.
