Numerical Linear Algebra and Matrix Analysis: Higham, Nicholas J. 2015
ISSN 1749-9097
$\kappa_2(X) = 1$, so in numerical linear algebra transformations by unitary or orthogonal matrices are preferred and usually lead to numerically stable algorithms.

In practice we often need an estimate of the matrix condition number $\kappa(A)$ but do not wish to go to the expense of computing $A^{-1}$ in order to obtain it. Fortunately, there are algorithms that can cheaply produce a reliable estimate of $\kappa(A)$ once a factorization of $A$ has been computed.

Note that the determinant, $\det(A)$, is rarely computed in numerical linear algebra. Its magnitude gives no useful information about the conditioning of $A$, not least because of its extreme behavior under scaling: $\det(\alpha A) = \alpha^n \det(A)$.

2 Matrix Factorizations

The method of Gaussian elimination (GE) for solving a nonsingular linear system $Ax = b$ of $n$ equations in $n$ unknowns reduces the matrix $A$ to upper triangular form and then solves for $x$ by substitution. GE is typically described by writing down the equations $a_{ij}^{(k+1)} = a_{ij}^{(k)} - a_{ik}^{(k)} a_{kj}^{(k)} / a_{kk}^{(k)}$ (and similarly for $b$) that describe how the starting matrix $A = A^{(1)} = (a_{ij}^{(1)})$ changes on each of the $n - 1$ steps of the elimination in its progress towards upper triangular form $U$. Working at the element level in this way leads to a profusion of symbols, superscripts, and subscripts that tend to obscure the mathematical structure and hinder insight into the underlying process.

One of the key developments in the last century was the recognition that it is much more profitable to work at the matrix level. Thus the basic equation above is written as $A^{(k+1)} = M_k A^{(k)}$, where $M_k$ agrees with the identity matrix except below the diagonal in the $k$th column, where its $(i,k)$ element is $m_{ik} = -a_{ik}^{(k)}/a_{kk}^{(k)}$, $i = k+1:n$. Applying the matrix equation repeatedly gives $U := A^{(n)} = M_{n-1} \cdots M_1 A$. Taking the $M_k$ matrices over to the left-hand side leads, after some calculations, to the equation $A = LU$, where $L$ is unit lower triangular, with $(i,k)$ element $-m_{ik}$. The prefix “unit” means that $L$ has ones on the diagonal.

GE is therefore equivalent to factorizing the matrix $A$ as the product of a lower triangular matrix and an upper triangular matrix, something that is not at all obvious from the element-level equations. Solving the linear system $Ax = b$ now reduces to the task of solving the two triangular systems $Ly = b$ and $Ux = y$.

Interpreting GE as LU factorization separates the computation of the factors from the solution of the triangular systems. It is then clear how to solve efficiently several systems $Ax_i = b_i$, $i = 1:r$, with different right-hand sides but the same coefficient matrix $A$: compute the LU factors once and then re-use them to solve for each $x_i$ in turn.

This matrix factorization¹ viewpoint dates from around the 1940s and has been extremely successful in matrix computations. In general, a factorization is a representation of a matrix as a product of “simpler” matrices. Factorization is a tool that can be used to solve a variety of problems, as we will see below.

¹ Or decomposition; the two terms are essentially synonymous.

Two particular benefits of factorizations are unity and modularity. GE, for example, can be organized in several different ways, corresponding to different orderings of the three nested loops that it comprises, as well as the use of different blockings of the matrix elements. Yet all of them compute the same LU factorization, carrying out the same mathematical operations in a different order. Without the unifying concept of a factorization, reasoning about these GE variants would be difficult.

Modularity refers to the way that a factorization breaks a problem down into separate tasks, which can be analyzed or programmed independently. To carry out a rounding error analysis of GE we can analyze the LU factorization and the solution of the triangular systems by substitution separately and then put the analyses together. The rounding error analysis of substitution can be re-used in the many other contexts in which triangular systems arise.

An important example of the use of LU factorization is in iterative refinement. Suppose we have used GE to obtain a computed solution $\hat{x}$ to $Ax = b$ in floating-point arithmetic. If we form $r = b - A\hat{x}$ and solve $Ae = r$, then in exact arithmetic $y = \hat{x} + e$ is the true solution. In computing $e$ we can reuse the LU factors of $A$, so obtaining $y$ from $\hat{x}$ is inexpensive. In practice, the computation of $r$, $e$, and $y$ is subject to rounding errors so the computed $\hat{y}$ is not equal to $x$. But under suitable assumptions $\hat{y}$ will be an improved approximation and we can iterate this refinement process. Iterative refinement is particularly effective if $r$ can be computed using extra precision.

Two other key factorizations are:

• Cholesky factorization: for Hermitian positive definite $A \in \mathbb{C}^{n\times n}$, $A = R^*R$, where $R$ is upper triangular with positive diagonal elements, and this factorization is unique.
• QR factorization: for $A \in \mathbb{C}^{m\times n}$ with $m \ge n$, $A = QR$ where $Q \in \mathbb{C}^{m\times m}$ is unitary ($Q^*Q = I_m$) and $R \in \mathbb{C}^{m\times n}$ is upper trapezoidal, that is, $R = \begin{bmatrix} R_1 \\ 0 \end{bmatrix}$ with $R_1 \in \mathbb{C}^{n\times n}$ upper triangular.

These two factorizations are related: if $A \in \mathbb{C}^{m\times n}$ with $m \ge n$ has full rank and $A = QR$ is a QR factorization, in which without loss of generality we can assume that $R$ has positive diagonal, then $A^*A = R^*R$, so $R$ is the Cholesky factor of $A^*A$.

The Cholesky factorization can be computed by what is essentially a symmetric and scaled version of GE. The QR factorization can be computed in three main ways, one of which is the classical Gram–Schmidt orthogonalization. The most widely used method constructs $Q$ as a product of Householder reflectors, which are unitary matrices of the form $H = I - 2vv^*/(v^*v)$, where $v$ is a nonzero vector. Note that $H$ is a rank 1 perturbation of the identity and since it is Hermitian and unitary it is its own inverse, that is, it is involutory. The third approach builds $Q$ as a product of Givens rotations, each of which is a $2\times 2$ matrix $\begin{bmatrix} c & s \\ -s & c \end{bmatrix}$ embedded into two rows and columns of an $m\times m$ identity matrix, where (in the real case) $c^2 + s^2 = 1$.

The Cholesky factorization helps us to make the most of the very desirable property of positive definiteness. For example, suppose $A$ is Hermitian positive definite and we wish to evaluate the scalar $\alpha = x^*A^{-1}x$. We can rewrite it as $x^*(R^*R)^{-1}x = (x^*R^{-1})(R^{-*}x) = z^*z$, where $z = R^{-*}x$. So once the Cholesky factorization has been computed we need just one triangular solve to compute $\alpha$, and of course there is no need to explicitly invert the matrix $A$.

A matrix factorization might involve a larger number of factors: $A = N_1N_2\cdots N_k$, say. It is immediate that $A^T = N_k^TN_{k-1}^T\cdots N_1^T$. This factorization of the transpose may have deep consequences in a particular application. For example, the discrete Fourier transform is the matrix–vector product $y = F_nx$, where the $n\times n$ matrix $F_n$ has $(p,q)$ element $\exp(-2\pi i(p-1)(q-1)/n)$; $F_n$ is a complex, symmetric matrix. The fast Fourier transform (FFT) is a way of evaluating $y$ in $O(n\log_2 n)$ operations, as opposed to the $O(n^2)$ operations that are required by a standard matrix–vector multiplication. Many variants of the FFT have been proposed since the original 1965 paper by Cooley and Tukey. It turns out that different FFT variants correspond to different factorizations of $F_n$ with $k = \log_2 n$ sparse factors. Some of these methods correspond simply to transposing the factorization in another method (recall that $F_n^T = F_n$), though this was not realized when the methods were developed. Transposition also plays an important role in automatic differentiation: the so-called reverse or adjoint mode can be obtained by transposing a matrix factorization representation of the forward mode.

The factorizations described in this section are in “plain vanilla” form, but all have variants that incorporate pivoting. Pivoting refers to row or column interchanges carried out at each step of the factorization as it is computed, introduced either to ensure that the factorization succeeds and is numerically stable or to produce a factorization with certain desirable properties usually associated with rank deficiency. For GE, partial pivoting is normally used: at the start of the $k$th stage of the elimination an element $a_{rk}^{(k)}$ of largest modulus in the $k$th column on or below the diagonal is brought into the $(k,k)$ (pivot) position by interchanging rows $k$ and $r$. Partial pivoting avoids dividing by zero (if $a_{kk}^{(k)} = 0$ after the interchange then the pivot column is zero below the diagonal and the elimination step can be skipped). More importantly, partial pivoting ensures numerical stability; see section 8. The overall effect of GE with partial pivoting is to produce an LU factorization $PA = LU$, where $P$ is a permutation matrix.

Pivoted variants of Cholesky factorization and QR factorization take the form $P^TAP = R^*R$ and $AP = Q\begin{bmatrix} R \\ 0 \end{bmatrix}$, where $P$ is a permutation matrix and $R$ satisfies the inequalities
\[ |r_{kk}|^2 \ge \sum_{i=k}^{j} |r_{ij}|^2, \quad j = k+1:n, \quad k = 1:n. \]
If $A$ is rank deficient then $R$ has the form $R = \begin{bmatrix} R_{11} & R_{12} \\ 0 & 0 \end{bmatrix}$ with $R_{11}$ nonsingular, and the rank of $A$ is the dimension of $R_{11}$. Equally importantly, when $A$ is nearly rank deficient this tends to be revealed by a small trailing diagonal block of $R$.

A factorization of great importance in a wide variety of applications is the singular value decomposition (SVD) of $A \in \mathbb{C}^{m\times n}$:
\[ A = U\Sigma V^*, \quad \Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \dots, \sigma_p) \in \mathbb{R}^{m\times n}, \tag{1} \]
where $p = \min(m,n)$, $U \in \mathbb{C}^{m\times m}$ and $V \in \mathbb{C}^{n\times n}$ are unitary, and the singular values $\sigma_i$ satisfy $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_p \ge 0$. For a square $A$ ($m = n$), the 2-norm condition number is given by $\kappa_2(A) = \sigma_1/\sigma_n$.

The polar decomposition of $A \in \mathbb{C}^{m\times n}$ with $m \ge n$ is a factorization $A = UH$ in which $U \in \mathbb{C}^{m\times n}$ has orthonormal columns and $H \in \mathbb{C}^{n\times n}$ is Hermitian positive semidefinite. The matrix $H$ is unique and is given by
$(A^*A)^{1/2}$, where the exponent $1/2$ denotes the principal square root, while $U$ is unique if $A$ has full rank. The polar decomposition generalizes to matrices the polar representation $z = re^{i\theta}$ of a complex number. The Hermitian polar factor $H$ is also known as the matrix absolute value, $|A|$, and is much studied in matrix analysis and functional analysis.

One reason for the importance of the polar decomposition is that it provides an optimal way to orthogonalize a matrix: a result of Fan and Hoffman (1955) says that $U$ is the nearest matrix with orthonormal columns to $A$ in any unitarily invariant norm (a unitarily invariant norm is one with the property that $\|UAV\| = \|A\|$ for any unitary $U$ and $V$; the 2-norm and the Frobenius norm are particular examples). In various applications a matrix $A \in \mathbb{R}^{n\times n}$ that should be orthogonal drifts from orthogonality because of rounding or other errors; replacing it by the orthogonal polar factor $U$ is then a good strategy.

The polar decomposition also solves the orthogonal Procrustes problem, for $A, B \in \mathbb{C}^{m\times n}$,
\[ \min\{\, \|A - BQ\|_F : Q \in \mathbb{C}^{n\times n},\ Q^*Q = I \,\}, \]
for which any solution $Q$ is a unitary polar factor of $B^*A$. This problem comes from factor analysis and multidimensional scaling in statistics, where the aim is to see whether two data sets $A$ and $B$ are the same up to an orthogonal transformation.

Either of the SVD and the polar decomposition can be derived, or computed, from the other. Historically, the SVD came first (Beltrami, in 1873), with the polar decomposition three decades behind (Autonne, in 1902).

3 Distance to Singularity and Low-Rank Perturbations

The question commonly arises of whether a given perturbation of a nonsingular matrix $A$ preserves nonsingularity. In a sense, this question is trivial. Recalling that a square matrix is nonsingular when all its eigenvalues are nonzero, and that the product of two matrices is nonsingular unless one of them is singular, from $A + \Delta A = A(I + A^{-1}\Delta A)$ we see that $A + \Delta A$ is nonsingular as long as $A^{-1}\Delta A$ has no eigenvalue equal to $-1$. However, this is not an easy condition to check, and in practice we may not know $\Delta A$ but only a bound for its norm. Since any consistent matrix norm bounds the modulus of every eigenvalue from above, a sufficient condition for $A + \Delta A$ to be nonsingular is that $\|A^{-1}\Delta A\| < 1$, which is certainly true if $\|A^{-1}\|\|\Delta A\| < 1$. This condition can be rewritten as the inequality $\|\Delta A\|/\|A\| < \kappa(A)^{-1}$, where $\kappa(A) = \|A\|\|A^{-1}\| \ge 1$ is the condition number introduced in section 1. It turns out that we can always find a perturbation $\Delta A$ such that $A + \Delta A$ is singular and $\|\Delta A\|/\|A\| = \kappa(A)^{-1}$. It follows that the relative distance to singularity
\[ d(A) = \min\{\, \|\Delta A\|/\|A\| : A + \Delta A \text{ is singular} \,\} \tag{2} \]
is given by $d(A) = \kappa(A)^{-1}$. This reciprocal relation between problem conditioning and the distance to a singular problem (one with an infinite condition number) is common to a variety of problems in linear algebra and control theory, as shown by James Demmel in the 1980s.

We may want a more refined test for whether $A + \Delta A$ is nonsingular. To obtain one we will need to make some assumptions about the perturbation. Suppose that $\Delta A$ has rank 1: $\Delta A = xy^*$, for some vectors $x$ and $y$. From the analysis above we know that $A + \Delta A$ will be nonsingular if $A^{-1}\Delta A = A^{-1}xy^*$ has no eigenvalue equal to $-1$. Using the fact that the nonzero eigenvalues of $AB$ are the same as those of $BA$ for any conformable matrices $A$ and $B$, we see that the nonzero eigenvalues of $(A^{-1}x)y^*$ are the same as those of $y^*A^{-1}x$. Hence $A + xy^*$ is nonsingular as long as $y^*A^{-1}x \ne -1$.

Now that we know when $A + xy^*$ is nonsingular we might ask if there is an explicit formula for the inverse. Since $A + xy^* = A(I + A^{-1}xy^*)$ we can take $A = I$ without loss of generality. So we are looking for the inverse of $B = I + xy^*$. One way to find it is to guess that $B^{-1} = I + \theta xy^*$ for some scalar $\theta$ and equate the product with $B$ to $I$, to obtain $\theta(1 + y^*x) + 1 = 0$. Thus $(I + xy^*)^{-1} = I - xy^*/(1 + y^*x)$. The corresponding formula for $(A + xy^*)^{-1}$ is
\[ (A + xy^*)^{-1} = A^{-1} - A^{-1}xy^*A^{-1}/(1 + y^*A^{-1}x), \]
which is known as the Sherman–Morrison formula. This formula and its generalizations originate in the 1940s and have been rediscovered many times. The corresponding formula for a rank $p$ perturbation is the Sherman–Morrison–Woodbury formula: for $U, V \in \mathbb{C}^{n\times p}$,
\[ (A + UV^*)^{-1} = A^{-1} - A^{-1}U(I + V^*A^{-1}U)^{-1}V^*A^{-1}. \]

Important applications of these formulae are in optimization, where rank-1 or rank-2 updates are made to Hessian approximations in quasi-Newton methods and to basis matrices in the simplex method. More generally, the task of updating the solution to a problem after a coefficient matrix has undergone a low-rank change, or has had a row or column added or removed, arises in
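The Sherman–Morrison formula of this section can be checked numerically. The following is a minimal NumPy sketch, not a production routine: the test matrix, the vectors, and the helper name `sherman_morrison_solve` are illustrative assumptions. It solves $(A + xy^*)z = b$ using only solves with $A$, which is the point of the formula when a factorization of $A$ is already available.

```python
import numpy as np

def sherman_morrison_solve(solve_A, x, y, b):
    """Solve (A + x y^*) z = b given solve_A(v), a routine returning A^{-1} v.

    Uses (A + x y^*)^{-1} = A^{-1} - A^{-1} x y^* A^{-1} / (1 + y^* A^{-1} x).
    """
    Ainv_b = solve_A(b)
    Ainv_x = solve_A(x)
    denom = 1.0 + np.vdot(y, Ainv_x)          # 1 + y^* A^{-1} x (vdot conjugates y)
    if denom == 0:
        raise np.linalg.LinAlgError("A + x y^* is singular")
    return Ainv_b - Ainv_x * (np.vdot(y, Ainv_b) / denom)

# Illustrative data (assumed, not from the text).
rng = np.random.default_rng(0)
n = 6
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)                   # symmetric positive definite, ||A^{-1}||_2 <= 1/n
x = rng.standard_normal(n); x /= np.linalg.norm(x)
y = rng.standard_normal(n); y /= np.linalg.norm(y)
b = rng.standard_normal(n)

z = sherman_morrison_solve(lambda v: np.linalg.solve(A, v), x, y, b)
residual = np.linalg.norm((A + np.outer(x, y.conj())) @ z - b)
```

Because $x$ and $y$ are normalized and $\|A^{-1}\|_2 \le 1/n$ here, the denominator $1 + y^*A^{-1}x$ stays well away from zero; in general that quantity should be monitored, since the update becomes ill conditioned as it approaches zero.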
Figure 1 Gershgorin discs for the matrix in (3); the eigenvalues are marked as solid dots.

Figure 2 Fields of values for a pentadiagonal Toeplitz matrix (left) and a circulant matrix (right), both of dimension 32. The eigenvalues are denoted by crosses.

of Gershgorin’s theorem exist with discs replaced by other shapes.
The spectral radius $\rho(A)$ (the largest absolute value of any eigenvalue of $A$) satisfies $\rho(A) \le \|A\|$, as shown above, but this inequality can be arbitrarily weak, as the matrix $\begin{bmatrix} 1 & \theta \\ 0 & 1 \end{bmatrix}$ shows for $|\theta| \gg 1$. It is natural to ask whether there are any sharper relations between the spectral radius and norms. One answer is the equality
\[ \rho(A) = \lim_{k\to\infty} \|A^k\|^{1/k}. \tag{4} \]
Another is the result that given any $\varepsilon > 0$ there is a norm such that $\|A\| \le \rho(A) + \varepsilon$; however, the norm depends on $A$. This result can be used to give a proof of the fact, discussed in the article on the Jordan canonical form, that the powers of $A$ converge to zero if $\rho(A) < 1$.

The field of values, also known as the numerical range, is a tool that can be used for localization and many other purposes. It is defined for $A \in \mathbb{C}^{n\times n}$ by
\[ F(A) = \left\{ \frac{z^*Az}{z^*z} : 0 \ne z \in \mathbb{C}^n \right\}. \]
The set $F(A)$ is compact and convex (a nontrivial property proved by Toeplitz and Hausdorff) and it contains all the eigenvalues of $A$. For normal matrices it is the convex hull of the eigenvalues. The normal matrices $A$ are those for which $AA^* = A^*A$, and they include the Hermitian, the skew-Hermitian, and the unitary matrices. For a Hermitian matrix $F(A)$ is a segment of the real axis while for a skew-Hermitian matrix it is a segment of the imaginary axis. Figure 2 illustrates two fields of values, the second of which is the convex hull of the eigenvalues because a circulant matrix is normal.

y, respectively, then there is an eigenvalue $\lambda + \Delta\lambda$ of $A + \Delta A$ such that $\Delta\lambda = y^*\Delta Ax/(y^*x) + O(\|\Delta A\|^2)$ and so
\[ |\Delta\lambda| \le \frac{\|y\|_2\|x\|_2}{|y^*x|}\,\|\Delta A\| + O(\|\Delta A\|^2). \]
The term $\|y\|_2\|x\|_2/|y^*x|$ can be shown to be an (absolute) condition number for $\lambda$. It is at least 1 and tends to infinity as $y$ and $x$ approach orthogonality (which can never exactly be achieved for simple $\lambda$), so $\lambda$ can be very ill conditioned. However if $A$ is Hermitian then we can take $y = x$ and the bound simplifies to $|\Delta\lambda| \le \|\Delta A\| + O(\|\Delta A\|^2)$, so all the eigenvalues of a Hermitian matrix are perfectly conditioned.

Much research has been done to obtain eigenvalue perturbation bounds under both weaker and stronger assumptions about the problem. Suppose we drop the requirement that $\lambda$ is simple. Consider the matrix and perturbation
\[ A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \qquad \Delta A = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ \varepsilon & 0 & 0 \end{bmatrix}. \]
The eigenvalues of $A$ are all zero and those of $A + \Delta A$ are the third roots of $\varepsilon$. The change in the eigenvalue is proportional not to $\varepsilon$ but to a fractional power of $\varepsilon$. In general, the sensitivity of an eigenvalue depends on the Jordan structure for that eigenvalue.
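The contrast between the defective $3\times 3$ example and the Hermitian case can be observed numerically. The sketch below uses NumPy; the value of $\varepsilon$ and the Hermitian test matrix are illustrative assumptions rather than data from the text.

```python
import numpy as np

eps = 1e-9

# Defective case: a 3x3 Jordan block with eigenvalue 0, perturbed in the (3,1) entry.
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
dA = np.zeros((3, 3))
dA[2, 0] = eps
moved = np.abs(np.linalg.eigvals(A + dA))   # moduli of the cube roots of eps
jordan_shift = moved.max()                  # about eps**(1/3), far larger than eps

# Hermitian case for contrast: eigenvalues move by at most ||dH||_2.
H = np.diag([1.0, 2.0, 3.0])
dH = eps * np.ones((3, 3))                  # symmetric perturbation, 2-norm 3*eps
hermitian_shift = np.max(np.abs(np.linalg.eigvalsh(H + dH) - np.linalg.eigvalsh(H)))
```

With $\varepsilon = 10^{-9}$ the eigenvalues of the perturbed Jordan block have modulus $\varepsilon^{1/3} = 10^{-3}$, a million times larger than the perturbation, whereas the Hermitian eigenvalues move by no more than $\|\Delta H\|_2 = 3\varepsilon$.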
matrix, and this approach is used by some computer codes, for example the roots function of MATLAB. While standard eigenvalue algorithms do not exploit the structure of $C$, this approach has proved competitive with specialist polynomial root-finding algorithms. Another use for the relation is to obtain bounds for roots of polynomials from bounds for matrix eigenvalues, and vice versa.

Companion matrices have many interesting properties. For example, any nonderogatory $n\times n$ matrix is similar to a companion matrix. Companion matrices therefore have featured strongly in matrix analysis and also in control theory. However, similarity transformations to companion form are little used in practice because of problems with ill conditioning and numerical instability.

Returning to the characteristic polynomial, $p(\lambda) = \det(\lambda I - A) = \lambda^n - a_{n-1}\lambda^{n-1} - \cdots - a_0$, we know that $p(\lambda_i) = 0$ for every eigenvalue $\lambda_i$ of $A$. The Cayley–Hamilton theorem says that $p(A) = A^n - a_{n-1}A^{n-1} - \cdots - a_0I = 0$ (which cannot be obtained simply by putting “$\lambda = A$” in the previous expression!). Hence the $n$th power of $A$, and inductively all higher powers, are expressible as a linear combination of $I, A, \dots, A^{n-1}$. Moreover, if $A$ is nonsingular then from $A^{-1}p(A) = 0$ it follows that $A^{-1}$ can also be written as a polynomial in $A$ of degree at most $n - 1$. These relations are not useful for practical computation because the coefficients $a_i$ can vary tremendously in magnitude and it is not possible to compute them to high relative accuracy.

5.4 Eigenvalue Inequalities for Hermitian Matrices

The eigenvalues of Hermitian matrices $A \in \mathbb{C}^{n\times n}$, which in this section we order $\lambda_n \le \cdots \le \lambda_1$, satisfy many beautiful inequalities. Among the most important are those in the Courant–Fischer theorem (1905), which states that every eigenvalue is the solution of a min-max problem over a suitable subspace $S$ of $\mathbb{C}^n$:
\[ \lambda_i = \min_{\dim(S) = n-i+1}\ \max_{0 \ne x \in S} \frac{x^*Ax}{x^*x}. \]

where $\{\tilde{a}_{ii}\}$ is the set of diagonal elements of $A$ arranged in decreasing order: $\tilde{a}_{11} \ge \cdots \ge \tilde{a}_{nn}$. There is equality for $k = n$, since both sides equal $\mathrm{trace}(A)$. These inequalities say that the vector $[\lambda_1, \dots, \lambda_n]$ of eigenvalues majorizes the vector $[\tilde{a}_{11}, \dots, \tilde{a}_{nn}]$ of diagonal elements.

In general there is no useful formula for the eigenvalues of a sum $A + B$ of Hermitian matrices. However, the Courant–Fischer theorem yields the upper and lower bounds
\[ \lambda_k(A) + \lambda_n(B) \le \lambda_k(A + B) \le \lambda_k(A) + \lambda_1(B), \]
from which it follows that $|\lambda_k(A + B) - \lambda_k(A)| \le \max(|\lambda_n(B)|, |\lambda_1(B)|) = \|B\|_2$. The latter inequality again shows that the eigenvalues of a Hermitian matrix are well conditioned under perturbation.

The Cauchy interlace theorem has a different flavor. It relates the eigenvalues of successive leading principal submatrices $A_k = A(1:k, 1:k)$ by
\[ \lambda_{k+1}(A_{k+1}) \le \lambda_k(A_k) \le \lambda_k(A_{k+1}) \le \cdots \le \lambda_2(A_{k+1}) \le \lambda_1(A_k) \le \lambda_1(A_{k+1}) \]
for $k = 1:n-1$, showing that the eigenvalues of $A_k$ interlace those of $A_{k+1}$.

In 1962 Alfred Horn made a conjecture that a certain set of linear inequalities involving real numbers $\alpha_i$, $\beta_i$, and $\gamma_i$, $i = 1:n$, is necessary and sufficient for the existence of $n\times n$ Hermitian matrices $A$, $B$, and $C$ with eigenvalues the $\alpha_i$, $\beta_i$, and $\gamma_i$, respectively, such that $C = A + B$. The conjecture was open for many years but was finally proved to be true in papers published by Klyachko in 1998 and Knutson and Tao in 1999, which exploit deep connections with algebraic geometry, representations of Lie groups, and quantum cohomology.

5.5 Solving the Non-Hermitian Eigenproblem

The simplest method for computing eigenvalues, the power method, computes just one: the largest in modulus. It comprises repeated multiplication of a starting
vector $x$ by $A$. Since the resulting sequence is liable to overflow or underflow in floating-point arithmetic one normalizes the vector after each iteration. Therefore one step of the power method has the form $x \leftarrow Ax$, $x \leftarrow \nu^{-1}x$, where $\nu = x_j$ with $|x_j| = \max_i |x_i|$. If $A$ has a unique eigenvalue $\lambda$ of largest modulus and the starting vector has a component in the direction of the corresponding eigenvector then $\nu$ converges to $\lambda$ and $x$ converges to the corresponding eigenvector. The power method is most often applied to $(A - \mu I)^{-1}$, where $\mu$ is an approximation to an eigenvalue of interest. In this form it is known as inverse iteration and convergence is to the eigenvalue closest to $\mu$. We now turn to methods that compute all the eigenvalues.

Since similarities $X^{-1}AX$ preserve the eigenvalues and change the eigenvectors in a controlled way, carrying out a sequence of similarity transformations to reduce $A$ to a simpler form is a natural way to tackle the eigenproblem. Some early methods used nonunitary $X$, but such transformations are now avoided because of numerical instability when $X$ is ill conditioned. Since the 1960s the focus has been on using unitary similarities to compute the Schur decomposition $A = QTQ^*$, where $Q$ is unitary and $T$ is upper triangular. The diagonal entries of $T$ are the eigenvalues of $A$, and they can be made to appear in any order by appropriate choice of $Q$. The first $k$ columns of $Q$ span an invariant subspace corresponding to the eigenvalues $t_{11}, \dots, t_{kk}$. Eigenvectors can be obtained by solving triangular systems involving $T$.

For some matrices the Schur factor $T$ is diagonal; these are precisely the normal matrices defined in section 5.1. The real Schur decomposition contains only real matrices when $A$ is real: $A = QRQ^T$, where $Q$ is orthogonal and $R$ is real upper quasi-triangular, which means that $R$ is upper triangular except for $2\times 2$ blocks on the diagonal corresponding to complex conjugate eigenvalues.

The standard algorithm for solving the non-Hermitian eigenproblem is the QR algorithm, which was proposed independently by John Francis and Vera Kublanovskaya in 1961. The matrix $A \in \mathbb{C}^{n\times n}$ is first unitarily reduced to upper Hessenberg form $H = U^*AU$ ($h_{ij} = 0$ for $i > j + 1$), with $U$ a product of Householder matrices. The QR iteration constructs a sequence of upper Hessenberg matrices beginning with $H_1 = H$ defined by $H_k - \mu_kI =: Q_kR_k$ (QR factorization, computed using Givens rotations), $H_{k+1} := R_kQ_k + \mu_kI$, where the $\mu_k$ are shifts chosen to accelerate the convergence of $H_k$ to upper triangular form. It is easy to check that $H_{k+1} = Q_k^*H_kQ_k$, so the QR iteration carries out a sequence of unitary similarity transformations.

Why the QR iteration works is not obvious but can be elegantly explained by analyzing the subspaces spanned by the columns of $Q_k$. To produce a practical and efficient algorithm various refinements of the iteration are needed, which include

• deflation, whereby when an element on the first subdiagonal of $H_k$ becomes small, that element is set to zero and the problem is split into two smaller problems that are solved independently,
• a double shift technique for real $A$ that allows two QR steps with complex conjugate shifts to be carried out entirely in real arithmetic and gives convergence to the real Schur form,
• a multishift technique for including $m$ different shifts in a single QR iteration.

A proof of convergence is lacking for all current shift strategies. Implementations introduce a random shift when convergence appears to be stagnating. The QR algorithm works very well in practice and continues to be the method of choice for the non-Hermitian eigenproblem.

5.6 Solving the Hermitian Eigenproblem

The eigenvalue problem for Hermitian matrices is easier to solve than that for non-Hermitian matrices and the range of available numerical methods is much wider.

To solve the complete Hermitian eigenproblem we need to compute the spectral decomposition $A = QDQ^*$, where $D = \mathrm{diag}(\lambda_i)$ contains the eigenvalues and the columns of the unitary matrix $Q$ are the corresponding eigenvectors. Many methods begin by unitary reduction to tridiagonal form $T = U^*AU$, where $t_{ij} = 0$ for $|i - j| > 1$ and the unitary matrix $U$ is constructed as a product of Householder matrices. The eigenvalue problem for $T$ is much simpler, though still nontrivial. The most widely used method is the QR algorithm, which has the same form as in the non-Hermitian case but with the upper Hessenberg $H_k$ replaced by the Hermitian tridiagonal $T_k$ and the shifts chosen to accelerate the convergence of $T_k$ to diagonal form. The Hermitian QR algorithm with appropriate shifts has been proved to converge at a cubic rate.

Another method for solving the Hermitian tridiagonal eigenproblem is the divide and conquer method.
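For reference, the basic QR iteration $H_k - \mu_kI = Q_kR_k$, $H_{k+1} = R_kQ_k + \mu_kI$, applied to a Hermitian tridiagonal matrix, can be sketched in a few lines. This is only an illustration under simplifying assumptions that are not in the text: a small real symmetric test matrix chosen here, a fixed zero shift, NumPy's dense `numpy.linalg.qr` in place of Givens rotations, and no deflation. Practical codes differ in all of these respects.

```python
import numpy as np

def qr_iteration(H, shift=0.0, iters=200):
    """Apply `iters` steps of H <- R Q + shift*I, where H - shift*I = Q R."""
    H = np.array(H, dtype=float)
    I = np.eye(H.shape[0])
    for _ in range(iters):
        Q, R = np.linalg.qr(H - shift * I)
        H = R @ Q + shift * I           # equals Q^T H Q, a similarity transformation
    return H

# Illustrative symmetric tridiagonal matrix with eigenvalues of distinct modulus.
T = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])
Tk = qr_iteration(T)
approx_eigs = np.sort(np.diag(Tk))                    # diagonal of the (nearly) diagonal limit
off_diag = np.linalg.norm(Tk - np.diag(np.diag(Tk)))  # size of the remaining off-diagonal part
```

With a zero shift the off-diagonal entries decay only linearly, at a rate governed by ratios of eigenvalue moduli, which is why shifts and deflation matter so much in practice.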
The divide and conquer method decouples $T$ in the form
\[ T = \begin{bmatrix} T_{11} & 0 \\ 0 & T_{22} \end{bmatrix} + \alpha vv^*, \]
where only the trailing diagonal element of $T_{11}$ and the leading diagonal element of $T_{22}$ differ from the corresponding elements of $T$ and hence the vector $v$ has only two nonzero elements. The eigensystems of $T_{11}$ and $T_{22}$ are found by applying the method recursively, yielding $T_{11} = Q_1\Lambda_1Q_1^*$ and $T_{22} = Q_2\Lambda_2Q_2^*$. Then
\[ T = \begin{bmatrix} Q_1\Lambda_1Q_1^* & 0 \\ 0 & Q_2\Lambda_2Q_2^* \end{bmatrix} + \alpha vv^* = \mathrm{diag}(Q_1, Q_2)\bigl(\mathrm{diag}(\Lambda_1, \Lambda_2) + \alpha\tilde{v}\tilde{v}^*\bigr)\mathrm{diag}(Q_1, Q_2)^*, \]
where $\tilde{v} = \mathrm{diag}(Q_1, Q_2)^*v$. The eigensystem of a rank-1 perturbed diagonal matrix $D + \rho zz^*$ can be found by solving the secular equation obtained by equating the characteristic polynomial to zero:
\[ f(\lambda) = 1 + \rho\sum_{j=1}^{n} \frac{|z_j|^2}{d_{jj} - \lambda} = 0. \]
Putting the pieces together yields the overall eigendecomposition.

Other methods are suitable for computing just a portion of the spectrum. Suppose we want to compute the $k$th smallest eigenvalue of $T$ and that we can somehow compute the integer $N(x)$ equal to the number of eigenvalues of $T$ that are less than or equal to $x$. Then we can apply the bisection method to $N(x)$ to find the point where $N(x)$ jumps from $k - 1$ to $k$. We can compute $N(x)$ by making use of the following result about the inertia of a Hermitian matrix, defined by $\mathrm{inertia}(A) = (\nu, \zeta, \pi)$, where $\nu$ is the number of negative eigenvalues, $\zeta$ is the number of zero eigenvalues, and $\pi$ is the number of positive eigenvalues.

Theorem 2 (Sylvester’s inertia theorem). If $A$ is Hermitian and $M$ is nonsingular then $\mathrm{inertia}(A) = \mathrm{inertia}(M^*AM)$.

Sylvester’s inertia theorem says that the number of negative, zero, and positive eigenvalues does not change under congruence transformations. By using GE we can factorize² $T - xI = LDL^*$, where $D$ is diagonal and $L$ is unit lower bidiagonal (a bidiagonal matrix is one that is both triangular and tridiagonal). Then $\mathrm{inertia}(T - xI) = \mathrm{inertia}(D)$, so the number of negative or zero diagonal elements of $D$ equals the number of eigenvalues of $T - xI$ less than or equal to 0, which is the number of eigenvalues of $T$ less than or equal to $x$, that is, $N(x)$. The $LDL^*$ factors of a tridiagonal matrix can be computed in $O(n)$ flops, so this bisection process is efficient. An alternative approach can be built by using properties of Sturm sequences, which are sequences comprising the characteristic polynomials of leading principal submatrices of $T - \lambda I$.

² The factorization may not exist, but if it does not we can simply perturb $T$ slightly and try again without any loss of numerical stability.

5.7 Computing the SVD

For a rectangular matrix $A \in \mathbb{C}^{m\times n}$ the eigenvalues of the Hermitian matrix $\begin{bmatrix} 0 & A \\ A^* & 0 \end{bmatrix}$ of dimension $m + n$ are plus and minus the singular values of $A$, together with $m + n - 2\min(m, n)$ additional zeros. Hence the SVD can be computed via the eigendecomposition of this larger matrix. However, this would be inefficient, and instead one uses algorithms that work directly on $A$ and are analogues of the algorithms for Hermitian matrices. The standard approach is to reduce $A$ to bidiagonal form $B$ by Householder transformations applied on the left and the right and then to apply an adaptation of the QR algorithm that works on the bidiagonal factor (and implicitly applies the QR algorithm to the tridiagonal matrix $B^*B$).

5.8 Generalized Eigenproblems

The generalized eigenvalue problem (GEP) $Ax = \lambda Bx$, with $A, B \in \mathbb{C}^{n\times n}$, can be converted into a standard eigenvalue problem if $B$ (say) is nonsingular: $B^{-1}Ax = \lambda x$. However, such a transformation is inadvisable numerically unless $B$ is very well conditioned. If $A$ and $B$ have a common null vector $z$ the problem takes on a different character because then $(A - \lambda B)z = 0$ for any $\lambda$; such a problem is called singular. We will assume that the problem is regular, so that $\det(A - \lambda B) \not\equiv 0$. The linear polynomial $A - \lambda B$ is sometimes called a pencil.

It is convenient to write $\lambda = \alpha/\beta$, where $\alpha$ and $\beta$ are not both zero, and rephrase the problem in the more symmetric form $\beta Ax = \alpha Bx$. If $x$ is a nonzero vector such that $Bx = 0$ then, since the problem is assumed to be regular, $Ax \ne 0$ and so $\beta = 0$. This means that $\lambda = \infty$ is an eigenvalue. Infinite eigenvalues may seem a strange concept, but in fact they are no different in most respects from finite eigenvalues.

An important special case is the definite generalized eigenvalue problem, in which $A$ and $B$ are Hermitian and $B$ (say) is positive definite. If $B = R^*R$ is a Cholesky factorization then $Ax = \lambda Bx$ can be rewritten as $R^{-*}AR^{-1}\cdot Rx = \lambda Rx$, which is a standard eigenproblem for the Hermitian matrix $C = R^{-*}AR^{-1}$. This
argument shows that the eigenvalues of a definite problem are all real.
Definite generalized eigenvalue problems arise in many physical situations
where an energy minimization principle is at work, such as in problems in
engineering and physics.

A generalization of the QR algorithm called the QZ algorithm computes a
generalization to two matrices of the Schur decomposition: $Q^*AZ = T$,
$Q^*BZ = S$, where Q and Z are unitary and T and S are upper triangular. The
generalized Schur decomposition yields the eigenvalues as the ratios
$t_{ii}/s_{ii}$ and enables eigenvectors to be computed by substitution.

The quadratic eigenvalue problem (QEP)
$Q(\lambda)x = (\lambda^2 A_2 + \lambda A_1 + A_0)x = 0$, where
$A_i \in \mathbb{C}^{n\times n}$, $i = 0\colon 2$, arises most commonly in the
dynamic analysis of structures when the finite element method is used to
discretize the original PDE into a system of second-order ODEs
$A_2\ddot{q}(t) + A_1\dot{q}(t) + A_0 q(t) = f(t)$. Here, the $A_i$ are usually
Hermitian (though $A_1$ is skew-Hermitian in gyroscopic systems) and positive
(semi)definite. Analogously to the GEP, the QEP is said to be regular if
$\det(Q(\lambda)) \not\equiv 0$. The quadratic problem differs fundamentally
from the linear GEP because a regular problem has 2n eigenvalues, which are the
roots of $\det(Q(\lambda)) = 0$, but at most n linearly independent
eigenvectors, and a vector may be an eigenvector for two different eigenvalues.
For example, the QEP with
\[
  Q(\lambda) = \lambda^2 I
    + \lambda \begin{bmatrix} -1 & -6 \\ 2 & -9 \end{bmatrix}
    + \begin{bmatrix} 0 & 12 \\ -2 & 14 \end{bmatrix}
\]
has eigenvalues 1, 2, 3, and 4, with eigenvectors
$\begin{bmatrix} 1 \\ 0 \end{bmatrix}$, $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$,
$\begin{bmatrix} 1 \\ 1 \end{bmatrix}$, and
$\begin{bmatrix} 1 \\ 1 \end{bmatrix}$, respectively. Moreover, there is no
Schur form for three or more matrices, that is, we cannot in general find
unitary matrices U and V such that $U^*A_iV$ is triangular for $i = 0\colon 2$.

Associated with the QEP is the matrix $Q(X) = A_2X^2 + A_1X + A_0$, with
$X \in \mathbb{C}^{n\times n}$. From the relation
\[
  Q(\lambda) - Q(X) = A_2(\lambda^2 I - X^2) + A_1(\lambda I - X)
                    = (\lambda A_2 + A_2 X + A_1)(\lambda I - X)
\]
it is clear that if we can find a matrix X such that Q(X) = 0, known as a
solvent, then we have reduced the QEP to finding the eigenvalues of X and
solving one n × n GEP. For the 2 × 2 Q above there are five solvents, one of
which is $\begin{bmatrix} 3 & 0 \\ 1 & 2 \end{bmatrix}$. The existence and
enumeration of solvents is nontrivial and leads into the theory of matrix
polynomials. In general, matrix polynomials are matrices of the form
$\sum_{i=0}^{k} \lambda^i A_i$ whose elements are polynomials in a complex
variable; an older term for such matrices is λ-matrices.

The standard approach for numerical solution of the QEP mimics the conversion
of the scalar polynomial root problem into a matrix eigenproblem described in
section 5.3. From the relation
\[
  L(\lambda)z \equiv
  \left( \begin{bmatrix} A_1 & A_0 \\ I & 0 \end{bmatrix}
       + \lambda \begin{bmatrix} A_2 & 0 \\ 0 & -I \end{bmatrix} \right)
  \begin{bmatrix} \lambda x \\ x \end{bmatrix}
  = \begin{bmatrix} Q(\lambda)x \\ 0 \end{bmatrix}
\]
we see that the eigenvalues of the quadratic Q are the eigenvalues of the
2n × 2n linear polynomial L(λ). This is an example of an exact linearization
process, thanks to the hidden λ in the eigenvector! The eigenvalues of L can be
found using the QZ algorithm. The eigenvectors of L have the form
$z = \begin{bmatrix} \lambda x \\ x \end{bmatrix}$, where x is an eigenvector
of Q, and so x can be obtained from either the first n (if λ ≠ 0) or the last n
components of z.

6 Sparse Linear Systems

For linear systems coming from discretization of differential equations it is
common that A is banded, that is, the nonzero elements lie in a band about the
main diagonal. An extreme case is a tridiagonal matrix, of which the classic
example is the second-difference matrix, illustrated for n = 4 by
\[
  A = \begin{bmatrix} -2 & 1 & 0 & 0 \\ 1 & -2 & 1 & 0 \\
                      0 & 1 & -2 & 1 \\ 0 & 0 & 1 & -2 \end{bmatrix}, \qquad
  A^{-1} = -\frac{1}{5}
           \begin{bmatrix} 4 & 3 & 2 & 1 \\ 3 & 6 & 4 & 2 \\
                           2 & 4 & 6 & 3 \\ 1 & 2 & 3 & 4 \end{bmatrix}.
\]
This matrix corresponds to a centered finite difference approximation to a
second derivative: $f''(x) \approx (f(x+h) - 2f(x) + f(x-h))/h^2$. Note that
$A^{-1}$ is a full matrix. For banded matrices, GE produces banded LU factors
and its computational cost is proportional to n times the square of the
bandwidth.

A matrix is sparse if advantage can be taken of the zero entries, because of
either their number or their distribution. A banded matrix is a special case of
a sparse matrix. Sparse matrices are stored on a computer not as a square array
but in a special format that records only the nonzeros and their location in
the matrix. This can be done with three vectors: one to store the nonzero
entries and the other two to define the row and column indices of the elements
in the first vector.

Sparse matrices help to explain the tenet: never solve a linear system Ax = b
by computing $x = A^{-1} \times b$. The reasons for eschewing $A^{-1}$ are
threefold:

• Computing $A^{-1}$ requires three times as many flops as solving Ax = b by GE
  with partial pivoting.
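The 2 × 2 quadratic eigenvalue example and the linearization L(λ) given above can be checked numerically. The following is a minimal NumPy sketch rather than the QZ-based approach described in the text: it assumes that the coefficient of λ in L(λ) is nonsingular (true here because A2 = I), so that the pencil can be reduced to a standard eigenproblem. The names `X`, `Y`, `C`, and `S` are illustrative only.

```python
import numpy as np

# Coefficients of the 2-by-2 example Q(lambda) = lambda^2*A2 + lambda*A1 + A0.
A2 = np.eye(2)
A1 = np.array([[-1.0, -6.0], [2.0, -9.0]])
A0 = np.array([[0.0, 12.0], [-2.0, 14.0]])
n = 2

def Q(lam):
    """Evaluate the quadratic matrix polynomial at the scalar lam."""
    return lam**2 * A2 + lam * A1 + A0

I, Z = np.eye(n), np.zeros((n, n))
X = np.block([[A1, A0], [I, Z]])   # constant term of L(lambda)
Y = np.block([[A2, Z], [Z, -I]])   # coefficient of lambda in L(lambda)

# (X + lambda*Y) z = 0 is equivalent to (-Y^{-1} X) z = lambda z
# when Y is nonsingular (an assumption that holds for this example).
C = -np.linalg.solve(Y, X)
eigvals, eigvecs = np.linalg.eig(C)

# Each eigenvector has the form z = [lambda*x; x], so x is the last n entries.
residuals = [np.linalg.norm(Q(lam) @ z[n:]) for lam, z in zip(eigvals, eigvecs.T)]

# The solvent quoted in the text should satisfy Q(S) = A2 S^2 + A1 S + A0 = 0.
S = np.array([[3.0, 0.0], [1.0, 2.0]])
solvent_residual = A2 @ S @ S + A1 @ S + A0

print(np.sort(eigvals.real))
```

For a general QEP with singular or ill-conditioned A2, a generalized eigensolver applied directly to the pencil would be the safer choice.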
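The three-vector storage scheme and the remark that the inverse of the second-difference matrix is full can be illustrated with a short NumPy sketch. The helper names `second_difference_coo` and `coo_matvec` are hypothetical, and the dense copy of A is formed only to check the result; a real sparse code would use a sparse direct or iterative solver instead.

```python
import numpy as np

def second_difference_coo(n):
    """Return (vals, rows, cols): the nonzeros of the n-by-n second-difference
    matrix and their row and column indices (coordinate storage)."""
    vals, rows, cols = [], [], []
    for i in range(n):
        for j, v in ((i - 1, 1.0), (i, -2.0), (i + 1, 1.0)):
            if 0 <= j < n:
                vals.append(v)
                rows.append(i)
                cols.append(j)
    return np.array(vals), np.array(rows), np.array(cols)

def coo_matvec(vals, rows, cols, x):
    """Compute y = A x touching only the stored nonzeros."""
    y = np.zeros(len(x))
    np.add.at(y, rows, vals * x[cols])   # accumulate a_ij * x_j into y_i
    return y

n = 4
vals, rows, cols = second_difference_coo(n)

# Dense copy, used here only for checking.
A = np.zeros((n, n))
A[rows, cols] = vals

A_inv = np.linalg.inv(A)                 # every entry of the inverse is nonzero
expected_inv = -np.array([[4, 3, 2, 1],
                          [3, 6, 4, 2],
                          [2, 4, 6, 3],
                          [1, 2, 3, 4]]) / 5.0

b = np.arange(1.0, n + 1)
x = np.linalg.solve(A, b)                # solve Ax = b without forming A^{-1}
```

The tridiagonal matrix needs only 3n − 2 stored values, whereas its inverse needs all n² entries, which is one way to see why forming the inverse is avoided.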
equations using Cholesky factorization. For reasons of numerical stability, it
is preferable to use a QR factorization: if
$A = Q \begin{bmatrix} R_1 \\ 0 \end{bmatrix}$ then the normal equations reduce
to the triangular system $R_1 x = c$, where c is the first n components of
$Q^*b$.

When A is rank deficient there are many least squares solutions, which vary
widely in norm. A natural choice is one of minimal 2-norm, and in fact there is
a unique minimal 2-norm solution, $x_{LS}$, given by
\[
  x_{LS} = \sum_{i=1}^{r} (u_i^* b/\sigma_i)\, v_i,
\]
where
\[
  A = U\Sigma V^*, \quad U = [u_1, \dots, u_m], \quad V = [v_1, \dots, v_n]
  \tag{7}
\]
is an SVD and r = rank(A). The use of this formula in practice is not
straightforward because a matrix stored in floating-point arithmetic will
rarely have any zero singular values. Therefore r must be chosen by designating
which singular values can be regarded as negligible and this choice should take
account of the accuracy with which the elements of A are known.

Another choice of least squares solution in the rank-deficient case is a basic
solution: one with at most r nonzeros. Such a solution can be computed via the
QR factorization with column pivoting.

7.2 Underdetermined Systems

When m < n and A has full rank, there are infinitely many solutions to Ax = b
and again it is natural to seek one of minimal 2-norm. There is a unique such
solution $x_{LS} = A^*(AA^*)^{-1}b$, and it is best computed via a QR
factorization, this time of $A^*$. A basic solution, with m nonzeros, can
alternatively be computed. As a simple example, consider the problem “find two
numbers whose sum is 5”, that is, solve
$[1\ \ 1] \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 5$. A basic solution is
$[5\ \ 0]^T$ while the minimal 2-norm solution is $[5/2\ \ 5/2]^T$. Minimal
1-norm solutions to underdetermined systems are important in compressed
sensing.

7.3 Pseudoinverse

The analysis in the previous two subsections can be unified in a very elegant
way by making use of the Moore–Penrose pseudoinverse $A^+$ of
$A \in \mathbb{C}^{m\times n}$, which is defined as the unique
$X \in \mathbb{C}^{n\times m}$ satisfying the Moore–Penrose conditions
\[
  AXA = A, \quad XAX = X, \quad (AX)^* = AX, \quad (XA)^* = XA.
\]
(It is certainly not obvious that these equations have a unique solution.) In
the case where A is square and nonsingular it is easily seen that $A^+$ is just
$A^{-1}$. Moreover, if rank(A) = n then $A^+ = (A^*A)^{-1}A^*$, while if
rank(A) = m then $A^+ = A^*(AA^*)^{-1}$. In terms of the SVD (7),
\[
  A^+ = V \operatorname{diag}(\sigma_1^{-1}, \dots, \sigma_r^{-1}, 0, \dots, 0)\, U^*,
\]
where r = rank(A). The formula $x_{LS} = A^+b$ holds for all m and n, so the
pseudoinverse yields the minimal 2-norm solution to both the least squares
(overdetermined) problem Ax = b and an underdetermined system Ax = b. The
pseudoinverse has many interesting properties, including $(A^+)^+ = A$, but it
is not always true that $(AB)^+ = B^+A^+$.

Although the pseudoinverse is a very useful theoretical tool it is rarely
necessary to compute it explicitly (just as for its special case the matrix
inverse).

The pseudoinverse is just one of many ways of generalizing the notion of
inverse to rectangular matrices, but it is the right one for minimum 2-norm
solutions to linear systems. Other generalized inverses can be obtained by
requiring only a subset of the four Moore–Penrose conditions to hold.

8 Numerical Considerations

Prior to the introduction of the first digital computers in the 1940s,
numerical computations were carried out by humans, sometimes with the aid of
mechanical calculators. The human involvement in a sequence of calculations
meant that potentially dangerous events such as dividing by a tiny number or
subtracting two numbers that agree to almost all their significant digits could
be observed, their effect monitored, and possible corrective action taken, such
as temporarily increasing the precision of the calculations. On the very early
computers intermediate results were observed on a cathode-ray tube monitor, but
this became impossible as problem sizes increased (along with available
computing power). Fears were raised in the 1940s that algorithms such as GE
would suffer exponential growth of errors as the problem dimension increased,
due to the rapidly increasing number of arithmetic operations, each having its
associated rounding error. These fears were particularly concerning given that
the error growth might be unseen and unsuspected.

The subject of rounding error analysis grew out of the need to understand the
effect on algorithms of rounding errors. The person who did the most to
develop the subject was James Wilkinson, whose influential papers and 1961 and
1965 books showed how backward error analysis can be used to obtain deep
insights into numerical stability. We will discuss just two particular
examples.

upper triangular matrix $\widehat{T}$ such that
\[
  \widetilde{Q}^*(A + \Delta A)\widetilde{Q} = \widehat{T}, \qquad
  \|\Delta A\|_F \le p(n)\,u\,\|A\|_F,
\]
where $\widetilde{Q}$ is some exactly unitary matrix and p(n) is a cubic
polynomial. The computed Schur factor $\widehat{Q}$ is not
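Returning to the pseudoinverse of section 7.3, the "two numbers whose sum is 5" example can be checked numerically. This is a small sketch using NumPy's `numpy.linalg.pinv`, which computes the pseudoinverse from an SVD; forming the pseudoinverse explicitly is done here only for illustration, since the text notes that it is rarely necessary in practice. The helper `moore_penrose_residuals` is a hypothetical name.

```python
import numpy as np

A = np.array([[1.0, 1.0]])     # m = 1, n = 2, full row rank
b = np.array([5.0])

A_pinv = np.linalg.pinv(A)     # Moore-Penrose pseudoinverse via the SVD
x_min = A_pinv @ b             # minimal 2-norm solution A^+ b

# For real A of full row rank, A^+ = A^T (A A^T)^{-1}.
x_formula = A.T @ np.linalg.solve(A @ A.T, b)

x_basic = np.array([5.0, 0.0]) # a basic solution with m = 1 nonzero

def moore_penrose_residuals(A, X):
    """Norms of the residuals of the four Moore-Penrose conditions."""
    return [
        np.linalg.norm(A @ X @ A - A),
        np.linalg.norm(X @ A @ X - X),
        np.linalg.norm((A @ X).conj().T - A @ X),
        np.linalg.norm((X @ A).conj().T - X @ A),
    ]

print(x_min, np.linalg.norm(x_min), np.linalg.norm(x_basic))
```

Both `x_min` and `x_basic` solve the system exactly; they differ only in norm.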
kth iterate. Subtracting Mx = Nx + b from (8) gives
$M(x^{(k+1)} - x) = N(x^{(k)} - x)$, so
\[
  e^{(k+1)} = M^{-1}N e^{(k)} = \cdots = (M^{-1}N)^{k+1} e^{(0)}. \tag{9}
\]
If $\rho(M^{-1}N) < 1$ then $(M^{-1}N)^k \to 0$ as $k \to \infty$ (see Jordan
canonical form) and so $x^{(k)}$ converges to x, at a linear rate. In practice,
for convergence in a reasonable number of iterations we need $\rho(M^{-1}N)$ to
be sufficiently less than 1 and the powers of $M^{-1}N$ should not grow too
large initially before eventually decaying; in other words, $M^{-1}N$ must not
be too nonnormal.

Three standard choices of splitting are, with D = diag(A) and L and U denoting
the strictly lower and strictly upper triangular parts of A, respectively,

• M = D, N = −(L + U): Jacobi iteration;
• M = D + L, N = −U: Gauss–Seidel iteration;
• $M = \frac{1}{\omega}D + L$, $N = \frac{1-\omega}{\omega}D - U$, where
  ω ∈ (0, 2) is a relaxation parameter: successive overrelaxation (SOR)
  iteration.

Sufficient conditions for convergence are that A is strictly diagonally
dominant by rows for the Jacobi iteration and that A is symmetric positive
definite for the Gauss–Seidel iteration. How to choose ω so that
$\rho(M^{-1}N)$, viewed as a function of ω, is minimized for the SOR iteration
was elucidated in the landmark 1950 PhD thesis of David Young.

The Google PageRank algorithm, which underlies Google’s ordering of search
results, can be interpreted as an application of the Jacobi iteration to a
certain linear system involving the adjacency matrix of the graph corresponding
to the whole world wide web. However, the most common use of stationary
iterative methods is as preconditioners within other iterative methods.

The aim of preconditioning is to convert a given linear system Ax = b into one
that can be solved more cheaply by a particular iterative method. The basic
idea is to use a nonsingular matrix W to transform the system to
$(W^{-1}A)x = W^{-1}b$ in such a way that (a) the preconditioned system can be
solved in fewer iterations than the original system and (b) matrix–vector
multiplications with $W^{-1}A$ (which require the solution of a linear system
with coefficient matrix W) are not significantly more expensive than
matrix–vector multiplications with A. In general, this is a difficult or
impossible task, but in many applications the matrix A has structure that can
be exploited. For example, many elliptic PDE problems lead to a positive
definite matrix A of the form
\[
  A = \begin{bmatrix} M_1 & F \\ F^T & M_2 \end{bmatrix},
\]
where $M_1 z = d_1$ and $M_2 z = d_2$ are easy to solve. In this case it is
natural to take $W = \operatorname{diag}(M_1, M_2)$ as the preconditioner. When
A is Hermitian positive definite the preconditioned system is written in a way
that preserves the structure. For example, for the Jacobi preconditioner,
D = diag(A), the preconditioned system would be written
$D^{-1/2}AD^{-1/2}\widetilde{x} = \widetilde{b}$, where
$\widetilde{x} = D^{1/2}x$ and $\widetilde{b} = D^{-1/2}b$. Here, the matrix
$D^{-1/2}AD^{-1/2}$ has unit diagonal and off-diagonal elements lying between
−1 and 1.

The most powerful iterative methods for linear systems Ax = b are the Krylov
methods. In these methods each iterate $x^{(k)}$ is chosen from the shifted
subspace $x^{(0)} + K_k(A, r^{(0)})$ where
\[
  K_k(A, r^{(0)}) = \operatorname{span}\{ r^{(0)}, Ar^{(0)}, \dots, A^{k-1}r^{(0)} \}
\]
is a Krylov subspace of dimension k, with $r^{(k)} = b - Ax^{(k)}$. Different
strategies for choosing approximations from within the Krylov subspaces yield
different methods. For example, the conjugate gradient method (CG, for
Hermitian positive definite A) and the full orthogonalization method (FOM, for
general A) make the residual $r^{(k)}$ orthogonal to the Krylov subspace
$K_k(A, r^{(0)})$, while the minimal residual method (MINRES, for Hermitian A)
and the generalized minimal residual method (GMRES, for general A) minimize the
2-norm of the residual over all vectors in the Krylov subspace. How to compute
the vectors defined in these ways is nontrivial. It turns out that CG can be
implemented with a recurrence requiring just one matrix–vector multiplication
and three inner products per iteration, and MINRES is just a little more
expensive. GMRES, being applicable to non-Hermitian matrices, is significantly
more expensive, and it is also much harder to analyze its convergence behavior.
For general matrices there are alternatives to GMRES that employ short
recurrences. We mention just BiCGSTAB, which has the distinction that the 1992
paper by Henk van der Vorst that introduced it was the most-cited paper in
mathematics of the 1990s.

Theoretically, Krylov methods converge in at most n iterations for a system of
dimension n. However, in practical computation rounding errors intervene and
the methods behave as truly iterative methods not having finite termination.
Since n is potentially huge, a Krylov method would not be used unless a good
approximate solution was obtained in many fewer than n iterations, and
preconditioning plays a crucial role here. Available error bounds for a method
help to guide
the choice of preconditioner, but care is needed in interpreting the bounds. To
illustrate this, consider the CG method for Ax = b, where A is Hermitian
positive definite. In the A-norm, $\|z\|_A = (z^*Az)^{1/2}$, the error on the
kth step satisfies
\[
  \|x - x^{(k)}\|_A \le 2\,\|x - x^{(0)}\|_A
  \left( \frac{\kappa_2(A)^{1/2} - 1}{\kappa_2(A)^{1/2} + 1} \right)^{k},
\]
where $\kappa_2(A) = \|A\|_2 \|A^{-1}\|_2$. If we can precondition A so that
its 2-norm condition number is very close to 1 then fast convergence is
guaranteed. However, another result says that if A has k distinct eigenvalues
then CG converges in at most k iterations. Therefore a better approach might be
to choose the preconditioner so that the eigenvalues of the preconditioned
matrix are clustered into a small number of groups.

Another important class of iterative methods is multigrid methods, which work
on a hierarchy of grids that come from a discretization of an underlying PDE
(geometric multigrid) or are constructed artificially from a given matrix
(algebraic multigrid).

An important practical issue is how to terminate an iteration. Popular
approaches are to stop when the residual $r^{(k)} = b - Ax^{(k)}$ (suitably
scaled) is small or when an estimate of the error $x - x^{(k)}$ is small.
Complicating factors include the fact that the preconditioner can change the
norm and a possible desire to match the error in the iterations with the
discretization error in the PDE from which the linear system might have come
(as there is no point solving the system to greater accuracy than the data
warrants). Research in recent years has led to good understanding of these
issues.

The ideas of Krylov methods and preconditioners can be applied to problems
other than linear systems. A popular Krylov method for solving the least
squares problem (6) is LSQR, which is mathematically equivalent to applying CG
to the normal equations. In large-scale eigenvalue problems only a few
eigenpairs are usually required. A number of methods project the original
matrix onto a Krylov subspace and then solve a smaller eigenvalue problem.
These include the Lanczos method for Hermitian matrices and the Arnoldi method
for general matrices. Also of much current research interest are rational
Krylov methods based on rational generalizations of Krylov subspaces.

10 Nonnormality and Pseudospectra

Normal matrices $A \in \mathbb{C}^{n\times n}$ (defined in section 5.1) have
the property that they are unitarily diagonalizable: $A = QDQ^*$ for some
unitary Q and diagonal $D = \operatorname{diag}(\lambda_i)$ containing the
eigenvalues on its diagonal. In many respects, normal matrices have very
predictable behavior. For example, $\|A^k\|_2 = \rho(A)^k$ and
$\|e^{tA}\|_2 = e^{\alpha(tA)}$, where the spectral abscissa α(tA) is the
largest real part of any eigenvalue of tA. However, matrices that arise in
practice are often very nonnormal. The adjective “very” can be quantified in
various ways, of which one is the Frobenius norm of the strictly upper
triangular part of the upper triangular matrix T in the Schur decomposition
$A = QTQ^*$. For example, the matrix
$\begin{bmatrix} t_{11} & \theta \\ 0 & t_{22} \end{bmatrix}$ is nonnormal for
θ ≠ 0 and grows increasingly nonnormal as |θ| increases.

Consider the moderately nonnormal matrix
\[
  A = \begin{bmatrix} -0.97 & 25 \\ 0 & -0.3 \end{bmatrix}. \tag{10}
\]
While the powers of A ultimately decay to zero, since ρ(A) = 0.97 < 1, we see
from figure 4 that initially they increase in norm. Likewise, since
α(A) = −0.3 < 0 the norm $\|e^{tA}\|_2$ tends to zero as $t \to \infty$, but
figure 4 shows that there is an initial hump in the plot. In stationary
iterations the hump caused by a nonnormal iteration matrix $M^{-1}N$ can delay
convergence, as is clear from (9). In finite precision arithmetic it can even
happen that, for a sufficiently large hump, rounding errors cause the norms of
the powers to plateau at the hump level and never actually converge to zero.

How can we predict the shape of the curves in figure 4? Let us concentrate on
$\|A^k\|_2$. Initially it grows like $\|A\|_2^k$ and ultimately it decays like
$\rho(A)^k$, the decay rate following from (4). The height of the hump is
related to pseudospectra, which have been popularized by Nick Trefethen.

The ε-pseudospectrum of $A \in \mathbb{C}^{n\times n}$ is defined, for a given
ε > 0, to be the set
\[
  \Lambda_\varepsilon(A) = \{\, z \in \mathbb{C} : z \text{ is an eigenvalue of }
  A + E \text{ for some } E \text{ with } \|E\|_2 < \varepsilon \,\}, \tag{11}
\]
and it can also be represented, in terms of the resolvent $(zI - A)^{-1}$, as
\[
  \Lambda_\varepsilon(A) = \{\, z \in \mathbb{C} :
  \|(zI - A)^{-1}\|_2 > \varepsilon^{-1} \,\}.
\]
The 0.001-pseudospectrum, for example, tells us the uncertainty in the
eigenvalues of A if the elements are known only to three decimal places.
Pseudospectra provide much insight into the effects of nonnormality of matrices
and (with an appropriate extension of the definition) linear operators. For
nonnormal matrices the pseudospectra are much bigger than a perturbation of
the spectrum by ε. It can be shown that for any ε > 0,
\[
  \sup_{k \ge 0} \|A^k\| \ge \frac{\rho_\varepsilon(A) - 1}{\varepsilon}, \qquad
  \|A^k\| \le \frac{\rho_\varepsilon(A)^{k+1}}{\varepsilon},
\]
where the pseudospectral radius
$\rho_\varepsilon(A) = \max\{\, |\lambda| : \lambda \in \Lambda_\varepsilon(A) \,\}$.
For A in (10) and $\varepsilon = 10^{-2}$ these inequalities give an upper
bound of 230 for $\|A^3\|$ and a lower bound of 23 for
$\sup_{k \ge 0} \|A^k\|$, and figure 5 plots the corresponding
ε-pseudospectrum.

11 Structured Matrices

In a wide variety of applications the matrices have a special structure. The
matrix elements might form a pattern, as for a Toeplitz matrix or a Hamiltonian
matrix, the matrix may satisfy a nonlinear equation such as
$A^*\Sigma A = \Sigma$, where $\Sigma = \operatorname{diag}(\pm 1)$, which
yields the pseudo-unitary matrices A, or the submatrices may satisfy certain
rank conditions (as for quasiseparable matrices). We discuss here two of the
oldest and most studied classes of structured matrices, both of which were
historically important in the analysis of iterative methods for linear systems
arising from the discretization of differential equations.

11.1 Nonnegative Matrices

A nonnegative matrix is a real matrix all of whose entries are nonnegative. A
number of important classes of matrices are subsets of the nonnegative
matrices. These include adjacency matrices, stochastic matrices, and Leslie
matrices (used in population modeling). Nonnegative matrices have a large body
of theory, which originates with Perron in 1907 and Frobenius in 1908.

To state the celebrated Perron–Frobenius theorem we need the definition that
$A \in \mathbb{R}^{n\times n}$ with n ≥ 2 is reducible if there is a
permutation matrix P such that
\[
  P^TAP = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix},
\]
where $A_{11}$ and $A_{22}$ are square, nonempty submatrices, and it is
irreducible if it is not reducible. A matrix with positive entries is trivially
irreducible. A useful characterization is that A is irreducible if and only if
the directed graph associated with A (which has n vertices, with an edge
connecting the ith vertex to the jth vertex if $a_{ij} \ne 0$) is strongly
connected.

Theorem 3 (Perron–Frobenius). If $A \in \mathbb{R}^{n\times n}$ is nonnegative
and irreducible then

1. ρ(A) > 0,
2. ρ(A) is an eigenvalue of A,
3. there is a positive vector x such that Ax = ρ(A)x,
4. ρ(A) is an eigenvalue of algebraic multiplicity 1.

To illustrate the theorem consider the following two irreducible matrices and
their eigenvalues:
\[
  A = \begin{bmatrix} 8 & 1 & 6 \\ 3 & 5 & 7 \\ 4 & 9 & 2 \end{bmatrix}, \qquad
  \Lambda(A) = \{ 15, \pm 2\sqrt{6} \},
\]
\[
  B = \begin{bmatrix} 0 & 0 & 6 \\ \tfrac{1}{2} & 0 & 0 \\ 0 & \tfrac{1}{3} & 0
      \end{bmatrix}, \qquad
  \Lambda(B) = \{ 1, \tfrac{1}{2}(-1 \pm \sqrt{3}\,i) \}.
\]
The Perron–Frobenius theorem correctly tells us that ρ(A) = 15 is a distinct
eigenvalue of A, and that it has a corresponding positive eigenvector, which is
known as the Perron vector. The Perron vector of A is the vector of all ones,
as A forms a magic square and ρ(A) is the magic sum! The Perron vector of B,
which is both a Leslie matrix and a companion matrix, is $[6\ \ 3\ \ 1]^T$.
There is one notable difference between A and B: for A, ρ(A) exceeds the other
eigenvalues in modulus, but all three eigenvalues of B have modulus 1. In fact,
Perron’s original version of Theorem 3 says that if A has all positive elements
then ρ(A) is not only an eigenvalue of A but is larger in modulus than every
other eigenvalue. Note that $B^3 = I$, which provides another way to see that
the eigenvalues of B all have modulus 1.

We saw in section 9 that the spectral radius plays an important role in the
convergence of stationary iterative methods, through $\rho(M^{-1}N)$, where
A = M − N is a splitting. In comparing different splittings we can use the
result that for $A, B \in \mathbb{R}^{n\times n}$, with |A| denoting the matrix
$(|a_{ij}|)$,
\[
  |a_{ij}| \le b_{ij} \ \ \forall i, j \quad \Rightarrow \quad
  \rho(A) \le \rho(|A|) \le \rho(B).
\]

11.2 M-Matrices

$A \in \mathbb{R}^{n\times n}$ is an M-matrix if it can be written in the form
A = sI − B, where B is nonnegative and s > ρ(B). M-matrices arise in many
applications, a classic one being Leontief’s input–output models in economics.

The special sign pattern of an M-matrix (positive diagonal elements and
nonpositive off-diagonal elements) combines with the spectral radius condition
to give many interesting characterizations and properties. For example, a
nonsingular matrix A with nonpositive off-diagonal elements is an M-matrix if
and only if $A^{-1}$ is nonnegative. Another characterization, which makes
connections with section 1, is that A is an M-matrix if and only if A has
positive diagonal
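Returning to the two matrices A and B used above to illustrate Theorem 3, the stated spectral radii and Perron vectors can be confirmed with a few lines of NumPy. This is only a sketch: the helper `perron_pair` is a hypothetical name, and it relies on the fact that for a nonnegative irreducible matrix ρ is the eigenvalue of largest real part.

```python
import numpy as np

A = np.array([[8.0, 1.0, 6.0],
              [3.0, 5.0, 7.0],
              [4.0, 9.0, 2.0]])          # magic square
B = np.array([[0.0, 0.0, 6.0],
              [0.5, 0.0, 0.0],
              [0.0, 1.0 / 3.0, 0.0]])    # Leslie and companion matrix

def perron_pair(M):
    """Return (rho, x) with M x = rho x and x scaled so its last entry is 1.

    Assumes M is nonnegative and irreducible, so that rho(M) is the
    eigenvalue with the largest real part."""
    w, V = np.linalg.eig(M)
    j = np.argmax(w.real)
    x = V[:, j] / V[-1, j]               # remove the arbitrary scaling
    return w[j].real, x.real

rho_A, x_A = perron_pair(A)
rho_B, x_B = perron_pair(B)

moduli_B = np.abs(np.linalg.eigvals(B))  # all three should equal 1
B_cubed = np.linalg.matrix_power(B, 3)   # should be the identity

print(rho_A, x_A)
print(rho_B, x_B)
```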
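As a concrete instance of the M-matrix definition, the negative of the 4 × 4 second-difference matrix from section 6 can be written as sI − B with s = 2. The sketch below, with illustrative variable names, checks the spectral radius condition and the nonnegativity of the inverse; the inverse is formed explicitly only because the characterization is stated in terms of it.

```python
import numpy as np

n = 4
# B is nonnegative: ones on the first sub- and superdiagonals.
B = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
s = 2.0
A = s * np.eye(n) - B                     # tridiagonal with 2 and -1 entries

rho_B = max(abs(np.linalg.eigvals(B)))    # spectral radius of B
A_inv = np.linalg.inv(A)

# Off-diagonal entries of A, to confirm the sign pattern.
off_diagonal = A[~np.eye(n, dtype=bool)]

print(rho_B < s, A_inv.min() >= 0)
```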
[Figure: $\|A^k\|_2$ plotted against k (left panel) and $\|e^{tA}\|_2$ plotted
against t (right panel).]
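The hump behavior described for the matrix A in (10) can be explored numerically. The following NumPy sketch computes $\|A^k\|_2$ for k = 0, ..., 20 and $\|e^{tA}\|_2$ on a grid of t values; the ranges and grid spacing are choices made here for illustration. The exponential is formed from an eigendecomposition, which is acceptable for this small matrix with distinct eigenvalues but is not a general-purpose method for computing the matrix exponential.

```python
import numpy as np

A = np.array([[-0.97, 25.0],
              [0.0, -0.3]])

w, V = np.linalg.eig(A)
rho = max(abs(w))            # spectral radius
alpha = max(w.real)          # spectral abscissa

# 2-norms of the powers A^k, k = 0, ..., 20.
ks = np.arange(0, 21)
power_norms = np.array(
    [np.linalg.norm(np.linalg.matrix_power(A, k), 2) for k in ks]
)

# A has distinct eigenvalues, so e^{tA} = V diag(exp(t*w)) V^{-1}.
V_inv = np.linalg.inv(V)
ts = np.linspace(0.0, 10.0, 201)
exp_norms = np.array(
    [np.linalg.norm((V * np.exp(t * w)) @ V_inv, 2) for t in ts]
)

k_peak = ks[np.argmax(power_norms)]
t_peak = ts[np.argmax(exp_norms)]
print(rho, alpha, k_peak, power_norms.max(), t_peak, exp_norms.max())
```

Both sequences start at 1, rise to a hump, and then decay, even though ρ(A) < 1 and α(A) < 0.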
A function f is matrix monotone if it preserves the order, that is, A ≤ B
implies f(A) ≤ f(B), where f(A) denotes a function of a matrix. Much is known
about this class of functions, including that $t^{1/2}$ and log t are matrix
monotone but $t^2$ is not.

Many matrix inequalities involve norms. One example is
\[
  \|\, |A| - |B| \,\|_F \le \sqrt{2}\, \|A - B\|_F,
\]
where $A, B \in \mathbb{C}^{m\times n}$ and | · | is the matrix absolute value
defined in section 2. This inequality can be regarded as a perturbation result
that shows the matrix absolute value to be very well conditioned.

An example of an inequality that finds use in the analysis of convergence of
methods in nonlinear optimization is the Kantorovich inequality, which for
Hermitian positive definite A with eigenvalues
$\lambda_n \le \cdots \le \lambda_1$ and x ≠ 0 is
\[
  \frac{(x^*Ax)(x^*A^{-1}x)}{(x^*x)^2} \le
  \frac{(\lambda_1 + \lambda_n)^2}{4\lambda_1\lambda_n}.
\]
This inequality is attained for some x, and the left-hand side is always at
least 1.

Many inequalities are available that generalize scalar inequalities for means.
For example, the arithmetic–geometric mean inequality
$(ab)^{1/2} \le \frac{1}{2}(a + b)$ for positive scalars has an analogue for
Hermitian positive definite A and B in the inequality
$A \mathbin{\#} B \le \frac{1}{2}(A + B)$, where $A \mathbin{\#} B$ is the
geometric mean defined as the unique Hermitian positive definite solution to
$XA^{-1}X = B$. The geometric mean also satisfies the extremal property
\[
  A \mathbin{\#} B = \max\left\{ X : X = X^*, \
  \begin{bmatrix} A & X \\ X & B \end{bmatrix} \ge 0 \right\},
\]
which hints at matrix completion problems, in which the aim is to choose
missing elements of a matrix in order to achieve some goal, which could be to
satisfy a particular matrix property or, as here, to maximize an objective
function. Another mean for Hermitian positive definite matrices (and applicable
more generally), is the log-Euclidean mean,
$\exp(\frac{1}{2}(\log A + \log B))$, where log is the principal logarithm,
which is used in image registration, for example.

Finally, we mention an inequality for the matrix exponential. Although there is
no simple relation between $e^{A+B}$ and $e^Ae^B$ in general, for Hermitian A
and B the inequality
$\operatorname{trace}(e^{A+B}) \le \operatorname{trace}(e^Ae^B)$ was proved
independently by S. Golden and J. Thompson in 1965. Originally of interest in
statistical mechanics, the Golden–Thompson inequality has more recently found
use in random matrix theory. Again for Hermitian A and B, the related
inequalities $\|e^{A+B}\| \le \|e^{A/2}e^Be^{A/2}\| \le \|e^Ae^B\|$ hold for
any unitarily invariant norm.

13 Library Software

From the early days of digital computing the benefits of providing library
subroutines for carrying out basic operations such as the addition of vectors
and the formation of vector inner products were recognized. Over the ensuing
years many matrix computation research codes were published, including in the
Linear Algebra volume of the Handbook for Automatic Computation (1971) and in
the Collected Algorithms of the ACM. Starting in the 1970s the concept of
standardized subprograms was developed in the form of the Basic Linear Algebra
Subprograms (BLAS), which are specifications for vector (level 1),
matrix–vector (level 2), and matrix–matrix (level 3) operations. The BLAS have
been widely adopted, and highly optimized implementations are available for
most machines. The freely available LAPACK library of Fortran codes represents
the current state of the art for solving dense linear equations, least squares
problems, and eigenvalue and singular value problems. Many modern programming
packages and environments build on LAPACK.

It is interesting to note that the TOP500 list (http://www.top500.org) ranks
the world’s fastest computers by their speed (measured in flops per second) in
solving a random linear system Ax = b by GE. This benchmark has its origins in
the 1970s LINPACK project, a precursor to LAPACK, in which the performance of
contemporary machines was compared by running the LINPACK GE code on a
100 × 100 system.

14 Outlook

Matrix analysis and numerical linear algebra remain very active areas of
research. Many problems in applied mathematics and scientific computing require
the solution of a matrix problem at some stage, so there is always a demand for
better understanding of matrix problems and faster and more accurate algorithms
for their solution. As the overarching applications evolve, new problem
variants are generated, often involving new assumptions on the data, different
requirements on the solution, or new metrics for measuring the success of an
algorithm. A further driver of research is computer hardware. With the advent
of processors with many cores, the use of accelerators such as graphics
processing units (GPUs), and the harnessing of vast numbers of processors for
parallel computing, the
15 Further Reading
Three must-haves for researchers are Golub and Van Loan’s influential treatment
of numerical linear algebra and the two volumes by Horn and Johnson, which
contain a comprehensive treatment of matrix analysis.