Preconditioning
A. J. Wathen
Mathematical Institute, University of Oxford,
Andrew Wiles Building, Radcliffe Observatory Quarter,
Woodstock Road, Oxford OX2 6GG, UK
E-mail: [email protected]
CONTENTS
1 Introduction
2 Krylov subspace iterative methods
3 Symmetric and positive definite matrices
4 Semidefinite matrices
5 Symmetric and indefinite matrices
6 Nonsymmetric matrices
7 Some comments on symmetry
8 Some comments on practical computing
9 Summary
References
1. Introduction
This review aims to give a comprehensive view of the current state of preconditioning for general linear(ized) systems of equations. However, we start with a specific example.
where the outer diagonals of A have been overwritten by the central diagonals which are ‘wrapped around’ in a periodic manner. Circulant matrices are diagonalised by a Fourier basis (Van Loan 1992) and it follows that for any given b ∈ R^n, the fast Fourier transform (FFT) (Cooley and Tukey 1965) enables the computation of the matrix-vector product Cb and of the solution y ∈ R^n of Cy = b in O(n log n) arithmetic operations. The importance of Strang’s observation is that C is close to A if the discarded entries a_{⌈n/2⌉}, . . . , a_n are small; precise statements will describe this in terms of decay in the entries of A along rows/columns moving away from the diagonal. With such a property, one can show that not only are the matrices A and C close entrywise, but that they are spectrally close, i.e. that the eigenvalues of C^{-1}A (equivalently the eigenvalues λ of the matrix ‘pencil’ A − λC) are mostly close to 1. As a consequence, an appropriate iterative method (it would be the conjugate gradient method if A is positive definite:
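Returning to the circulant computation described above: the following minimal sketch (in Python with NumPy/SciPy; the stencil values are purely illustrative) shows the O(n log n) solve of Cy = b via the FFT.

```python
import numpy as np
from scipy.linalg import circulant

# Sketch: solve C y = b for circulant C in O(n log n) arithmetic via the FFT.
# The eigenvalues of a circulant matrix are the FFT of its first column.
n = 8
c = np.array([4.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0])  # first column of C
C = circulant(c)                     # dense version, for verification only
b = np.random.default_rng(0).standard_normal(n)

eigs = np.fft.fft(c)                 # diagonal of C in the Fourier basis
y = np.fft.ifft(np.fft.fft(b) / eigs).real

assert np.allclose(C @ y, b)         # y solves C y = b, with no O(n^3) work
```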
Thus the iterates and residuals of every Krylov subspace method satisfy
$$x_k - x_0 = \sum_{j=0}^{k-1} \alpha_j A^j r_0.$$
Now, cg for (2.4) will generate a Krylov space of the form K(A^T A, A^T b), for example, when x_0 = 0. Perhaps the most suitable implementation algebraically equivalent to a symmetric Krylov subspace method for the normal equations is the lsqr algorithm of Paige and Saunders (1982), which also computes least squares solutions for rectangular A. An advantage of these methods is that the convergence theory of cg and minres applies.
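As a small illustration (a sketch with random data only), lsqr can be applied to a rectangular A without A^T A ever being formed:

```python
import numpy as np
from scipy.sparse.linalg import lsqr

# Sketch: lsqr computes the least squares solution min ||Ax - b||
# for rectangular A, algebraically like cg on the normal equations.
rng = np.random.default_rng(1)
A = rng.standard_normal((100, 20))
b = rng.standard_normal(100)

x, istop, itn = lsqr(A, b, atol=1e-12, btol=1e-12)[:3]
x_ne = np.linalg.solve(A.T @ A, A.T @ b)   # explicit normal equations
assert np.allclose(x, x_ne, atol=1e-6)
```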
The reason that there are methods of choice (cg, minres) for symmetric matrices, but a whole range of possibilities for nonsymmetric matrices, comes down to convergence guarantees. In the case of symmetric matrices there
are descriptive convergence bounds which depend only on the eigenvalues
of the coefficient matrix, thus the number of iterations which will be needed
for convergence to a given tolerance can be estimated and bounded a priori;
a good preconditioner will ensure fast convergence. These comments apply
also for normal equation approaches for nonsymmetric matrices.
By contrast, to date there are no generally applicable and descriptive
convergence bounds even for gmres; for any of the nonsymmetric methods
without a minimisation property, convergence theory is extremely limited.
The convergence of gmres has attracted a lot of attention, but the absence
of any reliable way to guarantee (or bound) the number of iterations needed
to achieve convergence to any given tolerance in most situations means that
there is currently no good a priori way to identify what are the desired
qualities of a preconditioner. This is a major theoretical difficulty, but
heuristic ideas abound.
All Krylov subspace methods are to an extent affected by rounding errors
in a computer if run for a large enough number of iterations, see Liesen and
Strakoš (2013) or the discussion in Greenbaum (1997, Chapter 4). In this
paper we ignore any such effects since an important consequence of good
preconditioning is that few iterations should be necessary for acceptable
convergence, thus round-off effects are not usually able to build up and
become a significant issue. Similarly, there is theory which describes the
eventual (i.e. asymptotic) convergence of Krylov subspace methods such as
cg, but this is also not generally relevant to preconditioned iterations—it is
rapid error or residual reduction in early iterations that is typically desired.
finite element. The precise eigenvalue bounds of the form (3.2) or (3.3), together with tabulated values for ℓ and Υ for many types of finite elements, are derived in Wathen (1986).
We briefly comment that in problems as above with analytic a priori eigenvalue bounds ℓ and Υ which are (reasonably) tight, the possibility also arises to use a fixed but small number of Chebyshev (semi-)iterations as a (linear) preconditioner; see Wathen and Rees (2009).
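A minimal sketch of such a fixed number of Chebyshev semi-iteration steps, following the standard recurrence (as in, e.g., Saad 2003); the function and its interface are illustrative, not from the papers cited:

```python
import numpy as np

def chebyshev_semi_iteration(A, r, lmin, lmax, steps=5):
    """Approximate A^{-1} r by a fixed number of Chebyshev steps, given
    a priori eigenvalue bounds [lmin, lmax] for the symmetric positive
    definite A. Because the step count is fixed, this is a fixed
    polynomial in A, hence a linear operation that is safe to use as a
    preconditioner inside cg or minres."""
    theta = (lmax + lmin) / 2
    delta = (lmax - lmin) / 2
    sigma = theta / delta
    rho = 1 / sigma
    z = np.zeros_like(r)      # approximation to A^{-1} r, starting from 0
    res = r.copy()            # residual r - A z
    d = res / theta
    for _ in range(steps):
        z = z + d
        res = res - A @ d
        rho_new = 1 / (2 * sigma - rho)
        d = rho_new * rho * d + (2 * rho_new / delta) * res
        rho = rho_new
    return z
```

Wrapped in a LinearOperator whose matvec is r ↦ chebyshev_semi_iteration(A, r, ℓ, Υ), this gives exactly the kind of linear preconditioner referred to above.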
The second important situation arises also through highly variable coef-
ficients; the classic example arises from the operator
∇ · (K∇u)
where the permeability K has large jumps. For continuous finite element
approximation of such terms, Graham and Hagger (1999) were the first to
identify the utility of diagonal scaling to render the preconditioned prob-
lem more homogeneous. For such an elliptic PDE, Jacobi preconditioning
is highly unlikely to be an adequate preconditioner on its own, but when
used together with an appropriate preconditioner for a homogeneous elliptic
PDE operator—see below—the combination can be much better than just
the elliptic preconditioner by itself. In the literature there are several de-
velopments of the idea of Graham and Hagger, including for non-isotropic
permeabilities.
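The effect can be seen in a small sketch (an illustrative 1D analogue, not taken from the papers above): a finite difference matrix for −(Ku′)′ with a large jump in K, run through cg with and without Jacobi scaling. A recent SciPy (with the rtol keyword) is assumed.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg, LinearOperator

# Sketch: 1D analogue of -div(K grad u) with a jump in the coefficient K,
# preconditioned by diagonal (Jacobi) scaling.
n = 500
K = np.where(np.arange(n + 1) < n // 2, 1.0, 1e4)  # jumping 'permeability'
A = sp.diags([-K[1:n], K[:n] + K[1:], -K[1:n]], [-1, 0, 1], format='csr')
b = np.ones(n)

d = A.diagonal()
M = LinearOperator((n, n), matvec=lambda r: r / d)  # Jacobi preconditioner

counts = {}
for name, prec in [('unpreconditioned', None), ('jacobi', M)]:
    its = []
    x, info = cg(A, b, M=prec, rtol=1e-10, callback=lambda xk: its.append(1))
    counts[name] = len(its)
print(counts)   # Jacobi scaling should remove much of the effect of the jump
```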
easy solution of systems with the preconditioner, and the main remaining
question is whether the preconditioner leads to fast convergence of cg.
In general many different variants of this basic idea exist and software for
various incomplete factorization-based algorithms is widely available. This
is certainly a plus point, but unless A is diagonally dominant, incomplete Cholesky algorithms do not always work well; indeed, they do not always exist without modification (see Meijerink and van der Vorst 1977). See Benzi (2002) for discussion of this as well as many other aspects of
incomplete factorization preconditioning. With regard to robustness, there
are certain advantages to computing an incomplete triangular factorization
not using Cholesky but via orthogonalization in the inner product generated
by A; see Benzi and Tuma (2003). This approach is called RIF (robust
incomplete factorization) and shares some similarity with the AINV sparse
approximate inverse discussed below.
Let us make some of these statements precise in the context of solving
a simple discretized elliptic boundary value problem, namely the common
5-point finite difference approximation of the Dirichlet problem for the (neg-
ative) Laplacian
−∇²u = f in Ω, u given on ∂Ω
which arises from second order central differencing on a regular grid with
spacing h in each of the two spatial dimensions. The choice of sign ensures
that the operator and thus the arising matrix are positive rather than neg-
ative definite. If the domain, Ω, is the unit square (so that the boundary
∂Ω is four coordinate-aligned line segments) then the arising matrix A is
of dimension N ≈ h^{-2} for small h and also κ_A = O(h^{-2}); see Elman et al. (2014b, section 1.6). Thus (3.1) predicts that O(h^{-1}) cg iterations would
be required for convergence of unpreconditioned cg. The simplest incom-
plete Cholesky preconditioner for which L has the same sparsity pattern as the lower triangular part of A also leads to κ_{P^{-1}A} = O(h^{-2}), though
there is significant clustering of a large number of the eigenvalues around 1;
the bound (3.1) based only on the extreme eigenvalues is pessimistic in this
situation. A variant, modified incomplete factorization (due to Gustafsson
(1978) but related to rather earlier work by Dupont et al. (1968)) reduces
this to κ_{P^{-1}A} = O(h^{-1}), and generally leads to slightly faster cg convergence. Applying (3.1) in this case predicts that O(h^{-1/2}) preconditioned
cg iterations would be required for convergence, and this is close to what
is seen in practical computation. This implies that for a sequence of such
problems with smaller and smaller h, it will take an increasing number of
iterations to achieve acceptable convergence as h gets smaller (and so the
matrix dimension gets larger) – a rather undesirable feature.
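The growth in iteration counts is easy to observe in a sketch of this model problem. SciPy's spilu is used here only as a stand-in, since SciPy provides ILU rather than incomplete Cholesky (cg strictly requires a symmetric positive definite preconditioner; for this M-matrix with the natural ordering the factors stay essentially symmetric). All parameter values are illustrative.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg, spilu, LinearOperator

def laplacian_2d(m):
    """5-point finite difference Laplacian on an m-by-m interior grid."""
    T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(m, m))
    return sp.kronsum(T, T).tocsc()

for m in [32, 64, 128]:               # h = 1/(m+1) halves at each step
    A = laplacian_2d(m)
    b = np.ones(A.shape[0])
    ilu = spilu(A, drop_tol=1e-2, permc_spec='NATURAL')
    M = LinearOperator(A.shape, matvec=ilu.solve)
    its = []
    x, info = cg(A, b, M=M, rtol=1e-8, callback=lambda xk: its.append(1))
    print(m, len(its))                # iteration counts grow as h decreases
```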
Fortunately, there are much better preconditioners for this and related
elliptic PDE problems which guarantee κ_{P^{-1}A} = O(1), so that the number of
3.3. Multigrid
Since the groundbreaking paper by Brandt (1977), multigrid methods have
been at the core of scientific computing. Earlier contributions, notably by
Fedorenko (1964) and Bakhvalov (1966), are now considered to have in-
troduced the key ideas of combining a simple smoothing iteration (such
as Jacobi or Gauss-Seidel iteration) with coarse grid correction on succes-
sively coarser grids for the finite difference Laplacian, but Brandt’s vision
was much broader: he showed just how fast multigrid solvers could be and
introduced many ideas to keep researchers busy for decades! For a posi-
tive definite elliptic PDE operator, it has been clear for a long time that
an appropriate multigrid method will provide an optimal solver, that is, a
solver such that the work to compute the solution scales linearly with the
dimension of the discretized problem.
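For example, a handful of lines (a sketch assuming the third-party pyamg package) sets up an algebraic multigrid V-cycle as a preconditioner for cg:

```python
import numpy as np
import pyamg                                # assumes the pyamg package
from scipy.sparse.linalg import cg

# Sketch: one AMG V-cycle per Krylov iteration as the preconditioner.
A = pyamg.gallery.poisson((128, 128), format='csr')  # 5-point Laplacian
ml = pyamg.ruge_stuben_solver(A)            # classical (Ruge-Stuben) AMG
M = ml.aspreconditioner(cycle='V')          # LinearOperator applying one cycle

b = np.ones(A.shape[0])
its = []
x, info = cg(A, b, M=M, rtol=1e-8, callback=lambda xk: its.append(1))
print(len(its))   # iteration count essentially independent of the grid size
```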
Classically, multigrid is a stationary iteration (albeit a non-trivial one) for
solving Ax = b when A is a discrete Laplacian matrix (arising from finite
difference or finite elements). For V- or W-cycles, iterates satisfy single
cycle contraction of the form
$$\|x - x_k\|_A \le \eta\, \|x - x_{k-1}\|_A \qquad (3.5)$$

where the contraction factor η is typically around 0.1 (see, for example, Elman et al. 2014b, section 2.5). It follows that the linear multigrid op-
erator (represented by a complicated and fairly dense matrix which one
would never want to use in computations) is an excellent candidate for a
to be the coarse grid size in essence; the bounds above assume exact solution
of the problems on subdomains.
With the importance of parallel and distributed computing, the domain
decomposition concept remains very important and much is known for more advanced methods; see the regular Domain Decomposition conference proceedings available at ddm.org (2014).
is by far the most widely considered since the minimisation problem then
reduces to least squares problems for the columns of R separately (thus
computable in parallel). Further, these least squares problems are all of
small dimension when the specification of SR ensures that R is a sparse
matrix. We again refer to Benzi (2002) for a thorough description.
As posed in this simple form, the sparse approximate inverse R is not generally symmetric when A is symmetric; however, we introduce this type of preconditioner in this section because symmetry can be enforced by com-
puting R in factored form R = LL^T where L is sparse and lower triangular;
see Kolotilina and Yeremin (1993). Symmetry of a sparse approximate in-
verse can also be achieved by the AINV algorithm of Benzi et al. (1996)
and the more robust (stabilized) SAINV variant (Benzi et al. 2000) which
is guaranteed not to break down if A is positive definite. See again Benzi
(2002).
The main issue in the construction of any sparse approximate inverse is
the choice of the sparsity pattern, SR . The simplest idea is to use the spar-
sity of A, whilst there have been several attempts to dynamically define SR ,
most notably in the SPAI algorithm of Grote and Huckle (1997). An exten-
sion which can be used in a number of different scenarios is to use a target matrix, T, and hence minimise ‖T − AR‖_F; see Holland et al. (2005). A
further comment is that sparse approximate inverse algorithms are clearly
quite widely applicable, but generally do not give spectrally equivalent pre-
conditioners for elliptic PDEs. They do however make excellent smoothers
for use with multigrid (Tang and Wan 2000, Bröker et al. 2001).
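A toy version of the Frobenius-norm minimisation with a fixed pattern (a sketch only; real SPAI codes adapt the pattern dynamically and exploit the parallelism):

```python
import numpy as np
import scipy.sparse as sp

def sparse_approximate_inverse(A, S=None):
    """Minimise ||I - A R||_F over matrices R with a fixed sparsity
    pattern S (default: the pattern of A). Each column of R is an
    independent small least-squares problem, hence trivially parallel."""
    A = sp.csc_matrix(A)
    n = A.shape[0]
    S = A if S is None else sp.csc_matrix(S)
    R = sp.lil_matrix((n, n))
    for j in range(n):
        J = S[:, j].nonzero()[0]              # allowed nonzeros of R[:, j]
        I = np.unique(A[:, J].nonzero()[0])   # rows reached by A[:, J]
        Asub = A[I, :][:, J].toarray()        # small dense submatrix
        e = (I == j).astype(float)            # e_j restricted to rows I
        sol = np.linalg.lstsq(Asub, e, rcond=None)[0]
        for idx, val in zip(J, sol):
            R[idx, j] = val
    return R.tocsc()
```

Since R approximates A^{-1} directly, applying the preconditioner is just a (parallelisable) sparse matrix-vector product with R.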
4. Semidefinite matrices
It might seem rather odd to be interested in preconditioners for singular
matrices; one might believe that only well-posed problems were worthy of
attention. However, there are different situations in which approximations
to certain singular operators/matrices are valuable and we mention some
ideas in this brief section.
The simplest PDE problem which gives rise to a singular matrix is proba-
bly the Neumann problem for a simple operator like the Laplacian; specify-
ing only derivative boundary conditions for a potential-like variable leaves
a 1-dimensional kernel comprising the constants, as is well known. Such
a small dimensional null space generally causes little significant difficulty
in terms of iterative methods or preconditioning and methods for definite
matrices generally apply: see Elman et al. (2014b, section 2.3). Similarly,
graph analysis and Markov chain problems respectively lead to symmetric
and nonsymmetric singular systems with simple, low-dimensional kernels;
again, methods for definite matrices should generally be applied without
explicit ‘fixes’ to ensure nonsingularity for such problems. For problems
arising in contexts other than PDEs, singularity tends to imply that a de-
scription has not been completely specified.
The issue of semidefiniteness is certainly serious, however, for equations
involving the curl or div operators applied to vector fields as arise, for exam-
ple, in the Maxwell equations describing electromagnetics. In this context,
the operator curl curl has all gradients of scalar fields in its kernel, thus
under most discretizations one would expect that the matrices arising will
be semidefinite with a kernel of some significant dimension. Similarly, the
curl of any smooth enough vector field is in the kernel of the div opera-
tor. On the relevant spaces of functions, the curl curl operator is however
positive definite and self-adjoint (symmetric), and it seems that it may be possible to use convenient preconditioners like standard AMG implementations if appropriate account is taken of the kernel. This is the purpose of
‘Auxiliary Space’ preconditioning which aims to take account of the natural
decompositions of function spaces that arise. The ideas have evolved over
some years; see Hiptmair and Xu (2007) and references therein. The main
practical issue is representing mapping operators between the various spaces
in the discrete setting.
$$\frac{\|r_k\|_I}{\|r_0\|_I} \le \min_{p_k \in \Pi_k,\, p_k(0)=1}\ \max_{\lambda \in \sigma(A)} |p_k(\lambda)|, \qquad (5.1)$$
where the notation is as used before in the positive definite case. The
difference from that case is not only that the residual rather than the error
is minimised (it is really a difference of norm since r_k = A(x − x_k) implies that ‖r_k‖_I = ‖x − x_k‖_{A²}), but the spectrum, σ(A), now contains both
positive and negative eigenvalues.
As in the symmetric and positive definite case, (5.1) immediately indicates
that termination of the iteration will occur with the solution at the sth iter-
ation if A has just s distinct eigenvalues. Correspondingly, the eigenvalues
lying in a small number of clusters should also lead to rapid convergence
if none of the clusters is too far from, nor too close to the origin. If the
eigenvalues of A are all contained in two intervals
[−a, −b] ∪ [c, d],
where a, b, c, d are all positive, then it is less straightforward in general
than in the positive definite case to derive a simple convergence bound (see
Wathen et al. (1995)). One situation where a simple expression is possible,
however, is when d − c = a − b; the minres residuals then satisfy the convergence bound

$$\frac{\|r_{2k}\|_I}{\|r_0\|_I} \le 2\left(\frac{\sqrt{\kappa}-1}{\sqrt{\kappa}+1}\right)^{k}, \qquad (5.2)$$
where κ = ad/(bc) (see Elman et al. 2014b, section 4.2.4). Note that κ here plays a similar role to λ_max/λ_min in the positive definite case, though it is not the matrix condition number here; since the Euclidean norm condition number, ‖A‖‖A^{-1}‖, is max{a, d}/min{b, c} in this situation, κ can be as large as the square of this condition number when a = d, b = c, but rather
less when the two intervals are not symmetrically placed about the origin.
When a = c, for example, κ = d/b is again exactly the Euclidean norm
condition number.
When A is symmetric and indefinite, any preconditioner for minres must
be symmetric and positive definite. This is necessary since otherwise there is
no equivalent symmetric system for the preconditioned matrix. Thus a non-
symmetric iterative method must be used even if a symmetric and indefinite
preconditioner is employed for a symmetric and indefinite matrix in general.
An indefinite preconditioner is therefore not usually desirable, though there
are exceptions: see below. A preconditioner for a symmetric indefinite
matrix A for use with minres therefore can not be an approximation of
the inverse of A, since this is also indefinite. This simple observation rules
out several approaches described above in the positive definite case.
With a symmetric and positive definite preconditioner, the preconditioned
minres convergence bounds as above become
$$\frac{\|r_k\|_{P^{-1}}}{\|r_0\|_{P^{-1}}} \le \min_{p_k \in \Pi_k,\, p_k(0)=1}\ \max_{\lambda \in \sigma(P^{-1}A)} |p_k(\lambda)| \le \min_{p_k \in \Pi_k,\, p_k(0)=1}\ \max_{\lambda \in [-a,-b] \cup [c,d]} |p_k(\lambda)| \qquad (5.3)$$
when [−a, −b] ∪ [c, d] is now an inclusion set for all of the eigenvalues of
P −1 A rather than A. A further aspect introduced by preconditioning for
minres which does not arise for cg in the positive definite case is now
apparent: the norm in which convergence naturally occurs is affected by
the preconditioner; it is not just the eigenvalue distribution and hence the
speed of convergence which is affected.
The only reason we bring up this issue is because one now has the undesir-
able possibility that a preconditioner which apparently gives rapid minres
convergence might give inaccurate solution iterates simply because k · kP −1
is a highly distorted norm. The following trivial example will make the
point:
Let A be any indefinite diagonal matrix and suppose we precondition with

P = diag(ε, 1/ε, . . . , 1/ε)

for some small positive ε. Convergence to some small tolerance will appear to occur after only 1 iteration since there is only 1 large entry of P^{-1} and the remaining entries are very small. Because of this distorted weighting, the first component of the first solution iterate will be accurate, but the rest will generally not be. For further consideration of this issue see Wathen (2007).
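A numerical version of this example (a sketch; the sizes and values are illustrative) using SciPy's minres:

```python
import numpy as np
from scipy.sparse.linalg import minres, LinearOperator

# Sketch: indefinite diagonal A with P = diag(eps, 1/eps, ..., 1/eps).
n, eps = 10, 1e-10
a = np.arange(n) - 3.5           # indefinite diagonal, no zero entries
A = np.diag(a)
b = np.ones(n)

p = np.full(n, 1.0 / eps)
p[0] = eps                       # p holds the diagonal of P
M = LinearOperator((n, n), matvec=lambda r: r / p)   # action of P^{-1}

x, info = minres(A, b, M=M, maxiter=1)
print(np.abs(x - b / a))   # first component accurate; the rest are not
```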
These different issues, which do not arise when A is symmetric and pos-
itive definite, lead some authors to the conclusion that preconditioning for
symmetric indefinite matrices is much more difficult than the definite case.
However it is still much more tractable than the nonsymmetric case! For
a PDE problem, for example, the convergence bound (5.3) guarantees that
preconditioned minres convergence will be in a number of iterations in-
dependent of any mesh size parameter, h, given a preconditioner which
ensures that a, b, c, d are bounded, bounded away from zero and indepen-
dent of h. This is like the situation for spectrally equivalent preconditioners
in the symmetric positive definite case. Such guarantees rarely exist for
non-symmetric iterative methods; see section 6 below.
We mention that the sqmr Krylov subspace iterative method (Freund
and Nachtigal 1994) is designed for indefinite matrices with indefinite pre-
conditioning; this method, however, has no convergence guarantees, being
a variant of the nonsymmetric qmr iterative method for which symmetry
allows a more streamlined implementation.
General preconditioning strategies for symmetric indefinite matrices are not as readily available as in the positive definite case. Amongst the ‘given a ma-
trix’ approaches, simple application of scaling and incomplete factorization
are not contenders and there is little significant work on sparse approxi-
mate inverses since these appear unlikely to provide much benefit except
possibly when A has only a few negative eigenvalues. ‘Given a problem’ is
more promising and there are two general classes of problem which give rise
to symmetric indefinite matrices for which helpful—sometimes excellent—
preconditioning approaches are known.
problem with many different right hand side vectors there could thus be
some amelioration in the work required to set up the preconditioner.
We should alert the reader to the article by Ernst and Gander (2012)
which explains the general difficulty of iterative solution of Helmholtz prob-
lems.
The Maxwell equations describing electromagnetics have wave-like solu-
tions and hence some of the character of Helmholtz problems. However,
they also (in various formulations) have additional semidefinite features as
discussed in the section above, and additional indefinite saddle point struc-
ture, and are therefore considered below.
The idea of sweeping through the discrete variables for a problem in a par-
ticular order is also important for preconditioner construction for nonsym-
metric matrices which derive from PDEs that involve advection/convection
(or transport) as described below.
proach is described and analysed by Greif and Schötzau (2006) who were
motivated by the solution of Maxwell equations describing electromagnetic
phenomena—a key PDE problem with this structure. For related but earlier
ideas associated with the semidefiniteness of the div operator, see Arnold
et al. (1997).
For the various Maxwell equation formulations, semidefiniteness arises
through curl curl terms, indefiniteness because of a div constraint and wave-
like solutions are always expected, which means that fine computational
grids must be used especially for high frequency problems. Fortunately,
Maxwell’s equations are linear and self-adjoint in most formulations which
is why we bring them up in this section. Mixed finite element approxi-
mation is widely used for these PDEs. Representative of the large and
still growing literature on preconditioning for such problems are Greif and
Schötzau (2007), who develop block preconditioners and Tsuji et al. (2012),
who extend sweeping preconditioner ideas to these problems.
A slightly more detailed review of preconditioning for saddle point prob-
lems can be found in Benzi and Wathen (2009).
centred on the origin, then the undesirable staircasing effect is always im-
plied! In general, it is advantageous in terms of convergence speed if there
is some asymmetry in the distribution of positive and negative eigenvalues
of preconditioned indefinite symmetric matrices. However, it is then con-
siderably more complicated to take account of this asymmetry in a minres
convergence bound (but see Wathen et al. (1995)).
6. Nonsymmetric matrices
As already mentioned, all preconditioning for nonsymmetric problems is
heuristic, since descriptive convergence bounds for gmres or any of the
other applicable nonsymmetric Krylov subspace iterative methods do not
presently exist (but see Pestana and Wathen (2013b)). The most straight-
forward convergence bound for gmres for diagonalisable matrices, A = XΛX^{-1} where Λ is a diagonal matrix of eigenvalues, comes directly from (2.3) and the residual minimization property:

$$\|r_k\| = \min_{p \in \Pi_k,\, p(0)=1} \|p(A)\, r_0\|$$
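A classical cautionary example, here as a sketch: even a perfectly conditioned (orthogonal) matrix can make gmres stagnate completely when its eigenvalues (the n-th roots of unity below) surround the origin.

```python
import numpy as np
from scipy.sparse.linalg import gmres

# Sketch: gmres stagnation on a perfectly conditioned matrix.
n = 30
A = np.zeros((n, n))
A[np.arange(1, n), np.arange(n - 1)] = 1.0   # e_j -> e_{j+1}
A[0, n - 1] = 1.0                            # e_n -> e_1, a cyclic shift
b = np.zeros(n); b[0] = 1.0

res = []
x, info = gmres(A, b, restart=n, maxiter=1,
                callback=lambda r: res.append(r), callback_type='pr_norm')
print(res)   # residual norm stays at 1.0 until the n-th iteration
```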
6.2. Multigrid
There is no inherent obstacle to applying multigrid or multilevel ideas for
nonsymmetric problems. However, smoothing is generally not so easy and
requires more work than for symmetric problems and care must be taken so
that coarse grid correction does not reintroduce high frequencies. Without
attention to these aspects, the methods simply do not work at all well.
Since such issues vary with differing applications, we briefly discuss only
the important problem associated with the convection-diffusion equation.
Smoothing – or perhaps more appropriately in this setting, ‘sweeping’ –
should ideally be something like a Gauss-Seidel iteration with the variables
ordered in the direction of the convection. For problems with recirculation,
one can try multidirectional sweeping, so that all parts of the flow have
at least one of the directional sweeps that is approximately in the convec-
tion direction. Coarse grid correction then requires care (in grid transfers
and/or coarse grid operator). In the convection-diffusion geometric multi-
grid method due to Ramage (1999), for example, the coarse grid operator
is recomputed on each coarse grid with—crucially—streamline upwinding
appropriate to that grid size. The point is that when a grid under-resolves
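A minimal sketch of such a directional sweep (dense matrix purely for brevity; the function and its interface are illustrative):

```python
import numpy as np

def directional_gauss_seidel(A, x, b, order):
    """One Gauss-Seidel sweep visiting the unknowns in the given order.
    For convection-dominated problems the order should follow the flow;
    for recirculating flows, sweeps with several different orderings can
    be combined so that every part of the flow is covered."""
    for i in order:
        x[i] += (b[i] - A[i, :] @ x) / A[i, i]
    return x
```

With order = np.arange(n) or its reverse this gives the usual forward and backward sweeps; a multigrid smoother would apply one or two such sweeps on each level.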
(The H stands for Hermitian, since the same idea obviously applies in the
complex case when transpose is replaced by conjugate transpose.) When H is
− sign which is chosen so that all eigenvalues are clustered around 1 rather
than the 2 clusters around ±1 which arise with the + choice, though it is
not clear that this is to be preferred (Ymbert et al. 2015).
Approximations of the Schur complement remain a key aspect of any
such block preconditioning approach. For the common Oseen linearization
of the incompressible Navier-Stokes problem, the matrix block A comes
from a vector convection–diffusion operator so that an appropriate multi-
grid approach as outlined above or any other such good approximation of
(preconditioner for) convection-diffusion is required as Â. To identify an
approximate Schur complement requires more in-depth consideration, but
there are now two excellent candidates: the pressure convection-diffusion
(PCD) (Kay et al. 2002) and the least-squares commutator (LSC) precon-
ditioners. Both are derived, described and analysed in Elman et al. (2014b,
Chapter 9). The PCD method is in some ways simpler, but it requires the
construction of a convection–diffusion operator on the pressure space (the
space of Lagrange multipliers), which is not required for the problem for-
mulation itself. A purely algebraic way to compute this operator based on
sparse approximate inverse technology was, however, been described by El-
man et al. (2006). The LSC method is derived via a commutator argument
and is defined purely in terms of the matrix blocks which arise naturally
for the Oseen problem. For a comparison and full consideration of advan-
tages and disadvantages of these two approaches, see Elman et al. (2014b,
Chapter 9).
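In outline, such a block triangular preconditioner can be applied as follows (a sketch: Ahat_solve and Shat_solve are hypothetical stand-ins for, say, a multigrid cycle on the (1,1) block and a PCD or LSC approximation of the Schur complement):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator

def block_triangular_preconditioner(B, Ahat_solve, Shat_solve):
    """Action of P^{-1} for P = [[Ahat, 0], [B, -Shat]], a block lower
    triangular preconditioner for the saddle point system
    [[A, B^T], [B, 0]] [u; p] = [f; g]."""
    m, n = B.shape                      # B maps n velocities to m pressures
    def apply(r):
        ru, rp = r[:n], r[n:]
        zu = Ahat_solve(ru)             # approximate solve with the (1,1) block
        zp = -Shat_solve(rp - B @ zu)   # approximate Schur complement solve
        return np.concatenate([zu, zp])
    return LinearOperator((n + m, n + m), matvec=apply)
```

The resulting operator is nonsymmetric even when the individual blocks are symmetric, so it would be used with gmres applied to the coupled problem, as described next.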
The augmentation or augmented Lagrangian approach has also been
applied to non-symmetric saddle point problems. Although some concerns
have been expressed regarding possible distortion of the norm in which
convergence occurs, good results have been reported even for large Reynolds
number (Benzi et al. 2011b).
Earlier approaches for this problem involved segregated treatment of pres-
sure and velocity and predate the preconditioned Krylov subspace iteration
technology. It is perhaps worth commenting here that this example is one
of the first where block preconditioners comprising blocks designed for
different sets of variables allow treatment of different ‘physics’ within the
preconditioner whilst gmres or some other Krylov subspace method is ap-
plied to the complete coupled problem. This is an important paradigm.
Such is its importance that it is in the context of nonsymmetric saddle
point systems arising from approximation of the Oseen problem that per-
haps the most refined analysis of preconditioned gmres convergence has
been pursued. There are several eigenvalue analyses for different precondi-
tioners, see for example Elman and Silvester (1996), although these do not
lead to rigorous estimates as explained above. It is in the context of this
problem, however, that field-of-values convergence analysis has been suc-
cessfully employed to rigorously bound preconditioned gmres convergence
9. Summary
We conclude with a few general guidelines on preconditioner design and
selection.
• Keep to the original structure when preconditioning (e.g., preserve
symmetry, partitioning into blocks, etc.)
• The more knowledge about a problem that can be represented in a
preconditioner, the better.
• The work in applying the preconditioner should ideally be commen-
surate with the work in matrix–vector multiplication: that would be
O(n) for a sparse matrix and O(n2 ) for a general full matrix.
• It may be easier to precondition a matrix of larger dimension, for which
structures are clearer, than to eliminate variables, giving a less struc-
tured matrix of smaller dimension.
There is, of course, no such concept as a best preconditioner: the only
two candidates for this would be P = I, for which the preconditioning takes
no time at all, and P = A, for which only one iteration would be required
for solution by any iterative method. However, every practitioner knows
when they have a good preconditioner which enables feasible computation
and solution of problems. In this sense, preconditioning will always be an
art rather than a science.
Acknowledgements
I am grateful to Michele Benzi, Howard Elman and David Silvester for read-
ing an early draft; their comments and suggestions have led to significant
improvements.
Parts of this survey were written whilst visiting CERFACS, Toulouse,
France and the Memorial University of Newfoundland, Canada. I am grate-
ful to Serge Gratton, Xavier Vasseur, Hermann Brunner and Scott MacLach-
lan for their support and encouragement.
REFERENCES
P.R. Amestoy, I.S. Duff, J-Y. L’Excellent and J. Koster (2000), ’MUMPS: a general purpose distributed memory sparse solver’, in Proc. PARA2000, 5th International Workshop on Applied Parallel Computing, Springer-Verlag, 122–131.
J.R. Appleyard and I.M. Cheshire (1983), ’Nested factorization’, Society of Petroleum Engineers, SPE 12264, presented at the Seventh SPE Symposium on Reservoir Simulation, San Francisco.
M. Arioli (2004), ’A stopping criterion for the conjugate gradient algorithm in a
finite element method framework’, Numer. Math. 97, 1–24.
M. Arioli, D. Loghin and A.J. Wathen (2005), ’Stopping criteria for iterations in
finite element methods’, Numer. Math. 99, 381–410.
D.N. Arnold, R.S. Falk and R. Winther (1997), ’Preconditioning in H( div) and
applications’, Math. Comput. 66, 957–984.
Z.-Z. Bai, G.H. Golub and M.K. Ng (2003), ’Hermitian and skew-Hermitian split-
ting methods for non-Hermitian positive definite linear systems’, SIAM J.
Matrix Anal. Appl. 24, 603–626.
N.S. Bakhvalov (1966), ’On the convergence of a relaxation method with natural
constraints on the elliptic operator’, USSR Comp. Math. and Math. Phys.
6, 101–135.
M. Bebendorf (2008), Hierarchical Matrices: A Means to Efficiently Solve Ellip-
tic Boundary Value Problems, Lecture Notes in Computational Science and
Engineering, vol. 63, Springer-Verlag.
M. Benzi (2002), ’Preconditioning techniques for large linear systems: A survey’,
J. Comput. Phys. 182, 418–477.
M. Benzi, J.K. Cullum and M. Tuma (2000), ’Robust approximate inverse precon-
ditioning for the conjugate gradient method’, SIAM J. Sci. Comput. 22,1318–
1332.
M. Benzi, G.H. Golub and J. Liesen (2005), ’Numerical solution of saddle point
problems’, Acta Numerica 14, 1–137.
M. Benzi, C.D. Meyer and M. Tuma (1996), ’A sparse approximate inverse precon-
ditioner for the conjugate gradient method’, SIAM J. Sci. Comput. 17, 1135–
1149.
M. Benzi, M. Ng, Q. Niu and Z. Wang (2011a), ’A relaxed dimensional factorization
preconditioner for the incompressible Navier-Stokes equations’, J. Comput.
Phys. 230, 6185–6202.
M. Benzi and M.A. Olshanskii (2011), ’Field-of-values convergence analysis of aug-
mented Lagrangian preconditioners for the linearized Navier-Stokes problem’,
SIAM J. Numer. Anal. 49, 770–788.
M. Benzi, M.A. Olshanskii and Z. Wang (2011b), ’Modified augmented Lagrangian
preconditioners for the incompressible Navier-Stokes equations’, Int. J. Nu-
mer. Meth. Fluids 66, 486–508.
M. Benzi and V. Simoncini (2006), ’On the eigenvalues of a class of saddle point
matrices’, Numer. Math. 103, 173–196.
M. Benzi and M. Tuma (1998), ’A sparse approximate inverse preconditioner for
nonsymmetric linear systems’, SIAM J. Sci. Comput. 19, 968–994.
M. Benzi and M. Tuma (2003), ’A robust incomplete factorization preconditioner
for positive definite matrices’, Numer. Linear Algebra Appl. 10, 385–400.
M. Benzi and A.J. Wathen (2008), ’Some preconditioning techniques for saddle
point problems’, in Model order reduction: theory, research aspects and ap-
plications, 195–211.
L. Bergamaschi, J. Gondzio and G. Zilli (2004), ’Preconditioning indefinite systems
in interior point methods for optimization’, Comput. Optim. Appl. 28, 149–
171.
M. Bern, J.R. Gilbert, B. Hendrickson, N. Nguyen and S. Toledo (2006), ’Support-
Graph preconditioners’, SIAM J. Matrix Anal. Appl. 27, 930–951.
A. Björck (1996), Numerical Methods for Least Squares Problems, Society for In-
dustrial and Applied Mathematics.
J.W. Boyle, M.D. Mihajlovic and J.A. Scott (2010), ’HSL MI20: an efficient AMG
preconditioner for finite element problems in 3D’, Int. J. Numer. Meth. En-
gnrg 82, 64–98.
D. Braess and P. Peisker (1986), ’On the numerical solution of the biharmonic equa-
tion and the role of squaring matrices for preconditioning’, IMA J. Numer.
Anal. 6, 393–404.
J. Bramble and J. Pasciak (1988), ’A preconditioning technique for indefinite sys-
tem resulting from mixed approximations of elliptic problems’, Math. Com-
put. 50, 1–17.
J.H. Bramble, J.E. Pasciak and J. Xu (1990), ’Parallel multilevel preconditioners’,
Math. Comput. 55,1–22.
A. Brandt (1977), ’Multilevel adaptive solutions to boundary-value problems’,
Math. Comput. 31,333–390.
A. Brandt and S. Ta’asan (1986), ’Multigrid methods for nearly singular and
slightly indefinite problems’, in Multigrid Methods II, W. Hackbusch and
U. Trottenberg, eds., Lecture Notes in Math. 1228, Springer-Verlag, Berlin,
pp. 99–121.
O. Bröker, M.J. Grote, C. Mayer and A. Reusken (2001), ’Robust parallel smoothing for multigrid via sparse approximate inverses’, SIAM J. Sci. Comput. 23, 1395–1416.
R.H.-F. Chan and X.-Q. Jin (2007), An Introduction to Iterative Toeplitz Solvers,
Society for Industrial and Applied Mathematics.
T.F. Chan and T.P. Mathew (1994), ’Domain decomposition algorithms’, Acta
Numerica 3, 61–143.
S. Chandrasekaran, P. Dewilde, M. Gu, T. Pals, X. Sun, A.J. van der Veen and
D. White (2005), ’Some fast algorithms for sequentially semiseparable repre-
sentation’, SIAM J. Matrix Anal. Appl. 27, 341–364.
K. Chen (2005), Matrix preconditioning techniques and applications, Cambridge
University Press.
P. Concus and G.H. Golub (1976), ’A generalized conjugate gradient method for
nonsymmetric systems of linear equations’, in R. Glowinski and P.L. Lions
eds., Lecture Notes in Economics and Mathematical Systems 134, 56–65,
Springer.
J.W. Cooley and J.W. Tukey (1965), ’An algorithm for the machine calculation of
complex Fourier series’, Math. Comput. 19, 297–301.
T.A. Davis (2004), ’Algorithm 832: UMFPACK V4.3—an unsymmetric-pattern
multifrontal method’, ACM Trans. Math. Softw. 30, 196–199.
A. Greenbaum (1997), Iterative Methods for Solving Linear Systems, Society for
Industrial and Applied Mathematics.
L. Greengard and V. Rokhlin (1987), ’A fast algorithm for particle simulations’, J.
Comput. Phys. 73, 325–348.
C. Greif and D. Schötzau (2006), ’Preconditioners for saddle point linear systems
with highly singular (1,1) blocks’, Elect. Trans. Numer. Anal. 22, 114–121.
C. Greif and D. Schötzau (2007), ’Preconditioners for the discretized time-harmonic
Maxwell equations in mixed form’, Numer. Lin. Alg. Appl. 14, 281–297.
M.J. Grote and T. Huckle (1997), ’Parallel preconditioning with sparse approxi-
mate inverses’, SIAM J. Sci. Comput. 18, 838–853.
A. Günnel, R. Herzog and E. Sachs (2014), ’A note on preconditioners and scalar
products in Krylov subspace methods for self-adjoint problems in Hilbert
space’, Elect. Trans. Numer. Anal. 41, 13–20.
I. Gustafsson (1978), ’A class of first order factorization methods’, BIT 18, 142–
156.
E. Haber and U.M. Ascher (2001), ’Preconditioned all-at-once methods for large,
sparse parameter estimation problems’, Inverse Probl. 17, 1847–1864.
W. Hackbusch (1994), Iterative Solution of Large Sparse Systems of Equations,
Springer-Verlag.
W. Hackbusch (1999), ’A sparse matrix arithmetic based on H-matrices. I. Intro-
duction to H-matrices’, Computing 62, 89–108
W. Hackbusch, B.K. Khoromskij and S. Sauter (2000), ’On H²-matrices’, in Lectures on Applied Mathematics (H. Bungartz, R. Hoppe and C. Zenger, eds), 9–29.
M. Heil, A.L. Hazel and J. Boyle (2008), ’Solvers for large-displacement fluid-
structure interaction problems: segregated versus monolithic approaches’,
Comput. Mech. 43, 91–101.
V.E. Henson and U.M. Yang (2000), ’BoomerAMG: a parallel algebraic multigrid
solver and preconditioner’, Appl. Numer. Math. 41, 155–177.
M.R. Hestenes and E. Stiefel (1952), ’Methods of conjugate gradients for solving
linear systems’, J. Res. Nat. Bur. Stand. 49, 409–436.
R. Hiptmair and J. Xu (2007), ’Nodal auxiliary space preconditioning in H(curl)
and H(div) spaces’, SIAM J. Numer. Anal. 45, 2483–2509.
R.M. Holland, A.J. Wathen and G.J. Shaw, (2005), ’Sparse approximate inverses
and target matrices’, SIAM J. Sci. Comput. 26, 1000–1011.
HSL (2013), ’A collection of Fortran codes for large scale scientific computation’, http://www.hsl.rl.ac.uk.
V. John and L. Tobiska (2000), ’Numerical performance of smoothers in coupled
multigrid methods for the parallel solution of the incompressible Navier-
Stokes equations’, Int. J. Numer. Meth. Fluids 33, 453–473.
T.B. Jonsthovel, M.B. van Gijzen, S.C. Maclachlan, C. Vuik and A. Scarpas (2012),
’Comparison of the deflated preconditioned conjugate gradient method and
algebraic multigrid for composite materials’, Comput. Mech. 50, 321–333.
D. Kay, D. Loghin and A.J. Wathen (2002), ’A preconditioner for the steady-state
Navier-Stokes equations’, SIAM J. Sci. Comput. 24, 237–256.
C. Keller, N.I.M. Gould and A.J. Wathen (2000), ’Constraint preconditioning for
indefinite linear systems’, SIAM J. Matrix Anal. Appl. 21, 1300–1317.
M.A. Olshanskii and E.E. Tyrtyshnikov (2014), Iterative Methods for Linear Sys-
tems: Theory and Applications, Society for Industrial and Applied Mathe-
matics.
M.E.G. Ong (1997), ’Hierarchical basis preconditioners in three dimensions’, SIAM
J. Sci. Comput. 18, 479–498.
K. Otto (1996), ’Analysis of preconditioners for hyperbolic partial differential equa-
tions’, SIAM J. Numer. Anal. 33, 2131–2165.
C.C. Paige and M.A. Saunders (1975), ’Solution of sparse indefinite systems of
linear equations’, SIAM J. Num. Anal. 12, 617–629.
C.C. Paige and M.A. Saunders (1982), ’LSQR: An algorithm for sparse linear
equations and sparse least squares’, ACM Trans. Math. Software 8, 43–71.
M.L. Parks, E. de Sturler, G. Mackey, D.D. Johnson and S. Maiti (2006), ’Recycling
Krylov subspaces for sequences of linear systems’, SIAM J. Sci. Comput.
28, 1651–1674.
J. Pestana and A.J. Wathen (2014), ’A preconditioned MINRES method for non-
symmetric Toeplitz matrices’, Numerical Analysis report 14/11, Oxford Uni-
versity Mathematical Institute, Oxford, UK.
J. Pestana and A.J. Wathen (2013a), ’Combination preconditioning of saddle point
systems for positive definiteness’, Numer. Linear Algebra Appl. 20, 785–808.
J. Pestana and A.J. Wathen (2013b), ’On choice of preconditioner for mini-
mum residual methods for non-Hermitian matrices’, J. Comput. Appl. Math.
249, 57–68.
J. Pestana and A.J. Wathen (2015), ’Natural preconditioning and iterative methods
for saddle point systems’, SIAM Rev. to appear.
E.G. Phillips, H.C. Elman, E.C. Cyr, J.N. Shadid and R.P. Pawlowski (2014), ’A
block preconditioner for an exact penalty formulation for stationary MHD’,
University of Maryland, Computer Science report CS-TR-5032.
J. Poulson, B. Engquist, S. Li and L. Ying (2013), ’A parallel sweeping precon-
ditioner for heterogeneous 3D Helmholtz equations’, SIAM J. Sci. Comput.
35, C194–C212.
A. Quarteroni and A. Valli (1999), Domain Decomposition Methods for Partial
Differential Equations, Oxford University Press.
A. Ramage (1999), ’A multigrid preconditioner for stabilised discretisations of
advection-diffusion problems’, J. Comput. Appl. Math. 101, 187–203.
T. Rees, H.S. Dollar and A.J. Wathen (2010a), ’Optimal solvers for PDE-
constrained optimization’, SIAM J. Sci. Comput. 32, 271–298.
T. Rees, M. Stoll and A.J. Wathen (2010b), ’All-at-once preconditioning in PDE-
constrained optimization’, Kybernetika 46, 341–360.
S. Rhebergen, G.N. Wells, R.F. Katz and A.J. Wathen (2014), ’Analysis of block-
preconditioners for models of coupled magma/mantle dynamics’, SIAM J.
Sci. Comput. 36, A1960–A1977.
Y. Saad (1993), ’A flexible inner-outer preconditioned GMRES algorithm’, SIAM
J. Sci. Comput. 14, 461–469.
Y. Saad (1996), ’ILUM: a multi-elimination ILU preconditioner for general sparse
matrices’, SIAM J. Sci. Comput. 17, 830–847.
Y. Saad (2003), Iterative Methods for Sparse Linear Systems, second edition, So-
ciety for Industrial and Applied Mathematics.
Y. Saad and M.H. Schultz (1986), ’GMRES: A generalized minimal residual algo-
rithm for solving nonsymmetric linear systems’, SIAM J. Sci. Statist. Com-
put. 7, 856–869.
J.A. Sifuentes, M. Embree and R.B. Morgan (2013), ’GMRES convergence for per-
turbed coefficient matrices, with application to approximate deflation precon-
ditioning’, SIAM. J. Matrix Anal. Appl. 34, 1066–1088.
D.J. Silvester and A.J. Wathen (1994), ’Fast iterative solution of stabilised Stokes
systems Part II: Using general block preconditioners’, SIAM J. Numer Anal.
31, 1352–1367.
V. Simoncini and D.B. Szyld (2007), ’Recent computational developments in
Krylov subspace methods for linear systems’, Numer. Linear Algebra Appl.
14, 1–59.
G.L.G. Sleijpen and D.R. Fokkema (1993), ’BiCGstab(ell) for linear equations in-
volving unsymmetric matrices with complex spectrum’, Elect. Trans. Numer.
Anal. 1, 11–32.
B. Smith, P. Bjørstad and W. Gropp (1996), Domain Decomposition, Parallel Mul-
tilevel Methods for Elliptic Partial Differential Equations, Cambridge Univer-
sity Press.
P. Sonneveld and M.B. van Gijzen (2008), ’IDR(s): A family of simple and fast
algorithms for solving large nonsymmetric systems of linear equations’, SIAM
J. Sci. Comput. 31, 1035–1062.
D.A. Spielman and S.-H. Teng (2014), ’Nearly-linear time algorithms for precon-
ditioning and solving symmetric, diagonally dominant linear systems’, SIAM
J Matrix Anal. Appl. 35, 835–885.
M. Stoll and A.J. Wathen, (2008), ’Combination preconditioning and the Bramble-
Pasciak+ preconditioner’, SIAM J Matrix Anal. Appl. 30, 582–608.
G. Strang (1986), ’A proposal for Toeplitz matrix calculations’, Stud. Appl. Math.
74, 171–176.
W-P. Tang and W.L. Wan (2000), ’Sparse approximate inverse smoother for multi-
grid’, SIAM J. Matrix Anal. Appl. 21, 1236–1252.
O. Taussky (1972), ’The role of symmetric matrices in the study of general matri-
ces’, Linear Algebra Appl. 5, 147–154.
A. Toselli and O. Widlund (2004), Domain Decomposition Methods - Algorithms and Theory, Springer Series in Computational Mathematics, Vol. 34.
L.N. Trefethen and M. Embree (2005), Spectra and Pseudospectra, Princeton Uni-
versity Press.
U. Trottenberg, C.W. Oosterlee and A. Schüller (2000), Multigrid, Academic Press.
P. Tsuji, B. Engquist and L. Ying (2012), ’A sweeping preconditioner for time-harmonic Maxwell’s equations with finite elements’, J. Comput. Phys. 231, 3770–3783.
M.B. van Gijzen, Y.A. Erlangga and C. Vuik (2007), ’Spectral analysis of the
discrete Helmholtz operator preconditioned with a shifted Laplacian’, SIAM
J. Sci. Comput. 29, 1942–1958.
H.A. van der Vorst (1992), ’Bi-CGSTAB: A fast and smoothly converging variant of
Bi-CG for the solution of nonsymmetric linear systems’, SIAM J. Sci. Statist.
Comput. 13, 631–644.
H.A. van der Vorst (2003), Iterative Krylov Methods for Large Linear Systems,
Cambridge University Press.
C.F. Van Loan (1992), Computational Frameworks for the Fast Fourier Transform,
Society for Industrial and Applied Mathematics.
P.S. Vassilevski (2008), Multilevel Block Factorization Preconditioners, Matrix-
based Analysis and Algorithms for Solving Finite Element Equations,
Springer.
R. Verfürth (1984), ’A combined conjugate gradient-multigrid algorithm for the
numerical solution of the Stokes problem’, IMA J. Numer. Anal. 4, 441–455.
A.J. Wathen (1986), ’Realistic eigenvalue bounds for the Galerkin mass matrix’,
IMA J. Numer. Anal. 7, 449–457.
A.J. Wathen (2007), ’Preconditioning and convergence in the right norm’, Int. J.
Comput. Math. 84, 1199–1209.
A.J. Wathen, B. Fischer and D.J. Silvester (1995), ’The convergence rate of the
minimum residual method for the Stokes problem’, Numer. Math. 71, 121–
134.
A.J. Wathen and T. Rees (2009), ’Chebyshev semi-iteration in preconditioning for
problems including the mass matrix’, Elect. Trans. Numer. Anal. 34, 125–135.
M.P. Wathen (2014), ’Iterative solution of a mixed finite element discretisation of
an incompressible magnetohydrodynamics problem’, MSc dissertation, De-
partment of Computer Science, University of British Columbia.
O. Widlund (1978), ’A Lanczos method for a class of nonsymmetric systems of
linear equations’, SIAM J. Numer. Anal. 15, 801–812.
P.H. Worley (1991), ’Limits on parallelism in the numerical solution of linear partial
differential equations’, SIAM J. Sci. Stat. Comput. 12, 1–35.
J. Xu (1992), ’Iterative methods by space decomposition and subspace correction’,
SIAM Rev. 34, 581–613.
G. Ymbert, M. Embree and J. Sifuentes (2015), ’Approximate Murphy-Golub-
Wathen preconditioning for saddle point problems’, In Preparation.
H. Yserentant (1986), ’On the multi-level splitting of finite element spaces’, Numer.
Math. 49, 379–412.