2.5 Iterative Improvement of a Solution to Linear Equations
[Figure 2.5.1 appears here.]

Figure 2.5.1. Iterative improvement of the solution to A · x = b. The first guess x + δx is multiplied by A to produce b + δb. The known vector b is subtracted, giving δb. The linear set with this right-hand side is inverted, giving δx. This is subtracted from the first guess, giving an improved solution x.
If this happens to you, there is a neat trick to restore the full machine precision,
called iterative improvement of the solution. The theory is very straightforward (see
Figure 2.5.1): Suppose that a vector x is the exact solution of the linear set
$$\mathbf{A} \cdot \mathbf{x} = \mathbf{b} \tag{2.5.1}$$
You don't, however, know x. You only know some slightly wrong solution x + δx, where δx is the unknown error. When multiplied by the matrix A, your slightly wrong solution gives a product slightly discrepant from the desired right-hand side b, namely

$$\mathbf{A} \cdot (\mathbf{x} + \delta\mathbf{x}) = \mathbf{b} + \delta\mathbf{b} \tag{2.5.2}$$

Subtracting (2.5.1) from (2.5.2) gives

$$\mathbf{A} \cdot \delta\mathbf{x} = \delta\mathbf{b} \tag{2.5.3}$$
But (2.5.2) can also be solved, trivially, for δb. Substituting this into (2.5.3) gives

$$\mathbf{A} \cdot \delta\mathbf{x} = \mathbf{A} \cdot (\mathbf{x} + \delta\mathbf{x}) - \mathbf{b} \tag{2.5.4}$$
In this equation, the whole right-hand side is known, since x + x is the wrong
solution that you want to improve. It is essential to calculate the right-hand side
in double precision, since there will be a lot of cancellation in the subtraction of b.
Then we need only solve (2.5.4) for the error δx, and subtract this from the wrong
solution to get an improved solution.
An important extra benefit occurs if we obtained the original solution by LU
decomposition. In this case we already have the LU decomposed form of A, and all
we need do to solve (2.5.4) is compute the right-hand side and backsubstitute!
The code to do all this is concise and straightforward:
#include "nrutil.h"
void mprove(float **a, float **alud, int n, int indx[], float b[], float x[])
Improves a solution vector x[1..n] of the linear set of equations A X = B. The matrix
a[1..n][1..n], and the vectors b[1..n] and x[1..n] are input, as is the dimension n.
Also input is alud[1..n][1..n], the LU decomposition of a as returned by ludcmp, and
the vector indx[1..n] also returned by that routine. On output, only x[1..n] is modified,
to an improved set of values.
{
void lubksb(float **a, int n, int *indx, float b[]);
int j,i;
double sdp;
float *r;
r=vector(1,n);
for (i=1;i<=n;i++) { Calculate the right-hand side, accumulating
sdp = -b[i]; the residual in double precision.
for (j=1;j<=n;j++) sdp += a[i][j]*x[j];
r[i]=sdp;
}
lubksb(alud,n,indx,r); Solve for the error term,
for (i=1;i<=n;i++) x[i] -= r[i]; and subtract it from the old solution.
free_vector(r,1,n);
}
You should note that the routine ludcmp in §2.3 destroys the input matrix as it
LU decomposes it. Since iterative improvement requires both the original matrix
and its LU decomposition, you will need to copy A before calling ludcmp. Likewise
lubksb destroys b in obtaining x, so make a copy of b also. If you don't mind
this extra storage, iterative improvement is highly recommended: It is a process
of order only N² operations (multiply vector by matrix, and backsubstitute; see
discussion following equation 2.3.7); it never hurts; and it can really give you your
money's worth if it saves an otherwise ruined solution on which you have already
spent of order N³ operations.
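For example, a minimal calling sequence might look like the following sketch. (The driver name solve_and_improve is ours, not a Numerical Recipes routine; it assumes the standard ludcmp, lubksb, and the nrutil allocators.)

#include "nrutil.h"

void ludcmp(float **a, int n, int *indx, float *d);
void lubksb(float **a, int n, int *indx, float b[]);
void mprove(float **a, float **alud, int n, int indx[], float b[], float x[]);

void solve_and_improve(float **a, float b[], float x[], int n)
/* Hypothetical driver: solves A·x = b by LU decomposition, then applies one
   pass of iterative improvement. Copies are made because ludcmp destroys its
   input matrix and lubksb overwrites its right-hand side. */
{
    int i,j,*indx;
    float d,**alud;

    indx=ivector(1,n);
    alud=matrix(1,n,1,n);
    for (i=1;i<=n;i++)              /* Copy A, since ludcmp destroys its input. */
        for (j=1;j<=n;j++) alud[i][j]=a[i][j];
    for (i=1;i<=n;i++) x[i]=b[i];   /* Copy b, since lubksb overwrites it. */
    ludcmp(alud,n,indx,&d);
    lubksb(alud,n,indx,x);          /* x now holds the (possibly degraded) solution. */
    mprove(a,alud,n,indx,b,x);      /* One pass of iterative improvement. */
    free_matrix(alud,1,n,1,n);
    free_ivector(indx,1,n);
}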
You can call mprove several times in succession if you want. Unless you are
starting quite far from the true solution, one call is generally enough; but a second
call to verify convergence can be reassuring.
The idea of iterative improvement extends to an approximate matrix inverse: Suppose that B0 is an approximate inverse of the matrix A, so that B0 · A is approximately the identity matrix 1.
Define the residual matrix R of B0 as

$$\mathbf{R} \equiv \mathbf{1} - \mathbf{B}_0 \cdot \mathbf{A} \tag{2.5.5}$$

which is supposed to be small (we will be more precise below). Note that therefore

$$\mathbf{B}_0 \cdot \mathbf{A} = \mathbf{1} - \mathbf{R} \tag{2.5.6}$$
Next consider the following formal manipulation:
$$\mathbf{A}^{-1} = \mathbf{A}^{-1} \cdot (\mathbf{B}_0^{-1} \cdot \mathbf{B}_0) = (\mathbf{A}^{-1} \cdot \mathbf{B}_0^{-1}) \cdot \mathbf{B}_0 = (\mathbf{B}_0 \cdot \mathbf{A})^{-1} \cdot \mathbf{B}_0 = (\mathbf{1} - \mathbf{R})^{-1} \cdot \mathbf{B}_0 = (\mathbf{1} + \mathbf{R} + \mathbf{R}^2 + \mathbf{R}^3 + \cdots) \cdot \mathbf{B}_0 \tag{2.5.7}$$
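Truncating the series in (2.5.7) after the linear term and using (2.5.6) gives an improved approximate inverse, and repeating the argument yields an iteration; a sketch of the algebra (this is the sequence Bn whose convergence is discussed below):

$$\mathbf{B}_1 = (\mathbf{1} + \mathbf{R}) \cdot \mathbf{B}_0 = 2\mathbf{B}_0 - \mathbf{B}_0 \cdot \mathbf{A} \cdot \mathbf{B}_0, \qquad \mathbf{B}_{n+1} = 2\mathbf{B}_n - \mathbf{B}_n \cdot \mathbf{A} \cdot \mathbf{B}_n$$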
We can define the norm of a matrix as the largest amplification of length that it is
able to induce on a vector,
$$\|\mathbf{R}\| \equiv \max_{\mathbf{v} \neq 0} \frac{|\mathbf{R} \cdot \mathbf{v}|}{|\mathbf{v}|} \tag{2.5.12}$$
If we let equation (2.5.7) act on some arbitrary right-hand side b, as one wants a matrix inverse
to do, it is obvious that a sufficient condition for convergence is
kRk < 1 (2.5.13)
Pan and Reif [1] point out that a suitable initial guess for B0 is any sufficiently small constant ε
times the matrix transpose of A, that is,

$$\mathbf{B}_0 = \epsilon \mathbf{A}^T \tag{2.5.14}$$
Rarely does one know the eigenvalues of AᵀA in equation (2.5.16). Pan and Reif
derive several interesting bounds, which are computable directly from A. The following
choices guarantee the convergence of Bn as n → ∞,

$$\epsilon \le 1 \Big/ \sum_{j,k} a_{jk}^2 \qquad \text{or} \qquad \epsilon \le 1 \Big/ \left( \max_i \sum_j |a_{ij}| \times \max_j \sum_i |a_{ij}| \right) \tag{2.5.17}$$
The latter expression is truly a remarkable formula, which Pan and Reif derive by noting that
the vector norm in equation (2.5.12) need not be the usual L₂ norm, but can instead be either
the L∞ (max) norm, or the L₁ (absolute value) norm. See their work for details.
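As a concrete illustration, here is a minimal sketch (our own, not from the book's program library: plain 0-based, row-major double arrays and the hypothetical name approx_inverse) that initializes B0 = εAᵀ with ε chosen from the first bound in (2.5.17), then applies the update B ← 2B − B·A·B sketched after equation (2.5.7):

#include <stdlib.h>

/* Hypothetical sketch: build an approximate inverse b of the n×n matrix a
   (both 0-based, row-major) by the Pan-Reif initialization and the
   truncated-series iteration. Once convergence sets in, each pass roughly
   doubles the number of good digits. */
void approx_inverse(int n, double a[], double b[], int niter)
{
    int i,j,k,it;
    double sum=0.0,eps,*ab,*bab;

    for (i=0;i<n*n;i++) sum += a[i]*a[i];       /* First bound of (2.5.17): */
    eps=1.0/sum;                                /* eps <= 1/sum_jk a_jk^2. */
    for (i=0;i<n;i++)                           /* B0 = eps * A-transpose. */
        for (j=0;j<n;j++) b[i*n+j]=eps*a[j*n+i];
    ab=(double *)malloc(n*n*sizeof(double));
    bab=(double *)malloc(n*n*sizeof(double));
    for (it=0;it<niter;it++) {
        for (i=0;i<n;i++)                       /* ab = A · B */
            for (j=0;j<n;j++) {
                double s=0.0;
                for (k=0;k<n;k++) s += a[i*n+k]*b[k*n+j];
                ab[i*n+j]=s;
            }
        for (i=0;i<n;i++)                       /* bab = B · (A · B) */
            for (j=0;j<n;j++) {
                double s=0.0;
                for (k=0;k<n;k++) s += b[i*n+k]*ab[k*n+j];
                bab[i*n+j]=s;
            }
        for (i=0;i<n*n;i++) b[i]=2.0*b[i]-bab[i];   /* B <- 2B - B·A·B */
    }
    free(bab);
    free(ab);
}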
Another approach, with which we have had some success, is to estimate the largest
eigenvalue statistically, by calculating sᵢ ≡ |A · vᵢ|² for several unit vectors vᵢ with randomly
chosen directions in N-space. The largest eigenvalue λ can then be bounded by the maximum
of 2 max sᵢ and 2N Var(sᵢ)/μ(sᵢ), where Var and μ denote the sample variance and mean,
respectively.
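A sketch of that estimate follows (again our own illustration: the helper name estimate_lambda_max is hypothetical, the arrays are 0-based and row-major, POSIX drand48 stands in for any uniform random generator, and ntrials should be at least 2):

#include <math.h>
#include <stdlib.h>

/* Hypothetical sketch of the statistical bound described above: draw ntrials
   random unit vectors v, form s = |A·v|^2, and bound the largest eigenvalue
   by the maximum of 2*max(s) and 2*n*Var(s)/mean(s). */
double estimate_lambda_max(int n, double a[], int ntrials)
{
    int i,j,t;
    double smax=0.0,ssum=0.0,ssum2=0.0;
    double *v=(double *)malloc(n*sizeof(double));

    for (t=0;t<ntrials;t++) {
        double norm=0.0,s=0.0;
        for (j=0;j<n;j++) {                 /* Random direction in N-space. */
            v[j]=2.0*drand48()-1.0;
            norm += v[j]*v[j];
        }
        norm=sqrt(norm);
        for (j=0;j<n;j++) v[j] /= norm;     /* Normalize to a unit vector. */
        for (i=0;i<n;i++) {                 /* Accumulate s = |A·v|^2. */
            double row=0.0;
            for (j=0;j<n;j++) row += a[i*n+j]*v[j];
            s += row*row;
        }
        if (s > smax) smax=s;
        ssum += s;
        ssum2 += s*s;
    }
    free(v);
    {
        double mean=ssum/ntrials;
        double var=(ssum2-ntrials*mean*mean)/(ntrials-1);   /* Sample variance. */
        double bound1=2.0*smax;
        double bound2=2.0*n*var/mean;
        return (bound1 > bound2 ? bound1 : bound2);
    }
}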
2.6 Singular Value Decomposition

There exists a very powerful set of techniques for dealing with sets of equations
or matrices that are either singular or else numerically very close to singular. In many
cases where Gaussian elimination and LU decomposition fail to give satisfactory
results, this set of techniques, known as singular value decomposition, or SVD,
will diagnose for you precisely what the problem is. In some cases, SVD will
not only diagnose the problem, it will also solve it, in the sense of giving you a
useful numerical answer, although, as we shall see, not necessarily the answer
that you thought you should get.
SVD is also the method of choice for solving most linear least-squares problems.
We will outline the relevant theory in this section, but defer detailed discussion of
the use of SVD in this application to Chapter 15, whose subject is the parametric
modeling of data.
SVD methods are based on the following theorem of linear algebra, whose proof
is beyond our scope: Any M × N matrix A whose number of rows M is greater than
or equal to its number of columns N can be written as the product of an M × N
column-orthogonal matrix U, an N × N diagonal matrix W with positive or zero
elements (the singular values), and the transpose of an N × N orthogonal matrix V.
The various shapes of these matrices will be made clearer by the following tableau:
$$\mathbf{A} = \mathbf{U} \cdot \begin{pmatrix} w_1 & & & \\ & w_2 & & \\ & & \ddots & \\ & & & w_N \end{pmatrix} \cdot \mathbf{V}^T \tag{2.6.1}$$
The matrices U and V are each orthogonal in the sense that their columns are
orthonormal,
$$\sum_{i=1}^{M} U_{ik} U_{in} = \delta_{kn} \qquad 1 \le k \le N,\ 1 \le n \le N \tag{2.6.2}$$

$$\sum_{j=1}^{N} V_{jk} V_{jn} = \delta_{kn} \qquad 1 \le k \le N,\ 1 \le n \le N \tag{2.6.3}$$
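Relation (2.6.2) is easy to check numerically for any computed U; here is a minimal sketch (the helper name check_column_orthonormality is ours; 0-based, row-major storage):

#include <math.h>

/* Hypothetical sketch: verify the column orthonormality (2.6.2) of an m×n
   matrix u (0-based, row-major). Returns the largest deviation
   |sum_i U_ik*U_in - delta_kn| over all column pairs (k,n); it should be
   within a modest multiple of machine precision for a good decomposition. */
double check_column_orthonormality(int m, int n, double u[])
{
    int i,k,l;
    double worst=0.0;

    for (k=0;k<n;k++)
        for (l=0;l<n;l++) {
            double s=0.0,dev;
            for (i=0;i<m;i++) s += u[i*n+k]*u[i*n+l];
            dev=fabs(s-(k==l ? 1.0 : 0.0));     /* Compare with delta_kn. */
            if (dev > worst) worst=dev;
        }
    return worst;
}

The same check applies to V via (2.6.3).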