Numerical Analysis Introduction
Another key figure is John von Neumann, who came to the US in the early 1930s
to join the newly formed Institute for Advanced Study in Princeton, NJ. Other members
of the IAS included Albert Einstein and Hermann Weyl.
A Hungarian-born child prodigy, von Neumann initially worked with Hilbert on
mathematical logic, the foundations of mathematics, and the rigorous formulation of
quantum mechanics. He also worked on topological groups, rings of operators, ergodic
theory and many other areas of pure mathematics.
In applied mathematics he created the field of game theory (with the economist Oskar
Morgenstern) and did fundamental work in hydrodynamics, meteorology, numerical
linear algebra, and numerical PDEs. He is, however, best known for his work on the
logical design of digital computers and his theory of automata.
Numerical linear algebra also received great impetus from the establishment on the
UCLA campus of the Institute for Numerical Analysis of the National Bureau of
Standards (1947-1954).
Among the main figures associated with INA we mention George Forsythe, Cornelius
Lanczos, Isaac Schoenberg, Olga Taussky-Todd, John Todd, Magnus Hestenes, and
Eduard Stiefel.
Both the Lanczos algorithm and the Conjugate Gradient method date to this exciting
period. Their widespread use, however, will have to wait until the 1970s.
Cornelius Lanczos (1893–1974)
The 1950s also see great progress in the analysis of finite difference methods for PDEs
(including the famous equivalence theorem of Peter Lax) and the development of
Alternating Direction Implicit (ADI) and splitting-up methods by Peaceman, Rachford,
Douglas, Birkhoff, Varga, and Young in the USA, and by Yanenko and others in the USSR.
Together with the SOR method, ADI schemes are the method of choice for the
numerical solution of discretized PDEs during much of the 1950s and 1960s. Only in the
1970s will the Conjugate Gradient method gain favor, especially after the development
of incomplete Cholesky preconditioning. Much credit here goes to John Reid (UK), Koos
Meijerink and Henk van der Vorst (Netherlands), and to a 1976 paper by Gene Golub,
Paul Concus and Dianne O’Leary.
In the 1960s and 1970s a class of (non-iterative) methods for solving the Poisson
equation on regular grids in near-optimal time is developed by Buneman, Golub,
Buzbee, Nielson and others. These techniques are closely related to cyclic reduction
and the Fast Fourier Transform (FFT) of Cooley and Tukey (1965).
These fast Poisson solvers will later be superseded by variants of the multigrid method,
first studied in the USSR by Fedorenko and Bakhvalov in the 1960s, and later by Achi
Brandt, Wolfgang Hackbusch, and many others.
Chief applications of PDE solvers during this period are in the areas of nuclear reactor
modeling and petroleum engineering.
The 1980s and 1990s see rapid developments in the field of Krylov subspace methods
for nonsymmetric linear systems (Young, Saylor, Sonneveld, Eisenstat, Elman, Schultz,
Saad, Freund, Nachtigal, van der Vorst), preconditioning, multilevel algorithms, and
large-scale eigenvalue solvers.
On the theory side, the important Faber-Manteuffel Theorem (1984) settles a question
of Golub on the existence of optimal Krylov methods based on short recurrences.
We also mention the impact of parallel computing, stimulating the development of
domain decomposition schemes by Lions, Widlund, and many others.
Jacobi uses a plane rotation (with angle α = 22°30′) to annihilate the (1,2)–(2,1)
coefficient. After this, the transformed system is solved in three iterations of Jacobi’s
method. Each iteration adds about one digit of accuracy.
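In modern notation, this two-stage procedure can be sketched as follows; the 3×3 system and the rotation code are illustrative assumptions, not Jacobi's original least-squares data:

```python
import numpy as np

def rotate_out(A, b, i=0, j=1):
    """Plane rotation chosen to annihilate the symmetric pair A[i,j] = A[j,i].
    Substituting x = G y turns A x = b into (G.T A G) y = G.T b."""
    theta = 0.5 * np.arctan2(2 * A[i, j], A[j, j] - A[i, i])
    G = np.eye(len(b))
    c, s = np.cos(theta), np.sin(theta)
    G[i, i] = G[j, j] = c
    G[i, j], G[j, i] = s, -s
    return G.T @ A @ G, G.T @ b, G

def jacobi_sweeps(A, b, nsweeps):
    """Plain Jacobi iteration starting from the zero vector."""
    D = np.diag(A)
    x = np.zeros_like(b)
    for _ in range(nsweeps):
        x = (b - A @ x + D * x) / D
    return x

# Illustrative symmetric, diagonally dominant system (not Jacobi's own).
A = np.array([[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
A2, b2, G = rotate_out(A, b)       # the (1,2)-(2,1) entry is now zero
x = G @ jacobi_sweeps(A2, b2, 30)  # undo the change of variables
```

Each Jacobi sweep on the rotated, diagonally dominant system reduces the error by a roughly constant factor, matching the "about one digit per iteration" behavior described above.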
Again in the context of least squares, in 1874 another German, Seidel, publishes his
own iterative method. The paper contains what we now (inappropriately) call the
Gauss–Seidel method, which he describes as an improvement over Jacobi’s method.
Seidel notes that the unknowns do not have to be processed cyclically (in fact, he
advises against it!); instead, one could choose to update at each step the unknown with
the largest residual. He seems to be unaware that this is precisely Gauss’ method.
In the same paper, Seidel mentions a block variant of his scheme. He also notes that
the calculations can be computed to variable accuracy, using fewer decimals in the first
iterations. His linear systems had up to 72 unknowns.
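The two update orderings Seidel discusses can be sketched in modern form; the small system below is a hypothetical example, not one of Seidel's:

```python
import numpy as np

def gauss_seidel_sweep(A, b, x):
    """One cyclic sweep: update the unknowns in order 1, ..., n,
    using the newest values as soon as they are available."""
    for i in range(len(b)):
        x[i] += (b[i] - A[i] @ x) / A[i, i]
    return x

def relax_largest_residual(A, b, x):
    """Gauss's variant: update only the unknown with the largest residual."""
    r = b - A @ x
    i = np.argmax(np.abs(r))
    x[i] += r[i] / A[i, i]
    return x

# Illustrative diagonally dominant SPD system; exact solution (1, 1, 1).
A = np.array([[5.0, 1.0, 1.0], [1.0, 5.0, 1.0], [1.0, 1.0, 5.0]])
b = np.array([7.0, 7.0, 7.0])
x = np.zeros(3)
for _ in range(20):
    x = gauss_seidel_sweep(A, b, x)
```

The greedy variant does more work per update (it scans the full residual) but often needs fewer single-unknown corrections, which is why it suited hand computation.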
Other important 19th Century developments worth mentioning were the independent
proofs of convergence of Seidel's method for systems of normal equations (more
generally, SPD systems), given by Nekrasov (1885) and by Pizzetti (1887).
These authors were the first to note that a necessary and sufficient condition for the
convergence of the method (for an arbitrary initial guess x(0)) is that all the eigenvalues
of the iteration matrix satisfy |λ| < 1. Nekrasov and Mehmke (1892) also gave
examples to show that convergence can be slow.
Nekrasov seems to have been the first to relate the rate of convergence to the dominant
eigenvalue of the iteration matrix. The treatment is still in terms of determinants, and no
use is made of matrix notation.
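In matrix notation, the criterion and the rate are easy to check numerically; here is a sketch for the Gauss–Seidel iteration matrix of a small SPD system (the 2×2 matrix is an arbitrary example):

```python
import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])            # SPD example
# Gauss-Seidel splitting A = (D + L) + U gives iteration matrix
# T = -(D + L)^{-1} U; the iteration converges for every starting
# vector iff all eigenvalues of T satisfy |lambda| < 1.
T = np.linalg.solve(np.tril(A), -np.triu(A, 1))
rho = max(abs(np.linalg.eigvals(T)))              # dominant eigenvalue modulus
# Asymptotically the error shrinks by a factor of about rho per sweep;
# here rho = 1/12, so convergence is fast.
```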
Mauro Picone
Throughout the 1930s and beyond, under Picone’s direction, the INAC employed a
number of young mathematicians, many of whom later became very well known.
INAC researchers did basic research and also worked on a large number of applied
problems supplied by industry, government agencies, and the Italian military. Picone
was fond of saying that
Matematica Applicata = Matematica Fascista
(Applied Mathematics = Fascist Mathematics)
In the 1930s the INAC also employed up to eleven computers and draftsmen. These
were highly skilled men and women who were responsible for carrying out all the
necessary calculations using a variety of mechanical, electro-mechanical, and graphical
devices.
As early as 1932, Picone designed and taught one of the first courses on numerical
methods ever offered at an Italian university (the course was called Calcoli Numerici e
Grafici). The course was taught in the School of Statistical and Actuarial Sciences,
because Picone’s colleagues in the Mathematics Institute denied his request to have
the course listed among the electives for the degree in Mathematics.
The course covered root finding, maxima and minima, solutions of linear and nonlinear
systems, interpolation, numerical quadrature, and practical Fourier analysis.
Both Jacobi’s and Seidel’s methods are discussed (including block variants). Picone’s
course was not very different from current introductory classes in numerical analysis.
In the same paper, Cesari applies his general theory to the methods of Jacobi, Seidel,
and von Mises (stationary Richardson). He uses ω ≠ 1 only for the latter.
In the case of von Mises’ method (analyzed for the SPD case), Cesari notes that,
regardless of ω, the rate of convergence of the method deteriorates as the ratio of the
extreme eigenvalues of A increases. He writes:
In practice, we found that already for λmax(A)/λmin(A) > 10 the method of von Mises
converges too slowly.
This observation leads Cesari to the idea of polynomial preconditioning.
Given estimates a ≈ λmin(A) and b ≈ λmax(A), Cesari determines, for 1 ≤ k ≤ 4, the
coefficients of the polynomial p(x) of degree k such that the ratio of the maximum to
the minimum of q(x) = xp(x) over [a, b] is minimized.
The transformed system
p(A)Ax = p(A)b,
which he shows to be equivalent to the original one, can be expected to have a smaller
condition number.
Cesari ends the paper with a brief discussion of when this approach may be useful and
gives the results of numerical experiments with all three methods on a 3×3 example
using a polynomial of degree k = 1.
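A minimal numerical sketch of the idea; the degree-1 polynomial p(x) = a + b − x used below is a simple illustrative choice, not Cesari's optimal coefficients:

```python
import numpy as np

# With spectral estimates a <= lambda_min(A) and b_est >= lambda_max(A),
# take p(x) = a + b_est - x, so q(x) = x p(x) is positive on [a, b_est]
# and its max/min ratio there is smaller than b_est/a.
a, b_est = 1.0, 10.0
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
A = Q @ np.diag(np.linspace(a, b_est, 6)) @ Q.T    # SPD, spectrum in [a, b_est]
b = rng.standard_normal(6)

pA = (a + b_est) * np.eye(6) - A                   # p(A)
M = pA @ A                                         # p(A) A: same solution, smaller cond
print(np.linalg.cond(A), np.linalg.cond(M))        # here roughly 10 vs 3
```

Since p(A) is invertible when q(x) > 0 on the spectrum, p(A)Ax = p(A)b has exactly the same solution as Ax = b, but the better-conditioned operator p(A)A makes the stationary iteration converge faster.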
Cesari’s paper was not without influence: it is cited, sometimes at length, in important
papers by Forsythe (1952-1953) and in the books by Bodewig (1956), Faddeev &
Faddeeva (1960), Householder (1964), Wachspress (1966) and Saad (2003) among
others.
It is, however, not cited in the influential books of Varga (1962) and Young (1971).
Cesari’s paper is important for our story also because of the effect it had on a former
student and assistant of Picone.
Cimmino’s method
Gianfranco Cimmino (Naples, 1908; Bologna, 1989) graduated at Naples with Picone in
1927 with a thesis on approximate solution methods for the heat equation in 2D.
After a period spent at INAC and a study stay in Germany (again with Carathéodory),
he undertook a brilliant academic career. He became a full professor in 1938, and in
1939 moved to the chair of Mathematical Analysis at Bologna, where he spent his entire
career.
Cimmino’s work was mostly in analysis: theory of linear elliptic PDEs, calculus of
variations, integral equations, functional analysis, etc. He also wrote 5-6 short papers on
matrix computations.
Gianfranco Cimmino (1908-1989)
Given an initial approximation x(0), Cimmino takes, for each i = 1, 2, ..., n, the reflection
(mirror image) x_i(0) of x(0) with respect to the hyperplane (1):

x_i(0) = x(0) + 2 · (b_i − ⟨a_i, x(0)⟩)/‖a_i‖² · a_i.   (2)
Cimmino’s method (cont.)
Given n arbitrarily chosen positive quantities m_1, ..., m_n, Cimmino constructs the next
iterate x(1) as the center of gravity of the system formed by placing the n masses m_i at
the points x_i(0) given by (2), for i = 1, 2, ..., n. Cimmino notes that the initial point x(0)
and its reflections with respect to the n hyperplanes (1) all lie on a hypersphere whose
center is precisely the point common to the n hyperplanes, namely, the solution
of the linear system. Because the center of gravity of the system of masses {m_i}
must necessarily fall inside this hypersphere, it follows that the new iterate x(1) is a
better approximation to the solution than x(0).
Cimmino’s method (n = 2)
From C. D. Meyer, Matrix Analysis and Applied Linear Algebra, SIAM, 2000.
Cimmino proves that his method is always convergent.
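A sketch of the iteration, using the choice m_i = ‖a_i‖² for the masses (mentioned further below); the 2×2 system is an illustrative example:

```python
import numpy as np

def cimmino_step(A, b, x, m):
    """Reflect x in each hyperplane <a_i, y> = b_i, then return the
    center of gravity of the n reflections with masses m_i."""
    norms2 = np.sum(A * A, axis=1)                       # ||a_i||^2 for each row
    refl = x + 2 * ((b - A @ x) / norms2)[:, None] * A   # one reflection per row
    return (m[:, None] * refl).sum(axis=0) / m.sum()

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([5.0, 5.0])                 # exact solution (1, 2)
m = np.sum(A * A, axis=1)                # masses m_i = ||a_i||^2
x = np.zeros(2)
for _ in range(200):
    x = cimmino_step(A, b, x, m)
```

Note that all n reflections in a step use the same current iterate, so they can be computed independently; this is the source of the method's parallelism noted later.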
In the same paper Cimmino shows that the iterates converge to a solution of Ax = b
even in the case of a singular (but consistent) system, provided that rank(A) ≥ 2. He
then notes that the sequence {x(k)} converges even when the linear system is
inconsistent, always provided that rank(A) ≥ 2. Much later (1967) Cimmino wrote: “The
latter observation, however, is just a curiosity, being obviously devoid of any practical
usefulness.” [sic!]
It can be shown that for an appropriate choice of the masses m_i, the sequence {x(k)}
converges to the minimum 2-norm solution of ‖b − Ax‖₂ = min.
In matrix form, Cimmino’s method can be written as

x(k+1) = x(k) + (2/µ) AᵀD (b − Ax(k))   (k = 0, 1, ...),

where D = diag(m_1/‖a_1‖², m_2/‖a_2‖², ..., m_n/‖a_n‖²) and µ = m_1 + m_2 + ··· + m_n.
Therefore, Cimmino’s method is a special case of von Mises’ method (stationary
Richardson) on the normal equations if we let m_i = ‖a_i‖². Cimmino’s method
corresponds to using ω = 2/µ for the relaxation factor. With such a choice, convergence
is guaranteed.
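The agreement between the matrix form and the geometric center-of-gravity construction can be verified numerically with random data and arbitrary positive masses:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
b = rng.standard_normal(4)
x = rng.standard_normal(4)
m = rng.uniform(0.1, 1.0, 4)                         # arbitrary positive masses
norms2 = np.sum(A * A, axis=1)                       # ||a_i||^2

# Center of gravity of the n reflections, as in Cimmino's construction.
refl = x + 2 * ((b - A @ x) / norms2)[:, None] * A
x_geom = (m[:, None] * refl).sum(axis=0) / m.sum()

# Matrix form: x + (2/mu) A^T D (b - A x).
D = np.diag(m / norms2)
x_mat = x + (2.0 / m.sum()) * A.T @ D @ (b - A @ x)
print(np.allclose(x_geom, x_mat))                    # True
```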
Cimmino’s legacy
Cimmino’s method, like the contemporary (and related) method of Kaczmarz, did not
attract much attention until many years later.
Although it was described by Forsythe (1953) and in the books of Bodewig (1956),
Householder (1964), Gastinel (1966), and others, I was able to find only 8 journal
citations of Cimmino’s 1938 paper prior to 1980.
After 1980, the number of papers and books citing Cimmino’s (as well as Kaczmarz’s)
method picks up dramatically, and it is now in the hundreds. Moreover, both methods
have been reinvented several times.
Kaczmarz’s method (n = 2)
From C. D. Meyer, Matrix Analysis and Applied Linear Algebra, SIAM, 2000.
Two major reasons for this surge in popularity are that the method has a regularizing
property when applied to discrete ill-posed problems, and that the algorithm offers a
high degree of parallelism.
Today, Cimmino’s method is rarely used to solve linear systems. Rather, it forms the
basis for algorithms that are used to solve systems of inequalities (the so-called convex
feasibility problem), and it has applications in computerized tomography, radiation
treatment planning, medical imaging, etc.
Indeed, most citations occur in the medical physics literature, an outcome that would
have pleased Gianfranco Cimmino.
References
G. Birkhoff, Solving Elliptic Problems: 1930–1980, in M. H. Schultz, Ed., Elliptic Problem
Solvers, Academic Press, NY, 1981.
C. Brezinski and L. Wuytack, Eds., Numerical Analysis: Historical Developments in the
Twentieth Century, North-Holland, Amsterdam, 2001.
M. R. Hestenes and J. Todd, Mathematicians Learning to Use Computers. The Institute
for Numerical Analysis, UCLA, 1947–1954. National Institute of Standards and
Technology and Mathematical Association of America, Washington, DC, 1991.
S. G. Nash, Ed., A History of Scientific Computing, ACM Press and Addison-Wesley
Publishing Co., NY, 1990.
The SIAM History Project at http://history.siam.org contains a number of articles and
transcripts of interviews—highly recommended!
M. Benzi, Gianfranco Cimmino’s contribution to numerical mathematics, Atti del
Seminario di Analisi Matematica dell’Università di Bologna, Technoprint, 2005,
pp. 87–109.
Y. Saad and H. A. van der Vorst, Iterative solution of linear systems in the 20th Century,
J. Comput. Appl. Math., 123 (2000), pp. 1–33.
http://history.siam.org/%5C/pdf/nahist_Benzi.pdf
Rizal Technological University
College of Engineering and Industrial Technology