A Method of Solving Algebraic Equations Using An Automatic Computer
1. Iterate the function $2y^2 - 1$, starting with the number $y$ whose arc sin is
required.
2. Record the signs of the iterates in order.
3. Accumulate the signs; that is, record the "partial products" of the signs
in order.
4. Write descending powers of 2 between the signs accumulated.
5. Multiply the series obtained by $\pi/2$.
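The five steps above can be sketched in a few lines of Python. This is only an illustrative reading of the list: the function name `arcsin_by_signs`, the number of terms, the choice to count the sign of the starting value $y$ as the first sign, and the choice of $1/2$ as the first power of 2 are all assumptions, since the text that introduces the steps is not reproduced here.

```python
import math

def arcsin_by_signs(y, terms=50):
    """Approximate arcsin(y), -1 <= y <= 1, from the signs of the iterates of 2y^2 - 1."""
    total = 0.0
    partial_product = 1.0   # step 3: running product of the signs seen so far
    weight = 0.5            # step 4: descending powers of 2 (assumed to start at 1/2)
    for _ in range(terms):
        sign = 1.0 if y >= 0.0 else -1.0   # step 2: sign of the current iterate
        partial_product *= sign
        total += partial_product * weight
        weight *= 0.5
        y = 2.0 * y * y - 1.0              # step 1: next iterate
    return total * math.pi / 2.0           # step 5: multiply the series by pi/2
```

For $y = 0.5$ the iterates are $0.5, -0.5, -0.5, \ldots$, the accumulated signs alternate, and the series $1/2 - 1/4 + 1/8 - \cdots = 1/3$ gives $\pi/6$.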
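The method applies to algebraic equations of the form
(1)  $f(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_n = 0$,
where the coefficients $a_0, a_1, \ldots, a_n$ are complex numbers and $a_0 \neq 0$. Each root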
is found by an iterative procedure. Successive iterations toward a particular root
are obtained by finding the nearer root of a quadratic whose curve passes through
the last three points. The quadratic will in general have complex coefficients and
complex roots. This solution is accomplished by a variation of the standard
quadratic formula. Although the method derived here is rather complicated, no
evaluation of derivatives of $f(x)$ and only one evaluation of the polynomial $f(x)$
is required per iteration. If the degree of the equation is large, a greater amount
of time is spent evaluating the function than is spent in the remainder of the
process. Thus, whenever the degree of the equation is large, the time spent per
iteration is less with this process than with iterative schemes which require the
calculation of derivatives.
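Because the evaluation of $f(x)$ dominates the cost of each iteration, it is worth seeing how cheap a single evaluation can be. The following is a minimal sketch using Horner's rule, which the text does not name; the function name `horner` and the convention that the coefficient list begins with the leading coefficient $a_0$ are assumptions made for illustration.

```python
def horner(coeffs, x):
    """Evaluate a_0 x^n + a_1 x^(n-1) + ... + a_n, with coeffs = [a_0, a_1, ..., a_n].

    Works for real or complex coefficients and arguments and uses
    n multiplications and n additions per evaluation.
    """
    value = 0j
    for a in coeffs:
        value = value * x + a
    return value
```

For example, `horner([1, -3, 2], 2)` evaluates $x^2 - 3x + 2$ at $x = 2$ and returns zero.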
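The Lagrange interpolation formula will yield a quadratic
(2)  $L_i(x) = b_0 x^2 + b_1 x + b_2$
whose curve passes through the last three points $(x_i, f(x_i))$, $(x_{i-1}, f(x_{i-1}))$,
$(x_{i-2}, f(x_{i-2}))$, where the coefficients $b_0$, $b_1$, $b_2$ satisfy the system of equations
(3)  $b_0 x_j^2 + b_1 x_j + b_2 = f(x_j), \quad j = i,\ i-1,\ i-2.$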
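With $h_i = x_i - x_{i-1}$, $\lambda_i = h_i/h_{i-1}$, $\delta_i = 1 + \lambda_i$, and
$g_i = f(x_{i-2})\lambda_i^2 - f(x_{i-1})\delta_i^2 + f(x_i)(\lambda_i + \delta_i)$,
the next point $x_{i+1} = x_i + \lambda_{i+1} h_i$ follows from
(5)  $\lambda_{i+1} = \dfrac{-2 f(x_i)\,\delta_i}{g_i \pm \sqrt{g_i^2 - 4 f(x_i)\,\delta_i \lambda_i\,[f(x_{i-2})\lambda_i - f(x_{i-1})\delta_i + f(x_i)]}}$.
The sign in the denominator is taken so that the denominator has the greater
magnitude. This gives $\lambda_{i+1}$ that one of the two possible choices having the
smaller magnitude, so that $x_{i+1}$ is the root which is closer to $x_i$.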
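A single iteration of this kind can be sketched as follows. The sketch assumes the usual step-ratio quantities $h_i = x_i - x_{i-1}$, $\lambda_i = h_i/h_{i-1}$, $\delta_i = 1 + \lambda_i$ and the sign rule that makes the denominator of (5) as large in magnitude as possible; the function name `muller_step` and its argument order are illustrative only.

```python
import cmath

def muller_step(f, x0, x1, x2):
    """One step of formula (5); x0, x1, x2 play the roles of x_{i-2}, x_{i-1}, x_i."""
    f0, f1, f2 = f(x0), f(x1), f(x2)
    h = x2 - x1                      # h_i
    lam = h / (x1 - x0)              # lambda_i = h_i / h_{i-1}
    delta = 1 + lam                  # delta_i
    g = f0 * lam**2 - f1 * delta**2 + f2 * (lam + delta)
    disc = g * g - 4 * f2 * delta * lam * (f0 * lam - f1 * delta + f2)
    root = cmath.sqrt(disc)          # complex square root, so complex roots can appear
    # Pick the sign that gives the denominator of larger magnitude,
    # i.e. the lambda_{i+1} of smaller magnitude.
    den = g + root if abs(g + root) >= abs(g - root) else g - root
    lam_next = -2 * f2 * delta / den
    return x2 + lam_next * h         # x_{i+1}
```

When $f$ is itself a quadratic the interpolating quadratic coincides with $f$, so one step lands on a root: `muller_step(lambda x: x*x - 2, 0.0, 1.0, 2.0)` returns a value equal to $\sqrt{2}$ up to rounding. The sketch does not guard against a zero denominator.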
A convenient starting method for this process uses artificial starting values
at $x_0 = -1$, $x_1 = 1$, and $x_2 = 0$:
Once a root has been found, it is eliminated from the equation by the division
(7)  $a_i' = a_i + r\,a_{i-1}'$,
where $a_i'$ is the new coefficient to replace $a_i$ and $r$ is the root which has just been
found. We make $a_{-1}' = 0$. Errors introduced by this process will be reduced if the
roots are eliminated in order of increasing magnitude. By always starting at the
point $x = 0$ one will tend to find roots in roughly this order.
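The removal of a root can be sketched as ordinary synthetic division. The sketch assumes the recurrence $a_i' = a_i + r\,a_{i-1}'$ with $a_{-1}' = 0$ and a coefficient list that starts with the leading coefficient $a_0$; the name `deflate` is illustrative.

```python
def deflate(coeffs, r):
    """Divide a_0 x^n + ... + a_n by (x - r) and return the n new coefficients.

    Implements a_i' = a_i + r * a_{i-1}' with a_{-1}' = 0. The last value of the
    recurrence is the remainder f(r); it is nearly zero when r is a root and is
    dropped here.
    """
    new_coeffs = []
    previous = 0                     # a_{-1}' = 0
    for a in coeffs[:-1]:
        previous = a + r * previous
        new_coeffs.append(previous)
    return new_coeffs
```

For example, `deflate([1, -6, 11, -6], 1)` removes the root $1$ from $(x-1)(x-2)(x-3)$ and returns `[1, -5, 6]`.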
No general proof of convergence in the large has been obtained for this
process, but convergence can be shown to occur whenever the process leads one
sufficiently close to a single or double root.
In order to facilitate the study of convergence let us assume that $x_{i+1} = 0$.
This loses no generality since a simple shift of origin is always possible within the
system. At the point $x_{i+1}$ we also have $L_i(x_{i+1}) = 0$, so that
(8)  $b_2 = 0$.
Now each of the functions appearing on the right hand sides of equations (3)
may be expanded about $x_{i+1} = 0$ as
(9)  $f(x_j) = \sum_{k=0}^{\infty} x_j^k\, f^{(k)}(0)/k!$
SOLVING ALGEBRAIC EQUATIONS BY AN AUTOMATIC COMPUTER 211
This system of equations may be solved by elimination provided $x_i$, $x_{i-1}$, and
$x_{i-2}$ are distinct, and we obtain $b_{20} = 1$, $b_{21} = b_{22} = 0$ for the first three solutions.
When $k \geq 3$ the solution becomes somewhat more difficult but may be carried
out by eliminating $b_{1k}$ between the first two equations to give
$b_{0k}\, x_i x_{i-1} - b_{2k} = x_i x_{i-1} \sum_{p=0}^{k-2} x_i^{p}\, x_{i-1}^{k-2-p}$.
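A similar elimination may be made between another pair and the result combined
with the above equations to give
$b_{2k} = x_i x_{i-1} x_{i-2} \sum x_i^{p_0}\, x_{i-1}^{p_1}\, x_{i-2}^{p_2}$,
where the summation is made over all non-negative integers $p_0$, $p_1$, $p_2$ with
$p_0 + p_1 + p_2 = k - 3$.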
Up to this point no approximations have been made and no limits have been
taken. Equation (13) expresses the same relationship contained in (5). We now
assume that the points $x_i$, $x_{i-1}$, $x_{i-2}$ lie in the neighborhood of a root $r$. Thus if we
let $\epsilon_{i+1} = x_{i+1} - r$, $\epsilon_i = x_i - r$, $\epsilon_{i-1} = x_{i-1} - r$ and $\epsilon_{i-2} = x_{i-2} - r$, the
magnitudes of the last three quantities are all assumed to be less than some upper
bound $\epsilon_m$
and we shall seek to justify this assumption later. If (14) and (15) are inserted
in (13) and the functions are expanded about r we obtain
(17)  $\epsilon_{i+1} = -\epsilon_i\, \epsilon_{i-1}\, \epsilon_{i-2}\, \dfrac{f'''(r)}{6 f'(r)} + O(\epsilon_m^4)$.
A solution $\epsilon_{i+1}$ to this equation does exist if $\epsilon_m$ is sufficiently small and will satisfy
(15). This solution will also satisfy $L_i(x_{i+1}) = 0$, and hence we are justified
in assumption (15) and hence (17) for at least one of the two $x_{i+1}$ for which
$L_i(x_{i+1}) = 0$ holds. We now wish to show that (17) holds for that $x_{i+1}$ which
is actually chosen by the process described in connection with equation (5). If
both $x_{i+1}$ satisfy (17), the proof need not be given. If, however, one does and one
does not, we must make some further analysis. It was pointed out that the process
chooses the point $x_{i+1}$ which is nearer to $x_i$. The point given by equation (17)
must satisfy
This must therefore also hold for the $x_{i+1}$ which is chosen by the process in (5),
and hence $|\epsilon_{i+1}| < 3\epsilon_m$ for this case. But $|\epsilon_{i+1}| < 3\epsilon_m$ is adequate to give (17) for
sufficiently small $\epsilon_m$, and we may therefore assume (17) for the $\epsilon_{i+1}$ obtained in
the process.
A general limiting formula for the $\epsilon_i$ in the neighborhood of a root may be
obtained from equation (17). If logarithms are taken on both sides we obtain
(19)  $\log \epsilon_{i+1} = \log \epsilon_i + \log \epsilon_{i-1} + \log \epsilon_{i-2} + \log\left(-\dfrac{f'''(r)}{6 f'(r)}\right) + O(\epsilon_m)$.
Neglecting the terms $O(\epsilon_m)$ we may solve (19) as a difference equation using
standard techniques and obtain
(20)  $\log \epsilon_i = c_1 m_1^i + c_2 m_2^i + c_3 m_3^i - \tfrac{1}{2} \log\left(-\dfrac{f'''(r)}{6 f'(r)}\right)$,
where the constants $c_1$, $c_2$, and $c_3$ are determined by the starting values and the
three orders of convergence $m_1$, $m_2$, and $m_3$ are roots of the characteristic equation
(21)  $m^3 = m^2 + m + 1$.
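The three roots of the characteristic equation can be checked numerically. The sketch below is only an illustration: it finds the real root by bisection on $[1, 2]$ and obtains the complex pair from the quadratic factor left after dividing out that root. The helper name `real_root` and the choice of bisection are assumptions, not part of the original procedure.

```python
import cmath

def real_root(p, lo, hi, steps=200):
    """Bisection for a root of p on [lo, hi]; p(lo) and p(hi) must differ in sign."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if (p(lo) > 0) == (p(mid) > 0):
            lo = mid                 # the sign change lies in the upper half
        else:
            hi = mid                 # the sign change lies in the lower half
    return 0.5 * (lo + hi)

# Real root of m^3 - m^2 - m - 1 = 0.
m1 = real_root(lambda m: m**3 - m**2 - m - 1, 1.0, 2.0)

# m^3 - m^2 - m - 1 = (m - m1)(m^2 + b m + c) with b = m1 - 1 and c = 1/m1.
b = m1 - 1.0
c = 1.0 / m1
m2 = (-b + cmath.sqrt(b * b - 4 * c)) / 2
m3 = (-b - cmath.sqrt(b * b - 4 * c)) / 2
```

The real root comes out near 1.84 and the complex pair has magnitude near 0.74.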
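Since the last two roots have magnitude less than 1 their effect will die out and
the order of the process is given by $m_1 \approx 1.84$. After these approximations become
valid we have from (20)
$\epsilon_{i+1} = K \epsilon_i^{m_1}$,
where
$K = \left(-\dfrac{f'''(r)}{6 f'(r)}\right)^{(m_1 - 1)/2}$.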
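In the case of a double root a similar argument exists. Since $f'(r) = 0$ as well,
equation (17) is then replaced, to leading order, by
$\epsilon_{i+1}^2 = -\epsilon_i\, \epsilon_{i-1}\, \epsilon_{i-2}\, \dfrac{f'''(r)}{3 f''(r)}$,
and the characteristic equation (21) by
(24)  $2m^3 = m^2 + m + 1$.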
It has roots
$m_1 = 1.23$,
$m_2, m_3 = -.367 \pm .520i$,
and again the order of convergence is given by $m_1$. We therefore have in the limit
$\epsilon_{i+1} = K \epsilon_i^{m_1}$
with
$K = \left(-\dfrac{f'''(r)}{3 f''(r)}\right)^{m_1 - 1}$.
Convergence of the Generalized Process. One might imagine a generalized
process in which an $\alpha$-degree Lagrange interpolation polynomial $L_i(x, \alpha)$ is used
rather than the quadratic of equation (2). This presumes that some new method
for obtaining the nearest root of this polynomial is to be used. Since the direct
method corresponding to equation (5) would no longer be practical, one would
probably use some iterative method for solution of the equation $L_i(x_{i+1}, \alpha) = 0$.
We now wish to investigate the convergence rate for such a process.
A general set of equations corresponding to equations (11) may be formed.
They are
$\sum_{j=0}^{\alpha} b_{jk}\, x_{i-s}^{\alpha - j} = x_{i-s}^{k}, \quad s = 0, 1, \ldots, \alpha.$
When $k > \alpha$ the quantity $b_{\alpha k}$ may be obtained by elimination. (This direct method
for obtaining $b_{\alpha k}$ was pointed out to the author by Mr. W. Scott Bartky.) Let us
eliminate $b_{\alpha-1,k}$ between the first equation and each succeeding equation, giving
one reduced equation for each
$s = 1, 2, \ldots, \alpha$.
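We next eliminate $b_{\alpha-2,k}, \ldots, b_{0k}$ in a similar manner until the result
(26)  $b_{\alpha k} = (-1)^{\alpha}\, x_i x_{i-1} \cdots x_{i-\alpha} \sum x_i^{p_0}\, x_{i-1}^{p_1} \cdots x_{i-\alpha}^{p_\alpha}$
is obtained. In this expression the summation is made over all terms for which
the exponents $p_0, p_1, \ldots, p_\alpha$ are non-negative integers and $\sum_{j=0}^{\alpha} p_j = k - (\alpha + 1)$.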
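We also see directly from (26) that $b_{\alpha k} = 0$ if $k \leq \alpha$, except that $b_{\alpha 0} = 1$.
We may therefore obtain a generalization of equation (17),
$\epsilon_{i+1} = (-1)^{\alpha+1}\, \epsilon_i\, \epsilon_{i-1} \cdots \epsilon_{i-\alpha}\, \dfrac{f^{(\alpha+1)}(r)}{(\alpha+1)!\, f'(r)} + \cdots$,
whose characteristic equation, generalizing (21), is
$m^{\alpha+1} = m^{\alpha} + m^{\alpha-1} + \cdots + m + 1$.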
This equation has one root $m_1$ which lies between 1 and 2 on the real axis and
which approaches 2 with increasing $\alpha$. The remaining roots lie within the unit
circle and therefore represent perturbations which die out. The order of
convergence of the process to single roots is therefore given by $m_1$. Since this can never
reach 2 we conclude that there is little to be gained in speed of convergence by
letting $\alpha$ exceed 2.
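The diminishing return from raising $\alpha$ can be seen numerically. The sketch below assumes that the characteristic equation for an $\alpha$-degree interpolating polynomial is $m^{\alpha+1} = m^{\alpha} + \cdots + m + 1$, the natural generalization of (21), and locates its root between 1 and 2 by bisection. The function name `order_of_convergence` is illustrative.

```python
def order_of_convergence(alpha, steps=200):
    """Root in (1, 2) of m^(alpha+1) = m^alpha + ... + m + 1, found by bisection."""
    def p(m):
        return m ** (alpha + 1) - sum(m ** j for j in range(alpha + 1))

    lo, hi = 1.0, 2.0                # p(1) = -alpha < 0 and p(2) = 1 > 0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if p(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Orders for alpha = 1 (linear), 2 (quadratic), 3, 4, ...
orders = [order_of_convergence(alpha) for alpha in range(1, 7)]
```

The values rise from about 1.62 for $\alpha = 1$ to about 1.84 for $\alpha = 2$ and about 1.93 for $\alpha = 3$, then creep toward 2 without reaching it.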
One should not ignore the possibility of letting $\alpha = 1$. In this case the formula
corresponding to (5) is greatly simplified since a linear equation rather than a
quadratic now must be solved. This choice, however, suffers from a disadvantage
if all the coefficients of the original equation are real. If one starts from a real point
$x_0$ then all successive iterative results $x_i$ will also be real and hence only real roots
will be found.
Tests of the Method. The process with $\alpha = 2$ as outlined in the preceding
sections was altered slightly in practice. Whenever the new value of the function
$f(x_{i+1})$ is calculated, the quantity $|f(x_{i+1})| / |f(x_i)|$ is formed. If this quantity
exceeds 10, the quantity $\lambda_{i+1}$ is halved and $h_{i+1}$, $x_{i+1}$, and $f(x_{i+1})$ are recomputed
accordingly. With this revision the process has produced convergence in all the
cases tested.
Another alteration was made to handle the case in which the denominator
of (5) is zero. This occurs whenever $f(x_i) = f(x_{i-1}) = f(x_{i-2})$, and in such cases
the arbitrary value $\lambda_{i+1} = 1$ is chosen since (5) may no longer be used.
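The two alterations can be sketched together with the basic iteration. Several details below are assumptions made for illustration and are not taken from the text: the step-ratio quantities $h_i$, $\lambda_i = h_i/h_{i-1}$, $\delta_i = 1 + \lambda_i$ used in formula (5), the stopping tolerance, the repetition of the halving until the ratio no longer exceeds 10, and the use of true function values at the starting points $-1$, $1$, $0$ rather than artificial ones. The name `muller` is likewise illustrative.

```python
import cmath

def muller(f, x0=-1.0, x1=1.0, x2=0.0, tol=1e-12, max_iter=100):
    """Iterate formula (5) with the halving rule and the zero-denominator rule."""
    f0, f1, f2 = f(x0), f(x1), f(x2)
    for _ in range(max_iter):
        if abs(f2) <= tol:                   # x2 is a root to working accuracy
            return x2
        h = x2 - x1                          # h_i
        lam = h / (x1 - x0)                  # lambda_i
        delta = 1 + lam                      # delta_i
        g = f0 * lam**2 - f1 * delta**2 + f2 * (lam + delta)
        disc = g * g - 4 * f2 * delta * lam * (f0 * lam - f1 * delta + f2)
        root = cmath.sqrt(disc)
        den = g + root if abs(g + root) >= abs(g - root) else g - root
        # Second alteration: a zero denominator forces lambda_{i+1} = 1.
        lam_next = 1.0 if den == 0 else -2 * f2 * delta / den
        x3 = x2 + lam_next * h
        f3 = f(x3)
        # First alteration: halve lambda_{i+1} while |f(x_{i+1})| / |f(x_i)| > 10.
        for attempt in range(60):
            if abs(f3) <= 10 * abs(f2):
                break
            lam_next /= 2
            x3 = x2 + lam_next * h
            f3 = f(x3)
        if abs(x3 - x2) <= tol * max(1.0, abs(x3)):
            return x3
        x0, x1, x2 = x1, x2, x3
        f0, f1, f2 = f1, f2, f3
    return x2
```

Starting from real points, `muller(lambda x: x*x + 1)` returns a purely imaginary root, which shows how the complex square root lets the quadratic process leave the real axis.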
The process ($\alpha = 2$) was tested for equations of varying degree. Fourteen
equations were solved starting with degree 10 and progressing in steps of ten
through degree 140. Each equation was formed by choosing random points as
roots within the square having vertices $\pm 1 \pm i$. Polynomials were then formed
from these roots. The solutions to these polynomials were then compared with the
original random numbers which were used to generate the polynomial. Results
are summarized in the table.
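The construction of such a test can be sketched as follows. The random number generator, the seed, and the names `random_roots`, `poly_from_roots`, and `worst_error` are assumptions for illustration; the text does not say how the random points were drawn or how the comparison was carried out.

```python
import random

def random_roots(n, seed=0):
    """n points drawn uniformly from the square with vertices +-1 +- i."""
    rng = random.Random(seed)
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]

def poly_from_roots(roots):
    """Coefficients [a_0, ..., a_n], leading first, of the monic polynomial with these roots."""
    coeffs = [1 + 0j]
    for r in roots:
        nxt = coeffs + [0j]              # multiplication by x
        for k in range(len(coeffs)):
            nxt[k + 1] -= r * coeffs[k]  # minus r times the old polynomial
        coeffs = nxt
    return coeffs

def worst_error(found, true_roots):
    """Largest distance from a computed root to the nearest original root."""
    return max(min(abs(z - r) for r in true_roots) for z in found)
```

A solver's output for `poly_from_roots(random_roots(10))` can then be passed to `worst_error` together with the original points to obtain an accuracy figure of the kind tabulated.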
Degree of    Accuracy of Least    Accuracy of Last     Time Taken by Illiac for
Equation     Accurate Root        Root to be Found     Complete Solution in Minutes
   10        $10^{-7}$            $10^{-9}$              1
   20        $10^{-8}$            $10^{-8}$              2
   30        $10^{-5}$            $10^{-8}$              5
   40        $10^{-4}$            $10^{-9}$              6
   50        $10^{-4}$            $10^{-6}$             10
   60        $10^{-5}$            $10^{-7}$             12
   70        [illegible]          $10^{-5}$             17
   80        $10^{-4}$            $10^{-5}$             20
   90        $10^{-1}$            $10^{-6}$             20
  100        —                    $10^{-6}$             33
  110        —                    —                     42
  120        —                    —                     43
  130        —                    $10^{-8}$             48
  140        —                    —                     60
Dashes in the table indicate that some roots were too inaccurate to be identified.
In all equations some roots appeared which were correct to $10^{-8}$ or better.
The solutions to the 100th degree equation, which had some unidentifiable roots,
were used to generate a polynomial. All coefficients of this polynomial agreed with
the coefficients of the original polynomial to at least 6 decimal places. This result
indicates that the obtaining of accurate values of the roots of the equations of
higher degree was precluded by the limited accuracy of the coefficients and
independent of the method of solution.
The equation $x^{128} - 1 = 0$ was solved as an example of a special type of
equation whose solution could be easily checked. The maximum error occurring
in any root was of order $10^{-7}$ and the time for solution was 70 minutes.
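Such a check is easy because the exact roots are the 128th roots of unity, $e^{2\pi i k/128}$. A minimal sketch of the comparison follows; the function name `unity_error` is an assumption, and picking the nearest root by angle is valid only for computed values close to the unit circle.

```python
import cmath

def unity_error(z, n=128):
    """Distance from z to the n-th root of unity whose angle is closest to that of z."""
    k = round(cmath.phase(z) * n / (2 * cmath.pi))
    return abs(z - cmath.exp(2j * cmath.pi * k / n))
```

Taking the maximum of `unity_error` over all computed roots gives a figure comparable to the maximum error quoted above.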
No equation whose solution has been attempted has failed to yield convergence
although as indicated in the table, the solutions of equations of large degree may
be greatly in error. We conclude that convergence in the large does occur in most
practical cases in spite of the fact that convergence has only been proved for
single and double roots when the process has brought one to the neighborhood
of a root.
David E. Muller
University of Illinois
Urbana, Illinois
1. R. A. Brooker, "The solution of algebraic equations on the EDSAC," Cambridge Phil.
Soc., Proc., v. 48, 1952, p. 255-270.
2. Hans J. Maehly, "Zur Iterativen Auflösung Algebraischer Gleichungen," Z. Angew. Math.
Physik, v. 5, 1954, p. 260-263.