Solution of Nonlinear Equations
Yan-Bin Jia
One of the most frequently occurring problems in scientific work is to find the roots of an
equation of the form
f (x) = 0. (1)
The function f(x) may be given explicitly as, for example, a polynomial or a transcendental
function. Frequently, however, f(x) may be known only implicitly, in that only a rule for
evaluating it at any argument is known. In rare cases it may be possible to obtain the exact roots, as in
the case of a factorizable polynomial. In general, however, we can hope to obtain only approximate
values of the roots, relying on some computational techniques to produce the approximation. In
this lecture, we will introduce some elementary iterative methods for finding a root of equation (1),
in other words, a zero of f (x).
1 Bisection
Suppose the function f(x) is continuous over the interval [a0, b0] and satisfies f(a0)f(b0) ≤ 0. By
the intermediate value theorem, f has a root between a0 and b0. We halve the interval [a0, b0] while
still bracketing the root, and repeat.
for i = 0, 1, 2, . . ., until satisfied, do
    m ← (ai + bi)/2
    if f(ai)f(m) ≤ 0
        then ai+1 ← ai
             bi+1 ← m
        else ai+1 ← m
             bi+1 ← bi
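The loop above can be sketched in Python as follows; the function name `bisect`, the tolerance, and the interval-width stopping test are illustrative choices of our own, not part of the notes:

```python
def bisect(f, a, b, tol=1e-10, max_iter=200):
    """Bisection: assumes f is continuous on [a, b] with f(a)*f(b) <= 0."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("[a, b] does not bracket a root")
    for _ in range(max_iter):
        m = (a + b) / 2
        fm = f(m)
        if fa * fm <= 0:      # sign change in [a, m]: keep a, move b to m
            b, fb = m, fm
        else:                 # sign change in [m, b]: move a to m
            a, fa = m, fm
        if b - a <= tol:      # each step halves the bracket
            break
    return (a + b) / 2

root = bisect(lambda x: x**3 - x - 1, 1.0, 2.0)
```

Run here on f(x) = x³ − x − 1 over [1, 2], the successive midpoints 1.5, 1.25, 1.375, … are exactly those tabulated in Example 1 below.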
The first part of the idea is critical to many root-finding techniques, namely, to find an interval
that brackets a root of f . This can be difficult though in a number of situations shown in the next
figure.
(a) The interval straddles a singularity. In this case, bisection will converge to that singularity.
(b) Multiple roots are bracketed. Bisection will find only one while leaving an impression that
no other roots lie in the interval.
(c) A double root r, that is, f (r) = f ′ (r) = 0, is bracketed. Since f (a0 )f (b0 ) > 0, bisection will
not even be invoked.
Example 1. f(x) = x³ − x − 1
Since f is a cubic it has either one or three real zeros.¹ There is only one variation
in the signs of its coefficients. Thus f must have only a single zero instead of three.
This zero is initially bracketed by [1, 2]. The iteration results are given in the following
table:

 i   [ai, bi]                 f(ai)           f(bi)          (ai + bi)/2   f((ai + bi)/2)
 0   [1, 2]                   −1              5              1.5           0.875
 1   [1, 1.5]                 −1              0.875          1.25          −0.296875
 2   [1.25, 1.5]              −0.296875       0.875          1.375         0.224609
 3   [1.25, 1.375]            −0.296875       0.224609       1.3125        −0.0515136
 ..
14   [1.324707, 1.3247681]    −4.659 · 10⁻⁵   2.137 · 10⁻⁴   1.3247375     8.355 · 10⁻⁵
15   [1.324707, 1.3247375]    −4.659 · 10⁻⁵   8.355 · 10⁻⁵   1.3247223     1.848 · 10⁻⁵

(Figure: graph of f(x) = x³ − x − 1 over [−2, 2].)
In each step of the bisection method, the length of the bracketing interval is halved. Hence each
step produces one more binary digit, or bit, in the approximation to the root. Bisection can be
slow, but it is simple and robust. It is therefore sometimes used as a backup for more complicated
algorithms.
2 Regula Falsi
The method of regula falsi uses the idea that it often makes sense to assume that the
function is linear locally. Instead of using the midpoint of the bracketing interval to
select a new root estimate, use a weighted average:

    w = (f(bi)ai − f(ai)bi) / (f(bi) − f(ai)).    (2)

(Figure: the new estimate w lies between ai and bi.)
¹Root counting for polynomials will be introduced in an upcoming lecture.
Here f(bi) and f(ai) have opposite signs under bracketing. Note that w is just the weighted
average of ai and bi with weights |f(bi)| and |f(ai)|, that is,

    w = |f(bi)|/(|f(bi)| + |f(ai)|) · ai + |f(ai)|/(|f(bi)| + |f(ai)|) · bi.    (3)

If |f(bi)| is larger than |f(ai)|, then the new root estimate w is closer to ai than to bi,
as shown in the figure.
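As a quick numerical check that formulas (2) and (3) produce the same point when the endpoint values have opposite signs, take a0 = 1, b0 = 2 and f(x) = x³ − x − 1, so f(a0) = −1 and f(b0) = 5:

```python
a, b = 1.0, 2.0
fa, fb = -1.0, 5.0    # f(1) and f(2) for f(x) = x^3 - x - 1

w2 = (fb * a - fa * b) / (fb - fa)                  # formula (2)
w3 = (abs(fb) * a) / (abs(fb) + abs(fa)) \
     + (abs(fa) * b) / (abs(fb) + abs(fa))          # formula (3)

print(w2, w3)   # both equal 7/6: closer to a, since |f(b)| > |f(a)|
```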
Indeed, the weighted average w is the intersection of the x-axis with the line through the
points (ai, f(ai)) and (bi, f(bi)). Such a straight line is a secant to f(x).

3 Modified Regula Falsi

Plain regula falsi has a weakness: one endpoint of the bracket may be retained forever, so
the bracketing interval need not shrink to zero. The modified regula falsi counters this by
halving the function value stored at an endpoint whenever that endpoint is retained over two
successive iterations. The description of the algorithm is similar to that of bisection:
F ← f(a0)
G ← f(b0)
w0 ← a0
for i = 0, 1, 2, . . ., until satisfied, do
    wi+1 ← (Gai − F bi)/(G − F)
    if f(ai)f(wi+1) ≤ 0
        then ai+1 ← ai
             bi+1 ← wi+1
             G ← f(wi+1)
             if f(wi)f(wi+1) > 0
                 then F ← F/2
        else ai+1 ← wi+1
             bi+1 ← bi
             F ← f(wi+1)
             if f(wi)f(wi+1) > 0
                 then G ← G/2
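A Python transcription of this pseudocode (a sketch; the function name, tolerance, and residual-based stopping test are our own choices):

```python
def modified_regula_falsi(f, a, b, tol=1e-10, max_iter=100):
    """Modified regula falsi; assumes f(a)*f(b) <= 0.

    F and G store the (possibly halved) function values at a and b;
    fa is the true value f(a), used in the bracketing test; fw_prev
    is f(w_i), used to detect an endpoint retained twice in a row."""
    F, G = f(a), f(b)
    fa = F
    fw_prev = F                  # w_0 = a_0
    w = a
    for _ in range(max_iter):
        w = (G * a - F * b) / (G - F)
        fw = f(w)
        if fa * fw <= 0:         # root in [a, w]: b moves, a is retained
            b, G = w, fw
            if fw_prev * fw > 0:
                F = F / 2        # halve the value stored at the kept end a
        else:                    # root in [w, b]: a moves, b is retained
            a, F, fa = w, fw, fw
            if fw_prev * fw > 0:
                G = G / 2        # halve the value stored at the kept end b
        fw_prev = fw
        if abs(fw) <= tol:
            break
    return w

root = modified_regula_falsi(lambda x: x**3 - x - 1, 1.0, 2.0)
```

Halving the stored value at a retained endpoint is what forces the secant to eventually cross to the other side, so the bracket keeps shrinking.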
We run the modified regula falsi method on Example 1; the iteration converges in 6 steps,
slightly faster than the 15 steps taken by bisection.
Unlike bisection (which always halves the interval), root bracketing of the modified regula falsi
method may not give a small interval of convergence. In general, a numerical routine terminates
on one of the following conditions:
(a) xi+1 − xi is “small”;
(b) |f (xi )| is “small”;
(c) i is “large”.
One may wish to measure (a) and (b) as relative errors, say, respectively as
(a) |xi+1 − xi | ≤ XTOL · |xi |,
(b) |f (xi )| ≤ FTOL · F,
where XTOL and FTOL are some preset “tolerances” and F is an estimate of the magnitude of f .
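These tests might be combined as in the sketch below; the values of XTOL, FTOL, and MAXIT, and the scale estimate `fscale` (playing the role of F above), are illustrative choices of our own:

```python
XTOL, FTOL, MAXIT = 1e-8, 1e-12, 100

def satisfied(x_new, x_old, fx, fscale, i):
    """Termination test combining criteria (a), (b), and (c)."""
    return (abs(x_new - x_old) <= XTOL * abs(x_old)   # (a) relative step size
            or abs(fx) <= FTOL * fscale               # (b) relative residual
            or i >= MAXIT)                            # (c) iteration cap

print(satisfied(1.32472, 1.32471, 0.5, 1.0, 3))   # False: keep iterating
```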
4 Secant Method

Another very popular modification of regula falsi is the secant method. It retains the use
of secants throughout, but gives up the bracketing of the root. The method starts with two
estimates x0 and x1 and iterates as follows:

    xi+1 = (f(xi)xi−1 − f(xi−1)xi) / (f(xi) − f(xi−1)).    (4)

(Figure: successive secant iterates, labeled 0 through 3.)

The secant method locates quite rapidly a point at which |f(x)| is small, but gives no
general sense of how far away from a root of f(x) this point might be. Also, f(xi) and
f(xi−1) need not be of opposite sign, so the iteration formula (4) is prone to round-off
errors. In an extreme situation, we might even have f(xi) = f(xi−1), making the calculation
of xi+1 impossible. Although this does not cure the trouble, it is often better to calculate
xi+1 from the equivalent expression

    xi+1 = xi − f(xi) · (xi − xi−1) / (f(xi) − f(xi−1)),

in which xi+1 is obtained from xi by adding the “correction term”

    − f(xi) / ((f(xi) − f(xi−1)) / (xi − xi−1)).
Running the secant method on f(x) = x³ − x − 1 with x0 = 1 and x1 = 2 gives:

 i   xi          f(xi)
 0   1           −1
 1   2           5
 2   1.16666     −0.5787
 3   1.253112    −0.28536
 4   1.337206    0.05388
 5   1.32385     −0.003698
 6   1.3247079   −4.27 · 10⁻⁵
 7   1.3247179   3.458 · 10⁻⁸
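In Python, using the numerically preferred correction-term form (a sketch; the guard against f(xi) = f(xi−1) simply stops the iteration rather than curing the problem):

```python
def secant(f, x0, x1, tol=1e-12, max_iter=50):
    """Secant method: x_{i+1} = x_i - f(x_i)(x_i - x_{i-1})/(f(x_i) - f(x_{i-1}))."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:             # degenerate secant: cannot continue
            break
        x0, x1 = x1, x1 - f1 * (x1 - x0) / (f1 - f0)
        f0, f1 = f1, f(x1)
        if abs(f1) <= tol:
            break
    return x1

root = secant(lambda x: x**3 - x - 1, 1.0, 2.0)
```

Started from x0 = 1 and x1 = 2, the first computed iterate is x2 = 2 − 5 · (2 − 1)/(5 − (−1)) = 7/6 ≈ 1.16666, matching the table above.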
5 Newton’s Method

In the secant method, we can write

    xi+1 = xi − f(xi) / f[xi, xi−1],

where f[xi, xi−1] = (f(xi) − f(xi−1))/(xi − xi−1) is a first-order divided difference that
approximates the first derivative of f. Replacing the divided difference with the derivative
itself suggests the formula

    xi+1 = xi − f(xi) / f′(xi).
This is Newton’s method. Essentially, xi+1 is the abscissa of the point where the x-axis intersects
the line through (xi , f (xi )) with the slope f ′ (xi ). It requires the knowledge of the derivative f ′ .
Example 4. We now run Newton’s method to find the unique real root of f(x) = x³ − x − 1 using
f′(x) = 3x² − 1. The two runs below start from x0 = 1 and x0 = 2, respectively.

 i   xi          f(xi)              i   xi          f(xi)
 0   1           −1                 0   2           5
 1   1.5         0.875              1   1.54545     1.14576
 2   1.347826    0.10068            2   1.359615    0.1537
 3   1.325200    0.002058           3   1.325801    0.00462
 4   1.324718    9.2 · 10⁻⁷         4   1.324718    4.65 · 10⁻⁶
 5   1.324718    1.86 · 10⁻¹³       5   1.324718    4.7 · 10⁻¹²
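A minimal Python sketch of the iteration (our own naming and stopping test), reproducing the two runs of the table:

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton's method: x_{i+1} = x_i - f(x_i)/f'(x_i)."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) <= tol:
            break
        x = x - fx / fprime(x)
    return x

f = lambda x: x**3 - x - 1
fprime = lambda x: 3 * x**2 - 1

r1 = newton(f, fprime, 1.0)   # left run of the table
r2 = newton(f, fprime, 2.0)   # right run of the table
```

Both starting points reach the same root, 1.324718, in five steps, with the number of correct digits roughly doubling per step once close.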
Newton’s method converges if f is well-behaved and if the initial guess is near the root. Below
we look at an example where Newton’s method actually diverges.
Example 5. Let f(x) = arctan x. Then x = 0 is the solution of f(x) = 0. Newton’s iteration
is defined by

    xk+1 = xk − (1 + xk²) arctan xk.

If we choose x0 so that

    arctan |x0| > 2|x0| / (1 + x0²),

then the sequence {|xk|} diverges, that is, limk→∞ |xk| = ∞. The following diagram plots
arctan x and marks the two roots of the function arctan x − 2x/(1 + x²); starting at either
root, Newton’s iteration forever yields the other root. The chosen value x0 lies outside
the interval bounded by these two roots.
To see the divergence caused by the chosen x0, consider the function g(x) = x − (1 + x²) arctan x.
We have

    g′(x) = 1 − 1 − 2x arctan x
          = −2x arctan x
          < 0,    for all x ≠ 0.
Since g(0) = 0, the above implies that g(x) > 0 when x < 0 and g(x) < 0 when x > 0.
Subsequently,

    xk+1 · xk = g(xk) · xk < 0,

and, when xk ≠ 0,

    |xk+1| / |xk| = −xk+1 / xk
                  = (−xk + (1 + xk²) arctan xk) / xk
                  = (−|xk| + (1 + |xk|²) arctan |xk|) / |xk|.

Differentiating the function

    h(x) = (−x + (1 + x²) arctan x) / x

yields

    h′(x) = ((x − arctan x) + x² arctan x) / x².

Note that x > arctan x > 0 when x > 0. So h′(x) > 0 and h(x) increases monotonically. Also,
by the choice of x0 we have h(|x0|) = |x1|/|x0| > 1. Now we can easily show by induction
that |xk| ≥ |x0|, and hence

    |xk+1| / |xk| = h(|xk|) ≥ h(|x0|) > 1,    for all k.
Finally, we have

    |xk| = (|xk|/|xk−1|) · · · (|x1|/|x0|) · |x0| > (|x1|/|x0|)^k · |x0|.

Thus the sequence {|xk|} diverges.
(Figure: plot of arctan x; the two roots of arctan x = 2x/(1 + x²) are x ≈ ±1.3918.)
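The dichotomy is easy to observe numerically. In the sketch below, x0 = 2 lies outside the critical interval (|x0| > 1.3918…), so the iterates blow up while alternating in sign; x0 = 1 lies inside, and the iterates collapse to 0:

```python
import math

def step(x):
    """One Newton step for f(x) = arctan x."""
    return x - (1 + x * x) * math.atan(x)

bad, good = 2.0, 1.0          # outside vs. inside the critical interval
bad_mags, good_mags = [], []
for _ in range(5):
    bad, good = step(bad), step(good)
    bad_mags.append(abs(bad))   # grows without bound
    good_mags.append(abs(good)) # shrinks rapidly toward 0

print(bad_mags[-1] > 1e3, good_mags[-1] < 1e-8)   # True True
```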
From the above example, we see that convergence of Newton’s method is not guaranteed. For
instance, if f′ is near zero, the method can shoot off to infinity. Under what conditions is
the method guaranteed to converge? Suppose f is twice continuously differentiable on [a, b]
such that

i) f(a)f(b) < 0;

ii) f′(x) ≠ 0 for every x ∈ [a, b];

iii) f′′(x) is either non-negative everywhere on [a, b] or non-positive everywhere on [a, b];

iv) |f(a)/f′(a)| < b − a and |f(b)/f′(b)| < b − a.

Then Newton’s method converges to the unique root of f(x) in [a, b] for any initial guess x0 ∈ [a, b].
Conditions i) and ii) ensure that there is exactly one zero in [a, b]. Condition iii) implies
that f is either concave from above or concave from below. So conditions ii) and iii) together
ensure that f′ is monotone on [a, b]. Finally, condition iv) says that the tangent to the curve
of f at either endpoint intersects the x-axis within the interval [a, b]. In the example shown,
f′′(x) ≥ 0, f(a) < 0, and f(b) > 0. The true root is at ξ. Observe that x1 > ξ always holds,
and the iterates xk, k ≥ 1, decrease monotonically to ξ.

(Figure: Newton iterates x0, x1, x2 on a convex f, approaching the root ξ within [a, b].)
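For f(x) = x³ − x − 1 on [a, b] = [1, 2], the four conditions are easy to spot-check numerically (a grid check of our own devising, not a proof; it is consistent with the convergence observed in Example 4 from both x0 = 1 and x0 = 2):

```python
f   = lambda x: x**3 - x - 1
fp  = lambda x: 3 * x**2 - 1     # f'
fpp = lambda x: 6 * x            # f''

a, b = 1.0, 2.0
grid = [a + k * (b - a) / 100 for k in range(101)]

cond_i   = f(a) * f(b) < 0                                  # sign change
cond_ii  = all(fp(x) != 0 for x in grid)                    # f' nonzero
cond_iii = (all(fpp(x) >= 0 for x in grid)
            or all(fpp(x) <= 0 for x in grid))              # one-signed f''
cond_iv  = (abs(f(a) / fp(a)) < b - a
            and abs(f(b) / fp(b)) < b - a)                  # tangents land in [a, b]

print(cond_i, cond_ii, cond_iii, cond_iv)   # True True True True
```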