Newton's method
In numerical analysis, Newton's method (also known as the Newton–Raphson method), named after Isaac Newton and Joseph
Raphson, is perhaps the best known method for finding successively better approximations to the zeroes (or roots) of a real-
valued function. Newton's method can often converge remarkably quickly, especially if the iteration begins "sufficiently near" the desired
root. Just how near "sufficiently near" needs to be, and just how quickly "remarkably quickly" can be, depends on the problem; this is
discussed in detail below. Unfortunately, when iteration begins far from the desired root, Newton's method can easily lead an unwary user
astray with little warning. Thus, good implementations of the method embed it in a routine that also detects and perhaps overcomes possible convergence failures.
Given a function ƒ(x) and its derivative ƒ′(x), we begin with a first guess x0. Provided the function is reasonably well-behaved, a better
approximation x1 is

    x1 = x0 − ƒ(x0)/ƒ′(x0).

The process is repeated, xn+1 = xn − ƒ(xn)/ƒ′(xn), until a sufficiently accurate value is reached.
An important and somewhat surprising application is Newton–Raphson division, which can be used to quickly find the reciprocal of a number, using only multiplication and subtraction.
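As an illustrative sketch of that application (not part of the article's text): applying Newton's method to ƒ(x) = 1/x − D yields the iteration xn+1 = xn(2 − D·xn), which converges to 1/D using only multiplication and subtraction:

    # Sketch of Newton-Raphson division: approximate 1/D with no division.
    # Newton's method on f(x) = 1/x - D gives x_{n+1} = x_n * (2 - D * x_n);
    # it converges provided the initial guess lies in (0, 2/D).
    def reciprocal(D, x=0.1, steps=8):
        for _ in range(steps):
            x = x * (2 - D * x)     # multiplication and subtraction only
        return x

    print(reciprocal(7.0))          # ~0.14285714..., i.e. 1/7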
The algorithm is first in the class of Householder's methods, succeeded by Halley's method.
Contents
• 1 Description
• 2 Application to minimization and maximization problems
• 3 History
• 4 Practical considerations
• 5 Analysis
• 6 Examples
• 7 Counterexamples
• 8 Generalizations
• 9 See also
• 10 References
• 11 External links
Description
An illustration of one iteration of Newton's method (the function ƒ is shown in blue and the tangent line is in red). We see that xn+1 is a better
approximation than xn for the root x of the function ƒ.
The idea of the method is as follows: one starts with an initial guess which is reasonably close to the true root, then the
function is approximated by its tangent line (which can be computed using the tools of calculus), and one computes the x-
intercept of this tangent line (which is easily done with elementary algebra). This x-intercept will typically be a better
approximation to the function's root than the original guess, and the method can be iterated.
Suppose ƒ : [a, b] → R is a differentiable function defined on the interval [a, b] with values in the real numbers R. The
formula for converging on the root can be easily derived. Suppose we have some current approximation xn. Then we can
derive the formula for a better approximation, xn+1, by referring to the diagram on the right. We know from the definition of the
derivative at a given point that it is the slope of a tangent at that point. That is

    ƒ′(xn) = Δy/Δx = (ƒ(xn) − 0)/(xn − xn+1).

Here, ƒ′ denotes the derivative of the function ƒ. Then by simple algebra we can derive

    xn+1 = xn − ƒ(xn)/ƒ′(xn).
We start the process off with some arbitrary initial value x0. (The closer to the zero, the better. But, in the
absence of any intuition about where the zero might lie, a "guess and check" method might narrow the
possibilities to a reasonably small interval by appealing to the intermediate value theorem.) The method will
usually converge, provided this initial guess is close enough to the unknown zero, and that ƒ'(x0) ≠ 0.
Furthermore, for a zero of multiplicity 1, the convergence is at least quadratic (see rate of convergence) in
a neighbourhood of the zero, which intuitively means that the number of correct digits roughly at least
doubles in every step. More details can be found in the analysis section below.
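For illustration, the update rule can be implemented as follows (a minimal Python sketch; the function name newton and its parameters are illustrative, not from the article):

    # Minimal sketch of Newton's method: repeatedly replace x by the
    # x-intercept of the tangent line, x - f(x)/f'(x).
    def newton(f, fprime, x0, tol=1e-12, max_iter=50):
        x = x0
        for _ in range(max_iter):
            fpx = fprime(x)
            if fpx == 0:
                raise ZeroDivisionError("zero derivative at x = %r" % x)
            x_next = x - f(x) / fpx          # x-intercept of the tangent at x
            if abs(x_next - x) < tol:        # successive iterates agree
                return x_next
            x = x_next
        raise RuntimeError("no convergence within %d iterations" % max_iter)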
Application to minimization and maximization problems
Newton's method can also be used to find a minimum or maximum of a function. The derivative is zero at a
minimum or maximum, so minima and maxima can be found by applying Newton's method to the derivative.
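As a sketch of that idea (the quartic g below is an illustrative choice, not from the article), one applies the newton routine above to g′, supplying g″ as its derivative:

    # Minimum of g(x) = x**4 - 3*x**2 + 2: find a zero of g'(x) = 4*x**3 - 6*x,
    # whose own derivative is g''(x) = 12*x**2 - 6.
    g_prime  = lambda x: 4*x**3 - 6*x
    g_second = lambda x: 12*x**2 - 6
    print(newton(g_prime, g_second, x0=1.0))   # ~1.224745 = sqrt(3/2), a minimum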
History
Newton's method was described by Isaac Newton in De analysi per aequationes numero
terminorum infinitas (written in 1669, published in 1711 by William Jones) and in De metodis
fluxionum et serierum infinitarum (written in 1671, translated and published as Method of Fluxions in
1736 by John Colson). However, his description differs substantially from the modern description
given above: Newton applies the method only to polynomials. He does not compute the successive
approximations xn, but computes a sequence of polynomials and only at the end, he arrives at an
approximation for the root x. Finally, Newton views the method as purely algebraic and fails to notice
the connection with calculus. Isaac Newton probably derived his method from a similar but less
precise method by Vieta. The essence of Vieta's method can be found in the work of the Persian
mathematician Sharaf al-Din al-Tusi, while his successor Jamshīd al-Kāshī used a form of Newton's
method to solve x^P − N = 0 to find roots of N (Ypma 1995). A special case of Newton's method for
calculating square roots was known much earlier and is often called the Babylonian method.
Newton's method was used by 17th-century Japanese mathematician Seki Kōwa to solve single-variable equations, though the connection with calculus was missing.
Newton's method was first published in 1685 in A Treatise of Algebra both Historical and
Practical by John Wallis. In 1690, Joseph Raphson published a simplified description in Analysis
aequationum universalis. Raphson again viewed Newton's method purely as an algebraic method
and restricted its use to polynomials, but he describes the method in terms of the successive
approximations xn instead of the more complicated sequence of polynomials used by Newton.
Finally, in 1740, Thomas Simpson described Newton's method as an iterative method for solving
general nonlinear equations using fluxional calculus, essentially giving the description above. In the
same publication, Simpson also gives the generalization to systems of two equations and notes that
Newton's method can be used for solving optimization problems by setting the gradient to zero.
Arthur Cayley in 1879 in The Newton-Fourier imaginary problem was the first who noticed the
difficulties in generalizing Newton's method to complex roots of polynomials with degree greater
than 2 and complex initial values. This opened the way to the study of the theory of iterations of
rational functions.
Practical considerations
Newton's method is an extremely powerful technique—in general the convergence is quadratic: the
error is essentially squared at each step (which means that the number of accurate digits roughly
doubles in each step). However, there are some difficulties with the method.
1. Newton's method requires that the derivative be calculated directly. In most
practical problems, the function in question may be given by a long and complicated
formula, and hence an analytical expression for the derivative may not be easily
obtainable. In these situations, it may be appropriate to approximate the derivative by
using the slope of a line through two points on the function. In this case, the secant
method results. This has slightly slower convergence than Newton's method but does
not require the evaluation of derivatives.
2. If the initial value is too far from the true zero, Newton's method may fail to
converge. For this reason, Newton's method is often referred to as a local technique.
Most practical implementations of Newton's method put an upper limit on the number of
iterations and perhaps on the size of the iterates.
3. If the derivative of the function is not continuous, the method may fail to converge.
4. It is clear from the formula for Newton's method that it will fail in cases where the
derivative is zero. Similarly, when the derivative is close to zero, the tangent line is
nearly horizontal and hence may "shoot" wildly past the desired root.
5. If the root being sought has multiplicity greater than one, the convergence rate is
merely linear (errors reduced by a constant factor at each step) unless special steps are
taken. When there are two or more roots that are close together, it may take many
iterations before the iterates get close enough to one of them for the quadratic
convergence to be apparent.
Since the most serious of the problems above is the possibility of a failure of convergence, Press et
al. (1992) present a version of Newton's method that starts at the midpoint of an interval in which the
root is known to lie and stops the iteration if an iterate is generated that lies outside the interval.
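A bracketed variant along those lines might look like the following sketch (an interpretation of the hybrid idea, not the exact routine of Press et al.): the iteration starts at the midpoint of [a, b], and any Newton step that would leave the bracket is replaced by a bisection step:

    # Sketch: Newton's method safeguarded by a bracketing interval [a, b]
    # on which f changes sign; out-of-bracket steps fall back to bisection.
    def newton_bracketed(f, fprime, a, b, tol=1e-12, max_iter=100):
        fa = f(a)
        assert fa * f(b) < 0, "f(a) and f(b) must have opposite signs"
        x = 0.5 * (a + b)                       # start at the midpoint
        for _ in range(max_iter):
            fpx = fprime(x)
            bisect = (fpx == 0)
            if not bisect:
                x_new = x - f(x) / fpx          # ordinary Newton step
                bisect = not (a < x_new < b)    # reject steps outside [a, b]
            if bisect:
                x_new = 0.5 * (a + b)           # bisection fallback
            if f(x_new) * fa < 0:               # keep a sign change in [a, b]
                b = x_new
            else:
                a, fa = x_new, f(x_new)
            if not bisect and abs(x_new - x) < tol:
                return x_new                    # Newton steps have converged
            if abs(b - a) < tol:
                return 0.5 * (a + b)            # bracket is tiny: done
            x = x_new
        return x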
Developers of large scale computer systems involving root finding tend to prefer the secant
method over Newton's method because the use of a difference quotient in place of the derivative in
Newton's method implies that the additional code to compute the derivative need not be maintained.
In practice, the advantages of maintaining a smaller code base usually outweigh the superior
convergence characteristics of Newton's method.
Analysis
Suppose that the function ƒ has a zero at α, i.e., ƒ(α) = 0. If ƒ is continuously differentiable and its
derivative is nonzero at α, then there exists a neighborhood of α such that for all starting values x0
in that neighborhood, the sequence {xn} will converge to α.
If the function is continuously differentiable and its derivative is not 0 at α and it has a second
derivative at α, then the convergence is quadratic or faster. If the second derivative is not 0 at α, then
the convergence is merely quadratic.
If the derivative is 0 at α, then the convergence is usually only linear. Specifically, if ƒ is twice
continuously differentiable, ƒ′(α) = 0 and ƒ″(α) ≠ 0, then there exists a neighborhood of α such that
for all starting values x0 in that neighborhood, the sequence of iterates converges linearly,
with rate log10 2 (Süli & Mayers, Exercise 1.6). Alternatively, if ƒ′(α) = 0 and ƒ′(x) ≠ 0 for x ≠ α, x in
a neighborhood U of α, α being a zero of multiplicity r, then there exists a neighborhood of α such
that, for all starting values x0 in that neighborhood, the sequence of iterates
converges linearly.
In practice these results are local, and the neighborhood of convergence is not known a priori. But
there are also some results on global convergence: for instance, given a right neighborhood U+ of α,
if ƒ is twice differentiable in U+ and if ƒ′ ≠ 0, ƒ·ƒ″ > 0 in U+, then, for each x0 in U+, the sequence
xk is monotonically decreasing to α.
Examples
Square root of a number
Consider the problem of finding the square root of a number. There are many methods of computing
square roots, and Newton's method is one.

For example, if one wishes to find the square root of 612, this is equivalent to finding the solution to

    x² = 612.

The function to use in Newton's method is then

    ƒ(x) = x² − 612,

with derivative

    ƒ′(x) = 2x.

With an initial guess of 10, the sequence given by Newton's method is 10, 35.6, 26.39550562,
24.79063549, 24.73868829, 24.73863375, ...; with only a few iterations one can obtain a solution
accurate to many decimal places.
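Using the newton sketch from the description above (an illustrative routine, not library code), the computation reads:

    # Square root of 612 as the positive zero of f(x) = x**2 - 612.
    f      = lambda x: x*x - 612
    fprime = lambda x: 2*x
    print(newton(f, fprime, x0=10.0))   # ~24.7386337537668 = sqrt(612)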
Solution of cos(x) = x³
Consider the problem of finding the positive number x with cos(x) = x³. We can rephrase that as
finding the zero of ƒ(x) = cos(x) − x³. We have ƒ′(x) = −sin(x) − 3x². Since cos(x) ≤ 1 for all x and
x³ > 1 for x > 1, we know that the zero lies between 0 and 1. With a starting value of x0 = 0.5, the
iterates converge to x ≈ 0.865474033, and the number of correct digits roughly doubles from one
iterate to the next, illustrating the quadratic convergence.
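Printing the iterates makes the quadratic behaviour visible; a short sketch (re-implementing the loop so each step can be shown):

    import math

    # Positive solution of cos(x) = x**3, i.e. zero of f(x) = cos(x) - x**3.
    f      = lambda x: math.cos(x) - x**3
    fprime = lambda x: -math.sin(x) - 3*x**2
    x = 0.5
    for n in range(1, 7):
        x = x - f(x) / fprime(x)
        print(n, x)      # the correct digits roughly double once x is close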
Counterexamples
Bad starting points
In some cases the conditions on the function that are necessary for convergence are satisfied, but
the point chosen as the initial point is not in the interval where the method converges. In such cases
a different method, such as bisection, should be used to obtain a better estimate for the zero to use
as an initial point.

Iteration point is stationary
Consider the function ƒ(x) = 1 − x². It has a maximum at x = 0 and solutions of ƒ(x) = 0 at x = ±1. If
we start iterating from the stationary point x0 = 0 (where the derivative is zero), x1 will be
undefined: the tangent at that point is horizontal and never crosses the x-axis, so the iteration cannot
move toward the desired zero.

Starting point enters a cycle
For some functions, some starting points may enter an infinite cycle, preventing
convergence. Let ƒ(x) = x³ − 2x + 2 and take 0 as the starting point. The first iteration produces 1
and the second iteration returns to 0, so the sequence will alternate between the two without
converging to a root. In general, the behavior of the sequence can be very complex. (See Newton
fractal.)
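The two-step cycle is easy to verify numerically (an illustrative sketch):

    # Cycle: f(x) = x**3 - 2*x + 2 starting from x0 = 0 alternates 1, 0, 1, 0, ...
    f      = lambda x: x**3 - 2*x + 2
    fprime = lambda x: 3*x**2 - 2
    x = 0.0
    for _ in range(6):
        x = x - f(x) / fprime(x)
        print(x)         # never settles on the real root near -1.7693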
Derivative issues
If the function is not continuously differentiable in a neighborhood of the root, it is possible that
Newton's method will always diverge and fail, unless the solution is guessed on the first try.

Derivative does not exist at root
A simple example of a function where Newton's method diverges is the cube root, which is
continuous and infinitely differentiable, except for x = 0, where its derivative is undefined (this,
however, does not affect the algorithm, since it will never require the derivative if the solution is
already found):

    ƒ(x) = x^(1/3).

For any iteration point xn, the next iteration point will be

    xn+1 = xn − ƒ(xn)/ƒ′(xn) = xn − xn^(1/3) / ((1/3) xn^(−2/3)) = xn − 3xn = −2xn.

The algorithm overshoots the solution and lands on the other side of the y-axis, farther away than it
initially was; applying Newton's method actually doubles the distance from the solution at each
iteration. In fact, the iterations diverge to infinity for every ƒ(x) = |x|^α, where 0 < α < 1/2. In the
limiting case of α = 1/2 (square root), the iterations will oscillate indefinitely between the points x0
and −x0, so they do not converge in this case either.
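The doubling of the distance can be checked directly (a sketch; the helper cbrt handles negative arguments, since Python's ** does not take cube roots of negative floats):

    import math

    # Newton's method on f(x) = x**(1/3): each step maps x to -2*x.
    cbrt       = lambda x: math.copysign(abs(x) ** (1.0/3.0), x)  # real cube root
    cbrt_prime = lambda x: (1.0/3.0) * abs(x) ** (-2.0/3.0)
    x = 0.1
    for _ in range(5):
        x = x - cbrt(x) / cbrt_prime(x)
        print(x)    # 0.1 -> -0.2 -> 0.4 -> -0.8 -> ...: diverging from the root 0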
Discontinuous derivative
If the derivative is not continuous at the root, then convergence may fail to occur in any
neighborhood of the root. Consider the function

    ƒ(x) = x + x² sin(2/x)  for x ≠ 0,  ƒ(0) = 0.

Its derivative is

    ƒ′(x) = 1 + 2x sin(2/x) − 2 cos(2/x)  for x ≠ 0,  ƒ′(0) = 1.

Within any neighborhood of the root, this derivative keeps changing sign as x approaches 0 from the
right (or from the left) while ƒ(x) ≥ x − x² > 0 for 0 < x < 1.

So ƒ(x)/ƒ′(x) is unbounded near the root, and Newton's method will diverge almost everywhere in
any neighborhood of it, even though:
• the function is differentiable (and thus continuous) everywhere;
• the derivative at the root is nonzero;
• ƒ is infinitely differentiable except at the root; and
• the derivative is bounded in a neighborhood of the root (unlike ƒ(x)/ƒ′(x)).
Non-quadratic convergence
In some cases the iterates converge but do not converge as quickly as promised. In these cases
simpler methods converge just as quickly as Newton's method.
Zero derivative
If the first derivative is zero at the root, then convergence will not be quadratic. Indeed, let

    ƒ(x) = x².

Then ƒ′(x) = 2x and xn+1 = xn − ƒ(xn)/ƒ′(xn) = xn/2: the error is merely halved at each step, so the
convergence is only linear. Similar problems occur even when the root is only "nearly" double. For
example, let

    ƒ(x) = x²(x − 1000) + 1.

Then the first few iterates starting at x0 = 1 are 1, 0.500250376, 0.251062828, 0.127507934,
0.067671976, 0.041224176, 0.032741218, 0.031642362; it takes six iterations to reach a point where
the convergence appears to be quadratic.
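The quoted sequence can be reproduced directly (an illustrative sketch):

    # Nearly-double root: f(x) = x**2 * (x - 1000) + 1 from x0 = 1.
    f      = lambda x: x*x*(x - 1000) + 1
    fprime = lambda x: 3*x*x - 2000*x
    x = 1.0
    for _ in range(8):
        x = x - f(x) / fprime(x)
        print(x)    # halves at first (0.50025..., 0.25106...), then speeds up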
No second derivative
If there is no second derivative at the root, then convergence may fail to be quadratic. Indeed, let

    ƒ(x) = x + x^(4/3).

Then

    ƒ′(x) = 1 + (4/3) x^(1/3),

and

    ƒ″(x) = (4/9) x^(−2/3),

except at x = 0, where it is undefined. Given xn,

    xn+1 = xn − ƒ(xn)/ƒ′(xn) = (1/3) xn^(4/3) / (1 + (4/3) xn^(1/3)),

which has approximately 4/3 as many bits of precision as xn has. This is less than the 2 times as
many which would be required for quadratic convergence, so the convergence of Newton's method in
this case is not quadratic, even though the function is continuously differentiable everywhere, the
derivative is not zero at the root, and ƒ is infinitely differentiable except at the desired root.
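The 4/3 rate can be observed numerically; since the root is 0, −log10(xn) counts the correct digits, and the ratio of successive counts tends toward 4/3 rather than the 2 of quadratic convergence (an illustrative sketch):

    import math

    # f(x) = x + x**(4/3) has its root at 0; track digits gained per step.
    f      = lambda x: x + x ** (4.0/3.0)
    fprime = lambda x: 1 + (4.0/3.0) * x ** (1.0/3.0)
    x, digits_prev = 1.0, None
    for _ in range(8):
        x = x - f(x) / fprime(x)
        digits = -math.log10(x)          # correct digits, since the root is 0
        if digits_prev is not None:
            print(digits / digits_prev)  # ratio drifts down toward 4/3, not 2
        digits_prev = digits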