
Numerical Methods and Computation

MTL 107

Harish Kumar
([email protected])

Dept. of Mathematics, IIT Delhi


Root finding Methods

Topics
▶ Zeros of nonlinear scalar equations
▶ Bisection algorithm
▶ Fixed point iteration
▶ Newton and other higher order methods
Lecture 4
Zeros of nonlinear scalar equations and Bisection Method
The problem
We want to find solutions of the scalar nonlinear equation

f(x) = 0,  with continuous f : [a, b] ⊂ ℝ → ℝ.

We denote a solution of the equation (called a root, or zero) by x∗.
In contrast to scalar linear equations,

ax − b = 0  (a ≠ 0)  =⇒  x∗ = b/a,

nonlinear equations have an undetermined number of zeros.
We denote by C[a, b] the set of all continuous functions on the interval [a, b]; so, above, we require f ∈ C[a, b].
Examples

1. f(x) = x − 1 on [a, b] = [0, 2].

2. f(x) = sin(x).
   On [a, b] = [π/2, 3π/2] there is one root x∗ = π.
   On [a, b] = [0, 4π] there are five roots, cf. Fig. on next page.

3. f(x) = x³ − 20x² + 2552 on [0, 20].

4. f(x) = 10 cosh(x/4) on −∞ < x < ∞, where cosh(t) = (eᵗ + e⁻ᵗ)/2.
Figure 1: Examples of Roots


Iterative methods for finding roots

▶ Roots that can be expressed analytically are known only for very few special nonlinear functions.
▶ Even for polynomials this holds only for very low degrees.
▶ We have to resort to iterative methods: starting with an initial guess/iterate x0, we generate a sequence of iterates x1, x2, ... that (hopefully) converges to a root of the function.
▶ A rough knowledge of the root's location is required (e.g., from a plot of f).
▶ One can probe the function and try to find two arguments a, b s.t. f(a)f(b) < 0.
  Intermediate Value Theorem =⇒ ∃ x∗ in the interval (a, b).
Stopping an iterative procedure

In general, an iterative procedure does not find the solution exactly but gets arbitrarily close. Various criteria are used to check (approximate) convergence: we terminate after n iterations if

|xn − xn−1| < atol,  and/or
|xn − xn−1| < rtol |xn|,  and/or
|f(xn)| < ftol,

where atol, rtol and ftol are user-defined constants.
Usually (but not always) the relative criterion is more robust than the absolute one.
A combination of the first two is

|xn − xn−1| < tol (1 + |xn|).
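As a minimal sketch in MATLAB (the names x, xprev and tol are illustrative, not fixed by the slides), the combined test might read:

converged = abs(x - xprev) < tol*(1 + abs(x));  % acts relatively for large |x|, absolutely for small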


Desirable algorithm properties

Iterative methods: starting with an initial iterate (guess) x0, generate a sequence of iterates x1, x2, ..., xk, ... that hopefully converges to a root x∗. Desirable properties:
▶ Efficient: requires a small number of function evaluations.
▶ Robust: fails rarely, if ever; announces failure when it does fail.
▶ Requires a minimal amount of additional information such as the derivative of f.
▶ Requires f to satisfy only minimal smoothness properties.
▶ Generalizes easily and naturally to many equations in many unknowns.
Bisection

▶ Method for finding a root of the scalar equation f(x) = 0 in an interval [a, b].
▶ Assumption: f(a)f(b) < 0.
▶ Since f is continuous, there must be a zero x∗ ∈ [a, b].
▶ Compute the midpoint m of the interval and check the value f(m).
▶ Depending on the sign of f(m), we can decide whether x∗ ∈ [a, m] or x∗ ∈ [m, b]
  (of course, if f(m) = 0 we are done).
▶ Repeat.
Code: bisection

function [x,fx] = bisect(f,a,b,tol)
% BISECT Root of f in [a,b] by bisection, assuming f(a)*f(b) < 0.
fa = f(a); if fa == 0, x = a; fx = fa; return; end
fb = f(b); if fb == 0, x = b; fx = fb; return; end
if fa*fb > 0, error('f(a)*f(b) > 0'), end
if nargin < 4, tol = 0; end        % tol = 0: bisect down to machine precision
x = (a+b)/2; fx = f(x);
while (b-a > tol) && (a < x) && (x < b)
    if fx == 0, break, end
    if fa*fx < 0                   % sign change in [a,x]: root is there
        b = x; fb = fx;
    else                           % otherwise the root is in [x,b]
        a = x; fa = fx;
    end
    x = (a+b)/2; fx = f(x);
end
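A usage sketch with the test equation used later in these slides, f(x) = xeˣ − 1 on [0, 1] (the tolerance is an illustrative choice):

f = @(x) x.*exp(x) - 1;
[x, fx] = bisect(f, 0, 1, 1e-10)   % x ≈ 0.567143290..., fx ≈ 0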
Bisection: Convergence

Error estimate: |xk − x∗| ≤ (b − a)/2^(k+1), with xk = midpoint of the k-th interval.

Bisection: Number of steps

Number of steps for desired accuracy: if we want the error to satisfy |xk − x∗| ≤ ϵ, then it suffices to have (b − a)/2^k ≤ ϵ, so that

k > log2((b − a)/ϵ) = log((b − a)/ϵ) / log 2.
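For example, to reduce an initial interval of length b − a = 1 to an error below ϵ = 10⁻⁸, we need k > log2(10⁸) ≈ 26.6, i.e. 27 bisection steps, regardless of how well-behaved f is.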
Properties of bisection

▶ Simple
▶ Safe, robust (even foolproof)
▶ Requires f to be only continuous
▶ Slow
▶ May converge to a different root than the one intended
▶ Does not generalize to systems of equations
Lecture 5
Fixed point Methods
Fixed point iteration

The methods discussed now have direct extensions to more complicated problems, e.g., to systems of nonlinear equations and to more general functional equations.

The problem f(x) = 0 can be rewritten as

x = g(x).   (∗)

(There are many ways to do this.)
Given (∗), we look for a fixed point, i.e., a point x∗ satisfying g(x∗) = x∗.
Algorithm: Fixed point iteration

Given a scalar function f(x), select a function g(x) such that

f(x) = 0 ⇐⇒ g(x) = x.

Then:
▶ Start from an initial guess x0.
▶ For k = 0, 1, 2, ... set

xk+1 = g(xk)

until xk+1 satisfies some termination criterion.
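A minimal MATLAB sketch of this algorithm (the function name fixedpoint and the termination test are illustrative choices, not from the slides):

function [x,k] = fixedpoint(g, x, tol, maxit)
% FIXEDPOINT Iterate x = g(x) until successive iterates agree to tol.
for k = 1:maxit
    xnew = g(x);
    if abs(xnew - x) < tol*(1 + abs(xnew)), x = xnew; return; end
    x = xnew;
end
% falls through after maxit steps without meeting the criterion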


Examples of fixed point iterations

Note: there are many ways to transform f(x) = 0 into fixed point form! Not all of them are "good" in terms of convergence.
Options for fixed point iterations for

f(x) = xeˣ − 1,  x ∈ [0, 1].

Different fixed point forms:

g1(x) = e⁻ˣ,
g2(x) = (1 + x)/(1 + eˣ),
g3(x) = x + 1 − xeˣ.
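The iterates tabulated below can be reproduced with a few anonymous functions (a usage sketch; x0 = 0.5 as in the table):

g1 = @(x) exp(-x);
g2 = @(x) (1 + x)./(1 + exp(x));
g3 = @(x) x + 1 - x.*exp(x);
x = 0.5;
for k = 1:10, x = g1(x); disp(x), end   % likewise for g2 and g3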
Examples of fixed point iterations (cont.)
k xk+1 := g1 (xk ) xk+1 := g2 (xk ) xk+1 := g3 (xk )
0 0.500000000000000 0.500000000000000 0.500000000000000
1 0.606530659712633 0.566311003197218 0.675639364649936
2 0.545239211892605 0.567143165034862 0.347812678511202
3 0.579703094878068 0.567143290409781 0.855321409174107
4 0.560064627938902 0.567143290409784 -0.156505955383169
5 0.571172148977215 0.567143290409784 0.977326422747719
6 0.564862946980323 0.567143290409784 -0.619764251895580
7 0.568438047570066 0.567143290409784 0.713713087416146
8 0.566409452746921 0.567143290409784 0.256626649129847
9 0.567559634262242 0.567143290409784 0.924920676910549
10 0.566907212935471 0.567143290409784 -0.407422405542253

Figure 2: Result of different g


Examples of fixed point iterations (cont.)

k  |xk − x∗| (g1)       |xk − x∗| (g2)       |xk − x∗| (g3)
0 0.067143290409784 0.067143290409784 0.067143290409784
1 0.039387369302849 0.000832287212566 0.108496074240152
2 0.021904078517179 0.000000125374922 0.219330611898582
3 0.012559804468284 0.000000000000003 0.288178118764323
4 0.007078662470882 0.000000000000000 0.723649245792953
5 0.004028858567431 0.000000000000000 0.410183132337935
6 0.002280343429460 0.000000000000000 1.186907542305364
7 0.001294757160282 0.000000000000000 0.146569797006362
8 0.000733837662863 0.000000000000000 0.310516641279937
9 0.000416343852458 0.000000000000000 0.357777386500765
10 0.000236077474313 0.000000000000000 0.974565695952037

Figure 3: Error with different g



Some questions regarding the fixed point iteration

Suppose that we have somehow determined a continuous function g ∈ C[a, b]. Now consider the fixed point iteration xk+1 = g(xk). Obvious questions arise:
▶ Is there a fixed point x∗ in [a, b]?
▶ If yes, is it unique?
▶ Does the sequence of iterates converge to a root x∗?
▶ If yes, how fast?
▶ If not, does this mean that no root exists?
Fixed point theorem

If g ∈ C[a, b] and a ≤ g(x) ≤ b for all x ∈ [a, b], then there is a fixed point x∗ in the interval [a, b].

If, in addition, the derivative g′ exists and there is a constant ρ < 1 such that

|g′(x)| ≤ ρ   ∀x ∈ (a, b),

then the fixed point x∗ is unique in this interval.
Convergence of the fixed point iteration

Does the sequence of iterates converge to a root x∗? By the mean value theorem,

|xk+1 − x∗| = |g(xk) − g(x∗)| = |g′(ξ)| · |xk − x∗| ≤ ρ |xk − x∗|

with ξ between xk and x∗. The iteration is a contraction if the factor ρ < 1. Thus

|xk+1 − x∗| ≤ ρ |xk − x∗| ≤ ρ² |xk−1 − x∗| ≤ ... ≤ ρ^(k+1) |x0 − x∗|.

Since ρ < 1, ρ^k → 0 as k → ∞, and hence xk → x∗.


Convergence of fixed point iterations in 1D

Vastly different behavior of the different fixed point iterations:
g1: linear convergence; g2: quadratic convergence; g3: no convergence.


Geometric interpretation of fixed point iteration

x0: start with x0 on the x-axis.
F(x0): go parallel to the y-axis to the graph of F ≡ g.
x1 = F(x0): move parallel to the x-axis to the graph of y = x.
F(x1): go parallel to the y-axis to the graph of F.
etc.

Figure 4: Geometric interpretation of fixed point iterations


Geometric interpretation: different cases

Left: at least linear convergence; right: divergence.
Note: these are local scenarios.

Figure 5: Convergence and divergence
Rate of convergence

Let x∗ be a fixed point of the iteration xk+1 = g(xk) and ρ = |g′(x∗)| with 0 < ρ < 1.
For x0 sufficiently close to x∗ we have xk − x∗ ≈ g′(x∗)(xk−1 − x∗), so

|xk − x∗| ≈ ρ |xk−1 − x∗| ≈ ... ≈ ρ^k |x0 − x∗|.

Definition: The rate of convergence is defined as rate = −log10 ρ.

The smaller ρ, the higher the rate (and the faster the convergence).

About k = ⌈1/rate⌉ iteration steps are needed to reduce the error by one order of magnitude.
Rate of convergence (numerical example)

Consider the rates of convergence for xk+1 = gi(xk) in the previous scenarios for solving f(x) = xeˣ − 1, x ∈ [0, 1].

The fixed point is x∗ = 0.5671432904...

g1′(x∗) = −e^(−x∗) = −g1(x∗) = −x∗ = −0.5671432904;
0 < |g1′(x∗)| < 1 → convergence.

g3′(x∗) = 1 − x∗ e^(x∗) − e^(x∗) = −1.76... → divergence.

g2′(x∗) = (1 − x∗ e^(x∗))/(e^(x∗) + 1)² = −f(x∗)/(e^(x∗) + 1)² = 0,
g2′′(x∗) ≠ 0 → quadratic convergence (see next time: order of convergence).
Termination criteria

Residual based termination: STOP the convergent iteration {xk} if

|f(xk)| ≤ τ,

where τ > 0 is a prescribed tolerance.
=⇒ no guaranteed accuracy: if f is flat near the root, |f(xk)| small does not imply |xk − x∗| small; if f is steep, |f(xk)| small does imply |xk − x∗| small.

Figure 6: Small residual vs. small error, for flat and steep f near the root
Termination criteria (cont.)

Correction based termination: STOP the convergent iteration {xk} if

|xk+1 − xk| ≤ τabs   or   |xk+1 − xk| ≤ τrel |xk|,   (0.1)

where τabs/τrel are prescribed absolute/relative tolerances > 0.

A posteriori termination criterion for a linearly convergent iteration with rate of convergence 0 < ρ < 1:

|xk+1 − x∗| ≤ ρ/(1 − ρ) · |xk+1 − xk|.
Termination criteria (cont.)

Proof. Idea: bound |xk+m − xk+1|, m > 1, independently of m and let m → ∞. First, since g is a contraction,

|xk+2 − xk+1| = |g(xk+1) − g(xk)| ≤ ρ |xk+1 − xk|.

Writing xk+m − xk+1 as the telescoping sum (xk+m − xk+m−1) + ... + (xk+2 − xk+1) and applying the triangle inequality,

|xk+m − xk+1| ≤ |xk+m − xk+m−1| + |xk+m−1 − xk+m−2| + ... + |xk+3 − xk+2| + |xk+2 − xk+1|
             ≤ (ρ^(m−1) + ρ^(m−2) + ... + ρ) |xk+1 − xk|
             = ρ (1 − ρ^(m−1))/(1 − ρ) · |xk+1 − xk| < ρ/(1 − ρ) · |xk+1 − xk|.

Letting m → ∞, xk+m → x∗, which yields the claimed bound.   (0.2)
Useful results from analysis

▶ Intermediate value theorem. If f ∈ C[a, b] and s is a value such that f(a) ≤ s ≤ f(b), then there exists a real number c ∈ [a, b] for which f(c) = s.
▶ Mean value theorem. If f ∈ C[a, b] and f is differentiable on the open interval (a, b), then there exists a real number c ∈ (a, b) for which f′(c) = (f(b) − f(a))/(b − a).
▶ Rolle's theorem. If f ∈ C[a, b] and f is differentiable on (a, b), and in addition f(a) = f(b) = 0, then there is a real number c ∈ (a, b) for which f′(c) = 0.
Lecture 6 & 7
Newton’s Method and Modifications
Newton iteration

▶ Let the function f, for which we seek a zero x∗, be differentiable.
▶ In practice, the derivative f′ should be cheap to compute (comparable with the computation of f itself).
▶ In Newton's method the function f is linearized at some approximate value xk ≈ x∗.
▶ Define the function t(x) that has at x = xk the same function value and derivative as f (the Taylor polynomial of degree one):

t(x) = f(xk) + f′(xk)(x − xk).

▶ The single zero of t(x) = 0 yields xk+1.


Newton iteration (cont.)

▶ The single zero of t(x) = 0 yields xk+1:

xk+1 = xk − f(xk)/f′(xk),   k = 0, 1, ...

▶ Newton's method is a fixed point iteration with iteration function

g(x) = x − f(x)/f′(x).

Clearly, g(x∗) = x∗.
▶ Since we have neglected only the second- and higher-order terms of the Taylor expansion of f at xk, we can expect t(x) to be a very good approximation of f(x) if xk ≈ x∗.
Newton's method: geometric interpretation

Figure 7: Geometric interpretation of Newton's method
Algorithm: Newton's iteration

Given a scalar differentiable function f(x):
▶ Start from an initial guess x0.
▶ For k = 0, 1, 2, ... set

xk+1 = xk − f(xk)/f′(xk)

until xk+1 satisfies some termination criterion.
Newton's method code:

function [x,it] = newton(f,df,x)
% NEWTON Newton iteration for the scalar equation f(x) = 0.
%   [x,it] = newton(f,df,x0); df is a handle for f'.
dx = f(x)/df(x); it = 0;
while abs(dx) > 1e-10              % stop when the Newton correction is tiny
    x = x - dx;
    dx = f(x)/df(x);
    it = it + 1;
end
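A usage sketch for the square-root example that follows (computing √2 from x0 = 2):

f  = @(x) x.^2 - 2;
df = @(x) 2*x;
[x, it] = newton(f, df, 2)   % x ≈ 1.41421356237..., cf. the table below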
Example (convergence of Newton's iteration)

Iteration for computing √a, a > 0:
▶ f(x) = x² − a, f′(x) = 2x.
▶ g(x) = x − (x² − a)/(2x) = (x² + a)/(2x) = (x + a/x)/2.
▶ g′(x) = 1/2 − a/(2x²), so g′(√a) = 0.
▶ g′′(x) = a/x³, so g′′(√a) = 1/√a.

xk+1 = (xk + a/xk)/2  =⇒  |xk+1 − √a| = |xk − √a|² / (2|xk|).
Example (convergence of Newton iteration (cont.))

Numerical experiment: iterates for a = 2:

k   xk                    ek := xk − √2          (log|ek|−log|ek−1|)/(log|ek−1|−log|ek−2|)
0   2.00000000000000000   0.58578643762690485
1   1.50000000000000000   0.08578643762690485
2   1.41666666666666652   0.00245310429357137    1.850
3   1.41421568627450966   0.00000212390141452    1.984
4   1.41421356237468987   0.00000000000159472    2.000
5   1.41421356237309492   0.00000000000000022    0.630

The number of significant digits in xk essentially doubles at each iteration. When roundoff level is reached, no meaningful improvement can be obtained any further; the improvement from the 4th to the 5th iteration (in this example) is minimal.
Example (choice of initial guess)

Consider f(x) = 2 cosh(x/4) − x. f has 2 roots:

x1∗ ≈ 2.35755106,   x2∗ ≈ 8.50719958.

The Newton iteration here is

xk+1 = xk − (2 cosh(xk/4) − xk) / (0.5 sinh(xk/4) − 1).

Iterate until |f(xk+1)| < 10⁻⁸.

▶ Starting from x0 = 2 requires 4 iterations to reach x1∗ to within the given tolerance.
▶ Starting from x0 = 4 requires 5 iterations to reach x1∗.
▶ Starting from x0 = 8 requires 5 iterations to reach x2∗.
▶ Starting from x0 = 10 requires 6 iterations to reach x2∗.

The values of f(xk), starting from x0 = 8, are:

k       0        1        2        3        4         5
f(xk)   4.76e-1  8.43e-2  1.56e-3  5.65e-7  7.28e-14  1.78e-15

The number of significant digits essentially doubles at each iteration. When roundoff level is reached, no meaningful improvement can be obtained by heaping on more floating point operations; the improvement from the 4th to the 5th iteration in this example is marginal.
Order of convergence

The method is said to be

▶ linearly convergent if there is a constant ρ < 1 such that

|xk+1 − x∗| ≤ ρ |xk − x∗|,   for k sufficiently large;

▶ quadratically convergent if there is a constant M such that

|xk+1 − x∗| ≤ M |xk − x∗|²,   for k sufficiently large;

▶ superlinearly convergent if there is a sequence of constants ρk → 0 such that

|xk+1 − x∗| ≤ ρk |xk − x∗|,   for k sufficiently large.

The quadratic case is superlinear with ρk = M |xk − x∗| → 0.


Convergence of Newton's method

Theorem (convergence of Newton's iteration). If f ∈ C²[a, b] has a root x∗ in [a, b] and f′(x∗) ≠ 0, then there exists δ > 0 such that, for any x0 in [x∗ − δ, x∗ + δ], Newton's method converges quadratically.

Proof:
(i) Since the iteration function g is continuously differentiable and g′(x∗) = 0, there is some neighborhood [x∗ − δ, x∗ + δ] of x∗ in which |g′| < 1; this gives convergence.
(ii) For the convergence order use the Taylor expansion of g at x∗:

x∗ − xk+1 = g(x∗) − g(xk)
          = g(x∗) − [g(x∗) + g′(x∗)(xk − x∗) + ½ g′′(ξ)(xk − x∗)²]
          = −½ g′′(ξ)(xk − x∗)²,   since g′(x∗) = 0.
Another example

The equation f(x) = x³ − 3x + 2 = (x + 2)(x − 1)² has two zeros: −2 and 1. The Newton iteration is

xk+1 = g(xk) = 2xk/3 + 2/(3(xk + 1)),

g′(x) = 2/3 − 2/(3(x + 1)²).

Note that g′(−2) = 0 (simple root), while g′(1) = 1/2 (double root), foreshadowing the next slide.
Newton iteration for multiple roots

Let x∗ be a multiple root of f: f(x) = (x − x∗)^m q(x) with q differentiable and q(x∗) ≠ 0 (we write q to avoid a clash with the iteration function g). Then

g(x) = x − f(x)/f′(x)
     = x − (x − x∗)^m q(x) / [m(x − x∗)^(m−1) q(x) + (x − x∗)^m q′(x)]
     = x − (x − x∗) q(x) / [m q(x) + (x − x∗) q′(x)],

g′(x) = 1 − [q(x) + (x − x∗) q′(x)] / [m q(x) + (x − x∗) q′(x)]
          + (x − x∗) q(x) [(m + 1) q′(x) + (x − x∗) q′′(x)] / [m q(x) + (x − x∗) q′(x)]²,

g′(x∗) = 1 − 1/m.   (0.3)
EXERCISE
Newton iteration for multiple roots (cont.)

Therefore, the Newton iteration converges only linearly to multiple roots. For large m the convergence is very slow.

For a double root (m = 2) the contraction factor is g′(x∗) = 1/2.

Remedy: extend the step length in accordance with the multiplicity of the zero of f(x) (i.e., take steps of length m times the Newton correction).

Note: often we do not know the multiplicity of a root.

Remark: one may instead apply Newton's method to the function f(x)/f′(x), which has only simple roots.
Simplified Newton iteration

xk+1 = xk − f(xk)/f′(x0),   k = 0, 1, ...

Linear convergence, with convergence factor

K := |g′(x∗)| = |1 − f′(x∗)/f′(x0)|.

The simplified Newton iteration can be very effective if x0 is a good approximation of x∗: then

f′(x0) ≈ f′(x∗),  so  K = |1 − f′(x∗)/f′(x0)| ≈ 0,

i.e., the convergence factor K is small.


Damped Newton

To avoid overshooting one can damp (shorten) the Newton step:

xk+1 = xk − λk f(xk)/f′(xk),   k = 0, 1, ...

where λk is chosen such that |f(xk+1)| < |f(xk)|:

dx = f(x)/df(x);
while abs(f(x - lambda*dx)) > abs(f(x))
    lambda = lambda/2;
end

Close to convergence we should let λk → 1 to recover the full step length and quadratic convergence. Hence, before each iteration step: λ = min(1, 2λ).
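Assembled into a routine, a hedged MATLAB sketch of damped Newton (the function name dampnewton, the tolerance handling and the safeguard on λ are illustrative choices, not from the slides):

function x = dampnewton(f, df, x, tol, maxit)
% DAMPNEWTON Newton iteration with step-length damping.
lambda = 1;
for k = 1:maxit
    dx = f(x)/df(x);
    lambda = min(1, 2*lambda);              % try to return to full steps
    while abs(f(x - lambda*dx)) > abs(f(x)) && lambda > eps
        lambda = lambda/2;                  % damp until |f| decreases
    end
    x = x - lambda*dx;
    if abs(lambda*dx) < tol, return; end    % correction-based stop
end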
Secant method

xk+1 = xk − f(xk) (xk − xk−1) / (f(xk) − f(xk−1)).

Notice that the secant method is obtained by approximating the derivative in Newton's method by a finite difference,

f′(xk) ≈ (f(xk) − f(xk−1)) / (xk − xk−1).

The secant method is not a fixed point method but a multi-point method. It can be interpreted as follows: the next iterate xk+1 is the zero of the degree-1 polynomial that interpolates f at xk and xk−1.

Convergence rate: 1.618, i.e. superlinear! No derivative needed.
Secant method (cont.)

function [x,i] = secant(x0,x1,f,tol,maxit)
% SECANT Secant method for the scalar equation f(x) = 0.
f0 = f(x0);
for i = 1:maxit
    f1 = f(x1);
    s = f1*(x1-x0)/(f1-f0);          % secant correction
    x0 = x1; x1 = x1 - s;
    if abs(s) < tol, x = x1; return; end
    f0 = f1;
end
x = NaN;                             % failed to converge in maxit steps
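A usage sketch on the running test equation:

f = @(x) x.*exp(x) - 1;
[x, it] = secant(0, 1, f, 1e-12, 100)   % x ≈ 0.567143290409784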
Inverse interpolation

Given a data set

(xi, yi = f(xi)),   i = 0, 1, ..., n.

In inverse interpolation we want to find a position x̄ such that, for a given ȳ, f(x̄) = ȳ.
If the given function f is monotone in the interval, then for each y there is only one x for which f(x) = y. In this situation it makes sense to interpolate the points (yi, xi = f⁻¹(yi)).

Here we are looking for x∗ such that

f(x∗) = 0 ⇐⇒ x∗ = f⁻¹(0).
Inverse linear interpolation

The secant method can be derived via linear interpolation: the polynomial that linearly interpolates (yk, f⁻¹(yk)) and (yk−1, f⁻¹(yk−1)) is

p(y) = f⁻¹(yk) (y − yk−1)/(yk − yk−1) + f⁻¹(yk−1) (y − yk)/(yk−1 − yk).

The value of this polynomial at y = 0 gives the approximation xk+1:

xk+1 = (−xk yk−1 + xk−1 yk) / (yk − yk−1)                       (0.4)
     = xk − yk (xk − xk−1)/(yk − yk−1),   where yk ≡ f(xk).     (0.5)
Inverse quadratic interpolation

xk+1 = xk · fk−2 fk−1 / ((fk − fk−2)(fk − fk−1))
     + xk−1 · fk−2 fk / ((fk−1 − fk−2)(fk−1 − fk))
     + xk−2 · fk−1 fk / ((fk−2 − fk−1)(fk−2 − fk)),

where fj ≡ f(xj). This is the quadratic polynomial interpolating (fj, xj), j = k−2, k−1, k, evaluated at y = 0.

Code: see Moler, Numerical Computing with MATLAB, SIAM, 2004.

Convergence rate: 1.839. No derivatives needed!
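The Moler code itself is not reproduced here; as a minimal sketch, a single IQI step implementing the formula above could read (variable names are illustrative):

% xkm2, xkm1, xk: three most recent iterates; fkm2, fkm1, fk: their f-values
xnew = xk  *fkm2*fkm1/((fk  -fkm2)*(fk  -fkm1)) ...
     + xkm1*fkm2*fk  /((fkm1-fkm2)*(fkm1-fk  )) ...
     + xkm2*fkm1*fk  /((fkm2-fkm1)*(fkm2-fk  ));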
Example for IQI

Find the zero of f(x) = xeˣ − 1 = 0, with starting values x⁽⁰⁾ = 0, x⁽¹⁾ = 2.5, x⁽²⁾ = 5.

k   xk                 f(xk)               ek := xk − x∗        (log|ek+1|−log|ek|)/(log|ek|−log|ek−1|)
3   0.08520390058175   -0.90721814294134   -0.48193938982803
4   0.16009252622586   -0.81211229637354   -0.40705076418392    3.33791154378839
5   0.79879381816390    0.77560534067946    0.23165052775411    2.28740488912208
6   0.63094636752843    0.18579323999999    0.06380307711864    1.82494667289715
7   0.56107750991028   -0.01667806436181   -0.00606578049951    1.87323264214217
8   0.56706941033107   -0.00020413476766   -0.00007388007872    1.79832936980454
9   0.56714331707092    0.00000007367067    0.00000002666114    1.84841261527097
10  0.56714329040980    0.00000000000003    0.00000000000001

Figure 8: Errors
Zeroin (MATLAB's fzero)

Combines the reliability of bisection with the convergence speed of the secant method and inverse quadratic interpolation (IQI). Requires only function evaluations.
Outline:
▶ Start with a and b s.t. f(a)f(b) < 0.
▶ Use a secant step to get c between a and b.
▶ Repeat the following steps until |b − a| < ε|b| (ε a prescribed tolerance) or f(b) = 0:
▶ Arrange a, b and c so that
  ▶ f(a)f(b) < 0,
  ▶ |f(b)| ≤ |f(a)|,
  ▶ c is the previous value of b.
▶ If c ≠ a, consider an IQI step.
▶ If c = a, consider a secant step.
Zeroin (MATLAB's fzero) (cont.)

▶ If the IQI or secant step is in the interval [a, b], take it.
▶ If the step is not in the interval, use bisection.

This algorithm is foolproof: it never loses track of the zero trapped in a shrinking interval.
It uses rapidly convergent methods when they are reliable.
It uses a slow, but sure, method when necessary.
It uses only function values, no derivatives.
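A usage example with the built-in fzero on the running test equation, bracketing the root as in bisection:

x = fzero(@(x) x.*exp(x) - 1, [0 1])   % x ≈ 0.567143290409784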
Computing multiple zeros

If we have found a zero z of f(x) = 0 and want to compute another one, we want to avoid converging to the already found z again.
We can explicitly deflate the zero by defining a new function

f1(x) := f(x)/(x − z),

and apply the method of choice to f1. This can in particular be done with polynomials, but it is error prone if z is not accurate.
We can proceed similarly for multiple zeros z1, ..., zm.
Computing multiple zeros (cont.)

For the reciprocal Newton correction for f1 we get

f1′(x)/f1(x) = [f′(x)/(x − z) − f(x)/(x − z)²] / [f(x)/(x − z)] = f′(x)/f(x) − 1/(x − z).

A Newton correction then becomes

xk+1 = xk − 1 / ( f′(xk)/f(xk) − 1/(xk − z) ),

and similarly for multiple zeros z1, ..., zm.

The above procedure is called implicit deflation; f itself is not modified. In this way errors in z are not propagated to f1.
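A hedged sketch of one implicitly deflated Newton step (z is the zero already found; f and df are assumed handles for f and f′):

% Newton step on f1(x) = f(x)/(x - z) without ever forming f1
x = x - 1/( df(x)/f(x) - 1/(x - z) );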
Comparison of methods

Comparison of some methods for computing the smallest zero of

f(x) = cos(x) cosh(x) + 1 = 0:

method          start         steps   function evals
bisection       [0, 3]        32      34
secant method   [1.5, 3]      8       9
secant method   [0, 3]        15      16
Newton          x⁽⁰⁾ = 1.5    5       11
Brent           [0, 1.5, 3]   6       9

Notes: (1) Brent's method is similar to MATLAB's fzero. (2) These numbers depend on the function f!
Minimizing a function of one variable

▶ A major source of applications giving rise to root finding is optimization.
▶ One-variable version: find an argument x = x̂ that minimizes a given objective function ϕ(x).
▶ Example from earlier: find the minimum of the function

ϕ(x) = 10 cosh(x/4) − x

over the real line.
▶ Note: maximizing ψ(x) ⇐⇒ minimizing ϕ(x) = −ψ(x).
Conditions for a minimum point

Assume that ϕ ∈ C²[a, b]. Denote

f(x) = ϕ′(x).

An argument x∗ satisfying a < x∗ < b is called a critical point if

f(x∗) = 0.

For a parameter h small enough that x∗ + h ∈ [a, b] we can expand in a Taylor series about a critical point x∗ (where ϕ′(x∗) = 0):

ϕ(x∗ + h) = ϕ(x∗) + h ϕ′(x∗) + (h²/2) ϕ′′(x∗) + ...
          = ϕ(x∗) + (h²/2) [ϕ′′(x∗) + O(h)].
Conditions for a minimum point (cont.)

Since |h| can be taken arbitrarily small, it is now clear that at a critical point:
▶ If ϕ′′(x∗) > 0, then x̂ = x∗ is a local minimizer of ϕ(x): ϕ attains a minimum at x̂ = x∗ in some neighborhood of x∗.
▶ If ϕ′′(x∗) < 0, then x̂ = x∗ is a local maximizer of ϕ(x): ϕ attains a maximum at x̂ = x∗ in some neighborhood of x∗.
▶ If ϕ′′(x∗) = 0, then further investigation at x∗ is required.
Computation of minima of functions

If ϕ(x) attains a minimum (or maximum) at a point x̂, then this point must be critical, i.e. f(x̂) = ϕ′(x̂) = 0. We can apply any of the zero finders to find x̂.

Example. For the function ϕ(x) = 10 cosh(x/4) − x we have

ϕ′(x) = f(x) = (10/4) sinh(x/4) − 1,   ϕ′′(x) = f′(x) = (10/16) cosh(x/4).

Note that for quadratic convergence of Newton's method applied to f = ϕ′, ϕ(x) must have three continuous derivatives.

Note: The problem of finding all minima of a given function ϕ(x) can be solved by finding all the critical roots and then checking, for each, whether it is a minimum by examining the sign of the second derivative of ϕ.
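As a usage sketch, one can feed f = ϕ′ and f′ = ϕ′′ for the example above into the newton code from earlier (the starting guess x0 = 2 is an illustrative choice):

f  = @(x) (10/4)*sinh(x/4) - 1;    % phi'(x)
df = @(x) (10/16)*cosh(x/4);       % phi''(x)
[xhat, it] = newton(f, df, 2);     % critical point of phi
% phi''(xhat) > 0 here, so xhat is a local minimizer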
