Lecture 6 Handout

Summary: This handout covers numerical methods for solving nonlinear equations. It treats: 1) basics of nonlinear solvers, including root finding in one dimension and systems of nonlinear equations, for which iterative methods are needed since there are typically no closed-form solutions; 2) the bisection and Newton methods for root finding in one dimension, with bisection converging linearly and Newton's method quadratically; 3) convergence properties and analysis of Newton's method, showing its second-order convergence, with proofs of local and global convergence discussed.

Numerical Methods I

Solving Nonlinear Equations

Aleksandar Donev
Courant Institute, NYU
[email protected]
Course G63.2010.001 / G22.2420-001, Fall 2010

October 14th, 2010



Outline

1 Basics of Nonlinear Solvers

2 One Dimensional Root Finding

3 Systems of Non-Linear Equations

4 Intro to Unconstrained Optimization

5 Conclusions



Final Presentations

The final project writeup will be due Sunday Dec. 26th by midnight (I
have to start grading by 12/27 due to University deadlines).
You will also need to give a 15-minute presentation in front of me and
other students.
Our last class is officially scheduled for Tuesday 12/14, 5-7pm, and
the final exam for Thursday 12/23, 5-7pm. Neither of these is good!
By the end of next week, October 23rd, please let me know the
following:
Are you willing to present early, on Thursday December 16th, during the
usual class time?
Do you want to present during the officially scheduled final exam slot,
Thursday 12/23, 5-7pm?
If neither of the above, tell me when you cannot present between Monday
Dec. 20th and Thursday Dec. 23rd (finals week).



Basics of Nonlinear Solvers

Fundamentals

Simplest problem: Root finding in one dimension:

$$f(x) = 0 \quad \text{with} \quad x \in [a, b]$$

Or more generally, solving a square system of nonlinear equations

$$\mathbf{f}(\mathbf{x}) = \mathbf{0} \quad \Leftrightarrow \quad f_i(x_1, x_2, \dots, x_n) = 0 \;\text{ for } i = 1, \dots, n.$$

There can be no closed-form answer, so just as for eigenvalues, we
need iterative methods.
Most generally, starting from $m \ge 1$ initial guesses $x^0, x^1, \dots, x^m$,
iterate:

$$x^{k+1} = \phi\left(x^k, x^{k-1}, \dots, x^{k-m}\right).$$



Basics of Nonlinear Solvers

Order of convergence

Consider one dimensional root finding and let the actual root be $\alpha$, so
$f(\alpha) = 0$.
A sequence of iterates $x^k$ that converges to $\alpha$ has order of
convergence $p > 1$ if as $k \to \infty$

$$\frac{\left|x^{k+1} - \alpha\right|}{\left|x^k - \alpha\right|^p} = \frac{\left|e^{k+1}\right|}{\left|e^k\right|^p} \to C = \text{const},$$

where the constant $0 < C < 1$ is the convergence factor.


A method should at least converge linearly, that is, the error should
at least be reduced by a constant factor every iteration; for example,
the number of accurate digits increases by 1 every iteration.
A good method for root finding converges quadratically, that is, the
number of accurate digits doubles every iteration!
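
As an illustrative sketch (not in the original slides), the order $p$ can be estimated in MATLAB from a sequence of errors; the error values below are synthetic and assumed to roughly square at every step:

% synthetic errors that square each iteration (p = 2 behavior)
e = [1e-1, 1e-2, 1e-4, 1e-8, 1e-16];
% estimate p from consecutive error ratios
p = log(e(3:end)./e(2:end-1)) ./ log(e(2:end-1)./e(1:end-2))
% prints approximately [2 2 2]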



Basics of Nonlinear Solvers

Local vs. global convergence

A good initial guess is extremely important in nonlinear solvers!


Assume we are looking for a unique root $\alpha \in [a, b]$ starting with an
initial guess $a \le x^0 \le b$.
A method has local convergence if it converges to a given root $\alpha$ for
any initial guess that is sufficiently close to $\alpha$ (in the neighborhood
of a root).
A method has global convergence if it converges to the root for any
initial guess.
General rule: Global convergence requires a slower (careful) method
but is safer.
It is best to combine a global method to first find a good initial guess
close to $\alpha$ and then use a faster local method.



Basics of Nonlinear Solvers

Conditioning of root finding

$$f(\alpha + \delta\alpha) \approx f(\alpha) + f'(\alpha)\,\delta\alpha = \delta f$$

$$\kappa_{\text{abs}} = \frac{|\delta\alpha|}{|\delta f|} \approx \frac{1}{|f'(\alpha)|}.$$

The problem of finding a simple root is well-conditioned when $|f'(\alpha)|$
is far from zero.
Finding roots with multiplicity $m > 1$ is ill-conditioned:

$$f'(\alpha) = \dots = f^{(m-1)}(\alpha) = 0 \quad \Rightarrow \quad |\delta\alpha| \approx \left( \frac{|\delta f|}{\left|f^{(m)}(\alpha)\right|} \right)^{1/m}$$

Note that finding roots of algebraic equations (polynomials) is a
separate subject of its own that we skip.



One Dimensional Root Finding

The bisection and Newton algorithms



One Dimensional Root Finding

Bisection
The first step is to locate a root by searching for a sign change, i.e.,
finding $a^0$ and $b^0$ such that

$$f(a^0)\, f(b^0) < 0.$$

Then simply bisect the interval,

$$x^k = \frac{a^k + b^k}{2},$$

and choose the half in which the function changes sign by looking at
the sign of $f(x^k)$.
Observe that each step we need one function evaluation, $f(x^k)$, but
only the sign matters.
The convergence is essentially linear because

$$\left|x^{k+1} - \alpha\right| \le \frac{b - a}{2^{k+1}} \quad \Rightarrow \quad \frac{\left|x^{k+1} - \alpha\right|}{\left|x^k - \alpha\right|} \approx \frac{1}{2}.$$
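
For illustration (not in the original slides), a minimal MATLAB sketch of this bisection loop; the test function and bracket are assumed examples:

f = @(x) cos(x) - x;                 % assumed example, root near x = 0.739
a = 0; b = 1; fa = f(a);             % bracket with f(a)*f(b) < 0
for k = 1:60
    x = (a + b) / 2; fx = f(x);      % one new function evaluation per step
    if fa*fx < 0
        b = x;                       % sign change is in [a, x]
    else
        a = x; fa = fx;              % sign change is in [x, b]
    end
    if (b - a) < 1e-12, break; end   % interval is small enough
end
x                                    % approximate root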
One Dimensional Root Finding

Newton's Method

Bisection is a slow but sure method. It uses no information about the
value of the function or its derivatives.
Better convergence, of order $p = (1 + \sqrt{5})/2 \approx 1.63$ (the golden
ratio), can be achieved by using the value of the function at two
points, as in the secant method.
Achieving second-order convergence requires also evaluating the
function derivative.
Linearize the function around the current guess using a Taylor series:

$$f(x^{k+1}) \approx f(x^k) + (x^{k+1} - x^k)\, f'(x^k) = 0$$

$$x^{k+1} = x^k - \frac{f(x^k)}{f'(x^k)}$$
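
For illustration (not in the original slides), a minimal MATLAB sketch of the Newton iteration above, with an assumed example function and initial guess, using an increment-based stopping test (see the stopping criteria later):

f  = @(x) x.^3 - 2*x - 5;            % assumed example, root near x = 2.0946
fp = @(x) 3*x.^2 - 2;                % its derivative
x = 2;                               % initial guess, must be good enough
for k = 1:20
    dx = -f(x) / fp(x);              % Newton step
    x = x + dx;
    if abs(dx) < 1e-12, break; end   % stop when the increment is small
end
x, f(x)                              % approximate root and its residual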



One Dimensional Root Finding

Convergence of Newton's method


Taylor series with remainder:

$$f(\alpha) = 0 = f(x^k) + (\alpha - x^k)\, f'(x^k) + \frac{1}{2} (\alpha - x^k)^2 f''(\xi), \quad \text{for some } \xi \in [x^k, \alpha]$$

After dividing by $f'(x^k) \neq 0$ we get

$$\left[ x^k - \frac{f(x^k)}{f'(x^k)} \right] - \alpha = \frac{1}{2} (\alpha - x^k)^2 \,\frac{f''(\xi)}{f'(x^k)}$$

$$x^{k+1} - \alpha = e^{k+1} = \frac{1}{2} \left(e^k\right)^2 \frac{f''(\xi)}{f'(x^k)}$$

which shows second-order convergence

$$\frac{\left|x^{k+1} - \alpha\right|}{\left|x^k - \alpha\right|^2} = \frac{\left|e^{k+1}\right|}{\left|e^k\right|^2} = \left| \frac{f''(\xi)}{2 f'(x^k)} \right| \to \left| \frac{f''(\alpha)}{2 f'(\alpha)} \right|$$



One Dimensional Root Finding

Proof of Local Convergence

$$\frac{\left|x^{k+1} - \alpha\right|}{\left|x^k - \alpha\right|^2} = \left| \frac{f''(\xi)}{2 f'(x^k)} \right| \le M = \sup_{\alpha - |e^0| \,\le\, x, y \,\le\, \alpha + |e^0|} \left| \frac{f''(x)}{2 f'(y)} \right|$$

$$M \left|x^{k+1} - \alpha\right| = E^{k+1} \le \left( M \left|x^k - \alpha\right| \right)^2 = \left(E^k\right)^2$$

which will converge if $E^0 < 1$, i.e., if

$$\left|x^0 - \alpha\right| = \left|e^0\right| < M^{-1}$$

Newton's method thus always converges quadratically if we start
sufficiently close to a simple root.



One Dimensional Root Finding

Fixed-Point Iteration

Another way to devise iterative root finding is to rewrite $f(x) = 0$ in an
equivalent form

$$x = \phi(x)$$

Then we can use the fixed-point iteration

$$x^{k+1} = \phi(x^k)$$

whose fixed point (limit), if it converges, is the root $x^\star = \alpha$.

For example, recall from the first lecture solving $x^2 = c$ via the
Babylonian method for square roots,

$$x_{n+1} = \phi(x_n) = \frac{1}{2} \left( \frac{c}{x_n} + x_n \right),$$

which converges (quadratically) for any non-zero initial guess.
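
For illustration (not in the original slides), a short MATLAB sketch of this fixed-point iteration with an assumed value $c = 2$; the printed errors roughly square at each step:

c = 2; x = 1;                                   % any non-zero initial guess
for n = 1:6
    x = (c/x + x) / 2;                          % fixed-point map phi(x)
    fprintf('n=%d  x=%.16f  err=%.2e\n', n, x, abs(x - sqrt(c)));
end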



One Dimensional Root Finding

Convergence theory

It can be proven that the fixed-point iteration $x^{k+1} = \phi(x^k)$
converges if $\phi(x)$ is a contraction mapping:

$$\left|\phi'(x)\right| \le K < 1 \quad \forall x \in [a, b]$$

$$x^{k+1} - \alpha = \phi(x^k) - \phi(\alpha) = \phi'(\xi) \left(x^k - \alpha\right) \quad \text{by the mean value theorem}$$

$$\left|x^{k+1} - \alpha\right| < K \left|x^k - \alpha\right|$$

If $\phi'(\alpha) \neq 0$ near the root we have linear convergence,

$$\frac{\left|x^{k+1} - \alpha\right|}{\left|x^k - \alpha\right|} \to \left|\phi'(\alpha)\right|.$$

If $\phi'(\alpha) = 0$ we have second-order convergence if $\phi''(\alpha) \neq 0$, etc.



One Dimensional Root Finding

Applications of general convergence theory

Think of Newton's method

$$x^{k+1} = x^k - \frac{f(x^k)}{f'(x^k)}$$

as a fixed-point iteration method $x^{k+1} = \phi(x^k)$ with iteration
function:

$$\phi(x) = x - \frac{f(x)}{f'(x)}.$$

We can directly show quadratic convergence because (also see
homework)

$$\phi'(x) = \frac{f(x)\, f''(x)}{\left[f'(x)\right]^2} \quad \Rightarrow \quad \phi'(\alpha) = 0$$

$$\phi''(\alpha) = \frac{f''(\alpha)}{f'(\alpha)} \neq 0$$
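
As a quick numerical check (not in the original slides), $\phi'(\alpha) \approx 0$ can be verified by a centered finite difference in MATLAB, for an assumed example $f(x) = x^2 - 2$ with root $\alpha = \sqrt{2}$:

f   = @(x) x.^2 - 2;  fp = @(x) 2*x;     % assumed example function and derivative
phi = @(x) x - f(x)./fp(x);              % Newton iteration function
alpha = sqrt(2); h = 1e-6;
dphi = (phi(alpha + h) - phi(alpha - h)) / (2*h)   % centered difference, approximately 0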



One Dimensional Root Finding

Stopping Criteria
A good library function for root finding has to implement careful
termination criteria.
An obvious option is to terminate when the residual becomes small,

$$\left|f(x^k)\right| < \varepsilon,$$

which is only good for very well-conditioned problems, $|f'(\alpha)| \sim 1$.

Another option is to terminate when the increment becomes small,

$$\left|x^{k+1} - x^k\right| < \varepsilon.$$

For fixed-point iteration

$$x^{k+1} - x^k = e^{k+1} - e^k \approx -\left[1 - \phi'(\alpha)\right] e^k \quad \Rightarrow \quad \left|e^k\right| \approx \frac{\left|x^{k+1} - x^k\right|}{\left|1 - \phi'(\alpha)\right|},$$

so we see that the increment test works for rapidly converging
iterations ($\left|\phi'(\alpha)\right| \ll 1$).
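
For illustration (not in the original slides), a tiny MATLAB example of why the residual test alone can mislead for an ill-conditioned (multiple) root; the function is an assumed example with a double root at $\alpha = 1$:

f = @(x) (x - 1).^2;        % double root at alpha = 1 (multiplicity m = 2)
x = 1 + 1e-3;               % the error is still 1e-3
f(x)                        % but the residual is already 1e-6, so a residual
                            % tolerance of 1e-6 would stop far from the root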
One Dimensional Root Finding

In practice

A robust but fast algorithm for root finding would combine bisection
with Newton's method.
Specifically, a method like Newton's that can easily take huge steps in
the wrong direction and lead far from the current point must be
safeguarded by a method that ensures one does not leave the search
interval and that the zero is not missed.
Once $x^k$ is close to $\alpha$, the safeguard will not be used and quadratic or
faster convergence will be achieved.
Newton's method requires first-order derivatives, so often other
methods are preferred that require function evaluation only.
MATLAB's function fzero combines bisection, secant, and inverse
quadratic interpolation and is fail-safe.



One Dimensional Root Finding

Find zeros of $a \sin(x) + b \exp(-x^2/2)$ in MATLAB

% f = @mfile uses a function in an m-file

% Parameterized functions are created with:

a = 1; b = 2;
f = @(x) a*sin(x) + b*exp(-x.^2/2);  % Handle

figure(1)
ezplot(f, [-5, 5]); grid

x1 = fzero(f, [-2, 0])
[x2, f2] = fzero(f, 2.0)

x1 = -1.227430849357917
x2 = 3.155366415494801
f2 = 2.116362640691705e-16



One Dimensional Root Finding

Figure of f (x)

[Figure: ezplot of $f(x) = a \sin(x) + b \exp(-x^2/2)$ for $x \in [-5, 5]$.]



Systems of Non-Linear Equations

Multi-Variable Taylor Expansion


We are after solving a square system of nonlinear equations for
some variables $\mathbf{x}$:

$$\mathbf{f}(\mathbf{x}) = \mathbf{0} \quad \Leftrightarrow \quad f_i(x_1, x_2, \dots, x_n) = 0 \;\text{ for } i = 1, \dots, n.$$

It is convenient to focus on one of the equations, i.e., consider a
scalar function $f(\mathbf{x})$.
The usual Taylor series is replaced by

$$f(\mathbf{x} + \Delta\mathbf{x}) = f(\mathbf{x}) + \mathbf{g}^T (\Delta\mathbf{x}) + \frac{1}{2} (\Delta\mathbf{x})^T \mathbf{H} (\Delta\mathbf{x})$$

where the gradient vector is

$$\mathbf{g} = \nabla_{\mathbf{x}} f = \left[ \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \dots, \frac{\partial f}{\partial x_n} \right]^T,$$

and the Hessian matrix is

$$\mathbf{H} = \nabla^2_{\mathbf{x}} f = \left[ \frac{\partial^2 f}{\partial x_i \partial x_j} \right]_{ij}$$



Systems of Non-Linear Equations

Newton's Method for Systems of Equations


It is much harder if not impossible to do globally convergent methods
like bisection in higher dimensions!
A good initial guess is therefore a must when solving systems, and
Newton's method can be used to refine the guess.
The first-order Taylor series is

$$\mathbf{f}\left(\mathbf{x}^k + \Delta\mathbf{x}\right) \approx \mathbf{f}\left(\mathbf{x}^k\right) + \mathbf{J}\left(\mathbf{x}^k\right) \Delta\mathbf{x} = \mathbf{0}$$

where the Jacobian $\mathbf{J}$ has the gradients of $f_i(\mathbf{x})$ as rows:

$$\left[\mathbf{J}(\mathbf{x})\right]_{ij} = \frac{\partial f_i}{\partial x_j}$$

So taking a Newton step requires solving a linear system:

$$\mathbf{J}\left(\mathbf{x}^k\right) \Delta\mathbf{x} = -\mathbf{f}\left(\mathbf{x}^k\right) \quad \text{where we denote } \mathbf{J} \equiv \mathbf{J}\left(\mathbf{x}^k\right),$$

$$\mathbf{x}^{k+1} = \mathbf{x}^k + \Delta\mathbf{x} = \mathbf{x}^k - \mathbf{J}^{-1} \mathbf{f}\left(\mathbf{x}^k\right).$$
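
For illustration (not in the original slides), a minimal MATLAB sketch of this Newton loop for an assumed 2x2 example system $f_1 = x_1^2 + x_2^2 - 4$, $f_2 = x_1 x_2 - 1$:

f = @(x) [x(1)^2 + x(2)^2 - 4;  x(1)*x(2) - 1];   % assumed example system
J = @(x) [2*x(1), 2*x(2);  x(2), x(1)];           % Jacobian, rows = gradients of f_i
x = [2; 1];                                       % initial guess
for k = 1:20
    dx = -J(x) \ f(x);                            % solve the linear system J*dx = -f
    x = x + dx;
    if norm(dx) < 1e-12, break; end               % increment-based stopping test
end
x, f(x)                                           % approximate root and residual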




Systems of Non-Linear Equations

Convergence of Newton's method

Newton's method converges quadratically if started sufficiently close
to a root $\mathbf{x}^\star$ at which the Jacobian is not singular.

$$\mathbf{x}^{k+1} - \mathbf{x}^\star = \mathbf{e}^{k+1} = \mathbf{x}^k - \mathbf{J}^{-1} \mathbf{f}\left(\mathbf{x}^k\right) - \mathbf{x}^\star = \mathbf{e}^k - \mathbf{J}^{-1} \mathbf{f}\left(\mathbf{x}^k\right)$$

but using a second-order Taylor series

$$\mathbf{J}^{-1} \mathbf{f}\left(\mathbf{x}^k\right) \approx \mathbf{J}^{-1} \left[ \mathbf{f}(\mathbf{x}^\star) + \mathbf{J} \mathbf{e}^k + \frac{1}{2} \left(\mathbf{e}^k\right)^T \mathbf{H} \left(\mathbf{e}^k\right) \right] = \mathbf{e}^k + \frac{\mathbf{J}^{-1}}{2} \left(\mathbf{e}^k\right)^T \mathbf{H} \left(\mathbf{e}^k\right)$$

$$\left\| \mathbf{e}^{k+1} \right\| = \left\| \frac{\mathbf{J}^{-1}}{2} \left(\mathbf{e}^k\right)^T \mathbf{H} \left(\mathbf{e}^k\right) \right\| \le \frac{\left\| \mathbf{J}^{-1} \right\| \left\| \mathbf{H} \right\|}{2} \left\| \mathbf{e}^k \right\|^2$$

Fixed point iteration theory generalizes to multiple variables, e.g.,
replace $\left|\phi'(\alpha)\right| < 1$ with $\rho\left(\mathbf{J}_\phi(\mathbf{x}^\star)\right) < 1$ (the spectral radius).
Systems of Non-Linear Equations

Problems with Newton's method

Newton's method requires solving many linear systems, which can
become complicated when there are many variables.
It also requires computing a whole matrix of derivatives, which can
be expensive or hard to do (differentiation by hand?).
Newton's method converges fast if the Jacobian $\mathbf{J}(\mathbf{x}^\star)$ is
well-conditioned, otherwise it can blow up.
For large systems one can use so-called quasi-Newton methods:
Approximate the Jacobian with another matrix $\widetilde{\mathbf{J}}$ and solve
$\widetilde{\mathbf{J}}\, \Delta\mathbf{x} = -\mathbf{f}(\mathbf{x}^k)$.
Damp the step by a step length $\alpha_k \lesssim 1$,

$$\mathbf{x}^{k+1} = \mathbf{x}^k + \alpha_k \Delta\mathbf{x}.$$

Update $\widetilde{\mathbf{J}}$ by a simple update, e.g., a rank-1 update (recall homework
2).
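
For illustration (not in the original slides), a rough MATLAB sketch of a damped quasi-Newton iteration with a Broyden-style rank-1 update, reusing the assumed 2x2 example from above; the damping factor and tolerance are arbitrary choices:

f  = @(x) [x(1)^2 + x(2)^2 - 4;  x(1)*x(2) - 1];  % assumed example system
x  = [2; 1];  fx = f(x);
Jt = [2*x(1), 2*x(2);  x(2), x(1)];               % exact Jacobian once, at the initial guess
for k = 1:50
    dx = -Jt \ fx;                                % solve with the approximate Jacobian
    alpha_k = 1;                                  % step length; damp (alpha_k < 1) if needed
    s = alpha_k * dx;                             % the step actually taken
    x = x + s;  fnew = f(x);
    % Broyden rank-1 update: make the new Jt consistent with the observed change in f
    Jt = Jt + ((fnew - fx - Jt*s) * s') / (s'*s);
    fx = fnew;
    if norm(fx) < 1e-10, break; end
end
x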



Systems of Non-Linear Equations

In practice

It is much harder to construct general robust solvers in higher
dimensions, and some problem-specific knowledge is required.
There is no built-in function for solving nonlinear systems in
MATLAB, but the Optimization Toolbox has fsolve.
In many practical situations there is some continuity of the problem
so that a previous solution can be used as an initial guess.
For example, implicit methods for differential equations have a
time-dependent Jacobian $\mathbf{J}(t)$, and in many cases the solution $\mathbf{x}(t)$
evolves smoothly in time.
For large problems, specialized sparse-matrix solvers need to be used.
In many cases derivatives are not provided, but there are some
techniques for automatic differentiation.



Intro to Unconstrained Optimization

Formulation
Optimization problems are among the most important in engineering
and finance, e.g., minimizing production cost, maximizing profits,
etc.:

$$\min_{\mathbf{x} \in \mathbb{R}^n} f(\mathbf{x})$$

where $\mathbf{x}$ are some variable parameters and $f: \mathbb{R}^n \to \mathbb{R}$ is a scalar
objective function.
Observe that one only needs to consider minimization, since

$$\max_{\mathbf{x} \in \mathbb{R}^n} f(\mathbf{x}) = -\min_{\mathbf{x} \in \mathbb{R}^n} \left[ -f(\mathbf{x}) \right]$$

A local minimum $\mathbf{x}^\star$ is optimal in some neighborhood,

$$f(\mathbf{x}^\star) \le f(\mathbf{x}) \quad \forall \mathbf{x} \text{ s.t. } \left\| \mathbf{x} - \mathbf{x}^\star \right\| \le R \text{ for some } R > 0$$

(think of finding the bottom of a valley).
Finding the global minimum is generally not possible for arbitrary
functions (think of finding Mt. Everest without a satellite).
Intro to Unconstrained Optimization

Connection to nonlinear systems

Assume that the objective function is differentiable (i.e., the first-order
Taylor series converges, or the gradient exists).
Then a necessary condition for a local minimizer is that $\mathbf{x}^\star$ be a
critical point,

$$\mathbf{g}(\mathbf{x}^\star) = \nabla_{\mathbf{x}} f(\mathbf{x}^\star) = \left[ \frac{\partial f}{\partial x_i} (\mathbf{x}^\star) \right]_i = \mathbf{0},$$

which is a system of non-linear equations!

In fact similar methods, such as Newton or quasi-Newton, apply to
both problems.
Vice versa, observe that solving $\mathbf{f}(\mathbf{x}) = \mathbf{0}$ is equivalent to an
optimization problem

$$\min_{\mathbf{x}} \left[ \mathbf{f}(\mathbf{x})^T \mathbf{f}(\mathbf{x}) \right]$$

although this is only recommended under special circumstances.
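
For illustration (not in the original slides), a quick MATLAB sketch of this recasting, reusing the earlier assumed example $f(x) = a \sin(x) + b \exp(-x^2/2)$; note that this approach can get stuck in local minima of $f^2$ that are not roots, which is one reason it is only recommended under special circumstances:

a = 1; b = 2;
f = @(x) a*sin(x) + b*exp(-x.^2/2);        % same example as in the fzero slide
xroot = fminsearch(@(x) f(x).^2, -2)       % minimize the squared residual instead
f(xroot)                                   % the residual should be close to zero here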
Intro to Unconstrained Optimization

Sufficient Conditions

Assume now that the objective function is twice-differentiable (i.e., the
Hessian exists).
A critical point $\mathbf{x}^\star$ is a local minimum if the Hessian is positive
definite,

$$\mathbf{H}(\mathbf{x}^\star) = \nabla^2_{\mathbf{x}} f(\mathbf{x}^\star) \succ 0,$$

which means that the minimum really looks like a valley or a convex
bowl.
At any local minimum the Hessian is positive semi-definite,
$\nabla^2_{\mathbf{x}} f(\mathbf{x}^\star) \succeq 0$.
Methods that require Hessian information converge fast but are
expensive (next class).
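
For illustration (not in the original slides), a small MATLAB check of the sufficient condition at a critical point, using an assumed numerical Hessian:

H = [4 1; 1 3];          % assumed Hessian evaluated at x_star
lambda = eig(H)          % all eigenvalues > 0  =>  positive definite, a local minimum
[R, p] = chol(H);        % p == 0 is an alternative positive-definiteness test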



Intro to Unconstrained Optimization

Direct-Search Methods

A direct search method only requires f (x) to be continuous but


not necessarily differentiable, and requires only function evaluations.
Methods that do a search similar to that in bisection can be devised
in higher dimensions also, but they may fail to converge and are
usually slow.
The MATLAB function fminsearch uses the Nelder-Mead or
simplex-search method, which can be thought of as rolling a simplex
downhill to find the bottom of a valley. But there are many others
and this is an active research area.
Curse of dimensionality: As the number of variables
(dimensionality) $n$ becomes larger, direct search becomes hopeless
since the number of samples needed grows as $2^n$!



Intro to Unconstrained Optimization

Minimum of $100(x_2 - x_1^2)^2 + (a - x_1)^2$ in MATLAB

% Rosenbrock or banana function:
a = 1;
banana = @(x) 100*(x(2) - x(1)^2)^2 + (a - x(1))^2;

% This function must accept array arguments!
banana_xy = @(x1, x2) 100*(x2 - x1.^2).^2 + (a - x1).^2;

figure(1); ezsurf(banana_xy, [0, 2, 0, 2])

[x, y] = meshgrid(linspace(0, 2, 100));
figure(2); contourf(x, y, banana_xy(x, y), 100)

% Correct answers are x = [1, 1] and f(x) = 0
[x, fval] = fminsearch(banana, [-1.2, 1], optimset('TolX', 1e-8))
x = 0.999999999187814   0.999999998441919
fval = 1.099088951919573e-18



Intro to Unconstrained Optimization

Figure of Rosenbrock f (x)

[Figure: surface (ezsurf) and filled contour (contourf) plots of the Rosenbrock function $100(x_2 - x_1^2)^2 + (a - x_1)^2$ for $x_1, x_2 \in [0, 2]$.]



Conclusions

Conclusions/Summary

Root finding is well-conditioned for simple roots (unit multiplicity),
ill-conditioned otherwise.
Methods for solving nonlinear equations are always iterative and the
order of convergence matters: second order is usually good enough.
A good method uses a higher-order unsafe method such as Newton's
method near the root, but safeguards it with something like the
bisection method.
Newton's method is second-order but requires derivative/Jacobian
evaluation. In higher dimensions having a good initial guess for
Newton's method becomes very important.
Quasi-Newton methods can alleviate the complexity of solving the
Jacobian linear system.

