
Root Finding and Optimization

Root finding, solving equations, and optimization are very closely related subjects, which occur often in practical applications.

Root finding:      f(x) = 0,  solve for x

Equation solving:  f(x) = g(x),  or  f(x) − g(x) = 0

Optimization:      dg(x)/dx = 0,  i.e. root finding with  f(x) = dg(x)/dx

Start with one equation in one variable. Different methods have different strengths/weaknesses. However, all methods require a good starting value and some bounds on the possible values of the root(s).



Root Finding

Bisection

Need to find an interval where f(x) changes sign (implying a zero crossing); if no such interval can be found, the method cannot locate a root. Then divide the interval into the two halves

[a₀, a₀ + (b₀ − a₀)/2]  and  [a₀ + (b₀ − a₀)/2, b₀]

Find the half in which f(x) changes sign, and repeat until the interval is small enough.
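As a minimal sketch of this idea in the same Fortran style used later in this lecture (the function name func, its interface, and the bracket [a, b] are illustrative assumptions, not from the original slides):

* Minimal bisection sketch; func and the bracket [a,b] are
* illustrative assumptions, not from the original slides.
      real function bisect(a, b, accuracy)
      implicit none
      real a, b, accuracy, mid, func
      external func
      if (func(a)*func(b) .gt. 0.) stop 'no sign change in [a,b]'
10    continue
      mid = a + (b - a)/2.
      if (func(a)*func(mid) .le. 0.) then
* root (or exact zero) in the lower half
         b = mid
      else
* root in the upper half
         a = mid
      endif
      if ((b - a) .gt. accuracy) goto 10
      bisect = a + (b - a)/2.
      return
      end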



Root Finding
We will try this method on one of our old examples: planetary motion. Recall that for two masses m₁, m₂ we found:

1/r = (μGMm/L²)[1 − e cos(φ + φ₀)]        r = a(1 − e²) / (1 − e cos(φ + φ₀))

where r = r₁ − r₂ (the relative coordinate)

and the reduced mass  μ = m₁m₂/(m₁ + m₂)

a = (r_min + r_max)/2 = (L²/μGMm) · 1/(1 − e²)

b = (L²/μGMm) · 1/√(1 − e²)



2-body motion
Here, we are interested in plotting the orbit of the individual masses as a function of time, i.e., taking equal time steps. The relationship between the angle and the time is:

t = (T/2π)(ψ − e sin ψ)        where ψ is the angle from the center of the ellipse

The position of the individual masses is given by

x₁ = (m₂/(m₁+m₂)) a(cos ψ − e)            x₂ = −(m₁/(m₁+m₂)) a(cos ψ − e)

y₁ = (m₂/(m₁+m₂)) a√(1−e²) sin ψ          y₂ = −(m₁/(m₁+m₂)) a√(1−e²) sin ψ

We first have to solve for ψ(t), then for x, y. To solve for ψ(t), we need to find the root of

(ψ − e sin ψ) − 2πt/T = 0

for a given t.



2-body motion
(ψ − e sin ψ) − 2πt/T = 0

[Figure: f(ψ) vs. ψ, showing the zero crossing.]

Need a starting interval for the bisection algorithm: we note that the maximum and minimum of the sin ψ term are 1 and −1, so we can take as the starting range for ψ

ψ_a = 2πt/T − e        ψ_b = 2πt/T + e
2-body motion
Let's take some random values for the parameters:
T = 1, a = 1, e = 0.6, m₂ = 4m₁

with  tfunc = (ψ − e sin ψ) − 2πt/T :

      accuracy=1.D-6
* Define range in which we search
      angle1=2*3.1415926*t-e
      angle2=angle1+2.*e
      try1=tfunc(e,period,t,angle1)
      try2=tfunc(e,period,t,angle2)
      If (try1*try2.gt.0) then
        print *,' Cannot find root - bad start parameters'
        return
      Endif
* Now update until within accuracy
1     continue
      step=angle2-angle1
      angle2=angle1+step/2.
      try2=tfunc(e,period,t,angle2)
      If (try1*try2.lt.0.) goto 2    ! root in this interval
      If (try1.eq.0.) then           ! check for exact landing
        angle=angle1
        return
      Elseif (try2.eq.0.) then
        angle=angle2
        return
      Endif
* Root not in the lower half: move up to the upper half
      angle1=angle2
      try1=try2
      angle2=angle2+step/2.
      try2=tfunc(e,period,t,angle2)
      If (try1*try2.lt.0.) goto 2
      If (try1.eq.0.) then
        angle=angle1
        return
      Elseif (try2.eq.0.) then
        angle=angle2
        return
      Endif
2     continue
      If ((angle2-angle1).gt.accuracy) goto 1
      angle=angle1+(angle2-angle1)/2.

Note: accuracy ≈ (initial interval) × 2^(−# iterations).
For 10⁻⁶, 21 iterations are needed.
2-body motion

[Figure: the resulting orbits of the two masses, plotted at equal time steps.]


Regula Falsi

Similar to bisection, but use linear interpolation to speed up the convergence. Start with the interval [x₀, a₀] with function values f(x₀), f(a₀) such that f(x₀)·f(a₀) < 0. Use linear interpolation to guess the zero crossing:

p(x) = f(x₀) + [(f(a₀) − f(x₀))/(a₀ − x₀)](x − x₀)

p(x) = 0  gives  ξ₁ = [a₀ f(x₀) − x₀ f(a₀)] / [f(x₀) − f(a₀)]


Regula Falsi
Now calculate f(ξ₁):

If f(x₀)·f(ξ₁) < 0, choose the new interval [x₀, ξ₁];
else (f(x₀)·f(ξ₁) > 0), choose the new interval [ξ₁, a₀].
Iterate until the interval is sufficiently small.

With our previous example (2-body motion), this usually converges faster than the bisection method, but we find that we sometimes need many iterations to reach the accuracy of 10⁻⁶.

The problem occurs when either f(x₀) or f(a₀) is very close to zero: then ξ₁ is very close to x₀ or a₀, and convergence is extremely slow (or fails altogether because of machine rounding).

p(x) = 0  gives  ξ₁ = [a₀ f(x₀) − x₀ f(a₀)] / [f(x₀) − f(a₀)]
Regula Falsi

So we add the extra condition that the interval has to shrink by at least the level of accuracy we are trying to reach. The logic is:

If |a₀ − x₀| < accuracy:  converged
If |ξ₁ − x₀| < accuracy/2:  set ξ₁ = x₀ + accuracy/2
If |ξ₁ − a₀| < accuracy/2:  set ξ₁ = a₀ − accuracy/2

[Figure: iterations needed, bisection vs. modified regula falsi.]

Then continue with the usual
f(x₀)·f(ξ₁) < 0: choose new interval [x₀, ξ₁]
f(x₀)·f(ξ₁) > 0: choose new interval [ξ₁, a₀]
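A compact sketch of one possible implementation of the modified scheme (the function name func and the assumption x₀ < a₀ on input are illustrative, not from the slides):

* Sketch of modified regula falsi; func is an illustrative
* assumption, and we assume x0 < a0 on input.
      real function regfal(x0, a0, accuracy)
      implicit none
      real x0, a0, accuracy, xi, f0, fa, fxi, func
      external func
      f0 = func(x0)
      fa = func(a0)
10    continue
      if (abs(a0 - x0) .lt. accuracy) then
         regfal = x0 + (a0 - x0)/2.
         return
      endif
* linear interpolation for the zero crossing
      xi = (a0*f0 - x0*fa)/(f0 - fa)
* force the interval to shrink by at least accuracy/2
      if (abs(xi - x0) .lt. accuracy/2.) xi = x0 + accuracy/2.
      if (abs(xi - a0) .lt. accuracy/2.) xi = a0 - accuracy/2.
      fxi = func(xi)
      if (f0*fxi .lt. 0.) then
* new interval [x0, xi]
         a0 = xi
         fa = fxi
      else
* new interval [xi, a0]
         x0 = xi
         f0 = fxi
      endif
      goto 10
      end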



Newton-Raphson Method

Here we use the slope (or also 2nd derivative) at a guess position
to extrapolate to the zero crossing. This method is the most
powerful of the ones we consider today, since we can easily
generalize to many parameters and many equations. However, it
also has its drawbacks as we will see.

[Figure: tangent-line extrapolation to the zero crossing. What if this is our guess?]



Newton-Raphson Method
The algorithm (1st order) is:    xᵢ₊₁ = xᵢ − f(xᵢ)/f′(xᵢ)
* make a starting guess for the angle
      angle=2.*3.1415926*t
      try=tfunc(e,period,t,angle)
* Now update until angular change within accuracy
      Do Itry=1,40
        slope=tfuncp(e,period,t,angle)
        angle1=angle-try/slope
        try1=tfunc(e,period,t,angle1)
        If (abs(angle1-angle).lt.accuracy) goto 1
        angle=angle1
        try=try1
      Enddo
1     continue

Analytic derivatives are needed for the NR method:

tfunc(e,period,t,angle)  = (ψ − e sin ψ) − 2πt/T

tfuncp(e,period,t,angle) = 1 − e cos ψ
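The slides call tfunc and tfuncp without listing them; a minimal version consistent with the definitions above (argument names follow the calls in the code) might look like:

* Kepler equation shifted by 2*pi*t/T, and its derivative
      real function tfunc(e, period, t, angle)
      implicit none
      real e, period, t, angle
      tfunc = (angle - e*sin(angle)) - 2.*3.1415926*t/period
      return
      end

      real function tfuncp(e, period, t, angle)
      implicit none
      real e, period, t, angle
* period and t are kept only to match the tfunc interface
      tfuncp = 1. - e*cos(angle)
      return
      end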



Newton-Raphson Method

[Figure: iterations needed, modified regula falsi vs. Newton-Raphson.]

This is the fastest of the methods we have tried so far.

Note: if no analytic derivatives are available, calculate them numerically; this is the secant method.
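A sketch of the secant idea, where the analytic derivative is replaced by a finite difference built from the two most recent iterates (func and the two starting points are illustrative assumptions):

* Secant method sketch; func, x0, x1 are illustrative assumptions.
      real function secant(x0, x1, accuracy)
      implicit none
      integer itry
      real x0, x1, accuracy, x2, f0, f1, func
      external func
      f0 = func(x0)
      f1 = func(x1)
      x2 = x1
      do itry = 1, 40
* slope from the last two points replaces the analytic f'(x)
         x2 = x1 - f1*(x1 - x0)/(f1 - f0)
         if (abs(x2 - x1) .lt. accuracy) exit
         x0 = x1
         f0 = f1
         x1 = x2
         f1 = func(x2)
      enddo
      secant = x2
      return
      end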



Several Roots
Suppose we have a function which has several roots, e.g. the function we found a couple of lectures ago when dealing with eigenvalues (the principal-axis problem):

0 = −λ³ + 8λ²(a² + b²) − λ(16a⁴ + 39a²b² + 16b⁴) + 28a²b²(a² + b²)

It is important to analyze the problem and set the bounds correctly. For the NR method, if you are not sure where the root is, you need to try several different starting values and see what happens, as in the sketch below.
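As an illustration, a small program that scans starting values for the polynomial above with the (illustrative) choice a = b = 1, where it reduces to p(λ) = −λ³ + 16λ² − 71λ + 56 with roots λ = 1, 7, 8:

* Scan Newton-Raphson starting values for the eigenvalue
* polynomial with a = b = 1 (an illustrative choice):
*   p(x) = -x**3 + 16*x**2 - 71*x + 56,  roots x = 1, 7, 8
      program scanroots
      implicit none
      integer i, itry
      real x, p, pp
      do i = 0, 10
         x = real(i)
         do itry = 1, 40
            p  = -x**3 + 16.*x**2 - 71.*x + 56.
            pp = -3.*x**2 + 32.*x - 71.
            if (abs(p/pp) .lt. 1.e-6) exit
            x = x - p/pp
         enddo
         print *, 'start =', real(i), ' converged to x =', x
      enddo
      end

Different starting values land on different roots, and some wander far before converging, which is exactly why the bounds matter.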
Several variables
The Newton-Raphson method is easily generalized to several
functions of several variables:
f(x) = ( f₁(x₁,…,xₙ), …, fₙ(x₁,…,xₙ) )ᵗ = 0

We form the derivative matrix:

        ( ∂f₁/∂x₁ … ∂f₁/∂xₙ )
Df =    (    ⋮     ⋱    ⋮   )
        ( ∂fₙ/∂x₁ … ∂fₙ/∂xₙ )

If the matrix is not singular, we solve the SLE

0 = f(x⁽⁰⁾) + Df(x⁽⁰⁾)(x⁽¹⁾ − x⁽⁰⁾)

for x⁽¹⁾. The iteration is

x⁽ʳ⁺¹⁾ = x⁽ʳ⁾ − [Df(x⁽ʳ⁾)]⁻¹ f(x⁽ʳ⁾)
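A minimal two-variable sketch; the example system (the circle x² + y² = 4 intersected with the hyperbola xy = 1) and the starting point are illustrative assumptions, and the 2×2 SLE is solved directly by Cramer's rule:

* Two-variable Newton-Raphson sketch; the system and the
* starting point are illustrative assumptions.
      program newton2d
      implicit none
      integer itry
      real x, y, f1, f2, a11, a12, a21, a22, det, dx, dy
      x = 2.
      y = 1.
      do itry = 1, 40
         f1 = x**2 + y**2 - 4.
         f2 = x*y - 1.
* derivative matrix Df
         a11 = 2.*x
         a12 = 2.*y
         a21 = y
         a22 = x
* solve Df * (dx,dy) = -f by Cramer's rule
         det = a11*a22 - a12*a21
         dx  = (-f1*a22 + f2*a12)/det
         dy  = (-f2*a11 + f1*a21)/det
         x = x + dx
         y = y + dy
         if (abs(dx) + abs(dy) .lt. 1.e-6) exit
      enddo
      print *, 'root at x =', x, '  y =', y
      end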
Optimization
We now move to the closely related problem of optimization
(finding the zeroes of a derivative). This is a very widespread
problem in physics (e.g., finding the minimum χ², the maximum
likelihood, the lowest energy state, …). Instead of looking for
zeroes of a function, we look for extrema.

Finding global extrema is a very important and very difficult problem, particularly in the case of several variables. Many techniques have been invented, and we look at a few here. The most powerful techniques (using Monte Carlo methods) will be reserved for next semester.

Here, we look for the minimum of h(x). For a maximum, consider the minimum of −h(x). We assume the function is at least twice differentiable.



Optimization

First and second derivatives:

gᵗ(x) = ( ∂h/∂x₁, …, ∂h/∂xₙ )        a vector (the gradient)

        ( ∂²h/∂x₁∂x₁ … ∂²h/∂x₁∂xₙ )
H(x) =  (      ⋮       ⋱      ⋮    )        the Hessian matrix
        ( ∂²h/∂xₙ∂x₁ … ∂²h/∂xₙ∂xₙ )

General technique:

1. Start with an initial guess x⁽⁰⁾
2. Determine a direction s and a step size λ
3. Iterate x⁽ʳ⁺¹⁾ = x⁽ʳ⁾ + λᵣ sᵣ until |g| < ε, or until no smaller h can be found
Steepest Descent
A reasonable try: steepest descent

sᵣ = −gᵣ        with the step length λᵣ from  0 = dh(xᵣ − λᵣgᵣ)/dλᵣ

Note that consecutive steps are in orthogonal directions.

As an example, we will come back to the data-smoothing example from Lecture 3:

χ² = Σᵢ₌₀ⁿ (yᵢ − f(xᵢ; θ))² / wᵢ²

θ are the parameters of the function to be fit
yᵢ are the measured points at values xᵢ
wᵢ is the weight given to point i

In our example:  f(x; A, φ) = A cos(x + φ)  and  wᵢ = 1 ∀ i

We want to minimize χ² as a function of A and φ.
Steepest Descent
χ² = Σᵢ₌₀ⁿ (yᵢ − f(xᵢ; θ))² / wᵢ²

For our data:

x       y
0.      0.
1.26    0.95
2.51    0.59
3.77    -0.59
5.03    -0.95
6.28    0.
7.54    0.95
8.80    0.59

h(A, φ) = χ² = Σᵢ₌₁⁸ (yᵢ − A cos(xᵢ + φ))²


Steepest Descent
To use our method, we need to have the derivatives:

           ( Σᵢ₌₁⁸ 2(yᵢ − A cos(xᵢ + φ)) (−cos(xᵢ + φ))  )
g(A, φ) =  (                                             )
           ( Σᵢ₌₁⁸ 2(yᵢ − A cos(xᵢ + φ)) (A sin(xᵢ + φ)) )

Recall that the step length comes from

0 = dh(xᵣ − λᵣgᵣ)/dλᵣ = d h(xᵣ₊₁)/dλᵣ

d h(xᵣ₊₁)/dλᵣ = ∇h(xᵣ₊₁)ᵗ dxᵣ₊₁/dλᵣ = −∇h(xᵣ₊₁)ᵗ gᵣ

Setting this to zero, we see that the step length is chosen so as to make the next step orthogonal to the current direction. We proceed in a zig-zag pattern.



Steepest Descent
Determining the step size through the orthogonality condition is often difficult. It is easier to do it by trial and error:

* Starting guesses for parameters
      A=0.5
      phase=1.
      step=1.
* Evaluate derivatives of function
      gA=dchisqdA(A,phase)
      gp=dchisqdp(A,phase)
      h=chisq(A,phase)
      Itry=0
1     continue
      Itry=Itry+1
* update parameters for given step size
      A1=A-step*gA
      phase1=phase-step*gp
* reevaluate the chisquared
      h1=chisq(A1,phase1)
* change step size if chi squared increased
      If (h1.gt.h) then
        step=step/2.
        goto 1
      Endif
* Chi squared decreased, keep this update
      A=A1
      phase=phase1
      gA=dchisqdA(A,phase)
      gp=dchisqdp(A,phase)
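The routines chisq, dchisqdA, and dchisqdp are not listed on the slides; a version consistent with the formulas and the data table above might look like this (in practice the data would be shared, e.g. through a common block, rather than repeated):

* chi squared for f(x;A,phase) = A*cos(x+phase), with w_i = 1
      real function chisq(a, phase)
      implicit none
      integer i
      real a, phase, x(8), y(8)
      data x /0., 1.26, 2.51, 3.77, 5.03, 6.28, 7.54, 8.80/
      data y /0., 0.95, 0.59, -0.59, -0.95, 0., 0.95, 0.59/
      chisq = 0.
      do i = 1, 8
         chisq = chisq + (y(i) - a*cos(x(i)+phase))**2
      enddo
      return
      end

* derivative of chi squared with respect to A
      real function dchisqdA(a, phase)
      implicit none
      integer i
      real a, phase, x(8), y(8)
      data x /0., 1.26, 2.51, 3.77, 5.03, 6.28, 7.54, 8.80/
      data y /0., 0.95, 0.59, -0.59, -0.95, 0., 0.95, 0.59/
      dchisqdA = 0.
      do i = 1, 8
         dchisqdA = dchisqdA
     &            - 2.*(y(i)-a*cos(x(i)+phase))*cos(x(i)+phase)
      enddo
      return
      end

* derivative of chi squared with respect to the phase
      real function dchisqdp(a, phase)
      implicit none
      integer i
      real a, phase, x(8), y(8)
      data x /0., 1.26, 2.51, 3.77, 5.03, 6.28, 7.54, 8.80/
      data y /0., 0.95, 0.59, -0.59, -0.95, 0., 0.95, 0.59/
      dchisqdp = 0.
      do i = 1, 8
         dchisqdp = dchisqdp
     &            + 2.*(y(i)-a*cos(x(i)+phase))*a*sin(x(i)+phase)
      enddo
      return
      end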
Steepest Descent

* Starting guesses for parameters
      A=0.5
      phase=1.
      step=1.

Minimum found: A = −1, phase = π/2.
Previously we had found A = 1, phase = −π/2.
(Both describe the same curve, since −cos(x + π/2) = cos(x − π/2) = sin x.)



Other Techniques

• Conjugate gradient (discussed briefly next time)
• Newton-Raphson
• Simulated annealing (Metropolis)
• Constrained optimization

You want to learn how to make these nice pictures? Attend today's recitation. Special presentation from Fred on GNU.



Exercises

1. Write a code for the 2-body motion studied in the lecture, but
using numerically calculated derivatives (secant method).
Compare the speed of convergence to that found for the NR
method.

2. Code the χ² minimization problem. Find suitable starting conditions so that the second minimum is reached. Display the result graphically.

