Lab Objective: The optimize package in SciPy provides highly optimized and
versatile methods for solving fundamental optimization problems. In this lab we
introduce the syntax and variety of scipy.optimize as a foundation for unconstrained
numerical optimization.
Local Minimization
First we will test out a few of the minimization algorithms on the Rosenbrock
function, which is defined as

f(x, y) = 100(y - x²)² + (1 - x)².

The various minimization algorithms will be discussed in more detail later. For this
lab, you do not need to understand how they work, just how to use them.
As an example, we'll minimize the Rosenbrock function with the Newton-CG method.
This method often performs better when the optional Hessian is supplied as an argument,
which we will do here.
>>> import numpy as np
>>> from scipy import optimize as opt
>>> x0 = np.array([4., -2.5])
>>> opt.minimize(opt.rosen, x0, method='Newton-CG', hess=opt.rosen_hess,
...              jac=opt.rosen_der)
fun: 1.1496545381999877e-15
jac: array([ 1.12295570e-05, -5.63744647e-06])
message: 'Optimization terminated successfully.'
nfev: 45
nhev: 34
nit: 34
njev: 78
status: 0
success: True
x: array([ 0.99999997, 0.99999993])
The printed output gives you information on the performance of the algorithm.
The most relevant outputs for this lab are
fun: 1.1496545381999877e-15, the obtained minimum;
nit: 34, the number of iterations the algorithm took to complete;
success: True, whether or not the algorithm converged;
x: array([ 0.99999997, 0.99999993]), the obtained minimizer.
Each of these outputs can be accessed either by indexing the OptimizeResult object
like a dictionary (result['nit']) or as an attribute of the class (result.nit). We
recommend access by indexing, as this is consistent with other optimization packages in
Python.
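For example, after storing the result of the call above in a variable, the iteration
count can be read either way:

>>> result = opt.minimize(opt.rosen, x0, method='Newton-CG', hess=opt.rosen_hess,
...                       jac=opt.rosen_der)
>>> result['nit']       # access by indexing (recommended)
34
>>> result.nit          # the same value as an attribute
34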
The online documentation for scipy.optimize.minimize() includes other optional
parameters available to users, for example, to set a convergence tolerance. In
some methods the derivative may be optional, while in others it is required.
While we do not cover all possible parameters in this lab, they should be explored
as needed for specific applications.
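For instance, a convergence tolerance can be passed directly to minimize() (a brief
illustration):

>>> # request a tighter tolerance than the default
>>> opt.minimize(opt.rosen, x0, method='CG', tol=1e-8)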
Each of these three algorithms will be explored in great detail later in Volume
2: Nelder-Mead is a variation of the Simplex algorithm, CG is a variant of the
Conjugate Gradient algorithm, and BFGS is a quasi-Newton method developed by
Broyden, Fletcher, Goldfarb, and Shanno.
The minimize() function can use various algorithms, each of which is best suited to
certain kinds of problems; which algorithm to use depends on the specific nature of
the problem at hand.
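For instance, switching algorithms only requires changing the method string. A quick
sketch (none of these methods requires a Hessian, and CG and BFGS approximate the
gradient numerically if none is given):

>>> for method in ['Nelder-Mead', 'CG', 'BFGS']:
...     result = opt.minimize(opt.rosen, x0, method=method)
...     print(method, result['success'], result['nit'])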
It is also important to note that in many optimization applications, very little
is known about the function to optimize. Such functions are often called blackbox
functions. For example, one may be asked in the airline industry to analyze and
optimize certain properties of a segment of an airplane wing. Perhaps expert
engineers have designed extremely robust and complicated software to model this
wing segment given certain inputs, but the function is so complicated that nobody
except the experts dares to try parsing it; in short, nobody wants to understand it.
Fortunately, one can still optimize effectively in such situations. Because so
little is known about the blackbox function, one must wisely select an appropriate
minimization method and follow the specifications of the problem exactly.
Then plot your initial curve and minimizing curve together on the same
plot, including endpoints. Note that this will require padding your array of
internal y-values with the y-values of the endpoints, so that you plot a total
of 20 points for each curve.
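For example, the padding step might look like this (a sketch; y_left, y_right, and
y_interior are placeholder names for the endpoint y-values and the 18 interior
y-values your solution produces):

>>> # pad the interior y-values with the fixed endpoint values
>>> y_full = np.concatenate(([y_left], y_interior, [y_right]))   # 20 points in total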
z = r²(1 + sin²(4r)),
where
r = √((x + 1)² + y²).
Essentially, this is a wavy crater offset from the origin by 1 along the x-axis (see
Figure 15.2). The presence of many local minima poses a difficulty for the
minimization algorithms.
For example, if we use the Nelder-Mead method as before, with an initial point of
x0 = np.array([-2, -2]), the algorithm fails to find the global minimum and instead
comes to rest on a local minimum.
>>> def multimin(x):
...     r = np.sqrt((x[0]+1)**2 + x[1]**2)
...     return r**2 * (1 + np.sin(4*r)**2)
...
>>> x0 = np.array([-2, -2])
>>> res = opt.minimize(multimin, x0, method='Nelder-Mead')
However, SciPy does have some tools to help us with these problems. Specifically,
we can use the opt.basinhopping() function.
The opt.basinhopping() function uses the same minimizing algorithms (in fact, you
can give it any minimizing algorithm that you can pass to opt.minimize()). However,
once it settles on a minimum, it hops randomly to a new point in the domain
(depending on how we set the "hopping" distance) that hopefully lies outside of the
valley or basin belonging to the current local minimum. It then searches for the
minimum from this new starting point, and if it finds a better minimizer, it repeats
the hopping process from this new minimizer. Thus, the opt.basinhopping() function
has multiple chances to escape a local basin and find the correct global minimum.
Note that opt.basinhopping() is only available in SciPy version 0.12 and later; in
earlier versions, such as 0.11, you won't find it.
Plot the initial point and minima by adapting the following line:
ax1.scatter(x_value, y_value, z_value)
Why doesn't the algorithm find the global minimum with stepsize=0.2?
Print your answer to this question, and return the true global minimum.
Root Finding
The optimize package also has functions useful in root finding. The next example,
taken from the online documentation, solves the following nonlinear system of
equations using opt.root.
x₀ + ½(x₀ - x₁)³ - 1 = 0,
½(x₁ - x₀)³ + x₁ = 0.
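A sketch of that example, adapted from the online documentation:

>>> def fun(x):
...     return [x[0] + 0.5*(x[0] - x[1])**3 - 1.0,
...             0.5*(x[1] - x[0])**3 + x[1]]
...
>>> sol = opt.root(fun, [0, 0], method='hybr')
>>> sol.x            # approximately [0.8411, 0.1588]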
As with opt.minimize(), opt.root() has more than one algorithm for root finding.
Here we have used the hybr method. There are also several algorithms for scalar
root finding. See the online documentation for more.
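For scalar problems, one such routine is opt.brentq(), which searches a bracketing
interval (a quick illustration):

>>> opt.brentq(lambda x: x**2 - 2, 0, 2)    # returns approximately 1.41421356, i.e. sqrt(2)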
Curve Fitting
SciPy also has methods for curve fitting wrapped by the opt.curve_fit() function.
Just pass it data and a function to be fit. The function should take in the
independent variable as its first argument and values for the fitting parameters as
subsequent arguments. Examine the following example from the online documentation.
>>> # the function with which to create the data and later fit it
>>> def func(x, a, b, c):
...     return a*np.exp(-b*x) + c
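The data-generation and fitting steps look roughly like this, following the pattern
of the online documentation; the true parameter values 2.5, 1.3, and 0.5 and the noise
level are illustrative:

>>> # create noisy data from func, then recover the parameters with curve_fit
>>> xdata = np.linspace(0, 4, 50)
>>> ydata = func(xdata, 2.5, 1.3, 0.5) + 0.2*np.random.normal(size=50)
>>> popt, pcov = opt.curve_fit(func, xdata, ydata)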
The variable popt now contains the fitted parameters and pcov gives the covariance
of the fit. See Figure 15.3 for a plot of the data and the fitted curve.
One of the most fundamental phenomena in the physical and engineering sciences
is turbulent convection, wherein an unstable density gradient induces a fluid to move
chaotically (basically, hot air rises). This problem is so important that experiments
and numerical simulations have been pushed to their limits in the past several
decades to determine the qualitative nature of the fluid’s motion under an extreme
forcing (think of boiling a pot of water, but instead at temperatures akin to the
interior of the sun). The strength of the forcing (amount of the enforced temperature
gradient) is measured by the non-dimensional Rayleigh number R. Of particular
interest is to determine how well the chaotic turbulent flow transports heat
from the hot bottom to the cold top, as measured by the Nusselt number ν. One of
the primary goals of experiments, simulations, and analysis is to determine how the
Nusselt number ν depends on the Rayleigh number R, i.e., if the bottom of the pot
of water is heated more strongly, how much faster does the boiling water transport
heat to the top?
Figure 15.3: Example of perturbed data graphed with the resulting curve using the
fitted parameters: a = 2.72, b = 1.31, and c = 0.45.
It is often generically believed that the Nusselt number obeys a power law of the
form ν = cR^β, where β ≤ 1/2. Through some mild assumptions on the temperature,
we can construct an eigenvalue problem that we solve numerically for a variety
of Rayleigh numbers, thus obtaining an upper bound on the Nusselt number as
ν ≤ cR^β. With our physical specifications of the problem, we may predict β < 1/2.
ν = cR^β,
use opt.curve_fit() to find a fit to the data using c and β as the fitting
parameters. See Figure 15.4 for a plot of the data along with a fitted curve.
Though it may be difficult to see in the figure, the first four points skew the
data and do not help us determine the appropriate long-term values of c and
β. Thus, do not use the first four points when fitting a curve to the data,
Figure 15.4: The black points are the data from convection.npy plotted as a scatter
plot. The blue line is a fitted curve.
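A minimal sketch of such a fit, assuming convection.npy stores (R, ν) pairs as rows
(the file layout and the names below are assumptions):

>>> R, nu = np.load("convection.npy").T        # assumes rows of (R, nu) pairs
>>> def power_law(R, c, beta):
...     return c * R**beta
...
>>> popt, pcov = opt.curve_fit(power_law, R[4:], nu[4:])   # skip the first four points
>>> c, beta = popt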
The scipy.optimize package has many other useful functions and is a good first
resource when confronting a numerical optimization problem. See the online
documentation for further details.