

The Nelder-Mead Algorithm in Two Dimensions

CEE 201L. Uncertainty, Design, and Optimization


Department of Civil and Environmental Engineering
Duke University
Henri P. Gavin
Spring, 2015
The Nelder-Mead algorithm provides a means of minimizing an objective function of n design
parameters, f(p), p = [p_1, p_2, ..., p_n]^T. The algorithm may be extended to constrained minimization problems through the addition of a penalty function. The Nelder-Mead algorithm iterates on a
simplex, which is a set of n + 1 designs, [p^(1), p^(2), ..., p^(n+1)]. The Nelder-Mead algorithm specifies
a sequence of steps for iteratively updating the worst design in the simplex, p^(n+1), in order to
converge on the smallest value of f(p^(1)).
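The penalty-function extension mentioned above can be illustrated with a minimal sketch. The quadratic form of the penalty, the convention that a constraint g is satisfied when g(p) <= 0, and the names `penalized`, `weight`, `bowl`, and `g1` are all assumptions made for this illustration; none of them come from the text.

```python
def penalized(f, constraints, weight=1e3):
    """Wrap f so that violated constraints add a quadratic penalty.

    Assumed convention (not from the text): each g in `constraints`
    is satisfied when g(p) <= 0.
    """
    def f_pen(p):
        violation = sum(max(0.0, g(p)) ** 2 for g in constraints)
        return f(p) + weight * violation
    return f_pen


# Hypothetical problem: minimize p1^2 + p2^2 subject to p1 + p2 >= 1.
def bowl(p):
    return p[0] ** 2 + p[1] ** 2


def g1(p):
    return 1.0 - p[0] - p[1]      # <= 0 when p1 + p2 >= 1


f_pen = penalized(bowl, [g1])
print(f_pen([1.0, 1.0]))   # feasible point: no penalty, prints 2.0
print(f_pen([0.0, 0.0]))   # infeasible point: penalty of 1000 added, prints 1000.0
```

The penalized function can then be handed to an unconstrained minimizer in place of the original objective.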
The simplex may be thought of as a polygon with n + 1 vertices. If n = 2, the simplex is
a triangle, and the Nelder-Mead algorithm may be easily visualized. If n = 3, the simplex is a
tetrahedron. This document introduces the Nelder-Mead algorithm for triangular simplexes.
Consider a simplex of three points [u, v, w] in the p1-p2 plane, the triangle connecting
them, and the objective function evaluated at the three points, f (u), f (v), and f (w). The steps
listed below, and illustrated in Figure 1, iteratively improve the vertices of the triangle in order to
minimize f (p).
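As a small illustration of the starting point for these steps, the sketch below builds a triangle around an initial guess and labels its vertices u, v, and w by cost. The construction (perturbing one coordinate at a time by a fixed `step`) is a common choice assumed here, not something the text prescribes, and `rosenbrock` is only a hypothetical test objective.

```python
def initial_simplex(p0, step=0.5):
    """Return n + 1 points: p0 itself, plus p0 shifted along each axis in turn."""
    simplex = [list(p0)]
    for i in range(len(p0)):
        q = list(p0)
        q[i] += step
        simplex.append(q)
    return simplex


def rosenbrock(p):
    """Hypothetical test objective with its minimum at (1, 1)."""
    return (1.0 - p[0]) ** 2 + 100.0 * (p[1] - p[0] ** 2) ** 2


triangle = initial_simplex([0.0, 0.0])
# Sort by cost: u is the best point, v the next-to-worst, w the worst.
u, v, w = sorted(triangle, key=rosenbrock)
print(u, v, w)   # [0.0, 0.0] [0.5, 0.0] [0.0, 0.5]
```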

Steps for one iteration of the Nelder-Mead Algorithm


1. Sort the vertices such that f (u) < f (v) < f (w). Point u is the best point, point v is the
next-to-worst point, and point w is the worst point.
2. Reflect the worst point, w, through the centroid of the remaining points (u and v) to obtain
the reflected point, r, and evaluate f (r).
If the cost at the reflected point, f (r), is between the best and next-to-worst cost (f (u) <
f (r) < f (v)) then replace the worst point, w, with the reflected point, r, and go to step 5.
3. If the cost at the reflected point, f (r), is better than f (u) (f (r) < f (u)) then extend the
reflected point, r, further past the average of u and v, to point e, and evaluate f (e).
(a) If the cost at the extended point, f (e), is better than the reflected point cost, f (r), then
replace the worst point, w, with the extended point, e, and go to step 5.
(b) Otherwise, replace the worst point, w, with the reflected point, r, and go to step 5.
4. If the inequalities of steps 2 and 3 are not satisfied, then it is certain that the reflected point,
r, is worse than the next-to-worst point, v (f(r) > f(v)), and a smaller value of f might be
found between points w and r. So try to contract the worst point, w, to a point c between
w and r, and evaluate f(c). The best distance along the line from w to r can be hard to
determine and, in general, it is not worth trying too hard. Typical values of c are one-quarter
and three-quarters of the way from w to r. These are called the inside and outside contraction
points, ci and co, respectively.
(a) If the cost at the better of the two contraction points is better than the next-to-worst
cost, min[f (ci ), f (co )] < f (v), then replace w with the better contraction point, ci or co ,
and go to step 5.
(b) Otherwise shrink the simplex into the best point, u, and go to step 5.
5. Check convergence.
(a) If

\[ 2 \max \left| \frac{[u, v] - [v, w]}{[u, v] + [v, w]} \right| < \epsilon_p \]

and

\[ \max \left| \frac{f(u) - [f(v), f(w)]}{f(u) + 10^{-9}} \right| < \epsilon_f \]

then
• the parameter differences between adjacent vertices are less than ε_p times the parameter
average of adjacent vertices, and
• the objective, f, at all vertices is within ε_f times the best objective function value.
So the iterations have converged, and the algorithm is terminated.
(b) Otherwise, if the number of function evaluations has exceeded a specified limit, then the
algorithm is terminated.
(c) Otherwise, go back to step 1 for the next iteration.
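The five steps above can be sketched in a few lines of code. This is a minimal illustration for n = 2 under stated assumptions, not the NMAopt.m implementation: the names `nelder_mead_2d`, `along`, `eps_p`, `eps_f`, and `max_evals` are hypothetical, the convergence test of step 5 is applied right after the sort of step 1, the parameter test is read element by element with a small guard added to the denominator, and the extension and shrink factors are taken as 2 and 1/2.

```python
def nelder_mead_2d(f, simplex, eps_p=1e-6, eps_f=1e-6, max_evals=2000):
    """Minimize f(p) for p = [p1, p2], starting from a triangle of three points."""

    def along(a, b, t):
        """Return the point a + t * (b - a)."""
        return [ai + t * (bi - ai) for ai, bi in zip(a, b)]

    pts = [(f(p), list(p)) for p in simplex]    # (cost, point) pairs
    evals = len(pts)
    while True:
        # Step 1: sort so that u is best, v next-to-worst, w worst.
        pts.sort(key=lambda pair: pair[0])
        (fu, u), (fv, v), (fw, w) = pts

        # Step 5: convergence and evaluation-limit checks on the sorted simplex.
        # The 1e-9 guard in the parameter test is an added assumption.
        dp = 2 * max(abs(a - b) / (abs(a + b) + 1e-9)
                     for p, q in ((u, v), (v, w)) for a, b in zip(p, q))
        df = max(abs(fu - fv), abs(fu - fw)) / abs(fu + 1e-9)
        if (dp < eps_p and df < eps_f) or evals >= max_evals:
            return u, fu, evals

        # Step 2: reflect w through the centroid of u and v.
        m = along(u, v, 0.5)
        r = along(w, m, 2.0)
        fr = f(r)
        evals += 1

        if fr < fu:
            # Step 3: extend past r; keep the better of e and r.
            e = along(w, m, 3.0)
            fe = f(e)
            evals += 1
            pts[2] = (fe, e) if fe < fr else (fr, r)
        elif fr < fv:
            # Step 2, continued: r lies between best and next-to-worst.
            pts[2] = (fr, r)
        else:
            # Step 4: inside and outside contraction points.
            ci, co = along(w, r, 0.25), along(w, r, 0.75)
            fc, c = min((f(ci), ci), (f(co), co), key=lambda pair: pair[0])
            evals += 2
            if fc < fv:
                pts[2] = (fc, c)
            else:
                # Step 4(b): shrink v and w halfway toward the best point u.
                v, w = along(u, v, 0.5), along(u, w, 0.5)
                pts[1], pts[2] = (f(v), v), (f(w), w)
                evals += 2


# Hypothetical objective with its minimum value of 3 at p = (1, 2).
def bowl(p):
    return (p[0] - 1.0) ** 2 + (p[1] - 2.0) ** 2 + 3.0


p_best, f_best, n_evals = nelder_mead_2d(bowl, [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
print(p_best, f_best, n_evals)
```

The example objective is offset so that its minimum value is not zero; with a minimum value near zero, the relative test on f divides by a number close to 10^-9 and becomes very strict.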

[Figure 1 panels: 1. Sort (f(u) < f(v) < f(w)); 2. Reflect (if f(u) < f(r) < f(v), w = r); 3. Extend (if f(r) < f(u): if f(e) < f(r), w = e; otherwise, w = r); 4.a. Contract (otherwise, f(r) > f(v): if f(ci) < min[f(co), f(v)], w = ci; if f(co) < min[f(ci), f(v)], w = co); 4.b. Shrink (otherwise).]

Figure 1. Illustration of the sequence of steps in one iteration of the Nelder-Mead method (for n = 2).
An extension of this method, for any value of n and including constraints on parameter values
and general inequality constraints, is implemented in the Matlab function NMAopt.m available at:
http://www.duke.edu/hpgavin/cee201/NMAopt.m
CC BY-NC-ND HP Gavin


Remarks
1. In an iteration, the Nelder-Mead method requires one (r), two (r and e), three (r, ci , and co ),
or 3 + n (r, ci , co , and n to shrink) function evaluations.
2. Within any iteration, the best point is not adjusted.
The best point can be re-assigned when the simplex is re-sorted.
3. In 2-D, a simplex and its reflection make a parallelogram.
4. If a reflection point is selected, the simplex remains the same size.
5. In an extension, the line from w to e passes through point r.
If an extension point is selected, the simplex grows in size.
6. After a shrink operation, if w′ remains the worst point, the search direction is not changed.
7. The simplex will decrease in size only if it does not improve by reflection or extension.
8. The steps listed here apply to problems with n > 2 as well, with u = p^(1), v = p^(n), and
w = p^(n+1).
9. The Nelder-Mead method is efficient in getting to the general area of a minimum point, but
is not efficient in converging to a precise minimum (Lagarias, 1998; McKinnon, 1998). The
simplex can tend to oscillate around a minimum point, can shrink into itself, or can converge to
a non-minimum point. These problems can arise when a penalty function is used to enforce a
constraint at a minimum point and all contraction points are inside contractions (McKinnon,
1998). This is related to the fact that, as the method is defined, the factors used for reflecting,
extending, contracting and shrinking the simplex (typically 1, 2, 1/2, and 1/2) are fixed and
do not depend on how a problem is converging. Nelder and Mead recognized this issue in their
original (1965) paper, and hinted at a way to compute the Hessian of the objective function
from the simplex once the simplex converged.
10. In step 4, comparing f(c) to f(v) instead of f(w) decreases the likelihood that c will be
accepted, and therefore increases the likelihood of a shrink operation. Shrink operations
are preferable to contractions for two reasons. First, contractions tend to make the simplex
degenerate (more than two vertices on a line). Second, shrink operations decrease the size
of the simplex, leading to faster convergence with respect to ε_p. On the other hand, shrink
operations require n function evaluations, and are therefore more computationally expensive
than the other operations.
11. Convergent variants of the Nelder-Mead method have been proposed (e.g., Byatt, 2000,
and Tseng, 2001).
12. Using the Nelder-Mead method to converge in the general region of a precise solution before
switching to a gradient-based method, such as sequential quadratic programming (SQP), can
sometimes work well with difficult optimization problems.
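Remarks 3, 4, 5, and 7 can be checked numerically on a single triangle. The sketch below is only an illustration: the triangle is hypothetical, `area` and `along` are helper names introduced here, and the extension and shrink factors are taken as 2 and 1/2, with contraction points at one-quarter and three-quarters of the way from w to r.

```python
def area(a, b, c):
    """Area of the triangle with vertices a, b, c (shoelace formula)."""
    return 0.5 * abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))


def along(a, b, t):
    """Return the point a + t * (b - a)."""
    return [a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])]


u, v, w = [0.0, 0.0], [2.0, 0.0], [0.0, 2.0]   # hypothetical triangle
m = along(u, v, 0.5)       # centroid of u and v
r = along(w, m, 2.0)       # reflected point
e = along(w, m, 3.0)       # extended point; the line from w to e passes through r
ci = along(w, r, 0.25)     # inside contraction point
co = along(w, r, 0.75)     # outside contraction point

a0 = area(u, v, w)
print(area(u, v, r) / a0)    # 1.0: reflection keeps the size
print(area(u, v, e) / a0)    # 2.0: extension grows the simplex
print(area(u, v, ci) / a0)   # 0.5: contraction reduces it
print(area(u, v, co) / a0)   # 0.5
print(area(u, along(u, v, 0.5), along(u, w, 0.5)) / a0)   # 0.25: shrink

# Remark 3: u, w, v, r form a parallelogram, so its diagonals share a midpoint.
print(along(w, r, 0.5) == along(u, v, 0.5))   # True
```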


Figure 2. A sequence of seven iterations of the Nelder-Mead method. Dashed lines show attempted
reflection steps.

References
1. Byatt, D., Coope, I., and Price, C., 40 Years of the Nelder-Mead Algorithm, University of
Canterbury, New Zealand, 5 November 2003. http://oldweb.cecm.sfu.ca/AAS/coope.pdf
2. Byatt, D., Convergent variants of the Nelder-Mead algorithm, Master of Science Thesis, Department of Mathematics and Statistics, University of Canterbury, 2000.
3. Kelley, C.T., Iterative Methods for Optimization, SIAM, 1999 (section 8.1).
4. Lagarias, J.C., Reeds, J.A., Wright, M.H., and Wright, P.E., Convergence Properties of the Nelder-Mead Simplex Method in Low Dimensions,
SIAM J. Optimization, 9(1) (1998): 112-147.
5. McKinnon, K.I.M., Convergence of the Nelder-Mead simplex method to a nonstationary point, SIAM
J. Optimization, 9(1) (1998): 148-158.
6. Nelder, J.A., and Mead, R., A simplex method for function minimization, Computer Journal, 7(4)
(1965): 308-313.
7. Press, W.H., Teukolsky, S.A., Vetterling, W.T., and Flannery, B.P., Numerical Recipes, the Art of
Scientific Computing, 2nd ed., Cambridge Univ. Press, 1992 (section 10.4).
8. Rao, S.S., Optimization Theory and Applications, 2nd ed., John Wiley & Sons, 1984 (section 6.6).
9. Wright, M.H., Nelder, Mead, and the other Simplex Method, Documenta Mathematica, Extra Volume ISMP (2012): 271-276. http://www.math.uiuc.edu/documenta/vol-ismp/42_wright-margaret.pdf