Assignment_NumericalOptimization
Optimization Toolbox
The Optimization Toolbox provides a collection of algorithms for standard and
large-scale numerical optimization. The toolbox handles constrained and unconstrained,
continuous and discrete problems. It contains
• interactive tools for defining and solving optimization problems and monitoring
solution progress,
• solvers for nonlinear least squares, data fitting, and nonlinear equations.
Open the template file NumericalOptim_Template.mlx in the Matlab Live Editor and
work through the assignments step by step.
Numerical optimization
Mathematical or numerical optimization is concerned with the selection of a best solution
with respect to some criterion (the objective function) from a set of available alternatives. In the
simplest case, an optimization problem consists of maximizing or minimizing a real-valued
function f (x) by systematically choosing solutions x from within an allowed set A and
computing the value of the function. Optimization includes finding the best available values
of some objective function given a defined domain (or a set of constraints), covering a
variety of different types of objective functions and different types of domains.
Typically, A is some subset of the Euclidean space Rn, often specified by a set of
constraints, equalities or inequalities that the members of A have to satisfy. The domain A
of f is called the search space, while the elements of A are called candidate solutions or
feasible solutions.
The function f is called an objective function, a loss function, a cost function, or a fitness
function (in evolutionary algorithms). A feasible solution that minimizes (or maximizes) the
objective function is called an optimal solution.
In addition, there are heuristics that may provide approximate solutions to some problems,
although their iterates need not converge.
This lab is concerned with iterative methods that are capable of finding a local minimum or
maximum of an objective function. If the objective function is unimodal, the local
optimum coincides with the global optimum. The iterative methods differ according to
whether they evaluate Hessians, gradients, or only function values. While evaluating Hessians
(H) and gradients (G) improves the rate of convergence for functions for which these
quantities exist and vary sufficiently smoothly, such evaluations increase the computational
complexity or computational cost of each iteration. In some cases, the computational
complexity may be excessively high.
Methods that evaluate only function values:
• interpolation methods
Methods that evaluate gradients, or approximate gradients using finite differences (or even
subgradients):
• Interior point methods: a large class of methods for constrained optimization.
Some interior-point methods use only (sub)gradient information, while others
require the evaluation of Hessians.
• Subgradient methods: iterative methods for large problems with locally Lipschitz
objective functions, based on generalized gradients. Subgradient projection methods
are similar to conjugate gradient methods.
Methods that evaluate Hessians (or approximate Hessians, using finite differences):
• Newton’s method
The formulation and solving of the optimization problems follows these steps:
• Instantiate the optimization problem object and the objective sense (maximize,
minimize).
prob = optimproblem('ObjectiveSense','minimize');
• Instantiate the optimization parameters (variables) and define the underlying data
type (continuous, integer, binary) and optionally its range e.g. x ∈ R2
x = optimvar('x',2,'Type','continuous');
• Review the problem formulation with show(prob) by inspecting the details
(show is only available since Matlab 2019b, otherwise use showproblem(prob)).
The review shows the basic components and features of the problem.
• (optional) Convert the problem based formulation to solver form and utilize the
generated code (m-files) for solver based optimization.
problem = prob2struct(prob);
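The following minimal sketch ties these steps together on a toy quadratic objective; the variable name y, the objective, and the initial point are illustrative choices and not part of the assignment.
% Minimal sketch of the problem-based workflow on a toy quadratic objective
prob = optimproblem('ObjectiveSense','minimize');    % optimization problem object
y = optimvar('y',2,'Type','continuous');             % two continuous variables
prob.Objective = (y(1) - 1)^2 + (y(2) + 2)^2;        % objective expression
show(prob)                                           % review the formulation (R2019b+)
y0.y = [0; 0];                                       % initial point passed as a struct
sol = solve(prob, y0);                               % sol.y contains the minimizer
problem = prob2struct(prob);                         % optional conversion to solver form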
The Booth function is defined as
f(x1, x2) = (x1 + 2x2 − 7)^2 + (2x1 + x2 − 5)^2.   (1)
This is a linear least squares problem, as the parameters x1, x2 enter linearly into
the squared terms (x1 + 2x2 − 7)^2 and (2x1 + x2 − 5)^2.
1) Set up the problem definition for the unconstrained optimization problem of mini-
mizing the Booth function in Eq. 1.
prob = ...
3) Solve the optimization problem with the initial solution x0 = (0, 0) and report the
solution vector.
sol = ...
4) Verify that the reported solution is accurate by inspecting the function value, exit
flag, output structure.
[sol,fval,exitflag,output] = solve(prob)
[Figure: surface and contour plots of the Booth function over x1 and x2.]
5) Convert the optimization problem to solver form and inspect the problem formulation.
The employed solver may depend on the Matlab version.
problem = prob2struct(prob);
6) Solve Booth's optimization problem with the solver proposed in the solver
attribute problem.solver.
7) Solve Booth's optimization problem with the Live Editor Task. In the Live Editor
menu select Task->Optimization->Optimize. This opens the dialogue shown in
figure 2. Select the least-squares objective and the solver lsqnonlin.
The solver lsqnonlin expects the vector of least squares residuals fi(x) as input for
solving min_x Σ_i fi(x)^2. The residuals of the Booth function are f1 = x1 + 2x2 − 7
and f2 = 2x1 + x2 − 5.
In the options menu of the task (three vertical dots in the upper right corner) select
the option Controls and Code. Plot the current point and objective value at each
iteration. Modify the initial solution xstart by entering numbers in the control
fields for xstart(1) and xstart(2).
Unconstrained Optimization
The Matlab function fminunc provides a nonlinear programming solver that finds the
solution of problem (1).
[x,fval] = fminunc(fun,x0,options)
minimizes fun with the optimization options specified in options, which are conveniently
set with the command optimoptions. x0 denotes the initial solution. fminunc returns the
optimal solution x and the function value fval. fminunc becomes faster and more reliable
when your objective function fun provides derivatives, returning the gradient as second and
the Hessian as third output argument. With options you can choose diagnostic output
during the solution process and select the optimization scheme. fminunc employs either
a quasi-Newton algorithm (set the parameter 'Algorithm' to 'quasi-newton', the default) or a
trust-region method (set the parameter 'Algorithm' to 'trust-region', which requires the
gradient to be supplied).
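A minimal sketch of such a call, with an illustrative quadratic objective and starting point that are not part of the assignment:
fun = @(x) (x(1) - 3)^2 + 2*(x(2) + 1)^2;            % illustrative objective as anonymous function
x0 = [0, 0];                                         % initial solution
options = optimoptions('fminunc', ...
    'Algorithm','quasi-newton', ...                  % or 'trust-region' (requires the gradient)
    'Display','iter');                               % print diagnostic output per iteration
[x,fval] = fminunc(fun,x0,options);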
Function Handles
The argument fun to fminunc is a function handle. Function handles provide a means
of calling a function indirectly. You can pass function handles in calls to other functions
(often called function functions). The function fminbnd finds the minimum of a single-
variable function on a fixed interval. humps is a function with strong maxima near x = 0.3
and x = 0.9. This code computes the minimum of humps in the interval x ∈ [0.3, 1.0]:
fplot(@humps,[0,2]);            % plot humps on the interval [0, 2] via a function handle
x = fminbnd(@humps, 0.3, 1);    % minimum of humps on the interval [0.3, 1]
[Figure 3: contour plot of the Rosenbrock function over x1, x2 ∈ [−3, 3].]
Rosenbrock Function
The elevation profile of the Rosenbrock function has a banana shape. The global minimum
is located along a narrow, flat valley with steep, parabolic walls. Solutions converge rapidly
to the valley, but convergence to the global minimum x∗ = (1, 1) inside the valley is
non-trivial. In this assignment the goal is to find the minimum of the Rosenbrock function
for the parameters a = 1 and b = 100 (see figure 3):
1) Generate a contour plot of the Rosenbrock function with fcontour in the interval
[x1, x2] ∈ [−3, 3] × [−3, 3], as shown in figure 3.
2) Edit the Matlab function rosenbrock(x), at the end of the template file, with the
input vector x = [x1 , x2 ] and the output scalar f (x1 , x2 ) of (3), with the following
signature
function f = rosenbrock(x)
% Calculate objective function f
f = ...
end
For that purpose calculate the analytical gradient of the Rosenbrock function by
hand.
∇f (x1 , x2 ) = . . . (4)
In case of doubt, confirm your result by using the Symbolic Toolbox to determine the
gradient analytically with gradient. Do NOT use the symbolic expression for the gradient
of the cost function within the solver (optimtool), but rather provide the corresponding
numerical expression of the gradient.
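As an illustration of the expected pattern (on a hypothetical quadratic, not the assignment function), an objective that returns its gradient as a second output looks like this; for the assignment the body has to compute the Rosenbrock function and the gradient derived above, with 'SpecifyObjectiveGradient' set to true in the options.
function [f,g] = quadWithGrad(x)     % hypothetical example function
    f = (x(1) - 1)^2 + 4*x(2)^2;     % objective value
    if nargout > 1                   % gradient is only computed when requested
        g = [2*(x(1) - 1); 8*x(2)];  % analytical gradient
    end
end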
Trust region methods operate with a model function mk(x) which approximates the
objective function f(x) in the vicinity of the current solution xk. If x is far from xk the
approximation may be poor; therefore the minimization of mk is restricted to a region
around xk. In other words, the candidate step p is obtained by approximately solving the
subproblem
min_p mk(xk + p)   subject to   xk + p ∈ Sk,   (5)
in which Sk denotes the trust region. If the step p does not result in the expected decrease
in f, the trust region shrinks and problem (5) is solved again. The trust region Sk is often
defined by a ball of radius ∆ > 0 around xk, Sk = {x : ||x − xk|| ≤ ∆}.
4) MANDATORY: Find the minimum of the Rosenbrock function with fminunc using
the algorithm trust-region (parameter 'Algorithm'). Utilize the Live Editor task
Optimize to generate code. Select the options
• objective : nonlinear
• objective function : rosenbrock.m
• initial point : xstart
Specify the following non-default solver options for fminunc
[Figure: comparison of the Trust-Region, Quasi-Newton, Levenberg-Marquardt and Trust-Region-Reflective algorithms on the Rosenbrock function (contour view, x1, x2 ∈ [−2, 2]).]
To log xk, the function value fk and the number of function calls during the optimization,
a template Matlab function is provided (see template.m). The template uses a custom
output function optHistoryLogger, which stores the information in an optHistory struct
during the optimization. The function handle to the output function has to be passed to
the optimization via the optimoptions command
options = optimoptions(options,'OutputFcn', @optHistoryLogger);
You can fetch the recorded data after the optimization using:
optHistory = optHistoryLogger;
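The actual logger is provided with the template; the following sketch merely illustrates the general structure of such an output function (the function name, the stored fields, and the fetch mechanism are assumptions and may differ from the provided optHistoryLogger).
function stop = historyLoggerSketch(x, optimValues, state)
    persistent history
    stop = false;                                % never abort the optimization
    if nargin == 0                               % fetch mode: hist = historyLoggerSketch;
        stop = history;
        return
    end
    switch state
        case 'init'                              % called once before the first iteration
            history = struct('x',[],'fval',[],'funccount',[]);
        case 'iter'                              % called after each iteration
            history.x = [history.x; x(:)'];
            history.fval = [history.fval; optimValues.fval];
            history.funccount = [history.funccount; optimValues.funccount];
    end
end
Returning stop = true from an output function would abort the optimization early.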
Quasi-Newton Method
Line search algorithms proceed in two steps: first determine a suitable search direction pk
along which f decreases, and then determine an optimal step size α that minimizes
f(xk + α pk) along that direction.
Algorithm 1: Line search (quasi-)Newton method
1: for k = 1, 2, . . . do
2:    Solve Bk pk = −∇f(xk)
3:    Set xk+1 = xk + αk pk, where αk satisfies the Wolfe condition
4: end for
The search direction is computed as
pk = −Bk^-1 ∇fk   (8)
in which Bk is a positive definite symmetric matrix. The most obvious search direction is
the negative gradient, which points in the direction of steepest descent. In that case Bk = I
and pk = −∇fk. It only requires calculation of the gradient ∇fk, but convergence can be
slow on difficult problems such as the Rosenbrock function. An optimal direction is given
by the Newton direction, for which Bk = ∇2 fk is the Hessian matrix. The Newton direction
has a natural step length of α = 1 and exhibits a quadratic convergence rate. Its main
drawback is the computational burden required for the explicit computation of the
Hessian, in particular if it is approximated by finite differences. That observation motivates
quasi-Newton methods, which employ an approximation of the Hessian and still exhibit
superlinear convergence. The approximation Bk of the Hessian is updated after each
iteration, taking into account the change of gradient between ∇fk and ∇fk+1 to estimate
the second derivatives. The update of Bk has to comply with the so-called secant condition
Bk+1 (xk+1 − xk) = ∇fk+1 − ∇fk.
The BFGS update rule for Bk is named after Broyden, Fletcher, Goldfarb, and Shanno.
With sk = xk+1 − xk and yk = ∇fk+1 − ∇fk it reads
Bk+1 = Bk − (Bk sk sk^T Bk)/(sk^T Bk sk) + (yk yk^T)/(yk^T sk).
The quasi-Newton search direction is determined by (8), using Bk in lieu of the exact
Hessian of the Newton algorithm. The (quasi-)Newton algorithm with line search is detailed
in Algorithm 1.
The Wolfe condition guarantees a sufficient decrease of f along the search direction pk
such that α is neither too small nor too large. Line search Newton terminates either when
the change in the solution |xk+1 − xk| falls below a threshold or when the norm of the
gradient ∇fk falls below a threshold.
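To make the scheme concrete, here is a minimal sketch of Algorithm 1 with the BFGS update on a simple quadratic, using a plain backtracking (Armijo) line search instead of a full Wolfe line search; the objective, starting point and tolerances are illustrative.
f = @(x) (x(1) - 2)^2 + 5*(x(2) + 1)^2;          % illustrative objective
grad = @(x) [2*(x(1) - 2); 10*(x(2) + 1)];       % its gradient
x = [0; 0];  B = eye(2);                         % B1 = I, i.e. the first step is steepest descent
for k = 1:100
    g = grad(x);
    if norm(g) < 1e-8, break; end                % gradient threshold
    p = -B \ g;                                  % solve Bk*pk = -grad f(xk)
    alpha = 1;                                   % natural (quasi-)Newton step length
    while f(x + alpha*p) > f(x) + 1e-4*alpha*(g.'*p)
        alpha = alpha/2;                         % backtrack until sufficient decrease
    end
    s = alpha*p;  x = x + s;
    y = grad(x) - g;                             % gradient change for the secant condition
    if s.'*y > 1e-12                             % keep Bk positive definite
        B = B - (B*(s*s.')*B)/(s.'*B*s) + (y*y.')/(y.'*s);   % BFGS update
    end
end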
6) Find the minimum of the Rosenbrock function using fminunc with the algorithm
quasi-newton. Reuse the Optimize task and code from the previous assignment;
merely substitute the trust-region algorithm by the quasi-newton algorithm.
Nonlinear Least Squares
Nonlinear least squares problems have the form
min_x f(x) = 1/2 Σ_{j=1}^m rj(x)^2,   (11)
in which r(x) = (r1(x), . . . , rm(x))^T denotes the vector of residuals. Nonlinear least squares
problems are particularly relevant for regression, in which a parametrized model is identified
from observed data. In this case rj(x) measures the disagreement between the prediction of
the model and the observed output over a set of training data {(xi, yi)}. By minimizing
(11), the parameters of the model are selected such that the model matches the data in a
least squares sense. Numerical optimization schemes such as Levenberg-Marquardt exploit
the particular structure of f and its derivatives.
In least-squares problems the Hessian matrix ∇2 f(x) can be approximated from knowledge
of the Jacobian.
∇2 f(x) = Σ_{j=1}^m ∇rj(x) ∇rj(x)^T + Σ_{j=1}^m rj(x) ∇2 rj(x)   (13)
        = J(x)^T J(x) + Σ_{j=1}^m rj(x) ∇2 rj(x)   (14)
The first term J(x)^T J(x) often dominates the second term, in particular if the residuals
rj(x) become small. Algorithms such as Levenberg-Marquardt exploit this property of the
Hessian for nonlinear least squares problems.
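As a small numerical illustration of (14), the following sketch evaluates the Gauss-Newton term J(x)^T J(x) for a made-up exponential model y = x1*exp(x2*t); the data and the evaluation point are hypothetical.
t = [0; 1; 2; 3];                                % hypothetical inputs
y = [2.0; 1.1; 0.62; 0.33];                      % hypothetical measurements
x = [2; -0.6];                                   % current parameter estimate
r = x(1)*exp(x(2)*t) - y;                        % residual vector r(x)
J = [exp(x(2)*t), x(1)*t.*exp(x(2)*t)];          % Jacobian, one row per residual
H_gn = J.'*J;                                    % Gauss-Newton approximation of the Hessian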
The Levenberg-Marquardt algorithm solves non-linear least squares problems that arise
in least squares regression. The Levenberg-Marquardt method can be interpreted as a
trust region approach. Assuming a spherical trust region Sk of radius ∆k, each Levenberg-
Marquardt step solves the quadratic subproblem
min_p 1/2 ||Jk p + rk||^2   with p ∈ Sk.   (15)
The unconstrained minimizer of (15), the Gauss-Newton step pk^GN, satisfies the normal
equations
Jk^T Jk pk^GN = −Jk^T rk.   (17)
lsqnonlin solves nonlinear least-squares curve fitting problems of the form (11). You can
provide optional lower and upper bounds lb and ub on the components of x.
Rather than computing the sum of squares f(x), lsqnonlin requires the user-defined
function to compute the vector-valued residual function r(x) = (r1(x), . . . , rm(x))^T.
x = lsqnonlin(fun,x0)
starts at the point x0 and finds a minimum of the sum of squares of the functions described
in fun. The function fun should return a vector of values and not the sum of squares of
the values; in the case of the Rosenbrock function r1(x1, x2) = 10(x2 − x1^2) and
r2(x1, x2) = 1 − x1.
lsqnonlin operates with 'Algorithm' set to 'levenberg-marquardt' or
'trust-region-reflective' (default). Notice that 'levenberg-marquardt' does not
handle constraints.
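As an illustration (not part of the assignment), a sketch of a typical lsqnonlin call on a made-up exponential data-fitting problem:
t = (0:9)';
y = 3*exp(-0.4*t) + 0.05*randn(10,1);            % hypothetical noisy measurements
fun = @(p) p(1)*exp(p(2)*t) - y;                 % residual vector, not the sum of squares
p0 = [1, -1];                                    % initial parameter guess
opts = optimoptions('lsqnonlin','Algorithm','levenberg-marquardt');
p = lsqnonlin(fun, p0, [], [], opts);            % empty bounds; levenberg-marquardt does not handle them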
In case of doubt verify your analytical result with the Symbolic Toolbox.
f = ...;
if nargout > 1
J = ...
end
end
10) (Optional:) Find the minimum of the Rosenbrock function with the lsqnonlin
solver using the levenberg-marquardt algorithm.
Set the option 'SpecifyObjectiveGradient' to true. Utilize the Optimize task in
the Live Editor to specify the problem and automatically generate the code. Select
the problem type least squares.
Specify the following non-default solver options for levenberg-marquardt:
12) (Optional:) Compare the convergence rate of the algorithms Quasi-Newton, Trust-
Region and Levenberg-Marquardt by plotting (with semilogy)
• the function value fk at the current solution w.r.t. the number of function
evaluations (logarithmic y-axis),
• the Euclidean distance of the current solution xk to the optimum x∗ w.r.t. the
number of function evaluations (logarithmic y-axis).
The minimum of the Rosenbrock function with the parameters a, b is xopt = [1, 1],
at which fopt = 0.
HINT: Use semilogy(x,y) to generate plots with a logarithmic scale on the y-axis.
semilogy applies the logarithmic scaling itself, so there is no need to compute the
logarithm of y beforehand.
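A sketch of the comparison plot, assuming the logged histories are stored in structs histTR, histQN and histLM with the fields fval and funccount (these names are assumptions; the fields recorded by the provided logger may differ):
figure;
semilogy(histTR.funccount, histTR.fval, 'DisplayName','Trust-Region');  hold on;
semilogy(histQN.funccount, histQN.fval, 'DisplayName','Quasi-Newton');
semilogy(histLM.funccount, histLM.fval, 'DisplayName','Levenberg-Marquardt');
hold off;
xlabel('function evaluations');  ylabel('f_k');  legend('show');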
Constrained Optimization
A constrained optimization problem is formulated in the following way:
min_x f(x)   subject to   ci(x) = 0, i ∈ E,   and   ci(x) ≤ 0, i ∈ I,
in which ci(x), i ∈ E, denotes the vector of equality constraints and ci(x), i ∈ I, denotes
the vector of inequality constraints.
x = fmincon(fun,x0,A,b,Aeq,beq)
starts at x0 and attempts to find a minimizer x of the function described in fun subject
to the linear inequalities Ax ≤ b and the linear equalities Aeq x = beq.
x = fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon)
subjects the minimization to the nonlinear inequalities c(x) or equalities ceq(x) defined
in nonlcon. fmincon optimizes such that c(x) ≤ 0 and ceq (x) = 0.
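A minimal sketch of such an fmincon call with one linear inequality and one linear equality; the objective, the constraint data and the starting point are illustrative placeholders, not the values used in the tasks below.
fun = @(x) (x(1) - 1)^2 + (x(2) - 2)^2;          % placeholder objective
A = [1 1];    b = 2;                             % linear inequality A*x <= b
Aeq = [1 -1]; beq = 0;                           % linear equality Aeq*x = beq
x0 = [0, 0];
opts = optimoptions('fmincon','Algorithm','interior-point');
x = fmincon(fun, x0, A, b, Aeq, beq, [], [], [], opts);   % empty bounds and nonlcon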
This assignment is concerned with the Rosenbrock function from the previous assignment
but this time its optimization is subject to additional constraints.
1) Determine the minimum of the Rosenbrock function using fmincon with the algo-
rithms Interior-Point, SQP and Active-Set subject to the linear constraints.
x1 + 2x2 ≤ 1 (2)
x1 − 0.5x2 = 0.7 (3)
2) Generate a second contour plot of the Rosenbrock function with fcontour in the
interval [x1, x2] ∈ [−3, 3] × [−3, 3]. (Reuse the code from the previous exercise.)
4) Superimpose the two lines that represent the equality and inequality constraint.
x1 = [-3 3];                                           % x1 range for the constraint lines
hcon1 = plot(x1, -A(1)/A(2)*x1 + b/A(2), 'c');         % boundary of the inequality A*x <= b
hcon2 = plot(x1, -Aeq(1)/Aeq(2)*x1 + beq/Aeq(2), 'k'); % equality constraint Aeq*x = beq
axis([-3 3 -3 3]);
5) Apparently, the inequality constraint is active and the optimal solution xopt =
(0.76, 0.12) is entirely determined by the two active constraints. Relax the inequality
constraint to
x1 + 2x2 ≤ 5   (4)
Repeat the constrained optimization with the relaxed inequality. In this case the
inequality becomes inactive at the optimal solution xopt = (1, 0.6).
6) Determine the minimum of the Rosenbrock function using fmincon with the
algorithms interior-point, sqp and active-set, subject to the nonlinear inequality
constraint. Generate the code with the Live Editor task Optimize with the options
• objective : nonlinear
• constraints : nonlinear inequality
• objective function : rosenbrock.m
• constraints function : ellipseCon.m
• initial point : xstart
7) Replace the linear constraints from the previous task by the nonlinear constraint
that x has to lie within an ellipse with major axis rx1 and minor axis rx2, centered
at (mx1, mx2). The condition is expressed by
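A sketch of what such a constraint function could look like, assuming the standard ellipse inequality ((x1 − mx1)/rx1)^2 + ((x2 − mx2)/rx2)^2 ≤ 1; the function name and the numeric values are placeholders for those given in the assignment.
function [c,ceq] = ellipseConSketch(x)
    rx1 = 2;  rx2 = 1;               % placeholder semi-axes
    mx1 = 0;  mx2 = 0;               % placeholder center
    c = ((x(1) - mx1)/rx1)^2 + ((x(2) - mx2)/rx2)^2 - 1;   % c(x) <= 0 inside the ellipse
    ceq = [];                        % no nonlinear equality constraint
end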