MATLAB Optimization Toolbox
User's Guide
R2020a
How to Contact MathWorks
Phone: 508-647-7000
Acknowledgments
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xx
Getting Started
1
Optimization Toolbox Product Description . . . . . . . . . . . . . . . . . . . . . . . . . 1-2
Key Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2
Setting Up an Optimization
2
Optimization Theory Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-2
Linear Equality Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-36
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-80
Examining Results
3
Current Point and Function Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2
Exit Message Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-7
Use a Sparse Solver or a Multiply Function . . . . . . . . . . . . . . . . . . . . . . 4-10
Use Parallel Computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-11
Optimization App
5
Optimization App . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-2
Optimization App Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-2
Specifying Certain Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-6
Importing and Exporting Your Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-8
Minimization with Gradient and Hessian . . . . . . . . . . . . . . . . . . . . . . . . . 6-13
Code Generation in fmincon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-114
What Is Code Generation? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-114
Code Generation Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-114
Generated Code Not Multithreaded . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-115
Nonlinear Problem-Based
7
Rational Objective Function, Problem-Based . . . . . . . . . . . . . . . . . . . . . . . 7-2
Multiobjective Algorithms and Examples
8
Multiobjective Optimization Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2
Multiobjective Optimization Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2
Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-3
Optimal Dispatch of Power Generators: Solver-Based . . . . . . . . . . . . . . . 9-55
Problem-Based Optimization
10
Problem-Based Optimization Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-2
Named Index for Optimization Variables . . . . . . . . . . . . . . . . . . . . . . . . . 10-20
Create Named Indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-20
Use Named Indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-21
View Solution with Index Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-22
Create Initial Point for Optimization with Named Index Variables . . . . 10-43
Quadratic Programming
11
Quadratic Programming Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2
Quadratic Programming Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2
interior-point-convex quadprog Algorithm . . . . . . . . . . . . . . . . . . . . . . . . 11-2
trust-region-reflective quadprog Algorithm . . . . . . . . . . . . . . . . . . . . . . . 11-7
active-set quadprog Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-11
Step 1: Decide what part of H to pass to quadprog as the first argument. . . . 11-17
Step 2: Write a function to compute Hessian-matrix products for H. . . . 11-17
Step 3: Call a quadratic minimization routine with a starting point. . . . 11-18
Preconditioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-19
Least Squares
12
Least-Squares (Model Fitting) Algorithms . . . . . . . . . . . . . . . . . . . . . . . . 12-2
Least Squares Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-2
Linear Least Squares: Interior-Point or Active-Set . . . . . . . . . . . . . . . . . . 12-2
Trust-Region-Reflective Least Squares . . . . . . . . . . . . . . . . . . . . . . . . . . 12-3
Levenberg-Marquardt Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-6
Setting Up the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-27
Systems of Equations
13
Equation Solving Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-2
Equation Solving Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-2
Trust-Region Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-2
Trust-Region-Dogleg Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-4
Levenberg-Marquardt Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-5
fzero Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-6
\ Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-6
Nonlinear Equations with Jacobian Sparsity Pattern . . . . . . . . . . . . . . . 13-13
Step 1: Write a file nlsf1a.m that computes the objective function values. . . . 13-13
Step 2: Call the system of equations solve routine. . . . . . . . . . . . . . . . . 13-13
Optimization Options Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-6
Optimization Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-6
Hidden Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-16
Functions
16
Acknowledgments
MathWorks® would like to acknowledge the following contributors to Optimization Toolbox
algorithms.
Thomas F. Coleman researched and contributed algorithms for constrained and unconstrained
minimization, nonlinear least squares and curve fitting, constrained linear least squares, quadratic
programming, and nonlinear equations.
Yin Zhang researched and contributed the large-scale linear programming algorithm.
1
Getting Started
Optimization Toolbox provides functions for finding parameters that minimize or maximize objectives
while satisfying constraints. The toolbox includes solvers for linear programming (LP), mixed-integer
linear programming (MILP), quadratic programming (QP), nonlinear programming (NLP),
constrained linear least squares, nonlinear least squares, and nonlinear equations. You can define
your optimization problem with functions and matrices or by specifying variable expressions that
reflect the underlying mathematics.
You can use the toolbox solvers to find optimal solutions to continuous and discrete problems,
perform tradeoff analyses, and incorporate optimization methods into algorithms and applications.
The toolbox lets you perform design optimization tasks, including parameter estimation, component
selection, and parameter tuning. It can be used to find optimal solutions in applications such as
portfolio optimization, resource allocation, and production planning and scheduling.
Key Features
• Nonlinear and multiobjective optimization of smooth constrained and unconstrained problems
• Solvers for nonlinear least squares, constrained linear least squares, data fitting, and nonlinear
equations
• Quadratic programming (QP) and linear programming (LP)
• Mixed-integer linear programming (MILP)
• Optimization modeling tools
• Graphical monitoring of optimization progress
• Gradient estimation acceleration (with Parallel Computing Toolbox™)
First Choose Problem-Based or Solver-Based Approach
This table summarizes the main differences between the two approaches.

"Problem-Based Optimization Setup"
• Easier to create and debug
• Represents the objective and constraints symbolically
• Requires translation from problem form to matrix form, resulting in a longer solution time
• Does not allow direct inclusion of gradient or Hessian; see "Include Derivatives in Problem-Based Workflow" on page 7-20
• See the steps in "Problem-Based Optimization Workflow" on page 10-2 or "Problem-Based Workflow for Solving Equations" on page 10-4
• Basic linear example: "Mixed-Integer Linear Programming Basics: Problem-Based" on page 10-40 or the video Solve a Mixed-Integer Linear Programming Problem Using Optimization Modeling
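As a rough sketch of the contrast between the two approaches (this one-variable bound-constrained example is illustrative and not from the original text):

```matlab
% Problem-based: build the objective symbolically from an optimization variable
x = optimvar('x','LowerBound',0);
prob = optimproblem('Objective',(x - 3)^2);
sol = solve(prob);                 % sol.x is approximately 3

% Solver-based: pass a function handle, bounds, and matrices directly
[xs,fs] = fmincon(@(x)(x - 3)^2,1,[],[],[],[],0,[]);
```

The problem-based call reads like the mathematics; the solver-based call requires you to know the positional argument order of the solver.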
See Also
More About
• “Problem-Based Optimization Setup”
• “Solver-Based Optimization Problem Setup”
Solve a Constrained Nonlinear Problem, Problem-Based
This example shows how to solve a constrained nonlinear optimization problem using the problem-
based approach. The example demonstrates the typical work flow: create an objective function,
create constraints, solve the problem, and examine the results.
Note:
If your objective function or nonlinear constraints are not composed of elementary functions, you
must convert the nonlinear functions to optimization expressions using fcn2optimexpr. See the
last part of this example, Alternative Formulation Using fcn2optimexpr, or "Convert
Nonlinear Function to Optimization Expression” on page 7-8.
For the solver-based approach to this problem, see “Solve a Constrained Nonlinear Problem, Solver-
Based” on page 1-11.
The problem is to minimize Rosenbrock's function,

f(x) = 100(x2 − x1^2)^2 + (1 − x1)^2,

over the unit disk, meaning the disk of radius 1 centered at the origin. In other words, find x that
minimizes the function f(x) over the set x1^2 + x2^2 ≤ 1. This problem is a minimization of a nonlinear
function subject to a nonlinear constraint.
Rosenbrock's function is a standard test function in optimization. It has a unique minimum value of 0
attained at the point [1,1]. Finding the minimum is a challenge for some algorithms because the
function has a shallow minimum inside a deeply curved valley. The solution for this problem is not at
the point [1,1] because that point does not satisfy the constraint.
This figure shows two views of Rosenbrock's function in the unit disk. The vertical axis is log-scaled;
in other words, the plot shows log(1 + f (x)). Contour lines lie beneath the surface plot.
rosenbrock = @(x)100*(x(:,2) - x(:,1).^2).^2 + (1 - x(:,1)).^2; % Vectorized function
% Define the figure and grid data (minimal setup; the original example's exact
% figure properties and grid spacing may differ)
figure1 = figure;
[X,Y] = meshgrid(-1:0.01:1);
Z = log(1 + rosenbrock([X(:),Y(:)]));
Z = reshape(Z,size(X));
% Create subplot
subplot1 = subplot(1,2,1,'Parent',figure1);
view([124 34]);
grid('on');
hold on;
% Create surface
surf(X,Y,Z,'Parent',subplot1,'LineStyle','none');
% Create contour
contour(X,Y,Z,'Parent',subplot1);
% Create subplot
subplot2 = subplot(1,2,2,'Parent',figure1);
view([234 34]);
grid('on');
hold on
% Create surface
surf(X,Y,Z,'Parent',subplot2,'LineStyle','none');
% Create contour
contour(X,Y,Z,'Parent',subplot2);
% Create textarrow
annotation(figure1,'textarrow',[0.4 0.31],...
[0.055 0.16],...
'String',{'Minimum at (0.7864,0.6177)'});
% Create arrow
annotation(figure1,'arrow',[0.59 0.62],...
[0.065 0.34]);
hold off
The rosenbrock function handle calculates Rosenbrock's function at any number of 2-D points at
once. This “Vectorization” (MATLAB) speeds the plotting of the function, and can be useful in other
contexts for speeding evaluation of a function at multiple points.
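For example, the vectorized handle accepts a matrix whose rows are 2-D points and returns one function value per row:

```matlab
rosenbrock = @(x)100*(x(:,2) - x(:,1).^2).^2 + (1 - x(:,1)).^2;
vals = rosenbrock([1 1; 0 0; -1 1])   % returns [0; 1; 4]
```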
The function f (x) is called the objective function. The objective function is the function you want to
minimize. The inequality x1^2 + x2^2 ≤ 1 is called a constraint. Constraints limit the set of x over which a
solver searches for a minimum. You can have any number of constraints, which are inequalities or
equations.
The problem-based approach to optimization uses optimization variables to define objective and
constraints. There are two approaches for creating expressions using these variables: write the
expressions directly in terms of the variables (possible for polynomial or rational functions), or
convert more general MATLAB functions to optimization expressions using fcn2optimexpr.
For this problem, both the objective function and the nonlinear constraint are polynomials, so you can
write the expressions directly in terms of optimization variables. Create a 2-D optimization variable
named 'x'.
x = optimvar('x',1,2);
Create the objective function as an expression in the optimization variable.

obj = 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;

Create an optimization problem named prob having obj as the objective function.

prob = optimproblem('Objective',obj);

Create the nonlinear constraint as a polynomial expression in x, include it in prob under the name circlecons, and review the problem.

nlcons = x(1)^2 + x(2)^2 <= 1;
prob.Constraints.circlecons = nlcons;
show(prob)
OptimizationProblem :
Solve for:
x
minimize :
((100 .* (x(2) - x(1).^2).^2) + (1 - x(1)).^2)
subject to circlecons:
(x(1).^2 + x(2).^2) <= 1
Solve Problem
To solve the optimization problem, call solve. The problem needs an initial point, which is a
structure giving the initial value of the optimization variable. Create the initial point structure x0
having an x-value of [0 0].
x0.x = [0 0];
[sol,fval,exitflag,output] = solve(prob,x0)
fval = 0.0457
exitflag =
OptimalSolution
Examine Solution
The solution shows exitflag = OptimalSolution. This exit flag indicates that the solution is a
local optimum. For information on trying to find a better solution, see “When the Solver Succeeds” on
page 4-18.
The exit message indicates that the solution satisfies the constraints. You can check that the solution
is indeed feasible in several ways.
• Check the reported infeasibility in the constrviolation field of the output structure.
infeas = output.constrviolation
infeas = 0
• Compute the infeasibility of the constraint at the solution.

infeas = infeasibility(nlcons,sol)
infeas = 0
• Check that the solution lies in the unit disk by computing its norm, which must be at most 1.

nx = norm(sol.x)
nx = 1.0000
The output structure gives more information on the solution process, such as the number of
iterations (24), the solver (fmincon), and the number of function evaluations (84). For more
information on these statistics, see “Tolerances and Stopping Criteria” on page 2-68.
For more complex expressions, write function files for the objective or constraint functions, and
convert them to optimization expressions using fcn2optimexpr. For example, the basis of the
nonlinear constraint function is in the disk.m file:
type disk

function radsqr = disk(x)

radsqr = x(1)^2 + x(2)^2;
Furthermore, you can also convert the rosenbrock function handle, which was defined at the
beginning of the plotting routine, into an optimization expression.
rosenexpr = fcn2optimexpr(rosenbrock,x);

Create an optimization problem using rosenexpr as the objective function, with the disk function (converted to an optimization expression) as the constraint. Review the problem.

convprob = optimproblem('Objective',rosenexpr,'Constraints',fcn2optimexpr(@disk,x) <= 1);
show(convprob)
OptimizationProblem :
Solve for:
x
minimize :
anonymousFunction2(x)
where:
anonymousFunction2 = @(x)100*(x(:,2)-x(:,1).^2).^2+(1-x(:,1)).^2;
subject to :
disk(x) <= 1
Solve the new problem. The solution is essentially the same as before.
[sol,fval,exitflag,output] = solve(convprob,x0)
fval = 0.0457
exitflag =
OptimalSolution
For the list of supported functions, see “Supported Operations on Optimization Variables and
Expressions” on page 10-36.
See Also
More About
• “Solve a Constrained Nonlinear Problem, Solver-Based” on page 1-11
• “First Choose Problem-Based or Solver-Based Approach” on page 1-3
Solve a Constrained Nonlinear Problem, Solver-Based
For the problem-based approach to this problem, see “Solve a Constrained Nonlinear Problem,
Problem-Based” on page 1-5.
The problem is to minimize Rosenbrock's function,

f(x) = 100(x2 − x1^2)^2 + (1 − x1)^2,

over the unit disk, that is, the disk of radius 1 centered at the origin. In other words, find x that
minimizes the function f(x) over the set x1^2 + x2^2 ≤ 1. This problem is a minimization of a nonlinear
function with a nonlinear constraint.
Note Rosenbrock's function is a standard test function in optimization. It has a unique minimum
value of 0 attained at the point [1,1]. Finding the minimum is a challenge for some algorithms
because the function has a shallow minimum inside a deeply curved valley. The solution for this
problem is not at the point [1,1] because that point does not satisfy the constraint.
This figure shows two views of Rosenbrock's function in the unit disk. The vertical axis is log-scaled;
in other words, the plot shows log(1+f(x)). Contour lines lie beneath the surface plot.
The function f(x) is called the objective function. The objective function is the function you want to
minimize. The inequality x1^2 + x2^2 ≤ 1 is called a constraint. Constraints limit the set of x over which a
solver searches for a minimum. You can have any number of constraints, which are inequalities or
equations.
1 Define the objective function in the MATLAB® language, as a function file or anonymous function.
This example uses a function file.
2 Define the constraints as a separate file or anonymous function.
A function file is a text file that contains MATLAB commands and has the extension .m. Create a
function file in any text editor, or use the built-in MATLAB Editor as in this example.
1 At the MATLAB command line, enter:

edit rosenbrock
2 In the MATLAB Editor, enter:
function f = rosenbrock(x)
f = 100*(x(:,2) - x(:,1).^2).^2 + (1 - x(:,1)).^2;
Note rosenbrock is a vectorized function that can compute values for several points at once.
See “Vectorization” (MATLAB). A vectorized function is best for plotting. For a nonvectorized
version, enter:
function f = rosenbrock1(x)
f = 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
Constraint functions have the form c(x) ≤ 0 or ceq(x) = 0. The constraint x1^2 + x2^2 ≤ 1 is not in the
form that the solver handles. To have the correct syntax, reformulate the constraint as
x1^2 + x2^2 − 1 ≤ 0.
Furthermore, the syntax for nonlinear constraints returns both equality and inequality constraints.
This example includes only an inequality constraint, so you must pass an empty array [] as the
equality constraint function ceq.
With these considerations in mind, write a function file for the nonlinear constraint. Save the following code in a file named unitdisk.m on your MATLAB path:

function [c,ceq] = unitdisk(x)
c = x(1)^2 + x(2)^2 - 1;
ceq = [];
• Use the Optimization app; see “Minimize Rosenbrock's Function Using the Optimization App” on
page 1-13.
• Use command-line functions; see “Minimize Rosenbrock's Function at the Command Line” on
page 1-16.
Note The Optimization app warns that it will be removed in a future release. For alternatives, see
“Optimization App Alternatives” on page 5-12.
1 Start the Optimization app by entering optimtool at the command line. For more information
about this tool, see “Optimization App” on page 5-2.
Ensure that your Problem Setup and Results pane matches this figure.
5 In the Options pane, under Display to command window (at the bottom of the pane), select
iterative from the Level of display list. (If you do not see the option, click Display to
command window.) This setting shows the progress of fmincon in the command window.
6 In the Problem Setup and Results pane, under Run solver and view results, click Start.
The following message appears in the Run solver and view results box:
Optimization running.
Objective function value: 0.04567482475812774
Local minimum found that satisfies the constraints.
Your objective function value can differ slightly, depending on your computer system and version of
Optimization Toolbox.
• The search for a constrained optimum ended because the derivative of the objective function is
nearly 0 in directions allowed by the constraint.
• The constraint is satisfied to the requisite accuracy.
At the bottom of the Problem Setup and Results pane, the minimizer x appears under Final point.
For more information about exit messages, see “Exit Flags and Exit Messages” on page 3-3.
You can run the same optimization from the command line.
1 Create options that choose iterative display and the interior-point algorithm.
options = optimoptions(@fmincon,...
'Display','iter','Algorithm','interior-point');
2 Run the fmincon solver with the options structure, reporting both the location x of the
minimizer and the value fval attained by the objective function.
[x,fval] = fmincon(@rosenbrock,[0 0],...
[],[],[],[],[],[],@unitdisk,options)
The six sets of empty brackets represent optional constraints that are not being used in this
example. See the fmincon function reference pages for the syntax.
x =
0.7864 0.6177
fval =
0.0457
The message tells you that the search for a constrained optimum ended because the derivative of the
objective function is nearly 0 in directions allowed by the constraint, and that the constraint is
satisfied to the requisite accuracy. Several phrases in the message contain links to more information
about the terms used in the message. For more details about these links, see “Enhanced Exit
Messages” on page 3-4.
Your table can differ, depending on toolbox version and computing platform. The following description
applies to the table as displayed.
• The first column, labeled Iter, is the iteration number from 0 to 24. fmincon took 24 iterations
to converge.
• The second column, labeled F-count, reports the cumulative number of times Rosenbrock's
function was evaluated. The final row shows an F-count of 84, indicating that fmincon evaluated
Rosenbrock's function 84 times in the process of finding a minimum.
• The third column, labeled f(x), displays the value of the objective function. The final value,
0.04567482, is the minimum reported in the Optimization app Run solver and view results box,
and at the end of the exit message in the command window.
• The fourth column, Feasibility, is 0 for all iterations. This column shows the value of the
constraint function unitdisk at each iteration where the constraint is positive. Because the value
of unitdisk was negative in all iterations, every iteration satisfied the constraint.
The other columns of the iteration table are described in “Iterative Display” on page 3-14.
See Also
fmincon
More About
• “Solve a Constrained Nonlinear Problem, Problem-Based” on page 1-5
• “First Choose Problem-Based or Solver-Based Approach” on page 1-3
• “Get Started with Optimization Toolbox”
• “Solver-Based Optimization Problem Setup”
Set Up a Linear Program, Solver-Based
The variables and expressions in the problem represent a model of operating a chemical plant, from
an example in Edgar and Himmelblau [1]. There are two videos that describe the problem.
• Mathematical Modeling with Optimization, Part 1 shows the problem in pictorial form. It shows
how to generate the mathematical expressions of “Model Description” on page 1-18 from the
picture.
• Optimization Modeling, Part 2: Converting to Solver Form describes how to convert these
mathematical expressions into Optimization Toolbox solver syntax. This video shows how to solve
the problem, and how to interpret the results.
The remainder of this example is concerned solely with transforming the problem to solver syntax.
The example closely follows the video Optimization Modeling, Part 2: Converting to Solver Form. The
main difference between the video and the example is that this example shows how to use named
variables, or index variables, which are similar to hash keys. This difference is in “Combine Variables
Into One Vector” on page 1-20.
Model Description
The video Mathematical Modeling with Optimization, Part 1 suggests that one way to convert a
problem into mathematical form is to:
For the meaning of the variables in this section, see the video Mathematical Modeling with
Optimization, Part 1.
The optimization problem is to minimize the objective function, subject to all the other expressions as
constraints.
2500 ≤ P1 ≤ 6250
I1 ≤ 192,000
C ≤ 62,000
I1 - HE1 ≤ 132,000
I1 = LE1 + HE1 + C
1359.8 I1 = 1267.8 HE1 + 1251.4 LE1 + 192 C + 3413 P1
3000 ≤ P2 ≤ 9000
I2 ≤ 244,000
LE2 ≤ 142,000
I2 = LE2 + HE2
1359.8 I2 = 1267.8 HE2 + 1251.4 LE2 + 3413 P2
HPS = I1 + I2 + BF1
HPS = C + MPS + LPS
LPS = LE1 + LE2 + BF2
MPS = HE1 + HE2 + BF1 - BF2
P1 + P2 + PP ≥ 24,550
EP + PP ≥ 12,000
MPS ≥ 271,536
LPS ≥ 100,623
All variables are positive.
Solution Method
To solve the optimization problem, take the following steps.
The steps are also shown in the video Optimization Modeling, Part 2: Converting to Solver Form.
Choose a Solver
To find the appropriate solver for this problem, consult the “Optimization Decision Table” on page 2-
4. The table asks you to categorize your problem by type of objective function and types of
constraints. For this problem, the objective function is linear, and the constraints are linear. The
decision table recommends using the linprog solver.
As you see in “Problems Handled by Optimization Toolbox Functions” on page 2-12 or the linprog
function reference page, the linprog solver solves problems of the form
min_x fᵀx such that

A ⋅ x ≤ b,
Aeq ⋅ x = beq,        (1-1)
lb ≤ x ≤ ub.

• fᵀx means a row vector of constants f multiplying a column vector of variables x. In other words,
fᵀx = f(1)x(1) + f(2)x(2) + ... + f(16)x(16).
The syntax of the linprog solver, as shown in its function reference page, is
[x fval] = linprog(f,A,b,Aeq,beq,lb,ub);
The inputs to the linprog solver are the matrices and vectors in “Equation 1-1”.
There are 16 variables in the equations of “Model Description” on page 1-18. Put these variables into
one vector. The name of the vector of variables is x in “Equation 1-1”. Decide on an order, and
construct the components of x out of the variables.
The following code constructs the vector using a cell array of names for the variables.
variables = {'I1','I2','HE1','HE2','LE1','LE2','C','BF1',...
'BF2','HPS','MPS','LPS','P1','P2','PP','EP'};
N = length(variables);
% create variables for indexing
for v = 1:N
eval([variables{v},' = ', num2str(v),';']);
end
Executing these commands creates the following named variables in your workspace:
These named variables represent index numbers for the components of x. You do not have to create
named variables. The video Optimization Modeling, Part 2: Converting to Solver Form shows how to
solve the problem simply using the index numbers of the components of x.
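If you prefer to avoid eval, a small lookup helper (a sketch, not part of the original example) achieves the same name-to-index mapping:

```matlab
% Map a variable name to its position in the variables cell array
idx = @(name) find(strcmp(variables,name));
idx('P1')    % returns 13, the position of 'P1' in variables
```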
There are four variables with lower bounds, and six with upper bounds in the equations of “Model
Description” on page 1-18. The lower bounds:
P1 ≥ 2500
P2 ≥ 3000
MPS ≥ 271,536
LPS ≥ 100,623.
Also, all the variables are positive, which means they have a lower bound of zero.
Create the lower bound vector lb as a vector of 0, then add the four other lower bounds.
lb = zeros(size(variables));
lb([P1,P2,MPS,LPS]) = ...
[2500,3000,271536,100623];
The upper bounds:

P1 ≤ 6250
P2 ≤ 9000
I1 ≤ 192,000
I2 ≤ 244,000
C ≤ 62,000
LE2 ≤ 142000.
Create the upper bound vector as a vector of Inf, then add the six upper bounds.
ub = Inf(size(variables));
ub([P1,P2,I1,I2,C,LE2]) = ...
[6250,9000,192000,244000,62000,142000];
There are three linear inequalities in the equations of “Model Description” on page 1-18:
I1 - HE1 ≤ 132,000
EP + PP ≥ 12,000
P1 + P2 + PP ≥ 24,550.
In order to have the equations in the form A·x ≤ b, put all the variables on the left side of the
inequality. All these equations already have that form. Ensure that each inequality is in "less than"
form by multiplying through by –1 wherever appropriate:
I1 - HE1 ≤ 132,000
-EP - PP ≤ -12,000
-P1 - P2 - PP ≤ -24,550.
In your MATLAB workspace, create the A matrix as a 3-by-16 zero matrix, corresponding to 3 linear
inequalities in 16 variables. Create the b vector with three components.
A = zeros(3,16);
A(1,I1) = 1; A(1,HE1) = -1; b(1) = 132000;
A(2,EP) = -1; A(2,PP) = -1; b(2) = -12000;
A(3,[P1,P2,PP]) = [-1,-1,-1];
b(3) = -24550;
There are eight linear equations in the equations of “Model Description” on page 1-18:
I2 = LE2 + HE2
LPS = LE1 + LE2 + BF2
HPS = I1 + I2 + BF1
HPS = C + MPS + LPS
I1 = LE1 + HE1 + C
MPS = HE1 + HE2 + BF1 - BF2
1359.8 I1 = 1267.8 HE1 + 1251.4 LE1 + 192 C + 3413 P1
1359.8 I2 = 1267.8 HE2 + 1251.4 LE2 + 3413 P2.
In order to have the equations in the form Aeq·x = beq, put all the variables on one side of the
equation. The equations become:
LE2 + HE2 - I2 = 0
LE1 + LE2 + BF2 - LPS = 0
I1 + I2 + BF1 - HPS = 0
C + MPS + LPS - HPS = 0
LE1 + HE1 + C - I1 = 0
HE1 + HE2 + BF1 - BF2 - MPS = 0
1267.8 HE1 + 1251.4 LE1 + 192 C + 3413 P1 - 1359.8 I1 = 0
1267.8 HE2 + 1251.4 LE2 + 3413 P2 - 1359.8 I2 = 0.
Now write the Aeq matrix and beq vector corresponding to these equations. In your MATLAB
workspace, create the Aeq matrix as an 8-by-16 zero matrix, corresponding to 8 linear equations in
16 variables. Create the beq vector with eight components, all zero.
Aeq = zeros(8,16); beq = zeros(8,1);
Aeq(1,[LE2,HE2,I2]) = [1,1,-1];
Aeq(2,[LE1,LE2,BF2,LPS]) = [1,1,1,-1];
Aeq(3,[I1,I2,BF1,HPS]) = [1,1,1,-1];
Aeq(4,[C,MPS,LPS,HPS]) = [1,1,1,-1];
Aeq(5,[LE1,HE1,C,I1]) = [1,1,1,-1];
Aeq(6,[HE1,HE2,BF1,BF2,MPS]) = [1,1,1,-1,-1];
Aeq(7,[HE1,LE1,C,P1,I1]) = [1267.8,1251.4,192,3413,-1359.8];
Aeq(8,[HE2,LE2,P2,I2]) = [1267.8,1251.4,3413,-1359.8];
Create the objective function coefficient vector f. The objective is to minimize
0.002614 HPS + 0.0239 PP + 0.009825 EP, so f is zero except in the HPS, PP, and EP components.

f = zeros(size(variables));
f([HPS PP EP]) = [0.002614 0.0239 0.009825];

You now have all the inputs required by the linprog solver. Call the solver and print the outputs in
formatted form:
options = optimoptions('linprog','Algorithm','dual-simplex');
[x fval] = linprog(f,A,b,Aeq,beq,lb,ub,options);
for d = 1:N
fprintf('%12.2f \t%s\n',x(d),variables{d})
end
fval
The result:
Optimal solution found.
136328.74 I1
244000.00 I2
128159.00 HE1
143377.00 HE2
0.00 LE1
100623.00 LE2
8169.74 C
0.00 BF1
0.00 BF2
380328.74 HPS
271536.00 MPS
100623.00 LPS
6250.00 P1
7060.71 P2
11239.29 PP
760.71 EP
fval =
1.2703e+03
The fval output gives the smallest value of the objective function at any feasible point.
The solution vector x is the point where the objective function has the smallest value. Notice that:
• HPS — 380,328.74
• PP — 11,239.29
• EP — 760.71
The video Optimization Modeling, Part 2: Converting to Solver Form gives interpretations of these
characteristics in terms of the original problem.
Bibliography
[1] Edgar, Thomas F., and David M. Himmelblau. Optimization of Chemical Processes. McGraw-Hill,
New York, 1988.
See Also
More About
• “Set Up a Linear Program, Problem-Based” on page 1-25
• “Solver-Based Optimization Problem Setup”
Set Up a Linear Program, Problem-Based
The variables and expressions in the problem represent a model of operating a chemical plant, from
an example in Edgar and Himmelblau [1]. There are two videos that describe the problem.
• Mathematical Modeling with Optimization, Part 1 shows the problem in pictorial form. It shows
how to generate the mathematical expressions of “Model Description” on page 1-18 from the
picture.
• Optimization Modeling, Part 2: Problem-Based Solution of a Mathematical Model describes how to
convert these mathematical expressions into Optimization Toolbox solver syntax. This video shows
how to solve the problem, and how to interpret the results.
The remainder of this example is concerned solely with transforming the problem to solver syntax.
The example closely follows the video Optimization Modeling, Part 2: Problem-Based Solution of a
Mathematical Model.
Model Description
The video Mathematical Modeling with Optimization, Part 1 suggests that one way to convert a
problem into mathematical form is to:
For the meaning of the variables in this section, see the video Mathematical Modeling with
Optimization, Part 1.
The optimization problem is to minimize the objective function, subject to all the other expressions as
constraints.
2500 ≤ P1 ≤ 6250
I1 ≤ 192,000
C ≤ 62,000
I1 - HE1 ≤ 132,000
I1 = LE1 + HE1 + C
1359.8 I1 = 1267.8 HE1 + 1251.4 LE1 + 192 C + 3413 P1
3000 ≤ P2 ≤ 9000
I2 ≤ 244,000
LE2 ≤ 142,000
I2 = LE2 + HE2
1359.8 I2 = 1267.8 HE2 + 1251.4 LE2 + 3413 P2
HPS = I1 + I2 + BF1
HPS = C + MPS + LPS
LPS = LE1 + LE2 + BF2
MPS = HE1 + HE2 + BF1 - BF2
P1 + P2 + PP ≥ 24,550
EP + PP ≥ 12,000
MPS ≥ 271,536
LPS ≥ 100,623
All variables are positive.
P1 = optimvar('P1','LowerBound',2500,'UpperBound',6250);
P2 = optimvar('P2','LowerBound',3000,'UpperBound',9000);
I1 = optimvar('I1','LowerBound',0,'UpperBound',192000);
I2 = optimvar('I2','LowerBound',0,'UpperBound',244000);
C = optimvar('C','LowerBound',0,'UpperBound',62000);
LE1 = optimvar('LE1','LowerBound',0);
LE2 = optimvar('LE2','LowerBound',0,'UpperBound',142000);
HE1 = optimvar('HE1','LowerBound',0);
HE2 = optimvar('HE2','LowerBound',0);
HPS = optimvar('HPS','LowerBound',0);
MPS = optimvar('MPS','LowerBound',271536);
LPS = optimvar('LPS','LowerBound',100623);
BF1 = optimvar('BF1','LowerBound',0);
BF2 = optimvar('BF2','LowerBound',0);
EP = optimvar('EP','LowerBound',0);
PP = optimvar('PP','LowerBound',0);
I1 - HE1 ≤ 132,000
EP + PP ≥ 12,000
P1 + P2 + PP ≥ 24,550.
I2 = LE2 + HE2
LPS = LE1 + LE2 + BF2
HPS = I1 + I2 + BF1
HPS = C + MPS + LPS
I1 = LE1 + HE1 + C
MPS = HE1 + HE2 + BF1 - BF2
1359.8 I1 = 1267.8 HE1 + 1251.4 LE1 + 192 C + 3413 P1
1359.8 I2 = 1267.8 HE2 + 1251.4 LE2 + 3413 P2.
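These constraints attach to an optimization problem as named fields in problem-based syntax. A sketch of the assembly (the constraint names cons1, econs1, and so on are illustrative, and only some of the constraints are shown):

```matlab
% Assumes the optimvar definitions above; the objective coefficients
% are the ones this example uses for HPS, PP, and EP
linprob = optimproblem('Objective',0.002614*HPS + 0.0239*PP + 0.009825*EP);
linprob.Constraints.cons1 = I1 - HE1 <= 132000;
linprob.Constraints.cons2 = EP + PP >= 12000;
linprob.Constraints.cons3 = P1 + P2 + PP >= 24550;
linprob.Constraints.econs1 = I2 == LE2 + HE2;
linprob.Constraints.econs2 = LPS == LE1 + LE2 + BF2;
% ...and similarly for the remaining equality constraints
```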
Solve Problem
The problem formulation is complete. Solve the problem using solve.
linsol = solve(linprob);
Examine Solution
Evaluate the objective function. (You could have asked for this value when you called solve.)
evaluate(linprob.Objective,linsol)
ans =
1.2703e+03
tbl = struct2table(linsol)
tbl =
1×16 table
This table is too wide to see easily. Stack the variables to get them to a vertical orientation.
vars = {'P1','P2','I1','I2','C','LE1','LE2','HE1','HE2',...
'HPS','MPS','LPS','BF1','BF2','EP','PP'};
outputvars = stack(tbl,vars,'NewDataVariableName','Amt','IndexVariableName','Var')
outputvars =
16×2 table
Var Amt
___ __________
P1 6250
P2 7060.7
I1 1.3633e+05
I2 2.44e+05
C 8169.7
LE1 0
LE2 1.0062e+05
HE1 1.2816e+05
HE2 1.4338e+05
HPS 3.8033e+05
MPS 2.7154e+05
LPS 1.0062e+05
BF1 0
BF2 0
EP 760.71
PP 11239
Notice the values of the variables that appear in the objective function:

• HPS — 380,328.74
• PP — 11,239.29
• EP — 760.71
The video Optimization Modeling, Part 2: Problem-Based Solution of a Mathematical Model gives
interpretations of these characteristics in terms of the original problem.
Create the problem object, include the linear constraints, and solve the problem.
linprob = optimproblem('Objective',0.002614*x('HPS') + 0.0239*x('PP') + 0.009825*x('EP'));
[linsol,fval] = solve(linprob);
tbl = table(vars',linsol.x')
tbl =
16×2 table
Var1 Var2
_____ __________
'P1' 6250
'P2' 7060.7
'I1' 1.3633e+05
'I2' 2.44e+05
'C' 8169.7
'LE1' 0
'LE2' 1.0062e+05
'HE1' 1.2816e+05
'HE2' 1.4338e+05
'HPS' 3.8033e+05
'MPS' 2.7154e+05
'LPS' 1.0062e+05
'BF1' 0
'BF2' 0
'EP' 760.71
'PP' 11239
Bibliography
[1] Edgar, Thomas F., and David M. Himmelblau. Optimization of Chemical Processes. McGraw-Hill,
New York, 1988.
See Also
More About
• “Set Up a Linear Program, Solver-Based” on page 1-18
• “Problem-Based Optimization Setup”
2 Setting Up an Optimization
min_x f(x)

subject to

G_i(x) = 0,  i = 1, ..., m_e
G_i(x) ≤ 0,  i = m_e + 1, ..., m,
where x is the vector of length n design parameters, f(x) is the objective function, which returns a
scalar value, and the vector function G(x) returns a vector of length m containing the values of the
equality and inequality constraints evaluated at x.
An efficient and accurate solution to this problem depends not only on the size of the problem in
terms of the number of constraints and design variables but also on characteristics of the objective
function and constraints. When both the objective function and the constraints are linear functions of
the design variable, the problem is known as a Linear Programming (LP) problem. Quadratic
Programming (QP) concerns the minimization or maximization of a quadratic objective function that
is linearly constrained. For both the LP and QP problems, reliable solution procedures are readily
available. More difficult to solve is the Nonlinear Programming (NP) problem in which the objective
function and constraints can be nonlinear functions of the design variables. A solution of the NP
problem generally requires an iterative procedure to establish a direction of search at each major
iteration. This is usually achieved by the solution of an LP, a QP, or an unconstrained subproblem.
All optimization takes place in real numbers. However, unconstrained least squares problems and
equation-solving can be formulated and solved using complex analytic functions. See “Complex
Numbers in Optimization Toolbox Solvers” on page 2-14.
Optimization Toolbox Solvers
• Minimizers

This group of solvers attempts to find a local minimum of the objective function near a starting
point x0. They address problems of unconstrained optimization, linear programming, quadratic
programming, and general nonlinear programming.
• Multiobjective minimizers on page 2-12
This group of solvers attempts to either minimize the maximum value of a set of functions
(fminimax), or to find a location where a collection of functions is below some prespecified values
(fgoalattain).
• Equation solvers on page 2-13
This group of solvers attempts to find a solution to a scalar- or vector-valued nonlinear equation
f(x) = 0 near a starting point x0. Equation-solving can be considered a form of optimization
because it is equivalent to finding the minimum norm of f(x) near x0.
• Least-Squares (curve-fitting) solvers on page 2-13
This group of solvers attempts to minimize a sum of squares. This type of problem frequently
arises in fitting a model to data. The solvers address problems of finding nonnegative solutions,
bounded or linearly constrained solutions, and fitting parametrized nonlinear models to data.
For more information see “Problems Handled by Optimization Toolbox Functions” on page 2-12. See
“Optimization Decision Table” on page 2-4 for aid in choosing among solvers for minimization.
min_x f(x),
possibly subject to constraints. f(x) is called an objective function. In general, f(x) is a scalar function
of type double, and x is a vector or scalar of type double. However, multiobjective optimization,
equation solving, and some sum-of-squares minimizers, can have vector or matrix objective functions
F(x) of type double. To use Optimization Toolbox solvers for maximization instead of minimization,
see “Maximizing an Objective” on page 2-30.
Write the objective function for a solver in the form of a function file or anonymous function handle.
You can supply a gradient ∇f(x) for many solvers, and you can supply a Hessian for several solvers.
See “Write Objective Function”. Constraints have a special form, as described in “Write Constraints”.
In this table:
• * means relevant solvers are found in Global Optimization Toolbox (Global Optimization Toolbox)
functions (licensed separately from Optimization Toolbox solvers).
• fmincon applies to most smooth objective functions with smooth constraints. It is not listed as a
preferred solver for least squares or linear or quadratic programming because the listed solvers
are usually more efficient.
• The table has suggested functions, but it is not meant to unduly restrict your choices. For
example, fmincon can be effective on some nonsmooth problems.
• The Global Optimization Toolbox ga function can address mixed-integer programming problems.
• The Statistics and Machine Learning Toolbox™ bayesopt function can address low-dimensional
deterministic or stochastic optimization problems with combinations of continuous, integer, or
categorical variables.
Note This table does not list multiobjective solvers nor equation solvers. See “Problems Handled by
Optimization Toolbox Functions” on page 2-12 for a complete list of problems addressed by
Optimization Toolbox functions.
Note Some solvers have several algorithms. For help choosing, see “Choosing the Algorithm” on
page 2-6.
fmincon Algorithms
fmincon has five algorithm options:
• 'interior-point' (default)
• 'trust-region-reflective'
• 'sqp'
• 'sqp-legacy'
• 'active-set'
Recommendations
• Use the 'interior-point' algorithm first.
For help if the minimization fails, see “When the Solver Fails” on page 4-3 or “When the Solver
Might Have Succeeded” on page 4-12.
• To run an optimization again to obtain more speed on small- to medium-sized problems, try
'sqp' next, and 'active-set' last.
• Use 'trust-region-reflective' when applicable. Your problem must include a gradient in the
objective function, and must have only bounds or only linear equality constraints (but not both).
• 'interior-point' handles large, sparse problems, as well as small dense problems. The
algorithm satisfies bounds at all iterations, and can recover from NaN or Inf results. It is a large-
scale algorithm; see “Large-Scale vs. Medium-Scale Algorithms” on page 2-10. The algorithm can
use special techniques for large-scale problems. For details, see Interior-Point Algorithm in
fmincon options.
• 'sqp' satisfies bounds at all iterations. The algorithm can recover from NaN or Inf results. It is
not a large-scale algorithm; see “Large-Scale vs. Medium-Scale Algorithms” on page 2-10.
• 'sqp-legacy' is similar to 'sqp', but usually is slower and uses more memory.
• 'active-set' can take large steps, which adds speed. The algorithm is effective on some
problems with nonsmooth constraints. It is not a large-scale algorithm; see “Large-Scale vs.
Medium-Scale Algorithms” on page 2-10.
• 'trust-region-reflective' requires you to provide a gradient, and allows only bounds or
linear equality constraints, but not both. Within these limitations, the algorithm handles both large
sparse problems and small dense problems efficiently. It is a large-scale algorithm; see “Large-
Scale vs. Medium-Scale Algorithms” on page 2-10. The algorithm can use special techniques to
save memory usage, such as a Hessian multiply function. For details, see Trust-Region-
Reflective Algorithm in fmincon options.
For descriptions of the algorithms, see “Constrained Nonlinear Optimization Algorithms” on page 6-19.
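You select among these algorithms through optimoptions. A minimal sketch (the objective and bounds are illustrative):

```matlab
% Try the default interior-point algorithm, then rerun with 'sqp'
fun = @(x) x(1)^2 + x(2)^2;
opts = optimoptions('fmincon','Algorithm','interior-point');
x = fmincon(fun,[1;1],[],[],[],[],[0;0],[],[],opts);
opts.Algorithm = 'sqp';     % change only the algorithm and solve again
x2 = fmincon(fun,[1;1],[],[],[],[],[0;0],[],[],opts);
```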
fsolve Algorithms
fsolve has three algorithms:
• 'trust-region-dogleg' (default)
• 'trust-region'
• 'levenberg-marquardt'
Recommendations
• Use the 'trust-region-dogleg' algorithm first.
For help if fsolve fails, see “When the Solver Fails” on page 4-3 or “When the Solver Might
Have Succeeded” on page 4-12.
• To solve equations again if you have a Jacobian multiply function, or want to tune the internal
algorithm (see Trust-Region Algorithm in fsolve options), try 'trust-region'.
• Try timing all the algorithms, including 'levenberg-marquardt', to find the algorithm that
works best on your problem.
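The timing comparison that the last recommendation suggests can be sketched as follows (the system of equations is illustrative):

```matlab
% Time each fsolve algorithm on a small square system
fun = @(x)[x(1)^2 + x(2)^2 - 1; x(1) - x(2)];   % root near [0.7071; 0.7071]
x0 = [1;1];
algs = {'trust-region-dogleg','trust-region','levenberg-marquardt'};
for k = 1:numel(algs)
    opts = optimoptions('fsolve','Algorithm',algs{k},'Display','off');
    tic; x = fsolve(fun,x0,opts); t = toc;
    fprintf('%-22s %8.4f s, x = [%.4f %.4f]\n',algs{k},t,x(1),x(2));
end
```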
For descriptions of the algorithms, see “Equation Solving Algorithms” on page 13-2.
fminunc Algorithms
fminunc has two algorithms:
• 'quasi-newton' (default)
• 'trust-region'
Recommendations
• If your objective function includes a gradient, set the Algorithm option to 'trust-region', and
set the SpecifyObjectiveGradient option to true.
• Otherwise, set the Algorithm option to 'quasi-newton'.
For help if the minimization fails, see “When the Solver Fails” on page 4-3 or “When the Solver
Might Have Succeeded” on page 4-12.
For descriptions of the algorithms, see “Unconstrained Nonlinear Optimization Algorithms” on page
6-2.
lsqlin Algorithms

Recommendations
• Try 'interior-point' first.
Tip For better performance when your input matrix C has a large fraction of nonzero entries,
specify C as an ordinary double matrix. Similarly, for better performance when C has relatively
few nonzero entries, specify C as sparse. For data type details, see “Sparse Matrices” (MATLAB).
You can also set the internal linear algebra type by using the 'LinearSolver' option.
• If you have no constraints or only bound constraints, and want higher accuracy, more speed, or
want to use a “Jacobian Multiply Function with Linear Least Squares” on page 12-30, try
'trust-region-reflective'.
• If you have a large number of linear constraints and not a large number of variables, try
'active-set'.
For help if the minimization fails, see “When the Solver Fails” on page 4-3 or “When the Solver
Might Have Succeeded” on page 4-12.
For descriptions of the algorithms, see “Least-Squares (Model Fitting) Algorithms” on page 12-2.
lsqnonlin Algorithms

lsqnonlin has two algorithms:

• 'trust-region-reflective' (default)
• 'levenberg-marquardt'
Recommendations
• Generally, try 'trust-region-reflective' first. If your problem has bounds, you must use
'trust-region-reflective'.
• If your problem has no bounds and is underdetermined (fewer equations than dimensions), use
'levenberg-marquardt'.
For help if the minimization fails, see “When the Solver Fails” on page 4-3 or “When the Solver
Might Have Succeeded” on page 4-12.
For descriptions of the algorithms, see “Least-Squares (Model Fitting) Algorithms” on page 12-2.
linprog Algorithms

Recommendations
Use the 'dual-simplex' algorithm or the 'interior-point' algorithm first.
For help if the minimization fails, see “When the Solver Fails” on page 4-3 or “When the Solver
Might Have Succeeded” on page 4-12.
• Often, the 'dual-simplex' and 'interior-point' algorithms are fast, and use the least
memory.
• The 'interior-point-legacy' algorithm is similar to 'interior-point', but 'interior-
point-legacy' can be slower, less robust, or use more memory.
For descriptions of the algorithms, see “Linear Programming Algorithms” on page 9-2.
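As a sketch, a small linear program solved with the recommended dual-simplex algorithm (the data is illustrative):

```matlab
% Minimize -x1 - 2*x2 subject to x1 + x2 <= 4, x1 - x2 <= 2, x >= 0
f = [-1; -2];
A = [1 1; 1 -1];  b = [4; 2];
lb = [0; 0];
opts = optimoptions('linprog','Algorithm','dual-simplex');
[x,fval] = linprog(f,A,b,[],[],lb,[],opts);   % x = [0; 4], fval = -8
```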
quadprog Algorithms

quadprog has three algorithms:

• 'interior-point-convex' (default)
• 'trust-region-reflective'
• 'active-set'
Recommendations
• If you have a convex problem, or if you don't know whether your problem is convex, use
'interior-point-convex'.
• Tip For better performance when your Hessian matrix H has a large fraction of nonzero entries,
specify H as an ordinary double matrix. Similarly, for better performance when H has relatively
few nonzero entries, specify H as sparse. For data type details, see “Sparse Matrices” (MATLAB).
You can also set the internal linear algebra type by using the 'LinearSolver' option.
• If you have a nonconvex problem with only bounds, or with only linear equalities, use 'trust-
region-reflective'.
• If you have a positive semidefinite problem with a large number of linear constraints and not a
large number of variables, try 'active-set'.
For help if the minimization fails, see “When the Solver Fails” on page 4-3 or “When the Solver
Might Have Succeeded” on page 4-12.
For descriptions of the algorithms, see “Quadratic Programming Algorithms” on page 11-2.
In contrast, medium-scale methods internally create full matrices and use dense linear algebra. If a
problem is sufficiently large, full matrices take up a significant amount of memory, and the dense
linear algebra may require a long time to execute.
Don't let the name “large scale” mislead you; you can use a large-scale algorithm on a small problem.
Furthermore, you do not need to specify any sparse matrices to use a large-scale algorithm. Choose a
medium-scale algorithm to access extra functionality, such as additional constraint types, or possibly
for better performance.
• Run a different algorithm, starting from the interior-point solution. This can fail, because some
algorithms can use excessive memory or time, and all linprog and some quadprog algorithms
do not accept an initial point.
For example, try to minimize the function x when bounded below by 0. Using the fmincon default
interior-point algorithm:
options = optimoptions(@fmincon,'Algorithm','interior-point','Display','off');
x = fmincon(@(x)x,1,[],[],[],[],0,[],[],options)
x =
2.0000e-08
options.Algorithm = 'sqp';
x2 = fmincon(@(x)x,1,[],[],[],[],0,[],[],options)
x2 =

     0
Similarly, solve the same problem using the linprog interior-point-legacy algorithm:
opts = optimoptions(@linprog,'Display','off','Algorithm','interior-point-legacy');
x = linprog(1,[],[],[],[],0,[],1,opts)
x =
2.0833e-13
opts.Algorithm = 'dual-simplex';
x2 = linprog(1,[],[],[],[],0,[],1,opts)
x2 =

     0
In these cases, the interior-point algorithms are less accurate, but the answers are quite close to the
correct answer.
Problems Handled by Optimization Toolbox Functions

Minimization Problems
Multiobjective Problems
• Linear least squares with nonnegativity constraints (lsqnonneg): minimize (1/2)‖C·x − d‖₂² such that x ≥ 0
• Constrained linear least squares (lsqlin): minimize (1/2)‖C·x − d‖₂² such that lb ≤ x ≤ ub
• Nonlinear curve fitting (lsqcurvefit): minimize ‖F(x, xdata) − ydata‖₂² such that lb ≤ x ≤ ub
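As a sketch of the curve-fitting form, fit an exponential model to synthetic data with lsqcurvefit (the model and data are illustrative):

```matlab
rng default                                   % reproducible noise
xdata = linspace(0,3,50)';
ydata = 2*exp(-1.5*xdata) + 0.05*randn(50,1); % synthetic measurements
model = @(p,xdata) p(1)*exp(p(2)*xdata);      % F(x,xdata), with x = p
p0 = [1; -1];
lb = [0; -10];  ub = [10; 0];                 % bound constraints
p = lsqcurvefit(model,p0,xdata,ydata,lb,ub);  % p is near [2; -1.5]
```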
Complex Numbers in Optimization Toolbox Solvers

The least-squares solvers and fsolve can handle objective functions with complex values only under the following restrictions:

• The objective function must be analytic in the complex function sense (for details, see Nevanlinna
and Paatero [1]). For example, the function f(z) = Re(z) – iIm(z) is not analytic, but the function
f(z) = exp(z) is analytic. This restriction automatically holds for lsqlin.
• There must be no constraints, not even bounds. Complex numbers are not well ordered, so it is not
clear what “bounds” might mean. When there are problem bounds, nonlinear least-squares solvers
disallow steps leading to complex values.
• Do not set the FunValCheck option to 'on'. This option immediately halts a solver when the solver
encounters a complex value.
Warning The problem-based approach does not support complex values in an objective function,
nonlinear equalities, or nonlinear inequalities. If a function calculation has a complex value, even as
an intermediate value, the final result can be incorrect.
The least-squares solvers and fsolve try to minimize the squared norm of a vector of function
values. This makes sense even in the presence of complex values.
If you have a non-analytic function or constraints, split the real and imaginary parts of the problem.
For an example, see “Fit a Model to Complex-Valued Data” on page 12-50.
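A minimal sketch of this splitting technique (the complex equation is illustrative):

```matlab
% Solve z^2 = 1 + 2i by splitting the residual into real and imaginary
% parts, which yields a real-valued problem that also permits bounds
cres = @(x) (x(1) + 1i*x(2))^2 - (1 + 2i);   % x(1) = Re(z), x(2) = Im(z)
rres = @(x) [real(cres(x)); imag(cres(x))];  % real-valued residual vector
x = lsqnonlin(rres,[1;1]);                   % x approximates [Re(z), Im(z)]
```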
To get the best (smallest norm) solution, try setting a complex initial point. For example, solving
1 + x4 = 0 fails if you use a real start point:
f = @(x)1+x^4;
x0 = 1;
x = fsolve(f,x0)
No solution found.
fsolve stopped because the problem appears regular as measured by the gradient,
but the vector of function values is not near zero as measured by the
default value of the function tolerance.
x =
1.1176e-08
x0 = 1 + 1i/10;
x = fsolve(f,x0)
Equation solved.
x =
0.7071 + 0.7071i
References
[1] Nevanlinna, Rolf, and V. Paatero. Introduction to Complex Analysis. Addison-Wesley, 1969.
See Also
Related Examples
• “Fit a Model to Complex-Valued Data” on page 12-50
• Scalar objective (fminbnd, fminsearch, fseminf, fzero, among others): see “Writing Scalar Objective Functions” on page 2-17
• Nonlinear least squares (lsqcurvefit, lsqnonlin): see “Writing Vector and Matrix Objective Functions” on page 2-26
• Multivariable equation solving (fsolve): see “Writing Vector and Matrix Objective Functions” on page 2-26
• Multiobjective (fgoalattain, fminimax): see “Writing Vector and Matrix Objective Functions” on page 2-26
• Linear programming (linprog) and mixed-integer linear programming (intlinprog): see “Writing Objective Functions for Linear or Quadratic Problems” on page 2-29
• Linear least squares (lsqlin, lsqnonneg): see “Writing Objective Functions for Linear or Quadratic Problems” on page 2-29
• Quadratic programming (quadprog): see “Writing Objective Functions for Linear or Quadratic Problems” on page 2-29
Writing Scalar Objective Functions
Function Files
A scalar objective function file accepts one input, say x, and returns one real scalar output, say f. The
input x can be a scalar, vector, or matrix on page 2-31. A function file can return more outputs (see
“Including Gradients and Hessians” on page 2-19).
For example, suppose that your objective is a function of three variables x, y, and z:

f(x, y, z) = 3(x − y)^4 + 4(x + z)^2/(1 + x^2 + y^2 + z^2) + cosh(x − 1) + tanh(y + z).

1 Write this function as a file that accepts the vector xin = [x;y;z] and returns f:
function f = myObjective(xin)
f = 3*(xin(1)-xin(2))^4 + 4*(xin(1)+xin(3))^2/(1+norm(xin)^2) ...
+ cosh(xin(1)-1) + tanh(xin(2)+xin(3));
2 Save it as a file named myObjective.m to a folder on your MATLAB path.
3 Check that the function evaluates correctly:
myObjective([1;2;3])
ans =
9.2666
For information on how to include extra parameters, see “Passing Extra Parameters” on page 2-57.
For more complex examples of function files, see “Minimization with Gradient and Hessian Sparsity
Pattern” on page 6-16 or “Minimization with Bound Constraints and Banded Preconditioner” on
page 6-81.
Functions can exist inside other files as local functions (MATLAB) or nested functions (MATLAB).
Using local functions or nested functions can lower the number of distinct files you save. Using
nested functions also lets you access extra parameters, as shown in “Nested Functions” on page 2-58.
For example, suppose you want to minimize the myObjective.m objective function, described in
“Function Files” on page 2-17, subject to the ellipseparabola.m constraint, described in
“Nonlinear Constraints” on page 2-37. Instead of writing two files, myObjective.m and
ellipseparabola.m, write one file that contains both functions as local functions:
function [x,fval] = callObjConstr(x0,options)
% Calls fmincon, using local functions for the objective and constraint
if nargin < 2
    options = optimoptions('fmincon','Algorithm','interior-point');
end
[x,fval] = fmincon(@myObjective,x0,[],[],[],[],[],[], ...
    @ellipseparabola,options);
function f = myObjective(xin)
f = 3*(xin(1)-xin(2))^4 + 4*(xin(1)+xin(3))^2/(1+sum(xin.^2)) ...
+ cosh(xin(1)-1) + tanh(xin(2)+xin(3));
x =
1.1835
0.8345
-1.6439
fval =
0.5383
ans =
104
x =
1.0000
1.0000
fval =
1.2266e-10
For fmincon and fminunc, you can include gradients in the objective function. Generally, solvers are
more robust, and can be slightly faster when you include gradients. See “Benefits of Including
Derivatives” on page 2-24. To also include second derivatives (Hessians), see “Including Hessians”
on page 2-21.
The following table shows which algorithms can use gradients and Hessians.
Tip For most flexibility, write conditionalized code. Conditionalized means that the number of
function outputs can vary, as shown in the following example. Conditionalized code does not error,
no matter how many outputs a solver requests. For example, consider Rosenbrock's function
f(x) = 100(x2 − x1^2)^2 + (1 − x1)^2,

which is described and plotted in “Solve a Constrained Nonlinear Problem, Solver-Based” on page 1-11. The gradient of f(x) is

∇f(x) = [ −400(x2 − x1^2)·x1 − 2(1 − x1) ;  200(x2 − x1^2) ].
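A conditionalized implementation consistent with these formulas computes the gradient only when a solver requests it (a reconstructed sketch):

```matlab
function [f,g] = rosentwo(x)
% Calculate objective f
f = 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
if nargout > 1   % gradient required
    g = [-400*(x(2) - x(1)^2)*x(1) - 2*(1 - x(1));
         200*(x(2) - x(1)^2)];
end
```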
nargout checks the number of arguments that a calling function specifies. See “Find Number of
Function Arguments” (MATLAB).
The fminunc solver, designed for unconstrained optimization, allows you to minimize Rosenbrock's
function. Tell fminunc to use the gradient by setting options:
options = optimoptions(@fminunc,'Algorithm','trust-region',...
'SpecifyObjectiveGradient',true);
[x fval] = fminunc(@rosentwo,[-1;2],options)
Local minimum found.
x =
1.0000
1.0000
fval =
1.9886e-17
If you have a Symbolic Math Toolbox™ license, you can calculate gradients and Hessians
automatically, as described in “Symbolic Math Toolbox™ Calculates Gradients and Hessians” on page
6-94.
Including Hessians
You can include second derivatives with the fmincon 'trust-region-reflective' and
'interior-point' algorithms, and with the fminunc 'trust-region' algorithm. There are
several ways to include Hessian information, depending on the type of information and on the
algorithm.
You must also include gradients (set SpecifyObjectiveGradient to true and, if applicable,
SpecifyConstraintGradient to true) in order to include Hessians.
These algorithms either have no constraints, or have only bound or linear equality constraints.
Therefore the Hessian is the matrix of second derivatives of the objective function.
Include the Hessian matrix as the third output of the objective function. For example, the Hessian
H(x) of Rosenbrock’s function is (see “How to Include Gradients” on page 2-19)

H(x) = [ 1200x1^2 − 400x2 + 2,  −400x1 ;  −400x1,  200 ].
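A conditionalized implementation that returns the objective, gradient, and Hessian of Rosenbrock's function (a sketch consistent with the formulas in this section):

```matlab
function [f,g,H] = rosenboth(x)
% Calculate objective f
f = 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
if nargout > 1   % gradient required
    g = [-400*(x(2) - x(1)^2)*x(1) - 2*(1 - x(1));
         200*(x(2) - x(1)^2)];
    if nargout > 2   % Hessian required
        H = [1200*x(1)^2 - 400*x(2) + 2, -400*x(1);
             -400*x(1), 200];
    end
end
```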
options = optimoptions('fminunc','Algorithm','trust-region',...
'SpecifyObjectiveGradient',true,'HessianFcn','objective');
The Hessian is the Hessian of the Lagrangian, where the Lagrangian L(x,λ) is

L(x,λ) = f(x) + Σ λ_g,i g_i(x) + Σ λ_h,i h_i(x).

g and h are vector functions representing all inequality and equality constraints respectively
(meaning bound, linear, and nonlinear constraints), so the minimization problem is

min_x f(x) subject to g(x) ≤ 0, h(x) = 0.
For details, see “Constrained Optimality Theory” on page 3-12. The Hessian of the Lagrangian is

∇²L(x,λ) = ∇²f(x) + Σ λ_g,i ∇²g_i(x) + Σ λ_h,i ∇²h_i(x).    (Equation 2-2)

You supply this Hessian as a function with the signature
hessian = hessianfcn(x,lambda)
hessian is an n-by-n matrix, sparse or dense, where n is the number of variables. If hessian is large
and has relatively few nonzero entries, save running time and memory by representing hessian as a
sparse matrix. lambda is a structure with the Lagrange multiplier vectors associated with the
nonlinear constraints:
lambda.ineqnonlin
lambda.eqnonlin
fmincon computes the structure lambda and passes it to your Hessian function. hessianfcn must
calculate the sums in “Equation 2-2”. Indicate that you are supplying a Hessian by setting these
options:
options = optimoptions('fmincon','Algorithm','interior-point',...
'SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true,...
'HessianFcn',@hessianfcn);
For example, to include a Hessian for Rosenbrock’s function constrained to the unit disk
x1^2 + x2^2 ≤ 1, notice that the constraint function g(x) = x1^2 + x2^2 − 1 ≤ 0 has gradient and
second derivative matrix

∇g(x) = [ 2x1 ; 2x2 ],   Hg(x) = [ 2 0 ; 0 2 ].
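Combining the objective Hessian with the multiplier-weighted constraint Hessian gives the Hessian of the Lagrangian. A sketch of such a function:

```matlab
function Hout = hessianfcn(x,lambda)
% Hessian of the objective (Rosenbrock's function)
H = [1200*x(1)^2 - 400*x(2) + 2, -400*x(1);
     -400*x(1), 200];
% Hessian of the nonlinear inequality constraint g(x) = x1^2 + x2^2 - 1
Hg = 2*eye(2);
% Hessian of the Lagrangian
Hout = H + lambda.ineqnonlin(1)*Hg;
```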
Save hessianfcn on your MATLAB path. To complete the example, the constraint function including
gradients is
function [c,ceq,gc,gceq] = unitdisk2(x)
c = x(1)^2 + x(2)^2 - 1;
ceq = [];
if nargout > 2
    gc = [2*x(1);2*x(2)];
    gceq = [];
end
fun = @rosenboth;
nonlcon = @unitdisk2;
x0 = [-1;2];
options = optimoptions('fmincon','Algorithm','interior-point',...
'SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true,...
'HessianFcn',@hessianfcn);
[x,fval,exitflag,output] = fmincon(fun,x0,[],[],[],[],[],[],nonlcon,options);
For other examples using an interior-point Hessian, see “fmincon Interior-Point Algorithm with
Analytic Hessian” on page 6-66 and “Symbolic Math Toolbox™ Calculates Gradients and Hessians”
on page 6-94.
Hessian Multiply Function
Instead of a complete Hessian function, both the fmincon interior-point and trust-region-
reflective algorithms allow you to supply a Hessian multiply function. This function gives the
result of a Hessian-times-vector product, without computing the Hessian directly. This can save
memory. The SubproblemAlgorithm option must be 'cg' for a Hessian multiply function to work;
this is the trust-region-reflective default.
W = HessMultFcn(x,lambda,v);
The result W should be the product H*v, where H is the Hessian of the Lagrangian at x (see
“Equation 16-1”), lambda is the Lagrange multiplier (computed by fmincon), and v is a vector of
size n-by-1. Set options as follows:
options = optimoptions('fmincon','Algorithm','interior-point',...
    'SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true,...
    'SubproblemAlgorithm','cg','HessianMultiplyFcn',@HessMultFcn);
Supply the function HessMultFcn, which returns an n-by-1 vector, where n is the number of
dimensions of x. The HessianMultiplyFcn option enables you to pass the result of multiplying
the Hessian by a vector without calculating the Hessian.
• The trust-region-reflective algorithm does not involve lambda:
W = HessMultFcn(H,v);
The result W = H*v. fmincon passes H as the value returned in the third output of the objective
function (see “Hessian for fminunc trust-region or fmincon trust-region-reflective algorithms” on
page 2-21). fmincon also passes v, a vector or matrix with n rows. The number of columns in v
can vary, so write HessMultFcn to accept an arbitrary number of columns. H does not have to be
the Hessian; rather, it can be anything that enables you to calculate W = H*v.
options = optimoptions('fmincon','Algorithm','trust-region-reflective',...
'SpecifyObjectiveGradient',true,'HessianMultiplyFcn',@HessMultFcn);
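In the trust-region-reflective case, a minimal multiply function simply forms the product (a sketch; Hinfo here is whatever the objective function returned as its third output):

```matlab
function W = HessMultFcn(Hinfo,v)
% v has n rows and possibly several columns; return Hinfo*v without
% forming a dense Hessian when Hinfo is a sparse or implicit operator
W = Hinfo*v;
```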
If you do not provide gradients, solvers estimate gradients via finite differences. If you provide
gradients, your solver does not need to perform this finite-difference estimation, so it can save time
and be more accurate, although a finite-difference estimate can be faster for complicated derivatives.
Furthermore, solvers use an approximate Hessian, which can be far from the true Hessian. Providing
a Hessian can yield a solution in fewer iterations. For example, see the end of “Symbolic Math
Toolbox™ Calculates Gradients and Hessians” on page 6-94.
For constrained problems, providing a gradient has another advantage. A solver can reach a point x
such that x is feasible, but, for this x, finite differences around x always lead to an infeasible point.
Suppose further that the objective function at an infeasible point returns a complex output, Inf, NaN,
or error. In this case, a solver can fail or halt prematurely. Providing a gradient allows a solver to
proceed. To obtain this benefit, you might also need to include the gradient of a nonlinear constraint
function, and set the SpecifyConstraintGradient option to true. See “Nonlinear Constraints”
on page 2-37.
The fmincon interior-point algorithm has many options for selecting an input Hessian
approximation. For syntax details, see “Hessian as an Input” on page 16-88. Here are the options,
along with estimates of their relative characteristics.
• Run out of memory — Try 'lbfgs' instead of 'bfgs'. If you can provide your own gradients, try
'fin-diff-grads', and set the SpecifyObjectiveGradient and
SpecifyConstraintGradient options to true.
• Want more efficiency — Provide your own gradients and Hessian. See “Including Hessians” on
page 2-21, “fmincon Interior-Point Algorithm with Analytic Hessian” on page 6-66, and “Symbolic
Math Toolbox™ Calculates Gradients and Hessians” on page 6-94.
The reason 'lbfgs' has only moderate efficiency is twofold. It has relatively expensive Sherman-
Morrison updates. And the resulting iteration step can be somewhat inaccurate due to the 'lbfgs'
limited memory.
The reason 'fin-diff-grads' and HessianMultiplyFcn have only moderate efficiency is that
they use a conjugate gradient approach. They accurately estimate the Hessian of the objective
function, but they do not generate the most accurate iteration step. For more information, see
“fmincon Interior Point Algorithm” on page 6-30, and its discussion of the LDL approach and the
conjugate gradient approach to solving “Equation 6-50”.
See Also
More About
• “Checking Validity of Gradients or Jacobians” on page 2-74
Writing Vector and Matrix Objective Functions

The Jacobian J(x) of a vector function F(x) is defined by

J_ij(x) = ∂F_i(x)/∂x_j.

For example, if

F(x) = [ x1^2 + x2·x3 ; sin(x1 + 2x2 − 3x3) ],

then J(x) is

J(x) = [ 2x1, x3, x2 ; cos(x1 + 2x2 − 3x3), 2cos(x1 + 2x2 − 3x3), −3cos(x1 + 2x2 − 3x3) ].
To indicate to your solver that your objective function includes a Jacobian, set the
SpecifyObjectiveGradient option to true. For example,
options = optimoptions('lsqnonlin','SpecifyObjectiveGradient',true);
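An objective returning the example F(x) and its Jacobian could look like this (a sketch; the function name is illustrative):

```matlab
function [F,J] = myVectorFun(x)
F = [x(1)^2 + x(2)*x(3);
     sin(x(1) + 2*x(2) - 3*x(3))];
if nargout > 1   % Jacobian requested by the solver
    c = cos(x(1) + 2*x(2) - 3*x(3));
    J = [2*x(1), x(3), x(2);
         c,      2*c,  -3*c];
end
```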
Jacobians of Matrix Functions

Represent a matrix objective

F = [ F11 F12 ; F21 F22 ; F31 F32 ]

as a vector f by stacking its columns:

f = [ F11 ; F21 ; F31 ; F12 ; F22 ; F32 ].

The Jacobian of F is defined as the Jacobian of f:

J_ij = ∂f_i/∂x_j.

For example, for one 3-by-2 matrix function F(x) of a two-element vector x, the Jacobian is the
6-by-2 matrix

J(x) = [ x2, x1 ; −4x1^3, 5 ; 0, −2x2 ; 3x1^2, 6x2 ; −x2/x1^2, 1/x1 ; 3x1^2, −4x2^3 ].
Similarly, represent a matrix argument

X = [ x11 x12 ; x21 x22 ]

as a vector x by stacking its columns:

x = [ x11 ; x21 ; x12 ; x22 ].

With

F = [ F11 F12 ; F21 F22 ; F31 F32 ],

and with f the vector form of F as above, the Jacobian of F(X) is defined as the Jacobian of f(x):

J_ij = ∂f_i/∂x_j.
If F is an m-by-n matrix and x is a j-by-k matrix, then the Jacobian is an mn-by-jk matrix.
See Also
More About
• “Checking Validity of Gradients or Jacobians” on page 2-74
Writing Objective Functions for Linear or Quadratic Problems
• linprog and intlinprog: minimize f'·x.

Input the vector f for the objective. See the examples in “Linear Programming and Mixed-Integer
Linear Programming”.
• lsqlin and lsqnonneg: minimize
∥Cx - d∥.
Input the matrix C and the vector d for the objective. See “Nonnegative Linear Least Squares,
Solver-Based” on page 12-24.
• quadprog: minimize

(1/2)·x'·H·x + f'·x.

Input both the vector f and the symmetric matrix H for the objective. See “Quadratic
Programming”.
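A sketch of the quadratic form in use (the data is illustrative):

```matlab
% Minimize (1/2)x'Hx + f'x with simple bounds
H = [2 0; 0 2];
f = [-2; -5];
lb = [-10; -10];  ub = [10; 10];
x = quadprog(H,f,[],[],[],[],lb,ub);   % minimizer is x = [1; 2.5]
```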
Maximizing an Objective
All solvers attempt to minimize an objective function. If you have a maximization problem, that is, a
problem of the form

max_x f(x),

then define g(x) = –f(x) and minimize g. For example, to find the maximum of tan(cos(x)) near
x = 5, call fminunc on the negated function:

[x,fval] = fminunc(@(x)-tan(cos(x)),5)

x = 6.2832

fval = -1.5574
The maximum is 1.5574 (the negative of the reported fval), and occurs at x = 6.2832. This
answer is correct since, to five digits, the maximum is tan(1) = 1.5574, which occurs at
x = 2π = 6.2832.
Matrix Arguments
Optimization Toolbox solvers accept vectors for many arguments, such as the initial point x0, lower
bounds lb, and upper bounds ub. They also accept matrices for these arguments, where matrix
means an array of any size. When your solver arguments are naturally arrays, not vectors, feel free to
provide the arguments as arrays.
• Internally, solvers convert matrix arguments into vectors before processing. For example, x0
becomes x0(:). For an explanation of this syntax, see the A(:) entry in colon, or the "Indexing
with a Single Index" section of “Array Indexing” (MATLAB).
• For output, solvers reshape the solution x to the same size as the input x0.
• When x0 is a matrix, solvers pass x as a matrix of the same size as x0 to both the objective
function and to any nonlinear constraint function.
• “Linear Constraints” on page 2-35, though, take x in vector form, x(:). In other words, a linear
constraint of the form
A·x ≤ b or Aeq·x = beq
takes x as a vector, not a matrix. Ensure that your matrix A or Aeq has the same number of
columns as x0 has elements; otherwise, the solver errors.
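As an illustration of this conversion (the variable values here are arbitrary), linear indexing proceeds down the columns:

```matlab
x0 = [1 2; 3 4];          % matrix-form argument
v = x0(:)                 % column-wise vector form: [1; 3; 2; 4]
X = reshape(v,size(x0))   % recovers the original matrix
```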
See Also
colon
More About
• “Writing Scalar Objective Functions” on page 2-17
• “Bound Constraints” on page 2-34
• “Linear Constraints” on page 2-35
• “Nonlinear Constraints” on page 2-37
• “Array Indexing” (MATLAB)
Types of Constraints
Optimization Toolbox solvers have special forms for constraints:
• “Bound Constraints” on page 2-34 — Lower and upper bounds on individual components: x ≥ l
and x ≤ u.
• “Linear Inequality Constraints” on page 2-35 — A·x ≤ b. A is an m-by-n matrix, which represents
m constraints for an n-dimensional vector x. b is m-dimensional.
• “Linear Equality Constraints” on page 2-36 — Aeq·x = beq. Equality constraints have the same
form as inequality constraints.
• “Nonlinear Constraints” on page 2-37 — c(x) ≤ 0 and ceq(x) = 0. Both c and ceq are scalars or
vectors representing several constraints.
Optimization Toolbox functions assume that inequality constraints are of the form ci(x) ≤ 0 or A x ≤ b.
Express greater-than constraints as less-than constraints by multiplying them by –1. For example, a
constraint of the form ci(x) ≥ 0 is equivalent to the constraint –ci(x) ≤ 0. A constraint of the form
A·x ≥ b is equivalent to the constraint –A·x ≤ –b. For more information, see “Linear Inequality
Constraints” on page 2-35 and “Nonlinear Constraints” on page 2-37.
You can sometimes write constraints in several ways. For best results, use the lowest numbered
constraints possible:
1 Bounds
2 Linear equalities
3 Linear inequalities
4 Nonlinear equalities
5 Nonlinear inequalities
For example, with a constraint 5 x ≤ 20, use a bound x ≤ 4 instead of a linear inequality or nonlinear
inequality.
For information on how to pass extra parameters to constraint functions, see “Passing Extra
Parameters” on page 2-57.
Iterations Can Violate Constraints
In this section...
“Intermediate Iterations can Violate Constraints” on page 2-33
“Algorithms That Satisfy Bound Constraints” on page 2-33
“Solvers and Algorithms That Can Violate Bound Constraints” on page 2-33
For example, if you take a square root or logarithm of x, and x < 0, the result is not real. You can try
to avoid this error by setting 0 as a lower bound on x. Nevertheless, an intermediate iteration can
violate this bound.
Note If you set a lower bound equal to an upper bound, iterations can violate these constraints.
See Also
More About
• “Bound Constraints” on page 2-34
Bound Constraints
Lower and upper bounds limit the components of the solution x.
If you know bounds on the location of an optimum, you can obtain faster and more reliable solutions
by explicitly including these bounds in your problem formulation.
Give bounds as vectors with the same length as x, or as matrices on page 2-31 with the same number
of elements as x.
• If a particular component has no lower bound, use -Inf as the bound; similarly, use Inf if a
component has no upper bound.
• If you have only bounds of one type (upper or lower), you do not need to write the other type. For
example, if you have no upper bounds, you do not need to supply a vector of Infs.
• If only the first m out of n components have bounds, then you need only supply a vector of length
m containing bounds. However, this shortcut causes solvers to throw a warning.
For example, suppose the bounds are
x3 ≥ 8,
x2 ≤ 3.
Express these bounds as
l = [-Inf; -Inf; 8]
u = [Inf; 3] (throws a warning) or u = [Inf; 3; Inf].
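For instance, passing these bounds to fmincon looks like the following sketch, where myfun and the initial point x0 are placeholders (neither is defined in this section):

```matlab
lb = [-Inf; -Inf; 8];
ub = [Inf; 3; Inf];   % full-length form; ub = [Inf; 3] would throw a warning
x = fmincon(@myfun,x0,[],[],[],[],lb,ub);  % myfun, x0 are hypothetical
```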
Tip Use Inf or -Inf instead of a large, arbitrary bound to lower memory usage and increase solver
speed. See “Use Inf Instead of a Large, Arbitrary Bound” on page 4-10.
You need not give gradients for bound constraints; solvers calculate them automatically. Bounds do
not affect Hessians.
For a more complex example of bounds, see “Set Up a Linear Program, Solver-Based” on page 1-18.
See Also
More About
• “Iterations Can Violate Constraints” on page 2-33
Linear Constraints
In this section...
“What Are Linear Constraints?” on page 2-35
“Linear Inequality Constraints” on page 2-35
“Linear Equality Constraints” on page 2-36
For example, suppose that you have the following linear inequalities as constraints:
x1 + x3 ≤ 4,
2x2 – x3 ≥ –2,
x1 – x2 + x3 – x4 ≥ 9.
Here, m = 3 and n = 4.
Express these constraints in the form A·x ≤ b, where

A = [  1   0   1   0
       0  −2   1   0
      −1   1  −1   1 ],

b = [  4
       2
      −9 ].
Notice that the “greater than” inequalities are first multiplied by –1 to put them in “less than”
inequality form. In MATLAB syntax:
A = [1 0 1 0;
0 -2 1 0;
-1 1 -1 1];
b = [4;2;-9];
You do not need to give gradients for linear constraints; solvers calculate them automatically. Linear
constraints do not affect Hessians.
Even if you pass an initial point x0 as a matrix, solvers pass the current point x as a column vector to
linear constraints. See “Matrix Arguments” on page 2-31.
For a more complex example of linear constraints, see “Set Up a Linear Program, Solver-Based” on
page 1-18.
Intermediate iterations can violate linear constraints. See “Iterations Can Violate Constraints” on
page 2-33.
Pass linear equality constraints in the Aeq and beq arguments in the same way as described for the A
and b arguments in “Linear Inequality Constraints” on page 2-35.
See Also
More About
• “Write Constraints”
Nonlinear Constraints
Several optimization solvers accept nonlinear constraints, including fmincon, fseminf,
fgoalattain, fminimax, and the Global Optimization Toolbox solvers ga, gamultiobj,
patternsearch, paretosearch, GlobalSearch, and MultiStart. Nonlinear constraints allow
you to restrict the solution to any region that can be described in terms of smooth functions.
Nonlinear inequality constraints have the form c(x) ≤ 0, where c is a vector of constraints, one
component for each constraint. Similarly, nonlinear equality constraints have the form ceq(x) = 0.
Note Nonlinear constraint functions must return both c and ceq, the inequality and equality
constraint functions, even if they do not both exist. Return an empty entry [] for a nonexistent
constraint.
For example, suppose that you have the following inequalities as constraints:
x1²/9 + x2²/4 ≤ 1,
x2 ≥ x1² − 1.
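A constraint function implementing these two inequalities in the required c(x) ≤ 0 form might look like this (the function name ellipseparabola matches the reference below; the exact file in the original documentation is not reproduced here):

```matlab
function [c,ceq] = ellipseparabola(x)
% Both inequalities rewritten in c(x) <= 0 form
c(1) = x(1)^2/9 + x(2)^2/4 - 1;  % inside the ellipse
c(2) = x(1)^2 - x(2) - 1;        % above the parabola
ceq = [];                        % no nonlinear equality constraints
end
```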
ellipseparabola returns an empty entry [] for ceq, the nonlinear equality constraint function.
Also, the second inequality is rewritten to ≤ 0 form.
x =
-0.2500 -0.9375
Providing a gradient has another advantage. A solver can reach a point x such that x is feasible, but
finite differences around x always lead to an infeasible point. In this case, a solver can fail or halt
prematurely. Providing a gradient allows a solver to proceed.
if nargout > 2
gradc = [2*x(1)/9, 2*x(1); ...
x(2)/2, -1];
gradceq = [];
end
See “Writing Scalar Objective Functions” on page 2-17 for information on conditionalized functions.
The gradient matrix has the form
gradcᵢ,ⱼ = ∂c(j)/∂xᵢ.
The first column of the gradient matrix is associated with c(1), and the second column is associated
with c(2). This derivative form is the transpose of the form of Jacobians.
To have a solver use gradients of nonlinear constraints, indicate that they exist by using
optimoptions:
options = optimoptions(@fmincon,'SpecifyConstraintGradient',true);
If you have a Symbolic Math Toolbox license, you can calculate gradients and Hessians automatically,
as described in “Symbolic Math Toolbox™ Calculates Gradients and Hessians” on page 6-94.
Anonymous functions return just one output. So how can you write an anonymous function as a
nonlinear constraint?
The deal function distributes multiple outputs. For example, suppose that you have the nonlinear
inequalities
x1²/9 + x2²/4 ≤ 1,
x2 ≥ x1² − 1,

and the nonlinear equality

x2 = tanh(x1).
c = @(x)[x(1)^2/9 + x(2)^2/4 - 1;
x(1)^2 - x(2) - 1];
ceq = @(x)tanh(x(1)) - x(2);
nonlinfcn = @(x)deal(c(x),ceq(x));
To minimize the function cosh(x1) + sinh(x2) subject to the constraints in nonlinfcn, use
fmincon.
obj = @(x)cosh(x(1))+sinh(x(2));
opts = optimoptions(@fmincon,'Algorithm','sqp');
z = fmincon(obj,[0;0],[],[],[],[],[],[],nonlinfcn,opts)
z = 2×1
-0.6530
-0.5737
To check how well the resulting point z satisfies the constraints, use nonlinfcn.
[cout,ceqout] = nonlinfcn(z)
cout = 2×1
-0.8704
0
ceqout = 1.1102e-16
z satisfies all the constraints to within the default value of the constraint tolerance
ConstraintTolerance, 1e-6.
For information on anonymous objective functions, see “Anonymous Function Objectives” on page 2-
18.
See Also
GlobalSearch | MultiStart | fgoalattain | fmincon | ga | patternsearch
More About
• “Tutorial for the Optimization Toolbox™” on page 6-36
• “Nonlinear Constraints with Gradients” on page 6-63
Or Instead of And Constraints
These formulations are not logically equivalent, and there is generally no way to express OR
constraints in terms of AND constraints.
Tip Fortunately, nonlinear constraints are extremely flexible. You get OR constraints simply by
setting the nonlinear constraint function to the minimum of the constraint functions.
The reason that you can set the minimum as the constraint is due to the nature of “Nonlinear
Constraints” on page 2-37: you give them as a set of functions that must be negative at a feasible
point. If your constraints are F1(x) ≤ 0 OR F2(x) ≤ 0 OR F3(x) ≤ 0, replace them with the single constraint
c(x) = min(F1(x),F2(x),F3(x)).
Here c(x) ≤ 0 exactly when at least one of the original constraints holds. Because of the minimum operation, c(x) is not smooth, although smoothness is a general requirement for constraint functions. Nevertheless, the method often works.
Note You cannot use the usual bounds and linear constraints in an OR constraint. Instead, convert
your bounds and linear constraints to nonlinear constraint functions, as in this example.
For example, suppose your feasible region is the L-shaped region: x is in the rectangle –1 ≤ x(1) ≤ 1,
0 ≤ x(2) ≤ 1 OR x is in the rectangle 0 ≤ x(1) ≤ 1, –1 ≤ x(2) ≤ 1.
First write a function that is negative when a point x lies inside a given rectangle a ≤ x(1) ≤ b, c ≤ x(2) ≤ d (the helper name rectconstr is an assumption; the original file name is not shown in this excerpt):
function cout = rectconstr(x,a,b,c,d)
% Negative when x is inside the rectangle, positive outside
if (b <= a) || (d <= c)
    error('Give a rectangle a < b, c < d')
end
cout = max([(x(1)-b),(x(2)-d),(a-x(1)),(c-x(2))]);
end
Following the prescription of using the minimum of nonlinear constraint functions, for the L-shaped
region, the nonlinear constraint function is:
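Given the two rectangles described above, one plausible implementation (a sketch; it assumes the earlier fragment is wrapped in a helper named rectconstr(x,a,b,c,d), which is an assumed name) is:

```matlab
function [c,ceq] = rectconstrfcn(x)
% Feasible when x is in EITHER rectangle, so take the minimum
c = min(rectconstr(x,-1,1,0,1), ...  % rectangle -1 <= x(1) <= 1, 0 <= x(2) <= 1
        rectconstr(x,0,1,-1,1));     % rectangle 0 <= x(1) <= 1, -1 <= x(2) <= 1
ceq = [];                            % no nonlinear equality constraint
end
```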
z = zeros(length(x),1); % allocate
for ii = 1:length(x)
[z(ii),~] = rectconstrfcn(x(ii,:));
end
z = reshape(z,size(xx));
surf(xx,yy,z)
colorbar
axis equal
xlabel('x');ylabel('y')
view(0,90)
opts = optimoptions(@fmincon,'Algorithm','interior-point','Display','off');
x0 = [-.5,.6]; % an arbitrary guess
[xsol,fval,eflag] = fmincon(fun,x0,[],[],[],[],[],[],@rectconstrfcn,opts)
xsol =
0.4998 -0.9996
fval =
2.4650e-07
eflag =

     1
Clearly, the solution xsol is inside the L-shaped region. The exit flag is 1, indicating that xsol is a
local minimum.
See Also
fmincon
More About
• “Nonlinear Constraints” on page 2-37
How to Use All Types of Constraints
The problem has five variables, x(1) through x(5). The objective function is a polynomial in the
variables.
The objective function is in the local function myobj(x), which is nested inside the function
fullexample. The code for fullexample appears at the end of this section.
The nonlinear inequality constraints are
x1 − 0.2x2x5 ≤ 71,
0.9x3 − x4² ≤ 67.
The nonlinear constraints are in the local function myconstr(x), which is nested inside the function
fullexample.
The bound constraints are 0 ≤ x3 ≤ 20 and x5 ≥ 1.
There is a linear equality constraint x1 = 0.3x2, which you can write as x1 − 0.3x2 = 0.
There are three linear inequality constraints:
0.1x5 ≤ x4,
x4 ≤ 0.5x5,
0.9x5 ≤ x3.
Represent the bounds and linear constraints as matrices and vectors. The code that creates these
arrays is in the fullexample function. As described in the fmincon “Input Arguments” on page 16-
72 section, the lb and ub vectors represent the constraints
lb ≤ x ≤ ub
A*x ≤ b,
and the matrix Aeq and vector beq represent the linear equality constraints
Aeq*x = beq.
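The constraints listed above translate directly into arrays. This is a sketch of the construction (the original fullexample code is not reproduced in this excerpt); each “greater than” inequality is multiplied by –1 to reach the A*x ≤ b form:

```matlab
% Bounds: 0 <= x3 <= 20, x5 >= 1
lb = [-Inf; -Inf;  0; -Inf; 1];
ub = [ Inf;  Inf; 20;  Inf; Inf];
% Linear inequalities: 0.1*x5 <= x4, x4 <= 0.5*x5, 0.9*x5 <= x3
A = [0  0  0 -1  0.1
     0  0  0  1 -0.5
     0  0 -1  0  0.9];
b = zeros(3,1);
% Linear equality: x1 - 0.3*x2 = 0
Aeq = [1 -0.3 0 0 0];
beq = 0;
```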
Call fullexample to solve the minimization problem subject to all types of constraints.
[x,fval,exitflag] = fullexample
x = 5×1
0.6114
2.0380
1.3948
0.1572
1.5498
fval = 37.3806
exitflag = 1
The exit flag value of 1 indicates that fmincon converges to a local minimum that satisfies all of the
constraints.
This code creates the fullexample function, which contains the nested functions myobj and
myconstr.
[x,fval,exitflag] = fmincon(@myobj,x0,A,b,Aeq,beq,lb,ub,...
@myconstr,opts);
%---------------------------------------------------------
function f = myobj(x)
%---------------------------------------------------------
function [c, ceq] = myconstr(x)
c = [x(1) - 0.2*x(2)*x(5) - 71
0.9*x(3) - x(4)^2 - 67];
ceq = 3*x(2)^2*x(5) + 3*x(1)^2*x(3) - 20.875;
end
end
See Also
More About
• “Write Constraints”
• “Solver-Based Nonlinear Optimization”
You typically use such a function in a simulation. Solvers such as fmincon evaluate the objective and
nonlinear constraint functions separately. This evaluation is wasteful when you use the same
calculation for both results.
To avoid wasting time, have your calculation use a nested function to evaluate the objective and
constraint functions only when needed, by retaining the values of time-consuming calculations. Using
a nested function avoids using global variables, yet lets intermediate results be retained and shared
between the objective and constraint functions.
Note Because of the way ga calls nonlinear constraint functions, the technique in this example
usually does not reduce the number of calls to the objective or constraint functions.
For example, suppose computeall is the expensive (time-consuming) function called by both the
objective function and by the nonlinear constraint functions. Suppose you want to use fmincon as
your optimizer.
Write a function that computes a portion of Rosenbrock’s function f1 and a nonlinear constraint c1
that keeps the solution in a disk of radius 1 around the origin. Rosenbrock’s function is
f(x) = 100(x2 − x1²)² + (1 − x1)²,
which has a unique minimum value of 0 at (1,1). See “Solve a Constrained Nonlinear Problem, Solver-
Based” on page 1-11.
In this example there is no nonlinear equality constraint, so ceq1 = []. Add a pause(1) statement
to simulate an expensive computation.
computeall returns the first part of the objective function. Embed the call to computeall in a
nested function:
function [x,f,eflag,outpt] = runobjconstr(x0,opts)
xLast = []; % Last place computeall was called
myf = [];   % Objective value at xLast
myc = [];   % Nonlinear inequality constraints at xLast
myceq = []; % Nonlinear equality constraints at xLast
fun = @objfun;  % Nested objective function, below
cfun = @constr; % Nested constraint function, below
% Call fmincon
[x,f,eflag,outpt] = fmincon(fun,x0,[],[],[],[],[],[],cfun,opts);
    function y = objfun(x)
        if ~isequal(x,xLast) % Check if computation is necessary
            [myf,myc,myceq] = computeall(x);
            xLast = x;
        end
        % Now compute objective function
        y = myf + 20*(x(3) - x(4)^2)^2 + 5*(1 - x(4))^2;
    end
    function [c,ceq] = constr(x)
        if ~isequal(x,xLast) % Check if computation is necessary
            [myf,myc,myceq] = computeall(x);
            xLast = x;
        end
        c = myc;
        ceq = myceq;
    end
end
Save the nested function as a file named runobjconstr.m on your MATLAB path.
Run the file, timing the call with tic and toc.
opts = optimoptions(@fmincon,'Algorithm','interior-point','Display','off');
x0 = [-1,1,1,2];
tic
[x,fval,exitflag,output] = runobjconstr(x0,opts);
toc
Compare the times to run the solver with and without the nested function. For the run without the
nested function, save myrosen2.m as the objective function file, and constr.m as the constraint:
function y = myrosen2(x)
f1 = computeall(x); % get first part of objective
y = f1 + 20*(x(3) - x(4)^2)^2 + 5*(1 - x(4))^2;
end
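The constr.m file that the text references is not shown in this excerpt. Because the nonlinear constraints come from computeall, a minimal version (a sketch) simply discards the objective output:

```matlab
function [c,ceq] = constr(x)
% Call the expensive function and keep only the constraint outputs
[~,c,ceq] = computeall(x);
end
```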
tic
[x,fval,exitflag,output] = fmincon(@myrosen2,x0,...
[],[],[],[],[],[],@constr,opts);
toc
The solver takes twice as long as before, because it evaluates the objective and constraint separately.
If you have a Parallel Computing Toolbox license, you can save even more time by setting the
UseParallel option to true.
parpool
Starting parallel pool (parpool) using the 'local' profile ... connected to 4 workers.
ans =
Connected: true
NumWorkers: 4
Cluster: local
AttachedFiles: {}
IdleTimeout: 30 minute(s) (30 minutes remaining)
SpmdEnabled: true
opts = optimoptions(opts,'UseParallel',true);
tic
[x,fval,exitflag,output] = runobjconstr(x0,opts);
toc
In this case, enabling parallel computing cuts the computational time in half.
Compare the runs with parallel computing, with and without a nested function:
tic
[x,fval,exitflag,output] = fmincon(@myrosen2,x0,...
[],[],[],[],[],[],@constr,opts);
toc
In this example, computing in parallel but not nested takes about the same time as computing nested
but not parallel. Computing both nested and parallel takes half the time of using either alone.
See Also
More About
• “Objective and Constraints Having a Common Function in Serial or Parallel, Problem-Based” on
page 2-52
• “Solve a Constrained Nonlinear Problem, Solver-Based” on page 1-11
• “Optimizing a Simulation or Ordinary Differential Equation” on page 4-26
• “Parallel Computing”
You typically use such a function in a simulation. Solvers usually evaluate the objective and nonlinear
constraint functions separately. This evaluation is wasteful when you use the same calculation for
both results.
This example also shows the effect of parallel computation on solver speed. For time-consuming
functions, computing in parallel can speed the solver, as can avoiding calling the time-consuming
function repeatedly at the same point. Using both techniques together speeds the solver the most.
The computeall function returns outputs that are part of the objective and nonlinear constraints.
type computeall
x = optimvar('x',4);
Convert the computeall function to an optimization expression. To save time during the
optimization, use the 'ReuseEvaluation' name-value pair. To save time for the solver to determine
the output expression sizes (this happens only once), set the 'OutputSize' name-value pair to [1
1], indicating that both f and c are scalar.
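Based on this description and the 'ReuseEvaluation',false call shown later, the conversion presumably looks like the following sketch, with the objective expression built from the first output:

```matlab
[f,c] = fcn2optimexpr(@computeall,x,'ReuseEvaluation',true,'OutputSize',[1,1]);
obj = f + 20*(x(3) - x(4)^2)^2 + 5*(1 - x(4))^2; % complete objective expression
```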
cons = c <= 0;
prob = optimproblem('Objective',obj);
prob.Constraints.cons = cons;
show(prob)
OptimizationProblem :
Solve for:
x
minimize :
((arg3 + (20 .* (x(3) - x(4).^2).^2)) + (5 .* (1 - x(4)).^2))
where:
[arg3,~] = computeall(x);
subject to cons:
arg_LHS <= 0
where:
[~,arg_LHS] = computeall(x);
Solve Problem
Monitor the time it takes to solve the problem starting from the initial point x0.
x0.x = [-1;1;1;2];
x0.x = x0.x/norm(x0.x); % Feasible initial point
tic
[sol,fval,exitflag,output] = solve(prob,x0)
fval = 0.7107
exitflag =
OptimalSolution
algorithm: 'interior-point'
firstorderopt: 4.0000e-07
cgiterations: 7
message: '↵Local minimum found that satisfies the constraints.↵↵Optimization complete
solver: 'fmincon'
time1 = toc
time1 = 149.9299
The number of seconds for the solution is just over the number of function evaluations, which
indicates that the solver computed each evaluation only once.
fprintf("The number of seconds to solve was %g, and the number of evaluation points was %g.\n",time1,output.funcCount)
The number of seconds to solve was 149.93, and the number of evaluation points was 149.
If, instead, you do not call fcn2optimexpr using 'ReuseEvaluation', then the solution time
doubles.
[f2,c2] = fcn2optimexpr(@computeall,x,'ReuseEvaluation',false);
obj2 = f2 + 20*(x(3) - x(4)^2)^2 + 5*(1 - x(4))^2;
cons2 = c2 <= 0;
prob2 = optimproblem('Objective',obj2);
prob2.Constraints.cons2 = cons2;
tic
[sol2,fval2,exitflag2,output2] = solve(prob2,x0);
time2 = toc
time2 = 298.4493
Parallel Processing
If you have a Parallel Computing Toolbox™ license, you can save even more time by computing in
parallel. To do so, set options to use parallel processing, and call solve with options.
options = optimoptions(prob,'UseParallel',true);
tic
[sol3,fval3,exitflag3,output3] = solve(prob,x0,'Options',options);
time3 = toc
time3 = 74.7043
Using parallel processing and 'ReuseEvaluation' together provides a faster solution than using
'ReuseEvaluation' alone. See how long it takes to solve the problem using parallel processing
alone.
tic
[sol4,fval4,exitflag4,output4] = solve(prob2,x0,'Options',options);
time4 = toc
time4 = 145.5278
timingtable = table([time1;time2;time3;time4],...
'RowNames',["Reuse Serial";"No Reuse Serial";"Reuse Parallel";"No Reuse Parallel"])
timingtable=4×1 table
                          Var1 
                         ______
    Reuse Serial         149.93
    No Reuse Serial      298.45
    Reuse Parallel       74.704
    No Reuse Parallel    145.53
For this problem, on a computer with a 6-core processor, computing in parallel takes about half the
time of computing in serial, and computing with 'ReuseEvaluation' takes about half the time of
computing without 'ReuseEvaluation'. Computing in parallel with 'ReuseEvaluation' takes
about a quarter of the time of computing in serial without 'ReuseEvaluation'.
See Also
fcn2optimexpr
More About
• “Objective and Nonlinear Constraints in the Same Function” on page 2-48
Passing Extra Parameters
Global variables are troublesome because they do not allow names to be reused among functions. It is
better to use one of the other two methods.
Generally, for problem-based optimization, you pass extra parameters in a natural manner. See “Pass
Extra Parameters in Problem-Based Approach” on page 10-11.
Suppose you want to minimize the function
f(x) = (a − b·x1² + x1⁴/3)·x1² + x1·x2 + (−c + c·x2²)·x2²
for different values of a, b, and c. Solvers accept objective functions that depend only on a single
variable (x in this case). The following sections show how to provide the additional parameters a, b,
and c. The solutions are for parameter values a = 4, b = 2.1, and c = 4 near x0 = [0.5 0.5] using
fminunc.
Anonymous Functions
To pass parameters using anonymous functions:
1 Write a function file containing the following code:
function y = parameterfun(x,a,b,c)
y = (a - b*x(1)^2 + x(1)^4/3)*x(1)^2 + x(1)*x(2) + ...
(-c + c*x(2)^2)*x(2)^2;
2 Assign values to the parameters and define a function handle f to an anonymous function by
entering the following commands at the MATLAB prompt:
a = 4; b = 2.1; c = 4; % Assign parameter values
x0 = [0.5,0.5];
f = @(x)parameterfun(x,a,b,c);
3 Call the solver:
[x,fval] = fminunc(f,x0)
x =
-0.0898 0.7127
fval =
-1.0316
Note The parameters passed in the anonymous function are those that exist at the time the
anonymous function is created. Consider the example
a = 4; b = 2.1; c = 4;
f = @(x)parameterfun(x,a,b,c)
[x,fval] = fminunc(f,x0)
You get the same answer as before, since parameterfun uses a = 4, the value when f was created.
To change the parameters that are passed to the function, renew the anonymous function by
reentering it:
a = 3;
f = @(x)parameterfun(x,a,b,c)
You can create anonymous functions of more than one argument. For example, to use lsqcurvefit,
first create a function that takes two input arguments, x and xdata:
fh = @(x,xdata)(sin(x).*xdata +(x.^2).*cos(xdata));
x = pi; xdata = pi*[4;2;3];
fh(x, xdata)
ans =
9.8696
9.8696
-9.8696
Nested Functions
To pass the parameters for “Equation 2-3” via a nested function, write a single file that accepts the
parameters as inputs, contains the objective as a nested function, and calls fminunc.
Here is the code for the function file for this example:
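The file itself is not included in this excerpt. A sketch consistent with the surrounding description (the outer function name runnested is an assumption; the nested function name nestedfun is referenced below):

```matlab
function [x,fval] = runnested(a,b,c,x0)
% Call fminunc on the nested objective, which has access to a, b, and c
[x,fval] = fminunc(@nestedfun,x0);
    % Nested objective function
    function y = nestedfun(x)
        y = (a - b*x(1)^2 + x(1)^4/3)*x(1)^2 + x(1)*x(2) + ...
            (-c + c*x(2)^2)*x(2)^2;
    end
end
```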
The objective function is the nested function nestedfun, which has access to the variables a, b, and
c.
Global Variables
Global variables can be troublesome, so it is better to avoid using them. Also, global variables fail in
parallel computations. See “Factors That Affect Results” on page 14-13.
To use global variables, declare the variables to be global in the workspace and in the functions that
use the variables.
1 Write a function file that declares and uses the global variables:
function y = globalfun(x)
global a b c
y = (a - b*x(1)^2 + x(1)^4/3)*x(1)^2 + x(1)*x(2) + ...
(-c + c*x(2)^2)*x(2)^2;
2 In your MATLAB workspace, define the variables and run fminunc:
global a b c;
a = 4; b = 2.1; c = 4; % Assign parameter values
x0 = [0.5,0.5];
[x,fval] = fminunc(@globalfun,x0)
See Also
More About
• “Solver-Based Optimization Problem Setup”
• “Pass Extra Parameters in Problem-Based Approach” on page 10-11
Options simplify solver syntax; you do not have to include many name-value pairs in a call to a solver.
To see how to set and change options, see “Set and Change Options” on page 2-62.
For an overview of all options, including which solvers use each option, see “Optimization Options
Reference” on page 15-6.
Options in Common Use: Tuning and Troubleshooting
To tune your solver for improved speed or accuracy, start by setting the most commonly used options, such as the Algorithm, Display, and tolerance options.
See Also
optimoptions | optimset
Related Examples
• “Improve Results”
More About
• “Solver Outputs and Iterative Display”
options = optimoptions('fmincon',...
'Algorithm','sqp','Display','iter','ConstraintTolerance',1e-12);
Note Use optimset instead of optimoptions for the fminbnd, fminsearch, fzero, and
lsqnonneg solvers. These are the solvers that do not require an Optimization Toolbox license.
After creating options, you can change option values in two ways:
• Dot notation. For example,
options.StepTolerance = 1e-10;
• optimoptions. For example,
options = optimoptions(options,'StepTolerance',1e-10);
• Reset an option to default using resetoptions. For example,
options = resetoptions(options,'StepTolerance');
Reset more than one option at a time by passing a cell array of option names, such as
{'Algorithm','StepTolerance'}.
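For example, to reset both options at once:

```matlab
options = resetoptions(options,{'Algorithm','StepTolerance'});
```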
Note Ensure that you pass options in your solver call. For example,
[x,fval] = fmincon(@objfun,x0,[],[],[],[],lb,ub,@nonlcon,options);
You can also set and change options using the “Optimization App” on page 5-2.
See Also
More About
• “Optimization Options Reference” on page 15-6
Choose Between optimoptions and optimset
optimset still works, and it is the only way to set options for solvers that are available without an
Optimization Toolbox license: fminbnd, fminsearch, fzero, and lsqnonneg.
Note Some other toolboxes use optimization options and require you to pass in options created using
optimset, not optimoptions. Check the documentation for your toolboxes.
optimoptions organizes options by solver, with a more focused and comprehensive display than
optimset:
• For optimoptions, you include the solver name as the first argument.
options = optimoptions(SolverName,Name,Value,...)
• For optimset, the syntax does not include the solver name.
options = optimset(Name,Value,...)
In both cases, you can query or change options by using dot notation. See “Set and Change Options”
on page 2-62 and “View Options” on page 2-66.
For example, create options for fminunc with an objective gradient supplied, and display them:

options = optimoptions('fminunc','SpecifyObjectiveGradient',true)

options =
fminunc options:
Set properties:
SpecifyObjectiveGradient: 1
Default properties:
Algorithm: 'trust-region'
CheckGradients: 0
Display: 'final'
FiniteDifferenceStepSize: 'sqrt(eps)'
FiniteDifferenceType: 'forward'
FunctionTolerance: 1.0000e-06
HessianFcn: []
HessianMultiplyFcn: []
MaxFunctionEvaluations: '100*numberOfVariables'
MaxIterations: 400
OptimalityTolerance: 1.0000e-06
OutputFcn: []
PlotFcn: []
StepTolerance: 1.0000e-06
SubproblemAlgorithm: 'cg'
TypicalX: 'ones(numberOfVariables,1)'
options = optimset('GradObj','on')
options =
Display: []
MaxFunEvals: []
MaxIter: []
TolFun: []
TolX: []
FunValCheck: []
OutputFcn: []
PlotFcns: []
ActiveConstrTol: []
Algorithm: []
AlwaysHonorConstraints: []
DerivativeCheck: []
Diagnostics: []
DiffMaxChange: []
DiffMinChange: []
FinDiffRelStep: []
FinDiffType: []
GoalsExactAchieve: []
GradConstr: []
GradObj: 'on'
HessFcn: []
Hessian: []
HessMult: []
HessPattern: []
HessUpdate: []
InitBarrierParam: []
InitTrustRegionRadius: []
Jacobian: []
JacobMult: []
JacobPattern: []
LargeScale: []
MaxNodes: []
MaxPCGIter: []
MaxProjCGIter: []
MaxSQPIter: []
MaxTime: []
MeritFunction: []
MinAbsMax: []
NoStopIfFlatInfeas: []
ObjectiveLimit: []
PhaseOneTotalScaling: []
Preconditioner: []
PrecondBandWidth: []
RelLineSrchBnd: []
RelLineSrchBndDuration: []
ScaleProblem: []
Simplex: []
SubproblemAlgorithm: []
TolCon: []
TolConSQP: []
TolGradCon: []
TolPCG: []
TolProjCG: []
TolProjCGAbs: []
TypicalX: []
UseParallel: []
See Also
More About
• “Set Options”
View Options
optimoptions “hides” some options, meaning it does not display their values. For example, it hides
the DiffMinChange option.
options = optimoptions('fsolve','DiffMinChange',1e-3)
options =
fsolve options:
Set properties:
No options set.
Default properties:
Algorithm: 'trust-region-dogleg'
CheckGradients: 0
Display: 'final'
FiniteDifferenceStepSize: 'sqrt(eps)'
FiniteDifferenceType: 'forward'
FunctionTolerance: 1.0000e-06
MaxFunctionEvaluations: '100*numberOfVariables'
MaxIterations: 400
OptimalityTolerance: 1.0000e-06
OutputFcn: []
PlotFcn: []
SpecifyObjectiveGradient: 0
StepTolerance: 1.0000e-06
TypicalX: 'ones(numberOfVariables,1)'
UseParallel: 0
You can view the value of any option, including “hidden” options, by using dot notation. For example,
options.DiffMinChange
ans =
1.0000e-03
Options are typically “hidden” for one of two reasons:
• There are better ways. For example, the FiniteDifferenceStepSize option supersedes both
the DiffMinChange and DiffMaxChange options. Therefore, both DiffMinChange and
DiffMaxChange are “hidden”.
• They are rarely used, or are difficult to set appropriately. For example, the fmincon MaxSQPIter
option is recondite and hard to choose, and so is “hidden”.
• For a list of hidden options, see “Hidden Options” on page 15-16.
See Also
More About
• “Optimization Options Reference” on page 15-6
Set tolerances and other criteria using optimoptions as explained in “Set and Change Options” on
page 2-62.
Tip Generally set tolerances such as OptimalityTolerance and StepTolerance to be well above
eps, and usually above 1e-14. Setting small tolerances does not always result in accurate results.
Instead, a solver can fail to recognize when it has converged, and can continue futile iterations. A
tolerance value smaller than eps effectively disables that stopping condition. This tip does not apply
to fzero, which uses a default value of eps for the TolX tolerance.
You can find the default tolerances in the “Optimization App” on page 5-2. Some default tolerances
differ for different algorithms, so set both the solver and the algorithm.
options = optimoptions('fmincon');
[options.OptimalityTolerance,options.FunctionTolerance,options.StepTolerance]
ans =

   1.0e-06 *

    1.0000    1.0000    0.0001
You can also find the default tolerances in the options section of the solver function reference page.
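For example, because the default StepTolerance differs between fmincon algorithms, set the algorithm first and then query the option (the displayed values can differ between releases):

```matlab
optsIP  = optimoptions('fmincon','Algorithm','interior-point');
optsSQP = optimoptions('fmincon','Algorithm','sqp');
[optsIP.StepTolerance, optsSQP.StepTolerance]  % algorithm-dependent defaults
```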
• StepTolerance is a lower bound on the size of a step, meaning the norm of (xi – xi+1). If the
solver attempts to take a step that is smaller than StepTolerance, the iterations end.
StepTolerance is generally used as a relative bound, meaning iterations end when |(xi – xi+1)| <
StepTolerance*(1 + |xi|), or a similar relative measure. See “Tolerance Details” on page 2-71.
• For some algorithms, FunctionTolerance is a lower bound on the change in the value of the
objective function during a step. For those algorithms, if |f(xi) – f(xi+1)| < FunctionTolerance,
the iterations end. FunctionTolerance is generally used as a relative bound, meaning iterations
end when |f(xi) – f(xi+1)| < FunctionTolerance*(1 + |f(xi)|), or a similar relative measure. See
“Tolerance Details” on page 2-71.
Note Unlike other solvers, fminsearch stops when it satisfies both TolFun (the function
tolerance) and TolX (the step tolerance).
• OptimalityTolerance is a tolerance for the first-order optimality measure. If the optimality
measure is less than OptimalityTolerance, the iterations end. OptimalityTolerance can
also be a relative bound on the first-order optimality measure. See “Tolerance Details” on page 2-
71. First-order optimality measure is defined in “First-Order Optimality Measure” on page 3-11.
• ConstraintTolerance is an upper bound on the magnitude of any constraint functions. If a
solver returns a point x with c(x) > ConstraintTolerance or |ceq(x)|
> ConstraintTolerance, the solver reports that the constraints are violated at x.
ConstraintTolerance can also be a relative bound. See “Tolerance Details” on page 2-71.
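For example, the following sketch tightens several of these tolerances for a single fmincon run. The specific values are illustrative, not recommendations:

```matlab
% Tighten stopping tolerances (illustrative values)
options = optimoptions('fmincon', ...
    'OptimalityTolerance',1e-8, ...   % first-order optimality must fall below 1e-8
    'StepTolerance',1e-12, ...        % steps smaller than 1e-12 end the iterations
    'ConstraintTolerance',1e-8);      % constraint violations above 1e-8 are reported
```

Tighter tolerances generally cost more iterations; loosen them instead if a solver takes too long to stop.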
There are two other tolerances that apply to particular solvers: TolPCG and MaxPCGIter. These
relate to preconditioned conjugate gradient steps. For more information, see “Preconditioned
Conjugate Gradient Method” on page 6-21.
There are several tolerances that apply only to the fmincon interior-point algorithm. For more
information, see Interior-Point Algorithm in fmincon options.
There are several tolerances that apply only to intlinprog. See “Some “Integer” Solutions Are Not
Integers” on page 9-36 and “Branch and Bound” on page 9-31.
2 Setting Up an Optimization
See Also
More About
• “Tolerance Details” on page 2-71
• “Optimization Options Reference” on page 15-6
Tolerance Details
Optimization Toolbox solvers use tolerances to decide when to stop iterating and to measure solution
quality. See “Tolerances and Stopping Criteria” on page 2-68.
For the four most important tolerances, this section describes which tolerances are relative, meaning that they scale in some sense with problem size or values, and which are absolute, meaning that they do not scale with the problem. In the following table,
• R means Relative.
• A means Absolute.
• . means inapplicable.
• A* means Absolute when the tolerances are checked; however, preprocessing can scale the entries
to some extent, so the tolerances can be considered relative.
• A*, R means the constraints are first checked as Absolute. If this check passes, the solver returns
a positive exit flag. If this check fails then the constraints are checked as Relative. If this check
passes, the solver returns a positive exit flag with "poor feasibility". If this check fails, the solver
returns a negative exit flag.
See Also
More About
• “Tolerances and Stopping Criteria” on page 2-68
Checking Validity of Gradients or Jacobians
• If a component of the gradient function is less than 1, “match” means the absolute difference of
the gradient function and the finite-difference approximation of that component is less than 1e-6.
• Otherwise, “match” means that the relative difference is less than 1e-6.
The CheckGradients option causes the solver to check the supplied derivative against a finite-
difference approximation at just one point. If the finite-difference and supplied derivatives do not
match, the solver errors. If the derivatives match to within 1e-6, the solver reports the calculated
differences, and continues iterating without further derivative checks. Solvers check the match at a
point that is a small random perturbation of the initial point x0, modified to be within any bounds.
Solvers do not include the computations for CheckGradients in the function count; see “Iterations
and Function Counts” on page 3-9.
Central finite differences are more accurate than the default forward finite differences. To use central
finite differences:
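A sketch of the setting, using the FiniteDifferenceType option:

```matlab
% Replace the default 'forward' finite differences with 'central'
options = optimoptions(@fmincon,'FiniteDifferenceType','central');
```

Central differences cost roughly twice as many function evaluations per derivative estimate, but are accurate to a higher order.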
Consider the problem of minimizing the Rosenbrock function within the unit disk as described in
“Solve a Constrained Nonlinear Problem, Solver-Based” on page 1-11. The rosenboth function
calculates the objective function and its gradient:
function [f,g,H] = rosenboth(x)
f = 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
if nargout > 1
    g = [-400*(x(2)-x(1)^2)*x(1)-2*(1-x(1));
          200*(x(2)-x(1)^2)];
    if nargout > 2
        H = [1200*x(1)^2-400*x(2)+2, -400*x(1);
             -400*x(1), 200];
    end
end
rosenboth calculates the Hessian, too, but this example does not use the Hessian.
The unitdisk2 function correctly calculates the constraint function and its gradient:

function [c,ceq,gc,gceq] = unitdisk2(x)
c = x(1)^2 + x(2)^2 - 1;
ceq = [ ];
if nargout > 2
    gc = [2*x(1);2*x(2)];
    gceq = [];
end

The erroneous unitdiskb function returns a constraint gradient that is off by a factor of 2:

function [c,ceq,gc,gceq] = unitdiskb(x)
c = x(1)^2 + x(2)^2 - 1;
ceq = [ ];
if nargout > 2
    gc = [x(1);x(2)]; % Gradient incorrect: off by a factor of 2
    gceq = [];
end
1 Set the options to use the interior-point algorithm, gradient of objective and constraint functions,
and the CheckGradients option:
% For reproducibility--CheckGradients randomly perturbs the initial point
rng(0,'twister');
options = optimoptions(@fmincon,'Algorithm','interior-point',...
'CheckGradients',true,'SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true);
2 Solve the minimization with fmincon using the erroneous unitdiskb constraint function:
The supplied constraint gradient does not match the finite-difference approximation, so the solver stops with an error, prompting you to check the constraint function for a mistake.
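The call itself can look like the following sketch; the starting point is an assumption for illustration:

```matlab
% Starting point [-1;2] is assumed for illustration
[x,fval,exitflag] = fmincon(@rosenboth,[-1;2], ...
    [],[],[],[],[],[],@unitdiskb,options);
```

Because CheckGradients is enabled and the unitdiskb gradient is wrong, this call stops with an error before any optimization iterations run.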
3 Replace the unitdiskb constraint function with unitdisk2 and run the minimization again:
____________________________________________________________
Derivative Check Information
Note The Optimization app warns that it will be removed in a future release.
To set up the example using correct derivative functions, but starting from [0 0], using the
Optimization app:
4 Press the Start button under Run solver and view results.
The forward finite difference approximation is inaccurate enough near [0 0] that the derivative
check fails.
5 To use the more accurate central differences, select central differences in the
Approximated derivatives > Type pane:
6 Click Run solver and view results > Clear Results, then Start. This time the derivative check
is successful:
The derivative check also succeeds when you select the initial point [-1 2], or most random points.
Bibliography
[1] Biggs, M.C., “Constrained Minimization Using Recursive Quadratic Programming,” Towards
Global Optimization (L.C.W. Dixon and G.P. Szergo, eds.), North-Holland, pp 341–349, 1975.
[2] Brayton, R.K., S.W. Director, G.D. Hachtel, and L. Vidigal, “A New Algorithm for Statistical Circuit
Design Based on Quasi-Newton Methods and Function Splitting,” IEEE Transactions on
Circuits and Systems, Vol. CAS-26, pp 784–794, Sept. 1979.
[3] Broyden, C.G., “The Convergence of a Class of Double-rank Minimization Algorithms,” J. Inst.
Maths. Applics., Vol. 6, pp 76–90, 1970.
[4] Conn, A.R., N.I.M. Gould, and Ph.L. Toint, Trust-Region Methods, MPS/SIAM Series on
Optimization, SIAM and MPS, 2000.
[5] Dantzig, G., Linear Programming and Extensions, Princeton University Press, Princeton, 1963.
[6] Dantzig, G.B., A. Orden, and P. Wolfe, “Generalized Simplex Method for Minimizing a Linear Form
Under Linear Inequality Restraints,” Pacific Journal Math., Vol. 5, pp. 183–195, 1955.
[7] Davidon, W.C., “Variable Metric Method for Minimization,” A.E.C. Research and Development
Report, ANL-5990, 1959.
[8] Dennis, J.E., Jr., “Nonlinear least-squares,” State of the Art in Numerical Analysis ed. D. Jacobs,
Academic Press, pp 269–312, 1977.
[9] Dennis, J.E., Jr. and R.B. Schnabel, Numerical Methods for Unconstrained Optimization and
Nonlinear Equations, Prentice-Hall Series in Computational Mathematics, Prentice-Hall,
1983.
[10] Fleming, P.J., “Application of Multiobjective Optimization to Compensator Design for SISO
Control Systems,” Electronics Letters, Vol. 22, No. 5, pp 258–259, 1986.
[11] Fleming, P.J., “Computer-Aided Control System Design of Regulators using a Multiobjective
Optimization Approach,” Proc. IFAC Control Applications of Nonlinear Prog. and Optim.,
Capri, Italy, pp 47–52, 1985.
[12] Fletcher, R., “A New Approach to Variable Metric Algorithms,” Computer Journal, Vol. 13, pp
317–322, 1970.
[13] Fletcher, R., “Practical Methods of Optimization,” John Wiley and Sons, 1987.
[14] Fletcher, R. and M.J.D. Powell, “A Rapidly Convergent Descent Method for Minimization,”
Computer Journal, Vol. 6, pp 163–168, 1963.
[15] Forsythe, G.F., M.A. Malcolm, and C.B. Moler, Computer Methods for Mathematical
Computations, Prentice Hall, 1976.
[16] Gembicki, F.W., “Vector Optimization for Control with Performance and Parameter Sensitivity
Indices,” Ph.D. Thesis, Case Western Reserve Univ., Cleveland, Ohio, 1974.
[17] Gill, P.E., W. Murray, M.A. Saunders, and M.H. Wright, “Procedures for Optimization Problems
with a Mixture of Bounds and General Linear Constraints,” ACM Trans. Math. Software, Vol.
10, pp 282–298, 1984.
[18] Gill, P.E., W. Murray, and M.H. Wright, Numerical Linear Algebra and Optimization, Vol. 1,
Addison Wesley, 1991.
[19] Gill, P.E., W. Murray, and M.H.Wright, Practical Optimization, London, Academic Press, 1981.
[20] Goldfarb, D., “A Family of Variable Metric Updates Derived by Variational Means,” Mathematics
of Computing, Vol. 24, pp 23–26, 1970.
[21] Grace, A.C.W., “Computer-Aided Control System Design Using Optimization Techniques,” Ph.D.
Thesis, University of Wales, Bangor, Gwynedd, UK, 1989.
[22] Han, S.P., “A Globally Convergent Method for Nonlinear Programming,” J. Optimization Theory
and Applications, Vol. 22, p. 297, 1977.
[24] Hollingdale, S.H., Methods of Operational Analysis in Newer Uses of Mathematics (James
Lighthill, ed.), Penguin Books, 1978.
[25] Levenberg, K., “A Method for the Solution of Certain Problems in Least Squares,” Quart. Appl.
Math. Vol. 2, pp 164–168, 1944.
[26] Madsen, K. and H. Schjaer-Jacobsen, “Algorithms for Worst Case Tolerance Optimization,” IEEE
Transactions of Circuits and Systems, Vol. CAS-26, Sept. 1979.
[27] Marquardt, D., “An Algorithm for Least-Squares Estimation of Nonlinear Parameters,” SIAM J.
Appl. Math. Vol. 11, pp 431–441, 1963.
[28] Moré, J.J., “The Levenberg-Marquardt Algorithm: Implementation and Theory,” Numerical
Analysis, ed. G. A. Watson, Lecture Notes in Mathematics 630, Springer Verlag, pp 105–116,
1977.
[29] NAG Fortran Library Manual, Mark 12, Vol. 4, E04UAF, p. 16.
[30] Nelder, J.A. and R. Mead, “A Simplex Method for Function Minimization,” Computer J., Vol.7, pp
308–313, 1965.
[31] Nocedal, J. and S. J. Wright. Numerical Optimization, Second Edition. Springer Series in
Operations Research, Springer Verlag, 2006.
[32] Powell, M.J.D., “The Convergence of Variable Metric Methods for Nonlinearly Constrained
Optimization Calculations,” Nonlinear Programming 3, (O.L. Mangasarian, R.R. Meyer and
S.M. Robinson, eds.), Academic Press, 1978.
[33] Powell, M.J.D., “A Fast Algorithm for Nonlinearly Constrained Optimization Calculations,”
Numerical Analysis, G.A.Watson ed., Lecture Notes in Mathematics, Springer Verlag, Vol. 630,
1978.
[34] Powell, M.J.D., “A Fortran Subroutine for Solving Systems of Nonlinear Algebraic Equations,”
Numerical Methods for Nonlinear Algebraic Equations, (P. Rabinowitz, ed.), Ch.7, 1970.
[35] Powell, M.J.D., “Variable Metric Methods for Constrained Optimization,” Mathematical
Programming: The State of the Art, (A. Bachem, M. Grotschel and B. Korte, eds.) Springer
Verlag, pp 288–311, 1983.
[37] Shanno, D.F., “Conditioning of Quasi-Newton Methods for Function Minimization,” Mathematics
of Computing, Vol. 24, pp 647–656, 1970.
[38] Waltz, F.M., “An Engineering Approach: Hierarchical Optimization Criteria,” IEEE Trans., Vol.
AC-12, pp 179–180, April, 1967.
[39] Branch, M.A., T.F. Coleman, and Y. Li, “A Subspace, Interior, and Conjugate Gradient Method for
Large-Scale Bound-Constrained Minimization Problems,” SIAM Journal on Scientific
Computing, Vol. 21, Number 1, pp 1–23, 1999.
[40] Byrd, R.H., J. C. Gilbert, and J. Nocedal, “A Trust Region Method Based on Interior Point
Techniques for Nonlinear Programming,” Mathematical Programming, Vol 89, No. 1, pp. 149–
185, 2000.
[41] Byrd, R.H., Mary E. Hribar, and Jorge Nocedal, “An Interior Point Algorithm for Large-Scale
Nonlinear Programming,” SIAM Journal on Optimization, Vol 9, No. 4, pp. 877–900, 1999.
[42] Byrd, R.H., R.B. Schnabel, and G.A. Shultz, “Approximate Solution of the Trust Region Problem
by Minimization over Two-Dimensional Subspaces,” Mathematical Programming, Vol. 40, pp
247–263, 1988.
[43] Coleman, T.F. and Y. Li, “On the Convergence of Reflective Newton Methods for Large-Scale
Nonlinear Minimization Subject to Bounds,” Mathematical Programming, Vol. 67, Number 2,
pp 189–224, 1994.
[44] Coleman, T.F. and Y. Li, “An Interior, Trust Region Approach for Nonlinear Minimization Subject
to Bounds,” SIAM Journal on Optimization, Vol. 6, pp 418–445, 1996.
[45] Coleman, T.F. and Y. Li, “A Reflective Newton Method for Minimizing a Quadratic Function
Subject to Bounds on some of the Variables,” SIAM Journal on Optimization, Vol. 6, Number 4,
pp 1040–1058, 1996.
[46] Coleman, T.F. and A. Verma, “A Preconditioned Conjugate Gradient Approach to Linear Equality
Constrained Minimization,” Computational Optimization and Applications, Vol. 20, No. 1, pp.
61–72, 2001.
[47] Mehrotra, S., “On the Implementation of a Primal-Dual Interior Point Method,” SIAM Journal on
Optimization, Vol. 2, pp 575–601, 1992.
[48] Moré, J.J. and D.C. Sorensen, “Computing a Trust Region Step,” SIAM Journal on Scientific and
Statistical Computing, Vol. 3, pp 553–572, 1983.
[49] Sorensen, D.C., “Minimization of a Large Scale Quadratic Function Subject to an Ellipsoidal
Constraint,” Department of Computational and Applied Mathematics, Rice University,
Technical Report TR94-27, 1994.
[50] Steihaug, T., “The Conjugate Gradient Method and Trust Regions in Large Scale Optimization,”
SIAM Journal on Numerical Analysis, Vol. 20, pp 626–637, 1983.
[51] Waltz, R. A. , J. L. Morales, J. Nocedal, and D. Orban, “An interior algorithm for nonlinear
optimization that combines line search and trust region steps,” Mathematical Programming,
Vol 107, No. 3, pp. 391–408, 2006.
[52] Zhang, Y., “Solving Large-Scale Linear Programs by Interior-Point Methods Under the MATLAB
Environment,” Department of Mathematics and Statistics, University of Maryland, Baltimore
County, Baltimore, MD, Technical Report TR96-01, July, 1995.
[53] Hairer, E., S. P. Norsett, and G. Wanner, Solving Ordinary Differential Equations I - Nonstiff
Problems, Springer-Verlag, pp. 183–184.
[55] Bixby, Robert E., “Implementing the Simplex Method: The Initial Basis,” ORSA Journal on
Computing, Vol. 4, No. 3, 1992.
[56] Andersen, Erling D. and Knud D. Andersen, “Presolving in Linear Programming,” Mathematical
Programming, Vol. 71, pp. 221–245, 1995.
[57] Lagarias, J. C., J. A. Reeds, M. H. Wright, and P. E. Wright, “Convergence Properties of the
Nelder-Mead Simplex Method in Low Dimensions,” SIAM Journal of Optimization, Vol. 9,
Number 1, pp. 112–147, 1998.
[58] Dolan, Elizabeth D. , Jorge J. Moré and Todd S. Munson, “Benchmarking Optimization Software
with COPS 3.0,” Argonne National Laboratory Technical Report ANL/MCS-TM-273, February
2004.
[59] Applegate, D. L., R. E. Bixby, V. Chvátal and W. J. Cook, The Traveling Salesman Problem: A
Computational Study, Princeton University Press, 2007.
[60] Spellucci, P., “A new technique for inconsistent QP problems in the SQP method,” Journal of
Mathematical Methods of Operations Research, Volume 47, Number 3, pp. 355–400, October
1998.
[61] Tone, K., “Revisions of constraint approximations in the successive QP method for nonlinear
programming problems,” Journal of Mathematical Programming, Volume 26, Number 2, pp.
144–152, June 1983.
[62] Gondzio, J. “Multiple centrality corrections in a primal dual method for linear programming.”
Computational Optimization and Applications, Volume 6, Number 2, pp. 137–156, 1996.
[63] Gould, N. and P. L. Toint. “Preprocessing for quadratic programming.” Math. Programming,
Series B, Vol. 100, pp. 95–132, 2004.
[64] Schittkowski, K., “More Test Examples for Nonlinear Programming Codes,” Lecture Notes in
Economics and Mathematical Systems, Number 282, Springer, p. 45, 1987.
3 Examining Results

Current Point and Function Value
• The current point is the final point in the solver iterations. It is the best point the solver found in
its run.
• If you call a solver without assigning a value to the output, the default output, ans, is the
current point.
• The function value is the value of the objective function at the current point.
• The function value for least-squares solvers is the sum of squares, also known as the residual
norm.
• fgoalattain, fminimax, and fsolve return a vector function value.
• Sometimes fval or Fval denotes function value.
See Also
More About
• “Solver Outputs and Iterative Display”
Exit Flags and Exit Messages
Exit Flags
When an optimization solver completes its task, it sets an exit flag. An exit flag is an integer code for the reason the solver halted its iterations. In general:

• A positive exit flag means the solver converged to a solution.
• A zero exit flag means the solver stopped because it reached a limit, such as the maximum number of iterations or function evaluations.
• A negative exit flag means the solver did not converge to a solution.

A table of solver outputs in each solver's function reference section lists the meaning of that solver's exit flags.
Note Exit flags are not infallible guides to the quality of a solution. Many other factors, such as
tolerance settings, can affect whether a solution is satisfactory to you. You are responsible for
deciding whether a solver returns a satisfactory answer. Sometimes a negative exit flag does not
correspond to a “bad” solution. Similarly, sometimes a positive exit flag does not correspond to a
“good” solution.
You obtain an exit flag by calling a solver with the exitflag syntax. This syntax depends on the
solver. For details, see the solver function reference pages. For example, for fsolve, the calling
syntax to obtain an exit flag is
[x,fval,exitflag] = fsolve(...)
The following example uses this syntax. Suppose you want to solve the system of nonlinear equations

    2x1 − x2 = e^(−x1)
    −x1 + 2x2 = e^(−x2).
Write these equations as an anonymous function that gives a zero vector at a solution:
myfcn = @(x)[2*x(1) - x(2) - exp(-x(1));
-x(1) + 2*x(2) - exp(-x(2))];
Call fsolve with the exitflag syntax at the initial point [-5 -5]:
[xfinal fval exitflag] = fsolve(myfcn,[-5 -5])
Equation solved.
xfinal =
0.5671 0.5671
fval =
1.0e-06 *
-0.4059
-0.4059
exitflag =
1
In the table for fsolve exitflag, you find that an exit flag value 1 means “Function converged to a
solution x.” In other words, fsolve reports myfcn is nearly zero at x = [0.5671 0.5671].
Exit Messages
Each solver issues a message to the MATLAB command window at the end of its iterations. This
message explains briefly why the solver halted. The message might give more detail than the exit
flag.
Many examples in this documentation show exit messages, such as “Minimize Rosenbrock's Function
at the Command Line” on page 1-16. The example in the previous section, “Exit Flags” on page 3-3,
shows the following exit message:
Equation solved.

fsolve completed because the vector of function values is near zero
as measured by the value of the function tolerance, and
the problem appears regular as measured by the gradient.
This message is more informative than the exit flag. The message indicates that the gradient is
relevant. The message also states that the function tolerance controls how near 0 the vector of
function values must be for fsolve to regard the solution as completed.
Some solvers provide enhanced exit messages that contain clickable elements:

• Links on words or phrases. If you click such a link, a window opens that displays a definition of the term, or gives other information. The new window can contain links to the Help browser documentation for more detailed information.
• A link as the last line of the display saying <stopping criteria details>. If you click this
link, MATLAB displays more detail about the reason the solver halted.
Each of the underlined words or phrases contains a link that provides more information.
• The <stopping criteria details> link prints the following to the MATLAB command line:
Optimization completed: The first-order optimality measure, 0.000000e+00, is less
than options.OptimalityTolerance = 1.000000e-06.
• The other links bring up a help window with term definitions. For example, clicking the Local
minimum found link opens the following window:
Clicking the first-order optimality measure expander link brings up the definition of first-
order optimality measure for fminunc:
The expander link is a way to obtain more information in the same window. Clicking the first-
order optimality measure expander link again closes the definition.
• The other links open the Help Viewer.
Exit Message Options
For example,
opts = optimoptions(@fminunc,'Display','iter-detailed','Algorithm','quasi-newton');
[xfinal fval] = fminunc(@cos,1,opts);
See Also
More About
• “Solver Outputs and Iterative Display”
Iterations and Function Counts
You can limit the number of iterations or function counts by setting the MaxIterations or
MaxFunctionEvaluations options for a solver using optimoptions. Or, if you want a solver to
continue after reaching one of these limits, raise the values of these options. See “Set and Change
Options” on page 2-62.
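For instance, a sketch of raising both limits before rerunning a solver; the values here are arbitrary examples, and the defaults differ by solver and algorithm:

```matlab
% Raise the iteration and function-evaluation limits (arbitrary values)
options = optimoptions(@fmincon, ...
    'MaxIterations',1500, ...
    'MaxFunctionEvaluations',10000);
```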
At any step, intermediate calculations can involve evaluating the objective function and any
constraints at points near the current iterate xi. For example, the solver might estimate a gradient by
finite differences. At each nearby point, the function count (F-count) increases by one. The figure
“Typical Iteration in 3-D Space” on page 3-9 shows that, in 3-D space with forward finite
differences of size delta, one iteration typically corresponds to an increase in function count of four.
In the figure, ei represents the unit vector in the ith coordinate direction.
• If the problem has no constraints, the F-count reports the total number of objective function
evaluations.
• If the problem has constraints, the F-count reports only the number of points where function
evaluations took place, not the total number of evaluations of constraint functions. So, if the
problem has many constraints, the F-count can be significantly less than the total number of
function evaluations.
Sometimes a solver attempts a step and rejects the attempt. The trust-region, trust-region-
reflective, and trust-region-dogleg algorithms count these failed attempts as iterations, and
report the (unchanged) result in the iterative display. The interior-point, active-set, and
levenberg-marquardt algorithms do not count failed attempts as iterations, and do not report the
attempts in the iterative display. All attempted steps increase the F-count, regardless of the
algorithm.
F-count is a header in the iterative display for many solvers. For an example, see “Interpret the
Result” on page 1-16.
The F-count appears in the output structure as output.funcCount, enabling you to access the
evaluation count programmatically. For more information, see “Output Structures” on page 3-21.
See Also
optimoptions
More About
• “Solver Outputs and Iterative Display”
First-Order Optimality Measure
For general information about first-order optimality, see Nocedal and Wright [31]. For specifics about
the first-order optimality measures for Optimization Toolbox solvers, see “Unconstrained Optimality”
on page 3-11, “Constrained Optimality Theory” on page 3-12, and “Constrained Optimality in
Solver Form” on page 3-13.
Some solvers or algorithms use relative first-order optimality as a stopping criterion. Solver iterations end if the first-order optimality measure is less than μ times OptimalityTolerance, where μ is either:

• The infinity norm (maximum) of the gradient of the objective function at x0
• The infinity norm (maximum) of inputs to the solver, such as f or b in linprog
A relative measure attempts to account for the scale of a problem. Multiplying an objective function
by a very large or small number does not change the stopping condition for a relative stopping
criterion, but does change it for an unscaled one.
Solvers with enhanced exit messages on page 3-4 state, in the stopping criteria details, when they
use relative first-order optimality.
Unconstrained Optimality
For a smooth unconstrained problem,
min f (x),
x
3-11
3 Examining Results
the first-order optimality measure is the infinity norm (meaning maximum absolute value) of ∇f(x),
which is:
This measure of optimality is based on the familiar condition for a smooth function to achieve a
minimum: its gradient must be zero. For unconstrained problems, when the first-order optimality
measure is nearly zero, the objective function has gradient nearly zero, so the objective function
could be near a minimum. If the first-order optimality measure is not small, the objective function is
not minimal.
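As a sketch, with a simple quadratic objective chosen for illustration, the measure is the infinity norm of the gradient:

```matlab
% Illustrative objective and its gradient (both assumed for this sketch)
fun  = @(x) (x(1)-1)^2 + 4*(x(2)+2)^2;
grad = @(x) [2*(x(1)-1); 8*(x(2)+2)];

x0 = [1; -2];                      % the minimizer of fun
firstOrderOpt = norm(grad(x0),Inf) % 0 at the minimizer
```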
Constrained Optimality Theory

For a smooth constrained problem, let g and h be vector functions representing all inequality and equality constraints respectively (meaning bound, linear, and nonlinear constraints):

    min_x f(x)  subject to  g(x) ≤ 0,  h(x) = 0.
The meaning of first-order optimality in this case is more complex than for unconstrained problems.
The definition is based on the Karush-Kuhn-Tucker (KKT) conditions. The KKT conditions are
analogous to the condition that the gradient must be zero at a minimum, modified to take constraints
into account. The difference is that the KKT conditions hold for constrained problems.
The Lagrangian function is

    L(x, λ) = f(x) + Σi λg,i gi(x) + Σi λh,i hi(x).

The vector λ, which is the concatenation of λg and λh, is the Lagrange multiplier vector. Its length is the total number of constraints.

The KKT conditions are:

    ∇x L(x, λ) = 0,                      (3-2)
    λg,i gi(x) = 0 for all i,            (3-3)
    g(x) ≤ 0,  h(x) = 0,  λg,i ≥ 0.      (3-4)
Solvers do not use the three expressions in “Equation 3-4” in the calculation of optimality measure.
Solvers measure how well a point satisfies the KKT conditions using two quantities: the infinity norm of the Lagrangian gradient,

    ‖∇x L(x, λ)‖∞,                       (3-5)

and the complementarity measure

    ‖λg · g(x)‖∞,                        (3-6)

where the norm in Equation 3-6 means the infinity norm (maximum) of the vector whose components are λg,i gi(x).
The combined optimality measure is the maximum of the values calculated in “Equation 3-5” and
“Equation 3-6”. Solvers that accept nonlinear constraint functions report constraint violations
g(x) > 0 or |h(x)| > 0 as ConstraintTolerance violations. See “Tolerances and Stopping Criteria”
on page 2-68.
where the norm of the vectors in “Equation 3-7” and “Equation 3-8” is the infinity norm (maximum).
The subscripts on the Lagrange multipliers correspond to solver Lagrange multiplier structures. See
“Lagrange Multiplier Structures” on page 3-22. The summations in “Equation 3-7” range over all
constraints. If a bound is ±Inf, that term is not constrained, so it is not part of the summation.
For some large-scale problems with only linear equalities, the first-order optimality measure is the
infinity norm of the projected gradient. In other words, the first-order optimality measure is the size
of the gradient projected onto the null space of Aeq.
For least-squares solvers and trust-region-reflective algorithms, in problems with bounds alone, the
first-order optimality measure is the maximum over i of |vi*gi|. Here gi is the ith component of the
gradient, x is the current point, and
    vi = |xi − bi|   if the negative gradient points toward bound bi,
    vi = 1           otherwise.
If xi is at a bound, vi is zero. If xi is not at a bound, then at a minimizing point the gradient gi should
be zero. Therefore the first-order optimality measure should be zero at a minimizing point.
See Also
More About
• “Solver Outputs and Iterative Display”
Iterative Display
In this section...
“Introduction” on page 3-14
“Common Headings” on page 3-14
“Function-Specific Headings” on page 3-15
Introduction
The iterative display is a table of statistics describing the calculations in each iteration of a solver.
The statistics depend on both the solver and the solver algorithm. The table appears in the MATLAB
Command Window when you run solvers with appropriate options. For more information about
iterations, see “Iterations and Function Counts” on page 3-9.
Obtain the iterative display by using optimoptions with the Display option set to 'iter' or
'iter-detailed'. For example:
options = optimoptions(@fminunc,'Display','iter','Algorithm','quasi-newton');
[x fval exitflag output] = fminunc(@sin,0,options);
First-order
Iteration Func-count f(x) Step-size optimality
0 2 0 1
1 4 -0.841471 1 0.54
2 8 -1 0.484797 0.000993
3 10 -1 1 5.62e-05
4 12 -1 1 0
You can also obtain the iterative display by using the Optimization app. In the Display to command
window section of the Options pane, select Level of display > iterative or iterative with
detailed message.
Common Headings
This table lists some common headings of iterative display.
Function-Specific Headings
The tables in this section describe headings of the iterative display whose meaning is specific to the
optimization function you are using.
This table describes the headings specific to fgoalattain, fmincon, fminimax, and fseminf.
QP subproblem procedures:
fminbnd

• initial
• golden (golden section search)
• parabolic (parabolic interpolation)
fminsearch
• initial simplex
• expand
• reflect
• shrink
• contract inside
• contract outside
fminunc
The fminunc 'quasi-newton' algorithm can issue a skipped update message to the right of the
First-order optimality column. This message means that fminunc did not update its Hessian
estimate, because the resulting matrix would not have been positive definite. The message usually
indicates that the objective function is not smooth at the current point.
fsolve
intlinprog
linprog
This table describes the headings specific to linprog. Each algorithm has its own iterative display.
lsqlin
The lsqlin 'interior-point' iterative display is inherited from the quadprog iterative display.
The relationship between these functions is explained in “Linear Least Squares: Interior-Point or
Active-Set” on page 12-2. For iterative display details, see “quadprog” on page 3-19.
quadprog
This table describes the headings specific to quadprog. Only the 'interior-point-convex'
algorithm has the iterative display.
Output Structures
An output structure contains information on a solver's result. All solvers can return an output
structure. To obtain an output structure, invoke the solver with the output structure in the calling
syntax. For example, to get an output structure from lsqnonlin, use the syntax
[x,resnorm,residual,exitflag,output] = lsqnonlin(...)
You can also obtain an output structure by running a problem using the Optimization app. All results
exported from Optimization app contain an output structure.
The contents of the output structure are listed in each solver's reference pages. For example, the
output structure returned by lsqnonlin contains firstorderopt, iterations, funcCount,
cgiterations, stepsize, algorithm, and message. To access, for example, the message, enter
output.message.
Optimization app exports results in a structure. The results structure contains the output structure.
To access, for example, the number of iterations, use the syntax
optimresults.output.iterations.
You can also see the contents of an output structure by double-clicking the output structure in the
MATLAB Workspace pane.
See Also
More About
• “Solver Outputs and Iterative Display”
Lagrange Multiplier Structures
To access, for example, the nonlinear inequality field of a Lagrange multiplier structure, enter
lambda.ineqnonlin. To access the third element of the Lagrange multiplier associated with lower
bounds, enter lambda.lower(3).
The content of the Lagrange multiplier structure depends on the solver. For example, linear
programming has no nonlinearities, so it does not have eqnonlin or ineqnonlin fields. Each
applicable solver's function reference pages contains a description of its Lagrange multiplier
structure under the heading “Outputs.”
Examine the Lagrange multiplier structure for the solution of a nonlinear problem with linear and
nonlinear inequality constraints and bounds.
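The structure examined below can come from a call like the following sketch. The objective function and starting point are assumptions for illustration; the constraints match the ones analyzed in the bullets (x(1) + x(2) <= 1, bounds, and x(1)^2 + x(2)^2 <= 1), though the exact multiplier values depend on the objective:

```matlab
fun = @(x)cos(x(1)) + sin(x(2));              % hypothetical objective
A = [1 1]; b = 1;                             % linear inequality x(1)+x(2) <= 1
lb = [-2;-2]; ub = [2;2];                     % bounds (assumed)
nonlcon = @(x)deal(x(1)^2 + x(2)^2 - 1, []);  % nonlinear inequality, unit disk
[x,fval,exitflag,output,lambda] = ...
    fmincon(fun,[0;0],A,b,[],[],lb,ub,nonlcon);
```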
disp(lambda)
• The lambda.eqlin and lambda.eqnonlin fields have size 0 because there are no linear
equality constraints and no nonlinear equality constraints.
• The lambda.ineqlin field has value 0.3407, indicating that the linear inequality constraint is
active. The linear inequality constraint is x(1) + x(2) <= 1. Check that the constraint is active
at the solution, meaning the solution causes the inequality to be an equality:
x(1) + x(2)
ans =
1.0000
• Check the values of the lambda.lower and lambda.upper fields.
lambda.lower
ans =
1.0e-07 *
0.2210
0.2365
lambda.upper
ans =
1.0e-07 *
0.3361
0.3056
These values are effectively zero, indicating that the solution is not near the bounds.
• The value of the lambda.ineqnonlin field is 1.7038e-07, indicating that this constraint is not
active. Check the constraint, which is x(1)^2 + x(2)^2 <= 1.
x(1)^2 + x(2)^2
ans =
0.5282
The nonlinear constraint function value is not near its limit, so the Lagrange multiplier is
approximately 0.
See Also
More About
• “Solver Outputs and Iterative Display”
Hessian Output
In this section...
“Returned Hessian” on page 3-24
“fminunc Hessian” on page 3-24
“fmincon Hessian” on page 3-25
Returned Hessian
The fmincon and fminunc solvers return an approximate Hessian as an optional output:
[x,fval,exitflag,output,grad,hessian] = fminunc(fun,x0)
% or
[x,fval,exitflag,output,lambda,grad,hessian] = fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon)
This section describes the meaning of the returned Hessian, and the accuracy you can expect.
You can also specify the type of Hessian that the solvers use as input Hessian arguments. For
fmincon, see “Hessian as an Input” on page 16-88. For fminunc, see “Including Gradients and
Hessians” on page 2-19.
fminunc Hessian
The Hessian for an unconstrained problem is the matrix of second derivatives of the objective
function f:
Hessian Hij = ∂²f / (∂xi ∂xj).
• If you supply a Hessian in the objective function and set the HessianFcn option to
'objective', fminunc returns this Hessian.
• If you supply a HessianMultiplyFcn function, fminunc returns the Hinfo matrix from the
HessianMultiplyFcn function. For more information, see HessianMultiplyFcn in the
trust-region section of the fminunc options table.
• Otherwise, fminunc returns an approximation from a sparse finite difference algorithm on the
gradients.
This Hessian is accurate for the next-to-last iterate. However, the next-to-last iterate might not be
close to the final point.
The reason the trust-region algorithm returns the Hessian at the next-to-last point is for
efficiency. fminunc uses the Hessian internally to compute its next step. When fminunc reaches
a stopping condition, it does not need to compute the next step, so does not compute the Hessian.
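A sketch of retrieving the returned Hessian (the quadratic objective here is illustrative; its exact Hessian is [6 2; 2 2], against which the solver's approximation can be compared):

```matlab
fun = @(x) 3*x(1)^2 + 2*x(1)*x(2) + x(2)^2;   % exact Hessian is [6 2; 2 2]
opts = optimoptions('fminunc','Algorithm','quasi-newton','Display','off');
[x,fval,exitflag,output,grad,hess] = fminunc(fun,[1;1],opts);
disp(hess)   % quasi-Newton (BFGS) approximation of the Hessian
```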
fmincon Hessian
The Hessian for a constrained problem is the Hessian of the Lagrangian. For an objective function f,
nonlinear inequality constraint vector c, and nonlinear equality constraint vector ceq, the Lagrangian
is
L = f + ∑i λi ci + ∑j λj ceqj.
The λi are Lagrange multipliers; see “First-Order Optimality Measure” on page 3-11 and “Lagrange
Multiplier Structures” on page 3-22. The Hessian of the Lagrangian is
H = ∇²L = ∇²f + ∑i λi ∇²ci + ∑j λj ∇²ceqj.
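As a concrete illustration of this formula, the Lagrangian Hessian for a small hypothetical problem can be assembled by hand from the individual Hessians and a multiplier:

```matlab
% Hypothetical 2-D problem: f(x) = x'*x, one inequality c(x) = x'*x - 1.
lam = 0.5;                 % illustrative multiplier lambda.ineqnonlin
H_f = [2 0; 0 2];          % Hessian of f
H_c = [2 0; 0 2];          % Hessian of c
H_L = H_f + lam*H_c        % Hessian of the Lagrangian: [3 0; 0 3]
```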
fmincon has four algorithms, with several options for Hessians, as described in “fmincon Trust
Region Reflective Algorithm” on page 6-19, “fmincon Active Set Algorithm” on page 6-22, and
“fmincon Interior Point Algorithm” on page 6-30. fmincon returns the following for the Hessian:
• If you supply a Hessian in the objective function and set the HessianFcn option to
'objective', fmincon returns this Hessian.
• If you supply a HessianMultiplyFcn function, fmincon returns the Hinfo matrix from the
HessianMultiplyFcn function. For more information, see Trust-Region-Reflective
Algorithm in fmincon options.
• Otherwise, fmincon returns an approximation from a sparse finite difference algorithm on the
gradients.
This Hessian is accurate for the next-to-last iterate. However, the next-to-last iterate might not be
close to the final point.
The reason the trust-region-reflective algorithm returns the Hessian at the next-to-last
point is for efficiency. fmincon uses the Hessian internally to compute its next step. When
fmincon reaches a stopping condition, it does not need to compute the next step, so does not
compute the Hessian.
• interior-point Algorithm. If the HessianFcn option is a function handle, fmincon returns the
value of this function, evaluated at the final point, as the Hessian.
See Also
More About
• “Including Gradients and Hessians” on page 2-19
• “Hessian as an Input” on page 16-88
Plot Functions
In this section...
“Plot an Optimization During Execution” on page 3-27
“Using a Plot Function” on page 3-27
You can also use a custom-written plot function. Write a function file using the same structure as an
output function. For more information on this structure, see “Output Function Syntax” on page 15-26.
Note The Optimization app warns that it will be removed in a future release.
1 Write the nonlinear objective and constraint functions, including the derivatives:
function [f,g,H] = rosenboth(x)
% ROSENBOTH returns the value f of Rosenbrock's function
% and optionally the value g of its gradient and H the Hessian.
f = 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
if nargout > 1
g = [-400*(x(2)-x(1)^2)*x(1)-2*(1-x(1));
200*(x(2)-x(1)^2)];
if nargout > 2
H = [1200*x(1)^2-400*x(2)+2, -400*x(1);
-400*x(1), 200];
end
end
function [c,ceq,gc,gceq] = unitdisk2(x)
% UNITDISK2 returns the unit-disk constraint value and,
% optionally, the constraint gradients.
c = x(1)^2 + x(2)^2 - 1;
ceq = [ ];
if nargout > 2
gc = [2*x(1);2*x(2)];
gceq = [];
end
Your Problem Setup and Results panel should match the following figure.
4 Choose three plot functions in the Options pane: Current point, Function value, and First
order optimality.
5 Click the Start button under Run solver and view results.
6 The output appears as follows in the Optimization app.
• The “Current Point” plot graphically shows the minimizer [0.786,0.618], which is reported as
the Final point in the Run solver and view results pane. This plot updates at each iteration,
showing the intermediate iterates.
• The “Current Function Value” plot shows the objective function value at all iterations. This graph
is nearly monotone, showing fmincon reduces the objective function at almost every iteration.
• The “First-order Optimality” plot shows the first-order optimality measure at all iterations.
1 Write the nonlinear objective and constraint functions, including the derivatives, as shown in
“Running the Optimization Using the Optimization App” on page 3-27.
2 Create an options structure that includes calling the three plot functions:
options = optimoptions(@fmincon,'Algorithm','interior-point',...
'SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true,'PlotFcn',{@optimplotx,...
@optimplotfval,@optimplotfirstorderopt});
3 Call fmincon:
x = fmincon(@rosenboth,[0 0],[],[],[],[],[],[],...
@unitdisk2,options)
4 fmincon gives the following output:
x =
0.7864 0.6177
fmincon also displays the three plot functions, shown at the end of “Running the Optimization
Using the Optimization App” on page 3-27.
See Also
More About
• “Output Function Syntax” on page 15-26
• “Output Function for Problem-Based Optimization” on page 7-25
Output Functions
In this section...
“What Is an Output Function?” on page 3-32
“Example: Using Output Functions” on page 3-32
This section shows the solver-based approach to output functions. For the problem-based approach,
see “Output Function for Problem-Based Optimization” on page 7-25.
Generally, the solvers that can employ an output function are the ones that can take nonlinear
functions as inputs. You can determine which solvers can have an output function by looking in the
Options section of function reference pages, or by checking whether the Output function option is
available in the Optimization app for a solver.
The following example continues the one in “Nonlinear Inequality Constraints” on page 6-61, which
calls the function fmincon at the command line to solve a nonlinear, constrained optimization
problem. The example in this section uses a function file to call fmincon. The file also contains all
the functions needed for the example.
The code for the file is here: “Writing the Example Function File” on page 3-33.
options = optimoptions(@fmincon,'OutputFcn',@outfun)
where outfun is the name of the output function. When you call an optimization function with
options as an input, the optimization function calls outfun at each iteration of its algorithm.
In general, outfun can be any MATLAB function, but in this example, it is a nested function of the
function file described in “Writing the Example Function File” on page 3-33. The following code
defines the output function:
function stop = outfun(x,optimValues,state)
stop = false;
switch state
case 'init'
hold on
case 'iter'
% Concatenate current point and objective function
% value with history. x must be a row vector.
history.fval = [history.fval; optimValues.fval];
history.x = [history.x; x];
% Concatenate current search direction with
% searchdir.
searchdir = [searchdir;...
optimValues.searchdirection'];
plot(x(1),x(2),'o');
% Label points with iteration number.
% Add .15 to x(1) to separate label from plotted 'o'
text(x(1)+.15,x(2),num2str(optimValues.iteration));
case 'done'
hold off
otherwise
end
end
See “Using Handles to Store Function Parameters” (MATLAB) for more information about nested
functions.
For more information about these arguments, see “Output Function Syntax” on page 15-26.
function [history,searchdir] = runfmincon
% Set up shared variables with outfun
history.x = [];
history.fval = [];
searchdir = [];
% Call the optimization
x0 = [-1 1];
options = optimoptions(@fmincon,'OutputFcn',@outfun,...
'Display','iter','Algorithm','active-set');
xsol = fmincon(@objfun,x0,[],[],[],[],[],[],@confun,options);
function stop = outfun(x,optimValues,state)
stop = false;
switch state
case 'init'
hold on
case 'iter'
% Concatenate current point and objective function
% value with history. x must be a row vector.
history.fval = [history.fval; optimValues.fval];
history.x = [history.x; x];
% Concatenate current search direction with
% searchdir.
searchdir = [searchdir;...
optimValues.searchdirection'];
plot(x(1),x(2),'o');
% Label points with iteration number and add title.
% Add .15 to x(1) to separate label from plotted 'o'
text(x(1)+.15,x(2),...
num2str(optimValues.iteration));
title('Sequence of Points Computed by fmincon');
case 'done'
hold off
otherwise
end
end
function f = objfun(x)
f = exp(x(1))*(4*x(1)^2 + 2*x(2)^2 + 4*x(1)*x(2) +...
2*x(2) + 1);
end
history =
       x: [9×2 double]
    fval: [9×1 double]
The fval field contains the objective function values corresponding to the sequence of points
computed by fmincon:
history.fval
ans =
1.8394
1.8513
0.3002
0.5298
0.1870
0.0729
0.0353
0.0236
0.0236
These are the same values displayed in the iterative output in the column with header f(x).
The x field of history contains the sequence of points computed by the algorithm:
history.x
ans =
-1.0000 1.0000
-1.3679 1.2500
-5.5708 3.4699
-4.8000 2.2752
-6.7054 1.2618
-8.0679 1.0186
-9.0230 1.0532
-9.5471 1.0471
-9.5474 1.0474
This example displays a plot of this sequence of points, in which each point is labeled by its iteration
number.
The optimal point occurs at the eighth iteration. Note that the last two points in the sequence are so
close that they overlap.
The second output argument, searchdir, contains the search directions for fmincon at each
iteration. The search direction is a vector pointing from the point computed at the current iteration to
the point computed at the next iteration:
searchdir =
-0.3679 0.2500
-4.2029 2.2199
0.7708 -1.1947
-3.8108 -2.0268
-1.3625 -0.2432
-0.9552 0.0346
-0.5241 -0.0061
-0.0003 0.0003
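Using the first rows of history.x and searchdir above (copied here as literals), you can check this relationship: when the active-set algorithm takes a full step, the difference between consecutive points equals the search direction.

```matlab
xk   = [-1.0000 1.0000];    % first point in history.x
xkp1 = [-1.3679 1.2500];    % second point in history.x
sdir = [-0.3679 0.2500];    % first row of searchdir
step = xkp1 - xk;           % actual step taken
disp(max(abs(step - sdir))) % zero to display precision: step length was 1
```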
See Also
More About
• “Output Function Syntax” on page 15-26
• “Output Function for Problem-Based Optimization” on page 7-25
4
Steps to Take After Running a Solver
You can access relevant answers from many solvers' default exit message. The first line of the exit
message contains a link to a brief description of the result. This description contains a link leading to
documentation.
See Also
Related Examples
• “When the Solver Fails” on page 4-3
• “Solver Takes Too Long” on page 4-9
• “When the Solver Might Have Succeeded” on page 4-12
• “When the Solver Succeeds” on page 4-18
When the Solver Fails
In this section...
“Too Many Iterations or Function Evaluations” on page 4-3
“Converged to an Infeasible Point” on page 4-6
“Problem Unbounded” on page 4-7
“fsolve Could Not Solve Equation” on page 4-8
Set the Display option to 'iter'. This setting shows the results of the solver iterations.
• Using the Optimization app, choose Level of display to be iterative or iterative with
detailed message.
• At the MATLAB command line, enter
options = optimoptions('solvername','Display','iter');
For an example of iterative display, see “Interpret the Result” on page 1-16.
What to Look For in Iterative Display
• See if the objective function (Fval or f(x) or Resnorm) decreases. Decrease indicates progress.
• Examine constraint violation (Max constraint) to ensure that it decreases towards 0. Decrease
indicates progress.
• See if the first-order optimality decreases towards 0. Decrease indicates progress.
• See if the Trust-region radius decreases to a small value. This decrease indicates that the
objective might not be smooth.
What to Do
2. Relax Tolerances
If, for example, StepTolerance or OptimalityTolerance is too small, the solver might not
recognize when it has reached a minimum, and can continue making futile iterations indefinitely.
To change tolerances using the Optimization app, use the Stopping criteria list at the top of the
Options pane.
To change tolerances at the command line, use optimoptions as described in “Set and Change
Options” on page 2-62.
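For example (the values here are illustrative; defaults vary by solver and algorithm):

```matlab
% Loosen two common stopping tolerances for fmincon.
options = optimoptions('fmincon', ...
    'StepTolerance',1e-6, ...          % larger than the 1e-10 default
    'OptimalityTolerance',1e-4);       % larger than the 1e-6 default
```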
For example, check that your objective and nonlinear constraint functions return the correct values at
some points. See Check your Objective and Constraint Functions on page 4-20. Check that an
infeasible point does not cause an error in your functions; see “Iterations Can Violate Constraints” on
page 2-33.
Solvers run more reliably when each coordinate has about the same effect on the objective and
constraint functions. Multiply your coordinate directions by appropriate scalars to equalize the
effect of each coordinate, and add appropriate constants to certain coordinates to equalize their magnitudes.
Example: Centering and Scaling
x =
0
0.5000
The result is incorrect; poor scaling interfered with obtaining a good solution.
D = diag([1e-3,1e3]);
fr = @(y) f(D*y);
y = fminunc(fr, [0.5;0.5], opts)
y =
0
0 % the correct answer
z =
1.0e+005 *
10.0000 -10.0000 % looks good, but...
ans =
w = fminunc(fcc,[.5 .5],opts)
w =
0 0 % the correct answer
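A self-contained sketch of the rescaling idea (the badly scaled objective here is hypothetical, not the one from the fragment above):

```matlab
f = @(x) (x(1)/1e3)^2 + (x(2)*1e3)^2;   % coordinates differ hugely in effect
D = diag([1e3,1e-3]);                    % scaling matrix
fr = @(y) f(D*y);                        % fr(y) = y(1)^2 + y(2)^2: well scaled
opts = optimoptions('fminunc','Display','off');
y = fminunc(fr,[0.5;0.5],opts);
x = D*y;                                 % solution in the original coordinates
```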
If you do not provide gradients or Jacobians, solvers estimate gradients and Jacobians by finite
differences. These estimates cost extra function evaluations and can be inaccurate, so providing the derivatives yourself can save computational time and lead to
increased accuracy.
For constrained problems, providing a gradient has another advantage. A solver can reach a point x
such that x is feasible, but finite differences around x always lead to an infeasible point. In this case,
a solver can fail or halt prematurely. Providing a gradient allows a solver to proceed.
Provide gradients or Jacobians in the files for your objective function and nonlinear constraint
functions. For details of the syntax, see “Writing Scalar Objective Functions” on page 2-17, “Writing
Vector and Matrix Objective Functions” on page 2-26, and “Nonlinear Constraints” on page 2-37.
To check that your gradient or Jacobian function is correct, use the CheckGradients option, as
described in “Checking Validity of Gradients or Jacobians” on page 2-74.
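For example (the objective and its gradient here are illustrative):

```matlab
fun = @(x) deal(x.'*x, 2*x);   % objective value and its gradient
options = optimoptions('fminunc','SpecifyObjectiveGradient',true, ...
    'CheckGradients',true,'FiniteDifferenceType','central', ...
    'Display','off');
x = fminunc(fun,[1;1],options);  % errors if the gradient check fails
```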
If you have a Symbolic Math Toolbox license, you can calculate gradients and Hessians
programmatically. For an example, see “Symbolic Math Toolbox™ Calculates Gradients and Hessians”
on page 6-94.
For examples using gradients and Jacobians, see “Minimization with Gradient and Hessian” on page
6-13, “Nonlinear Constraints with Gradients” on page 6-63, “Symbolic Math Toolbox™ Calculates
Gradients and Hessians” on page 6-94, “Nonlinear Equations with Analytic Jacobian” on page 13-
7, and “Nonlinear Equations with Jacobian” on page 13-11.
7. Provide Hessian
Solvers often run more reliably and with fewer iterations when you supply a Hessian.
• fmincon interior-point. Write the Hessian as a separate function. For an example, see
“fmincon Interior-Point Algorithm with Analytic Hessian” on page 6-66.
• fmincon trust-region-reflective. Give the Hessian as the third output of the objective
function. For an example, see “Minimization with Dense Structured Hessian, Linear Equalities” on
page 6-90.
• fminunc trust-region. Give the Hessian as the third output of the objective function. For an
example, see “Minimization with Gradient and Hessian” on page 6-13.
If you have a Symbolic Math Toolbox license, you can calculate gradients and Hessians
programmatically. For an example, see “Symbolic Math Toolbox™ Calculates Gradients and Hessians”
on page 6-94.
To proceed when the solver found no feasible point, try one or more of the following.
“1. Check Linear Constraints” on page 4-6
“2. Check Nonlinear Constraints” on page 4-6
Try finding a point that satisfies the bounds and linear constraints by solving a linear programming
problem.
1 Define a linear programming problem with an objective function that is always zero, by setting
the objective vector f to all zeros.
2 Solve this linear programming problem:
xnew = linprog(f,A,b,Aeq,beq,lb,ub);
3 If there is a feasible point xnew, use xnew as the initial point and rerun your original problem.
4 If there is no feasible point, your problem is not well-formulated. Check the definitions of your
bounds and linear constraints. For details on checking linear constraints, see “Investigate Linear
Infeasibilities” on page 9-139.
After ensuring that your bounds and linear constraints are feasible (contain a point satisfying all
constraints), check your nonlinear constraints.
To do so, set your objective function to zero:
@(x)0
Run your optimization with all constraints and with the zero objective. If you find a feasible point
xnew, set x0 = xnew and rerun your original problem.
• If you do not find a feasible point using a zero objective function, use the zero objective function
with several initial points.
• If you find a feasible point xnew, set x0 = xnew and rerun your original problem.
• If you do not find a feasible point, try relaxing the constraints, discussed next.
1 Change the nonlinear constraint function c to return c-Δ, where Δ is a positive number. This
change makes your nonlinear constraints easier to satisfy.
2 Look for a feasible point for the new constraint function, using either your original objective
function or the zero objective function.
3 If you find a feasible point:
a Reduce Δ.
b Look for a feasible point for the new constraint function, starting at the previously found
point.
4 If you do not find a feasible point, try increasing Δ and looking again.
If you find no feasible point, your problem might be truly infeasible, meaning that no solution exists.
Check all your constraint definitions again.
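A sketch of the relaxation procedure (the constraint function here is hypothetical; in practice, substitute your own):

```matlab
Delta = 0.1;                                  % relaxation amount
confun  = @(x) x.'*x - 0.01;                  % hypothetical c(x) <= 0
relaxed = @(x) deal(confun(x) - Delta,[]);    % c - Delta <= 0, no ceq
opts = optimoptions('fmincon','Display','off');
x0feas = fmincon(@(x)0,[1;1],[],[],[],[],[],[],relaxed,opts);
% x0feas satisfies the relaxed constraint; reduce Delta and repeat from x0feas
```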
If the solver started at a feasible point, but converged to an infeasible point, try the following
techniques.
• Try a different algorithm. The fmincon 'sqp' and 'interior-point' algorithms are usually
the most robust, so try one or both of them first.
• Tighten the bounds. Give the highest lb and lowest ub vectors that you can. This can help the
solver to maintain feasibility. The fmincon 'sqp' and 'interior-point' algorithms obey
bounds at every iteration, so tight bounds help throughout the optimization.
Usually, you get this message because the linear constraints are inconsistent, or are nearly singular.
To check whether a feasible point exists, create a linear programming problem with the same
constraints and with a zero objective function vector f. Solve using the linprog 'dual-simplex'
algorithm:
options = optimoptions('linprog','Algorithm','dual-simplex');
x = linprog(f,A,b,Aeq,beq,lb,ub,options)
If linprog finds a feasible point, then try a different quadprog algorithm. Alternatively, change
some tolerances such as StepTolerance or ConstraintTolerance and solve the problem again.
Problem Unbounded
The solver reached a point whose objective function was less than the objective limit tolerance.
• Your problem might be truly unbounded. In other words, there is a sequence of points xi with
f(xi) → –∞.
fsolve Could Not Solve Equation
1 Try Changing the Initial Point on page 4-18. fsolve relies on an initial point. By giving it
different initial points, you increase the chances of success.
2 Check the definition of the equation to make sure that it is smooth. fsolve might fail to
converge for equations with discontinuous gradients, such as absolute value. fsolve can fail to
converge for functions with discontinuities.
3 Check that the equation is “square,” meaning equal dimensions for input and output (has the
same number of unknowns as values of the equation).
4 Change tolerances, especially OptimalityTolerance and StepTolerance. If you attempt to
get high accuracy by setting tolerances to very small values, fsolve can fail to converge. If you
set tolerances that are too high, fsolve can fail to solve an equation accurately.
5 Check the problem definition. Some problems have no real solution, such as x^2 + 1 = 0.
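For reference, here is a smooth, square system (two equations, two unknowns) that fsolve handles well; the system is illustrative:

```matlab
F = @(x) [x(1)^2 + x(2)^2 - 2;   % circle of radius sqrt(2)
          x(1) - x(2)];          % line x(1) = x(2)
opts = optimoptions('fsolve','Display','off');
[x,fvals,exitflag] = fsolve(F,[2;0],opts);
% exitflag > 0 indicates convergence to a root
```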
See Also
More About
• “Investigate Linear Infeasibilities” on page 9-139
Solver Takes Too Long
• Using the Optimization app, choose Level of display to be iterative or iterative with
detailed message.
• At the MATLAB command line, enter
options = optimoptions('solvername','Display','iter');
For an example of iterative display, see “Interpret the Result” on page 1-16. For more information,
see “What to Look For in Iterative Display” on page 4-3.
To change tolerances using the Optimization app, use the Stopping criteria list at the top of the
Options pane.
To change tolerances at the command line, use optimoptions as described in “Set and Change
Options” on page 2-62.
• Using the Optimization app, check the boxes next to each plot function you wish to use.
• At the MATLAB command line, enter
options = optimoptions('solvername','PlotFcn',{@plotfcn1,@plotfcn2,...});
For an example of using a plot function, see “Using a Plot Function” on page 3-27.
Enable CheckGradients
If you have supplied derivatives (gradients or Jacobians) to your solver, the solver can fail to converge
if the derivatives are inaccurate. For more information about using the CheckGradients option, see
“Checking Validity of Gradients or Jacobians” on page 2-74.
Use Inf Instead of a Large, Arbitrary Bound
Why? An interior-point algorithm can set an initial point to the midpoint of finite bounds. Or an
interior-point algorithm can try to find a “central path” midway between finite bounds. Therefore, a
large, arbitrary bound can resize those components inappropriately. In contrast, infinite bounds are
ignored for these purposes.
Minor point: Some solvers use memory for each constraint, primarily via a constraint Hessian. Setting
a bound to Inf or -Inf means there is no constraint, so there is less memory in use, because a
constraint Hessian has lower dimension.
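For example:

```matlab
% Use Inf/-Inf for genuinely unbounded components instead of an
% arbitrary large number such as 1e10.
lb = [0; -Inf];      % only the first variable has a lower bound
ub = [Inf; Inf];     % no upper bounds
```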
For an example of using an output function, see “Example: Using Output Functions” on page 3-32.
• Use a large-scale algorithm if possible (see “Large-Scale vs. Medium-Scale Algorithms” on page 2-
10). These algorithms include trust-region-reflective, interior-point, and the fminunc
trust-region algorithm.
Tip If you use a large-scale algorithm, then use sparse matrices for your linear constraints.
• Use a Jacobian multiply function or Hessian multiply function. For examples, see “Jacobian
Multiply Function with Linear Least Squares” on page 12-30, “Quadratic Minimization with
Dense, Structured Hessian” on page 11-17, and “Minimization with Dense Structured Hessian,
Linear Equalities” on page 6-90.
When the Solver Might Have Succeeded
In this section...
“Final Point Equals Initial Point” on page 4-12
“Local Minimum Possible” on page 4-12
If you are unsure that the initial point is truly a local minimum, try:
1 Starting from different points — see Change the Initial Point on page 4-18.
2 Checking that your objective and constraints are defined correctly (for example, do they return
the correct values at some points?) — see Check your Objective and Constraint Functions on
page 4-20. Check that an infeasible point does not cause an error in your functions; see
“Iterations Can Violate Constraints” on page 2-33.
3 Changing tolerances, such as OptimalityTolerance, ConstraintTolerance, and
StepTolerance — see Use Appropriate Tolerances on page 4-9.
4 Scaling your problem so each coordinate has about the same effect — see Rescale the Problem
on page 4-15.
5 Providing gradient and Hessian information — see Provide Analytic Gradients or Jacobian on
page 4-16 and Provide a Hessian on page 4-16.
1. Nonsmooth Functions
If you try to minimize a nonsmooth function, or have nonsmooth constraints, “Local Minimum
Possible” can be the best exit message. This is because the first-order optimality conditions do not
apply at a nonsmooth point.
To satisfy yourself that the solution is adequate, try to Check Nearby Points on page 4-19.
Restarting an optimization at the final point can lead to a solution with a better first-order optimality
measure. A better (lower) first-order optimality measure gives you more reason to believe that the
answer is reliable.
For example, consider the following minimization problem, taken from the example “Using Symbolic
Mathematics with Optimization Toolbox™ Solvers” on page 6-105:
options = optimoptions('fminunc','Algorithm','quasi-newton');
funh = @(x)log(1 + (x(1) - 4/3)^2 + 3*(x(2) - (x(1)^3 - x(1)))^2);
[xfinal fval exitflag] = fminunc(funh,[-1;2],options)
xfinal =
1.3333
1.0370
fval =
8.5265e-014
exitflag =
5
The exit flag value of 5 indicates that the first-order optimality measure was above the function
tolerance. Run the minimization again starting from xfinal:
[xfinal2 fval2 exitflag2] = fminunc(funh,xfinal,options)
xfinal2 =
1.3333
1.0370
fval2 =
6.5281e-014
exitflag2 =
1
The local minimum is “found,” not “possible,” and the exitflag is 1, not 5. The two solutions are
virtually identical. Yet the second run has a more satisfactory exit message, since the first-order
optimality measure was low enough: 7.5996e-007, instead of 3.9674e-006.
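The restart idea can be automated (funh and options repeat the earlier example; the restart cap is illustrative):

```matlab
funh = @(x) log(1 + (x(1) - 4/3)^2 + 3*(x(2) - (x(1)^3 - x(1)))^2);
options = optimoptions('fminunc','Algorithm','quasi-newton','Display','off');
[x,fval,eflag] = fminunc(funh,[-1;2],options);
restarts = 0;
while eflag == 5 && restarts < 5     % restart while "Local minimum possible"
    [x,fval,eflag] = fminunc(funh,x,options);
    restarts = restarts + 1;
end
```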
Many solvers give you a choice of algorithm. Different algorithms can lead to the use of different
stopping criteria.
For example, Rerun Starting At Final Point on page 4-13 returns exitflag 5 from the first run. This run
uses the quasi-newton algorithm.
The trust-region algorithm requires a user-supplied gradient. betopt.m contains a calculation of the
objective function and gradient:
function [f,gradf] = betopt(x)
% Objective log(1 + (x(1)-4/3)^2 + 3*(x(2) - (x(1)^3 - x(1)))^2)
% and its gradient (gradient derived from the objective).
g = 1 + (x(1) - 4/3)^2 + 3*(x(2) - (x(1)^3 - x(1)))^2;
f = log(g);
gradf = [(2*(x(1) - 4/3) + 6*(x(2) - (x(1)^3 - x(1)))*(1 - 3*x(1)^2))/g;
         6*(x(2) - (x(1)^3 - x(1)))/g];
Running the optimization using the trust-region algorithm results in a different exitflag:
options = optimoptions('fminunc','Algorithm','trust-region','SpecifyObjectiveGradient',true);
[xfinal3 fval3 exitflag3] = fminunc(@betopt,[-1;2],options)
xfinal3 =
1.3333
1.0370
fval3 =
7.6659e-012
exitflag3 =
3
The exit condition is better than the quasi-newton condition, though it is still not the best.
Rerunning the algorithm from the final point produces a better point, with extremely small first-order
optimality measure:
[xfinal4 fval4 eflag4 output4] = fminunc(@betopt,xfinal3,options)
xfinal4 =
1.3333
1.0370
fval4 =
0
eflag4 =
1
output4 =
iterations: 1
funcCount: 2
cgiterations: 1
firstorderopt: 7.5497e-11
algorithm: 'trust-region'
message: 'Local minimum found.
4. Change Tolerances
Sometimes tightening or loosening tolerances leads to a more satisfactory result. For example,
choose a smaller value of OptimalityTolerance in the Try a Different Algorithm on page 4-13
section:
options = optimoptions('fminunc','Algorithm','trust-region',...
'OptimalityTolerance',1e-8,'SpecifyObjectiveGradient',true); % default=1e-6
[xfinal3 fval3 eflag3 output3] = fminunc(@betopt,[-1;2],options)
xfinal3 =
1.3333
1.0370
fval3 =
0
eflag3 =
1
output3 =
iterations: 15
funcCount: 16
cgiterations: 12
firstorderopt: 7.5497e-11
algorithm: 'trust-region'
message: 'Local minimum found.
fminunc took one more iteration than before, arriving at a better solution.
Try to have each coordinate give about the same effect on the objective and constraint functions by
scaling and centering. For examples, see Center and Scale Your Problem on page 4-4.
Evaluate your objective function and constraints, if they exist, at points near the final point. If the
final point is a local minimum, nearby feasible points have larger objective function values. See Check
Nearby Points on page 4-19 for an example.
If you have a Global Optimization Toolbox license, try running the patternsearch solver from the
final point. patternsearch examines nearby points, and accepts all types of constraints.
Central finite differences are more time-consuming to evaluate, but are much more accurate. Use
central differences when your problem can have high curvature.
To choose central differences in the Optimization app, set Options > Approximated derivatives >
Type to be central differences.
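At the command line:

```matlab
% Switch finite-difference estimation to central differences.
options = optimoptions('fmincon','FiniteDifferenceType','central');
```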
If you do not provide gradients or Jacobians, solvers estimate gradients and Jacobians by finite
differences. These estimates cost extra function evaluations and can be inaccurate, so providing the derivatives yourself can save computational time and lead to
increased accuracy.
For constrained problems, providing a gradient has another advantage. A solver can reach a point x
such that x is feasible, but finite differences around x always lead to an infeasible point. In this case,
a solver can fail or halt prematurely. Providing a gradient allows a solver to proceed.
Provide gradients or Jacobians in the files for your objective function and nonlinear constraint
functions. For details of the syntax, see “Writing Scalar Objective Functions” on page 2-17, “Writing
Vector and Matrix Objective Functions” on page 2-26, and “Nonlinear Constraints” on page 2-37.
To check that your gradient or Jacobian function is correct, use the CheckGradients option, as
described in “Checking Validity of Gradients or Jacobians” on page 2-74.
If you have a Symbolic Math Toolbox license, you can calculate gradients and Hessians
programmatically. For an example, see “Symbolic Math Toolbox™ Calculates Gradients and Hessians”
on page 6-94.
For examples using gradients and Jacobians, see “Minimization with Gradient and Hessian” on page
6-13, “Nonlinear Constraints with Gradients” on page 6-63, “Symbolic Math Toolbox™ Calculates
Gradients and Hessians” on page 6-94, “Nonlinear Equations with Analytic Jacobian” on page 13-
7, and “Nonlinear Equations with Jacobian” on page 13-11.
9. Provide a Hessian
Solvers often run more reliably and with fewer iterations when you supply a Hessian.
• fmincon interior-point. Write the Hessian as a separate function. For an example, see
“fmincon Interior-Point Algorithm with Analytic Hessian” on page 6-66.
• fmincon trust-region-reflective. Give the Hessian as the third output of the objective
function. For an example, see “Minimization with Dense Structured Hessian, Linear Equalities” on
page 6-90.
• fminunc trust-region. Give the Hessian as the third output of the objective function. For an
example, see “Minimization with Gradient and Hessian” on page 6-13.
If you have a Symbolic Math Toolbox license, you can calculate gradients and Hessians
programmatically. For an example, see “Symbolic Math Toolbox™ Calculates Gradients and Hessians”
on page 6-94.
The example in “Symbolic Math Toolbox™ Calculates Gradients and Hessians” on page 6-94 shows
fmincon taking 77 iterations without a Hessian, but only 19 iterations with a Hessian.
When the Solver Succeeds
options = optimoptions('fmincon','Algorithm','active-set');
ffun = @(x)x^3;
xfinal = fmincon(ffun,1/3,[],[],[],[],-2,2,[],options)
No active inequalities.
xfinal =
-1.5056e-008
The true minimum occurs at x = -2. fmincon gives this report because the function f(x) is so flat
near x = 0.
Another common problem is that a solver finds a local minimum, but you might want a global
minimum. For more information, see “Local vs. Global Optima” on page 4-22.
Lesson: check your results, even if the solver reports that it “found” a local minimum, or “solved” an
equation.
x =
-1.6764e-008
fval =
-4.7111e-024
Change the initial point by a small amount, and the solver finds a better solution:
xfinal =
-0.7500
fval =
-0.1055
x = -0.75 is the global solution; starting from other points cannot improve the solution.
For more information, see “Local vs. Global Optima” on page 4-22.
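A sketch of a simple multistart loop (the one-variable objective is illustrative; its global minimum is at x = -0.75, matching the situation described above):

```matlab
fun = @(x) x.^3 + x.^4;             % global minimum at x = -3/4
opts = optimoptions('fminunc','Display','off');
bestf = Inf;  bestx = NaN;
for x0 = [-2 -0.5 0.5 1]            % several start points
    [x,fv] = fminunc(fun,x0,opts);
    if fv < bestf, bestf = fv; bestx = x; end
end
% bestx is approximately -0.75, bestf approximately -0.1055
```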
For example, with the objective function ffun from “What Can Be Wrong If The Solver Succeeds?” on
page 4-18, and the final point xfinal = -1.5056e-008, calculate ffun(xfinal±Δ) for some Δ:
delta = .1;
[ffun(xfinal),ffun(xfinal+delta),ffun(xfinal-delta)]
ans =
-0.0000 0.0011 -0.0009
The objective function is lower at ffun(xfinal-Δ), so the solver reported an incorrect solution.
options = optimoptions(@fmincon,'Algorithm','active-set');
lb = [0,-1]; ub = [1,1];
ffun = @(x)(x(1)-(x(1)-x(2))^2);
[x fval exitflag] = fmincon(ffun,[1/2 1/3],[],[],[],[],...
lb,ub,[],options)
x =
1.0e-007 *
0 0.1614
fval =
-2.6059e-016
exitflag =
1
Evaluating ffun at nearby feasible points shows that the solution x is not a true minimum:
[ffun([0,.001]),ffun([0,-.001]),...
ffun([.001,-.001]),ffun([.001,.001])]
ans =
1.0e-003 *
-0.0010 -0.0010 0.9960 1.0000
The first two listed values are smaller than the computed minimum fval.
If you have a Global Optimization Toolbox license, you can use the patternsearch function to check
nearby points.
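A minimal sketch of such a check, reusing ffun and the bounds from the example above and starting patternsearch at the point that fmincon reported (this assumes a Global Optimization Toolbox license):

```matlab
ffun = @(x)(x(1)-(x(1)-x(2))^2);
lb = [0,-1]; ub = [1,1];
x0 = [0 1.614e-8];                 % the final point reported by fmincon
[xps,fps] = patternsearch(ffun,x0,[],[],[],[],lb,ub)
% If fps is noticeably lower than the fmincon fval, the reported
% solution was not a true minimum.
```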
options = optimoptions('fminunc','Algorithm','quasi-newton');
[x fval] = fminunc(@(x)-x+x^2,0,options)
x =
0.5000
fval =
-0.2500
• A local minimum of a function is a point where the function value is smaller than at nearby points,
but possibly greater than at a distant point.
• A global minimum is a point where the function value is smaller than at all other feasible points.
Optimization Toolbox solvers typically find a local minimum. (This local minimum can be a global
minimum.) They find the minimum in the basin of attraction of the starting point. For more
information about basins of attraction, see “Basins of Attraction” on page 4-23.
• Linear programming problems and positive definite quadratic programming problems are convex,
with convex feasible regions, so they have only one basin of attraction. Depending on the specified
options, linprog ignores any user-supplied starting point, and quadprog does not require one,
even though you can sometimes speed a minimization by supplying a starting point.
• Global Optimization Toolbox functions, such as simulannealbnd, attempt to search more than
one basin of attraction.
You can set initial values to search for a global minimum in these ways:
• Use identical initial points with added random perturbations on each coordinate—bounded,
normal, exponential, or other.
• If you have a Global Optimization Toolbox license, use the GlobalSearch or MultiStart
solvers. These solvers automatically generate random start points within bounds.
The more you know about possible initial points, the more focused and successful your search will be.
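For instance, here is a MultiStart sketch that runs fmincon from 20 random start points within the bounds (requires Global Optimization Toolbox; the objective and bounds are taken from the earlier example):

```matlab
ffun = @(x)(x(1)-(x(1)-x(2))^2);
problem = createOptimProblem('fmincon','objective',ffun,...
    'x0',[1/2 1/3],'lb',[0,-1],'ub',[1,1]);
ms = MultiStart;
[xbest,fbest] = run(ms,problem,20);   % 20 random start points
```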
Basins of Attraction
If an objective function f(x) is smooth, the vector –∇f(x) points in the direction where f(x) decreases
most quickly. The equation of steepest descent, namely
dx(t)/dt = − ∇f(x(t)),
yields a path x(t) that goes to a local minimum as t increases. Generally, initial values x(0) that are
near each other give steepest descent paths that tend towards the same minimum point. The basin of
attraction for steepest descent is the set of initial values that lead to the same local minimum.
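You can trace a steepest descent path numerically by integrating this equation with an ODE solver. A sketch for a hypothetical objective (x1² − 1)² + x2², which has two minima at x = (±1, 0) (the function is illustrative, not from the text):

```matlab
% Gradient of the illustrative objective (x1^2 - 1)^2 + x2^2
gradf = @(x)[4*x(1)*(x(1)^2 - 1); 2*x(2)];
% Integrate dx/dt = -grad f(x) from the start point (-1.5, 1)
[~,path] = ode45(@(t,x)-gradf(x),[0 20],[-1.5; 1]);
path(end,:)   % ends near (-1, 0), the minimum in this basin
```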
This figure shows two one-dimensional minima. The figure shows different basins of attraction with
different line styles, and indicates the directions of steepest descent with arrows. For this and
subsequent figures, black dots represent local minima. Every steepest descent path, starting at a
point x(0), goes to the black dot in the basin containing x(0).
One-dimensional basins
This figure shows how steepest descent paths can be more complicated in more dimensions.
One basin of attraction, showing steepest descent paths from various starting points
This figure shows even more complicated paths and basins of attraction.
Constraints can break up one basin of attraction into several pieces. For example, consider
minimizing y subject to:
y ≥ |x|
y ≥ 5 − 4(x − 2)².
This figure shows the two basins of attraction with the final points.
The steepest descent paths are straight lines down to the constraint boundaries. From the constraint
boundaries, the steepest descent paths travel down along the boundaries. The final point is either
(0,0) or (11/4,11/4), depending on whether the initial x-value is above or below 2.
See Also
More About
• “Improve Results”
For a problem-based example of optimizing an ODE, see “Fit ODE, Problem-Based” on page 12-77.
For a solver-based example, see “Fit an Ordinary Differential Equation (ODE)” on page 12-54.
Optimization Toolbox solvers use derivatives of objective and constraint functions internally. By
default, they estimate these derivatives using finite difference approximations of the form
(F(x + δ) − F(x)) / δ
or
(F(x + δ) − F(x − δ)) / (2δ).
Optimizing a Simulation or Ordinary Differential Equation
• Simulations are often insensitive to small changes in parameters. This means that if you use too
small a perturbation δ, the simulation can return a spurious estimated derivative of 0.
• Both simulations and numerical solutions of ODEs can have inaccuracies in their function
evaluations. These inaccuracies can be amplified in finite difference approximations.
• Numerical solution of ODEs introduces noise at values much larger than machine precision. This
noise can be amplified in finite difference approximations.
• If an ODE solver uses variable step sizes, then sometimes the number of ODE steps in the
evaluation of F(x + δ) can differ from the number of steps in the evaluation of F(x). This difference
can lead to a spurious difference in the returned values, giving a misleading estimate of the
derivative.
If you have a Global Optimization Toolbox license, you can try using patternsearch as your solver.
patternsearch does not attempt to estimate gradients, so does not suffer from the limitations in
“Problems in Finite Differences” on page 4-26.
If you use patternsearch for expensive (time-consuming) function evaluations, use the Cache
option:
options = optimoptions('patternsearch','Cache','on');
You can sometimes avoid the problems in “Problems in Finite Differences” on page 4-26 by taking
larger finite difference steps than the default.
• If you have MATLAB R2011b or later, set a finite difference step size option to a value larger than
the default sqrt(eps) or eps^(1/3), such as:
• For R2011b–R2012b:
options = optimset('FinDiffRelStep',1e-3);
• For R2013a–R2015b and a solver named 'solvername':
options = optimoptions('solvername','FinDiffRelStep',1e-3);
• For R2016a onwards and a solver named 'solvername':
options = optimoptions('solvername','FiniteDifferenceStepSize',1e-3);
If you have different scales in different components, set the finite difference step size to a vector
proportional to the component scales.
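For example, a sketch with assumed component scales (x(1) varies on a scale of about 1, x(2) on a scale of about 100; both values are assumptions for illustration):

```matlab
% Larger relative step for the component with the larger scale
options = optimoptions('fmincon',...
    'FiniteDifferenceStepSize',[1e-3, 1e-1]);
```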
• If you have MATLAB R2011a or earlier, set the DiffMinChange option to a larger value than the
default 1e-8, and possibly set the DiffMaxChange option also, such as:
options = optimset('DiffMinChange',1e-3,'DiffMaxChange',1);
You can also use central finite differences, which are typically more accurate than the default forward differences, at the cost of roughly twice as many function evaluations:
options = optimoptions('solvername','FiniteDifferenceType','central');
To avoid the problems of finite difference estimation, you can give an approximate gradient function
in your objective and nonlinear constraints. Remember to set the SpecifyObjectiveGradient
option to true using optimoptions, and, if relevant, also set the SpecifyConstraintGradient
option to true.
• For some ODEs, you can evaluate the gradient numerically at the same time as you solve the ODE.
For example, suppose the differential equation for your objective function z(t,x) is
dz(t, x)/dt = G(z, t, x),
where x is the vector of parameters over which you minimize. Suppose x is a scalar. Then the
differential equation for its derivative y,
y(t, x) = dz(t, x)/dx,
is
dy(t, x)/dt = (∂G(z, t, x)/∂z) y(t, x) + ∂G(z, t, x)/∂x,
where z(t,x) is the solution of the objective function ODE. You can solve for y(t,x) in the same
system of differential equations as z(t,x). This solution gives you an approximated derivative
without ever taking finite differences. For nonscalar x, solve one ODE per component.
For theoretical and computational aspects of this method, see Leis and Kramer [2]. For
computational experience with this and finite-difference methods, see Figure 7 of Raue et al. [3].
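A sketch of this augmented-system approach for the made-up equation G(z, t, x) = −x·z, so that ∂G/∂z = −x and ∂G/∂x = −z (the equation, parameter value, and initial conditions are all illustrative):

```matlab
x = 2;                                  % assumed parameter value
rhs = @(t,w)[-x*w(1);                   % dz/dt = G(z,t,x) = -x*z
             -x*w(2) - w(1)];           % dy/dt = (dG/dz)*y + dG/dx
[t,w] = ode45(rhs,[0 1],[1; 0]);        % z(0) = 1, y(0) = 0
dzdx = w(end,2);                        % approximates dz/dx at t = 1
% For this example the exact sensitivity is -t*exp(-x*t).
```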
• For some simulations, you can estimate a derivative within the simulation. For example, the
likelihood ratio technique described in Reiman and Weiss [4] or the infinitesimal perturbation
analysis technique analyzed in Heidelberger, Cao, Zazanis, and Suri [1] estimate derivatives in the
same simulation that estimates the objective or constraint functions.
You can use odeset to set the AbsTol or RelTol ODE solver tolerances to values below their
defaults. However, choosing too small a tolerance can lead to slow solutions, convergence failure, or
other problems. Never choose a tolerance less than 1e-9 for RelTol. The lower limit on each
component of AbsTol depends on the scale of your problem, so there is no advice.
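For example, a sketch of tightening the tolerances before optimizing over an ODE solution (the ODE here is a placeholder):

```matlab
opts = odeset('RelTol',1e-9,'AbsTol',1e-10);
[t,z] = ode45(@(t,z)-2*z,[0 1],1,opts);   % substitute your own ODE
```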
If a simulation uses random numbers, then evaluating an objective or constraint function twice can
return different results. This affects both function estimation and finite difference estimation. The
value of a finite difference might be dominated by the variation due to randomness, rather than the
variation due to different evaluation points x and x + δ.
If your simulation uses random numbers from a stream you control, reset the random stream before
each evaluation of your objective or constraint functions. This practice can reduce the variability in
results. For example, in an objective function:
function f = mysimulation(x)
rng default % or any other resetting method
...
end
For details, see “Generate Random Numbers That Are Repeatable” (MATLAB).
Frequently, a simulation evaluates both the objective function and constraints during the same
simulation run. Or, both objective and nonlinear constraint functions use the same expensive
computation. Solvers such as fmincon separately evaluate the objective function and nonlinear
constraint functions. This can lead to a great loss of efficiency, because the solver calls the expensive
computation twice. To circumvent this problem, use the technique in “Objective and Nonlinear
Constraints in the Same Function” on page 2-48, or, when using the problem-based approach,
“Objective and Constraints Having a Common Function in Serial or Parallel, Problem-Based” on page
2-52.
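A sketch of the caching idea behind that technique (the names sharedFcns and expensive are illustrative): evaluate the expensive computation once per point, cache the result, and let both the objective and the constraint function read from the cache.

```matlab
function [objfun,confun] = sharedFcns(expensive)
% Returns objective and nonlinear constraint handles that share one
% expensive computation per point x. expensive(x) is assumed to
% return a struct with fields f (objective) and c (inequality values).
xLast = [];   % last point where expensive() ran
out   = [];   % cached result
    function f = obj(x)
        if ~isequal(x,xLast), out = expensive(x); xLast = x; end
        f = out.f;
    end
    function [c,ceq] = con(x)
        if ~isequal(x,xLast), out = expensive(x); xLast = x; end
        c = out.c; ceq = [];
    end
objfun = @obj; confun = @con;
end
```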
While you might not know all limitations on the parameter space, try to set appropriate bounds on all
parameters, both upper and lower. This can speed up your optimization, and can help the solver avoid
problematic parameter values.
Use a Solver That Respects Bounds
As described in “Iterations Can Violate Constraints” on page 2-33, some algorithms can violate bound
constraints at intermediate iterations. For optimizing simulations and ODEs, use algorithms that
always obey bound constraints. See “Algorithms That Satisfy Bound Constraints” on page 2-33.
Return NaN
If your simulation or ODE solver does not successfully evaluate an objective or nonlinear constraint
function at a point x, have your function return NaN. Most Optimization Toolbox and Global
Optimization Toolbox solvers have the robustness to attempt a different iterative step if they
encounter a NaN value. These robust solvers include:
Some people are tempted to return an arbitrarily large objective function value at an unsuccessful,
infeasible, or other poor point. However, this practice can confuse a solver, because the solver does
not realize that the returned value is arbitrary. When you return NaN, the solver can attempt to
evaluate at a different point.
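A sketch of this pattern (runSimulation is a placeholder for your own computation):

```matlab
function f = robustObjective(x)
% Return NaN when the underlying simulation fails, so the solver can
% attempt a different step instead of stopping.
try
    f = runSimulation(x);   % placeholder for the real computation
catch
    f = NaN;
end
end
```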
Bibliography
[1] Heidelberger, P., X.-R. Cao, M. A. Zazanis, and R. Suri. Convergence properties of infinitesimal
perturbation analysis estimates. Management Science 34, No. 11, pp. 1281–1302, 1988.
[2] Leis, J. R. and Kramer, M.A. The Simultaneous Solution and Sensitivity Analysis of Systems
Described by Ordinary Differential Equations. ACM Trans. Mathematical Software, Vol. 14,
No. 1, pp. 45–60, 1988.
[3] Raue, A. et al. Lessons Learned from Quantitative Dynamical Modeling in Systems Biology.
Available at http://www.plosone.org/article/info:doi/10.1371/
journal.pone.0074335, 2013.
[4] Reiman, M. I. and A. Weiss. Sensitivity analysis via likelihood ratios. Proc. 18th Winter Simulation
Conference, ACM, New York, pp. 285–289, 1986.
5
Optimization App
Optimization App
Note The Optimization app warns that it will be removed in a future release. For alternatives, see
“Optimization App Alternatives” on page 5-12.
In this section...
“Optimization App Basics” on page 5-2
“Specifying Certain Options” on page 5-6
“Importing and Exporting Your Work” on page 5-8
To open the Optimization app, enter
optimtool
in the Command Window. This opens the Optimization app, as shown in the following figure.
You can also start the Optimization app from the MATLAB Apps tab.
The reference page for the Optimization app provides variations for starting the optimtool function.
This is a summary of the steps to set up your optimization problem and view results with the
Optimization app.
• Click Pause to temporarily suspend the algorithm. To resume the algorithm using the current
iteration at the time you paused, click Resume.
• Click Stop to stop the algorithm. The Run solver and view results window displays information
for the current iteration at the moment you clicked Stop.
You can export your results after stopping the algorithm. For details, see “Exporting Your Work” on
page 5-9.
Viewing Results
When a solver terminates, the Run solver and view results window displays the reason the
algorithm terminated. To clear the Run solver and view results window between runs, click Clear
Results.
Sorting the Displayed Results
Depending on the solver and problem, results can be in the form of a table. If the table has multiple
rows, sort the table by clicking a column heading. Click the heading again to sort the results in
reverse.
For example, suppose you use the Optimization app to solve the lsqlin problem described in
“Optimization App with the lsqlin Solver” on page 12-27. The result appears as follows.
To sort the results by value, from lowest to highest, click Value. The results were already in that
order in this example, so they do not change.
To sort the results in reverse order, highest to lowest, click Value again.
For an example of sorting a table returned by the Global Optimization Toolbox gamultiobj function,
see “Pareto Front for Two Objectives” (Global Optimization Toolbox).
If you export results using File > Export to Workspace, the exported results do not depend on the
sorted display.
Final Point
The Final point pane updates to show the coordinates of the final point when the algorithm terminates. If
you do not see the final point, click the upward-pointing triangle on the icon in the lower left.
Selecting File > Reset Optimization Tool resets the problem definition and options to the original
default values. This action is equivalent to closing and restarting the app.
To clear only the problem definition, select File > Clear Problem Fields. With this action, fields in
the Problem Setup and Results pane are reset to the defaults, with the exception of the selected
solver and algorithm choice. Any options that you have modified from the default values in the
Options pane are not reset with this action.
Setting Preferences for Changing Solvers
To modify how your options are handled in the Optimization app when you change solvers, select File
> Preferences, which opens the Preferences dialog box shown below.
The default value, Reset options to defaults, discards any options you specified previously in the
optimtool. Under this choice, you can select the option Prompt before resetting options to
defaults.
Alternatively, you can select Keep current options if possible to preserve the values you have
modified. Changed options that are not valid with the newly selected solver are kept but not used,
while active options relevant to the new solver selected are used. This choice allows you to try
different solvers with your problem without losing your options.
Plot Functions
You can select a plot function to easily plot various measures of progress while the algorithm
executes. Each plot selected draws a separate axis in the figure window. If available for the solver
selected, you can click the Stop button in the Run solver and view results window to interrupt a
running solver.
You can select a predefined plot function from the Optimization app, or you can select Custom
function to write your own. Plot functions not relevant to the solver selected are grayed out. The
following lists the available plot functions:
• Current point — Select to show a bar plot of the point at the current iteration.
• Function count — Select to plot the number of function evaluations at each iteration.
• Function value — Select to plot the function value at each iteration.
• Norm of residuals — Select to show a bar plot of the current norm of residuals at the current
iteration.
• Max constraint — Select to plot the maximum constraint violation value at each iteration.
• Current step — Select to plot the algorithm step size at each iteration.
• First order optimality — Select to plot the violation of the optimality conditions for the solver at
each iteration.
• Custom function — Enter your own plot function as a function handle. To provide more than one
plot function use a cell array, for example, by typing:
{@plotfcn,@plotfcn2}
Write custom plot functions with the same syntax as output functions. For information, see
“Output Function Syntax” on page 15-26.
The graphic above shows the plot functions available for the default fmincon solver.
Output function
Output function is a function or collection of functions the algorithm calls at each iteration.
Through an output function you can observe optimization quantities such as function values, gradient
values, and current iteration. Specify no output function, a single output function using a function
handle, or multiple output functions. To provide more than one output function use a cell array of
function handles in the Custom function field, for example by typing:
{@outputfcn,@outputfcn2}
For more information on writing an output function, see “Output Function Syntax” on page 15-26.
Select Level of display to specify the amount of information displayed when you run the algorithm.
Choose from the following; depending on the solver, only some may be available:
See “Enhanced Exit Messages” on page 3-4 for information on detailed messages.
Selecting Show diagnostics lists problem information and options that have changed from the
defaults.
The graphic below shows the display options for the fmincon solver. Some other solvers have fewer
options.
The Export to Workspace dialog box enables you to send your problem information to the MATLAB
workspace as a structure or object that you may then manipulate in the Command Window.
To access the Export to Workspace dialog box shown below, select File > Export to Workspace.
Exported results contain all optional information. For example, an exported results structure for
lsqcurvefit contains the data x, resnorm, residual, exitflag, output, lambda, and
jacobian.
After you have exported information from the Optimization app to the MATLAB workspace, you can
see your data in the MATLAB Workspace browser or by typing the name of the structure at the
Command Window. To see the value of a field in a structure or object, double-click the name in the
Workspace window. Alternatively, see the values by entering exportname.fieldname at the
command line. For example, to see the message in an output structure, enter output.message. If a
structure contains structures or objects, you can double-click again in the workspace browser, or
enter exportname.name2.fieldname at the command line. For example, to see the level of
iterative display contained in the options of an exported problem structure, enter
optimproblem.options.Display.
You can run a solver on an exported problem at the command line by typing
solver(problem)
For example, if you have exported an fmincon problem named optimproblem, you can type
fmincon(optimproblem)
This runs fmincon on the problem with the saved options in optimproblem. You can exercise more
control over outputs by typing, for example,
[x,fval,exitflag] = fmincon(optimproblem)
Caution For Optimization Toolbox solvers, the Optimization app imports and exports only one option
related to the former TolFun tolerance. It displays this option as Function tolerance, and uses it as
the OptimalityTolerance option. You cannot import, export, or change the FunctionTolerance
option in the Optimization app.
However, Global Optimization Toolbox solvers do not have an OptimalityTolerance option. Those
solvers can import, export, and set the FunctionTolerance option in the Optimization app.
Whether you save options from Optimization Toolbox functions at the Command Window, or whether
you export options, or the problem and options, from the Optimization app, you can resume work on
your problem using the Optimization app.
There are three ways to import your options, or problem and options, to the Optimization app:
• Call the optimtool function from the Command Window specifying your options, or problem and
options, as the input, for example,
optimtool(options)
• Select File > Import Options in the Optimization app.
• Select File > Import Problem in the Optimization app.
The methods described above require that the options, or problem and options, be present in the
MATLAB workspace.
If you import a problem that was generated with the Include information needed to resume this
run box checked, the initial point is the latest point generated in the previous run. (For Genetic
Algorithm solvers, the initial population is the latest population generated in the previous run. For the
Simulated Annealing solver, the initial point is the best point generated in the previous run.) If you
import a problem that was generated with this box unchecked, the initial point (or population) is the
initial point (or population) of the previous run.
Generating a File
You may want to generate a file to continue with your optimization problem in the Command Window
at another time. You can run the file without modification to recreate the results that you created with
the Optimization app. You can also edit and modify the file and run it from the Command Window.
To export data from the Optimization app to a file, select File > Generate Code.
• The problem definition, including the solver, information on the function to be minimized,
algorithm specification, constraints, and start point
Running the file at the Command Window reproduces your problem results.
Although you cannot export your problem results to a generated file, you can save them in a MAT-file
that you can use with your generated file, by exporting the results using the Export to Workspace
dialog box, then saving the data to a MAT-file from the Command Window.
See Also
More About
• “Optimization App Alternatives” on page 5-12
• “Solver-Based Optimization Problem Setup”
• Set options easily — “Set Options Using Live Scripts” on page 5-12 or “Set Options: Command
Line or Standard Scripts” on page 5-14
• Monitor the optimization — “Choose Plot Functions” on page 5-15
• Pass solver arguments correctly — “Pass Solver Arguments” on page 5-16
1 On the Home tab, in the File section, click New Live Script to create a live script.
2 In the Live Editor, set options by entering options = optimoptions(. MATLAB shows a list of
solvers.
Optimization App Alternatives
3 Select a solver, then enter a comma. MATLAB displays a list of name-value pairs for the solver.
[x,fval,exitflag,output] = ...
fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nlcon,options)
Tip
• For help choosing a solver, see “Optimization Decision Table” on page 2-4.
• For help choosing the solver algorithm, see “Choosing the Algorithm” on page 2-6.
• To understand the meaning of other options, see “Set Options”.
1 Set options by entering options = optimoptions(' and pressing Tab. MATLAB shows a list
of solvers.
[x,fval,exitflag,output] = ...
fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nlcon,options)
Tip
• For help choosing a solver, see “Optimization Decision Table” on page 2-4.
• For help choosing the solver algorithm, see “Choosing the Algorithm” on page 2-6.
• To understand the meaning of other options, see “Set Options”.
To choose plot functions using the Editor or the command line, enter options =
optimoptions('solvername','PlotFcn',{' and then press Tab:
To choose a custom plot function, pass a function handle such as @myplotfun. Custom plot functions
use the same syntax as output functions. See “Output Functions” on page 3-32 and “Output Function
Syntax” on page 15-26.
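For example, to select two built-in plot functions at the command line:

```matlab
options = optimoptions('fmincon','PlotFcn',...
    {@optimplotfval,@optimplotfirstorderopt});
```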
linprog, lsqlin, quadprog, and lsqnonneg do not support plot functions, because these solvers
typically run quickly. To monitor their progress, you can use iterative display for linprog, the
lsqlin 'interior-point' algorithm, and the quadprog 'interior-point-convex' algorithm.
Set the 'Display' option to 'iter'.
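For example, to monitor linprog progress:

```matlab
options = optimoptions('linprog','Display','iter');
```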
The fminbnd, fminsearch, and fzero solvers do not use options created by optimoptions, only
optimset. To see which plot functions these solvers use, consult their reference pages:
• fminbnd options
• fminsearch options
• fzero options
If you want to include only the fun, x0, lb, and options arguments, then the appropriate syntax is
fmincon(fun,x0,[],[],[],[],lb,[],[],options)
A tempting, but incorrect, syntax is
fmincon(fun,x0,lb,options)
This call throws an error, because fmincon interprets the lb argument as representing the A matrix,
and the options argument as representing the b vector. The third argument always represents the A
matrix, and the fourth argument always represents the b vector.
It can be difficult to keep track of positional arguments as you enter a command. The following are
suggestions for obtaining the correct syntax.
• Use live scripts. As you enter a command, you see function hints that guide you to enter the
correct argument in each position. Enter [] for unused arguments.
• Use the MATLAB Editor or command line. As you enter commands, you see lists of proper syntax
that guide you to enter the correct argument in each position. Enter [] for unused arguments.
• Create a problem structure. This way, you can pass fewer arguments and pass named arguments
instead of positional arguments. For fmincon, the problem structure requires at least the
objective, x0, solver, and options fields. So, to give only the fun, x0, lb, and options
arguments, create a problem structure as follows:
% These commands assume that fun, x0, lb, and opts exist
prob.objective = fun;
prob.x0 = x0;
prob.lb = lb;
prob.options = opts;
prob.solver = 'fmincon';
You can also create a problem structure using one struct command.
% This command assumes that fun, x0, lb, and opts exist
prob = struct('objective',fun,'x0',x0,'lb',lb,...
'options',opts,'solver','fmincon')
• If you have Global Optimization Toolbox, you can create a problem structure for fmincon,
fminunc, lsqcurvefit, and lsqnonlin by using createOptimProblem.
See Also
More About
• “Solver-Based Optimization Problem Setup”
• “Set Options”
• “Run Sections in Live Scripts” (MATLAB)
• “Code Sections” (MATLAB)
6
min_x f(x)
Many of the methods used in Optimization Toolbox solvers are based on trust regions, a simple yet
powerful concept in optimization.
The current point is updated to be x + s if f(x + s) < f(x); otherwise, the current point remains
unchanged and N, the region of trust, is shrunk and the trial step computation is repeated.
The key questions in defining a specific trust-region approach to minimizing f(x) are how to choose
and compute the approximation q (defined at the current point x), how to choose and modify the trust
region N, and how accurately to solve the trust-region subproblem. This section focuses on the
unconstrained problem. Later sections discuss additional complications due to the presence of
constraints on the variables.
In the standard trust-region method ([48]), the quadratic approximation q is defined by the first two
terms of the Taylor approximation to F at x; the neighborhood N is usually spherical or ellipsoidal in
shape. Mathematically the trust-region subproblem is typically stated
min { (1/2) sᵀHs + sᵀg  such that  ‖Ds‖ ≤ Δ },   (6-2)
where g is the gradient of f at the current point x, H is the Hessian matrix (the symmetric matrix of
second derivatives), D is a diagonal scaling matrix, Δ is a positive scalar, and ∥ . ∥ is the 2-norm. Good
Unconstrained Nonlinear Optimization Algorithms
algorithms exist for solving “Equation 6-2” (see [48]); such algorithms typically involve the
computation of all eigenvalues of H and a Newton process applied to the secular equation
1/Δ − 1/‖s‖ = 0.
Such algorithms provide an accurate solution to “Equation 6-2”. However, they require time
proportional to several factorizations of H. Therefore, for large-scale problems a different approach is
needed. Several approximation and heuristic strategies, based on “Equation 6-2”, have been
proposed in the literature ([42] and [50]). The approximation approach followed in Optimization
Toolbox solvers is to restrict the trust-region subproblem to a two-dimensional subspace S ([39] and
[42]). Once the subspace S has been computed, the work to solve “Equation 6-2” is trivial even if full
eigenvalue/eigenvector information is needed (since in the subspace, the problem is only two-
dimensional). The dominant work has now shifted to the determination of the subspace.
The two-dimensional subspace S is determined with the aid of a preconditioned conjugate gradient
process described below. The solver defines S as the linear space spanned by s1 and s2, where s1 is in
the direction of the gradient g, and s2 is either an approximate Newton direction, i.e., a solution to
H ⋅ s2 = − g, (6-3)
or a direction of negative curvature,
s2ᵀ ⋅ H ⋅ s2 < 0. (6-4)
The philosophy behind this choice of S is to force global convergence (via the steepest descent
direction or negative curvature direction) and achieve fast local convergence (via the Newton step,
when it exists).
These four steps are repeated until convergence. The trust-region dimension Δ is adjusted according
to standard rules. In particular, it is decreased if the trial step is not accepted, i.e., f(x + s) ≥ f(x). See
[46] and [49] for a discussion of this aspect.
Optimization Toolbox solvers treat a few important special cases of f with specialized functions:
nonlinear least-squares, quadratic functions, and linear least-squares. However, the underlying
algorithmic ideas are the same as for the general case. These special cases are discussed in later
sections.
A popular way to solve large, symmetric, positive definite systems of linear equations Hp = –g is the
method of Preconditioned Conjugate Gradients (PCG). This iterative approach requires the ability to
calculate matrix-vector products of the form H·v where v is an arbitrary vector. The symmetric
positive definite matrix M is a preconditioner for H. That is, M = C2, where C–1HC–1 is a well-
conditioned matrix or a matrix with clustered eigenvalues.
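MATLAB's own pcg function illustrates the idea. This sketch solves a sample symmetric positive definite system Hp = −g with an incomplete Cholesky preconditioner (the test matrix and tolerances are illustrative):

```matlab
n = 100;
H = gallery('tridiag',n,-1,4,-1);   % sparse SPD test matrix
g = ones(n,1);
L = ichol(H);                       % preconditioner M = L*L'
p = pcg(H,-g,1e-8,200,L,L');        % approximate solution of H*p = -g
```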
6 Nonlinear algorithms and examples
In a minimization context, you can assume that the Hessian matrix H is symmetric. However, H is
guaranteed to be positive definite only in the neighborhood of a strong minimizer. Algorithm PCG
exits when it encounters a direction of negative (or zero) curvature, that is, dTHd ≤ 0. The PCG
output direction p is either a direction of negative curvature or an approximate solution to the
Newton system Hp = –g. In either case, p helps to define the two-dimensional subspace used in the
trust-region approach discussed in “Trust-Region Methods for Nonlinear Minimization” on page 6-2.
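The PCG iteration with the negative-curvature exit can be sketched in a few lines. The following is an illustrative pure-Python sketch, not toolbox code; the function name pcg_direction and the identity preconditioner M = I are simplifying assumptions:

```python
def pcg_direction(H, g, tol=1e-10, max_iter=50):
    """Conjugate-gradient sketch for H p = -g with M = I (illustrative only).

    Returns an approximate solution p, or the current search direction d
    if negative (or zero) curvature d'Hd <= 0 is encountered.
    H is a list of rows; g is a list of floats.
    """
    n = len(g)
    matvec = lambda A, v: [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))

    p = [0.0] * n
    r = [-gi for gi in g]           # residual of H p = -g at p = 0
    d = r[:]                        # first CG direction (M = I, so z = r)
    for _ in range(max_iter):
        Hd = matvec(H, d)
        curv = dot(d, Hd)
        if curv <= 0:               # negative/zero curvature: exit with d
            return d, 'negative curvature'
        alpha = dot(r, r) / curv
        p = [pi + alpha * di for pi, di in zip(p, d)]
        r_new = [ri - alpha * hdi for ri, hdi in zip(r, Hd)]
        if dot(r_new, r_new) ** 0.5 < tol:
            return p, 'converged'
        beta = dot(r_new, r_new) / dot(r, r)
        d = [rni + beta * di for rni, di in zip(r_new, d)]
        r = r_new
    return p, 'max iterations'
```

For a positive definite H the routine returns an approximate Newton step; for an indefinite H it exits with the direction along which negative curvature was detected, mirroring the behavior described above.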
Although a wide spectrum of methods exists for unconstrained optimization, methods can be broadly
categorized in terms of the derivative information that is, or is not, used. Search methods that use
only function evaluations (e.g., the simplex search of Nelder and Mead [30]) are most suitable for
problems that are not smooth or have a number of discontinuities. Gradient methods are generally
more efficient when the function to be minimized is continuous in its first derivative. Higher order
methods, such as Newton's method, are only really suitable when the second-order information is
readily and easily calculated, because calculation of second-order information, using numerical
differentiation, is computationally expensive.
Gradient methods use information about the slope of the function to dictate a direction of search
where the minimum is thought to lie. The simplest of these is the method of steepest descent in which
a search is performed in a direction, –∇f(x), where ∇f(x) is the gradient of the objective function. This
method is very inefficient when the function to be minimized has long narrow valleys as, for example,
is the case for Rosenbrock's function
f(x) = 100(x2 − x1²)² + (1 − x1)². (6-5)
The minimum of this function is at x = [1,1], where f(x) = 0. A contour map of this function is shown
in the figure below, along with the solution path to the minimum for a steepest descent
implementation starting at the point [-1.9,2]. The optimization was terminated after 1000 iterations,
still a considerable distance from the minimum. The black areas are where the method is continually
zigzagging from one side of the valley to another. Note that toward the center of the plot, a number of
larger steps are taken when a point lands exactly at the center of the valley.
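The zigzag behavior is easy to reproduce. The following pure-Python sketch (illustrative only, not toolbox code) runs steepest descent with a simple Armijo backtracking line search on Rosenbrock's function:

```python
def rosenbrock(x1, x2):
    return 100.0 * (x2 - x1 ** 2) ** 2 + (1.0 - x1) ** 2

def rosenbrock_grad(x1, x2):
    g1 = -400.0 * x1 * (x2 - x1 ** 2) - 2.0 * (1.0 - x1)
    g2 = 200.0 * (x2 - x1 ** 2)
    return g1, g2

def steepest_descent(x1, x2, iters=1000):
    """Steepest descent with Armijo backtracking (illustrative sketch)."""
    f = rosenbrock(x1, x2)
    history = [f]
    for _ in range(iters):
        g1, g2 = rosenbrock_grad(x1, x2)
        gg = g1 * g1 + g2 * g2
        alpha = 1.0
        # Backtrack until the sufficient-decrease (Armijo) condition holds.
        while rosenbrock(x1 - alpha * g1, x2 - alpha * g2) > f - 1e-4 * alpha * gg:
            alpha *= 0.5
            if alpha < 1e-20:       # cannot decrease further: stop moving
                alpha = 0.0
                break
        x1, x2 = x1 - alpha * g1, x2 - alpha * g2
        f = rosenbrock(x1, x2)
        history.append(f)
    return (x1, x2), history
```

Starting from [-1.9, 2], the function value decreases monotonically but progress along the valley floor is slow, which is the inefficiency the figure illustrates.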
Unconstrained Nonlinear Optimization Algorithms
This function, also known as the banana function, is notorious in unconstrained examples because of
the way the curvature bends around the origin. Rosenbrock's function is used throughout this section
to illustrate the use of a variety of optimization techniques. The contours have been plotted in
exponential increments because of the steepness of the slope surrounding the U-shaped valley.
For a more complete description of this figure, including scripts that generate the iterative points, see
“Banana Function Minimization”.
Quasi-Newton Methods
Of the methods that use gradient information, the most favored are the quasi-Newton methods. These
methods build up curvature information at each iteration to formulate a quadratic model problem of
the form
min_x ½ xT Hx + cT x + b, (6-6)
where the Hessian matrix, H, is a positive definite symmetric matrix, c is a constant vector, and b is a constant. The optimal solution for this problem occurs when the partial derivatives with respect to x go to zero, i.e.,
∇f(x*) = Hx* + c = 0. (6-7)
The optimal solution point, x*, can therefore be written as
x* = − H−1c . (6-8)
Newton-type methods (as opposed to quasi-Newton methods) calculate H directly and proceed in a
direction of descent to locate the minimum after a number of iterations. Calculating H numerically
involves a large amount of computation. Quasi-Newton methods avoid this by using the observed
behavior of f(x) and ∇f(x) to build up curvature information to make an approximation to H using an
appropriate updating technique.
A large number of Hessian updating methods have been developed. However, the formula of Broyden [3], Fletcher [12], Goldfarb [20], and Shanno [37] (BFGS) is thought to be the most effective for use in a general purpose method:
Hk+1 = Hk + (qk qkT)/(qkT sk) − (Hk sk skT HkT)/(skT Hk sk), (6-9)
where
sk = xk+1 − xk,
qk = ∇f(xk+1) − ∇f(xk).
As a starting point, H0 can be set to any symmetric positive definite matrix, for example, the identity
matrix I. To avoid the inversion of the Hessian H, you can derive an updating method that avoids the
direct inversion of H by using a formula that makes an approximation of the inverse Hessian H–1 at
each update. A well-known procedure is the DFP formula of Davidon [7], Fletcher, and Powell [14].
This uses the same formula as the BFGS method (“Equation 6-9”) except that qk is substituted for sk.
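As an illustration of the update, here is a pure-Python sketch (not toolbox code; the skip rule when qkT sk ≤ 0 is a simplification) of the BFGS Hessian update:

```python
def bfgs_update(H, s, q):
    """BFGS update of the Hessian approximation (sketch of Equation 6-9).

    H_{k+1} = H_k + q q'/(q's) - (H s)(H s)'/(s' H s)  for symmetric H.
    Operates on lists of rows; skips the update when q's <= 0, which
    would destroy positive definiteness.
    """
    n = len(s)
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    Hs = [dot(row, s) for row in H]
    qs = dot(q, s)
    sHs = dot(s, Hs)
    if qs <= 0 or sHs <= 0:
        return H                    # keep the old approximation
    return [[H[i][j] + q[i] * q[j] / qs - Hs[i] * Hs[j] / sHs
             for j in range(n)] for i in range(n)]
```

A quick check of the secant condition H_{k+1} s_k = q_k confirms the update: the two correction terms replace the old curvature information along s_k with the observed gradient change q_k.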
The gradient information is either supplied through analytically calculated gradients, or derived by partial derivatives using a numerical differentiation method via finite differences. This involves perturbing each of the design variables, x, in turn and calculating the rate of change in the objective function.
At each major iteration, k, a line search is performed in the direction
d = − Hk−1 ⋅ ∇f(xk). (6-10)
The quasi-Newton method is illustrated by the solution path on Rosenbrock's function in “Figure 6-2,
BFGS Method on Rosenbrock's Function” on page 6-6. The method is able to follow the shape of
the valley and converges to the minimum after 140 function evaluations using only finite difference
gradients.
For a more complete description of this figure, including scripts that generate the iterative points, see
“Banana Function Minimization”.
Line Search
Line search is a search method that is used as part of a larger optimization algorithm. At each step of
the main algorithm, the line-search method searches along the line containing the current point, xk,
parallel to the search direction, which is a vector determined by the main algorithm. That is, the
method finds the next iterate xk+1 of the form
xk + 1 = xk + α*dk, (6-11)
where xk denotes the current iterate, dk is the search direction, and α* is a scalar step length
parameter.
The line search method attempts to decrease the objective function along the line xk + α*dk by
repeatedly minimizing polynomial interpolation models of the objective function. The line search
procedure has two main steps:
• The bracketing phase determines the range of points on the line xk + 1 = xk + α*dk to be searched.
The bracket corresponds to an interval specifying the range of values of α.
• The sectioning step divides the bracket into subintervals, on which the minimum of the objective
function is approximated by polynomial interpolation.
f(xk + αk dk) ≤ f(xk) + c1 αk ∇f(xk)T dk, (6-12)
∇f(xk + αk dk)T dk ≥ c2 ∇f(xk)T dk, (6-13)
where 0 < c1 < c2 < 1.
The first condition (“Equation 6-12”) requires that αk sufficiently decreases the objective function.
The second condition (“Equation 6-13”) ensures that the step length is not too small. Points that
satisfy both conditions (“Equation 6-12” and “Equation 6-13”) are called acceptable points.
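These two conditions can be checked mechanically. Here is a one-dimensional pure-Python sketch (illustrative only; is_acceptable is an assumed name, and c1 and c2 are typical default values, not toolbox settings):

```python
def is_acceptable(f, grad, x, d, alpha, c1=1e-4, c2=0.9):
    """Check the two line-search conditions of Equations 6-12 and 6-13.

    Sufficient decrease: f(x + a d) <= f(x) + c1 * a * grad(x)*d
    Curvature:           grad(x + a d)*d >= c2 * grad(x)*d
    Scalar x for simplicity; assumes 0 < c1 < c2 < 1.
    """
    gd0 = grad(x) * d
    sufficient = f(x + alpha * d) <= f(x) + c1 * alpha * gd0
    curvature = grad(x + alpha * d) * d >= c2 * gd0
    return sufficient and curvature
```

For f(x) = x² at x = 1 with the steepest-descent direction d = −2, a near-exact step is acceptable, a tiny step fails the curvature condition (it is "too small"), and an overly long step fails the sufficient-decrease condition.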
The line search method is an implementation of the algorithm described in Section 2-6 of [13]. See
also [31] for more information about line search.
Hessian Update
Many of the optimization functions determine the direction of search by updating the Hessian matrix
at each iteration, using the BFGS method (“Equation 6-9”). The function fminunc also provides an
option to use the DFP method given in “Quasi-Newton Methods” on page 6-5 (set HessUpdate to
'dfp' in options to select the DFP method). The Hessian, H, is always maintained to be positive
definite so that the direction of search, d, is always in a descent direction. This means that for some
arbitrarily small step α in the direction d, the objective function decreases in magnitude. You achieve
positive definiteness of H by ensuring that H is initialized to be positive definite and thereafter qkT sk
(from “Equation 6-14”) is always positive. The term qkT sk is a product of the line search step length
parameter αk and a combination of the search direction d with past and present gradient evaluations,
qkT sk = αk (∇f(xk+1)T d − ∇f(xk)T d). (6-14)
You always achieve the condition that qkT sk is positive by performing a sufficiently accurate line
search. This is because the search direction, d, is a descent direction, so that αk and the negative
gradient –∇f(xk)Td are always positive. Thus, the possible negative term –∇f(xk+1)Td can be made as
small in magnitude as required by increasing the accuracy of the line search.
See Also
fminunc
More About
• “fminsearch Algorithm” on page 6-9
fminsearch Algorithm
fminsearch uses the Nelder-Mead simplex algorithm as described in Lagarias et al. [57]. This
algorithm uses a simplex of n + 1 points for n-dimensional vectors x. The algorithm first makes a
simplex around the initial guess x0 by adding 5% of each component x0(i) to x0, and using these n
vectors as elements of the simplex in addition to x0. (It uses 0.00025 as component i if x0(i) = 0.)
Then, the algorithm modifies the simplex repeatedly according to the following procedure.
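The initial simplex construction can be sketched in a few lines of Python (illustrative only; initial_simplex is an assumed name, not the toolbox implementation):

```python
def initial_simplex(x0, usual_delta=0.05, zero_term_delta=0.00025):
    """Initial simplex sketch for the Nelder-Mead method as described above.

    Builds n+1 points from x0: x0 itself plus, for each component i, a copy
    of x0 with component i increased by 5% (or set to 0.00025 when
    x0[i] == 0).
    """
    simplex = [list(x0)]
    for i in range(len(x0)):
        point = list(x0)
        if point[i] != 0:
            point[i] *= 1.0 + usual_delta
        else:
            point[i] = zero_term_delta
        simplex.append(point)
    return simplex
```

For x0 = [1, 0] this yields the three vertices [1, 0], [1.05, 0], and [1, 0.00025], matching the 5% / 0.00025 rule stated above.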
Note The keywords for the fminsearch iterative display appear in bold after the description of the
step.
1 Let x(i) denote the list of points in the current simplex, i = 1,...,n+1.
2 Order the points in the simplex from lowest function value f(x(1)) to highest f(x(n+1)). At each
step in the iteration, the algorithm discards the current worst point x(n+1), and accepts another
point into the simplex. [Or, in the case of step 7 below, it changes all n points with values above
f(x(1))].
3 Generate the reflected point
r = 2m – x(n+1),
where
m = Σx(i)/n, i = 1,...,n,
and calculate f(r).
4 If f(x(1)) ≤ f(r) < f(x(n)), accept r and terminate this iteration. Reflect
5 If f(r) < f(x(1)), calculate the expansion point s,
s = m + 2(m – x(n+1)),
and calculate f(s).
a If f(s) < f(r), accept s and terminate the iteration. Expand
b Otherwise, accept r and terminate the iteration. Reflect
6 If f(r) ≥ f(x(n)), perform a contraction between m and the better of x(n+1) and r:
a If f(r) < f(x(n+1)) (that is, r is better than x(n+1)), calculate
c = m + (r – m)/2
and calculate f(c). If f(c) < f(r), accept c and terminate the iteration. Contract outside
Otherwise, continue with Step 7 (Shrink).
b If f(r) ≥ f(x(n+1)), calculate
cc = m + (x(n+1) – m)/2
and calculate f(cc). If f(cc) < f(x(n+1)), accept cc and terminate the iteration. Contract inside
Otherwise, continue with Step 7 (Shrink).
7 Calculate the n points
v(i) = x(1) + (x(i) – x(1))/2
and calculate f(v(i)), i = 2,...,n+1. The simplex at the next iteration is x(1), v(2),...,v(n+1). Shrink
The following figure shows the points that fminsearch might calculate in the procedure, along with
each possible new simplex. The original simplex has a bold outline. The iterations proceed until they
meet a stopping criterion.
See Also
fminsearch
More About
• “Unconstrained Nonlinear Optimization Algorithms” on page 6-2
Unconstrained Minimization Using fminunc
min_x f(x) = e^(x1) (4x1² + 2x2² + 4x1x2 + 2x2 + 1).
To solve this two-dimensional problem, write a function that returns f (x). Then, invoke the
unconstrained minimization routine fminunc starting from the initial point x0 = [-1,1].
The helper function objfun at the end of this example calculates f(x).
To find the minimum of f (x), set the initial point and call fminunc.
x0 = [-1,1];
[x,fval,exitflag,output] = fminunc(@objfun,x0);
View the results, including the first-order optimality measure in the output structure.
disp(x)
0.5000 -1.0000
disp(fval)
3.6609e-15
disp(exitflag)
1
disp(output.firstorderopt)
1.2284e-07
The exitflag output indicates whether the algorithm converges. exitflag = 1 means fminunc
finds a local minimum.
The output structure gives more details about the optimization. For fminunc, the structure includes the number of iterations, the number of function evaluations, the first-order optimality measure, and other information.
Helper Function
function f = objfun(x)
f = exp(x(1)) * (4*x(1)^2 + 2*x(2)^2 + 4*x(1)*x(2) + 2*x(2) + 1);
end
See Also
Related Examples
• “Minimization with Gradient and Hessian” on page 6-13
More About
• “Set Options”
• “Solver Outputs and Iterative Display”
Minimization with Gradient and Hessian
The problem is to minimize the brown function
f(x) = ∑i=1..n−1 [ (xi²)^((xi+1)² + 1) + ((xi+1)²)^(xi² + 1) ],
where n = 1000.
The helper function brownfgh at the end of this example calculates f(x), its gradient
g(x), and its Hessian H(x). To specify that the fminunc solver use the derivative information, set the
SpecifyObjectiveGradient and HessianFcn options using optimoptions. To use a Hessian
with fminunc, you must use the 'trust-region' algorithm.
options = optimoptions(@fminunc,'Algorithm','trust-region',...
'SpecifyObjectiveGradient',true,'HessianFcn','objective');
Set the parameter n to 1000, and set the initial point xstart to –1 for odd components and +1 for
even components.
n = 1000;
xstart = -ones(n,1);
xstart(2:2:n) = 1;
Solve the problem by calling fminunc with the initial point xstart and the options.
[x,fval,exitflag,output] = fminunc(@brownfgh,xstart,options);
disp(fval)
2.8709e-17
disp(exitflag)
1
disp(output)
iterations: 7
funcCount: 8
stepsize: 0.0039
cgiterations: 7
firstorderopt: 4.7948e-10
algorithm: 'trust-region'
message: '...'
constrviolation: []
The function f (x) is a sum of powers of squares, so is nonnegative. The solution fval is nearly zero,
so is clearly a minimum. The exit flag 1 also indicates that fminunc found a solution. The output
structure shows that fminunc took just seven iterations to reach the solution.
disp(max(x))
1.1987e-10
disp(min(x))
-1.1987e-10
Helper Function
i=[(1:n)';(1:(n-1))'];
j=[(1:n)';(2:n)'];
s=[v0;2*v1];
H=sparse(i,j,s,n,n);
H=(H+H')/2;
end
end
See Also
Related Examples
• “Minimization with Gradient and Hessian Sparsity Pattern” on page 6-16
This example shows how to solve a nonlinear minimization problem using a sparsity pattern for the finite-difference approximation of the Hessian. The problem is to minimize the same brown function as in “Minimization with Gradient and Hessian” on page 6-13, where n = 1000.
n = 1000;
To use the trust-region method in fminunc, you must compute the gradient in the objective function; it is not optional as it is in the quasi-Newton method.
The helper function brownfg at the end of this example computes the objective
function and gradient.
To allow efficient computation of the sparse finite-difference approximation of the Hessian matrix
H(x), the sparsity structure of H must be predetermined. In this case, the structure Hstr, a sparse
matrix, is available in the file brownhstr.mat. Using the spy command, you can see that Hstr is,
indeed, sparse (only 2998 nonzeros).
load brownhstr
spy(Hstr)
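The 2998 count is exactly what a tridiagonal pattern of size n = 1000 gives, since n + 2(n − 1) = 3n − 2 = 2998. A pure-Python stand-in for such a pattern (illustrative only; this is not the contents of brownhstr.mat):

```python
def tridiagonal_pattern(n):
    """Sparsity pattern (set of 0-based (row, col) pairs) of an n-by-n
    tridiagonal matrix: the main diagonal plus the two adjacent diagonals.
    """
    return {(i, j) for i in range(n) for j in range(n) if abs(i - j) <= 1}
```

Counting the pairs for n = 1000 reproduces the 2998 nonzeros that spy(Hstr) reports.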
Minimization with Gradient and Hessian Sparsity Pattern
Set the HessPattern option to Hstr using optimoptions. When a problem as large as this one has an obvious sparsity structure, not setting the HessPattern option uses a great amount of memory and computation unnecessarily, because fminunc attempts to use finite differencing on a full Hessian matrix of one million entries.
To use the Hessian sparsity pattern, you must use the trust-region algorithm of fminunc. This
algorithm also requires you to set the SpecifyObjectiveGradient option to true using
optimoptions.
options = optimoptions(@fminunc,'Algorithm','trust-region',...
'SpecifyObjectiveGradient',true,'HessPattern',Hstr);
Set the objective function to @brownfg. Set the initial point to –1 for odd x components and +1 for
even x components.
xstart = -ones(n,1);
xstart(2:2:n,1) = 1;
fun = @brownfg;
Solve the problem by calling fminunc using the initial point xstart and options options.
[x,fval,exitflag,output] = fminunc(fun,xstart,options);
disp(fval)
7.4739e-17
disp(exitflag)
disp(output)
iterations: 7
funcCount: 8
stepsize: 0.0046
cgiterations: 7
firstorderopt: 7.9822e-10
algorithm: 'trust-region'
message: '...'
constrviolation: []
The function f (x) is a sum of powers of squares and, therefore, is nonnegative. The solution fval is
nearly zero, so it is clearly a minimum. The exit flag 1 also indicates that fminunc finds a solution.
The output structure shows that fminunc takes only seven iterations to reach the solution.
disp(max(x))
1.9955e-10
disp(min(x))
-1.9955e-10
Helper Function
See Also
Related Examples
• “Minimization with Gradient and Hessian” on page 6-13
Constrained Nonlinear Optimization Algorithms
In this section...
“Constrained Optimization Definition” on page 6-19
“fmincon Trust Region Reflective Algorithm” on page 6-19
“fmincon Active Set Algorithm” on page 6-22
“fmincon SQP Algorithm” on page 6-29
“fmincon Interior Point Algorithm” on page 6-30
“fminbnd Algorithm” on page 6-32
“fseminf Problem Formulation and Algorithm” on page 6-32
Constrained minimization is the problem of finding a vector x that is a local minimum to a scalar function f(x) subject to constraints on the allowable x:
min_x f(x)
such that one or more of the following holds: c(x) ≤ 0, ceq(x) = 0, A·x ≤ b, Aeq·x = beq, l ≤ x ≤ u.
There are even more constraints used in semi-infinite programming; see “fseminf Problem
Formulation and Algorithm” on page 6-32.
Many of the methods used in Optimization Toolbox solvers are based on trust regions, a simple yet powerful concept in optimization.
To understand the trust-region approach, consider the unconstrained problem of minimizing f(x). The basic idea is to approximate f with a simpler function q, which reasonably reflects the behavior of f in a neighborhood N around the current point x; this neighborhood is the trust region. A trial step s is computed by minimizing (or approximately minimizing) over N:
min_s { q(s), s ∈ N }. (6-15)
The current point is updated to be x + s if f(x + s) < f(x); otherwise, the current point remains unchanged and N, the region of trust, is shrunk and the trial step computation is repeated.
The key questions in defining a specific trust-region approach to minimizing f(x) are how to choose
and compute the approximation q (defined at the current point x), how to choose and modify the trust
region N, and how accurately to solve the trust-region subproblem. This section focuses on the
unconstrained problem. Later sections discuss additional complications due to the presence of
constraints on the variables.
In the standard trust-region method ([48]), the quadratic approximation q is defined by the first two
terms of the Taylor approximation to F at x; the neighborhood N is usually spherical or ellipsoidal in
shape. Mathematically the trust-region subproblem is typically stated
min_s { ½ sT Hs + sT g such that ‖Ds‖ ≤ Δ }, (6-16)
where g is the gradient of f at the current point x, H is the Hessian matrix (the symmetric matrix of
second derivatives), D is a diagonal scaling matrix, Δ is a positive scalar, and ∥ . ∥ is the 2-norm. Good
algorithms exist for solving “Equation 6-16” (see [48]); such algorithms typically involve the
computation of all eigenvalues of H and a Newton process applied to the secular equation
1/Δ − 1/‖s‖ = 0.
Such algorithms provide an accurate solution to “Equation 6-16”. However, they require time
proportional to several factorizations of H. Therefore, for large-scale problems a different approach is
needed. Several approximation and heuristic strategies, based on “Equation 6-16”, have been
proposed in the literature ([42] and [50]). The approximation approach followed in Optimization
Toolbox solvers is to restrict the trust-region subproblem to a two-dimensional subspace S ([39] and
[42]). Once the subspace S has been computed, the work to solve “Equation 6-16” is trivial even if full
eigenvalue/eigenvector information is needed (since in the subspace, the problem is only two-
dimensional). The dominant work has now shifted to the determination of the subspace.
The two-dimensional subspace S is determined with the aid of a preconditioned conjugate gradient
process described below. The solver defines S as the linear space spanned by s1 and s2, where s1 is in
the direction of the gradient g, and s2 is either an approximate Newton direction, i.e., a solution to
H ⋅ s2 = −g, (6-17)
or a direction of negative curvature,
s2T ⋅ H ⋅ s2 < 0. (6-18)
The philosophy behind this choice of S is to force global convergence (via the steepest descent
direction or negative curvature direction) and achieve fast local convergence (via the Newton step,
when it exists).
The framework consists of four steps: formulate the two-dimensional trust-region subproblem, solve it to determine the trial step s, update x to x + s if f(x + s) < f(x), and adjust Δ. These four steps are repeated until convergence. The trust-region dimension Δ is adjusted according
to standard rules. In particular, it is decreased if the trial step is not accepted, i.e., f(x + s) ≥ f(x). See
[46] and [49] for a discussion of this aspect.
Optimization Toolbox solvers treat a few important special cases of f with specialized functions:
nonlinear least-squares, quadratic functions, and linear least-squares. However, the underlying
algorithmic ideas are the same as for the general case. These special cases are discussed in later
sections.
A popular way to solve large, symmetric, positive definite systems of linear equations Hp = –g is the
method of Preconditioned Conjugate Gradients (PCG). This iterative approach requires the ability to
calculate matrix-vector products of the form H·v where v is an arbitrary vector. The symmetric
positive definite matrix M is a preconditioner for H. That is, M = C2, where C–1HC–1 is a well-
conditioned matrix or a matrix with clustered eigenvalues.
In a minimization context, you can assume that the Hessian matrix H is symmetric. However, H is
guaranteed to be positive definite only in the neighborhood of a strong minimizer. Algorithm PCG
exits when it encounters a direction of negative (or zero) curvature, that is, dTHd ≤ 0. The PCG
output direction p is either a direction of negative curvature or an approximate solution to the
Newton system Hp = –g. In either case, p helps to define the two-dimensional subspace used in the
trust-region approach discussed in “Trust-Region Methods for Nonlinear Minimization” on page 6-2.
Linear constraints complicate the situation described for unconstrained minimization. However, the
underlying ideas described previously can be carried through in a clean and efficient way. The trust-
region methods in Optimization Toolbox solvers generate strictly feasible iterates.
Linear Equality Constraints
The linear equality constrained minimization problem is
min_x { f(x) such that Ax = b }, (6-19)
where A is an m-by-n matrix (m ≤ n). Some Optimization Toolbox solvers preprocess A to remove strict linear dependencies using a technique based on the LU factorization of AT [46]. Here A is assumed to be of rank m.
The method used to solve “Equation 6-19” differs from the unconstrained approach in two significant
ways. First, an initial feasible point x0 is computed, using a sparse least-squares step, so that Ax0 = b.
Second, Algorithm PCG is replaced with Reduced Preconditioned Conjugate Gradients (RPCG), see
[46], in order to compute an approximate reduced Newton step (or a direction of negative curvature
in the null space of A). The key linear algebra step involves solving systems of the form
[ C ĀT ; Ā 0 ] ⋅ [ s ; t ] = [ r ; 0 ], (6-20)
where Ā approximates A (small nonzeros of A are set to zero provided rank is not lost) and C is a sparse symmetric positive-definite approximation to H, i.e., C ≈ H. See [46] for more details.
Box Constraints
The box constrained problem is of the form
min_x { f(x) such that l ≤ x ≤ u }, (6-21)
where l is a vector of lower bounds, and u is a vector of upper bounds. Some (or all) of the
components of l can be equal to –∞ and some (or all) of the components of u can be equal to ∞. The
method generates a sequence of strictly feasible points. Two techniques are used to maintain
feasibility while achieving robust convergence behavior. First, a scaled modified Newton step
replaces the unconstrained Newton step (to define the two-dimensional subspace S). Second,
reflections are used to increase the step size.
The scaled modified Newton step arises from examining the Kuhn-Tucker necessary conditions for
“Equation 6-21”,
(D(x))−2 g = 0, (6-22)
where
D(x) = diag(|vk|^(−1/2)),
and the vector v(x) is defined for each component 1 ≤ i ≤ n as follows:
• If gi < 0 and ui < ∞, then vi = xi − ui.
• If gi ≥ 0 and li > −∞, then vi = xi − li.
• If gi < 0 and ui = ∞, then vi = −1.
• If gi ≥ 0 and li = −∞, then vi = 1.
The nonlinear system “Equation 6-22” is not differentiable everywhere. Nondifferentiability occurs
when vi = 0. You can avoid such points by maintaining strict feasibility, i.e., restricting l < x < u.
The scaled modified Newton step sk for the nonlinear system of equations given by “Equation 6-22” is
defined as the solution to the linear system
M D sN = −ĝ, (6-23)
where
ĝ = D−1 g = diag(|v|^(1/2)) g, (6-24)
and
M = D−1 H D−1 + diag(g) Jv. (6-25)
Here Jv plays the role of the Jacobian of |v|. Each diagonal component of the diagonal matrix Jv equals
0, –1, or 1. If all the components of l and u are finite, Jv = diag(sign(g)). At a point where gi = 0, vi might not be differentiable, and (Jv)ii = 0 is defined at such a point. Nondifferentiability of this type is not a cause for concern because, for such a component, it is not significant which value vi takes. Further, |vi| will still be discontinuous at this point, but the function |vi|·gi is continuous.
Second, reflections are used to increase the step size. A (single) reflection step is defined as follows.
Given a step p that intersects a bound constraint, consider the first bound constraint crossed by p;
assume it is the ith bound constraint (either the ith upper or ith lower bound). Then the reflection
step pR = p except in the ith component, where pRi = –pi.
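A reflection step is easy to state in code. The following pure-Python sketch (illustrative only; the function names are assumptions) includes detection of the first bound crossed:

```python
def first_crossed_bound(x, p, l, u):
    """Index of the first bound constraint crossed along x + t*p for
    t in (0, 1), with the crossing parameter t, or None if the full step
    stays inside [l, u]. Assumes the strictly feasible case l < x < u.
    Infinite bounds can be given as float('inf') / float('-inf')."""
    best = None
    for i in range(len(x)):
        if p[i] > 0:
            t = (u[i] - x[i]) / p[i]     # step fraction to hit upper bound
        elif p[i] < 0:
            t = (l[i] - x[i]) / p[i]     # step fraction to hit lower bound
        else:
            continue
        if t < 1.0 and (best is None or t < best[1]):
            best = (i, t)
    return best

def reflect_step(p, i):
    """Reflection: pR equals p except in component i, where pR[i] = -p[i]."""
    pr = list(p)
    pr[i] = -pr[i]
    return pr
```

For example, from x = [0, 0] with step p = [2, 0.5] inside the box [−1, 1]², the upper bound on the first component is hit first (at t = 0.5), and the reflected step flips only that component.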
In constrained optimization, the general aim is to transform the problem into an easier subproblem that can then be solved and used as the basis of an iterative process. A characteristic of a large class of early methods is the translation of the constrained problem to a basic unconstrained problem by using a penalty function for constraints that are near or beyond the constraint boundary. In this way the constrained problem is solved using a sequence of parameterized unconstrained optimizations, which in the limit converge to the constrained problem.
The Kuhn-Tucker (KKT) equations are necessary conditions for optimality of a constrained optimization problem:
∇f(x*) + ∑i=1..m λi ⋅ ∇Gi(x*) = 0
λi ⋅ Gi(x*) = 0, i = 1, ..., me (6-26)
λi ≥ 0, i = me + 1, ..., m,
The first equation describes a canceling of the gradients between the objective function and the
active constraints at the solution point. For the gradients to be canceled, Lagrange multipliers (λi, i =
1,...,m) are necessary to balance the deviations in magnitude of the objective function and constraint
gradients. Because only active constraints are included in this canceling operation, constraints that
are not active must not be included in this operation and so are given Lagrange multipliers equal to
0. This is stated implicitly in the last two Kuhn-Tucker equations.
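A tiny worked instance makes the conditions concrete. For min x² subject to x ≥ 1, written with G(x) = 1 − x ≤ 0, the point x* = 1 with multiplier λ = 2 satisfies stationarity, complementarity, and dual feasibility (pure-Python check, illustrative only; kkt_residuals is an assumed name):

```python
def kkt_residuals(grad_f, grad_G, G, x, lam):
    """KKT residuals sketch for one inequality constraint G(x) <= 0:
    stationarity grad_f + lam*grad_G, complementarity lam*G(x),
    and dual feasibility lam >= 0. Scalar case for illustration."""
    stationarity = grad_f(x) + lam * grad_G(x)
    complementarity = lam * G(x)
    dual_feasible = lam >= 0
    return stationarity, complementarity, dual_feasible

# min x^2 subject to x >= 1, i.e., G(x) = 1 - x <= 0; solution x* = 1, lam = 2
st, comp, feas = kkt_residuals(lambda x: 2.0 * x, lambda x: -1.0,
                               lambda x: 1.0 - x, 1.0, 2.0)
```

Here the multiplier λ = 2 exactly balances the objective gradient against the active constraint gradient, as the canceling argument above describes.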
The solution of the KKT equations forms the basis to many nonlinear programming algorithms. These
algorithms attempt to compute the Lagrange multipliers directly. Constrained quasi-Newton methods
guarantee superlinear convergence by accumulating second-order information regarding the KKT
equations using a quasi-Newton updating procedure. These methods are commonly referred to as
Sequential Quadratic Programming (SQP) methods, since a QP subproblem is solved at each major
iteration (also known as Iterative Quadratic Programming, Recursive Quadratic Programming, and
Constrained Variable Metric methods).
The 'active-set' algorithm is not a large-scale algorithm; see “Large-Scale vs. Medium-Scale
Algorithms” on page 2-10.
SQP methods represent the state of the art in nonlinear programming methods. Schittkowski [36], for
example, has implemented and tested a version that outperforms every other tested method in terms
of efficiency, accuracy, and percentage of successful solutions, over a large number of test problems.
Based on the work of Biggs [1], Han [22], and Powell ([32] and [33]), the method allows you to closely
mimic Newton's method for constrained optimization just as is done for unconstrained optimization.
At each major iteration, an approximation is made of the Hessian of the Lagrangian function using a
quasi-Newton updating method. This is then used to generate a QP subproblem whose solution is
used to form a search direction for a line search procedure. An overview of SQP is found in
Fletcher [13], Gill et al. [19], Powell [35], and Schittkowski [23]. The general method, however, is
stated here.
Given the problem description in GP (“Equation 2-1”) the principal idea is the formulation of a QP
subproblem based on a quadratic approximation of the Lagrangian function.
L(x, λ) = f(x) + ∑i=1..m λi ⋅ gi(x). (6-27)
Here you simplify “Equation 2-1” by assuming that bound constraints have been expressed as
inequality constraints. You obtain the QP subproblem by linearizing the nonlinear constraints.
min_d∈ℝⁿ ½ dT Hk d + ∇f(xk)T d
subject to
∇gi(xk)T d + gi(xk) = 0, i = 1, ..., me (6-28)
∇gi(xk)T d + gi(xk) ≤ 0, i = me + 1, ..., m.
This subproblem can be solved using any QP algorithm (see, for instance, “Quadratic Programming
Solution” on page 6-26). The solution is used to form a new iterate
xk + 1 = xk + α k dk .
The step length parameter αk is determined by an appropriate line search procedure so that a
sufficient decrease in a merit function is obtained (see “Updating the Hessian Matrix” on page 6-25).
The matrix Hk is a positive definite approximation of the Hessian matrix of the Lagrangian function
(“Equation 6-27”). Hk can be updated by any of the quasi-Newton methods, although the BFGS
method (see “Updating the Hessian Matrix” on page 6-25) appears to be the most popular.
A nonlinearly constrained problem can often be solved in fewer iterations than an unconstrained
problem using SQP. One of the reasons for this is that, because of limits on the feasible area, the
optimizer can make informed decisions regarding directions of search and step length.
For example, Rosenbrock's function with an additional nonlinear inequality constraint, x1² + x2² ≤ 1, was solved by an SQP implementation in 96 iterations compared to 140 for the unconstrained
case. “Figure 6-3, SQP Method on Nonlinearly Constrained Rosenbrock's Function” on page 6-24
shows the path to the solution point x = [0.9072,0.8228] starting at x = [–1.9,2.0].
SQP Implementation
The SQP implementation consists of three main stages, which are discussed briefly in the following
subsections:
At each major iteration a positive definite quasi-Newton approximation of the Hessian of the Lagrangian function, H, is calculated using the BFGS method (see “Equation 6-9”), where λi, i = 1,...,m, is an estimate of the Lagrange multipliers.
where
sk = xk+1 − xk
qk = ∇f(xk+1) + ∑i=1..m λi ⋅ ∇gi(xk+1) − (∇f(xk) + ∑i=1..m λi ⋅ ∇gi(xk)).
Powell [33] recommends keeping the Hessian positive definite even though it might be positive
indefinite at the solution point. A positive definite Hessian is maintained providing qkT sk is positive at
each update and that H is initialized with a positive definite matrix. When qkT sk is not positive, qk is
modified on an element-by-element basis so that qkT sk > 0. The general aim of this modification is to
distort the elements of qk, which contribute to a positive definite update, as little as possible.
Therefore, in the initial phase of the modification, the most negative element of qk.*sk (the element-wise product) is repeatedly halved. This procedure is continued until qkT sk is greater than or equal to a small negative tolerance.
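The first phase of this modification can be sketched as follows (pure Python, illustrative only; repair_qk, the tolerance, and the halving budget are assumed names and values):

```python
def repair_qk(q, s, tol=1e-10, max_halvings=100):
    """First-phase sketch of the modification described above: repeatedly
    halve the element of q whose product q[i]*s[i] is most negative, until
    q's >= -tol or the halving budget runs out."""
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    q = list(q)
    for _ in range(max_halvings):
        if dot(q, s) >= -tol:
            break
        i = min(range(len(q)), key=lambda k: q[k] * s[k])
        q[i] *= 0.5
    return q
```

Because the negative contributions shrink geometrically while the nonnegative ones are untouched, the inner product is driven up toward the tolerance.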
If, after this procedure, qkT sk is still not positive, modify qk by adding a vector v multiplied by a
constant scalar w, that is,
qk = qk + wv, (6-31)
where
vi = ∇gi(xk+1) ⋅ gi(xk+1) − ∇gi(xk) ⋅ gi(xk) if (qk)i ⋅ w < 0 and (qk)i ⋅ (sk)i < 0, i = 1, ..., m
vi = 0 otherwise,
and w is systematically increased until qkT sk becomes positive.
The functions fmincon, fminimax, fgoalattain, and fseminf all use SQP. If Display is set to
'iter' in options, then various information is given such as function values and the maximum
constraint violation. When the Hessian has to be modified using the first phase of the preceding
procedure to keep it positive definite, then Hessian modified is displayed. If the Hessian has to be
modified again using the second phase of the approach described above, then Hessian modified
twice is displayed. When the QP subproblem is infeasible, then infeasible is displayed. Such
displays are usually not a cause for concern but indicate that the problem is highly nonlinear and that
convergence might take longer than usual. Sometimes the message no update is displayed,
indicating that qkT sk is nearly zero. This can be an indication that the problem setup is wrong or you
are trying to minimize a noncontinuous function.
Quadratic Programming Solution
At each major iteration of the SQP method, a QP problem of the following form is solved, where Ai
refers to the ith row of the m-by-n matrix A.
min_d∈ℝⁿ q(d) = ½ dT Hd + cT d,
subject to
Ai d = bi, i = 1, ..., me (6-32)
Ai d ≤ bi, i = me + 1, ..., m.
The method used in Optimization Toolbox functions is an active set strategy (also known as a
projection method) similar to that of Gill et al., described in [18] and [17]. It has been modified for
both Linear Programming (LP) and Quadratic Programming (QP) problems.
The solution procedure involves two phases. The first phase involves the calculation of a feasible
point (if one exists). The second phase involves the generation of an iterative sequence of feasible
points that converge to the solution. In this method an active set, Ak, is maintained that is an estimate
of the active constraints (i.e., those that are on the constraint boundaries) at the solution point.
Virtually all QP algorithms are active set methods. This point is emphasized because there exist many
different methods that are very similar in structure but that are described in widely different terms.
Ak is updated at each iteration k, and this is used to form a basis for a search direction d̄k. Equality constraints always remain in the active set Ak. The notation for the variable d̄k is used here to distinguish it from dk in the major iterations of the SQP method. The search direction d̄k is calculated and minimizes the objective function while remaining on any active constraint boundaries. The feasible subspace for d̄k is formed from a basis Zk whose columns are orthogonal to the estimate of the active set Ak (i.e., AkZk = 0). Thus a search direction, which is formed from a linear summation of any combination of the columns of Zk, is guaranteed to remain on the boundaries of the active constraints.
The matrix Z_k is formed from the last m − l columns of the QR decomposition of the matrix A_kᵀ, where l is the number of active constraints and l < m. That is, Z_k is given by

Z_k = Q[:, l + 1 : m], (6-33)

where Q and R satisfy

Qᵀ A_kᵀ = [R; 0].
Constrained Nonlinear Optimization Algorithms
Once Z_k is found, a new search direction d̂_k is sought that minimizes q(d), where d̂_k is in the null space of the active constraints. That is, d̂_k is a linear combination of the columns of Z_k: d̂_k = Z_k p for some vector p.
Then, viewing the quadratic as a function of p by substituting d̂_k = Z_k p, you have

q(p) = (1/2) pᵀ Z_kᵀ H Z_k p + cᵀ Z_k p. (6-34)
∇q(p) is referred to as the projected gradient of the quadratic function because it is the gradient
projected in the subspace defined by Zk. The term ZkT HZk is called the projected Hessian. Assuming
the Hessian matrix H is positive definite (which is the case in this implementation of SQP), then the
minimum of the function q(p) in the subspace defined by Zk occurs when ∇q(p) = 0, which is the
solution of the system of linear equations
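Setting the projected gradient to zero yields a reduced linear system in p, consistent with Equation 6-34 (a reconstruction of the referenced system from the surrounding definitions):

```latex
\nabla q(p) = Z_k^T H Z_k \, p + Z_k^T c = 0
\quad\Longrightarrow\quad
\left(Z_k^T H Z_k\right) p = -Z_k^T c .
```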
At each iteration, because of the quadratic nature of the objective function, there are only two choices of step length α. A step of unity along d̂_k is the exact step to the minimum of the function restricted to the null space of A_k. If such a step can be taken without violating the constraints, then it is the solution to QP (“Equation 6-32”). Otherwise, the step along d̂_k to the nearest constraint is less than unity, and a new constraint is included in the active set at the next iteration. The distance to the constraint boundaries in any direction d̂_k is given by

α = min over i ∈ {1, ..., m} of { −(A_i x_k − b_i) / (A_i d̂_k) }, (6-38)

which is defined for constraints not in the active set, and where the direction d̂_k is toward the constraint boundary, that is, A_i d̂_k > 0.
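The ratio test in Equation 6-38 can be sketched in a few lines (illustrative Python, not toolbox code; `max_step` and its constraint data are hypothetical):

```python
def max_step(A, b, x, d, active):
    """Largest step alpha along d before hitting an inactive constraint A_i x <= b_i.

    Only constraints not in the active set and with A_i d > 0 (moving toward
    the boundary) limit the step; alpha is capped at 1, the exact step to the
    subspace minimum."""
    alpha = 1.0  # a unit step reaches the minimum in the null space
    for i, (Ai, bi) in enumerate(zip(A, b)):
        if i in active:
            continue
        Aid = sum(a * dj for a, dj in zip(Ai, d))
        if Aid > 0:  # direction points toward this constraint boundary
            Aix = sum(a * xj for a, xj in zip(Ai, x))
            alpha = min(alpha, (bi - Aix) / Aid)
    return alpha

# One inequality x1 + x2 <= 2, start at the origin, direction (1, 1):
print(max_step([[1.0, 1.0]], [2.0], [0.0, 0.0], [1.0, 1.0], set()))  # 1.0
print(max_step([[1.0, 1.0]], [1.0], [0.0, 0.0], [1.0, 1.0], set()))  # 0.5
```

When the returned step is less than unity, the blocking constraint joins the active set at the next iteration, as described above.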
When n independent constraints are included in the active set, without location of the minimum,
Lagrange multipliers, λk, are calculated that satisfy the nonsingular set of linear equations
A_kᵀ λ_k = c + H x_k. (6-39)
If all elements of λk are positive, xk is the optimal solution of QP (“Equation 6-32”). However, if any
component of λk is negative, and the component does not correspond to an equality constraint, then
the corresponding element is deleted from the active set and a new iterate is sought.
Initialization
The algorithm requires a feasible point to start. If the current point from the SQP method is not
feasible, then you can find a point by solving the linear programming problem
min over γ ∈ ℜ, x ∈ ℜⁿ of γ, such that

A_i x = b_i, i = 1, ..., m_e,
A_i x − γ ≤ b_i, i = m_e + 1, ..., m. (6-40)
The notation Ai indicates the ith row of the matrix A. You can find a feasible point (if one exists) to
“Equation 6-40” by setting x to a value that satisfies the equality constraints. You can determine this
value by solving an under- or overdetermined set of linear equations formed from the set of equality
constraints. If there is a solution to this problem, then the slack variable γ is set to the maximum
inequality constraint at this point.
You can modify the preceding QP algorithm for LP problems by setting the search direction to the steepest descent direction at each iteration, where g_k is the gradient of the objective function (equal to the coefficients of the linear objective function):

d̂_k = −Z_k Z_kᵀ g_k. (6-41)
If a feasible point is found using the preceding LP method, the main QP phase is entered. The search direction d̂_k is initialized with a search direction d̂_1 found from solving the set of linear equations

H d̂_1 = −g_k, (6-42)

where g_k is the gradient of the objective function at the current iterate x_k (that is, H x_k + c).

If a feasible solution is not found for the QP problem, the search direction for the main SQP routine, d̂_k, is taken as one that minimizes γ.
Line Search and Merit Function
The solution to the QP subproblem produces a vector d_k, which is used to form a new iterate

x_{k+1} = x_k + α d_k. (6-43)

The step length parameter α_k is determined in order to produce a sufficient decrease in a merit function. This implementation uses the merit function of Han [22] and Powell [33]:

Ψ(x) = f(x) + Σ_{i=1}^{m_e} r_i·g_i(x) + Σ_{i=m_e+1}^{m} r_i·max[0, g_i(x)]. (6-44)

Powell recommends setting the penalty parameter

r_i = (r_{k+1})_i = max{ λ_i, ((r_k)_i + λ_i)/2 }, i = 1, ..., m. (6-45)
This allows positive contribution from constraints that are inactive in the QP solution but were
recently active. In this implementation, the penalty parameter ri is initially set to
r_i = ‖∇f(x)‖ / ‖∇g_i(x)‖, (6-46)

where ‖·‖ represents the Euclidean norm.
This ensures larger contributions to the penalty parameter from constraints with smaller gradients,
which would be the case for active constraints at the solution point.
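The update in Equation 6-45 is simple enough to sketch directly (illustrative Python; `update_penalty` is a hypothetical name, not a toolbox function):

```python
def update_penalty(r_old, lam):
    """Powell-style penalty update, per constraint i:
    r_new_i = max(lambda_i, (r_old_i + lambda_i) / 2).

    Taking the max with the averaged value keeps a positive contribution from
    constraints that are inactive in the QP solution but were recently active."""
    return [max(l, (r + l) / 2.0) for r, l in zip(r_old, lam)]

print(update_penalty([4.0, 0.0], [1.0, 3.0]))  # [2.5, 3.0]
```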
The sqp algorithm is essentially the same as the sqp-legacy algorithm, but has a different
implementation. Usually, sqp has faster execution time and less memory usage than sqp-legacy.
The most important differences between the sqp and the active-set algorithms are:
The sqp algorithm takes every iterative step in the region constrained by bounds. Furthermore, finite difference steps also respect bounds. Bounds are not strict in the sense that a step can lie exactly on a boundary. This feasibility with respect to bounds can be beneficial when your objective function or nonlinear constraint functions are undefined or are complex outside the region constrained by bounds.
During its iterations, the sqp algorithm can attempt to take a step that fails. This means an objective
function or nonlinear constraint function you supply returns a value of Inf, NaN, or a complex value.
In this case, the algorithm attempts to take a smaller step.
The sqp algorithm uses a different set of linear algebra routines to solve the quadratic programming
subproblem, “Equation 6-28”. These routines are more efficient in both memory usage and speed
than the active-set routines.
The sqp algorithm has two new approaches to the solution of “Equation 6-28” when constraints are
not satisfied.
• The sqp algorithm combines the objective and constraint functions into a merit function. The
algorithm attempts to minimize the merit function subject to relaxed constraints. This modified
problem can lead to a feasible solution. However, this approach has more variables than the
original problem, so the problem size in “Equation 6-28” increases. The increased size can slow
the solution of the subproblem. These routines are based on the articles by Spellucci [60] and
Tone [61]. The sqp algorithm sets the penalty parameter for the merit function “Equation 6-44”
according to the suggestion in [41].
• Suppose nonlinear constraints are not satisfied, and an attempted step causes the constraint
violation to grow. The sqp algorithm attempts to obtain feasibility using a second-order
approximation to the constraints. The second-order technique can lead to a feasible solution.
However, this technique can slow the solution by requiring more evaluations of the nonlinear
constraint functions.
For μ > 0, the approximate problem augments the objective with a logarithmic term in slack variables s:

min over x, s of f_μ(x, s) = f(x) − μ Σ_i ln(s_i), subject to h(x) = 0 and g(x) + s = 0. (6-48)

There are as many slack variables s_i as there are inequality constraints g. The s_i are restricted to be positive to keep ln(s_i) bounded. As μ decreases to zero, the minimum of f_μ should approach the minimum of f. The added logarithmic term is called a barrier function. This method is described in [40], [41], and [51].
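As a one-dimensional illustration of the barrier idea (not toolbox code): minimizing f(x) = x subject to x ≥ 0 through the barrier objective x − μ ln x places the minimizer at x = μ, which approaches the constrained minimum 0 as μ → 0. A minimal Python sketch, with hypothetical names:

```python
import math

def barrier_obj(x, mu):
    # Barrier-augmented objective for: minimize x subject to x >= 0
    return x - mu * math.log(x)

def argmin_bisect(mu, lo=1e-9, hi=10.0, iters=100):
    # The derivative 1 - mu/x is increasing in x; bisect for its root x = mu.
    for _ in range(iters):
        mid = (lo + hi) / 2
        if 1 - mu / mid < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for mu in (1.0, 0.1, 0.01):
    print(mu, argmin_bisect(mu))  # minimizer equals mu, approaching 0
```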
The approximate problem “Equation 6-48” is a sequence of equality constrained problems. These are
easier to solve than the original inequality-constrained problem “Equation 6-47”.
To solve the approximate problem, the algorithm uses one of two main types of steps at each
iteration:
• A direct step in (x, s). This step attempts to solve the KKT equations, “Equation 3-2” and
“Equation 3-3”, for the approximate problem via a linear approximation. This is also called a
Newton step.
• A CG (conjugate gradient) step, using a trust region.
By default, the algorithm first attempts to take a direct step. If it cannot, it attempts a CG step. One
case where it does not take a direct step is when the approximate problem is not locally convex near
the current iterate.
The parameter ν may increase with iteration number in order to force the solution towards feasibility.
If an attempted step does not decrease the merit function, the algorithm rejects the attempted step,
and attempts a new step.
If either the objective function or a nonlinear constraint function returns a complex value, NaN, Inf,
or an error at an iterate xj, the algorithm rejects xj. The rejection has the same effect as if the merit
function did not decrease sufficiently: the algorithm then attempts a different, shorter step. Wrap any
code that can error in try-catch:
try
    val = ... % code that can error
catch
    val = NaN;
end
The objective and constraints must yield proper (double) values at the initial point.
Direct Step
This equation comes directly from attempting to solve “Equation 3-2” and “Equation 3-3” using a
linearized Lagrangian.
In order to solve this equation for (Δx, Δs), the algorithm makes an LDL factorization of the matrix.
(See Example 3 — The Structure of D (MATLAB) in the MATLAB ldl function reference page.) This is
the most computationally expensive step. One result of this factorization is a determination of
whether the projected Hessian is positive definite or not; if not, the algorithm uses a conjugate
gradient step, described in the next section.
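The positive-definiteness determination can be illustrated with a small factorization attempt (a Python sketch using a Cholesky attempt rather than the toolbox's LDL routine; the matrices are made up):

```python
def is_positive_definite(M):
    """Try a Cholesky factorization of symmetric M; it succeeds iff M is
    positive definite. (An LDL factorization yields the same information and
    also handles indefinite matrices, but Cholesky is the shortest sketch.)"""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = M[i][i] - s
                if d <= 0:
                    return False  # nonpositive pivot: not PD; fall back to a CG step
                L[i][i] = d ** 0.5
            else:
                L[i][j] = (M[i][j] - s) / L[j][j]
    return True

print(is_positive_definite([[2.0, 1.0], [1.0, 2.0]]))  # True
print(is_positive_definite([[1.0, 2.0], [2.0, 1.0]]))  # False (eigenvalues 3 and -1)
```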
The conjugate gradient approach to solving the approximate problem “Equation 6-48” is similar to
other conjugate gradient calculations. In this case, the algorithm adjusts both x and s, keeping the
slacks s positive. The approach is to minimize a quadratic approximation to the approximate problem
in a trust region, subject to linearized constraints.
Specifically, let R denote the radius of the trust region, and let other variables be defined as in “Direct
Step” on page 6-31. The algorithm obtains Lagrange multipliers by approximately solving the KKT
equations
in the least-squares sense, subject to λ being positive. Then it takes a step (Δx, Δs) to approximately
solve
min over (Δx, Δs) of ∇fᵀΔx + (1/2)Δxᵀ∇²ₓₓL Δx + μeᵀS⁻¹Δs + (1/2)ΔsᵀS⁻¹ΛΔs, (6-51)
To solve “Equation 6-52”, the algorithm tries to minimize a norm of the linearized constraints inside a
region with radius scaled by R. Then “Equation 6-51” is solved with the constraints being to match
the residual from solving “Equation 6-52”, staying within the trust region of radius R, and keeping s
strictly positive. For details of the algorithm and the derivation, see [40], [41], and [51]. For another
description of conjugate gradients, see “Preconditioned Conjugate Gradient Method” on page 6-21.
Here are the meanings and effects of several options in the interior-point algorithm.
• HonorBounds — When set to true, every iterate satisfies the bound constraints you have set.
When set to false, the algorithm may violate bounds during intermediate iterations.
• HessianApproximation — When set to:
fminbnd Algorithm
fminbnd is a solver available in any MATLAB installation. It solves for a local minimum in one
dimension within a bounded interval. It is not based on derivatives. Instead, it uses golden-section
search and parabolic interpolation.
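A bare golden-section search conveys the idea (an illustrative Python sketch only; fminbnd additionally uses parabolic interpolation and its own tolerances):

```python
import math

def golden_section(f, a, b, tol=1e-8):
    """Shrink the bracket [a, b] by the golden ratio, keeping it around a minimum."""
    invphi = (math.sqrt(5) - 1) / 2  # 1/phi, about 0.618
    c = b - invphi * (b - a)
    d = a + invphi * (b - a)
    while b - a > tol:
        if f(c) < f(d):      # minimum lies in [a, d]
            b, d = d, c
            c = b - invphi * (b - a)
        else:                # minimum lies in [c, b]
            a, c = c, d
            d = a + invphi * (b - a)
    return (a + b) / 2

print(golden_section(lambda x: (x - 2.0) ** 2, 0.0, 5.0))  # about 2.0
```

No derivatives are evaluated; only function values drive the bracket updates, which is the same derivative-free character the text ascribes to fminbnd.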
fseminf addresses optimization problems with additional types of constraints compared to those
addressed by fmincon. The formulation of fmincon is
min f (x)
x
fseminf adds the following set of semi-infinite constraints to those already given. For wj in a one- or
two-dimensional bounded interval or rectangle Ij, for a vector of continuous functions K(x, w), the
constraints are
The term “dimension” of an fseminf problem means the maximal dimension of the constraint set I: 1
if all Ij are intervals, and 2 if at least one Ij is a rectangle. The size of the vector of K does not enter
into this concept of dimension.
The reason this is called semi-infinite programming is that there are a finite number of variables (x
and wj), but an infinite number of constraints. This is because the constraints on x are over a set of
continuous intervals or rectangles Ij, which contains an infinite number of points, so there are an
infinite number of constraints: Kj(x, wj) ≤ 0 for an infinite number of points wj.
You might think a problem with an infinite number of constraints is impossible to solve. fseminf
addresses this by reformulating the problem to an equivalent one that has two stages: a maximization
and a minimization. The semi-infinite constraints are reformulated as
where |K| is the number of components of the vector K; i.e., the number of semi-infinite constraint
functions. For fixed x, this is an ordinary maximization over bounded intervals or rectangles.
fseminf further simplifies the problem by making piecewise quadratic or cubic approximations κj(x,
wj) to the functions Kj(x, wj), for each x that the solver visits. fseminf considers only the maxima of
the interpolation function κj(x, wj), instead of Kj(x, wj), in “Equation 6-53”. This reduces the original
problem, minimizing a semi-infinitely constrained function, to a problem with a finite number of
constraints.
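The reduction can be sketched as follows (illustrative Python with a made-up constraint K(x, w) = x − cos(w); fseminf's interpolation and updating machinery is more elaborate):

```python
import math

def local_maxima(values):
    """Indices of interior sample points that are local maxima."""
    return [i for i in range(1, len(values) - 1)
            if values[i] >= values[i - 1] and values[i] >= values[i + 1]]

def finite_constraints(x, K, w_lo, w_hi, n=101):
    """Replace (max over w in [w_lo, w_hi] of K(x, w)) <= 0 by the sampled maxima,
    yielding a finite list of constraint values for fixed x."""
    ws = [w_lo + (w_hi - w_lo) * i / (n - 1) for i in range(n)]
    vals = [K(x, w) for w in ws]
    idx = local_maxima(vals)
    idx += [0, len(ws) - 1]  # endpoints can also carry the maximum
    return [vals[i] for i in idx]

K = lambda x, w: x - math.cos(w)   # max over w in [0, 2*pi] is x + 1, at w = pi
cons = finite_constraints(0.5, K, 0.0, 2 * math.pi)
print(max(cons))  # 1.5, i.e. x + 1
```

Only the finitely many sampled maxima enter the minimization, which is the reduction described above.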
Sampling Points
Your semi-infinite constraint function must provide a set of sampling points, points used in making
the quadratic or cubic approximations. To accomplish this, it should contain:
The initial spacing s is a |K|-by-2 matrix. The jth row of s represents the spacing for neighboring
sampling points for the constraint function Kj. If Kj depends on a one-dimensional wj, set
s(j,2) = 0. fseminf updates the matrix s in subsequent iterations.
fseminf uses the matrix s to generate the sampling points w, which it then uses to create the
approximation κj(x, wj). Your procedure for generating w from s should keep the same intervals or
rectangles Ij during the optimization.
Example of Creating Sampling Points
Consider a problem with two semi-infinite constraints, K1 and K2. K1 has one-dimensional w1, and K2
has two-dimensional w2. The following code generates a sampling set from w1 = 2 to 12:
% Sampling set
w1 = 2:s(1,1):12;
fseminf specifies s as NaN when it first calls your constraint function. Checking for this allows you
to set the initial sampling interval.
The following code generates a sampling set from w2 in a square, with each component going from 1
to 100, initially sampled more often in the first component than the second:
% Sampling set
w2x = 1:s(2,1):100;
w2y = 1:s(2,2):100;
[wx,wy] = meshgrid(w2x,w2y);
% Sampling set
w1 = 2:s(1,1):12;
w2x = 1:s(2,1):100;
w2y = 1:s(2,2):100;
[wx,wy] = meshgrid(w2x,w2y);
fseminf Algorithm
1 At the current value of x, fseminf identifies all the wj,i such that the interpolation κj(x, wj,i) is a
local maximum. (The maximum refers to varying w for fixed x.)
2 fseminf takes one iteration step in the solution of the fmincon problem:
min f (x)
x
such that c(x) ≤ 0, ceq(x) = 0, A·x ≤ b, Aeq·x = beq, and l ≤ x ≤ u, where c(x) is augmented with
all the maxima of κj(x, wj) taken over all wj∈Ij, which is equal to the maxima over j and i of
κj(x, wj,i).
3 fseminf checks if any stopping criterion is met at the new point x (to halt the iterations); if not,
it continues to step 4.
4 fseminf checks if the discretization of the semi-infinite constraints needs updating, and updates
the sampling points appropriately. This provides an updated approximation κj(x, wj). Then it
continues at step 1.
All the principles outlined in this example apply to the other nonlinear solvers, such as
fgoalattain, fminimax, lsqnonlin, lsqcurvefit, and fsolve.
The example starts by minimizing an objective function, then proceeds to minimize the same function with additional parameters. After that, the example shows how to minimize the objective function when there is a constraint, and finally shows how to get a more efficient or accurate solution by providing gradients or a Hessian, or by changing some options.
f = @(x,y) x.*exp(-x.^2-y.^2)+(x.^2+y.^2)/20;
fsurf(f,[-2,2],'ShowContours','on')
The plot shows that the minimum is near the point (-1/2,0).
Tutorial for the Optimization Toolbox™
Usually you define the objective function as a MATLAB file. For now, this function is simple enough to define as an anonymous function in terms of a single vector argument:

fun = @(x) f(x(1),x(2));

Specify a start point:

x0 = [-.5; 0];
Set optimization options to not use fminunc's default large-scale algorithm, since that algorithm
requires the objective function gradient to be provided:
options = optimoptions('fminunc','Algorithm','quasi-newton');
options.Display = 'iter';

Call fminunc:

[x,fval,exitflag,output] = fminunc(fun,x0,options);
First-order
Iteration Func-count f(x) Step-size optimality
0 3 -0.3769 0.339
1 6 -0.379694 1 0.286
2 9 -0.405023 1 0.0284
3 12 -0.405233 1 0.00386
4 15 -0.405237 1 3.17e-05
5 18 -0.405237 1 3.35e-08
uncx = x
uncx = 2×1
-0.6691
0.0000
uncf = fval
uncf = -0.4052
We will use the number of function evaluations as a measure of efficiency in this example. The total
number of function evaluations is:
output.funcCount
ans = 18
We will now pass extra parameters as additional arguments to the objective function. We show two different ways of doing this: using a MATLAB file, or using a nested function.
f(x, y, a, b, c) = (x − a)exp(−((x − a)² + (y − b)²)) + ((x − a)² + (y − b)²)/c.
This function is a shifted and scaled version of the original objective function.
Suppose we have a MATLAB file objective function called bowlpeakfun defined as:
type bowlpeakfun
function y = bowlpeakfun(x, a, b, c)
%BOWLPEAKFUN Objective function for parameter passing in TUTDEMO.
y = (x(1)-a).*exp(-((x(1)-a).^2+(x(2)-b).^2))+((x(1)-a).^2+(x(2)-b).^2)/c;
a = 2;
b = 3;
c = 10;
f = @(x)bowlpeakfun(x,a,b,c)
x0 = [-.5; 0];
options = optimoptions('fminunc','Algorithm','quasi-newton');
[x, fval] = fminunc(f,x0,options)
x = 2×1
1.3639
3.0000
fval = -0.3840
Consider the following function, which implements the objective as a nested function:

type nestedbowlpeak

function [x,fval] = nestedbowlpeak(a,b,c,x0,options)
[x,fval] = fminunc(@nestedfun,x0,options);
    function y = nestedfun(x)
    y = (x(1)-a).*exp(-((x(1)-a).^2+(x(2)-b).^2))+((x(1)-a).^2+(x(2)-b).^2)/c;
    end
end
In this method, the parameters (a,b,c) are visible to the nested objective function called nestedfun.
The outer function, nestedbowlpeak, calls fminunc and passes the objective function, nestedfun.
a = 2;
b = 3;
c = 10;
x0 = [-.5; 0];
options = optimoptions('fminunc','Algorithm','quasi-newton');
[x,fval] = nestedbowlpeak(a,b,c,x0,options)
x = 2×1
1.3639
3.0000
fval = -0.3840
You can see both methods produced identical answers, so use whichever one you find most
convenient.
subject to xy/2 + (x + 2)² + (y − 2)²/2 ≤ 2.
The constraint set is the interior of a tilted ellipse. Look at the contours of the objective function
plotted together with the tilted ellipse
f = @(x,y) x.*exp(-x.^2-y.^2)+(x.^2+y.^2)/20;
g = @(x,y) x.*y/2+(x+2).^2+(y-2).^2/2-2;
fimplicit(g)
axis([-6 0 -1 7])
hold on
fcontour(f)
plot(-.9727,.4685,'ro');
legend('constraint','f contours','minimum');
hold off
The plot shows that the lowest value of the objective function within the ellipse occurs near the lower
right part of the ellipse. We are about to calculate the minimum that was just plotted. Take a guess at
the solution:
x0 = [-2 1];
Set optimization options: use the interior-point algorithm, and turn on the display of results at each
iteration:
options = optimoptions('fmincon','Algorithm','interior-point','Display','iter');
Solvers require that nonlinear constraint functions give two outputs: one for nonlinear inequalities and one for nonlinear equalities. So we write the constraint using the deal function to give both outputs:

gfun = @(x) deal(g(x(1),x(2)),[]);
Call the nonlinear constrained solver. There are no linear equalities or inequalities or bounds, so pass
[ ] for those arguments:
[x,fval,exitflag,output] = fmincon(fun,x0,[],[],[],[],[],[],gfun,options);
First-order Norm of
Iter F-count f(x) Feasibility optimality step
0 3 2.365241e-01 0.000e+00 1.972e-01
1 6 1.748504e-01 0.000e+00 1.734e-01 2.260e-01
2 10 -1.570560e-01 0.000e+00 2.608e-01 9.347e-01
3 14 -6.629160e-02 0.000e+00 1.241e-01 3.103e-01
4 17 -1.584082e-01 0.000e+00 7.934e-02 1.826e-01
5 20 -2.349124e-01 0.000e+00 1.912e-02 1.571e-01
6 23 -2.255299e-01 0.000e+00 1.955e-02 1.993e-02
7 26 -2.444225e-01 0.000e+00 4.293e-03 3.821e-02
8 29 -2.446931e-01 0.000e+00 8.100e-04 4.035e-03
9 32 -2.446933e-01 0.000e+00 1.999e-04 8.126e-04
10 35 -2.448531e-01 0.000e+00 4.004e-05 3.289e-04
11 38 -2.448927e-01 0.000e+00 4.036e-07 8.156e-05
x = 1×2
-0.9727 0.4686
fval
fval = -0.2449
Fevals = output.funcCount
Fevals = 38
Check the constraint values at the solution:

[c,ceq] = gfun(x)

c = -2.4608e-06
ceq =
[]
Since c(x) is close to 0, the constraint is "active," meaning the constraint affects the solution. Recall
the unconstrained solution was found at
uncx
uncx = 2×1
-0.6691
0.0000
uncf
uncf = -0.4052
fval-uncf
ans = 0.1603
Optimization problems can be solved more efficiently and accurately if gradients are supplied by the
user. This example shows how this may be performed. We again solve the inequality-constrained
problem
subject to xy/2 + (x + 2)² + (y − 2)²/2 ≤ 2.
To provide the gradient of f(x) to fmincon, we write the objective function in the form of a MATLAB
file:
type onehump

function [f,gf] = onehump(x)
r = x(1)^2 + x(2)^2;
s = exp(-r);
f = x(1)*s+r/20;
if nargout > 1
    gf = [(1-2*x(1)^2)*s+x(1)/10;
        -2*x(1)*x(2)*s+x(2)/10];
end
The constraint and its gradient are contained in the MATLAB file tiltellipse:
type tiltellipse
function [c,ceq,gc,gceq] = tiltellipse(x)
c = x(1)*x(2)/2 + (x(1)+2)^2 + (x(2)-2)^2/2 - 2;
ceq = [];
if nargout > 2
    gc = [x(2)/2+2*(x(1)+2);
        x(1)/2+x(2)-2];
    gceq = [];
end
x0 = [-2; 1];
Set optimization options: we continue to use the same algorithm for comparison purposes.
options = optimoptions('fmincon','Algorithm','interior-point');
We also set options to use the gradient information in the objective and constraint functions. Note:
these options MUST be turned on or the gradient information will be ignored.
options = optimoptions(options,...
'SpecifyObjectiveGradient',true,...
'SpecifyConstraintGradient',true);
There should be fewer function counts this time, since fmincon does not need to estimate gradients
using finite differences.
options.Display = 'iter';

[x,fval,exitflag,output] = fmincon(@onehump,x0,[],[],[],[],[],[],@tiltellipse,options);
First-order Norm of
Iter F-count f(x) Feasibility optimality step
0 1 2.365241e-01 0.000e+00 1.972e-01
1 2 1.748504e-01 0.000e+00 1.734e-01 2.260e-01
2 4 -1.570560e-01 0.000e+00 2.608e-01 9.347e-01
3 6 -6.629161e-02 0.000e+00 1.241e-01 3.103e-01
4 7 -1.584082e-01 0.000e+00 7.934e-02 1.826e-01
5 8 -2.349124e-01 0.000e+00 1.912e-02 1.571e-01
6 9 -2.255299e-01 0.000e+00 1.955e-02 1.993e-02
7 10 -2.444225e-01 0.000e+00 4.293e-03 3.821e-02
8 11 -2.446931e-01 0.000e+00 8.100e-04 4.035e-03
9 12 -2.446933e-01 0.000e+00 1.999e-04 8.126e-04
10 13 -2.448531e-01 0.000e+00 4.004e-05 3.289e-04
11 14 -2.448927e-01 0.000e+00 4.036e-07 8.156e-05
fmincon estimated gradients well in the previous example, so the iterations in the current example
are similar.
xold = 2×1
-0.9727
0.4686
minfval = -0.2449
Fgradevals = 14
Fevals = 38
subject to xy/2 + (x + 2)² + (y − 2)²/2 ≤ 2,
First-order Norm of
Iter F-count f(x) Feasibility optimality step
0 1 2.365241e-01 0.000e+00 1.972e-01
1 2 1.748504e-01 0.000e+00 1.734e-01 2.260e-01
2 4 -1.570560e-01 0.000e+00 2.608e-01 9.347e-01
We now choose to see more decimals in the solution, in order to see more accurately the difference
that the new tolerances make.
format long
x = 2×1
-0.972742227363546
0.468569289098342
xold = 2×1
-0.972742694488360
0.468569966693330
The change is
x - xold
ans = 2×1
10⁻⁶ ×
0.467124813385844
-0.677594988729435
fval =
-0.244893137879894
fval - minfval
ans =
-3.996450220755676e-07
output.funcCount
ans =
15
Compare this to the number of function evaluations when the problem is solved with user-provided
gradients but with the default tolerances:
Fgradevals
Fgradevals =
14
If you give not only a gradient, but also a Hessian, solvers are even more accurate and efficient.
fmincon's interior-point solver takes a Hessian matrix as a separate function (not part of the
objective function). The Hessian function H(x,lambda) should evaluate the Hessian of the Lagrangian;
see the User's Guide for the definition of this term.
Solvers calculate the values lambda.ineqnonlin and lambda.eqlin; your Hessian function tells solvers
how to use these values.
In this problem we have only one inequality constraint, so the Hessian is:
type hessfordemo
function H = hessfordemo(x,lambda)
% HESSFORDEMO Helper function for Tutorial for the Optimization Toolbox demo
s = exp(-(x(1)^2+x(2)^2));
H = [2*x(1)*(2*x(1)^2-3)*s+1/10, 2*x(2)*(2*x(1)^2-1)*s;
2*x(2)*(2*x(1)^2-1)*s, 2*x(1)*(2*x(2)^2-1)*s+1/10];
hessc = [2,1/2;1/2,1];
H = H + lambda.ineqnonlin(1)*hessc;
options = optimoptions('fmincon',...
'Algorithm','interior-point',...
'SpecifyConstraintGradient',true,...
'SpecifyObjectiveGradient',true,...
'HessianFcn',@hessfordemo);
The tolerances have been set back to the defaults. There should be fewer function counts this time.
options.Display = 'iter';
First-order Norm of
Iter F-count f(x) Feasibility optimality step
0 1 2.365241e-01 0.000e+00 1.972e-01
1 3 5.821325e-02 0.000e+00 1.443e-01 8.728e-01
2 5 -1.218829e-01 0.000e+00 1.007e-01 4.927e-01
3 6 -1.421167e-01 0.000e+00 8.486e-02 5.165e-02
4 7 -2.261916e-01 0.000e+00 1.989e-02 1.667e-01
5 8 -2.433609e-01 0.000e+00 1.537e-03 3.486e-02
6 9 -2.446875e-01 0.000e+00 2.057e-04 2.727e-03
7 10 -2.448911e-01 0.000e+00 2.068e-06 4.191e-04
8 11 -2.448931e-01 0.000e+00 2.001e-08 4.218e-06
x = 2×1
-0.972742246093537
0.468569316215571
fval
fval =
-0.244893121872758
output.funcCount
ans =
11
Compare this to the number using only gradient evaluations, with the same default tolerances:
Fgradevals
Fgradevals =
14
See Also
More About
• “Passing Extra Parameters” on page 2-57
• “Solver-Based Optimization Problem Setup”
Banana Function Minimization
f(x) is called the banana function because of its curvature around the origin. It is notorious in optimization examples because of the slow convergence most methods exhibit when trying to solve this problem.

f(x) has a unique minimum at the point x = [1, 1], where f(x) = 0. This example shows a number of ways to minimize f(x) starting at the point x0 = [-1.9, 2].
The fminsearch function finds a minimum for a problem without constraints. It uses an algorithm
that does not estimate any derivatives of the objective function. Rather, it uses a geometric search
method described in “fminsearch Algorithm” on page 6-9.
Minimize the banana function using fminsearch. Include an output function to report the sequence
of iterations.
fun = @(x)(100*(x(2) - x(1)^2)^2 + (1 - x(1))^2);
options = optimset('OutputFcn',@bananaout,'Display','off');
x0 = [-1.9,2];
[x,fval,eflag,output] = fminsearch(fun,x0,options);
title 'Rosenbrock solution via fminsearch'
Fcount = output.funcCount;
disp(['Number of function evaluations for fminsearch was ',num2str(Fcount)])
The fminunc function finds a minimum for a problem without constraints. It uses a derivative-based
algorithm. The algorithm attempts to estimate not only the first derivative of the objective function,
but also the matrix of second derivatives. fminunc is usually more efficient than fminsearch.
options = optimoptions('fminunc','Display','off',...
'OutputFcn',@bananaout,'Algorithm','quasi-newton');
[x,fval,eflag,output] = fminunc(fun,x0,options);
title 'Rosenbrock solution via fminunc'
Fcount = output.funcCount;
disp(['Number of function evaluations for fminunc was ',num2str(Fcount)])
If you attempt to minimize the banana function using a steepest descent algorithm, the high
curvature of the problem makes the solution process very slow.
You can run fminunc with the steepest descent algorithm by setting the hidden HessUpdate option
to the value 'steepdesc' for the 'quasi-newton' algorithm. Set a larger-than-default maximum
number of function evaluations, because the solver does not find the solution quickly. In this case, the
solver does not find the solution even after 600 function evaluations.
options = optimoptions(options,'HessUpdate','steepdesc',...
'MaxFunctionEvaluations',600);
[x,fval,eflag,output] = fminunc(fun,x0,options);
title 'Rosenbrock solution via steepest descent'
Fcount = output.funcCount;
disp(['Number of function evaluations for steepest descent was ',...
num2str(Fcount)])
If you provide a gradient, fminunc solves the optimization using fewer function evaluations. When
you provide a gradient, you can use the 'trust-region' algorithm, which is often faster and uses
less memory than the 'quasi-newton' algorithm. Reset the HessUpdate and
MaxFunctionEvaluations options to their default values.
Fcount = output.funcCount;
disp(['Number of function evaluations for fminunc with gradient was ',...
num2str(Fcount)])
If you provide a Hessian (matrix of second derivatives), fminunc can solve the optimization using
even fewer function evaluations. For this problem the results are the same with or without the
Hessian.
Fcount = output.funcCount;
disp(['Number of function evaluations for fminunc with gradient and Hessian was ',...
num2str(Fcount)])
Number of function evaluations for fminunc with gradient and Hessian was 32
disp(['Number of solver iterations for fminunc with gradient and Hessian was ',num2str(output.iterations)])
Number of solver iterations for fminunc with gradient and Hessian was 31
The recommended solver for a nonlinear sum of squares is lsqnonlin. This solver is even more
efficient than fminunc without a gradient for this special class of problems. To use lsqnonlin, do
not write your objective as a sum of squares. Instead, write the underlying vector that lsqnonlin
internally squares and sums.
options = optimoptions('lsqnonlin','Display','off','OutputFcn',@bananaout);
vfun = @(x)[10*(x(2) - x(1)^2),1 - x(1)];
[x,resnorm,residual,eflag,output] = lsqnonlin(vfun,x0,[],[],options);
title 'Rosenbrock solution via lsqnonlin'
Fcount = output.funcCount;
disp(['Number of function evaluations for lsqnonlin was ',...
num2str(Fcount)])
As in the minimization using a gradient for fminunc, lsqnonlin can use derivative information to
lower the number of function evaluations. Provide the Jacobian of the nonlinear objective function
vector and run the optimization again.
jac = @(x)[-20*x(1),10;
-1,0];
vfunjac = @(x)deal(vfun(x),jac(x));
options.SpecifyObjectiveGradient = true;
[x,resnorm,residual,eflag,output] = lsqnonlin(vfunjac,x0,[],[],options);
title 'Rosenbrock solution via lsqnonlin with Jacobian'
Fcount = output.funcCount;
disp(['Number of function evaluations for lsqnonlin with Jacobian was ',...
num2str(Fcount)])
See Also
More About
• “Solve a Constrained Nonlinear Problem, Solver-Based” on page 1-11
• “Symbolic Math Toolbox™ Calculates Gradients and Hessians” on page 6-94
For the purpose of this example, we solve a problem in four variables, where the objective and
constraint functions are made artificially expensive by pausing.
function f = expensive_objfun(x)
%EXPENSIVE_OBJFUN An expensive objective function used in optimparfor example.
We are interested in measuring the time taken by fmincon in serial so that we can compare it to the
parallel time.
First-order Norm of
Iter F-count f(x) Feasibility optimality step
0 5 1.839397e+00 1.500e+00 3.211e+00
Minimizing an Expensive Optimization Problem Using Parallel Computing Toolbox™
Since ga usually takes many more function evaluations than fmincon, we remove the expensive
constraint from this problem and perform unconstrained optimization instead. Pass empty matrices
[] for constraints. In addition, we limit the maximum number of generations to 15 for ga so that ga
can terminate in a reasonable amount of time. We are interested in measuring the time taken by ga
so that we can compare it to the parallel ga evaluation. Note that running ga requires Global
Optimization Toolbox.
rng default % for reproducibility
try
gaAvailable = false;
nvar = 4;
gaoptions = optimoptions('ga','MaxGenerations',15,'Display','iter');
startTime = tic;
gasol = ga(@expensive_objfun,nvar,[],[],[],[],[],[],[],gaoptions);
time_ga_sequential = toc(startTime);
fprintf('Serial GA optimization takes %g seconds.\n',time_ga_sequential);
gaAvailable = true;
catch ME
warning(message('optimdemos:optimparfor:gaNotFound'));
end
The finite differencing used by the functions in Optimization Toolbox to approximate derivatives is
done in parallel using the parfor feature if Parallel Computing Toolbox is available and there is a
parallel pool of workers. Similarly, ga, gamultiobj, and patternsearch solvers in Global
Optimization Toolbox evaluate functions in parallel. To use the parfor feature, we use the parpool
function to set up the parallel environment. The computer on which this example is published has
four cores, so parpool starts four MATLAB® workers. If there is already a parallel pool when you
run this example, we use that pool; see the documentation for parpool for more information.
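A minimal way to set up such a pool, assuming Parallel Computing Toolbox is installed, is to start one only when none is already running:

```matlab
% Start a parallel pool only if one is not already open
% (gcp('nocreate') returns the current pool, or [] if there is none)
if isempty(gcp('nocreate'))
    parpool; % uses the default cluster profile and worker count
end
```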
To minimize our expensive optimization problem using the parallel fmincon function, we need to
explicitly indicate that our objective and constraint functions can be evaluated in parallel and that we
want fmincon to use its parallel functionality wherever possible. Currently, finite differencing can be
done in parallel. We are interested in measuring the time taken by fmincon so that we can compare
it to the serial fmincon run.
options = optimoptions(options,'UseParallel',true);
startTime = tic;
xsol = fmincon(@expensive_objfun,startPoint,[],[],[],[],[],[],@expensive_confun,options);
time_fmincon_parallel = toc(startTime);
fprintf('Parallel FMINCON optimization takes %g seconds.\n',time_fmincon_parallel);
First-order Norm of
Iter F-count f(x) Feasibility optimality step
0 5 1.839397e+00 1.500e+00 3.211e+00
1 11 -9.760099e-01 3.708e+00 7.902e-01 2.362e+00
2 16 -1.480976e+00 0.000e+00 8.344e-01 1.069e+00
3 21 -2.601599e+00 0.000e+00 8.390e-01 1.218e+00
4 29 -2.823630e+00 0.000e+00 2.598e+00 1.118e+00
5 34 -3.905339e+00 0.000e+00 1.210e+00 7.302e-01
6 39 -6.212992e+00 3.934e-01 7.372e-01 2.405e+00
7 44 -5.948762e+00 0.000e+00 1.784e+00 1.905e+00
8 49 -6.940062e+00 1.233e-02 7.668e-01 7.553e-01
9 54 -6.973887e+00 0.000e+00 2.549e-01 3.920e-01
10 59 -7.142993e+00 0.000e+00 1.903e-01 4.735e-01
11 64 -7.155325e+00 0.000e+00 1.365e-01 2.626e-01
12 69 -7.179122e+00 0.000e+00 6.336e-02 9.115e-02
13 74 -7.180116e+00 0.000e+00 1.069e-03 4.670e-02
14 79 -7.180409e+00 0.000e+00 7.799e-04 2.815e-03
15 84 -7.180410e+00 0.000e+00 6.189e-06 3.122e-04
To minimize our expensive optimization problem using the ga function, we need to explicitly indicate
that our objective function can be evaluated in parallel and that we want ga to use its parallel
functionality wherever possible. To use the parallel ga we also require that the 'Vectorized' option be
set to the default (i.e., 'off'). We are again interested in measuring the time taken by ga so that we
can compare it to the serial ga run. Though this run may be different from the serial one because ga
uses a random number generator, the number of expensive function evaluations is the same in both
runs. Note that running ga requires Global Optimization Toolbox.
rng default % to get the same evaluations as the previous run
if gaAvailable
gaoptions = optimoptions(gaoptions,'UseParallel',true);
startTime = tic;
gasol = ga(@expensive_objfun,nvar,[],[],[],[],[],[],[],gaoptions);
time_ga_parallel = toc(startTime);
fprintf('Parallel GA optimization takes %g seconds.\n',time_ga_parallel);
end
X = [time_fmincon_sequential time_fmincon_parallel];
Y = [time_ga_sequential time_ga_parallel];
t = [0 1];
plot(t,X,'r--',t,Y,'k-')
ylabel('Time in seconds')
legend('fmincon','ga')
ax = gca;
ax.XTick = [0 1];
Utilizing parallel function evaluation via parfor improved the efficiency of both fmincon and ga.
The improvement is typically better for expensive objective and constraint functions.
See Also
More About
• “Parallel Computing”
Nonlinear Inequality Constraints
The problem is to find x that solves

min_x f(x) = e^(x1) (4x1² + 2x2² + 4x1x2 + 2x2 + 1),

subject to the nonlinear inequality constraints

x1x2 − x1 − x2 ≤ −1.5
x1x2 ≥ −10.
Because neither of the constraints is linear, create a function, confun.m, that returns the value of
both constraints in a vector c. Because the fmincon solver expects the constraints to be written in
the form c(x) ≤ 0, write your constraint function to return the following value:
c(x) = [x1x2 − x1 − x2 + 1.5
        −10 − x1x2].
The helper function objfun is the objective function; it appears at the end of this example. Set the fun argument as a function handle to the objfun function.
fun = @objfun;
Nonlinear constraint functions must return two arguments: c, the inequality constraint, and ceq, the
equality constraint. Because this problem has no equality constraint, the helper function confun at
the end of this example returns [] as the equality constraint.
Solve Problem
x0 = [-1,1];
The problem has no bounds or linear constraints. Set those arguments to [].
A = [];
b = [];
Aeq = [];
beq = [];
lb = [];
ub = [];
[x,fval] = fmincon(fun,x0,A,b,Aeq,beq,lb,ub,@confun)
x = 1×2
-9.5473 1.0474
fval = 0.0236
Examine Solution
The exit message indicates that the solution is feasible with respect to the constraints. To double-
check, evaluate the nonlinear constraint function at the solution. Negative values indicate satisfied
constraints.
[c,ceq] = confun(x)
c = 2×1
10⁻⁴ ×
-0.3179
-0.3062
ceq =
[]
Both nonlinear constraints are negative and close to zero, indicating that the solution is feasible and
that both constraints are active at the solution.
Helper Functions
function f = objfun(x)
f = exp(x(1))*(4*x(1)^2 + 2*x(2)^2 + 4*x(1)*x(2) + 2*x(2) + 1);
end
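The confun helper is referenced above but does not appear in this excerpt; an implementation consistent with the constraint vector c(x) described earlier is:

```matlab
function [c,ceq] = confun(x)
% Nonlinear inequality constraints, written in the form c(x) <= 0
c = [x(1)*x(2) - x(1) - x(2) + 1.5;
     -10 - x(1)*x(2)];
% No nonlinear equality constraints
ceq = [];
end
```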
See Also
Related Examples
• “Nonlinear Equality and Inequality Constraints” on page 6-75
• “Nonlinear Constraints with Gradients” on page 6-63
Nonlinear Constraints with Gradients
min_x f(x) = e^(x1) (4x1² + 2x2² + 4x1x2 + 2x2 + 1),

subject to

x1x2 − x1 − x2 ≤ −1.5
x1x2 ≥ −10.
Because the fmincon solver expects the constraints to be written in the form c(x) ≤ 0, write your
constraint function to return the following value:
c(x) = [x1x2 − x1 − x2 + 1.5
        −10 − x1x2].
The objective function is

f(x) = e^(x1) (4x1² + 2x2² + 4x1x2 + 2x2 + 1).
Compute the gradient of f (x) with respect to the variables x1 and x2.
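Differentiating with the product rule gives the gradient explicitly:

∂f/∂x1 = e^(x1) (4x1² + 2x2² + 4x1x2 + 2x2 + 1) + e^(x1) (8x1 + 4x2)
∂f/∂x2 = e^(x1) (4x1 + 4x2 + 2)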
The objfungrad helper function at the end of this example returns both the objective function f(x) and its gradient in the second output gradf. Set @objfungrad as the objective.
fun = @objfungrad;
The helper function confungrad is the nonlinear constraint function; it appears at the end of this example.
The derivative information for the inequality constraint has each column correspond to one
constraint. In other words, the gradient of the constraints is in the following format:
[∂c1/∂x1 ∂c2/∂x1; ∂c1/∂x2 ∂c2/∂x2] = [x2 − 1, −x2; x1 − 1, −x1].
nonlcon = @confungrad;
Indicate to the fmincon solver that the objective and constraint functions provide derivative
information. To do so, use optimoptions to set the SpecifyObjectiveGradient and
SpecifyConstraintGradient option values to true.
options = optimoptions('fmincon',...
'SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true);
Solve Problem
x0 = [-1,1];
The problem has no bounds or linear constraints, so set those argument values to [].
A = [];
b = [];
Aeq = [];
beq = [];
lb = [];
ub = [];
[x,fval] = fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon,options)
x = 1×2
-9.5473 1.0474
fval = 0.0236
The solution is the same as in the example “Nonlinear Inequality Constraints” on page 6-61, which
solves the problem without using derivative information. The advantage of using derivatives is that
solving the problem takes fewer function evaluations while gaining robustness, although this
advantage is not obvious in this example. Using even more derivative information, as in “fmincon
Interior-Point Algorithm with Analytic Hessian” on page 6-66, gives even more benefit, such as
fewer solver iterations.
Helper Functions
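The helper functions referenced in this example do not appear in this excerpt; implementations consistent with the formulas in this section are:

```matlab
function [f,gradf] = objfungrad(x)
f = exp(x(1))*(4*x(1)^2 + 2*x(2)^2 + 4*x(1)*x(2) + 2*x(2) + 1);
if nargout > 1 % gradient required
    gradf = [f + exp(x(1))*(8*x(1) + 4*x(2));
             exp(x(1))*(4*x(1) + 4*x(2) + 2)];
end
end

function [c,ceq,DC,DCeq] = confungrad(x)
c = [x(1)*x(2) - x(1) - x(2) + 1.5;  % inequality constraints c(x) <= 0
     -10 - x(1)*x(2)];
ceq = [];                            % no equality constraints
if nargout > 2 % gradients required; one column per constraint
    DC = [x(2)-1, -x(2);
          x(1)-1, -x(1)];
    DCeq = [];
end
end
```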
See Also
Related Examples
• “Nonlinear Inequality Constraints” on page 6-61
• “fmincon Interior-Point Algorithm with Analytic Hessian” on page 6-66
The helper function bigtoleft is an objective function that grows rapidly negative as the x(1)
coordinate becomes negative. Its gradient is a three-element vector. The code for the bigtoleft
helper function appears at the end of this example.
The constraint set for this example is the intersection of the interiors of two cones—one pointing up,
and one pointing down. The constraint function is a two-component vector containing one component
for each cone. Because this example is three-dimensional, the gradient of the constraint is a 3-by-2
matrix. The code for the twocone helper function appears at the end of this example.
% Create figure
figure1 = figure;
% Create axes
axes1 = axes('Parent',figure1);
view([-63.5 18]);
grid('on');
hold('all');
fmincon Interior-Point Algorithm with Analytic Hessian
To use second-order derivative information in the fmincon solver, you must create a Hessian that is the Hessian of the Lagrangian:

∇²L(x,λ) = ∇²f(x) + Σᵢ λᵢ ∇²cᵢ(x).
Here, f (x) is the bigtoleft function, and the ci(x) are the two cone constraint functions. The
hessinterior helper function at the end of this example computes the Hessian of the
Lagrangian at a point x with the Lagrange multiplier structure lambda. The function first computes
∇2 f (x). It then computes the two constraint Hessians ∇2 c1(x) and ∇2 c2(x), multiplies them by their
corresponding Lagrange multipliers lambda.ineqnonlin(1) and lambda.ineqnonlin(2), and
adds them. You can see from the definition of the twocone constraint function that ∇2 c1(x) = ∇2 c2(x),
which simplifies the calculation.
To enable fmincon to use the objective gradient, constraint gradients, and the Hessian, you must set
appropriate options. The HessianFcn option using the Hessian of the Lagrangian is available only
for the interior-point algorithm.
options = optimoptions('fmincon','Algorithm','interior-point',...
"SpecifyConstraintGradient",true,"SpecifyObjectiveGradient",true,...
'HessianFcn',@hessinterior);
x0 = [-1,-1,-1];
The problem has no linear constraints or bounds. Set those arguments to [].
A = [];
b = [];
Aeq = [];
beq = [];
lb = [];
ub = [];
[x,fval,eflag,output] = fmincon(@bigtoleft,x0,...
A,b,Aeq,beq,lb,ub,@twocone,options);
Examine the solution, objective function value, exit flag, and number of function evaluations and
iterations.
disp(x)
disp(fval)
-2.8941e+03
disp(eflag)
disp([output.funcCount,output.iterations])
7 6
If you do not use a Hessian function, fmincon takes more iterations to converge and requires more
function evaluations.
options.HessianFcn = [];
[x2,fval2,eflag2,output2] = fmincon(@bigtoleft,x0,...
A,b,Aeq,beq,lb,ub,@twocone,options);
disp([output2.funcCount,output2.iterations])
13 9
If you also do not include the gradient information, fmincon takes the same number of iterations, but
takes many more function evaluations.
options.SpecifyConstraintGradient = false;
options.SpecifyObjectiveGradient = false;
[x3,fval3,eflag3,output3] = fmincon(@bigtoleft,x0,...
A,b,Aeq,beq,lb,ub,@twocone,options);
disp([output3.funcCount,output3.iterations])
43 9
Helper Functions
function [f,gradf] = bigtoleft(x)
% Objective that grows rapidly negative as x(1) becomes negative
% (f reconstructed to match the gradient below)
f = 10*x(1)^3 + x(1)*x(2)^2 + x(3)*(x(1)^2 + x(2)^2);
if nargout > 1
    gradf = [30*x(1)^2 + x(2)^2 + 2*x(3)*x(1);
             2*x(1)*x(2) + 2*x(3)*x(2);
             x(1)^2 + x(2)^2];
end
end
function [c,ceq,gradc,gradceq] = twocone(x)
% The constraint set is the intersection of the interiors of two cones,
% one pointing up and one pointing down
r = sqrt(x(1)^2 + x(2)^2);
c = [-10 + r - x(3);
     x(3) - 3 + r];
ceq = [];
if nargout > 2
    gradceq = [];
    gradc = [x(1)/r,x(1)/r;
             x(2)/r,x(2)/r;
             -1,1];
end
end
function h = hessinterior(x,lambda)
h = [60*x(1)+2*x(3),2*x(2),2*x(1);
2*x(2),2*(x(1)+x(3)),2*x(2);
2*x(1),2*x(2),0];% Hessian of f
r = sqrt(x(1)^2+x(2)^2);% radius
rinv3 = 1/r^3;
hessc = [(x(2))^2*rinv3,-x(1)*x(2)*rinv3,0;
-x(1)*x(2)*rinv3,x(1)^2*rinv3,0;
0,0,0];% Hessian of both c(1) and c(2)
h = h + lambda.ineqnonlin(1)*hessc + lambda.ineqnonlin(2)*hessc;
end
See Also
Related Examples
• “Linear or Quadratic Objective with Quadratic Constraints” on page 6-71
• “Symbolic Math Toolbox™ Calculates Gradients and Hessians” on page 6-94
Linear or Quadratic Objective with Quadratic Constraints
min_x (1/2)xᵀQx + fᵀx + c

subject to

(1/2)xᵀHᵢx + kᵢᵀx + dᵢ ≤ 0,
where 1 ≤ i ≤ m. Assume that at least one Hi is nonzero; otherwise, you can use quadprog or
linprog to solve this problem. With nonzero Hi, the constraints are nonlinear, and the “Optimization
Decision Table” on page 2-4 states that fmincon is the appropriate solver.
The example assumes that the quadratic matrices are symmetric. This is without loss of generality;
you can replace a nonsymmetric H (or Q) matrix with an equivalent symmetrized version (H + HT)/2.
If x has N components, then Q and the Hi are N-by-N matrices, f and the ki are N-by-1 vectors, and c
and the di are scalars.
Objective Function
Formulate the problem using fmincon syntax. Assume that x and f are column vectors. (x is a
column vector when the initial vector x0 is.)
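The objective function code does not appear in this excerpt. A sketch consistent with the formulation above is the following (the name quadobj and passing Q, f, and c as extra parameters are assumptions based on later references to fun):

```matlab
function [y,grady] = quadobj(x,Q,f,c)
% Quadratic objective 1/2*x'*Q*x + f'*x + c and its gradient
y = 1/2*x'*Q*x + f'*x + c;
if nargout > 1
    grady = Q*x + f;
end
end
```

You can then set fun = @(x)quadobj(x,Q,f,c).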
Constraint Function
For consistency and easy indexing, place every quadratic constraint matrix in one cell array. Similarly,
place the linear and constant terms in cell arrays.
function [y,yeq,grady,gradyeq] = quadconstr(x,H,k,d)
jj = length(H); % jj is the number of inequality constraints
% Evaluate each quadratic constraint (reconstructed; the excerpt
% shows only the gradient computation)
y = zeros(1,jj);
for i = 1:jj
    y(i) = 1/2*x'*H{i}*x + k{i}'*x + d{i};
end
yeq = [];
if nargout > 2
    grady = zeros(length(x),jj);
    for i = 1:jj
        grady(:,i) = H{i}*x + k{i};
    end
end
gradyeq = [];
Numeric Example
Hessian
Create a Hessian function. The Hessian of the Lagrangian is ∇²f(x) + Σᵢ λᵢ ∇²cᵢ(x), which for this problem equals Q + Σᵢ λᵢHᵢ.
fmincon calculates an approximate set of Lagrange multipliers λi, and packages them in a structure.
To include the Hessian, use the following function.
function hess = quadhess(x,lambda,Q,H)
hess = Q;
jj = length(H); % jj is the number of inequality constraints
for i = 1:jj
    hess = hess + lambda.ineqnonlin(i)*H{i};
end
end
Solution
Use the fmincon interior-point algorithm to solve the problem most efficiently. This algorithm
accepts a Hessian function that you supply. Set these options.
options = optimoptions(@fmincon,'Algorithm','interior-point',...
'SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true,...
'HessianFcn',@(x,lambda)quadhess(x,lambda,Q,H));
ans =
12.8412
39.2337
Both nonlinear inequality multipliers are nonzero, so both quadratic constraints are active at the
solution.
The interior-point algorithm with gradients and a Hessian is efficient. Examine the number of
function evaluations.
output
output =
iterations: 9
funcCount: 10
constrviolation: 0
stepsize: 5.3547e-04
algorithm: 'interior-point'
firstorderopt: 1.5851e-05
cgiterations: 0
message: 'Local minimum found that satisfies the constraints.
Optimization compl...'
options.HessianFcn = [];
[x2,fval2,eflag2,output2,lambda2] = fmincon(fun,[0;0;0],...
[],[],[],[],[],[],nonlconstr,options);
output2
output2 =
iterations: 17
funcCount: 22
constrviolation: 0
stepsize: 2.8475e-04
algorithm: 'interior-point'
firstorderopt: 1.7680e-05
cgiterations: 0
message: 'Local minimum found that satisfies the constraints.
Optimization compl...'
This time fmincon used about twice as many iterations and function evaluations. The solutions are
the same to within tolerances.
If you also have quadratic equality constraints, you can use essentially the same technique. The
problem is the same, with the additional constraints
(1/2)xᵀJᵢx + pᵢᵀx + qᵢ = 0.
Reformulate your constraints to use the Ji, pi, and qi variables. The lambda.eqnonlin(i) structure
has the Lagrange multipliers for equality constraints.
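Following the text above, a hypothetical extension of quadhess that also adds the equality-constraint Hessians (the cell array J and the field lambda.eqnonlin follow the names used in this section) might look like:

```matlab
function hess = quadhesseq(x,lambda,Q,H,J)
% Hessian of the Lagrangian with both inequality (H) and
% equality (J) quadratic constraint matrices in cell arrays
hess = Q;
for i = 1:length(H)
    hess = hess + lambda.ineqnonlin(i)*H{i};
end
for i = 1:length(J)
    hess = hess + lambda.eqnonlin(i)*J{i};
end
end
```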
See Also
Related Examples
• “fmincon Interior-Point Algorithm with Analytic Hessian” on page 6-66
More About
• “Including Gradients and Hessians” on page 2-19
• “Including Gradients in Constraint Functions” on page 2-38
Nonlinear Equality and Inequality Constraints
[c,ceq] = nonlinconstr(x)
The function c(x) represents the constraint c(x) <= 0. The function ceq(x) represents the
constraint ceq(x) = 0.
Note: You must have the nonlinear constraint function return both c(x) and ceq(x), even if you
have only one type of nonlinear constraint. If a constraint does not exist, have the function return []
for that constraint.
Nonlinear Constraints
The problem has two constraints:

x1² + x2 = 1
x1x2 ≥ −10.

Rewrite these in the standard fmincon forms ceq(x) = 0 and c(x) ≤ 0:

x1² + x2 − 1 = 0
−x1x2 − 10 ≤ 0.
The confuneq function at the end of this example implements these constraints in the correct syntax.
Objective Function
min_x f(x) = e^(x1) (4x1² + 2x2² + 4x1x2 + 2x2 + 1)
subject to the constraints. The objfun function at the end of this example implements this objective function.
Solve Problem
Solve the problem by calling the fmincon solver. This solver requires an initial point; use the point
x0 = [-1,-1].
x0 = [-1,-1];
There are no bounds or linear constraints in the problem, so set those inputs to [].
A = [];
b = [];
Aeq = [];
beq = [];
lb = [];
ub = [];
[x,fval] = fmincon(@objfun,x0,A,b,Aeq,beq,lb,ub,@confuneq)
x = 1×2
-0.7529 0.4332
fval = 1.5093
The solver reports that the constraints are satisfied at the solution. Check the nonlinear constraints at
the solution.
[c,ceq] = confuneq(x)
c = -9.6739
ceq = 2.0701e-12
c is less than 0, as required. ceq is equal to 0 within the default constraint tolerance of 1e-6.
Helper Functions
function f = objfun(x)
f = exp(x(1))*(4*x(1)^2+2*x(2)^2+4*x(1)*x(2)+2*x(2)+1);
end
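The confuneq helper is referenced above but does not appear in this excerpt; an implementation matching the constraints in standard form is:

```matlab
function [c,ceq] = confuneq(x)
% Nonlinear inequality constraint: -x1*x2 - 10 <= 0
c = -x(1)*x(2) - 10;
% Nonlinear equality constraint: x1^2 + x2 - 1 = 0
ceq = x(1)^2 + x(2) - 1;
end
```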
See Also
Related Examples
• “Nonlinear Inequality Constraints” on page 6-61
• “Optimization App with the fmincon Solver” on page 6-77
Optimization App with the fmincon Solver
Note The Optimization app warns that it will be removed in a future release.
0.5 ≤ x1 (bound)
−x1 − x2 + 1 ≤ 0 (linear inequality)
−x1² − x2² + 1 ≤ 0
−9x1² − x2² + 9 ≤ 0
−x1² + x2 ≤ 0
−x2² + x1 ≤ 0 (nonlinear inequalities)
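The @nonlconstr file used in the steps below can encode the four nonlinear inequalities above; a sketch:

```matlab
function [c,ceq] = nonlconstr(x)
% Four nonlinear inequalities in the form c(x) <= 0
c = [-x(1)^2 - x(2)^2 + 1;
     -9*x(1)^2 - x(2)^2 + 9;
     -x(1)^2 + x(2);
     -x(2)^2 + x(1)];
ceq = []; % no nonlinear equalities
end
```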
Step 3: Set up and run the problem with the Optimization app.
1 Enter optimtool in the Command Window to open the Optimization app.
2 Select fmincon from the selection of solvers and change the Algorithm field to Active set.
3 Enter @objecfun in the Objective function field to call the objecfun.m file.
• Set the bound 0.5 ≤ x1 by entering [0.5,-Inf] in the Lower field. The -Inf entry means
there is no lower bound on x2.
• Set the linear inequality constraint by entering [-1 -1] in the A field and -1 in the b field.
• Set the nonlinear constraints by entering @nonlconstr in the Nonlinear constraint
function field.
6 In the Options pane, expand the Display to command window option if necessary, and select
Iterative to show algorithm information at the Command Window for each iteration.
8 When the algorithm terminates, under Run solver and view results the following information is
displayed:
• The Current iteration value when the algorithm terminated, which for this example is 7.
• The final value of the objective function when the algorithm terminated:
Objective function value: 2.0000000268595803
• The algorithm termination message:
Local minimum found that satisfies the constraints.
See Also
Related Examples
• “Solve a Constrained Nonlinear Problem, Solver-Based” on page 1-11
Minimization with Bound Constraints and Banded Preconditioner
f(x) = 1 + Σ_{i=1}^{n} |(3 − 2x_i)x_i − x_{i−1} − x_{i+1} + 1|^p + Σ_{i=1}^{n/2} |x_i + x_{i+n/2}|^p,

where p = 7/3, x_0 = 0, and x_{n+1} = 0. The tbroyfg function at the end of this example implements the objective function, including its gradient.
n = 800;
lb = -10*ones(n,1);
ub = -lb;
n = 800;
lb = -10*ones(n,1);
ub = -lb;
Hessian Pattern
The sparsity pattern of the Hessian matrix has been predetermined and stored in the file
tbroyhstr.mat. The sparsity structure for the Hessian of this problem is banded, as you can see in
the following spy plot.
load tbroyhstr
spy(Hstr)
In this plot, the center stripe is itself a five-banded matrix. The following plot shows the matrix more
clearly.
spy(Hstr(1:20,1:20))
Problem Options
Set options to use the trust-region-reflective algorithm. This algorithm requires you to set the
SpecifyObjectiveGradient option to true.
Also, use optimoptions to set the HessPattern option to Hstr. When a problem as large as this
has obvious sparsity structure, not setting the HessPattern option requires a huge amount of
unnecessary memory and computation. This is because fmincon attempts to use finite differencing
on a full Hessian matrix of 640,000 nonzero entries.
options = optimoptions('fmincon','SpecifyObjectiveGradient',true,'HessPattern',Hstr,...
'Algorithm','trust-region-reflective');
Solve Problem
Set the initial point to –1 for odd indices and +1 for even indices.
x0 = -ones(n,1);
x0(2:2:n) = 1;
A = [];
b = [];
Aeq = [];
beq = [];
nonlcon = [];
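The solver call itself is omitted from this excerpt; consistent with the second run later in this example, it has the form:

```matlab
[x,fval,exitflag,output] = ...
    fmincon(@tbroyfg,x0,A,b,Aeq,beq,lb,ub,nonlcon,options);
```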
Examine the exit flag, objective function value, first-order optimality measure, and number of solver
iterations.
disp(exitflag);
disp(fval)
270.4790
disp(output.firstorderopt)
0.0163
disp(output.iterations)
fmincon did not take very many iterations to reach a solution. However, the solution has a relatively
high first-order optimality measure, which is the reason that the exit flag is not the more preferable
value of 1.
Improve Solution
Try using a five-banded preconditioner instead of the default diagonal preconditioner. Using
optimoptions, set the PrecondBandWidth option to 2 and solve the problem again. (The
bandwidth is the number of upper (or lower) diagonals, not counting the main diagonal.)
options.PrecondBandWidth = 2;
[x2,fval2,exitflag2,output2] = ...
fmincon(@tbroyfg,x0,A,b,Aeq,beq,lb,ub,nonlcon,options);
disp(exitflag2);
disp(fval2)
270.4790
disp(output2.firstorderopt)
7.5340e-05
disp(output2.iterations)
The exit flag and objective function value do not appear to change. However, the number of iterations increased, and the first-order optimality measure decreased considerably. Compute the difference in
objective function value.
disp(fval2 - fval)
-2.9005e-07
The objective function value decreased by a tiny amount. The improvement in the solution is only an
improvement in the first-order optimality measure, not in the objective function.
Helper Function
grad = g;
end
end
Minimization with Linear Equality Constraints, Trust-Region Reflective Algorithm
f(x) = Σ_{i=1}^{n−1} ((x_i²)^(x_{i+1}² + 1) + (x_{i+1}²)^(x_i² + 1)),
Create Problem
The browneq.mat file contains matrices Aeq and beq that represent the linear constraints
Aeq*x = beq. The Aeq matrix has 100 rows representing 100 linear constraints (so Aeq is a 100-
by-1000 matrix). Load the browneq.mat data.
load browneq.mat
The brownfgh function at the end of this example implements the objective function, including its gradient and Hessian.
Set Options
options = optimoptions('fmincon','Algorithm','trust-region-reflective',...
'SpecifyObjectiveGradient',true,'HessianFcn','objective');
Solve Problem
Set the initial point to –1 for odd indices and +1 for even indices.
n = 1000;
x0 = -ones(n,1);
x0(2:2:n) = 1;
There are no bounds, linear inequality or nonlinear constraints, so set those parameters to [].
A = [];
b = [];
lb = [];
ub = [];
nonlcon = [];
[x,fval,exitflag,output] = ...
fmincon(@brownfgh,x0,A,b,Aeq,beq,lb,ub,nonlcon,options);
Examine the exit flag, objective function value, and constraint violation.
disp(exitflag)
disp(fval)
205.9313
disp(output.constrviolation)
2.2027e-13
The exitflag value of 3 indicates that fmincon stopped because the change in the objective
function value was less than the tolerance FunctionTolerance. The final objective function value is
given by fval. Constraints are satisfied, as you see in output.constrviolation, which shows a
very small number.
norm(Aeq*x-beq,Inf)
ans = 2.2027e-13
Helper Function
See Also
More About
• “Problem-Based Nonlinear Minimization with Linear Constraints” on page 7-17
In this section...
“Hessian Multiply Function for Lower Memory” on page 6-90
“Step 1: Write a file brownvv.m that computes the objective function, the gradient, and the sparse
part of the Hessian.” on page 6-91
“Step 2: Write a function to compute Hessian-matrix products for H given a matrix Y.” on page 6-91
“Step 3: Call a nonlinear minimization routine with a starting point and linear equality constraints.”
on page 6-91
“Preconditioning” on page 6-93
In this example, the objective function is nonlinear and linear equalities exist so fmincon is used.
The description applies to the trust-region reflective algorithm; the fminunc trust-region
algorithm is similar. For the interior-point algorithm, see the HessianMultiplyFcn option in
“Hessian Multiply Function” on page 16-88. The objective function has the structure
f(x) = f̂(x) − (1/2)xᵀVVᵀx,

where V is a 1000-by-2 matrix. The Hessian of f is dense, but the Hessian of f̂ is sparse. If the Hessian of f̂ is Ĥ, then H, the Hessian of f, is

H = Ĥ − VVᵀ.
To avoid the excessive memory usage that could result from working with H directly, the example provides a Hessian multiply function, hmfleq1. When passed a matrix Y, this function uses the sparse matrix Hinfo, which corresponds to Ĥ, and V to compute the Hessian matrix product W = H*Y = Hinfo*Y − V*(V'*Y).
In this example, the Hessian multiply function needs Ĥ and V to compute the Hessian matrix product. V is a constant, so you can capture V in a function handle to an anonymous function.
However, Ĥ is not a constant and must be computed at the current x. You can do this by computing Ĥ in the objective function and returning it as Hinfo in the third output argument. Setting the HessianMultiplyFcn option by using optimoptions tells fmincon to get the Hinfo value from the objective function and pass it to the Hessian multiply function hmfleq1.
Minimization with Dense Structured Hessian, Linear Equalities
type brownvv
Because brownvv computes the gradient as well as the objective function, the example (Step 3 on
page 6-91) uses optimoptions to set the SpecifyObjectiveGradient option to true.
W = hmfleq1(Hinfo,Y)
The first argument must be the same as the third argument returned by the objective function
brownvv. The second argument to the Hessian multiply function is the matrix Y (of W = H*Y).
Because fmincon expects the second argument Y to be used to form the Hessian matrix product, Y is
always a matrix with n rows where n is the number of dimensions in the problem. The number of
columns in Y can vary. Finally, you can use a function handle to an anonymous function to capture V,
so V can be the third argument to hmfleq1.
function W = hmfleq1(Hinfo,Y,V)
%HMFLEQ1 Hessian-matrix product function for BROWNVV objective.
% W = hmfleq1(Hinfo,Y,V) computes W = (Hinfo-V*V')*Y
% where Hinfo is a sparse matrix computed by BROWNVV
% and V is a 2 column matrix.
W = Hinfo*Y - V*(V'*Y);
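For example, you can capture V with an anonymous function so that the multiply function has the two-argument signature fmincon expects (a sketch; the option names are from the current optimoptions interface):

```matlab
% Capture the constant V; fmincon calls HMF(Hinfo,Y)
HMF = @(Hinfo,Y)hmfleq1(Hinfo,Y,V);
options = optimoptions('fmincon','Algorithm','trust-region-reflective',...
    'SpecifyObjectiveGradient',true,'HessianMultiplyFcn',HMF);
```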
[fval,exitflag,output,x] = runfleq1;
Because the iterative display was set using optimoptions, this command generates the following
iterative display:
Norm of First-order
Iteration f(x) step optimality CG-iterations
0 2297.63 1.41e+03
1 1084.59 6.3903 578 1
2 1084.59 100 578 3
3 1084.59 25 578 0
4 1084.59 6.25 578 0
5 1047.61 1.5625 240 0
6 761.592 3.125 62.4 2
7 761.592 6.25 62.4 4
8 746.478 1.5625 163 0
9 546.578 3.125 84.1 2
10 274.311 6.25 26.9 2
11 55.6193 11.6597 40 2
12 55.6193 25 40 3
13 22.2964 6.25 26.3 0
14 -49.516 6.25 78 1
15 -93.2772 1.5625 68 1
16 -207.204 3.125 86.5 1
17 -434.162 6.25 70.7 1
18 -681.359 6.25 43.7 2
19 -681.359 6.25 43.7 4
20 -698.041 1.5625 191 0
21 -723.959 3.125 256 7
22 -751.33 0.78125 154 3
23 -793.974 1.5625 24.4 3
24 -820.831 2.51937 6.11 3
25 -823.069 0.562132 2.87 3
26 -823.237 0.196753 0.486 3
27 -823.245 0.0621202 0.386 3
28 -823.246 0.0199951 0.11 6
29 -823.246 0.00731333 0.0404 7
30 -823.246 0.00505883 0.0185 8
31 -823.246 0.00126471 0.00268 9
32 -823.246 0.00149326 0.00521 9
33 -823.246 0.000373314 0.00091 9
Convergence is rapid for a problem of this size with the PCG iteration cost increasing modestly as the
optimization progresses. Feasibility of the equality constraints is maintained at the solution.
ans =
1.8874e-14
Preconditioning
In this example, fmincon cannot use H to compute a preconditioner because H only exists implicitly.
Instead of H, fmincon uses Hinfo, the third argument returned by brownvv, to compute a
preconditioner. Hinfo is a good choice because it is the same size as H and approximates H to some
degree. If Hinfo were not the same size as H, fmincon would compute a preconditioner based on
some diagonal scaling matrices determined from the algorithm. Typically, this would not perform as
well.
• jacobian generates the gradient of a scalar function, and generates a matrix of the partial
derivatives of a vector function. So, for example, you can obtain the Hessian matrix, the second
derivatives of the objective function, by applying jacobian to the gradient. Part of this example
shows how to use jacobian to generate symbolic gradients and Hessians of objective and
constraint functions.
• matlabFunction generates either an anonymous function or a file that calculates the values of a
symbolic expression. This example shows how to use matlabFunction to generate files that
evaluate the objective and constraint function and their derivatives at arbitrary points.
The syntax and structures of the two sets of toolbox functions differ. In particular, symbolic variables
are real or complex scalars, but Optimization Toolbox™ functions pass vector arguments. So there
are several steps to take to generate symbolically the objective function, constraints, and all their
requisite derivatives, in a form suitable for the interior-point algorithm of fmincon.
To see the efficiency gained by using gradients and Hessians, see Compare to Optimization Without Gradients and Hessians later in this example. For a problem-based approach to this problem without using derivative information, see “Constrained Electrostatic Nonlinear Optimization, Problem-Based” on page 7-12.
Problem Description
Consider the electrostatics problem of placing 10 electrons in a conducting body. The electrons will
arrange themselves so as to minimize their total potential energy, subject to the constraint of lying
inside the body. It is well known that all the electrons will be on the boundary of the body at a
minimum. The electrons are indistinguishable, so there is no unique minimum for this problem
(permuting the electrons in one solution gives another valid solution). This example was inspired by
Dolan, Moré, and Munson [58].
The conducting body is defined by the inequalities

z ≤ -|x| - |y|
x^2 + y^2 + (z + 1)^2 ≤ 1.
Symbolic Math Toolbox™ Calculates Gradients and Hessians
There is a slight gap between the upper and lower surfaces of the figure. This is an artifact of the
general plotting routine used to create the figure. This routine erases any rectangular patch on one
surface that touches the other surface.
Create Variables
Generate a symbolic vector x as a 30-by-1 vector composed of real symbolic variables xij, i between
1 and 10, and j between 1 and 3. These variables represent the three coordinates of electron i: xi1
corresponds to the x coordinate, xi2 corresponds to the y coordinate, and xi3 corresponds to the z
coordinate.
x = cell(3, 10);
for i = 1:10
    for j = 1:3
        x{j,i} = sprintf('x%d%d',i,j);
    end
end
x = x(:); % now x is a 30-by-1 vector
x = sym(x, 'real');
x =
x11
x12
x13
x21
x22
x23
x31
x32
x33
x41
x42
x43
x51
x52
x53
x61
x62
x63
x71
x72
x73
x81
x82
x83
x91
x92
x93
x101
x102
x103
The constraint z ≤ -|x| - |y| is equivalent, for each electron, to the four linear inequalities
x + y + z ≤ 0, -x + y + z ≤ 0, x - y + z ≤ 0, and -x - y + z ≤ 0. Write all 40 inequalities in the
matrix form A·x ≤ b:
B = [1,1,1;-1,1,1;1,-1,1;-1,-1,1];
A = zeros(40,30);
for i = 1:10
    A(4*i-3:4*i,3*i-2:3*i) = B;
end
b = zeros(40,1);
disp(A*x)
The nonlinear constraints x^2 + y^2 + (z + 1)^2 ≤ 1 are also structured. Generate the constraints,
their gradients, and Hessians as follows.
c = sym(zeros(1,10));
i = 1:10;
c = (x(3*i-2).^2 + x(3*i-1).^2 + (x(3*i)+1).^2 - 1).';
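The gradient and Hessian computations for c are not shown in this excerpt; following the same jacobian pattern used for the energy function below, they would look like:

```matlab
gradc = jacobian(c,x).'; % 30-by-10 matrix: column i holds the gradient of c(i)
hessc = cell(1,10);
for i = 1:10
    hessc{i} = jacobian(gradc(:,i),x); % 30-by-30 symmetric Hessian of c(i)
end
```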
The constraint vector c is a row vector, and the gradient of c(i) is represented in the ith column of
the matrix gradc. This is the correct form, as described in “Nonlinear Constraints” on page 2-37.
The Hessian matrices, hessc{1}, ..., hessc{10}, are square and symmetric. It is better to store
them in a cell array, as is done here, than in separate variables such as hessc1, ..., hessc10.
Use the .' syntax to transpose. The ' syntax means conjugate transpose, which has different
symbolic derivatives.
The objective function, potential energy, is the sum of the inverses of the distances between each
electron pair:
energy = Σ_{i<j} 1/‖xi − xj‖.
The distance is the square root of the sum of the squares of the differences in the components of the
vectors.
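The code that builds the symbolic energy expression was lost from this excerpt; a sketch that loops over electron pairs, using the 30-by-1 layout of x (electron i occupies x(3*i-2:3*i)):

```matlab
energy = sym(0);
for i = 1:3:25          % index of the first electron's x coordinate
    for j = i+3:3:28    % index of a later electron's x coordinate
        dist = x(i:i+2) - x(j:j+2);
        energy = energy + 1/sqrt(dist.'*dist);
    end
end
```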
gradenergy = jacobian(energy,x).';
hessenergy = jacobian(gradenergy,x);
The objective function should have two outputs, energy and gradenergy. Put both functions in one
vector when calling matlabFunction to reduce the number of subexpressions that
matlabFunction generates, and to return the gradient only when the calling function (fmincon in
this case) requests both outputs. This example shows placing the resulting files in your current folder.
Of course, you can place them anywhere you like, as long as the folder is on the MATLAB path.
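The call itself does not appear in this excerpt; a typical form that writes both outputs to one file (the variable currdir is assumed to hold the target folder path, ending in a file separator):

```matlab
currdir = [pwd filesep];               % current folder (assumed target)
filename = [currdir,'demoenergy.m'];
matlabFunction(energy,gradenergy,'file',filename,'vars',{x});
```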
This syntax causes matlabFunction to return energy as the first output, and gradenergy as the
second. It also takes a single input vector {x} instead of a list of inputs x11, ..., x103.
The resulting file demoenergy.m contains, in part, the following lines or similar ones:
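The snippet itself is missing from this excerpt; the opening of such a generated file typically resembles the following, per matlabFunction's conventions (component names are illustrative):

```matlab
function [energy,gradenergy] = demoenergy(in1)
%DEMOENERGY
x11 = in1(1,:);
x12 = in1(2,:);
% ... one line per component, then expressions for energy and gradenergy
```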
This function has the correct form for an objective function with a gradient; see “Writing Scalar
Objective Functions” on page 2-17.
Generate the nonlinear constraint function, and put it in the correct format.
filename = [currdir,'democonstr.m'];
matlabFunction(c,[],gradc,[],'file',filename,'vars',{x},...
'outputs',{'c','ceq','gradc','gradceq'});
The resulting file democonstr.m contains, in part, the following lines or similar ones:
This function has the correct form for a constraint function with a gradient; see “Nonlinear
Constraints” on page 2-37.
To generate the Hessian of the Lagrangian for the problem, first generate files for the energy Hessian
and for the constraint Hessians.
The Hessian of the objective function, hessenergy, is a very large symbolic expression, containing
over 150,000 symbols, as evaluating size(char(hessenergy)) shows. So it takes a substantial
amount of time to run matlabFunction(hessenergy).
filename = [currdir,'hessenergy.m'];
matlabFunction(hessenergy,'file',filename,'vars',{x});
In contrast, the Hessians of the constraint functions are small, and fast to compute:
for i = 1:10
    ii = num2str(i);
    thename = ['hessc',ii];
    filename = [currdir,thename,'.m'];
    matlabFunction(hessc{i},'file',filename,'vars',{x});
end
After generating all the files for the objective and constraints, put them together with the appropriate
Lagrange multipliers in a file hessfinal.m, whose code appears at the end of this example.
Start the optimization with the electrons distributed randomly on a sphere of radius 1/2 centered at
[0,0,–1].
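The initialization code is not reproduced in this excerpt; a sketch that produces such a starting configuration (variable names assumed):

```matlab
rng default                                % for reproducibility
Xinitial = randn(3,10);                    % columns are random 3-D directions
Mag = sqrt(sum(Xinitial.^2));              % column magnitudes
Xinitial = (Xinitial./repmat(Mag,3,1))/2;  % normalize, then scale to radius 1/2
Xinitial(3,:) = Xinitial(3,:) - 1;         % center the sphere at [0,0,-1]
Xinitial = Xinitial(:);                    % 30-by-1 column vector
```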
Set the options to use the interior-point algorithm, and to use gradients and the Hessian.
options = optimoptions(@fmincon,'Algorithm','interior-point','SpecifyObjectiveGradient',true,...
'SpecifyConstraintGradient',true,'HessianFcn',@hessfinal,'Display','final');
Call fmincon.
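Based on the outputs examined below, the call has this form:

```matlab
[xfinal,fval,exitflag,output] = fmincon(@demoenergy,Xinitial,...
    A,b,[],[],[],[],@democonstr,options);
```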
View the objective function value, exit flag, number of iterations, and number of function evaluations.
disp(fval)
34.1365
disp(exitflag)
disp(output.iterations)
19
disp(output.funcCount)
28
Even though the initial positions of the electrons were random, the final positions are nearly
symmetric.
for i = 1:10
    plot3(xfinal(3*i-2),xfinal(3*i-1),xfinal(3*i),'r.','MarkerSize',25);
end
rotate3d
figure(hand)
The use of gradients and Hessians makes the optimization run faster and more accurately. To
compare with the same optimization using no gradient or Hessian information, set the options not to
use gradients and Hessians:
options = optimoptions(@fmincon,'Algorithm','interior-point',...
'Display','final');
[xfinal2 fval2 exitflag2 output2] = fmincon(@demoenergy,Xinitial,...
A,b,[],[],[],[],@democonstr,options);
disp(fval2)
34.1365
disp(exitflag2)
disp(output2.iterations)
78
disp(output2.funcCount)
2463
Compare the number of iterations and number of function evaluations with and without derivative
information.
tbl = table([output.iterations;output2.iterations],[output.funcCount;output2.funcCount],...
    'RowNames',{'With Derivatives','Without Derivatives'},'VariableNames',{'Iterations','Fevals'})
tbl=2×2 table
Iterations Fevals
__________ ______
With Derivatives 19 28
Without Derivatives 78 2463
The symbolic variables in this example have the assumption, in the symbolic engine workspace, that
they are real. To clear this assumption from the symbolic engine workspace, it is not sufficient to
delete the variables. Clear the variable assumptions by using syms:
syms x
assumptions(x)
ans =
Helper Function
function H = hessfinal(X,lambda)
% Start with the Hessian of the objective (energy)
H = hessenergy(X);
% Add the Lagrange multipliers times the constraint Hessians
for i = 1:10
    fun = str2func(['hessc',num2str(i)]);
    H = H + lambda.ineqnonlin(i)*fun(X);
end
end
See Also
Related Examples
• “Using Symbolic Mathematics with Optimization Toolbox™ Solvers” on page 6-105
• “Constrained Electrostatic Nonlinear Optimization, Problem-Based” on page 7-12
Using Symbolic Mathematics with Optimization Toolbox™ Solvers
There are several considerations in using symbolic calculations with optimization functions:
1 Optimization objective and constraint functions should be defined in terms of a vector, say x.
However, symbolic variables are real- or complex-valued scalars, not vectors. This requires you to
translate between vectors and scalars.
2 Optimization gradients, and sometimes Hessians, are supposed to be calculated within the body
of the objective or constraint functions. This means that a symbolic gradient or Hessian has to be
placed in the appropriate place in the objective or constraint function file or function handle.
3 Calculating gradients and Hessians symbolically can be time-consuming. Therefore you should
perform this calculation only once, and generate code, via matlabFunction, to call during
execution of the solver.
4 Evaluating symbolic expressions with the subs function is time-consuming. It is much more
efficient to use matlabFunction.
5 matlabFunction generates code that depends on the orientation of input vectors. Since
fmincon calls the objective function with column vectors, you must be careful to call
matlabFunction with column vectors of symbolic variables.
f(x1, x2) = log(1 + 3(x2 − (x1^3 − x1))^2 + (x1 − 4/3)^2).

This function is positive, with a unique minimum value of zero attained at x1 = 4/3, x2 = (4/3)^3 − 4/3
= 1.0370...
We write the independent variables as x1 and x2 because in this form they can be used as symbolic
variables. As components of a vector x they would be written x(1) and x(2). The function has a
twisty valley as depicted in the plot below.
syms x1 x2 real
x = [x1;x2]; % column vector of symbolic variables
f = log(1 + 3*(x2 - (x1^3 - x1))^2 + (x1 - 4/3)^2)
f =
log((x1 - 4/3)^2 + 3*(x1 - x1^3 + x2)^2 + 1)
fsurf(f,[-2 2],'ShowContours','on')
view(127,38)
Compute the gradient of f with respect to x:

gradf = jacobian(f,x).'

gradf =
-(6*(3*x1^2 - 1)*(x1 - x1^3 + x2) - 2*x1 + 8/3)/σ1
 (6*x1 - 6*x1^3 + 6*x2)/σ1

where

σ1 = (x1 - 4/3)^2 + 3*(x1 - x1^3 + x2)^2 + 1
hessf = jacobian(gradf,x)
hessf =

[ (6*(3*x1^2 - 1)^2 - 36*x1*(x1 - x1^3 + x2) + 2)/σ2 - σ3^2/σ2^2,                σ1 ]
[ σ1,                                6/σ2 - (6*x1 - 6*x1^3 + 6*x2)^2/σ2^2 ]

where

σ1 = (6*x1 - 6*x1^3 + 6*x2)*σ3/σ2^2 - (18*x1^2 - 6)/σ2
σ2 = (x1 - 4/3)^2 + 3*(x1 - x1^3 + x2)^2 + 1
σ3 = 6*(3*x1^2 - 1)*(x1 - x1^3 + x2) - 2*x1 + 8/3
The fminunc solver expects to pass in a vector x, and, with the SpecifyObjectiveGradient
option set to true and HessianFcn option set to 'objective', expects a list of three outputs:
[f(x),gradf(x),hessf(x)].
matlabFunction generates exactly this list of three outputs from a list of three inputs. Furthermore,
using the vars option, matlabFunction accepts vector inputs.
fh = matlabFunction(f,gradf,hessf,'vars',{x});
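The solver call is not shown in this excerpt; given the options described above, it would look like:

```matlab
options = optimoptions('fminunc', ...
    'SpecifyObjectiveGradient',true, ...
    'HessianFcn','objective', ...      % objective function also returns the Hessian
    'Algorithm','trust-region', ...
    'Display','final');
[xfinal,fval,exitflag,output] = fminunc(fh,[-1;2],options)
```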
xfinal = 2×1
1.3333
1.0370
fval = 7.6623e-12
exitflag = 3
constrviolation: []
Compare this with the number of iterations using no gradient or Hessian information. This requires
the 'quasi-newton' algorithm.
options = optimoptions('fminunc','Display','final','Algorithm','quasi-newton');
fh2 = matlabFunction(f,'vars',{x});
% fh2 = objective with no gradient or Hessian
[xfinal,fval,exitflag,output2] = fminunc(fh2,[-1;2],options)
xfinal = 2×1
1.3333
1.0371
fval = 2.1985e-11
exitflag = 1
The number of iterations is lower when using gradients and Hessians, and there are dramatically
fewer function evaluations:
ans =
'There were 14 iterations using gradient and Hessian, but 18 without them.'
ans =
'There were 15 function evaluations using gradient and Hessian, but 81 without them.'
We consider the same objective function and starting point, but now add two nonlinear constraints:

5*sinh(x2/5) ≥ x1^4
5*tanh(x1/5) ≥ x2^2 - 1
The constraints keep the optimization away from the global minimum point [1.333,1.037]. Visualize
the two constraints:
[X,Y] = meshgrid(-2:.01:3);
Z = (5*sinh(Y./5) >= X.^4);
% Z=1 where the first constraint is satisfied, Z=0 otherwise
Z = Z+ 2*(5*tanh(X./5) >= Y.^2 - 1);
% Z=2 where the second is satisfied, Z=3 where both are
surf(X,Y,Z,'LineStyle','none');
fig = gcf;
fig.Color = 'w'; % white background
view(0,90)
hold on
plot3(.4396, .0373, 4,'o','MarkerEdgeColor','r','MarkerSize',8);
% best point
xlabel('x')
ylabel('y')
hold off
Here is a plot of the objective function over the feasible region, the region that satisfies both
constraints, pictured above in dark red, along with a small red circle around the optimal point:
W = log(1 + 3*(Y - (X.^3 - X)).^2 + (X - 4/3).^2);
% W = the objective function
W(Z < 3) = nan; % plot only where the constraints are satisfied
surf(X,Y,W,'LineStyle','none');
view(68,20)
hold on
plot3(.4396, .0373, .8152,'o','MarkerEdgeColor','r', ...
'MarkerSize',8); % best point
xlabel('x')
ylabel('y')
zlabel('z')
hold off
The nonlinear constraints must be written in the form c(x) <= 0. We compute all the symbolic
constraints and their derivatives, and place them in a function handle using matlabFunction.
The gradients of the constraints should be column vectors; they must be placed in the objective
function as a matrix, with each column of the matrix representing the gradient of one constraint
function. This is the transpose of the form generated by jacobian, so we take the transpose below.
We place the nonlinear constraints into a function handle. fmincon expects the nonlinear constraints
and gradients to be output in the order [c ceq gradc gradceq]. Since there are no nonlinear
equality constraints, we output [] for ceq and gradceq.
c1 = x1^4 - 5*sinh(x2/5);
c2 = x2^2 - 5*tanh(x1/5) - 1;
c = [c1 c2];
gradc = jacobian(c,x).'; % transpose to put in correct form
constraint = matlabFunction(c,[],gradc,[],'vars',{x});
The interior-point algorithm requires its Hessian function to be written as a separate function,
instead of being part of the objective function. This is because a nonlinearly constrained function
needs to include those constraints in its Hessian. Its Hessian is the Hessian of the Lagrangian; see
the User's Guide for more information.
The Hessian function takes two input arguments: the position vector x, and the Lagrange multiplier
structure lambda. The parts of the lambda structure that you use for nonlinear constraints are
lambda.ineqnonlin and lambda.eqnonlin. For the current problem, there are no nonlinear
equality constraints, so we use the two multipliers lambda.ineqnonlin(1) and lambda.ineqnonlin(2).
We calculated the Hessian of the objective function in the first example. Now we calculate the
Hessians of the two constraint functions, and make function handle versions with matlabFunction.
hessc1 = jacobian(gradc(:,1),x); % constraint = first c column
hessc2 = jacobian(gradc(:,2),x);
hessfh = matlabFunction(hessf,'vars',{x});
hessc1h = matlabFunction(hessc1,'vars',{x});
hessc2h = matlabFunction(hessc2,'vars',{x});
To make the final Hessian, we put the three Hessians together, adding the appropriate Lagrange
multipliers to the constraint functions.
myhess = @(x,lambda)(hessfh(x) + ...
lambda.ineqnonlin(1)*hessc1h(x) + ...
lambda.ineqnonlin(2)*hessc2h(x));
Set the options to use the interior-point algorithm, the gradient, and the Hessian, have the objective
function return both the objective and the gradient, and run the solver:
options = optimoptions('fmincon', ...
'Algorithm','interior-point', ...
'SpecifyObjectiveGradient',true, ...
'SpecifyConstraintGradient',true, ...
'HessianFcn',myhess, ...
'Display','final');
% fh2 = objective without Hessian
fh2 = matlabFunction(f,gradf,'vars',{x});
[xfinal,fval,exitflag,output] = fmincon(fh2,[-1;2],...
[],[],[],[],[],[],constraint,options)
xfinal = 2×1
0.4396
0.0373
fval = 0.8152
exitflag = 1
funcCount: 13
constrviolation: 0
stepsize: 1.9160e-06
algorithm: 'interior-point'
firstorderopt: 1.9217e-08
cgiterations: 0
message: '...'
Again, the solver takes far fewer iterations and function evaluations with the gradient and Hessian
supplied than when they are not:
options = optimoptions('fmincon','Algorithm','interior-point',...
'Display','final');
% fh3 = objective without gradient or Hessian
fh3 = matlabFunction(f,'vars',{x});
% constraint without gradient:
constraint = matlabFunction(c,[],'vars',{x});
[xfinal,fval,exitflag,output2] = fmincon(fh3,[-1;2],...
[],[],[],[],[],[],constraint,options)
xfinal = 2×1
0.4396
0.0373
fval = 0.8152
exitflag = 1
ans =
'There were 10 iterations using gradient and Hessian, but 17 without them.'
ans =
'There were 13 function evaluations using gradient and Hessian, but 54 without them.'
The symbolic variables used in this example were assumed to be real. To clear this assumption from
the symbolic engine workspace, it is not sufficient to delete the variables. You must clear the
assumptions of variables using the syntax
assume([x1,x2],'clear')
All assumptions are cleared when the output of the following command is empty:
assumptions([x1,x2])
ans =
See Also
More About
• “Symbolic Math Toolbox™ Calculates Gradients and Hessians” on page 6-94
Typically, you use code generation to deploy code on hardware that is not running MATLAB. For
example, you can deploy code on a robot, using fmincon for optimizing movement or planning.
For an example, see “Code Generation for Optimization Basics” on page 6-116. For quadratic
programming, see “Code Generation for quadprog” on page 11-51.
x = fmincon(@fun,x0,A,b,Aeq,beq,lb,ub,@nonlcon) % Supported
% Not supported: fmincon('fun',...) or fmincon("fun",...)
• All fmincon input matrices such as A, Aeq, lb, and ub must be full, not sparse. You can convert
sparse matrices to full by using the full function.
• The lb and ub arguments must have either the same number of entries as the x0 argument or
must be empty [].
• For advanced code optimization involving embedded processors, you also need an Embedded
Coder® license.
• You must include options for fmincon and specify them using optimoptions. The options must
include the Algorithm option, set to 'sqp' or 'sqp-legacy'.
options = optimoptions('fmincon','Algorithm','sqp');
[x,fval,exitflag] = fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon,options);
• Code generation supports these options:
Code Generation in fmincon
opts = optimoptions('fmincon','Algorithm','sqp');
opts = optimoptions(opts,'MaxIterations',1e4); % Recommended
opts.MaxIterations = 1e4; % Not recommended
• Do not load options from a file. Doing so can cause code generation to fail. Instead, create options
in your code.
• Usually, if you specify an option that is not supported, the option is silently ignored during code
generation. However, if you specify a plot function or output function by using dot notation, code
generation can fail with an error. For reliability, specify only supported options.
• Because output functions and plot functions are not supported, fmincon does not return exit flag
–1.
If your target hardware has multiple cores, then you can achieve better performance by using custom
multithreaded LAPACK and BLAS libraries. To incorporate these libraries in your generated code, see
“Speed Up Linear Algebra in Generated Standalone Code by Using LAPACK Calls” (MATLAB Coder).
See Also
codegen | fmincon | optimoptions
More About
• “Code Generation for Optimization Basics” on page 6-116
• “Static Memory Allocation for fmincon Code Generation” on page 6-120
• “Optimization Code Generation for Real-Time Applications” on page 6-122
The example uses the following simple objective function. To use this objective function in your own
testing, copy the code to a file named rosenbrockwithgrad.m. Save the file on your MATLAB path.
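The function body is missing from this excerpt; the standard Rosenbrock function with an optional gradient output has this form:

```matlab
function [f,g] = rosenbrockwithgrad(x)
% Calculate objective f
f = 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
if nargout > 1 % gradient required
    g = [-400*(x(2) - x(1)^2)*x(1) - 2*(1 - x(1));
          200*(x(2) - x(1)^2)];
end
end
```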
To generate code using the rosenbrockwithgrad objective function, create a file named
test_rosen.m containing this code:
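The test file itself is not reproduced in this excerpt; a minimal sketch consistent with the surrounding text (the [-3,3] bounds come from the discussion below; the start point is an assumption):

```matlab
function [x,fval,eflag,output] = test_rosen
% Solve the bounded Rosenbrock problem with the code-generation-supported
% 'sqp' algorithm; no linear or nonlinear constraints.
opts = optimoptions('fmincon','Algorithm','sqp');
[x,fval,eflag,output] = fmincon(@rosenbrockwithgrad,[-2,1],...
    [],[],[],[],[-3,-3],[3,3],[],opts);
end
```

Generate the MEX file with a command such as codegen test_rosen (the exact codegen invocation is not shown in this excerpt).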
After some time, codegen creates a MEX file named test_rosen_mex.mexw64 (the file extension
will vary, depending on your system). You can run the resulting C code by entering test_rosen_mex.
The results are the following or similar:
x =
1.0000 1.0000
fval =
1.3346e-11
ans =
1.0000 1.0000
Code Generation for Optimization Basics
cfg = coder.config('mex');
cfg.IntegrityChecks = false;
cfg.SaturateOnIntegerOverflow = false;
cfg.DynamicMemoryAllocation = 'Off';
Tighten the bounds on the problem from [-3,3] to [-2,2]. Also, set a looser optimality tolerance
than the default 1e-6.
test_rosen2_mex
x =
1.0000 1.0000
fval =
2.0057e-11
eflag =
output =
iterations: 40
funcCount: 155
algorithm: 'sqp'
constrviolation: 0
stepsize: 5.9344e-08
lssteplength: 1
ans =
1.0000 1.0000
This solution is almost as good as the previous solution, with the fval output around 2e-11
compared to the previous 1e-11.
Try limiting the number of allowed iterations to half of those taken in the previous computation.
test_rosen3
x =
0.2852 0.0716
fval =
0.5204
eflag =
output =
iterations: 20
funcCount: 91
algorithm: 'sqp'
message: 'Solver stopped prematurely. fmincon stopped because it exceeded the iteration limit.'
constrviolation: 0
stepsize: 0.0225
lssteplength: 1
firstorderopt: 1.9504
ans =
0.2852 0.0716
With this severe iteration limit, fmincon does not reach a good solution. The tradeoff between
accuracy and speed can be difficult to manage.
To save function evaluations and possibly increase accuracy, use the built-in derivatives of the
example by setting the SpecifyObjectiveGradient option to true.
test_rosen4_mex
x =
1.0000 1.0000
fval =
3.3610e-20
eflag =
output =
iterations: 40
funcCount: 113
algorithm: 'sqp'
constrviolation: 0
stepsize: 9.6356e-08
lssteplength: 1
ans =
1.0000 1.0000
Compared to test_rosen2, the number of iterations is the same at 40, but the number of function
evaluations is lower at 113 instead of 155. The result has a better (lower) objective function value of
3e-20 compared to 2e-11.
See Also
codegen | fmincon | optimoptions
More About
• “Code Generation in fmincon” on page 6-114
• “Code Generation for quadprog” on page 11-51
• “Static Memory Allocation for fmincon Code Generation” on page 6-120
• “Optimization Code Generation for Real-Time Applications” on page 6-122
The problem is a simple nonlinear minimization with both a nonlinear constraint function and linear
constraints. The sizes of the linear constraint matrices change at each iteration, which causes the
memory requirements to increase at each iteration. The example shows how to use the
coder.varsize command to set the appropriate variable sizes for static memory allocation.
The nlp_for_loop.m file contains the objective function, linear constraints, and nonlinear
constraint function. Copy the following code to create this file on your MATLAB path.
function nlp_for_loop
% Driver for an example fmincon use case. Adding constraints increases the
% minimum and uses more memory.

nVar = 5;      % number of problem variables (the initial point below has 5 entries)
maxIneq = 4;   % maximum number of linear inequality constraints (assumed value)

A = zeros(0,nVar);
b = zeros(0,1);
% The next step is required for static memory support. Because you concatenate
% constraints in a "for" loop, you need to limit the dimensions of the
% constraint matrices.
%coder.varsize('var name', [maxRows, maxCols], [canRowsChange, canColsChange]);
coder.varsize('A',[maxIneq,nVar],[true,false]);
coder.varsize('b',[maxIneq,1],[true,false]);
Aeq = [1,0,0,0,1];
beq = 0;
lb = [];
ub = [];
% Initial point
x0 = [2;-3;0;0;-2];
options = optimoptions('fmincon','Algorithm','sqp','Display','none');
for idx = 1:maxIneq
    % Add a new linear inequality constraint at each iteration
    A = [A; circshift([1,1,0,0,0],idx-1)];
    b = [b; -1];

    [x,fval,exitflag] = fmincon(@rosenbrock_nd,x0,A,b,Aeq,beq,...
        lb,ub,@circleconstr,options);

    % Set initial point to found point
    x0 = x;

    % Print fval, ensuring that the datatypes are consistent with the
    % corresponding fprintf format specifiers
    fprintf('%i Inequality Constraints; fval: %f; Exitflag: %i \n',...
        int32(numel(b)),fval,int32(exitflag));
end
end
Static Memory Allocation for fmincon Code Generation
function [c,ceq] = circleconstr(x)
% Nonlinear constraint: keep the point inside a ball of the given radius
radius = 2;
ceq = [];
c = sum(x.^2) - radius^2;
end
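The objective rosenbrock_nd referenced in the driver loop is also missing from the excerpt; an n-dimensional Rosenbrock sketch that works for the 5-variable problem:

```matlab
function f = rosenbrock_nd(x)
% n-dimensional Rosenbrock objective, summed over consecutive coordinate pairs
f = 100*sum((x(2:end) - x(1:end-1).^2).^2) + sum((1 - x(1:end-1)).^2);
end
```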
To generate code from this file using static memory allocation, set the coder configuration as follows.
cfg = coder.config('mex');
cfg.DynamicMemoryAllocation = 'Off'; % No dynamic memory allocation
cfg.SaturateOnIntegerOverflow = false; % No MATLAB integer saturation checking
cfg.IntegrityChecks = false; % No checking for out-of-bounds access in arrays
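With the configuration set, generate the MEX function (invocation form assumed; the file takes no inputs):

```matlab
codegen -config cfg nlp_for_loop
```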
nlp_for_loop_mex
The function value increases at each iteration because the problem has more constraints.
See Also
More About
• “Code Generation in fmincon” on page 6-114
• “Code Generation for Optimization Basics” on page 6-116
• “Optimization Code Generation for Real-Time Applications” on page 6-122
For general advice on writing efficient code for code generation, see “MATLAB Code Design
Considerations for Code Generation” (MATLAB Coder).
• Check the clock speeds of your target hardware and your computer. Scale your benchmarking
results accordingly.
• Set maxNumCompThreads in MATLAB to 1, because the default LAPACK and BLAS libraries
generated by MATLAB Coder are single-threaded.
lastN = maxNumCompThreads(1);
N = maxNumCompThreads(lastN);
% Alternatively,
% N = maxNumCompThreads('automatic');
Note If your target hardware has multiple cores and you use custom multithreaded LAPACK and
BLAS libraries, then set maxNumCompThreads to the number of threads on the target hardware.
See “Speed Up Linear Algebra in Generated Standalone Code by Using LAPACK Calls” (MATLAB
Coder).
• If you have an Embedded Coder license, see these topics for details on reliable ways to evaluate
the resulting performance of your embedded code: “Speed Up Linear Algebra in Code Generated
from a MATLAB Function Block” (Embedded Coder), “Speed Up Matrix Operations in Code
Generated from a MATLAB Function Block” (Embedded Coder), “Verification” (Embedded Coder),
and “Performance” (Embedded Coder).
cfg = coder.config('mex');
To save time in the generated code, turn off integrity checks and checks for integer saturation.
Solvers do not rely on these checks to function properly, assuming that the objective function and
nonlinear constraint function do not require them. For details, see “Control Run-Time Checks”
(MATLAB Coder).
Optimization Code Generation for Real-Time Applications
cfg.IntegrityChecks = false;
cfg.SaturateOnIntegerOverflow = false;
Typically, generated code runs faster when using static memory allocation, although this allocation
can increase the amount of generated code. Also, some hardware does not support dynamic memory
allocation. To use static memory allocation, specify this setting.
cfg.DynamicMemoryAllocation = 'Off';
You can improve the performance of your code by selecting different types of BLAS, the underlying
linear algebra subprograms. To learn how to set the BLAS for your generated code, see “Speed Up
Matrix Operations in Generated Standalone Code by Using BLAS Calls” (MATLAB Coder). If you want
the embedded application to run in parallel, you must supply BLAS or LAPACK libraries that support
parallel computation on your system. Similarly, when you have parallel hardware, you can improve
the performance of your code by setting custom LAPACK calls. See “Speed Up Linear Algebra in
Generated Standalone Code by Using LAPACK Calls” (MATLAB Coder).
If your optimization problem has parameters that change slowly, and includes only a few
control variables, then trying to estimate a solution from previous solutions can be worthwhile.
Construct a model of the solution as a function of the parameters, either as a quadratic in the
parameters or as a low-dimensional interpolation, and use the predicted solution point as a starting
point for the solver.
opts = optimoptions('fmincon','Algorithm','sqp','MaxIterations',50);
[x,fval,exitflag] = fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon,opts)
However, the result can be far from an optimum. Ensure that an inaccurate result does not overly
affect your system. Set MaxIterations as large as possible while still meeting your time constraint.
You can estimate this value by measuring how long an iteration takes, or by measuring how long a
function evaluation takes, and then either setting the MaxFunctionEvaluations option or the
MaxIterations option. For an example, see “Code Generation for Optimization Basics” on page 6-
116.
For further suggestions on settings that can speed the solver, see “Solver Takes Too Long” on page
4-9. Note that some suggestions in this topic do not apply because of limitations in code generation.
See “Code Generation in fmincon” on page 6-114 or “Code Generation for quadprog” on page 11-51.
Global Minimum
You might want a global minimum, not just a local minimum, as a solution. Searching for a global
minimum can take a great deal of time, and is not guaranteed to work. For suggestions, see
“Searching for a Smaller Minimum” on page 4-22.
See Also
codegen | fmincon | optimoptions | quadprog
More About
• “Code Generation in fmincon” on page 6-114
• “Code Generation for quadprog” on page 11-51
• “Code Generation for Optimization Basics” on page 6-116
• “Generate Code for quadprog” on page 11-53
• “Static Memory Allocation for fmincon Code Generation” on page 6-120
One-Dimensional Semi-Infinite Constraints
where

K1(x, w1) = sin(w1*x1)*cos(w1*x2) - (1/1000)*(w1 - 50)^2 - sin(w1*x3) - x3 ≤ 1,
K2(x, w2) = sin(w2*x2)*cos(w2*x1) - (1/1000)*(w2 - 50)^2 - sin(w2*x3) - x3 ≤ 1,

for

1 ≤ w1 ≤ 100,
1 ≤ w2 ≤ 100.
Note that the semi-infinite constraints are one-dimensional, that is, vectors. Because the constraints
must be in the form Ki(x,wi) ≤ 0, you need to compute the constraints as
K1(x, w1) = sin(w1*x1)*cos(w1*x2) - (1/1000)*(w1 - 50)^2 - sin(w1*x3) - x3 - 1 ≤ 0,
K2(x, w2) = sin(w2*x2)*cos(w2*x1) - (1/1000)*(w2 - 50)^2 - sin(w2*x3) - x3 - 1 ≤ 0.
First, write a file myfun.m that computes the objective function.

function f = myfun(x,s)
% Objective function
f = sum((x-0.5).^2);
Second, write a file mycon.m that computes the nonlinear equality and inequality constraints and the
semi-infinite constraints.

function [c,ceq,K1,K2,s] = mycon(X,s)
if isnan(s(1,1))
    s = [0.2 0; 0.2 0]; % initial sampling interval (assumed)
end
w1 = 1:s(1,1):100;  % sample set
w2 = 1:s(2,1):100;
% Semi-infinite constraints
K1 = sin(w1*X(1)).*cos(w1*X(2)) - 1/1000*(w1-50).^2 -...
    sin(w1*X(3))-X(3)-1;
K2 = sin(w2*X(2)).*cos(w2*X(1)) - 1/1000*(w2-50).^2 -...
    sin(w2*X(3))-X(3)-1;
c = []; ceq = [];   % no finite nonlinear constraints
% Plot the semi-infinite constraints
plot(w1,K1,'-',w2,K2,':')
title('Semi-infinite constraints')
drawnow
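The solver invocation was lost from this excerpt; a typical call, declaring the two semi-infinite constraints to fseminf (the starting point is an assumption):

```matlab
x0 = [0.5; 0.2; 0.3];                    % starting point (assumed)
[x,fval] = fseminf(@myfun,x0,2,@mycon);  % 2 = number of semi-infinite constraints
```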
x
x =
0.6675
0.3012
0.4022
The function value and the maximum values of the semi-infinite constraints at the solution x are
fval
fval =
0.0771
This plot shows that the peaks of both constraints lie on the constraint boundary.
The plot command inside mycon.m slows down the computation. Remove this line to improve the
speed.
See Also
fseminf
Related Examples
• “Two-Dimensional Semi-Infinite Constraint” on page 6-128
• “Analyzing the Effect of Uncertainty Using Semi-Infinite Programming” on page 6-131
where

K1(x, w) = sin(w1*x1)*cos(w2*x2) - (1/1000)*(w1 - 50)^2 - sin(w1*x3) - x3 + ...
    sin(w2*x2)*cos(w1*x1) - (1/1000)*(w2 - 50)^2 - sin(w2*x3) - x3 ≤ 1.5,

for

1 ≤ w1 ≤ 100,
1 ≤ w2 ≤ 100.
First, write a file myfun.m that computes the objective function.

function f = myfun(x,s)
% Objective function
f = sum((x-0.2).^2);
Second, write a file for the constraints, called mycon.m. Include code to draw the surface plot of the
semi-infinite constraint each time mycon is called. This enables you to see how the constraint
changes as X is being minimized.
function [c,ceq,K1,s] = mycon(X,s)
if isnan(s(1,1))
    s = [2 2]; % initial sampling interval (assumed)
end
% Sampling set
w1x = 1:s(1,1):100;
w1y = 1:s(1,2):100;
[wx,wy] = meshgrid(w1x,w1y);
% Semi-infinite constraint
K1 = sin(wx*X(1)).*cos(wx*X(2))-1/1000*(wx-50).^2 -...
    sin(wx*X(3))-X(3)+sin(wy*X(2)).*cos(wx*X(1))-...
    1/1000*(wy-50).^2-sin(wy*X(3))-X(3)-1.5;
c = []; ceq = [];   % no finite nonlinear constraints
% Mesh plot
m = surf(wx,wy,K1,'edgecolor','none','facecolor','interp');
camlight headlight
title('Semi-infinite constraint')
drawnow
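The corresponding solver call is not shown in this excerpt; with one semi-infinite constraint it has the form (starting point assumed):

```matlab
x0 = [0.25, 0.25, 0.25];                 % starting guess (assumed)
[x,fval] = fseminf(@myfun,x0,1,@mycon);  % 1 = number of semi-infinite constraints
```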
Two-Dimensional Semi-Infinite Constraint
The goal was to minimize the objective f(x) such that the semi-infinite constraint satisfied
K1(x,w) ≤ 1.5. Evaluating mycon at the solution x and looking at the maximum element of the matrix
K1 shows the constraint is easily satisfied.
[c,ceq,K1] = mycon(x,[0.5,0.5]); % Sampling interval 0.5
max(max(K1))
ans =
-0.0370
This call to mycon produces the following surf plot, which shows the semi-infinite constraint at x.
See Also
fseminf
Related Examples
• “One-Dimensional Semi-Infinite Constraints” on page 6-125
• “Analyzing the Effect of Uncertainty Using Semi-Infinite Programming” on page 6-131
Analyzing the Effect of Uncertainty Using Semi-Infinite Programming
The problem illustrated in this example involves the control of air pollution. Specifically, a set of chimney stacks is to be built in a given geographic area. As the height of each chimney stack increases, the ground-level concentration of pollutants from the stack decreases. However, the construction cost of each chimney stack increases with height. We will solve a problem to minimize the cumulative height of the chimney stacks, and hence the construction cost, subject to the ground-level pollution concentration not exceeding a legislated limit. This problem is outlined in the following reference:
Air pollution control with semi-infinite programming, A.I.F. Vaz and E.C. Ferreira, XXVIII Congreso
Nacional de Estadistica e Investigacion Operativa, October 2004
In this example we will first solve the problem published in the above article as the Minimal Stack
Height problem. The models in this problem are dependent on several parameters, two of which are
wind speed and direction. All model parameters are assumed to be known exactly in the first solution
of the problem.
We then extend the original problem by allowing the wind speed and direction parameters to vary
within given ranges. This will allow us to analyze the effects of uncertainty in these parameters on
the optimal solution to this problem.
Consider a 20km-by-20km region, R, in which ten chimney stacks are to be placed. These chimney
stacks release several pollutants into the atmosphere, one of which is sulfur dioxide. The x, y
locations of the stacks are fixed, but the height of the stacks can vary.
Constructors of the stacks would like to minimize the total height of the stacks, thus minimizing
construction costs. However, this is balanced by the conflicting requirement that the concentration of
sulfur dioxide at any point on the ground in the region R must not exceed the legislated maximum.
First, let's plot the chimney stacks at their initial height. Note that we have zoomed in on a 4km-
by-4km subregion of R which contains the chimney stacks.
h0 = [210;210;180;180;150;150;120;120;90;90];
plotChimneyStacks(h0, 'Chimney Stack Initial Height');
There are two environment related parameters in this problem, the wind speed and direction. Later
in this example we will allow these parameters to vary, but for the first problem we will set these
parameters to typical values.
Now let's plot the ground level concentration of sulfur dioxide (SO2) over the entire region R
(remember that the plot of chimney stacks was over a smaller region). The SO2 concentration has
been calculated with the chimney stacks set to their initial heights.
We can see that the concentration of SO2 varies over the region of interest. There are two features of
the Sulfur Dioxide graph of note:
• SO2 concentration rises in the top left hand corner of the (x,y) plane
• SO2 concentration is approximately zero throughout most of the region
In very simple terms, the first feature is due to the prevailing wind, which is blowing SO2 toward the
top left hand corner of the (x,y) plane in this example. The second factor is due to SO2 being
transported to the ground via diffusion. This is a slower process compared to the prevailing wind and
thus SO2 only reaches ground level in the top left hand corner of the region of interest.
For a more detailed discussion of atmospheric dispersion from chimney stacks, consult the reference
cited in the introduction.
The pink plane indicates an SO2 concentration of 0.000125 g/m^3. This is the legislated maximum that the sulfur dioxide concentration must not exceed in the region R. The graph clearly shows that the SO2 concentration exceeds this maximum at the initial chimney stack height.
Examine the MATLAB file concSulfurDioxide to see how the sulfur dioxide concentration is
calculated.
Before we solve the minimal stack height problem, we will outline how fseminf solves a semi-infinite
problem. A general semi-infinite programming problem can be stated as:
min f(x)

such that

A·x ≤ b, Aeq·x = beq (linear constraints),
c(x) ≤ 0, ceq(x) = 0 (nonlinear constraints),
l ≤ x ≤ u (bound constraints),

and

Kj(x, w) ≤ 0, for all w in Ij (semi-infinite constraints, j = 1, ..., K).
This formulation allows you to specify constraints for a nonlinear optimization problem that must be satisfied over intervals of an auxiliary variable, w. Note that for fseminf, this variable must be either one- or two-dimensional for each semi-infinite constraint.
The function fseminf solves the general semi-infinite problem by starting from an initial value, x0,
and using an iterative procedure to obtain an optimum solution, xopt.
The key component of the algorithm is the handling of the "semi-infinite" constraints, K j. At xopt it is
required that the K j must be feasible at every value of w in the interval I j. This constraint can be
simplified by considering all the local maxima of K j with respect to w in the interval I j. The original
constraint is equivalent to requiring that the value of K j at each of the above local maxima is feasible.
fseminf calculates an approximation to all the local maximum values of each semi-infinite
constraint, K j. To do this, fseminf first calculates each semi-infinite constraint over a mesh of w
values. A simple differencing scheme is then used to calculate all the local maximum values of K j
from the evaluated semi-infinite constraint.
As we will see later, you create this mesh in your constraint function. The spacing you should use for each w-coordinate of the mesh is supplied to your constraint function by fseminf. At each iteration, fseminf performs the following steps:
1 Evaluate K j over a mesh of w-values using the current mesh spacing for each w-coordinate.
2 Calculate an approximation to all the local maximum values of K j using the evaluation of K j from
step 1.
3 Replace each K j in the general semi-infinite problem with the set of local maximum values found
in steps 1-2. The problem now has a finite number of nonlinear constraints. fseminf uses the
SQP algorithm used by fmincon to take one iteration step of the modified problem.
4 Check if any of the SQP algorithm's stopping criteria are met at the new point x. If any criteria
are met the algorithm terminates; if not, fseminf continues to step 5. For example, if the first
order optimality value for the problem defined in step 3 is less than the specified tolerance then
fseminf will terminate.
5 Update the mesh spacing used in the evaluation of the semi-infinite constraints in step 1.
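To illustrate steps 1 and 2 in isolation, the following sketch finds the interior local maxima of a sampled one-dimensional constraint by simple differencing. The constraint K here is a made-up example, not fseminf's internal code.

```matlab
% Sample a 1-D semi-infinite constraint K(w) on a mesh over its interval
w = 1:0.5:100;                    % mesh with spacing 0.5
K = sin(w/5) - (w - 50).^2/1000;  % example constraint values
% An interior mesh point is a local maximum if it exceeds both neighbors
isMax = [false, K(2:end-1) > K(1:end-2) & K(2:end-1) > K(3:end), false];
wMax = w(isMax);   % approximate local maximizers
KMax = K(isMax);   % constraint values at the local maxima
```

fseminf then imposes the constraint values at points such as wMax as a finite set of ordinary nonlinear constraints at the current iterate.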
Before we can call fseminf to solve the problem, we need to write a function to evaluate the nonlinear constraints in this problem. The constraint to be implemented is that the ground-level sulfur dioxide concentration must not exceed 0.000125 g/m^3 at every point in region R.
This is a semi-infinite constraint, and the implementation of the constraint function is explained in
this section. For the minimal stack height problem we have implemented the constraint in the
MATLAB file airPollutionCon.
type airPollutionCon.m
function [c, ceq, K1, s] = airPollutionCon(h, s, theta, U)
% Initial sampling interval
if isnan(s(1,1))
    s = [1000 4000];
end
% Define the grid that the "infinite" constraints will be evaluated over
w1x = -20000:s(1,1):20000;
w1y = -20000:s(1,2):20000;
[t1,t2] = meshgrid(w1x,w1y);
% SO2 concentration minus legislated limit, rescaled (argument order illustrative)
K1 = 1e4*(concSulfurDioxide(t1, t2, h, theta, U) - 1.25e-4);
% No finite constraints
c = [];
ceq = [];
This function illustrates the general structure of a constraint function for a semi-infinite programming
problem. In particular, a constraint function for fseminf can be broken up into three parts:
1. Initialize the mesh size

Recall that fseminf evaluates the "semi-infinite" constraints over a mesh as part of the overall calculation of these constraints. When fseminf calls your constraint function, it supplies the mesh spacing you should use. fseminf initially calls your constraint function with the mesh spacing, s, set to NaN, which allows you to initialize the mesh size for the constraint evaluation. Here, we have one "infinite" constraint in two "infinite" variables, which means we need to initialize the mesh size to a 1-by-2 vector, in this case, s = [1000 4000].
2. Define the mesh that will be used for the constraint evaluation
A mesh that will be used for the constraint evaluation needs to be created. The three lines of code
following the comment "Define the grid that the "infinite" constraints will be evaluated over" in
airPollutionCon can be modified for most 2-d semi-infinite programming problems.
3. Calculate the constraints over the mesh

Once the mesh has been defined, the constraints can be calculated over it. These constraints are then returned to fseminf from the above constraint function.
Note that in this problem, we have also rescaled the constraints so that they vary on a scale which is
closer to that of the objective function. This helps fseminf to avoid scaling issues associated with
objectives and constraints which vary on disparate scales.
We can now call fseminf to solve the problem. The chimney stacks must all be at least 10m tall and
we use the initial stack height specified earlier. Note that the third input argument to fseminf below
(1) indicates that there is only one semi-infinite constraint.
lb = 10*ones(size(h0));
[hsopt, sumh, exitflag] = fseminf(@(h)sum(h), h0, 1, ...
@(h,s) airPollutionCon(h,s,theta0,U0), [], [], [], [], lb);
The minimum cumulative height computed by fseminf is considerably higher than the initial total
height of the chimney stacks. We will see how the minimum cumulative height changes when
parameter uncertainty is added to the problem later in the example. For now, let's plot the chimney
stacks at their optimal height.
Examine the MATLAB file plotChimneyStacks to see how the plot was generated.
Recall that fseminf determines that the semi-infinite constraint is satisfied everywhere by ensuring
that discretized maxima of the constraint are below the specified bound. We can verify that the semi-
infinite constraint is satisfied everywhere by plotting the ground level sulfur dioxide concentration for
the optimal stack height.
Note that the sulfur dioxide concentration takes its maximum possible value in the upper left corner
of the (x, y) plane, i.e. at x = -20000m, y = 20000m. This point is marked by the blue dot in the figure
below and verified by calculating the sulfur dioxide concentration at this point.
Examine the MATLAB file plotSulfurDioxide to see how the plots were generated.
The sulfur dioxide concentration depends on several environmental factors which were held at fixed
values in the above problem. Two of the environmental factors are wind speed and wind direction.
See the reference cited in the introduction for a more detailed discussion of all the problem
parameters.
We can investigate the change in behavior of the system with respect to the wind speed and direction. In this section of the example, we want to make sure that the sulfur dioxide limits are satisfied even if the wind direction varies from 3.82 rad to 4.18 rad and the mean wind speed varies between 5 m/s and 6.2 m/s.
We need to implement a semi-infinite constraint to ensure that the sulfur dioxide concentration does
not exceed the limit in region R. This constraint is required to be feasible for all pairs of wind speed
and direction.
Such a constraint will have four "infinite" variables (wind speed and direction and the x-y coordinates
of the ground). However, any semi-infinite constraint supplied to fseminf can have no more than two
"infinite" variables.
To implement this constraint in a suitable form for fseminf, we recall the SO2 concentration at the
optimum stack height in the previous problem. In particular, the SO2 concentration takes its
maximum possible value at x = -20000 m, y = 20000 m. To reduce the number of "infinite" variables, we will assume that the SO2 concentration also takes its maximum value at this point when uncertainty is present. We then require that the SO2 concentration at this point is below 0.000125 g/m^3 for all pairs of wind speed and direction.
This means that the "infinite" variables for this problem are wind speed and direction. To see how this
constraint has been implemented, inspect the MATLAB file uncertainAirPollutionCon.
type uncertainAirPollutionCon.m
function [c, ceq, K1, s] = uncertainAirPollutionCon(h, s)
% Define the grid that the "infinite" constraints will be evaluated over
w1x = 3.82:s(1,1):4.18; % Wind direction
w1y = 5.0:s(1,2):6.2; % Wind speed
[t1,t2] = meshgrid(w1x,w1y);
% No finite constraints
c = [];
ceq = [];
This constraint function can be divided into the same three parts as before:

1. Initialize the mesh size

The code following the comment "Initial sampling interval" initializes the mesh size.
2. Define the mesh that will be used for the constraint evaluation
The next section of code creates a mesh (now in wind speed and direction) using a similar
construction to that used in the initial problem.
3. Calculate the constraints over the mesh

The remainder of the code calculates the SO2 concentration at each point of the wind speed/direction mesh. These constraints are then returned to fseminf from the above constraint function.
We can now call fseminf to solve the stack height problem considering uncertainty in the
environmental factors.
fprintf('\nMinimal computed cumulative height of chimney stacks with uncertainty: %7.2f m\n', sum
We can now look at the difference between the minimum computed cumulative stack height for the
problem with and without parameter uncertainty. You should be able to see that the minimum
cumulative height increases when uncertainty is added to the problem. This expected increase in
height allows the SO2 concentration to remain below the legislated maximum for all wind speed/
direction pairs in the specified range.
We can check that the sulfur dioxide concentration does not exceed the limit over the region of
interest via inspection of a sulfur dioxide plot. For a given (x, y) point, we plot the maximum SO2
concentration for the wind speed and direction in the stated ranges. Note that we have zoomed in on
the upper left corner of the X-Y plane.
We finally plot the chimney stacks at their optimal height when there is uncertainty in the problem
definition.
There are many options available for the semi-infinite programming algorithm, fseminf. Consult the
Optimization Toolbox™ User's Guide for details, in the Using Optimization Toolbox Solvers chapter,
under Constrained Nonlinear Optimization: fseminf Problem Formulation and Algorithm.
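For example, to monitor iterations you might create options with optimoptions and pass them as the last input to fseminf. This is a sketch; the option values shown are illustrative.

```matlab
options = optimoptions('fseminf','Display','iter','ConstraintTolerance',1e-6);
[hsopt,sumh] = fseminf(@(h)sum(h), h0, 1, ...
    @(h,s) airPollutionCon(h,s,theta0,U0), ...
    [], [], [], [], lb, [], options);
```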
See Also
More About
• “One-Dimensional Semi-Infinite Constraints” on page 6-125
• “Two-Dimensional Semi-Infinite Constraint” on page 6-128
7
Nonlinear Problem-Based
A rational function is a quotient of polynomials. When the objective function is a rational function of
optimization variables, you can create the objective function expression directly from the variables.
(In contrast, when your objective function is not a rational function, you must create a MATLAB®
function that represents the objective and then convert the function to an expression by using
fcn2optimexpr.)
f(x,y) = (x − y)^2/(4 + (x + y)^4) · (x + y^2)/(1 + y^2).
x = optimvar('x');
y = optimvar('y');
f = (x-y)^2/(4+(x+y)^4)*(x+y^2)/(1+y^2);
To find the minimum of this objective function, create an optimization problem with f as the
objective, set an initial point, and call solve.
prob = optimproblem('Objective',f);
x0.x = -1;
x0.y = 1;
[sol,fval,exitflag,output] = solve(prob,x0)
fval = -1.0945
exitflag =
OptimalSolution
Rational Objective Function, Problem-Based
solver: 'fminunc'
The exit flag shows that the reported solution is a local minimum. The output structure shows that
the solver took just 30 function evaluations to reach the minimum.
See Also
fcn2optimexpr
More About
• “Problem-Based Optimization Setup”
To find the minimum value of a nonlinear objective function using the problem-based approach, first
write the objective function as a file or anonymous function. The objective function for this example is
type objfunx
function f = objfunx(x,y)
f = exp(x).*(4*x.^2 + 2*y.^2 + 4*x.*y + 2*y - 1);
end
x = optimvar('x');
y = optimvar('y');
obj = objfunx(x,y);
prob = optimproblem('Objective',obj);
Create a nonlinear constraint that the solution lies in a tilted ellipse, specified as
x·y/2 + (x + 2)^2 + (y − 2)^2/2 ≤ 2.
TiltEllipse = x.*y/2 + (x+2).^2 + (y-2).^2/2 <= 2;
prob.Constraints.constr = TiltEllipse;
x0.x = -3;
x0.y = 3;
show(prob)
OptimizationProblem :
Solve for:
x, y
Solve Constrained Nonlinear Optimization, Problem-Based
minimize :
(exp(x) .* (((((4 .* x.^2) + (2 .* y.^2)) + ((4 .* x) .* y))
+ (2 .* y)) - 1))
subject to constr:
((((x .* y) ./ 2) + (x + 2).^2) + ((y - 2).^2 ./ 2)) <= 2
[sol,fval] = solve(prob,x0)
fval = 0.3299
x0.x = -1;
x0.y = 1;
[sol2,fval2] = solve(prob,x0)
fmincon stopped because the size of the current step is less than
the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
fval2 = 0.7626
Plot the ellipse, the objective function contours, and the two solutions.
f = @objfunx;
g = @(x,y) x.*y/2+(x+2).^2+(y-2).^2/2-2;
rnge = [-5.5 -0.25 -0.25 7];
fimplicit(g,'k-')
axis(rnge);
hold on
fcontour(f,rnge,'LevelList',logspace(-1,1))
plot(sol.x,sol.y,'ro','LineWidth',2)
plot(sol2.x,sol2.y,'ko','LineWidth',2)
legend('Constraint','f Contours','Global Solution','Local Solution','Location','northeast');
hold off
The solutions are on the nonlinear constraint boundary. The contour plot shows that these are the
only local minima. The plot also shows that there is a stationary point near [–2,3/2], and local maxima
near [–2,0] and [–1,4].
For some objective functions or software versions, you must convert nonlinear functions to
optimization expressions by using fcn2optimexpr. See “Supported Operations on Optimization
Variables and Expressions” on page 10-36. Pass the x and y variables in the fcn2optimexpr call to
indicate which optimization variable corresponds to each objfunx input.
obj = fcn2optimexpr(@objfunx,x,y);
Create an optimization problem with obj as the objective function just as before.
prob = optimproblem('Objective',obj);
See Also
fcn2optimexpr
More About
• “Problem-Based Nonlinear Optimization”
Function File
To use a function file in the problem-based approach, you need to convert the file to an expression
using fcn2optimexpr.
type expfn3.m
To use this function file as an optimization expression, first create optimization variables of the
appropriate sizes.
u = optimvar('u',3,3,'LowerBound',-1,'UpperBound',1);
v = optimvar('v',3);
[f,g,mineval] = fcn2optimexpr(@expfn3,u,v);
Because all returned expressions are scalar, you can save computing time by specifying the
expression sizes using the 'OutputSize' name-value pair argument. Also, because expfn3
computes all of the outputs, you can save more computing time by using the ReuseEvaluation
name-value pair.
[f,g,mineval] = fcn2optimexpr(@expfn3,u,v,'OutputSize',[1,1],'ReuseEvaluation',true)
f =
Nonlinear OptimizationExpression
[argout,~,~] = expfn3(u, v)
g =
Nonlinear OptimizationExpression
[~,argout,~] = expfn3(u, v)
mineval =
Nonlinear OptimizationExpression
[~,~,argout] = expfn3(u, v)
Convert Nonlinear Function to Optimization Expression
Anonymous Function
To use a general nonlinear function handle in the problem-based approach, convert the handle to an
optimization expression using fcn2optimexpr. For example, write a function handle equivalent to f
and convert it.
fun = @(x,y)-exp(-y'*x*y);
funexpr = fcn2optimexpr(fun,u,v,'OutputSize',[1,1])
funexpr =
Nonlinear OptimizationExpression
anonymousFunction1(u, v)
where:
anonymousFunction1 = @(x,y)-exp(-y'*x*y);
Create Objective
Define Constraints
Also define the constraints that u is symmetric and that mineval ≥ −1/2.
prob.Constraints.sym = u == u.';
prob.Constraints.mineval = mineval >= -1/2;
OptimizationProblem :
Solve for:
u, v
minimize :
[argout,~,~] = expfn3(u, v)
subject to nlcons1:
arg_LHS <= 0
where:
subject to sym:
u(2, 1) - u(1, 2) == 0
u(3, 1) - u(1, 3) == 0
-u(2, 1) + u(1, 2) == 0
u(3, 2) - u(2, 3) == 0
-u(3, 1) + u(1, 3) == 0
-u(3, 2) + u(2, 3) == 0
subject to mineval:
arg_LHS >= (-0.5)
where:
variable bounds:
-1 <= u(1, 1) <= 1
-1 <= u(2, 1) <= 1
-1 <= u(3, 1) <= 1
-1 <= u(1, 2) <= 1
-1 <= u(2, 2) <= 1
-1 <= u(3, 2) <= 1
-1 <= u(1, 3) <= 1
-1 <= u(2, 3) <= 1
-1 <= u(3, 3) <= 1
Solve Problem
fval = -403.4288
exitflag =
OptimalSolution
funcCount: 965
constrviolation: 5.3174e-12
stepsize: 7.1054e-05
algorithm: 'interior-point'
firstorderopt: 7.3458e-04
cgiterations: 66
message: '...'
solver: 'fmincon'
disp(sol.u)
disp(sol.v)
2.0000
-2.0000
2.0000
See Also
fcn2optimexpr
More About
• “Problem-Based Optimization Setup”
For an equivalent solver-based example using Symbolic Math Toolbox™, see “Symbolic Math
Toolbox™ Calculates Gradients and Hessians” on page 6-94.
Problem Geometry
This example involves a conducting body defined by the following inequalities. For each electron with
coordinates (x, y, z),
z ≤ −|x| − |y|,
x^2 + y^2 + (z + 1)^2 ≤ 1.
These constraints form a body that looks like a pyramid on a sphere. To view the body, enter the
following code.
[X,Y] = meshgrid(-1:.01:1);
Z1 = -abs(X) - abs(Y);
Z2 = -1 - sqrt(1 - X.^2 - Y.^2);
Z2 = real(Z2);
W1 = Z1; W2 = Z2;
W1(Z1 < Z2) = nan; % only plot points where Z1 > Z2
W2(Z1 < Z2) = nan; % only plot points where Z1 > Z2
hand = figure; % handle to the figure, since we'll plot more later
set(gcf,'Color','w') % white background
surf(X,Y,W1,'LineStyle','none');
hold on
surf(X,Y,W2,'LineStyle','none');
view(-44,18)
Constrained Electrostatic Nonlinear Optimization, Problem-Based
A slight gap exists between the upper and lower surfaces of the figure. This gap is an artifact of the
general plotting routine used to create the figure. The routine erases any rectangular patch on one
surface that touches the other surface.
The problem has ten electrons. The constraints give bounds on each x and y value from –1 to 1, and
the z value from –2 to 0. Define the variables for the problem.
N = 10;
x = optimvar('x',N,'LowerBound',-1,'UpperBound',1);
y = optimvar('y',N,'LowerBound',-1,'UpperBound',1);
z = optimvar('z',N,'LowerBound',-2,'UpperBound',0);
elecprob = optimproblem;
Define Constraints
The problem has two types of constraints. The first, a spherical constraint, is a simple polynomial
inequality for each electron separately. Define this spherical constraint.
elecprob.Constraints.spherec = x.^2 + y.^2 + (z+1).^2 <= 1;
The preceding constraint command creates a vector of ten constraints. View the constraint vector
using show.
show(elecprob.Constraints.spherec)
where:
arg2 = 1;
arg1 = arg2([1 1 1 1 1 1 1 1 1 1]);
arg_RHS = arg1(:);
The second type of constraint in the problem is linear. You can express the linear constraints in
different ways. For example, you can use the abs function to represent an absolute value constraint.
To express the constraints this way, write a MATLAB function and convert it to an expression using
fcn2optimexpr. For a different approach, write the absolute value constraint as four linear
inequalities. Each constraint command returns a vector of ten constraints.
elecprob.Constraints.plane1 = z <= -x-y;
elecprob.Constraints.plane2 = z <= -x+y;
elecprob.Constraints.plane3 = z <= x-y;
elecprob.Constraints.plane4 = z <= x+y;
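As a sketch of the abs-based alternative mentioned above, you could express all four plane constraints at once through fcn2optimexpr. The constraint name pyramid is illustrative, not part of the shipped example.

```matlab
% z <= -|x| - |y| is equivalent to z + |x| + |y| <= 0 for each electron
pyramid = fcn2optimexpr(@(x,y,z) z + abs(x) + abs(y), x, y, z, ...
    'OutputSize', [N,1]);
elecprob.Constraints.pyramid = pyramid <= 0;
```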
The objective function is the potential energy of the system, which is a sum over each electron pair of
the inverse of their distances:
1
energy = ∑ ‖electron(i) − electron( j)‖
.
i< j
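One way to build this objective as an optimization expression is a double loop over electron pairs. This is a sketch consistent with the formula above; the variable name energy is illustrative.

```matlab
energy = optimexpr(1);  % scalar optimization expression, initially zero
for ii = 1:N-1
    for jj = ii+1:N     % each pair (ii,jj) with ii < jj counted once
        d2 = (x(ii)-x(jj))^2 + (y(ii)-y(jj))^2 + (z(ii)-z(jj))^2;
        energy = energy + d2^(-1/2);   % 1/distance between the electrons
    end
end
elecprob.Objective = energy;
```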
Run Optimization
Start the optimization with the electrons distributed randomly on a sphere of radius 1/2 centered at
[0,0,–1].
rng default % For reproducibility
x0 = randn(N,3);
for ii=1:N
x0(ii,:) = x0(ii,:)/norm(x0(ii,:))/2;
x0(ii,3) = x0(ii,3) - 1;
end
init.x = x0(:,1);
init.y = x0(:,2);
init.z = x0(:,3);
Solve the problem by calling solve.
[sol,fval,exitflag,output] = solve(elecprob,init)
fval = 34.1365
exitflag =
OptimalSolution
View Solution
figure(hand)
plot3(sol.x,sol.y,sol.z,'r.','MarkerSize',25)
hold off
The electrons are distributed fairly evenly on the constraint boundary. The electrons are mainly on
edges and the pyramid point.
Reference
[1] Dolan, Elizabeth D., Jorge J. Moré, and Todd S. Munson. “Benchmarking Optimization Software
with COPS 3.0.” Argonne National Laboratory Technical Report ANL/MCS-TM-273, February 2004.
See Also
More About
• “Symbolic Math Toolbox™ Calculates Gradients and Hessians” on page 6-94
• “Problem-Based Optimization Setup”
Problem-Based Nonlinear Minimization with Linear Constraints
The example “Minimization with Linear Equality Constraints, Trust-Region Reflective Algorithm” on
page 6-87 uses a solver-based approach involving the gradient and Hessian. Solving the same
problem using the problem-based approach is straightforward, but takes more solution time because
the problem-based approach currently does not use gradient or Hessian information.
The problem is to minimize

f(x) = ∑_{i=1}^{n−1} (x_i^2)^(x_{i+1}^2 + 1) + (x_{i+1}^2)^(x_i^2 + 1),
subject to a set of linear equality constraints Aeq*x = beq. Start by creating an optimization
problem and variables.
prob = optimproblem;
N = 1000;
x = optimvar('x',N);
The objective function is in the brownfgh.m file included in your Optimization Toolbox™ installation.
Convert the function to an optimization expression using fcn2optimexpr.
prob.Objective = fcn2optimexpr(@brownfgh,x,'OutputSize',[1,1]);
Include Constraints
To obtain the Aeq and beq matrices in your workspace, execute this command.
load browneq
show(prob.Objective)
brownfgh(x)
The problem has one hundred linear equality constraints, so the resulting constraint expression is too
lengthy to include in the example. To show the constraints, uncomment and run the following line.
% show(prob.Constraints)
x0.x = -ones(N,1);
x0.x(2:2:N) = 1;
[sol,fval,exitflag,output] = solve(prob,x0)
fval = 207.5463
exitflag =
SolverLimitExceeded
The solver stops prematurely because it exceeds the function evaluation limit. To continue the
optimization, restart the optimization from the final point, and allow for more function evaluations.
options = optimoptions(prob,'MaxFunctionEvaluations',1e5);
[sol,fval,exitflag,output] = solve(prob,sol,'Options',options)
fval = 205.9313
exitflag =
OptimalSolution
To solve the problem using the solver-based approach as shown in “Minimization with Linear Equality
Constraints, Trust-Region Reflective Algorithm” on page 6-87, convert the initial point to a vector.
Then set options to use the gradient and Hessian information provided in brownfgh.
xstart = x0.x;
fun = @brownfgh;
opts = optimoptions('fmincon','SpecifyObjectiveGradient',true,'HessianFcn','objective',...
'Algorithm','trust-region-reflective');
[x,fval,exitflag,output] = ...
fmincon(fun,xstart,[],[],Aeq,beq,[],[],[],opts);
Fval = 205.931
Number of iterations = 22
Number of function evals = 23.
The solver-based solution in “Minimization with Linear Equality Constraints, Trust-Region Reflective
Algorithm” on page 6-87 uses the gradients and Hessian provided in the objective function. By using
that derivative information, the solver fmincon converges to the solution in 22 iterations, using only
23 function evaluations. The solver-based solution has the same final objective function value as this
problem-based solution.
However, constructing the gradient and Hessian functions without using symbolic math is difficult
and prone to error. For an example showing how to use symbolic math to calculate derivatives, see
“Symbolic Math Toolbox™ Calculates Gradients and Hessians” on page 6-94.
See Also
fcn2optimexpr
More About
• “Minimization with Linear Equality Constraints, Trust-Region Reflective Algorithm” on page 6-
87
• “Problem-Based Optimization Setup”
• More robust results. Finite differencing steps sometimes reach points where the objective or a
nonlinear constraint function is undefined, not finite, or complex.
• Analytic gradients can be more accurate than finite difference estimates.
• Including a Hessian can lead to faster convergence, meaning fewer iterations.
• Analytic gradients can be faster to calculate than finite difference estimates, especially for
problems with a sparse structure. For complicated expressions, however, analytic gradients can be
slower to calculate.
Despite these advantages, the problem-based approach currently does not use derivative information.
To use derivatives in problem-based optimization, convert your problem using prob2struct, and
edit the resulting objective and constraint functions.
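The overall workflow can be sketched as follows. The option names are standard fmincon options; the editing step is manual.

```matlab
problem = prob2struct(prob);       % convert to a solver-based structure
% Edit the generated objective and constraint files so that they also
% return gradients, then tell the solver that gradients are available
problem.options = optimoptions(problem.options, ...
    'SpecifyObjectiveGradient', true, 'SpecifyConstraintGradient', true);
[xsol, fvalsol, exitflag, output] = fmincon(problem);
```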
Include the constraint that the sum of squares of x and y is no more than 4.
fun2 is not based on supported functions for optimization expressions; see “Supported Operations on
Optimization Variables and Expressions” on page 10-36. Therefore, to include fun2 in an
optimization problem, you must convert it to an optimization expression using fcn2optimexpr.
prob = optimproblem;
x = optimvar('x',2);
y = optimvar('y',2);
fun1 = 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
fun2 = @(x,y)-exp(-sum((x - y).^2))*exp(-exp(-y(1)))*sech(y(2));
Include Derivatives in Problem-Based Workflow
problem = prob2struct(prob);
During the conversion, prob2struct creates function files that represent the objective and
nonlinear constraint functions. By default, these functions have the names
'generatedObjective.m' and 'generatedConstraints.m'.
The solver-based approach has one control variable. Each optimization variable (x or y, in this
example) is a portion of the control variable.
You can find the mapping between optimization variables and the single control variable in the
generated objective function file, 'generatedObjective.m'. The beginning of the file contains
these lines of code:
%% Variable indices.
idx_x = [1 2];
idx_y = [3 4];
idx = varindex(prob);
disp(idx.x)
1 2
disp(idx.y)
3 4
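Given this mapping, a solver-based function can unpack the optimization variables from its single vector argument. For example (the point in is illustrative):

```matlab
in = [-1; 2; 0.5; 1.5];   % a point expressed in solver-based form
xval = in(idx.x);         % recovers x, here in(1:2)
yval = in(idx.y);         % recovers y, here in(3:4)
```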
Once you know the mapping, you can use standard calculus to find the following expressions for the
gradient grad of the objective function objective = fun1 + fun2 with respect to the control
variable [x(:);y(:)].
grad(1) = 2·x1 − 2 − 400·x1·(x2 − x1^2) + σ1
grad(2) = 200·(x2 − x1^2) + σ2
grad(3) = −σ1 − d·exp(−y1)
grad(4) = −σ2 + d·tanh(y2),

where

c = exp(−(x1 − y1)^2 − (x2 − y2)^2)
d = c·exp(−exp(−y1))·sech(y2)
σ1 = 2·(x1 − y1)·d
σ2 = 2·(x2 − y2)·d.
Recall that the nonlinear constraint is x(1)^2 + x(2)^2 + y(1)^2 + y(2)^2 <= 4. Clearly, the
gradient of this constraint function is 2*[x;y]. To include the calculated gradients of the nonlinear
constraint, edit 'generatedConstraints.m' as follows.
%% Insert gradient calculation here.
% If you include a gradient, notify the solver by setting the
% SpecifyConstraintGradient option to true.
if nargout > 2
cineqGrad = 2*[x;y];
ceqGrad = [];
end
xsolver =
0.8671
0.7505
1.0433
0.5140
fvalsolver =
-0.5500
exitflagsolver =
outputsolver =
iterations: 46
funcCount: 77
constrviolation: 0
stepsize: 7.4091e-06
algorithm: 'interior-point'
firstorderopt: 7.5203e-07
cgiterations: 9
message: '↵Local minimum found that satisfies the constraints.↵↵Optimization complete
Compare this solution with the one obtained from solve, which uses no derivative information.
init.x = x0(1:2);
init.y = x0(3:4);
[xproblem,fvalproblem,exitflagproblem,outputproblem] = solve(prob,init)
xproblem =
x: [2×1 double]
y: [2×1 double]
fvalproblem =
-0.5500
exitflagproblem =
OptimalSolution
outputproblem =
iterations: 48
funcCount: 276
constrviolation: 0
stepsize: 7.9340e-07
algorithm: 'interior-point'
firstorderopt: 6.5496e-07
cgiterations: 9
message: '↵Local minimum found that satisfies the constraints.↵↵Optimization complete
solver: 'fmincon'
Both solutions report the same function value to display precision, and both require roughly the same
number of iterations (46 using gradient information, 48 without). However, the solution using
gradients requires only 77 function evaluations, compared to 276 for the solution without gradients.
See Also
fcn2optimexpr | prob2struct | varindex
More About
• “Including Gradients and Hessians” on page 2-19
7-24
Output Function for Problem-Based Optimization
For the solver-based approach to this example, see “Output Functions” on page 3-32.
Plot functions have the same syntax as output functions, so this example also applies to plot
functions, too.
For both the solver-based approach and for the problem-based approach, write the output function as
if you are using the solver-based approach. In the solver-based approach, you use a single vector
variable, usually denoted x, instead of a collection of optimization variables of various sizes. So to
write an output function for the problem-based approach, you must understand the correspondence
between your optimization variables and the single solver-based x. To map between optimization
variables and x, use varindex. In this example, to avoid confusion with an optimization variable
named x, use "in" as the vector variable name.
Problem Description
x + y − xy ≥ 1 . 5
xy ≥ 10 .
Problem-Based Setup
To set up the problem in the problem-based approach, define optimization variables and an
optimization problem object.
x = optimvar('x');
y = optimvar('y');
prob = optimproblem;
Because this is a nonlinear problem, you must include an initial point structure x0. Use x0.x = –1
and x0.y = 1.
7-25
7 Nonlinear Problem-Based
x0.x = -1;
x0.y = 1;
Output Function
The outfun output function records a history of the points generated by fmincon during its
iterations. The output function also plots the points and keeps a separate history of the search
directions for the sqp algorithm. The search direction is a vector from the previous point to the next
point that fmincon tries. During its final step, the output function saves the history in workspace
variables, and saves a history of the objective function values at each iterative step.
For the required syntax of optimization output functions, see “Output Function Syntax” on page 15-
26.
An output function takes a single vector variable as an input. But the current problem has two
variables. To find the mapping between the optimization variables and the input variable, use
varindex.
idx = varindex(prob);
idx.x
ans = 1
idx.y
ans = 2
The mapping shows that x is variable 1 and y is variable 2. So, if the input variable is named in, then
x = in(1) and y = in(2).
type outfun
switch state
case 'init'
hold on
history = [];
fhistory = [];
searchdir = [];
case 'iter'
% Concatenate current point and objective function
% value with history. in must be a row vector.
fhistory = [fhistory; optimValues.fval];
history = [history; in(:)']; % Ensure in is a row vector
% Concatenate current search direction with
% searchdir.
searchdir = [searchdir;...
optimValues.searchdirection(:)'];
plot(in(idx.x),in(idx.y),'o');
% Label points with iteration number and add title.
% Add .15 to idx.x to separate label from plotted 'o'
text(in(idx.x)+.15,in(idx.y),...
num2str(optimValues.iteration));
title('Sequence of Points Computed by fmincon');
case 'done'
7-26
Output Function for Problem-Based Optimization
hold off
assignin('base','optimhistory',history);
assignin('base','searchdirhistory',searchdir);
assignin('base','functionhistory',fhistory);
otherwise
end
end
Include the output function in the optimization by setting the OutputFcn option. Also, set the
Algorithm option to use the 'sqp' algorithm instead of the default 'interior-point' algorithm.
Pass idx to the output function as an extra parameter in the last input. See “Passing Extra
Parameters” on page 2-57.
outputfn = @(in,optimValues,state)outfun(in,optimValues,state,idx);
opts = optimoptions('fmincon','Algorithm','sqp','OutputFcn',outputfn);
Run the optimization, including the output function, by using the 'Options' name-value pair
argument.
[sol,fval,eflag,output] = solve(prob,x0,'Options',opts)
7-27
7 Nonlinear Problem-Based
fval = 0.0236
eflag =
OptimalSolution
Examine the iteration history. Each row of the optimhistory matrix represents one point. The last
few points are very close, which explains why the plotted sequence shows overprinted numbers for
points 8, 9, and 10.
disp('Locations');disp(optimhistory)
Locations
-1.0000 1.0000
-1.3679 1.2500
-1.6509 1.1813
-3.5870 2.0537
-4.4574 2.2895
-5.8015 1.5531
-7.6498 1.1225
-8.5223 1.0572
-9.5463 1.0464
-9.5474 1.0474
-9.5474 1.0474
disp('Search Directions');disp(searchdirhistory)
Search Directions
0 0
-0.3679 0.2500
-0.2831 -0.0687
-1.9360 0.8725
-0.8704 0.2358
-1.3441 -0.7364
-2.0877 -0.6493
-0.8725 -0.0653
-1.0241 -0.0108
7-28
Output Function for Problem-Based Optimization
-0.0011 0.0010
0.0000 -0.0000
disp('Function Values');disp(functionhistory)
Function Values
1.8394
1.8513
1.7757
0.9839
0.6343
0.3250
0.0978
0.0517
0.0236
0.0236
0.0236
If your objective function or nonlinear constraint functions are not composed of elementary functions,
you must convert the functions to optimization expressions using fcn2optimexpr. For the present
example:
For the list of supported functions, see “Supported Operations on Optimization Variables and
Expressions” on page 10-36.
See Also
varindex
More About
• “Output Functions” on page 3-32
• “Problem-Based Optimization Setup”
7-29
7 Nonlinear Problem-Based
Problem Definition
2
(y + x2) + 0 . 1y2 ≤ 1
y ≤ exp( − x) − 3
y ≤ x − 4.
Problem-Based Solution
x = optimvar('x');
y = optimvar('y');
prob = optimproblem;
cons1 = (y + x^2)^2 + 0.1*y^2 <= 1;
cons2 = y <= exp(-x) - 3;
cons3 = y <= x - 4;
prob.Constraints.cons1 = cons1;
prob.Constraints.cons2 = cons2;
prob.Constraints.cons3 = cons3;
show(prob)
OptimizationProblem :
Solve for:
x, y
minimize :
subject to cons1:
((y + x.^2).^2 + (0.1 .* y.^2)) <= 1
subject to cons2:
y <= (exp(-x) - 3)
subject to cons3:
y - x <= -4
Create a pseudorandom start point structure x0 with fields x and y for the optimization variables.
rng default
x0.x = randn;
x0.y = randn;
[sol,~,exitflag,output] = solve(prob,x0)
7-30
Solve Nonlinear Feasibility Problem, Problem-Based
exitflag =
OptimalSolution
The solver can fail to find a solution when starting from some initial points. Set the initial point x0.x
= -1, x0.y = -4 and solve the problem starting from x0.
x0.x = -1;
x0.y = -4;
[sol2,~,exitflag2,output2] = solve(prob,x0)
fmincon stopped because the size of the current step is less than
the value of the step size tolerance but constraints are not
satisfied to within the value of the constraint tolerance.
exitflag2 =
NoFeasiblePointFound
7-31
7 Nonlinear Problem-Based
funcCount: 590
constrviolation: 1.4609
stepsize: 1.7329e-10
algorithm: 'interior-point'
firstorderopt: 0
cgiterations: 288
message: '...'
solver: 'fmincon'
inf1 = infeasibility(cons1,sol2)
inf1 = 1.1974
inf2 = infeasibility(cons2,sol2)
inf2 = 0
inf3 = infeasibility(cons3,sol2)
inf3 = 1.4609
Both cons1 and cons3 are infeasible at the solution sol2. The results highlight the importance of
using multiple start points to investigate and solve a feasibility problem.
Visualize Constraints
To visualize the constraints, plot the points where each constraint function is zero by using
fimplicit. The fimplicit function passes numeric values to its functions, whereas the evaluate
function requires a structure. To tie these functions together, use the evaluateExpr helper function,
which appears at the end of this example on page 7-0 . This function simply puts passed values into
a structure with the appropriate names.
Note: If you use the live script file for this example, the evaluateExpr function is already included
at the end of the file. Otherwise, you need to create this function at the end of your .m file or add it as
a file on the MATLAB® path.
Avoid a warning that occurs because the evaluateExpr function does not work on vectorized inputs.
s = warning('off','MATLAB:fplot:NotVectorized');
cc1 = (y + x^2)^2 + 0.1*y^2 - 1;
fimplicit(@(a,b)evaluateExpr(cc1,a,b),[-2 2 -4 2],'r')
hold on
cc2 = y - exp(-x) + 3;
fimplicit(@(a,b)evaluateExpr(cc2,a,b),[-2 2 -4 2],'k')
cc3 = y - x + 4;
fimplicit(@(x,y)evaluateExpr(cc3,x,y),[-2 2 -4 2],'b')
hold off
7-32
Solve Nonlinear Feasibility Problem, Problem-Based
warning(s);
The feasible region is inside the red outline and below the black and blue lines. The feasible region is
at the lower right of the red outline.
Helper Function
function p = evaluateExpr(expr,x,y)
pt.x = x;
pt.y = y;
p = evaluate(expr,pt);
end
See Also
“Problem-Based Optimization Workflow” on page 10-2
More About
• “Investigate Linear Infeasibilities” on page 9-139
• “Solve Feasibility Problem” (Global Optimization Toolbox)
7-33
8
In this section...
“Multiobjective Optimization Definition” on page 8-2
“Algorithms” on page 8-3
• fgoalattain addresses the problem of reducing a set of nonlinear functions Fi(x) below a set of
goals F*i. Since there are several functions Fi(x), it is not always clear what it means to solve this
problem, especially when you cannot achieve all the goals simultaneously. Therefore, the problem
is reformulated to one that is always well-defined.
The unscaled goal attainment problem is to minimize the maximum of Fi(x) – F*i.
There is a useful generalization of the unscaled problem. Given a set of positive weights wi, the
goal attainment problem tries to find x to minimize the maximum of
Fi(x) − Fi*
. (8-1)
wi
If you set all weights equal to 1 (or any other positive constant), the goal attainment problem is
the same as the unscaled goal attainment problem. If the F*i are positive, and you set all weights
as wi = F*i, the goal attainment problem becomes minimizing the relative difference between the
functions Fi(x) and the goals F*i.
In other words, the goal attainment problem is to minimize a slack variable γ, defined as the
maximum over i of the expressions in “Equation 8-1”. This implies the expression that is the
formal statement of the goal attainment problem:
minγ
x, γ
such that F(x) – w·γ ≤ F*, c(x) ≤ 0, ceq(x) = 0, A·x ≤ b, Aeq·x = beq, and l ≤ x ≤ u.
• fminimax addresses the problem of minimizing the maximum of a set of nonlinear functions,
subject to all types of constraints:
minmaxFi(x)
x i
Clearly, this problem is a special case of the unscaled goal attainment problem, with F*i = 0 and
wi = 1.
8-2
Multiobjective Optimization Algorithms
Algorithms
Goal Attainment Method
This section describes the goal attainment method of Gembicki [3]. This method uses a set of design
goals, F* = F1*, F2*, ..., Fm
* , associated with a set of objectives, F(x) = {F1(x),F2(x),...,Fm(x)}. The
problem formulation allows the objectives to be under- or overachieved, enabling the designer to be
relatively imprecise about the initial design goals. The relative degree of under- or overachievement
of the goals is controlled by a vector of weighting coefficients, w = {w1,w2,...,wm}, and is expressed as
a standard optimization problem using the formulation
minimize γ (8-2)
γ ∈ ℜ, x ∈ Ω
The term wiγ introduces an element of slackness into the problem, which otherwise imposes that the
goals be rigidly met. The weighting vector, w, enables the designer to express a measure of the
relative tradeoffs between the objectives. For instance, setting the weighting vector w equal to the
initial goals indicates that the same percentage under- or overachievement of the goals, F*, is
achieved. You can incorporate hard constraints into the design by setting a particular weighting
factor to zero (i.e., wi = 0). The goal attainment method provides a convenient intuitive interpretation
of the design problem, which is solvable using standard optimization procedures. Illustrative
examples of the use of the goal attainment method in control system design can be found in
Fleming ([10] and [11]).
The goal attainment method is represented geometrically in the figure below in two dimensions.
Specification of the goals, F1*, F2* , defines the goal point, P. The weighting vector defines the
direction of search from P to the feasible function space, Λ(γ). During the optimization γ is varied,
which changes the size of the feasible region. The constraint boundaries converge to the unique
solution point F1s, F2s.
The goal attainment method has the advantage that it can be posed as a nonlinear programming
problem. Characteristics of the problem can also be exploited in a nonlinear programming algorithm.
8-3
8 Multiobjective Algorithms and Examples
In sequential quadratic programming (SQP), the choice of merit function for the line search is not
easy because, in many cases, it is difficult to “define” the relative importance between improving the
objective function and reducing constraint violations. This has resulted in a number of different
schemes for constructing the merit function (see, for example, Schittkowski [36]). In goal attainment
programming there might be a more appropriate merit function, which you can achieve by posing
“Equation 8-2” as the minimax problem
minimize max Λi , (8-3)
x ∈ ℜn i
where
Fi(x) − Fi*
Λi = , i = 1, ..., m .
wi
Following the argument of Brayton et al. [1] for minimax optimization using SQP, using the merit
function of “Equation 6-44” for the goal attainment problem of “Equation 8-3” gives
m
ψ(x, γ) = γ + ∑ ri ⋅ max 0, Fi(x) − wiγ − Fi* . (8-4)
i=1
When the merit function of “Equation 8-4” is used as the basis of a line search procedure, then,
although ψ(x,γ) might decrease for a step in a given search direction, the function max Λi might
paradoxically increase. This is accepting a degradation in the worst case objective. Since the worst
case objective is responsible for the value of the objective function γ, this is accepting a step that
ultimately increases the objective function to be minimized. Conversely, ψ(x,γ) might increase when
max Λi decreases, implying a rejection of a step that improves the worst case objective.
Following the lines of Brayton et al. [1], a solution is therefore to set ψ(x) equal to the worst case
objective, i.e.,
A problem in the goal attainment method is that it is common to use a weighting coefficient equal to 0
to incorporate hard constraints. The merit function of “Equation 8-5” then becomes infinite for
arbitrary violations of the constraints.
To overcome this problem while still retaining the features of “Equation 8-5”, the merit function is
combined with that of “Equation 6-45”, giving the following:
Another feature that can be exploited in SQP is the objective function γ. From the KKT equations it
can be shown that the approximation to the Hessian of the Lagrangian, H, should have zeros in the
rows and columns associated with the variable γ. However, this property does not appear if H is
initialized as the identity matrix. H is therefore initialized and maintained to have zeros in the rows
and columns associated with γ.
These changes make the Hessian, H, indefinite. Therefore H is set to have zeros in the rows and
columns associated with γ, except for the diagonal element, which is set to a small positive number
8-4
Multiobjective Optimization Algorithms
(e.g., 1e-10). This allows use of the fast converging positive definite QP method described in
“Quadratic Programming Solution” on page 6-26.
The preceding modifications have been implemented in fgoalattain and have been found to make
the method more robust. However, because of the rapid convergence of the SQP method, the
requirement that the merit function strictly decrease sometimes requires more function evaluations
than an implementation of SQP using the merit function of “Equation 6-44”.
fminimax uses a goal attainment method. It takes goals of 0, and weights of 1. With this formulation,
the goal attainment problem becomes
f i(x) − goali
minmax = minmax f i(x),
i x weighti i x
Parenthetically, you might expect fminimax to turn the multiobjective function into a single
objective. The function
f(x) = max(F1(x),...Fj(x))
is a single objective function to minimize. However, it is not differentiable, and Optimization Toolbox
objectives are required to be smooth. Therefore the minimax problem is formulated as a smooth goal
attainment problem.
References
[1] Brayton, R. K., S. W. Director, G. D. Hachtel, and L.Vidigal, “A New Algorithm for Statistical Circuit
Design Based on Quasi-Newton Methods and Function Splitting,” IEEE Transactions on
Circuits and Systems, Vol. CAS-26, pp 784-794, Sept. 1979.
[2] Fleming, P.J. and A.P. Pashkevich, Computer Aided Control System Design Using a Multi-Objective
Optimisation Approach, Control 1985 Conference, Cambridge, UK, pp. 174-179.
[3] Gembicki, F.W., “Vector Optimization for Control with Performance and Parameter Sensitivity
Indices,” Ph.D. Dissertation, Case Western Reserve Univ., Cleveland, OH, 1974.
[4] Grace, A.C.W., “Computer-Aided Control System Design Using Optimization Techniques,” Ph.D.
Thesis, University of Wales, Bangor, Gwynedd, UK, 1989.
[5] Han, S.P., “A Globally Convergent Method For Nonlinear Programming,” Journal of Optimization
Theory and Applications, Vol. 22, p. 297, 1977.
[6] Madsen, K. and H. Schjaer-Jacobsen, “Algorithms for Worst Case Tolerance Optimization,” IEEE
Trans. of Circuits and Systems, Vol. CAS-26, Sept. 1979.
[7] Powell, M.J.D., “A Fast Algorithm for Nonlinear Constrained Optimization Calculations,” Numerical
Analysis, ed. G.A. Watson, Lecture Notes in Mathematics, Vol. 630, Springer Verlag, 1978.
8-5
8 Multiobjective Algorithms and Examples
For example, define fun(x) as three linear objective functions in two variables, and fun2 as the
maximum of these three objectives.
a = [1;1];
b = [-1;1];
c = [0;-1];
a0 = 2;
b0 = -3;
c0 = 4;
fun = @(x)[x*a+a0,x*b+b0,x*c+c0];
fun2 = @(x)max(fun(x),[],2);
8-6
Compare fminimax and fminunc
x0 = [0,0];
[xm,fvalm,maxfval] = fminimax(fun,x0)
fminimax stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
xm = 1×2
-2.5000 2.2500
fvalm = 1×3
maxfval = 1.7500
However, fminunc stops at a point that is far from the minimax point.
[xu,fvalu] = fminunc(fun2,x0)
xu = 1×2
0 1.0000
fvalu = 3.0000
fprintf("fminimax finds a point with objective %g,\nwhile fminunc finds a point with objective %g
See Also
fminimax
More About
• “Multiobjective Optimization”
8-7
8 Multiobjective Algorithms and Examples
The plant is an under-damped third-order model with actuator limits. The actuator limits are a
saturation limit and a slew rate limit. The actuator saturation limit cuts off input values greater than
2 units or less than –2 units. The slew rate limit of the actuator is 0.8 units/sec. The closed-loop
response of the system to a step input is shown in Closed-Loop Response on page 8-0 . You can see
this response by opening the model (type optsim at the command line or click the model name), and
selecting Run from the Simulation menu. The response plots to the scope.
Closed-Loop Response
The problem is to design a feedback control loop that tracks a unit step input to the system. The
closed-loop plant is entered in terms of the blocks where the plant and actuator have been placed in a
hierarchical Subsystem block. A Scope block displays output trajectories during the design process.
8-8
Using fminimax with a Simulink® Model
Closed-Loop Model
To optimize this system, minimize the maximum value of the output at any time t between 0 and 100.
(In contrast, in “lsqnonlin with a Simulink® Model” on page 12-17 the solution involves minimizing
the error between the output and the input signal.)
The code for this example is contained in the function runtrackmm at the end of this example. on
page 8-0 The objective function is simply the output yout returned by the sim command. But
minimizing the maximum output at all time steps might force the output to be far below unity for
some time steps. To keep the output above 0.95 after the first 20 seconds, the constraint function
trackmmcon contains the constraint yout >= 0.95 from t = 20 to t = 100. Because constraints
must be in the form g ≤ 0, the constraint in the function is g = -yout(20:100) + 0.95.
Both trackmmobj and trackmmcon use the result yout from sim, calculated from the current PID
values. To avoid calling the simulation twice, runtrackmm has nested functions so that the value of
yout is shared between the objective and constraint functions. The simulation is called only when the
current point changes.
Call runtrackmm.
[Kp,Ki,Kd] = runtrackmm
Kp = 0.5894
Ki = 0.0605
Kd = 5.5295
8-9
8 Multiobjective Algorithms and Examples
The last value in the Objective value column of the output shows that the maximum value for all the
time steps is just under 1. The closed loop response with this result is shown in the figure Closed-
Loop Response Using fminimax on page 8-0 .
This solution differs from the solution obtained in “lsqnonlin with a Simulink® Model” on page 12-
17 because you are solving different problem formulations.
function F = trackmmobj(pid)
% Track the output of optsim to a signal of 1.
% Variables a1 and a2 are shared with RUNTRACKMM.
% Variable yout is shared with RUNTRACKMM and
% RUNTRACKMMCON.
8-10
Using fminimax with a Simulink® Model
updateIfNeeded(pid)
F = yout;
end
c = -yout(20:100)+.95;
ceq=[];
end
function updateIfNeeded(pid)
if ~isequal(pid,pold) % compute only if needed
Kp = pid(1);
Ki = pid(2);
Kd = pid(3);
pold = pid;
end
end
end
See Also
fminimax
More About
• “Multiobjective Optimization”
8-11
8 Multiobjective Algorithms and Examples
2M
H(f ) = ∑ h(n)e− j2πf n
n=0
where A(f) is the magnitude of the frequency response. One solution is to apply a goal attainment
method to the magnitude of the frequency response. Given a function that computes the magnitude,
fgoalattain will attempt to vary the magnitude coefficients a(n) until the magnitude response
matches the desired response within some tolerance. The function that computes the magnitude
response is given in filtmin.m. This function uses a, the magnitude function coefficients, and w, the
discretization of the frequency domain of interest.
To set up a goal attainment problem, you must specify the goal and weights for the problem. For
frequencies between 0 and 0.1, the goal is one. For frequencies between 0.15 and 0.5, the goal is
zero. Frequencies between 0.1 and 0.15 are not specified, so no goals or weights are needed in this
range.
This information is stored in the variable goal passed to fgoalattain. The length of goal is the
same as the length returned by the function filtmin. So that the goals are equally satisfied, usually
weight would be set to abs(goal). However, since some of the goals are zero, the effect of using
weight=abs(goal) will force the objectives with weight 0 to be satisfied as hard constraints, and
the objectives with weight 1 possibly to be underattained (see “Goal Attainment Method” on page 8-
3). Because all the goals are close in magnitude, using a weight of unity for all goals will give them
equal priority. (Using abs(goal) for the weights is more important when the magnitude of goal
differs more significantly.) Also, setting
options = optimoptions('fgoalattain','EqualityGoalCount',length(goal));
specifies that each objective should be as near as possible to its goal value (neither greater nor less
than).
8-12
Signal Processing Using fgoalattain
w = linspace(0,0.5,incr);
y0 = filtmin(a0,w);
clf, plot(w,y0,'-.b');
drawnow;
% Call fgoalattain
options = optimoptions('fgoalattain','EqualityGoalCount',length(goal));
[a,fval,attainfactor,exitflag]=fgoalattain(@(x)filtmin(x,w0),...
a0,goal,weight,[],[],[],[],[],[],[],options);
Compare the magnitude response computed with the initial coefficients and the final coefficients
(“Magnitude Response with Initial and Final Magnitude Coefficients” on page 8-14). Note that you
could use the firpm function in Signal Processing Toolbox™ software to design this filter.
8-13
8 Multiobjective Algorithms and Examples
See Also
fgoalattain
More About
• “Multi-Objective Goal Attainment Optimization” on page 8-18
• “Minimax Optimization” on page 8-24
8-14
Generate and Plot a Pareto Front
The two objectives in this example are shifted and scaled versions of the convex function 1 + x2.
function f = simple_mult(x)
f(:,1) = sqrt(1+x.^2);
f(:,2) = 4 + 2*sqrt(1+(x-1).^2);
Both components are increasing as x decreases below 0 or increases above 1. In between 0 and 1,
f1(x) is increasing and f2(x) is decreasing, so there is a tradeoff region.
t = linspace(-0.5,1.5);
F = simple_mult(t);
plot(t,F,'LineWidth',2)
hold on
plot([0,0],[0,8],'g--');
plot([1,1],[0,8],'g--');
plot([0,1],[1,6],'k.','MarkerSize',15);
text(-0.25,1.5,'Minimum(f_1(x))')
text(.75,5.5,'Minimum(f_2(x))')
hold off
legend('f_1(x)','f_2(x)')
xlabel({'x';'Tradeoff region between the green lines'})
8-15
8 Multiobjective Algorithms and Examples
To find the Pareto front, first find the unconstrained minima of the two functions. In this case, you can
see by inspection that the minimum of f1(x) is 1, and the minimum of f2(x) is 6, but in general you
might need to use an optimization routine.
In general, write a function that returns a particular component of the multiobjective function.
function z = pickindex(x,k)
z = simple_mult(x); % evaluate both objectives
z = z(k); % return objective k
Then find the minimum of each component using an optimization solver. You can use fminbnd in this
case, or fminunc for higher-dimensional problems.
k = 1;
[min1,minfn1] = fminbnd(@(x)pickindex(x,k),-1,2);
k = 2;
[min2,minfn2] = fminbnd(@(x)pickindex(x,k),-1,2);
Set goals that are the unconstrained optima for each component. You can simultaneously achieve
these goals only if the multiobjective functions do not interfere with each other, meaning there is no
tradeoff.
goal = [minfn1,minfn2];
To calculate the Pareto front, take weight vectors [a,1–a] for a from 0 through 1. Solve the goal
attainment problem, setting the weights to the various values.
figure
plot(f(:,1),f(:,2),'k.');
xlabel('f_1')
ylabel('f_2')
8-16
Generate and Plot a Pareto Front
See Also
fgoalattain
More About
• “Multi-Objective Goal Attainment Optimization” on page 8-18
8-17
8 Multiobjective Algorithms and Examples
Consider a 2-input 2-output unstable plant. The equation describing the evolution of the system x(t) is
dx
= Ax(t) + Bu(t),
dt
where u(t) is the input (control) signal. The output of the system is
y(t) = Cx(t) .
A = [ -0.5 0 0; 0 -2 10; 0 1 -2 ];
B = [ 1 0; -2 2; 0 1 ];
C = [ 1 0 0; 0 0 1 ];
Optimization Objective
Suppose that the control signal u(t) is set as proportional to the output y(t):
u(t) = Ky(t)
dx
= Ax(t) + BKCx(t) = (A + BKC)x(t) .
dt
The object of the optimization is to design K to have the following two properties:
1. The real parts of the eigenvalues of (A + BKC) are smaller than [–5, –3, –1]. (This is called pole
placement in the control literature.)
Set the weights equal to the goals to ensure same percentage under- or over-attainment in the goals.
weight = abs(goal);
K0 = [ -1 -1; -1 -1];
lb = repmat(-4,size(K0))
8-18
Multi-Objective Goal Attainment Optimization
lb = 2×2
-4 -4
-4 -4
ub = repmat(4,size(K0))
ub = 2×2
4 4
4 4
options = optimoptions('fgoalattain','Display','iter');
Create a vector-valued function eigfun that returns the eigenvalues of the closed loop system. This
function requires additional parameters (namely, the matrices A, B, and C); the most convenient way
to pass these is through an anonymous function:
[K,~,attainfactor] = ...
fgoalattain(eigfun,K0,goal,weight,[],[],[],[],lb,ub,[],options);
fgoalattain stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
K = 2×2
-4.0000 -0.2564
8-19
8 Multiobjective Algorithms and Examples
-4.0000 -4.0000
The eigenvalues of the closed loop system are in eigfun(K) as follows: (they are also held in output
fval)
eigfun(K)
ans = 3×1
-6.9313
-4.1588
-1.4099
The attainment factor indicates the level of goal achievement. A negative attainment factor indicates
over-achievement, positive indicates under-achievement. The value attainfactor we obtained in this
run indicates that the objectives have been over-achieved by almost 40 percent:
attainfactor
attainfactor = -0.3863
Here is how the system x(t) evolves from time 0 to time 4, using the calculated feedback matrix K ,
starting from the point x(0) = [1;1;1].
plot(Times,xvals)
legend('x_1(t)','x_2(t)','x_3(t)','Location','best')
xlabel('t');
ylabel('x(t)');
8-20
Multi-Objective Goal Attainment Optimization
Suppose we now require the eigenvalues to be as near as possible to the goal values, [–5, –3, –1]. Set
options.EqualityGoalCount to the number of objectives that should be as near as possible to the
goals (i.e., do not try to over-achieve):
8-21
8 Multiobjective Algorithms and Examples
fgoalattain stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
K = 2×2
-1.5954 1.2040
-0.4201 -2.9046
This time the eigenvalues of the closed loop system, which are also held in output fval, are as follows:
eigfun(K)
ans = 3×1
-5.0000
-3.0000
-1.0000
The attainment factor is the level of goal achievement. A negative attainment factor indicates over-
achievement, positive indicates under-achievement. The low attainfactor obtained indicates that the
eigenvalues have almost exactly met the goals:
attainfactor
attainfactor = -1.3288e-21
Here is how the system x(t) evolves from time 0 to time 4, using the new calculated feedback matrix
K , starting from the point x(0) = [1;1;1].
plot(Times,xvals)
legend('x_1(t)','x_2(t)','x_3(t)','Location','best')
xlabel('t');
ylabel('x(t)');
8-22
Multi-Objective Goal Attainment Optimization
See Also
fgoalattain
More About
• “lsqnonlin with a Simulink® Model” on page 12-17
8-23
8 Multiobjective Algorithms and Examples
Minimax Optimization
This example shows how to solve a nonlinear filter design problem using a minimax optimization
algorithm, fminimax, in Optimization Toolbox™. Note that to run this example you must have the
Signal Processing Toolbox™ installed.
Consider an example for the design of finite precision filters. For this, you need to specify not only the
filter design parameters such as the cut-off frequency and number of coefficients, but also how many
bits are available since the design is in finite precision.
This is a continuous filter design; we use cheby1, but we could also use ellip, yulewalk or remez
here:
[b1,a1] = cheby1(n-1,Rp,Wn);
8-24
Minimax Optimization
Scale Coefficients
Set the biggest value equal to maxbin and scale other filter coefficients appropriately.
xmask = 1:2*n;
8-25
8 Multiobjective Algorithms and Examples
% Remove the biggest value and the element that controls D.C. Gain
% from the list of values that can be changed.
xmask(mix) = [];
nx = 2*n;
Using optimoptions, adjust the termination criteria to reasonably high values to promote short
running times. Also turn on the display of results at each iteration:
if length(w) == 1
options = optimoptions(options,'AbsoluteMaxObjectiveCount',w);
else
options = optimoptions(options,'AbsoluteMaxObjectiveCount',length(w));
end
Discretize and eliminate first value and perform optimization by calling FMINIMAX:
x = 1×8
xmask = 1×6
1 2 3 4 5 8
niters = length(xmask);
disp(sprintf('Performing %g stages of optimization.\n\n', niters));
for m = 1:niters
fun = @(xfree)filtobj(xfree,x,xmask,n,h,maxbin); % objective
confun = @(xfree)filtcon(xfree,x,xmask,n,h,maxbin); % nonlinear constraint
disp(sprintf('Stage: %g \n', m));
x(xmask) = fminimax(fun,x(xmask),[],[],[],[],vlb(xmask),vub(xmask),...
confun,options);
[x, xmask] = elimone(x, xmask, h, w, n, maxbin);
end
Stage: 1
fminimax stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
Stage: 2
fminimax stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
Stage: 3
fminimax stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
Stage: 4
fminimax stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
Stage: 5
fminimax stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
Stage: 6
fminimax stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
xold = x;
xmask = 1:2*n;
xmask([n+1, mix]) = [];
x = x + 0.5;
for i = xmask
[x, xmask] = elimone(x, xmask, h, w, n, maxbin);
end
xmask = 1:2*n;
xmask([n+1, mix]) = [];
x = x - 0.5;
for i = xmask
[x, xmask] = elimone(x, xmask, h, w, n, maxbin);
end
if any(abs(x) > maxbin)
x = xold;
end
First, plot the frequency response of the filter and compare it to a filter whose coefficients are simply rounded up or down:
subplot(211)
bo = x(1:n);
ao = x(n+1:2*n);
h2 = abs(freqz(bo,ao,128));
plot(w,h,w,h2,'o')
xround = round(xorig)
xround = 1×8
b = xround(1:n);
a = xround(n+1:2*n);
h3 = abs(freqz(b,a,128));
subplot(212)
plot(w,h,w,h3,'+')
title('Rounded filter versus original')
fig = gcf;
fig.NextPlot = 'replace';
See Also
fminimax
More About
• “lsqnonlin with a Simulink® Model” on page 12-17
9 Linear Programming and Mixed-Integer Linear Programming
The linprog algorithms solve linear programming problems of the form

min_x fᵀx such that
A·x ≤ b
Aeq·x = beq
l ≤ x ≤ u.
Presolve
The algorithm first tries to simplify the problem by removing redundancies and simplifying
constraints. The tasks performed during the presolve step can include the following:
• Check if any variables have equal upper and lower bounds. If so, check for feasibility, and then fix
and remove the variables.
• Check if any linear inequality constraint involves only one variable. If so, check for feasibility, and
then change the linear constraint to a bound.
• Check if any linear equality constraint involves only one variable. If so, check for feasibility, and
then fix and remove the variable.
• Check if any linear constraint matrix has zero rows. If so, check for feasibility, and then delete the
rows.
• Determine if the bounds and linear constraints are consistent.
Linear Programming Algorithms
• Check if any variables appear only as linear terms in the objective function and do not appear in
any linear constraint. If so, check for feasibility and boundedness, and then fix the variables at
their appropriate bounds.
• Change any linear inequality constraints to linear equality constraints by adding slack variables.
If the algorithm detects an infeasible or unbounded problem, it halts and issues an appropriate exit
message.
The algorithm might arrive at a single feasible point, which represents the solution.
If the algorithm does not detect an infeasible or unbounded problem in the presolve step, and if the
presolve has not produced the solution, the algorithm continues to its next steps. After reaching a
stopping criterion, the algorithm reconstructs the original problem, undoing any presolve
transformations. This final step is the postsolve step.
For simplicity, if the problem is not solved in the presolve step, the algorithm shifts all finite lower
bounds to zero.
To set the initial point, x0, the algorithm does the following.
1 Initialize x0 to ones(n,1), where n is the number of elements of the objective function vector f.
2 Convert all bounded components to have a lower bound of 0. If component i has a finite upper
bound u(i), then x0(i) = u(i)/2.
3 For components that have only one bound, modify the component if necessary to lie strictly
inside the bound.
4 To put x0 close to the central path, take one predictor-corrector step, and then modify the
resulting position and slack variables to lie well within any bounds. For details of the central
path, see Nocedal and Wright [7], page 397.
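The steps above can be sketched in MATLAB as follows. This is an illustration only, not the linprog implementation: step 4's predictor-corrector adjustment is omitted, and the bound handling is simplified under the assumption that presolve has already shifted finite lower bounds to zero.

```matlab
% Illustrative sketch of the initial-point heuristic (steps 1-3 only).
% f, lb, ub are the objective vector and bounds after presolve; finite
% lower bounds are assumed shifted to 0, so lb is 0 or -Inf componentwise.
n  = numel(f);
x0 = ones(n,1);                      % step 1: start from all ones
ubFinite = isfinite(ub);
x0(ubFinite) = ub(ubFinite)/2;       % step 2: midpoint of [0, u(i)]
% step 3: push components with only a lower bound strictly inside it
lowOnly = isfinite(lb) & ~isfinite(ub);
x0(lowOnly) = max(x0(lowOnly), lb(lowOnly) + 1);
```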
Predictor-Corrector
After presolve, the predictor-corrector algorithm operates on a problem in the standard form

min_x fᵀx subject to
A·x = b
x + t = u
x, t ≥ 0.
• Assume for now that all variables have at least one finite bound. By shifting and negating
components, if necessary, this assumption means that all x components have a lower bound of 0.
• Ā is the extended linear matrix that includes both linear inequalities and linear equalities, and b̄ is the corresponding linear equality vector. Ā also includes terms for extending the vector x with slack variables s that turn the inequality constraints into equality constraints:

Ā·x̄ = [Aeq 0; A I]·[x; s] = [beq; b],
The Lagrangian is

L = fᵀx − yᵀ(A·x − b) − vᵀx − wᵀ(u − x − t).
The linprog algorithm uses a different sign convention for the returned Lagrange multipliers than
this discussion gives. This discussion uses the same sign as most literature. See lambda.
The algorithm first predicts a step from the Newton-Raphson formula, and then computes a corrector
step. The corrector attempts to reduce the residual in the nonlinear complementarity equations
x_i·v_i = 0 and t_i·w_i = 0. The Newton-Raphson step is

[ 0  −Aᵀ   0  −I   I ] [Δx]      [ f − Aᵀy − v + w ]      [ rd  ]
[ A   0    0   0   0 ] [Δy]      [ A·x − b         ]      [ rp  ]
[−I   0   −I   0   0 ]·[Δt] = − [ u − x − t       ] = − [ rub ],   (9-1)
[ V   0    0   X   0 ] [Δv]      [ VXe             ]      [ rvx ]
[ 0   0    W   0   T ] [Δw]      [ WTe             ]      [ rwt ]

where X, V, W, and T are diagonal matrices corresponding to the vectors x, v, w, and t respectively, and e is the vector of ones.
The residual vectors on the far right side of the equation give the reported infeasibility measures:

Primal infeasibility = ‖rp‖₁ + ‖rub‖₁
Dual infeasibility = ‖rd‖∞.
The algorithm reduces Equation 9-1 to the smaller symmetric system

[−D  Aᵀ] [Δx]      [ R  ]
[ A   0 ]·[Δy] = − [ rp ],   (9-2)
where
D = X⁻¹V + T⁻¹W
R = −rd − X⁻¹rvx + T⁻¹rwt + T⁻¹W·rub.
All the matrix inverses in the definitions of D and R are simple to compute because the matrices are
diagonal.
To derive “Equation 9-2” from “Equation 9-1”, notice that the second row of “Equation 9-2” is the
same as the second matrix row of “Equation 9-1”. The first row of “Equation 9-2” comes from solving
the last two rows of “Equation 9-1” for Δv and Δw, and then solving for Δt.
“Equation 9-2” is symmetric, but it is not positive definite because of the –D term. Therefore, you
cannot solve it using a Cholesky factorization. A few more steps lead to a different equation that is
positive definite, and hence can be solved efficiently by Cholesky factorization.
The second row of Equation 9-2 is A·Δx = −rp. Substituting

Δx = D⁻¹AᵀΔy + D⁻¹R

gives

A·D⁻¹Aᵀ·Δy = −A·D⁻¹R − rp.   (9-3)
Usually, the most efficient way to find the Newton step is to solve “Equation 9-3” for Δy using
Cholesky factorization. Cholesky factorization is possible because the matrix multiplying Δy is
obviously symmetric and, in the absence of degeneracies, is positive definite. Afterward, to find the
Newton step, back substitute to find Δx, Δt, Δv, and Δw. However, when A has a dense column, it can
be more efficient to solve “Equation 9-2” instead. The linprog interior-point algorithm chooses the
solution algorithm based on the density of columns.
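The normal-equations solve of Equation 9-3 by Cholesky factorization can be sketched as follows. All variable names are illustrative; this is not the linprog implementation, and it assumes strictly positive current iterates x, v, t, w and the residuals rd, rp, rub, rvx, rwt from Equation 9-1:

```matlab
% Illustrative solve of A*inv(D)*A'*dy = -A*inv(D)*R - rp by Cholesky.
% D = inv(X)*V + inv(T)*W is diagonal, so form only its diagonal d.
d    = v./x + w./t;                   % diagonal of D
R    = -rd - rvx./x + rwt./t + (w.*rub)./t;   % per Equation 9-2
M    = A*spdiags(1./d,0,numel(d),numel(d))*A';  % symmetric, generically SPD
L    = chol(M,'lower');               % Cholesky factorization M = L*L'
dy   = L' \ (L \ (-A*(R./d) - rp));   % solve the normal equations for dy
dx   = (A'*dy + R)./d;                % back-substitute for dx
```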
After calculating the corrected Newton step, the algorithm performs more calculations to get both a
longer current step, and to prepare for better subsequent steps. These multiple correction
calculations can improve both performance and robustness. For details, see Gondzio [4].
The predictor-corrector algorithm is largely the same as the full quadprog 'interior-point-
convex' version, except for the quadratic terms. See “Full Predictor-Corrector” on page 11-5.
Stopping Conditions
The predictor-corrector algorithm iterates until it reaches a point that is feasible (satisfies the
constraints to within tolerances) and where the relative step sizes are small. Specifically, define
ρ = max(1, ‖A‖, ‖f‖, ‖b‖),

and the algorithm stops when the residuals satisfy

‖rp‖₁ + ‖rub‖₁ ≤ ρ·TolCon
‖rd‖∞ ≤ ρ·TolFun
rc ≤ TolFun,
where
rc essentially measures the size of the complementarity residuals xv and tw, which are each vectors of
zeros at a solution.
Interior-Point-Legacy Algorithm
The default interior-point-legacy method is based on LIPSOL ([52]), which is a variant of Mehrotra's
predictor-corrector algorithm ([47]), a primal-dual interior-point method.
Main Algorithm
The algorithm begins by applying a series of preprocessing steps (see “Preprocessing” on page 9-
8). After preprocessing, the problem has the form
min_x fᵀx such that A·x = b, 0 ≤ x ≤ u.   (9-4)
The upper bounds constraints are implicitly included in the constraint matrix A. With the addition of
primal slack variables s, “Equation 9-4” becomes
min_x fᵀx such that A·x = b, x + s = u, x ≥ 0, s ≥ 0,   (9-5)
which is referred to as the primal problem: x consists of the primal variables and s consists of the
primal slack variables. The dual problem is
max_{y,w} bᵀy − uᵀw such that Aᵀ·y − w + z = f, z ≥ 0, w ≥ 0,   (9-6)
where y and w consist of the dual variables and z consists of the dual slacks. The optimality
conditions for this linear program, i.e., the primal “Equation 9-5” and the dual “Equation 9-6”, are
F(x, y, z, s, w) = [A·x − b; x + s − u; Aᵀ·y − w + z − f; x_i z_i; s_i w_i] = 0,   (9-7)
x ≥ 0, z ≥ 0, s ≥ 0, w ≥ 0,
The linprog algorithm uses a different sign convention for the returned Lagrange multipliers than
this discussion gives. This discussion uses the same sign as most literature. See lambda.
The quadratic equations xizi = 0 and siwi = 0 are called the complementarity conditions for the linear
program; the other (linear) equations are called the feasibility conditions. The quantity
xTz + sTw
is the duality gap, which measures the residual of the complementarity portion of F when
(x,z,s,w) ≥ 0.
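As a concrete sketch, the optimality residual F of Equation 9-7 and the duality gap can be evaluated as follows. The variable names are illustrative, assuming a current iterate (x, y, z, s, w) for the primal-dual pair in Equations 9-5 and 9-6; this is not solver-internal code:

```matlab
% Evaluate F(x,y,z,s,w) from Equation 9-7 and the duality gap.
F = [A*x - b;            % primal feasibility
     x + s - u;          % upper-bound feasibility
     A'*y - w + z - f;   % dual feasibility
     x.*z;               % complementarity x_i z_i
     s.*w];              % complementarity s_i w_i
dualityGap = x'*z + s'*w;   % zero at a primal-dual solution
```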
The algorithm is a primal-dual algorithm, meaning that both the primal and the dual programs are
solved simultaneously. It can be considered a Newton-like method, applied to the linear-quadratic
system F(x,y,z,s,w) = 0 in “Equation 9-7”, while at the same time keeping the iterates x, z, w, and s
positive, thus the name interior-point method. (The iterates are in the strictly interior region
represented by the inequality constraints in “Equation 9-5”.)
The algorithm computes a predictor step Δvp and a corrector step Δvc. The corrector step perturbs the complementarity equations by a multiple μe, where μ > 0 is called the centering parameter and must be chosen carefully, and e is a zero-one vector with the ones corresponding to the quadratic equations in F(v); that is, the perturbations are applied only to the complementarity conditions, which are all quadratic, and not to the feasibility conditions, which are all linear. The two directions are combined with a step length parameter α > 0 to update v and obtain the new iterate v+:

v+ = v + α(Δvp + Δvc),

where the step length parameter α is chosen so that

v+ = [x+; y+; z+; s+; w+]

satisfies

[x+; z+; s+; w+] > 0.
In solving for the preceding predictor/corrector directions, the algorithm computes a (sparse) direct
factorization on a modification of the Cholesky factors of A·AT. If A has dense columns, it instead uses
the Sherman-Morrison formula. If that solution is not adequate (the residual is too large), it performs
an LDL factorization of an augmented system form of the step equations to find a solution. (See
Example 4 — The Structure of D (MATLAB) in the MATLAB ldl function reference page.)
The algorithm then loops until the iterates converge. The main stopping criterion is a standard one:

max( ‖rb‖/max(1, ‖b‖), ‖rf‖/max(1, ‖f‖), ‖ru‖/max(1, ‖u‖), |fᵀx − bᵀy + uᵀw|/max(1, |fᵀx|, |bᵀy − uᵀw|) ) ≤ tol,
where
rb = A·x − b
rf = Aᵀ·y − w + z − f
ru = x + s − u
are the primal residual, dual residual, and upper-bound feasibility respectively ({x} means those x
with finite upper bounds), and
fᵀx − bᵀy + uᵀw
is the difference between the primal and dual objective values, and tol is some tolerance. The sum in
the stopping criteria measures the total relative errors in the optimality conditions in “Equation 9-7”.
The measure of primal infeasibility is ||rb||, and the measure of dual infeasibility is ||rf||, where the
norm is the Euclidean norm.
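The relative-error test above can be sketched in MATLAB as follows. This is an illustration, not the solver's internal code; the variables are a current iterate (x, y, z, s, w), and tol stands for the convergence tolerance:

```matlab
% Illustrative evaluation of the interior-point-legacy stopping test.
rb  = A*x - b;                    % primal residual
rf  = A'*y - w + z - f;           % dual residual
ru  = x + s - u;                  % upper-bound residual
gap = f'*x - b'*y + u'*w;         % primal objective minus dual objective
relErr = max([norm(rb)/max(1,norm(b)), ...
              norm(rf)/max(1,norm(f)), ...
              norm(ru)/max(1,norm(u)), ...
              abs(gap)/max([1, abs(f'*x), abs(b'*y - u'*w)])]);
converged = relErr <= tol;        % tol: convergence tolerance (assumed name)
```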
Preprocessing
The algorithm first tries to simplify the problem by removing redundancies and simplifying
constraints. The tasks performed during the presolve step can include the following:
• Check if any variables have equal upper and lower bounds. If so, check for feasibility, and then fix
and remove the variables.
• Check if any linear inequality constraint involves only one variable. If so, check for feasibility, and
then change the linear constraint to a bound.
• Check if any linear equality constraint involves only one variable. If so, check for feasibility, and
then fix and remove the variable.
• Check if any linear constraint matrix has zero rows. If so, check for feasibility, and then delete the
rows.
• Determine if the bounds and linear constraints are consistent.
• Check if any variables appear only as linear terms in the objective function and do not appear in
any linear constraint. If so, check for feasibility and boundedness, and then fix the variables at
their appropriate bounds.
• Change any linear inequality constraints to linear equality constraints by adding slack variables.
If the algorithm detects an infeasible or unbounded problem, it halts and issues an appropriate exit
message.
The algorithm might arrive at a single feasible point, which represents the solution.
If the algorithm does not detect an infeasible or unbounded problem in the presolve step, and if the
presolve has not produced the solution, the algorithm continues to its next steps. After reaching a
stopping criterion, the algorithm reconstructs the original problem, undoing any presolve
transformations. This final step is the postsolve step.
While these preprocessing steps can do much to speed up the iterative part of the algorithm, the
preprocessing steps must be undone if the Lagrange multipliers are required, because the multipliers
calculated during the algorithm are for the transformed problem, not the original. Thus, if you do
not request the multipliers, the algorithm skips this reverse transformation, which can save some
computation time.
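In practice, the multipliers are computed only when you request the fifth output of linprog, as in this sketch (the inputs f, A, b, Aeq, beq, lb, ub are placeholders for your problem data):

```matlab
% Requesting the lambda output makes linprog transform the multipliers
% back to the original (pre-presolve) problem, at a small extra cost.
[x,fval,exitflag,output,lambda] = linprog(f,A,b,Aeq,beq,lb,ub);
disp(lambda.ineqlin)   % multipliers for A*x <= b
disp(lambda.eqlin)     % multipliers for Aeq*x = beq
disp(lambda.lower)     % multipliers for lower bounds lb
disp(lambda.upper)     % multipliers for upper bounds ub
```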
Dual-Simplex Algorithm
At a high level, the linprog 'dual-simplex' algorithm essentially performs a simplex algorithm
on the dual problem.
The algorithm begins by preprocessing as described in “Preprocessing” on page 9-8. For details, see
Andersen and Andersen [1] and Nocedal and Wright [7], Chapter 13. This preprocessing reduces the
original linear programming problem to the form of “Equation 9-4”:
min_x fᵀx such that A·x = b, 0 ≤ x ≤ u.
A and b are transformed versions of the original constraint matrices. This is the primal problem.
Using the componentwise positive-part operator

x⁺ = x if x > 0, 0 if x ≤ 0,

the measure of primal infeasibility is

Primal infeasibility = ‖(lb − x)⁺‖² + ‖(x − ub)⁺‖² + ‖(A·x − b)⁺‖² + ‖Aeq·x − beq‖².
As explained in “Equation 9-6”, the dual problem is to find vectors y and w, and a slack variable
vector z that solve
max_{y,w} bᵀy − uᵀw such that Aᵀ·y − w + z = f, z ≥ 0, w ≥ 0.
The linprog algorithm uses a different sign convention for the returned Lagrange multipliers than
this discussion gives. This discussion uses the same sign as most literature. See lambda.
Dual infeasibility = ‖Aᵀ·y + z − w − f‖₂.
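These two infeasibility measures can be sketched directly from their definitions (illustrative code with placeholder variables; max(·,0) plays the role of the positive-part operator):

```matlab
% Illustrative computation of the dual-simplex infeasibility measures.
pos = @(v) max(v,0);   % componentwise positive part
primalInfeas = norm(pos(lb - x))^2 + norm(pos(x - ub))^2 + ...
               norm(pos(A*x - b))^2 + norm(Aeq*x - beq)^2;
dualInfeas   = norm(A'*y + z - w - f);   % Euclidean norm
```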
It is well known (for example, see [7]) that any finite solution of the dual problem gives a solution to
the primal problem, and any finite solution of the primal problem gives a solution of the dual problem.
Furthermore, if either the primal or dual problem is unbounded, then the other problem is infeasible.
And if either the primal or dual problem is infeasible, then the other problem is either infeasible or
unbounded. Therefore, the two problems are equivalent in terms of obtaining a finite solution, if one
exists. Because the primal and dual problems are mathematically equivalent, but the computational
steps differ, it can be better to solve the primal problem by solving the dual problem.
To help alleviate degeneracy (see Nocedal and Wright [7], page 366), the dual simplex algorithm
begins by perturbing the objective function.
Phase 1 of the dual simplex algorithm is to find a dual feasible point. The algorithm does this by
solving an auxiliary linear programming problem.
Phase 1 Outline
In phase 1, the algorithm finds an initial basic feasible solution (see “Basic and Nonbasic Variables”
on page 9-11 for a definition) by solving an auxiliary piecewise linear programming problem. The
objective function of the auxiliary problem is the linear penalty function

P = Σj Pj(xj),

where

Pj(xj) = xj − uj if xj > uj
Pj(xj) = 0 if lj ≤ xj ≤ uj
Pj(xj) = lj − xj if lj > xj.
P(x) measures how much a point x violates the lower and upper bound conditions. The auxiliary
problem is
min_x Σj Pj(xj) subject to
A·x ≤ b
Aeq·x = beq.
The original problem has a feasible basis point if and only if the auxiliary problem has minimum value
0.
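The penalty function P(x) above is easy to state in code, which also makes clear that it is zero exactly at bound-feasible points (a sketch, not solver code):

```matlab
% Phase 1 penalty: total violation of the bounds l <= x <= u.
% Assumes l <= u componentwise, so at most one max() term is positive.
Pj = @(x,l,u) max(x - u, 0) + max(l - x, 0);  % per-component penalty
P  = @(x,l,u) sum(Pj(x,l,u));                 % P = sum_j Pj(xj)
```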
The algorithm finds an initial point for the auxiliary problem by a heuristic method that adds slack
and artificial variables as necessary. The algorithm then uses this initial point together with the
simplex algorithm to solve the auxiliary problem. The solution is the initial point for phase 2 of the
main algorithm.
During Phase 2, the solver repeatedly chooses an entering variable and a leaving variable. The
algorithm chooses a leaving variable according to a technique suggested by Forrest and Goldfarb [3]
called dual steepest-edge pricing. The algorithm chooses an entering variable using the variation of
Harris’ ratio test suggested by Koberstein [5]. To help alleviate degeneracy, the algorithm can
introduce additional perturbations during Phase 2.
Phase 2 Outline
In phase 2, the algorithm applies the simplex algorithm, starting at the initial point from phase 1, to
solve the original problem. At each iteration, the algorithm tests the optimality condition and stops if
the current solution is optimal. If the current solution is not optimal, the algorithm
1 Chooses one variable, called the entering variable, from the nonbasic variables and adds the
corresponding column of the nonbasis to the basis (see “Basic and Nonbasic Variables” on page
9-11 for definitions).
2 Chooses a variable, called the leaving variable, from the basic variables and removes the
corresponding column from the basis.
3 Updates the current solution and the current objective value.
The algorithm chooses the entering and the leaving variables by solving two linear systems while
maintaining the feasibility of the solution.
The algorithm detects when there is no progress in the Phase 2 solution process. It attempts to
continue by performing bound perturbation. For an explanation of this part of the algorithm, see
Applegate, Bixby, Chvatal, and Cook [2].
The solver iterates, attempting to maintain dual feasibility while reducing primal infeasibility, until
the solution to the reduced, perturbed problem is both primal feasible and dual feasible. The
algorithm unwinds the perturbations that it introduced. If the solution (to the perturbed problem) is
dual infeasible for the unperturbed (original) problem, then the solver restores dual feasibility using
primal simplex or Phase 1 algorithms. Finally, the solver unwinds the preprocessing steps to return
the solution to the original problem.
Basic and Nonbasic Variables
This section defines the terms basis, nonbasis, and basic feasible solution for a linear programming
problem. The definition assumes that the problem is given in the following standard form:

min_x fᵀx such that A·x = b, lb ≤ x ≤ ub.
(Note that A and b are not the matrix and vector defining the inequalities in the original problem.)
Assume that A is an m-by-n matrix, of rank m < n, whose columns are {a1, a2, ..., an}. Suppose that
ai1, ai2, ..., aim is a basis for the column space of A, with index set B = {i1, i2, ..., im}, and that N =
{1, 2, ..., n}\B is the complement of B. The submatrix AB is called a basis and the complementary
submatrix AN is called a nonbasis. The vector of basic variables is xB and the vector of nonbasic
variables is xN. At each iteration in phase 2, the algorithm replaces one column of the current basis
with a column of the nonbasis and updates the variables xB and xN accordingly.
If x is a solution to A·x = b and all the nonbasic variables in xN are equal to either their lower or
upper bounds, x is called a basic solution. If, in addition, the basic variables in xB satisfy their lower
and upper bounds, so that x is a feasible point, x is called a basic feasible solution.
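A small numeric illustration of these definitions (a toy example constructed here, not solver code):

```matlab
% Toy basis/nonbasis example: A is 2-by-4 of rank 2.
A = [1 0 1 2;
     0 1 1 1];
b = [3; 2];
lb = zeros(4,1);  ub = 10*ones(4,1);
Bset = [1 2];  Nset = [3 4];          % choose columns 1,2 as the basis
xN = lb(Nset);                        % fix nonbasic variables at a bound
xB = A(:,Bset) \ (b - A(:,Nset)*xN);  % solve for the basic variables
x  = zeros(4,1);  x(Bset) = xB;  x(Nset) = xN;
% x = [3;2;0;0] satisfies A*x = b and lb <= x <= ub,
% so it is a basic feasible solution for this basis.
```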
References
[1] Andersen, E. D., and K. D. Andersen. "Presolving in Linear Programming." Mathematical
Programming 71, 1995, pp. 221–245.
[2] Applegate, D. L., R. E. Bixby, V. Chvátal, and W. J. Cook. The Traveling Salesman Problem: A
Computational Study. Princeton University Press, 2007.
[3] Forrest, J. J., and D. Goldfarb. "Steepest-Edge Simplex Algorithms for Linear Programming."
Mathematical Programming 57, 1992, pp. 341–374.
[4] Gondzio, J. "Multiple Centrality Corrections in a Primal Dual Method for Linear Programming."
Computational Optimization and Applications, Volume 6, Number 2, 1996, pp. 137–156.
Available at https://www.maths.ed.ac.uk/~gondzio/software/correctors.ps.
[5] Koberstein, A. "Progress in the Dual Simplex Algorithm for Solving Large Scale LP Problems:
Techniques for a Fast and Stable Implementation." Computational Optimization and
Applications 41, 2008, pp. 185–204.
[6] Mehrotra, S. "On the Implementation of a Primal-Dual Interior Point Method." SIAM Journal on
Optimization, Vol. 2, 1992, pp. 575–601.
[7] Nocedal, J., and S. J. Wright. Numerical Optimization, Second Edition. Springer Series in
Operations Research, Springer-Verlag, 2006.
Typical Linear Programming Problem
This example shows how to set up and solve a linear programming problem of the form

min_x fᵀx such that
A·x ≤ b
Aeq·x = beq
x ≥ 0.
Load the sc50b.mat file, which contains the matrices and vectors A, Aeq, b, beq, f, and the lower
bounds lb.
load sc50b
disp(size(A))
30 48
disp(size(Aeq))
20 48
Set options to use the dual-simplex algorithm and the iterative display.
options = optimoptions(@linprog,'Algorithm','dual-simplex','Display','iter');
ub = [];
[x,fval,exitflag,output] = ...
linprog(f,A,b,Aeq,beq,lb,ub,options);
Examine the exit flag, objective function value at the solution, and number of iterations used by
linprog to solve the problem.
exitflag,fval,output.iterations
exitflag = 1
fval = -70
ans = 33
You can also find the objective function value and number of iterations in the iterative display.
Maximize Long-Term Investments Using Linear Programming: Solver-Based
Problem Formulation
Suppose that you have an initial amount of money Capital_0 to invest over a time period of T years
in N zero-coupon bonds. Each bond pays an interest rate that compounds each year, and pays the
principal plus compounded interest at the end of a maturity period. The objective is to maximize the
total amount of money after T years.
You can include a constraint that no single investment is more than a certain fraction of your total
capital.
This example shows the problem setup on a small case first, and then formulates the general case.
You can model this as a linear programming problem. Therefore, to optimize your wealth, formulate
the problem for solution by the linprog solver.
Introductory Example
By splitting the first option B0 into 5 bonds, each with a maturity period of 1 year and an interest
rate of 0%, you can model this problem equivalently as having a total of 9 available bonds, such that for k = 1,...,9:
• Entry k of vector PurchaseYears represents the year that bond k is available for purchase.
• Entry k of vector Maturity represents the maturity period mk of bond k.
• Entry k of vector InterestRates represents the interest rate ρk of bond k.
Visualize this problem by horizontal bars that represent the available purchase times and durations
for each bond.
plotInvestments(N,PurchaseYears,Maturity,InterestRates)
Decision Variables
Represent your decision variables by a vector x, where x(k) is the dollar amount of investment in
bond k, for k=1..9. Upon maturity, the payout for investment x(k) is
x(k)·(1 + ρk/100)^mk.

The total return rk of bond k upon maturity is

rk = (1 + ρk/100)^mk.
% Total returns
finalReturns = (1+InterestRates/100).^Maturity;
Objective Function
The goal is to choose investments to maximize the amount of money collected at the end of year T.
From the plot, you see that investments are collected at various intermediate years and reinvested. At
the end of year T, the money returned from investments 5, 7, and 8 can be collected and represents
your final wealth:
To place this problem into the form linprog solves, turn this maximization problem into a
minimization problem using the negative of the coefficients of x(j):
min_x fᵀx
with
f = zeros(nPtotal,1);
f([5,7,8]) = [-finalReturns(5),-finalReturns(7),-finalReturns(8)];
Every year, you have a certain amount of money available to purchase bonds. Starting with year 1,
you can invest the initial capital in the purchase options x1 and x6, so:
x1 + x6 = Capital0
Then for the following years, you collect the returns from maturing bonds, and reinvest them in new
available bonds to obtain the system of equations:
x2 + x8 + x9 = r1x1
x3 = r2x2
x4 = r3x3
x5 + x7 = r4x4 + r6x6 + r9x9
Write these equations in the form Aeqx = beq, where each row of the Aeq matrix corresponds to the
equality that needs to be satisfied that year:
Aeq =
[   1    0    0    0   0    1   0   0    0
  −r1    1    0    0   0    0   0   1    1
    0  −r2    1    0   0    0   0   0    0
    0    0  −r3    1   0    0   0   0    0
    0    0    0  −r4   1  −r6   1   0  −r9 ]

beq =
[ Capital_0
  0
  0
  0
  0 ]
Aeq = spalloc(N+1,nPtotal,15);
Aeq(1,[1,6]) = 1;
Aeq(2,[1,2,8,9]) = [-1,1,1,1];
Aeq(3,[2,3]) = [-1,1];
Aeq(4,[3,4]) = [-1,1];
Aeq(5,[4:7,9]) = [-finalReturns(4),1,-finalReturns(6),1,-finalReturns(9)];
beq = zeros(T,1);
beq(1) = Capital_0;
Because each amount invested must be nonnegative, each entry in the solution vector x must be
nonnegative. Include this constraint by setting a lower bound lb of zero on the solution vector x.
There is no explicit upper bound on the solution vector, so set the upper bound ub to empty.
lb = zeros(size(f));
ub = [];
Solve this problem with no constraints on the amount you can invest in a bond. The interior-point
algorithm can be used to solve this type of linear programming problem.
options = optimoptions('linprog','Algorithm','interior-point');
[xsol,fval,exitflag] = linprog(f,[],[],Aeq,beq,lb,ub,options);
The exit flag is 1, indicating that the solver found a solution. The value -fval, returned as the second
linprog output argument, corresponds to the final wealth. Plot your investments over time.
fprintf('After %d years, the return for the initial $%g is $%g \n',...
T,Capital_0,-fval);
plotInvestments(N,PurchaseYears,Maturity,InterestRates,xsol)
To diversify your investments, you can choose to limit the amount invested in any one bond to a
certain percentage Pmax of the total capital that year (including the returns for bonds that are
currently in their maturity period). You obtain the following system of inequalities:
x1 ≤ Pmax × Capital0
x2 ≤ Pmax × (ρ1·x1 + ρ6·x6)
x3 ≤ Pmax × (ρ2·x2 + ρ6^2·x6 + ρ8·x8 + ρ9·x9)
x4 ≤ Pmax × (ρ3·x3 + ρ6^3·x6 + ρ8^2·x8 + ρ9^2·x9)
To set up the system of inequalities, first generate a matrix yearlyReturns that contains the return
for the bond indexed by i at year j in row i and column j. Represent this system as a sparse matrix.
% Build the return for each bond over the maturity period as a sparse
% matrix
cumMaturity = [0;cumsum(Maturity)];
xr = zeros(cumMaturity(end-1),1);
yr = zeros(cumMaturity(end-1),1);
cr = zeros(cumMaturity(end-1),1);
for i = 1:nPtotal
mi = Maturity(i); % maturity of bond i
pi = PurchaseYears(i); % purchase year of bond i
idx = cumMaturity(i)+1:cumMaturity(i+1); % index into xr, yr and cr
xr(idx) = i; % bond index
yr(idx) = pi+1:pi+mi; % maturing years
cr(idx) = (1+InterestRates(i)/100).^(1:mi); % returns over the maturity period
end
yearlyReturns = sparse(xr,yr,cr,nPtotal,T+1);
% Right-hand side
b = zeros(nPtotal,1);
b(PurchaseYears == 1) = Pmax*Capital_0;
Solve the problem by investing no more than 60% in any one asset. Plot the resulting purchases.
Notice that your final wealth is less than the investment without this constraint.
[xsol,fval,exitflag] = linprog(f,A,b,Aeq,beq,lb,ub,options);
fprintf('After %d years, the return for the initial $%g is $%g \n',...
T,Capital_0,-fval);
plotInvestments(N,PurchaseYears,Maturity,InterestRates,xsol)
Create a model for a general version of the problem. Illustrate it using T = 30 years and 400
randomly generated bonds with interest rates from 1 to 6%. This setup results in a linear
programming problem with 430 decision variables. The system of equality constraints is represented
by a sparse matrix Aeq of dimension 30-by-430 and the system of inequalities is represented by a
sparse matrix A of dimension 430-by-430.
% for reproducibility
rng default
% Initial amount of money
Capital_0 = 1000;
% Time period in years
T = 30;
% Number of bonds
N = 400;
% Total number of buying opportunities
nPtotal = N+T;
% Generate random maturity durations
Maturity = randi([1 T-1],nPtotal,1);
% Bond 1 has a maturity period of 1 year
Maturity(1:T) = 1;
% Generate random yearly interest rate for each bond
InterestRates = randi(6,nPtotal,1);
% Bond 1 has an interest rate of 0 (not invested)
InterestRates(1:T) = 0;
% Compute the return at the end of the maturity period for each bond:
finalReturns = (1+InterestRates/100).^Maturity;
% Initialize f to 0
f = zeros(nPtotal,1);
% Indices of the sale opportunities at the end of year T
SalesTidx = SaleYears==T+1;
% Expected return for the sale opportunities at the end of year T
ReturnsT = finalReturns(SalesTidx);
% Objective function
f(SalesTidx) = -ReturnsT;
% For each sale option, put -\rho_k, where \rho_k is the interest rate for the
% associated bond that is being sold, in the row corresponding to the
% year for the sale option and the column corresponding to the purchase
% opportunity
xeq2 = SaleYears(~SalesTidx);
yeq2 = find(~SalesTidx);
ceq2 = -finalReturns(~SalesTidx);
% Build the returns for each bond over the maturity period
cumMaturity = [0;cumsum(Maturity)];
xr = zeros(cumMaturity(end-1),1);
yr = zeros(cumMaturity(end-1),1);
cr = zeros(cumMaturity(end-1),1);
for i = 1:nPtotal
mi = Maturity(i); % maturity of bond i
pi = PurchaseYears(i); % purchase year of bond i
idx = cumMaturity(i)+1:cumMaturity(i+1); % index into xr, yr and cr
xr(idx) = i; % bond index
% Right-hand side
b = zeros(nPtotal,1);
b(PurchaseYears==1) = Pmax*Capital_0;
First, solve the linear programming problem without inequality constraints using the interior-point
algorithm.
toc
fprintf('\nAfter %d years, the return for the initial $%g is $%g \n',...
T,Capital_0,-fval);
toc
fprintf('\nAfter %d years, the return for the initial $%g is $%g \n',...
T,Capital_0,-fval);
Even though the number of constraints increased by roughly a factor of 10, the time for the solver to
find a solution increased by roughly a factor of 100. This performance discrepancy is partially caused by dense
columns in the inequality system shown in matrix A. These columns correspond to bonds with a long
maturity period, as shown in the following graph.
% Number of nonzero elements per column
nnzCol = sum(spones(A));
% Plot the maturity length vs. the number of nonzero elements for each bond
figure;
plot(Maturity,nnzCol,'o');
xlabel('Maturity period of bond k')
ylabel('Number of nonzeros in column k of A')
Dense columns in the constraints lead to dense blocks in the solver's internal matrices, yielding a loss
of efficiency of its sparse methods. To speed up the solver, try the dual-simplex algorithm, which is
less sensitive to column density.
% Solve the problem with inequality constraints using dual simplex
options = optimoptions('linprog','Algorithm','dual-simplex');
tic
[xsol,fval,exitflag] = linprog(f,A,b,Aeq,beq,lb,[],options);
toc
fprintf('\nAfter %d years, the return for the initial $%g is $%g \n',...
T,Capital_0,-fval);
In this case, the dual-simplex algorithm took much less time to obtain the same solution.
To get a feel for the solution found by linprog, compare it to the amount fmax that you would get if
you could invest all of your starting money in a single bond with a 6% interest rate (the maximum
available rate) over the full 30-year period. You can also compute the equivalent interest rate
corresponding to your final wealth.
% Maximum amount
fmax = Capital_0*(1+6/100)^T;
% Ratio (in percent)
rat = -fval/fmax*100;
% Equivalent interest rate (in percent)
rsol = ((-fval/Capital_0)^(1/T)-1)*100;
The amount collected is 88.7137% of the maximum amount $5743.49 that you would obtain from investing at the maximum rate for the whole period.
Your final wealth corresponds to a 5.57771% interest rate over the 30-year period.
plotInvestments(N,PurchaseYears,Maturity,InterestRates,xsol,false)
See Also
More About
• “Maximize Long-Term Investments Using Linear Programming: Problem-Based”
Mixed-Integer Linear Programming Definition

A mixed-integer linear programming (MILP) problem has these characteristics:

• Linear objective function, f^T x, where f is a column vector of constants, and x is the column vector
of unknowns
• Bounds and linear constraints, but no nonlinear constraints (for definitions, see “Write
Constraints”)
• Restrictions on some components of x to have integer values
In mathematical terms, given vectors f, lb, and ub, matrices A and Aeq, corresponding vectors b and
beq, and a set of indices intcon, find a vector x to solve
min f^T x subject to
 x
    x(intcon) are integers
    A · x ≤ b
    Aeq · x = beq
    lb ≤ x ≤ ub.
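As a concrete instance of this definition, the following sketch (with made-up data, not from any example in this guide) solves a two-variable problem whose first component is restricted to integer values:

```matlab
% Illustrative data: minimize -x1 - 2*x2 subject to x1 + x2 <= 3.5,
% 0 <= x <= 3, with x1 restricted to integer values
f = [-1; -2];           % objective coefficient vector
intcon = 1;             % x(1) must be an integer
A = [1 1];  b = 3.5;    % linear inequality A*x <= b
Aeq = [];   beq = [];   % no linear equalities
lb = [0; 0]; ub = [3; 3];
x = intlinprog(f,intcon,A,b,Aeq,beq,lb,ub)  % returns x = [0; 3]
```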
intlinprog Algorithm
• “Algorithm Overview” on page 9-26
• “Linear Program Preprocessing” on page 9-27
• “Linear Programming” on page 9-27
• “Mixed-Integer Program Preprocessing” on page 9-27
• “Cut Generation” on page 9-28
• “Heuristics for Finding Feasible Solutions” on page 9-29
• “Branch and Bound” on page 9-31
Algorithm Overview
intlinprog uses this basic strategy to solve mixed-integer linear programs. intlinprog can solve
the problem in any of the stages. If it solves the problem in a stage, intlinprog does not execute
the later stages.
1 Reduce the problem size using “Linear Program Preprocessing” on page 9-27.
2 Solve an initial relaxed (noninteger) problem using “Linear Programming” on page 9-27.
3 Perform “Mixed-Integer Program Preprocessing” on page 9-27 to tighten the LP relaxation of
the mixed-integer problem.
4 Try “Cut Generation” on page 9-28 to further tighten the LP relaxation of the mixed-integer
problem.
5 Try to find integer-feasible solutions using “Heuristics for Finding Feasible Solutions” on page
9-29.
6 Use a “Branch and Bound” on page 9-31 algorithm to search systematically for the optimal
solution.
Mixed-Integer Linear Programming Algorithms
Linear Program Preprocessing

According to the “Mixed-Integer Linear Programming Definition” on page 9-26, there are matrices A
and Aeq and corresponding vectors b and beq that encode a set of linear inequalities and linear
equalities
A · x ≤ b
Aeq · x = beq.
Usually, it is possible to reduce the number of variables in the problem (the number of components of
x), and to reduce the number of linear constraints. While performing these reductions can take time for
the solver, they usually lower the overall time to solution, and can make larger problems solvable. The
algorithms can also make the solution more numerically stable. Furthermore, these algorithms can
sometimes detect an infeasible problem.
Preprocessing steps aim to eliminate redundant variables and constraints, improve the scaling of the
model and sparsity of the constraint matrix, strengthen the bounds on variables, and detect the
primal and dual infeasibility of the model.
For details, see Andersen and Andersen [2] and Mészáros and Suhl [8].
Linear Programming
The initial relaxed problem is the linear programming problem with the same objective and
constraints as “Mixed-Integer Linear Programming Definition” on page 9-26, but no integer
constraints. Call xLP the solution to the relaxed problem, and x the solution to the original problem
with integer constraints. Clearly,
f^T x_LP ≤ f^T x,

because x_LP minimizes the same function but with fewer restrictions.
This initial relaxed LP (root node LP) and all generated LP relaxations during the branch-and-bound
algorithm are solved using linear programming solution techniques.
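You can observe the bound f^T x_LP ≤ f^T x on any small problem by solving the relaxation with linprog and the integer problem with intlinprog; the data below is illustrative:

```matlab
f = [-1; -2];                % objective
A = [3 2];  b = 6.5;         % one inequality constraint
lb = [0; 0]; ub = [2; 2];
xLP = linprog(f,A,b,[],[],lb,ub);           % relaxed (continuous) solution
xMILP = intlinprog(f,1:2,A,b,[],[],lb,ub);  % both variables integer
fprintf('f''*xLP = %g <= f''*xMILP = %g\n',f'*xLP,f'*xMILP)
```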
Mixed-Integer Program Preprocessing

During mixed-integer program preprocessing, intlinprog analyzes the linear inequalities A*x ≤ b
along with integrality restrictions to determine whether the problem is infeasible and whether some
bounds can be tightened.
The IntegerPreprocess option lets you choose whether intlinprog takes several steps, takes all
of them, or takes almost none of them. If you include an x0 argument, intlinprog uses that value in
preprocessing.
Cut Generation
Cuts are additional linear inequality constraints that intlinprog adds to the problem. These
inequalities attempt to restrict the feasible region of the LP relaxations so that their solutions are
closer to integers. You control the type of cuts that intlinprog uses with the CutGeneration
option.
Furthermore, if the problem is purely integer (all variables are integer-valued), then intlinprog
also uses the following cuts:
'advanced' cuts include all 'intermediate' cuts except reduce-and-split cuts, plus:
For purely integer problems, 'intermediate' uses the most cut types, because it uses reduce-and-
split cuts, while 'advanced' does not.
Another option, CutMaxIterations, specifies an upper bound on the number of times intlinprog
iterates to generate cuts.
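Both options are properties of an intlinprog options object; for example, to allow the advanced cut family and more cut passes (the values shown are illustrative, not recommendations):

```matlab
opts = optimoptions('intlinprog', ...
    'CutGeneration','advanced', ... % 'none', 'basic', 'intermediate', or 'advanced'
    'CutMaxIterations',25);         % default is 10
% Pass the options as the final input:
% x = intlinprog(f,intcon,A,b,Aeq,beq,lb,ub,opts);
```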
For details about cut generation algorithms (also called cutting plane methods), see Cornuéjols [5]
and, for clique cuts, Atamtürk, Nemhauser, and Savelsbergh [3].
Heuristics for Finding Feasible Solutions

To get an upper bound on the objective function, the branch-and-bound procedure must find feasible
points. A solution to an LP relaxation during branch-and-bound can be integer feasible, which can
provide an improved upper bound to the original MILP. Certain techniques find feasible points faster
before or during branch-and-bound. intlinprog uses these techniques at the root node and during
some branch-and-bound iterations. These techniques are heuristic, meaning they are algorithms that
can succeed but can also fail.
Heuristics can be start heuristics, which help the solver find an initial or new integer-feasible
solution. Or heuristics can be improvement heuristics, which start at an integer-feasible point and
attempt to find a better integer-feasible point, meaning one with lower objective function value. The
intlinprog improvement heuristics are 'rins', 'rss', 1-opt, and guided diving.
Set the intlinprog heuristics using the 'Heuristics' option. The options are:

• 'basic' (default) — The solver runs rounding heuristics twice with different parameters, runs
diving heuristics twice with different parameters, then runs 'rss'. The solver does not run later
heuristics when earlier heuristics lead to a sufficiently good integer-feasible solution.

• 'intermediate' — The solver runs rounding heuristics twice with different parameters, then runs
diving heuristics twice with different parameters. If there is an integer-feasible solution, the solver
then runs 'rins' followed by 'rss'. If 'rss' finds a new solution, the solver runs 'rins' again.
The solver does not run later heuristics when earlier heuristics lead to a sufficiently good
integer-feasible solution.

• 'advanced' — The solver runs rounding heuristics twice with different parameters, then runs
diving heuristics twice with different parameters. If there is an integer-feasible solution, the solver
then runs 'rins' followed by 'rss'. If 'rss' finds a new solution, the solver runs 'rins' again.
The solver does not run later heuristics when earlier heuristics lead to a sufficiently good
integer-feasible solution.

• 'rins' or the equivalent 'rins-diving' — intlinprog searches the neighborhood of the current
best integer-feasible solution point (if available) to find a new and better solution. See Danna,
Rothberg, and Le Pape [6]. When you select 'rins', the solver runs rounding heuristics twice with
different parameters, runs diving heuristics twice with different parameters, then runs 'rins'.

• 'rss' or the equivalent 'rss-diving' — intlinprog applies a hybrid procedure combining ideas
from 'rins' and local branching to search for integer-feasible solutions. When you select 'rss',
the solver runs rounding heuristics twice with different parameters, runs diving heuristics twice
with different parameters, then runs 'rss'. The solver does not run later heuristics when earlier
heuristics lead to a sufficiently good integer-feasible solution. These settings perform the same
heuristics as 'basic'.
• 'round' — intlinprog takes the LP solution to the relaxed problem at a node, and rounds the
integer components in a way that attempts to maintain feasibility. When you select 'round', the
solver, at the root node, runs rounding heuristics twice with different parameters, then runs diving
heuristics twice with different parameters. Thereafter, the solver runs only rounding heuristics at
some branch-and-bound nodes.

• 'round-diving' — The solver works in a similar way to 'round', but also runs diving heuristics
(in addition to rounding heuristics) at some branch-and-bound nodes, not just the root node.

• 'diving' — intlinprog uses heuristics that are similar to branch-and-bound steps, but follow
just one branch of the tree down, without creating the other branches. This single branch leads to
a fast “dive” down the tree fragment, thus the name “diving.” Currently, intlinprog uses six
diving heuristics in a fixed order.
The main difference between 'intermediate' and 'advanced' is that 'advanced' runs
heuristics more frequently during branch-and-bound iterations.
In addition to the previous table, the following heuristics run when the Heuristics option is
'basic', 'intermediate', or 'advanced'.
• ZI round — This heuristic runs whenever an algorithm solves a relaxed LP. The heuristic goes
through each fractional integer variable to attempt to shift it to a neighboring integer without
affecting the feasibility with respect to other constraints. For details, see Hendel [7].
• 1-opt — This heuristic runs whenever an algorithm finds a new integer feasible solution. The
heuristic goes through each integer variable to attempt to shift it to a neighboring integer without
affecting the feasibility with respect to other constraints, while lowering the objective function
value.
At the beginning of the heuristics phase, intlinprog runs the trivial heuristic unless Heuristics is
'none' or you provide an initial integer-feasible point in the x0 argument. The trivial heuristic
checks the following points for feasibility:
• All zeros
• Upper bound
• Lower bound (if nonzero)
• "Lock" point
The "lock" point is defined only for problems with finite upper and lower bounds for all variables. The
"lock" point for each variable is its upper or lower bound, chosen as follows. For each variable j,
count the number of corresponding positive entries in the linear constraint matrix A(:,j) and
subtract the number of corresponding negative entries. If the result is positive, use the lower bound for
that variable, lb(j). Otherwise, use the upper bound for that variable, ub(j). The "lock" point
attempts to satisfy the largest number of linear inequality constraints for each variable, but is not
necessarily feasible.
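The “lock” point computation can be sketched as follows. This is an illustrative reimplementation of the description above, not the solver's internal code; it assumes A, lb, and ub are defined and all bounds are finite:

```matlab
nVar = size(A,2);
xlock = zeros(nVar,1);
for j = 1:nVar
    % Count positive entries of column j, minus negative entries
    score = nnz(A(:,j) > 0) - nnz(A(:,j) < 0);
    if score > 0
        xlock(j) = lb(j);  % mostly positive coefficients: use lower bound
    else
        xlock(j) = ub(j);  % otherwise: use upper bound
    end
end
% xlock tries to satisfy many rows of A*x <= b, but might be infeasible
```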
After each heuristic completes with a feasible solution, intlinprog calls output functions and plot
functions. See “intlinprog Output Function and Plot Function Syntax” on page 15-33.
If you include an x0 argument, intlinprog uses that value in the 'rins' and guided diving
heuristics until it finds a better integer-feasible point. So when you provide x0, you can obtain good
results by setting the 'Heuristics' option to 'rins-diving' or another setting that uses
'rins'.
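For example, assuming you have an integer-feasible point x0 for your own problem data, the call might look like this sketch:

```matlab
opts = optimoptions('intlinprog','Heuristics','rins-diving');
% x0 must satisfy all constraints and be integer-valued in intcon
x = intlinprog(f,intcon,A,b,Aeq,beq,lb,ub,x0,opts);
```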
Branch and Bound

As explained in “Linear Programming” on page 9-27, any solution to the linear programming relaxed
problem has a lower objective function value than the solution to the MILP. Also, any feasible point
x_feas satisfies

f^T x_feas ≥ f^T x,

where x is the solution to the MILP, so f^T x_feas gives an upper bound on the final objective value.
In this context, a node is an LP with the same objective function, bounds, and linear constraints as
the original problem, but without integer constraints, and with particular changes to the linear
constraints or bounds. The root node is the original problem with no integer constraints and no
changes to the linear constraints or bounds, meaning the root node is the initial relaxed LP.
From the starting bounds, the branch-and-bound method constructs new subproblems by branching
from the root node. The branching step is taken heuristically, according to one of several rules. Each
rule is based on the idea of splitting a problem by restricting one variable to be less than or equal to
an integer J, or greater than or equal to J+1. These two subproblems arise when an entry in xLP,
corresponding to an integer specified in intcon, is not an integer. Here, xLP is the solution to a relaxed
problem. Take J as the floor of the variable (rounded down), and J+1 as the ceiling (rounded up). The
resulting two problems have solutions that are larger than or equal to fTxLP, because they have more
restrictions. Therefore, this procedure potentially raises the lower bound.
The performance of the branch-and-bound method depends on the rule for choosing which variable to
split (the branching rule). The algorithm uses these rules, which you can set in the BranchRule
option:
• 'maxpscost' (default) — Branch on the fractional variable with maximal pseudocost-based score.

Pseudocost

The pseudocost of a variable i is based on empirical estimates of the change in the lower bound
when i has been chosen as the branching variable, combined with the fractional part of the i
component of the current point x. The fractional part p has two pieces, the lower part and the
upper part:

p_i^- = x(i) − ⌊x(i)⌋
p_i^+ = 1 − p_i^-.

Let x_i^- be the solution of the linear program restricted to have x(i) ≤ ⌊x(i)⌋, and let the change in
objective function be denoted

Δ_i^- = f^T x_i^- − f^T x.

Similarly, Δ_i^+ is the change in objective function when the problem is restricted to have
x(i) ≥ ⌈x(i)⌉.

The objective gain per unit change in x(i) in each direction is

d_i^- = Δ_i^- / p_i^-   or   d_i^+ = Δ_i^+ / p_i^+.

Let s_i^- and s_i^+ be the empirical averages of d_i^- and d_i^+ during the branch-and-bound algorithm up
to this point. The empirical values are initialized to the absolute value of the objective coefficient
f(i) for the terms before there are any observations. Then the 'maxpscost' rule is to branch on a
variable i that maximizes, for some positive weights w^+ and w^-, the quantity

w^- · s_i^- · p_i^- + w^+ · s_i^+ · p_i^+.

Roughly speaking, this rule chooses a variable that is likely to increase the lower bound
maximally.
• 'strongpscost' — Similar to 'maxpscost', but instead of the pseudocost being initialized to 1
for each variable, the solver attempts to branch on a variable only after the pseudocost has a more
reliable estimate. To obtain a more reliable estimate, the solver does the following (see
Achterberg, Koch, and Martin [1]).
• Order all potential branching variables (those that are currently fractional but should be
integer) by their current pseudocost-based scores.
• Run the two relaxed linear programs based on the current branching variable, starting from
the variable with the highest score (if the variable has not yet been used for a branching
calculation). The solver uses these two solutions to update the pseudocosts for the current
branching variable. The solver can halt this process early to save time in choosing the branch.
• Continue choosing variables in the list until the current highest pseudocost-based score does
not change for k consecutive variables, where k is an internally chosen value, usually between
5 and 10.
• Branch on the variable with the highest pseudocost-based score. The solver might have already
computed the relaxed linear programs based on this variable during an earlier pseudocost
estimation procedure.
Because of the extra linear program solutions, each iteration of 'strongpscost' branching
takes longer than the default 'maxpscost'. However, the number of branch-and-bound iterations
typically decreases, so the 'strongpscost' method can save time overall.
• 'reliability' — Similar to 'strongpscost', but instead of running the relaxed linear
programs only for uninitialized pseudocost branches, 'reliability' runs the programs up to
k2 times for each variable, where k2 is a small integer such as 4 or 8. Therefore, 'reliability'
has even slower branching, but potentially fewer branch-and-bound iterations, compared to
'strongpscost'.
• 'mostfractional' — Choose the variable with fractional part closest to 1/2.
• 'maxfun' — Choose the variable with maximal corresponding absolute value in the objective
vector f.
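You select among these rules with the BranchRule option, for example:

```matlab
opts = optimoptions('intlinprog','BranchRule','strongpscost');
% x = intlinprog(f,intcon,A,b,Aeq,beq,lb,ub,opts);
```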
After the algorithm branches, there are two new nodes to explore. The algorithm chooses which node
to explore among all that are available using one of these rules:
• 'minobj' — Choose the node that has the lowest objective function value.
• 'mininfeas' — Choose the node with the minimal sum of integer infeasibilities. This means for
every integer-infeasible component x(i) in the node, add up the smaller of p_i^- and p_i^+, the lower
and upper fractional parts defined in the Pseudocost discussion.
Best Projection

Let x_B denote the best integer-feasible point found so far, x_R denote the LP relaxed solution at the
root node, and x denote the node under examination. Let in(x) denote the sum of integer
infeasibilities at the node x (see 'mininfeas'). The best projection rule is to minimize

f^T x + ((f^T x_B − f^T x_R) / in(x_R)) · in(x).
intlinprog skips the analysis of some subproblems by considering information from the original
problem such as the objective function’s greatest common divisor (GCD).
For details about the branch-and-bound procedure, see Nemhauser and Wolsey [9] and Wolsey [11].
References
[1] Achterberg, T., T. Koch and A. Martin. Branching rules revisited. Operations Research Letters 33,
2005, pp. 42–54. Available at https://www-m9.ma.tum.de/downloads/felix-
klein/20B/AchterbergKochMartin-BranchingRulesRevisited.pdf.
[2] Andersen, E. D., and Andersen, K. D. Presolving in linear programming. Mathematical
Programming 71, pp. 221–245, 1995.
[3] Atamtürk, A., G. L. Nemhauser, and M. W. P. Savelsbergh. Conflict graphs in solving integer
programming problems. European Journal of Operational Research 121, pp. 40–55, 2000.
[4] Berthold, T. Primal Heuristics for Mixed Integer Programs. Technischen Universität Berlin,
September 2006. Available at https://www.zib.de/groetschel/students/Diplom-
Berthold.pdf.
[5] Cornuéjols, G. Valid inequalities for mixed integer linear programs. Mathematical Programming B,
Vol. 112, pp. 3–44, 2008.
[6] Danna, E., Rothberg, E., Le Pape, C. Exploring relaxation induced neighborhoods to improve MIP
solutions. Mathematical Programming, Vol. 102, issue 1, pp. 71–90, 2005.
[7] Hendel, G. New Rounding and Propagation Heuristics for Mixed Integer Programming. Bachelor's
thesis at Technische Universität Berlin, 2011. PDF available at https://opus4.kobv.de/opus4-
zib/files/1332/bachelor_thesis_main.pdf.
[8] Mészáros C., and Suhl, U. H. Advanced preprocessing techniques for linear and quadratic
programming. OR Spectrum, 25(4), pp. 575–595, 2003.
[9] Nemhauser, G. L., and Wolsey, L. A. Integer and Combinatorial Optimization. Wiley-Interscience,
New York, 1999.
[10] Savelsbergh, M. W. P. Preprocessing and Probing Techniques for Mixed Integer Programming
Problems. ORSA J. Computing, Vol. 6, No. 4, pp. 445–454, 1994.
[11] Wolsey, L. A. Integer Programming. Wiley-Interscience, New York, 1998.
Tuning Integer Linear Programming
Note Often, you can change the formulation of a MILP to make it more easily solvable. For
suggestions on how to change your formulation, see Williams [1].
After you run intlinprog once, you might want to change some options and rerun the solver, for
example to obtain a better feasible point or a faster or more accurate solution.

Here are general recommendations for option changes that are most likely to help the solution
process. Try the suggestions in this order:
1 For a faster and more accurate solution, increase the CutMaxIterations option from its default
10 to a higher number such as 25. This can speed up the solution, but can also slow it.
2 For a faster and more accurate solution, change the CutGeneration option to
'intermediate' or 'advanced'. This can speed up the solution, but can use much more
memory, and can slow the solution.
3 For a faster and more accurate solution, change the IntegerPreprocess option to
'advanced'. This can have a large effect on the solution process, either beneficial or not.
4 For a faster and more accurate solution, change the RootLPAlgorithm option to 'primal-
simplex'. Usually this change is not beneficial, but occasionally it can be.
5 To try to find more or better feasible points, increase the HeuristicsMaxNodes option from its
default 50 to a higher number such as 100.
6 To try to find more or better feasible points, change the Heuristics option to either
'intermediate' or 'advanced'.
7 To try to find more or better feasible points, change the BranchRule option to
'strongpscost' or, if that choice fails to improve the solution, 'maxpscost'.
8 For a faster solution, increase the ObjectiveImprovementThreshold option from its default
of zero to a positive value such as 1e-4. However, this change can cause intlinprog to find
fewer integer feasible points or a less accurate solution.
9 To attempt to stop the solver more quickly, change the RelativeGapTolerance option to a
higher value than the default 1e-4. Similarly, to attempt to obtain a more accurate answer,
change the RelativeGapTolerance option to a lower value. These changes do not always
improve results.
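Several of these suggestions translate into one optimoptions call; the combination below is an illustrative sketch, not a recommendation for any particular problem:

```matlab
opts = optimoptions('intlinprog', ...
    'CutMaxIterations',25, ...          % suggestion 1
    'CutGeneration','intermediate', ... % suggestion 2
    'Heuristics','intermediate', ...    % suggestion 6
    'RelativeGapTolerance',1e-6);       % suggestion 9, for a more accurate answer
% [x,fval] = intlinprog(f,intcon,A,b,Aeq,beq,lb,ub,opts);
```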
Some “Integer” Solutions Are Not Integers

Often, some supposedly integer-valued components of the solution x(intcon) are not precisely
integers. To round all supposed integers to be precisely integers, use the round function.
x(intcon) = round(x(intcon));
Caution Rounding can cause solutions to become infeasible. Check feasibility after rounding:
max(A*x - b) % see if entries are not too positive, so have small infeasibility
max(abs(Aeq*x - beq)) % see if entries are near enough to zero
max(x - ub) % positive entries are violated bounds
max(lb - x) % positive entries are violated bounds
If intlinprog warns that some components are too large to reliably round to integers, sometimes
you can scale the problem to have smaller coefficients:
• For coefficients in f that are too large, try multiplying f by a small positive scaling factor.
• For constraint coefficients that are too large, try multiplying all bounds and constraint matrices by
the same small positive scaling factor.
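A sketch of the first suggestion: scaling f by a positive factor s leaves the minimizing x unchanged, so you can recover the true objective value afterward.

```matlab
s = 1e-3;                      % illustrative scaling factor
x = intlinprog(s*f,intcon,A,b,Aeq,beq,lb,ub);
trueObjective = f'*x;          % undo the scaling in the reported objective
```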
References
[1] Williams, H. Paul. Model Building in Mathematical Programming. Wiley, 2013.
Mixed-Integer Linear Programming Basics: Solver-Based
For the problem-based approach to this problem, see “Mixed-Integer Linear Programming Basics:
Problem-Based” on page 10-40.
Problem Description
You want to blend steels with various chemical compositions to obtain 25 tons of steel with a specific
chemical composition. The result should have 5% carbon and 5% molybdenum by weight, meaning 25
tons*5% = 1.25 tons of carbon and 1.25 tons of molybdenum. The objective is to minimize the cost for
blending the steel.
This problem is taken from Carl-Henrik Westerberg, Bengt Bjorklund, and Eskil Hultman, “An
Application of Mixed Integer Programming in a Swedish Steel Mill.” Interfaces February 1977 Vol. 7,
No. 2 pp. 39–43, whose abstract is at https://doi.org/10.1287/inte.7.2.39.
Four ingots of steel are available for purchase. Only one of each ingot is available.
Ingot    Weight in Tons    % Carbon    % Molybdenum    Cost/Ton
1        5                 5           3               $350
2        3                 4           3               $330
3        4                 5           4               $310
4        6                 3           4               $280
Three grades of alloy steel and one grade of scrap steel are available for purchase. Alloy and scrap
steels can be purchased in fractional amounts.
Alloy    % Carbon    % Molybdenum    Cost/Ton
1        8           6               $500
2        7           7               $450
3        6           8               $400
Scrap    3           9               $100
To formulate the problem, first decide on the control variables. Take variable x(1) = 1 to mean you
purchase ingot 1, and x(1) = 0 to mean you do not purchase the ingot. Similarly, variables x(2)
through x(4) are binary variables indicating whether you purchase ingots 2 through 4.
Variables x(5) through x(7) are the quantities in tons of alloys 1, 2, and 3 that you purchase, and
x(8) is the quantity of scrap steel that you purchase.
MATLAB® Formulation
Formulate the problem by specifying the inputs for intlinprog. The relevant intlinprog syntax:
[x,fval] = intlinprog(f,intcon,A,b,Aeq,beq,lb,ub)
Create the inputs for intlinprog from the first (f) through the last (ub).
f is the vector of cost coefficients. The coefficients representing the costs of ingots are the ingot
weights times their cost per ton.
f = [350*5,330*3,310*4,280*6,500,450,400,100];
intcon = 1:4;
Tip: To specify binary variables, set the variables to be integers in intcon, and give them a lower
bound of 0 and an upper bound of 1.
The problem has no linear inequality constraints, so A and b are empty matrices ([]).
A = [];
b = [];
The problem has three equality constraints. The first is that the total weight is 25 tons. The second is
that the weight of carbon is 5% of 25 tons, or 1.25 tons. The third is that the weight of molybdenum is
likewise 1.25 tons.
Aeq = [5,3,4,6,1,1,1,1;
5*0.05,3*0.04,4*0.05,6*0.03,0.08,0.07,0.06,0.03;
5*0.03,3*0.03,4*0.04,6*0.04,0.06,0.07,0.08,0.09];
beq = [25;1.25;1.25];
Each variable is bounded below by zero. The integer variables are bounded above by one.
lb = zeros(8,1);
ub = ones(8,1);
ub(5:end) = Inf; % No upper bound on noninteger variables
Solve Problem
Now that you have all the inputs, call the solver.
[x,fval] = intlinprog(f,intcon,A,b,Aeq,beq,lb,ub);
Intlinprog stopped at the root node because the objective value is within a gap
tolerance of the optimal value, options.AbsoluteGapTolerance = 0 (the default
value). The intcon variables are integer within tolerance,
options.IntegerTolerance = 1e-05 (the default value).
x,fval
x = 8×1
1.0000
1.0000
0
1.0000
7.2500
0
0.2500
3.5000
fval = 8.4950e+03
The optimal purchase costs $8,495. Buy ingots 1, 2, and 4, but not 3, and buy 7.25 tons of alloy 1,
0.25 ton of alloy 3, and 3.5 tons of scrap steel.
Set intcon = [] to see the effect of solving the problem without integer constraints. The solution is
different, and is not realistic, because you cannot purchase a fraction of an ingot.
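A sketch of that experiment, reusing the inputs defined above; passing [] for intcon removes all integer restrictions:

```matlab
[xRelaxed,fvalRelaxed] = intlinprog(f,[],A,b,Aeq,beq,lb,ub);
xRelaxed(1:4)  % the ingot variables are typically fractional here
```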
See Also
More About
• “Mixed-Integer Linear Programming Basics: Problem-Based” on page 10-40
• “Solver-Based Optimization Problem Setup”
The example first generates random locations for factories, warehouses, and sales outlets. Feel free
to modify the scaling parameter N, which scales both the size of the grid in which the production and
distribution facilities reside and the number of these facilities, so that the density of facilities of each
type per grid area is independent of N.
Facility Locations
For a given value of the scaling parameter N, suppose that there are the following:
• f·N² factories
• w·N² warehouses
• s·N² sales outlets
These facilities are on separate integer grid points between 1 and N in the x and y directions. In
order that the facilities have separate locations, you require that f + w + s ≤ 1. In this example, take
N = 20, f = 0.05, w = 0.05, and s = 0.1.
The demand for each product p in a sales outlet s is d(s, p). The demand is the quantity that can be
sold in a time interval. One constraint on the model is that the demand is met, meaning the system
produces and distributes exactly the quantities in the demand.
Suppose that each sales outlet receives its supplies from just one warehouse. Part of the problem is to
determine the cheapest mapping of sales outlets to warehouses.
Costs
The cost of transporting products from factory to warehouse, and from warehouse to sales outlet,
depends on the distance between the facilities, and on the particular product. If dist(a, b) is the
distance between facilities a and b, then the cost of shipping a product p between these facilities is
the distance times the transportation cost tcost(p):
dist(a, b) * tcost(p) .
Factory, Warehouse, Sales Allocation Model: Solver-Based
The distance in this example is the grid distance, also known as the L1 distance. It is the sum of the
absolute difference in x coordinates and y coordinates.
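For two facilities at grid coordinates, the L1 distance is one line of MATLAB (the coordinates here are made up):

```matlab
xa = 3;  ya = 7;   % facility a
xb = 10; yb = 2;   % facility b
distL1 = abs(xa - xb) + abs(ya - yb)   % returns 12
```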
Optimization Problem

Given a set of facility locations, and the demands and capacity constraints, find a production and
distribution schedule, meaning the quantities to produce, ship, and allocate. These quantities must
ensure that demand is satisfied and total cost is minimized. Also, each sales outlet is required to
receive all its products from exactly one warehouse.
The control variables, meaning the ones you can change in the optimization, are

• x(p, f, w), the amount of product p that is transported from factory f to warehouse w
• y(s, w), a binary variable taking the value 1 when sales outlet s is associated with warehouse w

For example, one of the constraints is that no warehouse's capacity is exceeded:

∑_p ∑_s (d(s, p) / turn(p)) · y(s, w) ≤ wcap(w) (capacity of warehouse),

where turn(p) is the turnover rate of product p.
The variables x and y appear in the objective and constraint functions linearly. Because y is restricted
to integer values, the problem is a mixed-integer linear program (MILP).
Set the values of the N, f , w, and s parameters, and generate the facility locations.
Of course, it is not realistic to take random locations for facilities. This example is intended to show
solution techniques, not how to generate good facility locations.
Plot the facilities. Facilities 1 through F are factories, F+1 through F+W are warehouses, and F+W
+1 through F+W+S are sales outlets.
h = figure;
plot(xloc(1:F),yloc(1:F),'rs',xloc(F+1:F+W),yloc(F+1:F+W),'k*',...
xloc(F+W+1:F+W+S),yloc(F+W+1:F+W+S),'bo');
lgnd = legend('Factory','Warehouse','Sales outlet','Location','EastOutside');
lgnd.AutoUpdate = 'off';
xlim([0 N+1]);ylim([0 N+1])
% Product transport cost per distance between 5 and 10 for each product
tcost = 5*rand(1,P) + 5;
% Product demand by sales outlet between 200 and 500 for each
% product/outlet
d = 300*rand(S,P) + 200;
These random demands and capacities can lead to infeasible problems. In other words, sometimes
the demand exceeds the production and warehouse capacity constraints. If you alter some
parameters and get an infeasible problem, the solver returns an exitflag of -2.
The objective function vector obj in intlinprog consists of the coefficients of the variables x(p, f , w)
and y(s, w). So there are naturally P*F*W + S*W coefficients in obj.
One way to generate the coefficients is to begin with a P-by-F-by-W array obj1 for the x
coefficients, and an S-by-W array obj2 for the y(s, w) coefficients. Then convert these arrays to two
vectors and combine them into obj by calling
obj = [obj1(:);obj2(:)];
obj1 = zeros(P,F,W); % Allocate arrays
obj2 = zeros(S,W);
Throughout the generation of objective and constraint vectors and matrices, we generate the (p, f , w)
array or the (s, w) array, and then convert the result to a vector.
To begin generating the inputs, generate the distance arrays distfw(i,j) and distsw(i,j).
distfw = zeros(F,W); % Allocate matrix for factory-warehouse distances
for ii = 1:F
for jj = 1:W
distfw(ii,jj) = abs(xloc(ii) - xloc(F + jj)) + abs(yloc(ii) ...
- yloc(F + jj));
end
end
distsw = zeros(S,W); % Allocate matrix for sales outlet-warehouse distances
for ii = 1:S
for jj = 1:W
distsw(ii,jj) = abs(xloc(F + W + ii) - xloc(F + jj)) ...
+ abs(yloc(F + W + ii) - yloc(F + jj));
end
end
for ii = 1:P
for jj = 1:F
for kk = 1:W
obj1(ii,jj,kk) = pcost(jj,ii) + tcost(ii)*distfw(jj,kk);
end
end
end
for ii = 1:S
for jj = 1:W
obj2(ii,jj) = distsw(ii,jj)*sum(d(ii,:).*tcost);
end
end
The width of each linear constraint matrix is the length of the obj vector.
matwid = length(obj);
There are two types of linear inequalities: the production capacity constraints, and the warehouse
capacity constraints.
There are P*F production capacity constraints, and W warehouse capacity constraints. The constraint
matrices are quite sparse, on the order of 1% nonzero, so save memory by using sparse matrices.
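To see what sparse storage buys, the sketch below (Python with SciPy, toy sizes) builds a mostly-zero constraint matrix and compares the number of stored values against the dense array:

```python
import numpy as np
from scipy import sparse

# A constraint matrix that is almost entirely zeros
dense = np.zeros((100, 2000))
dense[0, :20] = 1                      # 20 nonzeros out of 200,000 entries

A = sparse.csr_matrix(dense)           # store only the nonzeros

assert A.nnz == 20
# CSR keeps data + column indices + row pointers, far fewer than 200,000 slots
assert A.data.size + A.indices.size + A.indptr.size < dense.size
```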
end
for ii = 1:W
xtemp = clearer2;
xtemp(:,ii) = vj;
xtemp = sparse([clearer12;xtemp(:)]); % Convert to sparse
Aineq(counter,:) = xtemp'; % Fill in the row
bineq(counter) = wcap(ii);
counter = counter + 1;
end
There are two types of linear equality constraints: the constraint that demand is met, and the
constraint that each sales outlet corresponds to one warehouse.
counter = 1;
% Demand is satisfied:
for ii = 1:P
for jj = 1:W
xtemp = clearer1;
xtemp(ii,:,jj) = 1;
xtemp2 = clearer2;
xtemp2(:,jj) = -d(:,ii);
xtemp = sparse([xtemp(:);xtemp2(:)]'); % Change to sparse row
Aeq(counter,:) = xtemp; % Fill in row
counter = counter + 1;
end
end
intcon = P*F*W+1:length(obj);
lb = zeros(length(obj),1);
ub = Inf(length(obj),1);
ub(P*F*W+1:end) = 1;
Turn off iterative display so that you don't get hundreds of lines of output. Include a plot function to
monitor the solution progress.
opts = optimoptions('intlinprog','Display','off','PlotFcn',@optimplotmilp);
You generated all the solver inputs. Call the solver to find the solution.
[solution,fval,exitflag,output] = intlinprog(obj,intcon,...
Aineq,bineq,Aeq,beq,lb,ub,opts);
exitflag
exitflag = 1
Check the feasibility of the solution with respect to the linear inequality constraints.

infeas1 = max(Aineq*solution - bineq)

infeas1 = 8.2991e-12
Check the linear equality constraints.

infeas2 = max(abs(Aeq*solution - beq))

infeas2 = 1.6428e-11
Check that the integer components are really integers, or are close enough that it is reasonable to
round them. To understand why these variables might not be exactly integers, see “Some “Integer”
Solutions Are Not Integers” on page 9-36.
diffint = norm(solution(intcon) - round(solution(intcon)),Inf)
diffint = 1.1990e-13
Some integer variables are not exactly integers, but all are very close. So round the integer variables.
solution(intcon) = round(solution(intcon));
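The check-then-round pattern is worth keeping in mind whenever a MILP solver returns "integer" values with floating-point noise. A Python sketch of the same pattern (made-up values on the order of the example's 1e-13 deviation):

```python
import numpy as np

# Solver-returned "integers" can carry tiny floating-point error
solution = np.array([1.0 + 1.2e-13, -3e-14, 1.0, 0.0])

# Worst-case deviation from the nearest integer (infinity norm)
diffint = np.linalg.norm(solution - np.round(solution), np.inf)

assert diffint < 1e-6                  # far below any reasonable tolerance
solution = np.round(solution)          # so rounding is safe
assert all(v in (0.0, 1.0) for v in solution)
```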
Check the feasibility of the rounded solution, and the change in objective function value.
infeas1 = max(Aineq*solution - bineq)
infeas1 = 8.2991e-12
infeas2 = max(abs(Aeq*solution - beq))

infeas2 = 5.8435e-11
diffrounding = 2.2352e-08
You can examine the solution most easily by reshaping it back to its original dimensions.
solution1 = solution(1:P*F*W); % The continuous variables
solution2 = solution(intcon); % The integer variables
solution1 = reshape(solution1,P,F,W);
solution2 = reshape(solution2,S,W);
For example, how many sales outlets are associated with each warehouse? Notice that, in this case,
some warehouses have 0 associated outlets, meaning the warehouses are not in use in the optimal
solution.
outlets = sum(solution2,1) % Sum over the sales outlets
outlets = 1×20
3 0 3 2 2 2 3 2 3 1 1 0 0 3 4 3
Plot the connection between each sales outlet and its warehouse.
figure(h);
hold on
for ii = 1:S
jj = find(solution2(ii,:)); % Index of warehouse associated with ii
xsales = xloc(F+W+ii); ysales = yloc(F+W+ii);
xwarehouse = xloc(F+jj); ywarehouse = yloc(F+jj);
if rand(1) < .5 % Draw y direction first half the time
plot([xsales,xsales,xwarehouse],[ysales,ywarehouse,ywarehouse],'g--')
See Also
More About
• “Factory, Warehouse, Sales Allocation Model: Problem-Based”
Traveling Salesman Problem: Solver-Based
Problem Formulation
Formulate the traveling salesman problem for integer linear programming as follows:
Generate Stops
Generate random stops inside a crude polygonal representation of the continental U.S.
load('usborder.mat','x','y','xx','yy');
rng(3,'twister') % Makes a plot with stops in Maine & Florida, and is reproducible
nStops = 200; % You can use any number, but the problem size scales as N^2
stopsLon = zeros(nStops,1); % Allocate x-coordinates of nStops
stopsLat = stopsLon; % Allocate y-coordinates
n = 1;
while (n <= nStops)
xp = rand*1.5;
yp = rand;
if inpolygon(xp,yp,x,y) % Test if inside the border
stopsLon(n) = xp;
stopsLat(n) = yp;
n = n+1;
end
end
Because there are 200 stops, there are 19,900 trips, meaning 19,900 binary variables (# variables =
200 choose 2).
idxs = nchoosek(1:nStops,2);
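The trip count is the number of unordered pairs of stops. A quick Python check of the arithmetic:

```python
from itertools import combinations
from math import comb

nStops = 200
# MATLAB's nchoosek(1:nStops,2): all unordered pairs of stop indices
idxs = list(combinations(range(1, nStops + 1), 2))

assert len(idxs) == comb(nStops, 2) == nStops * (nStops - 1) // 2 == 19900
assert idxs[0] == (1, 2)
```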
Calculate all the trip distances, assuming that the earth is flat in order to use the Pythagorean rule.

dist = hypot(stopsLat(idxs(:,1)) - stopsLat(idxs(:,2)), ...
             stopsLon(idxs(:,1)) - stopsLon(idxs(:,2)));
lendist = length(dist);

With this definition of the dist vector, the length of a tour is
dist'*x_tsp
where x_tsp is the binary solution vector. This is the distance of a tour that you try to minimize.
Represent the problem as a graph. Create a graph where the stops are nodes and the trips are edges.
G = graph(idxs(:,1),idxs(:,2));
Display the stops using a graph plot. Plot the nodes without the graph edges.
figure
hGraph = plot(G,'XData',stopsLon,'YData',stopsLat,'LineStyle','none','NodeLabel',{});
hold on
% Draw the outside border
plot(x,y,'r-')
hold off
Constraints
Create the linear constraints that each stop has two associated trips, because there must be a trip to
each stop and a trip departing each stop.
Binary Bounds
All decision variables are binary. Set the intcon argument to the indices of all decision variables, put a lower bound of 0 on each, and an upper bound of 1.
intcon = 1:lendist;
lb = zeros(lendist,1);
ub = ones(lendist,1);
The problem is ready for solution. To suppress iterative output, turn off the default display.
opts = optimoptions('intlinprog','Display','off');
[x_tsp,costopt,exitflag,output] = intlinprog(dist,intcon,[],[],Aeq,beq,lb,ub,opts);
Create a new graph with the solution trips as edges. To do so, round the solution in case some values
are not exactly integers, and convert the resulting values to logical.
x_tsp = logical(round(x_tsp));
Gsol = graph(idxs(x_tsp,1),idxs(x_tsp,2));
Visualize Solution
hold on
highlight(hGraph,Gsol,'LineStyle','-')
title('Solution with Subtours')
As can be seen on the map, the solution has several subtours. The constraints specified so far do not
prevent these subtours from happening. In order to prevent any possible subtour from happening,
you would need an incredibly large number of inequality constraints.
Subtour Constraints
Because you can't add all of the subtour constraints, take an iterative approach. Detect the subtours
in the current solution, then add inequality constraints to prevent those particular subtours from
happening. By doing this, you find a suitable tour in a few iterations.
Eliminate subtours with inequality constraints. For example, if a subtour contains five points, then five lines connect those points to create the subtour. Eliminate this subtour by adding an inequality constraint stating that at most four lines can connect these five points.
Even more, find all lines between these five points, and constrain the solution not to have more than
four of these lines present. This is a correct constraint because if five or more of the lines existed in a
solution, then the solution would have a subtour (a graph with n nodes and n edges always contains a
cycle).
Detect the subtours by identifying the connected components in Gsol, the graph built with the edges in the current solution. conncomp returns a vector with the number of the subtour to which each node belongs.
tourIdxs = conncomp(Gsol);
numtours = max(tourIdxs); % number of subtours
fprintf('# of subtours: %d\n',numtours);
# of subtours: 27
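conncomp labels each node of the graph with the index of the connected component (subtour) containing it. The same computation can be sketched outside MATLAB with a plain breadth-first search (a toy six-node graph, not this example's 200-stop solution):

```python
from collections import defaultdict, deque

def subtour_labels(n_nodes, edges):
    """Label nodes 1..n by connected component, as conncomp does."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    labels, comp = {}, 0
    for start in range(1, n_nodes + 1):
        if start in labels:
            continue
        comp += 1                      # found a new component
        labels[start] = comp
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in labels:
                    labels[v] = comp
                    queue.append(v)
    return [labels[i] for i in range(1, n_nodes + 1)], comp

# Two disjoint 3-cycles form two subtours
edges = [(1, 2), (2, 3), (3, 1), (4, 5), (5, 6), (6, 4)]
labels, numtours = subtour_labels(6, edges)
assert numtours == 2
assert labels == [1, 1, 1, 2, 2, 2]
```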
Include the linear inequality constraints to eliminate subtours, and repeatedly call the solver, until
just one subtour remains.
% Visualize result
hGraph.LineStyle = 'none'; % Remove the previous highlighted path
highlight(hGraph,Gsol,'LineStyle','-')
drawnow
# of subtours: 20
# of subtours: 7
# of subtours: 9
# of subtours: 9
# of subtours: 3
# of subtours: 2
# of subtours: 7
# of subtours: 2
# of subtours: 1
Solution Quality
The solution represents a feasible tour, because it is a single closed loop. But is it a minimal-cost
tour? One way to find out is to examine the output structure.
disp(output.absolutegap)
The smallness of the absolute gap implies that the solution is either optimal or has a total length that
is close to optimal.
See Also
More About
• “Traveling Salesman Problem: Problem-Based”
Optimal Dispatch of Power Generators: Solver-Based
For the problem-based approach to this problem, see “Optimal Dispatch of Power Generators:
Problem-Based”.
Problem Definition
The electricity market has different prices at different times of day. If you have generators, you can
take advantage of this variable pricing by scheduling your generators to operate when prices are
high. Suppose that there are two generators that you control. Each generator has three power levels
(off, low, and high). Each generator has a specified rate of fuel consumption and power production at
each power level. Of course, fuel consumption is 0 when the generator is off.
You can assign a power level to each generator during each half-hour time interval during a day (24
hours, so 48 intervals). Based on historical records, you can assume that you know the revenue per
megawatt-hour (MWh) that you get in each time interval. The data for this example is from the
Australian Energy Market Operator https://www.nemweb.com.au/REPORTS/CURRENT/ in
mid-2013, and is used under their terms https://www.aemo.com.au/About-AEMO/Legal-
Notices/Copyright-Permissions.
There is a cost to start a generator after it has been off. The other constraint is a maximum fuel usage for the day. The maximum fuel constraint exists because you buy your fuel a day ahead of time, so you can use only what you just bought.
You can formulate the scheduling problem as a binary integer programming problem as follows.
Define indexes i, j, and k, and a binary scheduling vector y as:
You need to determine when a generator starts after being off. Let
Obviously, you need a way to set z automatically based on the settings of y. A linear constraint below
handles this setting.
You also need the parameters of the problem for costs, generation levels for each generator,
consumption levels of the generators, and fuel available.
You got poolPrice when you executed load dispatchPrice;. Set the other parameters as
follows.
fuelPrice = 3;
totalfuel = 3.95e4;
nPeriods = length(poolPrice); % 48 periods
nGens = 2; % Two generators
gen = [61,152;50,150]; % Generator 1 low = 61 MW, high = 152 MW
fuel = [427,806;325,765]; % Fuel consumption for generator 2 is low = 325, high = 765
startCost = 1e4; % Cost to start a generator after it has been off
Generator Efficiency
Examine the efficiency of the two generators at their two operating points.
Notice that generator 2 is a bit more efficient than generator 1 at its corresponding operating points
(low or high), but generator 1 at its high operating point is more efficient than generator 2 at its low
operating point.
To set up the problem, you need to encode all the problem data and constraints in the form that the
intlinprog solver requires. You have variables y(i,j,k) that represent the solution of the
problem, and z(i,j) auxiliary variables for charging to turn on a generator. y is an nPeriods-by-
nGens-by-2 array, and z is an nPeriods-by-nGens array.
To put these variables in one long vector, define the variable of unknowns x:
x = [y(:);z(:)];
For bounds and linear constraints, it is easiest to use the natural array formulation of y and z, then
convert the constraints to the total decision variable, the vector x.
Bounds
The solution vector x consists of binary variables. Set up the bounds lb and ub.

lby = zeros(nPeriods,nGens,2); % Binary variables have lower bound 0
lbz = zeros(nPeriods,nGens);
lb = [lby(:);lbz(:)]; % Column vector lower bound
ub = ones(size(lb)); % Binary variables have upper bound 1
Linear Constraints
For linear constraints A*x <= b, the number of columns in the A matrix must be the same as the
length of x, which is the same as the length of lb. To create rows of A of the appropriate size, create
zero matrices of the sizes of the y and z matrices.
cleary = zeros(nPeriods,nGens,2);
clearz = zeros(nPeriods,nGens);
To ensure that the power level has no more than one component equal to 1, set a linear inequality
constraint:
The running cost per period is the cost for fuel for that period. For generator j operating at level k,
the cost is fuelPrice * fuel(j,k).
To ensure that the generators do not use too much fuel, create an inequality constraint on the sum of
fuel usage.
addrow = [yFuel(:);clearz(:)]';
A = [A;sparse(addrow)];
b = [b;totalfuel]; % A*x <= b means the total fuel usage is <= totalfuel
How can you get the solver to set the z variables automatically to match the active/off periods that
the y variables represent? Recall that the condition to satisfy is z(i,j) = 1 exactly when
Notice that
in the problem formulation, and include the z variables in the objective function cost. By including
the z variables in the objective function, the solver attempts to lower the values of the z variables,
meaning it tries to set them all equal to 0. But for those intervals when a generator turns on, the
linear inequality forces the z(i,j) to equal 1.
Add extra rows to the linear inequality constraint matrix A to represent these new inequalities. Wrap
around the time so that interval 1 logically follows interval 48.
tempA = spalloc(nPeriods*nGens,length(lb),2*nPeriods*nGens);
counter = 1;
for ii = 1:nPeriods
for jj = 1:nGens
temp = cleary;
tempy = clearz;
temp(ii,jj,1) = -1;
temp(ii,jj,2) = -1;
if ii < nPeriods % Intervals 1 to 47
temp(ii+1,jj,1) = 1;
temp(ii+1,jj,2) = 1;
else % Interval 1 follows interval 48
temp(1,jj,1) = 1;
temp(1,jj,2) = 1;
end
tempy(ii,jj) = -1;
temp = [temp(:);tempy(:)]'; % Row vector for inclusion in tempA matrix
tempA(counter,:) = sparse(temp);
counter = counter + 1;
end
end
A = [A;tempA];
b = [b;zeros(nPeriods*nGens,1)]; % A*x <= b sets z(i,j) = 1 at generator startup
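These rows encode z(i,j) ≥ y_on(i+1,j) − y_on(i,j): z must be 1 whenever generator j is off in period i and on in the following period, with period 1 wrapping after the last period. A Python sketch (toy 8-period on/off schedule, not the example's data) computing the intended z values directly:

```python
# z[i] should be 1 exactly when the generator is off in period i and on in
# period i+1, with the schedule wrapping around (period 1 follows the last).
def startup_indicators(on):
    n = len(on)
    return [1 if on[i] == 0 and on[(i + 1) % n] == 1 else 0 for i in range(n)]

on = [0, 0, 1, 1, 1, 0, 0, 1]          # toy schedule: two separate runs
z = startup_indicators(on)

assert z == [0, 1, 0, 0, 0, 0, 1, 0]
assert sum(z) == 2                     # two startups incur two startup costs
```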
Sparsity of Constraints
If you have a large problem, using sparse constraint matrices saves memory, and can save
computational time as well. The constraint matrix A is quite sparse:
filledfraction = nnz(A)/numel(A)
filledfraction = 0.0155
intlinprog accepts sparse linear constraint matrices A and Aeq, but requires their corresponding
vector constraints b and beq to be full.
Define Objective
The objective function includes fuel costs for running the generators, revenue from running the
generators, and costs for starting the generators.
generatorlevel = lby; % Generation in MW, start with 0s
generatorlevel(:,1,1) = gen(1,1); % Fill in the levels
generatorlevel(:,1,2) = gen(1,2);
generatorlevel(:,2,1) = gen(2,1);
generatorlevel(:,2,2) = gen(2,2);
revenue = zeros(size(generatorlevel));
for ii = 1:nPeriods
revenue(ii,:,:) = poolPrice(ii)*generatorlevel(ii,:,:);
end
fuelCost = yFuel*fuelPrice;
options = optimoptions('intlinprog','Display','final');
[x,fval,eflag,output] = intlinprog(-f,1:length(f),A,b,[],[],lb,ub,options);
Intlinprog stopped because the objective value is within a gap tolerance of the
optimal value, options.AbsoluteGapTolerance = 0 (the default value). The intcon
variables are integer within tolerance, options.IntegerTolerance = 1e-05 (the
default value).
The easiest way to examine the solution is to divide the solution vector x into its two components, y and z.
ysolution = x(1:nPeriods*nGens*2);
zsolution = x(nPeriods*nGens*2+1:end);
ysolution = reshape(ysolution,[nPeriods,nGens,2]);
zsolution = reshape(zsolution,[nPeriods,nGens]);
subplot(3,1,1)
bar(ysolution(:,1,1)*gen(1,1)+ysolution(:,1,2)*gen(1,2),.5,'g')
xlim([.5,48.5])
ylabel('MWh')
title('Generator 1 optimal schedule','FontWeight','bold')
subplot(3,1,2)
bar(ysolution(:,2,1)*gen(2,1)+ysolution(:,2,2)*gen(2,2),.5,'c')
title('Generator 2 optimal schedule','FontWeight','bold')
xlim([.5,48.5])
ylabel('MWh')
subplot(3,1,3)
bar(poolPrice,.5)
xlim([.5,48.5])
title('Energy price','FontWeight','bold')
xlabel('Period')
ylabel('$ / MWh')
Generator 2 runs longer than generator 1, which you would expect because it is more efficient.
Generator 2 runs at its high power level whenever it is on. Generator 1 runs mainly at its high power
level, but dips down to low power for one time unit. Each generator runs for one contiguous set of
periods daily, so incurs only one startup cost.
Check that the z variable is 1 for the periods when the generators start.
theperiod = 2×1
23
16
thegenerator = 2×1
1
2
If you choose a small value of startCost, the solution involves multiple generation periods.
Intlinprog stopped because the objective value is within a gap tolerance of the
optimal value, options.AbsoluteGapTolerance = 0 (the default value). The intcon
variables are integer within tolerance, options.IntegerTolerance = 1e-05 (the
default value).
ysolutionnew = xnew(1:nPeriods*nGens*2);
zsolutionnew = xnew(nPeriods*nGens*2+1:end);
ysolutionnew = reshape(ysolutionnew,[nPeriods,nGens,2]);
zsolutionnew = reshape(zsolutionnew,[nPeriods,nGens]);
subplot(3,1,1)
bar(ysolutionnew(:,1,1)*gen(1,1)+ysolutionnew(:,1,2)*gen(1,2),.5,'g')
xlim([.5,48.5])
ylabel('MWh')
title('Generator 1 optimal schedule','FontWeight','bold')
subplot(3,1,2)
bar(ysolutionnew(:,2,1)*gen(2,1)+ysolutionnew(:,2,2)*gen(2,2),.5,'c')
title('Generator 2 optimal schedule','FontWeight','bold')
xlim([.5,48.5])
ylabel('MWh')
subplot(3,1,3)
bar(poolPrice,.5)
xlim([.5,48.5])
title('Energy price','FontWeight','bold')
xlabel('Period')
ylabel('$ / MWh')
theperiod = 3×1
22
16
45
thegenerator = 3×1
1
2
2
See Also
More About
• “Optimal Dispatch of Power Generators: Problem-Based”
Mixed-Integer Quadratic Programming Portfolio Optimization: Solver-Based
Problem Outline
As Markowitz showed ("Portfolio Selection," J. Finance Volume 7, Issue 1, pp. 77-91, March 1952),
you can express many portfolio optimization problems as quadratic programming problems. Suppose
that you have a set of N assets and want to choose a portfolio, with x(i) being the fraction of your
investment that is in asset i. If you know the vector r of mean returns of each asset, and the
covariance matrix Q of the returns, then for a given level of risk-aversion λ you maximize the risk-
adjusted expected return:
The quadprog solver addresses this quadratic programming problem. However, in addition to the
plain quadratic programming problem, you might want to restrict a portfolio in a variety of ways,
such as:
You cannot include these constraints in quadprog. The difficulty is the discrete nature of the
constraints. Furthermore, while the mixed-integer linear programming solver intlinprog does
handle discrete constraints, it does not address quadratic objective functions.
This example constructs a sequence of MILP problems that satisfy the constraints, and that
increasingly approximate the quadratic objective function. While this technique works for this
example, it might not apply to other problems or constraint types.
x is the vector of asset allocation fractions, with 0 ≤ x(i) ≤ 1 for each i. To model the number of assets
in the portfolio, you need indicator variables v such that v(i) = 0 when x(i) = 0, and v(i) = 1 when
x(i) > 0. To get variables that satisfy this restriction, set the v vector to be a binary variable, and
impose the linear constraints
These inequalities both enforce that x(i) and v(i) are zero at exactly the same time, and they also
enforce that fmin ≤ x(i) ≤ fmax whenever x(i) > 0.
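You can check the effect of this pair of inequalities directly. The sketch below (Python, using the fmin and fmax values set later in this example) confirms that v(i) = 0 forces x(i) = 0, while v(i) = 1 confines x(i) to [fmin, fmax]:

```python
fmin, fmax = 0.001, 0.05

def feasible(x, v):
    """The semicontinuous pair: fmin*v <= x and x <= fmax*v, for binary v."""
    return fmin * v <= x <= fmax * v

assert feasible(0.0, 0)                # v = 0 with x = 0: feasible
assert not feasible(0.02, 0)           # v = 0 forbids any nonzero x
assert feasible(0.02, 1)               # allocation inside [fmin, fmax]
assert not feasible(0.0005, 1)         # below the minimum nonzero fraction
assert not feasible(0.08, 1)           # above the maximum fraction
```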
Also, to enforce the constraints on the number of assets in the portfolio, impose the linear constraints
m ≤ ∑_i v(i) ≤ M.
As first formulated, you try to maximize the objective function. However, all Optimization Toolbox™
solvers minimize. So formulate the problem as minimizing the negative of the objective:
min_x  λxᵀQx − rᵀx.
This objective function is nonlinear. The intlinprog MILP solver requires a linear objective
function. There is a standard technique to reformulate this problem into one with linear objective and
nonlinear constraints. Introduce a slack variable z to represent the quadratic term.
min_{x,z}  λz − rᵀx  such that  xᵀQx − z ≤ 0,  z ≥ 0.
As you iteratively solve MILP approximations, you include new linear constraints, each of which
approximates the nonlinear constraint locally near the current point. In particular, for x = x₀ + δ, where x₀ is a constant vector and δ is a variable vector, the first-order Taylor approximation to the constraint is
xᵀQx − z = x₀ᵀQx₀ + 2x₀ᵀQδ − z + O(‖δ‖²).
Replacing δ by x − x₀ gives
xᵀQx − z = −x₀ᵀQx₀ + 2x₀ᵀQx − z + O(‖x − x₀‖²).
For each intermediate solution xₖ you introduce a new linear constraint in x and z as the linear part of the expression above:
This has the form Ax ≤ b, where A = 2xₖᵀQ, there is a −1 multiplier for the z term, and b = xₖᵀQxₖ.
This method of adding new linear constraints to the problem is called a cutting plane method. For
details, see J. E. Kelley, Jr. "The Cutting-Plane Method for Solving Convex Programs." J. Soc. Indust.
Appl. Math. Vol. 8, No. 4, pp. 703-712, December, 1960.
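To see why such cuts are safe, consider the one-variable case x² ≤ z. Linearizing at a point xₖ gives the cut 2xₖx − z ≤ xₖ². The Python sketch below (illustrative points only) checks that each cut is tight at its linearization point and never excludes a genuinely feasible point, since x² − 2xₖx + xₖ² = (x − xₖ)² ≥ 0:

```python
import itertools

# Kelley-style cut for the scalar constraint x**2 - z <= 0,
# linearized at x_k: 2*x_k*x - z <= x_k**2
def cut_lhs(xk, x, z):
    return 2 * xk * x - z - xk**2

# Tangency: at x = x_k with z = x_k**2, the cut holds with equality
for xk in (-2.0, 0.5, 3.0):
    assert abs(cut_lhs(xk, xk, xk**2)) < 1e-12

# Validity: every point on the true boundary z = x**2 satisfies every cut
for xk, x in itertools.product((-2.0, 0.5, 3.0), (-1.0, 0.0, 2.5)):
    assert cut_lhs(xk, x, x**2) <= 1e-12
```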
To express problems for the intlinprog solver, you need to do the following:
Have the first N variables represent the x vector, the next N variables represent the binary v vector,
and the final variable represent the z slack variable. There are 2N + 1 variables in the problem.
Load the data for the problem. This data has 225 expected returns in the vector r and the covariance
of the returns in the 225-by-225 matrix Q. The data is the same as in the Using Quadratic
Programming on Portfolio Optimization Problems example.
load port5
r = mean_return;
Q = Correlation .* (stdDev_return * stdDev_return');
Set index vectors for the three blocks of variables: xvars for the x variables, vvars for the binary v variables, and zvar for the slack variable z. The lower bounds of all the 2N+1 variables in the problem are zero. The upper bounds of the first 2N variables are one, and the last variable has no upper bound.

N = length(r);
xvars = 1:N;
vvars = N+1:2*N;
zvar = 2*N+1;
lb = zeros(2*N+1,1);
ub = ones(2*N+1,1);
ub(zvar) = Inf;
Set the number of assets in the solution to be between 100 and 150. Incorporate this constraint, namely

m ≤ ∑_i v(i) ≤ M,

into the problem as two linear inequalities of the form A*x ≤ b:

∑_i v(i) ≤ M
∑_i −v(i) ≤ −m.
M = 150;
m = 100;
A = zeros(1,2*N+1); % Allocate A matrix
A(vvars) = 1; % A*x represents the sum of the v(i)
A = [A;-A];
b = zeros(2,1); % Allocate b vector
b(1) = M;
b(2) = -m;
Include semicontinuous constraints. Take the minimal nonzero fraction of assets to be 0.001 for each
asset type, and the maximal fraction to be 0.05.
fmin = 0.001;
fmax = 0.05;
Include the inequalities x(i) ≤ fmax*v(i) and fmin*v(i) ≤ x(i) as linear inequalities.
Atemp = eye(N);
Amax = horzcat(Atemp,-Atemp*fmax,zeros(N,1));
A = [A;Amax];
b = [b;zeros(N,1)];
Amin = horzcat(-Atemp,Atemp*fmin,zeros(N,1));
A = [A;Amin];
b = [b;zeros(N,1)];
Set the risk-aversion coefficient λ.

lambda = 100;
Define the objective function λz − r T x as a vector. Include zeros for the multipliers of the v variables.
f = [-r;zeros(N,1);lambda];
To solve the problem iteratively, begin by solving the problem with the current constraints, which do
not yet reflect any linearization. The integer constraints are in the vvars vector.
Prepare a stopping condition for the iterations: stop when the slack variable z is within 0.01% of the
true quadratic value. Set tighter tolerances than default to help ensure that the problem remains
strictly feasible as constraints accumulate.
thediff = 1e-4;
iter = 1; % iteration counter
assets = xLinInt(xvars); % the x variables
truequadratic = assets'*Q*assets;
zslack = xLinInt(zvar); % slack variable value
options = optimoptions(options,'LPOptimalityTolerance',1e-10,'RelativeGapTolerance',1e-8,...
'ConstraintTolerance',1e-9,'IntegerTolerance',1e-6);
Keep a history of the computed true quadratic and slack variables for plotting.
history = [truequadratic,zslack];
Compute the quadratic and slack values. If they differ, then add another linear constraint and solve
again.
In toolbox syntax, each new linear constraint Ax ≤ b comes from the linear approximation above: the new row of A is 2xₖᵀQ, the z term carries a −1 coefficient in A, and the new element of b is xₖᵀQxₖ.
After you find a new solution, use a linear constraint halfway between the old and new solutions. This
heuristic way of including linear constraints can be faster than simply taking the new solution. To use
the solution instead of the halfway heuristic, comment the "Midway" line below, and uncomment the
following one.
Plot the history of the slack variable and the quadratic part of the objective function to see how they
converged.
plot(history)
legend('Quadratic','Slack')
xlabel('Iteration number')
title('Quadratic and linear approximation (slack)')
What is the quality of the MILP solution? The output structure contains that information. Examine
the absolute gap between the internally-calculated bounds on the objective at the solution.
disp(output.absolutegap)
The absolute gap is zero, indicating that the MILP solution is accurate.
Plot the optimal allocation. Use xLinInt(xvars), not assets, because assets might not satisfy
the constraints when using the midway update.
bar(xLinInt(xvars))
grid on
xlabel('Asset index')
ylabel('Proportion of investment')
title('Optimal Asset Allocation')
You can easily see that all nonzero asset allocations are between the semicontinuous bounds fmin = 0.001 and fmax = 0.05.
How many nonzero assets are there? The constraint is that there are between 100 and 150 nonzero
assets.
sum(xLinInt(vvars))
ans = 100
What is the expected return for this allocation, and the value of the risk-adjusted return?
More elaborate analyses are possible by using features specifically designed for portfolio optimization
in Financial Toolbox™. For an example that shows how to use the Portfolio class to directly handle
semicontinuous and cardinality constraints, see “Portfolio Optimization with Semicontinuous and
Cardinality Constraints” (Financial Toolbox).
See Also
More About
• “Mixed-Integer Quadratic Programming Portfolio Optimization: Problem-Based”
You probably have seen Sudoku puzzles. The puzzle is to fill a 9-by-9 grid with integers from 1 through 9 so that each integer appears only once in each row, column, and major 3-by-3 square. The grid is partially populated with clues, and your task is to fill in the rest of the grid.
Initial Puzzle
Here is a data matrix B of clues. The first row, B(1,2,2), means row 1, column 2 has a clue 2. The
second row, B(1,5,3), means row 1, column 5 has a clue 3. Here is the entire matrix B.
B = [1,2,2;
1,5,3;
1,8,4;
2,1,6;
2,9,3;
3,3,4;
3,7,5;
4,4,8;
4,6,6;
5,1,8;
5,5,1;
5,9,6;
6,4,7;
6,6,5;
7,3,7;
7,7,6;
8,1,4;
8,9,8;
9,2,3;
9,5,4;
9,8,2];
drawSudoku(B) % For the listing of this program, see the end of this example.
Solve Sudoku Puzzles Via Integer Programming: Solver-Based
This puzzle, and an alternative MATLAB® solution technique, was featured in Cleve's Corner in 2009.
There are many approaches to solving Sudoku puzzles manually, as well as many programmatic
approaches. This example shows a straightforward approach using binary integer programming.
This approach is particularly simple because you do not give a solution algorithm. Just express the
rules of Sudoku, express the clues as constraints on the solution, and then intlinprog produces the
solution.
The key idea is to transform a puzzle from a square 9-by-9 grid to a cubic 9-by-9-by-9 array of binary
values (0 or 1). Think of the cubic array as being 9 square grids stacked on top of each other. The top
grid, a square layer of the array, has a 1 wherever the solution or clue has a 1. The second layer has a
1 wherever the solution or clue has a 2. The ninth layer has a 1 wherever the solution or clue has a 9.
The objective function is not needed here, and might as well be 0. The problem is really just to find a feasible solution, meaning one that satisfies all the constraints. However, to break ties in the internals of the integer programming solver and thereby increase solution speed, use a nonconstant objective function.
Suppose a solution x is represented in a 9-by-9-by-9 binary array. What properties does x have? First,
each square in the 2-D grid (i,j) has exactly one value, so there is exactly one nonzero element among
the 3-D array entries x(i,j,1), ..., x(i,j,9). In other words, for every i and j,

∑_{k=1}^{9} x(i,j,k) = 1.
Similarly, in each row i of the 2-D grid, there is exactly one value out of each of the digits from 1 to 9.
In other words, for each i and k,
∑_{j=1}^{9} x(i,j,k) = 1.
And each column j in the 2-D grid has the same property: for each j and k,
∑_{i=1}^{9} x(i,j,k) = 1.
The major 3-by-3 grids have a similar constraint. For the grid elements 1 ≤ i ≤ 3 and 1 ≤ j ≤ 3, and
for each 1 ≤ k ≤ 9,
∑_{i=1}^{3} ∑_{j=1}^{3} x(i,j,k) = 1.
To represent all nine major grids, just add 3 or 6 to each i and j index:
∑_{i=1}^{3} ∑_{j=1}^{3} x(i+U, j+V, k) = 1, where U, V ∈ {0, 3, 6}.
Express Clues
Each initial value (clue) can be expressed as a constraint. Suppose that the (i,j) clue is m for some 1 ≤ m ≤ 9. Then x(i,j,m) = 1. The constraint ∑_{k=1}^{9} x(i,j,k) = 1 ensures that all other x(i,j,k) = 0 for k ≠ m.
Although the Sudoku rules are conveniently expressed in terms of a 9-by-9-by-9 solution array x, linear constraints are given in terms of the solution vector x(:). Therefore, when you write a Sudoku program, you have to use constraint matrices derived from 9-by-9-by-9 initial arrays.
Here is one approach to set up Sudoku rules, and also include the clues as constraints. The
sudokuEngine file comes with your software.
type sudokuEngine
counter = 1;
for j = 1:9 % one of each digit in each column
for k = 1:9
Astuff = lb; % clear Astuff
Astuff(1:end,j,k) = 1; % one row in Aeq*x = beq
Aeq(counter,:) = Astuff(:)'; % put Astuff in a row of Aeq
counter = counter + 1;
end
end
Astuff = lb;
Astuff(U+(1:3),V+(1:3),k) = 1;
Aeq(counter,:) = Astuff(:)';
counter = counter + 1;
end
end
end
for i = 1:size(B,1)
lb(B(i,1),B(i,2),B(i,3)) = 1;
end
intcon = 1:N;
[x,~,eflag] = intlinprog(f,intcon,[],[],Aeq,beq,lb,ub);
Intlinprog stopped at the root node because the objective value is within a gap
tolerance of the optimal value, options.AbsoluteGapTolerance = 0 (the default
value). The intcon variables are integer within tolerance,
options.IntegerTolerance = 1e-05 (the default value).
drawSudoku(S)
type drawSudoku
function drawSudoku(B)
% Function for drawing the Sudoku board
if size(B,2) == 9 % 9 columns
[SM,SN] = meshgrid(1:9); % make i,j entries
B = [SN(:),SM(:),B(:)]; % i,j,k rows
end
for ii = 1:size(B,1)
text(B(ii,2)-0.5,9.5-B(ii,1),num2str(B(ii,3)))
end
hold off
end
See Also
More About
• “Solve Sudoku Puzzles Via Integer Programming: Problem-Based”
Office Assignments by Binary Integer Programming: Solver-Based
You want to assign six people, Marcelo, Rakesh, Peter, Tom, Marjorie, and Mary Ann, to seven offices.
Each office can have no more than one person, and each person gets exactly one office. So there will
be one empty office. People can give preferences for the offices, and their preferences are considered
based on their seniority. The longer they have been at MathWorks, the higher their seniority. Some offices have windows, some do not, and one window is smaller than the others. Additionally, Peter and Tom often work together, so they should be in adjacent offices. Marcelo and Rakesh also often work together, and should be in adjacent offices.
Office Layout
Offices 1, 2, 3, and 4 are inside offices (no windows). Offices 5, 6, and 7 have windows, but the
window in office 5 is smaller than the other two. Here is how the offices are arranged.
Problem Formulation
You need to formulate the problem mathematically. First, choose what each element of your solution
variable x represents in the problem. Since this is a binary integer problem, a good choice is that
each element represents a person assigned to an office. If the person is assigned to the office, the
variable has value 1. If the person is not assigned to the office, the variable has value 0. Number
people as follows:
1 Mary Ann
2 Marjorie
3 Tom
4 Peter
5 Marcelo
6 Rakesh
x is a vector. The elements x(1) to x(7) correspond to Mary Ann being assigned to office 1, office 2,
etc., to office 7. The next seven elements correspond to Marjorie being assigned to the seven offices,
etc. In all, the x vector has 42 elements, since six people are assigned to seven offices.
Seniority
You want to weight the preferences based on seniority so that the longer you have been at
MathWorks, the more your preferences count. The seniority is as follows: Mary Ann 9 years, Marjorie
10 years, Tom 5 years, Peter 3 years, Marcelo 1.5 years, and Rakesh 2 years. Create a normalized
weight vector based on seniority.
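One way to build this vector from the seniority values above (a sketch; the variable name weightvector matches its later use, and the ordering follows the person numbering):

```matlab
% Seniority in years: Mary Ann, Marjorie, Tom, Peter, Marcelo, Rakesh
seniority = [9 10 5 3 1.5 2];
weightvector = seniority/sum(seniority); % normalize so the weights sum to 1
```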
Set up a preference matrix where the rows correspond to offices and the columns correspond to
people. Ask each person to give values for each office so that the sum of all their choices, i.e., their
column, sums to 100. A higher number means the person prefers the office. Each person's
preferences are listed in a column vector.
The ith element of a person's preference vector is how highly they value the ith office. Thus, the
combined preference matrix is as follows.
Weight the preferences matrix by weightvector to scale the columns by seniority. Also, it's more
convenient to reshape this matrix as a vector in column order so that it corresponds to the x vector.
PM = prefmatrix * diag(weightvector);
c = PM(:);
Objective Function
The objective is to maximize the satisfaction of the preferences weighted by seniority. This is a linear
objective function
max c'*x
or equivalently
min -c'*x.
Constraints
The first set of constraints requires that each person gets exactly one office, that is for each person,
the sum of the x values corresponding to that person is exactly one. For example, since Marjorie is
the second person, this means that sum(x(8:14))=1. Represent these linear constraints in an
equality matrix Aeq and vector beq, where Aeq*x = beq, by building the appropriate matrices. The
matrix Aeq consists of ones and zeros. For example, the second row of Aeq will correspond to
Marjorie getting one office, so the row looks like this:
0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0
There are seven 1s in columns 8 through 14 and 0s elsewhere. Then Aeq(2,:)*x = 1 is equivalent
to sum(x(8:14)) = 1.
numOffices = 7;
numPeople = 6;
numDim = numOffices * numPeople;
onesvector = ones(1,numOffices);
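One way to assemble these equality constraints, consistent with the row pattern shown above (a sketch; the shipped example may construct the matrices differently):

```matlab
% Row p of Aeq has ones in the seven columns belonging to person p
Aeq = kron(eye(numPeople),onesvector);
beq = ones(numPeople,1); % each person gets exactly one office
```

For example, row 2 of this Aeq has ones in columns 8 through 14, matching the Marjorie constraint described above.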
The second set of constraints consists of inequalities. These constraints specify that each office has no more than one person in it; that is, each office has one person in it or is empty. Build the matrix A and the
vector b such that A*x <= b to capture these constraints. Each row of A and b corresponds to an
office and so row 1 corresponds to people assigned to office 1. This time, the rows have this type of
pattern, for row 1:
1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 ... 1 0 0 0 0 0 0
Each row after this is similar but shifted (circularly) to the right by one spot. For example, row 3
corresponds to office 3 and says that A(3,:)*x <= 1, i.e., office 3 cannot have more than one
person in it.
A = repmat(eye(numOffices),1,numPeople);
b = ones(numOffices,1);
The next set of constraints are also inequalities, so add them to the matrix A and vector b, which
already contain the inequalities from above. You want Tom and Peter no more than one office away
from each other, and the same with Marcelo and Rakesh. First, build the distance matrix of the offices
based on their physical locations and using approximate Manhattan distances. This is a symmetric
matrix.
D = zeros(numOffices);
D(1,2:end) = [1 2 3 2 3 4];
D(2,3:end) = [1 2 1 2 3];
D(3,4:end) = [1 2 1 2];
D(4,5:end) = [3 2 1];
D(5,6:end) = [1 2];
D(6,end) = 1;
D = triu(D)' + D;
Find the offices that are more than one distance unit away.
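One way to find these pairs, assuming the distance matrix D from above (the names officeA and officeB match their later use; find on the symmetric D returns each unordered pair twice, once in each order):

```matlab
[officeA,officeB] = find(D > 1); % ordered office pairs more than one unit apart
numPairs = length(officeA)
```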
numPairs = 26
This finds numPairs pairs of offices that are not adjacent. For these pairs, if Tom occupies one office
in the pair, then Peter cannot occupy the other office in the pair. If he did, they would not be adjacent.
The same is true for Marcelo and Rakesh. This gives 2*numPairs more inequality constraints that
you add to A and b.
For each pair of offices in the list, for the x(i) that corresponds to Tom in officeA and the x(j) that corresponds to Peter in officeB, impose the constraint x(i) + x(j) <= 1. This means that either Tom or Peter can occupy one of these offices, but not both.
tom = 3;
peter = 4;
marcelo = 5;
rakesh = 6;
The following anonymous functions return the index in x corresponding to Tom, Peter, Marcelo and
Rakesh respectively in office i.
tom_index=@(officenum) (tom-1)*numOffices+officenum;
peter_index=@(officenum) (peter-1)*numOffices+officenum;
marcelo_index=@(officenum) (marcelo-1)*numOffices+officenum;
rakesh_index=@(officenum) (rakesh-1)*numOffices+officenum;
for i = 1:numPairs
tomInOfficeA = tom_index(officeA(i));
peterInOfficeB = peter_index(officeB(i));
A(i+numOffices, [tomInOfficeA, peterInOfficeB]) = 1;
marceloInOfficeA = marcelo_index(officeA(i));
rakeshInOfficeB = rakesh_index(officeB(i));
A(i+numOffices+numPairs, [marceloInOfficeA, rakeshInOfficeB]) = 1;
end
b = [b; ones(2*numPairs,1)]; % 2*numPairs new inequality constraints
The problem formulation is

min -c'*x

subject to

Aeq*x = beq
A*x <= b
You already made the A, b, Aeq, and beq entries. Now set the objective function.
f = -c;
To ensure that the variables are binary, put lower bounds of 0, upper bounds of 1, and set intvars to
represent all variables.
lb = zeros(size(f));
ub = lb + 1;
intvars = 1:length(f);
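A solve call consistent with the matrices and bounds built above (a sketch; the exact call in the shipped example may differ):

```matlab
% Solve the binary integer program; fval, exitflag, and output
% are examined in the following sections
[x,fval,exitflag,output] = intlinprog(f,intvars,A,b,Aeq,beq,lb,ub);
```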
Intlinprog stopped at the root node because the objective value is within a gap tolerance of the optimal value, options.AbsoluteGapTolerance = 0 (the default value). The intcon variables are integer within tolerance, options.IntegerTolerance = 1e-05 (the default value).
people = {'Mary Ann', 'Marjorie',' Tom ',' Peter ','Marcelo ',' Rakesh '};
for i=1:numPeople
if isempty(office{i})
name{i} = ' empty ';
else
name{i} = people(office{i});
end
end
printofficeassign(name);
title('Solution of the Office Assignment Problem');
Solution Quality
For this problem, the satisfaction of the preferences by seniority is maximized to the value of -fval.
exitflag = 1 tells you that intlinprog converged to an optimal solution. The output structure
gives information about the solution process, such as how many nodes were explored, and the gap
between the lower and upper bounds in the branching calculation. In this case, no branch-and-bound
nodes were generated, meaning the problem was solved without a branch-and-bound step. The gap is
0, meaning the solution is optimal, with no difference between the internally calculated lower and
upper bounds on the objective function.
fval,exitflag,output
fval = -33.8361
exitflag = 1
See Also
More About
• “Office Assignments by Binary Integer Programming: Problem-Based”
Problem Overview
A lumber mill starts with trees that have been trimmed to fixed-length logs. Specify the fixed log
length.
logLength = 40;
The mill then cuts the logs into fixed lengths suitable for further processing. The problem is how to
make the cuts so that the mill satisfies a set of orders with the fewest logs.
Specify these fixed lengths and the order quantities for the lengths.
Assume that there is no material loss in making cuts, and no cost for cutting.
Several authors, including Ford and Fulkerson [1] and Gilmore and Gomory [2], suggest the following
procedure, which you implement in the next section. A cutting pattern is a set of lengths to which a
single log can be cut.
Instead of generating every possible cutting pattern, it is more efficient to generate cutting patterns
as the solution of a subproblem. Starting from a base set of cutting patterns, solve the linear
programming problem of minimizing the number of logs used subject to the constraint that the cuts,
using the existing patterns, satisfy the demands.
After solving that problem, generate a new pattern by solving an integer linear programming
subproblem. The subproblem is to find the best new pattern, meaning the number of cuts from each
length in lengthlist that add up to no more than the total possible length logLength. The
quantity to optimize is the reduced cost of the new pattern, which is one minus the sum of the
Lagrange multipliers for the current solution times the new cutting pattern. If this quantity is
negative, then bringing that pattern into the linear program will improve its objective. If not, then no
better cutting pattern exists, and the patterns used so far give the optimal linear programming
solution. The reason for this conclusion is exactly parallel to the reason for when to stop the primal
simplex method: the method terminates when there is no variable with a negative reduced cost. The
problem in this example terminates when there is no pattern with negative reduced cost. For details
and an example, see Column generation algorithms and its references.
After solving the linear programming problem in this way, you can have noninteger solutions.
Therefore, solve the problem once more, using the generated patterns and constraining the variables
to have integer values.
Cutting Stock Problem: Solver-Based
A pattern, in this formulation, is a vector of integers containing the number of cuts of each length in
lengthlist. Arrange a matrix named patterns to store the patterns, where each column in the
matrix gives a pattern. For example,
patterns =
    2 0
    0 2
    0 1
    1 0
The first pattern (column) represents two cuts of length 8 and one cut of length 20. The second
pattern represents two cuts of length 12 and one cut of length 16. Each is a feasible pattern because
the total of the cuts is no more than logLength = 40.
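This feasibility check can be made explicit. The cut lengths 8, 12, 16, and 20 come from the pattern descriptions above; collecting them in lengthlist (an assumption about the variable's contents) gives:

```matlab
logLength = 40;
lengthlist = [8;12;16;20];      % cut lengths, assumed from the pattern descriptions
patterns = [2 0; 0 2; 0 1; 1 0];
patternLengths = lengthlist'*patterns % = [36 40], both <= logLength
```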
In this formulation, if x is a column vector of integers containing the number of times each pattern is
used, then patterns*x is a column vector giving the number of cuts of each type. The constraint of
meeting demand is patterns*x >= quantity. For example, using the previous patterns matrix,
suppose that x = [45; 56]. (This x uses 101 logs.) Then

patterns*x = [90; 112; 56; 45],

which represents a feasible solution because the result exceeds the demand

quantity = [90; 111; 55; 30].
To have an initial feasible cutting pattern, use the simplest patterns, which have just one length of
cut. Use as many cuts of that length as feasible for the log.
patterns = diag(floor(logLength./lengthlist));
nPatterns = size(patterns,2);
To generate new patterns from the existing ones based on the current Lagrange multipliers, solve a
subproblem. Call the subproblem in a loop to generate patterns until no further improvement is
found. The subproblem objective depends only on the current Lagrange multipliers. The variables are
nonnegative integers representing the number of cuts of each length. The only constraint is that the
sum of the lengths of the cuts in a pattern is no more than the log length. Create a lower bound
vector lb2 and matrices A2 and b2 that represent these bounds and linear constraints.
lb2 = zeros(nLengths,1);
A2 = lengthlist';
b2 = logLength;
To avoid unnecessary feedback from the solvers, set the Display option to 'off' for both the outer
loop and the inner subproblem solution.
lpopts = optimoptions('linprog','Display','off');
ipopts = optimoptions('intlinprog',lpopts);
reducedCost = -Inf;
reducedCostTolerance = -0.0001;
exitflag = 1;
while reducedCost < reducedCostTolerance && exitflag > 0
[values,nLogs,exitflag,~,lambda] = linprog(f,A,b,[],[],lb,[],lpopts);
if exitflag > 0
fprintf('Using %g logs\n',nLogs);
% Now generate a new pattern, if possible
f2 = -lambda.ineqlin;
[values,reducedCost,pexitflag] = intlinprog(f2,1:nLengths,A2,b2,[],[],lb2,[],ipopts);
reducedCost = 1 + reducedCost; % continue if this reducedCost is negative
newpattern = round(values);
if pexitflag > 0 && reducedCost < reducedCostTolerance
patterns = [patterns newpattern];
nPatterns = nPatterns + 1;
end
end
end
You now have the solution of the linear programming problem. To complete the solution, solve the
problem again with the final patterns, using intlinprog with all variables being integers. Also,
compute the waste, which is the quantity of unused logs (in feet) for each pattern and for the problem
as a whole.
if exitflag <= 0
disp('Error in column generation phase')
else
[values,logsUsed,exitflag] = intlinprog(f,1:length(lb),A,b,[],[],lb,[],[],ipopts);
if exitflag > 0
values = round(values);
logsUsed = round(logsUsed);
fprintf('Optimal solution uses %g logs\n', logsUsed);
totalwaste = sum((patterns*values - quantity).*lengthlist); % waste due to overproduction
for j = 1:length(values)
if values(j) > 0
fprintf('Cut %g logs with pattern\n',values(j));
for w = 1:size(patterns,1)
if patterns(w,j) > 0
fprintf(' %d cut(s) of length %d\n', patterns(w,j),lengthlist(w));
end
end
wastej = logLength - dot(patterns(:,j),lengthlist); % waste due to pattern inefficiency
totalwaste = totalwaste + wastej;
fprintf(' Waste of this pattern is %g\n', wastej);
end
end
fprintf('Total waste in this problem is %g.\n',totalwaste);
else
disp('Error in final optimization')
end
end
Part of the waste is due to overproduction, because the mill cuts one log into three 12-foot pieces, but
uses only one. Part of the waste is due to pattern inefficiency, because the three 12-foot pieces are 4
feet short of the total length of 40 feet.
References
[1] Ford, L. R., Jr. and D. R. Fulkerson. A Suggested Computation for Maximal Multi-Commodity
Network Flows. Management Science 5, 1958, pp. 97-101.
[2] Gilmore, P. C., and R. E. Gomory. A Linear Programming Approach to the Cutting Stock Problem--
Part II. Operations Research 11, No. 6, 1963, pp. 863-888.
See Also
More About
• “Cutting Stock Problem: Problem-Based”
The example first generates random locations for factories, warehouses, and sales outlets. Feel free to modify the scaling parameter N, which scales both the size of the grid in which the production and distribution facilities reside and the number of these facilities, so that the density of facilities of each type per grid area is independent of N.
Facility Locations
For a given value of the scaling parameter N, suppose that there are the following:
• fN^2 factories
• wN^2 warehouses
• sN^2 sales outlets
These facilities are on separate integer grid points between 1 and N in the x and y directions. In
order that the facilities have separate locations, you require that f + w + s ≤ 1. In this example, take
N = 20, f = 0.05, w = 0.05, and s = 0.1.
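With these values, the facility counts work out as follows (a sketch; taking floor of any noninteger product is an assumption):

```matlab
N = 20; f = 0.05; w = 0.05; s = 0.1;
F = floor(f*N^2) % 20 factories
W = floor(w*N^2) % 20 warehouses
S = floor(s*N^2) % 40 sales outlets
```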
The demand for each product p in a sales outlet s is d(s, p). The demand is the quantity that can be
sold in a time interval. One constraint on the model is that the demand is met, meaning the system
produces and distributes exactly the quantities in the demand.
Suppose that each sales outlet receives its supplies from just one warehouse. Part of the problem is to
determine the cheapest mapping of sales outlets to warehouses.
Costs
The cost of transporting products from factory to warehouse, and from warehouse to sales outlet,
depends on the distance between the facilities, and on the particular product. If dist(a, b) is the
distance between facilities a and b, then the cost of shipping a product p between these facilities is
the distance times the transportation cost tcost(p):
dist(a, b) * tcost(p) .
Factory, Warehouse, Sales Allocation Model: Problem-Based
The distance in this example is the grid distance, also known as the L1 distance. It is the sum of the
absolute difference in x coordinates and y coordinates.
Optimization Problem
Given a set of facility locations, and the demands and capacity constraints, find:
These quantities must ensure that demand is satisfied and total cost is minimized. Also, each sales
outlet is required to receive all its products from exactly one warehouse.
The control variables, meaning the ones you can change in the optimization, are x(p, f, w), the quantity of product p produced at factory f for warehouse w, and y(s, w), a binary variable that is 1 when sales outlet s is associated with warehouse w. One of the constraints is

∑p ∑s d(s, p)/turn(p) ⋅ y(s, w) ≤ wcap(w) (capacity of warehouse).
The variables x and y appear in the objective and constraint functions linearly. Because y is restricted
to integer values, the problem is a mixed-integer linear program (MILP).
Set the values of the N, f , w, and s parameters, and generate the facility locations.
Of course, it is not realistic to take random locations for facilities. This example is intended to show
solution techniques, not how to generate good facility locations.
Plot the facilities. Facilities 1 through F are factories, F+1 through F+W are warehouses, and F+W
+1 through F+W+S are sales outlets.
h = figure;
plot(xloc(1:F),yloc(1:F),'rs',xloc(F+1:F+W),yloc(F+1:F+W),'k*',...
xloc(F+W+1:F+W+S),yloc(F+W+1:F+W+S),'bo');
lgnd = legend('Factory','Warehouse','Sales outlet','Location','EastOutside');
lgnd.AutoUpdate = 'off';
xlim([0 N+1]);ylim([0 N+1])
P = 20; % 20 products
% Product transport cost per distance between 5 and 10 for each product
tcost = 5*rand(1,P) + 5;
% Product demand by sales outlet between 200 and 500 for each
% product/outlet
d = 300*rand(S,P) + 200;
These random demands and capacities can lead to infeasible problems. In other words, sometimes the demand exceeds the production and warehouse capacities. If you alter some parameters and get an infeasible problem, you will get an exitflag of -2 during solution.
To begin specifying the problem, generate the distance arrays distfw(i,j) and distsw(i,j).
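A sketch of one way to compute these grid (L1) distances, using the facility ordering described earlier (the exact construction in the shipped example may differ):

```matlab
distfw = zeros(F,W); % factory-to-warehouse grid distances
for ii = 1:F
    for jj = 1:W
        distfw(ii,jj) = abs(xloc(ii) - xloc(F+jj)) + abs(yloc(ii) - yloc(F+jj));
    end
end
distsw = zeros(S,W); % sales-outlet-to-warehouse grid distances
for ii = 1:S
    for jj = 1:W
        distsw(ii,jj) = abs(xloc(F+W+ii) - xloc(F+jj)) + abs(yloc(F+W+ii) - yloc(F+jj));
    end
end
```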
Create variables for the optimization problem. x represents the production, a continuous variable,
with dimension P-by-F-by-W. y represents the binary allocation of sales outlet to warehouse, an S-by-W
variable.
x = optimvar('x',P,F,W,'LowerBound',0);
y = optimvar('y',S,W,'Type','integer','LowerBound',0,'UpperBound',1);
Now create the constraints. The first constraint is a capacity constraint on production.
The next constraint is that the demand is met at each sales outlet.
Finally, there is a requirement that each sales outlet connects to exactly one warehouse.
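This last constraint can be written directly from the y variable defined above (a sketch; the name salesware matches its later use):

```matlab
salesware = sum(y,2) == ones(S,1); % each outlet is served by exactly one warehouse
```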
factoryprob = optimproblem;
The objective function has three parts. The first part is the sum of the production costs.
objfun1 = sum(sum(sum(x,3).*(pcost'),2),1);
The second part is the sum of the transportation costs from factories to warehouses.
objfun2 = 0;
for p = 1:P
objfun2 = objfun2 + tcost(p)*sum(sum(squeeze(x(p,:,:)).*distfw));
end
The third part is the sum of the transportation costs from warehouses to sales outlets.
factoryprob.Constraints.capconstr = capconstr;
factoryprob.Constraints.demconstr = demconstr;
factoryprob.Constraints.warecap = warecap;
factoryprob.Constraints.salesware = salesware;
Turn off iterative display so that you don't get hundreds of lines of output. Include a plot function to
monitor the solution progress.
opts = optimoptions('intlinprog','Display','off','PlotFcn',@optimplotmilp);
[sol,fval,exitflag,output] = solve(factoryprob,'options',opts);
exitflag =
OptimalSolution
infeas1 = max(max(infeasibility(capconstr,sol)))
infeas1 = 9.0949e-13
infeas2 = max(max(infeasibility(demconstr,sol)))
infeas2 = 8.0718e-12
infeas3 = max(infeasibility(warecap,sol))
infeas3 = 0
infeas4 = max(infeasibility(salesware,sol))
infeas4 = 2.4425e-15
Round the y portion of the solution to be exactly integer-valued. To understand why these variables
might not be exactly integers, see “Some “Integer” Solutions Are Not Integers” on page 9-36.
How many sales outlets are associated with each warehouse? Notice that, in this case, some
warehouses have 0 associated outlets, meaning the warehouses are not in use in the optimal solution.
outlets = sum(sol.y,1)
outlets = 1×20
2 0 3 2 2 2 3 2 3 2 1 0 0 3 4 3
Plot the connection between each sales outlet and its warehouse.
figure(h);
hold on
for ii = 1:S
jj = find(sol.y(ii,:)); % Index of warehouse associated with ii
xsales = xloc(F+W+ii); ysales = yloc(F+W+ii);
xwarehouse = xloc(F+jj); ywarehouse = yloc(F+jj);
if rand(1) < .5 % Draw y direction first half the time
plot([xsales,xsales,xwarehouse],[ysales,ywarehouse,ywarehouse],'g--')
else % Draw x direction first the rest of the time
plot([xsales,xwarehouse,xwarehouse],[ysales,ysales,ywarehouse],'g--')
end
end
hold off
See Also
More About
• “Factory, Warehouse, Sales Allocation Model: Solver-Based” on page 9-40
For the solver-based approach to this problem, see “Traveling Salesman Problem: Solver-Based” on
page 9-49.
Problem Formulation
Formulate the traveling salesman problem for integer linear programming as follows:
Generate Stops
Generate random stops inside a crude polygonal representation of the continental U.S.
load('usborder.mat','x','y','xx','yy');
rng(3,'twister') % Makes stops in Maine & Florida, and is reproducible
nStops = 200; % You can use any number, but the problem size scales as N^2
stopsLon = zeros(nStops,1); % Allocate x-coordinates of nStops
stopsLat = stopsLon; % Allocate y-coordinates
n = 1;
while (n <= nStops)
xp = rand*1.5;
yp = rand;
if inpolygon(xp,yp,x,y) % Test if inside the border
stopsLon(n) = xp;
stopsLat(n) = yp;
n = n+1;
end
end
Because there are 200 stops, there are 19,900 trips, meaning 19,900 binary variables (# variables =
200 choose 2).
Calculate all the trip distances, assuming that the earth is flat in order to use the Pythagorean rule.
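A sketch of this calculation, consistent with the idxs and lendist names used below (the exact code in the shipped example may differ):

```matlab
idxs = nchoosek(1:nStops,2); % all stop pairs; one row per potential trip
dist = hypot(stopsLat(idxs(:,1)) - stopsLat(idxs(:,2)), ...
             stopsLon(idxs(:,1)) - stopsLon(idxs(:,2))); % flat-earth distances
lendist = length(dist); % number of potential trips
```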
Traveling Salesman Problem: Problem-Based
dist'*trips
where trips is the binary vector representing the trips that the solution takes. This is the distance of
a tour that you try to minimize.
Represent the problem as a graph. Create a graph where the stops are nodes and the trips are edges.
G = graph(idxs(:,1),idxs(:,2));
Display the stops using a graph plot. Plot the nodes without the graph edges.
figure
hGraph = plot(G,'XData',stopsLon,'YData',stopsLat,'LineStyle','none','NodeLabel',{});
hold on
% Draw the outside border
plot(x,y,'r-')
hold off
Create an optimization problem with binary optimization variables representing the potential trips.
tsp = optimproblem;
trips = optimvar('trips',lendist,1,'Type','integer','LowerBound',0,'UpperBound',1);
tsp.Objective = dist'*trips;
Constraints
Create the linear constraints that each stop has two associated trips, because there must be a trip to
each stop and a trip departing each stop.
Use the graph representation to identify all trips starting or ending at a stop by finding all edges
connecting to that stop. For each stop, create the constraint that the sum of trips for that stop equals
two.
constr2trips = optimconstr(nStops,1);
for stop = 1:nStops
whichIdxs = outedges(G,stop); % Identify trips associated with the stop
constr2trips(stop) = sum(trips(whichIdxs)) == 2;
end
tsp.Constraints.constr2trips = constr2trips;
The problem is ready to be solved. To suppress iterative output, turn off the default display.
opts = optimoptions('intlinprog','Display','off');
tspsol = solve(tsp,'options',opts)
Visualize Solution
Create a new graph with the solution trips as edges. To do so, round the solution in case some values
are not exactly integers, and convert the resulting values to logical.
tspsol.trips = logical(round(tspsol.trips));
Gsol = graph(idxs(tspsol.trips,1),idxs(tspsol.trips,2));
Overlay the new graph on the existing plot and highlight its edges.
hold on
highlight(hGraph,Gsol,'LineStyle','-')
title('Solution with Subtours')
As can be seen on the map, the solution has several subtours. The constraints specified so far do not
prevent these subtours from happening. In order to prevent any possible subtour from happening,
you would need an incredibly large number of inequality constraints.
Subtour Constraints
Because you can't add all of the subtour constraints, take an iterative approach. Detect the subtours
in the current solution, then add inequality constraints to prevent those particular subtours from
happening. By doing this, you find a suitable tour in a few iterations.
Eliminate subtours with inequality constraints. For example, if you have five points in a subtour, then five lines connect those points to create the subtour. Eliminate this subtour with an inequality constraint stating that there must be no more than four lines among these five points.

Furthermore, find all lines between these five points, and constrain the solution not to have more than four of these lines present. This is a correct constraint because if five or more of the lines existed in a solution, then the solution would have a subtour (a graph with n nodes and n edges always contains a cycle).
Detect the subtours by identifying the connected components in Gsol, the graph built with the edges
in the current solution. conncomp returns a vector with the number of the subtour to which each
node belongs.
tourIdxs = conncomp(Gsol);
numtours = max(tourIdxs); % Number of subtours
fprintf('# of subtours: %d\n',numtours);
# of subtours: 27
Include the linear inequality constraints to eliminate subtours, and repeatedly call the solver, until
just one subtour remains.
# of subtours: 20
# of subtours: 7
# of subtours: 9
# of subtours: 9
# of subtours: 3
# of subtours: 2
# of subtours: 7
# of subtours: 2
# of subtours: 1
Solution Quality
The solution represents a feasible tour, because it is a single closed loop. But is it a minimal-cost
tour? One way to find out is to examine the output structure.
disp(output.absolutegap)
The smallness of the absolute gap implies that the solution is either optimal or has a total length that
is close to optimal.
See Also
More About
• “Traveling Salesman Problem: Solver-Based” on page 9-49
For the solver-based approach to this problem, see “Optimal Dispatch of Power Generators: Solver-
Based” on page 9-55.
Problem Definition
The electricity market has different prices at different times of day. If you have generators supplying
electricity, you can take advantage of this variable pricing by scheduling your generators to operate
when prices are high. Suppose that you control two generators. Each generator has three power
levels (off, low, and high). Each generator has a specified rate of fuel consumption and power
production at each power level. Fuel consumption is 0 when the generator is off.
You can assign a power level to each generator for each half-hour time interval during a day (24
hours, so 48 intervals). Based on historical records, assume that you know the revenue per megawatt-
hour (MWh) that you receive in each time interval. The data for this example is from the Australian
Energy Market Operator https://www.nemweb.com.au/REPORTS/CURRENT/ in mid-2013, and is
used under their terms https://www.aemo.com.au/Privacy_and_Legal_Notices/
Copyright_Permissions_Notice.
Optimal Dispatch of Power Generators: Problem-Based
There is a cost to start a generator after it has been off. Also, there is a constraint on the maximum
fuel usage for the day. This constraint exists because you buy your fuel a day ahead of time, so you
can use only what you just bought.
You can formulate the scheduling problem as a binary integer programming problem. Define indexes
i, j, and k, and a binary scheduling vector y, as follows:
Determine when a generator starts after being off. To do so, define the auxiliary binary variable
z(i,j) that indicates whether to charge for turning on generator j at period i.
You need a way to set z automatically based on the settings of y. A linear constraint below handles
this setting.
You also need the parameters of the problem for costs, generation levels for each generator,
consumption levels of the generators, and fuel available.
You got poolPrice when you executed load dispatchPrice;. Set the other parameters as
follows.
fuelPrice = 3;
totalFuel = 3.95e4;
nPeriods = length(poolPrice); % 48 periods
nGens = 2; % Two generators
gen = [61,152;50,150]; % Generator 1 low = 61 MW, high = 152 MW
fuel = [427,806;325,765]; % Fuel consumption for generator 2 is low = 325, high = 765
startCost = 1e4; % Cost to start a generator after it has been off
Generator Efficiency
Examine the efficiency of the two generators at their two operating points.
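One simple measure is megawatts produced per unit of fuel consumed at each setting (a sketch):

```matlab
% MW produced per unit of fuel, one row per generator, columns = low, high
efficiency = gen./fuel % roughly [0.143 0.189; 0.154 0.196]
```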
Notice that generator 2 is a bit more efficient than generator 1 at its corresponding operating points
(low and high), but generator 1 at its high operating point is more efficient than generator 2 at its low
operating point.
To set up the problem, you need to encode all the problem data and constraints in problem form. The
variables y(i,j,k) represent the solution of the problem, and the auxiliary variables z(i,j)
indicate whether to charge for turning on a generator. y is an nPeriods-by-nGens-by-2 array, and
z is an nPeriods-by-nGens array. All variables are binary.
y = optimvar('y',nPeriods,nGens,{'Low','High'},'Type','integer','LowerBound',0,...
'UpperBound',1);
z = optimvar('z',nPeriods,nGens,'Type','integer','LowerBound',0,...
'UpperBound',1);
Linear Constraints
To ensure that the power level has no more than one component equal to 1, set a linear inequality
constraint.
The running cost per period is the cost of fuel for that period. For generator j operating at level k,
the cost is fuelPrice * fuel(j,k).
Create an expression fuelUsed that accounts for all the fuel used.
yFuel = zeros(nPeriods,nGens,2);
yFuel(:,1,1) = fuel(1,1); % Fuel use of generator 1 in low setting
yFuel(:,1,2) = fuel(1,2); % Fuel use of generator 1 in high setting
yFuel(:,2,1) = fuel(2,1); % Fuel use of generator 2 in low setting
yFuel(:,2,2) = fuel(2,2); % Fuel use of generator 2 in high setting
fuelUsed = sum(sum(sum(y.*yFuel)));
The constraint is that the fuel used is no more than the fuel available.
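Expressed with the fuelUsed expression just created (the name fuelcons matches its later use):

```matlab
fuelcons = fuelUsed <= totalFuel; % fuel is limited by the day-ahead purchase
```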
How can you get the solver to set the z variables automatically to match the active/off periods of the
y variables? Recall that the condition to satisfy is z(i,j) = 1 exactly when sum_k y(i,j,k) = 0
and sum_k y(i+1,j,k) = 1.
Notice that sum_k ( - y(i,j,k) + y(i+1,j,k) ) > 0 exactly when you want z(i,j) = 1.
Also, include the z variables in the objective function cost. With the z variables in the objective
function, the solver attempts to lower their values, meaning it tries to set them all equal to 0. But for
those intervals when a generator turns on, the linear inequality forces z(i,j) to equal 1.
Create an auxiliary expression w that represents y(i+1,j,k) - y(i,j,k), summed over the levels k. Represent the generator startup inequality in terms of w.
w = optimexpr(nPeriods,nGens); % Allocate w
idx = 1:(nPeriods-1);
w(idx,:) = y(idx+1,:,'Low') - y(idx,:,'Low') + y(idx+1,:,'High') - y(idx,:,'High');
w(nPeriods,:) = y(1,:,'Low') - y(nPeriods,:,'Low') + y(1,:,'High') - y(nPeriods,:,'High');
switchcons = w - z <= 0;
Define Objective
The objective function includes fuel costs for running the generators, revenue from running the
generators, and costs for starting the generators.
generatorlevel = zeros(size(yFuel));
generatorlevel(:,1,1) = gen(1,1); % Fill in the levels
generatorlevel(:,1,2) = gen(1,2);
generatorlevel(:,2,1) = gen(2,1);
generatorlevel(:,2,2) = gen(2,2);
revenue = optimexpr(size(y));
for ii = 1:nPeriods
revenue(ii,:,:) = poolPrice(ii)*y(ii,:,:).*generatorlevel(ii,:,:);
end
fuelCost = fuelUsed*fuelPrice;
startingCost = startCost*sum(sum(z)); % total startup cost over all periods and generators
profit = sum(sum(sum(revenue))) - fuelCost - startingCost;
dispatch = optimproblem('ObjectiveSense','maximize');
dispatch.Objective = profit;
dispatch.Constraints.switchcons = switchcons;
dispatch.Constraints.fuelcons = fuelcons;
dispatch.Constraints.powercons = powercons;
options = optimoptions('intlinprog','Display','final');
[dispatchsol,fval,exitflag,output] = solve(dispatch,'options',options);
Intlinprog stopped because the objective value is within a gap tolerance of the
optimal value, options.AbsoluteGapTolerance = 0 (the default value). The intcon
variables are integer within tolerance, options.IntegerTolerance = 1e-05 (the
default value).
subplot(3,1,1)
bar(dispatchsol.y(:,1,1)*gen(1,1)+dispatchsol.y(:,1,2)*gen(1,2),.5,'g')
xlim([.5,48.5])
ylabel('MWh')
title('Generator 1 Optimal Schedule','FontWeight','bold')
subplot(3,1,2)
bar(dispatchsol.y(:,2,1)*gen(2,1)+dispatchsol.y(:,2,2)*gen(2,2),.5,'c')
title('Generator 2 Optimal Schedule','FontWeight','bold')
xlim([.5,48.5])
ylabel('MWh')
subplot(3,1,3)
bar(poolPrice,.5)
xlim([.5,48.5])
title('Energy Price','FontWeight','bold')
xlabel('Period')
ylabel('$ / MWh')
Generator 2 runs longer than generator 1, which you would expect because it is more efficient.
Generator 2 runs at its high power level whenever it is on. Generator 1 runs mainly at its high power
level, but dips down to low power for one time unit. Each generator runs for one contiguous set of
periods daily, and, therefore, incurs only one startup cost each day.
Check that the z variable is 1 for the periods when the generators start.
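One way to recover the start periods, matching the `theperiod` and `thegenerator` outputs shown below, is to locate the nonzero entries of the z solution (a sketch using the solution structure from the earlier solve):

```matlab
starttimes = find(round(dispatchsol.z) == 1); % linear indices of startups
[theperiod,thegenerator] = ind2sub(size(dispatchsol.z),starttimes)
```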
theperiod = 2×1
23
16
thegenerator = 2×1
1
2
If you specify a lower value for startCost, the solution involves multiple generation periods.
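A sketch of the re-solve with a lower start cost follows; the value 500 is an assumption for illustration, and `dispatchsolnew` is the solution name used in the plotting code below:

```matlab
startCost = 500; % assumed lower startup cost
startingCost = startCost*sum(sum(z));
dispatch.Objective = sum(sum(sum(revenue))) - fuelCost - startingCost;
[dispatchsolnew,fvalnew,exitflagnew,outputnew] = solve(dispatch,'options',options);
```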
Intlinprog stopped because the objective value is within a gap tolerance of the
optimal value, options.AbsoluteGapTolerance = 0 (the default value). The intcon
variables are integer within tolerance, options.IntegerTolerance = 1e-05 (the
default value).
subplot(3,1,1)
bar(dispatchsolnew.y(:,1,1)*gen(1,1)+dispatchsolnew.y(:,1,2)*gen(1,2),.5,'g')
xlim([.5,48.5])
ylabel('MWh')
title('Generator 1 Optimal Schedule','FontWeight','bold')
subplot(3,1,2)
bar(dispatchsolnew.y(:,2,1)*gen(2,1)+dispatchsolnew.y(:,2,2)*gen(2,2),.5,'c')
title('Generator 2 Optimal Schedule','FontWeight','bold')
xlim([.5,48.5])
ylabel('MWh')
subplot(3,1,3)
bar(poolPrice,.5)
xlim([.5,48.5])
title('Energy Price','FontWeight','bold')
xlabel('Period')
ylabel('$ / MWh')
theperiod = 3×1
22
16
45
thegenerator = 3×1
1
2
2
See Also
More About
• “Optimal Dispatch of Power Generators: Solver-Based” on page 9-55
Office Assignments by Binary Integer Programming: Problem-Based
You want to assign six people, Marcelo, Rakesh, Peter, Tom, Marjorie, and Mary Ann, to seven offices.
Each office can have no more than one person, and each person gets exactly one office. So there will
be one empty office. People can give preferences for the offices, and their preferences are considered
based on their seniority. The longer they have been at MathWorks, the higher the seniority. Some
offices have windows, some do not, and one window is smaller than others. Additionally, Peter and
Tom often work together, so should be in adjacent offices. Marcelo and Rakesh often work together,
and should be in adjacent offices.
Office Layout
Offices 1, 2, 3, and 4 are inside offices (no windows). Offices 5, 6, and 7 have windows, but the
window in office 5 is smaller than the other two. Here is how the offices are arranged.
Problem Formulation
You need to formulate the problem mathematically. Create binary variables that indicate whether a
person occupies an office. The list of people's names is
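The name and office lists are not shown in this excerpt; a plausible definition (the ordering of `namelist` is assumed to match the rows of `prefmatrix` below, and the office names are illustrative) is:

```matlab
namelist = {'Mary Ann','Marjorie','Tom','Peter','Marcelo','Rakesh'};
officelist = {'Office1','Office2','Office3','Office4','Office5','Office6','Office7'};
```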
occupy = optimvar('occupy',namelist,officelist,...
'Type','integer','LowerBound',0,'UpperBound',1);
Seniority
You want to weight the preferences based on seniority so that the longer you have been at
MathWorks, the more your preferences count. The seniority is as follows: Mary Ann 9 years, Marjorie
10 years, Tom 5 years, Peter 3 years, Marcelo 1.5 years, and Rakesh 2 years. Create a normalized
weight vector based on seniority.
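Using the seniority values from the text, one way to build the normalized weight vector (ordered to match the rows of `prefmatrix` below) is:

```matlab
seniority = [9 10 5 3 1.5 2]; % Mary Ann, Marjorie, Tom, Peter, Marcelo, Rakesh
weightvector = seniority/sum(seniority); % normalized weights
```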
Set up a preference matrix where the rows correspond to offices and the columns correspond to
people. Ask each person to assign values to the offices so that their choices, that is, their column, sum to 100. A higher number means the person prefers the office. Each person's
preferences are listed in a column vector.
The ith element of a person's preference vector is how highly they value the ith office. Thus, the
combined preference matrix is as follows.
prefmatrix = [MaryAnn;Marjorie;Tom;Peter;Marcelo;Rakesh];
PM = diag(weightvector) * prefmatrix;
Objective Function
The objective is to maximize the satisfaction of the preferences weighted by seniority. This is the
linear objective function sum(sum(occupy.*PM)).
peopleprob = optimproblem('ObjectiveSense','maximize','Objective',sum(sum(occupy.*PM)));
Constraints
The first set of constraints requires that each person gets exactly one office, that is for each person,
the sum of the occupy values corresponding to that person is exactly one.
peopleprob.Constraints.constr1 = sum(occupy,2) == 1;
The second set of constraints are inequalities. These constraints specify that each office has no more
than one person in it.
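These inequalities can be written by summing `occupy` over the people dimension; a sketch:

```matlab
peopleprob.Constraints.constr2 = sum(occupy,1) <= 1; % at most one person per office
```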
You want Tom and Peter no more than one office away from each other, and the same with Marcelo
and Rakesh.
Set constraints that Tom and Peter are not more than 1 away from each other.
Now create constraints that Marcelo and Rakesh are not more than 1 away from each other.
[soln,fval,exitflag,output] = solve(peopleprob);
Intlinprog stopped at the root node because the objective value is within a gap tolerance of the optimal value, options.AbsoluteGapTolerance = 0 (the default value). The intcon variables are integer within tolerance, options.IntegerTolerance = 1e-05 (the default value).
numOffices = length(officelist);
office = cell(numOffices,1);
for i=1:numOffices
office{i} = find(soln.occupy(:,i)); % people index in office
end
printofficeassign(whoinoffice);
title('Solution of the Office Assignment Problem');
Solution Quality
For this problem, the satisfaction of the preferences by seniority is maximized to the value of fval.
The value of exitflag indicates that solve converged to an optimal solution. The output structure
gives information about the solution process, such as how many nodes were explored, and the gap
between the lower and upper bounds in the branching calculation. In this case, no branch-and-bound
nodes were generated, meaning the problem was solved without a branch-and-bound step. The
absolute gap is 0, meaning the solution is optimal, with no difference between the internally
calculated lower and upper bounds on the objective function.
fval,exitflag,output
fval = 33.8361
exitflag = 1
See Also
More About
• “Office Assignments by Binary Integer Programming: Solver-Based” on page 9-79
Problem Outline
As Markowitz showed ("Portfolio Selection," J. Finance Volume 7, Issue 1, pp. 77-91, March 1952),
you can express many portfolio optimization problems as quadratic programming problems. Suppose
that you have a set of N assets and want to choose a portfolio, with x(i) being the fraction of your
investment that is in asset i. If you know the vector r of mean returns of each asset, and the
covariance matrix Q of the returns, then for a given level of risk-aversion λ you maximize the risk-
adjusted expected return:
The quadprog solver addresses this quadratic programming problem. However, in addition to the
plain quadratic programming problem, you might want to restrict a portfolio in a variety of ways,
such as:
You cannot include these constraints in quadprog. The difficulty is the discrete nature of the
constraints. Furthermore, while the mixed-integer linear programming solver does handle discrete
constraints, it does not address quadratic objective functions.
This example constructs a sequence of MILP problems that satisfy the constraints, and that
increasingly approximate the quadratic objective function. While this technique works for this
example, it might not apply to different problem or constraint types.
x is the vector of asset allocation fractions, with 0 ≤ x(i) ≤ 1 for each i. To model the number of assets
in the portfolio, you need indicator variables v such that v(i) = 0 when x(i) = 0, and v(i) = 1 when
x(i) > 0. To get variables that satisfy this restriction, set the v vector to be a binary variable, and
impose the linear constraints
fmin*v(i) ≤ x(i) ≤ fmax*v(i).
These inequalities enforce that x(i) and v(i) are zero at exactly the same time, and also that fmin ≤ x(i) ≤ fmax whenever x(i) > 0.
Also, to enforce the constraints on the number of assets in the portfolio, impose the linear constraints
m ≤ ∑_i v(i) ≤ M .
As first formulated, you try to maximize the objective function. However, all Optimization Toolbox™
solvers minimize. So formulate the problem as minimizing the negative of the objective:
min_x  λ x^T Q x − r^T x .
This objective function is nonlinear. The MILP solver requires a linear objective function. There is a
standard technique to reformulate this problem into one with linear objective and nonlinear
constraints. Introduce a slack variable z to represent the quadratic term.
min_{x,z}  λz − r^T x  such that  x^T Q x − z ≤ 0,  z ≥ 0 .
As you iteratively solve MILP approximations, you include new linear constraints, each of which
approximates the nonlinear constraint locally near the current point. In particular, for x = x0 + δ
where x0 is a constant vector and δ is a variable vector, the first-order Taylor approximation to the
constraint is
x^T Q x − z = x0^T Q x0 + 2 x0^T Q δ − z + O(‖δ‖^2) .
Replacing δ by x − x0 gives
x^T Q x − z = −x0^T Q x0 + 2 x0^T Q x − z + O(‖x − x0‖^2) .
For each intermediate solution xk you introduce a new linear constraint in x and z as the linear part of the expression above:

2 xk^T Q x − z ≤ xk^T Q xk .

This has the form A x ≤ b, where A = 2 xk^T Q (with a −1 multiplier for the z term) and b = xk^T Q xk.
This method of adding new linear constraints to the problem is called a cutting plane method. For
details, see J. E. Kelley, Jr. "The Cutting-Plane Method for Solving Convex Programs." J. Soc. Indust.
Appl. Math. Vol. 8, No. 4, pp. 703-712, December, 1960.
Load the data for the problem. This data has 225 expected returns in the vector r and the covariance
of the returns in the 225-by-225 matrix Q. The data is the same as in the Using Quadratic
Programming on Portfolio Optimization Problems example.
load port5
r = mean_return;
Q = Correlation .* (stdDev_return * stdDev_return');
N = length(r);
Create continuous variables xvars representing the asset allocation fraction, binary variables vvars
representing whether or not the associated xvars is zero or strictly positive, and zvar representing
the z variable, a positive scalar.
xvars = optimvar('xvars',N,1,'LowerBound',0,'UpperBound',1);
vvars = optimvar('vvars',N,1,'Type','integer','LowerBound',0,'UpperBound',1);
zvar = optimvar('zvar',1,'LowerBound',0);
The lower bounds of all 2N+1 variables in the problem are zero. The upper bounds of the xvars and vvars variables are one, and zvar has no upper bound.
Set the number of assets in the solution to be between 100 and 150. Incorporate the constraint m ≤ ∑_i v(i) ≤ M into the problem as the pair of linear constraints

∑_i v(i) ≤ M
∑_i v(i) ≥ m .
M = 150;
m = 100;
qpprob = optimproblem('ObjectiveSense','maximize');
qpprob.Constraints.mconstr = sum(vvars) <= M;
qpprob.Constraints.mconstr2 = sum(vvars) >= m;
Include semicontinuous constraints. Take the minimal nonzero fraction of assets to be 0.001 for each
asset type, and the maximal fraction to be 0.05.
fmin = 0.001;
fmax = 0.05;
Include the inequalities x(i) ≤ fmax*v(i) and fmin*v(i) ≤ x(i).
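A sketch of these semicontinuous constraints in problem form:

```matlab
qpprob.Constraints.fmaxconstr = xvars <= fmax*vvars; % x(i) = 0 when v(i) = 0
qpprob.Constraints.fminconstr = fmin*vvars <= xvars; % x(i) >= fmin when v(i) = 1
```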
qpprob.Constraints.allin = sum(xvars) == 1;
lambda = 100;
To solve the problem iteratively, begin by solving the problem with the current constraints, which do
not yet reflect any linearization.
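The objective and the initial solve are not shown in this excerpt; a sketch consistent with the later code (which reuses `options` and the solution name `xLinInt`) is:

```matlab
qpprob.Objective = r'*xvars - lambda*zvar; % maximize risk-adjusted return
options = optimoptions('intlinprog','Display','off');
[xLinInt,fval,exitFlagInt,output] = solve(qpprob,'options',options);
```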
Prepare a stopping condition for the iterations: stop when the slack variable z is within 0.01% of the
true quadratic value.
thediff = 1e-4;
iter = 1; % iteration counter
assets = xLinInt.xvars;
truequadratic = assets'*Q*assets;
zslack = xLinInt.zvar;
Keep a history of the computed true quadratic and slack variables for plotting. Set tighter tolerances
than default to help the iterations converge to a correct solution.
history = [truequadratic,zslack];
options = optimoptions(options,'LPOptimalityTolerance',1e-10,'RelativeGapTolerance',1e-8,...
'ConstraintTolerance',1e-9,'IntegerTolerance',1e-6);
Compute the quadratic and slack values. If they differ, then add another linear constraint and solve
again.
After you find a new solution, use a linear constraint halfway between the old and new solutions. This
heuristic way of including linear constraints can be faster than simply taking the new solution. To use
the solution instead of the halfway heuristic, comment the "Midway" line below, and uncomment the
following one.
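A sketch of the cutting-plane loop described above, with the "Midway" heuristic line and its alternative marked:

```matlab
while abs((zslack - truequadratic)/truequadratic) > thediff % relative error test
    % Linearized constraint 2*xk'*Q*x - z <= xk'*Q*xk at the current point
    constr = 2*assets'*Q*xvars - zvar <= assets'*Q*assets;
    newname = ['iteration',num2str(iter)];
    qpprob.Constraints.(newname) = constr;
    % Solve the problem with the new constraint
    [xLinInt,fval,exitFlagInt,output] = solve(qpprob,'options',options);
    assets = (assets + xLinInt.xvars)/2; % Midway from previous to current solution
    % assets = xLinInt.xvars; % Use this line instead to take the new solution
    truequadratic = xLinInt.xvars'*Q*xLinInt.xvars;
    zslack = xLinInt.zvar;
    history = [history; truequadratic, zslack];
    iter = iter + 1;
end
```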
Plot the history of the slack variable and the quadratic part of the objective function to see how they
converged.
plot(history)
legend('Quadratic','Slack')
xlabel('Iteration number')
title('Quadratic and linear approximation (slack)')
What is the quality of the MILP solution? The output structure contains that information. Examine
the absolute gap between the internally-calculated bounds on the objective at the solution.
disp(output.absolutegap)
The absolute gap is zero, indicating that the MILP solution is accurate.
Plot the optimal allocation. Use xLinInt.xvars, not assets, because assets might not satisfy the
constraints when using the midway update.
bar(xLinInt.xvars)
grid on
xlabel('Asset index')
ylabel('Proportion of investment')
title('Optimal asset allocation')
You can easily see that all nonzero asset allocations are between the semicontinuous bounds fmin = 0.001 and fmax = 0.05.
How many nonzero assets are there? The constraint is that there are between 100 and 150 nonzero
assets.
sum(xLinInt.vvars)
ans = 100
What is the expected return for this allocation, and the value of the risk-adjusted return?
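These quantities can be computed directly from the solution; a sketch:

```matlab
fprintf('Expected return = %g\n', r'*xLinInt.xvars);
fprintf('Risk-adjusted return = %g\n', r'*xLinInt.xvars - lambda*xLinInt.zvar);
```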
More elaborate analyses are possible by using features specifically designed for portfolio optimization
in Financial Toolbox®. For an example that shows how to use the Portfolio class to directly handle
semicontinuous and cardinality constraints, see “Portfolio Optimization with Semicontinuous and
Cardinality Constraints” (Financial Toolbox).
See Also
More About
• “Mixed-Integer Quadratic Programming Portfolio Optimization: Solver-Based” on page 9-65
Cutting Stock Problem: Problem-Based
Problem Overview
A lumber mill starts with trees that have been trimmed to fixed-length logs. Specify the fixed log
length.
logLength = 40;
The mill then cuts the logs into fixed lengths suitable for further processing. The problem is how to
make the cuts so that the mill satisfies a set of orders with the fewest logs.
Specify these fixed lengths and the order quantities for the lengths.
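The specific values are not shown in this excerpt; the lengths and demand vector consistent with the numeric example later in this section are:

```matlab
lengthlist = [8; 12; 16; 20]; % cut lengths in feet
quantity = [90; 111; 55; 30]; % demanded number of cuts of each length
nLengths = length(lengthlist);
```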
Assume that there is no material loss in making cuts, and no cost for cutting.
Several authors, including Ford and Fulkerson [1] and Gilmore and Gomory [2], suggest the following
procedure, which you implement in the next section. A cutting pattern is a set of lengths to which a
single log can be cut.
Instead of generating every possible cutting pattern, it is more efficient to generate cutting patterns
as the solution of a subproblem. Starting from a base set of cutting patterns, solve the linear
programming problem of minimizing the number of logs used subject to the constraint that the cuts,
using the existing patterns, satisfy the demands.
After solving that problem, generate a new pattern by solving an integer linear programming
subproblem. The subproblem is to find the best new pattern, meaning the number of cuts from each
length in lengthlist that add up to no more than the total possible length logLength. The
quantity to optimize is the reduced cost of the new pattern, which is one minus the sum of the
Lagrange multipliers for the current solution times the new cutting pattern. If this quantity is
negative, then bringing that pattern into the linear program will improve its objective. If not, then no
better cutting pattern exists, and the patterns used so far give the optimal linear programming
solution. The reason for this conclusion is exactly parallel to the reason for when to stop the primal
simplex method: the method terminates when there is no variable with a negative reduced cost. The
problem in this example terminates when there is no pattern with negative reduced cost. For details
and an example, see Column generation algorithms and its references.
After solving the linear programming problem in this way, you can have noninteger solutions.
Therefore, solve the problem once more, using the generated patterns and constraining the variables
to have integer values.
A pattern, in this formulation, is a vector of integers containing the number of cuts of each length in
lengthlist. Arrange a matrix named patterns to store the patterns, where each column in the
matrix gives a pattern. For example,
patterns = [2 0
            0 2
            0 1
            1 0] .
The first pattern (column) represents two cuts of length 8 and one cut of length 20. The second
pattern represents two cuts of length 12 and one cut of length 16. Each is a feasible pattern because
the total of the cuts is no more than logLength = 40.
In this formulation, if x is a column vector of integers containing the number of times each pattern is
used, then patterns*x is a column vector giving the number of cuts of each type. The constraint of
meeting demand is patterns*x >= quantity. For example, using the previous patterns matrix,
suppose that x = [45; 56]. (This x uses 101 logs.) Then

patterns * x = [90; 112; 56; 45],
which represents a feasible solution because the result meets or exceeds the demand

quantity = [90; 111; 55; 30].
To have an initial feasible cutting pattern, use the simplest patterns, which have just one length of
cut. Use as many cuts of that length as feasible for the log.
patterns = diag(floor(logLength./lengthlist));
nPatterns = size(patterns,2);
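The master linear program `logprob` is not shown in this excerpt; a sketch consistent with the constraint name `Demand` used later (`lambda.Constraints.Demand`) is:

```matlab
x = optimvar('x',nPatterns,1,'LowerBound',0); % times each pattern is used
logprob = optimproblem('Objective',sum(x)); % minimize the number of logs
logprob.Constraints.Demand = patterns*x >= quantity; % meet demand
```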
To generate new patterns from the existing ones based on the current Lagrange multipliers, solve a
subproblem. Call the subproblem in a loop to generate patterns until no further improvement is
found. The subproblem objective depends only on the current Lagrange multipliers. The variables are
nonnegative integers representing the number of cuts of each length. The only constraint is that the
sum of the lengths of the cuts in a pattern is no more than the log length.
subproblem = optimproblem();
cuts = optimvar('cuts', nLengths, 1, 'Type','integer','LowerBound',zeros(nLengths,1));
subproblem.Constraints = dot(lengthlist,cuts) <= logLength;
To avoid unnecessary feedback from the solvers, set the Display option to 'off' for both the outer
loop and the inner subproblem solution.
lpopts = optimoptions('linprog','Display','off');
ipopts = optimoptions('intlinprog','Display','off');
reducedCost = -inf;
reducedCostTolerance = -0.0001;
exitflag = 1;
while reducedCost < reducedCostTolerance && exitflag > 0
[values,nLogs,exitflag,~,lambda] = solve(logprob,'options',lpopts);
if exitflag > 0
fprintf('Using %g logs\n',nLogs);
% Now generate a new pattern, if possible
subproblem.Objective = 1.0 - dot(lambda.Constraints.Demand,cuts);
[values,reducedCost,pexitflag] = solve(subproblem,'options',ipopts);
newpattern = round(values.cuts);
if double(pexitflag) > 0 && reducedCost < reducedCostTolerance
patterns = [patterns newpattern];
nPatterns = nPatterns + 1;
end
end
end
You now have the solution of the linear programming problem. To complete the solution, solve the
problem again with the final patterns, changing the solution variable x to the integer type. Also,
compute the waste, which is the quantity of unused logs (in feet) for each pattern and for the problem
as a whole.
if exitflag <= 0
disp('Error in column generation phase')
else
x.Type = 'integer';
[values,logsUsed,exitflag] = solve(logprob,'options',ipopts);
if double(exitflag) > 0
values.x = round(values.x); % in case some values were not exactly integers
logsUsed = sum(values.x);
fprintf('Optimal solution uses %g logs\n', logsUsed);
totalwaste = sum((patterns*values.x - quantity).*lengthlist); % waste due to overproduction
for j = 1:length(values.x)
if values.x(j) > 0
fprintf('Cut %g logs with pattern\n',values.x(j));
for w = 1:size(patterns,1)
if patterns(w,j) > 0
fprintf(' %g cut(s) of length %d\n', patterns(w,j),lengthlist(w));
end
end
wastej = logLength - dot(patterns(:,j),lengthlist); % waste due to pattern inefficiency
totalwaste = totalwaste + wastej;
fprintf(' Waste of this pattern is %g\n',wastej);
end
end
fprintf('Total waste in this problem is %g.\n',totalwaste);
else
disp('Error in final optimization')
end
end
Part of the waste is due to overproduction, because the mill cuts one log into three 12-foot pieces, but
uses only one. Part of the waste is due to pattern inefficiency, because the three 12-foot pieces are 4
feet short of the total length of 40 feet.
References
[1] Ford, L. R., Jr. and D. R. Fulkerson. A Suggested Computation for Maximal Multi-Commodity
Network Flows. Management Science 5, 1958, pp. 97-101.
[2] Gilmore, P. C., and R. E. Gomory. A Linear Programming Approach to the Cutting Stock Problem--
Part II. Operations Research 11, No. 6, 1963, pp. 863-888.
See Also
More About
• “Cutting Stock Problem: Solver-Based” on page 9-86
Solve Sudoku Puzzles Via Integer Programming: Problem-Based
You probably have seen Sudoku puzzles. A puzzle is to fill a 9-by-9 grid with integers from 1 through
9 so that each integer appears only once in each row, column, and major 3-by-3 square. The grid is
partially populated with clues, and your task is to fill in the rest of the grid.
Initial Puzzle
Here is a data matrix B of clues. Each row of B has the form [row, column, clue]. The first row, [1,2,2], means row 1, column 2 contains the clue 2. The second row, [1,5,3], means row 1, column 5 contains the clue 3. Here is the entire matrix B.
B = [1,2,2;
1,5,3;
1,8,4;
2,1,6;
2,9,3;
3,3,4;
3,7,5;
4,4,8;
4,6,6;
5,1,8;
5,5,1;
5,9,6;
6,4,7;
6,6,5;
7,3,7;
7,7,6;
8,1,4;
8,9,8;
9,2,3;
9,5,4;
9,8,2];
drawSudoku(B) % For the listing of this program, see the end of this example.
This puzzle, and an alternative MATLAB® solution technique, was featured in Cleve's Corner in 2009.
There are many approaches to solving Sudoku puzzles manually, as well as many programmatic
approaches. This example shows a straightforward approach using binary integer programming.
This approach is particularly simple because you do not give a solution algorithm. Just express the
rules of Sudoku, express the clues as constraints on the solution, and then MATLAB produces the
solution.
The key idea is to transform a puzzle from a square 9-by-9 grid to a cubic 9-by-9-by-9 array of binary
values (0 or 1). Think of the cubic array as being 9 square grids stacked on top of each other, where
each layer corresponds to an integer from 1 through 9. The top grid, a square layer of the array, has a
1 wherever the solution or clue has a 1. The second layer has a 1 wherever the solution or clue has a
2. The ninth layer has a 1 wherever the solution or clue has a 9.
The objective function is not needed here, and might as well be a constant term 0. The problem is
really just to find a feasible solution, meaning one that satisfies all the constraints. However, for tie
breaking in the internals of the integer programming solver, giving increased solution speed, use a
nonconstant objective function.
Suppose a solution x is represented in a 9-by-9-by-9 binary array. What properties does x have? First,
each square in the 2-D grid (i,j) has exactly one value, so there is exactly one nonzero element among
the 3-D array entries x(i, j, 1), . . . , x(i, j, 9). In other words, for every i and j,
∑_{k=1}^{9} x(i, j, k) = 1 .
Similarly, in each row i of the 2-D grid, there is exactly one value out of each of the digits from 1 to 9.
In other words, for each i and k,
∑_{j=1}^{9} x(i, j, k) = 1 .
And each column j in the 2-D grid has the same property: for each j and k,
∑_{i=1}^{9} x(i, j, k) = 1 .
The major 3-by-3 grids have a similar constraint. For the grid elements 1 ≤ i ≤ 3 and 1 ≤ j ≤ 3, and
for each 1 ≤ k ≤ 9,
∑_{i=1}^{3} ∑_{j=1}^{3} x(i, j, k) = 1 .
To represent all nine major grids, just add 3 or 6 to each i and j index:
∑_{i=1}^{3} ∑_{j=1}^{3} x(i + U, j + V, k) = 1, where U, V ∈ {0, 3, 6} .
Express Clues
Each initial value (clue) can be expressed as a constraint. Suppose that the (i, j) clue is m for some 1 ≤ m ≤ 9. Then x(i, j, m) = 1. The constraint ∑_{k=1}^{9} x(i, j, k) = 1 ensures that all other x(i, j, k) = 0 for k ≠ m.
x = optimvar('x',9,9,9,'Type','integer','LowerBound',0,'UpperBound',1);
Create an optimization problem with a rather arbitrary objective function. The objective function can
help the solver by destroying the inherent symmetry of the problem.
sudpuzzle = optimproblem;
mul = ones(1,1,9);
mul = cumsum(mul,3);
sudpuzzle.Objective = sum(sum(sum(x,1),2).*mul);
Represent the constraints that the sums of x in each coordinate direction are one.
sudpuzzle.Constraints.consx = sum(x,1) == 1;
sudpuzzle.Constraints.consy = sum(x,2) == 1;
sudpuzzle.Constraints.consz = sum(x,3) == 1;
Create the constraints that the sums of the major grids sum to one as well.
majorg = optimconstr(3,3,9);
for u = 1:3
for v = 1:3
arr = x(3*(u-1)+1:3*(u-1)+3,3*(v-1)+1:3*(v-1)+3,:);
majorg(u,v,:) = sum(sum(arr,1),2) == ones(1,1,9);
end
end
sudpuzzle.Constraints.majorg = majorg;
Include the initial clues by setting lower bounds of 1 at the clue entries. This setting fixes the value of
the corresponding entry to be 1, and so sets the solution at each clued value to be the clue entry.
for u = 1:size(B,1)
x.LowerBound(B(u,1),B(u,2),B(u,3)) = 1;
end
sudsoln = solve(sudpuzzle)
Intlinprog stopped at the root node because the objective value is within a gap
tolerance of the optimal value, options.AbsoluteGapTolerance = 0 (the default
value). The intcon variables are integer within tolerance,
options.IntegerTolerance = 1e-05 (the default value).
Round the solution to ensure that all entries are integers, and display the solution.
sudsoln.x = round(sudsoln.x);
y = ones(size(sudsoln.x));
for k = 2:9
y(:,:,k) = k; % multiplier for each depth k
end
S = sudsoln.x.*y; % multiply each entry by its depth
S = sum(S,3); % S is 9-by-9 and holds the solved puzzle
drawSudoku(S)
function drawSudoku(B)
% Function for drawing the Sudoku board
if size(B,2) == 9 % 9 columns
[SM,SN] = meshgrid(1:9); % make i,j entries
B = [SN(:),SM(:),B(:)]; % i,j,k rows
end
for ii = 1:size(B,1)
text(B(ii,2)-0.5,9.5-B(ii,1),num2str(B(ii,3)))
end
hold off
end
See Also
More About
• “Solve Sudoku Puzzles Via Integer Programming: Solver-Based” on page 9-72
Minimize Makespan in Parallel Processing
Problem Setup
This example has 11 processors and 40 tasks. The time for each processor to process each task is
given in the array processingTime. For this example, generate random processing times.
processingTime(i,j) represents the amount of time that processor i takes to process task j.
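The data setup is not shown in this excerpt; a sketch (the uniform random scaling is an assumption for illustration):

```matlab
rng default % for reproducibility
numberOfProcessors = 11;
numberOfTasks = 40;
processingTime = rand(numberOfProcessors,numberOfTasks); % assumed random times
```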
To solve the problem using binary integer programming, create process as a binary optimization
variable array, where process(i,j) = 1 means processor i processes task j.
process = optimvar('process',size(processingTime),'Type','integer','LowerBound',0,'UpperBound',1);
assigneachtask = sum(process,1) == 1;
makespan = optimvar('makespan','LowerBound',0);
Compute the time that each processor requires to process its tasks.
computetime = sum(process.*processingTime,2);
Relate the compute times to the makespan. The makespan is greater than or equal to each compute
time.
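The constraint relating the compute times to the makespan, named `makespanbound` as in the problem assembly below, can be written as:

```matlab
makespanbound = computetime <= makespan; % makespan >= each processor's total time
```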
Create an optimization problem whose objective is to minimize the makespan, and include the
problem constraints.
scheduleproblem = optimproblem('Objective',makespan);
scheduleproblem.Constraints.assigneachtask = assigneachtask;
scheduleproblem.Constraints.makespanbound = makespanbound;
options = optimoptions(scheduleproblem,'Display',"off");
[sol,fval,exitflag] = solve(scheduleproblem,'Options',options)
fval = 1.3634
exitflag =
OptimalSolution
The returned exitflag indicates that the solver found an optimal solution, meaning the returned
solution has minimal makespan.
The returned makespan is 1.3634. Is this an efficient schedule? To find out, view the resulting
schedule as a stacked bar chart. First, create a schedule matrix where row i represents the tasks
done by processor i. Then, find the processing time for each entry in the schedule.
processval = round(sol.process);
maxlen = max(sum(processval,2)); % Required width of the matrix
% Now calculate the schedule matrix
optimalSchedule = zeros(numberOfProcessors,maxlen);
ptime = optimalSchedule;
for i = 1:numberOfProcessors
schedi = find(processval(i,:));
optimalSchedule(i,1:length(schedi)) = schedi;
ptime(i,1:length(schedi)) = processingTime(i,schedi);
end
optimalSchedule
optimalSchedule = 11×10
25 38 0 0 0 0 0 0 0 0
4 12 23 32 0 0 0 0 0 0
7 13 14 18 31 37 0 0 0 0
35 0 0 0 0 0 0 0 0 0
6 22 39 0 0 0 0 0 0 0
10 26 28 30 0 0 0 0 0 0
20 0 0 0 0 0 0 0 0 0
21 24 27 0 0 0 0 0 0 0
8 16 33 0 0 0 0 0 0 0
3 17 34 0 0 0 0 0 0 0
Minimize Makespan in Parallel Processing
Display the schedule matrix as a stacked bar chart. Label the top of each bar with the task number.
figure
bar(ptime,'stacked')
xlabel('Processor Number')
ylabel('Processing Time')
title('Task Assignments to Processors')
for i=1:size(optimalSchedule,1)
for j=1:size(optimalSchedule,2)
if optimalSchedule(i,j) > 0
processText = num2str(optimalSchedule(i,j),"%d");
hText = text(i,sum(ptime(i,1:j),2),processText);
set(hText,"VerticalAlignment","top","HorizontalAlignment","center","FontSize",10,"Color","w")
end
end
end
Find the minimum height of the stacked bars, which represents the earliest time a processor stops
working. Then, find the processor corresponding to the maximum height.
minlength = min(sum(ptime,2))
minlength = 1.0652
[~,processormaxlength] = max(sum(ptime,2))
processormaxlength = 7
All processors are busy until time minlength = 1.0652. From the stacked bar chart, you can see that
processor 8 stops working at that time. Processor processormaxlength = 7 is the last processor to
stop working, which occurs at time makespan = 1.3634.
See Also
solve
More About
• “Problem-Based Optimization Workflow” on page 10-2
• “Linear Programming and Mixed-Integer Linear Programming”
Investigate Linear Infeasibilities
If linear constraints cause a problem to be infeasible, you might want to find a subset of the
constraints that is infeasible, but removing any member of the subset makes the problem feasible.
The name for such a subset is Irreducible Infeasible Subset of Constraints, abbreviated IIS.
Conversely, you might want to find a maximum cardinality subset of constraints that is feasible. This
subset is called a Maximum Feasible Subset, abbreviated MaxFS. The two concepts are related, but
not identical. A problem can have many different IISs, some with different cardinality. So one IIS does
not necessarily correspond to a MaxFS.
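For a small illustration (separate from the example that follows), consider one variable x with the constraints x ≤ 0, x ≥ 1, and x ≤ 2. The first two constraints form the only IIS, and dropping either one of them gives a MaxFS of cardinality two. A sketch:

```matlab
% Toy system in one variable x:  x <= 0,  x >= 1 (written -x <= -1),  x <= 2
A = [1; -1; 1];
b = [0; -1; 2];
[~,~,exitflag] = linprog(0,A,b);           % exitflag = -2: infeasible
% {x <= 0, x >= 1} is the only IIS; dropping either member restores feasibility
[~,~,flag] = linprog(0,A([1 3]),b([1 3])); % constraints x <= 0 and x <= 2: feasible
```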
This example shows two ways of finding an IIS, and two ways of obtaining a feasible set of
constraints.
Infeasible Example
Create a random matrix A representing linear inequalities of size 150-by-15. Set the corresponding
vector b to a vector with entries of 10, and change 5% of those values to –10.
N = 15;
rng default
A = randn([10*N,N]);
b = 10*ones(size(A,1),1);
Aeq = [];
beq = [];
b(rand(size(b)) <= 0.05) = -10;
f = ones(N,1);
lb = -f;
ub = f;
[x,fval,exitflag] = linprog(f,A,b,Aeq,beq,lb,ub);
Deletion Filter
To identify an IIS, perform the following steps. Given a set of linear constraints numbered 1 through
N, where all problem constraints together are infeasible, examine each constraint i in turn:
• Remove constraint i from the problem.
• Solve the reduced problem. If the problem becomes feasible, return constraint i to the problem. If
the problem remains infeasible, leave constraint i out and continue to the next constraint.
At the end of this procedure, the constraints that remain in the problem form an IIS.
For MATLAB® code that implements this procedure, see the deletionfilter helper function at the
end of this example.
Note: If you use the live script file for this example, the deletionfilter function is already
included at the end of the file. Otherwise, you need to create this function at the end of your .m file or
add it as a file on the MATLAB path. The same is true for the other helper functions used later in this
example.
Run the deletion filter on the problem.
[ineqs,eqs,ncalls] = deletionfilter(A,b,Aeq,beq,lb,ub);
The problem has no equality constraints. Find the indices for the inequality constraints and the value
of b(iis).
iis = find(ineqs)
iis = 114
b(iis)
ans = -10
Only one inequality constraint causes the problem to be infeasible, namely the constraint
A(iis,:)*x ≤ b(iis) = –10.
Why is this constraint infeasible? Find the sum of the absolute values of that row of A.
disp(sum(abs(A(iis,:))))
8.4864
The x vector has values between –1 and 1, so A(iis,:)*x is at least –8.4864, which means the
constraint A(iis,:)*x ≤ b(iis) = –10 cannot be satisfied.
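You can confirm this bound directly: over the box –1 ≤ x ≤ 1, the minimum of A(iis,:)*x equals –sum(abs(A(iis,:))). A sketch, assuming the variables defined above:

```matlab
% Minimum of the linear function A(iis,:)*x over the box -1 <= x <= 1
minval = -sum(abs(A(iis,:))); % about -8.4864
% minval > b(iis) = -10, so A(iis,:)*x <= b(iis) cannot be satisfied
```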
disp(ncalls)
150
The problem has 150 linear constraints, so the deletion filter called linprog 150 times.
Elastic Filter
As an alternative to the deletion filter, which examines every constraint, try the elastic filter. This
filter works as follows.
First, allow each constraint i to be violated by a positive amount e(i), where equality constraints
have both additive and subtractive positive elastic values.
Aineq*x ≤ bineq + e
Aeq*x = beq + e1 − e2
Next, minimize the sum of the elastic variables over both x and e,
min ∑ e(i)
subject to the elasticized constraints and the original bounds.
• If the associated LP has a solution, remove all constraints that have a strictly positive associated
ei, and record those constraints in a list of indices (potential IIS members). Return to the previous
step to solve the new, reduced associated LP.
• If the associated LP has no solution (is infeasible) or has no strictly positive associated ei, exit the
filter.
The elastic filter can exit in many fewer iterations than the deletion filter, because it can bring many
indices at once into the IIS, and can halt without going through the entire list of indices. However, the
problem has more variables than the original problem, and its resulting list of indices can be larger
than an IIS. To find an IIS after running an elastic filter, run the deletion filter on the result.
For MATLAB® code that implements this filter, see the elasticfilter helper function at the end of
this example.
[ineqselastic,eqselastic,ncall] = ...
elasticfilter(A,b,Aeq,beq,lb,ub);
The problem has no equality constraints. Find the indices for the inequality constraints.
iiselastic = find(ineqselastic)
iiselastic = 5×1
2
60
82
97
114
The elastic IIS lists five constraints, whereas the deletion filter found only one. Run the deletion filter
on the returned set to find a genuine IIS.
[ineqsdel,~,ncall2] = deletionfilter(A(iiselastic,:),b(iiselastic),Aeq,beq,lb,ub);
iiselasticdeletion = find(ineqsdel)
iiselasticdeletion = 5
The fifth constraint in the elastic filter result, inequality 114, is the IIS. This result agrees with the
answer from the deletion filter. The difference between the approaches is that the combined elastic
and deletion filter approach uses many fewer linprog calls. Display the total number of linprog
calls used by the elastic filter followed by the deletion filter.
disp(ncall + ncall2)
Remove IIS in a Loop
Generally, obtaining a single IIS does not enable you to find all the reasons that your optimization
problem fails. To correct an infeasible problem, you can repeatedly find an IIS and remove it from the
problem until the problem becomes feasible.
The following code shows how to remove one IIS at a time from a problem until the problem becomes
feasible. The code uses an indexing technique to keep track of constraints in terms of their positions
in the original problem, before the algorithm removes any constraints.
The code keeps track of the original variables in the problem by using a Boolean vector activeA to
represent the current constraints (rows) of the A matrix, and a Boolean vector activeAeq to
represent the current constraints of the Aeq matrix. When adding or removing constraints, the code
indexes into A or Aeq so that the row numbers do not change, even though the number of constraints
changes.
Running this code returns idx2, a vector of the indices of the nonzero elements in activeA:
idx2 = find(activeA)
Suppose that var is a Boolean vector that has the same length as idx2. Then
idx2(find(var))
expresses var as indices into the original problem variables. In this way, the indexing can take a
subset of a subset of constraints, work with only the smaller subset, and still unambiguously refer to
the original problem variables.
opts = optimoptions('linprog','Display',"none");
activeA = true(size(b));
activeAeq = true(size(beq));
[~,~,exitflag] = linprog(f,A,b,Aeq,beq,lb,ub,opts);
ncl = 1;
while exitflag < 0
[ineqselastic,eqselastic,ncall] = ...
elasticfilter(A(activeA,:),b(activeA),Aeq(activeAeq,:),beq(activeAeq),lb,ub);
ncl = ncl + ncall;
idxaa = find(activeA);
idxae = find(activeAeq);
tmpa = idxaa(find(ineqselastic));
tmpae = idxae(find(eqselastic));
AA = A(tmpa,:);
bb = b(tmpa);
AE = Aeq(tmpae,:);
be = beq(tmpae);
[ineqs,eqs,ncall] = ...
deletionfilter(AA,bb,AE,be,lb,ub);
ncl = ncl + ncall;
activeA(tmpa(ineqs)) = false;
activeAeq(tmpae(eqs)) = false;
    disp(['Removed inequalities ',int2str((tmpa(ineqs))'),' and equalities ',int2str((tmpae(eqs))')])
[~,~,exitflag] = ...
linprog(f,A(activeA,:),b(activeA),Aeq(activeAeq,:),beq(activeAeq),lb,ub,opts);
ncl = ncl + 1;
end
Notice that the loop removes inequalities 64 and 82 simultaneously, which indicates that these two
constraints form an IIS.
Find MaxFS
Another approach for obtaining a feasible set of constraints is to find a MaxFS directly. As Chinneck
[1] explains, finding a MaxFS is an NP-hard problem, meaning no known algorithm is guaranteed to
find a MaxFS efficiently. However, Chinneck proposes some algorithms that work efficiently in
practice.
Use Chinneck's Algorithm 7.3 to find a cover set of constraints that, when removed, gives a feasible
set. The algorithm is implemented in the generatecover helper function at the end of this example.
[coversetineq,coverseteq,nlp] = generatecover(A,b,Aeq,beq,lb,ub)
coversetineq = 5×1
114
97
60
82
2
coverseteq =
[]
nlp = 40
usemeineq = true(size(b));
usemeineq(coversetineq) = false; % Remove inequality constraints
usemeeq = true(size(beq));
usemeeq(coverseteq) = false; % Remove equality constraints
[xs,fvals,exitflags] = ...
    linprog(f,A(usemeineq,:),b(usemeineq),Aeq(usemeeq,:),beq(usemeeq),lb,ub);
Notice that the cover set is exactly the same as the iiselastic set from the Elastic Filter
section. In general, the elastic filter finds too large a cover set. Chinneck's Algorithm 7.3 starts with the
elastic filter result and then retains only the constraints that are necessary.
Chinneck's Algorithm 7.3 takes 40 calls to linprog to complete the calculation of a MaxFS. This
number is a bit more than the 28 calls used earlier in the process of deleting IISs in a loop.
Also, notice that the inequalities removed in the loop are not exactly the same as the inequalities
removed by Algorithm 7.3. The loop removes inequalities 114, 97, 82, 60, and 64, while Algorithm 7.3
removes inequalities 114, 97, 82, 60, and 2. Check that inequalities 82 and 64 form an IIS (as
indicated in the Remove IIS in a Loop section), and that inequalities 82 and 2 also form an IIS.
usemeineq = false(size(b));
usemeineq([82,64]) = true;
ineqs = deletionfilter(A(usemeineq,:),b(usemeineq),Aeq,beq,lb,ub);
disp(ineqs)
1
1
usemeineq = false(size(b));
usemeineq([82,2]) = true;
ineqs = deletionfilter(A(usemeineq,:),b(usemeineq),Aeq,beq,lb,ub);
disp(ineqs)
1
1
References
[1] Chinneck, J. W. "Feasibility and Infeasibility in Optimization." Tutorial for CP-AI-OR-07, Brussels,
Belgium. Available at https://www.sce.carleton.ca/faculty/chinneck/docs/CPAIOR07InfeasibilityTutorial.pdf.
Helper Functions
This code creates the deletionfilter helper function.
function [ineq_iis,eq_iis,ncalls] = deletionfilter(Aineq,bineq,Aeq,beq,lb,ub)
% Deletion filter: remove each constraint in turn; if the problem becomes
% feasible, the constraint belongs to the IIS, so return it to the problem.
ncalls = 0;
[mi,n] = size(Aineq); % Number of inequality constraints and variables
me = size(Aeq,1);     % Number of equality constraints
f = zeros(1,n);       % Zero objective: only feasibility matters
opts = optimoptions('linprog','Display',"none");
ineq_iis = true(mi,1); % true means the constraint is in the problem
eq_iis = true(me,1);
for i=1:mi
ineq_iis(i) = 0; % Remove inequality i
[~,~,exitflag] = linprog(f,Aineq(ineq_iis,:),bineq(ineq_iis),...
Aeq,beq,lb,ub,[],opts);
ncalls = ncalls + 1;
if exitflag == 1 % If now feasible
ineq_iis(i) = 1; % Return i to the problem
end
end
for i=1:me
eq_iis(i) = 0; % Remove equality i
[~,~,exitflag] = linprog(f,Aineq,bineq,...
Aeq(eq_iis,:),beq(eq_iis),lb,ub,[],opts);
ncalls = ncalls + 1;
if exitflag == 1 % If now feasible
eq_iis(i) = 1; % Return i to the problem
end
end
end
This code creates the elasticfilter helper function.
function [ineq_iis,eq_iis,ncalls,fval0] = elasticfilter(Aineq,bineq,Aeq,beq,lb,ub)
% Elastic filter: add nonnegative elastic variables that allow constraint
% violations, and minimize the sum of the elastic variables.
ncalls = 0;
[mi,n] = size(Aineq); % Number of inequality constraints and variables
me = size(Aeq,1);     % Number of equality constraints
Aineq_r = [Aineq, -eye(mi), zeros(mi,2*me)];    % Aineq*x - e <= bineq
Aeq_r = [Aeq, zeros(me,mi), -eye(me), eye(me)]; % Aeq*x - e1 + e2 = beq
lb_r = [lb; zeros(mi+2*me,1)]; % Elastic variables are nonnegative
ub_r = [ub; inf(mi+2*me,1)];
f = [zeros(1,n), ones(1,mi+2*me)]; % Minimize the sum of elastic variables
ineq_slack_offset = n;
eq_pos_slack_offset = n + mi;
eq_neg_slack_offset = n + mi + me;
tol = 1e-10;
opts = optimoptions('linprog','Display',"none");
ineq_iis = false(mi,1);
eq_iis = false(me,1);
[x,fval,exitflag] = linprog(f,Aineq_r,bineq,Aeq_r,beq,lb_r,ub_r,[],opts);
fval0 = fval;
ncalls = ncalls + 1;
while exitflag == 1 && fval > tol % Feasible and some slacks are nonzero
c = 0;
for i = 1:mi
j = ineq_slack_offset+i;
if x(j) > tol
ub_r(j) = 0.0;
ineq_iis(i) = true;
c = c+1;
end
end
for i = 1:me
j = eq_pos_slack_offset+i;
if x(j) > tol
ub_r(j) = 0.0;
eq_iis(i) = true;
c = c+1;
end
end
for i = 1:me
j = eq_neg_slack_offset+i;
if x(j) > tol
ub_r(j) = 0.0;
eq_iis(i) = true;
c = c+1;
end
end
[x,fval,exitflag] = linprog(f,Aineq_r,bineq,Aeq_r,beq,lb_r,ub_r,[],opts);
if fval > 0
fval0 = fval;
end
ncalls = ncalls + 1;
end
end
This code creates the generatecover helper function. The code uses the same indexing technique
for keeping track of constraints as the code in the Remove IIS in a Loop section.
% Step 2 of Algorithm 7.3: test each candidate equality constraint
for i = 1:length(candidateeq(:))
activeAeq(candidateeq(i)) = false;
idx2 = find(activeA);
idx2eq = find(activeAeq);
        [ineq_iis,eq_iis,ncalls,fval] = elasticfilter(Aineq(activeA,:),bineq(activeA),Aeq(activeAeq,:),beq(activeAeq),lb,ub);
nlp = nlp + ncalls;
ineq_iis = idx2(find(ineq_iis));
eq_iis = idx2eq(find(eq_iis));
if fval == 0
coverseteq = [coverseteq;candidateeq(i)];
return
end
if fval < minsinf
ineqflag = -1;
winner = candidateeq(i);
minsinf = fval;
holdseteq = eq_iis;
if numel(ineq_iis(:)) + numel(eq_iis(:)) == 1
nextwinner = ineq_iis;
nextwinner2 = eq_iis;
                nextwinner = [nextwinner,nextwinner2];
else
nextwinner = [];
end
end
activeAeq(candidateeq(i)) = true;
end
% Step 3 of Algorithm 7.3
if ineqflag == 1
coversetineq = [coversetineq;winner];
activeA(winner) = false;
if nextwinner
coversetineq = [coversetineq;nextwinner];
return
end
end
if ineqflag == -1
coverseteq = [coverseteq;winner];
activeAeq(winner) = false;
if nextwinner
coverseteq = [coverseteq;nextwinner];
return
end
end
candidateineq = holdsetineq;
candidateeq = holdseteq;
end
end
See Also
linprog
More About
• “Solve Nonlinear Feasibility Problem, Problem-Based” on page 7-30
• “Converged to an Infeasible Point” on page 4-6
10
Problem-Based Optimization
Problem-Based Optimization Workflow
Note Optimization Toolbox provides two approaches for solving single-objective optimization
problems. This topic describes the problem-based approach. “Solver-Based Optimization Problem
Setup” describes the solver-based approach.
To solve an optimization problem, perform these steps:
• Create a problem object by using optimproblem. For example, create a maximization problem:
prob = optimproblem('ObjectiveSense','maximize');
• Create named variables by using optimvar. An optimization variable is a symbolic variable that
you use to describe the problem objective and constraints. Include any bounds in the variable
definitions.
For example, create a 15-by-3 array of binary variables named 'x':
x = optimvar('x',15,3,'Type','integer','LowerBound',0,'UpperBound',1);
• Define the objective function in the problem object as an expression in the named variables.
Note If you have a nonlinear function that is not composed of polynomials, rational expressions,
and elementary functions such as exp, then convert the function to an optimization expression by
using fcn2optimexpr. See “Convert Nonlinear Function to Optimization Expression” on page 7-8
and “Supported Operations on Optimization Variables and Expressions” on page 10-36.
For example, assume that you have a real matrix f of the same size as a matrix of variables x, and
the objective is the sum of the entries in f times the corresponding variables x.
prob.Objective = sum(sum(f.*x));
• Define constraints for optimization problems as either comparisons in the named variables or as
comparisons of expressions.
Note If you have a nonlinear function that is not composed of polynomials, rational expressions,
and elementary functions such as exp, then convert the function to an optimization expression by
using fcn2optimexpr. See “Convert Nonlinear Function to Optimization Expression” on page 7-8
and “Supported Operations on Optimization Variables and Expressions” on page 10-36.
For example, assume that the sum of the variables in each row of x must be one, and the sum of
the variables in each column must be no more than one.
onesum = sum(x,2) == 1;
vertsum = sum(x,1) <= 1;
prob.Constraints.onesum = onesum;
prob.Constraints.vertsum = vertsum;
• For nonlinear problems, set an initial point as a structure whose fields are the optimization
variable names. For example:
x0.x = randn(size(x));
x0.y = eye(4); % Assumes y is a 4-by-4 variable
• Solve the problem by using solve.
sol = solve(prob);
% Or, for nonlinear problems,
sol = solve(prob,x0)
In addition to these basic steps, you can review the problem definition before solving the problem by
using show or write. Set options for solve by using optimoptions, as explained in “Change
Default Solver or Options” on page 10-14.
Warning The problem-based approach does not support complex values in an objective function,
nonlinear equalities, or nonlinear inequalities. If a function calculation has a complex value, even as
an intermediate value, the final result can be incorrect.
For a basic mixed-integer linear programming example, see “Mixed-Integer Linear Programming
Basics: Problem-Based” on page 10-40 or the video version Solve a Mixed-Integer Linear
Programming Problem Using Optimization Modeling. For a nonlinear example, see “Solve a
Constrained Nonlinear Problem, Problem-Based” on page 1-5. For more extensive examples, see
“Problem-Based Nonlinear Optimization”, “Linear Programming and Mixed-Integer Linear
Programming”, or “Quadratic Programming”.
See Also
fcn2optimexpr | optimoptions | optimproblem | optimvar | show | solve | write
More About
• “Mixed-Integer Linear Programming Basics: Problem-Based” on page 10-40
• Solve a Mixed-Integer Linear Programming Problem Using Optimization Modeling
• “Optimization Expressions” on page 10-6
• “Review or Modify Optimization Problems” on page 10-14
• “Examine Optimization Solution” on page 10-25
Problem-Based Workflow for Solving Equations
Note Optimization Toolbox provides two approaches for solving equations. This topic describes the
problem-based approach. “Solver-Based Optimization Problem Setup” describes the solver-based
approach.
To solve a system of equations, perform these steps:
• Create an equation problem object by using eqnproblem. For example:
prob = eqnproblem;
• Create named variables by using optimvar. An optimization variable is a symbolic variable that
you use to describe the equations. Include any bounds in the variable definitions.
For example, create a 15-by-3 array of variables named 'x' with lower bounds of 0 and upper
bounds of 1.
x = optimvar('x',15,3,'LowerBound',0,'UpperBound',1);
• Define equations in the problem variables. For example:
sumeq = sum(x,2) == 1;
prob.Equations.sumeq = sumeq;
Note If you have a nonlinear function that is not composed of polynomials, rational expressions,
and elementary functions such as exp, then convert the function to an optimization expression by
using fcn2optimexpr. See “Convert Nonlinear Function to Optimization Expression” on page 7-8
and “Supported Operations on Optimization Variables and Expressions” on page 10-36.
• For nonlinear problems, set an initial point as a structure whose fields are the optimization
variable names. For example:
x0.x = randn(size(x));
x0.y = eye(4); % Assumes y is a 4-by-4 variable
• Solve the problem by using solve.
sol = solve(prob);
% Or, for nonlinear problems,
sol = solve(prob,x0)
In addition to these basic steps, you can review the problem definition before solving the problem by
using show or write. Set options for solve by using optimoptions, as explained in “Change
Default Solver or Options” on page 10-14.
Warning The problem-based approach does not support complex values in an objective function,
nonlinear equalities, or nonlinear inequalities. If a function calculation has a complex value, even as
an intermediate value, the final result can be incorrect.
For a basic equation-solving example with polynomials, see “Solve Nonlinear System of Polynomials,
Problem-Based” on page 13-21. For a general nonlinear example, see “Solve Nonlinear System of
Equations, Problem-Based” on page 13-19. For more extensive examples, see “Systems of Nonlinear
Equations”.
See Also
eqnproblem | fcn2optimexpr | optimoptions | optimvar | show | solve | write
More About
• “Systems of Nonlinear Equations”
• “Optimization Expressions” on page 10-6
• “Review or Modify Optimization Problems” on page 10-14
• “Examine Optimization Solution” on page 10-25
Optimization Expressions
In this section...
“What Are Optimization Expressions?” on page 10-6
“Expressions for Objective Functions” on page 10-6
“Expressions for Constraints and Equations” on page 10-7
“Optimization Variables Have Handle Behavior” on page 10-9
What Are Optimization Expressions?
Optimization expressions are polynomial or rational combinations of optimization variables.
Optimization expressions also result from many MATLAB operations on optimization variables, such
as transpose or concatenation of variables. For the list of supported operations on optimization
expressions, see “Supported Operations on Optimization Variables and Expressions” on page 10-36.
Finally, optimization expressions can be the result of applying fcn2optimexpr to a MATLAB function
acting on optimization variables. For details, see “Convert Nonlinear Function to Optimization
Expression” on page 7-8.
Optimization modeling functions do not allow you to specify complex, Inf, or NaN values. If you
obtain such an expression through operations, the expression cannot be displayed. See “Expression
Contains Inf or NaN” on page 10-35.
Expressions for Objective Functions
An objective function must be a scalar optimization expression. For example, create a vector
expression and a scalar expression:
y = optimvar('y',5,3);
expr = sum(y,2); % a 5-by-1 vector
expr2 = [1:5]*expr;
The expression expr is not suitable for an objective function because it is a vector. The expression
expr2 is suitable for an objective function.
Note If you have a nonlinear function that is not composed of polynomials, rational expressions, and
elementary functions such as exp, then convert the function to an optimization expression by using
fcn2optimexpr. See “Convert Nonlinear Function to Optimization Expression” on page 7-8 and
“Supported Operations on Optimization Variables and Expressions” on page 10-36.
To include an expression as an objective function in a problem, use dot notation, or include the
objective when you create the problem.
prob = optimproblem;
prob.Objective = expr2;
% or equivalently
prob = optimproblem('Objective',expr2);
x = optimvar('x',3,3,'Type','integer','LowerBound',0,'UpperBound',1);
y = optimvar('y',3,3);
expr = optimexpr;
for i = 1:3
for j = 1:3
expr = expr + y(j,i) - x(i,j);
end
end
show(expr)
x = optimvar('x',3,3,'Type','integer','LowerBound',0,'UpperBound',1);
y = optimvar('y',3,3);
expr = sum(sum(y' - x));
show(expr)
Note If your objective function is a sum of squares, and you want solve to recognize it as such,
write it as sum(expr.^2), and not as expr'*expr. The internal parser recognizes only explicit
sums of squares. For an example, see “Nonnegative Linear Least Squares, Problem-Based” on page
12-40.
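For instance, this sketch with hypothetical data writes a least-squares objective in the recognized form:

```matlab
x = optimvar('x',3);
d = [1;2;3]; % hypothetical data vector
prob = optimproblem;
prob.Objective = sum((x - d).^2); % explicit sum of squares: recognized
% prob.Objective = (x - d)'*(x - d); % same value, but not recognized as least squares
```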
Expressions for Constraints and Equations
Constraints are any two comparable optimization expressions joined by one of the comparison
operators ==, <=, or >=. For example, create variables and a pair of comparable expressions:
x = optimvar('x',3,2,'Type','integer','LowerBound',0,'UpperBound',1);
y = optimvar('y',2,4);
z = optimvar('z');
constr1 = (sum(x,1))' <= sum(y,2);
constr2 = y <= z;
sum(x,1) is of size 1-by-2, so (sum(x,1))' is of size 2-by-1. sum(y,2) is of size 2-by-1, so the two
expressions are comparable.
Note If you have a nonlinear function that is not composed of polynomials, rational expressions, and
elementary functions such as exp, then convert the function to an optimization expression by using
fcn2optimexpr. See “Convert Nonlinear Function to Optimization Expression” on page 7-8 and
“Supported Operations on Optimization Variables and Expressions” on page 10-36.
To include constraints in a problem, use dot notation and give each constraint a different name.
prob = optimproblem;
prob.Constraints.constr1 = constr1;
prob.Constraints.constr2 = constr2;
prob.Constraints.constr3 = constr3;
Similarly, to include equations in a problem, use dot notation and give each equation a different
name.
prob = eqnproblem;
prob.Equations.eq1 = eq1;
prob.Equations.eq2 = eq2;
You can also include constraints or equations when you create a problem. For example, suppose that
you have 10 pairs of positive variables whose sums are no more than one.
x = optimvar('x',10,2,'LowerBound',0);
prob = optimproblem('Constraints',sum(x,2) <= 1);
To create constraint or equation expressions in a loop, start with an empty constraint expression as
returned by optimconstr, optimeq, or optimineq.
x = optimvar('x',3,2,'Type','integer','LowerBound',0,'UpperBound',1);
y = optimvar('y',2,4);
z = optimvar('z');
const1 = optimconstr(2);
for i = 1:2
const1(i) = x(1,i) - x(3,i) + 2*z >= 4*(y(i,2) + y(i,3) + 2*y(i,4));
end
show(const1)
(1, 1)
(2, 1)
x = optimvar('x',3,2,'Type','integer','LowerBound',0,'UpperBound',1);
y = optimvar('y',2,4);
z = optimvar('z');
const1 = x(1,:) - x(3,:) + 2*z >= 4*(y(:,1) + y(:,3) + 2*y(:,4))';
show(const1)
(1, 1)
(1, 2)
Tip For best performance, include variable bounds in the variable definitions, not in constraint
expressions. Also, performance generally improves when you create constraints without using loops.
See “Create Efficient Optimization Problems” on page 10-28.
Caution Each constraint expression in a problem must use the same comparison. For example, the
following code leads to an error, because cons1 uses the <= comparison, cons2 uses the >=
comparison, and cons1 and cons2 are in the same expression.
prob = optimproblem;
x = optimvar('x',2,'LowerBound',0);
cons1 = x(1) + x(2) <= 10;
cons2 = 3*x(1) + 4*x(2) >= 2;
prob.Constraints = [cons1;cons2]; % This line throws an error
You can avoid this error by using separate expressions for the constraints.
prob.Constraints.cons1 = cons1;
prob.Constraints.cons2 = cons2;
Optimization Variables Have Handle Behavior
OptimizationVariable objects have handle copy behavior, not value copy behavior, so a copy of a
variable refers to the same underlying object as the original. For example, copy a variable, and then
change a property of the copy. The change also applies to the original variable.
x = optimvar('x','LowerBound',1);
y = x;
y.LowerBound = 0;
showbounds(x)
0 <= x
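If you want a second variable with its own independent bounds, create a new variable by using optimvar instead of copying; a brief sketch:

```matlab
x = optimvar('x','LowerBound',1);
y = optimvar('y','LowerBound',1); % independent variable, not a handle copy
y.LowerBound = 0;                 % does not affect x
showbounds(x)                     % still shows 1 <= x
```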
See Also
OptimizationConstraint | OptimizationExpression | optimvar | show
More About
• “Mixed-Integer Linear Programming Basics: Problem-Based” on page 10-40
• “Problem-Based Optimization Workflow” on page 10-2
• “Review or Modify Optimization Problems” on page 10-14
Pass Extra Parameters in Problem-Based Approach
Sometimes objective or constraint functions use parameters, meaning data or other values that stay
fixed during the optimization. To include these parameters in the problem-based approach, simply
refer to workspace variables in your objective or constraint functions.
For example, suppose that you have matrices C and d in the particle.mat file, and these matrices
represent data for your problem. Load the data into your workspace.
load particle
disp(size(C))
2000 400
disp(size(d))
2000 1
Create an optimization variable x of a size that is suitable for forming the vector C*x.
x = optimvar('x',size(C,2));
Create an optimization problem to minimize the sum of squares of the terms in C*x – d subject to
the constraint that x is nonnegative.
x.LowerBound = 0;
prob = optimproblem;
expr = sum((C*x - d).^2);
prob.Objective = expr;
You include the data C and d into the problem simply by referring to them in the objective function
expression. Solve the problem.
[sol,fval,exitflag,output] = solve(prob)
fval = 22.5795
exitflag =
OptimalSolution
Use the same approach for nonlinear problems. For example, suppose that you have an objective
function of several variables, some of which are fixed data for the optimization.
type parameterfun
function y = parameterfun(x,a,b,c)
y = (a - b*x(1)^2 + x(1)^4/3)*x(1)^2 + x(1)*x(2) + (-c + c*x(2)^2)*x(2)^2;
For this objective function, x is a 2-element vector, and a, b, and c are scalar parameters. Create the
optimization variable and assign the parameter values in your workspace.
a = 4;
b = 2.1;
c = 4;
x = optimvar('x',2);
Create an optimization problem. Convert parameterfun to an optimization expression and set the
converted expression as the objective.
prob = optimproblem;
expr = fcn2optimexpr(@parameterfun,x,a,b,c);
prob.Objective = expr;
x0.x = [1/2;1/2];
[sol,fval] = solve(prob,x0)
fval = -1.0316
Because this objective function is a rational function of x, you can specify the objective in terms of the
optimization variable, without using fcn2optimexpr. Either way, you include the extra parameters
simply by referring to them in the objective function.
prob.Objective = (a - b*x(1)^2 + x(1)^4/3)*x(1)^2 + x(1)*x(2) + (-c + c*x(2)^2)*x(2)^2;
[sol,fval] = solve(prob,x0)
fval = -1.0316
See Also
fcn2optimexpr
More About
• “Passing Extra Parameters” on page 2-57
To review an optimization problem after you create it, use the show function. For example, create a
linear programming problem with two variables and two constraints, and review it.
prob = optimproblem;
x = optimvar('x',2,'LowerBound',0);
prob.Objective = x(1) - 2*x(2);
prob.Constraints.cons1 = x(1) + 2*x(2) <= 4;
prob.Constraints.cons2 = -x(1) + x(2) <= 1;
show(prob)
OptimizationProblem :
Solve for:
x
minimize :
x(1) - 2*x(2)
subject to cons1:
x(1) + 2*x(2) <= 4
subject to cons2:
-x(1) + x(2) <= 1
variable bounds:
0 <= x(1)
0 <= x(2)
This review shows the basic elements of the problem, such as whether the problem is to minimize or
maximize, and the variable bounds. The review shows the index names, if any, used in the variables.
The review does not show whether the variables are integer valued.
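To check whether a variable is integer valued, examine its Type property; a brief sketch:

```matlab
x = optimvar('x',2,'Type','integer','LowerBound',0);
disp(x.Type) % displays 'integer'
```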
To see the default solver and options, use optimoptions(prob). For example,
rng default
x = optimvar('x',3,'LowerBound',0);
expr = sum((rand(3,1).*x).^2);
prob = optimproblem('Objective',expr);
prob.Constraints.lincon = sum(sum(randn(size(x)).*x)) <= randn;
options = optimoptions(prob)
options =
lsqlin options:
Set properties:
No options set.
Default properties:
Algorithm: 'interior-point'
ConstraintTolerance: 1.0000e-08
Display: 'final'
LinearSolver: 'auto'
MaxIterations: 200
OptimalityTolerance: 1.0000e-08
StepTolerance: 1.0000e-12
The default solver for this problem is lsqlin, and you can see the default options.
To change the solver, set the 'Solver' name-value pair in solve. To see the applicable options for a
different solver, use optimoptions to pass the current options to the different solver. For example,
continuing the problem,
options = optimoptions('quadprog',options)
options =
quadprog options:
Set properties:
ConstraintTolerance: 1.0000e-08
MaxIterations: 200
OptimalityTolerance: 1.0000e-08
StepTolerance: 1.0000e-12
Default properties:
Algorithm: 'interior-point-convex'
Display: 'final'
LinearSolver: 'auto'
To change the options, use optimoptions or dot notation to set options, and pass the options to
solve in the 'Options' name-value pair. See “Options in Common Use: Tuning and
Troubleshooting” on page 2-61. Continuing the example,
options.Display = 'iter';
sol = solve(prob,'Options',options,'Solver','quadprog');
x = optimvar('x',9,9,9,'LowerBound',0,'UpperBound',1);
cons1 = sum(x,1) == 1;
cons2 = sum(x,2) == 1;
cons3 = sum(x,3) == 1;
prob = optimproblem;
prob.Constraints.cons1 = cons1;
prob.Constraints.cons2 = cons2;
prob.Constraints.cons3 = cons3;
mul = ones(1,1,9);
mul = cumsum(mul,3);
prob.Objective = sum(sum(sum(x,1),2).*mul);
cons4 = optimconstr(3,3,9);
for u = 1:3
for v = 1:3
arr = x(3*(u-1)+1:3*(u-1)+3,3*(v-1)+1:3*(v-1)+3,:);
cons4(u,v,:) = sum(sum(arr,1),2) <= ones(1,1,9);
end
end
prob.Constraints.cons4 = cons4;
B = [1,2,2;
1,5,3;
1,8,4;
2,1,6;
2,9,3;
3,3,4;
3,7,5;
4,4,8;
4,6,6;
5,1,8;
5,5,1;
5,9,6;
6,4,7;
6,6,5;
7,3,7;
7,7,6;
8,1,4;
8,9,8;
9,2,3;
9,5,4;
9,8,2];
for u = 1:size(B,1)
x.LowerBound(B(u,1),B(u,1),B(u,1)) = 1;
end
This script has some errors that you can find by examining the variables, objective, and constraints.
First, examine the variable x.
x
x =
Array-wide properties:
Name: 'x'
Type: 'continuous'
IndexNames: {{} {} {}}
Elementwise properties:
LowerBound: [9×9×9 double]
UpperBound: [9×9×9 double]
This display shows that the type of the variable is continuous. The variable should be integer valued.
Change the type.
x.Type = 'integer'
x =
Array-wide properties:
Name: 'x'
Type: 'integer'
IndexNames: {{} {} {}}
Elementwise properties:
LowerBound: [9×9×9 double]
UpperBound: [9×9×9 double]
Check the bounds. There should be 21 lower bounds with the value 1, one for each row of B. Because
x is a large array, write the bounds to a file instead of displaying them at the command line.
writebounds(x,'xbounds.txt')
Search the file xbounds.txt for all instances of 1 <=. Only nine lower bounds have the value 1, in
the variables x(1,1,1), x(2,2,2), …, x(9,9,9). To investigate this discrepancy, examine the code
where you set the lower bounds:
where you set the lower bounds:
for u = 1:size(B,1)
x.LowerBound(B(u,1),B(u,1),B(u,1)) = 1;
end
The line inside the loop should say x.LowerBound(B(u,1),B(u,2),B(u,3)) = 1;. Reset all
lower bounds to zero, then run the corrected code.
x.LowerBound = 0;
for u = 1:size(B,1)
x.LowerBound(B(u,1),B(u,2),B(u,3)) = 1;
end
writebounds(x,'xbounds.txt')
xbounds.txt now has the correct number of lower bound entries that are 1.
Examine the objective function. The objective function expression is large, so write the expression to
a file.
write(prob.Objective,'objectivedescription.txt')
write(prob.Constraints.cons1,'cons1.txt')
write(prob.Constraints.cons2,'cons2.txt')
write(prob.Constraints.cons3,'cons3.txt')
write(prob.Constraints.cons4,'cons4.txt')
Review cons4.txt and you see a mistake. All the constraints are inequalities rather than equalities.
Correct the lines of code that create this constraint and put the corrected constraint in the problem.
cons4 = optimconstr(3,3,9);
for u = 1:3
for v = 1:3
arr = x(3*(u-1)+1:3*(u-1)+3,3*(v-1)+1:3*(v-1)+3,:);
cons4(u,v,:) = sum(sum(arr,1),2) == ones(1,1,9);
end
end
prob.Constraints.cons4 = cons4;
sol = solve(prob);
x = round(sol.x);
y = ones(size(x));
for k = 2:9
y(:,:,k) = k; % multiplier for each depth k
end
S = x.*y; % multiply each entry by its depth
S = sum(S,3); % S is 9-by-9 and holds the solved puzzle
drawSudoku(S)
See Also
OptimizationConstraint | OptimizationExpression | OptimizationProblem |
OptimizationVariable | show | showbounds | write | writebounds
More About
• “Problem-Based Optimization Workflow” on page 10-2
• “Mixed-Integer Linear Programming Basics: Problem-Based” on page 10-40
Named Index for Optimization Variables
x = optimvar('x',["United","Lufthansa","Virgin Air"])
x =
1x3 OptimizationVariable array with properties:
Array-wide properties:
Name: 'x'
Type: 'continuous'
IndexNames: {{} {1x3 cell}}
Elementwise properties:
LowerBound: [-Inf -Inf -Inf]
UpperBound: [Inf Inf Inf]
optimvar automatically maps the names you specify to index numbers in the order of your variables.
For example, "United" corresponds to index 1, "Lufthansa" corresponds to index 2, and "Virgin
Air" corresponds to index 3. Display this last variable for confirmation.
show(x(3))
[ x('Virgin Air') ]
Index names enable you to address elements of x by name. For example:
route =
Linear OptimizationExpression
You can create or change the index names after you create a variable. However, you cannot change the size of an optimization variable after construction, so any new index names must match the existing size of the variable in each dimension. For example:
x = optimvar('x',3,2);
x.IndexNames = { {'row1','row2','row3'}, {'col1','col2'} };
You can also set the index names for each dimension individually:
x.IndexNames{1} = {'row1', 'row2', 'row3'};
x.IndexNames{2} = {'col1', 'col2'};
x.IndexNames{2}
Create bounds, an objective function, and linear constraints for x by using the named indices.
x('P1').LowerBound = 2500;
x('I2').UpperBound = 244000;
linprob = optimproblem;
linprob.Objective = 0.002614*x('HPS') + 0.0239*x('PP') + 0.009825*x('EP');
linprob.Constraints.cons1 = x('I1') - x('HE1') <= 132000;
You can use strings (" ") or character vectors (' ') interchangeably as index names. For example:
x("P2").LowerBound = 3000;
x('MPS').LowerBound = 271536;
showbounds(x)
0 <= x('BF1')
0 <= x('BF2')
0 <= x('EP')
0 <= x('PP')
There is no distinction between variables you specified with a string, such as x("P2"), and variables
you specified with a character vector, such as x('MPS').
Because named index variables have numeric equivalents, you can use ordinary summation and colon operators even when you have named index variables. For example, you can have constraints of these forms:
constr = sum(x) <= 100;
show(constr)
y = optimvar('y',{'red','green','blue'},{'plastic','wood','metal'},...
'Type','integer','LowerBound',0);
constr2 = y("red",:) == [5,7,3];
show(constr2)
(1, 1)
y('red', 'plastic') == 5
(1, 2)
y('red', 'wood') == 7
(1, 3)
y('red', 'metal') == 3
Intlinprog stopped at the root node because the objective value is within a gap
tolerance of the optimal value, options.AbsoluteGapTolerance = 0 (the default
value). The intcon variables are integer within tolerance,
options.IntegerTolerance = 1e-05 (the default value).
Find the optimal flow of oranges and berries to New York and Los Angeles.
idxFruit = 1×2
2 4
idxAirports = 1×2
1 3
orangeBerries = 2×2
0 980.0000
70.0000 0
This display means that no oranges are going to NYC, 70 berries are going to NYC, 980 oranges are
going to LAX, and no berries are going to LAX.
Fruit Airports
----- --------
Berries NYC
Apples BOS
Oranges LAX
idx = 1×3
4 5 10
optimalFlow = sol.flow(idx)
optimalFlow = 1×3
   70.0000   28.0000  980.0000
This display means that 70 berries are going to NYC, 28 apples are going to BOS, and 980 oranges are
going to LAX.
See Also
findindex | optimvar
More About
• “Problem-Based Optimization Workflow” on page 10-2
• “Create Initial Point for Optimization with Named Index Variables” on page 10-43
Examine Optimization Solution
x = optimvar('x');
y = optimvar('y');
prob = optimproblem;
prob.Objective = -x -y/3;
prob.Constraints.cons1 = x + y <= 2;
prob.Constraints.cons2 = x + y/4 <= 1;
prob.Constraints.cons3 = x - y <= 2;
prob.Constraints.cons4 = x/4 + y >= -1;
prob.Constraints.cons5 = x + y >= 1;
prob.Constraints.cons6 = -x + y <= 2;
sol = solve(prob)
sol =
x: 0.6667
y: 1.3333
Suppose that you want the objective function value at the solution. You can rerun the problem, this
time asking for the objective function value and the solution.
[sol,fval] = solve(prob)
sol =
x: 0.6667
y: 1.3333
fval =
-1.1111
Alternatively, for a time-consuming problem, save time by evaluating the objective function at the
solution using evaluate.
fval = evaluate(prob.Objective,sol)
fval =
-1.1111
• Check the exit flag. exitflag = OptimalSolution generally means that solve converged to
the solution. For an explanation of the other exitflag values, see exitflag.
• Check the exit message at the command line or in the output structure. When the exit message
states that the solver converged to a solution, then generally the solution is reliable. This message
corresponds to exitflag = OptimalSolution.
• When you have integer constraints, check the absolute gap and the relative gap in the exit
message or in the output structure. When these gaps are zero or nearly zero, the solution is
reliable.
Infeasible Solution
If solve reports that your problem is infeasible (the exit flag is NoFeasiblePointFound), examine
the problem infeasibility at a variety of points to see which constraints might be overly restrictive.
Suppose that you have a single continuous optimization variable named x that has finite bounds on all
components, and you have constraints constr1 through constr20.
N = 100; % check 100 points
infeas = zeros(N,20); % allocate
L = x.LowerBound;
U = x.UpperBound;
pthist = cell(N,1);
for k = 1:N
    pt.x = L + rand(size(L)).*(U - L); % random point within the bounds
    pthist{k} = pt;
    for j = 1:20
        infeas(k,j) = infeasibility(eval(['constr',num2str(j)]),pt);
    end
end
The result infeas(a,b) is nonzero wherever the associated point pthist{a} is infeasible for constraint b.
• Problem formulation is slow. If you have defined objective or constraint expressions in nested
loops, then solve can take a long time to convert the problem internally to a matrix form. To
speed the solution, try to formulate your expressions in a vectorized fashion. See “Create Efficient
Optimization Problems” on page 10-28.
• Mixed-integer linear programming solution is slow. Sometimes you can speed up an integer
problem by setting options. You can also reformulate the problem to make it faster to solve. See
“Tuning Integer Linear Programming” on page 9-35.
• Nonlinear programming solution is slow. For suggestions, see “Solver Takes Too Long” on page 4-
9. For further suggestions, see “When the Solver Fails” on page 4-3.
• Solver Limit Exceeded. To solve some problems, solve can take more than the default number of solution steps. For problems with integer constraints, increase the number of allowed steps by increasing the LPMaxIterations, MaxNodes, MaxTime, or RootLPMaxIterations options to higher-than-default values. To set these options, use optimoptions('intlinprog',...). For non-integer problems, increase the MaxIterations option using optimoptions('linprog','MaxIterations',...). See options.
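As a sketch (the specific option values here are illustrative assumptions, not recommendations), raising these limits for an integer problem looks like this:

```matlab
% Illustrative sketch: raise intlinprog limits above their defaults.
opts = optimoptions('intlinprog', ...
    'MaxNodes',1e7, ...   % allow more branch-and-bound nodes
    'MaxTime',7200);      % allow up to two hours of run time
sol = solve(prob,'options',opts);  % prob is an integer-constrained problem
```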
See Also
evaluate | infeasibility | solve
More About
• “Tuning Integer Linear Programming” on page 9-35
• “Exit Flags and Exit Messages” on page 3-3
• “Output Structures” on page 3-21
• “Lagrange Multiplier Structures” on page 3-22
• “Mixed-Integer Linear Programming Basics: Problem-Based” on page 10-40
Create Efficient Optimization Problems
Before you start solving the problem, sometimes you can improve the formulation of your problem
constraints or objective. Usually, it is faster for the software to create expressions for objective
function or constraints in a vectorized fashion rather than in a loop. Suppose that your objective
function is
∑_{i=1}^{30} ∑_{j=1}^{30} ∑_{k=1}^{10} x_{i,j,k} b_k c_{i,j},
where x is an optimization variable, and b and c are constants. Two general ways to formulate this objective function are as follows:
• Use a for loop. Accumulate the expression one term at a time:
expr = optimexpr;
for i = 1:30
for j = 1:30
for k = 1:10
expr = expr + x(i,j,k)*b(k)*c(i,j);
end
end
end
Here, expr contains the objective function expression. While this method is straightforward, it
can take excessive time to loop through many levels of for loops.
• Use a vectorized statement. Vectorized statements generally run faster than a for loop. You can
create a vectorized statement in several ways:
• Expand b and c. To enable term-wise multiplication, create constants that are the same size as
x.
bigb = reshape(b,1,1,10);
bigb = repmat(bigb,30,30,1);
bigc = repmat(c,1,1,10);
expr = sum(sum(sum(x.*bigb.*bigc)));
• Loop once over b.
expr = optimexpr;
for k = 1:10
expr = expr + sum(sum(x(:,:,k).*c))*b(k);
end
• Create an expression differently by looping over b and then summing terms after the loop.
expr = optimexpr(30,30,10);
for k = 1:10
expr(:,:,k) = x(:,:,k).*c*b(k);
end
expr = sum(expr(:));
See Also
More About
• “Tuning Integer Linear Programming” on page 9-35
• “Separate Optimization Model from Data” on page 10-30
Separate Optimization Model from Data
Suppose that you have a multiperiod scheduling problem with several products. The time periods are
in a vector, periods, and the products are in a string vector, products.
periods = 1:10;
products = ["strawberry","cherry","red grape",...
"green grape","nectarine","apricot"];
To create variables that represent the number of products used in each period, use statements that
take sizes from the data. For example:
usage = optimvar('usage',length(periods),products,...
'Type','integer','LowerBound',0);
To later change the time periods or products, you need to change the data only in periods and
products. You can then run the same code to create usage.
In other words, to maintain flexibility and allow for reuse, do not use a statement that has hard-coded
data sizes. For example:
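For contrast, a hard-coded version might look like the following sketch; the literal sizes 10 and 6 are assumptions that break as soon as the periods or products data changes:

```matlab
% Avoid: hard-coded sizes (10 periods, 6 products) instead of sizes
% taken from the periods and products data.
usage = optimvar('usage',10,6,'Type','integer','LowerBound',0);
```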
The same consideration holds for expressions as well as variables. Suppose that the costs for the
products are in a data matrix, costs, of size length(periods)-by-length(products). To
simulate valid data, create a random integer matrix of the appropriate size.
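One way to create such simulated data (randi and rng are standard MATLAB functions; the cost range 1 through 10 is an arbitrary assumption):

```matlab
rng default  % for reproducibility
% The matrix sizes come from the data vectors, not from literals.
costs = randi(10,length(periods),length(products));
```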
The best practice is to create cost expressions that take sizes from the data.
costPerYear = sum(costs.*usage,2);
totalCost = sum(costPerYear);
In this way, if you ever change the data sizes, the statements that create costPerYear and
totalCost do not change. In other words, to maintain flexibility and allow for reuse, do not use a
statement that has hard-coded data sizes. For example:
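By contrast, a hard-coded version of the cost expressions might look like this sketch (again, the literal sizes 10 and 6 are assumptions to avoid):

```matlab
% Avoid: hard-coded sizes in the cost expressions.
costPerYear = sum(costs(1:10,1:6).*usage(1:10,1:6),2);
totalCost = sum(costPerYear);
```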
See Also
More About
• “Problem-Based Optimization Workflow” on page 10-2
Problem-Based Optimization Algorithms
Before solve can call the underlying solvers, the problem must be converted to solver form, either by solve or by associated functions or objects. This conversion entails, for example, representing linear constraints as a matrix rather than as an optimization variable expression.
The first step in the algorithm occurs as you place optimization expressions into the problem. An
OptimizationProblem object has an internal list of the variables used in its expressions. Each
variable has a linear index in the expression, and a size. Therefore, the problem variables have an
implied matrix form. The prob2struct function performs the conversion from problem form to
solver form. For an example, see “Convert Problem to Structure” on page 16-369.
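As a minimal sketch of this conversion (the small problem here is an assumption for illustration):

```matlab
% Convert a tiny linear problem from problem form to solver form.
x = optimvar('x',2,'LowerBound',0);
prob = optimproblem('Objective',x(1) + 2*x(2));
prob.Constraints.c1 = x(1) + x(2) <= 1;
problem = prob2struct(prob);  % fields include f, Aineq, bineq, lb, solver
disp(problem.f)               % linear objective coefficients for x(1), x(2)
disp(full(problem.Aineq))     % matrix form of the constraint x(1) + x(2) <= 1
```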
For the default and allowed solvers that solve calls, depending on the problem objective and
constraints, see 'solver'. You can override the default by using the 'solver' name-value pair
argument when calling solve.
For the algorithm that intlinprog uses to solve MILP problems, see “intlinprog Algorithm” on page
9-26. For the algorithms that linprog uses to solve linear programming problems, see “Linear
Programming Algorithms” on page 9-2. For the algorithms that quadprog uses to solve quadratic
programming problems, see “Quadratic Programming Algorithms” on page 11-2. For linear or
nonlinear least-squares solver algorithms, see “Least-Squares (Model Fitting) Algorithms” on page
12-2. For nonlinear solver algorithms, see “Unconstrained Nonlinear Optimization Algorithms” on
page 6-2 and “Constrained Nonlinear Optimization Algorithms” on page 6-19.
For nonlinear equation solving, solve internally represents each equation as the difference between
the left and right sides. Then solve attempts to minimize the sum of squares of the equation
components. For the algorithms for solving nonlinear systems of equations, see “Equation Solving
Algorithms” on page 13-2. When the problem also has bounds, solve calls lsqnonlin to minimize
the sum of squares of equation components. See “Least-Squares (Model Fitting) Algorithms” on page
12-2.
Note If your objective function is a sum of squares, and you want solve to recognize it as such,
write it as sum(expr.^2), and not as expr'*expr or any other form. The internal parser recognizes
only explicit sums of squares. For details, see “Write Objective Function for Problem-Based Least
Squares” on page 12-92. For an example, see “Nonnegative Linear Least Squares, Problem-Based”
on page 12-40.
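For instance (the data values here are arbitrary assumptions), this sketch writes a linear least-squares objective in the recognized form:

```matlab
C = [1 2; 3 4; 5 6];  d = [1; 2; 3];  % arbitrary illustrative data
x = optimvar('x',2);
expr = C*x - d;
prob = optimproblem('Objective',sum(expr.^2));  % recognized as least squares
% prob.Objective = expr'*expr;  % same value, but NOT recognized as such
```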
See Also
intlinprog | linprog | prob2struct
More About
• “intlinprog Algorithm” on page 9-26
• “Linear Programming Algorithms” on page 9-2
• “Create Efficient Optimization Problems” on page 10-28
x = optimvar('x',10,2);
cons = sum(x,2) == 1;
At this point, you realize that you intended to create integer variables. So you recreate the variable,
changing its type.
x = optimvar('x',10,2,'Type','integer');
obj = sum(x*[2;3]);
prob = optimproblem('Objective',obj);
prob.Constraints = cons
At this point, you get an error message stating that OptimizationVariables appearing in the same problem must have distinct "Name" properties. The issue is that the recreated x is a new variable, unrelated to the one in the constraint expression. You can correct the problem in either of two ways.
• Recreate the constraint using the new variable, and place it in the problem.
cons = sum(x,2) == 1;
prob.Constraints = cons;
• Retrieve the original x variable by creating a problem using the old expression. Update the
retrieved variable to have the correct Type property. Use the retrieved variable for the problem
and objective.
oprob = optimproblem('Constraints',cons);
x = oprob.Variables.x;
x.Type = 'integer';
oprob.Objective = sum(x*[2;3]);
This method can be useful if you have created more expressions using the old variable than
expressions using the new variable.
See Also
More About
• “Problem-Based Optimization Workflow” on page 10-2
Expression Contains Inf or NaN
Optimization expressions containing Inf or NaN cannot be displayed. For example, the largest real
number in double precision arithmetic is about 1.8e308. So 2e308 overflows to Inf.
x = optimvar('x');
y = 1e308;
expr = 2*x*y
expr =
OptimizationExpression
Similarly, because Inf - Inf = NaN, the following expression cannot be displayed.
expr =
OptimizationExpression
If any of your optimization expressions contain Inf or NaN, try to eliminate these values before
calling solve. To do so:
See Also
show | write
More About
• “Problem-Based Optimization Workflow” on page 10-2
• “Review or Modify Optimization Problems” on page 10-14
Supported Operations on Optimization Variables and Expressions
• x and y represent optimization arrays of arbitrary size (usually the same size).
• x2D represents a 2-D optimization array.
• a is a scalar numeric constant.
• M is a constant numeric matrix.
• c is a numeric array of the same size as x.
Warning The problem-based approach does not support complex values in an objective function,
nonlinear equalities, or nonlinear inequalities. If a function calculation has a complex value, even as
an intermediate value, the final result can be incorrect.
Operation                                                                      Example
N-D numeric indexing (includes colon and end)                                  x(3,5:end)
N-D logical indexing                                                           x(ind), where ind is a logical array
N-D string indexing                                                            x(str1,str2), where str1 and str2 are strings
N-D mixed indexing (combination of numeric, logical, colon, end, and string)   x(ind,str1,:)
Linear numeric indexing (includes colon and end)                               x(17:end)
Linear logical indexing                                                        x(ind)
Linear string indexing                                                         x(str1)
See Also
OptimizationExpression | OptimizationVariable
More About
• “Problem-Based Optimization Setup”
• “Problem-Based Optimization Workflow” on page 10-2
• “Optimization Expressions” on page 10-6
Mixed-Integer Linear Programming Basics: Problem-Based
For the solver-based approach to this problem, see “Mixed-Integer Linear Programming Basics:
Solver-Based” on page 9-37.
Problem Description
You want to blend steels with various chemical compositions to obtain 25 tons of steel with a specific
chemical composition. The result should have 5% carbon and 5% molybdenum by weight, meaning 25
tons*5% = 1.25 tons of carbon and 1.25 tons of molybdenum. The objective is to minimize the cost for
blending the steel.
This problem is taken from Carl-Henrik Westerberg, Bengt Bjorklund, and Eskil Hultman, “An
Application of Mixed Integer Programming in a Swedish Steel Mill.” Interfaces February 1977 Vol. 7,
No. 2 pp. 39–43, whose abstract is at https://doi.org/10.1287/inte.7.2.39.
Four ingots of steel are available for purchase. Only one of each ingot is available.
Ingot    Weight in Tons    % Carbon    % Molybdenum    Cost/Ton
1        5                 5           3               $350
2        3                 4           3               $330
3        4                 5           4               $310
4        6                 3           4               $280
Three grades of alloy steel and one grade of scrap steel are available for purchase. Alloy and scrap
steels can be purchased in fractional amounts.
Alloy    % Carbon    % Molybdenum    Cost/Ton
1        8           6               $500
2        7           7               $450
3        6           8               $400
Scrap    3           9               $100
Formulate Problem
To formulate the problem, first decide on the control variables. Take variable ingots(1) = 1 to
mean that you purchase ingot 1, and ingots(1) = 0 to mean that you do not purchase the ingot.
Similarly, variables ingots(2) through ingots(4) are binary variables indicating whether you
purchase ingots 2 through 4.
Variables alloys(1) through alloys(3) are the quantities in tons of alloys 1, 2, and 3 you
purchase. scrap is the quantity of scrap steel that you purchase.
steelprob = optimproblem;
ingots = optimvar('ingots',4,'Type','integer','LowerBound',0,'UpperBound',1);
alloys = optimvar('alloys',3,'LowerBound',0);
scrap = optimvar('scrap','LowerBound',0);
The problem has three equality constraints. The first constraint is that the total weight is 25 tons.
Calculate the weight of the steel, using the ingot weights from the table.
weightIngots = [5,3,4,6];
totalWeight = weightIngots*ingots + sum(alloys) + scrap;
The second constraint is that the weight of carbon is 5% of 25 tons, or 1.25 tons. Calculate the weight
of the carbon in the steel.
carbonIngots = [5,4,5,3]/100;
carbonAlloys = [8,7,6]/100;
carbonScrap = 3/100;
totalCarbon = (weightIngots.*carbonIngots)*ingots + carbonAlloys*alloys + carbonScrap*scrap;
The third constraint is that the weight of molybdenum is 1.25 tons. Calculate the weight of the
molybdenum in the steel.
molybIngots = [3,3,4,4]/100;
molybAlloys = [6,7,8]/100;
molybScrap = 9/100;
totalMolyb = (weightIngots.*molybIngots)*ingots + molybAlloys*alloys + molybScrap*scrap;
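The cost objective and the three equality constraints do not appear in this excerpt. Assuming the data in the tables above, they can be assembled along these lines (the constraint names are illustrative):

```matlab
% Sketch: the objective is the total purchase cost. Ingots are bought
% whole, so each ingot costs its per-ton price times its weight.
weightIngots = [5,3,4,6];
costIngots = [350,330,310,280].*weightIngots;
costAlloys = [500,450,400];
costScrap = 100;
steelprob.Objective = costIngots*ingots + costAlloys*alloys + costScrap*scrap;
% Equality constraints: 25 tons total, 1.25 tons each of carbon and molybdenum.
steelprob.Constraints.conswt = totalWeight == 25;
steelprob.Constraints.conscarb = totalCarbon == 1.25;
steelprob.Constraints.consmolyb = totalMolyb == 1.25;
```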
Solve Problem
Now that you have all the inputs, call the solver.
[sol,fval] = solve(steelprob);
Intlinprog stopped at the root node because the objective value is within a gap
tolerance of the optimal value, options.AbsoluteGapTolerance = 0 (the default
value). The intcon variables are integer within tolerance,
options.IntegerTolerance = 1e-05 (the default value).
sol.ingots
ans = 4×1
1.0000
1.0000
0
1.0000
sol.alloys
ans = 3×1
7.2500
0
0.2500
sol.scrap
ans = 3.5000
fval
fval = 8.4950e+03
The optimal purchase costs $8,495. Buy ingots 1, 2, and 4, but not 3, and buy 7.25 tons of alloy 1,
0.25 ton of alloy 3, and 3.5 tons of scrap steel.
See Also
More About
• “Mixed-Integer Linear Programming Basics: Solver-Based” on page 9-37
• “Problem-Based Optimization Workflow” on page 10-2
• Solve a Mixed-Integer Linear Programming Problem using Optimization Modeling
Create Initial Point for Optimization with Named Index Variables
The problem is a multiperiod inventory problem that involves blending raw and refined oils. The
objective is to maximize profit subject to various constraints on production and inventory capacities
and on the "hardness" of oil blends. This problem is taken from Williams [1].
Problem Description
The problem involves two types of raw vegetable oil and three types of raw nonvegetable oil that a
manufacturer can refine into edible oil. The manufacturer can refine up to 200 tons of vegetable oils,
and up to 250 tons of nonvegetable oils per month. The manufacturer can store 1000 tons of each raw
oil, which is beneficial because the cost of purchasing raw oils depends on the month as well as the
type of oil. A quality called "hardness" is associated with each oil. The hardness of blended oil is the
linearly weighted hardness of the constituent oils.
Because of processing limitations, the manufacturer restricts the number of oils refined in any one
month to no more than three. Also, if an oil is refined in a month, at least 20 tons of that oil must be
refined. Finally, if a vegetable oil is refined in a month, then nonvegetable oil 3 must also be refined.
The revenue is a constant for each ton of oil sold. The costs are the cost of purchasing the oils, which
varies by oil and month, and there is a fixed cost per ton of storing each oil for each month. There is
no cost for refining an oil, but the manufacturer cannot store refined oil (it must be sold).
Create named index variables for the planning periods and oils.
months = {'January','February','March','April','May','June'};
oils = {'veg1','veg2','non1','non2','non3'};
vegoils = {'veg1','veg2'};
nonveg = {'non1','non2','non3'};
h = [8.8,6.1,2,4.2,5.0];
Specify the costs of the raw oils as this array. Each row of the array represents the cost of the raw
oils in a month. The first row represents the costs in January, and the last row represents the costs in
June.
costdata = [...
110 120 130 110 115
130 130 110 90 115
110 140 130 100 95
120 110 120 120 125
100 120 150 110 105
90 100 140 80 135];
Create Variables
Additionally, to account for constraints on the number of oils refined and sold each month and the
minimum quantity produced, create an auxiliary binary variable induse that is 1 exactly when an oil
is sold in a month.
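The variable-creation code does not appear in this excerpt. A sketch consistent with the description (the names buy, sell, and store are assumptions) is:

```matlab
% Sketch: tons of each oil bought, sold (refined), and stored per month,
% indexed by the named months and oils defined above.
buy   = optimvar('buy',months,oils,'LowerBound',0);
sell  = optimvar('sell',months,oils,'LowerBound',0);
store = optimvar('store',months,oils,'LowerBound',0,'UpperBound',1000);
% Binary indicator: 1 exactly when an oil is refined (sold) in a month.
induse = optimvar('induse',months,oils,'Type','integer', ...
    'LowerBound',0,'UpperBound',1);
```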
produce = sum(sell,2);
Create Objective
To create the objective function for the problem, calculate the revenue, and subtract the costs of
purchasing and storing oils.
Create an optimization problem for maximization, and include the objective function as the
Objective property.
The objective expression is quite long. If you like, you can view it by using the show(prob.Objective) command.
Create Constraints
The quantity of each oil stored in June is 500. Set this constraint by using lower and upper bounds.
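A sketch of this bound setting, assuming a store variable indexed by the month names:

```matlab
% Fix the June (final) storage at 500 tons of each oil via equal bounds.
store.LowerBound('June',:) = 500;
store.UpperBound('June',:) = 500;
```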
The manufacturer cannot refine more than maxuseveg vegetable oil in any month. Set this and all
subsequent constraints by using “Expressions for Constraints and Equations” on page 10-7.
The manufacturer cannot refine more than maxusenon nonvegetable oil in any month.
nonvegoiluse = sell(:,nonveg);
nonvegused = sum(nonvegoiluse,2) <= maxusenon;
The hardness of the blended oil must be from hmin through hmax.
The amount of each oil stored at the end of the month is equal to the amount at the beginning of the
month, plus the amount bought, minus the amount sold.
If an oil is refined at all in a month, at least minuseraw of the oil must be refined and sold.
Ensure that the induse variable is 1 exactly when the corresponding oil is refined.
The manufacturer can sell no more than maxnraw oils each month.
If a vegetable oil is refined, oil non3 must also be refined and sold.
prob.Constraints.vegused = vegused;
prob.Constraints.nonvegused = nonvegused;
prob.Constraints.hardmin = hardmin;
prob.Constraints.hardmax = hardmax;
prob.Constraints.initstockbal = initstockbal;
prob.Constraints.stockbal = stockbal;
prob.Constraints.minuse = minuse;
prob.Constraints.maxusev = maxusev;
prob.Constraints.maxusenv = maxusenv;
prob.Constraints.maxnuse = maxnuse;
prob.Constraints.deflogic1 = deflogic1;
Solve Problem
To show the eventual difference between using an initial point and not using one, set options to use
no heuristics. Then solve the problem.
opts = optimoptions('intlinprog','Heuristics','none');
[sol1,fval1,exitstatus1,output1] = solve(prob,'options',opts)
Intlinprog stopped because the objective value is within a gap tolerance of the
optimal value, options.AbsoluteGapTolerance = 0 (the default value). The intcon
variables are integer within tolerance, options.IntegerTolerance = 1e-05 (the
default value).
fval1 = 1.0028e+05
exitstatus1 =
OptimalSolution
For this problem, using an initial point can save branch-and-bound iterations. Create an initial point
of the correct dimensions.
x0.buy = zeros(size(buy));
x0.induse = zeros(size(induse));
x0.store = zeros(size(store));
x0.sell = zeros(size(sell));
Set the initial point to sell only vegetable oil veg2 and nonvegetable oil non3. To set this initial point
appropriately, use the findindex function.
numMonths = size(induse,1);
[idxMonths,idxOils] = findindex(induse,1:numMonths,{'veg2','non3'});
x0.induse(idxMonths,idxOils) = 1;
Satisfy the initstockbal constraint for the first month, based on the initial store of 500 tons of each oil type, no purchases in the first month, and constant usage of veg2 and non3.
x0.store(1,:) = [500 300 500 500 250];
Satisfy the remaining stock balance constraints stockbal by using the findindex function.
[idxMonths,idxOils] = findindex(store,2:6,{'veg2'});
x0.store(idxMonths,idxOils) = [100;0;0;0;500];
[idxMonths,idxOils] = findindex(store,2:6,{'veg1','non1','non2'});
x0.store(idxMonths,idxOils) = 500;
[idxMonths,idxOils] = findindex(store,2:6,{'non3'});
x0.store(idxMonths,idxOils) = [0;0;0;0;500];
[idxMonths,idxOils] = findindex(buy,2:6,{'veg2'});
x0.buy(idxMonths,idxOils) = [0;100;200;200;700];
[idxMonths,idxOils] = findindex(buy,2:6,{'non3'});
x0.buy(idxMonths,idxOils) = [0;250;250;250;750];
Check that the initial point is feasible. Because the constraints have different dimensions, set the
cellfun UniformOutput name-value pair to false when checking the infeasibilities.
inf{1} = infeasibility(vegused,x0);
inf{2} = infeasibility(nonvegused,x0);
inf{3} = infeasibility(hardmin,x0);
inf{4} = infeasibility(hardmax,x0);
inf{5} = infeasibility(initstockbal,x0);
inf{6} = infeasibility(stockbal,x0);
inf{7} = infeasibility(minuse,x0);
inf{8} = infeasibility(maxusev,x0);
inf{9} = infeasibility(maxusenv,x0);
inf{10} = infeasibility(maxnuse,x0);
inf{11} = infeasibility(deflogic1,x0);
allinfeas = cellfun(@max,inf,'UniformOutput',false);
anyinfeas = cellfun(@max,allinfeas);
disp(anyinfeas)
0 0 0 0 0 0 0 0 0 0 0
All of the infeasibilities are zero, which shows that the initial point is feasible.
[sol2,fval2,exitstatus2,output2] = solve(prob,x0,'options',opts)
Intlinprog stopped because the objective value is within a gap tolerance of the
optimal value, options.AbsoluteGapTolerance = 0 (the default value). The intcon
variables are integer within tolerance, options.IntegerTolerance = 1e-05 (the
default value).
fval2 = 1.0028e+05
exitstatus2 =
OptimalSolution
numfeaspoints: 7
numnodes: 1114
constrviolation: 1.7580e-12
message: 'Optimal solution found....'
solver: 'intlinprog'
This time, solve took fewer branch-and-bound steps to find the solution.
Reference
[1] Williams, H. Paul. Model Building in Mathematical Programming. Fourth edition. J. Wiley,
Chichester, England. Problem 12.1, "Food Manufacture1." 1999.
See Also
findindex | solve
More About
• “Named Index for Optimization Variables” on page 10-20
• “Problem-Based Optimization Workflow” on page 10-2
11 Quadratic Programming

Quadratic Programming Algorithms
min_x (1/2)xᵀHx + cᵀx    (11-1)
Note The algorithm has two code paths. It takes one when the Hessian matrix H is an ordinary (full)
matrix of doubles, and it takes the other when H is a sparse matrix. For details of the sparse data
type, see “Sparse Matrices” (MATLAB). Generally, the algorithm is faster for large problems that have
relatively few nonzero terms when you specify H as sparse. Similarly, the algorithm is faster for
small or relatively dense problems when you specify H as full.
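As a sketch (the problem data here is arbitrary), the choice of code path follows directly from how you store H:

```matlab
% The same convex QP solved along the full and the sparse code paths.
n = 500;
H = eye(n);            % full identity Hessian
c = -ones(n,1);
lb = zeros(n,1);  ub = ones(n,1);
xFull   = quadprog(H,c,[],[],[],[],lb,ub);          % full code path
xSparse = quadprog(sparse(H),c,[],[],[],[],lb,ub);  % sparse code path
```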
Presolve/Postsolve
The algorithm first tries to simplify the problem by removing redundancies and simplifying
constraints. The tasks performed during the presolve step can include the following:
• Check if any variables have equal upper and lower bounds. If so, check for feasibility, and then fix
and remove the variables.
• Check if any linear inequality constraint involves only one variable. If so, check for feasibility, and
then change the linear constraint to a bound.
• Check if any linear equality constraint involves only one variable. If so, check for feasibility, and
then fix and remove the variable.
• Check if any linear constraint matrix has zero rows. If so, check for feasibility, and then delete the
rows.
If the algorithm detects an infeasible or unbounded problem, it halts and issues an appropriate exit
message.
The algorithm might arrive at a single feasible point, which represents the solution.
If the algorithm does not detect an infeasible or unbounded problem in the presolve step, and if the
presolve has not produced the solution, the algorithm continues to its next steps. After reaching a
stopping criterion, the algorithm reconstructs the original problem, undoing any presolve
transformations. This final step is the postsolve step.
Predictor-Corrector
The sparse and full interior-point-convex algorithms differ mainly in the predictor-corrector phase.
The algorithms are similar, but differ in some details. For the basic algorithm description, see
Mehrotra [47].
The algorithms begin by turning the linear inequalities Ax <= b into inequalities of the form Ax >= b by multiplying A and b by -1. This has no bearing on the solution, but puts the problem in the same form found in some literature.
Sparse Predictor-Corrector
Similar to the fmincon interior-point algorithm on page 6-30, the sparse interior-point-convex
algorithm tries to find a point where the Karush-Kuhn-Tucker (KKT) on page 3-12 conditions hold. For
the quadratic programming problem described in “Quadratic Programming Definition” on page 11-2,
these conditions are:
11 Quadratic Programming
    Hx + c − Aeqᵀy − Aᵀz = 0
    Ax − b − s = 0
    Aeq·x − beq = 0
    s_i·z_i = 0, i = 1, 2, ..., m
    s ≥ 0
    z ≥ 0.
Here
• A is the extended linear inequality matrix that includes bounds written as linear inequalities. b is
the corresponding linear inequality vector, including bounds.
• s is the vector of slacks that convert inequality constraints to equalities. s has length m, the
number of linear inequalities and bounds.
• z is the vector of Lagrange multipliers corresponding to s.
• y is the vector of Lagrange multipliers associated with the equality constraints.
The algorithm first predicts a step from the Newton-Raphson formula, and then computes a corrector
step. The corrector attempts to better enforce the nonlinear constraint s_i·z_i = 0. Definitions for the
predictor step:

• rd, the dual residual:

    rd = Hx + c − Aeqᵀy − Aᵀz.

• req, the primal equality constraint residual:

    req = Aeq·x − beq.

• rineq, the primal inequality constraint residual:

    rineq = Ax − b − s.

• rsz, the complementarity residual:

    rsz = Sz,

where S is the diagonal matrix of slack terms and z is the column vector of Lagrange multipliers.

• rc, the average complementarity:

    rc = sᵀz / m.
In a Newton step, the changes in x, s, y, and z are given by solving the linear system

    [ H     0   −Aeqᵀ  −Aᵀ ] [ Δx ]       [ rd    ]
    [ Aeq   0    0      0  ] [ Δs ]       [ req   ]
    [ A    −I    0      0  ] [ Δy ]  = −  [ rineq ] ,    (11-2)
    [ 0     Z    0      S  ] [ Δz ]       [ rsz   ]

where S = diag(s) and Z = diag(z).
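As an illustration, "Equation 11-2" can be assembled for a one-variable problem with a single inequality and no equality constraints, so the Aeq rows drop out. The sketch below is in Python; the problem data, the interior point, and the `solve3` elimination helper are all assumptions for illustration, not quadprog internals.

```python
# Minimal sketch of the predictor Newton system (Equation 11-2) for
# minimize 0.5*H*x^2 + c*x subject to A*x >= b (inequality already in >= form),
# with no equality constraints, so the system is 3x3.

def solve3(M, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [r] for row, r in zip(M, rhs)]
    for k in range(3):
        p = max(range(k, 3), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, 3):
            f = M[i][k] / M[k][k]
            for j in range(k, 4):
                M[i][j] -= f * M[k][j]
    x = [0.0] * 3
    for i in (2, 1, 0):
        x[i] = (M[i][3] - sum(M[i][j] * x[j] for j in range(i + 1, 3))) / M[i][i]
    return x

H, c, A, b = 2.0, -2.0, 1.0, 1.5      # minimize x^2 - 2x subject to x >= 1.5
x, s, z = 2.0, 0.5, 1.0               # strictly interior point: s > 0, z > 0
rd = H * x + c - A * z                # dual residual
rineq = A * x - b - s                 # primal inequality residual
rsz = s * z                           # complementarity residual
# Rows: dual feasibility, primal feasibility, linearized complementarity.
K = [[H, 0.0, -A],
     [A, -1.0, 0.0],
     [0.0, z, s]]
dx, ds, dz = solve3(K, [-rd, -rineq, -rsz])
```

Because rd and rineq are linear in (x, s, z), a full Newton step zeroes them exactly; quadprog would additionally shorten the step to keep s and z strictly positive.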
However, a full Newton step might be infeasible, because of the positivity constraints on s and z.
Therefore, quadprog shortens the step, if necessary, to maintain positivity.
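The shortening is a ratio test over the components of s and z that the step decreases. The sketch below is an illustration of this "fraction to the boundary" idea; the 0.995 safety factor and the data are assumptions, not quadprog's actual choices.

```python
# Sketch of the ratio test that shortens a Newton step so that the slacks s
# and multipliers z stay strictly positive. The safety factor is illustrative.

def max_step(v, dv, safety=0.995):
    """Largest alpha in (0, 1] with v + alpha*dv > 0 componentwise."""
    alpha = 1.0
    for vi, dvi in zip(v, dv):
        if dvi < 0:                      # only decreasing components limit the step
            alpha = min(alpha, safety * (-vi / dvi))
    return alpha

s, ds = [0.5, 2.0], [-1.0, 0.5]          # s[0] would reach zero at alpha = 0.5
z, dz = [1.0, 0.3], [-4.0, -0.1]         # z[0] would reach zero at alpha = 0.25
alpha = min(max_step(s, ds), max_step(z, dz))
```

Here the binding component is z[0], so the step is cut to just under one quarter of the full Newton step.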
Additionally, to maintain a “centered” position in the interior, instead of trying to solve sizi = 0, the
algorithm takes a positive parameter σ, and tries to solve
sizi = σrc.
quadprog replaces rsz in the Newton step equation with rsz + ΔsΔz – σrc1, where 1 is the vector of
ones. Also, quadprog reorders the Newton equations to obtain a symmetric, more numerically stable
system for the predictor step calculation.
After calculating the corrected Newton step, the algorithm performs more calculations to get both a
longer current step, and to prepare for better subsequent steps. These multiple correction
calculations can improve both performance and robustness. For details, see Gondzio [4].
Full Predictor-Corrector
The full predictor-corrector algorithm does not combine bounds into linear constraints, so it has
another set of slack variables corresponding to the bounds. The algorithm shifts lower bounds to
zero, and if a variable has only an upper bound, the algorithm negates that inequality to turn it into
a lower bound of zero.
A is the extended linear matrix that includes both linear inequalities and linear equalities. b is the
corresponding constraint vector. A also includes terms for extending the vector x with slack
variables s that turn inequality constraints into equality constraints:

    A·x = [ Aeq  0 ] [ x0 ]
          [ A    I ] [ s  ] ,

where x0 means the original x vector.
Here, t is the slack vector for the upper bounds u, and v and w are the Lagrange multipliers
associated with the lower and upper bounds, respectively. The KKT conditions for this formulation are

    Hx + c − Aᵀy − v + w = 0
    Ax = b
    x + t = u                                  (11-3)
    v_i·x_i = 0, i = 1, 2, ..., m
    w_i·t_i = 0, i = 1, 2, ..., n
    x, v, w, t ≥ 0.
To find the solution x, the slack variables, and the dual variables of “Equation 11-3”, the algorithm
basically considers a Newton-Raphson step:
    [ H   −Aᵀ   0   −I   I ] [ Δx ]       [ Hx + c − Aᵀy − v + w ]       [ rd  ]
    [ A    0    0    0   0 ] [ Δy ]       [ Ax − b               ]       [ rp  ]
    [ −I   0   −I    0   0 ] [ Δt ]  = −  [ u − x − t            ]  = −  [ rub ] ,    (11-4)
    [ V    0    0    X   0 ] [ Δv ]       [ VX                   ]       [ rvx ]
    [ 0    0    W    0   T ] [ Δw ]       [ WT                   ]       [ rwt ]
where X, V, W, and T are diagonal matrices corresponding to the vectors x, v, w, and t respectively.
The residual vectors on the far right side of the equation are rd, the dual residual; rp, the primal
residual; rub, the upper-bound residual; and rvx and rwt, the complementarity residuals.
The algorithm solves “Equation 11-4” by first converting it to the symmetric matrix form
    [ −D   Aᵀ ] [ Δx ]       [ R  ]
    [  A    0 ] [ Δy ]  = −  [ rp ] ,    (11-5)
where
    D = H + X⁻¹V + T⁻¹W
    R = −rd − X⁻¹·rvx + T⁻¹·rwt + T⁻¹W·rub.
All the matrix inverses in the definitions of D and R are simple to compute because the matrices are
diagonal.
To derive “Equation 11-5” from “Equation 11-4”, notice that the second row of “Equation 11-5” is the
same as the second matrix row of “Equation 11-4”. The first row of “Equation 11-5” comes from
solving the last two rows of “Equation 11-4” for Δv and Δw, and then solving for Δt.
To solve “Equation 11-5”, the algorithm follows the essential elements of Altman and Gondzio [1]. The
algorithm solves the symmetric system by an LDL decomposition. As pointed out by authors such as
Vanderbei and Carpenter [2], this decomposition is numerically stable without any pivoting, so it can
be fast.
After calculating the corrected Newton step, the algorithm performs more calculations to get both a
longer current step, and to prepare for better subsequent steps. These multiple correction
calculations can improve both performance and robustness. For details, see Gondzio [4].
The full quadprog predictor-corrector algorithm is largely the same as that in the linprog
'interior-point' algorithm, but includes quadratic terms as well. See “Predictor-Corrector” on
page 9-3.
References
[1] Altman, Anna, and J. Gondzio. Regularized symmetric indefinite systems in interior point methods
for linear and quadratic optimization. Optimization Methods and Software, 1999.

[2] Vanderbei, R. J., and T. J. Carpenter. Symmetric indefinite systems for interior point methods.
Mathematical Programming 58, 1993, pp. 1–32.
Stopping Conditions
The predictor-corrector algorithm iterates until it reaches a point that is feasible (satisfies the
constraints to within tolerances) and where the relative step sizes are small. Specifically, define
11-6
Quadratic Programming Algorithms
rp 1
+ rub 1 ≤ ρTolCon
rd ∞
≤ ρTolFun
rc ≤ TolFun,
where rc essentially measures the size of the complementarity residuals xv and tw, which are each
vectors of zeros at a solution.
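The three stopping inequalities can be sketched directly. The function below is an illustration in Python; the residual values and the value of rho are made-up numbers, not solver output.

```python
# Sketch of the stopping test: the iterate must be nearly feasible and the
# scaled residuals small. rp, rub, rd, rc, rho, TolCon, and TolFun mirror the
# text; all numeric values here are illustrative.

def converged(rp, rub, rd, rc, rho, tol_con, tol_fun):
    """Return True when all three stopping inequalities hold."""
    primal = sum(abs(v) for v in rp) + sum(abs(v) for v in rub)  # 1-norms
    dual = max(abs(v) for v in rd)                               # infinity norm
    return (primal <= rho * tol_con and
            dual <= rho * tol_fun and
            rc <= tol_fun)

ok = converged(rp=[1e-10, -2e-10], rub=[1e-11], rd=[5e-9],
               rc=1e-9, rho=10.0, tol_con=1e-8, tol_fun=1e-8)
```

Note that the complementarity test rc <= TolFun is not scaled by rho, so a point can satisfy the feasibility tests while the average complementarity still blocks termination.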
Infeasibility Detection
quadprog calculates a merit function φ at every iteration. The merit function is a measure of
feasibility. quadprog stops if the merit function grows too large. In this case, quadprog declares the
problem to be infeasible.
The merit function is related to the KKT conditions for the problem—see “Predictor-Corrector” on
page 11-3. Use the following definitions:
The notation A and b means the linear inequality coefficients, augmented with terms that represent
bounds for the sparse algorithm. The notation λineq similarly represents the Lagrange multipliers for
the linear inequality constraints, including bound constraints. These multipliers were called z in
“Predictor-Corrector” on page 11-3, and λeq was called y.
The merit function φ is

    φ = (1/ρ)·max( ‖req‖∞, ‖rineq‖∞, ‖rd‖∞ ) + g.
If this merit function becomes too large, quadprog declares the problem to be infeasible and halts
with exit flag -2.
The current point is updated to be x + s if f(x + s) < f(x); otherwise, the current point remains
unchanged, the trust region N shrinks, and the trial step computation repeats.
The key questions in defining a specific trust-region approach to minimizing f(x) are how to choose
and compute the approximation q (defined at the current point x), how to choose and modify the trust
region N, and how accurately to solve the trust-region subproblem. This section focuses on the
unconstrained problem. Later sections discuss additional complications due to the presence of
constraints on the variables.
In the standard trust-region method ([48]), the quadratic approximation q is defined by the first two
terms of the Taylor approximation to F at x; the neighborhood N is usually spherical or ellipsoidal in
shape. Mathematically the trust-region subproblem is typically stated
    min_s  ½·sᵀHs + sᵀg   such that   ‖Ds‖ ≤ Δ,    (11-7)
where g is the gradient of f at the current point x, H is the Hessian matrix (the symmetric matrix of
second derivatives), D is a diagonal scaling matrix, Δ is a positive scalar, and ∥ . ∥ is the 2-norm. Good
algorithms exist for solving “Equation 11-7” (see [48]); such algorithms typically involve the
computation of all eigenvalues of H and a Newton process applied to the secular equation
    1/Δ − 1/‖s‖ = 0.
Such algorithms provide an accurate solution to “Equation 11-7”. However, they require time
proportional to several factorizations of H. Therefore, for large-scale problems a different approach is
needed. Several approximation and heuristic strategies, based on “Equation 11-7”, have been
proposed in the literature ([42] and [50]). The approximation approach followed in Optimization
Toolbox solvers is to restrict the trust-region subproblem to a two-dimensional subspace S ([39] and
[42]). Once the subspace S has been computed, the work to solve “Equation 11-7” is trivial even if full
eigenvalue/eigenvector information is needed (since in the subspace, the problem is only two-
dimensional). The dominant work has now shifted to the determination of the subspace.
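For a diagonal Hessian with D = I, the secular-equation idea from "Equation 11-7" can be sketched directly: find λ ≥ 0 with ‖s(λ)‖ = Δ, where s(λ) = −(H + λI)⁻¹g. The Python sketch below uses bisection in place of the Newton process purely for brevity; the data are illustrative.

```python
# Sketch of solving the trust-region subproblem for a positive diagonal H
# and D = I. If the Newton step fits inside the region, take it; otherwise
# find lambda with ||s(lambda)|| = Delta (the secular equation) by bisection.

def trust_region_step(h, g, delta):
    """h: positive diagonal of H, g: gradient, delta: trust-region radius."""
    def step(lam):
        return [-gi / (hi + lam) for hi, gi in zip(h, g)]
    def norm(s):
        return sum(si * si for si in s) ** 0.5
    if norm(step(0.0)) <= delta:         # Newton step already inside the region
        return step(0.0)
    lo, hi = 0.0, 1.0
    while norm(step(hi)) > delta:        # bracket: larger lambda shrinks the step
        hi *= 2.0
    for _ in range(100):                 # bisection on 1/delta - 1/||s(lambda)|| = 0
        mid = 0.5 * (lo + hi)
        if norm(step(mid)) > delta:
            lo = mid
        else:
            hi = mid
    return step(hi)

s = trust_region_step([2.0, 4.0], [8.0, 8.0], 1.0)
```

In the example, the unconstrained Newton step has norm well above the radius Δ = 1, so the computed step lies on the trust-region boundary.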
The two-dimensional subspace S is determined with the aid of a preconditioned conjugate gradient
process described below. The solver defines S as the linear space spanned by s1 and s2, where s1 is in
the direction of the gradient g, and s2 is either an approximate Newton direction, i.e., a solution to
H ⋅ s2 = − g, (11-8)
or a direction of negative curvature,

    s2ᵀ·H·s2 < 0.    (11-9)
The philosophy behind this choice of S is to force global convergence (via the steepest descent
direction or negative curvature direction) and achieve fast local convergence (via the Newton step,
when it exists).
These four steps are repeated until convergence. The trust-region dimension Δ is adjusted according
to standard rules. In particular, it is decreased if the trial step is not accepted, i.e., f(x + s) ≥ f(x). See
[46] and [49] for a discussion of this aspect.
Optimization Toolbox solvers treat a few important special cases of f with specialized functions:
nonlinear least-squares, quadratic functions, and linear least-squares. However, the underlying
algorithmic ideas are the same as for the general case. These special cases are discussed in later
sections.
The subspace trust-region method is used to determine a search direction. However, instead of
restricting the step to (possibly) one reflection step, as in the nonlinear minimization case, a
piecewise reflective line search is conducted at each iteration. See [45] for details of the line search.
A popular way to solve large, symmetric, positive definite systems of linear equations Hp = –g is the
method of Preconditioned Conjugate Gradients (PCG). This iterative approach requires the ability to
calculate matrix-vector products of the form H·v, where v is an arbitrary vector. The symmetric
positive definite matrix M is a preconditioner for H. That is, M = C², where C⁻¹HC⁻¹ is a well-
conditioned matrix or a matrix with clustered eigenvalues.
In a minimization context, you can assume that the Hessian matrix H is symmetric. However, H is
guaranteed to be positive definite only in the neighborhood of a strong minimizer. Algorithm PCG
exits when it encounters a direction of negative (or zero) curvature, that is, dTHd ≤ 0. The PCG
output direction p is either a direction of negative curvature or an approximate solution to the
Newton system Hp = –g. In either case, p helps to define the two-dimensional subspace used in the
trust-region approach discussed in “Trust-Region Methods for Nonlinear Minimization” on page 6-2.
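The PCG iteration with the negative-curvature exit can be sketched as follows. This Python sketch is an illustration only; the diagonal preconditioner M = diag(H) and the small test matrix are assumptions, not what the solver uses.

```python
# Sketch of preconditioned conjugate gradients for H*p = -g with the
# negative-curvature exit described in the text. The preconditioner is
# applied through its inverse; here M = diag(H) for illustration.

def pcg(H, g, tol=1e-10, maxit=50):
    """Return (p, flag): flag 'nc' if d'Hd <= 0 was met, else 'ok'."""
    n = len(g)
    matvec = lambda v: [sum(H[i][j] * v[j] for j in range(n)) for i in range(n)]
    x = [0.0] * n
    r = [-gi for gi in g]                          # residual of H*p = -g at p = 0
    y = [ri / H[i][i] for i, ri in enumerate(r)]   # apply M^(-1), M = diag(H)
    d = y[:]
    rho = sum(ri * yi for ri, yi in zip(r, y))
    for _ in range(maxit):
        Hd = matvec(d)
        curv = sum(di * hdi for di, hdi in zip(d, Hd))
        if curv <= 0:                              # nonpositive curvature direction
            return d, 'nc'
        alpha = rho / curv
        x = [xi + alpha * di for xi, di in zip(x, d)]
        r = [ri - alpha * hdi for ri, hdi in zip(r, Hd)]
        if sum(ri * ri for ri in r) ** 0.5 < tol:
            return x, 'ok'
        y = [ri / H[i][i] for i, ri in enumerate(r)]
        rho_new = sum(ri * yi for ri, yi in zip(r, y))
        d = [yi + (rho_new / rho) * di for yi, di in zip(y, d)]
        rho = rho_new
    return x, 'ok'

p, flag = pcg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

On this 2-by-2 positive definite example, PCG converges to the solution of H·p = −g in at most two iterations; on an indefinite H, the same code would instead return a negative-curvature direction with flag 'nc'.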
Linear constraints complicate the situation described for unconstrained minimization. However, the
underlying ideas described previously can be carried through in a clean and efficient way. The trust-
region methods in Optimization Toolbox solvers generate strictly feasible iterates.
where A is an m-by-n matrix (m ≤ n). Some Optimization Toolbox solvers preprocess A to remove
strict linear dependencies using a technique based on the LU factorization of AT [46]. Here A is
assumed to be of rank m.
The method used to solve “Equation 11-10” differs from the unconstrained approach in two
significant ways. First, an initial feasible point x0 is computed, using a sparse least-squares step, so
that Ax0 = b. Second, Algorithm PCG is replaced with Reduced Preconditioned Conjugate Gradients
(RPCG), see [46], in order to compute an approximate reduced Newton step (or a direction of
negative curvature in the null space of A). The key linear algebra step involves solving systems of the
form
    [ C   Ãᵀ ] [ s ]     [ r ]
    [ Ã    0 ] [ t ]  =  [ 0 ] ,    (11-11)
where Ã approximates A (small nonzeros of A are set to zero provided rank is not lost), and C is a
sparse symmetric positive-definite approximation to H, i.e., C ≈ H. See [46] for more details.
Box Constraints
where l is a vector of lower bounds, and u is a vector of upper bounds. Some (or all) of the
components of l can be equal to –∞ and some (or all) of the components of u can be equal to ∞. The
method generates a sequence of strictly feasible points. Two techniques are used to maintain
feasibility while achieving robust convergence behavior. First, a scaled modified Newton step
replaces the unconstrained Newton step (to define the two-dimensional subspace S). Second,
reflections are used to increase the step size.
The scaled modified Newton step arises from examining the Kuhn-Tucker necessary conditions for
“Equation 11-12”,
    D(x)⁻²·g = 0,    (11-13)

where

    D(x) = diag( |v_k|^(−1/2) ),

and the vector v(x) is defined componentwise for each 1 ≤ i ≤ n: if g_i < 0 and u_i < ∞, then
v_i = x_i − u_i; if g_i ≥ 0 and l_i > −∞, then v_i = x_i − l_i; if g_i < 0 and u_i = ∞, then v_i = −1;
and if g_i ≥ 0 and l_i = −∞, then v_i = 1.
The nonlinear system “Equation 11-13” is not differentiable everywhere. Nondifferentiability occurs
when vi = 0. You can avoid such points by maintaining strict feasibility, i.e., restricting l < x < u.
The scaled modified Newton step sk for the nonlinear system of equations given by “Equation 11-13”
is defined as the solution to the linear system

    M·D·sᴺ = −ĝ    (11-14)

at the kth iteration, where
    ĝ = D⁻¹g = diag( |v|^(1/2) )·g,    (11-15)

and

    M = D⁻¹HD⁻¹ + diag(g)·Jᵛ.    (11-16)
Here Jᵛ plays the role of the Jacobian of |v|. Each diagonal component of the diagonal matrix Jᵛ equals
0, −1, or 1. If all the components of l and u are finite, Jᵛ = diag(sign(g)). At a point where g_i = 0, v_i
might not be differentiable, so Jᵛ_ii = 0 is defined at such a point. Nondifferentiability of this type is
not a cause for concern because, for such a component, it is not significant which value v_i takes.
Further, |v_i| will still be discontinuous at this point, but the function |v_i|·g_i is continuous.
Second, reflections are used to increase the step size. A (single) reflection step is defined as follows.
Given a step p that intersects a bound constraint, consider the first bound constraint crossed by p;
assume it is the ith bound constraint (either the ith upper or ith lower bound). Then the reflection
step pR = p except in the ith component, where pRi = –pi.
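A single reflection is simple to write down. The sketch below is in Python for illustration; the point, bounds, and step are made-up data.

```python
# Sketch of a single reflection step: when a proposed step p first crosses
# bound i, the reflected step flips the sign of that one component.

def reflect(p, i):
    """Reflection of step p in the i-th bound: pR = p except pR[i] = -p[i]."""
    pR = p[:]
    pR[i] = -pR[i]
    return pR

# Example: from x = [0.5, 0.5] with lower bounds [0, 0], the step
# p = [-0.8, 0.2] first crosses lower bound 0 (component 0), so reflect there.
x, lb, p = [0.5, 0.5], [0.0, 0.0], [-0.8, 0.2]
# Fraction of the step at which bound 0 is hit: x[0] + t*p[0] = lb[0].
t_hit = (lb[0] - x[0]) / p[0]
pR = reflect(p, 0)
```

The reflected step continues to increase the step size along the feasible side of the bound while leaving the other components unchanged.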
The active-set strategy (also known as a projection method) is similar to the strategy of Gill et al.,
described in [18] and [17].
Presolve Step
The algorithm first tries to simplify the problem by removing redundancies and simplifying
constraints. The tasks performed during the presolve step can include the following:
• Check if any variables have equal upper and lower bounds. If so, check for feasibility, and then fix
and remove the variables.
• Check if any linear inequality constraint involves only one variable. If so, check for feasibility, and
then change the linear constraint to a bound.
• Check if any linear equality constraint involves only one variable. If so, check for feasibility, and
then fix and remove the variable.
• Check if any linear constraint matrix has zero rows. If so, check for feasibility, and then delete the
rows.
• Determine if the bounds and linear constraints are consistent.
• Check if any variables appear only as linear terms in the objective function and do not appear in
any linear constraint. If so, check for feasibility and boundedness, and then fix the variables at
their appropriate bounds.
• Change any linear inequality constraints to linear equality constraints by adding slack variables.
If the algorithm detects an infeasible or unbounded problem, it halts and issues an appropriate exit
message.
The algorithm might arrive at a single feasible point, which represents the solution.
If the algorithm does not detect an infeasible or unbounded problem in the presolve step, and if the
presolve has not produced the solution, the algorithm continues to its next steps. After reaching a
stopping criterion, the algorithm reconstructs the original problem, undoing any presolve
transformations. This final step is the postsolve step.
Phase 1 Algorithm
In Phase 1, the algorithm attempts to find a point x that satisfies all constraints, with no
consideration of the objective function. quadprog runs the Phase 1 algorithm only if the supplied
initial point x0 is infeasible.
To begin, the algorithm tries to find a point that is feasible with respect to all equality constraints,
such as X = Aeq\beq. If there is no solution x to the equations Aeq*x = beq, then the algorithm
halts. If there is a solution X, the next step is to satisfy the bounds and linear inequalities. In the case
of no equality constraints, the algorithm sets X = x0, the initial point.
To satisfy the bounds and linear inequalities, the algorithm then solves an auxiliary linear
programming problem:

    min over x, γ:  γ

such that

    Ax − γ ≤ b
    Aeq·x = beq
    lb − γ ≤ x ≤ ub + γ
    γ ≥ −ρ.
Here, ρ is the ConstraintTolerance option multiplied by the absolute value of the largest element
in A and Aeq. If the algorithm finds γ = 0 together with a point x that satisfies the equations and
inequalities, then x is a feasible Phase 1 point. If no solution (x, γ) of the auxiliary linear
programming problem has γ = 0, then the Phase 1 problem is infeasible.
To solve the auxiliary linear programming problem, the algorithm sets γ0 = M + 1, sets x0 = X, and
initializes the active set as the fixed variables (if any) and all the equality constraints. The algorithm
reformulates the linear programming variables p to be the offset of x from the current point x0,
namely x = x0 + p. The algorithm solves the linear programming problem by the same iterations as it
takes in Phase 2 to solve the quadratic programming problem, with an appropriately modified
Hessian.
Phase 2 Algorithm
During Phase 2, the algorithm maintains an active set Ak, which is an estimate of the active
constraints (those on the constraint boundaries) at the solution point.
The algorithm updates Ak at each iteration k, forming the basis for a search direction dk. Equality
constraints always remain in the active set Ak. The search direction dk is calculated and minimizes
the objective function while remaining on any active constraint boundaries. The algorithm forms the
feasible subspace for dk from a basis Zk whose columns are orthogonal to the estimate of the active
set Ak (that is, AkZk = 0). Therefore, a search direction, which is formed from a linear summation of
any combination of the columns of Zk, is guaranteed to remain on the boundaries of the active
constraints.
The algorithm forms the matrix Zk from the last m − l columns of the QR decomposition of the matrix
Akᵀ, where l is the number of active constraints and l < m. That is, Zk is given by

    Zk = Q[:, l+1:m],    (11-18)

where

    Qᵀ·Akᵀ = [ R ]
             [ 0 ] .
After finding Zk, the algorithm looks for a new search direction dk that minimizes q(d), where dk is in
the null space of the active constraints. That is, dk is a linear combination of the columns of Zk:
dk = Zk·p for some vector p.
    q(p) = ½·pᵀZkᵀHZk·p + cᵀZk·p.    (11-19)
∇q(p) is referred to as the projected gradient of the quadratic function because it is the gradient
projected in the subspace defined by Zk. The term ZkᵀHZk is called the projected Hessian. Assuming
the projected Hessian matrix ZkᵀHZk is positive semidefinite, the minimum of the function q(p) in the
subspace defined by Zk occurs when ∇q(p) = 0, which is the solution of the system of linear equations

    ZkᵀHZk·p = −Zkᵀc.    (11-20)

The algorithm then takes a step of the form

    x_{k+1} = x_k + α·dk,

where

    dk = Zk·p.
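The projected quantities are easy to form for a two-variable problem with one active constraint. In the Python sketch below, the null-space basis Z for A = [1, 1] is written down directly instead of taken from a QR factorization, and H and c are illustrative data.

```python
# Sketch of one Phase 2 search direction. With active constraint A = [1, 1],
# Z = [1, -1] satisfies A*Z = 0, so any step Z*p stays on the constraint.

H = [[2.0, 0.0], [0.0, 4.0]]
c = [-2.0, -4.0]
Z = [1.0, -1.0]                            # single null-space basis vector

# Projected Hessian Z'*H*Z and projected linear term Z'*c (scalars here).
HZ = [sum(H[i][j] * Z[j] for j in range(2)) for i in range(2)]
ZHZ = sum(Z[i] * HZ[i] for i in range(2))
Zc = sum(Z[i] * c[i] for i in range(2))

p = -Zc / ZHZ                              # solve Z'HZ * p = -Z'c
d = [Z[i] * p for i in range(2)]           # search direction in the null space

# The projected gradient Z'*(H*d + c) vanishes at this p.
grad = [sum(H[i][j] * d[j] for j in range(2)) + c[i] for i in range(2)]
proj_grad = Z[0] * grad[0] + Z[1] * grad[1]
```

The direction d sums to zero componentwise, so it moves along the active constraint A·x = const, and the projected gradient at the minimizing p is zero, as "Equation 11-20" requires.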
Due to the quadratic nature of the objective function, only two choices of step length α exist at each
iteration. A step of unity along dk is the exact step to the minimum of the function restricted to the
null space of Ak. If the algorithm can take such a step without violating the constraints, then this step
is the solution to the quadratic program (“Equation 6-32”). Otherwise, the step along dk to the
nearest constraint is less than unity, and the algorithm includes a new constraint in the active set at
the next iteration. The distance to the constraint boundaries in any direction dk is given by
    α = min over i ∈ {1, ..., m} of  −(Ai·xk − bi) / (Ai·dk),

which is defined for constraints not in the active set, and where the direction dk is toward the
constraint boundary, that is, Ai·dk > 0, i = 1, ..., m.
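This ratio test can be sketched in a few lines. The Python function below is an illustration with made-up data; it returns both the step length and the index of the blocking constraint that would join the active set when the step is shorter than unity.

```python
# Sketch of the step-length ratio test: distance along d to the nearest
# inactive constraint boundary A_i*x <= b_i, taken only over constraints
# toward which d points (A_i*d > 0). Data are illustrative.

def step_to_boundary(A, b, x, d):
    """Return (alpha, i): smallest -(A_i x - b_i)/(A_i d) over A_i d > 0."""
    best, idx = float('inf'), None
    for i, (row, bi) in enumerate(zip(A, b)):
        ad = sum(a * dj for a, dj in zip(row, d))
        if ad > 0:                          # moving toward this boundary
            alpha = -(sum(a * xj for a, xj in zip(row, x)) - bi) / ad
            if alpha < best:
                best, idx = alpha, i
    return best, idx

A = [[1.0, 0.0], [0.0, 1.0]]
b = [1.0, 2.0]
x = [0.0, 0.0]
d = [1.0, 0.5]
alpha, i = step_to_boundary(A, b, x, d)
```

Here the first constraint is reached at exactly a unit step, while the second is four step lengths away, so constraint 0 is the blocking one.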
When the active set includes n independent constraints but the minimum has not been located, the
algorithm calculates the Lagrange multipliers λk, which satisfy the nonsingular set of linear equations

    Akᵀ·λk = c + H·xk.    (11-22)
If all elements of λk are positive, xk is the optimal solution of the quadratic programming problem
“Equation 11-1”. However, if any component of λk is negative, and the component does not
correspond to an equality constraint, then the minimization is not complete. The algorithm deletes
the element corresponding to the most negative multiplier and starts a new iteration.
Sometimes, when the solver finishes with all nonnegative Lagrange multipliers, the first-order
optimality measure is above the tolerance, or the constraint tolerance is not met. In these cases, the
solver attempts to reach a better solution by following the restart procedure described in [1]. In this
procedure, the solver discards the current set of active constraints, relaxes the constraints a bit, and
constructs a new set of active constraints while attempting to solve the problem in a manner that
avoids cycling (repeatedly returning to the same state). If necessary, the solver can perform the
restart procedure several times.
Note Do not try to stop the algorithm early by setting large values of the ConstraintTolerance
and OptimalityTolerance options. Generally, the solver iterates without checking these values
until it reaches a potential stopping point, and only then checks to see whether the tolerances are
satisfied.
Occasionally, the active-set algorithm can have difficulty detecting when a problem is unbounded.
This issue can occur if the direction of unboundedness v is a direction where the quadratic term v'Hv
= 0. For numerical stability and to enable a Cholesky factorization, the active-set algorithm adds a
small, strictly convex term to the quadratic objective. This small term causes the objective function to
be bounded away from –Inf. In this case, the active-set algorithm reaches an iteration limit
instead of reporting that the solution is unbounded. In other words, the algorithm halts with exit flag
0 instead of –3.
References
[1] Gill, P. E., W. Murray, M. A. Saunders, and M. H. Wright. A practical anti-cycling procedure for
linearly constrained optimization. Math. Programming 45 (1), August 1989, pp. 437–474.
Quadratic Minimization with Bound Constraints
The problem stored in the MAT-file qpbox1.mat is a positive definite quadratic, and the Hessian
matrix H is tridiagonal, subject to upper (ub) and lower (lb) bounds.
exitflag,output.iterations,output.cgiterations
exitflag =
ans =
19
ans =
1637
You can see that while convergence occurred in 19 iterations, the high number of CG iterations
indicates that the cost of solving the linear system is high. In light of this cost, try using a direct
solver at each iteration by setting the SubproblemAlgorithm option to 'factorization':
options = optimoptions(options,'SubproblemAlgorithm','factorization');
[x,fval,exitflag,output] = ...
quadprog(H,f,[],[],[],[],lb,ub,xstart,options);
exitflag,output.iterations,output.cgiterations
exitflag =
ans =
10
ans =
Using a direct solver at each iteration usually causes the number of iterations to decrease, but often
takes more time per iteration. For this problem, the tradeoff is beneficial, as the time for quadprog
to solve the problem decreases by a factor of 10.
You can also use the default 'interior-point-convex' algorithm to solve this convex problem:
options = optimoptions('quadprog','Algorithm','interior-point-convex');
[x,fval,exitflag,output] = ...
quadprog(H,f,[],[],[],[],lb,ub,[],options);
Check the exit flag and iterations (the interior-point algorithm does not use CG iterations):
exitflag,output.iterations
exitflag =
ans =
Quadratic Minimization with Dense, Structured Hessian
In this example, the Hessian matrix H has the structure H = B + A*A' where B is a sparse 512-
by-512 symmetric matrix, and A is a 512-by-10 sparse matrix composed of a number of dense
columns. To avoid the excessive memory usage that working with the dense matrix H directly would
require, the example provides a Hessian multiply function, qpbox4mult. This function, when passed a
matrix Y, uses the sparse matrices A and B to compute the Hessian matrix product
W = H*Y = (B + A*A')*Y.
In the first part of this example, the matrices A and B need to be provided to the Hessian multiply
function qpbox4mult. You can pass one matrix as the first argument to quadprog, which is passed
to the Hessian multiply function. You can use a nested function to provide the value of the second
matrix.
The second part of the example shows how to tighten the TolPCG tolerance to compensate for an
approximate preconditioner instead of an exact H matrix.
• Contains a nested function qpbox4mult that uses A and B to compute the Hessian matrix product
W, where W = H*Y = (B + A*A')*Y. The nested function must have the form
W = qpbox4mult(Hinfo,Y,...)
The first argument to the nested function qpbox4mult must be the same as the first argument
passed to quadprog, which in this case is the matrix B.
The second argument to qpbox4mult is the matrix Y (of W = H*Y). Because quadprog expects Y to
be used to form the Hessian matrix product, Y is always a matrix with n rows, where n is the number
of dimensions in the problem. The number of columns in Y can vary. The function qpbox4mult is
nested so that the value of the matrix A comes from the outer function. Optimization Toolbox software
includes the runqpbox4.m file.
function W = qpbox4mult(B,Y)
%QPBOX4MULT Hessian matrix product with dense structured Hessian.
%   W = qpbox4mult(B,Y) computes W = (B + A*A')*Y where
%   INPUT:
%       B - sparse square matrix (512 by 512)
%       Y - vector (or matrix) to be multiplied by B + A*A'.
%   VARIABLES from outer function runqpbox4:
%       A - sparse matrix with 512 rows and 10 columns.
%
%   OUTPUT:
%       W - The product (B + A*A')*Y.
%
    % Order the multiplications to avoid forming the dense matrix A*A'.
    W = B*Y + A*(A'*Y);
end
Enter

[fval,exitflag,output] = runqpbox4;
to run the preceding code. Then display the values for fval, exitflag, output.iterations, and
output.cgiterations.
fval,exitflag,output.iterations, output.cgiterations
fval =
-1.0538e+03
exitflag =
ans =
18
ans =
30
After 18 iterations with a total of 30 PCG iterations, the function value is reduced to
fval
fval =
-1.0538e+003
Preconditioning
Sometimes quadprog cannot use H to compute a preconditioner because H only exists implicitly.
Instead, quadprog uses B, the argument passed in instead of H, to compute a preconditioner. B is a
good choice because it is the same size as H and approximates H to some degree. If B were not the
same size as H, quadprog would compute a preconditioner based on some diagonal scaling matrices
determined from the algorithm. Typically, this would not perform as well.
Because the preconditioner is more approximate than when H is available explicitly, adjusting the
TolPCG parameter to a somewhat smaller value might be required. This example is the same as the
previous one, but reduces TolPCG from the default 0.1 to 0.01.
function [fval, exitflag, output, x] = runqpbox4prec
%RUNQPBOX4PREC demonstrates 'HessianMultiplyFcn' option for QUADPROG with bounds.
% Choose algorithm, the HessianMultiplyFcn option, and override the TolPCG option
mtxmpy = @qpbox4mult;  % handle to the nested Hessian multiply function
options = optimoptions(@quadprog,'Algorithm','trust-region-reflective',...
    'HessianMultiplyFcn',mtxmpy,'TolPCG',0.01);
function W = qpbox4mult(B,Y)
%QPBOX4MULT Hessian matrix product with dense structured Hessian.
%   W = qpbox4mult(B,Y) computes W = (B + A*A')*Y where
%   INPUT:
%       B - sparse square matrix (512 by 512)
%       Y - vector (or matrix) to be multiplied by B + A*A'.
%   VARIABLES from outer function runqpbox4prec:
%       A - sparse matrix with 512 rows and 10 columns.
%
%   OUTPUT:
%       W - The product (B + A*A')*Y.
%
    % Order the multiplications to avoid forming the dense matrix A*A'.
    W = B*Y + A*(A'*Y);
end
Now, enter
[fval,exitflag,output] = runqpbox4prec;
to run the preceding code. After 18 iterations and 50 PCG iterations, the function value has the same
value to five significant digits
fval
fval =
-1.0538e+003
output.firstorderopt
ans =
0.0043
Note Decreasing TolPCG too much can substantially increase the number of PCG iterations.
See Also
More About
• “Jacobian Multiply Function with Linear Least Squares” on page 12-30
Large Sparse Quadratic Program with Interior Point Algorithm
Create a symmetric circulant matrix based on shifts of the vector [3,6,2,14,2,6,3], with 14 being
on the main diagonal. Have the matrix be n-by-n, where n = 30,000.
n = 3e4;
H2 = speye(n);
H = 3*circshift(H2,-3,2) + 6*circshift(H2,-2,2) + 2*circshift(H2,-1,2)...
+ 14*H2 + 2*circshift(H2,1,2) + 6*circshift(H2,2,2) + 3*circshift(H2,3,2);
spy(H)
The linear constraint is that the sum of the solution elements is nonpositive. The objective function
contains a linear term expressed in the vector f.
A = ones(1,n);
b = 0;
f = 1:n;
f = -f;
Solve Problem
Solve the quadratic programming problem using the 'interior-point-convex' algorithm. To keep
the solver from stopping prematurely, set the StepTolerance option to 0.
options = optimoptions(@quadprog,'Algorithm','interior-point-convex','StepTolerance',0);
[x,fval,exitflag,output,lambda] = ...
quadprog(H,f,A,b,[],[],[],[],[],options);
On many computers you cannot create a full n-by-n matrix when n = 30,000. So you can run this
problem only by using sparse matrices.
Examine Solution
View the objective function value, number of iterations, and Lagrange multiplier associated with
linear inequality.
fprintf('The objective function value is %d.\nThe number of iterations is %d.\nThe Lagrange multiplier is %d.\n',...
    fval,output.iterations,lambda.ineqlin)
Because there are no lower bounds, upper bounds, or linear equality constraints, the only meaningful
Lagrange multiplier is lambda.ineqlin. Because lambda.ineqlin is nonzero, you can tell that the
inequality constraint is active. Evaluate the constraint to see that the solution is on the boundary.
The solution x has three regions: an initial portion, a final portion, and an approximately linear
portion over most of the solution. Plot the three regions.
subplot(3,1,1)
plot(x(1:60))
title('x(1) Through x(60)')
subplot(3,1,2)
plot(x(61:n-60))
title('x(61) Through x(n-60)')
subplot(3,1,3)
plot(x(n-59:n))
title('x(n-59) Through x(n)')
See Also
circshift | quadprog
More About
• “Sparse Matrices” (MATLAB)
Problem Definition
Consider building a circus tent to cover a square lot. The tent has five poles covered with a heavy,
elastic material. The problem is to find the natural shape of the tent. Model the shape as the height
x(p) of the tent at position p.
The potential energy of heavy material lifted to height x is cx, for a constant c that is proportional to
the weight of the material. For this problem, choose c = 1/3000.
The elastic potential energy of a piece of the material Estretch is approximately proportional to the
second derivative of the material height, times the height. You can approximate the second derivative
by the 5-point finite difference approximation (assume that the finite difference steps are of size 1).
Let Δx represent a shift of 1 in the first coordinate direction, and Δy represent a shift by 1 in the
second coordinate direction.
    Estretch(p) = ½·( −(x(p + Δx) + x(p − Δx) + x(p + Δy) + x(p − Δy)) + 4·x(p) )·x(p).
The natural shape of the tent minimizes the total potential energy. By discretizing the problem, you
find that the total potential energy to minimize is the sum over all positions p of Estretch(p) + cx(p).
Specify the boundary condition that the height of the tent at the edges is zero. The tent poles have a
cross section of 1-by-1 unit, and the tent has a total size of 33-by-33 units. Specify the height and
location of each pole. Plot the square lot region and tent poles.
height = zeros(33);
height(6:7,6:7) = 0.3;
height(26:27,26:27) = 0.3;
height(6:7,26:27) = 0.3;
height(26:27,6:7) = 0.3;
height(16:17,16:17) = 0.5;
colormap(gray);
surfl(height)
axis tight
view([-20,30]);
title('Tent Poles and Region to Cover')
Bound-Constrained Quadratic Programming, Solver-Based
The height matrix defines the lower bounds on the solution x. To restrict the solution to be zero at
the boundary, set the upper bound ub to be zero on the boundary.
boundary = false(size(height));
boundary([1,33],:) = true;
boundary(:,[1,33]) = true;
ub = inf(size(boundary)); % No upper bound on most of the region
ub(boundary) = 0;
The quadprog solver minimizes a quadratic expression of the form

(1/2)xᵀHx + fᵀx.

In this case, the linear term fᵀx corresponds to the potential energy of the material height. Therefore, specify f = 1/3000 for each component of x.
f = ones(size(height))/3000;
Create the finite difference matrix representing Estretch by using the delsq function. The delsq
function returns a sparse matrix with entries of 4 and -1 corresponding to the entries of 4 and -1 in
the formula for Estretch(p). Multiply the returned matrix by 2 to have quadprog solve the quadratic
program with the energy function as given by Estretch.
H = delsq(numgrid('S',33+2))*2;
View the structure of the matrix H. The matrix operates on x(:), which means the matrix x is
converted to a vector by linear indexing.
spy(H);
title('Sparsity Structure of Hessian Matrix');
x = quadprog(H,f,[],[],[],[],height,ub);
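For readers working outside MATLAB, the same kind of bound-constrained QP can be sketched in Python with SciPy. The grid size, single pole, and choice of the L-BFGS-B solver below are illustrative assumptions, not part of the example:

```python
import numpy as np
from scipy.sparse import identity, kron, diags
from scipy.optimize import minimize

n = 9                                             # small grid for illustration
# 1-D second-difference matrix; kron builds the 2-D 5-point Laplacian
T = diags([-1, 2, -1], [-1, 0, 1], shape=(n, n))
L = kron(identity(n), T) + kron(T, identity(n))   # entries 4 and -1, like delsq
H = (2 * L).tocsc()
f = np.full(n * n, 1 / 3000)

lb = np.zeros((n, n)); lb[4, 4] = 0.5             # one tent pole (hypothetical)
ub = np.full((n, n), np.inf)
ub[[0, -1], :] = 0; ub[:, [0, -1]] = 0            # zero height at the boundary

fun = lambda x: 0.5 * x @ (H @ x) + f @ x         # quadratic objective
jac = lambda x: H @ x + f                         # its exact gradient
bounds = list(zip(lb.ravel(), ub.ravel()))
res = minimize(fun, lb.ravel(), jac=jac, bounds=bounds, method='L-BFGS-B')
height = res.x.reshape(n, n)                      # solution surface
```

The solution respects the pole height at the center and is pinned to zero along the boundary, mirroring the quadprog call above.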
Plot Solution
S = reshape(x,size(height));
surfl(S);
axis tight;
view([-20,30]);
See Also
More About
• “Bound-Constrained Quadratic Programming, Problem-Based” on page 11-40
The matrices that define the problems in this example are dense; however, the interior-point
algorithm in quadprog can also exploit sparsity in the problem matrices for increased speed. For a
sparse example, see “Large Sparse Quadratic Program with Interior Point Algorithm” on page 11-21.
Suppose that there are n different assets. The rate of return of asset i is a random variable with expected value mᵢ. The problem is to find what fraction xᵢ to invest in each asset in order to minimize risk, subject to a specified minimum expected rate of return.
The expected return should be no less than a minimal rate of portfolio return r that the investor desires; the sum of the investment fractions xᵢ should add up to one; and, being fractions (or percentages), they should be numbers between zero and one.
Since the objective to minimize portfolio risk is quadratic, and the constraints are linear, the resulting
optimization problem is a quadratic program, or QP.
225-Asset Problem
Let us now solve the QP with 225 assets. The dataset is from the OR-Library [Chang, T.-J., Meade, N.,
Beasley, J.E. and Sharaiha, Y.M., "Heuristics for cardinality constrained portfolio optimisation"
Computers & Operations Research 27 (2000) 1271-1302].
We load the dataset and then set up the constraints in a format expected by quadprog. In this
dataset the rates of return range between -0.008489 and 0.003971; we pick a desired return in
between, e.g., 0.002 (0.2 percent).
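The constraint setup can be sketched in Python with synthetic data (the asset count, returns, and variable names below are hypothetical, chosen only to show the sign conventions quadprog-style solvers expect):

```python
import numpy as np

rng = np.random.default_rng(0)
nAssets = 5                              # small synthetic example
mean_return = rng.uniform(-0.008, 0.004, nAssets)
r = 0.002                                # desired minimum return

# Expected return >= r, written as a <= inequality: -m'x <= -r
Aineq = -mean_return.reshape(1, -1)
bineq = np.array([-r])
# The fractions sum to one
Aeq = np.ones((1, nAssets))
beq = np.array([1.0])
# Each fraction lies between zero and one
lb = np.zeros(nAssets)
ub = np.ones(nAssets)
```

Flipping the sign turns the "at least r" requirement into the less-than-or-equal form that the solver interface requires.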
Quadratic Programming for Portfolio Optimization Problems, Solver-Based
In order to solve the QP using the interior-point algorithm, we set the option Algorithm to 'interior-point-convex'.
options = optimoptions('quadprog','Algorithm','interior-point-convex');
We now set some additional options, and call the solver quadprog.
Set additional options: turn on iterative display, and set a tighter optimality termination tolerance.
options = optimoptions(options,'Display','iter','TolFun',1e-10);
Plot results.
plotPortfDemoStandardModel(x1)
We now add to the model group constraints that require that 30% of the investor's money be invested in assets 1 to 75, 30% in assets 76 to 150, and 30% in assets 151 to 225. Each of these groups of assets could be, for instance, a different industry, such as technology, automotive, or pharmaceutical. The constraints that capture this new requirement are
Groups = blkdiag(ones(1,nAssets/3),ones(1,nAssets/3),ones(1,nAssets/3));
Aineq = [Aineq; -Groups]; % convert to <= constraint
bineq = [bineq; -0.3*ones(3,1)]; % by changing signs
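The sign-flip encoding of the group constraints can be checked in Python (the equal-weight test portfolio below is an illustrative choice):

```python
import numpy as np
from scipy.linalg import block_diag

nAssets = 225
g = nAssets // 3
# Each row of Groups sums the fractions in one group of 75 assets
Groups = block_diag(np.ones((1, g)), np.ones((1, g)), np.ones((1, g)))

# "Group share >= 0.3" becomes the <= row: -Groups @ x <= -0.3
x = np.full(nAssets, 1 / nAssets)        # equal-weight portfolio, 1/3 per group
lhs = -Groups @ x
assert np.allclose(lhs, -1/3)            # each group holds exactly one third
assert np.all(lhs <= -0.3)               # so the group constraints are satisfied
```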
tic
[x2,fval2] = quadprog(Covariance,c,Aineq,bineq,Aeq,beq,lb,ub,[],options);
toc
11 Quadratic Programming
We see from the second bar plot that, as a result of the additional group constraints, the portfolio is
now more evenly distributed across the three asset groups than the first portfolio. This imposed
diversification also resulted in a slight increase in the risk, as measured by the objective function (see
column labeled "f(x)" for the last iteration in the iterative display for both runs).
In order to show how quadprog's interior-point algorithm behaves on a larger problem, we'll use a
1000-asset randomly generated dataset. We generate a random correlation matrix (symmetric,
positive-semidefinite, with ones on the diagonal) using the gallery function in MATLAB®.
rng(0,'twister');
nAssets = 1000; % desired number of assets
a = -0.1; b = 0.4;
mean_return = a + (b-a).*rand(nAssets,1);
a = 0.08; b = 0.6;
stdDev_return = a + (b-a).*rand(nAssets,1);
% Correlation matrix, generated using Correlation = gallery('randcorr',nAssets).
% (Generating a correlation matrix of this size takes a while, so we load
% a pre-generated one instead.)
load('correlationMatrixDemo.mat','Correlation');
% Calculate covariance matrix from correlation matrix.
Covariance = Correlation .* (stdDev_return * stdDev_return');
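The conversion from correlation to covariance is the elementwise product with the outer product of the standard deviations. A Python sketch with a small synthetic example (the identity correlation matrix and sizes are hypothetical, chosen only to keep the check self-contained):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
stdDev_return = rng.uniform(0.08, 0.6, n)
# A valid correlation matrix: symmetric, unit diagonal, PSD (identity for brevity)
Correlation = np.eye(n)

# Elementwise product with the outer product of standard deviations
Covariance = Correlation * np.outer(stdDev_return, stdDev_return)

# The diagonal recovers the variances
assert np.allclose(np.diag(Covariance), stdDev_return**2)
```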
We now define the standard QP problem (no group constraints here) and solve.
tic
x3 = quadprog(Covariance,c,Aineq,bineq,Aeq,beq,lb,ub,[],options);
toc
Summary
This example illustrates how to use the interior-point algorithm in quadprog on a portfolio
optimization problem, and shows the algorithm running times on quadratic problems of different
sizes.
More elaborate analyses are possible by using features specifically designed for portfolio optimization
in Financial Toolbox™.
See Also
“Mixed-Integer Quadratic Programming Portfolio Optimization: Solver-Based” on page 9-65
2∑_{i=1}^{n} xᵢ² − 2∑_{i=1}^{n−1} xᵢxᵢ₊₁ − 2x₁ − 2xₙ.
Create Problem
Create a problem variable named x that has 400 components. Also, create an expression named
objec for the objective function. Bound each variable below by 0 and above by 0.9, except allow xn to
be unbounded.
n = 400;
x = optimvar('x',n,'LowerBound',0,'UpperBound',0.9);
x(n).LowerBound = -Inf;
x(n).UpperBound = Inf;
prevtime = 1:n-1;
nexttime = 2:n;
objec = 2*sum(x.^2) - 2*sum(x(nexttime).*x(prevtime)) - 2*x(1) - 2*x(end);
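The elementwise objective above is the quadratic form (1/2)xᵀHx + fᵀx with a tridiagonal H. A Python sketch verifying this equivalence (the random test vector is illustrative):

```python
import numpy as np
from scipy.sparse import diags

n = 400
rng = np.random.default_rng(0)
x = rng.random(n)

# Objective written elementwise, as in the problem-based expression
obj = 2*np.sum(x**2) - 2*np.sum(x[1:]*x[:-1]) - 2*x[0] - 2*x[-1]

# Equivalent 1/2 x'Hx + f'x form: H is tridiagonal with 4 on the
# diagonal and -2 off it, and f = (-2, 0, ..., 0, -2)
H = diags([-2, 4, -2], [-1, 0, 1], shape=(n, n))
f = np.zeros(n); f[0] = f[-1] = -2
assert np.isclose(0.5 * x @ (H @ x) + f @ x, obj)
```

Writing the objective elementwise, as in the problem-based approach, avoids forming H explicitly; the conversion happens internally in solve.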
Create an optimization problem named qprob. Include the objective function in the problem.
qprob = optimproblem('Objective',objec);
Create options that specify the quadprog 'trust-region-reflective' algorithm and no display.
Create an initial point approximately centered between the bounds.
opts = optimoptions('quadprog','Algorithm','trust-region-reflective','Display','off');
x0 = 0.5*ones(n,1);
x00 = struct('x',x0);
[sol,qfval,qexitflag,qoutput] = solve(qprob,x00,'options',opts);
plot(sol.x,'b-')
xlabel('Index')
ylabel('x(index)')
Quadratic Programming with Bound Constraints: Problem-Based
Report the exit flag, the number of iterations, and the number of conjugate gradient iterations.
fprintf('Exit flag = %d, iterations = %d, cg iterations = %d\n',...
double(qexitflag),qoutput.iterations,qoutput.cgiterations)
Reduce the number of conjugate gradient iterations by setting the SubproblemAlgorithm option to
'factorization'. This option causes the solver to use a more expensive internal solution
technique that eliminates conjugate gradient steps, for a net overall savings of time in this case.
opts.SubproblemAlgorithm = 'factorization';
[sol2,qfval2,qexitflag2,qoutput2] = solve(qprob,x00,'options',opts);
fprintf('Exit flag = %d, iterations = %d, cg iterations = %d\n',...
double(qexitflag2),qoutput2.iterations,qoutput2.cgiterations)
Compare these solutions with that obtained using the default 'interior-point-convex' algorithm. The 'interior-point-convex' algorithm does not use an initial point, so do not pass x00 to solve.
opts = optimoptions('quadprog','Algorithm','interior-point-convex','Display','off');
[sol3,qfval3,qexitflag3,qoutput3] = solve(qprob,'options',opts);
fprintf('Exit flag = %d, iterations = %d, cg iterations = %d\n',...
double(qexitflag3),qoutput3.iterations,0)
middle = floor(n/2);
fprintf('The three solutions are slightly different.\nThe middle component is %f, %f, or %f.\n',...
sol.x(middle),sol2.x(middle),sol3.x(middle))
fprintf(['The three objective function values are %f, %f, and %f.\n' ...
'The ''interior-point'' algorithm is slightly less accurate.'],qfval,qfval2,qfval3)
The three objective function values are -1.985000, -1.985000, and -1.984963.
The 'interior-point' algorithm is slightly less accurate.
See Also
More About
• “Quadratic Minimization with Bound Constraints” on page 11-15
Large Sparse Quadratic Program, Problem-Based
Create a symmetric circulant matrix H based on shifts of the vector [3,6,2,14,2,6,3], with 14
being on the main diagonal. Have the matrix be n-by-n, where n = 30,000.
n = 3e4;
H2 = speye(n);
H = 3*circshift(H2,-3,2) + 6*circshift(H2,-2,2) + 2*circshift(H2,-1,2)...
+ 14*H2 + 2*circshift(H2,1,2) + 6*circshift(H2,2,2) + 3*circshift(H2,3,2);
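An equivalent construction in Python uses scipy.linalg.circulant; the smaller size below is only to keep the check fast, and the structural assertions mirror the circshift-based MATLAB construction:

```python
import numpy as np
from scipy.linalg import circulant

n = 300                                   # smaller than 30,000, same structure
c = np.zeros(n)
c[0] = 14                                 # main diagonal
c[[1, 2, 3]] = [2, 6, 3]                  # sub-diagonal bands
c[[-1, -2, -3]] = [2, 6, 3]               # super-diagonal bands (wraps around)
H = circulant(c)

assert np.allclose(H, H.T)                # symmetric
assert np.all(np.diag(H) == 14)           # 14 on the main diagonal
assert H[0, 1] == 2 and H[0, 2] == 6 and H[0, 3] == 3
```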
spy(H)
x = optimvar('x',n);
qprob = optimproblem;
Create the objective function and constraints. Place the objective and constraints into qprob.
f = 1:n;
obj = 1/2*x'*H*x - f*x;
qprob.Objective = obj;
cons = sum(x) <= 0;
qprob.Constraints = cons;
Solve Problem
Solve the quadratic programming problem using the default 'interior-point-convex' algorithm
and sparse linear algebra. To keep the solver from stopping prematurely, set the StepTolerance
option to 0.
options = optimoptions('quadprog','Algorithm','interior-point-convex',...
'LinearSolver','sparse','StepTolerance',0);
[sol,fval,exitflag,output,lambda] = solve(qprob,'Options',options);
Examine Solution
View the objective function value, number of iterations, and Lagrange multiplier associated with the
linear inequality constraint.
fprintf('The objective function value is %d.\nThe number of iterations is %d.\nThe Lagrange multiplier is %d.\n',...
    fval,output.iterations,lambda.Constraints)
The solution x has three regions: an initial portion, a final portion, and an approximately linear
portion over most of the solution. Plot the three regions.
subplot(3,1,1)
plot(sol.x(1:60))
title('x(1) Through x(60)')
subplot(3,1,2)
plot(sol.x(61:n-60))
title('x(61) Through x(n-60)')
subplot(3,1,3)
plot(sol.x(n-59:n))
title('x(n-59) Through x(n)')
See Also
More About
• “Large Sparse Quadratic Program with Interior Point Algorithm” on page 11-21
For a solver-based version of this example, see “Bound-Constrained Quadratic Programming, Solver-
Based” on page 11-24.
Problem Definition
Consider building a circus tent to cover a square lot. The tent has five poles covered with a heavy,
elastic material. The problem is to find the natural shape of the tent. Model the shape as the height
x(p) of the tent at position p.
The potential energy of heavy material lifted to height x is cx, for a constant c that is proportional to
the weight of the material. For this problem, choose c = 1/3000.
The elastic potential energy of a piece of the material Estretch is approximately proportional to the
second derivative of the material height, times the height. You can approximate the second derivative
by the five-point finite difference approximation (assume that the finite difference steps are of size 1).
Let Δx represent a shift of 1 in the first coordinate direction, and Δy represent a shift of 1 in the
second coordinate direction.
Estretch(p) = ( −1·[ x(p + Δx) + x(p − Δx) + x(p + Δy) + x(p − Δy) ] + 4x(p) ) · x(p).
The natural shape of the tent minimizes the total potential energy. By discretizing the problem, you
find that the total potential energy to minimize is the sum over all positions p of Estretch(p) + cx(p).
Specify the boundary condition that the height of the tent at the edges is zero. The tent poles have a
cross section of 1-by-1 unit, and the tent has a total size of 33-by-33 units. Specify the height and
location of each pole. Plot the square lot region and tent poles.
height = zeros(33);
height(6:7,6:7) = 0.3;
height(26:27,26:27) = 0.3;
height(6:7,26:27) = 0.3;
height(26:27,6:7) = 0.3;
height(16:17,16:17) = 0.5;
colormap(gray);
surfl(height)
axis tight
view([-20,30]);
title('Tent Poles and Region to Cover')
Bound-Constrained Quadratic Programming, Problem-Based
Calculate the elastic potential energy at each point. First, calculate the potential energy in the
interior of the region, where the finite differences do not overstep the region containing the solution.
L = size(height,1);
peStretch = optimexpr(L,L); % This initializes peStretch to zeros(L,L)
interior = 2:(L-1);
peStretch(interior,interior) = (-1*(x(interior - 1,interior) + x(interior + 1,interior) ...
+ x(interior,interior - 1) + x(interior,interior + 1)) + 4*x(interior,interior))...
.*x(interior, interior);
Because the solution is constrained to be 0 at the edges of the region, you do not need to include the
remainder of the terms. All terms have a multiple of x, and x at the edge is zero. For reference in
case you want to use a different boundary condition, the following is a commented-out version of the
potential energy calculation.
Create an expression for the potential energy due to the material height.
peHeight = x/3000;
Create an optimization problem named tentproblem. Include the expression for the objective
function, which is the sum of the two potential energies over all locations.
Set Constraint
Set the constraint that the solution must lie above the values of the height matrix. This matrix is
zero at most locations, representing the ground, and includes the height of each tent pole at its
location.
Solve the problem. Ignore the resulting statement "Your Hessian is not symmetric." solve issues this
statement because the internal conversion from problem form to a quadratic matrix does not ensure
that the matrix is symmetric.
sol = solve(tentproblem);
Plot Solution
surfl(sol.x);
axis tight;
view([-20,30]);
See Also
More About
• “Bound-Constrained Quadratic Programming, Solver-Based” on page 11-24
Suppose that a portfolio contains n different assets. The rate of return of asset i is a random variable with expected value mᵢ. The problem is to find what fraction xᵢ to invest in each asset in order to minimize risk, subject to a specified minimum expected rate of return.
The expected return should be no less than a minimal rate of portfolio return r that the investor desires; the sum of the investment fractions xᵢ should add up to one; and, being fractions (or percentages), they should be numbers between zero and one.
Since the objective to minimize portfolio risk is quadratic, and the constraints are linear, the resulting
optimization problem is a quadratic program, or QP.
225-Asset Problem
Let us now solve the QP with 225 assets. The dataset is from the OR-Library [Chang, T.-J., Meade, N.,
Beasley, J.E. and Sharaiha, Y.M., "Heuristics for cardinality constrained portfolio optimisation"
Computers & Operations Research 27 (2000) 1271-1302].
We load the dataset and then set up the constraints for the problem-based approach. In this dataset
the rates of return range between -0.008489 and 0.003971; we pick a desired return in between,
e.g., 0.002 (0.2 percent).
Quadratic Programming for Portfolio Optimization, Problem-Based
portprob = optimproblem;
Create an optimization vector variable 'x' with nAssets elements. This variable represents the fraction of wealth invested in each asset, so it should lie between 0 and 1.
x = optimvar('x',nAssets,'LowerBound',0,'UpperBound',1);
The objective function is 1/2*x'*Covariance*x. Include this objective in the problem.
objective = 1/2*x'*Covariance*x;
portprob.Objective = objective;
The sum of the variables is 1, meaning the entire portfolio is invested. Express this as a constraint
and place it in the problem.
sumcons = sum(x) == 1;
portprob.Constraints.sumcons = sumcons;
The average return must be greater than r. Express this as a constraint and place it in the problem.
Set options to turn on iterative display, and set a tighter optimality termination tolerance.
options = optimoptions('quadprog','Display','iter','TolFun',1e-10);
tic
[x1,fval1] = solve(portprob,'Options',options);
toc
Plot results.
plotPortfDemoStandardModel(x1.x)
We now add to the model group constraints that require that 30% of the investor's money be invested in assets 1 to 75, 30% in assets 76 to 150, and 30% in assets 151 to 225. Each of these groups of assets could be, for instance, a different industry, such as technology, automotive, or pharmaceutical. The constraints that capture this new requirement bound the total fraction invested in each group below by 0.3.
tic
[x2,fval2] = solve(portprob,'Options',options);
toc
plotPortfDemoGroupModel(x1.x,x2.x);
We see from the second bar plot that, as a result of the additional group constraints, the portfolio is
now more evenly distributed across the three asset groups than the first portfolio. This imposed
diversification also resulted in a slight increase in the risk, as measured by the objective function (see
column labeled "f(x)" for the last iteration in the iterative display for both runs).
In order to show how the solver behaves on a larger problem, we'll use a 1000-asset randomly
generated dataset. We generate a random correlation matrix (symmetric, positive-semidefinite, with
ones on the diagonal) using the gallery function in MATLAB®.
rng(0,'twister');
nAssets = 1000; % desired number of assets
a = -0.1; b = 0.4;
mean_return = a + (b-a).*rand(nAssets,1);
r = 0.15; % desired return
a = 0.08; b = 0.6;
stdDev_return = a + (b-a).*rand(nAssets,1);
load('correlationMatrixDemo.mat','Correlation');
portprob2 = optimproblem;
x = optimvar('x',nAssets,'LowerBound',0,'UpperBound',1);
objective = 1/2*x'*Covariance*x;
portprob2.Objective = objective;
Include the constraints that the sum of the variables is 1 and the average return is greater than r.
sumcons = sum(x) == 1;
portprob2.Constraints.sumcons = sumcons;
averagereturn = dot(mean_return,x) >= r;
portprob2.Constraints.averagereturn = averagereturn;
tic
x3 = solve(portprob2,'Options',options);
toc
Summary
This example illustrates how to use the problem-based approach on a portfolio optimization problem, and shows the algorithm running times on quadratic problems of different sizes.
More elaborate analyses are possible by using features specifically designed for portfolio optimization
in Financial Toolbox™.
See Also
More About
• “Quadratic Programming for Portfolio Optimization Problems, Solver-Based” on page 11-28
Code Generation for quadprog
Typically, you use code generation to deploy code on hardware that is not running MATLAB. For an
example, see “Generate Code for quadprog” on page 11-53. For general nonlinear programming, see
“Code Generation for Optimization Basics” on page 6-116.
options = optimoptions('quadprog','Algorithm','active-set');
[x,fval,exitflag] = quadprog(H,f,A,b,Aeq,beq,lb,ub,x0,options);
• Code generation supports these options:
opts = optimoptions('quadprog','Algorithm','active-set');
opts = optimoptions(opts,'MaxIterations',1e4); % Recommended
opts.MaxIterations = 1e4; % Not recommended
• Do not load options from a file. Doing so can cause code generation to fail. Instead, create options
in your code.
• If you specify an option that is not supported, the option is typically ignored during code
generation. For reliable results, specify only supported options.
If your target hardware has multiple cores, then you can achieve better performance by using custom
multithreaded LAPACK and BLAS libraries. To incorporate these libraries in your generated code, see
“Speed Up Linear Algebra in Generated Standalone Code by Using LAPACK Calls” (MATLAB Coder).
See Also
codegen | optimoptions | quadprog
More About
• “Generate Code for quadprog” on page 11-53
• “Static Memory Allocation for fmincon Code Generation” on page 6-120
• “Optimization Code Generation for Real-Time Applications” on page 6-122
Generate Code for quadprog
(1/2)xᵀHx + fᵀx

where

H = [  1  −1   1
      −1   2  −2
       1  −2   4 ]

and

f = [  2
      −3
       1 ]
After some time, codegen creates a MEX file named test_quadp_mex.mexw64 (the file extension
varies, depending on your system). Run the resulting C code.
[x,fval] = test_quadp_mex
x =
0
0.5000
fval =
-1.2500
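As a quick cross-check, evaluating the quadratic objective at the reported point reproduces the reported fval (taking the truncated third component of x as 0, as shown in the later runs). A Python sketch:

```python
import numpy as np

H = np.array([[ 1., -1.,  1.],
              [-1.,  2., -2.],
              [ 1., -2.,  4.]])
f = np.array([2., -3., 1.])
x = np.array([0., 0.5, 0.])              # solution reported by the MEX run

# Quadratic objective 1/2 x'Hx + f'x at the reported solution
fval = 0.5 * x @ H @ x + f @ x
assert np.isclose(fval, -1.25)
```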
cfg = coder.config('mex');
cfg.IntegrityChecks = false;
cfg.SaturateOnIntegerOverflow = false;
cfg.DynamicMemoryAllocation = 'Off';
Create a file named test_quadp2.m containing the following code. This code sets a looser optimality
tolerance than the default 1e-8.
[x,fval,eflag,output] = test_quadp2_mex
x =
0
0.5000
0
fval =
-1.2500
eflag =
output =
algorithm: 'active-set'
firstorderopt: 8.8818e-16
constrviolation: 0
iterations: 3
Changing the optimality tolerance does not affect the optimization process, because the 'active-set' algorithm does not check this tolerance until it reaches a point where it stops.
Create a third file that limits the number of allowed iterations to 2 to see the effect on the
optimization process.
To see the effect of these settings on the solver, run test_quadp3 in MATLAB without generating
code.
[x,fval,exitflag,output] = test_quadp3
x =
-0.0000
0.5000
0
fval =
-1.2500
exitflag =
output =
algorithm: 'active-set'
iterations: 2
constrviolation: 1.6441e-18
firstorderopt: 2
message: 'Solver stopped prematurely. quadprog stopped because it exceeded the iteration limit…'
linearsolver: []
cgiterations: []
In this case, the solver reached the solution in fewer steps than the default. Usually, though, limiting
the number of iterations does not allow the solver to reach a correct solution.
See Also
codegen | optimoptions | quadprog
More About
• “Code Generation for quadprog” on page 11-51
• “Static Memory Allocation for fmincon Code Generation” on page 6-120
• “Optimization Code Generation for Real-Time Applications” on page 6-122
Quadratic Programming with Many Linear Constraints
Problem Description
Create a pseudorandom quadratic problem with N variables and 10*N linear inequality constraints.
Specify N = 150.
ee = min(eig(H))
ee = 3.6976
All of the eigenvalues are positive, so the quadratic form x'*H*x is convex.
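One common way to generate a pseudorandom convex quadratic form can be sketched in Python (this recipe is an assumption for illustration; the example's exact generation code is not shown here):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 150
# A random symmetric matrix shifted to make all eigenvalues positive
M = rng.standard_normal((N, N))
H = M @ M.T / N + np.eye(N)              # positive definite by construction

ee = np.min(np.linalg.eigvalsh(H))
assert ee > 0                            # so x'Hx is a convex quadratic form
```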
Aeq = [];
beq = [];
lb = [];
ub = [];
Set options to use the quadprog 'active-set' algorithm. This algorithm requires an initial point.
Set the initial point x0 to be a zero vector of length N.
opts = optimoptions('quadprog','Algorithm','active-set');
x0 = zeros(N,1);
tic
[xa,fvala,eflaga,outputa,lambdaa] = quadprog(H,f,A,b,Aeq,beq,lb,ub,x0,opts);
toc
Compare the solution time to the time of the default 'interior-point-convex' algorithm.
tic
[xi,fvali,eflagi,outputi,lambdai] = quadprog(H,f,A,b);
toc
The 'active-set' algorithm is much faster on problems with many linear constraints.
The 'active-set' algorithm reports only a few nonzero entries in the Lagrange multiplier
structure associated with the linear constraint matrix.
nnz(lambdaa.ineqlin)
ans = 14
nnz(lambdai.ineqlin)
ans = 1500
Nearly all of these Lagrange multipliers are smaller than N*eps in size. Counting the multipliers larger than this threshold gives
nnz(lambdai.ineqlin > N*eps)
ans = 20
In other words, the 'active-set' algorithm gives clear indications of active constraints in the
Lagrange multiplier structure, whereas the 'interior-point-convex' algorithm does not.
See Also
lsqlin | quadprog
More About
• “Potential Inaccuracy with Interior-Point Algorithms” on page 2-10
12
Least Squares
There are several Optimization Toolbox solvers available for various types of F(x) and various types of constraints.
There are five least-squares algorithms in Optimization Toolbox solvers, in addition to the algorithms
used in mldivide:
• lsqlin interior-point
• lsqlin active-set
• Trust-region-reflective (nonlinear or linear least-squares)
• Levenberg-Marquardt (nonlinear least-squares)
• The algorithm used by lsqnonneg
All the algorithms except lsqlin active-set are large-scale; see “Large-Scale vs. Medium-Scale
Algorithms” on page 2-10. For a general survey of nonlinear least-squares methods, see Dennis [8].
Specific details on the Levenberg-Marquardt method can be found in Moré [28].
Least-Squares (Model Fitting) Algorithms
min_x (1/2)xᵀHx + cᵀx
subject to linear constraints and bound constraints. The lsqlin function minimizes the squared 2-
norm of the vector Cx – d subject to linear constraints and bound constraints. In other words, lsqlin
minimizes
‖Cx − d‖₂² = (Cx − d)ᵀ(Cx − d)
           = (xᵀCᵀ − dᵀ)(Cx − d)
           = xᵀCᵀCx − xᵀCᵀd − dᵀCx + dᵀd
           = (1/2)xᵀ(2CᵀC)x + (−2Cᵀd)ᵀx + dᵀd.
This fits into the quadprog framework by setting the H matrix to 2CTC and the c vector to (–2CTd).
(The additive term dTd has no effect on the location of the minimum.) After this reformulation of the
lsqlin problem, quadprog calculates the solution.
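The reformulation can be verified numerically: with no constraints, minimizing (1/2)xᵀHx + cᵀx with H = 2CᵀC and c = −2Cᵀd reduces to the least-squares normal equations. A Python sketch with random data:

```python
import numpy as np

rng = np.random.default_rng(0)
C = rng.standard_normal((20, 5))
d = rng.standard_normal(20)

# quadprog-style reformulation: H = 2*C'C, c = -2*C'd
H = 2 * C.T @ C
c = -2 * C.T @ d

# With no constraints, the QP minimizer solves H x = -c,
# which is exactly the normal equation C'C x = C'd
x_qp = np.linalg.solve(H, -c)
x_ls, *_ = np.linalg.lstsq(C, d, rcond=None)
assert np.allclose(x_qp, x_ls)
```

The additive dᵀd term drops out, as the text notes, because it does not change the location of the minimum.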
Note The quadprog 'interior-point-convex' algorithm has two code paths. It takes one when
the Hessian matrix H is an ordinary (full) matrix of doubles, and it takes the other when H is a sparse
matrix. For details of the sparse data type, see “Sparse Matrices” (MATLAB). Generally, the algorithm
is faster for large problems that have relatively few nonzero terms when you specify H as sparse.
Similarly, the algorithm is faster for small or relatively dense problems when you specify H as full.
Many of the methods used in Optimization Toolbox solvers are based on trust regions, a simple yet
powerful concept in optimization.
The current point is updated to be x + s if f(x + s) < f(x); otherwise, the current point remains
unchanged and N, the region of trust, is shrunk and the trial step computation is repeated.
The key questions in defining a specific trust-region approach to minimizing f(x) are how to choose
and compute the approximation q (defined at the current point x), how to choose and modify the trust
region N, and how accurately to solve the trust-region subproblem. This section focuses on the
unconstrained problem. Later sections discuss additional complications due to the presence of
constraints on the variables.
In the standard trust-region method ([48]), the quadratic approximation q is defined by the first two
terms of the Taylor approximation to F at x; the neighborhood N is usually spherical or ellipsoidal in
shape. Mathematically the trust-region subproblem is typically stated
min_s { (1/2)sᵀHs + sᵀg  such that  ∥Ds∥ ≤ Δ },   (12-2)
where g is the gradient of f at the current point x, H is the Hessian matrix (the symmetric matrix of
second derivatives), D is a diagonal scaling matrix, Δ is a positive scalar, and ∥ . ∥ is the 2-norm. Good
algorithms exist for solving “Equation 12-2” (see [48]); such algorithms typically involve the
computation of all eigenvalues of H and a Newton process applied to the secular equation
1/Δ − 1/∥s∥ = 0.
Such algorithms provide an accurate solution to “Equation 12-2”. However, they require time
proportional to several factorizations of H. Therefore, for trust-region problems a different approach
is needed. Several approximation and heuristic strategies, based on “Equation 12-2”, have been
proposed in the literature ([42] and [50]). The approximation approach followed in Optimization
Toolbox solvers is to restrict the trust-region subproblem to a two-dimensional subspace S ([39] and
[42]). Once the subspace S has been computed, the work to solve “Equation 12-2” is trivial even if full
eigenvalue/eigenvector information is needed (since in the subspace, the problem is only two-
dimensional). The dominant work has now shifted to the determination of the subspace.
The two-dimensional subspace S is determined with the aid of a preconditioned conjugate gradient
process described below. The solver defines S as the linear space spanned by s1 and s2, where s1 is in
the direction of the gradient g, and s2 is either an approximate Newton direction, i.e., a solution to
H ⋅ s2 = − g, (12-3)
The philosophy behind this choice of S is to force global convergence (via the steepest descent
direction or negative curvature direction) and achieve fast local convergence (via the Newton step,
when it exists).
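A toy Python sketch of the two-dimensional subspace idea: restrict the quadratic model to span{s1, s2} and search over only two coefficients (the brute-force grid below stands in for the exact small solve, purely for illustration):

```python
import numpy as np

# Minimize q(s) = 1/2 s'Hs + g's over span{s1, s2} with ||s|| <= Delta,
# where s1 is the gradient direction and s2 an approximate Newton direction
rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n))
H = A @ A.T + np.eye(n)                   # positive definite for simplicity
g = rng.standard_normal(n)
Delta = 0.5

s1 = g / np.linalg.norm(g)
s2 = np.linalg.solve(H, -g)               # Newton direction
Q, _ = np.linalg.qr(np.column_stack([s1, s2]))   # orthonormal basis of S

q = lambda s: 0.5 * s @ H @ s + g @ s
# The subproblem is now two-dimensional: scan coefficients (a, b) on the disk
grid = np.linspace(-Delta, Delta, 81)
best = min(q(Q @ np.array([a, b]))
           for a in grid for b in grid if a*a + b*b <= Delta**2)
assert best <= 0.0                        # no worse than the zero step
```

Because Q has orthonormal columns, ∥Q·(a, b)∥ equals ∥(a, b)∥, so the disk in coefficient space maps exactly to the trust-region ball within the subspace.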
These four steps are repeated until convergence. The trust-region dimension Δ is adjusted according
to standard rules. In particular, it is decreased if the trial step is not accepted, i.e., f(x + s) ≥ f(x). See
[46] and [49] for a discussion of this aspect.
Optimization Toolbox solvers treat a few important special cases of f with specialized functions:
nonlinear least-squares, quadratic functions, and linear least-squares. However, the underlying
algorithmic ideas are the same as for the general case. These special cases are discussed in later
sections.
where F(x) is a vector-valued function with component i of F(x) equal to fi(x). The basic method used
to solve this problem is the same as in the general case described in “Trust-Region Methods for
Nonlinear Minimization” on page 6-2. However, the structure of the nonlinear least-squares problem
is exploited to enhance efficiency. In particular, an approximate Gauss-Newton direction, i.e., a
solution s to
min_s ‖Js + F‖₂²,   (12-6)
(where J is the Jacobian of F(x)) is used to help define the two-dimensional subspace S. Second
derivatives of the component function fi(x) are not used.
In each iteration the method of preconditioned conjugate gradients is used to approximately solve the
normal equations, i.e.,
JᵀJs = −JᵀF,
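One Gauss-Newton step via conjugate gradients on the normal equations can be sketched in Python; the exponential-fit residual and starting point are illustrative choices, not from the text:

```python
import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

# Fit y = x1*exp(x2*t) to data generated from (x1, x2) = (2, 0.5)
t = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * np.exp(0.5 * t)

F = lambda x: x[0] * np.exp(x[1] * t) - y            # residual vector
J = lambda x: np.column_stack([np.exp(x[1] * t),     # Jacobian of F
                               x[0] * t * np.exp(x[1] * t)])

x = np.array([1.8, 0.4])                  # start near the solution
Jx, Fx = J(x), F(x)

# Solve the normal equations J'J s = -J'F with conjugate gradients,
# applying J'J only through matrix-vector products
JTJ = LinearOperator((2, 2), matvec=lambda v: Jx.T @ (Jx @ v), dtype=float)
s, info = cg(JTJ, -Jx.T @ Fx)
assert info == 0                          # CG converged
assert np.sum(F(x + s)**2) < np.sum(F(x)**2)   # the step reduces the residual
```

The LinearOperator mirrors the point of the method: JᵀJ never needs to be formed explicitly, only applied to vectors.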
possibly subject to linear constraints. The algorithm generates strictly feasible iterates converging, in
the limit, to a local solution. Each iteration involves the approximate solution of a large linear system
(of order n, where n is the length of x). The iteration matrices have the structure of the matrix C. In
particular, the method of preconditioned conjugate gradients is used to approximately solve the
normal equations, i.e.,
CᵀCx = −Cᵀd,
The subspace trust-region method is used to determine a search direction. However, instead of
restricting the step to (possibly) one reflection step, as in the nonlinear minimization case, a
piecewise reflective line search is conducted at each iteration, as in the quadratic case. See [45] for
details of the line search. Ultimately, the linear systems represent a Newton approach capturing the
first-order optimality conditions at the solution, resulting in strong local convergence rates.
Jacobian Multiply Function
lsqlin can solve the linearly-constrained least-squares problem without using the matrix C
explicitly. Instead, it uses a Jacobian multiply function jmfun,
W = jmfun(Jinfo,Y,flag)
that you provide. The function must calculate the following products for a matrix Y:
12-5
12 Least Squares
This can be useful if C is large, but contains enough structure that you can write jmfun without
forming C explicitly. For an example, see “Jacobian Multiply Function with Linear Least Squares” on
page 12-30.
Levenberg-Marquardt Method
In the least-squares problem a function f(x) is minimized that is a sum of squares.
min_x f(x) = ‖F(x)‖₂² = ∑ᵢ Fᵢ²(x).   (12-7)
Problems of this type occur in a large number of practical applications, especially when fitting model
functions to data, i.e., nonlinear parameter estimation. They are also prevalent in control where you
want the output, y(x,t), to follow some continuous model trajectory, φ(t), for vector x and scalar t. This
problem can be expressed as
min_{x ∈ ℜⁿ} ∫_{t₁}^{t₂} (y(x, t) − φ(t))² dt,   (12-8)
When the integral is discretized using a suitable quadrature formula, the above can be formulated as
a least-squares problem:
min_{x ∈ ℜⁿ} f(x) = ∑_{i=1}^{m} (y(x, tᵢ) − φ(tᵢ))²,   (12-9)

where y and φ include the weights of the quadrature scheme. Note that in this problem the vector F(x) has components

Fᵢ(x) = y(x, tᵢ) − φ(tᵢ),  i = 1, …, m.
In problems of this kind, the residual ‖F(x)‖ is likely to be small at the optimum, since it is general practice to set realistically achievable target trajectories. Although the least-squares problem in Equation 12-7 can be minimized using a general unconstrained minimization technique, as described in “Basics of Unconstrained Optimization” on page 6-4, certain characteristics of the problem can often be exploited to improve the iterative efficiency of the solution procedure. The gradient and Hessian matrix of Equation 12-7 have a special structure.
Denoting the m-by-n Jacobian matrix of F(x) as J(x), the gradient vector of f(x) as G(x), the Hessian
matrix of f(x) as H(x), and the Hessian matrix of each Fi(x) as Hi(x), you have
G(x) = 2J(x)ᵀF(x)
H(x) = 2J(x)ᵀJ(x) + 2Q(x),   (12-10)
where
Q(x) = ∑_{i=1}^{m} Fᵢ(x) ⋅ Hᵢ(x).
The matrix Q(x) has the property that as xₖ approaches the solution and the residual ‖F(x)‖ tends to zero, Q(x) also tends to zero. Thus, when ‖F(x)‖ is small at the solution, a very effective method is to use the Gauss-Newton direction as a basis for an optimization procedure.
In the Gauss-Newton method, a search direction, dk, is obtained at each major iteration, k, that is a
solution of the linear least-squares problem:
min_{dₖ ∈ ℜⁿ} ‖J(xₖ)dₖ + F(xₖ)‖₂².   (12-11)
The direction derived from this method is equivalent to the Newton direction when the terms of Q(x)
can be ignored. The search direction dk can be used as part of a line search strategy to ensure that at
each iteration the function f(x) decreases.
The Gauss-Newton method often encounters problems when the second-order term Q(x) is significant.
A method that overcomes this problem is the Levenberg-Marquardt method.
The Levenberg-Marquardt method ([25], [27]) uses a search direction that is a solution of the linear set of equations
(J(xₖ)ᵀJ(xₖ) + λₖI) dₖ = −J(xₖ)ᵀF(xₖ),   (12-12)

(J(xₖ)ᵀJ(xₖ) + λₖ diag(J(xₖ)ᵀJ(xₖ))) dₖ = −J(xₖ)ᵀF(xₖ),   (12-13)
where the scalar λk controls both the magnitude and direction of dk. Set option ScaleProblem to
'none' to choose “Equation 12-12”, and set ScaleProblem to 'Jacobian' to choose
“Equation 12-13”.
You set the initial value of the parameter λ0 using the InitDamping option. Occasionally, the 0.01
default value of this option can be unsuitable. If you find that the Levenberg-Marquardt algorithm
makes little initial progress, try setting InitDamping to a different value than the default, perhaps
1e2.
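For example, both ScaleProblem and InitDamping can be set through optimoptions (fun and x0 below are placeholders for your own objective and start point):

```matlab
opts = optimoptions('lsqnonlin', ...
    'Algorithm','levenberg-marquardt', ...
    'ScaleProblem','jacobian', ...  % use the scaled form, Equation 12-13
    'InitDamping',1e2);             % larger initial lambda than the 0.01 default
% [x,resnorm] = lsqnonlin(fun,x0,[],[],opts);
```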
When λk is zero, the direction dk is identical to that of the Gauss-Newton method. As λk tends to
infinity, dₖ tends towards the steepest descent direction, with magnitude tending to zero. This implies
that for some sufficiently large λₖ, the relation ‖F(xₖ + dₖ)‖ < ‖F(xₖ)‖ holds true. The term λₖ can therefore
be controlled to ensure descent even when second-order terms, which restrict the efficiency of the
Gauss-Newton method, are encountered. When the step is successful (gives a lower function value),
the algorithm sets λk+1 = λk/10. When the step is unsuccessful, the algorithm sets λk+1 = λk*10.
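The damping control can be sketched as follows (an illustration of Equation 12-12 and the λ update rule, not the toolbox code; Rosenbrock's function is written here in residual form):

```matlab
F = @(x) [10*(x(2)-x(1)^2); 1-x(1)];  % Rosenbrock residuals
J = @(x) [-20*x(1) 10; -1 0];         % Jacobian of F
x = [-1.9; 2];
lambda = 0.01;                        % initial damping (InitDamping default)
for k = 1:200
    d = -(J(x)'*J(x) + lambda*eye(2)) \ (J(x)'*F(x));  % Equation 12-12
    if norm(F(x+d)) < norm(F(x))  % successful step: accept it, reduce lambda
        x = x + d;
        lambda = lambda/10;
    else                          % unsuccessful step: increase lambda
        lambda = lambda*10;
    end
end
```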
The Levenberg-Marquardt method therefore uses a search direction that is a cross between the
Gauss-Newton direction and the steepest descent direction. This is illustrated in “Figure 12-1,
Levenberg-Marquardt Method on Rosenbrock's Function” on page 12-8. The solution for
Rosenbrock's function converges after 90 function evaluations compared to 48 for the Gauss-Newton
method. The poorer efficiency is partly because the Gauss-Newton method is generally more effective
when the residual is zero at the solution. However, such information is not always available
beforehand, and the increased robustness of the Levenberg-Marquardt method compensates for its
occasional poorer efficiency.
For a more complete description of this figure, including scripts that generate the iterative points, see
“Banana Function Minimization”.
See Also
lsqcurvefit | lsqlin | lsqnonlin | lsqnonneg | quadprog
More About
• “Least Squares”
Nonlinear Data-Fitting
This example shows how to fit a nonlinear function to data using several Optimization Toolbox™
algorithms.
Problem Setup
Data = ...
[0.0000 5.8955
0.1000 3.5639
0.2000 2.5173
0.3000 1.9790
0.4000 1.8990
0.5000 1.3938
0.6000 1.1359
0.7000 1.0096
0.8000 1.0343
0.9000 0.8435
1.0000 0.6856
1.1000 0.6100
1.2000 0.5392
1.3000 0.3946
1.4000 0.3903
1.5000 0.5474
1.6000 0.3459
1.7000 0.1370
1.8000 0.2211
1.9000 0.1704
2.0000 0.2636];
t = Data(:,1);
y = Data(:,2);
% axis([0 2 -0.5 6])
% hold on
plot(t,y,'ro')
title('Data points')
% hold off
The problem is to fit the function

y = c(1)*exp(-lam(1)*t) + c(2)*exp(-lam(2)*t)

to the data.
To solve with lsqcurvefit, collect the parameters into a single vector x:

x(1) = c(1)
x(2) = lam(1)
x(3) = c(2)
x(4) = lam(2)
Then define the curve as a function of the parameters x and the data t:
F = @(x,xdata)x(1)*exp(-x(2)*xdata) + x(3)*exp(-x(4)*xdata);
We arbitrarily set our initial point x0 as follows: c(1) = 1, lam(1) = 1, c(2) = 1, lam(2) = 0:
x0 = [1 1 1 0];
[x,resnorm,~,exitflag,output] = lsqcurvefit(F,x0,t,y)
lsqcurvefit stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
x = 1×4
resnorm = 0.1477
exitflag = 3
hold on
plot(t,F(x,t))
hold off
To solve the problem using fminunc, we set the objective function as the sum of squares of the
residuals.
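The call that produced the following output does not appear in this extract. A sketch consistent with the reported results (reusing F, t, y, and x0 from above; the quasi-newton setting matches the displayed output structure):

```matlab
Fsumsquares = @(x)sum((F(x,t) - y).^2);  % scalar sum-of-squares objective
optsu = optimoptions('fminunc','Algorithm','quasi-newton');
[xunc,ressquared,eflag,outputu] = fminunc(Fsumsquares,x0,optsu)
```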
xunc = 1×4
ressquared = 0.1477
eflag = 1
lssteplength: 1
firstorderopt: 2.9662e-05
algorithm: 'quasi-newton'
message: '...'
Notice that fminunc found the same solution as lsqcurvefit, but took many more function
evaluations to do so. The parameters for fminunc are in the opposite order as those for
lsqcurvefit; the larger lam is lam(2), not lam(1). This is not surprising; the order of the variables is arbitrary.
fprintf(['There were %d function evaluations using fminunc,' ...
    ' and %d using lsqcurvefit.\n'], ...
    outputu.funcCount,output.funcCount)
There were 185 function evaluations using fminunc, and 35 using lsqcurvefit.
Notice that the fitting problem is linear in the parameters c(1) and c(2). This means for any values of
lam(1) and lam(2), we can use the backslash operator to find the values of c(1) and c(2) that solve the
least-squares problem.
We now rework the problem as a two-dimensional problem, searching for the best values of lam(1)
and lam(2). The values of c(1) and c(2) are calculated at each step using the backslash operator as
described above.
type fitvector
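The fitvector listing does not survive in this extract. A sketch consistent with the description above (for each candidate pair of lam values, the linear coefficients c come from the backslash operator):

```matlab
function yEst = fitvector(lam,xdata,ydata)
% FITVECTOR  Fitted values for the two-term exponential model.
% For the given decay rates lam, solve the linear least-squares
% problem for the coefficients c, then return the fitted curve.
A = zeros(length(xdata),length(lam));  % build the design matrix
for j = 1:length(lam)
    A(:,j) = exp(-lam(j)*xdata);
end
c = A\ydata;    % linear least-squares solution for c
yEst = A*c;     % fitted response
end
```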
Solve the problem using lsqcurvefit, starting from a two-dimensional initial point lam(1), lam(2):
x02 = [1 0];
F2 = @(x,t) fitvector(x,t,y);
[x2,resnorm2,~,exitflag2,output2] = lsqcurvefit(F2,x02,t,y)
lsqcurvefit stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
x2 = 1×2
10.5861 1.4003
resnorm2 = 0.1477
exitflag2 = 3
The efficiency of the two-dimensional solution is similar to that of the four-dimensional solution:
fprintf(['There were %d function evaluations using the 2-d ' ...
'formulation, and %d using the 4-d formulation.'], ...
output2.funcCount,output.funcCount)
There were 33 function evaluations using the 2-d formulation, and 35 using the 4-d formulation.
Choosing a bad starting point for the original four-parameter problem leads to a local solution that is
not global. Choosing a starting point with the same bad lam(1) and lam(2) values for the split two-
parameter problem leads to the global solution. To show this we re-run the original problem with a
start point that leads to a relatively bad local solution, and compare the resulting fit with the global
solution.
x0bad = [5 1 1 0];
[xbad,resnormbad,~,exitflagbad,outputbad] = ...
lsqcurvefit(F,x0bad,t,y)
lsqcurvefit stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
xbad = 1×4
resnormbad = 2.2173
exitflagbad = 3
hold on
plot(t,F(xbad,t),'g')
legend('Data','Global fit','Bad local fit','Location','NE')
hold off
The residual norm at the good ending point is 0.147723, and the residual norm at the bad ending point is 2.2173.
See Also
More About
• “Nonlinear Data-Fitting Using Several Problem-Based Approaches” on page 12-84
• “Nonlinear Least Squares Without and Including Jacobian” on page 12-21
• “Nonlinear Curve Fitting with lsqcurvefit” on page 12-48
lsqnonlin with a Simulink® Model
The plant is an under-damped third-order model with actuator limits. The actuator limits are a
saturation limit and a slew rate limit. The actuator saturation limit cuts off input values greater than
2 units or less than –2 units. The slew rate limit of the actuator is 0.8 units/sec. The closed-loop
response of the system to a step input is shown in Closed-Loop Response below. You can see
this response by opening the model (type optsim at the command line or click the model name), and
selecting Run from the Simulation menu. The response plots to the scope.
Closed-Loop Response
The problem is to design a feedback control loop that tracks a unit step input to the system. The
closed-loop plant is entered in terms of the blocks where the plant and actuator have been placed in a
hierarchical Subsystem block. A Scope block displays output trajectories during the design process.
Closed-Loop Model
To solve this problem, minimize the error between the output and the input signal. (In contrast, in
“Using fminimax with a Simulink® Model” on page 8-8 the solution involves minimizing maximum
value of the output.) The variables are the parameters of the Proportional Integral Derivative (PID)
controller. If you only need to minimize the error at one time unit, you would have a scalar objective
function. But the goal is to minimize the error for all time steps from 0 to 100, thus producing a
multiobjective function (one function for each time step).
Use lsqnonlin to perform a least-squares fit on the tracking of the output. The tracking is
performed via the function tracklsq, which is nested in runtracklsq, shown below. tracklsq returns the error signal: yout, the output computed by calling sim, minus the input signal 1.
The function runtracklsq sets up all the needed values and then calls lsqnonlin with the
objective function tracklsq. The variable options passed to lsqnonlin defines the criteria and
display characteristics. In this case you ask for output, use the 'levenberg-marquardt' algorithm,
and give termination tolerances for the step and objective function on the order of 0.001.
To run the simulation in the model optsim, the variables Kp, Ki, Kd, a1, and a2 (a1 and a2 are
variables in the Plant block) must all be defined. Kp, Ki, and Kd are the variables to be optimized. The
function tracklsq is nested inside runtracklsq so that the variables a1 and a2 are shared between
the two functions. The variables a1 and a2 are initialized in runtracklsq.
The objective function tracklsq runs the simulation. The simulation can be run either in the base
workspace or the current workspace, that is, the workspace of the function calling sim, which in this
case is the workspace of tracklsq. In this example, the SrcWorkspace option is set to 'Current'
to tell sim to run the simulation in the current workspace. runtracklsq runs the simulation to 100
seconds.
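A sketch of how the sim call inside tracklsq might look (the exact code is not shown in this extract; the yout field name follows the Outport description below):

```matlab
% Run the simulation in the workspace of tracklsq and form the residual.
myobj = sim('optsim','SrcWorkspace','current','StopTime','100');
F = myobj.yout - 1;  % error signal: output minus the unit step input
```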
When the simulation is completed, runtracklsq creates the myobj object in the current workspace
(that is, the workspace of tracklsq). The Outport block in the block diagram model puts the yout
field of the object into the current workspace at the end of the simulation.
When you run runtracklsq, the optimization gives the solution for the proportional, integral, and
derivative (Kp, Ki, Kd) gains of the controller:
Kp = 3.1330
Ki = 0.1465
Kd = 14.3918
Note: The call to sim results in a call to one of the Simulink ordinary differential equation (ODE)
solvers. A choice must be made about the type of solver to use. From the optimization point of view, a
fixed-step ODE solver is the best choice if that is sufficient to solve the ODE. However, in the case of a
stiff system, a variable-step ODE method might be required to solve the ODE.
The numerical solution produced by a variable-step solver, however, is not a smooth function of
parameters, because of step-size control mechanisms. This lack of smoothness can prevent an
optimization routine from converging. The lack of smoothness is not introduced when a fixed-step
solver is used. (For a further explanation, see [53].)
function F = tracklsq(pid)
% Track the output of optsim to a signal of 1
Nonlinear Least Squares Without and Including Jacobian
The problem has 10 terms with two unknowns: find x, a two-dimensional vector, that minimizes
∑_{k=1}^{10} (2 + 2k − e^{kx₁} − e^{kx₂})².
Because lsqnonlin assumes that the sum of squares is not explicitly formed in the user function,
the function passed to lsqnonlin must compute the vector-valued function
Fₖ(x) = 2 + 2k − e^{kx₁} − e^{kx₂},  k = 1, …, 10.
The helper function myfun, defined at the end of this example, implements the vector-valued objective function with no derivative information. Solve the minimization starting from the point x0.
x0 = [0.3,0.4]; % Starting guess
[x,resnorm,res,eflag,output] = lsqnonlin(@myfun,x0); % Invoke optimizer
disp(x)

0.2578 0.2578
disp(resnorm)
124.3622
disp(output.funcCount)
72
The objective function is simple enough that you can calculate its Jacobian. Following the definition in
“Jacobians of Vector Functions” on page 2-26, a Jacobian function represents the matrix
J_{kj}(x) = ∂Fₖ(x)/∂xⱼ.
Here, Fk(x) is the kth component of the objective function. This example has
Fₖ(x) = 2 + 2k − e^{kx₁} − e^{kx₂},

so

J_{k1}(x) = −k e^{kx₁}
J_{k2}(x) = −k e^{kx₂}.
The helper function myfun2, defined at the end of this example, implements the
objective function with the Jacobian. Set options so the solver uses the Jacobian.
opts = optimoptions(@lsqnonlin,'SpecifyObjectiveGradient',true);
lb = []; % No bounds
ub = [];
[x2,resnorm2,res2,eflag2,output2] = lsqnonlin(@myfun2,x0,lb,ub,opts);
disp(x2)
0.2578 0.2578
disp(resnorm2)
124.3622
The advantage of using a Jacobian is that the solver takes many fewer function evaluations.
disp(output2.funcCount)
24
Helper Functions
function F = myfun(x)
k = 1:10;
F = 2 + 2*k-exp(k*x(1))-exp(k*x(2));
end
function [F,J] = myfun2(x)
k = 1:10;
F = 2 + 2*k-exp(k*x(1))-exp(k*x(2));
if nargout > 1                    % Supply the Jacobian only when requested
    J = zeros(10,2);              % J(k,j) = dF_k/dx_j, from the formulas above
    J(:,1) = -(k.*exp(k*x(1))).';
    J(:,2) = -(k.*exp(k*x(2))).';
end
end
See Also
More About
• “Nonlinear Data-Fitting” on page 12-9
• “Nonlinear Curve Fitting with lsqcurvefit” on page 12-48
Nonnegative Linear Least Squares, Solver-Based
min_x ‖Cx − d‖²,  subject to x ≥ 0.
load particle
sizec = size(C)
sizec = 1×2
2000 400
sized = size(d)
sized = 1×2
2000 1
The C matrix has 2000 rows and 400 columns. Therefore, to have the correct size for matrix
multiplication, the x vector has 400 rows. To represent the nonnegativity constraint, set lower bounds
of zero on all variables.
lb = zeros(size(C,2),1);
[x,resnorm,residual,exitflag,output] = ...
lsqlin(C,d,[],[],[],[],lb);
disp(output)
message: '...'
algorithm: 'interior-point'
firstorderopt: 3.6717e-06
constrviolation: 0
iterations: 8
linearsolver: 'sparse'
cgiterations: []
The output structure shows that lsqlin uses a sparse internal linear solver for the interior-point
algorithm and takes 8 iterations to reach a first-order optimality measure of about 3.7e-6.
Change Algorithm
The trust-region-reflective algorithm handles bound-constrained problems. See how well it performs
on this problem.
options = optimoptions('lsqlin','Algorithm','trust-region-reflective');
[x2,resnorm2,residual2,exitflag2,output2] = ...
lsqlin(C,d,[],[],[],[],lb,[],[],options);
lsqlin stopped because the relative change in function value is less than the square root of the
disp(output2)
iterations: 10
algorithm: 'trust-region-reflective'
firstorderopt: 2.7870e-05
cgiterations: 42
constrviolation: []
linearsolver: []
message: 'Local minimum possible....'
This time, the solver takes more iterations and reaches a solution with a higher (worse) first-order
optimality measure.
To improve the first-order optimality measure, try setting the SubproblemAlgorithm option to
'factorization'.
options.SubproblemAlgorithm = 'factorization';
[x3,resnorm3,residual3,exitflag3,output3] = ...
lsqlin(C,d,[],[],[],[],lb,[],[],options);
disp(output3)
iterations: 12
algorithm: 'trust-region-reflective'
firstorderopt: 5.5907e-15
cgiterations: 0
constrviolation: []
linearsolver: []
message: 'Optimal solution found.'
Using this option brings the first-order optimality measure nearly to zero, which is the best possible
result.
Change Solver
Try solving the problem using the lsqnonneg solver, which is designed to handle nonnegative linear
least squares.
[x4,resnorm4,residual4,exitflag4,output4] = lsqnonneg(C,d);
disp(output4)
iterations: 184
algorithm: 'active-set'
message: 'Optimization terminated.'
lsqnonneg does not report a first-order optimality measure. Instead, investigate the residual norms.
To see the lower-significance digits, subtract 22.5794 from each residual norm.
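The code that builds the following table is not shown in this extract; one way to reproduce it (the variable names are taken from the calls above; arbitrary column headings require a recent MATLAB release):

```matlab
t = table(resnorm - 22.5794, resnorm2 - 22.5794, ...
    resnorm3 - 22.5794, resnorm4 - 22.5794, ...
    'VariableNames',{'default','trust-region-reflective', ...
    'factorization','lsqnonneg'})
```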
t=1×4 table
default trust-region-reflective factorization lsqnonneg
__________ _______________________ _____________ __________
The default lsqlin algorithm has a higher residual norm than the trust-region-reflective
algorithm. The factorization and lsqnonneg residual norms are even lower, and are the same at
this level of display precision. See which one is lower.
disp(resnorm3 - resnorm4)
6.9278e-13
The lsqnonneg residual norm is the lowest by a negligible amount. However, lsqnonneg takes the
most iterations to converge.
See Also
lsqlin | lsqnonneg
More About
• “Nonnegative Linear Least Squares, Problem-Based” on page 12-40
Optimization App with the lsqlin Solver
The Problem
This example shows how to use the Optimization app to solve a constrained least-squares problem.
Note The Optimization app warns that it will be removed in a future release.
The problem in this example is to find the point on the plane x1 + 2x2 + 4x3 = 7 that is closest to the
origin. The easiest way to solve this problem is to minimize the square of the distance from a point
x = (x1,x2,x3) on the plane to the origin, which returns the same optimal point as minimizing the
actual distance. Since the square of the distance from an arbitrary point (x1,x2,x3) to the origin is
x₁² + x₂² + x₃², you can describe the problem as follows:

minimize f(x) = x₁² + x₂² + x₃²

subject to

x1 + 2x2 + 4x3 = 7.
The function f(x) is called the objective function and x1 + 2x2 + 4x3 = 7 is an equality constraint. More
complicated problems might contain other equality constraints, inequality constraints, and upper or
lower bound constraints.
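The same problem can also be solved at the command line with lsqlin, because minimizing x₁² + x₂² + x₃² is a least-squares objective with C the identity matrix and d zero:

```matlab
C = eye(3);  d = zeros(3,1);   % ||C*x - d||^2 = x1^2 + x2^2 + x3^2
Aeq = [1 2 4];  beq = 7;       % the plane constraint
x = lsqlin(C,d,[],[],Aeq,beq)  % returns approximately [0.333; 0.667; 1.333]
```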
The Aeq and beq fields should appear as shown in the following figure.
6 When the algorithm terminates, under Run solver and view results the following information is
displayed:
• The Current iteration value when the algorithm terminated, which for this example is 1.
• The final value of the objective function when the algorithm terminated:
0.333
0.667
1.333
See Also
More About
• “Shortest Distance to a Plane” on page 12-38
Jacobian Multiply Function with Linear Least Squares
This example shows how lsqlin can solve

min_x ½‖Cx − d‖₂²
such that lb ≤ x ≤ ub, for problems where C is very large, perhaps too large to be stored. For this
technique, use the 'trust-region-reflective' algorithm.
For example, consider a problem where C is a 2n-by-n matrix based on a circulant matrix. The rows of
C are shifts of a row vector v. This example has the row vector v with elements of the form
(−1)^{k+1}/k. The target vector d has elements

d = [n − 1, n − 2, …, −n].
For large enough n, the dense matrix C does not fit into computer memory (n = 10, 000 is too large on
one tested system).
w = jmfcn(Jinfo,Y,flag)
Jinfo is a matrix the same size as C, used as a preconditioner. If C is too large to fit into memory,
Jinfo should be sparse. Y is a vector or matrix sized so that C*Y or C'*Y works as matrix
multiplication. flag tells jmfcn which product to form: C*Y when flag > 0, C'*Y when flag < 0, and C'*(C*Y) when flag == 0.
Because C is such a simply structured matrix, you can easily write a Jacobian multiply function in
terms of the vector v; that is, without forming C. Each row of C*Y is the product of a circularly shifted
version of v times Y. Use circshift to circularly shift v.
To compute C*Y, compute v*Y to find the first row, then shift v and compute the second row, and so
on.
To compute C'*Y, perform the same computation, but use a shifted version of temp, the vector
formed from the first row of C':
temp = [fliplr(v),fliplr(v)];
To compute C'*C*Y, simply compute C*Y using shifts of v, and then compute C' times the result
using shifts of fliplr(v).
The helper function lsqcirculant3 is a Jacobian multiply function that implements this procedure;
it appears at the end of this example.
The dolsqJac3 helper function at the end of this example sets up the vector v and
calls the solver lsqlin using the lsqcirculant3 Jacobian multiply function.
When n = 3000, C is an 18,000,000-element dense matrix. Determine the results of the dolsqJac3
function for n = 3000 at selected values of x, and display the output structure.
[x,resnorm,residual,exitflag,output] = dolsqJac3(3000);
lsqlin stopped because the relative change in function value is less than the function tolerance.
disp(x(1))
5.0000
disp(x(1500))
-0.5201
disp(x(3000))
-5.0000
disp(output)
iterations: 16
algorithm: 'trust-region-reflective'
firstorderopt: 5.9351e-05
cgiterations: 36
constrviolation: []
linearsolver: []
message: 'Local minimum possible.↵↵lsqlin stopped because the relative change in func
Helper Functions
if flag > 0
w = Jpositive(Y);
elseif flag < 0
w = Jnegative(Y);
else
w = Jnegative(Jpositive(Y));
end
function a = Jpositive(q)
% Calculate C*q
temp = v;
a = zeros(2*length(v),size(q,2)); % Preallocate the result: C has 2n rows
for r = 1:size(a,1)
a(r,:) = temp*q; % Compute the rth row
temp = circshift(temp,1,2); % Shift the circulant
end
end
function a = Jnegative(q)
% Calculate C'*q
temp = fliplr(v);
temp = circshift(temp,1,2); % Shift the circulant for C'
len = length(v); % C' has n rows
a = zeros(len,size(q,2)); % Preallocate the result
for r = 1:len
a(r,:) = [temp,temp]*q; % Compute the rth row
temp = circshift(temp,1,2); % Shift the circulant
end
end
end
r = 1:2*n;
d(r) = n-r;
lb = -5*ones(1,n);
ub = 5*ones(1,n);
[x,resnorm,residual,exitflag,output] = ...
lsqlin(Jinfo,d,[],[],[],[],lb,ub,[],options);
end
See Also
circshift | fliplr
More About
• “Quadratic Minimization with Dense, Structured Hessian” on page 11-17
Large-Scale Constrained Linear Least-Squares, Solver-Based
The Problem
load optdeblur
[m,n] = size(P);
mn = m*n;
imshow(P)
title(sprintf('Original Image, size %d-by-%d, %d pixels',m,n,mn))
The problem is to take a blurred version of this photo and try to deblur it. The starting image is black
and white, meaning it consists of pixel values from 0 through 1 in the m-by-n matrix P.
Add Motion
Simulate the effect of vertical motion blurring by averaging each pixel with the 5 pixels above and
below. Construct a sparse matrix D to blur with a single matrix multiply.
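The construction of D is omitted at this point in the extract; the problem-based version of this example builds it as follows:

```matlab
blur = 5;        % number of pixels averaged above and below
mindex = 1:mn;
nindex = 1:mn;
for i = 1:blur
    mindex = [mindex i+1:mn 1:mn-i];
    nindex = [nindex 1:mn-i i+1:mn];
end
D = sparse(mindex,nindex,1/(2*blur+1));
```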
Draw a picture of D.
cla
axis off ij
xs = 31;
ys = 15;
xlim([0,xs+1]);
ylim([0,ys+1]);
[ix,iy] = meshgrid(1:(xs-1),1:(ys-1));
l = abs(ix-iy)<=5;
text(ix(l),iy(l),'x')
text(ix(~l),iy(~l),'0')
text(xs*ones(ys,1),1:ys,'...');
text(1:xs,ys*ones(xs,1),'...');
title('Blurring Operator D (x = 1/11)')
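The blurring step is omitted here; as in the problem-based version of this example, the blurred image G comes from one sparse matrix multiply:

```matlab
G = D*(P(:));            % blur the image
figure
imshow(reshape(G,m,n));  % display the blurred result
```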
The image is much less distinct; you can no longer read the license plate.
Deblurred Image
To deblur, suppose that you know the blurring operator D. How well can you remove the blur and
recover the original image P?
min_x ‖Dx − G‖²  subject to 0 ≤ x ≤ 1.
This problem takes the blurring matrix D as given, and tries to find the x that makes Dx closest to G =
DP. In order for the solution to represent sensible pixel values, restrict the solution to be from 0
through 1.
lb = zeros(mn,1);
ub = 1 + lb;
sol = lsqlin(D,G,[],[],[],[],lb,ub);
xpic = reshape(sol,m,n);
figure
imshow(xpic)
title('Deblurred Image')
The deblurred image is much clearer than the blurred image. You can once again read the license
plate. However, the deblurred image has some artifacts, such as horizontal bands in the lower-right
pavement region. Perhaps these artifacts can be removed by a regularization.
Regularization
Regularization is a way to smooth the solution. There are many regularization methods. For a simple
approach, add a term to the objective function as follows:
min_x ‖(D + εI)x − G‖²  subject to 0 ≤ x ≤ 1.

The term εI makes the resulting quadratic problem more stable. Take ε = 0.02 and solve the problem again.
again.
addI = speye(mn);
sol2 = lsqlin(D+0.02*addI,G,[],[],[],[],lb,ub);
xpic2 = reshape(sol2,m,n);
figure
imshow(xpic2)
title('Deblurred Regularized Image')
See Also
More About
• “Large-Scale Constrained Linear Least-Squares, Problem-Based” on page 12-44
Shortest Distance to a Plane
This example shows how to formulate a linear least squares problem using the problem-based
approach.
The problem is to find the shortest distance from the origin (the point [0,0,0]) to the plane
x1 + 2x2 + 4x3 = 7. In other words, this problem is to minimize f(x) = x₁² + x₂² + x₃² subject to the
constraint x1 + 2x2 + 4x3 = 7. The function f(x) is called the objective function and x1 + 2x2 + 4x3 = 7
is an equality constraint. More complicated problems might contain other equality constraints,
inequality constraints, and upper or lower bound constraints.
To formulate this problem using the problem-based approach, create an optimization problem object
called pointtoplane.
pointtoplane = optimproblem;
x = optimvar('x',3);
Create the objective function and put it in the Objective property of pointtoplane.
obj = sum(x.^2);
pointtoplane.Objective = obj;
Create the constraint that the point lies on the plane, and put it in the Constraints property.

v = [1,2,4];
pointtoplane.Constraints = dot(x,v) == 7;
The problem formulation is complete. To check for errors, review the problem.
show(pointtoplane)
OptimizationProblem :
Solve for:
x
minimize :
sum(x.^2)
subject to :
x(1) + 2*x(2) + 4*x(3) == 7
Solve the problem.

[sol,fval,exitflag,output] = solve(pointtoplane);
disp(sol.x)
0.3333
0.6667
1.3333
To verify the solution, solve the problem analytically. Recall that for any nonzero t, the vector
t*[1,2,4] = t*v is perpendicular to the plane x1 + 2x2 + 4x3 = 7. So the solution point xopt is
t*v for the value of t that satisfies the equation dot(t*v,v) = 7.
t = 7/dot(v,v)
t = 0.3333
xopt = t*v
xopt = 1×3
Indeed, the vector xopt is equivalent to the point sol.x that solve finds.
See Also
More About
• “Optimization App with the lsqlin Solver” on page 12-27
• “Problem-Based Optimization Workflow” on page 10-2
Nonnegative Linear Least Squares, Problem-Based
min_x ‖Cx − d‖²,  subject to x ≥ 0.
load particle
sizec = size(C)
sizec = 1×2
2000 400
sized = size(d)
sized = 1×2
2000 1
Create an optimization variable x of the appropriate size for multiplication by C. Impose a lower
bound of 0 on the elements of x.
x = optimvar('x',sizec(2),'LowerBound',0);
residual = C*x - d;
obj = sum(residual.^2);
Create an optimization problem called nonneglsq and include the objective function in the problem.
nonneglsq = optimproblem('Objective',obj);
opts = optimoptions(nonneglsq)
opts =
lsqlin options:
Set properties:
No options set.
Default properties:
Algorithm: 'interior-point'
ConstraintTolerance: 1.0000e-08
Display: 'final'
LinearSolver: 'auto'
MaxIterations: 200
OptimalityTolerance: 1.0000e-08
StepTolerance: 1.0000e-12
[sol,fval,exitflag,output] = solve(nonneglsq);
disp(output)
message: '...'
algorithm: 'interior-point'
firstorderopt: 9.9673e-07
constrviolation: 0
iterations: 9
linearsolver: 'sparse'
cgiterations: []
solver: 'lsqlin'
The output structure shows that the lsqlin solver uses a sparse internal linear solver for the
interior-point algorithm and takes 9 iterations to arrive at a first-order optimality measure of about
1e-6.
Change Algorithm
The trust-region-reflective algorithm handles bound-constrained problems. See how well it performs
on this problem.
opts.Algorithm = 'trust-region-reflective';
[sol2,fval2,exitflag2,output2] = solve(nonneglsq,'Options',opts);
lsqlin stopped because the relative change in function value is less than the function tolerance.
disp(output2)
iterations: 14
algorithm: 'trust-region-reflective'
firstorderopt: 5.2187e-08
cgiterations: 54
constrviolation: []
linearsolver: []
message: 'Local minimum possible....'
solver: 'lsqlin'
This time, the solver takes more iterations and arrives at a solution with a lower (better) first-order
optimality measure.
To get an even better first-order optimality measure, try setting the SubproblemAlgorithm option
to 'factorization'.
opts.SubproblemAlgorithm = 'factorization';
[sol3,fval3,exitflag3,output3] = solve(nonneglsq,'Options',opts);
disp(output3)
iterations: 11
algorithm: 'trust-region-reflective'
firstorderopt: 1.3973e-14
cgiterations: 0
constrviolation: []
linearsolver: []
message: 'Optimal solution found.'
solver: 'lsqlin'
Using this option brings the first-order optimality measure nearly to zero, which is the best possible.
Change Solver
The lsqnonneg solver is specially designed to handle nonnegative linear least squares. Try that
solver.
[sol4,fval4,exitflag4,output4] = solve(nonneglsq,'Solver','lsqnonneg');
disp(output4)
iterations: 184
algorithm: 'active-set'
message: 'Optimization terminated.'
solver: 'lsqnonneg'
lsqnonneg does not report a first-order optimality measure. Instead, investigate the residual norms,
which are returned in the fval outputs. To see the lower-significance digits, subtract 22.5794 from
each residual norm.
t=1×4 table
default trust-region-reflective factorization lsqnonneg
__________ _______________________ _____________ __________
The default solver has a slightly higher (worse) residual norm than the other three, whose residual
norms are indistinguishable at this level of display precision. To see which is lowest, subtract the
lsqnonneg result from the two results.
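The comparison code is not shown in this extract; a sketch using the fval outputs from the calls above:

```matlab
disp([fval2 fval3] - fval4)  % trust-region-reflective and factorization minus lsqnonneg
```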
1.0e-12 *
0.6786 0.6857
The lsqnonneg residual norm is the smallest by a nearly negligible amount. However, lsqnonneg
takes the most iterations to converge.
See Also
More About
• “Nonnegative Linear Least Squares, Solver-Based” on page 12-24
• “Problem-Based Optimization Workflow” on page 10-2
Large-Scale Constrained Linear Least-Squares, Problem-Based
The Problem
The problem is to take a blurred version of this photo and try to deblur it. The starting image is black
and white, meaning it consists of pixel values from 0 through 1 in the m-by-n matrix P.
Add Motion
Simulate the effect of vertical motion blurring by averaging each pixel with the 5 pixels above and
below. Construct a sparse matrix D to blur with a single matrix multiply.
blur = 5;
mindex = 1:mn;
nindex = 1:mn;
for i = 1:blur
mindex=[mindex i+1:mn 1:mn-i];
nindex=[nindex 1:mn-i i+1:mn];
end
D = sparse(mindex,nindex,1/(2*blur+1));
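To see how this index construction produces an averaging operator, here is a scaled-down sketch on a hypothetical 10-element signal with blur = 2, not the photo from the example:

```matlab
% Scaled-down blur operator: average each entry with 2 neighbors above and below
blur = 2;
mn = 10;                               % hypothetical small signal length
mindex = 1:mn;
nindex = 1:mn;
for i = 1:blur
    mindex = [mindex i+1:mn 1:mn-i];   % row indices of the off-diagonal bands
    nindex = [nindex 1:mn-i i+1:mn];   % column indices of the off-diagonal bands
end
D = sparse(mindex,nindex,1/(2*blur+1));
full(D(5,:))                           % an interior row has five entries equal to 1/5
```

Each interior row of D sums to 1, so multiplying by D replaces a value with the average of itself and its vertical neighbors; rows near the boundary have fewer nonzeros, just as in the full-size operator.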
Draw a picture of D.
cla
axis off ij
Large-Scale Constrained Linear Least-Squares, Problem-Based
xs = 31;
ys = 15;
xlim([0,xs+1]);
ylim([0,ys+1]);
[ix,iy] = meshgrid(1:(xs-1),1:(ys-1));
l = abs(ix-iy) <= blur;
text(ix(l),iy(l),'x')
text(ix(~l),iy(~l),'0')
text(xs*ones(ys,1),1:ys,'...');
text(1:xs,ys*ones(xs,1),'...');
title('Blurring Operator D (x = 1/11)')
G = D*(P(:));
figure
imshow(reshape(G,m,n));
The image is much less distinct; you can no longer read the license plate.
Deblurred Image
To deblur, suppose that you know the blurring operator D. How well can you remove the blur and
recover the original image P?
This problem takes the blurring matrix D as given, and tries to find the x that makes Dx closest to G =
DP. In order for the solution to represent sensible pixel values, restrict the solution to be from 0
through 1.
x = optimvar('x',mn,'LowerBound',0,'UpperBound',1);
expr = D*x-G;
objec = expr'*expr;
blurprob = optimproblem('Objective',objec);
sol = solve(blurprob);
xpic = reshape(sol.x,m,n);
figure
imshow(xpic)
title('Deblurred Image')
The deblurred image is much clearer than the blurred image. You can once again read the license
plate. However, the deblurred image has some artifacts, such as horizontal bands in the lower-right
pavement region. Perhaps these artifacts can be removed by a regularization.
Regularization
Regularization is a way to smooth the solution. There are many regularization methods. For a simple
approach, add a term to the objective function as follows:
min ‖(D + εI)x − G‖²   subject to   0 ≤ x ≤ 1.

The term εI makes the resulting quadratic problem more stable. Take ε = 0.02 and solve the problem
again.
addI = speye(mn);
expr2 = (D + 0.02*addI)*x - G;
objec2 = expr2'*expr2;
blurprob2 = optimproblem('Objective',objec2);
sol2 = solve(blurprob2);
xpic2 = reshape(sol2.x,m,n);
figure
imshow(xpic2)
title('Deblurred Regularized Image')
See Also
More About
• “Large-Scale Constrained Linear Least-Squares, Solver-Based” on page 12-34
• “Problem-Based Optimization Workflow” on page 10-2
In this example, the vector xdata represents 100 data points, and the vector ydata represents the
associated measurements. Generate the data for the problem.
rng(5489,'twister') % reproducible
xdata = -2*log(rand(100,1));
ydata = (ones(100,1) + .1*randn(100,1)) + (3*ones(100,1)+...
0.5*randn(100,1)).*exp((-(2*ones(100,1)+...
.5*randn(100,1))).*xdata);
The code generates xdata from 100 independent samples of an exponential distribution with mean 2.
The code generates ydata from its defining equation using a = [1;3;2], perturbed by adding
normal deviates with standard deviations [0.1;0.5;0.5].
The goal is to find parameters ai, i = 1, 2, 3, for the model that best fit the data.
In order to fit the parameters to the data using lsqcurvefit, you need to define a fitting function.
Define the fitting function predicted as an anonymous function.
predicted = @(a,xdata) a(1) + a(2)*exp(-a(3)*xdata);
To fit the model to the data, lsqcurvefit needs an initial estimate a0 of the parameters.
a0 = [2;2;2];
[ahat,resnorm,residual,exitflag,output,lambda,jacobian] =...
lsqcurvefit(predicted,a0,xdata,ydata);
lsqcurvefit stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
disp(ahat)
1.0169
3.1444
2.1596
If you have the Statistics and Machine Learning Toolbox™ software, use the nlparci function to
generate confidence intervals for the ahat estimate.
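For example, a sketch of such a call, assuming the outputs from the lsqcurvefit call above:

```matlab
% 95% confidence intervals for the fitted parameters
% (requires Statistics and Machine Learning Toolbox)
ci = nlparci(ahat,residual,'Jacobian',jacobian);
```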
Nonlinear Curve Fitting with lsqcurvefit
See Also
lsqcurvefit | nlparci
More About
• “Nonlinear Data-Fitting” on page 12-9
• “Nonlinear Least Squares Without and Including Jacobian” on page 12-21
Do not set the FunValCheck option to 'on' when using complex data; doing so causes the solver to throw an error.
Data Model
Here, x is the input data, y is the response, and v is a complex-valued vector of coefficients; the model
is y = v(1) + v(2)exp(v(3)x). The goal is to estimate v from x and noisy observations of y. The data
model is analytic, so you can use it in a complex solution.
Generate artificial data for the model. Take the complex coefficient vector v as [2;3+4i;-.5+.4i].
Take the observations x as exponentially distributed. Add complex-valued noise to the responses y.
rng default % for reproducibility
N = 100; % number of observations
v0 = [2;3+4i;-.5+.4i]; % coefficient vector
xdata = -log(rand(N,1)); % exponentially distributed
noisedata = randn(N,1).*exp((1i*randn(N,1))); % complex noise
cplxydata = v0(1) + v0(2).*exp(v0(3)*xdata) + noisedata;
The difference between the response predicted by the data model and an observation (xdata for x
and response cplxydata for y) is:
objfcn = @(v)v(1)+v(2)*exp(v(3)*xdata) - cplxydata;
Use either lsqnonlin or lsqcurvefit to fit the model to the data. This example first uses
lsqnonlin.
opts = optimoptions(@lsqnonlin,'Display','off');
x0 = (1+1i)*[1;1;1]; % arbitrary initial guess
[vestimated,resnorm,residuals,exitflag,output] = lsqnonlin(objfcn,x0,[],[],opts);
vestimated,resnorm,exitflag,output.firstorderopt
vestimated =
2.1582 + 0.1351i
2.7399 + 3.8012i
-0.5338 + 0.4660i
resnorm =
100.9933
Fit a Model to Complex-Valued Data
exitflag =

     3

ans =

    0.0018
lsqnonlin recovers the complex coefficient vector to about one significant digit. The norm of the
residual is sizable, indicating that the noise keeps the model from fitting all the observations. The exit
flag is 3, not the preferable 1, because the first-order optimality measure is about 1e-3, not below
1e-6.
To fit using lsqcurvefit, write the model to give just the responses, not the responses minus the
response data.
objfcn = @(v,xdata)v(1)+v(2)*exp(v(3)*xdata);
vestimated =
2.1582 + 0.1351i
2.7399 + 3.8012i
-0.5338 + 0.4660i
resnorm =
100.9933
The results match those from lsqnonlin, because the underlying algorithms are identical. Use
whichever solver you find more convenient.
To include bounds, or simply to stay completely within real values, you can split the real and complex
parts of the coefficients into separate variables. For this problem, split the coefficients as follows:
Split the response data into its real and imaginary parts.
ydata2 = [real(cplxydata),imag(cplxydata)];
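The listing of the helper function cplxreal does not survive in this extract; a sketch consistent with the six-element parameterization used below is:

```matlab
function F = cplxreal(v,xdata)
% Real and imaginary parts of the model response, with coefficients split as
% v = [re(v1); im(v1); re(v2); im(v2); re(v3); im(v3)]
yout = (v(1) + 1i*v(2)) + (v(3) + 1i*v(4)).*exp((v(5) + 1i*v(6))*xdata);
F = [real(yout),imag(yout)]; % N-by-2 array matching the split data ydata2
end
```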
The coefficient vector v now has six dimensions. Initialize it as all ones, and solve the problem using
lsqcurvefit.
x0 = ones(6,1);
[vestimated,resnorm,residuals,exitflag,output] = ...
lsqcurvefit(@cplxreal,x0,xdata,ydata2);
vestimated,resnorm,exitflag,output.firstorderopt
lsqcurvefit stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
vestimated =
2.1582
0.1351
2.7399
3.8012
-0.5338
0.4660
resnorm =
100.9933
exitflag =

     3

ans =

    0.0018
Interpret the six-element vector vestimated as a three-element complex vector, and you see that the
solution is virtually the same as the previous solutions.
See Also
More About
• “Complex Numbers in Optimization Toolbox Solvers” on page 2-14
The Lorenz system is a system of ordinary differential equations (see Lorenz system). For real
constants σ, ρ, β, the system is
dx/dt = σ(y − x)
dy/dt = x(ρ − z) − y
dz/dt = xy − βz.
Lorenz's values of the parameters for a sensitive system are σ = 10, β = 8/3, ρ = 28. Start the
system from [x(0),y(0),z(0)] = [10,20,10] and view the evolution of the system from time 0
through 100.
sigma = 10;
beta = 8/3;
rho = 28;
f = @(t,a) [-sigma*a(1) + sigma*a(2); rho*a(1) - a(2) - a(1)*a(3); -beta*a(3) + a(1)*a(2)];
xt0 = [10,20,10];
[tspan,a] = ode45(f,[0 100],xt0); % Runge-Kutta 4th/5th order ODE solver
figure
plot3(a(:,1),a(:,2),a(:,3))
view([-10.0 -2.0])
Fit an Ordinary Differential Equation (ODE)
The evolution is quite complicated. But over a small time interval, it looks somewhat like uniform
circular motion. Plot the solution over the time interval [0,1/10].
In terms of these parameters, determine the position of the circular path for times xdata.
type fitlorenzfn
function f = fitlorenzfn(x,xdata)
theta = x(1:2);
R = x(3);
V = x(4);
t0 = x(5);
delta = x(6:8);
f = zeros(length(xdata),3);
f(:,3) = R*sin(theta(1))*sin(V*(xdata - t0)) + delta(3);
f(:,1) = R*cos(V*(xdata - t0))*cos(theta(2)) ...
To find the best-fitting circular path to the Lorenz system at times given in the ODE solution, use
lsqcurvefit. In order to keep the parameters in reasonable limits, put bounds on the various
parameters.
lb = [-pi/2,-pi,5,-15,-pi,-40,-40,-40];
ub = [pi/2,pi,60,15,pi,40,40,40];
theta0 = [0;0];
R0 = 20;
V0 = 1;
t0 = 0;
delta0 = zeros(3,1);
x0 = [theta0;R0;V0;t0;delta0];
[xbest,resnorm,residual] = lsqcurvefit(@fitlorenzfn,x0,tspan,a,lb,ub);
lsqcurvefit stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
Plot the best-fitting circular points at the times from the ODE solution together with the solution of
the Lorenz system.
soln = a + residual;
hold on
plot3(soln(:,1),soln(:,2),soln(:,3),'r')
legend('ODE Solution','Circular Arc')
hold off
figure
plot3(a(:,1),a(:,2),a(:,3),'b.','MarkerSize',10)
hold on
plot3(soln(:,1),soln(:,2),soln(:,3),'rx','MarkerSize',10)
legend('ODE Solution','Circular Arc')
hold off
Now modify the parameters σ, β, and ρ to best fit the circular arc. For an even better fit, allow the
initial point [10,20,10] to change as well.
To do so, write a function file paramfun that takes the parameters of the ODE fit and calculates the
trajectory over the times t.
type paramfun
function pos = paramfun(x,tspan)
sigma = x(1);
beta = x(2);
rho = x(3);
xt0 = x(4:6);
f = @(t,a) [-sigma*a(1) + sigma*a(2); rho*a(1) - a(2) - a(1)*a(3); -beta*a(3) + a(1)*a(2)];
[~,pos] = ode45(f,tspan,xt0);
To find the best parameters, use lsqcurvefit to minimize the differences between the new
calculated ODE trajectory and the circular arc soln.
xt0 = zeros(1,6);
xt0(1) = sigma;
xt0(2) = beta;
xt0(3) = rho;
xt0(4:6) = soln(1,:);
[pbest,presnorm,presidual,exitflag,output] = lsqcurvefit(@paramfun,xt0,tspan,soln);
lsqcurvefit stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
figure
hold on
odesl = presidual + soln;
plot3(odesl(:,1),odesl(:,2),odesl(:,3),'b')
plot3(soln(:,1),soln(:,2),soln(:,3),'r')
legend('ODE Solution','Circular Arc')
view([-30 -70])
hold off
To try for a better fit, use central finite differences, which are more accurate than the default forward
differences.
options = optimoptions('lsqcurvefit','FiniteDifferenceStepSize',1e-4,...
'FiniteDifferenceType','central');
[pbest2,presnorm2,presidual2,exitflag2,output2] = ...
lsqcurvefit(@paramfun,xt0,tspan,soln,[],[],options);
lsqcurvefit stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
In this case, using these finite differencing options does not improve the solution.
disp([presnorm,presnorm2])
20.0637 20.0637
See Also
More About
• “Optimizing a Simulation or Ordinary Differential Equation” on page 4-26
Model
The model is

y(t) = A1·exp(r1·t) + A2·exp(r2·t),

where A1, A2, r1, and r2 are the unknown parameters, y is the response, and t is time. The problem
requires data for times tdata and (noisy) response measurements ydata. The goal is to find the best
A and r, meaning those values that minimize

∑_{t ∈ tdata} (y(t) − ydata(t))².
Sample Data
Typically, you have data for a problem. In this case, generate artificial noisy data for the problem. Use
A = [1,2] and r = [-1,-3] as the underlying values, and use 200 random values from 0 to 3 as
the time data. Plot the resulting data points.
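The data-generation code does not survive in this extract; a sketch under the stated assumptions (the variable names and the noise level 0.05 are illustrative) is:

```matlab
rng default                        % for reproducibility
Atrue = [1,2];                     % underlying amplitude values
rtrue = [-1,-3];                   % underlying rate values
tdata = sort(3*rand(200,1));       % 200 random times from 0 through 3
ydata = Atrue(1)*exp(rtrue(1)*tdata) + Atrue(2)*exp(rtrue(2)*tdata) ...
    + 0.05*randn(200,1);           % assumed noise level
plot(tdata,ydata,'r*')
xlabel('t')
ylabel('Response')
```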
Nonlinear Least-Squares, Problem-Based
The data appears to be noisy. Therefore, the solution probably will not match the original parameters
A and r very well.
Problem-Based Approach
To find the best-fitting parameters A and r, first define optimization variables with those names.
A = optimvar('A',2);
r = optimvar('r',2);
Create an expression for the objective function, which is the sum of squares to minimize.
fun = A(1)*exp(r(1)*tdata) + A(2)*exp(r(2)*tdata);
obj = sum((fun - ydata).^2);
lsqproblem = optimproblem("Objective",obj);
For the problem-based approach, specify the initial point as a structure, with the variable names as
the fields of the structure. Specify the initial A = [1/2,3/2] and the initial r = [-1/2,-3/2].
x0.A = [1/2,3/2];
x0.r = [-1/2,-3/2];
show(lsqproblem)
OptimizationProblem :
Solve for:
A, r
minimize :
sum(arg6)
where:
arg5 = extraParams{3};
arg6 = (((A(1) .* exp((r(1) .* extraParams{1})))
+ (A(2) .* exp((r(2) .* extraParams{2})))) - arg5).^2;
extraParams{1}: 200-by-1 vector of the time data tdata (display shortened)
extraParams{2}: 200-by-1 vector of the time data tdata, identical to extraParams{1} (display shortened)
extraParams{3}: 200-by-1 vector of the response data ydata (display shortened)
Problem-Based Solution
[sol,fval] = solve(lsqproblem,x0)
sol = struct with fields:
    A: [2x1 double]
    r: [2x1 double]
fval = 0.4724
figure
responsedata = evaluate(fun,sol);
plot(tdata,ydata,'r*',tdata,responsedata,'b-')
legend('Original Data','Fitted Curve')
xlabel 't'
ylabel 'Response'
title("Fitted Response")
The plot shows that the fitted data matches the original noisy data fairly well.
See how closely the fitted parameters match the original parameters A = [1,2] and r = [-1,-3].
disp(sol.A)
1.1615
1.8629
disp(sol.r)
-1.0882
-3.2256
If your objective function is not composed of elementary functions, you must convert the function to
an optimization expression using fcn2optimexpr. For the present example:
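A sketch of such a conversion for this model (the names response, obj2, and lsqproblem2 are illustrative):

```matlab
response = fcn2optimexpr(@(A,r) A(1)*exp(r(1)*tdata) + A(2)*exp(r(2)*tdata),A,r);
obj2 = sum((response - ydata).^2);       % same sum-of-squares objective as before
lsqproblem2 = optimproblem("Objective",obj2);
```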
The remainder of the steps in solving the problem are the same. The only other difference is in the
plotting routine, where you call response instead of fun:
responsedata = evaluate(response,sol);
For the list of supported functions, see “Supported Operations on Optimization Variables and
Expressions” on page 10-36.
See Also
solve
More About
• “Problem-Based Optimization Workflow” on page 10-2
Fit ODE, Problem-Based
Problem
The problem is a multistep reaction model involving several substances, some of which react with
each other to produce different substances.
For this problem, the true reaction rates are unknown. So, you need to observe the reactions and
infer the rates. Assume that you can measure the substances for a set of times t. From these
observations, fit the best set of reaction rates to the measurements.
Model
The model has six substances, C1 through C6, that react as follows:
The reaction rate is proportional to the product of the quantities of the required substances. So, if yi
represents the quantity of substance Ci, then the reaction rate to produce C3 is r1 y1 y2. Similarly, the
reaction rate to produce C5 is r2 y3 y4, and the reaction rate to produce C6 is r3 y3 y4.
In other words, the differential equation controlling the evolution of the system is
dy/dt = [ −r1 y1 y2
          −r1 y1 y2
          −r2 y3 y4 + r1 y1 y2 − r3 y3 y4
          −r2 y3 y4 − r3 y3 y4
          r2 y3 y4
          r3 y3 y4 ].
Start the differential equation at time 0 at the point y(0) = [1, 1, 0, 1, 0, 0]. These initial values ensure
that all of the substances react completely, causing C1 through C4 to approach zero as time increases.
The diffun function implements the differential equations in a form ready for solution by ode45.
type diffun
function dydt = diffun(~,y,r)
dydt = zeros(6,1);
s12 = y(1)*y(2);
s34 = y(3)*y(4);
dydt(1) = -r(1)*s12;
dydt(2) = -r(1)*s12;
dydt(3) = -r(2)*s34 + r(1)*s12 - r(3)*s34;
dydt(4) = -r(2)*s34 - r(3)*s34;
dydt(5) = r(2)*s34;
dydt(6) = r(3)*s34;
end
The true reaction rates are r1 = 2.5, r2 = 1.2, and r3 = 0.45. Compute the evolution of the system
for times zero through five by calling ode45.
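A sketch of this computation (the names rtrue, y0, and the time grid are assumptions consistent with the surrounding text):

```matlab
rtrue = [2.5 1.2 0.45];            % true reaction rates
y0 = [1 1 0 1 0 0];                % initial quantities of C1 through C6
tspan = linspace(0,5);             % observation times from 0 through 5
sol = ode45(@(t,y) diffun(t,y,rtrue), [0 5], y0);
yvalstrue = deval(sol,tspan);      % 6-by-100 matrix of substance quantities
plot(tspan,yvalstrue)
legend('C_1','C_2','C_3','C_4','C_5','C_6')
```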
Optimization Problem
To prepare the problem for solution in the problem-based approach, create a three-element
optimization variable r that has a lower bound of 0.1 and an upper bound of 10.
r = optimvar('r',3,"LowerBound",0.1,"UpperBound",10);
The objective function for this problem is the sum of squares of the differences between the ODE
solution with parameters r and the solution with the true parameters yvals. To express this
objective function, first write a MATLAB function that computes the ODE solution using parameters
r. This function is the RtoODE function.
type RtoODE
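The RtoODE listing does not survive in this extract; a sketch matching its described behavior (solve the ODE for rates r and return the solution values at the times tspan) is:

```matlab
function solpts = RtoODE(r,tspan,y0)
% Solve the reaction ODE for rates r; return solution values at times tspan
sol = ode45(@(t,y) diffun(t,y,r), [tspan(1) tspan(end)], y0);
solpts = deval(sol,tspan);
end
```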
To use RtoODE in an objective function, convert the function to an optimization expression by using
fcn2optimexpr.
myfcn = fcn2optimexpr(@RtoODE,r,tspan,y0);
Express the objective function as the sum of squared differences between the ODE solution and the
solution with true parameters.
obj = sum(sum((myfcn - yvalstrue).^2));
Create an optimization problem with this objective, and review the problem.
prob = optimproblem("Objective",obj);
show(prob)
OptimizationProblem :
Solve for:
r
minimize :
sum(sum((RtoODE(r, extraParams{1}, extraParams{2}) - extraParams{3}).^2, 1))
where extraParams{1}, extraParams{2}, and extraParams{3} hold the data tspan, y0, and
yvalstrue, respectively (display shortened).
variable bounds:
0.1 <= r(1) <= 10
0.1 <= r(2) <= 10
0.1 <= r(3) <= 10
Solve Problem
To find the best-fitting parameters r, give an initial guess r0 for the solver and call solve.
r0.r = [1 1 1];
[rsol,sumsq] = solve(prob,r0)
sumsq = 3.8669e-15
The sum of squared differences is essentially zero, meaning the solver found parameters that cause
the ODE solution to match the solution with true parameters. So, as expected, the solution contains
the true parameters.
disp(rsol.r)
2.5000
1.2000
0.4500
disp(rtrue)
    2.5000
    1.2000
    0.4500
Limited Observations
Suppose that you cannot observe all the components of y, but only the final outputs y(5) and y(6).
Can you obtain the values of all the reaction rates based on this limited information?
To find out, modify the function RtoODE to return only the fifth and sixth ODE outputs. The modified
ODE solver is in RtoODE2.
type RtoODE2
The RtoODE2 function simply calls RtoODE and then takes the final two rows of the output.
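A sketch consistent with that description:

```matlab
function solpts = RtoODE2(r,tspan,y0)
solpts = RtoODE(r,tspan,y0);  % full six-component solution
solpts = solpts([5,6],:);     % keep only the final two components
end
```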
Create a new optimization expression from RtoODE2 and the optimization variable r, the time span
data tspan, and the initial point y0.
myfcn2 = fcn2optimexpr(@RtoODE2,r,tspan,y0);
yvals2 = yvalstrue([5,6],:);
Create a new objective and new optimization problem from the optimization expression myfcn2 and
the comparison data yvals2.
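A sketch of these two steps, following the pattern of the earlier code (the names obj2 and prob2 are assumptions implied by the solve call below):

```matlab
obj2 = sum(sum((myfcn2 - yvals2).^2));
prob2 = optimproblem("Objective",obj2);
```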
[rsol2,sumsq2] = solve(prob2,r0)
lsqnonlin stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
sumsq2 = 2.1616e-05
Once again, the returned sum of squares is essentially zero. Does this mean that the solver found the
correct reaction rates?
disp(rsol2.r)
1.7811
1.5730
0.5899
disp(rtrue)
    2.5000
    1.2000
    0.4500
No; in this case, the new rates are quite different from the true rates. However, a plot of the new ODE
solution compared to the true values shows that y(5) and y(6) match the true values.
figure
plot(tspan,yvals2(1,:),'b-')
hold on
ss2 = RtoODE2(rsol2.r,tspan,y0);
plot(tspan,ss2(1,:),'r--')
plot(tspan,yvals2(2,:),'c-')
plot(tspan,ss2(2,:),'m--')
legend('True y(5)','New y(5)','True y(6)','New y(6)','Location','northwest')
hold off
To identify the correct reaction rates for this problem, you must have data for more observations than
y(5) and y(6).
Plot all the components of the solution with the new parameters, and plot the solution with the true
parameters.
figure
yvals2 = RtoODE(rsol2.r,tspan,y0);
for i = 1:6
subplot(3,2,i)
plot(tspan,yvalstrue(i,:),'b-',tspan,yvals2(i,:),'r--')
legend('True','New','Location','best')
title(['y(',num2str(i),')'])
end
With the new parameters, substances C1 and C2 drain more slowly, and substance C3 does not
accumulate as much. But substances C4, C5, and C6 have exactly the same evolution with both the
new parameters and the true parameters.
See Also
fcn2optimexpr | ode45 | solve
More About
• “Problem-Based Optimization Workflow” on page 10-2
This example shows the efficiency of a least-squares solver by comparing the performance of
lsqnonlin with that of fminunc on the same problem. Additionally, the example shows added
benefits that you can obtain by explicitly recognizing and handling separately the linear parts of a
problem.
Problem Setup
Data = ...
[0.0000 5.8955
0.1000 3.5639
0.2000 2.5173
0.3000 1.9790
0.4000 1.8990
0.5000 1.3938
0.6000 1.1359
0.7000 1.0096
0.8000 1.0343
0.9000 0.8435
1.0000 0.6856
1.1000 0.6100
1.2000 0.5392
1.3000 0.3946
1.4000 0.3903
1.5000 0.5474
1.6000 0.3459
1.7000 0.1370
1.8000 0.2211
1.9000 0.1704
2.0000 0.2636];
t = Data(:,1);
y = Data(:,2);
plot(t,y,'ro')
title('Data points')
Nonlinear Data-Fitting Using Several Problem-Based Approaches
The problem is to fit the function

y = c(1)*exp(-lam(1)*t) + c(2)*exp(-lam(2)*t)

to the data.
c = optimvar('c',2);
lam = optimvar('lam',2);
Arbitrarily set the initial point x0 as follows: c(1) = 1, c(2) = 1, lam(1) = 1, and lam(2) = 0:
x0.c = [1,1];
x0.lam = [1,0];
Create a function that computes the value of the response at times t when the parameters are c and
lam.
diffun = @(c,lam) c(1)*exp(-lam(1)*t) + c(2)*exp(-lam(2)*t);
Convert diffun to an optimization expression that sums the squares of the differences between the
function and the data y.
diffexpr = sum((diffun(c,lam) - y).^2);
ssqprob = optimproblem('Objective',diffexpr);
[sol,fval,exitflag,output] = solve(ssqprob,x0)
lsqnonlin stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
fval = 0.1477
exitflag =
FunctionChangeBelowTolerance
Plot the resulting curve based on the returned solution values sol.c and sol.lam.
resp = diffun(sol.c,sol.lam);
hold on
plot(t,resp)
hold off
To solve the problem using the fminunc solver, set the 'Solver' option to 'fminunc' when calling
solve.
[xunc,fvalunc,exitflagunc,outputunc] = solve(ssqprob,x0,'Solver',"fminunc")
fvalunc = 0.1477
exitflagunc =
OptimalSolution
funcCount: 185
stepsize: 0.0017
lssteplength: 1
firstorderopt: 2.9562e-05
algorithm: 'quasi-newton'
message: '...'
solver: "fminunc"
Notice that fminunc found the same solution as the default lsqnonlin solver, but took many more
function evaluations to do so. The parameters for fminunc are in the opposite order from those for
lsqnonlin; the larger lam is lam(2), not lam(1). This result is not surprising, because the order of
the variables is arbitrary.
There were 185 function evaluations using fminunc, and 35 using lsqnonlin.
Notice that the fitting problem is linear in the parameters c(1) and c(2). This means that for any
values of lam(1) and lam(2), you can use the backslash operator to find the values of c(1) and
c(2) that solve the least-squares problem.
Rework the problem as a two-dimensional problem, searching for the best values of lam(1) and
lam(2). The values of c(1) and c(2) are calculated at each step using the backslash operator as
described above. To do so, use the fitvector function, which performs the backslash operation to
obtain c(1) and c(2) at each solver iteration.
type fitvector
function yEst = fitvector(lam,xdata,ydata)
A = zeros(length(xdata),length(lam)); % build the design matrix
for j = 1:length(lam)
A(:,j) = exp(-lam(j)*xdata);
end
c = A\ydata; % solve A*c = y for linear parameters c
yEst = A*c; % return the estimated response based on c
Solve the problem using solve starting from a two-dimensional initial point x02.lam = [1,0]. To do
so, first convert the fitvector function to an optimization expression. To avoid a warning, give the
output size of the resulting expression. Create a new optimization problem with objective as the sum
of squared differences between the converted fitvector function and the data y.
x02.lam = x0.lam;
F2 = fcn2optimexpr(@(x) fitvector(x,t,y),lam,'OutputSize',[length(t),1]);
ssqprob2 = optimproblem('Objective',sum((F2 - y).^2));
[sol2,fval2,exitflag2,output2] = solve(ssqprob2,x02)
lsqnonlin stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
fval2 = 0.1477
exitflag2 =
FunctionChangeBelowTolerance
The efficiency of the two-dimensional solution is similar to that of the four-dimensional solution:
fprintf(['There were %d function evaluations using the 2-d ' ...
'formulation, and %d using the 4-d formulation.'], ...
output2.funcCount,output.funcCount)
There were 33 function evaluations using the 2-d formulation, and 35 using the 4-d formulation.
Choosing a bad starting point for the original four-parameter problem leads to a local solution that is
not global. Choosing a starting point with the same bad lam(1) and lam(2) values for the split two-
parameter problem leads to the global solution. To show this, rerun the original problem with a start
point that leads to a relatively bad local solution, and compare the resulting fit with the global
solution.
x0bad.c = [5 1];
x0bad.lam = [1 0];
[solbad,fvalbad,exitflagbad,outputbad] = solve(ssqprob,x0bad)
lsqnonlin stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
fvalbad = 2.2173
exitflagbad =
FunctionChangeBelowTolerance
respbad = diffun(solbad.c,solbad.lam);
hold on
plot(t,respbad,'g')
legend('Data','Global fit','Bad local fit','Location','NE')
hold off
The residual norm at the good ending point is 0.147723, and the residual norm at the bad ending
point is 2.2173.
See Also
fcn2optimexpr | solve
More About
• “Nonlinear Data-Fitting” on page 12-9
• “Problem-Based Optimization Workflow” on page 10-2
prob = optimproblem("Objective",obj);
% Check to see the default solver
opts = optimoptions(prob)
opts =
lsqlin options:
...
In contrast, expressing the objective as a mathematically equivalent expression gives a problem that
the software interprets as a general quadratic problem.
opts =
quadprog options:
...
prob3 = optimproblem("Objective",obj3);
% Check to see the default solver
opts = optimoptions(prob3)
opts =
lsqnonlin options:
...
The most general form that the software interprets as a least-squares problem is a sum of expressions
Rn of this form:

Rn = an + k1∑(k2∑(k3∑(⋯(kj en²))))
Write Objective Function for Problem-Based Least Squares
Each expression Rn must evaluate to a scalar, not a multidimensional value. For example,
x = optimvar('x',10,3,4);
y = optimvar('y',10,2);
t = randn(10,3,4); % Data for example
u = randn(10,2); % Data for example
a = randn; % Coefficient
k = abs(randn(5,1)); % Positive coefficients
% Explicit sums of squares:
R1 = a + k(1)*sum(k(2)*sum(k(3)*sum((x - t).^2,3)));
R2 = k(4)*sum(k(5)*sum((y - u).^2,2));
R3 = 1 + (fcn2optimexpr(@cos,x(1)))^2;
prob = optimproblem('Objective',R1 + R2 + R3);
options = optimoptions(prob)
options =
lsqnonlin options:
...
See Also
More About
• “Problem-Based Optimization Workflow” on page 10-2
13
Systems of Equations
fsolve attempts to solve a system of equations by minimizing the sum of squares of the components.
If the sum of squares is zero, the system of equations is solved. fsolve has three algorithms:
• Trust-region
• Trust-region-dogleg
• Levenberg-Marquardt
All algorithms are large scale; see “Large-Scale vs. Medium-Scale Algorithms” on page 2-10.
Trust-Region Algorithm
Many of the methods used in Optimization Toolbox solvers are based on trust regions, a simple yet
powerful concept in optimization. In a trust-region method, you approximate the objective function f
with a simpler function q that reasonably reflects the behavior of f in a neighborhood N (the trust
region) around the current point x. The solver computes a trial step s by approximately minimizing
over N:

min { q(s) : s ∈ N }.
The solver updates the current point to x + s if f(x + s) < f(x); otherwise, the current point remains
unchanged and the solver shrinks N (the trust region) and repeats the trial step computation.
The key questions in defining a specific trust-region approach to minimizing f(x) are how to choose
and compute the approximation q (defined at the current point x), how to choose and modify the trust
region N, and how accurately to solve the trust-region subproblem.
Equation Solving Algorithms
In the standard trust-region method ([48]), the quadratic approximation q is defined by the first two
terms of the Taylor approximation to F at x. The neighborhood N is usually spherical or ellipsoidal in
shape. Mathematically, the trust-region subproblem is typically stated
min { (1/2) sᵀH s + sᵀg  such that  ‖Ds‖ ≤ Δ },   (13-1)
where g is the gradient of f at the current point x, H is the Hessian matrix (the symmetric matrix of
second derivatives), D is a diagonal scaling matrix, Δ is a positive scalar, and ∥ . ∥ is the 2-norm. To
solve “Equation 13-1”, an algorithm (see [48]) can compute all eigenvalues of H and then apply a
Newton process to the secular equation
1/Δ − 1/‖s‖ = 0.
Such an algorithm provides an accurate solution to “Equation 13-1”. However, this requires time
proportional to several factorizations of H. Therefore, trust-region problems require a different
approach. Several approximation and heuristic strategies, based on “Equation 13-1”, have been
proposed in the literature ([42] and [50]). Optimization Toolbox solvers follow an approximation
approach that restricts the trust-region subproblem to a two-dimensional subspace S ([39] and [42]).
After the solver computes the subspace S, the work to solve “Equation 13-1” is trivial because, in the
subspace, the problem is only two-dimensional. The dominant work now shifts to the determination of
the subspace.
The solver determines the two-dimensional subspace S with the aid of a preconditioned conjugate
gradient method (described in the next section). The solver defines S as the linear space spanned by
s1 and s2, where s1 is in the direction of the gradient g, and s2 is either an approximate Newton
direction, that is, a solution to

H · s2 = −g,

or a direction of negative curvature:

s2ᵀ · H · s2 < 0.
The philosophy behind this choice of S is to force global convergence (via the steepest descent
direction or negative curvature direction) and achieve fast local convergence (via the Newton step,
when it exists).
The process of unconstrained minimization using the trust-region approach is now easy to specify:

1 Formulate the two-dimensional trust-region subproblem.
2 Solve “Equation 13-1” to determine the trial step s.
3 If f(x + s) < f(x), then x = x + s.
4 Adjust Δ.

The solver repeats these four steps until convergence, adjusting the trust-region dimension Δ
according to standard rules. In particular, the solver decreases the trust-region size if it does not
accept the trial step, that is, when f(x + s) ≥ f(x). See [46] and [49] for a discussion of this aspect.
Optimization Toolbox solvers treat important cases of f with specialized functions: nonlinear least-
squares, quadratic functions, and linear least-squares. However, the underlying algorithmic ideas are
the same as for the general case.
A popular way to solve large, symmetric, positive definite systems of linear equations Hp = –g is the
method of Preconditioned Conjugate Gradients (PCG). This iterative approach requires the ability to
calculate matrix-vector products of the form H·v where v is an arbitrary vector. The symmetric
positive definite matrix M is a preconditioner for H. That is, M = C², where C⁻¹HC⁻¹ is a well-
conditioned matrix or a matrix with clustered eigenvalues.
In a minimization context, you can assume that the Hessian matrix H is symmetric. However, H is
guaranteed to be positive definite only in the neighborhood of a strong minimizer. Algorithm PCG
exits when it encounters a direction of negative (or zero) curvature, that is, dTHd ≤ 0. The PCG
output direction p is either a direction of negative curvature or an approximate solution to the
Newton system Hp = –g. In either case, p helps to define the two-dimensional subspace used in the
trust-region approach discussed in “Trust-Region Methods for Nonlinear Minimization” on page 6-2.
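As an illustration only (the toolbox uses its own internal PCG implementation), you can experiment with preconditioned conjugate gradients using the MATLAB `pcg` function; here M is a diagonal (Jacobi) preconditioner:

```matlab
% Sketch: preconditioned conjugate gradients on a test matrix
H = gallery('poisson',10);                    % 100-by-100 symmetric positive definite
g = ones(size(H,1),1);
M = spdiags(diag(H),0,size(H,1),size(H,1));   % diagonal (Jacobi) preconditioner
p = pcg(H,-g,1e-8,100,M);                     % approximate solution of H*p = -g
```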
Trust-Region-Dogleg Algorithm
Another approach is to solve a linear system of equations to find the search direction. Newton's
method specifies to solve for the search direction dk such that

J(xk) dk = −F(xk)
xk+1 = xk + dk,

where J(xk) is the n-by-n Jacobian matrix

        ⎡ ∇F1(xk)ᵀ ⎤
J(xk) = ⎢ ∇F2(xk)ᵀ ⎥ .
        ⎢     ⋮    ⎥
        ⎣ ∇Fn(xk)ᵀ ⎦
Newton's method can be problematic. J(xk) might be singular, in which case the Newton step dk is not
even defined. Also, the exact Newton step dk can be expensive to compute. In addition, Newton's
method might not converge if the starting point is far from the solution.
To address these issues, the algorithm instead minimizes a merit function over a trust region:

min_d  f(d) = (1/2) F(xk + d)ᵀ F(xk + d).
Linearizing F about xk gives the model M(xk + d) = F(xk) + J(xk)d, and the subproblem objective

m(d) = (1/2)‖M(xk + d)‖₂² = (1/2)‖F(xk) + J(xk)d‖₂²   (13-2)
     = (1/2) F(xk)ᵀF(xk) + dᵀJ(xk)ᵀF(xk) + (1/2) dᵀJ(xk)ᵀJ(xk) d.
m(d) is a better choice of merit function than f(d), so the trust-region subproblem is
min_d  (1/2) F(xk)ᵀF(xk) + dᵀJ(xk)ᵀF(xk) + (1/2) dᵀJ(xk)ᵀJ(xk) d,   (13-3)
such that ∥D·d∥ ≤ Δ. You can solve this subproblem efficiently using a dogleg strategy.
For an overview of trust-region methods, see Conn [4] and Nocedal [31].
Trust-Region-Dogleg Implementation
The key feature of the trust-region-dogleg algorithm is the use of the Powell dogleg procedure for
computing the step d, which minimizes “Equation 13-3”. For a detailed description, see Powell [34].
The algorithm constructs the step d from a convex combination of a Cauchy step (a step along the
steepest descent direction) and a Gauss-Newton step for f(x). The Cauchy step is calculated as

dC = −α J(xk)ᵀ F(xk),

where α is chosen to minimize “Equation 13-2” along the steepest descent direction. The Gauss-
Newton step is calculated by solving

J(xk) · dGN = −F(xk).

The algorithm then chooses the step

d = dC + λ(dGN − dC),

where λ is the largest value in the interval [0,1] such that ∥d∥ ≤ Δ. If Jk is (nearly) singular, d is just
the Cauchy direction.
The trust-region-dogleg algorithm is efficient because it requires only one linear solve per iteration
(for the computation of the Gauss-Newton step). Additionally, the algorithm can be more robust than
using the Gauss-Newton method with a line search.
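The step computation can be sketched as follows for a small example system. This is an illustration only; the toolbox implementation adds scaling and safeguards for singular Jacobians, and the example system is hypothetical:

```matlab
% One Powell dogleg step for an example system (sketch, not the toolbox code)
F  = @(x)[x(1)^2 + x(2)^2 - 2; x(1) - x(2)];   % example equations F(x) = 0
J  = @(x)[2*x(1), 2*x(2); 1, -1];              % Jacobian of F
xk = [2; 0];  Delta = 0.5;                     % current point, trust-region radius
Fk = F(xk);  Jk = J(xk);
g  = Jk'*Fk;                                   % gradient of (1/2)*norm(F)^2
alpha = (g'*g)/norm(Jk*g)^2;                   % minimizes the model along -g
dC  = -alpha*g;                                % Cauchy step
dGN = -(Jk\Fk);                                % Gauss-Newton step
if norm(dGN) <= Delta
    d = dGN;                                   % full Gauss-Newton step fits
elseif norm(dC) >= Delta
    d = (Delta/norm(dC))*dC;                   % truncated Cauchy step
else
    % largest lambda in [0,1] with norm(dC + lambda*(dGN - dC)) = Delta
    p = dGN - dC;
    a = p'*p;  b = 2*(dC'*p);  c = dC'*dC - Delta^2;
    lambda = (-b + sqrt(b^2 - 4*a*c))/(2*a);
    d = dC + lambda*p;
end
```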
Levenberg-Marquardt Method
The Levenberg-Marquardt algorithm ([25], and [27]) uses a search direction that is a solution of the
linear set of equations
(J(xk)ᵀ J(xk) + λk I) dk = −J(xk)ᵀ F(xk),   (13-4)

or, optionally, of the equations

(J(xk)ᵀ J(xk) + λk diag(J(xk)ᵀ J(xk))) dk = −J(xk)ᵀ F(xk),   (13-5)
where the scalar λk controls both the magnitude and direction of dk. Set the fsolve option
ScaleProblem to 'none' to use “Equation 13-4”, or set this option to 'jacobian' to use
“Equation 13-5”.
When λk is zero, the direction dk is identical to that of the Gauss-Newton method. As λk tends towards
infinity, dk tends towards the steepest descent direction, with magnitude tending towards zero. The
implication is that, for some sufficiently large λk, ‖F(xk + dk)‖ < ‖F(xk)‖ holds. Therefore, the
algorithm can control λk to ensure descent despite the second-order terms, which restrict the
efficiency of the Gauss-Newton method. The Levenberg-Marquardt algorithm therefore uses a search
direction that is a cross between the Gauss-Newton direction and the steepest descent direction.
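For example, to select the scaled form in “Equation 13-5”, set the option as in this sketch (the system `fun` is illustrative):

```matlab
fun = @(x)[10*(x(2) - x(1)^2); 1 - x(1)];   % illustrative system F(x) = 0
opts = optimoptions('fsolve','Algorithm','levenberg-marquardt', ...
    'ScaleProblem','jacobian');             % use Equation 13-5 scaling
x = fsolve(fun,[-1.9; 2],opts);
```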
fzero Algorithm
fzero attempts to find the root of a scalar function f of a scalar variable x.
fzero looks for an interval around an initial point such that f(x) changes sign. If you specify an initial
interval instead of an initial point, fzero checks to make sure that f(x) has different signs at the
endpoints of the interval. The initial interval must be finite; it cannot contain ±Inf.
fzero uses a combination of interval bisection, linear interpolation, and inverse quadratic
interpolation in order to locate a root of f(x). See fzero for more information.
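For example, with a hypothetical scalar function:

```matlab
fun = @(x) x.^3 - 2*x - 5;   % scalar function with a single real root
r1 = fzero(fun,2);           % search starting from the point x = 2
r2 = fzero(fun,[0 3]);       % search in [0,3], where fun changes sign
```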
\ Algorithm
The \ algorithm is described in the MATLAB arithmetic operators section for mldivide.
See Also
fsolve | fzero
More About
• “Systems of Nonlinear Equations”
Nonlinear Equations with Analytic Jacobian
This example solves a square system of nonlinear equations under the following assumptions:

• The system of nonlinear equations is square, that is, the number of equations equals the number of
unknowns.
• There exists a solution x such that F(x) = 0.
The example uses fsolve to obtain the minimum of the banana (or Rosenbrock) function by deriving
and then solving an equivalent system of nonlinear equations. The Rosenbrock function, which has a
minimum of f(x) = 0, is a common test problem in optimization. It has a high degree of nonlinearity
and converges extremely slowly if you try to use steepest descent type methods. It is given by
f(x) = 100(x2 − x1²)² + (1 − x1)².
First generalize this function to an n-dimensional function, for any positive, even value of n:
f(x) = Σ_{i=1}^{n/2} [ 100(x_{2i} − x_{2i−1}²)² + (1 − x_{2i−1})² ].
This function is referred to as the generalized Rosenbrock function. It consists of n squared terms
involving n unknowns.
Before you can use fsolve to find the values of x such that F(x) = 0, i.e., obtain the minimum of the
generalized Rosenbrock function, you must rewrite the function as the following equivalent system of
nonlinear equations:
F(1) = 1 − x1
F(2) = 10(x2 − x1²)
F(3) = 1 − x3
F(4) = 10(x4 − x3²)
⋮
F(n−1) = 1 − xn−1
F(n) = 10(xn − xn−1²).
This system is square, and you can use fsolve to solve it. As the example demonstrates, this system
has a unique solution given by xi = 1, i = 1,...,n.
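One possible implementation of the objective, consistent with the equations above, follows. The file bananaobj.m used by the example may differ in detail; the second output is the Jacobian, used when SpecifyObjectiveGradient is true.

```matlab
function [F,J] = bananaobj(x)
% Evaluate the vector function and the Jacobian matrix for the system of
% nonlinear equations derived from the generalized Rosenbrock function.
n = length(x);
odds  = 1:2:n;
evens = 2:2:n;
F = zeros(n,1);
F(odds)  = 1 - x(odds);
F(evens) = 10*(x(evens) - x(odds).^2);
if nargout > 1 % Jacobian requested
    c = -ones(n/2,1);   C = sparse(odds,odds,c,n,n);
    d = 10*ones(n/2,1); D = sparse(evens,evens,d,n,n);
    e = -20*x(odds);    E = sparse(evens,odds,e,n,n);
    J = C + D + E;
end
```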
Use the starting point x(i) = –1.9 for the odd indices, and x(i) = 2 for the even indices. Set Display
to 'iter' to see the solver's progress. Set SpecifyObjectiveGradient to true to use the
Jacobian defined in bananaobj.m. The fsolve function generates the following output:
Norm of First-order Trust-region
Iteration Func-count f(x) step optimality radius
0 1 8563.84 615 1
1 2 3093.71 1 329 1
2 3 225.104 2.5 34.8 2.5
3 4 212.48 6.25 34.1 6.25
4 5 212.48 6.25 34.1 6.25
5 6 212.48 1.5625 34.1 1.56
6 7 116.793 0.390625 5.79 0.391
7 8 109.947 0.390625 0.753 0.391
8 9 99.4696 0.976562 1.2 0.977
9 10 83.6416 2.44141 7.13 2.44
10 11 77.7663 2.44141 9.94 2.44
11 12 77.7663 2.44141 9.94 2.44
12 13 43.013 0.610352 1.38 0.61
13 14 36.4334 0.610352 1.58 0.61
14 15 34.1448 1.52588 6.71 1.53
15 16 18.0108 1.52588 4.91 1.53
16 17 8.48336 1.52588 3.74 1.53
17 18 3.74566 1.52588 3.58 1.53
18 19 1.46166 1.52588 3.32 1.53
19 20 0.29997 1.24265 1.94 1.53
20 21 0 0.0547695 0 1.53
Equation solved.
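The run above can be reproduced with a driver along these lines (a sketch; bananaobj must be on the MATLAB path):

```matlab
n = 64;                    % problem size (must be even)
x0(1:2:n,1) = -1.9;        % odd indices
x0(2:2:n,1) = 2;           % even indices
options = optimoptions(@fsolve,'Display','iter', ...
    'SpecifyObjectiveGradient',true);
[x,F,exitflag,output,JAC] = fsolve(@bananaobj,x0,options);
```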
See Also
fsolve
More About
• “Systems of Nonlinear Equations”
Nonlinear Equations with Finite-Difference Jacobian
This example uses bananaobj from the example “Nonlinear Equations with Analytic Jacobian” on
page 13-7 as the objective function, but sets SpecifyObjectiveGradient to false so that fsolve
approximates the Jacobian and ignores the second bananaobj output.
n = 64;
x0(1:n,1) = -1.9;
x0(2:2:n,1) = 2;
options = optimoptions(@fsolve,'Display','iter','SpecifyObjectiveGradient',false);
[x,F,exitflag,output,JAC] = fsolve(@bananaobj,x0,options);
Equation solved.
The finite-difference version of this example requires the same number of iterations to converge as
the analytic Jacobian version in the preceding example. It is generally the case that both versions
converge at about the same rate in terms of iterations. However, the finite-difference version requires
many additional function evaluations. The cost of these extra evaluations might or might not be
significant, depending on the particular problem.
See Also
fsolve
More About
• “Systems of Nonlinear Equations”
Nonlinear Equations with Jacobian
To solve a large nonlinear system of equations, F(x) = 0, you can use the trust-region reflective
algorithm available in fsolve, a large-scale algorithm (“Large-Scale vs. Medium-Scale Algorithms”
on page 2-10).
A starting point is given as well as the function name. The default method for fsolve is trust-region-
dogleg, so it is necessary to specify 'Algorithm' as 'trust-region' in the options argument in
order to select the trust-region algorithm. Setting the Display option to 'iter' causes fsolve to
display the output at each iteration. Setting 'SpecifyObjectiveGradient' to true causes
fsolve to use the Jacobian information available in nlsf1.m.
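A driver consistent with this description might look like the following sketch; nlsf1 is assumed to return both the function value and the sparse Jacobian:

```matlab
xstart = -ones(1000,1);    % starting point
fun = @nlsf1;              % objective; returns [F,J]
options = optimoptions(@fsolve,'Display','iter', ...
    'SpecifyObjectiveGradient',true,'Algorithm','trust-region', ...
    'PrecondBandWidth',0); % diagonal preconditioner
[x,fval,exitflag,output] = fsolve(fun,xstart,options);
```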
fsolve stopped because the vector of function values is near zero, as measured by the value
of the function tolerance. However, the last step was ineffective.
A linear system is (approximately) solved in each major iteration using the preconditioned conjugate
gradient method. Setting PrecondBandWidth to 0 in options means a diagonal preconditioner is
used. (PrecondBandWidth specifies the bandwidth of the preconditioning matrix. A bandwidth of 0
means there is only one diagonal in the matrix.)
From the first-order optimality values, fast linear convergence occurs. The number of conjugate
gradient (CG) iterations required per major iteration is low, at most five for a problem of 1000
dimensions, implying that the linear systems are not very difficult to solve in this case (though more
work is required as convergence progresses).
If you want to use a tridiagonal preconditioner, i.e., a preconditioning matrix with three diagonals (or
bandwidth of one), set PrecondBandWidth to the value 1:
options = optimoptions(@fsolve,'Display','iter','SpecifyObjectiveGradient',true,...
'Algorithm','trust-region','PrecondBandWidth',1);
[x,fval,exitflag,output] = fsolve(fun,xstart,options);
fsolve stopped because the vector of function values is near zero, as measured by the value
of the function tolerance. However, the last step was ineffective.
Note that although the same number of iterations takes place, the number of PCG iterations has
dropped, so less work is being done per iteration. See “Preconditioned Conjugate Gradient Method”
on page 6-21.
Setting PrecondBandWidth to Inf (this is the default) means that the solver uses Cholesky
factorization rather than PCG.
See Also
fsolve
More About
• “Systems of Nonlinear Equations”
Nonlinear Equations with Jacobian Sparsity Pattern
In order for this finite differencing to be as efficient as possible, you should supply the sparsity
pattern of the Jacobian, by setting JacobPattern to a sparse matrix Jstr in options. That is,
supply a sparse matrix Jstr whose nonzero entries correspond to nonzeros of the Jacobian for all x.
Indeed, the nonzeros of Jstr can correspond to a superset of the nonzero locations of J; however, in
general the computational cost of the sparse finite-difference procedure will increase with the
number of nonzeros of Jstr.
Providing the sparsity pattern can drastically reduce the time needed to compute the finite
differencing on large problems. If the sparsity pattern is not provided (and the Jacobian is not
computed in the objective function either) then, in this problem with 1000 variables, the finite-
differencing code attempts to compute all 1000-by-1000 entries in the Jacobian. But in this case there
are only 2998 nonzeros, substantially fewer than the 1,000,000 possible nonzeros the finite-differencing
code attempts to compute. In other words, this problem is solvable if you provide the sparsity pattern.
If not, most computers run out of memory when the full dense finite-differencing is attempted. On
most small problems, it is not essential to provide the sparsity structure.
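For this problem, the 2998 nonzeros for 1000 variables correspond to a tridiagonal pattern, which you can construct directly (a sketch; the example instead loads Jstr from nlsdat1.mat):

```matlab
n = 1000;
Jstr = spdiags(ones(n,3),-1:1,n,n) ~= 0;   % tridiagonal 0-1 sparsity pattern
nnz(Jstr)                                  % 2998 nonzeros
```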
Suppose the sparse matrix Jstr, computed previously, has been saved in file nlsdat1.mat. The
following driver calls fsolve applied to nlsf1a, which is nlsf1 without the Jacobian. Sparse finite-
differencing is used to estimate the sparse Jacobian matrix as needed.
fsolve stopped because the vector of function values is near zero, as measured by the value
of the function tolerance. However, the last step was ineffective.
Alternatively, it is possible to choose a sparse direct linear solver (i.e., a sparse QR factorization) by
indicating a “complete” preconditioner. For example, if you set PrecondBandWidth to Inf, then a
sparse direct linear solver is used instead of a preconditioned conjugate gradient iteration:
xstart = -ones(1000,1);
fun = @nlsf1a;
load nlsdat1 % Get Jstr
options = optimoptions(@fsolve,'Display','iter','JacobPattern',Jstr,...
'Algorithm','trust-region','SubproblemAlgorithm','factorization');
[x,fval,exitflag,output] = fsolve(fun,xstart,options);
Equation solved.
When using the sparse direct solver, there are no CG iterations. Notice that the final optimality and
f(x) value (which for fsolve, f(x), is the sum of the squares of the function values) are closer to zero
than using the PCG method, which is often the case.
See Also
fsolve
More About
• “Systems of Nonlinear Equations”
Nonlinear Systems with Constraints
A solution that satisfies your constraints is not guaranteed to exist. In fact, the problem might not
have any solution, even one that does not satisfy your constraints. However techniques exist to help
you search for solutions that satisfy your constraints.
For example, consider the system

F1(x) = (x1 + 1)(10 − x1)(1 + x2²)/(1 + x2² + x2)
F2(x) = (x2 + 2)(20 − x2)(1 + x1²)/(1 + x1² + x1),   (13-6)
where the components of x must be nonnegative. The equations have four solutions:

x = (−1,−2),
x = (10,−2),
x = (−1,20),
x = (10,20).

Only the last solution satisfies the constraints.
The following function file computes the system.

function F = fbnd(x)
F(1) = (x(1)+1)*(10-x(1))*(1+x(2)^2)/(1+x(2)^2+x(2));
F(2) = (x(2)+2)*(20-x(2))*(1+x(1)^2)/(1+x(1)^2+x(1));
For this example, to look for a solution to “Equation 13-6”, take 10 random points that are normally
distributed with mean 0 and standard deviation 100.
rng default % For reproducibility
N = 10; % Try 10 random start points
pts = 100*randn(N,2); % Initial points are rows in pts
soln = zeros(N,2); % Allocate solution
opts = optimoptions('fsolve','Display','off');
for k = 1:N
soln(k,:) = fsolve(@fbnd,pts(k,:),opts); % Find solutions
end
Examine the solutions in soln, and note that several satisfy the constraints.
The solution found can depend on the algorithm. For this example, take x0 = [1,9] and examine the solution each algorithm returns.
x0 = [1,9];
opts = optimoptions(@fsolve,'Display','off',...
'Algorithm','trust-region-dogleg');
x1 = fsolve(@fbnd,x0,opts)
x1 =
-1.0000 -2.0000
opts.Algorithm = 'trust-region';
x2 = fsolve(@fbnd,x0,opts)
x2 =
-1.0000 20.0000
opts.Algorithm = 'levenberg-marquardt';
x3 = fsolve(@fbnd,x0,opts)
x3 =
0.9523 8.9941
Here, all three algorithms find different solutions for the same initial point. In fact, x3 is not even a
solution, but is simply a locally stationary point.
To search for a solution satisfying the bounds, use lsqnonlin with a lower bound of zero.

lb = [0,0];
rng default
x0 = 100*randn(2,1);
[x,res] = lsqnonlin(@fbnd,x0,lb)
x =
10.0000
20.0000
res =
2.4783e-25
You can use lsqnonlin with the Global Optimization Toolbox MultiStart solver to search over
many initial points automatically. See “MultiStart Using lsqcurvefit or lsqnonlin” (Global Optimization
Toolbox).
Alternatively, you can reformulate the problem for fmincon as follows:

• Give a constant objective function, such as @(x)0, which evaluates to 0 for each x.
• Set the fsolve objective function as the nonlinear equality constraints in fmincon.
• Give any other constraints in the usual fmincon syntax.
For this example, write a function file for the nonlinear constraints.
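A sketch of this approach follows; the constraint function name fminconstr is illustrative:

```matlab
function [c,ceq] = fminconstr(x)
c = [];          % no nonlinear inequality constraints
ceq = fbnd(x);   % the equations become nonlinear equality constraints
end
```

Then call fmincon with a zero objective and the bound x ≥ 0:

```matlab
lb = [0,0];                          % lower bound constraint
rng default                          % reproducible initial point
x0 = 100*randn(2,1);                 % random initial point
opts = optimoptions('fmincon','Display','off');
x = fmincon(@(x)0,x0,[],[],[],[],lb,[],@fminconstr,opts)
```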
x =
10.0000
20.0000
See Also
fmincon | fsolve | lsqnonlin
More About
• “Nonlinear System of Equations with Constraints, Problem-Based” on page 13-30
Solve Nonlinear System of Equations, Problem-Based
x = optimvar('x',2);
eq1 = exp(-exp(-(x(1) + x(2)))) == x(2)*(1 + x(1)^2);
eq2 = x(1)*cos(x(2)) + x(2)*sin(x(1)) == 1/2;
prob = eqnproblem;
prob.Equations.eq1 = eq1;
prob.Equations.eq2 = eq2;
show(prob)
EquationProblem :
Solve for:
x
eq1:
exp(-exp(-(x(1) + x(2)))) == (x(2) .* (1 + x(1).^2))
eq2:
((x(1) .* cos(x(2))) + (x(2) .* sin(x(1)))) == 0.5
Solve the problem starting from the point [0,0]. For the problem-based approach, specify the initial
point as a structure, with the variable names as the fields of the structure. For this problem, there is
only one variable, x.
x0.x = [0 0];
[sol,fval,exitflag] = solve(prob,x0)
Equation solved.
exitflag =
EquationSolved
disp(sol.x)
0.3532
0.6061
If your equation functions are not composed of elementary functions, you must convert the functions
to optimization expressions using fcn2optimexpr. For the present example:
ls1 = fcn2optimexpr(@(x)exp(-exp(-(x(1)+x(2)))),x);
eq1 = ls1 == x(2)*(1 + x(1)^2);
ls2 = fcn2optimexpr(@(x)x(1)*cos(x(2))+x(2)*sin(x(1)),x);
eq2 = ls2 == 1/2;
For the list of supported functions, see “Supported Operations on Optimization Variables and
Expressions” on page 10-36.
See Also
fcn2optimexpr | solve
More About
• “Convert Nonlinear Function to Optimization Expression” on page 7-8
• “Systems of Nonlinear Equations”
• “Problem-Based Workflow for Solving Equations” on page 10-4
Solve Nonlinear System of Polynomials, Problem-Based
The system of equations

x³ = [1 2
      3 4]

is a system of polynomial equations. Here, x³ means x * x * x using matrix multiplication. You can
easily formulate and solve this system using the problem-based approach.
x = optimvar('x',2,2);
eqn = x^3 == [1 2;3 4];
prob = eqnproblem('Equations',eqn);
x0.x = ones(2);
sol = solve(prob,x0)
Equation solved.
disp(sol.x)
-0.1291 0.8602
1.2903 1.1612
sol.x^3
ans = 2×2
1.0000 2.0000
3.0000 4.0000
See Also
solve
More About
• “Systems of Nonlinear Equations”
• “Problem-Based Workflow for Solving Equations” on page 10-4
Follow Equation Solution as a Parameter Changes
This example follows the solutions of the equation

sinh(x) − 3x = a,
where a is a numeric parameter that goes from 0 to 5. At a = 0, one solution to this equation is x = 0.
When a is not too large in absolute value, the equation has three solutions. To visualize the equation,
create the left side of the equation as an anonymous function. Plot the function.
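The plotting step might look like this sketch, which also defines the anonymous function fun used later in the example:

```matlab
fun = @(x) sinh(x) - 3*x;     % left side of the equation
t = linspace(-3.5,3.5);
plot(t,fun(t))
xlabel('x')
ylabel('fun(x)')
```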
Problem-Based Setup
To create an objective function for the problem-based approach, create an optimization expression
expr in an optimization variable x.
x = optimvar('x');
expr = sinh(x) - 3*x;
Starting from the initial solution x = 0 at a = 0, find solutions for 100 values of a from 0 through 5.
Because fun is a scalar nonlinear function, solve calls fzero as the solver.
Set up the problem object, options, and data structures for holding solution statistics.
prob = eqnproblem;
options = optimset('Display','off');
sols = zeros(100,1);
fevals = sols;
as = linspace(0,5);
Solve the equation in a loop, starting from the first solution sols(1) = 0.
for i = 2:length(as)
x0.x = sols(i-1); % Start from previous solution
prob.Equations = expr == as(i);
[sol,~,~,output] = solve(prob,x0,'Options',options);
sols(i) = sol.x;
fevals(i) = output.funcCount;
end
Plot the solution as a function of the parameter a and the number of function evaluations taken to
reach the solution.
subplot(2,1,1)
plot(as,sols,'ko')
xlabel 'a'
ylabel('Solution(x)')
subplot(2,1,2)
plot(fevals,'k*')
xlabel('Iteration Number')
ylabel('Fevals')
A jump in the solution occurs near a = 2.5. At the same point, the number of function evaluations to
reach a solution increases from about 15 to about 40. To understand why, examine a more detailed
plot of the function. Plot the function and every seventh solution point.
figure
t = linspace(-3.5,3.5);
plot(t,fun(t));
hold on
plot([-3.5,min(sols)],[2.5,2.5],'k--')
legend('fun','Maximum a','Location','north','autoupdate','off')
for a0 = 7:7:100
plot(sols(a0),as(a0),'ro')
if mod(a0,2) == 1
text(sols(a0) + 0.15,as(a0) + 0.15,num2str(a0/7))
else
text(sols(a0) - 0.3,as(a0) + 0.05,num2str(a0/7))
end
end
plot(t,zeros(size(t)),'k-')
hold off
As a increases, at first the solutions move to the left. However, when a is above 2.5, there is no longer
a solution near the previous solution. fzero requires extra function evaluations to search for a
solution, and finds a solution near x = 3. After that, the solution values move slowly to the right as a
increases further. The solver requires only about 10 function evaluations for each subsequent
solution.
The fsolve solver can be more efficient than fzero. However, fsolve can become stuck in a local
minimum and fail to solve the equation.
Set up the problem object, options, and data structures for holding solution statistics.
probfsolve = eqnproblem;
sols = zeros(100,1);
fevals = sols;
infeas = sols;
asfsolve = linspace(0,5);
Solve the equation in a loop, starting from the first solution sols(1) = 0.
for i = 2:length(as)
x0.x = sols(i-1); % Start from previous solution
probfsolve.Equations = expr == asfsolve(i);
[sol,fval,~,output] = solve(probfsolve,x0,'Options',options,'Solver','fsolve');
sols(i) = sol.x;
fevals(i) = output.funcCount;
infeas(i) = fval;
end
Plot the solution as a function of the parameter a and the number of function evaluations taken to
reach the solution.
subplot(2,1,1)
plot(asfsolve,sols,'ko',asfsolve,infeas,'r-')
xlabel 'a'
legend('Solution','Error of Solution','Location','best')
subplot(2,1,2)
plot(fevals,'k*')
xlabel('Iteration Number')
ylabel('Fevals')
fsolve is somewhat more efficient than fzero, requiring about 7 or 8 function evaluations per
iteration. Again, when the solver finds no solution near the previous value, the solver requires many
more function evaluations to search for a solution. This time, the search is unsuccessful. Subsequent
iterations require few function evaluations for the most part, but fail to find a solution. The Error of
Solution plot shows the function value fun(x) − a.
To try to overcome a local minimum not being a solution, search again from a different start point
when fsolve returns with a negative exit flag. Set up the problem object, options, and data
structures for holding solution statistics.
sols = zeros(100,1);
fevals = sols;
asfsolve = linspace(0,5);
Solve the equation in a loop, starting from the first solution sols(1) = 0.
for i = 2:length(as)
    x0.x = sols(i-1); % Start from previous solution
    probfsolve.Equations = expr == asfsolve(i);
    [sol,~,exitflag,output] = solve(probfsolve,x0,'Options',options,'Solver','fsolve');
    sols(i) = sol.x;
    fevals(i) = fevals(i) + output.funcCount;
    while exitflag < 0 % No solution found; retry from a random start point
        x0.x = 10*randn;
        [sol,~,exitflag,output] = solve(probfsolve,x0,'Options',options,'Solver','fsolve');
        sols(i) = sol.x;
        fevals(i) = fevals(i) + output.funcCount;
    end
end
Plot the solution as a function of the parameter a and the number of function evaluations taken to
reach the solution.
subplot(2,1,1)
plot(asfsolve,sols,'ko')
xlabel 'a'
ylabel('Solution(x)')
subplot(2,1,2)
plot(fevals,'k*')
xlabel('Iteration Number')
ylabel('Fevals')
This time, fsolve recovers from the poor initial point near a = 2.5 and obtains a solution similar to
the one obtained by fzero. The number of function evaluations for each iteration is typically 8,
increasing to about 30 at the point where the solution jumps.
For some objective functions or software versions, you must convert nonlinear functions to
optimization expressions by using fcn2optimexpr. See “Supported Operations on Optimization
Variables and Expressions” on page 10-36. For this example, convert the original function fun used
for plotting to the optimization expression expr:
expr = fcn2optimexpr(fun,x);
The remainder of the example is exactly the same after this change to the definition of expr.
See Also
fsolve | fzero | solve
More About
• “Systems of Nonlinear Equations”
• “Problem-Based Workflow for Solving Equations” on page 10-4
Bound Constraints
When your problem has only bound constraints, the process for solving the problem is
straightforward. For example, to find the solution with positive components to the system of
equations
(x1 + 1)(10 − x1)(1 + x2²)/(1 + x2² + x2) = 0
(x2 + 2)(20 − x2)(1 + x1²)/(1 + x1² + x1) = 0,
simply create optimization variables with lower bounds of 0. (These equations have four solutions:
where x1 = − 1 or x1 = 10, and where x2 = − 2 or x2 = 20.)
x = optimvar('x',2,"LowerBound",0);
expr1 = (x(1) + 1)*(10 - x(1))*((1 + x(2)^2))/(1 + x(2)^2 + x(2));
expr2 = (x(2) + 2)*(20 - x(2))*((1 + x(1)^2))/(1 + x(1)^2 + x(1));
eqn1 = expr1 == 0;
eqn2 = expr2 == 0;
prob = eqnproblem;
prob.Equations.eqn1 = eqn1;
prob.Equations.eqn2 = eqn2;
x0.x = [15,15];
[sol,fval,exitflag] = solve(prob,x0)
Equation solved.
exitflag =
EquationSolved
Nonlinear System of Equations with Constraints, Problem-Based
sol.x
ans = 2×1
10
20
General Constraints
When your problem has general constraints, formulate the problem as an optimization problem, not
an equation problem. Set the equations as equality constraints. For example, to solve the preceding
equations subject to the nonlinear inequality constraint ‖x‖² ≤ 10, remove the bounds on x and
formulate the problem as an optimization problem with no objective function.
x.LowerBound = [];
circlecons = x(1)^2 + x(2)^2 <= 10;
prob2 = optimproblem;
prob2.Constraints.circlecons = circlecons;
prob2.Constraints.eqn1 = eqn1;
prob2.Constraints.eqn2 = eqn2;
[sol2,fval2,exitflag2] = solve(prob2,x0)
fval2 = 0
exitflag2 =
OptimalSolution
sol2.x
ans = 2×1
-1.0000
-2.0000
You can also formulate the problem by setting the objective function as a sum of squares, and the
general constraints as a constraint. This alternative formulation gives a mathematically equivalent
problem, but can result in a different solution because the change in formulation leads the solver to
different iterations.
prob3 = optimproblem;
prob3.Objective = expr1^2 + expr2^2;
prob3.Constraints.circlecons = circlecons;
[sol3,fval3,exitflag3] = solve(prob3,x0)
fval3 = 4.8066e-13
exitflag3 =
OptimalSolution
sol3.x
ans = 2×1
-1.0000
-2.0000
In this case, the least squares objective leads to the same solution as the previous formulation, which
uses only constraints.
Generally, solve attempts to solve a nonlinear system of equations by minimizing the sum of squares
of the equation components. In other words, if LHS(i) is the left-side expression for equation i, and
RHS(i) is the right-side expression, then solve attempts to minimize sum((LHS – RHS).^2).
In contrast, when attempting to satisfy nonlinear constraint expressions, solve generally uses
fmincon, and tries to satisfy the constraints by using different strategies.
In both cases, the solver can fail to solve the equations. For strategies you can use to attempt to find
a solution when the solver fails, see “fsolve Could Not Solve Equation” on page 4-8.
See Also
solve
More About
• “Nonlinear Systems with Constraints” on page 13-15
• “When the Solver Fails” on page 4-3
14 Parallel Computing for Optimization
The following Optimization Toolbox solvers can automatically distribute the numerical estimation of
gradients of objective functions and nonlinear constraint functions to multiple processors:
• fmincon
• fminunc
• fgoalattain
• fminimax
• fsolve
• lsqcurvefit
• lsqnonlin
These solvers use parallel gradient estimation under the following conditions:

• You have a license for Parallel Computing Toolbox software.
• The option SpecifyObjectiveGradient is set to false, or, if there is a nonlinear constraint
function, the option SpecifyConstraintGradient is set to false.
• Parallel computing is enabled with parpool, a Parallel Computing Toolbox function.
• The option UseParallel is set to true.

When these conditions hold, the solvers compute estimated gradients in parallel.
Note Even when running in parallel, a solver occasionally calls the objective and nonlinear
constraint functions serially on the host machine. Therefore, ensure that your functions have no
assumptions about whether they are evaluated in serial or parallel.
What Is Parallel Computing in Optimization Toolbox?
The solvers estimate the gradient ∇f(x) by forward finite differences:

∇f(x)ᵢ ≈ (f(x + Δiei) − f(x))/Δi,

where ei is the unit vector in the ith coordinate direction and Δi is a small step size. To estimate
∇f(x) in parallel, Optimization Toolbox solvers distribute the evaluation of (f(x + Δiei) − f(x))/Δi to
extra processors.
You can choose to have gradients estimated by central finite differences instead of the default
forward finite differences. The basic central finite difference formula is

∇f(x)ᵢ ≈ (f(x + Δiei) − f(x − Δiei))/(2Δi).
This takes twice as many function evaluations as forward finite differences, but is usually much more
accurate. Central finite differences work in parallel exactly the same as forward finite differences.
Enable central finite differences by using optimoptions to set the FiniteDifferenceType option
to 'central'. To use forward finite differences, set the FiniteDifferenceType option to
'forward'.
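For example, to combine parallel gradient estimation with central finite differences (a sketch; fmincon stands in for any applicable solver):

```matlab
options = optimoptions('fmincon','UseParallel',true, ...
    'FiniteDifferenceType','central');
```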
Suppose, for example, your objective function userfcn calls parfor, and you wish to call fmincon
in a loop. Suppose also that the conditions for parallel gradient evaluation of fmincon, as given in
“Parallel Optimization Functionality” on page 14-2, are satisfied. “When parfor Runs In Parallel” on
page 14-4 shows three cases:
14 Parallel Computing for Optimization
See Also
“Using Parallel Computing in Optimization Toolbox” on page 14-5 | “Improving Performance with
Parallel Computing” on page 14-13 | “Minimizing an Expensive Optimization Problem Using Parallel
Computing Toolbox™” on page 14-8
Using Parallel Computing in Optimization Toolbox
Suppose you have a dual-core processor, and want to use parallel computing:
• Enter
parpool
at the command line. MATLAB starts a pool of workers using the multicore processor. If you had
previously set a nondefault cluster profile, you can enforce multicore (local) computing:
parpool('local')
Note Depending on your preferences, MATLAB can start a parallel pool automatically. To enable
this feature, check Automatically create a parallel pool in Home > Parallel > Parallel
Preferences.
• For command-line use, enter
options = optimoptions('solvername','UseParallel',true);
• For Optimization app, check Options > Approximated derivatives > Evaluate in parallel.
When you run an applicable solver with these options, it automatically uses parallel
computing.
To stop computing optimizations in parallel, set UseParallel to false, or set the Optimization app
not to compute in parallel. To halt all parallel computation, enter
delete(gcp)
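Putting these pieces together, a minimal end-to-end sketch (myfun, mycon, and x0 are hypothetical user-supplied items):

```matlab
parpool('local');                                  % start a local pool of workers
options = optimoptions('fmincon','UseParallel',true);
x0 = zeros(3,1);                                   % hypothetical start point
x = fmincon(@myfun,x0,[],[],[],[],[],[],@mycon,options);
delete(gcp);                                       % shut down the pool when done
```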
1 Make sure your system is configured properly for parallel computing. Check with your systems
administrator, or refer to the Parallel Computing Toolbox documentation.
Set the UseParallel option to true with options = optimoptions('solvername','UseParallel',true), where 'solvername' represents one of the nonlinear solvers that support parallel evaluation.
• For Optimization app, check Options > Approximated derivatives > Evaluate in parallel.
After you establish your parallel computing environment, applicable solvers automatically use parallel
computing whenever you call them with options.
To stop computing optimizations in parallel, set UseParallel to false, or set the Optimization app
not to compute in parallel. To halt all parallel computation, enter
delete(gcp)
3 Set UseParallel to true, and create a parallel pool using parpool. Unless you have a
multicore processor or a network set up, you won't see any speedup. This testing is simply to
verify the correctness of the computations.
Remember to call your solver using an options argument to test or use parallel functionality.
See Also
More About
• “What Is Parallel Computing in Optimization Toolbox?” on page 14-2
• “Improving Performance with Parallel Computing” on page 14-13
• “Minimizing an Expensive Optimization Problem Using Parallel Computing Toolbox™” on page
14-8
For the purpose of this example, we solve a problem in four variables, where the objective and
constraint functions are made artificially expensive by pausing.
function f = expensive_objfun(x)
%EXPENSIVE_OBJFUN An expensive objective function used in optimparfor example.
pause(0.1) % simulate an expensive computation by pausing
% Objective value (reconstructed from the optimparfor example).
f = exp(x(1))*(4*x(3)^2 + 2*x(4)^2 + 4*x(1)*x(2) + 2*x(2) + x(1));
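The example's constraint function follows the same pattern; a sketch of its structure (the constraint expressions here are illustrative placeholders, not quoted from the example):

```matlab
function [c,ceq] = expensive_confun(x)
%EXPENSIVE_CONFUN An expensive constraint function (illustrative sketch).
pause(0.1) % simulate an expensive computation by pausing
% Hypothetical nonlinear inequality constraints c(x) <= 0.
c = [1.5 + x(1)*x(2) - x(1) - x(2);
     -x(1)*x(2) - 10];
ceq = []; % no nonlinear equality constraints
```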
We are interested in measuring the time taken by fmincon in serial so that we can compare it to the
parallel time.
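The serial timing code follows the same pattern as the parallel run shown later in this example; a sketch (startPoint is a hypothetical start point for the four-variable problem):

```matlab
startPoint = -ones(4,1);   % hypothetical start point
options = optimoptions('fmincon','Display','iter','Algorithm','interior-point');
startTime = tic;
xsol = fmincon(@expensive_objfun,startPoint,[],[],[],[],[],[],@expensive_confun,options);
time_fmincon_sequential = toc(startTime);
fprintf('Serial FMINCON optimization takes %g seconds.\n',time_fmincon_sequential);
```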
                                            First-order      Norm of
 Iter F-count            f(x)  Feasibility   optimality         step
    0       5    1.839397e+00    1.500e+00    3.211e+00
Since ga usually takes many more function evaluations than fmincon, we remove the expensive
constraint from this problem and perform unconstrained optimization instead. Pass empty matrices
[] for constraints. In addition, we limit the maximum number of generations to 15 for ga so that ga
can terminate in a reasonable amount of time. We are interested in measuring the time taken by ga
so that we can compare it to the parallel ga evaluation. Note that running ga requires Global
Optimization Toolbox.
rng default % for reproducibility
try
    gaAvailable = false;
    nvar = 4;
    gaoptions = optimoptions('ga','MaxGenerations',15,'Display','iter');
    startTime = tic;
    gasol = ga(@expensive_objfun,nvar,[],[],[],[],[],[],[],gaoptions);
    time_ga_sequential = toc(startTime);
    fprintf('Serial GA optimization takes %g seconds.\n',time_ga_sequential);
    gaAvailable = true;
catch ME
    warning(message('optimdemos:optimparfor:gaNotFound'));
end
The finite differencing used by the functions in Optimization Toolbox to approximate derivatives is
done in parallel using the parfor feature if Parallel Computing Toolbox is available and there is a
parallel pool of workers. Similarly, ga, gamultiobj, and patternsearch solvers in Global
Optimization Toolbox evaluate functions in parallel. To use the parfor feature, we use the parpool
function to set up the parallel environment. The computer on which this example is published has
four cores, so parpool starts four MATLAB® workers. If there is already a parallel pool when you
run this example, we use that pool; see the documentation for parpool for more information.
To minimize our expensive optimization problem using the parallel fmincon function, we need to
explicitly indicate that our objective and constraint functions can be evaluated in parallel and that we
want fmincon to use its parallel functionality wherever possible. Currently, finite differencing can be
done in parallel. We are interested in measuring the time taken by fmincon so that we can compare
it to the serial fmincon run.
options = optimoptions(options,'UseParallel',true);
startTime = tic;
xsol = fmincon(@expensive_objfun,startPoint,[],[],[],[],[],[],@expensive_confun,options);
time_fmincon_parallel = toc(startTime);
fprintf('Parallel FMINCON optimization takes %g seconds.\n',time_fmincon_parallel);
                                            First-order      Norm of
 Iter F-count            f(x)  Feasibility   optimality         step
    0       5    1.839397e+00    1.500e+00    3.211e+00
1 11 -9.760099e-01 3.708e+00 7.902e-01 2.362e+00
2 16 -1.480976e+00 0.000e+00 8.344e-01 1.069e+00
3 21 -2.601599e+00 0.000e+00 8.390e-01 1.218e+00
4 29 -2.823630e+00 0.000e+00 2.598e+00 1.118e+00
5 34 -3.905339e+00 0.000e+00 1.210e+00 7.302e-01
6 39 -6.212992e+00 3.934e-01 7.372e-01 2.405e+00
7 44 -5.948762e+00 0.000e+00 1.784e+00 1.905e+00
8 49 -6.940062e+00 1.233e-02 7.668e-01 7.553e-01
9 54 -6.973887e+00 0.000e+00 2.549e-01 3.920e-01
10 59 -7.142993e+00 0.000e+00 1.903e-01 4.735e-01
11 64 -7.155325e+00 0.000e+00 1.365e-01 2.626e-01
12 69 -7.179122e+00 0.000e+00 6.336e-02 9.115e-02
13 74 -7.180116e+00 0.000e+00 1.069e-03 4.670e-02
14 79 -7.180409e+00 0.000e+00 7.799e-04 2.815e-03
15 84 -7.180410e+00 0.000e+00 6.189e-06 3.122e-04
To minimize our expensive optimization problem using the ga function, we need to explicitly indicate
that our objective function can be evaluated in parallel and that we want ga to use its parallel
functionality wherever possible. To use ga in parallel, the 'Vectorized' option must also be
set to its default value, 'off'. We are again interested in measuring the time taken by ga so that we
can compare it to the serial ga run. Though this run may be different from the serial one because ga
uses a random number generator, the number of expensive function evaluations is the same in both
runs. Note that running ga requires Global Optimization Toolbox.
rng default % to get the same evaluations as the previous run
if gaAvailable
    gaoptions = optimoptions(gaoptions,'UseParallel',true);
    startTime = tic;
    gasol = ga(@expensive_objfun,nvar,[],[],[],[],[],[],[],gaoptions);
    time_ga_parallel = toc(startTime);
    fprintf('Parallel GA optimization takes %g seconds.\n',time_ga_parallel);
end
X = [time_fmincon_sequential time_fmincon_parallel];
Y = [time_ga_sequential time_ga_parallel];
t = [0 1];
plot(t,X,'r--',t,Y,'k-')
ylabel('Time in seconds')
legend('fmincon','ga')
ax = gca;
ax.XTick = [0 1];
Utilizing parallel function evaluation via parfor improved the efficiency of both fmincon and ga.
The improvement is typically better for expensive objective and constraint functions.
See Also
More About
• “What Is Parallel Computing in Optimization Toolbox?” on page 14-2
• “Using Parallel Computing in Optimization Toolbox” on page 14-5
• “Improving Performance with Parallel Computing” on page 14-13
Improving Performance with Parallel Computing
• Parallel overhead. There is overhead in calling parfor instead of for. If function evaluations are
fast, this overhead could become appreciable. In particular, solving a problem in parallel can be
slower than solving the problem serially.
• No nested parfor loops. This is described in “Nested Parallel Functions” on page 14-3. parfor
does not work in parallel when called from within another parfor loop. If you have programmed
your objective or constraint functions to take advantage of parallel processing, the limitation of no
nested parfor loops may cause a solver to run more slowly than you expect. In particular, the
parallel computation of finite differences takes precedence, since that is an outer loop. This causes
any parallel code within the objective or constraint functions to execute serially.
• When executing serially, parfor loops run slower than for loops. Therefore, for best
performance, ensure that only your outermost parallel loop calls parfor. For example, suppose
your code calls fmincon within a parfor loop. For best performance in this case, set the
fmincon UseParallel option to false.
• Passing parameters. Parameters are automatically passed to worker machines during the
execution of parallel computations. If there are a large number of parameters, or they take a large
amount of memory, passing them may slow the execution of your computation.
• Contention for resources: network and computing. If the network of worker machines has low
bandwidth or high latency, computation could be slowed.
• Persistent or global variables. If your objective or constraint functions use persistent or global
variables, these variables may take different values on different worker processors. Furthermore,
they may not be cleared properly on the worker processors. Solvers can throw errors such as size
mismatches.
• Accessing external files. External files may be accessed in an unpredictable fashion during a
parallel computation. The order of computations is not guaranteed during parallel processing, so
external files may be accessed in unpredictable order, leading to unpredictable results. Moreover,
if two or more processors try to read an external file simultaneously, the file may become locked,
leading to a read error that halts the optimization.
• If your objective function calls Simulink, results may be unreliable with parallel gradient
estimation.
• Noncomputational functions, such as input, plot, and keyboard, might behave badly when
used in objective or constraint functions. When called in a parfor loop, these functions are
executed on worker machines. This can cause a worker to become nonresponsive, since it is
waiting for input.
• parfor does not allow break or return statements.
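The nested-parfor consideration above can be sketched as follows (myfun and the start points are hypothetical):

```matlab
% Calling fmincon inside a parfor loop: disable the solver's own parallel
% finite differencing so that only the outermost loop runs in parallel.
options = optimoptions('fmincon','UseParallel',false);
nStarts = 10;
startPoints = randn(4,nStarts);    % hypothetical start points
sols = zeros(4,nStarts);
parfor k = 1:nStarts
    sols(:,k) = fmincon(@myfun,startPoints(:,k),[],[],[],[],[],[],[],options);
end
```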
If you have a Global Optimization Toolbox license, you can use the MultiStart solver to examine
multiple start points in parallel. See “Parallel Computing” (Global Optimization Toolbox) and “Parallel
MultiStart” (Global Optimization Toolbox).
See Also
More About
• “What Is Parallel Computing in Optimization Toolbox?” on page 14-2
• “Using Parallel Computing in Optimization Toolbox” on page 14-5
• “Minimizing an Expensive Optimization Problem Using Parallel Computing Toolbox™” on page
14-8
15
Argument and Options Reference
Function Input Arguments
See Also
More About
• “Function Output Arguments” on page 15-4
Function Output Arguments
See Also
More About
• “Function Input Arguments” on page 15-2
Optimization Options
The following table describes optimization options. Create options using the optimoptions function,
or optimset for fminbnd, fminsearch, fzero, or lsqnonneg.
See the individual function reference pages for information about available option values and
defaults.
The default values for the options vary depending on which optimization function you call with
options as an input argument. You can determine the default option values for any of the
optimization functions by entering optimoptions('solvername') or the equivalent
optimoptions(@solvername). For example,
optimoptions('fmincon')
returns a list of the options and the default values for the default 'interior-point' fmincon
algorithm. To find the default values for another fmincon algorithm, set the Algorithm option. For
example,
opts = optimoptions('fmincon','Algorithm','sqp')
optimoptions “hides” some options, meaning it does not display their values. Those options do not
appear in this table. Instead, they appear in “Hidden Options” on page 15-16.
Optimization Options
AbsoluteGapTolerance — Nonnegative real. intlinprog stops when the difference between the internally calculated upper (U) and lower (L) bounds on the objective function satisfies U – L <= AbsoluteGapTolerance. Used by intlinprog. optimoptions only.
AbsoluteMaxObjectiveCount — Number of F(x) to minimize the worst case absolute values. Used by fminimax.
Algorithm — Chooses the algorithm used by the solver. Used by fmincon, fminunc, fsolve, linprog, lsqcurvefit, lsqlin, lsqnonlin, quadprog.
BranchRule — Rule for choosing the component for branching. Used by intlinprog. optimoptions only.
FiniteDifferenceStepSize — Scalar or vector step size factor for finite differences. For a vector value v, the finite difference steps delta are
delta = v.*max(abs(x),TypicalX);
A scalar FiniteDifferenceStepSize expands to a vector. The default is sqrt(eps) for forward finite differences, and eps^(1/3) for central finite differences.
FiniteDifferenceType — Finite differences, used to estimate gradients, are either 'forward' (the default) or 'central' (centered), which takes twice as many function evaluations but should be more accurate. 'central' differences might violate bounds during their evaluation in fmincon interior-point iterations if the HonorBounds option is set to false. Used by fgoalattain, fmincon, fminimax, fminunc, fseminf, fsolve, lsqcurvefit, lsqnonlin. optimoptions only; for optimset, use FinDiffType.
FunctionTolerance — Termination tolerance on the function value. Used by fgoalattain, fmincon, fminimax, fminsearch, fminunc, fseminf, fsolve, lsqcurvefit, lsqlin, lsqnonlin, quadprog. optimoptions only; for optimset, use TolFun.
HessianApproximation — Method of Hessian approximation: 'bfgs', 'lbfgs', {'lbfgs',Positive Integer}, or 'finite-difference'. Used by fmincon. optimoptions only; for optimset, use Hessian.
Heuristics — Algorithm for searching for feasible points in the branch-and-bound search (see “Heuristics for Finding Feasible Solutions” on page 9-29). Possible values:
• 'basic'
• 'intermediate'
• 'advanced'
• 'rss'
• 'rins'
• 'round'
• 'diving'
• 'rss-diving'
• 'rins-diving'
• 'round-diving'
• 'none'
Used by intlinprog. optimoptions only.
HeuristicsMaxNodes — Strictly positive integer that bounds the number of nodes intlinprog can explore in its branch-and-bound search for feasible points. See “Heuristics for Finding Feasible Solutions” on page 9-29. Used by intlinprog. optimoptions only.
HonorBounds — The default true ensures that bound constraints are satisfied at every iteration. Turn off by setting to false. Used by fmincon. optimoptions only; for optimset, use AlwaysHonorConstraints.
IntegerPreprocess — Types of integer preprocessing (see “Mixed-Integer Program Preprocessing” on page 9-27). Used by intlinprog. optimoptions only.
RelativeGapTolerance — Real value from 0 through 1. intlinprog stops when the gap between the internally calculated upper (U) and lower (L) bounds on the objective function satisfies
(U – L) / (abs(U) + 1) <= RelativeGapTolerance.
Internally, intlinprog uses the tolerance
tolerance = min(1/(1+|L|), RelativeGapTolerance)
Used by intlinprog. optimoptions only.
RootLPAlgorithm — Algorithm for solving linear programs: 'dual-simplex' (dual simplex algorithm) or 'primal-simplex' (primal simplex algorithm). Used by intlinprog. optimoptions only.
RootLPMaxIterations — Nonnegative integer, the maximum number of simplex algorithm iterations to solve the initial linear programming problem. Used by intlinprog. optimoptions only.
ScaleProblem — For the fmincon interior-point and sqp algorithms, true causes the algorithm to normalize all constraints and the objective function by their initial values. Disable by setting to the default false. Used by fmincon.
SpecifyConstraintGradient — User-defined gradients for the nonlinear constraints. Used by fgoalattain, fmincon, fminimax. optimoptions only; for optimset, use GradConstr.
SpecifyObjectiveGradient — User-defined gradients or Jacobians for the objective functions. Used by fgoalattain, fmincon, fminimax, fminunc, fseminf, fsolve, lsqcurvefit, lsqnonlin. optimoptions only; for optimset, use GradObj or Jacobian.
StepTolerance — Termination tolerance on x. Used by all functions except linprog and lsqlin. optimoptions only; for optimset, use TolX.
Hidden Options
optimoptions “hides” some options, meaning it does not display their values. To learn how to view
these options, and why they are hidden, see “View Options” on page 2-66.
Note FunValCheck does not return an error for Inf when used with fminbnd, fminsearch, or fzero, which handle Inf appropriately.
• 'none' — No
preprocessing.
• 'basic' — Use
preprocessing.
MaxPCGIter — Maximum number of iterations of the preconditioned conjugate gradients method allowed. Used by fmincon, fminunc, fsolve, lsqcurvefit, lsqlin, lsqnonlin, quadprog.
MaxProjCGIter — A tolerance for the number of projected conjugate gradient iterations; this is an inner iteration, not the number of iterations of the algorithm. Used by fmincon.
MaxSQPIter — Maximum number of iterations of the sequential quadratic programming method allowed. Used by fgoalattain, fmincon, fminimax.
MeritFunction — Use the goal attainment/minimax merit function (multiobjective) vs. the fmincon merit function (single objective). Used by fgoalattain, fminimax.
For the reasons these options are hidden, see “Options that optimoptions Hides” (Global Optimization
Toolbox).
See Also
More About
• “Current and Legacy Option Name Tables” on page 15-21
Current and Legacy Option Name Tables
options = optimoptions('fsolve','TolX',1e-4)
options =
fsolve options:
Set properties:
StepTolerance: 1.0000e-04
Default properties:
Algorithm: 'trust-region-dogleg'
CheckGradients: 0
Display: 'final'
FiniteDifferenceStepSize: 'sqrt(eps)'
FiniteDifferenceType: 'forward'
FunctionTolerance: 1.0000e-06
MaxFunctionEvaluations: '100*numberOfVariables'
MaxIterations: 400
OptimalityTolerance: 1.0000e-06
OutputFcn: []
PlotFcn: []
SpecifyObjectiveGradient: 0
TypicalX: 'ones(numberOfVariables,1)'
UseParallel: 0
These two tables have identical information. One is in alphabetical order by legacy option name, the
other is in order by current option name. The tables show values only when the values differ between
legacy and current, and show only names that differ or have different values. For changes in Global
Optimization Toolbox solvers, see “Options Changes in R2016a” (Global Optimization Toolbox).
See Also
More About
• “Optimization Options Reference” on page 15-6
Caution intlinprog output functions and plot functions differ from those in other solvers. See
“intlinprog Output Function and Plot Function Syntax” on page 15-33.
Setting the OutputFcn option, as in
options = optimoptions(@solvername,'OutputFcn',@outfun);
specifies OutputFcn to be the handle to outfun. To specify more than one output function, use
the syntax
options = optimoptions(@solvername,'OutputFcn',{@outfun,@outfun2});
“Passing Extra Parameters” on page 2-57 explains how to parametrize the output function
OutputFcn, if necessary.
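For instance, a sketch of passing an extra parameter a through an anonymous function (outfun here is a hypothetical output function that accepts a fourth argument):

```matlab
a = 0.5;   % hypothetical extra parameter
options = optimoptions(@fmincon, ...
    'OutputFcn',@(x,optimValues,state) outfun(x,optimValues,state,a));
```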
The output function has the syntax
stop = outfun(x,optimValues,state)
where
• optimValues is a structure containing data from the current iteration. “Fields in optimValues” on
page 15-27 describes the structure in detail.
• state is the current state of the algorithm. “States of the Algorithm” on page 15-31 lists the
possible values.
• stop is a flag that is true or false depending on whether the optimization routine should quit or
continue. See “Stop Flag” on page 15-31 for more information.
The optimization function passes the values of the input arguments to outfun at each iteration.
Fields in optimValues
The following table lists the fields of the optimValues structure. A particular optimization function
returns values for only some of these fields. For each field, the Returned by Functions column of the
table lists the functions that return the field.
Some of the fields of optimValues correspond to output arguments of the optimization function.
After the final iteration of the optimization algorithm, the value of such a field equals the
corresponding output argument. For example, optimValues.fval corresponds to the output
argument fval. So, if you call fmincon with an output function and return fval, the final value of
optimValues.fval equals fval. The Description column of the following table indicates the fields
that have a corresponding output argument.
Command-Line Display
The values of some fields of optimValues are displayed at the command line when you call the
optimization function with the Display field of options set to 'iter', as described in “Iterative
Display” on page 3-14. For example, optimValues.fval is displayed in the f(x) column. The
Command-Line Display column of the following table indicates the fields that you can display at the
command line.
The algorithm abbreviations in the following table mean:
• AS — active-set
• D — trust-region-dogleg
• IP — interior-point
• LM — levenberg-marquardt
• Q — quasi-newton
• SQP — sqp
• TR — trust-region
• TRR — trust-region-reflective
Some optimValues fields exist in certain solvers or algorithms, but are always filled with empty or
zero values, so are meaningless. These fields include:
optimValues Fields
maxfval — Maximum function value. Returned by fminimax. Command-line display: None.
positivedefinite — 0 if algorithm detects negative curvature while computing Newton step; 1 otherwise. Returned by fmincon (TRR), fminunc (TR), fsolve (TRR), lsqcurvefit (TRR), lsqnonlin (TRR). Command-line display: None.
resnorm — Squared 2-norm of the residual. Returned by lsqcurvefit, lsqnonlin. Command-line display: Resnorm (see “Iterative Display” on page 3-14).
searchdirection — Search direction. Returned by fgoalattain, fmincon (AS, SQP), fminimax, fminunc (Q), fseminf, fsolve (LM), lsqcurvefit (LM), lsqnonlin (LM). Command-line display: None.
stepaccept — Status of the current trust-region step. Returns true if the current trust-region step was successful, and false if the trust-region step was unsuccessful. Returned by fsolve (D). Command-line display: None.
stepsize — Current step size (displacement in x). Final value equals optimization function output output.stepsize. Returned by fgoalattain, fmincon, fminimax, fminunc, fseminf, fsolve, lsqcurvefit, lsqnonlin. Command-line display: Step-size or Norm of Step (see “Iterative Display” on page 3-14).
trustregionradius — Radius of trust region. Returned by fmincon (IP, TRR), fminunc (TR), fsolve (D, TRR), lsqcurvefit (TRR), lsqnonlin (TRR). Command-line display: Trust-region radius (see “Iterative Display” on page 3-14).
Degeneracy
The value of the field degenerate, which measures the degeneracy of the current optimization point
x, is defined as follows. First, define a vector r, of the same size as x, for which r(i) is the minimum
distance from x(i) to the ith entries of the lower and upper bounds, lb and ub. That is,
r = min(abs(ub-x), abs(x-lb))
Then the value of degenerate is the minimum entry of the vector r + abs(grad), where grad is
the gradient of the objective function. The value of degenerate is 0 if there is an index i for which
both of the following are true:
• grad(i) = 0
• x(i) equals the ith entry of either the lower or upper bound.
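In MATLAB terms, the measure described above is (a sketch; x, lb, ub, and grad are assumed to be in the workspace):

```matlab
% Distance from each component of x to its nearer bound.
r = min(abs(ub - x), abs(x - lb));
% Degeneracy measure: zero when some component sits at a bound
% with a zero gradient component.
degenerate = min(r + abs(grad));
```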
State Description
'init' The algorithm is in the initial state before the first iteration.
'interrupt' The algorithm is in some computationally expensive part of the iteration. In
this state, the output function can interrupt the current iteration of the
optimization. At this time, the values of x and optimValues are the same
as at the last call to the output function in which state=='iter'.
'iter' The algorithm is at the end of an iteration.
'done' The algorithm is in the final state after the last iteration.
The 'interrupt' state occurs only in the fmincon 'active-set' algorithm and the
fgoalattain, fminimax, and fseminf solvers. There, the state can occur before a quadratic
programming subproblem solution or a line search.
The following code illustrates how the output function might use the value of state to decide which
tasks to perform at the current iteration:
switch state
case 'iter'
% Make updates to plot or guis as needed
case 'interrupt'
% Probably no action here. Check conditions to see
% whether optimization should quit.
case 'init'
% Setup for plots or guis
case 'done'
% Cleanup of plots, guis, or final plot
otherwise
end
Stop Flag
The output argument stop is a flag that is true or false. The flag tells the optimization function
whether the optimization should quit or continue. The following examples show typical ways to use
the stop flag.
The output function can stop an optimization at any iteration based on the current data in
optimValues. For example, the following code sets stop to true if the directional derivative is less
than .01:
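A sketch of such an output function (the optimValues.directionalderivative field is available only for the solvers that return it; see the optimValues table):

```matlab
function stop = outfun(x,optimValues,state)
% Stop the solver when the directional derivative drops below .01.
stop = false;
if ~isempty(optimValues.directionalderivative) && ...
        optimValues.directionalderivative < .01
    stop = true;
end
```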
If you design a GUI to perform optimizations, you can make the output function stop an optimization
when a user clicks a Stop button on the GUI. The following code shows how to do this, assuming that
the Stop button callback stores the value true in the optimstop field of a handles structure called
hObject:
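A sketch of the GUI-based stop (hObject and the optimstop app-data field are the hypothetical names described above; the output function is assumed to be nested in the GUI code so that hObject is in scope):

```matlab
function stop = outfun(x,optimValues,state)
% Read the flag stored by the Stop button callback.
stop = getappdata(hObject,'optimstop');
```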
intlinprog Output Function and Plot Function Syntax
Caution intlinprog output functions and plot functions differ from those in other solvers. For
output functions or plot functions in other Optimization Toolbox solvers, see “Output Function
Syntax” on page 15-26 and “Plot Functions” on page 3-27.
• There is one built-in output function: savemilpsolutions. This function collects the integer
feasible points that the algorithm finds at event times. It puts the feasible points in a matrix
named xIntSol in your base workspace, where each column is one integer feasible point. It saves
the objective function values in a vector named fIntSol, where each entry is the objective
function of the corresponding column in xIntSol.
• There is one built-in plot function: optimplotmilp. This function plots the internally-calculated
bounds on the best objective function value. For an example of its use, see “Factory, Warehouse,
Sales Allocation Model: Solver-Based” on page 9-40.
Call output functions or plot functions by passing the OutputFcn or PlotFcn name-value pairs,
including the handle to the output function or plot function. For example,
options = optimoptions(@intlinprog,'OutputFcn',@savemilpsolutions,'PlotFcn',@optimplotmilp);
x = intlinprog(f,intcon,A,b,Aeq,beq,lb,ub,options);
If you have several output functions or plot functions, pass them as a cell array.
options = optimoptions(@intlinprog,'OutputFcn',{@savemilpsolutions,@customFcn});
• stop — Set to true to halt intlinprog. Set to false to allow intlinprog to continue.
• x — Either an empty matrix [] or an N-by-1 vector that is a feasible point. x is nonempty only
when intlinprog finds a new integer feasible solution. x can be nonempty when phase is
'heuristics' or 'branching'.
• 'init' — intlinprog is starting. Use this state to set up any plots or data structures that
you need.
• 'iter' — intlinprog is solving the problem. Access data related to the solver’s progress.
For example, plot or perform file operations.
• 'done' — intlinprog has finished solving the problem. Close any files, finish annotating
plots, etc.
For examples of writing output or plot functions, see the built-in functions savemilpsolutions.m
or optimplotmilp.m.
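A minimal custom output function consistent with the arguments described above (customFcn is the hypothetical name used earlier; treating optimValues.fval as the best objective value found is an assumption based on the optimValues table):

```matlab
function stop = customFcn(x,optimValues,state)
%CUSTOMFCN Sketch of an intlinprog output function.
stop = false;            % let intlinprog continue
switch state
    case 'init'
        % Set up plots or data structures.
    case 'iter'
        if ~isempty(x)   % a new integer feasible point was found
            fprintf('New feasible point, fval = %g\n',optimValues.fval);
        end
    case 'done'
        % Close files, finish annotating plots.
end
```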
optimValues Structure
phase — Phase of the algorithm. Possible values:
16
Functions
EquationProblem
System of nonlinear equations
Description
Specify a system of equations using optimization variables, and solve the system using solve.
Creation
Create an EquationProblem object by using the eqnproblem function. Add equations to the
problem by creating OptimizationEquality objects and setting them as Equations properties of
the EquationProblem object.
prob = eqnproblem;
x = optimvar('x');
eqn = x^5 - x^4 + 3*x == 1/2;
prob.Equations.eqn = eqn;
Warning The problem-based approach does not support complex values in an objective function,
nonlinear equalities, or nonlinear inequalities. If a function calculation has a complex value, even as
an intermediate value, the final result can be incorrect.
Properties
Equations — Problem equations
[] (default) | OptimizationEquality array | structure with OptimizationEquality arrays as
fields
Description — Problem label
string | character vector
Problem label, specified as a string or character vector. The software does not use Description for
computation. Description is an arbitrary label that you can use for any reason. For example, you
can share, archive, or present a model or problem, and store descriptive information about the model
or problem in Description.
Example: "An iterative approach to the Traveling Salesman problem"
Data Types: char | string
Object Functions
optimoptions Create optimization options
prob2struct Convert optimization problem or equation problem to solver form
show Display information about optimization object
solve Solve optimization problem or equation problem
varindex Map problem variables to solver-based variable index
write Save optimization object description
Examples
EquationProblem :
Solve for:
x
eq1:
exp(-exp(-(x(1) + x(2)))) == (x(2) .* (1 + x(1).^2))
eq2:
((x(1) .* cos(x(2))) + (x(2) .* sin(x(1)))) == 0.5
Solve the problem starting from the point [0,0]. For the problem-based approach, specify the initial
point as a structure, with the variable names as the fields of the structure. For this problem, there is
only one variable, x.
x0.x = [0 0];
[sol,fval,exitflag] = solve(prob,x0)
Equation solved.
exitflag =
EquationSolved
disp(sol.x)
0.3532
0.6061
If your equation functions are not composed of elementary functions, you must convert the functions
to optimization expressions using fcn2optimexpr. For the present example:
ls1 = fcn2optimexpr(@(x)exp(-exp(-(x(1)+x(2)))),x);
eq1 = ls1 == x(2)*(1 + x(1)^2);
ls2 = fcn2optimexpr(@(x)x(1)*cos(x(2))+x(2)*sin(x(1)),x);
eq2 = ls2 == 1/2;
For the list of supported functions, see “Supported Operations on Optimization Variables and
Expressions” on page 10-36.
See Also
OptimizationEquality | eqnproblem | fcn2optimexpr | optimvar | show | write
Topics
“Systems of Nonlinear Equations”
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2019b
eqnproblem
Create equation problem
Syntax
prob = eqnproblem
prob = eqnproblem(Name,Value)
Description
prob = eqnproblem creates an equation problem with default properties.
prob = eqnproblem(Name,Value) specifies additional options using one or more name-value pair
arguments. For example, you can specify equations when constructing the problem by using the
Equations name.
Examples
x = optimvar('x',2);
eq1 = exp(-exp(-(x(1) + x(2)))) == x(2).*(1 + x(1).^2);
eq2 = x(1).*cos(x(2)) + x(2).*sin(x(1)) == 0.5;
prob = eqnproblem;
prob.Equations.eq1 = eq1;
prob.Equations.eq2 = eq2;
show(prob)
EquationProblem :
Solve for:
x
eq1:
exp(-exp(-(x(1) + x(2)))) == (x(2) .* (1 + x(1).^2))
eq2:
((x(1) .* cos(x(2))) + (x(2) .* sin(x(1)))) == 0.5
Solve the problem starting from the point [0,0]. For the problem-based approach, specify the initial
point as a structure, with the variable names as the fields of the structure. For this problem, there is
only one variable, x.
x0.x = [0 0];
[sol,fval,exitflag] = solve(prob,x0)
Equation solved.
exitflag =
EquationSolved
disp(sol.x)
0.3532
0.6061
If your equation functions are not composed of elementary functions, you must convert the functions
to optimization expressions using fcn2optimexpr. For the present example:
ls1 = fcn2optimexpr(@(x)exp(-exp(-(x(1)+x(2)))),x);
eq1 = ls1 == x(2)*(1 + x(1)^2);
ls2 = fcn2optimexpr(@(x)x(1)*cos(x(2))+x(2)*sin(x(1)),x);
eq2 = ls2 == 1/2;
For the list of supported functions, see “Supported Operations on Optimization Variables and
Expressions” on page 10-36.
The matrix equation
x^3 = [1 2
       3 4]
is a system of polynomial equations. Here, x^3 means x*x*x using matrix multiplication. You can
easily formulate and solve this system using the problem-based approach.
x = optimvar('x',2,2);
eqn = x^3 == [1 2;3 4];
prob = eqnproblem('Equations',eqn);
x0.x = ones(2);
sol = solve(prob,x0)
Equation solved.
disp(sol.x)
-0.1291 0.8602
1.2903 1.1612
sol.x^3
ans = 2×2
1.0000 2.0000
3.0000 4.0000
16-8
Input Arguments
Name-Value Pair Arguments
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and
Value is the corresponding value. Name must appear inside quotes. You can specify several name and
value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: prob = eqnproblem('Equations',eqn)
Description — Problem label
'' (default) | string | character vector

Problem label, specified as a string or character vector. The software does not use Description for
computation. Description is an arbitrary label that you can use for any reason. For example, you
can share, archive, or present a model or problem, and store descriptive information about the model
or problem in Description.
Example: "An iterative approach to the Traveling Salesman problem"
Data Types: char | string
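For example, you can label a problem when you create it (a minimal sketch; the variable and equation here are illustrative, not from this page):

```matlab
x = optimvar('x',2);
eq1 = sum(x) == 1;   % illustrative equation
prob = eqnproblem('Equations',eq1, ...
    'Description',"Sum-to-one equation in a two-element variable");
```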
Output Arguments
prob — Equation problem
EquationProblem object

Equation problem, returned as an EquationProblem object. The equations in the problem are in the
Equations property. Solve a completely defined problem by calling solve.
Warning The problem-based approach does not support complex values in an objective function,
nonlinear equalities, or nonlinear inequalities. If a function calculation has a complex value, even as
an intermediate value, the final result can be incorrect.
See Also
EquationProblem | OptimizationEquality | optimvar | solve
Topics
“Systems of Nonlinear Equations”
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
16-9
Introduced in R2019b
16-10
evaluate
Package: optim.problemdef
Syntax
val = evaluate(expr,pt)
Description
val = evaluate(expr,pt) returns the value of the optimization expression expr at the value pt.
Examples
val = 1×2
-3 12
16-11
prob.Constraints.cons5 = x + y >= 1;
prob.Constraints.cons6 = -x + y <= 2;
sol = solve(prob)
val = evaluate(prob.Objective,sol)
val = -1.1111
Input Arguments
expr — Optimization expression
OptimizationExpression object
pt — Values of variables in expression
structure

Values of variables in expression, specified as a structure. The structure pt has the following
requirements:
• The field names of pt must match the names of the variables in expr.
• The field values must be numeric and of the same size as the corresponding variables.
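As a minimal sketch of this format (the expression and point here are illustrative, not taken from this page):

```matlab
% Hypothetical expression in a 2-by-1 variable x
x = optimvar('x',2);
expr = 3*x(1) + 4*x(2);

% pt has one field per variable, sized like the variable
pt.x = [1; 2];
val = evaluate(expr,pt)   % 3*1 + 4*2 = 11
```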
Output Arguments
val — Numeric value of expression
double

Numeric value of the expression, returned as a double array.
Warning The problem-based approach does not support complex values in an objective function,
nonlinear equalities, or nonlinear inequalities. If a function calculation has a complex value, even as
an intermediate value, the final result can be incorrect.
16-12
See Also
OptimizationExpression | infeasibility | solve
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2017b
16-13
fcn2optimexpr
Convert function to optimization expression
Package: optim.problemdef
Syntax
[out1,out2,...,outN] = fcn2optimexpr(fcn,in1,in2,...,inK)
[out1,out2,...,outN] = fcn2optimexpr(fcn,in1,in2,...,inK,Name,Value)
Description
[out1,out2,...,outN] = fcn2optimexpr(fcn,in1,in2,...,inK) converts the function
fcn(in1,in2,...,inK) to an optimization expression with N outputs.
Examples
To use the objective function gamma (the mathematical function Γ(x), an extension of the factorial
function), create an optimization variable x and use it in a converted anonymous function.
x = optimvar('x');
obj = fcn2optimexpr(@gamma,x);
prob = optimproblem('Objective',obj);
show(prob)
OptimizationProblem :
Solve for:
x
minimize :
gamma(x)
To solve the resulting problem, give an initial point structure and call solve.
x0.x = 1/2;
sol = solve(prob,x0)
16-14
For more complex functions, convert a function file. The function file gammabrock.m computes an
objective of two optimization variables.
type gammabrock
function f = gammabrock(x,y)
f = (10*(y - gamma(x)))^2 + (1 - x)^2;
x = optimvar('x','LowerBound',0);
y = optimvar('y');
obj = fcn2optimexpr(@gammabrock,x,y);
prob = optimproblem('Objective',obj);
show(prob)
OptimizationProblem :
Solve for:
x, y
minimize :
gammabrock(x, y)
variable bounds:
0 <= x
The gammabrock function is a sum of squares. You get a more efficient problem formulation by
expressing the function as an explicit sum of squares of optimization expressions.
f = fcn2optimexpr(@(x,y)y - gamma(x),x,y);
obj2 = (10*f)^2 + (1-x)^2;
prob2 = optimproblem('Objective',obj2);
To see the difference in efficiency, solve prob and prob2 and examine the differences in number of
iterations.
x0.x = 1/2;
x0.y = 1/2;
[sol,fval,~,output] = solve(prob,x0);
[sol2,fval2,~,output2] = solve(prob2,x0);
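The returned output structures record the solver effort. A quick comparison (a sketch using the variables above; the actual counts depend on the solver and release):

```matlab
fprintf('prob  took %d iterations\n',output.iterations);
fprintf('prob2 took %d iterations\n',output2.iterations);
```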
16-15
If your function has several outputs, you can use them as elements of the objective function. In this
case, u is a 2-by-2 variable, v is a 2-by-1 variable, and expfn3 has three outputs.
type expfn3
Create appropriately sized optimization variables, and create an objective function from the first two
outputs.
u = optimvar('u',2,2);
v = optimvar('v',2);
[f,g,mineval] = fcn2optimexpr(@expfn3,u,v);
prob = optimproblem;
prob.Objective = f*g/(1 + f^2);
show(prob)
OptimizationProblem :
Solve for:
u, v
minimize :
((arg3 .* arg4) ./ (1 + arg2.^2))
where:
16-16
Create the nonlinear constraint that gammafn2 is less than or equal to –1/2. This function of two
variables is in the gammafn2.m file.
type gammafn2
function f = gammafn2(x,y)
f = -gamma(x)*(y/(1+y^2));
Create optimization variables, convert the function file to an optimization expression, and then
express the constraint as confn.
x = optimvar('x','LowerBound',0);
y = optimvar('y','LowerBound',0);
expr1 = fcn2optimexpr(@gammafn2,x,y);
confn = expr1 <= -1/2;
show(confn)
  gammafn2(x, y) <= -0.5
Create a second constraint that gammafn2 is greater than or equal to x + y. Place both constraints
in an optimization problem, and review the problem.
confn2 = expr1 >= x + y;
prob = optimproblem;
prob.Constraints.confn = confn;
prob.Constraints.confn2 = confn2;
show(prob)
OptimizationProblem :
Solve for:
x, y
minimize :
subject to confn:
gammafn2(x, y) <= -0.5
subject to confn2:
gammafn2(x, y) >= (x + y)
variable bounds:
0 <= x
0 <= y
If your problem involves a common, time-consuming function to compute the objective and nonlinear
constraint, you can save time by using the 'ReuseEvaluation' name-value pair argument. The
rosenbrocknorm function computes both the Rosenbrock objective function and the norm of the
argument for use in the constraint ‖x‖² ≤ 4.
type rosenbrocknorm

function [f,c] = rosenbrocknorm(x)
pause(1) % Simulates an expensive computation
c = dot(x,x);
f = 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;

Create a 2-by-1 optimization variable x. Convert rosenbrocknorm to optimization expressions,
setting 'ReuseEvaluation' to true so that the function is evaluated only once per point.
x = optimvar('x',2);
[f,c] = fcn2optimexpr(@rosenbrocknorm,x,'ReuseEvaluation',true);
16-17
Create objective and constraint expressions from the returned expressions. Include the objective and
constraint expressions in an optimization problem. Review the problem using show.
prob = optimproblem('Objective',f);
prob.Constraints.cineq = c <= 4;
show(prob)
OptimizationProblem :
Solve for:
x
minimize :
[argout,~] = rosenbrocknorm(x)
subject to cineq:
arg_LHS <= 4
where:
[~,arg_LHS] = rosenbrocknorm(x);
Solve the problem starting from the initial point x0.x = [-1;1], timing the result.
x0.x = [-1;1];
tic
[sol,fval,exitflag,output] = solve(prob,x0)
fval = 3.6604e-11
exitflag =
OptimalSolution
16-18
toc
The solution time in seconds is nearly the same as the number of function evaluations. This result
indicates that the solver reused function values, and did not waste time by reevaluating the same
point twice.
For a more extensive example, see “Objective and Constraints Having a Common Function in Serial
or Parallel, Problem-Based” on page 2-52.
Input Arguments
fcn — Function to convert
function handle

Function to convert, specified as a function handle.
in — Input argument
MATLAB variable
Input argument, specified as a MATLAB variable. The input can have any data type and any size.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 |
logical | char | string | struct | table | cell | function_handle | categorical | datetime
| duration | calendarDuration | fi
Complex Number Support: Yes
Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and
Value is the corresponding value. Name must appear inside quotes. You can specify several name and
value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: [out1,out2] = fcn2optimexpr(@fun,x,y,'OutputSize',
[1,1],'ReuseEvaluation',true) specifies that out1 and out2 are scalars that will be reused
between objective and constraint functions without recalculation.
16-19
OutputSize — Size of output expressions
integer vector | cell array of integer vectors

Size of the output expressions, specified as one of the following:
• An integer vector — If the function has one output out1, OutputSize specifies the size of out1.
If the function has multiple outputs out1,…,outN, OutputSize specifies that all outputs have the
same size.
• A cell array of integer vectors — The size of output outj is the jth element of OutputSize.
If you do not specify the 'OutputSize' name-value pair argument, then fcn2optimexpr passes
data to fcn in order to determine the size of the outputs (see “Algorithms” on page 16-20). By
specifying 'OutputSize', you enable fcn2optimexpr to skip this step, which saves time. Also, if
you do not specify 'OutputSize' and the evaluation of fcn fails for any reason, then
fcn2optimexpr fails as well.
Example: [out1,out2,out3] = fcn2optimexpr(@fun,x,'OutputSize',[1,1]) specifies that
the three outputs [out1,out2,out3] are scalars.
Example: [out1,out2] = fcn2optimexpr(@fun,x,'OutputSize',{[4,4],[3,5]}) specifies
that out1 has size 4-by-4 and out2 has size 3-by-5.
Data Types: double | cell
ReuseEvaluation — Indicator to reuse values
false (default) | true

Indicator to reuse values, specified as false (do not reuse) or true (reuse).
'ReuseEvaluation' can make your problem run faster when, for example, the objective and some
nonlinear constraints rely on a common calculation. In this case, the solver stores the value for reuse
wherever needed and avoids recalculating the value.
Reusable values involve some overhead, so it is best to enable reusable values only for expressions
that share a value.
Example: [out1,out2,out3] = fcn2optimexpr(@fun,x,'ReuseEvaluation',true) allows
out1, out2, and out3 to be used in multiple computations, with the outputs being calculated only
once per evaluation point.
Data Types: logical
Output Arguments
out — Output argument
OptimizationExpression

Output argument, returned as an OptimizationExpression object.
Algorithms
To find the output size of each returned expression when you do not specify OutputSize,
fcn2optimexpr evaluates the function at the following point for each element of the problem
variables.
16-20
An evaluation point might lead to an error in function evaluation. To avoid this error, specify
'OutputSize'.
See Also
Topics
“Problem-Based Optimization Workflow” on page 10-2
“Optimization Expressions” on page 10-6
Introduced in R2019a
16-21
fgoalattain
Solve multiobjective goal attainment problems
Syntax
x = fgoalattain(fun,x0,goal,weight)
x = fgoalattain(fun,x0,goal,weight,A,b)
x = fgoalattain(fun,x0,goal,weight,A,b,Aeq,beq)
x = fgoalattain(fun,x0,goal,weight,A,b,Aeq,beq,lb,ub)
x = fgoalattain(fun,x0,goal,weight,A,b,Aeq,beq,lb,ub,nonlcon)
x = fgoalattain(fun,x0,goal,weight,A,b,Aeq,beq,lb,ub,nonlcon,options)
x = fgoalattain(problem)
[x,fval] = fgoalattain( ___ )
[x,fval,attainfactor,exitflag,output] = fgoalattain( ___ )
[x,fval,attainfactor,exitflag,output,lambda] = fgoalattain( ___ )
Description
fgoalattain solves the goal attainment problem, a formulation for minimizing a multiobjective
optimization problem.
weight, goal, b, and beq are vectors, A and Aeq are matrices, and F(x), c(x), and ceq(x), are
functions that return vectors. F(x), c(x), and ceq(x) can be nonlinear functions.
x, lb, and ub can be passed as vectors or matrices; see “Matrix Arguments” on page 2-31.
Note “Passing Extra Parameters” on page 2-57 explains how to pass extra parameters to the
objective functions and nonlinear constraint functions, if necessary.
16-22
Note If the specified input bounds for a problem are inconsistent, the output x is x0 and the output
fval is [].
x = fgoalattain(problem) solves the goal attainment problem for problem, where problem is a
structure described in problem. Create the problem structure by exporting a problem from the
Optimization app, as described in “Exporting Your Work” on page 5-9.
[x,fval] = fgoalattain( ___ ), for any syntax, returns the values of the objective functions
computed in fun at the solution x.
Examples
Consider the two-objective function
F(x) = [2 + (x − 3)²; 5 + x²/4].
This function clearly minimizes F1(x) at x = 3, attaining the value 2, and minimizes F2(x) at x = 0,
attaining the value 5.
Set the goal [3,6] and weight [1,1], and solve the goal attainment problem starting at x0 = 1.
fun = @(x)[2+(x-3)^2;5+x^2/4];
goal = [3,6];
weight = [1,1];
x0 = 1;
x = fgoalattain(fun,x0,goal,weight)
16-23
fgoalattain stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x = 2.0000
Check the objective function values at the solution.
fun(x)
ans = 2×1
3.0000
6.0000
Consider the objective function
F(x) = [2 + ‖x − p1‖²; 5 + ‖x − p2‖²/4].
Here, p_1 = [2,3] and p_2 = [4,1]. The goal is [3,6], the weight is [1,1], and the linear constraint is
x1 + x2 ≤ 4.
p_1 = [2,3];
p_2 = [4,1];
fun = @(x)[2 + norm(x-p_1)^2;5 + norm(x-p_2)^2/4];
goal = [3,6];
weight = [1,1];
A = [1,1];
b = 4;
Set an initial point [1,1] and solve the goal attainment problem.
x0 = [1,1];
x = fgoalattain(fun,x0,goal,weight,A,b)
fgoalattain stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x = 1×2
16-24
2.0694 1.9306
fun(x)
ans = 2×1
3.1484
6.1484
fgoalattain does not meet the goals. Because the weights are equal, the solver underachieves
each goal by the same amount.
Consider the objective function
F(x) = [2 + ‖x − p1‖²; 5 + ‖x − p2‖²/4].
Here, p_1 = [2,3] and p_2 = [4,1]. The goal is [3,6], the weight is [1,1], and the bounds are 0 ≤ x1 ≤ 3,
2 ≤ x2 ≤ 5.
p_1 = [2,3];
p_2 = [4,1];
fun = @(x)[2 + norm(x-p_1)^2;5 + norm(x-p_2)^2/4];
goal = [3,6];
weight = [1,1];
lb = [0,2];
ub = [3,5];
Set the initial point to [1,4] and solve the goal attainment problem.
x0 = [1,4];
A = []; % no linear constraints
b = [];
Aeq = [];
beq = [];
x = fgoalattain(fun,x0,goal,weight,A,b,Aeq,beq,lb,ub)
fgoalattain stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x = 1×2
16-25
2.6667 2.3333
fun(x)
ans = 2×1
2.8889
5.8889
fgoalattain more than meets the goals. Because the weights are equal, the solver overachieves
each goal by the same amount.
Consider the objective function
F(x) = [2 + ‖x − p1‖²; 5 + ‖x − p2‖²/4].
Here, p_1 = [2,3] and p_2 = [4,1]. The goal is [3,6], the weight is [1,1], and the nonlinear constraint is
‖x‖² ≤ 4.
p_1 = [2,3];
p_2 = [4,1];
fun = @(x)[2 + norm(x-p_1)^2;5 + norm(x-p_2)^2/4];
goal = [3,6];
weight = [1,1];
Create empty input arguments for the linear constraints and bounds.
A = [];
Aeq = [];
b = [];
beq = [];
lb = [];
ub = [];
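The call below passes the nonlinear constraint as the function handle @norm4. The norm4 file is not shown in this example; a minimal sketch consistent with the constraint ‖x‖² ≤ 4 (the body is an assumption):

```matlab
function [c,ceq] = norm4(x)
% Nonlinear inequality c(x) <= 0 enforces norm(x)^2 <= 4
c = norm(x)^2 - 4;
ceq = [];   % no nonlinear equality constraints
end
```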
Set the initial point to [1,1] and solve the goal attainment problem.
x0 = [1,1];
x = fgoalattain(fun,x0,goal,weight,A,b,Aeq,beq,lb,ub,@norm4)
16-26
fgoalattain stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x = 1×2
1.1094 1.6641
fun(x)
ans = 2×1
4.5778
7.1991
fgoalattain does not meet the goals. Despite the equal weights, F1(x) is about 1.58 from its goal of
3, and F2(x) is about 1.2 from its goal of 6. The nonlinear constraint prevents the solution x from
achieving the goals equally.
Monitor a goal attainment solution process by setting options to return iterative display.
options = optimoptions('fgoalattain','Display','iter');
Consider the objective function
F(x) = [2 + ‖x − p1‖²; 5 + ‖x − p2‖²/4].
Here, p_1 = [2,3] and p_2 = [4,1]. The goal is [3,6], the weight is [1,1], and the linear constraint is
x1 + x2 ≤ 4.
p_1 = [2,3];
p_2 = [4,1];
fun = @(x)[2 + norm(x-p_1)^2;5 + norm(x-p_2)^2/4];
goal = [3,6];
weight = [1,1];
A = [1,1];
b = 4;
Create empty input arguments for the linear equality constraints, bounds, and nonlinear constraints.
16-27
Aeq = [];
beq = [];
lb = [];
ub = [];
nonlcon = [];
Set an initial point [1,1] and solve the goal attainment problem.
x0 = [1,1];
x = fgoalattain(fun,x0,goal,weight,A,b,Aeq,beq,lb,ub,nonlcon,options)
fgoalattain stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x = 1×2
2.0694 1.9306
The positive value of the reported attainment factor indicates that fgoalattain does not find a
solution satisfying the goals.
Consider the objective function
F(x) = [2 + ‖x − p1‖²; 5 + ‖x − p2‖²/4].
Here, p_1 = [2,3] and p_2 = [4,1]. The goal is [3,6], the weight is [1,1], and the linear constraint is
x1 + x2 ≤ 4.
p_1 = [2,3];
p_2 = [4,1];
fun = @(x)[2 + norm(x-p_1)^2;5 + norm(x-p_2)^2/4];
goal = [3,6];
weight = [1,1];
16-28
A = [1,1];
b = 4;
Set an initial point [1,1] and solve the goal attainment problem. Request the value of the objective
function.
x0 = [1,1];
[x,fval] = fgoalattain(fun,x0,goal,weight,A,b)
fgoalattain stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x = 1×2
2.0694 1.9306
fval = 2×1
3.1484
6.1484
The objective function values are higher than the goal, meaning fgoalattain does not satisfy the
goal.
Consider the objective function
F(x) = [2 + ‖x − p1‖²; 5 + ‖x − p2‖²/4].
Here, p_1 = [2,3] and p_2 = [4,1]. The goal is [3,6], the weight is [1,1], and the linear constraint is
x1 + x2 ≤ 4.
p_1 = [2,3];
p_2 = [4,1];
fun = @(x)[2 + norm(x-p_1)^2;5 + norm(x-p_2)^2/4];
goal = [3,6];
weight = [1,1];
A = [1,1];
b = 4;
Set an initial point [1,1] and solve the goal attainment problem. Request the value of the objective
function, attainment factor, exit flag, output structure, and Lagrange multipliers.
16-29
x0 = [1,1];
[x,fval,attainfactor,exitflag,output,lambda] = fgoalattain(fun,x0,goal,weight,A,b)
fgoalattain stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x = 1×2
2.0694 1.9306
fval = 2×1
3.1484
6.1484
attainfactor = 0.1484
exitflag = 4
The positive value of attainfactor indicates that the goals are not attained; you can also see this
by comparing fval with goal.
The lambda.ineqlin value is nonzero, indicating that the linear inequality constrains the solution.
Consider the objective function
F(x) = [2 + ‖x − p1‖²; 5 + ‖x − p2‖²/4].
16-30
Here, p_1 = [2,3] and p_2 = [4,1]. The goal is [3,6], and the initial weight is [1,1].
p_1 = [2,3];
p_2 = [4,1];
fun = @(x)[2 + norm(x-p_1)^2;5 + norm(x-p_2)^2/4];
goal = [3,6];
weight = [1,1];
A = [1 1];
b = 4;
Solve the goal attainment problem starting from the point x0 = [1 1].
x0 = [1 1];
[x,fval] = fgoalattain(fun,x0,goal,weight,A,b)
fgoalattain stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x = 1×2
2.0694 1.9306
fval = 2×1
3.1484
6.1484
Each component of fval is above the corresponding component of goal, indicating that the goals
are not attained.
Increase the importance of satisfying the first goal by setting weight(1) to a smaller value.
weight(1) = 1/10;
[x,fval] = fgoalattain(fun,x0,goal,weight,A,b)
fgoalattain stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x = 1×2
2.0115 1.9885
fval = 2×1
3.0233
16-31
6.2328
Now the value of fval(1) is much closer to goal(1), whereas fval(2) is farther from goal(2).
Change goal(2) to 7, which is above the current solution. The solution changes.
goal(2) = 7;
[x,fval] = fgoalattain(fun,x0,goal,weight,A,b)
fgoalattain stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x = 1×2
1.9639 2.0361
fval = 2×1
2.9305
6.3047
Both components of fval are less than the corresponding components of goal. But fval(1) is
much closer to goal(1) than fval(2) is to goal(2). A smaller weight is more likely to make its
component nearly satisfied when the goals cannot be achieved, but makes the degree of
overachievement less when the goal can be achieved.
Change the weights to be equal. The fval results have equal distance from their goals.
weight(2) = 1/10;
[x,fval] = fgoalattain(fun,x0,goal,weight,A,b)
fgoalattain stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x = 1×2
1.7613 2.2387
fval = 2×1
2.6365
6.6365
Constraints can keep the resulting fval from being equally close to the goals. For example, set an
upper bound of 2 on x(2).
ub = [Inf,2];
lb = [];
16-32
Aeq = [];
beq = [];
[x,fval] = fgoalattain(fun,x0,goal,weight,A,b,Aeq,beq,lb,ub)
fgoalattain stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x = 1×2
2.0000 2.0000
fval = 2×1
3.0000
6.2500
In this case, fval(1) meets its goal exactly, but fval(2) is less than its goal.
Input Arguments
fun — Objective functions
function handle | function name
Objective functions, specified as a function handle or function name. fun is a function that accepts a
vector x and returns a vector F, the objective functions evaluated at x. You can specify the function
fun as a function handle for a function file:
x = fgoalattain(@myfun,x0,goal,weight)
function F = myfun(x)
F = ... % Compute function values at x.
x = fgoalattain(@(x)sin(x.*x),x0,goal,weight);
If the user-defined values for x and F are arrays, fgoalattain converts them to vectors using linear
indexing (see “Array Indexing” (MATLAB)).
To make an objective function as near as possible to a goal value (that is, neither greater than nor
less than), use optimoptions to set the EqualityGoalCount option to the number of objectives
required to be in the neighborhood of the goal values. Such objectives must be partitioned into the
first elements of the vector F returned by fun.
Suppose that the gradient of the objective function can also be computed and the
SpecifyObjectiveGradient option is true, as set by:
options = optimoptions('fgoalattain','SpecifyObjectiveGradient',true)
16-33
In this case, the function fun must return, in the second output argument, the gradient value G (a
matrix) at x. The gradient consists of the partial derivative dF/dx of each F at the point x. If F is a
vector of length m and x has length n, where n is the length of x0, then the gradient G of F(x) is an
n-by-m matrix where G(i,j) is the partial derivative of F(j) with respect to x(i) (that is, the jth
column of G is the gradient of the jth objective function F(j)).
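As a sketch of this layout, consider a hypothetical two-variable, two-objective function (not taken from this reference page). Its gradient G is n-by-m = 2-by-2, with column j holding the gradient of F(j):

```matlab
function [F,G] = myobjWithGrad(x)
% Two objectives of two variables (n = 2, m = 2)
F = [2 + (x(1) - 3)^2;
     5 + x(2)^2/4];
if nargout > 1   % gradient requested
    % G(i,j) = dF(j)/dx(i), so G is n-by-m
    G = [2*(x(1) - 3),  0;
         0,             x(2)/2];
end
end
```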
Note Setting SpecifyObjectiveGradient to true is effective only when the problem has no
nonlinear constraints, or the problem has a nonlinear constraint with
SpecifyConstraintGradient set to true. Internally, the objective is folded into the constraints,
so the solver needs both gradients (objective and constraint) supplied in order to avoid estimating a
gradient.
x0 — Initial point
real vector | real array
Initial point, specified as a real vector or real array. Solvers use the number of elements in x0 and the
size of x0 to determine the number and size of variables that fun accepts.
Example: x0 = [1,2,3,4]
Data Types: double
goal — Goal to attain
real vector

Goal to attain, specified as a real vector. fgoalattain attempts to find the smallest multiplier γ that
makes these inequalities hold for all values of i at the solution x:
F_i(x) − goal_i ≤ weight_i·γ.
• If the solver finds a point x that simultaneously achieves all the goals, then the attainment factor γ
is negative, and the goals are overachieved.
• If the solver cannot find a point x that simultaneously achieves all the goals, then the attainment
factor γ is positive, and the goals are underachieved.
Example: [1 3 6]
Data Types: double
weight — Relative attainment factor
real vector

Relative attainment factor, specified as a real vector. fgoalattain attempts to find the smallest
multiplier γ that makes these inequalities hold for all values of i at the solution x:
F_i(x) − goal_i ≤ weight_i·γ.
When the values of goal are all nonzero, to ensure the same percentage of underachievement or
overattainment of the active objectives, set weight to abs(goal). (The active objectives are the set
of objectives that are barriers to further improvement of the goals at the solution.)
16-34
Note Setting a component of the weight vector to zero causes the corresponding goal constraint to
be treated as a hard constraint rather than a goal constraint. An alternative method to setting a hard
constraint is to use the input argument nonlcon.
When weight is positive, fgoalattain attempts to make the objective functions less than the goal
values. To make the objective functions greater than the goal values, set weight to be negative
rather than positive. To see some effects of weights on a solution, see “Effects of Weights, Goals, and
Constraints in Goal Attainment” on page 16-30.
To make an objective function as near as possible to a goal value, use the EqualityGoalCount
option and specify the objective as the first element of the vector returned by fun (see fun and
options). For an example, see “Multi-Objective Goal Attainment Optimization” on page 8-18.
Example: abs(goal)
Data Types: double
A — Linear inequality constraints
real matrix

Linear inequality constraints, specified as a real matrix. A is an M-by-N matrix, where M is the number
of inequalities, and N is the number of variables (number of elements in x0). For large problems, pass
A as a sparse matrix.
A*x <= b,
where x is the column vector of N variables x(:), and b is a column vector with M elements.
x1 + 2x2 ≤ 10
3x1 + 4x2 ≤ 20
5x1 + 6x2 ≤ 30,
A = [1,2;3,4;5,6];
b = [10;20;30];
Example: To specify that the x components sum to 1 or less, use A = ones(1,N) and b = 1.
Data Types: double
b — Linear inequality constraints
real vector

Linear inequality constraints, specified as a real vector. b is an M-element vector related to the A
matrix. If you pass b as a row vector, solvers internally convert b to the column vector b(:). For
large problems, pass b as a sparse vector.
A*x <= b,
16-35
where x is the column vector of N variables x(:), and A is a matrix of size M-by-N.
x1 + 2x2 ≤ 10
3x1 + 4x2 ≤ 20
5x1 + 6x2 ≤ 30,
Example: To specify that the x components sum to 1 or less, use A = ones(1,N) and b = 1.
Data Types: double
Aeq — Linear equality constraints
real matrix

Linear equality constraints, specified as a real matrix. Aeq is an Me-by-N matrix, where Me is the
number of equalities, and N is the number of variables (number of elements in x0). For large
problems, pass Aeq as a sparse matrix.
Aeq*x = beq,
where x is the column vector of N variables x(:), and beq is a column vector with Me elements.
x1 + 2x2 + 3x3 = 10
2x1 + 4x2 + x3 = 20,
Example: To specify that the x components sum to 1, use Aeq = ones(1,N) and beq = 1.
Data Types: double
beq — Linear equality constraints
real vector

Linear equality constraints, specified as a real vector. beq is an Me-element vector related to the Aeq
matrix. If you pass beq as a row vector, solvers internally convert beq to the column vector beq(:).
For large problems, pass beq as a sparse vector.
Aeq*x = beq,
where x is the column vector of N variables x(:), and Aeq is a matrix of size Me-by-N.
16-36
x1 + 2x2 + 3x3 = 10
2x1 + 4x2 + x3 = 20,
Aeq = [1,2,3;2,4,1];
beq = [10;20];
Example: To specify that the x components sum to 1, use Aeq = ones(1,N) and beq = 1.
Data Types: double
lb — Lower bounds
real vector | real array
Lower bounds, specified as a real vector or real array. If the number of elements in x0 is equal to the
number of elements in lb, then lb specifies that
x(i) >= lb(i) for all i.
ub — Upper bounds
real vector | real array
Upper bounds, specified as a real vector or real array. If the number of elements in x0 is equal to the
number of elements in ub, then ub specifies that
x(i) <= ub(i) for all i.
nonlcon — Nonlinear constraints
function handle | function name

Nonlinear constraints, specified as a function handle or function name. nonlcon is a function that
accepts a vector or array x and returns two arrays, c(x) and ceq(x). fgoalattain attempts to satisfy
c(x) <= 0 and ceq(x) = 0 at the solution.
16-37
For example,
x = fgoalattain(@myfun,x0,...,@mycon)
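where mycon is a MATLAB function file such as the following sketch (the assignments are placeholders for your own constraint computations):

```matlab
function [c,ceq] = mycon(x)
c = ...     % Compute nonlinear inequalities at x.
ceq = ...   % Compute nonlinear equalities at x.
```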
Suppose that the gradients of the constraints can also be computed and the
SpecifyConstraintGradient option is true, as set by:
options = optimoptions('fgoalattain','SpecifyConstraintGradient',true)
In this case, the function nonlcon must also return, in the third and fourth output arguments, GC, the
gradient of c(x), and GCeq, the gradient of ceq(x). See “Nonlinear Constraints” on page 2-37 for an
explanation of how to “conditionalize” the gradients for use in solvers that do not accept supplied
gradients.
If nonlcon returns a vector c of m components and x has length n, where n is the length of x0, then
the gradient GC of c(x) is an n-by-m matrix, where GC(i,j) is the partial derivative of c(j) with
respect to x(i) (that is, the jth column of GC is the gradient of the jth inequality constraint c(j)).
Likewise, if ceq has p components, the gradient GCeq of ceq(x) is an n-by-p matrix, where
GCeq(i,j) is the partial derivative of ceq(j) with respect to x(i) (that is, the jth column of GCeq
is the gradient of the jth equality constraint ceq(j)).
Note Because Optimization Toolbox functions accept only inputs of type double, user-supplied
objective and nonlinear constraint functions must return outputs of type double.
See “Passing Extra Parameters” on page 2-57 for an explanation of how to parameterize the nonlinear
constraint function nonlcon, if necessary.
Data Types: char | function_handle | string
options — Optimization options
output of optimoptions | structure such as optimset returns

Optimization options, specified as the output of optimoptions or a structure such as optimset
returns.

Some options are absent from the optimoptions display. These options appear in italics in the
following table. For details, see “View Options” on page 2-66.
For details about options that have different names for optimset, see “Current and Legacy Option
Name Tables” on page 15-21.
16-38
Option Description
ConstraintTolerance Termination tolerance on the constraint violation,
a positive scalar. The default is 1e-6. See
“Tolerances and Stopping Criteria” on page 2-68.
16-39
FiniteDifferenceStepSize Scalar or vector step size factor for finite
differences. When you set
FiniteDifferenceStepSize to a vector v, the
forward finite differences delta are
delta = v.*sign′(x).*max(abs(x),TypicalX);
where sign′(x) = sign(x) except sign′(0) = 1. Central finite differences are
delta = v.*max(abs(x),TypicalX);
16-40
MaxIterations Maximum number of iterations allowed (a
positive integer). The default is 400. See
“Tolerances and Stopping Criteria” on page 2-68
and “Iterations and Function Counts” on page 3-9.
16-41
PlotFcn Plots showing various measures of progress while
the algorithm executes. Select from predefined
plots or write your own. Pass a name, function
handle, or cell array of names or function
handles. For custom plot functions, pass function
handles. The default is none ([]).
16-42
SpecifyObjectiveGradient Gradient for the objective function defined by the
user. Refer to the description of fun to see how
to define the gradient. Set this option to true to
have fgoalattain use a user-defined gradient
of the objective function. The default, false,
causes fgoalattain to estimate gradients using
finite differences.
Example: optimoptions('fgoalattain','PlotFcn','optimplotfval')
16-43
problem — Problem structure
structure

You must supply at least the objective, x0, goal, weight, solver, and options fields in the
problem structure.
The simplest way to obtain a problem structure is to export the problem from the Optimization app.
Data Types: struct
Output Arguments
x — Solution
real vector | real array
Solution, returned as a real vector or real array. The size of x is the same as the size of x0. Typically,
x is a local solution to the problem when exitflag is positive. For information on the quality of the
solution, see “When the Solver Succeeds” on page 4-18.
fval — Objective function values
real array

Objective function values at the solution, returned as a real array. Generally, fval = fun(x).
attainfactor — Attainment factor
real number

Attainment factor, returned as a real number. attainfactor contains the value of γ at the solution.
If attainfactor is negative, the goals have been overachieved; if attainfactor is positive, the
goals have been underachieved. See goal.
16-44
output — Information about the optimization process
structure

Information about the optimization process, returned as a structure with the fields in this table.
lambda — Lagrange multipliers at the solution
structure

Lagrange multipliers at the solution, returned as a structure with the fields in this table.
Algorithms
For a description of the fgoalattain algorithm and a discussion of goal attainment concepts, see
“Algorithms” on page 8-3.
Extended Capabilities
Automatic Parallel Support
Accelerate code by automatically running computation in parallel using Parallel Computing Toolbox™.
options = optimoptions('solvername','UseParallel',true)
For more information, see “Using Parallel Computing in Optimization Toolbox” on page 14-5.
See Also
fmincon | fminimax | optimoptions | optimtool
Topics
“Multi-Objective Goal Attainment Optimization” on page 8-18
“Generate and Plot a Pareto Front” on page 8-15
“Create Function Handle” (MATLAB)
“Multiobjective Optimization”
findindex
Package: optim.problemdef
Syntax
[numindex1,numindex2,...,numindexk] = findindex(var,strindex1,strindex2,...,
strindexk)
numindex = findindex(var,strindex1,strindex2,...,strindexk)
Description
[numindex1,numindex2,...,numindexk] = findindex(var,strindex1,strindex2,...,
strindexk) finds the numeric index equivalents of the named index variables in the optimization
variable var.
Examples
Create an optimization variable named colors that is indexed by the primary additive color names
and the primary subtractive color names. Include 'black' and 'white' as additive color names and
'black' as a subtractive color name.
colors = optimvar('colors',["black","white","red","green","blue"],["cyan","magenta","yellow","black"]);
Find the index numbers for the additive colors 'red' and 'black' and for the subtractive color
'black'.
[idxadd,idxsub] = findindex(colors,{'red','black'},{'black'})
idxadd = 1×2
3 1
idxsub = 4
Create an optimization variable named colors that is indexed by the primary additive color names
and the primary subtractive color names. Include 'black' and 'white' as additive color names and
'black' as a subtractive color name.
colors = optimvar('colors',["black","white","red","green","blue"],["cyan","magenta","yellow","black"]);
idx = findindex(colors,["white","red","green","blue"],["black","cyan","magenta","yellow"])
idx = 1×4
17 3 9 15
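These index numbers reflect MATLAB's 1-based, column-major linear indexing over the 5-by-4 variable. As an illustrative aside (in Python rather than MATLAB, and not part of the toolbox), the conversion can be sketched as:

```python
additive = ["black", "white", "red", "green", "blue"]    # row names (5)
subtractive = ["cyan", "magenta", "yellow", "black"]     # column names (4)

def named_to_linear(rows, cols):
    # MATLAB-style linear index: column-major and 1-based, so
    # index = row + column*nrows + 1 (rows/columns counted from 0 here).
    return [additive.index(r) + subtractive.index(c) * len(additive) + 1
            for r, c in zip(rows, cols)]

print(named_to_linear(["white", "red", "green", "blue"],
                      ["black", "cyan", "magenta", "yellow"]))  # [17, 3, 9, 15]
```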
Create and solve an optimization problem using named index variables. The problem is to maximize
the profit-weighted flow of fruit to various airports, subject to constraints on the weighted flows.
Intlinprog stopped at the root node because the objective value is within a gap
tolerance of the optimal value, options.AbsoluteGapTolerance = 0 (the default
value). The intcon variables are integer within tolerance,
options.IntegerTolerance = 1e-05 (the default value).
Find the optimal flow of oranges and berries to New York and Los Angeles.
idxFruit = 1×2
2 4
idxAirports = 1×2
1 3
orangeBerries = 2×2
0 980.0000
70.0000 0
This display means that no oranges are going to NYC, 70 berries are going to NYC, 980 oranges are
going to LAX, and no berries are going to LAX.
Fruit Airports
----- --------
Berries NYC
Apples BOS
Oranges LAX
idx = findindex(flow, {'berries', 'apples', 'oranges'}, {'NYC', 'BOS', 'LAX'})
idx = 1×3
4 5 10
optimalFlow = sol.flow(idx)
optimalFlow = 1×3
This display means that 70 berries are going to NYC, 28 apples are going to BOS, and 980 oranges are
going to LAX.
Create named index variables for a problem with various land types, potential crops, and plowing
methods.
land = ["irr-good","irr-poor","dry-good","dry-poor"];
crops = ["wheat-lentil","wheat-corn","barley-chickpea","barley-lentil","wheat-onion","barley-onion"];
plow = ["tradition","mechanized"];
xcrop = optimvar('xcrop',land,crops,plow,'LowerBound',0);
Create an initial-point structure x0, and set values of 2000, 5000, and 3500 for three specific combinations of land type, crop, and plowing method.
x0.xcrop = zeros(size(xcrop));
idx = findindex(xcrop,...
    ["dry-good","irr-poor","irr-good"],...
    ["wheat-corn","barley-onion","barley-chickpea"],...
    ["mechanized","tradition","mechanized"]);
x0.xcrop(idx) = [2000,5000,3500];
Input Arguments
var — Optimization variable
OptimizationVariable object

Optimization variable, specified as an OptimizationVariable object created by optimvar.
strindex — Named index
cell array of character vectors | character vector | string vector | integer vector

Named index, specified as a cell array of character vectors, a character vector, a string vector, or an integer vector. The number of strindex arguments must equal the number of dimensions in var.
Example: ["small","medium","large"]
Data Types: double | char | string | cell
Output Arguments
numindex — Numeric index equivalent
integer vector
Numeric index equivalent, returned as an integer vector. The number of output arguments must be
one of the following:
• The number of dimensions in var. Each output vector numindexj is the numeric equivalent of the
corresponding input argument strindexj.
• One. In this case, the size of each input strindexj must be the same for all j, and the output satisfies the linear indexing criterion var(numindex(j)) = var(strindex1(j),strindex2(j),...,strindexk(j)).
See Also
OptimizationVariable | solve
Topics
“Create Initial Point for Optimization with Named Index Variables” on page 10-43
“Named Index for Optimization Variables” on page 10-20
Introduced in R2018a
fminbnd
Find minimum of single-variable function on fixed interval
Syntax
x = fminbnd(fun,x1,x2)
x = fminbnd(fun,x1,x2,options)
x = fminbnd(problem)
[x,fval] = fminbnd( ___ )
[x,fval,exitflag] = fminbnd( ___ )
[x,fval,exitflag,output] = fminbnd( ___ )
Description
fminbnd is a one-dimensional minimizer that finds a minimum for a problem specified by

min f (x) such that x1 < x < x2.
 x

x, x1, and x2 are finite scalars, and f(x) is a function that returns a scalar.

x = fminbnd(fun,x1,x2) returns a value x that is a local minimizer of the scalar valued function that is described in fun in the interval x1 < x < x2.
x = fminbnd(problem) finds the minimum for problem, a structure described in “Input Arguments”. Create problem by exporting a problem from the Optimization app, as described in “Exporting Your Work” on page 5-9.
[x,fval] = fminbnd( ___ ), for any input arguments, returns the value of the objective function
computed in fun at the solution x.
[x,fval,exitflag] = fminbnd( ___ ) additionally returns a value exitflag that describes the
exit condition.
Examples
Minimum of sin
Find the point where the sin(x) function takes its minimum in the range 0 < x < 2π.
fun = @sin;
x1 = 0;
x2 = 2*pi;
x = fminbnd(fun,x1,x2)
x = 4.7124
This agrees with the theoretical minimum location 3*pi/2:

3*pi/2

ans = 4.7124
Minimize a Function Specified by a File

Minimize a function that is specified by a separate function file. The function accepts a point x and returns a real scalar representing the value of the objective function at x.
Write the following function as a file, and save the file as scalarobjective.m on your MATLAB®
path.
function f = scalarobjective(x)
f = 0;
for k = -10:10
f = f + (k+1)^2*cos(k*x)*exp(-k^2/2);
end
Minimize this function over the interval 1 < x < 3.

x = fminbnd(@scalarobjective,1,3)
x =
2.0061
Minimize with Extra Parameter

Minimize a function when there is an extra parameter. The function sin(x − a) has a minimum that depends on the value of the parameter a. Create an anonymous function of x that includes the value of the parameter a. Minimize this function over the interval 0 < x < 2π.
a = 9/7;
fun = @(x)sin(x-a);
x = fminbnd(fun,1,2*pi)
x = 5.9981
This agrees with the theoretical minimum location 3*pi/2 + 9/7:

3*pi/2 + 9/7

ans = 5.9981
For more information about including extra parameters, see “Parameterizing Functions” (MATLAB).
Monitor Iterations
Monitor the steps fminbnd takes to minimize the sin(x) function for 0 < x < 2π.
fun = @sin;
x1 = 0;
x2 = 2*pi;
options = optimset('Display','iter');
x = fminbnd(fun,x1,x2,options)
Optimization terminated:
the current x satisfies the termination criteria using OPTIONS.TolX of 1.000000e-04
x = 4.7124
Find Minimum Location and Value

Find the location of the minimum of sin(x) and the value of the minimum for 0 < x < 2π.
fun = @sin;
[x,fval] = fminbnd(fun,1,2*pi)
x = 4.7124
fval = -1.0000
Obtain All Information

Return all information about the fminbnd solution process by requesting all outputs. Also, monitor the solution process using a plot function.
fun = @sin;
x1 = 0;
x2 = 2*pi;
options = optimset('PlotFcns',@optimplotfval);
[x,fval,exitflag,output] = fminbnd(fun,x1,x2,options)
x = 4.7124
fval = -1.0000
exitflag = 1
Input Arguments
fun — Function to minimize
function handle | function name
Function to minimize, specified as a function handle or function name. fun is a function that accepts
a real scalar x and returns a real scalar f (the objective function evaluated at x).
Specify fun as a function handle for a file:

x = fminbnd(@myfun,x1,x2)

where myfun is a MATLAB® function such as
function f = myfun(x)
f = ... % Compute function value at x
You can also specify fun as a function handle for an anonymous function:
x = fminbnd(@(x)norm(x)^2,x1,x2);
x1 — Lower bound
finite real scalar
x2 — Upper bound
finite real scalar
options — Optimization options
structure such as optimset returns

Optimization options, specified as a structure such as optimset returns. You can use optimset to set or change the values of these fields in the options structure. See “Optimization Options Reference” on page 15-6 for detailed information.
Custom plot functions use the same syntax as output functions. See
“Output Functions” on page 3-32 and “Output Function Syntax” on
page 15-26.
TolX Termination tolerance on x, a positive scalar. The default is 1e-4.
See “Tolerances and Stopping Criteria” on page 2-68.
The simplest way to obtain a problem structure is to export the problem from the Optimization app.
Data Types: struct
Output Arguments
x — Solution
real scalar
Solution, returned as a real scalar. Typically, x is a local solution to the problem when exitflag is
positive. For information on the quality of the solution, see “When the Solver Succeeds” on page 4-18.
fval — Objective function value at solution
real number

Objective function value at the solution, returned as a real number. Generally, fval = fun(x).
Limitations
• The function to be minimized must be continuous.
• fminbnd might only give local solutions.
• fminbnd can exhibit slow convergence when the solution is on a boundary of the interval. In such
a case, fmincon often gives faster and more accurate solutions.
Algorithms
fminbnd is a function file. The algorithm is based on golden section search and parabolic
interpolation. Unless the left endpoint x1 is very close to the right endpoint x2, fminbnd never
evaluates fun at the endpoints, so fun need only be defined for x in the interval x1 < x < x2.
If the minimum actually occurs at x1 or x2, fminbnd returns a point x in the interior of the interval
(x1,x2) that is close to the minimizer. In this case, the distance of x from the minimizer is no more than
2*(TolX + 3*abs(x)*sqrt(eps)). See [1] or [2] for details about the algorithm.
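As an illustrative sketch of the golden-section part of this algorithm (written in Python rather than MATLAB, and without the parabolic-interpolation acceleration that fminbnd combines with it):

```python
import math

def golden_section(f, a, b, tol=1e-6):
    # Shrink the bracket [a, b] by the golden ratio until it is
    # narrower than tol; the two interior points c < d are reused
    # across iterations, so each iteration costs one new f-evaluation.
    invphi = (math.sqrt(5) - 1) / 2                # 1/phi ~ 0.618
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    while (b - a) > tol:
        if f(c) < f(d):
            b, d = d, c
            c = b - invphi * (b - a)
        else:
            a, c = c, d
            d = a + invphi * (b - a)
    return (a + b) / 2

x = golden_section(math.sin, 0, 2 * math.pi)
print(x)  # about 4.7124, i.e. 3*pi/2
```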
References
[1] Forsythe, G. E., M. A. Malcolm, and C. B. Moler. Computer Methods for Mathematical
Computations. Englewood Cliffs, NJ: Prentice Hall, 1976.
[2] Brent, Richard. P. Algorithms for Minimization without Derivatives. Englewood Cliffs, NJ: Prentice-
Hall, 1973.
Extended Capabilities
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
See Also
fmincon | fminsearch | optimset | optimtool
Topics
“Create Function Handle” (MATLAB)
“Anonymous Functions” (MATLAB)
fmincon
Find minimum of constrained nonlinear multivariable function
Syntax
x = fmincon(fun,x0,A,b)
x = fmincon(fun,x0,A,b,Aeq,beq)
x = fmincon(fun,x0,A,b,Aeq,beq,lb,ub)
x = fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon)
x = fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon,options)
x = fmincon(problem)
[x,fval] = fmincon( ___ )
[x,fval,exitflag,output] = fmincon( ___ )
[x,fval,exitflag,output,lambda,grad,hessian] = fmincon( ___ )
Description
Nonlinear programming solver.
min f (x) such that
 x
    c(x) ≤ 0
    ceq(x) = 0
    A ⋅ x ≤ b
    Aeq ⋅ x = beq
    lb ≤ x ≤ ub,
b and beq are vectors, A and Aeq are matrices, c(x) and ceq(x) are functions that return vectors, and
f(x) is a function that returns a scalar. f(x), c(x), and ceq(x) can be nonlinear functions.
x, lb, and ub can be passed as vectors or matrices; see “Matrix Arguments” on page 2-31.
Note “Passing Extra Parameters” on page 2-57 explains how to pass extra parameters to the
objective function and nonlinear constraint functions, if necessary.
Note If the specified input bounds for a problem are inconsistent, fmincon throws an error. In this
case, output x is x0 and fval is [].
For the default 'interior-point' algorithm, fmincon sets components of x0 that violate the
bounds lb ≤ x ≤ ub, or are equal to a bound, to the interior of the bound region. For the 'trust-
region-reflective' algorithm, fmincon sets violating components to the interior of the bound
region. For other algorithms, fmincon sets violating components to the closest bound. Components
that respect the bounds are not changed. See “Iterations Can Violate Constraints” on page 2-33.
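The "closest bound" rule used by the non-interior-point algorithms amounts to componentwise clamping. A minimal sketch (illustrative only, not the toolbox implementation):

```python
def clamp_to_bounds(x0, lb, ub):
    # Components outside [lb, ub] move to the nearest bound;
    # components already inside the bounds are left unchanged.
    return [min(max(xi, lo), hi) for xi, lo, hi in zip(x0, lb, ub)]

print(clamp_to_bounds([-3, 0.5, 7], [0, 0, 0], [2, 2, 2]))  # [0, 0.5, 2]
```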
x = fmincon(problem) finds the minimum for problem, where problem is a structure described
in “Input Arguments” on page 16-72. Create the problem structure by exporting a problem from
Optimization app, as described in “Exporting Your Work” on page 5-9.
[x,fval] = fmincon( ___ ), for any syntax, returns the value of the objective function fun at the
solution x.
[x,fval,exitflag,output,lambda,grad,hessian] = fmincon( ___ ) additionally returns:

• lambda — Structure with fields containing the Lagrange multipliers at the solution x.
• grad — Gradient of fun at the solution x.
• hessian — Hessian of fun at the solution x. See “fmincon Hessian” on page 3-25.
Examples
Linear Inequality Constraint

Find the minimum value of Rosenbrock's function when there is a linear inequality constraint.
Set the objective function fun to be Rosenbrock's function. Rosenbrock's function is well-known to be
difficult to minimize. It has its minimum objective value of 0 at the point (1,1). For more information,
see “Solve a Constrained Nonlinear Problem, Solver-Based” on page 1-11.
fun = @(x)100*(x(2)-x(1)^2)^2 + (1-x(1))^2;
Find the minimum value starting from the point [-1,2], constrained to have x(1) + 2x(2) ≤ 1.
Express this constraint in the form Ax <= b by taking A = [1,2] and b = 1. Notice that this
constraint means that the solution will not be at the unconstrained solution (1,1), because at that
point x(1) + 2x(2) = 3 > 1.
x0 = [-1,2];
A = [1,2];
b = 1;
x = fmincon(fun,x0,A,b)
x = 1×2
0.5022 0.2489
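One way to sanity-check this result without rerunning the solver is to note that the constraint is active at the solution and scan the objective along the boundary x(1) + 2*x(2) = 1. A small Python check (an illustrative aside, not toolbox code):

```python
def rosen(x):
    # Rosenbrock's function, as in the example above.
    return 100 * (x[1] - x[0]**2)**2 + (1 - x[0])**2

x = (0.5022, 0.2489)
print(x[0] + 2 * x[1])          # ~1.0: the inequality holds with equality

# Parameterize the boundary as x2 = (1 - x1)/2 and scan it on a grid;
# the reported x1 should approximately minimize f along the boundary.
f_on_boundary = lambda t: rosen((t, (1 - t) / 2))
best = min((f_on_boundary(t / 1000), t / 1000) for t in range(-2000, 2001))
print(best[1])                  # ~0.502
```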
Linear Inequality and Equality Constraint

Find the minimum value of Rosenbrock's function when there are both a linear inequality constraint and a linear equality constraint.
Find the minimum value starting from the point [0.5,0], constrained to have x(1) + 2x(2) ≤ 1 and
2x(1) + x(2) = 1.
• Express the linear inequality constraint in the form A*x <= b by taking A = [1,2] and b = 1.
• Express the linear equality constraint in the form Aeq*x = beq by taking Aeq = [2,1] and beq
= 1.
x0 = [0.5,0];
A = [1,2];
b = 1;
Aeq = [2,1];
beq = 1;
x = fmincon(fun,x0,A,b,Aeq,beq)
x = 1×2
0.4149 0.1701
Minimize with Bound Constraints

Find the minimum of an objective function in the presence of bound constraints. The objective function is a simple algebraic function of two variables.

fun = @(x)1+x(1)/(1+x(2)) - 3*x(1)*x(2) + x(2)*(1+x(1));

Look in the region where x has positive values, x(1) ≤ 1, and x(2) ≤ 2.
lb = [0,0];
ub = [1,2];
A = [];
b = [];
Aeq = [];
beq = [];
x0 = (lb + ub)/2;
x = fmincon(fun,x0,A,b,Aeq,beq,lb,ub)
x = 1×2
1.0000 2.0000
A different starting point can lead to a different solution.

x0 = x0/5;
x = fmincon(fun,x0,A,b,Aeq,beq,lb,ub)
x = 1×2
10-6 ×
0.4000 0.4000
To determine which solution is better, see “Obtain the Objective Function Value” on page 16-68.
Nonlinear Constraints
Find the point where Rosenbrock's function is minimized within a circle, also subject to bound
constraints.
fun = @(x)100*(x(2)-x(1)^2)^2 + (1-x(1))^2;

lb = [0,0.2];
ub = [0.5,0.8];
Also look within the circle centered at [1/3,1/3] with radius 1/3. Copy the following code to a file on your MATLAB® path named circlecon.m.

function [c,ceq] = circlecon(x)
c = (x(1)-1/3)^2 + (x(2)-1/3)^2 - (1/3)^2;
ceq = [];
A = [];
b = [];
Aeq = [];
beq = [];
x0 = [1/4,1/4];
nonlcon = @circlecon;
x = fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon)
x =
0.5000 0.2500
Nondefault Options
Set options to view iterations as they occur and to use a different algorithm.
To observe the fmincon solution process, set the Display option to 'iter'. Also, try the 'sqp'
algorithm, which is sometimes faster or more accurate than the default 'interior-point'
algorithm.
options = optimoptions('fmincon','Display','iter','Algorithm','sqp');
Find the minimum of Rosenbrock's function on the unit disk, x(1)^2 + x(2)^2 ≤ 1. First create a function that represents the nonlinear constraint. Save this as a file named unitdisk.m on your MATLAB® path.

function [c,ceq] = unitdisk(x)
c = x(1)^2 + x(2)^2 - 1;
ceq = [];

Create the remaining problem specifications. Then run fmincon.

fun = @(x)100*(x(2)-x(1)^2)^2 + (1-x(1))^2;
x0 = [0,0];
nonlcon = @unitdisk;
x = fmincon(fun,x0,[],[],[],[],[],[],nonlcon,options)
fmincon stopped because the size of the current step is less than
the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x =
0.7864 0.6177
Include Gradient
Include gradient evaluation in the objective function for faster or more reliable computations.
Include the gradient evaluation as a conditionalized output in the objective function file. For details,
see “Including Gradients and Hessians” on page 2-19. The objective function is Rosenbrock's
function,
options = optimoptions('fmincon','SpecifyObjectiveGradient',true);
Create the other inputs for the problem. Then call fmincon.
fun = @rosenbrockwithgrad;
x0 = [-1,2];
A = [];
b = [];
Aeq = [];
beq = [];
lb = [-2,-2];
ub = [2,2];
nonlcon = [];
x = fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon,options)
x =
1.0000 1.0000
Solve the same problem as in “Nondefault Options” on page 16-64 using a problem structure instead
of separate arguments.
Create the options and a problem structure. See the problem input argument for the field names and required fields.
options = optimoptions('fmincon','Display','iter','Algorithm','sqp');
problem.options = options;
problem.solver = 'fmincon';
problem.objective = @(x)100*(x(2)-x(1)^2)^2 + (1-x(1))^2;
problem.x0 = [0,0];
The nonlinear constraint function unitdisk appears at the end of this example. Include the nonlinear constraint function in problem.
problem.nonlcon = @unitdisk;

Solve the problem.

x = fmincon(problem)
fmincon stopped because the size of the current step is less than
the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x = 1×2
0.7864 0.6177
The iterative display and solution are the same as in “Nondefault Options” on page 16-64.
Obtain the Objective Function Value

Call fmincon with the fval output to obtain the value of the objective function at the solution.
The “Minimize with Bound Constraints” on page 16-62 example shows two solutions. Which is better?
Run the example requesting the fval output as well as the solution.

[x,fval] = fmincon(fun,x0,A,b,Aeq,beq,lb,ub)
x = 1×2
1.0000 2.0000
fval = -0.6667
x0 = x0/5;
[x2,fval2] = fmincon(fun,x0,A,b,Aeq,beq,lb,ub)
x2 = 1×2
10-6 ×
0.4000 0.4000
fval2 = 1.0000
This solution has an objective function value fval2 = 1, which is higher than the first value fval = –
0.6667. The first solution x has a lower local minimum objective function value.
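The comparison can be reproduced by evaluating the objective at both reported points. The sketch below assumes the bound-constrained example's objective has the form f(x) = 1 + x1/(1 + x2) − 3*x1*x2 + x2*(1 + x1); it is an illustrative Python aside, not toolbox code:

```python
def f(x):
    # Objective from the bound-constrained example (assumed form):
    # f(x) = 1 + x1/(1 + x2) - 3*x1*x2 + x2*(1 + x1)
    return 1 + x[0] / (1 + x[1]) - 3 * x[0] * x[1] + x[1] * (1 + x[0])

print(f((1.0, 2.0)))      # about -0.6667: the better (lower) local minimum
print(f((4e-7, 4e-7)))    # about 1.0: the poorer local minimum
```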
To easily examine the quality of a solution, request the exitflag and output outputs.
Set up the problem of minimizing Rosenbrock's function on the unit disk, x(1)^2 + x(2)^2 ≤ 1. First create a function that represents the nonlinear constraint. Save this as a file named unitdisk.m on your MATLAB® path.
x =
0.7864 0.6177
fval =
0.0457
exitflag =
output =
iterations: 24
funcCount: 84
constrviolation: 0
stepsize: 6.9164e-06
algorithm: 'interior-point'
firstorderopt: 2.0934e-08
cgiterations: 4
message: '...'
fmincon optionally returns several outputs that you can use for analyzing the reported solution.
Set up the problem of minimizing Rosenbrock's function on the unit disk. First create a function that
represents the nonlinear constraint. Save this as a file named unitdisk.m on your MATLAB® path.
x =
0.7864 0.6177
fval =
0.0457
exitflag =
output =
iterations: 24
funcCount: 84
constrviolation: 0
stepsize: 6.9164e-06
algorithm: 'interior-point'
firstorderopt: 2.0934e-08
cgiterations: 4
message: '...'
lambda =
grad =
-0.1911
-0.1501
hessian =
497.2876 -314.5576
-314.5576 200.2383
• The lambda.ineqnonlin output shows that the nonlinear constraint is active at the solution, and
gives the value of the associated Lagrange multiplier.
• The grad output gives the value of the gradient of the objective function at the solution x.
Input Arguments
fun — Function to minimize
function handle | function name
Function to minimize, specified as a function handle or function name. fun is a function that accepts
a vector or array x and returns a real scalar f, the objective function evaluated at x.
Specify fun as a function handle for a file:

x = fmincon(@myfun,x0,A,b)

where myfun is a MATLAB® function such as
function f = myfun(x)
f = ... % Compute function value at x
You can also specify fun as a function handle for an anonymous function:
x = fmincon(@(x)norm(x)^2,x0,A,b);
If you can compute the gradient of fun and the SpecifyObjectiveGradient option is set to true,
as set by
options = optimoptions('fmincon','SpecifyObjectiveGradient',true)
then fun must return the gradient vector g(x) in the second output argument.
If you can also compute the Hessian matrix and the HessianFcn option is set to 'objective' via
optimoptions and the Algorithm option is 'trust-region-reflective', fun must return the
Hessian value H(x), a symmetric matrix, in a third output argument. fun can give a sparse Hessian.
See “Hessian for fminunc trust-region or fmincon trust-region-reflective algorithms” on page 2-21 for
details.
If you can also compute the Hessian matrix and the Algorithm option is set to 'interior-point',
there is a different way to pass the Hessian to fmincon. For more information, see “Hessian for
fmincon interior-point algorithm” on page 2-21. For an example using Symbolic Math Toolbox to
compute the gradient and Hessian, see “Symbolic Math Toolbox™ Calculates Gradients and
Hessians” on page 6-94.
x0 — Initial point
real vector | real array
Initial point, specified as a real vector or real array. Solvers use the number of elements in, and size
of, x0 to determine the number and size of variables that fun accepts.
Example: x0 = [1,2,3,4]
Data Types: double
A — Linear inequality constraints
real matrix

Linear inequality constraints, specified as a real matrix. A is an M-by-N matrix, where M is the number of inequalities, and N is the number of variables (number of elements in x0). For large problems, pass A as a sparse matrix.
A encodes the M linear inequalities

A*x <= b,

where x is the column vector of N variables x(:), and b is a column vector with M elements.
For example, consider these inequalities:

x1 + 2x2 ≤ 10
3x1 + 4x2 ≤ 20
5x1 + 6x2 ≤ 30.

Specify the inequalities by entering the following constraints:

A = [1,2;3,4;5,6];
b = [10;20;30];
Example: To specify that the x components sum to 1 or less, use A = ones(1,N) and b = 1.
Data Types: double
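Checking whether a candidate point satisfies such inequalities is a single rowwise comparison. An illustrative Python aside (not toolbox code):

```python
# The inequality system A*x <= b from the example above.
A = [[1, 2], [3, 4], [5, 6]]
b = [10, 20, 30]

x = [2, 4]
# Evaluate each row of A*x and compare to b componentwise.
feas = [sum(aij * xj for aij, xj in zip(row, x)) <= bi
        for row, bi in zip(A, b)]
print(feas)  # [True, False, False]: x violates the second and third rows
```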
b — Linear inequality constraints
real vector

Linear inequality constraints, specified as a real vector. b is an M-element vector related to the A matrix. If you pass b as a row vector, solvers internally convert b to the column vector b(:). For large problems, pass b as a sparse vector.
b encodes the M linear inequalities

A*x <= b,

where x is the column vector of N variables x(:), and A is a matrix of size M-by-N.
For example, consider these inequalities:

x1 + 2x2 ≤ 10
3x1 + 4x2 ≤ 20
5x1 + 6x2 ≤ 30.

Specify the inequalities by entering the following constraints:
A = [1,2;3,4;5,6];
b = [10;20;30];
Example: To specify that the x components sum to 1 or less, use A = ones(1,N) and b = 1.
Data Types: double
Aeq — Linear equality constraints
real matrix

Linear equality constraints, specified as a real matrix. Aeq is an Me-by-N matrix, where Me is the number of equalities, and N is the number of variables (number of elements in x0). For large problems, pass Aeq as a sparse matrix.
Aeq encodes the Me linear equalities

Aeq*x = beq,

where x is the column vector of N variables x(:), and beq is a column vector with Me elements.
x1 + 2x2 + 3x3 = 10
2x1 + 4x2 + x3 = 20,
Aeq = [1,2,3;2,4,1];
beq = [10;20];
Example: To specify that the x components sum to 1, use Aeq = ones(1,N) and beq = 1.
Data Types: double
beq — Linear equality constraints
real vector

Linear equality constraints, specified as a real vector. beq is an Me-element vector related to the Aeq matrix. If you pass beq as a row vector, solvers internally convert beq to the column vector beq(:). For large problems, pass beq as a sparse vector.
beq encodes the Me linear equalities

Aeq*x = beq,

where x is the column vector of N variables x(:), and Aeq is a matrix of size Me-by-N.
x1 + 2x2 + 3x3 = 10
2x1 + 4x2 + x3 = 20,
Aeq = [1,2,3;2,4,1];
beq = [10;20];
Example: To specify that the x components sum to 1, use Aeq = ones(1,N) and beq = 1.
Data Types: double
lb — Lower bounds
real vector | real array
Lower bounds, specified as a real vector or real array. If the number of elements in x0 is equal to the number of elements in lb, then lb specifies that x(i) >= lb(i) for all i.
ub — Upper bounds
real vector | real array
Upper bounds, specified as a real vector or real array. If the number of elements in x0 is equal to the number of elements in ub, then ub specifies that x(i) <= ub(i) for all i.
nonlcon — Nonlinear constraints
function handle | function name

Nonlinear constraints, specified as a function handle or function name. nonlcon is a function that accepts a vector or array x and returns two arrays, c(x) and ceq(x). fmincon attempts to satisfy c(x) <= 0 and ceq(x) = 0 for all entries of c and ceq.
For example,
x = fmincon(@myfun,x0,A,b,Aeq,beq,lb,ub,@mycon)

where mycon is a MATLAB® function such as

function [c,ceq] = mycon(x)
c = ...     % Compute nonlinear inequalities at x.
ceq = ...   % Compute nonlinear equalities at x.
If the gradients of the constraints can also be computed and the SpecifyConstraintGradient
option is true, as set by
options = optimoptions('fmincon','SpecifyConstraintGradient',true)
then nonlcon must also return, in the third and fourth output arguments, GC, the gradient of c(x),
and GCeq, the gradient of ceq(x). GC and GCeq can be sparse or dense. If GC or GCeq is large, with
relatively few nonzero entries, save running time and memory in the interior-point algorithm by
representing them as sparse matrices. For more information, see “Nonlinear Constraints” on page 2-
37.
Data Types: char | function_handle | string
options — Optimization options
output of optimoptions | structure such as optimset returns

Optimization options, specified as the output of optimoptions or a structure such as optimset returns. Some options apply to all algorithms, and others are relevant for particular algorithms. See “Optimization Options Reference” on page 15-6 for detailed information.
Some options are absent from the optimoptions display. These options appear in italics in the
following table. For details, see “View Options” on page 2-66.
All Algorithms
Algorithm Choose the optimization algorithm:

• 'interior-point' (default)
• 'trust-region-reflective'
• 'sqp'
• 'sqp-legacy' (optimoptions only)
• 'active-set'
ConstraintTolerance Termination tolerance on the constraint violation, a positive scalar. The default is 1e-6. See “Tolerances and Stopping Criteria” on page 2-68.

For optimset, the name is TolCon. See “Current and Legacy Option Name Tables” on page 15-21.
Diagnostics Display diagnostic information about the function to be minimized or
solved. Choices are 'off' (default) or 'on'.
DiffMaxChange Maximum change in variables for finite-difference gradients (a positive
scalar). The default is Inf.
DiffMinChange Minimum change in variables for finite-difference gradients (a positive
scalar). The default is 0.
FiniteDifferenceStepSize Scalar or vector step size factor for finite differences. When you set FiniteDifferenceStepSize to a vector v, the forward finite differences delta are

delta = v.*sign′(x).*max(abs(x),TypicalX);

where sign′(x) = sign(x) except sign′(0) = 1. Central finite differences are

delta = v.*max(abs(x),TypicalX);
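A forward-difference gradient using this step rule can be sketched as follows (in Python, illustrative only; the default scalar step factor is assumed to be sqrt(eps) for double precision):

```python
def forward_diff_grad(f, x, v=2**-26, typical=1.0):
    # Forward finite differences with step
    #   delta_i = v * sign'(x_i) * max(|x_i|, typical),
    # where sign'(x) = sign(x) except sign'(0) = 1.
    # The default v = 2**-26 is sqrt(eps) for IEEE doubles.
    fx = f(x)
    g = []
    for i, xi in enumerate(x):
        s = 1.0 if xi >= 0 else -1.0
        delta = v * s * max(abs(xi), typical)
        xp = list(x)
        xp[i] = xi + delta
        g.append((f(xp) - fx) / delta)
    return g

g = forward_diff_grad(lambda x: x[0]**2 + 3 * x[1], [2.0, -1.0])
print(g)  # close to the exact gradient [4, 3]
```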
FunValCheck Check whether objective function values are valid. The default setting,
'off', does not perform a check. The 'on' setting displays an error
when the objective function returns a value that is complex, Inf, or
NaN.
MaxFunctionEvaluations Maximum number of function evaluations allowed, a positive integer.
The default value for all algorithms except interior-point is
100*numberOfVariables; for the interior-point algorithm the
default is 3000. See “Tolerances and Stopping Criteria” on page 2-68
and “Iterations and Function Counts” on page 3-9.
MaxIterations Maximum number of iterations allowed, a positive integer. The default value for all algorithms except interior-point is 400; for the interior-point algorithm, the default is 1000. See “Tolerances and Stopping Criteria” on page 2-68.

For optimset, the name is MaxIter. See “Current and Legacy Option Name Tables” on page 15-21.
OptimalityTolerance Termination tolerance on the first-order optimality (a positive scalar).
The default is 1e-6. See “First-Order Optimality Measure” on page 3-
11.
For optimset, the name is TolFun. See “Current and Legacy Option
Name Tables” on page 15-21.
OutputFcn Specify one or more user-defined functions that an optimization
function calls at each iteration. Pass a function handle or a cell array of
function handles. The default is none ([]). See “Output Function
Syntax” on page 15-26.
PlotFcn Plots various measures of progress while the algorithm executes; select
from predefined plots or write your own. Pass a built-in plot function
name, a function handle, or a cell array of built-in plot function names
or function handles. For custom plot functions, pass function handles.
The default is none ([]).
Custom plot functions use the same syntax as output functions. See
“Output Functions” on page 3-32 and “Output Function Syntax” on
page 15-26.
SpecifyConstraintGradient Gradient for nonlinear constraint functions defined by the user. When set to the default, false, fmincon estimates gradients of the nonlinear constraints by finite differences. When set to true, fmincon expects the constraint function to have four outputs, as described in nonlcon.

For optimset, the name is GradConstr and the values are 'on' or 'off'. See “Current and Legacy Option Name Tables” on page 15-21.
SpecifyObjectiveGradient Gradient for the objective function defined by the user. See the description of fun to see how to define the gradient in fun. The default, false, causes fmincon to estimate gradients using finite differences. Set to true to have fmincon use a user-defined gradient of the objective function. To use the 'trust-region-reflective' algorithm, you must provide the gradient and set SpecifyObjectiveGradient to true.
For optimset, the name is GradObj and the values are 'on' or
'off'. See “Current and Legacy Option Name Tables” on page 15-21.
StepTolerance Termination tolerance on x, a positive scalar. The default value for all
algorithms except 'interior-point' is 1e-6; for the 'interior-
point' algorithm, the default is 1e-10. See “Tolerances and Stopping
Criteria” on page 2-68.
For optimset, the name is TolX. See “Current and Legacy Option
Name Tables” on page 15-21.
TypicalX Typical x values. The number of elements in TypicalX is equal to the
number of elements in x0, the starting point. The default value is
ones(numberofvariables,1). fmincon uses TypicalX for scaling
finite differences for gradient estimation.
FunctionTolerance Termination tolerance on the function value, a positive scalar. The default is 1e-6. See “Tolerances and Stopping Criteria” on page 2-68.

For optimset, the name is TolFun. See “Current and Legacy Option Name Tables” on page 15-21.
Trust-Region-Reflective Algorithm

HessianFcn If [] (default), fmincon approximates the Hessian using finite
differences, or uses a Hessian multiply function (with option
HessianMultiplyFcn). If 'objective', fmincon uses a user-
defined Hessian (defined in fun). See “Hessian as an Input” on page
16-88.
For optimset, the name is HessFcn. See “Current and Legacy Option
Name Tables” on page 15-21.
HessianMultiplyFcn Hessian multiply function, specified as a function handle. For large-scale structured problems, this function computes the Hessian matrix product H*Y without actually forming H. The function is of the form

W = hmfun(Hinfo,Y)

where Hinfo contains the matrix used to compute H*Y. The first argument is the same as the third argument returned by the objective function fun, for example

[f,g,Hinfo] = fun(x)
Active-Set Algorithm

FunctionTolerance Termination tolerance on the function value, a positive scalar. The default is 1e-6.

For optimset, the name is TolFun. See “Current and Legacy Option Name Tables” on page 15-21.
MaxSQPIter Maximum number of SQP iterations allowed, a positive integer. The
default is 10*max(numberOfVariables, numberOfInequalities
+ numberOfBounds).
RelLineSrchBnd Relative bound (a real nonnegative scalar value) on the line search step
length. The total displacement in x satisfies |Δx(i)| ≤ relLineSrchBnd·
max(|x(i)|,|typicalx(i)|). This option provides control over the magnitude
of the displacements in x for cases in which the solver takes steps that
are considered too large. The default is no bounds ([]).
RelLineSrchBndDuration Number of iterations for which the bound specified in
RelLineSrchBnd should be active (default is 1).
TolConSQP Termination tolerance on inner iteration SQP constraint violation, a
positive scalar. The default is 1e-6.
Interior-Point Algorithm
HessianApproximation Chooses how fmincon calculates the Hessian (see “Hessian as an
Input” on page 16-88). The choices are:
• 'bfgs' (default)
• 'finite-difference'
• 'lbfgs'
• {'lbfgs',Positive Integer}
For optimset, the name is Hessian and the values are 'user-
supplied', 'bfgs', 'lbfgs', 'fin-diff-grads', 'on', or 'off'.
See “Current and Legacy Option Name Tables” on page 15-21.
HessianFcn If [] (default), fmincon approximates the Hessian using finite differences, or uses a supplied HessianMultiplyFcn. If a function handle, fmincon uses HessianFcn to calculate the Hessian. See “Hessian as an Input” on page 16-88.

For optimset, the name is HessFcn. See “Current and Legacy Option Name Tables” on page 15-21.
HessianMultiplyFcn User-supplied function that gives a Hessian-times-vector product (see
“Hessian Multiply Function” on page 16-88). Pass a function handle.
Example: options = optimoptions('fmincon','SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true)
You must supply at least the objective, x0, solver, and options fields in the problem structure.
The simplest way to obtain a problem structure is to export the problem from the Optimization app.
Data Types: struct
Output Arguments
x — Solution
real vector | real array
Solution, returned as a real vector or real array. The size of x is the same as the size of x0. Typically,
x is a local solution to the problem when exitflag is positive. For information on the quality of the
solution, see “When the Solver Succeeds” on page 4-18.
fval — Objective function value at solution
real number

Objective function value at the solution, returned as a real number. Generally, fval = fun(x).
exitflag — Reason fmincon stopped
integer

Reason fmincon stopped, returned as an integer.

All Algorithms:
1 First-order optimality measure was less than
options.OptimalityTolerance, and maximum constraint
violation was less than options.ConstraintTolerance.
0 Number of iterations exceeded options.MaxIterations or
number of function evaluations exceeded
options.MaxFunctionEvaluations.
-1 Stopped by an output function or plot function.
-2 No feasible point was found.
All algorithms except active-set:
2 Change in x was less than options.StepTolerance and
maximum constraint violation was less than
options.ConstraintTolerance.
trust-region-reflective algorithm only:
3 Change in the objective function value was less than
options.FunctionTolerance and maximum constraint
violation was less than options.ConstraintTolerance.
active-set algorithm only:
4 Magnitude of the search direction was less than
2*options.StepTolerance and maximum constraint violation
was less than options.ConstraintTolerance.
5 Magnitude of directional derivative in search direction was less
than 2*options.OptimalityTolerance and maximum
constraint violation was less than
options.ConstraintTolerance.
interior-point, sqp-legacy, and sqp algorithms:
-3 Objective function at current iteration went below
options.ObjectiveLimit and maximum constraint violation
was less than options.ConstraintTolerance.
grad — Gradient at the solution
real vector
Gradient at the solution, returned as a real vector. grad gives the gradient of fun at the point x(:).
hessian — Approximate Hessian
real matrix
Approximate Hessian, returned as a real matrix. For the meaning of hessian, see “Hessian Output” on page 3-24.
Limitations
• fmincon is a gradient-based method that is designed to work on problems where the objective
and constraint functions are both continuous and have continuous first derivatives.
• For the 'trust-region-reflective' algorithm, you must provide the gradient in fun and set
the 'SpecifyObjectiveGradient' option to true.
• The 'trust-region-reflective' algorithm does not allow equal upper and lower bounds. For
example, if lb(2)==ub(2), fmincon gives this error:
Equal upper and lower bounds not permitted in trust-region-reflective algorithm. Use
either interior-point or SQP algorithms instead.
• There are two different syntaxes for passing a Hessian, and there are two different syntaxes for
passing a HessianMultiplyFcn function; one for trust-region-reflective, and another for
interior-point. See “Including Hessians” on page 2-21.
• For trust-region-reflective, the Hessian of the Lagrangian is the same as the Hessian of
the objective function. You pass that Hessian as the third output of the objective function.
• For interior-point, the Hessian of the Lagrangian involves the Lagrange multipliers and
the Hessians of the nonlinear constraint functions. You pass the Hessian as a separate function
that takes into account both the current point x and the Lagrange multiplier structure lambda.
• When the problem is infeasible, fmincon attempts to minimize the maximum constraint value.
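As a sketch of the interior-point form, the separate Hessian function receives the current point x and the lambda structure and returns the Hessian of the Lagrangian. This example assumes (hypothetically) the objective 100*(x(2)-x(1)^2)^2 + (1-x(1))^2 and the single nonlinear inequality x(1)^2 + x(2)^2 - 1 <= 0:

```matlab
function Hout = hessianfcn(x,lambda)
% Hessian of the objective 100*(x(2)-x(1)^2)^2 + (1-x(1))^2
H = [1200*x(1)^2 - 400*x(2) + 2, -400*x(1);
     -400*x(1),                   200];
% Hessian of the inequality constraint x(1)^2 + x(2)^2 - 1 <= 0
Hg = 2*eye(2);
% Lagrangian Hessian: objective part plus multiplier-weighted constraint part
Hout = H + lambda.ineqnonlin(1)*Hg;
end
```

Pass it with, for example, `optimoptions('fmincon','Algorithm','interior-point','SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true,'HessianFcn',@hessianfcn)`; supplying a Hessian function requires supplying the objective and constraint gradients as well.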
More About
Hessian as an Input
fmincon uses a Hessian as an optional input. This Hessian is the matrix of second derivatives of the Lagrangian (see “Equation 3-1”), namely,
∇²ₓₓ L(x,λ) = ∇²f(x) + Σᵢ λᵢ ∇²cᵢ(x) + Σᵢ λᵢ ∇²ceqᵢ(x).
The active-set and sqp algorithms do not accept an input Hessian. They compute a quasi-Newton
approximation to the Hessian of the Lagrangian.
The interior-point algorithm has several choices for the 'HessianApproximation' option; see
“Choose Input Hessian Approximation for interior-point fmincon” on page 2-24:
• 'bfgs' — fmincon calculates the Hessian by a dense quasi-Newton approximation. This is the
default Hessian approximation.
• 'lbfgs' — fmincon calculates the Hessian by a limited-memory, large-scale quasi-Newton
approximation. The default memory, 10 iterations, is used.
• {'lbfgs',positive integer} — fmincon calculates the Hessian by a limited-memory, large-
scale quasi-Newton approximation. The positive integer specifies how many past iterations should
be remembered.
• 'finite-difference' — fmincon calculates a Hessian-times-vector product by finite
differences of the gradient(s). You must supply the gradient of the objective function, and also
gradients of nonlinear constraints (if they exist). Set the 'SpecifyObjectiveGradient' option
to true and, if applicable, the 'SpecifyConstraintGradient' option to true. You must set
the 'SubproblemAlgorithm' to 'cg'.
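For instance, a sketch using the option values listed above:

```matlab
% L-BFGS approximation remembering 20 past iterations
options = optimoptions('fmincon','Algorithm','interior-point', ...
    'HessianApproximation',{'lbfgs',20});

% Finite-difference Hessian-times-vector products: requires supplied
% objective (and, if present, constraint) gradients and the
% conjugate-gradient subproblem algorithm
options = optimoptions('fmincon','Algorithm','interior-point', ...
    'HessianApproximation','finite-difference', ...
    'SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true, ...
    'SubproblemAlgorithm','cg');
```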
Algorithms
Choosing the Algorithm
For help choosing the algorithm, see “fmincon Algorithms” on page 2-6. To set the algorithm, use
optimoptions to create options, and use the 'Algorithm' name-value pair.
The rest of this section gives brief summaries or pointers to information about each algorithm.
Interior-Point Optimization
This algorithm is described in “fmincon Interior Point Algorithm” on page 6-30. There is more
extensive description in [1], [41], and [9].
SQP and SQP Legacy Optimization
The fmincon 'sqp' and 'sqp-legacy' algorithms are similar to the 'active-set' algorithm described in “Active-Set Optimization” on page 16-89. “fmincon SQP Algorithm” on page 6-29 describes the main differences between these algorithms.
Active-Set Optimization
fmincon uses a sequential quadratic programming (SQP) method. In this method, the function solves
a quadratic programming (QP) subproblem at each iteration. fmincon updates an estimate of the
Hessian of the Lagrangian at each iteration using the BFGS formula (see fminunc and references [7]
and [8]).
fmincon performs a line search using a merit function similar to that proposed by [6], [7], and [8].
The QP subproblem is solved using an active set strategy similar to that described in [5]. “fmincon
Active Set Algorithm” on page 6-22 describes this algorithm in detail.
See also “SQP Implementation” on page 6-25 for more details on the algorithm used.
Trust-Region-Reflective Optimization
The 'trust-region-reflective' algorithm is a subspace trust-region method based on the interior-reflective Newton method described in [3] and [4]. For details, see “Constrained Nonlinear Optimization Algorithms” on page 6-19.
References
[1] Byrd, R. H., J. C. Gilbert, and J. Nocedal. “A Trust Region Method Based on Interior Point Techniques for Nonlinear Programming.” Mathematical Programming, Vol. 89, No. 1, 2000, pp. 149–185.
[2] Byrd, R. H., Mary E. Hribar, and Jorge Nocedal. “An Interior Point Algorithm for Large-Scale Nonlinear Programming.” SIAM Journal on Optimization, Vol. 9, No. 4, 1999, pp. 877–900.
[3] Coleman, T. F. and Y. Li. “An Interior, Trust Region Approach for Nonlinear Minimization Subject to
Bounds.” SIAM Journal on Optimization, Vol. 6, 1996, pp. 418–445.
[4] Coleman, T. F. and Y. Li. “On the Convergence of Reflective Newton Methods for Large-Scale
Nonlinear Minimization Subject to Bounds.” Mathematical Programming, Vol. 67, Number 2,
1994, pp. 189–224.
[5] Gill, P. E., W. Murray, and M. H. Wright. Practical Optimization, London, Academic Press, 1981.
[6] Han, S. P. “A Globally Convergent Method for Nonlinear Programming.” Journal of Optimization Theory and Applications, Vol. 22, 1977, pp. 297–309.
[7] Powell, M. J. D. “A Fast Algorithm for Nonlinearly Constrained Optimization Calculations.” Numerical Analysis (G. A. Watson, ed.), Lecture Notes in Mathematics, Vol. 630, Springer-Verlag, 1978.
[8] Powell, M. J. D. “The Convergence of Variable Metric Methods For Nonlinearly Constrained Optimization Calculations.” Nonlinear Programming 3 (O. L. Mangasarian, R. R. Meyer, and S. M. Robinson, eds.), Academic Press, 1978.
[9] Waltz, R. A., J. L. Morales, J. Nocedal, and D. Orban. “An interior algorithm for nonlinear optimization that combines line search and trust region steps.” Mathematical Programming, Vol. 107, No. 3, 2006, pp. 391–408.
Extended Capabilities
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
• fmincon supports code generation using either the codegen function or the MATLAB Coder app.
You must have a MATLAB Coder license to generate code.
• The target hardware must support standard double-precision floating-point computations. You
cannot generate code for single-precision or fixed-point computations.
• All code for generation must be MATLAB code. In particular, you cannot use a custom black-box
function as an objective function for fmincon. You can use coder.ceval to evaluate a custom
function coded in C or C++. However, the custom function must be called in a MATLAB function.
• fmincon does not support the problem argument for code generation.
[x,fval] = fmincon(problem) % Not supported
• You must specify the objective function and any nonlinear constraint function by using function
handles, not strings or character names.
x = fmincon(@fun,x0,A,b,Aeq,beq,lb,ub,@nonlcon) % Supported
% Not supported: fmincon('fun',...) or fmincon("fun",...)
• All fmincon input matrices such as A, Aeq, lb, and ub must be full, not sparse. You can convert
sparse matrices to full by using the full function.
• The lb and ub arguments must have either the same number of entries as the x0 argument or
must be empty [].
• For advanced code optimization involving embedded processors, you also need an Embedded
Coder license.
• You must include options for fmincon and specify them using optimoptions. The options must
include the Algorithm option, set to 'sqp' or 'sqp-legacy'.
options = optimoptions('fmincon','Algorithm','sqp');
[x,fval,exitflag] = fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon,options);
• To set or modify option values, use optimoptions rather than dot notation:
opts = optimoptions('fmincon','Algorithm','sqp');
opts = optimoptions(opts,'MaxIterations',1e4); % Recommended
opts.MaxIterations = 1e4; % Not recommended
• Do not load options from a file. Doing so can cause code generation to fail. Instead, create options
in your code.
• Usually, if you specify an option that is not supported, the option is silently ignored during code
generation. However, if you specify a plot function or output function by using dot notation, code
generation can error. For reliability, specify only supported options.
• Because output functions and plot functions are not supported, fmincon does not return exit flag
–1.
For an example, see “Code Generation for Optimization Basics” on page 6-116.
Automatic Parallel Support
Accelerate code by automatically running computation in parallel using Parallel Computing Toolbox™.
To run in parallel, set the 'UseParallel' option to true.
options = optimoptions('solvername','UseParallel',true)
For more information, see “Using Parallel Computing in Optimization Toolbox” on page 14-5.
See Also
fminbnd | fminsearch | fminunc | optimoptions | optimtool
Topics
“Solver-Based Nonlinear Optimization”
“Solver-Based Optimization Problem Setup”
“Constrained Nonlinear Optimization Algorithms” on page 6-19
fminimax
Solve minimax constraint problem
Syntax
x = fminimax(fun,x0)
x = fminimax(fun,x0,A,b)
x = fminimax(fun,x0,A,b,Aeq,beq)
x = fminimax(fun,x0,A,b,Aeq,beq,lb,ub)
x = fminimax(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon)
x = fminimax(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon,options)
x = fminimax(problem)
[x,fval] = fminimax( ___ )
[x,fval,maxfval,exitflag,output] = fminimax( ___ )
[x,fval,maxfval,exitflag,output,lambda] = fminimax( ___ )
Description
fminimax seeks a point that minimizes the maximum of a set of objective functions.
The problem includes any type of constraint. In detail, fminimax seeks the minimum of a problem specified by
min max Fi(x)  such that
 x   i
c(x) ≤ 0
ceq(x) = 0
A·x ≤ b
Aeq·x = beq
lb ≤ x ≤ ub
where b and beq are vectors, A and Aeq are matrices, and c(x), ceq(x), and F(x) are functions that return vectors. F(x), c(x), and ceq(x) can be nonlinear functions.
x, lb, and ub can be passed as vectors or matrices; see “Matrix Arguments” on page 2-31.
You can also solve max-min problems with fminimax, using the identity
max min Fi(x) = −min max (−Fi(x)).
 x   i           x   i
You can solve problems of the form min max |Fi(x)| by using the AbsoluteMaxObjectiveCount option; see “Solve Minimax Problem Using Absolute Value of One Objective” on page 16-99.
Note “Passing Extra Parameters” on page 2-57 explains how to pass extra parameters to the
objective functions and nonlinear constraint functions, if necessary.
Note If the specified input bounds for a problem are inconsistent, the output x is x0 and the output
fval is [].
x = fminimax(problem) solves the minimax problem for problem, where problem is a structure
described in problem. Create the problem structure by exporting a problem from the Optimization
app, as described in “Exporting Your Work” on page 5-9.
[x,fval] = fminimax( ___ ), for any syntax, returns the values of the objective functions
computed in fun at the solution x.
Examples
Minimize Maximum of sin and cos
Create a plot of the sin and cos functions and their maximum over the interval [–pi,pi].
t = linspace(-pi,pi);
plot(t,sin(t),'r-')
hold on
plot(t,cos(t),'b-');
plot(t,max(sin(t),cos(t)),'ko')
legend('sin(t)','cos(t)','max(sin(t),cos(t))','Location','NorthWest')
The plot shows two local minima of the maximum, one near 1, and the other near –2. Find the
minimum near 1.
fun = @(x)[sin(x);cos(x)];
x0 = 1;
x1 = fminimax(fun,x0)
fminimax stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x1 = 0.7854
x0 = -2;
x2 = fminimax(fun,x0)
fminimax stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x2 = -2.3562
The objective functions for this example are linear plus constants. For a description and plot of the
objective functions, see “Compare fminimax and fminunc” on page 8-6.
Set the objective functions as three linear functions of the form dot(x, v) + v0 for three vectors v and
three constants v0.
a = [1;1];
b = [-1;1];
c = [0;-1];
a0 = 2;
b0 = -3;
c0 = 4;
fun = @(x)[x*a+a0,x*b+b0,x*c+c0];
Find the minimax point subject to the inequality x(1) + 3*x(2) <= –4.
A = [1,3];
b = -4;
x0 = [-1,-2];
x = fminimax(fun,x0,A,b)
fminimax stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x = 1×2
-5.8000 0.6000
The objective functions for this example are linear plus constants. For a description and plot of the
objective functions, see “Compare fminimax and fminunc” on page 8-6.
Set the objective functions as three linear functions of the form dot(x, v) + v0 for three vectors v and
three constants v0.
a = [1;1];
b = [-1;1];
c = [0;-1];
a0 = 2;
b0 = -3;
c0 = 4;
fun = @(x)[x*a+a0,x*b+b0,x*c+c0];
Set bounds that –2 <= x(1) <= 2 and –1 <= x(2) <= 1 and solve the minimax problem starting
from [0,0].
lb = [-2,-1];
ub = [2,1];
x0 = [0,0];
A = []; % No linear constraints
b = [];
Aeq = [];
beq = [];
[x,fval] = fminimax(fun,x0,A,b,Aeq,beq,lb,ub)
fminimax stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x = 1×2
-0.0000 1.0000
fval = 1×3
In this case, the solution is not unique. Many points satisfy the constraints and have the same
minimax value. Plot the surface representing the maximum of the three objective functions, and plot a
red line showing the points that have the same minimax value.
[X,Y] = meshgrid(linspace(-2,2),linspace(-1,1));
Z = max(fun([X(:),Y(:)]),[],2);
Z = reshape(Z,size(X));
surf(X,Y,Z,'LineStyle','none')
view(-118,28)
hold on
line([-2,0],[1,1],[3,3],'Color','r','LineWidth',8)
hold off
The objective functions for this example are linear plus constants. For a description and plot of the
objective functions, see “Compare fminimax and fminunc” on page 8-6.
Set the objective functions as three linear functions of the form dot(x, v) + v0 for three vectors v and
three constants v0.
a = [1;1];
b = [-1;1];
c = [0;-1];
a0 = 2;
b0 = -3;
c0 = 4;
fun = @(x)[x*a+a0,x*b+b0,x*c+c0];
The unitdisk function represents the nonlinear inequality constraint ‖x‖² ≤ 1.
type unitdisk
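The type command displays the function's code. A standard implementation of this constraint (a sketch; the actual file may differ) is:

```matlab
function [c,ceq] = unitdisk(x)
% Nonlinear inequality x(1)^2 + x(2)^2 <= 1, written in c(x) <= 0 form
c = x(1)^2 + x(2)^2 - 1;
% No nonlinear equality constraints
ceq = [];
end
```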
Solve the minimax problem subject to the unitdisk constraint, starting from x0 = [0,0].
x0 = [0,0];
A = []; % No other constraints
b = [];
Aeq = [];
beq = [];
lb = [];
ub = [];
nonlcon = @unitdisk;
x = fminimax(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon)
fminimax stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x = 1×2
-0.0000 1.0000
Solve Minimax Problem Using Absolute Value of One Objective
fminimax can minimize the maximum of either Fi(x) or |Fi(x)| for the first several values of i by using
the AbsoluteMaxObjectiveCount option. To minimize the absolute values of k of the objectives,
arrange the objective function values so that F1(x) through Fk(x) are the objectives for absolute
minimization, and set the AbsoluteMaxObjectiveCount option to k.
In this example, minimize the maximum of sin and cos, specify sin as the first objective, and set
AbsoluteMaxObjectiveCount to 1.
fun = @(x)[sin(x),cos(x)];
options = optimoptions('fminimax','AbsoluteMaxObjectiveCount',1);
x0 = 1;
A = []; % No constraints
b = [];
Aeq = [];
beq = [];
lb = [];
ub = [];
nonlcon = [];
x1 = fminimax(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon,options)
fminimax stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x1 = 0.7854
x0 = -2;
x2 = fminimax(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon,options)
fminimax stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x2 = -3.1416
t = linspace(-pi,pi);
plot(t,max(abs(sin(t)),cos(t)))
To see the effect of the AbsoluteMaxObjectiveCount option, compare this plot to the plot in the
example “Minimize Maximum of sin and cos” on page 16-94.
Obtain both the location of the minimax point and the value of the objective functions. For a
description and plot of the objective functions, see “Compare fminimax and fminunc” on page 8-6.
Set the objective functions as three linear functions of the form dot(x, v) + v0 for three vectors v and
three constants v0.
a = [1;1];
b = [-1;1];
c = [0;-1];
a0 = 2;
b0 = -3;
c0 = 4;
fun = @(x)[x*a+a0,x*b+b0,x*c+c0];
Set the initial point to [0,0] and find the minimax point and value.
x0 = [0,0];
[x,fval] = fminimax(fun,x0)
fminimax stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x = 1×2
-2.5000 2.2500
fval = 1×3
All three objective functions have the same value at the minimax point. Unconstrained problems
typically have at least two objectives that are equal at the solution, because if a point is not a local
minimum for any objective and only one objective has the maximum value, then the maximum
objective can be lowered.
The objective functions for this example are linear plus constants. For a description and plot of the
objective functions, see “Compare fminimax and fminunc” on page 8-6.
Set the objective functions as three linear functions of the form dot(x, v) + v0 for three vectors v and
three constants v0.
a = [1;1];
b = [-1;1];
c = [0;-1];
a0 = 2;
b0 = -3;
c0 = 4;
fun = @(x)[x*a+a0,x*b+b0,x*c+c0];
Find the minimax point subject to the inequality x(1) + 3*x(2) <= –4.
A = [1,3];
b = -4;
x0 = [-1,-2];
Set options for iterative display, and obtain all solver outputs.
options = optimoptions('fminimax','Display','iter');
Aeq = []; % No other constraints
beq = [];
lb = [];
ub = [];
nonlcon = [];
[x,fval,maxfval,exitflag,output,lambda] =...
fminimax(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon,options)
fminimax stopped because the size of the current search direction is less than
twice the value of the step size tolerance and constraints are
satisfied to within the value of the constraint tolerance.
x = 1×2
-5.8000 0.6000
fval = 1×3
maxfval = 3.4000
exitflag = 4
Input Arguments
fun — Objective functions
function handle | function name
Objective functions, specified as a function handle or function name. fun is a function that accepts a
vector x and returns a vector F, the objective functions evaluated at x. You can specify the function
fun as a function handle for a function file:
x = fminimax(@myfun,x0)
where myfun is a MATLAB function such as
function F = myfun(x)
F = ... % Compute function values at x.
You can also specify fun as a function handle for an anonymous function:
x = fminimax(@(x)sin(x.*x),x0);
If the user-defined values for x and F are arrays, fminimax converts them to vectors using linear
indexing (see “Array Indexing” (MATLAB)).
To minimize the worst-case absolute values of some elements of the vector F(x) (that is, min{max
abs{F(x)} } ), partition those objectives into the first elements of F and use optimoptions to set the
AbsoluteMaxObjectiveCount option to the number of these objectives. These objectives must be
partitioned into the first elements of the vector F returned by fun. For an example, see “Solve
Minimax Problem Using Absolute Value of One Objective” on page 16-99.
Assume that the gradients of the objective functions can also be computed and the
SpecifyObjectiveGradient option is true, as set by:
options = optimoptions('fminimax','SpecifyObjectiveGradient',true)
In this case, the function fun must return, in the second output argument, the gradient values G (a
matrix) at x. The gradient consists of the partial derivative dF/dx of each F at the point x. If F is a
vector of length m and x has length n, where n is the length of x0, then the gradient G of F(x) is an
n-by-m matrix where G(i,j) is the partial derivative of F(j) with respect to x(i) (that is, the jth
column of G is the gradient of the jth objective function F(j)). If you define F as an array, then the
preceding discussion applies to F(:), the linear ordering of the F array. In any case, G is a 2-D
matrix.
Note Setting SpecifyObjectiveGradient to true is effective only when the problem has no
nonlinear constraint, or when the problem has a nonlinear constraint with
SpecifyConstraintGradient set to true. Internally, the objective is folded into the constraints,
so the solver needs both gradients (objective and constraint) supplied in order to avoid estimating a
gradient.
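As a sketch of this layout, for the two objectives sin(x) and cos(x) of a scalar x (so n = 1 and m = 2), an objective that also returns gradients (hypothetical function name) could be:

```matlab
function [F,G] = sincosgrad(x)
% Two objectives of a scalar variable x
F = [sin(x), cos(x)];        % 1-by-2 vector of objective values
if nargout > 1               % gradient requested by the solver
    G = [cos(x), -sin(x)];   % n-by-m = 1-by-2; column j is dF(j)/dx
end
end
```

Enable the supplied gradient with `options = optimoptions('fminimax','SpecifyObjectiveGradient',true)` and call `x = fminimax(@sincosgrad,1,[],[],[],[],[],[],[],options)`.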
x0 — Initial point
real vector | real array
Initial point, specified as a real vector or real array. Solvers use the number of elements in x0 and the
size of x0 to determine the number and size of variables that fun accepts.
Example: x0 = [1,2,3,4]
Data Types: double
A — Linear inequality constraints
real matrix
Linear inequality constraints, specified as a real matrix. A is an M-by-N matrix, where M is the number
of inequalities, and N is the number of variables (number of elements in x0). For large problems, pass
A as a sparse matrix.
A*x <= b,
where x is the column vector of N variables x(:), and b is a column vector with M elements.
x1 + 2x2 ≤ 10
3x1 + 4x2 ≤ 20
5x1 + 6x2 ≤ 30,
A = [1,2;3,4;5,6];
b = [10;20;30];
Example: To specify that the x components sum to 1 or less, use A = ones(1,N) and b = 1.
Data Types: double
b — Linear inequality constraints
real vector
Linear inequality constraints, specified as a real vector. b is an M-element vector related to the A
matrix. If you pass b as a row vector, solvers internally convert b to the column vector b(:). For
large problems, pass b as a sparse vector.
A*x <= b,
where x is the column vector of N variables x(:), and A is a matrix of size M-by-N.
x1 + 2x2 ≤ 10
3x1 + 4x2 ≤ 20
A = [1,2;3,4;5,6];
b = [10;20;30];
Example: To specify that the x components sum to 1 or less, use A = ones(1,N) and b = 1.
Data Types: double
Aeq — Linear equality constraints
real matrix
Linear equality constraints, specified as a real matrix. Aeq is an Me-by-N matrix, where Me is the
number of equalities, and N is the number of variables (number of elements in x0). For large
problems, pass Aeq as a sparse matrix.
Aeq*x = beq,
where x is the column vector of N variables x(:), and beq is a column vector with Me elements.
x1 + 2x2 + 3x3 = 10
2x1 + 4x2 + x3 = 20,
Aeq = [1,2,3;2,4,1];
beq = [10;20];
Example: To specify that the x components sum to 1, use Aeq = ones(1,N) and beq = 1.
Data Types: double
beq — Linear equality constraints
real vector
Linear equality constraints, specified as a real vector. beq is an Me-element vector related to the Aeq
matrix. If you pass beq as a row vector, solvers internally convert beq to the column vector beq(:).
For large problems, pass beq as a sparse vector.
Aeq*x = beq,
where x is the column vector of N variables x(:), and Aeq is a matrix of size Me-by-N.
x1 + 2x2 + 3x3 = 10
2x1 + 4x2 + x3 = 20,
Aeq = [1,2,3;2,4,1];
beq = [10;20];
Example: To specify that the x components sum to 1, use Aeq = ones(1,N) and beq = 1.
Data Types: double
lb — Lower bounds
real vector | real array
Lower bounds, specified as a real vector or real array. If the number of elements in x0 is equal to the number of elements in lb, then lb specifies that x(i) >= lb(i) for all i. If numel(lb) < numel(x0), then lb specifies that x(i) >= lb(i) for 1 <= i <= numel(lb).
ub — Upper bounds
real vector | real array
Upper bounds, specified as a real vector or real array. If the number of elements in x0 is equal to the number of elements in ub, then ub specifies that x(i) <= ub(i) for all i. If numel(ub) < numel(x0), then ub specifies that x(i) <= ub(i) for 1 <= i <= numel(ub).
nonlcon — Nonlinear constraints
function handle | function name
Nonlinear constraints, specified as a function handle or function name. nonlcon is a function that accepts a vector or array x and returns two arrays, c(x) and ceq(x). fminimax attempts to satisfy c(x) ≤ 0 and ceq(x) = 0 for all entries of c and ceq.
For example,
x = fminimax(@myfun,x0,...,@mycon)
Suppose that the gradients of the constraints can also be computed and the
SpecifyConstraintGradient option is true, as set by:
options = optimoptions('fminimax','SpecifyConstraintGradient',true)
In this case, the function nonlcon must also return, in the third and fourth output arguments, GC, the
gradient of c(x), and GCeq, the gradient of ceq(x). See “Nonlinear Constraints” on page 2-37 for an
explanation of how to “conditionalize” the gradients for use in solvers that do not accept supplied
gradients.
If nonlcon returns a vector c of m components and x has length n, where n is the length of x0, then
the gradient GC of c(x) is an n-by-m matrix, where GC(i,j) is the partial derivative of c(j) with
respect to x(i) (that is, the jth column of GC is the gradient of the jth inequality constraint c(j)).
Likewise, if ceq has p components, the gradient GCeq of ceq(x) is an n-by-p matrix, where
GCeq(i,j) is the partial derivative of ceq(j) with respect to x(i) (that is, the jth column of GCeq
is the gradient of the jth equality constraint ceq(j)).
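As a sketch of this layout, a unit-disk inequality constraint with supplied gradients (hypothetical function name) could be conditionalized as:

```matlab
function [c,ceq,GC,GCeq] = unitdiskgrad(x)
% Nonlinear inequality x(1)^2 + x(2)^2 - 1 <= 0; no equality constraints
c = x(1)^2 + x(2)^2 - 1;
ceq = [];
if nargout > 2               % solver requested the constraint gradients
    GC = [2*x(1); 2*x(2)];   % n-by-m: column j is the gradient of c(j)
    GCeq = [];               % empty because ceq is empty
end
end
```

Enable the supplied gradients with `optimoptions('fminimax','SpecifyConstraintGradient',true)`.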
Note Because Optimization Toolbox functions accept only inputs of type double, user-supplied
objective and nonlinear constraint functions must return outputs of type double.
See “Passing Extra Parameters” on page 2-57 for an explanation of how to parameterize the nonlinear
constraint function nonlcon, if necessary.
Data Types: char | function_handle | string
options — Optimization options
output of optimoptions | structure such as optimset returns
Optimization options, specified as the output of optimoptions or a structure such as optimset returns. Some options are absent from the optimoptions display. These options appear in italics in the following table. For details, see “View Options” on page 2-66.
For details about options that have different names for optimset, see “Current and Legacy Option
Name Tables” on page 15-21.
Option Description
AbsoluteMaxObjectiveCount Number of elements of Fi(x) for which to minimize the
absolute value of Fi. See “Solve Minimax Problem Using
Absolute Value of One Objective” on page 16-99.
Option Description
FiniteDifferenceStepSize Scalar or vector step size factor for finite differences.
When you set FiniteDifferenceStepSize to a vector
v, the forward finite differences delta are
delta = v.*sign′(x).*max(abs(x),TypicalX);
where sign′(x) = sign(x), except sign′(0) = 1. Central
finite differences are
delta = v.*max(abs(x),TypicalX);
Option Description
MaxSQPIter Maximum number of SQP iterations allowed (a positive
integer). The default is 10*max(numberOfVariables,
numberOfInequalities + numberOfBounds).
MeritFunction If this option is set to 'multiobj' (the default), use the
goal attainment or minimax merit function. If this option is
set to 'singleobj', use the fmincon merit function.
OptimalityTolerance Termination tolerance on the first-order optimality (a
positive scalar). The default is 1e-6. See “First-Order
Optimality Measure” on page 3-11.
Option Description
SpecifyConstraintGradient Gradient for nonlinear constraint functions defined by the
user. When this option is set to true, fminimax expects
the constraint function to have four outputs, as described
in nonlcon. When this option is set to false (the default),
fminimax estimates gradients of the nonlinear constraints
using finite differences.
Example: optimoptions('fminimax','PlotFcn','optimplotfval')
You must supply at least the objective, x0, solver, and options fields in the problem structure.
The simplest way to obtain a problem structure is to export the problem from the Optimization app.
Data Types: struct
Output Arguments
x — Solution
real vector | real array
Solution, returned as a real vector or real array. The size of x is the same as the size of x0. Typically,
x is a local solution to the problem when exitflag is positive. For information on the quality of the
solution, see “When the Solver Succeeds” on page 4-18.
fval — Objective function values at solution
real array
Objective function values at the solution, returned as a real array. Generally, fval = fun(x).
maxfval — Maximum of objective function values
real scalar
Maximum of the objective function values at the solution, returned as a real scalar. maxfval = max(fval(:)).
output — Information about the optimization process
structure
Information about the optimization process, returned as a structure with the fields in this table.
lambda — Lagrange multipliers at the solution
structure
Lagrange multipliers at the solution, returned as a structure with the fields in this table.
Algorithms
fminimax solves a minimax problem by converting it into a goal attainment problem, and then
solving the converted goal attainment problem using fgoalattain. The conversion sets all goals to
0 and all weights to 1. See “Equation 8-1” in “Multiobjective Optimization Algorithms” on page 8-2.
Extended Capabilities
Automatic Parallel Support
Accelerate code by automatically running computation in parallel using Parallel Computing Toolbox™.
options = optimoptions('solvername','UseParallel',true)
For more information, see “Using Parallel Computing in Optimization Toolbox” on page 14-5.
See Also
fgoalattain | optimoptions
Topics
“Create Function Handle” (MATLAB)
“Multiobjective Optimization”
fminsearch
Find minimum of unconstrained multivariable function using derivative-free method
Syntax
x = fminsearch(fun,x0)
x = fminsearch(fun,x0,options)
x = fminsearch(problem)
[x,fval] = fminsearch( ___ )
[x,fval,exitflag] = fminsearch( ___ )
[x,fval,exitflag,output] = fminsearch( ___ )
Description
Nonlinear programming solver. Searches for the minimum of a problem specified by
min f (x)
x
f(x) is a function that returns a scalar, and x is a vector or a matrix; see “Matrix Arguments” on page
2-31.
x = fminsearch(fun,x0) starts at the point x0 and attempts to find a local minimum x of the
function described in fun.
x = fminsearch(problem) finds the minimum for problem, where problem is a structure. Create
problem by exporting a problem from Optimization app, as described in “Exporting Your Work” on
page 5-9.
[x,fval] = fminsearch( ___ ), for any previous input syntax, returns in fval the value of the
objective function fun at the solution x.
Examples
Minimize Rosenbrock's function, a notoriously difficult optimization problem for many algorithms:
f(x) = 100(x₂ − x₁²)² + (1 − x₁)².
Set the start point to x0 = [-1.2,1] and minimize Rosenbrock's function using fminsearch.
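A minimal sketch of this call, using an anonymous function for the Rosenbrock objective defined above:

```matlab
% Rosenbrock's function as an anonymous function of a 2-element vector
fun = @(x)100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
x0 = [-1.2,1];           % standard difficult start point
x = fminsearch(fun,x0)   % the global minimum is at [1,1]
```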
x = 1×2
1.0000 1.0000
Set options to plot the objective function value at each iteration.
options = optimset('PlotFcns',@optimplotfval);
Minimize Rosenbrock's function,
f(x) = 100(x₂ − x₁²)² + (1 − x₁)².
Set the start point to x0 = [-1.2,1] and minimize Rosenbrock's function using fminsearch with the plotting options.
x = 1×2
1.0000 1.0000
Minimize an objective function whose values are given by executing a file. A function file must accept
a real vector x and return a real scalar that is the value of the objective function.
Copy the following code and include it as a file named objectivefcn1.m on your MATLAB® path.
function f = objectivefcn1(x)
f = 0;
for k = -10:10
f = f + exp(-(x(1)-x(2))^2 - 2*x(1)^2)*cos(x(2))*sin(2*x(2));
end
x0 = [0.25,-0.25];
x = fminsearch(@objectivefcn1,x0)
x =
-0.1696 -0.5086
Sometimes your objective function has extra parameters. These parameters are not variables to optimize; they are fixed values during the optimization. For example, suppose that you have a
parameter a in the Rosenbrock-type function
f(x, a) = 100(x₂ − x₁²)² + (a − x₁)².
This function has a minimum value of 0 at x₁ = a, x₂ = a². If, for example, a = 3, you can include the
parameter in your objective function by creating an anonymous function.
Create the objective function with its extra parameter as an extra argument.
f = @(x,a)100*(x(2) - x(1)^2)^2 + (a - x(1))^2;
Set the parameter value.
a = 3;
Create an anonymous function of x alone that includes the workspace value of the parameter.
fun = @(x)f(x,a);
x0 = [-1,1.9];
x = fminsearch(fun,x0)
x = 1×2
3.0000 9.0000
For more information about using extra parameters in your objective function, see “Parameterizing
Functions” (MATLAB).
Find both the location and value of a minimum of an objective function using fminsearch.
x0 = [1,2,3];
fun = @(x)-norm(x+x0)^2*exp(-norm(x-x0)^2 + sum(x));
Find the minimum of fun starting at x0. Find the value of the minimum as well.
[x,fval] = fminsearch(fun,x0)
x = 1×3
fval = -5.9565e+04
Inspect the results of an optimization, both while it is running and after it finishes.
Set options to provide iterative display, which gives information on the optimization as the solver
runs. Also, set a plot function to show the objective function value as the solver runs.
options = optimset('Display','iter','PlotFcns',@optimplotfval);
function f = objectivefcn1(x)
f = 0;
for k = -10:10
    f = f + exp(-(x(1)-x(2))^2 - 2*x(1)^2)*cos(x(2))*sin(2*x(2));
end
x0 = [0.25,-0.25];
fun = @objectivefcn1;
Obtain all solver outputs. Use these outputs to inspect the results after the solver finishes.
[x,fval,exitflag,output] = fminsearch(fun,x0,options)
Optimization terminated:
the current x satisfies the termination criteria using OPTIONS.TolX of 1.000000e-04
and F(X) satisfies the convergence criteria using OPTIONS.TolFun of 1.000000e-04
x =
-0.1696 -0.5086
fval =
-13.1310
exitflag = 1
output =
iterations: 35
funcCount: 69
algorithm: 'Nelder-Mead simplex direct search'
message: 'Optimization terminated:...'
The output structure shows the number of iterations. The iterative display and the plot show this
information as well. The output structure also shows the number of function evaluations, which the
iterative display shows, but the chosen plot function does not.
Input Arguments
fun — Function to minimize
function handle | function name
Function to minimize, specified as a function handle or function name. fun is a function that accepts
a vector or array x and returns a real scalar f (the objective function evaluated at x).
Specify fun as a function handle for a file:

x = fminsearch(@myfun,x0)

where myfun is a MATLAB function such as

function f = myfun(x)
f = ...            % Compute function value at x
You can also specify fun as a function handle for an anonymous function:
x = fminsearch(@(x)norm(x)^2,x0);
x0 — Initial point
real vector | real array
Initial point, specified as a real vector or real array. Solvers use the number of elements in x0 and the
size of x0 to determine the number and size of variables that fun accepts.
Example: x0 = [1,2,3,4]
Data Types: double
options — Optimization options
structure such as optimset returns

Optimization options, specified as a structure such as optimset returns. You can use optimset to set or change the values of these fields in the options structure. See "Optimization Options Reference" on page 15-6 for detailed information.
PlotFcns Plots various measures of progress while the algorithm executes. The default is none ([]). Custom plot functions use the same syntax as output functions. See "Output Functions" on page 3-32 and "Output Function Syntax" on page 15-26.
TolFun Termination tolerance on the function value, a positive scalar. The default is 1e-4. See "Tolerances and Stopping Criteria" on page 2-68. Unlike other solvers, fminsearch stops when it satisfies both TolFun and TolX.
TolX Termination tolerance on x, a positive scalar. The default value is
1e-4. See “Tolerances and Stopping Criteria” on page 2-68. Unlike
other solvers, fminsearch stops when it satisfies both TolFun and
TolX.
problem — Problem structure
structure

Problem structure, specified as a structure with the fields objective, x0, solver ('fminsearch'), and options. The simplest way to obtain a problem structure is to export the problem from the Optimization app.
Data Types: struct
Output Arguments
x — Solution
real vector | real array
Solution, returned as a real vector or real array. The size of x is the same as the size of x0. Typically,
x is a local solution to the problem when exitflag is positive. For information on the quality of the
solution, see “When the Solver Succeeds” on page 4-18.
fval — Objective function value
real number

Objective function value at the solution, returned as a real number. Generally, fval = fun(x).
Tips
• fminsearch only minimizes over the real numbers; that is, x must consist only of real numbers, and f(x) must return only real numbers. When x has complex values, split x into real and imaginary parts.
• Use fminsearch to solve nondifferentiable problems or problems with discontinuities,
particularly if no discontinuity occurs near the solution.
• fminsearch is generally less efficient than fminunc, especially for problems of dimension
greater than two. However, when the problem is discontinuous, fminsearch can be more robust
than fminunc.
• fminsearch is not the preferred solver for problems that are sums of squares, that is, of the form
min_x ||f(x)||_2^2 = min_x ( f_1(x)^2 + f_2(x)^2 + ... + f_n(x)^2 )
Instead, use the lsqnonlin function, which has been optimized for problems of this form.
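For example (a minimal sketch using Rosenbrock's function, which is a sum of two squares), supply the vector of residuals to lsqnonlin instead of squaring and summing them for fminsearch:

```matlab
% Residuals r1 = 10*(x2 - x1^2) and r2 = 1 - x1, so that
% sum(r.^2) is Rosenbrock's function
fun = @(x)[10*(x(2) - x(1)^2), 1 - x(1)];
x0 = [-1.2,1];
x = lsqnonlin(fun,x0)   % should converge to [1 1]
```

lsqnonlin exploits the sum-of-squares structure, so it typically converges faster and more reliably on such problems than a general-purpose solver.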
Algorithms
fminsearch uses the simplex search method of Lagarias et al. [1]. This is a direct search method
that does not use numerical or analytic gradients as in fminunc. The algorithm is described in detail
in “fminsearch Algorithm” on page 6-9. The algorithm is not guaranteed to converge to a local
minimum.
References
[1] Lagarias, J. C., J. A. Reeds, M. H. Wright, and P. E. Wright. “Convergence Properties of the Nelder-Mead Simplex Method in Low Dimensions.” SIAM Journal on Optimization. Vol. 9, Number 1, 1998, pp. 112–147.
Extended Capabilities
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
• fminsearch ignores the Display option and does not give iterative display or an exit message.
To check solution quality, examine the exit flag.
• The output structure does not include the algorithm or message fields.
• fminsearch ignores the OutputFcn and PlotFcns options.
See Also
fminbnd | fminunc | optimset | optimtool
Topics
“Create Function Handle” (MATLAB)
“Anonymous Functions” (MATLAB)
fminunc
Find minimum of unconstrained multivariable function
Syntax
x = fminunc(fun,x0)
x = fminunc(fun,x0,options)
x = fminunc(problem)
[x,fval] = fminunc( ___ )
[x,fval,exitflag,output] = fminunc( ___ )
[x,fval,exitflag,output,grad,hessian] = fminunc( ___ )
Description
Nonlinear programming solver. Finds the minimum of a problem specified by

min_x f(x)

where f(x) is a function that returns a scalar, and x is a vector or a matrix.
x = fminunc(fun,x0) starts at the point x0 and attempts to find a local minimum x of the function
described in fun. The point x0 can be a scalar, vector, or matrix.
Note “Passing Extra Parameters” on page 2-57 explains how to pass extra parameters to the
objective function and nonlinear constraint functions, if necessary.
fminunc is for nonlinear problems without constraints. If your problem has constraints, generally
use fmincon. See “Optimization Decision Table” on page 2-4.
x = fminunc(problem) finds the minimum for problem, where problem is a structure described
in “Input Arguments” on page 16-131. Create the problem structure by exporting a problem from
Optimization app, as described in “Exporting Your Work” on page 5-9.
[x,fval] = fminunc( ___ ), for any syntax, returns the value of the objective function fun at the
solution x.
Examples
Minimize a Polynomial

Minimize the polynomial f(x) = 3x1^2 + 2x1x2 + x2^2 − 4x1 + 5x2. Specify the function as an anonymous function and set the start point.

fun = @(x)3*x(1)^2 + 2*x(1)*x(2) + x(2)^2 - 4*x(1) + 5*x(2);
x0 = [1,1];
[x,fval] = fminunc(fun,x0)
x = 1×2
2.2500 -4.7500
fval = -16.3750
Supply Gradient
fminunc can be faster and more reliable when you provide derivatives.
Write an objective function that returns the gradient as well as the function value. Use the
conditionalized form described in “Including Gradients and Hessians” on page 2-19. The objective
function is Rosenbrock's function,
f(x) = 100(x2 − x1^2)^2 + (1 − x1)^2,

which has gradient

∇f(x) = [−400(x2 − x1^2)x1 − 2(1 − x1);  200(x2 − x1^2)].
The code for the objective function with gradient appears at the end of this example.
Create options to use the objective function’s gradient. Also, set the algorithm to 'trust-region'.
options = optimoptions('fminunc','Algorithm','trust-region','SpecifyObjectiveGradient',true);
x0 = [-1,2];
fun = @rosenbrockwithgrad;
x = fminunc(fun,x0,options)
x = 1×2
1.0000 1.0000
The following code creates the rosenbrockwithgrad function, which includes the gradient as the
second output.
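Based on the function and gradient formulas given above, rosenbrockwithgrad can be written in the conditionalized form (this reconstruction follows directly from the formulas in this example):

```matlab
function [f,g] = rosenbrockwithgrad(x)
% Calculate objective f
f = 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
if nargout > 1   % gradient required
    g = [-400*(x(2) - x(1)^2)*x(1) - 2*(1 - x(1));
          200*(x(2) - x(1)^2)];
end
end
```

The nargout check means the gradient is computed only when the solver requests it, which is the conditionalized form described in "Including Gradients and Hessians" on page 2-19.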
Solve the same problem as in “Supply Gradient” on page 16-127 using a problem structure instead of
separate arguments.
Write an objective function that returns the gradient as well as the function value. Use the
conditionalized form described in “Including Gradients and Hessians” on page 2-19. The objective
function is Rosenbrock's function,
f(x) = 100(x2 − x1^2)^2 + (1 − x1)^2,

which has gradient

∇f(x) = [−400(x2 − x1^2)x1 − 2(1 − x1);  200(x2 − x1^2)].
The code for the objective function with gradient appears at the end of this example.
Create options to use the objective function’s gradient. Also, set the algorithm to 'trust-region'.
options = optimoptions('fminunc','Algorithm','trust-region','SpecifyObjectiveGradient',true);
Create a problem structure including the initial point x0 = [-1,2]. For the required fields in this structure, see problem.
problem.options = options;
problem.x0 = [-1,2];
problem.objective = @rosenbrockwithgrad;
problem.solver = 'fminunc';
x = fminunc(problem)
x = 1×2
1.0000 1.0000
The rosenbrockwithgrad function, which includes the gradient as the second output, is the same as in "Supply Gradient" on page 16-127.
Find both the location of the minimum of a nonlinear function and the value of the function at that
minimum. The objective function is
f(x) = x(1)*exp(-(x(1)^2 + x(2)^2)) + (x(1)^2 + x(2)^2)/20.
Find the location and objective function value of the minimizer starting at x0 = [1,2].
fun = @(x)x(1)*exp(-(x(1)^2 + x(2)^2)) + (x(1)^2 + x(2)^2)/20;
x0 = [1,2];
[x,fval] = fminunc(fun,x0)
x = 1×2
-0.6691 0.0000
fval = -0.4052
Set options to obtain iterative display and use the 'quasi-newton' algorithm.
options = optimoptions(@fminunc,'Display','iter','Algorithm','quasi-newton');
f(x) = x(1)*exp(-(x(1)^2 + x(2)^2)) + (x(1)^2 + x(2)^2)/20.
Start the minimization at x0 = [1,2], and obtain outputs that enable you to examine the solution
quality and process.
fun = @(x)x(1)*exp(-(x(1)^2 + x(2)^2)) + (x(1)^2 + x(2)^2)/20;
x0 = [1,2];
[x,fval,exitflag,output] = fminunc(fun,x0,options)
First-order
Iteration Func-count f(x) Step-size optimality
0 3 0.256738 0.173
1 6 0.222149 1 0.131
2 9 0.15717 1 0.158
3 18 -0.227902 0.438133 0.386
4 21 -0.299271 1 0.46
5 30 -0.404028 0.102071 0.0458
6 33 -0.404868 1 0.0296
7 36 -0.405236 1 0.00119
8 39 -0.405237 1 0.000252
9 42 -0.405237 1 7.97e-07
x = 1×2
-0.6691 0.0000
fval = -0.4052
exitflag = 1
algorithm: 'quasi-newton'
message: '...'
Input Arguments
fun — Function to minimize
function handle | function name
Function to minimize, specified as a function handle or function name. fun is a function that accepts
a vector or array x and returns a real scalar f, the objective function evaluated at x.
You can also specify fun as a function handle for an anonymous function:
x = fminunc(@(x)norm(x)^2,x0);
If you can compute the gradient of fun and the SpecifyObjectiveGradient option is set to true,
as set by
options = optimoptions('fminunc','SpecifyObjectiveGradient',true)
then fun must return the gradient vector g(x) in the second output argument.
If you can also compute the Hessian matrix and the HessianFcn option is set to 'objective' via
options = optimoptions('fminunc','HessianFcn','objective') and the Algorithm
option is set to 'trust-region', fun must return the Hessian value H(x), a symmetric matrix, in a
third output argument. fun can give a sparse Hessian. See “Hessian for fminunc trust-region or
fmincon trust-region-reflective algorithms” on page 2-21 for details.
The trust-region algorithm allows you to supply a Hessian multiply function. This function gives
the result of a Hessian-times-vector product without computing the Hessian directly. This can save
memory. See “Hessian Multiply Function” on page 2-23.
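As a hedged sketch of the idea: only the W = hmfun(Hinfo,Y) calling convention comes from this reference page; the factored-Hessian structure below is a hypothetical assumption for illustration.

```matlab
function W = hmfun(Hinfo,Y)
% Return W = H*Y for a Hessian of the assumed form H = Hinfo'*Hinfo,
% without ever forming the n-by-n matrix H. Hinfo is the third output
% of the objective function: [f,g,Hinfo] = fun(x).
W = Hinfo'*(Hinfo*Y);   % two thin matrix products instead of forming H
end
```

Because only matrix-vector products with Hinfo are needed, memory use stays proportional to the size of Hinfo rather than to n^2.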
Example: fun = @(x)sin(x(1))*cos(x(2))
Data Types: char | function_handle | string
x0 — Initial point
real vector | real array
Initial point, specified as a real vector or real array. Solvers use the number of elements in x0 and the
size of x0 to determine the number and size of variables that fun accepts.
Example: x0 = [1,2,3,4]
Data Types: double
Some options apply to all algorithms, and others are relevant for particular algorithms. See
“Optimization Options Reference” on page 15-6 for detailed information.
Some options are absent from the optimoptions display. These options appear in italics in the
following table. For details, see “View Options” on page 2-66.
All Algorithms
Algorithm Choose the fminunc algorithm. Choices are 'quasi-newton'
(default) or 'trust-region'.
FiniteDifferenceStepSize Scalar or vector step size factor for finite differences. When you set FiniteDifferenceStepSize to a vector v, the forward finite differences delta are

delta = v.*sign′(x).*max(abs(x),TypicalX);

where sign′(x) = sign(x) except sign′(0) = 1. Central finite differences are

delta = v.*max(abs(x),TypicalX);
MaxIterations Maximum number of iterations allowed, a positive integer. The default is 400.

For optimset, the name is MaxIter. See "Current and Legacy Option Name Tables" on page 15-21.
OptimalityTolerance Termination tolerance on the first-order optimality (a positive scalar). The default is 1e-6. See "First-Order Optimality Measure" on page 3-11.
For optimset, the name is TolFun. See “Current and Legacy Option
Name Tables” on page 15-21.
OutputFcn Specify one or more user-defined functions that an optimization
function calls at each iteration. Pass a function handle or a cell array of
function handles. The default is none ([]). See “Output Function
Syntax” on page 15-26.
PlotFcn Plots various measures of progress while the algorithm executes; select
from predefined plots or write your own. Pass a built-in plot function
name, a function handle, or a cell array of built-in plot function names
or function handles. For custom plot functions, pass function handles.
The default is none ([]):
Custom plot functions use the same syntax as output functions. See
“Output Functions” on page 3-32 and “Output Function Syntax” on
page 15-26.
SpecifyObjectiveGradient Gradient for the objective function defined by the user. See the description of fun to see how to define the gradient in fun. Set to
true to have fminunc use a user-defined gradient of the objective
function. The default false causes fminunc to estimate gradients
using finite differences. You must provide the gradient, and set
SpecifyObjectiveGradient to true, to use the trust-region
algorithm. This option is not required for the quasi-Newton algorithm.
For optimset, the name is GradObj and the values are 'on' or
'off'. See “Current and Legacy Option Name Tables” on page 15-21.
StepTolerance Termination tolerance on x, a positive scalar. The default value is 1e-6.
See “Tolerances and Stopping Criteria” on page 2-68.
For optimset, the name is TolX. See “Current and Legacy Option
Name Tables” on page 15-21.
TypicalX Typical x values. The number of elements in TypicalX is equal to the
number of elements in x0, the starting point. The default value is
ones(numberofvariables,1). fminunc uses TypicalX for scaling
finite differences for gradient estimation.
HessianFcn If set to [] (default), fminunc approximates the Hessian using finite differences. If set to 'objective', fminunc uses a user-defined Hessian, returned as the third output of fun (trust-region algorithm only).
For optimset, the name is HessFcn. See “Current and Legacy Option
Name Tables” on page 15-21.
HessianMultiplyFcn Hessian multiply function, specified as a function handle. For large-scale structured problems, this function computes the Hessian matrix product H*Y without actually forming H. The function has the form

W = hmfun(Hinfo,Y)

where Hinfo contains the matrix used to compute H*Y.
The first argument is the same as the third argument returned by the
objective function fun, for example
[f,g,Hinfo] = fun(x)
problem — Problem structure
structure

Problem structure, specified as a structure with the fields objective, x0, solver ('fminunc'), and options. The simplest way to obtain a problem structure is to export the problem from the Optimization app.
Data Types: struct
Output Arguments
x — Solution
real vector | real array
Solution, returned as a real vector or real array. The size of x is the same as the size of x0. Typically,
x is a local solution to the problem when exitflag is positive. For information on the quality of the
solution, see “When the Solver Succeeds” on page 4-18.
fval — Objective function value
real number

Objective function value at the solution, returned as a real number. Generally, fval = fun(x).
grad — Gradient at the solution
real vector

Gradient at the solution, returned as a real vector. grad gives the gradient of fun at the point x(:).
hessian — Approximate Hessian
real matrix

Approximate Hessian, returned as a real matrix. For the meaning of hessian, see "Hessian Output" on page 3-24.
Algorithms
Quasi-Newton Algorithm
The quasi-newton algorithm uses the BFGS Quasi-Newton method with a cubic line search
procedure. This quasi-Newton method uses the BFGS ([1],[5],[8], and [9]) formula for updating the
approximation of the Hessian matrix. You can select the DFP ([4],[6], and [7]) formula, which
approximates the inverse Hessian matrix, by setting the HessUpdate option to 'dfp' (and the
Algorithm option to 'quasi-newton'). You can select a steepest descent method by setting
HessUpdate to 'steepdesc' (and Algorithm to 'quasi-newton'), although this setting is
usually inefficient. See “fminunc quasi-newton Algorithm” on page 6-4.
Trust-Region Algorithm

The trust-region algorithm requires that you supply the gradient in fun and set
SpecifyObjectiveGradient to true using optimoptions. This algorithm is a subspace trust-
region method and is based on the interior-reflective Newton method described in [2] and [3]. Each
iteration involves the approximate solution of a large linear system using the method of
preconditioned conjugate gradients (PCG). See “fminunc trust-region Algorithm” on page 6-2, “Trust-
Region Methods for Nonlinear Minimization” on page 6-2 and “Preconditioned Conjugate Gradient
Method” on page 6-3.
References
[1] Broyden, C. G. “The Convergence of a Class of Double-Rank Minimization Algorithms.” Journal
Inst. Math. Applic., Vol. 6, 1970, pp. 76–90.
[2] Coleman, T. F. and Y. Li. “An Interior, Trust Region Approach for Nonlinear Minimization Subject to
Bounds.” SIAM Journal on Optimization, Vol. 6, 1996, pp. 418–445.
[3] Coleman, T. F. and Y. Li. “On the Convergence of Reflective Newton Methods for Large-Scale
Nonlinear Minimization Subject to Bounds.” Mathematical Programming, Vol. 67, Number 2,
1994, pp. 189–224.
[4] Davidon, W. C. “Variable Metric Method for Minimization.” A.E.C. Research and Development
Report, ANL-5990, 1959.
[5] Fletcher, R. “A New Approach to Variable Metric Algorithms.” Computer Journal, Vol. 13, 1970, pp.
317–322.
[6] Fletcher, R. “Practical Methods of Optimization.” Vol. 1, Unconstrained Optimization, John Wiley
and Sons, 1980.
[7] Fletcher, R. and M. J. D. Powell. “A Rapidly Convergent Descent Method for Minimization.”
Computer Journal, Vol. 6, 1963, pp. 163–168.
[8] Goldfarb, D. “A Family of Variable Metric Updates Derived by Variational Means.” Mathematics of Computing, Vol. 24, 1970, pp. 23–26.

[9] Shanno, D. F. “Conditioning of Quasi-Newton Methods for Function Minimization.” Mathematics of Computing, Vol. 24, 1970, pp. 647–656.
Extended Capabilities
Automatic Parallel Support
Accelerate code by automatically running computation in parallel using Parallel Computing Toolbox™.
options = optimoptions('solvername','UseParallel',true)
For more information, see “Using Parallel Computing in Optimization Toolbox” on page 14-5.
See Also
fmincon | fminsearch | optimoptions
Topics
“Solver-Based Nonlinear Optimization”
“Solver-Based Optimization Problem Setup”
“Unconstrained Nonlinear Optimization Algorithms” on page 6-2
fseminf
Find minimum of semi-infinitely constrained multivariable nonlinear function
Equation
Finds the minimum of a problem specified by

min_x f(x)  such that

A·x ≤ b,
Aeq·x = beq,
lb ≤ x ≤ ub,
c(x) ≤ 0,
ceq(x) = 0,
Ki(x, wi) ≤ 0, 1 ≤ i ≤ ntheta.
b and beq are vectors, A and Aeq are matrices, c(x), ceq(x), and Ki(x,wi) are functions that return
vectors, and f(x) is a function that returns a scalar. f(x), c(x), and ceq(x) can be nonlinear functions.
The semi-infinite constraints Ki(x,wi), which return vectors or matrices, are continuous functions of both x and an additional set of variables w1,w2,...,wn. The variables w1,w2,...,wn are vectors of, at most, length two.
x, lb, and ub can be passed as vectors or matrices; see “Matrix Arguments” on page 2-31.
Syntax
x = fseminf(fun,x0,ntheta,seminfcon)
x = fseminf(fun,x0,ntheta,seminfcon,A,b)
x = fseminf(fun,x0,ntheta,seminfcon,A,b,Aeq,beq)
x = fseminf(fun,x0,ntheta,seminfcon,A,b,Aeq,beq,lb,ub)
x = fseminf(fun,x0,ntheta,seminfcon,A,b,Aeq,beq,lb,ub,options)
x = fseminf(problem)
[x,fval] = fseminf(...)
[x,fval,exitflag] = fseminf(...)
[x,fval,exitflag,output] = fseminf(...)
[x,fval,exitflag,output,lambda] = fseminf(...)
Description
fseminf finds a minimum of a semi-infinitely constrained scalar function of several variables,
starting at an initial estimate. The aim is to minimize f(x) so the constraints hold for all possible
values of wi∈ℜ1 (or wi∈ℜ2). Because it is impossible to calculate all possible values of Ki(x,wi), a
region must be chosen for wi over which to calculate an appropriately sampled set of values.
Note “Passing Extra Parameters” on page 2-57 explains how to pass extra parameters to the
objective function and nonlinear constraint functions, if necessary.
x = fseminf(problem) finds the minimum for problem, where problem is a structure described
in “Input Arguments” on page 16-142.
Create the problem structure by exporting a problem from Optimization app, as described in
“Exporting Your Work” on page 5-9.
[x,fval] = fseminf(...) returns the value of the objective function fun at the solution x.
Note If the specified input bounds for a problem are inconsistent, the output x is x0 and the output
fval is [].
Input Arguments
“Function Input Arguments” on page 15-2 contains general descriptions of arguments passed into
fseminf. This section provides function-specific details for fun, ntheta, options, seminfcon, and
problem:
fun The function to be minimized. fun is a function that accepts a vector x and returns a scalar
f, the objective function evaluated at x. The function fun can be specified as a function
handle for a file
x = fseminf(@myfun,x0,ntheta,seminfcon)

where myfun is a MATLAB function such as

function f = myfun(x)
f = ...            % Compute function value at x

fun can also be an anonymous function:

fun = @(x)sin(x'*x);
If the gradient of fun can also be computed and the SpecifyObjectiveGradient option
is true, as set by
options = optimoptions('fseminf','SpecifyObjectiveGradient',true)
then the function fun must return, in the second output argument, the gradient value g, a
vector, at x.
ntheta The number of semi-infinite constraints.
options “Options” on page 16-146 provides the function-specific details for the options values.
seminfcon The function that computes the vector of nonlinear inequality constraints, c, a vector of
nonlinear equality constraints, ceq, and ntheta semi-infinite constraints (vectors or
matrices) K1, K2,..., Kntheta evaluated over an interval S at the point x. The function
seminfcon can be specified as a function handle.
x = fseminf(@myfun,x0,ntheta,@myinfcon)
S is a recommended sampling interval, which might or might not be used. Return [] for c
and ceq if no such constraints exist.
The vectors or matrices K1, K2, ..., Kntheta contain the semi-infinite constraints
evaluated for a sampled set of values for the independent variables w1, w2, ..., wntheta,
respectively. The two-column matrix, S, contains a recommended sampling interval for
values of w1, w2, ..., wntheta, which are used to evaluate K1, K2, ..., Kntheta. The ith
row of S contains the recommended sampling interval for evaluating Ki. When Ki is a
vector, use only S(i,1) (the second column can be all zeros). When Ki is a matrix,
S(i,2) is used for the sampling of the rows in Ki, S(i,1) is used for the sampling
interval of the columns of Ki (see “Two-Dimensional Semi-Infinite Constraint” on page 6-128). On the first iteration S is NaN, so seminfcon must determine some initial sampling interval.
Note Because Optimization Toolbox functions only accept inputs of type double, user-
supplied objective and nonlinear constraint functions must return outputs of type double.
Output Arguments
“Function Input Arguments” on page 15-2 contains general descriptions of arguments returned by
fseminf. This section provides function-specific details for exitflag, lambda, and output:
exitflag Integer identifying the reason the algorithm terminated. The following lists the
values of exitflag and the corresponding reasons the algorithm terminated.
1 Function converged to a solution x.
4 Magnitude of the search direction was less than the
specified tolerance and constraint violation was less than
options.ConstraintTolerance.
5 Magnitude of directional derivative was less than the
specified tolerance and constraint violation was less than
options.ConstraintTolerance.
0 Number of iterations exceeded
options.MaxIterations or number of function
evaluations exceeded
options.MaxFunctionEvaluations.
-1 Algorithm was terminated by the output function.
-2 No feasible point was found.
lambda Structure containing the Lagrange multipliers at the solution x (separated by
constraint type). The fields of the structure are
lower Lower bounds lb
upper Upper bounds ub
ineqlin Linear inequalities
eqlin Linear equalities
ineqnonlin Nonlinear inequalities
eqnonlin Nonlinear equalities
output Structure containing information about the optimization. The fields of the structure
are
iterations Number of iterations taken
funcCount Number of function evaluations
lssteplength Size of line search step relative to search direction
stepsize Final displacement in x
algorithm Optimization algorithm used
constrviolation Maximum of constraint functions
firstorderopt Measure of first-order optimality
message Exit message
Options
Optimization options used by fseminf. Use optimoptions to set or change options. See
“Optimization Options Reference” on page 15-6 for detailed information.
Some options are absent from the optimoptions display. These options appear in italics in the
following table. For details, see “View Options” on page 2-66.
FiniteDifferenceStepSize Scalar or vector step size factor for finite differences. When
you set FiniteDifferenceStepSize to a vector v, the
forward finite differences delta are
delta = v.*sign′(x).*max(abs(x),TypicalX);

where sign′(x) = sign(x) except sign′(0) = 1. Central finite differences are

delta = v.*max(abs(x),TypicalX);
Notes
The optimization routine fseminf might vary the recommended sampling interval, S, set in
seminfcon, during the computation because values other than the recommended interval might be
more appropriate for efficiency or robustness. Also, the finite region wi, over which Ki(x,wi) is
calculated, is allowed to vary during the optimization, provided that it does not result in significant
changes in the number of local minima in Ki(x,wi).
Examples
Minimize the function

(x − 1)^2,

subject to the bounds

0 ≤ x ≤ 2

and the semi-infinite constraint

g(x, t) = (x − 1/2) − (t − 1/2)^2 ≤ 0 for all 0 ≤ t ≤ 1.

The semi-infinite constraint implies x ≤ 1/2. You can see this by noticing that (t − 1/2)^2 ≥ 0, so the maximum of g(x,t) over t occurs at t = 1/2, where g(x,1/2) = x − 1/2. Therefore, the constraint g(x,t) ≤ 0 for all t is equivalent to x ≤ 1/2, and the constrained minimum of (x − 1)^2 occurs at x = 1/2.
To solve this problem using fseminf, write the objective function as an anonymous function.
objfun = @(x)(x-1)^2;
Write the semi-infinite constraint function, which includes the nonlinear constraints ([ ] in this case),
initial sampling interval for t (0 to 1 in steps of 0.01 in this case), and the semi-infinite constraint
function g(x, t). The function appears as seminfcon at the end of this section.
x0 = 0.2;
ntheta = 1;
x = fseminf(objfun,x0,ntheta,@seminfcon)
x = 0.5000
function [c, ceq, K, s] = seminfcon(x,s)
% No finite nonlinear inequality or equality constraints
c = [];
ceq = [];
% Sample set
if isnan(s)
    % Initial sampling interval
    s = [0.01 0];
end
t = 0:s(1):1;
% Evaluate the semi-infinite constraint
K = (x - 0.5) - (t - 0.5).^2;
Limitations
The function to be minimized, the constraints, and the semi-infinite constraints must be continuous functions of x and w. fseminf might only give local solutions.
When the problem is not feasible, fseminf attempts to minimize the maximum constraint value.
Algorithms
fseminf uses cubic and quadratic interpolation techniques to estimate peak values in the semi-
infinite constraints. The peak values are used to form a set of constraints that are supplied to an SQP
method as in the fmincon function. When the number of constraints changes, Lagrange multipliers
are reallocated to the new set of constraints.
The recommended sampling interval calculation uses the difference between the interpolated peak
values and peak values appearing in the data set to estimate whether the function needs to take more
or fewer points. The function also evaluates the effectiveness of the interpolation by extrapolating the
curve and comparing it to other points in the curve. The recommended sampling interval is decreased
when the peak values are close to constraint boundaries, i.e., zero.
For more details on the algorithm used and the types of procedures displayed under the Procedures
heading when the Display option is set to 'iter' with optimoptions, see also “SQP
Implementation” on page 6-25. For more details on the fseminf algorithm, see “fseminf Problem
Formulation and Algorithm” on page 6-32.
See Also
fmincon | optimoptions | optimtool
Topics
“Create Function Handle” (MATLAB)
“fseminf Problem Formulation and Algorithm” on page 6-32
“Multiobjective Optimization”
fsolve
Solve system of nonlinear equations
Syntax
x = fsolve(fun,x0)
x = fsolve(fun,x0,options)
x = fsolve(problem)
[x,fval] = fsolve( ___ )
[x,fval,exitflag,output] = fsolve( ___ )
[x,fval,exitflag,output,jacobian] = fsolve( ___ )
Description
Nonlinear system solver. Solves a problem specified by

F(x) = 0

for x, where F(x) is a function that returns a vector value.
x = fsolve(fun,x0) starts at x0 and tries to solve the equations fun(x) = 0, an array of zeros.
[x,fval] = fsolve( ___ ), for any syntax, returns the value of the objective function fun at the
solution x.
Examples
This example shows how to solve two nonlinear equations in two variables. The equations are

exp(−exp(−(x1 + x2))) = x2(1 + x1^2)
x1*cos(x2) + x2*sin(x1) = 1/2.

Convert the equations to the form F(x) = 0 and write a function that computes the left-hand side of these two equations.

function F = root2d(x)
F(1) = exp(-exp(-(x(1)+x(2)))) - x(2)*(1 + x(1)^2);
F(2) = x(1)*cos(x(2)) + x(2)*sin(x(1)) - 0.5;
end

Solve the system of equations starting at the point [0,0].
fun = @root2d;
x0 = [0,0];
x = fsolve(fun,x0)
Equation solved.
x =
0.3532 0.6061
Set options to have no display and a plot function that displays the first-order optimality, which should
converge to 0 as the algorithm iterates.
options = optimoptions('fsolve','Display','none','PlotFcn',@optimplotfirstorderopt);
Use the same root2d function as in the previous example, which computes the left-hand side of the two equations.
Solve the nonlinear system starting from the point [0,0] and observe the solution process.
fun = @root2d;
x0 = [0,0];
x = fsolve(fun,x0,options)
x =
0.3532 0.6061
Solve the same problem as in “Solution with Nondefault Options” on page 16-153, but formulate the
problem using a problem structure.
Set options for the problem to have no display and a plot function that displays the first-order
optimality, which should converge to 0 as the algorithm iterates.
problem.options = optimoptions('fsolve','Display','none','PlotFcn',@optimplotfirstorderopt);
Use the same root2d function as in the previous examples, which computes the left-hand side of the two equations.
problem.objective = @root2d;
problem.x0 = [0,0];
problem.solver = 'fsolve';
x = fsolve(problem)
x =
0.3532 0.6061
This example returns the iterative display showing the solution process for the system of two
equations and two unknowns
2x1 − x2 = e^(−x1),
−x1 + 2x2 = e^(−x2).
Define the equations as an anonymous function, set options to return iterative display, and solve the system starting from x0 = [-5;-5].

fun = @(x)[2*x(1) - x(2) - exp(-x(1));
          -x(1) + 2*x(2) - exp(-x(2))];
x0 = [-5;-5];
options = optimoptions('fsolve','Display','iter');
[x,fval] = fsolve(fun,x0,options)
Equation solved.
x = 2×1
0.5671
0.5671
fval = 2×1
1.0e-06 ×
-0.4059
-0.4059
The iterative display shows f(x), which is the square of the norm of the function F(x). This value
decreases to near zero as the iterations proceed. The first-order optimality measure likewise
decreases to near zero as the iterations proceed. These entries show the convergence of the
iterations to a solution. For the meanings of the other entries, see “Iterative Display” on page 3-14.
The fval output gives the function value F(x), which should be zero at a solution (to within the
FunctionTolerance tolerance).
Find a matrix X that satisfies

X*X*X = [1,2;3,4],
starting at the point x0 = [1,1;1,1]. Create an anonymous function that calculates the matrix equation and create the point x0.

fun = @(x)x*x*x - [1,2;3,4];
x0 = ones(2);

Turn off the display.

options = optimoptions('fsolve','Display','off');
Examine the fsolve outputs to see the solution quality and process.
[x,fval,exitflag,output] = fsolve(fun,x0,options)
x = 2×2
-0.1291 0.8602
1.2903 1.1612
fval = 2×2
1.0e-09 ×
-0.1621 0.0780
0.1160 -0.0474
exitflag = 1
The exit flag value 1 indicates that the solution is reliable. To verify this manually, calculate the
residual (sum of squares of fval) to see how close it is to zero.
sum(sum(fval.*fval))
ans = 4.8062e-20
You can see in the output structure how many iterations and function evaluations fsolve performed
to find the solution.
Input Arguments
fun — Nonlinear equations to solve
function handle | function name
Nonlinear equations to solve, specified as a function handle or function name. fun is a function that
accepts a vector x and returns a vector F, the nonlinear equations evaluated at x. The equations to
solve are F = 0 for all components of F. The function fun can be specified as a function handle for a file

x = fsolve(@myfun,x0)

where myfun is a MATLAB function such as

function F = myfun(x)
F = ...            % Compute function values at x

fun can also be a function handle for an anonymous function.

x = fsolve(@(x)sin(x.*x),x0);
If the user-defined values for x and F are arrays, they are converted to vectors using linear indexing
(see “Array Indexing” (MATLAB)).
If the Jacobian can also be computed and the 'SpecifyObjectiveGradient' option is true, set by
options = optimoptions('fsolve','SpecifyObjectiveGradient',true)
the function fun must return, in a second output argument, the Jacobian value J, a matrix, at x.
If fun returns a vector (matrix) of m components and x has length n, where n is the length of x0, the
Jacobian J is an m-by-n matrix where J(i,j) is the partial derivative of F(i) with respect to x(j).
(The Jacobian J is the transpose of the gradient of F.)
Example: fun = @(x)x*x*x-[1,2;3,4]
Data Types: char | function_handle | string
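As an illustration of returning the Jacobian in a second output, here is a sketch reusing the two-equation exponential system from the example above (the function name rootWithJac is illustrative, not from the documentation):

```matlab
% Objective for 2*x1 - x2 = exp(-x1), -x1 + 2*x2 = exp(-x2),
% returning the analytic Jacobian when the solver requests it.
function [F,J] = rootWithJac(x)
F = [2*x(1) - x(2) - exp(-x(1));
     -x(1) + 2*x(2) - exp(-x(2))];
if nargout > 1                      % Jacobian requested
    J = [2 + exp(-x(1)), -1;
         -1,             2 + exp(-x(2))];
end
end
```

A call such as fsolve(@rootWithJac,[0;0],optimoptions('fsolve','SpecifyObjectiveGradient',true)) would then use this analytic Jacobian instead of finite-difference estimates.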
x0 — Initial point
real vector | real array
Initial point, specified as a real vector or real array. fsolve uses the number of elements in and size
of x0 to determine the number and size of variables that fun accepts.
Example: x0 = [1,2,3,4]
Data Types: double
Some options apply to all algorithms, and others are relevant for particular algorithms. See
“Optimization Options Reference” on page 15-6 for detailed information.
Some options are absent from the optimoptions display. These options appear in italics in the
following table. For details, see “View Options” on page 2-66.
All Algorithms
FiniteDifferenceStepSize Scalar or vector step size factor for finite differences. When you set
FiniteDifferenceStepSize to a vector v, the forward finite
differences delta are

delta = v.*sign′(x).*max(abs(x),TypicalX);

where sign′(x) = sign(x) except sign′(0) = 1. Central finite
differences are

delta = v.*max(abs(x),TypicalX);
For optimset, the name is FinDiffRelStep. See “Current and Legacy Option
Name Tables” on page 15-21.
FunValCheck Check whether objective function values are valid. 'on' displays an
error when the objective function returns a value that is complex,
Inf, or NaN. The default, 'off', displays no error.
MaxFunctionEvaluations Maximum number of function evaluations allowed, a positive integer.
The default is 100*numberOfVariables. See “Tolerances and
Stopping Criteria” on page 2-68 and “Iterations and Function Counts”
on page 3-9.
For optimset, the name is MaxFunEvals. See “Current and Legacy Option
Name Tables” on page 15-21.
Custom plot functions use the same syntax as output functions. See
“Output Functions” on page 3-32 and “Output Function Syntax” on
page 15-26.
For optimset, the name is Jacobian and the values are 'on' or
'off'. See “Current and Legacy Option Name Tables” on page 15-21.
StepTolerance Termination tolerance on x, a positive scalar. The default is 1e-6. See
“Tolerances and Stopping Criteria” on page 2-68.
For optimset, the name is TolX. See “Current and Legacy Option
Name Tables” on page 15-21.
TypicalX Typical x values. The number of elements in TypicalX is equal to the
number of elements in x0, the starting point. The default value is
ones(numberofvariables,1). fsolve uses TypicalX for scaling
finite differences for gradient estimation.
W = jmfun(Jinfo,Y,flag)
[F,Jinfo] = fun(x)
• If flag == 0, W = J'*(J*Y).
• If flag > 0, W = J*Y.
• If flag < 0, W = J'*Y.
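The flag convention above can be sketched as follows, assuming (only for illustration) that Jinfo holds the Jacobian matrix itself; in practice Jinfo carries whatever data fun returns in its second output:

```matlab
function W = jmfun(Jinfo,Y,flag)
% Jacobian multiply function: computes J*Y, J'*Y, or J'*(J*Y)
% without the solver ever forming J explicitly.
J = Jinfo;                 % assumption: Jinfo is J itself
if flag == 0
    W = J'*(J*Y);
elseif flag > 0
    W = J*Y;
else
    W = J'*Y;
end
end
```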
The simplest way of obtaining a problem structure is to export the problem from the Optimization
app.
Data Types: struct
Output Arguments
x — Solution
real vector | real array
Solution, returned as a real vector or real array. The size of x is the same as the size of x0. Typically,
x is a local solution to the problem when exitflag is positive. For information on the quality of the
solution, see “When the Solver Succeeds” on page 4-18.
Objective function value at the solution, returned as a real vector. Generally, fval = fun(x).
Jacobian at the solution, returned as a real matrix. jacobian(i,j) is the partial derivative of
fun(i) with respect to x(j) at the solution x.
Limitations
• The function to be solved must be continuous.
• When successful, fsolve gives only one root.
• The default trust-region dogleg method can only be used when the system of equations is square,
i.e., the number of equations equals the number of unknowns. For the Levenberg-Marquardt
method, the system of equations need not be square.
Tips
• For large problems, meaning those with thousands of variables or more, save memory (and
possibly save time) by setting the Algorithm option to 'trust-region' and the
SubproblemAlgorithm option to 'cg'.
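A sketch of the option settings described in this tip; the small example system is an arbitrary illustration, not from the documentation:

```matlab
% Memory-saving settings for large systems: trust-region algorithm
% with a conjugate-gradient subproblem solver.
opts = optimoptions('fsolve', ...
    'Algorithm','trust-region', ...
    'SubproblemAlgorithm','cg');
x = fsolve(@(x) x - cos(x), zeros(5,1), opts);  % solves x = cos(x) componentwise
```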
Algorithms
The Levenberg-Marquardt and trust-region methods are based on the nonlinear least-squares
algorithms also used in lsqnonlin. Use one of these methods if the system may not have a zero. The
algorithm still returns a point where the residual is small. However, if the Jacobian of the system is
singular, the algorithm might converge to a point that is not a solution of the system of equations (see
“Limitations” on page 16-166).
• By default fsolve chooses the trust-region dogleg algorithm. The algorithm is a variant of the
Powell dogleg method described in [8]. It is similar in nature to the algorithm implemented in [7].
See “Trust-Region-Dogleg Algorithm” on page 13-4.
• The trust-region algorithm is a subspace trust-region method and is based on the interior-
reflective Newton method described in [1] and [2]. Each iteration involves the approximate
solution of a large linear system using the method of preconditioned conjugate gradients (PCG).
See “Trust-Region Algorithm” on page 13-2.
• The Levenberg-Marquardt method is described in references [4], [5], and [6]. See “Levenberg-
Marquardt Method” on page 13-5.
References
[1] Coleman, T.F. and Y. Li, “An Interior, Trust Region Approach for Nonlinear Minimization Subject to
Bounds,” SIAM Journal on Optimization, Vol. 6, pp. 418-445, 1996.
[2] Coleman, T.F. and Y. Li, “On the Convergence of Reflective Newton Methods for Large-Scale
Nonlinear Minimization Subject to Bounds,” Mathematical Programming, Vol. 67, Number 2,
pp. 189-224, 1994.
[3] Dennis, J. E. Jr., “Nonlinear Least-Squares,” State of the Art in Numerical Analysis, ed. D. Jacobs,
Academic Press, pp. 269-312.
[4] Levenberg, K., “A Method for the Solution of Certain Problems in Least-Squares,” Quarterly
Applied Mathematics 2, pp. 164-168, 1944.
[5] Marquardt, D., “An Algorithm for Least-squares Estimation of Nonlinear Parameters,” SIAM
Journal Applied Mathematics, Vol. 11, pp. 431-441, 1963.
[6] Moré, J. J., “The Levenberg-Marquardt Algorithm: Implementation and Theory,” Numerical
Analysis, ed. G. A. Watson, Lecture Notes in Mathematics 630, Springer Verlag, pp. 105-116,
1977.
[7] Moré, J. J., B. S. Garbow, and K. E. Hillstrom, User Guide for MINPACK 1, Argonne National
Laboratory, Rept. ANL-80-74, 1980.
[8] Powell, M. J. D., “A Fortran Subroutine for Solving Systems of Nonlinear Algebraic Equations,”
Numerical Methods for Nonlinear Algebraic Equations, P. Rabinowitz, ed., Ch.7, 1970.
Extended Capabilities
Automatic Parallel Support
Accelerate code by automatically running computation in parallel using Parallel Computing Toolbox™.
options = optimoptions('solvername','UseParallel',true)
For more information, see “Using Parallel Computing in Optimization Toolbox” on page 14-5.
See Also
fzero | lsqcurvefit | lsqnonlin | optimoptions
Topics
“Nonlinear Equations with Analytic Jacobian” on page 13-7
“Nonlinear Equations with Finite-Difference Jacobian” on page 13-9
“Nonlinear Equations with Jacobian” on page 13-11
“Nonlinear Equations with Jacobian Sparsity Pattern” on page 13-13
“Nonlinear Systems with Constraints” on page 13-15
“Solver-Based Optimization Problem Setup”
“Equation Solving Algorithms” on page 13-2
fzero
Root of nonlinear function
Syntax
x = fzero(fun,x0)
x = fzero(fun,x0,options)
x = fzero(problem)
Description
x = fzero(fun,x0) tries to find a point x where fun(x) = 0. This solution is where fun(x)
changes sign—fzero cannot find a root of a function such as x^2.
Examples
Calculate π by finding the zero of the sine function near 3.

fun = @sin; % function
x0 = 3;     % initial point
x = fzero(fun,x0)

x = 3.1416

Find the zero of cosine between 1 and 2.

fun = @cos;  % function
x0 = [1 2];  % initial interval
x = fzero(fun,x0)

x = 1.5708
Find a zero of f(x) = x^3 − 2x − 5. First, write a file named f.m containing

function y = f(x)
y = x.^3-2*x-5;

Then find the zero of f near 2.

z = fzero(@f,2)

z =

    2.0946
Since f(x) is a polynomial, you can find the same real zero, and a complex conjugate pair of zeros,
using the roots command.
roots([1 0 -2 -5])
ans =
2.0946
-1.0473 + 1.1359i
-1.0473 - 1.1359i
Find the root of a function that has an extra parameter.

myfun = @(x,c) cos(c*x);  % parameterized function
c = 2;                    % parameter
fun = @(x) myfun(x,c);    % function of x alone
x = fzero(fun,0.1)

x = 0.7854
Nondefault Options
fun = @(x)sin(cosh(x));
x0 = 1;
Examine the solution process by setting options that include plot functions.
options = optimset('PlotFcns',{@optimplotx,@optimplotfval});
x = fzero(fun,x0,options)
x = 1.8115
Define a problem in the Optimization app. Enter optimtool('fzero'), and fill in the problem as
pictured.
Note The Optimization app warns that it will be removed in a future release.
Select File > Export to Workspace, and export the problem as pictured to a variable named
problem.
x = fzero(problem)

x =

    1.8115
Find the point where exp(-exp(-x)) = x, and display information about the solution process.
fun = @(x) exp(-exp(-x)) - x; % function
x0 = [0 1]; % initial interval
options = optimset('Display','iter'); % show iterations
[x fval exitflag output] = fzero(fun,x0,options)
x = 0.5671
fval = 0
exitflag = 1
Input Arguments
fun — Function to solve
function handle | function name
Function to solve, specified as a handle to a scalar-valued function or the name of such a function.
fun accepts a scalar x and returns a scalar fun(x).
fzero solves fun(x) = 0. To solve an equation fun(x) = c(x), instead solve fun2(x) =
fun(x) - c(x) = 0.
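For example, to solve cos(x) = x, a sketch following this recipe:

```matlab
% Rewrite cos(x) = x as fun2(x) = cos(x) - x = 0.
fun2 = @(x) cos(x) - x;
x = fzero(fun2,0)   % root near 0.7391
```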
To include extra parameters in your function, see the example “Root of Function with Extra
Parameter” on page 16-169 and the section “Passing Extra Parameters” on page 2-57.
Example: 'sin'
Example: @myFunction
Example: @(x)(x-a)^5 - 3*x + a - 1
Data Types: char | function_handle | string
x0 — Initial value
scalar | 2-element vector
Initial value, specified as a real scalar or a 2-element real vector.
• Scalar — fzero begins at x0 and tries to locate a point x1 where fun(x1) has the opposite sign
of fun(x0). Then fzero iteratively shrinks the interval where fun changes sign to reach a
solution.
• 2-element vector — fzero checks that fun(x0(1)) and fun(x0(2)) have opposite signs, and
errors if they do not. It then iteratively shrinks the interval where fun changes sign to reach a
solution. An interval x0 must be finite; it cannot contain ±Inf.
Tip Calling fzero with an interval (x0 with two elements) is often faster than calling it with a scalar
x0.
Example: 3
Example: [2,17]
Data Types: double
Options for solution process, specified as a structure. Create or modify the options structure using
optimset. fzero uses these options structure fields.
FunValCheck Check whether function values are valid.

• 'on' displays an error when the function returns a value that is complex, Inf, or NaN.
• The default, 'off', displays no error.
OutputFcn Specify one or more user-defined functions that an optimization function calls
at each iteration, either as a function handle or as a cell array of function
handles. The default is none ([]). See “Output Function Syntax” on page 15-
26.
PlotFcns Plot various measures of progress while the algorithm executes. Select from
predefined plots or write your own. Pass a function handle or a cell array of
function handles. The default is none ([]).
Custom plot functions use the same syntax as output functions. See “Output
Functions” on page 3-32 and “Output Function Syntax” on page 15-26.
TolX Termination tolerance on x, a positive scalar. The default is eps, 2.2204e–16.
See “Tolerances and Stopping Criteria” on page 2-68.
You can generate problem by exporting from Optimization app. See “Importing and Exporting Your
Work” on page 5-8 or “Solve Exported Problem” on page 16-170.
Data Types: struct
Output Arguments
x — Location of root or sign change
real scalar
exitflag — Reason fzero stopped
integer

Integer encoding the exit condition, meaning the reason fzero stopped its iterations.

output — Information about root-finding process
structure

Information about the root-finding process, returned as a structure. The fields of the structure are:
Algorithms
The fzero command is a function file. The algorithm, created by T. Dekker, uses a combination of
bisection, secant, and inverse quadratic interpolation methods. An Algol 60 version, with some
improvements, is given in [1]. A Fortran version, upon which fzero is based, is in [2].
References
[1] Brent, R., Algorithms for Minimization Without Derivatives, Prentice-Hall, 1973.
[2] Forsythe, G. E., M. A. Malcolm, and C. B. Moler, Computer Methods for Mathematical
Computations, Prentice-Hall, 1976.
Extended Capabilities
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
• The fun input argument must be a function handle, and not a structure or character vector.
• fzero ignores all options except for TolX and FunValCheck.
• fzero does not support the fourth output argument, the output structure.
See Also
fminbnd | fsolve | optimset | optimtool | roots
Topics
“Roots of Scalar Functions” (MATLAB)
“Passing Extra Parameters” on page 2-57
infeasibility
Package: optim.problemdef
Syntax
infeas = infeasibility(constr,pt)
Description
infeas = infeasibility(constr,pt) returns the amount of violation of the constraint constr
at the point pt.
Examples
Check whether the point x = 0, y = 3 satisfies the constraint named cons. A point is feasible when
its infeasibility is zero.
pt.x = 0;
pt.y = 3;
infeas = infeasibility(cons,pt)
infeas = 1
infeas = 0
x = optimvar('x',3,2);
cons = sum(x,2) <= [1;3;2];
pt.x = [1,-1;2,3;3,-1];
infeas = infeasibility(cons,pt)
infeas = 3×1
0
2
0
Input Arguments
constr — Optimization constraint
OptimizationEquality object | OptimizationInequality object | OptimizationConstraint
object
Optimization constraint, specified as an OptimizationEquality, OptimizationInequality, or OptimizationConstraint object.
pt — Point to evaluate
structure with field names that match the optimization variable names
Point to evaluate, specified as a structure with field names that match the optimization variable
names, for optimization variables in the constraint. The size of each field in pt must match the size of
the corresponding optimization variable.
Example: pt.x = 5*eye(3)
Data Types: struct
Output Arguments
infeas — Infeasibility of constraint
real array
Infeasibility of constraint, returned as a real array. Each zero entry represents a feasible constraint,
and each positive entry represents an infeasible constraint. The size of infeas is the same as the
size of the constraint constr. For an example of nonscalar infeas, see “Compute Multiple
Constraint Violations” on page 16-176.
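A sketch of using a nonscalar infeas to test overall feasibility, reusing the variables from the example above:

```matlab
x = optimvar('x',3,2);
cons = sum(x,2) <= [1;3;2];    % three row-sum constraints
pt.x = [1,-1;2,3;3,-1];
% A point is feasible only if every infeasibility entry is zero.
allFeasible = all(infeasibility(cons,pt) == 0)  % false: the second row sum is 5 > 3
```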
Warning The problem-based approach does not support complex values in an objective function,
nonlinear equalities, or nonlinear inequalities. If a function calculation has a complex value, even as
an intermediate value, the final result can be incorrect.
See Also
OptimizationConstraint | OptimizationEquality | OptimizationInequality | evaluate
Topics
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2017b
intlinprog
Mixed-integer linear programming (MILP)
Syntax
x = intlinprog(f,intcon,A,b)
x = intlinprog(f,intcon,A,b,Aeq,beq)
x = intlinprog(f,intcon,A,b,Aeq,beq,lb,ub)
x = intlinprog(f,intcon,A,b,Aeq,beq,lb,ub,x0)
x = intlinprog(f,intcon,A,b,Aeq,beq,lb,ub,x0,options)
x = intlinprog(problem)
[x,fval,exitflag,output] = intlinprog( ___ )
Description
Mixed-integer linear programming solver. intlinprog finds the minimum of a problem specified by

min f'*x subject to x(intcon) are integers,
 x                  A ⋅ x ≤ b,
                    Aeq ⋅ x = beq,
                    lb ≤ x ≤ ub.
f, x, intcon, b, beq, lb, and ub are vectors, and A and Aeq are matrices.
You can specify f, intcon, lb, and ub as vectors or arrays. See “Matrix Arguments” on page 2-31.
Note intlinprog applies only to the solver-based approach. For a discussion of the two
optimization approaches, see “First Choose Problem-Based or Solver-Based Approach” on page 1-3.
x = intlinprog(f,intcon,A,b) solves min f'*x such that the components of x in intcon are
integers, and A*x ≤ b.
x = intlinprog(problem) uses a problem structure to encapsulate all solver inputs. You can
import a problem structure from an MPS file using mpsread. You can also create a problem
structure from an OptimizationProblem object by using prob2struct.
Examples
Solve the problem

min 8x1 + x2 subject to x2 is an integer,
 x                      x1 + 2x2 ≥ −14,
                        −4x1 − x2 ≤ −33,
                        2x1 + x2 ≤ 20.
f = [8;1];
intcon = 2;
Convert all inequalities into the form A*x <= b by multiplying “greater than” inequalities by -1.
A = [-1,-2;
-4,-1;
2,1];
b = [14;-33;20];
Call intlinprog.
x = intlinprog(f,intcon,A,b)
Intlinprog stopped at the root node because the objective value is within a gap
tolerance of the optimal value, options.AbsoluteGapTolerance = 0 (the default
value). The intcon variables are integer within tolerance,
options.IntegerTolerance = 1e-05 (the default value).
x = 2×1
6.5000
7.0000
min −3x1 − 2x2 − x3 subject to x3 binary,
 x                             x1, x2 ≥ 0,
                               x1 + x2 + x3 ≤ 7,
                               4x1 + 2x2 + x3 = 12.
Specify the solver inputs.

f = [-3;-2;-1];
intcon = 3;
A = [1,1,1];
b = 7;
Aeq = [4,2,1];
beq = 12;
lb = zeros(3,1);
ub = [Inf;Inf;1]; % enforces x(3) is binary
Call intlinprog.
x = intlinprog(f,intcon,A,b,Aeq,beq,lb,ub)
Intlinprog stopped at the root node because the objective value is within a gap
tolerance of the optimal value, options.AbsoluteGapTolerance = 0 (the default
value). The intcon variables are integer within tolerance,
options.IntegerTolerance = 1e-05 (the default value).
x = 3×1
0
5.5000
1.0000
Compare the number of steps to solve an integer programming problem both with and without an
initial feasible point. The problem has eight variables and four linear equality constraints, and all
variables are restricted to be nonnegative.
Aeq = [22 13 26 33 21 3 14 26
39 16 22 28 26 30 23 24
18 14 29 27 30 38 26 26
41 26 28 36 18 38 16 26];
beq = [ 7872
10466
11322
12058];
N = 8;
lb = zeros(N,1);
intcon = 1:N;
f = [2 10 13 17 7 5 7 3];
Solve the problem without using an initial point, and examine the display to see the number of
branch-and-bound nodes.
[x1,fval1,exitflag1,output1] = intlinprog(f,intcon,[],[],Aeq,beq,lb);
Intlinprog stopped because the objective value is within a gap tolerance of the
optimal value, options.AbsoluteGapTolerance = 0 (the default value). The intcon
variables are integer within tolerance, options.IntegerTolerance = 1e-05 (the
default value).
For comparison, solve the problem using an initial feasible point.
x0 = [8 62 23 103 53 84 46 34];
[x2,fval2,exitflag2,output2] = intlinprog(f,intcon,[],[],Aeq,beq,lb,[],x0);
Intlinprog stopped because the objective value is within a gap tolerance of the
optimal value, options.AbsoluteGapTolerance = 0 (the default value). The intcon
variables are integer within tolerance, options.IntegerTolerance = 1e-05 (the
default value).
Providing an initial point does not always help. For this problem, it saves time and computational
steps, but for some problems an initial point can cause intlinprog to take more steps.
Solve the problem

min −3x1 − 2x2 − x3 subject to x3 binary,
 x                             x1, x2 ≥ 0,
                               x1 + x2 + x3 ≤ 7,
                               4x1 + 2x2 + x3 = 12
f = [-3;-2;-1];
intcon = 3;
A = [1,1,1];
b = 7;
Aeq = [4,2,1];
beq = 12;
lb = zeros(3,1);
ub = [Inf;Inf;1]; % enforces x(3) is binary
x0 = [];
Specify no display.
options = optimoptions('intlinprog','Display','off');
x = intlinprog(f,intcon,A,b,Aeq,beq,lb,ub,x0,options)
x = 3×1
0
5.5000
1.0000
This example shows how to set up a problem using the problem-based approach and then solve it
using the solver-based approach. The problem is
min −3x1 − 2x2 − x3 subject to x3 binary,
 x                             x1, x2 ≥ 0,
                               x1 + x2 + x3 ≤ 7,
                               4x1 + 2x2 + x3 = 12
Create an OptimizationProblem object named prob to represent this problem. To specify a binary
variable, create an optimization variable with integer type, a lower bound of 0, and an upper bound of
1.
x = optimvar('x',2,'LowerBound',0);
xb = optimvar('xb','LowerBound',0,'UpperBound',1,'Type','integer');
prob = optimproblem('Objective',-3*x(1)-2*x(2)-xb);
cons1 = sum(x) + xb <= 7;
cons2 = 4*x(1) + 2*x(2) + xb == 12;
prob.Constraints.cons1 = cons1;
prob.Constraints.cons2 = cons2;
problem = prob2struct(prob);
[sol,fval,exitflag,output] = intlinprog(problem)
Intlinprog stopped at the root node because the objective value is within a gap
tolerance of the optimal value, options.AbsoluteGapTolerance = 0 (the default
value). The intcon variables are integer within tolerance,
options.IntegerTolerance = 1e-05 (the default value).
sol = 3×1
0
5.5000
1.0000
fval = -12
exitflag = 1
Both sol(1) and sol(3) are binary-valued. Which value corresponds to the binary optimization
variable xb?
prob.Variables
The variable xb appears last in the Variables display, so xb corresponds to sol(3) = 1. See
“Algorithms” on page 16-376.
Call intlinprog with more outputs to see solution details and process.
min −3x1 − 2x2 − x3 subject to x3 binary,
 x                             x1, x2 ≥ 0,
                               x1 + x2 + x3 ≤ 7,
                               4x1 + 2x2 + x3 = 12.
Intlinprog stopped at the root node because the objective value is within a gap
tolerance of the optimal value, options.AbsoluteGapTolerance = 0 (the default
value). The intcon variables are integer within tolerance,
options.IntegerTolerance = 1e-05 (the default value).
x = 3×1
0
5.5000
1.0000
fval = -12
exitflag = 1
The output structure shows that numnodes is 0, meaning intlinprog solved the problem before
branching. The absolutegap and relativegap fields are also 0. These values indicate that the
result is reliable.
Input Arguments
f — Coefficient vector
real vector | real array
Coefficient vector, specified as a real vector or real array. The coefficient vector represents the
objective function f'*x. The notation assumes that f is a column vector, but you are free to use a row
vector or array. Internally, intlinprog converts f to the column vector f(:).
If you specify f = [], intlinprog tries to find a feasible point without trying to minimize an
objective function.
Example: f = [4;2;-1.7];
Data Types: double
intcon — Vector of integer constraints
vector of positive integers
Vector of integer constraints, specified as a vector of positive integers. The values in intcon indicate
the components of the decision variable x that are integer-valued. intcon has values from 1 through
numel(f).
intcon can also be an array. Internally, intlinprog converts an array intcon to the vector
intcon(:).
Example: intcon = [1,2,7] means x(1), x(2), and x(7) take only integer values.
Data Types: double
A — Linear inequality constraint matrix
real matrix
Linear inequality constraint matrix, specified as a matrix of doubles. A represents the linear
coefficients in the constraints A*x ≤ b. A has size M-by-N, where M is the number of constraints and N
= numel(f). To save memory, A can be sparse.
Example: A = [4,3;2,0;4,-1]; means three linear inequalities (three rows) for two decision
variables (two columns).
Data Types: double
b — Linear inequality constraint vector
real vector
Linear inequality constraint vector, specified as a vector of doubles. b represents the constant vector
in the constraints A*x ≤ b. b has length M, where A is M-by-N.
Example: [4,0]
Data Types: double
Aeq — Linear equality constraint matrix
real matrix
Linear equality constraint matrix, specified as a matrix of doubles. Aeq represents the linear
coefficients in the constraints Aeq*x = beq. Aeq has size Meq-by-N, where Meq is the number of
constraints and N = numel(f). To save memory, Aeq can be sparse.
Example: Aeq = [4,3;2,0;4,-1]; means three linear equalities (three rows) for two decision
variables (two columns).
Data Types: double
beq — Linear equality constraint vector
real vector
Linear equality constraint vector, specified as a vector of doubles. beq represents the constant vector
in the constraints Aeq*x = beq. beq has length Meq, where Aeq is Meq-by-N.
Example: [4,0]
Data Types: double
lb — Lower bounds
[] (default) | real vector or array
Lower bounds, specified as a vector or array of doubles. lb represents the lower bounds elementwise
in lb ≤ x ≤ ub.
ub — Upper bounds
[] (default) | real vector or array
Upper bounds, specified as a vector or array of doubles. ub represents the upper bounds elementwise
in lb ≤ x ≤ ub.
x0 — Initial point
[] (default) | real array
Initial point, specified as a real array. The number of elements in x0 is the same as the number of
elements of f, when f exists. Otherwise, the number is the same as the number of columns of A or
Aeq. Internally, the solver converts an array x0 into a vector x0(:).
Providing x0 can change the amount of time intlinprog takes to converge. It is difficult to predict
how x0 affects the solver. For suggestions on using appropriate Heuristics with x0, see “Tips” on
page 16-196.
x0 must be feasible with respect to all constraints. If x0 is not feasible, the solver errors. If you do
not have a feasible x0, set x0 = [].
Example: x0 = 100*rand(size(f))
Data Types: double
Some options are absent from the optimoptions display. These options appear in italics in the
following table. For details, see “View Options” on page 2-66.
U – L <= AbsoluteGapTolerance.
Heuristics Algorithm for searching for feasible points (see “Heuristics for Finding Feasible Solutions” on page 9-29):
• 'basic'
• 'intermediate'
• 'advanced'
• 'rss'
• 'rins'
• 'round'
• 'diving'
• 'rss-diving'
• 'rins-diving'
• 'round-diving'
• 'none'
HeuristicsMaxNodes Strictly positive integer that bounds the number of nodes
intlinprog can explore in its branch-and-bound search for feasible
points. Applies only to 'rss' and 'rins'. See “Heuristics for Finding
Feasible Solutions” on page 9-29. Default: 50
IntegerPreprocess Types of integer preprocessing (see “Mixed-Integer
Program Preprocessing” on page 9-27). Default: 'basic'
In this expression, numberOfEqualities means the number of rows of Aeq,
numberOfInequalities means the number of rows of A, and numberOfVariables means the
number of elements of f.
LPOptimalityTolerance Nonnegative real where reduced costs must exceed
LPOptimalityTolerance for a variable to be taken into the basis.
Default: 1e-7
LPPreprocess Type of preprocessing for the solution to the relaxed linear program
(see “Linear Program Preprocessing” on page 9-27): 'none' (no
preprocessing) or 'basic' (use preprocessing). Default: 'basic'
MaxNodes Strictly positive integer that is the maximum number of nodes
intlinprog explores in its branch-and-bound process. Default: 1e7

MaxFeasiblePoints Strictly positive integer. intlinprog stops if it finds
MaxFeasiblePoints integer feasible points. Default: Inf

MaxTime Positive real that is the maximum time in seconds that intlinprog
runs. Default: 7200

NodeSelection Choose the node to explore next. Default: 'simplebestproj'
(U – L) / (abs(U) + 1) <= RelativeGapTolerance, where the tolerance intlinprog uses is
tolerance = min(1/(1+|L|), RelativeGapTolerance).
In this expression, numberOfEqualities means the number of rows of Aeq,
numberOfInequalities means the number of rows of A, and numberOfVariables means the
number of elements of f.
Structure encapsulating the inputs and options, specified with the following fields.
You must specify at least these fields in the problem structure. Other fields are optional:
• f
• intcon
• solver
• options
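A sketch of assembling such a structure by hand, using the data from the first example above; the inequality field names Aineq and bineq follow MATLAB's standard problem-structure naming, which is an assumption here rather than something stated in this section:

```matlab
% Minimal intlinprog problem structure (required fields per the list above).
problem.f = [8;1];                  % objective coefficients
problem.intcon = 2;                 % x(2) must be integer
problem.Aineq = [-1,-2;-4,-1;2,1];  % linear inequality constraints A*x <= b
problem.bineq = [14;-33;20];
problem.solver = 'intlinprog';      % required field
problem.options = optimoptions('intlinprog','Display','off');
x = intlinprog(problem);
```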
Output Arguments
x — Solution
real vector
Solution, returned as a vector that minimizes f'*x subject to all bounds, integer constraints, and
linear constraints.
Algorithm stopping condition, returned as an integer identifying the reason the algorithm stopped.
The following lists the values of exitflag and the corresponding reasons intlinprog stopped.
The exit message can give more detailed information on the reason intlinprog stopped, such as
exceeding a tolerance.
Exit flags 3 and -9 relate to solutions that have large infeasibilities. These usually arise from linear
constraint matrices that have a large condition number, or from problems that have large solution
components. To correct these issues, try to scale the coefficient matrices, eliminate redundant linear
constraints, or give tighter bounds on the variables.
Solution process summary, returned as a structure containing information about the optimization
process.
Limitations
• Often, some supposedly integer-valued components of the solution x(intcon) are not precisely
integers. intlinprog deems as integers all solution values within IntegerTolerance of an
integer.
To round all supposed integers to be exactly integers, use the round function.
x(intcon) = round(x(intcon));
Caution Rounding solutions can cause the solution to become infeasible. Check feasibility after
rounding:
max(A*x - b) % See if entries are not too positive, so have small infeasibility
max(abs(Aeq*x - beq)) % See if entries are near enough to zero
max(x - ub) % Positive entries are violated bounds
max(lb - x) % Positive entries are violated bounds
• intlinprog does not enforce that solution components be integer-valued when their absolute
values exceed 2.1e9. When your solution has such components, intlinprog warns you. If you
receive this warning, check the solution to see whether supposedly integer-valued components of
the solution are close to integers.
• intlinprog does not allow components of the problem, such as coefficients in f, A, or ub, to
exceed 1e25 in absolute value. If you try to run intlinprog with such a problem, intlinprog
issues an error.
• Currently, you cannot run intlinprog in the Optimization app on page 5-2.
Tips
• To specify binary variables, set the variables to be integers in intcon, and give them lower
bounds of 0 and upper bounds of 1.
• Save memory by specifying sparse linear constraint matrices A and Aeq. However, you cannot use
sparse matrices for b and beq.
• If you include an x0 argument, intlinprog uses that value in the 'rins' and guided diving
heuristics until it finds a better integer-feasible point. So when you provide x0, you can obtain
good results by setting the 'Heuristics' option to 'rins-diving' or another setting that uses
'rins'.
• To provide logical indices for integer components, meaning a binary vector with 1 indicating an
integer, convert to intcon form using find. For example,
logicalindices = [1,0,0,1,1,0,0];
intcon = find(logicalindices)
intcon =
1 4 5
• intlinprog replaces bintprog. To update old bintprog code to use intlinprog, make the
following changes:
• Set intcon to 1:numVars, where numVars is the number of variables in your problem.
• Set lb to zeros(numVars,1).
• Set ub to ones(numVars,1).
• Update any relevant options. Use optimoptions to create options for intlinprog.
• Change your call to bintprog as follows:
[x,fval,exitflag,output] = bintprog(f,A,b,Aeq,beq,x0,options)
% Change your call to:
[x,fval,exitflag,output] = intlinprog(f,intcon,A,b,Aeq,beq,lb,ub,x0,options)
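The first tip above (binary variables via intcon plus 0/1 bounds) can be sketched as follows; the objective and constraint data are arbitrary illustrations, not from the documentation:

```matlab
% Knapsack-style sketch: maximize 3x1+5x2+4x3 by minimizing the negation.
f = [-3;-5;-4];            % illustrative objective coefficients
A = [2,3,4];  b = 5;       % one illustrative resource constraint
nvars = numel(f);
intcon = 1:nvars;          % every variable is integer...
lb = zeros(nvars,1);       % ...bounded below by 0
ub = ones(nvars,1);        % ...and above by 1, i.e. binary
x = intlinprog(f,intcon,A,b,[],[],lb,ub);
```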
Compatibility Considerations
Default BranchRule is 'reliability'
Behavior changed in R2019a
The default value of the BranchRule option is 'reliability' instead of 'maxpscost'. In testing,
this value gave better performance on many problems, both in solution times and in number of
explored branching nodes.
On a few problems, the previous branch rule performs better. To get the previous behavior, set the
BranchRule option to 'maxpscost'.
See Also
linprog | mpsread | optimoptions | prob2struct
Topics
“Mixed-Integer Linear Programming Basics: Solver-Based” on page 9-37
“Factory, Warehouse, Sales Allocation Model: Solver-Based” on page 9-40
“Traveling Salesman Problem: Solver-Based” on page 9-49
“Solve Sudoku Puzzles Via Integer Programming: Solver-Based” on page 9-72
“Mixed-Integer Quadratic Programming Portfolio Optimization: Solver-Based” on page 9-65
“Optimal Dispatch of Power Generators: Solver-Based” on page 9-55
“Mixed-Integer Linear Programming Algorithms” on page 9-26
“Tuning Integer Linear Programming” on page 9-35
“Solver-Based Optimization Problem Setup”
Introduced in R2014a
linprog
Solve linear programming problems
Syntax
x = linprog(f,A,b)
x = linprog(f,A,b,Aeq,beq)
x = linprog(f,A,b,Aeq,beq,lb,ub)
x = linprog(f,A,b,Aeq,beq,lb,ub,options)
x = linprog(problem)
[x,fval] = linprog( ___ )
[x,fval,exitflag,output] = linprog( ___ )
[x,fval,exitflag,output,lambda] = linprog( ___ )
Description
Linear programming solver. linprog finds the minimum of a problem specified by

min f'*x such that A ⋅ x ≤ b,
 x                 Aeq ⋅ x = beq,
                   lb ≤ x ≤ ub.
f, x, b, beq, lb, and ub are vectors, and A and Aeq are matrices.
Note linprog applies only to the solver-based approach. For a discussion of the two optimization
approaches, see “First Choose Problem-Based or Solver-Based Approach” on page 1-3.
Note If the specified input bounds for a problem are inconsistent, the output fval is [].
x = linprog(problem) finds the minimum for problem, where problem is a structure described
in “Input Arguments” on page 16-209.
Create the problem structure by exporting a problem from the Optimization app, as described in
“Exporting Your Work” on page 5-9. You can import a problem structure from an MPS file using
mpsread. You can also create a problem structure from an OptimizationProblem object by using
prob2struct.
[x,fval] = linprog( ___ ), for any input arguments, returns the value of the objective function
at the solution x: fval = f'*x.
Examples
Solve a simple linear program defined by linear inequalities.
x(1) + x(2) ≤ 2
x(1) + x(2)/4 ≤ 1
x(1) − x(2) ≤ 2
−x(1)/4 − x(2) ≤ 1
−x(1) − x(2) ≤ − 1
−x(1) + x(2) ≤ 2 .
A = [1 1
1 1/4
1 -1
-1/4 -1
-1 -1
-1 1];
b = [2 1 2 1 -1 2];
f = [-1 -1/3];
x = linprog(f,A,b)
x = 2×1
0.6667
1.3333
Solve a simple linear program defined by linear inequalities and linear equalities.
x(1) + x(2) ≤ 2
x(1) + x(2)/4 ≤ 1
x(1) − x(2) ≤ 2
−x(1)/4 − x(2) ≤ 1
−x(1) − x(2) ≤ − 1
−x(1) + x(2) ≤ 2 .
A = [1 1
1 1/4
1 -1
-1/4 -1
-1 -1
-1 1];
b = [2 1 2 1 -1 2];
Use the linear equality constraint x(1) + x(2)/4 = 1/2.
Aeq = [1 1/4];
beq = 1/2;
Solve the linear program.
f = [-1 -1/3];
x = linprog(f,A,b,Aeq,beq)
x = 2×1
0
2
Solve a simple linear program with linear inequalities, linear equalities, and bounds.
x(1) + x(2) ≤ 2
x(1) + x(2)/4 ≤ 1
x(1) − x(2) ≤ 2
−x(1)/4 − x(2) ≤ 1
−x(1) − x(2) ≤ − 1
−x(1) + x(2) ≤ 2 .
A = [1 1
1 1/4
1 -1
-1/4 -1
-1 -1
-1 1];
b = [2 1 2 1 -1 2];
Aeq = [1 1/4];
beq = 1/2;
−1 ≤ x(1) ≤ 1.5
−0.5 ≤ x(2) ≤ 1.25 .
lb = [-1,-0.5];
ub = [1.5,1.25];
f = [-1 -1/3];
x = linprog(f,A,b,Aeq,beq,lb,ub)
x = 2×1
0.1875
1.2500
x(1) + x(2) ≤ 2
x(1) + x(2)/4 ≤ 1
x(1) − x(2) ≤ 2
−x(1)/4 − x(2) ≤ 1
−x(1) − x(2) ≤ − 1
−x(1) + x(2) ≤ 2 .
A = [1 1
1 1/4
1 -1
-1/4 -1
-1 -1
-1 1];
b = [2 1 2 1 -1 2];
Aeq = [1 1/4];
beq = 1/2;
−1 ≤ x(1) ≤ 1.5
−0.5 ≤ x(2) ≤ 1.25 .
lb = [-1,-0.5];
ub = [1.5,1.25];
f = [-1 -1/3];
options = optimoptions('linprog','Algorithm','interior-point');
x = linprog(f,A,b,Aeq,beq,lb,ub,options)
x = 2×1
0.1875
1.2500
This example shows how to set up a problem using the problem-based approach and then solve it
using the solver-based approach. The problem is
max (x + y/3) over x and y, subject to
x + y ≤ 2
x + y/4 ≤ 1
x − y ≤ 2
x/4 + y ≥ −1
x + y ≥ 1
−x + y ≤ 2
x + y/4 = 1/2
−1 ≤ x ≤ 1.5
−1/2 ≤ y ≤ 1.25
x = optimvar('x','LowerBound',-1,'UpperBound',1.5);
y = optimvar('y','LowerBound',-1/2,'UpperBound',1.25);
prob = optimproblem('Objective',x + y/3,'ObjectiveSense','max');
prob.Constraints.c1 = x + y <= 2;
prob.Constraints.c2 = x + y/4 <= 1;
prob.Constraints.c3 = x - y <= 2;
prob.Constraints.c4 = x/4 + y >= -1;
prob.Constraints.c5 = x + y >= 1;
prob.Constraints.c6 = -x + y <= 2;
prob.Constraints.c7 = x + y/4 == 1/2;
problem = prob2struct(prob);
[sol,fval,exitflag,output] = linprog(problem)
sol = 2×1
0.1875
1.2500
fval = -0.6042
exitflag = 1
constrviolation: 0
message: 'Optimal solution found.'
algorithm: 'dual-simplex'
firstorderopt: 0
The returned fval is negative, even though the solution components are positive. Internally,
prob2struct turns the maximization problem into a minimization problem of the negative of the
objective function. See “Maximizing an Objective” on page 2-30.
Which component of sol corresponds to which optimization variable? Examine the Variables
property of prob.
prob.Variables
As you might expect, sol(1) corresponds to x, and sol(2) corresponds to y. See “Algorithms” on
page 16-376.
Calculate the solution and objective function value for a simple linear program.
x(1) + x(2) ≤ 2
x(1) + x(2)/4 ≤ 1
x(1) − x(2) ≤ 2
−x(1)/4 − x(2) ≤ 1
−x(1) − x(2) ≤ − 1
−x(1) + x(2) ≤ 2 .
A = [1 1
1 1/4
1 -1
-1/4 -1
-1 -1
-1 1];
b = [2 1 2 1 -1 2];
f = [-1 -1/3];
[x,fval] = linprog(f,A,b)
x = 2×1
0.6667
1.3333
fval = -1.1111
Obtain the exit flag and output structure to better understand the solution process and quality.
x(1) + x(2) ≤ 2
x(1) + x(2)/4 ≤ 1
x(1) − x(2) ≤ 2
−x(1)/4 − x(2) ≤ 1
−x(1) − x(2) ≤ − 1
−x(1) + x(2) ≤ 2 .
A = [1 1
1 1/4
1 -1
-1/4 -1
-1 -1
-1 1];
b = [2 1 2 1 -1 2];
Aeq = [1 1/4];
beq = 1/2;
−1 ≤ x(1) ≤ 1.5
−0.5 ≤ x(2) ≤ 1.25 .
lb = [-1,-0.5];
ub = [1.5,1.25];
f = [-1 -1/3];
options = optimoptions('linprog','Algorithm','dual-simplex');
Solve the linear program and request the function value, exit flag, and output structure.
[x,fval,exitflag,output] = linprog(f,A,b,Aeq,beq,lb,ub,options)
x = 2×1
0.1875
1.2500
fval = -0.6042
exitflag = 1
• fval, the objective function value, is larger than the value in “Return the Objective Function
Value” on page 16-205, because there are more constraints.
• exitflag = 1 indicates that the solution is reliable.
• output.iterations = 0 indicates that linprog found the solution during presolve, and did not
have to iterate at all.
Solve a simple linear program and examine the solution and the Lagrange multipliers.
x1 − x2 + x3 ≤ 20
3x1 + 2x2 + 4x3 ≤ 42
3x1 + 2x2 ≤ 30 .
A = [1 -1 1
3 2 4
3 2 0];
b = [20;42;30];
x1 ≥ 0
x2 ≥ 0
x3 ≥ 0 .
lb = zeros(3,1);
Set Aeq and beq to [], indicating that there are no linear equality constraints.
Aeq = [];
beq = [];
f = [-5; -4; -6];
Solve the problem, requesting the Lagrange multiplier structure.
[x,fval,exitflag,output,lambda] = linprog(f,A,b,Aeq,beq,lb);
Examine the solution.
x
x = 3×1
0
15.0000
3.0000
lambda.ineqlin
ans = 3×1
0
1.5000
0.5000
lambda.lower
ans = 3×1
1.0000
0
0
lambda.ineqlin is nonzero for the second and third components of x. This indicates that the
second and third linear inequality constraints are satisfied with equalities:
3x1 + 2x2 + 4x3 = 42
3x1 + 2x2 = 30 .
A*x
ans = 3×1
-12.0000
42.0000
30.0000
lambda.lower is nonzero for the first component of x. This indicates that x(1) is at its lower bound
of 0.
Input Arguments
f — Coefficient vector
real vector | real array
Coefficient vector, specified as a real vector or real array. The coefficient vector represents the
objective function f'*x. The notation assumes that f is a column vector, but you are free to use a row
vector or array. Internally, linprog converts f to the column vector f(:).
Example: f = [1,3,5,-6]
Data Types: double
A — Linear inequality constraints
real matrix
Linear inequality constraints, specified as a real matrix. A is an M-by-N matrix, where M is the number
of inequalities, and N is the number of variables (length of f). For large problems, pass A as a sparse
matrix.
A encodes the M linear inequalities
A*x <= b,
where x is the column vector of N variables x(:), and b is a column vector with M elements.
For example, consider these inequalities:
x1 + 2x2 ≤ 10
3x1 + 4x2 ≤ 20
5x1 + 6x2 ≤ 30,
A = [1,2;3,4;5,6];
b = [10;20;30];
Example: To specify that the x-components add up to 1 or less, take A = ones(1,N) and b = 1
Data Types: double
Aeq — Linear equality constraints
real matrix
Linear equality constraints, specified as a real matrix. Aeq is an Me-by-N matrix, where Me is the
number of equalities, and N is the number of variables (length of f). For large problems, pass Aeq as
a sparse matrix.
Aeq encodes the Me linear equalities
Aeq*x = beq,
where x is the column vector of N variables x(:), and beq is a column vector with Me elements.
x1 + 2x2 + 3x3 = 10
2x1 + 4x2 + x3 = 20,
Specify the equalities by entering the following constraints:
Aeq = [1,2,3;2,4,1];
beq = [10;20];
Example: To specify that the x-components sum to 1, take Aeq = ones(1,N) and beq = 1
Data Types: double
b — Linear inequality constraints
real vector
Linear inequality constraints, specified as a real vector. b is an M-element vector related to the A
matrix. If you pass b as a row vector, solvers internally convert b to the column vector b(:). For
large problems, pass b as a sparse vector.
b encodes the M linear inequalities
A*x <= b,
where x is the column vector of N variables x(:), and A is a matrix of size M-by-N.
x1 + 2x2 ≤ 10
3x1 + 4x2 ≤ 20
5x1 + 6x2 ≤ 30,
Example: To specify that the x components sum to 1 or less, use A = ones(1,N) and b = 1.
Data Types: double
beq — Linear equality constraints
real vector
Linear equality constraints, specified as a real vector. beq is an Me-element vector related to the Aeq
matrix. If you pass beq as a row vector, solvers internally convert beq to the column vector beq(:).
For large problems, pass beq as a sparse vector.
beq encodes the Me linear equalities
Aeq*x = beq,
where x is the column vector of N variables x(:), and Aeq is a matrix of size Me-by-N.
x1 + 2x2 + 3x3 = 10
2x1 + 4x2 + x3 = 20,
Aeq = [1,2,3;2,4,1];
beq = [10;20];
Example: To specify that the x components sum to 1, use Aeq = ones(1,N) and beq = 1.
Data Types: double
lb — Lower bounds
real vector | real array
Lower bounds, specified as a real vector or real array. If the length of f is equal to that of lb, then lb
specifies that
x(i) >= lb(i) for all i.
ub — Upper bounds
real vector | real array
Upper bounds, specified as a real vector or real array. If the length of f is equal to that of ub, then ub
specifies that
x(i) <= ub(i) for all i.
Some options apply to all algorithms, and others are relevant for particular algorithms. See
“Optimization Options Reference” on page 15-6 for detailed information.
Some options are absent from the optimoptions display. These options appear in italics in the
following table. For details, see “View Options” on page 2-66.
All Algorithms
Algorithm Choose the optimization algorithm:
• 'dual-simplex' (default)
• 'interior-point-legacy'
• 'interior-point'
MaxIterations Maximum number of iterations allowed, a positive integer.
For optimset, the name is MaxIter. See “Current and Legacy Option
Name Tables” on page 15-21.
OptimalityTolerance Termination tolerance on the dual feasibility, a positive scalar. The
default depends on the algorithm.
For optimset, the name is TolFun. See “Current and Legacy Option
Name Tables” on page 15-21.
interior-point Algorithm
ConstraintTolerance Feasibility tolerance for constraints, a positive scalar.
For optimset, the name is TolCon. See “Current and Legacy Option
Name Tables” on page 15-21.
Preprocess Level of LP preprocessing prior to algorithm iterations. Specify
'basic' (default) or 'none'.
Dual-Simplex Algorithm
ConstraintTolerance Feasibility tolerance for constraints, a scalar from 1e-10 through
1e-3. ConstraintTolerance measures primal feasibility tolerance.
The default is 1e-4.
For optimset, the name is TolCon. See “Current and Legacy Option
Name Tables” on page 15-21.
MaxTime Maximum amount of time in seconds that the algorithm runs. The
default is Inf.
Preprocess Level of LP preprocessing prior to dual simplex algorithm iterations.
Specify 'basic' (default) or 'none'.
You must supply at least the solver field in the problem structure.
The simplest way to obtain a problem structure is to export the problem from the Optimization app.
Data Types: struct
Output Arguments
x — Solution
real vector | real array
Solution, returned as a real vector or real array. The size of x is the same as the size of f.
fval — Objective function value at the solution
real number
Objective function value at the solution, returned as a real number. Generally, fval = f'*x.
Exitflags 3 and -9 relate to solutions that have large infeasibilities. These usually arise from linear
constraint matrices that have large condition number, or problems that have large solution
components. To correct these issues, try to scale the coefficient matrices, eliminate redundant linear
constraints, or give tighter bounds on the variables.
output — Information about the optimization process
structure
Information about the optimization process, returned as a structure with these fields.
The Lagrange multipliers for linear constraints satisfy this equation with length(f) components:
f + A'*lambda.ineqlin + Aeq'*lambda.eqlin + lambda.upper - lambda.lower = 0.
This sign convention matches that of nonlinear solvers (see “Constrained Optimality Theory” on page
3-12). However, this sign is the opposite of the sign in much linear programming literature, so a
linprog Lagrange multiplier is the negative of the associated "shadow price."
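You can check this sensitivity interpretation numerically by loosening a binding constraint and comparing the change in the optimal value with the reported multiplier. The following sketch reuses the constraint data from the Lagrange-multiplier example above; the objective vector f in this sketch is an assumption chosen to make that example's multipliers consistent, not a value quoted on this page.

```matlab
% Sketch: relate a linprog multiplier to objective sensitivity.
% f is an assumption of this sketch; A, b, lb match the earlier example.
f = [-5; -4; -6];
A = [1 -1 1; 3 2 4; 3 2 0];
b = [20; 42; 30];
lb = zeros(3,1);
[~,fval,~,~,lambda] = linprog(f,A,b,[],[],lb);

% Loosen the binding third inequality by one unit and re-solve.
b2 = b;
b2(3) = b2(3) + 1;
[~,fval2] = linprog(f,A,b2,[],[],lb);

% For this minimization, the optimal value decreases by roughly
% lambda.ineqlin(3) per unit of added slack.
sensitivity = fval - fval2;   % compare with lambda.ineqlin(3)
```

Because of the sign convention described above, a reader used to the linear programming literature should negate this multiplier to recover the conventional shadow price.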
Algorithms
Dual-Simplex Algorithm
Interior-Point-Legacy Algorithm
The 'interior-point-legacy' method is based on LIPSOL (Linear Interior Point Solver, [3]),
which is a variant of Mehrotra's predictor-corrector algorithm [2], a primal-dual interior-point
method. A number of preprocessing steps occur before the algorithm begins to iterate. See “Interior-
Point-Legacy Linear Programming” on page 9-6.
The first stage of the algorithm might involve some preprocessing of the constraints (see “Interior-
Point-Legacy Linear Programming” on page 9-6). Several conditions might cause linprog to exit
with an infeasibility message. In each case, linprog returns a negative exitflag to
indicate failure.
• If a row of all zeros is detected in Aeq, but the corresponding element of beq is not zero, then the
exit message is
Note The preprocessing steps are cumulative. For example, even if your constraint matrix does not
have a row of all zeros to begin with, other preprocessing steps can cause such a row to occur.
When the preprocessing finishes, the iterative part of the algorithm begins and continues until the stopping criteria
are met. (For more information about residuals, the primal problem, the dual problem, and the
related stopping criteria, see “Interior-Point-Legacy Linear Programming” on page 9-6.) If the
residuals are growing instead of getting smaller, or the residuals are neither growing nor shrinking,
one of the two following termination messages is displayed, respectively,
One or more of the residuals, duality gap, or total relative error
has grown 100000 times greater than its minimum value so far:
or
One or more of the residuals, duality gap, or total relative error
has stalled:
After one of these messages is displayed, it is followed by one of the following messages indicating
that the dual, the primal, or both appear to be infeasible.
• The dual appears to be infeasible (and the primal unbounded). (The primal
residual < OptimalityTolerance.)
• The primal appears to be infeasible (and the dual unbounded). (The dual
residual < OptimalityTolerance.)
• The dual appears to be infeasible (and the primal unbounded) since the dual
residual > sqrt(OptimalityTolerance). (The primal residual <
10*OptimalityTolerance.)
• The primal appears to be infeasible (and the dual unbounded) since the
primal residual > sqrt(OptimalityTolerance). (The dual residual <
10*OptimalityTolerance.)
• The dual appears to be infeasible and the primal unbounded since the primal
objective < -1e+10 and the dual objective < 1e+6.
• The primal appears to be infeasible and the dual unbounded since the dual
objective > 1e+10 and the primal objective > -1e+6.
• Both the primal and the dual appear to be infeasible.
For example, the primal (objective) can be unbounded and the primal residual, which is a measure of
primal constraint satisfaction, can be small.
Interior-Point Algorithm
References
[1] Dantzig, G.B., A. Orden, and P. Wolfe. “Generalized Simplex Method for Minimizing a Linear Form
Under Linear Inequality Restraints.” Pacific Journal Math., Vol. 5, 1955, pp. 183–195.
[2] Mehrotra, S. “On the Implementation of a Primal-Dual Interior Point Method.” SIAM Journal on
Optimization, Vol. 2, 1992, pp. 575–601.
[3] Zhang, Y., “Solving Large-Scale Linear Programs by Interior-Point Methods Under the MATLAB
Environment.” Technical Report TR96-01, Department of Mathematics and Statistics,
University of Maryland, Baltimore County, Baltimore, MD, July 1995.
See Also
intlinprog | mpsread | optimoptions | prob2struct | quadprog
Topics
“Set Up a Linear Program, Solver-Based” on page 1-18
“Typical Linear Programming Problem” on page 9-13
“Maximize Long-Term Investments Using Linear Programming: Problem-Based”
“Solver-Based Optimization Problem Setup”
“Linear Programming Algorithms” on page 9-2
lsqcurvefit
Solve nonlinear curve-fitting (data-fitting) problems in least-squares sense
Syntax
x = lsqcurvefit(fun,x0,xdata,ydata)
x = lsqcurvefit(fun,x0,xdata,ydata,lb,ub)
x = lsqcurvefit(fun,x0,xdata,ydata,lb,ub,options)
x = lsqcurvefit(problem)
[x,resnorm] = lsqcurvefit( ___ )
[x,resnorm,residual,exitflag,output] = lsqcurvefit( ___ )
[x,resnorm,residual,exitflag,output,lambda,jacobian] = lsqcurvefit( ___ )
Description
Nonlinear least-squares solver
minₓ ‖F(x, xdata) − ydata‖₂² = minₓ ∑ᵢ (F(x, xdataᵢ) − ydataᵢ)²,
given input data xdata, and the observed output ydata, where xdata and ydata are matrices or
vectors, and F (x, xdata) is a matrix-valued or vector-valued function of the same size as ydata.
Optionally, the components of x can have lower and upper bounds lb, and ub. The arguments x, lb,
and ub can be vectors or matrices; see “Matrix Arguments” on page 2-31.
The lsqcurvefit function uses the same algorithm as lsqnonlin. lsqcurvefit simply provides a
convenient interface for data-fitting problems.
Rather than compute the sum of squares, lsqcurvefit requires the user-defined function to
compute the vector-valued function
F(x, xdata) = [F(x, xdata(1)); F(x, xdata(2)); … ; F(x, xdata(k))] .
Note “Passing Extra Parameters” on page 2-57 explains how to pass extra parameters to the vector
function fun(x), if necessary.
Note If the specified input bounds for a problem are inconsistent, the output x is x0 and the outputs
resnorm and residual are [].
Components of x0 that violate the bounds lb ≤ x ≤ ub are reset to the interior of the box defined
by the bounds. Components that respect the bounds are not changed.
[x,resnorm] = lsqcurvefit( ___ ), for any input arguments, returns the value of the squared 2-
norm of the residual at x: sum((fun(x,xdata)-ydata).^2).
Examples
Suppose that you have observation time data xdata and observed response data ydata, and you
want to find parameters x(1) and x(2) to fit a model of the form
y = x(1)exp(x(2)xdata).
Create the data and the model, and set the initial point.
xdata = ...
 [0.9 1.5 13.8 19.8 24.1 28.2 35.2 60.3 74.6 81.3];
ydata = ...
 [455.2 428.6 124.1 67.3 43.2 28.1 13.1 -0.4 -1.3 -1.5];
fun = @(x,xdata)x(1)*exp(x(2)*xdata);
x0 = [100,-1];
Fit the model.
x = lsqcurvefit(fun,x0,xdata,ydata)
lsqcurvefit stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
x = 1×2
498.8309 -0.1013
times = linspace(xdata(1),xdata(end));
plot(xdata,ydata,'ko',times,fun(x,times),'b-')
legend('Data','Fitted exponential')
title('Data and Fitted Curve')
Find the best exponential fit to data where the fitting parameters are constrained.
Generate data from an exponential decay model plus noise. The model is
y = exp(−1.3t) + ε,
with t ranging from 0 through 3, and ε normally distributed noise with mean 0 and standard deviation
0.05.
Generate the data, using a fixed random seed for reproducibility.
rng default % for reproducibility
xdata = linspace(0,3);
ydata = exp(-1.3*xdata) + 0.05*randn(size(xdata));
The problem is: given the data (xdata, ydata), find the exponential decay model
y = x(1)exp(x(2)xdata) that best fits the data, with the parameters bounded as follows:
0 ≤ x(1) ≤ 3/4
−2 ≤ x(2) ≤ −1 .
lb = [0,-2];
ub = [3/4,-1];
fun = @(x,xdata)x(1)*exp(x(2)*xdata);
x0 = [1/2,-2];
x = lsqcurvefit(fun,x0,xdata,ydata,lb,ub)
x = 1×2
0.7500 -1.0000
Examine how well the resulting curve fits the data. Because the bounds keep the solution away from
the true values, the fit is mediocre.
plot(xdata,ydata,'ko',xdata,fun(x,xdata),'b-')
legend('Data','Fitted exponential')
title('Data and Fitted Curve')
Compare Algorithms
Compare the results of fitting with the default 'trust-region-reflective' algorithm and the
'levenberg-marquardt' algorithm.
Suppose that you have observation time data xdata and observed response data ydata, and you
want to find parameters x(1) and x(2) to fit a model of the form
y = x(1)exp(x(2)xdata).
Create the data and the model, set the initial point, and solve with the default algorithm.
xdata = ...
 [0.9 1.5 13.8 19.8 24.1 28.2 35.2 60.3 74.6 81.3];
ydata = ...
 [455.2 428.6 124.1 67.3 43.2 28.1 13.1 -0.4 -1.3 -1.5];
fun = @(x,xdata)x(1)*exp(x(2)*xdata);
x0 = [100,-1];
x = lsqcurvefit(fun,x0,xdata,ydata)
lsqcurvefit stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
x = 1×2
498.8309 -0.1013
options = optimoptions('lsqcurvefit','Algorithm','levenberg-marquardt');
lb = [];
ub = [];
x = lsqcurvefit(fun,x0,xdata,ydata,lb,ub,options)
x = 1×2
498.8309 -0.1013
The two algorithms converged to the same solution. Plot the data and the fitted exponential model.
times = linspace(xdata(1),xdata(end));
plot(xdata,ydata,'ko',times,fun(x,times),'b-')
legend('Data','Fitted exponential')
title('Data and Fitted Curve')
Compare the results of fitting with the default 'trust-region-reflective' algorithm and the
'levenberg-marquardt' algorithm. Examine the solution process to see which is more efficient in
this case.
Suppose that you have observation time data xdata and observed response data ydata, and you
want to find parameters x(1) and x(2) to fit a model of the form
y = x(1)exp(x(2)xdata).
xdata = ...
[0.9 1.5 13.8 19.8 24.1 28.2 35.2 60.3 74.6 81.3];
ydata = ...
[455.2 428.6 124.1 67.3 43.2 28.1 13.1 -0.4 -1.3 -1.5];
fun = @(x,xdata)x(1)*exp(x(2)*xdata);
x0 = [100,-1];
[x,resnorm,residual,exitflag,output] = lsqcurvefit(fun,x0,xdata,ydata);
lsqcurvefit stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
options = optimoptions('lsqcurvefit','Algorithm','levenberg-marquardt');
lb = [];
ub = [];
[x2,resnorm2,residual2,exitflag2,output2] = lsqcurvefit(fun,x0,xdata,ydata,lb,ub,options);
norm(x-x2)
ans = 2.0630e-06
times = linspace(xdata(1),xdata(end));
plot(xdata,ydata,'ko',times,fun(x,times),'b-')
legend('Data','Fitted exponential')
title('Data and Fitted Curve')
Input Arguments
fun — Function you want to fit
function handle | name of function
Function you want to fit, specified as a function handle or the name of a function. fun is a function
that takes two inputs: a vector or matrix x, and a vector or matrix xdata. fun returns a vector or
matrix F, the objective function evaluated at x and xdata. The function fun can be specified as a
function handle for a function file:
x = lsqcurvefit(@myfun,x0,xdata,ydata)
function F = myfun(x,xdata)
F = ... % Compute function values at x, xdata
You can also specify fun as a function handle for an anonymous function:
f = @(x,xdata)x(1)*xdata.^2+x(2)*sin(xdata);
x = lsqcurvefit(f,x0,xdata,ydata);
If the user-defined values for x and F are arrays, they are converted to vectors using linear indexing
(see “Array Indexing” (MATLAB)).
Note fun should return fun(x,xdata), and not the sum-of-squares sum((fun(x,xdata)-
ydata).^2). lsqcurvefit implicitly computes the sum of squares of the components of
fun(x,xdata)-ydata. See “Examples” on page 16-0 .
If the Jacobian can also be computed and the 'SpecifyObjectiveGradient' option is true, set by
options = optimoptions('lsqcurvefit','SpecifyObjectiveGradient',true)
then the function fun must return a second output argument with the Jacobian value J (a matrix) at
x. By checking the value of nargout, the function can avoid computing J when fun is called with
only one output argument (in the case where the optimization algorithm only needs the value of F but
not J).
function [F,J] = myfun(x,xdata)
F = ... % objective function values at x
if nargout > 1 % two output arguments
J = ... % Jacobian of the function evaluated at x
end
If fun returns a vector (matrix) of m components and x has n elements, where n is the number of
elements of x0, the Jacobian J is an m-by-n matrix where J(i,j) is the partial derivative of F(i)
with respect to x(j). (The Jacobian J is the transpose of the gradient of F.) For more information,
see “Writing Vector and Matrix Objective Functions” on page 2-26.
Example: @(x,xdata)x(1)*exp(-x(2)*xdata)
Data Types: char | function_handle | string
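The nargout pattern described above can be assembled into a complete fitting function. This is a sketch for the exponential model used throughout the examples; the function name expmodel is illustrative, not part of the toolbox.

```matlab
% Sketch: objective returning residual values and the Jacobian for
% F(x,xdata) = x(1)*exp(x(2)*xdata). Save as expmodel.m.
function [F,J] = expmodel(x,xdata)
F = x(1)*exp(x(2)*xdata);            % model values at x
if nargout > 1                       % Jacobian requested by the solver
    e = exp(x(2)*xdata(:));
    J = [e, x(1)*xdata(:).*e];       % columns: dF/dx(1), dF/dx(2)
end
end
```

Call it with the gradient option enabled, for example:
x = lsqcurvefit(@expmodel,x0,xdata,ydata,[],[], ...
    optimoptions('lsqcurvefit','SpecifyObjectiveGradient',true))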
x0 — Initial point
real vector | real array
Initial point, specified as a real vector or real array. Solvers use the number of elements in x0 and the
size of x0 to determine the number and size of variables that fun accepts.
Example: x0 = [1,2,3,4]
Data Types: double
xdata — Input data for model
real vector | real array
Input data for model, specified as a real vector or real array. The model is
ydata = fun(x,xdata),
where xdata and ydata are fixed arrays, and x is the array of parameters that lsqcurvefit
changes to search for a minimum sum of squares.
Example: xdata = [1,2,3,4]
ydata — Response data for model
real vector | real array
Response data for model, specified as a real vector or real array. The model is
ydata = fun(x,xdata),
where xdata and ydata are fixed arrays, and x is the array of parameters that lsqcurvefit
changes to search for a minimum sum of squares.
The ydata array must be the same size and shape as the array fun(x0,xdata).
Example: ydata = [1,2,3,4]
Data Types: double
lb — Lower bounds
real vector | real array
Lower bounds, specified as a real vector or real array. If the number of elements in x0 is equal to the
number of elements in lb, then lb specifies that
x(i) >= lb(i) for all i.
ub — Upper bounds
real vector | real array
Upper bounds, specified as a real vector or real array. If the number of elements in x0 is equal to the
number of elements in ub, then ub specifies that
x(i) <= ub(i) for all i.
Some options apply to all algorithms, and others are relevant for particular algorithms. See
“Optimization Options Reference” on page 15-6 for detailed information.
Some options are absent from the optimoptions display. These options appear in italics in the
following table. For details, see “View Options” on page 2-66.
All Algorithms
Algorithm Choose between 'trust-region-reflective' (default) and
'levenberg-marquardt'.
FiniteDifferenceStepSize Scalar or vector step size factor for finite differences. When you set
FiniteDifferenceStepSize to a vector v, the forward finite
differences delta are
delta = v.*sign′(x).*max(abs(x),TypicalX);
where sign′(x) = sign(x) except sign′(0) = 1. Central finite
differences are
delta = v.*max(abs(x),TypicalX);
The default is sqrt(eps) for forward finite differences, and eps^(1/3)
for central finite differences.
FunctionTolerance Termination tolerance on the function value, a positive scalar. The
default is 1e-6. See “Tolerances and Stopping Criteria” on page 2-68.
For optimset, the name is TolFun. See “Current and Legacy Option
Name Tables” on page 15-21.
FunValCheck Check whether function values are valid. 'on' displays an error when
the function returns a value that is complex, Inf, or NaN. The default
'off' displays no error.
MaxFunctionEvaluations Maximum number of function evaluations allowed, a positive integer.
The default is 100*numberOfVariables. See “Tolerances and
Stopping Criteria” on page 2-68 and “Iterations and Function Counts”
on page 3-9.
For optimset, the name is MaxFunEvals. See “Current and Legacy
Option Name Tables” on page 15-21.
MaxIterations Maximum number of iterations allowed, a positive integer. The
default is 400. See “Tolerances and Stopping Criteria” on page 2-68.
For optimset, the name is MaxIter. See “Current and Legacy Option
Name Tables” on page 15-21.
OptimalityTolerance Termination tolerance on the first-order optimality, a positive scalar.
The default is 1e-6. See “Tolerances and Stopping Criteria” on page
2-68.
For optimset, the name is TolFun. See “Current and Legacy Option
Name Tables” on page 15-21.
OutputFcn Specify one or more user-defined functions that an optimization
function calls at each iteration. Pass a function handle or a cell array of
function handles. The default is none ([]). See “Output Function
Syntax” on page 15-26.
PlotFcn Plots various measures of progress while the algorithm executes; select
from predefined plots or write your own. Pass a name, a function
handle, or a cell array of names or function handles. For custom plot
functions, pass function handles. The default is none ([]):
Custom plot functions use the same syntax as output functions. See
“Output Functions” on page 3-32 and “Output Function Syntax” on
page 15-26.
SpecifyObjectiveGradient If false (default), the solver approximates the Jacobian using finite
differences. If true, fun must return the Jacobian as a second output.
For optimset, the name is Jacobian, and the values are 'on' or
'off'. See “Current and Legacy Option Name Tables” on page 15-21.
StepTolerance Termination tolerance on x, a positive scalar. The default is 1e-6. See
“Tolerances and Stopping Criteria” on page 2-68.
For optimset, the name is TolX. See “Current and Legacy Option
Name Tables” on page 15-21.
TypicalX Typical x values. The number of elements in TypicalX is equal to the
number of elements in x0, the starting point. The default value is
ones(numberofvariables,1). The solver uses TypicalX for
scaling finite differences for gradient estimation.
JacobianMultiplyFcn Jacobian multiply function, specified as a function handle. For large-
scale structured problems, this function computes the Jacobian matrix
product J*Y, J'*Y, or J'*(J*Y) without actually forming J. The
function is of the form
W = jmfun(Jinfo,Y,flag)
where Jinfo contains the matrix used to compute J*Y (or J'*Y, or
J'*(J*Y)). The first argument Jinfo must be the same as the second
argument returned by the objective function fun, for example, by
[F,Jinfo] = fun(x)
You must supply at least the objective, x0, solver, xdata, ydata, and options fields in the
problem structure.
The simplest way of obtaining a problem structure is to export the problem from the Optimization
app.
Data Types: struct
Output Arguments
x — Solution
real vector | real array
Solution, returned as a real vector or real array. The size of x is the same as the size of x0. Typically,
x is a local solution to the problem when exitflag is positive. For information on the quality of the
solution, see “When the Solver Succeeds” on page 4-18.
resnorm — Squared norm of the residual
nonnegative real
Squared norm of the residual, returned as a nonnegative real. resnorm is the squared 2-norm of the
residual at x: sum((fun(x,xdata)-ydata).^2).
jacobian — Jacobian at the solution
real matrix
Jacobian at the solution, returned as a real matrix. jacobian(i,j) is the partial derivative of
fun(i) with respect to x(j) at the solution x.
Limitations
• The Levenberg-Marquardt algorithm does not handle bound constraints.
• The trust-region-reflective algorithm does not solve underdetermined systems; it requires that the
number of equations, i.e., the row dimension of F, be at least as great as the number of variables.
In the underdetermined case, lsqcurvefit uses the Levenberg-Marquardt algorithm.
Since the trust-region-reflective algorithm does not handle underdetermined systems and the
Levenberg-Marquardt does not handle bound constraints, problems that have both of these
characteristics cannot be solved by lsqcurvefit.
• lsqcurvefit can solve complex-valued problems directly with the levenberg-marquardt
algorithm. However, this algorithm does not accept bound constraints. For a complex problem
with bound constraints, split the variables into real and imaginary parts, and use the trust-
region-reflective algorithm. See “Fit a Model to Complex-Valued Data” on page 12-50.
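A minimal sketch of that real/imaginary split, with an assumed complex exponential model and made-up data (neither is taken from this page):

```matlab
% Sketch: bounded fit of complex data z ≈ c*exp(-k*t) by optimizing
% over the real parameter vector p = [real(c), imag(c), k].
t = (0:0.1:1)';
zdata = (2+3i)*exp(-0.5*t);                       % illustrative data
model = @(p,t) (p(1)+1i*p(2))*exp(-p(3)*t);

% Stack real and imaginary parts so the residuals and data are real,
% which lets the trust-region-reflective algorithm apply bounds.
fun = @(p,t) [real(model(p,t)); imag(model(p,t))];
ydata = [real(zdata); imag(zdata)];
lb = [-10 -10 0];
ub = [ 10  10 5];
p = lsqcurvefit(fun,[1 1 1],t,ydata,lb,ub);
```

The recovered complex coefficient is then p(1) + 1i*p(2), and p(3) is the decay rate.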
• The preconditioner computation used in the preconditioned conjugate gradient part of the trust-
region-reflective method forms JTJ (where J is the Jacobian matrix) before computing the
preconditioner. Therefore, a row of J with many nonzeros, which results in a nearly dense product
JTJ, can lead to a costly solution process for large problems.
• If components of x have no upper (or lower) bounds, lsqcurvefit prefers that the corresponding
components of ub (or lb) be set to inf (or -inf for lower bounds) as opposed to an arbitrary but
very large positive (or negative for lower bounds) number.
You can use the trust-region reflective algorithm in lsqnonlin, lsqcurvefit, and fsolve with
small- to medium-scale problems without computing the Jacobian in fun or providing the Jacobian
sparsity pattern. (This also applies to using fmincon or fminunc without computing the Hessian or
supplying the Hessian sparsity pattern.) How small is small- to medium-scale? No absolute answer is
available, as it depends on the amount of virtual memory in your computer system configuration.
Suppose your problem has m equations and n unknowns. If the command J = sparse(ones(m,n))
causes an Out of memory error on your machine, then this is certainly too large a problem. If it
does not result in an error, the problem might still be too large. You can find out only by running it
and seeing if MATLAB runs within the amount of virtual memory available on your system.
Algorithms
The Levenberg-Marquardt and trust-region-reflective methods are based on the nonlinear least-
squares algorithms also used in fsolve.
• The default trust-region-reflective algorithm is a subspace trust-region method and is based on the
interior-reflective Newton method described in [1] and [2]. Each iteration involves the
approximate solution of a large linear system using the method of preconditioned conjugate
gradients (PCG). See “Trust-Region-Reflective Least Squares” on page 12-3.
• The Levenberg-Marquardt method is described in references [4], [5], and [6]. See “Levenberg-
Marquardt Method” on page 12-6.
References
[1] Coleman, T.F. and Y. Li. “An Interior, Trust Region Approach for Nonlinear Minimization Subject to
Bounds.” SIAM Journal on Optimization, Vol. 6, 1996, pp. 418–445.
[2] Coleman, T.F. and Y. Li. “On the Convergence of Reflective Newton Methods for Large-Scale
Nonlinear Minimization Subject to Bounds.” Mathematical Programming, Vol. 67, Number 2,
1994, pp. 189–224.
[3] Dennis, J. E. Jr. “Nonlinear Least-Squares.” State of the Art in Numerical Analysis, ed. D. Jacobs,
Academic Press, pp. 269–312.
[4] Levenberg, K. “A Method for the Solution of Certain Problems in Least-Squares.” Quarterly
Applied Mathematics 2, 1944, pp. 164–168.
[5] Marquardt, D. “An Algorithm for Least-squares Estimation of Nonlinear Parameters.” SIAM
Journal Applied Mathematics, Vol. 11, 1963, pp. 431–441.
[6] Moré, J. J. “The Levenberg-Marquardt Algorithm: Implementation and Theory.” Numerical
Analysis, ed. G. A. Watson, Lecture Notes in Mathematics 630, Springer Verlag, 1977, pp.
105–116.
[7] Moré, J. J., B. S. Garbow, and K. E. Hillstrom. User Guide for MINPACK 1. Argonne National
Laboratory, Rept. ANL–80–74, 1980.
[8] Powell, M. J. D. “A Fortran Subroutine for Solving Systems of Nonlinear Algebraic Equations.”
Numerical Methods for Nonlinear Algebraic Equations, P. Rabinowitz, ed., Ch.7, 1970.
Extended Capabilities
Automatic Parallel Support
Accelerate code by automatically running computation in parallel using Parallel Computing Toolbox™.
options = optimoptions('solvername','UseParallel',true)
For more information, see “Using Parallel Computing in Optimization Toolbox” on page 14-5.
See Also
fsolve | lsqnonlin | optimoptions
Topics
“Nonlinear Least Squares (Curve Fitting)”
“Solver-Based Optimization Problem Setup”
“Least-Squares (Model Fitting) Algorithms” on page 12-2
lsqlin
Solve constrained linear least-squares problems
Syntax
x = lsqlin(C,d,A,b)
x = lsqlin(C,d,A,b,Aeq,beq,lb,ub)
x = lsqlin(C,d,A,b,Aeq,beq,lb,ub,x0,options)
x = lsqlin(problem)
[x,resnorm,residual,exitflag,output,lambda] = lsqlin( ___ )
Description
Linear least-squares solver with bounds or linear constraints.
min_x  ½‖C⋅x − d‖₂²   such that   A⋅x ≤ b,  Aeq⋅x = beq,  lb ≤ x ≤ ub.
Note lsqlin applies only to the solver-based approach. For a discussion of the two optimization
approaches, see “First Choose Problem-Based or Solver-Based Approach” on page 1-3.
x = lsqlin(C,d,A,b) solves the linear system C*x = d in the least-squares sense, subject to
A*x ≤ b.
x = lsqlin(problem) finds the minimum for problem, where problem is a structure. Create the
problem structure by exporting a problem from Optimization app, as described in “Exporting Your
Work” on page 5-9. Or create a problem structure from an OptimizationProblem object by using
prob2struct.
The factor ½ in the definition of the problem affects the values in the lambda structure.
Examples
Find the x that minimizes the norm of C*x - d for an overdetermined problem with linear inequality
constraints.
x = lsqlin(C,d,A,b)
x = 4×1
0.1299
-0.5757
0.4251
0.2438
Find the x that minimizes the norm of C*x - d for an overdetermined problem with linear equality
and inequality constraints and bounds.
x = lsqlin(C,d,A,b,Aeq,beq,lb,ub)
x = 4×1
-0.1000
-0.1000
0.1599
0.4090
This example shows how to use nondefault options for linear least squares.
Set options to use the 'interior-point' algorithm and to give iterative display.
options = optimoptions('lsqlin','Algorithm','interior-point','Display','iter');
d = [0.0578
0.3528
0.8131
0.0098
0.1388];
A = [0.2027 0.2721 0.7467 0.4659
0.1987 0.1988 0.4450 0.4186
0.6037 0.0152 0.9318 0.8462];
b = [0.5251
0.2026
0.6721];
x = 4×1
0.1299
-0.5757
0.4251
0.2438
Define a problem with linear inequality constraints and bounds. The problem is overdetermined
because there are four columns in the C matrix but five rows. This means the problem has four
unknowns and five conditions, even before including the linear constraints and bounds.
C = [0.9501 0.7620 0.6153 0.4057
0.2311 0.4564 0.7919 0.9354
0.6068 0.0185 0.9218 0.9169
0.4859 0.8214 0.7382 0.4102
0.8912 0.4447 0.1762 0.8936];
d = [0.0578
0.3528
0.8131
0.0098
0.1388];
A = [0.2027 0.2721 0.7467 0.4659
0.1987 0.1988 0.4450 0.4186
0.6037 0.0152 0.9318 0.8462];
b = [0.5251
0.2026
0.6721];
lb = -0.1*ones(4,1);
ub = 2*ones(4,1);
options = optimoptions('lsqlin','Algorithm','interior-point');
The 'interior-point' algorithm does not use an initial point, so set x0 to [].
x0 = [];
[x,resnorm,residual,exitflag,output,lambda] = ...
lsqlin(C,d,A,b,[],[],lb,ub,x0,options)
x = 4×1
-0.1000
-0.1000
0.2152
0.3502
resnorm = 0.1672
residual = 5×1
0.0455
0.0764
-0.3562
0.1620
0.0784
exitflag = 1
Examine the nonzero Lagrange multiplier fields in more detail. First examine the Lagrange
multipliers for the linear inequality constraint.
lambda.ineqlin
ans = 3×1
0.0000
0.2392
0.0000
Lagrange multipliers are nonzero exactly when the solution is on the corresponding constraint
boundary. In other words, Lagrange multipliers are nonzero when the corresponding constraint is
active. lambda.ineqlin(2) is nonzero. This means that the second element in A*x should equal the
second element in b, because the constraint is active.
[A(2,:)*x,b(2)]
ans = 1×2
0.2026 0.2026
Now examine the Lagrange multipliers for the lower and upper bound constraints.
lambda.lower
ans = 4×1
0.0409
0.2784
0.0000
0.0000
lambda.upper
ans = 4×1
0
0
0
0
The first two elements of lambda.lower are nonzero. You see that x(1) and x(2) are at their lower
bounds, -0.1. All elements of lambda.upper are essentially zero, and you see that all components
of x are less than their upper bound, 2.
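The multiplier pattern just described can be checked independently. The following sketch, written in Python with SciPy (an environment assumed for illustration; it is not part of Optimization Toolbox), solves the same C, d problem with only the bound constraints and recovers the lower-bound multipliers as gradient components at the active bounds:

```python
import numpy as np
from scipy.optimize import lsq_linear

C = np.array([[0.9501, 0.7620, 0.6153, 0.4057],
              [0.2311, 0.4564, 0.7919, 0.9354],
              [0.6068, 0.0185, 0.9218, 0.9169],
              [0.4859, 0.8214, 0.7382, 0.4102],
              [0.8912, 0.4447, 0.1762, 0.8936]])
d = np.array([0.0578, 0.3528, 0.8131, 0.0098, 0.1388])
lb = -0.1 * np.ones(4)
ub = 2.0 * np.ones(4)

# Bound-constrained linear least squares (no A*x <= b rows here,
# because lsq_linear handles only bounds).
res = lsq_linear(C, d, bounds=(lb, ub), method='bvls')
x = res.x

# KKT conditions: the gradient g = C'*(C*x - d) equals the lower-bound
# multiplier (>= 0) at active lower bounds, equals minus the upper-bound
# multiplier at active upper bounds, and is ~0 at free components.
g = C.T @ (C @ x - d)
at_lower = np.isclose(x, lb, atol=1e-9)
at_upper = np.isclose(x, ub, atol=1e-9)
free = ~(at_lower | at_upper)
```

The 'bvls' method returns components exactly on their bounds, which makes the active-set check reliable.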
Input Arguments
C — Multiplier matrix
real matrix
Multiplier matrix, specified as a matrix of doubles. C represents the multiplier of the solution x in the
expression C*x - d. C is M-by-N, where M is the number of equations, and N is the number of
elements of x.
Example: C = [1,4;2,5;7,8]
Data Types: double
d — Constant vector
real vector
Constant vector, specified as a vector of doubles. d represents the additive constant term in the
expression C*x - d. d is M-by-1, where M is the number of equations.
Example: d = [5;0;-12]
Data Types: double
Linear inequality constraints, specified as a real matrix. A is an M-by-N matrix, where M is the number
of inequalities, and N is the number of variables (number of elements in x0). For large problems, pass
A as a sparse matrix.
A*x <= b,
where x is the column vector of N variables x(:), and b is a column vector with M elements.
For example, consider these inequalities:

x1 + 2x2 ≤ 10
3x1 + 4x2 ≤ 20
5x1 + 6x2 ≤ 30.

Specify the inequalities by entering A = [1,2;3,4;5,6] and b = [10;20;30].
Example: To specify that the x components sum to 1 or less, use A = ones(1,N) and b = 1.
Data Types: double
Linear inequality constraints, specified as a real vector. b is an M-element vector related to the A
matrix. If you pass b as a row vector, solvers internally convert b to the column vector b(:). For
large problems, pass b as a sparse vector.
A*x <= b,
where x is the column vector of N variables x(:), and A is a matrix of size M-by-N.
For example, consider these inequalities:

x1 + 2x2 ≤ 10
3x1 + 4x2 ≤ 20
5x1 + 6x2 ≤ 30.

Specify the inequalities by entering A = [1,2;3,4;5,6] and b = [10;20;30].
Example: To specify that the x components sum to 1 or less, use A = ones(1,N) and b = 1.
Data Types: double
Linear equality constraints, specified as a real matrix. Aeq is an Me-by-N matrix, where Me is the
number of equalities, and N is the number of variables (number of elements in x0). For large
problems, pass Aeq as a sparse matrix.
Aeq*x = beq,
where x is the column vector of N variables x(:), and beq is a column vector with Me elements.
x1 + 2x2 + 3x3 = 10
2x1 + 4x2 + x3 = 20,
Example: To specify that the x components sum to 1, use Aeq = ones(1,N) and beq = 1.
Data Types: double
Linear equality constraints, specified as a real vector. beq is an Me-element vector related to the Aeq
matrix. If you pass beq as a row vector, solvers internally convert beq to the column vector beq(:).
For large problems, pass beq as a sparse vector.
Aeq*x = beq,
where x is the column vector of N variables x(:), and Aeq is a matrix of size Me-by-N.
x1 + 2x2 + 3x3 = 10
2x1 + 4x2 + x3 = 20,
Aeq = [1,2,3;2,4,1];
beq = [10;20];
Example: To specify that the x components sum to 1, use Aeq = ones(1,N) and beq = 1.
Data Types: double
lb — Lower bounds
[] (default) | real vector or array
Lower bounds, specified as a vector or array of doubles. lb represents the lower bounds elementwise
in lb ≤ x ≤ ub.
ub — Upper bounds
[] (default) | real vector or array
Upper bounds, specified as a vector or array of doubles. ub represents the upper bounds elementwise
in lb ≤ x ≤ ub.
x0 — Initial point
[] (default) | real vector or array
Initial point for the solution process, specified as a real vector or array. The 'trust-region-
reflective' and 'active-set' algorithms use x0 (optional).
Options for lsqlin, specified as the output of the optimoptions function or the Optimization app.
Some options are absent from the optimoptions display. These options appear in italics in the
following table. For details, see “View Options” on page 2-66.
All Algorithms
• 'interior-point' (default)
• 'trust-region-reflective'
• 'active-set'
If you have a large number of linear constraints and not a large number of
variables, try the 'active-set' algorithm.
For optimset, the option name is MaxIter. See “Current and Legacy
Option Name Tables” on page 15-21.
For optimset, the option name is TolFun. See “Current and Legacy
Option Name Tables” on page 15-21.
JacobianMultiplyFcn Jacobian multiply function, specified as a function handle. For large-
scale structured problems, this function should compute the Jacobian
matrix product C*Y, C'*Y, or C'*(C*Y) without actually forming C.
Write the function in the form
W = jmfun(Jinfo,Y,flag)
In each case, jmfun need not form C explicitly. lsqlin uses Jinfo to
compute the preconditioner. See “Passing Extra Parameters” on page 2-
57 for information on how to supply extra parameters if necessary.
See “Jacobian Multiply Function with Linear Least Squares” on page 12-
30 for an example.
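The matrix-free idea behind JacobianMultiplyFcn has a close analogue in SciPy (a Python sketch under that assumption): scipy.sparse.linalg.lsqr accepts a LinearOperator that supplies the products C*y and C'*w without ever storing C, here for a C given implicitly as a product of two smaller factors:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, lsqr

# C = A1 @ A2 is never formed explicitly; only products with C are computed.
A1 = np.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.],
               [1., 1., 0.], [0., 1., 1.], [1., 0., 1.]])
A2 = np.array([[1., 2.], [3., 4.], [5., 6.5]])
d = np.array([1., 2., 0., 1., 3., 2.])

C_op = LinearOperator((6, 2),
                      matvec=lambda y: A1 @ (A2 @ y),        # C*y
                      rmatvec=lambda w: A2.T @ (A1.T @ w))   # C'*w

x_matfree = lsqr(C_op, d, atol=1e-12, btol=1e-12)[0]

# Cross-check against the dense least-squares solution.
x_dense = np.linalg.lstsq(A1 @ A2, d, rcond=None)[0]
```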
For optimset, the option name is TolFun. See “Current and Legacy
Option Name Tables” on page 15-21.
PrecondBandWidth Upper bandwidth of preconditioner for PCG (preconditioned conjugate
gradient). By default, diagonal preconditioning is used (upper
bandwidth of 0). For some problems, increasing the bandwidth reduces
the number of PCG iterations. Setting PrecondBandWidth to Inf uses
a direct factorization (Cholesky) rather than the conjugate gradients
(CG). The direct factorization is computationally more expensive than
CG, but produces a better quality step toward the solution. For more
information, see “Trust-Region-Reflective Algorithm” on page 16-253.
SubproblemAlgorithm Determines how the iteration step is calculated. The default, 'cg',
takes a faster but less accurate step than 'factorization'. See
“Trust-Region-Reflective Least Squares” on page 12-3.
TolPCG Termination tolerance on the PCG (preconditioned conjugate gradient)
iteration, a positive scalar. The default is 0.1.
TypicalX Typical x values. The number of elements in TypicalX is equal to the
number of variables. The default value is
ones(numberofvariables,1). lsqlin uses TypicalX internally for
scaling. TypicalX has an effect only when x has unbounded
components, and when a TypicalX value for an unbounded component
is larger than 1.
For optimset, the option name is TolCon. See “Current and Legacy
Option Name Tables” on page 15-21.
LinearSolver Type of internal linear solver in algorithm:
For optimset, the option name is TolFun. See “Current and Legacy
Option Name Tables” on page 15-21.
StepTolerance Termination tolerance on x, a positive scalar. The default is 1e-12.
For optimset, the option name is TolX. See “Current and Legacy
Option Name Tables” on page 15-21.
For optimset, the option name is TolCon. See “Current and Legacy
Option Name Tables” on page 15-21.
ObjectiveLimit A tolerance (stopping criterion) that is a scalar. If the objective
function value goes below ObjectiveLimit and the current point is
feasible, the iterations halt because the problem is presumably
unbounded. The default value is -1e20.
OptimalityTolerance Termination tolerance on the first-order optimality, a positive scalar.
The default value is 1e-8. See “First-Order Optimality Measure” on
page 3-11.
For optimset, the name is TolFun. See “Current and Legacy Option
Name Tables” on page 15-21.
StepTolerance Termination tolerance on x, a positive scalar. The default value is
1e-12.
For optimset, the option name is TolX. See “Current and Legacy
Option Name Tables” on page 15-21.
Create the problem structure by exporting a problem from the Optimization app, as described in
“Exporting Your Work” on page 5-9.
Data Types: struct
Output Arguments
x — Solution
real vector
Solution, returned as a vector that minimizes the norm of C*x-d subject to all bounds and linear
constraints.
Algorithm stopping condition, returned as an integer identifying the reason the algorithm stopped.
The following lists the values of exitflag and the corresponding reasons lsqlin stopped.
The exit message for the interior-point algorithm can give more details on the reason lsqlin
stopped, such as exceeding a tolerance. See “Exit Flags and Exit Messages” on page 3-3.
Solution process summary, returned as a structure containing information about the optimization
process.
• 'interior-point'
• 'trust-region-reflective'
• 'mldivide' for an unconstrained problem
constrviolation = max([0;norm(Aeq*x-beq,inf);(lb-x);(x-ub);(A*x-b)])
message Exit message.
firstorderopt First-order optimality at the solution. See “First-Order
Optimality Measure” on page 3-11.
linearsolver Type of internal linear solver, 'dense' or 'sparse'
('interior-point' algorithm only)
cgiterations Number of conjugate gradient iterations the solver
performed. Nonempty only for the 'trust-region-
reflective' algorithm.
Tips
• For problems with no constraints, you can use mldivide (matrix left division). When you have no
constraints, lsqlin returns x = C\d.
• Because the problem being solved is always convex, lsqlin finds a global, although not
necessarily unique, solution.
• If your problem has many linear constraints and few variables, try using the 'active-set'
algorithm. See “Quadratic Programming with Many Linear Constraints” on page 11-57.
• Better numerical results are likely if you specify equalities explicitly, using Aeq and beq, instead of
implicitly, using lb and ub.
• The trust-region-reflective algorithm does not allow equal upper and lower bounds. Use
another algorithm for this case.
• If the specified input bounds for a problem are inconsistent, the output x is x0 and the outputs
resnorm and residual are [].
• You can solve some large structured problems, including those where the C matrix is too large to
fit in memory, using the trust-region-reflective algorithm with a Jacobian multiply
function. For information, see trust-region-reflective Algorithm Options.
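The first tip can be verified in any numerical environment, because an unconstrained lsqlin problem is ordinary least squares. A NumPy sketch (Python assumed for illustration) using the C and d from the examples above checks the normal equations that characterize C\d:

```python
import numpy as np

C = np.array([[0.9501, 0.7620, 0.6153, 0.4057],
              [0.2311, 0.4564, 0.7919, 0.9354],
              [0.6068, 0.0185, 0.9218, 0.9169],
              [0.4859, 0.8214, 0.7382, 0.4102],
              [0.8912, 0.4447, 0.1762, 0.8936]])
d = np.array([0.0578, 0.3528, 0.8131, 0.0098, 0.1388])

# With no constraints, the least-squares solution satisfies the
# normal equations C'*(C*x - d) = 0, which is what C\d returns.
x = np.linalg.lstsq(C, d, rcond=None)[0]
grad = C.T @ (C @ x - d)
```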
Algorithms
Trust-Region-Reflective Algorithm
This method is a subspace trust-region method based on the interior-reflective Newton method
described in [1]. Each iteration involves the approximate solution of a large linear system using the
method of preconditioned conjugate gradients (PCG). See “Trust-Region-Reflective Least Squares” on
page 12-3, and in particular “Large Scale Linear Least Squares” on page 12-5.
Interior-Point Algorithm
Active-Set Algorithm
The 'active-set' algorithm is based on the quadprog 'active-set' algorithm. For more
information, see “Linear Least Squares: Interior-Point or Active-Set” on page 12-2 and “active-set
quadprog Algorithm” on page 11-11.
References
[1] Coleman, T. F. and Y. Li. “A Reflective Newton Method for Minimizing a Quadratic Function
Subject to Bounds on Some of the Variables,” SIAM Journal on Optimization, Vol. 6, Number
4, pp. 1040–1058, 1996.
[2] Gill, P. E., W. Murray, and M. H. Wright. Practical Optimization, Academic Press, London, UK,
1981.
See Also
lsqnonneg | mldivide | optimtool | quadprog
Topics
“Nonnegative Linear Least Squares, Solver-Based” on page 12-24
“Optimization App with the lsqlin Solver” on page 12-27
“Jacobian Multiply Function with Linear Least Squares” on page 12-30
“Least-Squares (Model Fitting) Algorithms” on page 12-2
lsqnonlin
Solve nonlinear least-squares (nonlinear data-fitting) problems
Syntax
x = lsqnonlin(fun,x0)
x = lsqnonlin(fun,x0,lb,ub)
x = lsqnonlin(fun,x0,lb,ub,options)
x = lsqnonlin(problem)
[x,resnorm] = lsqnonlin( ___ )
[x,resnorm,residual,exitflag,output] = lsqnonlin( ___ )
[x,resnorm,residual,exitflag,output,lambda,jacobian] = lsqnonlin( ___ )
Description
Nonlinear least-squares solver
min_x ‖f(x)‖₂² = min_x ( f₁(x)² + f₂(x)² + … + fₙ(x)² )
x, lb, and ub can be vectors or matrices; see “Matrix Arguments” on page 2-31.
Rather than compute the value ‖f(x)‖₂² (the sum of squares), lsqnonlin requires the user-defined
function to compute the vector-valued function

f(x) = [f₁(x); f₂(x); …; fₙ(x)].
x = lsqnonlin(fun,x0) starts at the point x0 and finds a minimum of the sum of squares of the
functions described in fun. The function fun should return a vector (or array) of values and not the
sum of squares of the values. (The algorithm implicitly computes the sum of squares of the
components of fun(x).)
Note “Passing Extra Parameters” on page 2-57 explains how to pass extra parameters to the vector
function fun(x), if necessary.
x = lsqnonlin(fun,x0,lb,ub) defines a set of lower and upper bounds on the design variables
in x, so that the solution is always in the range lb ≤ x ≤ ub. You can fix the solution component
x(i) by specifying lb(i) = ub(i).
Note If the specified input bounds for a problem are inconsistent, the output x is x0 and the outputs
resnorm and residual are [].
Components of x0 that violate the bounds lb ≤ x ≤ ub are reset to the interior of the box defined
by the bounds. Components that respect the bounds are not changed.
[x,resnorm] = lsqnonlin( ___ ), for any input arguments, returns the value of the squared 2-
norm of the residual at x: sum(fun(x).^2).
Examples
Generate data from an exponential decay model plus noise. The model is
y = exp(−1.3t) + ε,
with t ranging from 0 through 3, and ε normally distributed noise with mean 0 and standard deviation
0.05.
The problem is: given the data (d, y), find the exponential decay rate that best fits the data.
Create an anonymous function that takes a value of the exponential decay rate r and returns a vector
of differences from the model with that decay rate and the data.
fun = @(r)exp(-d*r)-y;
Find the value of the optimal decay rate. Arbitrarily choose an initial guess x0 = 4.
x0 = 4;
x = lsqnonlin(fun,x0)
lsqnonlin stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
x = 1.2645
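The data-generation step is not shown above; the following sketch regenerates comparable data and repeats the fit in Python with SciPy (assumed for illustration). The recovered rate should land near the true value, 1.3:

```python
import numpy as np
from scipy.optimize import least_squares

# Exponential decay data with Gaussian noise, as described in the example.
rng = np.random.default_rng(0)
t = np.linspace(0, 3, 100)
y = np.exp(-1.3 * t) + 0.05 * rng.standard_normal(t.size)

# Residual vector, not the sum of squares, exactly as lsqnonlin expects.
fun = lambda r: np.exp(-t * r) - y

sol = least_squares(fun, x0=4.0)
rate = sol.x[0]
```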
Find the best-fitting model when some of the fitting parameters have bounds.
The model is the standard normal density,

y = (1/√(2π)) exp(−t²/2).
Create a vector t of data points, and the corresponding normal density at those points.
t = linspace(-4,4);
y = 1/sqrt(2*pi)*exp(-t.^2/2);
Create a function that evaluates the difference between the centered and scaled function from the
normal y, with x(1) as the scaling a and x(2) as the centering b.
fun = @(x)x(1)*exp(-t).*exp(-exp(-(t-x(2)))) - y;
Find the optimal fit starting from x0 = [1/2,0], with the scaling a between 1/2 and 3/2, and the
centering b between -1 and 3.
lb = [1/2,-1];
ub = [3/2,3];
x0 = [1/2,0];
x = lsqnonlin(fun,x0,lb,ub)
lsqnonlin stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
x = 1×2
0.8231 -0.2444
plot(t,y,'r-',t,fun(x)+y,'b-')
xlabel('t')
legend('Normal density','Fitted function')
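The same bounded fit can be reproduced with scipy.optimize.least_squares (a Python sketch; it assumes SciPy's 'trf' method reaches the same in-bounds minimum from this starting point):

```python
import numpy as np
from scipy.optimize import least_squares

t = np.linspace(-4, 4, 100)
y = 1 / np.sqrt(2 * np.pi) * np.exp(-t**2 / 2)

# x[0] scales and x[1] centers the candidate function, as in the example.
fun = lambda x: x[0] * np.exp(-t) * np.exp(-np.exp(-(t - x[1]))) - y

# Bounds 1/2 <= a <= 3/2 and -1 <= b <= 3, starting from [1/2, 0].
sol = least_squares(fun, x0=[0.5, 0.0], bounds=([0.5, -1], [1.5, 3]))
```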
Compare the results of a data-fitting problem when using different lsqnonlin algorithms.
Suppose that you have observation time data xdata and observed response data ydata, and you
want to find parameters x(1) and x(2) to fit a model of the form y = x(1)*exp(x(2)*xdata).
xdata = ...
[0.9 1.5 13.8 19.8 24.1 28.2 35.2 60.3 74.6 81.3];
ydata = ...
[455.2 428.6 124.1 67.3 43.2 28.1 13.1 -0.4 -1.3 -1.5];
Create a simple exponential decay model. The model computes a vector of differences between
predicted values and observed values.
fun = @(x)x(1)*exp(x(2)*xdata)-ydata;
Fit the model using the starting point x0 = [100,-1]. First, use the default 'trust-region-
reflective' algorithm.
x0 = [100,-1];
options = optimoptions(@lsqnonlin,'Algorithm','trust-region-reflective');
x = lsqnonlin(fun,x0,[],[],options)
lsqnonlin stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
x = 1×2
498.8309 -0.1013
options.Algorithm = 'levenberg-marquardt';
x = lsqnonlin(fun,x0,[],[],options)
x = 1×2
498.8309 -0.1013
The two algorithms found the same solution. Plot the solution and the data.
plot(xdata,ydata,'ko')
hold on
tlist = linspace(xdata(1),xdata(end));
plot(tlist,x(1)*exp(x(2)*tlist),'b-')
xlabel xdata
ylabel ydata
title('Exponential Fit to Data')
legend('Data','Exponential Fit')
hold off
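A parallel comparison runs in SciPy (Python sketch for illustration), where 'trf' and 'lm' play the roles of 'trust-region-reflective' and 'levenberg-marquardt':

```python
import numpy as np
from scipy.optimize import least_squares

xdata = np.array([0.9, 1.5, 13.8, 19.8, 24.1, 28.2, 35.2, 60.3, 74.6, 81.3])
ydata = np.array([455.2, 428.6, 124.1, 67.3, 43.2, 28.1, 13.1, -0.4, -1.3, -1.5])

# Vector of differences between model predictions and observations.
fun = lambda x: x[0] * np.exp(x[1] * xdata) - ydata

x0 = [100.0, -1.0]
sol_trf = least_squares(fun, x0, method='trf')   # trust-region method
sol_lm = least_squares(fun, x0, method='lm')     # Levenberg-Marquardt (MINPACK)
```

Both methods should agree with each other and with the solution shown above.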
Minimize the 10-term sum of squares

∑_{k=1}^{10} ( 2 + 2k − e^{k·x₁} − e^{k·x₂} )².

Because lsqnonlin assumes that the sum of squares is not explicitly formed in the user-defined
function, the function passed to lsqnonlin should instead compute the vector-valued function

Fₖ(x) = 2 + 2k − e^{k·x₁} − e^{k·x₂},  for k = 1, …, 10.
The myfun function that computes the 10-component vector F appears at the end of this example.
Find the minimizing point and the minimum value, starting at the point x0 = [0.3,0.4].
x0 = [0.3,0.4];
[x,resnorm] = lsqnonlin(@myfun,x0)
x = 1×2
0.2578 0.2578
resnorm = 124.3622
The resnorm output is the squared residual norm, the sum of squares of the function values.
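Because the myfun listing does not appear in this excerpt, the following Python/SciPy sketch (an assumed environment) reconstructs the 10-component residual from the formula for Fₖ and repeats the run:

```python
import numpy as np
from scipy.optimize import least_squares

k = np.arange(1, 11)

# F has 10 components: F_k(x) = 2 + 2k - exp(k*x1) - exp(k*x2)
def myfun(x):
    return 2 + 2 * k - np.exp(k * x[0]) - np.exp(k * x[1])

sol = least_squares(myfun, x0=[0.3, 0.4])
resnorm = np.sum(myfun(sol.x) ** 2)   # squared residual norm
```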
Examine the solution process both as it occurs (by setting the Display option to 'iter') and
afterward (by examining the output structure).
Suppose that you have observation time data xdata and observed response data ydata, and you
want to find parameters x(1) and x(2) to fit a model of the form y = x(1)*exp(x(2)*xdata).
Create a simple exponential decay model. The model computes a vector of differences between
predicted values and observed values.
fun = @(x)x(1)*exp(x(2)*xdata)-ydata;
Fit the model using the starting point x0 = [100,-1]. Examine the solution process by setting the
Display option to 'iter'. Obtain an output structure to obtain more information about the
solution process.
x0 = [100,-1];
options = optimoptions('lsqnonlin','Display','iter');
[x,resnorm,residual,exitflag,output] = lsqnonlin(fun,x0,[],[],options);
Norm of First-order
Iteration Func-count f(x) step optimality
0 3 359677 2.88e+04
Objective function returned Inf; trying a new point...
1 6 359677 11.6976 2.88e+04
2 9 321395 0.5 4.97e+04
3 12 321395 1 4.97e+04
4 15 292253 0.25 7.06e+04
5 18 292253 0.5 7.06e+04
6 21 270350 0.125 1.15e+05
7 24 270350 0.25 1.15e+05
8 27 252777 0.0625 1.63e+05
9 30 252777 0.125 1.63e+05
10 33 243877 0.03125 7.48e+04
11 36 243660 0.0625 8.7e+04
12 39 243276 0.0625 2e+04
13 42 243174 0.0625 1.14e+04
14 45 242999 0.125 5.1e+03
15 48 242661 0.25 2.04e+03
16 51 241987 0.5 1.91e+03
17 54 240643 1 1.04e+03
18 57 237971 2 3.36e+03
19 60 232686 4 6.04e+03
20 63 222354 8 1.2e+04
21 66 202592 16 2.25e+04
22 69 166443 32 4.05e+04
23 72 106320 64 6.68e+04
24 75 28704.7 128 8.31e+04
25 78 89.7947 140.674 2.22e+04
26 81 9.57381 2.02599 684
27 84 9.50489 0.0619927 2.27
28 87 9.50489 0.000462262 0.0114
lsqnonlin stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
Examine the output structure to obtain more information about the solution process.
output
First-Order Norm of
Iteration Func-count Residual optimality Lambda step
0 3 359677 2.88e+04 0.01
Objective function returned Inf; trying a new point...
1 13 340761 3.91e+04 100000 0.280777
2 16 304661 5.97e+04 10000 0.373146
3 21 297292 6.55e+04 1e+06 0.0589933
4 24 288240 7.57e+04 100000 0.0645444
The 'levenberg-marquardt' algorithm converged with fewer iterations, but almost as many
function evaluations:
output
Input Arguments
fun — Function whose sum of squares is minimized
function handle | name of function
Function whose sum of squares is minimized, specified as a function handle or the name of a function.
fun is a function that accepts an array x and returns an array F, the objective functions evaluated at
x. The function fun can be specified as a function handle to a file:
x = lsqnonlin(@myfun,x0)
If the user-defined values for x and F are arrays, they are converted to vectors using linear indexing
(see “Array Indexing” (MATLAB)).
Note The sum of squares should not be formed explicitly. Instead, your function should return a
vector of function values. See “Examples”.
If the Jacobian can also be computed and the 'SpecifyObjectiveGradient' option is true, set by
options = optimoptions('lsqnonlin','SpecifyObjectiveGradient',true)
then the function fun must return a second output argument with the Jacobian value J (a matrix) at
x. By checking the value of nargout, the function can avoid computing J when fun is called with
only one output argument (in the case where the optimization algorithm only needs the value of F but
not J).
function [F,J] = myfun(x)
F = ... % Objective function values at x
if nargout > 1 % Two output arguments
J = ... % Jacobian of the function evaluated at x
end
If fun returns an array of m components and x has n elements, where n is the number of elements of
x0, the Jacobian J is an m-by-n matrix where J(i,j) is the partial derivative of F(i) with respect to
x(j). (The Jacobian J is the transpose of the gradient of F.)
Example: @(x)cos(x).*exp(-x)
Data Types: char | function_handle | string
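In SciPy, the same pattern is expressed through the jac argument of least_squares rather than a second output argument (a Python sketch with illustrative names, assumed for comparison):

```python
import numpy as np
from scipy.optimize import least_squares

t = np.linspace(0, 2, 20)
y = 2.0 * np.exp(-0.8 * t)   # noise-free data from known parameters [2, -0.8]

def F(x):
    # Residual vector: model minus data.
    return x[0] * np.exp(x[1] * t) - y

def J(x):
    # m-by-n Jacobian, J[i, j] = dF_i/dx_j.
    e = np.exp(x[1] * t)
    return np.column_stack((e, x[0] * t * e))

sol = least_squares(F, x0=[1.0, -1.0], jac=J)
```

With exact data, the solver recovers the generating parameters.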
x0 — Initial point
real vector | real array
Initial point, specified as a real vector or real array. Solvers use the number of elements in x0 and the
size of x0 to determine the number and size of variables that fun accepts.
Example: x0 = [1,2,3,4]
Data Types: double
lb — Lower bounds
real vector | real array
Lower bounds, specified as a real vector or real array. If the number of elements in x0 is equal to the
number of elements in lb, then lb specifies that
ub — Upper bounds
real vector | real array
Upper bounds, specified as a real vector or real array. If the number of elements in x0 is equal to the
number of elements in ub, then ub specifies that
Some options apply to all algorithms, and others are relevant for particular algorithms. See
“Optimization Options Reference” on page 15-6 for detailed information.
Some options are absent from the optimoptions display. These options appear in italics in the
following table. For details, see “View Options” on page 2-66.
All Algorithms
Algorithm Choose between 'trust-region-reflective' (default) and
'levenberg-marquardt'.
delta = v.*sign′(x).*max(abs(x),TypicalX);
delta = v.*max(abs(x),TypicalX);
For optimset, the name is TolFun. See “Current and Legacy Option
Name Tables” on page 15-21.
FunValCheck Check whether function values are valid. 'on' displays an error when
the function returns a value that is complex, Inf, or NaN. The default
'off' displays no error.
For optimset, the name is MaxIter. See “Current and Legacy Option
Name Tables” on page 15-21.
OptimalityTolerance Termination tolerance on the first-order optimality (a positive scalar).
The default is 1e-6. See “First-Order Optimality Measure” on page 3-
11.
For optimset, the name is TolFun. See “Current and Legacy Option
Name Tables” on page 15-21.
OutputFcn Specify one or more user-defined functions that an optimization
function calls at each iteration. Pass a function handle or a cell array of
function handles. The default is none ([]). See “Output Function
Syntax” on page 15-26.
PlotFcn Plots various measures of progress while the algorithm executes; select
from predefined plots or write your own. Pass a name, a function
handle, or a cell array of names or function handles. For custom plot
functions, pass function handles. The default is none ([]):
Custom plot functions use the same syntax as output functions. See
“Output Functions” on page 3-32 and “Output Function Syntax” on
page 15-26.
SpecifyObjectiveGradient If false (default), the solver approximates the Jacobian using finite
differences. If true, the solver uses a user-defined Jacobian (defined in
fun), or Jacobian information (when using JacobMult), for the
objective function.
For optimset, the name is Jacobian, and the values are 'on' or
'off'. See “Current and Legacy Option Name Tables” on page 15-21.
StepTolerance Termination tolerance on x, a positive scalar. The default is 1e-6. See
“Tolerances and Stopping Criteria” on page 2-68.
For optimset, the name is TolX. See “Current and Legacy Option
Name Tables” on page 15-21.
TypicalX Typical x values. The number of elements in TypicalX is equal to the
number of elements in x0, the starting point. The default value is
ones(numberofvariables,1). The solver uses TypicalX for
scaling finite differences for gradient estimation.
UseParallel When true, the solver estimates gradients in parallel. Disable by
setting to the default, false. See “Parallel Computing”.
Trust-Region-Reflective Algorithm
W = jmfun(Jinfo,Y,flag)
where Jinfo contains the matrix used to compute J*Y (or J'*Y, or
J'*(J*Y)). The first argument Jinfo must be the same as the second
argument returned by the objective function fun, for example, by
[F,Jinfo] = fun(x)
You must supply at least the objective, x0, solver, and options fields in the problem structure.
The simplest way of obtaining a problem structure is to export the problem from the Optimization
app.
Data Types: struct
Output Arguments
x — Solution
real vector | real array
Solution, returned as a real vector or real array. The size of x is the same as the size of x0. Typically,
x is a local solution to the problem when exitflag is positive. For information on the quality of the
solution, see “When the Solver Succeeds” on page 4-18.
Squared norm of the residual, returned as a nonnegative real. resnorm is the squared 2-norm of the
residual at x: sum(fun(x).^2).
Jacobian at the solution, returned as a real matrix. jacobian(i,j) is the partial derivative of
fun(i) with respect to x(j) at the solution x.
Limitations
• The Levenberg-Marquardt algorithm does not handle bound constraints.
• The trust-region-reflective algorithm does not solve underdetermined systems; it requires that the
number of equations, i.e., the row dimension of F, be at least as great as the number of variables.
In the underdetermined case, lsqnonlin uses the Levenberg-Marquardt algorithm.
Since the trust-region-reflective algorithm does not handle underdetermined systems and the
Levenberg-Marquardt does not handle bound constraints, problems that have both of these
characteristics cannot be solved by lsqnonlin.
• lsqnonlin can solve complex-valued problems directly with the levenberg-marquardt
algorithm. However, this algorithm does not accept bound constraints. For a complex problem
with bound constraints, split the variables into real and imaginary parts, and use the trust-
region-reflective algorithm. See “Fit a Model to Complex-Valued Data” on page 12-50.
• The preconditioner computation used in the preconditioned conjugate gradient part of the trust-
region-reflective method forms JTJ (where J is the Jacobian matrix) before computing the
preconditioner. Therefore, a row of J with many nonzeros, which results in a nearly dense product
JTJ, can lead to a costly solution process for large problems.
• If components of x have no upper (or lower) bounds, lsqnonlin prefers that the corresponding
components of ub (or lb) be set to inf (or -inf for lower bounds) as opposed to an arbitrary but
very large positive (or negative for lower bounds) number.
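The splitting workaround in the third limitation can be sketched as follows in Python/SciPy (an assumed environment), fitting one complex coefficient z = x(1) + i·x(2) and checking the result against the closed-form least-squares solution:

```python
import numpy as np
from scipy.optimize import least_squares

a = np.array([1 + 2j, 2 - 1j, 0.5 + 0.5j])
b = np.array([3 + 1j, 1 + 4j, -1 + 0j])

# Stack real and imaginary parts of the complex residual a*z - b,
# with z = x[0] + 1j*x[1], so the optimization variables stay real
# (and could therefore carry bound constraints).
def F(x):
    r = a * (x[0] + 1j * x[1]) - b
    return np.concatenate((r.real, r.imag))

sol = least_squares(F, x0=[0.0, 0.0])
z = sol.x[0] + 1j * sol.x[1]

# Closed-form minimizer of ||a*z - b||: z = (a^H b) / (a^H a).
z_exact = np.vdot(a, b) / np.vdot(a, a)
```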
You can use the trust-region reflective algorithm in lsqnonlin, lsqcurvefit, and fsolve with
small- to medium-scale problems without computing the Jacobian in fun or providing the Jacobian
sparsity pattern. (This also applies to using fmincon or fminunc without computing the Hessian or
supplying the Hessian sparsity pattern.) How small is small- to medium-scale? No absolute answer is
available, as it depends on the amount of virtual memory in your computer system configuration.
Suppose your problem has m equations and n unknowns. If the command J = sparse(ones(m,n))
causes an Out of memory error on your machine, then this is certainly too large a problem. If it
does not result in an error, the problem might still be too large. You can find out only by running it
and seeing if MATLAB runs within the amount of virtual memory available on your system.
Algorithms
The Levenberg-Marquardt and trust-region-reflective methods are based on the nonlinear least-
squares algorithms also used in fsolve.
• The default trust-region-reflective algorithm is a subspace trust-region method and is based on the
interior-reflective Newton method described in [1] and [2]. Each iteration involves the
approximate solution of a large linear system using the method of preconditioned conjugate
gradients (PCG). See “Trust-Region-Reflective Least Squares” on page 12-3.
• The Levenberg-Marquardt method is described in references [4], [5], and [6]. See “Levenberg-
Marquardt Method” on page 12-6.
References
[1] Coleman, T.F. and Y. Li. “An Interior, Trust Region Approach for Nonlinear Minimization Subject to
Bounds.” SIAM Journal on Optimization, Vol. 6, 1996, pp. 418–445.
[2] Coleman, T.F. and Y. Li. “On the Convergence of Reflective Newton Methods for Large-Scale
Nonlinear Minimization Subject to Bounds.” Mathematical Programming, Vol. 67, Number 2,
1994, pp. 189–224.
[3] Dennis, J. E. Jr. “Nonlinear Least-Squares.” State of the Art in Numerical Analysis, ed. D. Jacobs,
Academic Press, pp. 269–312.
[4] Levenberg, K. “A Method for the Solution of Certain Problems in Least-Squares.” Quarterly
Applied Mathematics 2, 1944, pp. 164–168.
[5] Marquardt, D. “An Algorithm for Least-squares Estimation of Nonlinear Parameters.” SIAM
Journal Applied Mathematics, Vol. 11, 1963, pp. 431–441.
[6] Moré, J. J. “The Levenberg-Marquardt Algorithm: Implementation and Theory.” Numerical
Analysis, ed. G. A. Watson, Lecture Notes in Mathematics 630, Springer Verlag, 1977, pp. 105–116.
[7] Moré, J. J., B. S. Garbow, and K. E. Hillstrom. User Guide for MINPACK 1. Argonne National
Laboratory, Rept. ANL–80–74, 1980.
[8] Powell, M. J. D. “A Fortran Subroutine for Solving Systems of Nonlinear Algebraic Equations.”
Numerical Methods for Nonlinear Algebraic Equations, P. Rabinowitz, ed., Ch.7, 1970.
Extended Capabilities
Automatic Parallel Support
Accelerate code by automatically running computation in parallel using Parallel Computing Toolbox™.
options = optimoptions('solvername','UseParallel',true)
For more information, see “Using Parallel Computing in Optimization Toolbox” on page 14-5.
See Also
fsolve | lsqcurvefit | optimoptions
Topics
“Nonlinear Least Squares (Curve Fitting)”
“Solver-Based Optimization Problem Setup”
lsqnonneg
Solve nonnegative linear least-squares problem
Syntax
x = lsqnonneg(C,d)
x = lsqnonneg(C,d,options)
x = lsqnonneg(problem)
[x,resnorm,residual] = lsqnonneg( ___ )
[x,resnorm,residual,exitflag,output] = lsqnonneg( ___ )
[x,resnorm,residual,exitflag,output,lambda] = lsqnonneg( ___ )
Description
Solve nonnegative least-squares curve fitting problems of the form

minₓ ‖C·x − d‖₂², where x ≥ 0.
Note lsqnonneg applies only to the solver-based approach. For a discussion of the two optimization
approaches, see “First Choose Problem-Based or Solver-Based Approach” on page 1-3.
x = lsqnonneg(C,d) returns the vector x that minimizes norm(C*x-d) subject to x ≥ 0.
Arguments C and d must be real.
x = lsqnonneg(C,d,options) minimizes with the optimization options specified in the
structure options. Use optimset to set these options.
x = lsqnonneg(problem) finds the minimum for problem, where problem is a structure. Create
the problem argument by exporting a problem from Optimization app, as described in “Exporting
Your Work” on page 5-9.
[x,resnorm,residual] = lsqnonneg( ___ ), for any previous syntax, additionally returns the
value of the squared 2-norm of the residual, norm(C*x-d)^2, and returns the residual d-C*x.
Examples
Compute a nonnegative solution to a linear least-squares problem, and compare the result to the
solution of an unconstrained problem.
C = [0.0372 0.2869
0.6861 0.7071
0.6233 0.6245
0.6344 0.6170];
d = [0.8587
0.1781
0.0747
0.8405];
x = lsqnonneg(C,d)
x = 2×1
0
0.6929
xunc = C\d
xunc = 2×1
-2.5627
3.1108
All entries in x are nonnegative, but some entries in xunc are negative.
constrained_norm = norm(C*x - d)
constrained_norm = 0.9118
unconstrained_norm = norm(C*xunc - d)
unconstrained_norm = 0.6674
The unconstrained solution has a smaller residual norm because constraints can only increase a
residual norm.
Set the Display option to 'final' to see output when lsqnonneg finishes.
options = optimset('Display','final');
C = [0.0372 0.2869
0.6861 0.7071
0.6233 0.6245
0.6344 0.6170];
d = [0.8587
0.1781
0.0747
0.8405];
x = lsqnonneg(C,d,options);
Optimization terminated.
Call lsqnonneg with outputs to obtain the solution, residual norm, and residual vector.
C = [0.0372 0.2869
0.6861 0.7071
0.6233 0.6245
0.6344 0.6170];
d = [0.8587
0.1781
0.0747
0.8405];
[x,resnorm,residual] = lsqnonneg(C,d)
x = 2×1
0
0.6929
resnorm = 0.8315
residual = 4×1
0.6599
-0.3119
-0.3580
0.4130
Verify that the returned residual norm is the square of the norm of the returned residual vector.
norm(residual)^2
ans = 0.8315
Request all output arguments to examine the solution and solution process after lsqnonneg finishes.
C = [0.0372 0.2869
0.6861 0.7071
0.6233 0.6245
0.6344 0.6170];
d = [0.8587
0.1781
0.0747
0.8405];
[x,resnorm,residual,exitflag,output,lambda] = lsqnonneg(C,d)
x = 2×1
0
0.6929
resnorm = 0.8315
residual = 4×1
0.6599
-0.3119
-0.3580
0.4130
exitflag = 1
lambda = 2×1
-0.1506
-0.0000
x(1) = 0, and the corresponding lambda(1) ≠ 0, showing the correct duality. Similarly, x(2) > 0,
and the corresponding lambda(2) = 0.
Input Arguments
C — Linear multiplier
real matrix
Linear multiplier, specified as a real matrix. Represents the variable C in the problem
minₓ ‖Cx − d‖₂².
d — Additive term
real vector
Additive term, specified as a real vector. Represents the variable d in the problem
minₓ ‖Cx − d‖₂².
options — Optimization options
structure such as optimset returns
Optimization options, specified as a structure such as optimset returns. You can use optimset to
set or change the values of these fields in the options structure. See “Optimization Options
Reference” on page 15-6 for detailed information.
problem — Problem structure
structure
The simplest way to obtain a problem structure is to export the problem from the Optimization app.
Data Types: struct
Output Arguments
x — Solution
real vector
Solution, returned as a real vector. The length of x is the same as the length of d.
residual — Residual
real vector

Residual, returned as a real vector with value d-C*x.

lambda — Lagrange multipliers
real vector

Lagrange multipliers, returned as a real vector. The entries satisfy the complementarity condition
x'*lambda = 0. This means lambda(i) < 0 when x(i) is approximately 0, and lambda(i) is
approximately 0 when x(i) > 0.
Tips
• For problems where d has length over 20, lsqlin might be faster than lsqnonneg. When d has
length under 20, lsqnonneg is generally more efficient.
To convert between the solvers when C has more rows than columns (meaning the system is
overdetermined),
[x,resnorm,residual,exitflag,output,lambda] = lsqnonneg(C,d)
is equivalent to
[m,n] = size(C);
[x,resnorm,residual,exitflag,output,lambda_lsqlin] = ...
lsqlin(C,d,-eye(n,n),zeros(n,1));
The only difference is that the corresponding Lagrange multipliers have opposite signs: lambda =
-lambda_lsqlin.ineqlin.
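The following sketch checks this sign relation numerically, using the same C and d data as the earlier examples:

```matlab
C = [0.0372 0.2869; 0.6861 0.7071; 0.6233 0.6245; 0.6344 0.6170];
d = [0.8587; 0.1781; 0.0747; 0.8405];
[~,~,~,~,~,lambda] = lsqnonneg(C,d);
n = size(C,2);
[~,~,~,~,~,lambda_lsqlin] = lsqlin(C,d,-eye(n,n),zeros(n,1));
max(abs(lambda + lambda_lsqlin.ineqlin))  % expected to be near zero
```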
Algorithms
lsqnonneg uses the algorithm described in [1]. The algorithm starts with a set of possible basis
vectors and computes the associated dual vector lambda. It then selects the basis vector
corresponding to the maximum value in lambda to swap it out of the basis in exchange for another
possible candidate. This continues until lambda ≤ 0.
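The active-set idea can be sketched as follows. This is a simplified, hedged reconstruction of the Lawson-Hanson approach, not the built-in implementation, which handles tolerances, anti-cycling, and degenerate cases more carefully; the file name nnls_sketch is hypothetical.

```matlab
function x = nnls_sketch(C,d)
% Simplified sketch of the Lawson-Hanson active-set algorithm for
% min norm(C*x-d) subject to x >= 0.
n = size(C,2);
P = false(n,1);              % passive (basis) set
x = zeros(n,1);
lambda = C'*(d - C*x);       % dual vector
tol = 10*eps*norm(C,1)*n;
while any(~P) && max(lambda(~P)) > tol
    [~,idx] = max(lambda .* ~P);   % most positive dual among nonbasic
    P(idx) = true;                 % swap that vector into the basis
    z = zeros(n,1);
    z(P) = C(:,P)\d;               % unconstrained solve on the basis
    while min(z(P)) <= 0           % step back toward x until feasible
        Q = P & (z <= 0);
        alpha = min(x(Q)./(x(Q) - z(Q)));
        x = x + alpha*(z - x);
        P = P & (x > tol);
        z = zeros(n,1);
        z(P) = C(:,P)\d;
    end
    x = z;
    lambda = C'*(d - C*x);
end
end
```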
References
[1] Lawson, C. L. and R. J. Hanson. Solving Least-Squares Problems. Upper Saddle River, NJ: Prentice
Hall. 1974. Chapter 23, p. 161.
Extended Capabilities
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
See Also
lsqlin | mldivide | optimset | optimtool
mapSolution
Package: optim.problemdef
Create optimization solution structure from solver outputs
Syntax
[sol,fval,exitflag,output,lambda] = mapSolution(prob,xin,fin,eflag,outpt,
lambdain,solver)
Description
[sol,fval,exitflag,output,lambda] = mapSolution(prob,xin,fin,eflag,outpt,
lambdain,solver) formats an optimization solution in the form that solve returns.
Note You can request any subset of the output arguments. If you do, you can include only those input
arguments that are required for the output arguments that you request. For example,
[sol,fval] = mapSolution(prob,xin,fin)
% or
[sol,~,~,~,lambda] = mapSolution(prob,xin,[],[],[],lambdain,solver)
Examples
Transform Solution
x = optimvar('x');
y = optimvar('y');
prob = optimproblem;
prob.Objective = -x - y/3;
prob.Constraints.cons1 = x + y <= 2;
prob.Constraints.cons2 = x + y/4 <= 1;
prob.Constraints.cons3 = x - y <= 2;
prob.Constraints.cons4 = x/4 + y >= -1;
prob.Constraints.cons5 = x + y >= 1;
prob.Constraints.cons6 = -x + y <= 2;
Convert the problem to a structure to enable solution by a different solver than solve.
problem = prob2struct(prob);
options = optimoptions('linprog','Algorithm','dual-simplex');
problem.options = options;
[xin,fin,eflag,outpt,lambdain] = linprog(problem);
[sol,fval,exitflag,output,lambda] = mapSolution(prob,xin,fin,eflag,outpt,lambdain,'linprog')
fval = -1.1111
exitflag =
OptimalSolution
Input Arguments
prob — Optimization problem
OptimizationProblem object
Optimization problem, specified as an OptimizationProblem object.

xin — Solution point
real vector
Optimization solution, specified as a real vector. The length of the vector is the sum of the number of
elements in the problem variables.
Data Types: double
fin — Objective function value at xin
real number
If there is an additive constant for the objective function, include it in fin. In other words, if the
objective function is a + f'*x, then include the value a, not just f'*x, in fin.
Data Types: double
outpt — Output structure
structure
Output structure, specified as a structure. This structure is copied to the output variable, and the
solver input is appended as the solver field of the resulting structure.
Data Types: struct
lambdain — Lagrange multipliers
structure
Lagrange multipliers, specified as a structure. For details of the fields of this structure, see “lambda”
on page 16-0 .
Data Types: struct
Note If you request an output structure, you must provide a solver input.
Output Arguments
sol — Solution
structure
Solution, returned as a structure. The fields of the structure are the names of the optimization
variables. See optimvar.
exitflag — Reason the solver stopped
enumeration variable
Reason the solver stopped, returned as an enumeration variable. You can convert exitflag to its
numeric equivalent using double(exitflag), and to its string equivalent using
string(exitflag).
This table describes the exit flags for the intlinprog solver.
• LPMaxIterations
• MaxNodes
• MaxTime
• RootLPMaxIterations
Exitflags 3 and -9 relate to solutions that have large infeasibilities. These usually arise from linear
constraint matrices that have large condition number, or problems that have large solution
components. To correct these issues, try to scale the coefficient matrices, eliminate redundant linear
constraints, or give tighter bounds on the variables.
This table describes the exit flags for the linprog solver.
Exitflags 3 and -9 relate to solutions that have large infeasibilities. These usually arise from linear
constraint matrices that have large condition number, or problems that have large solution
components. To correct these issues, try to scale the coefficient matrices, eliminate redundant linear
constraints, or give tighter bounds on the variables.
This table describes the exit flags for the lsqlin solver.
This table describes the exit flags for the quadprog solver.
This table describes the exit flags for the lsqcurvefit or lsqnonlin solver.
This table describes the exit flags for the fminunc solver.
This table describes the exit flags for the fmincon solver.
This table describes the exit flags for the fsolve solver.
This table describes the exit flags for the fzero solver.
output — Information about the optimization process
structure
Information about the optimization process, returned as a structure. The output structure contains
the fields in the relevant underlying solver output field, depending on which solver solve called:
• 'fmincon' output
• 'fminunc' output
• 'fsolve' output
• 'fzero' output
• 'intlinprog' output
• 'linprog' output
• 'lsqcurvefit' or 'lsqnonlin' output
• 'lsqlin' output
• 'lsqnonneg' output
• 'quadprog' output
solve includes the additional field Solver in the output structure to identify the solver used, such
as 'intlinprog'.
lambda — Lagrange multipliers at the solution
structure
For the intlinprog and fminunc solvers, lambda is empty, []. For the other solvers, lambda has
these fields:
• Variables – Contains fields for each problem variable. Each problem variable name is a
structure with two fields:
• Lower – Lagrange multipliers associated with the variable LowerBound property, returned as
an array of the same size as the variable. Nonzero entries mean that the solution is at the
lower bound. These multipliers are in the structure
lambda.Variables.variablename.Lower.
• Upper – Lagrange multipliers associated with the variable UpperBound property, returned as
an array of the same size as the variable. Nonzero entries mean that the solution is at the
upper bound. These multipliers are in the structure
lambda.Variables.variablename.Upper.
• Constraints – Contains a field for each problem constraint. Each problem constraint is in a
structure whose name is the constraint name, and whose value is a numeric array of the same size
as the constraint. Nonzero entries mean that the constraint is active at the solution. These
multipliers are in the structure lambda.Constraints.constraintname.
Note Elements of a constraint array all have the same comparison (<=, ==, or >=) and are all of
the same type (linear, quadratic, or nonlinear).
See Also
OptimizationProblem | prob2struct | solve | varindex
Topics
“Examine Optimization Solution” on page 10-25
“Include Derivatives in Problem-Based Workflow” on page 7-20
Introduced in R2017b
mpsread
Read MPS file for LP and MILP optimization data
Syntax
problem = mpsread(mpsfile)
Description
problem = mpsread(mpsfile) reads data for linear programming (LP) and mixed-integer linear
programming (MILP) problems. It returns the data in a structure that the intlinprog or linprog
solvers accept.
Examples
Load the eil33-2.mps file from a public repository. View the problem type.
gunzip('http://miplib.zib.de/WebData/instances/eil33-2.mps.gz')
problem = mpsread('eil33-2.mps')
problem =
f: [4516x1 double]
Aineq: [0x4516 double]
bineq: [0x1 double]
Aeq: [32x4516 double]
beq: [32x1 double]
lb: [4516x1 double]
ub: [4516x1 double]
intcon: [4516x1 double]
solver: 'intlinprog'
options: [1x1 optim.options.Intlinprog]
Notice that problem.intcon is not empty, and problem.solver is 'intlinprog'. The problem is
an integer linear programming problem.
Change the options to suppress iterative display and to generate a plot as the solver progresses.
options = optimoptions('intlinprog','Display','final','PlotFcn',@optimplotmilp);
problem.options = options;
[x,fval,exitflag,output] = intlinprog(problem);
Intlinprog stopped because the objective value is within a gap tolerance of the optimal value,
options.AbsoluteGapTolerance = 0 (the default value). The intcon variables are
integer within tolerance, options.IntegerTolerance = 1e-05 (the default value).
Input Arguments
mpsfile — Path to MPS file
character vector
Path to MPS file, specified as a character vector. mpsfile should be a file in the MPS format.
Example: 'documents/optimization/milpproblem.mps'
Data Types: char
Output Arguments
problem — Problem structure
structure
Problem structure, returned as a structure with the fields f, Aineq, bineq, Aeq, beq, lb, ub,
intcon, solver, and options. The options field contains default options for the returned solver,
as created by optimoptions(solver).
See Also
intlinprog | linprog
Topics
“Linear Programming and Mixed-Integer Linear Programming”
Introduced in R2015b
optimget
Optimization options values
Syntax
val = optimget(options,'param')
val = optimget(options,'param',default)
Description
val = optimget(options,'param') returns the value of the specified option in the optimization
options structure options. You need to type only enough leading characters to define the option
name uniquely. Case is ignored for option names.
val = optimget(options,'param',default) returns default if the specified option is not
defined in options.
Examples
This statement returns the value of the Display option in the structure called my_options.
val = optimget(my_options,'Display')
This statement returns the value of the Display option in the structure called my_options (as in
the previous example) except that if the Display option is not defined, it returns the value 'final'.
optnew = optimget(my_options,'Display','final');
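A brief sketch of the name matching described above: any unambiguous, case-insensitive leading substring of the option name works.

```matlab
my_options = optimset('Display','iter');
val1 = optimget(my_options,'Display');  % full option name
val2 = optimget(my_options,'disp');     % unambiguous prefix, case ignored
% val1 and val2 are both 'iter'
```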
Extended Capabilities
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
See Also
optimset
optimconstr
Create empty optimization constraint array
Syntax
constr = optimconstr(N)
constr = optimconstr(cstr)
constr = optimconstr(cstr1,N2,...,cstrk)
constr = optimconstr({cstr1,cstr2,...,cstrk})
constr = optimconstr([N1,N2,...,Nk])
Description
constr = optimconstr(N) creates an N-by-1 array of empty optimization constraints. Use constr
to initialize a loop that creates constraint expressions.
constr = optimconstr(cstr) creates an array of empty optimization constraints that are indexed
by cstr, a cell array of character vectors or string vectors.
If cstr is 1-by-ncstr, where ncstr is the number of elements of cstr, then constr is also 1-by-
ncstr. Otherwise, constr is ncstr-by-1.
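For example, this sketch shows the orientation rule with a hypothetical set of names:

```matlab
rowNames = ["small","medium","large"];   % 1-by-3 string vector
colNames = ["small";"medium";"large"];   % 3-by-1 string vector
c1 = optimconstr(rowNames);   % 1-by-3 empty constraint array
c2 = optimconstr(colNames);   % 3-by-1 empty constraint array
```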
Examples
Create constraints for an inventory model. The stock of goods at the start of each period is equal to
the stock at the end of the previous period. During each period, the stock increases by buy and
decreases by sell. The variable stock is the stock at the end of the period.
N = 12;
stock = optimvar('stock',N,1,'Type','integer','LowerBound',0);
buy = optimvar('buy',N,1,'Type','integer','LowerBound',0);
sell = optimvar('sell',N,1,'Type','integer','LowerBound',0);
initialstock = 100;
stockbalance = optimconstr(N,1);
for t = 1:N
if t == 1
enterstock = initialstock;
else
enterstock = stock(t-1);
end
stockbalance(t) = stock(t) == enterstock + buy(t) - sell(t);
end
show(stockbalance)
(1, 1)
(2, 1)
(3, 1)
(4, 1)
(5, 1)
(6, 1)
(7, 1)
(8, 1)
(9, 1)
(10, 1)
(11, 1)
(12, 1)
prob = optimproblem;
prob.Constraints.stockbalance = stockbalance;
Instead of using a loop, you can create the same constraints by using matrix operations on the
variables.
tt = ones(N-1,1);
d = diag(tt,-1); % shift index by -1
stockbalance2 = stock == d*stock + buy - sell;
stockbalance2(1) = stock(1) == initialstock + buy(1) - sell(1);
See that the new constraints are the same as the constraints in stockbalance:
show(stockbalance2)
(1, 1)
(2, 1)
(3, 1)
(4, 1)
(5, 1)
(6, 1)
(7, 1)
(8, 1)
(9, 1)
(10, 1)
(11, 1)
(12, 1)
Creating constraints in a loop can be more time-consuming than creating constraints by matrix
operations. However, you are less likely to create an erroneous constraint by using loops.
Create indexed constraints and variables to represent the calories consumed in a diet. Each meal has
a different calorie limit.
meals = ["breakfast","lunch","dinner"];
constr = optimconstr(meals);
foods = ["cereal","oatmeal","yogurt","peanut butter sandwich","pizza","hamburger",...
"salad","steak","casserole","ice cream"];
diet = optimvar('diet',foods,meals,'LowerBound',0);
calories = [200,175,150,450,350,800,150,650,350,300]';
for i = 1:3
constr(i) = diet(:,i)'*calories <= 250*i;
end
show(constr("dinner"))
Input Arguments
N — Size of constraint dimension
positive integer
Example: 5
Data Types: double
cstr — Names for indexing
cell array of character vectors | string vector
Names for indexing, specified as a cell array of character vectors or a string vector.
Example: {'red','orange','green','blue'}
Example: ["red";"orange";"green";"blue"]
Data Types: string | cell
Output Arguments
constr — Constraints
empty OptimizationConstraint array
For example:
x = optimvar('x',8);
constr = optimconstr(4);
for k = 1:4
constr(k) = 5*k*(x(2*k) - x(2*k-1)) <= 10 - 2*k;
end
Limitations
• Each constraint expression in a problem must use the same comparison. For example, the
following code leads to an error, because cons1 uses the <= comparison, cons2 uses the >=
comparison, and cons1 and cons2 are in the same expression.
prob = optimproblem;
x = optimvar('x',2,'LowerBound',0);
cons1 = x(1) + x(2) <= 10;
cons2 = 3*x(1) + 4*x(2) >= 2;
prob.Constraints = [cons1;cons2]; % This line throws an error
You can avoid this error by using separate expressions for the constraints.
prob.Constraints.cons1 = cons1;
prob.Constraints.cons2 = cons2;
Tips
• It is generally more efficient to create constraints by vectorized expressions rather than loops. See
“Create Efficient Optimization Problems” on page 10-28.
• You can use optimineq instead of optimconstr to create inequality expressions. Similarly, you
can use optimeq instead of optimconstr to create equality expressions.
See Also
OptimizationConstraint | OptimizationExpression | OptimizationProblem |
OptimizationVariable | optimeq | optimexpr | optimineq
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2017b
optimeq
Create empty optimization equality array
Syntax
eq = optimeq(N)
eq = optimeq(cstr)
eq = optimeq(cstr1,N2,...,cstrk)
eq = optimeq({cstr1,cstr2,...,cstrk})
eq = optimeq([N1,N2,...,Nk])
Description
eq = optimeq(N) creates an N-by-1 array of empty optimization equalities. Use eq to initialize a
loop that creates equalities. Use the resulting equalities as constraints in an optimization problem or
as equations in an equation problem.
eq = optimeq(cstr) creates an array of empty optimization equalities that are indexed by cstr, a
cell array of character vectors or string vectors.
If cstr is 1-by-ncstr, where ncstr is the number of elements of cstr, then eq is also 1-by-ncstr.
Otherwise, eq is ncstr-by-1.
eq = optimeq(cstr1,N2,...,cstrk) or eq = optimeq({cstr1,cstr2,...,cstrk}) or eq
= optimeq([N1,N2,...,Nk]), for any combination of cstr and N arguments, creates an ncstr1-
by-N2-by-...-by-ncstrk array of empty optimization equalities, where ncstr is the number of
elements in cstr.
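For example, this sketch creates a 2-by-3 array using one name argument and one size argument (the names are hypothetical):

```matlab
shifts = ["day","night"];
eqs = optimeq(shifts,3);   % 2-by-3 empty equality array
size(eqs)
```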
Examples
Create equality constraints for an inventory model. The stock of goods at the start of each period is
equal to the stock at the end of the previous period. During each period, the stock increases by buy
and decreases by sell. The variable stock is the stock at the end of the period.
N = 12;
stock = optimvar('stock',N,1,'Type','integer','LowerBound',0);
buy = optimvar('buy',N,1,'Type','integer','LowerBound',0);
sell = optimvar('sell',N,1,'Type','integer','LowerBound',0);
initialstock = 100;
stockbalance = optimeq(N,1);
for t = 1:N
if t == 1
enterstock = initialstock;
else
enterstock = stock(t-1);
end
stockbalance(t) = stock(t) == enterstock + buy(t) - sell(t);
end
show(stockbalance)
(1, 1)
(2, 1)
(3, 1)
(4, 1)
(5, 1)
(6, 1)
(7, 1)
(8, 1)
(9, 1)
(10, 1)
(11, 1)
(12, 1)
prob = optimproblem;
prob.Constraints.stockbalance = stockbalance;
Instead of using a loop, you can create the same constraints by using matrix operations on the
variables.
tt = ones(N-1,1);
d = diag(tt,-1); % shift index by -1
stockbalance2 = stock == d*stock + buy - sell;
stockbalance2(1) = stock(1) == initialstock + buy(1) - sell(1);
Display the new constraints. Note that they are the same as the constraints in stockbalance.
show(stockbalance2)
(1, 1)
(2, 1)
(3, 1)
(4, 1)
(5, 1)
(6, 1)
(7, 1)
(8, 1)
(9, 1)
(10, 1)
(11, 1)
(12, 1)
Creating constraints in a loop can be more time consuming than creating constraints by using matrix
operations.
Create indexed equalities for a problem that involves shipping goods between airports. First, create
indices representing airports.
airports = ["LAX" "JFK" "ORD"];
Create indices representing the goods, and an array listing the weight of one unit of each type of
goods. These names and weights are arbitrary example data.
goods = ["wheat" "corn" "barley"];
weights = [2 3 1];
Create a variable array representing quantities of goods to be shipped from each airport to each
other airport. quantities(airport1,airport2,goods) represents the quantity of goods being
shipped from airport1 to airport2.
quantities = optimvar('quantities',airports,airports,goods,'LowerBound',0);
Create an equality constraint that the sum of the weights of goods being shipped from each airport is
equal to the sum of the weights of goods being shipped to the airport.
eq = optimeq(airports);
outweight = optimexpr(size(eq));
inweight = optimexpr(size(eq));
for i = 1:length(airports)
temp = optimexpr;
temp2 = optimexpr;
for j = 1:length(airports)
for k = 1:length(goods)
temp = temp + quantities(i,j,k)*weights(k);
temp2 = temp2 + quantities(j,i,k)*weights(k);
end
end
outweight(i) = temp;
inweight(i) = temp2;
eq(i) = outweight(i) == inweight(i);
end
show(eq)
(1, 'LAX')
(1, 'JFK')
(1, 'ORD')
To avoid the nested for loops, express the equalities using standard MATLAB® operators. Create the
array of departing quantities by summing over the arrival airport indices. Squeeze the result to
remove the singleton dimension.
departing = squeeze(sum(quantities,2));
Create the array of arriving quantities similarly, summing over the departure airport indices.
Multiply by the unit weights to obtain the total weight departing from and arriving at each airport.
arriving = squeeze(sum(quantities,1));
departweights = departing*weights';
arriveweights = arriving*weights';
Create the constraints that the departing weights equal the arriving weights.
eq2 = departweights == arriveweights;
Include the appropriate index names for the equalities by setting the IndexNames property.
eq2.IndexNames = {airports,{}};
Display the new equalities. Note that they match the previous equalities, but are transposed vectors.
show(eq2)
('LAX', 1)
('JFK', 1)
('ORD', 1)
Creating constraints in a loop can be more time consuming than creating constraints by using matrix
operations.
Input Arguments
N — Size of constraint dimension
positive integer
Example: 5
Data Types: double
cstr — Names for indexing
cell array of character vectors | string vector
Names for indexing, specified as a cell array of character vectors or a string vector.
Example: {'red','orange','green','blue'}
Example: ["red";"orange";"green";"blue"]
Data Types: string | cell
Output Arguments
eq — Equalities
empty OptimizationEquality array
For example:
x = optimvar('x',8);
eq = optimeq(4);
for k = 1:4
eq(k) = 5*k*(x(2*k) - x(2*k-1)) == 10 - 2*k;
end
Tips
• You can use optimconstr instead of optimeq to create equality constraints for optimization
problems or equations for equation problems.
See Also
OptimizationEquality | optimconstr | optimineq
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2019b
optimexpr
Create empty optimization expression array
Syntax
expr = optimexpr(n)
expr = optimexpr(cstr)
expr = optimexpr(cstr1,n2,...,cstrk)
expr = optimexpr([n1,n2,...,nk])
expr = optimexpr({cstr1,cstr2,...,cstrk})
Description
expr = optimexpr(n) creates an empty n-by-1 OptimizationExpression array. Use expr as
the initial value in a loop that creates optimization expressions.
expr = optimexpr(cstr) creates an empty OptimizationExpression array that can use the
vector cstr for indexing. The number of elements of expr is the same as the length of cstr. When
cstr is a row vector, then expr is a row vector. When cstr is a column vector, then expr is a
column vector.
Examples
expr = optimexpr(3)
expr =
3x1 OptimizationExpression array with properties:
Create a string array of color names, and an optimization expression that is indexed by the color
names.
strexp = ["red","green","blue","yellow"];
expr = optimexpr(strexp)
expr =
1x4 OptimizationExpression array with properties:
You can use a cell array of character vectors instead of strings to get the same effect.
strexp = {'red','green','blue','yellow'};
expr = optimexpr(strexp)
expr =
1x4 OptimizationExpression array with properties:
Use a column vector cstr to create a column vector of expressions.
strexp = ["red";"green";"blue";"yellow"];
expr = optimexpr(strexp)
expr =
4x1 OptimizationExpression array with properties:
Create a multidimensional expression array by specifying a vector of dimensions.
expr = optimexpr([3,4,2])
expr =
3x4x2 OptimizationExpression array with properties:
Create a 3-by-4 array of optimization expressions, where the first dimension is indexed by the strings
"brass", "stainless", and "galvanized", and the second dimension is numerically indexed.
bnames = ["brass","stainless","galvanized"];
expr = optimexpr(bnames,4)
expr =
3x4 OptimizationExpression array with properties:
Create an expression using a named index indicating that each stainless expression is 1.5 times
the corresponding x(galvanized) value.
x = optimvar('x',bnames,4);
expr('stainless',:) = x('galvanized',:)*1.5;
show(expr('stainless',:))
('stainless', 1)
1.5*x('galvanized', 1)
('stainless', 2)
1.5*x('galvanized', 2)
('stainless', 3)
1.5*x('galvanized', 3)
('stainless', 4)
1.5*x('galvanized', 4)
Input Arguments
n — Variable dimension
positive integer
Output Arguments
expr — Optimization expression
OptimizationExpression object
Tips
• You can use optimexpr to create empty expressions that you fill programmatically, such as in a
for loop.
x = optimvar('x',8);
expr = optimexpr(4);
for k = 1:4
expr(k) = 5*k*(x(2*k) - x(2*k-1));
end
• It is generally more efficient to create expressions by vectorized statements rather than loops. See
“Create Efficient Optimization Problems” on page 10-28.
See Also
OptimizationExpression | optimconstr | show | write
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2017b
optimineq
Create empty optimization inequality array
Syntax
constr = optimineq(N)
constr = optimineq(cstr)
constr = optimineq(cstr1,N2,...,cstrk)
constr = optimineq({cstr1,cstr2,...,cstrk})
constr = optimineq([N1,N2,...,Nk])
Description
constr = optimineq(N) creates an N-by-1 array of empty optimization inequalities. Use constr
to initialize a loop that creates inequality expressions.
constr = optimineq(cstr) creates an array of empty optimization constraints that are indexed
by cstr, a cell array of character vectors or string vectors.
If cstr is 1-by-ncstr, where ncstr is the number of elements of cstr, then constr is also 1-by-
ncstr. Otherwise, constr is ncstr-by-1.
Examples
Create the constraint that a two-element variable x must lie in the intersections of a number of disks
whose centers and radii are in the arrays centers and radii.
x = optimvar('x',1,2);
centers = [1 -2;3 -4;-2 3];
radii = [6 7 8];
constr = optimineq(length(radii));
for i = 1:length(constr)
constr(i) = sum((x - centers(i,:)).^2) <= radii(i)^2;
end
show(constr)
where:
extraParams{1}:
1 -2
extraParams{2}:
3 -4
extraParams{3}:
-2 3
Instead of using a loop, you can create the same constraints by using matrix operations on the
variables.
constr2 = sum(([x;x;x] - centers).^2,2) <= radii'.^2;
Creating inequalities in a loop can be more time consuming than creating inequalities by using matrix
operations.
Create indexed inequalities and variables to represent the calories consumed in a diet. Each meal has
a different calorie limit. Create arrays representing the meals, foods, and calories for each food.
meals = ["breakfast","lunch","dinner"];
foods = ["cereal","oatmeal","yogurt","peanut butter sandwich","pizza","hamburger",...
"salad","steak","casserole","ice cream"];
calories = [200,175,150,450,350,800,150,650,350,300]';
Create optimization variables representing the foods for each meal, indexed by food names and meal
names.
diet = optimvar('diet',foods,meals,'LowerBound',0);
Set the inequality constraints that each meal has an upper bound on the calories in the meal.
constr = optimineq(meals);
for i = 1:3
constr(i) = diet(:,i)'*calories <= 250*i;
end
Instead of using a loop, you can create the same inequalities by using matrix operations on the
variables.
constr2 = diet'*calories <= 250*(1:3)';
Include the appropriate index names for the inequalities by setting the IndexNames property.
constr2.IndexNames = {meals,{}};
Display the new inequalities for dinner. Note that they are the same as the previous inequalities.
show(constr2("dinner"))
Creating inequalities in a loop can be more time consuming than creating inequalities by using matrix
operations.
Input Arguments
N — Size of constraint dimension
positive integer
Example: 5
Data Types: double
cstr — Names for indexing
cell array of character vectors | string vector
Names for indexing, specified as a cell array of character vectors or a string vector.
Example: {'red','orange','green','blue'}
Example: ["red";"orange";"green";"blue"]
Data Types: string | cell
Output Arguments
constr — Constraints
empty OptimizationInequality array
For example,
x = optimvar('x',8);
constr = optimineq(4);
for k = 1:4
constr(k) = 5*k*(x(2*k) - x(2*k-1)) <= 10 - 2*k;
end
Tips
• It is generally more efficient to create constraints by vectorized expressions rather than loops. See
“Create Efficient Optimization Problems” on page 10-28.
• You can use optimconstr instead of optimineq to create inequality constraints for optimization
problems.
See Also
OptimizationInequality | optimconstr | optimeq
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2019b
OptimizationConstraint
Optimization constraints
Description
An OptimizationConstraint object contains constraints in terms of OptimizationVariable
objects or OptimizationExpression objects. Each constraint uses one of these comparison
operators: ==, <=, or >=.
A single statement can represent an array of constraints. For example, you can express the
constraints that each row of a matrix variable x sums to one, as shown in “Create Simple Constraints
in Loop” on page 16-319.
Creation
Create an empty constraint object using optimconstr. Typically, you use a loop to fill the
expressions in the object.
If you create an optimization expression from optimization variables using a comparison operator
==, <=, or >=, then the resulting object is either an OptimizationEquality or an
OptimizationInequality. See “Compatibility Considerations” on page 16-320.
Include constraints in the Constraints property of an optimization problem by using dot notation.
prob = optimproblem;
x = optimvar('x',5,3);
rowsum = sum(x,2) == ones(5,1);
prob.Constraints.rowsum = rowsum;
Properties
IndexNames — Index names
'' (default) | cell array of strings | cell array of character vectors
Index names, specified as a cell array of strings or character vectors. For information on using index
names, see “Named Index for Optimization Variables” on page 10-20.
Data Types: cell
Object Functions
infeasibility Constraint violation at a point
show Display information about optimization object
write Save optimization object description
Examples
Create Simple Constraints in Loop
Create a 5-by-3 array of optimization variables.
x = optimvar('x',5,3);
Create the constraint that each row sums to one by using a loop. Initialize the loop using
optimconstr.
rowsum = optimconstr(5);
for i = 1:5
rowsum(i) = sum(x(i,:)) == 1;
end
rowsum
rowsum =
5x1 Linear OptimizationConstraint array with properties:
show(rowsum)
(1, 1)
  x(1, 1) + x(1, 2) + x(1, 3) == 1
(2, 1)
  x(2, 1) + x(2, 2) + x(2, 3) == 1
(3, 1)
  x(3, 1) + x(3, 2) + x(3, 3) == 1
(4, 1)
  x(4, 1) + x(4, 2) + x(4, 3) == 1
(5, 1)
  x(5, 1) + x(5, 2) + x(5, 3) == 1
Compatibility Considerations
OptimizationConstraint split into OptimizationEquality and OptimizationInequality
Behavior changed in R2019b
When you use a comparison operator <=, >=, or == on an optimization expression, the result is no
longer an OptimizationConstraint object. Instead, the equality comparison == returns an
OptimizationEquality object, and an inequality comparison <= or >= returns an
OptimizationInequality object. You can use these new objects for defining constraints in an
OptimizationProblem object, exactly as you would previously for OptimizationConstraint
objects. Furthermore, you can use OptimizationEquality objects to define equations for an
EquationProblem object.
The new objects make it easier to distinguish between expressions that are suitable for an
EquationProblem and those that are suitable only for an OptimizationProblem. You can use
existing OptimizationConstraint objects that represent equality constraints in an
EquationProblem object. Furthermore, when you use an OptimizationEquality or an
OptimizationInequality as a constraint in an OptimizationProblem, the software converts
the constraint to an OptimizationConstraint object.
See Also
OptimizationEquality | OptimizationExpression | OptimizationInequality |
OptimizationProblem | OptimizationVariable | infeasibility | optimconstr | show |
write
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2017b
OptimizationEquality
Equalities and equality constraints
Description
An OptimizationEquality object contains equalities and equality constraints in terms of
OptimizationVariable objects or OptimizationExpression objects. Each equality uses the
comparison operator ==.
A single statement can represent an array of equalities. For example, you can express the equalities
that each row of a matrix variable x sums to one in this single statement:
constrsum = sum(x,2) == 1
Creation
Create equalities using optimization expressions with the comparison operator ==.
prob = optimproblem;
x = optimvar('x',4,6);
SumToOne = sum(x,2) == 1;
prob.Constraints.SumToOne = SumToOne;
% Or for an equation problem:
eqprob = eqnproblem;
eqprob.Equations.SumToOne = SumToOne;
You can also create an empty optimization equality by using optimeq or optimconstr. Typically,
you then set the equalities in a loop. For an example, see “Create Equalities in Loop” on page 16-323.
However, for the most efficient problem formulation, avoid setting equalities in loops. See “Create
Efficient Optimization Problems” on page 10-28.
Properties
IndexNames — Index names
'' (default) | cell array of strings | cell array of character vectors
Index names, specified as a cell array of strings or character vectors. For information on using index
names, see “Named Index for Optimization Variables” on page 10-20.
Data Types: cell
Object Functions
infeasibility Constraint violation at a point
show Display information about optimization object
write Save optimization object description
Examples
Create a 4-by-6 array of optimization variables, and create the equalities that each row sums to one.
x = optimvar('x',4,6);
constrsum = sum(x,2) == 1
constrsum =
4×1 Linear OptimizationEquality array with properties:
show(constrsum)
(1, 1)
  x(1, 1) + x(1, 2) + x(1, 3) + x(1, 4) + x(1, 5) + x(1, 6) == 1
(2, 1)
  x(2, 1) + x(2, 2) + x(2, 3) + x(2, 4) + x(2, 5) + x(2, 6) == 1
(3, 1)
  x(3, 1) + x(3, 2) + x(3, 3) + x(3, 4) + x(3, 5) + x(3, 6) == 1
(4, 1)
  x(4, 1) + x(4, 2) + x(4, 3) + x(4, 4) + x(4, 5) + x(4, 6) == 1
prob = optimproblem;
prob.Constraints.constrsum = constrsum
prob =
OptimizationProblem with properties:
Description: ''
ObjectiveSense: 'minimize'
Variables: [1×1 struct] containing 1 OptimizationVariable
Objective: [0×0 OptimizationExpression]
Constraints: [1×1 struct] containing 1 OptimizationConstraint
eqnprob = eqnproblem;
eqnprob.Equations.constrsum = constrsum
eqnprob =
EquationProblem with properties:
Description: ''
Variables: [1×1 struct] containing 1 OptimizationVariable
Equations: [1×1 struct] containing 1 OptimizationEquality
Create Equalities in Loop
Create an empty equality array and a 5-by-5 array of optimization variables.
eq1 = optimeq;
x = optimvar('x',5,5);
Create the equalities that row i of x sums to i^2.
for i = 1:size(x,1)
eq1(i) = sum(x(i,:)) == i^2;
end
show(eq1)
(1, 1)
  x(1, 1) + x(1, 2) + x(1, 3) + x(1, 4) + x(1, 5) == 1
(1, 2)
  x(2, 1) + x(2, 2) + x(2, 3) + x(2, 4) + x(2, 5) == 4
(1, 3)
  x(3, 1) + x(3, 2) + x(3, 3) + x(3, 4) + x(3, 5) == 9
(1, 4)
  x(4, 1) + x(4, 2) + x(4, 3) + x(4, 4) + x(4, 5) == 16
(1, 5)
  x(5, 1) + x(5, 2) + x(5, 3) + x(5, 4) + x(5, 5) == 25
To use eq1 as a constraint in an optimization problem, set eq1 as a Constraints property by using
dot notation.
prob = optimproblem;
prob.Constraints.eq1 = eq1;
Similarly, to use eq1 as a set of equations in an equation problem, set eq1 as an Equations property
by using dot notation.
eqprob = eqnproblem;
eqprob.Equations.eq1 = eq1;
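For comparison, the same equalities can be built without a loop. This vectorized sketch (not in the original text) assumes the same 5-by-5 variable x:

```matlab
% Vectorized equivalent of the eq1 loop (sketch)
x = optimvar('x',5,5);
eq2 = sum(x,2) == ((1:5)').^2;   % row i of x sums to i^2
```

As noted in the Creation section, the vectorized form is the more efficient problem formulation.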
See Also
EquationProblem | OptimizationConstraint | OptimizationExpression |
OptimizationInequality | OptimizationProblem | OptimizationVariable | eqnproblem |
infeasibility | optimconstr | optimeq | show | write
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2019b
OptimizationExpression
Arithmetic or functional expression in terms of optimization variables
Description
An OptimizationExpression is an arithmetic or functional expression in terms of optimization
variables. Use an OptimizationExpression as an objective function, or as a part of an inequality
or equality in a constraint or equation.
Creation
Create an optimization expression by performing operations on OptimizationVariable objects.
Use standard MATLAB arithmetic including taking powers, indexing, and concatenation of
optimization variables to create expressions. See “Supported Operations on Optimization Variables
and Expressions” on page 10-36 and “Examples” on page 16-326.
You can also create an optimization expression from a MATLAB function applied to optimization
variables by using fcn2optimexpr. For examples, see “Create Expression from Nonlinear Function”
on page 16-328 and “Problem-Based Nonlinear Optimization”.
Create an empty optimization expression by using optimexpr. Typically, you then fill the expression
in a loop. For examples, see “Create Optimization Expression by Looping” on page 16-327 and the
optimexpr function reference page.
After you create an expression, use it as either an objective function, or as part of a constraint or
equation. For examples, see the solve function reference page.
Properties
IndexNames — Index names
'' (default) | cell array of strings | cell array of character vectors
Index names, specified as a cell array of strings or character vectors. For information on using index
names, see “Named Index for Optimization Variables” on page 10-20.
Data Types: cell
Object Functions
evaluate Evaluate optimization expression
Examples
Create a 3-by-2 array of optimization variables, and an expression that sums all of the variables.
x = optimvar('x',3,2);
expr = sum(sum(x))
expr =
  Linear OptimizationExpression
    x(1, 1) + x(2, 1) + x(3, 1) + x(1, 2) + x(2, 2) + x(3, 2)
Create another expression by multiplying a coefficient row vector times the variable array.
f = [2,10,4];
w = f*x;
show(w)
(1, 1)
  2*x(1, 1) + 10*x(2, 1) + 4*x(3, 1)
(1, 2)
  2*x(1, 2) + 10*x(2, 2) + 4*x(3, 2)
x = optimvar('x',3,2);
y = x'
y =
2x3 Linear OptimizationExpression array with properties:
Simply indexing into an optimization array does not create an expression, but instead creates an
optimization variable that references the original variable. To see this, create a variable w that is the
first and third row of x. Note that w is an optimization variable, not an optimization expression.
w = x([1,3],:)
w =
2x2 OptimizationVariable array with properties:
Elementwise properties:
LowerBound: [2x2 double]
UpperBound: [2x2 double]
f =
4x10 Linear OptimizationExpression array with properties:
Use optimexpr to create an empty expression, then fill the expression in a loop.
y = optimvar('y',6,4);
expr = optimexpr(3,2);
for i = 1:3
for j = 1:2
expr(i,j) = y(2*i,j) - y(i,2*j);
end
end
show(expr)
(1, 1)
y(2, 1) - y(1, 2)
(2, 1)
y(4, 1) - y(2, 2)
(3, 1)
y(6, 1) - y(3, 2)
(1, 2)
y(2, 2) - y(1, 4)
(2, 2)
y(4, 2) - y(2, 4)
(3, 2)
y(6, 2) - y(3, 4)
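As a sketch (not in the original text), the same 3-by-2 expression array can be created without loops by indexing blocks of y directly:

```matlab
% Vectorized equivalent of the looped expression (sketch)
y = optimvar('y',6,4);
expr2 = y(2:2:6,1:2) - y(1:3,2:2:4);   % expr2(i,j) = y(2*i,j) - y(i,2*j)
```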
Create Expression from Nonlinear Function
Create an optimization variable, then use fcn2optimexpr to create an expression from a nonlinear
function of the variable.
x = optimvar('x');
fun = fcn2optimexpr(@(x)x^2/10 + exp(-exp(-x)),x)
fun =
Nonlinear OptimizationExpression
anonymousFunction1(x)
where:
anonymousFunction1 = @(x)x^2/10+exp(-exp(-x));
Find the point that minimizes fun starting from the point x0 = 0.
x0 = struct('x',0);
prob = optimproblem('Objective',fun);
[sol,fval] = solve(prob,x0)
fval = 0.1656
Create an optimization expression in two variables, then evaluate the expression at a point.
x = optimvar('x',3,2);
y = optimvar('y',1,2);
expr = sum(x,1) - 2*y;
xmat = [3,-1;
0,1;
2,6];
sol.x = xmat;
sol.y = [4,-3];
val = evaluate(expr,sol)
val = 1×2
-3 12
More About
Arithmetic Operations
For the list of supported operations on optimization expressions, see “Supported Operations on
Optimization Variables and Expressions” on page 10-36.
See Also
OptimizationVariable | evaluate | fcn2optimexpr | optimexpr | show | solve | write
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
“Optimization Expressions” on page 10-6
Introduced in R2017b
OptimizationInequality
Inequality constraints
Description
An OptimizationInequality object contains an inequality constraint in terms of
OptimizationVariable objects or OptimizationExpression objects. An inequality constraint
uses the comparison operator <= or >=.
A single statement can represent an array of inequalities. For example, you can express the
inequalities that each row of a matrix variable x sums to no more than one in this single statement:
constrsum = sum(x,2) <= 1
Creation
Create an inequality using optimization expressions with the comparison operator <= or >=.
Include inequalities in the Constraints property of an optimization problem by using dot notation.
prob = optimproblem;
x = optimvar('x',4,6);
SumLessThanOne = sum(x,2) <= 1;
prob.Constraints.SumLessThanOne = SumLessThanOne;
You can also create an empty optimization inequality by using optimineq or optimconstr.
Typically, you then set the inequalities in a loop. For an example, see “Create Inequalities in Loop” on
page 16-332. However, for the most efficient problem formulation, avoid setting inequalities in loops.
See “Create Efficient Optimization Problems” on page 10-28.
Properties
IndexNames — Index names
'' (default) | cell array of strings | cell array of character vectors
Index names, specified as a cell array of strings or character vectors. For information on using index
names, see “Named Index for Optimization Variables” on page 10-20.
Data Types: cell
Object Functions
infeasibility Constraint violation at a point
show Display information about optimization object
write Save optimization object description
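As a brief sketch of the infeasibility object function (the point values here are illustrative, not from the original text):

```matlab
% Measure how much a point violates an inequality (sketch)
x = optimvar('x',2);
con = sum(x) <= 1;
pt.x = [2;1];                  % hypothetical point; sum(pt.x) = 3
viol = infeasibility(con,pt)   % violation amount: 3 exceeds 1 by 2
```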
Examples
Create a 4-by-6 array of optimization variables.
x = optimvar('x',4,6);
Create the inequalities that each row of x sums to no more than one.
constrsum = sum(x,2) <= 1
constrsum =
4x1 Linear OptimizationInequality array with properties:
show(constrsum)
(1, 1)
  x(1, 1) + x(1, 2) + x(1, 3) + x(1, 4) + x(1, 5) + x(1, 6) <= 1
(2, 1)
  x(2, 1) + x(2, 2) + x(2, 3) + x(2, 4) + x(2, 5) + x(2, 6) <= 1
(3, 1)
  x(3, 1) + x(3, 2) + x(3, 3) + x(3, 4) + x(3, 5) + x(3, 6) <= 1
(4, 1)
  x(4, 1) + x(4, 2) + x(4, 3) + x(4, 4) + x(4, 5) + x(4, 6) <= 1
prob = optimproblem;
prob.Constraints.constrsum = constrsum
prob =
OptimizationProblem with properties:
Description: ''
ObjectiveSense: 'minimize'
Variables: [1x1 struct] containing 1 OptimizationVariable
Objective: [0x0 OptimizationExpression]
Constraints: [1x1 struct] containing 1 OptimizationConstraint
Create Inequalities in Loop
Create the constraint that a two-element variable x must lie in the intersection of a number of disks
whose centers and radii are in the arrays centers and radii.
x = optimvar('x',1,2);
centers = [1 -2;3 -4;-2 3];
radii = [6 7 8];
constr = optimineq(length(radii));
for i = 1:length(constr)
constr(i) = sum((x - centers(i,:)).^2) <= radii(i)^2;
end
show(constr)
where:
extraParams{1}:
1 -2
extraParams{2}:
3 -4
extraParams{3}:
-2 3
Instead of using a loop, you can create the same constraints by using matrix operations on the
variables.
constr2 = sum(([x;x;x] - centers).^2,2) <= radii'.^2;
Creating inequalities in a loop can be more time consuming than creating inequalities by using matrix
operations.
See Also
OptimizationConstraint | OptimizationEquality | OptimizationExpression |
OptimizationProblem | OptimizationVariable | infeasibility | optimconstr |
optimineq | show | write
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2019b
OptimizationProblem
Optimization problem
Description
An OptimizationProblem object describes an optimization problem, including variables for the
optimization, constraints, the objective function, and whether the objective is to be maximized or
minimized. Solve a complete problem using solve.
Creation
Create an OptimizationProblem object by using optimproblem.
Warning The problem-based approach does not support complex values in an objective function,
nonlinear equalities, or nonlinear inequalities. If a function calculation has a complex value, even as
an intermediate value, the final result can be incorrect.
Properties
Description — Problem label
'' (default) | string | character vector
Problem label, specified as a string or character vector. The software does not use Description. It
is an arbitrary label that you can use for any reason. For example, you can share, archive, or present
a model or problem, and store descriptive information about the model or problem in the
Description property.
Example: "Describes a traveling salesman problem"
Data Types: char | string
ObjectiveSense — Sense of optimization
'minimize' (default) | 'maximize'
Sense of optimization, specified as 'minimize' or 'maximize'. You can use the short name 'min'
for 'minimize' or 'max' for 'maximize'.
Example: 'maximize'
Data Types: char | string
To remove a constraint from a problem, set the corresponding Constraints property to an empty
array. For example,
prob.Constraints.TrayArea = [];
Object Functions
optimoptions Create optimization options
prob2struct Convert optimization problem or equation problem to solver form
show Display information about optimization object
solve Solve optimization problem or equation problem
varindex Map problem variables to solver-based variable index
write Save optimization object description
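As a sketch of how two of these object functions fit together (the problem here is illustrative, not from the original text):

```matlab
% Convert a problem to solver form and map problem variables (sketch)
x = optimvar('x',2);
prob = optimproblem('Objective',sum(x.^2));
problem = prob2struct(prob);   % structure suitable for solver-based functions
idx = varindex(prob);          % idx.x gives the solver-variable indices for x
```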
Examples
Create a linear programming problem for maximization. The problem has two positive variables and
three linear inequality constraints.
prob = optimproblem('ObjectiveSense','max');
x = optimvar('x',2,1,'LowerBound',0);
prob.Objective = x(1) + 2*x(2);
prob.Constraints.cons1 = x(1) + 5*x(2) <= 100;
prob.Constraints.cons2 = x(1) + x(2) <= 40;
prob.Constraints.cons3 = 2*x(1) + 0.5*x(2) <= 60;
show(prob)
OptimizationProblem :
Solve for:
x
maximize :
x(1) + 2*x(2)
subject to cons1:
x(1) + 5*x(2) <= 100
subject to cons2:
x(1) + x(2) <= 40
subject to cons3:
2*x(1) + 0.5*x(2) <= 60
variable bounds:
0 <= x(1)
0 <= x(2)
sol = solve(prob);
sol.x
ans = 2×1
25.0000
15.0000
See Also
OptimizationConstraint | OptimizationExpression | OptimizationVariable |
optimproblem | show | solve | write
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2017b
OptimizationVariable
Variable for optimization
Description
An OptimizationVariable object contains variables for optimization expressions. Use expressions
to represent an objective function, constraints, or equations. Variables are symbolic in nature, and
can be arrays of any size.
Creation
Create an OptimizationVariable object using optimvar.
Properties
Array-Wide Properties
Name — Variable label
character vector | string
Name gives the variable label to be displayed, such as in show or write. Name also gives the field
names in the solution structure that solve returns.
Tip To avoid confusion, set name to be the MATLAB variable name. For example,
metal = optimvar('metal')
Type — Variable type
'continuous' (default) | 'integer'
The variable type applies to all variables in the array. To have multiple variable types, create multiple
variables.
Tip To specify a binary variable, use the 'integer' type and specify LowerBound = 0 and
UpperBound = 1.
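Following the tip above, a minimal sketch of a binary variable (the name b is illustrative):

```matlab
% A 4-element binary variable via the 'integer' type with 0/1 bounds (sketch)
b = optimvar('b',4,'Type','integer','LowerBound',0,'UpperBound',1);
```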
IndexNames — Index names
'' (default) | cell array of strings | cell array of character vectors
Index names, specified as a cell array of strings or character vectors. For information on using index
names, see “Named Index for Optimization Variables” on page 10-20.
Data Types: cell
Element-wise Properties
LowerBound — Lower bounds
-Inf (default) | real scalar | real array
Lower bound, specified as a real scalar or as a real array having the same dimensions as the
OptimizationVariable object. Scalar values apply to all elements of the variable.
The LowerBound property is always displayed as an array. However, you can set the property as a
scalar that applies to all elements. For example,
var.LowerBound = 0
UpperBound — Upper bounds
Inf (default) | real scalar | real array
Upper bound, specified as a real scalar or as a real array having the same dimensions as the
OptimizationVariable object. Scalar values apply to all elements of the variable.
The UpperBound property is always displayed as an array. However, you can set the property as a
scalar that applies to all elements. For example
var.UpperBound = 1
Object Functions
show Display information about optimization object
showbounds Display variable bounds
write Save optimization object description
writebounds Save description of variable bounds
Examples
Create Scalar Optimization Variable
Create a scalar optimization variable named dollars.
dollars = optimvar('dollars')
dollars =
OptimizationVariable with properties:
Name: 'dollars'
Type: 'continuous'
IndexNames: {{} {}}
LowerBound: -Inf
UpperBound: Inf
Create a 3-by-1 vector of optimization variables.
x = optimvar('x',3)
x =
3x1 OptimizationVariable array with properties:
Array-wide properties:
Name: 'x'
Type: 'continuous'
IndexNames: {{} {}}
Elementwise properties:
LowerBound: [3x1 double]
UpperBound: [3x1 double]
Create an integer optimization variable vector named bolts that is indexed by the strings "brass",
"stainless", and "galvanized". Use the indices of bolts to create an optimization expression,
and experiment with creating bolts using character arrays or in a different orientation.
bnames = ["brass","stainless","galvanized"];
bolts = optimvar('bolts',bnames,'Type','integer')
bolts =
1x3 OptimizationVariable array with properties:
Array-wide properties:
Name: 'bolts'
Type: 'integer'
IndexNames: {{} {1x3 cell}}
Elementwise properties:
LowerBound: [-Inf -Inf -Inf]
UpperBound: [Inf Inf Inf]
y =
Linear OptimizationExpression
Use a cell array of character vectors instead of strings to get a variable with the same indices as
before.
bnames = {'brass','stainless','galvanized'};
bolts = optimvar('bolts',bnames,'Type','integer')
bolts =
1x3 OptimizationVariable array with properties:
Array-wide properties:
Name: 'bolts'
Type: 'integer'
IndexNames: {{} {1x3 cell}}
Elementwise properties:
LowerBound: [-Inf -Inf -Inf]
UpperBound: [Inf Inf Inf]
Use a column-oriented version of bnames, 3-by-1 instead of 1-by-3, and observe that bolts has that
orientation as well.
bnames = ["brass";"stainless";"galvanized"];
bolts = optimvar('bolts',bnames,'Type','integer')
bolts =
3x1 OptimizationVariable array with properties:
Array-wide properties:
Name: 'bolts'
Type: 'integer'
IndexNames: {{1x3 cell} {}}
Elementwise properties:
LowerBound: [3x1 double]
UpperBound: [3x1 double]
xarray = optimvar('xarray',3,4,2)
xarray =
3x4x2 OptimizationVariable array with properties:
Array-wide properties:
Name: 'xarray'
Type: 'continuous'
IndexNames: {{} {} {}}
Elementwise properties:
LowerBound: [3x4x2 double]
UpperBound: [3x4x2 double]
You can also create multidimensional variables indexed by a mixture of names and numeric indices.
For example, create a 3-by-4 array of optimization variables where the first dimension is indexed by
the strings 'brass', 'stainless', and 'galvanized', and the second dimension is numerically
indexed.
bnames = ["brass","stainless","galvanized"];
bolts = optimvar('bolts',bnames,4)
bolts =
3x4 OptimizationVariable array with properties:
Array-wide properties:
Name: 'bolts'
Type: 'continuous'
IndexNames: {{1x3 cell} {}}
Elementwise properties:
LowerBound: [3x4 double]
UpperBound: [3x4 double]
Create an optimization variable named x of size 3-by-3-by-3 that represents binary variables.
x = optimvar('x',3,3,3,'Type','integer','LowerBound',0,'UpperBound',1)
x =
3x3x3 OptimizationVariable array with properties:
Array-wide properties:
Name: 'x'
Type: 'integer'
IndexNames: {{} {} {}}
Elementwise properties:
LowerBound: [3x3x3 double]
UpperBound: [3x3x3 double]
More About
Arithmetic Operations
For the list of supported operations on optimization variables, see “Supported Operations on
Optimization Variables and Expressions” on page 10-36.
Tips
• OptimizationVariable objects have handle copy behavior. See “Handle Object Behavior”
(MATLAB) and “Comparison of Handle and Value Classes” (MATLAB). Handle copy behavior
means that a copy of an OptimizationVariable points to the original and does not have an
independent existence. For example, create a variable x, copy it to y, then set a property of y.
Note that x takes on the new property value.
x = optimvar('x','LowerBound',1);
y = x;
y.LowerBound = 0;
showbounds(x)
0 <= x
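Because of this handle copy behavior, a variable that should not share state must be created fresh with optimvar rather than copied. A sketch (not in the original text):

```matlab
% Create an independent variable rather than copying (sketch)
x = optimvar('x','LowerBound',1);
z = optimvar('z','LowerBound',1);  % independent variable, not a copy of x
z.LowerBound = 0;                  % changing z does not affect x
showbounds(x)                      % x keeps its original bound: 1 <= x
```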
See Also
OptimizationConstraint | OptimizationExpression | OptimizationProblem | optimvar |
show | showbounds | write | writebounds
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
“Supported Operations on Optimization Variables and Expressions” on page 10-36
Introduced in R2017b
optimoptions
Package: optim.problemdef
Create optimization options
Syntax
options = optimoptions(SolverName)
options = optimoptions(SolverName,Name,Value)
options = optimoptions(oldoptions,Name,Value)
options = optimoptions(SolverName,oldoptions)
options = optimoptions(prob)
options = optimoptions(prob,Name,Value)
Description
options = optimoptions(SolverName) returns a set of default options for the SolverName
solver.
options = optimoptions(SolverName,Name,Value) returns options with the specified
parameters set using one or more name-value pair arguments.
options = optimoptions(oldoptions,Name,Value) returns a copy of oldoptions with the
named parameters altered with the specified values.
options = optimoptions(SolverName,oldoptions) returns default options for the
SolverName solver, and copies the applicable options in oldoptions to options.
options = optimoptions(prob) returns a set of default options for the prob optimization
problem or equation problem.
options = optimoptions(prob,Name,Value) returns a set of default options for the prob
optimization problem or equation problem, with the named parameters altered with the specified
values.
Examples
options = optimoptions('fmincon')
options =
fmincon options:
Set properties:
No options set.
Default properties:
Algorithm: 'interior-point'
CheckGradients: 0
ConstraintTolerance: 1.0000e-06
Display: 'final'
FiniteDifferenceStepSize: 'sqrt(eps)'
FiniteDifferenceType: 'forward'
HessianApproximation: 'bfgs'
HessianFcn: []
HessianMultiplyFcn: []
HonorBounds: 1
MaxFunctionEvaluations: 3000
MaxIterations: 1000
ObjectiveLimit: -1.0000e+20
OptimalityTolerance: 1.0000e-06
OutputFcn: []
PlotFcn: []
ScaleProblem: 0
SpecifyConstraintGradient: 0
SpecifyObjectiveGradient: 0
StepTolerance: 1.0000e-10
SubproblemAlgorithm: 'factorization'
TypicalX: 'ones(numberOfVariables,1)'
UseParallel: 0
options = optimoptions(@fmincon,'Algorithm','sqp','MaxIterations',1500)
options =
fmincon options:
Set properties:
Algorithm: 'sqp'
MaxIterations: 1500
Default properties:
CheckGradients: 0
ConstraintTolerance: 1.0000e-06
Display: 'final'
FiniteDifferenceStepSize: 'sqrt(eps)'
FiniteDifferenceType: 'forward'
MaxFunctionEvaluations: '100*numberOfVariables'
ObjectiveLimit: -1.0000e+20
OptimalityTolerance: 1.0000e-06
OutputFcn: []
PlotFcn: []
ScaleProblem: 0
SpecifyConstraintGradient: 0
SpecifyObjectiveGradient: 0
StepTolerance: 1.0000e-06
TypicalX: 'ones(numberOfVariables,1)'
UseParallel: 0
Update Options
Create options for lsqnonlin with the Algorithm option set to 'levenberg-marquardt' and
MaxFunctionEvaluations set to 1500.
oldoptions = optimoptions(@lsqnonlin,'Algorithm','levenberg-marquardt',...
    'MaxFunctionEvaluations',1500)
oldoptions =
lsqnonlin options:
Set properties:
Algorithm: 'levenberg-marquardt'
MaxFunctionEvaluations: 1500
Default properties:
CheckGradients: 0
Display: 'final'
FiniteDifferenceStepSize: 'sqrt(eps)'
FiniteDifferenceType: 'forward'
FunctionTolerance: 1.0000e-06
MaxIterations: 400
OutputFcn: []
PlotFcn: []
SpecifyObjectiveGradient: 0
StepTolerance: 1.0000e-06
TypicalX: 'ones(numberOfVariables,1)'
UseParallel: 0
Update the options to set MaxFunctionEvaluations to 2000.
options = optimoptions(oldoptions,'MaxFunctionEvaluations',2000)
options =
lsqnonlin options:
Set properties:
Algorithm: 'levenberg-marquardt'
MaxFunctionEvaluations: 2000
Default properties:
CheckGradients: 0
Display: 'final'
FiniteDifferenceStepSize: 'sqrt(eps)'
FiniteDifferenceType: 'forward'
FunctionTolerance: 1.0000e-06
MaxIterations: 400
OutputFcn: []
PlotFcn: []
SpecifyObjectiveGradient: 0
StepTolerance: 1.0000e-06
TypicalX: 'ones(numberOfVariables,1)'
UseParallel: 0
options = optimoptions(@lsqnonlin,'Algorithm','levenberg-marquardt',...
'MaxFunctionEvaluations',1500)
options =
lsqnonlin options:
Set properties:
Algorithm: 'levenberg-marquardt'
MaxFunctionEvaluations: 1500
Default properties:
CheckGradients: 0
Display: 'final'
FiniteDifferenceStepSize: 'sqrt(eps)'
FiniteDifferenceType: 'forward'
FunctionTolerance: 1.0000e-06
MaxIterations: 400
OutputFcn: []
PlotFcn: []
SpecifyObjectiveGradient: 0
StepTolerance: 1.0000e-06
TypicalX: 'ones(numberOfVariables,1)'
UseParallel: 0
Alternatively, update the option using dot notation.
options.MaxFunctionEvaluations = 2000
options =
lsqnonlin options:
Set properties:
Algorithm: 'levenberg-marquardt'
MaxFunctionEvaluations: 2000
Default properties:
CheckGradients: 0
Display: 'final'
FiniteDifferenceStepSize: 'sqrt(eps)'
FiniteDifferenceType: 'forward'
FunctionTolerance: 1.0000e-06
MaxIterations: 400
OutputFcn: []
PlotFcn: []
SpecifyObjectiveGradient: 0
StepTolerance: 1.0000e-06
TypicalX: 'ones(numberOfVariables,1)'
UseParallel: 0
Transfer nondefault options for the fmincon solver to options for the fminunc solver.
oldoptions = optimoptions(@fmincon,'Algorithm','sqp','MaxIterations',1500)
oldoptions =
fmincon options:
Set properties:
Algorithm: 'sqp'
MaxIterations: 1500
Default properties:
CheckGradients: 0
ConstraintTolerance: 1.0000e-06
Display: 'final'
FiniteDifferenceStepSize: 'sqrt(eps)'
FiniteDifferenceType: 'forward'
MaxFunctionEvaluations: '100*numberOfVariables'
ObjectiveLimit: -1.0000e+20
OptimalityTolerance: 1.0000e-06
OutputFcn: []
PlotFcn: []
ScaleProblem: 0
SpecifyConstraintGradient: 0
SpecifyObjectiveGradient: 0
StepTolerance: 1.0000e-06
TypicalX: 'ones(numberOfVariables,1)'
UseParallel: 0
options = optimoptions(@fminunc,oldoptions)
options =
fminunc options:
Set properties:
CheckGradients: 0
FiniteDifferenceType: 'forward'
MaxIterations: 1500
OptimalityTolerance: 1.0000e-06
PlotFcn: []
SpecifyObjectiveGradient: 0
StepTolerance: 1.0000e-06
Default properties:
Algorithm: 'quasi-newton'
Display: 'final'
FiniteDifferenceStepSize: 'sqrt(eps)'
MaxFunctionEvaluations: '100*numberOfVariables'
ObjectiveLimit: -1.0000e+20
OutputFcn: []
TypicalX: 'ones(numberOfVariables,1)'
UseParallel: 0
Create an optimization problem and find the default solver and options.
rng default
x = optimvar('x',3,'LowerBound',0);
expr = x'*(eye(3) + randn(3))*x - randn(1,3)*x;
prob = optimproblem('Objective',expr);
options = optimoptions(prob)
options =
quadprog options:
Set properties:
No options set.
Default properties:
Algorithm: 'interior-point-convex'
ConstraintTolerance: 1.0000e-08
Display: 'final'
LinearSolver: 'auto'
MaxIterations: 200
OptimalityTolerance: 1.0000e-08
StepTolerance: 1.0000e-12
options.Display = 'iter';
sol = solve(prob,'Options',options);
sol.x
ans = 3×1
1.6035
0.0000
0.8029
Input Arguments
SolverName — Solver name
character vector | string | function handle
Solver name, specified as a character vector, string, or function handle.
Example: 'fmincon'
Example: @fmincon
oldoptions — Options
options created using optimoptions
Options, specified as an options object. The optimoptions function creates options objects.
Example: oldoptions = optimoptions(@fminunc)
prob — Problem object
OptimizationProblem object | EquationProblem object
Problem object, specified as an OptimizationProblem object or an EquationProblem object.
The syntaxes using prob enable you to see what the default solver is for your problem and to modify
the algorithm or other options.
Example: prob = optimproblem('Objective',myobj), where myobj is an optimization
expression
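A minimal sketch of the prob syntax, which returns options for the problem's default solver (the problem here is illustrative; for this quadratic objective the default solver is quadprog):

```matlab
% Get default options for a problem's default solver, with one change (sketch)
x = optimvar('x',2);
prob = optimproblem('Objective',sum(x.^2));
opts = optimoptions(prob,'Display','iter');   % quadprog options for this prob
```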
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and
Value is the corresponding value. Name must appear inside quotes. You can specify several name and
value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: optimoptions(@fmincon,'Display','iter','FunctionTolerance',1e-10) sets
fmincon options to have iterative display, and to have a FunctionTolerance of 1e-10.
For relevant name-value pairs, consult the options table for your solver:
• fgoalattain options
• fmincon options
• fminimax options
• fminunc options
• fseminf options on page 16-146
• fsolve options
• ga options (in Global Optimization Toolbox)
• gamultiobj options (in Global Optimization Toolbox)
• intlinprog options
• linprog options
• lsqcurvefit options
• lsqlin options
• lsqnonlin options
• paretosearch options (in Global Optimization Toolbox)
• particleswarm options (in Global Optimization Toolbox)
• patternsearch options (in Global Optimization Toolbox)
• quadprog options on page 16-390
• simulannealbnd options (in Global Optimization Toolbox)
• surrogateopt options (in Global Optimization Toolbox)
Output Arguments
options — Options object
options object
Optimization options, returned as an options object. Pass options to a solver or to the solve
function.
Alternative Functionality
App
You can set and modify options using the Optimization app (optimtool). See “Importing and
Exporting Your Work” on page 5-8.
Note The Optimization app warns that it will be removed in a future release.
See Also
OptimizationProblem | optimset | optimtool | resetoptions
Topics
“Set Options”
“Optimization App” on page 5-2
Introduced in R2013a
optimproblem
Create optimization problem
Syntax
prob = optimproblem
prob = optimproblem(Name,Value)
Description
prob = optimproblem creates an optimization problem with default properties.
prob = optimproblem(Name,Value) sets properties using one or more name-value pair
arguments. For example, prob = optimproblem('ObjectiveSense','max') creates a
maximization problem.
Examples
Create an optimization problem with default properties.
prob = optimproblem
prob =
OptimizationProblem with properties:
Description: ''
ObjectiveSense: 'minimize'
Variables: [0x0 struct] containing 0 OptimizationVariables
Objective: [0x0 OptimizationExpression]
Constraints: [0x0 struct] containing 0 OptimizationConstraints
Review the problem using show.
show(prob)
OptimizationProblem :
No problem defined.
Create a linear programming problem for maximization. The problem has two positive variables and
three linear inequality constraints.
prob = optimproblem('ObjectiveSense','max');
x = optimvar('x',2,1,'LowerBound',0);
prob.Objective = x(1) + 2*x(2);
prob.Constraints.cons1 = x(1) + 5*x(2) <= 100;
prob.Constraints.cons2 = x(1) + x(2) <= 40;
prob.Constraints.cons3 = 2*x(1) + 0.5*x(2) <= 60;
show(prob)
OptimizationProblem :
Solve for:
x
maximize :
x(1) + 2*x(2)
subject to cons1:
x(1) + 5*x(2) <= 100
subject to cons2:
x(1) + x(2) <= 40
subject to cons3:
2*x(1) + 0.5*x(2) <= 60
variable bounds:
0 <= x(1)
0 <= x(2)
sol = solve(prob);
sol.x
ans = 2×1
25.0000
15.0000
Input Arguments
Name-Value Pair Arguments
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and
Value is the corresponding value. Name must appear inside quotes. You can specify several name and
value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Description — Problem label
'' (default) | string | character vector
Problem label, specified as a string or character vector. The software does not use Description for
computation. Description is an arbitrary label that you can use for any reason. For example, you
can share, archive, or present a model or problem, and store descriptive information about the model
or problem in Description.
Example: "An iterative approach to the Traveling Salesman problem"
Data Types: char | string
ObjectiveSense — Sense of optimization
'minimize' (default) | 'maximize'
Sense of optimization, specified as 'minimize' or 'maximize'. You can also specify 'min' to
obtain 'minimize' or 'max' to obtain 'maximize'. The solve function minimizes the objective
when ObjectiveSense is 'minimize' and maximizes the objective when ObjectiveSense is
'maximize'.
Example: prob = optimproblem('ObjectiveSense','max')
Data Types: char | string
Output Arguments
prob — Optimization problem
OptimizationProblem object
Optimization problem, returned as an OptimizationProblem object.
Warning The problem-based approach does not support complex values in an objective function,
nonlinear equalities, or nonlinear inequalities. If a function calculation has a complex value, even as
an intermediate value, the final result can be incorrect.
See Also
OptimizationProblem | optimvar | solve
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2017b
optimset
Create or edit optimization options structure
Syntax
options = optimset('param1',value1,'param2',value2,...)
optimset
options = optimset
options = optimset(optimfun)
options = optimset(oldopts,'param1',value1,...)
options = optimset(oldopts,newopts)
Description
Note optimoptions is recommended instead of optimset for all solvers except fzero, fminbnd,
fminsearch, and lsqnonneg.
options = optimset('param1',value1,'param2',value2,...) creates an options
structure options in which the named parameters have the specified values. Any unspecified
parameters are set to []. It is sufficient to type only enough leading characters to define the
parameter name uniquely; case is ignored for parameter names.
optimset with no input or output arguments displays a complete list of options with their valid
values.
options = optimset (with no input arguments) creates an options structure options where all
fields are set to [].
options = optimset(optimfun) creates an options structure options with all option names and
default values relevant to the optimization function optimfun.
options = optimset(oldopts,'param1',value1,...) creates a copy of oldopts, modifying
the specified parameters with the specified values.
options = optimset(oldopts,newopts) combines an existing options structure oldopts with
a new options structure newopts. Any parameters in newopts with nonempty values overwrite the
corresponding parameters in oldopts.
Options
For more information about individual options, including their default values, see the reference pages
for the optimization functions. “Optimization Options Reference” on page 15-6 provides descriptions
of optimization options and which functions use them. optimset uses different names for some
options than optimoptions. See “Current and Legacy Option Name Tables” on page 15-21.
Use the command optimset(@solver) or the equivalent optimset solver to see the default
values of relevant optimization options for a solver. Some options do not have a fixed default value, because the default depends on the algorithm. For example, the default value of the MaxIterations option in the fmincon solver is 400 for the trust-region-reflective algorithm, but is 1000 for the interior-point algorithm.
You can also see the default values of all relevant options in the Optimization app. To see the options:
1 Start the Optimization app, e.g., with the optimtool command.
2 Choose the solver from the Solver menu.
3 Choose the algorithm, if applicable, from the Algorithm menu.
4 Read off the default values within the Options pane.
Examples
This statement creates an optimization options structure called options in which the Display
option is set to 'iter' and the TolX option is set to 1e-8.
options = optimset('Display','iter','TolX',1e-8)
This statement makes a copy of the options structure options, changes the value of the TolX option, and stores the result in optnew.
optnew = optimset(options,'TolX',1e-4);
This statement returns an optimization options structure options that contains all the option names
and default values relevant to the function fminbnd.
options = optimset('fminbnd')
If you want to see only the default values for fminbnd, type
optimset fminbnd
or equivalently
optimset('fminbnd')
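Outside MATLAB, the copy-and-override behavior of optimset(oldopts,'param',value) can be sketched as a plain dictionary merge. This Python sketch assumes only the "later settings win" rule; the real optimset also validates option names and values:

```python
# Sketch of optimset-style option merging: start from a base set of
# options, then let later name-value settings override earlier ones.
def make_options(base=None, **overrides):
    """Return a new options dict; `base` is copied, not modified."""
    options = dict(base) if base else {}
    options.update(overrides)
    return options

options = make_options(Display='iter', TolX=1e-8)
optnew = make_options(options, TolX=1e-4)   # copy with one change

print(options['TolX'])   # the original structure is untouched: 1e-08
print(optnew['TolX'])    # 0.0001
```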
Extended Capabilities
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
• Code generation does not support the syntax that has no input or output arguments:
optimset
• Functions specified in the options must be supported for code generation.
• The input argument optimfun must be a function that is supported for code generation.
• The fields of the options structure oldopts must be fixed-size fields.
• Code generation ignores the Display option.
• Code generation does not support the additional options in an options structure created by the
Optimization Toolbox optimset function. If an input options structure includes the additional
Optimization Toolbox options, then the output structure does not include them.
See Also
optimget | optimoptions | optimtool
optimtool
Select solver and optimization options, run problems
Syntax
optimtool
optimtool(optstruct)
optimtool('solver')
Description
Note The Optimization app warns that it will be removed in a future release. For alternatives, see
“Optimization App Alternatives” on page 5-12.
optimtool opens the Optimization app. Use the Optimization app to select a solver, optimization
options, and run problems. See “Optimization App” on page 5-2 for a complete description of the
Optimization app.
The Optimization app can be used to run any Optimization Toolbox solver except intlinprog, and
any Global Optimization Toolbox solver except GlobalSearch and MultiStart. Results can be
exported to a file or to the MATLAB workspace as a structure.
optimtool(optstruct) starts the Optimization app and loads optstruct. optstruct can be
either optimization options or an optimization problem structure. Create optimization options with
the optimoptions or optimset function, or by using the export option from the Optimization app.
Create a problem structure by exporting the problem from the Optimization app to the MATLAB
workspace. If you have Global Optimization Toolbox, you can create a problem structure for
fmincon, fminunc, lsqnonlin, or lsqcurvefit using the createOptimProblem function.
optimtool('solver') starts the Optimization app with the specified solver, identified as a
character vector, and the corresponding default options and problem fields. All Optimization Toolbox
and Global Optimization Toolbox solvers are valid inputs to the optimtool function, except for
intlinprog, GlobalSearch, and MultiStart.
See Also
optimoptions | optimset
Topics
“Optimization App” on page 5-2
“Solve a Constrained Nonlinear Problem, Solver-Based” on page 1-11
Introduced in R2006b
optimvar
Create optimization variables
Syntax
x = optimvar(name)
x = optimvar(name,n)
x = optimvar(name,cstr)
x = optimvar(name,cstr1,n2,...,cstrk)
x = optimvar(name,{cstr1,cstr2,...,cstrk})
x = optimvar(name,[n1,n2,...,nk])
x = optimvar( ___ ,Name,Value)
Description
x = optimvar(name) creates a scalar optimization variable. An optimization variable is a symbolic
object that enables you to create expressions for the objective function and the problem constraints
in terms of the variable.
Tip To avoid confusion, set name to be the MATLAB variable name. For example,
metal = optimvar('metal')
x = optimvar(name,n) creates an n-by-1 vector of optimization variables.
x = optimvar(name,cstr) creates a vector of optimization variables that can use cstr for
indexing. The number of elements of x is the same as the length of the cstr vector. The orientation
of x is the same as the orientation of cstr: x is a row vector when cstr is a row vector, and x is a
column vector when cstr is a column vector.
x = optimvar(name,cstr1,n2,...,cstrk) or x = optimvar(name,{cstr1,cstr2,...,
cstrk}) or x = optimvar(name,[n1,n2,...,nk]), for any combination of positive integers nj
and index names cstrj, creates an array of optimization variables with dimensions equal to the
integers nj and the lengths of the entries cstrj.
x = optimvar( ___ ,Name,Value), for any previous syntax, uses additional options specified by
one or more Name,Value pair arguments. For example, to specify an integer variable, use x =
optimvar('x','Type','integer').
Examples
Create a scalar optimization variable.
dollars = optimvar('dollars')
dollars =
OptimizationVariable with properties:
Name: 'dollars'
Type: 'continuous'
IndexNames: {{} {}}
LowerBound: -Inf
UpperBound: Inf
Create a 3-by-1 vector of optimization variables.
x = optimvar('x',3)
x =
3x1 OptimizationVariable array with properties:
Array-wide properties:
Name: 'x'
Type: 'continuous'
IndexNames: {{} {}}
Elementwise properties:
LowerBound: [3x1 double]
UpperBound: [3x1 double]
Create an integer optimization variable vector named bolts that is indexed by the strings "brass",
"stainless", and "galvanized". Use the indices of bolts to create an optimization expression,
and experiment with creating bolts using character arrays or in a different orientation.
bnames = ["brass","stainless","galvanized"];
bolts = optimvar('bolts',bnames,'Type','integer')
bolts =
1x3 OptimizationVariable array with properties:
Array-wide properties:
Name: 'bolts'
Type: 'integer'
IndexNames: {{} {1x3 cell}}
Elementwise properties:
LowerBound: [-Inf -Inf -Inf]
UpperBound: [Inf Inf Inf]
Create an optimization expression using the named indices.

y = bolts("brass") + 2*bolts("stainless") + 4*bolts("galvanized")

y = 
  Linear OptimizationExpression
Use a cell array of character vectors instead of strings to get a variable with the same indices as
before.
bnames = {'brass','stainless','galvanized'};
bolts = optimvar('bolts',bnames,'Type','integer')
bolts =
1x3 OptimizationVariable array with properties:
Array-wide properties:
Name: 'bolts'
Type: 'integer'
IndexNames: {{} {1x3 cell}}
Elementwise properties:
LowerBound: [-Inf -Inf -Inf]
UpperBound: [Inf Inf Inf]
Use a column-oriented version of bnames, 3-by-1 instead of 1-by-3, and observe that bolts has that
orientation as well.
bnames = ["brass";"stainless";"galvanized"];
bolts = optimvar('bolts',bnames,'Type','integer')
bolts =
3x1 OptimizationVariable array with properties:
Array-wide properties:
Name: 'bolts'
Type: 'integer'
IndexNames: {{1x3 cell} {}}
Elementwise properties:
LowerBound: [3x1 double]
UpperBound: [3x1 double]

Create a 3-by-4-by-2 array of optimization variables named xarray.
xarray = optimvar('xarray',3,4,2)
xarray =
3x4x2 OptimizationVariable array with properties:
Array-wide properties:
Name: 'xarray'
Type: 'continuous'
IndexNames: {{} {} {}}
Elementwise properties:
LowerBound: [3x4x2 double]
UpperBound: [3x4x2 double]
You can also create multidimensional variables indexed by a mixture of names and numeric indices.
For example, create a 3-by-4 array of optimization variables where the first dimension is indexed by
the strings 'brass', 'stainless', and 'galvanized', and the second dimension is numerically
indexed.
bnames = ["brass","stainless","galvanized"];
bolts = optimvar('bolts',bnames,4)
bolts =
3x4 OptimizationVariable array with properties:
Array-wide properties:
Name: 'bolts'
Type: 'continuous'
IndexNames: {{1x3 cell} {}}
Elementwise properties:
LowerBound: [3x4 double]
UpperBound: [3x4 double]
Create an optimization variable named x of size 3-by-3-by-3 that represents binary variables.
x = optimvar('x',3,3,3,'Type','integer','LowerBound',0,'UpperBound',1)
x =
3x3x3 OptimizationVariable array with properties:
Array-wide properties:
Name: 'x'
Type: 'integer'
IndexNames: {{} {} {}}
Elementwise properties:
LowerBound: [3x3x3 double]
UpperBound: [3x3x3 double]
Input Arguments
name — Variable name
character vector | string
Tip To avoid confusion about which name relates to which aspect of a variable, set the workspace
variable name to the variable name. For example,
truck = optimvar('truck');
Example: "Warehouse"
Example: 'truck'
Data Types: char | string
n — Variable dimension
positive integer
Example: x = optimvar('x',3)
Data Types: double

cstr — Index names
string vector | cell array of character vectors

Example: x = optimvar('x',{'Warehouse','Truck','City'})
Data Types: string | cell
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and
Value is the corresponding value. Name must appear inside quotes. You can specify several name and
value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: Create x as a 3-element nonnegative vector with x(2) <= 2 and x(3) <= 4 by the
command x = optimvar('x',3,'LowerBound',0,'UpperBound',[Inf,2,4])
Type — Variable type
'continuous' (default) | 'integer'

Variable type, specified as 'continuous' or 'integer'. The variable type applies to all variables in the array. To have multiple variable types, create multiple variables.

Tip To specify binary variables, use the 'integer' type with LowerBound equal to 0 and
UpperBound equal to 1.

Example: 'integer'
LowerBound — Lower bounds
-Inf (default) | real scalar | real array

Lower bounds, specified as an array of the same size as x or as a real scalar. If LowerBound is a
scalar, the value applies to all elements of x.
Example: To set a lower bound of 0 to all elements of x, specify the scalar value 0.
Data Types: double
UpperBound — Upper bounds
Inf (default) | real scalar | real array

Upper bounds, specified as an array of the same size as x or as a real scalar. If UpperBound is a
scalar, the value applies to all elements of x.
Example: To set an upper bound of 2 to all elements of x, specify the scalar value 2.
Data Types: double
Output Arguments
x — Optimization variable
OptimizationVariable array
Tips
• OptimizationVariable objects have handle copy behavior. See “Handle Object Behavior”
(MATLAB) and “Comparison of Handle and Value Classes” (MATLAB). Handle copy behavior
means that a copy of an OptimizationVariable points to the original and does not have an
independent existence. For example, create a variable x, copy it to y, then set a property of y.
Note that x takes on the new property value.
x = optimvar('x','LowerBound',1);
y = x;
y.LowerBound = 0;
showbounds(x)
0 <= x
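This handle-copy behavior is ordinary reference semantics for mutable objects. The Python sketch below uses a hypothetical Var class as a stand-in for OptimizationVariable, only to show the aliasing:

```python
# Handle-copy behavior: `y = x` copies the reference, not the object.
# Var is a hypothetical stand-in for OptimizationVariable.
class Var:
    def __init__(self, name, lower_bound):
        self.name = name
        self.lower_bound = lower_bound

x = Var('x', lower_bound=1)
y = x                   # handle copy: y points at the same object as x
y.lower_bound = 0       # modifying through y ...
print(x.lower_bound)    # ... is visible through x: prints 0
```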
See Also
OptimizationVariable
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
“Named Index for Optimization Variables” on page 10-20
Introduced in R2017b
prob2struct
Package: optim.problemdef
Convert optimization problem or equation problem to solver form
Syntax
problem = prob2struct(prob)
problem = prob2struct(prob,x0)
problem = prob2struct( ___ ,Name,Value)
Description
problem = prob2struct(prob) returns an optimization problem structure suitable for a solver-
based solution. For nonlinear problems, prob2struct creates files for the objective function, and, if
necessary, for nonlinear constraint functions and supporting files.
problem = prob2struct(prob,x0) also converts the initial point structure x0 and includes it in
problem.
problem = prob2struct( ___ ,Name,Value), for any input arguments, specifies additional
options using one or more name-value pair arguments. For example, for a nonlinear optimization
problem, problem = prob2struct(prob,'ObjectiveFunctionName','objfun1') specifies
that prob2struct creates an objective function file named objfun1.m in the current folder.
Examples
Input the basic MILP problem from “Mixed-Integer Linear Programming Basics: Problem-Based” on
page 10-40.
ingots = optimvar('ingots',4,1,'Type','integer','LowerBound',0,'UpperBound',1);
alloys = optimvar('alloys',4,1,'LowerBound',0);
weightIngots = [5,3,4,6];
costIngots = weightIngots.*[350,330,310,280];
costAlloys = [500,450,400,100];
cost = costIngots*ingots + costAlloys*alloys;
steelprob = optimproblem;
steelprob.Objective = cost;
carbonIngots = [5,4,5,3]/100;
molybIngots = [3,3,4,4]/100;
carbonAlloys = [8,7,6,3]/100;
molybAlloys = [6,7,8,9]/100;
totalCarbon = (carbonIngots.*weightIngots)*ingots + carbonAlloys*alloys;
totalMolyb = (molybIngots.*weightIngots)*ingots + molybAlloys*alloys;
totalWeight = weightIngots*ingots + sum(alloys);
steelprob.Constraints.conswt = totalWeight == 25;
steelprob.Constraints.conscarb = totalCarbon == 1.25;
steelprob.Constraints.consmolyb = totalMolyb == 1.25;

Convert the problem to a problem structure.

problem = prob2struct(steelprob);

Examine the resulting linear equality constraint matrix.

Aeq = problem.Aeq
Aeq =
(1,1) 1.0000
(2,1) 0.0800
(3,1) 0.0600
(1,2) 1.0000
(2,2) 0.0700
(3,2) 0.0700
(1,3) 1.0000
(2,3) 0.0600
(3,3) 0.0800
(1,4) 1.0000
(2,4) 0.0300
(3,4) 0.0900
(1,5) 5.0000
(2,5) 0.2500
(3,5) 0.1500
(1,6) 3.0000
(2,6) 0.1200
(3,6) 0.0900
(1,7) 4.0000
(2,7) 0.2000
(3,7) 0.1600
(1,8) 6.0000
(2,8) 0.1800
(3,8) 0.2400
beq = problem.beq
beq = 3×1
25.0000
1.2500
1.2500
problem.lb
ans = 8×1
0
0
0
0
0
0
0
0
problem.ub
ans = 8×1
Inf
Inf
Inf
Inf
1
1
1
1
Solve the problem by calling intlinprog, the solver that the problem structure specifies.

x = intlinprog(problem)

Intlinprog stopped at the root node because the objective value is within a gap
tolerance of the optimal value, options.AbsoluteGapTolerance = 0 (the default
value). The intcon variables are integer within tolerance,
options.IntegerTolerance = 1e-05 (the default value).
x = 8×1
7.2500
0
0.2500
3.5000
1.0000
1.0000
0
1.0000
Create a nonlinear optimization problem in the problem-based framework, along with an initial point structure x0.
x = optimvar('x',2);
fun = 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
prob = optimproblem('Objective',fun);
mycon = dot(x,x) <= 4;
prob.Constraints.mycon = mycon;
x0.x = [-1;1.5];
Convert prob to an optimization problem structure. Name the generated objective function file
'rosenbrock' and the constraint function file 'circle2'.
problem = prob2struct(prob,x0,'ObjectiveFunctionName','rosenbrock',...
'ConstraintFunctionName','circle2');
prob2struct creates nonlinear objective and constraint function files in the current folder. To create
these files in a different folder, use the 'FileLocation' name-value pair.
[x,fval] = fmincon(problem)
x = 2×1
1.0000
1.0000
fval = 3.6035e-11
Input Arguments
prob — Optimization problem or equation problem
OptimizationProblem object | EquationProblem object

Optimization problem or equation problem, specified as an OptimizationProblem object or an
EquationProblem object.
Warning The problem-based approach does not support complex values in an objective function,
nonlinear equalities, or nonlinear inequalities. If a function calculation has a complex value, even as
an intermediate value, the final result can be incorrect.
x0 — Initial point
structure
Initial point, specified as a structure with field names equal to the variable names in prob.
For an example using x0 with named index variables, see “Create Initial Point for Optimization with
Named Index Variables” on page 10-43.
Example: If prob has variables named x and y: x0.x = [3,2,17]; x0.y = [pi/3,2*pi/3].
Data Types: struct
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and
Value is the corresponding value. Name must appear inside quotes. You can specify several name and
value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: problem = prob2struct(prob,'FileLocation','C:\Documents\myproblem')
Name of the nonlinear constraint function file created by prob2struct for an optimization problem,
specified as the comma-separated pair consisting of 'ConstraintFunctionName' and a file name.
This argument applies to fmincon or fminunc problems; see problem. Do not include the file
extension .m in the file name. prob2struct appends the file extension when it creates the file.
Name of the nonlinear equation function file created by prob2struct for an equation problem,
specified as the comma-separated pair consisting of 'EquationFunctionName' and a file name.
This argument applies to fsolve, fzero, or lsqnonlin equations; see problem. Do not include the
file extension .m in the file name. prob2struct appends the file extension when it creates the file.
Location for generated files (objective function, constraint function, and other subfunction files),
specified as the comma-separated pair consisting of 'FileLocation' and a path to a writable
folder. All the generated files are stored in this folder; multiple folders are not supported.
Example: 'C:\Documents\MATLAB\myproject'
Data Types: char | string
Name of the objective function file created by prob2struct for an optimization problem, specified as
the comma-separated pair consisting of 'ObjectiveFunctionName' and a file name. This argument
applies to fmincon or fminunc problems; see problem. Do not include the file extension .m in the
file name. prob2struct appends the file extension when it creates the file.
Optimization options, specified as the comma-separated pair consisting of 'Options' and an options
object created by optimoptions. Create options for the appropriate solver; see problem.
Example: optimoptions('fmincon','PlotFcn','optimplotfval')
Output Arguments
problem — Problem structure
fmincon problem structure | fminunc problem structure | fsolve problem structure | intlinprog
problem structure | linprog problem structure | lsqlin problem structure | lsqnonlin problem
structure | quadprog problem structure
The following table gives the resulting problem type for optimization problems.
For example, a general nonlinear objective function with no constraints results in an fminunc
problem structure, and a general nonlinear objective function with at least one constraint of any
type results in an fmincon problem structure.
The resulting problem type for equation solving problems depends analogously on the types of the equations and any bounds.
Note For nonlinear problems, prob2struct creates function files for the objective and nonlinear
constraint functions. For objective and constraint functions that call supporting functions,
prob2struct also creates supporting function files and stores them in the FileLocation folder.
For linear and quadratic optimization problems, the problem structure includes an additional field,
f0, that represents an additive constant for the objective function. If you solve the problem structure
using the specified solver, the returned objective function value does not include the f0 value. If you
solve prob using the solve function, the returned objective function value includes the f0 value.
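The role of f0 can be illustrated numerically. This Python sketch uses made-up coefficients, not output from prob2struct; it only shows how the additive constant shifts the reported objective value:

```python
# Illustration of the additive constant f0 with made-up coefficients.
# A quadprog-style solver reports 1/2*x'*H*x + f'*x; solve() adds f0
# back to recover the original objective value.
def solver_objective(x, H, f):
    n = len(x)
    quad = sum(x[i] * H[i][j] * x[j] for i in range(n) for j in range(n))
    return 0.5 * quad + sum(f[i] * x[i] for i in range(n))

H = [[2.0]]        # objective x^2 - 4x + 7, so H = 2, f = -4, f0 = 7
f = [-4.0]
f0 = 7.0
x = [2.0]          # minimizer of x^2 - 4x

reported_by_solver = solver_objective(x, H, f)   # -4.0, without f0
reported_by_solve = reported_by_solver + f0      # 3.0, with f0
print(reported_by_solver, reported_by_solve)
```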
If the ObjectiveSense of prob is 'max' or 'maximize', then problem uses the negative of the
objective function in prob because solvers minimize. To maximize, they minimize the negative of the
original objective function. In this case, the reported optimal function value from the solver is the
negative of the value in the original problem. See “Maximizing an Objective” on page 2-30. You
cannot use lsqlin for a maximization problem.
Tips
• If you call prob2struct multiple times in the same MATLAB session for nonlinear problems, use
the ObjectiveFunctionName or EquationFunctionName argument and, if appropriate, the
ConstraintFunctionName argument to give the generated files distinct names. Otherwise,
prob2struct can overwrite the existing generated files.
Algorithms
The basis for the problem structure is an implicit ordering of all problem variables into a single
vector. The order of the problem variables is the same as the order of the Variables property in
prob. See OptimizationProblem. You can also find the order by using varindex.
For example, suppose that the problem variables are in this order:
• x — a 3-by-2-by-4 array
• y — a 3-by-2 array
In this case, the implicit variable order is the same as if the problem variables were placed in a
single vector vars = [x(:);y(:)].
The first 24 elements of vars are equivalent to x(:), and the next six elements are equivalent to
y(:), for a total of 30 elements. The lower and upper bounds correspond to this variable ordering,
and each linear constraint matrix has 30 columns.
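The column-major ordering can be checked with a short sketch. The index helper below is illustrative, not part of any toolbox; it computes the position of an element of x or y inside vars = [x(:); y(:)]:

```python
# Column-major (MATLAB-style) ordering of the problem variables
# vars = [x(:); y(:)], with x 3-by-2-by-4 and y 3-by-2.
def column_major_index(subs, dims):
    """0-based linear index of 0-based subscripts in an array of size dims."""
    index, stride = 0, 1
    for s, d in zip(subs, dims):
        index += s * stride
        stride *= d
    return index

n_x = 3 * 2 * 4                 # x(:) occupies positions 0..23 of vars
# MATLAB x(2,1,1) is 0-based subscripts (1,0,0):
print(column_major_index((1, 0, 0), (3, 2, 4)))      # 1
# y(1,2) comes after all 24 elements of x:
print(n_x + column_major_index((0, 1), (3, 2)))      # 27
```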
For problems with general nonlinear objective or constraint functions, prob2struct creates function
files in the current folder or in the folder specified by FileLocation. The returned problem
structure refers to these function files.
See Also
EquationProblem | OptimizationProblem | varindex
Topics
“Problem-Based Optimization Workflow” on page 10-2
“Include Derivatives in Problem-Based Workflow” on page 7-20
“Output Function for Problem-Based Optimization” on page 7-25
Introduced in R2017b
quadprog
Quadratic programming
Syntax
x = quadprog(H,f)
x = quadprog(H,f,A,b)
x = quadprog(H,f,A,b,Aeq,beq)
x = quadprog(H,f,A,b,Aeq,beq,lb,ub)
x = quadprog(H,f,A,b,Aeq,beq,lb,ub,x0)
x = quadprog(H,f,A,b,Aeq,beq,lb,ub,x0,options)
x = quadprog(problem)
[x,fval] = quadprog( ___ )
[x,fval,exitflag,output] = quadprog( ___ )
[x,fval,exitflag,output,lambda] = quadprog( ___ )
Description
Solver for quadratic objective functions with linear constraints. quadprog finds a minimum for a
problem specified by

    min_x (1/2)*x'*H*x + f'*x  such that  A*x ≤ b,
                                          Aeq*x = beq,
                                          lb ≤ x ≤ ub.

H, A, and Aeq are matrices, and f, b, beq, lb, ub, and x are vectors.
You can pass f, lb, and ub as vectors or matrices; see “Matrix Arguments” on page 2-31.
Note quadprog applies only to the solver-based approach. For a discussion of the two optimization
approaches, see “First Choose Problem-Based or Solver-Based Approach” on page 1-3.
x = quadprog(H,f) returns a vector x that minimizes 1/2*x'*H*x + f'*x. The input H must be
positive definite for the problem to have a finite minimum. If H is positive definite, then the solution
is x = H\(-f).
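The relation x = H\(-f) can be verified by hand on a 2-by-2 example. This Python sketch solves H*x = -f with Cramer's rule (fine for a 2-by-2 illustration) and checks that the gradient H*x + f vanishes at the unconstrained minimizer:

```python
# For positive definite H, the unconstrained minimizer of 1/2 x'Hx + f'x
# solves H x = -f. Check this on a 2-by-2 example.
H = [[1.0, -1.0],
     [-1.0, 2.0]]
f = [-2.0, -6.0]

# Solve H x = -f with Cramer's rule.
det = H[0][0] * H[1][1] - H[0][1] * H[1][0]          # det = 1
x = [(-f[0] * H[1][1] - H[0][1] * -f[1]) / det,
     (H[0][0] * -f[1] - -f[0] * H[1][0]) / det]
print(x)                                             # [10.0, 8.0]

# The gradient H x + f vanishes at the minimizer.
grad = [H[i][0] * x[0] + H[i][1] * x[1] + f[i] for i in range(2)]
print(grad)                                          # [0.0, 0.0]
```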
Note If the specified input bounds for a problem are inconsistent, the output x is x0 and the output
fval is [].
quadprog resets components of x0 that violate the bounds lb ≤ x ≤ ub to the interior of the box
defined by the bounds. quadprog does not change components that respect the bounds.
[x,fval] = quadprog( ___ ), for any input arguments, also returns fval, the value of the objective
function at x: fval = 0.5*x'*H*x + f'*x.
Examples
Find the minimum of

    f(x) = (1/2)*x1^2 + x2^2 − x1*x2 − 2*x1 − 6*x2

subject to the constraints

    x1 + x2 ≤ 2
    −x1 + 2*x2 ≤ 2
    2*x1 + x2 ≤ 3.

In quadprog syntax, this problem is to minimize

    f(x) = (1/2)*x'*H*x + f'*x,

where

    H = [1 −1; −1 2],  f = [−2; −6],

subject to the linear inequality constraints. Enter the problem coefficients:
H = [1 -1; -1 2];
f = [-2; -6];
A = [1 1; -1 2; 2 1];
b = [2; 2; 3];
Call quadprog.
[x,fval,exitflag,output,lambda] = ...
quadprog(H,f,A,b);
x,fval,exitflag
x = 2×1
0.6667
1.3333
fval = -8.2222
exitflag = 1
An exit flag of 1 means the result is a local minimum. Because H is a positive definite matrix, this
problem is convex, so the minimum is a global minimum.
eig(H)
ans = 2×1
0.3820
2.6180
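For a symmetric 2-by-2 matrix, the eigenvalues reported by eig(H) follow directly from the trace and determinant. This Python sketch reproduces the values above and confirms that both are positive:

```python
import math

# Eigenvalues of a symmetric 2-by-2 matrix from trace t and determinant d:
# lambda = (t +/- sqrt(t^2 - 4*d)) / 2. Both positive means H is positive
# definite, so the quadratic program is convex.
H = [[1.0, -1.0],
     [-1.0, 2.0]]
t = H[0][0] + H[1][1]                        # trace = 3
d = H[0][0] * H[1][1] - H[0][1] * H[1][0]    # determinant = 1
disc = math.sqrt(t * t - 4 * d)
eigs = [(t - disc) / 2, (t + disc) / 2]
print([round(e, 4) for e in eigs])           # [0.382, 2.618]
```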
Find the minimum of

    f(x) = (1/2)*x1^2 + x2^2 − x1*x2 − 2*x1 − 6*x2

subject to the constraint

    x1 + x2 = 0.

In quadprog syntax, this problem is to minimize f(x) = (1/2)*x'*H*x + f'*x, where
H = [1 −1; −1 2] and f = [−2; −6], subject to the linear equality constraint. Enter the problem
coefficients and solve:

H = [1 -1; -1 2];
f = [-2; -6];
Aeq = [1 1];
beq = 0;
[x,fval,exitflag,output,lambda] = quadprog(H,f,[],[],Aeq,beq);
x,fval,exitflag
x = 2×1
-0.8000
0.8000
fval = -1.6000
exitflag = 1
An exit flag of 1 means the result is a local minimum. Because H is a positive definite matrix, this
problem is convex, so the minimum is a global minimum.
eig(H)
ans = 2×1
0.3820
2.6180
Find the x that minimizes the quadratic expression

    (1/2)*x'*H*x + f'*x,

where

    H = [1 −1 1; −1 2 −2; 1 −2 4],  f = [2; −3; 1],

subject to the constraints 0 ≤ x ≤ 1 and sum(x) = 1/2. Enter the problem coefficients and solve:
H = [1,-1,1
-1,2,-2
1,-2,4];
f = [2;-3;1];
lb = zeros(3,1);
ub = ones(size(lb));
Aeq = ones(1,3);
beq = 1/2;
x = quadprog(H,f,[],[],Aeq,beq,lb,ub)
x = 3×1
0.0000
0.5000
0.0000
Set options to monitor the progress of quadprog.
options = optimoptions('quadprog','Display','iter');
H = [1 -1; -1 2];
f = [-2; -6];
A = [1 1; -1 2; 2 1];
b = [2; 2; 3];
To help write the quadprog function call, set the unnecessary inputs to [].
Aeq = [];
beq = [];
lb = [];
ub = [];
x0 = [];
x = quadprog(H,f,A,b,Aeq,beq,lb,ub,x0,options)
x = 2×1
0.6667
1.3333
Create a problem structure using a “Problem-Based Optimization Workflow” on page 10-2. Create an
optimization problem equivalent to “Quadratic Program with Linear Constraints” on page 16-378.
x = optimvar('x',2);
objec = x(1)^2/2 + x(2)^2 - x(1)*x(2) - 2*x(1) - 6*x(2);
prob = optimproblem('Objective',objec);
prob.Constraints.cons1 = sum(x) <= 2;
prob.Constraints.cons2 = -x(1) + 2*x(2) <= 2;
prob.Constraints.cons3 = 2*x(1) + x(2) <= 3;

Convert prob to a problem structure, and solve using quadprog.

problem = prob2struct(prob);
[x,fval] = quadprog(problem)

x = 2×1
0.6667
1.3333
fval = -8.2222
Solve a quadratic program and return both the solution and the objective function value.
H = [1,-1,1
-1,2,-2
1,-2,4];
f = [-7;-12;-15];
A = [1,1,1];
b = 3;
[x,fval] = quadprog(H,f,A,b)
x = 3×1
-3.5714
2.9286
3.6429
fval = -47.1786
Check that the returned objective function value matches the value computed from the quadprog
objective function definition.
fval2 = 1/2*x'*H*x + f'*x
fval2 = -47.1786
To see the optimization process for quadprog, set options to show an iterative display and return
four outputs. The problem is to minimize
    (1/2)*x'*H*x + f'*x

subject to

    0 ≤ x ≤ 1,

where

    H = [2 1 −1; 1 3 1/2; −1 1/2 5],  f = [4; −7; 12].

Enter the problem coefficients, set options to display iterative progress, and solve:

H = [2,1,-1;1,3,1/2;-1,1/2,5];
f = [4;-7;12];
lb = zeros(3,1);
ub = ones(3,1);
options = optimoptions('quadprog','Display','iter');
[x,fval,exitflag,output] = quadprog(H,f,[],[],[],[],lb,ub,[],options);
x,fval,exitflag
x = 3×1
0.0000
1.0000
0.0000
fval = -5.5000
exitflag = 1
Solve a quadratic program with a linear inequality constraint and lower bounds, and examine the Lagrange multipliers in the lambda structure.
H = [1,-1,1
-1,2,-2
1,-2,4];
f = [-7;-12;-15];
A = [1,1,1];
b = 3;
lb = zeros(3,1);
[x,fval,exitflag,output,lambda] = quadprog(H,f,A,b,[],[],lb);
disp(lambda)
ineqlin: 12.0000
eqlin: [0x1 double]
lower: [3x1 double]
upper: [3x1 double]
disp(lambda.lower)
5.0000
0.0000
0.0000
Only the first component of lambda.lower has a nonzero multiplier. This generally means that only
the first component of x is at the lower bound of zero. Confirm by displaying the components of x.
disp(x)
0.0000
1.5000
1.5000
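The link between nonzero multipliers and active bounds is complementary slackness: at a solution, lambda.lower(i) multiplied by the slack x(i) - lb(i) is zero for every component. This Python sketch checks that relation on the numbers shown above:

```python
# Complementary slackness for the lower bounds: at a solution,
# lambda.lower(i) * (x(i) - lb(i)) = 0 for every i, so a nonzero
# multiplier implies the component sits exactly on its bound.
x = [0.0, 1.5, 1.5]
lb = [0.0, 0.0, 0.0]
lam_lower = [5.0, 0.0, 0.0]   # values reported by quadprog above

slack_products = [l * (xi - li) for l, xi, li in zip(lam_lower, x, lb)]
print(slack_products)          # [0.0, 0.0, 0.0]
```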
Input Arguments
H — Quadratic objective term
symmetric real matrix
Quadratic objective term, specified as a symmetric real matrix. H represents the quadratic in the
expression 1/2*x'*H*x + f'*x. If H is not symmetric, quadprog issues a warning and uses the
symmetrized version (H + H')/2 instead.
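Symmetrizing H is safe because the quadratic form x'*H*x depends only on the symmetric part of H. A quick Python check with an arbitrary nonsymmetric matrix:

```python
# x'Hx depends only on the symmetric part (H + H')/2, which is why
# quadprog can symmetrize a nonsymmetric H without changing the problem.
H = [[2.0, 3.0],
     [1.0, 4.0]]                      # nonsymmetric
Hs = [[(H[i][j] + H[j][i]) / 2 for j in range(2)] for i in range(2)]

def quad_form(M, x):
    return sum(x[i] * M[i][j] * x[j] for i in range(2) for j in range(2))

x = [1.5, -2.0]
print(quad_form(H, x), quad_form(Hs, x))   # identical values
```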
If the quadratic matrix H is sparse, then by default, the 'interior-point-convex' algorithm uses
a slightly different algorithm than when H is dense. Generally, the sparse algorithm is faster on large,
sparse problems, and the dense algorithm is faster on dense or small problems. For more information,
see the LinearSolver option description and “interior-point-convex quadprog Algorithm” on page
11-2.
Example: [2,1;1,3]
Data Types: double
f — Linear objective term
real vector

Linear objective term, specified as a real vector. f represents the linear term in the expression
1/2*x'*H*x + f'*x.
Example: [1;3;2]
Data Types: double
A — Linear inequality constraints
real matrix

Linear inequality constraints, specified as a real matrix. A is an M-by-N matrix, where M is the number
of inequalities, and N is the number of variables (number of elements in x0). For large problems, pass
A as a sparse matrix.

A encodes the M linear inequalities

    A*x <= b,

where x is the column vector of N variables x(:), and b is a column vector with M elements.
For example, to specify

    x1 + 2*x2 ≤ 10
    3*x1 + 4*x2 ≤ 20
    5*x1 + 6*x2 ≤ 30,

enter these constraints:

A = [1,2;3,4;5,6];
b = [10;20;30];

Example: To specify that the x components sum to 1 or less, use A = ones(1,N) and b = 1.
Data Types: double
b — Linear inequality constraints
real vector

Linear inequality constraints, specified as a real vector. b is an M-element vector related to the A
matrix. If you pass b as a row vector, solvers internally convert b to the column vector b(:). For
large problems, pass b as a sparse vector.

b encodes the M linear inequalities

    A*x <= b,

where x is the column vector of N variables x(:), and A is a matrix of size M-by-N.

For example, to specify

    x1 + 2*x2 ≤ 10
    3*x1 + 4*x2 ≤ 20
    5*x1 + 6*x2 ≤ 30,

enter these constraints:

A = [1,2;3,4;5,6];
b = [10;20;30];

Example: To specify that the x components sum to 1 or less, use A = ones(1,N) and b = 1.
Data Types: double
Aeq — Linear equality constraints
real matrix

Linear equality constraints, specified as a real matrix. Aeq is an Me-by-N matrix, where Me is the
number of equalities, and N is the number of variables (number of elements in x0). For large
problems, pass Aeq as a sparse matrix.

Aeq encodes the Me linear equalities

    Aeq*x = beq,

where x is the column vector of N variables x(:), and beq is a column vector with Me elements.

For example, to specify

    x1 + 2*x2 + 3*x3 = 10
    2*x1 + 4*x2 + x3 = 20,

enter these constraints:

Aeq = [1,2,3;2,4,1];
beq = [10;20];

Example: To specify that the x components sum to 1, use Aeq = ones(1,N) and beq = 1.
Data Types: double
beq — Linear equality constraints
real vector

Linear equality constraints, specified as a real vector. beq is an Me-element vector related to the Aeq
matrix. If you pass beq as a row vector, solvers internally convert beq to the column vector beq(:).
For large problems, pass beq as a sparse vector.

beq encodes the Me linear equalities

    Aeq*x = beq,

where x is the column vector of N variables x(:), and Aeq is a matrix of size Me-by-N.

For example, to specify

    x1 + 2*x2 + 3*x3 = 10
    2*x1 + 4*x2 + x3 = 20,

enter these constraints:

Aeq = [1,2,3;2,4,1];
beq = [10;20];

Example: To specify that the x components sum to 1, use Aeq = ones(1,N) and beq = 1.
Data Types: double
lb — Lower bounds
real vector | real array
Lower bounds, specified as a real vector or real array. If the number of elements in x0 is equal to the
number of elements in lb, then lb specifies that x(i) >= lb(i) for all i. If numel(lb) <
numel(x0), then lb specifies that x(i) >= lb(i) for 1 <= i <= numel(lb).

Data Types: double
ub — Upper bounds
real vector | real array
Upper bounds, specified as a real vector or real array. If the number of elements in x0 is equal to the
number of elements in ub, then ub specifies that x(i) <= ub(i) for all i. If numel(ub) <
numel(x0), then ub specifies that x(i) <= ub(i) for 1 <= i <= numel(ub).

Data Types: double
x0 — Initial point
real vector
Initial point, specified as a real vector. The length of x0 is the number of rows or columns of H.
x0 applies to the 'trust-region-reflective' algorithm when the problem has only bound
constraints. x0 also applies to the 'active-set' algorithm.
If you do not specify x0, quadprog sets all components of x0 to a point in the interior of the box
defined by the bounds. quadprog ignores x0 for the 'interior-point-convex' algorithm and for
the 'trust-region-reflective' algorithm with equality constraints.
Example: [1;2;1]
Data Types: double
Some options are absent from the optimoptions display. These options appear in italics in the
following table. For details, see “View Options” on page 2-66.
All Algorithms

Algorithm Choose the algorithm:

• 'interior-point-convex' (default)
• 'trust-region-reflective'
• 'active-set'

MaxIterations Maximum number of iterations allowed; a positive integer.

For optimset, the option name is MaxIter. See “Current and Legacy
Option Name Tables” on page 15-21.
OptimalityTolerance Termination tolerance on the first-order optimality; a positive scalar.

For optimset, the option name is TolFun. See “Current and Legacy
Option Name Tables” on page 15-21.
StepTolerance Termination tolerance on x; a positive scalar.
For optimset, the option name is TolX. See “Current and Legacy
Option Name Tables” on page 15-21.
FunctionTolerance Termination tolerance on the function value; a positive scalar.

For optimset, the option name is TolFun. See “Current and Legacy
Option Name Tables” on page 15-21.
HessianMultiplyFcn Hessian multiply function, specified as a function handle. For large-
scale structured problems, this function computes the Hessian matrix
product H*Y without actually forming H. The function has the form
W = hmfun(Hinfo,Y)
ConstraintTolerance Tolerance on the constraint violation; a positive scalar.

For optimset, the option name is TolCon. See “Current and Legacy
Option Name Tables” on page 15-21.

LinearSolver Type of internal linear solver in the algorithm:

• 'auto' (default) — Use 'sparse' if the H matrix is sparse, and 'dense' otherwise.
• 'sparse' — Use sparse linear algebra.
• 'dense' — Use dense linear algebra.

This option applies to the 'interior-point-convex' algorithm.
ObjectiveLimit A tolerance (stopping criterion) that is a scalar. If the objective
function value goes below ObjectiveLimit and the current point is
feasible, the iterations halt because the problem is presumably
unbounded. The default value is -1e20.
problem — Problem structure
structure

Problem structure, specified as a structure. The required fields are H, f, solver, and options.
When solving, quadprog ignores any fields in problem other than those listed.
Data Types: struct
Output Arguments
x — Solution
real vector
Solution, returned as a real vector. x is the vector that minimizes 1/2*x'*H*x + f'*x subject to all
bounds and linear constraints. x can be a local minimum for nonconvex problems. For convex
problems, x is a global minimum. For more information, see “Local vs. Global Optima” on page 4-22.
fval — Objective function value at solution
real scalar

Objective function value at the solution, returned as a real scalar. fval is the value of
1/2*x'*H*x + f'*x at the solution x.
exitflag — Reason quadprog stopped
integer

Reason quadprog stopped, returned as an integer described in this table.

All Algorithms
1 Function converged to the solution x.
0 Number of iterations exceeded options.MaxIterations.
-2 Problem is infeasible. Or, for 'interior-point-convex', the step
size was smaller than options.StepTolerance, but constraints
were not satisfied.
-3 Problem is unbounded.
'interior-point-convex' Algorithm
2 Step size was smaller than options.StepTolerance, and constraints
were satisfied.
-6 Nonconvex problem detected.
-8 Unable to compute a step direction.
'trust-region-reflective' Algorithm
4 Local minimum found; minimum is not unique.
3 Change in the objective function value was smaller than
options.FunctionTolerance.
-4 Current search direction was not a direction of descent. No further
progress could be made.
'active-set' Algorithm
-6 Nonconvex problem detected; projection of H onto the nullspace of
Aeq is not positive semidefinite.
Note Occasionally, the 'active-set' algorithm halts with exit flag 0 when the problem is, in fact,
unbounded. Setting a higher iteration limit also results in exit flag 0.
output — Information about the optimization process
structure
Information about the optimization process, returned as a structure with these fields:
Algorithms
'interior-point-convex'
The 'interior-point-convex' algorithm attempts to follow a path that is strictly inside the
constraints. It uses a presolve module to remove redundancies and to simplify the problem by solving
for components that are straightforward.
The algorithm has different implementations for a sparse Hessian matrix H and for a dense matrix.
Generally, the sparse implementation is faster on large, sparse problems, and the dense
implementation is faster on dense or small problems. For more information, see “interior-point-
convex quadprog Algorithm” on page 11-2.
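As a sketch, storing H as a sparse matrix lets quadprog choose the sparse implementation; the problem data here is illustrative, not from the documentation:

```matlab
n = 1000;
H = speye(n);                          % sparse identity Hessian
f = -ones(n,1);
lb = zeros(n,1);
x = quadprog(H,f,[],[],[],[],lb,[]);   % sparse H selects the sparse implementation
```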
'trust-region-reflective'
The 'trust-region-reflective' algorithm is a subspace trust-region method based on the
interior-reflective Newton method described in [1]. For more information, see “trust-region-reflective
quadprog Algorithm”.
'active-set'
The 'active-set' algorithm is a projection method, similar to the one described in [2]. The
algorithm is not large-scale; see “Large-Scale vs. Medium-Scale Algorithms” on page 2-10. For more
information, see “active-set quadprog Algorithm” on page 11-11.
Alternative Functionality
App
You can use the Optimization app for quadratic programming. Enter optimtool at the MATLAB
command line, and choose the quadprog - Quadratic programming solver. For more
information, see “Optimization App” on page 5-2.
Problem-Based Approach
You can solve quadratic programming problems using the “Problem-Based Optimization Setup”. For
examples, see “Quadratic Programming”.
References
[1] Coleman, T. F., and Y. Li. “A Reflective Newton Method for Minimizing a Quadratic Function
Subject to Bounds on Some of the Variables.” SIAM Journal on Optimization. Vol. 6, Number
4, 1996, pp. 1040–1058.
[2] Gill, P. E., W. Murray, and M. H. Wright. Practical Optimization. London: Academic Press, 1981.
[3] Gould, N., and P. L. Toint. “Preprocessing for quadratic programming.” Mathematical
Programming. Series B, Vol. 100, 2004, pp. 95–132.
Extended Capabilities
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
• quadprog supports code generation using either the codegen function or the MATLAB Coder
app. You must have a MATLAB Coder license to generate code.
• The target hardware must support standard double-precision floating-point computations. You
cannot generate code for single-precision or fixed-point computations.
• quadprog does not support the problem argument for code generation.
options = optimoptions('quadprog','Algorithm','active-set');
[x,fval,exitflag] = quadprog(H,f,A,b,Aeq,beq,lb,ub,x0,options);
• Code generation supports these options:
• In generated code, change options only by using optimoptions, not dot notation:
opts = optimoptions('quadprog','Algorithm','active-set');
opts = optimoptions(opts,'MaxIterations',1e4); % Recommended
opts.MaxIterations = 1e4; % Not recommended
• Do not load options from a file. Doing so can cause code generation to fail. Instead, create options
in your code.
• If you specify an option that is not supported, the option is typically ignored during code
generation. For reliable results, specify only supported options.
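Putting these rules together, a minimal code generation workflow might look like the following sketch. The entry-point function name solveQP and the -args sizes are assumptions for illustration, not part of the documentation:

```matlab
% solveQP.m -- hypothetical entry-point function for code generation
function x = solveQP(H,f,A,b,x0)
% Create options inside the code; 'active-set' supports code generation
opts = optimoptions('quadprog','Algorithm','active-set');
x = quadprog(H,f,A,b,[],[],[],[],x0,opts);
end
```

Then generate code with, for example, codegen solveQP -args {zeros(2),zeros(2,1),zeros(1,2),0,zeros(2,1)}, adjusting the example sizes to match your problem.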
See Also
linprog | lsqlin | optimoptions | prob2struct
Topics
“Solver-Based Optimization Problem Setup”
“Optimization Results”
“Quadratic Programming”
“Mixed-Integer Quadratic Programming Portfolio Optimization: Solver-Based” on page 9-65
resetoptions
Reset options
Syntax
options2 = resetoptions(options,optionname)
options2 = resetoptions(options,multioptions)
Description
options2 = resetoptions(options,optionname) resets the specified option back to its default
value.
options2 = resetoptions(options,multioptions) resets multiple specified options back to
their default values.
Tip If you want only one set of options, use options as the output argument instead of options2.
Examples
Create options with some nondefault settings. Examine the MaxIterations setting.
options = optimoptions('fmincon','Algorithm','sqp','MaxIterations',2e4,...
'SpecifyObjectiveGradient',true);
options.MaxIterations
ans =
20000
options2 = resetoptions(options,'MaxIterations');
options2.MaxIterations
ans =
400
The default value of the MaxIterations option is 400 for the 'sqp' algorithm.
Create options with some nondefault settings. Examine the MaxIterations setting.
options = optimoptions('fmincon','Algorithm','sqp','MaxIterations',2e4,...
'SpecifyObjectiveGradient',true);
options.MaxIterations
ans =
20000
Reset the MaxIterations and Algorithm options to their default values. Examine the
MaxIterations setting.
multiopts = {'MaxIterations','Algorithm'};
options2 = resetoptions(options,multiopts);
options2.MaxIterations
ans =
1000
The default value of the MaxIterations option is 1000 for the default 'interior-point'
algorithm.
Input Arguments
options — Optimization options
object as created by optimoptions
optionname — Option name to reset
character vector
Option name, specified as a name in single quote marks. The allowable option names for each solver
are listed in the options section of the function reference page.
Example: 'Algorithm'
Data Types: char
Output Arguments
options2 — Optimization options
object as created by optimoptions
See Also
optimoptions
Topics
“Set Options”
Introduced in R2016a
show
Package: optim.problemdef
Syntax
show(obj)
Description
show(obj) displays information about obj at the command line. If the object display is large,
consider using write instead to save the information in a text file.
Examples
Examine the various stages of problem construction for optimizing the Rosenbrock function confined
to the unit disk (see “Solve a Constrained Nonlinear Problem, Problem-Based” on page 1-5).
x = optimvar('x',2);
show(x)
[ x(1) ]
[ x(2) ]
Create an optimization problem that has obj as the objective function and cons as the constraint.
Show the problem.
prob = optimproblem("Objective",obj,"Constraints",cons);
show(prob)
OptimizationProblem :
Solve for:
minimize :
((100 .* (x(2) - x(1).^2).^2) + (1 - x(1)).^2)
subject to :
(x(1).^2 + x(2).^2) <= 1
Finally, create an initial point [0 0] and solve the problem starting at the initial point.
x0.x = [0 0];
[sol,fval,exitflag] = solve(prob,x0)
fval = 0.0457
exitflag =
OptimalSolution
sol.x
ans = 2×1
0.7864
0.6177
Input Arguments
obj — Optimization object
OptimizationProblem object | EquationProblem object | OptimizationExpression object |
OptimizationVariable object | OptimizationConstraint object | OptimizationEquality
object | OptimizationInequality object
• OptimizationProblem object — show(obj) displays the variables for the solution, objective
function, constraints, and variable bounds.
• EquationProblem object — show(obj) displays the variables for the solution, equations for the
solution, and variable bounds.
See Also
showbounds | write
Topics
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2019b
showbounds
Package: optim.problemdef
Syntax
showbounds(var)
Description
showbounds(var) displays the bounds for var.
Examples
x = optimvar('x',2,2);
showbounds(x)
x is unbounded.
Set lower bounds of 0 on all elements of x, and set upper bounds on the first row.
x.LowerBound = 0;
x.UpperBound(1,:) = [3,5];
showbounds(x)
binvar = optimvar('binvar',2,2,'Type','integer',...
'LowerBound',0,'UpperBound',1);
showbounds(binvar)
Create a large optimization variable that has few bounded elements, and display the variable bounds.
bigvar = optimvar('bigvar',100,10,50);
bigvar.LowerBound(55,4,3) = -20;
bigvar.LowerBound(20,5,30) = -40;
bigvar.UpperBound(35,3,35) = -200;
showbounds(bigvar)
Input Arguments
var — Optimization variable
OptimizationVariable object
Tips
• For a variable that has many bounds, use writebounds to generate a text file containing the
bound information.
See Also
OptimizationVariable | optimvar | writebounds
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2017b
showconstr
Package: optim.problemdef
Syntax
showconstr(constr)
Description
showconstr is not recommended. Use show instead.
showconstr(constr) displays the optimization constraint constr at the MATLAB Command
Window.
Examples
x = optimvar('x',3,2);
constr = sum(x,2) <= [1;3;2];
showconstr(constr)
(1, 1)
(2, 1)
(3, 1)
Input Arguments
constr — Optimization constraint
OptimizationEquality object | OptimizationInequality object | OptimizationConstraint
object
Tips
• For a large or complicated constraint, use writeconstr to generate a text file containing the
constraint information.
Compatibility Considerations
showconstr is not recommended
Not recommended starting in R2019b
The showconstr function is not recommended. Instead, use show. The show function replaces
showconstr and many other problem-based functions.
See Also
OptimizationConstraint | show | write
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2017b
showexpr
Package: optim.problemdef
Syntax
showexpr(expr)
Description
showexpr is not recommended. Use show instead.
showexpr(expr) displays the optimization expression expr at the MATLAB Command Window.
Examples
x = optimvar('x',3,3);
A = magic(3);
expr = sum(sum(A.*x));
showexpr(expr)
Input Arguments
expr — Optimization expression
OptimizationExpression object
Tips
• For an expression that has many terms, use writeexpr to generate a text file containing the
expression information.
Compatibility Considerations
showexpr is not recommended
Not recommended starting in R2019b
The showexpr function is not recommended. Instead, use show. The show function replaces
showexpr and many other problem-based functions.
See Also
OptimizationExpression | show | write
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2017b
showproblem
Package: optim.problemdef
Syntax
showproblem(prob)
Description
showproblem is not recommended. Use show instead.
showproblem(prob) displays the optimization problem or equation problem prob at the MATLAB
Command Window.
Examples
Create an optimization problem, including an objective function and constraints, and display the
problem.
Create the problem in “Mixed-Integer Linear Programming Basics: Problem-Based” on page 10-40.
steelprob = optimproblem;
ingots = optimvar('ingots',4,1,'Type','integer','LowerBound',0,'UpperBound',1);
alloys = optimvar('alloys',4,1,'LowerBound',0);
weightIngots = [5,3,4,6];
costIngots = weightIngots.*[350,330,310,280];
costAlloys = [500,450,400,100];
cost = costIngots*ingots + costAlloys*alloys;
steelprob.Objective = cost;
totalweight = weightIngots*ingots + sum(alloys);
carbonIngots = [5,4,5,3]/100;
carbonAlloys = [8,7,6,3]/100;
totalCarbon = (weightIngots.*carbonIngots)*ingots + carbonAlloys*alloys;
molybIngots = [3,3,4,4]/100;
molybAlloys = [6,7,8,9]/100;
totalMolyb = (weightIngots.*molybIngots)*ingots + molybAlloys*alloys;
steelprob.Constraints.conswt = totalweight == 25;
steelprob.Constraints.conscarb = totalCarbon == 1.25;
steelprob.Constraints.consmolyb = totalMolyb == 1.25;
showproblem(steelprob)
OptimizationProblem :
Solve for:
alloys, ingots
minimize :
1750*ingots(1) + 990*ingots(2) + 1240*ingots(3) + 1680*ingots(4)
+ 500*alloys(1) + 450*alloys(2) + 400*alloys(3) + 100*alloys(4)
subject to conswt:
5*ingots(1) + 3*ingots(2) + 4*ingots(3) + 6*ingots(4) + alloys(1)
+ alloys(2) + alloys(3) + alloys(4) == 25
subject to conscarb:
0.25*ingots(1) + 0.12*ingots(2) + 0.2*ingots(3) + 0.18*ingots(4)
+ 0.08*alloys(1) + 0.07*alloys(2) + 0.06*alloys(3) + 0.03*alloys(4) == 1.25
subject to consmolyb:
0.15*ingots(1) + 0.09*ingots(2) + 0.16*ingots(3) + 0.24*ingots(4)
+ 0.06*alloys(1) + 0.07*alloys(2) + 0.08*alloys(3) + 0.09*alloys(4) == 1.25
variable bounds:
0 <= alloys(1)
0 <= alloys(2)
0 <= alloys(3)
0 <= alloys(4)
Input Arguments
prob — Optimization problem or equation problem
OptimizationProblem object | EquationProblem object
Warning The problem-based approach does not support complex values in an objective function,
nonlinear equalities, or nonlinear inequalities. If a function calculation has a complex value, even as
an intermediate value, the final result can be incorrect.
Tips
• showproblem is equivalent to calling all of the following:
• showexpr(prob.Objective)
Compatibility Considerations
showproblem is not recommended
Not recommended starting in R2019b
The showproblem function is not recommended. Instead, use show. The show function replaces
showproblem and many other problem-based functions.
See Also
OptimizationProblem | show | showbounds | write
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2017b
showvar
Package: optim.problemdef
Syntax
showvar(var)
Description
showvar is not recommended. Use show instead.
showvar(var) displays the optimization variable var at the MATLAB Command Window.
Examples
Input Arguments
var — Optimization variable
OptimizationVariable object
Compatibility Considerations
showvar is not recommended
Not recommended starting in R2019b
The showvar function is not recommended. Instead, use show. The show function replaces showvar
and many other problem-based functions.
See Also
OptimizationVariable | optimvar | show | write
Topics
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2017b
solve
Package: optim.problemdef
Syntax
sol = solve(prob)
sol = solve(prob,x0)
sol = solve( ___ ,Name,Value)
[sol,fval] = solve( ___ )
[sol,fval,exitflag,output,lambda] = solve( ___ )
Description
sol = solve(prob) solves the optimization problem or equation problem prob.
sol = solve(prob,x0) solves prob starting from the point x0.
sol = solve( ___ ,Name,Value) modifies the solution process using one or more name-value pair
arguments in addition to the input arguments in previous syntaxes.
[sol,fval] = solve( ___ ) also returns the objective function value at the solution using any of
the input arguments in previous syntaxes.
Examples
x = optimvar('x');
y = optimvar('y');
prob = optimproblem;
prob.Objective = -x - y/3;
prob.Constraints.cons1 = x + y <= 2;
prob.Constraints.cons2 = x + y/4 <= 1;
prob.Constraints.cons3 = x - y <= 2;
prob.Constraints.cons4 = x/4 + y >= -1;
prob.Constraints.cons5 = x + y >= 1;
prob.Constraints.cons6 = -x + y <= 2;
sol = solve(prob)
Find a minimum of the peaks function, which is included in MATLAB®, in the region x2 + y2 ≤ 4. To
do so, convert the peaks function to an optimization expression.
prob = optimproblem;
x = optimvar('x');
y = optimvar('y');
fun = fcn2optimexpr(@peaks,x,y);
prob.Objective = fun;
Set the initial point for x to 1 and y to –1, and solve the problem.
x0.x = 1;
x0.y = -1;
sol = solve(prob,x0)
Compare the number of steps to solve an integer programming problem both with and without an
initial feasible point. The problem has eight integer variables and four linear equality constraints, and
all variables are restricted to be positive.
prob = optimproblem;
x = optimvar('x',8,1,'LowerBound',0,'Type','integer');
Create four linear equality constraints and include them in the problem.
Aeq = [22 13 26 33 21 3 14 26
39 16 22 28 26 30 23 24
18 14 29 27 30 38 26 26
41 26 28 36 18 38 16 26];
beq = [ 7872
10466
11322
12058];
cons = Aeq*x == beq;
prob.Constraints.cons = cons;
Solve the problem without using an initial point, and examine the display to see the number of
branch-and-bound nodes.
[x1,fval1,exitflag1,output1] = solve(prob);
Intlinprog stopped because the objective value is within a gap tolerance of the
optimal value, options.AbsoluteGapTolerance = 0 (the default value). The intcon
variables are integer within tolerance, options.IntegerTolerance = 1e-05 (the
default value).
Intlinprog stopped because the objective value is within a gap tolerance of the
optimal value, options.AbsoluteGapTolerance = 0 (the default value). The intcon
variables are integer within tolerance, options.IntegerTolerance = 1e-05 (the
default value).
fprintf('Without an initial point, solve took %d steps.\nWith an initial point, solve took %d ste
Providing an initial point does not always help. For this problem, using an initial point
saves time and computational steps. However, for some problems, an initial point can cause solve to
take more steps.
min −3x1 − 2x2 − x3
subject to:
x1 + x2 + x3 ≤ 7
4x1 + 2x2 + x3 = 12
x3 binary
x1, x2 ≥ 0
x = optimvar('x',2,1,'LowerBound',0);
x3 = optimvar('x3','Type','integer','LowerBound',0,'UpperBound',1);
prob = optimproblem;
prob.Objective = -3*x(1) - 2*x(2) - x3;
prob.Constraints.cons1 = x(1) + x(2) + x3 <= 7;
prob.Constraints.cons2 = 4*x(1) + 2*x(2) + x3 == 12;
options = optimoptions('intlinprog','Display','off');
sol = solve(prob,'Options',options)
sol.x
ans = 2×1
0
5.5000
sol.x3
ans = 1
Force solve to use intlinprog as the solver for a linear programming problem.
x = optimvar('x');
y = optimvar('y');
prob = optimproblem;
prob.Objective = -x - y/3;
prob.Constraints.cons1 = x + y <= 2;
prob.Constraints.cons2 = x + y/4 <= 1;
prob.Constraints.cons3 = x - y <= 2;
prob.Constraints.cons4 = x/4 + y >= -1;
prob.Constraints.cons5 = x + y >= 1;
prob.Constraints.cons6 = -x + y <= 2;
Solve the mixed-integer linear programming problem described in “Solve Integer Programming
Problem with Nondefault Options” on page 16-418 and examine all of the output data.
x = optimvar('x',2,1,'LowerBound',0);
x3 = optimvar('x3','Type','integer','LowerBound',0,'UpperBound',1);
prob = optimproblem;
prob.Objective = -3*x(1) - 2*x(2) - x3;
prob.Constraints.cons1 = x(1) + x(2) + x3 <= 7;
prob.Constraints.cons2 = 4*x(1) + 2*x(2) + x3 == 12;
[sol,fval,exitflag,output] = solve(prob)
Intlinprog stopped at the root node because the objective value is within a gap
tolerance of the optimal value, options.AbsoluteGapTolerance = 0 (the default
value). The intcon variables are integer within tolerance,
options.IntegerTolerance = 1e-05 (the default value).
fval = -12
exitflag =
OptimalSolution
For a problem without any integer constraints, you can also obtain a nonempty Lagrange multiplier
structure as the fifth output.
Create and solve an optimization problem using named index variables. The problem is to maximize
the profit-weighted flow of fruit to various airports, subject to constraints on the weighted flows.
Intlinprog stopped at the root node because the objective value is within a gap
tolerance of the optimal value, options.AbsoluteGapTolerance = 0 (the default
value). The intcon variables are integer within tolerance,
options.IntegerTolerance = 1e-05 (the default value).
Find the optimal flow of oranges and berries to New York and Los Angeles.
[idxFruit,idxAirports] = findindex(flow, {'oranges','berries'}, {'NYC', 'LAX'})
idxFruit = 1×2
2 4
idxAirports = 1×2
1 3
orangeBerries = sol.flow(idxFruit,idxAirports)
orangeBerries = 2×2
0 980.0000
70.0000 0
This display means that no oranges are going to NYC, 70 berries are going to NYC, 980 oranges are
going to LAX, and no berries are going to LAX.
Fruit Airports
----- --------
Berries NYC
Apples BOS
Oranges LAX
idx = findindex(flow, {'berries', 'apples', 'oranges'}, {'NYC', 'BOS', 'LAX'})
idx = 1×3
4 5 10
optimalFlow = sol.flow(idx)
optimalFlow = 1×3
This display means that 70 berries are going to NYC, 28 apples are going to BOS, and 980 oranges are
going to LAX.
x = optimvar('x',2);
prob = eqnproblem;
prob.Equations.eq1 = eq1;
prob.Equations.eq2 = eq2;
show(prob)
EquationProblem :
Solve for:
x
eq1:
exp(-exp(-(x(1) + x(2)))) == (x(2) .* (1 + x(1).^2))
eq2:
((x(1) .* cos(x(2))) + (x(2) .* sin(x(1)))) == 0.5
Solve the problem starting from the point [0,0]. For the problem-based approach, specify the initial
point as a structure, with the variable names as the fields of the structure. For this problem, there is
only one variable, x.
x0.x = [0 0];
[sol,fval,exitflag] = solve(prob,x0)
Equation solved.
exitflag =
EquationSolved
disp(sol.x)
0.3532
0.6061
If your equation functions are not composed of elementary functions, you must convert the functions
to optimization expressions using fcn2optimexpr. For the present example:
ls1 = fcn2optimexpr(@(x)exp(-exp(-(x(1)+x(2)))),x);
eq1 = ls1 == x(2)*(1 + x(1)^2);
ls2 = fcn2optimexpr(@(x)x(1)*cos(x(2))+x(2)*sin(x(1)),x);
eq2 = ls2 == 1/2;
For the list of supported functions, see “Supported Operations on Optimization Variables and
Expressions” on page 10-36.
Input Arguments
prob — Optimization problem or equation problem
OptimizationProblem object | EquationProblem object
Warning The problem-based approach does not support complex values in an objective function,
nonlinear equalities, or nonlinear inequalities. If a function calculation has a complex value, even as
an intermediate value, the final result can be incorrect.
x0 — Initial point
structure
Initial point, specified as a structure with field names equal to the variable names in prob.
For an example using x0 with named index variables, see “Create Initial Point for Optimization with
Named Index Variables” on page 10-43.
Example: If prob has variables named x and y: x0.x = [3,2,17]; x0.y = [pi/3,2*pi/3].
Data Types: struct
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and
Value is the corresponding value. Name must appear inside quotes. You can specify several name and
value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: solve(prob,'options',opts)
Optimization options, specified as the comma-separated pair consisting of 'options' and an object
created by optimoptions or an options structure such as created by optimset.
Internally, the solve function calls a relevant solver as detailed in the 'solver' argument
reference. Ensure that options is compatible with the solver. For example, intlinprog does not
allow options to be a structure, and lsqnonneg does not allow options to be an object.
For suggestions on options settings to improve an intlinprog solution or the speed of a solution,
see “Tuning Integer Linear Programming” on page 9-35. For linprog, the default 'dual-simplex'
algorithm is generally memory-efficient and speedy. Occasionally, linprog solves a large problem
faster when the Algorithm option is 'interior-point'. For suggestions on options settings to
improve a nonlinear problem's solution, see “Options in Common Use: Tuning and Troubleshooting”
on page 2-61 and “Improve Results”.
Example: options = optimoptions('intlinprog','Display','none')
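For instance, a sketch of passing nondefault linprog options to solve; prob is assumed to be a previously created linear problem:

```matlab
opts = optimoptions('linprog','Algorithm','interior-point');
sol = solve(prob,'options',opts);
```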
Optimization solver, specified as the comma-separated pair consisting of 'solver' and the name of a
listed solver. For optimization problems, this table contains the available solvers for each problem
type.
Note If you choose lsqcurvefit as the solver for a least-squares problem, solve uses lsqnonlin.
The lsqcurvefit and lsqnonlin solvers are identical for solve.
For equation solving, this table contains the available solvers for each problem type. In the table,
Example: 'intlinprog'
Data Types: char | string
Output Arguments
sol — Solution
structure
Solution, returned as a structure. The fields of the structure are the names of the optimization
variables. See optimvar.
fval — Objective function value at the solution
real number | real vector
Objective function value at the solution, returned as a real number or, for systems of equations, a real
vector. For least-squares problems, fval is the sum of squares of the residuals at the solution. For
equation-solving problems, fval is the function value at the solution, meaning the left-hand side
minus the right-hand side of the equations.
Tip If you neglect to ask for fval for an optimization problem, you can calculate it using:
fval = evaluate(prob.Objective,sol)
exitflag — Reason solver stopped
enumeration variable
Reason the solver stopped, returned as an enumeration variable. You can convert exitflag to its
numeric equivalent using double(exitflag), and to its string equivalent using
string(exitflag).
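A sketch of these conversions, assuming prob is a previously created problem:

```matlab
[sol,fval,exitflag] = solve(prob);
flagNum = double(exitflag);   % numeric equivalent, e.g., 1
flagStr = string(exitflag);   % string equivalent, e.g., "OptimalSolution"
```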
This table describes the exit flags for the intlinprog solver.
• LPMaxIterations
• MaxNodes
• MaxTime
• RootLPMaxIterations
Exit flags 3 and -9 relate to solutions that have large infeasibilities. These infeasibilities usually
arise from linear constraint matrices that have a large condition number, or from problems that have
large solution components. To correct these issues, try to scale the coefficient matrices, eliminate
redundant linear constraints, or give tighter bounds on the variables.
This table describes the exit flags for the linprog solver.
Exit flags 3 and -9 relate to solutions that have large infeasibilities. These infeasibilities usually
arise from linear constraint matrices that have a large condition number, or from problems that have
large solution components. To correct these issues, try to scale the coefficient matrices, eliminate
redundant linear constraints, or give tighter bounds on the variables.
This table describes the exit flags for the lsqlin solver.
This table describes the exit flags for the quadprog solver.
This table describes the exit flags for the lsqcurvefit or lsqnonlin solver.
This table describes the exit flags for the fminunc solver.
This table describes the exit flags for the fmincon solver.
This table describes the exit flags for the fsolve solver.
This table describes the exit flags for the fzero solver.
output — Information about the optimization process
structure
Information about the optimization process, returned as a structure. The output structure contains
the fields in the relevant underlying solver output field, depending on which solver solve called:
• 'fmincon' output
• 'fminunc' output
• 'fsolve' output
• 'fzero' output
• 'intlinprog' output
• 'linprog' output
• 'lsqcurvefit' or 'lsqnonlin' output
• 'lsqlin' output
• 'lsqnonneg' output
• 'quadprog' output
solve includes the additional field Solver in the output structure to identify the solver used, such
as 'intlinprog'.
lambda — Lagrange multipliers at the solution
structure
Lagrange multipliers at the solution, returned as a structure. For the intlinprog and fminunc
solvers, lambda is empty, []. For the other solvers, lambda has these fields:
• Variables – Contains fields for each problem variable. Each problem variable name is a
structure with two fields:
• Lower – Lagrange multipliers associated with the variable LowerBound property, returned as
an array of the same size as the variable. Nonzero entries mean that the solution is at the
lower bound. These multipliers are in the structure
lambda.Variables.variablename.Lower.
• Upper – Lagrange multipliers associated with the variable UpperBound property, returned as
an array of the same size as the variable. Nonzero entries mean that the solution is at the
upper bound. These multipliers are in the structure
lambda.Variables.variablename.Upper.
• Constraints – Contains a field for each problem constraint. Each problem constraint is in a
structure whose name is the constraint name, and whose value is a numeric array of the same size
as the constraint. Nonzero entries mean that the constraint is active at the solution. These
multipliers are in the structure lambda.Constraints.constraintname.
Note Elements of a constraint array all have the same comparison (<=, ==, or >=) and are all of
the same type (linear, quadratic, or nonlinear).
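As a sketch, for a solved problem with a variable named x and a constraint named cons1 (the names are assumed for illustration):

```matlab
[sol,fval,exitflag,output,lambda] = solve(prob);
lambda.Variables.x.Lower     % multipliers for the LowerBound property of x
lambda.Constraints.cons1     % multipliers for the constraint named cons1
```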
Algorithms
Internally, the solve function solves optimization problems by calling a solver:
Before solve can call these functions, the problems must be converted to solver form, either by
solve or some other associated functions or objects. This conversion entails, for example, linear
constraints having a matrix representation rather than an optimization variable expression.
The first step in the algorithm occurs as you place optimization expressions into the problem. An
OptimizationProblem object has an internal list of the variables used in its expressions. Each
variable has a linear index in the expression, and a size. Therefore, the problem variables have an
implied matrix form. The prob2struct function performs the conversion from problem form to
solver form. For an example, see “Convert Problem to Structure” on page 16-369.
For the default and allowed solvers that solve calls, depending on the problem objective and
constraints, see 'solver'. You can override the default by using the 'solver' name-value pair
argument when calling solve.
For the algorithm that intlinprog uses to solve MILP problems, see “intlinprog Algorithm” on page
9-26. For the algorithms that linprog uses to solve linear programming problems, see “Linear
Programming Algorithms” on page 9-2. For the algorithms that quadprog uses to solve quadratic
programming problems, see “Quadratic Programming Algorithms” on page 11-2. For linear or
nonlinear least-squares solver algorithms, see “Least-Squares (Model Fitting) Algorithms” on page
12-2. For nonlinear solver algorithms, see “Unconstrained Nonlinear Optimization Algorithms” on
page 6-2 and “Constrained Nonlinear Optimization Algorithms” on page 6-19.
For nonlinear equation solving, solve internally represents each equation as the difference between
the left and right sides. Then solve attempts to minimize the sum of squares of the equation
components. For the algorithms for solving nonlinear systems of equations, see “Equation Solving
Algorithms” on page 13-2. When the problem also has bounds, solve calls lsqnonlin to minimize
the sum of squares of equation components. See “Least-Squares (Model Fitting) Algorithms” on page
12-2.
Note If your objective function is a sum of squares, and you want solve to recognize it as such,
write it as sum(expr.^2), and not as expr'*expr or any other form. The internal parser recognizes
only explicit sums of squares. For details, see “Write Objective Function for Problem-Based Least
Squares” on page 12-92. For an example, see “Nonnegative Linear Least Squares, Problem-Based” on
page 12-40.
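A sketch of the recommended form, with A, b, and x assumed to be defined:

```matlab
expr = A*x - b;
prob.Objective = sum(expr.^2);   % recognized as an explicit sum of squares
% prob.Objective = expr'*expr;   % same value, but not recognized as least squares
```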
Compatibility Considerations
solve(prob,solver), solve(prob,options), and solve(prob,solver,options) syntaxes
have been removed
Errors starting in R2018b
To choose options or the underlying solver for solve, use name-value pairs. For example,
sol = solve(prob,'options',opts,'solver','quadprog');
The previous syntaxes were not as flexible, standard, or extensible as name-value pairs.
See Also
EquationProblem | OptimizationProblem | evaluate | findindex | optimoptions |
prob2struct
Topics
“Problem-Based Optimization Workflow” on page 10-2
“Create Initial Point for Optimization with Named Index Variables” on page 10-43
Introduced in R2017b
varindex
Package: optim.problemdef
Syntax
idx = varindex(prob)
idx = varindex(prob,varname)
Description
idx = varindex(prob) returns the linear indices of all problem variables as a structure. If you
convert prob to a problem structure by using prob2struct, idx gives the variable indices in the
resulting problem structure that correspond to the variables in prob.
idx = varindex(prob,varname) returns the linear indices of the variable named varname as an
integer vector.
Examples
x = optimvar('x',3);
y = optimvar('y',3,3);
prob = optimproblem('Objective',x'*y*x);
problem = prob2struct(prob);
idx = varindex(prob);
disp(idx.x)
1 2 3
disp(idx.y)
4 5 6 7 8 9 10 11 12
idxy = varindex(prob,'y')
idxy = 1×9
4 5 6 7 8 9 10 11 12
Input Arguments
prob — Optimization problem or equation problem
OptimizationProblem object | EquationProblem object
Warning The problem-based approach does not support complex values in an objective function,
nonlinear equalities, or nonlinear inequalities. If a function calculation has a complex value, even as
an intermediate value, the final result can be incorrect.
Output Arguments
idx — Linear indices of problem variables
structure | integer vector
Linear indices of problem variables, returned as a structure or an integer vector. If you convert prob
to a problem structure by using prob2struct, idx gives the variable indices in the resulting
problem structure that correspond to the variables in prob.
• When you call idx = varindex(prob), the returned idx is a structure. The field names of the
structure are the variable names in prob. The value for each field is the integer vector of linear
indices to which the variables map in the associated solver-based problem variable.
• When you call idx = varindex(prob,varname), the returned idx is the vector of linear
indices to which the variable varname maps in the associated solver-based problem variable.
See Also
EquationProblem | OptimizationProblem | prob2struct
Topics
“Output Function for Problem-Based Optimization” on page 7-25
“Include Derivatives in Problem-Based Workflow” on page 7-20
Introduced in R2019a
write
Package: optim.problemdef
Syntax
write(obj)
write(obj,filename)
Description
write(obj) saves a description of the optimization object obj in a file named obj.txt. Here, obj
is the workspace variable name of the optimization object. If write cannot construct the file name
from the expression, it writes the description to WriteOutput.txt instead. write overwrites any
existing file. If the object description is small, consider using show instead to display the description
at the command line.
write(obj,filename) saves the description in a file named filename.
Examples
Create an optimization variable and an expression that uses the variable. Save a description of the
expression to a file.
x = optimvar('x',3,3);
A = magic(3);
var = sum(sum(A.*x));
write(var)
write creates a file named var.txt in the current folder. The file contains a text description of the
expression.
To save the description in a file with a different name, pass the file name as the second argument.
write(var,"VarExpression.txt")
Input Arguments
obj — Optimization object
OptimizationProblem object | EquationProblem object | OptimizationExpression object |
OptimizationVariable object | OptimizationConstraint object | OptimizationEquality
object | OptimizationInequality object
Optimization object, specified as one of the following:
• OptimizationProblem object — write(obj) saves a file containing the variables for the
solution, objective function, constraints, and variable bounds.
• EquationProblem object — write(obj) saves a file containing the variables for the solution,
equations for the solution, and variable bounds.
• OptimizationExpression object — write(obj) saves a file containing the optimization
expression.
• OptimizationVariable object — write(obj) saves a file containing the optimization
variables. The saved description does not indicate variable types or bounds; it includes only the
variable dimensions and index names (if any).
• OptimizationConstraint object — write(obj) saves a file containing the constraint
expression.
• OptimizationEquality object — write(obj) saves a file containing the equality expression.
• OptimizationInequality object — write(obj) saves a file containing the inequality
expression.
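For example, this sketch writes a single inequality expression to a file; the variable and file names are
hypothetical.

```matlab
x = optimvar('x',2);
con = x(1) + 2*x(2) <= 3;            % OptimizationInequality object
write(con,'SingleConstraint.txt')    % saves the inequality expression
```

The same call pattern applies to every object type in the list above; only the saved content differs.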
filename — Path to the file
string | character vector
Path to the file, specified as a string or character vector. The path is relative to the current folder. The
resulting file is a text file, so the file name typically has the extension .txt.
Example: "../Notes/steel_stuff.txt"
Data Types: char | string
See Also
show | writebounds
Topics
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2019b
writebounds
Package: optim.problemdef
Syntax
writebounds(var)
writebounds(var,filename)
Description
writebounds(var) saves a description of the variable bounds in a file named
variable_bounds.txt. Here, variable is the “Name” property of var. The writebounds
function overwrites any existing file.
writebounds(var,filename) saves the description in a file named filename.
Examples
Create an optimization variable with lower and upper bounds, and save the bound information in a
file.
x = optimvar('x',10,4,'LowerBound',randi(8,10,4),...
'UpperBound',10+randi(7,10,4),'Type','integer');
writebounds(x,'BoundFile.txt')
Input Arguments
var — Optimization variable
OptimizationVariable object
Optimization variable, specified as an OptimizationVariable object created using optimvar.
filename — Path to the file
string | character vector
Path to the file, specified as a string or character vector. The path is relative to the current folder. The
resulting file is a text file, so the file name typically has the extension .txt.
Example: "../Notes/steel_stuff.txt"
Data Types: char | string
Tips
• To obtain the writebounds information at the Command Window, use showbounds.
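For example, assuming a small bounded variable, a quick check at the command line might look like
this sketch:

```matlab
x = optimvar('x',2,'LowerBound',0,'UpperBound',1);
showbounds(x)   % displays the bounds instead of writing them to a file
```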
See Also
OptimizationVariable | showbounds
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2017b
writeconstr
Package: optim.problemdef
Syntax
writeconstr(constr)
writeconstr(constr,filename)
Description
writeconstr is not recommended. Use write instead.
Examples
Create an optimization constraint in terms of optimization variables, and save its description in a file.
x = optimvar('x',3,2);
cons = sum(x,2) <= [1;3;2];
writeconstr(cons,"TripleConstraint.txt")
The file TripleConstraint.txt contains a description of each of the three row constraints:
(1, 1)
  x(1, 1) + x(1, 2) <= 1
(2, 1)
  x(2, 1) + x(2, 2) <= 3
(3, 1)
  x(3, 1) + x(3, 2) <= 2
Input Arguments
constr — Optimization constraint
OptimizationEquality object | OptimizationInequality object | OptimizationConstraint
object
Optimization constraint, specified as an OptimizationEquality, OptimizationInequality, or
OptimizationConstraint object.
filename — Path to the file
string | character vector
Path to the file, specified as a string or character vector. The path is relative to the current folder. The
resulting file is a text file, so the file name typically has the extension .txt.
Example: "../Notes/steel_stuff.txt"
Data Types: char | string
Tips
• To obtain the writeconstr information at the MATLAB Command Window, use showconstr.
Compatibility Considerations
writeconstr is not recommended
Not recommended starting in R2019b
The writeconstr function is not recommended. Instead, use write. The write function replaces
writeconstr and many other problem-based functions.
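Migrating a writeconstr call to write is a one-line change, as in this sketch with a hypothetical
constraint and file name:

```matlab
x = optimvar('x',3,2);
cons = sum(x,2) <= [1;3;2];
% Old (not recommended):
% writeconstr(cons,"TripleConstraint.txt")
% New:
write(cons,"TripleConstraint.txt")
```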
See Also
OptimizationConstraint | show | write
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2017b
writeexpr
Package: optim.problemdef
Syntax
writeexpr(expr)
writeexpr(expr,filename)
Description
writeexpr is not recommended. Use write instead.
Examples
Create an optimization variable and an expression that uses the variable. Save a description of the
expression to a file.
x = optimvar('x',3,3);
A = magic(3);
var = sum(sum(A.*x));
writeexpr(var,"VarExpression.txt")
Input Arguments
expr — Optimization expression
OptimizationExpression object
Optimization expression, specified as an OptimizationExpression object.
filename — Path to the file
string | character vector
Path to the file, specified as a string or character vector. The path is relative to the current folder. The
resulting file is a text file, so the file name typically has the extension .txt.
Example: "../Notes/steel_stuff.txt"
Data Types: char | string
Tips
• To obtain the writeexpr information at the MATLAB Command Window, use showexpr.
Compatibility Considerations
writeexpr is not recommended
Not recommended starting in R2019b
The writeexpr function is not recommended. Instead, use write. The write function replaces
writeexpr and many other problem-based functions.
See Also
OptimizationExpression | show | write
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2017b
writeproblem
Package: optim.problemdef
Syntax
writeproblem(prob)
writeproblem(prob,filename)
Description
writeproblem is not recommended. Use write instead.
Examples
Create an optimization problem with a linear objective and several linear constraints, and save a
description of the problem in a file.
x = optimvar('x');
y = optimvar('y');
prob = optimproblem;
prob.Objective = -x - y/3;
prob.Constraints.cons1 = x + y <= 2;
prob.Constraints.cons2 = x + y/4 <= 1;
prob.Constraints.cons3 = x - y <= 2;
prob.Constraints.cons4 = x/4 + y >= -1;
prob.Constraints.cons5 = x + y >= 1;
prob.Constraints.cons6 = -x + y <= 2;
writeproblem(prob)
writeproblem saves a description of the problem, which includes the following constraint
descriptions:
subject to cons1:
x + y <= 2
subject to cons2:
x + 0.25*y <= 1
subject to cons3:
x - y <= 2
subject to cons4:
0.25*x + y >= -1
subject to cons5:
x + y >= 1
subject to cons6:
-x + y <= 2
Input Arguments
prob — Optimization problem or equation problem
OptimizationProblem object | EquationProblem object
Optimization problem or equation problem, specified as an OptimizationProblem object or an
EquationProblem object.
Warning The problem-based approach does not support complex values in an objective function,
nonlinear equalities, or nonlinear inequalities. If a function calculation has a complex value, even as
an intermediate value, the final result can be incorrect.
filename — Path to the file
string | character vector
Path to the file, specified as a string or character vector. The path is relative to the current folder. The
resulting file is a text file, so the file name typically has the extension .txt.
Example: "../Notes/steel_stuff.txt"
Data Types: char | string
Tips
• writeproblem is equivalent to calling all of the following:
• writeexpr(prob.Objective,filename)
• writeconstr on each constraint in prob.Constraints
• writebounds on all the variables in prob
• To obtain the writeproblem information at the Command Window, use showproblem.
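The equivalence in the first tip can be sketched roughly as follows, assuming prob.Constraints is
a structure of constraints and using hypothetical file names:

```matlab
% Rough equivalent of writeproblem(prob,filename), split across files.
writeexpr(prob.Objective,'prob_objective.txt')
cnames = fieldnames(prob.Constraints);
for k = 1:numel(cnames)
    writeconstr(prob.Constraints.(cnames{k}), ...
        ['prob_' cnames{k} '.txt'])
end
writebounds(x,'prob_bounds.txt')   % repeat for each variable in prob
```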
Compatibility Considerations
writeproblem is not recommended
Not recommended starting in R2019b
The writeproblem function is not recommended. Instead, use write. The write function replaces
writeproblem and many other problem-based functions.
See Also
OptimizationProblem | show | write | writebounds
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2017b
writevar
Package: optim.problemdef
Syntax
writevar(var)
writevar(var,filename)
Description
writevar is not recommended. Use write instead.
Examples
Create an 8-by-3 array of integer optimization variables, and save its description in a file.
var = optimvar('var',8,3,'Type','integer');
writevar(var,"VariableDescription.txt")
Input Arguments
var — Optimization variable
OptimizationVariable object
Optimization variable, specified as an OptimizationVariable object created using optimvar.
filename — Path to the file
string | character vector
Path to the file, specified as a string or character vector. The path is relative to the current folder. The
resulting file is a text file, so the file name typically has the extension .txt.
Example: "../Notes/steel_stuff.txt"
Data Types: char | string
Compatibility Considerations
writevar is not recommended
Not recommended starting in R2019b
The writevar function is not recommended. Instead, use write. The write function replaces
writevar and many other problem-based functions.
See Also
OptimizationVariable | optimvar | show | write
Topics
“Problem-Based Optimization Setup”
“Problem-Based Optimization Workflow” on page 10-2
Introduced in R2017b