Optimization Models
Alberto Bemporad
http://cse.lab.imtlucca.it/~bemporad/teaching/numopt
Course contents:
• Optimization models
– Application domains
• Optimization theory
– Optimality conditions, sensitivity analysis
– Duality
• Optimization algorithms
– Basics of numerical linear algebra
– Convex programming
– Nonlinear programming
[Figure: fuel consumption [ℓ/km] as a function of velocity [km/h], 0–160 km/h]
min_x f(x)

[Figure: graph of f(x) with minimum value f(x*) attained at x = x*]

f* = min_x f(x) = optimal value
x* = arg min_x f(x) = optimizer

x ∈ Rⁿ,  f : Rⁿ → R,  x = [x₁, ..., xₙ]′,  f(x) = f(x₁, x₂, ..., xₙ)

Maximization problems are covered too, since max_x f(x) = −min_x (−f(x))
Constrained optimization problem:

min_x  f(x)
s.t.   g(x) ≤ 0
       h(x) = 0

[Figure: fuel consumption [ℓ/km] vs. velocity [km/h]; the optimal velocity achieves the best fuel consumption]

[Figure: a constraint function g(x) and the region where g(x) ≤ 0]
• Scaling f(x) to αf(x) and/or gᵢ(x) to βᵢgᵢ(x) does not change the solution, for all α, βᵢ > 0. The same holds if hⱼ(x) is scaled to γⱼhⱼ(x), for any γⱼ ≠ 0
• Least-squares line fit: given data points (uₖ, yₖ), minimize f([a; b]) = Σₖ (a·uₖ + b − yₖ)² with respect to x = [a; b]

• The problem [a*; b*] = arg min_{a,b} f([a; b]) is a least-squares problem: ŷ = a*u + b*

In MATLAB:

A=[u ones(size(u))];
x=A\y;

In Python:

import numpy as np
A=np.hstack((u,np.ones(u.shape)))
x=np.linalg.lstsq(A,y,rcond=0)[0]

[Figure: data points (u, y) and the fitted line ŷ = a*u + b*]
[Figure: least-squares fits using nonlinear basis functions of the input, e.g. ϕ(u) = u₁², u₁⁴]

http://www.utc.fr/~mottelet/mt94/leastSquares.pdf
[Figure: a convex set S (the segment between any two points x₁, x₂ ∈ S lies in S) and a nonconvex set]

Jensen's inequality:

f(λx₁ + (1 − λ)x₂) ≤ λf(x₁) + (1 − λ)f(x₂) − (m/2)·λ(1 − λ)·∥x₁ − x₂∥₂²,  ∀λ ∈ [0, 1]

(m = 0 for convex functions, m > 0 for strongly convex functions)
• Every local solution is also a global one (we will see this later)
• Halfspace (H-)representation:

P = {x ∈ Rⁿ : Ax ≤ b}

[Figure: polyhedron with vertices v₁, ..., v₃ and facets Aᵢx = bᵢ]

• Vertex (V-)representation:

P = {x ∈ Rⁿ : x = Σ_{i=1}^q αᵢvᵢ + Σ_{j=1}^p βⱼrⱼ,  αᵢ, βⱼ ≥ 0,  Σ_{i=1}^q αᵢ = 1},  vᵢ, rⱼ ∈ Rⁿ

vᵢ = vertex, rⱼ = extreme ray; when q = 0 the polyhedron is a cone

• Convex hull = transformation from V- to H-representation
• Vertex enumeration = transformation from H- to V-representation
Linear programming

• Linear programming (LP) problem:

min  c′x
s.t. Ax ≤ b
     Ex = f,   x ∈ Rⁿ

[Figure: the optimizer x* lies at a vertex of the feasible polyhedron, in the direction of −c]

George Dantzig (1914–2005)

• LP in standard form:

min  c′x
s.t. Ax = b
     x ≥ 0,  x ∈ Rⁿ
• Conversion to standard form:

1. introduce slack variables

Σ_{j=1}^n aᵢⱼxⱼ ≤ bᵢ   ⇒   Σ_{j=1}^n aᵢⱼxⱼ + sᵢ = bᵢ,  sᵢ ≥ 0
• The matrix Q can be assumed symmetric without loss of generality, since

½x′Qx = ½x′((Q + Q′)/2)x + ¼x′Qx − ¼x′Q′x = ½x′((Q + Q′)/2)x

(the last two terms cancel because x′Q′x = x′Qx)
• In some problems the optimization variables can only take integer values.
We call x ∈ Z an integrality constraint
• When all variables are integer (or binary) the problem is an integer
programming problem (a special case of discrete optimization)
• In a mixed integer programming (MIP) problem some of the variables are real
(xi ∈ R), some are discrete/binary (xi ∈ Z or xi ∈ {0, 1})
• Many good solvers are available (CPLEX, Gurobi, GLPK, Xpress-MP, CBC, ...)
For comparisons see http://plato.la.asu.edu/bench.html
– Robustness = perform well on a wide variety of problems in their class, for all
reasonable values of the initial guess x0
– Accuracy = find a solution close to the optimal one, without being affected by
rounding errors due to the finite precision arithmetic of the CPU
[Figure: NEOS optimization taxonomy — optimization splits into integer/combinatorial, unconstrained and constrained problems; branches include nonlinear equations, global optimization, nondifferentiable optimization, nonlinear programming, network optimization, bound constrained, linearly constrained, least squares, second-order cone programming, complementarity problems, quadratically-constrained and quadratic programming]

https://neos-guide.org/content/optimization-taxonomy
Optimization software
• Comparison on benchmark problems:
http://plato.la.asu.edu/bench.html
http://www.neos-server.org
http://www.coin-or.org/
2. Single out the optimization variables xi (what are we able to decide?) and their
domain (real, binary, integer)
3. Treat the remaining variables as parameters (=data that affect the problem but
are not part of the decision process)
5. Are there constraints on the decision variables ? If yes, translate them into
(in)equalities involving x
The outcome is a formal optimization problem (e.g., min_x ½x′Qx + c′x subject to Ax ≤ b) that can be passed to a numerical solver, followed by an analysis of the solution
• It may take several iterations to formulate the optimization model properly, as:
– The solution does not make sense (is any constraint missing or wrong?)
– The optimal value does not make sense (is the cost function properly defined?)
– It takes too long to find the solution (can we simplify the model?)
A small joinery makes two different sizes of boxwood chess sets. The small set requires 3 hours of machining on a
lathe, and the large set requires 2 hours. There are four lathes with skilled operators who each work a 40 hour week,
so we have 160 lathe-hours per week. The small chess set requires 1 kg of boxwood, and the large set requires 3 kg.
Unfortunately, boxwood is scarce and only 200 kg per week can be obtained. When sold, each of the large chess sets
yields a profit of $20, and each of the small chess sets has a profit of $5.
The problem is to decide how many sets of each kind should be made each week so as to maximize profit.
• A small joinery makes two different sizes of boxwood chess sets. The small set requires 3 hours of machining on a
lathe, and the large set requires 2 hours.
• There are four lathes with skilled operators who each work a 40 hour week, so we have 160 lathe-hours per week.
• The small chess set requires 1 kg of boxwood, and the large set requires 3 kg. Unfortunately, boxwood is scarce and
only 200 kg per week can be obtained.
• When sold, each of the large chess sets yields a profit of $20, and each of the small chess sets has a profit of $5.
• The problem is to decide how many sets of each kind should be made each week so as to maximize profit.
• Constraints:
– 3xs + 2xℓ ≤ 4 · 40 (maximum lathe-hours)
• What is the best solution? A numerical solver provides the following solution:

x*_s = 0,  x*_ℓ = 66.6666   ⇒   f(x*) = 1333.3 $

``We have just built a model for the decision process that the joinery owner has to make. We have isolated the decisions he has to make (how many of each type of chess set to manufacture), and taken his objective of maximizing profit. The constraints acting on the decision variables have been analyzed. We have given names to his variables and then written down the constraints and the objective function in terms of these variable names. At the same time as doing this we have made, explicitly or implicitly, various assumptions. The explicit assumptions that we noted were: ...''
Optimization models
• Optimization models, as all mathematical models, are never an exact
representation of reality but a good approximation of it
• There are usually many different models for the same real problem
• A good optimization model involves a trade-off: it should be
– accurate enough to be a useful representation of reality, and
– simple enough to be able to solve the resulting optimization problem
• Even if the solution is not really applied, it provides a good suggestion to the
decision maker (you should only make large chess sets)
• Making the model and analyzing the solution allows a better understanding of
the problem at hand (small chess sets are not profitable)
• We can analyze the robustness of the solution with respect to variations in the
data (big changes of solution for small changes of prices?)
• GNU MathProg, a subset of AMPL, is associated with the free package GLPK (GNU Linear Programming Kit)
xs = sdpvar(1,1);
xl = sdpvar(1,1);
Profit = 5*xs+20*xl;
Constraints = [3*xs+2*xl<=4*40, 1*xs+3*xl<=200, xs>=0, xl>=0];
optimize(Constraints,-Profit)
value(xs),value(xl),value(Profit)
cvx_clear
cvx_begin
variable xs(1)
variable xl(1)
Profit = 5*xs+20*xl;
maximize Profit
subject to
3*xs+2*xl <= 4*40; % maximum lathe-hours
1*xs+3*xl <= 200; % available kg of boxwood
xs>=0;
xl>=0;
cvx_end
xs,xl,Profit
import casadi.*
xs=SX.sym('xs');                                       % decision variables
xl=SX.sym('xl');
Profit = 5*xs+20*xl;
Constraints = [3*xs+2*xl-4*40; 1*xs+3*xl-200];         % g(x) <= 0
prob=struct('x',[xs;xl],'f',-Profit,'g',Constraints);  % minimize -Profit
solver = nlpsol('solver','ipopt', prob);
res = solver('lbx',[0;0],'ubg',[0;0]);                 % x >= 0, g(x) <= 0
Profit = -res.f;
xs = res.x(1);
xl = res.x(2);
• Model and solve the problem using Optimization Toolbox (The Mathworks, Inc.)
xs=optimvar('xs','LowerBound',0);
xl=optimvar('xl','LowerBound',0);
Profit = 5*xs+20*xl;
C1 = 3*xs+2*xl-4*40<=0;
C2= 1*xs+3*xl-200<=0;
prob=optimproblem('Objective',Profit,'ObjectiveSense','max');
prob.Constraints.C1=C1;
prob.Constraints.C2=C2;
[sol,Profit] = solve(prob);
xs=sol.xs;
xl=sol.xl;
• The Hybrid Toolbox for MATLAB contains interfaces to various solvers for LP,
QP, MILP, MIQP (http://cse.lab.imtlucca.it/~bemporad/hybrid/toolbox) (Bemporad, 2003-today)
• However, when there are many variables and constraints, forming the problem matrices manually can be very time-consuming and error-prone
[Figure: spreadsheet (Excel Solver) implementation — optimization variables in cells B6:C6, cost function defined as =SUMPRODUCT(B6:C6;B2:C2)]
• Bound constraints restrict individual variables: ℓ₁ ≤ x₁ ≤ u₁
• Example: ``We cannot sell more than 100 units of Product A''
• Pay attention: some solvers assume nonnegative variables by default!
• When ℓᵢ = uᵢ the constraint becomes xᵢ = ℓᵢ and variable xᵢ becomes redundant. Still, it may be worthwhile to keep it in the model
• Example: ``I can get water from 3 suppliers, S1, S2 and S3. I want to have at least 1000 liters available.''   x₁ + x₂ + x₃ ≥ 1000

[Figure: flows x₁, x₂, x₃ merging into the total flow]

• Example: ``I have 50 trucks available to rent to 3 customers C1, C2 and C3.''   x₁ + x₂ + x₃ ≤ 50

• Losses can be included as well: ``2% of the water I get from the suppliers gets lost.''   0.98x₁ + 0.98x₂ + 0.98x₃ ≥ 1000
• The technological coefficients Rⱼᵢ denote the amount of resource j used per unit of activity i

• Example:

``Small chess sets require 1 kg of boxwood, the large ones 3 kg; the total available is 200 kg.''   x₁ + 3x₂ ≤ 200

``Small chess sets require 3 lathe-hours, the large ones 2 h; the total time is 4×40 h.''   3x₁ + 2x₂ ≤ 160

R = [1 3; 3 2],   Rmax = [200; 160]
• Balance constraints model the fact that ``what goes out must in total equal what comes in'':

Σᵢ xᵢᵒᵘᵗ = Σᵢ xᵢⁱⁿ + L

[Figure: node with incoming flows x₁ⁱⁿ, x₂ⁱⁿ, x₃ⁱⁿ plus an extra term L, and outgoing flows x₁ᵒᵘᵗ, x₂ᵒᵘᵗ]
• Example: ``I have 100 tons of steel and can buy more from suppliers 1, 2, 3 to serve customers A and B.''   x_A + x_B = 100 + x₁ + x₂ + x₃
• Example: ``The cash I'll have tomorrow is what I have now plus
what I receive minus what I spend today.'' xt+1 = xt + ut − yt
• Of course we can replace y with Σ_{i=1}^N xᵢ everywhere in the model (condensed form), but this would make it less readable
• We call the new variable ϵⱼ a panic variable: it should normally be zero, but it can take a positive value when there is no other way to fulfill the constraint set
The company Steel has received an order for 500 tonnes of steel to be used in shipbuilding. This steel must have
certain characteristics (``grades''). The company has seven different raw materials in stock that may be used for the
production of this steel. The objective is to determine the composition of the steel that minimizes the production cost.
• The company Steel has received an order for 500 tonnes of steel to be used in shipbuilding.
• This steel must have certain characteristics (``grades'').
• The company has seven different raw materials in stock that may be used for the production of this steel.
• The objective is to determine the composition of the steel that minimizes the production cost.
• ``The company Steel has received an order for 500 tonnes of steel''
– flow constraint y = Σ_{j=1}^7 xⱼ (produced metal), with y ≥ 500
Grade               min [%]   max [%]
1  Carbon (C)         2         3
2  Copper (Cu)        0.4       0.6
3  Manganese (Mn)     1.2       1.65
x∗1 = 400, x∗3 = 39.7763, x∗5 = 2.7613, x∗6 = 57.4624, x∗2 = x∗4 = x∗7 = 0 [t]
x=sdpvar(7,1);
y=sum(x);
cost = sum(c.*x);
Constraints=[y>=500, x>=0, x<=R, P'*x<=B(:,2)*y, P'*x>=B(:,1)*y];  % same constraints as in the CVX version below
optimize(Constraints,cost)
value(x),value(cost)
cvx_clear
cvx_begin
variable x(7)
cost = sum(c.*x);
minimize cost
subject to
y=sum(x); y>=500;
x>=0; x<=R;
P'*x<=B(:,2)*y;
P'*x>=B(:,1)*y;
cvx_end
x,cost
• Every convex piecewise affine function ℓ : Rⁿ → R can be represented as the max of affine functions, and vice versa (Schechter, 1987)

• Example: ℓ(x) = max {a′₁x + b₁, ..., a′₄x + b₄}

[Figure: ℓ(x) as the upper envelope of the affine functions a′ᵢx + bᵢ, i = 1, ..., 4]

• Minimizing ℓ(x) is equivalent to the linear program

min_{ϵ,x}  ϵ
s.t.  ϵ ≥ a′₁x + b₁
      ϵ ≥ a′₂x + b₂
      ϵ ≥ a′₃x + b₃
      ϵ ≥ a′₄x + b₄
• Find the largest ball {x + u : ∥u∥₂ ≤ r} contained in the polyhedron P = {z : Az ≤ b}: for each facet i,

sup_{∥u∥₂ ≤ r} Aᵢ(x + u) = Aᵢx + r∥Aᵢ∥₂ ≤ bᵢ,  ∀i = 1, ..., m

so the center x and radius r solve the LP

max_{x,r}  r
s.t.  Aᵢx + r∥Aᵢ∥₂ ≤ bᵢ,  i = 1, ..., m
• The convex hull of N points x̄₁, ..., x̄_N is the set of all their convex combinations

S = {x ∈ Rⁿ : ∃λ ∈ Rᴺ : x = Σᵢ λᵢx̄ᵢ,  λᵢ ≥ 0,  Σ_{i=1}^N λᵢ = 1}

[Figure: convex hull of the points x̄₁, ..., x̄_N]

• A convex cone of N points x̄₁, ..., x̄_N is the set

S = {x ∈ Rⁿ : ∃λ ∈ Rᴺ : x = Σᵢ λᵢx̄ᵢ,  λᵢ ≥ 0}

[Figure: convex cone generated by two points x̄₁, x̄₂]
• The set {x : a′x ≤ b}, a ≠ 0, is called a halfspace
• Hyperplanes, halfspaces, polyhedra, balls, and ellipsoids are all convex sets
• Sublevel sets Cα of convex functions are convex sets (but not vice versa)
Cα = {x ∈ dom f : f (x) ≤ α}
∇2 f (x) ≽ 0, ∀x ∈ dom f
General composition rule: h(f₁(x), ..., fₖ(x)) is convex when h is convex and, for each i = 1, ..., k,
– h is increasing in argument i and fᵢ is convex, or
– h is decreasing in argument i and fᵢ is concave, or
– fᵢ is affine
½∥Ax − b∥₂² = ½x′A′Ax − b′Ax + ½b′b

[Figure: level sets ½x′Qx + c′x = constant and the minimizer x*]
• A generalization of constrained least squares is quadratic programming (QP)

min  ½x′Qx + c′x
s.t. Ax ≤ b
     Ex = f,   with Q = Q′ ≽ 0

• For Q ≻ 0, factoring Q = L′L shows that the cost is again a least-squares term (up to a constant):

½x′Qx + c′x = ½∥Lx − (−L⁻¹)′c∥₂² − ½c′Q⁻¹c
min c′ x
E[c] = c̄, Var[c] = E[(c − c̄)(c − c̄)′ ] = Σ
s.t. Ax ≤ b, Ex = f
• We want to trade off the expectation of c′ x with its variance (=risk) with a risk
aversion coefficient γ ≥ 0
min_x  ½∥Ax − b∥₂² + λ∥x∥₁,   A ∈ R^{m×n}, b ∈ R^m
• Example:  min_x ½∥Ax − b∥₂² + λ∥x∥₁  on a random instance with A ∈ R^{1000×3000}, A sparse with 3000 nonzero entries

[Figure: effect of the regularization parameter λ ∈ [10⁻⁴, 10²] on the solution]
min c′ x
s.t. prob(a′i x ≤ bi ) ≥ η, i = 1, . . . , m
min c′ x
s.t. ā′i x + Φ−1 (η)∥LΣ x∥2 ≤ bi , i = 1, . . . , m
• Problem to solve (maximize the volume of a box with corner x and edge lengths y contained in the polyhedron {z : Az ≤ b}):

max_{x,y}  Π_{i=1}^n yᵢ
s.t.  A(x + diag(v)y) ≤ b,  ∀v ∈ {0, 1}ⁿ
      y ≥ 0

nonlinear, nonconvex, many constraints!

• Reformulate as maximizing log(volume) and remove the redundant constraints:

min_{x,y}  −Σ_{i=1}^n log(yᵢ)
s.t.  Ax + A⁺y ≤ b,  y ≥ 0,   where A⁺ᵢⱼ = max{Aᵢⱼ, 0}

convex problem
min c′ x min c′ x
s.t. Ax ≤ b s.t. diag(Ax − b) ≼ 0
• Good SDP packages exist (SeDuMi, SDPT3, Mathworks LMI Toolbox, ...)
• A monomial function f : Rⁿ₊₊ → R₊₊, where R₊₊ = {x ∈ R : x > 0}, has the form

f(x) = c x₁^{a₁} x₂^{a₂} ... xₙ^{aₙ},   c > 0, aᵢ ∈ R

• A posynomial function f : Rⁿ₊₊ → R₊₊ is a sum of monomials

f(x) = Σ_{k=1}^K cₖ x₁^{a₁ₖ} x₂^{a₂ₖ} ... xₙ^{aₙₖ},   cₖ > 0, aᵢₖ ∈ R
• One can prove that F(y) = log f(e^y) is convex, and so is the program

min  log Σ_{k=1}^K exp(a′ₖy + bₖ)
s.t. log Σ_{k=1}^K exp(c′ᵢₖy + dᵢₖ) ≤ 0,  i = 1, ..., m
     Ey + f = 0
• Constraints:
– total wall area 2(hw + hd) ≤ Awall
– floor area wd ≤ Aflr
– upper and lower bounds on aspect ratios α ≤ h/w ≤ β , γ ≤ w/d ≤ δ
CVX

cvx_begin gp quiet
variables h w d
% obj. function = box volume
maximize(h*w*d)
subject to
2*(h*w + h*d) <= Awall;
w*d <= Afloor;
alpha <= h/w <= beta;
gamma <= d/w <= delta;
cvx_end
opt_volume = cvx_optval;

YALMIP

sdpvar h w d

C = [alpha <= h/w <= beta,
     gamma <= d/w <= delta, h>=0,
     w>=0];
C = [C, 2*(h*w+h*d) <= Awall,
     w*d <= Afloor];

optimize(C,-(h*w*d))

yalmip.github.io/tutorial/geometricprogramming
CVXPY

import cvxpy as cp

alpha = 0.5
beta = 2.0
gamma = 0.5
delta = 2.0
Awall = 1000.0
Afloor = 500.0

h = cp.Variable(pos=True)
w = cp.Variable(pos=True)
d = cp.Variable(pos=True)

obj = h * w * d

constraints = [
    2*(h*w + h*d) <= Awall,
    w*d <= Afloor,
    alpha <= h/w, h/w <= beta,
    gamma <= d/w, d/w <= delta]

problem = cp.Problem(cp.Maximize(obj), constraints)
problem.solve(gp=True)

print("h: ", h.value)
print("w: ", w.value)
print("d: ", d.value)
print("volume: ", problem.value)
– Example: min √x with x ≥ 0 is a nonconvex problem, but we can minimize (√x)² = x instead

– Example: max f(x) = Π_{i=1}^n xᵢ is a nonconvex problem, but the function log(f(x)) = Σ_{i=1}^n log(xᵢ) is concave