
Numerical Optimization

Alberto Bemporad

http://cse.lab.imtlucca.it/~bemporad/teaching/numopt

Academic year 2019-2020


Course objectives

Solve complex decision problems by using numerical optimization

Application domains:

• Finance, management science, economics (portfolio optimization, business analytics, investment plans, resource allocation, logistics, ...)

• Engineering (engineering design, process optimization, embedded control, ...)

• Artificial intelligence (machine learning, data science, autonomous driving, ...)

• Myriads of other applications (transportation, smart grids, water networks, sports scheduling, health care, oil & gas, space, ...)


Course objectives

What this course is about:

• How to formulate a decision problem as a numerical optimization problem? (modeling)

• Which numerical algorithm is most appropriate to solve the problem? (algorithms)

• What’s the theory behind the algorithm? (theory)


Course contents
• Optimization modeling
– Linear models
– Convex models

• Optimization theory
– Optimality conditions, sensitivity analysis
– Duality

• Optimization algorithms
– Basics of numerical linear algebra
– Convex programming
– Nonlinear programming



Other references

• Stephen Boyd’s “Convex Optimization” courses at Stanford:
  http://ee364a.stanford.edu and http://ee364b.stanford.edu

• Lieven Vandenberghe’s courses at UCLA:
  http://www.seas.ucla.edu/~vandenbe/

• For more tutorials/books see http://plato.asu.edu/sub/tutorials.html


Optimization modeling

What is optimization?

• Optimization = assign values to a set of decision variables so as to optimize a certain objective function

• Example: Which is the best velocity to minimize fuel consumption?

[Figure: fuel consumption [ℓ/km] as a function of velocity [km/h]; the optimal velocity is the one attaining the best (lowest) fuel consumption]

optimization variable: velocity
cost function to minimize: fuel consumption
parameters of the decision problem: engine type, chassis shape, gear, ...


Optimization problem

  min_x f(x),    x ∈ R^n, f: R^n → R

  f* = min_x f(x) = optimal value
  x* = arg min_x f(x) = optimizer

  x = [x1; x2; ...; xn],  f(x) = f(x1, x2, ..., xn)

[Figure: graph of f(x) with minimum value f(x*) attained at x = x*]

Most often the problem is difficult to solve by inspection ⇒ use a numerical solver implementing an optimization algorithm


Optimization problem

  min_x f(x)

• The objective function f: R^n → R models our goal: minimize (or maximize) some quantity. For example fuel, money, distance from a target, etc.

• The optimization vector x ∈ R^n is the vector of optimization variables (or unknowns) xi to be decided optimally. For example velocity, number of assets in a portfolio, voltage applied to a motor, etc.


Constrained optimization problem
• The optimization vector x may not be completely free, but rather restricted to a feasible set X ⊆ R^n

• Example: the velocity must be smaller than 60 km/h

[Figure: fuel consumption [ℓ/km] vs. velocity [km/h], with the feasible set restricted to velocities below 60 km/h; the constrained optimum moves with respect to the unconstrained one]

The new optimizer is x* = 42 km/h.
Constrained optimization problem

  min_x f(x)
  s.t.  g(x) ≤ 0
        h(x) = 0

• The (in)equalities define the feasible set X of admissible variables:

  X = {x ∈ R^n : g(x) ≤ 0, h(x) = 0}

  g: R^n → R^m,  g(x) = [g1(x1, ..., xn); ...; gm(x1, ..., xn)]
  h: R^n → R^p,  h(x) = [h1(x1, ..., xn); ...; hp(x1, ..., xn)]

• Further constraints may restrict X, for example:
  x ∈ {0, 1}^n (x = binary vector)
  x ∈ Z^n (x = integer vector)


A few observations
• An optimization problem can always be written as a minimization problem:

  max_{x∈X} f(x) = − min_{x∈X} {−f(x)}

• Similarly, an inequality gi(x) ≥ 0 is equivalent to −gi(x) ≤ 0

• An equality h(x) = 0 is equivalent to the double inequalities h(x) ≤ 0, −h(x) ≤ 0 (often this is only good in theory, but not numerically)

• Scaling f(x) to αf(x) and/or gi(x) to βi gi(x) does not change the solution, for all α, βi > 0. Same if hj(x) is scaled to γj hj(x), ∀γj ≠ 0

• Adding constraints makes the optimal value worse or equal:

  min_{x∈X1} f(x) ≤ min_{x∈X1, x∈X2} f(x)

• Strict inequalities gi(x) < 0 can be approximated by gi(x) ≤ −ϵ (0 < ϵ ≪ 1)


Infeasibility and unboundedness

• A vector x ∈ R^n is feasible if x ∈ X, i.e., it satisfies the given constraints

• A problem is infeasible if X = ∅ (the constraints are too tight)

• A problem is unbounded if ∀M > 0 ∃x ∈ X such that f(x) < −M. In this case we write

  inf_{x∈X} f(x) = −∞


Global and local minima

• A vector x* ∈ R^n is a global optimizer if x* ∈ X and f(x) ≥ f(x*), ∀x ∈ X

• A vector x* ∈ R^n is a strict global optimizer if x* ∈ X and f(x) > f(x*), ∀x ∈ X, x ≠ x*

• A vector x* ∈ R^n is a (strict) local optimizer if x* ∈ X and there exists a neighborhood¹ N of x* such that f(x) ≥ f(x*), ∀x ∈ X ∩ N (f(x) > f(x*), ∀x ∈ X ∩ N, x ≠ x*)

¹ Neighborhood of x = open set containing x


Example: Least Squares
• We have a dataset (uk, yk), uk, yk ∈ R, k = 1, ..., N

• We want to fit a line ŷ = au + b to the dataset that minimizes

  f(x) = Σ_{k=1}^N (yk − a uk − b)² = Σ_{k=1}^N ([uk 1] x − yk)² = ‖ [u1 1; ...; uN 1] x − [y1; ...; yN] ‖₂²

  with respect to x = [a; b]

• The problem [a*; b*] = arg min_x f([a; b]) is a least-squares problem: ŷ = a*u + b*

In MATLAB:

x=[u ones(size(u))]\y

In Python:

import numpy as np
A=np.hstack((u,np.ones(u.shape)))
x=np.linalg.lstsq(A,y,rcond=None)[0]

[Figure: data points (uk, yk) and the fitted line ŷ = a*u + b*]
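As a self-contained illustration, here is a minimal Python sketch of the line fit above; the synthetic dataset is an assumption, since the slide's data is not provided:

import numpy as np

# synthetic dataset (assumption: the original data is not given)
rng = np.random.default_rng(0)
u = np.linspace(-1, 1, 50).reshape(-1, 1)
y = 1.2*u - 0.3 + 0.1*rng.standard_normal(u.shape)

A = np.hstack((u, np.ones(u.shape)))       # rows [u_k 1]
x = np.linalg.lstsq(A, y, rcond=None)[0]   # x = [a; b]
a, b = x.ravel()
print(f"fitted line: yhat = {a:.3f} u + {b:.3f}")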


Least Squares using Basis Functions
• More generally: we can fit nonlinear functions y = f(u) expressed as the sum of basis functions, yk ≈ Σ_{i=1}^n xi φi(uk), using least squares

• Example: fit the polynomial function y = x1 + x2 u + x3 u² + x4 u³ + x5 u⁴

  min_x Σ_{k=1}^N ( yk − [1 uk uk² uk³ uk⁴] x )²   ← least squares, linear with respect to x

  φ(u) = [1; u; u²; u³; u⁴]

[Figure: data points and the fitted fourth-order polynomial]
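A minimal sketch of the polynomial fit, again with synthetic data (an assumption): build the matrix with rows φ(uk)′ and solve by least squares.

import numpy as np

rng = np.random.default_rng(1)
u = np.linspace(0, 2, 40)
y = 1.0 + 0.5*u - 0.8*u**2 + 0.3*u**3 + 0.05*u**4 + 0.1*rng.standard_normal(u.shape)

Phi = np.vander(u, N=5, increasing=True)    # columns 1, u, u^2, u^3, u^4
x = np.linalg.lstsq(Phi, y, rcond=None)[0]
print("coefficients x1..x5:", x)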


Least Squares - Fitting a circle
• Example: fit a circle to a set of data²

  min_{x0,y0,r} Σ_{k=1}^N ( r² − (xk − x0)² − (yk − y0)² )²

• Let x = [x0; y0; r² − x0² − y0²] be the optimization vector (note the change of variables!)

• The problem becomes the least-squares problem

  min_x Σ_{k=1}^N ( [2xk 2yk 1] x − (xk² + yk²) )²

[Figure: data points and the fitted circle]

² http://www.utc.fr/~mottelet/mt94/leastSquares.pdf
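An end-to-end sketch of this linearized circle fit (the noisy synthetic points on a circle are an assumption):

import numpy as np

rng = np.random.default_rng(2)
t = rng.uniform(0, 2*np.pi, 100)
xk = 0.5 + 1.8*np.cos(t) + 0.05*rng.standard_normal(t.shape)
yk = -0.3 + 1.8*np.sin(t) + 0.05*rng.standard_normal(t.shape)

A = np.column_stack((2*xk, 2*yk, np.ones_like(xk)))
b = xk**2 + yk**2
z = np.linalg.lstsq(A, b, rcond=None)[0]   # z = [x0, y0, r^2 - x0^2 - y0^2]
x0, y0 = z[0], z[1]
r = np.sqrt(z[2] + x0**2 + y0**2)          # undo the change of variables
print(f"center = ({x0:.3f}, {y0:.3f}), radius = {r:.3f}")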


Convex sets
Definition
A set S ⊆ Rn is convex if for all x1 , x2 ∈ S

λx1 + (1 − λ)x2 ∈ S, ∀λ ∈ [0, 1]

[Figure: a convex set S, where the segment between any x1, x2 ∈ S stays inside S, and a nonconvex set, where some segment between two points exits the set]


Convex functions
• f: S → R is a convex function if S is convex and

  f(λx1 + (1 − λ)x2) ≤ λf(x1) + (1 − λ)f(x2),  ∀x1, x2 ∈ S, λ ∈ [0, 1]   (Jensen’s inequality)

• If f is convex and differentiable, by taking the limit λ → 0 we get³

  f(x1) ≥ f(x2) + ∇f(x2)′(x1 − x2)

• A function f is strictly convex if f(λx1 + (1 − λ)x2) < λf(x1) + (1 − λ)f(x2), ∀x1 ≠ x2 ∈ S, ∀λ ∈ (0, 1)

³ f(x1) − f(x2) ≥ lim_{λ→0} ( f(x2 + λ(x1 − x2)) − f(x2) ) / λ = ∇f(x2)′(x1 − x2)


Convex functions
• A function f: S → R is strongly convex with parameter m ≥ 0 if

  f(λx1 + (1 − λ)x2) ≤ λf(x1) + (1 − λ)f(x2) − (m/2) λ(1 − λ) ‖x1 − x2‖₂²

• If f is strongly convex with parameter m ≥ 0 and differentiable then

  f(y) ≥ f(x) + ∇f(x)′(y − x) + (m/2) ‖y − x‖₂²

• Equivalently, f is strongly convex with parameter m ≥ 0 if and only if f(x) − (m/2)‖x‖₂² is convex

• Moreover, if f is twice differentiable this is equivalent to ∇²f(x) ≽ mI (i.e., the matrix ∇²f(x) − mI is positive semidefinite), ∀x ∈ R^n

• A function f is (strictly/strongly) concave if −f is (strictly/strongly) convex


Convex programming
The optimization problem

  min f(x)
  s.t. x ∈ S

is a convex optimization problem if S is a convex set and f: S → R is a convex function

• Often S is defined by linear equality constraints Ax = b and convex inequality constraints g(x) ≤ 0, with g: R^n → R^m convex

• Every local solution is also a global one (we will see this later)

• Efficient solution algorithms exist (we will see many later)

• Occurs frequently in problems in engineering, economics, and science

Excellent textbook: “Convex Optimization” (Boyd, Vandenberghe, 2004)


Polyhedra
Definition
Convex polyhedron = intersection of a finite set of half-spaces of R^n
Convex polytope = bounded convex polyhedron

• Hyperplane (H-)representation:

  P = {x ∈ R^n : Ax ≤ b}

• Vertex (V-)representation:

  P = {x ∈ R^n : x = Σ_{i=1}^q αi vi + Σ_{j=1}^p βj rj,  αi, βj ≥ 0,  Σ_{i=1}^q αi = 1,  vi, rj ∈ R^n}

  vi = vertex, rj = extreme ray; when q = 0 the polyhedron is a cone

Convex hull = transformation from V- to H-representation
Vertex enumeration = transformation from H- to V-representation
Linear programming
• Linear programming (LP) problem:

  min c′x
  s.t. Ax ≤ b
       Ex = f,  x ∈ R^n

  [Portrait: George Dantzig (1914–2005)]

• LP in standard form:

  min c′x
  s.t. Ax = b
       x ≥ 0,  x ∈ R^n

• Conversion to standard form:

  1. introduce slack variables:

     Σ_{j=1}^n aij xj ≤ bi   ⇒   Σ_{j=1}^n aij xj + si = bi,  si ≥ 0

  2. split the positive and negative part of x:

     Σ_{j=1}^n aij xj + si = bi, xj free, si ≥ 0   ⇒   Σ_{j=1}^n aij (xj⁺ − xj⁻) + si = bi,  xj⁺, xj⁻, si ≥ 0
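As an illustrative sketch (the matrices below are assumptions, not from the slides), the conversion of min c′x s.t. Ax ≤ b, x free, into standard form can be done mechanically:

import numpy as np

# inequality-form LP data (illustrative)
A = np.array([[1., 3.], [3., 2.]])
b = np.array([200., 160.])
c = np.array([5., 20.])
m, n = A.shape

# variables of the standard form: z = [x+, x-, s]
A_std = np.hstack((A, -A, np.eye(m)))       # A(x+ - x-) + s = b
c_std = np.concatenate((c, -c, np.zeros(m)))
# standard form: min c_std' z  s.t.  A_std z = b,  z >= 0
print(A_std, b, c_std, sep="\n")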


Quadratic programming (QP)
• Quadratic programming (QP) problem:

  min (1/2) x′Qx + c′x
  s.t. Ax ≤ b
       Ex = f,  x ∈ R^n

• Convex optimization problem if Q ≽ 0 (Q = positive semidefinite matrix)⁴

• Without loss of generality, we can assume Q = Q′:

  (1/2) x′Qx = (1/2) x′( (Q+Q′)/2 + (Q−Q′)/2 )x = (1/2) x′( (Q+Q′)/2 )x + (1/4) x′Qx − (1/4) x′Q′x = (1/2) x′( (Q+Q′)/2 )x

• Hard problem if Q ⋡ 0 (Q = indefinite matrix)

⁴ A matrix P ∈ R^{n×n} is positive semidefinite (P ≽ 0) if x′Px ≥ 0 for all x. It is positive definite (P ≻ 0) if in addition x′Px > 0 for all x ≠ 0. It is negative (semi)definite (P ≺ 0, P ≼ 0) if −P is positive (semi)definite. It is indefinite otherwise.


Continuous vs Discrete Optimization

• In some problems the optimization variables can only take integer values.
We call x ∈ Z an integrality constraint

• A special case is x ∈ {0, 1} (binary constraint)

• When all variables are integer (or binary) the problem is an integer
programming problem (a special case of discrete optimization)

• In a mixed integer programming (MIP) problem some of the variables are real
(xi ∈ R), some are discrete/binary (xi ∈ Z or xi ∈ {0, 1})

Optimization problems with integer variables are more difficult to solve



Mixed-integer programming (MIP)
mixed-integer linear program (MILP):

  min c′x
  s.t. Ax ≤ b,  x = [xc; xb],  xc ∈ R^{nc}, xb ∈ {0,1}^{nb}

mixed-integer quadratic program (MIQP):

  min (1/2) x′Qx + c′x
  s.t. Ax ≤ b,  x = [xc; xb],  xc ∈ R^{nc}, xb ∈ {0,1}^{nb}

• Some variables are real, some are binary (0/1)

• MILP and MIQP are NP-hard problems, in general

• Many good solvers are available (CPLEX, Gurobi, GLPK, Xpress-MP, CBC, ...). For comparisons see http://plato.la.asu.edu/bench.html
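A minimal MILP sketch with SciPy (the problem data are illustrative assumptions; scipy.optimize.milp requires SciPy ≥ 1.9):

import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# min c'x  s.t.  x1 + x2 <= 1.5,  x1 >= 0 continuous,  x2 in {0,1}
c = np.array([-1.0, -2.0])
cons = LinearConstraint(np.array([[1.0, 1.0]]), -np.inf, 1.5)
integrality = np.array([0, 1])           # 0 = continuous, 1 = integer
bounds = Bounds([0, 0], [np.inf, 1])     # x2 binary via integrality + bounds

res = milp(c=c, constraints=cons, integrality=integrality, bounds=bounds)
print(res.x, res.fun)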


Stochastic and Robust Optimization
• Relations affected by random quantities lead to stochastic models:

  min_x E_w[f(x, w)]

• The model is enriched by information about the probability distribution of w

• Other stochastic measures can be minimized (variance, conditional value-at-risk, ...)

• The deterministic version min_x f(x, E_w[w]) of the problem only considers the expected value of w, not its entire distribution. If f is convex w.r.t. w then f(x, E_w[w]) ≤ E_w[f(x, w)]

• Chance constraints are constraints enforced only in probability:

  prob(g(x, w) ≤ 0) ≥ 99%

• Robust constraints are constraints that must always be satisfied:

  g(x, w) ≤ 0, ∀w


Dynamic Optimization
• Dynamic optimization involves decision variables that evolve over time

• Example: for a given value of x0 we want to optimize

  min_{x,u} Σ_{t=0}^{N−1} (xt² + ut²)
  s.t. x_{t+1} = a xt + b ut

  where ut is the control value (to be decided) and xt the state at time t. The decision variables are

  u = [u0; ...; u_{N−1}],  x = [x1; ...; xN]

• Heavily used to solve optimal control problems, such as in model predictive control
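A small CVXPY sketch of this finite-horizon problem (the values of a, b, N, x0 are illustrative assumptions):

import cvxpy as cp

a, b, N, x0 = 0.9, 0.5, 20, 1.0      # illustrative model and horizon
x = cp.Variable(N + 1)
u = cp.Variable(N)

cost = cp.sum_squares(x[:N]) + cp.sum_squares(u)    # sum of x_t^2 + u_t^2
cons = [x[0] == x0] + [x[t+1] == a*x[t] + b*u[t] for t in range(N)]
prob = cp.Problem(cp.Minimize(cost), cons)
prob.solve()
print("optimal cost:", prob.value)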


Optimization algorithm
• An optimization algorithm is a procedure to find an optimizer x* of a given optimization problem min_{x∈X} f(x)

• It is usually iterative: starting from an initial guess x0 of x, it generates a sequence xk of “iterates”, with hopefully xN ≈ x* after N iterations

• Good optimization algorithms should possess the following properties:

  – Robustness = perform well on a wide variety of problems in their class, for all reasonable values of the initial guess x0

  – Efficiency = do not require excessive CPU time/flops and memory allocation

  – Accuracy = find a solution close to the optimal one, without being affected by rounding errors due to the finite-precision arithmetic of the CPU

• The above are often conflicting properties


Optimization taxonomy

It is difficult to provide a taxonomy of optimization because many of the subfields have multiple links. Shown here is one perspective, focused mainly on the subfields of deterministic optimization with a single objective function.

[Figure: NEOS optimization taxonomy tree. Optimization splits into Uncertainty (stochastic programming, robust optimization), Deterministic (continuous and discrete), and Multiobjective Optimization. The continuous branch covers unconstrained problems (nonlinear equations, nonlinear least squares, global optimization, nondifferentiable optimization) and constrained ones (bound constrained, linearly constrained, linear, quadratic, quadratically-constrained quadratic, second-order cone, semidefinite, and semiinfinite programming, nonlinear programming, network optimization, mathematical programs with equilibrium constraints, mixed-integer nonlinear programming, derivative-free optimization, complementarity problems); the discrete branch covers integer programming and combinatorial optimization.]

https://neos-guide.org/content/optimization-taxonomy
Optimization software
• Comparison on benchmark problems: http://plato.la.asu.edu/bench.html

• Taxonomy of many solvers for different classes of optimization problems: http://www.neos-guide.org

• NEOS server for remotely solving optimization problems: http://www.neos-server.org

• Good open-source optimization software: http://www.coin-or.org/

• GitHub, MATLAB Central, Google, ...


Optimization model

• An optimization model is a mathematical model that captures the objective function to minimize and the constraints imposed on the optimization variables

• It is a quantitative model: the decision problem must be formulated as a set of mathematical relations involving the optimization variables


Formulating an optimization model
Steps required to formulate an optimization model that solves a given
decision problem:

1. Talk to the domain expert to understand the problem we want to solve

2. Single out the optimization variables xi (what are we able to decide?) and their
domain (real, binary, integer)

3. Treat the remaining variables as parameters (=data that affect the problem but
are not part of the decision process)

4. Translate the objective(s) into a cost function of x to minimize (or maximize)

5. Are there constraints on the decision variables? If yes, translate them into (in)equalities involving x

6. Make sure we have all the required data available


Formulating an optimization model

[Figure: modeling workflow — real problem → modeling → optimization model (min_x f(x) s.t. g(x) ≤ 0) → solver → solution x* → analysis of the solution, iterating back to the model as needed]

• It may take several iterations to formulate the optimization model properly, as:

  – A solution does not exist (anything wrong in the constraints?)

  – The solution does not make sense (is any constraint missing or wrong?)

  – The optimal value does not make sense (is the cost function properly defined?)

  – It takes too long to find the solution (can we simplify the model?)


Example: Chess set problem

(Guerét et al., Applications of Optimization with XpressMP, 1999)

A small joinery makes two different sizes of boxwood chess sets. The small set requires 3 hours of machining on a lathe, and the large set requires 2 hours. There are four lathes with skilled operators who each work a 40-hour week, so we have 160 lathe-hours per week. The small chess set requires 1 kg of boxwood, and the large set requires 3 kg. Unfortunately, boxwood is scarce and only 200 kg per week can be obtained. When sold, each of the large chess sets yields a profit of $20, and each of the small chess sets a profit of $5.

The problem is to decide how many sets of each kind should be made each week so as to maximize profit.



Example: Chess set problem

• Optimization variables: xs, xℓ = produced quantities of small/large chess sets

• Cost function: f(x) = 5xs + 20xℓ (profit)

• Constraints:
  – 3xs + 2xℓ ≤ 4·40 (maximum lathe-hours)
  – xs + 3xℓ ≤ 200 (available kg of boxwood)
  – xs, xℓ ≥ 0 (produced quantities cannot be negative)

  max 5xs + 20xℓ
  s.t. [3 2; 1 3] [xs; xℓ] ≤ [160; 200]
       xs, xℓ ≥ 0


Example: Chess set problem

• What is the best decision? Let us make some guesses:

      xs   xl   Lathe-hours  Boxwood  OK?   Profit  Notes
  A    0    0        0          0     Yes       0   Unprofitable!
  B   10   10       50         40     Yes     250   We won't get rich doing this.
  C  -10   10      -10         20     No      150   Planning to make a negative number of small sets.
  D   53    0      159         53     Yes     265   Uses all the lathe-hours. There is spare boxwood.
  E   50   20      190        110     No      650   Uses too many lathe-hours.
  F   25   30      135        115     Yes     725   There are spare lathe-hours and spare boxwood.
  G   12   62      160        198     Yes    1300   Uses all the resources.
  H    0   66      130        198     Yes    1320   Looks good. There are spare resources.

• What is the best solution? A numerical solver provides the following solution:

  xs* = 0, xℓ* = 66.6666  ⇒  f(x*) = 1333.3 $
Optimization models
• Optimization models, like all mathematical models, are never an exact representation of reality, but a good approximation of it

• We need to make working assumptions, for example:

  – Lathe-hours are never more than 160
  – Available wood is exactly 200 kg
  – Prices are constant
  – We sell all chess sets

• There are usually many different models for the same real problem

Optimization modeling is an art
©2020 A. Bemporad - Numerical Optimization 40/101


Optimization models
• Numerical complexity (and even the ability to find an optimal solution) depends on the optimization model we have formulated

• Good optimization models must be

  – Descriptive enough to capture the most significant aspects of the decision problem (variables, costs, constraints)

  – Simple enough to be able to solve the resulting optimization problem

  (a trade-off!)

“Make everything as simple as possible, but not simpler.” (Albert Einstein)


Optimization models - Benefits

• An optimization project provides rational (quantitative) decisions based on the available information

• Even if the solution is not actually applied, it provides a good suggestion to the decision maker (you should only make large chess sets)

• Making the model and analyzing the solution allows a better understanding of the problem at hand (small chess sets are not profitable)

• We can analyze the robustness of the solution with respect to variations in the data (big changes of solution for small changes of prices?)


Modeling languages for optimization problems

• AMPL (A Modeling Language for Mathematical Programming): the most widely used modeling language, supports several solvers

• OPL (Optimization Programming Language), associated with the commercial package IBM CPLEX

• MOSEL, associated with the commercial package FICO Xpress

• GAMS (General Algebraic Modeling System), one of the first modeling languages

• LINGO, modeling language of Lindo Systems Inc.

• GNU MathProg, a subset of AMPL associated with the free package GLPK (GNU Linear Programming Kit)


Modeling languages for optimization problems
• YALMIP: MATLAB-based modeling language

• CVX (CVXPY): modeling language for convex problems in MATLAB (Python)

• CasADi + IPOPT: nonlinear modeling + automatic differentiation, plus a nonlinear programming solver (MATLAB, Python, C++)

• Optimization Toolbox’s modeling language (part of MATLAB since R2017b)

• PYOMO: Python-based modeling language

• GEKKO: Python-based mixed-integer nonlinear modeling language

• PuLP: a linear programming modeler for Python

• JuMP: a modeling language for linear, quadratic, and nonlinear constrained optimization problems embedded in Julia


Example: Chess set problem

• Model and solve the problem using YALMIP (Löfberg, 2004)

xs = sdpvar(1,1);
xl = sdpvar(1,1);

Constraints = [3*xs+2*xl <= 4*40, 1*xs+3*xl <= 200, xs >= 0, xl >= 0];
Profit = 5*xs+20*xl;

optimize(Constraints,-Profit)

value(xs), value(xl), value(Profit)


Example: Chess set problem
• Model and solve the problem using CVX (Grant, Boyd, 2013)

cvx_clear
cvx_begin
variable xs(1)
variable xl(1)

Profit = 5*xs+20*xl;

maximize Profit

subject to
3*xs+2*xl <= 4*40; % maximum lathe-hours
1*xs+3*xl <= 200; % available kg of boxwood
xs>=0;
xl>=0;
cvx_end

xs,xl,Profit



Example: Chess set problem
• Model and solve the problem using CASADI + IPOPT
(Andersson, Gillis, Horn, Rawlings, Diehl, 2018) (Wächter, Biegler, 2006)

import casadi.*
xs=SX.sym('xs');
xl=SX.sym('xl');

Profit = 5*xs+20*xl;
Constraints = [3*xs+2*xl-4*40; 1*xs+3*xl-200];

prob=struct('x',[xs;xl],'f',-Profit,'g',Constraints);
solver = nlpsol('solver','ipopt', prob);
res = solver('lbx',[0;0],'ubg',[0;0]);

Profit = -res.f;
xs = res.x(1);
xl = res.x(2);



Example: Chess set problem

• Model and solve the problem using Optimization Toolbox (The Mathworks, Inc.)

xs=optimvar('xs','LowerBound',0);
xl=optimvar('xl','LowerBound',0);

Profit = 5*xs+20*xl;
C1 = 3*xs+2*xl-4*40<=0;
C2= 1*xs+3*xl-200<=0;

prob=optimproblem('Objective',Profit,'ObjectiveSense','max');
prob.Constraints.C1=C1;
prob.Constraints.C2=C2;

[sol,Profit] = solve(prob);

xs=sol.xs;
xl=sol.xl;



Example: Chess set problem
• In this case the optimization model is very simple and we can directly code the LP problem in plain MATLAB or Python:

In MATLAB:

A=[1 3;3 2];
b=[200;160];
c=[5 20];
[xopt,fopt]=linprog(-c,A,b,[],[],[0;0])

In Python:

import numpy as np
from scipy.optimize import linprog
A=np.array([[1,3],[3,2]])
b=np.array([200,160])
c=np.array([5,20])
sol=linprog(-c, A_ub=A, b_ub=b, bounds=(0,None))

• The Hybrid Toolbox for MATLAB contains interfaces to various solvers for LP, QP, MILP, MIQP (http://cse.lab.imtlucca.it/~bemporad/hybrid/toolbox) (Bemporad, 2003–today)

• However, when there are many variables and constraints, forming the problem matrices manually can be very time-consuming and error-prone


Example: Chess set problem

• We can even model and solve the optimization problem in Excel:

[Figure: Excel sheet with the optimization variables in cells B6:C6 and the cost function =SUMPRODUCT(B6:C6;B2:C2)]


Linear optimization models

Reference:

C. Guéret, C. Prins, M. Sevaux, “Applications of Optimization with Xpress-MP,” translated and revised by S. Heipcke, 1999

Optimization modeling: linear constraints

• Constraints define the set where to look for an optimal solution

• They define relations between the decision variables

• When formulating an optimization model we must disaggregate the restrictions appearing in the decision problem into subsets of constraints that we know how to model

• There are many types of constraints we know how to model ...


1. Upper and lower bounds (box constraints)
• Box constraints are the simplest constraints: they define upper and lower bounds on the decision variables

  ℓi ≤ xi ≤ ui,  ℓi ∈ R ∪ {−∞}, ui ∈ R ∪ {+∞}

[Figure: box-shaped admissible set [ℓ1, u1] × [ℓ2, u2] in the (x1, x2) plane]

• Example: ``We cannot sell more than 100 units of Product A''

• Pay attention: some solvers assume nonnegative variables by default!

• When ℓi = ui the constraint becomes xi = ℓi and variable xi becomes redundant. Still, it may be worthwhile keeping it in the model


2. Flow constraints

• Flow constraints arise when an item can be divided into different streams, or vice versa when many streams come together:

  Σ_{i=1}^n xi ≤ Fmax

[Figure: streams x1, x2, x3 merging into (or splitting from) a total flow]

• Example: ``I can get water from 3 suppliers, S1, S2 and S3. I want to have at least 1000 liters available.''   x1 + x2 + x3 ≥ 1000

• Example: ``I have 50 trucks available to rent to 3 customers C1, C2 and C3.''   x1 + x2 + x3 ≤ 50

• Losses can be included as well: ``2% of the water I get from the suppliers gets lost.''   0.98x1 + 0.98x2 + 0.98x3 ≥ 1000


3. Resource constraints
• Resource constraints take into account that a given resource is limited:

  Σ_{i=1}^n Rji xi ≤ Rmax,j

• The technological coefficients Rji denote the amount of resource j used per unit of activity i

• Example:
  ``Small chess sets require 1 kg of boxwood, the large ones 3 kg; the total available is 200 kg.''   x1 + 3x2 ≤ 200
  ``Small chess sets require 3 lathe-hours, the large ones 2 h; the total time is 4×40 h.''   3x1 + 2x2 ≤ 160

  R = [1 3; 3 2],  Rmax = [200; 160]


4. Balance constraints

• Balance constraints model the fact that “what goes out must in total equal what comes in”:

  Σ_{i=1}^N xi^out = Σ_{i=1}^M xi^in + L

[Figure: incoming streams x1^in, x2^in, x3^in plus stock L balancing outgoing streams x1^out, x2^out]

• Example: ``I have 100 tons of steel and can buy more from suppliers 1, 2, 3 to serve customers A and B.''   xA + xB = 100 + x1 + x2 + x3

• Balance can occur between time periods in a multi-period model

• Example: ``The cash I'll have tomorrow is what I have now plus what I receive minus what I spend today.''   x_{t+1} = xt + ut − yt


5. Quality constraints
• Quality constraints are requirements on the average percentage of a certain quality when blending several components:

  ( Σ_{i=1}^N αi xi ) / ( Σ_{i=1}^N xi ) ≥ pmin   ⇒   Σ_{i=1}^N αi xi ≥ pmin Σ_{i=1}^N xi

• Example: ``The average risk of an investment in assets A, B, C, which have risks 25%, 5%, and 12% respectively, must be smaller than 10%.''

  (0.25xA + 0.05xB + 0.12xC) / (xA + xB + xC) ≤ 0.1

• The nonlinear quality constraint is converted to a linear one under the assumption that xi ≥ 0 (if xi = 0 ∀i the constraint becomes redundant)

Objectives and constraints can often be simplified by mathematical transformations and/or by adding extra variables


6. Accounting variables and constraints

• It is often useful to add extra accounting variables:

  y = Σ_{i=1}^N xi   (accounting constraint)

• Of course we can replace y with Σ_{i=1}^N xi everywhere in the model (condensed form), but this would make it less readable

• Moreover, keeping y in the model (non-condensed form) may preserve some structural properties that the solver could exploit

• Example: ``The profit in any given year is the difference between revenues and expenditures.''   pt = rt − et


7. Blending constraints

• Blending constraints occur when we want to blend a set of ingredients xi in given percentages αi in the final product:

  xi / ( Σ_{j=1}^N xj ) = αi

• As with quality constraints, blending constraints can be converted to linear equality constraints:

  xi = αi Σ_{j=1}^N xj


8. Soft constraints
• The constraints we have seen so far are hard constraints, i.e., they cannot be violated

• Soft constraints are a relaxation, in which the constraint can be violated, usually paying a penalty:

  Σ_{i=1}^N aij xi ≤ bj   →   Σ_{i=1}^N aij xi ≤ bj + ϵj

• We call the new variable ϵj a panic variable: it should normally be zero, but it can assume a positive value in case there is no way to fulfill the constraint set

• Example: ``Only 200 kg of boxwood are available to make chess sets, but we can buy extra for 6 $/kg.''

  max_{xs, xℓ, ϵ≥0}  5xs + 20xℓ − 6ϵ
  s.t.  xs + 3xℓ ≤ 200 + ϵ
        3xs + 2xℓ ≤ 160
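A quick SciPy sketch of this softened chess-set LP (the decision vector is assumed to be stacked as [xs, xℓ, ϵ]):

import numpy as np
from scipy.optimize import linprog

# variables z = [xs, xl, eps]; maximize 5xs + 20xl - 6eps
c = np.array([-5.0, -20.0, 6.0])          # linprog minimizes, so negate profit
A = np.array([[1.0, 3.0, -1.0],           # xs + 3xl - eps <= 200
              [3.0, 2.0,  0.0]])          # 3xs + 2xl      <= 160
b = np.array([200.0, 160.0])

res = linprog(c, A_ub=A, b_ub=b, bounds=(0, None))
xs, xl, eps = res.x
print(f"xs={xs:.2f}, xl={xl:.2f}, eps={eps:.2f}, profit={-res.fun:.2f}")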


Example: Production of alloys
(Guerét et al., Applications of Optimization with XpressMP, 1999)

The company Steel has received an order for 500 tonnes of steel to be used in shipbuilding. This steel must have certain characteristics (``grades''). The company has seven different raw materials in stock that may be used for the production of this steel. The objective is to determine the composition of the steel that minimizes the production cost.



Example: Production of alloys
• Available data on the characteristics of the steel ordered:

  chemical element    minimum grade   maximum grade
  Carbon (C)               2               3
  Copper (Cu)              0.4             0.6
  Manganese (Mn)           1.2             1.65

• Available data on grades, available amounts, and prices:

  raw material       C %   Cu %   Mn %   availability [t]   cost [€/t]
  Iron alloy 1       2.5    0     1.3         400              200
  Iron alloy 2       3      0     0.8         300              250
  Iron alloy 3       0      0.3   0           600              150
  Copper alloy 1     0     90     0           500              220
  Copper alloy 2     0     96     4           200              240
  Aluminum alloy 1   0      0.4   1.2         300              200
  Aluminum alloy 2   0      0.6   0           250              165


Example: Production of alloys

• “The company has seven different raw materials in stock''

  – We need 7 optimization variables x1, ..., x7
  – Their upper and lower bounds are 0 ≤ xj ≤ Rj, where Rj = tons available of raw material j

• “The company Steel has received an order for 500 tonnes of steel''

  – flow constraint y = Σ_{j=1}^7 xj (produced metal)
  – lower bound y ≥ 500 (quantity that needs to be produced)


Example: Production of alloys
• “This steel must have certain characteristics''

  Let gimin, gimax be the minimum/maximum grade of chemical element i (i = 1: C, i = 2: Cu, i = 3: Mn), and Pji the percentage of element i contained in raw material j (see the tables above).

  Quality constraints on grades for each chemical element i = 1, 2, 3:

  gimin ≤ ( Σ_{j=1}^7 Pji xj ) / ( Σ_{j=1}^7 xj ) ≤ gimax   ⟺   Σ_{j=1}^7 Pji xj ≤ gimax y  and  Σ_{j=1}^7 Pji xj ≥ gimin y


Example: Production of alloys
• “The objective is to determine the composition of the steel that minimizes the production cost''

  The cost function to minimize is Σ_{j=1}^7 cj xj

• The complete optimization model is finally

  min_{x,y}  Σ_{j=1}^7 cj xj
  s.t.  0 ≤ xj ≤ Rj,  j = 1, ..., 7
        y = Σ_{j=1}^7 xj
        y ≥ 500
        Σ_{j=1}^7 Pji xj ≤ gimax y,  i = 1, 2, 3
        Σ_{j=1}^7 Pji xj ≥ gimin y,  i = 1, 2, 3

• The problem is a linear programming problem. The solution is

  x1* = 400, x3* = 39.7763, x5* = 2.7613, x6* = 57.4624, x2* = x4* = x7* = 0 [t]

  with optimal production cost f(x*) = 98122 €
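For completeness, a CVXPY sketch of the same LP, with the data entered from the tables above (the loop over grade constraints is an implementation choice, not from the slides):

import numpy as np
import cvxpy as cp

# rows = raw materials, columns = C, Cu, Mn percentages (from the table)
P = np.array([[2.5, 0, 1.3], [3, 0, 0.8], [0, 0.3, 0], [0, 90, 0],
              [0, 96, 4], [0, 0.4, 1.2], [0, 0.6, 0]], dtype=float)
R = np.array([400, 300, 600, 500, 200, 300, 250], dtype=float)  # availability [t]
c = np.array([200, 250, 150, 220, 240, 200, 165], dtype=float)  # cost [EUR/t]
gmin = np.array([2, 0.4, 1.2])
gmax = np.array([3, 0.6, 1.65])

x = cp.Variable(7)
y = cp.sum(x)
cons = [x >= 0, x <= R, y >= 500]
for i in range(3):  # grade constraints per chemical element
    cons += [P[:, i] @ x <= gmax[i] * y, P[:, i] @ x >= gmin[i] * y]
prob = cp.Problem(cp.Minimize(c @ x), cons)
prob.solve()
print(x.value, prob.value)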


Example: Production of alloys

• Model and solve the problem using YALMIP (Löfberg, 2004)

x = sdpvar(7,1);
y = sum(x);

Constraints = [x>=0, x<=R, y>=500, ...
    P'*x<=B(:,2)*y, P'*x>=B(:,1)*y];

cost = sum(c.*x);

optimize(Constraints,cost)

value(x), value(cost)


Example: Production of alloys
• Model and solve the problem using CVX (Grant, Boyd, 2013)

cvx_clear
cvx_begin
variable x(7)

cost = sum(c.*x);

minimize cost

subject to
y=sum(x); y>=500;
x>=0; x<=R;
P'*x<=B(:,2)*y;
P'*x>=B(:,1)*y;

cvx_end

x,cost



Linear objective functions

• Linear programs only allow minimizing a linear combination of the optimization variables

• However, by introducing new variables, we can minimize any convex piecewise affine (PWA) function

Result
Every convex piecewise affine function ℓ: R^n → R can be represented as the max of affine functions, and vice versa (Schechter, 1987)

Example:  ℓ(x) = max {a1′x + b1, ..., a4′x + b4}

[Figure: a convex PWA function ℓ(x) as the pointwise maximum of four affine functions ai′x + bi]


Convex PWA optimization problems and LP
• Minimization of a convex PWA function ℓ(x):

  min_{ϵ,x}  ϵ
  s.t.  ϵ ≥ a1′x + b1
        ϵ ≥ a2′x + b2
        ϵ ≥ a3′x + b3
        ϵ ≥ a4′x + b4

• By construction ϵ ≥ max{a1′x + b1, a2′x + b2, a3′x + b3, a4′x + b4}

• By contradiction it is easy to show that at the optimum we have that ϵ = max{a1′x + b1, a2′x + b2, a3′x + b3, a4′x + b4}

• Convex PWA constraints ℓ(x) ≤ 0 can be handled similarly by imposing ai′x + bi ≤ 0, ∀i = 1, 2, 3, 4
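A tiny SciPy sketch of this epigraph trick (the four affine pieces below are illustrative assumptions):

import numpy as np
from scipy.optimize import linprog

# minimize l(x) = max_i (a_i x + b_i), x scalar, via variables z = [x, eps]
a = np.array([-2.0, -0.5, 0.4, 3.0])
b = np.array([-1.0, 0.0, 0.2, -2.0])

# eps >= a_i x + b_i   <=>   a_i x - eps <= -b_i
A_ub = np.column_stack((a, -np.ones_like(a)))
b_ub = -b
c = np.array([0.0, 1.0])                   # minimize eps only

res = linprog(c, A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None), (None, None)])  # both variables free
print("x* =", res.x[0], " l(x*) =", res.x[1])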


1. Minmax objective
• Minmax objective: we want to minimize the maximum among M given linear objectives fi(x) = ai′x + bi:

  min_x max_{i=1,...,M} {fi(x)}   s.t. linear constraints

• Example: asymmetric cost min_x max{a′x + b, 0}

• Example: minimize the ∞-norm

  min_x ‖Ax − b‖∞

  where ‖v‖∞ ≜ max_{i=1,...,m} |vi| and A ∈ R^{m×n}, b ∈ R^m. This corresponds to

  min_x max{A1x − b1, −A1x + b1, ..., Amx − bm, −Amx + bm}


2. Minimize the sum of max objectives
• We want to minimize the sum of maxima among given linear objectives fij(x) = aij′x + bij:

  min_x Σ_{j=1}^N max_{i=1,...,Mj} {fij(x)}   s.t. linear constraints

• The equivalent reformulation is

  min_{ϵ,x}  Σ_{j=1}^N ϵj
  s.t.  ϵj ≥ aij′x + bij,  i = 1, ..., Mj,  j = 1, ..., N
        (other linear constraints)

• Example: minimize the 1-norm

  min_x ‖Ax − b‖₁

  where ‖v‖₁ ≜ Σ_{i=1,...,m} |vi| and A ∈ R^{m×n}, b ∈ R^m, which corresponds to

  min_x Σ_{i=1}^m max{Aix − bi, −Aix + bi}
3. Linear-fractional program
• We want to minimize a ratio of linear objectives:

  min_x  (c′x + d) / (e′x + f)
  s.t.  Ax ≤ b
        Gx = h

  over the domain e′x + f > 0

• We introduce the new variable z = 1 / (e′x + f) and replace xi with the new variables yi = z xi, i = 1, ..., n, where

  1 = z(e′x + f) = e′y + fz,  z ≥ 0

• Since z ≥ 0, we have zAx ≤ zb, and the original problem is translated into the LP

  min_{z,y}  c′y + dz
  s.t.  Ay − bz ≤ 0
        Gy = hz
        e′y + fz = 1
        z ≥ 0

  from which we recover x* = (1/z*) y* in case z* > 0
Chebychev center of a polyhedron
• The Chebychev center of a polyhedron P = {x : Ax ≤ b} is the center x* of the largest ball B(x*, r*) = {x : x = x* + u, ‖u‖₂ ≤ r*} contained in P

• The radius r* is called the Chebychev radius of P

• A ball B(x, r) is included in P if and only if

  sup_{‖u‖₂≤r} Ai(x + u) = Aix + r‖Ai‖₂ ≤ bi,  ∀i = 1, ..., m

  where A ∈ R^{m×n}, b ∈ R^m, and Ai is the i-th row of A

• Therefore, we can compute the Chebychev center/radius by solving the LP

  max_{x,r}  r
  s.t.  Aix + r‖Ai‖₂ ≤ bi,  i = 1, ..., m
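A small SciPy sketch of this LP (the polyhedron below, a triangle, is an illustrative assumption):

import numpy as np
from scipy.optimize import linprog

# triangle {x : Ax <= b} (illustrative)
A = np.array([[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([1.0, 0.0, 0.0])

norms = np.linalg.norm(A, axis=1)
# variables z = [x1, x2, r]; maximize r  <=>  minimize -r
A_ub = np.hstack((A, norms[:, None]))      # A_i x + r ||A_i||_2 <= b_i
c = np.array([0.0, 0.0, -1.0])

res = linprog(c, A_ub=A_ub, b_ub=b,
              bounds=[(None, None), (None, None), (0, None)])
print("center:", res.x[:2], "radius:", res.x[2])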


Convex optimization models

References:

S. Boyd, L. Vandenberghe, “Convex Optimization,” 2004

S. Boyd, “Convex Optimization,” lecture notes, http://ee364a.stanford.edu, http://ee364b.stanford.edu
Convex sets
• Convex set: a set S ⊆ R^n is convex if for all x1, x2 ∈ S

  λx1 + (1 − λ)x2 ∈ S,  ∀λ ∈ [0, 1]

• The convex hull of N points x̄1, ..., x̄N is the set of all their convex combinations:

  S = {x ∈ R^n : ∃λ ∈ R^N : x = Σ_{i=1}^N λi x̄i,  λi ≥ 0,  Σ_{i=1}^N λi = 1}

• The convex cone of N points x̄1, ..., x̄N is the set

  S = {x ∈ R^n : ∃λ ∈ R^N : x = Σ_{i=1}^N λi x̄i,  λi ≥ 0}


Convex sets
• The set {x : a′x = b}, a ≠ 0, is called a hyperplane

• The set {x : a′x ≤ b}, a ≠ 0, is called a halfspace

• The set P = {x : Ax ≤ b, Ex = f} is called a polyhedron

• The set B(x0, r) = {x : ‖x − x0‖₂ ≤ r} = {x0 + ry : ‖y‖₂ ≤ 1} is called a (Euclidean) ball

• An ellipsoid is the set E = {x : (x − x0)′P(x − x0) ≤ 1} with P = P′ ≻ 0, or equivalently E = {x0 + Ay : ‖y‖₂ ≤ 1}, A square and det A ≠ 0

• Hyperplanes, halfspaces, polyhedra, balls, and ellipsoids are all convex sets


Properties of convex sets

• The intersection of (any number of) convex sets is convex

• The image of a convex set under an affine function f(x) = Ax + b (A ∈ R^{m×n}, b ∈ R^m) is convex:

  S ⊆ R^n convex ⇒ f(S) = {y : y = f(x), x ∈ S} convex

  For example: scaling (A diagonal, b = 0), translation (A = I, b ≠ 0), projection (A = [I 0], b = 0, i.e., f(S) = {(x1, ..., xi) : x ∈ S})


Convex functions

• Recall: f: S → R is a convex function if S is convex and

  f(λx1 + (1 − λ)x2) ≤ λf(x1) + (1 − λ)f(x2),  ∀x1, x2 ∈ S, λ ∈ [0, 1]   (Jensen’s inequality)

• Sublevel sets Cα of convex functions are convex sets (but not vice versa):

  Cα = {x ∈ dom f : f(x) ≤ α}

• Therefore linear equality constraints Ax = b and inequality constraints g(x) ≤ 0, with g a convex (vector) function, define a convex set


Convex functions
• Examples of convex functions:
  – affine: f(x) = a′x + b, for any a ∈ R^n, b ∈ R
  – exponential: f(x) = e^{ax}, x ∈ R, for any a ∈ R
  – power: f(x) = x^α, x > 0, for any α ≥ 1 or α ≤ 0. Example: x², 1/x for x > 0
  – powers of absolute value: f(x) = |x|^p, x ∈ R, for p ≥ 1
  – negative entropy: f(x) = x log x, x > 0
  – any norm: f(x) = ‖x‖
  – maximum: f(x) = max(x1, ..., xn)

• Examples of concave functions:
  – affine: f(x) = a′x + b, for any a ∈ R^n, b ∈ R
  – logarithm: f(x) = log x, x > 0
  – power: f(x) = x^α, x ≥ 0, for any 0 ≤ α ≤ 1. Example: √x, x ≥ 0
  – minimum: f(x) = min(x1, ..., xn)


Convex functions

• Recall the first-order condition of convexity: f: R^n → R with convex domain dom f and differentiable is convex if and only if

  f(y) ≥ f(x) + ∇f(x)′(y − x),  ∀x, y ∈ dom f

[Figure: a convex f lies above its tangent f(x) + ∇f(x)′(y − x) at any x]

• Second-order condition: let f: R^n → R with convex domain dom f be twice differentiable and ∇²f(x) its Hessian matrix, [∇²f(x)]ij = ∂²f(x)/∂xi∂xj. Then f is convex if and only if

  ∇²f(x) ≽ 0,  ∀x ∈ dom f

  If ∇²f(x) ≻ 0 for all x ∈ dom f then f is strictly convex.


Checking convexity

1. Check directly whether the definition is satisfied (Jensen’s inequality)

2. Check if the Hessian matrix is positive semidefinite (only for twice differentiable functions)

3. Show that f is obtained by combining known convex functions via operations that preserve convexity


Calculus rules for convex functions
• nonnegative scaling: f convex, α ≥ 0 ⇒ αf convex
• sum: f, g convex ⇒ f + g convex
• affine composition: f convex ⇒ f(Ax + b) convex
• pointwise maximum: f1, ..., fm convex ⇒ max_i fi(x) convex
• composition: h convex increasing, f convex ⇒ h(f(x)) convex

General composition rule: h(f1(x), ..., fk(x)) is convex when h is convex and, for each i = 1, ..., k,
  – h is increasing in argument i and fi is convex, or
  – h is decreasing in argument i and fi is concave, or
  – fi is affine

See also dcp.stanford.edu (Diamond, 2014)


Convex programming
(Boyd, Vandenberghe, 2004)

• The optimization problem

  min f(x)
  s.t. g(x) ≤ 0,  g: R^n → R^m, gi convex
       Ax = b

or, more generally,

  min f(x)
  s.t. x ∈ S,  S convex set

with f: X → R convex, is a convex optimization problem, where X = {x ∈ R^n : g(x) ≤ 0, Ax = b} or, more generally, X = S

• Convex programs can be solved to global optimality and many efficient algorithms exist for this (we will see many later)

• Although convexity may sound like a restriction, it occurs very frequently in practice (sometimes after some transformations or approximations)


Disciplined convex programming
(Grant, Boyd, Ye, 2006)

• The objective function has the form
  – minimize a scalar convex expression, or
  – maximize a scalar concave expression

• Each of the constraints (if any) has the form
  – convex expression ≤ concave expression, or
  – concave expression ≥ convex expression, or
  – affine expression = affine expression

This framework is used in the CVX, CVXPY, and Convex.jl packages.


Least squares
• least squares (LS) problem:

  min ‖Ax − b‖₂²,    x* = (A′A)⁻¹A′b   ((A′A)⁻¹A′ = pseudoinverse of A)

  [Portraits: Adrien-Marie Legendre (1752–1833), J. Carl Friedrich Gauss (1777–1855)]

• nonnegative least squares (NNLS) (Lawson, Hanson, 1974):

  min ‖Ax − b‖₂²
  s.t. x ≥ 0

• bounded-variable least squares (BVLS) (Stark, Parker, 1995):

  min ‖Ax − b‖₂²
  s.t. ℓ ≤ x ≤ u

• constrained least squares:

  min ‖Ax − b‖₂²
  s.t. Ax ≤ b, Ex = f
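These variants map directly onto NumPy/SciPy routines; a hedged sketch with random data (an assumption):

import numpy as np
from scipy.optimize import nnls, lsq_linear

rng = np.random.default_rng(3)
A = rng.standard_normal((20, 4))
b = rng.standard_normal(20)

x_ls = np.linalg.lstsq(A, b, rcond=None)[0]       # plain LS
x_nnls = nnls(A, b)[0]                            # LS with x >= 0
x_bvls = lsq_linear(A, b, bounds=(-0.5, 0.5)).x   # LS with -0.5 <= x <= 0.5
print(x_ls, x_nnls, x_bvls, sep="\n")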


Quadratic programming
• The least-squares cost is a special case of a quadratic cost:

  (1/2)‖Ax − b‖₂² = (1/2) x′A′Ax − b′Ax + (1/2) b′b

• A generalization of constrained least squares is quadratic programming (QP):

  min (1/2) x′Qx + c′x
  s.t. Ax ≤ b,    Q = Q′ ≽ 0
       Ex = f

• If Q = L′L ≻ 0 we can complete the squares by setting y = Lx + (L⁻¹)′c and convert the QP into an LS problem:

  (1/2) x′Qx + c′x = (1/2) ‖Lx + (L⁻¹)′c‖₂² − (1/2) c′Q⁻¹c


Linear program with random cost = QP

• We want to solve the LP with random cost c:

  min c′x
  s.t. Ax ≤ b, Ex = f,    E[c] = c̄,  Var[c] = E[(c − c̄)(c − c̄)′] = Σ

• c′x is a random variable with expectation E[c′x] = c̄′x and variance Var[c′x] = x′Σx

• We want to trade off the expectation of c′x with its variance (= risk) with a risk-aversion coefficient γ ≥ 0

• This is equivalent to a QP:

  min E[c′x] + γ Var[c′x]        ⟺        min c̄′x + γ x′Σx
  s.t. Ax ≤ b, Ex = f                     s.t. Ax ≤ b, Ex = f
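A compact CVXPY sketch of the risk-averse LP (all data below, including the simplex constraint, are illustrative assumptions; note x′Σx = ‖Sx‖₂² when Σ = S′S):

import numpy as np
import cvxpy as cp

rng = np.random.default_rng(4)
n = 5
cbar = rng.standard_normal(n)
S = rng.standard_normal((n, n))   # Sigma = S'S is a valid covariance
gamma = 0.5

x = cp.Variable(n)
cons = [cp.sum(x) == 1, x >= 0]   # illustrative constraints
obj = cbar @ x + gamma * cp.sum_squares(S @ x)   # cbar'x + gamma x'Sigma x
cp.Problem(cp.Minimize(obj), cons).solve()
print(x.value)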


LASSO optimization = QP
(Tibshirani, 1996)

• The following ℓ1-penalized linear regression problem is called LASSO (least absolute shrinkage and selection operator):

  min_x (1/2)‖Ax − b‖₂² + λ‖x‖₁,    A ∈ R^{m×n}, b ∈ R^m

• The tuning parameter λ ≥ 0 determines the tradeoff between fitting Ax ≈ b (λ small) and making x sparse (λ large)

• By splitting x into the difference of its positive and negative parts, x = y − z, y, z ≥ 0, we get the positive semidefinite QP with 2n variables

  min_{y,z≥0} (1/2)‖A(y − z) − b‖₂² + λ1′(y + z)

  where 1′ = [1 ... 1]. At optimality at least one of yi*, zi* will be zero

• A small Tikhonov regularization σ(‖y‖₂² + ‖z‖₂²) makes the QP strictly convex
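A minimal CVXPY sketch of the split-variable QP formulation (random data as an assumption):

import numpy as np
import cvxpy as cp

rng = np.random.default_rng(5)
m, n, lam = 50, 100, 0.5
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

y = cp.Variable(n, nonneg=True)   # positive part of x
z = cp.Variable(n, nonneg=True)   # negative part of x
obj = 0.5 * cp.sum_squares(A @ (y - z) - b) + lam * cp.sum(y + z)
cp.Problem(cp.Minimize(obj)).solve()

x = y.value - z.value
print("nonzeros in x:", np.sum(np.abs(x) > 1e-6))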


LASSO - Example
• Solve the LASSO problem

  min_x (1/2)‖Ax − b‖₂² + λ‖x‖₁,    A ∈ R^{1000×3000}, b ∈ R^{1000}

• A, b = random matrices, A sparse with 3000 nonzero entries

• Problem solved by QP for different values of λ

• CPU time ranges from 8.5 ms to 1.17 s using OSQP (http://osqp.org)

[Figure: optimal cost and number of nonzero entries of x* as functions of λ ∈ [10⁻⁴, 10²]]


Quadratically constrained quadratic program (QCQP)

• If we add quadratic constraints to a QP we get the quadratically constrained quadratic program (QCQP):

  min  (1/2) x′Qx + c′x
  s.t. (1/2) x′Pi x + di′x + hi ≤ 0,  i = 1, ..., m
       Ax = b

• QCQP is a convex problem if Q, Pi ≽ 0, i = 1, ..., m

• If P1, ..., Pm ≻ 0 the feasible region X of the QCQP is the intersection of m ellipsoids and p hyperplanes (b ∈ R^p)

• Polyhedral constraints (halfspaces) are a special case when Pi = 0


Second-order cone programming
• A generalization of LP, QP, and QCQP is second-order cone programming (SOCP):

  min c′x
  s.t. ‖Fi x + gi‖₂ ≤ di′x + hi,  i = 1, ..., m
       Ax = b

  with Fi ∈ R^{ni×n}, A ∈ R^{p×n}

• If Fi = 0 the SOC constraint becomes a linear inequality constraint

• If di = 0 (hi ≥ 0) the SOC constraint becomes a quadratic constraint

• The quadratic constraint x′F′Fx + d′x + h ≤ 0 is equivalent to the SOC constraint

  ‖ [ Fx ; (1 + d′x + h)/2 ] ‖₂ ≤ (1 − d′x − h)/2


Example: Robust linear programming
(Boyd, Vandenberghe, 2004)

• We want to solve the LP with uncertain constraint coefficients ai:

  min c′x
  s.t. ai′x ≤ bi,  i = 1, ..., m

• Assume ai can be anything in the ellipsoid Ei = {āi + Pi y, ‖y‖₂ ≤ 1}, Pi ∈ R^{n×n}, where āi ∈ R^n is the center of Ei:

  min c′x
  s.t. ai′x ≤ bi, ∀ai ∈ Ei,  i = 1, ..., m

• The constraint is equivalent to sup_{ai∈Ei} {ai′x} ≤ bi, where

  sup_{ai∈Ei} {ai′x} = sup_{‖y‖₂≤1} {(āi + Pi y)′x} = āi′x + ‖Pi′x‖₂

• The original robust LP is therefore equivalent to the SOCP

  min c′x
  s.t. āi′x + ‖Pi′x‖₂ ≤ bi,  i = 1, ..., m
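A short CVXPY sketch of the robust counterpart (all problem data below, including the extra box bound that keeps the LP bounded, are illustrative assumptions):

import numpy as np
import cvxpy as cp

rng = np.random.default_rng(6)
n, m = 3, 5
c = rng.standard_normal(n)
abar = rng.standard_normal((m, n))
P = [0.1 * rng.standard_normal((n, n)) for _ in range(m)]
b = abar @ np.ones(n) + 1.0        # keep the nominal problem feasible

x = cp.Variable(n)
cons = [abar[i] @ x + cp.norm(P[i].T @ x, 2) <= b[i] for i in range(m)]
cons.append(cp.norm(x, "inf") <= 10)   # keep the LP bounded (illustrative)
prob = cp.Problem(cp.Minimize(c @ x), cons)
prob.solve()
print(x.value)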


Example: LP with random constraints
(Boyd, Vandenberghe, 2004)

• Assume ai Gaussian, ai ~ N(āi, Σi), Σi = LΣ′LΣ (LΣ = Σi^{1/2} if Σi is diagonal)

• For a given η ∈ [1/2, 1] we want to solve the LP with chance constraints

  min c′x
  s.t. prob(ai′x ≤ bi) ≥ η,  i = 1, ..., m

• Let α = ai′x − bi, ᾱ = āi′x − bi, σ̄² = x′Σi x. The cumulative distribution function (CDF) of α ~ N(ᾱ, σ̄²) is F(α) = Φ((α − ᾱ)/σ̄), with Φ(β) = (1/√(2π)) ∫_{−∞}^{β} e^{−t²/2} dt, so

  prob(ai′x − bi ≤ 0) = F(0) = Φ(−ᾱ/σ̄) = Φ( (bi − āi′x) / ‖LΣ x‖₂ ) ≥ η

• The original LP with random constraints is equivalent to the SOCP

  min c′x
  s.t. āi′x + Φ⁻¹(η)‖LΣ x‖₂ ≤ bi,  i = 1, ..., m

  where Φ⁻¹(η) ≥ 0 since η ≥ 1/2


Example: Maximum volume box in a polyhedron
(Bemporad, Filippi, Torrisi, 2004)

• Goal: find the largest box B contained inside a polyhedron P = {x ∈ R^n : Ax ≤ b}

• Let y ∈ R^n = vector of dimensions of B and x ∈ R^n = vertex of B with lowest coordinates

• Problem to solve:

  max_{x,y}  Π_{i=1}^n yi
  s.t.  A(x + diag(v)y) ≤ b, ∀v ∈ {0,1}^n
        y ≥ 0

  (nonlinear, nonconvex, many constraints!)

• Reformulate as maximizing log(volume) and remove redundant constraints:

  min_{x,y}  −Σ_{i=1}^n log(yi)
  s.t.  Ax + A⁺y ≤ b,  y ≥ 0,    A⁺ij = max{Aij, 0}

  (convex problem)
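A CVXPY sketch of the convex reformulation (the polyhedron below is an illustrative assumption; solving requires a solver supporting the exponential cone, which the default CVXPY solvers do):

import numpy as np
import cvxpy as cp

# illustrative polyhedron {x : Ax <= b} in 2D
A = np.array([[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0], [1.0, -0.2]])
b = np.array([2.0, 0.0, 0.0, 1.5])
Aplus = np.maximum(A, 0.0)          # A+_ij = max{A_ij, 0}

n = A.shape[1]
x = cp.Variable(n)                  # lowest-coordinate vertex of the box
y = cp.Variable(n, pos=True)        # box edge lengths

prob = cp.Problem(cp.Maximize(cp.sum(cp.log(y))),
                  [A @ x + Aplus @ y <= b])
prob.solve()
print("vertex:", x.value, "dimensions:", y.value)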


Semidefinite program (SDP)
• A semidefinite program (SDP) is an optimization problem with constraints on the positive semidefiniteness of matrices:

  min_x  c′x
  s.t.  x1 F1 + x2 F2 + ... + xn Fn + G ≼ 0
        Ax = b

  where F1, F2, ..., Fn, G are (wlog) symmetric m × m matrices

• The constraint is called a linear matrix inequality (LMI)⁵

• Multiple LMIs can be combined into a single LMI using block-diagonal matrices:

  x1 F1¹ + ... + xn Fn¹ + G¹ ≼ 0
  x1 F1² + ... + xn Fn² + G²  ≼ 0
  ⟺  x1 [F1¹ 0; 0 F1²] + ... + xn [Fn¹ 0; 0 Fn²] + [G¹ 0; 0 G²] ≼ 0

Many interesting problems can be formulated (or approximated) as SDPs

⁵ The LMI constraint means z′(x1 F1 + x2 F2 + ... + xn Fn + G)z ≤ 0, ∀z ∈ R^m


Semidefinite program (SDP)
SDP generalizes LP, QP, QCQP, SOCP:

• an LP can be recast as an SDP:

  min c′x                    min c′x
  s.t. Ax ≤ b       ⟺       s.t. diag(Ax − b) ≼ 0

• an SOCP can be recast as an SDP:

  min c′x
  s.t. ‖Fi x + gi‖₂ ≤ di′x + hi,  i = 1, ..., m

  ⟺

  min c′x
  s.t. [ (di′x + hi)I   Fi x + gi ;  (Fi x + gi)′   di′x + hi ] ≽ 0,  i = 1, ..., m

• Good SDP packages exist (SeDuMi, SDPT3, MathWorks LMI Toolbox, ...)


Geometric programming
(Boyd, Kim, Vandenberghe, Hassibi, 2007)

• A monomial function f: R++^n → R++, where R++ = {x ∈ R : x > 0}, has the form

  f(x) = c x1^{a1} x2^{a2} ... xn^{an},  c > 0, ai ∈ R

• A posynomial function f: R++^n → R++ is a sum of monomials:

  f(x) = Σ_{k=1}^K ck x1^{a1k} x2^{a2k} ... xn^{ank},  ck > 0, aik ∈ R

• A geometric program (GP) is the following optimization problem:

  min f(x)
  s.t. gi(x) ≤ 1,  i = 1, ..., m
       hi(x) = 1,  i = 1, ..., p

  with f, gi posynomials and hi monomials


Geometric programming - Equivalent convex program
• Introduce the change of variables yi = log xi. The optimizer is the same if we minimize log f instead of f and take the log of both sides of the constraints

• The logarithm of a monomial fM(x) = c x1^{a1} ... xn^{an} becomes affine in y:

  log fM(x) = log(c e^{a1 y1} ... e^{an yn}) = a′y + b,  b = log c

• The logarithm of a posynomial fP(x) = Σ_{k=1}^K ck x1^{a1k} ... xn^{ank} becomes

  log fP(x) = log( Σ_{k=1}^K e^{ak′y + bk} ),  bk = log ck

• One can prove that F(y) = log fP(e^y) is convex, and so is the program

  min  log( Σ_{k=1}^K e^{ak′y + bk} )
  s.t. log( Σ_{k=1}^{Ki} e^{cik′y + dik} ) ≤ 0,  i = 1, ..., m
       Ey + f = 0


Geometric programming - Example
(Boyd, Kim, Vandenberghe, Hassibi, 2007)

• Maximize the volume of a box-shaped structure with height h, width w, depth d

• Constraints:
  – total wall area 2(hw + hd) ≤ Awall
  – floor area wd ≤ Aflr
  – upper and lower bounds on aspect ratios: α ≤ h/w ≤ β, γ ≤ d/w ≤ δ

• The problem can be cast as the following GP:

  min  h⁻¹w⁻¹d⁻¹
  s.t. (2/Awall) hw + (2/Awall) hd ≤ 1
       (1/Aflr) wd ≤ 1
       α h⁻¹w ≤ 1,  (1/β) h w⁻¹ ≤ 1
       γ w d⁻¹ ≤ 1,  (1/δ) w⁻¹ d ≤ 1


Geometric programming - Example
• We solve the problem in MATLAB:

alpha=0.5; beta=2; gamma=0.5; delta=2; Awall=1000; Afloor=500;

With CVX:

cvx_begin gp quiet
    variables h w d
    % objective function = box volume
    maximize(h*w*d)
    subject to
        2*(h*w + h*d) <= Awall;
        w*d <= Afloor;
        alpha <= h/w <= beta;
        gamma <= d/w <= delta;
cvx_end
opt_volume = cvx_optval;

With YALMIP:

sdpvar h w d
C = [alpha <= h/w <= beta, gamma <= d/w <= delta, h>=0, w>=0];
C = [C, 2*(h*w+h*d) <= Awall, w*d <= Afloor];
optimize(C,-(h*w*d))

yalmip.github.io/tutorial/geometricprogramming

• Result: max volume = 5590.17, h* = 11.1803, w* = 22.3599, d* = 22.3614


Geometric programming - Example
• We solve the problem in Python with CVXPY:

import cvxpy as cp

alpha = 0.5
beta = 2.0
gamma = 0.5
delta = 2.0
Awall = 1000.0
Afloor = 500.0

h = cp.Variable(pos=True)
w = cp.Variable(pos=True)
d = cp.Variable(pos=True)

obj = h * w * d

constraints = [
    2*(h*w + h*d) <= Awall,
    w*d <= Afloor,
    alpha <= h/w, h/w <= beta,
    gamma <= d/w, d/w <= delta]

problem = cp.Problem(cp.Maximize(obj), constraints)
problem.solve(gp=True)

print("h: ", h.value)
print("w: ", w.value)
print("d: ", d.value)
print("volume: ", problem.value)


Change of function/variables

• Substituting the objective f with a monotonically increasing function of f can simplify the problem

  – Example: min √x with x ≥ 0 is a nonconvex problem, but we can minimize (√x)² = x instead

  – Example: max f(x) = Π_{i=1}^n xi is a nonconvex problem, but the function log(f(x)) = Σ_{i=1}^n log(xi) is concave

• Sometimes a nonconvex problem can be transformed into a convex problem by making a nonlinear transformation of the optimization variables (as in GP)
