
Numerical Optimization

Alberto Bemporad

http://cse.lab.imtlucca.it/~bemporad/teaching/numopt

Academic year 2019-2020


Course objectives

Solve complex decision problems by using numerical optimization

Application domains:

• Finance, management science, economics (portfolio optimization, business analytics, investment plans, resource allocation, logistics, ...)

• Engineering (engineering design, process optimization, embedded control, ...)

• Artificial intelligence (machine learning, data science, autonomous driving, ...)

• Myriads of other applications (transportation, smart grids, water networks, sports scheduling, health care, oil & gas, space, ...)


Course objectives

What this course is about:

• How to formulate a decision problem as a numerical optimization problem? (modeling)

• Which numerical algorithm is most appropriate to solve the problem? (algorithms)

• What’s the theory behind the algorithm? (theory)


Course contents
• Optimization modeling
– Linear models
– Convex models

• Optimization theory
– Optimality conditions, sensitivity analysis
– Duality

• Optimization algorithms
– Basics of numerical linear algebra
– Convex programming
– Nonlinear programming



Other references

• Stephen Boyd’s “Convex Optimization” courses at Stanford:
  http://ee364a.stanford.edu and http://ee364b.stanford.edu

• Lieven Vandenberghe’s courses at UCLA:
  http://www.seas.ucla.edu/~vandenbe/

• For more tutorials/books see http://plato.asu.edu/sub/tutorials.html


Optimization modeling

What is optimization?

• Optimization = assign values to a set of decision variables so as to optimize a certain objective function

• Example: Which is the best velocity to minimize fuel consumption?

[Figure: fuel consumption [ℓ/km] as a function of velocity [km/h]; the optimal velocity is the one attaining the best (lowest) fuel consumption]

optimization variable: velocity
cost function to minimize: fuel consumption
parameters of the decision problem: engine type, chassis shape, gear, ...


Optimization problem

  min_x f(x),    x ∈ R^n, f: R^n → R

  f* = min_x f(x) = optimal value
  x* = arg min_x f(x) = optimizer

  x = [x1; x2; ...; xn],  f(x) = f(x1, x2, ..., xn)

[Figure: graph of f(x) with minimum value f(x*) attained at x = x*]

Most often the problem is difficult to solve by inspection ⇒ use a numerical solver implementing an optimization algorithm


Optimization problem

  min_x f(x)

• The objective function f: R^n → R models our goal: minimize (or maximize) some quantity. For example fuel, money, distance from a target, etc.

• The optimization vector x ∈ R^n is the vector of optimization variables (or unknowns) xi to be decided optimally. For example velocity, number of assets in a portfolio, voltage applied to a motor, etc.


Constrained optimization problem
• The optimization vector x may not be completely free, but rather restricted to a feasible set X ⊆ R^n

• Example: the velocity must be smaller than 60 km/h

[Figure: fuel consumption [ℓ/km] vs. velocity [km/h], with the feasible set restricted to velocities below 60 km/h; the constrained optimum moves with respect to the unconstrained one]

The new optimizer is x* = 42 km/h.
Constrained optimization problem

  min_x f(x)
  s.t.  g(x) ≤ 0
        h(x) = 0

• The (in)equalities define the feasible set X of admissible variables:

  X = {x ∈ R^n : g(x) ≤ 0, h(x) = 0}

  g: R^n → R^m,  g(x) = [g1(x1, ..., xn); ...; gm(x1, ..., xn)]
  h: R^n → R^p,  h(x) = [h1(x1, ..., xn); ...; hp(x1, ..., xn)]

• Further constraints may restrict X, for example:
  x ∈ {0, 1}^n (x = binary vector)
  x ∈ Z^n (x = integer vector)


A few observations
• An optimization problem can always be written as a minimization problem:

  max_{x∈X} f(x) = − min_{x∈X} {−f(x)}

• Similarly, an inequality gi(x) ≥ 0 is equivalent to −gi(x) ≤ 0

• An equality h(x) = 0 is equivalent to the double inequalities h(x) ≤ 0, −h(x) ≤ 0 (often this is only good in theory, but not numerically)

• Scaling f(x) to αf(x) and/or gi(x) to βi gi(x) does not change the solution, for all α, βi > 0. Same if hj(x) is scaled to γj hj(x), ∀γj ≠ 0

• Adding constraints makes the optimal value worse or equal:

  min_{x∈X1} f(x) ≤ min_{x∈X1, x∈X2} f(x)

• Strict inequalities gi(x) < 0 can be approximated by gi(x) ≤ −ϵ (0 < ϵ ≪ 1)


Infeasibility and unboundedness

• A vector x ∈ R^n is feasible if x ∈ X, i.e., it satisfies the given constraints

• A problem is infeasible if X = ∅ (the constraints are too tight)

• A problem is unbounded if ∀M > 0 ∃x ∈ X such that f(x) < −M. In this case we write

  inf_{x∈X} f(x) = −∞


Global and local minima

• A vector x* ∈ R^n is a global optimizer if x* ∈ X and f(x) ≥ f(x*), ∀x ∈ X

• A vector x* ∈ R^n is a strict global optimizer if x* ∈ X and f(x) > f(x*), ∀x ∈ X, x ≠ x*

• A vector x* ∈ R^n is a (strict) local optimizer if x* ∈ X and there exists a neighborhood¹ N of x* such that f(x) ≥ f(x*), ∀x ∈ X ∩ N (f(x) > f(x*), ∀x ∈ X ∩ N, x ≠ x*)

¹ Neighborhood of x = open set containing x


Example: Least Squares
• We have a dataset (uk, yk), uk, yk ∈ R, k = 1, ..., N

• We want to fit a line ŷ = au + b to the dataset that minimizes

  f(x) = Σ_{k=1}^N (yk − a uk − b)² = Σ_{k=1}^N ([uk 1] x − yk)² = ‖ [u1 1; ...; uN 1] x − [y1; ...; yN] ‖₂²

  with respect to x = [a; b]

• The problem [a*; b*] = arg min_x f([a; b]) is a least-squares problem: ŷ = a*u + b*

In MATLAB:

x=[u ones(size(u))]\y

In Python:

import numpy as np
A=np.hstack((u,np.ones(u.shape)))
x=np.linalg.lstsq(A,y,rcond=None)[0]

[Figure: data points (uk, yk) and the fitted line ŷ = a*u + b*]
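As a self-contained illustration, here is a minimal Python sketch of the line fit above; the synthetic dataset is an assumption, since the slide's data is not provided:

import numpy as np

# synthetic dataset (assumption: the original data is not given)
rng = np.random.default_rng(0)
u = np.linspace(-1, 1, 50).reshape(-1, 1)
y = 1.2*u - 0.3 + 0.1*rng.standard_normal(u.shape)

A = np.hstack((u, np.ones(u.shape)))       # rows [u_k 1]
x = np.linalg.lstsq(A, y, rcond=None)[0]   # x = [a; b]
a, b = x.ravel()
print(f"fitted line: yhat = {a:.3f} u + {b:.3f}")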


Least Squares using Basis Functions
• More generally: we can fit nonlinear functions y = f(u) expressed as the sum of basis functions, yk ≈ Σ_{i=1}^n xi φi(uk), using least squares

• Example: fit the polynomial function y = x1 + x2 u + x3 u² + x4 u³ + x5 u⁴

  min_x Σ_{k=1}^N ( yk − [1 uk uk² uk³ uk⁴] x )²   ← least squares, linear with respect to x

  φ(u) = [1; u; u²; u³; u⁴]

[Figure: data points and the fitted fourth-order polynomial]
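A minimal sketch of the polynomial fit, again with synthetic data (an assumption): build the matrix with rows φ(uk)′ and solve by least squares.

import numpy as np

rng = np.random.default_rng(1)
u = np.linspace(0, 2, 40)
y = 1.0 + 0.5*u - 0.8*u**2 + 0.3*u**3 + 0.05*u**4 + 0.1*rng.standard_normal(u.shape)

Phi = np.vander(u, N=5, increasing=True)    # columns 1, u, u^2, u^3, u^4
x = np.linalg.lstsq(Phi, y, rcond=None)[0]
print("coefficients x1..x5:", x)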


Least Squares - Fitting a circle
• Example: fit a circle to a set of data²

  min_{x0,y0,r} Σ_{k=1}^N ( r² − (xk − x0)² − (yk − y0)² )²

• Let x = [x0; y0; r² − x0² − y0²] be the optimization vector (note the change of variables!)

• The problem becomes the least-squares problem

  min_x Σ_{k=1}^N ( [2xk 2yk 1] x − (xk² + yk²) )²

[Figure: data points and the fitted circle]

² http://www.utc.fr/~mottelet/mt94/leastSquares.pdf
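An end-to-end sketch of this linearized circle fit (the noisy synthetic points on a circle are an assumption):

import numpy as np

rng = np.random.default_rng(2)
t = rng.uniform(0, 2*np.pi, 100)
xk = 0.5 + 1.8*np.cos(t) + 0.05*rng.standard_normal(t.shape)
yk = -0.3 + 1.8*np.sin(t) + 0.05*rng.standard_normal(t.shape)

A = np.column_stack((2*xk, 2*yk, np.ones_like(xk)))
b = xk**2 + yk**2
z = np.linalg.lstsq(A, b, rcond=None)[0]   # z = [x0, y0, r^2 - x0^2 - y0^2]
x0, y0 = z[0], z[1]
r = np.sqrt(z[2] + x0**2 + y0**2)          # undo the change of variables
print(f"center = ({x0:.3f}, {y0:.3f}), radius = {r:.3f}")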


Convex sets
Definition
A set S ⊆ Rn is convex if for all x1 , x2 ∈ S

λx1 + (1 − λ)x2 ∈ S, ∀λ ∈ [0, 1]

[Figure: a convex set S, where the segment between any x1, x2 ∈ S stays inside S, and a nonconvex set, where some segment between two points exits the set]


Convex functions
• f: S → R is a convex function if S is convex and

  f(λx1 + (1 − λ)x2) ≤ λf(x1) + (1 − λ)f(x2),  ∀x1, x2 ∈ S, λ ∈ [0, 1]   (Jensen’s inequality)

• If f is convex and differentiable, by taking the limit λ → 0 we get³

  f(x1) ≥ f(x2) + ∇f(x2)′(x1 − x2)

• A function f is strictly convex if f(λx1 + (1 − λ)x2) < λf(x1) + (1 − λ)f(x2), ∀x1 ≠ x2 ∈ S, ∀λ ∈ (0, 1)

³ f(x1) − f(x2) ≥ lim_{λ→0} ( f(x2 + λ(x1 − x2)) − f(x2) ) / λ = ∇f(x2)′(x1 − x2)


Convex functions
• A function f: S → R is strongly convex with parameter m ≥ 0 if

  f(λx1 + (1 − λ)x2) ≤ λf(x1) + (1 − λ)f(x2) − (m/2) λ(1 − λ) ‖x1 − x2‖₂²

• If f is strongly convex with parameter m ≥ 0 and differentiable then

  f(y) ≥ f(x) + ∇f(x)′(y − x) + (m/2) ‖y − x‖₂²

• Equivalently, f is strongly convex with parameter m ≥ 0 if and only if f(x) − (m/2)‖x‖₂² is convex

• Moreover, if f is twice differentiable this is equivalent to ∇²f(x) ≽ mI (i.e., the matrix ∇²f(x) − mI is positive semidefinite), ∀x ∈ R^n

• A function f is (strictly/strongly) concave if −f is (strictly/strongly) convex


Convex programming
The optimization problem

  min f(x)
  s.t. x ∈ S

is a convex optimization problem if S is a convex set and f: S → R is a convex function

• Often S is defined by linear equality constraints Ax = b and convex inequality constraints g(x) ≤ 0, with g: R^n → R^m convex

• Every local solution is also a global one (we will see this later)

• Efficient solution algorithms exist (we will see many later)

• Occurs frequently in problems in engineering, economics, and science

Excellent textbook: “Convex Optimization” (Boyd, Vandenberghe, 2004)


Polyhedra
Definition
Convex polyhedron = intersection of a finite set of half-spaces of R^n
Convex polytope = bounded convex polyhedron

• Hyperplane (H-)representation:

  P = {x ∈ R^n : Ax ≤ b}

• Vertex (V-)representation:

  P = {x ∈ R^n : x = Σ_{i=1}^q αi vi + Σ_{j=1}^p βj rj,  αi, βj ≥ 0,  Σ_{i=1}^q αi = 1,  vi, rj ∈ R^n}

  vi = vertex, rj = extreme ray; when q = 0 the polyhedron is a cone

Convex hull = transformation from V- to H-representation
Vertex enumeration = transformation from H- to V-representation
Linear programming
• Linear programming (LP) problem:

  min c′x
  s.t. Ax ≤ b
       Ex = f,  x ∈ R^n

  [Portrait: George Dantzig (1914–2005)]

• LP in standard form:

  min c′x
  s.t. Ax = b
       x ≥ 0,  x ∈ R^n

• Conversion to standard form:

  1. introduce slack variables:

     Σ_{j=1}^n aij xj ≤ bi   ⇒   Σ_{j=1}^n aij xj + si = bi,  si ≥ 0

  2. split the positive and negative part of x:

     Σ_{j=1}^n aij xj + si = bi, xj free, si ≥ 0   ⇒   Σ_{j=1}^n aij (xj⁺ − xj⁻) + si = bi,  xj⁺, xj⁻, si ≥ 0
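As an illustrative sketch (the matrices below are assumptions, not from the slides), the conversion of min c′x s.t. Ax ≤ b, x free, into standard form can be done mechanically:

import numpy as np

# inequality-form LP data (illustrative)
A = np.array([[1., 3.], [3., 2.]])
b = np.array([200., 160.])
c = np.array([5., 20.])
m, n = A.shape

# variables of the standard form: z = [x+, x-, s]
A_std = np.hstack((A, -A, np.eye(m)))       # A(x+ - x-) + s = b
c_std = np.concatenate((c, -c, np.zeros(m)))
# standard form: min c_std' z  s.t.  A_std z = b,  z >= 0
print(A_std, b, c_std, sep="\n")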


Quadratic programming (QP)
• Quadratic programming (QP) problem:

  min (1/2) x′Qx + c′x
  s.t. Ax ≤ b
       Ex = f,  x ∈ R^n

• Convex optimization problem if Q ≽ 0 (Q = positive semidefinite matrix)⁴

• Without loss of generality, we can assume Q = Q′:

  (1/2) x′Qx = (1/2) x′( (Q+Q′)/2 + (Q−Q′)/2 )x = (1/2) x′( (Q+Q′)/2 )x + (1/4) x′Qx − (1/4) x′Q′x = (1/2) x′( (Q+Q′)/2 )x

• Hard problem if Q ⋡ 0 (Q = indefinite matrix)

⁴ A matrix P ∈ R^{n×n} is positive semidefinite (P ≽ 0) if x′Px ≥ 0 for all x. It is positive definite (P ≻ 0) if in addition x′Px > 0 for all x ≠ 0. It is negative (semi)definite (P ≺ 0, P ≼ 0) if −P is positive (semi)definite. It is indefinite otherwise.


Continuous vs Discrete Optimization

• In some problems the optimization variables can only take integer values.
We call x ∈ Z an integrality constraint

• A special case is x ∈ {0, 1} (binary constraint)

• When all variables are integer (or binary) the problem is an integer
programming problem (a special case of discrete optimization)

• In a mixed integer programming (MIP) problem some of the variables are real
(xi ∈ R), some are discrete/binary (xi ∈ Z or xi ∈ {0, 1})

Optimization problems with integer variables are more difficult to solve



Mixed-integer programming (MIP)
mixed-integer linear program (MILP):

  min c′x
  s.t. Ax ≤ b,  x = [xc; xb],  xc ∈ R^{nc}, xb ∈ {0,1}^{nb}

mixed-integer quadratic program (MIQP):

  min (1/2) x′Qx + c′x
  s.t. Ax ≤ b,  x = [xc; xb],  xc ∈ R^{nc}, xb ∈ {0,1}^{nb}

• Some variables are real, some are binary (0/1)

• MILP and MIQP are NP-hard problems, in general

• Many good solvers are available (CPLEX, Gurobi, GLPK, Xpress-MP, CBC, ...). For comparisons see http://plato.la.asu.edu/bench.html
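A minimal MILP sketch with SciPy (the problem data are illustrative assumptions; scipy.optimize.milp requires SciPy ≥ 1.9):

import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# min c'x  s.t.  x1 + x2 <= 1.5,  x1 >= 0 continuous,  x2 in {0,1}
c = np.array([-1.0, -2.0])
cons = LinearConstraint(np.array([[1.0, 1.0]]), -np.inf, 1.5)
integrality = np.array([0, 1])           # 0 = continuous, 1 = integer
bounds = Bounds([0, 0], [np.inf, 1])     # x2 binary via integrality + bounds

res = milp(c=c, constraints=cons, integrality=integrality, bounds=bounds)
print(res.x, res.fun)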


Stochastic and Robust Optimization
• Relations affected by random quantities lead to stochastic models:

  min_x E_w[f(x, w)]

• The model is enriched by information about the probability distribution of w

• Other stochastic measures can be minimized (variance, conditional value-at-risk, ...)

• The deterministic version min_x f(x, E_w[w]) of the problem only considers the expected value of w, not its entire distribution. If f is convex w.r.t. w then f(x, E_w[w]) ≤ E_w[f(x, w)]

• Chance constraints are constraints enforced only in probability:

  prob(g(x, w) ≤ 0) ≥ 99%

• Robust constraints are constraints that must always be satisfied:

  g(x, w) ≤ 0, ∀w


Dynamic Optimization
• Dynamic optimization involves decision variables that evolve over time

• Example: for a given value of x0 we want to optimize

  min_{x,u} Σ_{t=0}^{N−1} (xt² + ut²)
  s.t. x_{t+1} = a xt + b ut

  where ut is the control value (to be decided) and xt the state at time t. The decision variables are

  u = [u0; ...; u_{N−1}],  x = [x1; ...; xN]

• Heavily used to solve optimal control problems, such as in model predictive control
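A small CVXPY sketch of this finite-horizon problem (the values of a, b, N, x0 are illustrative assumptions):

import cvxpy as cp

a, b, N, x0 = 0.9, 0.5, 20, 1.0      # illustrative model and horizon
x = cp.Variable(N + 1)
u = cp.Variable(N)

cost = cp.sum_squares(x[:N]) + cp.sum_squares(u)    # sum of x_t^2 + u_t^2
cons = [x[0] == x0] + [x[t+1] == a*x[t] + b*u[t] for t in range(N)]
prob = cp.Problem(cp.Minimize(cost), cons)
prob.solve()
print("optimal cost:", prob.value)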


Optimization algorithm
• An optimization algorithm is a procedure to find an optimizer x* of a given optimization problem min_{x∈X} f(x)

• It is usually iterative: starting from an initial guess x0 of x, it generates a sequence xk of “iterates”, with hopefully xN ≈ x* after N iterations

• Good optimization algorithms should possess the following properties:

  – Robustness = perform well on a wide variety of problems in their class, for all reasonable values of the initial guess x0

  – Efficiency = do not require excessive CPU time/flops and memory allocation

  – Accuracy = find a solution close to the optimal one, without being affected by rounding errors due to the finite-precision arithmetic of the CPU

• The above are often conflicting properties


Optimization taxonomy

It is difficult to provide a taxonomy of optimization because many of the subfields have multiple links. Shown here is one perspective, focused mainly on the subfields of deterministic optimization with a single objective function.

[Figure: NEOS optimization taxonomy tree. Optimization splits into Uncertainty (stochastic programming, robust optimization), Deterministic (continuous and discrete), and Multiobjective Optimization. The continuous branch covers unconstrained problems (nonlinear equations, nonlinear least squares, global optimization, nondifferentiable optimization) and constrained ones (bound constrained, linearly constrained, linear, quadratic, quadratically-constrained quadratic, second-order cone, semidefinite, and semiinfinite programming, nonlinear programming, network optimization, mathematical programs with equilibrium constraints, mixed-integer nonlinear programming, derivative-free optimization, complementarity problems); the discrete branch covers integer programming and combinatorial optimization.]

https://neos-guide.org/content/optimization-taxonomy
Optimization software
• Comparison on benchmark problems: http://plato.la.asu.edu/bench.html

• Taxonomy of many solvers for different classes of optimization problems: http://www.neos-guide.org

• NEOS server for remotely solving optimization problems: http://www.neos-server.org

• Good open-source optimization software: http://www.coin-or.org/

• GitHub, MATLAB Central, Google, ...


Optimization model

• An optimization model is a mathematical model that captures the objective function to minimize and the constraints imposed on the optimization variables

• It is a quantitative model: the decision problem must be formulated as a set of mathematical relations involving the optimization variables


Formulating an optimization model
Steps required to formulate an optimization model that solves a given
decision problem:

1. Talk to the domain expert to understand the problem we want to solve

2. Single out the optimization variables xi (what are we able to decide?) and their
domain (real, binary, integer)

3. Treat the remaining variables as parameters (=data that affect the problem but
are not part of the decision process)

4. Translate the objective(s) into a cost function of x to minimize (or maximize)

5. Are there constraints on the decision variables? If yes, translate them into (in)equalities involving x

6. Make sure we have all the required data available


Formulating an optimization model

[Figure: modeling workflow — real problem → modeling → optimization model (min_x f(x) s.t. g(x) ≤ 0) → solver → solution x* → analysis of the solution, iterating back to the model as needed]

• It may take several iterations to formulate the optimization model properly, as:

  – A solution does not exist (anything wrong in the constraints?)

  – The solution does not make sense (is any constraint missing or wrong?)

  – The optimal value does not make sense (is the cost function properly defined?)

  – It takes too long to find the solution (can we simplify the model?)


Example: Chess set problem

(Guerét et al., Applications of Optimization with XpressMP, 1999)

A small joinery makes two different sizes of boxwood chess sets. The small set requires 3 hours of machining on a lathe, and the large set requires 2 hours. There are four lathes with skilled operators who each work a 40-hour week, so we have 160 lathe-hours per week. The small chess set requires 1 kg of boxwood, and the large set requires 3 kg. Unfortunately, boxwood is scarce and only 200 kg per week can be obtained. When sold, each of the large chess sets yields a profit of $20, and each of the small chess sets a profit of $5.

The problem is to decide how many sets of each kind should be made each week so as to maximize profit.



Example: Chess set problem

• Optimization variables: xs, xℓ = produced quantities of small/large chess sets

• Cost function: f(x) = 5xs + 20xℓ (profit)

• Constraints:
  – 3xs + 2xℓ ≤ 4·40 (maximum lathe-hours)
  – xs + 3xℓ ≤ 200 (available kg of boxwood)
  – xs, xℓ ≥ 0 (produced quantities cannot be negative)

  max 5xs + 20xℓ
  s.t. [3 2; 1 3] [xs; xℓ] ≤ [160; 200]
       xs, xℓ ≥ 0


Example: Chess set problem

• What is the best decision? Let us make some guesses:

      xs   xl   Lathe-hours  Boxwood  OK?   Profit  Notes
  A    0    0        0          0     Yes       0   Unprofitable!
  B   10   10       50         40     Yes     250   We won't get rich doing this.
  C  -10   10      -10         20     No      150   Planning to make a negative number of small sets.
  D   53    0      159         53     Yes     265   Uses all the lathe-hours. There is spare boxwood.
  E   50   20      190        110     No      650   Uses too many lathe-hours.
  F   25   30      135        115     Yes     725   There are spare lathe-hours and spare boxwood.
  G   12   62      160        198     Yes    1300   Uses all the resources.
  H    0   66      130        198     Yes    1320   Looks good. There are spare resources.

• What is the best solution? A numerical solver provides the following solution:

  xs* = 0, xℓ* = 66.6666  ⇒  f(x*) = 1333.3 $
Optimization models
• Optimization models, like all mathematical models, are never an exact representation of reality, but a good approximation of it

• We need to make working assumptions, for example:

  – Lathe-hours are never more than 160
  – Available wood is exactly 200 kg
  – Prices are constant
  – We sell all chess sets

• There are usually many different models for the same real problem

Optimization modeling is an art
©2020 A. Bemporad - Numerical Optimization 40/101


Optimization models
• Numerical complexity (and even the ability to find an optimal solution) depends on the optimization model we have formulated

• Good optimization models must be

  – Descriptive enough to capture the most significant aspects of the decision problem (variables, costs, constraints)

  – Simple enough to be able to solve the resulting optimization problem

  (a trade-off!)

“Make everything as simple as possible, but not simpler.” (Albert Einstein)


Optimization models - Benefits

• An optimization project provides rational (quantitative) decisions based on the available information

• Even if the solution is not actually applied, it provides a good suggestion to the decision maker (you should only make large chess sets)

• Making the model and analyzing the solution allows a better understanding of the problem at hand (small chess sets are not profitable)

• We can analyze the robustness of the solution with respect to variations in the data (big changes of solution for small changes of prices?)


Modeling languages for optimization problems

• AMPL (A Modeling Language for Mathematical Programming): the most widely used modeling language, supports several solvers

• OPL (Optimization Programming Language), associated with the commercial package IBM CPLEX

• MOSEL, associated with the commercial package FICO Xpress

• GAMS (General Algebraic Modeling System), one of the first modeling languages

• LINGO, modeling language of Lindo Systems Inc.

• GNU MathProg, a subset of AMPL associated with the free package GLPK (GNU Linear Programming Kit)


Modeling languages for optimization problems
• YALMIP: MATLAB-based modeling language

• CVX (CVXPY): modeling language for convex problems in MATLAB (Python)

• CasADi + IPOPT: nonlinear modeling + automatic differentiation, plus a nonlinear programming solver (MATLAB, Python, C++)

• Optimization Toolbox’s modeling language (part of MATLAB since R2017b)

• PYOMO: Python-based modeling language

• GEKKO: Python-based mixed-integer nonlinear modeling language

• PuLP: a linear programming modeler for Python

• JuMP: a modeling language for linear, quadratic, and nonlinear constrained optimization problems embedded in Julia


Example: Chess set problem

• Model and solve the problem using YALMIP (Löfberg, 2004)

xs = sdpvar(1,1);
xl = sdpvar(1,1);

Constraints = [3*xs+2*xl <= 4*40, 1*xs+3*xl <= 200, xs >= 0, xl >= 0];
Profit = 5*xs+20*xl;

optimize(Constraints,-Profit)

value(xs), value(xl), value(Profit)


Example: Chess set problem
• Model and solve the problem using CVX (Grant, Boyd, 2013)

cvx_clear
cvx_begin
variable xs(1)
variable xl(1)

Profit = 5*xs+20*xl;

maximize Profit

subject to
3*xs+2*xl <= 4*40; % maximum lathe-hours
1*xs+3*xl <= 200; % available kg of boxwood
xs>=0;
xl>=0;
cvx_end

xs,xl,Profit



Example: Chess set problem
• Model and solve the problem using CASADI + IPOPT
(Andersson, Gillis, Horn, Rawlings, Diehl, 2018) (Wächter, Biegler, 2006)

import casadi.*
xs=SX.sym('xs');
xl=SX.sym('xl');

Profit = 5*xs+20*xl;
Constraints = [3*xs+2*xl-4*40; 1*xs+3*xl-200];

prob=struct('x',[xs;xl],'f',-Profit,'g',Constraints);
solver = nlpsol('solver','ipopt', prob);
res = solver('lbx',[0;0],'ubg',[0;0]);

Profit = -res.f;
xs = res.x(1);
xl = res.x(2);



Example: Chess set problem

• Model and solve the problem using Optimization Toolbox (The Mathworks, Inc.)

xs=optimvar('xs','LowerBound',0);
xl=optimvar('xl','LowerBound',0);

Profit = 5*xs+20*xl;
C1 = 3*xs+2*xl-4*40<=0;
C2= 1*xs+3*xl-200<=0;

prob=optimproblem('Objective',Profit,'ObjectiveSense','max');
prob.Constraints.C1=C1;
prob.Constraints.C2=C2;

[sol,Profit] = solve(prob);

xs=sol.xs;
xl=sol.xl;



Example: Chess set problem
• In this case the optimization model is very simple and we can directly code the LP problem in plain MATLAB or Python:

In MATLAB:

A=[1 3;3 2];
b=[200;160];
c=[5 20];
[xopt,fopt]=linprog(-c,A,b,[],[],[0;0])

In Python:

import numpy as np
from scipy.optimize import linprog
A=np.array([[1,3],[3,2]])
b=np.array([200,160])
c=np.array([5,20])
sol=linprog(-c, A_ub=A, b_ub=b, bounds=(0,None))

• The Hybrid Toolbox for MATLAB contains interfaces to various solvers for LP, QP, MILP, MIQP (http://cse.lab.imtlucca.it/~bemporad/hybrid/toolbox) (Bemporad, 2003–today)

• However, when there are many variables and constraints, forming the problem matrices manually can be very time-consuming and error-prone


Example: Chess set problem

• We can even model and solve the optimization problem in Excel:

[Figure: Excel sheet with the optimization variables in cells B6:C6 and the cost function =SUMPRODUCT(B6:C6;B2:C2)]


Linear optimization models

Reference:

C. Guéret, C. Prins, M. Sevaux, “Applications of Optimization with Xpress-MP,” translated and revised by S. Heipcke, 1999

Optimization modeling: linear constraints

• Constraints define the set where to look for an optimal solution

• They define relations between the decision variables

• When formulating an optimization model we must disaggregate the restrictions appearing in the decision problem into subsets of constraints that we know how to model

• There are many types of constraints we know how to model ...


1. Upper and lower bounds (box constraints)
• Box constraints are the simplest constraints: they define upper and lower bounds on the decision variables

  ℓi ≤ xi ≤ ui,  ℓi ∈ R ∪ {−∞}, ui ∈ R ∪ {+∞}

[Figure: box-shaped admissible set [ℓ1, u1] × [ℓ2, u2] in the (x1, x2) plane]

• Example: ``We cannot sell more than 100 units of Product A''

• Pay attention: some solvers assume nonnegative variables by default!

• When ℓi = ui the constraint becomes xi = ℓi and variable xi becomes redundant. Still, it may be worthwhile keeping it in the model


2. Flow constraints

• Flow constraints arise when an item can be divided into different streams, or vice versa when many streams come together:

  Σ_{i=1}^n xi ≤ Fmax

[Figure: streams x1, x2, x3 merging into (or splitting from) a total flow]

• Example: ``I can get water from 3 suppliers, S1, S2 and S3. I want to have at least 1000 liters available.''   x1 + x2 + x3 ≥ 1000

• Example: ``I have 50 trucks available to rent to 3 customers C1, C2 and C3.''   x1 + x2 + x3 ≤ 50

• Losses can be included as well: ``2% of the water I get from the suppliers gets lost.''   0.98x1 + 0.98x2 + 0.98x3 ≥ 1000


3. Resource constraints
• Resource constraints take into account that a given resource is limited:

  Σ_{i=1}^n Rji xi ≤ Rmax,j

• The technological coefficients Rji denote the amount of resource j used per unit of activity i

• Example:
  ``Small chess sets require 1 kg of boxwood, the large ones 3 kg; the total available is 200 kg.''   x1 + 3x2 ≤ 200
  ``Small chess sets require 3 lathe-hours, the large ones 2 h; the total time is 4×40 h.''   3x1 + 2x2 ≤ 160

  R = [1 3; 3 2],  Rmax = [200; 160]


4. Balance constraints

• Balance constraints model the fact that “what goes out must in total equal what comes in”:

  Σ_{i=1}^N xi^out = Σ_{i=1}^M xi^in + L

[Figure: incoming streams x1^in, x2^in, x3^in plus stock L balancing outgoing streams x1^out, x2^out]

• Example: ``I have 100 tons of steel and can buy more from suppliers 1, 2, 3 to serve customers A and B.''   xA + xB = 100 + x1 + x2 + x3

• Balance can occur between time periods in a multi-period model

• Example: ``The cash I'll have tomorrow is what I have now plus what I receive minus what I spend today.''   x_{t+1} = xt + ut − yt


5. Quality constraints
• Quality constraints are requirements on the average percentage of a certain quality when blending several components:

  ( Σ_{i=1}^N αi xi ) / ( Σ_{i=1}^N xi ) ≥ pmin   ⇒   Σ_{i=1}^N αi xi ≥ pmin Σ_{i=1}^N xi

• Example: ``The average risk of an investment in assets A, B, C, which have risks 25%, 5%, and 12% respectively, must be smaller than 10%.''

  (0.25xA + 0.05xB + 0.12xC) / (xA + xB + xC) ≤ 0.1

• The nonlinear quality constraint is converted to a linear one under the assumption that xi ≥ 0 (if xi = 0 ∀i the constraint becomes redundant)

Objectives and constraints can often be simplified by mathematical transformations and/or by adding extra variables


6. Accounting variables and constraints

• It is often useful to add extra accounting variables:

  y = Σ_{i=1}^N xi   (accounting constraint)

• Of course we can replace y with Σ_{i=1}^N xi everywhere in the model (condensed form), but this would make it less readable

• Moreover, keeping y in the model (non-condensed form) may preserve some structural properties that the solver could exploit

• Example: ``The profit in any given year is the difference between revenues and expenditures.''   pt = rt − et


7. Blending constraints

• Blending constraints occur when we want to blend a set of ingredients xi in given percentages αi in the final product:

  xi / ( Σ_{j=1}^N xj ) = αi

• As with quality constraints, blending constraints can be converted to linear equality constraints:

  xi = αi Σ_{j=1}^N xj


8. Soft constraints
• The constraints we have seen so far are hard constraints, i.e., they cannot be violated

• Soft constraints are a relaxation, in which the constraint can be violated, usually paying a penalty:

  Σ_{i=1}^N aij xi ≤ bj   →   Σ_{i=1}^N aij xi ≤ bj + ϵj

• We call the new variable ϵj a panic variable: it should normally be zero, but it can assume a positive value in case there is no way to fulfill the constraint set

• Example: ``Only 200 kg of boxwood are available to make chess sets, but we can buy extra for 6 $/kg.''

  max_{xs, xℓ, ϵ≥0}  5xs + 20xℓ − 6ϵ
  s.t.  xs + 3xℓ ≤ 200 + ϵ
        3xs + 2xℓ ≤ 160
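A quick SciPy sketch of this softened chess-set LP (the decision vector is assumed to be stacked as [xs, xℓ, ϵ]):

import numpy as np
from scipy.optimize import linprog

# variables z = [xs, xl, eps]; maximize 5xs + 20xl - 6eps
c = np.array([-5.0, -20.0, 6.0])          # linprog minimizes, so negate profit
A = np.array([[1.0, 3.0, -1.0],           # xs + 3xl - eps <= 200
              [3.0, 2.0,  0.0]])          # 3xs + 2xl      <= 160
b = np.array([200.0, 160.0])

res = linprog(c, A_ub=A, b_ub=b, bounds=(0, None))
xs, xl, eps = res.x
print(f"xs={xs:.2f}, xl={xl:.2f}, eps={eps:.2f}, profit={-res.fun:.2f}")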


Example: Production of alloys
(Guerét et al., Applications of Optimization with XpressMP, 1999)

The company Steel has received an order for 500 tonnes of steel to be used in shipbuilding. This steel must have certain characteristics (``grades''). The company has seven different raw materials in stock that may be used for the production of this steel. The objective is to determine the composition of the steel that minimizes the production cost.



Example: Production of alloys
• Available data on the characteristics of the steel ordered:

  chemical element    minimum grade   maximum grade
  Carbon (C)               2               3
  Copper (Cu)              0.4             0.6
  Manganese (Mn)           1.2             1.65

• Available data on grades, available amounts, and prices:

  raw material       C %   Cu %   Mn %   availability [t]   cost [€/t]
  Iron alloy 1       2.5    0     1.3         400              200
  Iron alloy 2       3      0     0.8         300              250
  Iron alloy 3       0      0.3   0           600              150
  Copper alloy 1     0     90     0           500              220
  Copper alloy 2     0     96     4           200              240
  Aluminum alloy 1   0      0.4   1.2         300              200
  Aluminum alloy 2   0      0.6   0           250              165


Example: Production of alloys

• “The company has seven different raw materials in stock''

  – We need 7 optimization variables x1, ..., x7
  – Their upper and lower bounds are 0 ≤ xj ≤ Rj, where Rj = tons available of raw material j

• “The company Steel has received an order for 500 tonnes of steel''

  – flow constraint y = Σ_{j=1}^7 xj (produced metal)
  – lower bound y ≥ 500 (quantity that needs to be produced)


Example: Production of alloys
• “This steel must have certain characteristics''

  Let gimin, gimax be the minimum/maximum grade of chemical element i (i = 1: C, i = 2: Cu, i = 3: Mn), and Pji the percentage of element i contained in raw material j (see the tables above).

  Quality constraints on grades for each chemical element i = 1, 2, 3:

  gimin ≤ ( Σ_{j=1}^7 Pji xj ) / ( Σ_{j=1}^7 xj ) ≤ gimax   ⟺   Σ_{j=1}^7 Pji xj ≤ gimax y  and  Σ_{j=1}^7 Pji xj ≥ gimin y


Example: Production of alloys
• “The objective is to determine the composition of the steel that minimizes the production cost''

  The cost function to minimize is Σ_{j=1}^7 cj xj

• The complete optimization model is finally

  min_{x,y}  Σ_{j=1}^7 cj xj
  s.t.  0 ≤ xj ≤ Rj,  j = 1, ..., 7
        y = Σ_{j=1}^7 xj
        y ≥ 500
        Σ_{j=1}^7 Pji xj ≤ gimax y,  i = 1, 2, 3
        Σ_{j=1}^7 Pji xj ≥ gimin y,  i = 1, 2, 3

• The problem is a linear programming problem. The solution is

  x1* = 400, x3* = 39.7763, x5* = 2.7613, x6* = 57.4624, x2* = x4* = x7* = 0 [t]

  with optimal production cost f(x*) = 98122 €
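For completeness, a CVXPY sketch of the same LP, with the data entered from the tables above (the loop over grade constraints is an implementation choice, not from the slides):

import numpy as np
import cvxpy as cp

# rows = raw materials, columns = C, Cu, Mn percentages (from the table)
P = np.array([[2.5, 0, 1.3], [3, 0, 0.8], [0, 0.3, 0], [0, 90, 0],
              [0, 96, 4], [0, 0.4, 1.2], [0, 0.6, 0]], dtype=float)
R = np.array([400, 300, 600, 500, 200, 300, 250], dtype=float)  # availability [t]
c = np.array([200, 250, 150, 220, 240, 200, 165], dtype=float)  # cost [EUR/t]
gmin = np.array([2, 0.4, 1.2])
gmax = np.array([3, 0.6, 1.65])

x = cp.Variable(7)
y = cp.sum(x)
cons = [x >= 0, x <= R, y >= 500]
for i in range(3):  # grade constraints per chemical element
    cons += [P[:, i] @ x <= gmax[i] * y, P[:, i] @ x >= gmin[i] * y]
prob = cp.Problem(cp.Minimize(c @ x), cons)
prob.solve()
print(x.value, prob.value)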


Example: Production of alloys

• Model and solve the problem using YALMIP (Löfberg, 2004)

x = sdpvar(7,1);
y = sum(x);

Constraints = [x>=0, x<=R, y>=500, ...
    P'*x<=B(:,2)*y, P'*x>=B(:,1)*y];

cost = sum(c.*x);

optimize(Constraints,cost)

value(x), value(cost)


Example: Production of alloys
• Model and solve the problem using CVX (Grant, Boyd, 2013)

cvx_clear
cvx_begin
variable x(7)

cost = sum(c.*x);

minimize cost

subject to
y=sum(x); y>=500;
x>=0; x<=R;
P'*x<=B(:,2)*y;
P'*x>=B(:,1)*y;

cvx_end

x,cost



Linear objective functions

• Linear programs only allow minimizing a linear combination of the optimization variables

• However, by introducing new variables, we can minimize any convex piecewise affine (PWA) function

Result
Every convex piecewise affine function ℓ: R^n → R can be represented as the max of affine functions, and vice versa (Schechter, 1987)

Example:  ℓ(x) = max {a1′x + b1, ..., a4′x + b4}

[Figure: a convex PWA function ℓ(x) as the pointwise maximum of four affine functions ai′x + bi]


Convex PWA optimization problems and LP
• Minimization of a convex PWA function ℓ(x):

  min_{ϵ,x}  ϵ
  s.t.  ϵ ≥ a1′x + b1
        ϵ ≥ a2′x + b2
        ϵ ≥ a3′x + b3
        ϵ ≥ a4′x + b4

• By construction ϵ ≥ max{a1′x + b1, a2′x + b2, a3′x + b3, a4′x + b4}

• By contradiction it is easy to show that at the optimum we have that ϵ = max{a1′x + b1, a2′x + b2, a3′x + b3, a4′x + b4}

• Convex PWA constraints ℓ(x) ≤ 0 can be handled similarly by imposing ai′x + bi ≤ 0, ∀i = 1, 2, 3, 4
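A tiny SciPy sketch of this epigraph trick (the four affine pieces below are illustrative assumptions):

import numpy as np
from scipy.optimize import linprog

# minimize l(x) = max_i (a_i x + b_i), x scalar, via variables z = [x, eps]
a = np.array([-2.0, -0.5, 0.4, 3.0])
b = np.array([-1.0, 0.0, 0.2, -2.0])

# eps >= a_i x + b_i   <=>   a_i x - eps <= -b_i
A_ub = np.column_stack((a, -np.ones_like(a)))
b_ub = -b
c = np.array([0.0, 1.0])                   # minimize eps only

res = linprog(c, A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None), (None, None)])  # both variables free
print("x* =", res.x[0], " l(x*) =", res.x[1])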


1. Minmax objective
• Minmax objective: we want to minimize the maximum among M given linear objectives fi(x) = ai′x + bi:

  min_x max_{i=1,...,M} {fi(x)}   s.t. linear constraints

• Example: asymmetric cost min_x max{a′x + b, 0}

• Example: minimize the ∞-norm

  min_x ‖Ax − b‖∞

  where ‖v‖∞ ≜ max_{i=1,...,m} |vi| and A ∈ R^{m×n}, b ∈ R^m. This corresponds to

  min_x max{A1x − b1, −A1x + b1, ..., Amx − bm, −Amx + bm}


2. Minimize the sum of max objectives
• We want to minimize the sum of maxima among given linear objectives fij(x) = aij′x + bij:

  min_x Σ_{j=1}^N max_{i=1,...,Mj} {fij(x)}   s.t. linear constraints

• The equivalent reformulation is

  min_{ϵ,x}  Σ_{j=1}^N ϵj
  s.t.  ϵj ≥ aij′x + bij,  i = 1, ..., Mj,  j = 1, ..., N
        (other linear constraints)

• Example: minimize the 1-norm

  min_x ‖Ax − b‖₁

  where ‖v‖₁ ≜ Σ_{i=1,...,m} |vi| and A ∈ R^{m×n}, b ∈ R^m, which corresponds to

  min_x Σ_{i=1}^m max{Aix − bi, −Aix + bi}
3. Linear-fractional program
• We want to minimize a ratio of linear objectives:

  min_x  (c′x + d) / (e′x + f)
  s.t.  Ax ≤ b
        Gx = h

  over the domain e′x + f > 0

• We introduce the new variable z = 1 / (e′x + f) and replace xi with the new variables yi = z xi, i = 1, ..., n, where

  1 = z(e′x + f) = e′y + fz,  z ≥ 0

• Since z ≥ 0, we have zAx ≤ zb, and the original problem is translated into the LP

  min_{z,y}  c′y + dz
  s.t.  Ay − bz ≤ 0
        Gy = hz
        e′y + fz = 1
        z ≥ 0

  from which we recover x* = (1/z*) y* in case z* > 0
Chebychev center of a polyhedron
• The Chebychev center of a polyhedron P = {x : Ax ≤ b} is the center x* of the largest ball B(x*, r*) = {x : x = x* + u, ‖u‖₂ ≤ r*} contained in P

• The radius r* is called the Chebychev radius of P

• A ball B(x, r) is included in P if and only if

  sup_{‖u‖₂≤r} Ai(x + u) = Aix + r‖Ai‖₂ ≤ bi,  ∀i = 1, ..., m

  where A ∈ R^{m×n}, b ∈ R^m, and Ai is the i-th row of A

• Therefore, we can compute the Chebychev center/radius by solving the LP

  max_{x,r}  r
  s.t.  Aix + r‖Ai‖₂ ≤ bi,  i = 1, ..., m
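A small SciPy sketch of this LP (the polyhedron below, a triangle, is an illustrative assumption):

import numpy as np
from scipy.optimize import linprog

# triangle {x : Ax <= b} (illustrative)
A = np.array([[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([1.0, 0.0, 0.0])

norms = np.linalg.norm(A, axis=1)
# variables z = [x1, x2, r]; maximize r  <=>  minimize -r
A_ub = np.hstack((A, norms[:, None]))      # A_i x + r ||A_i||_2 <= b_i
c = np.array([0.0, 0.0, -1.0])

res = linprog(c, A_ub=A_ub, b_ub=b,
              bounds=[(None, None), (None, None), (0, None)])
print("center:", res.x[:2], "radius:", res.x[2])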


Convex optimization models

References:

S. Boyd, L. Vandenberghe, “Convex Optimization,” 2004

S. Boyd, “Convex Optimization,” lecture notes, http://ee364a.stanford.edu, http://ee364b.stanford.edu
Convex sets
• Convex set: a set S ⊆ R^n is convex if for all x1, x2 ∈ S

  λx1 + (1 − λ)x2 ∈ S,  ∀λ ∈ [0, 1]

• The convex hull of N points x̄1, ..., x̄N is the set of all their convex combinations:

  S = {x ∈ R^n : ∃λ ∈ R^N : x = Σ_{i=1}^N λi x̄i,  λi ≥ 0,  Σ_{i=1}^N λi = 1}

• The convex cone of N points x̄1, ..., x̄N is the set

  S = {x ∈ R^n : ∃λ ∈ R^N : x = Σ_{i=1}^N λi x̄i,  λi ≥ 0}


Convex sets
• The set {x : a′x = b}, a ≠ 0, is called a hyperplane

• The set {x : a′x ≤ b}, a ≠ 0, is called a halfspace

• The set P = {x : Ax ≤ b, Ex = f} is called a polyhedron

• The set B(x0, r) = {x : ‖x − x0‖₂ ≤ r} = {x0 + ry : ‖y‖₂ ≤ 1} is called a (Euclidean) ball

• An ellipsoid is the set E = {x : (x − x0)′P(x − x0) ≤ 1} with P = P′ ≻ 0, or equivalently E = {x0 + Ay : ‖y‖₂ ≤ 1}, A square and det A ≠ 0

• Hyperplanes, halfspaces, polyhedra, balls, and ellipsoids are all convex sets


Properties of convex sets

• The intersection of (any number of) convex sets is convex

• The image of a convex set under an affine function f(x) = Ax + b (A ∈ R^{m×n}, b ∈ R^m) is convex:

  S ⊆ R^n convex ⇒ f(S) = {y : y = f(x), x ∈ S} convex

  For example: scaling (A diagonal, b = 0), translation (A = I, b ≠ 0), projection (A = [I 0], b = 0, i.e., f(S) = {(x1, ..., xi) : x ∈ S})


Convex functions

• Recall: f: S → R is a convex function if S is convex and

  f(λx1 + (1 − λ)x2) ≤ λf(x1) + (1 − λ)f(x2),  ∀x1, x2 ∈ S, λ ∈ [0, 1]   (Jensen’s inequality)

• Sublevel sets Cα of convex functions are convex sets (but not vice versa):

  Cα = {x ∈ dom f : f(x) ≤ α}

• Therefore linear equality constraints Ax = b and inequality constraints g(x) ≤ 0, with g a convex (vector) function, define a convex set


Convex functions
• Examples of convex functions:
  – affine: f(x) = a′x + b, for any a ∈ R^n, b ∈ R
  – exponential: f(x) = e^{ax}, x ∈ R, for any a ∈ R
  – power: f(x) = x^α, x > 0, for any α ≥ 1 or α ≤ 0. Example: x², 1/x for x > 0
  – powers of absolute value: f(x) = |x|^p, x ∈ R, for p ≥ 1
  – negative entropy: f(x) = x log x, x > 0
  – any norm: f(x) = ‖x‖
  – maximum: f(x) = max(x1, ..., xn)

• Examples of concave functions:
  – affine: f(x) = a′x + b, for any a ∈ R^n, b ∈ R
  – logarithm: f(x) = log x, x > 0
  – power: f(x) = x^α, x ≥ 0, for any 0 ≤ α ≤ 1. Example: √x, x ≥ 0
  – minimum: f(x) = min(x1, ..., xn)


Convex functions

• Recall the first-order condition of convexity: f: R^n → R with convex domain dom f and differentiable is convex if and only if

  f(y) ≥ f(x) + ∇f(x)′(y − x),  ∀x, y ∈ dom f

[Figure: a convex f lies above its tangent f(x) + ∇f(x)′(y − x) at any x]

• Second-order condition: let f: R^n → R with convex domain dom f be twice differentiable and ∇²f(x) its Hessian matrix, [∇²f(x)]ij = ∂²f(x)/∂xi∂xj. Then f is convex if and only if

  ∇²f(x) ≽ 0,  ∀x ∈ dom f

  If ∇²f(x) ≻ 0 for all x ∈ dom f then f is strictly convex.


Checking convexity

1. Check directly whether the definition is satisfied (Jensen’s inequality)

2. Check if the Hessian matrix is positive semidefinite (only for twice differentiable functions)

3. Show that f is obtained by combining known convex functions via operations that preserve convexity


Calculus rules for convex functions
• nonnegative scaling: f convex, α ≥ 0 ⇒ αf convex
• sum: f, g convex ⇒ f + g convex
• affine composition: f convex ⇒ f(Ax + b) convex
• pointwise maximum: f1, ..., fm convex ⇒ max_i fi(x) convex
• composition: h convex increasing, f convex ⇒ h(f(x)) convex

General composition rule: h(f1(x), ..., fk(x)) is convex when h is convex and, for each i = 1, ..., k,
  – h is increasing in argument i and fi is convex, or
  – h is decreasing in argument i and fi is concave, or
  – fi is affine

See also dcp.stanford.edu (Diamond, 2014)


Convex programming
(Boyd, Vandenberghe, 2004)

• The optimization problem

  min f(x)
  s.t. g(x) ≤ 0,  g: R^n → R^m, gi convex
       Ax = b

or, more generally,

  min f(x)
  s.t. x ∈ S,  S convex set

with f: X → R convex, is a convex optimization problem, where X = {x ∈ R^n : g(x) ≤ 0, Ax = b} or, more generally, X = S

• Convex programs can be solved to global optimality and many efficient algorithms exist for this (we will see many later)

• Although convexity may sound like a restriction, it occurs very frequently in practice (sometimes after some transformations or approximations)


Disciplined convex programming
(Grant, Boyd, Ye, 2006)

• The objective function has the form
  – minimize a scalar convex expression, or
  – maximize a scalar concave expression

• Each of the constraints (if any) has the form
  – convex expression ≤ concave expression, or
  – concave expression ≥ convex expression, or
  – affine expression = affine expression

This framework is used in the CVX, CVXPY, and Convex.jl packages.


Least squares
• least squares (LS) problem:

  min ‖Ax − b‖₂²,    x* = (A′A)⁻¹A′b   ((A′A)⁻¹A′ = pseudoinverse of A)

  [Portraits: Adrien-Marie Legendre (1752–1833), J. Carl Friedrich Gauss (1777–1855)]

• nonnegative least squares (NNLS) (Lawson, Hanson, 1974):

  min ‖Ax − b‖₂²
  s.t. x ≥ 0

• bounded-variable least squares (BVLS) (Stark, Parker, 1995):

  min ‖Ax − b‖₂²
  s.t. ℓ ≤ x ≤ u

• constrained least squares:

  min ‖Ax − b‖₂²
  s.t. Ax ≤ b, Ex = f
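These variants map directly onto NumPy/SciPy routines; a hedged sketch with random data (an assumption):

import numpy as np
from scipy.optimize import nnls, lsq_linear

rng = np.random.default_rng(3)
A = rng.standard_normal((20, 4))
b = rng.standard_normal(20)

x_ls = np.linalg.lstsq(A, b, rcond=None)[0]       # plain LS
x_nnls = nnls(A, b)[0]                            # LS with x >= 0
x_bvls = lsq_linear(A, b, bounds=(-0.5, 0.5)).x   # LS with -0.5 <= x <= 0.5
print(x_ls, x_nnls, x_bvls, sep="\n")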


Quadratic programming
• The least-squares cost is a special case of a quadratic cost:

  (1/2)‖Ax − b‖₂² = (1/2) x′A′Ax − b′Ax + (1/2) b′b

• A generalization of constrained least squares is quadratic programming (QP):

  min (1/2) x′Qx + c′x
  s.t. Ax ≤ b,    Q = Q′ ≽ 0
       Ex = f

• If Q = L′L ≻ 0 we can complete the squares by setting y = Lx + (L⁻¹)′c and convert the QP into an LS problem:

  (1/2) x′Qx + c′x = (1/2) ‖Lx + (L⁻¹)′c‖₂² − (1/2) c′Q⁻¹c


Linear program with random cost = QP

• We want to solve the LP with random cost c:

  min c′x
  s.t. Ax ≤ b, Ex = f,    E[c] = c̄,  Var[c] = E[(c − c̄)(c − c̄)′] = Σ

• c′x is a random variable with expectation E[c′x] = c̄′x and variance Var[c′x] = x′Σx

• We want to trade off the expectation of c′x with its variance (= risk) with a risk-aversion coefficient γ ≥ 0

• This is equivalent to a QP:

  min E[c′x] + γ Var[c′x]        ⟺        min c̄′x + γ x′Σx
  s.t. Ax ≤ b, Ex = f                     s.t. Ax ≤ b, Ex = f
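A compact CVXPY sketch of the risk-averse LP (all data below, including the simplex constraint, are illustrative assumptions; note x′Σx = ‖Sx‖₂² when Σ = S′S):

import numpy as np
import cvxpy as cp

rng = np.random.default_rng(4)
n = 5
cbar = rng.standard_normal(n)
S = rng.standard_normal((n, n))   # Sigma = S'S is a valid covariance
gamma = 0.5

x = cp.Variable(n)
cons = [cp.sum(x) == 1, x >= 0]   # illustrative constraints
obj = cbar @ x + gamma * cp.sum_squares(S @ x)   # cbar'x + gamma x'Sigma x
cp.Problem(cp.Minimize(obj), cons).solve()
print(x.value)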


LASSO optimization = QP
(Tibshirani, 1996)

• The following ℓ1-penalized linear regression problem is called LASSO (least absolute shrinkage and selection operator):

  min_x (1/2)‖Ax − b‖₂² + λ‖x‖₁,    A ∈ R^{m×n}, b ∈ R^m

• The tuning parameter λ ≥ 0 determines the tradeoff between fitting Ax ≈ b (λ small) and making x sparse (λ large)

• By splitting x into the difference of its positive and negative parts, x = y − z, y, z ≥ 0, we get the positive semidefinite QP with 2n variables

  min_{y,z≥0} (1/2)‖A(y − z) − b‖₂² + λ1′(y + z)

  where 1′ = [1 ... 1]. At optimality at least one of yi*, zi* will be zero

• A small Tikhonov regularization σ(‖y‖₂² + ‖z‖₂²) makes the QP strictly convex
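A minimal CVXPY sketch of the split-variable QP formulation (random data as an assumption):

import numpy as np
import cvxpy as cp

rng = np.random.default_rng(5)
m, n, lam = 50, 100, 0.5
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

y = cp.Variable(n, nonneg=True)   # positive part of x
z = cp.Variable(n, nonneg=True)   # negative part of x
obj = 0.5 * cp.sum_squares(A @ (y - z) - b) + lam * cp.sum(y + z)
cp.Problem(cp.Minimize(obj)).solve()

x = y.value - z.value
print("nonzeros in x:", np.sum(np.abs(x) > 1e-6))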


LASSO - Example
• Solve the LASSO problem

  min_x (1/2)‖Ax − b‖₂² + λ‖x‖₁,    A ∈ R^{1000×3000}, b ∈ R^{1000}

• A, b = random matrices, A sparse with 3000 nonzero entries

• Problem solved by QP for different values of λ

• CPU time ranges from 8.5 ms to 1.17 s using OSQP (http://osqp.org)

[Figure: optimal cost and number of nonzero entries of x* as functions of λ ∈ [10⁻⁴, 10²]]


Quadratically constrained quadratic program (QCQP)

• If we add quadratic constraints to a QP we get the quadratically constrained quadratic program (QCQP):

  min  (1/2) x′Qx + c′x
  s.t. (1/2) x′Pi x + di′x + hi ≤ 0,  i = 1, ..., m
       Ax = b

• QCQP is a convex problem if Q, Pi ≽ 0, i = 1, ..., m

• If P1, ..., Pm ≻ 0 the feasible region X of the QCQP is the intersection of m ellipsoids and p hyperplanes (b ∈ R^p)

• Polyhedral constraints (halfspaces) are a special case when Pi = 0


Second-order cone programming
• A generalization of LP, QP, and QCQP is second-order cone programming (SOCP):

  min c′x
  s.t. ‖Fi x + gi‖₂ ≤ di′x + hi,  i = 1, ..., m
       Ax = b

  with Fi ∈ R^{ni×n}, A ∈ R^{p×n}

• If Fi = 0 the SOC constraint becomes a linear inequality constraint

• If di = 0 (hi ≥ 0) the SOC constraint becomes a quadratic constraint

• The quadratic constraint x′F′Fx + d′x + h ≤ 0 is equivalent to the SOC constraint

  ‖ [ Fx ; (1 + d′x + h)/2 ] ‖₂ ≤ (1 − d′x − h)/2


Example: Robust linear programming
(Boyd, Vandenberghe, 2004)

• We want to solve the LP with uncertain constraint coefficients ai:

  min c′x
  s.t. ai′x ≤ bi,  i = 1, ..., m

• Assume ai can be anything in the ellipsoid Ei = {āi + Pi y, ‖y‖₂ ≤ 1}, Pi ∈ R^{n×n}, where āi ∈ R^n is the center of Ei:

  min c′x
  s.t. ai′x ≤ bi, ∀ai ∈ Ei,  i = 1, ..., m

• The constraint is equivalent to sup_{ai∈Ei} {ai′x} ≤ bi, where

  sup_{ai∈Ei} {ai′x} = sup_{‖y‖₂≤1} {(āi + Pi y)′x} = āi′x + ‖Pi′x‖₂

• The original robust LP is therefore equivalent to the SOCP

  min c′x
  s.t. āi′x + ‖Pi′x‖₂ ≤ bi,  i = 1, ..., m
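A short CVXPY sketch of the robust counterpart (all problem data below, including the extra box bound that keeps the LP bounded, are illustrative assumptions):

import numpy as np
import cvxpy as cp

rng = np.random.default_rng(6)
n, m = 3, 5
c = rng.standard_normal(n)
abar = rng.standard_normal((m, n))
P = [0.1 * rng.standard_normal((n, n)) for _ in range(m)]
b = abar @ np.ones(n) + 1.0        # keep the nominal problem feasible

x = cp.Variable(n)
cons = [abar[i] @ x + cp.norm(P[i].T @ x, 2) <= b[i] for i in range(m)]
cons.append(cp.norm(x, "inf") <= 10)   # keep the LP bounded (illustrative)
prob = cp.Problem(cp.Minimize(c @ x), cons)
prob.solve()
print(x.value)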


Example: LP with random constraints
(Boyd, Vandenberghe, 2004)

• Assume ai Gaussian, ai ~ N(āi, Σi), Σi = LΣ′LΣ (LΣ = Σi^{1/2} if Σi is diagonal)

• For a given η ∈ [1/2, 1] we want to solve the LP with chance constraints

  min c′x
  s.t. prob(ai′x ≤ bi) ≥ η,  i = 1, ..., m

• Let α = ai′x − bi, ᾱ = āi′x − bi, σ̄² = x′Σi x. The cumulative distribution function (CDF) of α ~ N(ᾱ, σ̄²) is F(α) = Φ((α − ᾱ)/σ̄), with Φ(β) = (1/√(2π)) ∫_{−∞}^{β} e^{−t²/2} dt, so

  prob(ai′x − bi ≤ 0) = F(0) = Φ(−ᾱ/σ̄) = Φ( (bi − āi′x) / ‖LΣ x‖₂ ) ≥ η

• The original LP with random constraints is equivalent to the SOCP

  min c′x
  s.t. āi′x + Φ⁻¹(η)‖LΣ x‖₂ ≤ bi,  i = 1, ..., m

  where Φ⁻¹(η) ≥ 0 since η ≥ 1/2


Example: Maximum volume box in a polyhedron
(Bemporad, Filippi, Torrisi, 2004)

• Goal: find the largest box B contained inside a polyhedron P = {x ∈ R^n : Ax ≤ b}

• Let y ∈ R^n = vector of dimensions of B and x ∈ R^n = vertex of B with lowest coordinates

• Problem to solve:

  max_{x,y}  Π_{i=1}^n yi
  s.t.  A(x + diag(v)y) ≤ b, ∀v ∈ {0,1}^n
        y ≥ 0

  (nonlinear, nonconvex, many constraints!)

• Reformulate as maximizing log(volume) and remove redundant constraints:

  min_{x,y}  −Σ_{i=1}^n log(yi)
  s.t.  Ax + A⁺y ≤ b,  y ≥ 0,    A⁺ij = max{Aij, 0}

  (convex problem)
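A CVXPY sketch of the convex reformulation (the polyhedron below is an illustrative assumption; solving requires a solver supporting the exponential cone, which the default CVXPY solvers do):

import numpy as np
import cvxpy as cp

# illustrative polyhedron {x : Ax <= b} in 2D
A = np.array([[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0], [1.0, -0.2]])
b = np.array([2.0, 0.0, 0.0, 1.5])
Aplus = np.maximum(A, 0.0)          # A+_ij = max{A_ij, 0}

n = A.shape[1]
x = cp.Variable(n)                  # lowest-coordinate vertex of the box
y = cp.Variable(n, pos=True)        # box edge lengths

prob = cp.Problem(cp.Maximize(cp.sum(cp.log(y))),
                  [A @ x + Aplus @ y <= b])
prob.solve()
print("vertex:", x.value, "dimensions:", y.value)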


Semidefinite program (SDP)
• A semidefinite program (SDP) is an optimization problem with constraints on the positive semidefiniteness of matrices:

  min_x  c′x
  s.t.  x1 F1 + x2 F2 + ... + xn Fn + G ≼ 0
        Ax = b

  where F1, F2, ..., Fn, G are (wlog) symmetric m × m matrices

• The constraint is called a linear matrix inequality (LMI)⁵

• Multiple LMIs can be combined into a single LMI using block-diagonal matrices:

  x1 F1¹ + ... + xn Fn¹ + G¹ ≼ 0
  x1 F1² + ... + xn Fn² + G²  ≼ 0
  ⟺  x1 [F1¹ 0; 0 F1²] + ... + xn [Fn¹ 0; 0 Fn²] + [G¹ 0; 0 G²] ≼ 0

Many interesting problems can be formulated (or approximated) as SDPs

⁵ The LMI constraint means z′(x1 F1 + x2 F2 + ... + xn Fn + G)z ≤ 0, ∀z ∈ R^m


Semidefinite program (SDP)
SDP generalizes LP, QP, QCQP, SOCP:

• an LP can be recast as an SDP:

  min c′x                    min c′x
  s.t. Ax ≤ b       ⟺       s.t. diag(Ax − b) ≼ 0

• an SOCP can be recast as an SDP:

  min c′x
  s.t. ‖Fi x + gi‖₂ ≤ di′x + hi,  i = 1, ..., m

  ⟺

  min c′x
  s.t. [ (di′x + hi)I   Fi x + gi ;  (Fi x + gi)′   di′x + hi ] ≽ 0,  i = 1, ..., m

• Good SDP packages exist (SeDuMi, SDPT3, MathWorks LMI Toolbox, ...)


Geometric programming
(Boyd, Kim, Vandenberghe, Hassibi, 2007)

• A monomial function f: R++^n → R++, where R++ = {x ∈ R : x > 0}, has the form

  f(x) = c x1^{a1} x2^{a2} ... xn^{an},  c > 0, ai ∈ R

• A posynomial function f: R++^n → R++ is a sum of monomials:

  f(x) = Σ_{k=1}^K ck x1^{a1k} x2^{a2k} ... xn^{ank},  ck > 0, aik ∈ R

• A geometric program (GP) is the following optimization problem:

  min f(x)
  s.t. gi(x) ≤ 1,  i = 1, ..., m
       hi(x) = 1,  i = 1, ..., p

  with f, gi posynomials and hi monomials


Geometric programming - Equivalent convex program
• Introduce the change of variables yi = log xi. The optimizer is the same if we minimize log f instead of f and take the log of both sides of the constraints

• The logarithm of a monomial fM(x) = c x1^{a1} ... xn^{an} becomes affine in y:

  log fM(x) = log(c e^{a1 y1} ... e^{an yn}) = a′y + b,  b = log c

• The logarithm of a posynomial fP(x) = Σ_{k=1}^K ck x1^{a1k} ... xn^{ank} becomes

  log fP(x) = log( Σ_{k=1}^K e^{ak′y + bk} ),  bk = log ck

• One can prove that F(y) = log fP(e^y) is convex, and so is the program

  min  log( Σ_{k=1}^K e^{ak′y + bk} )
  s.t. log( Σ_{k=1}^{Ki} e^{cik′y + dik} ) ≤ 0,  i = 1, ..., m
       Ey + f = 0


Geometric programming - Example
(Boyd, Kim, Vandenberghe, Hassibi, 2007)

• Maximize the volume of a box-shaped structure with height h, width w, depth d

• Constraints:
  – total wall area 2(hw + hd) ≤ Awall
  – floor area wd ≤ Aflr
  – upper and lower bounds on aspect ratios: α ≤ h/w ≤ β, γ ≤ d/w ≤ δ

• The problem can be cast as the following GP:

  min  h⁻¹w⁻¹d⁻¹
  s.t. (2/Awall) hw + (2/Awall) hd ≤ 1
       (1/Aflr) wd ≤ 1
       α h⁻¹w ≤ 1,  (1/β) h w⁻¹ ≤ 1
       γ w d⁻¹ ≤ 1,  (1/δ) w⁻¹ d ≤ 1


Geometric programming - Example
• We solve the problem in MATLAB:

alpha=0.5; beta=2; gamma=0.5; delta=2; Awall=1000; Afloor=500;

With CVX:

cvx_begin gp quiet
    variables h w d
    % objective function = box volume
    maximize(h*w*d)
    subject to
        2*(h*w + h*d) <= Awall;
        w*d <= Afloor;
        alpha <= h/w <= beta;
        gamma <= d/w <= delta;
cvx_end
opt_volume = cvx_optval;

With YALMIP:

sdpvar h w d
C = [alpha <= h/w <= beta, gamma <= d/w <= delta, h>=0, w>=0];
C = [C, 2*(h*w+h*d) <= Awall, w*d <= Afloor];
optimize(C,-(h*w*d))

yalmip.github.io/tutorial/geometricprogramming

• Result: max volume = 5590.17, h* = 11.1803, w* = 22.3599, d* = 22.3614


Geometric programming - Example
• We solve the problem in Python with CVXPY:

import cvxpy as cp

alpha = 0.5
beta = 2.0
gamma = 0.5
delta = 2.0
Awall = 1000.0
Afloor = 500.0

h = cp.Variable(pos=True)
w = cp.Variable(pos=True)
d = cp.Variable(pos=True)

obj = h * w * d

constraints = [
    2*(h*w + h*d) <= Awall,
    w*d <= Afloor,
    alpha <= h/w, h/w <= beta,
    gamma <= d/w, d/w <= delta]

problem = cp.Problem(cp.Maximize(obj), constraints)
problem.solve(gp=True)

print("h: ", h.value)
print("w: ", w.value)
print("d: ", d.value)
print("volume: ", problem.value)


Change of function/variables

• Substituting the objective f with a monotonically increasing function of f can simplify the problem

  – Example: min √x with x ≥ 0 is a nonconvex problem, but we can minimize (√x)² = x instead

  – Example: max f(x) = Π_{i=1}^n xi is a nonconvex problem, but the function log(f(x)) = Σ_{i=1}^n log(xi) is concave

• Sometimes a nonconvex problem can be transformed into a convex problem by making a nonlinear transformation of the optimization variables (as in GP)
