Applied Numerical Optimization: Prof. Alexander Mitsos, Ph.D. What Is Optimization and How Do We Use It?
Often the term mathematical programming is used as an alternative to numerical optimization. This term
dates back to the time before computers, when programming referred to the solution of planning
problems.
For those interested in the history of optimization: Documenta Mathematica, Journal der
Deutschen Mathematiker-Vereinigung, Extra Volume - Optimization Stories, 21st
International Symposium on Mathematical Programming, Berlin, August 19–24, 2012
An optimization problem comprises:
• An objective function
• A mathematical model of the system under consideration
• Additional restrictions
The mathematical model of the system under consideration and the additional restrictions are also referred
to as constraints.
• The objective function describes an economic measure (operating costs, investment costs, profit, etc.), a
technological one, or ...
• The mathematical modeling of the system results in models to be added to the optimization problem as
equality constraints.
• The additional constraints (mostly linear inequalities) result, for instance, from:
plant- or equipment-specific limitations (capacity, pressure, etc.)
material limitations (explosion limit, boiling point, corrosivity, etc.)
product requirements (quality, etc.)
resources (availability, quality, etc.)
• We seek those values of the influencing variables (decision variables or degrees of freedom) that
maximize or minimize the objective function.
• At the optimum, the values of the degrees of freedom must satisfy the mathematical model and all
additional constraints, for instance, physical or resource limitations.
• The solution is, typically, a compromise between opposing effects. In process design, for instance, the
investment costs can be reduced while increasing the operating costs (and vice versa).
Optimization is widely used in science and engineering, and in particular in process and energy systems
engineering, e.g.,
• Business decisions (determination of product portfolio, choice of location of production sites, analysis of
competing investments, etc.)
• Design decisions: Process, plant and equipment (structure of a process or energy conversion plant,
favorable operating point, selection and dimensions of major equipment, modes of process operation,
etc.)
• Model identification (parameter estimation, design of experiments, model structure discrimination, etc.)
• Navigation systems: how to go from A to B in shortest time (or shortest distance, lowest fuel consumption or
...)
• LaTeX varies spacing and arrangement of figures to maximize visual appeal of documents
• What is the difference between a nonlinear program, an optimal control problem and a stochastic program?
Objectives: max slack, max social impact, min graduation time, max papers
Constraints:
• sleep > 4 hrs
• pay rent
• keep funding
Variables:
• work load
• free lunch schemes
• seem busy schemes
• The heating costs (operational costs) are proportional to the heat loss, which can be reduced by the
installation of an insulation (investment costs).
[Figure: insulated pipe at 600 °C]
• The aim is to find the best compromise between the cost of additional heating and cost of additional insulation.
The objective function corresponds thus to the total (annualized) cost.
[1] Kaynakli, Economic thermal insulation thickness for pipes and ducts: A review study, 2014
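To make the trade-off concrete, the sketch below minimizes a total annualized cost over the insulation thickness with SciPy. The cost model and all coefficients are made up for illustration and are not taken from [1]:

```python
# Minimal sketch of the heating-vs-insulation trade-off: heating cost falls
# with insulation thickness t, annualized insulation cost grows with t.
# All coefficients are hypothetical placeholders.
from scipy.optimize import minimize_scalar

def total_annual_cost(t):                  # t: insulation thickness [m]
    heating = 120.0 / (0.05 + t)           # decreasing heat-loss cost
    insulation = 800.0 * t                 # increasing investment cost
    return heating + insulation

res = minimize_scalar(total_annual_cost, bounds=(0.0, 0.5), method="bounded")
print(f"optimal thickness: {res.x:.3f} m, minimal cost: {res.fun:.1f} per year")
```

The optimum lies in the interior of the bounds: adding insulation pays off only until its annualized investment cost outweighs the saved heating cost.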
Example: Optimal Motion Planning of Robots
Task:
Aims:
Reaction scheme: A → B → C, carried out in a batch reactor at temperature 𝑇R, where C is an
undesirable by-product.
Optimization problem:
• The selectivity of the reaction can be maximized over the batch by manipulating the
dosage of reactant A and the reaction temperature.
• The degrees of freedom are functions of time.
• Like in robot motion planning, this problem is an optimal trajectory planning problem.
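A minimal sketch of how such a trajectory problem can be attacked numerically: the temperature profile is discretized into piecewise-constant stages (control vector parameterization) and the resulting finite-dimensional NLP is solved with SciPy. The first-order kinetics, Arrhenius parameters, bounds and stage count are all assumptions for illustration, and only the temperature (not the dosage of A) is used as a degree of freedom:

```python
# Sketch of control-vector parameterization for a batch reaction A -> B -> C:
# maximize the final concentration of B over a piecewise-constant T profile.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

# Hypothetical Arrhenius rate constants for A -> B and B -> C.
k1 = lambda T: 4.0e3 * np.exp(-2500.0 / T)
k2 = lambda T: 6.0e5 * np.exp(-5000.0 / T)

def neg_final_B(T_stages, t_end=1.0):
    """Integrate the batch over equal time stages; return -c_B(t_end)."""
    c = np.array([1.0, 0.0])                         # initial c_A, c_B
    dt = t_end / len(T_stages)
    for T in T_stages:
        rhs = lambda t, c, T=T: [-k1(T) * c[0],
                                 k1(T) * c[0] - k2(T) * c[1]]
        c = solve_ivp(rhs, (0.0, dt), c, rtol=1e-8).y[:, -1]
    return -c[1]                                     # maximize B <=> minimize -B

T0 = np.full(5, 340.0)                               # 5 stages, initial guess 340 K
res = minimize(neg_final_B, T0, bounds=[(300.0, 400.0)] * 5)
print("optimal stage temperatures [K]:", np.round(res.x, 1))
print("final concentration of B:", -res.fun)
```

With more stages, the discretized profile approaches the continuous optimal trajectory.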
Optimal Cut?
• For each of the considered examples, state: variables, objective function, model, additional constraints
CSPonD: Slocum, Codd, et al. (incl. Mitsos), Solar Energy
Example: Heliostat Fields – Optimization Applicable?
• Noone (with guidance by Mitsos) developed and validated a model suitable for optimization (fast yet accurate) [1]
• Heuristic global methods (genetic algorithms, multistart) are prohibitive for a realistic number of heliostats
• Local optimization from an arbitrary initial guess is not suitable, as the results are very sensitive to the initial guess
• Heuristic solution tried: start with existing designs and optimize locally
[1] Noone C. J., Torrilhon M., Mitsos A., Heliostat field optimization: A new computationally efficient model and biomimetic layout, Solar Energy, 2012
Example: Wind Farm Layout
• Land use
• Infrastructure cost
• Most basic case: constant wind from one direction, minimize levelized cost of electricity
[Figure: two optimized layouts (𝑁r = 8) on a domain of 45𝐷, from [1] and [2]; both are optimized layouts, yet problem formulation and algorithm make a difference]
[1] Grady et al. (2005), Renew. Energ., 30, 259.
[2] Bongartz (2020), in preparation.
Example: Sailing – Technology Choice
Typically, inventions arise from human creativity, not from mathematical optimization.
• Ancient sailing: fixed mast, no boom → mostly downwind [1]
• Classic sailing: fixed mast, boom → can go upwind [2]
• Novel hulls (catamaran) [3]
• Novel sails (wing, …) [4]
Optimization problems are classified with respect to the type of the objective function, constraints and
variables, in particular
• Linearity of objective function and constraints:
• Linear (LP) versus nonlinear programs (NLP)
• NLPs can be convex or nonconvex, smooth or nonsmooth
• Discrete and/or continuous variables:
• Integer programs (IP) and mixed-integer programs (MIP; linear: MILP, nonlinear: MINLP)
• Time-dependence:
• Dynamic optimization or optimal control programs (DO or OCP)
• Stochastic or deterministic models and variables:
• Stochastic programs, semi-infinite optimization, …
• Single objective vs multi-objective, single-level vs multi-level, ...
• An optimization problem is a mathematical formulation to find the best possible solution out of all feasible
solutions. It typically comprises one or more objective functions, decision variables, equality constraints
and/or inequality constraints.
• An algorithm is a procedure for solving a problem based on conducting a sequence of specified actions.
The terms ‘algorithm’ and ‘solution method’ are commonly used interchangeably.
• A solver is the implementation of an algorithm on a computer in a programming language. Often, the
terms ‘solver’ and ‘software’ are used interchangeably.
• “Optimizer's curse”: a solution obtained with a good algorithm and a bad model will look better than it is
Random error: if the model has a random error and we optimize, the true objective value of the solution found will be
worse than the calculated one
If the model allows for a nonphysical solution with a good objective value, a good optimizer will pick it
On the other hand, the model only has to point in the correct direction; it does not have to be correct
• Many engineering (design) problems are nonconvex, but global algorithms are inherently very expensive
• Often the optimal solution lies at a constraint; hence there is a trade-off between a good and a robust solution
• What is the difference between a nonlinear program, an optimal control problem and a stochastic program?
min 𝑓(𝒙), 𝒙 ∈ 𝐷
s.t. 𝑐ᵢ(𝒙) = 0, 𝑖 ∈ 𝐸
     𝑐ᵢ(𝒙) ≤ 0, 𝑖 ∈ 𝐼
where 𝑓: 𝐷 → ℝ is the objective function and 𝑐ᵢ: 𝐷 → ℝ, 𝑖 ∈ 𝐸 ∪ 𝐼, are the constraint functions, with
𝐸 the index set of equality constraints and 𝐼 the index set of inequality constraints.
The constraints and the host set define the feasible set, i.e., the set of all feasible solutions:
Ω = {𝒙 ∈ 𝐷 | 𝑐ᵢ(𝒙) ≤ 0 ∀𝑖 ∈ 𝐼, 𝑐ᵢ(𝒙) = 0 ∀𝑖 ∈ 𝐸}
[Figure: examples of minima; panels with neighborhoods 𝑁1, 𝑁2 around 𝑥1∗, 𝑥2∗; c) a strict local minimum but no global minimum; d) each 𝑥∗ ∈ [𝑎, 𝑏] is a local and global minimum, but there are no strict minima]
Check Yourself
• Is every local solution also a global solution? Is every global solution also a local solution?
• Can a solution be in the interior of the feasible set? On its boundary? Outside the feasible set?
Draw the corresponding picture
• For a given problem, recognize the (locally or globally) optimal solution points
Mathematical background
Nonlinear Optimization Problem (Nonlinear Program, NLP)
min 𝑓(𝒙), 𝒙 ∈ 𝐷
s.t. 𝑐ᵢ(𝒙) = 0, 𝑖 ∈ 𝐸
     𝑐ᵢ(𝒙) ≤ 0, 𝑖 ∈ 𝐼
where 𝑓: 𝐷 → ℝ is the objective function and 𝑐ᵢ: 𝐷 → ℝ, 𝑖 ∈ 𝐸 ∪ 𝐼, are the constraint functions, with
𝐸 the index set of equality constraints and 𝐼 the index set of inequality constraints.
The constraints and the host set define the feasible set, i.e., the set of all feasible solutions:
Ω = {𝒙 ∈ 𝐷 | 𝑐ᵢ(𝒙) ≤ 0 ∀𝑖 ∈ 𝐼, 𝑐ᵢ(𝒙) = 0 ∀𝑖 ∈ 𝐸}
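The following sketch shows how this notation maps to a general-purpose NLP solver; the objective and constraints are made-up toy functions. Note that SciPy expects inequalities as g(𝒙) ≥ 0, so 𝑐ᵢ(𝒙) ≤ 0 is passed as −𝑐ᵢ(𝒙) ≥ 0:

```python
# Illustrative mapping of the NLP notation above to code: minimize f(x)
# subject to one equality (i in E) and one inequality (i in I) constraint.
import numpy as np
from scipy.optimize import minimize

f = lambda x: (x[0] - 1.0)**2 + (x[1] - 2.5)**2
eq   = {"type": "eq",   "fun": lambda x: x[0] + x[1] - 3.0}   # c_i(x) = 0
ineq = {"type": "ineq", "fun": lambda x: -(x[0]**2 - x[1])}   # -c_i(x) >= 0

res = minimize(f, x0=np.zeros(2), method="SLSQP", constraints=[eq, ineq])
print("x* =", res.x, " f(x*) =", res.fun)
```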
Definition (directional derivative):
Let 𝑓: 𝐷 → ℝ, 𝐷 ⊆ ℝⁿ, 𝒙ₐ ∈ 𝐷 and 𝒑 ∈ ℝⁿ with ‖𝒑‖ = 1. The directional derivative of 𝑓 at 𝒙ₐ in direction 𝒑 is
𝐷(𝑓, 𝒑)|𝒙₌𝒙ₐ = lim(𝜀→0) [𝑓(𝒙ₐ + 𝜀𝒑) − 𝑓(𝒙ₐ)] / 𝜀 =: ∇𝒑𝑓(𝒙ₐ)
Definition:
• The first derivative of a scalar, continuously differentiable function 𝑓 is called the gradient of 𝑓 at point 𝒙:
∇𝑓(𝒙) = (∂𝑓/∂x₁|𝒙, …, ∂𝑓/∂xₙ|𝒙)ᵀ
Remarks:
• If 𝒙 is a function of time 𝑡, the chain rule applies:
d𝑓/d𝑡|𝒙(𝑡) = ∇𝑓(𝒙)ᵀ d𝒙/d𝑡|𝑡 = Σᵢ₌₁ⁿ (∂𝑓/∂xᵢ|𝒙(𝑡)) (∂xᵢ/∂𝑡|𝑡)
Definition:
• The second derivative of a scalar, twice continuously differentiable function 𝑓 is the symmetric Hessian
(matrix) 𝑯(𝒙) of the function 𝑓, with entries
(𝑯(𝒙))ᵢⱼ = (∇²𝑓(𝒙))ᵢⱼ = ∂²𝑓/∂xᵢ∂xⱼ|𝒙, 𝑖, 𝑗 = 1, …, 𝑛,
ranging from ∂²𝑓/∂x₁²|𝒙 in the top-left corner to ∂²𝑓/∂xₙ²|𝒙 in the bottom-right corner.
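When analytic derivatives are unavailable, the gradient and Hessian can be approximated by finite differences. A minimal central-difference sketch (adequate for smooth 𝑓 and small 𝑛; the step sizes are rough choices):

```python
# Central finite-difference approximations of the gradient and Hessian
# defined above; h is a rough default step size.
import numpy as np

def gradient(f, x, h=1e-6):
    g = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        e = np.zeros_like(x, dtype=float); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def hessian(f, x, h=1e-4):
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        e = np.zeros(n); e[i] = h
        H[:, i] = (gradient(f, x + e, h) - gradient(f, x - e, h)) / (2 * h)
    return 0.5 * (H + H.T)                 # symmetrize against rounding errors

f = lambda x: x[0]**2 * x[1] + np.sin(x[1])
x = np.array([1.0, 2.0])
print(gradient(f, x))   # analytic: [2*x0*x1, x0**2 + cos(x1)]
print(hessian(f, x))    # analytic: [[2*x1, 2*x0], [2*x0, -sin(x1)]]
```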
• Necessary condition: Statement A is a necessary condition for statement B if (and only if) the falsity of A
guarantees the falsity of B. In math notation: ¬A ⇒ ¬B
• Sufficient condition: Statement A is a sufficient condition for statement B if (and only if) the truth of A
guarantees the truth of B. In math notation: A ⇒ B
• If statement A is a necessary condition for statement B, then B is a sufficient condition for statement A:
¬A ⇒ ¬B implies B ⇒ A
• If statement A is a sufficient condition for statement B, then B is a necessary condition for statement A:
A ⇒ B implies ¬B ⇒ ¬A
• In optimization we would like to have easy-to-check conditions that tell us whether a candidate point
is a local optimum (a sufficient condition for optimality suffices), or
is not an optimal point (a necessary condition is violated).
Ideally, we want conditions that are necessary and sufficient for a local optimum (or, even better, for a global one).
Consider min 𝑓(𝒙) over 𝒙 ∈ ℝⁿ with 𝑓 continuously differentiable.
Proof sketch of the first-order necessary condition:
As 𝒙∗ is a local minimizer of 𝑓, for each 𝒑 ∈ ℝⁿ there exists 𝜏 > 0, such
that 𝑓(𝒙∗ + 𝜀𝒑) ≥ 𝑓(𝒙∗) ∀ 𝜀 ∈ [0, 𝜏].
[Figure: contour plot of 𝑓 with the local minimizer 𝒙∗ and a direction 𝒑]
By the definition of the directional derivative:
∇𝒑𝑓(𝒙∗) = lim(𝜀→0) [𝑓(𝒙∗ + 𝜀𝒑) − 𝑓(𝒙∗)] / 𝜀 = ∇𝑓(𝒙∗)ᵀ𝒑 ≥ 0   (1)
[Figure: 𝑓(𝒙∗ + 𝜀𝒑) as a function of 𝜀]
The special choice 𝒑 = −∇𝑓(𝒙∗) leads to −‖∇𝑓(𝒙∗)‖² ≥ 0, which is only possible if ∇𝑓(𝒙∗) = 𝟎.
• A stationary point does not have to be a minimum or a maximum. A stationary point that is neither is
called a saddle point.
• Example: the gradient of 𝑓(𝒙) = x₁² − x₂² is ∇𝑓(𝒙) = [2x₁, −2x₂]ᵀ. Thus, 𝒙∗ = 𝟎 is its only
stationary point. As 𝑓 is positively curved in the x₁-direction and negatively curved in the x₂-direction,
𝒙∗ is a saddle point.
[Figure: surface plot of 𝑓 with the saddle point at 𝒙∗]
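A quick numerical check of this example: the Hessian of 𝑓(𝒙) = x₁² − x₂² is constant, and its eigenvalues have both signs, i.e., it is indefinite:

```python
# Eigenvalue check for the saddle-point example f(x) = x1^2 - x2^2:
# the (constant) Hessian has eigenvalues of both signs at the stationary point.
import numpy as np

H = np.array([[2.0,  0.0],
              [0.0, -2.0]])                # Hessian of x1^2 - x2^2
print(np.linalg.eigvalsh(H))               # [-2.  2.]: indefinite, saddle point
```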
Theorem (second-order necessary conditions): If 𝒙∗ is a local minimum of a twice continuously differentiable 𝑓, then
1. ∇𝑓(𝒙∗) = 𝟎,
2. ∇²𝑓(𝒙∗) is positive semidefinite.
Proof sketch (by contradiction): Suppose ∇²𝑓(𝒙∗) is not positive semidefinite.
Thus, ∃𝒑 ∈ ℝⁿ: 𝒑ᵀ∇²𝑓(𝒙∗)𝒑 < 0. By Taylor expansion,
𝑓(𝒙∗ + 𝜖𝒑) = 𝑓(𝒙∗) + 𝜖∇𝑓(𝒙∗)ᵀ𝒑 + ½𝜖²𝒑ᵀ∇²𝑓(𝒙∗)𝒑 + 𝑂(𝜖³).
Since ∇𝑓(𝒙∗) = 𝟎, for sufficiently small 𝜖 > 0 we obtain 𝑓(𝒙∗ + 𝜖𝒑) < 𝑓(𝒙∗), contradicting local minimality of 𝒙∗.
Theorem (second-order sufficient conditions): If
1. ∇𝑓(𝒙∗) = 𝟎,
2. ∇²𝑓(𝒙∗) is positive definite,
then 𝒙∗ is a strict local minimum.
Remark
• 𝑓(𝑥) = 𝑥⁴ attains its (unique) strict global minimum at 𝑥∗ = 0. Further, ∇𝑓(0) = 0 and ∇²𝑓(0) = 0 hold; thus
the 2nd condition in the above theorem is violated.
• Hence, the conditions mentioned in the theorem are sufficient but not necessary.
[Figure: nested sets in ℝⁿ: points satisfying the 1st-order conditions ⊃ points satisfying the 2nd-order necessary conditions ⊃ local minima]
• Optimality conditions hold at a point, not on the whole ℝⁿ
• All the sets shown are proper subsets
• The first-order necessary conditions exclude non-stationary points
Definition:
𝒙∗ is a local solution if ∃ a neighborhood 𝑁(𝒙∗): 𝑓(𝒙∗) ≤ 𝑓(𝒙) ∀𝒙 ∈ 𝑁(𝒙∗)
[Figure: function 𝑓 with a local minimum 𝑥∗ and neighborhood 𝑁(𝑥∗)]
Optimality Conditions:
1st-Order Necessary: If 𝒙∗ is a local minimum, then ∇𝑓(𝒙∗) = 𝟎
2nd-Order Necessary: If 𝒙∗ is a local minimum, then ∇𝑓(𝒙∗) = 𝟎 and 𝑯(𝒙∗) is positive semidefinite
2nd-Order Sufficient: If ∇𝑓(𝒙∗) = 𝟎 and 𝑯(𝒙∗) is positive definite, then 𝒙∗ is a (strict) local minimum
Solution
The stationary points 𝒙∗ are defined by the condition ∇𝑓(𝒙∗) = 𝟎:
∇𝑓(𝒙) = [4x₁³ + 2x₁(1 − 2x₂) − 2x₂ + 4.5, −2x₁² + 4x₂ − 2x₁ − 4]ᵀ = 𝟎
⇒ 4x₁³ + 2x₁(1 − 2x₂) − 2x₂ + 4.5 = 0 and −2x₁² + 4x₂ − 2x₁ − 4 = 0
[Figure: contour plot in the (x₁, x₂)-plane with the stationary points 𝑨, 𝑩 and 𝑪]
• To classify the stationary points, we investigate the
definiteness of the Hessian 𝑯(𝒙)
• At 𝑨 and 𝑩 all eigenvalues are positive. By the 2nd-order sufficient conditions, 𝑨 and 𝑩 are local minima. (𝑨 is
indeed the unique global minimizer.)
• At 𝑪, the Hessian has one positive and one negative eigenvalue. The 2nd-order necessary conditions are
violated, and thus 𝑪 is not a local minimum. (𝑪 is indeed a saddle point.)
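The example can be reproduced numerically: root-find on the gradient given above from a few spread-out starting points, then classify each stationary point via the eigenvalues of the Hessian (obtained here by differentiating the gradient by hand). The starting points are ad-hoc guesses:

```python
# Find and classify the stationary points of the worked example above.
import numpy as np
from scipy.optimize import root

def grad(x):
    return [4*x[0]**3 + 2*x[0]*(1 - 2*x[1]) - 2*x[1] + 4.5,
            -2*x[0]**2 + 4*x[1] - 2*x[0] - 4]

def hess(x):                                # Jacobian of grad, symmetric
    return np.array([[12*x[0]**2 + 2*(1 - 2*x[1]), -4*x[0] - 2],
                     [-4*x[0] - 2,                  4.0]])

points = []
for x0 in [(-2.0, 2.0), (0.0, 1.0), (2.0, 4.0)]:   # ad-hoc starting guesses
    sol = root(grad, x0)
    if sol.success and not any(np.allclose(sol.x, p, atol=1e-4) for p in points):
        points.append(sol.x)

for p in points:
    lam = np.linalg.eigvalsh(hess(p))
    kind = "local minimum" if lam.min() > 0 else "saddle point"
    print(np.round(p, 4), "eigenvalues:", np.round(lam, 3), kind)
```

This reproduces the classification above: two local minima and one saddle point.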
A Funky Function [1] (1)
𝑓(𝒙) = (x₂ − x₁²)(x₂ − 4x₁²)
[Figure: 𝑓 restricted to the lines x₁ = 500x₂, x₁ = x₂, x₂ = 0 and x₁ = 0; each restriction has a minimum at 0]
• ∇𝑓(𝒙) = [−10x₁x₂ + 16x₁³, 2x₂ − 5x₁²]ᵀ, ∇𝑓(𝟎) = 𝟎
• Necessary conditions (1st and 2nd order) are satisfied
• 𝟎 is a local minimum w.r.t. every line through it
• But 𝟎 is not a local minimum of 𝑓
• Sufficient conditions are not satisfied
• How can that be?
[1] Bertsekas D.P., Nonlinear Programming – Third Edition, Athena Scientific, 2016.
A Funky Function [1] (2)
[Figure: 𝑓 along the curve x₂ = 2x₁²; there 𝑓(x₁, 2x₁²) = (x₁²)(−2x₁²) = −2x₁⁴ < 0 for x₁ ≠ 0, so 𝟎 is not a local minimum of 𝑓]
[1] Bertsekas D.P., Nonlinear Programming – Third Edition, Athena Scientific, 2016.
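A numerical look at the same phenomenon: sampling 𝑓 along several lines through the origin gives nonnegative values near 0, while along the curve x₂ = 2x₁² the function is strictly negative:

```python
# Along every line through 0 the restriction of f has a local minimum at 0,
# yet along the curve x2 = 2*x1^2 the function is strictly negative.
import numpy as np

f = lambda x1, x2: (x2 - x1**2) * (x2 - 4 * x1**2)

t = np.linspace(-0.01, 0.01, 5)
for d in [(1, 0), (0, 1), (1, 1), (500, 1)]:        # line directions through 0
    print(d, np.round(f(d[0] * t, d[1] * t), 8))    # nonnegative samples

print("on x2 = 2*x1^2:", f(t, 2 * t**2))            # = -2*t^4 < 0 for t != 0
```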
Convexity in optimization
Convexity of a Set
[Figure: a convex set and a nonconvex set]
Convexity of a Function
[Figure: three functions 𝑓(𝑥) on [𝑎, 𝑏]: strictly convex; concave, but not strictly; neither convex nor concave]
Theorem:
A symmetric (n×n)-matrix 𝑨 is positive definite if 𝜆ₖ > 0 ∀𝑘 ∈ {1, …, 𝑛}, where
the 𝜆ₖ represent the eigenvalues of 𝑨, i.e., the solutions of det(𝑨 − 𝜆𝑰) = 0.
Similarly, 𝑨 is positive semi-definite if 𝜆ₖ ≥ 0 ∀𝑘 ∈ {1, …, 𝑛}.
For all 𝒙 ∈ 𝐷 and all 𝒑 ∈ ℝⁿ:

𝑓 is                          𝑯(𝒙) is                  all 𝜆ₖ are     𝒑ᵀ𝑯(𝒙)𝒑 is
strictly convex               positive definite        > 0            > 0
convex                        positive semidefinite    ≥ 0            ≥ 0
concave                       negative semidefinite    ≤ 0            ≤ 0
strictly concave              negative definite        < 0            < 0
neither convex nor concave    indefinite               mixed sign     ≥ 0 or ≤ 0

[Figure: quadratic surfaces illustrating the definiteness of the Hessian, the sign of the eigenvalues and the sign of the quadratic form: 𝜆₁, 𝜆₂ > 0; 𝜆₁ > 0, 𝜆₂ = 0; 𝜆₁ > 0, 𝜆₂ < 0]
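The table can be checked numerically for quadratic functions 𝑓(𝒙) = ½𝒙ᵀ𝑨𝒙, whose Hessian is the constant matrix 𝑨; the example matrices below are arbitrary illustrations:

```python
# Classify the (constant) Hessian A of a quadratic f(x) = 0.5 x^T A x
# by the signs of its eigenvalues, following the table above.
import numpy as np

def classify(A):
    lam = np.linalg.eigvalsh(A)             # eigenvalues of symmetric A
    if lam.min() > 0:   return "positive definite: strictly convex"
    if lam.min() >= 0:  return "positive semidefinite: convex"
    if lam.max() < 0:   return "negative definite: strictly concave"
    if lam.max() <= 0:  return "negative semidefinite: concave"
    return "indefinite: neither convex nor concave"

print(classify(np.array([[2.0, 0.0], [0.0, 1.0]])))    # lambda = 1, 2
print(classify(np.array([[1.0, 0.0], [0.0, 0.0]])))    # lambda = 0, 1
print(classify(np.array([[1.0, 0.0], [0.0, -1.0]])))   # lambda = -1, 1
```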
• The optimization problem min 𝑓(𝒙) over 𝒙 ∈ Ω is convex if the objective function 𝑓 is convex and the feasible set Ω is
convex.
• “… in fact, the great watershed in optimization isn't between linearity and nonlinearity, but convexity and
nonconvexity.” R. Tyrrell Rockafellar in SIAM Review, 1993
Optimality Conditions for Smooth Convex Problems
• Simply put:
The first-order optimality condition is both necessary and sufficient.
A stationary point is equivalent to a local solution point and to a global solution point.
In constrained problems a similar property exists: convexity implies that the first-order optimality conditions are both
necessary and sufficient.
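As a small illustration of this property, the convex quadratic below is minimized from several starting points; a local solver always returns the same (global) minimum. The function is a made-up example:

```python
# For a smooth convex problem the first-order condition is also sufficient:
# every run of a local solver on this convex quadratic ends at the same
# (global) minimizer, regardless of the starting point.
import numpy as np
from scipy.optimize import minimize

f = lambda x: x[0]**2 + 2 * x[1]**2 + x[0] * x[1] - x[0]   # Hessian [[2,1],[1,4]] is PD
for x0 in [(-5.0, 5.0), (0.0, 0.0), (10.0, -3.0)]:
    res = minimize(f, np.array(x0))
    print(x0, "->", np.round(res.x, 6), " f =", np.round(res.fun, 6))
```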