
Applied Numerical Optimization

Prof. Alexander Mitsos, Ph.D.

Branch & Bound for NLP


B&B Illustration for Box-Constrained NLPs (1)

[Figure: sequence of sketches of a nonconvex objective on a box $X$: its convex relaxation, the lower bound LBD from minimizing the relaxation, the upper bound UBD from a local solve, branching of $X$ into nodes (a) and (b) with subintervals $x_a$ and $x_b$ and their own bounds LBDa and LBDb, and fathoming of node (a) by value dominance once its lower bound exceeds the incumbent UBD]

1. Construct a relaxation
2. Solve relaxation → LBD
3. Solve original locally → UBD
4. Branch to nodes (a) and (b)
5. Repeat steps 1-4 for each node
6. Fathom by value dominance
 – Range reduction of variables (optional acceleration)

• How to get lower bounds?
• How to get upper bounds?
• (Range reduction of the variable bounds?)
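The following minimal Python sketch mirrors these steps for a one-dimensional box-constrained problem. The objective, the crude interval-based lower bounding, and the midpoint evaluation used as a stand-in for the local upper-bounding solve are illustrative choices, not taken from the lecture.

```python
import heapq

def f(x):
    # illustrative nonconvex objective (not from the lecture); global minimum near x = -1.3
    return x**4 - 3.0 * x**2 + x

def lower_bound(lo, hi):
    # valid lower bound of f on [lo, hi] from a natural interval extension (a crude relaxation)
    hi_sq = max(lo * lo, hi * hi)
    lo_quart = 0.0 if lo <= 0.0 <= hi else min(lo**4, hi**4)
    return lo_quart - 3.0 * hi_sq + lo

def branch_and_bound(a, b, eps=1e-2, max_nodes=100000):
    ubd = min(f(a), f(b), f(0.5 * (a + b)))      # step 3: any feasible point gives an upper bound
    heap = [(lower_bound(a, b), a, b)]           # open nodes, node with the smallest LBD first
    nodes = 0
    while heap and nodes < max_nodes:
        lbd, lo, hi = heapq.heappop(heap)
        if ubd <= lbd + eps:                     # finite termination: UBD <= LBD + eps
            return ubd, lbd, nodes
        nodes += 1
        mid = 0.5 * (lo + hi)
        ubd = min(ubd, f(mid))                   # cheap stand-in for the local solve (step 3)
        for child in ((lo, mid), (mid, hi)):     # step 4: branch into nodes (a) and (b)
            child_lbd = lower_bound(*child)
            if child_lbd <= ubd - eps:           # step 6: fathom children by value dominance
                heapq.heappush(heap, (child_lbd, *child))
    return ubd, (heap[0][0] if heap else ubd - eps), nodes

print(branch_and_bound(-2.0, 2.0))   # roughly (-3.51, -3.52, node count)
```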
Check Yourself

• What are the implications of a nonconvex objective function?

• What are the implications of a nonconvex feasible set?

• Describe B&B for nonconvex optimization.

Applied Numerical Optimization
Prof. Alexander Mitsos, Ph.D.

Convex relaxations of nonconvex functions


Basic Ideas for Relaxation of Functions: Natural Interval Extension

1. Decompose the function into a finite sequence of additions, multiplications, and intrinsic functions
2. Propagate the intervals of the variables

• Example: $\exp(x^3 - x^2)$ for $x \in [-1, 1]$:

$\exp([-1,1]^3 - [-1,1]^2) \subset \exp([-1,1] - [0,1]) \subset \exp([-2,1]) \subset [\exp(-2), \exp(1)]$

• Applicable to most functions


• Simple and cheap but weak relaxations with linear convergence order
• Centered form and Taylor models are improvements
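A minimal Python sketch of natural interval extension, reproducing the example above; the tiny Interval class is purely illustrative and does not implement outward rounding as a validated interval library would.

```python
from math import exp

class Interval:
    """Minimal interval arithmetic (a sketch, not a rigorously rounded implementation)."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)
    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)
    def __repr__(self):
        return f"[{self.lo:.4f}, {self.hi:.4f}]"

def isqr(x):                       # x^2 with the exact power rule
    hi = max(x.lo * x.lo, x.hi * x.hi)
    lo = 0.0 if x.lo <= 0.0 <= x.hi else min(x.lo * x.lo, x.hi * x.hi)
    return Interval(lo, hi)

def icube(x):                      # x^3 is monotone
    return Interval(x.lo ** 3, x.hi ** 3)

def iexp(x):                       # exp is monotone
    return Interval(exp(x.lo), exp(x.hi))

# natural interval extension of exp(x^3 - x^2) on x in [-1, 1], as on the slide
x = Interval(-1.0, 1.0)
print(iexp(icube(x) - isqr(x)))    # [exp(-2), exp(1)] ~ [0.1353, 2.7183]
```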

Basic Ideas for Relaxation of Functions: αBB Method

• αBB relaxations for smooth functions add a negative quadratic term:

$f(\boldsymbol{x}) + \sum_i \alpha_i \,(x_i - x_i^L)(x_i - x_i^U)$

 – Relaxation for any $\alpha > 0$
 – Convex for sufficiently large $\alpha$

[Figure: a nonconvex $f$ and its αBB relaxation on $x \in [-1, 3]$]

Maranas & Floudas JOGO 4.2 (1994): 135-170.


Akrotirianakis & Floudas JOGO 30.4 (2004): 367-390.
Basic Ideas for Relaxation of Functions: αBB Method

• αBB relaxations for smooth functions add a negative quadratic term:

$f(\boldsymbol{x}) + \sum_i \alpha_i \,(x_i - x_i^L)(x_i - x_i^U)$

 – Relaxation for any $\alpha > 0$
 – Convex for sufficiently large $\alpha$
 – Calculate a suitable $\alpha$ by underestimating the eigenvalues of the Hessian on the box

• Quadratic convergence in $x_i^U - x_i^L$, but often relatively weak for large $x_i^U - x_i^L$

• Many variants
 – piecewise
 – first decompose the function
 – exponential form: $f(\boldsymbol{x}) - \sum_i \big(1 - \exp(\gamma_i (x_i - x_i^L))\big)\big(1 - \exp(\gamma_i (x_i^U - x_i))\big)$

[Figure: αBB relaxations on $x \in [-1, 3]$ and piecewise on the subintervals $[-1, 1]$ and $[1, 3]$]

Maranas & Floudas JOGO 4.2 (1994): 135-170.


Akrotirianakis & Floudas JOGO 30.4 (2004): 367-390.
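A small Python sketch of the αBB construction for a one-dimensional example; the test function is made up, and $\alpha$ is obtained here by sampling the second derivative rather than by a rigorous interval-Hessian (e.g. Gershgorin-type) bound as a real implementation would use.

```python
import numpy as np

# alphaBB underestimator of f on [xL, xU]:  f(x) + alpha * (x - xL) * (x - xU)
# Convexity needs alpha >= max(0, -0.5 * min f''(x) on [xL, xU]).

def f(x):
    return np.sin(3.0 * x) + 0.1 * x**2          # illustrative nonconvex function

def f2(x):
    return -9.0 * np.sin(3.0 * x) + 0.2          # its second derivative

xL, xU = -1.0, 3.0
xs = np.linspace(xL, xU, 2001)
alpha = max(0.0, -0.5 * f2(xs).min())            # sampled (non-rigorous) eigenvalue bound

def f_alphaBB(x):
    return f(x) + alpha * (x - xL) * (x - xU)    # underestimates f for any alpha >= 0

relax = f_alphaBB(xs)
print("alpha =", round(alpha, 3))
print("underestimator everywhere below f:", bool(np.all(relax <= f(xs) + 1e-12)))
print("lower bound from relaxation:", round(relax.min(), 3), " true minimum:", round(f(xs).min(), 3))
```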
McCormick Relaxations: With Auxiliary Variables and in Original Variable Space

Example: $\min_x \exp(x) - x^3$ s.t. $x \in [-1, 1.5]$

[Figure: plots of $\exp(x)$, $-x^3$, and $\exp(x) - x^3$ on $[-1, 1.5]$ together with their convex underestimators $\mathrm{cv}(\cdot)$ and concave overestimators $\mathrm{cc}(\cdot)$, e.g. $\mathrm{cv}(\exp(x) - x^3)$]

Auxiliary variable method: introduce $z_1 = \exp(x)$, $z_2 = -x^3$:

$\min_{x, z_1, z_2} z_1 + z_2$ s.t. $z_1 = \exp(x)$, $z_2 = -x^3$, $x \in [-1, 1.5]$

Convexify:

$\min_{x, z_1, z_2} z_1 + z_2$ s.t. $\mathrm{cv}(\exp(x)) \le z_1 \le \mathrm{cc}(\exp(x))$, $\mathrm{cv}(-x^3) \le z_2 \le \mathrm{cc}(-x^3)$, $x \in [-1, 1.5]$

Linearize:

$\min_{x, z_1, z_2} z_1 + z_2$ s.t. $\mathrm{lin}(\mathrm{cv}(\exp(x))) \le z_1 \le \mathrm{lin}(\mathrm{cc}(\exp(x)))$, $\mathrm{lin}(\mathrm{cv}(-x^3)) \le z_2 \le \mathrm{lin}(\mathrm{cc}(-x^3))$, $x \in [-1, 1.5]$

Multivariate McCormick [1,2], in the original variable space. Convexify:

$\min_x \mathrm{cv}(\exp(x) - x^3)$ s.t. $x \in [-1, 1.5]$

Linearize [3]:

$\min_x \mathrm{lin}(\mathrm{cv}(\exp(x) - x^3))$ s.t. $x \in [-1, 1.5]$

[1] McCormick, Mathematical Programming 10 (1976)
[2] Tsoukalas & Mitsos, Journal of Global Optimization 59 (2014)
[3] Mitsos et al., SIAM Journal on Optimization 20(2) (2009)
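As a concrete instance of the cv/cc operators, the following sketch evaluates McCormick's original envelopes of a bilinear term $w = x \cdot y$ on a box [1]; the box and the sampling grid are illustrative choices, not taken from the lecture.

```python
import itertools

def mccormick_bilinear(x, y, xL, xU, yL, yU):
    """Convex under- and concave overestimator of the bilinear term x*y on a box
    (McCormick envelopes [1]), evaluated at a point (x, y) inside the box."""
    cv = max(xL * y + yL * x - xL * yL,
             xU * y + yU * x - xU * yU)
    cc = min(xU * y + yL * x - xU * yL,
             xL * y + yU * x - xL * yU)
    return cv, cc

# sanity check on a sample grid: cv <= x*y <= cc everywhere on the box
xL, xU, yL, yU = -1.0, 1.5, 0.0, 2.0
ok = True
for i, j in itertools.product(range(31), repeat=2):
    x = xL + (xU - xL) * i / 30.0
    y = yL + (yU - yL) * j / 30.0
    cv, cc = mccormick_bilinear(x, y, xL, xU, yL, yU)
    ok &= cv <= x * y + 1e-12 and cc >= x * y - 1e-12
print("envelopes valid on the box:", ok)
```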
Check Yourself

• Describe methods to obtain underestimating functions. What are the underlying assumptions?

Applied Numerical Optimization
Prof. Alexander Mitsos, Ph.D.

Convergence rate of convex relaxations


Convergence Rate of Relaxations: Theory

• Take $\boldsymbol{x} \in X^0 = [\boldsymbol{x}^{L,0}, \boldsymbol{x}^{U,0}] \supset X = [\boldsymbol{x}^L, \boldsymbol{x}^U]$, with box width $\delta(X) = \max_i (x_i^U - x_i^L)$
• Take $f: X^0 \to \mathbb{R}$
• We construct a pair of relaxations on $X$: a convex underestimator $f^u$ and a concave overestimator $f^o$ with
  $f^u(\boldsymbol{x}) \le f(\boldsymbol{x}) \le f^o(\boldsymbol{x})$ for all $\boldsymbol{x} \in [\boldsymbol{x}^L, \boldsymbol{x}^U]$
• Tightness desired: small $f(\boldsymbol{x}) - f^u(\boldsymbol{x})$ and $f^o(\boldsymbol{x}) - f(\boldsymbol{x})$
• Convergence: $f^o(\boldsymbol{x}), f^u(\boldsymbol{x}) \to f(\boldsymbol{x})$ for $\delta(X) \to 0$
• Pointwise convergence rate $\gamma$: $\exists C > 0$ s.t. $\forall X \subset X^0$:
  $\sup_{\boldsymbol{x} \in X} \max\{f(\boldsymbol{x}) - f^u(\boldsymbol{x}),\ f^o(\boldsymbol{x}) - f(\boldsymbol{x})\} \le C\,\delta(X)^{\gamma}$
• Hausdorff convergence rate $\beta$: $\exists C > 0$ s.t. $\forall X \subset X^0$:
  $\max\{\inf_{\boldsymbol{x} \in X} f(\boldsymbol{x}) - \inf_{\boldsymbol{x} \in X} f^u(\boldsymbol{x}),\ \sup_{\boldsymbol{x} \in X} f^o(\boldsymbol{x}) - \sup_{\boldsymbol{x} \in X} f(\boldsymbol{x})\} \le C\,\delta(X)^{\beta}$
• Cluster effect: a high convergence rate is needed to avoid creating many nodes in the B&B tree
 – rate < 2 is problematic
 – rate > 2 is desired
 – rate = 2 is often acceptable

[Figure: a function $f$ on a subbox $X \subset X^0 \subset \mathbb{R}^n$ with its relaxations $f^u$ and $f^o$, the exact image $f(X)$, and interval bounds $H_f(X)$]

Cluster effect: Du & Kearfott JOGO (1994), Wechsung, Schaber & Barton JOGO (2014)
Convergence rate: Bompadre & Mitsos JOGO (2012), Najman & Mitsos JOGO (2016)
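The following short numerical experiment (an illustrative example, not from the lecture) makes these definitions tangible: for $f(x) = x - x^2$ on shrinking boxes around $x = 0.5$, the natural interval extension shows first-order behavior, while the αBB relaxation with the exact $\alpha = 1$ (since $f'' = -2$) shows second-order pointwise behavior.

```python
import numpy as np

def gaps(h):
    a, b = 0.5 - h, 0.5 + h
    xs = np.linspace(a, b, 2001)
    f = xs - xs**2
    nat_lb = a - b**2                         # natural interval extension: [a,b] - [a^2,b^2]
    abb = f + 1.0 * (xs - a) * (xs - b)       # alphaBB underestimator with alpha = 1
    return f.min() - nat_lb, np.max(f - abb)  # Hausdorff gap (interval), pointwise gap (alphaBB)

for h in (0.1, 0.05, 0.025, 0.0125):
    g_nat, g_abb = gaps(h)
    print(f"h={h:7.4f}  interval gap={g_nat:.5f}  alphaBB gap={g_abb:.6f}")
# halving h roughly halves the interval gap (order 1) and quarters the alphaBB gap (order 2)
```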
Convergence Rate of Relaxations: Properties (1)

• Two convergence orders: pointwise $\gamma$, Hausdorff $\beta$
• $\gamma \le \beta$
• Envelopes of smooth functions have $\gamma = 2$
• $\gamma > 2$ is not possible for nonlinear functions
• Natural interval extensions: $\beta = 1$
 – other interval extensions: $\beta = 2$
• αBB: $\gamma = 2$, even for fixed $\alpha$
 – αBB works only for smooth functions
• McCormick: $\gamma = 2$
 – under mild assumptions
• Relative tightness of different relaxations depends on the width of the intervals

[Figure: relaxations of $\exp((1+x)(1-x)) = \exp(1-x^2)$; red: original function, green: original McCormick relaxations with a bad decomposition, blue: αBB relaxations with optimized $\alpha$]

Convergence rate of McCormick relaxations: Bompadre & Mitsos JOGO (2012)


Convergence rate of multivariate McCormick relaxations: Najman & Mitsos JOGO (2016)
Check Yourself

• What does convergence of relaxations mean? How do we measure the convergence? What convergence
properties are established for standard relaxations?

Applied Numerical Optimization
Prof. Alexander Mitsos, Ph.D.

Deterministic global solvers


Global Solution for Nonconvex Problems

• Many engineering/design problems are nonconvex

• The global solution is in principle always desired
 – Sometimes required
 – Sometimes too expensive
 – Sometimes no algorithms exist

[Figure: nonconvex objective with several suboptimal local minima and the global & local minimum]

• For nonconvex $\Omega$, even finding a feasible point $\boldsymbol{x} \in \Omega$ is a global optimization problem!

Lower Bounds for B&B in Nonconvex Nonlinear Case

• Local methods provide the global solution to convex optimization problems
 – In theory, under suitable assumptions
 – In practice there are complications

• Finite bounds required for all variables: $\boldsymbol{x}^L \le \boldsymbol{x} \le \boldsymbol{x}^U$

• Construct simple underestimators $f^u$ of $f$ on $[\boldsymbol{x}^L, \boldsymbol{x}^U]$
 – $f^u(\boldsymbol{x}) \le f(\boldsymbol{x})$ for all $\boldsymbol{x} \in [\boldsymbol{x}^L, \boldsymbol{x}^U]$
 – $f^u$: constant, linear, piecewise linear, convex nonlinear
 – Required: convergence to $f$ as $\boldsymbol{x}^U - \boldsymbol{x}^L \to \boldsymbol{0}$
 – Desired: tight relaxations and a fast convergence rate
 – Active research area

• Treat nonconvex constraints similarly to the objective
 – Rewrite equalities as pairs of inequalities:
   $c_i(\boldsymbol{x}, \boldsymbol{y}) = 0$ becomes $c_i(\boldsymbol{x}, \boldsymbol{y}) \le 0$ and $-c_i(\boldsymbol{x}, \boldsymbol{y}) \le 0$
 – Relax inequalities $c_i(\boldsymbol{x}, \boldsymbol{y}) \le 0$ by underestimating $c_i$
 – Relax inequalities $c_i(\boldsymbol{x}, \boldsymbol{y}) \ge 0$ by overestimating $c_i$

Upper Bounds for B&B in Nonconvex Nonlinear Case

• Any feasible point of the NLP suffices as an upper bound
 – Better upper bounds give faster convergence
 – Local solution points are desirable
 – Convergence of the upper bound is required

• Typically, nonconvex restrictions are solved locally
 – Restrict continuous variables to smaller ranges
 – Convergence is not trivially satisfied

• Typically the upper bound converges more quickly than the lower bound
 – Proving global optimality is more expensive than finding the global optimum!
 – Not always true, e.g., for semi-infinite and bilevel optimization

Complications for B&B in Nonconvex Continuous Case

• Branching on continuous variables:
  $x_i \in [x_i^L, x_i^U]$ branched to $x_i \in [x_i^L, x_i^M]$ and $x_i \in [x_i^M, x_i^U]$

• Infinite sequences imply that convergence is nontrivial
 – Convergence always needs to be proven; it is easy to write non-convergent algorithms

• Convergence in the limit to an optimal solution point

• For any user-defined precision $\varepsilon_f$, finite termination with
 – $\bar{\boldsymbol{x}} \in \Omega$
 – $LBD \le f(\boldsymbol{x}^\star) = f^\star \le f(\bar{\boldsymbol{x}})$
 – $UBD = f(\bar{\boldsymbol{x}}) \le LBD + \varepsilon_f$
 – $LBD$ is a certificate of optimality
 – We do not find $\boldsymbol{x}^\star$ or $f^\star$; we bound $f^\star$

How to Solve Mixed-Integer Nonlinear Programs (MINLP)?

• MINLPs combine the difficulties of NLPs and integrality:

$\min_{\boldsymbol{x}, \boldsymbol{y}} f(\boldsymbol{x}, \boldsymbol{y})$
s.t. $c_i(\boldsymbol{x}, \boldsymbol{y}) = 0, \ \forall i \in E$
$c_i(\boldsymbol{x}, \boldsymbol{y}) \le 0, \ \forall i \in I$
$\boldsymbol{x} \in \mathbb{R}^{n_x}$, continuous
$\boldsymbol{y} \in Y$, discrete (e.g. $\boldsymbol{y} \in \{0,1\}^{n_y}$)

• Branch-and-bound (and do the right thing) is the standard method
 – Simple idea: B&B on $\boldsymbol{y}$, globally solve the NLP at each node
 – State of the art: B&B simultaneously on $\boldsymbol{x}$ and $\boldsymbol{y}$, relax nonconvex terms and integrality constraints

• Other global algorithms exist: outer approximation, generalized branch-and-cut, …

• Local solution methods for MINLP exist

Selection of Available Deterministic Global Optimization Solvers

• ANTIGONE (Algorithms for coNTinuous / Integer Global Optimization of Nonlinear Equations)
 – commercial, developed by Misener in the Floudas group
 – decomposition of nonconvex constraints and relaxation, auxiliary variables
• BARON (Branch-And-Reduce Optimization Navigator)
 – commercial, developed by the Sahinidis group
 – auxiliary variables, first accessible solver
• COUENNE (Convex Over and Under ENvelopes for Nonlinear Estimation)
 – COIN-OR, open source
• EAGO (Easy Advanced Global Optimization)
 – open source, by the Stuber group
 – part of the JuMP ecosystem, McCormick relaxations
• LINDO Global
 – commercial
• MAiNGO (McCormick-based Algorithm for mixed-integer Nonlinear Global Optimization)
 – open source, developed by AVT.SVT
 – multivariate McCormick relaxations, reduced-space formulations, parallelization
• SCIP (Solving Constraint Integer Programs)
 – free for academic use
 – developed by Vigerske and Gleixner

Check Yourself

• What are the implications of a nonconvex objective function?

• What are the implications of a nonconvex feasible set?

• Basic assumptions and guarantees of deterministic global algorithms.

Applied Numerical Optimization
Prof. Alexander Mitsos, Ph.D.

Reduced space for global optimization


Reduced Space vs Full Space Formulation

Full Space (FS): $\boldsymbol{x}$ degrees of freedom, $\boldsymbol{z}$ state variables

$\min_{\boldsymbol{x},\boldsymbol{z}} f(\boldsymbol{x}, \boldsymbol{z})$ s.t. $\boldsymbol{c}_I(\boldsymbol{x}, \boldsymbol{z}) \le \boldsymbol{0}$, $\boldsymbol{c}_E(\boldsymbol{x}, \boldsymbol{z}) = \boldsymbol{0}$

Total dimension: $\dim(\boldsymbol{x}) + \dim(\boldsymbol{z})$, with $\dim(\boldsymbol{x}) \ll \dim(\boldsymbol{z})$

Reduced Space (RS): solve $\boldsymbol{c}_E(\boldsymbol{x}, \boldsymbol{z}) = \boldsymbol{0}$ for $\boldsymbol{z}$, giving

$\min_{\boldsymbol{x}} \tilde{f}(\boldsymbol{x})$ s.t. $\tilde{\boldsymbol{c}}_I(\boldsymbol{x}) \le \boldsymbol{0}$

Total dimension: $\dim(\boldsymbol{x})$

• Example NLP (heat-conduction parameter estimation), full space:

$\min_{k, T_i} \sum_{i=0}^{n+1} (T_i^m - T_i)^2$
s.t. $\dfrac{T_{i-1} - 2T_i + T_{i+1}}{\Delta x^2} = -\dfrac{q_i^0 + q_i^1 T_i}{k}, \quad i \in \{1, \dots, n\}$
$T_0 = 500, \quad T_{n+1} = 600, \quad k \in [0.1, 10], \quad T_i \in [0, 2000]$

Dimension: $\dim(k) + \dim(\mathbf{T})$, with $1 = \dim(k) \ll \dim(\mathbf{T}) = 99$

• Reduced space: the discretized energy balance is linear in $\mathbf{T}$ (a tridiagonal system), so it can be solved for $\mathbf{T}$ given $k$, i.e. $T_i = f_i(k)$:

$\min_{k} \sum_{i=0}^{n+1} (T_i^m - f_i(k))^2$
s.t. $T_0 = 500, \quad T_{n+1} = 600, \quad k \in [0.1, 10]$

Only $\dim(k) = 1$ optimization variable remains.

[7] Epperly & Pistikopoulos, JOGO, 11(3), 287-311 (1997)
[8] Byrne & Bogle, Ind. Eng. Chem. Res., 39(11), 4296-4301 (2000)
[9] Mitsos, Chachuat & Barton, SIOPT, 20(2), 573-601 (2009)
[10] Bongartz & Mitsos, JOGO, 69(4), 761-796 (2017)
[11] Bongartz & Mitsos, JOGO, 69(4), 761-796 (2018)
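A Python sketch of the reduced-space idea for this example: for a given $k$, the discretized energy balance is a tridiagonal linear system that is solved for $\mathbf{T}$, so only $k$ remains as an optimization variable. The source-term coefficients and the synthetic measurements below are hypothetical values chosen for illustration, not the ones used in the lecture.

```python
import numpy as np

n = 99                                   # interior grid points, dim(T) = 99
dx = 1.0 / (n + 1)
q0 = np.full(n, 1.0e4)                   # hypothetical source-term coefficients q_i^0
q1 = np.full(n, 0.5)                     # hypothetical source-term coefficients q_i^1
T_left, T_right = 500.0, 600.0

def solve_state(k):
    """Solve (T_{i-1} - 2 T_i + T_{i+1}) / dx^2 = -(q_i^0 + q_i^1 T_i) / k for the interior T."""
    main = -2.0 + dx**2 * q1 / k         # diagonal of the tridiagonal matrix
    A = np.diag(main) + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
    b = -dx**2 * q0 / k
    b[0] -= T_left                       # boundary conditions enter the right-hand side
    b[-1] -= T_right
    T = np.linalg.solve(A, b)
    return np.concatenate(([T_left], T, [T_right]))

# synthetic measurements from a "true" k, then a coarse scan over the single degree of freedom
T_meas = solve_state(2.5)
ks = np.linspace(0.1, 10.0, 200)
sse = [np.sum((T_meas - solve_state(k))**2) for k in ks]
print("best k on the grid:", ks[int(np.argmin(sse))])   # close to 2.5
```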
Implementation Structure of Global Solver MAiNGO

[Figure: implementation structure of MAiNGO (open source). A branch & bound core manages the nodes $X^k$ and exchanges $\boldsymbol{x}_{LBD}^k, LBD^k$ and $\boldsymbol{x}_{UBD}^k, UBD^k$ with a lower bounding wrapper and an upper bounding wrapper. The lower bounding wrapper passes (MI)LP relaxations to LP/MILP solvers (CLP [3], CPLEX [4]); the upper bounding wrapper passes local NLP subproblems to IPOPT [5], NLOPT [6], or KNITRO [7]. The model is stored as a DAG of $f, \mathbf{g}, \mathbf{h}$; MC++ [1] supplies the relaxations $f^{cv}, \mathbf{g}^{cv}, \mathbf{h}^{cv/cc}$ and their subgradients $\nabla_s$, and FADBAD++ [2] supplies the derivatives $\nabla f, \nabla \mathbf{g}, \nabla \mathbf{h}$ and $\nabla^2 f, \nabla^2 \mathbf{g}, \nabla^2 \mathbf{h}$.]

[1] Chachuat et al., IFAC-PapersOnLine 48(8) (2015)
[2] Bendtsen & Stauning, FADBAD++ v2.1 (2012)
[3] COIN-OR CLP v1.17 (2019)
[4] IBM ILOG CPLEX v12.8 (2017)
[5] Wächter & Biegler, Mathematical Programming 106(1) (2006)
[6] Johnson, The NLopt nonlinear-optimization package
[7] Artelys Knitro v11.1.0 (2018)

https://git.rwth-aachen.de/avt.svt/public/maingo
Check Yourself

• What is the benefit of using the reduced space instead of full space?

Applied Numerical Optimization
Prof. Alexander Mitsos, Ph.D.

Basics of stochastic global optimization


“Black-box” Optimization (alternative meanings exist)

• Only numerical evaluations of functions, no gradients (zero-order oracle):

• Typical formulation

$\min_{\boldsymbol{x} \in \Omega} f(\boldsymbol{x}), \qquad \Omega = \{\boldsymbol{x} \in \mathbb{R}^n \mid c_i(\boldsymbol{x}) \le 0,\ i \in I,\ \boldsymbol{x}^L \le \boldsymbol{x} \le \boldsymbol{x}^U\}$

• Basic idea

[Figure: after initialization with algorithmic parameters, the optimization algorithm repeatedly sends an input $\boldsymbol{x}$ with $\boldsymbol{x}^L \le \boldsymbol{x} \le \boldsymbol{x}^U$ to the black-box evaluation of the objective function, receives $f(\boldsymbol{x})$ and the constraint check $\boldsymbol{c}(\boldsymbol{x}) \le \boldsymbol{0}$?, and finally outputs the best solution found]
Stochastic Global Optimization

• General idea: sample the space.


 – Simple and sophisticated approaches
 – Tradeoff: exploitation vs. exploration
 – Fundamental problem: the host set has infinite cardinality

• Promise: avoid getting trapped in a suboptimal local solution point

 – Does not avoid exponential complexity
 – As the # of function evaluations → ∞, the probability of finding a global minimum → 1

• Advantages: robust, no derivatives required, easy to implement and parallelize, efficient parallelization

• Drawbacks: slower than gradient-based local methods, no rigorous termination criteria, no guarantee to finitely
find global optimum/feasible point, no certificate of optimality

• Hybrid methods: combine stochastic with deterministic local solver

• Many methods exist. We describe the basics of popular ideas

No Free-Lunch Theorem

• No free lunch in everyday life: it is impossible to get something for nothing.
• No free lunch in economics: one cannot make a profit without capital and risk of loss.
• No free lunch in stochastic global optimization: any elevated performance of an algorithm for one class of problems is offset by worse performance for another class.
 – True also compared to random search

• Important consequences
 – Comparisons are difficult
 – If possible, tune your algorithm to your problems.
 – If you have no knowledge about your problem, try many algorithms.

Wolpert, David H., and William G. Macready. "No free lunch theorems for optimization." IEEE Transactions on Evolutionary Computation 1.1 (1997): 67-82.
Random Search

• Random search:
 – starting from an initial point $\boldsymbol{x}^{(0)}$
 – randomly choose a new iterate $\boldsymbol{x}^{(k+1)}$
 – compare $f(\boldsymbol{x}^{(k+1)})$ with the best value found $f^*$, and update if applicable

+ very easy to implement, no special requirements on the objective function

− requires many function evaluations and provides no guarantee of (finite) convergence
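A minimal random-search sketch in Python; the Rastrigin test function and the iteration budget are illustrative choices.

```python
import numpy as np

def random_search(f, xL, xU, n_iter=10000, seed=0):
    """Pure random search within box bounds (a minimal sketch)."""
    rng = np.random.default_rng(seed)
    x_best = rng.uniform(xL, xU)
    f_best = f(x_best)
    for _ in range(n_iter):
        x = rng.uniform(xL, xU)           # randomly choose a new iterate
        fx = f(x)
        if fx < f_best:                   # keep the best value found so far
            x_best, f_best = x, fx
    return x_best, f_best

# illustrative multimodal test function (not from the lecture): Rastrigin
f = lambda x: np.sum(x**2 - 10.0 * np.cos(2.0 * np.pi * x) + 10.0)
print(random_search(f, np.full(2, -5.12), np.full(2, 5.12)))
```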

Multistart as a Heuristic for Global Solution of NLPs

• General idea: start local solvers from many initial guesses
 – hope: some will converge to the global minimum
 – by construction a hybrid method

• # initial guesses?
 – Theory: we would like to cover the space, but this scales exponentially with the number of variables
 – In practice: determined by how long you are willing to wait

• Various possibilities to pick initial guesses: grid, Latin hypercube, random, physical insight
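A minimal multistart sketch using random initial guesses and a local solver from SciPy (assumed available); it records the whole pool of local solutions for later examination, as recommended on the next slide.

```python
import numpy as np
from scipy.optimize import minimize

def multistart(f, xL, xU, n_starts=50, seed=0):
    """Multistart heuristic: local solves from random initial guesses, keep the pool."""
    rng = np.random.default_rng(seed)
    pool = []
    for _ in range(n_starts):
        x0 = rng.uniform(xL, xU)                          # random initial guess in the box
        res = minimize(f, x0, method="L-BFGS-B", bounds=list(zip(xL, xU)))
        pool.append((res.fun, res.x))                     # record every local solution
    pool.sort(key=lambda t: t[0])
    return pool                                           # examine the pool, best first

f = lambda x: np.sum(x**2 - 10.0 * np.cos(2.0 * np.pi * x) + 10.0)   # Rastrigin (illustrative)
pool = multistart(f, np.full(2, -5.12), np.full(2, 5.12))
print("best local minimum found:", pool[0][0])
print("some distinct objective values in the pool:", sorted({round(v, 3) for v, _ in pool})[:5])
```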

Practical Recommendations for Multistart

• Think about the problem, try with deterministic solvers

• Parallelize
 – No communication between instances required → submit as separate processes
 – Instances may take long without progress → limit the CPU time for each

• Try different solvers simultaneously
 – Possibly repeat the same initial guesses with different algorithms
 – Vary solver options for different runs

• Try different formulations

• Record the points visited by the local solvers to avoid problems with convergence

• Examine the pool of solutions
 – Do we have multiple points at the suspected global solution?
 – Are some runs better (e.g. one solver vs. another)?

Check Yourself

• Describe random search

• Describe multistart.

• What is the basic idea of stochastic global algorithms? What are their properties, advantages and
disadvantages?

Applied Numerical Optimization
Prof. Alexander Mitsos, Ph.D.

Stochastic global optimization: Genetic Algorithm


Genetic Algorithm: Basic Idea

• Based on simplistic biological principle: survival of the fittest


• Start with an initial population (= initial guesses)
 – At each iteration the population size remains fixed

• Accept survivors based on a merit function and distance from previous members
 – Merit function: tradeoff of objective and constraints

• Generate new members by mutation: perturb entries randomly
 – Move around in the space, ensures local optimization

• Generate new members by crossing (recombination): a child inherits some entries from its parents
 – Move far away, avoiding suboptimal points

[Figure: example population of real/integer-coded members, e.g. (1.0, 1, 3.2, 1.1), with a mutated member (0.9, 1, 2.8, 1.8) and a crossover child (1.0, 1, 3.0, 0.8) inheriting entries from two parents]
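A minimal GA sketch for a box-constrained problem; selection here is by objective value only (the distance criterion mentioned above is omitted for brevity), and the population size, mutation strength, and test function are illustrative choices.

```python
import numpy as np

def genetic_algorithm(f, xL, xU, pop_size=40, n_gen=200, seed=0):
    """Minimal GA sketch: survival of the fittest with mutation and crossover."""
    rng = np.random.default_rng(seed)
    n = len(xL)
    pop = rng.uniform(xL, xU, size=(pop_size, n))             # initial population
    for _ in range(n_gen):
        fitness = np.array([f(x) for x in pop])
        order = np.argsort(fitness)
        survivors = pop[order[: pop_size // 2]]               # accept survivors by merit
        children = []
        while len(children) < pop_size - len(survivors):
            pa, pb = survivors[rng.integers(len(survivors), size=2)]
            mask = rng.random(n) < 0.5
            child = np.where(mask, pa, pb)                    # crossover: inherit entries from parents
            child = child + rng.normal(0.0, 0.1, n) * (xU - xL)   # mutation: perturb entries randomly
            children.append(np.clip(child, xL, xU))
        pop = np.vstack([survivors, children])
    fitness = np.array([f(x) for x in pop])
    return pop[np.argmin(fitness)], fitness.min()

f = lambda x: np.sum(x**2 - 10.0 * np.cos(2.0 * np.pi * x) + 10.0)   # Rastrigin (illustrative)
print(genetic_algorithm(f, np.full(2, -5.12), np.full(2, 5.12)))
```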

Genetic Algorithm: Practical Recommendations

• See practical recommendations for multistart algorithms
 – Parallelize by MPI or even a shell script: a manager process for the algorithm, worker processes for function evaluations

• Hybridize with a deterministic local solver: run the local solver for promising points

• Termination criteria: # iterations, small improvement in the objective function

• Plethora of variants → picking the best algorithm/solver is hard. Alternatives:
 – Take an existing solver and tune it. Advantage: no need to reinvent the wheel, easy start.
 – Implement a basic solver and tune it to the problem. Advantage: fewer problems with compatibility (OS, language, license, …), you know the pitfalls, and you can tailor the code to your needs

• Visits many points → can be used to generate a pool of solutions
 – All algorithms visit many points; GA is hopefully qualitatively different

• Can be easily extended to multiobjective optimization

Check Yourself

• Describe genetic algorithm.

• What is the basic idea of stochastic global algorithms? What are their properties, advantages and
disadvantages?

Applied Numerical Optimization
Prof. Alexander Mitsos, Ph.D.

Derivative free optimization


“Black-box” Optimization (alternative meanings exist)

• Only numerical evaluations of functions, no gradients (zero-order oracle):

• Typical formulation

$\min_{\boldsymbol{x} \in \Omega} f(\boldsymbol{x}), \qquad \Omega = \{\boldsymbol{x} \in \mathbb{R}^n \mid c_i(\boldsymbol{x}) \le 0,\ i \in I,\ \boldsymbol{x}^L \le \boldsymbol{x} \le \boldsymbol{x}^U\}$

• Basic idea

[Figure: after initialization with algorithmic parameters, the optimization algorithm repeatedly sends an input $\boldsymbol{x}$ with $\boldsymbol{x}^L \le \boldsymbol{x} \le \boldsymbol{x}^U$ to the black-box evaluation of the objective function, receives $f(\boldsymbol{x})$ and the constraint check $\boldsymbol{c}(\boldsymbol{x}) \le \boldsymbol{0}$?, and finally outputs the best solution found]
Derivative-free Optimization

• Gradient information may be expensive or unavailable, e.g.
 – simulation optimization
 – external functions, compiled legacy software

• Gradient evaluation by finite differences is prone to errors due to inaccurate function evaluations

• Methods determine the new iterate from previous function evaluations

• Non-smoothness does not pose a fundamental problem

Gradient-free Search Methods: Simplex Search

1. choose an initial point 𝒙(0) and a 𝛿 > 0


2. construct an 𝑛-dimensional simplex with edges of length 𝛿, containing 𝒙(𝑘) as a vertex
3. evaluate 𝑓 at each vertex
4. reflect the vertex with the highest value of $f$ through the opposite edge (face), thus preserving the geometrical shape, and define $\boldsymbol{x}^{(k+1)}$.
5. If the procedure does not result in an improvement (close to minimum), reduce the length of the edges and
start a new iteration

[Figure: 2-D illustration of simplex search with edge length $\delta$: starting from the simplex around $\boldsymbol{x}^{(k)}$, the vertex with the highest function value is reflected repeatedly, producing $\boldsymbol{x}^{(k+1)}$ and the two candidate simplices for $\boldsymbol{x}^{(k+2)}$, moving toward decreasing function values]
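A Python sketch of this fixed-shape simplex search (reflect the worst vertex, shrink when reflection brings no improvement); note this is the simple variant described above, not the adaptive Nelder-Mead method. The test function and parameters are illustrative.

```python
import numpy as np

def simplex_search(f, x0, delta=0.5, tol=1e-6, max_iter=2000):
    """Fixed-shape simplex search: reflect the worst vertex, shrink on failure (a sketch)."""
    n = len(x0)
    # initial simplex: x0 plus delta-steps along each coordinate direction
    verts = np.vstack([x0] + [x0 + delta * np.eye(n)[i] for i in range(n)])
    fvals = np.array([f(v) for v in verts])
    for _ in range(max_iter):
        worst = np.argmax(fvals)
        centroid = (verts.sum(axis=0) - verts[worst]) / n        # centroid of the opposite face
        reflected = 2.0 * centroid - verts[worst]                # reflect the worst vertex
        f_ref = f(reflected)
        if f_ref < fvals[worst]:                                 # improvement: accept reflection
            verts[worst], fvals[worst] = reflected, f_ref
        else:                                                    # no improvement: shrink toward best
            best = np.argmin(fvals)
            verts = verts[best] + 0.5 * (verts - verts[best])
            fvals = np.array([f(v) for v in verts])
        if np.max(fvals) - np.min(fvals) < tol:
            break
    best = np.argmin(fvals)
    return verts[best], fvals[best]

f = lambda x: (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 0.5) ** 2       # smooth convex test function
print(simplex_search(f, np.array([3.0, 2.0])))                   # approaches the minimizer near (1, -0.5)
```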

Gradient-free Search Methods: Univariate Search (Coordinate Descent)

Search along the coordinates of the problem.

Basic algorithm:
• choose sequentially component 𝑖 of 𝒙(𝑘) and descend in this direction
• after 𝑛 steps start from the beginning or reverse the sequence

Himmelblau et al., p. 185
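A minimal coordinate-descent sketch; each one-dimensional subproblem is solved here with SciPy's scalar minimizer (assumed available), and the test function is an illustrative choice.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def coordinate_descent(f, x0, n_sweeps=50):
    """Univariate search: cycle through the coordinates, minimizing along one at a time."""
    x = np.array(x0, dtype=float)
    for _ in range(n_sweeps):
        for i in range(len(x)):
            def line(t, i=i):                       # f restricted to coordinate i
                xt = x.copy()
                xt[i] = t
                return f(xt)
            x[i] = minimize_scalar(line).x          # descend along coordinate i
    return x, f(x)

f = lambda x: (x[0] - 1.0) ** 2 + 5.0 * (x[1] + 2.0) ** 2 + 0.5 * x[0] * x[1]   # illustrative
print(coordinate_descent(f, [0.0, 0.0]))            # converges to the unconstrained minimizer
```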

Check Yourself

• Describe the derivative-free methods.

