Lipschitzian Optimization, DIRECT Algorithm, and Applications
Yves Brise
April 1, 2008
Outline
1 Lipschitzian Optimization
2 DIRECT Algorithm
3 Applications
Function Optimization
Problem
For a function f : D ⊆ R^d → R, find
    min_{x ∈ D} f(x).
Simple Bounds
We will mostly assume l_i ≤ x_i ≤ u_i for all i ∈ [d], i.e. every variable x_i
has a lower bound l_i and an upper bound u_i. This means D is
a hyperrectangle.
Taxonomy of Methods
Lipschitzian Optimization
Shubert (1972)
“A Sequential Method Seeking the Global Maximum of a Function”
Definition
A function f : D ⊆ R^d → R is called Lipschitz-continuous if there
exists a positive constant K ∈ R+ such that
    |f(x) − f(x′)| ≤ K |x − x′|,  ∀x, x′ ∈ D.
Problem
We consider the following minimization problem:
    min_{x ∈ D} f(x).
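To make the definition concrete, here is a small Python sketch (my own illustration, not from the slides) that lower-bounds the best Lipschitz constant K of a 1D function by sampling finite-difference slopes:

```python
def estimate_lipschitz(f, a, b, n=1000):
    """Lower-bound the smallest valid K by the largest sampled slope."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    ys = [f(x) for x in xs]
    return max(abs(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
               for i in range(n))

# For f(x) = x^2 on [0, 2], |f'(x)| <= 4, so the estimate approaches 4.
K = estimate_lipschitz(lambda x: x * x, 0.0, 2.0)
```

Note that sampling can only under-estimate K; any constant valid in the definition must be at least this value.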
Shubert’s Algorithm in 1D
If we substitute a and b for x′ into the definition of Lipschitz continuity,
we get the following two conditions for f(x), where x ∈ [a, b]:
    f(x) ≥ f(a) − K(x − a),
    f(x) ≥ f(b) + K(x − b).
The two bounding lines intersect at the point X(a, b, f, K), where the
lower bound attains its minimum value B(a, b, f, K):
    X(a, b, f, K) = (a + b)/2 + (f(a) − f(b))/(2K),
    B(a, b, f, K) = (f(a) + f(b))/2 − K(b − a)/2.
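The two formulas can be sketched directly (a minimal illustration of one Shubert step, not his full interval bookkeeping):

```python
# X: where the bounding lines from a and b intersect (the next sample
# point); B: the lower bound on f attained there.
def shubert_split(a, b, fa, fb, K):
    X = (a + b) / 2 + (fa - fb) / (2 * K)
    B = (fa + fb) / 2 - K * (b - a) / 2
    return X, B

# Symmetric endpoints f(a) = f(b) = 1 on [0, 1] with K = 2: the lines
# meet at the midpoint, with lower bound 1 - K/2 = 0.
X, B = shubert_split(0.0, 1.0, 1.0, 1.0, K=2.0)
```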
Pros
+ Global search possible
+ Deterministic, no need for multiple runs
+ Few parameters apart from K, no need for fine-tuning
+ K gives bound on error, no need to rely on arbitrary stopping
criteria such as the number of iterations
Outline
1 Lipschitzian Optimization
2 DIRECT Algorithm
3 Applications
Problem 1: Specifying K
K might not be easily accessible. DIRECT needs no prior knowledge of K
and instead considers all possible constants at once. Sounds terrific, but how...
DIRECT in 1D
Division of Intervals
When dividing the search space we have to make sure that previous
function evaluations are not lost, i.e. they are still at the center of
some interval.
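The center-preserving division can be sketched as follows (an illustration with names of my choosing): trisecting an interval keeps the old center point as the exact center of the middle third, so its function evaluation is reused.

```python
# Trisect [a, b]; the old center (a + b)/2 becomes the center of the
# middle third, so no previous evaluation is lost.
def trisect(a, b):
    w = (b - a) / 3
    return [(a, a + w), (a + w, a + 2 * w), (a + 2 * w, b)]

parts = trisect(0.0, 3.0)
# Old center 1.5 is the center of the middle third (1.0, 2.0).
```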
Lipschitz Bound
f(x) ≥ f(c) + K(x − c) for x ≤ c,
f(x) ≥ f(c) − K(x − c) for x ≥ c.
Definition
An interval j ∈ S with center c_j and half-length d_j is called potentially
optimal if there exists some constant K̃ ≥ 0 such that the following
conditions hold, where ε ≥ 0:
    f(c_j) − K̃ d_j ≤ f(c_i) − K̃ d_i  for all i ∈ S,
    f(c_j) − K̃ d_j ≤ f_min − ε |f_min|.
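A minimal sketch of the standard potentially-optimal test from Jones et al. (function name and interval representation are my own): some K̃ ≥ 0 must make interval j's lower bound best among all intervals and a sufficient improvement over fmin.

```python
def potentially_optimal(j, intervals, fmin, eps=1e-4):
    """intervals: list of (d, fc) pairs; d = half-length, fc = f(center)."""
    dj, fj = intervals[j]
    # An equally sized interval with a lower value beats j for every K~.
    if any(d == dj and fc < fj for d, fc in intervals):
        return False
    # K~ must be at least the steepest slope to every smaller interval ...
    lo = max(((fj - fc) / (dj - d) for d, fc in intervals if d < dj),
             default=0.0)
    lo = max(lo, 0.0)
    # ... and at most the shallowest slope to every larger interval.
    hi = min(((fc - fj) / (d - dj) for d, fc in intervals if d > dj),
             default=float("inf"))
    if lo > hi:
        return False
    if hi == float("inf"):
        return dj > 0  # any sufficiently large K~ works
    # The bound f(c_j) - K~ d_j is lowest at K~ = hi; check the
    # improvement condition there.
    return fj - hi * dj <= fmin - eps * abs(fmin)

iv = [(0.5, 1.0), (0.25, 0.5), (0.25, 2.0)]  # (half-length, center value)
flags = [potentially_optimal(j, iv, fmin=0.5) for j in range(3)]
```

Here the large interval and the better of the two small intervals pass; the worse small interval cannot pass for any K̃.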
Summary
DIRECT in 1D
Input: a, b ∈ R, f(·), ε ≥ 0
Output: f_min
    Initialize;
    repeat
        Identify set S of potentially optimal intervals;
        for s ∈ S do
            Evaluate new center points and subdivide s;
    until too many iterations;
    return f_min;
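The loop above can be sketched end-to-end as a self-contained toy. A deliberately simplified selection rule stands in for the full potentially-optimal test (per distinct interval length, the interval with the lowest center value is subdivided); all names and defaults are mine.

```python
def direct_1d(f, a, b, iters=20):
    """Toy 1D DIRECT: intervals keyed by (lo, hi), storing f at the center."""
    ivals = {(a, b): f((a + b) / 2)}
    for _ in range(iters):
        # Pick one candidate per length class (a crude stand-in for the
        # potentially-optimal test).
        best = {}
        for (lo, hi), fc in ivals.items():
            w = round(hi - lo, 12)
            if w not in best or fc < ivals[best[w]]:
                best[w] = (lo, hi)
        # Trisect each candidate; the middle third reuses the old center
        # point (re-evaluated here only for simplicity).
        for lo, hi in best.values():
            w = (hi - lo) / 3
            del ivals[(lo, hi)]
            for s in (lo, lo + w, lo + 2 * w):
                ivals[(s, s + w)] = f(s + w / 2)
    return min(ivals.values())

fmin = direct_1d(lambda x: (x - 0.7) ** 2, 0.0, 1.0)
```

On this unimodal test function the greedy rule zooms in on the minimizer 0.7, driving fmin toward zero.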
Division of Hypercubes
This way the largest rectangles contain the best function values.
Division of Hyperrectangles
Convergence of DIRECT
Proof
Let D be the d-dimensional unit hypercube.
A rectangle R that has been involved in r divisions will have
j := r mod d sides of length 3^−(k+1) and d − j sides of length
3^−k, where k = (r − j)/d.
The radius of R is therefore sqrt(j · 3^−2(k+1) + (d − j) · 3^−2k)/2, which
goes to zero as r approaches infinity.
Let t ∈ N be the current iteration, and r_t ∈ N the fewest number
of divisions undergone by any rectangle.
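The shrinking-radius formula from the proof can be checked numerically (a quick sketch of my own):

```python
import math

def radius(r, d):
    """Half-diagonal of a rectangle after r divisions in d dimensions."""
    j = r % d
    k = (r - j) // d
    return math.sqrt(j * 3.0 ** (-2 * (k + 1)) + (d - j) * 3.0 ** (-2 * k)) / 2

# In d = 3, every full round of d divisions shrinks the radius by 1/3.
rs = [radius(r, 3) for r in range(0, 31, 3)]
```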
Convergence of DIRECT
Proof (cont’d)
Claim: lim_{t→∞} r_t = ∞.
Assume otherwise: there exists t′ after which r_t never changes, i.e.
lim_{t→∞} r_t = r_{t′}.
After iteration t′ there is a finite number of rectangles (say
N) of maximal size. The one with the lowest function value is
potentially optimal, and therefore subdivided.
This leaves only N − 1 rectangles of maximal size; after N − 1 further
iterations none remain, so r_t has increased by 1, a contradiction.
Convergence of DIRECT
Definition
The generalized directional derivative of f at x ∈ D in direction v is
    f°(x, v) := limsup_{y → x, y ∈ D; t ↓ 0, y + tv ∈ D} (f(y + tv) − f(y))/t.
    ε > 2α(R)K / ( |f(c(R))| (√(d + 8) − √d) ).
Outline
1 Lipschitzian Optimization
2 DIRECT Algorithm
3 Applications
Aircraft Routing
Component Design
Last Slide