NLP++
- Theory Manual -
GmbH
Fuerther Str. 212
D - 90429 Nuernberg
Contents
19. Example
20. MidacoWrapper
    20.1. Introduction
    20.2. Approaches for non-convex Mixed Integer Nonlinear Programs
    20.3. The Ant Colony Optimization framework
    20.4. Penalty method
        20.4.1. Robust oracle penalty method
        20.4.2. Oracle update rule
    20.5. Program Documentation
21. MipOptimizer
    21.1. Definition of Optimization Problems addressed
    21.2. General Notation for Algorithm
    21.3. MIPSQP: Algorithm for Solving Nonlinear Discrete Optimization Problems
    21.4. Program Documentation
23. Example
28. Example
Bibliography
1. General information about NLP++
The NLP++ Optimization Toolbox is a comprehensive C++ class library providing optimization
routines for a large variety of nonlinear, constrained optimization problems of the form

              min  f(x)
  x ∈ Rn :    gj(x) = 0 ,   j = 1, ..., me
              gj(x) ≥ 0 ,   j = me + 1, ..., m
              xl ≤ x ≤ xu

with an n-dimensional parameter vector x, me equality and m − me inequality constraints, and
lower and upper bounds xl and xu.
The formulation of the optimization problems in the following chapters might be different.
However, when an optimization problem for NLP++ is implemented, it has to be consistent with
the formulation above.
• NLP++ Kernel
  – SqpWrapper for sequential quadratic optimization
  – ScpWrapper for large scale nonlinear optimization using the optimization method sequential convex programming (SCP) or (optional) the method of moving asymptotes (MMA)
  – CobylaWrapper, a gradient-free optimization routine
  – MisqpWrapper for nonlinear optimization applying a sequential quadratic programming (SQP) method with trust region stabilization
  – QlWrapper for quadratic optimization with linear constraints
  – IpoptWrapper, an interface to the interior-point algorithm Ipopt for large scale optimization.
    Note: Only the interface is provided. Ipopt itself must be obtained separately.
  – ApproxGrad, a class for approximating gradients of objective and/or constraint functions
1.2. Program Components
Part I.
2. SqpWrapper - Sequential Quadratic Programming
The Fortran subroutine NLPQLP by Schittkowski solves smooth nonlinear programming problems and is an extension of the code NLPQL. This version is specifically tuned to run under distributed systems controlled by an input parameter l. In case of computational errors as for example caused by inaccurate function or gradient evaluations, a non-monotone line search is activated. Numerical results are included, which show that in case of noisy function values a drastic improvement of the performance is achieved compared to the version with monotone line search. The usage of the code is documented and illustrated by an example (chapter 10). Sections 2.1, 2.2, 2.3 and 2.6 are taken from Schittkowski [138].
2.1. Introduction
We consider the general optimization problem to minimize an objective function f under nonlinear equality and inequality constraints,

              min  f(x)
  x ∈ Rn :    gj(x) = 0 ,   j = 1, ..., me                          (2.1)
              gj(x) ≥ 0 ,   j = me + 1, ..., m
              xl ≤ x ≤ xu

where x is an n-dimensional parameter vector. It is assumed that all problem functions f(x) and gj(x), j = 1, ..., m, are continuously differentiable on the whole Rn.
Sequential quadratic programming is the standard general purpose method to solve smooth
nonlinear optimization problems, at least under the following assumptions:
problem collection of Bongartz et al. [13]. About 80 test problems based on a Finite Element
formulation are collected for a comparative evaluation in Schittkowski et al. [145]. A set of 1,000
least squares test problems, solved by an extension of the code NLPQL that retains typical features of a Gauss-Newton algorithm, is described in [128].
Moreover, there exist hundreds of commercial and academic applications of NLPQL, for example
1. mechanical structural optimization, see Schittkowski, Zillober, Zotemantel [145] and Kneppe, Krammer, Winkler [77],
2. data fitting and optimal control of transdermal pharmaceutical systems, see Boderke, Schittkowski, Wolf [10] or Blatt, Schittkowski [7],
3. computation of optimal feed rates for tubular reactors, see Birk, Liepelt, Schittkowski, and Vogel [5],
4. food drying in a convection oven, see Frias, Oliveira, and Schittkowski [49],
5. optimal design of horn radiators for satellite communication, see Hartwanger, Schittkowski, and Wolf [66],
7. optimal design of surface acoustic wave filters for signal processing, see Bünner, Schittkowski, and van de Braak [17].
2.2. Sequential Quadratic Programming Methods
is presented. The general idea goes back to Grippo, Lampariello, and Lucidi [57], and was
extended to constrained optimization and trust region methods in a series of subsequent papers,
see Bonnans et al. [14], Deng et al. [29], Grippo et al. [58, 59], Ke and Han [74], Ke et al. [75],
Lucidi et al. [87], Panier and Tits [101], Raydan [113], and Toint [157, 158]. However, there is
a basic difference in the methodology: Our goal is to allow monotone line searches as long as
they terminate successfully, and to apply a non-monotone one only in a special error situation.
In Section 2.2 we outline the general mathematical structure of an SQP algorithm, the non-monotone line search, and the modifications to run the code under distributed systems. Section 2.3 contains some numerical results obtained for a set of 306 standard test problems of the collections published in Hock and Schittkowski [69] and in Schittkowski [124]. They show the sensitivity of the new version with respect to the number of parallel machines and the influence of different gradient approximations under uncertainty. Moreover, we test the non-monotone line search versus the monotone one, and generate noisy test problems by adding random errors to function values and by inaccurate gradient approximations. This situation appears frequently in practical environments, where complex simulation codes prevent accurate responses and where gradients can only be computed by a difference formula. The usage of the code is documented in Section 2.4, and Section 2.5 contains an illustrative example.
              min  f(x)
  x ∈ Rn :    gj(x) = 0 ,   j = 1, ..., me                          (2.2)
              gj(x) ≥ 0 ,   j = me + 1, ..., m

It is assumed that all problem functions f(x) and gj(x), j = 1, ..., m, are continuously differentiable on Rn.
The basic idea is to formulate and solve a quadratic programming subproblem in each iteration which is obtained by linearizing the constraints and approximating the Lagrangian function

  L(x, u) := f(x) − Σ_{j=1}^{m} uj gj(x)                            (2.3)
              min  (1/2) dT Bk d + ∇f(xk)T d
  d ∈ Rn :    ∇gj(xk)T d + gj(xk) = 0 ,   j = 1, ..., me            (2.4)
              ∇gj(xk)T d + gj(xk) ≥ 0 ,   j = me + 1, ..., m
Let dk be the optimal solution and uk the corresponding multiplier of this subproblem. A new iterate is obtained by

  ( xk+1 ; vk+1 ) := ( xk ; vk ) + αk ( dk ; uk − vk )              (2.5)

where αk ∈ (0, 1] is a suitable steplength parameter.
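Written out componentwise, the block vector update (2.5) is a simultaneous step in the primal iterate and in the multiplier estimate. A minimal sketch, with illustrative names that are not part of NLP++:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One primal-dual update step as in (2.5):
//   x_{k+1} = x_k + alpha_k * d_k
//   v_{k+1} = v_k + alpha_k * (u_k - v_k)
// x, v are updated in place; d is the QP solution, u its multiplier.
void SqpUpdate( std::vector<double> & x, std::vector<double> & v,
                const std::vector<double> & d,
                const std::vector<double> & u, double alpha )
{
    for ( std::size_t i = 0; i < x.size(); i++ )
        x[i] += alpha * d[i];
    for ( std::size_t j = 0; j < v.size(); j++ )
        v[j] += alpha * ( u[j] - v[j] );
}
```

Note that for αk = 1 the multiplier estimate is simply replaced by the QP multiplier uk.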
Although we are able to guarantee that the matrix Bk is positive definite, it is possible that
(2.4) is not solvable due to inconsistent constraints. One possible remedy is to introduce an
additional variable δ ∈ R, leading to a modified quadratic programming problem, see Schitt-
kowski [123] for details.
The steplength parameter αk is required in (2.5) to enforce global convergence of the SQP method, i.e., the approximation of a point satisfying the necessary Karush-Kuhn-Tucker optimality conditions when starting from arbitrary initial values, typically a user-provided x0 ∈ Rn and v0 = 0, B0 = I. αk should satisfy at least a sufficient decrease condition of a merit function φr(α) given by

  φr(α) := ψr( ( x ; v ) + α ( d ; u − v ) )                        (2.6)
with a suitable penalty function ψr(x, v). Implemented is the augmented Lagrangian function

  ψr(x, v) := f(x) − Σ_{j∈J} ( vj gj(x) − (1/2) rj gj(x)² ) − (1/2) Σ_{j∈K} vj²/rj ,    (2.7)
desirable to waste too many function calls. Moreover, the behavior of the merit function becomes
irregular in case of constrained optimization because of very steep slopes at the border caused
by large penalty terms. Even the implementation is more complex than shown above, if linear
constraints and bounds of the variables are to be satisfied during the line search.
Usually, the steplength parameter αk is chosen to satisfy an Armijo [2] condition, i.e., a sufficient descent requirement of the form

  φr(σ β^i) ≤ φr(0) + σ β^i µ φ′r(0) ,                              (2.9)

see for example Ortega and Rheinboldt [100]. The constants are from the ranges 0 < µ < 0.5, 0 < β < 1, and 0 < σ ≤ 1. We start with i = 0 and increase i until (2.9) is satisfied for the first time, say at ik. Then the desired steplength is αk = σ β^{ik}.
Fortunately, SQP methods are quite robust and accept the steplength one in the neighborhood of a solution. Typically, the test parameter µ for the Armijo-type sufficient descent property (2.9) is very small. Nevertheless, the choice of the reduction parameter β must be adapted to the actual slope of the merit function. If β is too small, the line search terminates very fast, but the resulting stepsizes are usually too small, leading to a higher number of outer iterations. On the other hand, a value close to one requires too many function calls during the line search. Thus, we need some kind of compromise, which is obtained by first applying a polynomial interpolation, typically a quadratic one, and using (2.9) only as a stopping criterion. Since φr(0), φ′r(0), and φr(αi) are given, αi being the current iterate of the line search procedure, we easily get the minimizer of the quadratic interpolation. We then accept the maximum of this value and the Armijo parameter as a new iterate, as shown by the subsequent code fragment implemented in NLPQLP.
Algorithm 1
Let β, µ with 0 < β < 1, 0 < µ < 0.5 be given.
Start: α0 := 1
For i = 0, 1, 2, ... do:
1) If φr(αi) < φr(0) + µ αi φ′r(0), then stop.
2) Compute ᾱi := 0.5 αi² φ′r(0) / ( αi φ′r(0) − φr(αi) + φr(0) ).
3) Let αi+1 := max( β αi , ᾱi ).
Corresponding convergence results are found in Schittkowski [120]. ᾱi is the minimizer of the quadratic interpolation, and we use the Armijo descent property for checking termination. Step 3) is required to avoid irregular values, since the minimizer of the quadratic interpolation could be outside of the feasible domain (0, 1]. The search algorithm is implemented in NLPQLP together with additional safeguards, for example to prevent violation of bounds. Algorithm 1 assumes that φr(1) is known before calling the procedure, i.e., that the corresponding function values are given. We have to stop the algorithm if sufficient descent is not observed after a certain number of iterations, say 10. If the tested stepsize falls below machine precision or the accuracy by which model function values are computed, the merit function cannot decrease further.
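Algorithm 1 can be sketched in a few lines. The following is an illustrative reimplementation, not the NLPQLP source; the function name, default constants, and the iteration limit are assumptions:

```cpp
#include <algorithm>
#include <cassert>
#include <functional>

// Sketch of Algorithm 1: Armijo test combined with quadratic interpolation.
// phi is the merit function along the search direction, dphi0 = phi'(0) < 0
// its slope at alpha = 0. Names and default constants are illustrative.
double LineSearchAlg1( const std::function<double(double)> & phi,
                       double dphi0, double beta = 0.1,
                       double mu = 1.0e-4, int iMaxIter = 10 )
{
    const double phi0 = phi( 0.0 );
    double alpha = 1.0;                          // Start: alpha_0 := 1
    for ( int i = 0; i < iMaxIter; i++ )
    {
        const double phiA = phi( alpha );
        // 1) Armijo-type sufficient descent test
        if ( phiA < phi0 + mu * alpha * dphi0 )
            return alpha;
        // 2) minimizer of the quadratic interpolating phi(0), phi'(0), phi(alpha)
        const double alphaBar = 0.5 * alpha * alpha * dphi0
                              / ( alpha * dphi0 - phiA + phi0 );
        // 3) safeguard against irregular values outside (0, 1]
        alpha = std::max( beta * alpha, alphaBar );
    }
    return alpha;   // sufficient descent not observed: line search failure
}
```

Near a solution the full step α = 1 passes the Armijo test immediately, so the loop usually terminates in the first iteration.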
To outline the new approach, let us assume that functions can be computed simultaneously on l different machines. Then l test values αi = β^{i−1} with β = ε^{1/(l−1)} are selected, i = 1, ..., l, where ε is a guess for the machine precision. Next we require l parallel function calls to get the
corresponding model function values. The first αi satisfying a sufficient descent property (2.9),
say for i = ik , is accepted as the new steplength to set the subsequent iterate by αk := αik . One
has to be sure that existing convergence results of the SQP algorithm are not violated.
The proposed parallel line search will work efficiently, if the number of parallel machines l is
sufficiently large, and works as follows, where we omit the iteration index k.
Algorithm 2
Let β, µ with 0 < β < 1, 0 < µ < 0.5 be given.
Start: For αi = β^i compute φr(αi), i = 0, ..., l − 1.
For i = 0, 1, 2, ... do:
If φr(αi) < φr(0) + µ αi φ′r(0), then stop.
is satisfied, where p(k) is a predetermined parameter with p(k) = min{k, p}, p a given tolerance. Thus, we allow an increase of the reference value φrjk(0) in a certain error situation, i.e., an increase of the merit function value. To implement the non-monotone line search, we need a queue consisting of merit function values at previous iterates. In case of k = 0, the reference value is adapted by a factor greater than 1, i.e., φrjk(0) is replaced by t φrjk(0), t > 1. The basic idea to store reference function values and to replace the sufficient descent property by a sufficient 'ascent' property in max-form is, for example, described in Dai [27], where a general convergence proof for the unconstrained case is presented. The general idea goes back to Grippo, Lampariello, and Lucidi [57], and was extended to constrained optimization and trust region methods in a series of subsequent papers, see Bonnans et al. [14], Deng et al. [29], Grippo et al. [58, 59], Ke and Han [74], Ke et al. [75], Lucidi et al. [87], Panier and Tits [101], Raydan [113],
and Toint [157, 158]. However, there is a difference in the methodology: Our goal is to allow monotone line searches as long as they terminate successfully, and to apply a non-monotone one only in an error situation.

2.3. Performance Evaluation
Since analytical derivatives are not available for all problems, we approximate them numerically. The test examples are provided with exact solutions, either known from analytical precalculations by hand or from the best numerical data found so far. The Fortran codes are compiled by the Intel Visual Fortran Compiler, Version 8.0, under Windows XP, and executed on a Pentium IV processor with 2.8 GHz.
First we need a criterion to decide whether the result of a test run is considered as a successful return or not. Let ε > 0 be a tolerance for defining the relative accuracy, xk the final iterate of a test run, and x* the supposed exact solution known from the test problem collection. Then we call the output a successful return, if the relative error in the objective function is less than ε and if the maximum constraint violation is less than ε², i.e., if

  |f(xk) − f(x*)| / |f(x*)| < ε ,   if f(x*) ≠ 0 ,

or

  f(xk) < ε ,   if f(x*) = 0 ,

and

  r(xk) = ‖g(xk)−‖∞ < ε² ,

where ‖ · ‖∞ denotes the maximum norm and gj(xk)− = min(0, gj(xk)) for j > me, and gj(xk)− = gj(xk) otherwise.
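The success test is easily written down in code. The following sketch uses illustrative names (it is not part of NLP++ or of the test environment) and assumes that the first me entries of the constraint vector are the equality constraints:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Maximum constraint violation r(x_k) = ||g(x_k)^-||_inf, where the
// first me entries of g are equality constraints (counted by |g_j|)
// and the remaining ones inequality constraints (counted by g_j^-).
double MaxConstrViolation( const std::vector<double> & g, int me )
{
    double r = 0.0;
    for ( std::size_t j = 0; j < g.size(); j++ )
    {
        const double gj = ( static_cast<int>( j ) < me )
                        ? g[j]                      // equality: |g_j|
                        : std::min( 0.0, g[j] );    // inequality: g_j^-
        r = std::max( r, std::fabs( gj ) );
    }
    return r;
}

// Combined success test: relative objective error below eps and maximum
// constraint violation below eps^2.
bool SuccessfulReturn( double fk, double fStar,
                       const std::vector<double> & g, int me, double eps )
{
    const bool objOk = ( fStar != 0.0 )
        ? std::fabs( fk - fStar ) / std::fabs( fStar ) < eps
        : fk < eps;
    return objOk && MaxConstrViolation( g, me ) < eps * eps;
}
```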
We take into account that a code returns a solution with a better function value than the known one, subject to the error tolerance of the allowed constraint violation. However, there is still the possibility that an algorithm terminates at a local solution different from the known one. Thus, we call a test run a successful one if, in addition to the above decision, the internal termination conditions are satisfied subject to a reasonably small tolerance (IFAIL=0). Derivatives are approximated by one of the following difference formulae:
1. Forward differences:

  ∂f(x)/∂xi ≈ (1/ηi) ( f(x + ηi ei) − f(x) )                        (2.11)

2. Two-sided differences:

  ∂f(x)/∂xi ≈ (1/(2 ηi)) ( f(x + ηi ei) − f(x − ηi ei) )            (2.12)

3. Fourth-order formula:

  ∂f(x)/∂xi ≈ (1/(4! ηi)) ( 2 f(x − 2 ηi ei) − 16 f(x − ηi ei) + 16 f(x + ηi ei) − 2 f(x + 2 ηi ei) )    (2.13)
Here ηi = η max(10^−5, |xi|) and ei is the i-th unit vector, i = 1, ..., n. The tolerance η depends on the difference formula and is set to η = ηm^{1/2} for forward differences, η = ηm^{1/3} for two-sided differences, and η = (ηm/72)^{1/4} for fourth-order formulae. ηm is a guess for the accuracy by which function values are computed, i.e., either machine accuracy in case of analytical formulae or an estimate of the noise level in function computations. In a similar way, derivatives of constraints are computed.
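A forward difference gradient as in (2.11), with the perturbation ηi = η max(10^−5, |xi|) and η = ηm^{1/2}, can be sketched as follows. NLP++ provides this functionality in the ApproxGrad class, so the names here are purely illustrative:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Forward difference gradient approximation (2.11). etaM is a guess for
// the accuracy of the function values (e.g. machine precision).
std::vector<double> ForwardDiffGrad(
    const std::function<double(const std::vector<double>&)> & f,
    std::vector<double> x, double etaM )
{
    const double eta = std::sqrt( etaM );        // eta = etaM^(1/2)
    std::vector<double> grad( x.size() );
    const double f0 = f( x );
    for ( std::size_t i = 0; i < x.size(); i++ )
    {
        const double etaI = eta * std::max( 1.0e-5, std::fabs( x[i] ) );
        const double xi = x[i];
        x[i] = xi + etaI;                        // perturb the i-th component
        grad[i] = ( f( x ) - f0 ) / etaI;        // difference quotient
        x[i] = xi;                               // restore
    }
    return grad;
}
```

Each gradient thus costs n additional function evaluations on top of f(x) itself, which is the count used for nequ below.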
The Fortran implementation of the SQP method introduced in the previous section is called NLPQLP. The code represents the most recent version of NLPQL which is frequently used in academic and commercial institutions. NLPQLP is prepared to run also under distributed systems, but behaves in exactly the same way as the serial version if the number of simulated processors is set to one. Functions and gradients must be provided by reverse communication, and the quadratic programming subproblems are solved by the primal-dual method of Goldfarb and Idnani [56] based on numerically stable orthogonal decompositions. NLPQLP is executed with termination accuracy ACC = 10^−8 and a maximum number of iterations MAXIT = 500.
In the subsequent tables, we use the notation

  nsucc : number of successful test runs (according to the above definition)
  nerr  : number of runs with error messages of NLPQLP (IFAIL>0)
  nfunc : average number of function evaluations
  ngrad : average number of gradient evaluations or iterations
  nequ  : average number of equivalent function calls (function calls counted also for gradient approximations)
  time  : total execution time for all test runs in seconds
To get nfunc, we count each single function call, also in the case of several simulated processors, l > 1. However, function evaluations needed for gradient approximations are not counted. Their average number is nfunc for forward differences, 2 × nfunc for two-sided differences, and 4 × nfunc for fourth-order formulae. One gradient computation corresponds to one iteration of the SQP method.
l = 1 corresponds to the sequential case, when Algorithm 1 is applied to the line search
consisting of a quadratic interpolation combined with an Armijo-type bisection strategy and a
non-monotone stopping criterion.
In all other cases, l > 1 simultaneous function evaluations are made according to Algorithm 2.
To get a reliable and robust line search, we need at least 5 parallel processors. No significant
improvements are observed, if we have more than 10 parallel function evaluations.
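The grid of trial steplengths of Algorithm 2 and the selection of the first accepted one can be sketched as follows; the parallel function evaluations are simulated by a plain loop, and all names and defaults are illustrative, not taken from NLPQLP:

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

// Sketch of the parallel line search (Algorithm 2): l candidate steps
// alpha_i = beta^i with beta = eps^(1/(l-1)) are evaluated "at once";
// the largest one satisfying the Armijo condition is returned.
double ParallelLineSearch( const std::function<double(double)> & phi,
                           double dphi0, int l,
                           double eps = 1.0e-12, double mu = 1.0e-4 )
{
    const double beta = std::pow( eps, 1.0 / ( l - 1 ) );
    std::vector<double> alpha( l ), phiVal( l );
    // On a distributed system these l merit function values would be
    // computed simultaneously; here we evaluate them serially.
    for ( int i = 0; i < l; i++ )
    {
        alpha[i]  = std::pow( beta, i );     // alpha_0 = 1 > alpha_1 > ...
        phiVal[i] = phi( alpha[i] );
    }
    const double phi0 = phi( 0.0 );
    for ( int i = 0; i < l; i++ )
        if ( phiVal[i] < phi0 + mu * alpha[i] * dphi0 )
            return alpha[i];                 // largest accepted steplength
    return alpha[l - 1];                     // fall back to smallest step
}
```

Since the candidates are fixed in advance, all l merit function values can be requested in a single parallel round trip, which is the point of the modification.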
The most promising possibility to exploit a parallel system architecture occurs when gradients cannot be calculated analytically, but have to be approximated numerically, for example by forward differences, two-sided differences, or even higher order methods. Then we need at least n additional function calls, where n is the number of optimization variables, or a suitable multiple of n.
For our numerical tests, we apply the three different difference formulae mentioned before, see (2.11), (2.12), and (2.13). To test the stability of these formulae, we add some randomly generated noise to each function value. The non-monotone line search is applied with a queue size of p = 30, and the serial line search calculation by Algorithm 1 is used.
Tables 2.2 to 2.4 show the corresponding results for the different procedures under consideration, and for increasing random perturbations (err). More precisely, if ρ denotes a uniformly distributed random number between 0 and 1, we replace f(xk) by f(xk)(1 + err(2ρ − 1)) for each iterate xk. In the same way, restriction functions are perturbed. The tolerance for approximating gradients, ηm, is set to the machine accuracy in case of err = 0, and to the random noise level otherwise.
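The perturbation f(xk)(1 + err(2ρ − 1)) is easy to reproduce; in this sketch, rand() merely stands in for whatever random number generator the experiments actually used, and the function name is illustrative:

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// Replace a function value f by f * (1 + err * (2*rho - 1)), with rho
// uniformly distributed in [0, 1], i.e. a relative perturbation of at
// most err in magnitude. Sketch only; rand() is a stand-in generator.
double PerturbedValue( double f, double err )
{
    const double rho = static_cast<double>( std::rand() ) / RAND_MAX;
    return f * ( 1.0 + err * ( 2.0 * rho - 1.0 ) );
}
```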
The results are surprising and depend heavily on the new non-monotone line search strategy. There are no significant differences in the number of test problems solved between the three different formulae despite the increasing theoretical approximation orders. Moreover, we are able to solve about 80 % of the test examples in case of extremely noisy function values with at most two correct digits. If we take the number of equivalent function calls into account, we conclude that forward differences are more efficient than higher order formulae.
            η = ε^{1/2}               η = 10^{−7}
  ε         nsucc-nm   nsucc-m       nsucc-nm   nsucc-m
  0         306        304           306        304
  10^−12    306        302           305        303
  10^−10    304        299           299        286
  10^−8     299        287           268        196
  10^−6     298        255           176        58
  10^−4     275        202           112        22
  10^−2     229        103           113        17
2.4. Program Documentation
where pdXu and pdXl must point to arrays of length equal to the number of design variables.
Before setting bounds, the number of design variables must be set.
If no bounds are set, they default to -1E30 and +1E30.
b) Provide an initial guess for the optimization vector by
PutInitialGuess( const double * pdInitialGuess )
where pdInitialGuess must be a double array of length equal to the number of
variables or
PutInitialGuess( const int iVariableIndex, const double dInitialGuess )
for iVariableIndex = 0,...,NumberOfVariables - 1.
Before setting an initial guess, the number of design variables must be set.
c) If you want to provide an initial guess for the Hessian of the Lagrangian and the Lagrange multipliers, use Put( const char * pcParam , const double * pdValue ) to set the values. Use Put( const char * pcParam , const int * piValue ) to set parameter "Mode" = 1.
5. Check the status of the object using GetStatus( int & iStatus ) const
• if iStatus equals SolvedId(): The final accuracy has been achieved, the problem is solved.
• if iStatus is > 0: An error occurred during the solution process.
7. Output
• GetConstrValVec( const double * & pdPointer ) const
Returns a pointer to an array containing the values of all constraint functions at the
last solution vector.
• GetConstrVal( const int iConstrIdx, double & dConstrVal ) const
Returns the value of the iConstrIdx’th constraint function at the last solution vector.
• GetDerivConstr( const int iConstrIdx , const int iVariableIdx,
double & dDerivativeValue ) const
Returns the value of the derivative of the iConstrIdx’th constraint with respect to
the iVariableIdx’th design variable at the last solution vector.
Some of the termination reasons depend on the accuracy used for approximating gradients.
If we assume that all functions and gradients are computed within machine precision and that
the implementation is correct, there remain only the following possibilities that could cause an
error message:
1. The termination parameter TermAcc is too small, so that the numerical algorithm plays
around with round-off errors without being able to improve the solution. Especially the
Hessian approximation of the Lagrangian function becomes unstable in this case. A
straightforward remedy is to restart the optimization cycle again with a larger stopping
tolerance.
2. The constraints are contradictory, i.e., the set of feasible solutions is empty. There is no way to find out whether a general nonlinear and non-convex set possesses a feasible point or not. Thus, the nonlinear programming algorithm will proceed until it runs into one of the mentioned error situations. In this case, the correctness of the model must be checked very carefully.
3. Constraints are feasible, but some of them are degenerate, for example if some of the constraints are redundant. One should know that SQP algorithms assume the satisfaction of the so-called constraint qualification, i.e., that gradients of active constraints are linearly independent at each iterate and in a neighborhood of an optimal solution. In this situation, it is recommended to check the formulation of the model constraints.
However, some of the error situations may also occur if, because of wrong or inaccurate gradients, the quadratic programming subproblem does not yield a descent direction for the underlying merit function. In this case, one should try to improve the accuracy of function evaluations, scale the model functions in a proper way, or start the algorithm from other initial values.
Since Version 2.1, NLPQLP returns the best iterate obtained. In case of successful termination (Status = 0), this is always the last iterate; in case of a non-successful return (Status > 0), a possibly better previous iterate. Success is measured by objective function value and constraint violation. Note that the output of constraints and multiplier values is suppressed for more than 10,000 constraints.

2.5. Example - using parallel line search in SqpWrapper
using std::cin;

SqpExample::SqpExample()
{
    cout << endl << "---- this is Example 10 ----" << endl << endl;
    cout << endl << "--- solved by SqpWrapper ---" << endl << endl;

    m_pOptimizer = new SqpWrapper();
}

int SqpExample::FuncEval( bool /* bGradApprox */ )
{
    const double * dX;
    int iNumParSys;
    double * pdFuncVals;
    int iNumConstr;

    m_pOptimizer->GetNumConstr( iNumConstr );

    pdFuncVals = new double[ iNumConstr + 1 ];

    m_pOptimizer->Get( "NumParallelSys", iNumParSys );

    // The for-loop simulates a distributed system where
    // the functions can be evaluated at all
    // design variable vectors simultaneously.
    for ( int i = 0; i < iNumParSys; i++ )
    {
        m_pOptimizer->GetDesignVarVec( dX, i );

        // 0th constraint
        pdFuncVals[0] = dX[0] * dX[0] + dX[1] * dX[1];

        // 1st constraint
        pdFuncVals[1] = 1.0 - dX[0] * dX[0] - dX[1] * dX[1];

        // objective
        pdFuncVals[2] = - dX[0] * dX[1];

        m_pOptimizer->PutObjVal( pdFuncVals[ iNumConstr ], 0, i );
        m_pOptimizer->PutConstrValVec( pdFuncVals, i );
    }

    delete [] pdFuncVals;

    return EXIT_SUCCESS;
}
int SqpExample::GradEval()
{
    const double * dX;

    m_pOptimizer->GetDesignVarVec( dX );

    m_pOptimizer->PutDerivObj( 0, - dX[1] );
    m_pOptimizer->PutDerivObj( 1, - dX[0] );

    m_pOptimizer->PutDerivConstr( 0, 0, 2.0 * dX[0] );
    m_pOptimizer->PutDerivConstr( 0, 1, 2.0 * dX[1] );

    m_pOptimizer->PutDerivConstr( 1, 0, (-2.0) * dX[0] );
    m_pOptimizer->PutDerivConstr( 1, 1, (-2.0) * dX[1] );

    return EXIT_SUCCESS;
}

int SqpExample::SolveOptProblem()
{
    int iError;

    iError = 0;

    // Suppose we have a distributed system which can evaluate
    // functions at 10 design variable vectors simultaneously.
    m_pOptimizer->Put( "NumParallelSys", 10 );

    // The following parameters do not have to be set,
    // there are default values given in the optimizer.
    // However for the fine tuning it may be necessary to set them.
    int iMaxNumIter = 500;
    int iMaxNumIterLS = 30;
    double dTermAcc = 1.0E-6;

    // Parameters can be set using the address of the value ...
    m_pOptimizer->Put( "MaxNumIter", &iMaxNumIter );
    // ... a variable containing the value ...
    m_pOptimizer->Put( "MaxNumIterLS", iMaxNumIterLS );
    m_pOptimizer->Put( "TermAcc", dTermAcc );
    // ... or the value itself.
    m_pOptimizer->Put( "OutputLevel", 2 );

    // We define the optimization problem:
    m_pOptimizer->PutNumDv( 2 );
    m_pOptimizer->PutNumIneqConstr( 2 );
    m_pOptimizer->PutNumEqConstr( 0 );

    double LB[2] = { 0.0, 0.0 };
    double UB[2] = { 10E6, 10E6 };
    double IG[2] = { 0.8, 0.05 };

    m_pOptimizer->PutUpperBound( UB );
    m_pOptimizer->PutLowerBound( LB );
    m_pOptimizer->PutInitialGuess( IG );

    // Start the optimization
    iError = m_pOptimizer->DefaultLoop( this );

    // if an error occurred, report it ...
    if ( iError != EXIT_SUCCESS )
    {
        cout << "Error " << iError << "\n";
        return iError;
    }

    // ... else report the result:
    const double * dX;

    m_pOptimizer->GetDesignVarVec( dX );

    int iNumDesignVar;

    m_pOptimizer->GetNumDv( iNumDesignVar );

    for ( int iDvIdx = 0; iDvIdx < iNumDesignVar; iDvIdx++ )
    {
        cout << "dX[" << iDvIdx << "] = " << dX[iDvIdx] << endl;
    }
    return EXIT_SUCCESS;
}
2.6. Conclusions
We present a modification of an SQP algorithm designed for execution in a parallel computing environment (SPMD), where a non-monotone line search is applied in error situations. Numerical results indicate stability and robustness for a set of 306 standard test problems. It is shown that not more than 6 parallel function evaluations per iteration are required for conducting the line search. Significant performance improvement is achieved by the non-monotone line search, especially in case of noisy function values and numerical differentiation. There are no differences in the number of test problems solved between forward differences, two-sided differences, and the fourth-order formula, not even in case of severe random perturbations. With the new non-monotone line search, we are able to solve about 80 % of the test examples in case of extremely noisy function values with at most two correct digits and forward differences for derivative calculations.
3. NlpqlbWrapper - Solving Problems With Very Many Constraints
The Fortran subroutine NLPQLB by Schittkowski solves smooth nonlinear programming problems with a large number of constraints, but a moderate number of variables. The underlying algorithm applies an active set method proceeding from a given bound mw for the maximum number of expected active constraints. A quadratic programming subproblem is generated with mw linear constraints, the so-called working set, which are internally exchanged from one iterate to the next. Only for active constraints, i.e., a certain subset of the working set, new gradient values must be computed. The line search takes the active constraints into account. In case of computational errors as for example caused by inaccurate function or gradient evaluations, a non-monotone line search is activated. Numerical results are included for some academic test problems, which show that nonlinear programs with up to 200,000,000 nonlinear constraints can be efficiently solved. The amazing observation is that despite a large number of nearly dependent active constraints, the underlying SQP code converges very fast. The usage of the code is documented and illustrated by an example (chapter 10). Sections 3.1, 3.2, 3.3 and 3.5 are taken from Schittkowski [137].
3.1. Introduction
We consider the general optimization problem to minimize an objective function under nonlinear equality and inequality constraints,

              min  f(x)
  x ∈ Rn :    gj(x) = 0 ,   j = 1, ..., me ,                        (3.1)
              gj(x) ≥ 0 ,   j = me + 1, ..., m ,
              xl ≤ x ≤ xu ,

where x is an n-dimensional parameter vector. It is assumed that all problem functions f(x) and gj(x), j = 1, ..., m, are continuously differentiable on the whole Rn. To simplify the notation, we omit the upper and lower bounds and get a problem of the form

              min  f(x)
  x ∈ Rn :    gj(x) = 0 ,   j = 1, ..., me ,                        (3.2)
              gj(x) ≥ 0 ,   j = me + 1, ..., m .
We assume now that the nonlinear programming problem possesses a very large number of
nonlinear inequality constraints on the one hand, but a much lower number of variables. A
typical situation is the discretization of an infinite number of constraints, as indicated by the
following case studies.
3. NlpqlbWrapper - Solving Problems With Very Many Constraints
min t
x ∈ X :   f(x, y) ≤ t   for all y ∈ Y .          (3.6)
Again, we get a semi-infinite optimization problem provided that Y is an infinite set.
Typically, the problem describes the approximation of a nonlinear function by a simpler one,
i.e., fi(x) = f(ti) − p(ti, x). In this case, f(t) is a given function depending on a variable
t ∈ R, and p(t, x) is a member of a class of approximating functions, e.g., a polynomial in t
with coefficients x ∈ Rn. The problem is non-differentiable, but can be transformed into a
smooth one, assuming that all functions fi(x) are smooth,
min t
x ∈ Rn :   fi(x) ≤ t ,    i = 1, . . . , r ,          (3.8)
           fi(x) ≥ −t ,   i = 1, . . . , r .
4. Optimal control: The goal is to determine a control function u(t) depending on a time
variable t, which has to minimize a cost criterion subject to a state equation in the form of a
system of differential equations and additional restrictions on the state and control variables
that must be satisfied for all time values under consideration. If the control problem is
discretized in a proper way, we get a nonlinear programming problem with a large number of
constraints.
The examples mentioned above motivate the necessity to develop special methods for prob-
lems with very many restrictions. The total number of constraints is so large that either the
linearized constraints cannot be stored in memory or they slow down the solution process
unnecessarily. Although we can expect that most of the constraints are redundant, we cannot
predict a priori which constraints are the important ones, i.e., probably active at the optimal
solution, and which are not.
Sequential quadratic programming methods construct a sequence of quadratic programming
subproblems by approximating the Lagrangian function
L(x, u) := f(x) − Σ_{j=1}^{m} uj gj(x)          (3.9)
quadratically and by linearizing the constraints. The resulting quadratic programming subprob-
lem
min ½ dᵀBk d + ∇f(xk)ᵀd
d ∈ Rn :   ∇gj(xk)ᵀd + gj(xk) = 0 ,   j = 1, . . . , me ,          (3.10)
           ∇gj(xk)ᵀd + gj(xk) ≥ 0 ,   j = me + 1, . . . , m
can be solved by any available black-box algorithm, at least in principle. Here, xk denotes a
current iterate and Bk an estimate of the Hessian of the Lagrangian function (3.9) updated by
the BFGS quasi-Newton method. However, if m is large, the Jacobian might become too big to
be stored in the computer memory.
The basic idea is to proceed from a user-provided value mw with
n ≤ mw ≤ m
by which we estimate the maximum number of expected active constraints. Only quadratic
programming subproblems with mw linear constraints are created which require lower storage
and allow faster numerical solution. Thus, one has to develop a strategy to decide which
constraint indices are added to a working set of size mw,

W := {j1 , . . . , jmw } ⊂ {1, . . . , m} ,

and which ones have to leave the working set. It is recommended to keep as many constraints
as possible in the working set, i.e., to choose mw as large as possible, since non-active
constraints can also have an important influence on the computation of the search direction.
It is, however, possible that too many constraints are violated at a starting point, even if
it is known that the optimal solution possesses only very few active constraints. To avoid an
unnecessary blow-up of the working set, it is also possible to extend the given optimization
problem by an additional artificial variable xn+1, which, if chosen sufficiently large at start,
decreases the number of active constraints. (3.1) or (3.2), respectively, is then replaced by a
problem of the form

min f(x) + ρ xn+1
x ∈ Rn , xn+1 ∈ R :   gj(x) = 0 ,           j = 1, . . . , me ,          (3.11)
                      gj(x) + xn+1 ≥ 0 ,    j = me + 1, . . . , m ,
                      xn+1 ≥ 0

with a penalty parameter ρ > 0.
However, this transformation does not make sense in cases where the original problem is
transformed in a similar way as for example the min-max problem (3.6). Also, the choice of the
penalty parameter ρ and the starting value for xn+1 is crucial. A too rapid decrease of xn+1
to zero must be prevented to avoid too many active constraints, which is difficult to achieve in
general. But if adapted to a specific situation, the transformation works very well and can be
extremely helpful.
There is another motivation for considering active sets. Since we want to solve problems with
a large number of constraints, many of them are probably redundant. But in any case, we have
to require the evaluation of gradients for all of them in the working set. Thus, an additional
active set strategy is proposed with the aim to reduce the number of gradient evaluations, and
to calculate gradients at a new iterate only for a certain subset of estimated active constraints.
The underlying SQP algorithm is described in Schittkowski [120], and the presented active set
approach for solving problems with a large number of constraints in Schittkowski [126].
Active set strategies are widely discussed in the nonlinear programming literature and have
been implemented in most of the available codes. A computational study for linear constraints
was even conducted in the 1970s, see Lenard [80], and Google finds 267,000 hits for
'active set strategy nonlinear programming'. It is out of the scope of this chapter to give a
review. Some of
these strategies are quite complex and a typical example is the one included in the KNITRO
package for large scale optimization, see Byrd, Gould, Nocedal, and Waltz [19], based on linear
programming and equality constrained subproblems.
From the technical point of view, NLPQLB is implemented in the form of a Fortran subroutine,
where function and gradient values are passed through reverse communication, see chapter 2 or
Schittkowski [132]. NLPQLB calls the SQP code NLPQLP, see again chapter 2 or [132], with
exactly mw constraints, where the constraints of the working set are exchanged from one
iteration to the next.
The modified SQP-algorithm is described in Section 3.2 in detail. Since some heuristics are
included which prevent a rigorous convergence analysis, at least the most important sufficient
decrease property is available which shows that the algorithm is well-defined. Some numerical
test results based on a few academic examples are found in Section 3.3, where the number
of nonlinear constraints is very large, i.e., up to 200,000,000. More details of the software
implementation and the usage of the code NLPQLB are presented in Section 3.4. Chapter 10
contains a simple example to become familiar with the software.
3.2. An Active-Set Sequential Quadratic Programming Method
vk = (v1^(k) , . . . , vm^(k))ᵀ is the current multiplier estimate and ε a user-provided error
tolerance. The indices j(k) in (3.12) denote previously computed gradients of constraints.
Their definition will become clear when investigating the algorithm in more detail. The idea
is to recalculate only gradients of active constraints and to fill the remaining rows of the
constraint matrix with previously computed ones.
We have to assume that there are no more than mw active constraints throughout the algo-
rithm. But we do not support the idea of including some kind of automated phase I procedure
to project an iterate back to the feasible region whenever this assumption is violated. Instead,
there are safeguards in the line search algorithm to prevent this situation. If, for example at
a starting point, more than mw constraints are active, it is preferred to stop the algorithm and
to leave it to the user either to change the starting point or to establish an outer constraint
restoration procedure depending on the problem structure.
After solving the quadratic programming subproblem (3.12) we get a search direction dk and
a corresponding multiplier vector uk . The new iterate is obtained by
xk+1 := xk + αk dk ,   vk+1 := vk + αk (uk − vk)          (3.16)
for approximating the optimal solution x* ∈ Rn of (3.2) and the corresponding optimal multiplier
vector u* ∈ Rm. The steplength parameter αk is the result of an additional line search
sub-algorithm, by which we want to achieve a sufficient decrease of an augmented Lagrangian
merit function

ψr(x, v) := f(x) − Σ_{j∈J(x,v)} ( vj gj(x) − ½ rj gj(x)² ) − ½ Σ_{j∈K(x,v)} vj² / rj .          (3.17)
The penalty parameters rj, which control the degree of constraint violation, must be carefully
chosen to guarantee a sufficient descent direction of the merit function,
see Schittkowski [120], Ortega and Rheinboldt [100], or Wolfe [166] in a more general setting,
where

φrk(α) := ψrk( xk + α dk , vk + α (uk − vk) )          (3.20)

and

φ′rk(0) = ∇ψrk(xk , vk)ᵀ ( dk , uk − vk ) .          (3.21)
An additional requirement is that at each intermediate step of the line search procedure at most
mw constraints are active. If this condition is violated, the steplength is further reduced until
the condition is satisfied. From the definition of our index sets, we have

Jk* ⊃ Jk := J(xk , vk) .          (3.22)
The starting point x0 is crucial from the viewpoint of numerical efficiency and must be
predetermined by the user. It has to satisfy the assumption that no more than mw constraints
are active, i.e., that J0 ⊂ W0. The remaining indices of W0 are to be set in a suitable way
and must not overlap with the active ones. Also, W0 must be provided by the user, to allow
exploiting pre-existing know-how about the position of the optimal solution and its
active constraints.
For all other parameters, suitable default values can be provided, e.g., v0 = 0 for the initial
multiplier guess, B0 = I for the initial estimate of the Hessian matrix of the Lagrangian function
of (3.1), and r0 = (1, . . . , 1)ᵀ for the initial penalty parameters. In general it is assumed that
vj0 ≥ 0 for j = me + 1, . . ., m, that B0 is positive definite, and that all coefficients of the
vector r0 are positive.
The basic idea of the algorithm can be described in the following way: we determine a
working set Wk and perform one step of a standard SQP-algorithm with respect to the nonlinear
programming problem with mw nonlinear constraints. Then the working set is updated and the
whole procedure is repeated.
One particular advantage is that the numerical convergence conditions for the reduced problem
are applicable to the original one as well, since all constraints not in the working set Wk are
inactive, i.e., satisfy gj(xk) > ε for j ∈ {1, . . . , m} \ Wk.
The line search procedure described in Schittkowski [120] can be used to determine a steplength
parameter αk; it combines an Armijo-type steplength reduction with a quadratic interpolation
of φk(α). The proposed approach guarantees theoretical convergence results, is very easy to
implement, and works satisfactorily in practice. But in our case we want to satisfy the
additional requirement that all intermediate iterates xk + αk,i dk do not possess more than mw
violated constraints. By introducing an additional loop reducing the steplength by a constant
factor, it is always possible to guarantee this condition. An artificial penalty term consisting
of the violated constraints is added to the objective function. This modification of the line
search procedure prevents iterates of the modified SQP-method that violate too many constraints.
BFGS updates are a standard technique in nonlinear programming and yield excellent con-
vergence results both from the theoretical and the numerical point of view. The modification
to guarantee positive definite matrices Bk was proposed by Powell [105]. The update is
performed with respect to the corrections xk+1 − xk and ∇x L(xk+1) − ∇x L(xk).
Since a new restriction is included in the working set Wk+1 only if it belongs to J*k+1, we
always get new and current gradients in the quadratic programming subproblem (3.12). But
gradients can be re-evaluated for any larger set, e.g., Wk+1. In this case we can expect an even
better performance of the algorithm.
The proposed modification of the standard SQP technique is straightforward and easy to
analyze. We want to stress that its practical performance depends mainly on the heuristics
used to determine the working set Wk. The first idea could be to remove those constraints
from the working set which have the largest function values. However, the numerical size of a
constraint depends on its internal scaling. In other words, we cannot conclude from a large
restriction function value that the constraint is probably inactive.
To make the decision on constraints in the working set Wk as independent of the scaling of
the functions as possible, we propose the following rules:
• Among the constraints feasible at xk and xk+1, keep those in the working set that were
violated during the line search. If there are too many of them according to some given
constant, select constraints for which

( gj(xk+1) − ε ) / ( gj(xk+1) − gj(xk + αk,i−1 dk) )

is minimal. The decision whether a constraint is feasible or not is made with respect
to the given tolerance ε.
• In addition keep the restriction in the working set for which gj (xk + dk ) is minimal.
• Take out those feasible constraints from the working set which are the oldest ones with
respect to their successive number of iterations in the working set.
is satisfied, where p(k) is a predetermined parameter with p(k) = min{k, p}, p a given tolerance.
Thus, we allow an increase of the reference value φrk(0) in a certain error situation, i.e., an
increase of the merit function value. In case of k = 0, the reference value is adapted by a
factor greater than 1, i.e., φrjk(0) is replaced by t φrjk(0), t > 1. The basic idea is to store
reference function values and to replace the sufficient descent property by a sufficient 'ascent'
property in max-form; see Dai and Schittkowski [28] for details and a convergence proof.
3.3. Numerical Tests
Otherwise, an alternative, much simpler proposal could be to include all constraints in the
initial working set for which gj(x) ≤ ε, and to fill the remaining positions with indices for
which gj(x) > ε.
Any other initialization of the working set depending on available information about the
expected active constraints may be applied.
Some numerical experiments are reported to show that the resulting algorithm works as ex-
pected. The examples are small academic test problems taken from the literature, but somewhat
modified, in particular to get problems with a varying number of constraints. They have been
used before to get the results published in Schittkowski [126], and are now solved with up to
200,000,000 instead of at most 10,000 constraints. Gradients are evaluated analytically.
P1: The nonlinear semi-infinite test problem is taken from Tanaka, Fukushima, and Ibaraki [156],
with starting point x0 = (1, −1, 2)ᵀ. After a discretization of the interval [0, 1] with m = 4 · 10^7
equidistant points, we get a nonlinear program with 40,000,000 constraints. Figure 3.1 shows the
curve plots over y for the starting point and the optimal solution x* = (−0.21331259, −1.3614504,
1.8535473)ᵀ. Although only one constraint is active at the optimal solution x*, the starting point
x0 violates about 50 % of all constraints. Thus, the initial working set W0 must be sufficiently
large, and we choose mw = 2 · 10^7. For the same reason, we restrict the discretization of the
interval [0, 1] and the total number of constraints to m = 4 · 10^7.
P1F: This is the same nonlinear semi-infinite test problem as before. It is to be shown that
the simple feasibility modification (3.11), introducing an additional variable x4 and a penalty
term with ρ = 10^4 in the modified objective function, reduces the number of intermediate active
constraints. The equidistant discretization is performed with m = 2 · 10^8 points, and the initial
iterate is x0 = (1, −1, 2, 100)ᵀ. No constraint is active at the starting point and we are able to
choose a much smaller working set of size mw = 2 · 10^3.
P3: The nonlinear semi-infinite test problem is very similar to P1, see Tanaka, Fukushima, and
Ibaraki [156], with starting point x0 = (1, 0.5, 0)ᵀ. An equidistant discretization of the interval
[0, 1] with m = 2 · 10^8 points is chosen, and the size of the working set is mw = 5 · 10^5.
Figure 3.2 shows the curve plots over y for the starting point and the optimal solution x* =
(1.0066047, −0.12687988, −0.37972483)ᵀ. One constraint is active at the starting point and two
at the optimal solution. However, we have to expect a larger number of nearly active constraints.
Since the active set at the starting point differs significantly from the active set at the optimal
solution, we have to choose a relatively large working set.
P4: This is a nonlinear semi-infinite test problem with two free variables from a given interval,
with starting values x0 = (−2, −1, 0)ᵀ. Again, we use 200,000,000 uniformly distributed points
of the interval [0, 1] × [0, 1] to discretize it. Since only one constraint is active for y1 = y2 = 0,
the size of the working set can be as low as mw = 200. Figure 3.3 shows the curvature plot over
y1 and y2 for the optimal solution x* = (−1, 0, 0)ᵀ.
TP332: The problem is a modification of the test problem TP332 of Schittkowski [124] to get
a larger number of constraints,

min Σ_{i=1}^{m} [ (log ti + x2 sin ti + x1 cos ti)² + (log ti + x2 cos ti − x1 sin ti)² ]
x1 , x2 ∈ R :                                                            (3.29)
arctan( (1/ti − x1) / (log ti + x2) ) ≤ π/60 ,   i = 1, . . . , m ,

with starting values x0 = (0.75, 0.75)ᵀ, where ti := π (1/3 + (i − 1)/180). We proceed from
m = 2 · 10^8 constraints, and the size of the working set is mw = 100. Only one constraint is
active at the starting point and at the optimal solution x* = (0.94990963, 0.049757167)ᵀ.
TP374: The problem is extended in a straightforward way to allow more than 35 constraints,

min x10
x ∈ R10 :   z(ti) − (1 − x10)² ≥ 0 ,    i = 1, . . . , r ,          (3.30)
            −z(ti) + (1 + x10)² ≥ 0 ,   i = r + 1, . . . , 2r ,
            −z(ti) + x10² ≥ 0 ,         i = 2r + 1, . . . , 3.5r ,

where

z(t) := ( Σ_{k=1}^{9} xk cos(kt) )² + ( Σ_{k=1}^{9} xk sin(kt) )²

and

ti = π (i − 1) · 0.025 ,                   i = 1, . . . , r ,
ti = π (i − 1 − r) · 0.025 ,               i = r + 1, . . . , 2r ,
ti = π (1.2 + (i − 1 − 2r) · 0.2) · 0.25 , i = 2r + 1, . . . , 3.5r .

The starting solution is x0 = (0.1, 0.1, . . . , 0.1, 1)ᵀ. By choosing r = 10^8/3.5, we get
100,000,000 nonlinear constraints. Because of a large number of intermediate active constraints,
we have to restrict the total number of constraints, and we let mw = 2 · 10^6. The optimal
objective function value is f* = 0.434946, see Table 1.
U3: The goal is to approximate the exponential function by a rational one, i.e., to minimize the
maximum norm of r functions, see Luksan [88],

min  max{ | (x1 + x2 ti) / (1 + x3 ti + x4 ti² + x5 ti³) − exp(ti) | : i = 1, . . . , r } ,          (3.31)
x ∈ R5

where ti := 2 (i − 1)/(r − 1) − 1 for i = 1, . . ., r. The starting point is x0 = (0.5, 0, 0, 0, 0)ᵀ.
The problem is transformed into a smooth nonlinear program of the form (3.8) with m := 2r
constraints and n + 1 variables. The starting point for the additional variable is set to t = 20,
and the number of constraints is r = 5 · 10^7 or m = 10^8, respectively, where we expect a large
number of intermediate active constraints. Thus, we allow up to mw = 5 · 10^5 constraints in the
working set. Figure 3.4 shows the curvature plot of the residual function f(x, t) as defined by
(3.31) over t for the optimal solution.
L5: The problem is similar to the previous one. Again, the maximum of the absolute values of a
set of r differentiable functions is to be minimized, see Luksan [88], but now with additional
linear constraints, where

θi := π (8.5 + 0.5 i)/180

for i = 1, . . ., r. The starting point is x0 = (0.5, 1, 1.5, 2, 2.5, 3, 3.5)ᵀ. The problem is
transformed into a smooth nonlinear program (3.8) with m := 2r + 8 constraints and n + 1
variables. The starting point for the additional variable is set to t = 1. We set r = 10^8 − 4,
leading to m = 2 · 10^8 constraints. The number of constraints in the working set is mw = 4 · 10^4.
Figure 3.5 shows the curvature plot of the residual function f(x, t) as defined by (3.32) over θ
for the optimal solution.
E5: The problem formulation (3.33) again involves r residual functions, where

ti := 4 i / r

for i = 1, . . ., r. The starting point is x0 = (25, 5, −5, −1)ᵀ. The problem is transformed into a
smooth nonlinear program of the form (3.8) with m := r constraints and n + 1 variables. The
starting point for the additional variable is set to t = 1,000. For r = 2 · 10^8 we get m = 2 · 10^8
constraints, and the number of constraints in the working set is mw = 5 · 10^4. Figure 3.6 shows
the curvature plot of the residual function f(x, t) as defined by (3.33) over t for the optimal
solution.
The Fortran codes were compiled by the Intel Visual Fortran Compiler, Version 10.1, under
Windows XP64, and executed on a Dual Core AMD Opteron 265 processor with 1.81 GHz
and 4 GB RAM. The working arrays of the routine calling NLPQLB are dynamically allo-
cated. Quadratic programming subproblems are solved by the primal-dual method of Goldfarb
and Idnani [56], based on numerically stable orthogonal decompositions, see Schittkowski [130].
NLPQLB is executed with termination accuracy ε = 10^−8.
Some test problem data characterizing the structure of the problems are summarized in Table 1.
Numerical experiments are reported in Table 2, where we use the following notation:
Table 1: Test problem data.

name    n    m            mw           f*
P1      3    40,000,000   20,000,000   5.33469
P1F     4    200,000,000  2,000        5.33469
P3      3    200,000,000  500,000      4.30118
P4      3    200,000,000  200          1.00000
TP332   2    200,000,000  100          398.587
TP374   10   100,000,000  2,000,000    0.434946
U3      6    100,000,000  500,000      0.00012399
L5      8    200,000,000  40,000       0.0952475
E5      5    200,000,000  50,000       125.619
The performance depends mainly on how close the active set at the starting point is to that of
the optimal solution. If dramatic changes of the active constraints are expected, as in case of
P1, i.e., if intermediate iterates with a large number of violated constraints are generated, the
success of the algorithm is marginal. On the other hand, practical optimization problems often
have special structures from which good starting points can be predetermined. Examples P1F
and especially P4 and TP332 show a dramatic reduction of derivative calculations; their number
is negligible compared to the number of function calls.
Since the constraints are nonlinear and non-convex, we have to compute all m constraint
function values at each iteration to check feasibility and to predict the new active set. The total
number of individual constraint function evaluations is nf · m.
Calculation times are excessive and depend mainly on data transfer operations from and to
the standard swap file of Windows, and the available memory in core, which is 4 GB in our case.
To give an example, test problem TP332 requires 110 sec for m = 2 · 107 constraints and only 8
sec for m = 2 · 106 constraints.
Note that the code NLPQLB requires additional working space in the order of 2m double
precision real numbers plus mw · (n + 1) double precision numbers for the partial derivatives
of the constraints in the working set. Thus, running a test problem with m = 2 · 10^8 constraints
requires at least 600,000,000 double precision numbers and in addition at least 400,000,000
logical values.
It is amazing that numerical instabilities due to degeneracy are prevented. With this huge
number of constraints, the derivatives of neighboring constraints are extremely close to each
other, making the optimization problem unstable. The constraint qualification, i.e., the linear
independence of the active constraints, is more or less violated. We benefit from the fact that
derivatives are given analytically.
Besides the implementation described here, Nlp++ provides the possibility to derive your prob-
lem from a given base class and thus to use an automated loop over StartOptimizer and the
function/gradient evaluation. For this possibility see chapter 31.
3.4. Program Documentation
– "ToleranceLS" Put the relative bound for increase of merit function value, if
line search is not successful during the very first step. Must be non-negative
(e.g. 0.1).
• Put( const char * pcParam , const char * pcValue ) ;
or
Put( const char * pcParam , const char cValue ) ;
where pcParam = "OutputFileName" to set the name of the output file.
• Put( const char * pcParam , const bool * pbValue ) ;
or
Put( const char * pcParam , const bool bValue ) ;
where pcParam is one of the following parameters:
– "LQL" If LQL is set true, the quadratic programming subproblem is solved
with a full positive definite quasi-Newton matrix. Otherwise, a Cholesky de-
composition is performed and updated, so that the subproblem matrix con-
tains only an upper triangular factor.
– "OpenNewOutputFile" Determines whether a new output file has to be cre-
ated. If false, the output is appended to an existing file. Usually, this
parameter does not have to be changed.
d) Put the number of inequality constraint functions to the object:
PutNumIneqConstr( int iNumIneqConstr ) ;
e) Put the number of equality constraint functions to the object:
PutNumEqConstr( int iNumEqConstr ) ;
7. Output
• GetConstrValVec( const double * & pdPointer ) const
Returns a pointer to an array containing the values of all constraint functions at the
last solution vector.
• GetConstrVal( const int iConstrIdx, double & dConstrVal ) const
Returns the value of the iConstrIdx’th constraint function at the last solution vector.
Some of the termination reasons depend on the accuracy used for approximating gradients.
If we assume that all functions and gradients are computed within machine precision and that
the implementation is correct, there remain only the following possibilities that could cause an
error message:
1. The termination parameter TermAcc is too small, so that the numerical algorithm plays
around with round-off errors without being able to improve the solution. Especially the
Hessian approximation of the Lagrangian function becomes unstable in this case. A
straightforward remedy is to restart the optimization cycle again with a larger stopping
tolerance.
2. The constraints are contradictory, i.e., the set of feasible solutions is empty. There is no
way to find out whether a general nonlinear and non-convex set possesses a feasible point
or not. Thus, the nonlinear programming algorithm will proceed until running into one of
the mentioned error situations. In this case, the correctness of the model must be checked
very carefully.
3. The constraints are feasible, but some of them are degenerate, for example if some of the
constraints are redundant. One should know that SQP algorithms assume the satisfaction
of the so-called constraint qualification, i.e., that gradients of active constraints are linearly
independent at each iterate and in a neighborhood of an optimal solution. In this situation,
it is recommended to check the formulation of the model constraints.
However, some of the error situations also occur if, because of wrong or inaccurate gradients,
the quadratic programming subproblem does not yield a descent direction for the underlying
merit function. In this case, one should try to improve the accuracy of function evaluations,
scale the model functions in a proper way, or start the algorithm from other initial values.
NLPQLB returns the best iterate obtained. In case of successful termination (Status = 0),
this is always the last iterate; in case of a non-successful return (Status > 0), a possibly better
previous iterate is returned. Success is measured by the objective function value and the
constraint violation. Note that the output of constraint and multiplier values is suppressed
for more than 10,000 constraints.
3.5. Conclusions
We present a modification of an SQP algorithm to solve optimization problems with a very
large number of constraints, m, relative to the number of variables, n. The idea is to proceed
from a user-provided guess, mw , for the maximum number of violated constraints, and to solve
quadratic programming subproblems with mw linear constraints instead of m constraints.
Some numerical experiments with simple academic test problems show that it is possible
to solve problems with up to m = 2 · 108 nonlinear constraints, which would not be solvable
otherwise by a standard SQP algorithm. Sparsity of the Jacobian of the constraints is not
assumed.
The performance depends significantly on the position of the starting point, i.e., on how close
the initial active set is to the active set at the final solution. If, in the worst case, all
constraints are violated at an intermediate iterate, the proposed active set strategy is useless,
as probably any other would be. In practical applications, however, optimization problems are
often solved routinely, and some information about the choice of good starting points is
available. If there are no or only minor changes of the active set and if only a few constraints
are active, the achievements are significant.
It is very amazing that it is possible at all to solve problems with this huge number of
constraints on a standard PC. The test examples have a quite simple structure, in most cases
arising from semi-infinite optimization and an equidistant discretization. It must be expected
that gradients of neighboring constraints coincide in up to seven digits. These optimization
problems are highly unstable in the sense that the linear independence constraint qualification
is more or less violated at all iterates.
4. NlpqlgWrapper - Heuristic Global
Optimization
Usually, global optimization codes with guaranteed convergence require a large number of func-
tion evaluations. On the other hand, there are efficient optimization methods which exploit
gradient information, but only the approximation of a local minimizer can be expected. If,
however, the underlying application model is expensive, if there are additional constraints, es-
pecially equality constraints, and if the existence of different local solutions is expected, then
heuristic rules for successive switches from one local minimizer to another are often the only
applicable approach. For this specific situation, we present some simple ideas for cutting off
a local minimizer and restarting a new local optimization run. However, some safeguards are
needed to stabilize the algorithm, since very little is usually known about the distribution of
local minima. This chapter introduces an approach where the nonlinear programs generated can
be solved by any available black-box software. For our implementation, a sequential quadratic
programming code (NLPQLP) is chosen for local optimization. The usage of the code is outlined,
and we present some numerical results based on a set of test examples found in the literature.
In chapter 10 an example implementation can be found. Sections 4.1, 4.2 and 4.4 are taken
from Schittkowski [143].
4.1. Introduction
We consider the general optimization problem to minimize an objective function f under non-
linear equality and inequality constraints,
min f(x)
x ∈ Rn :   gj(x) = 0 ,   j = 1, . . . , me ,          (4.1)
           gj(x) ≥ 0 ,   j = me + 1, . . . , m ,
           xl ≤ x ≤ xu ,
where x is an n-dimensional parameter vector. It is assumed that all problem functions f(x)
and gj(x), j = 1, . . ., m, are continuously differentiable on the whole Rn. But besides this,
we do not impose any further restrictions on the mathematical structure.
Let P denote the feasible domain,
P := {x ∈ Rn : gj (x) = 0, j = 1, . . . , me , gj (x) ≥ 0, j = me + 1, . . . , m, xl ≤ x ≤ xu } .
Our special interest is to find a global optimizer, i.e., a feasible point x* ∈ P with f(x*) ≤ f(x)
for all x ∈ P. Without further assumptions, it is not possible to know in advance how many
local solutions exist, whether their number is finite, or whether the
global minimizer is unique.
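Membership in P can be sketched as a short feasibility test; the function name, the flat constraint interface, and the equality tolerance are ours, chosen only for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Checks membership in P: the first me entries of g are the equality
// constraints g_j(x) = 0, the remaining entries are the inequality
// constraints g_j(x) >= 0, and xl, xu are the box bounds. A small
// tolerance eps is used for the equality constraints.
bool is_feasible(const std::vector<double>& g, std::size_t me,
                 const std::vector<double>& x,
                 const std::vector<double>& xl,
                 const std::vector<double>& xu,
                 double eps = 1e-10) {
    for (std::size_t j = 0; j < g.size(); ++j) {
        if (j < me) {
            if (std::fabs(g[j]) > eps) return false;  // g_j(x) = 0
        } else {
            if (g[j] < -eps) return false;            // g_j(x) >= 0
        }
    }
    for (std::size_t i = 0; i < x.size(); ++i)
        if (x[i] < xl[i] || x[i] > xu[i]) return false; // bounds
    return true;
}
```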
4. NlpqlgWrapper - Heuristic Global Optimization
Global optimization algorithms have been investigated in the literature extensively, see for
example the books of Pinter [103], Törn and Zilinskas [159], Horst and Pardalos [71] and the
references herein. Main solution strategies are partition techniques, stochastic algorithms, or
approximation techniques, among many other methods. Despite significant progress and im-
provements in developing new computer codes, there remains the disadvantage that the number
of function evaluations is often large and unacceptable for realistic, time-consuming simulations.
Nonlinear equality constraints in particular often cannot be handled directly and must be treated
by penalty terms, which can drastically degrade direct and random search algorithms.
One of the main drawbacks of global optimization is the lack of numerically computable and
generally applicable stopping criteria. Thus, global optimization is inherently a difficult problem
and requires more or less an exhaustive search over the whole feasible domain to guarantee
convergence.
As soon as function evaluations become extremely expensive preventing the application of
any of the methods mentioned above, there are only two alternatives. Either the mathematical
model allows specific analysis to restrict the domain of interest to a region where the global
minimizer is expected, or one tries to improve local solutions until a reasonable, not necessarily
global solution is found. A typical technique is the so-called tunnelling method, see Levy and
Montalvo [82], where the objective function in (4.1) is replaced by

    f̄(x) = (f(x) − f(x_loc)) / ‖x − x_loc‖^ρ

and where x_loc ∈ P denotes a local minimizer. ρ is a penalty parameter whose purpose is to
push the next local minimizer away from the known one, x_loc. A similar idea for moving to
another local minimizer is proposed by Ge and Qin [50], also called the function filling method,
where the objective function is inverted and an exponential penalty factor is added to prevent
approximation of a known local solution.
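The tunnelling transformation might be sketched as follows; the function name and calling convention are ours, chosen only for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Tunnelling objective of Levy and Montalvo: once a local minimizer
// x_loc with value f_loc = f(x_loc) is known, minimizing
//   (f(x) - f_loc) / ||x - x_loc||^rho
// pushes a descent method away from x_loc.
double tunnel_objective(double fx, double f_loc,
                        const std::vector<double>& x,
                        const std::vector<double>& x_loc,
                        double rho) {
    double dist2 = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)
        dist2 += (x[i] - x_loc[i]) * (x[i] - x_loc[i]);
    return (fx - f_loc) / std::pow(std::sqrt(dist2), rho);
}
```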
Our heuristic proposal follows a similar idea, the successive improvement of local minima.
But the algorithm is different in the following sense. Additional constraints are attached to
the original problem formulation (4.1). First, there is a constraint that the next local solution
must have an objective function value less than the best known feasible function value minus a
relative improvement of ε_1 > 0. For each known local solution x_loc, a ball with radius ε_2 is
formulated around x_loc to prevent a subsequent approximation of x_loc. The procedure is repeated
until the solution method breaks down with an error message, from which we conclude that the
feasible domain is empty. To further prevent a return to a known local solution, an RBF kernel
function of the form

    ρ exp(−µ ‖x − x_loc‖²)

is added to the objective function for each local solution.
However, an appropriate choice of the tolerances ε_1, ε_2, ρ, and µ depends on the distribution
of the local solutions, scaling properties, and the curvature of the objective function. Moreover,
the nonlinear programs generated this way become more and more non-convex. Thus, one has
to add an additional regularization to stabilize the algorithm and appropriate starting points
for each subproblem.
In Section 4.2, we outline the heuristic procedure in more detail. Regularization and numerical
aspects are discussed. The usage of the Fortran subroutine is documented in Section 4.3 and
Chapter 10 contains an illustrative example.
4.2. The Algorithm
and ‖·‖ denotes the Euclidean norm. Once a local solution of (4.2) is found, the objective
function value is cut away and a ball around the minimizer prevents an approximation of any
of the previous iterates. In addition, a radial basis function (RBF) is added to the objective
function to push subsequent local minimizers away from the known ones. ε_1 > 0 and ε_2 > 0 are
suitable tolerances for defining the artificial constraints, as are ρ_i > 0 and µ_i > 0 for defining the
RBF kernel.
If the applied solution method for the nonlinear program (4.2) terminates with an error
message at a non-feasible iterate, we conclude that P is empty and stop. Otherwise, we obtain
at least a feasible point x*_{k+1} with an objective function value better than all known ones, and
k is replaced by k + 1 to solve (4.2) again.
Obviously, we get a series of feasible iterates with decreasing objective function values. How-
ever, the approach has a couple of severe drawbacks:
1. The choice of the tolerances ε_1 and ε_2 is critical for the performance, and it is extremely
   difficult to find appropriate values in advance. Too small values prevent the algorithm
   from moving away from the neighborhood of a local minimizer of the original program
   (4.1) towards another local minimizer, and too large values could cut off too many local
   minimizers, even the global one.
2. The choice of the RBF kernel parameters ρ_i and µ_i seems to be less crucial, but they must
   nevertheless be carefully adapted to the structure of the model functions.
3. Even if the initial feasible set P is convex, the additional constraints are non-convex and
the feasible domains of (4.2) become more and more irregular. It is even possible that an
initial connected set P becomes non-connected.
4. The algorithm stops as soon as the constraints of (4.2) become inconsistent. But infeasi-
bility cannot be checked by any mathematical criterion. The only possibility is to run an
optimization algorithm until an error message occurs at an infeasible iterate. But there is
no guarantee in this case that the feasible region is in fact non-empty.
5. The local solutions of (4.2) do not coincide with local solutions of (4.1), if some of the
artificial constraints become active.
6. It is difficult to find appropriate starting values for solving (4.2). Keeping the original one
provided by the user, could force the iterates to get stuck at a previous local solution until
an internal error occurs.
The situation is illustrated in Figure 4.1. Once a local solution x*_1 is obtained, an interval
with radius ε_2 around x*_1 and a cut of the objective function subject to a relative bound ε_1 try
to push subsequent iterates away from x*_1. The dotted line shows the objective function to be
minimized, including the RBF term. The new feasible domain is shrinking, as shown by the gray
area. If, however, the applied descent algorithm is started close to x*_1, it tries to follow the slope
and to reach x*_1. If ε_1 and ε_2 are too small and if there are additional numerical instabilities,
for example badly scaled functions or inaccurate gradient approximations, it is possible that the
code runs into an error situation despite its theoretical convergence properties.
To overcome at least some of these disadvantages, we try to regularize (4.2) in the following
sense. For each of the artificial inequality constraints, a slack variable is introduced. Thus we
are sure that the feasible domain of the new subproblem is always non-empty. However, we only
get a perturbed solution of (4.2) in case of non-zero slack variables. To reduce their size and
influence as much as possible, a penalty term is added to the objective function, and we get the
problem

min f(x) + Σ_{i=1}^{k} ρ_i exp(−µ_i ‖x − x*_i‖²) + γ_k (y + e^T z)
s.t. g_j(x) = 0,  j = 1, ..., m_e,
     g_j(x) ≥ 0,  j = m_e + 1, ..., m,
     f(x) ≤ f*_k − ε_1 |f*_k| + y,
     ‖x − x*_i‖² ≥ ε_2 − e_i^T z,  i = 1, ..., k,          (4.4)
     x_l ≤ x ≤ x_u,
     0 ≤ y ≤ β_1,
     0 ≤ z ≤ β_2,
     x ∈ R^n, y ∈ R, z ∈ R^k.

Here e_i ∈ R^k denotes the i-th unit vector, i = 1, ..., k, γ_k is a penalty parameter, β_1 and β_2
are upper bounds for the slack variables y and z, and e = (1, ..., 1)^T. By letting β_1 = 0 or β_2 = 0,
the corresponding slack variables are suppressed completely.
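The augmented objective of (4.4) can be sketched as follows; the function name and the flat parameter interface are ours, chosen only for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Augmented objective of (4.4): the original value f(x), one RBF bump
// per known local solution x*_i, and the penalty gamma_k * (y + e^T z)
// on the slack variables y and z.
double augmented_objective(double fx,
                           const std::vector<double>& x,
                           const std::vector<std::vector<double>>& x_star,
                           const std::vector<double>& rho,
                           const std::vector<double>& mu,
                           double gamma, double y,
                           const std::vector<double>& z) {
    double val = fx;
    for (std::size_t k = 0; k < x_star.size(); ++k) {
        double dist2 = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i)
            dist2 += (x[i] - x_star[k][i]) * (x[i] - x_star[k][i]);
        val += rho[k] * std::exp(-mu[k] * dist2);  // RBF kernel term
    }
    double sum_z = 0.0;
    for (double zj : z) sum_z += zj;               // e^T z
    return val + gamma * (y + sum_z);              // slack penalty
}
```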
There remains the question of how to find a suitable starting point for (4.4) without forcing the
user to supply too much information about the optimization problem and without trying to find
some kind of pattern or decomposition of the feasible domain, where the problem functions must
be evaluated. Basically, the user should have full control over how to proceed, and new starting
values could be computed randomly. Another possibility is to choose
x⁰_k = (1/(k+1)) ( x_0 + Σ_{i=1}^{k} x*_i )          (4.5)
where x0 ∈ Rn is the initial starting point provided by the user.
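Rule (4.5) might be coded as follows; the function name is ours, for illustration only.

```cpp
#include <cstddef>
#include <vector>

// Starting point rule (4.5): the mean of the user-supplied point x0
// and all k local solutions found so far.
std::vector<double> next_start(const std::vector<double>& x0,
                               const std::vector<std::vector<double>>& x_star) {
    std::vector<double> x = x0;
    for (const auto& xs : x_star)
        for (std::size_t i = 0; i < x.size(); ++i)
            x[i] += xs[i];                                    // x0 + sum of x*_i
    for (std::size_t i = 0; i < x.size(); ++i)
        x[i] /= static_cast<double>(x_star.size() + 1);       // divide by k + 1
    return x;
}
```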
The algorithm can be summarized as follows:
Another tolerance ε_3 is introduced to adapt the penalty parameter γ_k and to force the artificial
variables y and z to become as small as possible. Usually we set ε_3 = ε_1.
The algorithm is a heuristic one without guaranteed convergence. However, there are many
situations preventing the application of a more rigorous method, for example one based on a parti-
tion technique, stochastic optimization, approximations, or a direct search method. The main
disadvantage of these algorithms is the large number of function evaluations, which is often not
acceptable for realistic, time-consuming simulation programs. Nonlinear equality constraints
in particular often cannot be handled directly and must be treated by penalty terms, which can
drastically degrade direct and random search algorithms. To summarize, the approach seems
to be applicable under the following conditions:
• The evaluation of the model functions f (x) and gj (x), j = 1, . . ., m is expensive.
• There are highly nonlinear restrictions, especially equality constraints.
• Model functions are continuously differentiable and the numerical noise in function and
gradient evaluations is negligible.
• The number of local minima of (4.1) is not too large.
• There is some empirical, model-based knowledge about the expected relative locations of
local minima and the curvature of the objective function.
2. Formulate the expanded nonlinear program (4.4) with slack variables y ∈ R and z ∈ R^k.
3. Solve (4.4) by an available locally convergent algorithm for smooth, constrained nonlinear
   programming, starting from x⁰_k ∈ R^n, for example given by (4.5), y = 0, and z = 0. Let
   x_{k+1}, y_{k+1}, and z_{k+1} be the optimal solution.
4. If y_k + e^T z_k > ε_3, let γ_{k+1} = δγ_k, ρ_{k+1} = δρ_k, and µ_{k+1} = δµ_k. Otherwise, let γ_{k+1} = γ_k,
   ρ_{k+1} = ρ_k, and µ_{k+1} = µ_k.
6. Repeat the iteration, if the local optimization code reports that all internal convergence
criteria are satisfied.
7. Stop otherwise. The last successful return is supposed to be the global minimizer.
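Step 4 above can be sketched as follows; the struct and function names are ours, and δ > 1 is assumed to be the common magnification factor.

```cpp
// Step 4 of the algorithm: if the slack contribution y + e^T z exceeds
// the tolerance eps3, the penalty parameter gamma and the RBF kernel
// parameters rho and mu are enlarged by the common factor delta (> 1);
// otherwise they are kept unchanged.
struct PenaltyParams { double gamma, rho, mu; };

PenaltyParams step4_update(const PenaltyParams& p, double y, double sum_z,
                           double eps3, double delta) {
    if (y + sum_z > eps3)
        return {delta * p.gamma, delta * p.rho, delta * p.mu};
    return p;
}
```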
4.3. Program Documentation
5. Check status of the object using GetStatus( int & iStatus) const
• if iStatus equals SolvedId(): The final accuracy has been achieved, the problem is
solved.
• if iStatus is > 0: An error occurred during the solution process.
• if iStatus equals EvalFuncId(): NlpqlgWrapper needs new function values → 6b
New function values.
After passing these values to the NlpqlgWrapper object go to 4.
• if iStatus equals EvalGradId():
NlpqlgWrapper needs new values for gradients → 6c Providing new gradient values.
After passing these values to the NlpqlgWrapper object go to 4.
• if iStatus == -3 :
Nlpqlg needs a new initial guess. In the DefaultLoop() this is automatically done
using random numbers.
6. Providing new function and gradient values:
a) Useful methods:
i. GetNumConstr( int & iNumConstr ) const
Returns the number of constraints.
GetNumDv( int & iNumDesignVar ) const
Returns the number of design variables.
ii. GetDesignVarVec( const double * & pdPointer, const int i ) const
with i=0 (default) returns a pointer to the design variable vector.
GetDesignVar( const int iVarIdx , double & pdPointer,
const int iVectorIdx ) const
with iVectorIdx = 0 (default) returns the value of the iVarIdx’th design vari-
able of the design variable vector.
b) Providing new function values
Function values for the objective as well as for constraints have to be calculated for
the design variable vector provided by Nlpqlg.
For access to the design variable vector see 6(a)ii. After calculating the values, these
must be passed to the NlpqlgWrapper object using:
• PutConstrValVec( const double * const pdConstrVals,
const int iParSysIdx ),
with iParSysIdx=0 (default)
where pdConstrVals is pointing on an array containing the values of each con-
straint at the design variable vector provided by NlpqlgWrapper.
• PutObjVal( const double dObjVal, const int iObjFunIdx,
const int iParSysIdx ),
with iParSysIdx = 0(default), iObjFunIdx = 0 (default) and dObjVal defining
the value of the objective function at the design variable vector provided by
NlpqlgWrapper.
Alternatively you can employ
PutConstrVal( const int iConstrIdx, const double dConstrValue,
7. Output
• GetConstrValVec( const double * & pdPointer ) const
Returns a pointer to an array containing the values of all constraint functions at the
last solution vector.
• GetConstrVal( const int iConstrIdx, double & dConstrVal ) const
Returns the value of the iConstrIdx’th constraint function at the last solution vector.
• GetDerivConstr( const int iConstrIdx , const int iVariableIdx,
double & dDerivativeValue ) const
Returns the value of the derivative of the iConstrIdx’th constraint with respect to
the iVariableIdx’th design variable at the last solution vector.
• GetDerivObj( const int iVariableIdx, double & dDerivativeValue,
const int iFunIdx ) const
with iFunIdx = 0 (default) returns the value of the derivative of the objective with
respect to the iVariableIdx’th design variable at the last solution vector.
• GetObjVal( double & dObjVal, const int iFunIdx ) const
with iFunIdx = 0 (default) returns the value of the objective function at the last
solution vector.
• GetDesignVar( const int iVariableIdx, double & dValue, const int iParSysIdx
) const
with iParSysIdx = 0 (default) returns the value of the iVariableIdx’th design vari-
able in the last solution vector.
• GetDesignVarVec( const double * & pdPointer, const int iParSysIdx) const
with iParSysIdx = 0 (default) returns a pointer to the last solution vector.
4.4. Summary
We present a heuristic approach for the stepwise approximation of the global solution of a constrained
nonlinear programming problem. In each step, additional variables and constraints are added to
the original ones to cut off known local solutions. The idea is implemented and the resulting code
NLPQLG is able to solve a large number of test problems found in the literature. However, the
algorithm is quite sensitive with respect to the input tolerances, which must be chosen very carefully.
But under certain circumstances, for example very time-consuming function evaluations or highly
nonlinear constraints, the proposed idea is often the only way to improve a known local solution.
5. ScpWrapper - Sequential Convex
Programming
For solving optimization problems with very many variables, the sequential convex program-
ming algorithm SCPIP by Zillober can be applied. The code is frequently used in industrial
applications, especially in mechanical engineering.
5.1. Introduction
The method of moving asymptotes (MMA) was introduced in 1987 by K. Svanberg [153].
In 1993, the lack of global convergence was remedied in the method SCP (sequential
convex programming; cf. [176]) by adding a line-search procedure. Both methods have been
proven to be efficient tools in the context of structural optimization (cf. [145]), since displacement-
dependent constraints (e.g. stresses) are approximated very well. However, problems outside this
context can also be solved efficiently. In 1995, weak convergence results were proven in
[154] without a line-search. In [176] the methods were extended to a general mathematical
programming framework. Box constraints for the variables are kept in the model because they
can be handled separately and they appear in many real-world applications.
Thus, we consider the following general nonlinear programming problem:
min f(x), x ∈ R^n,
s.t. h_j(x) = 0,  j = 1, ..., m_eq,
     h_j(x) ≤ 0,  j = m_eq + 1, ..., M,          (P1)
     x̲_i ≤ x_i ≤ x̄_i,  i = 1, ..., n.
possible to reduce to n × n systems, which is desirable for a small number of variables and a large
number of constraints. This is the case in many sizing problems of structural optimization. As
a third possibility, reduction to certain (n + M) × (n + M) systems is possible, which can be
of interest for certain sparsity patterns. It is important to note that sparsity in the original
problem can be exploited by the interior point method. This is not the case for the dual approach.
SCPIP has been described in a structural optimization framework in [177]. In [178] its ability
to solve very large-scale problems arising in structural optimization and the solution of elliptic
control problems has been shown.
The outline of this chapter is as follows. In Section 5.2 the approximation scheme is introduced
and the methods MMA and SCP are formulated. The interior point method to solve the sub-
problems is described in Section 5.3. Section 5.4 contains convergence results. In Section 5.5 the
interface of ScpWrapper is introduced, and in Chapter 10 a sample calling program is described.
Sections 5.2-5.4 are a summary of some of the papers mentioned above, containing the underlying
methods and theory.
h^k_j(x) := h_j(x^k) + Σ_{i ∈ I^+_{j,k}} ∂h_j(x^k)/∂x_i [ (U^k_i − x^k_i)² / (U^k_i − x_i) − (U^k_i − x^k_i) ]
                     − Σ_{i ∈ I^−_{j,k}} ∂h_j(x^k)/∂x_i [ (x^k_i − L^k_i)² / (x_i − L^k_i) − (x^k_i − L^k_i) ].     (5.2)

I^+_{j,k} (I^−_{j,k}, resp.) is the set of indices of all components with non-negative (negative) first
partial derivative at the expansion point x^k, i.e. I^+_{j,k} := {i | ∂h_j(x^k)/∂x_i ≥ 0} and
I^−_{j,k} := {i | ∂h_j(x^k)/∂x_i < 0}, where h_0 := f.
5.2. The MMA approximation
L^k_i and U^k_i are parameters to be chosen with L^k_i < x^k_i < U^k_i; τ^k_i are positive parameters,
i = 1, ..., n.
This means that equality constraints are linearized in the usual sense, while inequality constraints
and the objective function are linearized with respect to the transformed variables
1/(U^k_i − x_i) and 1/(x_i − L^k_i). In the objective function an additional term is added. The
functions defined in this way have the following properties:
f^k and h^k_j are first-order approximations of f and h_j, respectively.
min f^k(x), x ∈ R^n,
s.t. h^k_j(x) = 0,  j = 1, ..., m_eq,
     h^k_j(x) ≤ 0,  j = m_eq + 1, ..., M,          (Psub^k)
     x̲⁰_i ≤ x_i ≤ x̄⁰_i,  i = 1, ..., n,

where x̲⁰_i := max{x̲_i, x^k_i − ω(x^k_i − L^k_i)} and x̄⁰_i := min{x̄_i, x^k_i + ω(U^k_i − x^k_i)}, ω ∈ ]0; 1[ fixed.
I.e., x̲_i ≤ x̲⁰_i ≤ x^k_i ≤ x̄⁰_i ≤ x̄_i, where x^k denotes the expansion point. For further use we define
X⁰ := {x : x̲⁰_i ≤ x_i ≤ x̄⁰_i, i = 1, ..., n}.
1. L^k_i ≤ x^k_i − ξ, U^k_i ≥ x^k_i + ξ, where ξ is a positive constant, for all i = 1, ..., n and k ≥ 0.

The first point of this definition prevents the curvature of the approximations from going to
infinity. The second part prevents the approximations from coming too close to linearity. This
is necessary for the approximation of the objective; it is not necessary for the constraints.
Therefore, one could use different asymptotes for the objective and the constraints, but for
simplicity of the presentation we use the same asymptotes for both the objective and
the constraints.
Remark 5.2.1 Individual asymptotes for each function to be approximated were intro-
duced by Fleury [47]. But with very large problems in mind we do not proceed this way, because
we would have to compute and store 2n(M − m_eq + 1) values. Therefore, it only seems
reasonable to use different asymptotes for the objective on the one hand and the set of inequality
constraints on the other hand.
To simplify the notation we rewrite (P1), incorporating the box constraints into the general
inequality constraints:

min f(x), x ∈ R^n,
s.t. h_j(x) = 0,  j = 1, ..., m_eq,          (P2)
     h_j(x) ≤ 0,  j = m_eq + 1, ..., m,

where m = M + 2n.
The motivation to choose the augmented Lagrangian merit function results from the following
two well-known statements:

a) A point (x*, y*) is stationary for Φ_ρ if and only if (x*, y*) is stationary for (P2).

b) Under some regularity conditions there is a ρ̄ ∈ R^m, ρ̄ > 0, such that x* is a local minimizer
   of Φ̃_ρ(x) := Φ_ρ(x, y*) for all ρ ≥ ρ̄.
and

η^k := min_{i=1,...,n} η^k_i.          (5.5)
Another important ingredient for the SCP-algorithm is the procedure to modify the penalty
parameter vector in case of violation of the descent property (cf. step 6 of Algorithm 6), which
will be introduced now. The motivation for subsequently used factors can be found in [176].
Moreover, we define:

J := {j | 1 ≤ j ≤ m_eq} ∪ {j | m_eq + 1 ≤ j ≤ m, −y_j/ρ_j ≤ h_j(x)},
K := {j | 1 ≤ j ≤ m, j ∉ J}.
j ∈ J: if (h_j(x^k) > 0 and ∇h_j(x^k)^T(z^k − x^k) ≠ 0) or (h_j(x^k) < 0 and ∇h_j(x^k)^T(z^k − x^k) > 0):

    ρ_j := min{ κ_2 ρ_j , max{ κ_1 ρ_j , 2(v_j − y_j)/h_j(x^k) } },

else: ρ_j := κ_1 ρ_j.

j ∈ K: if v_j − y_j < 0:

    ρ_j := min{ κ_2 ρ_j , max{ κ_1 ρ_j , y_j (v_j − y_j) 4m / (η^k (δ^k)²) } },

else: ρ_j := κ_1 ρ_j.
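The update for an index j ∈ J might be coded as follows; the function name and scalar interface are ours, for illustration only, and grad_dir stands for ∇h_j(x^k)^T(z^k − x^k).

```cpp
#include <algorithm>

// Penalty update for one multiplier rho_j in the set J; kappa1 > 1 and
// kappa2 > kappa1 are fixed magnification factors. If the constraint
// indicates a violation of the descent property, the data-dependent
// candidate 2(v_j - y_j)/h_j(x^k) is used, safeguarded between
// kappa1*rho and kappa2*rho; otherwise rho is only mildly increased.
double update_rho_J(double rho, double hj, double grad_dir,
                    double vj, double yj, double kappa1, double kappa2) {
    bool grow = (hj > 0.0 && grad_dir != 0.0) ||
                (hj < 0.0 && grad_dir > 0.0);
    if (grow)
        return std::min(kappa2 * rho,
                        std::max(kappa1 * rho, 2.0 * (vj - yj) / hj));
    return kappa1 * rho;  // else branch
}
```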
5.3. Predictor-corrector interior point method
Step 6: If ∇Φ_ρ(x^k, y^k)^T p^k > −η^k (δ^k)²/2, update the penalty parameters ρ_j (j = 1, ..., m)
        according to Algorithm 5 and go to step 5;
        otherwise, let σ := 1.

Step 7: Compute f(x^k + σ(z^k − x^k)), h_j(x^k + σ(z^k − x^k)), j = 1, ..., M, and Φ_ρ((x^k, y^k) + σ p^k).

        else let σ^k := σ.

Step 9: Let (x^{k+1}, y^{k+1}) := (x^k, y^k) + σ^k p^k, k := k + 1.

Step 10: Compute ∇f(x^k), ∇h_j(x^k), j = 1, ..., M, and go to step 1.
We chose the predictor-corrector interior point method for the solution of our convex subproblems.
It has proven to be one of the most efficient interior point methods for linear programming
(see [89] and [93] for the development of the method and [89] for a numerical evaluation). The
basic structure of our problem does not differ very much from a linear program.
We proceed from problem (Psub^k) and modify it in order to prepare it for the application of
the predictor-corrector method. For that purpose we add nonnegative slack variables wherever
inequalities appear:
min f^k(x), x ∈ R^n,
s.t. h^k_j(x) = 0,  j = 1, ..., m_eq,
     h^k_j(x) + c_{j−m_eq} = 0,  j = m_eq + 1, ..., M,
     −c_j + r_j = 0,  j = 1, ..., M − m_eq,          (5.6)
     x̲⁰_i − x_i + s_i = 0,  i = 1, ..., n,
     x_i − x̄⁰_i + t_i = 0,  i = 1, ..., n,
     r, s, t ≥ 0.
The slacks s and t are introduced because the variable x is formally free and allowed to violate its
bounds. Slack r is not absolutely necessary, but with its usage we allow c, and in consequence
the dual variables of the original subproblem constraints, to become 0. If we do not use slack r, we
need c and the dual variable y to be positive. The computational effort is affected only slightly by
this modification.
Positivity is only demanded for the newly introduced variables r, s and t, variables that do
not appear outside the subproblems, i.e. in the main loop.
For the next step we add barrier terms corresponding to the slack variables and build the
Lagrangian of the modified problem:

L_µ(x, y_eq, y_ie, c, r, s, t, d_r, d_s, d_t) = f^k(x) − µ Σ_{j=1}^{M−m_eq} ln r_j − µ Σ_{i=1}^{n} ln s_i − µ Σ_{i=1}^{n} ln t_i
    + y_eq^T h^k_eq(x) + y_ie^T (h^k_ie(x) + c)
    + d_r^T (−c + r) + d_s^T (x̲⁰ − x + s) + d_t^T (x − x̄⁰ + t),

where h^k_eq := (h^k_1, ..., h^k_{m_eq})^T, h^k_ie := (h^k_{m_eq+1}, ..., h^k_M)^T, and y_eq, y_ie, d_r, d_s, d_t are the dual variable
vectors to the corresponding constraints. The combination of y_eq and y_ie corresponds to the
vector y of the last section, i.e. y = (y_eq, y_ie)^T. µ is a positive homotopy parameter. Formally,
all variables are free in this formulation; the barrier formulation, however, needs r, s and t to be
positive.
The necessary optimality condition ∇L_µ = 0 then reads:
where R = diag(r_1, ..., r_{M−m_eq}) (and S, T, D_r, D_s, D_t are defined analogously), e = (1, 1, ..., 1)^T
in the appropriate dimension, and

J_eq = ( ∂h^k_j(x)/∂x_i )_{i=1,...,n; j=1,...,m_eq} ∈ R^{n,m_eq}

(J_ie is built analogously from the inequality constraints).
The linear system to compute a Newton step for the solution of this system reads as follows:
where ∇_xx L = ∇² f^k(x) + d/dx (J_eq y_eq) + d/dx (J_ie y_ie). We will now discuss this term in a
little more detail; it is crucial for the effectiveness of the interior point method. For the equalities
the matrix J_eq is constant, and thus d/dx (J_eq y_eq) = 0. For the inequalities the term does not vanish:

J_ie y_ie = ( Σ_{j=m_eq+1}^{M} ∂h^k_j(x)/∂x_i · (y_ie)_{j−m_eq} )_{i=1,...,n}
  ⟹  ∂(J_ie y_ie)_k/∂x_i = Σ_{j=m_eq+1}^{M} ∂²h^k_j(x)/(∂x_k ∂x_i) · (y_ie)_{j−m_eq}.

Since the functions h^k_j (j = m_eq+1, ..., M) are separable, we have ∂²h^k_j(x)/(∂x_k ∂x_i) = 0 if k ≠ i,
i.e. these Hessians are diagonal. Thus, ∇_xx L is completely determined, since

d/dx (J_ie y_ie) = diag( Σ_{j=m_eq+1}^{M} ∂²h^k_j(x)/∂x_k² · (y_ie)_{j−m_eq} )_{k=1,...,n}

and f^k is separable, too. From a computational point of view, ∇_xx L is in consequence not an
n × n matrix but an n-vector, a central point when considering problems with a large number of
variables. Remember that each component of ∇_xx L is strictly positive due to the convexity
properties of the functions involved.
For the purpose of computational efficiency, let us have a look at the nonzero structure of the
matrix J_ie. For one particular component of the gradient of a constraint we have (cf. (5.2)):

∂h^k_j(x)/∂x_i = { ∂h_j(x^k)/∂x_i · (U^k_i − x^k_i)² / (U^k_i − x_i)²,  if ∂h_j(x^k)/∂x_i ≥ 0,
                   ∂h_j(x^k)/∂x_i · (x^k_i − L^k_i)² / (x_i − L^k_i)²,  if ∂h_j(x^k)/∂x_i < 0.     (5.7)

Since the second factor is in both cases strictly positive, the nonzero structure of J_ie at an
arbitrary point is identical to the nonzero structure of the original constraints at the current
main iteration point x^k. The same is true for the matrix J_eq. That means sparsity in
the original problem is preserved for the subproblems.
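The sparsity-preservation argument can be checked directly from (5.7): the approximated derivative is the original derivative times a strictly positive factor, so zero entries stay zero. A scalar sketch, with a function name of our choosing:

```cpp
// Component (5.7) of the gradient of the MMA approximation: the
// original derivative dh_j(x^k)/dx_i is multiplied by a strictly
// positive factor, so the sign is kept and the sparsity pattern of the
// original Jacobian carries over to the subproblem.
double mma_deriv(double dh_xk,  // dh_j(x^k)/dx_i
                 double xk, double L, double U, double x) {
    if (dh_xk >= 0.0) {
        double q = (U - xk) / (U - x);
        return dh_xk * q * q;   // (U - x^k)^2 / (U - x)^2 > 0
    }
    double q = (xk - L) / (x - L);
    return dh_xk * q * q;       // (x^k - L)^2 / (x - L)^2 > 0
}
```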
We will now briefly state the principles of the predictor-corrector interior point algorithm,
which has been chosen for the solution of (Psub^k).
Remark 5.3.1
• For the initialization in Step 0 one can take the values of the previous main (MMA/SCP)
  iteration. One possibly has to take care of the positivity condition.
• The factor 0.99995 prevents the components of the critical vectors from getting too close
  to 0.
• δ^ip_primal and δ^ip_dual are the stepsizes that ensure the interior point condition. δ_primal
  additionally takes the definition of the approximations into account, where we have the stronger
  demand x ∈ X⁰. δ_dual ensures that y_ie ≥ 0 throughout, to guarantee the positivity of the
  Hessian of the Lagrangian.
We will now consider step 2 of the algorithm above, i.e., we will analyze how the linear system
(S1) can be solved for the predictor step in practice.
µ_{k+1} := 1/(n + n + M − m_eq) [ (s^k + δ'_primal ∆s)^T (d^k_s + δ'_dual ∆d_s)
         + (t^k + δ'_primal ∆t)^T (d^k_t + δ'_dual ∆d_t)
         + (r^k + δ'_primal ∆r)^T (d^k_r + δ'_dual ∆d_r) ],

where ∆s, ∆t, ∆r, ∆d_s, ∆d_t and ∆d_r are the results of the predictor step and δ'_primal and δ'_dual
are intermediate stepsizes computed in the same way as in step 5 of Algorithm 7.
This update rule is well approved in linear programming, cf. [167] or [160], and works
very well in our context, too.
⟹ ∇_xx L ∆x + J_eq ∆y_eq + J_ie ∆y_ie + S^{−1} D_s ∆s − T^{−1} D_t ∆t = −∇f^k(x) − J_eq y_eq − J_ie y_ie.   (5.11)

∆r = ∆c + c − r,            (5.13)
∆s = ∆x + x − x̲⁰ − s,       (5.14)
∆t = −∆x − x + x̄⁰ − t.      (5.15)

( ∇_xx L + S^{−1}D_s + T^{−1}D_t   J_eq     J_ie        ) ( ∆x    )   (  γ_1       )
( J_eq^T                           0        0           ) ( ∆y_eq ) = ( −h^k_eq(x) )          (S2)
( J_ie^T                           0        −D_r^{−1} R ) ( ∆y_ie )   (  γ_2       )
This system is indefinite and has full rank provided that J_eq and J_ie are of full rank (remember
∇_xx L > 0). Its dimension is (n+M) × (n+M). The upper left and lower right parts are diagonal.
Thus, the matrix can be considered sparse. Additional sparsity of the Jacobians improves the
situation. It is now possible to solve this system with a sparse indefinite linear system solver.
SCPIP does not support this approach. We will proceed with two more possibilities and will
show which approach is favorable in which situation.
Together we have

( J_eq^T Θ^{−1} J_eq   J_eq^T Θ^{−1} J_ie              ) ( ∆y_eq )   ( h^k_eq(x) + J_eq^T Θ^{−1} γ_1 )
( J_ie^T Θ^{−1} J_eq   J_ie^T Θ^{−1} J_ie + D_r^{−1} R ) ( ∆y_ie ) = ( J_ie^T Θ^{−1} γ_1 − γ_2       )          (S3)

where Θ abbreviates the diagonal matrix ∇_xx L + S^{−1}D_s + T^{−1}D_t from (S2).
This system is positive definite. Its dimension is M × M. If J_eq and J_ie are sparse, then we can
hope that the same is true for the matrix in (S3). Unfortunately, this cannot be ensured. In the
context of linear programming this case has been examined extensively. If one of the Jacobians has
at least one dense column, then the matrix is dense, too. On the other hand, there are techniques
to overcome this situation by splitting dense columns. It is beyond our scope to
outline these techniques here; the reader is referred to the book of Wright [167] and the references
cited therein. Note that it is not worthwhile to think about sparse positive definite systems if the
Jacobians are altogether dense; then, of course, the matrix in (S3) is also dense.
In SCPIP there are three possibilities to solve the linear systems (S3); parameter "SubProb-
lemStrat" has to be set to 1.
• A dense Cholesky solver can be chosen. This is favorable in case of a small or medium
  number of constraints (M). In this case, parameter "LinEqSolver" has to be set to 1.
• A sparse Cholesky solver can also be chosen. This should be the best approach (of
  these three) for large M and sparse Jacobians J_ie and J_eq, provided this does not lead to a
  dense matrix in (S3). "LinEqSolver" has to be set to 2.
• The third possibility is a conjugate gradient (cg) solver, i.e. an iterative solution method.
  This is favorable in case of large M and dense J_ie and J_eq, or large M and sparse
  Jacobians if these lead to a dense matrix in (S3). "LinEqSolver" has to be set to 3.
As in the last subsection, this system is positive definite. Its dimension is n × n. The remarks
about sparsity are also valid here, with columns replaced by rows. The impact of dense rows or
columns is examined in more detail in the book of Vanderbei [160].
In SCPIP there are three possibilities to solve the linear systems (S4); parameter "SubProb-
lemStrat" has to be set to 2. This is analogous to the last subsection.
• A dense Cholesky solver can be chosen. This is favorable in case of a small or medium
  number of variables (n). In this case, parameter "LinEqSolver" has to be set to 1.
• A sparse Cholesky solver can be chosen. This should be the best approach (of these
  three) for large n and a sparse Jacobian J_ie, provided it does not lead to a dense matrix in (S4).
  (LinEqSolver = 2)
• The third possibility is a cg solver. This is favorable in case of large n and a dense J_ie, or
  large n and a sparse J_ie if it leads to a dense matrix in (S4). (LinEqSolver = 3)
Let us summarize these properties: In the author's experience, the dual approach is
no longer relevant. Its advantage, that in the case of few constraints one can reduce the
computational effort to the solution of M × M systems, is also offered by approach (S3), and the
interior point approaches are much more stable. In case of large n and small M, approach (S3)
is favorable, whereas approach (S4) is favorable if n is small and M is large. The choice is not
so easy if n and M are in the same range. Then approaches (S3) and (S4) are comparable; it
depends on the sparsity of Jeq and Jie. In cases where a sparse Jacobian causes dense positive
definite systems, approach (S2) is also a practical alternative.
Which specific linear system solver should be chosen is a difficult question. For small and
medium dimensions the dense Cholesky solver is usually the best choice. The sparse Cholesky
option is the best approach for large dimensions and sparse matrices. In case of large dense
matrices, or if the Cholesky decomposition needs too much storage, the iterative solver is the best
possibility. The best point at which to switch from one possibility to the other is hard
to predict; it is a matter of further research and numerical experience.
The same eliminations that have been done for the predictor system are applied to the
corrector system. First, equations 5, 6 and 7 of (S1) yield:
(5.27)
Further eliminations in equations 8, 9 and 10 of (S1) yield (as for the predictor step):

∆r = ∆c + c − r ,            (5.29)
∆s = ∆x + x − x^0 − s ,      (5.30)
∆t = −∆x − x + x^0 − t .     (5.31)
Jie ∆x − Dr^{−1}R ∆yie = −h_ie^k(x) − Dr^{−1}R(−yie + dr + R^{−1}(µe − ∆r∆dr)) =: γ4 .

Putting all together we get the following linear system:

⎡ ∇xx L + S^{−1}Ds + T^{−1}Dt    Jeq^T    Jie^T      ⎤ ⎡ ∆x   ⎤   ⎡ γ3         ⎤
⎢ Jeq                            0        0          ⎥ ⎢ ∆yeq ⎥ = ⎢ −h_eq^k(x) ⎥    (S5)
⎣ Jie                            0        −Dr^{−1}R  ⎦ ⎣ ∆yie ⎦   ⎣ γ4         ⎦
In comparison with the corresponding predictor system (S2), the matrix remains unchanged;
only the right hand side has changed. This is important for justifying an iteration that requires
two linear system solutions. A direct linear solver decomposes the matrix in the predictor
step and reuses this decomposition in the corrector step, so that only an additional
backward substitution is necessary. This is also true for the other two approaches that resulted
in the equations (S3) and (S4) for the predictor step. We will not formulate the corresponding
corrector equations; they are obtained by replacing γ1 with γ3 and γ2 with γ4.
The advantage of using a decomposed matrix twice does not apply to the cg method.
Presently, the cg solver is also applied twice, but in a future release it is planned to use the
classical primal-dual interior point approach, which needs only one linear system solution per
iteration but usually a few more iterations than the predictor-corrector method.
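The cost argument for the direct solvers can be illustrated with a generic dense Cholesky factorization: the matrix is factored once, and both the predictor and the corrector right-hand sides are then handled by cheap forward/backward substitutions. This is a self-contained sketch, not the SCPIP implementation; the names are illustrative.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Factor a symmetric positive definite matrix A = L L^T (lower triangular L).
Matrix CholeskyFactor( const Matrix & A )
{
    int n = (int)A.size();
    Matrix L( n, std::vector<double>( n, 0.0 ) );
    for ( int i = 0; i < n; ++i )
        for ( int j = 0; j <= i; ++j )
        {
            double dSum = A[i][j];
            for ( int k = 0; k < j; ++k ) dSum -= L[i][k] * L[j][k];
            L[i][j] = ( i == j ) ? std::sqrt( dSum ) : dSum / L[j][j];
        }
    return L;
}

// Solve L L^T x = b: one forward and one backward substitution per right-hand
// side; the expensive factorization is NOT repeated for the corrector step.
std::vector<double> SolveWithFactor( const Matrix & L, std::vector<double> b )
{
    int n = (int)b.size();
    for ( int i = 0; i < n; ++i )           // forward: L y = b
    {
        for ( int k = 0; k < i; ++k ) b[i] -= L[i][k] * b[k];
        b[i] /= L[i][i];
    }
    for ( int i = n - 1; i >= 0; --i )      // backward: L^T x = y
    {
        for ( int k = i + 1; k < n; ++k ) b[i] -= L[k][i] * b[k];
        b[i] /= L[i][i];
    }
    return b;
}
```

In the predictor-corrector iteration, `CholeskyFactor` would be called once per interior point step, and `SolveWithFactor` once for the predictor right-hand side and once for the corrector right-hand side.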
Theorem 5.4.1 Let the sequence (x^k, y^k)_{k=0,1,2,...} be produced by the SCP-algorithm, let all
subproblems be solvable, and let the gradients of active constraints at the optimal points of (P_sub^k) be
linearly independent, as well as those of (P_sub^∗), where (P_sub^∗) is the subproblem defined in any possible
accumulation point x^∗ of (x^k)_{k=0,1,2,...}.
Then all points x^k are in X and all y^k are in a compact set U, i.e., there is a y_max with
|y_j^k|, |v_j^k| ≤ y_max for all k ≥ 0 and j = 1, ..., m.
Theorem 5.4.2 Let the assumptions of Theorem 5.4.1 be valid, let p^k, η^k and δ^k be defined as in
Algorithm 6, and let the choice of asymptotes be feasible.
a) Then there are penalty parameters ρ_j^k > 0 (j = 1, ..., m) such that p^k is a direction of
descent for the augmented Lagrangian function Φ_ρ for all ρ ≥ ρ^k, that means

∇Φ_ρ(x^k, y^k)^T p^k ≤ − η^k (δ^k)^2 / 2   for all ρ ≥ ρ^k .

b) For each fixed δ > 0 there is a finite ρ_δ such that for all (x^k, y^k) with δ^k ≥ δ we have

∇Φ_ρ(x^k, y^k)^T p^k ≤ − η^k (δ^k)^2 / 2 ≤ − η δ^2 / 2   for all ρ ≥ ρ_δ .
The first part of this theorem shows that we can always find penalty parameters such
that the computed search direction is a direction of descent for the augmented Lagrangian. The
second part guarantees that the penalty parameters are uniformly bounded, provided that we
are not too close to a stationary point.
This suffices to prove a weak convergence theorem, i.e. the existence of an accumulation point
such that at least one accumulation point is stationary. To prove a strong convergence
theorem, i.e. that, in addition to the weak result, each accumulation point is stationary, we have
to show that the penalty parameters are uniformly bounded in a neighborhood of a
stationary point, too. For this purpose we need some additional but reasonable assumptions.
Recall that

J := {j | 1 ≤ j ≤ meq} ∪ {j | meq + 1 ≤ j ≤ m; −y_j/ρ_j ≤ h_j(x)} ,
K := {j | 1 ≤ j ≤ m; j ∉ J} .
Theorem 5.4.3 Let the assumptions of Theorem 5.4.1 be valid and assume a feasible choice of
asymptotes. For δ^k ≠ 0 we define α^k := ‖y^k − v^k‖^2 / (δ^k)^2. Let (x^k, y^k) be determined by Algorithm 6
and fulfill the following two conditions:
∇Φ_{ρ^0}(x^k, y^k)^T p^k ≤ − η (δ^k)^2 / 2   for all (x^k, y^k) with δ^k ≤ δ^0 ,
Theorem 5.4.3 would suffice to prove strong convergence, but especially assumption a) does not
seem to be necessarily fulfilled if we are far away from a stationary point. Therefore, we distinguish
between Theorem 5.4.2 and Theorem 5.4.3. Notice that we do not use any information about
δ^k in the proof of Theorem 5.4.3.
An immediate consequence of Theorems 5.4.2 and 5.4.3 is:
Corollary 5.4.1 Let the sequence (x^k, y^k)_{k=0,1,...} be produced by Algorithm 6 and let the assumptions
of Theorems 5.4.2 and 5.4.3 be valid. Then the sequence of penalty parameter vectors is
bounded, i.e. there is a ρ̄ < ∞ and a k̄ < ∞ such that ρ^k = ρ̄ for all k ≥ k̄ ≥ 0.
With the results of this section it is now possible to formulate the main convergence theorem.
Theorem 5.4.4 Let the sequence (x^k, y^k)_{k=0,1,...} be produced by Algorithm 6 and let the assumptions
of Theorems 5.4.2 and 5.4.3 be valid. Then the sequence has at least one accumulation point
and each accumulation point is stationary.
5.5. Program Documentation
about errors.
0: no additional output (default)
1: only final convergence analysis
2: one line of intermediate results
3: more detailed and intermediate results
– "OutputUnit" Put output unit for the Fortran Output.
– "OutUnit" Same as ”OutputUnit”
• Put( const char * pcParam , const double * pdValue ) ;
or
Put( const char * pcParam , const double dValue ) ;
where pcParam is one of the following parameters:
– "TermAcc" Put desired final accuracy.
– "Infinity" Put the number which shall represent infinity.
– "ActiveThreshold"Put threshold for activity constraints.
∗ ActiveThreshold > 0: All inequality constraints with values ¡ ActiveThreshold
are going into the subproblem, i.e. are considered as potentially active.
Other constraints are not considered in the submodel.
If no active-set-strategy is desired, it is recommended to let ActiveThreshold
be a large number, e.g. Infinity, which is the default.
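The effect of ActiveThreshold can be sketched as a simple filter over the current inequality constraint values. This mimics the documented behavior; it is not the library's internal code, and the helper name is illustrative.

```cpp
#include <cassert>
#include <vector>

// Sketch of the active-set strategy controlled by "ActiveThreshold":
// inequality constraints whose value lies below the threshold are considered
// potentially active and enter the subproblem; all others are left out.
std::vector<int> PotentiallyActive( const std::vector<double> & vConstrVals,
                                    double dActiveThreshold )
{
    std::vector<int> vIdx;
    for ( int i = 0; i < (int)vConstrVals.size(); ++i )
        if ( vConstrVals[i] < dActiveThreshold )
            vIdx.push_back( i ); // close to (or beyond) its bound: keep it
    return vIdx;
}
```

With the default (ActiveThreshold = Infinity), every constraint passes the filter, i.e. no active-set strategy is applied.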
5. Check status of the object using GetStatus( int & iStatus) const
• if iStatus equals SolvedId(): The final accuracy has been achieved, the problem is
solved.
• if iStatus is > 0: An error occurred during the solution process.
with pcParam = "ActConstr" sets piActPtr to an array of length equal to the num-
ber of constraints, whose entries are 0 for an inactive constraint and 1 for an active
constraint. Note: piActPtr must be NULL on input.
For access to the design variable vector see 6(a)ii.
For the gradients of the constraints you can use
PutDerivConstr( const int iConstrIdx, const int iVariableIdx,
const double dDerivativeValue )
for iVariableIdx = 0,...,NumberOfVariables - 1
and iConstrIdx = 0,...,NumberOfConstraints - 1
where dDerivativeValue is defined as the derivative of the iConstrIdx’th con-
straint with respect to the iVariableIdx’th variable.
Alternatively, you can use
PutGradConstr( const int iConstrIdx, const double * pdGradient ),
for iConstrIdx = 0,...,NumberOfConstraints - 1
7. Output
• GetConstrValVec( const double * & pdPointer ) const
Returns a pointer to an array containing the values of all constraint functions at the
last solution vector.
• GetConstrVal( const int iConstrIdx, double & dConstrVal ) const
Returns the value of the iConstrIdx’th constraint function at the last solution vector.
• GetDerivConstr( const int iConstrIdx , const int iVariableIdx,
double & dDerivativeValue ) const
Returns the value of the derivative of the iConstrIdx’th constraint with respect to
the iVariableIdx’th design variable at the last solution vector.
• GetDerivObj( const int iVariableIdx, double & dDerivativeValue,
const int iFunIdx ) const
with iFunIdx = 0 (default) returns the value of the derivative of the objective with
respect to the iVariableIdx’th design variable at the last solution vector.
6. CobylaWrapper - Gradient-free Constrained
Optimization by Linear Approximations
6.1. Algorithm
The algorithm by Powell employs linear approximations of the objective and constraint functions,
the approximations being formed by linear interpolation at N + 1 points in the space of the
variables. The interpolation points are regarded as vertices of a simplex. The parameter ρ
controls the size of the simplex and is reduced automatically from ρBegin to ρEnd .
For each ρ the subroutine tries to achieve a good vector of variables for the current simplex size;
then ρ is reduced until the value ρEnd is reached. Therefore, ρBegin and ρEnd should be set to
reasonable initial changes to the variables and to the required accuracy in the variables, respectively.
This accuracy should be viewed as a subject for experimentation, because it is not guaranteed.
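The outer loop over simplex sizes can be sketched as follows. The halving factor is an assumption for illustration only; COBYLA's actual update of ρ is more elaborate, and the function name is hypothetical.

```cpp
#include <cassert>

// Sketch of the simplex-size schedule: for each value of rho the method works
// at the current resolution, then rho is reduced until rhoEnd is reached.
// The reduction factor of 0.5 is an assumed value for illustration.
int CountRhoStages( double dRhoBegin, double dRhoEnd )
{
    int iStages = 0;
    for ( double dRho = dRhoBegin; dRho > dRhoEnd; dRho *= 0.5 )
        ++iStages; // one stage of optimization per simplex size
    return iStages;
}
```

The sketch shows why ρEnd acts as the requested accuracy in the variables: it is the last resolution at which the simplex is refined.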
The algorithm has an advantage over many of its competitors: it treats
each constraint individually when calculating a change to the variables, instead of lumping the
constraints together into a single penalty function.
The name of the subroutine is derived from the phrase Constrained Optimization BY Linear
Approximations.
For more information on the algorithm see Powell [108] and [109].
6.2. Program Documentation
where pdXu and pdXl must point to arrays of length equal to the number of
design variables.
5. Check status of the object using GetStatus( int & iStatus) const
• if iStatus equals SolvedId(): The final accuracy has been achieved, the problem is
solved.
• if iStatus is > 0: An error occurred during the solution process.
• if iStatus equals EvalFuncId(): CobylaWrapper needs new function values → 6b
New function values.
After passing these values to the CobylaWrapper object go to 4.
7. Output
• GetConstrValVec( const double * & pdPointer ) const
Returns a pointer to an array containing the values of all constraint functions at the
last solution vector.
• GetConstrVal( const int iConstrIdx, double & dConstrVal ) const
Returns the value of the iConstrIdx’th constraint function at the last solution vector.
• GetObjVal( double & dObjVal, const int iFunIdx ) const
with iFunIdx = 0 (default) returns the value of the objective function at the last
solution vector.
• GetDesignVar( const int iVariableIdx, double & dValue, const int iParSysIdx
) const
with iParSysIdx = 0 (default) returns the value of the iVariableIdx’th design vari-
able in the last solution vector.
• GetDesignVarVec( const double * & pdPointer, const int iParSysIdx) const
with iParSysIdx = 0 (default) returns a pointer to the last solution vector.
7. MisqpWrapper - Sequential Quadratic
Programming with Trust Region
Stabilization
MisqpWrapper is a C++ Wrapper for the Fortran subroutine MISQP. A sequential quadratic
programming method for nonlinear optimization problems is implemented that applies trust
region techniques instead of a line search approach to achieve fast local convergence. MISQP
allows continuous, binary, integer, and catalogued variables which can only take values within
a given discrete set. Thus, MISQP also solves mixed-integer nonlinear programming problems,
see Section 22 for more details. Parts of the following sections are taken from Exler et al. [40].
7.1. Introduction
The implemented trust region algorithm was proposed by Yuan [171] and addresses the gen-
eral optimization problem to minimize an objective function f under nonlinear equality and
inequality constraints,
min f(x)
x ∈ R^n :   gj(x) = 0 ,  j = 1, . . . , me ,        (7.1)
            gj(x) ≥ 0 ,  j = me + 1, . . . , m ,
            xl ≤ x ≤ xu
where x denotes the vector of continuous variables that is restricted by box constraints, i.e.,
lower bounds xl and upper bounds xu. It is assumed that the problem functions f(x) and gj(x),
j = 1, . . . , m, are twice continuously differentiable with respect to all x ∈ R^n, where n denotes
the number of design variables.
Trust region methods were invented many years ago, first for unconstrained optimization,
especially for least squares optimization, see for example Powell [107] or Moré [95]. Extensions
were developed for non-smooth optimization, see Fletcher [43], and for constrained optimiza-
tion, see Celis [21], Powell and Yuan [110], Byrd et al. [20], Toint [158] and many others. A
comprehensive review on trust region methods is given by Conn, Gould, and Toint [26].
On the other hand, sequential quadratic programming or SQP methods are among the most
frequently used algorithms to solve practical optimization problems. The theoretical background
is described e.g. in Stoer [151], Fletcher [44], or Gill et al. [54].
The basic idea of trust region methods is to compute a new trial step dk by a second order
model or a close approximation, where k denotes the k-th iteration. The stepsize is restricted
by a trust region radius ∆k, i.e., ‖dk‖ ≤ ∆k with ‖.‖ being an arbitrary norm. Subsequently,
a ratio rk of the actual and the predicted improvement subject to a certain merit function is
computed. Depending on the calculated ratio, the trial step is either accepted and a move to
a new iterate is performed, or the trial step is rejected and the iterate remains unchanged.
Moreover, the trust region radius is either enlarged or decreased depending on the deviation of
rk from the ideal value rk = 1. If sufficiently close to a solution, the artificial bound ∆k should
not become active, so that the new trial step proposed by the second order model can always
be accepted. For more details see Exler and Schittkowski [41] or Exler et al. [39].
where x ∈ Rn is the primal variable and u = (u1 , . . . , um )T ∈ Rm the multiplier vector, and the
exact L∞ penalty function is given by
Here we combine all constraints in one vector, g(x) = (g1(x), . . . , gm(x))^T, and the minus-sign
superscript denotes the constraint violation

            ⎧ gj(x) ,           if j ≤ me ,
gj(x)^− :=  ⎨                                        (7.5)
            ⎩ min(0, gj(x)) ,   otherwise ,
j = 1, . . . , m. Let the penalty parameter of the exact L∞ penalty function (7.4) be denoted by
σ > 0. As σ has to be sufficiently large, its value is adapted by the algorithm. The penalty function
(7.4) is frequently applied to enforce convergence, see for example Fletcher [43], Burke [18], or Yuan [171].
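Definition (7.5) translates directly into code. The following helper is an illustrative sketch using 0-based indices (j < iMe marks an equality constraint); the function name is not part of the library.

```cpp
#include <algorithm>
#include <cassert>

// Constraint violation g_j(x)^- as defined in (7.5): an equality constraint
// contributes its full value, an inequality constraint only its negative part.
// j is a 0-based constraint index; iMe is the number of equality constraints.
double ConstrViolation( double dGj, int j, int iMe )
{
    if ( j < iMe )
        return dGj;                 // equality constraint: g_j(x)
    return std::min( 0.0, dGj );    // inequality constraint: min(0, g_j(x))
}
```

A feasible point thus has `ConstrViolation(...) == 0` for every inequality constraint and `== 0` for every equality constraint as well.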
To obtain an SQP or sequential quadratic programming method, we compute a trial step dk
at iteration k towards the next iterate that is a solution to the subproblem
min φk(d) := ½ d^T Bk d + ∇f(xk)^T d + σk ‖( g(xk) + ∇g(xk)^T d )^−‖∞
d ∈ R^n :                                                                  (7.6)
            ‖d‖∞ ≤ ∆k
The constraints are linearized and are treated as a penalty term subject to the maximum
norm. The matrix Bk is an approximation to the Hessian of the Lagrangian function (7.3).
A particular advantage is that we do not need further safeguards in case of inconsistent linear
systems. Note that (7.6) is equivalent to the quadratic program
min ½ d^T Bk d + ∇f(xk)^T d + σk µ
d ∈ R^n , µ ∈ R :   −gj(xk) − ∇gj(xk)^T d + µ ≥ 0 ,  j = 1, . . . , me ,      (7.7)
                     gj(xk) + ∇gj(xk)^T d + µ ≥ 0 ,  j = 1, . . . , m ,
                     ‖d‖∞ ≤ ∆k ,  µ ≥ 0 ,
7.2. The Trust Region SQP Method
which can be solved by any black-box quadratic programming solver. The optimal solution is
denoted by dk and uk is the corresponding multiplier. The penalty parameter σk and the trust
region radius ∆k are iteratively adapted.
The decision whether a trial step dk is accepted or not plays a major role in a trust region
method. Rejecting some trial steps is required to ensure the convergence of the method. In case
a trial step is accepted, we move to the new point and set xk+1 := xk + dk; otherwise, the point
for the next iteration remains unchanged, i.e., xk+1 := xk. The idea is to check the quotient of
the actual and the predicted improvement of the merit function,

rk := ( Pσk(xk) − Pσk(xk + dk) ) / ( φk(0) − φk(dk) ) ,        (7.8)

where Pσk(x) denotes the penalty function (7.4) to measure the actual change, and φk(d) denotes
the objective function of subproblem (7.6). The proposed method accepts all trial steps in case
the ratio rk is greater than zero. Otherwise, the step is rejected.
In addition, a new trust region radius ∆k+1 for the next iteration k + 1 has to be determined. If
rk is close to one or even greater than one, then the trust region radius ∆k is enlarged; if
the ratio rk is very small, ∆k is decreased. If rk lies in the intermediate range, ∆k is not
changed at all. More formally, we use the same constants proposed by Yuan [171] and set
         ⎧ max(2∆k , 4‖dk‖∞) ,        if rk > 0.9 ,
∆k+1 :=  ⎨ ∆k ,                       if 0.1 ≤ rk ≤ 0.9 ,        (7.9)
         ⎩ min(∆k /4 , ‖dk‖∞ /2) ,    if rk < 0.1 .
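The acceptance test and the radius update (7.9) can be sketched as follows, with the constants proposed by Yuan; the function names are illustrative, not library API.

```cpp
#include <algorithm>
#include <cassert>

// Trust region radius update (7.9). dDeltaK is the current radius,
// dStepNorm is ||d_k||_inf, and dRk is the ratio of actual to predicted
// improvement from (7.8).
double NextTrustRadius( double dDeltaK, double dStepNorm, double dRk )
{
    if ( dRk > 0.9 )
        return std::max( 2.0 * dDeltaK, 4.0 * dStepNorm ); // enlarge
    if ( dRk >= 0.1 )
        return dDeltaK;                                    // keep unchanged
    return std::min( dDeltaK / 4.0, dStepNorm / 2.0 );     // shrink
}

// A trial step is accepted whenever the ratio r_k is greater than zero.
bool AcceptStep( double dRk )
{
    return dRk > 0.0;
}
```

Note that a step can be accepted (rk > 0) while the radius is still shrunk (rk < 0.1); acceptance and radius adaptation are separate decisions.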
Let the solution be dˆk. Note that subproblem (7.10) can be transformed into a differentiable
quadratic problem in the same way as subproblem (7.6). The quotient rk is replaced by
that is, rk := r̂k. The newly determined ratio rk is then used to decide whether the step with
second order correction is accepted or not. The extension of the aforementioned algorithm is
straightforward.
For an algorithm that applies second order correction steps, the superlinear convergence rate
can be proved, see Fletcher [43] and Yuan [170].
7.3. Program Documentation
Before setting an initial guess, the number of design variables must be set.
4. Start Misqp-Routine: StartOptimizer()
5. Check status of the object using GetStatus( int & iStatus) const
• if iStatus equals SolvedId():
The final accuracy has been achieved, the problem is solved.
• if iStatus is > 0:
An error occurred during the solution process.
• if iStatus equals EvalFuncId() :
MisqpWrapper needs new function values → 6b New function values.
After passing these values to the MisqpWrapper object go to 4.
7. Output
• GetConstrValVec( const double * & pdPointer ) const
Returns a pointer to an array containing the values of all constraint functions at the
last solution vector.
• GetConstrVal( const int iConstrIdx, double & dConstrVal ) const
Returns the value of the iConstrIdx’th constraint function at the last solution vector.
• GetDerivConstr( const int iConstrIdx , const int iVariableIdx,
double & dDerivativeValue ) const
Returns the value of the derivative of the iConstrIdx’th constraint with respect to
the iVariableIdx’th design variable at the last solution vector.
• GetDerivObj( const int iVariableIdx, double & dDerivativeValue,
const int iFunIdx ) const
with iFunIdx = 0 (default) returns the value of the derivative of the objective with
respect to the iVariableIdx’th design variable at the last solution vector.
• GetObjVal( double & dObjVal, const int iFunIdx ) const
with iFunIdx = 0 (default) returns the value of the objective function at the last
solution vector.
• GetDesignVar( const int iVariableIdx,
double & dValue,
const int iParSysIdx ) const
with iParSysIdx = 0 (default) returns the value of the iVariableIdx’th design vari-
able in the last solution vector.
8. QlWrapper - Quadratic optimization with
linear constraints
QlWrapper is a C++ Wrapper for the Fortran subroutine QL which solves strictly convex
quadratic programming problems subject to linear equality and inequality constraints by the
primal-dual method of Goldfarb and Idnani. An available Cholesky decomposition of the ob-
jective function matrix can be provided by the user. Bounds are handled separately. The code
is designed for solving small-scale quadratic programs in a numerically stable way. Its usage is
outlined and an illustrative example is presented. Section 8.1 is taken from Schittkowski [139].
8.1. Introduction
The code solves the strictly convex quadratic program
min ½ x^T C x + d^T x
x ∈ R^n :   aj^T x + bj = 0 ,  j = 1, . . . , me ,        (8.1)
            aj^T x + bj ≥ 0 ,  j = me + 1, . . . , m ,
            xl ≤ x ≤ xu
avoided. A generalization of the method introduced in this section is published by Boland [11]
for the case that C is positive semi-definite.
The implementation of the code goes back to Powell [106], and the algorithmic details are
found in that reference. Apart from a few internal changes concerning numerical details, the main
extensions of QL compared to ZQPCVX are the separate handling of upper and lower bounds
and the optional provision of the Cholesky factor of C.
As part of the sequential quadratic programming code NLPQL for constrained nonlinear
programming, QL is frequently used in practice. Even many of the standard test problems of the
collections of Hock and Schittkowski [68] and Schittkowski [124] are badly scaled, ill-conditioned,
or even degenerate. The reliability of an SQP solver depends mainly on the numerical efficiency
and stability of the code solving the quadratic programming subproblem. As verified by the
comparative study of Schittkowski [132], NLPQL and QL successfully solve all 306 test problems
under consideration. In addition, all 1,000 test examples of the interactive data fitting system
EASY-FIT, see Schittkowski [128], are solved by the code DFNLP. This subroutine is an extension
of NLPQL and also depends on the stable solution of quadratic programming subproblems, see
Schittkowski [125].
8.2. Program Documentation
where pdXu and pdXl must point to arrays of length equal to the number of design
variables.
4. Set matrices and vectors for the objective and constraint functions using
6. Check status of the object using GetStatus( int & iStatus) const
• if iStatus equals SolvedId():
The final accuracy has been achieved, the problem is solved.
• if iStatus is > 0:
An error occurred during the solution process.
7. Useful methods:
a) GetNumConstr( int & iNumConstr ) const
Returns the number of constraints.
GetNumDv( int & iNumDesignVar ) const
Returns the number of design variables.
b) GetDesignVarVec( const double * & pdPointer, const int i ) const
Returns a pointer to the i’th design variable vector ( default: i=0 ).
GetDesignVar( const int iVarIdx, double & pdPointer,
const int iVectorIdx ) const
Returns the value of the iVarIdx’th design variable of the iVectorIdx’th design
variable vector (default: iVectorIdx=0).
8. Output
a) GetDesignVar( const int iVariableIdx, double & dValue, const int iParSysIdx
) const
with iParSysIdx = 0 returns the value of the iVariableIdx’th design variable in
the last solution vector.
b) GetDesignVarVec( const double * & pdPointer, const int iParSysIdx) const
with iParSysIdx = 0 returns a pointer to the last solution vector.
8.3. Example
Consider the following problem:
Minimize   ½ Σ_{i=0}^{4} x_i^2 − 21.98 x0 − 1.26 x1 + 61.39 x2 + 5.3 x3 + 101.3 x4
s.t.       −7.56 x0 + 0.5 x4 + 39.1 ≥ 0 ,
           −100 ≤ xi ≤ 100 , i = 0, . . . , 4 ,
           xi ∈ R , i = 0, . . . , 4 .

In matrix notation this reads:

Minimize   ½ x^T C x + d^T x ,
s.t.       a0^T x + b0 ≥ 0 ,
           −100 ≤ xi ≤ 100 , i = 0, . . . , 4 ,
           xi ∈ R , i = 0, . . . , 4 ,

with C the 5 × 5 identity matrix, d = (−21.98, −1.26, 61.39, 5.3, 101.3)^T,
a0 = (−7.56, 0, 0, 0, 0.5)^T and b0 = 39.1.
Now it can be solved by QlWrapper. The file QlExample.h contains the class definition:
#ifndef QLEXAMPLE_H_INCLUDED
#define QLEXAMPLE_H_INCLUDED

#include "OptProblem.h"


class QlExample : public OptProblem
{
public:

    QlExample();

    int FuncEval( bool bGradApprox = false );
    int GradEval();
    int SolveOptProblem();
};
#endif
The file QlExample.cpp contains the implementation:
#include "QlExample.h"
#include <iostream>
#include <cmath>
#include "QlWrapper.h"

using std::cout;
using std::endl;
using std::cin;

QlExample::QlExample()
{
    cout << endl << "--- this is the Ql Example ---" << endl;
    cout << endl << "---- solved by QlWrapper -----" << endl << endl;
    m_pOptimizer = new QlWrapper;
}

int QlExample::FuncEval( bool bGradApprox )
{
    // not needed
    return EXIT_FAILURE;
}

int QlExample::GradEval()
{
    // not needed
    return EXIT_FAILURE;
}

int QlExample::SolveOptProblem()
{
    int iError;

    iError = 0;

    m_pOptimizer->Put( "TermAcc", 1.0E-6 );
    m_pOptimizer->Put( "OutputLevel", 2 );
    m_pOptimizer->PutNumDv( 5 );
    m_pOptimizer->PutNumIneqConstr( 1 );
    m_pOptimizer->PutNumEqConstr( 0 );

    double * MatVec = new double [ 25 ];
    // ... (the listing continues; intermediate lines are omitted in the source) ...

    const double * dX;
    int iNumDesignVar;
    double dObjVal;

    m_pOptimizer->GetNumDv( iNumDesignVar );

    m_pOptimizer->GetDesignVarVec( dX );

    for ( int iDvIdx = 0; iDvIdx < iNumDesignVar; iDvIdx++ )
    {
        cout << "dX[" << iDvIdx << "] = " << dX[ iDvIdx ] << endl;
    }

    m_pOptimizer->GetObjVal( dObjVal );
    cout << "Value of objective: " << dObjVal << endl;


    return EXIT_SUCCESS;
}
9. IpoptWrapper - Interface to an open-source
large scale optimization algorithm
IpoptWrapper provides the possibility to use Ipopt through the Nlp++ interface. Details on the
algorithm of Ipopt can be found in [163, 162, 161, 98, 115]. Ipopt is not contained in Nlp++;
it must be obtained separately. After successful installation of Ipopt, Nlp++ must be adjusted
according to the instructions in NlpConfig.h.
3. Either the class constructor or the SolveOptProblem() function must allocate memory
for m_pOptimizer, e.g. using new IpoptWrapper.
4. Via m_pOptimizer, SolveOptProblem() then must define the problem using the Put-
methods:
a) Put the number of design variables to the object:
PutNumDv( int iNumDv ) ;
b) Put the values for the parameters you want to adjust:
• Put( const char * pcParam , const int * piValue ) ;
or
Put( const char * pcParam , const int iValue ) ;
• Put( const char * pcParam , const double * pdValue ) ;
or
Put( const char * pcParam , const double dValue ) ;
6. Call MyProblem::SolveOptProblem().
10. Example - Solving a constrained nonlinear
optimization problem using
active-set-strategy
As an illustrative example we solve the following problem using DefaultLoop():
// ... (the first lines of Example14.cpp are not shown in the source) ...
#include "ScpWrapper.h"
#include "CobylaWrapper.h"

using std::cout;
using std::endl;
using std::cin;

Example14::Example14()
{
    char s;

    cout << endl << "---- this is Example 14 ----" << endl << endl;

    // Choose optimizer:
    do
    {
        cout << endl << "Which Optimizer do you want to use? \n\n";
        cout << "[q] SqpWrapper \n";
        cout << "[c] ScpWrapper \n";
        cout << "[b] NlpqlbWrapper \n";
        cout << "[g] NlpqlgWrapper \n";
        cout << "[y] CobylaWrapper \n";
        cin >> s;

        if ( s == 'q' )
        {
            m_pOptimizer = new SqpWrapper();
        }
        else if ( s == 'c' )
        {
            m_pOptimizer = new ScpWrapper();
        }
        else if ( s == 'b' )
        {
            m_pOptimizer = new NlpqlbWrapper();
        }
        else if ( s == 'g' )
        {
            m_pOptimizer = new NlpqlgWrapper;
        }
        else if ( s == 'y' )
        {
            m_pOptimizer = new CobylaWrapper;
        }
        else
        {
            cout << "illegal input!" << endl;
        }
    }
    while ( s != 'c' &&
            s != 'q' &&
            s != 'g' &&
            s != 'y' &&
            s != 'b' );
}

int Example14::FuncEval( bool /* Parameter not needed */ )
{
    const double * dX;
    double dValue;
    int iError;

    iError = 0;

    m_pOptimizer->GetDesignVarVec( dX );

    // 0th constraint
    dValue = dX[0] + dX[1]/2.0 + dX[2]/3.0 + dX[3]/4.0 - 25.0/12.0;
    m_pOptimizer->PutConstrVal( 0, dValue );

    // 1st constraint
    dValue = dX[0]/2.0 + dX[1]/3.0 + dX[2]/4.0 + dX[3]/5.0 - 77.0/60.0;
    m_pOptimizer->PutConstrVal( 1, dValue );

    // 2nd constraint
    dValue = dX[0]/3.0 + dX[1]/4.0 + dX[2]/5.0 + dX[3]/6.0 - 19.0/20.0;
    m_pOptimizer->PutConstrVal( 2, dValue );

    // 3rd constraint
    dValue = dX[0]/4.0 + dX[1]/5.0 + dX[2]/6.0 + dX[3]/7.0
           - 319.0/420.0;
    m_pOptimizer->PutConstrVal( 3, dValue );

    // objective
    dValue = 25.0/12.0 * dX[0] + 77.0/60.0 * dX[1]
           + 19.0/20.0 * dX[2] + 319.0/420.0 * dX[3];
    m_pOptimizer->PutObjVal( dValue );


    return EXIT_SUCCESS;
}
    // ... (the listing continues; intermediate lines are omitted in the source) ...
    {
        m_pOptimizer->PutDerivConstr( 3, 0, 1.0/4.0 );
        m_pOptimizer->PutDerivConstr( 3, 1, 1.0/5.0 );
        m_pOptimizer->PutDerivConstr( 3, 2, 1.0/6.0 );
        m_pOptimizer->PutDerivConstr( 3, 3, 1.0/7.0 );
    }

    return EXIT_SUCCESS;
}

int Example14::SolveOptProblem()
{
    int iError;

    iError = 0;

    // If SqpWrapper has been chosen as optimizer,
    // the number of parallel systems is set.
    // (The implementation of FuncEval allows only one system.)
    // If any other optimizer has been chosen,
    // this instruction will not do anything.
    m_pOptimizer->Put( "NumParallelSys", 1 );

    // If NlpqlgWrapper has been chosen as optimizer,
    // the number of expected local minimizers is set.
    // If any other optimizer has been chosen,
    // this instruction will not do anything.
    m_pOptimizer->Put( "MaxNumLocMin", 2 );

    // If NlpqlbWrapper has been chosen as optimizer,
    // the number of constraints to be taken into the subproblem is set.
    // If any other optimizer has been chosen,
    // this instruction will not do anything.
    m_pOptimizer->Put( "NumConstrSubProb", 2 );

    // Define the optimization problem
    m_pOptimizer->PutNumDv( 4 );
    m_pOptimizer->PutNumIneqConstr( 4 );
    m_pOptimizer->PutNumEqConstr( 0 );

    double LB[4] = { 0.0, 0.0, 0.0, 0.0 };
    double UB[4] = { 10E6, 10E6, 10E6, 10E6 };
    double IG[4] = { 0.0, 0.0, 0.0, 0.0 };

    m_pOptimizer->PutUpperBound( UB );
    m_pOptimizer->PutLowerBound( LB );
    m_pOptimizer->PutInitialGuess( IG );

    // Start the optimization
    iError = m_pOptimizer->DefaultLoop( this );

    // if an error occurred, report it ...
    if ( iError != EXIT_SUCCESS )
    {
        cout << "Error " << iError << "\n";
        return iError;
    }

    // ... else report the results:
    const double * dX;

    m_pOptimizer->GetDesignVarVec( dX );

    int iNumDesignVar;

    m_pOptimizer->GetNumDv( iNumDesignVar );

    for ( int iDvIdx = 0; iDvIdx < iNumDesignVar; iDvIdx++ )
    {
        cout << "dX[" << iDvIdx << "] = " << dX[ iDvIdx ] << endl;
    }
    return EXIT_SUCCESS;
}
120
Part II.
Multicriteria Optimization
11. Theory - Multiple Objective Optimization
11.1. Introduction
The multiple objective optimization problem is stated as follows:

min f(x) = (f1(x), ..., fk(x))
subject to g(x) ≥ 0, h(x) = 0, xl ≤ x ≤ xu, x ∈ Rn          (MOO)

In other words, we wish to determine, from among the set of all points which satisfy the
inequality constraints g(x) ≥ 0 and the equality constraints h(x) = 0, that particular point
x∗ ∈ Rn with xl ≤ x∗ ≤ xu which yields the optimum values of all objective functions f1, ..., fk.
Here 'optimize' does not simply mean to find the minimum or maximum of the objective
function, as it does for a single-criterion optimization problem. It means to find a 'good'
solution considering all the objective functions simultaneously. Of course, we first need to
know how to designate a particular solution as 'good' or 'bad'. This is the major question
which arises when solving any multicriteria optimization problem, and it will be discussed here.
Let
X := {x ∈ Rn : g(x) ≥ 0, h(x) = 0, xl ≤ x ≤ xu}
denote the set of all feasible solutions. Then problem (MOO) can be simplified to
min f(x),  x ∈ X.
If there exists x∗ ∈ X which is optimal for all objective functions simultaneously, that is

fi(x∗) ≤ fi(x)  for all x ∈ X and all i = 1, ..., k,

then x∗ is certainly a desirable solution. Unfortunately, this is a utopian situation which rarely
occurs, since it is unlikely that all fi(x) attain their minimum values at a common point x∗.
Thus we are almost always faced with the question: What solution should we adopt; that is,
how should an 'optimal' solution be defined? First consider the so-called ideal solution. In
order to define this solution we have to find the separately attainable minima of all the
objective functions.
Assuming it exists, let xi∗ be the solution of the i'th scalar optimization problem

fi(xi∗) = min fi(x),  x ∈ X.

Then fi∗ := fi(xi∗) is called the individual minimum of the scalar problem i, the vector
f∗ = (f1∗, ..., fk∗) is called the ideal vector of the multiple objective optimization problem, and
the points in X which determine this vector constitute the ideal solution.
It is usually not true that

xi∗ = xj∗  ∀i, j = 1, ..., k

holds, which would be convenient, since then we would have solved the multiple objective
optimization problem just by considering a sequence of scalar ones. Thus we need to define a
new form of optimality. This leads to the concept of Pareto optimality, which was introduced
by V. Pareto in 1896 and is still the most important concept of multicriteria analysis: a point
x∗ ∈ X is called Pareto optimal (functional-efficient) if there is no x ∈ X with fi(x) ≤ fi(x∗)
for all i = 1, ..., k and fj(x) < fj(x∗) for at least one j.
This definition is based on the intuitive conviction that the point x∗ ∈ X is chosen as optimal
if no criterion can be improved without worsening at least one other criterion. Unfortunately,
the Pareto optimum almost always yields not a single solution but a set of solutions, called
the functional-efficient boundary (Pareto set) of (MOO).
For example, consider a feasible set described by the constraints

12 − x1 − x2 ≥ 0
−x1^2 + 10x1 − x2^2 + 16x2 − 80 ≥ 0
Chapter 11 might suggest that solving a multiple objective problem is all about finding Pareto
optimal points in design space and then deciding, by some selection, which point to choose as
the optimal one. There are different ways to compute Pareto optimal points. One class of
methods consists in transforming the multiple objective problem into one (or a sequence of)
scalar optimization problem(s), which is the content of the following chapters. Each method
described here requires some information about the preferences among the criteria. Since the
engineer may face different decision-making problems, and since different information about
preferences is available, it is difficult to indicate which method should be recommended for a
given problem.
12. The Method of Weighted Objectives
12.1. The Algorithm
The weighted objectives method has received the most attention, and particular models within
this method have been widely applied. The basic idea of the method is to add all the objective
functions together using a different weighting coefficient for each. We transform the MOO into
a scalar problem by creating one function of the form

f(x) = ω1 f1(x) + ω2 f2(x) + ... + ωk fk(x)                         (12.1)

where the ωi ≥ 0 with ω1 + ... + ωk = 1 are the weighting coefficients, which reflect the
preference for the individual objective functions. Since the numerical values of the different
objectives may vary considerably, we should scale the formulation to

f(x) = ω1 f1(x) c1 + ω2 f2(x) c2 + ... + ωk fk(x) ck                (12.2)

where the ci are scaling factors.
12.2. Program Documentation
where pdXu and pdXl must point to arrays of length equal to the number of design
variables.
5. Check status of the object using GetStatus( int & iStatus) const
• if iStatus equals SolvedId(): The final accuracy has been achieved, the problem is
solved.
• if iStatus is > 0: An error occurred during the solution process.
• if iStatus equals EvalFuncId(): MooWeightedObjectives needs new function values
→ 6b New function values.
After passing these values to the MooWeightedObjectives object go to 4.
• if iStatus equals EvalGradId():
MooWeightedObjectives needs new values for gradients → 6c Providing new gradient
values.
After passing these values to the MooWeightedObjectives object go to 4.
7. Output
• GetConstrValVec( const double * & pdPointer ) const
Returns a pointer to an array containing the values of all constraint functions at the
last solution vector.
• GetConstrVal( const int iConstrIdx, double & dConstrVal ) const
Returns the value of the iConstrIdx’th constraint function at the last solution vector.
• GetDerivConstr( const int iConstrIdx , const int iVariableIdx,
double & dDerivativeValue ) const
Returns the value of the derivative of the iConstrIdx’th constraint with respect to
the iVariableIdx’th design variable at the last solution vector.
• GetDerivObj( const int iVariableIdx, double & dDerivativeValue,
const int iFunIdx ) const
returns the value of the derivative of the iFunIdx’th objective with respect to the
iVariableIdx’th design variable at the last solution vector.
• GetObjVal( double & dObjVal, const int iFunIdx ) const
returns the value of the iFunIdx’th objective function at the last solution vector.
• GetDesignVar( const int iVariableIdx, double & dValue, const int iParSysIdx
) const
with iParSysIdx = 0 (default) returns the value of the iVariableIdx’th design vari-
able in the last solution vector.
• GetDesignVarVec( const double * & pdPointer, const int iParSysIdx) const
with iParSysIdx = 0 (default) returns a pointer to the last solution vector.
13. The Hierarchical Optimization Method
The hierarchical optimization method was suggested by Walz [164] and considers the situation
where the objectives can be ordered in terms of their importance. Without loss of generality,
let the vector of objectives

f(x) = (f1(x), ..., fk(x))

be ordered with respect to importance, i.e. let f1(x) be the most and fk(x) the least important
objective function, as specified by the user. We now minimize each objective separately, adding
at each step a new constraint which limits the allowed increase or decrease of the previously
considered functions.
Find x(1) such that

f1(x(1)) = min f1(x),  x ∈ X                    (13.1)

Then, for i = 2, ..., k, find x(i) such that

fi(x(i)) = min fi(x),  x ∈ X                    (13.2)

subject to additional constraints which bound, via the parameters εj, the allowed change of
the previously treated objectives f1, ..., fi−1.
Note that the solutions of the k scalar problems are all functional-efficient, but only the last
(k'th) one leads to optimality. Assuming that in each step the parameter εi equals zero yields
the variation suggested by Ben-Tal [4].
2. Generate a scalar optimizer (e.g. SqpWrapper ) object and attach it to the multiple
objective optimizer using AttachScalarOptimizer()
a) Put the number of objective functions to the object:
Put( const char * pcParam , const int * piNumObjFuns ) ;
or
Put( const char * pcParam , const int iNumObjFuns ) ;
where pcParam = "NumObjFuns".
b) Put the number of design variables to the object:
PutNumDv( int iNumDv ) ;
c) Put the values for other parameters you want to adjust:
• Put( const char * pcParam , const int * piValue ) ;
or
Put( const char * pcParam , const int iValue ) ;
where pcParam is one of the parameters of the scalar optimizer.
• Put( const char * pcParam , const double * pdValue ) ;
or
Put( const char * pcParam , const double dValue ) ;
where pcParam is one of the parameters of the scalar optimizer or
– "Epsilon" Parameter for the additional constrains. pdValue must be a
pointer on a double array of length equal to the number of objective functions
containing the epsilon-vector.
• Put( const char * pcParam , const char * pcValue ) ;
or
Put( const char * pcParam , const char cValue ) ;
with pcParam = "OutputFileName" to set the name of the output file.
• Put( const char * pcParam , const bool * pbValue ) ;
or
Put( const char * pcParam , const bool bValue ) ;
where pcParam is one of the parameters of the scalar optimizer.
d) Put the number of inequality constraint functions to the object:
PutNumIneqConstr( int iNumIneqConstr ) ;
e) Put the number of equality constraint functions to the object:
PutNumEqConstr( int iNumEqConstr ) ;
13.2. Program Documentation
5. Check status of the object using GetStatus( int & iStatus) const
• if iStatus equals SolvedId(): The final accuracy has been achieved, the problem is
solved.
• if iStatus is > 0: An error occurred during the solution process.
• if iStatus equals EvalFuncId(): MooHierarchOptMethod needs new function values
→ 6b New function values.
After passing these values to the MooHierarchOptMethod object go to 4.
• if iStatus equals EvalGradId():
MooHierarchOptMethod needs new values for gradients → 6c Providing new gradient
values.
After passing these values to the MooHierarchOptMethod object go to 4.
7. Output
• GetConstrValVec( const double * & pdPointer ) const
Returns a pointer to an array containing the values of all constraint functions at the
last solution vector.
• GetConstrVal( const int iConstrIdx, double & dConstrVal ) const
Returns the value of the iConstrIdx’th constraint function at the last solution vector.
• GetDerivConstr( const int iConstrIdx , const int iVariableIdx,
double & dDerivativeValue ) const
Returns the value of the derivative of the iConstrIdx’th constraint with respect to
the iVariableIdx’th design variable at the last solution vector.
• GetDerivObj( const int iVariableIdx, double & dDerivativeValue,
const int iFunIdx ) const
returns the value of the derivative of the iFunIdx’th objective with respect to the
iVariableIdx’th design variable at the last solution vector.
• GetObjVal( double & dObjVal, const int iFunIdx ) const
returns the value of the iFunIdx’th objective function at the last solution vector.
• GetDesignVar( const int iVariableIdx, double & dValue, const int iParSysIdx
) const
with iParSysIdx = 0 (default) returns the value of the iVariableIdx’th design vari-
able in the last solution vector.
• GetDesignVarVec( const double * & pdPointer, const int iParSysIdx) const
with iParSysIdx = 0 (default) returns a pointer to the last solution vector.
14. The Trade-Off Method
Algorithm 10 Trade-Off
For a given l ∈ {1, ..., k} do:
(1) Find the minimum of the l'th objective function, i.e. find x∗ such that

fl(x∗) = min fl(x),  x ∈ X

subject to the additional constraints

fi(x) ≤ yi,  ∀i = 1, ..., k, i ≠ l

where the yi are assumed values of the objective functions we do not wish to exceed.
(2) Repeat (1) for different values of yi. The information derived from a well-chosen set
of yi can be useful for making the decision. The search is stopped when the user finds a
satisfactory solution.

It may be necessary to repeat Algorithm 10 for different indices l ∈ {1, ..., k}. In order to
obtain a reasonable choice of yi, it is useful to minimize each objective function separately:
let fi∗, i = 1, ..., k be the optimal values of the scalar problems, and then take the following
constraints into account:

fi(x) ≤ fi∗ + ∆fi,  ∀i = 1, ..., k, i ≠ l

where the ∆fi are given values of function increments. A detailed theory of the trade-off
method and its variants can be found in [63], [99] and [96].
2. Generate a scalar optimizer (e.g. SqpWrapper ) object and attach it to the multiple
objective optimizer using AttachScalarOptimizer()
a) Put the number of objective functions to the object:
Put( const char * pcParam , const int * piNumObjFuns ) ;
or
Put( const char * pcParam , const int iNumObjFuns ) ;
where pcParam = "NumObjFuns".
b) Put the number of design variables to the object:
PutNumDv( int iNumDv ) ;
c) Put the values for the parameters you want to adjust:
• Put( const char * pcParam , const int * piValue ) ;
or
Put( const char * pcParam , const int iValue ) ;
where pcParam is one of the parameters of the scalar optimizer or:
– "IdxOptFun Put the index of the objective used as objective in the scalar
problem. The default is 0.
• Put( const char * pcParam , const double * pdValue ) ;
or
Put( const char * pcParam , const double dValue ) ;
where pcParam is one of the parameters of the scalar optimizer or
– "Bounds" Put the bounds for the objectives used as constraints. pdValue
must be a pointer on a double array whose length equals the number of
objective functions. If no bounds are set, the default is to compute the ideals
and add BoundFactor times their absolute values ( default: BoundFactor =
0.5 ). (This means, that for the one used as objective in the scalar problem a
bound has to be put, too. This bound will be ignored in the scalar problem.)
– "BoundFactor" Put the factor used to compute the bounds from the ideals.
Only relevant when parameter Bounds is not set.
• Put( const char * pcParam , const char * pcValue ) ;
or
Put( const char * pcParam , const char cValue ) ;
with pcParam = "OutputFileName" to set the name of the output file.
• Put( const char * pcParam , const bool * pbValue ) ;
or
Put( const char * pcParam , const bool bValue ) ;
where pcParam is one of the parameters of the scalar optimizer or:
– "ComputeIdeals" Determine whether the ideals shall be computed before
the multiple objective optimization starts.
Note: If no bounds are set, the ideals will always be computed, regardless of
this parameter.
d) Put the number of inequality constraint functions to the object:
PutNumIneqConstr( int iNumIneqConstr ) ;
14.2. Program Documentation
5. Check status of the object using GetStatus( int & iStatus) const
• if iStatus equals SolvedId(): The final accuracy has been achieved, the problem is
solved.
• if iStatus is > 0: An error occurred during the solution process.
• if iStatus equals EvalFuncId(): MooTradeOff needs new function values → 6b New
function values.
After passing these values to the MooTradeOff object go to 4.
• if iStatus equals EvalGradId():
MooTradeOff needs new values for gradients → 6c Providing new gradient values.
After passing these values to the MooTradeOff object go to 4.
7. Output
• GetConstrValVec( const double * & pdPointer ) const
Returns a pointer to an array containing the values of all constraint functions at the
last solution vector.
• GetConstrVal( const int iConstrIdx, double & dConstrVal ) const
Returns the value of the iConstrIdx’th constraint function at the last solution vector.
• GetDerivConstr( const int iConstrIdx , const int iVariableIdx,
double & dDerivativeValue ) const
Returns the value of the derivative of the iConstrIdx’th constraint with respect to
the iVariableIdx’th design variable at the last solution vector.
• GetDerivObj( const int iVariableIdx, double & dDerivativeValue,
const int iFunIdx ) const
returns the value of the derivative of the iFunIdx’th objective with respect to the
iVariableIdx’th design variable at the last solution vector.
• GetObjVal( double & dObjVal, const int iFunIdx ) const
returns the value of the iFunIdx’th objective function at the last solution vector.
• GetDesignVar( const int iVariableIdx, double & dValue, const int iParSysIdx
) const
with iParSysIdx = 0 (default) returns the value of the iVariableIdx’th design vari-
able in the last solution vector.
• GetDesignVarVec( const double * & pdPointer, const int iParSysIdx) const
with iParSysIdx = 0 (default) returns a pointer to the last solution vector.
15. Method of Distance Functions
f^p(x) = ( |f1(x) − y1|^p + ... + |fk(x) − yk|^p )^(1/p) ,   1 ≤ p ≤ ∞

p = 1:  f^1(x) = |f1(x) − y1| + ... + |fk(x) − yk|                  (Goal Programming)

p = 2:  f^2(x) = ( |f1(x) − y1|^2 + ... + |fk(x) − yk|^2 )^(1/2)    (Euclidean Norm)
2. Generate a scalar optimizer (e.g. SqpWrapper ) object and attach it to the multiple
objective optimizer using AttachScalarOptimizer()
a) Put the number of objective functions to the object:
Put( const char * pcParam , const int * piNumObjFuns ) ;
or
Put( const char * pcParam , const int iNumObjFuns ) ;
where pcParam = "NumObjFuns".
b) Put the number of design variables to the object:
PutNumDv( int iNumDv ) ;
c) Put the values for the parameters you want to adjust:
15.2. Program Documentation
where pdXu and pdXl must point to arrays of length equal to the number of design
variables.
5. Check status of the object using GetStatus( int & iStatus) const
• if iStatus equals SolvedId(): The final accuracy has been achieved, the problem is
solved.
• if iStatus is > 0: An error occurred during the solution process.
• if iStatus equals EvalFuncId(): MooDistFunc needs new function values → 6b New
function values.
After passing these values to the MooDistFunc object go to 4.
• if iStatus equals EvalGradId():
MooDistFunc needs new values for gradients → 6c Providing new gradient values.
After passing these values to the MooDistFunc object go to 4.
returns the value of the iVarIdx’th design variable of the iVectorIdx’th design
variable vector.
iii. IsObjFunInSubProb( const int iFunIdx ) const
Returns true, if the iFunIdx’th objective function (or its gradient) is in the
current subproblem and thus has to be evaluated.
Returns false, if the function (or its gradient) is not in the current subproblem
and thus does not have to be evaluated.
b) Providing new function values
Function values for the objectives as well as for constraints have to be calculated for
the design variable vectors provided by MooDistFunc.
Note: When iStatus equals EvalGradId() for the first time, the functions need to
be evaluated only for the 0'th parallel system. (This is only relevant if SqpWrapper
is used as the scalar optimizer.)
For access to the design variable vectors see 6(a)ii. After calculating the values, these
must be passed to the MooDistFunc object using:
• PutConstrValVec( const double * const pdConstrVals,
const int iParSysIdx ),
for iParSysIdx=0,...,NumberOfParallelSystems - 1
where pdConstrVals points to an array containing the values of each con-
straint at the iParSysIdx'th design variable vector provided by MooDist-
Func.
• PutObjVal( const double dObjVal, const int iObjFunIdx,
const int iParSysIdx ),
for iParSysIdx = 0,...,NumberOfParallelSystems - 1
and iObjFunIdx = 0,...,NumberOfObjectiveFunctions - 1
where dObjVal defines the value of the iObjFunIdx'th objective function at the
iParSysIdx'th design variable vector provided by MooDistFunc.
Alternatively you can employ
PutConstrVal( const int iConstrIdx, const double dConstrValue,
const int iParSysIdx ),
for iParSysIdx = 0,...,NumberOfParallelSystems - 1
and iConstrIdx = 0,...,NumberOfConstraints - 1
c) Providing new gradient values
Gradients must be calculated for all objectives which are used in the current sub-
problem (see 6(a)iii) and the constraints at the current design variable vector. For
access to the design variable vector see 6(a)ii.
For the gradients of the constraints you can use
PutDerivConstr( const int iConstrIdx, const int iVariableIdx,
const double dDerivativeValue )
for iVariableIdx = 0,...,NumberOfVariables - 1
and iConstrIdx = 0,...,NumberOfConstraints - 1
where dDerivativeValue is defined as the derivative of the iConstrIdx’th con-
straint with respect to the iVariableIdx’th variable.
7. Output
• GetConstrValVec( const double * & pdPointer ) const
Returns a pointer to an array containing the values of all constraint functions at the
last solution vector.
• GetConstrVal( const int iConstrIdx, double & dConstrVal ) const
Returns the value of the iConstrIdx’th constraint function at the last solution vector.
• GetDerivConstr( const int iConstrIdx , const int iVariableIdx,
double & dDerivativeValue ) const
Returns the value of the derivative of the iConstrIdx’th constraint with respect to
the iVariableIdx’th design variable at the last solution vector.
• GetDerivObj( const int iVariableIdx, double & dDerivativeValue,
const int iFunIdx ) const
returns the value of the derivative of the iFunIdx’th objective with respect to the
iVariableIdx’th design variable at the last solution vector.
• GetObjVal( double & dObjVal, const int iFunIdx ) const
returns the value of the iFunIdx’th objective function at the last solution vector.
• GetDesignVar( const int iVariableIdx, double & dValue, const int iParSysIdx
) const
with iParSysIdx = 0 (default) returns the value of the iVariableIdx’th design vari-
able in the last solution vector.
• GetDesignVarVec( const double * & pdPointer, const int iParSysIdx) const
with iParSysIdx = 0 (default) returns a pointer to the last solution vector.
16. Global Criterion Method and the Min-Max
Optimum
min  max  ωi (fi(x) − fi∗)/fi∗ ,   fi∗ ≥ 0
x∈X i=1,...,k

Introducing an auxiliary variable z, this min-max problem can be rewritten as

min z ,   x ∈ X, z ∈ R,
subject to ωi (fi(x) − fi∗)/fi∗ ≤ z ,  i = 1, ..., k.

One can show that this problem is equivalent to the min-max problem. However, there is one
more optimization parameter z and there are k additional constraints.
2. Generate a scalar optimizer (e.g. SqpWrapper ) object and attach it to the multiple
objective optimizer using AttachScalarOptimizer()
a) Put the number of objective functions to the object:
Put( const char * pcParam , const int * piNumObjFuns ) ;
or
Put( const char * pcParam , const int iNumObjFuns ) ;
where pcParam = "NumObjFuns".
b) Put the number of design variables to the object:
PutNumDv( int iNumDv ) ;
c) Put the values for the parameters you want to adjust:
• Put( const char * pcParam , const int * piValue ) ;
or
Put( const char * pcParam , const int iValue ) ;
where pcParam is one of the parameters of the scalar optimizer or:
– "Order Put the order p of the method. The default is 1.
• Put( const char * pcParam , const double * pdValue ) ;
or
Put( const char * pcParam , const double dValue ) ;
where pcParam is one of the parameters of the scalar optimizer.
• Put( const char * pcParam , const char * pcValue ) ;
or
Put( const char * pcParam , const char cValue ) ;
with pcParam = "OutputFileName" to set the name of the output file.
• Put( const char * pcParam , const bool * pbValue ) ;
or
Put( const char * pcParam , const bool bValue ) ;
where pcParam is one of the parameters of the scalar optimizer.
d) Put the number of inequality constraint functions to the object:
PutNumIneqConstr( int iNumIneqConstr ) ;
e) Put the number of equality constraint functions to the object:
PutNumEqConstr( int iNumEqConstr ) ;
16.2. Program Documentation: MooGlobalCriterion
5. Check status of the object using GetStatus( int & iStatus) const
• if iStatus equals SolvedId(): The final accuracy has been achieved, the problem is
solved.
• if iStatus is > 0: An error occurred during the solution process.
• if iStatus equals EvalFuncId(): MooGlobalCriterion needs new function values →
6b New function values.
After passing these values to the MooGlobalCriterion object go to 4.
• if iStatus equals EvalGradId():
MooGlobalCriterion needs new values for gradients → 6c Providing new gradient
values.
After passing these values to the MooGlobalCriterion object go to 4.
7. Output
• GetConstrValVec( const double * & pdPointer ) const
Returns a pointer to an array containing the values of all constraint functions at the
last solution vector.
• GetConstrVal( const int iConstrIdx, double & dConstrVal ) const
Returns the value of the iConstrIdx’th constraint function at the last solution vector.
• GetDerivConstr( const int iConstrIdx , const int iVariableIdx,
double & dDerivativeValue ) const
Returns the value of the derivative of the iConstrIdx’th constraint with respect to
the iVariableIdx’th design variable at the last solution vector.
• GetDerivObj( const int iVariableIdx, double & dDerivativeValue,
const int iFunIdx ) const
returns the value of the derivative of the iFunIdx’th objective with respect to the
iVariableIdx’th design variable at the last solution vector.
• GetObjVal( double & dObjVal, const int iFunIdx ) const
returns the value of the iFunIdx’th objective function at the last solution vector.
• GetDesignVar( const int iVariableIdx, double & dValue, const int iParSysIdx
) const
with iParSysIdx = 0 (default) returns the value of the iVariableIdx’th design vari-
able in the last solution vector.
• GetDesignVarVec( const double * & pdPointer, const int iParSysIdx) const
with iParSysIdx = 0 (default) returns a pointer to the last solution vector.
2. Generate a scalar optimizer (e.g. SqpWrapper ) object and attach it to the multiple
objective optimizer using AttachScalarOptimizer()
a) Put the number of objective functions to the object:
Put( const char * pcParam , const int * piNumObjFuns ) ;
or
Put( const char * pcParam , const int iNumObjFuns ) ;
where pcParam = "NumObjFuns".
b) Put the number of design variables to the object:
PutNumDv( int iNumDv ) ;
c) Put the values for the parameters you want to adjust:
• Put( const char * pcParam , const int * piValue ) ;
or
Put( const char * pcParam , const int iValue ) ;
where pcParam is one of the parameters of the scalar optimizer.
• Put( const char * pcParam , const double * pdValue ) ;
or
Put( const char * pcParam , const double dValue ) ;
where pcParam is one of the parameters of the scalar optimizer or:
– "Weights" Put the weights for the weighted maximum. pdValue must be a
pointer on a double array of length equal to the number of objective functions
containing the weights.
• Put( const char * pcParam , const char * pcValue ) ;
or
Put( const char * pcParam , const char cValue ) ;
with pcParam = "OutputFileName" to set the name of the output file.
• Put( const char * pcParam , const bool * pbValue ) ;
or
Put( const char * pcParam , const bool bValue ) ;
where pcParam is one of the parameters of the scalar optimizer.
d) Put the number of inequality constraint functions to the object:
PutNumIneqConstr( int iNumIneqConstr ) ;
e) Put the number of equality constraint functions to the object:
PutNumEqConstr( int iNumEqConstr ) ;
16.3. Program Documentation: MooMinMaxOpt
5. Check status of the object using GetStatus( int & iStatus) const
• if iStatus equals SolvedId(): The final accuracy has been achieved, the problem is
solved.
• if iStatus is > 0: An error occurred during the solution process.
• if iStatus equals EvalFuncId(): MooMinMaxOpt needs new function values → 6b
New function values.
After passing these values to the MooMinMaxOpt object go to 4.
• if iStatus equals EvalGradId():
MooMinMaxOpt needs new values for gradients → 6c Providing new gradient values.
After passing these values to the MooMinMaxOpt object go to 4.
7. Output
• GetConstrValVec( const double * & pdPointer ) const
Returns a pointer to an array containing the values of all constraint functions at the
last solution vector.
• GetConstrVal( const int iConstrIdx, double & dConstrVal ) const
Returns the value of the iConstrIdx’th constraint function at the last solution vector.
• GetDerivConstr( const int iConstrIdx , const int iVariableIdx,
double & dDerivativeValue ) const
Returns the value of the derivative of the iConstrIdx’th constraint with respect to
the iVariableIdx’th design variable at the last solution vector.
• GetDerivObj( const int iVariableIdx, double & dDerivativeValue,
const int iFunIdx ) const
returns the value of the derivative of the iFunIdx’th objective with respect to the
iVariableIdx’th design variable at the last solution vector.
• GetObjVal( double & dObjVal, const int iFunIdx ) const
returns the value of the iFunIdx’th objective function at the last solution vector.
• GetDesignVar( const int iVariableIdx, double & dValue, const int iParSysIdx
) const
with iParSysIdx = 0 (default) returns the value of the iVariableIdx’th design vari-
able in the last solution vector.
• GetDesignVarVec( const double * & pdPointer, const int iParSysIdx) const
with iParSysIdx = 0 (default) returns a pointer to the last solution vector.
17. Weighted Tchebycheff Method
with the ideal objective vector f∗ = (f1∗, ..., fk∗)T and appropriate weights ωi, i = 1, ..., k. A
particularly good choice is ωi = 1/fi∗. This problem is nondifferentiable. However, it can be
solved in a differentiable form as long as the objective and the constraint functions are
differentiable and f∗ is known globally. In this case the | · |-operator can be removed and the
following differentiable equivalent problem is solved:

min z ,   x ∈ X, z ∈ R,
subject to ωi (fi(x) − fi∗) ≤ z ,  i = 1, ..., k.
2. Generate a scalar optimizer (e.g. SqpWrapper ) object and attach it to the multiple
objective optimizer using AttachScalarOptimizer()
a) Put the number of objective functions to the object:
Put( const char * pcParam , const int * piNumObjFuns ) ;
or
Put( const char * pcParam , const int iNumObjFuns ) ;
where pcParam = "NumObjFuns".
b) Put the number of design variables to the object:
PutNumDv( int iNumDv ) ;
c) Put the values for the parameters you want to adjust:
17.2. Program Documentation
5. Check status of the object using GetStatus( int & iStatus) const
• if iStatus equals SolvedId(): The final accuracy has been achieved, the problem is
solved.
• if iStatus is > 0: An error occurred during the solution process.
• if iStatus equals EvalFuncId(): MooWeightedTchebycheff needs new function val-
ues → 6b New function values.
After passing these values to the MooWeightedTchebycheff object go to 4.
• if iStatus equals EvalGradId():
MooWeightedTchebycheff needs new values for gradients → 6c Providing new gradi-
ent values.
After passing these values to the MooWeightedTchebycheff object go to 4.
7. Output
• GetConstrValVec( const double * & pdPointer ) const
Returns a pointer to an array containing the values of all constraint functions at the
last solution vector.
• GetConstrVal( const int iConstrIdx, double & dConstrVal ) const
Returns the value of the iConstrIdx’th constraint function at the last solution vector.
• GetDerivConstr( const int iConstrIdx , const int iVariableIdx,
double & dDerivativeValue ) const
Returns the value of the derivative of the iConstrIdx’th constraint with respect to
the iVariableIdx’th design variable at the last solution vector.
• GetDerivObj( const int iVariableIdx, double & dDerivativeValue,
const int iFunIdx ) const
returns the value of the derivative of the iFunIdx’th objective with respect to the
iVariableIdx’th design variable at the last solution vector.
• GetObjVal( double & dObjVal, const int iFunIdx ) const
returns the value of the iFunIdx’th objective function at the last solution vector.
• GetDesignVar( const int iVariableIdx, double & dValue, const int iParSysIdx
) const
with iParSysIdx = 0 (default) returns the value of the iVariableIdx’th design vari-
able in the last solution vector.
• GetDesignVarVec( const double * & pdPointer, const int iParSysIdx) const
with iParSysIdx = 0 (default) returns a pointer to the last solution vector.
18. STEP Method
The STEP Method contains elements similar to the Tchebycheff method described in Chapter 17,
but is based on a different idea. STEP is a so-called interactive method for MOO problems. It
can be regarded as a method that aspires to find satisfactory solutions.
It is assumed in STEP that at a certain Pareto optimal objective vector the decision maker
can indicate both objective functions that have acceptable values and those whose values are too
high. The latter can be said to be unacceptable. The decision maker is now assumed to allow
the values of some acceptable objective functions to increase so that the unacceptable functions
can have lower values. In other words, the decision maker must give up a little in the value(s)
of some objective functions fi , i ∈ I a in order to improve the values of some other objective
functions fi , i ∈ I ua such that I a ∪ I ua = {1, ..., k}.
STEP uses the weighted Tchebycheff problem (see 17) to generate new trial solutions. The
ideal objective vector f ∗ is used as a reference point in the calculations. Information concerning
the ranges of the Pareto optimal set is needed in determining the weighting vector for the metric.
The idea is to make the scales of all the objective functions similar with the help of the weighting
coefficients. The so-called nadir objective vector - which is defined as
where fi∗ is a component of the ideal objective vector and ∆fi > 0 a relatively small but
computationally significant scalar for objective i - is approximated from the payoff table. The
weighting vector is calculated by the formula

    ω_i = e_i / Σ_{j=1}^{k} e_j ,    i = 1, ..., k
The weight is larger for those objective functions that are far from their ideal objective vector
component.
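The normalization above can be sketched in a few lines of C++. This is a standalone illustration, not part of the NLP++ API; in particular, the choice of e_i = f_i^nad − f_i^* as the scale measure is an assumption made only for this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Compute weights w_i = e_i / (e_1 + ... + e_k) from scale measures
// e_i (assumed here: e_i = fNadir_i - fIdeal_i).
std::vector<double> StepWeights( const std::vector<double> & fIdeal,
                                 const std::vector<double> & fNadir )
{
    std::vector<double> w( fIdeal.size() ) ;
    double dSum = 0.0 ;
    for ( std::size_t i = 0 ; i < w.size() ; i++ )
    {
        w[ i ] = fNadir[ i ] - fIdeal[ i ] ;   // e_i
        dSum  += w[ i ] ;
    }
    for ( std::size_t i = 0 ; i < w.size() ; i++ )
    {
        w[ i ] /= dSum ;                       // normalize: weights sum to 1
    }
    return w ;
}
```

An objective with a large gap between its nadir and ideal components thus receives a large weight, as stated above.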
Algorithm 11 STEP
(1) Calculate the ideal and the nadir objective vectors and the weighting constants as given
above. Set k = 1. Solve the weighted Tchebycheff problem with the calculated weights:
where the additional upper bounds for the acceptable objectives are taken into account. Denote
the solution by xk+1 ∈ X and the corresponding objective vector by f(xk+1). Go to step (2).
(4) Stop. The final solution is xk ∈ X.
In the first step the distance between the ideal objective vector and the feasible region is
minimized by the weighted Tchebycheff metric (18.1). The solution obtained is presented to the
decision maker. Then the decision maker is asked to specify those objective function(s) whose
value(s) he/she is willing to relax (i.e. weaken) to decrease the values of some other objective
functions. The decision maker must also specify the amounts of acceptable relaxation.
The feasible region is restricted according to the information of the decision maker and the
weights of the relaxed objective functions are set equal to zero, that is ωi = 0, i ∈ I a . Then a
new distance minimization problem (18.2) is solved. The new constraint set allows the relaxed
(acceptable) objective function values to increase up to the specified level. The procedure
continues until the decision maker does not change any component of the current objective
vector.
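The bookkeeping of one such iteration can be sketched as follows. This is a hypothetical standalone fragment; the names and the unscaled distance form max_i ω_i (f_i − f_i^*) are illustrative only and are not the NLP++ implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Zero the weights of the relaxed (acceptable) objectives in I_a and
// evaluate the weighted Tchebycheff distance to the ideal vector.
double RestrictedTchebycheff( std::vector<double> & w,
                              const std::set<std::size_t> & relaxedIdx,
                              const std::vector<double> & f,
                              const std::vector<double> & fIdeal )
{
    for ( std::size_t i : relaxedIdx )
    {
        w[ i ] = 0.0 ;  // w_i = 0 for i in I_a
    }
    double dDist = 0.0 ;
    for ( std::size_t i = 0 ; i < f.size() ; i++ )
    {
        double dTerm = w[ i ] * ( f[ i ] - fIdeal[ i ] ) ;
        if ( dTerm > dDist ) dDist = dTerm ;
    }
    return dDist ;
}
```

The relaxed objectives drop out of the metric; only their specified upper bounds keep them in check in the restricted feasible region.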
2. Generate a scalar optimizer (e.g. SqpWrapper ) object and attach it to the multiple
objective optimizer using AttachScalarOptimizer()
a) Put the number of objective functions to the object:
Put( const char * pcParam , const int * piNumObjFuns ) ;
or
Put( const char * pcParam , const int iNumObjFuns ) ;
where pcParam = "NumObjFuns".
b) Put the number of design variables to the object:
PutNumDv( int iNumDv ) ;
c) Put the values for the parameters you want to adjust:
• Put( const char * pcParam , const int * piValue ) ;
or
Put( const char * pcParam , const int iValue ) ;
where pcParam is one of the parameters of the scalar optimizer.
• Put( const char * pcParam , const double * pdValue ) ;
or
Put( const char * pcParam , const double dValue ) ;
where pcParam is one of the parameters of the scalar optimizer or:
– "DeltaF" Put the distances for computing the nadir objective vector from
the ideal vector.
• Put( const char * pcParam , const char * pcValue ) ;
or
Put( const char * pcParam , const char cValue ) ;
with pcParam = "OutputFileName" to set the name of the output file.
• Put( const char * pcParam , const bool * pbValue ) ;
or
Put( const char * pcParam , const bool bValue ) ;
where pcParam is one of the parameters of the scalar optimizer.
d) Put the number of inequality constraint functions to the object:
PutNumIneqConstr( int iNumIneqConstr ) ;
e) Put the number of equality constraint functions to the object:
PutNumEqConstr( int iNumEqConstr ) ;
5. Check status of the object using GetStatus( int & iStatus) const
• if iStatus equals SolvedId(): The final accuracy has been achieved, the problem is
solved.
• if iStatus is > 0: An error occurred during the solution process.
• if iStatus equals EvalFuncId(): MooStepMethod needs new function values → 6b
New function values.
After passing these values to the MooStepMethod object go to 4.
• if iStatus equals EvalGradId():
MooStepMethod needs new values for gradients → 6c Providing new gradient values.
After passing these values to the MooStepMethod object go to 4.
7. Output
• GetConstrValVec( const double * & pdPointer ) const
Returns a pointer to an array containing the values of all constraint functions at the
last solution vector.
• GetConstrVal( const int iConstrIdx, double & dConstrVal ) const
Returns the value of the iConstrIdx’th constraint function at the last solution vector.
• GetDerivConstr( const int iConstrIdx , const int iVariableIdx,
double & dDerivativeValue ) const
Returns the value of the derivative of the iConstrIdx’th constraint with respect to
the iVariableIdx’th design variable at the last solution vector.
• GetDerivObj( const int iVariableIdx, double & dDerivativeValue,
const int iFunIdx ) const
returns the value of the derivative of the iFunIdx’th objective with respect to the
iVariableIdx’th design variable at the last solution vector.
• GetObjVal( double & dObjVal, const int iFunIdx ) const
returns the value of the iFunIdx’th objective function at the last solution vector.
• GetDesignVar( const int iVariableIdx, double & dValue, const int iParSysIdx
) const
with iParSysIdx = 0 (default) returns the value of the iVariableIdx’th design vari-
able in the last solution vector.
• GetDesignVarVec( const double * & pdPointer, const int iParSysIdx) const
with iParSysIdx = 0 (default) returns a pointer to the last solution vector.
19. Example
As an illustrative example we solve the following problem using DefaultLoop():
#include "MooHierarchOptMethod.h"
#include "MooTradeOff.h"
#include "MooDistFunc.h"
#include "MooMinMaxOpt.h"
#include "MooGlobalCriterion.h"
#include "MooWeightedTchebycheff.h"
#include "MooStepMethod.h"

using std::cout ;
using std::endl ;
using std::cin ;
using std::setprecision ;

Example38::Example38()
{
    char s ;
    cout << endl << "--- this is Example 38 ---" << endl << endl ;

    // Choose multiple objective optimizer:
    do
    {
        cout << endl << "Which Optimizer do you want to use? \n\n" ;
        cout << "[o] MooWeightedObjectives \n" ;
        cout << "[h] MooHierarchOptMethod \n" ;
        cout << "[t] MooTradeOff \n" ;
        cout << "[d] MooDistFunc \n" ;
        cout << "[p] MooMinMaxOpt \n" ;
        cout << "[a] MooGlobalCriterion \n" ;
        cout << "[w] MooWeightedTchebycheff \n" ;
        cout << "[e] MooStepMethod \n" ;

        cin >> s ;

        if ( s == 'o' )
        {
            m_pOptimizer = new MooWeightedObjectives ;
        }
        else if ( s == 't' )
        {
            m_pOptimizer = new MooTradeOff ;
        }
        else if ( s == 'd' )
        {
            m_pOptimizer = new MooDistFunc ;
        }
        else if ( s == 'h' )
        {
            m_pOptimizer = new MooHierarchOptMethod ;
        }
        else if ( s == 'p' )
        {
            m_pOptimizer = new MooMinMaxOpt ;
        }
        else if ( s == 'a' )
        {
            m_pOptimizer = new MooGlobalCriterion ;
        }
        else if ( s == 'w' )
        {
            m_pOptimizer = new MooWeightedTchebycheff ;
        }
        else if ( s == 'e' )
        {
            m_pOptimizer = new MooStepMethod ;
        }
        else
        {
            cout << "illegal input!" << endl ;
        }
    }
    while ( s != 'o' && s != 't' && s != 'd' && s != 'h' &&
            s != 'p' && s != 'a' && s != 'w' && s != 'e' ) ;

    // Attach a scalar optimizer using AttachScalarOptimizer.
    // The second argument tells the multiple objective optimizer
    // to delete the scalar optimizer after use.
    do
    {
        cout << endl << "Choose a scalar optimizer: \n\n" ;
        cout << "[q] SqpWrapper \n" ;
        cout << "[c] ScpWrapper \n" ;
        cout << "[b] NlpqlbWrapper \n" ;
        cout << "[g] NlpqlgWrapper \n" ;
        cout << "[y] CobylaWrapper \n" ;

        cin >> s ;

        if ( s == 'y' )
        {
            m_pOptimizer->AttachScalarOptimizer( new CobylaWrapper(), true ) ;
        }
        else if ( s == 'q' )
        {
            m_pOptimizer->AttachScalarOptimizer( new SqpWrapper(), true ) ;
        }
        else if ( s == 'c' )
        {
            m_pOptimizer->AttachScalarOptimizer( new ScpWrapper(), true ) ;
        }
        else if ( s == 'b' )
        {
            m_pOptimizer->AttachScalarOptimizer( new NlpqlbWrapper(), true ) ;
        }
        else if ( s == 'g' )
        {
            m_pOptimizer->AttachScalarOptimizer( new NlpqlgWrapper(), true ) ;
        }
        else
        {
            cout << "illegal input!" << endl ;
        }
    }
    while ( s != 'c' && s != 'q' && s != 'g' && s != 'b' && s != 'y' ) ;
}

int Example38::FuncEval( bool bGradApprox )
{
    const double * dX ;
    int iNumParSys ;
    double * pdFuncVals ;
    int iNumConstr ;
    int iNumEqConstr ;
    int iNumIneqConstr ;
    int iNumActConstr ;
    int * piActive ;
    int iCounter ;
    int iNumObjFuns ;
    int iError ;
    bool bActiveSetStrat ;

    piActive = NULL ;
    iNumActConstr = 0 ;
    iCounter = 0 ;

    m_pOptimizer->Get( "NumObjFuns", iNumObjFuns ) ;
    m_pOptimizer->GetNumConstr( iNumConstr ) ;
    m_pOptimizer->GetNumEqConstr( iNumEqConstr ) ;
    m_pOptimizer->GetNumIneqConstr( iNumIneqConstr ) ;

    iError = m_pOptimizer->Get( "ActConstr", piActive ) ;
    // If the optimizer does not support an active set strategy,
    // all inequalities are considered active.
    if ( iError != EXIT_SUCCESS )
    {
        bActiveSetStrat = false ;
        piActive = new int [ iNumIneqConstr ] ;
        for ( int iConstrIdx = 0 ; iConstrIdx < iNumIneqConstr ; iConstrIdx++ )
        {
            piActive[ iConstrIdx ] = 1 ;
        }
    }
    else
    {
        bActiveSetStrat = true ;
    }

    for ( int iConstrIdx = 0 ; iConstrIdx < iNumIneqConstr ; iConstrIdx++ )
    {
        if ( piActive[ iConstrIdx ] != 0 )
        {
            iNumActConstr++ ;
        }
    }
    // When used for gradient approximation, FuncEval does not
    // evaluate inactive inequality constraints.
    if ( bGradApprox == true )
    {
        pdFuncVals = new double [ iNumEqConstr + iNumObjFuns
                                  + iNumActConstr ] ;
    }
    else
    {
        pdFuncVals = new double [ iNumConstr + iNumObjFuns ] ;
    }

    // Gradients have to be evaluated only at
    // one design variable vector.
    if ( bGradApprox == true )
    {
        iNumParSys = 1 ;
    }
    // Functions may have to be evaluated at more than one
    // design variable vector when the scalar optimizer
    // is SqpWrapper.
    else
    {
        m_pOptimizer->Get( "NumParallelSys", iNumParSys ) ;
    }

    // A for-loop simulates the parallel evaluation in
    // iNumParSys parallel systems.
    for ( int iSysIdx = 0 ; iSysIdx < iNumParSys ; iSysIdx++ )
    {
        if ( bGradApprox == false )
        {
            m_pOptimizer->GetDesignVarVec( dX, iSysIdx ) ;
        }
        else
        {
            m_pApprox->GetDesignVarVec( dX ) ;
        }

        // Evaluate the inequality constraints (when approximating
        // gradients only the active ones).

        iCounter = iNumEqConstr ;

        // 0th constraint
        if ( bGradApprox == false || piActive[ 0 ] != 0 )
        {
            pdFuncVals[ iCounter ] = 12.0 - dX[ 0 ] - dX[ 1 ] - dX[ 2 ] ;
            iCounter++ ;
        }
        // 1st constraint
        if ( bGradApprox == false || piActive[ 1 ] != 0 )
        {
            pdFuncVals[ iCounter ] = - dX[ 0 ] * dX[ 0 ] + 10.0 * dX[ 0 ]
                                     - dX[ 1 ] * dX[ 1 ] + 16.0 * dX[ 1 ]
                                     - dX[ 2 ] * dX[ 2 ] + 22.0 * dX[ 2 ] - 80.0 ;
            iCounter++ ;
        }

        // Evaluate the objectives

        // 0th objective
        pdFuncVals[ iCounter ] = dX[ 0 ] + dX[ 1 ] * dX[ 1 ] + dX[ 2 ] ;
        // 1st objective
        pdFuncVals[ iCounter + 1 ] = dX[ 0 ] * dX[ 0 ] + dX[ 1 ] + dX[ 2 ] ;
        // 2nd objective
        pdFuncVals[ iCounter + 2 ] = dX[ 0 ] + dX[ 1 ] + dX[ 2 ] * dX[ 2 ] ;

        // Return the values

        if ( bGradApprox == false )
        {
            for ( int iFunIdx = iCounter ; iFunIdx < iCounter + iNumObjFuns ;
                  iFunIdx++ )
            {
                m_pOptimizer->PutObjVal( pdFuncVals[ iFunIdx ],
                                         iFunIdx - iCounter, iSysIdx ) ;
            }
            m_pOptimizer->PutConstrValVec( pdFuncVals, iSysIdx ) ;
        }
        else
        {
            m_pApprox->PutFuncVals( pdFuncVals ) ;
        }
    }

    delete [] pdFuncVals ;
    pdFuncVals = NULL ;

    // If there is no active set strategy, piActive has been
    // allocated by this function and thus must be deleted now.
    if ( bActiveSetStrat == false )
    {
        delete [] piActive ;
        piActive = NULL ;
    }

    return EXIT_SUCCESS ;
}

int Example38::GradEval()
{
    /* If we were allowing only CobylaWrapper as a scalar optimizer, this
       function could be implemented as a return command only.
       As CobylaWrapper is a gradient free algorithm this function
       would never be called. All other optimizers in this part
       of the manual offer an active set strategy which is used
       in the following implementation. */
    const double * dX ;

    m_pOptimizer->GetDesignVarVec( dX ) ;

    m_pOptimizer->PutDerivObj( 0, 1.0 ) ;
    m_pOptimizer->PutDerivObj( 1, 2.0 * dX[ 1 ] ) ;
    m_pOptimizer->PutDerivObj( 2, 1.0 ) ;

    m_pOptimizer->PutDerivObj( 0, 2.0 * dX[ 0 ], 1 ) ;
    m_pOptimizer->PutDerivObj( 1, 1.0, 1 ) ;
    m_pOptimizer->PutDerivObj( 2, 1.0, 1 ) ;

    m_pOptimizer->PutDerivObj( 0, 1.0, 2 ) ;
    m_pOptimizer->PutDerivObj( 1, 1.0, 2 ) ;
    m_pOptimizer->PutDerivObj( 2, 2.0 * dX[ 2 ], 2 ) ;

    m_pOptimizer->PutDerivConstr( 0, 0, -1.0 ) ;
    m_pOptimizer->PutDerivConstr( 0, 1, -1.0 ) ;
    m_pOptimizer->PutDerivConstr( 0, 2, -1.0 ) ;

    m_pOptimizer->PutDerivConstr( 1, 0, -2.0 * dX[ 0 ] + 10.0 ) ;
    m_pOptimizer->PutDerivConstr( 1, 1, -2.0 * dX[ 1 ] + 16.0 ) ;
    m_pOptimizer->PutDerivConstr( 1, 2, -2.0 * dX[ 2 ] + 22.0 ) ;

    return EXIT_SUCCESS ;
}

int Example38::SolveOptProblem()
{
    int iError ;

    iError = 0 ;

    // Parameters can be set using a variable, an address
    // or the value itself:

    int iMaxNumIter = 500 ;
    int iMaxNumIterLS = 1000 ;
    double dTermAcc = 1.0E-6 ;

    m_pOptimizer->Put( "NumObjFuns", 3 ) ;
    m_pOptimizer->Put( "Order", 2 ) ;
    m_pOptimizer->Put( "ComputeIdeals", true ) ;
    m_pOptimizer->Put( "RhoBegin", 5 ) ;
    m_pOptimizer->Put( "MaxNumLocMin", 10 ) ;
    m_pOptimizer->Put( "MaxNumIter", iMaxNumIter ) ;
    m_pOptimizer->Put( "MaxNumIterLS", & iMaxNumIterLS ) ;
    m_pOptimizer->Put( "TermAcc", & dTermAcc ) ;
    m_pOptimizer->Put( "OutputLevel", 3 ) ;

    // Define the optimization problem:

    m_pOptimizer->PutNumDv( 3 ) ;
    m_pOptimizer->PutNumIneqConstr( 2 ) ;
    m_pOptimizer->PutNumEqConstr( 0 ) ;

    double LB[ 3 ] = { -1e5, -1e5, -1e5 } ;
    double UB[ 3 ] = { 1e5, 1e5, 1e5 } ;
    double IG[ 3 ] = { 0.5, 8.0, 2.0 } ;

    m_pOptimizer->PutUpperBound( UB ) ;
    m_pOptimizer->PutLowerBound( LB ) ;
    m_pOptimizer->PutInitialGuess( IG ) ;

    // Start the optimization
    iError = m_pOptimizer->DefaultLoop( this ) ;

    // If there was an error, report it ...
    if ( iError != EXIT_SUCCESS )
    {
        cout << "Error " << iError << "\n" ;
        return iError ;
    }

    // ... else report the results:

    const double * dX ;
    double * F ;

    int iNumDesignVar ;
    int iNumConstr ;
    int iNumObjFuns ;
Part III.
20. MidacoWrapper
MidacoWrapper is a C++ wrapper for MIDACO, a stochastic Gauss approximation algorithm
by Schlüter for mixed integer nonlinear optimization problems. MidacoWrapper allows continuous,
integer and catalogued variables; the latter can only take values within a given discrete set.
Furthermore, the algorithm supports parallel function evaluation at a given number of points.
The algorithm succeeded in finding the best known solutions to three global trajectory opti-
mization problems proposed by the ESA (European Space Agency) which can be found at
http://www.esa.int/gsp/ACT/inf/op/globopt/edvdvdedjds.htm,
http://www.esa.int/gsp/ACT/inf/op/globopt/evevejsa.htm
and
http://www.esa.int/gsp/ACT/inf/op/globopt/MessengerFull.html.
The sections 20.1, 20.2, 20.3 and 20.4 are taken from Schlüter et al. [146], where a description
of the extended ACO algorithm is given. MIDACO controls several of the ACO algorithms
described in the following and contains further heuristics.
More information on MIDACO can be found online at www.midaco-solver.com.
20.1. Introduction
The first optimization algorithms inspired by the foraging behavior of ants were introduced by
Marco Dorigo in his PhD thesis (Dorigo [32]). Later, these algorithms were formalized as the Ant
Colony Optimization (ACO) metaheuristic (Dorigo and Di Caro [33]). Originally the ACO
metaheuristic was considered only for combinatorial optimization problems (e.g. Travelling
Salesman Problem, Stützle and Dorigo [152]). In Bonabeau et al. [12] a general overview on
ACO and its applications on some scientific and engineering problems is given. An introduction
to ACO together with recent trends is given in Blum [8]. Comprehensive information on ACO
can be found in Dorigo and Stuetzle [34].
Several extensions of the ACO (Ant Colony Optimization) metaheuristic for continuous search
domains can be found in the literature, among them Socha and Dorigo [149], Yu et al. [169],
Dreo and Siarry [35] or Kong and Tian [78]. Other applications of ACO frameworks for real-
world problems, arising from engineering design applications, can be found in Jayaraman et al.
[73], Rajesh et al. [111], Chunfeng and Xin [24] or Zhang et al. [172]. In contrast, extensions for
mixed integer search domains are very rare in the literature. In Socha [148] a general extension
to continuous and mixed integer domains is discussed.
Although a detailed explanation of an ACO algorithm design for continuous problems together
with numerical results is given in this reference, the application on mixed integer domains is
This paper introduces a conceptually new extension of the ACO metaheuristic for mixed integer
search domains. In contrast to a pheromone table, a pheromone guided discretised probability
density function will be used for the discrete search domain. This approach allows an intuitive
handling of integer variables besides continuous ones within the same algorithm framework.
Whilst the above mentioned extensions of ACO on mixed integer domains can be seen as based
on a discrete ACO algorithm with an extension for continuous domains, our approach works
the other way round. Based on a continuous ACO methodology we extend the algorithm to
discrete variables. This is done by a heuristic defining a lower bound for the standard deviation
of the discretized Gaussian probability function, which is assumed here as probability density
function.
To apply the ACO metaheuristic to general MINLPs, we had to consider not only mixed
integer search domains, but also a good handling of arbitrary constraints. Here we propose a
new penalty strategy that reinforces this approach and which fits very well in the extended ACO
metaheuristic. Our method is based on user-given oracle information, an estimated preferred
objective function value, which is used as the crucial parameter for the penalty function. A
detailed investigation of this approach is in preparation and the results seem to be promising.
In particular the method seems to be quite robust against badly selected oracles. Furthermore
it can be shown analytically that a sufficiently large or sufficiently low oracle can be used as
a default parameter.
Our implementation, named ACOmi, is based on the ACO metaheuristic for continuous domains
presented by Socha and Dorigo [149], enlarged by the two novel heuristics mentioned above.
As this implementation is a sophisticated one, aiming at practical use in real-world applications,
several other heuristics (which will be briefly described in Section 5) are included and a
mixed integer sequential quadratic programming algorithm is embedded as a local solver in the
metaheuristic framework. In addition to a set of academic benchmark test problems, a complex
MINLP application, a thermal insulation system described in Abramson [1], is solved with this
implementation to underline its practical relevance for real-world applications.
20.2. Approaches for non-convex Mixed Integer Nonlinear Programs
Before giving an overview of the existing methodologies for such problems, the mathematical
formulation of a MINLP is given in (20.1).
In this formulation f(x, y) is the objective function, which has to be minimized, depending
on x, the vector of ncon continuous decision variables, and y, the vector of nint integer decision
variables. The functions g_1, ..., g_meq represent the equality constraints and the functions
g_meq+1, ..., g_m the inequality constraints. The vectors x_l, x_u and y_l, y_u are the lower and
upper bounds for the decision variables x and y; these are also called box constraints.
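As a minimal illustration of this notation, the following self-contained sketch encodes a toy MINLP with one equality and one inequality constraint. The sign convention (inequalities required to be non-negative) is an assumption made here, and the problem itself is invented for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// A toy black-box MINLP: decision vector split into continuous
// variables x and integer variables y; the first nMeq constraints
// are equalities, the remaining ones inequalities g(x, y) >= 0.
struct Minlp
{
    std::size_t nMeq ;  // number of equality constraints

    double Obj( const std::vector<double> & x,
                const std::vector<int> & y ) const
    {
        return x[ 0 ] * x[ 0 ] + y[ 0 ] ;
    }
    std::vector<double> Constr( const std::vector<double> & x,
                                const std::vector<int> & y ) const
    {
        // g_1 = x_1 + y_1 - 2 (equality), g_2 = 4 - x_1 (inequality)
        return { x[ 0 ] + y[ 0 ] - 2.0, 4.0 - x[ 0 ] } ;
    }
    bool Feasible( const std::vector<double> & x,
                   const std::vector<int> & y, double dTol ) const
    {
        std::vector<double> g = Constr( x, y ) ;
        for ( std::size_t i = 0 ; i < g.size() ; i++ )
        {
            if ( i < nMeq ) { if ( std::fabs( g[ i ] ) > dTol ) return false ; }
            else            { if ( g[ i ] < -dTol )             return false ; }
        }
        return true ;
    }
} ;
```

A black box solver only ever calls Obj and Constr; it never sees the algebraic form of the functions.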
In principle two types of approaches are possible to solve this kind of problem: deterministic
and stochastic methods. So-called metaheuristics (Glover and Kochenberger [55]) often belong
to the latter. ACO can be classified as a stochastic metaheuristic. Modern algorithms, like the
one discussed in this paper, often combine both methodologies in a hybrid-manner, consisting
of a stochastic framework with deterministic strategies embedded. For example Egea et al.
[38], Chelouah and Siarry [23] or Chelouah and Siarry [22] also follow this hybrid framework
approach.
Among the deterministic approaches for MINLPs, Branch and Bound techniques, Outer Ap-
proximation, General Benders Decomposition or extended Cutting Plane methods are the most
common ones. A comprehensive review on these can be found in Grossman [60]. The big
advantage of deterministic approaches is that many of them can guarantee global optimality. On
the other hand this guarantee comes with a disadvantage: the possibility of a tremendous
computation time, depending on the problem structure. In addition, most of the implementations of
these methods require a user-given formulation of the mathematical MINLP in an explicit way.
In this case the implementation is called a white box solver. In principle any implementation
can alternatively gain the required information from a black box formulation via approximation
by function evaluations, but this greatly increases the computational effort.
In contrast to the white box approach, black box solvers do not require any knowledge of
the mathematical formulation of the optimization problem. Of course a mathematical formula-
tion is always essential to implement and tackle an optimization problem, but black box solvers
do not assimilate this formulation. This property makes them very flexible with respect to
programming languages and problem types, which is very much appreciated by practitioners. All
metaheuristics can be seen as black box solvers regarding their fundamental concept. In this
paper an extension of the ACO metaheuristic will be introduced to apply this method to
general MINLPs. Besides academic benchmark MINLP test problems, one complex engineering
application will be considered and optimized with our ACO implementation.
20.3. The Ant Colony Optimization framework
The basic idea of ACO algorithms is to mimic the foraging behavior of biological ants with artificial ants
’walking’ on a graph, which represents a mathematical problem (e.g. Traveling Salesman Prob-
lem). An optimal path in terms of length or some other cost-resource is requested in those
problems, which belong to the field of combinatorial optimization. By using a parametrized
probabilistic model, called pheromone table, the artificial ants choose a path through a
completely connected graph G(C, L), where C is the set of vertices and L is the set of connections.
The set C of vertices represents the solution components, which every ant chooses incrementally
to create a path. The pheromone values within the pheromone table are used by the ants to
make these probabilistic decisions. By updating the pheromone values according to information
gained on the search domain, this algorithmic procedure leads to very good and hopefully
globally optimal solutions, like the biological counterpart.
The pseudo-code in Algorithm 12 illustrates this fundamental working procedure of the ACO
metaheuristic. The stopping criteria and daemon actions are a choice of the algorithm designer.
Commonly used stopping criteria are for example a maximal limit of constructed solutions or
a maximal time budget. Daemon actions might be any activity that cannot be performed by
single ants. Local search activities and additional pheromone manipulations are examples for
such daemon actions, see Blum [9].
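As a hedged, self-contained caricature of this working procedure for a continuous toy problem: everything below, from the single-Gaussian model to the 0.9 shrink factor, is invented for illustration and is not the MIDACO implementation.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Toy ACO-style loop: ants sample candidates from a probabilistic
// model (a single Gaussian here); the best candidate updates the
// model, and a shrinking deviation stands in for pheromone update.
double ToyAco( double ( *f )( double ), unsigned uSeed )
{
    std::mt19937 gen( uSeed ) ;
    double dMean  = 4.0 ;    // initial model mean
    double dSigma = 2.0 ;    // initial model deviation
    double dBest  = f( dMean ) ;

    for ( int iGen = 0 ; iGen < 50 ; iGen++ )        // stopping criterion
    {
        std::normal_distribution<double> pdf( dMean, dSigma ) ;
        for ( int iAnt = 0 ; iAnt < 10 ; iAnt++ )    // construct solutions
        {
            double dX = pdf( gen ) ;
            if ( f( dX ) < dBest )                   // "pheromone" update:
            {                                        // move mean to best ant
                dBest = f( dX ) ;
                dMean = dX ;
            }
        }
        dSigma *= 0.9 ;                              // implicit evaporation
    }
    return dBest ;
}
```

A daemon action would slot in between generations, e.g. a local search started from the current mean.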
In contrast to the original ACO metaheuristic developed for combinatorial optimization prob-
lems, the ACO framework considered in this paper is mainly based on the extended ACO for
continuous domains proposed by Socha and Dorigo [149]. The biological visualization of ants
choosing their way through a graph-like search domain does not hold any longer for these prob-
lems, as these belong to a completely different class. However, the extension of the original
ACO metaheuristic to continuous domains is possible without any major conceptual change,
see Socha [148]. In this methodology, ACO works by the incremental construction of solutions
regarding a probabilistic choice according to a probability density function (PDF), instead of a
pheromone table like in the original ACO. In principle any function P (x) ≥ 0 for all x with the
property:
    ∫_{-∞}^{+∞} P(x) dx = 1        (20.2)
can act as a PDF. Among the most popular functions to be used as a PDF is the Gaussian
function. This function has some clear advantages like an easy implementation (e.g. Box and
Muller [16]) and a corresponding fast sampling time of random numbers. On the other hand, a
single Gaussian function is only able to focus on one mean and therefore not able to describe
situations where two or more disjoint areas of the search domain are promising. To overcome
this disadvantage by still keeping track of the benefits of a Gaussian function, a PDF Gi (x)
consisting of a weighted sum of several one-dimensional Gaussian functions gli (x) is considered
for every dimension i of the original search domain:
    G^i(x) = Σ_{l=1}^{k} w_l^i · g_l^i(x) = Σ_{l=1}^{k} w_l^i · 1/(σ_l^i √(2π)) · exp( -(x - μ_l^i)² / (2 (σ_l^i)²) )        (20.3)
This function is characterized by the triplets (w_l^i, σ_l^i, μ_l^i) that are given for every dimension i of
the search domain and the number of kernels k of Gauss functions used within Gi (x). Within this
triplet, w represents the weights for the individual Gaussian functions for the PDF, σ represents
the standard deviations, and µ represents the means for the corresponding Gaussian functions.
The indices i and l refer, respectively, to the i-th dimension of the decision vector of the MINLP
problem and the l-th kernel number of the individual Gaussian function within the PDF.
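Equation (20.3) translates directly into code. The following standalone snippet evaluates G^i(x) for one dimension; the kernel parameters used in the example are arbitrary illustration values.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Evaluate the weighted Gaussian mixture of (20.3):
// G(x) = sum_l w_l * exp( -(x - mu_l)^2 / (2 sigma_l^2) )
//                  / ( sigma_l * sqrt(2 pi) ).
double MixturePdf( double dX,
                   const std::vector<double> & w,
                   const std::vector<double> & sigma,
                   const std::vector<double> & mu )
{
    const double dPi = 3.14159265358979323846 ;
    double dG = 0.0 ;
    for ( std::size_t l = 0 ; l < w.size() ; l++ )
    {
        double dZ = ( dX - mu[ l ] ) / sigma[ l ] ;
        dG += w[ l ] * std::exp( -0.5 * dZ * dZ )
                     / ( sigma[ l ] * std::sqrt( 2.0 * dPi ) ) ;
    }
    return dG ;
}
```

If the weights sum to one, G integrates to one as required by (20.2), and the mixture can place probability mass on several disjoint promising regions at once.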
As the above triplets fully characterize the PDF and therefore guide the sampled solution
candidates throughout the search domain, they are called pheromones in the ACO sense and
constitute the biological background of the ACO metaheuristic presented here. Besides the
incremental construction of the solution candidates according to the PDF, the update of the
pheromones plays a major role in the ACO metaheuristic.
An obviously good choice to update the pheromones is the use of information which has been
gained throughout the search process so far. This can be done by using a solution archive SA
in which the so far most promising solutions are saved. In case of k kernels this can be done by
choosing an archive size of k. Thus the SA contains k n-dimensional solution vectors s_l and the
corresponding k objective function values (see Socha [148]).
As the focus is here on constrained MINLPs, the solution archive SA also contains the cor-
responding violation of the constraints and the penalty function value for every solution sl .
In particular, the attraction of a solution sl saved in the archive is measured regarding the
penalty function value instead of the objective function value. Details on the measurement of
the violation and the penalty function will be described in Section 20.4.
We now explain the update process for the pheromones which is directly connected to the
update process of the solution archive SA. The weights w (which indicate the importance of an
ant and therefore rank them) are calculated with a linear proportion according to the parameter
k:
    w_l^i = (k - l + 1) / Σ_{j=1}^{k} j        (20.4)
With this fixed distribution of the weights, a linear order of priority within the solution archive
SA is established. Solutions sl with a low index l are preferred. Hence, s1 is the current best
solution found and therefore most important, while sk is the solution of lowest interest saved
in SA. Updating the solution archive will then directly imply a pheromone update based on the
best solutions found so far. Every time a new solution (ant) is created and evaluated within a
generation its attraction (penalty function value) is compared to the attraction of the so far best
solutions saved in SA, starting with the very best solution s1 and ending up with the last one
sk in the archive. In case the new solution has a better attraction than the j-th one saved in the
archive, the new solution will be saved at the j-th position in SA, while all solutions formerly
occupying the j-th to (k − 1)-th positions drop down one index in the archive and the solution
formerly at the last, k-th position is discarded. Of course, if the new solution has a worse
attraction than s_k, the solution archive remains unaffected. As explained in detail in the
following, the solutions saved in SA fully determine the deviations and means used for the
PDF and therefore imply the major part of the pheromone triplet (w_l^i, σ_l^i, µ_l^i). This way,
updating the solution archive with better solutions automatically leads to a positive pheromone
update. Note that a negative pheromone update (evaporation) is indirectly performed as well by
dropping the last solution s_k of SA every time a new solution is introduced into the solution
archive. Explicit pheromone evaporation rules are known and can be found for example in Socha
and Dorigo [149], but were not considered here due to the implicit negative update and the
simplicity of the framework.
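The weight rule (20.4) and the rank-based archive update described above can be sketched in a few lines of C++. The class and function names below are illustrative and not part of NLP++; the archive is assumed to be kept sorted by penalty value.

```cpp
#include <cstddef>
#include <vector>

// Weight of the l-th archive entry (1-based rank l, archive size k),
// following (20.4): w_l = (k - l + 1) / sum_{j=1}^{k} j.
double KernelWeight(std::size_t l, std::size_t k) {
    double denom = 0.5 * k * (k + 1);   // sum_{j=1}^{k} j in closed form
    return (k - l + 1) / denom;
}

// One entry of the solution archive SA.
struct ArchiveEntry {
    std::vector<double> s;   // n-dimensional solution vector
    double penalty;          // attraction, measured by the penalty function
};

// Rank-based update of SA: a new solution displaces the first archived
// entry with a worse penalty value; the former last entry is discarded,
// which acts as an implicit pheromone evaporation.
void UpdateArchive(std::vector<ArchiveEntry>& sa, const ArchiveEntry& candidate) {
    for (std::size_t j = 0; j < sa.size(); ++j) {
        if (candidate.penalty < sa[j].penalty) {
            sa.insert(sa.begin() + j, candidate);  // take the j-th position
            sa.pop_back();                         // drop the former k-th entry
            return;
        }
    }
    // Worse than every archived solution: the archive remains unaffected.
}
```

Since the weights of (20.4) sum to one, they can directly serve as selection probabilities for the k Gaussian kernels.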
Standard deviations σ are calculated by exploiting the variety of the solutions saved in SA.
For every dimension i the maximal and the minimal distance between the single solution
components s^i of the k solutions saved in SA is calculated. The distance between these two
values, divided by the number of generations, then defines the standard deviation for every
dimension i:

    σ_l^i = (dis_max(i) − dis_min(i)) / #generation                            (20.5)

For all k single Gaussian functions within the PDF this deviation is then used for the
corresponding dimension i. The means µ are given directly by the single components of the
solutions saved in SA:

    µ_l^i = s_l^i                                                              (20.6)
The incremental construction of a new ant works in the following way: First, a mean µ_l^i is
chosen for every dimension i. This choice is made according to the weights w_l^i. Given the
weights defined in (20.4) and the identity of µ_l^i and s_l^i defined in (20.6), the mean µ_1^i
has the highest probability of being chosen, while µ_k^i has the lowest. Second, a random
number is generated by sampling around the selected mean µ_l^i using the deviation σ_l^i
defined by (20.5). Proceeding like this through all dimensions i = 1, ..., n then creates the
new ant, which can be evaluated regarding its objective function value and constraint
violations in the next step.
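The two-step construction above (kernel choice by weight, then Gaussian sampling) can be sketched as follows. Function and parameter names are illustrative; the deviations are assumed to be strictly positive, as required by `std::normal_distribution`.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Construct one new ant, dimension by dimension: first choose a kernel l
// with probability proportional to the weights w_l of (20.4), then sample
// around the mean mu[l][i] = s_l^i with deviation sigma[l][i] from (20.5).
// mu and sigma are k x n matrices; mu must not be empty.
std::vector<double> ConstructAnt(const std::vector<double>& w,
                                 const std::vector<std::vector<double>>& mu,
                                 const std::vector<std::vector<double>>& sigma,
                                 std::mt19937& rng) {
    std::discrete_distribution<std::size_t> pick(w.begin(), w.end());
    std::size_t n = mu.front().size();
    std::vector<double> ant(n);
    for (std::size_t i = 0; i < n; ++i) {
        std::size_t l = pick(rng);  // s_1 (index 0 here) is chosen most often
        std::normal_distribution<double> gauss(mu[l][i], sigma[l][i]);
        ant[i] = gauss(rng);
    }
    return ant;
}
```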
So far, this algorithmic framework does not differ from the one proposed by Socha [148],
except that here explicit rules for all pheromone parameters (w_l^i, σ_l^i, µ_l^i) are given in
(20.4), (20.5) and (20.6), while Socha proposes rules only for the deviations and means. Note
that the rules for the deviations and means shown here were taken from Socha. The novel
extension which enables the algorithm to handle mixed-integer search domains modifies the
deviations σ_l^i used for the integer variables and is described now in detail.
To handle integer variables besides the continuous ones as described above, a discretization
of the continuous random numbers (sampled by the PDF) is necessary. The clear advantage of
this approach is the straightforward integration into the framework described above. A
disadvantage is the missing flexibility in cases where, for an integer dimension i, all k
solution components s_{1,...,k}^i in SA share the same value. In this case the corresponding
deviations σ_{1,...,k}^i are zero according to the formulation in (20.5). As a consequence, no
further progress in these components is possible, as the PDF would only sample the exact mean
without any deviation.
Introducing a lower bound for the deviation of integer variables helps to overcome this
disadvantage and enables the ACO metaheuristic to handle integer and continuous variables
simultaneously without any major extension of the framework. For a dimension i that
corresponds to an integer variable the deviations σ_l^i are calculated by:

    σ_l^i = max{ (dis_max(i) − dis_min(i)) / #generation ,
                 1/#generation ,
                 (1 − 1/√n_int)/2 }                                            (20.7)
With this definition the deviations of the corresponding Gaussian functions for integer
variables will never fall below the fixed lower bound (1 − 1/√n_int)/2, determined by the
number of integer variables n_int considered in the MINLP problem formulation. For MINLPs
with a large number of integers, this lower bound converges toward 0.5 and therefore ensures a
deviation that hopefully keeps the algorithm flexible enough to find its way through the
(large) mixed-integer search domain. A small number of integer variables leads to a smaller
lower bound, whilst for only one integer variable this bound is actually zero. In case of a
small number of integer variables in the MINLP it is reasonable to assume that at some point
of the search progress the optimal combination of integers is found and no further searching
with a wide deviation is necessary for the integer variables. But even in case of only one
integer variable, with a corresponding absolute lower bound of zero, the middle term
1/#generation in (20.7) ensures a not too fast convergence of the deviation. Therefore, the
calculation of the standard deviation σ for integer variables by (20.7) seems to be a
reasonable choice and is confirmed by the numerical results.
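The three-term maximum in (20.7) translates directly into code. The following helper is an illustrative transcription, not the NLP++ implementation; argument names are arbitrary.

```cpp
#include <algorithm>
#include <cmath>

// Standard deviation for an integer dimension i according to (20.7):
// the distance-based term of (20.5) is bounded below both by
// 1/#generation and by the fixed lower bound (1 - 1/sqrt(n_int))/2.
double IntegerDeviation(double disMax, double disMin, int generation, int nInt) {
    double spread = (disMax - disMin) / generation;  // term of (20.5)
    double slow = 1.0 / generation;                  // prevents fast collapse
    double fixedLower = (1.0 - 1.0 / std::sqrt(static_cast<double>(nInt))) / 2.0;
    return std::max({spread, slow, fixedLower});
}
```

For a single integer variable the fixed bound vanishes, but the `1/generation` term still keeps the deviation positive, exactly as argued in the text.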
20.4. Penalty method

Simple penalty methods require only one or a few parameters, which makes their
use and implementation very easy and popular. This advantage of simple penalty methods
comes with the drawback that for especially challenging problems they are often not capable
of gaining sufficient performance. Sophisticated penalty methods (e.g. adaptive or annealing
ones, see Yeniay [168]) are more powerful and adjustable to a specific problem due to a larger
number of parameters. However, this greater potential implies an additional optimization task
for a specific problem: the sufficiently good selection of a large set of parameters. A
high-potential penalty method that requires no or only a few parameters to be selected is
therefore an interesting approach.
The constraint violation of an iterate z is measured by the residual function res(z):

    res(z) = Σ_{i=1}^{m_eq} |g_i(z)| − Σ_{i=m_eq+1}^{m} min{0, g_i(z)}          (20.8)
where g_{1,...,m_eq} denote the equality constraints and g_{m_eq+1,...,m} the inequality
constraints. The penalty function p(z) is then calculated by:
    p(z) = { α · |f(z) − Ω| + (1 − α) · res(z) − β ,  if f(z) > Ω or res(z) > 0
           { −|f(z) − Ω| ,                            if f(z) ≤ Ω and res(z) = 0   (20.9)
where the parameter α is defined by:

    α = { (|f(z) − Ω| · (6√3 − 2)/(6√3) − res(z)) / (|f(z) − Ω| − res(z)) ,
                              if f(z) > Ω and res(z) < |f(z) − Ω|/3
        { 1 − 1/(2 · √(|f(z) − Ω|/res(z))) ,
                              if f(z) > Ω and |f(z) − Ω|/3 ≤ res(z) ≤ |f(z) − Ω|
        { (1/2) · √(|f(z) − Ω|/res(z)) ,
                              if f(z) > Ω and res(z) > |f(z) − Ω|
        { 0 ,                 if f(z) ≤ Ω                                      (20.10)
and β by:
    β = { (|f(z) − Ω| · (6√3 − 2)/(6√3)) · (1 − 3 · res(z)/|f(z) − Ω|) / (1 + 1/√#generation) ,
                              if f(z) > Ω and res(z) < |f(z) − Ω|/3
        { 0 ,                 else                                             (20.11)
Before a brief motivation of the individual components of the penalty method is given, a
graphical illustration of the penalty function values p(z) for different objective and
residual function values, a given oracle parameter Ω and a fixed number of generations is
given below. Figure 20.1 shows the three-dimensional shape of the penalty function for the
first and the hundredth generation with an oracle parameter equal to zero, for residual
function values in the range from zero to ten and objective function values in the range from
minus ten to ten. It is important to note that the shape of the penalty function is not
affected at all by the oracle parameter. An oracle parameter lower or greater than zero only
shifts the shape to the left or right, respectively, along the axis representing the objective
function value.
Figure 20.1.: Three dimensional shape of the robust oracle penalty method for Ω = 0 and
#generation = 1 (left) and #generation = 100 (right).
Now a brief motivation of the individual components of the penalty method is given. The
penalty function p(z) defined in (20.9) is split into two cases. In the case f(z) > Ω and
res(z) > 0 the penalty function is similar to common ones, where a parameter α balances the
weight between the objective function and the residual and an additional β term acts as a
bias. In the case f(z) ≤ Ω and res(z) = 0 the penalty function p(z) is defined as the
negative distance between the objective function value f(z) and the oracle parameter Ω. In
this case, the resulting penalty values will be zero or negative. This case corresponds to
the front left sides (res(z) = 0 and f(z) ≤ Ω) of the 3D shapes shown in Figure 20.1, which
are formed as a vertical triangle. In the case f(z) ≤ Ω and res(z) > 0, both the α term and
the β term are zero. According to (20.9) this implies that the penalty value p(z) is equal to
the residual value res(z). This case corresponds to the left sides (f(z) ≤ Ω) of the two 3D
shapes shown in Figure 20.1, which are formed as a beveled plane.
The two middle terms of the definition of α, 1 − 1/(2 · √(|f(z) − Ω|/res(z))) and
(1/2) · √(|f(z) − Ω|/res(z)), are the major components of the oracle penalty method. In case
f(z) > Ω one of them is active in α, depending on the residual value res(z). If
res(z) > |f(z) − Ω| the latter one is active and results in a value α < 1/2, which increases
the weight on the residual in the penalty function p(z). Otherwise, if res(z) ≤ |f(z) − Ω|,
the first one is active and results in a value α ≥ 1/2, which increases the weight on the
objective function according to (20.9). Therefore these two components enable the penalty
method to balance the search process toward either the objective function value or the
residual. This balancing finds its representation in the nonlinear shapes of the upper right
sides (f(z) > Ω) of the two 3D shapes shown in Figure 20.1.
A special case occurs if f(z) > Ω and res(z) < |f(z) − Ω|/3; in this case the very first term
in the definition of α and the β term are active. This is done because otherwise the second
term in the definition of α would lead to a worse penalization of iterates with
res(z) < |f(z) − Ω|/3 than of those with res(z) = |f(z) − Ω|/3 (a detailed explanation goes
beyond the scope here). As this is seen as a negative effect regarding the robustness of the
method, because feasible solutions with res(z) = 0 < |f(z) − Ω|/3 would be penalized worse
than infeasible ones, the very first term of α is activated in this case. The very first term
in α implies an equal penalization p(z) of all iterates sharing the same objective function
value f(z) > Ω and a residual between zero and |f(z) − Ω|/3, not taking the β term in
definition (20.9) into account. The β term goes one step further and biases the penalization
of iterates with a residual smaller than |f(z) − Ω|/3 to be better than that of those with
res(z) = |f(z) − Ω|/3. Moreover, this bias increases with ongoing algorithm progress, as the
number of generations influences the β term. The front right sides (f(z) > Ω) of the two 3D
shapes shown in Figure 20.1 correspond to this special case; these are formed as a triangle.
In case of the left subfigure in Figure 20.1, which corresponds to a generation number of 1,
the biasing effect is not as strong as in case of the right subfigure, which corresponds to a
generation number of 100. Note that this dynamic bias, caused by the β term in definition
(20.9), is the only difference between the two 3D shapes shown in Figure 20.1.
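The residual (20.8) and the penalty evaluation (20.9)-(20.11) can be collected in one routine. The sketch below is an illustrative transcription of the formulas of this section, not the NLP++ implementation; function names are arbitrary.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Residual (20.8): absolute violation of the equality constraints
// g_1..g_meq plus the violation of the inequality constraints
// g_{meq+1}..g_m (g >= 0 means feasible).
double Residual(const std::vector<double>& g, int meq) {
    double res = 0.0;
    for (int i = 0; i < static_cast<int>(g.size()); ++i) {
        if (i < meq) res += std::fabs(g[i]);
        else         res -= std::min(0.0, g[i]);
    }
    return res;
}

// Robust oracle penalty (20.9) with alpha from (20.10) and beta from (20.11).
double OraclePenalty(double f, double res, double omega, int generation) {
    if (f <= omega && res == 0.0) return -std::fabs(f - omega);
    double d = std::fabs(f - omega);
    double alpha = 0.0, beta = 0.0;
    if (f > omega) {
        if (res < d / 3.0) {  // special case: first term of alpha and beta active
            double c = (6.0 * std::sqrt(3.0) - 2.0) / (6.0 * std::sqrt(3.0));
            alpha = (d * c - res) / (d - res);
            beta = d * c * (1.0 - 3.0 * res / d)
                   / (1.0 + 1.0 / std::sqrt(static_cast<double>(generation)));
        } else if (res <= d) {  // balancing toward the objective, alpha >= 1/2
            alpha = 1.0 - 1.0 / (2.0 * std::sqrt(d / res));
        } else {                // balancing toward the residual, alpha < 1/2
            alpha = 0.5 * std::sqrt(d / res);
        }
    }
    return alpha * d + (1.0 - alpha) * res - beta;
}
```

Note that for f(z) ≤ Ω and res(z) > 0 both α and β vanish, so the routine returns exactly the residual, matching the beveled plane in Figure 20.1.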
20.4.2. Oracle update rule

For most real-world problems the globally optimal (feasible) objective function value is
unknown a priori. As the oracle parameter Ω is best selected equal to or just slightly greater
than the optimal objective function value, finding a sufficiently good oracle and solving the
original problem to the global optimum go hand in hand. This self-tuning effect of the method
is easy to exploit if several optimization test runs are performed. Each time an optimization
test run using a specific oracle has finished, the obtained solution objective and residual
function values directly deliver information about an appropriate oracle update.
Numerical tests show that the oracle method is quite robust against overestimated oracles.
Underestimated oracles tend to deliver feasible points that are close to, but not exactly at,
the globally optimal objective function value. Based on those results the following update
rule was deduced; it is intended to be applied if for a given problem absolutely no
information about a possible optimal objective function value is available. The oracle for the
very first run should be selected sufficiently low (in Table 20.1 designated as −∞, e.g.
−10^12); in this case the robust oracle method follows a dynamic penalty strategy. The hope is
to find a feasible point which is already close to the global optimum. If the first run
succeeds in finding a feasible solution, the oracle will be updated with this solution, and
from then on only feasible solutions with lower objective function values will be used to
further update the oracle. In case the first run did not deliver any feasible point, the
oracle should be selected sufficiently large (in Table 20.1 designated as ∞, e.g. 10^12). With
this parameter the method will at first completely focus on finding a feasible solution (see
the flat shape in Figure 20.1 for f(z) < Ω and res(z) > 0) and will then switch to a death
strategy (see the spike shape in Figure 20.1 for f(z) < Ω and res(z) = 0) as soon as a
feasible solution is found.
Table 20.1.: Update rule for the oracle throughout several optimization test runs

    Ω^i     Oracle used for the i-th optimization test run
    f^i     Objective function value obtained by the i-th test run
    res^i   Residual function value corresponding to f^i

    Ω^2 = { f^1            , if res^1 = 0     (update with the solution if f^1 is feasible)
          { ∞ (suff. high) , else             (static & death strategy if f^1 is infeasible)

    Ω^i = { f^{i−1}        , if res^{i−1} = 0 and f^{i−1} < Ω^{i−1}
          { Ω^{i−1}        , else
          (update Ω^i only with feasible (res^{i−1} = 0) and better (f^{i−1} < Ω^{i−1}) solutions)
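The update logic of Table 20.1 amounts to a few comparisons. The sketch below is illustrative; the function name and the encoding of "sufficiently high" as infinity are assumptions, not part of NLP++.

```cpp
#include <limits>

// Oracle update between optimization test runs, following Table 20.1:
// after the first run, the oracle is taken from the run's solution if it
// was feasible, and set sufficiently high otherwise; later runs replace
// the oracle only with feasible (res == 0) and better (f < Omega) solutions.
double UpdateOracle(double omega, double f, double res, int finishedRuns) {
    double inf = std::numeric_limits<double>::infinity();
    if (finishedRuns == 1)
        return (res == 0.0) ? f : inf;   // Table 20.1, row for Omega^2
    return (res == 0.0 && f < omega) ? f : omega;  // row for Omega^i, i > 2
}
```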
20.5. Program Documentation

2. a) Put the number of parallel systems, i.e. the number of points at which you want to
      evaluate the functions simultaneously:
Put( const char * pcParam , const int * piValue ) ;
or
Put( const char * pcParam , const int iValue ) ;
where pcParam is "NumParallelSys"
b) Put the number of design variables to the object:
PutNumDv( int iNumDv ) ;
c) Put the number of integer design variables to the object:
Put( const char * pcParam , const int * piValue ) ;
or
Put( const char * pcParam , const int iValue ) ;
where pcParam is "NumIntDv"
d) Put the number of catalogued design variables to the object:
Put( const char * pcParam , const int * piValue ) ;
or
3. Set boundary values for continuous and integer variables and provide an initial guess.
   a) Set upper and lower bounds using
      PutLowerBound( const int iVariableIndex, const double dLowerBound );
      PutUpperBound( const int iVariableIndex, const double dUpperBound );
      for iVariableIndex = 0,...,NumContVariables + NumIntVariables - 1,
      or
      PutUpperBound( const double * pdXu );
      PutLowerBound( const double * pdXl );
      where pdXu and pdXl must point to arrays of length equal to the number of
      continuous and integer design variables.
5. Check the status of the object using GetStatus( int & iStatus ) const
   • if iStatus equals SolvedId(): The final accuracy has been achieved, the problem is
     solved.
   • if iStatus is > 0: An error occurred during the solution process.
   • if iStatus equals EvalFuncId(): MidacoWrapper needs new function values → 6b
     New function values.
     After passing these values to the MidacoWrapper object go to 4.
with iObjFunIdx = 0 and dObjVal defining the value of the objective function
at the iParSysIdx’th design variable vector provided by Midaco.
Alternatively you can employ
PutConstrVal( const int iConstrIdx, const double dConstrValue,
const int iParSysIdx ),
for iParSysIdx = 0,...,NumberOfParallelSystems - 1
and iConstrIdx = 0,...,NumberOfConstraints - 1
7. Output
• GetConstrValVec( const double * & pdPointer ) const
Returns a pointer to an array containing the values of all constraint functions at the
last solution vector.
• GetConstrVal( const int iConstrIdx, double & dConstrVal ) const
Returns the value of the iConstrIdx’th constraint function at the last solution vector.
• GetObjVal( double & dObjVal, const int iFunIdx ) const
with iFunIdx = 0 (default) returns the value of the objective function at the last
solution vector.
• GetDesignVar( const int iVariableIdx, double & dValue, const int iParSysIdx
) const
with iParSysIdx = 0 (default) returns the value of the iVariableIdx’th design vari-
able in the last solution vector.
• GetDesignVarVec( const double * & pdPointer, const int iParSysIdx) const
with iParSysIdx = 0 (default) returns a pointer to the last solution vector.
21. MipOptimizer
Our purpose is to solve nonlinear mixed discrete optimization problems as given in the following
mathematical definition:
Minimize f (x, y R , y N R )
s.t.:
gj (x, y R , y N R ) = 0 , j = 1, . . . , me
gj (x, y R , y N R ) ≥ 0 , j = me + 1, . . . , m
(MIP)
li ≤ xi ≤ ui , i = 1, . . . , nC
yiR ∈ SiR ⊂ R, i = 1, . . . , nR
yiN R ∈ SiN R ⊂ R, i = 1, . . . , nN R
with:
f Objective function, f : Rn → R, n = nN R + nR + nC
gj Constraint functions, gj : Rn → R, j = 1, . . . , m
yN R Vector of nonrelaxable discrete design variables.
yR Vector of relaxable discrete parameters.
x Vector of continuous optimization parameters.
l, u ∈ Rn Vectors containing lower and upper bounds for continuous design vari-
ables.
The difference between relaxable and nonrelaxable design variables is that the function
values f(x, y^R, y^NR) and g_j(x, y^R, y^NR) can be calculated at any point y^R with
y_i^R ∈ [min{ỹ ∈ S_i^R}, max{ỹ ∈ S_i^R}] for all i = 1, ..., nR, whereas for nonrelaxable
design variables they can only be evaluated at y^NR ∈ S^NR = S_1^NR × · · · × S_{nNR}^NR.
The underlying algorithm MIPSQP solves discrete optimization problems by solving several
continuous nonlinear optimization problems (NLPs). These NLPs are derived from the original
problem (MIP) by fixing all nonrelaxable parameters to provided values and considering the
relaxable design variables as continuous ones. This NLP can be stated as follows.
Minimize f (x, y R , y N R )
s.t.:
gj (x, y R , y N R ) = 0 , j = 1, . . . , me
gj (x, y R , y N R ) ≥ 0 , j = me + 1, . . . , m (P(y N R ))
li ≤ xi ≤ ui , i = 1, . . . , nC
l_i^R ≤ y_i^R ≤ u_i^R , i = 1, . . . , nR
The second type of nonlinear optimization problem to be solved is generated from (MIP) by
fixing all discrete parameters to given values. This NLP can be stated as follows.
Minimize f (x, y R , y N R )
s.t.:
gj (x, y R , y N R ) = 0 , j = 1, . . . , me
(P(y N R , y R ))
gj (x, y R , y N R ) ≥ 0 , j = me + 1, . . . , m
li ≤ xi ≤ ui , i = 1, . . . , nC
By F(y^NR) and F(y^R, y^NR) we denote in the following the optimal objective function value
of P(y^NR) and P(y^NR, y^R), respectively. If an optimization problem is infeasible we set
F(y^NR) = +∞ or F(y^R, y^NR) = +∞, respectively.
A discrete design vector y_opt^R is considered locally optimal if no point in the following
neighborhood set yields an improved objective function value:

    ⋃_{i=1}^{nR} { (ỹ^R, y_opt^NR) :  ỹ_j^R = y_opt,j^R  for all j ∈ {1, ..., nR} \ {i},
                    ỹ_i^R = y_opt,i^R + d,
                    d ∈ { −min_{y ∈ S_i^R} { y_opt,i^R − y > 0 },
                           min_{y ∈ S_i^R} { y − y_opt,i^R > 0 } } }
In other words, if we cannot yield an improved objective function value by varying a single
component of the discrete design vector to the left or right neighbor we consider the design
vector as local optimal.
For solving nonlinear optimization problems we employ the well known SQP (Sequential
Quadratic Programming) approach.
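The left/right neighbor lookup underlying this optimality test can be sketched as follows; the catalogue S_i^R is assumed to be stored as an ordered set, and the function names are illustrative, not part of NLP++.

```cpp
#include <iterator>
#include <set>

// Nearest left neighbor of a discrete value v within the ordered catalogue
// s. Returns false if v is already the smallest catalogue value, i.e. no
// left neighbor exists to probe.
bool LeftNeighbor(const std::set<double>& s, double v, double& left) {
    auto it = s.lower_bound(v);      // first element >= v
    if (it == s.begin()) return false;
    left = *std::prev(it);
    return true;
}

// Nearest right neighbor of v within s; false if v is the largest value.
bool RightNeighbor(const std::set<double>& s, double v, double& right) {
    auto it = s.upper_bound(v);      // first element > v
    if (it == s.end()) return false;
    right = *it;
    return true;
}
```

Local optimality in the above sense means that evaluating the problem at every existing left and right neighbor of every single component yields no improvement.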
21.3. MIPSQP: Algorithm for Solving Nonlinear Discrete Optimization Problems
Relaxable Design Variables: These design variables are first considered as continuous ones.
The optimal solution of the continuous relaxation (P(y^NR)) is used to generate different
feasible candidate vectors for the relaxable design variables. Here we assume that the optimal
solution of (MIP) is in the neighborhood of the solution of (P(y^NR)). The optimality for
these design variables is tested by MP-Search. If no descent direction is found, the algorithm
stops and our optimality criterion is fulfilled. If none of the generated candidate vectors is
feasible, we continue with MP-Search.
Nonrelaxable Design Variables: To retrieve a descent direction for nonrelaxable design
variables, we approximate the minimum value function F(y^NR) : R^{nNR} → R quadratically:
Let H ≈ ∇²F(y^NR) and G ≈ ∇F(y^NR); then we solve the following quadratic optimization
problem:

    Minimize   (1/2) dᵀ H d + Gᵀ d
    s.t.       l^NR − y^NR ≤ d ≤ u^NR − y^NR,

where (x, y^R) is the optimal solution of P(y^NR) and l^NR, u^NR ∈ R^{nNR}, with
l_i^NR = min{y ∈ S_i^NR} and u_i^NR = max{y ∈ S_i^NR}, i = 1, ..., nNR, are the lower and
upper bound vectors for the nonrelaxable design variables.
nonrelaxable design variables. The optimal solution d of the quadratic problem is taken as
new search direction for nonrelaxable design variables, i.e. let ỹ N R be the vector resulting from
rounding each component of y N R +d to the next feasible point. Then we solve P(ỹ N R ) and search
a feasible point for relaxable design variables ỹ R . If this point satisfies F (ỹ R , ỹ N R ) < F (y R , y N R )
we have a new iterate with an improved objective value and restart with the QP-search. If the
QP-search fails we will perform the MP-Search algorithm.
MIPSQP - Algorithm
(I) Generating First Iterate
The generation of the first iterate is very important for solving the problem. We generate it
by fixing all discrete design variables to user-provided initial values:
Solve P(y_0^R, y_0^NR), where y_0^R, y_0^NR are the user-provided initial values for the
relaxable and nonrelaxable design variables. Let x̃ be the solution of P(y_0^R, y_0^NR); then
the first iterate is given by (x̃, y_0^R, y_0^NR) with objective function value
F(y_0^R, y_0^NR). If P(y_0^R, y_0^NR) is not feasible, we try to retrieve a feasible point for
the nonrelaxable design variables by searching a descent direction of the following objective
function:

    F_P(y^NR) = F_R(y^NR) + φ(x, y^R, y^NR)
where

    φ(x, y^R, y^NR) = Σ_{i=1}^{m} ω_i g_i(x, y^R, y^NR)

and

    F_R(y_0^NR) = min_{y^R, x} { f(x, y^R, y^NR) : g(x, y^R, y^NR) ≤ c, c ≥ 0 }
To retrieve a descent direction we approximate ∇F_R(y_0^NR) by finite central differences and
search in the opposite direction. The first steplength is calculated by the following rule:

    • A = Σ_{i=1}^{nNR} |∇F_R(y_0^NR)_i|
    • l = ∇F_R(y_0^NR) / A
    • τ = 1
    • Let ỹ^NR be the vector resulting from rounding each component of y_0^NR + l to the
      next value in S_i^NR.
    • If ỹ^NR is feasible for (MIP), the first iterate is generated either by rounding the
      optimal relaxed values for F(ỹ^NR) or by applying more concise methods.
    • Goto (1)
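The normalization step of this rule can be sketched in a few lines; the function name is illustrative, and the gradient is assumed to be nonzero so that A > 0.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// First steplength of phase (I): scale the finite-difference gradient of
// F_R by A, the sum of the absolute values of its components, l = grad / A.
// The resulting components sum to 1 in absolute value, giving a moderate
// first step over the nonrelaxable components.
std::vector<double> FirstStep(const std::vector<double>& grad) {
    double a = 0.0;
    for (double gi : grad) a += std::fabs(gi);   // A = sum_i |grad_i|
    std::vector<double> l(grad.size());
    for (std::size_t i = 0; i < grad.size(); ++i) l[i] = grad[i] / a;
    return l;
}
```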
(II) Generating Search Directions for Non-Relaxable Design Variables using QP-Search
method
Let (xk , ykR , ykN R ) be the current iterate. For k = 0 this is equal to the solution vector generated
by a method described above.
If the set of non relaxable design variables is empty, we proceed with testing optimality for
relaxable design variables. Else we approximate F (ykN R ) : RnN R → R quadratically. I.e. Let
H ≈ ∇2 F (ykN R ) and G ≈ ∇F (ykN R ). Then we solve the quadratic optimization problem
min 21 dT Hd + GT d
d ∈ RnN R :
lN R − ykN R ≤ d ≤ uN R − ykN R
(III) Generating Search Directions for Discrete Design Variables using MP-Search method
Let (x_k, y_k^R, y_k^NR) be the best solution for (MIP) found so far and d = 0 ∈ R^{nNR+nR}.
Then we vary each component y_i of the design vector y = (y_k^R, y_k^NR) to the left neighbor
ỹ, where ỹ ∈ S_i^R if i = 1, ..., nR and ỹ ∈ S_{i−nR}^NR if i = nR + 1, ..., nR + nNR, i.e.
y = y − e_i y_i + e_i ỹ, and solve

    min_{x ∈ R^n}  f(x, y)
    s.t.           g(x, y) ≥ 0
                   l_i ≤ x_i ≤ u_i , i = 1, ..., nC
• k =k+1
• i=i+1
• di = −1
Else set
• y = y + ei yi − ei ỹ
• i=i+1
• k =k+1
• i=i+1
• di = +1
Else set
• y = y + ei yi − ei ỹ
• i=i+1
Determination of Steplength
Set
    • If d_i = −1 set (ỹ^R)_i = y_i
    • If d_i = +1 set (ỹ^R)_i = y_i
    • ỹ_i^NR = y ∈ S_i^NR :  |y − ỹ_i^NR/2| = min_{ŷ ∈ S_i^NR} |ŷ − ỹ_i^NR/2|
If ỹ R = ykR and ỹ N R = ykN R restart MP-Search or QP-Search to retrieve a new search direction.
Else go to (*).
21.4. Program Documentation
about errors.
0: no additional output (default)
– "OutputUnit" Put output unit for the Fortran output.
– "OutUnit" Same as ”OutputUnit”
• Put( const char * pcParam , const double * pdValue ) ;
or
Put( const char * pcParam , const double dValue ) ;
where pcParam is one of the parameters of the continuous optimizer.
• Put( const char * pcParam , const char * pcValue ) ;
or
Put( const char * pcParam , const char cValue ) ;
where pcParam = "OutputFileName" to set the name of the output file.
• Put( const char * pcParam , const bool * pbValue ) ;
or
Put( const char * pcParam , const bool bValue ) ;
where pcParam is one of the parameters of the continuous optimizer.
f) Put the number of inequality constraint functions to the object:
PutNumIneqConstr( int iNumIneqConstr ) ;
g) Put the number of equality constraint functions to the object:
PutNumEqConstr( int iNumEqConstr ) ;
3. Set boundary values for continuous and integer variables and provide initial guess.
a) Set upper and lower bounds using
PutLowerBound( const int iVariableIndex, const double dLowerBound );
PutUpperBound( const int iVariableIndex, const double dUpperBound );
for iVariableIndex = 0,...,NumContVariables + NumIntVariables - 1.
or
PutUpperBound( const double * pdXu );
PutLowerBound( const double * pdXl );
where pdXu and pdXl must be pointing on arrays of length equal to the number of
continuous and integer design variables .
5. Check status of the object using GetStatus( int & iStatus) const
• if iStatus equals SolvedId(): The final accuracy has been achieved, the problem is
solved.
• if iStatus is > 0: An error occurred during the solution process.
• if iStatus equals EvalGradId():
MipOptimizer needs new values for gradients → 6c Providing new gradient values.
After passing these values to the MipOptimizer object go to 4.
7. Output
• GetConstrValVec( const double * & pdPointer ) const
Returns a pointer to an array containing the values of all constraint functions at the
last solution vector.
• GetConstrVal( const int iConstrIdx, double & dConstrVal ) const
Returns the value of the iConstrIdx’th constraint function at the last solution vector.
Some of the termination reasons depend on the accuracy used for approximating gradients.
If we assume that all functions and gradients are computed within machine precision and that
the implementation is correct, there remain only the following possibilities that could cause an
error message:
1. The termination parameter TermAcc is too small, so that the numerical algorithm plays
around with round-off errors without being able to improve the solution. Especially the
Hessian approximation of the Lagrangian function becomes unstable in this case. A
straightforward remedy is to restart the optimization cycle again with a larger stopping
tolerance.
2. The constraints are contradicting, i.e., the set of feasible solutions is empty. There is no
way to find out, whether a general nonlinear and non-convex set possesses a feasible point
or not. Thus, the nonlinear programming algorithms will proceed until running in any of
the mentioned error situations. In this case, the correctness of the model must be very
carefully checked.
3. Constraints are feasible, but some of them are degenerate, for example if some of the
   constraints are redundant. One should know that SQP algorithms assume the satisfaction
   of the so-called constraint qualification, i.e., that the gradients of active constraints
   are linearly independent at each iterate and in a neighborhood of an optimal solution. In
   this situation, it is recommended to check the formulation of the model constraints.
However, some of the error situations also occur if, because of wrong or non-accurate gradients,
the quadratic programming subproblem does not yield a descent direction for the underlying
merit function. In this case, one should try to improve the accuracy of function evaluations,
scale the model functions in a proper way, or start the algorithm from other initial values.
22. MisqpWrapper - A Mixed Integer SQP
Extension
MisqpWrapper is a C++ wrapper for the Fortran subroutine MISQP. MISQP is an implementation
of a sequential quadratic programming method that addresses nonlinear optimization problems.
The algorithm applies trust region techniques instead of a line search approach. MISQP allows
continuous, binary, integer and catalogued variables, the latter of which can only take values
within a given discrete set. Thus, MISQP also solves mixed-integer nonlinear programming
problems. To simplify notation, only the mixed-integer problem formulation is considered in
the subsequent description of the method. Moreover, all types of variables except the
continuous ones are denoted as integer variables from now on. All integer variables are
assumed to be non-relaxable; therefore, the variables are not evaluated at fractional values.
Partial derivatives with respect to integer variables are approximated internally.
Alternatively, these partial derivatives can be provided by the user. Numerical results are
included which show the behavior of the code with respect to different parameter settings.
Sections 22.1, 22.2 and 22.3 are taken from Exler et al. [40].
22.1. Introduction
We consider the general optimization problem to minimize an objective function f under non-
linear equality and inequality constraints,
    min   f(x, y)
    s.t.  g_j(x, y) = 0 ,  j = 1, ..., me
          g_j(x, y) ≥ 0 ,  j = me + 1, ..., m                                  (22.1)
          x_l ≤ x ≤ x_u
          y_l ≤ y ≤ y_u
    with  x ∈ R^{nc}, y ∈ Z^{ni},
where x and y denote the vectors of the continuous and integer variables, respectively. The
vector y of length ni contains binary, integer and catalogued variables, in this order. Note
that the binary variables can only take the values 0 and 1. It is assumed that the problem
functions f(x, y) and g_j(x, y), j = 1, ..., m, are continuously differentiable with respect
to all x ∈ R^{nc}, where nc denotes the number of continuous variables.
Trust region methods have been invented many years ago first for unconstrained optimization,
especially for least squares optimization, see for example Powell [107], or Moré [95]. Extensions
were developed for non-smooth optimization, see Fletcher [43], and for constrained optimiza-
tion, see Celis [21], Powell and Yuan [110], Byrd et al. [20], Toint [158] and many others. A
comprehensive review on trust region methods is given by Conn, Gould, and Toint [26].
On the other hand, sequential quadratic programming or SQP methods belong to the most
frequently used algorithms to solve practical optimization problems. The theoretical background
is described e.g. in Stoer [151], Fletcher [44], or Gill et al. [54].
However, the situation becomes much more complex if additional integer variables must be
taken into account. Numerous algorithms have been proposed in the past, see for example
Floudas [48] or Grossmann and Kravanja [61] for review papers. Typically, these approaches
require convex model functions and continuous relaxations of integer variables. By a continuous
relaxation, we understand that integer variables can be treated as continuous variables, i.e.,
function values can also be computed between successive integer points. There are branch-
and-bound methods where a series of relaxed nonlinear programs obtained by restricting the
variable range of the relaxed integer variables must be solved, see Gupta and Ravindran [62] or
Borchers and Mitchell [15]. When applying an SQP algorithm for solving a subproblem, it is
possible to apply early branching, see also Leyffer [83]. Pattern search algorithms are available to
search the integer space, see Audet and Dennis [3]. After replacing the integrality condition by
continuous nonlinear constraints, it is possible to solve the resulting highly nonconvex program
by a global optimization algorithm, see e.g. Li and Chou [84]. Outer approximation methods are
investigated by Duran and Grossmann [36] and Fletcher and Leyffer [45], where a sequence of
alternating mixed-integer linear and nonlinear programs must be solved. Alternatively, it is also
possible to apply cutting planes similar to linear programming, see Westerlund and Pörn [165].
But fundamental assumptions are often violated in practice. Many real-life mixed-integer
problems are not relaxable, and model functions are highly nonlinear and nonconvex. Moreover,
some approaches require detection of infeasibility of nonlinear programs, a highly unattractive
feature from the computational point of view. We assume now that integer variables are not
relaxable, that there are relatively large ranges for integer values, and that the integer variables
possess some kind of physical meaning, i.e., are smooth in a certain sense. It is supposed that a
slight alteration of an integer value, say by one, changes the model functions only slightly, at least
much less than a more drastic change. Typically, relaxable programs satisfy this requirement.
In contrast, there are categorical variables, which are introduced to enumerate certain
situations and where any change leads to a completely different category and thus to a completely
different response.
A practical situation is considered by Bünner and Schittkowski [17], where typical integer
variables are the number of fingers and the number of layers of an electrical filter. By increasing
or decreasing the number of fingers by one, we expect only a small alteration in the total
response, the transmission energy. The more drastic the variation of an integer variable, the
more drastically the model function values are supposed to change.
Thus, we propose an alternative idea in Section 22.2 where we try to approximate the La-
grangian subject to the continuous and integer variables by a quadratic function based on a
quasi-Newton update formula. Instead of a line search as is often applied in the continuous
case, we use trust regions to stabilize the algorithm and to enforce convergence following the
continuous trust region method of Yuan [171] with second order corrections. The specific form
of the quadratic programming subproblem avoids difficulties with inconsistent linearized con-
straints and leads to a convex mixed-integer quadratic programming problem, which can be
solved by any available algorithm, for example, a branch-and-cut method. Algorithmic details
and numerical results are reported in Exler et al. [39].
22.2. The Mixed-Integer Trust Region SQP Method
In mixed-integer optimization, we have no local optimality criterion to decide whether we are
close to an optimal solution, nor can we retrieve any information about the position of the
optimal integer solution from the continuous solution of the corresponding relaxed problem.
The basic idea of a trust region method is to compute a new iterate by a second order model
or a close approximation, see Exler and Schittkowski [41] or Exler et al. [39] for more details.
The stepsize is restricted by a trust region radius ∆k , where k denotes the k-th iteration step.
Subsequently, a ratio rk of the actual and the predicted improvement subject to a certain merit
function is computed. The trust region radius is either enlarged or decreased depending on the
deviation of rk from the ideal value rk = 1. If sufficiently close to a solution, the artificial bound
∆k should not become active, so that the new trial step proposed by the second order model
can always be accepted.
For the continuous case, the superlinear convergence rate can be proved, see Fletcher [43],
Burke [18], or Yuan [170]. The individual steps depend on a large number of constants, which
are carefully selected based on numerical experience.
The goal is to apply a trust region SQP algorithm and to adapt it to the mixed-integer case
with as few alterations as possible. Due to the integer variables, the quadratic programming
subproblems that have to be solved are mixed-integer quadratic programming problems and
can be solved by any available algorithm. Since the generated subproblems are always convex,
we apply the branch-and-cut code MIQL of Lehmann et al. [79]. The mixed-integer quadratic
programming problems to be solved are of the form

    min ½ dᵀBₖd + ∇f(xₖ, yₖ)ᵀd + σₖ μ

             gⱼ(xₖ, yₖ) + ∇gⱼ(xₖ, yₖ)ᵀd ≥ −μ , j = 1, . . . , m ,
             −gⱼ(xₖ, yₖ) − ∇gⱼ(xₖ, yₖ)ᵀd ≥ −μ , j = 1, . . . , mₑ ,
    d ∈ Rⁿᶜ × Zⁿⁱ , μ ∈ R :
             max(xⱼˡ − xⱼ⁽ᵏ⁾, −Δₖᶜ) ≤ dⱼᶜ ≤ min(xⱼᵘ − xⱼ⁽ᵏ⁾, Δₖᶜ) , j = 1, . . . , nc ,      (22.2)
             max(yⱼˡ − yⱼ⁽ᵏ⁾, −Δₖⁱ) ≤ dⱼⁱ ≤ min(yⱼᵘ − yⱼ⁽ᵏ⁾, Δₖⁱ) , j = 1, . . . , ni ,
             μ ≥ 0 ,

where d := (dᶜ, dⁱ)ᵀ contains a continuous step dᶜ ∈ Rⁿᶜ and an integer step dⁱ ∈ Zⁿⁱ. The
solution is denoted by dₖ and μₖ.
Since we do not assume that (22.1) is relaxable, i.e., that f and g₁, . . . , gₘ can be evaluated
at fractional values of the integer variables, we approximate the first derivatives of f(x, y)
by the difference formula
    dyⱼ f(x, y) ≈ ½ ( f(x, y₁, . . . , yⱼ + 1, . . . , yₙᵢ) − f(x, y₁, . . . , yⱼ − 1, . . . , yₙᵢ) )      (22.3)
for j = 1, . . . , ni, at neighboring grid points, see Exler et al. [39] for the detailed procedure. If
either yⱼ + 1 or yⱼ − 1 violates a bound, we apply a non-symmetric difference formula. Similarly,
dy gⱼ(x, y) denotes a difference formula for the first derivatives of gⱼ(x, y) computed at
neighboring grid points.
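For illustration, the integer difference formula (22.3) with its non-symmetric fallback at the bounds can be sketched as follows; this is a self-contained sketch, not part of the NLP++ API, and all names are made up:

#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

// Sketch of the two-sided difference formula (22.3) for the j-th integer
// variable, evaluated at the neighboring grid points y_j + 1 and y_j - 1.
// If one of the two neighbors violates the bounds [iLow, iUp], a one-sided
// (non-symmetric) formula is used instead, as described in the text.
double IntegerDiff( const std::function<double(const std::vector<int>&)>& f,
                    std::vector<int> y, int j, int iLow, int iUp )
{
    if ( y[j] + 1 <= iUp && y[j] - 1 >= iLow )   // two-sided formula (22.3)
    {
        y[j] += 1; const double dPlus  = f( y );
        y[j] -= 2; const double dMinus = f( y );
        return 0.5 * ( dPlus - dMinus );
    }
    const double dCenter = f( y );
    if ( y[j] + 1 <= iUp )                       // forward difference
    {
        y[j] += 1;
        return f( y ) - dCenter;
    }
    y[j] -= 1;                                   // backward difference
    return dCenter - f( y );
}

For a quadratic function the two-sided formula reproduces the exact derivative at the grid point, which makes the sketch easy to check by hand.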
The adaptation of the continuous trust region radius must be modified, since a trust region
radius smaller than one does not allow any further change of integer variables. Therefore, two
different radii are defined, ∆ck for the continuous and ∆ik for the integer variables. We prevent
∆ik from falling below one by ∆ik+1 := max(1, 2∆ik ). Note, however, that the algorithm allows a
decrease of ∆ik below one. In this situation, integer variables are fixed and a new step is made
only subject to the continuous variables. As soon as a new iterate is accepted, we set ∆ik to one
in order to be able to continue optimization over the whole range of variables.
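The interplay of the two radii can be sketched as follows; the shrink/enlarge factors and the thresholds on the ratio rₖ are illustrative assumptions, not the constants used inside MISQP:

#include <algorithm>
#include <cassert>
#include <cmath>

// Illustrative sketch of the two trust region radii described above.
struct TrustRegionRadii
{
    double dCont;  // radius for the continuous variables
    double dInt;   // radius for the integer variables
};

TrustRegionRadii UpdateRadii( TrustRegionRadii delta, double rk )
{
    if ( rk < 0.25 )                    // poor model agreement: shrink
    {
        delta.dCont *= 0.5;
        // the integer radius may drop below one; integer variables are
        // then fixed until a new iterate is accepted
        delta.dInt = std::floor( delta.dInt / 2.0 );
    }
    else if ( rk > 0.75 )               // good agreement: enlarge, but
    {                                   // keep the integer radius >= 1
        delta.dCont *= 2.0;
        delta.dInt = std::max( 1.0, 2.0 * delta.dInt );
    }
    return delta;
}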
and ‖g(xₖ, yₖ)⁻‖∞ < εₜ², where g(xₖ, yₖ)⁻ represents the vector of constraint violations. We
take into account that a code may return a solution with a better function value than the best
known one, subject to the error tolerance of the allowed constraint violation. The tolerance for
measuring the constraint violation is smaller than for the error in the objective function, since
we apply a relative measure in the latter case.
Optimization algorithms for continuous optimization have the advantage that in each step,
local optimality criteria, the so-called KKT-conditions, can be checked. The algorithm is stopped
as soon as an iterate is sufficiently feasible and satisfies these stationary conditions subject to a
user-provided tolerance.
In mixed-integer optimization, however, we do not know how to distinguish between a local
or global solution nor do we have any local criterion to find out whether we are sufficiently close
to a solution or not. Thus, our algorithm stops if constraints are satisfied subject to a tolerance,
and if no further progress is possible towards neighboring grid points as measured by a merit
function. We therefore distinguish between feasible, but non-global solutions, successful solutions
as defined above, and false terminations. We call the return of a test run, say xk and yk , an
acceptable solution, if the internal termination conditions are satisfied subject to a reasonably
small tolerance ε = 10⁻⁶ and if, instead of (22.4),

    f(xₖ, yₖ) − f* ≥ εₜ |f*|      (22.5)
22.3. Performance Evaluation
holds. For our numerical tests, we use εₜ = 0.01. Note that our approach is based on a local
search method, as in the case of continuous optimization, and that a global search is never started.
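The resulting classification of a test run can be sketched as follows; the exact success criterion (22.4) is stated earlier in the text, so the thresholds used here (relative tolerance εₜ for the objective, εₜ² for the constraint violation) should be read as assumptions of this illustration:

#include <cassert>
#include <cmath>

// Sketch of the classification of a test run used in the performance
// evaluation: successful, acceptable (feasible but non-global, (22.5)),
// or a false termination.
enum class Outcome { Successful, Acceptable, FalseTermination };

Outcome ClassifyRun( double f, double fStar, double maxConstrViolation,
                     bool terminatedNormally, double epsT = 0.01 )
{
    if ( !terminatedNormally || maxConstrViolation >= epsT * epsT )
        return Outcome::FalseTermination;
    if ( f - fStar <= epsT * std::fabs( fStar ) )  // assumed form of (22.4)
        return Outcome::Successful;
    return Outcome::Acceptable;                    // (22.5) holds
}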
There is a basic difficulty when evaluating statistical comparative scores by mean values for
a series of test problems and different computer codes. It might happen that the less reliable
codes do not solve the more complex test problems successfully, but the more advanced ones
solve them with additional numerical efforts, say calculation time or number of function calls. A
direct evaluation of mean values over successful test runs would thus penalize the more reliable
algorithms.
A more realistic possibility is to compute mean values of the criteria we are interested in, and
to compare the codes pairwise over sets of test examples, which are successfully solved by two
codes. We then get a reciprocal ncode × ncode matrix, where ncode is the number of codes under
consideration. The largest eigenvalue of this matrix is positive and we compute its normalized
eigenvector from which we retrieve priority scores. In a final step, we normalize the eigenvector
so that the smallest coefficient gets the value one. The idea is known under the name priority
theory, see Saaty [114], and has been used by Schittkowski [116] for comparing 27 optimization
codes, see also Exler et al. [39] for another comparative study of mixed-integer codes.
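For illustration, the priority scores can be approximated by a simple power iteration on the pairwise comparison matrix. This sketch is independent of NLP++ and assumes a positive reciprocal matrix A, with A[i][j] the performance ratio of code i versus code j and A[j][i] = 1/A[i][j]:

#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Power iteration approximating the eigenvector of the largest eigenvalue
// of a positive reciprocal matrix; the result is normalized so that the
// smallest coefficient equals one, as described in the text.
std::vector<double> PriorityScores( const std::vector<std::vector<double>>& A,
                                    int nIter = 100 )
{
    const std::size_t n = A.size();
    std::vector<double> v( n, 1.0 ), w( n );
    for ( int it = 0; it < nIter; ++it )
    {
        for ( std::size_t i = 0; i < n; ++i )
        {
            w[i] = 0.0;
            for ( std::size_t j = 0; j < n; ++j ) w[i] += A[i][j] * v[j];
        }
        const double dNorm = *std::max_element( w.begin(), w.end() );
        for ( std::size_t i = 0; i < n; ++i ) v[i] = w[i] / dNorm;
    }
    const double dMin = *std::min_element( v.begin(), v.end() );
    for ( double& vi : v ) vi /= dMin;   // smallest coefficient becomes one
    return v;
}

For a consistent 2-by-2 comparison matrix the scores reproduce the pairwise ratio directly, which makes the sketch easy to verify.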
We use the following criteria to compare the robustness and efficiency of the codes:

• nsucc - percentage of successful and acceptable test runs according to the above definitions,
• nacc - percentage of acceptable, i.e., of non-global feasible solutions, see the above definition,
• nerr - percentage of errors, i.e., of terminations with iStatus > 0,
• pfunc - relative priority of equivalent function calls, including function calls used for gradient approximations,
• nfunc - average number of equivalent function calls, including function calls used for computing a descent direction or gradient approximations, evaluated over all successful test runs, where one function call consists of one evaluation of the objective function and all constraint functions at a given x and y,
• pgrad - relative priority of the number of iterations,
• ngrad - average number of iterations or gradient evaluations subject to the continuous variables, respectively, evaluated over all successful test runs,
• ptime - relative priority of execution times,
• ntime - average execution time in seconds, evaluated over all successful test runs.
The following parameter settings are defined to compare several versions of MISQP against
the standard options:
Version  Option                  Value  Description
1        -                       -      partial derivatives subject to continuous variables computed by two-sided differences
2        -                       -      external function scaling
3        "InternalScaling"       0      internal function scaling suppressed
4        "ModifyMatrix"          0      internal scaling of matrix Bk suppressed
5        "UserProvDerivatives"   1      partial derivatives subject to discrete variables computed externally by forward differences
code nsucc (%) nacc (%) nerr (%) pf unc nf unc pgrad ngrad ptime ntime
1 85.2 12.7 4.2 2.0 1,047 1.7 24 1.3 0.2
2 85.9 11.9 2.1 1.5 950 1.5 22 1.8 0.2
3 82.4 10.5 7.0 1.5 566 1.5 18 1.3 0.1
4 62.7 35.2 2.1 1.7 801 1.5 15 1.0 0.3
5 82.4 14.1 3.5 2.0 1,222 2.3 31 1.0 0.3
6 61.3 35.2 3.5 1.0 541 1.0 12 1.0 0.1
7 84.5 14.8 0.7 2.0 1,191 1.9 26 1.4 0.3
8 83.1 14.8 2.1 1.1 547 1.6 24 1.5 0.2
9 78.2 19.7 2.8 2.1 553 1.4 16 1.4 0.2
22.4. Program Documentation
code nsucc (%) nacc (%) nerr (%) nf unc ngrad ntime
1 85.2 12.7 4.2 1,047 24 0.2
10 79.6 16.2 4.2 924 24 0.01
Table 22.2.: Performance Results of Mixed-Integer versus Continuously Relaxed Test Problems
• Less accurate partial derivatives subject to the continuous variables decrease the number of
function evaluations significantly, but make the code a bit less reliable.
• Monotone trust region updates decrease the number of function calls, but fewer problems
are successfully solved.
Finally, it might be interesting to compare performance results obtained for mixed-integer
problems on the one hand and for the corresponding continuous relaxations on the other. The
test problems are relaxable and MISQP can be applied for solving continuous optimization
problems in an efficient way. The code is then a typical SQP implementation with trust region
and second order stabilization.
Table 22.2 contains numerical results, where code number 10 stands for the relaxed version
of MISQP. Also in this case, we apply a two-sided difference formula for partial derivatives to
avoid side effects due to inaccurate gradients.
The relaxed version of MISQP computes better optimal solutions for 106 test runs, as must
be expected. Very surprisingly, the numerical solution of mixed-integer test problems requires
about the same number of function evaluations as the solution of the relaxed counterparts.
Because of a large number of branch-and-bound steps when solving the mixed-integer quadratic
subproblems, computation times for solving mixed-integer programs are much higher. But for
the practical applications with time consuming function calls we have in mind, these additional
efforts are tolerable and in many situations negligible compared to a large amount of internal
computations of a complex simulation code.
There is still another important conclusion. About 96 % of the relaxed and the mixed-integer
test problems are successfully solved by MISQP, i.e., the code terminates at a feasible iterate
where the optimality criteria are satisfied subject to a tolerance of 10⁻⁶. Thus, we should not
expect that MISQP will be able to get significantly higher scores for nsucc for the much more
complex mixed-integer version of our test problem suite. Another conclusion is that about
15 % of the test problems are non-convex.
3. Set boundary values for continuous and integer variables and provide initial guess. Note
that boundary values for binary variables are set to 0 and 1 automatically.
a) Set upper and lower bounds using
PutLowerBound( const int iVariableIndex, const double dLowerBound )
PutUpperBound( const int iVariableIndex, const double dUpperBound )
for iVariableIndex = 0,...,NumberOfContinuousVariables +
NumberOfBinaryVariables + NumberOfInteger - 1
or
PutUpperBound( const double * pdXu );
PutLowerBound( const double * pdXl );
where pdXu and pdXl must point to arrays of length equal to the number of
continuous and integer (without catalogued) design variables.
Before setting an initial guess, the number of design variables must be set.
5. Check status of the object using GetStatus( int & iStatus) const
• if iStatus equals SolvedId():
The final accuracy has been achieved, the problem is solved.
• if iStatus is > 0:
An error occurred during the solution process.
• if iStatus equals EvalFuncId() :
MisqpWrapper needs new function values → 6b New function values.
After passing these values to the MisqpWrapper object go to 4.
• if iStatus equals EvalGradId():
MisqpWrapper needs new values for gradients → 6c Providing new gradient values.
After passing these values to the MisqpWrapper object go to 4.
23. Example
We solve the following problem using DefaultLoop():
Minimize x0 + x1 + x2 + x3 + x4 + x5
S.T.
−x2 − x4 + 0.6 ≥ 0
x2 + x4 − 0.5 ≥ 0
−1 ≤ x0 ≤ 1 (real)
−1 ≤ x1 ≤ 1 (real)
−1 ≤ x2 ≤ 1 (integer, relaxable)
−1 ≤ x3 ≤ 2 (integer, relaxable)
0.5 ≤ x4 ≤ 5.4 (∈ {0.5, 3.5, 5.4}, not relaxable)
2 ≤ x5 ≤ 10 (integer, not relaxable)
Initial guess:
x0 = 1
x1 = 1
x2 = 1
x3 = 2
x4 = 5.4
x5 = 10
Optimal solution:
x0 = −1.0
x1 = −1.0
x2 = 0
x3 = −1
x4 = 0.5
x5 = 2
fopt = −0.5
The file Example41.h contains the class declaration:

#ifndef Example41_H_INCLUDED
#define Example41_H_INCLUDED

#include "OptProblem.h"

class Example41 : public OptProblem
{
public:

    Example41();

    int FuncEval( bool bGradApprox = false );
    int GradEval();
    int SolveOptProblem();
};

#endif
The file Example41.cpp contains the implementation:
#include "Example41.h"
#include <iostream>
#include <cmath>
#include "MidacoWrapper.h"
#include "MisqpWrapper.h"
#include "MipOptimizer.h"
#include "ScpWrapper.h"
#include "SqpWrapper.h"

using std::cout;
using std::endl;
using std::cin;

Example41::Example41()
{
    char c;
    cout << endl << "--- this is Example 41 ---" << endl << endl;

    // Choose optimizer:
    do
    {
        cout << "Choose an optimizer:\n\n";
        cout << "  [1] MipOptimizer\n";
        cout << "  [2] MidacoWrapper\n";
        cout << "  [3] MisqpWrapper\n";

        cin >> c;
    }
    while ( c != '1' && c != '2' && c != '3' );

    if ( c == '2' )
    {
        m_pOptimizer = new MidacoWrapper();
    }
    else if ( c == '3' )
    {
        m_pOptimizer = new MisqpWrapper();
    }
    else
    {
        m_pOptimizer = new MipOptimizer();

        // Choose continuous optimizer and attach it using
        // AttachScalarOptimizer. The second argument tells
        // MipOptimizer to delete the continuous optimizer
        // after use.
        // Currently only ScpWrapper and SqpWrapper can be used.
        do
        {
            cout << endl << "Choose a scalar optimizer:\n\n";
            cout << "  [q] SqpWrapper\n";
            cout << "  [c] ScpWrapper\n";

            cin >> c;

            if ( c == 'q' )
            {
                m_pOptimizer->AttachScalarOptimizer( new SqpWrapper(), true );
            }
            else if ( c == 'c' )
            {
                m_pOptimizer->AttachScalarOptimizer( new ScpWrapper(), true );
            }
            else
            {
                cout << "illegal input!" << endl;
            }
        }
        while ( c != 'c' && c != 'q' );
    }
}

int Example41::FuncEval( bool bGradApprox )
{
    const double * dX;
    double * pdFuncVals;
    int iNumConstr;
    int iNumObjFuns;

    m_pOptimizer->GetNumConstr( iNumConstr );
    m_pOptimizer->Get( "NumObjFuns", iNumObjFuns );
    // get a pointer to the current design variable vector
    m_pOptimizer->GetDesignVarVec( dX );

    pdFuncVals = new double[ iNumConstr + iNumObjFuns ];

    // 0th constraint
    pdFuncVals[0] = - dX[2] - dX[4] + 0.6;
    // 1st constraint
    pdFuncVals[1] = dX[2] + dX[4] - 0.5;
    // objective
    pdFuncVals[2] = dX[0] + dX[1] + dX[2] + dX[3] + dX[4] + dX[5];

    m_pOptimizer->PutObjVal( pdFuncVals[ iNumConstr ] );
    m_pOptimizer->PutConstrValVec( pdFuncVals );

    delete [] pdFuncVals;

    return EXIT_SUCCESS;
}

int Example41::GradEval()
{
    const double * dX;
    int iNumDv;

    m_pOptimizer->GetNumDv( iNumDv );
    m_pOptimizer->GetDesignVarVec( dX );

    for ( int iDvIdx = 0; iDvIdx < iNumDv; iDvIdx++ )
    {
        m_pOptimizer->PutDerivObj( iDvIdx, 1 );
        if ( ( iDvIdx == 2 ) || ( iDvIdx == 4 ) )
        {
            m_pOptimizer->PutDerivConstr( 0, iDvIdx, -1 );
            m_pOptimizer->PutDerivConstr( 1, iDvIdx, 1 );
        }
        else
        {
            m_pOptimizer->PutDerivConstr( 0, iDvIdx, 0 );
            m_pOptimizer->PutDerivConstr( 1, iDvIdx, 0 );
        }
    }
    return EXIT_SUCCESS;
}

int Example41::SolveOptProblem()
{
    int iError;

    iError = 0;

    // MidacoWrapper needs a relatively high value for
    // this parameter compared to MipOptimizer.
    m_pOptimizer->Put( "MaxNumIter", 100000 );

    // Define the optimization problem:

    m_pOptimizer->PutNumDv( 6 );
    m_pOptimizer->Put( "NumIntDv", 2 );
    m_pOptimizer->Put( "NumCatDv", 2 );

    m_pOptimizer->PutNumIneqConstr( 2 );
    m_pOptimizer->PutNumEqConstr( 0 );

    m_pOptimizer->PutUpperBound( 0, 1 );
    m_pOptimizer->PutLowerBound( 0, -1 );
    m_pOptimizer->PutInitialGuess( 0, 1 );
    m_pOptimizer->PutUpperBound( 1, 1 );
    m_pOptimizer->PutLowerBound( 1, -1 );
    m_pOptimizer->PutInitialGuess( 1, 1 );
    m_pOptimizer->PutUpperBound( 2, 1 );
    m_pOptimizer->PutLowerBound( 2, -1 );
    m_pOptimizer->PutInitialGuess( 2, 1 );
    m_pOptimizer->PutUpperBound( 3, 2 );
    m_pOptimizer->PutLowerBound( 3, -1 );
    m_pOptimizer->PutInitialGuess( 3, 2 );

    // For the 0'th catalogued variable all allowed values
    // are put separately
    m_pOptimizer->m_pVariableCatalogue[0].PutNumAllowedValues( 3 );
    m_pOptimizer->m_pVariableCatalogue[0].PutAllowedValue( 0, 0.5 );
    m_pOptimizer->m_pVariableCatalogue[0].PutAllowedValue( 1, 3.5 );
    m_pOptimizer->m_pVariableCatalogue[0].PutAllowedValue( 2, 5.4 );
    // As the allowed values of the 1'st catalogued variable are an
    // equidistant partition of an interval, the values can be put
    // by specifying the bounds and step size.
    m_pOptimizer->m_pVariableCatalogue[1].PutAllowedValues( 2.0, 1.0, 10.0 );

    m_pOptimizer->PutInitialGuess( 4, 5.4 );
    m_pOptimizer->PutInitialGuess( 5, 10 );

    // Start optimization
    iError = m_pOptimizer->DefaultLoop( this );

    // if there was an error, report it...
    if ( iError != EXIT_SUCCESS )
    {
        cout << "Error " << iError << "\n";
        return iError;
    }

    // ... else report the results:

    const double * dX;
    int iNumDesignVar;
    double dObjval;

    m_pOptimizer->GetNumDv( iNumDesignVar );

    m_pOptimizer->GetDesignVarVec( dX );

    for ( int iDvIdx = 0; iDvIdx < iNumDesignVar; iDvIdx++ )
    {
        cout << "dX[" << iDvIdx << "] = " << dX[iDvIdx] << endl;
    }

    m_pOptimizer->GetObjVal( dObjval );

    cout << "Objective function value: " << dObjval << "\n";

    return EXIT_SUCCESS;
}
Part IV.
24. NlpmmxWrapper - Constrained Min-Max
Optimization
The Fortran subroutine NLPMMX by Schittkowski solves constrained min-max nonlinear pro-
gramming problems, where the maximum of absolute nonlinear function values is to be min-
imized. It is assumed that all functions are continuously differentiable. By introducing one
additional variable and nonlinear inequality constraints, the problem is transformed into a gen-
eral smooth nonlinear program subsequently solved by the sequential quadratic programming
(SQP) code NLPQLP. The sections 24.1 and 24.2 are taken from Schittkowski [142].
24.1. Introduction
Min-max optimization problems consist of minimizing the maximum of finitely many given
functions,
    min max{fᵢ(x), i = 1, . . . , l}
             gⱼ(x) = 0 , j = 1, . . . , mₑ ,
    x ∈ Rⁿ : gⱼ(x) ≥ 0 , j = mₑ + 1, . . . , m ,      (24.1)
             xˡ ≤ x ≤ xᵘ .
We consider the constrained nonlinear min-max problem (24.1), and introduce one additional
variable, z, and l additional nonlinear inequality constraints of the form
    z − fᵢ(x) ≥ 0 , i = 1, . . . , l .      (24.2)
This leads to the equivalent smooth nonlinear program

    min z
                    gⱼ(x) = 0 , j = 1, . . . , mₑ ,
    (x, z) ∈ Rⁿ⁺¹ : gⱼ(x) ≥ 0 , j = mₑ + 1, . . . , m ,      (24.3)
                    z − fᵢ(x) ≥ 0 , i = 1, . . . , l ,
                    xˡ ≤ x ≤ xᵘ .

In this case, the quadratic programming subproblem which has to be solved in each step of an
SQP method has the form

    min ½ (dᵀ, e) Bₖ (dᵀ, e)ᵀ + e
                    ∇gⱼ(xₖ)ᵀd + gⱼ(xₖ) = 0 , j = 1, . . . , mₑ ,
    (d, e) ∈ Rⁿ⁺¹ : ∇gⱼ(xₖ)ᵀd + gⱼ(xₖ) ≥ 0 , j = mₑ + 1, . . . , m ,      (24.4)
                    zₖ + e − fᵢ(xₖ) − ∇fᵢ(xₖ)ᵀd ≥ 0 , i = 1, . . . , l ,
                    xˡ − xₖ ≤ d ≤ xᵘ − xₖ .

A new iterate is then obtained by

    xₖ₊₁ = xₖ + αₖ dₖ ,   zₖ₊₁ = zₖ + αₖ eₖ ,

where dₖ ∈ Rⁿ and eₖ ∈ R are a solution of (24.4) and αₖ a steplength parameter obtained
from forcing a sufficient descent of a merit function.
24.3. Program Documentation
5. Check status of the object using GetStatus( int & iStatus) const
• if iStatus equals SolvedId(): The final accuracy has been achieved, the problem is
solved.
7. Output
• GetConstrValVec( const double * & pdPointer ) const
Returns a pointer to an array containing the values of all constraint functions at the
last solution vector.
• GetConstrVal( const int iConstrIdx, double & dConstrVal ) const
Returns the value of the iConstrIdx’th constraint function at the last solution vector.
• GetDerivConstr( const int iConstrIdx , const int iVariableIdx,
double & dDerivativeValue ) const
Returns the value of the derivative of the iConstrIdx’th constraint with respect to
the iVariableIdx’th design variable at the last solution vector.
25. NlpinfWrapper - Constrained Data Fitting
in the L∞-Norm
The Fortran subroutine NLPINF by Schittkowski solves constrained min-max or L∞ nonlin-
ear programming problems, where the maximum of absolute nonlinear function values is to be
minimized. It is assumed that all functions are continuously differentiable. By introducing one
additional variable and nonlinear inequality constraints, the problem is transformed into a gen-
eral smooth nonlinear program subsequently solved by the sequential quadratic programming
(SQP) code NLPQLP. An important application is data fitting, where the distance of experi-
mental data from a model function evaluated at given experimental times is minimized by the
L∞ or maximum norm, respectively. The usage of the code is documented, and an illustrative
example is presented. The sections 25.1 and 25.2 are taken from Schittkowski [140].
25.1. Introduction
Min-max optimization problems arise in many practical situations, for example in approximation
or when fitting a model function to given data in the L∞ -norm. In this particular case, a
mathematical model is available in form of one or several equations, and the goal is to estimate
some unknown parameters of the model. Exploited are available experimental data, to minimize
the distance of the model function, in most cases evaluated at certain time values, from measured
data at the same time values. An extensive discussion of data fitting especially in case of
dynamical systems is given by Schittkowski [128].
The mathematical problem we want to solve is given in the form

    min max{|fᵢ(x)|, i = 1, . . . , l}
             gⱼ(x) = 0 , j = 1, . . . , mₑ ,
    x ∈ Rⁿ : gⱼ(x) ≥ 0 , j = mₑ + 1, . . . , m ,      (25.1)
             xˡ ≤ x ≤ xᵘ .
With one additional variable z, we introduce the constraints

    fᵢ(x) + z ≥ 0 ,   −fᵢ(x) + z ≥ 0 ,   i = 1, . . . , l .      (25.2)

The transformed problem reads

    min z
                    gⱼ(x) = 0 , j = 1, . . . , mₑ ,
                    gⱼ(x) ≥ 0 , j = mₑ + 1, . . . , m ,
    (x, z) ∈ Rⁿ⁺¹ : fᵢ(x) + z ≥ 0 , i = 1, . . . , l ,      (25.3)
                    −fᵢ(x) + z ≥ 0 , i = 1, . . . , l ,
                    xˡ ≤ x ≤ xᵘ ,
                    z ≥ 0 .
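As a quick numerical illustration of this transformation (a sketch, independent of NLP++): for fixed residual values fᵢ, the smallest z that is feasible for the constraints fᵢ + z ≥ 0, −fᵢ + z ≥ 0 and z ≥ 0 is exactly the maximum norm of the residual vector:

#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Smallest z feasible for the constraints of (25.3) at fixed residuals f_i:
// z must dominate |f_i| for every i, and z >= 0, so the minimum is the
// L-infinity norm max_i |f_i|.
double SmallestFeasibleZ( const std::vector<double>& f )
{
    double z = 0.0;                          // constraint z >= 0
    for ( double fi : f )
        z = std::max( z, std::fabs( fi ) );  // z >= |f_i| for every i
    return z;
}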
In this case, the quadratic programming subproblem which has to be solved in each step of an
SQP method has the form

    min ½ (dᵀ, e) Bₖ (dᵀ, e)ᵀ + e
                    ∇gⱼ(xₖ)ᵀd + gⱼ(xₖ) = 0 , j = 1, . . . , mₑ ,
                    ∇gⱼ(xₖ)ᵀd + gⱼ(xₖ) ≥ 0 , j = mₑ + 1, . . . , m ,
    (d, e) ∈ Rⁿ⁺¹ : fᵢ(xₖ) + ∇fᵢ(xₖ)ᵀd + zₖ + e ≥ 0 , i = 1, . . . , l ,      (25.4)
                    −fᵢ(xₖ) − ∇fᵢ(xₖ)ᵀd + zₖ + e ≥ 0 , i = 1, . . . , l ,
                    xˡ − xₖ ≤ d ≤ xᵘ − xₖ ,
                    e ≥ 0 .
xk+1 = xk + αk dk , zk+1 = zk + αk ek ,
where dk ∈ Rn and ek ∈ R are a solution of (25.4) and αk a steplength parameter obtained from
forcing a sufficient descent of a merit function.
25.3. Program Documentation
The proposed transformation (25.3) is independent of the SQP method used, so that available
codes can be used in the form of a black box. However, an active set strategy is recommended
to reduce the number of constraints, if l becomes large, e.g., the code NLPQLB (see chapter 3
or [136]).
A final remark concerns the theoretical convergence of the algorithm. Since the original prob-
lem is transformed into a general nonlinear programming problem, we can apply all convergence
results known for SQP methods. If an augmented Lagrangian function is preferred for the merit
function, a global convergence theorem is found in Schittkowski [120], see also [126] for con-
vergence of the active set strategy. The theorem states that when starting from an arbitrary
initial value, a Karush-Kuhn-Tucker point is approximated, i.e., a point satisfying the necessary
optimality conditions. If, on the other hand, an iterate is sufficiently close to an optimal solu-
tion and if the steplength is 1, then the convergence speed of the algorithm is superlinear, see
Powell [105] for example. This remark explains the fast final convergence rate one observes in
practice.
5. Check status of the object using GetStatus( int & iStatus) const
• if iStatus equals SolvedId(): The final accuracy has been achieved, the problem is
solved.
• if iStatus is > 0: An error occurred during the solution process.
• if iStatus equals EvalFuncId(): NlpinfWrapper needs new function values → 6b
New function values.
After passing these values to the NlpinfWrapper object go to 4.
• if iStatus equals EvalGradId():
NlpinfWrapper needs new values for gradients → 6c Providing new gradient values.
After passing these values to the NlpinfWrapper object go to 4.
7. Output
• GetConstrValVec( const double * & pdPointer ) const
Returns a pointer to an array containing the values of all constraint functions at the
last solution vector.
• GetConstrVal( const int iConstrIdx, double & dConstrVal ) const
Returns the value of the iConstrIdx’th constraint function at the last solution vector.
• GetDerivConstr( const int iConstrIdx , const int iVariableIdx,
double & dDerivativeValue ) const
Returns the value of the derivative of the iConstrIdx’th constraint with respect to
the iVariableIdx’th design variable at the last solution vector.
• GetDerivModel( const int iVariableIdx, double & dDerivativeValue,
const int iSuppPtIdx ) const
Returns the value of the derivative of the model function with respect to the iVari-
ableIdx'th design variable at the iSuppPtIdx'th supporting point.
• GetDesignVar( const int iVariableIdx, double & dValue, const int iParSysIdx
) const
with iParSysIdx = 0 (default) returns the value of the iVariableIdx’th design vari-
able in the last solution vector.
• GetDesignVarVec( const double * & pdPointer, const int iParSysIdx) const
with iParSysIdx = 0 (default) returns a pointer to the last solution vector.
• Get( const char * pcParam, double & dResidual )
with pcParam = "Residual" returns the residual.
26. NlpL1Wrapper - Constrained Data Fitting
in the L1-Norm
The Fortran subroutine NLPL1 by Schittkowski solves constrained nonlinear programming prob-
lems, where the sum of absolute nonlinear function values is to be minimized. It is assumed that
all functions are continuously differentiable. By introducing additional variables and nonlinear
inequality constraints, the problem is transformed into a general smooth nonlinear program
subsequently solved by the sequential quadratic programming (SQP) code NLPQLP. The usage
of the code is documented, and an illustrative example is presented. The sections 26.1 and 26.2
are taken from Schittkowski [141].
26.1. Introduction
L1 optimization problems arise in many practical situations, for example in approximation
or when fitting a model function to given data in the L1 -norm. In this particular case, a
mathematical model is available in form of one or several equations, and the goal is to estimate
some unknown parameters of the model. Exploited are available experimental data, to minimize
the distance of the model function, in most cases evaluated at certain time values, from measured
data at the same time values. An extensive discussion of data fitting especially in case of
dynamical systems is given by Schittkowski [128]. The mathematical problem we want to solve,
is given in the form
    min Σᵢ₌₁ˡ |fᵢ(x)|
             gⱼ(x) = 0 , j = 1, . . . , mₑ ,
    x ∈ Rⁿ : gⱼ(x) ≥ 0 , j = mₑ + 1, . . . , m ,      (26.1)
             xˡ ≤ x ≤ xᵘ .
It is assumed that f1 , . . ., fl and g1 , . . ., gm are continuously differentiable functions.
In this paper, we consider the question of how an existing nonlinear programming code can be
used to solve constrained L1 optimization problems in an efficient and robust way after a suitable
transformation. In a very similar way, L∞, min-max, and least squares problems can also be
solved efficiently by an SQP code, see chapters 25, 24, 27 or Schittkowski [128, 131, 134, 142].
    f_i(x) + z_i ≥ 0 ,
    −f_i(x) + z_{l+i} ≥ 0 ,      i = 1, ..., l ,        (26.2)
    x_l − x_k ≤ d ≤ x_u − x_k ,
    e ≥ 0 .

A new iterate is obtained by

    x_{k+1} = x_k + α_k d_k ,      z_{k+1} = z_k + α_k e_k ,

where d_k ∈ R^n and e_k ∈ R^{2l} are a solution of (26.4) and α_k is a steplength parameter
obtained by forcing a sufficient descent of a merit function.
The proposed transformation (26.3) is independent of the SQP method used, so that available
codes can be used in the form of a black box. Our implementation calls the code NLPQLP (see
chapter 2 or [132]).
26.3. Program Documentation
5. Check status of the object using GetStatus( int & iStatus) const
• if iStatus equals SolvedId(): The final accuracy has been achieved, the problem is
solved.
• if iStatus is > 0: An error occurred during the solution process.
• if iStatus equals EvalFuncId(): NlpL1Wrapper needs new function values → 6b
New function values.
After passing these values to the NlpL1Wrapper object go to 4.
• if iStatus equals EvalGradId():
NlpL1Wrapper needs new values for gradients → 6c Providing new gradient values.
After passing these values to the NlpL1Wrapper object go to 4.
6. Providing new function and gradient values:
a) Useful methods:
i. GetNumConstr( int & iNumConstr ) const
Returns the number of constraints.
GetNumDv( int & iNumDesignVar ) const
Returns the number of design variables.
Get( const char * pcParam, int & iNumSuppPts )
where pcParam = "NumSuppPts" returns the number of supporting points.
Get( const char * pcParam, double * & pdSuppPtTimes ),
where pcParam = "SuppPtTimes" returns a pointer to an array of ti , i = 1, ..., l,
i.e. the time coordinates of the supporting points.
Get( const char * pcParam, double * & pdSuppPtVals ),
where pcParam = "SuppPtVals" returns a pointer to an array of yi , i = 1, ..., l,
i.e. the values at the supporting points.
ii. GetDesignVarVec( const double * & pdPointer, const int i ) const
with i=0 (default) returns a pointer to the design variable vector.
ii. GetDesignVar( const int iVarIdx, double & dValue,
const int iVectorIdx ) const
with iVectorIdx = 0 (default) returns the value of the iVarIdx’th design vari-
able of the design variable vector.
b) Providing new function values
Function values of the model function h as well as of constraints have to be calculated
for the variable vectors at each supporting point.
For access to the design variable vector see 6(a)ii. After calculating the values, these
must be passed to the NlpL1Wrapper object using:
• PutConstrValVec( const double * const pdConstrVals,
const int iParSysIdx ),
with iParSysIdx=0 (default)
where pdConstrVals points to an array containing the values of each constraint
at the design variable vector provided by NlpL1Wrapper.
• PutModelVal( const double dModelVal, const int iSuppPtIdx ),
where iSuppPtIdx is the index of the supporting point (beginning with 0) and
dModelVal defines the value of the model function at the current design variable
vector.
7. Output
• GetConstrValVec( const double * & pdPointer ) const
Returns a pointer to an array containing the values of all constraint functions at the
last solution vector.
• GetConstrVal( const int iConstrIdx, double & dConstrVal ) const
Returns the value of the iConstrIdx’th constraint function at the last solution vector.
• GetDerivConstr( const int iConstrIdx , const int iVariableIdx,
double & dDerivativeValue ) const
Returns the value of the derivative of the iConstrIdx’th constraint with respect to
the iVariableIdx’th design variable at the last solution vector.
• GetDerivModel( const int iVariableIdx, double & dDerivativeValue,
const int iSuppPtIdx ) const
returns the value of the derivative of the model function with respect to the iVari-
ableIdx’th design variable at the iSuppPtIdx’th supporting point.
• GetDesignVar( const int iVariableIdx, double & dValue, const int iParSysIdx
) const
with iParSysIdx = 0 (default) returns the value of the iVariableIdx’th design vari-
able in the last solution vector.
• GetDesignVarVec( const double * & pdPointer, const int iParSysIdx) const
with iParSysIdx = 0 (default) returns a pointer to the last solution vector.
• Get( const char * pcParam, double & dResidual )
with pcParam = "Residual" returns the residual.
27. LeastSquaresWrapper - Constrained Data Fitting in the L2-Norm
LeastSquaresWrapper is a C++-Wrapper for the Fortran subroutines NLPLSX and NLPLSQ.
The Fortran subroutines by Schittkowski solve constrained least squares nonlinear programming
problems, where the sum of squared nonlinear functions is to be minimized. It is assumed that
all functions are continuously differentiable.
The subroutine NLPLSX is applied when the problem has more than 200 data points. The
problem is directly solved by the SQP-routine NLPQLP.
If the problem has 200 or less data points the subroutine NLPLSQ is applied. By introducing
additional variables and nonlinear equality constraints, the problem is transformed into a general
smooth nonlinear program subsequently solved by the sequential quadratic programming (SQP)
code NLPQLP. It can be shown that typical features of special purpose algorithms are retained,
i.e., a combination of a Gauss-Newton and a quasi-Newton search direction. The additionally
introduced variables are eliminated in the quadratic programming subproblem, so that calcula-
tion time is not increased significantly. Some comparative numerical results are included, the
usage of the code is documented, and an illustrative example is presented. Sections 27.1,
27.2, 27.3, and 27.4 are taken from Schittkowski [134].
27.1. Introduction
Nonlinear least squares optimization is extremely important in many practical situations. Typ-
ical applications are maximum likelihood estimation, nonlinear regression, data fitting, system
identification, or parameter estimation. In these cases, a mathematical model is available in
the form of one or several equations, and the goal is to estimate some unknown parameters of
the model. Available experimental data are exploited to minimize the distance of the model
function, in most cases evaluated at certain time values, from the data measured at the same
time values. An extensive discussion of data fitting, especially in the case of dynamical systems,
is given by Schittkowski [128].
The mathematical problem we want to solve is given in the form

    min   (1/2) Σ_{i=1}^{l} f_i(x)^2
x ∈ R^n :   g_j(x) = 0 ,      j = 1, ..., m_e ,
            g_j(x) ≥ 0 ,      j = m_e + 1, ..., m ,        (27.1)
            x_l ≤ x ≤ x_u .

It is assumed that f_1, ..., f_l and g_1, ..., g_m are continuously differentiable.
Although many nonlinear least squares programs were developed in the past, see Hiebert [67]
for an overview and numerical comparison, only very few programs were written for the non-
linearly constrained case, see, e.g., Holt and Fletcher [70], Lindström [86], Mahdavi-Amiri and
Bartels [91], or Fernanda et al. [42]. However, the implementation of one of these or any similar
special purpose code requires a large amount of theoretical and numerical analysis.
In this paper, we consider the question of how an existing nonlinear programming code can
be used to solve constrained nonlinear least squares problems in an efficient and robust way.
We will see that a simple transformation of the model under consideration and subsequent
solution by a sequential quadratic programming (SQP) algorithm retains typical features of
special purpose methods, i.e., the combination of a Gauss-Newton search direction with a quasi-
Newton correction. Numerical test results indicate that the proposed approach is as efficient as
special purpose methods, although the required programming effort is negligible provided that
an SQP code is available.
In a very similar way, L1, L∞, and min-max problems can also be solved efficiently by an SQP
code after a suitable transformation, see chapters 26, 25, 24 or Schittkowski [128, 131, 135].
The following section describes some basic features of least squares optimization, especially
some properties of Gauss-Newton and related methods. The transformation of a least squares
problem into a special nonlinear program is described in Section 27.3. We will discuss how
some basic features of special purpose algorithms are retained. The same ideas are extended to
the constrained case in Section 27.4. Section 27.5 contains a documentation of the code. An
example implementation can be found in chapter 28.
and let

    f(x) = (1/2) Σ_{i=1}^{l} f_i(x)^2 .
Then

    ∇f(x) = ∇F(x) F(x)        (27.3)

defines the gradient of the objective function, where ∇F(x) = (∇f_1(x), ..., ∇f_l(x)). If we
assume now that all functions f_1, ..., f_l are twice continuously differentiable, we get the
Hessian matrix of f by

    ∇²f(x) = ∇F(x)∇F(x)^T + B(x) ,        (27.4)
27.2. Least Squares Methods
where

    B(x) = Σ_{i=1}^{l} f_i(x) ∇²f_i(x) .        (27.5)
Proceeding from a given iterate x_k, Newton's method can be applied to (27.2) to get a search
direction d_k ∈ R^n by solving the linear system

    ∇²f(x_k) d + ∇f(x_k) = 0

or, alternatively,

    ∇F(x_k)∇F(x_k)^T d + B(x_k) d + ∇F(x_k) F(x_k) = 0 .        (27.6)
Assume for a moment that F(x*) = 0 at an optimal solution x*. A possible situation is a
perfect fit where model function values coincide with experimental data. Because of B(x*) = 0,
we neglect the matrix B(x_k) in (27.6), see also (27.5). Then (27.6) defines the so-called
normal equations of the linear least squares problem
    min   ‖∇F(x_k)^T d + F(x_k)‖
    d ∈ R^n .        (27.7)
A new iterate is obtained by x_{k+1} = x_k + α_k d_k, where d_k is a solution of (27.7) and
where α_k denotes a suitable steplength parameter. It is obvious that a quadratic convergence rate
is achieved when starting sufficiently close to an optimal solution. The above calculation of a
search direction is known as the Gauss-Newton method and represents the traditional way to
solve nonlinear least squares problems, see Björck [6] for more details. In general, the Gauss-
Newton method possesses the attractive feature that it converges quadratically although we only
provide first order information.
However, the assumptions of the convergence theorem of Gauss-Newton methods are very
strong and cannot be satisfied in real situations. We have to expect difficulties in case of non-
zero residuals, rank-deficient Jacobian matrices, non-continuous derivatives, and starting points
far away from a solution. Further difficulties arise when trying to solve large residual problems,
where F(x*)^T F(x*) is not sufficiently small, for example relative to ‖∇F(x*)‖. Numerous
proposals have been made in the past to deal with this situation, and it is outside the scope of
this section to give a review of all possible attempts developed in the last 30 years. Only a few
remarks are presented to illustrate basic features of the main approaches, for further reviews see
Gill, Murray and Wright [54], Ramsin and Wedin [112], or Dennis [31].
A very popular method is known under the name Levenberg-Marquardt algorithm, see
Levenberg [81] and Marquardt [92]. The key idea is to replace the matrix B(x_k) in (27.6) by a
multiple of the identity matrix, say λ_k I, with a suitable positive factor λ_k. We get a uniquely
solvable system of linear equations of the form

    ∇F(x_k)∇F(x_k)^T d + λ_k d + ∇F(x_k) F(x_k) = 0 .

For the choice of λ_k and the relationship to so-called trust region methods, see Moré [94].
A more sophisticated idea is to replace B(xk ) in (27.6) by a quasi-Newton-matrix Bk , see
Dennis [30]. But some additional safeguards are necessary to deal with indefinite matrices
∇F (xk )∇F (xk )T + Bk in order to get a descent direction. A modified algorithm is proposed
by Gill and Murray [52], where Bk is either a second-order approximation of B(xk ), or a quasi-
Newton matrix. In this case, a diagonal matrix is added to ∇F (xk )∇F (xk )T + Bk to obtain
a positive definite matrix. Lindström [85] proposes a combination of a Gauss-Newton and a
Newton method by using a certain subspace minimization technique.
If, however, the residuals are too large, the special structure cannot be exploited, and a general
unconstrained minimization algorithm, for example a quasi-Newton method, can be applied as
well.
The unconstrained least squares problem

    min   (1/2) F(x)^T F(x) ,      x ∈ R^n ,        (27.8)

with F(x) = (f_1(x), ..., f_l(x))^T is equivalent to a smooth nonlinear program: we introduce
l additional variables z = (z_1, ..., z_l)^T together with the l additional equality constraints

    f_i(x) − z_i = 0 ,      i = 1, ..., l ,        (27.9)

and obtain

    min   (1/2) z^T z
x ∈ R^n , z ∈ R^l :   F(x) − z = 0 .        (27.10)

We consider now (27.10) as a general nonlinear programming problem of the form

    min   f̄(x̄)
x̄ ∈ R^n̄ :   ḡ(x̄) = 0        (27.11)

with n̄ = n + l, x̄ = (x, z), f̄(x, z) = (1/2) z^T z, ḡ(x, z) = F(x) − z, and apply an SQP
algorithm, see Spellucci [150], Stoer [151], or Schittkowski [123, 128]. The quadratic
programming subproblem is of the form

    min   (1/2) d̄^T B̄_k d̄ + ∇f̄(x̄_k)^T d̄
d̄ ∈ R^n̄ :   ∇ḡ(x̄_k)^T d̄ + ḡ(x̄_k) = 0 .        (27.12)
27.3. The SQP-Gauss-Newton Method
with

    B̄_k = ( B_k     C_k
            C_k^T   D_k ) ,        (27.13)

B_k ∈ R^{n×n}, C_k ∈ R^{n×l}, and D_k ∈ R^{l×l}, a given approximation of the Hessian of
the Lagrangian function L(x̄, u) defined by

    L(x̄, u) = (1/2) z^T z − u^T (F(x) − z) .
Since

    ∇_x̄ L(x̄, u) = ( −∇F(x) u
                       z + u    )

and

    ∇²_x̄ L(x̄, u) = ( B(x, u)   0
                         0      I )

with

    B(x, u) = − Σ_{i=1}^{l} u_i ∇²f_i(x) ,        (27.14)
it seems reasonable to proceed from a B̄_k of the form

    B̄_k = ( B_k   0
             0    I ) ,        (27.15)

where B_k ∈ R^{n×n} is a suitable positive definite approximation of B(x_k, u_k). Insertion of
this B̄_k into (27.12) leads to the equivalent quadratic programming subproblem
    min   (1/2) d^T B_k d + (1/2) e^T e + z_k^T e
(d, e) ∈ R^{n+l} :   ∇F(x_k)^T d − e + F(x_k) − z_k = 0 ,        (27.16)
where we replaced d̄ by (d, e). Some simple calculations show that the solution of the above
quadratic programming problem is identified by the linear system

    ∇F(x_k)∇F(x_k)^T d + B_k d + ∇F(x_k) F(x_k) = 0 .        (27.17)

This equation is identical to (27.6) if B_k = B(x_k), and we obtain a Newton direction for
solving the unconstrained least squares problem (27.8).
Note that B(x) defined by (27.5) and B(x, u) defined by (27.14) coincide at an optimal solution
of the least squares problem, since F(x*) = −u*. Based on the above considerations, an SQP
method can be applied to solve (27.10) directly. The quasi-Newton-matrices B̄k are always
positive definite, and consequently so is the matrix B_k defined by (27.13). Therefore, we
avoid numerical difficulties caused by negative eigenvalues, as found in the usual approaches
for solving least squares problems.
When starting the SQP method, one could proceed from a user-provided initial guess x0 for
the variables and define
    z_0 = F(x_0) ,
    B_0 = ( µI   0
             0   I ) ,        (27.18)
guaranteeing a feasible starting point x̄0 . The choice of B0 is of the form (27.15) and allows
a user to provide some information on the estimated size of the residuals, if available. If it is
known that the final norm F(x*)^T F(x*) is close to zero at the optimal solution x*, the user
could choose a small µ in (27.18). At least in the first iterates, the search directions are more
similar to a Gauss-Newton direction. Otherwise, a user could define µ = 1, if a large residual is
expected.
If B̄_k is decomposed in the form (27.15) and updated by the BFGS formula, then B̄_{k+1}
is also decomposed in the form (27.15), see Schittkowski [125]. The decomposition (27.15) is
rarely satisfied in practice, but seems to be reasonable, since the intention is to find an
x* ∈ R^n with

    ∇F(x*) F(x*) = 0 ,
and ∇F (xk )T dk + F (xk ) is a Taylor approximation of F (xk+1 ). Note also that the usual way
to derive Newton’s method is to assume that the optimality condition is satisfied for a certain
linearization of a given iterate xk , and to use this linearized system for obtaining a new iterate.
As an example, consider the banana function

    f(x_1, x_2) = 100 (x_2 − x_1^2)^2 + (1 − x_1)^2 .
When applying the nonlinear programming code NLPQLP of Schittkowski [123, 132], an imple-
mentation of a general purpose SQP method, we get the iterates of Table 27.1 when starting
at x0 = (−1.2, 1.0)T . The objective function is scaled by 0.5 to adjust this factor in the least
squares formulation (27.1). The last column contains an internal stopping condition based on
the optimality criterion, in our unconstrained case equal to

    s(x_k) = ∇f(x_k)^T B_k^{−1} ∇f(x_k)

with a quasi-Newton matrix B_k. We observe a very fast final convergence speed, but a relatively
large number of iterations.
The transformation discussed above leads to the equivalent constrained nonlinear programming
problem

    min   z_1^2 + z_2^2
x_1, x_2, z_1, z_2 :   10 (x_2 − x_1^2) − z_1 = 0 ,
                       1 − x_1 − z_2 = 0 .
NLPQLP computes the results of Table 27.2, where the second column shows in addition the
maximal constraint violation.
27.4. Constrained Least Squares Problems
     k      f(x_k)             s(x_k)
     0      24.20              0.54 · 10^5
     1      12.21              0.63 · 10^2
     2       7.98              0.74 · 10^2
    ...
    35      0.29 · 10^−3       0.47 · 10^−3
    36      0.19 · 10^−4       0.39 · 10^−4
    37      0.12 · 10^−5       0.25 · 10^−5
    38      0.12 · 10^−7       0.24 · 10^−7
    39      0.58 · 10^−12      0.11 · 10^−11
    40      0.21 · 10^−15      0.42 · 10^−15

Table 27.1: Iterates of NLPQLP applied to the unconstrained formulation.
We now consider the constrained nonlinear least squares problem

    min   (1/2) Σ_{i=1}^{l} f_i(x)^2
x ∈ R^n :   g_j(x) = 0 ,      j = 1, ..., m_e ,
            g_j(x) ≥ 0 ,      j = m_e + 1, ..., m ,        (27.19)
            x_l ≤ x ≤ x_u .
A combination of the SQP method with the Gauss-Newton method is proposed by Mahdavi-
Amiri [90]. Lindström [86] developed a similar method based on an active set idea leading to a
sequence of equality constrained linear least squares problems. A least squares code for linearly
constrained problems was published by Hanson and Krogh [65] that is based on a tensor model.
On the other hand, a couple of SQP codes are available for solving general smooth nonlinear
programming problems, for example VF02AD (Powell [105]), NLPQLP (Schittkowski [123,
132]), NPSOL (Gill, Murray, Saunders, Wright [53]), or DONLP2 (Spellucci [150]). Since most
nonlinear least squares problems are ill-conditioned, it is not recommended to solve (27.19)
directly by a general nonlinear programming method as shown in the previous section. The
same transformation used before can be extended to solve also constrained problems. The
subsequent solution by an SQP method retains typical features of a special purpose code and is
easily implemented.
    f_i(x) − z_i = 0 ,      i = 1, ..., l ,
    x_l − x_k ≤ d ≤ x_u − x_k .

The additionally introduced variables can be eliminated from the quadratic programming
subproblem by substituting

    e = ∇F(x_k)^T d + F(x_k) − z_k ,
so that the quadratic programming subproblem depends on only n variables and m linear con-
straints. This is an important observation from the numerical point of view, since the com-
putational effort to solve (27.21) reduces from the order of (n + l)3 to n3 , and the remaining
computations in the outer SQP frame are on the order of (n + l)2 . Therefore, the computational
work involved in the proposed least squares algorithm is comparable to the numerical efforts
required by special purpose methods, at least if the number l of observations is not too large.
When implementing the above proposal, one has to be aware that the quadratic programming
subproblem is sometimes expanded by an additional variable δ, so that some safeguards are
required. Except for this limitation, the proposed transformation (27.20) is independent of the
variant of the SQP method used, so that available codes can be used in the form of a black
box.
In principle, one could use the starting points proposed by (27.18). Numerical experience
suggests, however, starting from z_0 = F(x_0) only if the constraints are satisfied at x_0, i.e.,

    g_j(x_0) = 0 ,      j = 1, ..., m_e ,
    g_j(x_0) ≥ 0 ,      j = m_e + 1, ..., m .
27.5. Program Documentation
A final remark concerns the theoretical convergence of the algorithm. Since the original prob-
lem is transformed into a general nonlinear programming problem, we can apply all convergence
results known for SQP methods. If an augmented Lagrangian function is preferred for the merit
function, a global convergence theorem is found in Schittkowski [120]. The theorem states that
when starting from an arbitrary initial value, a Karush-Kuhn-Tucker point is approximated,
i.e., a point satisfying the necessary optimality conditions. If, on the other hand, an iterate is
sufficiently close to an optimal solution and if the steplength is 1, then the convergence speed
of the algorithm is superlinear, see Powell [104] for example. This remark explains the fast final
convergence rate one observes in practice.
The assumptions are standard and are required by any special purpose algorithm in one form
or another. But in our case, we do not need any regularity conditions for the functions f_1,
..., f_l, i.e., an assumption that the matrix ∇F(x_k) is of full rank, to adapt the mentioned
convergence results to the least squares case. The reason is found in the special form of the
quadratic programming subproblem (27.21), since the first l constraints are linearly independent
and are also independent of the remaining restrictions.
5. Check status of the object using GetStatus( int & iStatus) const
• if iStatus equals SolvedId(): The final accuracy has been achieved, the problem is
solved.
• if iStatus is > 0: An error occurred during the solution process.
• if iStatus equals EvalFuncId(): LeastSquaresWrapper needs new function values
→ 6b New function values.
After passing these values to the LeastSquaresWrapper object go to 4.
• if iStatus equals EvalGradId():
LeastSquaresWrapper needs new values for gradients → 6c Providing new gradient
values.
After passing these values to the LeastSquaresWrapper object go to 4.
7. Output
• GetConstrValVec( const double * & pdPointer ) const
Returns a pointer to an array containing the values of all constraint functions at the
last solution vector.
• GetConstrVal( const int iConstrIdx, double & dConstrVal ) const
Returns the value of the iConstrIdx’th constraint function at the last solution vector.
• GetDerivConstr( const int iConstrIdx , const int iVariableIdx,
double & dDerivativeValue ) const
Returns the value of the derivative of the iConstrIdx’th constraint with respect to
the iVariableIdx’th design variable at the last solution vector.
• GetDerivModel( const int iVariableIdx, double & dDerivativeValue,
const int iSuppPtIdx ) const
returns the value of the derivative of the model function with respect to the iVari-
ableIdx’th design variable at the iSuppPtIdx’th supporting point.
• GetDesignVar( const int iVariableIdx, double & dValue, const int iParSysIdx
) const
with iParSysIdx = 0 (default) returns the value of the iVariableIdx’th design vari-
able in the last solution vector.
• GetDesignVarVec( const double * & pdPointer, const int iParSysIdx) const
with iParSysIdx = 0 (default) returns a pointer to the last solution vector.
• Get( const char * pcParam, double & dResidual )
with pcParam = "Residual" returns the residual.
28. Example
As an illustrative example we solve the following problem using DefaultLoop():
    Minimize   F_0(x) = Σ_{i=0}^{l−1} (h(x, t_i) − y_i)^2 ,

where

    h(x, t) = (t^2 + x_1) · x_0 / (t^2 + x_2 t + x_3)

and

    t = ( 0.0625, 0.0714, 0.0823, 0.1, 0.125, 0.167, 0.25, 0.5, 1, 2, 4 )^T ,
    y = ( 0.0246, 0.0235, 0.0323, 0.0342, 0.0456, 0.0627, 0.0844, 0.16,
          0.1735, 0.1947, 0.1957 )^T .
For the approximation of gradients we use the class ApproxGrad (see chapter 30).
The file Example44.h contains the class definition.
#ifndef Example44_H_INCLUDED
#define Example44_H_INCLUDED

#include "OptProblem.h"

class Example44 : public OptProblem
{
public:

    Example44();

    int FuncEval( bool bGradApprox = false );
    int GradEval();
    int SolveOptProblem();

protected:
    double F( const double x, const double * b ) const;
};

#endif
The file Example44.cpp contains the implementation:
#include "Example44.h"
#include <iostream>
#include <iomanip>
#include <cmath>
#include <cstdlib>   // EXIT_SUCCESS
#include "NlpinfWrapper.h"
#include "NlpL1Wrapper.h"
#include "NlpmmxWrapper.h"
#include "LeastSquaresWrapper.h"

using std::cout;
using std::endl;
using std::cin;
using std::setprecision;

Example44::Example44()
{
    char s;
    cout << endl << "--- this is Example 44 ---" << endl << endl;
    do
    {
        // Choose algorithm:
        cout << endl << "Which Algorithm do you want to use? \n\n";
        cout << " [i] NlpinfWrapper \n";
        cout << " [m] NlpmmxWrapper \n";
        cout << " [l] NlpL1Wrapper \n";
        cout << " [s] LeastSquaresWrapper \n";

        cin >> s;

        if ( s == 'm' )
        {
            m_pOptimizer = new NlpmmxWrapper();
        }
        else if ( s == 'l' )
        {
            m_pOptimizer = new NlpL1Wrapper();
        }
        else if ( s == 'i' )
        {
            m_pOptimizer = new NlpinfWrapper;
        }
        else if ( s == 's' )
        {
            m_pOptimizer = new LeastSquaresWrapper;
        }
        else
        {
            cout << "illegal input!" << endl;
        }
    }
    while ( s != 'l' &&
            s != 'm' &&
            s != 'i' &&
            s != 's' );
}

int Example44::FuncEval( bool bGradApprox )
{
    const double * dX;
    double * pdFuncVals;
    double * pdSuppPtTimes;
    int iNumSuppPts;
    int iSuppPtIdx;
    int iNumConstr;

    pdSuppPtTimes = NULL;

    m_pOptimizer->GetNumConstr( iNumConstr );
    m_pOptimizer->Get( "NumSuppPts", iNumSuppPts );
    m_pOptimizer->Get( "SuppPtTimes", pdSuppPtTimes );

    pdFuncVals = new double[ iNumConstr + iNumSuppPts ];

    // Get the current design variable vector
    // from m_pApprox when gradients are being approximated
    // and from m_pOptimizer in case of normal function evaluation.
    if ( bGradApprox == false )
    {
        m_pOptimizer->GetDesignVarVec( dX );
    }
    else
    {
        m_pApprox->GetDesignVarVec( dX );
    }

    // Evaluate model function at all supporting points
    for ( iSuppPtIdx = 0 ; iSuppPtIdx < iNumSuppPts ; iSuppPtIdx++ )
    {
        pdFuncVals[ iSuppPtIdx ] = ( pow( pdSuppPtTimes[ iSuppPtIdx ], 2 )
            + dX[ 1 ] ) * dX[ 0 ] / ( pow( pdSuppPtTimes[ iSuppPtIdx ], 2 )
            + dX[ 2 ] * pdSuppPtTimes[ iSuppPtIdx ] + dX[ 3 ] );
    }

    // Put function values
    // to m_pApprox when gradients are being evaluated,
    // to m_pOptimizer in case of normal function evaluation.
    if ( bGradApprox == false )
    {
        for ( iSuppPtIdx = iNumConstr ;
              iSuppPtIdx < iNumConstr + iNumSuppPts ;
              iSuppPtIdx++ )
        {
            m_pOptimizer->PutModelVal( pdFuncVals[ iSuppPtIdx ],
                                       iSuppPtIdx );
        }

        m_pOptimizer->PutConstrValVec( pdFuncVals, iSuppPtIdx );
    }
    else
    {
        m_pApprox->PutFuncVals( pdFuncVals );
    }

    delete [] pdFuncVals;

    return EXIT_SUCCESS;
}

int Example44::GradEval()
{
    // Gradients are approximated.
    return GradApprox();
}

int Example44::SolveOptProblem()
{
    int iError;

    iError = 0;

    int iMaxNumIter = 500;
    int iMaxNumIterLS = 50;
    double dTermAcc = 1.0E-6;

    double dTimes[ 11 ] = { 0.0625, 0.0714, 0.0823, 0.1, 0.125, 0.167,
                            0.25, 0.5, 1, 2, 4 };
    double dVals[ 11 ]  = { 0.0246, 0.0235, 0.0323, 0.0342, 0.0456, 0.0627,
                            0.0844, 0.16, 0.1735, 0.1947, 0.1957 };

    // Set parameters:
    m_pOptimizer->Put( "MaxNumIter", &iMaxNumIter );
    m_pOptimizer->Put( "MaxNumIterLS", &iMaxNumIterLS );
    m_pOptimizer->Put( "TermAcc", &dTermAcc );
    m_pOptimizer->Put( "OutputLevel", 2 );

    // Define fitting problem:
    m_pOptimizer->Put( "NumSuppPts", 11 );
    m_pOptimizer->Put( "SuppPtTimes", dTimes );
    m_pOptimizer->Put( "SuppPtVals", dVals );
    m_pOptimizer->Put( "ApproxSize", 0.01 );
    m_pOptimizer->PutNumDv( 4 );
    m_pOptimizer->PutNumIneqConstr( 0 );
    m_pOptimizer->PutNumEqConstr( 0 );

    double LB[ 4 ] = { -1e5, -1e5, -1e5, -1e5 };
    double UB[ 4 ] = { 1e5, 1e5, 1e5, 1e5 };
    double IG[ 4 ] = { 0.25, 0.39, 0.415, 0.39 };

    m_pOptimizer->PutUpperBound( UB );
    m_pOptimizer->PutLowerBound( LB );
    m_pOptimizer->PutInitialGuess( IG );

    // Start optimization
    iError = m_pOptimizer->DefaultLoop( this );

    // If there was an error, report it ...
    if ( iError != EXIT_SUCCESS )
    {
        cout << "Error " << iError << "\n";
        return iError;
    }

    // ... else report the results:

    const double * dX;

    int iNumDesignVar;
Part V.
29. Class Variable Catalogue
This class defines a catalogued design variable.
The catalogued variables of MipOptimizer and MidacoWrapper are stored in an array of this
type. Before starting the optimization, you therefore access the class methods of VariableCatalogue
via the class member MipOptimizer::m_pVariableCatalogue or MidacoWrapper::m_pVariableCatalogue,
which is a VariableCatalogue* pointer to the array of catalogued variables.
• If the stepsize between the allowed values is constant, you can use PutAllowedValues( double
dLowerBound, double dStepSize, double dUpperBound )
30. Class ApproxGrad
30.1. Usage
1. Generate an ApproxGrad object using ApproxGrad().
2. Put the number of design variables: PutNumDv (const int iNumDv) ;
3. Put the number of functions: PutNumFuns (const int iNumFuns) ;
4. Put the perturbance factor:
PutPerturbFactor (const double dPerturbFactor) ;
5. Put the design variable vector where to compute the gradient(s):
PutDesignVarVec (const double *dX) ;
6. Determine whether the Jacobian is expected to be sparse:
PutJacobianSparse (const bool bJacobianSparse) ;
7. If you use input files for a program that evaluates the functions, and these files store
the variable values with only a limited number of digits, it may happen that a perturbed
and an unperturbed variable have the same value in the file. To avoid this, you can set
this parameter; ApproxGrad will then perturb the variables enough to make the difference
visible in the file.
PutMaxNumChar (const int iMaxNumChar) ;
8. Put the approximation method:
• 1: Forward differences
• 2: Central differences
• 3: Five-Point-Formula
PutApproxMethod( const int iMethod ) ;
9. Run Start() ;
10. Check the Status: GetStatus (int & iStatus) ;
• If iStatus < 1: More function values are needed. Get the current design variable
vector using
GetDesignVarVec (const double *& dX) ;
(with dX = NULL on input) and evaluate all functions at this vector. Store the values
in an array and put them to the object using
PutFuncVals (const double *pdFuncVals) ;
where pdFuncVals points to a double array of length equal to the number of functions
containing the values. Then go to Step 9.
30.2. Example
As an example we solve the following problem using DefaultLoop() and approximating the
gradients using ApproxGrad:
#include "Example12.h"
#include <iostream>
#include <cmath>
#include <cstdlib>   // EXIT_SUCCESS
#include "SqpWrapper.h"

using std::cout;
using std::endl;
using std::cin;

Example12::Example12()
{
    cout << endl << "--- this is Example 12 ---" << endl << endl;
    cout << endl << "---Solved by SqpWrapper---" << endl << endl;

    m_pOptimizer = new SqpWrapper();
}

int Example12::FuncEval( bool bGradApprox )
{
    const double * dX;
    int iNumParSys;
    double * pdFuncVals;
    int iNumConstr;
    int iNumEqConstr;
    int iNumIneqConstr;
    int iNumActConstr;
    int * piActive;
    int iCounter;
    int iNumObjFuns;
    int iError;
    bool bActiveSetStrat;

    piActive = NULL;
    iNumActConstr = 0;
    iCounter = 0;

    m_pOptimizer->Get( "NumObjFuns", iNumObjFuns );
    m_pOptimizer->GetNumConstr( iNumConstr );
    m_pOptimizer->GetNumEqConstr( iNumEqConstr );
    m_pOptimizer->GetNumIneqConstr( iNumIneqConstr );

    iError = m_pOptimizer->Get( "ActConstr", piActive );

    // If the optimizer does not support an active set strategy,
    // all inequalities are considered active.
47 i f ( i E r r o r != EXIT SUCCESS )
48 {
49 bActiveSetStrat = false ;
50 p i A c t i v e = new int [ iNumIneqConstr ] ;
51 for ( int i C o n s t r I d x = 0 ; i C o n s t r I d x < iNumIneqConstr ;
52 i C o n s t r I d x++ )
53 {
54 piActive [ iConstrIdx ] = 1 ;
55 }
56 }
57 else
58 {
59 b A c t i v e S e t S t r a t = true ;
60 }
61
62
63 for ( int i C o n s t r I d x = 0 ; i C o n s t r I d x < iNumIneqConstr ;
64 i C o n s t r I d x++ )
65 {
66 i f ( p i A c t i v e [ i C o n s t r I d x ] != 0 )
67 {
68 iNumActConstr++ ;
69 }
70 }
71 // when used f o r g r a d i e n t a p p r o x i m a t i o n ,
72 // FuncEval d o e s not e v a l u a t e i n a c t i v e i n e q u a l i t y c o n s t r a i n t s
73 i f ( bGradApprox == true )
74 {
75 pdFuncVals = new double [ iNumEqConstr + iNumObjFuns
76 + iNumActConstr ] ;
77 }
78 else
79 {
80 pdFuncVals = new double [ iNumConstr + iNumObjFuns ] ;
81 }
82
83 // G r a d i e n t s have t o be e v a l u a t e d o n l y a t one d e s i g n v a r i a b l e v e c t o r
84 i f ( bGradApprox == true )
85 {
86 iNumParSys = 1 ;
87 }
88 // F u n c t i o n s may have t o be e v a l u a t e d a t more than one d e s i g n
89 // v a r i a b l e v e c t o r when u s i n g SqpWrapper
90 else
91 {
92 m pOptimizer−>Get ( ” NumParallelSys ” , iNumParSys ) ;
288
30.2. Example
93 }

    // A for-loop simulates the parallel evaluation in iNumParSys
    // parallel systems.
    for ( int iSysIdx = 0; iSysIdx < iNumParSys; iSysIdx++ )
    {
        // Get the design variable vector:
        // from m_pApprox when gradients are being approximated,
        // from m_pOptimizer in case of a normal function evaluation.
        if ( bGradApprox == false )
        {
            m_pOptimizer->GetDesignVarVec( dX, iSysIdx );
        }
        else
        {
            m_pApprox->GetDesignVarVec( dX );
        }

        // Evaluate the equality constraints

        // 0th constraint
        pdFuncVals[ 0 ] = dX[ 1 ] * dX[ 1 ] + dX[ 2 ] * dX[ 2 ] - 4.0;

        // Evaluate the inequality constraints
        // (when approximating gradients, only the active ones)

        iCounter = iNumEqConstr;

        if ( bGradApprox == false || piActive[ 0 ] != 0 )
        {
            // 1st constraint
            pdFuncVals[ iCounter ] = dX[ 2 ] - 1.0 - dX[ 0 ] * dX[ 0 ];
            iCounter++;
        }

        // Evaluate the objective

        pdFuncVals[ iCounter ] = log( fabs( dX[ 2 ] ) ) - dX[ 1 ];

        // Put the values:
        // to m_pApprox when gradients are being approximated,
        // to m_pOptimizer in case of a normal function evaluation.
        if ( bGradApprox == false )
        {
            m_pOptimizer->PutObjVal( pdFuncVals[ iCounter ], 0, iSysIdx );
            m_pOptimizer->PutConstrValVec( pdFuncVals, iSysIdx );
        }
        else
        {
            m_pApprox->PutFuncVals( pdFuncVals );
        }

    }

    delete [] pdFuncVals;
    pdFuncVals = NULL;

    // If there is no active set strategy, piActive has been
    // allocated by this function and thus must be deleted here.
    if ( bActiveSetStrat == false )
    {
        delete [] piActive;
        piActive = NULL;
    }

    return EXIT_SUCCESS;
}

int Example12::GradEval()
{
    // As all the work is done in GradApprox(),
    // there is not much to do here...
    return GradApprox();
}

int Example12::GradApprox( int iApproxMethod )
{
    // This function is already implemented in the base class
    // OptProblem and thus does not have to be implemented here.
    // However, in some cases it may be necessary to reimplement it.

    m_pApprox = new ApproxGrad;

    int iNumDv;
    int iNumConstr;
    int iNumObjFuns;
    int iNumSuppPts;
    const double *pdX;
    int iStatus;
    double **ppdJac;
    bool bFitting;
    int *piActive;
    // ...
    {
        m_pOptimizer->GetNumDv( iNumDv );
        m_pApprox->PutNumDv( iNumDv );

        m_pOptimizer->Get( "NumSuppPts", iNumSuppPts );
        m_pOptimizer->GetNumConstr( iNumConstr );
        m_pApprox->PutNumFuns( iNumConstr + iNumSuppPts );

        m_pApprox->PutPerturbFactor( 1.0E-6 );

        m_pApprox->PutApproxMethod( iApproxMethod );

        m_pOptimizer->GetDesignVarVec( pdX );
        m_pApprox->PutDesignVarVec( pdX );

        m_pApprox->Start();
        m_pApprox->GetStatus( iStatus );

        while ( iStatus <= 0 )
        {
            // call the gradient-version of FuncEval
            FuncEval( true );

            m_pApprox->Start();

            m_pApprox->GetStatus( iStatus );
        }

        m_pApprox->GetJacobian( ppdJac );

        for ( int iConstrIdx = 0; iConstrIdx < iNumConstr; iConstrIdx++ )
        {
            m_pOptimizer->PutGradConstr( iConstrIdx,
                                         ppdJac[ iConstrIdx ] );
        }
        for ( int iSuppPtIdx = 0; iSuppPtIdx < iNumSuppPts; iSuppPtIdx++ )
        {
            m_pOptimizer->PutGradModel( ppdJac[ iNumConstr + iSuppPtIdx ],
                                        iSuppPtIdx );
        }
    }

    delete m_pApprox;
    // ...
    m_pOptimizer->GetDesignVarVec( dX );

    int iNumDesignVar;

    m_pOptimizer->GetNumDv( iNumDesignVar );

    for ( int iDvIdx = 0; iDvIdx < iNumDesignVar; iDvIdx++ )
    {
        cout << "dX[" << iDvIdx << "] = " << dX[ iDvIdx ] << endl;
    }
    return EXIT_SUCCESS;
}
31. Class OptProblem and the
DefaultLoop()-function
OptProblem is a base class for optimization problems that can be solved using NLP++'s
optimizers.
DefaultLoop() is a member function of OptProblem which runs an optimization loop until
either an error occurs or the optimal solution has been computed.
OptProblem provides the DefaultLoop() function as well as several functions that the
optimizer needs in order to run DefaultLoop().
These functions are:
• FuncEval(), which evaluates the objective and constraint functions at the current iterate
and passes the values to the optimizer
• GradEval(), which evaluates the gradients of the objective and constraint functions and
passes them to the OptimizerInterface object.
• GradApprox(), which approximates the gradients of all functions by using the ApproxGrad
class.
If you do not want to use DefaultLoop(), there is no need to derive your optimization problem
from OptProblem.
Usage:
1. Define a class (e.g. MyProblem) which is derived from OptProblem and defines your
optimization problem.
3. Either the class constructor or the SolveOptProblem() function must allocate memory
for m_pOptimizer, e.g. using new SqpWrapper.
4. Via m_pOptimizer, the SolveOptProblem() function then must define the problem using
the Put-methods (see the documentation of the optimizer). Instead of calling StartOptimizer(),
the SolveOptProblem() function can now call DefaultLoop(this) using the current
MyProblem object as an argument.
6. Call MyProblem::SolveOptProblem().
Part VI.
32. Proceeding from a calculation to an
optimization program
• Compilation:
– Microsoft Visual C++ Project settings:
∗ It will probably be necessary to ignore the libcmt.lib library.
∗ Add $(NLP)\include as an include directory
∗ Add $(NLP)\lib as a library directory
∗ Link to nlp.lib in release mode and to nlpd.lib in debug mode
– Linux: Adjust .cmake1 and .cmake2 so that NLP++'s header files and libNlp++.a
can be found and used for compilation.
32.2. Example
A Diffpack program is given which computes the deflection of a beam of given length when a
force is applied. Proceeding from that program we minimize the cross-sectional area. The
variables are the height and the width of the beam. However, the deflection is not allowed to
exceed a bound (maxDeflection). Thus we solve the following optimization problem:

min   width * height
s.t.  w(length) - maxDeflection >= 0
      minWidth  <= width  <= maxWidth
      minHeight <= height <= maxHeight

where w(length) denotes the computed deflection at the end of the beam.
File DPBeamSolver.h:

#ifndef DPBeamSolver_h_IS_INCLUDED
#define DPBeamSolver_h_IS_INCLUDED
#include <FEM.h>
#include <DegFreeFE.h>
#include <LinEqAdmFE.h>

    dpreal epsilon;         // parameter for the bending beam
    dpreal width;           // beam width
    dpreal height;          // beam height
    //NLP++: added variables
    dpreal maxDeflection;   // maximum deflection
    dpreal minWidth;        // lower bound for the width
    dpreal maxWidth;        // upper bound for the width
    dpreal minHeight;       // lower bound for the height
    dpreal maxHeight;       // upper bound for the height
    //ENDNLP++
    dpreal length;          // beam length
    dpreal rho;             // beam density
    dpreal I_y;             // area moment of inertia about the y-axis
    dpreal E_modul;         // Young's modulus
    dpreal force_z;         // force on the beam
    dpreal qforce;          // transverse force on the beam

    // set the essential boundary conditions
    virtual void fillEssBC();
    // compute the element matrix and vector
    virtual void calcElmMatVec( int e, ElmMatVec& elmat,
                                FiniteElement& fe );
    // compute the variational equation
    virtual void integrands( ElmMatVec& elmat,
                             const FiniteElement& fe );
    // NLP++: added functions
    int FuncEval( bool = false );
    int GradEval() { return GradApprox(); }
    int SolveOptProblem( void );
    // ENDNLP++
    DPBeamSolver();
    ~DPBeamSolver();
};
#endif
File main.cpp:

#ifdef WIN32
#include <LibsDP.h>
#endif
#include <DPBeamSolver.h>
// NLP++: added header file
#include "OptimizerInterfaceHeaders.h"
// ENDNLP++

    // NLP++: instead of calling multipleLoop,
    // call adm and SolveOptProblem
    // old:
    // global_menu.multipleLoop( sim );
    // new:
    sim.adm( global_menu );
    sim.SolveOptProblem();
    // ENDNLP++
    return 0;
}
File DPBeamSolver.cpp:

#include <DPBeamSolver.h>
#include <PreproBox.h>
#include <ElmMatVec.h>
#include <FiniteElement.h>
#include <ErrorNorms.h>
#include <readOrMakeGrid.h>

DPBeamSolver::DPBeamSolver()
{
}

DPBeamSolver::~DPBeamSolver()
{
    lineq.detach();
}

    // old:
    // new:
    // initial value for the width
    menu.addItem( level, "width", "Anfangswert balkenbreite", "0.8" );
    // initial value for the height
    menu.addItem( level, "height", "Anfangswert balkenhoehe", "0.8" );
    // maximum deflection (negative)
    menu.addItem( level, "maxDeflection", "Maximale Durchbiegung", "-0.005" );
    // lower bound for the width
    menu.addItem( level, "minWidth", "Minimale Balkenbreite", "0.00001" );
    // upper bound for the width
    menu.addItem( level, "maxWidth", "Maximale Balkenbreite", "1.0" );
    // lower bound for the height
    menu.addItem( level, "minHeight", "Minimale Balkenhoehe", "0.00001" );
    // upper bound for the height
    menu.addItem( level, "maxHeight", "Maximale Balkenhoehe", "1.0" );
    //ENDNLP++
    menu.addItem( level, "length", "balkenlaenge", "10.00" );
    menu.addItem( level, "force_z", "kraft auf balken", "20000.0" );
    // submenu
    // parameters of the linear system
    LinEqAdmFE::defineStatic( menu, level+1 );
}

void DPBeamSolver::scan()
{
    MenuSystem& menu = SimCase::getMenuSystem();
    // read in the solution grid
    grid.rebind( new GridFE() );
    String gridfile;
    gridfile = menu.get( "gridfile" );
    readOrMakeGrid( *grid, gridfile );
    // read in the menu entries
    //NLP++: get the values of the additional menu entries:
    //ENDNLP++
    length  = menu.get( "length" ).getReal();
    rho     = menu.get( "rho" ).getReal();
    E_modul = menu.get( "E_modul" ).getReal();
    force_z = menu.get( "force_z" ).getReal();
    // weight force
    // qforce = width * height * rho * (-9.81);
    // for the example with a pure point load
    qforce = 0.0;
    // area moment of inertia
    I_y = width * pow( height, 3.0 ) / 12.0;
    // bending beam parameter
    epsilon = E_modul * I_y;
    // allocate the solution vector
    w.rebind( new FieldsFE( *grid, 2, "w" ) );
    // allocate the DegFreeFE object, 2 unknowns per node
    dof.rebind( new DegFreeFE( *grid, 2 ) );
    // attach the linear system and the linear solver
    lineq.rebind( new LinEqAdmFE() );
    // read in the parameters of the linear system
    lineq->scan( menu );
    // total number of degrees of freedom
    int total_no_dofs = dof->getTotalNoDof();
    // allocate the solution vector of the linear system
    linsol.redim( total_no_dofs );
    lineq->attach( linsol );
    linsol.fill( 0.0 );
}
// set the Dirichlet boundary conditions
void DPBeamSolver::fillEssBC()
{
    int i;
    dof->initEssBC();
    int nno = grid->getNoNodes();   // number of grid nodes

// assemble the linear system of equations
void DPBeamSolver::calcElmMatVec( int e, ElmMatVec& elmat,
                                  FiniteElement& fe )
{
    fe.evalDerivatives( 2 );
    FEM::calcElmMatVec( e, elmat, fe );
    int s, nsides;
    int globNode;
    int noNodes;
    int locNodeNum;
    nsides = fe.getNoSides();       // number of nodes of the element
    noNodes = grid->getNoNodes();
    for ( s = 1; s <= nsides; s++ )
    {
        // if boundary indicator 5 has been set
        if ( fe.boSide( s, 5 ) )
        {
            // local node number
            locNodeNum = fe.getElmDef().getNodeOnSide( s, 1 );
            // global node number
            globNode = grid->loc2glob( e, locNodeNum );
            // boundary node
            if ( ( e == 1 && globNode == 1 ) ||
                 ( e == grid->getNoElms() && globNode == noNodes ) )
            {
                // add the force to the element vector
                elmat.b( (locNodeNum-1)*2+1 ) += -force_z;
            }
            else
            {
                elmat.b( (locNodeNum-1)*2+1 ) += ( -force_z / 2.0 );
            }
        }
    }
}
    solveProblem();
    int nno = grid->getNoNodes();
    if ( bGradApprox == false )
    {
        m_pOptimizer->PutObjVal( width * height );
        m_pOptimizer->PutConstrVal( 0, linsol( nno ) - maxDeflection );
    }
    else
    {
        double FuncVals[2] = { linsol( nno ) - maxDeflection,
                               width * height };
        m_pApprox->PutFuncVals( FuncVals );
    }
    return EXIT_SUCCESS;
}
//ENDNLP++

    iError = 0;
    m_pOptimizer->Put( "NumParallelSys", 1 );
    m_pOptimizer->Put( "MaxNumIter", &iMaxNumIter );
    m_pOptimizer->Put( "MaxNumIterLS", &iMaxNumIterLS );
    m_pOptimizer->Put( "RhoBegin", 0.1 );
    m_pOptimizer->Put( "TermAcc", &dTermAcc );
    m_pOptimizer->Put( "OutputLevel", 2 );
    m_pOptimizer->PutNumDv( 2 );
    m_pOptimizer->PutNumIneqConstr( 1 );
    m_pOptimizer->PutNumEqConstr( 0 );
    double LB[2] = { minWidth,  minHeight };
    double UB[2] = { maxWidth,  maxHeight };
    double IG[2] = { width,     height };
    m_pOptimizer->PutUpperBound( UB );
    m_pOptimizer->PutLowerBound( LB );
    m_pOptimizer->PutInitialGuess( IG );
    iError = m_pOptimizer->DefaultLoop( this );
    if ( iError != EXIT_SUCCESS )
    {
        cout << "Error " << iError << "\n";
        return iError;
    }
    const double *dX;
    m_pOptimizer->GetDesignVarVec( dX );
    int iNumDesignVar;
    m_pOptimizer->GetNumDv( iNumDesignVar );
    return EXIT_SUCCESS;
}
//ENDNLP++
void DPBeamSolver::solveProblem()
{
    // set the essential boundary conditions
    fillEssBC();
    // assemble the linear system
    makeSystem( *dof, *lineq );
    // solve the linear system
    lineq->solve();
    // write the solution into w
    dof->vec2field( linsol, *w );
    // NLP++: suppress the output of the solution
    // int nno = grid->getNoNodes();
    // int i, j;
    // int k = 1;
    // Ptv(dpreal) x_value;    // current evaluation point
    // x_value.redim( nno );
    // x_value.fill( 0.0 );
    // Ptv(dpreal) x_coords;   // grid coordinates
    // x_coords.redim( nno );
    //ENDNLP++
Bibliography
[1] M. A. Abramson. Mixed variable optimization of a load-bearing thermal insulation system
using a filter pattern search algorithm. Optim. Eng., 5(2):157–177, 2004.
[2] L. Armijo. Minimization of functions having lipschitz continuous first partial derivatives.
Pacific Journal of Mathematics, 16:1–3, 1966.
[3] C. Audet and J. Dennis. Pattern search algorithm for mixed variable programming. SIAM
Journal on Optimization, 11:573–594, 2001.
[5] J. Birk, M. Liepelt, K. Schittkowski, and F. Vogel. Computation of optimal feed rates and
operation intervals for tubular reactors. Journal of Process Control, 9:325–336, 1999.
[8] C. Blum. Ant colony optimization: Introduction and recent trends. Physics of Life
Reviews, 2(4):353–373, 2005.
[9] C. Blum. The metaheuristics network web site: Ant colony optimization.
Http://www.metaheuristics.org/index.php?main=3&sub=31, 2008.
[10] P. Boderke, K. Schittkowski, M. Wolf, and H. Merkle. Modeling of diffusion and concurrent
metabolism in cutaneous tissue. Journal on Theoretical Biology, 204(3):393–407, 2000.
[12] F. Bonabeau, M. Dorigo, and G. Theraulaz. Inspiration for optimization from social insect
behaviour. Nature, 406:39–42, 2000.
[13] I. Bongartz, A. Conn, N. Gould, and P. Toint. Cute: Constrained and unconstrained
testing environment. Transactions on Mathematical Software, 21(1):123–160, 1995.
[14] J. Bonnans, E. Panier, A. Tits, and J. Zhou. Avoiding the maratos effect by means of a
nonmonotone line search, ii: Inequality constrained problems – feasible iterates,. SIAM
Journal on Numerical Analysis, 29:1187–1202, 1992.
313
Bibliography
[15] B. Borchers and J. Mitchell. An improved branch and bound algorithm for mixed integer
nonlinear programming,. Computers and Operations Research, 21(4):359–367, 1194.
[16] G. E. P. Box and M. E. Mller. A note on the generation of random normal deviates. Ann.
Math. Stat., 29(2):610–611, 1958.
[17] M. Bünner, K. Schittkowski, and G. van de Braak. Optimal design of electronic compo-
nents by mixed-integer nonlinear programming. Optimization and Engineering, 5:271–294,
2004.
[18] J. Burke. A robust trust region method for constrained nonlinear programming problems.
SIAM Journal on Optimization, 2:325–347, 1992.
[19] R. Byrd, N. Gould, J. Nocedal, and R. Waltz. An active-set algorithm for nonlinear pro-
gramming using linear programming and equality constrained subproblems. Mathematical
Programming B, 100:27–48, 2004.
[20] R. H. Byrd, R. B. Schnabel, and G. A. Schultz. A trust region algorithm for nonlinearly
constrained optimization. SIAM Journal on Numerical Analysis, 24:1152–1170, 1987.
[21] M. Celis. A trust region strategy for nonlinear equality constrained optimization. PhD
thesis, Department of Mathematics, Rice University, USA, 1983.
[22] R. Chelouah and P. Siarry. Genetic and nelder-mead algorithms hybridized for a more
accurate global optimization of continuous multiminima functions. Eur. J. Oper. Res.,
148(2):335–348, 2003.
[23] R. Chelouah and P. Siarry. A hybrid method combining continuous tabu search and nelder-
mead simplex algorithms for the global optimization of multiminima functions. Eur. J.
Oper. Res., 161(3):636–654, 2005.
[24] W. Chunfeng and Z. Xin. Ants foraging mechanism in the design of multiproduct batch
chemical process. Ind. Eng. Chem. Res., 41(26):6678–6686, 2002.
[25] C. A. Coello. Theoretical and numerical constraint-handling techniques used with evo-
lutionary algorithms: A survey of the state of the art. Comput. Method. Appl. M.,
191(11):1245–1287, 2002.
[26] A. Conn, N. Gould, and P. Toint. Trust-Region Methods. MPS/SIAM Series on Optimiza-
tion, 2000.
[27] Y. Dai. On the nonmonotone line search. Journal of Optimization Theory and Applications,
112(2):315–330, 2002.
[28] Y. Dai and K. Schittkowski. A sequential quadratic programming algorithm with non-
monotone line search. Pacific Journal of Optimization, 4:335–351, 2006.
[29] N. Deng, Y. Xiao, and Z. F.J. Nonmonotonic trust-region algorithm. Journal of Opti-
mization Theory and Applications, 26:259–285, 1993.
314
Bibliography
[30] J. Dennis. Some computational techniques for the nonlinear least squares problem. In
G. Byrne and C. Hall, editors, Numerical Solution of Systems of Nonlinear Algebraic
Equations. Academic Press, London, New York, 1973.
[31] J. Dennis. Nonlinear least squares. In D. Jacobs, editor, The State of the Art in Numerical
Analysis. Academic Press, London, New York, 1977.
[32] M. Dorigo. Optimization, Learning and Natural Algorithms. PhD thesis, Politecnico di
Milano (Italy), 1992.
[33] M. Dorigo, G. Di Caro, and L. M. Gambardella. Ant algorithms for discrete optimization.
Artifical Life, 5(2):137–172, 1999.
[34] M. Dorigo and T. Stuetzle. Ant Colony Optimization. MIT Press, 2004.
[35] J. Dreo and P. Siarry. An ant colony algorithm aimed at dynamic continuous optimization.
Appl. Math. Comput., 181(1):457–467, 2006.
[40] O. Exler, T. Lehmann, and K. Schittkowski. Misqp: A fortran subroutine of a trust region
SQP algorithm for mixed-integer nonlinear programming - user’s guide. Technical report,
Department of Mathematics, University of Bayreuth, 2012.
[41] O. Exler and K. Schittkowksi. A trust region SQP algorithm for mixed-integer nonlinear
programming. Optimization Letters, 3(1):269–280, 2007.
[44] R. Fletcher. Practical Methods of Optimization. John Wiley & Sons, Chichester, 2. ed.,
reprinted in paperback. edition, 2000.
[45] R. Fletcher and S. Leyffer. Solving mixed integer nonlinear programs by outer approxi-
mation. Mathematical Programming, 66(1-3):327–349, 1994.
[46] C. Fleury. CONLIN: an efficient dual optimizer based on convex approximation concepts.
Structural Optimization, 1:81–89, 1989.
315
Bibliography
[47] C. Fleury. First and second order convex approximation strategies in structural optimiza-
tion. Structural Optimization, 1:3–10, 1989.
[49] J. Frias, J. Oliveira, and K. Schittkowski. Modelling of maltodextrin de12 drying process
in a convection oven. Applied Mathematical Modelling, 24:449–462, 2001.
[50] R. Ge and Y. Qin. A class of filled functions for finding global minimizers of a function of
several variables. Journal of Optimization Theory and Applications, 54:241–252, 1987.
[51] A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Manchek, and V. Sunderam. PVM 3.0.
A User’s Guide and Tutorial for Networked Parallel Computing. The MIT Press, 1995.
[52] P. Gill and W. Murray. Algorithms for the solution of the non-linear least-squares problem.
CIAM Journal on Numerical Analysis, 15:977–992, 1978.
[53] P. Gill, W. Murray, M. Saunders, and M. Wright. User’s guide for sql/npsol: A fortran
package for nonlinear programming. Technical Report SOL 83-12, Dept. of Operations
Research, Standford University, California, 1983.
[54] P. Gill, W. Murray, and M. Wright. Practical Optimization. Academic Press, London,
New York, Toronto, Sydney, San Francisco, 1981.
[56] D. Goldfarb and A. Idnani. A numerically stable method for solving strictly convex
quadratic programs. Mathematical Programming, 27:1–33, 1983.
[57] L. Grippo, F. Lampariello, and S. Lucidi. A nonmonotone line search technique for new-
tons’s method. SIAM Journal on Numerical Analysis, 23:707–716, 1986.
[58] L. Grippo, F. Lampariello, and S. Lucidi. A truncated newton method with nonmonotone
line search for unconstrained optimization. Journal of Optimization Theory and Applica-
tions, 60:401–419, 1989.
[62] O. K. Gupta and V. Ravindran. Branch and bound experiments in convex non- linear
integer programming. Manage. Sci., 31:1533–1546, 1985.
316
Bibliography
317
Bibliography
[79] T. Lehmann, K. Schittkowski, and T. Spickenreuther. Miql: A fortran subroutine for con-
vex mixed-integer quadratic programming by branch-and-bound - user’s guide. Technical
report, Department of Computer Science, University of Bayreuth, Bayreuth, 2009.
[80] M. Lenard. A computational study of active set strategies in nonlinear programming with
linear constraints. Mathematical Programming, 16:81–97, 1979.
[81] K. Levenberg. A method for the solution of certain problems in least squares. Quarterly
of Applied Mathematics, 2:164–168, 1944.
[82] A. Levy and A. Montalvo. The tunneling algorithm for the global minimization of func-
tions. SIAM Journal on Scientific and Statistical Computing, 6:15–29, 1985.
[83] S. Leyffer. Integrating SQP and branch-and-bound for mixed integer nonlinear program-
ming. Computational Optimization and Application, 18:295–309, 2001.
[84] H. L. Li and C. T. Chou. A global approach for nonlinear mixed discrete programming in
design optimization. Engineering Optimization, 22:109–122, 1994.
[85] P. Lindström. A stabilized gauß-newton algorithm for unconstrained least squares prob-
lems. Technical Report UMINF-102.82, Institute of Information Processing, University of
Umea, Umea, Sweden, 1982.
[86] P. Lindström. A general purpose algorithm for nonlinear least squares problems with non-
linear constraints. Technical Report UMINF-103.83, Institute of Information Processing,
University of Umea, Umea, Sweden, 1983.
[87] S. Lucidi, F. Rochetich, and M. Roma. Curvilinear stabilization techniques for truncated
newton methods in large-scale unconstrained optimization. SIAM Journal on Optimiza-
tion, 8:916–939, 1998.
[90] N. Mahdavi-Amiri. Generally constrained nonlinear least squares and generating nonlin-
ear programming test problems: Algorithmic approach. PhD thesis, The John Hopkins
University, Baltimore, Maryland, USA, 1981.
[91] N. Mahdavi-Amiri and R. Bartels. Constrained nonlinear least squares: an exact penalty
approach with projected structured quasi-newton updates. ACM Transactions on Mathe-
matical Software (TOMS), 15:220–242, 1989.
318
Bibliography
[94] J. Moré. The Levenberg-Marquardt algorithm: implementation and theory, volume 630 of
Watson, G.(ed): Numerical Analysis, Lecture Notes in Mathematics. Springer, Berlin,
1977.
[95] J. J. Moré. Recent developments in algorithms and software for trust region methods. In
A. Bachem, M. Grötschel, and B. Korte, editors, Mathematical Programming: The State
of the Art, pages 258–287. Springer, Heidelberg, Berlin, New York, 1983.
[96] K. Musselmann and J. Tavalage. A trade-off cut approach to multiple objective optimiza-
tion. Operations Research, 28(6):1424–1435, 1980.
[97] Y. Nesterov and A. Nemirovskii. Interior Point Polynomial Methods in Convex Program-
ming. SIAM publications, 1994.
[98] J. Nocedal, A. Wächter, and R. A. Waltz. Adaptive barrier strategies for nonlinear interior
methods. Technical Report RC 23563, IBM T. J. Watson Research Center, March 2005;
revised January 2006.
[100] J. Ortega and W. Rheinbold. Iterative Solution of Nonlinear Equations in Several Vari-
ables. Academic Press, New York, San Francisco, London, 1970.
[101] E. Panier and A. Tits. Avoiding the maratos effect by means of a nonmonotone line
search, i: General constrained problems. SIAM Journal on Numerical Analysis, 28:1183–
1195, 1991.
[102] P. Papalambros and D. Wilde. Principles of Optimal Design. Cambridge University Press,
1988.
[103] J. Pinter. Global Optimization in Action. Kluwer Academic Publishers, Dordrecht, 1996.
[104] M. Powell. The convergence of variable metric methods for nonlinearly constrained opti-
mization calculations. In O. Mangasarian, R. Meyer, and S. Robinson, editors, Nonlinear
Programming, volume 3, pages 27–63. Academic Press, 1978.
[105] M. Powell. A fast algorithm for nonlinearly constraint optimization calculations, volume
630 of Watson, G. (ed.): Numerical Analysis, Lecture Notes in Mathematics. Springer,
Berlin, 1978.
[106] M. Powell. Zqpcvx, a fortran subroutine for convex quadratic programming. Technical
Report DAMTP/1983/NA17, University of Cambridge, 1983.
[107] M. Powell. On the global convergence of trust region algorithms for unconstrained mini-
mization. Mathematical Programming, 29:297–303, 1984.
[108] M. Powell. A direct search optimization method that models the objective and constraint
functions by linear interpolation. In S. Gomez and J.-P. Hennart, editors, Advances in
Optimization and Numerical Analysis, pages 51–67. Kluwer Academic, Dordrecht, 1994.
319
Bibliography
[109] M. Powell. A view of algorithms for optimization without derivatives. Technical Report
DAMTP 2007/Na 03, University of Cambridge, Cambridge, 2007.
[110] M. J. D. Powell and Y. Yuan. A trust region algorithm for equality constrained optimiza-
tion. Mathematical Programming, 49:189–211, 1991.
[112] H. Ramsin and P. Wedin. A comparison of some algorithms for the nonlinear least squares
problem. Nordisk Tidstr. Informationsbehandlung (BIT), 17:72–90, 1977.
[113] M. Raydan. The barzilai and borwein gradient method for the large-scale unconstrained
minimization problem. SIAM Journal on Optimization, 7:26–33, 1997.
[114] T. L. Saaty. A scaling method for priorities in hierarchical structures. Journal of Mathe-
matical Psychology, 15:234–281, 1977.
[116] K. Schittkowski. Nonlinear Programming Codes, volume 183 of Lecture Notes in Economics
and Mathematical Systems. Springer, 1980.
[117] K. Schittkowski. The nonlinear programming method of Wilson, Han and Powell. Part 1:
Convergence analysis. Numerische Mathematik, 38:83–114, 1981.
[118] K. Schittkowski. The nonlinear programming method of Wilson, Han and Powell. Part 2:
An efficient implementation with linear least squares subproblems. Numerische Mathematik,
38:115–127, 1981.
[119] K. Schittkowski. Nonlinear programming methods with linear least squares subproblems.
In J. M. Mulvey, editor, Evaluating Mathematical Programming Techniques, volume 199
of Lecture Notes in Economics and Mathematical Systems. Springer, 1982.
[124] K. Schittkowski. More Test Examples for Nonlinear Programming, volume 182 of Lecture
Notes in Economics and Mathematical Systems. Springer, 1987.
[125] K. Schittkowski. Solving nonlinear least squares problems by a general purpose SQP-
method. In K.-H. Hoffmann, J.-B. Hiriart-Urruty, C. Lemarechal, and J. Zowe, editors,
Trends in Mathematical Optimization, volume 84 of International Series of Numerical
Mathematics, pages 295–309. Birkhäuser, 1988.
[126] K. Schittkowski. Solving nonlinear programming problems with very many constraints.
Optimization, 25:179–196, 1992.
[127] K. Schittkowski. Parameter estimation in systems of nonlinear equations. Numerische
Mathematik, 68:129–142, 1994.
[128] K. Schittkowski. Numerical Data Fitting in Dynamical Systems. Kluwer Academic Pub-
lishers, Dordrecht, 2002.
[129] K. Schittkowski. Test problems for nonlinear programming - user’s guide. Technical report,
Department of Mathematics, University of Bayreuth, 2002.
[130] K. Schittkowski. QL: A Fortran code for convex quadratic programming - user's guide.
Technical report, Department of Mathematics, University of Bayreuth, 2003.
[131] K. Schittkowski. DFNLP: A Fortran implementation of an SQP-Gauss-Newton algorithm -
user's guide, version 2.0. Technical report, Department of Computer Science, University
of Bayreuth, 2005.
[132] K. Schittkowski. NLPQLP: A Fortran implementation of a sequential quadratic program-
ming algorithm with distributed and non-monotone line search - user's guide, version 2.2.
Technical report, Department of Computer Science, University of Bayreuth, 2006.
[133] K. Schittkowski. An active set strategy for solving optimization problems with up to
60,000,000 nonlinear constraints. Technical report, Department of Computer Science,
University of Bayreuth, 2007. Submitted for publication.
[134] K. Schittkowski. NLPLSQ: A Fortran implementation of an SQP-Gauss-Newton algorithm
for least-squares optimization - user's guide. Technical report, Department of Computer
Science, University of Bayreuth, 2007.
[135] K. Schittkowski. NLPMMX: A Fortran implementation of a sequential quadratic program-
ming algorithm for solving constrained nonlinear min-max problems - user's guide, version
1.0. Technical report, Department of Computer Science, University of Bayreuth, 2007.
[136] K. Schittkowski. NLPQLB: A Fortran implementation of a sequential quadratic programming
algorithm with active set strategy for solving optimization problems with a very large
number of nonlinear constraints - user's guide, version 2.0. Technical report, Department
of Computer Science, University of Bayreuth, 2007.
[137] K. Schittkowski. NLPQLB: A Fortran implementation of an SQP algorithm with active set
strategy for solving optimization problems with a very large number of nonlinear con-
straints - user's guide, version 2.0. Technical report, Department of Computer Science,
University of Bayreuth, 2007.
[139] K. Schittkowski. QL: A Fortran code for convex quadratic programming - user's guide.
Technical report, Department of Mathematics, University of Bayreuth, 2007.
[143] K. Schittkowski. NLPQLG: Heuristic global optimization - user's guide, version 2.1. Technical
report, Department of Computer Science, University of Bayreuth, 2008.
[144] K. Schittkowski. A collection of 186 test problems for nonlinear mixed-integer program-
ming in Fortran - user's guide. Technical report, Department of Computer Science, Uni-
versity of Bayreuth, Bayreuth, 2012.
[146] M. Schlüter, J. A. Egea, and J. R. Banga. Extended ant colony optimization for non-convex
mixed integer nonlinear programming. Computers & Operations Research, 36(7):2217–2229, 2009.
[147] A. T. Serban and G. Sandou. Mixed ant colony optimization for the unit commitment
problem. In B. Beliczynski, A. Dzielinski, M. Iwanowski, and B. Ribeiro, editors, Adaptive
and Natural Computing Algorithms, pages 332–340. Springer, Berlin, Heidelberg, 2007.
[148] K. Socha. ACO for continuous and mixed-variable optimization. In M. Dorigo, M. Birat-
tari, C. Blum, L. M. Gambardella, F. Mondada, and T. Stützle, editors, Ant Colony
Optimization and Swarm Intelligence, pages 25–36. Springer, Berlin, Heidelberg, 2004.
[149] K. Socha and M. Dorigo. Ant colony optimization for continuous domains. European
Journal of Operational Research, 185:1155–1173, 2008.
[151] J. Stoer. Foundations of recursive quadratic programming methods for solving nonlinear
programs. In K. Schittkowski, editor, Computational Mathematical Programming, vol-
ume 15 of NATO ASI Series, Series F: Computer and Systems Sciences. Springer, 1985.
[152] T. Stützle and M. Dorigo. ACO algorithms for the travelling salesman problem. In
K. Miettinen, M. Mäkelä, P. Neittaanmäki, and J. Periaux, editors, Evolutionary
Algorithms in Engineering and Computer Science, pages 163–183. John Wiley & Sons,
Chichester, 1999.
[153] K. Svanberg. The method of moving asymptotes - a new method for structural optimiza-
tion. International Journal for Numerical Methods in Engineering, 24:359–373, 1987.
[154] K. Svanberg. A globally convergent version of MMA without linesearch. In N. Olhoff and
G. Rozvany, editors, Proceedings of the First World Congress of Structural and Multidis-
ciplinary Optimization, pages 9–16. Pergamon, 1995.
[155] K. Svanberg. Two primal-dual interior-point methods for the MMA subproblems. Techni-
cal Report TRITA/MAT-98-OS12, Department of Mathematics, KTH, Stockholm, 1998.
[157] P. Toint. An assessment of nonmonotone line search techniques for unconstrained optimiza-
tion. SIAM Journal on Scientific Computing, 17:725–739, 1996.
[159] A. Törn and A. Zilinskas. Global Optimization, volume 350 of Lecture Notes in Computer
Science. Springer, Heidelberg, 1989.
[161] A. Wächter. An Interior Point Algorithm for Large-Scale Nonlinear Optimization with
Applications in Process Engineering. PhD thesis, Carnegie Mellon University, 2002.
[162] A. Wächter and L. T. Biegler. Line search filter methods for nonlinear programming:
Motivation and global convergence. SIAM Journal on Optimization, 16(1):1–31, 2005.
[164] F. Walz. An engineering approach: Hierarchical optimization criteria. IEEE Trans. Au-
tomatic Control, 12:179, 1967.
[165] T. Westerlund and R. Pörn. Solving pseudo-convex mixed integer optimization problems
by cutting plane techniques. Optim. Eng., 3:253–280, 2002.
[166] P. Wolfe. Convergence conditions for ascent methods. SIAM Review, 11:226–235, 1969.
[168] O. Yeniay. Penalty function methods for constrained optimization with genetic algorithms.
Mathematical and Computational Applications, 10:45–56, 2005.
[169] L. Yu, K. Liu, and K. Li. Ant colony optimization in continuous problem. Frontiers of
Mechanical Engineering in China, 2(4):459–462, 2007.
[170] Y. Yuan. On the superlinear convergence of a trust region algorithm for nonsmooth
optimization. Mathematical Programming, 31:269–285, 1985.
[171] Y. Yuan. On the convergence of a new trust region algorithm. Numerische Mathematik,
70:515–539, 1995.
[172] B. Zhang, D. Chen, and W. Zhao. Iterative ant-colony algorithm and its application to
dynamic optimization of chemical process. Computers & Chemical Engineering,
29(10):2078–2086, 2005.
[173] C. Zillober. A globally convergent version of the method of moving asymptotes. Structural
Optimization, 6:166–174, 1993.
[174] C. Zillober. A practical interior point method for a nonlinear programming problem
arising in sequential convex programming. Technical Report TR98-1, Informatik, Uni-
versität Bayreuth, WWW: www.uni-bayreuth.de/departments/math/~czillober/papers/-
tr98-1.ps, 1998.
[175] C. Zillober. A combined convex approximation – interior point approach for large scale
nonlinear programming. Optimization and Engineering, 2(1):51–73, 2001.
[176] C. Zillober. Global convergence of a nonlinear programming method using convex ap-
proximations. Numerical Algorithms, 27(3):256–289, 2001.
[177] C. Zillober. SCPIP — an efficient software tool for the solution of structural optimization
problems. Structural and Multidisciplinary Optimization, 24(5):362–371, 2002.