Global Optimization - Deterministic Approaches
With 55 Figures and 7 Tables
Springer
Professor Dr. Reiner Horst
University of Trier
Department of Mathematics
P.O. Box 3825
D-54286 Trier, Germany
Professor Dr. Hoang Tuy
Vien Toan Hoc
Institute of Mathematics
P.O. Box 631, Bo Ho
10000 Hanoi, Vietnam
This work is subject to copyright. All rights are reserved, whether the whole or part of the
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data
banks. Duplication of this publication or parts thereof is only permitted under the provisions
of the German Copyright Law of September 9, 1965, in its version of June 24, 1985, and a
copyright fee must always be paid. Violations fall under the prosecution act of the German
Copyright Law.
The use of registered names, trademarks, etc. in this publication does not imply, even in the
absence of a specific statement, that such names are exempt from the relevant protective
laws and regulations and therefore free for general use.
PREFACE TO THE THIRD EDITION
Most chapters contain quite a number of modifications and additions which take
into account the recent development in the field. Among other things, one finds
additional d.c. decompositions, new results and proofs on normal partitioning
procedures, optimality conditions, outer approximation methods and design
centering as well as revisions in the presentation of some basic concepts.
The main contents and character of the monograph did not change with respect
to the first edition. However, within most chapters we incorporated quite a number
of modifications which take into account the recent development of the field, the
very valuable suggestions and comments that we received from numerous colleagues
and students as well as our own experience while using the book. Some errors and
misprints in the first edition are also corrected.
The enormous practical need for solving global optimization problems coupled
with a rapidly advancing computer technology has allowed one to consider problems
which a few years ago would have been considered computationally intractable. As a
consequence, we are seeing the creation of a large and increasing number of diverse
algorithms for solving a wide variety of multiextremal global optimization problems.
The goal of this book is to systematically clarify and unify these diverse
approaches in order to provide insight into the underlying concepts and their pro-
perties. Aside from a coherent view of the field much new material is presented.
Standard nonlinear programming techniques have not been successful for solving
these problems. Their deficiency is due to the intrinsic multiextremality of the
formulation and not to the lack of smoothness or continuity, for often the latter
properties are present. One can observe that local tools such as gradients, subgra-
dients, and second order constructions such as Hessians, cannot be expected to yield
more than local solutions. One finds, for example, that a stationary point is often
detected for which there is even no guarantee of local minimality. Moreover, deter-
mining the local minimality of such a point is known to be NP-hard in the sense of
computational complexity even in relatively simple cases. Apart from this deficiency
in the local situation, classical methods do not recognize conditions for global
optimality.
For these reasons global solution methods must be significantly different from
standard nonlinear programming techniques, and they can be expected to be - and
are - much more expensive computationally. Throughout this book our focus will
be on typical procedures that respond to the inherent difficulty of multiextremality
and which take advantage of helpful specific features of the problem structure. In
certain sections, methods are presented for solving very general and difficult global
problems, but the reader should be aware that difficult large scale global optimiza-
tion problems cannot be solved with sufficient accuracy on currently available
computers. For these very general cases our exposition is intended to provide useful
tools for transcending local optimality restrictions, in the sense of providing valuable
information about the global quality of a given feasible point. Typically, such in-
formation will give upper and lower bounds for the optimal objective function value
and indicate parts of the feasible set where further investigations of global optima-
lity will not be worthwhile.
On the other hand, in many practical global optimizations, the multiextremal
feature involves only a small number of variables. Moreover, many problems have
additional structure that is amenable to large scale solutions.
With the current state of the art, these properties are best exploited by determini-
stic methods that combine analytical and combinatorial tools in an effective way.
We find that typical approaches use techniques such as branch and bound, relaxa-
tion, outer approximation, and valid cutting planes, whose basic principles have long
appeared in the related fields of integer and combinatorial optimization as well as
convex minimization. We have found, however, that application of these fruitful
ideas to global optimization is raising many new interesting theoretical and compu-
tational questions whose answers cannot be inferred from previous successes. For
example, branch and bound methods applied to global optimization problems
generate infinite processes, and hence their own convergence theory must be deve-
loped. In contrast, in integer programming these are finite procedures, and so their
convergence properties do not directly apply. Other examples involve important
results in convex minimization that reflect the coincidence of local and global solu-
tions. Here also one cannot expect a direct application to multiextremal global
minimization.
(a) minimization of concave functions subject to linear and convex constraints (i.e.,
"concave minimization");
(b) convex minimization over the intersection of convex sets and complements of
convex sets (i.e., "reverse convex programming"); and
(c) global optimization of functions that can be expressed as a difference of two convex
functions (i.e., "d.c. programming").
Another large class of global optimization that we shall discuss in some detail has
been termed "Lipschitz programming", where now the functions in the formulation
are assumed to be Lipschitz continuous on certain subsets of their domains. Al-
though neither of the aforementioned properties (i)-(ii) is necessarily satisfied in
Lipschitz problems, much can be done here by applying the basic ideas and tech-
niques which we shall develop for the problem classes (a), (b), and (c) mentioned
above.
Finally, we also demonstrate how global optimization problems are related to solving
systems of equations and/or inequalities. As a by-product, then, we shall present
some new solution methods for solving such systems.
The underlying purpose of this book is to present general methods in such a way
as to enhance the derivation of special techniques that exploit frequently encoun-
tered additional problem structure. The multifaceted approach is manifested occa-
sionally in some computational results for these special but abundant problems.
However, at the present stage, these computational results should be considered as
preliminary.
Part A introduces the main global optimization problem classes we study, and
develops some of their basic properties and applications. It then discusses the funda-
mental concepts that unify the various general methods of solution, such as outer
approximation, concavity cuts, and branch and bound.
The technical prerequisites for this book are rather modest, and are within reach
of most advanced undergraduate university programs. They include a sound know-
ledge of elementary real analysis, linear algebra, and convexity theory. No familia-
rity with any other branch of mathematics is required.
Our special thanks are due to Frau Rita Feiden for the efficient typing and
patient retyping of the many drafts of the manuscript.
PART A
Part A introduces the main global optimization problem classes we study, and
develops some of their basic properties and applications. It then discusses some
fundamental concepts that unify the various general methods of solution, such as
outer approximation, concavity cuts, and branch and bound.
CHAPTER I
1. GLOBAL OPTIMIZATION
f(x*) ≤ f(x) for all x ∈ D, or show that such a point does not exist.
sets D and objective functions f that are continuous in the (relative) interior of D,
denoted by min f(D). The set of all solutions of problem (1) will be denoted by
argmin f(D).
Note that since max f(D) = −min (−f)(D), maximization problems are covered by formulation (1) as well.
Definition 1.1. A closed subset D ⊂ IRn is called robust if it is the closure of an open
set.
Note that a convex set D ⊂ IRn with nonempty interior is robust (cf., e.g.,
Rockafellar (1970), Theorem 6.3).
Let ‖·‖ denote the Euclidean norm in IRn and let ε > 0 be a real number. Then an
(open) ε-neighbourhood of a point x* ∈ IRn is defined as the open ball
N(x*, ε) = {x ∈ IRn: ‖x − x*‖ < ε}.
In order to understand the enormous difficulties inherent in global optimization
problems and the computational cost of solving them, it is important to notice that
all standard techniques in nonlinear optimization can at most locate local minima.
Moreover, there is no local criterion for deciding whether a local solution is global.
Therefore, conventional methods of optimization using such tools as derivatives,
gradients, subgradients and the like, are, in general, not capable of locating or
identifying a global optimum.
Remark 1.1. Several global criteria for a global minimizer have been proposed. Let
D be bounded and robust and let f be continuous on D. Denote by μ(M) a measure of
a subset M ⊂ IRn. Furthermore, let x* ∈ D and
Note that certain important classes of optimization problems have the property
Due to the inherent difficulties mentioned above, the methods devised for analys-
ing multiextremal global optimization problems are quite diverse and significantly
different from the standard tools referred to above.
Though several general theoretical concepts exist for solving problem (1), in order
to build a numerically promising implementation, additional properties of the prob-
lem's data usually have to be exploited.
Convexity, for example, will often be present. Many problems have linear con-
straints. Other problems involve Lipschitzian functions with known Lipschitz con-
stants.
In recent years a rapidly growing number of proposals has been published for
tain recent basic approaches. Knowledge of these approaches not only leads to a
This book presents certain deterministic concepts used in many methods for
for further research. These concepts will be applied to derive algorithms for solving
encountered in applications.
assume the reader to be familiar with basic notions and results on convexity.
Definition 1.4. Let h: IRn → IR be a convex function. Then the inequality h(x) ≤ 0 is
called convex whereas the inequality h(x) ≥ 0 is called reverse convex.
Feasible Set D:
- convex and defined by finitely many convex inequalities,
- intersection of a convex set with finitely many complements of a convex set and
defined by finitely many convex and finitely many reverse convex inequalities,
- defined by finitely many Lipschitzian inequalities.
Objective Function f:
- convex,
- concave,
- d.c.,
- Lipschitzian,
- certain generalizations of these four classes.
The sections that follow contain an introduction to some basic properties of the
problems introduced above. Many applications will be described, and various
connections between these classes will be revealed.
2. CONCAVE MINIMIZATION
where D ⊂ IRn is nonempty, closed and convex, and where f: A → IR is concave on a
suitable set A ⊂ IRn containing D.
Example 1.1. Let f(x) = −‖x‖², where ‖·‖ denotes the Euclidean norm, and let
D = {x ∈ IRn: a ≤ x ≤ b} with a, b ∈ IRn, a < 0, b > 0 (all inequalities are understood
with respect to the componentwise order of IRn). It is easily seen that every vertex of
D is a local minimizer of f over D: let v = (v_1,...,v_n)^T be a vertex of D. Then there
are two index sets I_1, I_2 ⊂ {1,...,n} satisfying I_1 ∪ I_2 = {1,...,n}, I_1 ∩ I_2 = ∅ such
that we have
Let
Clearly, we have
where 0 ≤ y_i ≤ ε.
By the definition of ε, it follows that we have
Besides convexity of the feasible set D which will be heavily exploited in the
design of algorithms for solving (2), the most interesting property is that a concave
function f attains its global minimum over D at an extreme point of D.
Proof. The global minimum of f over the compact set D exists by the well-known
Theorem of Weierstraß, since a concave function defined on IRn is continuous
everywhere. It suffices to show that for every x ∈ D there is an extreme point v of D
such that f(x) ≥ f(v) holds.
By the Theorems of Krein-Milman/Caratheodory, there is a natural number
k ≤ n+1 such that
$$x = \sum_{i=1}^{k} \lambda_i v^i, \qquad \sum_{i=1}^{k} \lambda_i = 1, \qquad \lambda_i \ge 0 \ (i=1,\dots,k), \qquad (3)$$

where v^i (i=1,...,k) are extreme points of D. Let v satisfy f(v) = min {f(v^i):
i=1,...,k}. Then we see from the concavity of f and from (3) that we have

$$f(x) \ge \sum_{i=1}^{k} \lambda_i f(v^i) \ge f(v) \Big( \sum_{i=1}^{k} \lambda_i \Big) = f(v). \qquad \blacksquare$$
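As a small numerical illustration of the property just proved — the sketch below, including the instance and all helper names, is ours and not from the book — the concave function of Example 1.1 is minimized over a box by enumerating its 2^n vertices, and the result is checked against a coarse grid search:

```python
import itertools

def f(x):
    # concave objective from Example 1.1: f(x) = -||x||^2
    return -sum(t * t for t in x)

# box D = {x: a <= x <= b} with a < 0 < b componentwise
a = [-1.0, -2.0, -0.5]
b = [2.0, 1.0, 1.5]

# the global minimum of a concave function over the compact polyhedron D
# is attained at an extreme point, so scanning the 2^n vertices suffices
vertices = itertools.product(*zip(a, b))
v_best = min(vertices, key=f)

# sanity check against a coarse grid over the whole box
axes = [[lo + i * (hi - lo) / 10 for i in range(11)] for lo, hi in zip(a, b)]
x_grid = min(itertools.product(*axes), key=f)

print("best vertex:", v_best, "f =", f(v_best))
assert f(v_best) <= f(x_grid) + 1e-12  # no interior grid point does better
```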
From the viewpoint of computational complexity, a concave minimization prob-
lem is NP-hard, even in such special cases as that of minimizing a quadratic con-
cave function over very simple polytopes such as hypercubes (e.g., Pardalos and
Schnitger (1987)). One exception is the negative Euclidean norm that can be
minimized over a hyperrectangle by an O(n) algorithm (for related work see
Gritzmann and Klee (1988), Bodlaender et al. (1990)). Among the few practical
instances of concave minimization problems for which polynomial time algorithms
have been constructed are certain production scheduling, production-transportation,
and inventory models that can be regarded as special network flow problems (e.g.,
Zangwill (1968 and 1985), Love (1973), Konno (1973 and 1988), Afentakis et al.
(1984), Tuy et al. (1995)). More details on complexity of concave minimization and
related problems can be found in Horst et al. (1995), Vavasis (1991, 1995).
Many problems consist in choosing the levels x_i of n activities i=1,...,n restricted
to a_i ≤ x_i ≤ b_i, a_i, b_i ∈ IR_+, a_i < b_i, producing independent costs f_i: [a_i,b_i] → IR_+,
subject to additional convex or (in most applications) linear inequality constraints.
The objective function f(x) = Σ_{i=1}^n f_i(x_i) is separable, i.e., the sum of n functions
f_i(x_i). Each f_i typically reflects the fact that the activities incur a fixed setup cost
when the activity is started (positive jump at xi = 0) as well as a variable cost
related to the level of the activity.
If the variable cost is linear and the setup cost is positive, then f_i is nonlinear and
concave and we have the objective functions of the classical fixed charge problems
(e.g., Murty (1969), Bod (1970), Steinberg (1970), Cabot (1974)). Frequently
encountered special examples include fixed charge transportation problems and
capacitated as well as uncapacitated plant location (or site selection) problems (e.g.,
Manne (1964), Gray (1971), Dutton et al. (1974), Barr et al. (1981)). Multilevel fixed
charge problems have also been discussed by Jones and Soland (1969). Interactive
fixed charge problems were investigated by Kao (1979), Erenguc and Benson (1986),
Benson and Erenguc (1988) and references therein.
Often, price breaks and setup costs yield concave functions fi having piecewise
linear variable costs. An early problem of this type was the bid evaluation problem
(e.g., Bracken and McCormick (1968), and Horst (1980a)). Moreover, piecewise
linear concave functions frequently arise in inventory models and in connection with
constraints that form so-called Leontiev substitution systems (e.g., Zangwill (1966),
Veinott (1969), and Koehler et al. (1975)).
Nonlinear concave variable costs occur whenever it is assumed that as the number
of units of a product increases, the unit cost strictly decreases (economies of scale).
Different concave cost functions that might be used in practice are discussed by, e.g.,
Zwart (1974). Often the concave functions f_i are assumed to be quadratic (or
approximated by quadratic functions); the most well-known examples are quadratic
(1957), Cabot and Francis (1974), Bhatia (1981), Bazaraa and Sherali (1982),
Florian (1986)).
Many other situations arising in practice lead to minimum concave objective
by many authors, e.g., Ritter (1965 and 1966), Cabot and Francis (1970), Balas
(1975a), Tammer (1976), Konno (1976a and 1980), Gupta and Sharma (1983), Rosen
(1983, 1984 and 1984a), Aneja et al. (1984), Kalantari (1984), Schoch (1984), Thoai
(1984), Pardalos (1985, 1987, 1988 and 1988a), Benacer and Tao (1986), Kalantari
and Rosen (1986 and 1987), Pardalos and Rosen (1986 and 1987), Rosen and
Pardalos (1986), Thoai (1987), Tuy (1987), Pardalos and Gupta (1988), Warga
(1992), Bomze and Danninger (1994), Horst et al. (1995), Horst and Thoai (1995),
creasing marginal values. Some models yield quasiconcave rather than concave ob-
jective functions that in many respects behave like concave functions in minim-
ization problems. Moreover, quasiconcave functions are often concave in the region
of interest (e.g., Rössler (1971) and references therein). Sometimes these functions
can be suitably transformed into concave functions that yield equivalent minim-
ization problems.
Many other specific problems lead directly to concave minimization; examples are
discussed in, e.g., Grotte (1975), and McCormick (1973 and 1983). Of particular
interest are certain engineering design problems. Many of them can be formulated as
suitable concave objective network problems (see above). Other examples arise from
VLSI chip design (Watanabe (1984)), in the fabrication of integrated circuits
(Vidigal and Director (1982)), and in diamond cutting (Nguyen et al. (1985)). The
last two examples are so-called design centering problems that turn out to be linear,
concave or d.c. programming problems (cf. Vidigal and Director (1982), Thach
One of the most challenging classes of optimization problems with a wide range of
applications is integer programming. These are extremum problems with a discrete
feasible set.
equivalence is understood in the sense that the sets of optimal solutions coincide.
This equivalence is well-known for the quadratic assignment problem (e.g., Baza-
raa and Sherali (1982), Lawler (1963)) and the 3-dimensional assignment problem
(e.g., Frieze (1974)). The zero-one integer linear programming problem was reduced
Let C ⊆ IRn, B = {0,1}, B^n = B×...×B (n times), f: IRn → IR. Consider the integer
programming problem

minimize f(x)
s.t. x ∈ C ∩ B^n     (4)

Define e := (1,1,...,1)^T ∈ IRn, E := {x ∈ IRn: 0 ≤ x ≤ e} and associate with (4) the
nonlinear problem

minimize f(x) + μ x(e − x)
s.t. x ∈ C ∩ E     (5)

which depends on the real number μ. Then the following connection between (4) and
(5) holds.
Proof. (i): Set φ(x) := x(e − x). The function φ(x) is continuous on E, and
obviously we have

φ(x) ≥ 0 ∀x ∈ E, and φ(x) = 0 if and only if x ∈ B^n.

We show that there is a μ₀ ∈ IR such that, whenever we have μ > μ₀, then the
global minimum of f(x) + μφ(x) over C ∩ E is attained on B^n.
First, note that, for all y ∈ B^n, there exists an open neighbourhood
N(y, ε) = {x ∈ IRn: ‖x − y‖ < ε} such that for all x ∈ N(y,ε) ∩ (E \ B^n) we have
$$\varphi(x) = r \sum_{j=1}^{n} |u_j| - r^2 \sum_{j=1}^{n} |u_j|^2 .$$

Using 0 < r < ε < 1, Σ_{j=1}^n |u_j|² = (1/r²)‖x−y‖² = 1 and Σ_{j=1}^n |u_j| ≥ ‖u‖ = 1, we finally
see that

φ(x) ≥ r − r² = ‖x−y‖(1 − ‖x−y‖) > 0

holds.
$$F_y(x) := \frac{f(y) - f(x)}{\varphi(x)}, \quad x \in C_2 \setminus C_1, \ y \in C_1 ,$$

$$|F_y(x)| \le \frac{L}{1-\varepsilon} < +\infty \quad \forall x \in A(y,\varepsilon) \cap (C_2 \setminus C_1), \ y \in C_1 .$$
The family of sets {A(y^i, ε): y^i ∈ B^n} is a finite cover of C_1. Let k = 2^n and consider

$$C_3 := \Big( \bigcup_{i=1}^{k} A(y^i, \varepsilon) \Big) \cap C_2 .$$
(8)
holds.
Finally, consider the compact set
m_f := min f(C₂) and M_f := max f(C₂) exist. By a similar argument, we see that also
m_φ := min φ(C₄) exists. Note that we have φ(x) > 0 on C₄, hence m_φ > 0 and

$$\mu_2 := \frac{M_f - m_f}{m_\varphi} \ge 0 .$$
It follows that
(ii): Denote by ∇²f the Hessian matrix of f. The Hessian ∇²f exists on E and
its elements are bounded there. Then, by a well-known criterion on definiteness of
symmetric matrices and diagonal dominance, there is a μ̃ > 0 such that, whenever
μ > μ̃, the Hessian ∇²f − diag(2μ) of f(x) + μ x^T(e−x) is negative semidefinite on E,
and this implies concavity of f(x) + μ x^T(e−x).
From the above, Theorem 1.2 follows for μ₀ = max {μ̃, μ₁, μ₂}. ∎
Now let C be a convex set. Then problem (5) is a concave minimization problem
whenever μ > μ₀, and Theorem 1.2 shows that large classes of integer programming
problems are equivalent to concave minimization problems.
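The following minimal sketch (instance, names and the brute-force search are ours, not the book's) illustrates the construction behind Theorem 1.2: for sufficiently large μ the penalty μ x(e−x) vanishes exactly on B^n and dominates f elsewhere, so minimizing (5) reproduces the integer optimum of (4):

```python
import itertools

# toy instance of (4): minimize a linear f over C ∩ B^n, with C = IR^n here
c = [3.0, -5.0, 2.0, -1.0]
def f(x):
    return sum(ci * xi for ci, xi in zip(c, x))

def penalized(x, mu):
    # objective of (5): f(x) + mu * x(e - x); the penalty vanishes only on B^n
    return f(x) + mu * sum(xi * (1.0 - xi) for xi in x)

# exact binary optimum of (4) by enumeration
x_int = min(itertools.product((0.0, 1.0), repeat=len(c)), key=f)

# minimize (5) over a grid in E = [0,1]^n for a large mu
mu = 100.0
grid = [i / 4 for i in range(5)]
x_pen = min(itertools.product(grid, repeat=len(c)),
            key=lambda x: penalized(x, mu))

print("binary optimum of (4):", x_int)
print("grid optimum of (5):  ", x_pen)  # lands on the same 0-1 point
```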
Important special classes of integer programming problems are the integer linear
programming problem
minimize cx
s.t. Ax ≤ b, x ∈ B^n,     (10)

where c ∈ IRn, b ∈ IRm, A ∈ IRm×n, and the integer quadratic programming problem

minimize cx + ½ x(Cx)
s.t. Ax ≤ b, x ∈ B^n,     (11)

which adds to the objective function of (10) a quadratic term with matrix C ∈ IRn×n
(assumed symmetric without loss of generality). When the feasible sets in (10) and
(11) are not empty, the assumptions of Theorem 1.2 are obviously satisfied for
Estimates for the parameter μ in Theorem 1.2 can be found in Borchardt (1980),
Kalantari and Rosen (1987a), Horst et al. (1995).
Note that problem (4) also covers the cases where x ∈ B^n is replaced by
x_j ∈ IN ∪ {0}, and the variables x_j are bounded. A simple representation of x_j by
(0-1)-variables is then

$$x_j = \sum_{i=0}^{K} y_{ij} \, 2^i, \qquad y_{ij} \in \{0,1\},$$
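A hypothetical helper (the function names are ours) illustrating this 0-1 representation of a bounded integer variable:

```python
def binary_expansion(K):
    # weights 2^0, ..., 2^K for the representation x_j = sum_i y_ij * 2^i
    return [2 ** i for i in range(K + 1)]

def encode(x, K):
    # 0-1 variables y_0, ..., y_K representing the bounded integer x
    assert 0 <= x < 2 ** (K + 1)
    return [(x >> i) & 1 for i in range(K + 1)]

def decode(y):
    return sum(yi * w for yi, w in zip(y, binary_expansion(len(y) - 1)))

# every integer 0 <= x_j <= 2^(K+1) - 1 is reproduced exactly
K = 4
assert all(decode(encode(x, K)) == x for x in range(2 ** (K + 1)))
```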
cases (e.g., Bazaraa and Sherali (1982), Frieze (1974)). The connections discussed
above, however, may lead to an adaptation of typical ideas used in concave minimi-
zation to approaches for solving certain integer problems (e.g., Adams and Sherali
(1986), Beale and Forrest (1978), Erenguc and Benson (1987), and Glover and
The bilinear programming problem considered here is

minimize f(x,y) = px + qy + x(Cy)
s.t. x ∈ X, y ∈ Y,     (12)

where X, Y are given closed convex polyhedral sets in IRn, IRm respectively, and
p ∈ IRn, q ∈ IRm, C ∈ IRn×m. Problem (12) was studied in the bimatrix game context
by, e.g., Mangasarian (1964), Mangasarian and Stone (1964), Altman (1968).
Further applications include dynamic Markovian assignment problems, multi-
commodity network flow problems and certain dynamic production problems. An
extensive discussion of applied problems which can be formulated as bilinear pro-
gramming problems is given by Konno (1971a).
The first solution procedures were either locally convergent (e.g., Altman (1968),
Cabot and Francis (1970» or completely enumerative (e.g., Mangasarian and Stone
(1964».
Cabot and Francis (1970) proposed an extreme point ranking procedure. Subse-
quent solution methods included various relaxation and cutting plane techniques
(e.g., Gallo and Ülkucü (1977), Konno (1971, 1971a and 1976), Vaish and Shetty
(1976 and 1977), Sherali and Shetty (1980a)), or branch and bound approaches (e.g.,
Falk (1973), Al-Khayyal (1990)). A general treatment of cutting-plane methods
and branch and bound techniques will be given in subsequent chapters. Related
with f_1, f_2 convex, and where the constraints form a convex set in IRn+m, is given in
Al-Khayyal and Falk (1983).
Most of the methods mentioned above use in an explicit or implicit way the close
relationship between problem (12) and concave minimization. Formulations of bilin-
ear problems as concave minimization problems are discussed, e.g., in Altman
(1968), Konno (1976), Gallo and Ülkucü (1977) and Thieu (1980).
Aggarwal and Floudas (1990) and Hansen and Jaumard (1992) discuss different ways to
reduce linearly constrained concave quadratic problems to bilinear problems. Frieze
(1974) has reduced the 3-dimensional assignment problem to a bilinear
programming problem and then to a special concave programming problem.
Theorem 1.3. In problem (12) assume that Y has at least one vertex and that for
every x ∈ X the linear program

min {qy + x(Cy): y ∈ Y}     (13)

has a solution. Then problem (12) can be reduced to a concave minimization problem
with piecewise linear objective function and linear constraints.
Proof. Note first that the assumptions of Theorem 1.3 are both satisfied if Y is
nonempty and compact. Denote by V(Y) the set of vertices of Y. It is known from
the theory of linear programming that for every x ∈ X the solution of (13) is attained
at least at one vertex of Y. Problem (12) can be restated as

minimize f(x) s.t. x ∈ X, where f(x) := min {f(x,y): y ∈ V(Y)}.

The set V(Y) is finite, and for each y ∈ V(Y), f(x,y) is an affine function of x.
Thus, f(x) is the pointwise minimum of a finite family of affine functions, and hence
is concave and piecewise linear. ∎
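A small numerical sketch of this reduction (the instance and names are ours, assuming the bilinear objective f(x,y) = px + qy + x(Cy) of (12)): the function f(x) = min {f(x,y): y ∈ V(Y)} is evaluated directly as a pointwise minimum of affine functions and then minimized crudely over a grid in X:

```python
import itertools

# tiny instance: Y = [0,1]^2, X approximated by a grid over [0,2]^2
p = [1.0, -1.0]
q = [0.5, 2.0]
C = [[1.0, -2.0], [0.0, 1.0]]

def mat_vec(M, y):
    return [sum(mij * yj for mij, yj in zip(row, y)) for row in M]

def f_xy(x, y):
    Cy = mat_vec(C, y)
    return (sum(pi * xi for pi, xi in zip(p, x))
            + sum(qi * yi for qi, yi in zip(q, y))
            + sum(xi * t for xi, t in zip(x, Cy)))

# Theorem 1.3: it suffices to scan the vertices V(Y) of the box Y
V_Y = list(itertools.product((0.0, 1.0), repeat=2))

def f(x):
    # concave, piecewise linear: pointwise minimum of affine functions of x
    return min(f_xy(x, y) for y in V_Y)

x_best = min(itertools.product([i / 10 for i in range(21)], repeat=2), key=f)
print("approximate minimizer over X:", x_best, "value:", f(x_best))
```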
In order to obtain a converse result, let f(x) := 2px + x(Cx), where p ∈ IRn and
C ∈ IRn×n is symmetric and negative semidefinite. Consider the quadratic concave
programming problem
Theorem 1.4. Under the assumptions above the following equivalence holds:
If x̄ solves (14), then (x̄, x̄) solves (15). If (x̄, ȳ) solves (15), then x̄ and ȳ solve
(14).
Obviously, we have
(x − y)(C(x − y)) ≤ 0
(20)
sets.
Note as well that, by Theorem 1.2 and Theorem 1.4, the integer linear
Let D ⊆ IRn, g,h: IRn → IRm. The problem of finding x ∈ D such that
enormous number of papers and some excellent textbooks. We only cite here the
monographs of Lüthi (1976), Garcia and Zangwill (1981), Gould and Tolle (1983),
Murty (1988), Cottle et al. (1992), and the surveys of Al-Khayyal (1986a) and Pang
(1995).
A frequently studied special case is the linear complementarity problem which is
Theorem 1.5. In (22) let D be a convex set and let g and h be concave mappings
on D. Suppose that (22) has a solution. Then (22) is equivalent to the concave minim-
ization problem
minimize f(x) := Σ_{i=1}^m min {g_i(x), h_i(x)}     (23)
s.t. g(x) ≥ 0, h(x) ≥ 0, x ∈ D.
Proof. Let x* be a solution of (22). Then, obviously, x* is feasible for (23). From
g(x*) ≥ 0, h(x*) ≥ 0, and g(x*)h(x*) = 0 it follows that f(x*) = 0. But all feasible
points x of (23) satisfy f(x) ≥ 0; hence x* is an optimal solution of (23).
Conversely, since (22) has a solution, every optimal solution x* of
(23) has to satisfy f(x*) = 0. Thus we have g(x*) ≥ 0, h(x*) ≥ 0, min {g_i(x*),
h_i(x*)} = 0 (i=1,...,m), i.e., x* solves (22).
Again, since the convexity of D and the concavity of g and h were not used in the
above argument, the equivalence of (22) and (23) remains valid in more general settings.
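To make the reduction of Theorem 1.5 concrete, here is a small sketch — the instance, the names, and the brute-force search standing in for a global method are ours, not the book's — for a linear complementarity problem x ≥ 0, Mx + r ≥ 0, x(Mx + r) = 0, recast as minimizing the concave objective of (23) with g(x) = x and h(x) = Mx + r:

```python
import itertools

M = [[2.0, 1.0], [1.0, 3.0]]
r = [-4.0, -5.0]

def h(x):  # h(x) = Mx + r
    return [sum(mij * xj for mij, xj in zip(row, x)) + ri
            for row, ri in zip(M, r)]

def obj(x):  # concave objective of (23): sum_i min{x_i, (Mx + r)_i}
    return sum(min(xi, hi) for xi, hi in zip(x, h(x)))

def feasible(x):
    return all(xi >= 0 for xi in x) and all(hi >= -1e-9 for hi in h(x))

grid = [i / 10 for i in range(41)]  # coarse search over [0, 4]^2
best = min((x for x in itertools.product(grid, repeat=2) if feasible(x)),
           key=obj)
print("candidate solution:", best, "objective:", obj(best))  # ~0 at a solution
```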
In the linear case, when D is a convex polyhedral set, and h and g are affine
above relationship with concave minimization is proposed by Thoai and Tuy (1983),
Tuy et al. (1985); see also Al-Khayyal (1986a and 1987), Pardalos and Rosen (1988)
s.t. Ax + By ≤ b, x, y ≥ 0
where x, c ∈ IRn, y, d ∈ IRm, A ∈ IRs×n, B ∈ IRs×m, and b ∈ IRs.
The relationship between problem (24) and concave minimization was discussed
Theorem 1.6. Let the feasible set of (24) be nonempty and compact. The problem
(24) is equivalent to a concave minimization problem with piecewise linear objective
function and linear constraints.
i.e., P := {x ∈ IRn: x ≥ 0, Ax + By ≤ b for at least one y ≥ 0}. Further, for x ∈ P, let
D_x := {y ∈ IRm: By ≤ b − Ax, y ≥ 0}. Then (24) may be rewritten as
maximize (f(x) + cx)
s.t. x ∈ P,

where f(x) := min {dy: y ∈ D_x} is the optimal value function of the inner linear program. ∎
Recall from Section I.1 that a real-valued function f defined on a convex set
C ⊆ IRn is called d.c. on C if, for all x ∈ C, f can be expressed in the form

f(x) = p(x) − q(x),     (25)

where p and q are convex on C. The function f is called d.c. if it is d.c. on IRn. The representation (25) is said to be a
d.c. decomposition of f.
minimize f(x)
s.t. x ∈ C, g_j(x) ≤ 0 (j=1,...,m)     (26)
Note that C is usually given by a system of convex inequalities, so that, for the
sake of simplicity of notation, we will sometimes set C = IRn without loss of general-
ity.
Clearly, every concave minimization problem is also a d.c. programming problem,
and it follows from Example 1.1 (Section 1.2.1) that (26) is a multi-extremal optim-
ization problem.
The following results show that the class of d.c. functions is very rich and, more-
over, that it enjoys a remarkable stability with respect to operations frequently en-
countered in optimization.
Theorem 1.7. Let f, f_i (i=1,...,m) be d.c. Then the following functions are also
d.c.:

(i) Σ_{i=1}^m λ_i f_i for any real numbers λ_i;

(ii) max {f_i: i=1,...,m} and min {f_i: i=1,...,m};

(iii) |f(x)|, f⁺(x) := max {0, f(x)} and f⁻(x) := min {0, f(x)}.
Therefore, we have

$$\max_{i=1,\dots,m} f_i = \max_{i=1,\dots,m} \{p_i - q_i\} = \max_{i=1,\dots,m} \Big\{ p_i + \sum_{\substack{j=1 \\ j \ne i}}^{m} q_j \Big\} - \sum_{j=1}^{m} q_j .$$
This is a d.c. decomposition, since the sum and the maximum of finitely many
convex functions are convex.
Similarly, we see that
$$\min_{i=1,\dots,m} f_i = \sum_{j=1}^{m} p_j + \min_{i=1,\dots,m} \Big\{ -\Big(\sum_{\substack{j=1 \\ j \ne i}}^{m} p_j\Big) - q_i \Big\} = \sum_{j=1}^{m} p_j - \max_{i=1,\dots,m} \Big\{ \Big(\sum_{\substack{j=1 \\ j \ne i}}^{m} p_j\Big) + q_i \Big\}$$
is d.c.
(iii): Suppose that we have p(x) ≥ q(x). Then |f(x)| = p(x) − q(x) = 2p(x) −
(p(x) + q(x)) holds.
Now let p(x) < q(x). Then we have |f(x)| = q(x) − p(x) = 2q(x) − (p(x) + q(x)).
Hence, it follows that

|f(x)| = 2 max {p(x), q(x)} − (p(x) + q(x)),

which is a d.c. decomposition of |f|.
Further operations preserving the d.c. structure exist
which will not be used in this book (e.g., Tuy (1995)). An example is the product of
two d.c. functions (e.g., Hiriart-Urruty (1985) and Horst et al. (1995)).
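The decompositions in the proof above are easy to check numerically; the following sketch (the test functions are chosen by us purely for illustration) verifies the max and absolute-value identities for two d.c. functions on the real line:

```python
import math

p = [lambda x: x * x, lambda x: abs(x)]           # convex parts
q = [lambda x: math.exp(x), lambda x: 2 * x * x]  # convex parts
f = [lambda x: p[0](x) - q[0](x), lambda x: p[1](x) - q[1](x)]

for x in [-2.0, -0.3, 0.0, 0.7, 1.5]:
    sum_q = sum(qj(x) for qj in q)
    # max f_i = max_i {p_i + sum_{j != i} q_j} - sum_j q_j (convex minus convex)
    lhs = max(fi(x) for fi in f)
    rhs = max(p[i](x) + sum_q - q[i](x) for i in range(2)) - sum_q
    assert abs(lhs - rhs) < 1e-9
    # |f_1| = 2 max{p_1, q_1} - (p_1 + q_1)
    assert abs(abs(f[0](x))
               - (2 * max(p[0](x), q[0](x)) - (p[0](x) + q[0](x)))) < 1e-9
```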
A main result concerning the recognition of d.c. functions goes back to Hartman
(1959). Before stating it, let us agree to call a function f: IRn → IR locally d.c. if, for
every x⁰ ∈ IRn, there exists a neighbourhood N = N(x⁰,ε) of x⁰ and convex functions
p_N, q_N such that
The proof requires some extension techniques that are beyond the scope of this
book and is omitted (cf. Hartman (1959), Ellaia (1984), Hiriart-Urruty (1985)).
Denote by C² the class of functions IRn → IR whose second partial derivatives are
continuous everywhere.
of f on N(x⁰,ε), i.e., f is locally d.c., and hence d.c. (cf. Theorem 1.8).
•
Furthermore, it turns out that any problem of minimizing a continuous real func-
tion over a compact subset D of IRn can, in principle, be approximated as closely as
desired by a problem of minimizing a d.c. function over D.
Of course, the main concern when using Corollary 1.2 is how to construct an ap-
propriate approximation by d.c. functions for a given continuous function on D.
However, in many special cases of interest, the exact d.c. decomposition is already
given or easily found (cf. Theorem 1.7, Bittner (1970), Hiriart-Urruty (1985) and
the brief discussion of direct applications below).
Example 1.2. Let f(x) be separable, i.e., we have f(x) = Σ_{i=1}^n f_i(x_i), where each f_i
is a differentiable function of the single variable x_i which, as in Fig. 1.1, is concave
to the left and convex to the right of some point x̄_i. Then a d.c. decomposition
f_i = p_i − q_i is given by

Fig. 1.1

$$p_i(x_i) := \begin{cases} \big(f_i'(\bar{x}_i)(x_i - \bar{x}_i) + f_i(\bar{x}_i)\big)/2 & (x_i \le \bar{x}_i) \\ f_i(x_i) - \big(f_i'(\bar{x}_i)(x_i - \bar{x}_i) + f_i(\bar{x}_i)\big)/2 & (x_i \ge \bar{x}_i) \end{cases}$$

$$q_i(x_i) := \begin{cases} \big(f_i'(\bar{x}_i)(x_i - \bar{x}_i) + f_i(\bar{x}_i)\big)/2 - f_i(x_i) & (x_i \le \bar{x}_i) \\ -\big(f_i'(\bar{x}_i)(x_i - \bar{x}_i) + f_i(\bar{x}_i)\big)/2 & (x_i \ge \bar{x}_i) \end{cases}$$
I = {x: a ≤ x ≤ b}, b > a, of the real line. Then, f is said to be a piecewise linear
function if I can be partitioned into a finite number of subintervals such that in each
subinterval f(x) is affine. That is, there exist values a < x₁ < ... < x_r < b such that
f(x) = α_i x + β_i for x_{i−1} ≤ x ≤ x_i, i=1,...,r+1,

where x₀ := a, x_{r+1} := b and α_i, β_i are known real constants (i=1,...,r+1). A similar
definition can be given on all of IR: setting f(x) = α₁x + β₁
for −∞ < x ≤ x₀ and f(x) = α_{r+1}x + β_{r+1} for x_{r+1} ≤ x < ∞, we see that f can be
extended to a continuous piecewise linear function on the whole real line. For a
continuous piecewise-linear function it follows that, for each point x ∈ IR, there is a
neighbourhood N(x,ε) such that f can be expressed in N(x,ε) as the pointwise max-
imum or minimum of at most two affine functions. Corollary 1.3 is then established.
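One explicit way to realize Corollary 1.3 is sketched below under our own conventions (not the book's notation): write a continuous piecewise linear function as an affine part plus hinge terms max{0, x − t_i}, and route the positive and negative slope jumps into two convex piecewise linear functions g and h with f = g − h:

```python
def hinge(x, t):
    return max(0.0, x - t)

def make_dc(c0, c1, kinks):
    # kinks: list of (t_i, d_i); d_i is the slope jump of f at breakpoint t_i
    def g(x):  # convex: affine term plus the positive kinks
        return c0 + c1 * x + sum(d * hinge(x, t) for t, d in kinks if d > 0)
    def h(x):  # convex: collects the negative kinks
        return sum(-d * hinge(x, t) for t, d in kinks if d < 0)
    return g, h

# example: a zig-zag with slope sequence 1, -2, 3 (jumps -3 at t=0, +5 at t=1)
f = lambda x: x - 3 * hinge(x, 0.0) + 5 * hinge(x, 1.0)
g, h = make_dc(0.0, 1.0, [(0.0, -3.0), (1.0, 5.0)])
assert all(abs(f(x) - (g(x) - h(x))) < 1e-12 for x in [-2, -0.5, 0.3, 1.7, 4.0])
```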
is a d.c. representation for any matrix norm ‖Q‖ (cf., e.g., Phong et al. (1995)).
Example 1.4. Let M be an arbitrary nonempty closed subset of IRn and let d²_M(x)
denote the square of the distance from a point x E IRn to M. Then it can be shown
Simple proofs for ellipsoidal norms can be found, e.g., in Horst et al. (1995), Tuy
(1995).
reflects the fact that in some activities the unit cost increases when the scale of
activity is enlarged (diseconomies of scale), whereas in other activities the unit cost
decreases when the scale of activity is enlarged. Likewise, in optimal investment
planning, one might have to find an investment program with maximal profit, where
the profit function is (-p(y)) + q(z) with (-p(y)) concave (according to a law of
diminishing marginal returns) and with q(z) convex (according to a law of increasing
marginal returns).
Certain general economic models also give rise to d.c. functions in the constraints.
Suppose that a vector of activities x has to be selected from a certain convex set
C c IRn of technologically feasible activities. Suppose also that the selection has to be
made so as to keep some kind of "utility" above a certain level. This leads to
where u_i(x) represents some utility depending on the activity level x, and where c_i is
a minimal level of utility required. In many cases, the u_i(x) can be assumed to be
of the form

$$u_i(x) = \sum_{j=1}^{n} u_{ij}(x_j)$$
(cf. Example 1.2) with uij either convex or concave or of the form discussed in
Example 1.2. For a thorough discussion of the underlying economic models, see, e.g.,
(e.g., Horst and Thoai (1995)), and in blending and pooling problems encountered in
oil refineries (e.g., Floudas and Aggarwal (1990), Al-Khayyal et al. (1995)).
physics (cf., e.g., Giannessi et al. (1979), Heron and Sermange (1982), Mahjoub
(1983), Vidigal and Director (1982), Strodiot et al. (1985 and 1988), Polak and
Vincentelli (1979), Toland (1978), Tuy (1986 and 1987)).
Moreover, for certain d.c. programming problems, there is a nice duality theory
developed by Toland (1978 and 1979) and Singer (1979, 1979a and 1992), that leads
where G: IRn×IRs → IR and S ⊂ IRs is compact. Here, x represents the design vector and
Polak (1987)). The function g(x) belongs to the class of so-called lower
C²-functions, i.e., g(x) = max {G(x,s): s ∈ S} with G twice continuously differentiable with
respect to x and which, along with all these derivatives, is jointly continuous in
(x,s) ∈ IRn×S. But lower C²-functions are d.c. For a detailed discussion of lower
Example 1.5. Let D ⊂ IRn be a compact set, and let K ⊂ IRn be a convex compact
set. Assume that the origin 0 is an interior point of K. Define the function
r_D: D → IR by
maximize r D(x)
s.t. xED (30)
problem from the diamond industry where one wants to know how to cut the largest
diamond of shape K that can be cut from a rough stone D ⊃ K (Nguyen et al.
(1985)). Other applications are discussed by Tuy (1986) and Polak and Vincentelli
(1979).
In a more general context consider any fabrication process where random vari-
ations may result in a very low production yield. A method to minimize the influ-
ence of these random variations consists of centering the nominal value of the de-
has been shown in Thach (1988) that in these cases (30) is actually a d.c.
where D is a closed convex subset of IRn×IRn, and f and h are real-valued convex func-
tions on D. The objective function is a d.c. function, since xy = ¼(‖x+y‖² −
‖x−y‖²).
The intuition of many people working in global optimization is that most of the
where D is a closed subset of IRn and f: IRn → IR is continuous, can be converted into a
d.c. program (e.g., Tuy (1985)). Introducing the additional real variable t, we can
obviously write problem (31) in the equivalent form
minimize t
s.t. XED,f(x)~t (32)
The feasible set M := {(x,t) ∈ IRn+1: x ∈ D, f(x) ≤ t} in (32) is closed and the
condition (x,t) ∈ M is equivalent to the constraint d²_M(x,t) ≤ 0, where d_M(x,t)
denotes the distance from (x,t) to M. This is a d.c. inequality (cf. Example 1.4).
Recall from Section 1.1 that a constraint h(x) ≥ 0 is called reverse convex when-
ever h: IRn → IR is convex. Obviously, every optimization problem with concave,
convex or d.c. objective function and a combination of convex, reverse convex and
sider reverse convex constraints separately. One reason for doing so is that we often
encounter convex or even linear problems having only one or very few additional
reverse convex constraints. Usually, these problems can be solved more efficiently by
taking into account their specific structure instead of using the more general
approaches designed for solving d.c. problems (e.g., Tuy (1987), Horst (1988), Thoai
(1988)). Another reason is that every d.c. programming problem can be converted
into an equivalent relatively simple problem having only one reverse convex con-
straint. This so-called canonical d.c. programming problem will be derived below.
Example 1.7. Given two mappings h: IRn → IRm, and g: IRn → IRm, the condition
h(x) g(x) = 0 is often called a complementarity condition (cf. Section 1.2.5). Several
applications yield optimization problems where an additional simple linear comple-
logy, a submarine pipeline is usually laid so that it rests freely on the sea bottom.
Since, however, the sea bed profile is usually irregularly hilly, it is often regularized
by means of trench excavation in order to bury the pipe for protection and to avoid
excessive bending moments on the pipe. The optimization problem which arises is to
minimize the total cost of the excavation, under the condition that the free contact
equilibrium configuration of the pipe nowhere implies excessive bending. It has been
shown by Giannessi et al. (1979) that the resulting optimization problem is a linear
x ≥ 0, y ≥ 0, xy = 0 (x,y ∈ IRn).
the last inequality being a reverse convex constraint, since the function
Σ_{i=1}^n min {x_i, y_i} is concave.
Example 1.8. A (0-1) restriction can be cast into the reverse convex form. For
example, x_i ∈ {0,1} is equivalent to

−x_i + (x_i)² ≥ 0, 0 ≤ x_i ≤ 1.
Example 1.9. Let G be an open convex set defined by the convex inequality
g(x) < 0, g: IRn → IR convex. Let K be a compact, convex set contained in G. Then
the problem of minimizing the distance d(x) from K to IRn \ G is a convex mini-
mization problem with the additional reverse convex constraint g(x) ≥ 0 (Fig. 1.2).
Fig. 1.2
were studied by Rosen (1966), Avriel and Williams (1970), Meyer (1970), Ueing
(1972), Bansal and Jacobsen (1975), Hillestad and Jacobsen (1980 and 1980a), Tuy
(1983 and 1987), Thuong and Tuy (1984), Horst and Dien (1987), Horst (1988),
Horst et al. (1990), Horst and Thoai (1994). Avriel and Williams (1970) showed that
reverse convex constraints (cf. Section 1.3.2). In an abstract setting, Singer (1980)
A striking feature of d.c. programming problems is that any d.c. problem can
always be reduced to a canonical form which has a linear objective function and only
two constraints, one of them being a convex inequality, the other being reverse
convex:

minimize cx     (33)
s.t. h(x) ≤ 0, g(x) ≥ 0,
where C is defined by a finite system of convex inequalities h_k(x) ≤ 0, k ∈ I ⊂ IN, and
where f, g_j are d.c. functions, can be converted into an equivalent canonical d.c.
program.
minimize t
s.t. h_k(x) ≤ 0 (k ∈ I), g_j(x) ≤ 0 (j=1,...,m), f(x) − t ≤ 0
involving the additional real variable z. The first inequality in (35) is convex and the
Along the same lines, various other transformations can be carried out that
transform, e.g., a d.c. problem into a program having convex or concave objective
functions, etc. Obviously, since a canonical d.c. problem has simpler structure than
the general d.c. problem, transformations of the above type will be useful for the
development of certain algorithms for solving d.c. problems. Note, however, that
these transformations increase the number of variables of the original problem. Since
the numerical effort required to solve these kinds of difficult problems generally
grows exponentially with the number of variables, we will also attempt to solve d.c.
problems without prior transformations (cf. Chapter X).
We now consider the canonical d.c. problem (33), and we let ∂H and ∂G denote
the boundaries of the sets H := {x: h(x) ≤ 0} and G := {x: g(x) ≥ 0}, respectively. For
the sake of simplicity we shall assume that H is bounded and that H n G is not
empty, so that an optimal solution of problem (33) exists.
Definition 1.6. The reverse convex constraint g(x) ≥ 0 is called essential in the
canonical d.c. program (33) if the inequality

min {cx: x ∈ H} < min {cx: x ∈ H ∩ G}

holds.
minimize cx (36)
s.t. xE H
Theorem 1.10. Consider the canonical d.c. program. Suppose that H is bounded,
H ∩ G is nonempty and the reverse convex constraint is essential. Then the canonical
d.c. program (33) always has a solution lying on ∂H ∩ ∂G.
Proof. Since the reverse convex constraint is essential, there must be a point
w ∈ H satisfying g(w) < 0,
where the line segment [w,x] intersects the boundary ∂G. The number λ is uniquely
determined by

g(λx + (1 − λ)w) = 0.
Now let x ∈ H ∩ G satisfy g(x) > 0. Then we have
minimize cx (38)
s.t. x ∈ H ∩ ∂G
Now consider an optimal solution x̄ of (38). Denote by int M the interior of a set
M. Then the closed convex set (IRn\G) ∪ ∂G has a supporting hyperplane at x̄. The
intersection of this hyperplane with the compact convex set H is a compact convex
set K. It is well-known from linear programming theory that cx attains its
minimum over K at some extreme point u of K.
Clearly, we have u ∈ ∂H, and since K ⊂ H \ int (IRn \ G), it follows that u is a
In certain applications we have that the function h(x) is the maximum of a finite
number of affine functions, i.e., the canonical d.c. program reduces to linear pro-
grams with one additional reverse convex constraint. It will be shown in Part B of
this volume, that in this case the global minimum is usually attained on an edge
of H.
Obviously, if f is Lipschitzian with constant L, then f is also Lipschitzian with all
constants L′ ≥ L. Bounds for inf f(M) can be obtained from the following ob-
servation. Suppose that the diameter d(M) := sup {‖x − y‖: x, y ∈ M} < ∞ of M is
known. Then we easily see from (39) that

f(y) − L d(M) ≤ f(x) for all x, y ∈ M     (41)

holds. Let S ⊂ M denote a finite sample of points in M where the function values
have been computed. Then

min f(S) ≥ inf f(M) ≥ min f(S) − L d(M),     (42)

i.e., knowledge of L and d(M) leads to computable bounds for inf f(M).
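A minimal sketch of these bounds (the test function, the constant and the sample are ours): for a Lipschitz f on M = [a,b], any finite sample yields the two-sided enclosure of inf f(M) described above:

```python
import math, random

def f(x):  # Lipschitz on [0, 4]: |f'(x)| = |3 cos(3x) + 0.5| <= 3.5
    return math.sin(3.0 * x) + 0.5 * x

L_const = 3.5        # any upper estimate of the true constant works
a, b = 0.0, 4.0
d_M = b - a          # diameter of M = [a, b]

S = [random.uniform(a, b) for _ in range(50)]
upper = min(f(s) for s in S)        # attained at a feasible point
lower = upper - L_const * d_M       # crude but guaranteed lower bound
print(f"inf f(M) lies in [{lower:.3f}, {upper:.3f}]")
```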
Note that the following well-known approach for solving the problem

minimize f(x)     (43)
s.t. x ∈ D

with f Lipschitzian on D uses bounds of the type (41). Starting from an arbitrary
point x⁰ ∈ D, set F₀(x) := f(x⁰) − L‖x − x⁰‖ and compute

x¹ ∈ argmin F₀(D),     (44)

and, iteratively,

x^{k+1} ∈ argmin F_k(D), where F_k(x) := max {F_{k−1}(x), f(x^k) − L‖x − x^k‖}.     (45)
It is easy to show that any accumulation point of the sequence {xk } solves prob-
lem (43). This algorithm was proposed by Piyavskii (1967 and 1972) and Shubert
Polak (1984), Horst and Tuy (1987), and Pinter (1983 and 1986). See also Bulatov
(1977), Evtushenko (1985), and, in particular, the survey of Hansen and Jaumard
(1995). The crucial part of this method is the minimization in (45). Unfortunately,
since F_k is the pointwise maximum of a finite family of concave functions, this
minimization problem is, except in the one-dimensional case, a very difficult one.
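For orientation, here is a one-dimensional sketch of the scheme (44)-(45) in the spirit of Piyavskii and Shubert — the test function, the grid stand-in for the exact minimization of F_k, and all names are ours:

```python
import math

def f(x):
    return math.sin(3.0 * x) + 0.5 * x   # Lipschitz on [0, 4] with L <= 3.5

L_const, a, b = 3.5, 0.0, 4.0
pts = [(a, f(a)), (b, f(b))]             # evaluated points x^i with values

def F(x):
    # saw-tooth underestimator (45): max of cones f(x^i) - L|x - x^i|
    return max(fv - L_const * abs(x - p) for p, fv in pts)

grid = [a + i * (b - a) / 2000 for i in range(2001)]
for _ in range(25):                      # (44): x^{k+1} in argmin F_k(D)
    x_next = min(grid, key=F)            # grid stand-in for the exact argmin
    pts.append((x_next, f(x_next)))

best_x, best_f = min(pts, key=lambda t: t[1])
print("record point:", round(best_x, 4), "value:", round(best_f, 4))
print("lower bound :", round(min(F(x) for x in grid), 4))
```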
Actually F_k(x) is a d.c. function (cf. Theorem 1.7), and applying the above approach
lems. In the chapters that follow, some different and, hopefully, more practical ap-
set or the intersection of a convex set with finitely many complements of convex sets
or a set defined by a finite number of Lipschitzian inequalities, are encountered in
many economic and engineering applications. Examples are discussed, e.g., in Dixon
and Szego (1975 and 1978), Strongin (1978), Zielinski and Neumann (1983), Fedorov
(1985), Zilinskas (1982 and 1986), Pinter et al. (1986), Tuy (1986), Hansen and
Jaumard (1995). Algorithms such as those proposed in Horst (1987 and 1988), Pinter
(1988), Thach and Tuy (1987), Horst et al. (1995) will be discussed in Chapter XI.
and maxima or minima of finitely many Lipschitzian functions (cf., also Sec-
tion 1.4.2).
However, we point out that all methods for solving Lipschitzian optimization prob-
lems to be presented in this book require knowledge of a Lipschitz constant for
some or all of the functions involved.
Though such a constant can often be estimated, this requirement sets limits on
the application of Lipschitzian optimization techniques, since - in general - finding
a good estimate for L (using, e.g., (40)) can be almost as difficult as solving the
proposed, the sets M are successively refined in such a way that one can use adaptive
approximation of L (cf. Strongin (1973) and Pinter (1986)). Another means of calcu-
lating suitable approximations of L is by interval analysis (cf., e.g., Ratschek and
Example 1.10. Many practical problems may involve indefinite, separable quadra-
tic objective functions and/or constraints (cf., e.g., Pardalos et al. (1987), Pardalos
and Rosen (1986 and 1987)). To solve some of these problems, Al-Khayyal et al.
(1989) proposed several variants of a branch and bound scheme that require a Lip-
on a rectangle M = {x: a_k ≤ x_k ≤ b_k, k=1,...,n}. In this case, the relation (40) yields
a Lipschitz constant of the form

$$L = \sum_{k=1}^{n} \max_{a_k \le y_k \le b_k} |p_k y_k + q_k| ,$$

where each inner maximum, being the maximum of the absolute value of an affine
function over an interval, is attained at an endpoint, i.e., it equals
max {|p_k a_k + q_k|, |p_k b_k + q_k|}.
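A short sketch (ours) of this computation for a separable quadratic f(x) = Σ_k (p_k x_k²/2 + q_k x_k), whose k-th partial derivative p_k x_k + q_k is affine, so the inner maxima sit at the endpoints of [a_k, b_k]:

```python
def lipschitz_constant(p, q, a, b):
    # sum over k of max over [a_k, b_k] of |p_k*y + q_k|; the maximum of the
    # absolute value of an affine function on an interval is at an endpoint
    return sum(max(abs(pk * ak + qk), abs(pk * bk + qk))
               for pk, qk, ak, bk in zip(p, q, a, b))

p, q = [2.0, -1.0, 0.5], [1.0, 0.0, -3.0]
a, b = [-1.0, -2.0, 0.0], [3.0, 2.0, 4.0]
print("L =", lipschitz_constant(p, q, a, b))
```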
numerical analysis. It is beyond the scope of our book to present here an overview of
applications and methodology in that field (see, e.g., Forster (1980 and 1995),
Allgower and Georg (1980 and 1983), Dennis and Schnabel (1983)). It is the purpose
of this section to show that unconventional global optimization methods designed for
solving Lipschitzian optimization problems or d.c. programs can readily be applied
f_i(x) = 0 (i ∈ I₁),     (46)

f_i(x) ≤ 0 (i ∈ I₂),     (47)
subject to x ∈ D ⊂ IRn, where I₁, I₂ are finite index sets satisfying I₁ ∩ I₂ = ∅. Sup-
pose that D is nonempty and compact and that all functions f_i are continuous on D.
The system (46), (47) can be transformed into an equivalent global optimization
problem.
Suppose first that we have I₂ = ∅ in (46), (47), i.e., we consider the system of
equations (46).
Let F: IRn → IR^{|I₁|} be the mapping that associates to x ∈ IRn the vector with
components f_i(x) (i ∈ I₁), and let ‖·‖_N denote any norm on the image space of F.
Then we have
Lemma 1.1. x* ∈ D is a solution of the system of equations (46) if and only if

‖F(x*)‖_N = 0

holds. ∎
Let f(x) = ‖F(x)‖_N. Then, by virtue of Lemma 1.1, the optimization problem
minimize f(x)
(49)
s.t. x ∈ D
contains all of the information on (46) that is usually of interest. We see that min
f(D) > 0 holds if and only if (46) has no solution, and in the case min f(D) = 0 the
set of solutions of (49) coincides with the set of solutions of (46).
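The following sketch (the system, the norm choice and the brute-force grid standing in for a global method are ours) applies (49) to a small system of two equations, using the maximum norm for ‖·‖_N:

```python
import itertools

# system (46): f1 = x1^2 + x2 - 3 = 0, f2 = x1 - x2 + 1 = 0, over D = [-3,3]^2
def F(x):
    x1, x2 = x
    return (x1 * x1 + x2 - 3.0, x1 - x2 + 1.0)

def f(x):  # objective of (49) with the maximum norm
    return max(abs(t) for t in F(x))

grid = [i / 100 for i in range(-300, 301)]
x_best = min(itertools.product(grid, repeat=2), key=f)
print("minimizer:", x_best, "residual:", f(x_best))  # residual ~ 0: a solution
```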
Suppose now that we have I₁ = ∅ in (46), (47), i.e., we consider the system of
inequalities (47). The following Lemma is obvious.
(50)
holds.
minimize T(x)
s.t. x ∈ D     (51)

where T(x) = max {f_i(x): i ∈ I₂}. Whenever a procedure for solving (51) detects a
point x* ∈ D satisfying T(x*) ≤ 0, then a solution of (47) has been found. The
The case I₁ ≠ ∅ and I₂ ≠ ∅ can be treated in a similar way. As above, we see that
x* ∈ D is a solution of (46) and (47) if and only if for

f̂(x) := max {‖F(x)‖_N, T(x)}     (52)

we have f̂(x*) = 0.
Now let all the functions in (46) and (47) be d.c., and let
Then, by virtue of Theorem 1.7 (Section 1.3.1), each of the objective functions f,
T, f̂ in (49), (51) and (53) is d.c. In other words, whenever the functions f_i involved
in a system of equations and/or inequalities of the form (46), (47) are d.c., then this
system can be solved by d.c. programming techniques.
Proof. We use two well-known properties of the norms involved. First, it follows
from the triangle inequality that for any norm ‖·‖_N in IRm and any z¹, z² ∈ IRm we
have

| ‖z¹‖_N − ‖z²‖_N | ≤ ‖z¹ − z²‖_N ,     (54)
(55)
Using (54), (55) and the Lipschitz constants Li of fi (i=I, ... ,m), we obtain the
following relations:
$$= \sum_{i=1}^{m} |f_i(x) - f_i(y)| \le \sum_{i=1}^{m} L_i \|x-y\| = \|x-y\| \Big( \sum_{i=1}^{m} L_i \Big),$$
we see that max {f_i(x): i=1,...,m} defines a Lipschitz function with Lipschitz constant
max {L_i: i=1,...,m}. ∎
Lemma 1.3 provides Lipschitz constants of the objective functions in (49), (51)
OUTER APPROXIMATION
computational issues are addressed. These include the calculation of new vertices
and extreme directions generated from a polyhedral convex set by a linear cut and
In this section we present a class of methods which are among the basic tools in
many fields of optimization and which have been used in many forms and variants.
The feasible set is relaxed to a simpler set D1 containing D, and the original object-
ive function f is minimized over the relaxed set. If the solution of this relaxed prob-
lem is in D, then we are done; otherwise an appropriate portion of D₁ \ D is cut off
by an additional constraint, yielding a new relaxed set D₂ that is a better approx-
imation of D than D₁. Then, D₁ is replaced by D₂, and the procedure is repeated.
These methods are frequently called outer approximation or relaxation methods.
Since the pioneering papers of Gomory (1958 and 1960), Cheney and Goldstein
(1959) and Kelley (1960), outer approximation in this sense has developed into a
tion has been applied in various forms for solving most of the problem classes that
we introduced in the preceding chapter. Examples include concave minimization
(e.g., Hoffman (1981), Thieu, Tam and Ban (1983), Tuy (1983), Thoai (1984)),
problems having reverse convex constraints (e.g., Tuy (1987)), d.c. programming
(e.g., Tuy (1986), Thoai (1988)) and Lipschitzian optimization (e.g., Thach and Tuy
(1987)).
We modify a general treatment given by Horst, Thoai and Tuy (1987 and 1989).
Consider the global optimization problem
A widely used outer approximation method for solving (P) is obtained by replacing
it by a sequence of simpler "relaxed" problems
a) The sets D_k ⊂ IRn are closed, and any problem (Q_k) with D_k ∈ E has a solution
ℓ_k(x) ≤ 0 ∀x ∈ D,     (3)

ℓ_k(x^k) > 0,     (4)

(5)
Solve the relaxed problem (Q_k) obtaining a solution x^k ∈ argmin f(D_k).
Otherwise construct a constraint function ℓ_k: IRn → IR satisfying (3), (4), (5) and
set

D_{k+1} := D_k ∩ {x: ℓ_k(x) ≤ 0}.     (6)
Conditions (3) and (4) imply that the set {x ∈ IRn: ℓ_k(x) = 0} strictly separates
x^k ∈ D_k \ D from D. The additional constraint ℓ_k(x) ≤ 0 cuts off a subset of D_k.
However, since we have D ⊂ D_k for all k (no part of D is cut off), each D_k
constitutes an outer approximation of D.
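To fix ideas, here is a compact sketch of this scheme — the instance, tolerances, and the grid standing in for the exact solution of (Q_k) are ours, not the book's. Here D is the unit disk described by the convex inequality g(x) = ‖x‖² − 1 ≤ 0, D₁ is a box, and each cut is the linearization of g at x^k, which satisfies (3) by convexity of g and (4) since g(x^k) > 0:

```python
import itertools

c = (-1.0, -0.7)                               # f(x) = c.x, to be minimized
cuts = [(1.0, 0.0, -1.5), (-1.0, 0.0, -1.5),
        (0.0, 1.0, -1.5), (0.0, -1.0, -1.5)]   # D_1 = box [-1.5, 1.5]^2

def solve_relaxed():
    # stand-in for x^k in argmin f(D_k): brute force over a grid of D_1
    pts = itertools.product([i / 50 - 1.5 for i in range(151)], repeat=2)
    feas = (x for x in pts
            if all(a * x[0] + b * x[1] + d <= 1e-9 for a, b, d in cuts))
    return min(feas, key=lambda x: c[0] * x[0] + c[1] * x[1])

xk = solve_relaxed()
for _ in range(10):
    g = xk[0] ** 2 + xk[1] ** 2 - 1.0          # D = {x: g(x) <= 0}
    if g <= 1e-6:                              # x^k in D: done
        break
    # cut l_k(x) = g(x^k) + 2 x^k.(x - x^k): positive at x^k, <= 0 on D
    cuts.append((2 * xk[0], 2 * xk[1], g - 2 * (xk[0] ** 2 + xk[1] ** 2)))
    xk = solve_relaxed()

print("approximate solution:", xk)             # near (0.82, 0.57) on the circle
```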
Note that, since D_k ⊃ D_{k+1} ⊃ D, we have
(7)
In order to ensure that D_{k+1} is closed whenever D_k is closed, we require that the
Theorem II.1. In the context of the outer approximation method above, assume
that

(i) ℓ_k is lower semi-continuous for each k = 1,2,...;

(ii) each convergent subsequence {x^q} ⊂ {x^k} satisfying x^q → x̄ contains
a subsequence {x^r} ⊂ {x^q} such that

lim_{r→∞} ℓ_r(x^r) = lim_{r→∞} ℓ_r(x̄);

and

(iii) lim_{r→∞} ℓ_r(x̄) = 0 implies x̄ ∈ D.

Then every accumulation point of the sequence {x^k} belongs to D, and hence solves
(P).
On the other hand, from (4), we have ℓ_r(x^r) > 0 ∀r, which implies

lim_{r→∞} ℓ_r(x̄) ≥ 0.     (9)

From (8) and (9) it follows that lim_{r→∞} ℓ_r(x̄) = 0, and hence, from assumption (iii), we
have x̄ ∈ D.
Finally, since f(x^k) ≤ f(x) ∀x ∈ D_k ⊃ D, it follows, by continuity of f, that
f(x̄) ≤ f(x) ∀x ∈ D, i.e., x̄ solves (P). ∎
In many applications which we shall encounter in the sequel the functions ℓ_k are
even continuous, and, moreover, there often exists a function
Assumption (ii) is then fulfilled, for example, when ℓ_r(x) converges uniformly (or
continuously) to ℓ(x) (cf. Kall (1986)).
Several alternative versions of Theorem II.1 can be derived along the very same
lines of reasoning. For example, instead of (ii), (iii) one could require
and
In most realizations of outer approximation methods the sets D_k are convex poly-
hedral sets, and the constraints ℓ_k(x) are affine functions such that {x: ℓ_k(x) = 0}
define hyperplanes strictly separating x^k and D. The set D is usually assumed to be
a closed or even compact convex set and the procedures are often called cutting
plane methods.
Let
where g: IRn → IR is convex. Clearly, D is a closed convex set. Note that in the case of
several convex constraints g_i(x) ≤ 0 (i ∈ I), we set
ℓ_k(x) = p^k(x − y^k) + β_k ,     (12)
(13)
(14)
or equivalently, of
where f, g: IRn → IR are convex, one usually transforms (16) into the equivalent
problem
where t ∈ IR and g(x,t) = max {g(x), f(x) − t}. The relaxed problems (Q_k) are then
linear programming problems.
Instead of subgradients p^k ∈ ∂g(y^k) one can as well use ε-subgradients (cf., e.g.,
Parikh (1976)).
(Q_k)), both classical methods have been applied for solving concave minimization
problems: the KCG approach by Thieu, Tam and Ban (1983) and the supporting
hyperplane approach by Hoffman (1981). The incorporation of
outer approximation by cutting planes into procedures for solving global optimiza-
tion problems was discussed in Tuy (1983), Horst (1986) and Tuy and Horst (1988).
We will return to these discussions later.
cussed in the next section, where we will also mention some additional important ap-
proaches to outer approximation by cutting planes.
duced in Horst, Thoai and Tuy (1987 and 1989) - provides a large variety of
methods that include as very special cases the KCG and supporting hyperplane
methods. Consider problem (P) with feasible set D as defined by (10). Assume that
an initial polyhedral convex set D₁ ⊃ D is given and let the constraints ℓ_k be affine,
i.e., of the form (12).
Denote
(17)
We assume that ∂g(x)\{0} ≠ ∅ ∀x ∈ IRn \ D⁰, where ∂g(x) denotes (as above) the
subdifferential of g at x. This assumption is certainly fulfilled if D⁰ ≠ ∅ (Slater condi-
tion): let z ∈ D⁰ and y ∈ IRn \ D⁰, p ∈ ∂g(y). Then, by the definitions of ∂g(y) and
D⁰, we have
Theorem II.2. Let K be any compact convex subset of D⁰. In (12) for each
k = 1,2,... choose
Then the conditions (3), (4) of the outer approximation method and the assump-
tions (i), (ii) and (iii) of Theorem II.1 are fulfilled.
(19)
x^k − y^k = α_k(y^k − z^k), where α_k = λ_k/(1−λ_k) ≥ 0.     (20)
Since K is compact and is contained in D⁰, there exists a number δ > 0 such that
g(x) ≤ −δ < 0 ∀x ∈ K.
Using p^k ∈ ∂g(y^k), g(y^k) ≥ 0, we obtain
hence
whenever α_k > 0 (i.e., x^k ≠ y^k). However, for α_k = 0, i.e., y^k = x^k, we have
ℓ_k(x^k) = g(x^k) > 0, since x^k ∉ D.
(i): For each k = 1,2,..., the affine function ℓ_k defined by (12) and (18) is
obviously continuous.
(ii): Let {x^q} be any convergent subsequence of {x^k}, and let x^q → x̄. Then there
is a q₀ ∈ IN sufficiently large such that x^q ∈ B := {x ∈ IRn: ‖x − x̄‖ ≤ 1} ∀q > q₀.
Since K and {∂g(y): y ∈ Y} are compact sets when Y is compact (cf., e.g., Rockafel-
lar (1970), Chapter 24) and λ_q ∈ [0,1], we see that there is a subsequence {x^r} of
{x^q} such that
= p̄(x̄ − ȳ) + β̄ = ℓ̄(x̄),
where ℓ̄(x) = p̄(x − ȳ) + β̄.     (23)
But since y^q ∉ D⁰, we have β_q = g(y^q) ≥ 0; hence, by continuity of the convex func-
tion g, we have β̄ = g(ȳ) ≥ 0. Moreover, while verifying (4), we saw that
p^q(y^q − z^q) ≥ δ > 0, and hence p̄(ȳ − z̄) ≥ δ > 0. From (24) it then follows that
",
k
D X
g(w) < 0, and determine y^k as the unique point where the line segment [w,x^k] meets
the boundary of D, then we obtain the supporting hyperplane algorithm. The KCG
and the supporting hyperplane methods, which in their basic form were originally
designed for convex programming problems, are useful tools in that field if the
problems are of moderate size and if the accuracy of the approximate solution
obtained when the algorithm is stopped, does not need to be very high. The rate of
convergence, however, generally cannot be expected to be faster than linear (cf., e.g.,
Moreover, in the case of cutting planes, it has been observed that near a solution,
the vectors pk of successive hyperplanes may tend to become linearly dependent,
causing numerical instabilities.
In order to solve convex programming problems and also to obtain local solutions
in nonlinear programming, recently several faster cutting plane methods have been
proposed that combine the idea of outer approximation methods with typical local
elements such as line searches etc. We will return briefly to these methods when dis-
cussing constraint dropping strategies in the next section. In global optimization,
where rapid local methods fail and where - because of the inherent difficulties of glo-
over, we have
where π^k = π(x^k), then the outer approximation conditions (3), (4) and the assump-
tions (i), (ii) and (iii) of Theorem II.1 are satisfied.
(4): ℓ_k(x^k) = (x^k − π^k)(x^k − π^k) = ‖x^k − π^k‖² > 0, since x^k ∉ D.
(i), (ii): The affine function ℓ_k defined by (27) is continuous for each k. Let {x^q}
be a subsequence of {x^k} satisfying x^q → x̄. Since the function π is continuous, we
have π^q → π(x̄). It follows that

lim_{q→∞} ℓ_q(x^q) = lim_{q→∞} (x^q − π^q)(x^q − π^q) = lim_{q→∞} (x^q − π^q)(x̄ − π^q) =
Note that Theorem II.3 cannot be subsumed under Theorem II.2. As an example,
consider the case where the ray from x^k through π^k does not meet the set D⁰ (Fig.
II.4).
straints could be dropped. Eaves and Zangwill (1971) - by introducing the notions of
a cut map and a separator - gave a general and abstract theory of outer approxi-
mation by cutting planes. This theory was generalized further by Hogan (1973) to
cover certain dual approaches. Examples of dual approaches that include constraint
dropping strategies are given in Gonzaga and Polak (1979) and in Mayne and Polak
(1984).
Though convex programming is not within the scope of our treatment, we would like to mention that for (possibly nonsmooth) convex problems several cutting plane approaches have been proposed that converge linearly or faster; moreover, they enable one to drop constraints in such a manner that the number of constraints used in each step is bounded. Most of these algorithms do not fit into our basic approach, since in each step a quadratic term is added to the objective function of the original problem, while the outer approximation methods presented here use the original objective function. We refer to the corresponding literature (1984) and to the book of Kiwiel (1985). However, these results on local optimization and convex programming, respectively, could not yet be carried over to the global optimization of multiextremal problems.
In this section we discuss constraint dropping strategies for global optimization that can be applied to all algorithms satisfying the assumptions of Theorem II.1, hence, e.g., also to the classes of procedures discussed in the preceding section. Note that these results apply also to nonlinear cuts.
We shall need separators as defined in Eaves and Zangwill (1971).
For practical purposes, when separators are used to drop certain old cuts, a
straightforward choice of separator is any penalty function, as is well-known from
standard local nonlinear programming techniques. For example, let
with continuous g_i: ℝ^n → ℝ (i = 1,...,m). Then all functions of the form
δ(x) = Σ_{i=1}^m [max {0, g_i(x)}]^β,  β ≥ 1   (28)
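For illustration, a minimal Python sketch of a separator of the form (28); the function name and the choice β = 2 are ours, not from the text:

def make_separator(g_list, beta=2.0):
    # delta(x) = sum_i [max{0, g_i(x)}]^beta with beta >= 1, cf. (28);
    # delta(x) = 0 exactly when x satisfies all constraints g_i(x) <= 0
    def delta(x):
        return sum(max(0.0, g(x)) ** beta for g in g_list)
    return delta

# usage: delta = make_separator([lambda x: x[0]**2 + x[1]**2 - 1.0])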
(29)
(30)
where I_k is a subset of {1,...,k}, |I_k| < k, defines a convergent outer approximation method. Let ↑, ↓ denote monotonically increasing and decreasing convergence, respectively.
ε_ij > 0 (i,j ∈ ℕ),   (31)
ε_ij ↑ ε_i as j → ∞,  ε_i ↓ 0 as i → ∞.   (32)
Assume that the sequence {l_k(x)} in the outer approximation method satisfies (9), (4) and the requirements (i), (ii) and (iii) of Theorem II.1. Then
(33)
with
(34)
approximation method satisfying (31), (32), (33), (34). Let {x^q} be a subsequence of {x^k} converging to x̄.
If δ(x^q) → 0, then x̄ ∈ D by Definition II.1.
Suppose that δ(x^q) ↛ 0. By passing to a suitable subsequence if necessary, we may assume that δ(x^q) ≥ ε > 0 ∀q. Since ε_i ↓ 0, we may also assume that ε ≥ ε_i for all i considered. Then using (32), we have δ(x^i) ≥ ε ≥ ε_i ≥ ε_{i,q−1} ∀i < q−1. By (34), this implies i ∈ I_{q−1}, hence x^q ∈ L_i, i.e., l_i(x^q) ≤ 0 ∀i < q. Considering the subsequence {x^r} of {x^q} satisfying (ii) of Theorem II.2, we see that l_r(x^{r'}) ≤ 0 ∀r' > r, and hence, as in the proof of Theorem II.2, lim_{r→∞} l_r(x̄) ≤ 0.
But by (4) we have l_r(x^r) > 0 ∀r; hence by (ii), lim_{r→∞} l_r(x̄) ≥ 0. By (iii) of Theorem II.1, it then follows that x̄ ∈ D, which implies that x̄ solves (P).
•
Many double-indexed sequences satisfying the above requirements exist. For example, let {η_i} be any monotonically decreasing sequence of positive real numbers
all s > k: a dropped cut remains dropped forever. Likewise, a cut retained at some iteration k may be dropped at a subsequent iteration s > k. On the other hand, for i,k → ∞ we have ε_ik → 0, which at first glance seems to imply that few "late" constraints can be dropped for large i,k, by (34). However, note that we also have δ(x^i) → 0, so that the number of constraints to be dropped for large i,k depends on the relative speed of convergence of δ(x^i) and ε_ik. This limit behaviour can be influenced by the choices of δ and ε_ik, and should be investigated further.
We now discuss the question of how to solve the subproblems (Q_k) in outer approximation methods that use affine cuts. In global optimization, these algorithms have often been applied to problems satisfying
where D_k is a polytope and V_k = V(D_k) denotes the vertex set of D_k. The best-known examples are concave minimization and stable d.c. programming, which can be reduced to parametric concave minimization (cf. Chapters I, IX and X).
The problem of finding all vertices and redundant constraints of a polytope given by a system of linear inequalities has been treated often (e.g., Manas and Nedoma (1968), Matheiss (1973), Gal (1975), Dyer and Proll (1977 and 1982), Matheiss and Rubin (1980), Dyer (1983), Khang and Fujiwara (1989)). Since, however, in our case V_k and the new constraint are given, we are interested in developing methods that take into account our specific setting rather than applying one of the standard methods referred to above. We will also briefly treat the case of unbounded polyhedral convex sets D_k, where vertices and extreme directions are of interest.
Denote
α_j := min {x_j: x ∈ D} (j = 1,...,n)   (35)
and
α := max {Σ_{j=1}^n x_j: x ∈ D}.   (36)
Then
D_1 := {x ∈ ℝ^n: α_j − x_j ≤ 0 (j = 1,...,n), Σ_{j=1}^n x_j − α ≤ 0}   (37)
is a simplex containing D. The n+1 facets of D_1 are defined by the n+1 hyperplanes {x ∈ ℝ^n: x_j = α_j} (j = 1,...,n) and {x ∈ ℝ^n: Σ_{i=1}^n x_i = α}. Each of these hyperplanes is a supporting hyperplane of D. The set of vertices of D_1 is
V_1 = {v^0, v^1,...,v^n},
where
v^0 = (α_1,...,α_n)^T   (38)
and
v^j = (α_1,...,α_{j−1}, β_j, α_{j+1},...,α_n)^T (j = 1,...,n)   (39)
with
β_j = α − Σ_{i≠j} α_i.
Note that (35), (36) define n+1 convex optimization problems with linear objective functions. For this reason their solutions can be computed efficiently using standard optimization algorithms. If D is contained in the orthant ℝ^n_+ (e.g., if the constraints include the inequalities x_j ≥ 0 (j = 1,...,n)), it suffices to compute α, and we may take
D_1 = {x ∈ ℝ^n: x_j ≥ 0 (j = 1,...,n), Σ_{j=1}^n x_j ≤ α}.   (40)
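The following Python sketch solves the n+1 linear programs (35), (36) with scipy for the special case where D is itself given by linear inequalities Ax ≤ b (for a general convex D a convex programming solver would be used instead); all identifiers are illustrative:

import numpy as np
from scipy.optimize import linprog

def enclosing_simplex(A, b):
    # alpha_j = min {x_j : Ax <= b}, cf. (35); alpha = max {sum_j x_j}, cf. (36)
    m, n = A.shape
    free = [(None, None)] * n
    alpha_vec = np.empty(n)
    for j in range(n):
        c = np.zeros(n); c[j] = 1.0
        alpha_vec[j] = linprog(c, A_ub=A, b_ub=b, bounds=free, method="highs").fun
    alpha = -linprog(-np.ones(n), A_ub=A, b_ub=b, bounds=free, method="highs").fun
    return alpha_vec, alpha    # data of the simplex D_1 in (37)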
If known lower and upper bounds α_j and β_j on the variables are available, then obviously D_1 may be the rectangular set
D_1 = {x ∈ ℝ^n: α_j ≤ x_j ≤ β_j (j = 1,...,n)}   (41)
with vertex set
V_1 = {v^1,...,v^{2^n}},   (42)
where each vertex v^i has coordinates v^i_j ∈ {α_j, β_j} (j = 1,...,n).
Let the current polyhedral convex set constituting the feasible set of the relaxed problem (Q_k) be the set D_k given by (44), and let the new cut l_k(x) = a^k x + β_k ≤ 0 be given by (45), so that D_{k+1} = D_k ∩ {x: l_k(x) ≤ 0}.
The following lemma characterizes the new vertices and extreme directions of D_{k+1}.
Lemma II.1. Let P ⊆ ℝ^n be an n-dimensional convex polyhedral set with vertex set V and extreme direction set U, let l(x) = ax + β be an affine function, and let V', U' denote the vertex set and the set of extreme directions, respectively, of P' = P ∩ {x: l(x) ≤ 0}. Then we have
a) w ∈ V' \ V if and only if w is the intersection of the hyperplane {x: l(x) = 0} with an edge [v^−, v^+] of P satisfying l(v^−) < 0, l(v^+) > 0, or with an unbounded edge emanating from a vertex v ∈ V in a direction u ∈ U satisfying either l(v) < 0 and au > 0 or l(v) > 0 and au < 0.
b) u ∈ U' \ U if and only if u satisfies au = 0 and is of the form u = λu^− + μu^+ with λ, μ > 0, au^− < 0, au^+ > 0, where u^−, u^+ define a two-dimensional face of the recession cone of P.
Now let w ∈ V' \ V. Since w ∈ V', among the linear constraints defining P' there are n linearly independent constraints active at w, and
F := {x ∈ P: l_i(x) = 0, i ∈ J}
Now let C and C' denote the recession cones of P and P', respectively. We have
Let u ∈ U' \ U. Then, as in the proof of part a), we conclude that among the constraints defining C there are (n−1) linearly independent constraints active at u.
G := {y ∈ C: a^i y = 0, i ∈ J}
is a smallest face of C containing the ray {αu: α ≥ 0}. Certainly, G ≠ {αu: α ≥ 0}, since otherwise u would be an extreme direction of P, i.e., u ∈ U. Therefore, dim G = 2, and G is a two-dimensional cone generated by two extreme directions u^−, u^+ ∈ U, u^− ≠ u, u^+ ≠ u.
Thus, we have u = λu^− + μu^+ with λ, μ > 0.
This implies au^− ≠ 0, au^+ ≠ 0 and (au^−)(au^+) < 0, since 0 = au = λau^− + μau^+. •
Note that part a) of Lemma II.1 requires that P possesses edges, and part b) requires the existence of at least one two-dimensional face of the recession cone of P. It is easy to see that otherwise we cannot have new vertices or new extreme directions, respectively.
Method I:
Denote
U_k^+ := {u ∈ U_k: a^k u > 0},  U_k^− := {u ∈ U_k: a^k u < 0},   (47)
and, analogously, V_k^+ := {v ∈ V_k: l_k(v) > 0}, V_k^− := {v ∈ V_k: l_k(v) < 0}.
a) Finding the vertices of V_{k+1} \ V_k that are points where {x ∈ ℝ^n: l_k(x) = 0} meets a bounded edge of D_k:
For any pair (v^−, v^+) ∈ V_k^− × V_k^+ let
w = αv^− + (1 − α)v^+,   (48)
where
α = l_k(v^+) / (l_k(v^+) − l_k(v^−))   (49)
(so that l_k(w) = 0). If the rank of the matrix A(w) having the rows a^i, where l_i(x) = a^i x + β_i, a^i ∈ ℝ^n, β_i ∈ ℝ, i ∈ I(w), is less than n−1, then w cannot be in V_{k+1} \ V_k. Otherwise, w is in V_{k+1} \ V_k.
If D_k is bounded, then V_{k+1} \ V_k is determined in this way.
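A minimal Python sketch of part a) of Method I under the stated assumptions (D_k bounded, constraints stored as l_i(x) = a^i x + β_i ≤ 0); the |I(w)| and rank tests mirror the procedure described above, but all names are ours:

import numpy as np
from itertools import product

def new_vertices(V, A, beta, a_new, beta_new, tol=1e-9):
    # V: current vertices of D_k; A[i], beta[i]: data of l_i; a_new, beta_new: the cut
    n = len(a_new)
    l = lambda x: float(a_new @ x + beta_new)
    V_minus = [np.asarray(v, float) for v in V if l(v) < -tol]   # V_k^-
    V_plus = [np.asarray(v, float) for v in V if l(v) > tol]     # V_k^+
    found = []
    for vm, vp in product(V_minus, V_plus):
        alph = l(vp) / (l(vp) - l(vm))            # cf. (49): l(w) = 0
        w = alph * vm + (1.0 - alph) * vp         # cf. (48)
        I_w = [i for i in range(len(beta)) if abs(A[i] @ w + beta[i]) <= tol]
        if len(I_w) >= n - 1 and np.linalg.matrix_rank(A[I_w]) >= n - 1:
            found.append(w)                       # candidate for V_{k+1} \ V_k
    return found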
Note that, by the proof of Lemma II.1, we have
(50)
Since the calculation of the rank of A(w) may be time-consuming, one may simply calculate I(w) via (50) and consider all points w defined by (48), (49) which satisfy |I(w)| = n−1. Then, in general, we obtain a set Ṽ_{k+1} larger than V_{k+1}. But since V_{k+1} ⊆ Ṽ_{k+1} ⊆ D_{k+1}, and since we assume that argmin f(D_{k+1}) = argmin f(V_{k+1}), we have argmin f(D_{k+1}) = argmin f(Ṽ_{k+1}). The computational effort for the whole
β) Finding the vertices of V_{k+1} \ V_k that are points where {x ∈ ℝ^n: l_k(x) = 0} meets an unbounded edge of D_k:
For any pair (u, v) ∈ {U_k^− × V_k^+} ∪ {U_k^+ × V_k^−}, determine w = v + αu, where α = −l_k(v)/(a^k u) (i.e., l_k(w) = 0). Compute I(w) = {i: l_i(w) = 0, i ∈ K} and, as in the bounded case, decide from the rank of A(w) whether w ∈ V_{k+1} \ V_k. Note that
Method II:
In the most interesting case in practice, when D and D_k are bounded, the vertices of V_{k+1} \ V_k may be calculated by linear programming techniques (cf., e.g., Falk and Hoffman (1976), Hoffman (1981), Horst, Thoai and de Vries (1988)).
We say that a vertex v of a polytope P is a neighbour in P of a vertex w of P if [v, w] is an edge of P.
Let l_k(x) ≤ 0 be the new constraint and consider the polytope
Then, from Lemma II.1, it follows that w ∈ V_{k+1} \ V_k if and only if w is a vertex of D_{k+1} satisfying w ∉ V_k, l_k(w) = 0 and having a neighbour v in D_{k+1} with v ∈ V_k^−.
Falk and Hoffman (1976) represent each vertex v in V_k^− by the entire system of inequalities defining D_{k+1}. This system must be transformed by a suitable pivoting such that each v ∈ V_k^− is represented in the form
s + Tt = b,
where s and t are the vectors of basic and nonbasic variables, respectively, corresponding to v. By performing dual pivoting on all current nonbasic variables in the row corresponding to l_k(x) ≤ 0, one obtains the neighbouring vertices of v on the new cutting hyperplane or detects that such a neighbour vertex does not exist. The case of degenerate vertices is not considered in Falk and Hoffman (1976) and Hoffman (1981).
The proposal of Horst, Thoai and de Vries (1988) is based on the following consideration.
For each v ∈ V_k^−, denote by E(v) the set of halflines (directions) emanating from v, each of which contains an edge of D_k, and denote by I(v) the index set of all constraints that are active (binding) at v. By Lemma II.1, the set of new vertices coincides with those intersection points of the halflines e ∈ E(v) with the hyperplane {x: l_k(x) = 0}. The system
a^{i_k} x + β_{i_k} ≤ 0,  i_k ∈ I(v) (k = 1,...,n)
defines a polyhedral cone with vertex at v which is generated by the halflines e ∈ E(v).
To carry out the calculations, in a basic form of the procedure, one introduces slack variables y_1,...,y_{n+1} and considers the tableau

   x_1        x_2      ...  x_n        y_1  y_2  ...  y_n  y_{n+1}   RHS
  a_1^{i_1}  a_2^{i_1} ...  a_n^{i_1}   1    0   ...   0     0      −β_{i_1}
   ...
  a_1^{i_n}  a_2^{i_n} ...  a_n^{i_n}   0    0   ...   1     0      −β_{i_n}
  a_1^k      a_2^k     ...  a_n^k       0    0   ...   0     1      −β_k
(1977 and 1983); Horst, Thoai and de Vries (1988)). A comprehensive treatment of the determination of the neighbouring vertices of a given degenerate vertex of a polytope is given in Kruse (1986) (cf. also Gal et al. (1988), Horst (1991)).
In the procedure above, the new vertices are calculated from V_k^−. Obviously, instead of V_k^− one can likewise use V_k^+. Since the numbers of elements of these two sets may differ considerably, one should always use the set with the smaller number of vertices.
number of new vertices created by the cut (for a related conjecture see, e.g., Matheiss and Rubin (1980), Avis and Fukuda (1992)).
Table II.1 below shows some related results for 20 randomly generated examples.
n: dimension of D_k;
m: number of constraints defining D_k;
sign: indicates which set of vertices is taken for the calculation of the new vertices.
For a comparison of the procedure of Horst, Thoai and de Vries (HTV) with the method of Falk and Hoffman, note that in an outer approximation algorithm using cutting planes one usually starts with a simplex or an n-rectangle defined by
original FORTRAN code of Thieu-Tam-Ban for their relaxed version and our FORTRAN code for the procedure presented in the preceding section. In the relaxed version of Thieu-Tam-Ban, the decision on w ∈ V_{k+1} or w ∉ V_{k+1} is based only on the number of constraints that are binding at w and omits the relatively expensive determination of their rank (relaxed procedure). The results show that the Thieu-Tam-Ban algorithm should not be used if |V| exceeds 100 (cf. Table II.2 below).
Problem No.   n    m    |V_k|   |V_{k+1}|   T1 [Min]   T2 [Min]
  1           3    7      9        10        0.008      0.004
  2           -   16     10        10        0.018      0.010
  3           5   28     34        42        0.390      0.300
  4          10   14     27        32        0.420      0.170
  5           -   17     48        48        0.720      0.670
  6           -   12     32        56        0.120      0.110
  7           -   18     56        80        0.580      1.040
  8           -   19     80        80        1.12       2.18
  9           -   16    160       256        3.16      20.610
 10           -   18    288       256        3.55      27.310
 11           -   27    400       520        2.65      53.650
 12           -   30    672       736        1.91      81.850
 13          20   23     57        72        0.96       1.980
 14           -   24     72       220        1.21       8.250
 15           -   25    220       388        7.54      81.640
 16           -   26    388      1233        7.71     303.360

n: dimension of D_k
Another way of treating the unbounded case is as follows (cf. Tuy (1983)). Assume D_1 ⊆ ℝ^n_+. Recall first that, under the natural correspondence between points of the hyperplane {(1,x): x ∈ ℝ^n} in ℝ^{n+1} and points x ∈ ℝ^n, a point x ∈ D_k can be represented by the ray ρ(x) = {α(1,x): α ≥ 0}, and a direction of recession y can be
Associate to each point x ∈ D_k the point s(x) where the ray ρ(x) meets S, and to each direction of recession y of D_k associate the point s(y) where the ray ρ(y) meets S. In this way there is a one-to-one correspondence between the points and directions of recession of D_k and the points of the corresponding polytope s(D_k) ⊆ S.
Denote the vertex set of s(D_k) by W_k. Then it follows from these observations that W_{k+1} can be derived from W_k by one of the methods presented above. Therefore, the set of vertices and extreme directions of D_{k+1} can be computed whenever the vertices and extreme directions of D_k and the new constraint are known.
where
(52)
We also say that the constraint l_j(x) ≤ 0 is redundant for P relative to l_{i_0}.
In a cutting plane approach as described above, the first polytope can usually be
cannot be redundant for P'. If we assume that redundancy is checked and redundant constraints are eliminated at each step (so that the current polytope has no redundant constraints), then P' can only possess redundant constraints relative to the cut l_k. These redundant constraints can be eliminated by means of the following criterion:
(53)
l_j(x) ≤ 0 is redundant for P' relative to l_k if and only if conv V(F_j) ⊆ {x: l_k(x) ≥ 0}. Obviously, because of the convexity of the halfspace {x: l_k(x) ≥ 0}, this inclusion is equivalent to the condition V(F_j) ∩ V^−(P) = ∅, which in turn holds if and only if (53) is satisfied. •
If one does not want to investigate redundancy at each step of the cutting plane method, then one can choose to identify only the so-called strictly redundant constraints:
F_j := P ∩ {x ∈ ℝ^n: l_j(x) = 0} ⊆ {x ∈ ℝ^n: l_{i_0}(x) > 0}.   (54)
The following assertion shows that, whenever a constraint is strictly redundant for
(55)
relative to l_k(x) ≤ 0.
Proof. a) The proof of (55) is similar to the proof of (53) (replace V^−(P) by V(P) \ V^+(P)).
b) Let l_j(x) ≤ 0 be strictly redundant for P and let l_{i_0} satisfy F_j ⊆ {x: l_{i_0}(x) > 0}. Then we have P ∩ {x: l_j(x) ≥ 0} = F_j ⊆ {x: l_{i_0}(x) > 0}. It follows that l_j(v) < 0 for every v ∈ V(P) with l_{i_0}(v) ≤ 0. Hence, we see that l_j(v) < 0 ∀v ∈ V(P) \ V^+(P) holds, and, by part a) of Theorem II.7, l_j(x) ≤ 0 is strictly redundant for P' relative to l_k(x) ≤ 0. •
Remark II.1. Let the feasible set be D := {x ∈ ℝ^n: g(x) ≤ 0} with strictly convex g: ℝ^n → ℝ, and assume that int D ≠ ∅. If all of the facets of the initial polytope D_1 and all hyperplanes generated by an outer approximation method support D, then it
CHAPTER III

CONCAVITY CUTS
In Chapter II we discussed the general concept of a cut and the use of cuts in the basic technique of outer approximation. There, we were mainly concerned with using cuts in a "conjunctive" manner: typically, cuts were generated in such a way that no feasible point of the problem is excluded and the intersection of all the cuts contains the whole feasible region. This technique is most successful when the feasible region is a convex set, so that supporting hyperplanes can easily be constructed to separate an infeasible point from the feasible set.
We now discuss cuts that are used in a "disjunctive" manner: each cut taken sep-
arately may exclude certain points of the current region of interest, but the union of
all cuts constructed at a given stage covers this region entirely. This technique is
often used for approximating a certain nonconvex set and separating it from a point
lying outside.
Consider the problem of minimizing a continuous function f: ℝ^n → ℝ over a polyhedron D of full dimension in ℝ^n. Suppose that γ is the smallest value of f(x) taken over all feasible points where an evaluation has been made up to a given stage in the process of solving the problem. Then we may restrict our further search to the subset of D consisting only of points x satisfying f(x) < γ. In other words, we may discard all points x ∈ D in the set
which, except in certain simple cases, is no longer polyhedral and so would be difficult to handle. Therefore, in order to preserve the polyhedral structure of the constraint set, we have to consider instead an affine function L(x) such that the constraint L(x) ≥ 0 does not exclude any feasible point x with f(x) < γ, i.e., such that
Definition III.1. A linear inequality L(x) ≥ 0 satisfying (2) is called a γ-valid cut for (f,D).
Typically, we have a point z ∈ D such that f(z) > γ, and we would like to have the cut eliminate it, i.e., we require that
L(z) < 0.
When f(x) is a convex function, so that the set D*(γ) is convex, any hyperplane L(x) = 0 strictly separating z from D*(γ) yields a γ-valid cut that excludes z. For example, if p ∈ ∂f(z), then f(z) + p(x − z) ≤ f(x), and the inequality
1) In this chapter it will be more convenient to denote a cut by L(x) ≥ 0 rather than by l(x) ≤ 0 as in Chapter II.
x^0. Let u^1, u^2,...,u^n denote the directions of these edges. If the i-th edge is a line segment joining x^0 to an adjacent vertex y^i, then one can take u^i = y^i − x^0. However, note that certain edges of D may be unbounded (the corresponding vectors u^i are extreme directions of D). Since D has full dimension (dim D = n), the vectors u^1, u^2,...,u^n are linearly independent, and the cone vertexed at x^0 and generated by the halflines emanating from x^0 in the directions u^1, u^2,...,u^n is an n-dimensional cone K with exactly n edges such that D ⊆ K (in fact, K is the smallest cone with vertex at x^0 that contains D).
Now for each i = 1,...,n let us take a point z^i ≠ x^0 on the i-th edge of K such that f(z^i) ≥ γ (see Fig. III.1). These points exist by virtue of the assumption x^0 ∈ int G.
The n×n matrix
Q = (z^1 − x^0, z^2 − x^0,...,z^n − x^0),
with z^i − x^0 as its i-th column, is nonsingular because its columns are linearly independent.
Theorem III.1. If z^i = x^0 + θ_i u^i with θ_i > 0 and f(z^i) ≥ γ, then the linear inequality
eQ^{−1}(x − x^0) ≥ 1   (3)
is a γ-valid cut for (f,D).
Proof. Since z^1 − x^0, z^2 − x^0,...,z^n − x^0 are linearly independent, there is a unique hyperplane H passing through the n points z^1, z^2,...,z^n. The equation of this hyperplane can be written in the form π(x − x^0) = 1, where π is some n-vector. Since this equation is satisfied by each z^i, we can write
π(z^i − x^0) = 1 (i = 1,2,...,n),   (4)
i.e.,
πQ = e,
hence π = eQ^{−1}. Thus, the equation of the hyperplane H through z^1, z^2,...,z^n is eQ^{−1}(x − x^0) = 1.
Now let S = [x^0, z^1,...,z^n] := conv {x^0, z^1,...,z^n} denote the simplex spanned by x^0, z^1,...,z^n. Clearly, S = K ∩ {x: eQ^{−1}(x − x^0) ≤ 1}. Since x^0, z^i ∈ G (i = 1,2,...,n) and the set G is convex, it follows that S ⊆ G; in other words, f(x) ≥ γ ∀x ∈ S. Then, if x ∈ D satisfies f(x) < γ, we must have x ∉ S, and since D is contained in the cone K, this implies eQ^{−1}(x − x^0) > 1. Therefore,
{x ∈ D: f(x) < γ} ⊆ {x ∈ D: eQ^{−1}(x − x^0) ≥ 1},
i.e., (3) is a γ-valid cut. •
Let U denote the matrix with columns u^1,...,u^n, so that
K = {x: x = x^0 + Ut, t ≥ 0}.   (5)
Clearly, t = U^{−1}(x − x^0). Using (5), from Theorem III.1 we can derive the following form of a valid cut, which in computations is used more often than the form (3).
Corollary III.1. For any θ = (θ_1, θ_2,...,θ_n) > 0 such that f(x^0 + θ_i u^i) ≥ γ (i = 1,...,n), the linear inequality
Σ_{i=1}^n t_i/θ_i ≥ 1   (6)
is a γ-valid cut for (f,D).
Proof. If z^i = x^0 + θ_i u^i (i = 1,2,...,n) and we denote Q = (z^1 − x^0, z^2 − x^0,...,z^n − x^0), then Q = U diag (θ_1,...,θ_n). Hence,
Q^{−1} = diag (1/θ_1,...,1/θ_n) U^{−1},
eQ^{−1}(x − x^0) = (1/θ_1,...,1/θ_n) U^{−1}(x − x^0) = (1/θ_1,...,1/θ_n)(t_1,...,t_n)^T = Σ_{i=1}^n t_i/θ_i,
and so (3) is equivalent to (6).
•
Definition III.2. A γ-valid cut which excludes a larger portion of the set D ∩ G = {x ∈ D: f(x) ≥ γ} than another one is said to dominate the latter, or to be stronger (deeper) than the latter.
Obviously, if θ'_i ≤ θ_i (i = 1,...,n), then the cut Σ_{i=1}^n t_i/θ'_i ≥ 1 cannot dominate the cut (6). Therefore, the strongest γ-valid cut of type (6) corresponds to θ_i = α_i (i = 1,2,...,n), where
α_i = sup {t > 0: f(x^0 + t u^i) ≥ γ}.   (7)
Since this cut originated in concave programming, and since the set D*(γ) that it separates from x^0 is "concave", we shall refer to this strongest cut, i.e., to the cut (6) with θ_i = α_i, as a concavity cut.
Definition III.3. A cut of the form (6) with θ_i = α_i satisfying (7) is called a concavity cut.
Note that some of the values α_i defined by (7) may be equal to +∞. This occurs when the i-th edge of K lies entirely in G. With the convention 1/(+∞) = 0, we see that if I = {i: α_i < +∞}, then the concavity cut is given by
Σ_{i∈I} t_i/α_i ≥ 1.   (8)
In this case the hyperplane of the concavity cut is parallel to each direction u^j with j ∉ I (see Fig. III.2). Of course, the cut (8) will exist only if I is nonempty. Otherwise, the cone K will have all its edges contained in G, in which case K ⊆ G and there is no point x ∈ D with f(x) < γ (in this situation one can say that the cut is "infinite").
[Fig. III.2: the level set f(x) = γ, the concavity cut, and the case α_j = +∞.]
The construction of the cut (3) or (6) relies heavily on the assumption that x^0 is a nondegenerate vertex of D. When this assumption fails to hold, there are more than n edges of D emanating from x^0. Then the smallest cone K vertexed at x^0 and containing D has s > n edges. If, as before, we take a point z^i ≠ x^0 satisfying f(z^i) ≥ γ on each edge of K, then there may not exist any hyperplane passing through all these z^i.
The first method, based on linear programming theory, is simply to avoid degeneracy. In fact, it is well known that by a slight perturbation of the vector b one can make the polyhedron nondegenerate (i.e., have only nondegenerate vertices). However, despite its conceptual simplicity, this perturbation approach is rarely convenient computationally. An alternative is to select n linearly independent constraints among the constraints that are binding for x^0. Let D' denote the polyhedron obtained from D by omitting all the other binding constraints for x^0. Then x^0 is a nondegenerate vertex of D', and a γ-valid cut for (f,D') can be constructed at x^0. Since D' ⊇ D, this will also be a γ-valid cut for (f,D).
In practical computations, the most convenient way to carry out this method is as follows. Using standard linear programming techniques, write the constraints defining D in the form
(9)
(10)
where x_B = (x_i, i ∈ B) is the vector of basic variables corresponding to the basic solution x^0 = (x_B^0, 0), x_N = (x_j, j ∈ N), |N| = n, is the vector of the corresponding nonbasic variables, and U is a matrix given by the corresponding simplex tableau.
x_B^0 + Ut ≥ 0,  t ≥ 0.   (11)
The vertex x^0 now corresponds to t^0 = 0 ∈ ℝ^n, while the concave function f(x) becomes a certain concave function φ(t) such that φ(0) = f(x^0) > γ (hence, the convex set {x: f(x) > γ} becomes the convex set {t: φ(t) > γ} in t-space). The polyhedron D is now contained in the orthant t ≥ 0, which is a cone vertexed at the origin t^0 = 0 and having exactly n edges (though some of these edges may not be edges of D, but rather lie outside D). Therefore, all the constructions discussed in the previous section apply. In particular, if e^i denotes the i-th unit vector in t-space, and if φ(θ_i e^i) ≥ γ, then a γ-valid concavity cut is given by the same inequality (6) of Corollary III.1. Note that when the vertex x^0 is nondegenerate (so that D' = D), the variables t_i (i = 1,...,n) and the matrix U in (11) can be identified with the variables t_i (i = 1,...,n) and the matrix U in (5).
Consider now the system of linear inequalities
πu^i ≥ 1/α_i (i = 1,...,s),   (12)
where u^1, u^2,...,u^s are the directions of the edges of D emanating from x^0 (s > n), and α_i is defined by (7) (with the usual convention 1/α_i = 0 if α_i = +∞).
Lemma III.1. The system (12) is consistent and has at least one basic solution.
Proof. Since dim K = n and K has a vertex (i.e., the lineality of K is 0), the polar K° of −K + x^0 must have full dimension: dim K° = n, so that K° contains an interior point, i.e., a point t satisfying tu^i > 0 ∀i = 1,2,...,s (cf. Rockafellar (1970), Corollary 14.6.1). Then the vector π = λt^T, with λ ∈ ℝ, λ > 0 sufficiently large, will satisfy the system (12). That is, the system (12) is consistent. Since obviously the origin cannot belong to the polyhedron described by (12), the latter contains no entire straight line. Hence, it has at least one extreme point (cf. Rockafellar (1970), Corollary 18.5.3), i.e., the system (12) has at least one basic solution.
•
Proposition III.1. Any solution π of (12) provides a γ-valid cut for (f,D) by means of the inequality
π(x − x^0) ≥ 1.
Proof. Denote M = {x ∈ K: π(x − x^0) ≤ 1}, where, we recall, K is the cone vertexed at x^0 with s edges of directions u^1, u^2,...,u^s. Since D ⊆ K, it suffices to show that
f(x) ≥ γ  ∀x ∈ M.   (13)
Any x ∈ M can be written as
x = x^0 + Σ_{i=1}^s λ_i u^i,  λ_i ≥ 0.
Furthermore, since π(x − x^0) ≤ 1, it follows that
1 ≥ Σ_{i=1}^s λ_i (πu^i) ≥ Σ_{i=1}^s λ_i/α_i.
Hence, setting μ = Σ_{i=1}^s λ_i/α_i ≤ 1, we can write
x = Σ_{i=1}^s (λ_i/α_i)(x^0 + α_i u^i) + (1 − μ)x^0,   (14)
which means that x is a convex combination of the points x^0 and x^0 + α_i u^i (i = 1,...,s).
Since f(x^0) ≥ γ and f(x^0 + α_i u^i) ≥ γ, it follows that f(x) ≥ γ; hence we have proven (13) for the case where all of the α_i are finite.
In the case where some of the α_i may be +∞, the argument is similar, but instead of (14) we write
x = Σ_{i∈I} (λ_i/α_i)(x^0 + α_i u^i) + (1 − μ)[x^0 + Σ_{i∉I} (λ_i/(1 − μ)) u^i],
where I = {i: α_i < +∞} and μ = Σ_{i∈I} λ_i/α_i. •
If π is a basic solution of (12), then it satisfies n of the inequalities of (12) as equalities, i.e., the corresponding cutting hyperplane passes through n points z^i = x^0 + α_i u^i (with the understanding that if α_i = +∞ for some i, this means that the hyperplane is parallel to u^i).
A simple way to obtain a basic solution of the system (12) is to solve a linear program such as
When s = n (i.e., the vertex x^0 is nondegenerate), the system (12) has a unique basic solution, which corresponds exactly to the concavity cut.
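For instance, a basic solution of (12) can be obtained with any LP solver. The following Python sketch minimizes Σ_i πu^i over the polyhedron (12); this particular objective is our assumption, since the text's linear program is not reproduced here:

import numpy as np
from scipy.optimize import linprog

def basic_solution_of_12(Us, inv_alpha):
    # Us: s x n matrix of edge directions u^i (rows); inv_alpha[i] = 1/alpha_i (0 if +inf)
    s, n = Us.shape
    c = Us.sum(axis=0)         # objective sum_i pi u^i, bounded below by sum_i 1/alpha_i
    res = linprog(c, A_ub=-Us, b_ub=-np.asarray(inv_alpha),
                  bounds=[(None, None)] * n, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x               # pi defines the cut pi (x - x^0) >= 1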
The cuts which are used in the outer approximation method (Chapter II) are conjunctive in the following sense: at each step k = 1,2,... of the procedure we construct a cut L_k = {x: l_k(x) ≤ 0} such that the set to be considered in the next step is
where D is the feasible set. Since each L_k is a halfspace, it is clear that this method
By contrast, the valid cuts developed above may be used in disjunctive form.
Consider again the problem of minimizing a continuous function f over a polyhedron D in ℝ^n. If γ is the best feasible value of f(x) known up to a given stage, we would like to find a feasible point x with f(x) < γ or else to be sure that no such point exists (i.e., γ is the global optimal value). We construct a system of cuts
L_{1,1} = {x: L_{1,1}(x) ≥ 0},..., L_{1,N_1} = {x: L_{1,N_1}(x) ≥ 0} such that
i.e., the union (disjunction) of all these cuts together forms a set L_1 which plays the role of a γ-valid cut for (f,D), namely it excludes x^1 without excluding any point of D*(γ).
Thus, a general cutting procedure works as follows. Given a target set Ω which is a subset of some set D (for instance, Ω = D*(γ)), we start with a set L_0 ⊇ Ω. At step
are done. Otherwise, construct a system of cuts L_{k,j} (j = 1,...,N_k) such that
L_k = ∪_{j=1}^{N_k} L_{k,j} ⊇ Ω,   (15)
(17)
Many convergence proofs for cutting plane procedures are based on the following simple but useful proposition.
x^k ∈ ∩_{h<k} L_h,   (18)
then
d(x^k, L_k) → 0 (k → ∞)
(where d is the distance function in ℝ^n). The same conclusion holds if instead of (18) we have
x^k ∈ ∩_{h>k} L_h.   (19)
Proof. Suppose that (18) holds, but d(x^k, L_k) does not tend to 0. Then there exists a positive number ε and an infinite sequence {k_ν} ⊆ {1,2,...} satisfying d(x^{k_ν}, L_{k_ν}) ≥ ε for all ν. Since {x^k} is bounded, we may assume, by passing to subsequences if necessary, that x^{k_ν} converges, so that ‖x^{k_μ} − x^{k_ν}‖ < ε for all μ, ν sufficiently large. But, by (18), for all μ > ν we have x^{k_μ} ∈ L_{k_ν}, and hence
‖x^{k_μ} − x^{k_ν}‖ ≥ d(x^{k_ν}, L_{k_ν}) ≥ ε,
a contradiction.
L_k = {x: l_k(x) ≥ 0}.
Then
l_k(x^k) < 0,   (20)
while
l_h(x^k) ≥ 0 (h = 0,1,...,k−1).   (21)
Theorem III.2. Let H_k = {x: l_k(x) = 0}. If there exists an ε > 0 such that d(x^k, H_k) ≥ ε for all k, then the procedure is finite.
Proof. It suffices to apply Lemma III.2 to {x^k} and L_k = {x: l_k(x) ≥ 0}. In view of (21) we have (18). But from (20), d(x^k, L_k) = d(x^k, H_k), and since d(x^k, H_k) ≥ ε for all k, it follows from Lemma III.2 that the procedure must be finite.
•
From this theorem we can derive the following results (see Bulatov (1977)).
Corollary III.2. Suppose that every cutting hyperplane is of the form π^k(x − x^k) = 1. If the sequence {‖π^k‖} is bounded, then the procedure is finite.
Proof. Clearly, d(x^k, H_k) = ‖π^k‖^{−1}. Therefore, if ‖π^k‖ ≤ c for some c ∈ ℝ, c > 0, then d(x^k, H_k) ≥ 1/c, and the conclusion follows from Theorem III.2. •
Corollary III.3. Suppose that every cut is of the form (9), i.e., eQ_k^{−1}(x − x^k) ≥ 1. If the sequence {‖Q_k^{−1}‖}, k = 1,2,..., is bounded, then the procedure is finite.
Proof. This follows from Corollary III.2 with π^k = eQ_k^{−1}. •
Corollary III.4. Suppose that every cut is of the form (6), i.e., Σ_{i=1}^n t_i/θ_{ik} ≥ 1, where t ∈ ℝ^n is related to x ∈ ℝ^n by t = U_k^{−1}(x − x^k). If the sequence {‖U_k^{−1}‖} is bounded and there exists δ > 0 such that θ_{ik} ≥ δ for all i = 1,2,...,n and all k, then the procedure is finite.
Cuts of the type (3) or (6), which were originally devised for concave programming, have proved to be useful in some other problems as well, especially in integer programming (see Glover (1972, 1973 and 1973a), Balas (1971, 1972, 1975, 1975a and 1979), Young (1971)). In fact, Corollary III.1 can be restated in the following form (see Glover (1972)), which may be more convenient for use in certain applications.
Proposition III.2. Let G be a convex set in ℝ^n whose interior contains a point x^0 but does not contain any point of a given set S. Let U be a nonsingular n×n matrix with columns u^1, u^2,...,u^n. Then for any constants θ_i > 0 such that x^0 + θ_i u^i ∈ G (i = 1,...,n), the linear inequality (6), where t = U^{−1}(x − x^0), excludes x^0 without excluding any point of S ∩ K, where K = {x: U^{−1}(x − x^0) ≥ 0}.
Proof. Clearly, K = {x: U^{−1}(x − x^0) ≥ 0} is nothing but the cone generated by the n halflines emanating from x^0 in the directions u^1, u^2,...,u^n. It can easily be verified that the argument used to prove Theorem III.1 and its corollary carries over to the case when G is an arbitrary convex set with x^0 ∈ int G, and the set D*(γ) is replaced by S.
The proposition can also be derived directly from Corollary III.1 by setting D = K, f(x) = −p(x − x^0), γ = −1, where p(y) = inf {λ > 0: x^0 + y/λ ∈ G} is the gauge of G with respect to x^0, so that G = {x: p(x − x^0) ≤ 1}. •
To emphasize the role of the convex set G in the construction of the cut, Glover (1972) has called this cut a convexity cut. In addition, the term "intersection cut" has been used by Balas (1970a), referring to the fact that the cut is constructed by taking the intersection of the edges of the cone K with the boundary of G.
In the sequel, we shall refer to the cut obtained in Proposition III.2 as a concavity cut relative to (G,K). Such a cut is valid for K \ G, in the sense that it does not exclude any point of K \ G.
In Section III.1 concavity cuts were introduced as a tool for handling nonconvex objective functions. We now show how these cuts can also be used to handle nonconvex constraints.
As a typical example, let us consider a problem
such that by omitting the last constraint g(x) ≥ 0 we obtain a relatively easy problem
This is the case, for example, if D is a convex set, while both functions f(x) and g(x) are convex. Setting
G = {x: g(x) ≤ 0},
the constraint g(x) ≥ 0 can be written as
x ∉ int G.   (22)
In order to solve the problem (P) by the outer approximation method discussed in Chapter II, one may proceed as follows.
Start with D_0 = D.
If x^k ∉ int G, then stop (x^k solves (P)). Otherwise, construct a cut L_k(x) ≥ 0 to exclude x^k without excluding any feasible point of (P). Form the new relaxed problem (Q_{k+1}) with constraint set
and go to iteration k+1.
Obviously, a crucial point in this approach is the construction of the cuts. If x^0 denotes an optimal solution of (Q_0), then we can assume that x^0 ∈ int G (otherwise x^0 would solve (P)). Furthermore, in most cases we can find a cone K vertexed at x^0 having exactly n edges (of directions u^1, u^2,...,u^n), and such that D ⊆ K. Then all the conditions of Proposition III.2 are fulfilled (with S = D \ int G), and a cut of type (6) can be constructed that excludes x^0 without excluding any feasible point of (P). The cut at any iteration k = 1,2,... can be constructed in a similar way.
There are cases where a cone K with the above-mentioned properties does not exist, or is not efficient because the cut constructed using this cone would be too shallow (see, for example, the situation with x^1 depicted in Fig. III.3). However, in these cases it is always possible to find a number of cones K_{0,1}, K_{0,2},...,K_{0,N_0} covering D such that the concavity cuts L_{0,i}(x) ≥ 0 relative to (G, K_{0,i}) (i = 1,...,N_0) form a disjunctive system which does not exclude any point of D \ int G. (We shall see in Part B how these cones and the corresponding cuts can be constructed effectively.)
Thus, concavity cuts offer a tool for handling reverse convex constraints such as (22).
In integer programming these cuts are useful, too, because an integrality condition like x_j ∈ {0,1} (j ∈ J) can be rewritten as x_j² ≥ x_j, 0 ≤ x_j ≤ 1 (j ∈ J); hence it implies Σ_{j∈J} h_j(x_j² − x_j) ≥ 0 for arbitrary nonnegative numbers h_j (j ∈ J). Note that the convex set
G = {x: Σ_{j∈J} h_j(x_j − x_j²) ≥ 0}
does not contain in its interior any point x satisfying x_j ∈ {0,1} (j ∈ J). Therefore, if 0 ≤ x_j^0 ≤ 1 for all j ∈ J and 0 < x_j^0 < 1 for some j ∈ J with h_j > 0, then x^0 ∈ int G, and a concavity cut can be constructed to exclude x^0 without excluding any x satisfying x_j ∈ {0,1} (j ∈ J).
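A two-line numerical check of this construction, with hypothetical weights h_j = 1:

# h(x) = sum_j h_j (x_j - x_j^2) is concave, so G = {x: h(x) >= 0} is convex;
# 0-1 points lie on the boundary of G, fractional points in its interior
h = [1.0, 1.0]
h_val = lambda x: sum(hj * (xj - xj * xj) for hj, xj in zip(h, x))
print(h_val([0.0, 1.0]))    # 0.0  -> boundary of G
print(h_val([0.5, 0.3]))    # 0.46 -> interior of G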
The reader interested in a more detailed discussion of suitable choices of the h_j or, more generally, of the use of concavity cuts in integer programming is referred to the cited papers of Glover, Balas, and Young. For a detailed treatment of disjunctive cuts, see Jeroslow (1976 and 1977) and also Sherali and Shetty (1980).
The cuts (6) (or (3)) do not exhaust the class of γ-valid cuts for (f,D). For example, the cut (II) in Fig. III.2, which is stronger than the concavity cut (I), cannot be obtained by Theorem III.1 or its corollary (to see this, it suffices to observe that, by definition, the concavity cut is the strongest of all cuts obtained by Corollary III.1; moreover, a cut obtained by Corollary III.1 never meets an edge of K in its negative extension, as does the cut (II)).
In this section we describe a general class of cuts that usually includes stronger valid cuts than the concavity cut.
Definition III.4. Let G be a convex subset of ℝ^n with an interior point x^0, and let K be a full-dimensional polyhedral cone with vertex at x^0. A cut that does not exclude any point of K \ G is called a (G,K)-cut (cf. Fig. III.4).
When G = {x: f(x) ≥ γ} with f(x) a concave function, we recover the concept of a γ-valid cut.
Assume that the cone K has s edges (s ≥ n) and that for each i ∈ I ⊆ {1,2,...,s} the i-th edge meets the boundary of G at a point z^i ≠ x^0, while for i ∉ I the i-th edge is entirely contained in G, i.e., its direction belongs to the recession cone R(G) of G (cf. Rockafellar (1970)). Denote by u^i the direction of the i-th edge and define
(24)
Clearly, α_i > 0 because z^i ≠ x^0, and β_ij > 0 (otherwise u^i would belong to R(G), because the latter cone is convex and closed); β_ij may be +∞ for certain (i,j).
Theorem III.3. The inequality
π(x − x^0) ≥ 1   (26)
defines a (G,K)-cut, i.e., a cut valid for K \ G, if and only if
πu^i ≥ 1/α_i (i ∈ I)   (27)
and
π(α_i u^i + β_ij u^j) ≥ 0 (i ∈ I, j ∉ I).   (28)
(It is understood that if β_ij = +∞, then the condition π(α_i u^i + β_ij u^j) ≥ 0, i.e., π((α_i/β_ij) u^i + u^j) ≥ 0, means that πu^j ≥ 0.)
Proof. First observe that the inequality (26) defines a valid cut for K \ G if and only if
M = {x ∈ K: π(x − x^0) ≤ 1}
is contained in G. Because of the convexity of all the sets involved, the latter condition in turn is equivalent to requiring that every extreme point of M belongs to G, while every extreme direction of M belongs to R(G).
(i): Let I^+ = {i: πu^i > 0}. We must have I ⊆ I^+, since if πu^i ≤ 0 for some i ∈ I, then along the whole i-th edge of K we would have π(x − x^0) ≤ 0 < 1, so that this edge, which is not entirely contained in G, would lie in M.
It is easily seen that the extreme points of M consist of x^0 and the points x^0 + θ_i u^i, θ_i = 1/(πu^i), where the hyperplane {x: π(x − x^0) = 1} meets the edges of K with i ∈ I^+. According to (23), for i ∈ I these points belong to G if and only if θ_i ≤ α_i, i.e., if and only if (27) holds. On the other hand, for i ∉ I, since u^i is a direction of recession for G, we have x^0 + θ_i u^i ∈ G for all θ_i > 0.
(ii) Let us now show that any extreme direction of M is either a vector u^j (j ∉ I^+) or a vector of the form
(31)
To see this, compare the set
P = {x ∈ R(M): tx = 1}
with S = {x ∈ K_0: tx = 1}, where K_0 = K − x^0 and t satisfies tu^i > 0 (i = 1,...,s). But the vertices of S are obviously v^i = λ_i u^i (i = 1,2,...,s), and each vertex of P is either a vertex v^j of S with πv^j ≤ 0 (i.e., j ∉ I^+) or the intersection of the hyperplane πx = 0 with an edge of S, i.e., with a line segment joining two vertices v^i, v^j such that πv^i > 0 (i.e., πu^i > 0) and πv^j < 0 (i.e., πu^j < 0). In the latter case, let u^{ij} denote the intersection of the line segment [v^i, v^j] with the hyperplane πx = 0. Since the hyperplane πx = 1 meets the halfline from 0 through v^i at θ_i u^i and the halfline from 0 through v^j at −θ_j u^j (Fig. III.5), it is clear that u^{ij} is parallel to θ_i u^i + θ_j u^j. We have thus proved that the extreme
Fig. III.5
directions of M are of the asserted form. If i ∈ I, j ∉ I, then (28) yields α_i θ_j ≥ θ_i β_ij, so that θ_i u^i + θ_j u^j ∈ R(G); if i ∉ I, j ∉ I, then u^i, u^j ∈ R(G), so that θ_i u^i + θ_j u^j ∈ R(G) for all θ_i > 0, θ_j > 0. •
Theorem III.3 yields a very general class of cuts, since the inequalities (27) and (28) that characterize this class are linear in π and can be satisfied in many different ways.
The cut (6) in the nondegenerate case (s = n), the Carvajal-Moreno cut for the degenerate case (Proposition III.1) and the cut obtained by Proposition III.2 are special variants of this class, which share the common feature that their coefficient vectors π satisfy the inequalities πu^i ≥ 0 (i = 1,2,...,s). Though these cuts are easy to construct, they are not always the most efficient ones. In fact, Glover (1973 and 1974) showed that if G is a polyhedron and some edges of K (say, the edges of direction u^j, j ∈ J) strictly recede from all boundary hyperplanes of G, then stronger cuts than the concavity cut can be constructed which satisfy πu^j < 0 for j ∈ J (thus, these cuts use negative extensions of the edges rather than the usual positive extensions; an example is furnished by the cut (II) in Fig. III.2, already mentioned in the previous section).
We close this chapter by showing that these cuts can also be derived from Theorem III.3.
As before, let I denote the index set of those edges of K which meet the boundary of G. Assume that I ≠ ∅ (otherwise, K \ G = ∅).
Theorem III.4. The system
πu^i ≥ 1/α_i (i ∈ I),   (32)
π(α_i u^i + β_ij u^j) ≥ 0 (i ∈ I, j ∉ I)   (33)
is consistent and has at least one basic solution, and any such solution defines a (G,K)-cut π(x − x^0) ≥ 1.
Proof. Since (12) implies (32) and (33), the consistency of the system (12) implies that of the system (32) and (33). Obviously, any solution of the latter satisfies the conditions (27), (28). Hence, by Theorem III.3, it generates a (G,K)-cut. It remains to show that the polyhedron defined by (32) and (33) has at least one extreme point. But since I ≠ ∅, the origin 0 does not satisfy (32); therefore this polyhedron cannot contain any line and must have at least one extreme point, i.e., the system (32), (33) must have a basic solution.
•
The following remark shows how to obtain a cut of the type (32), (33).
Remark III.1. To obtain a cut of the type indicated in Theorem III.4, it suffices to solve, for example, the linear program
This cut is obviously stronger than the Carvajal-Moreno cut obtained by minimizing the same linear function over the smaller polyhedron (12). In particular, if s = n (nondegenerate case), then this cut is determined by the system of equations
α_i πu^i = 1 (i ∈ I),
min_{i∈I} π(α_i u^i + β_ij u^j) = 0 (j ∉ I).
From the latter equation it follows that πu^j < 0 for j ∉ I, provided that max_{i∈I} β_ij < +∞.
If G is a polyhedron, then its recession cone R(G) is readily available from the constraints determining G. In this case, the values β_ij can be easily computed from (24), and we necessarily have max_{i∈I} β_ij < +∞ for every j such that u^j is strictly interior to R(G).
CHAPTER IV
called branch and bound. In this technique, the feasible set is relaxed and subsequently split into parts (branching) over which lower (and often also upper) bounds of the objective function values can be determined (bounding).
In this chapter, branch and bound is developed with regard to the specific needs of global optimization, in a way that covers all the specific approaches that will be discussed in subsequent chapters. A general convergence theory is developed and applied to concave minimization, d.c. programming and Lipschitzian optimization. The basic operations of branching and
where f: A → ℝ and D ⊆ A ⊆ ℝ^n. The set A is assumed to contain all of the subsets constructed in the sequel.
For the moment we only assume that min f(D) exists. Further assumptions will be needed later.
Notice that mere smoothness of the functions involved in problem (P) is not the
i.e., a finite method that guarantees to find a point x* ∈ D such that f(x*) differs from f* = min f(D) by no more than a specified accuracy. It is easy to see that for D robust, from finitely many function values and derivatives and the information that f ∈ C^k (k ∈ ℕ_0 ∪ {∞} known), one cannot compute a lower bound for f*. The reason is simply that one can find a point y* ∈ D and an ε-neighbourhood U of y* such that no point in U has been considered. Then it is well known that one can modify f on U such that f(y*) takes on any desired value, the modification is not detectable from the above local information outside of U, and the modified function is still C^k on D.
Any method not susceptible to this argument must make global assumptions (or
somehow generate global information, as, for example, interval methods) which al-
low one to compute suitable lower bounds for min f(M) at least for some sets of suffi-
ciently simple structure. When such global information is available, as, for example,
in the case of Lipschitz functions or concave functions over polytopes with known
vertex sets, a "branch and bound" method (abbreviated BB) can often be con-
structed. In the last three decades, along with general BB concepts, abundant
branch and bound methods have been proposed for many classes of global optim-
ization problems. Many of these will be treated and sometimes combined with other
methods in subsequent parts of this book. An extensive list of references can also be
found in Chapters 2, 3, 4, 7, 8, 10 and 13 of the Handbook Horst and Pardalos
(1995).
The following presentation is mainly based on Horst (1986 and 1988), Tuy and
Horst (1988).
- Start with a relaxed feasible set M_0 ⊇ D and split (partition) M_0 into finitely many subsets M_i, i ∈ I.
- For each subset M_i determine lower and (if possible) upper bounds β(M_i), α(M_i), respectively, satisfying
Then β := min_{i∈I} β(M_i), α := min_{i∈I} α(M_i) are "overall" bounds, i.e., we have
β ≤ min f(D) ≤ α.
- Otherwise, select some of the subsets M_i and partition these chosen subsets in order to obtain a refined partition of M_0. Determine new, hopefully better bounds on the new partition elements, and repeat the process in this way.
An advantage of BB methods is that during the iteration process one can usually delete certain subsets of D, since one knows that min f(D) cannot be attained there.
can only be measured by the difference α − β of the current bounds. Hence a "good" feasible point found early may be detected as "good" only much later, after many further refinements.
Definition IV.1. Let B be a subset of ℝ^n and I be a finite set of indices. A set {M_i: i ∈ I} of subsets of B is said to be a partition of B if
B = ∪_{i∈I} M_i  and  M_i ∩ M_j = ∂M_i ∩ ∂M_j for all i, j ∈ I, i ≠ j,
where ∂M_i denotes the (relative) boundary of M_i.
For M_0 and all partition sets M, it is natural to use most simple polytopes or convex polyhedral sets, such as simplices, rectangles and polyhedral cones. In this context, a polytope M is often given by its vertex set, a polyhedral cone by its generating vectors.
leting partition sets satisfying M ∩ D = ∅. Clearly, for many feasible sets a decision on whether we have M ∩ D = ∅ or M ∩ D ≠ ∅ will be difficult based on the informa-
Prototype BB procedure:
Step 0 (Initialization):
Otherwise, go to Step 1.
At the beginning of Step k we have the current partition ℳ_{k−1} of a subset of M_0 still of interest. Furthermore, for every M ∈ ℳ_{k−1} we have S_M ⊆ M ∩ D and bounds β(M), α(M) satisfying
Moreover, we have the current lower and upper bounds β_{k−1}, α_{k−1} satisfying
Finally, if α_{k−1} < ∞, then we have a point x^{k−1} ∈ D satisfying f(x^{k−1}) = α_{k−1} (the best feasible point obtained so far).
β(M) ≥ α_{k−1}.
Let ℛ_k be the collection of the remaining sets in the partition ℳ_{k−1},
[Figure: a partition of M_0; fathomed partition elements are deleted, the remaining collection is carried over to the next step.]
(iii) In Step k.4 one can obviously replace any M ∈ ℳ_k by a smaller set M̃ ⊆ M belonging to the prescribed class of partition sets and satisfying M̃ ∩ D = M ∩ D.
or even all sets S_M are empty, and we may even have α_k = ∞ for all k. We will return to this point later, but we first treat the case where "enough" feasible points are available. Then the conditions imposed on S_M and β(M) ensure that {α_k} = {f(x^k)} is a nonincreasing sequence, {β_k} is a nondecreasing sequence, and α_k ≥ min f(D) ≥ β_k. Thus, the difference α_k − β_k measures the proximity of the current best solution x^k to the optimum. For a given tolerance ε > 0, the algorithm can be stopped as soon as α_k − β_k ≤ ε.
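The logic of the prototype procedure can be condensed into the following Python sketch (a best-first variant; the operations lower, feasible_point and split are problem-specific stubs, and all names are ours, not the book's):

import heapq

def branch_and_bound(M0, lower, feasible_point, split, f, eps):
    best_x, alpha = None, float("inf")       # incumbent and upper bound
    heap, tick = [(lower(M0), 0, M0)], 1     # select M with smallest beta(M)
    while heap:
        beta, _, M = heapq.heappop(heap)
        if beta >= alpha - eps:              # fathomed: M cannot improve the incumbent
            continue
        x = feasible_point(M)                # a point of S_M, if available
        if x is not None and f(x) < alpha:
            alpha, best_x = f(x), x
        for child in split(M):               # refine the partition
            b = lower(child)
            if b < alpha - eps:
                heapq.heappush(heap, (b, tick, child)); tick += 1
    return best_x, alpha

Termination within the tolerance eps is, of course, only guaranteed under the consistency and selection conditions discussed below.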
respectively, the limits α = lim_{k→∞} α_k and β = lim_{k→∞} β_k necessarily exist and, by construction, they satisfy β ≤ min f(D) ≤ α. The algorithm is said to be finite if α_k = β_k occurs at some step k, while it is convergent if α_k − β_k → 0, i.e.,
It is clear that the concrete choice of the following three basic operations is crucially important for the convergence and efficiency of the prototype branch and bound procedure:
1
s.t. x 2 -ixl no,
~ 500 ,
The procedure that follows is one of several possibilities for solving this problem by a BB algorithm.
Step 0: M_0 is chosen to be the simplex conv {(0,0), (0,40), (40,0)} having vertices (0,0), (0,40), (40,0).
Solving the corresponding system of linear equations, e.g.,
α_3 = −500,  40α_2 + α_3 = −1300,
yields φ(x_1, x_2) = −20x_2 − 500. Because of the concavity of f(x_1, x_2), φ(x_1, x_2) is underestimating f(x_1, x_2), i.e., we have φ(x_1, x_2) ≤ f(x_1, x_2) ∀(x_1, x_2) ∈ M_0. A lower bound β_0 can be found by solving the convex optimization problem (with linear objective function)
The two feasible points (0,0) and (20,20) are at hand, and we set
The two feasible points (0,0) and (20,20) are at hand, and we set
Step 1: We partition MO into the two simplices M11 = conv {(O,O), (0,40), (20,20)}
,
and M12 = conv {(O,O), (20,20), (40,0)}.
,
As in Step 0, we construct lower bounds P(MI ,I) and P(MI ,2) by minimizing
over MI I n D the affine function 'PlI that coincides with f at the vertices of MI I
" ,
and by minimizing over MI ,2 n D the affine function 'PI ,2 that coincides with f at
the vertices of MI ,2' respectively.
One obtains
SM = {(0,0)(0,10)(20,20}, SM = {(0,0),(20,20)}.
1,1 1,2
1
Hence a (MI ,I) = a (MI ,2)= f(O,O) = -500, a1 = -500 and x = (0,0).
S_{M_{2,1}} = {(0,0), (0,10), (0,20), (20,20)},  α(M_{2,1}) = −500;
S_{M_{2,2}} = {(20,20)},  α(M_{2,2}) = −100.
Another possibility for calculating the lower bounds would have been simply to minimize f over the vertex set of the corresponding partition element M. Keeping
[Figure: the simplex M_0 and its partition; axes x_1, x_2 with ticks at 20 and 40.]
The example shows both the mentioned disadvantage that for a solution found very early it may take a long time to verify optimality, and the typical advantage that parts of the feasible region may be deleted from further consideration.
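The bounding step of the example - interpolating f at the vertices of a simplex - amounts to solving one small linear system; a Python sketch in our own formulation:

import numpy as np

def affine_underestimator(f, vertices):
    # For concave f, the affine phi with phi(v) = f(v) at the n+1 vertices
    # of an n-simplex satisfies phi <= f on the whole simplex.
    V = np.asarray(vertices, dtype=float)       # shape (n+1, n)
    n = V.shape[1]
    A = np.hstack([V, np.ones((n + 1, 1))])     # rows (v^i, 1)
    coef = np.linalg.solve(A, np.array([f(v) for v in V]))
    return coef[:n], coef[n]                    # phi(x) = a @ x + c

Applied to the simplex conv {(0,0), (0,40), (40,0)} with the f-values used above, this reproduces coefficients of the type found in Step 0.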
Definition IV.3. A bounding operation is called finitely consistent if, at every step, any unfathomed partition element can be further refined, and if any decreasing sequence {M_{k_q}} of successively refined partition elements is finite.
Proof. Since any unfathomed partition element can be further refined, the procedure stops only when α_k = β_k and an optimal solution has been attained. A directed graph G can be associated to the procedure in a natural way. The nodes of G consist of M_0 and all partition elements generated by the algorithm. Two nodes are connected by an arc if and only if the first node represents an immediate ancestor of the second node, i.e., the second is obtained by a direct partition of the first.
Obviously, in terms of graph theory, G is a rooted tree with root M_0. A path in G corresponds to a decreasing sequence {M_{k_q}} of successively refined partition elements, and the assumption of the theorem means that every path in G is finite.
On the other hand, by Definition IV.1, each partition consists of finitely many elements; hence, from each node of G only a finite number of arcs can emanate (the "out-degree" of each node is finite; the "in-degree" of each node different from M_0 is one, by construction).
Therefore, for each node M, the set of descendants of M, i.e., the set of nodes attainable (by a path) from M, must be finite. In particular, the set of all descendants of M_0 is finite, i.e., the procedure itself is finite.
•
Remark IV.2. The type of tree that appears in the proof is discussed in Berge (1958) (the so-called "r-finite graphs", cf. especially Berge (1958, Chapter 3, Theorem 2)).
In the sequel, convergence conditions for the infinite BB procedure are considered.
following corollary.
Definition IV.4. A bounding operation is called consistent if at every step any unfathomed partition element can be further refined, and if any infinitely decreasing sequence {M_{k_q}} of successively refined partition elements satisfies
(3)
cessively refined partition elements. However, since β(M_{k_q}) ≤ β_{k_q}, (3) does not necessarily imply lim_{k→∞} α_k = lim_{k→∞} β_k = min f(D) via lim_{q→∞} α_{k_q} = lim_{q→∞} β_{k_q}.
In order to guarantee convergence, an additional requirement must be imposed on the selection operation.
(ii) The relation (3) may be difficult to verify in practice, since α_{k_q} is not necessarily attained at M_{k_q}. In view of the inequality α(M_{k_q}) ≥ α_{k_q} ≥ β(M_{k_q}) and the properties just mentioned, (3) will be implied by the more practical requirement
lim_{q→∞} (α(M_{k_q}) − β(M_{k_q})) = 0,   (4)
which simply says that, whenever a decreasing sequence of partition sets converges to a certain limit set, the bounds also must converge to the exact minimum of f over this limit set.
Stated in words, this means that any portion of the feasible set which is left "unexplored forever" must (in the end) be not better than the fathomed portions.
f(x) ≥ α  ∀x ∈ D.
(iii) If neither of the two previous cases holds for x ∈ D, then any partition set M containing x must be partitioned at some iteration, i.e., it must belong to ℳ_k for some k. Therefore, one can find a decreasing sequence {M_{k_q}} of partition sets satisfying M_{k_q} ∈ ℳ_{k_q}, x ∈ M_{k_q}. By consistency, it follows that lim_{q→∞} α_{k_q} = α = lim_{q→∞} β(M_{k_q}). Hence f(x) ≥ α, since f(x) ≥ β(M_{k_q}) for every q. •
Theorem IV.2 does not include results on the behaviour of the sequence {β_k} of lower bounds. By construction (since {β_k} is a nondecreasing sequence bounded from above by min f(D)), the limit exists and satisfies
β := lim_{k→∞} β_k ≤ inf f(D).
i.e., at least one partition element where the actual lower bound is attained is selected for further partition in Step k of the prototype algorithm.
Several selection rules not explicitly using (5) are actually bound improving. For example, for every partition set M let 𝒯(M) denote the index of the step where M is generated, and choose the oldest partition set, i.e., select
Another rule uses the diameter δ(M) of M, or the length of the longest edge if M is a polytope. Let the refining operation be such that, for every compact M, given ε > 0, after a finite number of refinements of M we have δ(M_i) ≤ ε for all elements M_i of the current partition of M. Choose the largest partition element, i.e., select
Both selections are bound improving simply because of the finiteness of the number of partition elements in each step, which assures that any partition set M
Theorem IV.3. In the infinite BB procedure, suppose that the bounding operation is consistent and the selection operation is bound improving. Then the procedure is convergent:
Among the members of the finite partition of M_0 there exists one, say M_{k_0}, with infinitely many marked descendants; then, among the members of the partition of M_{k_0} there exists one, say M_{k_1}, with infinitely many marked descendants. Continuing in this way, we see that there exists a decreasing sequence {M_{k_q}} such that every M_{k_q} has infinitely many marked descendants. Since the bounding operation is consistent, we have lim_{q→∞} (α_{k_q} − β(M_{k_q})) = 0. But every M_{k_q} has at least one marked descendant, say M^q. Then β(M^q) = β_{h_q} for some h_q > k_q, and since M^q ⊆ M_{k_q} we must
Convergence of the sequence of current best points x^k now follows by standard arguments.
Proof. The set C(x^0) is bounded and, by continuity of f, C(x^0) is closed and therefore compact. By construction, we have f(x^{k+1}) ≤ f(x^k) ∀k; thus {x^k} ⊆ C(x^0). Hence {x^k} possesses accumulation points. Corollary IV.2 then follows immediately
For several classes of problems, however, it is not possible to guarantee that a sequence of best feasible points x^k and the associated upper bounds α_k can be obtained in such a way that consistency holds (e.g., Horst (1988 and 1989), Horst and Dien (1987), Horst and Thoai (1988), Pinter (1988), cf. the comments in Section IV.1). In this case we propose to consider the above sequence {x̃^k} or sequences closely related to it. Let the lower bounds β(M) be the minimum of f on V(M), i.e., β(M) = min f(V(M)), and let the upper bounds α(M) be the minimum of f taken over the feasible vertices of M, i.e., S_M = V(M) ∩ D and
α(M) = { min f(V(M) ∩ D),  if V(M) ∩ D ≠ ∅,
        { ∞,               if V(M) ∩ D = ∅.
By concavity of f, we have
It is easy to define refining procedures and selection rules such that the branch and bound procedure generates a decreasing sequence {M_n} of partition elements satisfying
Obviously, we have
lim_{n→∞} β(M_n) = f(x̃),  lim_{n→∞} v^n = x̃,
but the set of feasible points available to the prototype algorithm is empty. Consistency does not hold, and convergence cannot be established in the usual way. As mentioned in Section IV.1, it is in many cases too difficult to decide exactly whether M ∩ D = ∅ for all partition sets M from the information available. Therefore, we have to invent simple techniques for checking infeasibility that, though possibly incorrect for a given M, lead to correct and convergent algorithms when incorporated in Step k.3 of the prototype BB procedure.
minimize g(x)  s.t. x ∈ M
Note that for the convex set D := {x: g_i(x) ≤ 0, i = 1,...,m} defined by the convex functions g_i: ℝ^n → ℝ (i = 1,...,m), the function g(x) = max {g_i(x), i = 1,...,m} is nonsmooth, and it may be numerically expensive to solve the above minimization problem for each M satisfying V(M) ∩ D = ∅. Moreover, whenever min g(M) > 0 is small, the algorithms available for solving this problem may in practice produce a
The situation is, of course, worse for more complicated feasible sets D. For example, let D be defined by a finite number of inequalities g_i(x) ≤ 0, i ∈ I, where g_i: ℝ^n → ℝ is Lipschitzian on a set M_0 containing D (i ∈ I). Then the problem of deciding whether we have M ∩ D = ∅ or M ∩ D ≠ ∅ is clearly almost as difficult as the original problem (1).
Note that the notions of consistency and strong consistency involve not only the calculation of bounds but, obviously, also the subdivision of partition elements. Moreover, in order to ensure strong consistency of the lower bounding operation, the deletion of infeasible sets in Step k.3 of the BB procedure has to guarantee that M ∩ D ≠ ∅ holds for the limit M of every nested sequence of partition elements generated by the algorithm.
Definition IV.8. The "deletion by infeasibility" rule used in Step k.3 throughout a BB procedure is called certain in the limit if for every infinite decreasing sequence {M_{k_q}} of successively refined partition elements with limit M we have M ∩ D ≠ ∅.
Corollary IV.3. In the BB procedure, suppose that the lower bounding operation is strongly consistent and the selection operation is bound improving. Then we have
β = lim_{k→∞} β_k = min f(D).
tions, and bounding and deletion rules are presented that illustrate the wide range of applicability of the BB procedure.
As mentioned above, for the partition sets M it is natural to use most simple polytopes, such as simplices and rectangles, and polyhedral cones.
3.1. Simplices
Suppose that D ⊆ ℝ^n is robust and convex. Furthermore, let M_0 and all partition elements be n-simplices.
Definition IV.9. Let M be an n-simplex with vertex set V(M) = {v^0, v^1,...,v^n}. Choose a point w ∈ M, w ∉ V(M), which is uniquely represented by
w = Σ_{i=0}^n λ_i v^i,  λ_i ≥ 0 (i = 0,...,n),  Σ_{i=0}^n λ_i = 1,   (3)
and for each i such that λ_i > 0 form the simplex M(i,w) obtained from M by replacing the vertex v^i by w, i.e., M(i,w) = conv {v^0,...,v^{i−1}, w, v^{i+1},...,v^n}.
This subdivision is called a radial subdivision (Fig. IV.4).
[Fig. IV.4: radial subdivision of a 2-simplex with vertices v^0, v^1, v^2 into M(1,w) and M(2,w).]
Radial subdivisions of simplices were introduced in Horst (1976) and subsequently used by many authors.
Proposition IV.1. The set of subsets M(i,w) that can be constructed from an n-simplex M by an arbitrary radial subdivision forms a partition of M into n-simplices.
Proof. First recall that if {v^0,...,v^n} is an affinely independent set of points and w is a point represented by (3), then {v^0,...,v^{i−1}, w, v^{i+1},...,v^n} is a set of affinely independent points whenever we have λ_i > 0 in (3). Thus, all the sets M(i,w) generated by a radial subdivision of an n-simplex M are n-simplices.
Let x ∈ M(i,w), i.e.,
x = Σ_{j=0, j≠i}^n μ_j v^j + μ_i w,  μ_j ≥ 0 (j = 0,...,n),  μ_i + Σ_{j=0, j≠i}^n μ_j = 1.   (4)
Inserting (3) yields
x = Σ_{j=0, j≠i}^n μ_j v^j + μ_i Σ_{k=0}^n λ_k v^k,  μ_j ≥ 0 (j = 0,...,n),  λ_k ≥ 0 (k = 0,...,n),   (5)
μ_i + Σ_{j=0, j≠i}^n μ_j = 1.
This can be expressed as
x = Σ_{j=0, j≠i}^n (μ_j + μ_i λ_j) v^j + μ_i λ_i v^i,   (6)
where all of the coefficients in (6) are greater than or equal to zero and, by (5), their sum equals (1 − μ_i) + μ_i(1 − λ_i) + μ_i λ_i = 1. Hence, x is a convex combination of the
vertices v^0,...,v^n of M, i.e., x ∈ M. Conversely, let x ∈ M, x ≠ w. Then x can be represented as
y = Σ_{i∈I_1} μ_i v^i,  μ_i ≥ 0 (i ∈ I_1),  Σ_{i∈I_1} μ_i = 1,  I_1 ⊆ {0,...,n}, |I_1| < n+1,   (7)
x = ᾱy + (1 − ᾱ)w,  0 < ᾱ ≤ 1,   (8)
and it follows that
x ∈ ∪_i M(i,w).   (9)
In (9), we cannot have λ_k > 0 or μ_j > 0. To see this, express in (9) w as the convex combination w = Σ_{i=0}^n λ_i v^i of v^0,...,v^n. Then in (9) x is expressed by two convex combinations of v^0,...,v^n whose coefficients have to coincide. Taking into account that λ_k, ᾱ_k ≠ 0, it is then easy to see that μ_j = λ_k = 0. The details are left to the reader.
•
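A direct Python transcription of Definition IV.9 (the function name and tolerances are ours); choosing w as the midpoint of a longest edge gives the bisection discussed below:

import numpy as np

def radial_subdivision(V, w, tol=1e-12):
    # V: list of the n+1 vertices of an n-simplex M; w: point of M, not a vertex
    A = np.column_stack([np.append(np.asarray(v, float), 1.0) for v in V])
    lam = np.linalg.solve(A, np.append(np.asarray(w, float), 1.0))  # barycentric coords, cf. (3)
    children = []
    for i, li in enumerate(lam):
        if li > tol:                               # form M(i,w) only for lambda_i > 0
            child = [np.asarray(v, float) for v in V]
            child[i] = np.asarray(w, float)
            children.append(child)
    return children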
In a BB procedure using radially subdivided simplices, we have to make sure that the subdivision process is exhaustive in the sense defined below.
Denote by δ(M) the diameter of M (measured by the Euclidean distance). For the simplex M, δ(M) is also the length of a longest edge of M.
The notion of exhaustiveness was introduced in Thoai and Tuy (1980) for a
similar splitting procedure for cones (see also Horst (1976)), and it was further in-
vestigated in Tuy, Katchaturov and Utkin (1987). Note that exhaustiveness, though
often intuitively dear, is usually not easy to prove for a given radial subdivision pro-
cedure. Moreover, some straightforward simplex splitting procedures are not ex-
haustive.
Example IV.4. Let vk (i=O, .. ,n) denote the vertices of a simplex M . Let
q q
n > 1, and in Definition IV.9 choose the barycenter of Mq, Le.,
1 n i
w=w =n+l. E vM . (10)
q 1=0 q
with w given by (10), and suppose that for all q, Mq+1 is obtained from Mq by re-
placing v~ by the barycenter w of M . Then, clearly, every simplex M contains
q q q q
the face conv {v~ ,... ,v~-I} of the initial simplex MI ' and thus M = ~ M has
1 1 q=l q
141
positive diameter.
and Utkin (1987). We present here only the most frequently used bisection, intro-
duced in Horst (1976), where, in Definition IV.lO, w is the midpoint of one of the
(11)
where [v~, v~] is a longest edge of M. In this case, M is obviously subdivided into
two n-simplices having equal volume. The exhaustiveness of any decreasing se-
quence of simplices produced by successive bisection follows from the following re-
sult.
(ii)
PIoof. Consider a sequence {Mq} such that Mq+ 1 is always obtained from Mq
by bisection. Let 5 (M q ) = 5q. It suffices to pIove (i) for q = 1. Color every vertex of
MI "black", color "white" every vertex of Mr with r > 1 which is not black. Let dr
denote the longest edge of Mr that is bisected. Let p be the smallest index such that
have p 5 n+1.
Let dp = [u,v] with u white. Then u is the midpoint of some dk with k < p. Let
dk = [a,b]. If a or b coincides with v, then obviously 5p = ~ 5k 5 ~ 51 and (i) holds.
142
that
2
2l1u-vll = IIv-a1i 2 + IIv-bll 21 2 21232
- "2""a-b l $ 26k -"2" 0k = 2" 0k '
Note that the notion of a radial subdivision can be defined similarly for any se-
to simplices.
cases the whole vertex set V(M) of a partition set M is needed to compute bounds
143
and "deletion by infeasibility" rules. Since an n-simplex has the least number of ver-
For some classes of problems, however, rectangles M = {x: a ~ x ~ b}, a,b E (Rn,
a< b, are a more natural choice. Note that M is uniquely deterrnined by its "lower
left" vertex a = (ap ... ,an ? and its "upper right" vertex b = (bp ... ,bn ? Each of
the 2n vertices of the rectangle M is of the form
a+c
Rectangular partition sets have been used to solve certain Lipschitzian optimiza-
tion problems (e.g., Strongin (1984), Pinter (1986, 1986 a, 1987), Horst (1987 and
1988), Horst and Tuy (1987), Neferdov (1987), Horst, Nast and Thoai (1995), cf.
Chapter XI).
Most naturally, rectangular sets are suitable if the functions involved in the prob-
lem are separable, i.e., the sum of n functions of one variable, since in this case ap-
propriate lower bounds are often readily available (e.g., Falk and Soland (1969), So-
land (1971), Falk (1972), Horst (1978), Kalantari and Rosen (1987), Pardalos and
Rosen (1987)). We will return to separable problems in severallater chapters.
Let M be an n-rectangle and let wEM, w ;. V(M), where V(M} again denotes the
vertex set of M. Then a radial subdivision of M using w (defined in the same way as
in the case of simplices) does not partition M into n-rectangles but rather into more
complicated sets.
144
For most algorithms, the subdivision must be exhaustive. An example is the bi-
section, where w is the midpoint of one of the longest edges of M, and M is sub-
divided into two n-rectangles having equal volume and such that w is avertex of
both new n-rectangles. It can be shown in a manner similar to Proposition IV.2,
Polyhedral cones are frequently used for concave minimization problems with
(possibly unbounded) robust convex feasible sets (e.g., Thoai and Tuy (1980), Tuy,
Thieu and Thai (1985), Horst, Thoai and Benson (1991)).
(n-1)-;;implices Fis exhaustive, then any nested sequence {Cq} of cones associated
to the subdivision of the corresponding sequence {Fq} of (n-l)-;;implices converge
to a ray from yo through the point x satisfying Fq (q-laJ) {X}.
----i
145
4. LOWER BOUNDS
ment of a partition of M' and suppose that we have ß (M) < ß (M'). Then let us
agree to use
Section 104.:
Let f be Lipschitzian on M, i.e., assume that there is a constant L > 0 such that
Finding a good upper bound A ~ L is, of course, difficult in general, but without
such abound branch and bound cannot be applied for Lipschitzian problems. On the
other hand, there are many problems, where A can readily be determined. Note that
146
in a branch and bound procedure IIlocal ll bounds for L on the partition sets M should
be used instead of global bounds on MO'
Moreover, suppose that the diameter 6(M) of M is known.
Recall that for a simplex M, o(M) is the length of the longest edge of M. For a
rectangle M = {x E !Rn: a ~ x ~ b}; a,b E !Rn, a < b, o(M) is the length of the diag-
onal [a,b] joining the IIlower left ll vertex a and the lIupper right ll vertex b (all in-
equalities are to be understood in the sense of the componentwise ordering in !Rn).
is a naturallower bound.
In (14), V'(M) can be replaced by any known subset of M.
If M = {x E !Rn: a ~ x ~ b} is a hyperrectangle, then
might be a better choice than (14). For more sophisticated bounds, see Chapter
XI.2.5.
Let M be a polytope. Then, for certain classes of objective functions, lower bounds
for inf f(M n D) or inf f(M) can be determined simply by minimizing a certain func-
tion related to f over the finite vertex set V(M) of M.
For example, if f is concave on M, then
and ß (M) = min f(V(P)) is, in general, a tighter bound than min f(V(M)).
(18)
A commonly used way of calculating lower bounds ß (M) for min f(D n M) or min
f(M) is by minimizing a suitable convex subfunctional of f over D n M or over M. A
convex lubfunctional of f on M ia a convex function that never exceeds f on M. A
convex subfunctional !p is said to be the convex envelope of f on M, if no other con-
vex subfunctional of f on M exceeds !p at any point x E M (i.e., !p is the pointwise su-
premum of all convex subfunctionals of f on M). Convex envelopes play an import-
ant role in optimization and have been discussed by many authors. We refer here
mainly to Falk (1969), Rockafellar (1970), Horst (1976, 1976a and 1979).
(iii) there is Ra function \If: M - t Dl satisJying (i), (ii) and !piz) < \If (i) for
some point i e M.
Thus we have !p(x) ~ \If(x) for all x E M and all convex subfunctionals -q, of f
onM.
Proof. Let XE argmin f(M). Then we must have rp(X) ~ f(X) (by Definition IV.ll
(ii)). But we cannot have rp(X) < f(X), since if this were the case, then the constant
function w(x) :: f(X) would satisfy (i), (ü) of Definition IV.ll; but the relation
w(X) > rp(X) would contradict (iii).
Lemma IV.I. Let M ( Oln be compact anti convex, and let l M - - I Ol be lower semi-
continuous on M. Then rp: M - - I Ol is the convex envelope off if and only if
or, equivalently
Proof. The proof is an immediate consequence of the definition of the convex huH
Note that epi (f) and conv (epi(f)) are closed sets, since fis lower semicontinuous
IR ,
~----------------------------------------- n
IR
Proof. Applying Caratheodory's Theorem to (20), we see that cp(x) can be ex-
pressed as
n+1 . n+1 n+1.
cp(x) = inf { E >'.f(xI ), E >.. = 1, E >..xI = Xj
i=l 1 i=l 1 i=l 1
Note that the maximum in (21) always exists, since fis lower semicontinuous and
M is compact. Hence, f* is defined for all t E IRn. If we replace the max operator in
(21) by the sup operator then the conjugate can be defined for arbitrary functions on
152
functions) .
The same operation can be performed on f* to yield a new convex function f**:
this is the s~alled second conjugate of f. The function f** turns out to be identical
to the convex envelope cp (cf., e.g., Falk (1969), Rockafellar (1970)). Our proof of the
following theorem follows Falk (1969).
Rence,
and
i.e., xO ( D(f**).
Note that it can actually be shown that D(f**) = M (cf., e.g., Falk (1969), Rocka-
fellar (1970)).
153
Since f** is convex: on M and, by (24), f**(x) ~ f(x) "Ix e M, it follows from the
definition of tp that f**(x) ~ tp(x) "Ix e M. Suppose that f**(xO) < tp(xO) for some
xO e M. Then we have (xO, f**(xO)) t epi (tp). Note that epi (tp) is a closed convex:
set. Thus, there is a hyperplane strictly separating the point (xO, f**(xO)) from
epi tp, Le., there is a vector (s,O') e IRn +! satisfying
(25)
(26)
or
(27)
But this is equivalent to f**(xO) > tp(xO), which we have seen above to be false.
If (27) holds, then, since tp(x) ~ f(x), it follows that
This contradicts the definition of f*, which implies that in (25) equality (rather than
Further useful results on convex envelopes have been obtained for special classes
of functions f over special polytopes. The following two results are due to Falk and
Theorem IV.6. Let M be a polytope with vertices v1,,,.,vk, and let f' M -I IR be
Proof. The function tp defined in (29) is convex. To see this, let 0 ~ ). ~ 1 and
x 1 ,x2 e M. Let Ql,Q2 solve (29) for x = xl and x = x2, respectively. Then
that
and suppose that tp(X) < 1JI(i) for some i e M. Let Ci solve (29) for x = i. Then we
have
155
which is a contradiction.
•
Theorem IV.7. Let M = conv {vO, .. ,vn} be an n-simplex with vertices vO, .. ,vn,
and let f: M ~ IR be concave on M. Then the convex envelope off on M is the affine
junction
l{J(x) = ax + 0, a E IR n, 0 E IR , (30)
Proof. (30') constitutes a system of (n+l) linear equations in the n+l unknown
n
a E IR ,oE IR.
Subtracting the first equation from all of the n remaining equations yields
The coefficient matrix V T whose columns are the vectors vi - vO (i=I, ... ,n) is non-
singular, since the vectors vi_v O (i=I, ... ,n) are linearly independent. Thus, a and the
Now suppose that there is another convex subfunctional lJt of f on M and a point
n . n
Then i = E /k,V1, E I'j = 1, I'j ~ 0 (i = O, ... ,n), and
i=O 1 i=O
which is a contradiction.
•
Note that Theorem IV.7 can also be derived from Theorem IV.6. Each point x of
an n-simplex M has a unique representation as a convex combination of the n+1
affinely independent vertices vO, ... ,v n. To see this, consider
n i n 0
x = E eH + (1- E a.)v , a· > 0 (i=l, ... ,n),
i=l 1 i=l 1 1-
Le.,
n i 0 0 0
x = E a.(v - v ) + v = Va + v , a ~ 0 ,
i=l 1
where a = (ap ... ,an ? and V is the nonsingular matrix with rows vi - vO
(i=l, ... ,n). It follows that a = V-1(x - vo) is uniquely determined by x.
Rence, by Theorem IV.6, we have
n .
rp(x) = E a.f(v1), (30")
i=O 1
n . n
where x = . E ai VI , • E a i = 1, ai ~ 0 (i=l, ... ,n) is the unique representation of x
1=0 1=0
in the barycentric coordinates aO, ... ,an of M. It is very easy to see that this function
coincides with (30).
It follows from (30") that, whenever the barycentric coordinates of Mare used,
the system of linear equations (30') does not need to be solved in order to determine
rp(x).
157
By Theorem IV.7, the constmction of convex envelopes is especially easy for con-
cave functions of one real variable f: [a,b] --t IR over an interval [a,b]. The graph of
the convex envelope !P then is simply the line segment passing through the points (a,
f(a» and (b, f(b».
r
Theorem. IV.8. Let M =.n Mi be the product 01 r compact nc dimensional rect-
1=1
r
angles M. {i=l, ... ,r} satisfying E n.; = n. Suppose that f: M --t IR can be decomposed
1 i=l •
r .
into the lorm I{z} = i~l Ilz'}, where li: Mi --t IR is lower semicontinuous on Mi
{i=l, ... ,r}. Then the convez envelope !P off on Misequal to the sum 01 the conVe:I:
envelopes !Pi olli on Mi ' i.e.,
r .
!p{z} = E !plzl}
i=l
. n.
Proof. We use Theorem IV.5. Let t l E IR 1 (i=l, ... ,r). Then we have
r . . . r .
f*(t) = maMx {xt - f(x)} = .E 1 ~a x {X1tl - fi(xl)} = .E 1 fi(t l ) .
xE 1= xl EM. 1=
1
r
rp(x) = f**(x) = sup {xt - f*(t)} = E ~up {xiti - fi(t i )}
tElRn i=l tlElRni
•
Theorem IV.8 is often used for separable fnnaions f, where r=n and the
n
M. = [a.,b.] are onEHiimensional intervals, i.e., we have f(x) = E f.(x1·),
1 1 1 i=l 1
~: [ai'bi] --t IR (i=l, ... ,n).
Note that Theorem IV.8 cannot be generalized to arbitrary sums of functions. For
example, the convex envelope of the function f(x) = x2 - x2 over the interval [0,1]
158
Theorem IV.9. Let f: M ---< IR be lower semicontinuous on the convex compact set
M ( IR n and let h: IR n ---< IR be an affine junction. Then
where 'Pfand 'Pf+h denote the convex envelopes off and (f+h) on M, respectively.
with the last inequality holding because the right-hand side is a convex subfunc-
Since the middle expression is a convex function, equality must hold in the second
inequality.
•
More on convex envelopes and attempts to determine 'P can be found in McCor-
mick (1976 and 1983). Convex envelopes of bilinear forms over rectangular sets are
negative definite quadratic forms over the parallelepipeds defined by the conjugate
directions of the quadratic form are derived in Kalantari and Rosen (1987). We will
4.4. Duality
where f, ~: IRn --I IR. Assume that ~ is convex (i=l, ... ,m), fis lower semicontinuous
which is the pointwise infimum of a collection of functions affine in u and hence con-
cave on the feasible region {u E IR:: d(u) > -m} of (32).
Let inf (P) and sup (D) denote the optimal value of (P) and (D), respectively.
Proof. Let u ~ 0 satisfy d(u) > -m, and let i E C satisfy g(i) S O. Then it follows
that
Note that Lemma IV.2 holds without the convexity and continuity assumptions
Ü of the dual problem could be a candidate for deriving the lower bounds
It is weH-known that for convex f we have inf (P) = sup (D) whenever a suitable
"constraint qualification" holds. The corresponding duality theory can be found in
many textbooks on optimization, cf. also Geoffrion (1971) for a thorough exposition.
For nonconvex f, however, a "duality gap" inf (P) - sup (D) > 0 has to be expected
and we would like to have an estimate of this duality gap (cf. Bazaraa (1973),
Aubin and Ekeland (1976)). A very easy development is as foHows (cf. Horst
(1980a)).
Let tp be the convex envelope of f on C. Replacing f by tp in the definition of
problems (P) and (D), we obtain two new problems, which we denote (F) and (D),
with d(u) being the objective function of (D). Obviously, since tp(x) ~ f(x) Vx E C,
one has d(u) ~ d(u) Vu E ~ , and we obtain the foHowing lemma as a trivial con-
sequence of the definition of tp.
Lemma IV.3. in! (P) ~ in! (P), sup (D) ~ sup (D).
Convex duality applies to (F) and (D) (cf., e.g., Geoffrion (1971)):
(I) There exists a point xOE C such that ~(xO) < 0 (i=l, ... ,m).
161
(I) depends only on the constraintsj hence its validity for (F) can be verified on
(P).
Combining the preceding lemmas we can easily obtain an estimate for the duality
gap inf (P) - sup (D).
Proof. The function tp exists and Lemmas IV.2, IV.3, IV.4 yield
By Lemma IV.~, M = C n {x: ~(x) ~ 0 (i=l, ... ,m)} is non-ilmpty and inf (F) is
finite. Hence, by (33), we have inf (P) I: ± m, and the first two inequalities in the
assertion are fulfilled.
By the definition of sup and inf, we have
su p {fex) - tp(x)}
xeM
~ sup {fex) - tp(x)}.
xEC •
162
The quantity sup {f(x) - «p(xH may be considered as a measure of the lack of
xeC
convexity off over C.
the assumptions of Theorem IV.lO, the constraint functions ~(x) are affine, then we
have sup (D) = sup (15), i.e., instead of minimizing I{J on M, we can solve the dual of
the original problem min {f(x): x e M} without calculating I{J (cf., e.g., Falk (1969)).
However, since (D) is usuallya difficult problem, until now this approach has been
applied only for some relatively simple problems (cf., e.g., Falk and Soland (1969),
Horst (1980a)).
Theorem IV.H. Suppose that the a&sumptions 01 Theorem IV.10 are falfilled and,
Proof. The last equation is Lemma IVA. To prove the first equation, we first
observe that
continuous everywhere and equals its convex envelope, so that we may apply
= mi n {f(x) + (ATu)x} - ub
xEC
= -f*(-A T u) - ub.
d(u) = -cp*(-ATU) - ub
T
= -max {x(-A u) - cp(x)} - ub
xEC
But this is the objective function of (TI), and we have in fact shown that the
objective functions of (D) and (TI) coincide, this clearly implies the assertion.
•
A related result on the convergence of branch and bound methods using duality
bound is given in Ben Tal et al. (1994).
We would like to mention that another very natural tool to provide lower (and
(1979 and 1980), Ratschek and Rokne (1984, 1988 and 1995)).
Some other bounding operations that are closely related to specific properties of
4.5. Consistency
In tbis section, we show that for many important c1asses of global optimization
problems, the examples discussed in the preceding sections yield consistent or
strongly consistent bounding operations whenever the subdivision is exhaustive and
the deletion of infeasible partition elements is certain in the limit (cf. Definitions
IV.6, IV.7, IV.S, IV.lO).
Recall that f denotes the objective function and D denotes the c10sed feasible set
of a given optimization problem. Let SM C M n D be the set introduced in Step k.4
of the prototype BB procedure, and recall that we suppose that er (M) = min f(SM)
is available whenever SM # •.
Lemma IV.5. Suppose tha.t f: MO ---I R is continuous a.nd the subdivision procedure
is ezha.ustive. Furthermore, a.ssume tha.t every infinite decrea.sing sequence {Mql 0f
succusivel1l refined pa.rtition elements sa.tisfiu • # SM CM n D.
q q
Then every strongl,l consistent lower bounding opera.tion 1Iields a. consistent bounding
opera.tion.
Suppose that MO and all of the partition sets M are polytopes. Recall that the BB
procedure requires that
165
Looking at the examples discussed in the preceding sections, we see that ß (M) is
always determined by a certain optimization procedure. A suitably defined function
,: M - - I I is minimized or maximized over a known subset T o{ M:
or
Examples.
Let { be Lipschitzian on MO' let A be an upper bound tor the Lipschitz constant of
where V'(M) is a nonempty subset of the vertex set V(M) (see also (14')).
Let f = f1 + 12 ' with f1 concave and f2 convex on an open set containing MO'
Then we have
'3(x) = f1(x) + 12(v*) + p*(x-v*), T = V(M), P(M) = min {'3(X): x e T}, (39)
Suppose that the convex envelope 'PM of f over M is available, and D is such that
min 'PM(D n M) can be calculated. Then we may set
T = M, if M is uncertain.
Now suppose that 'PM exists but is not explicitly available. Let D be a polytope.
Then, by Theorem IV.1O, P(M) in (40) can be obtained by solving the dual to
min {f(x): xe T}. Since, however, this dual problem is difficult to solve, this ap-
proach seems to be applicable only in special cases, e.g., if M is a rectangle and Dis
defined by a few separable constraints (cf., e.g., Horst (1980a)).
Let M c M' be an element of a partition of M'. Then it is easily seen that the
bounding operations (37), (39) do not necessarily fulfill the monotonicity require-
ment P(M) ~ P(M') in Step kA of the prototype BB procedure, whereas (38), (40)
yield monotonie bounds. Recall that in the case of nonmonotonic lower bounds we
agreed to use j'1(M):= max {P (M), P(M')} instead of P(M).
Since P(M) ~ j'1 (M) ~ f(x) Vx e M, obviously j'1 is strongly consistent whenever P
is strongly consistent.
167
as follows
Now consider the bounding methods in the above example, respectively using
(37), (38), (39), (40). In the first three cases, f is obviously continuous on MO'
Suppose that fis also continuous on MO in case 4 (convex envelopes, (40)).
Proposition IV.3. Suppose that at every step any undeleted partition element can
be further refined. Furthermore, suppose that the "deletion by infeasibility" rule is
certain in the limit and the sub division is exha'ILStive. Then each bounding operation in
the above example given by (97), (98), (99), (~.O), respectively, is strongly consistent.
which, by the continuity off and the exhaustiveness, implies that '1 ,q(iq) - I f(i).
q-iaJ
In the case i=3, note that v*q q:;: i E D since Mq q:;: {i}. Moreover, since
{~(x): x E MO} is compact (cf., e.g., Rockafellar (1970)), there exists a subse-
quence p*q' =-+
q-iaJ p*. Hence, we have
Now consider the case (40) (convex envelopes). By Theorem IVA, we have
min f(M q) = min IPM (Mq), and hence
q
---. {i},
From the assumptions we know that Mqq-im xE D. If D n Mq,f. 0 is known
for infinitely many q', then D n Mqq-im
,-r-I {i}. If M ,is uncertain for all but finitely
q
many q', then ß(Mq ,) = rnin f(M q ,). Finally, the continuity off and (46) impIy that
ß (Mq,) -r-I
q-im
f(i}, and hence we have strong consistency.
•
The following Corollary IV.5 is an immediate consequence of Corollary IV.3.
Corollary IV.5. In the BB procedure suppose that the lower bounding operation is
strongly consistent, and the selection is bound improving. Suppose that 1is continuous
on MO' Then every accumulation point 01 {;;k} solves problem (P).
Note that it is natural to require that all partition sets where ßk is attained are
refined in each step, since otherwise 1\+1 = 1\ holds and the Iower bound is not im-
proved.
5. DELETION BY INFEASIBILITY
In this section, following Horst (1988) certain rules are proposed for deleting in-
feasible partition sets M. These rules, properly incorporated into the branch and
bound concept, will lead to convergent algorithms (cf. Section IVA.5). Since the in-
feasibility of partition sets depends upon the specific type of the feasible set D, three
ments of convex sets (reverse convex programrning), feasible sets defined by Lip-
schitzian inequalities.
Again suppose that the partition sets M are convex polytopes defined by their vertex
sets V(M).
170
Deletion by Cenainty.
Clearly, whenever we have a procedure that, for each partition set M, can
Example IV.5. Let D:= {x E IRn: h(x) ~ O}, where h: IRn --I IR is convex. Then a
Because of the convexity of the polytope M and the convexity of the set {x E IRn:
Let
where g: IRn --I IR is convex, e.g., g(x) = sup {~(x): i E I} with gi: IRn --I IR convex,
i E I ( IN. Suppose that a point yO satisfying g(yO) < ° is known (the Slater
Let M be a partition set defined by its vertex set V(M). If there is a vertex v E
By convexity of D, we have
the hyperplane (s(z), x-z) = 0 supporting D at z, then M is deleted, Le., we have the
first deletion rule
ible. However, it is easy to see that infeasible sets M may exist that do not satisfy
(52).
Let
(53)
where
(55)
°
Assume that a point yO satisfying g(yO) < is known.
Recall that a typical feasible set arising from revene conves: programmiDg can be
described by (53), (54), and (55).
Let
(56)
(DR 2) Delete a partition set M if its vertex set V(M) satisfies either (DR 1)
applied to Dl (in place of D) or if there is a jE (l,... ,r) such that we ove
V(M) (Cj .
Again, it is easy to see that by (DR 2) only infeasible sets will be deleted, but
possibly not all infeasible sets. For arecent related rule, see Fülöp (1995).
Lipschiu Constraints.
Let
(57)
where gi Rn -IR are Lipschitzian on the partition set M with Lipschitz constants Lj
(j=I, ... ,m). Let 6(M) be the diameter of M, and let Aj be the upper bounds for Lj
(j=I, ... ,m). Furthermore, let V'(M) be an arbitrary nonempty subset of the vertex
set V(M). Then we propose
(DR 3) De1ete a partition element M whenever there is a j E (l, ... ,m) satisfying
Proposition IV.". Suppose that the subdivision is exhaustive. Then the "deletion by
infeasibility" mles (DR 1) - (DR 9) are certain in the limit.
Proof. a) Let D = {x: g(x) ~ O}, where g: IRn --I IR is convex. Suppose that we
have yO satisfying g(yO) < 0. Apply deletion rule (DR 1). Let {Mq} be a decreasing
sequence of partition sets and let pq E Mq \ D. Consider the line segment [yO, pq].
By exhaustiveness, there is an i satisfying pq - < i.
q-laJ
Suppose that we have i ~ D. Then by the convexity of gon IRn we have continuity
of g on IRn (e.g., Rockafellar (1970)), and IRn \ D is an open set. It follows that there
exists a ball B(i,e):= {x: IIx-ill ~ cl, e > 0, satisfying B(i, e) ( IRn\D, such that
Mq (B(i,c) Vq > ~ , where qo E IN is sufficiently large. Consider the sequence of
(59)
(60)
(61)
(1 - X) (i - Z) = -X(yO - Z) . (62)
174
Note that X > 0 since otherwise (60) would imply that i = z E an ( D which
contradicts the assumption that i;' D. Likewise, it follows that we have X < 1, since
z= yO is not possible.
) we have -(-
Obviously, by (62, ;;"\
s x-z, -Xs-( y0;;"\
=- -z" 0 < ->. < 1.
I-X
Hence, from (61) it follows that
(64)
Since pq' is an arbitrary point of Mq, and Mq, - - I {i}, we see that (64) also holds
for all vertices of Mq" q' sufficiently large. Hence, according to deletion rule (DR 1),
Mq, was deleted, and we contradict the assumptions. Thus, we must have i E D.
b) Let D = D1 n D2 ,where
Let Cf = {x E IRn : hj(x) < O}, j=I, ... ,r. Suppose that we have i ;. D2. Then there
is a jE {1, ... ,r} such that i E Cj , and in a manner similar to the first part of a)
above, we conc1ude that Mq ( Cj for sufficiently large q, since Cj is an open set. This
contradicts the deletion rule (DR 2).
175
e) Finally, let D = {x: glx) ~ 0 (j=l, ... ,m)}, where all gf IRn -i IR are Lipsehitzian
on MO. Suppose that the loeal overestimators AiMq) of the Lipsehitz eonstants
L.(M ) are known (j=l, ... ,m). Sinee the overestimator A.(M ) is an overestimator
J q J q
for LiMq') whenever Mq, ( Mq , we may assume that there is abound A satisfying
(65)
Apply deletion rule (DR 3), and suppose that we have xt D. Sinee Mq -i {X}, by
the eontinuity of gj (j=l, ... ,m) and by (65) it follows that for every sequenee of
nonempty sets V'(M q) ( V(M q ), we have
Sinee xt D, there is at least one j E {l, ... ,m} satisfying gj(X) > O. Taking into
aecount the boundedness of {AlM q ) }, the limit 5(M q) -i 0 and the eontinuity of
In this section the combination of outer approximation and branch and bound will
minimize f (x)
(P)
S.t. xED
minimize f(x)
s.t. xE D v
by a BB-Procedure.
Clearly, since in the basic outer approximation method, each of these problems
differs from the previous one by only a single additional constraint, when solving
(Qv+1) we would like to be able to take advantage of the solution of (Qv). In other
words, in order to handle efficiently the new constraints to be added to the current
feasible set, the algorithm selected for solving the relaxed problems in an outer ap-
proximation scheme should have the capability of being restarted at its previous
stage after each iteration of the relaxation scheme.
Indeed, suppose we are at Step k of the BB-Procedure for solving (Qv). Let
,k-1 denote the value of the objective at the best feasible point in Dv obtained so
<kv
far. Then certain pa.rtition sets M ma.y be deleted in Step k.1, since ß (M)
,k-1.
~ <kv
These sets, however, may not qualify for deletion when the BB procedure is applied
to solve a subproblem (Q~), ~ > v, since D~ ( D v and possibly <k~,k_1 > <kv,k-1 for
177
Let :Y be the family of sets D// admitted in the outer approximation procedure
(d. Chapter 11).
Choose D 1 E :Y such that D 1 J D. Set // = 1.
Apply the finite BB procedure of Section IV.l to problem (Q//) with the conditions
in Step k.4 replaced by
b) If ~ > f\ and f\ = f(z//) for some z// E D// \ D, then construct ~ constraint
l//(x) S 0 satisfying {x E D//: l//(x) S O} E :Y, l//(z//) > 0, l//(x) S 0 Vx E D
(d. Chapter 11), and let
(68)
Set v+- v+1 and go to Step k+1 of the BB procedure (applied from now on to
problem (Q//+1»'
178
unchanged).
Therefore, if it happens that Qk = ~ (which has to be the case for some k, if the
family :y is finite), then f(xk) = min f(D), i.e., xk solves (P).
Now suppose that the algorithm generates an infinite sequence. Then we have
problem (P) starting from the most recent partition and bounds obtained in solving
well, and hence Theorem IV.12 can also be verified by proving consistency of the
bounding operation and completeness of the selection.
Applications of the (RBB-R) algorithm will be presented in Part B.
PARTB
CONCAVE MINIMIZATION
Many applications lead to minimizing a concave function over a convex set (cf.
Chapter I). Moreover, it turns out that concave minimization techniqlles also play
an important role in other fields of global optimization.
Part B is devoted to a thorough study of methods for solving concave minim-
ization problems and some related problems having reverse convex constraints.
The methods for concave minimization fall into three main categories: cutting
methods, successive approximation methods, and successive partition methods. Al-
though most methods combine several different techniques, cutting planes play a
dominant role in cutting methods, relaxation and restrietion are the main aspects of
successive approximation, and branch and bound concepts usually serve as the
framework for successive partition.
Aside from general purpose methods, we also discuss decomposition approaches to
large scale problems and specialized methods adapted to problems with a particular
structure, such as quadratic problems, separable problems, bilinear programming,
complementarity problems and concave network problems.
CHAPTER V
CUTTING METHODS
In this chapter we discuss some basic cutting plane methods for concave minim-
ization. These include concavity cuts and related cuts, facial cuts, cut and split pro-
cedures and a discussion of how to generate deep cuts. The important special case of
concave quadratic objective functions is treated in some detail.
x ~ 0, (3)
concave function. For ease of exposition, in this chapter we shall assume that the
int D # 0, and that for any real number a the level set {x: IRn : f(x) ~ a} is bounded.
182
Note that the assumption that fex) is defined and finite throughout IRn is not ful-
filled in various applications. However, we can prove the following result on the ex-
D. 11 int D # 0 and IIVIlx)1I is bounded on the set 01 all x e D where Ilx) is diffe.,..
entiable (Vllx) denotes the gradient olloM at the point x), then 10 can be extended
to a concave fv,nction I: IR n - t IR.
ye C, consider the affine function hy(x) = fO(Y) + Vfo(Y)(x-y). Then the function
fex) = inf {hy(x): y e C} is concave on IRn (as the pointwise infimum of a family of
affine functions). Since by assumption IIVfO(y)1I is bounded on C, it is easily seen
that -GI < fex) < +1Il for an x e IRn. Moreover, for any x,y e C we have hy(x) ~ fO(x),
while hx(x) = fo(x). Since fex) = fo(x) for an x e C and C is dense in int D, con-
tinuity implies that fex) = fo(x) for an x e D. •
Note that if ~(x) is any other concave extension of fO' then for any x e IRn and
any y e C, we have ~(x) ~ hix), hence ~(x) ~ fex). Thus, fex) is the maximal exten-
sion of fO(x). Also, observe that the condition on boundedness of IIVfO(x)1I is fulfilled
if, for example, fO(x) is defined and finite on some open set containing D (cf. Rocka-
Let x O be the feasible solution with the least objective function value found so far
by some method. A fundamental question in solving our problem is how to check
whether xOis a global solution.
183
Clearly, by the very nature of global optimization problems, any criterion for glo-
bal optimality must be based on global information about the behaviour of the ob-
jective function on the whole feasible set. Standard nonlinear programming methods
use only local information, and hence cannot be expected to provide global
optimality criteria.
However, when a given problem has some particular structure, by exploiting this
structure, it is often possible to obtain useful sufficient conditions for global op-
timality. For the problem (BCP) formulated above, the structural property to be ex-
ploited can be expressed in either of the following forms:
I. The global minimum of f(x) over any polytope is always attained at some vertex
(extreme point) ofthe polytope (see Theorem 11).
Therefore, the problem is equivalent to minimizing f(x) over the vertex set of D.
H. For any polytope P with vertices u1,u2, ... ,us the number
min {f(u 1),/(u2), ... ,f(us)} is a lower bound for f(D n P).
Here the points ui might not belong to D. Thus, the values of f(x) outside D can
be used to obtain information on the values inside.
These observations underlie the main idea of the cutting method we are going to
present.
First of all, in view of Property I, we can assume that the point xO under consid-
eration is a vertex of D.
Definition V.I. Let 'Y = f(:P). For any x e IRn satisfying f(x) ~ 'Y the point
xO + O(x-xO) such that
From the concavity of f(x) and the boundedness of its upper level sets it is imme-
diate that 1 ~ °< +111.
Let y1,y2,,,.,ys denote the vertices of D adjacent to xo (s ~ n). We may assume
that
(5)
. °
0i'll'{yl-x ) ~ 1 (i=l,2,,,.,s) (6)
provides a ')'-valid cut for (f,D). In other words, we have the following information
on the values off(x) inside the polytope D.
Theorem V.I. (Sufficient condition for global optimality). Let '11' be a solution of
the system (6).
Then
'II'(:r;-i) > 1 for all :r; E D such that f(:r;) < 7. (7)
Hence, if
°
LP(x ,'II',D) max {'II'{x-xO): xE D} , (9)
185
where 1I{x-xO) ~ 1 is a valid cut for (f,D). If the optimal value of this program does
not exceed 1, then xO is a global optimal solution. Otherwise, we know that any feas-
ible point that is bett er than xO must be sought in the region D n {x: 1I{x-xO) ~ I}
left over by the cut.
Clearly, we are interested in using the deepest possible cut, or at least a cut which
is not dominated by any other one of the same kind. Such cuts correspond to the
basic solutions of (6), and can be obtained, for example, by solving the linear pro-
gram
s
min E
i=l 1
°
o·',,(i. -x) . °
s.t. 0i1l{yl-x) ~ 1 (i=1,2, ... ,s). (10)
.°
0.1I{yl-x ) = 1
1
(i=l, ... ,n). (11)
This yields
11" = eQ-1 , Q = (1 ° °
z -x , ... ,z2-x ,... ,zn-x0) , (12)
where e = (1,1, ... ,1) and zi is the -y-extension of the i-th vertex yi of D adjacent to
xO. The corresponding cut 1I{x - xO) ~ 1 is then the 7-valid concavity cut as defined
in Definition 1113.
In the general case, where degeneracy may occur, solving the linear program (10)
may be time consuming. Therefore, as pointed out in Section III.2, the most conveni-
ent approach is to transform the problem to the space of nonbasic variables relative
to the basic solution xO. More specifically, in the system (2), (3) let us introduce the
(ti' i E N) be the basic and nonbasic variables, respectively, relative to the basic
solution t O = (sO,xO), sO = b - AxO, that corresponds to the vertex xO of D. Then,
186
expressing the basic variables in terms of the nonbasic ones, we obtain from (2), (3)
a system of the form
(13)
b +- tg), we can thus assume that the original constraints (2), (3) have been given
such that xo = 0.
In the sequel, when the constraints have the form (2) (3) with the origin ° at a
vertex xo of D, we shall say that the BCP problem is in standard form with respect
to zOo Under these conditions, if 'Y< f(xo) (e.g., 'Y = f(xo)-c , e > °being the
tolerance), and
(17)
187
The above sufficient condition for global optimality suggests the following cutting
method for solving the problem (BCP).
Since the search for the global minimum can be restricted to the vertex set of D,
we first compute a vertex xO which is a Iocal minimizer of f(x) over D. Such a vertex
can be found, e.g., as follows: starting from an arbitrary vertex vO, pivot from vO to
a better vertex vI adjacent to vO, then pivot from vI to a better vertex v2 adjacent
to vI, and so on, until a vertex vn = xO is obtained which is not worse than any
vertex adjacent to it. From the concavity of f(x) it immediately follows that
f(xO) ~ f(x) for any x in the convex hull of xO and the adjacent vertices; hence xO is
(18)
and solve the linear program LP(xO,1!"°,D). If the optimal value of this linear
program does not exceed 1, then by Theorem V.l, xO is a global minimizer and we
stop. Otherwise, let wO be a basic optimal solution of L(xO,1!"°,D), and consider the
residual polytope left over by the cut (18), i.e., the polytope
(19)
By Theorem V.l, any feasible solution better than xO must be sought only in Dl .
f(x) over Dl (then f(x l ) ~ f(wO)). It may happen that f(x l ) < 7. Then, by Theorem
V.l, xl must satisfy (18) as a strict inequality; hence, since it is a vertex of Dl' it
must also be a vertex of D. In that case the same procedure as before can be
More often, however, we have f(x I ) ~ 1. In this ease, the procedure is repeated
with xO +- xl, D +- DI , while 1 is unehanged.
In this manner, after one iteration we either obtain a better vertex of D, or at
least reduee the polytope that remains to be explored. Sinee the number of vertiees
of D is finite, the first situation ean oeeur only finitely often. Therefore, if we ean
also ensure the finiteness of the number of oeeurrenees of the second situation, the
method will terminate sueeessfully after finitely many iterations (Fig. V.I).
Fig. V.I
Algorithm V.I.
Initialization:
Theorem V.2. [f the sequence {·i} is bounded, then the above cutting algorithm is
finite.
of one cycle and the beginning of a new one. As noticed above, in view of the
inequality f(xk +1) < 7 in Step 4), xk+ 1 satisfies al1 the previous cuts as strict
inequalities; hence, since xk+ 1 is a vertex of Dk+ 1 ' it must be a vertex of D,
distinct from al1 the vertices of D previously encountered. Since the vertex set of D
is finite, it follows that the number of occurrences of Step 4), Le., the number of
cycles, is finite.
Now during each cycle a sequence of cuts ~(x):= ,t(x-xk )-1 ~ °is generated
such that
Since the sequence {'/rk} is bounded, we conclude from Corollary III.2 that each cycle
have II'/rkll ~ C for some constant C. Note that 1111'/rk ll is the distance from xk to the
hyperplane '/rk(x_xk ) = 1, so these distances (which measure the depth of the cuts)
190
must be bounded away from 0. Though there is some freedom in the choice of 7rk
(condition (6)), it is generaIly very difficult to enforce the boundedness of this se-
quence. In the sections that follow we shaIl discuss various methods for overcoming
An advantage of concavity cuts as developed above is that they are easy to con-
cuts, when used alone in a pure cutting algorithm, often tend to become shaIlower
and shaIlower as the algorithm proceeds, thus making the convergence very difficult
more expensive but have the advantage that they guarantee finiteness and can be
suitably combined with concavity cuts to produce reasonably practical finite algo-
rithms.
A problem closely related to the concave minimization problem (BCP) is the fol-
lowing:
If we know some efficient procedure for solving this problem, then the concave
Start !rom a vertex xo of D which is a loeal mini mi zer. At step k = O,l, ... ,let 'Yk
be the best feasible value of the objective function known so far, i.e., 'Yk =
min {f(xO), .... ,f(xk )}. At xk eonstruct a 'Yk-valid cut 7rk (x - xk ) ~ 1 for (f,D) and let
191
Since each cut eliminates at least one vertex of D, the above procedure is obvious-
ly finite.
Despite its attractiveness, this procedure cannot be implemented. In fact, the vertex
problem is a very difficult one, and up to now there has been no reasonably efficient
method developed to solve it. Therefore, following Majthay and Whinston (1974),
we replace the vertex problem by an easier one.
to a polyhedron M if
0f.FnMcriF. (20)
For example, in Fig. V.2, page 186, F l' F 2 and the vertex x are extreme faces of
the polytope D relative to the polyhedral cone M. Since the relative interior of a
point coincides with the point itself, any vertex of D lying in M is an extreme
O-dimensional face of D relative to M.
Extreme face problem. Given two polyhedra D and M, find an extreme face of D
Actually, as it will be seen shortly, this problem can be treated by linear pro-
gramrning methods.
192
A cutting scheme that uses extreme faces can be realized in the following way: at
each step k find an extreme face of D relative to Mk and construct a cut eliminating
this extreme face without eliminating any possible candidate for an optimal solution.
Since the number of faces of D is finite, this procedure will terminate after finitely
many steps.
Fig. V.2
Assume now that the constraints defining the polytope D have been given in the
canonical form
X.
1
= p.10 - E p..x. (i e B)
jeJ IJ J
(21)
where B is the index set of basic variables (I B I = m) and J is the index set of
The following consideration makes use of the fact that a face of D is described in
For any x E !Rn let Z(x) = {j: xj = O}. Then we have the following characteristic
property of an extreme face of D relative to M.
°
Proof. Obviously, F is a face of D containing xO, so FOn M # 0.
°
If F is an extreme face of D relative to M, then for any x E FOn M we must have
x E ri F O' hence xi > ° for any i E {1,2, ... ,n} \ Z(xO) (since the linear function
°
x - xi which is nonnegative on F can vanish at a relative interior point of F only °
if it vanishes at every point of F 0)' In view of the compactness of FOn M, this
implies that °< min {xi: x E FO n M} for any i E {1,2, ... ,n}\Z(xO).
Conversely, if the latter condition is satisfied, then for any x E FOn M we must
have xi> ° for all i; Z(xO), hence xE ri F O' Le., F O n M ( ri F O ' and F O is an
extreme face.
•
Note that the above proposition is equivalent to saying that a face F O of D that
meets M is an extreme face of D relative to M if and only if Z(x) = Z(x') for any
x,x' E F On M.
On the basis of this proposition, we can solve the extreme face problem by the
n
E c..x. > d. (i=n+1, ... ,n) .
j=l IJ r 1
194
Introducing the slack variables xi (i=n+1, ... ,ii) and using (21), (22) we can de-
scribe the polytope D n M by a canonical system of the form
X.
1
= p.10 - jei
E p..x.
IJ J
(i E B) , (23)
X~ = ° (j E j, H n).
If j ( Z(xO):= {j E {1,2, ... ,n}: X~ = O}, i.e., if all nonbasic variables are original,
then xO is avertex of D, since in this case the system (23), (24) restricted to i ~ n
gives a canonical representation of D.
In the general case, let
Proc:edure I.
Let ek be the optimal value and Je be a basic optimal solution 01 {PIIi. Set k -- k+l
and repeat the procedure until k = s.
195
Proof. Ifi E {1,2, ... ,n} \ Z(xs), then i = ik for some ik t Z(xs), hence k > 0, i.e., e
° °
< min {xi: x E D n M, xj = Vj E Z(xs)}. Therefore, F is an extreme face by
Proposition V.2 provided Z(xs) # 0. Moreover, if Z(xs) = 0, then °< min {xi:
xe D n M}, from which it easily follows that the only extreme face is D itself. •
where the asterisk means that an nonbasic original variables in (23), (24) should be
omitted.
hence Pi,j ~ °for an j E j \ {1,2, ... ,n}. If Pij > °for at least one j E j \ {1,2,... ,n},
then by pivoting on this element (i,j) we will force ~ out of the basis. On the other
D
°
hand, if p.. = for an je j \ {1,2, ... ,n}, this means that x. depends only on the vari-
1
ables X j ' j E j n {1,2, ... ,n}. Repeating the same operation for each i E Z(xO) n B, we
will transform the tableau (23) into one where the only variables xi' i e Z(xO) which
196
remain basic are those which depend only on the nonbasic original variables. Then
and only then we start the simplex procedure for minimizing Xi with this tableau,
1
where we omit, along with the nonbasic original variables (Le., all the columns
j ~ n), also all the basic original variables Xi with i E Z(xO) (Le., all the rows
i E Z(xO)).
(ii) The set Z(xk) is equal to the index set Zk of nonbasic original variables in
the optimal tableau of (P k) plus the indices of all of the basic original variables
which are at level zero in this tableau. In particular, this implies that F = {x E D:
x. = ° Vj E Z }, and therefore that F is a vertex of D if and only if IZs I = n-m. We
J 8
can thu8 8top the extreme face finding proces8 when k = s or IZk I = n-m.
Now consider the case where F is a proper face but not avertex of D
Definition V.3. Let F be a proper face but not avertex of D. A linear inequality
l (x)~O is a facial cut if it eliminates F without eliminating any vertex of D lying in M.
Let Qj (j E Z) be prechosen positive numbers and for each h E {1,2,... ,n} \ Z con-
sider the parametric linear program
Proposition V.4. Let 0< IZI < n-m. Ifp < +m, then the inequality
E Q.z.~ p (29)
jeZ J J
Remark V.2. For each h the value qh = aup {q: 0 < optimal value in (Ph(q))}
can be computed by parametric linear programming methods.
Of course, the construction of a facial cut is computationally rather expensive,
even though it only involves aolving linear programs. However, such a cut eliminates
an entire face of D, i.e., all the vertices of D in this face, and this is sometimes worth
the cost. Moreover, if all we need is a valid (even shallow) cut, it is enough for each
198
°
h to choose any q = qh > for which the optimal value in (Ph(q)) is positive.
Since a facial cut eliminates a face of D, the maximal number of facial valid cuts
cannot exceed the total number of distinct faces of D. Therefore, a cutting procedure
in which facial valid cuts are used each time after finitely many steps must be finite.
The following modification of Algorithm V.l is based upon this observation.
Algorithm V.2.
Initialization:
If Fk is a proper face but not a vertex of D, construct a facial valid cut 4c(x) ~ 0.
If this cut is infinite, then stop: xO is a global optimal solution; otherwise, set
1
dk = +m, 6k = "Nök-l and go to 3).
199
to obtain a basic optimal solution wk of this problem. If lk( uf) ~ 0, then stop: xo
is a global optimal solution. Otherwise, go to 3).
3) Let Dk + l = Dk n {x: ~(x) ~ o}. Find a vertex x k+1 of Dk+1 which is a local
minimizer of f(x) over Dk + l . If f(x k + l ) ~ i, go to iteration k+l. Otherwise, go
to 4).
Proof. Like Algorithm V.l, the above procedure consists of a number of cydes of
iterations, each of which results in a vertex of D (the point xk+ l in Step 4) which is
better than the incumbent one. Therefore, it suffices to show that each cyde is
finite. But within a given cyde, since the number of facial cuts is finite, there must
exist a kO such that Step la) occurs in all iterations k ~ kO· Then d k ~ Dk = Dk for
distance from xk- l to the cutting hyperplane (when a i-valid cut has been applied
in iteration k-l), this means that a facial cut is introduced if the previous cut was a
'"(-valid cut that was too shallow; moreover, in that case Dk is decreased to kDk_ l '
so that a facial cut will have less chance of being used again in subsequent iterations.
Roughly speaking, the (D,N) device allows facial cuts to intervene from time to time
in order to prevent the cutting process from jamming, while keeping the frequency of
these expensive cuts to a low level.
200
Of course, the choice of the parameters 60 and N is up to the user, and N may
vary with k. If 60 is close to zero, then the procedure will degenerate into Algorithm
V.l; if 60is large, then the procedure will emphasize on facial cuts.
every N cuts, where N is some natural number. However, this method might end up
subject to -2xl + ~ ~ 1 ,
x2 ~ 2 ,
xl+~~4,
Xl ~ 3 ,
0.5xl-~ ~ 1 ,
Xl ~ 0, ~ ~ O.
Suppose that in each cycle of iterations we decide to introduce a facial cut after
every two concavity cuts. Then, starting from the vertex xO = (0,0) (a local minim-
izer with f(xO) = -1.8), after two cycles of 1 and 3 iterations, respectively, we find
the global minimizer x* = (3,1) with f(x*) = -3.4.
Cycle 1.
Iteration 2: X
2 = (2.856,0.43).
Since the facial cut is infinite, the incumbent x* = (3,1) is the global optimizer (Fig.
V.3).
Fig. V.3
A pure cutting plane algorithm for the BCP problem can be made convergent in
two different ways: either by introducing special (usuaIly expensive) cuts from time
to time, for example, facial cuts, that will eliminate a face of the polytope D; or by
the use of deep cuts that at one stoke eliminates a sufficiently "thick" portion of D.
A drawback of concavity cuts is that, when used repeatedly, these cuts tend to
strengthen these cuts: later, in Section VA, we shaIl examine how this can be done
in certain circumstances.
202
In the general case, a procedure which has proven to be rather efficient for making
Let us first describe a construction which will frequently be used in this and
subsequent chapters.
To simplify the language, in the sequel, unless otherwise stated, a cone always
means a convex polyhedral cone vertexed at the origin 0 and generated by n linearly
Ki = con(Qi)' we then have the following fact whose pIoof is analogous to that of
Proposition IV.l:
(int K i ) n (int K j ) = 0 (j # i) j
K = U{Ki : i E I} .
to u.
203
In actual computations, we work with the matrices rather than with the cones.
cone partitioning. Thus, we shall say that the matrices Qi (i EI), with Qi = (zi, ... ,
the partitions of a matrix Q with respect to >.u for different>. > 0, lead to different
matrices, although the corresponding partitions of the cone K = con(Q) are the
same.
Let us start with a vertex xO of D which is a local mini mi zer of f(x) over D. Set
, = f(xO), 0: = ,-c. Without loss of generality we may always assume that the
problem is in standard form with respect to xO (cf. Section V.l), so that xO = 0 and
G = {z: f(z) =
0:
0:, f(>'z) < 0: V>. > I} .
For each i=I,2, ... ,n let zi be the point where Go: meets the positive xi-ms.
and let w = w(Q) be a basic optimal solution and # = #(Q) be the optimal value of
thislinear program (i.e., #(Q) =eQ-Iw).
If # ~ I, then xO is a global e-optimizer. On the other hand, if f(w) < a, then we
can find a vertex xl of D such that f(x l ) ~ f(w) < a. Replacing the polytope D by
D n {x: eQ-Ix ~ I}, we can rest art from xl instead of xO. So the case that remains
to be examined is when # > I, but f(w) ~ a (Fig. V.4).
As we already know, a feasible point x with fex) < -y-e should be sought only in
the residual polytope D n {x: eQ-Ix ~ I} left over by the cut. In order to cut off
more of the unwanted portion of D, instead of repeating for this residual polytope
what was done for D (as in Algorithm V.I) we now construct the a-ilXtension Wof w
and split the cone K = con(Q) with respect to w(Fig. VA).
Let Qi ' i e I, be the corresponding partition of the matrix Q:
Now note that for each subcone Ki = con(Qi) the cut through zl,,,.,zi-l,w,
i+1,,,.,zn does not eliminate any point x of Ki with fex) < -y-e. (This can be seen in
the same way that one sees that the cut through zli,,,.,zn does not eliminate any
point x e K with fex) < -y-e.) Therefore, to check whether there is a feasible point x
with fex) < -y-e in any subcone Ki' for each i e I we solve the linear program
max eQi x
-1 -1
s.t. x e D, Qi x ~ °.
205
Note that the constraint Qjlx ~ 0 simply expresses the condition that x E Ki =
con(Qi) (see (30)). If all of these linear programs have optimal values ~ 1, this
indicates that no point x E D in any cone Ki has fex) < 7-t:, and hence, that x O is a
global c:-optimal solution. Otherwise, each subcone K i for which the linear program
LP(Qi'D) has an optimal value > 1 can be furt her explored by the same splitting
method that was used for the cone K.
f(x)=O - (
Fig. V.4
Algorithm V.3.
Select c: ~ O.
Initialization:
Phase I.
over M.
Phase 11.
0) Let 7 = f(xo), a = -y-e. Rewrite the problem in standard form with respect to
xO. Construct QO = (zOl,z02, ... ,zOn), where zOi is the intersection of Ga with the
i-th edge of K o ' Set .Jt = {Qo}'
2) Let .9t = {Q E .Jt: ",(Q) > I}. H .9t = 0, then stop: xO is a global c-Qptimal
solution. Otherwise, go to 3).
3) For each Q E .9t construct the a-extension w(Q) of w (Q) and split Q with
respect to w(Q). Replace Q by the resulting partition and let .Jt' be the resulting
collection of matrices. Set .Jt I - .Jt' and return to 1).
Phase I" indicates the passage to a new cyde. Within a cyde the polytope M that
remains to be explored and the incumbent xO do not change, but from one cyde to
207
-1
eQO x ~ 1 ,
while xO changes to a better vertex of D. (Note that QO = (zOl,z02 ... ,zOn) is the
matrix formed in Step 0 of Phase II, and hence determines an a-valid cut for (f,M)
at xO; the subsequent cuts eQÖ1x ~ 1 in this cycle cannot be used to reduce M,
because they are not a-valid for (f,M) at xO.) Under these conditions, it is readily
seen that at each stage the current xO satisfies every previous cut as a strict
inequality. Therefore, since xO is a vertex of M, it must also be a vertex of D.
Moreover, since the vertex set of D is finite, the number of cycles of iterations must
also be finite.
°
(ii) The value of c: ~ is selected by the user. If c: is large, then few iterations will
be needed but the accuracy of the solution will be poor; on the other hand, if c: is
small, the accuracy will be high but many iterations will be required. Also, since the
minimum of f(x) is achieved at least at one vertex, if c: is smaller than the difference
between the values of f at the best and the second best vertex, then a vertex xO
which is globally c:--optimal will actually be an exact global optimal solution. There-
fore, for c: small enough, the solution given by Algorithm V.3 is an exact global
optimizer.
(iii) The linear programs LP(Q,M) can be given in a more convenient form which
does not require computing the inverse matrices Q-1. Indeed, since the problem is in
standard form with respect to xO, the initial cone KO in Phase II is the nonnegative
orthant. Then any cone K = con(zl,z2, ... ,zn) generated in Phase II is a subcone of
KO ' and the constraints Q-1x ~ °(Le., xe K) imply that x ~ 0. Therefore, if
M = D n {x: Cx ~ d}, where Cx S dis the system formed by the previous cuts, then
-1
Ax ~ b, Cx ~ d, Q x ~ °.
208
Thus, in terms of the variables (A 1,A 2,... ,An) = Q-1x, this linear program can be
written as
n
max E AJ'
j=l
LP(Q,M)
(iv) For ease of exposition, we assumed that the function fex) has bounded level
sets. If this is not the case, we cannot represent a cone by a matrix Q = (zl,,,.,zn),
where each i is the intersection of the i-th edge with G Q (because this intersection
may not exist). Therefore, to each cone we associate a matrix Q = (zl,,,.,zn), where
each zi is the intersection point of the i-th edge with GQ if this intersection exists,
or the direction of the i-th edge otherwise. Then, in the linear subproblem LP(Q,M)
the vector e should be replaced by a column vector with its i-th component equal to
1 if zi is a point or equal to 0 if zi is a direction. If 1= {i: zi is a point}, this
subproblem can similarly be written as
209
max E >..
i eI 1
n . n .
s.t. E >'.(Az 1 ) ~ b, E >'.(Cz1) ~ d ,
i=l 1 i=l 1
With these modifications, Algorithm V.3 still works for an arbitrary concave
function f: IRn - I IR.
(v) An algorithm which is very similar to Algorithm V.3, but with c = 0 and the
condition Qi1x ~ 0 omitted in LP(Qi'M), was first given in Tuy (1964). This
algorithm was later shown to involve cycling (cf. Zwart (1973)). Zwart then
developed a modified algorithm by explicitly incorporating the condition Qi1x ~ 0
While Zwart's algorithm for c > 0 is computationally finite, it may give an in-
correct solution, which is not even an c-approximate solution in the sense of Zwart
ate anti-jamming device. The resulting algorithm, called the Normal Conical AIgo-
Let us also mention two other modifications of Tuy's original algorithm: one by
Bali (1973), which is only slightly different from Zwart's algorithm (for c = 0), the
other by Gallo and Ülkücü (1977), in the application to the bilinear programming
problem. The algorithm of Gallo and Ülkücü has been proved to cycle in an example
210
of Vaish and Shetty (cf. Vaish and Shetty (1976 and 1977». The convergence of
The only result in this regard that has been established is stated in the following
Proposition V.5. Let Qi = (zi1}2, ... }n), i=1,2, ... , be the matrices generated in
aPhase II of Algorithm V.S, where the indez system is such that i < j if Qi is
generated before Qj' Suppose that e = O. If eQ-/zik ~ 1 (k=1,2, ... ,n) for any i < j,
then Phase II is finite, unless zO is already aglobai optimizer.
better than xo. Denote wi = W(Qj)' Lj = {x: eQj1x ~ I}. By construction, we have
By Lemma 111.2, this implies that d(cJ,Li ) -t 0 as i - t CD. Now for any i, x* must
belong to some cone K j = con(Qj) with j > i, and from the definition of J = w (Qj)'
we have d(x*,Lj ) ~ d(Wi,L j ). Therefore, d(x*,L j ) -+ 0 as j -+ CD.
On the other hand, by hypothesis, the halfspace Lj entirely contains the polytope
spanned by 0, zOl,z02, ... ,zOn and ~P = w(QO)' Hence, the distance from xO = °to
the hyperplane Hj = {x: eQj1x = I}, i.e., 1/ IIQj111 , must be bounded below by some
positive constant 6.
Let ~ denote the intersection of Hj with the halfline from °through x*, and note
that IIx*-~llId(x*,Lj) = 1I~II/d(O,Hj)' which implies that IIx*-~1I ~ II~II 6d
(x*,L j) -+ 0. But since ~ belongs to the simplex [~1,~2, ... ,zjn] with f(~k) = 7
(j=1,2, ... ,n), we must have f(~) ~ 7. This contradicts the assumption that f(x*) < 7,
~ -+ 0 as j -+ CD.
since we have just established that x* -
•
211
For every cone K = con(Q) generated by Algorithm V.3, let ß(K) = K n {x:
eQ-1x $ I}. Then the convergence condition stated in the above proposition is
equivalent to requiring that at the completion of any iteration in Phase II, the union
of all ß(K) corresponding to all of the cones K that have been generated so far (in
the current Phase II) is a polytope. In the next chapter, we shall present an
FUNCTIONALS
The basic construction in a cutting method for solving (BCP) is for a given
feasible value , (the current best value) of the objective function f(x) to define a
cutting plane which will delete as large as possible a sub set of D n {x: f(x) ~ ,}.
Therefore, although shallow cuts which can delete some basic constituent of the
feasible set (such as a face) may sometimes be useful, we are more often interested in
er $ f(xO). As usual, we may ass urne that the problem is in standard form' with
respect to xO and that (14) holds. So xO = 0, D ( IR~ and
where ei is the i-th unit vector. We already know that an a-valid cut for xO is
(31)
Our aim is to develop Q-valid cuts at xO which may be stronger than the con-
cavity cut.
(iii) F(u,v) is concave in u for every fixed v and affine in v for every
fixed u. (34)
Ll(t) = {x e D: Ex/ti ~ I} ,
M(t) = {x e D: Ex/ti ~ I} ,
(where we set 4>t(x) = --w ifF(x,v) is not bounded from below on M(t).
Proof. The concavity of the function in (35) follows from assumption (iii). Since,
by assumption (ii), F(x,v) ~ min {f(x), f(v)} ~ min f(M(t)) for any x,v e M(t), we
have:
Ez.jt.>l,
I 1-
(38)
(39)
Ez.je*,
I
>1
I -
(40)
Proof. The inequality t* ~ t follows from the definition of t*. Then ~(t*) nM(t)
is contained in a polytope with vertices ui = tiei , vi = ttei (i=1,2, ... ,n). Since we
have tt(ui) ~ Q, tt(vi ) ~ Q (i=1,2, ... ,n), by (39) and the definition of t*, it follows
from the concavity of tt(x) that tt(x) ~ Q Vx e ~(t*) n M(t).
Therefore,
f(x) ~ Q Vx e ~(t) .
Referring to the concavity cut (31), we obtain the following version of The-
orem VA.
Corollary V.I. Let °= (Ol,02'''''0nJ define the concavity cut (91). For each
i = l,2,,,.,n let
(43)
is an a-valid cut.
Le., if we have
then under the conditions in the above corollary we will have t i > 8i' provided that
0iei t D, because then yi f. 0iei and consequently 1JI0(Oiei) > min {f(Oiei),f(yi)} ~ a.
This means that the cut (43) will generally be strictly deeper than the concavity cut
(Fig. V.5).
Moreover, note that for the purpose of solving the problem (Bep), a cut con-
struction is only a means to move towards a better feasible solution than the in-
f(yi) < a for some i, then the goal of the cut is achieved, because the incumbent has
improved.
215
o f(x)=CX
Öl,------"~_____,;.:_4_~----------
X .....
-......... . ... -_ .......-
Fig. V.5
Iteration 1:
Compute the points yi (i=1,2, ... ,n) according to (42) (this requires solving a
linear program for each i, since F(.,v) is affine in v).
Otherwise, compute the values t i (i=1,2, ... ,n) according to (41). Then by
Corollary V.1, t ~ e, so that the cut defined by t is deeper than the concavity cut
provided t f. e(which is the case if (44) holds).
216
Iteration k > 1:
Let (38) be the current cut. Compute ti = max {T: tt( rei ) ~ a} (i=I,2, ... ,n).
Since t ~ 0, we have M(t) (M(O), hence ~t(tiei) ~ ~O(tiei) ~ a (i=I,2, ... ,n).
Therefore, by Theorem VA, t* ~ t and the cut defined by t* is deeper than the
Thus, if condition (44) holds, then the above iterative procedure either leads to an
o~ t ~ t* ~ t** ~ ... , which define an increasing sequence of valid cuts (the procedure
stops when successive cuts do not differ substantially).
Of course, this scheme requires the availability of a function F(u,v) with prop-
In the next section we shall see that this is the case if f(x) is quadratic. For the
Proposition V.7. If a junction F(u, v) satis!ies (ii) and (iii), then the junction
f(x) = F(x,x) is quasiconcave.
Proof. For any x,y e IRn and >',1' ~ 0 such that >'+1' = 1, we have
lems.
Another important issue when implementing the above scheme is how to compute
the values t i according to (41) (or the values ti in Theorem VA). For this we
217
(45)
(iii). Since the optimal value ~ 0 (rei ) of this linear program is concave in T, by
Proposition V.6, and since 4> 0 (Oiei) ~ a, it follows that t i is the value such that
~ 0 (rei ) ~ a for all TE [Oi,ti], while ~ 0 (rei ) < a for all T> ti' Therefore, this value
can be computed by parametric linear programming techniques.
More specifically, let vi be a basic optimal solution of (45) for T = 0i' Then the
reduced costs (in linear programming terminology) for this optimal solution must be
~ O. Clearly, these reduced costs are concave functions of the parameter T, and one
can determine the maximal value Ti such that these reduced costs are still ~ 0 for
0i $ T $ Ti' If ~ 0 (Tiei) = F( Tiei ,vi) < a, then t i is equal to the unique value
TE [0i'Ti] satisfying F(rei,vi ) = a. Otherwise, F(Tiei,vi) ~ a, in which case at least
one of these reduced costs will become negative for T> Ti' and by pivoting we can
pass to a basic solution v2 which is optimal for T> Ti sufficiently elose to Ti'
The procedure is then repeated, with Ti and v2 instead of 0 and vi. (For the sake
The cut improvement technique presented above was first developed in a more
specialized form by Konno (1976a) for solving the concave quadratic programming
problem (CQP). Results similar to those of Konno have also been obtained by Balas
and Burdet (1973) using the theory of generalized outer polars (the corresponding
f(x) over D. Writing the problem in standard form with respeet to xo (see Seetion
V.Ll), we have
S.t. Ax ~ b , (47)
x~o, (48)
Furthermore, p ~ °beeause xO = °
is a loeal minimizer of f(x) over D (if Pi < °
for some i we would have f(>.ei ) = 2>'Pi - >'\i °
< for all sufficiently small >. > 0).
Proposition V.S. The bilinear fu,nction (49) satisfies the conditions (92), (99),
(9..(.) for the concave quadratic fu,nction (..(.6). If the matrix Cis positive definite, then
Proof. Clearly, f(x) = F(x,x), and F(u,v) is affine in u for fixed v and affine in v
for fixed u. For any u, v we ean write
If C is positive definite, then (v-u)(C(v-u)) > °for u # v and one of the two
differences must be positive, proving (50).
•
It follows from this proposition that the cut improvement scheme can be applied
to the problem (CQP). We now show that this scheme can be implemented.
Proposition V.9. The quantity 0i is the larger root ofthe quadratic equation:
(52)
ProoI. . Since er < f(xO) = 0, Pi ~ °and cii > 0, it is easily seen that the equation
(52) always has a positive root which is also the larger one. Obviously, f (Oiei) = er,
and the proposition follows.
•
Next, the points yi in Corollary V.1 are obtained by solving the linear programs:
(53)
If f(yi) ~ er (i=1,2, ... ,n), then, according to Corollary V.1, a stronger cut than the
ti °
= max {r : ~ (rei) ~ er} (i=1,2, ... ,n).
220
Proposition V.lO. The value ti is the optimal value ofthe linear program
(*) minimize pz - az O
Proof. Since F( rei,v) = TPi + pv - r ~ CijVj , the optimal value of (53) is equal to
Since the constraint set of this linear program is bounded, by the duality theorem
T T T
~(r) = max{-bHr: -A Hr (1/01' ... ,1/ On) ~ p-rCi ' S ~ 0, r ~ O}
Thus,
Hence,
T T T
t i = max{r : -bs+r + TPi ~ a, -A Hr (l/0l' ... ,l/On) ~ p-rCi '
s ~ 0 , r ~ O} .
By passing again to the dual program and noting that the above program is
always feasible, we obtain the desired result (with the understanding that \ = +111 if
the dual program is infeasible).
•
221
Since p ~ 0 and a < 0, it follows that (z,zO) = (0,0) is a dual feasible solution of
(*) with only one constraint violated, and it usually takes only a few pivots to solve
(*) starting from this dual feasible solution. Moreover, the objective function of (*)
increases monotonically while the dual simplex algorithm proceeds, and we can stop
After the first iteration, if a furt her improvement is needed, we may try a second
iteration, and so forth. It turns out that this iterative procedure quite often yields a
substantially deeper cut than the concavity cut. For example, for the problem
xl - x2 ~ 1,
-xl + 2x2 ~ 3,
2xI - ~ ~ 3,
xl ~ 0 , ~ ~ 0 ,
(Konno (1976a)), the concavity cut and the cuts produced by this method are shown
6., 1 st iteration
------------- 2nd iteration
5, - - - 3rd iteration
---BLP cut
4
, Xl /3.25 + X2 /4.66
·\3
,.,
.,
Xl /4.05 + x2 /5.48
2·' .,
.'.
+ x2 /6-13
Fig. V.6
222
The cutting method of Konno (1976a) for the problem (CQP) is similar to AIgo-
rithm V.I, except that in Step 1 a deep cut of the type discussed above is generated
The convergence of this method has never been established, though it has seemed
to work successfully on test problems. Of course, the method can be made conver-
cuts with splitting as in the Normal Conical Algorithm which will be introduced in
Section VI!.l.
The cut improvement procedure may be started from the concavity cut or the
bilinear programming (BLP) cut constructed in the following way.
(so that 4>o(x) has the form (35), with M(O) = D).
By the same argument as in the proof of Proposition V.6 it follows that the
which is not hing but the a-valid concavity cut for the problem min {4>O(x): x E D},
will be also a-valid for the problem (CQP) because f(x) = F(x,x) ~ min {F(x,v):
If f(yi*) ~ a (i=1,2, ... ,n), where yi* E argmin {F(Oiei,v): v E D}, and if C is
positive definite, then the BLP cut is strictly deeper than the concavity cut (for the
problem CQP).
Note that F(rei,v) = 7Jli + pTv -" ~ CijVj . With the same argument as in the
J
proof of Proposition V.10, one can easily show that er is equal to the optimal value
of the linear program
minimize pz - a Zo
E c· .z. - zOp· = +1
j IJ J 1
The BLP cut for the previously given example is shown in Fig. V.6.
Denote
Since D;(o) = {x: ~O(x) ~ o}, this set is closed and convex and contains the origin
as an interior point. Therefore, the BLP cut can also be defined as a valid cut for
IR~ \ D;( 0) (i.e., a convexity cut relative to D;( 0), see Section III.4).
The set D;( 0) is sometimes called the polaroid (or generalized o'Uter polar) set of
D with respect to the bilinear function F(u,v) (cl. Balas (1972) and Burdet (1973)).
In the cutting plane methods discussed in the previous chapter, the feasible da-
main is reduced at each step by cutting off a feasible portion that is known to con-
construct a sequence of problems which are used to approximate the original one,
where each approximating problem can be solved by the methods already available
problem will get eloser and eloser to a global optimal solution of the original one.
where D is a elosed convex sub set of IRn and f: IRn - I IR is a concave function.
226
strained problems was discussed. In this section, we shall deal with the application
Tarn and Ban (1983), Thieu (1984), Tuy (1983), Horst, Thoai and Tuy (1987 and
1989), Horst, Thoai and de Vries (1988)). A more introductory discussion is given in
Let us first consider the problem (BCP) as defined in Section V.1., i.e., the prob-
lem of minimizing a concave function f: IRn - I IR over a polyhedron D given by a sys-
(1)
Proposition VI.1. If the concave fu,nction f(x) is bo'Unded from below on some half
Suppose that f(x) is bounded from below on a halfline from xOin the direction y, i.e.,
f(x) ~ c for all x = xO + >.y with >. ~ O. Then the halfline {(xO + >.y, cl: >. ~ O} (in
227
IRn+ 1) lies entirely in G. Rence, by a well-known property of closed convex sets (cf.
Rockafellar (1970), Theorem 8.3), (y,O) is a recession direction of G, so that
{(x+>.y, f(X)): >. ~ O} c G for any X, which means that f(x + >.y) ~ f(X) for all >. ~ 0.
Thus, f(x) is bounded !rom below on any halfline parallel to y.
•
Proposition VI.2. Let M be any closed con'llez set in IRn. 11 the conca'lle function
I(z) is bounded from below on e'llery eztreme ray 01 M, then it is bounded from below
on any halfline contained in M.
Proof. Suppose that f(x) is bounded !rom below on every unbounded edge of M.
Then, by Proposition V1.2, f(x) is bounded !rom below on any halßine contained in
M. By a well-known property of concave functions (cf. Rockafellar (1970), Corol-
larles 32.3.3 and 32.3.4) it follows that f(x) is bounded !rom below on M and attains
its minimum over M at some vertex of M (see Theorem 1.1).
•
On the basis of the above propositions, it is now easy to describe an outer approx-
imation method for solving the problem (BCP).
228
Start with a simplex D1 such that D ( D1 (IR~. At iteration k = 1,2, ... , one has a
polytope Dk such that D ( Dk ( IR~. Solve the relaxed problem
tion of (BCP). Otherwise, xk violates at least one of the constraints (1). Select
(3)
and form
(4)
Go to iteration k+l.
•
Since each new constraint (4) cuts off xk, and hence is different from all of the
previous ones, the finiteness of this procedure is an immediate consequence of the
finiteness of the number of constraints (1).
The implementation of this method and its extension to the case of an unbounded
feasible set D require us to examine two questions:
(i) When the polyhedron D defined by (1), (2) is unbounded, the starting polyhed-
D = • or f(x) is unbounded from below over any halßine emanating from a feasible
Otherwise, uk violates one of the inequalities Aiu ~ 0 (i=l, ... ,m). Let
Ifwe define
(6)
then, since Ai uk > 0, it follows that uk is no longer a recession direction for Dk+1.
k
(ü) Each relaxed problem (Qk) is itself a concave minimization problem over a
polyhedron. How can it be solved? Taking into account that (Qk) differs from
(Qk-l) by just one additional linear constraint, how can we use the information ob-
f('\u k) < f(O) for some ,\ > 0, then f(x) is unbounded from below on the extreme ray
of Dk in the direction uk . Otherwise, the minimum off(x) over Dk must be aUained
at one ofthe vertices of Dk, namely, at xk E arg min{f(x): x E Vk}.
Algorithm VI.l.
Initialization:
(For instance, one can take D 1 = IR!. Then VI = {O}, U 1 = {el,e2,... ,en} where ei
is the i-th unit vector of IRn .)
Iteration k = 1,2,... :
1) For each u E Uk check whether there exist A > 0 such that f(AU) < f(O). If this
k
occurs for some u E Uk , then:
b) Otherwise, compute
(7)
and go to 3).
2) If no u E Uk exists such that f(AU) < f(O) for some A > 0, then find
x k E arg min{f(x): x E V k}.
solution.
b) Otherwise, compute
(8)
and go to 3).
231
3) Form
(9)
Compute the vertex set Vk+ 1 and the extreme direction set Uk + 1 of Sk+1 from
knowledge of Vk and Uk · Set I k+1 = I k \ {i k} and go to iteration k+1.
Theorem VI.I. S-uppose that D j 0. Then Algorithm VI.1 terminates after at most
m iterations, yielding a global optimal solution 0/ (Bep) or a halfline in D on which
f(x) - i -00.
Proof. It is easy to see that each ik is distinct from all il' .. .,ik_1' Indeed, if 1b)
occurs, then we have Aiu k ~ 0 (i=i1, .. ·,i k 1)' Ai uk > 0, while if 2b) occurs, then
- k
we have A.xk
1
~ b. (i=i 1,... ,i k 1)' A. xk > b.. Since Ik ( {l, ... ,m}, it follows that
1 - lk lk
the algorithm must terminate after at most m iterations.
•
Remarks. VI.I. (i) An alternative procedure for solving (Qk) which has restart
capability and does not necessarily involve a complete inspection of all of the ver-
(ii) A drawback of the above outer approximation algorithm is that all of the inter-
mediate approximate solutions xk , except the last one, are infeasible. In order to
have an estimate of the accuracy provided by an intermediate approximate solution
xk , we can proceed in the following way. Suppose that Dis full dimensional. At the
we compute yk E arg min{f(x): x E r(xk )}, where r(xk ) denotes the intersection of
D with the halfline from xk through zO (l is one endpoint of this line segment). If
i k denotes the best among all of the feasible points yh (h ~ k) obtained in this way
until Step k, then f(x k) ~ min f(D) ~ f(ik ). Therefore, the difference f(i k ) - f(xk )
yields an estimate of the accuracy attained; in particular, if f(ik) - f(xk ) ~ E, then
232
age only about half of the constraints of D (not including the nonnegativity con-
x x
minimize f(x) = X~ +~ - O.05(xl + x2)
Xl - 4x2 - 2 ~ 0 , - Xl + x2 - 5 ~ 0 ,
Xl ~ 0 , x2 ~ 0 .
X
1
Fig. VI.I
233
Initia1ization:
Iteration 1:
Iteration 2:
i2 = 1, so that
Iteration 3:
Values A.x 3-b. (i=2,4): 17, -7. Since the largest of these values is 17, we have
1 1
i 3 = 2, so that
Iteration 4:
Since A4x4 = -10 < 0, this is the optimal solution (Fig. VI.l).
Now let us consider the concave minimization problem in the general case when
the feasible set Dis a closed convex set given by an inequality of the form
method discussed in Chapter Ir, in order to solve the problem (CP) we can proceed
as folIows.
Start with a polytope D1 containing D. At iteration k = 1,2, ... , solve the relaxed
problem
tion of (CP). Otherwise we have g(xk ) > O. Construct a hyperplane strictly separ-
lk(x) S 0 Vx E D , (11)
k
lk(x ) >0 . (12)
235
Form
Dk +1 = Dk n {x: 4c(x) ~ O} ,
The convergence of this procedure depends on the choice of the affine function
lk(x) = Pk (x - k
Y) + g(yk ), (13)
with pk E 8g(yk) \ {O}, then by Theorem II.2, 4c(x) satisfies (11), (12) and the pro-
cedure will converge to a global optimal solution, in the sense that whenever the al-
gorithm is infinite, any accumulation point of the generated sequence xk is such a so-
lution.
IK I = 1 and l E öD this is the method of Veinott (1960) which has been applied to
concave minimization by Hoffman (1981). Notice that here each relaxed problem
(Qk) is solved by choosing x k E argmin{f(x): xE Vk}, where Vk = vert Dkj the set
Vk is derived from Vk-l by one of the procedures indicated in Section 11.4.2 (V 1 is
known at the beginning).
Extending the above algorithm to the general case when D may be unbounded is
not trivial. In fact, the relaxed problem (Qk) may now have no finite optimal solu-
this difficulty in the same way as in the linearly constrained case, since the convex
Algorithm VI.2.
Initialization:
Iteration k = 1.2•... :
a) If g(xO + ~uk) ~ 0 for all ~ ~ 0 (Le.• the halfline r k trom xO in the direction uk
lies entire1y in D), then stop: the function fex) is unbounded trom below on
r k c D.
b) Otherwise, compute the intersection point yk of r k with the boundary an of D,
find pk E 8g(yk) and go to 3).
3) Form
Theorem VI.2. Assume that lor so me a < 1(1,0) the set {x E IRn: 1(1,) = a} is
bounded. Then the above algorithm either terminates after finitely many iterations
(with a finite global optimal solution 01 (CP) or with a halfline in D on which 1(1,) is
unbounded [rom below), or it is infinite. In the latter case, the algorithm either
generates a bounded sequence {i}, every accumulation point Z 01 which is aglobaI
optimal solution 01 (CP), or it generates a sequence {uk}, every accumulation point u
01 which is a direction 01 a halfline in D on which 1(1,) is unbounded [rom below.
Proof. On each r k take a point zk such that f(zk) = a. By hypothesis, the se-
quence {zk} is bounded; hence, by passing to a subsequence if necessary, we may
assume that the zk converge to some z. Because of the continuity of f, we then have
f(z) = a< f(xO). Since zEr, it follows by Corollary VI.1 that fis unbounded from
below on M = r.
•
Lemma VI.2. Under the 48sumptions 01 Theorem Vl.2, il the algorithm generates
an infinite sequence i, then this sequence is bounded and any accumulation point 01
the sequence is aglobaI optimal solution 01 (CP).
238
Proof. Suppose that the sequence {xk} is unbounded, so that it contains a subse-
k
quence {xq} satisfying IIx qll > q (q=I,2, ... ). Let X be an accumulation point of he
k k k
sequence x q/llx qll. Since f(x q) < f(xO) , Corollary VI.1 implies that fis unboun-
k
ded from below on the halfline from xO through x q. Hence, by the previous lemma,
f is unbounded from below on the halfline from xO through X, and we can find a
point z on this halfline such that f(z) < f(x l ). Let B be a ball around z such that
f(x) < f(x l ) for all x E B. Then for all sufficiently large q, the halfline from xO
k I k
through x q meets B at some point zq such that f(zq) < f(x ) ~ f(x q). Because of
k
the concavity of f(x), this implies that x q lies on the line segment [xO ,zq]. Thus,
k
all of the x q with q large enough belong to the convex hull of xO and B, contradic-
k
ting the assumption that Ilx qll > q. Therefore, the sequence {xk} is bounded. Since
the conditions of Theorem 11.2 are satisfied, it then follows that any accumulation
point of this sequence solves the problem (P).
•
Lemma VI.3. 0/ Theorem VI.2, i/ the algorithm generates
Under the ass'Umptions
an infinite sequence {'U k}, then every acc'Um'Ulation point 0/ this sequence yields a
recession direction 0/ D on which f(z) is 'Unbo'Unded !rom below.
k
Proof. Let u = lim u q and denote by r k (resp. r) the halfline emanating from
q-t(l)
xO in the direction uk (resp. u). Suppose that r is not entirely contained in D and let
k
y be the intersection point of r with an. It is not hard to see that y q -I Y (yk is
the point defined in Step lb of the algorithm). Indeed, denoting by rp the gauge of
° _°
since tp (yk q - x ) = tp (y - x ) = 1.
239
(13)
°
Let zk = 2yk - x . Clearly zk = yk + (ykO . follows that
-x ) E r k and from (13) It
(14)
k
But the sequence {y q} is convergent, and hence bounded. It follows that the se-
k k
quence p q E 8g(y q) is also bounded (cf. Rockafellar (1970), Theorem 24.7). There-
k
fore, by passing to subsequences if necessary, we may assurne that p q - i pe 8g(Y}.
k
Obviously, z q - i Z = 2y - xO, and, by (14), p(z-y) ~ - g(xo) > 0. Since
k ks k __
p q(z -y q) - i p(z-y) as s - i m, q - i m, it follows that for all sufficiently large q
(15)
on each halfline r k' it follows from Lemma VI.1 that it must be unbounded on r.
This completes the proof of Lemma V1.3, and with it the proof of Theorem V1.2. •
240
method for generating these constraints one by one, as they are needed. For prob-
lems with many variables, usually a large number of constraints has to be generated
Uk increase rapidly in size, making the computation of these sets more and more dif-
ficult as the algorithm proceeds. In practice, for problems with about 15 variables
To alleviate this difficulty, a common idea is from time to time to drop certain
were discussed in Section II.3. Here is another constraint dropping strategy which is
Let K denote the index set of all iterations k in which the relaxed problem (Qk)
has a finite optimal solution x k . Then for k E K we have
(16)
xk = 2yk -x0
(to avoid confusion this point was denoted by zk in the proof of Lemma VI.3).
whereas x.i E r j C Dk +1 for all j > k (rj is the halfline from x O parallel to u j ).
Therefore, (16) holds for all k = 1,2, ....
241
Now, choose a natural number N. At each iteration k let vk denote the number of
points x-i with j < k such that lk(x-i) > 0 (i.e., the number of previously generated
points that violate the current constraint). Let NO be a fixed natural number greater
than N. Then we may modify the rule for forming the relaxed problems (Qk) as
follows:
It is easily seen that in this way any constraint ~(x) ~ 0 with vk < N, k > NO ' is
used just once (in the (k+1)-th relaxed problem) and will be dropped in all
subsequent iterations. Intuitively, only those constraints are retained that are
sufficiently efficient in the sense of having discarded at least N previously generated
points.
Since (Qk+1) is constructed by adding just one new constraint to (Qk) or
(Qk-1)' the sets Vk +1' Uk+1 can be computed from Vk ' Uk or Vk_ 1 ' Uk_ 1 '
respectively (of course, this requires that at each iteration one stores the sets Vk'
Uk , Vk- 1, Uk- 1, as well as the previously obtained points xi, i < k).
Proposition VI.3. With the above modijication, Algorithm VI.1 stiU converges
'Under the same conditions as in Theorem VI.f.
(17)
Let us first show that any accumulation point i of the sequence {xk} belongs to
k
D. Assume the contrary, i.e., that there is a subsequence x q - I i ~ D. Without
difficulty one can prove, by passing to subsequences if necessary, that
242
k k
y q - I YE 00, P q - I PE 8g(Y) (cf. the proof of Lemma VI. 3).
Let l (x) = p(x-Y). Since g(xO) < 0, we have l (xO) = p(xO-Y) ~ g(xO)-g(Y) =
g(xO) < 0, and consequently, l (i) > (because l (Y) ° = 0). Noting that
. k .
4c =
(x-l) -lk (i) P q(x-l-i), for any q, j we can write
q q
Therefore,
~ ()) >
q
° (j = jO+1, .. ·,jO+N) .
~
q
(xj ) ~ °
for all j > kq , a contradiction.
Therefore, any accumulation point xof {xk} belongs to D. Now suppose that the
algorithm generates an infinite sequence {xk , k E K}. Then, by Lemma VI.2, this se-
quence is bounded, and, by the above, any accumulation point xof the sequence
must belong to D, and hence must solve (CP).
On the other hand, if the algorithm generates an infinite sequence {uk , k ~ K},
k
then far any accumulation point u of this sequence, with u = lim u q, we must have
r = {x: xO + AU: A ~ O} c D. Indeed, otherwise, using the same notation and the
k
same argument as in the proof of Lemma V1.3, we would have y q -I YE 00,
243
kq - -0 -0 ..
x --+ x = 2y-x , and henee 2y-x e D, whieh IS impossible, sinee ye öD,
x O eint D. This eompletes the proof of the proposition.
•
The parameters NO and N in the above eonstraint dropping rule can be chosen ar-
bitrarily considering only computational efficieney. While a large value of N allows
one to reduee the number of constraints of the relaxed problem more significantly at
eaeh iteration this advantage can be offset by a greater number of required itera-
tions.
Though eonstraint dropping strategies may help to reduee the size of the sets Vk '
Uk , when n is large they are often not efficient enough to keep these sets within
manageable size. Therefore, outer approximation methods are practieal only for CP
handling new additional eonstraints. Because of this, they are useful, especially in
combination with other methods, in decomposing large scale problems or in finding a
rough approximate solution of highly nonlinear problems which otherwise would be
aimost impossible to handle. In Section VII.1.10 we shall present a more efficient
method for solving (CP), whieh eombines outer approximation ideas with cone
splitting teehniques and branch and bound methods.
2. INNER APPROXIMATION
In the outer approximation method for finding min F(D), we approximate the
such that min f(D k ) 1min f(D). Dually, in the inner approximation method, we ap-
proximate the feasible set D from the inside by a sequenee of expanding polytopes
tisfying P h ) D.
The inner approximation approach originated from the early work of Tuy (1964).
Later it was developed by Glover (1975) for integer programming and by Vaish and
Shetty (1976) for bilinear programming. These authors used the term "polyhedral
annexation ", referring to the process of enlarging the polytopes P k by 11 annexing 11
more and more portions of the space. Other developments of the inner approxim-
Below we essentially follow Tuy (1988b and 1990), Horst and Tuy (1991).
The (DG)-Problem turns out to be of importance for a large dass of global op-
timization problems, induding problems in concave minimization and reverse convex
programming.
For example, consider the problem (BCP), Le., the problem (CP) where the feas-
ible domain is a polytope D determined by the linear constraints (1) (2). Assume
As seen in Chapter V, when solving the BCP problem a crucial question is the fol-
lowing one: given areal number 'Y e f(D) (e.g., 'Y is the best feasible value of f(x) ob-
tained so far), and given a tolerance c > 0, find a feasible solution y with f(y) < 'Y-€
or else establish that no such point exists (Le. min f(D) ~ 'Y-€).
245
KO J D (this can be done, e.g., by rewriting the problem in standard form with
respect to xO, so that KO = IR~). Then the above question is just a (DG)-problem,
with G = {x: f(x) ~ 7-t:} (this set is obviously compact and convex, and it contains
Phase I.
Search for a local minimizer xO, which is a vertex of D such that f(xO) ~ f(z).
Phasen.
global c-optimal solution. Otherwise, let y E D \ G. Then f(y) < f(xO) - c. Set
Z f- Y and return to Phase I.
Since a new vertex of D is found at each return to Phase I that is better than all
(DG )-problems.
In this and the next sections we shall present the polyhedral annexation method
for solving (DG), and hence the problem (BCP).
246
The idea is rather simple. Since D C KO ' we can replace G by G n KO. Now we
start with the n-simplex PI spanned by the origin 0 and the n points where the
(This is easy, since Plis an n-simplex.) If no such point exists (which implies that
D C G), or if y1 ~ G (which means that y1 E D \ G), then we are done.
Otherwise, let zl be the point where the halfline from 0 through y1 meets {)G (such
Enlarge PI to
P2 = conv (P 1 U {z 1}) ,
k
y E D \ P k , P k+1 = conv(P k U{zk}) , (19)
can require that l be a vertex of D. Under these conditions, each P k+1 contains at
least one vertex of D which does not belong to any PI ,P 2 ,... ,P k . Since the vertex
set of D is finite, the above polyhedral annexation procedure must terminate with a
polytope P h ) D (proving that D c G) OI with a point yh E D \ G.
247
guage, we shall identify a facet with its normal vector and instead of saying "the fa-
cet whose hyperplane is vx= 1", we shall simply say "the facet v".
In view o{ the property con (P k ) = KO (see (18» we see that, if Vk is the C'Ollec-
Now {or each v E Vk let ~v) denote the optimal value in the linear program:
LP(vjD) maximize vx
s.t. xED
Proposition VI.4. 11 ",(v) ~ 1 lor all v E Vk J then D ( Pk . 11 "'(v) > 1 lor some
v E Vk then any basic optimal solution yk 01 LP(WD) satisfies
J ,l E D \ Pk .
Proof. Suppose that ~v) ~ 1 for all v E Vk . Then xE D implies that vx ~ 1 for all
'" (v) > 1 for some v E Vk , then a basic optimal solution yk of the linear program
LP(v;D) must satisfy yk E D and vyk> 1, hence yk E D \ P k . This completes the
248
versal facet v of P k . The question that remains is how to find the set Vk of these fa-
cets.
The set VI is very simple: it consists of a single element, namely the facet whose
hyperplane passes through the n intersections of 8G with the edges of KO' Since
However, rather than solving this problem direct1y, we shall associate it with
another problem which is easier to visualize and has already been studied in Section
II.4.2.
exists between the above two problems which allows one to reduce the first problem
to the second and vice-versa (cf., e.g., Balas (1972)).
249
Proposition VI.5. Let P be a polytope 01 fu.U dimension which contains 0, and let
S = {x: zx ~ 1 Vz E P} be the polar 01 P. Then 0 Eint Sand each transversal lacet
tJZ = 1 01 P corresponds to avertex v 01 Sand vice versa; each nontransversal lacet
convex sets (cf. Rockafellar (1970), Corollary 14.5.1). Denote by Z the vertex set of
P. Since for any x the linear function Z --I zx attains its maximum over P at some
pass through 0, the facet must contain at least n linearly independent vertices of P.
Conversely, let v be any vertex of S (v * 0 because 0 Eint S). Then v satisfies all
of the constraints of S, with equality in n linearly independent constraints. That is,
Corollary VI.2. Let P be a polytope 01 fu.U dimension which contains 0, and Let
z t P. 11 S denotes the polar 01 P then each transversal lacet tJZ = 1 01 the polytope
P' = conv (P U {z}) corresponds to avertex v 01 the polyhedron S' = Sn {x: zx ~ I}
250
and vice versa; each nontransversal lacet 01 P' corresponds to an extreme direction
01 S' and vice versa.
2
V x= 1
--
3
V
Fig. VI.2
251
solving the (DG)-problem. It now suffices to incorporate this procedure in the two
algorithm. tor the problem (BCP). In the version presented below we describe the
procedure through the sequence SI ( S2 ( ... and add a concavity cut be{ore each
return to Phase I.
Select E~ 0.
IDitialization:
Phase I.
Starting with z search tor a vertex xo o{ M which is a local minimizer o{ {(x) over M.
phasen.
KO= IR~).
For each i=I,2, ... ,n construct the intersection ui of the i-th edge of KO with the
k
I) For each v E V solve the linear program
LP(vjM) maximize vx
s .t. xE M
to obtain the optimal value p. (v) and a basic optimal solution w (v).
If we have f(w (v» < a for some v E Vk' then set
where vI was defined in Step 0), and return to Phase I. Otherwise, go to 2).
2) Compute vk E arg max {po (v): v E Vk}. If p.(vk) ~ I, then stop: xO is a global
e-optimal solution of (BCP). Otherwise, go to 3).
solving a (DG)-problem, with D = M and G = {x: fex) ~ a}. Each time the algo-
rithm returns to Phase I, the current feasible set is reduced by a concavity cut v1x ~
1. Since the vertex xO of M satisfies all the previous cuts as strict inequallties it will
actually be a vertex of D. The finiteness of the algorithm follows from the finiteness
of the vertex set of D.
•
R.ema.rks VI.2. (i) During aPhase 11, all of the linear programs LP(vjM) have
the same constraint set M. This makes their solution a relatively easy task.
253
(ii) It is not necessary to take a local minimizer as xO. Actually, for the algorithm to
work, it suffices that xO be a vertex of D and a < f(xO) (in order that ui f: °
(i=1,2, ... ,n) can be constructed). For example, Phase I can be modified as folIows:
Compute a vertex x of D such that f(X) ~ f(z). Let xO e argmin {f(x): x = x or x
is a vertex of D adjacent to xO}. Then a = f(xO) - E: in Phase 11, and when
Jl. (v k) ~ 1 (Step 2), xis a global E:-optimal solution.
f(w (v)) < a. However, independently of f(w (v)), one can restart also whenever a
point w (v) has been found which is a vertex of D (note that each w (v) is a vertex of
M, but not necessarily a vertex of D): if f(w (v» ~ a, simply return to Step 0, with
xO f-- w (v); otherwise, return to Phase I, with z f-- w (v). It is easy to see that with
this modification, the algorithm will still be finite.
In other cases, when the set Vk approaches a critical size, a restart is advisable
even if no w (v) is available with the above conditions. But then the original feasible
domain D should be replaced by the last set M obtained. Provided that such restarts
are applied in limited number, the convergence of the algorithm will not be
adversely affected. On the other hand, since the polyhedral annexation method is
sensitive to the choice of starting vertex xO, arestart can often help to correct a bad
choice.
254
vertices), then the algorithm works even for e = 0, because then IR~ coincides with
subject to -xl + x2 ~ 3,
xl+~~ 11,
2x l -~ ~ 16,
-xl -x2 ~ - 1 ,
x2 ~ 5,
xl ~ °, ~ ~ 0.
Fig. VI.3
255
Phase I:
°
x = (9.0,2.0)
Phase 11:
0) O! = f{xO) = -23.05;
The unique element vI of VI corresponds to the hyperplane through u 1 = (7.0,-2.0)
2
and u = (4.3,6. 7).
Iteration 1:
2) JL{v 1) > 1.
respectively.
Iteration 2:
21
1) LP{v ,D): W(V 21 ) = (1.0,0.0); LP{v 22 ,D): W(V 22 ) = (0.0,3.0).
Iteration 3:
2) v 3 = v 21 .
256
3) z3 = (-{).082,-{).271)
Vl = {v41 ,v42 } corresponds to hyperplanes through u 1, z3 and through zl, z3
respectively.
Iteration 4:
Thus, the algorithm has actually verified the global optimality of the solution al-
where x e 1R4 ,
3
~ 2
f(x) = -[' xl' + 0.1(x1 - 0.5x2 + 0.3xa + x4 - 4.2) ] ,
Tolerance E = 10-6 .
First cycle:
D = {x: Ax ~ b , x ~ O} .
Phase I:
Adjacent vertices:
01
y = (0.428571, 0.000000, 0.000000, 0,000000) ,
02
Y = (0.000000, 0.666667, 0.000000, 0.000000) ,
03
Y = (0.000000, 0.000000, 0.333333, 0.000000) ,
Y
04
= (0.000000, 0.000000, 0.000000, 1.600000)
Current best point x = (0.000000, 0.666667, 0.000000, 0.000000,)
(see Remark VI.2(ii)).
Phase 11:
The vertex set VI of SI = {x: uix ~ 1, i=I, ... ,4} is VI = {vI} with
v1 = (0.413885, 0.999997, 0.011450, 0.183206)
Iteration 1:
Iteration 2:
M = D n {x: v1x ~ I}
Phase I:
adjaeent vertiees:
Y01 = (1.104202, 0.900840, 0.000000, 5.267227 ) ,
02
Y = (1.216328, 1.245331, 0.818956, 2.366468) ,
f(X) = -2.281489.
Phase 11:
Thus, the global optimal solution (v 24 ) is eneountered after two iterations of the
first eyde, but is identified as such only after a seeond eyde involving twelve
iterations.
VI.3 ean be found in Horst and Thoai (1989), where, among others, the following
types of objeetive funetions are minimized over randomly generated polytopes in IR!:
259
... + 2] •
nxn)
(5) - max {lIx - dill: i=1 ..... p} with chosen d 1..... dP E IRn .
of problems with n ranging from 5 to 50. gives an idea of the behaviour of slight
modifications of Algorithm V1.3. The column "f(x)" indicates theform of the
objective function; the column 11 Res 11 gives the number of restarts; the column "Lin"
gives the number of linear programs solved.
The algorithm was coded in FORTRAN 77. and the computer used was an
IBM-PSII. Model 80 (DOS 3.3). The time in seconds includes CPU time and time
for printing the intermediate results.
Table VI.1.
260
Notice that, as communieated by its authors to us, in Horst and Thoai (1989), 3.4.,
pp. 283-285, by an input error, the figures correspond to the truncated objective
funetion f(u) = -lu113/2. This example reduces to one linear program, for whieh
general purpose coneave minimization methods are, of course, often not ef6cient.
Other objective funetions encountered in the concave minimization literature turn
out to be of the form !p[l(u)], l: IRn - IR affine, !p: IR -IR (quasi)concave, so that
they reduce to actually two linear programs (e.g., Horst and Thoai (1989), Horst,
Thoai and Benson (1991)).
More recent investigations (Tuy (1991b and 1992a), Tuy and Tam (1992), Tuy,
Tam and Dan (1994), Tuy (1995)) have shown that a number of interesting global
optimization problems belong to the class 0/ so-ca"ed rank k problems. These
problems can be transformed into considerably easier problems of smaller dimension.
Examples include eertain problems with products in the objective (multiplicative
programs), certain loeation problems, transportation-production models, and bilevel
programs (Stackelberg games).
It is easy to see that objeetive funetions of type (3) above, belong to the class of
rank two q1J.asiconca1Je minimization problems whieh could be solved by a parametrie
method whieh is a specialized version of the polyhedral annexation proeedure (Tuy
and Tam (1992)).
There are several interpretations of the PA algorithm which show the relationship
between this approach and other known methods.
Consider the (DG)-problem as formulated in 5ection 2.1, and let G# denote the
polar of G, i.e., G# = {v: vx ~ 1 Vx E G}. 5ince the PA algorithm for (DG)
generates a nested sequence of polyhedra SI J S2 J ... J G#, each of which is ob-
tained from the previous one by adding just one new constraint, the computational
scheme is much like an outer approximation procedure performed on G#. One can
even view the PA algorithm for (DG) as an outer approximation procedure for
solving the following convex maximization problem
where J.'(v):= max v(D) (this is a convex function, since it is the pointwise max-
imum of a family of linear functions V""" v(x».
Indeed, starting with the polyhedron SI J G#, one finds the maximum of J.'(v)
over Sr Since this maximum is achieved at a vertex vI of 51' if J'Cv l ) > 1, then
vI ~ G# (because max v l (D) = J.'(v l ) > 1), so one can separate vI from G# by the
hyperplane zlx = 1.
Next one finds the maximum of J'Cv) over 52 = SI n{x: zlx SI}. If this max-
imum is achieved at v2, and if J.'(v2) > 1, then v2 ~ G#, so one can separate v2 from
G# by the hyperplane z2x = 1, and so on. Note, however, that one stops when a vk
is obtained such that J.'(vk ) S 1, since this already solves our (DG)-problem.
262
improves as the algorithm proceeds. Furthermore, since restarts are possible, the
number of constraints on Sk will have a better chance to be kept within manageable
limits than in the usual out er approximation algorithms.
verified that all of the previous results still hold if we only assume that 0 e int Ko G
Note that the same procedure can also be performed using the vertex set rather
than the constraints of D. In fact, if we know a finite set E such that D = conv E,
then instead of defining J4..v) = max {vx: x e D} as above (the optimal value of
LP(v;D)), we can simply define ,""v) = max {vx: x E E}. Therefore, the PA algo-
rithm can be used to solve the following problem:
Given a finite set E in IRn, find linear inequalities that determine the convez hull
oIE.
This problem is encountered in certain applications (cf. Schacht mann (1974)). For
instance, if we have to solve a sequence of problems of the form min {ckx: x e E},
263
where ck E IRn, k=1,2, ... , then it may be more convenient first to find the constraints
pix ~ qi, i=1,2, ... ,r, of the convex huH of E and then solve the linear programs
.{k
IDln c x: pix<
_ qi,1=
· 12, ,... ,r } .
Assuming that E contains at least n+1 affinely independent points, to solve the
above problem we start with the n-simplex PI spanned by these n+ 1 points, and we
translate the origin to an interior point of PI' Then we use the polyhedral
For each polytope P k let .At k denote the coHection of cones generated by the
transversal facets of P k . Clearly, .At k forms a conical sub division of the initial cone
K O ' and so the PA algorithm may also be regarded as a modification of the cut and
with f(x) ~ Q. H max v(D) ~ 1 for an these v (Le., if the colleetion of an of the
pyramids covers an of D), then the algorithm stops. Otherwise, using zk E argmax
{vkx: x E D} with vk E argmax v(D), we generate a new conieal subdivision .At k +1 '
etc ....
Thus, compared with the cut and split algorithm, the fundamental difference is
that the bases of the pyramids M(v) are required to be the transversal facets of a
convez polytope. Because of this requirement, a cone may have more than n edges
and .At k+1 may not necessarily be a refinement of .At k (Le., not every cone in .At
k+l is a subcone of some cone in .At k)' On the other hand, this requirement anows
the cones to be examined through linear programs with the same feasible domain
In the cut and split algorithm, P k+1 is obtained by merely adding to P k the
n-simplex spanned by zk and the vertices of the facet vk that generated zk j thus the
annexation procedure is very simple. But, on the other hand, the method might not
be finite, and we might need some anti-jamming device in order for it to converge.
We shall return to this question in Section VII.1.
2.6. Extensions
So far the PA algorithm has been developed under the assumption that the feas-
ible domain D is a polytope and the objective function fex) is finite throughout IRn
(i) Dis a polyhedron, possibl1l unbounded bat line free, while. j{z} has bounded level
sets.
In this case the linear program LP(vjD) might have no finite optimal solution. If
max v(D) = +ID for some v, i.e., if a halfline in D in the direction y = w (v) is found
for which vy > 0, then, since the set {x: fex) ~ t} is bounded for any real number t,
we must have fex) --I --w over this halfline (therefore, the algorithm stops). Other-
(ü) Dis a pol1lhedron, possibl1l unbounded bat line free, while. inj j{D} > _ {bat j{z}
ma1l have unbounded level sets}.
Under these conditions certain edges of KO might not meet the surface fex) = et, so
we define
265
where I is the index set of the edges of KO which meet the surface fex) = a, and if
i ;. I, then ui denotes the direction of the i-th edge. Moreover, even if the linear pro-
gram LP(v;D) has a finite optimal solution w (v), this point might have no finite
a-extension. That is, at certain steps, the polyhedral annexation process might in-
volve taking the convex hull of the union of the current polyhedron P k and a point
at infinity (i.e. a direction) zk.
To see how one should proceed in this case observe that Proposition VI.5 still
holds when P is an unbounded polyhedron, except that 0 is then a boundary rat her
(ili) Dis a polytope, 'IIJhile /(x) is finite omy on D (f(x) = -m outside D}.
Most of the existing concave minimization methods require that the objective
function f: D -+ IR can be extended to a finite concave function on a suitable set A,
A J D.
However, certain problems of practical interest involve an objective function fex)
which is defined only on D and cannot be finitely extended to IRn . To solve these
However, the PA algorithm might still be useful in these cases. Assume that a
polytopes, each obtained from the previous one by annexing just one new vertex.
Such a procedure, if carried out completely, will certainly produce all of the
vertices of D. However, there are several respects in which this procedure differs
First, this is a kind of branch and bound procedure, in each step of which all of
the vertices that remain to be examined are divided into a number of subsets (cor-
responding to the facets v of the current polytope P k ), and the subset that has max-
imal J'(v) = max v(D) is chosen for further partitioning. Although J'(v) is not a
lower bound for f(x) on the set D n {x: vx S 1}, it does provide reasonable heuristics
to guide the branching process.
Second, the best solution obtained up to iteration k is the best vertex of the poly-
tope P k j it approximates D monotonically as k increases.
Third, the accuracy of approximation can be estimated with the help of the fol-
lowing
be the collection 01 its transversallacets (i.e., the collection 01 vertices 01 its polar
Sk)· 11 d(x,Pi! denotes the distance from x to Pk , then
Proof. Let x E D \ P k. Then there is avE Vk such that x belongs to the cone ge-
nerated by the facet v. Denoting by y the point where the line segment [xO,x] meets
and hence
3. CONVEX UNDERESTIMATION
first introduced by Falk and Soland (1969) in the context of separable, nonconvex
programming, and was later used by Falk and Hoffman (1976) to solve the concave
programming problem, when the feasible set D is a polytope. Similar methods have
also been developed by Emelichev and Kovalev (1970), and by Bulatov (1977) and
Bulatov and Kansinkaya (1982) (see also Bulatov and Khamisov (1992). The first
method for solving the nonlinearly constrained concave minimization problem by
means of convex underestimation was presented in Horst (1976), a survey of the de-
velopment since then is contained in Benson (1995).
can be solved by available algorithms, and its optimal solution approaches an opti-
functionals) of f(x).
(26)
and is called arelaxation of the problem (22). The above variant of outer approxim-
When Dk =D for every k, the method is also called the 6UCcenive untier-
estimation methotl. Thus, the successive underestimation method for solving (22)
consists in constructing a sequence of underestimators ,,\(x) satisfying (25) and such
that the sequence
1) if ,,\(xk) = f(x k), then f(x k) = min f(D), i.e. xk solves (22), and we stoPj
2) otherwise, "\+1 must be constructed such that ,,\(x) ~ "\+1 (x) for all x ED
Of course, the convergence of the method crucially depends upon the choice of "\,
k=l,2, ....
Remark VI.3. Another special case of the successive relaxation method is when
,,\(x) :: fex) for every k. Then the sequence Dk is constructed adaptively, based on
the results of solving the relaxed problem min f(D k ) at each step: this is just the
usua! outer approximation method that we discussed in Chapter 11 and Section VI.1.
With the above background, let us now return to the concave programming
problem (BCP), i.e., the problem (22) in which D is a polytope of the form
270
with b e IRm , A an mxn matrix, and f(x) is a concave function defined throughout
IRn .
Let us apply the successive underestimation method to this problem, using as
Recall from Section IVA.3 that the convex envelope of f(x) taken over S, is the
largest convex underestimator of f over S.
Proposition VI.7. Let Sk be a polytope with vertices ,f,O,vk,l, ... ,vk,N(k), and let
lI\(z) be the convez envelope 0/ /(z) taken over Sk' Then the relaxed problem
min {lPlz): z e D} is a linear program which can be mtten as
Nfk) ,k .
minimize A./(TI",J)
j=O J
Nfk) k'
s.t. A . AV'°,J< b
j=O J -
Aj ~ 0 (j = O, ... ,N{k)) .
If vk,j E D for all j E J(k), then f(vk,j) ~ min f(D) for all j E J(k), and hence,
(29)
271
(30)
k' k~
In general, however, v ,J t D at least for some j E J(k), for example, v t D.
Then, taking any ik E {l, ... ,m} such that
k,jk
A. v > b.
lk lk
we obtain a polytope smaller than Sk but still containing D. Hence, for the convex
envelope ~+1 off over Sk+1 we have ~+1 (x) ~ ~(x), ~+1 ~ ~ .
We are thus led to the following successive convex underestimation (SCU) algo-
rithm of Falk and Hoffman (1976):
Initialization:
1) Solve problem (Qk)' obtaining an optimal solution >.k. Let J(k) = {j: >.1> O}.
2) If vk,j E D for all j E J(k), then stop: any vk,j with j E J(k) is a global optimal
solution of (BCP).
272
3) Otherwise, there is a vk,jk ~ D with jk E J(k). Se1ect any ik E {I, ... , m} such
k,jk
that A. v > b. and define
lk lk
4) Compute the vertex set Vk+ l of Sk+1 (from knowledge of Vk , using, e.g., one
of the procedures in Section 11.4). Let Vk+1 = {vk+ I ,O, ... ,vk+1,N(k+1)}. Go to
iteration k+ 1.
Proof. As in the proof of Theorem VI.1, it is readily seen that each ik ia distinct
from a1l the previous indices iO, ... ,i k_ l . Hence, after at most miterations, we have
Sm = D, and then vm,j E D for a1l j E J(m), Le., each of these points is a global
optimal solution of (BCP). •
k,jk k'
v E arg min {f(v ,J): j E J(k)} .
(ü) When compared to Algorithm VI.I, a relative advantage of the successive con-
vex underestimation method is that it yie1ds a sequence of feasible solutions x k such
that lP:k(xk) monotonically approaches the optimal value of the problem (see (25».
However, the price paid for this advantage is that the computational effort required
to solve (Qk) is greater here than that needed to determin xk E arg min{f(x):
273
xE Vk} in Algorithm VI. 1. It is not clear whether the advantage is worth the priee.
problem (CQP)
matrix.
ized eigen-veetors of C, so that UTCU = diag (Al'A 2,... ,A n ) > 0, then by setting
x = Uy, F(y) = f(Uy) we have:
1 T n
F(y) = qy - 2"Y[(U CU)y] = E F .(y.) ,
j=l J J
where
1 2 T
F·(y·)=q·Y·--2AY. ,q=U p. (31)
JJ JJ J-J
n
minimize E F .(y.) subject to Y E Cl ,
j=l J J
where Cl = {y: Uy E D} .
n
Lemma VIA. Let F(y) = E F. (y.) be a separable function and let S be any
j=l J J
rectangular domain of the form
274
(32)
To simplify the language, in the sequel we shall always use the term reet angle to
mean a domain defined by 2n inequalities of the form (32), the faces of which are
we will have F(y) ~ F(v) for all y E T. Hence, if Cl \ T = 0, then no feasible point
better than v can exist, i.e. v is a global optimal solution. Otherwise, the interior of
T can be excluded from furt her consideration and it remains to investigate only the
set Cl \ int T. Though this set is not convex, it is the union of r $ 2n convex pieces,
and it should be possible to handle each of these pieces, for example, by the
Clearly, since each v.i (j=1, ... ,2n) is the optimal solution of a linear program, v.i
can be assumed to be a vertex of O. Let
Then F(v) is an upper bound for the number tp* = mi n F(y). On the other hand,
yeO
. . ·+n
Wlth {J. = ~, (J.+ = ~ , the set
J J J n J
furnishes a reasonably tight lower bound for tp*. Ifit happens that
F(v) - I{J ~ c,
where
Proo!. Since the minimum of F(y) over S is attained at some vertex, 'P is equal
to the smallest of all of the numbers F(w) where w is a vertex of S. The proposition
Proposition VI.9. For each j let 7j J 7j+n (7j> 7j+rI be the roots o/the equation
1 2 - ßF
--2>"1] +ql]=F.(y.J--. (35)
J J J J n
Proof. Since FJ·(Y·) - ßF < max FJ.(t) (where FJ.(t) = q.t _!>...t 2), the quadratic
J n t EIR J J
equation (35) has two distinct roots. By Lemma VIA for each vertex w of T we
n 1 2 n
have: F(w) = E (q'7' - -2>"'7') with 7· = 7, or 7'+ ' and hence F(w) = E
j=l J J J J J J n j=l
(F.(Y.) - ßF) = F(Y) - ßF = F(v). •
J J n
Since 'P ~ 'P* ~ F(v) ~ F(y) for all y E T, the optimal solution of (CQP) must be
sought among the feasible points lying on the boundary or outside of T, Le., in the
set n \ int T (see Fig. VIA, page 270). Since the latter set is nonconvex, it must be
treated in a special way. The most convenient method is to use the hyperplanes that
For each nonempty polytope Oj , we already know one vertex, namely i Usuallyan
n-simplex with avertex at ~ can be constructed that contains 0., so that each
J
problem
can be treated, for example, by the Falk-Hoffman algorithm. If rpj is the optimal
value in (36) then 11'* = min {1Pi, ... ,102n,F(;)} with the convention that rpj = +ID if
Oj = 0. The following example illustrates the above procedure.
Example VIA. Consider the two dimensional problem whose main features are
presented in Fig. VI.4.
The feasible set 0 and a level curve of the objective function F(y) are shown.
First, vI E argmax {Y( y E O}, v3 E argmin {Y( y E O}, and (similarly) v2, v4
are computed, and the rectangle S is constructed. The minimum of F(vi ), i=I,. .. ,4,
is attained at v = v3 (this is actually the global minimum over 0 but it is not yet
recognized as such since F(v) > 11' = min {F(y): y E S}).
the deletion of the interior of T are 0 3 and 0 4. Since 0 4 is a simplex with all
vertices inside the level ellipsoid F(y) = F(v), it can be eliminated from further con-
sideration. By constructing a simplex which has one vertex at v3 and contains 0 3 '
we also see that all of the vertices of this simplex lie inside our ellipsoid. Therefore,
Fig. VI.4
In this section concave piecewise underestimators are used to solve the concave
minimization problem.
denote the hypograph of f(x). Clearly, G is a convex set. Now consider a finite set
X ( MI such that conv X = MI' Let
Z = {z = (x,f(x)): x E X} ( Ml " IR ,
and let P denote the set of points (x, t) E Ml "IR on or below the convex hull of Z.
Then we can write
P = conv Z - r
where r = {(x,t) E IRn"lR: t > O} is the positive vertical halfline. We shall call P a
tnnJ.: 1l1ith base X. Clearly, P is the hypograph of a certain concave function cp (x) on
Proposition VI.lO. The polyhedral fu,nction cp(x) with hypograph P is the lowest
concave underestimator of f(x) that agrees with f(x) at each point x E X.
If 1/1 is any concave function that agrees with f(x) on X, then its hypograph
must be convex and must contain Z; hence it must contain P. This implies that
1/J(x) ~ cp(x).
•
We now outline the concave polyhedral underestimation method and discuss the
main computational issues involved.
Xl = { vI ,... ,vn+1}
280
At the beginning of iteration k=I,2, ... we already have a finite set Xk such that
Xl c Xk c MI = conv Xk , along with the best feasible point i k- l known so far. Let
P k be the trunk with base Xk and let lf1c(x) be the concave function with hypograph
P k . We solve the relaxed problem
obtaining a basic optimal solution x k . Let i k be the point with the least function
value among i k- l and all of the new vertices of D that are encountered while
solving (SPk ). Since lf1c(x) is an underestimator of fex), lf1c(xk ) yields a lower bound
for min f(D). Therefore, if
have x k t Xk (since xke Xk would imply that lf1c(xk ) = f(xk ) ~ f(ik )" because the
function \Ok(x) agrees with fex) on Xk ). Setting Xk +1 = Xk u {xk}, we then pass to
iteration k+1.
Theorem VI.5. The procedure just described terminates after finitely many
iterations, yielding a global optimal solution 0/ (BCP).
Proof. We have lf1c(xk ) ~ min f(D) ~ f(ik). Therefore, if the k-th iteration is not
the last one, then 'I\(xk ) < f(ik) ~ f(x k) = '1\+1 (xk). This shows that '1\ ~ 1DJi, and
hence Xk # Xh for all h < k. Since each xk is a vertex of D, the number of iterations
is bounded from above by the number of vertices of D.
•
Thus, finite termination is ensured for this procedure, which can also be identified
with a polyhedral annexation (inner approximation) method, where the target set ia
. - k
G and the expanding polyhedra are PO' PI' ... (note that P k+1 = conv (P k U {z }),
with zk = (xk,f(xk)) t P k ).
281
For a successful implementation of this procedure, the main issue is, of course,
how to compute the functions <.Otc(x) and solve the relaxed problems (SP k ). We pro-
ceed to discuss this issue in the sections that follow.
(x,t) E u. This hyperplane is not vertical, 50 its equation has the form
Proof. The result follows because the trunk P k is defined by the inequalities:
•
The above formulas show that the function <.Otc(x) can be computed, once the
equations of the hyperplanes through the nonvertical facets are known. Since
the relaxed problem (SP k ) can be solved by separately solving each linear program
and taking
282
(37)
Thus, computing the functions ~ as weIl as solving (SP k ) are reduced to the com-
putation of the nonvertical facets of P k , or rather, the associated affine functions
, K,o-(x).
Cf),
Observe that the initial trunk PI has just one nonvertical facet which is the
n-fiimplex spanned by (vi, f(vi )) , i=I, ... ,n+1. Therefore, it will suffice to consider
the following auxiliary problem:
By translating if necessary, we may assume that 0 Eint MI and f(x) > 0 for all
x E MI' so that any trunk P with base X contains 0 E IRn + 1 in its interior. Under
these conditions, we shall convert problem (:7) into an easier one by using the fol-
lowing result, which is analogous to Proposition VIA, from which, in fact, it could
be derived.
through (1 does not contain 0, and so (1 must contain n+llinearly independent ver-
Conversely, let (q,~) be a vertex of S. Then t ~ ~ - qx for all (x,t) E vert P, and
hence for all (x,t) E Pj moreover, of these at least n+l linearly independent con-
facet cannot be vertical, because the coefficient of t in the equation of its hyperplane
is 1.
•
Corollary VI.3. Let
(38)
(39)
and conversely.
lowing problem:
284
(~) Suppose that the vertex set .Atk 01 Sk (defined in (98)) is known. Compute
the vertex set .At k+1 01 Sk+1 .
Since Sk+1 is also obtained by adding just one new linear constraint to Sk (for-
mula (39», problem (.At) can be solved by the available methods (cf. III.4.2). Once
the vertices of Sk+1 have been computed, the equations of the nonvertical facets of
In more detail, the computation of the nonvertical facets of P k can be carried out
First compute the unique nonvertical facet of P l' which is given by the unique
vertex of
(40)
and hence
(41)
where
(42)
At iteration k, the vertex set .At k of Sk is already known. Form Sk+ 1 by adding
to Sk the new constraint
285
where xk is the point to be added to Xk· Compute the vertex set .At k+1 of Sk+1
(by any available subroutine, e.g., by any of the methods discussed in III.4.2. Then
t = ~ -qx.
Compute a vertex i O of D.
Xl = {v 1,v 2,... ,v n +1}, SI = ((q,~): qo - qv 1 ~ f(v 1), i=1,2, ... ,n+1}. Let .At 1 be
the singleton {(f(v 1), ... ,f(vn + 1))Ql1}, where Q1 is the matrix (42). Set
obtaining a basic optimal solution w (q,qO) and the optimal value ß (q,~).
2) Compute
4b) Iff(ik ) < f(ik- 1), set iO t-ik , and return to Step 0.
5) Otherwise, let
Compute the vertex set .,K k+1 of Sk+1· Set f k+1 = .,K k+1 \ f k. Let k t- k+ 1
and return to 1).
Remarb VI.5. (i) Finiteness of the above algorithm follows from Theorem VI.5.
Indeed, since at each return to Step °(restart) the new vertex iO is better than the
previous one, Step 4b can occur only finitely many times. That is, from a certain
moment on, Step 4b never occurs and the algorithm coincides exact1y with the pro-
cedure described in Section VIA.I. Therefore, by Theorem VI.5 it must terminate at
a Step 4a, establishing that the last iO is a global optimal solution.
(ü) A potential difficulty of the algorithm is that the set .,K k ' i.e., the collection of
nonvertical facets of P k might become very numerous. However, according to Step
4b, when the current best feasible solution is improved, the algorithm returns to
Step °
with the new best feasible solution as xO. Such restarts can often accelerate
the convergence and prevent an excessive growth of 1.,K k I.
(iii) Sometimes it may also happen that, while the current best feasible solution re-
mains unchanged, the set .,K k is becoming too large. To overcome the difficulty in
that case, it is advisable then to make arestart, after replacing the polyhedron D by
D n{x: l (x-iO) ~ I}, where l (x-iO) ~ 1 is an Q-valld cut for (fl,D) at iO with Cl! =
f(iO). The finiteness of the algorithm cannot be adversely affected by such a step.
287
Now consider a finite set Xk in IRn such that D ( conv Xk and let
Since Xk is finite, '1\ is a polyhedral function and since D ( conv Xk ' we have
'I\(x) ~ f(x) Vx E D ,
min {'I\(x): xe D} ,
Therefore, if 'I\(xk ) = f(ik) lor the best leasible point i k so far encountered, then
i k solves the problem (BCP). Otherwise, since 'I\(x) = f(x) for any x E Xk and
and repeat the procedure just described with V\+1 in place of V\.
This is exactly the PU algorithm, if we start with Xl = {v l ,... ,vn +1}, where
MI = [vl, ... ,v n + l ] is an n-simplex containing D. To see this, it suffices to observe
the following
Proposition VI.13. The junction 'Pk defined in (~9) is identical to the concave
junction whose hypograph is the trunk Pk with base Xk.
Proof. For any x' E MI = conv Xk, let (X',t') be the point where the verticalline
through x' meets the upper boundary of P k. Then (x' ,t') belongs to some nonvertical
facet (J' of P k' so that t ' = Ilo - qx' , where t = qo - qx is the equation of the hyper-
plane through (J'. Since the latter is a supporting hyperplane of P k at (X',t'), we must
have qo - qv ~ f(v) for all v E Xk. Moreover, any hyperplane h(x) = t such that h(v)
~ f(v) Vv E Xk must meet the vertical through x' at a point (x',t*) such that t* ~ t '.
Therefore, t ' = min {t: h(x) = t, h affine, h(v) ~ f(v) Vv E Xk}j that is, V\(.) coin-
eides with the function whose hypograph is just P k .
•
Remark VI.6. Set vn+i+I:= xi, so that Xk = {vi, i E I k} with Ik =
{1, ... ,n+k+1}. By (43), for each x E MI' V\(x) is given by the optimal value in the
linear program
(the feasible set of L(xjXk )), then the relaxed problem min V\(D) is the same as
289
From this it is obvious that the crucial step consists in determining the set .Jt k.
On the other hand, instead of determining .Jt k one could also determine directly the
set :Y k of all nonvertical facets of P k , i.e., the linearity pieces of 1Ptc. For this,
observe that :Y 1 is readily available. Once :Y k has been computed, :Yk +1 n :Yk
~- qxk ~ f(x k), while :Y k+ 1\:Y k is given by the collection of all (q,~) that are
basic optimal solutions of the linear program L(xk ; Xk + 1).
Separable problems form an important dass of problems for which the polyhedral
where each f{) is a function of one variable which is concave and finite on the line
segment b. j = [rl j ] (but possibly discontinuous at the endpoints of the segment,
see Fig. VI.5). This situation occurs in practice, e.g., when D represents the set of
feasible production programs, and fP) is the cost of producing t units of the j-th
product. If a fixed cost and economies of scale are present, then seeking the cheapest
f. (t)
J
t
r---~----~------~~------~------
s.
J
Fig. VI.5
For this problem, many of the methods discussed previously either do not apply
or lose their efficiency, because they assume that the function f(x) is extendable to a
concave finite function over IRn, which is not the case here.
To apply the PU method, for each j let us choose a finite grid Xkj of points in the
segment Ll j such that r j , Sj E Xkj" Consider the piecewise affine function IPJcj(t), t E
IR, that agrees with fP) at each point of Xkj" Since we are dealing with functions of
Let i k be the best feasible solution so far available (i.e., the best among x 1, ... ,xk
and the other vertices of D that may have been encountered during the process of
computation). If 'P]c(xk) = f(i k ), then f(i k) ~ 'P]c(x) ~ fex) "Ix e D, hence i k solves
our problem (Bep) and we stop. Otherwise,
k n k k k n k
'P]c(x ) = E 'P]cJ'(x.) < f(i ) ~ fex ) = E f.(x.) , (44)
j=1 J j=1 J J
therefore, x~* ~ Xkj* for at least one j*. Setting Xk + 1,j* = Xkj * U {x~*} (j=I, ... ,n),
Xk + 1,j = Xk,j (j # j*), we can then repeat the procedure, with k -k+1.
Thus, if we start at k = 1 with X 1j = Ll j (j=I, ... ,n) and perform the above
adding to Xkj* a point x~* ~ X kj* . Since x k is a vertex of D, the set of all possible
x1 (j=I, ... ,nj k=I, ... } is finite. Hence, the sequence {Xl'X2,... } is finite. _
the relaxed problems (SP k ). Of course, since the objective function /(\(x) in each
problem (SP k ) is concave, piecewise affine, and finite throughout IRn, these problems
can be solved by any of the methods discussed previously. However, in view of the
strong connection between (SP k +1) and (SP k ), an effi.cient method for solving these
n
Let Xk = II Xk . , and denote by §k the partition of M determined by Xk '
j=l ,J
Le., the partition obtained by constructing, for each j=l,2, ... ,n, a11 the hyperplanes
parallel to the facets of M and passing through the points of Xk,j (let us agree to call
these hyperplanes partitioning hyperplanes of .9IiJ
For k=l, §1 = {M}, so that rp1(x) is affine and solving (SP 1), Le., finding
xl E argmin {rp1(x): x E D n M} ,
presents no difficulty. Set .91 1 = § 1. At the end of iteration k = 1,2, ... we already
have:
b) for each P E .9I k ' a point x(P) and a number J.L (P) are determined which are,
respectively, a basic optimal solution and the optimal value of the linear program
Now, to pass from iteration k to iteration k+l, we choose an index j* such that
x1* t Xk,j* and set Xk +1,j* = Xk,j* U {x1*}, Xk+1,j = Xk,j (j f j*). To solve
(SP k + 1) we proceed according to the fo11owing branch and bound scheme:
293
It is easily seen that the above process must be finite. Note that in this way it is
generally not necessary to investigate all of the members of the partition .9k+1; nor
is it necessary to find 1PJc+1 (x) explicitly.
CHAPTER VII
This chapter is devoted to a dass of methods for concave minimization which in-
vestigate the feasible domain by dividing it into smaller pieces and refining the par-
tition as needed (successive partition methods, branch and bound).
We shall discuss algorithms that proceed through conical subdivisions (conical al-
1. CONICAL ALGORITHMS
and Soland (1969) and by Horst (1976), in subsequent conical algorithms, the pro-
cess of conical subdivision was coupled with a lower bounding or some equivalent op-
eration, following the basic steps of the branch and bound scheme.
A first convergent algorithm of this type was developed by Thoai and Tuy (1980).
the original algorithm in Tuy (1964) and that of Thoai and Tuy (1980).
Let us begin with the following (DG) problem which was considered in Section
VI.2.
such that G = {x: p(x) ~ 1}, then D ( Gis equivalent to max p(D) ~ 1; in this case,
To construct a conical procedure for solving (1), by the branch and bound
scheme, we must determine three basic operations: branching, bounding and candid-
ate selection (cf. Section IV.2).
such that u # Azi VA ~ 0, i=1,2, ... ,n. As we saw in Section V.3.1, if u = E A.zi
iEI 1
(\ > 0) and ii is the point where the halfline from 0 through u meets 00, then
Ki = con(Qi)' i E I, with
Thus, to determine the branching operation, a rule has to be specified that assigns
to each cone K = con(Q), Q = (z1,z2, ... ,zn), a point u(Q) E K which does not lie
297
on any edge of K.
2) Bounding. For any cone K = con(Q), Q = (zl,z2, ... ,zn), the hyperplane
eQ-1x = 1 passes through zl,z2, ... ,zn, Le., the linear function h(x) = eQ-1x
agrees with p(x) at zl i,... ,zn. Rence, h(x) ~ p(x) for an x E K, and the value
~Q) = max {eQ-1x: x E K n D}
will satisfy ~Q) ~ max p(K n D). In other words, JL(Q) is an upper bound for p(K
n D) (note that (1) is a maximization problem).
3) Se1ection. The simplest rule is to se1ect (for furt her splitting) the cone
K = con(Q) with largest JL(Q) among an cones currently ofinterest.
Once the three basic operations have been defined, a corresponding branch and
bound procedure can be described that will converge under appropriate conditions.
Since the selection here is bound improving, we know from the general theory of
branch and bound algorithms that a sufficient convergence condition is consistency
of the bounding operation (cf. Section IV.3).
Let us proceed as follows. For each cone K = con(Q) denote by w (Q) a basic op-
timal solution of the linear program
. . eQ-1 x -1
LP(QjD) maXlIIllze subject to x E D, Q x ~ 0. (2)
above. For each s let tl = w(Qs)' uS = u(Qs) (in general, uS #- tl and even uS #- >.tl
for every >'), and denote by qS and tJ the points where the halfline from 0 through
tl meets the simplex [zSl,zS2, ... ,zsn] and the boundary 8G of G, respectively.
298
(3)
A cone splitting process is said to be ftOn7IGl (an NOS proceBs) if any infinite
nested sequence of cones that it generates is normal.
With the operations of branching, bounding and selection defined above, we now
state the procedure for solving (DG), which we shall refer to as the (DG)-procedure
and which will be used as a main subroutine in the algorithm to be developed for
concave minimization.
(DG)-Procedore:
1) Compute the intersections zOl,z02, ... ,zOn of the edges of K O with 00. Set
2) For each matrix Q e .? solve the linear program LP(Q,D) to obtain the optimal
value J.'(Q) and a basic optimal solution w (Q) of this program. If w (Q) ~ G for
some Q, then terminate: y = w (Q). Otherwise, w (Q) e G for all Q e .?; then go
to 3).
299
3) In .At delete all Q E !I' such that J.t(Q) S 1. Let .ge be the collection of remaining
matrices. If .ge = 0, terminate: D c G. Otherwise, .ge # 0; then go to 4).
4) Choose Q* E argmax {Jl (Q): Q E .ge}, and split K* = con(Q*) with respect to
5) Replace Q* by !I'* in .ge and denote by .At* the resulting collection of matrices.
Set !I' I - !I'*, .Atl-.At* and return to 2).
proof. Consider any infinite nested sequence of cones Ks = con(Qs)' s=I,2, ... ,
generated by the procedure. As previously, let J' = w (Qs)' uS = u(Qs) and denote
by qS and rJ the points where the halfline from 0 through J' meets the hyperplane
eQs-lx = 1 and aa, respectively. Then, by normality, we may assume, by passing
to subsequences if necessary, that IIqS - rJII -10. Rence, IIJ' - qSll S IIrJ - qSll -10
and since IIqSll is bounded, we have
Tbis means that the bounding is consistent (cf. Definition IVA). Since the selection
is bound improving, it follows from Theorem IV.2 that
max p(D) = 1,
and hence D c G. Furthermore, by Corollary IV.l, if the procedure is infinite, then
there exists at least one infinite nested sequence Ks = con(Qs) of the type just con-
sidered. Because of the normality and of the boundedness of the sequence J', we
may assume that qS - rJ -I 0, while the J' approach some y E D. Then
300
t1' - WS ---i 0, implying that WS ---i y. Since WS E 00, it follows that y E 00, which
proves the proposition.
•
Proposition Vll.2. Let G' be a compact conllu 8et contained in the interior of G.
Then after finitely many steps the (DG) procedure either establishes that D C G, or
else finds a point y e D \ G'.
Proof. Let us apply the (DG)-procedure with the following stopping rule: stop if
w (Q) ~ G' for some Q in Step 2, or if .9t = 0 in Step 3 (Le., D C G). Suppose that
the procedure is infinite. Then, as we saw in the previous proof, there is a sequence
t1' ---i y, where y e D n 00. In view of the compactness of both G and G' and the
fact that G' eint G, we must have d(y,G ') > O. Hence, t1' ~ G' for some sufficiently
luge S, and the procedure would have stopped.
•
Now we discuss the question of how to construct anormal conical subdivision pro-
cess.
it is said to be nondegenerate if lim lIeQs-l11 < IIJ, i.e., ifthere exiSts an infinite
8-i1lJ
proof. Suppose that the sequence Ks shrinks to a ray r. Then each point zSi
(i=1,2, ... ,n) approaches a unique point x* of r n 00. Hence, both qS and u S tend to
. qS - us --t 0•
x* , l.e.,
Now suppose that the sequence Ks is nondegenerate and denote by HS+ 1 the hy-
perplane through zS+1,1,i+1,2, ... ,zS+1,n and by LS+ 1 the halfspace not containing
Since the sequence qS is bounded, it follows from Lemma III.2 on the convergence of
cutting procedures that d(qS,Ls+ 1) --t 0, and hence d(qS,H sH ) --t O. But the equa-
(s E A, S --t 1Il).
•
Letting uS = iJ we can state the following consequence of the above proposition:
302
Corollary VII.1. (Sufficient condition for normality). A conical sub division pro-
cess is normal if any infinite nested sequence Ks = con (Qsl that it generates satisfies
either ofthe following conditions:
~) the sequence is nondegenerate, and for all but finitely many s the subdivision of Ks
is performed with respect to a basic optimal solution wS = w(Qsl of the associated
linear program LP(Qs,D).
This result was essentially established directly by Thoai and Tuy (1980), who
in the obvious way by the bisection of the simplex Z(Q). In other words, the bisec-
ively.
Given a simplex Z = [v1,v2,... ,vn] and a point w E Z, we define
303
Note that 5(Z) is the diameter of Z, while 5(w,Z) is the radius of the smallest ball
with center w containing Z.
The following lemma is closely related to Proposition IV.2.
5(w,Z) ~ q5(Z) .
Proof. Let w be the midpoint of [vi ,v2], with IIv2_vIII = 5(Z). Of course,
For i>2, since the line segment [w,vi] is a median ofthe tri angle [v l ,v2,vi] we have
proof. For any i such that (i > 0, denote by r i the point where the halfline from
vi through w meets the facet of Z opposite i Then w = vi + 8{ri -vi ) for some (J E
(0,1), and ri = E e·v-i, e· ~ 0, E e· = 1. Hence
j# J J J
n· . .
E (.vl = (1-(J)v1 + fJ( E eJ.vJ)
j=l J j#
304
IIw-vili ~ (1-(i)lIri-vili < p5(Z) whenever (i > O. The lemma follows immediately .•
(4)
where pE (0,1) is some constant. If for an infinite subsequence 11 ( {O,l,,,.} each ws,
sE 11, is the midpoint of a longest edge of Zs (i.e., ZS+l is obtained !rom Zs by a bt.-
(1)
Proof. Denote the diameter of Zs by lis' Since lis+1 ~ lis ' li s tends to some limit li
as S - - I (1). Assume that 5 > O. Then we can choose t such that p5s < 5 for an s ~ t.
From (4) it follows that for an s > t
. max IIws_vsili ~ ~s < 5 ~ 5s . (5)
1=1, .",n
Let us colour every vertex: of Zt "black" and color "white" every vertex of any Zs
with s > t which is not black. Then (5) implies that for s ~ talongest edge of Zs
must have two black endpoints. Consequently, if s ~ t and 5 E 11, then wS must be
the midpoint of an edge of Zs joining two black vertices, SO that Zs+ 1 will have at
least one black vertex: less than Zs' On the other hand, Zs+l can never have more
black vertices than Zs' Therefore, after at most n (not necessarily consecutive) bisec-
tions corresponding to SI < 52 < ".< Sn (Si E 11, SI ~ t), we obtain a simplex: Zs with
only white vertices, Le., according to (5), with only edges of length less than 5. This
305
(ii) It is not difficult to construct a nested sequence {Zs} such that Zs+! is ob-
tained from Zs by a bisection for infinitely many s, and nevertheless the intersection
n Zs is not a singleton (cf. Chapter IV). Thus, condition (4) cannot be omitted in
s
Proposition VII.4.
The importance of this result is that it serves as a basis for "normalizing" any
w (Q): basic optimal solution of the linear program LP(Q,D) associated with Qj
When u(Q) = w(Q), Le., K is subdivided with respect to w(Q), we shall refer to
This subdivision method obviously depends on the data of the problem and seems
to be the most natural one. However, we do not know whether it is normal. By Co-
rollary VrI.2, it will be normal whenever it is nondegenerate, i.e., if
~~: lIeQs-lll < 111 for any infinite nested sequenee of eones Ks = eon(Q) that it gen-
erates. Unfortunately, this kind of nondegeneraey is a condition whieh is very diffi-
On the other hand, while exhaustive subdivision processes are normal (and hence
ensure the convergenee of the (DG)-Procedure), so far the most commonly used ex-
haustive proeess - the bisection - has not proven to be very efficient computa-
tionally. An obvious drawback of this subdivision is that it is defined independently
of the problem's data, whieh partially explains the slow rate of eonvergenee usually
observed with the biseetion, as compared with the w-subdivision process in eases
307
To resolve this conflict between convergence and efficiency, the best strategy sug-
Set r(K o) = 0 for the initial cone K O = con(QO)' At each iteration, an index r(K)
Proposition Vll.5. The conical sub division process just described is normal.
Proof. Let Ks = con(Qs)' s=l, ... , be an infinite nested sequence of cones gener-
ated in this process. By completing if necessary, we may assume that KS+1 is an im-
the sequence is normal. Therefore, it suffices to consider the case then 6'(Zs) ~ 6' > O.
Let u(Qs) = u S, w(Qs) = ws. By Corollary VII.3, the eccentricity a(K s) cannot be
bounded by any constant p E (0,1). Consequently, there exists a subsequence {sh'
308
Sh
h = l,2, ... } such that o{Ks ) = 6(w ,Zs) /6 (Zs ) .... 1 as h .... 00, and by Lemma
h h h
VII.1, u Sh = wwh for all but finitely many h. By compactness, we may assume that
sh sh i . sh
w .... w*, while z ' .... z\ i = l,,,.,n. Since 6(Zs ) ~ 6 > 0, it follows that 6(w ,
h
Z ) - 6(Zs ) .... 0, and hence w* must be a vertex of the simplex Z* = [zl,,,.,zn],
sh h
say w* = zl. Obviously, since 0 eint G, zi is the unique intersection point of lJG
I ,sh I sh ,sh
with the halfline from 0 through z . Therefore, w .... z ,i.e. w - w .... 0 (h .... 00).
Noting that, in the notation of Definition VII.I, qsh = wSh, this implies normality of
the sequence.
•
Corollary vn.4. The (DG)-procedure 'lJ.Sing the basic NCS process can be infinite
s.t. Ax ~ b , (7)
x ~ 0, (8)
where we assume that the constraints (7) (8) define a polytope D, and the concave
objective function f: IR --+ IR has bounded level sets.
309
In view of Corollary VI1.2, we can solve the problem (BCP) by the following two
Phase I:
Search for a local minimizer xo which is a vertex of D such that f(xO) ~ f(z).
phasen:
Let a = f(xO)--c. Translate the origin to xO and construct a cone K O ) D. Using the
basic NCS process, apply the (DG)-procedure for G = {x: f(x) ~ f(xO)--c}, G' = {x:
f(x) ~ f(xO)}. If D ( G, then terminate: xO is a global c-optimal solution. Otherwise,
a point y E D \ G' is obtained (so that f(y) < f(xO)): set z f- y and return to
Phase I.
minimizer. Also, a concavity cut can be added to the current feasible set before a re-
turn to Phase 11. Incorporating these observations in the above scheme and replacing
Select c ~ 0.
IniUalization:
Phase I:
Starting with z find a vertex xO of D such that f(xO) ~ f(z). Let xbe the best among
xO and all the vertices of D adjacent to XOj let 'Y = f(X).
310
phasen:
Select an infinite increasing sequence ß of natural numbers.
0) Let 0 = '"1-E. Translate the origin to x O and construct a cone KO J D. For each
i=I,2, ... ,n compute the point zOI where the i-th edge of K O meets the surface
01 02 On
f(x) = o. Let QO = (z ,z ,... ,z ),.At = {QO}' .9' = .At, r(QO) = 0 .
-1 -1
LP(QjM) maxeQ x s.t. xEM,Q x~O
to obtain the optimal value J.'(Q) and a basic optimal solution w(Q).
2) In .At delete all Q E .9' satisfying J.'(Q) ~ 1. Let .9t be the collection of remaining
(perform a bisection) and set r(Q) = r(Q*)+1 for every member Q of the
partition.
4) Let .9'* be the partition of Q* , .At * the collection obtained from .9t by replacing
Theorem Vll.1. For c: > 0 the normal conical algorithm terminates after finitely
many steps at aglobai c:-optimalsolution.
Proof. By Corollary VIIA, where G = {x: f(x) ~ Cl}, G' = {x: f(x) ~ 7}, Phase II
must terminate after finitely many steps either with ~ = 0 (the incumbent i is a
global c:-optimal solution), or else with a point w (Q) such that f(w (Q)) < 1. In the
latter case, the algorithm returns to Phase I, and the incumbent i in the next cyde
of iterations will be a vertex of D bett er than all of the vertices previously
encountered. The finiteness of the algorithm follows from the finiteness of the vertex
set of D.
•
As previously remarked (cf. Section V.3.3), if c: is sufficiently small, avertex of D
which is aglobaI c:-optimal solution will actually be an exact global optimal solu-
tion.
The algorithm will still work for c: = 0, provided that the points zOl # °
(i=1,2, ... ,n) in Step °can be constructed. The latter condition holds, for example, if
xO is a nondegenerate vertex of D (because then the positive i-th coordinate axis
will coincide with the i-th edge of D emanating from xO).
(i) As in the cut and split algorithm (Section V.3.2), when KO = IR~ (which is
the case if the problem is in standard form with respect to xO), the linear program
LP(QiM) in Step 1) can be solved without having to invert the matrix Q. Actually,
if Q = (z l i,... ,zn) and ifthe additional constraints (cuts) that define Mare Cx ~ d,
then in terms of the variables ().1').2''''').n) = Q-lx this program can be written as
313
n
LP(Q,M) max E >.. (9)
j=1 J
(ii) As in the PA method (Section VI.2), we return to Phase I (i.e., we ruto.rt a new
cycle) whenever an w (Q) is found with f(w (Q) < 7. Restarting is also possible when
w (Q) is a vertex of D, no matter what f(w(Q» iso Then, instead of returning to
Phase I, one should simply return to Step 0, with xOI - w (Q). In this way the new
xOwill be different from all of the previous ones, so that the convergence of the algo-
rithm will still be ensured (recall that a concavity cut should be made before each
restart).
Sometimes it may happen that at a given stage, while no w (Q) satisfying the
above conditions is available, the set ~ of cones that must be investigated has
become very numerous. In that event, it is also advisable to return to Phase I with
z I - w (QO)' M I- Mn {x: eQo-lx> I}, D I - Mn {x: eQo-lx> I} (note that not
only M, but also the original feasible domain D is reduced). Of course, a finite num-
ber of restarts of this kind will not adversely affect the convergence.
Computational experience reported in Horst and Thoai (1989) has shown that a
judicious rest art strategy can often substantially enhance the efficiency of the algo-
rithm by keeping ~ within manageable size and correcting a bad choice of the
starting vertex xO.
(Tuy (1991a)):
(*) Select a natural number N and a sequence l1k 1 o. At the beginning set r{QoJ = 0
/or the initial cone KO= con{QoJ. At iteration k, i/ r{Q*) < N and p.{Q*) - 1 > 11",
then per/orm an w-subdivision 0/ Q* and set r{Q) = r{Q*) + 1 /or every member Q
0/ the partition," othenoise, per/orm a bisection 0/ Q* and set r{Q) = O/or every
member Q o/the partition.
It has been shown in Tuy (1991a) that this rule generates indeed anormal
subdivision process. If N is chosen sufficiently large, then the condition r(Q*) < N
almost always holds (in the computational experiments reported in Zwart (1974),
r(Q*) rarely exceeds 5 for problems up to 15 variables) and the just descrihed rule
(*) practically amounts to using w-subdivisions if p.( Q*) - 1 S l1k and bisections
otherwise.
Since the value p.(Q*) - 1 indicates how far we are from the optimum, the fact
p.(Q*) - 1 S '1tt means, roughly speaking, that the algorithm runs normally
(according to the "criterion" {'1tc} supplied by the user). Thus, in practical
implementations w-subdivisions are used as along as the algorithm runs normally,
and bisections only when the algorithm slows down and threatens to jam.
In extensive numerical experiments given in Horst and Thoai (1989) the following
* Ai>
(**) Choose c > 0 su/ficiently smalL Use an w-subdivision i/min {Ai: * O} ~ c,
and a bisection othenoise.
(iv) It follows from Theorem VII.2 that Algorithm VIll with E = 0 will find an
exact global optimal solution after finitely many steps, but it may require infinitely
many steps to recognize this global optimal solution as such. A similar situation may
315
occur with c > 0: though a global c-optimal solution has already been found at a
very early stage, the algorithm might have to go through many more steps to check
the global c-optimality of the solution attained. This is not a peculiar feature, but
rat her a typical phenomenon in these types of methods.
Finally, the assumption that f(x) has bounded level sets can be removed. In fact,
if this assumption is not satisfied, the set {x: f(x) ~ a} may be unbounded, and the
a-extension of a point may lie at infinity. By analogy with the cut and split algo-
rithm, Algorithm VII.1 can be modified to solve the BCP problem as folIows.
To each cone K we associate a matrix Q = (zl,z2, ... ,zn) where zi is the intersec-
tion of the i-th edge of K with the surface f(x) = a, if this intersection exists, or the
direction of the i-th edge otherwise. Then, in the problem LP( Q,M), the vector e
should be understood as a vector whose i-th component is 1 if zi is a point, or 0 if zi
is a direction. Also, if I = {i: zi is a point}, then this linear program can be written
as
max E )..
jeI J
n .
s.t. E ).. (AzJ) 5 b,).. ~ 0 (j=l, ... ,n).
j=l J J
Tolerance e = 1O-Q.
With the heuristic subdivision rule (**), where the parameter c was chosen to c =
First cycle
Phase I:
-2.055110.
phasen:
0) Q = -2.055111. The problem is in standard form with respect to XOj KO= IR!.
01
z = (1.035485, 0.000000, 0.000000, 0.000000)
02
z = (0.000000, 0.666669, 0.000000, 0.000000)
Iteration 1
2) .9l 1 = .J( 0
4) .9'* = {Q11 ' Q1 2 ' ... , Q1 4}' .J( 1 =.9'* .9'1 = .9'* .
" ,
Iteration 2
Phase I:
Y
02
= (1.216328, 1.245331, 0.818956, 2.366468)
03
Y = (1.134454, 0.941176, 0.000000, 5.151261)
Y04= (0.957983, 0.991597, 0.000000, 5.327731)
Current best point i = (1.083760, 1.080259, 0.868031, 0.00000) with
f(X) = -2.281489.
Phase ll:
After 20 iterations (generating 51 subcones in all) the algorithm finds the global
value -2.281489.
318
Thus, the global optimal solution is eneountered at the end of the first eycle (with
two iterations), but ehecldng its optimality requires a second eycle with twenty more
iterations.
Aeeounts of the first eomputational experiments with the normal eonieal
algorithm and some modifications of it ean be found in Thieu (1989) and in Horst
and Thoai (1989) (cf. the remarks on page 260). For example, in Horst and Thoai
(1989), among others, numerieal results for a number of problems with n ranging
from 5 to 50 and with different objeetive funetions of the forms mentioned in Section
VI.2.4, are summarized in the table below (the eolumn fex): form of the objeetive
funetionj Res.: number of eyclesj Con: number of eones generatedj Bi: number of
biseetions performed). The time includes CPU time and time for printing
intermediate results (the algorithm was eoded in FORTRAN 77 and run on an
IBM-PS 11, Model 80, Dos 3.3).
Table VII.I
319
In this section we discuss two alternative variants of the normal conical algo-
rithm.
Note that the above Algorithm VII.1 operates in the same manner as a branch
and bound algorithm, although the number J.t{Q) associated with each cone
K = con(Q) is not actually a lower bound for f(M n K), as required by the conven-
tional branch and bound concept. This is so because Phase II solves a (DG)-problem
with G = {x: f(x) ~ a} and, as seen in Section VILl.1, J.t{Q) is an upper bound for
cal algorithm can be developed where a lower bound for f(M n K) is used in place of
J.t{Q) to determine the cones that can be deleted as non promising (Step 2), and to
select the candidate for further subdivision (Step 3).
through w(Q) with the ray through zi. Since the simplex [O,i 1,i 2,... ,i n] entirely
contains M n K, and f(ii) < f(O) (i=1,2, ... ,n), the concavity of f(x) implies that
f(M n K) ~ min {f(ii), i=1,2, ... ,n}. Hence, if we define ß(Q) inductively, starting
with (QO)' by the formula
a ifp,(Q)~ 1;
ß(Q) ={ -1 -2 -n
max {ß(Qanc)' min [f(z ),fz ), ... ,f(z)]} if p,(Q) >1,
where Q denotes the immediate ancestor of Q, then dearly ß(Q) yields a lower
anc
bound for min f(M n K) such that ß(Q) ~ ß(Q') whenever con(Q') is a subcone of
con(Q).
320
Based on this lower bound, we should delete any Q E !/' such that ß( Q) ~ a, and
choose for further splitting Q* e argmin {ß(Q): Q E .9t}. Note, however, that
ß(Q) > a if and only if J.t(Q) ~ 1, so in practice a change will occur only in the selec-
tion of Q*.
Theorem VII.3. The concl1LSions of Theorems VII.1 and VI!.2 stiU hold for the
variant of the normal conical algorithm where ß(Q) is 1LSed in place of p.(Q) in Steps 2
and 9.
Ks = con(Qs)' s=1,2, ... generated in Phase 11 (by Corollary IV.1, such a sequence
exists if Phase 11 is infinite). Let J = w (Qs) and, as before, denote by WS and qS,
respectively, the a-extension of J and the intersection of the hyperplane
eQs-1x = 1 with the ray through J. Then f(J) ~ '1 = a+e (see Step 1) and
JE [qS,WS]. Since qS - WS -+ 0 (s -+ 111) by the normality condition, we deduce that
J- WS -+ 0, and hence, by the uniform continuity of f(x) on the compact set
G = {x: f(x) ~ a}, we have f(J) -f(WS) -+ 0, Le., f(J) -+ a. We thus arrive at a
contradiction, unless e = O.
On the other hand, by the same argument as in the proof of Proposition VII.1, we
it follows that ß(Qs) -+ '1 (s -+ 111). This proves the consistency of the bounding op-
eration.
Since the selection operation is bound improving, we finally conclude from The-
orem IV.2 that
321
For each cone K = con(Q), Q = (zl,z2, ... ,zn), with f(i) = 7< f(O) (i=1,2, ... ,n),
we define
where zi = I'CQ)zi, and I'CQ) is the optimal value of the linear program
-1
LP(Q,D) max eQ-1x s.t. x E D, Q x ~ O.
Algorithm VII.I*.
1) For each Q E Jl k solve LP(Q,D) to obtain the optimal value J.'(Q) and abasie
4) Let xk+1 be the best point among xk, w(Q) (Q E Jl k), and the point (ifit exists)
where the splitting ray for con(Qk) meets the boundary of D. Let
k+1
Ik+1 = f(x ).
Let Jl k+1 be the partition of Qk obtained in Step 4). For each Q E Jl k+1 reset
Q = (zl,z2, ... ,zn) with zi a point on the i-th edge of K = con(Q) such that
f(zi) = Ik+1 (i=1,2, ... ,n). Let .J(k+1 = (.ge k \{Qk}) U Jl k+1. Set k f- k+1 and
return to 1).
Proof. First observe that Proposition VII.5 remains valid if the condition zSi E
8G is replaced by the following weaker one: zSi E 8G s ' where Gs is a convex subset
of a convex compact set G, and 0 Eint Gs ( int GS+ 1 . Now, if the procedure is in-
finite, it generates at least one infinite nested sequence of cones Ks = con(Qs)' sET
( {O,1,2, ... }, with Qs = (zsI,zs2, ... ,zsn) such that f(ii) = I S (i=1,2, ... ,n) and 15 L1
proof of Proposition VILl, J.'(Qs) -+ 1. This implies that lim v(QS'/S) ~ I. But
s
since
323
this is possible only if c = 0. Then ß(Qs) --I 7, Le., the lower bounding is consistent.
Remarks VII.3. (i) If the problem is in standard form with respect to xO, so that
then one can take K O = IR! and the linear program LP(Q,D), where Q = (zl, ... ,zn),
can be written as
max E)'.
. J
J
s.t. ~ ).lzj ~
J
b, ).j ~ ° (j=1,2, ... ,n).
(ii) As with Algorithm VII.1, the efficiency of the procedure critically depends
upon the choice of rule for cone subdivision. Although theoretically an NCS rule is
An advantage of the one phase Algorithm VIl.h is that it can easily be extended
where f: IRn --I IR is a concave function and D is a closed convex set defined by the in-
equality
g(x) ~ °,
with g: IRn --I IR a convex function.
Assume that the constraint set Dis compact and that int D f 0.
324
When extending Algorithm VII.1* to problem (CP), the key point is to develop a
method for estimating abound ß(Q) ~ min f(D n K) for each given cone K = con(Q)
such that the lower bounding process is consistent.
Tuy, Thieu and Thai (1985) proposed cutting the cone K by a supporting hyper-
plane of the convex set D, thus generating a simplex containing D n K. Then the
minimum of fex) over this simplex provides a lower bound for min f(D n K). This
method is simple and can be carried out easily (it does not even involve solving a lin-
ear program); furthermore, it applies even if Dis unbounded. However, it is practi-
cal only for relatively small problems.
A more efficient method was developed by Horst, Thoai and Benson (1991) (see
also Benson and Horst (1991)). Their basic idea was to combine cone splitting with
outer approximation in a scheme which can roughly be described as a conical proced-
ure of the same type as Algorithm VII.1*, in which lower bounds are computed
using an adaptively constructed sequence of outer approximating polytopes
DO J D1 J ... J D.
The algorithm we are going to present is a modified version of the original
method of Horst, Thoai and Benson. The modification consists mainlY in using an
NCS process instead of a pure bisection process for cone subdivision.
Assume that f(O) > min f(D) and that a cone KO is available such that for any
xE K O \ {O} the ray {1X: T > O} meets D. Denote by x the point fJx where
0= sup {r: 1X E D}. Select E~ 0 and an infinite increasing sequence !:J. C {0,1,2, ... }.
0) For each i=1,2, ... ,n take a point yi # 0 on the i-th edge of KO and compute the
corresponding point i = 0iYO' Let xO E argmin {f(yi), i=1,2, ... ,n}, 10 = f(xO),
and let zOi be the 1 0-extension of yi (i=l, ... ,n) (cf. Definition V.1). Set
325
1) For each Q E 9l k ' Q = (zl,z2, ... ,zn), solve the linear program
to obtain the optimal value J.&(Q) and a basic optimal solution w(Q) of this
. -i. -i i
programj compute v(Q,'Yk) = lIl1n (f(z ), 1=1,2, ... ,n}, where z = J.&(Q)z , and let
5) Let xk+1 be the best point among xk , u(Qk) and all W(Q) for 9 E 9lk j let
k+1)
'Yk+1 = fex .
Theorem. VTI.5. Algorithm VI!.! can be infinite only i/ e = o. In this case 'Yk ! 'Y,
and e'IJery accumulation point 0/ the sequence {zk} is a global optimal solution 0/
(CP).
the simplex [zsl, ... ,zsn] and the surface f(x) = I S' respectivelyj also recall that 'i1' is
the point where this halfline meets the boundary 8D of D. Since f(w) = f(zSi) = I S
and IS ! I, as in the proof of Theorem VIIA it can be seen that the sequence
Ks = con(Qs) is normal, Le., lim (qS - cJ) = O. Hence, in view ofthe boundedness of
DO' we may assume, by taking subsequences if necessary, that "qS, cJ tend to a
common limit qlll, while ws, 'i1' tend to Will, Will, respectively. We claim that
W
III
=W-Ill
•
This implies that zSi - zSi -I 0, where zsi = JL(Qs)zsi. Since f(zsi) = I S' it then fol-
lows from the definition of II(QS,/S) that lim v(QS,/S) ~ I' Hence, noting that
bounding is consistent. Since the candidate selection here is bound improving, the
concIusion follows from Theorem IV.2.
•
327
--..5;
.z
.... f(x)= 0
~ s
.. ~ ....
........... .
s
·w
.................
Fig. VII.l
Remar1ts VII.4. (i) The above algorithm differs from Algorithm VIl.h only in
the presence of Step 4. If Dis a polytope and we take DO = D, then Dk = D Vk, and
the above algorithm reduces exactly to Algorithm VII.h.
(ii) If DO = {x: Ax ~ b} and Dk = {x: Ax ~ b, Cx ~ d}, then for Q = (zl,z2, ... ,zn)
the linear program LP(Q;D k) is
(iii) In the general case when an initial cone KO as defined above is not available,
Step 0) should be modified as follows.
let QOi be the matrix with columns zOj, j E {l, ... ,n+1}\{i}. Set .J{O = .9>0 = {QOi'
i=l, ... ,n+1}. Construct a polytope DO ) D. Set k=O.
domain.
Consider, for example, Algorithm Vll.2. If D is unbounded, then Dk is unbounded,
and in Step 2 the linear program LP(Q,D k ) may have no finite optimal solution.
That is, there is the possibility that I'(Q) = +11), and ,) = w (Qk) is not a finite
point but rather the direction of a halfline contained in Dk . If we assurne, as before,
that the function f(x) has bounded level sets (so that, in particular, it is unbounded
from below over any halfline), Algorithm VII.2 will still work, provided we make the
following modifications:
When ,) = w (Qk) is a direction, the 'k-extension of wk should be understood as
a point r) = )..CJ such that f()"cJ) = 'k. Moreover, in Step 4, if J is a recession
direction of D, then the algorithm terminates, because inf f(D) = -11) and f(x) is
unbounded from below on the halfline in the direction ,). Otherwise, the halfline
parallel to cJ intersects the boundary of D at a unique point iJ f O. In this case let
pk E 8g(wk), and
Dk +1 = Dk n{x: pk(x_wk) ~ O} .
Theorem VII.6. Assume that the function f(z) has bounded level sets. For e =0
consider Algorithm VII.2 with the above modifications in Steps 2 and 4. Then
f(i)! inff(D).
Proof. It suffices to consider the case where 'k! , > - 11). Let Ks = con(Qs) be
any infinite nested sequence of cones, with Qs = (zSl ,zs2 ,... ,zsn), f(zsi) = 's
329
(i=1,2, ... ,n). We claim that JL(Qs) = +111 for at most finitely many s. Indeed, suppose
the contrary, so that, by taking a subsequence if necessary, we have JL(Qs) = +111 Vs,
that W ( D, and hence w+r ( D. Since f(x) is unbounded from below on r, we can
take a ball B around some point of r such that B ( w+r ( D and f(x) < '1 Vx e B.
Then for all sufficiently large s, the ray in the direction cl will intersect B at some
point x e D with f(x) < '1. This implies that f(WS) < '1 $ '1s +1' contradicting the
definition of '1s+1. Therefore, we have JL(Qs) < +111 for all but finitely many s.
Arguing as in the first part of the proof of Theorem VII.5, we then can show that
qS _ cl' -i o. Since f(WS) ~ '1, the sequence {WS} is bounded.
If 11 cl-WS 11 ~ 1] > 0 for infinitely many 5, then, by taking the point yS e [WS,cl] such
that lIys-WS 11 = 1], we have ls(Ys) > 0, ls(x) < 0 Vx e D. Hence, by Theorem III.2,
lim yS e D, Le., yS - WS - i 0, a contradiction. Therefore, cl - WS -i 0, and, as in the
last part of the proof of Theorem VI1.5, we conclude that ß(Qs) T '1.
We have thus proved that the lower bounding is consistent. The conclusion of the
theorem then follows from the general theory of branch and bound methods
(Theorem IV.2).
•
In the general theory of branch and bound discussed in Chapter IV, we have seen
subdivisions in which a cone is most often split into more than two subcones. This
dass was introduced by Utkin (see Tuy, Khachaturov and Utkin (1987) and also
then be exhaustive.
Recall from Section IV.3 that a radial subdivision process is specified by giving a
function w{Z) which assigns to each (n-l)--simplex Z = [vl,v 2,... ,vn] a point
w{Z) E Z \ V{Z), where V{Z) = {vl,v2,... ,vn }. If
n. n
w{Z) = E A. VI , A. ~ 0 , E A. =1 ,
i=1 I I i=l I
then Z is divided into subsimplices Z{i,w) that for each i E I{Z) are formed by
substituting w{Z) for vi in the vertex set of Z. In order to have an exhaustive radial
subdivision process, certain conditions must be imposed on the function w{Z). Let us
denote
We see that 5{Z) is the diameter (the length of the longest edge) of Z, whereas
6{i,Z) is the length of the longest edge of Z incident to the vertex vi. The following
conditions define an exhaustive class of subdivisions.
331
There exists a constant p, 0 < p < 1, such that for any simplex Z = [vI, v2, ... , vn] "
Proof. It suffices to show that IIw(Zs)-vS+1,i ll $ p6(Zs) for all i E {l, ... ,n} \ {is}.
But for i E I(Zs) \ {i s} this follows from (11), while for i E {l, ... ,n} \ I(Zs) we have,
by (12):
Lemma VIIA. For every s at least one ofthe foUowing relations holds:
(13)
(14)
Since Zs+ 1 is obtained from Zs by replacing the vertex of index is by w(Zs)' for
every i we can write:
(15)
Hence, if i E I(Zs+1)' Le., 5(i,Zs+1) > p5(ZS+1) = p5(Zs)' then by the previous
lemma, 5(i,Zs+1) = Ö(i,Zs)' and consequently p5(i,Zs) > 5(Zs)' Le., i E I(Zs)' There-
fore,
(16)
333
(17)
Indeed, the second inclusion follows from the inequality 5(Zl) ~ 5(Zs). To check
the first, observe that if i ;. I s ' i.e., if 5(i,Zs) ~ p5(Zl)' then from (15) and Lemma
VII.3 (which implies that 5(i s ,ZS+1 ~ p5(Zl» we derive the inequality 5(i,ZS+1) ~
p5(Zl); hence i ;. IS+ 1 . FlOm (17) it follows, in particular, that hs ~ 1.
(18)
while
(19)
From Lemma VIIA it then follows that II(ZS+h) I ~ II(Zs) I-h; hence, by (17) and
(18),
Therefore, for h ~ h s either (18) or (19) does not hold. If (18) does not hold, then
in view of (17) we must have (16). Otherwise, there exists r such that 5 ~ r < s+h
and
(20)
If I
r = 0, then (16) trivially holds, since I f 0. Consequently, we may assume
s
that Ir f 0. From (20) it follows that 5(ir ,Zr ) = 5(Zr ), and hence
334
This implies that ir E Ir. But, by Lemma VI1.3, 6(i r ,Zr+1) S p6(Zr) S p6(Zl)' so
that ir ~ I r +1 . Taking account of (17), we then deduce that Il r +1 1 < 1Ir I. and
hence (16) holds since 1IS+h 1 S 1I r +1 1 and 1Ir 1 S 1I s I. This completes the proof of
Lemma VII.5.
(21)
Since any nonempty It has at least two elements, it follows that Ilh 1 ~ 2.
v-1
Rence, '" ~ n-1 and Ilhj-1 1 ~ 2 + (v-j). Substituting into the expression for hj we
obtain
Therefore, h", S hll'-l + n-1 S hv-2 + (n-2) + (n-1) ~ ... ~ hO+ (n-",) +... + (n-2) +
(n-1) ~ s + n(n-l)/2. Thus, Is+ p = 0, implying the inequality (21).
•
335
VII.6 is established.
Example VI1.2. Conditions (11) (12) are satisfied, with p = 1-I/n, if for each
(n-l)-fiimplex Z we let
Indeed, condition (12) is obvious, while for ieI(Z) we have, by setting /I = 1/ II(Z) I:
the Young inequality for the radius of the smallest ball containing a compact set
with a given diameter (cf., e.g., Leichtweiß (1980)), one can show the existence of a
subdivision process of the type (11) and (12) with p = [(n-l)/2n] 1/2 $ 2-1/2. For
another discussion of generalized bisection, see Horst (1995) and Horst, Pardalos and
Thoai (1995).
provide an adequate answer to the original problem, so that finite convergence may
tion VILl.3.
Given the initial cone KO = IR~ and the (n-1)-fiimplex Zo = [el,e2,... ,en], which
j] .
ki kj
Bk =[ vki -vkJ ' i <
IIv -v 11
The columns of Bk represent the unit vectors in the edge directions of the
erate (briefly, an END process) if for any infinite nested sequence of simplices Zk=
[Jl,J2, ... ,Jn] obtained by the sub division method, any convergent subsequence of
the aBsociated sequence {Bk} converges to a matrix B ofrank n-l.
Basically, this property ensures that when Qk = (zk1,zk2, ... ,zkn), where zki is the
intersection of the ray through vki with the boundary of a given bounded convex set
G whose interior contains 0, then, under mild conditions, the vector eQk-1 tends to
anormal to G at the point z* = lim zki (k - f IV).
To show this, we need some lemmas.
Lemma Vll.7. Let C be a convex bounded set containing 0 in its interior. Then the
mapping ?r that aBsociates to each point x f 0 the intersection ?r(x) of the boundary of
C with the ray (from 0) through x, is Lipschitzian relative to any compact subset S of
IR n not containing o.
From this the lemma follows, since S is compact, the convex function p(x) is Lip-
schitzian relative to S (cf. Rockafellar (1970), Theorem 10.4) and, moreover, IIxll is
337
generated in an END subdivision process on the initial simplex ZOo Assume that
7< f(O) (f(x) is the objective function of the BCP problem) and let zki = 0ki vki be
the ~xtension of vki . Since the subdivision is exhaustive, the sequences {vki }
(i=1,2, ... ,n) tend to a common limit v* and if z* is the 7-€xtension of v* then
z*-li
- m zki(k ) 1-
'-12
- - I ID , , , ... ,n.
Lemma Vll.8. There exist two positive numbers 1'Jij and Pij such that for i < j and
every k:
Proof. The second inequality is due to the Lipschitz property of the mapping 71",
where 7r{x) denotes the intersection of the boundary of the convex set
G = {x: f(x) ~ 7} with the ray through X. The first inequality is due to the Lipschitz
property of the mapping (J, where a(x) denotes the intersection of Zo with the ray
through x (Zo can be embedded in the boundary of a convex set containing 0 in its
interior).
•
Also note the following simple fact:
Lemma Vll.9. There exist 01 and 02 such that 0 < 01 ~ 0ki ~ 02 for i=I,2, ... ,n
and every k.
ki
Proof. We have 0ki = l/p(v ), where p(x) is the gauge of the convex set
G = {x: f(x) ~ 7}. The lemma then follows from the continuity of p(x) on ZO' since
any continuous function on a compact set must have a maximum and a: minimum on
338
this set.
•
Now for each simplex Zk = [vk1,vk2, ... ,vkn] define the matrix
ki kj
Uk = [ II:ki:kJi!' i < j ] .
Lemma VII.I0. There exists a subsequence A ( {k} such that Us ' se A, tends to
such that lim Bs = B as s -Im, S e A, with rank(B) = n-l. Let bij , (i,j) e I, be a set
of n-l linearly independent columns of B. We mayassume that (zsi~sj)/lizsi ~sjll
converges to uij for i < j. Let us show that the vectors uij , (i,j) e I, are linearly in-
dependent. Suppose that E a..uij = o.
(i,j)eI IJ
We may write
and therefore
where
(24)
By Lemma VII.9 and Lemma VII.8, we may assume that this latter sequence con-
verges. Then from equation (22) we may also assume that (Osi-Osj)/lizSi - zsjll con-
verges. Multiplying both sides of (23) by e = (1,1, ... ,1) and letting s -Im, we obtain
339
Now, taking the limit in equation (23) as S -I ID, we deduce from the above equality
that
Since the bij , (i,j) E I, are linearly independent, this implies that ßij = 0 for (i,j) E I.
But from (24) it follows, by Lemma VII.9 and Lemma VII.S, that the only way
ßij = 0 can hold is that (}ij = 0 for all (i,j) E I. Therefore, the vectors uij , (i,j) E I,
Recall that for any simplex Zk = [ vkl ,vk2 ,... ,Vkn] , Qk denotes the matnx
.
of
kl k2 kn
columns z ,z ,... ,Z .
Proposition Vll.7. Let {Zk} be any infinite nested sequence 01 simplices generated
But f(zsj) = f(zsi) = "{, and therefore, dividing by IIz sj-2si li and letting s - I ID we
obtain
q*uij = 0 Vi <j ,
340
Le., q*U = 0, where rank(U) = n-l. On the other hand, one can assume that the
subsequence d = {s} has been chosen in such a way that the vectors qS = eQ;I/
IIeQ;I 11 converge. Since eQ;I(zSj-zsi) = 0, it follows that if q = lim qS, then we also
have quij =0 Vi < j, i.e., qU = o. This implies that q = :I: q*. But eQ;lzSi = 1 for
any s, hence qz* = 1, and so we must have q = -q*, proving the proposition. _
Because of the above property, END subdivision processes can be used to ensure
finite convergence in conical algorithms for concave minimization over polytopes.
More specifically, let us select an END process and consider Algorithm 1* for the
BCP problem, in which the following subdivision method is used:
Theorem W.7. Assume that the function f(z) has bounded level sets and is eon-
tinuously differentiable (at least in a neighbourhood of any global optimal solution of
(BCP)). Furthermore, assume that any global optimal solution of (BCP) is astriet
loeal minimum. Then for t = 0 Algorithm VII. 1* with the above sub division method is
finite.
optimal solution. In view of (25), then, for large enough s, say for s ~ sO' the linear
program LP(Qs;D), i.e., max {eQs-1x: x E D n eon(QsH will have z* as its unique
optimal solution, i.e., w(Qs) = z*. Then eon(Qs) is subdivided with respeet to z*.
So we must have z* E {zsl, ... ,zsn}
But z* is eommon to all eon(Qs); henee for all s >
and eQs-1z* = 1. Sinee z* solves LP(Qs;D), it follows that J.t(Qs) = max {eQs-1x:
xE D n eon(QsH = 1, contradicting the fact that J.t(Qk) > 1 for any k. This proves
finiteness of the algorithm.
•
Remarks vrr.6. (i) An example of an END process is the following: Given a
. 1ex Zk
Slmp = [vk1 ,vk2 ,... ,vkn] ,we dVIi 'ed1' t 'mto 2n- 1 sub'
Slmpli ces by t he
(n-2)-hyperplanes through the midpoints of the edges (for each i there is an (n-2)-
hyperplane through the midpoints of the n-1 edges of Zk emanating from vk1 ).
Clearly, the edge directions of each sub simplex are the same as those of Zk' There-
fore, the sequence {Bk} will be constant.
But of course there are subdivision processes (even exhaustive subdivision pro-
cesses) which do not satisfy the END condition. An example is the classical bary-
centric subdivision method used in combinatorial topology. Fig. VII.2 illustrates the
case n = 2: the edge lengths of Zk+1 are less than 2/3 those of Zk' and therefore the
process is exhaustive. It is also easy to verify that the sequence {Bk} converges to a
matrix ofrank 1.
(ii) The difficulty with the above method of Hamami and Jacobsen is that no
sufficiently simple END process has yet been found. The authors conjecture that
bisection should be an END process, but this eonjecture has never been proved or
disproved. In any case, the idea of using hybrid subdivision proeesses, with an
emphasis on subdivision with respect to vertices, has inspired the development of
342
1 1
v v
2 .3 2 .3
v v v v
Fig. VII.2
(iii) From Proposition VII.7 it follows that if the function f(x) is continuously
differentiable, then the END process is nondegenerate in the sense presented in Sec-
tion VII.l.3. Thus, the condition of an END process is generally much stronger than
the nondegeneracy condition discussed in normal conical algorithms.
2. SIMPLICIAL ALGORITHMS
In this section we present branch and bound algorithms for solving problems
The first approach for minimizing a concave function f: IRn ---. IR over a nonpoly-
hedral compact convex set D was the simplicial branch and bound algorithm by
Horst (1976) (see also Horst (1980)). In this algorithm the partition sets M are
simplices that are successively subdivided by a certain process. A lower bound ß(M)
343
minimizing IPM over D n M. Recall from Theorem IV.7 that IPM is the affine func-
ti on that is uniquely determined by the set of linear equations that arises from the
requirements that IPM and f have to coincide at the vertices of M. Note that this lin-
ear system does not need to be solved explicitly, since (in barycentric coordinates)
wehave
n+l . n+l
IPM(x) = E .>..f(vl ), E >.. = 1 , >.. ~ 0 (i=l, ... ,n),
i=l I i=l I I
. n+l .
where VI (i=l, ... ,n) denotes the vertices of M and x = E >..vl • The lower bound
i=l I
ß(M) is then determined by the convex program
Whenever the selection operation is bound improving, the convergence of this al-
gorithm follows from the discussion in Chapter IV. An example that illustrates this
approach is Example IV. I.
Since the conical subdivision discussed in the preceding sections are defined by
means of subdivisions of simplices the normal simplicial subdivision processes turn
out to be very similar to the conical ones.
For any n-subsimplex M = [i', ... ,vn+1] of MO denote by 'PM(x) the convex
envelope of f over Mj it is the affine function that agrees with f(x) at each vertex of
Let w(M) be a basic optimal solution of this linear program, and let ß(M) be its
optimal value.
Then we know from Section IV.4.3 that
Le., ß(M) is a lower bound for f(x) on D n M. If M' is a subsimplex of M then the
value of ~, at any vertex v of M' is f(v) ~ ~(v) (by the concavity of f)j hence
V'M,(x) ~ V'M(x) Vx e M'. Therefore, thislower bounding satisfies the.monotonicity
condition:
Let Ms' s=0,1, ... , be any infinite nested sequence of simplices generated by the
process, Le., such that MS+1 is obtained by a subdivision of Ms ' For each s let
,J = w(Ms)' wS = w(M s) (note that in general wS # ,J).
Definiüon Vll.4. A nested sequence Ms, s=011 1"" is said to be flDfTTl4l for a giuen
(/,D) if
process.
A rule for simplicial subdivision that generates an NSS process is called an NSS rule.
Incorporating anormal simplicial subdivision (NSS) process and the above lower
bounding into a branch and bound scheme yields the following generalization of the
Initialization:
1) For each M E .%k form the affine function fPM(x) that agrees with f(x) at the
vertices of M, and solve the linear program
LP(M,D) s.t. xE D n M
to obtain a basic optimal solution w(M) and the optimal value ß(M).
2) Delete all M E .At k such that ß(M) ~ f(xk ) - c. Let .5e k be the remaining
collection of simplices.
346
5) Update the incumbent, setting xk +1 equal to the best of all feasible solutions
known so far.
Step 4, and let .At k+1 = (.9t k\ {Mk}) U f k+1 . Set k t- k+1 and return to
Step 1.
Theorem. VII.S. The normal simplicial algorithm can be infinite only il e = 0, and
in this case any accumulation point 01 the generated sequence {i} is a global optimal
solution 01 (BCP).
Proof. Consider an infinite nested sequence Ms' se/). ( {O,l, ... }, generated by
the algorithm. Since the sequence f(x k) is nonincreasing, while the sequence ß (M k )
is nondecreasing (by virtue of the monotonicity of the lower bounding and the selec-
tion criterion), there exist 'Y = !im f(x k) = !im f(xk+1) and ß = lim ß (M k ).
Furthermore, by the normality condition, we mayassume (taking subsequences if
necessary) that !im If( tl) - ß(M s)I = 0, and hence lim f( tl) = ß, where
tl = w (Ms). But, from the selection of Mk in Step 2 we find that ß (M s) < f(xs)-t:,
and from the definition of the incumbent in Step 6 we have f(l+1) 5 f(wk ) for any
k. Therefore, ß 5 'Y-t:, while 'Y 5 lim f(tl) = ß. This leads to a contradiction unless
e = 0, and in the latter case we obtain !im ('Yk-ß(M k)) = O. The conclusion then fol-
lows by Theorem IV.3.
•
R.emark VII.7. Given a feasible point z we can find a vertex of D which cor-
responds to an objective function value no greater than f(z). Therefore, we may sup-
pose that each xk is a vertex of D. Then for e = 0, we have f(xk) = min f(D) when k
is sufficiently large.
347
where, as before, 'PM(x) is the affine funetion that agrees with f(x) at the vertices
ofM.
Proposition W.8. Let Ms = [lI s1 ,lIs2,... ,ln], s=l,2, ... , be an infinite nested
sequence 01 simplices such that MS+1 is obtained from Ms by means 01 a radial
sub division with respect to a point ws. Ilthe sequence is nondegenerate, then
Proof. Denote by Hs the hyperplane in IRn+1 that is the graph of the affine
funetion 'PM (x), and denote by L the halfspaee that is the epigraph of this fune-
s s
tion. Let
Now, from the degeneracy assumption, it follows that there exist an infinite
sequence l:l ( {1,2, ... } and a constant fJ such that I11r{M s+1)1I ~ fJ Vs E l:l. Therefore,
If(ws) -IPM (ws) I ~ (1+fJ2)1/2d(yS,HS+1) ~ 0,
s
If the sequence Ms is nondegenerate and wS = WS for an but finitely many 5, then the
relation (26) follows from Proposition VI1.8, since ß(M ) = IPM (ws) for sufficiently
S s
large s.
•
349
Specifically, select an infinite increasing sequence tJ. of natural numbers and adopt
the following rule for the subdivision process (Tuy (1991a)):
Set r(M O) = 0 for the initial simplex MO. At iteration k = 0,1,2, ... , if Mk is the
simplex to be subdivided, then:
longest edge of Mk , and set r(M) = r(M k )+1 for each subsimplex M of this
subdivision.
that either this sequence is exhaustive (in which case it it obviously normal), or
sh sh sh
there exists a subsequence {sh' h=1,2, ... } such that w = wand, as h -100, W
ß(M ) = cp(M ) (wsh ) ~ min f(D n Ms ) ~ f(wsh ), since cp(M ) (wsh ) -I f(W*),
sh sh h sh
s s
while f( w h) -I f( W*), it follows that f( w h) - ß(M s ) -I o. Therefore, in any case the
h
sequence {M s} is normal.
•
We shall refer to the subdivision process constructed above as the Basic NSS
Process. Of course, the choice of the sequence 6. is user specified and must be based
When 6. ={0,1,2, ... } the subdivision process consists exclusively of bisections: the
When 6. = {N,2N,3N, ... } with N very large, the subdivison process consists
essentially of w-subdivisions.
Remark Vll.8. Let D be given by the system (7) (8). Recall that for any simplex
M = [v 1,v2, ... ,v n + 1], the affine function that agrees with f(x) at the vertices of M
If we denote
1 2 ... vn+1] ,
Q= [ vv
1 1 ... 1
351
of 'PM(x).
Like the normal conical algorithm, the normal simplicial algorithm can be ex-
tended to problems where the constraint set D is a convex nonpolyhedral set defined
g(x) ~ 0 .
The extension is based upon the same method that was used for conical algo-
technique according to the scheme proposed by Horst, Benson and Thoai (1988) and
Initialization:
Let x O be the best available feasible solution. Set DO = MO' ..(( 0 = .A"O = {MO}'
7J(MO) = '110 .
352
1) For each M E f k form the affine function IPM( x) and sol ve the linear pro gram
to obtain a basic optimal solution w(M) and the optimal value ß(M).
2) Delete all M E .At k such that ß(M) ~ f(xk)-c. Let .ge k be the remaining
collection of simplices.
xdenotes the intersection of the boundary of D with the ray through x. Set
6) Update the incumbent, setting x k+1 equal to the best among xk , ük and all
W(M) corresponding to M E f k .
Let f k +1 be the partition of Mk , .At k +1 = (.ge k \ {M k}) U f k +1. Set
k +- k+l and return to Step 1.
Theorem W.9. Algorithm VILj. can be infinite only if c = 0 and in this case any
if necessary) that lim If(J) - ß(M s) I = 0, where J = w(M s). Let '1 = lim f(xk) =
lim f(xk +\ ß = lim ß(M k ). Then ß = lim f(~). But f(xS+ 1) ~ f(l1) (see Step 6)
and f(xs+ 1) ~ f(xo) ~ f(O)j Rence if WS E D (i.e., ~ E [0,11]) for infinitely many 5,
then, by the concavity of f(x), we have f(x s+1) ~ f( ~), which, if we let 5 -+ 11), gives
353
'Y ~ ß·
On the other hand, if wS ;. D for infinitely many s, then, reasoning as in the proof
of Theorem VII.5, we see that J - "J -I 0, and since f(x S+1) ~ f("J) again
it follows that 'Y ~ ß· Now, from the selection of Mk in Step 4, we have
ß(M s) < f(x s)-€, and hence ß ~ "f-€. This contradicts the above inequality 'Y ~ ß un-
less c: = O. Then 'Y = ß, i.e., lim (ß(M s) - 'Ys) = 0, and the conclusion follows by
Theorem IV.2.
•
now present an exact (finite) algorithm based upon a specific simplicial subdivision
of the feasible set. The original idea for this method is due to Ban (1983, 1986) (cf.
also Tam and Ban (1985)). Below we follow a modified presentation of Tuy and
Horst (1988).
We shall assume that the feasible set is a polytope D C IR~ defined by the
inequalities
n
gi(x):= .E aiJ·xJ. - bi ~ 0 , (i=l, ... ,m) , x ~ 0 . (27)
J=l
Definition vn.6. A simplex M = ['1l,'1I, ... ,'1l] c IR~ with vertices u1,u2,,,.,ur
(28)
354
The motivation for defining this notion stems from the following property.
Proof. Let x E M, so that x = E ). .uj with J ( {I, ... ,r}, ).. > 0, E)'. = 1. If xE D
~J J J J
then for every i=I, ... ,m:
(30)
Define
(31)
355
Let M1 (resp. M2) be the simplex whose vertex set is obtained from that of M by
respect to v).
Proposition W.ll. The set {M1 ' M2} is a partition 0/ M. 1/ each Mv (v=l,2) is
nontrivial, then p(MJ > p(M).
Proof. The first assertion follows from Proposition IV.l. To prove the second
Therefore, when Mv is still non trivial, its test index Sv is at least equal to s. If
Sv = s, then it follows from (30) and (31) that the number of vertices u satisfying
gs(u) = 0 has increased by at least one. Consequently, p(M v) > p(M). •
The operation of dividing M into M1, M2 that was just described will be called a
D-bisection (or a bisection with respect to the constraints (27». Noting that
Corollary W.6. Any sequence 0/ simplices Mk such that Mk+1 is obtained from
Mk by a D-bisection is finite.
seen that every vertex of D must be a vertex of some simplex in the decomposition.
356
Therefore, this simplicial subdivision process will eventually produce all of the
vertices of the polytope D (incidentally, we thus obtain a method for generating all
of the vertices of a polytope). The point, however, is that this subdivision process
can be incorporated into a branch and bound scheme. Since deletions will be possible
by bounding, we can hope to find the optimal vertex well before all the simplices of
the decomposition have been generated.
To every simplex M = [ul,u2,... ,ur] C IR~ let us assign two numbers o(M), ß(M)
defined as follows:
Proposition vn.12. The BB procedure just defined, with the selection rule
(35)
is finite.
357
Hence, any trivial simplex is removed (cf. Section IV.I). This means that every
as the Exa.ct Simplicial (ES) Algorithm), we may use the following rules in Step k.3
to reduce or delete certain simplices M E .9l k.
Rule 1. Let M = [u l ,u2,... ,ur]. If for some index i there is just one p such that
~(up) > 0, then reduce M by replacing each u q for which gi(u q) < 0 by the point
vq E [up,uq] that satisfies ~(vq) = o.
Rule 2. If there is a p such that gi(uP) < 0 for some i satisfying (28), then replace M
by the proper face of M spanned by those uj with gi(u j ) = o. If for some i we have
gi(uj ) < 0 Vj, then delete M.
The hypothesis in Rule I means that exactly one vertex up of the simplex M lies
on the positive side of the hyperplane ~(x) = o. It is then easily seen that, when we
replace M by the smaller simplex M' which is the part of M on the nonnegative side
of this hyperplane, we do not lose any point of M n D (Le. M n D ( M'). This is just
The hypothesis in Rule 2 means that all vertices of M lie on the nonpositive side
of the hyperplane ~(x) = 0 and at least one vertex uP lies on the negative side. It is
358
then obvious that we do not lose any point of M n D when we replace M by its inter-
section with this hyperplane. This is just the adjustment prescribed by Rule 2. If
every vertex of M lies on the negative side of the hyperplane ~(x) = 0, then
Mn D = 0, and M can therefore be deleted.
•
necessary may have to be performed before reaching the trivial simplex of interest.
On the other hand, if Mk is subdivided with respect to some constraint other than
the test one, the finiteness of the procedure may not be guaranteed.
Let Zll be the vertex of DII which solves (Q). If Zll E D, stop: Zll is a global
optimal solution of (P).
Iteration 0:
Form the polytope D1 by adjoining to MO the io-th constraint in the system (27):
ß(M) ~ Qk-l .
k.2. Se1ect Mk e arg min {ß(M): M e .9t k}. Bisect Mk with respect to the polytope
k.3. Reduce or delete any newly generated simplex that can be reduced or deleted
according to Rules 1 and 2 (with respect to the polytope D/I). Let .At k be the
360
k
kA. For each ME .At compute
Theorem Vll.I0. The modified ES algorithm terminates after finitely many steps.
Proof. Clearly D c Dv and the number v is bounded from above by the total
the ES Algorithm to minimize f(x) over Dv' Since this algorithm is finite, the
where ~j = gi(uj). M is trivial if no row of this matrix has two entries with opposite
sign. The test index s is the index of the first row that has two such entries, i.e.,
~ uq - ~ up
v= p q
gip - ~q
and the matrices associated with the two subsimplices are obtained by replacing
column p (resp. q) in the matrix (36) with the column of entries
Ax=b , x~O ,
Viewing the polytope D as a subset of the (n-m)-dimensional space of xm+l' ... ,xn '
we can then apply the above method with ~(x) = xi" In this case the matrix
associated with a simplex M = [u1,u2, ... ,ur] is given simply by the first m rows of
the matrix (ui), i=l, ... ,nj j=1, ... ,r.
n+l
y= (x,t ) EIR+ '
t = 1.
simplex M = [u 1,u2,... ,ur ] in IR! may be regarded as the intersection of the hyper-
plane t = 1 with the cone K = con(yl,y2, ... ,y\ where yi = (ui ,I). A simplicial sub-
division of D may be regarded as being induced by a conical sub division of IR! +1.
With this interpretation the method can be extended to the case of an unbounded
polyhedron D.
't'
P roposllon vn.13. For any cone K = con{y IR n +1 tcn·th yi = (i
I. 1,y2, ... ,yr) (+ u, ti)
and 1= {i:ti > O}, the set 7r{K) is a generalized simplex with vertices 7r{yi), i E I, and
extreme directions 7f{yi), i ~ I.
r ..
Proof. Let z = 7f{y) with y = (x,t) E K, i.e. y = E .\(ul,t1),).. ~ O. If t =
i=1 I
with E J.L: = I, J.L: ~ 0 (i=l, ... ,r). If t = 0, Le., ).. = 0 (iEI), then z = x = E )..ui =
iEI 1 1 1 i~I 1
E >..?r{yi) with >.. ~ O. Thus, in any case z E ?r{K) implies that z belongs to the
iiI 1 1
generalized simplex generated by the points 7r(yi), i E I, and the directions 1r{yi)),
i ~ I. The converse is obvious.
•
363
A cone K = con(y1,y2, ... ,l) ( 1R~+1 is said to be trivial if for every i=1, ... ,m we
have
Proposition VII.14. 1/ a cone K = con{y1,y2, ... ,{) is trivial, then 1f{K) n Dis the
face 0/ 1f{K) whose vertices are those points 1f{yi), i E 1 = {i: ti > O}, which are
vertices 0/ D and whose extreme directions are those vectors 1f{yi), i t 1, which are
extreme directions 0/ D.
Now if a cone K = con(y1,y2, ... ,l) is nontrivial, we define its test index to be the
smallest s E {1, ... ,m} such that there exist p, q satisfying
2) Any cone in 1R~+1 can be partitioned into trivial subcones by means of a finite
In order to define a branch and bound algorithm based on this partition method,
it remains to specify the bounds.
is finite.
Remark Vll.I0. Incidentally, we have obtained a procedure for generating all the
vertices and extreme directions of a given polyhedron D.
365
4. RECTANGULAR ALGORITHMS
A standard method for lower bounding in branch and bound algorithms for
min CP(M) (which can be computed by convex programming methods) yields a lower
bound for min f(M). The fact that the convex envelope of a concave function f taken
over a simplex is readily computable (and is actually linear), gave rise to the conical
Another case where the convex envelope is easy to compute is when the function
the functions fP) taken over the line segments r j $ t $ Sj (j=l, ... ,n).
Moreover, the convex envelope of a concave function fP) (of one variable) taken
over a line segment [rlj ] is simply the affine function that agrees with fj at the
endpoints of this segment, Le., the function
f.(s.)-f.(r.)
J
t -f()
cp.() - . r·
J J
+ JJ _ JJ( t - r· ) .
s· r. J
(37)
J J
A branch and bound algorithm of this type for separable concave programming
was developed by Falk and Soland in 1969. A variant of this algorithm was discussed
in Horst (1977).
366
The same idea of rectangular subdivision was used for minimizing a concave
Rosen (1987), Pardalos (1985), Rosen and Pardalos (1986), Phillips and Rosen
vious rectangular procedures. We present here in detail the Falk-Soland method and
the approach of Kalantari-Rosen. The Rosen-Pardalos procedure that has been suc-
n
(SCP) minimize f(x):= E f.(x.)
j=l J J
subject to x e D ,
"rectangle" we shall always mean a set of this type). Let IPM(x) = ~ IPM.j(Xj ) be the
J
convex enve10pe of the function f(x) taken over M. Denote by w (M) and ß(M) a
basic optimal solution and the optimal value, respectively, of the linear program
"1t(x) = IPMh(x).
(38)
Initialization:
Iteration k = 0,1,... :
1) For each M E J k compute the affine function VM(x) that agrees with fex) at the
vertices of M and solve the linear program
to obtain a basic optimal solution weM) and the optimal value ß(M).
2) Delete an M E .At k such that ß(M) ~ f(xk)-e. Let .9t k be the remaining
collection of rectangles.
5) Update the incumbent, setting xk+ 1 equal to the best of the feasible solutions
known so far.
Let J k+1 be the collection of subrectangles of Mk provided by the subdivision
in Step 5, and let .At k+ 1 = (.9t k\ {Mkl) U J k+ 1. Set k I- k+ 1 and return to
Step 1.
The proof of this theorem is exactly the same as that of Theorem VII.8 on con-
vergence of the normal simplicial algorithm.
The selection of Mk implies that ß(M k ) < f(i)--c; hence, setting j. = W (Mk ),
(39)
k k
Let Mk = {x: r ~ x ~ s }.
Choose an index jk E {1,2, ... ,n} and a number wk E (r~ ,s~ ), and, using the
Jk Jk
hyper plane x· = wk , subdivide Mk into two rectangles
Jk
Proposition VII.17. The above sub division rv,le (i.e., the rv,le lor selecting jk and
Je) generates an NRS process il it satisfies either 01 the lollowing conditions:
. k k k k
(i) Jk E argm~ I~ (w j ) - I{Jk/wj)1 and w = wj ;
J
k k . k k
f.(w"'.} - I{Jk {w"'.} < uk ·, uk' - - I 0 sI s . - r . - - I 0 . (40)
J Y J Y- J J J J
Proof. Let Mk ' h=1,2, ... , be any infinite nested sequence of rectangles gen-
h
erated by the rule. To simplify the notation let us write h for kh. By taking a subse-
quence if necessary, we may assume that jh = jo ' for example jh = 1, for all h. It
suffices to show that
370
(41)
where in case (i) we set O"h1 = f1(~) - ~h1(~)' From the definition of ~ it then
follows that fj(~) - V1tj(~) - i 0 Vj and hence f(Jt) - V1t(Jt) -i 0, whichjs the
desired normality condition.
Clearly, if rule (ii) is applied, then the interval [r~,s~] shrinks to a point as
h - i 11), and this implies (41), in view of (40). Now consider the case when rule (i) is
applied. Again taking a subsequence if necessary, we may assume that ~ - i w1 as
h - i 11). If cl < w1 < dl' then, since the concave function f1(t) is Lipschitzian in any
bounded interval contained in (cl'd1) (cf. Rockafellar (1970), Theorem 10.4), for all
h sufficiently large we have
where 1/ is a positive constant. Hence, using formula (37) for V1tj(t), we obtain
Since ~-1 is one of the endpoints of the interval [r~,s~], it follows that
V1t1(~-1) = f1(~-1). Therefore,
proving (41).
w
On the other hand, if 1 coincides with an endpoint of [cl'd1], for example
- h h h-1 h -
w1 = cl' then we must have r 1 = cl' SI = "1 Vh, and hence, SI --I w1 = Cl as
h - i 11). Noting that V1t1 (~) ~ min {fl(c l ), fl(s~)}, we then conclude that
Thus, (41) holds in any case, completing the proof of the proposition.
•
Corollary VII,9. If the subdivision in Step 5 is performed according to either of the
mIes (i), (ii) described in Proposition VII.17, then Algorithm VII. 6 converges (in the
sense of Theorem VII. 11).
(1969) as a relaxed form of an algorithm which was finite but somewhat more
complicated (see VI.4.6). In contrast to the "complete" algorithm, the "relaxed"
algorithm may involve iterations in which there is no concave polyhedral function
agreeing with l'M(x) on each reet angle M of the current partition. For this reason,
the argument used to prove finite convergence of the "complete" algorithm cannot
gular algorithms.
Kalantari and Rosen (1987), Rosen and Pardalos (1986), ete. (cf. the introduetion to
Section IV). We diseuss the Kalantari-Rosen proeedure here as a specialized version
where p E IRn, C is a symmetrie positive definite nxn matrix, and D is the polytope
in IRn defined by the linear inequalities
(42)
x~O (43)
UTCU = diag(>'1'>'2, ... ,>'n) > 0, then, as shown in Seetion VI.3.3, after the affine
transformation x = Uy this problem ean be rewritten as
n
min F(y):= E F.(y.) s.t. yEn, (44)
j=l J J
In this separable form the problem can be solved by the normal rectangular
algorithm. To specialize Algorithm VII.6 to this ease, we need the following prop-
(45)
2) (46)
'M(Y) = E'l/JMj(Yj)' where 'MP) is the convex envelope of FP) over the interval
[rl~' Let GP) = - ~ >.l·
Since FP) = qjt + GP), it follows that
,tl.. _.(t)
TMJ
= q.tJ + -y.(t)
J
,
where -yP) is the convex envelope of GP) over the interval [rj,sj]' But -yP) is an
affine function that agrees with GP) at the endpoints of this interval; hence, after
an easy computation we obtain
(47)
F'(Y)-'l/JM .(y)
J ,J
= G.(y.)--y.(y.)
J J J J
1 2 1 1
= - 2' >'lj + 2' >'j(rj + Sj)Yj - 2' >'llj
= ~ >'iYj - r j ) (Sj - Yj) . (48)
problem (44).
According to Corollary vn.8, for the subdivision operation in Step 5 we can use
either of the rules given in Proposition VII. 17. Rule (i) is easy to apply, and
corresponds to the method of Falk and Soland. Rule (ii) requires us to choose
numbers O"kj satisfying (40). Formula (46) suggests taking
1 k k2
uk·=öÄ.(s.-r.) . (49)
,J 0 J J J
AUy ~ b, Uy ~ 0, r ~ y ~ s .
In certain cases, the constraints of the original polytope have .some special
structure which can be exploited in solving the subproblems. Therefore, it may be
more advantageous to work with the original polytope D (in the x-ilpace), rat her
than with the transformed polytope 0 (in the Y-ilpace). Since x = Uy and
ÄjYj = uj(Cx), a rectangle M = {y: r ~ y ~ s} is described by inequalities
(50)
We can thus state the following algorithm of Kalantari and Rosen (1987) for solving
(CQP):
375
Algorithm vrr.7.
Initia1ization:
obtaining the basic optimal solutions x Oj , xDj and the optimal values t'/j' ;;j of these
programs, respectively.
Clearly, DeMO = {x: >'jt'/j ~ uj(Cx) ~ >'j;;j , j=l,2, ... ,n}. Set .J( 1 = .%1 = {MO}'
x O = argmin{f(xOj ), f(xD j ), j=l,2,,,.,n}.
Iteration k = 1,2,,,. :
1) For each M e .%k compute ~(x) = E ~ .(x.) according to (50) and solve the
j ,J J
linear program
to obtain a basic optimal solution w(M) and the optimal value ß(M).
2) Update the incumbent by setting xk equal to the best among all feasible
k-1
solutions so far encountered: x and all w(M), M e .%k'
Delete all M e .J( k such that ß(M) ~ f(xk)-e. Let .ge k be the remaining collection
of rectangles.
5) Let jk e argmax {O'kf j=l,2,,,.,n}, where O'kj is given by (49) (rki are the
k 1 k k
vectors that define Mk ), w = 2' (r jk + Sj/
376
Therefore, if one uses the w-subdivision, then one should choose the index jk that
maximizes (51), and divide the parallelepiped Mk by the hyperplane
jk 1c
u (C(x -w-)) = O.
(ii) Another way of improving the algorithm is to use a concavity cut from time
to time (cf. Section V.l) to reduce the feasible polytope. Kalantari (1984) reported
computational experiments showing the usefulness of such cuts for convergence of
the algorithm. Of course, deeper cuts specially devised for concave quadratic pro-
gramming, such as those developed by Konno (cf. Section VA), shOuld provide even
better results.
(iii) Kalantari and Rosen (1987), and also Rosen and Pardalos (1986), Pardalos
and Rosen (1987), actually considered linearly constrained concave quadratic pro-
gramming problems in which the number of variables that enter the nonlinear part
of the objective function is small in comparison with the total number of variables.
In the next chapter these applications to the large-scale case will be discussed in a
more general framework.
377
The following simple example is taken from Kalantari and Rosen (1987), where
one can also find details of computational results with Algorithm VII.7.
Let fex) = - ~2xi + 8X~), and let the polytope D be given by the constraints:
We have ui = ei (i-th unit vector, i=1,2) and these are normalized eigenvectors of
C, >'1 = 2, >'2 = 8.
It can easily be checked that
o
MI = {x: 0 ~ Xl ~ 8, 0 ~ x 2 ~ 4}, Xo = (8,2), fex ) = -80 .
Iteration 1:
(0,4 ) (8,4 )
ßo = -104, Wo = (7,3) ,
1 .
Xl = (7,3), fex ) = -85 .
MI = MO'
1
f(w ) - 'PM = 19 . Divide MI into
1
(0,0) (8,0)
Mn and M 12 (jl = 1).
Fig. a
378
Iteration 2:
Pu = -73.6, wu = (4,3.6) ,
P12 = -100. , W12 = (7,3) ,
2
x = (7,3), fex 2) = ~5 .
M U fathomedj M2 = M12
f( J) - 'PM (J) = 15. Divide M2 into
2
M21 ' M22 (j2 = 2) .
Fig. b
Iteration 3:
Fig. c
Iteration 4:
Iteration 5:
41
ß41 = -86 , w = (7,3) ,
M5 = M42 ,
f( w5 ) - 'PM (w5 ) = 1. Divide M5 into
5
M 51 ' M 52 (j5 = 1).
Fig. e
Iteration 6:
Iteration 7:
Note that the solution (7,3) was already encountered in the first iteration, but the
while the incumbent is x3 = (7,3) with f(x3) = -85, so M 21 ' M 22 are fathomed, and
Fig. a Fig. b
CHAPTER VIII
linear, while the objective function is a sum of two parts: a linear part involving
most of the variables of the problem, and a concave part involving only a relatively
small number of variables. More precisely, these problems have the form
In solving these problems it is essential to consider methods which take fuH ad-
ant special cases such as separable concave minimization and concave minimization
1. DECOMPOSITION FRAMEWORK
We shall assurne that for every fixed x E D the linear function dy attains a min-
imum over the set of all y such that (x,y) E n. Define the function
with dom 9 = D.
transformation from IRn +! to IRn (see e.g. Rockafellar (1970), Theorem 19.3). Since
s .. r
(x,y) = E A.(U\V1), E A. = 1, A. ~ 0 (Vi), (3)
i=l 1 i=l 1 1
where r ~ s, and since (ui,vi ) for i ~ rare the extreme points of n, while (ui,vi ) for
s .
i > r are the extreme directions of n, we have g(x) = inf E A.dvl , where the in-
i=l 1
fimum is taken over all choices of Ai satisfying (3). That is, g(x) is finitely gener-
ated, and hence convex polyhedral (see Rockafellar (1970), Corollary 19.1.2). Fur-
thermore, from (1) and (2) it is obvious that g(x) < +111 if and only if x E D.
•
Proposition VIII.2. Problem (P) is equivalent to the problem
Specijically, if (x, y) solves (P), then x solves (H), and if x solves (H), then (X, y)
solves (P), where y is the point satisfying g(x} = dy, (x, y) E n .
383
Proof. If (x,Y) solves P, then XE D and f(X) + dy ~ f(x)+ dy for all y such that
(x,y) E Oj hence dy = g(X), and we have f(x) + g(X) = f(x) + dy ~ f(x) + dy for all
(x,y) E O. This implies that f(X) + g(x) ~ f(x) + g(x) for all x E D, i.e., Xsolves (H).
Conversely, suppose that X solves (H) and let y E argrnin {dy: (x,y) E O}. Then
f(x) + dy = f(x) + g(X) ~ f(x) + g(x) for all x E D and hence f(X) + dy ~ f(x) + dy
for all (x,y) E 0, Le., (x,Y) solves (P).
•
Thus, solving (P) is reduced to solving (H), which involves only the x variables
and rnight have a much smaller dimension than (P). Difficulties could arise for the
1) the function g(x) is convex (therefore, f(x) + g(x) is neither convex nor concave,
but is a difference of two convex functions)j
2) the function g(x) and the set D are not defined explicitly.
To cope with these issues, a first method is to convert the problem (H) to the
form
and to deal with the implicitly defined constraints of this concave program in the
We shall see, however, that in many cases the two mentioned points can merely
cave underestimation.
We shall discuss these methods in the next sections, where, for the sake of simpli-
One of the most appropriate techniques for solving (H) without explicit know-
ledge of the function g( x) and the set D is the branch and bound approach (cf. Horst
and Thoai (1992)j Tuy (1992)j Horst, Pardalos and Thoai (1995».
In fact, in this approach all that we need to solve the problem is:
2) a practical procedure for computing a lower bound for the minimum of f(x) +
g(x) over any partition set M, such that the lower bounding is consistent (see
Chapter IV).
+ dy: x E M n D}.
min {1fM(x)
•
385
Thus, to compute a lower bound for f(x) + g(x) over M n D, it suffices to solve a
linear program of the form (5). In this manner, any branch and bound a.lgorithm ori-
gina.lly devised for concave minimization over polytopes that uses linear undeT'-
estimators for lower bounding, can be applied to solve the reduced problem (H). In
doing this, there is no need to know the function f(x) and the set D explicitly. The
convergence of such a branch and bound algorithm when extended to (H) can
genera.lly be established in much the same way a.s that of its origina.l version.
Let us examine how the above decomposition scheme works with the branch and
When minimizing a concave function f(x) over a polytope D ( IRn, the norma.l sim-
of MO' alower bound for f(x) over M n D is taken to be the optima.l value of the lin-
ear program
where rpM(x) is the linear underestimator of f(x) given simply by the affine function
that agrees with f(x) at the n+l vertices of M. In order to extend this a.lgorithm to
problem (H), it suffices to consider the linear program (5) in place of (6). We can
Then for any simplex M = [vl,v2,... ,vn +1] in x-space, since x e M is a convex
combination of the vertices vl,v 2,... ,v n +1, the lower bounding subproblem (5) can
also be written as:
(7)
E Ai =1, \ ~ ° Vi, y ~ °
(cf. Section VIL2.4).
Initialization:
Construct an n-simplex MO such that DeMO c IR! . Let (xO,yo) be the best feas-
Iteration k = 1,2,... :
1) For each M E f k solve the linear program (7) to obtain a basic optimal solution
2) Update the incumbent by setting (xk,yk) equal to the best among all feasible
5) Let .h'"k+1 be the partition of Mk and vi( k+1 = (9t k \ {Mk }) U .h'"k+1 . Set
k -- k+ 1 and return to Step 1.
Theorem. VIll.l. Algorithm VIII.1 can be infinite onlll i/ E = 0, anti in this case,
anll accumulation point 0/ the generated sequence {(t',,j}} is a global optimal sol'/./;-
tion o/(P).
Proof. Consider the algorithm as a branch and bound procedure applied to prob-
lem (H), where for each simplex M, the value Q (M) = f(w (M)) + d(y(M» is an
upper bound for min f(M), and ß (M) is a lower bound for min f(M) (cf. Section
IV.I). It suffices to show that the bounding operation is consistent, i.e., that for any
infinite nested sequence {Mq} generated by the algorithm we have lim(Qq - ßq) = 0
(q - - I m), where Qq = Q (M q), ßq = ß (M q) (then the proof can be completed in the
same way as for Theorem VII.8).
But if we denote "'q(.) = ~q(.), wq = w (Mq), then clearly Qq - ß
q
= f(wq) -
"'q(wq). Now, as sh~wn in the.proof of Proposition VII.9, by taking a subsequence if
necessary, wq - - I w , where w is avertex of M.= n~ =1 M q. We can assume that
w· = li m vq,I, where vq,1 is a vertex of Mq, and that wq = 11=1 Aq i vq,i, where
q~oo '
A . ~ A. i with A. i = 1, A. i = 0 for i # 1. Hence, by continuity of f(x) , "'q (wq) =
q,1 , .'.' •
E~-1 A . f(vq,l) ~ f(w ). Since f(wq) ~ f(w ), it follows that f(wq) - 'l/J(wq ) ~ 0, as
1- q,1
was to be proved.
•
We recall from the discussion in Section VII.1.6 that in practice, instead of a nor-
mal rule one can often use the following simple rule for simplicial subdivision
Choose 7 > 0 sufficiently small. At iteration k, let Ak = (Af, ... ,A:) be an optimal
solution of the linear program (7) for M = Mk. Use w-subdivision if min {>.f:
Af > o} ~ 7, and bisection otherwise.
388
n
f(x) = E f.(x.).
j=1 J J
As previously shown (Theorem IV.8), the convex envelope of such a function over a
where each 1/JM,P) is the affine function of one variable which agrees with fP) at
the endpoints of the interval [rlj ]' Rence, a lower bound for f(x) + g(x) over M
can be obtained by solving the linear program (5), with VM(x) defined by (8). This
leads to the following extension of Algorithm VII.6 for solving problem (P).
Select a tolerance t: ~ 0.
Initialization:
(xO,yO) be the best feasible solution available. Set .At 1 = .%1 = {MO} .
Iteraüon k = 1,2, ... :
1) For each member M = {x E IRn: r ~ x ~ sO} of .%k compute the function VM(x)
Let (w (M),y(M» and ß (M) be a basic optimal solution and the optimal value of
LP(M,O).
389
2) Update the incumbent by setting (xk,yi) equal to the best feasible solution
among (xk-l,yk-l) and all w (M),y(M», M e .A'k'
Delete all M e .Jt k for which ß (M) ~ f(xk ) + dyk - E. Let j! k be the remaining
collection of rectangles.
'/L
rkJ
.(x) = '/LrMkJ
- .(x).
5) Select jk e argmax{ Ifj(~) - 1I1tj(~) I: j=l, ... ,n}. Divide Mk into two
subrectangles by the hyperplane x. =;;. .
Jk Jk
6) Let .A'k+l be the partition of Mk , .Jt k +1 = (j! k\{Mk }) U .A'k' Set k +- k+1
and return to 1).
Theorem VIII.2. Algorithm VIII.2 can be infinite only i/ E = 0 and in this case,
any accum'lllation point o/{(xk,yk)} is a global optimal solution 0/ (P).
ratic programming problems, since a concave quadratic function f(x) can always be
Strictly speaking, the normal conical algorithm (Section VII.I) is not a branch
and bound procedure using linear underestimators for lower bounding. However, it
can be extended in a similar way to solve problems of the form (P), with d = 0 (i.e.,
where the objective function does not depend upon y):
i.e., a problem of the form (9), with (x,t) e IRn+! in the role of x.
As before, let D = {x e IRn: 3y ~ 0 such that Ax + By ~ c, x ~ O}. Then (9) be-
comes a BCP problem to which Algorithm VII.1 * can be applied. The fact that D is
not known explicitly does not matter, since the linear program
-1 -1
LP(QjD) max eQ x s.t. x e D, Q x ~ 0
-1 -1
max eQ x s.t. Ax + By ~ c, Q x ~ 0,y ~ 0
Note that in all the above methods, the function g(x) and the set D are needed
only conceptually and are not used in the actual computation.
391
3. POLYHEDRAL UNDERESTIMATIONMETHOD
Another approach which allows us to solve (H) without explicit knowledge of the
function g(x) and the set D uses polyhedral underestimation.
The polyhedral underestimation method for solving (BCP) is based on the follow-
ing property which was derived in Section VIA (cf. Proposition VI.10):
To every finite set Xk in IRn such that conv Xk = MI ) D one can associate a
polyhedron
(10)
is the lowest concave function which agrees with f(x) at all points of Xk (so, in par-
with ~(x) defined by (10), we obtain a lower estimate for min {f(x)+dy: (x,y)eO}.
Therefore, if (xk,l) is a basic optimal solution of (P k) and (xk,yk) satisfies
~(xk) = f(xk ), then ~(xk) + dyk = f(xk ) + dl; hence (xk,l) solves (P).
392
Otherwise, tI1c(xk ) < f(xk ), which can happen only if x k e D \ X k . We then consider
the new grid X k +1 = X k U {xk }. Since
the vertex set of Sk+1 can be derived from that of Sk by any of the procedures dis-
cussed in Section 11.4.2. Once this vertex set is known, the procedure can be re-
In this manner, starting from a simple grid Xl (with conv Xl = MI)' we gener-
of (P k » and since, by (11), the sequence (xk,yk) never repeats, it follows that the
procedure will terminate after finitely many iterations at a global optimal solution
(xk ,yk).
We are thus led to the following extension of Algorithm VI.5 (we allow restarting
Algorithm. VIII.3.
0) Set Xl = {v l ,... ,vn +l } (where [v l ,... ,vn +1] is a simplex in IRn which contains
D). Let (xO,yO) be the best feasible solution known so far; set SI = {(q,qO): ~
qvi ~ {(vi), i=1, ... ,n+1}. Let vK 1 be the vertex set of SI' i.e., the singleton
{[f(v l ), ... ,f(v n +1)] Ql1}, where Q1 is the matrix with n+1 columns [-~
(j=I, ... ,n+1). Set .#"1 = vK l' k=l.
393
obtaining a basic optimal solution w (q,qO) and the optimal value ß (q,qO)'
2) Compute
n
f(x) = E f.(x.) , (12)
j=1 J J
it is more convenient to start with a reet angle MI = {x: r l ~ x ~ sI} and to con-
n
lPtt(x) = E lPtt .(x.) , (13)
j=1 J J
394
where each fPJcP) is a piecewise affine function defined by a formula of the type
«37), Section VII.4) in each of the k subintervals that the grid Xkj determines in
1 1
the segment [r j ,s j] .
Theoretically, the method looks very simple. From a computational point of view,
however, for large values of k the functions fPJcj(t) are not easy to manipulate. To
cope with this difficulty, some authors propose using mixed integer programming, as
in the following method of Rosen and Pardalos (1986) for concave quadratic minim-
ization.
Consider the problem (P) where f(x) is concave quadratic. Without loss of gener-
ality we may assume that f(x) has the form (12) with
(cf. Section VI.3.3), and that n is the polytope defined by the linear inequalities
Ax + By ~ c , x~0 , y ~ 0. (15)
Suppose we partition each interval Ij = [O,ß~ (j=l, ... ,n) into kj equal subinter-
vals of length 6j = ß/kj" Let rpP) be the lowest concave function that agrees with
fP) at all subdivision points of the interval I j , and let
n
rp(x) = E rp.(x.).
j=l J J
then (x,y) is an approximate optimal solution of (P) with an accuracy that obviously
depends on the number k. (j=1, ... ,n) of subintervals into which each interval I. is di-
J J
vided. It turns out that we can select kj (j=1, ... ,n) so as to ensure any prescribed
level of accuracy (cf. Pardalos and Rosen (1987)).
Let lII(x,y) = f(x) + dy and denote by Vi* the global minimum of lII(x,y) over O.
The error at (x,y) is given by lII(x,y) -111*. Since
We now give abound for E(x) relative to the range of f(x) over M. Let
f = max f(x), f . = mi n f(x). Then the range off(x) over M is
max xEM mm xEM
M=f -f . .
max mm
(17)
(18)
1 2 n 2
Lemma W.l. If ~ 8" >'lß 1 .E PJ·(l + 17J')
J=l
Proof. Denote
where the maximum and minimum are taken over the interval [O,ßj].
Then {j(ßj) = >'!lij - ~ ßj ) 5 0 , and since the minimum of the concave function
{P) over the interval 0 5 t 5 ßj is attained at an endpoint, we have
min {P) = {lß/ On the other hand, max {p) = ! >'l~' From (19) we have xj =!
ßP-17j)' Hence,
(ii)
Then {·(ß·) > 0, so that min {.(t) = {.(O) = O. From (19) we have X. = ~ß·(1+17·),
JJ- J J J -'J J
hence
(iii) xj 5 0:
This implies qj 50, {P) 5 -~ >'jt~ (0 5 t 5 ßj ), so that max fP) = 0,
1 212 2
ß{j ~ 2 >'l j = 8" >'l j(1+17j)
397
(iv) x.rJ
> ß·:
Finally, we have
M = n I n A 21 n ,,2 2
E M. ~ 8" E A. ·(1+71') ~ 8" E A1PIP·(1+7J·) .
j=1 J j=1 J J j=1 J J •
Theorem VllI.3. We have
n 2
E (p·/k.)
j=1 J J
(20)
n 2
E p.(1+7J.)
j=1 J J
Since fj is concave and 'PP) interpolates it at the points t = i5j (i=O,I, ... ,kj ), it
can easily be shown that
0< f.(i.) - cp.(i.) < ~ A.{j~ = -SI A·(ß·/kl (j=I, ... ,n).
-JJ JJ-oJJ JJJ
and the inequality (20) follows from Lemma Vll.1 and (21).
•
The following corollaries are immediate consequences of Theorem VIlL3.
398
Corollary VIII.I. Let tp(x) be the convex envelope oi J(x) taken over M and let (i,l)
be an optimal solution ofthe problem
Then
n
E Pj
qT(xO,yO) - w. J·=1
(22)
- -"i.{ 2 - =: u(p,1/)
~ -=nc-'-'--"'-(--,.)....
E P j 1+1/j
j=l
k.~
J
(!!:.p}/2, wherea=e p.(1+1/.l
a 1 j=1 J J
E (23)
n
E -:1-
p.
j=l k.
n 2
~ a = e E p.(1+1/.) .
j=l J J •
J
Thus, in the case of a quadratic function fex), we can choose the numbers kj
(j=l, ... ,n) so that the solution to the approximation problem (16) that corresponds
to the subdivision of each interval [O,ßj ] into kj subintervals will give an e-approxi-
construction of the functions tpP) may be cumbersome. Rosen and Pardalos (1986)
399
problem as follows.
Let us introduce new variables wij such that
k.
xj = Oj Ei~l wij (j=l, ... ,n). (24)
The variables w·· are restricted to w·· E [0,1] and furthermore, w. = (w1 ·, ... ,w. .)
IJ IJ J J Kr
is restricted to have the form wj = (l, ... ,l,wl j'O, ... ,O). Then there will be a unique
w
vector j representing any xj E [O,ßj] and it is easy to see that
k.
ep.(x.) = EJ M .. w.. , with
J J i=l IJ IJ
w
where j is deterrnined by (24). We can therefore reformulate (16) as
n k.
(MI) rnin E E J M. ·w·· + dy
j=l i=l IJ IJ
n k.
s.t. E o.a. EJw.. +By~c,
j=l J J i=l IJ
Zij E {0,1},
1) For the objective function, compute the eigenvalues Ai and the corresponding
2) Choose the incumbent function value (IFV). Construct CP(X),M,pi'j (j=1, ... ,n).
3) Solve the approximating problem (16) to obtain (i,Y). If 1I1(i,y) < IFV, reset
IFV = 1I1(x,y).
accelerate pruning.
max {xi (x,y) E Cl} , min {xi (x,y) E Cl} (j=1,2, ... ,n).
sibilities is only k1 x ... x kn . In addition, the number N of 0-1 integer variables never
exceeds
of problem (H) (Le., (4)), which is equivalent to (H). This method (cf. Tuy (1985,
1987) and Thieu (1989)) can also be regarded as an application of Benders' parti-
As before, we shall assume that the constraint set n is given in the form
Ax + By ~ c , x ~ 0 , y ~ 0 (25)
As shown in Section VIII.1, this problem is equivalent to the following one in the
space IRnxlR:
where
D = {x ~ 0: 3 y ~ 0 such that Ax + By ~ c} .
tion with dom g = D, it follows that the constraint set G of (H) is a polyhedron; and
hence problem (H) can be solved by the out er approximation method described in
Section VI.l.
402
Note that the function g(x) is continuous on D, because any proper convex poly-
hedral function is closed and hence c:ontinuous on any polyhedron contained in its
domain (Rockafellar, Corollary 19.1.2 and Theorem 10.2). Hence, if we assume that
optimal solution of (H) must satisfy g(x) = t, we can add the constraint a ~ g(x) ~ ß
to (H). In other words, we may assume that the constraint set G of (H) is contained
Under these conditions, an outer approximation method for solving (H) proceeds
and let (xk,t k ) be an optimal solution. If (xk,t k) happens to be feasible (Le., belong
to G), then it solves (H). Otherwise, one constructs a linear constraint on G that is
violated by (xk,tk ). Adding this constraint to T k , one defines a new polytope T k+1
that excludes (xk,t k) but contains G. Then the process is repeated, with T k +1 in
place of T k . Since the number of constraints on G is finite, the process will termin-
ate at an optimal solution of (P) after firii.tely many iterations.
Clearly, to carry out this scheme we must carefully examine two essential questions:
(i) How can one check whether a point (xk,t k) is feasible for (H)?
(il) If (xk,t k ) is infeasible, how does one construct a constraint on G that is violated
by this point?
In the next section we shall show how these two questions can be resolved, with-
out, in general, having to generate explicitly all of the constraints of G.
Let (xO,tO) E IRn"lR. Recall that, by definition, g(xO) is the optimal value of the
linear program
Since the function g(x) is bounded on D, C(x) is either infeasible or else has a
Proposition vrn.4. The point (il) is feasible for (iI), i.e., i E D and
g(i) ~ tO, if and only if both programs C(xO) and C*(xO) are feasible and their
common optimal value does not exceed l
Proof. Since the point (xO,tO) is feasible for (H) if and only if C(xO) has an op-
timal value that does not exceed tO, the conclusion follows immediately from the du-
ality theory of linear programming.
•
Observe that, since C(x) is either infeasible or has a finite optimal value (g(x) >
-m), C*(x) must be feasible, i.e., the polyhedron
T
W = B w ~ -d, w ~ °
is nonempty.
404
Now suppose that (xO,tO) is infeasible. By the proposition just stated, this can
happen only in the following two cases:
I. C(xO) is infeasible.
(26)
Proof. Indeed, for any xe D, C(x) has a finite optimal value. Hence, C*(x) has a
finite optimal value, too. This implies that (x - c)v ~ °for any extreme direction v
ofW.
•
ll. C(xO) is feasible but its optimal value exceeds ,0.
Then C*(xO) has an optimal solution wO with (AxO - c)wO= g(xO) > tO.
(27)
Proof. For any x e D, since C(x) has g(x) as its optimal value, it follows that
C*(x), too, has g(x) as its optimal value. Hence,
which implies (27) if g(x) ~ t. Since tO < g(xO) = (Ax° - c)wO, the proposition fol-
lows.
•
405
Thus, given any (xO,tO), by solving C(xO) and C*(xO) we obtain an the informa-
tion needed to check feasibility of (xO,tO) and, in case of infeasibility, to construct
the corresponding inequality that excludes (xO,tO) without excluding any feasible
(x,t).
Let a ~ g(x) ~ ß Vx E D.
Algorithm VIll.5
Iniüalizaüon:
solution of (P).
If a basic optimal solution wk of C*(xk ) is found with (Axk - c)wk > t k , then
form
T k +1 = Tk n{x: (Axk - k
c)w ~ t}
(ii) If we know bounds 0, ß for g(x) over D, then we can take TO = s" [o,P] ,
where S is an n-simplex or a rectangle in IRn which contains D. Obviously, a lower
bound for g(x) over D is
0= min {dy: Ax + By ~ c, x ~ 0, y ~ O}
= min {g(x): xe D} .
It may be more difficult to compute an upper bound ß. However, for the above algo-
rithm to work it is not necessary that TO be a polytope. Instead, one can take TO to
be a polyhedron of the form S" [0,+111), where S is a polytope containing D. Indeed,
this will suffice to ensure that any relaxed problem (Hk) will have an optimal solu-
407
f(x k) + (Axk--c)wk provides an upper bound for the optimal value w* in (P), and so
On the basis of this observation, the algorithm can be improved by the following
modification.
To start, set CBS (Current Best Solution) = X, UBD (Upper Bound) = f(x) +
g(x), where i is some best available point of D (if no such point is available, set
CBS = 0, UBD = +ID). At iteration k, in Step 1, if f(xk ) + t k ~ UBD-e, then stop:
CBS is a global E-{)ptimal solution. In Step 3 (entered with an optimal solution wk
of C*(xk), i.e., x k E D), if f(x k ) + (Axk - c)wk < UBD, reset UBD = f(x k ) + g(xk ),
k
CBS = x .
(iv) As in all outer approximation procedures, the relaxed problem (Hk) differs
from (Hk_I) by just one additional constraint. Therefore, to solve the relaxed prob-
lems one should use an algorithm with rest art capability. For example, if Algorithm
VI.l is used, then at the beginning the vertex set T Ois known, and at iteration k the
vertex set of T k is computed using knowledge of the vertex set of T k_ 1 and the
newly added constraint (see Section 1II.4). The optimal solution (xk,t k) is found by
comparing the values of f(x) + t at the vertices.
Minirni z e f ( x) + dy
s . t. Ax + By ~ c, x ~ 0, y ~ 0,
2 5 2 2 T
where x E IR , Y E IR ,f(x):= -(xCI) - (x2+1) ,d = (1,-1,2,1,-1) ,
B= [~ ~ ~ -~ ~ 1
001 01
c= [=1~ 1
-3
408
Iniüalizaüon:
a = min {dy: Ax + By ~ c, x ~ 0, y ~ O} = -3.
TO= S,,[a,+m), S = {x: xl ~ 0, ~ ~ 0, xl + ~ ~ 3}.
lteraüonO:
The relaxed problem is
Iteraüon 1:
Optimal solution of (H1): xl = (0,3), t 1 = -1.
Solving C*(x1) yields w1 = (0,1,0), with (Ax1 _ c)w1 = -1 = t 1.
The termination criterion in Step 3 is satisfied. Hence (x1;y1) = (0,3;0,1,0,0,0) is the
desired global optimal solution of (P).
4.3. An Extension
with X a convex polyhedron in IRn , and the projection D of (1 on IRn is not necessarlly
bounded.
As above, define
409
and assume that g(x) ~ CI! Vx E Xj hence W # 0 (this is seen by considering the dual
linear programs C(x), C*(x) for an arbitrary x E D). Denote the set of vertices and
Proposiüon VIII.7. A vector (x/t) E XKR belongs to G if and only ifit satisfies
proof. The proofs of Propositions VIII.5 and VIII.6 do not depend upon the hy-
potheses that X = IR~ and D is bounded. According to these propositions, for any
x EX, if (x,t) t G, then (x,t) violates at least one of the inequalities (29), (30). On
the other hand, these inequalities are satisfied by all (x,t) E G.
•
Thus, the feasible set G can be described by the system of constraints: x e X, (29)
and (30).
In view of this result, checlring the feasibility of a point (xk,t k ) E XKR and con-
structing the constraint that excludes it when it is infeasible proceeds exactly as be-
The complication now is that, since the outer approximating polyhedron T k may be
unbounded, the relaxed problem (Hk) may have an unbounded optimal solution.
That is, in solving (Hk) we may obtain an extreme direction (zki) of T k on which
the function f(x) +t is unbounded from below. Whenever this situation occurs, we
must check whether this direction belongs to the recession cone of the feasible set G,
and if not, we must construct a linear constraint on G that excludes this direction
410
Proof. Indeed, let (x,t) be an arbitrary point of G. Then x + >. z E G V>. ~ 0, and
by Proposition VIII.7. the recession cone of G consists of all (z,s) such that for all
>. ~ 0, (A(x + >.z))v ~ 0 Vv E extd W and {A(x + >.z))w ~ S Vw E vert W. This is
equivalent to (31), (32).
•
Therefore, a vector (zk,sk) E IRn ,,1R belongs to the recession cone of G if the linear
program
max {(Azk)w: w E W} .
has a basic optimal solution wk with (Azk)wk ~ sk. If not, i.e., if (Azk)w > sk, then
the constraint (30) corresponding to w = wk will exclude (zkl) from the recession
cone. H S(zk) has an unbounded optimal solution with direction vk , then the con-
straint (29) corresponding to v = vk will exclude (zk,sk) from the recession cone.
On the basis, of the above results one can propose the following modification of AI-
gorithm VIII.5 for the case where X is an arbitrary convex polyhedron and D may
be unbounded:
Initialization:
Construct a polyhedron T Osuch that G c T O C Sx [a,+m). Set k = O.
411
5) Solve S(zk).
If a basic optimal solution (zk}) of S(zk) is found with (Azk)wk > sk, then form
k
T k +1 = T k n {x: (Ax - c)w ~ t}
7) Otherwise, a direction vk E extd W is found such that (Azk)vk > 0, Then form
k
T k +1 = T k n {x: (Ax - c)v ~ O}
It is dear that the modified Algorithm VIII.5 will terminate in finitely many
steps.
W = {w: BT w ~ -d}.
412
(ii) A further extension of the algorithm can be made to the case when the non-
negativity constraint y ~ 0 is replaced by a constraint of the form Ey ~ p, where Eis
an lo,h matrix and p an l-vector. Then the problems C*(x), S(z) become
As we saw in Chapter 11 and Section VI.1, a major difficulty with outer approx-
imation methods is the rapid growth of the number of constraints in the relaxed
problems (Hk)' Despite this difficulty, there are instances when outer approximation
methods appear to be more easily applicable than other methods. In the decomposi-
tion context discussed in this chapter, an obvious advantage of the outer approx-
imation approach is that all the linear subproblems involved in it have the same con-
413
straint set. This is in contrast with branch and bound methods in which each linear
subproblem has a different constraint set.
To illustrate this remark, let us consider a class of two level decision problems
which are sometimes encountered in production planning. They have the following
general formulation
Specifically, x might denote a production program to be chosen (at the first de-
cision level) from a set X of feasible programs, and y might denote some trans-
portation-distribution program that is to be determined once the production pro-
gram x has heen chosen in such a way that the requirements (33) are met. The ob-
jective function f(x) is the production cost, which is assumed to be concave, and dy
is the transportation-distribution cost. Often, the structure of the constraints (33) is
such that highly efficient algorithms are currently available for solving linear pro-
grams with these constraints. This is the case, for example, with the so-called plant
location problem, in which y = {Yij' i=l, ... ,m, j=l, ... ,n} and the constraints (33)
are:
Clearly, branch and bound methods do not take advantage of the specific struc-
ture of the constraints (33): each linear subproblem corresponding to a partition set
involves additional constraints which totally destroy the original structure. In con-
trast, the decomposition approach by outer approximation allows this structure to
be fully exploited: the linear programs involved in the iterations are simply
414
This is particularly convenient in the case of the constraints (34), for then each
C(x) is a classical transportation problem, and the dual variables w, u are merely
modifications described in Section VIII.4.3, goes as follows (cf. Thieu (1987)). For
the sake of simplicity, assume that Xis bounded and the constraints (33) are feasible
for every fixed x E X.
Algorithm VIII.6
Initialization:
distribution cost of xk: g(xk) = wkxk + ukb, where (wk,u k) is a basic optimal
obtaining an optimal solution (xk +l,tk +1) (the new production program together
Go to iteration k+l.
[
6 66 68
81
d = (dij ) = 40 20 34 83 27
4] .
90 22 82 17 8
Applying Algorithm VIII.6 with X defined by (37) we obtain the following results.
Initialization:
t
o= 3636, X0 = (203, 0, 0).
416
Iteration 0:
Iteration 3:
Optimal value of C(x3): 3773.647 > t3
Potentials: w3 = (-48, -46, -44), u 3 = (54, 66, 80, 61, 52)
62 0
(y..)=
IJ
[ 0 0 5~ ~ 1~0 J.
o 65 o 10
417
problem (Hk) one cau use any procedure, as long as it is capable of being restarted.
One such procedure is the method of Thieu- Tam-Ban (Algorithm VI.l), which re-
lies on the inductive computation of the vertex set Vk of the approximating polytope
T k . However, for relatively large values of n, this method encounters serious com-
putational difficulties in view of the rapid growth of the set Vk which may attain a
prohibitive size.
One possible way of avoiding these difficulties is to solve the relaxed probleD)S by
arestart branch and bound procedure (Section IV.6). This idea is related to the ap-
proaches of Benson and Horst (1991), Horst, Thoai and Benson (1991) for concave
In fact, solving the relaxed problems in Algorithm VIII.5 by the restart normal con-
iCal algorithm amounts to applying Algorithm VII.2 (normal conical algorithm for
For the sake of simplicity, as before we assume that Dis boundedj as seen above (in
Section VIII.4.2), this implies that g(x) is bounded on D, and for any x E IRn the lin-
ear program
is feasible.
Let G = ((x,t): x E D, g(x) ~ t} be the feasible set of (H). From the above results
(Section VIII.4.2) it follows that for any point (xk,t k ), one of the following three
W = {w: BTw ~ -d,w ~ O} can be found over which (Axk - c)wk is unbounded.
Now, when applying Algorithm VII.2 to problem (H), let us observe that the feas-
ible set G of (H) is a convex set of a particular kind, namely it is the epigraph of a
convex function on D. In view of this particular structure of G, instead of using a
conical subdivision of the (x,t)-space as prescribed by Algorithm VII.2, it is more
convenient to subdivide the space into prisms of the form MxR, where M is an
n-simplex in the x-space (such a prism can also be viewed as a cone with vertex at
infinity, in the direction t --t +(1).
(x,t)-space, and let 'k be the incumbent function value. For every prism MxlR which
is in some partition of T k , where M = [sl, ... ,sn+l], we can compute the points
(si,#) on the verticals x =i such that f(si) + # = Ik and consider the linear
program
max {1: A/-t: (1: A/,t) E Tk, E '\ = 1, -\ ~ 0 (i=l, ... ,n+l)} (40)
which is equivalent to
419
where "'M(x) is the intersection point of the vertical through x with the hyperplane
through (sl,ol), ... ,(sn+1,tf+ 1). Let J'{M) be the optimal value, and let z(M) =
(x(M),t(M)) be a basic optimal solution of (40). Clearly, if ",(M) 5 0, then the por-
tion of G contained in MxlR lies entirely above our hyperplane. Hence, by the con-
cavity of iJl(x,t):= f(x)+t, we must have f(x)+t ~ 'Yk V(x,t) E G n (MxR), Le., this
prism can be fathomed. Otherwise, if J'{M) > 0, this prism should be further inves-
tigated. In any case, x(M) is distinct from all vertices of M, so that M can be sub-
divided with respect to this point. In addition, the number
On the other hand, by solving C*(x(M)) we can check whether z(M) belongs to G,
and if not, construct a constraint to add to T k to define the new polyhedron T k+1 .
For every xE D denote F(x) = f(x) + g(x). The above development leads to the fol-
lowing procedure:
Algorithm VIII. 7.
2) For each M E .9 k solve the linear program (40), obtaining the optimal value
5) Let x k = x(Mk ), tk = t(M k). Solve C*(xk ). If case a) occurs, Le., (xk,t k) E G,
then let T k +1 = T k . Otherwise, form T k+1 by adding the new constraint (38),
or (39), to T k ' according to whether case b) or case c) occurs.
6) Let i k+1 be the best (in terms of the value of F(x» among i k , an x(M) for
M E .9 k +1 ' and let the point u(M k ) E Dk be used for subdividing Mk if u(M k) f
x (M k). Let Ik+1 = F(ik+1), .At k+1 = (~k \ {Mk }) U .9 k+1 . Set b - k+1
and return to 2).
Proof. By viewing a prism M"IR as a cone with vertex at infinity in the direction
t - - I 111, the proof can be carried out in the same way as for Theorem VII.5 on the
an additional subproblem C*(xk ) (which is the dual to a linear program in y). How-
ever, since at least two new simplices appear at each iteration, in an, Algorithm
NETWORKS
indude problems in inventory and produetion planning, eapacity sizing, loeation and
network design whieh involve set-up eharges, diseounting, or eeonomies of seale.
Other, more general noneonvex network problems ean be transformed into equi-
valent eoneave network problems (Lamar, 1993). Large seale problems of this dass
ean often be treated by appropriate decomposition methods that take advantage of
Consider a (direeted) graph G = (V, A), where V is the set of nodes and Ais the
set of ares (an are is an ordered pair of nodes). Suppose we are given areal number
d(v) for eaeh node v e V and two nonnegative numbers Pa' qa (Pa S qa) for eaeh arc
a e A. A vector x with components x(a) ~ 0, a e A, is ealled a flow in the network G
(where the eomponent x(a) ia the flow value in the are a). A flow is said to be feas-
ible if
Pa S x(a) S qa Va e A , (42)
where A+(v) (resp. A-(v) ) denotes the set of ares entering (resp. leaving) node v.
The number d(v) expresses the "demand" at node v (if d(v) < 0, then node v is a
"supply" node with supply ~(v)). The numbers Pa' qa represent lower and upper
bounds on the flow value in are a. The relation (43) expresses flow eonservation. It
follows immediately from (43) that a feasible flow exists only if E d(v) = 0.
veV
422
Furthermore, to each arc we associate a concave function fa: IR+ --I IR+ whose
value fa(t) at a given t ~ 0 represents the cost of sending an amount t of the flow
through the arc a. The minimum concalle cost jlow problem (CF) is to find a ieasible
flow x with smallest cost
of the form fa(x(a)) = fa(Pa) + ca(x(a) -Pa) (ca ~ 0), and let All =A \ AI. In
many practical cases, lAll I is relatively small compared to IAI I, i.e., the' problem
involves relatively few nonlinear variables. Then the minimum concave cost flow
problem belongs to the class considered in this chapter and can he treated by the
methods discussed above. It has been proved recently that by fixing the number of
sources (supply points), capacitated arcs and nonlinear arc costs (i.e. IIAII II) this
problem becomes even strongly polynomially solvable (Tuy, Ghannadan, Migdalas
and Värbrand (1995)). In particular, efficient algorithms have been proposed for
SUCF with just one or two nonlinear arc costs (Guisewite and Pardalos (1992), Tuy,
423
Dan and Ghannadan (1993), Tuy, Ghannadan, Migdalas and Värbrand (1993b),
Horst, Pardalos and Thoai (1995)).
For every flow x let us write x = (i ,xII) where xl = (x(a), aEA I ), xII = (x(a),
aEAII ). The following Algorithm VIII.8 is a specialization to the concave cost flow
VIII.2) which differs from the algorithm of Soland (1974) by a more efficient
bounding method.
Algorithm VIll.8.
1) For each rectangle M E .%k with M = 11 11 [ra ,s ] solve the linear problem:
aEA a
where
(c~ is the slope ofthe affine function 'I/J~(t), which agrees with f(t) at the points
t = r a and t = sa).
If (LP(M)) is infeasible, then set ß(M) = +(1). Otherwise, let w(M) be an optimal
2) Define the incumbent by setting xk equal to the best feasible solution among
xk-l, and all w(M), M E .%k· Let 'k = f(x k) if x k exists, 'k = +00 otherwise.
424
Delete all M E .At k for which ß(M) ~ f(x k). Let .ge k be the remaining collection of
rectangles.
3) If .ge k = 0, then terminate: if 'Yk < +111, then xk is an optimal solution (optimal
flow), otherwise the problem is infeasible.
Convergence of Algorithm VIII.8 can be deduced from Theorem VIII.2, since AI-
gorithm VIIL8 is a specialization to the concave cost flow problem of Algorithm
VIII.2 which handles more general separable problems. It follows that every accumu-
lation point of the sequence {xk } is an optimal solution of (CF), and f(x k ) converges
Next, assume that the numbers Pa' qa' d(v) in (42), (43) are all integers, which is
usually the case in practical applications. Then it is well-known that the vertices of
the feasible polytope defined by (42), (43) are all integer vectors because of the total
unimodularity of the matrix defining the left hand side of (43) (which is the
Papadimitriou and Steiglitz (1982)). Since we know that the concave function (44)
attains its minimum at a vertex of the feasible polytope, it is c1ear that we can add
the requirement
425
to the constraints (42), (43) without changing the problem (in (46) INO:= {O} UIN).
Finiteness of Algorithm VIII.8 follows then because the optimal solution w(M) of
every linear subproblem LP(M) is integral, and hence every xk is integral. Since the
Notice that, for solving the linear cost network flow problem (LP(M)) in Step 1) a
number of very efficient polynomial algorithms are available (cf., e.g., Ahuya, Mag-
nanti and Orlin (1993)).
rectangular bisection (Horst and Thoai (1994a and 1995): Let Mk be the rectangle
(47)
be the length of one of its longest edges. Then, in Step 5), subdivide Mk into
1 6 (M)
Mk = {x E Mk : x(i\) ~ sä + l2"- J} (48)
k
and
(49)
Proposition vm.8. The integral version 0/ Algorithm VIII.8 'U.Sing the subdivision
(47)-(49) terminates after at most T = rr II r(q(a)-p(a))/21 iterations.
aeA
Proof. Notice that optimal solutions of LP(M) are integer. From this and the fact
that the convex envelope of a univariate concave function f over an interval coin-
cides with f at the endpoints it follows that rectangles M satisfying 6(M) = 1 are
426
deleted in Step 2). Therefore, it is sufficient to show that after at most T iterations
no partition element M satisfying 6(M) > I is left. But this follows readily from
(48), (49) which implies that an edge e of the initial rectangle MI cannot be involved
in a subsequent subdivision more than rI e 1/21 times, where Ieldenotes the length
of the edge e. Since lei ~ qa - Pa for the corresponding a E All, we obtain the above
bound.
•
5.2. The Single Source Uncapacitated Minimum Concave Cost Flow (SUCF)
Problem
Now we consider the SUCF problem, i.e., the special case of (CF) when there are
no capacity constraints, (Le., Pa = 0, qa = +ID Va E A) and there is only one supply
node (i.e., one node v E V with dv < 0). Since (SUCF) is NP-hard, large seale
SUCF problems cannot be expected to be effectively solved by general purpose
methods. Fortunately, many SUCF problems encountered in practice have special
additional structure that can be exploited to devise specialized algorithms. Examples
include the concave warehouse problem, the plant location problem, the multi-
product production and inventory models, etc. All of these problems and many
others can be described as (SUCF) over a network which consists of several pairwise
interconnected subnetworks like the one depicted in Fig. VIII.I.
427
For problems with such a reccuring special structure, a decomposition method can
Let us first restate the problem and its structure in a convenient form. Anode v
of a graph G = (V,A) is called a source if At = 0. Let S(G) be the set of all sources
of G, and suppose that we are given a demand d(v) ~ 0 for each node v E V \ S(G)
and a concave cost functian fa : IR+ --+ IR+ far each arc a E A. Then we consider the
problem
x(a) ~ 0 (a E A) . (52)
428
Clearly this problem beeomes an (SUCF) problem when we add to the network a fie-
tive supply node vo along with a fietive are (vO,v) for eaeh v E S(G). Furthermore,
IS(G)I = 1. Therefore, from now on, by (SUCF) we shall mean the problem (50) -
(52) with the fietive node vO and the fietive ares (vO,v) (v E S(G)) as introdueed
above.
We shall further impose a eertain strueture on the network G = (V,A):
(*) There is a partition {V1, ... , Vn} of V such that for every i=l, ... ,n we have the
foUowing property:
if the initial node of an arc belongs to Vi' then its final node belongs to either Vi or
Vi +1"
In view of this eondition, the eoefficient matrix of the flow eonservation eon-
straints (51) has astairease strueture whieh we exploit in order to detive a deeom-
position method.
Ai = {a E A: a = (u,v), u E V, v E Vi} ,
and for eaeh v we let At(v) (resp. Ai(v)) denote the set of ares in Ai whieh enter
(resp. leave) v.
n
Then A. n A. = 0 (i f j) and A = U A., i.e., the sets A l , ... ,A form a partition of
1 J i=l 1 n
A. Furthermore, let Gi be the subgraph of G generated by Ai ' i.e., Gi = (Wi' Ai)'
where Wi is the set of all nodes incident to ares in Ai. Then any flow x in G ean be
written
.
as x = (x l ,... ,xn
), where x. is theI
restrietion of
l
x to A..
Setting Ui = Wi n Wi +! ' we see that v E Ui if and only if v is the initial node of an
are going from Vi to Vi+!. Let hi : V -+ IR+ be a funetion such that hi(v) = ° for v ~
Ui . Denote by X(G,d) the set of all feasible flows in G for the demand veetor d with
429
components d(v), v E V \ S(G), i.e., the set of all veetors x = (x(a), a E A) satis-
fying (51) and (52); similarly, denote by X(Gi ' d + hi ) the set of all feasible nows in
t
where A +1(v) denotes the set 01 ares a E Ai+1 'eaving v.
A +(v) = At(v) Vv E Vi
Ai(v) U Ai+l(v) if v E Ui
A-(v) ={ _
A.(v) i f v E V. \ U.
111
We shall assume that for any node v E V \ S(G) there is at least one path going
from a souree to v. This amounts to requiring that X(G,d) is nonempty for any vee-
Proposition vm.l0. A feasible flow z is an extreme flow if and only if there ezists
We omit the proof, which can be found in textbooks on elementary network flow
Proposition vm.ll. The function ~(d) is concave on the set of all d = (d(v)),
v e V\ S(G), d(v) ~ O.
spanning forest T = (V,B) such that B ) {a e A: i(a) > O}. Since T is a forest, there
exist a unique x' e X(T,d') and a unique x" e X(T,d"). Clearly ).x' + (1->')x" e
X(T,d) and hence by uniqueness, i = ).x' + (1->')x". But we have ~(d) = f(i) ~
>'f(x') + (1->') f(x") ~ >'~(d') + (1->')~(d"), which proves the concavity of~. •
Proposition vm.12. The fu,nctions F/) are concave on their domains of de-
finition. 1f lP/xJ denotes the fu,nction obtained !rom Fi -lhi - 1) by replacing hi - 1 by
the vector E _ x.(a), u E Ui - 1 ' then lP/xJ is concave.
aeA i (u) I
Therefore, again using Proposition VIII.10, we see that F 2(·) and 1P3(') are concave.
Let (i
n,h~1) denote an optimal solution of (P n), and consider the next subproblem
432
S.t. x. E X(G., d
1 1
+ li.),
1
h. l(u) = E
1- aEA~(u) 1
x.(a) 'tu EU. I'
l-
I
in which lii is obtained !rom an optimal solution (xi +1' lii) of PHI'
Note that, if we agree to set 'PI (. ) :: 0, then, by Proposition VIII.12 each problem
(P.) is equivalent to minimizing the concave function cp.(xI·) + E fa(x.(a» subject
1 1 aeA. 1
1
to Xi E X(Gi , d + hi ).
Theorem VIII.6. Let Q be the optimalllalue o/the objective function in {PnJ. Then
Q is the optimal value 0/ the objective fu,nction in {P}, and x = (xp-"'xn), with xi
{i=l, ... ,n} as defined above, is an optimal solution 0/ (P).
Proof. Replacing the functions Fi (i=n-l, ... ,l) in the problems (Pi) (i=n-l, ... ,l)
by their defining expressions we can easily see that Q is the optimal value of the ob-
jective function and xis a minimizer of the function E f (x(a» over the domain
aeA a
433
When solving the subproblems (Pi)' the difficulty is that the functions Fi_ 1(hi_ 1)
which occur in the objective functions of these subproblems are defined only impli-
citly. One way to overcome this difficulty is as follows: first approximate the func-
tions Fi (·) (which are nonnegative and concave by Proposition VIII.lI) with certain
polyhedral concave underestimators ",~, and for i=n,n-l, ... ,1 solve the approximate
subproblems (P~) obtained from (Pi) by substituting "'~ for Fi_ 1. Then use the so-
lutions of (P~), ... ,(P~) to define functions "'} that are better approximations to Fi
than "'~, and solve the new approximate subproblems (P 1), ... ,(P 11), and so on. It
1 n
turns out that with an appropriate choice of approximating functions ",~,,,,}, ... ,~, ... ,
this iterative procedure will generate an optimal flow x after finitely many iter-
ations.
Recall that for any real valued function f/J(h) defined on a set dom "', the hypo-
lnitialization:
k,I. Solve
(P~)
obtaining an optimal solution (x~, hLl) and the optimal value t~ of (P~).
If i ~ 2, set i t - - i-I and return to k.l. Otherwise, go to k.2.
k,2. If t~ S ~(h~) for a11 i=I, ... ,n-l, then stop: xk = (x~, ... ,x~) is an optimal ßow.
Otherwise, go to k.3.
k,3. Construct a new concave underestimator ~+1 for each Fi such that the
hypograph of ~+1 ia the eonvex hull of the set obtained by adjoining the point
(hf,tf) to hypo ~ , i.e.,
(54)
Go to iteration k+ 1.
435
some remarks on how to construct the functions .,pf+l and how to solve the subprob-
k
lem (Pi).
u.
Thus, for any hi E IR+ 1 we have
k . k .
= sup {t: t ~ E s.tJ, E s.hJ(u) ~ h.(u) Vu E U1· ,
j=l J 1 j=l J 1 1
k
E s· ~ 1 , s· ~ 0 Vj=O, ... ,k}
j=l J J
k . k .
= sup {E s.tJ: E s.hJ(u) ~ h.(u) Vu E U1. ,
j=l J 1 j=l J 1 1
(55)
k
E s· ~ 1 , s· ~ 0 Vj=O, ... ,k}.
j=l J J
For a given hi E IR ~i the value of .,pf+1(hi ) is equal to the optimal value of the linear
program (55).
On the other hand, since hypo .,pf+1 is a polyhedral convex set, .,pf+1 can be ex-
pressed as a pointwise minimum of affine functions. The graphs of these affine func-
tions are hyperplanes through the nonvertical facets of hypo .,pf+ 1. Starting with the
o U.
fact that .,pi has a unique nonvertical facet, namely IR lx{O}, and using formula (54),
we can inductively determine the nonvertical facets of .,pi,.,p~, ... ,.,pf+1 by the poly-
436
hedral annexation technique (see Section VI.4.3). In the present context this pro-
cedure works in the following way.
Consider the dual problem of (55):
U.1
rElR+ ,t~O. (58)
Since v{+l(hi ) is equal to the optimal value in (55), it is also equal to the optimal
value in (56)-{58). Furthermore, since v{+1(hi ) is finite, the optimal value in
(56)-{58) must be achieved at least at one vertex of the polyhedral convex set Zf
defined by (57) and (58). Let Ef denote the vertex set of Zf. Then Ef is finite, and
we have
Therefore, in order to obtain an explicit expression for t/lf+l we compute the vertex
set Ef of zf· Clearly, E~ = {O} for all i=I, ... ,n-l, and Zf+1 differs from Zf by just
one additional linear constraint. Hence, we can compute the vertex set of Zf by the
methods discussed in Section 11.4.2. Since Zf ( IR Ui"IR, the above procedure is prac-
k
S.t. x. E X(G., d
1 1
+ h.),
1
h. leu) =};
1-
x.(a)
aEA~(u) 1
Vu EU. I'
l-
I
Recalling now that Gi = (Wi , Ai) and using an artificial node w, let us define the
network U.1 = (W.,X.),
1 1
where W.1 = W.1 U {w}, X.1 = A.1 U {(w,u): u EU.1-I} and to
each arc a = (w,u) (u E Ui_ l ) one assigns the linear cost function fa(x) = r(u)x.
lution (xf,hf_l) such that xf is an extreme flow in Gi' Hence, we may assume that
for every k, xk = (xf, ... ,x~) is an extreme flow in G.
In order to show convergence of the above algorithm, we first formulate the fol-
lowing propositions.
Proposition vm.13. For any i=l, ... ,n-l and any k=O,1,2, ... we have "'~(hJ ~
F/hJ Vhi E IR ~i (i.e., "'~ is actually an underestimator of Fi ).
U.
Proof. Since Fi is nonnegative on IR+I (i=I, ... ,n-l), it is immediate that
o U.
1Pi(hi) ~ Fi(hi ) Vhi E IR+ I (i=l, ... ,n-l). (60)
Furthermore, since "'~(hl) ~ F l(h l ), we further have hypo "'~ ( hypo F l' Therefore,
by (60), hypo 'I/J~ (hypo F 1 Vk, or, equivalently, 'I/J~(hl) ~ FI(h l ) Vk. Since
438
and hypo 'IjJ~ ( hypo F 2' it follows from (54) that ~(h2) ~ F 2(h2) for all k.
By the same argument we have
•
Proposition VIll.14. I/at some iteration k
or, equivalently,
,
vlf, = 'IjJ~+1 V i=l, ... ,n-l, (62)
.k k k
'/{J;(h.) = F.(h.)
1 1 1 1
(63)
for all i=1, ... ,n-1. Indeed, for i=1 it is obvious that
k 1fJi(h.
F.(h.)> .k k) =t/J.k+1( h.k) >t.=mf(P.)
k. k
11-1111-1 1
h. I(U) = E k .
x.(a) Vu E u. I} = F.(h.)
1- A- () 1 1- 1 1
aE i-I u
of (P).
•
Theorem VIII.7. Algorithm VII1.9 terminates after finitely many iterations at an
Proof. We first show that for any fixed i E {0,1, ... ,n-1} there is a finite collection
.Ni of functions such that 1/It E .Ni for all k=0,1,2, .... Indeed, this is obvious for i=O
since ~ =0 Vk. Arguing by induction, suppose that the claim is true for i = p-1
(p ~ 1) and consider the case i = p. For k = 0, we have 1/1~ =0, while for k ~ 1
(64)
where
treme flows is finite, x~-l mU8~ belong to some finite set X p . Moreover, aince the
quantities hf-1 (i==l, ... ,n-l) are uniquely determined by xk- l , they must belong to
k-l k-l
hp-l e Bp-l ' xp E Xp ,
ud IlillCe ':=~ E JI p-1 ud JfI p-1 is finite by aasumptiOll, it follows from (65) that
(h k- 1 t k- 1) E H xT .
p 'p p P
By virtue of (64), this implies that any '1/1; is the eonvex hull of the union of the
hypograph of 'I/I~ and some subset oft he finite set HpxTp' Henee, ~ itself belongs to
some finite family cH p'
We have thus proved that for any i=0,1, ... ,n-1, cHi is finite. Since for any k
and both 'I/If and 'I/If+1 belong to the finite set cHi' there must exist a k such that
'I/Iki = 'I/Ik+1
i
.
(1=0,1, ... ,n-1) .
In other words, the algorithm must stop at some iteration k. By Proposition VIII.14,
the form
Are a: 1 2 3 4 5 6 7 8
Are a: 9 10 11 12 13 14 15 16
Fixed eost b(a): 15.7 15.7 15.5 15.7 50.5 41.5 55 41.5
Are a: 17 18 19
Fixed cost b(a): 15.7 15.7 41.5
1 .,
2 2 3 3 4
7 8
5
11
5 9 9
19
---3>---------3------ 10
Fig. VIII.2
Iteration 0:
(Here we wnte xj rather than xO(j) to denote the value of the flow xO in Me j).
xo
19 = 39.5.
Solving (P~):
a = 68.2, X20 = 30.7, x0 = 1.5, x0 = 0, X I) = 2.5, ~() = 0, Xr0 = 32, X 0 = 0, x 0 =
xl 3 4 s s 9 0,
a
x :::::: O.
10
443
0.2.: °
t~ = 599.3 > = 1/J~(h~) (stopping criterion not satisfied).
Iteration 1:
In this example the total number N of extreme flows is 162. Further numerical re-
5.4. Extension
If we set Xi(hi ):= X(Gi' d + hi) (i=I, ... ,n), Bi (xi +1):= ( E _ xi +1(a),
aeAi +1(u)
u e U.) (i=I, ... ,n-l), f.(x.):= E fa(~(a» (i=I, ... ,n), then it is easily seen that
1 1 1 aeA.
1
(P) (67)
(68)
(69)
444
m· k.
Here, xi E IR / ' hi E IR+1 and it is assumed that:
m.
(i) each fi (·) is a concave function on 1R+ 1;
(ii) each Xi(hi ) is a convex polyhedron, and the point-t<HIet mapping hi t----+ Xi(hi )
is affine, i.e.,
11 + (l-..\)h~')
X.(..\h! 11 + (l-..\)X.(h~')
1 = ..\X.(h!) 11
m. 1 k.
"') each Hi: IR +1+
(11l -I
IR+1.18 a l'meu mappmg.
.
It is obvioU8 that (SUCF) satisfies (i) and (iii). To see that (SUCF) also satisfies
(ii), it suffices to show that any extreme point xi of Xi(..\hj + (l-..\)hi) = X(G i , d +
..\hi + (l-..\)hP is ofthe form xi = ..\xi + (l-..\)xi' with xi E Xi(hi), xi E Xi(hj). But,
since Xi is an extreme flow in Gi = (Vi' Ai)' there exists (by Proposition VIII.10) a
xi, xii be feasible
spanning forest Ti = (Vi'B) such that {a E Ai: ~(a) > O} ( B. Let
flows in Ti = (Vi'B) for the demands d + hj and d + hi ' respectively. Then xi E
Xi(hi), xi E Xi(hP, and hence ..\xi + (l-..\)xi E Xi(..\hi + (l-..\)hi). Since there is a
unique feasible flow in T.1 for the demand d + ..\h!1 + (l-..\)h '1.' we conclude that X.1 =
..\i + (l-..\)xi·
Note that several problems of practical interest can be formulated as special cases
of problem (P). Consider for example the following concave cost production and in-
ventory model (see, e.g., Zangwill (1968».
n
minimize E (p.(y.) + 'lj(h.» (70)
i=l 1 1 1
(72)
445
Here di > 0 is the given market demand for a product in period i, Yi is the amount
to be produced and hi is the inventory in that period. The function Pi(Yi) is the pro-
duction cost, and q.(h.)
1 1
is the inventory holding cost in period i (where both p.(.)
1
x.1 E X.(h.)
1 1
~ x· = (y.,h. 1)' h. 1 + y. = h. + d. ,
1 1 1- 1- 1 1 1
f.(X.)
1 1
= p.(y.)
1 1
+ q.(h.)
1 1
,
A feasible solution x = (xl""'xn ) for (P) can be generated in the following way:
first, choose x n E Xn(O) and compute hn_ l = Hn_l(xn ); then choose xn_ l E
Xn_l(h n_ l ) and compute hn_ 2 = Hn_ 2(xn_ l ), and so on; finally, choose xl E
Xl(h l )·
Therefore, the problem can be considered as a multistage decision process. This sug-
gests decomposing (P) into a sequence of smaller problems which corresponds to dif-
ferent stages of the process.
For each j=l, ... ,n consider the problem
minimize ~
i=l
f.(x.)
1 1
(73)
P -(h.) (74)
J J
h.1 = H.(x·+
1 1 l
) (i=l, ... ,j-l), (75)
h j given. (76)
Proof. Denote by nj(hj ) the set of all (xl""'xj ) that are feasible for Pj(hj ), i.e.,
for each (xl""'xj ) E nihj) there exist hl' ... ,h~l satisfying (73), (74), (75). By in-
duction, it can easily be shown that the point-to--flet mapping hj - I nj(hj ) is affine.
Now let hj = >'hj + (l->.)hj (0 ~ >. ~ 1), and let (xl""'xj ) E nj(hj ) satisfy (xl'''''xj )
= >,(xi,,,,,xj) + (1->')(x1,... ,xj) with (xi, .. ·,xj) E nj(hj) and (x1,.. ·,xj) E nihj).
Since fi (·) is concave, we have fi(~) ~ >.fi(xi) + (l->')fi (xf). Hence, if (xl'''''xj ) is
equations:
F.(h.)
JJ
= min {F. l(h. 1)
,J-,J-
+ L(x.): h. 1
JJ,J-
= H. l(x,), x. E X.(h.)}
,J- J J JJ
(j=2, ... ,n).
hedra Xj(hj ) by their vertex sets. Since the laUer sets are finite, one can compute
the function F 1(h 1) (i.e., the tableau of its values corresponding to different possible
values of h 1). Then, using the recursive equation, one can find F 2(h2) from
F 1(h 1), .. ·, and finally F neO) from F n-1 (hn_ 1). However, this method is practical
Under more general conditions it is easy to see that the same method that was de-
veloped in Section VIII.5.3 for the SUCF problem can be applied to solve (P), pro-
vided t'il~t the dimensions of the variables hi are relatively small (but the dimensions
of the variables Xi may be fairly large, as in (SUCF».
CHAPTERIX
methods. In this ehapter we shall study some of the most important examples of
1. BILINEAR PROGRAMMING
and Shetty (1980a), Almeddine (1990), Benett and Mangasarian (1992), Sherali and
Almeddine (1992), Benson (1995), ete.) ean be modeled by the following general
subject to x EX, Y E Y ,
448
n'
Y = {y Eil: By ~ b , y ~ O} ,
with a E IRm, bEIm', p E IRn', q E IRn, and C,A,B are matrices of dimension n'-n,
m-n, m'-n', respectively. Let V(X) and V(Y) denote the vertex set of X and Y, re-
Benett and Mangasarian (1992), SheraIi and Almeddine (1992), Benson (1995).
Problem (BLP) has been extensively studied in the Iiterature for more than
twenty years (e.g., Mangasarian (1964), Mangasarian and Stone (1964) and Altman
(1968); Konno (1976), Vaish and Shetty (1976 and 1977), Gallo and Ulkücü (1977),
Mukhamediev (1982), SheraIi and Shetty (1980), Thieu (1980 and 1988), Czoch-
ralska (1982 and 1982a), Al-Khayyal (1986), SheraIi and Almeddine (1992». We
shall focus on methods that are directly related to concave minimization.
The key property which is exploited in mOlt methods for bilinear programming is
the equivalence of (BLP) with a polyhedral concave minimization problem (see Sec-
where
If Y has at least one vertex, and if we denote the vertex set of Y by V(Y), then the
is the aim of this section to examine some of the methods for which this special-
isation is not trivial. But before turning to the methods, let us mention some general
Proposition IX.I. If problem (BLP) has a finite optimal value (e.g., if X and Y
are bounded), then an optimal solution (x,y) exists such that xE V(X), 11 E V(Y).
Proof. Indeed, the infimum of f(x) over X, if it is finite, must be attained at some
i E V(X). Furthermore, from (3) we see that f(Je) = F(i,Y) for some y E V(Y) (cf.
Theorem 1.3).
•
Of course, by interchanging the roles of x and y, we obtain another equivalent
form of (BLP):
min {g(y): y E Y} ,
pair (i,Y) with i E V(X), YE V(Y) to be an optimal solution of the problem is that
mi n F(x,Y) = F(i,Y) = mi n F(i,y) . (4)
xEX yeY
450
Proposition IX.2. Let (z,y) satisfy (-I). 11 Y= arg mi n F(z,y) (;'.e., y is the
yeY
z
un;,que minimizer 01 F(z,.) over Y), then is a local optimal solution 01 (1).
In view of the finiteness of the set V(X)-V(Y), the situa.tion F(xh+1,yh) <
F(xh,yh) cannat occur infiDitely mur times. Therefore, the above procedure muat
451
terminate after finitely many steps with a pair (xh,yh) such that mi n F(xh,y» =
yeY
h h h+l h .. h
F(x ,y ) = F(x ,y) = ml n F(x,y ) .
xeX
As seen above, problem (BLP) is equiva1ent to each of the following concave min-
imization problems:
To solve (BLP) we can specialize Algorithm V.l to either of these problems. How-
ever, if we use the symmetrie structure of the bilinear programming problem, a more
efficient method might be to alternate the cutting process for problem (5) with the
cutting process for problem (6) in such a way that one uses the information obtained
in the course of one process to speed up the other.
Specifically, consider two polyhedra Xo c X, Yo c Y with vertex sets V(Xo)'
V(Yo)' respectively, and let
fO(xo) = F(x
. (},y0) _( 0) .
= ,O'y (7)
/}-valid cut for the concave program min fO(X O) is given by a vector 1rX (Y O) such
o
that
(8)
(9)
Therefore, setting
(10)
we have fO(x) ~ er for all x E ß X (YO) , i.e., any candidate (x,y) E XOxYO with
o
F(x,y) < er must lie in the region Xl xYO' where
Thus, if Xl = 0 , then
a vector 7fy (Xl) which determines an /}-valid cut for the concave program min
o
gl(YO) , where gl(Y) = min {F(x,y): x E Xl}. (Note that this is possible, provided
yO is a nondegenerate vertex of yO, because er = ~(yO) -t: ~ gl (yO) - E). Setting
(11)
we have ~l(Y) ~ er for all Y E ß y (Xl)' 50 that any candidate (x,y) E X1xYO with
o
F(x,y) < er must lie in the region Xl xY1 ' where
453
Thus, the problem to be considered now is min {F(x,y): x E Xl' Y E Y1}. Of course,
the same operations can be repeated with Xl' Y1 in place of XO' YO. We are led to
the following procedure (Konno (1976)).
Algorithm IX.I.
Step 1: Compute a pair (xO,yO) satisfying (7). If F(xO,yO) - c < a, then reset
al- F(xO,yO)-t:.
Step 2: Construct the cut 1fX (YO) defined by (8), (9), and let ß X (Yo) denote the
° °
set (10). If Xl = Xo\ß X (YO) = 0, then stop: (xO,yO) is a global c-{)ptimal solution
Step 3: Construct the cut 1fy (Xl)' and define the set (11). IfYI=YO\ß y (X I )=0,
°
then stop: (xO,yo) is a global c-{)ptimal solution. °
Otherwise, go to Step 4.
Denote by 1f~ and 1f~ the cuts generated in Steps 2 and 3 of iteration k, respect-
ively. From Theorem V.2 we know that the cutting plane algorithm just deseribed
will converge ifthe sequences {1f~}, {?r~} are bounded. However, iR the general case
the algorithm may no~ converge. To ensure convergence one could, from time to
time, insert a fa.cial cut as described in Section V.2; but trus may be computaiionally
expensive.
For the implemwation of the algorithm, aote that, because oI the specific struc-
ture of the function fo(x), the computation of the numbers 0j defined by (i) reduces
454
simply to solving linear programs. This will be shown in the following proposition.
Proposition IX.3. Let X o = X. Then (Jj equals the optimal value ofthe linear pro-
gram
. o·
ProoC. Define IPP') = ApdJ + min {(q + Cx + ACdJ)y: By ~ b, y ~ O} . From
. T o·
IPj{A) = ApdJ + max {-bu: -B u~q + C(x + AdJ) , u ~ O} .
Hence,
(J. = max A
J
u ~ o.
which is usually deeper than the ooncavity cut 1rX (YO) defined by (8), (9).
o
455
Note that the numbers 0j in (9) depend upon the set YO' If we write O/Y 0) to em-
phasize this dependence, then for Y1 C YOand Y1 smaller than Y0 we have O/Y1) ~
O/YO) Vj, usually with strict inequality for at least one jj i.e., the cut ?l'Xo(Y1) is
usually deeper than 'll'X (Y O)' Based on this observation, Konno's cut improvement
o
procedure consists in the following.
Construct ?l'X (Y O)' then ?l'y (X O) for Xl = XO\~X (YO) and ?l'X (Y l ) for
o 0 0 0
Yl = YO\~Y (Xl)'
o
The cut 'll'X (Y 1) is also an a-valid cut at xO for the concave program
o
min fO(X O)' and it is generally deeper than the cut 'll'X (Y O). Of course, the process
o
can be iterated until successive cuts converge within some tolerance.
This cut improvement procedure seems to be particularly efficient when the prob-
lem is symmetric with respect to the variables x,y, as it happens, e.g., in the bilinear
programming problem associated with a given quadratic minimization problem (see
Section VA).
s. t. Xl + 4x2 ~ 8 , 2Yl + Y2 ~ 8 ,
4x l + x2 ~ 12 , Yl + 2Y2 ~ 8,
3x l + 4x2 ~ 12 , Yl + Y2 ~ 5 ,
xl ~ 0 , x2 ~ 0 , Yl ~ 0 'Y2 ~ 0 .
Applying Algorithm IX.I, where Step 3 is omitted (with Y1 = Yo), we obtain the
following results (e = 0):
1st iteration: xO = PI' Yo = Ql (see Fig. IX.l)j CI! = -10
1 1
cut: ~xl -~x2 ~ 1
Xl f. 0 (shaded region)
456
1
2nd iteration: x = P4 ,y1 = Q4 j Q = -13
~ = e. Optimal solution.
x
2
.3 Y2
Q1
.3
2
FQ
2 .3
Fig. IX.1
Polyhedral annexation type algorithms for bilinear programming have been pro-
posed by several authors (see, e.g., Vaish and Shetty (1976), Mukhamediev (1978».
We present here an al~orithm similar to that of Vaish and Shetty (1976), which is a
direct application of the PA algorithm (Algorithm VI.3) to the concave minim-
ization problem (1).
Because of the special form of the concave function f(x) , the a-extension of a
point with respect to a given vertex xO of X such that f(xO) ~ Q can be computed by
solving a linear program. Namely, by Proposition IX.3, if xO = 0, then the value
457
minimize -a So + qs
We can now give the following polyhedral annexation algorithm (assuming X,Y to
be bounded).
0) Starting with z, search for a vertex x Oof Xo which is a local minimizer of f(x)
over XO' Let a = f(xO). Translate the origin to xO, and construct a cone KO
containing Xo such that for each i the i-th edge of KO contains a point yOi f. °
satisfying f(li) ~ a. Construct the a-extension zOi of yOi (i=I, ... ,n), and find
the (unique) vertex vI of
I
Let VI = {v }, Vi = VI· Set k = 1
1) k
For each v E V salve the linear program
to obtain the optimal value ~v) and a basic optimal solution w (v). If for some
1
Xo I - Xo n {x: v x ~ I}
458
Otherwise, go to 3).
Compute the vertex set Vk+1 of Sk+1' and let V +1 k = Vk+1 \ Vk . Set
k t- k+l and return to 1.
It follows from Theorem VI.3 that this algorithm must terminate after finitely
many steps at an optimal solution of problem (BLP).
A cone splitting algorithm to solve problem (BLP) was proposed by Gallo and
Ülkucü (1977). However, it has been shown subsequently that this algorithm can be
considered as a specialization of the cut and split algorithm (Algorithm V.3) to the
concave program (5), which is equiva1ent to problem (BLP) (cf. Thieu (1980».
Though the latter algorithm may work quite successfully in many circumstances, we
now know that its convergence is not guaranteed (see Sections V.3.3 and VII.1.6). In
fact, an example by Vaish (1974) has shown that the algorithm of Gal10 and Ülkucü
may lead to cycling.
However, from the results of Chapter VII it follows that a. convergent conical al-
gorithm for solving (BLP) can be obtained by specializing Algorithm VII.1 to prob-
lem (5).
459
Recall that for a given x the value fex) is computed by solving the linear program
while the a-extension of x with respect to a vertex xo of X (when f(xO) < er, x f. xO
and fex) ~ er) is computed accbrding to Proposition IX.3. More precisely, the number
max >.
s.t. ->.p(x-xO) + bu ~ pxO - er ,
->.C(x-x O) - BTu ~ CxO + q,
u~ °.
By passing to the dual linear program, we see that, likewise, the number (J is equal
.
mm (0
px - er ) So + (q +
s.t. (p(x-xO))sO + (C(x-xO)) 8 = -1,
80 ~ °, 8 = (8 1 , ... , 8 m ) T ~ °.
Assuming X and Y to be bounded, we thus can specialize Algorithm VII.1 to prob-
Algorithm IX.3.
0) Starting with z, find a vertex xO of X such that f(xO) ~ f(z). Let x be the best
among xO and all the vertices of X adjacent to xO. Let 1 = f(X).
1) Let Q = 1-€. Translate the origin to xO, and construct a cone KOJ X such that
for each i the i-th edge of KO contains a point yOi # xO satisfying f(yOi) ~ Q.
Let QO = (z01 ,z 02 ,... ,ZOn) , where each zOi.IS t he a--extenslon
. 0 f yOi . Let
LP(Q,X)
to obtain its optimal value I-'(Q) and basic optimal solution w (Q). If
f(w (Q» < 1 for some Q, then return to 0) with
-1
z ~ w (Q) , X ~X n {x: eQo x ~ 1} ,
5) Let.9'* be the partition of Q*. For each Q E .9'* reset Q = (z l i,... ,zn) with
zi such that f( zi) = Q.
Return to 2) with .9 ~ .9*, .At ~ {lRnHQ*}} U .9'*.
s.t. Xl + x2 S 5 Y1 + 2Y2 S 8 ,
2x 1 + ~S 7 3Y1 + Y2 S 14 ,
3x 1 + x2 S 6 2Y1 S 9 ,
Xl - 2x2 S 1 Y2 S 3 ,
Xl > o, ~ ~ 0 , Y1 ~ 0 , Y2 ~O,
Applying Algorithm IX.3 with t: = 0 and the NCS rule (*) of Section VII.1.6, where
N is very large and pis very elose to 1, leads to the following calculations.
0) Choose x O = (OjO). The neighbouring vertices are (ljO) and (Oj5). The values of
f(x) at these three points: -3, -13/2, -18. Hence, x= (Oj5) and 'Y = -18.
Iteration 1:
01 02 . 01 02
1) Choose QO = {z ,z ), Wlth z = (36/13jO), z = (Oj5) . .Jt 1 =.9 1 = {QO}'
2) Solve LP(QO'X): ~QO) = 119/90> 1, w (QO) = (2j3) with f(w (QO» = -10> J.
3) .ge 1 = {Qo} .
r.P = (30/17;45/7) .
Iteration 2:
2) Both Q01' Q02 are deleted . .ge 2 = 0, and hence the global optimal solution of
x Y2
2
(0;5)
(0;3) (2;3)
(2;3)
X
Y
x Y1
1
(0;0) (1 ;0) (0;0)
Fig.IX.2
In the previous methods we have assumed that the polyhedra X, Y are bounded.
We now present a method, due to Thieu (1988), which applies to the general case,
when X and (or) Y may be unbounded.
This method is obtained by specializing the out er approximation method for concave
A nontrivial question that arises here is how to check whether the function f(x) is
bounded from below over a given halfline r = {xO + Ou: 0 ~ 0< +IJJ} (note that f(x)
lem (BCP». A natural approach would be to consider the parametric linear program
which can be shown to be a linear program. However, this approach is not the best
one. In fact, solving the parametric linear program (12) is computationally expensive
and the subproblem (12) or (13) depends upon xO, and this means that different sub-
problems have to be solved for different points xO.
Thieu (1988) proposed a more efficient method, based on the following fact.
Let f(x) be a concave function defined by
where r(y) e IRn , s(y) e IR and J is an arbitrary set of indices (in the present context,
J = Y, r(y) = p + CTy, s(y) = qy). Let r be the halfline emanating !rom a point
xO e ~ in the direction u.
Proposition IX.4. The function f(x) is bounded /rom below on r if and only if
This shows that f(x) is bounded !rom below on r. In the case p(u) < 0, let yO e J
be such that 'Y = r(yO)u < 0. Then !rom (14) we see that
In the general case, when the index set J is arbitrary, the value p(u) defined by
(15) might not be easy to determine. For example, for the formula
where f*(x*) is the concave conjugate of fex) (cf. RockafelIar (1970)), computing
inf {x*u: x* e dom f*} may be difficult. But in our case, be~ause of the specific
p(u) = in f (p + CTy)u
yeY
= pu + in f
yeY
(Cu)y ~ °.
Thus, in order to check whether fex) is bounded from below over r it suffices to
Note that all of these subproblems have the same constraint set Y, and their ob-
Let us now consider the concave minimization problem (5) (which is equivalent to
For the sake of convenience we shall assurne that both polyhedra X, Y are non-
empty.
On the basis of the above results, we can give the following outer approximation
Algorithm IX.4.
Initialization:
Set Xl = IR~. Let VI = {O} (vertex set ofXl ), Ul = {el, ... ,en} (extreme direction
set of Xl)' where ei is the i-th unit vector of IRn. Set 11 = {l, ... ,m}.
Iteration k = 1.2•....
a) Ir Vi E I k Aiu k ~ 0, stop: problem (5) has no finite optimal solution, and fex) is
unbounded from below on any halfline in X parallel to uk . In this case (BLP) is
unsolvable.
b) Otherwise, select
and go to 3).
xk E argmin {f(x): x E V k} .
(this linear program must be solvable, because f(x k) = pxk + in! {(q + CTxk)y:
y E Y} is finite). Then (xk,yk) is an optimal solution of (BLP).
and go to 3).
466
3) Form
Determine the vertex set V k+1 and the extreme direction set Uk+1 of Xk + 1 from
Vk and Uk (see Section 11.4.2). Set Ik+ 1 = Ik \ {ikJ and go to iteration k+ 1.
Remark DU. In the worst Ca&e, the above algorithm might stop only when k =
m. Then the algorithm would have enumerated not only all of the vertices and ex-
treme directions of X, but also the vertices and extreme directions of intermediate
polyhedra Xk generated during the procedure. However, computational experiments
reported in Thieu (1988) suggest that this case generally cannot be expected to
occur, and that the number of linear programs to be solved is likely to be sub-
stantially less than the total number of vertices and extreme directions of X.
Example IX.2.
s.t. - xl + x 2 ~ 5 Y1 + Y2 + Y3 ~ 6
x1-~~ 2 Y1 -Y2 + Y3 ~ 2
-3x1 + x 2 ~ 1 -Y1 + Y2 + Y3 ~ 2
-3x1 - 5~ ~ -23 , -Y1 -Y2 + Y3 ~-2
X
1
Fig. IX.3
y ,
1
Fig. IXA
468
p(e1) °
= pe1 + min {(Ce1)y: y E Y} = 3 + = 3 > 0.
p(e2) = 5 - 10 = -5 < 0.
max {Aie2: i EIl} = max {1,-4,1,-5} = 1 > 0. This maximum is achieved for i = 1.
Hence i 1 = 1.
12 = 11 \ {i 1} = {2,3,4}.
Iteration 2.
i = 3. Hence i 2 = 3.
Form
X 3 = X 2 n {x: -3x1 + x 2 $ 1} .
Iteration 3.
By solving the problem min {(q + Cx3)y: y E Y}, we then obtain y3 = (4;2;0).
Thus, (x3 ,y3) is a global optimal solution of the BLP problem under consideration.
Note that the polytope Y has five vertices: (2,2,2), (2,0,0), (0,2,0), (4,2,0), (2,4,0).
469
2. COMPLEMENTARlTY PROBLEMS
Section 1.2.5). In this section, we shall consider the concave complementarity problem
(CCP), which can be formulated as follows:
Given a concave mapping h: !Rn -i !Rn, Le., a mapping h(x) = (h1(x), ... ,hn (x)),
such that each hi(x) is a concave function, find a point x E !Rn satisfying
n
x ~ 0, h(x) ~ 0, . E xihi(x) =0 . (17)
1=1
Note that in the literature problem (17) is sometimes called a convex complement-
arity problem.
(Lemke (1965), Tomlin (1978)) - and other pivoting methods due to Cottle and
Dantzig (1968), Murty (1974), and Van der Heyden (1980), are guaranteed to work
only under restrictive assumptions on the structure of the problem matrix M. Re-
cently, optirnization methods have been proposed to solve larger dasses of linear
complementarity problems (Mangasarian (1976, 1978 and 1979), Cottle and Pang
(1978), Cheng (1982), Cirina (1983), Ramarao and Shetty (1984), AI-Khayyal
470
(1986, 1986a and 1987), Pardalos (1988b), Pardalos and Rosen (1988) (see also the
books of Murty (1988), Cottle, Pang and Stone (1992), and the survey of Pang
(1995)).
In the sequel we shall be concerned with the global optimization approach to com-
plementarity problems, as initiated by Thoai and Tuy (1983) and further developed
in Tuy, Thieu and Thai (1985) for the convex complementarity problem and in Par-
dalos and Rosen (1987) for the (LCP). An advantage of this approach is that it does
not depend upon any special properties of the problem matrix M (which, however,
has to be paid for by a greater computational cost).
one can reduce the concave complementarity problem (17) to the concave minim-
ization problem
x ~ 0 , Mx + q ~ 0 .
Many solution methods for (LCP) are based on this property of (LCP). From the
equivalence between (17) and (19) it also follows that, in principle, any method of
solution for concave minimization problems gives rise to a method for solving con-
cave complementarity problems. In practice, however, there are some particular fea-
tures of the concave minimization problem (19) that should be taken into account
when devising methods for solving (17):
1) The objective function f(x) is nonnegative on the feasible domain D and must be
2) The feasible domain D, as well as the level sets of the objective function f(x),
may be unbounded.
Furthermore, what we want to compute is not really the optimal value of (19),
but rat her a point x(if it exists) such that
XE D , f(x) = 0 .
Observe that, since the functions h(x) and f(x) are concave, both of the sets D, G
are convex. Thus, the complementarity problem is a special case of the following
Given two convex sets D and G, find an element 01 the complement 01 G with
respect to D.
472
In Chapters VI and VII we saw that the eoneave minimization problem is also
closely related to a problem of this form (the "(DG) problem", cf. Seetion VI.2). It
turns out that the proeedures developed in Seetions VI.2 and VII.1 for solving the
(LCP) x eD , f(x) = °,
where D = {x: x ~ °,Mx + q ~ O}, f(x) =
n
. E min {xi' Mix + qJ
1=1
°
Let xO be a vertex of the polyhedron D. If f(xO) = then xO solves (LCP). Other-
wise, we have f(xO) > 0.
We introduee the slaek variables xn +i = Mix + qi (i=1, ... ,n), and express the
basic variables (relative to the basic solution xO) in terms of the nonbasic ones. If we
then change the notation, we can rewrite (LCP) in the form
y ~ 0, Cy +d ~ °, f(y) = °, (20)
fi = {y: y ~ °,Cy + d ~ O} ,
and f(y) ~ °for all y e fi. Setting
G= {y: f(y) > O} ,
we see that °Eint G, and all of the eonditions assumed in the (fiG) problem as for-
mulated in Seetion VI.2 are fulfilled, exeept that G is an open (rather than closed)
473
set and D and G may be unbounded. Therefore, with some suitable modifications,
the polyhedral annexation algorithm (Section VI.2.4) can be applied to solve (20),
In Section V1.2.6 it was shown how this algorithm can be extended in a natural
way to the case when D and G may be unbounded, but D contains no line and inf
f(D) > -1IJ (the latter conditions are fulfilled here, because D ( IR~ and f(y) ~ 0 for
all y E D). On the other hand, it is straightforward to see that, when G is open, the
JL(zf) < 1 rather than JL(v k) ~ 1. We can thus give the following
Using a vertex x Oof the polyhedron D, rewrite (LCP) in the form (20).
- i
0i = sup {t: f(te) ~ O} (i=I, ... ,n) ,
where ei is the i-th unit vector of IRn (Oi > 0 because f(O) > 0 and f is
continuous). Let
1
SI = {Y: Yi ~ 7r. (i=l, ... ,n)}
Set VI = {vI}, Vi = VI' k = 1 (for k > 1, Vkis the set of new vertices of Sk).
k
k.1. For each v E V solve the linear program
to obtain the optimal value JL(v) and a basic optimal solution w (v) (when
JL(v) = +00, w(v) is an extreme direction of D over which vx -I +00). If for some
k
v E V the point w(v) satisfies f(w(v») = 0, then stop. Otherwise, go to k.2.
474
k.2. Select vk E argmax {JL(v): v E Vk}. If J.i(vk ) < 1, then stop: (LCP) has no
solution. Otherwise, go to k.3.
Vx e Ö).
Form the polyhedron
Theorem IX.I. The above algorithm termiRl.&tes after finitely many steps, either
yielding a solution to (LCP) or else establishing that (LCP) has 11.0 solution.
Proof. Denote by P k the polar set of Sk. It is easily verified that P 1 is the convex
hull of {O,ul, ... ,un}, where ui ia the point sii if Si < +m, and the direction ei if
Si = +m. Similarly, P k +1 is the convex hull of P k u {zk}, where zk is the point Tk )-
if Tk < +m and the direction )- if Tk = +m (see Section VI.2.6). Since each
ate after finitely many steps. If it terminates at a step k.I., then a solution y of (20)
> 0 for all y E D. This implies that fex) > 0 for al1 xE D.
{(y)
•
Note that the set Vk may increase quickly in size as the algorithm proceeds.
Therefore, to alleviate storage problems and other difficulties that can arise when Vk
475
becomes too large, it is recommended to restart the algorithm from some w(v) E V k'
i.e., to return to Step 0 with xOI - w(v) and
~ 1 ~
D I - D n {y: v y ~ I} , f I - f ,
where vI is the vertex computed in Step 0 of the current cyde of iterations. At each
rest art the feasible domain is reduced, while the starting vertex xO changesj there-
Like the polyhedral annexation algorithm, the conical (DG) procedure in Section
VII.1.2 can be extended to solve the problem (20). This extension, however, requires
In the coniCal (DG) procedure, when the sets D and G are not bounded (as assu-
med in Section VII.1.2), the optimal value /-I(Q) of the linear program LP(Q,D) asso-
ciated with a given cone K = con(Q) might be +111. In this case, the procedure might
generate an infinite nested sequence of cones Ks = con(Qs) with /-I(Qs) = +111. To
avoid this difficulty and ensure convergence of the method, we need an appropriate
subdivision process in conjunction with an appropriate selection rule in order to pre-
vent such sequences of cones from being generated when the problem is solvable.
To be specific, consider (LCP) in the formulation (20). Let ei be the i-th unit
vector of IRn, and let HO be the hyperplane passing through e 1, ... ,en , i.e.,
n
HO = {y: E y. = I}. Then for any cone K in lR+ n we can define a nonsingular nlCn
i=l 1
matrix Z = (v 1, ... ,vn ) whose i-th column vi is the intersection of HO with the i-th
edge of K. Since {CO) > 0 and the function {is continuous, we have
i = 0 if Ui = +ID).
)..
(here, as before, it is agreed that
1
• • • T
Denote by J.'(Z) and)' = ().l'.·.,An) the optimal value and a basic optimal solu-
tion of this linear program (when J.'(Z) = +ID, Xdenotes an extreme direction of the
i
)..
polyhedron CZ). + d ~ 0, ). ~ 0 over which ~ -I +ID).
1 1
Proposition IX.7. Let l(y) = 1 be the equation 0/ the hyperplane passing through
i = Uivi (i EI) which is paraUel to the directions vi (j ~ I), where I = {i: Ui < -kI}.
Then J.L(Z} and w(Z} = zX are the optimal value and a basic optimal solution 0/ the
linear program
n .
Proof. Since Z is a nonsingular nxn matrix, we Can write y = Z). = E ).. VI =
i=l I
E
).. .
iEI i
i 1 .
z\ with ). = z- y. Hence, noting that l(i) = 1 (i EI), we see that:
)... )..
l(y) = E fl(ZI) = E f.
iEI i iEI i
Proof. The hyperplane i(y) = p,(Z) passes through z*i = "tvi (i E I) and is paral-
t.
1
~ ° (i=l, ... ,n) and E t. ~
~I 1
1. But clearly f(z*i) > ° (i E I) and f(tv j ) > ° (j t I)
for all t > 0. Rence f(y) > 0 by the concavity of f(y).
•
Now for each cone K with matrix Z = (v 1,... ,vn ) define the number
where the Bi are computed according to (21). Let .9t be the collection of matrices
that remain for study at a given stage of the conical procedure. It turns out that the
following selection rule coupled with an exhaustive subdivision process will suffice to
ensure convergence of this procedure:
(v 1,... ,vn ) is such that Bi< +111 Vi and P(Z) < +lIIj K is said to be ofthe second ca~
egorg otherwise. If there exists at least one cone of the first category in .9t, then
choose a cone of the first category with maximal P(Z) for furt her subdivisionj other-
x~O,Mx+q~O,
is available such that f(xo) > 0. Using xO, rewrite (LCP) in the form (20). Select an
exhaustive cone subdivision process.
0) Let Z1 = (e 1,e2,... ,en), where ei is the i-th unit vector of IRn, and let
k.1. For each Z E .9 k solve the linear program LP(Z,D) (see (22)). Let p(Z) and X
be the optimal value and a basic optimal solution of LP(Z,D), and let w(Z) = zt
478
If f(w(Z» =0 for some Z e .9k ' then terminate. Otherwise (f(w(Z» > 0
VZ e .9k ), go to k.2.
k.2. In .Jt k delete a1l Z e .9k satisfying J.l(Z) < 1. Let ~ k be the collection of the
k.3. Let ~ LI) = {Z e .9k: Z = (v1,... ,vn), Bi < +m Vi, J'(Z) < +m}, where Bi is
defined by (21). If ~ LI) # 0, then select
Otherwise, select
Zk e argmin {O(Z): Z e ~ k} .
k.4. Let .9'k+l be the partition of Zk' and let .At: k+l be the collection obtained
from .9l k by replacing Zk with .9'k+l . Set k t - k+l and return to k.1.
Lemma IX.1. For every point v ofthe simplex [e 1, ... ,en] denote by r(v) the half
line /rom °through v. If r (ti*) n b is a line segment [0, y*], then for aU v e
[e 1, ... ,en] sufficiently close to V*, r(v) n b is a line segment [O,y] satisfying y-t y*
asv-tti*.
Proof. Since 0 e :Ö, we must have d ~ o. Consider a point v* such that r(v*) n:ö
is a line segment [O,y*]. First Suppose that y* # o. Then, since y* is a boundary
point of :Ö, there exists an i such that Ciy* + di = 0 but >.Ciy* + di < 0 for all
>. > 1, Le., di > O. Rence, J = {i: di > O} # 0. Define
479
C.u
g(u) = max { - i-: i E J} .
1
of the lemma.
•
Lemma IX.2. 1/ Ks ' sE ß ( {1,2, ... }, is an infinite nested sequence 0/ cones 0/
the first category, then n Ks = r is a ray such that r n iJ = [0, y*] with
SEß
y* = lim w(Z) .
SEß
8-!m
Proof. Recall that Ks denotes the cone to be subdivided at iteration s. From the
exhaustiveness of the subdivision process it follows that the interseetion r of all Ks'
s E ß, is a ray. Now, since Ks is of the first category, the set Ks n {y: ls(Y) ~ JL(Zs)}
(which contains Ks n D) is a simplex (ls(Y) = 1 is the equation of the hyperplane
tersections of the rays through y* and w(Zs)' respectively, with the simplex
[e l , ... ,en], then, as s --+ m, S E ß, we have vB --+ v*, and hence, by Lemma IX.l,
•
480
Lemma IX.3. 11 the cone Ks chosen lor further subdivision is 01 the second cat-
egory at some iteration s, then f7y) > 0 lor aU y E lJ in the simplex T s = {y E IR~ :
n
E y.~ O(Z)}.
i=l '
Proof. The selection rule implies that st ~1) = 0 and 6(Zs) ~ 6(Z) for all Z E st s'
k ~ s (in which case l(y) > 0), or else belongs to a cone with matrix Z E st s' In the
n
latter case, since O(Zs) ~ 6(Z), we have . E Yi ~ 6(Z), and it follows from the de-
1=1
finition of 6(Z) that l(y) > O.
•
Lemma IX.4. 11 the algorithm generates infinitely many cones Ks 01 the second
category SE/). ( {1,2, ... }, then O(Z) ~ IJJ as s ~ IJJ, sE/)..
Proof. Among the cones in the partition of K 1 = IR~ there exists a tone, say K s '
1
that contains infinite1y manY cones of the second category generated by the algo-
rithm. Then among the cones in the partition of Ks there exists a cone, say K s '
1 2
that contains infinite1y manY cones of the second category. Continuing in this way,
we find an infinite nested sequence of cones Ks ' v=1,2, ... , each of which contains in-
v
finitely many cones of the second category. Since the subdivision is exhaustive, the
IJJ
intersection n Ks = r is a ray. If r contains a point y such that f(y) < 0, then,
v=1 v
since y t :Ö, there exists a ball U around y, disjoint from :Ö, such that f(u) < 0 for
all u EU. Then for v sufficiently large, any ray contained in K s will meet U. This
v
implies that for all k such that Kk ( Ks ' we have ~ < IJJ (i=1, ... ,n) and J.&(Zk) < IJJ.
v
That is, all subcones of K s generated by the algorithm will be of the first category.
v
This contradicts the above property of Ks . Therefore, we must have f(y) ~ 0 for all
v
y Er. Since fis concave and f(O) > 0, it follows that f(y) > 0 for all y Er.
481
For any given positive number N, consider a point CEr and a ball W around c such
n
that . E Yi > N and f(y) > 0 for all y E W. When 11 is sufficiently large, say 11 ~ 110,
1=1
sll,i sll,i
the edges of Ks will meet W at points y such that f(y ) > 0 (i=1, ... ,n). Since
11
n sll,i
E YJ' > N, it follows that Os i > N (i=1, ... ,n), and hence O(Z ) > N. Now for
j=1 11' sll
any s E 11 such that s ~ sll ' the cone Ks must be a subcone of some cone in.9t .
o ~
o
Therefore, O(Zs) ~ O(Zs )
110
> N.
•
Theorem IX.2. If Algorithm IX.5 generates infinitely many cones of the second
category, then the LOP problem has no solution. Othenoise, beginning at some
iteration, the algorithm generates only cones ofthe first category. In the latter case, if
the algorithm is infinite, then the sequence J = w(ZII has at least one accumulation
point and each of its accumulation points yields a solution of (LOP).
the second category, sE tl C {1,2, ... }. By Lemma IX.3, the problem has no solution
n
in the simplices T s = {y E IR~: . E Yi ~ O(Zs)}' On the other hand, by Lemma IXA,
1=1
O(Zs) --+ mas s --+ m, S E 11. Therefore, the problem has no solution.
Now suppose that for all k ~ kO' Jl k consists only of cones of the first category. If
the algorithm is infinite, then it generates at least one infinite nested sequence of
cones K s' s E 11 C {1,2, ... }. Since all K s' s ~ kO' are of the first category, it follows
m
from Lemma IX.2 that lim w(Zs) = y*, where y* is such that [O,y*] ~ f> n n Ks '
S-+m s=1
Now consider any accumulation point y of {w(Zk)}' for example, y= lim w(Zk ).
r-+m r
Reasoning as in the beginning of the proof of Lemma IXA, we can find an infinite
nested sequence of cones Ks ' s E tl' C {1,2, ... }, such that each Ks ' s E 11', contains
infinitely many members of the sequence {K k ,r=1,2, ... }. Without loss of gen-
r
482
erality, we may assume that Kk C Ks (s E ~I). Since the sub division process is ex-
s
haustive, the intersection of all of the Ks' s E ~', is the ray passing through y. If
f(Y) > 0, then around some point c of this ray there msts a ball U such that
[O,YJ nU = 0 and f(u) > 0 Vu E U. Then for all sufficiently large s E ~', we have
[O,w(Zk )] n U = 0.
s
k ,i
On the other hand, the i-th edge of the cone Kk meets U at some point u s .
s
k,i k,i k,i
Since f( u s ) > 0, it follows that u s will lie on the line segment [O,z s ]. Con-
sequently, since P(Zk) ~ I, the line segment [O'w(Zk)] meets the simplex
s s
k ,I k ,n k k k
[u s ,,,.,u S ] at some point u s. Then we have u s tU (because u sE [O,w(Zk )]
s
k k ,I k ,n
C ~ \ U), while u sE [u S ,,,.,u s ] cU. This contradiciion shows thai f(Y) ~ 0,
Corollary IX.2. For any E > 0 and any N > 0, Algorithm IX.5 either fonds an
E-approzimate solution 01 (20) (i.e., a point y E jj such that /(y) < E) after flnitely
many iterations or else establishes that the problem has no solution in the baU
lIylI <N.
proof. H the first alternative in the previous theorem holds, then, for all k such
n
thai Kk is of the second category and the simplex Tk = {y E IR~: i!l Yi ~ O(Zk)}
contains the ball lIylI < N, it follows that we have f(y) > 0 for all y E D with
lIylI < N.
H the second alternative holds, then for sufficiently large k we have l(w(Zk» < E. •
Remarb IX.2. (i) As in the case of Algorithm IXA, when ~k becomes too
large, it is advisable to rest art (i.e., to return to step 0), with xO......- w(Z) and
f(w(Z)) > 0, and where ll(y) = 1 is the equation of the hyperplane in Proposition
IX.7 for ZI constructed in Step O.
(ii) When the problem is known to be solvable, the algorithm can be made finite as
follows. At each iteration k, denote by .9'~1) the set of cones of the first category in
.9'k ' and let yl = w(ZI)' yk e argmin {f(yk-l), f(w(Z)) VZ e .9'~1)} (i.e., yk is the
best point of:ö known up to iteration k). If l is a vertex of :Ö, let yk = yk j other-
wise, find a vertex yk of:ö such that l(yk) ~ f(yk). Since f(yk) ~ 0 (k ~ m) and the
vertex set of:ö is finite, we have f(yk) = 0 after finitely many steps.
(iii) To generate an exhaustive subdivision process one can use the rules discussed
in Section VII.1.6, for example, the rule (*) in Section VII. 1. 6, which generates
mostly (V-fIubdivisions. Note that when w(Zk) is a direction (i.e., ~Zk) = +m), the
(V-fIubdivision of Zk is the subdivision with respect to the point where the ray in the
direction w(Zk) intersects the simplex [Zk] defined by the matrix Zk'
algorithm for (LCP) proposed by Thoai and Tuy (1983). The major improvement
consists in using a more efficient selection rule and an exhaustive subdivision process
which involves mostly (V-fIubdivisions instead of pure bisection.
The above methods for solving (LCP) are based on the reduction of (LCP) to the
n
minimize E min {~ , MXi + q.} S.t. x ~ 0, Mx + q ~ 0 .
i=1 1
484
Since min {xi' MXi + qi} = xi + min {O, Mix - xi + qi}' by introducing the auxili-
ary variables w. we can rewrite this concave minimization problem in the separable
1
form
n
minimize . E {x i + min (O,wi )}
1=1
Bard and Falk (1982) proposed solving this separable program by a branch and
bound algorithm which reduces the problem to aseries of linear programs with con-
In the general case when M is indefinite, (23) becomes a harder d.c. optimization
problem (see Chapter X).
Pardalos and Rosen (1988) showed that (LCP) is equivalent to the following
mixed zer~ne integer program:
485
maximize a
S.t. 0<- M.y
1
+ q.a<
1 -
1 - z.1 (i=l, ... ,n)
(MIP)
o~ Y i ~ zi (i=l, ... ,n)
Of course, here we assume that qi < 0 for at least one i (otherwise x = 0 is an ob-
vious solution of (LCP».
Proposition IX.S. If (MIP) has an optimal solution (a,y,z) with Q > 0, then
x= y / Q solves (LCP). If the optimal value of (MIP) is Q = 0, then (LCP) has no
solution.
else zi = 1 (and hence Miy + qiQ = 0, i.e., Mix + qi = 0). Therefore, x solves
(LCP).
Now suppose that Q = O. If (LCP) has a solution x, then we have max {xi' Mix +
qi (i=l, ... ,n)} > O. Denote by a the reciprocal of this positive number. Then a feas-
ible solution of (MIP) is a, y = ax, zi = 0 if xi = 0, zi = 1 if xi > O. Hence we have
a ~ Q = 0, a contradiction. Therefore, (LCP) has no solution.
•
Using this result, Pardalos and Rosen suggested the following method of solving
(LCP):
2. Choose n orthogonal directions ui (i=l, ... ,n), and solve the linear programs
min {cTx: x E D} with c = ui or c = _ui (i=l, ... ,n). This will generate k ~ 2n
486
vertices x.i of D (j E J). If f(,J) = 0 for some j, then stop. Otherwise, go to 3).
3. Starting from the vertex ,J (j E J) with smallest f(,J) , solve the quadratic
progra.m (23) (by 80 loca.l optimization method) to obtain 80 Kuhn-Tucker point
ri. If f(ri) = 0, then stop. Otherwise, go to 4)
In this approch, (MIP) is used only 80S a last resort, when loca.l methods fail.
From the computationa.l results reported by Parda.los and Rosen (1987), it seems
that the average complexity of this a.lgorithm is O(n4).
Let us a.lso mention another approach to (LCP), which consists in reformulating
(23) as a bilinea.r progra.m:
Since the constraints of this progra.m involve both x and w, the standard bilinear
programming a.lgorithms discussed in Section IX.l cannot be used. The problem can,
however, be handled by a method of Al-Khayya.l and Fa.lk (1983) for jointly con-
strained biconvex progra.mming (cf. Chapter X). For details of this approach, we
refer to Al-Khayya.l (1986a).
where h: IRn _ IRn is 80 given concave mapping such that hi(O) < 0 for at least one
i=I, ... ,n. Setting
n
D = {x: x ~ 0, h(xH O} , G = {x: . E min {xi' hi(x)} > O} ,
1=1
487
we saw that the sets D and G are convex, and the problem is to find a point
xE D \ G.
By translating the origin to a suitable point x E G and performing some simple
Given an open convex set G containing the origin 0 and a closed convex set
able, then a solution always exists on the part of the boundary of :ö that consists of
points x such that x = Oy with y E IR! \ {Ol, 0 = sup {t: ty E D}. Therefore, re-
Algorithm IX.7.
where Z = (v1, ... ,vn ), 0i = sup {t: tvi E O} (i=l, ... ,n), >. = (>'l' ... ,>'n)T (as
>..
usual, .,l-
u,
= 0 if O. = +111). Let p.(Z) and ~ be the optimal value and abasie
1
1
k.2. In .At k delete all Z E .9l k satisfying p.(Z) < 1. Let st k be the collection of
remaining elements of .At k' If st k = 0, then terminate: :ö ( Ö (the problem has
no solution). Otherwise, go to k.3.
k.3. Let st ~l) = {Z E st k: Z = (v1, ... ,v n ), Bi < +00 Vi, p.(Z) < +ID}. If st p) f 0,
select
Otherwise, select .
Zk E argmin {O{Z): Z E st k} .
Subdivide (the cone generated by) Zk according to the chosen exhaustive sub-
division rule.
k.4. Denotej. = w(Zk)' If j. E :Ö, then set Dk+1 = Dk. Otherwise, take a vector
k k -1c 1c - -k 1c
P such that the halfspace p (x - t..r) ~ 0 separates ur {rom D (where w = Bt..r,
B= sup {t: tt..r1c E D})
- k -k
and set Dk +1 = Dk n {x: p (x - w ) ~ O}.
k.5. Let .9lk+1 be the partition of Zk obtained in Step k.3., and let .At k+1 be the
collection that results {rom .At k by substituting .9lk+ 1 for Zk' Set k I - k+1 and
go to k.1.
As before, we shall say that a cone K with matrix Z = (vl, ... ,vn ) is of the first
category if Bi < +00 Vi and p.(Z) < +a, and that it is of the second category other-
wise.
489
Theorem. IX.3. II Algorithm IX.6 generates infinitely many cones 01 the second
category, then the convex complementarity problem has no solution. Otherwise, be-
ginning at some iteration, the algorithm generates only cones 01 the first category. In
the latter case, il the algorithm is infinite, then the sequence J = w(Z~ has at least
one accumulation point, and any 01 its accumwation points yields a solution 01 the
problem.
Proof. It is easily seen that Lemmas 1X.3 and IX.4 are still valid (f(y) > 0 means
that y e G), and hence the first part of the theorem can be established in the same
way as the first part of Theorem 1X.2.
Now suppose that for all sufficiently large k, jIlk consists only of cones of the first
category. H the algorithm is infinite, then it generates at least one infinite nested se-
quence of cones K s of the first category, s e 11 ( {1,2, ... }. For each s, let
Zs = (vs,l ,... ,Vs,n) , Zs,i = (Js,ivs,i , (Js,i = sup {t : t vs,i e G} ('1=1,... ,n ) , an d 1et
ls(X) = 1 be the equation of the hyperplane through zs,l, ... ,zs,n (so that
tJ e argmax {ls(x): xe Ds n Ks}; see Proposition 1X.7). Then, denoting the smallest
index s e 11 by sI ' we find that tJ e {x e Ks : ls (x) ~ p.(Zs )} for all s e 11, Le.,
1 1 1
the sequence {tJ, s e Ll} is bounded, and hence must have an accumulation point.
Consider any accumulation point i of the sequence {Jt} (for all k sufficiently large,
k
J<is a point). For example, let i = lim w r. Reasoning as in the proof of Lemma
r-+m
1X.4, we can find an infinite nested sequence of cones Ks ' s e 11' ( {1,2, ... }, such that
each K , s e 11', contains infinitely many members of the sequence {K k ' r=1,2, ... }.
8 . r
Without loss of generality we may assume that K k (Ks (s e 11') and some K s is
8 1
of the first category. It is easily verified that all of the conditions of Theorem 11.2
k
are fulfilled for the 8et D n {x e Ks : ls (x) ~ p.(Zs )} and the sequences {w r},
1 1 1
k
{p r}. Therefore, by this theorem, we conclude that i e D.
On the other hand, the exhaustiveness of the subdivision process implies that the
490
k ,1 k ,n _ kr,i k ,i
simplex [Zk ] = [v r ,... ,V r ] shrinks to a point'/: Hence z = 0k i V r con-
r r'
k i
verges to a point z = öV as r -+ 00. Since z r' E aG (the boundary of G), we must
k k
have zE 00. If z r denotes the point where the halfline from 0 through w r meets
kr ,l kr,n . kr _ kr kr
the simplex [z ,... ,z ], then ObvlOusly z -+ z. But w = JL(Zk)z , and
r
k
since JL(Zk ) ~ 1, it follows that x= lim w r t G. Therefore we have xE D \ G. •
r r-'ID
where Dis a polyhedron in IRn, cis an n-vector, and f: IRn -+ IR is a concave function.
We shall call this a parametric concave programming problem, since the minimi-
zation problem in (25) is a concave program depending on the parameter O.
In a typical interpretation of (PCP), D represents the set of all feasible produc-
tion programs, while the inequality cx ~ 0 expresses a constraint on the amount of a
certain scarce commodity that can be used in the production, and f(x) is the produc-
tion cost of the pro gram x. Then the problem is to find the least amount of the
scarce commodity required for a feasible production program with a cost not ex-
ceeding a given level a.
In the literature, the PCP problem has received another formulation which is
Proof. We may of course assume that min f(D) ~ er, for otherwise both problems
are infeasible. If 0 0 is optimal for (PCP) and xO is an optimal solution of the corres-
ponding concave program, then obviously cxO = 0 0, and x O is feasible for (LRCP),
hence 0 0 ~ 01:= optimal value of (LRCP).
Conversely, if xl is optimal for (LRCP) and cx1 = 01, then 01 is feasible for
(PCP), hence 01 ~ 0 O. Therefore, 01 = 0 0, and xO is optimal for (LRCP), while 01
is optimal for (PCP).
•
An inequality of the form f(x) ~ er, where f(x) is a concave function, is called a re-
verse convez inequalitll, because it becomes convex when reversed (see Chapter I). If
this inequality is omitted, then problem (26) is merely a linear program; therefore it
Jacobsen (1975 and 1975a), Hillestad (1975) and also Hillestad and Jacobsen (1980).
In Bansal and Jacobsen (1975 and 1975a) the special problem of optimizing a net-
work flow capacity wider economies of scale was discussed. Several methods for
globally solving (LRCP) with bounded feasible domain have been proposed since
then. Hillestad (1975) and Hillestad and Jacobsen (1980 and 1980a) developed
methods based on the property that an optimal solution lies on an edge of the poly-
hedron D. These authors also showed how cuts that were originally devised for con-
cave minimization problems can be applied to (LRCP). Further developments along
these lines were given in Sen and Sherali (1985 and 1987), Gurlitz (1985) and Fulöp
(1988). On the other hand, the branch and bound methods originally proposed for
minimizing concave functions over polytopes have been extended to (LRCP) by Muu
(1985), Hamami and Jacobsen (1988), Utkin, Khachaturov and Tuy (1988), Horst
(1988).
For some important applications of (LRCP) we refer to the discussion in Section
1.2.5.
492
The last assumption simply means that the constraint f(x) ~ 0 is essential and
(LRCP) does not reduce to the trivial linear program min {cx: xE D}. It follows
(such a point is provided, for example, by an optimal solution of the linear program
Proposition IX.IO. The set conv(D \ int G) is a polyhedron whose extreme diT'-
ections are the same as those of D and whose vertices are endpoints of sets of the
form conv(E \ int G), where Eis any edge of D.
Proof. Denote by M the set of directions and points described in the proposition.
Obviously, M ( D \ int G (note that, in view of assumption (b), any recession dir-
ection of D must be a recession direction of D \ int G). Hence, convM (
conv(D \ int G).
We now show the inverse inclusion. Suppose z E D \ int G. Since G is convex,
there is a halfspace H = {x: h(x~) ~ O} such that zEH ( IRn \ G. Then, since D n H
493
contains no lines, z belongs to the convex hull of the set of extreme points and direc-
tions of H n D. But obviously any extreme direction of the latter polyhedron is a re-
cession direction of D, while any extreme point must lie on an edge E of D such that
conv(E \ int G). Consequently, D \ int Gis contained in conv M; and, since conv M
is closed (M is finite), conv(D \ int G) ( conv M. Hence, conv(D \ int G) = conv M.
•
Since minimizing a linear form over a closed set is equivalent to minimizing it
over the closure of the convex hull of this set we have
Here the term "implicit" refers to the fact that, although conv(D \ int G) is a poly-
hedron, its constraints are not given explicitly (this constitutes, of course, the main
Proposition IX.n. If (LRCP) is solvable, then at least one of its optimal solutions
lies on the intersection of the boundary 8G of G with an edge of D.
Proof. If (LRCP) is solvable, then at least one of its optimal solutions (i.e., an
optimal solution of (28)), say xO, is a vertex of conv(D \ int G), and hence is an end-
point of the set conv(E \ int G), where E is some edge of D. If f(xO) < a, then xO
can restrict ourselves to the set of intersection points of 00 with the edges of D.
Several earlier approaches to solving (LRCP) are based on this property (see, e.g.,
494
Definition DU. We say that the LRCP problem is regular if D \ int G = cl(D\ G)I
i.e. 1 if any feasible point is the limit of a sequence of points xE D satisfying f(x) < Q.
Thus, if D \ int G has isolated points, as in Fig. IX.5, then the problem is not
regular. However, a regular problem may have a disconnected feasible set as in Fig.
IX.6.
....•...•................••..•.••...•............
D(xO)\G=%
but xOnot opt.
opt.
f(x)=ct
Fig. IX.5
495
.•..........
Fig. IX.6
(29)
Theorem IXA. In order that a feasible solution ');0 be globally optimal for (LRCP;,
it is necessary and, if the problem is regular, also sufficient that
Proof. Suppose D(xO) \ Gis not empty, i.e., there exists a point Z of D(xO) such
that f(z) < Q. Let xl be the point where the boundary 8G of G intersects the line
segment joining z and the point w satisfying (27). Then xl belongs to D \ int G,
and, since cw < cz, it follows that cx l < cz ~ cxO; and hence xO is not optimal. Con-
versely, suppose that (30) holds and the problem is regular. If there were a feasible
point x with cx < cxO, in any neighbourhood of x we would find a point Xl E D with
496
f(x ' ) < a. When this point is sufficiently near to x, we would have cx' < cxO, Le., x'
E D(xO) \ G, contrary to the assumption. Therefore, xO must be optimal. This com-
(30) holds without xO being optimal (see Fig. IX.5). However, the next result shows
the usefulness of condition (30) in the most general case, even when we do not know
For each k = 1,2, ... let Ek : IRn -I IR be a convex function such that °< Ek(X)
Vx E D, max {Ek(X): x E D} -I °(k -Im), and consider the perturbed problem
Proof. Clearly, because of the continuityof f(x) - Ek(X), xis feasible for (LRCP).
For any feasible solution x of (LRCP), since f(x) - ek(x) < f(x) ~ a, we have x t Gk .
Therefore, the condition D(xk) \ Gk = 0 implies that x t D(xk). Since x E D, we
must have cx ~ cxk , and hence cx ~ cx. This proves that xis a global optimal so-
lution of (LRCP).
•
In practice, the most commonly used perturbation functions are e(x) =e or
is regular.
Proof. Since the vertex set V of D is finite, there exists Co > 0 small enough so
that cE (O,cO) implies that F(c,x):= f(x) - c(lIxl12 + 1) f 0: "Ix E V. Indeed, if
VI = {x E V: f(x) > o:} and Co satisfies coOlxll2 + 1) < f(x) - 0: "Ix E VI' then
whenever 0 < c < cO' we have for all x E V \ V( f(x) - c(lIxll 2 + 1) ~ 0: - c < 0:,
while for all x E VI: f(x) - c(lIxll 2 + 1) > f(x) - coOlxll2 + 1) > 0:. Also note that
the function F(c,x):= f(x) - c(IIxll2 + 1) is strictly concave in x. Now consider the
problem (LRCP- c), where 0 < c < cO' and let xE D be such that F(c,x) ~ 0:. If xis
not a vertex of D, then x is the midpoint of a line segment t:. ( D, and, because of
the strict concavity of F(c,x), any neighbourhood of x must contain a point x' of t:.
such that F(c,x' ) < 0:. On the other hand, if xis a vertex of D, then F(c,x) < 0:, and
any point x' of D sufficientIy near to x will satisfy F(c,x') < 0:. Thus, given any
xE D such that F(c,x) ~ 0:, there exists a point x' arbitrarily elose to x such that
x' E D, F(t:,x') < 0:. This me ans that the problem (LRCP- c) is regular.
•
It follows from the above result that a LRCP problem can always be regularized
by a slight perturbation. Moreover, this perturbation makes the function f(x) strictly
concave, a property which may be very convenient in certain circumstances.
To simplify the presentation of the methods, in the sequel instead of (b) we shall
assurne astronger condition:
With suitable modifications, most of the results below can be extended to the case
when D is unbounded.
49S
Under assumptions (a), (b'), (c), if w is a basic optimal solution of the linear pro-
gram min {cx: x E D}, then, by transforming the problem to the space of nonbasic
1) w = 0 is a vertex of D;
One of the most natural approaches to solving the LRCP problem is by outer ap-
proximation (cf. Forgo (19SS), Hillestad and Jacobsen (19S0a), and Fülöp (19SS);
see also Bulatov (1977) for a related discussion). This approach is motivated by the
min {cx: x E D} .
Suppose that I(i) > a, i.e., xO is not leasible lor (LRCP). 11 1r(x-xO) ~ 1 is an
min {f(x): x E D} ,
l(x):= 1r(X-XO) - 1 ~ 0
equality lex) ~ 0 which excludes xO without excluding any point x E D such that
fex) ~ a.
•
It follows from this fact that concavity cuts (see Chapter III) can be used in out er
approximation methods to solve (LRCP).
499
with So = D. When SO"",Sk have been constructed, one solves the linear program
obtaining a basic optimal solution, xk. If xk happens to be feasible for (LRCP), then
the procedure terrninates: i solves (LRCP), since Sk contains the feasible set of
(LRCP). Otherwise, one generates a concavity cut ?f"k(x_xk ) ~ 1 to exclude xk and
forms Sk+ 1 by adding this constraint to Sk' The procedure is then repeated with
difficult to attack, its convergence is not guaranteed. For example, Gurlitz (1985)
has shown that, when applied to the 5-dimensional problem
the outer approximation method using such cuts will generate a sequence xk which
combine cutting with partitioning the feasible domain by means of cone splitting.
This leads to conical algorithms, which will be discussed later in this section.
500
4th cut - - - - -
"-
1 st cut
X
o
o
T
cx=8
1
X
Fig. IX.7
Since, by Proposition IX.H, an optimal solution must exist on some edge of D in-
tersecting 00, and since the number of such edges is finite, one can hope to solve the
problem by a suitable edge search procedure. The first method along these lines is
IX.4).
501
Typically, a method based on the edge property alternates between steps of two
jective function value cx, while moving forward to the surface f(x) = /l. To do this,
it suffices to apply the simplex procedure to the linear program
neighbouring vertex u to sO satisfies cu < csO. H f(u) < /l, we perform a simplex
pivot to move from sO to u. This pivoting process is continued until we find a pair of
vertices u,v of D such that f(u) < /l, f(v) ~ /l (this must occur, again because of as-
sumption (c)). Then we move along the edge [u,v] of D to the point xO where this
edge meets the surface f(x) = /l (due to the strict concavity of f(x), xO is uniquely
determined). At this stage xO is the best feasible point obtained thus far, so for fur-
Since we are now stopped by the "wall" f(x) = /l, we try to move backward to the
region f(x) < /l, while keeping the objective function value at the lowest level al-
ready attained. This can be done by finding a vertex sI of D(xO) such that f(sl) < /l
which is as far as possible from the surface f(x) = /l (intuitively, the further we can
move backward, the more we will gain in the next forward step).
H such a point sI can be found, then another forward step can be performed from sI,
and the whole process can be repeated with sI and D(xO) replacing sO and D.
On the other hand, if such an sI does not exist, this means that D(xO) \ Gis empty.
By Theorem IX.1 and the regularlty of the problem, this implies that xO is a global
f(SI) > a?
Hillestad and Jacobsen (1980) suggested a combinatorial procedure for the backward
certain cases this procedure may require us to solve by a rather expensive method
(finding a feasible point sI better than the current best feasible solution xO).
However, a more systematic way to check whether D(xO) has a vertex s such that
f(s) < a, and to find such a vertex if one exists, is to solve the concave program
Therefore, Thuong and Tuy (1985) proposed that one solve this concave program
in the backward step. With this approach, the algorithm can be summarized as fol-
lows:
Algorithm IX.S.
Initialization:
If a vertex 5° of Dis available such that f(sO) < a, set DO= D, k = ° and go to 1).
Otherwise, apply any finite algorithm to the concave program min {f(x): x E D},
until a vertex s° of Dis found such that f(sO) S a (if such a vertex cannot be found,
Iteration k = 1,2,... :
1) Starting from sk-l pivot by means of the simplex algorithm for solving the linear
program
(31)
503
cv< cu $ cs k- 1. Let xk be the (unique) point of the line segment [u,v] such that
f(x k ) == Q. Go to 2).
k
2) Form Dk == {x E D: cx $ cx } and solve the concave program
a) If f(sk) == Q, terminate.
Theorem IX.6. Assume that (a), (b'), (c) hold, and that moreover, J(z) is strictly
concave and the LRCP problem is regular. Ifthe problem has a feasible solution, then
the above algorithm terminates at Step 2a) after jinitely many iterations, yielding a
global optimal solution.
Proof. If the algorithm terminates at Step 2a), then min {f(x): xE Dk} == Q, and
Let M denote the set of a11 x E D such that f(x) == Q and xis contained in some
edge of D. M is finite, since the number of edges of D is finite, and the strict1y con-
cave function f(x) can assume the value Q on each edge of D at most at two distinct
points. Finiteness of the algorithm then fo11ows from finiteness of M and the fact
is that the subproblem in each iteration differs from the one in the previous iteration
only in the right hand side of the constraint cx ~ cxk. To increase efficiency, the so-
lution method chosen for the subproblems should take advantage of this structure.
For example, if D is bounded and the outer approximation algorithm (Algorithm
VI.l) is used for the concave programs (32), then the algorithm could proceed as fol-
lows:
Algorithm IX.8*.
Initialization:
Construct a polytope DO J D, with a known (and small) vertex set VO. Let sO be a
vertex of DOsuch that f(sO) < Q. Set k = o.
Iteration k = 1,2,... :
I} Starting from sk-I, pivot by means of the simplex algorithm for solving the
linear program min {cx: xE Dk_ 1} until a pair of vertices u,v of Dk_ 1 is found so
that f(u) < Q, f(v) ~ Q, and cv < cu ~ cs k- 1. Let xk be the intersection of [u,v]
with the sudace fex) = Q. Go to 2).
2} If xk E D, set Dk = Dk_1 n {x: cx ~ cxk}. Otherwise, set Dk = Dk- 1 n {x:
4c(x) ~ O}, where 4c(x) ~ 0 is the constraint of D that is the most violated by xk.
Compute the vertex set Vk of Dk (from knowledge of Vk_ 1).
Let sk E argmin {fex): x E Vk}.
Theorem IX.7. Under the same assumptions as in Theorem IX. 6, Algorithm IX.8*
terminates at Step 2a) or 2c) after jinitely many iterations.
Proof. Sinee the number of eonstraints on D is finite, either all of these eon-
straints are generated (and from then on Theorem IX.6 applies), or else the algo-
rithm terminates before that. In the latter ease, if 2a) oeeurs, sinee
Minimize -2x1 + x 2
s .t. xl + x2 ~ 10 ,
-xl + 2x2 ~ 8,
-2x1 - 3x2 ~ -6 ,
xl - x 2 ~ 4,
xl ~ 0 , x2 ~ 0 ,
2 2
-Xl + Xl x2 - x2 + 6x I ~ 0.
Sinee the subproblems are only on~mensional it will suffiee to use AIgo-rithm
IX.8. However, we shall also show Algorithm IX.8* for eomparison.
Applying Algorithm IX.8, we start with the vertex 50 = (0;4).
Iteration 1.
Iteration 2.
Fig. IX.8
Iteration 1:
Step 1 finds xl = (4.3670068,5.6329334). Sinee xl is feasible,
01 = 00 n {x: cx ~ ex1}. Step 2 finds the vertex sI = (10jO) whieh achieves the
minimum of cx over 01'
Iteration 2:
Fig. IX.9
Remark IX.3. The above method assumes that the function f(x) is strictly con-
cave and the LRCP problem is regular. If these ass um pt ions are not readily verifi-
able, we apply Algorithm IX.8 or IX.8* to the perturbed problem
which is regular and satisfies the strict concavity assumption. Clearly, if this
perturbed problem is infeasible, then the original problem itse1f is infeasible. Other-
wise, we obtain a global optimal solution X(E) of the perturbed problem, and, by
Theorem IX.5, as E ! 0 any accumulation point ofthe sequence {X(E)} yields a global
(*) Given a feasible point xk, find a point y E D(xk) \ G, or else establish that
D(xk ) C G.
Now recall that, because ofthe basic assumptions (a), (b'), (c), we can arrange it so
that the conditions 1) - 3) at the beginning of Section IX.5.2 hold (in particular,
°E D n int G and D C IR~). Then, setting Dk = D(xk ) in (*) we recognize the (DkG)
problem studied in Sections VI.2.2 and VII.1.1. If we then use the (DkG) procedure
(Section VII.1.2) in the backward step of each iteration, then we obtain an algo-
rithm which differs !rom Algorithm IX.8 only in the way the backward step is
carried out. However, just as with Algorithm IX.8, it is important to take advantage
of the fact that Dk differs from Dk_ 1 only in the right hand side of the constraint
cx 5 cxk- 1. This suggests integrating all of the (DkG) procedures in successive
iterations k = 1,2, ... into a unified conical algorithm, as follows.
Select an NCS rule for cone subdivision (see Sections VII.1.4 and VII.1.6). Set
10 = +ID, k = 0.
1) For each Q E ~ with Q = (z 1 i, ... ,zn), f(zi) = Cl (i=1,2, ... ,n), solve the linear
pro gram
-1 -1
max { eQ x: x E Dk , Q x ~ O}
obtaining the optimal value JL(Q) and a basic optimal solution w (Q). If
2) In .Ji delete all Q E ~ such that JL(Q) $ 1. Let .9E be the remaining collection of
3) Select Q*E argmax {JL(Q): Q E .9E} and split it according to the NCS rule chosen.
Theorem IX.8. Assume that the LRCP problem is regular. 11 Algorithm 1X.9 is in-
finite, then some iteration k ~ 1 is infinite and i is a global optimal solution. 11 the
algorithm terminates at iteration k with 'Yk < +m (or equivalently, k ~ 1), then i is a
global optimal solution; il it terminates at iteration k with 'Yk = +m (or equivalently,
k = O), then the problem is inleasible.
Proof. Before proving the theorem, observe that so long as Step 5) has not yet
occurrence of Step 5) marks the end of an iteration k and the passage to iteration
k+1, with 'Yk+1 < 'Yk' Bearing this in mind, suppose first that the algorithm is in-
finite. Since each xk lies on some edge E of D and achieves the minimum of cx over
510
E \ int G, it follows that the number of all possible values of 7k' and hence the num-
ber of iterations, is finite. That is, some iteration k must continue endlessly. Since
this iteration is exactly the (DkG) procedure, by Proposition VII.2 the set Dk n öG
is nonempty, while Dk C G. If k = 0, i.e., Dk = D, this would mean that the prob-
lem is feasible, but has no feasible point x such that f(x) < er, conflicting with the
regularity assumption. Hence, k ~ 1, and then the fact that Dk C G together with the
regularity assumption imply that xk is a global optimal solution. Now suppose that
the algorithm terminates at Step 2 of some iteration k. Then, since the current set
~ is empty, no cone of the current partition of KO contains points of Dk \ G.
Hence, if 7k < +m, then by Theorem IXA, x k is a global optimal solution. On the
other hand, if 7k = +m (i.e., Dk = D), then D C Gj hence, since the problem is regu-
lar, it must be infeasible.
•
As with Algorithms IX.8 and IX.8*, the regularity assumption here is not too re-
strictive. If this assumption cannot be readily checked, the problem can always be
handled by replacing f(x) with f(x) - E (lIxll 2+1), where E > 0 is sufficiently small.
On the other hand, there exist variants of conical algorithms which do not require
regularity of the problem (Muu (1985), Sen and Whiteson (1985». However, these
algorithms approach the global optimum from outside the feasible region and at any
stage can generally guarantee only an infeasible solution sufficiently near to a global
optimal solution.
earlier algorithm of Muu (1985). The main improvement consists in allowing any ex-
haustive subdivision process instead of a pure bisection process. This is possible due
to the following lower bounding method.
Proposition IX.I5. Let K = con(Q), Q = (zl,;, ... ,zn), be a cone generated b1l n
linearl1l independent vectors i' e IR! such that f(zi) = er. Then a lower bound for cz
over the set Kn (D \ int G) is given b1l the optimalvalue ß(Q) ofthe linear program:
511
-1 -1
min cx s.t. XE D, eQ x ~ 1, Q x ~ 0, (34)
i.e.,
(35)
(i=I, ... ,n». The assumption cx > 0 Vx E K \ int G implies that CX(AZi ) > czi
VA > 1, while the convexity of D implies that [O,w (Q)] ( D. Therefore, if an op-
timal solution w (Q) of (34) lies on the i-th edge of K, i.e., if it satisfies w (Q) = AZi
for some A ~ 1, then necessarily w (Q) = zi, and hence f(w (Q» = Cl. On the other
hand, if f(w (Q») = Cl, then w (Q) E D \ int G, and hence w (Q) achieves the min-
imum of cx over K n (D\int G) ( K n (D n H).
•
The algorithm we are going to describe is a branch and bound procedure similar
to Algorithm IX.S*, in which branching is performed by means of conical subdivision
and lower bounding is based on Proposition IX.15.
We start with a cone KO as in Algorithm X.9. Then for any subcone K = con(Q)
w (Qk) must be infeasible (otherwise, the exact minimum of cx over the feasible
portion contained in this cone would be known, and Q would have been fathomed);
hence, by Proposition X.15, w (Qk) does not lie on any edge of con(Qk) and can be
used for furt her subdivision of con(Qk)' We can thus state:
512
Algorithm IX.9*
VII.1.4).
1) In .J( k delete al1 Q such that ß(Q) ~ 7k· Let .ge k be the remaining collection of
matrices. If .ge k = 0, terminate: xk is a global optimal solution of (LRCP) if
7k < +ID; the problem is infeasible if 7k = +ID. Otherwise,. if .ge k is nonempty, go
to 2).
3) Let .9lk be the partition of Qk so obtained. For each Q e .9lk solve (35) to obtain
the optimal value ß(Q) and a basic optimal solution w (Q) of (34).
4) Update the incumbent: set xk +1 equal to the best among: xk,uk = u(Qk) (if
these points exist) and all w (Q), Q e .9lk ' that are feasible. Set 7k+1 = cxk +1.
Set .J( k+1 = (.ge k \ {Q}) U .9l k ' k f - k+1 and go to 1).
Proof. If the algorithm is infinite, it generates at least one infinite nested se-
quence of cones Ks = con(Qs)' seil ( {0,1, ... }, with Qs = (zSl,zS2, ... ,zsn) such that
f(zsi) = a (i=1,2, ... ,n). By virtue of the subdivision rule, such a sequence shrinks to
a ray; consequently, zSi -i x* (5 -i ID, seil), for i=1,2, ... ,n.
Since wS belongs to the halfspace {x: eQs-1x ~ 1}, the halfline from 0 through wS
meets the simplex [i 1,... ,zsn] at some poi~t vS, and it meets the 5urface f(x) = a at
513
some point yS. Clearly, vS -+ x*, yS -+ x*. But we have f(w s) > a, for otherwise, by
Proposition IX.15, cJ would be feasible and cws ~ 'Ys ' Le., ß(Qs) ~ 'Ys ' conflicting
longs to the line segment [vS,ys]. Hence, cJ -+ x*. Noting that wS E D and
feasible, proving that the lower bounding used in the algorithm is strongly consistent
in the sense of Definition IV.7. It then follows by Corollary IV.3 that lim ß(Qs) =
lim ß(Qk) = min {cx: xE D, f(x) 5 a}, and, since cx* = lim ß(Qs)' we conclude that
x* is a global optimal solution.
(This can also be seen directly: for any feasible point x, if x belongs to one of the
cones that have been deleted at some iteration h 5 s, then cx ~ 'Yh ~ 'Ys > ß (Qs) =
ccJ, and hence cx ~ cx*; if x belongs to some con(Q), Q E fIl s' then cx ~ ß(Q) ~
ß (Qs) = cws, and hence cx ~ cx*). Now let xbe an arbitrary accumulation point of
the sequence {wk = w (Qk)}' e.g., x= lim wh (h ---I CD, h EH). It is easy to see that
there exists a nested sequence Ks = con(Qs)' s E A, such that any Ks contains in-
finitely many Kh ' hE H. Indeed, at least one of the cones con(Q), Q E fIl 1 ' con-
tains infinitely many K h : such a cone must be split at some subsequent iteration; let
this cone be K for some sl ~ 1. Next, among the successors of K s at least one con-
sl 1
tains infinitely many K h : such a cone roust, in turn, be split at some iteration
s2> sl; let this cone be K s . Continuing in this way, we obtain an infinite nested se-
2
quence {K s ' s E A} with the desired property, where A = {sl's2''''}' Since for every
s there exists an h such that K h is a descendant of K , because the sequence {K s}
s s s
shrinks to a ray, it follows that w (Qh ) ---I x*, where x* = lim w (Qs) (5 ---I CD, 5 E
s
A). That is, x = x*, and hence, by the above, x is a global optimal solution of
(LRCP).
•
514
Remark IX.4. In the interest of efficieney of the proeedure, one should ehoose a
eone subdivision rule that involves mostly w-subdivisions. It ean be verified that the
algorithm will still work if an arbitrary NCS rule is allowed.
Examples IX.4. Fig. IX.lO illustrates Algorithm IX.9 for a regular problem. In
Step 1 a point wO of D is found with f(wO) < a; henee the algorithm goes to 5). A
forward step from sO then finds the optimal solution x*.
Fig. IX.lI illustrates Algorithm IX.9* for a nonregular problem. The algorithm gen-
erates a sequenee of infeasible points w1 ,J, ... approaching the optimal solution x*,
which is an isolated point of the feasible region.
~
---r
f(x)= Ci
Fig. IX.10
515
f(x)= Ci
O·~~~=---~~----~~--- --r
Fig. IX.lI
PART C
Part C is devoted to the study of methods of solution for quite general global op-
timizatioil problems. Several outer approximation algorithms, branch and bound
procedures and combinations thereof are developed for solving d.c. programming,
Lipschitzian optimization problems, and problems with concave minorants. The
"relief indicator method" may serve as a conceptual tool for even more general
global problems. The applications that we discuss include design centering problems,
biconvex programming, optimization problems with indefinite quadratic constraints
and systems of equations and / or inequalities.
CHAPTER X
D.C. PROGRAMMING
duality theory is developed between the objective and the constraints of a very gen-
eral dass of optimization problems. This theory allows one to derive several outer
approximation methods for solving canonical d.c. problems and even certain d.c.
problems that involve functions whose d.c. representations are not known. Then we
present branch and bound methods for the general d.c. program and a combination
of outer approximations and branch and bound. Finally, the design centering prob-
where C C IRn is convex and all of the functions f, gj are d.c. Suppose that C is de-
fined by a finite system of convex inequalities hk(x) ~ 0, k EIe IN. In Theorem 1.9, it
is shown that, by introducing at most two additional variables, every d.c. pro-
gramming problem
where c E IRn (cx denotes the inner product), and where hand g: IRn -+ IR are
real-valued convex functions on IRn.
that is based on Tuy (1987), Tuy (1994), Tuy and Thuong (1988). Moreover, it will
be shown that the method can easily be extended to solve problems where in (2) the
A general and simple duality principle allows one to derive an optimality condi-
tion for problem (CDC). We present a modification of the development given in Tuy
(1987) and Tuy and Thuong (1988), cf. also Tichonov (1980), where a related II reci-
procity principlell is discussed. Let D be an arbitrary subset of IRn , and let f: IRn -+ IR,
g: IRn -+ IR, a,p E IR. Consider the following pair of global optirnization problems, in
which the objective function of one is the constraint of the other, and vice versa:
Proof. (i) Assume that (Qa) is stable and a ~ inf P ß . Then for all a' < a the set
{x E D: g(x) ~ ß, f(x) ~ a'} is empty. Hence,
and, letting a' --I a - 0, we see from (6) that ß ~ sup Qa'
(ii) Similarly, if (P ß) is stable and ß ~ sup Qa ' then for all ß' > ß the set
{x E D: g(x) ~ ß', f(x) ~ a} is empty. Hence,
and, letting ß' --I ß+O, we see from (5) that inf P ß ~ a. _
Proposition X.l. (i) If (QaY is stable and a = min Pß , then ß = sup Qa'
Proof. (i) Since a = min P ß ' there must exist an i e D satisfying f(i) = a,
g(i) ~ ß. It follows that ß ~ sup Qa' But, by Lemma X.I, we know that ß ~ sup Qa'
(ii) Similarly, since ß = max Qa' we see that there exists ye D satisfying
g(Y) = ß, f(Y) ~ a. Hence, inf Pß ~ a, and Lemma X.1 shows that we must have
infP ß = a. _
In order to apply the above results, it is important to have some criteria for
checking stability of a given problem.
Lemma X.2. 11 inl Pß < +m, 1is upper semicontinuott.S {u.s.c.} and ß is not a loeal
marimum 01 9 ot/er D, then {Pß } is stable. Similarly, il sup Qa > -ID, gis lower
semieontinuott.S (l.s.c.) and ais not a loeal minimum oll ot/er D, then {Qa} is stable.
Proof. We prove only the first assertion, since the second can be established in a
similar way.
Suppose that inf P ß < +ID, fis u.s.c. and ß is not a local maximum of g with respect
to D. Then there is a sequence {xk} C D such that
where {ck} is any sequence of real numbers having the above property. If, for some
keIN we have g(xk) > ß, then, obviously, for all ß' > ß sufficiently close to ß we
also have g(xk) ~ ß', and hence
It follows that
lim infPß,~ck'
ß'~ß+O
523
Therefore, since ck 1 inf P ß we see that, if g(xk ) > ß holds for infinitely many k,
then
Since obviously inf P ß' ~ inf P ß for all ß' ~ ß, we have equality in (7).
On the other hand, if g(xk ) = ß for all hut finitely many k, then, since ß is not a
Iocal maximum of g over D, it follows that for every k with g(xk ) = ß thereis a se-
kll ---;;-+ x k such that x'
quence x' kll E D, g(kll)
x' > ß. Then for all .
ß ' sufficlently elose
and
stahility.
neighbourhood ofx contains a point x E D satisfying g(x) > ß (i.e., there exists a se-
quence i ---I x such that i E D, g(i) > ß).
524
Clearly, if the function g is l.s.c. (so that {x: g(x) > ß} is open), then every non-
isolated point x E D satisfying g(x) > ß is regular for (P ß)' Saying that ß is not a
local maximum of g over D (cf. Lemma X.2) is equivalent to saying that every point
Proposition X.2. If fis u.s.c. and if there exists at least one optimal solution of
(Pß) that is regular for (Pß)' then (Pß) is stable. Similarly, if g is l.s. c. and there
exists at least one optimal solution of (Qa) that is regular for (Qa)' then (Qa) is
stable.
Proof. Let x be an optimal solution of (Pß ) that is regular. Then there exists a
sequence {xk} (D satisfying g(xk ) > ß, xk --I x. For any fixed k we have
g(xk ) > ß for all ß sufficiently dose to ß ; hence inf Pß S f(x k). This implies that
I I I
ß -Iß+O
I
Since the reverse inequality is obvious, the first assertion in Proposition X.2 is
proved.
The second assertion can be proved in an analogous way.
•
525
G:= {x: g(x) ~ O}, H:= {x: h(x) ~ O}, and reeall that g and h are eonvex funetions.
In (3) let D = H, f(x) = ex and ß = O. Then problems (PO) and (CDC) coincide. As-
sume that H n G f 0.
Corollary X.2. (i) Let H be bounded, and suppose that g(x) f 0 at every extreme
(ii) 11 at least one optimal solution 01 (eDe) is regular, then problem (eDe) is
stable.
(cf. Definition 1.6). Then the following optimality eriterion ean be derived !rom the
above eonsiderations. Reeall !rom Theorem 1.10 that an optimal solution is attained
proof. Let fex) = cx, D = H, ß = 0, 0 = min {cx: x E H n G}, and consider the
problems (Qo) and (PO) = (CDC). The convex function g is continuous and it fol-
lows from (8) that 0 is not a local minimum of fex) = cx over H. Hence from the se-
cond part of Lemma X.2 we see that (Qo) is stable. Therefore, if i is optimal, Le"
ci = 0, then, by Lemma X.l(i),
But since gei) = 0, i E H, it follows that in (10) we have equality, i.e., (9) holds.
Conversely, if (CDC) is stable and (9) holds for i E 80, then, by Proposition X.I
(ii), we have
minimize cx (11)
S.t. xEH nG
where c EDf, H:= {x: hex) ~ O}, G:= {x: g(x) ~ O} with h,g: IRn - I IR convex.
Assume that
527
(b) H is bounded,
Note that w is readily available by assumptions (a), (b), (c) (cf. Section 1.3.4). For
example, solve the convex minimization problem
minimize cx
S.t. xEH
obtaining an optimal solution W. If wEH n G, then assumption (c) above does not
hold, and w is an optimal solution to (11). Hence, by assumption (c), we have
g(W) < 0, cw< min {cx:x E H n G}, where w is a boundary point of H. Then weint
H can be found by a small perturbation of W.
For every xe G, let 7f'(x) denote the point where the line segment [w,x] intersects
the boundary 8G of G. Since g is convex, and since g(w) < 0, while g(x) ~ 0, it is
clear that
7r(x) = tx + (l-t)w ,
Note that for x E H n G it follows from (12) and (13) that c?r{x)=tcx+(l-t)cw<cx.
Algorithm X.I.
Initialization:
Determine a point xl E H n 00. Set k I - 1.
Proposition X.3. Assume that problem (eDe) is stable. Then the lollowing asser-
tions hold.
(ii) 11 Algorithm X.1 is infinite, then it generates a sequence {:i} ( H n aG, every acr
cumulation point 01 which is an optimal solution to (eDe).
Proof. If the algorithm stops at zk, then we have g(zk) = max {g(x): xE H,
cx 5 cxk} = 0, and xk satisfies the necessary and sufficient optimality condition (9)
in Theorem X.1.
529
Suppose that Algorithm X.1 is infinite. Then, since H n an is compact, the se-
k
quence {xk} ( H n an has accumulation points in H n 00. Let {x q} be a subse-
quence of {xk} such that
k
x=limx q .
q"'lIJ
Since the sequences {xk} and {zk} are bounded, we may, by considering a subse-
k +1 k
quence if necessary, assurne that x q q x, z q q z. Clearly, since c7r{x) < cx
for all x E H n G, it follows that cxk+ 1 < cxk for all k. Moreover, for xE H satis-
k
fying cx ~ cx, we have cx ~ cxk, k=1,2, ... But, by the definition of z q in Algorithm
k
X.1, it follows that g(x) ~ g(z q)j and hence, letting q -- IIJ : g(x) ~ g(z). Since zEH
and cz ~ cx, we thus see that z is an optimal solution of the subproblem (Q(X)).
k +1 k
Now suppose that g(z) > O. Since x q = 7r(z q), it is easily seen (from the def-
But
k k+1 k k
cx q+1 < cx q < cz q ~ cx q, (16)
where the strict inequality holds, by an argument similar to that used to derive (15).
Remark X.2. Theorem X.l and Proposition X.3 remain true if we replace the lin-
ear function ex by a convex funcüon f(x), Hence, Algorithm X.l can be used to min-
imize a convex function f(x) subject to convex and reverse convex constraints (cf.
Tuy (1987».
»
Note that each subproblem (Q(xk is a difficult global optimization problem that
cannot be assumed to be solved in a finite number of iterations, since its feasible set
is not polyhedral.
Therefore, on the basis of the above conceptual scheme, in Tuy (1987) and Tuy
(1994) the following algorithm was proposed that can be interpreted as an outer ap-
proximation method for solving
Denote by V(Dk ) the vertex set of a polytope Dk, and let 8f(x) denote the set of
subgradients of a convex function f at x.
Algorithm X.2.
Initialization:
Set a l = exl , where xl e H n 8G is the best feasible solution available (if no feasible
solution is known, set xl = 0, a l = + 00). Set a l = exl . Generate a polytope Dl
containing the compact convex set {x eH: ex ~ all. Let k 1-1.
Iteration k = 1,2,... :
Solve the subproblem
Otherwise, determine the point l where the line segment [w,zk] interseets the sur-
face
(b) If l ;. H (i.e., h(yk) > 0), then ehoose pk E 8h(yk) and set
(19)
Let
(20)
Set
k+1 k k k k
x = y , ctk+1 = ey , if Y E H and g(y ) = 0,
k+1 k
x = x , ctk +1 = ctk , otherwise.
Go to iteration k+1.
We establish eonvergenee of this algorithm under the assumptions (a), (b), (e)
Observe that for eaeh k=I,2, ... , xk is the best feasible point obtained until step k,
while ctk = exk is the eorresponding objeetive funetion value, ctk ~ ctk+l'
(ii) 1/ the sequence {}} has an accumtdation point z satisfying g(z) = 0, then
Proof. (i) From Qk ~ Q := min {ex: x E H, g(x) ~ O} and Lemma X.3 we see that
(23)
by hypothesis,
k
g(z ) = max {g(x): x E Dk} = 0 ,
00
g(Z) ~ sup {g(x): xE n Dk}. (25)
k=l
Then
Proof. From (27) we see that for any number Cl!' < ä we have
ä~
Since (CDC) is stable, it follows by Lemma X.l(ii) that Cl!' ~ a, and hence a.
•
Theorem X.2. Assume that the conditions (a), (b), (c) are fulfilled and that prob-
lem (GDG) is stable. If Algorithm X.2 terminates at iteration k, then i is an optimal
solution for problem (GDG) (if ak < + 00), or (GDG) is infeasible (if
ak = +00).
If the algorithm is infinite, then every accumulation point x of the sequence {xk} is an
optimal solution for problem (GDG).
In order to prove the second part of Theorem X.2, we shall show that any ac-
cumulation point Z of the sequence {zk} satisfies g(i) = O. Then, in view of Lemma
XA (ii), we have
and the equality ci = a = min {cx: x E H, g(x) ~ O} follows from Lemma X.5, since
i is feasible for (CDC).
The assertion g(i) = 0 will be established by checking that all of the conditions of
Theorem 11.1 are fulfilled for the sequence {xk = zk} and the set
D = {x: g(z) ~ 0, cx ~ a}, (a:= min {cx: x E H, g(x) ~ O}).
k
z = y k + Ak(Y k - w) , Ak >0. (28)
If lk is of the form (18), then from (28) and the inequality cw < min {cx: x E H n G}
we deduce that
k
lk(z ) = c(zk -yk) = Akc(y k -w) > O.
If lk is of the form (19), then observe that from the definition of a subgradient and
lim ir(l) = lim ir(Z) and, moreover, that lim ir(Z) = 0 implies that g(Z) ~ O. With-
out loss of generality we may assume that one of the following cases occurs.
(a) Suppose that yq E H for an q. Then iq has the form (18); and, since zq, yq E D1
and D 1 is compact, there is a subsequence (l, Ar ) such that l --I y, Ar --I X.
Therefore, by (18) and (28), with i(z) = i(z-y) we have
z= y + X(y - w) , X~ 0 (30)
and
Let
As above, from the definition of w we deduce that cy > cw. Therefore, we have
X = 0 and z = y. But for an r we have g(zr) ~ 0, g(yr) ~ 0, by the construction of the
algorithm. Therefore, z = y implies that g(Z) = o.
(b) Suppose that yq ~ H for an q. Then iq has the form (19); and, as before, we
have a subsequence yr --I y, Ar --I X. Moreover, we may also assume that pr --I p E
Clearly, l(w) = p(w - Y) + h(Y) ~ h(w) < 0 and l(Y) = h(Y) ~ 0, since h(l) > o.
As above, from (30) we conclude that then
1 cx== (Xl
Z ... ..~ .. ..
CX< (Xl
fex) over H n G. For that purpose we only have to start with a polytope Dl ) H, and
to replace equation (17) by max {fex) - ak , g(x)} = 0 and lk(x) = c(x_yk) in (18)
by lk) = tk(x_yk), where t k E ßf(yk) (cf. Remark X.2 and Tuy (1987)).
Remark X.4. In the case of linear objective function (for which Algorithm X.2 is
formulated) it is easy to say that certain simplifications can be built in. For ex-
ample, if (18) occurs, Le. if a cut cx ~ cl is added, then obviously all previous cuts
of the form cx ~ cyi, i < k, are redundant and can be omitted. Moreover, if the ini-
tial polytope D l is contained in the halfspace {x E IRn: cx ~ all, i.e., if cx ~ a1 is
among the constraints which define Dl' then equation (17) can be replaced by
g(x) = O.
(32)
537
Since any accumulation point Z of the sequence {zk} satisfies g(Z) = 0, (32) must
occur after finitely many iterations. Suppose that
which, because of (33), implies that Clk < +m, i.e., there is a point xk E H satisfying
g(xk ) = 0, cxk = Clk. Furthermore, there is no xE H such that cx ~ Clk ' g(x) ~ c.
Hence,
Therefore, with the stopping rule (32), where c < ,,(, the algorithm is finite and
provides an c-optimal solution in the sense of (34). Note that this assertion is valid
no matter whether or not the problem is stable.
There are two points in the proposed algorithm which may cause problems in the
h(X) ~ c, g(X) + c ~ o.
solution and
ci ~ min{cx: xE D, g(x) ~ O} + c.
The following modified variant of Algorithm X.2. has been proposed in Tuy (1994):
538
AIgorithm X.2· .
O. Let 11 = cx1, where xl is the best feasible solution available (if no feasible solu-
tion is known, set xl = 0, 11 = + 00). Take a polytope PI such that {x E D :
lc c k+ 1 k k lc
3.lfh(w) > 2' then defi.ne x = x '/k+1 = Ik' Let p E ah (w),
(35)
and go to Step 6.
4. Determine I k E [r.!j zk] such that g(l) = --t: (yk exists because g(zk) ~ 0,
lc k k+1 k
g(w) < - c). If h(y ) ~ c, then set x = x '/k+1 = Ik'
Determine uk E [r.!j yk] such that h(u k) = c, pk E ah(uk) (uk exists because
and go to Step 6.
k k+1 k k
5.lfh(y ) ~ c, then set x = y '/k+1 = cy .
b) Otherwise, let
539
(37)
and go to Step 6.
Lemma X.6. Let {i} be a bounded sequence ofpoints in IRn, Zk (.) be a sequence of
affine fu,nctions such that
k
Let {wk} be a bounded sequence such that Zirn Zk (w) < 0 for any w = Zirn w q.lf
q-t+m q q--t+m
yk E [~, .!] satisfies 'k (yk) ~ 0 then
lim (i - yk) = o.
fo.++m
k k
Proof. Suppose the contrary, that IIz q - y qll ~ 6 > 0 for some infinite subse-
k k k
quence {kq}. Let lk(x) = p x + ß.x with IIp 11 = 1. We can assume z q -i Z,
k k k k
w q ---I W, Y q ---I Y E [w, Z], P q ---I p. Since lk (z q) > 0 > lk (w) implies that
q q
k k k
-p q z q < f\ < - P q wwe can also assume f\ ---I P, hence,
q q
lk (x) ---11(x) := px + P V x.
q
540
k
Furthermore, l(w) = !im lk (w) < o. From the relation 0< lk (z q) = lk (Z) +
q"'+1D q q q
k k k
<p q(z q - Z» it follows that !im lk (Z) ~ O. On the other hand, since lk (z s)
q"'+1D q q
~ 0 V s > q, by fixing q and letting s - - f + ID we obtain lk (Z) ~ 0 V q. Hence, 1(Z)
q
= o. Also, since lk (yk) ~ 0 V k we have l(y) ~ o. But Y = 0 w + (1-0) '-z for some 0 E
[0,1], hence l(y) = 01 (w) + (1-0) 1 (Z) = 01 (w), and since l(w) < 0, while l(y) ~ 0,
proximate optimal solu.tion or by the evidence that the problem has no feasible solu.-
tion.
Proof. It is easily seen that the algorithm stops only at one of the Steps 1,2, 5a.
Since xk changes only at Step 5, and xk+1 = yk with h(l) ~ c, g(yk) = - c, it is
clear that every x k satisfies h(xk ) ~ c, g(xk) = --f:. If Step 1a) occurs, then {x E D :
Now suppose the algorithm is infinite. Step 5b) cannot occur infinitely often
X.6. are fulfilled for the sequence {zk, yk, Je} and the functions lk(x), where k ~ kO•
In fact, since zr e Pr we have lk (xr ) ~ 0 V r > k. On the other hand, lk (Je) =
pk(Je - u k ) ~ h(wk ) - h(uk ) ~ e/2 - e = -€/2 < 0, while lk(uk ) = 0, hence
lk(zk) > o. Furthermore, since uk is bounded, and pk e 8h (uk ), it follows by a
well-known property of subdifferentials (see e.g. Rockafellar (1970)), that pk is also
k
bounded. If w q ---! W (q ---! + m), then, by taking a subsequence if necessary, we can
k k
assume u q ---! u, P q ---! P e 8h (u), so lk (w) = Pk(w - k
u ) ---! p(w - u) ~ h(w) -
q
h(u) ~ e/2 - e = - e/2 < O. Finally, yk = Je + (Jk (uk - Je) for some (Jk ~ 1, hence
= (Jk lk (uk) + (1- (Jk) lk (wk) = (1-(Jk) lk (Je) ~ O. Thus, all conditions of
lk (yk)
Lemma X.6 are fulfilled and by this Lemma, zk - yk O. Since g(l) = - e, for
---!
sufficiently large k we would have g(zk) < 0, and the Algorithm would stop at Step
d.c., but we are not able to find one ofits d.c. representations.
A first attempt to overcome the resulting difficulties was made by Tuy and
Thuong (1988), who derlved a conceptual outer approximation scheme that is applic-
able for certain noncanonical d.c. problems, and even in certain cases where a d.c. re-
542
minimize f(x)
(P) s . t. gi (x) ~ 0 (i=l, ... ,m)
where fis a finite convex function on IRn and ~ (i=l, ... ,m) are continuous functions
Let us set
With this notation, problem (P) asks us to find the global minimum of f(x) over the
(i) The finite convex fv,nction f has bounded level sets {x E IR n : 1(x) ~ J}, and
the fv,nctions gi are everywhere continuoUSi
(iv) for any Z E IR n, one can compute the point 1r(z) nearest to w in the
Assumptions (i) and (ii) are self-explanatory. Assumption (iii) can be verified by
minimize f(x)
S.t. xElRn
If an optimal solution of this convex program exists satisfying g(x) ~ 0, then it will
functions that we admit. Since, by assumption (i), the functions ~(x) are con-
tinuous, the set of feasible points lying in a line segment [w,z] is compact. This set
does not contain w. Therefore, whenever nonempty, it must have an element nearest
to w. It is easily seen that such a point must lie on Be, and is just 7!'(z). Assumption
For instance, if each of the functions gi(x) (i=l,oo.,m) is convex, then 7I"(z) can be
Initialization:
1
Set w = w, 11 = {l,oo.,m} .
Step k=1,2,oo.:
Choose i k E argmin {~(wk): i E I k} .
If ~ (z) < 0, then stop: there is no feasible point on the line segment [wk,z] because
k
gi(x) < ° k
Vx E [w ,z].
Otherwise, compute the point wk +1 E [wk,z] satisfying gi (w k +1) = 0. (Since Ci
k k
is convex, this point is unique and can be easily determined.)
Set Ik +1 = I k \ {i k}. Umin {gi(wk +1): i E I k+1} ~ 0, then stop: 7!'(z) = wk +1.
Otherwise, go to Step k+ 1.
Clearly, after at most m steps this procedure either finds 7!'(z) or else establishes
It is also not difficult to derive a procedure to determine ?r{z) for other classes of
functions ~(x), for example, when an of the ~(x) are piecewise affine (this will be
left to the reader).
Proposition X.5. Suppose that assumptions (i) - (iv) are satisfied. Then every op-
timal solution 0/ (P) lies on.the boundary oe o/the set e.
proof. Suppose that z is a point satisfying g(z) > O. Then ?r{z) as defined in as-
sumption (iv) exists. By assumption (iii), we have f(w) < fez), and hence
?r{z) = ~w + (l-~)z . •
Note that the duality theory discussed in the preceding sections also applies to prob-
lem (P). In particular, Theorem X.l remains valid, Le., we have the following
corollary.
Assumption (v) is purely technical. If it is not satisfied (Le., if fex) is convex but not
strictly convex), then we may replace fex) by fe(x) = fex) + ellxll 2, which is ob-
viously strictly convex. For an optimal solution i(e) of the resulting problem we
545
then have f(X(E)) + Ellx(E)11 2 S f(x) + Ellxll 2 for all feasible points of the original
The role of assumption (v) is to ensure that every supporting hyperplane of the
level set {x: f(x) S er} supports it at exactly one point. This property is needed in
Algorithm X.3.
Initialization:
Let w E argmin {f(x): x E IRn } (or any point satisfying assurnption (iii)). Compute a
point xl E BC.
k.3.: Compute 7r{ zk). If 1f(zk) exists and f( 7r{ zk)) < f(xk ), then set xk + l = 7r{ zk).
· set x k+l
Ot herwlse, = xk .
546
kA.: Let l+1 be the point where the line segment [w,zk] intersects the surface
{x: f(x) = f(xk+1)}. Compute pk+1 E 8f(l+1) (8f(l+1) denotes the subdif-
Remarks X.6. (i) In Step k.1, checking whether f(x k ) = min f(D k) is easy,
because fis convex and Dk is a polytope. It suffices to determine whether one of the
standard first order optimality conditions holds. For example, xk is optimal if
o E 8f(xk) + ND k
(x ) ,
k
where ND (xk ) denotes the out ward normal cone to Dk at xk , Le., the cone which is
k
generated by the normal vectors of the constraints of Dk that are binding (active) at
(ii) The subproblem (SP k) is equivalent to the problem of globally minimizing the
concave function (-f) over the polytope Dk , and it can be solved by any of the algo-
rithms described in Part B. Since Dk+1 differs from Dk by just one additional linear
constraint, the algorithm for solving (SP k ) should have the capability of being re-
started at the current solution of (SP k_ 1) (cf., Chapter VII). However, most orten
one would proceed as discussed in Chapter II: start with a simple initial polytope D1
(for example, a simplex) whose vertex set V(D 1) is known and determine the vertex
set V(D k+1) of Dk +1 from the vertex set V(D k ) of Dk by one of the methods de-
scribed in Chapter 11. Since max f(D k ) = max f(V(D k )), problem (SP k ) is then re-
Proof. The assertion is obvious for k=1. Supposing that it holds for some k, we
prove it for k+1.
If x E D1 and f(x) ~ f(i+1), then f(x) ~ f(xk), since f(xk+ 1) ~ f(i); hence, x E
Dk by the induction assumption. Furthermore, from the definition of a subgradient
and the equality f(yk+1) = f(xk+1), we see that
(ii) There exists an a* E IR satisfying f(x k) --+ a* (k --+ m). Moreover, whenever
k k
z q --+ z and lk (z q) --+ 0 (q --+ m) for some subsequence kq , we have f(z) = a*.
q
Proof. (i) Since l+1 E D1 and D1 is compact (it is a polytope) , it follows from
a well-known result of convex analysis (see, e.g., Rockafellar (1970)) that the se-
quence {pk+1}, pk+l E M(yk+1), is bounded, i.e., IIpk+111 < L for some L > O.
(ii) The sequence {f(xkn is nonincreasing and bounded from below by f(w) (cf. as-
Finally, since f(w) - f(y) < 0, this implies that >. = 0, i.e., z= y. Therefore,
Proof. Lemma X.9 can easily be deduced from the preceding two lemmas and
Theorem II.1. A simple direct proof is as folIows:
k
Let z = Iim z q. From Lemma X.8 (i) we see that
q-+ID
(40)
But from the construction of the algorithm it is dear that ~(zj) ~ 0 Vk < j. Fixing
k and setting j = kq - - I ID, we obtain lk(Z) ~ O. Inserting this in the above inequality
yields
549
k k
o5 lk (z qH Lllz q - zll -+ 0 ,
q
k
where the first inequality follows from the definition of z q in the above algorithm
Proof. Since a* 5 f(x l ), it follows from the construction of D1 that x is not a ver-
tex of D 1. Therefore, using the strict convexity of f(x), we see that there exists a
point uq E D 1 such that Ilu q -xII ~ l/q for any integer q > 0 and f(u q ) > a*.
But the inclusion zk E argmax {f(z): z E Dk } and Lemma X.9 imply that max {f(z):
k
z E Dk } -+ a* (k -+ m). Hence, u q ~ Dk for some k ,Le., one has y q and
q q
k k
p q E M(y q) such that
k k
p q(uq - y q) > 0 . (41)
k k
By passing to subsequences if necessary, we may assume that y q -+ y and p q -+ p
as q -+ m. Then we know from the proof of Lemma X.8 that f(y) = 0.* and p E M(Y).
We must have p j 0, because 0 E M(y) would imply that f(y) = a* = min f(lR n ),
Letting q -+ m, we see from (41) that p(x - y) ~ 0, and since f(x)= f(Y) = a*,
from the strict convexity of f(x) again it follows that x = y, Le., x =
•
Proposition X.6. Every accumulation point xofthe sequence {l} generated by Al-
gorithm X.9 satisfies the condition
550
11 problem (P) is stable, then every accumulation point x01 {i} is an optimal Slr
lution 01 (P).
Proof. First note that, for every k the line segment [w,l] contains at most one
feasible point, namely either yk, or no point at all. Indeed, either yk = xk = 1I{zk-l)
and, from the definition of 1I{ zk-I), yk is the only feasible point in [w,yk], or else l
satisfies f(l) = f(x k- l ) and there is no feasible point in [w,yk].
Now suppose that (42) does not hold. Then there is a point x satisfying
f(x) ~ f(i) = a* and g(x) > O. Since fand gare continuous and f(w) < a*, there is
also a point x satisfying f(x) < a* and g(x) > O. Let U denote a closed ball around x
such that g(x) > 0 and f(x) < a* for all x E U. Let i: E DI be the point of the half-
line from w through x for which f(i:) = a*. From Lemma X.IO we know that there is
k k
a sequence y q - - I i: (q --I ID). For sufficiently large q the line segment [w ,y q] will
intersect the ball U at a point x' satisfying g(x ') > 0 and f(x') < a*. But since
k k k
f(y q) = f(x q) ~ a*, this implies that x # y q Vq, contradicting the above ob-
servation.
Therefore, relation (42) holds, and the second assertion follows from Corollary X.3 .•
point) xloc by one of the standard nonlinear programming techniques, and then to
apply aglobai optimization algorithm with the aim of locating a feasible solution xk
k
f(x ) < f(xl oc) -1/ ,
S.t. 2 2 2 2
gl (x):= 2(xC15 ) +(x2-9) + 3(x3-18) + 2(x4-10) ~O ~ 0,
The global minimum of f(x) over 1R4 is O. Suppose that a feasible solution is given by
We choose w = (14, 10, 16, 8) with f(w) = 0, g(w) = min {f1(w), g2(w)}: -37.
2 2 2 2
x = (13.91441, 9.93887, 22.01549, 7.95109), y = x ,f(x ) = 108.58278;
Now suppose that problem (P) is not stable. In this case, Algorithm X.3 is not
guaranteed to converge to a global solution (cf. Proposition X.9). However, an
e-perturbation in the sense of the following proposition can always be used to handle
unstable problems, no matter whether or not the perturbed problem is stable.
Proposition X.7. Let ire} denote any accummation point 01 the sequence d'{e}
generated by Algorithm X.9 when applied to the e-perturbed problem
minimize I{ z}
{P{e}}
s. t. g/z} + e ~ 0 {i=l, ... ,m}
Then as e -10 every accummation point olthe sequeme {ire}} is an optimal solution
olproblem {P}.
Proof. Let D(e) denote the feasible set of problem (P(e)). Clearly, D(e) contains
the feasible set of (P) for all e > o. From Proposition X.6 we know that every ac-
cumulation point x(e) of {xk(e)} satisfies g(x(e» = -E, and
This implies that f(x(e)) < min {f(x): g(x) ~ -e/2}. Therefore, as e - I 0 every ac-
cumulation point x of X(e) satisfies g(X) = 0, and f(X) S min {f(x): g(x) ~ O}j hence,
it is an optimal solution of (P).
•
Remark X.7. If we apply Algorithm X.3 to the above perturbed problem and stop
assoonas
then xk(e) yields an approximate optimal solution of (P) in the sense that
select abounding operation in accordance with the given type of objective function
which provides a lower bound ß(M) for min f(D n M) or min f(M), respectively. We
apply abound improving selection for the partition elements to be refined. Finally,
if necessary, we choose from Section IV.5. the "deletion by infeasibility" rule that
corresponds to the given feasible set, and we incorporate all of these elements into
the prototype BB procedure described in Section IV.!. Then the theory developed in
Note that, as shown in Section IVA.5., whenever a lower bound ß (M) yields con-
sistency or strong consistency, then any lower bound "ß (M) satisfying "ß (M) ~ ß(M)
for all partition sets M will, of course, also provide consistency or strong consistency,
respectively. Hence, better and more sophisticated lower bounds than those dis-
cussed in Section IVA.5. can be incorporated in the corresponding BB procedures
without worrying about convergence.
globaU71 minimizing a d.c. junction subject to a finite number of convex and reverse
convex inet[Ualities which has been proposed in Horst and Dien (1987).
Let f1: IRn --+ IR be a concave function, and let ~, ~, h j: IRn --+ IR be convex func-
tions (i=l, ... ,mj j=l, ... ,r). Define the convex function
(44)
(45)
Assume that D is nonempty and compact and that a point yO satisfying g(yO) < °
is known.
Algorithm X.4.
Step 0 (Initialization):
Construct an n-simplex MO 'J D1 (cf. Chapter III) and its radial partition
At the beginning of Step k we have the current partition ..,K k-l of a subset of MO
still of interest.
Moreover, we have the current lower and upper bounds ~-1' ak_l (possibly
a k_ 1 = m) which satisfy
and subdivide every member of ,9Jk into a finite number of n-ilimplices by means
of an exhaustive radial subdivision. Let ,9Jk be the collection of all new partition
elements.
556
k
11:.3. Delete every M E .9l for which the deletion rule (DR!) (cf. Section IV.5)
applies or for which it is otherwise known that min f(D) cannot occur. Let .Jt k be
the collection of all remaining members of .9lk.
Proposition X.8. (i) 1/ Algorithm X.4 does not tenninate after a finite number 0/
iterations, then e'IJery accumulation point 0/ the sequence {yk} is an optimal solution
o/problem (46), and
(ii) 1/ SM # 0 /or e'IJery partition element M that is ne'IJer deleted, then the se-
quence {.}} has accumulation points, and e'IJery accumulation point is an optimal sa-
lution o/problem (46) satisfying
(50)
557
Proof. Proposition X.ll follows from the theory presented in Chapter IV.. Since
{~} is a nondecreasing sequence bounded from above by min f(D), we have the
existence of
satisfying
(52)
and
(53)
In Proposition IVA, it was shown that deletion rule (DR2) is certain in the limit,
and by Proposition IV.3 we have strong consistency of the bounding operation (cf.,
Definition VI.7). It follows that
yeD (54)
and
(55)
and hence
In order to prove (ii), recall from Lemma IV.5 that under the assumptions of
Proposition X.11(ii) the bounding operation is also consistent. The assertion then
follows from Theorem IV.3 and Corollary IV.2, since f is continuous and D is
compact.
•
Remark X.S. In addition to (47), several other bounding procedures are available.
One possibility, for example, is to linearize the convex part f2 of f at different ver-
tices v* E M and to choose the best bound obtained from (47) over all v* e M con-
sidered (cf. Section XI.2.5).
Another method is to replace the concave part f1 of f by its convex envelope cp over
the simplex M (which is an affine function, cf. Section IV.4.3.) and to minimize the
convex function (cp + f2 ) over M.
More sophisticated bounding operations have been applied to problems where
additional structure can be exploitedj examples include separable d.c. problems such
as, e.g., the minimization of indefinite quadratic functions over polytopes where
piecewise linearization has been used (Chapter IX).
to solve the more general and more difficult d.c. problem (general reverse convex
programming problem).
Further developments in concave minimization, however, have led to certain com-
binations of outer approximation and branch and bound methods that involve only
linear programming subproblems and line searches (Benson and Horst (1991), Horst,
Thoai and Benson (1991)), cf. Algorithm VII.2 and the discussion in Section VII.1.9.
The first numerical experiments indicate that, for concave minimization, these
methods can be expected to be more efficient than pure outer approximation and
pure branch and bound methods (cf. Horst, Thoai and Benson (1991) and the dis-
cussion in Chapter VII.). Therefore, some effort has been devoted to the extension of
these approaches to the d.c. problem. The resulting procedure, which is presented
below, can be viewed as an extension of Algorithm VII.2 to the d.c. problem which
takes into account the more complicated nature of the latter problem by an ap-
propriate deletion-by-infeasibility rule and a modified bounding procedure (cf.
Horst et al. (1990)).
For other possible extensions of the above mentioned linear programming -line
search approaches for concave minimization to the d.c. problem we refer to Horst
(1989), Tuy (1989a) and Horst et al. (1991), see also Horst, Pardalos and Thoai
(1995).
minimize f(x): = cx
(CDC)
S.t. h(x) ~ 0, g(xHO
given below.
560
Let
(Note that 0 is the set which in Section X.I was denoted by H.) Assume that
Remarb x.g. (i) Assumptions (a), (b), (c) are quite similar to the standard as-
sumptions in Section X.1.2. A polytope T satisfying (a) is often given as a rectangle
(ii) Assumption (b) is often not satisfied in formulations of (CDC) arising from
applications. However, since 0 is bounded and a simple polytope T ) 0 with known
vertex set V(T) is at hand, we can always redefine
(iii) Assumption (c) is fulfilled if 0 satisfies the Slater condition (h(w) < 0) and if
the reverse convex constraint is essential, i.e., if it cannot be omitted (cf. Section
X.1.2).
In order to simplify notation, let us assume that the coordinate system has been
translated such that the point w in assumption (c) is the origin o.
561
Let
OE int S .
Let Fi denote the facet of S opposite vi, Le., we have vi ~ Fi (i=l, ... ,nH). Clearly,
F. is an (n-1)-simplex.
1
Let M(Fi ) be the convex polyhedral cone generated by the vertices of Fr Then we
is a conical partition of IRn. This partition will be the initial partition of the algo-
rithm.
denote the cone generated by the vertices of U. Furthermore, suppose that ais the
Q = P n {x: cx ~ a} .
562
The algorithm below begins with P = T, and then successively redefines P to in-
Now we present a method for calculating a lower bound of f(x) = cx on the inter-
section D n M n {x: cx ~ /l} of the part of the feasible set still of interest with the
cone M provided that D nM n {x: cx ~ /l} :f: 0. This method will also enable us to
detect sufficiently many (but not necessarily all) cones M of a current partition
which satisfy
+ n . n
H n M = {x E IRn: x = E >..yl, E >.. ~ 1, >'1. ~ 0 (i=l, ... ,n)} ,
i=l 1 i=l 1
Now let Y denote the (n .. n)-matrix with columns yi (i=l, ... ,n), and consider the
linear programming problem
n
(LP) = (LP(M,Q)): max { E \ AY>. ~ b , >. ~ O} ,
i=l
T n n
where>. = (>'1' ... '>') E IR . Define JL(>') = E >. ..
n i=l I
We recall the geometrie meaning of (LP): eonsider the hyperplane H = {x E IRn: x =
Y>', JL(>') = 1} defined above. Changing the value of JL(>') results in a translation of
H into a hyperplane whieh is parallel to H. The eonstraints in (LP) describe the set
M n Q. Let >.*(M) and 1'* = I'*(M) = JL(>.*) denote an optimal solution and the op-
timal objective funetion value of (LP), respeetively. Then H* = {x E IRn: x = Y>',
JL(>') = I'*} describes a hyperplane parallel to H that supports Q n M at
Let zi = I'*yi denote the point where the i-th edge of M intersects H* (i=l, ... ,n).
Lemma X.11. Let >.* = >.*{M), 1'* = I'*{M) and yi, zi (i=l, ... ,n) be defined as
above.
Proof. Let H- and H*+ denote the closed halfspaces containing 0 generated by
the hyperplanes Hand H* defined above, and let ft- be the open halfspace int H-.
Since H is the hyperplane passing through the points yi E ac (i=l, ... ,n), Cis convex
and 0 Eint C, it follows that the simplex H- n M = conv {O,yl, ... ,yn} is contained
in Cu 8C; hence
(because D n C = 0).
But from the definition of H* it follows that
and hence
(60)
Therefore, since p.* < 1 implies that H*- c ft-, we see that assertion (i) holds.
Now consider the case p.* ~ 1. It is easily seen that (H*- \ ftl n M is a polytope
with vertex set {yi ,zi (i=l, ... ,n)}. It follows from the linearity of the objective func-
tion cx and from (60) that assertion (ii) holds.
•
Remark X.lO. When p.*(M) < 1 occurs, then the cone M is deleted.
Algorithm. X.5.
Initialization:
Let .At 1 = {M(Fi ): i=l, ... ,n+l} be the initial conical partition as defined above.
Determine the intersection points yi (i=l, ... ,n+l) of the rays emanating from 0
and passing through the vertices of the simplex S with the boundary ac of C.
565
Determine
Otherwise, for each cone Mi E .At 1 satisfying JL*(M i ) ~ 1, compute the lower
bound
satisfying ak = f(x k). Furthermore, we have a set .At k of cones generated from
the initial partition .At 1 by deletion operations and subdivisions according to the
rules stated below. Finally, for each cone M E .At k' a lower bound ß (M) $
min {cx: x E D n M} is known, and we have the bound ßk $ min {cx: xE D} and
a not necessarily feasible point xk associated with ßk such that cxk = f1c.
k.1. Delete all M E .At k satisfying
566
cone Mt
Ifx*k E D, then set P k+1 = P k and go to k.4.
1::.3. Determine the point wk where the line segment [O,x*k] intersects the
functions
Set
Set
k
Let .Jt = {Mk,j C Mt j E J k} and set ....rk = (se k \ {Mk}) u .Jtt
For each cone M E ....rk let yi(M) denote the intersection points of its i-th edge
with oe (i=l, ... ,n).
De1ete M E f k if cyi(M) ~ Qk+1 Vi=l, ... ,n (cf. (58».
Let ....rk denote the set of remaining cones.
567
k.5. Set
and, for each newly generated M E A'ic. solve the linear programming problem
Delete all M E A'ic. satisfying /L*(M) < 1. Let .At k+1 denote the collection of
cones in A'ic. that are not deleted. If ~+1 = 0, then stop: if tkic.+l = 00, then the
feasible set Dis empty. Otherwise, x k is an optimal solution.
k.7. Set
. t where ßk+lIS
an d 1et x-k+1 be a pom . att·
ame d·
,I.e., cx-k+1 = ßk+l.
From tkic.+1 and the new feasible points obtained in this iteration determine a
tkk+l = cxk+1 .
Set k I - k+ 1 and go to the next iteration.
vergence theory developed in Chapter IV. According to Corollary IV.3. and Corollary
IV.5, convergence in the sense that li m ~ = min {cx: x E D} and every ac-
k~1J)
teed if we show that any infinite decreasing sequence {Mq} of successively refined
Mn D f 0 ,ß (M ,) - - - I min {cx: x E M n D} ,
q q'.....,
568
Lemma X.12. Assume that the algorithm is infinite. Then we ha'IJe 1! E D n an /or
e'IJery accumulation point 1! o/the sequence {1!k}, where 1!k = Y(M*~)'*(MV (cf
Step k.2.)
Proof. From the convergence theory that is known for the outer approximation
method defined in Step k.3 and the linear programs LP(M,Qk) it follows that
x* E an (cf. Chapter II). In order to prove x* E D, note that, using a standard argu-
ment on finiteness of the number of partition cones in each step, one can concIude
that there exists a decreasing sequence {Mq} of successively refined partition cones
such that for the sequence x*q --I x* we have [O,x*q] C Mq and [O,x*] C M:= n Mq .
q
Since the algorithm use& an exhaustive subdivision process, the limit M must be a
1'*q = p.*(Mq).
Suppose that x* t D. Then, since x* E an, we must have g(x*) < 0, i.e., x* is an
interior point of the open convex set C, and it is also a relative interior point of the
line segment [O,y*]. Hence, there exist a point w E [x*,y*] and a closed ball B
around w such that x* t Band B C C. For sufficiently large q, we then must have
that we have #L~ < 1, and this would have led to the deletion of Mq , a contra-
diction.
•
Lemma X.13. Let {Mq} be an infinite decreasing sequence 0/ successi'IJely refined
partition cones generated by the algorithm. Let 1! be an accumulation point 0/ the
corresponding sequence {1!q}, and denote by y* the intersection o/the ray M:= Zim
q
569
Mn D = [y*,x*] .
Proof. Clearly, {x*q} has accumulation points, since 0 and the initial polytope T
are bounded sets, and x* E M. In the proof of Lemma X.12 we showed that
ß (M ,) - - I min {cx; xE Mn D} .
q q-+JJ
In the proof of Lemma X.13 we saw that every accumulation point x* of {x*q} satis-
Consider a subsequence {q'} such that x*q' ----er x*. We show (again passing to a
subsequence if necessary; we also denote this subsequence by {q'}) that zq' ,i ----er x*
Recall that zq,i = J.I~yq,i. Let yq be the point where the line segment [O,x*q]
intersects the hyperplane Bq through yq,i (i=l, ... ,n). Since yq E conv {yq,i:
. I } ,we see from y q,i q y * tha t yq q y *.
1= , ... ,n
But the relations x*q = J.I~yq , x*q'----er x* , yq'----er y* and the boundedness of {J.I~}
imply that (passing to a subsequence if necessary) we have J.I~' ----er J.I* . It follows
that
,.
q
Z '
l
-:::r+ J.I*Y* = x* E M
q
-
n an
But min {cy*,cx*} = min {cx: x E [y*,x*]}; and from Lemma X.13 it follows that
Finally, since ß (Mq"Qq') ~ ß (Mq ,) ~ min {cx: x E Mq, n D}, we must also have
Proof. Since the selection of the cone to be subdivided in each iteration is bound
improving and the preceding lemmas have shown that the lower bounding is strongly
consistent (cf. Definition IV.7), the assertions follow from Corollary IV.3 and
571
Corollary IV.5.
•
Extension
Note that Algorithm X.5 can also be applied if the objective junction f(x) is a con-
cave junction. The only modification that is necessary to cover this case is to omit
the deletion rule (58) and the sets {x: cx ~ Qk} whenever they occur. Since for con-
cave f the corresponding set {x: f(x) ~ Qk} cannot be handled by linear programming
The following slight modification has improved the performance of the algorithm
in all of the examples which we calculated. Instead of taking the point w defined in
assumption (c) as both the vertex of an the cones generated by the procedure and
the endpoint of the ray that determines the boundary point where the hyperplane
constructed in Step k.3 supports the set Cl, we used two points wO and WO. The point
wO has to satisfy the conditions
minimize cx
S.t. h(x)~O
Example X.2.
= 0.5x1 + 1.2x2
Objective function: f(xl'x 2)
Constraints: h(xl'x2) = max {(xC1)2 + (x2-1)2 - 0.6, 1/(0.7x1 + x2)-5},
2 2
g(xl'x2) = Xl + x2 - 2.4, T = {O ~ Xl ~ 5, 0 ~ x2 ~ 5}.
The points wO, WO were chosen as wO = (1.097729, 0.231592) with f(w O) = 0.826776
and -0
w = (1.0,1.0).
Some special d.c. problems have already been treated in Chapter IX. In this sec-
tion, we discuss design centering problems and biconvex programming.
Recall from Example 1.5 that a design centering problem is defined as folIows.
Let K c IRn be a compact, convex set containing the origin in its interior. Further-
more, let M ( IRn be a nonempty, compact set. Then the problem of finding X E M, r
E IR + satisfying
Problems of the form (61) often arise in optimal engineering design (e.g., Polak
and Vincentelli (1979), Vidigal and Director (1982), Polak (1982), Nguyen et al.
(1985 and 1992), Thoai (1987)). For example, consider a fabrication process where
Let x be the nominal value of this parameter, and let y be its actual value. Assume
that for fixed x the probability
P(lIy-xll ~ r) = p(x,r)
given nominal value of x the production yield can be measured by the maximal value
of r = r(x) satisfying
{y: lIy-xll ~ r} ( M .
In order to maximize the production yield, one should choose the nominal value x
so that
Setting K = {z: IIzll ~ I} and y = x + rz, zeK, we see that this is a design centering
problem.
Another interesting application has been described in Nguyen et al. (1985). In the
scribed form inside a rough stone M. This form can often be described by a convex
body K. Assume that the orientation of K is fixed, i.e., only translation and
Note that, in (61), we mayassume that int M # 0, since otherwise (6·1) has the so-
lution r = O.
574
In many cases of interest, the set M is the intersection of a number of convez and
complementary convez sets, i.e.,
(63)
where C is a closed convex set satisfying int C I 0, and Di = ort \ Ci is the comple-
ment of an open convex subset Ci of IRn (i=1, ... ,m). We show that in this case the
design centering problem is a d.c. programming problem (cf. Thach (1988».
Let
Note that max {r: x + rK ( M} exists if M is eompact. The expression (64), how-
ever, is defined for arbitrary M ( IRn, and we will also consider unbounded closed sets
M.
(65)
proof. Consider
max {cz: Z E K}
z
has a solution satisfying v= ci > 0, because K is compact, 0 Eint K, and c I O.
575
tl-ex
() = - - - .
px
v
henee x + p (x)K c H.
But c(x + rZ) > c(x + p (x)Z) = tl whenever r > p (x).
Therefore, rH(x) = p (x), i.e., we have
tl-ex
r H ()
x =-_-- Vx EH. (66)
v
(67)
tl· - Ci x
rM(x) =. min 1 _ Vx EM , (68)
1=1, ... ,m vi
where
In this case, the design eentering problem reduces to maximizing rM(x) over M.
But since x ~ M implies that tli - cix < 0 for at least one i, we may also maximize
the expression (68) over IRn .
576
Proposition X.IO. Let M be a polytope ofthe form (67). Then the design centering
problem is the linear programming problem
maximize t (70)
i -
S.t. n i -cx~tvi (i=l, ... ,m)
Proposition X.II. Let M = C n D1 n ... n Dm' where Cis a closed convex subset of
IR n, int C f. 0, and Di = IR n \ Ci is the complement of an open convex set in IR n
{i=l, ... ,m}. Assume that int M f. 0. Then the design centering problem is equivalent
to maximizing a d.c. junction over M.
(71)
where, for any H e cN, rH(x) is the affine function (66). Let H = {y: cy ~ a},
v(c) = max {cz: z e K}. Then we have
rdx) = in f
He tR
{n_-v(c)cx) , (72)
577
Note that rC(x) is finite for every x. To see this, let xO e C. Then for all
Let
(75)
a - cx
o
- - > - IIx -xII .
v(c) - rO
Since H' cn for every H' e tR', it follows that rn(x) ~ rH,(x)j hence,
The converse inequality obviously holds for rn(x) = 0, since rH,(x) ~ O. Assume
that rn(x) = 5 > O. Then we see from the definition of r n (x) that there exists a
that B eH', where H' was defined above. Then, we have r H,(x) ~ rn(x), and hence
(77)
2 ~ ~
Example X.3. Let n = IR \ C, where C = {(xl'~): Xl + x 2 > 0, -Xl + ~ > O}.
Obviously, cl Cis defined by the two supporting halfspaces x I +x2 ~ 0, -xI +x2 ~ O.
But finding rn(x) for given K, X may require that one considers additional half-
spaces that support cl C at (0,0). In the case of K,x as in Fig. X.3, for example, it is
the halfspace x 2 S 0 that determines rn(x).
579
x
2
X
1
The proof of Proposition X.14 shows that, under the assumptions there, the
Knowledge of the d.c. structure is of little practical use if an explicit d.c. decom-
position of rM(x) is not available. Such a d .c. representation is not known unless
additional assumptions are made. To see how difficult this general d.c. problem is,
let
where p(z) = inf {>. > 0: z E >.K} is the Minkowski functional 0/ K (cf. Thach (1988),
Thach and Tuy (1988)). Then it is readily seen that
Suppose that p(z) is given. (Note that p(z) = IIzliN if K is the unit ball with
1
L=-
r I
o
where rO= min {I/zl/: z e 8K} .
Proof. Let p(z) denote the Minkowski functional of K such that K = {z:
p(z) ~ I}. Then it is well-known that for y e IRn we have p(y) = W , where 1/·11
1I'i1i
denotes the Euclidean norm, and where y is the intersection point of the ray
{py: p ~ K} with the boundary aK. Therefore, for arbitrary xI ,x2 e ~, it follows
that
Using
and
•
Two algorithmic approaches to solve fairly general design centering problems
have been proposed by Thach (1988) and Thach and Tuy (1988). Thach considers
the case when M is defined as in Proposition X.12 and where p(z) = (z(Az»l/2 with
symmetrie positive definite (nxn) matrix A. He reduces the design centering problem
to concave minimization and presents a cutting plane method for solving it. How-
ever, there is a complicated, implicitly given function involved that requires that in
each step of the algorithm, in addition to the calculation of new vertices generated
by a cut, one minimizes a convex function over the complement of a convex set and
solves several convex minimization problems. However, when C and K are polytopes,
only linear programs have to be solved in this approach (cf. Thach (1988)).
Thach and Tuy (1988) treat the design centering problem in an even more general
setting and develop an approach via so--<:alled relief indicators which will be dis-
cussed in Section XlA.
The first numerical tests (Boy (1988» indicate, however, that - as expected from
the complicated nature of the general problems considered - the practical impact of
both methods is limited to very small problems.
In the diamond cutting problem mentioned at the beginning of this section, the
polytope and M = C n D1 n... n Dm' where Cis a polytope and each Di is the com-
plement of an open convex polyhedral set Ci (i=l, ... ,m). Let
582
(82)
(83)
and
where I, J, Ki are finite index sets; ai , bj , ci,k E IRn ; and Qi' ßj , li,k E IR.
Then we have
(85)
rc(x) = I
j EJ
o
1
Vj J
.
min [-= (ß· + bJx)] , x E C
, x; C
(86)
max {bjz: z E K} (j E J) .
Furthermore, evaluating r D. (x) (i=l, ... ,m) by means of (80) also only requires
1
solving linear programs. Indeed, since K contains 0 in its interior, we can assume
where .N' is the set of closed halfspaces H' = {y: cy ~ er} determined by the
supporting planes of cl C which satisfy cl C C {y: cy ~ er}. Observe that (87) is not
useful for computing rn(x), because .N' may contain an infinite number of elements.
Following Nguyen and Strodiot (1988, 1992), we show that, by the above assump-
tions, one can find a finite subcollection 'J 'of .N' such that for all x e 1R3
Then computing r n (x) amounts to using a formula similar to (86) (cf. also (66)) a
finite number of times.
Proof. From the inclusion 'J' c .N' and (87) it follows that for all x e 1R3 one has
But if x e n, then by (86) we have x + rn(x)K CHO for some HO e 'I 'j hence, by
the definition of r H"
the ones containing a facet of cl C, the ones containing only a single extreme point of
cl C and the ones containing only one edge of cl C. Corresponding to this classifi-
cation, 'I' will be constructed as the union of three finite subsets '11', '12' and '13'
of dl t •
'11' will be generated by all of the supporting planes corresponding to the facets of
cl C. Since cl C has only a finite number of facets, the collection '11' is finite. Let f
be a facet of cl C, and let PI ' P2 ' P3 be three affinely independent points which
characterize it. Suppose that they are numbered counter clockwise, when cl C is
viewed !rom outside. Then the halfspace H' e '11' which corresponds to the facet f
has the representation
The set '12' will be defined as folIows. To each vertex p of cl Cand to each facet v
of K we associate (if it exists) the plane
585
cy =a
which is parallel to the facet v and passes through the vertex p but not through any
other vertex of cl C. Moreover, it is required that
where q is any point of the facet v. Condition (91) ensures that {y: cy ~ a} E tR'
and that K is contained in the parallel halfspace {y: cy ~ cq}.
Computationally, 12' can be obtained in the following way. Consider each couple
cy =a
Let each edge of clC emanating from p be represented by a point p f p on it. Since
a = cp, we have cl C ( {y: cy 5 a} if cp 5 a for of all these points (edges) p. The col-
lection 12' is finite because there exist only a finite number of vertices p and facets
v.
Finally, 13' is defined in the following way. To each edge e of cl C and to each
edge w of K which is not parallel to e, we associate (if it exists) the plane
cy =a
that contains the edge e and is parallel to the edge w. Then as in (91) we have
586
Finally, we set
(90)
In order to prove that the eollection 'J' satisfies condition (86) of Proposition X.16,
we establish the following lemma (cf. Nguyen and Strodiot (1988)).
Lemma X.16. Let xE D, and consider the hal/spaces H' = {y: cy ~ a} ( ,H'
(i) cl an (x + rD(x)K) f 0 ;
(ii) .9'x f 0, and (x + rD(x)K) n cl a(p v PE .9'x;
587
Finally, assertions (iii) and (iv) are straightforward because P is aplane sep-
arating cl C and x + rn(x)K, whereas (v) and (vi) are immediate consequences of
The following proposition is also due to Nguyen and Strodiot (1988, 1992).
Proposition X.14. The coUection 'I' is finite and satisfies property (89) of Pro-
position X.16.
588
Proof. Finiteness of 'I' = '11' U '12' U '13' has already been demonstrated above.
Case 1: dim Y* = 2:
halfspace H' E CH' generated by P belongs to 'I{ Since x + rD(x)K eH', condition
(89) is satisfied.
Case 2: dim Y* = 1:
Since x + rn(x)K is bounded and dirn Y* = 1, we see that Y* is a c1osed,
bounded interval which does not reduce to a singleton. Consider the following three
subcases.
interior point of a facet of cl C, and thus, by Lemma X.16 (iii), it contains the whole
facet. Consequently, the halfspace H' E CH' generated by P again belongs to '11', and
Hence, by Lemma X.16 (v), it follows that the halfspace H' E tR' generated by P
the two facets of cl C such that e = fl nf2 ' and let vI and v2 be the two facets of K
determining w = vI nv 2 . Then, by Lemma X.16 (ii) and (iv), each P E .9'x contains
the two colinear edges e and x + rn(x)w. Therefore, among all planes P E .9'x ' there
exists at least one plane P' which contains one of the four facets f p ~, x + rn(x)vl'
x + r n (x)v 2. Let H' E tR' be the halfspace corresponding to P'. Then we have
H' E '11' if f l E P' or f2 E P'. Otherwise, by Lemma X.16 (v), one has H' E '12' . In
Case 3: dim Y* = 0:
In this case, when Y* is reduced to a singleton Y* = {y*}, we must consider six
subcases according to the position of y* with respect to cl Cand x + rn(x)K.
Case 3.1: y* is avertu 0/ z + rD(zjK and belongs to the relative interior 0/ a /acet 0/
cl C.
Let P E .9'x' and let H' E tR' be the corresponding halfspace. By Lemma X.16
(ii) and (iii), P contains a facet of cl C. Thus we have H' E '11' and, since
Lemma X.16 (v), H' E '12'. Since x + rn(x)K eH', it follows that condition (89) is
satisfied.
590
Case 3.4: y* is avertex 0/ x + rn(x)K and belongs to the relative interior 0/ an edge e
o/cl C.
Let f1 and f2 be the two facets of cl C which determine e, i.e., e = f1 n f2. Then,
by Lemma X.16 (ii) and (iv), each P E .J'x contains e. Therefore, there exists at least
one plane P' E .J'x which contains either one of the faces f1 and f2 or an edge of
x + rn(x)K emanating from x. (Observe that in the latter case this edge of
Two possibilities can occur. If none of the edges of x + rn(x)K is collinear with e,
then we can argue in the same way as in Case 3.4 to conclude that (89) holds.
e = fl nf2, and let vI and v2 be the two facets of K determining w = vI n v2. Then,
among all of the planes P E .9'x containing e, there exists at least one plane P' which
contains one ofthe four facets fl' f2, x + rD(x)vl' x + rD(x)v2 . Let H' e JI' be the
halfspace generated by P'. Then we have H' E 11' if f l E P' or ~ E P'. Otherwise,
using Lemma X.16 (v), we see that H' E 12'. In any case, x + rD(x)K eH', and con-
dition (89) is satisfied. _
Example X.4. An application of the above approach to the diamond cutting and
dilatation problem is reported in Nguyen and Strodiot (1988). In all the tests dis-
cussed there, the reference diamond K has 9 vertices and 9 facets and the rough
First, from the vertices, edges and facets of K and M the three finite collections
11',12' ,13' were automatically computed. Then the design centering problem was
solved by an algorithm given in Thoai (1988). Although the theoretical foundation of
this algorithm contains an error, it seems that, as a heuristic tool, it worked quite
efficiently on some practical problems (see Thoai (1988), Nguyen and Strodiot (1988
and 1992)). The following figure shows the reference diamond K, the rough stone M
and the optimal diamond inside the rough stone for one of the test examples. In this
example 11' has 9 elements, 12' has 1 element and 13' has 3 elements. The optimal
dilatation 'Y is 1.496.
592
problem namely
where:
and
Some immediate extensions of problem (SBC) are obvious. For example, a term
Note that, even though each term xiYi in xy is quasiconvex, it is possible to have
proper local optima that are not global. For example, the problem min {xy: (x,y)
where p, q E IRn; C is a (nxn) matrix, and X and Y are polytopes in IRn. This problem
is treated in Section IX.!. In Section 1.2.4, we showed that problem (BLP) is equi-
valent to a special concave minimization problem. From the form of this equivalent
concave minimization problem (Section 1.2.4) and the corresponding well-known
property of concave functions it follows that (BLP) has an optimal s~lution (x,y),
and Section 1.2.4). As pointed out by AI-Khayyal and Falk (1983), this property is
minimize (-x + xy - y)
s . t. -6x + 8y ~ 3
3x - y~3
o ~ x, y ~ 5
has an optimal solution at (i, ~) which is not an extreme point of the feasible set.
Moreover, no extreme point of the feasible set is a solution.
The jointly constrained bilinear programming problem, however, has an optimal
solution on the boundary of a compact, convex feasible set. This can be shown in a
somewhat more general context, by assuming that the objective function is bi-
concave in the sense of the following proposition (cf. Al-Khayyal and Falk (1983)).
O.
proof. Assume that there exists a point (xo, yo) Eint C such that
F(xO,yo) < F(x,y) V(x,y) E 00.
It follows that in particular we have
F(xO,yo) < F(xO,y) Vy E {}C (xo),
where
C(xo) = {y: (XO,y) E Cl.
But C(xo) is a compact and convex and yo is a (relative) interior point of C(xo).
Bence, the concave function F(xo,.) attains its global minimum over C(xo) at an
extreme point of C(xo), which is a contradiction to the last inequality above.
•
Another proof of Proposition X.15. can be found in Al-Khayyal and Falk (1983).
595
Recall, for example, that minimization of a (possibly indefinite) quadratic form over
Several algorithms have been proposed for solving the bilinear programming prob-
lem, some of which are discussed in Section IX.l.
For the biconvex problem under consideration, observe that the methods in the
preceding sections can easily be adapted to problem (SBC). Specifically, several
branch and bound approaches are available. One of them is the method discussed in
Section X.2 for a more general problem.
version in the subdivision rule which governs the refinement of partition sets and in
below will use bisection, but any other exhaustive subdivision will do as weIl (cf.
also Chapter VII).
will be used.
The selection rule that determines the partition sets to be refined in the current
We begin by deriving a formula for 'PM(x,y) (cf. Al-Khayyal and Falk (1983)).
Let M = {(x,y):= ~ ~ x ~ a, E. ~ y ~ 'Ei} be a 2n-rectangle in 1R2n , and let
596
(ii) 1/ /(z) = cz and g(y) = dy, e,d E IRn, then cz + lPA!z,y) + dy is the
eonvez envelope 0/ F(z,y) over M.
Proof. Part (i) is an immediate consequence of Theorem IV.S, and part (ü) fol-
~(x,y), where x and y are now real variables rather than vectors.
Recall that the convex envelope of a function h on M may be equivalently defined
as the pointwise supremum of an affine functions which underestimate h over M.
Since x - .!!: ~ 0 and y - !! ~ 0, it follows after multiplication that
If ep were not the convex envelope of xy over M, there would be a third affine
Suppose that (i,Y) E MI· Then (i,Y) is a unique convex combination of the three
extreme points vl,v 2,v3 of MI. Hence, for every affine function l one has
Step 0 (Initialization):
to deterrnine ßO = min ~M(M n K). Let SM be the finite set of iteration points in
Mn K obtained while solving (PM). Set QO = min F(SM) and (xO,yo) E argmin
F(SM)·
If QO - ßO = °(~ c), then stop. (xO,yO) is an (c-)optimal solution.
598
current lower and upper bounds J\-I' Qk-l satisfying ßk- 1 ~ min F(K n R) ~ Ilk- 1
are at hand, and we have a subset .At-k_ 1 of .At k-l whose elements are the partition
sets M such that J\-1 = ß (M). Finally, the current iteration point (xk- 1,yk-l) e
K n R is the best feasible point obtained so far, i.e., one has F(xk- 1,l-l) = Ilk_ 1.
11:.2. Select a collection .9k C .ge k satisfying .At-k_ 1 C .9k ' and bisect each member
of .9k . Let .9ic. be the collection of all new partition elements.
iteration points obtained while solving (PM) and best feasible points known from
iteration k-l). Set Il (M) = min F(SM)' ß(M) = min 'M(K n M).
11:.5. Set vK k = (.ge k\ .9k) u .At ic..
Compute
Let (xk,yk) E K n R be such that f(xk,yk) = ak' and set "{{-k = {M E .,{{ k:
ßk = ß (M)}.
If ak - ~ = °(~ e), then stop. (xk,yk) is an (e-)optimal solution. Otherwise, go
to Step k+1.
Proposition X.IS. 1/ Algorithm X.5 does not terminate after a finite number 0/
iterations, then the sequence {:i, yk} has accumulation points, and e'IJery accumu-
tation point 0/ {(zk,yk)} is an optimal solution 0/ problem (91) satisfying
Proof. Proposition X.21 can be derived from the general theory of branch and
IV.2.
Since the functions f, g are convex on an open set containing R, we have con-
tions, bound improving selections are complete. Therefore, Proposition X.21 follows
from Theorem IV.3 when consistency of the bounding operation is established. We
show that any decreasing sequence {Mq} of successively refined partition elements
satisfies
lim (a (M q ) -
q~m
ß (M q)) = °, (100)
Let Mq = Mx,q xM y, q' where Mx, q' My, q denote the projection of Mq onto the
x-space IRn and the y-space IRn, respectively. Denote h(x,y) = xy. Recall from The-
600
orem IV.4 that for the convex envelope ~ of hone has min ~ (M q ) =
. q q
min h(Mq ), i.e., the global minimum of h over Mq is equal to the global minimum of
its convex envelope ~ over Mq . Then, from the construction of the lower bounds
q
ß (Mq) we see that
But the subdivision is exhaustive, i.e., Mq q {(i,y)} ( D, and all of the infeas-
ible partition sets are deleted. Recall that a (M q ) = F(iq,yq) for some (iq,yq) e D n
Mq. Using the continuity of f, h, g, it follows that as q --i m the left-hand side of
(101) converges to f(i) + h(i,Y) + g(Y) = F(i,Y). Since we also have a(M q ) =
F(iq,yq) q F(i,y), we see that condition (100) follows from (101) if we let q - - i m•
•
The original version of Al-Khayyal and Falk (1983) proceeds like Algorithm X.5
kk kk kk kk
x.y. - 'PM(x.y,) = max {x.y. - ~(x.y.)}
J J J J i=l , ... ,n 1 1 1 1
and then splitting with two hyperplanes through (x~,y~) orthogonal to the
!
_._... ....._.._._.._-_..._....._,...._._......_-
k k
", ...,.... __ ........., .......•: ._(X
,
,Y1.......
.. ....l" .....
,,,
) , ............ _...... ,
proaches for such convex-concave problems can be found in Muu and Oettli (1991),
In this chapter, we discuss global optimization problems where the functions in-
volved are Lipschitz-{;ontinuous or have a related property on certain subsets
M c IRn. Section 1 presents a brief introduction into the most often treated
univariate case. Section 2 is devoted to branch and bound methods. First it is shown
that the well-known univariate approaches can be interpreted as branch and bound
methods. Next, several extensions of univariate methods to the case of n dimensional
problems with rectangular feasible sets are discussed. Then it is recalled from
Chapter IV that very general Lipschitz optimization problems and also very general
systems of equations and (or) inequalities can be solved by means of branch and
bound techniques. As an example of Lipschitz optimization, the problem of
minimizing a concave function subject to separable indefinite quadratic constraints
is discussed in some detail. Finally, the concept of Lipschitz functions is extended to
SO-{;alled functions with concave minorants.
In Section 3 it is shown that Lipschitz optimization problems can be transformed
into equivalent special d.c. programs which can be solved by outer approximation
techniques. This approach will then be generalized further to a "reijef indicator"
method.
604
Recall from Definition 1.3 that a real-valued funetion f is ealled a Lipschitz fu.n~
a so-called oracle.
This relatively simple problem (UL) is interesting beeause it anses in many ap-
pIicataions and also because some algorithms for solving problem (UL) can easily be
tem, which can in many eases be measured for given values of some parameter(s)
even if the governing equations are unknown (e.g., Brooks (1958». Other examples
are discussed in Pinter (1989), see also the survey of Hansen and Jaumard (1995).
605
Denote by cJ1 L [a,b] the c1ass of Lipschitz functions on [a,b] having Lipschitz con-
stant L. Then it is easy to see that no algorithm can solve (UL) for all fE cJ1 L [a,b]
by using only a finite number of function evaluations (cf., Ransen et al. (1989)).
Theorem XI.I. There is no algorithm for solving any problem (UL) in cJ1 L[ a,b]
that wes only a finite number of function evaluations.
global minimizer x* after k steps, i.e., we have f(x*) = min f(xi ) (k> 1).
i=1, ... ,k
Denote Xk = {xl, ... ,xk}, and let f(X k ) be the set of corresponding function
values. Let x-i E Xk \ {x*} be the evaluation point different from x* which is c10sest
to x* on the left (if such a point does not exist, then a similar argument holds for the
point in Xk \ {x*} c10sest to x* on the right). Consider the function
- _~ + x* f(x j ) - f(x*)
x- 2 + 2L
with
Frequently, instead of problem (UL), one investigates problem (ULt ) which con-
c <
f*c := f!(x*) - f* + c. (3)
It is obvious that every problem (ULc) always can be solved by a finite algorithm.
Problems (UL) and (ULc) have been studied by several authors, e.g., Danilin
(1971), Evtushenko (1971 and 1985), Piyavskii (1972), Shubert (1972), Strongin
(1973 and 1978), Timonov (1977), Schoen (1982), Shepilov (1987), Pinter (1986 and
of all optimal solutions to (UL); see, for example, Basso (1982), Galperin (1985 and
1988), Pint er (1986 and 1988), Hansen et al. (1991).
An algorithm such as (4), where the evaluation points are chosen simultaneously,
is frequently said to be passive, since the step size is predetermined and does not de-
pend on the function values. Its counterpart is a sequential algorithm, in which the
choice of new evaluation points depends on the information gathered at previous it-
erations.
For most functions f, the number of evaluation points required to solve (UL c) will
be much smaller with a suitable sequential algorithm than with a passive algorithm.
In the worst case, however, the number of evaluation points required by a passive
and by a best possible sequential algorithm are the same (cf. Ivanov (1972), Archetti
607
and Betro (1978), Sukharev (1985)). It can easily be seen from the following dis-
cussion that this case arises when fis a constant function over [a,b].
Given Xk = {x1,... ,xk }, the corresponding set f(X k ) of function values, and the
(6)
(cf. 1.4.1).
Obviously, for a fixed set X k, the best underestimating function using the above
information is given by
(7)
Because of its shape, a function Fk of the form (7) will be called a saw-tooth cover
1 2 k 1 k
0/ fLet X k be ordered such that a ~ y ~ y ~ ... ~ y ~ b, where {y ,... ,y } =
{x1 ,... ,xk }. The restriction of F k to the interval [yi,yi+1] of two consecutive evalu-
ation points is said to be the tooth on [yi,yi+1]. A straight forward simple calcula-
tion shows that the tooth on [yi,yi+1] attains its minimal value (downward peak)
at
with
(9)
Since the number of necessary function evaluations for solving problem (ULe )
measures the efficiency of a method, Danilin (1971) suggested studying the minimum
(UL ) (cf. also Hansen et al. (1988 and 1989), Hansen and Jaumard (1995)).
e
This can be done by constructing a reference saw-tooth cover
608
for solving (UL e ) with a minimal number kp of function evaluations. Such a refer-
ence cover is constructed with f* assumed to be known. It is, of course, designed not
to solve problem (ULe) from the outset, but rather to give a reference number of ne-
cessary evaluation points in order to study the efficiency of other algorithms.
It is easy to see that a reference saw-tooth cover FrJt) can be obtained in the fol-
lowing way. Set FrJa) = f* - e. The first evaluation point xl is then the intersection
point of the line (f* - c:) + L(x - a) with the curve fex). The next downward peak is
at
f(x)
f(x)
f- (
~~------------~----~-.--~~.------
o k p,k k+l
x x x b
Initialization:
Set k = 1 , xl solution of the equation fex) = (f* - E) + L(x - a).
Consider problem (ULE). Let xk be the last evaluation point and let f E denote the
current best known function value. We try to find xk + 1 such that the step-size
xk+ 1 _ xk is maximal under the condition that, if f(xk + 1) ~ f E ' then we have
(10)
(11)
(12)
This is essentially the procedure of Evtushenko (1971), (cf. also Hansen and J au-
mard (1995)).
610
Initialization:
Note that from the derivation of (12) it follows that, if f(x k+1) < f(x k) , then
the downward peak F(xp,k) differs from the new incumbent value f(x k +1) by more
than c.
We see from (12) that in the worst case of a constant or a monotonically decreasing
function f, we have f(x k ) = fc for all k, hence xk+ 1 - xk = if. ' which is the
step-size of the passive algorithm (4)
1 + rlog2 (1 + L(b2; a»l . It is attained when fis an affine function on [a,b] with
slope L. The efficiency of the procedure depends greatly on the position of the op-
timal solution x*, and it tends to become worse when x* --I b. Its saw-tooth cover
can differ considerably from the reference saw-tooth cover, particularly when x* is
Le., the evaluation points at successive iterations are increasing values of x belonging
to [a,b] , the algorithm of Piyavskii (1967 and 1972) constructs more and more
refined saw-tooth covers of f in the following way. Starting with xl = ~ and
f(x 1), the first saw-tooth cover
is minimized over [a,b] in order to obtain its lowest downward peak at x 2 E argmin
The function fis then evaluated at this "peak point" x2, and the corresponding
tooth is split into smaller teeth to obtain the next cover
(13)
Piyavskii's algorithm with various extensions seems to be the most often dis-
cussed approach for problems (UL) and (ULc). It was rediscovered by Shubert
(1972) and Timonov (1977). Archetti and Betro (1978) discuss it in a general frame-
tions. Pinter (1986 and 1986a) introduces five axioms which guarantee the conver-
gence of certain global algorithms, and he shows that Piyavskii's algorithm, as well
as others, satisfy them. Schoen (1982) proposes a variant of Piyavskii's approach
that, instead of choosing the point of lowest downward peak to be the next
evaluation point, selects the evaluation point of the passive strategy (4) that is
closest to this peak point. Shen and Zhu (1987) discuss a simplified version in which
at each iteration the new evaluation point is at the middle of a subinterval bounded
by two consecutive previous evaluation points. Hansen et al. (1989 and 1991) present
a thorough discussion of Piyavskii's univariate algorithm and related approaches
which includes a theoretical study of the number of iterations which was initiated by
Danilin (1971). Arecent comprehensive survey is Hansen and Jaumard (1995).
and
(14')
seems not to be very promising with respect to numerical efficiency for dimension
n > 2 since (14') constitutes an increasingly difficult d.c. problem (cf. Section 1.4).
In Horst and Tuy (1987) it is shown that Piyavskii's algorithm, as well as others,
can be viewed as branch and bound algorithm. In this branch and bound reformu-
lation, the subproblems correspond to a tooth of the current saw-tooth cover, and
613
This way of viewing the procedure will allow one to delete unpromising teeth in
order to reduce the need for memory space. We shall return to this branch and
bound formulation in the more general framework of Section X1.2.
In the following algorithmic description of Piyavskii's method for solving (UL e),
fe denotes the current best value of the function f, whereas Fe denotes the minimal
lnitialization:
Set k = 1, x 1 = ~
a+b , Xc = x1, f = f(xc),
e
Fe = fe - L(b;a) , F 1 = f(x 1) - L Ix - x \
If fe - Fe $ e, then stop.
Otherwise determine
xk +1 E argmin Fk([a,b]).
Go to Step k.
Obviously, the second and the third evaluation points in Piyavskii's algorithm are
Now suppose that for k ~ 3 the first k evaluation points have already been gener-
1 2 k 1 k
ated and ordered 50 that we have a = y ~ y ~ ... ~ y = b, where {y ,... ,y } =
(15)
and
F ( k+1) _
kx -
f(~+1)+f(~)
2 L r'+1 2- r. . (16)
below.
Let f(x 1),f(x 2), ... ,f(xk- 1) be the values of fE I}>L[a,b] at the first k-1 evaluation
points of a saw-tooth cover algorithm for solving problem (UL) (or problem ULe:)'
In addition, let I}>k_1(f) C I}>L[a,b] denote the set of all Lipschitz functions (with
Lipschitz constant L) which coincide with f(x) at these points. For all rp E I}>k_1 (f),
denote rp* = min rp[a,b]. Consider the saw-tooth cover ~(x) of rp(x) at iteration k
which, given ~-1 (x), is determined by the choice of the evaluation point xk . We
are interested in making the choice in such a way that the error
(17)
is minimized in the worst case. In other words, we attempt to minimize the quantity
(18)
(ll optimality in one stepll, cf. Sukharev (1985) for a related definition).
615
Proposition XI.i. Piyavskiz's algorithm is optimal in one step in the sense 0/ (17),
(18).
Proof. Since for all rp E 4>k_l(f) we have f(xi ) = rp(xi ) (i=I, ... ,k-l), the previous
saw-tooth covers ~-1 (x) coincide for all rp E 4>k(f) , Le., one has ~-1 (x) =
Fk_ 1(x).
and
It follows from the construction of F k-1 that the maximal error after k-l iterations
(19)
In order to investigate the worst case error in the next iteration, first note that
there exist functions rp E 4>k_l (f) satisfying
Furthermore, let ak_ 1 ' bk- 1 E {x I ,... ,xk-1} , ~-1 < bk_I' be the nearest evalu-
ation points to the left and to the right of xp,k, respectively, Le., we have
. '+1
(cf. (15), (16), where ak- 1, bk_ 1 correspond to yJ, yJ ).
Let xk denote the next evaluation point and consider the two cases xk t [ak_ 1,
k
bk_I] and x E [~_I,bk_l]' Denote
616
. . 1 k-l
and let f k_ 1 = mm {f(x ), ... ,f(x H.
Suppose that x k ~ [ak_ l ' bk_I]. Then for all cp E tf>k_l(f) satisfying (20), we
have
(cf. (17)). In this case, the maximal error is not improved by the choice of x k and
(21)
Now suppose that xk E [ak- l ' bk-I]. In this case, we have xk f:. ak_ l and
The algorithms for solving problem (UL g ) discussed so far regard the Lipschitz
constant L as known apriori. We would like to mention that Strongin (1973 and
1978) proposes an algorithm that, instead of L, uses an estimate of L which is a mul-
tiple of the greatest absolute value of the slopes of the lines joining successive evalu-
ation points. Convergence to a global minimum (g = 0) can be guaranteed whenever
this estimate is a sufficiently large upper bound for the Lipschitz constant L (for de-
tails, see Strongin (1978), Hansen et al. (1989), Hansen and Jaumard (1995)).
In this section, the branch and bound concept developed in Chapter IV will be
applied to certain Lipschitz optimization problems. We begin with an interpretation
of Piyavskii's univariate algorithm as a branch and bound procedure. Then the case
of an n-dimensional rectangular feasible set D is considered, where a generalization
617
Let the first k (k ~ 3) evaluation points of Piyavskii's algorithm for solving prob-
lem (UL) be ordered in such a way that we have a = y1 ~ y2 ~ ... ~ l = b, where
{y1, ... ,yk} = {xl, ... ,xk}. Obviously, the intervals
i i+1]
M k . = [y,y (.1=1, ... ,k
-1) ,
,I
constitute the associated lower bounds (cf. (15) and (16)). Changing notation
slightly, let xk,i e {yi,yi+1} satisfy f(xk,i) = min {f(yi), f(yi+1)}j
set
and let the current iteration point xk e {xk,i: i=1, ... ,k-1} be such that
(24)
. k k· 1
into the two intervals [y-I,z ], [z ,yH ], where
Rearranging the evaluation points y1, ... ,yk,zk in the order of increasing values to
.
obtaln a = y1 ~ 2
y ~ ... ~ y
k+1
= b, {y1,... ,yk ,zk} = {y1, ... ,yk+1 }, we see that all
which deletion operations can be used to reduce memory space compared to the
Proposition XI.2. Consider Piyavskiz's algorithm for solving problem (UL) in its
The selection is obviously bound improving and the set {x E [a,b]: f(x) ~ f(x 1)} is
tion.
Now consider the sequence {zk} of evaluation points zk generated at each Step k.
Q ~ f(z) (29)
holds.
Ix - zkq'+11 < 2t
c ,
' q > qc '
we have
(32)
620
This contradicts the assumption that zis an accumulation point of {zk}. Therefore,
there is no c > 0 such that (30) holds, and this (by using (29)) implies that
stant, then all numbers L' > L are Lipschitz constants as well. Let f be a Lipschitz
function on [a,b] and let
Then, in practice, we often know only some L' > L. Assuming this and applying
Piyavskii's algorithm with L' > L instead of L, Proposition XI.2 can also be derived
(33)
(34)
where 'Y = ~ (1 + t,) < 1. This establishes the exhaustiveness of the subdivision pro-
cedure.
2.2. Branch and Bound Methods for Minjmjzjng a Lipschitz Function over an
n-dimensional Rectangle
Now let the feasible set D be an n-dimensional interval, i.e., there are vectors
(35)
where the inequalities are understood with respect to the componentwise ordering of
IRn . Let the objective function f be Lipschitzian on D with Lipschitz constant L and
minimize f( x) . (36)
xED
As shown in Section XI. 1.2, Piyavskii's univariate algorithm can easily be formu-
lated for the case of problem (36), and one obtains a corresponding convergence re-
Tuy (1987)). Since, however, the computational effort in solving the corresponding
subproblems is enormous in dimension n ~ 2, we prefer to present branch and bound
622
Step 0 (Initialization):
Go to Step 1.
subset of MO = D which is still of interest, and for every M e .Je' k-1 we have
bounds ß(M), a(M) satisfying
Moreover, we have the current lower and upper bounds ~-1' Qk-1 satisfying
ß(M) ~ Qk-1 .
623
k.2. Select
satisfying
1 f (aM) - f( b M)
xM=2(aM+bM)+2L'lIbM -aM 11 (bM-aM), (40)
and subdivide M into two n-dimensional subintervals using the hyperplane which
k k
Let .Jt be the collection of new partition elements. For each M' E .Jt denote by
Proof. Proposition XI.3 readily follows from Theorem IV.3, Corollary IV.3 and
Proposition IV.3 if the subdivision procedure is exhaustive. But from (33), (34), we
{aM,b M} of the vertices of a partition set M, the lower and upper bounds may be
determined by
and
procedure as long as one of these bounds or any uniformly better bound is used.
(ii) In practical computation, an inter val M will not be subdivided further when its
diameter IIbM - aMIl is less than a fixed parameter 6 > O.
(iii) Whenever adaptive estimates L(M) of the Lipschitz constant of f over current
intervals are available, then, of course, these should be used instead of L'.
625
because we have /lk - ßk < E or IIb M - aMIl < 6 for all sets M which are still of
interest, then, of course, a local search starting from xk could lead to an improved
approximation of a global minimum.
(1986 and 1986a). This approach uses typical branch and bound elements, such as
partitions of the feasible n-interval D into finite sets of n-intervals and refinement
by sub division of selected partition elements. The selection of partition elements is
governed by a selector function that takes into account the size of a given partition
interval as well as the objective function values at its vertices. Each selected
best feasible point obtained so far. Moreover, Pinter's method does not make use of
lower bounds, and hence it does not provide estimates of the quality of a current
iteration point or adeletion rule to remove partition elements not of interest.
In Horst and Tuy (1987) it was shown, however, that Pinter's approach can
readily be modified, improved and generalized by viewing it within the framework of
branch and bound methods discussed in Chapter IV. A simplified and slightly gener-
alized version of the presentation in Horst and Tuy (1987) follows.
(i) Upper and lower bounds are determined on the complete vertex set V(M) of a
Following Pinter (1986 and 1986(a)), the vertex set V(M) can be described by an
(nIC2n) matrix X(M) whose columns are the lexicographically ordered vertices of M.
626
(41')
and
(ü) Rule (39) in Step k.2 of Algorithm XIA, which selects the n-intervals to be sub-
divided further is replaced by the following procedure.
R.2. R(X(M), z(M)) is continuous in (X(M), z(M)) and, for every decreasing
sequence {Mq} of n-intervals Mq C D, the limit ~: R(X(M,J, z(M,J) emts and
holtis. In (/.9), Xis the (nx~n)-matriz having ~n identieal eolumns Z.E IR n, andz is
the veetor o! ~n identieal eomponents !(x).
L' > L is used, where L denotes the infimum taken over al1 Lipschitz constants of f
on D = [a,b]. Then exhaustiveness of the subdivision (26) has been demonstrated in
Section XI.2.l., Le., the requirement R.l. holds.
Let
47)
where M = {x E IR: x 1(M) ~ x ~ x 2(M)} and zl(M) = f(i(M», z2(M) = f(x 2(M».
Note that (-R(x(M), z(M» describes Piyavskii's lower bound (cf. (22), where
The function R(X(M), z(M» defined by (47) is obviously continuous in its argu-
ments xi(M), zi(M) (i=1,2), and from the continuity of f we see that requirement
R.2 is satisfied.
Requirements R.3 and RA obviously hold.
Finally, in order to verify R.5, let xE M, z= f(X). Then
-;:) - 1(-;:)
R(X,z,=-z=-2 z+z,
Proposition XI.4. Suppose that in the above branch and bound interpretation 01
Pinfßrs method the requirements R.t - R.5 are satisfied. Then, il the algorithm is
infinite, we have
proof. We show that the assumptions of Theorem IV.2 and Corollary IV.2 are
satisfied.
lim 6(M k ) =0 ,
q. . . 1II q
where 6(M k ) denotes the diameter IIbM - a M 11 of Mk . Using (41'), (42') and
q kq kq q
i.e., there is a kO E IN such that M is not subdivided furt her if k > kO . We have to
show that inf f(M n D) = inf f(M) ~ a where a = I im ak .
k..... CD
Let M be represented by the nx2 n matrix X = X(M) of its vertices, and let z = z(M)
V(M q ) ofMq .
(48)
R(X,i) = lim R(X ,zq) ~ lim R(X,z) = R(X,z) > R(X, z) (49)
q-lCll q q-lCll
Now consider the sequence of points ~ E Mq satisfying f(~) = a(M q). Since
Recall that z}M) = f(x-i(M)) (j=1, ... ,2 n ). The function R 1 depends only on the
n
"lower left" vertex xl and the "upper right" vertex x 2 , whereas ~ takes into
631
It is easily seen that under the above assumptions the requirements R.2, R.3, RA
are satisfied.
In order to meet requirement R.5, suppose in addition that R2 is Lipschitzian
with Lipschitz constant L(R2) and that
(53)
By (53), the inequality R(X, Z) < R(X(M), z(M)) of R.5 then fo11ows from the fol-
lowing chain of relations:
n 2n 1 n 2n 1
< L(lL) . L· E (x. (M) -x.(M)) S R1 (E (x. (M) -x.(M)).
-~ i=1 1 1 i=1 1 1
is given by
n 2n 1
R 1 =c. E (x. (M)-x.(M)),(CEIR+,c>L),
i=1 1 1
and
632
where Po is a lower bound for min f(D). Then L(~)<l and RI(y) = cy~L(~).L.y
Vy E IR+ •
2.3. Branch and Bound Methods for Solving Lipschitz Opümization Problems with
General Constraints
and the corresponding branch and bound procedures that were already treated in
Chapter IV.
Let the compact feasible set D and the objective function f: IRn -I IR belong to one of
Feasible set D:
Objecüve Function f:
(f2) - concave,
633
(f3) - d.c.,
(f4) - Lipschitzian.
We expect the reader to be able to formulate a convergent branch and bound pro-
cedure following the lines of the discussion in Chapter IV for each of the resulting
problem classes (cf. also Horst (1988 and 1989)). More efficient approaches can be
constructed for the important case of linearly constrained Lipschitz and d.c. prob-
lems (cf. Horst, Nast and Thoai (1995), Horst and Nast (1996). These will be dis-
cussed in Section 2.5.
It is also recalled from Section 1.4.2 that broad classes of systems of eqv.alities and
(1988)).
634
Constraints
The bounds provided by the general methods referred to in the preceding section
can often be improved for problems with additional structure. In this section as an
minimize f(x)
where Pik' 'ljk ' rik , mk , mk (i=l, ... ,nj k=l, ... ,m) are given real numbers, and f(x)
is areal valued concave function defined on an open convex set containing the rect-
n;::) T - - - T
angle R:= {x E IR : m ~ x ~ ml' where m = (!!!l' ... ,mn ) ,m = (ml' ... ,mn ) .
Note that several other problems of importance can be included under problem (55).
For example, the problem of globally minimizing a separable possibly indefinite
quadratic form subject to separable quadratic constraints, i.e., the problem
(56)
s.t. xeD,
minimize t (56')
s.t. xeD,fO(x)~t
berg games, cf. Al-Khayyal et al. (1991), Vincente and Calamai (1995), and from
vertex minima (cf. Section IV.4.5, Example IV.2) and the "Lipschitzian" deletion-
(57)
where
Algorithm XI.5.
Step 0 (Initialization):
Step r.
Step r = 1,2,... :
At the beginning of Step r we have the eurrent reet angular partition .Jt r-I of a
subset of MO still under eonsideration. Furthermore, for every M E .Jt r-I we have
Moreover, we have the eurrent lower and upper bounds ßr-I' 11r-I satisfying
Finally, if Ilr _ I < IIJ, then we have a point xr- I E D satisfying f(xr - I ) = Ilr- I
(the best feasible point obtained so far).
normal subdivision yielding reet angular partitions). Let .9~ be the eolleetion of all
r.3. Remove any M E .9~ for which there is an i E {l, ... ,m} satisfying
where Li(M) is a Lipschitz constant for gi over M given as in (57), and where
6(M) is the diameter of M (rule (DR3)i note that V(M) can be replaced by any
nonempty subset V'(M) of V(M».
and every accumulation point of the sequence {xr} solves problem (55).
If not enough feasible points can be obtained such that SM :f: 0 for al1 partition ele-
ments M, then, as discussed in Chapter IV, one may consider the iteration sequence
{ir} defined by f(i r ) = ßr . Although i r is not necessarily feasible for problem (49),
we know that every accumulation point of {ir} solves problem (55).
638
The general algorithm uses only simple calculations to make decisions on parti-
tioning, deleting and bounding. However, the lower bounds ß(M) = min f(V(M» are
weak, and a closer examination leads to procedures that allow improved bounding,
at the expense of having to solve additional subproblems. Moreover, a mechanism
needs to he devised for identifying points in SM C M n D. Following AI-Khayyal,
Horst and Pardalos (1992), we next attempt to obtain hetter bounds than ß(M)
using only linear programming calculations. In the process, at times, we will also be
able to identify when M n D = 0 or possibly uncover feasible points of Mn D # 0 for
inclusion in SM .
Let
n
G = {x: ~(x) = E ~k(xk) ~ 0 (i=l, ... ,m)},
k=l
where ~k(xk) = ! Pikx: + qikxk + rik (i=l, ... ,mj k=l, ... ,n).
Note that for each partition element M we have M n D = M n G since MO ( R and
D = Rn G.
n
!PM eJx) = E !PM g (xk) ~ 2.(X) , (53)
'vi k=l k' ik UJ.
where
Each gik(xk) can be linearized according to one of the following three cases:
639
where
Case 3: Pik> 0, Then gik is convex, Compute xk = - qik/Pik which minimizes ~k'
If gik(xk) ~ 0, then replace gik(xk ) by the constant ~k(xk) in the constraint
gi(x) $ 0, (This effectively enlarges the region which is feasible for that con-
straint,) Otherwise, continue,
(1) 1 2 1/2
P'k
1
= -Pik [- q'k
1
+ (q'k
1
- 2p'k
1 1
r 'k) ],
(p~~) is the zero of gik to the right of xk)' Replace gik(xk) by the linear support
ofits graph at pf~), This is given by
Case 3b: If xk > b k ' then replace gik(xk) by the linear support of its graph at
(2) 1 2 1/2]
P'k = -Pik [- q'k - (q'k - 2p'k r 'k) ;
1 1 1 1 1
namely,
640
where O!ik = PikPf~) + qik < 0 and ßik = - O!ikPf~) (pf~) is the zero of gik to the
left of xk)'
Let t i be the number of terms in ~(x) that fall into Case 3e. That is, t i is the
Setting
gik(Xk) if Case 1
(0)
lik (xk) if Case 2
we have
defined by
641
li~1)(xk ) $ zik (k E Ki ) ,
li~2)(xk) $ zik (k E K i ) ,
Performing the above linearization for every constraint i, we let L(G) denote the
resulting polyhedral set, and we let Lx(G) denote its projection onto the n-dimen-
sional space of x. It is clear that
It is also clear that, if (x, Z) minimizes f(x) subject to (x,z) E, L(G) and x E M,
minimize f(x)
s .t. (x,z) E L(G), x ER (60)
minimize f(x)
(61)
S.t. (x,z) E L(G),xEM
would lead to a better lower bound than min f(V(M)). However, since fis concave
(and is not assumed to have any additional exploitable structure), too much effort is
(i,i) E L(G), i E M satisfying min f(V(M)) ~ f(i) is found. Ifi E G, then SM:f 0.
For problem (56) in the form (56'), however, problems (60) and (61) are linear pro-
n 1 2 n
~(x) = k!l (2 Pikxk + qikxk + rik ) = k!l (xkYik + rik ) ,
1
where Yik = 2 Pikxk + qik (i=l, ... ,mj k=l, ... ,n).
From Theorem IV.8 and Proposition X.20 we also have that the convex envelope of the separable function g_i over the corresponding rectangle is the sum

φ_{Ω_i,g_i}(x) = Σ_{k=1}^n [φ_{Ω_ik}(x_k y_ik) + r_ik],

where φ_{Ω_ik}(x_k y_ik) denotes the convex envelope of x_k y_ik over the set Ω_ik. Let y_i = (y_i1,...,y_in)^T. Since φ_{Ω_i,g_i}(x) is piecewise linear and convex, the set defined by

φ_{Ω_i,g_i}(x) ≤ 0, i.e., φ_{Ω_ik}(x_k y_ik) + r_ik ≤ z_ik (k=1,...,n), Σ_{k=1}^n z_ik ≤ 0,

is polyhedral in the variables (x, y_i, z).
This is done for each constraint i=1,...,m to yield a polyhedral set P(G) whose projection onto ℝⁿ (the space of the x-variables), denoted by P_x(G), contains the convex hull of G; that is, we have conv(G) ⊂ P_x(G). Hence, the discussion at the end of the preceding subsection ("Linearization of the constraints") can now be followed with L(G) replaced by P(G) and the vector (x,z) replaced by (x,y,z).
The resulting problems can then be treated by the methods of mixed integer mathematical programming. Recently this approach has been studied in the context of separable (quadratic) concave problems subject to linear constraints, where a detailed discussion can be found (e.g., Rosen and Pardalos (1986), Pardalos and Rosen (1987)). For a single term g on an interval [a,b], choose r ∈ ℕ, set

h = (b − a)/r,

and determine the piecewise linear function l(x) that linearly interpolates g(x) at the grid points

x_j = a + jh (j = 0,1,...,r).

This interpolant can be expressed with the help of zero-one variables in the form (63), (64). The last constraints in (63) imply that the vector z = (z_1,...,z_{r−1}) of zero-one variables must have the form (1,1,...,1,0,0,...,0), i.e., whenever z_j = 0 for some j, one has z_{j+1} = ... = z_{r−1} = 0. Hence, z takes only r possible values, instead of the 2^{r−1} values as in the case of a general zero-one vector with r−1 components. Replacing each term g_ik in the constraints of (55) by its corresponding piecewise linear approximation in the form (63), (64), one obtains an approximation for G whose accuracy is governed by the interpolation error.
Recent branch and bound approaches for general indefinite quadratic constraints are given in Sherali and Alameddine (1992), Al-Khayyal et al. (1995), and Phong et al. (1995).
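For illustration, the interpolation step is easily coded; the sketch below is ours and does not reproduce the mixed-integer form (63), (64), only the interpolant itself.

    # Sketch (ours): piecewise linear interpolant of g on [a, b] with r pieces.
    def interpolant(g, a, b, r):
        h = (b - a) / r
        grid = [a + j * h for j in range(r + 1)]   # grid points x_j = a + j*h
        vals = [g(x) for x in grid]
        def l(x):
            j = min(int((x - a) // h), r - 1)      # subinterval containing x
            lam = (x - grid[j]) / h
            return (1.0 - lam) * vals[j] + lam * vals[j + 1]
        return l

    l = interpolant(lambda x: x * x, 0.0, 2.0, 4)
    print(l(0.75), 0.75 ** 2)   # 0.625 vs. 0.5625: the interpolation error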
Minorants
A common property of Lipschitz functions, d.c. functions and some other function classes of interest in global optimization is that, at every point of the domain, one can construct a concave function which coincides with the given function at this point and underestimates the function on the whole domain (concave minorant, cf. Khamisov (1995)). We present a new branch and bound algorithm for minimizing such a function over a polytope which, when specialized to Lipschitz or d.c. functions, yields improved lower bounds as compared to the bounds discussed in the preceding sections. We also show that these bounds can be improved further when the algorithm is applied to solve systems of inequalities. Our presentation is based on Horst and Nast (1996) and Horst, Nast and Thoai (1995), where additional details and a report on implementations can be found.

Definition XI.1. Let S ⊂ ℝⁿ be nonempty. A function f: S → ℝ is said to have a concave minorant on S if, for every y ∈ S, there exists a function F_y: S → ℝ satisfying

(i) F_y is concave on S,
(ii) F_y(x) ≤ f(x) for all x ∈ S, (66)
(iii) F_y(y) = f(y). (67)

The functions F_y(x) are called concave minorants of f(x) (at y ∈ S), and the class of functions having a concave minorant on S will be denoted by CM(S).
Example XI.3. Let f(x) = p(x) − q(x), where p and q are convex functions on ℝⁿ. Then it is well known that p is subdifferentiable at every y ∈ ℝⁿ, i.e., the set ∂p(y) of subgradients of p at y is nonempty, and by definition p(x) ≥ p(y) + s(y)(x − y) for every s(y) ∈ ∂p(y). Hence

F_y(x) = p(y) + s(y)(x − y) − q(x)

is a concave minorant of f at y.

Example XI.4. Recall the class of ρ-convex functions (cf. Vial (1983)). For ρ > 0 one obtains the class of strongly convex functions, ρ = 0 characterizes a convex function, and ρ-convex functions with ρ < 0 are called weakly convex. From (70) we see that weakly convex functions are in CM(ℝⁿ) with concave minorants of the form

F_y(x) = f(y) + s(y)(x − y) + (ρ/2)‖x − y‖² (71)

(where ‖·‖ denotes the Euclidean norm).

Example XI.5. Let f be Lipschitz continuous on S with Lipschitz constant L, i.e.,

|f(x) − f(y)| ≤ L‖x − y‖ for all x,y ∈ S. (72)

It follows from (72) that, for all x,y ∈ S, we have f(x) ≥ f(y) − L‖x − y‖, so that F_y(x) = f(y) − L‖x − y‖ is a concave minorant of f at y.
In order to ensure convergence of the algorithm given below one needs continuous convergence in the sense of the following lemma.

Lemma XI.1. Let {x_k} and {y_k} be sequences in S such that lim_{k→∞} x_k = lim_{k→∞} y_k = s ∈ S. Then, for each of the concave minorants given in Examples XI.3 - XI.5, we have

lim_{k→∞} F_{y_k}(x_k) = f(s).
Proof. First, notice that each of the three types of functions considered in the above examples is continuous on S. This follows in Example XI.3 from continuity of convex functions on open sets (since S = ℝⁿ) and is trivial in Example XI.5. Since ρ-convex functions are not treated in detail in this monograph, we refer to Vial (1983) for Example XI.4. Let B(s) be a compact ball centered at s. Then the assertion follows for Example XI.3 from boundedness of {∂p(y): y ∈ B(s)} (cf., e.g., Rockafellar (1970)) and continuity of p and q. For Example XI.4, the property of Lemma XI.1 follows in a similar way, since s(y) is a subgradient of a convex function. ∎

We now consider the problem

minimize f(x) s.t. x ∈ D, (75)

where D is a polytope in ℝⁿ with nonempty interior, and f ∈ CM(S) for some n-simplex S ⊇ D.

A lower bound for f over the intersection of an n-simplex S with the feasible set is obtained by minimizing the maximum of the convex envelopes φ_y(x) of the concave minorants F_y(x), taken at a finite set T ⊂ S. Recall from Theorem IV.7 that, for each y ∈ T, the convex envelope φ_y of F_y over S is precisely that affine function which coincides with F_y at the vertices of S.
Proposition XI.5. Let S = [v_0,...,v_n] be an n-simplex with vertices v_0,...,v_n, let D be a polytope in ℝⁿ, let T be a nonempty finite set of points in S, and let f ∈ CM(S) with concave minorants F_y. For each y ∈ T, let φ_y denote that affine function which is uniquely defined by the system of linear equations

φ_y(v_i) = F_y(v_i) (i = 0,...,n). (76)

Then the optimal value β(S∩D) of the linear program

minimize t
s.t. φ_y(x) ≤ t (y ∈ T), x ∈ S ∩ D (77)

satisfies β(S∩D) ≤ min f(S∩D).

Note that, expressing each x ∈ S in barycentric coordinates,

x = Σ_{i=0}^n λ_i v_i, Σ_{i=0}^n λ_i = 1, λ_i ≥ 0 (i = 0,...,n),

one has

φ_y(x) = Σ_{i=0}^n λ_i F_y(v_i).
As usual, we set β(S∩D) = +∞ when S∩D = ∅. When in (76), (77) S∩D ≠ ∅, we obtain a set Q(S) of feasible points in S while solving (76), (77). The construction of a tight initial simplex S ⊇ D is known from previous chapters.
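For illustration, the system (76) can be solved directly; the following sketch is ours and assumes numpy and a concave minorant F_y given as a callable (for instance the Lipschitz minorant F_y(x) = f(y) − L‖x−y‖ of Example XI.5).

    # Sketch (ours): the affine convex envelope phi_y of F_y over the simplex S,
    # obtained from the (n+1) x (n+1) linear system (76).
    import numpy as np

    def affine_envelope(vertices, Fy):
        # vertices: the n+1 vertices v_0, ..., v_n of S; Fy: callable minorant
        V = np.array([list(v) + [1.0] for v in vertices])   # rows (v_i, 1)
        rhs = np.array([Fy(np.asarray(v)) for v in vertices])
        c = np.linalg.solve(V, rhs)                         # phi_y(x) = c[:n].x + c[n]
        return lambda x: float(np.dot(c[:-1], x) + c[-1])

The bound (77) is then the optimal value of the linear program "minimize t subject to φ_y(x) ≤ t (y ∈ T), x ∈ S ∩ D" in the variables (x,t).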
Algorithm XI.6.

Initialization:
Determine an initial n-simplex S ⊇ D, the lower bound β(S∩D), and the set Q(S).

Iteration k:
If the current best objective function value α_k equals the smallest lower bound, then stop. Otherwise, choose a simplex S to be subdivided. Bisect S into the simplices S₁ and S₂, compute β(S_i ∩ D), i = 1,2, and set

ℳ = ℳ \ {S: β(S∩D) ≥ α_k}.
Proposition XI.6. In Problem (75), let f ∈ CM(S) be continuous on the initial simplex S. Moreover, for each pair of sequences {x_k}, {y_k} ⊂ S such that lim_{k→∞} x_k = lim_{k→∞} y_k = s, assume that lim_{k→∞} F_{y_k}(x_k) = f(s). Then, if the algorithm does not terminate after a finite number of iterations, we have

lim_{k→∞} β_k = min f(D),

and every accumulation point z* of the sequence {z_k} is an optimal solution of Problem (75).
Proof. Let z* be an accumulation point of the sequence {z_k} ⊂ D, and let β* = lim_{k→∞} β_k. One deduces by a standard argument (see Chapter IV) that the corresponding subsequence must contain an infinite nested sequence of simplices {S_{k_q}} with lim_{q→∞} S_{k_q} = {s}. The affine function φ_y on a simplex S attains its minimum at a vertex of S, where φ_y coincides with F_y. In view of (77) it follows that

β_{k_q} ≥ F_{y_{k_q}}(v(y_{k_q})),

where v(y) is the vertex of S at which min_{x∈S} φ_y(x) is attained, for an arbitrary y_{k_q} ∈ T_{k_q} and all q. Since lim_{q→∞} S_{k_q} = {s}, we must have lim_{q→∞} y_{k_q} = lim_{q→∞} v(y_{k_q}) = s, and hence, using the continuous convergence assumption, β* = lim_{q→∞} β_{k_q} ≥ f(s). Therefore, the assertion follows from (83). ∎
Systems of CM-Inequalities

Let D ⊂ ℝⁿ be a polytope with nonempty interior, and let f_i ∈ CM(S) be continuous (i=1,...,m); consider the system

f_i(x) ≤ 0 (i=1,...,m), x ∈ D. (84)

It follows from Definition XI.1 that f(x) = max{f_i(x): i = 1,...,m} ∈ CM(S), so that the system (84) of inequalities can be investigated by applying the above algorithm to the optimization problem (75) until a point x* ∈ D satisfying f(x*) ≤ 0 is detected or the optimal value of (75) is found to be positive (indicating that the system (84) has no solution in D, cf. Section 4.2). A straightforward application of Proposition XI.6 would lead to the bound β₁(S ∩ D), where φ_y(x) is the convex envelope of F_y(x), F_y(x) being the concave minorant of one of the functions f_j satisfying f_j(y) = max{f_i(y): i = 1,...,m}. This bound can certainly be improved by considering β₂(S ∩ D), where φ_y^i is the convex envelope of the concave minorant F_y^i of f_i, i = 1,...,m. Further improvement results from the well-known observation that a max-min operation always leads to a value no larger than the corresponding min-max operation. Notice that β₂(S ∩ D) is the optimal objective function value of a linear program of the form (77): minimize t subject to the constraints built from the envelopes φ_y^i.
3. OUTER APPROXIMATION
Consider the problem

minimize f(x) s.t. g_i(x) ≤ 0 (i=1,...,m), (89)

where f, g_i: ℝⁿ → ℝ are Lipschitz functions (i=1,...,m). Suppose that the feasible set D = {x: g_i(x) ≤ 0 (i=1,...,m)} is nonempty and compact, and suppose that a real number r > 0 is known which satisfies

D ⊂ {x: ‖x‖ ≤ r}, (91)

i.e., we know a ball of radius r containing D (‖·‖ denotes the Euclidean norm). A natural idea is to apply one of the outer approximation methods discussed in Chapter II. Recall from Section II.1 that in each step of an outer approximation method one has to solve a relaxed problem

(Q_k) minimize f(x) s.t. x ∈ D_k, (93)

where D_k ⊃ D. A direct application of this scheme to (89) is not promising, because the subproblems (Q_k) are still Lipschitz optimization problems, and, moreover, it will be difficult to find suitable functions l_k such that {x: l_k(x) = 0} separates x^k from D in the sense of (94), (95). Convexity does not seem to be present in problem (89), and this makes it difficult to apply outer approximation methods.
Therefore, we shall first transform problem (89) into an equivalent program where convexity is present. Specifically, problem (89) will be converted into a problem of globally minimizing a concave (in fact, even linear) function subject to a convex and a reverse convex constraint. Our presentation follows Thach and Tuy (1987).

First we note that in (89) one may always assume that the objective function f(x) is linear, since otherwise one could instead consider the equivalent problem

minimize t
s.t. f(x) ≤ t, g_i(x) ≤ 0 (i=1,...,m),

which involves the additional variable t and the additional constraint f(x) ≤ t.
Consider the hemisphere

S = {u ∈ ℝ^{n+1}: ‖u‖² = Σ_{i=1}^{n+1} u_i² = r², u_{n+1} ≥ 0},

whose projection onto ℝⁿ is just the ball B := {x ∈ ℝⁿ: ‖x‖ ≤ r} introduced above. Using the projection π: ℝ^{n+1} → ℝⁿ defined by u = (u_1,...,u_{n+1}) → π(u) = (u_1,...,u_n), we can establish an obvious homeomorphism between the hemisphere S in ℝ^{n+1} and the ball B in ℝⁿ.
Let φ(u) := f(π(u)) and φ_i(u) := g_i(π(u)) (i=1,...,m), and consider the problem

(P_S) minimize φ(u)
s.t. φ_i(u) ≤ 0 (i=1,...,m), (97)
‖u‖ = r, u_{n+1} ≥ 0.
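For illustration (notation ours), the correspondence between B and S is the elementary lift shown below.

    import math

    def lift(x, r):        # x in the ball B of radius r -> u on the hemisphere S
        s = r * r - sum(t * t for t in x)
        assert s >= 0.0, "x must lie in the ball of radius r"
        return tuple(x) + (math.sqrt(s),)   # u_{n+1} = sqrt(r^2 - ||x||^2)

    def project(u):        # pi(u) = (u_1, ..., u_n)
        return u[:-1]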
At first glance, problem (P_S) seems to be more complicated than the original problem (P). However, the following proposition shows that the feasible set of (97) has a useful d.c. structure.

Proposition XI.7. The feasible set C of problem (P_S) is a difference of two convex sets:

C = C₁ \ G, (99)

where C₁ is the (closed) convex hull of C and G is the (open) ball of radius r in ℝ^{n+1}, i.e., G = {u ∈ ℝ^{n+1}: ‖u‖ < r}.

Note that, to form the convex hull of a closed subset of a sphere in ℝ^{n+1} (with respect to the Euclidean norm), one only needs to add to this set strictly interior points of the ball. Given a point u⁰ on the sphere outside such a convex set M, a hyperplane which strictly separates u⁰ from M is easily obtained: take the hyperplane which supports the ball ‖u‖ ≤ r at u⁰, and move it parallel to itself a sufficiently small distance towards the interior of the ball. In particular, a separating hyperplane can easily be constructed for M = C,
where the functions φ_i are Lipschitzian with Lipschitz constants L_i.

Proposition XI.8. Let

h(u) := ½ max_{i=1,...,m} [φ_i⁺(u)]² / L_i², (101)

where φ_i⁺(u) = max {0, φ_i(u)}. If u⁰ ∈ S \ C, then the affine function

l(u) := ⟨u⁰, u⟩ − r² + h(u⁰) (102)

satisfies l(u⁰) > 0 and l(u) ≤ 0 for all u ∈ C.

Proof. Since u⁰ ∉ C, we have φ_i(u⁰) > 0 for at least one i ∈ {1,...,m}. Therefore, h(u⁰) > 0, and since ‖u⁰‖ = r, it follows that l(u⁰) = ‖u⁰‖² − r² + h(u⁰) = h(u⁰) > 0. Moreover,

h(u⁰) = ½ max_{i=1,...,m} [φ_i⁺(u⁰)]²/L_i² = ½ [φ_{i*}⁺(u⁰)]²/L_{i*}² = ½ [φ_{i*}(u⁰)]²/L_{i*}²,

where i* denotes a maximizing index. Since φ_{i*} is Lipschitz with constant L_{i*}, every u ∈ C satisfies φ_{i*}(u⁰) ≤ φ_{i*}(u) + L_{i*}‖u − u⁰‖ ≤ L_{i*}‖u − u⁰‖, and since ‖u − u⁰‖² = 2r² − 2⟨u⁰,u⟩ for u, u⁰ on the sphere, it follows that

⟨u⁰, u⟩ ≤ r² − h(u⁰) for all u ∈ C, (104)

i.e., l(u) ≤ 0 on C. ∎
Consider now the relaxed problem

minimize φ(u) s.t. u ∈ P, ‖u‖ ≥ r, (105)

where P is a polytope in ℝ^{n+1} containing C and φ(u) is a concave function on ℝ^{n+1}. If this problem is solvable, then there is always an optimal solution of the problem which is a vertex of P or lies on the intersection of the surface ‖u‖ = r with an edge of P.
Proof. For every w ∈ P, let F_w denote the face of P containing w of smallest dimension.

Suppose that problem (105) has an optimal solution in the region ‖u‖ > r, and let w be such an optimal solution with minimal dim F_w. We first show that in this case dim F_w = 0, i.e., w must be a vertex of P. Indeed, if dim F_w ≥ 1, then there exists a line whose intersection with F_w ∩ {u: ‖u‖ ≥ r} is a segment [w', w''] containing w in its relative interior. Because of the concavity of φ(u), we must have φ(w') = φ(w'') = φ(w). Moreover, if ‖w'‖ > r, then dim F_{w'} < dim F_w, and this contradicts the minimality of dim F_w. Therefore, ‖w'‖ = r, and similarly ‖w''‖ = r. But then we must have [w', w''] ⊂ {u: ‖u‖ ≤ r}, contradicting the assumption that ‖w‖ > r.

Now consider the case when all of the optimal solutions of (105) lie on the surface ‖u‖ = r, and let w be an optimal solution. If dim F_w > 1, then the tangent hyperplane to the sphere ‖u‖ = r at the point w and F_w ∩ {u: ‖u‖ ≥ r} would have a common line segment [w', w''] which contains w in its relative interior. Because of the concavity of φ, we must have φ(w') = φ(w'') = φ(w), i.e., w' and w'' are also optimal solutions. But since ‖w'‖ > r and ‖w''‖ > r, this contradicts the hypothesis. Therefore, dim F_w ≤ 1, i.e., w lies on an edge of P. ∎
Algorithm XI.7:

Initialization:
Select a polytope D₁ ⊃ C with known vertex set V₁. Compute the set V̄₁ of all points w such that w is either a vertex of D₁ satisfying ‖w‖ ≥ r or else the intersection of an edge of D₁ with the sphere ‖u‖ = r. Set k = 1.

Step 1. Compute

w^k ∈ argmin {φ(w): w ∈ V̄_k} (106)

and the point u^k where the vertical line through w^k meets the hemisphere S. If u^k ∈ C, then stop: u^k is an optimal solution of (P_S). Otherwise go to Step 2.
Step 2. Construct the affine function

l_k(u) := ⟨u^k, u⟩ − r² + h(u^k) (107)

according to (101), (102), and set

D_{k+1} = D_k ∩ {u: l_k(u) ≤ 0}.

Compute the vertex set V_{k+1} of D_{k+1} and the set V̄_{k+1} of all points w such that w is either a vertex of D_{k+1} satisfying ‖w‖ ≥ r or else the intersection of an edge of D_{k+1} with the sphere ‖u‖ = r. Set k ← k+1 and go to Step 1.
Remark XI.2. The sets V_{k+1}, V̄_{k+1} can be obtained from V_k, V̄_k using the methods discussed in Section II.4.2.
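For illustration, one iteration's cut can be coded in a few lines; this sketch is ours and follows the separation construction (101), (102) as reconstructed above, with the constraint functions φ_i and Lipschitz constants L_i assumed given.

    # Sketch (ours): the affine cut l(u) = <u0, u> - r^2 + h(u0) at an
    # infeasible point u0 on the sphere of radius r.
    def cut(u0, r, phis, Ls):
        h = 0.5 * max(max(0.0, phi(u0)) ** 2 / (L * L)   # h(u0) as in (101)
                      for phi, L in zip(phis, Ls))
        return lambda u: sum(a * b for a, b in zip(u0, u)) - r * r + h

By construction l(u⁰) = h(u⁰) > 0, while l(u) ≤ 0 holds for every feasible u, so D_{k+1} = D_k ∩ {u: l(u) ≤ 0} still contains C.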
Remark XI.3. The constraint (107) corresponds to a quadratic cut in the original variables.
The convergence of Algorithm XI.7 follows from the general theory of outer approximation methods that was discussed in Chapter II.

Proof. We refer to Theorem II.1, and we verify that the assumptions of this theorem are satisfied. Since the functions φ_i(u) are Lipschitzian, and hence continuous, we have continuity of l_k(u).
Consider the ball B of radius 1 around (−1,0). Let a = (−1.5,0.5), and for every x define p(x) to be the value of the gauge of B − a at the point x − a. We want to find the global maximum of p(x), or equivalently, the global minimum of f(x) = −p(x) over D. By an easy computation we see that f(a) = 0 and

p(x) = (2x₁² + 2x₂² + 6x₁ − 2x₂ + 5) / [(3x₁² + 3x₂² − 2x₁x₂ + 10x₁ − 6x₂ + 9)^{1/2} + x₁ − x₂ + 2].

The best point x̄ found satisfies p(x̄) = 10.361, max {g₁(x̄), g₂(x̄)} = 0.0061. The intermediary results, taken from Thach and Tuy (1987), are shown in Table XI.1, where N_k denotes the number of vertices of D_k.

Table XI.1.
The conceptual method that follows, which is essentially due to Thach and Tuy (1990), is intended to get beyond local optimality in general global optimization and to provide solution procedures for certain important special problem classes.

Consider the problem

(P) minimize f(x) s.t. x ∈ D,

where f: ℝⁿ → ℝ is continuous and D ⊂ ℝⁿ is compact. Moreover, it will be assumed that

min f(D) = inf {f(x): x ∈ int D}. (108)

Assumption (108) is fulfilled, for example, if D is robust (cf. Definition I.1). The purpose of our development is to associate to f, D and to every α ∈ ℝ a d.c. function φ_α(x) such that x̄ is a global minimizer of f over D if and only if

0 = inf {φ_ᾱ(x): x ∈ ℝⁿ}, where ᾱ = f(x̄). (110)
Let d_A(x) := inf {‖x − y‖: y ∈ A} denote the distance from x ∈ ℝⁿ to a set A ⊂ ℝⁿ (with the usual convention that d_A(x) = +∞ if A is empty), and for α ∈ ℝ set D_α := {x ∈ D: f(x) ≤ α}.

Definition XI.2. A function r: ℝ × ℝⁿ → ℝ₊ is called a separator for f on D if for every α ∈ ℝ:
(i) r(α,x) ≤ ‖x − y‖ for all x ∈ ℝⁿ and all y ∈ D_α;
(ii) r(α,x) > 0 for all x ∉ D_α;
(iii) r(α,x) is nonincreasing in α, i.e., r(α,x) ≥ r(α',x) whenever α ≤ α'.

Note that the notion of a separator for f on D is related to but different from the notions of separation used in the preceding sections.

Example XI.7. The distance function d_{D_α}(x) is a separator for f on D.

Example XI.8. Let D = {x ∈ ℝⁿ: g_i(x) ≤ 0 (i=1,...,m)} with g_i: ℝⁿ → ℝ (i=1,...,m). Suppose that f is (L,μ)-Hölder continuous and g_i is (L_i,ν)-Hölder continuous (i=1,...,m), i.e., for all x,y ∈ ℝⁿ one has

|f(x) − f(y)| ≤ L ‖x − y‖^μ, |g_i(x) − g_i(y)| ≤ L_i ‖x − y‖^ν (i=1,...,m). (112)
Then, for every y with f(y) ≤ α and g_i(y) ≤ 0 (i=1,...,m), (112) yields

‖x − y‖ ≥ max {[|f(x) − f(y)|/L]^{1/μ}, [|g_i(x) − g_i(y)|/L_i]^{1/ν} (i=1,...,m)}.

But since

|f(x) − f(y)| ≥ max {0, f(x) − f(y)} ≥ max {0, f(x) − α}

and

|g_i(x) − g_i(y)| ≥ max {0, g_i(x) − g_i(y)} ≥ max {0, g_i(x)} (i=1,...,m),

we have

‖x − y‖ ≥ r(α,x) := max {[(f(x) − α)⁺/L]^{1/μ}, [(g_i(x))⁺/L_i]^{1/ν} (i=1,...,m)}, (113)

where t⁺ = max {0,t}; hence r(α,x) satisfies condition (i) of Definition XI.2. The conditions (ii) and (iii) in Definition XI.2 obviously hold for the function r(α,x) given in (113).
If"(x)1 SM VxEIR.
0 iff(x) S /l
p(/l x)·= { (114)
,. - ~ If '(x) 1- (I f '(x) 12 + 2M(f(x)_/l))1/2) iff(x) > /l
Then aseparator for fon the ball D = {x E IR: lxi Sc} is given by
Note that the second expression in (90) describes the unique positive solution
Conditions (ii) and (iii) in Definition XI.2 are again obviously satisfied by (91). To
demonstrate (i), it suffices to show that lyl S c and f(y) S /l imply that r(/l,x) S
to show that f(y) S /l implies that p( /l,x) S 1x - y I. This can be seen using Taylor's
formula
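For illustration (assuming the reconstructed formula (114); names ours):

    import math

    # Sketch: rho(alpha, x) underestimates the distance from x to
    # {y : f(y) <= alpha} whenever |f''| <= M on the real line.
    def rho(alpha, x, f, fprime, M):
        if f(x) <= alpha:
            return 0.0
        d = abs(fprime(x))
        # unique positive root t of (M/2)*t**2 + |f'(x)|*t = f(x) - alpha
        return (math.sqrt(d * d + 2.0 * M * (f(x) - alpha)) - d) / M

    # e.g. f(x) = x**2, M = 2: rho(0.0, 3.0, ...) is about 1.243 <= |3 - 0| = 3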
Suppose that a separator r(α,x) for the function f on the set D is available, and for every α ∈ ℝ define the function

h(α,x) = sup {2⟨v,x⟩ − ‖v‖² + r²(α,v): v ∉ D_α} (117)

(with the usual convention that sup ∅ = −∞, inf ∅ = +∞). Clearly, for fixed α, h(α,x) is the pointwise supremum of a family of affine functions of x, and hence is convex (it is a so-called closed convex function, cf. Rockafellar (1970)).

Now consider the d.c. function

φ_α(x) = h(α,x) − ‖x‖². (118)
Lemma XI.2. For every α ∈ ℝ one has φ_α(x) > 0 if x ∉ D_α, and φ_α(x) ≤ 0 if x ∈ D_α. (120)

Proof. If x ∉ D_α, then it follows from Definition XI.2 (ii) that r(α,x) > 0, and hence, by (117),

φ_α(x) = sup {r²(α,v) − ‖x − v‖²: v ∉ D_α} ≥ r²(α,x) > 0, (121)

where the inequality is obtained by choosing v = x.

If f(x) < α, then to each point v ∉ D_α we associate a point z(v) on the intersection of the line segment [x,v] with the boundary ∂D_α of D_α. Such a point z(v) exists, because x ∈ D_α while v ∉ D_α. Since z(v) ∈ [x,v], z(v) ∈ D_α, and because of Definition XI.2 (i), we have r(α,v) ≤ ‖v − z(v)‖ = ‖x − v‖ − ‖x − z(v)‖. It follows that

−‖x − z(v)‖² ≥ r²(α,v) − ‖x − v‖², (123)

and hence

φ_α(x) ≤ −inf {‖x − z(v)‖²: v ∉ D_α} ≤ 0.

Finally, from (121) and (123) it follows that we must have (120) for x ∈ D_α. ∎

Corollary XI.1. If D_α ≠ ∅, then

inf {φ_α(x): x ∈ ℝⁿ} ≤ 0.
Theorem XI.2. Let x̄ ∈ D be a feasible point of problem (P), and let ᾱ = f(x̄). Consider the function φ_ᾱ(x) defined in (117), (118).

(i) If

inf {φ_ᾱ(x): x ∈ ℝⁿ} < 0, (124)

then for any x satisfying φ_ᾱ(x) < 0 we have x ∈ D, f(x) < ᾱ (i.e., x is a better feasible point than x̄).

(ii) If x̄ is an optimal solution of (P), then

inf {φ_ᾱ(x): x ∈ ℝⁿ} = 0. (125)

(iii) If the regularity condition (109) is fulfilled and ᾱ satisfies (125), then x̄ is an optimal solution of (P).

Proof. Because of Lemma XI.2, every x ∈ ℝⁿ satisfying φ_ᾱ(x) < 0 must be in D_ᾱ. This proves (i) by the definition of D_ᾱ.

Using the assertion (i) just proved and Corollary XI.1, we see that condition (125) is necessary for optimality of x̄, which proves (ii).

In order to prove (iii), suppose that x̄ satisfies (125) but is not a globally optimal solution of (P). Then, using the regularity condition (109), we see that there exists a point x' ∈ int D satisfying f(x') < ᾱ. Because of the continuity of f it follows that x' ∈ int D_ᾱ. By Lemma XI.2 this implies that φ_ᾱ(x') < 0, contradicting (125). ∎
Corollary XI.2. If ᾱ = min f(D) is the optimal objective function value of problem (P), then ᾱ satisfies (125), and every optimal solution of (P) is a global minimizer of φ_ᾱ(x) over ℝⁿ.

Conversely, if the regularity condition (109) is fulfilled and ᾱ satisfies (125), then ᾱ is the optimal objective function value of problem (P), and every global minimizer of φ_ᾱ(x) over ℝⁿ is an optimal solution of (P).

Proof. The first assertion follows from Corollary XI.1 and Theorem XI.2.

In order to prove the second assertion, assume that the regularity condition (109) is fulfilled. Let ᾱ satisfy (125) and let x̃ be a global minimizer of φ_ᾱ(x) over ℝⁿ. Then φ_ᾱ(x̃) = 0, and from Lemma XI.2 we have x̃ ∈ D, f(x̃) ≤ ᾱ.

Let α̃ = f(x̃). Since α̃ ≤ ᾱ, it follows from Definition XI.2 (iii) that r(ᾱ,x) ≤ r(α̃,x), and hence

φ_ᾱ(x) ≤ φ_α̃(x) for all x ∈ ℝⁿ.

Using the first part of Corollary XI.1, we deduce that inf {φ_α̃(x): x ∈ ℝⁿ} = 0. Therefore, by Theorem XI.2 (iii) we conclude that x̃ is an optimal solution of problem (P).

Furthermore, it follows from the regularity condition (109) that we cannot have α̃ < ᾱ, because in that case there would exist a point x' ∈ int D satisfying f(x') < α̃. But, by Lemma XI.2, this would imply that φ_α̃(x') = −inf {‖x' − v‖²: v ∉ D_α̃} < 0, which contradicts the fact that α̃ satisfies (125). Therefore, we must have α̃ = ᾱ = min f(D). ∎
The properties of the function φ_α(x) presented in the preceding section suggest an iterative scheme for solving problem (P). Following Thach and Tuy (1990), we shall call φ_α(x) a relief indicator function for f on D. (A slightly more general notion of relief indicator is discussed in Thach and Tuy (1989).)

This suggests replacing problem (P) by the parametric unconstrained d.c. minimization problem

(P_k) minimize φ_{α_k}(x) s.t. x ∈ ℝⁿ,

where α_k = f(x^k). Let x^{k+1} denote an optimal solution of (P_k). If φ_{α_k}(x^{k+1}) = 0, then stop: x^k is an optimal solution of (P). Otherwise, go to iteration k+1 with α_{k+1} = f(x^{k+1}).
Proposition XI.11. Let problem (P) be regular in the sense of (109). If the above iterative procedure is infinite, then every accumulation point x̄ of the sequence {x^k} is an optimal solution of (P).

Proof. We first note that problem (P₁) has an optimal solution x² with φ_{α₁}(x²) ≤ 0. To see this, recall from Corollary XI.1 that inf {φ_{α₁}(x): x ∈ ℝⁿ} ≤ 0 since D_{α₁} ≠ ∅ (because x¹ ∈ D). Moreover, from Lemma XI.2 we know that φ_{α₁}(x) > 0 if x ∉ D_{α₁}. It follows that

inf {φ_{α₁}(x): x ∈ ℝⁿ} = inf {φ_{α₁}(x): x ∈ D_{α₁}}.

But φ_{α₁}(x) is lower semicontinuous, since h(α,x) is lower semicontinuous (cf. Rockafellar (1970)), and hence the infimum over the compact set D_{α₁} is attained.
The implementation of this scheme raises two issues. The first matter concerns the solution of the subproblems (P_k). These are replaced by relaxed problems (Q_i) in which the supremum in (117) is taken only over the previously generated points. Let

α_i = min {f(x^j): x^j ∈ D, j ≤ i}, (130)

where α_i is the smallest value of f(x) at all feasible points evaluated until iteration i, and where x^{i+1} (i ≥ 1) is an optimal solution of problem (Q_i).

By the definition of the α_i in (130), we must have α_i ≤ f(x^i) whenever x^i ∈ D. It follows that x^i ∉ D_{α_i}, and hence x^i ∉ D_{α_k}, for i=1,...,k. Since r(α_i,x) ≤ r(α_k,x) for all x ∈ ℝⁿ, i=1,...,k (Definition XI.2 (iii)), we see from (117) and (130) that for all x

h_k(x) := max {2⟨x^i,x⟩ − ‖x^i‖² + r²(α_i,x^i): i=1,...,k} ≤ h(α_k,x). (131)

With an additional variable t, the problem of minimizing h_k(x) − ‖x‖² can be written as

(Q_k) minimize t − ‖x‖² s.t. 2⟨x^i,x⟩ − ‖x^i‖² + r²(α_i,x^i) ≤ t (i=1,...,k).

In this way we have replaced the unconstrained d.c. problem (P_k) by the linearly constrained concave minimization problem (Q_k). Several finite algorithms to solve problem (Q_k) were discussed in Part B of this book.
Since h_k(x) ≤ h(α_k,x) for all x, it follows from Corollary XI.1 that the optimal objective function value of (Q_k) is nonpositive. Moreover, if this value is zero, then we must have

0 = min {φ_{α_k}(x): x ∈ ℝⁿ},

and it follows from Theorem XI.2 that α_k = min f(D) and every point x̄^k ∈ D satisfying f(x̄^k) = α_k solves (P).

However, if h_k(x^{k+1}) − ‖x^{k+1}‖² < 0, then it is not guaranteed that f(x^{k+1}) < α_k. The second matter is therefore the computation of feasible points: a local optimization procedure applied to (P), starting from x^{k+1}, yields a feasible point x̄^{k+1} satisfying f(x̄^{k+1}) ≤ f(x^{k+1}). In this case we set

α_{k+1} = min {α_k, f(x̄^{k+1})}.

We thus arrive at the following algorithm.

Algorithm XI.8.

Initialization:
Set α₀ = +∞ if no feasible point is known, and set α₀ equal to the minimal value of f at the known feasible points otherwise.

Iteration k = 1,2,...:

k.1.: If x^k ∈ D, then, using a local optimization procedure that starts with x^k, find a point x̄^k ∈ D satisfying f(x̄^k) ≤ f(x^k), and set α_k = min {α_{k−1}, f(x̄^k)}. If x^k ∉ D, then set α_k = α_{k−1}. Denote by x̄^k the best feasible point known so far, i.e., we have f(x̄^k) = α_k.

k.2.: Solve the relaxed problem (Q_k), obtaining an optimal solution (x^{k+1}, t^{k+1}). If t^{k+1} − ‖x^{k+1}‖² = 0, then stop: x̄^k is an optimal solution of (P), and α_k = f(x̄^k) = min f(D). Otherwise (t^{k+1} − ‖x^{k+1}‖² < 0), go to iteration k+1.
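For illustration, the following schematic sketch (entirely ours) mirrors Steps k.1 and k.2; the concave subproblem (Q_k) is solved here by brute force over a finite grid, which merely stands in for the finite algorithms of Part B, and r_sep is a separator in the sense of Definition XI.2.

    # Sketch (ours): relief indicator iteration with a grid as (Q_k)-solver.
    def relief_indicator_min(f, in_D, r_sep, grid, tol=1e-8):
        alpha, best = float("inf"), None
        V = []                                   # previous iterates x^1, ..., x^k
        while True:
            def phi(x):                          # relaxed indicator h_k(x) - ||x||^2
                return max((r_sep(alpha, v) ** 2
                            - sum((a - b) ** 2 for a, b in zip(x, v)) for v in V),
                           default=-1.0)
            x_next = min(grid, key=phi)          # solve the relaxed problem (Q_k)
            if V and phi(x_next) >= -tol:        # optimal value of (Q_k) ~ 0: stop
                return best, alpha               # alpha = min f(D) (cf. Step k.2)
            if in_D(x_next) and f(x_next) < alpha:   # Step k.1: update incumbent
                alpha, best = f(x_next), x_next
            V.append(x_next)                     # one more cut point for (Q_{k+1})

Here in_D, r_sep and grid are hypothetical stand-ins supplied by the user; in the actual method, (Q_k) is the linearly constrained concave minimization problem described above.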
Let D_k denote the polyhedron of points (x,t) feasible for (Q_k), and let V_k denote the vertex set of D_k. Then we know that the concave objective of (Q_k) attains its minimum at a vertex in V_k. Since D_{k+1} = D_k ∩ {(x,t): l_k(x) ≤ t}, we have the same situation as with the outer approximation methods discussed in Chapter II, and V_{k+1} can be determined from V_k by one of the methods presented in Chapter II.
Proposition XI.12. Let problem (P) be regular in the sense of (109), and let the feasible set D of (P) be nonempty. Assume that we have a separator r(α,x) for f on D which is lower semicontinuous on ℝ×ℝⁿ. Then Algorithm XI.8 either terminates after a finite number of iterations with an optimal solution x̄^k of (P), or else it generates an infinite sequence {x^k}, each accumulation point of which is optimal for (P).

Proof. The sequence {α_k} is nonincreasing and bounded from below; therefore, ᾱ := lim_{k→∞} α_k exists. We show that ᾱ = min f(D), and that every accumulation point of {x^k} is an optimal solution.
We use the general convergence results for outer approximation methods (cf. Theorem II.1) in the following form:

Consider the nonempty set

A := {(x,t): x ∈ D_ᾱ, t = ‖x‖²}

and the sequence {(x^k,t^k)} generated by Algorithm XI.8. This sequence is bounded, and every accumulation point (x̄,t̄) belongs to A if the functions l_k(x,t) := l_k(x) − t (k=1,2,...) satisfy the following conditions:

(1) l_k(x^k,t^k) > 0 (k=1,2,...);
(2) l_k(x,t) ≤ 0 ∀(x,t) ∈ A, l_k(x^{k'},t^{k'}) ≤ 0 ∀ k' > k;
(3) every convergent subsequence of iterates admits a limit function l̄ of the corresponding cuts;
(4) l̄(x̄,t̄) = 0 implies (x̄,t̄) ∈ A.
(1): We have

l_k(x^k,t^k) = ‖x^k‖² + r²(α_k,x^k) − t^k ≥ ‖x^k‖² − t^k > 0,

where the first inequality is obvious and the second inequality follows from the assumption that the algorithm is infinite (cf. Step k.2).

(2): For (x,t) ∈ A we have

l_k(x,t) = r²(α_k,x^k) − ‖x^k − x‖² ≤ r²(ᾱ,x^k) − ‖x^k − x‖² ≤ 0.

Here the last two inequalities follow from Definition XI.2 (iii) and (i), respectively. Moreover, l_k(x^{k'},t^{k'}) ≤ 0 for k' > k, since the constraint l_k(x) ≤ t is part of (Q_{k'−1}).

(3): The affine functions l_k(x,t) are lower semicontinuous. Consider a subsequence (x^{k_q},t^{k_q}) satisfying (x^{k_q},t^{k_q}) → (x̄,t̄) (q → ∞). Then there is a subsequence {k_r} ⊂ {k_q} such that

lim_{r→∞} l_{k_r}(x^{k_r},t^{k_r}) = ‖x̄‖² + r²(ᾱ,x̄) − t̄ =: l̄(x̄,t̄).
(4): Let l̄(x̄,t̄) = ‖x̄‖² + r²(ᾱ,x̄) − t̄ = 0. Since the algorithm is infinite, i.e., we have ‖x^{k_q}‖² > t^{k_q} ∀q, it follows that ‖x̄‖² − t̄ ≥ 0, and hence, because l̄(x̄,t̄) = 0, we have r²(ᾱ,x̄) ≤ 0. This is possible only if r(ᾱ,x̄) = 0, and hence ‖x̄‖² = t̄. But from Definition XI.2 (ii) we see that then we must have x̄ ∈ D_ᾱ, and hence (x̄,t̄) ∈ A.

Therefore, by Theorem II.1, every accumulation point (x̄,t̄) of {(x^k,t^k)} satisfies (x̄,t̄) ∈ A, i.e.,

x̄ ∈ D_ᾱ and t̄ = ‖x̄‖². (133)

Now we show that the optimality condition of Theorem XI.2 (resp. Corollary XI.2) is satisfied: letting s → ∞ in (135) and observing that ‖x̄‖² ≥ t̄ (cf. (133)), one obtains ᾱ = min f(D), as asserted.
Let g(x₁,x₂) = −[(x₁ − 0.5)² + (x₂ − 0.415331)²]^{1/2} + 0.65 and let
S = {(x₁,x₂) ∈ ℝ²: 0 ≤ x₁ ≤ 1, 0 ≤ x₂ ≤ 1}.
The functions g and f are Lipschitz continuous on S, with Lipschitz constant 28.3 for
ABRHAM, J., and BUIE, R.N. (1975), A Note on Nonconcave Continuous Program-
ming. Zeitschrift für Operations Research, Serie A, 3, 107-114.
ADAMS, W.P. and SHERALI, H.D. (1986), A Tight Linearization and an Algorithm
for zero-one Quadratic Programming Problems. Management Science, 32,
1274-1290.
AHUJA, R.K., MAGNANTI, T.L. and ORLIN, J.B. (1993), Network Flows: Theory,
Algorithms and Applications. Prentice Hall, Englewood Cliffs, N.J.
ALAMEDDINE, A. (1990), A New Reformulation-Linearization Technique for the
Bilinear Programming and Related Problems with Applications to Risk Management.
Ph.D., Dissertation, Department of Industrial and Systems Engineering, Virginia
Polytechnic Institute and State University, Blacksburg, Virginia.
ALTMANN, M. (1968), Bilinear Programming. Bull. Acad. Polon. Sci. Ser. Sci.
Math. Astronom. Phys., 16, 741-745.
AVRIEL, M. (1973), Methods for Solving Signomial and Reverse Convex Program-
ming Problems. In: Avriel et al. (ed.), Optimization and Design, Prentice-Hall Inc.,
Englewood Cliffs, N.J. 307-320.
BALAS, E. (1968), A Note on the Branch and Bound Principle. Operations Research
16, 442-445.
BALAS, E. (1971), Intersection Cuts - a New Type of Cutting Planes for Integer
Programming. Operations Research, 19, 19-39.
BALAS, E. (1972), Integer Programming and Convex Analysis: Intersection Cuts
!rom Outer Polars. Mathematical Programming, 2, 330-382.
BALAS, E. (1975), Disjunctive Programming: Cutting Planes !rom Logical Condi-
tions. In: "Nonlinear Programming, 2", Academic Press, Inc., New York, San
Francisco, London, 279-312.
BASSO, P. (1982), Iterative methods for the localization of the global maximum.
SIAM Journal on Numerical Analysis, 19, 781-792.
BASSO, P. (1985), Optimal Search for the Global Maximum of Functions with
Bounded Seminorm. SIAM Journal on Numerical Analysis, 22, 888-903.
BAZARAA, M.S. (1973), Geometry and Resolution of Duality Gaps. Naval Research
Logistics, 20, 357-366.
BAZARAA, M.S. and SHETTY, C.M. (1979), Nonlinear Programming: Theory and
Algorithms. John Wiley and Sons, New York.
BAZARAA, M.S. and SHERALI, H.D. (1982), On the Use of Exact and Heuristic
Cutting Plane Methods for the Quadratic Assignment Problem. Journal Operational
Society, 33, 991-1003.
BEALE, E.M.L. (1980), Branch and Bound Methods for Numerical Optimization of
Nonconvex Functions. In: Compstat 1980, Physica, Vienna, 11-20.
BEALE, E.M.L. and FORREST, J.J.H. (1978), Global Optimization as an Extension
of Integer Programming. In: Towards Globaf Optimization 2, Dixon, L.C.W. and
Jzego, eds., North Holland, Amsterdam, 131-149.
BENSON, H.P. (1982), On the Convergence of Two Branch and Bound Algorithms
for Nonconvex Programming Problems. Journal of Optimization Theory and
Applications, 36, 129-134.
BENSON, H.P. (1985), A Finite Algorithm for Concave Minimization Over a Poly-
hedron. Naval Research Logistics Quarterly, 32, 165-177.
BENSON, H.P. (1990), Separable Concave Minimization via Partial Outer Approxi-
mation and Branch and Bound. Operations Research Letters, 9, 389-394.
BENSON, H.P. (1995), Concave Minimization: Theory, Applications and
Algorithms. In: Horst, R. and Pardalos, P.M. (eds.), Handbook of Global
Optimization, 43-148, Kluwer, Dordrecht-Boston-London.
BENSON, H.P. and ERENGUC, S. (1988), Using Convex Envelopes to Solve the
Interactive Fixed Charge Linear Programming Problem. Journal of Optimization
Theory and Applications 59, 223-246.
BENSON, H.P. and HORST, R. (1991), A Branch and Bound - Outer Approximation
Algorithm for Concave Minimization Over a Convex Set. Journal of
Computers and Mathematics with Applications, 21, 67-76.
BITTNER (1970), Some Representation Theorems for Functions and Sets and their
Applications to Nonlinear Programming. Numerische Mathematik, 16, 32-51.
BLUM, E. and OETTLI, W. (1975), Mathematische Optimierung. Springer-Verlag,
Berlin.
BRANIN, F.H. (1972), Widely Convergent Method for Finding Multiple Solutions of
Simultaneous Nonlinear Equations. IBM J. Res. Dev. 504-522.
BRENT, R.P. (1973), Algorithms for Minimization Without Derivatives. Prentice-
Hall, Englewood Cliffs.
BULATOV, V.P. and KHAMISOV, O.V. (1992), The Branch and Bound Method
with Cuts in E n+1 for Solving Concave Programming Problems. Lecture Notes in
Control and Information Sciences, 180, 273-281.
BURDET, C.A. (1973), Polaroids: A New Tool in Nonconvex and in Integer Pro-
gramming. Naval Research Logistics Quarterly, 20, 13-24.
BURDET, C.A. (1973), Enumerative Cuts I. Operations Research, 21, 61-89.
BURDET, C.A. (1977), Convex and Polaroid Extensions. Naval Research Logistics
Quarterly, 26, 67-82.
CABOT, A.V. and ERENGUC, S.S. (1984), Some Branch and Bound Procedures
for Fixed-Cost Transportation Problems. Naval Research Logistics Quarterly, 31,
129-138.
CABOT, A.V. and ERENGUC, S.S. (1986), A Branch and Bound Algorithm for
Solving a Class of Nonlinear Integer Programming Problems. Naval Research
Logistics, 33, 559-567.
CABOT, A.V. and FRANCIS, R.L. (1970), Solving Nonconvex Quadratic Minimiza-
tion Problems by Ranking the Extreme Points. Operations Research, 18, 82-86.
CANDLER, W. and TOWNSLEY, R.J. (1964), The Maximization of a Quadratic
Function of Variables subject to Linear Inequalities. Management Science, 10,
515-523.
CHEW, S.H. and ZHENG, Q. (1988), Integral Global Optimization. Lecture Notes in
Economics and Mathematical Systems, 289, Springer-Verlag, Berlin.
DANTZIG, G.B. and WOLFE, P. (1960), Decomposition Principle for Linear Pro-
grams. Operations Research, 8, 101-111.
DENNIS, J.E. and SCHNABEL, R.B. (1983), Numerical Methods for Nonlinear
Equations and Unconstrained Optimization. Prentice-Hall, Englewood Cliffs, New
Jersey.
DEWDER, D.R. (1967), An Approximate Algorithm for the Fixed Charge Problem.
Naval Research Logistics Quarterly, 14, 101-113.
DIENER, I. (1987), On the Global Convergence of Path-Following Methods to
Determine all Solutions to a System of Nonlinear Equations. Mathematical
Programming, 39, 181-188.
DINH ZUNG (1987), Best Linear Methods of Approximation for Classes of Periodic
Function of Several Variables. Matematicheskie Zametki 41, 646-653 (in Russian).
DIXON, L.C.W. (1978), Global Optima Without Convexity. In: Greenberg, H. (ed.),
Design and Implementation Optimization Software, Sijthoff and Noordhoff, Alphen
aan den Rijn, 449-479.
DIXON, L.C.W., and SZEGO, G.P. (eds.) (1975), Towards Global Optimization.
Volume I. North-Holland, Amsterdam.
DIXON, L.C.W., and SZEGO, G.P. (eds.) (1978), Towards Global Optimization.
Volume II. North-Holland, Amsterdam.
DUONG, P.C. (1987), Finding the Global Extremum of a Polynomial Function. In:
Essays on Nonlinear Analysis and Optimization Problems, Institute of Mathematics,
Hanoi (Vietnam), 111-120.
DUTTON, R., HINMAN, G. and MILLHAM, C.B. (1974), The Optimal Location of
Nuclear-Power Facilities in the Pacific Northwest. Operations Research, 22,
478-487.
DYER, M.E. (1983), The Complexity of Vertex Enumeration Methods. Mathematics
of Operations Research, 8, 381-402.
DYER, M.E. and PROLL, L.G. (1977), An Algorithm for Determining All Extreme
Points of a Convex Polytope. Mathematical Programming, 12, 81-96.
DYER, M.E. and PROLL, L.G. (1982), An Improved Vertex Enumeration Algo-
rithm. European Journal of Operational Research, 9, 359-368.
EAVES, B.C. and ZANGWILL, W.I. (1971), Generalized Cutting Plane Algo-
rithms. SIAM Journal on Control, 9, 529-542.
ECKER, J.G. and NIEMI, R.D. (1975), A Dual Method for Quadratic Programs
with Quadratic Constraints. SIAM Journal Applied Mathematics, 28, 568-576.
ELLAIA, R. (1984), Contribution à l'Analyse et l'Optimisation de Différences de
Fonctions Convexes. Thèse du 3ème Cycle, Université Paul Sabatier, Toulouse.
ELSHAFEI, A.N. (1975), An Approach to Locational Analysis. Operational Re-
search Quarterly, 26, 167-181.
EMELICHEV, V.A. and KOVALEV, M.M. (1970), Solving Certain Concave
Programming Problems by Successive Approximation I. Izvestiya Akademii Nauk
BSSR, 6, 27-34 (in Russian).
ERENGUC, S.S. and BENSON, H.P. (1987a), Concave Integer Minimizations Over
a Compact, Convex Set. Working Paper No. 135, Center for Econometrics and
Decision Science, University of Florida.
EVTUSHENKO, Y.G. (1971), Numerical Methods for Finding the Global Extremum
of a Function. USSR Computational Mathematics and Mathematical Physics, 11,
38-54.
FALK, J.E., BRACKEN, J. and McGILL, J.T. (1974), The Equivalence of Two
Mathematical Programs with Optimization Problems in the Constraints. Operations
Research, 22, 1102-1104.
FALK, J.E. and HOFFMAN, K.L. (1976), A Successive Underestimation Method for
Concave Minimization Problems. Mathematics of Operations Research, 1, 251-259.
FALK, J.E. and HOFFMAN, K.L. (1986), Concave Minimization via Collapsing
Polytopes. Operations Research, 34, 919-929.
FALK, J.E. and SOLAND, R.M. (1969), An Algorithm for Separable Nonconvex
Programming Problems. Management Science, 15, 550-569.
FEDOROV, V.V. (ed.) (1985), Problems of Cybernetics, Models and Methods in
Global Optimization. USSR Academy of Sciences, Moscow (in Russian).
FENCHEL, W. (1949), On Conjugate Convex Functions. Canadian Journal of
Mathematics, 1, 73-77.
FENCHEL, W. (1951), Convex Cones, Sets and Functions. Mimeographed Lecture
Notes, Princeton University.
FLOUDAS, C.A. and PARDALOS, P.M. (1990), A Collection of Test Problems for
Constrained Global Optimization Algorithms. Lecture Notes in Computer Science,
455, Springer Verlag, Berlin.
FORGO, F. (1972), Cutting Plane Methods for Solving Nonconvex Quadratic Prob-
lems. Acta Cybernetica, 1, 171-192.
FORGO, F. (1988), Nonconvex Programming. Akademiai Kiado, Budapest.
GAL, T., KRUSE, H.J. and ZÖRNIG, P. (1988), Survey of Solved and Open
Problems in the Degeneracy Phenomenon. Mathematical Programming B, 42,
125-133.
GALLO, G., SANDI, C. and SODINI, C. (1980), An Algorithm for the Min Concave
Cost Flow Problem, European Journal of Operational Research, 4, 249-255.
GALLO, G. and SODINI, C. (1979), Adjacent Extreme Flows and Application to
Min Concave Cost Flow Problem. Networks, 9, 95-122.
GALLO, G. and SODINI, C. (1979a), Concave Cost Minimization on Networks.
European Journal of Operations Research, 3, 239-249.
GALLO, G. and ÜLKUCÜ, A. (1977), Bilinear Programming: An Exact Algorithm.
Mathematical Programming, 12, 173-194.
GALPERIN, E.A. (1985), The Cubic Algorithm. Journal of Mathematical Analysis
and Applications, 112, 635-640.
GALPERIN, E.A. (1988), Precision, Complexity, and Computational Schemes of the
Cubic Algorithm. Journal of Optimization Theory and Applications, 57, 223-238.
GALPERIN, E.A. and ZHENG, Q. (1987), Nonlinear Observation via Global Opti-
mization Methods: Measure Theory Approach. Journal of Optimization Theory and
Applications, 54, 1, 63-92.
GANSHIN, G.S. (1976a), Simplest Sequential Search Algorithm for the Largest Value
of a Twice-Differentiable Function. USSR Computational Mathematics and
Mathematical Physics, 16, 508-509.
GANSHIN, G.S. (1977), Optimal Passive Algorithms for Evaluating the Maximum of
a Function in an Interval. USSR Computational Mathematics and Mathematical
Physics, 17, 8-17.
GARCIA, C.B. and ZANGWILL, W.I. (1981), Pathways to Solutions, Fixed Points
and Equilibria. Prentice-Hall, Englewood Cliffs, N.J.
GASANOV, I.I. and RIKUN, A.D. (1985), The Necessary and Sufficient Conditions
for Single Extremality in Nonconvex Problems of Mathematical Programming. USSR
Computational Mathematics and Mathematical Physics, 25, 105-113.
GEOFFRION, A.M. (1971), Duality in Nonlinear Programming: A Simplified
Applications-Oriented Development. SIAM Reviews, 13, 1-37.
GEOFFRION, A.M. (1972), Generalized Benders Decomposition. Journal of Opti-
mization Theory and Applications, 10, 237-260.
GLOVER, F. (1973), Convexity Cuts and Cut Search. Operations Research, 21, 123-
134.
GUISEWITE, G.M. (1995), Network Problems. In: Horst, R. and Pardalos, P.M.
(eds.), Handbook of Global Optimization, 609-678, Kluwer, Dordrecht-Boston-
London.
HANSEN, P., JAUMARD, B. and LU, S.H. (1989), A Framework for Algorithms in
Globally Optimal Design. Research Report G-88-11, HEC-GERAD, University of
Montreal.
HANSEN, P., JAUMARD, B. and LU, S.H. (1989a), Global Minimization of Uni-
variate Functions by Sequential Polynomial Approximation. International Journal of
Computer Mathematics, 28, 183-193.
HANSEN, P., JAUMARD, B. and LU, S.H. (1991), On the Number of Iterations of
Piyavskii's Global Optimization Algorithm. Mathematics of Operations Research, 16,
334-350.
HANSEN, P., JAUMARD, B. and LU, S.H. (1992), Global Optimization of Uni-
variate Lipschitz Functions: I. Survey and Properties. Mathematical Programming,
55, 251-272.
HANSEN, P., JAUMARD, B. and LU, S.H. (1992a), Global Optimization of Uni-
variate Lipschitz Functions: Part II. New Algorithms and Computational Compari-
son. Mathematical Programming, 55, 273-292.
HARMAN, J.K. (1973), Some Experience in Global Optimization. Naval Research
Logistics Quarterly, 20, 569-576.
HEYDEN, L. Van der (1980), A Variable Dimension Algorithm for the Linear
Complementarity Problem. Mathematical Programming, 19, 123-130.
HILLESTAD, R.J. (1975), Optimization Problems Subject to a Budget Constraint
with Economies of Scale. Operations Research, 23, 1091-1098.
HILLESTAD, R.J. and JACOBSEN, S.E. (1980), Reverse Convex Programming.
Applied Mathematics and Optimization, 6, 63-78.
HOFFMAN, K.L. (1981), A Method for Globally Minimizing Concave Functions over
Convex Sets. Mathematical Programming, 20, 22-32.
HOGAN, W.W. (1973), Applications of a General Convergence Theory for Outer
Approximation Algorithms. Mathematical Programming, 5, 151-168.
HORST, R. (1976), An Algorithm for Nonconvex Programming Problems. Mathema-
tical Programming, 10, 312-321.
HORST, R. (1976b), A New Branch and Bound Approach for Concave Minimization
Problems. Lecture Notes in Computer Science 41,330-337, Springer-Verlag, Berlin.
HORST, R. (1978), A New Approach for Separable Nonconvex Minimization Prob-
lems Including a Method for Finding the Global Minimum of a Function of a Single
Variable. Proceedings in Operations Research, 7, 39-47, Physica, Heidelberg.
HORST, R. (1979), Nichtlineare Optimierung. Carl Hanser-Verlag, München.
HORST, R. and DIEN, L.V. (1987), A Solution Concept for a very General Class of
Decision Problems. In: Opitz and Rauhut (eds.): Mathematik und Ökonomie,
Springer, 143-153.
HORST, R., MUU, L.D. and NAST, M. (1994), A Branch and Bound Decomposition
Approach for Solving Quasiconvex-Concave Programs. Journal of Optimization
Theory and Applications, 82, 267-293.
HORST, R., NAST, M. and THOAI, N.V. (1995), New LP-Bound in Multivariate
Lipschitz Optimization: Theory and Applications. Journal of Optimization Theory
and Applications, 86, 369-388.
HORST, R., PARDALOS, P.M. and THOAI, N.V. (1995), Introduction to Global
Optimization. Kluwer, Dordrecht-Boston-London.
HORST, R., PHONG, T.Q. and THOAI, N.V. (1990), On Solving General Reverse
Convex Programming Problems by a Sequence of Linear Programs and Line
Searches. Annals of Operations Research 25, 1-18.
HORST, R., PHONG, T.Q., THOAI, N.V. and VRIES, J. de (1991), On Solving a
D.C. Programming Problem by a Sequence of Linear Programs. Journal of Global
Optimization 1, 183-204.
HORST, R. and THOAI, N.V. (1995a), A New Algorithm for Solving the General
Quadratic Programming Problem. To appear in Computational Optimization and
Applications.
HORST, R. and THOAI, N.V. (1996), A Decomposition Approach for the Global
Minimization of Biconcave Functions over Polytopes. To appear in Journal of
Optimization Theory and Applications.
HORST, R., THOAI, N.V., and BENSON, H.P. (1991), Concave Minimization via
Conical Partitions and Polyhedral Outer Approximation. Mathematical
Programming 50, 259-274.
HORST, R., THOAI, N.V. and TUY, H. (1987), Outer Approximation by Polyhedral
Convex Sets. Operations Research Spektrum, 9, 153-159.
HORST, R., THOAI, N.V. and VRIES, J. de (1992), A New Simplicial Cover
Technique in Constrained Global Optimization. Journal of Global Optimization, 2,
1-19.
JENSEN, P.A. and BARNES, J.W. (1980), Network Flow Programming. John
Wiley, New York.
KAO, E. (1979), A Multi-Product Lot-Size Model with Individual and Joint Set-Up
Costs. Operations Research, 27, 279-289.
KEARFOTT, R.B. (1987), Abstract Generalized Bisection and a Cost Bound.
Mathematics of Computation, 49, 187-202.
KEDEM, G. and WATANABE, H. (1983), Optimization Techniques for IC Layout
and Compaction. Proceedings IEEE Intern. Conf. in Computer Design: VLSI in
Computers, 709-713.
KELLEY, J.E. (1960), The Cutting-Plane Method for Solving Convex Programs.
Journal SIAM, 8, 703-712.
KHACHATUROV, V. and UTKIN, S. (1988), Solving Multiextremal Concave
Programming Problems by Combinatorial Approximation Method. Preprint,
Computer Center of the Academy of Sciences, Moscow (in Russian).
KHANG, D.B. and FUJIWARA, O. (1989), A New Algorithm to Find All Vertices of
a Polytope. Operations Research Letters, 8, 261-264.
KIEFER, J. (1957), Optimum Search and Approximation Methods Under Minimum
Regularity Assumptions. SIAM Journal, 5, 105-136.
KIWIEL, K.C. (1985), Methods of Descent for Nondifferentiable Optimization.
Lecture Notes in Mathematics, 1133, Springer-Verlag, Berlin.
KLINZ, B. and TUY, H. (1993), Minimum Concave Cost Network Flow Problems
with a Single Nonlinear Arc Cost. In: Pardalos, P.M. and Du, D.Z. (eds.), Network
Optimization Problems, 125-143, World Scientific, Singapore.
KOEHLER, G., WHINSTON, A.B. and WRIGHT, G.P. (1975), Optimization over
Leontiev Substitution Systems. North Holland, Amsterdam.
KONNO, H. (1971), Bilinear Programming: Part I. An Algorithm for Solving
Bilinear Programs. Technical Report No. 71-9, Operations Research House,
Stanford University.
KONNO, H. (1971a), Bilinear Programming: Part II. Applications of Bilinear
Programming. Technical Report No. 71-10, Operations Research House, Stanford
University, Stanford, CA.
KONNO, H. (1973), Minimum Concave Series Production Systems with Determi-
nistic Demand-Backlogging. Journal of the Operations Research Society of Japan,
16, 246-253.
LAMAR, B.W. (1993), A Method for Solving Network Flow Problems with General
Nonlinear Arc Costs. In: Du, D.Z. and Pardalos, P.M. (eds.), Network Optimization
Problems, 147-168, World Scientific, Singapore.
LAWLER, E.L. (1966), Branch and Bound Methods: A Survey. Operations Research,
14, 699-719.
LEVY, A.V. and MONTALVO, A. (1985), The Tunneling Algorithm for the Global
Minimization of Functions. SIAM Journal on Scientific and Statistical Computing,
6, 15-29.
LOVE, F.C. (1973), A Facilities in Series Inventory Model with Nested Schedules.
Management Science, 18, 327-338.
LUENBERGER, D.G. (1969), Optimization by Veetor Spaee Methods. John Wiley,
New York.
LÜTHI, H.J. (1976), Komplementaritäts- und Fixpunktalgorithmen in der Mathema-
tischen Programmierung, Spieltheorie und Ökonomie. Lecture Notes in Economics
and Mathematical Systems 129, Springer-Verlag, Berlin.
MAYNE, D.Q. and POLAK, E. (1984), Outer Approximation Algorithm for Non-
differentiable Optimization Problems. Journal of Optimization Theory and Applica-
tions, 42, 19-30.
MAYNE, D.Q. and POLAK, E. (1986), Algorithms for Optimization Problems with
Exclusion Constraints. Journal of Optimization Theory and Applications, 51,
453-473.
McCORMICK, G.P. (1972), Attempts to Calculate Global Solutions of Problems that
may have Local Minima. In: Numerical Methods for Nonlinear Optimization, F.
Lootsma, ed., Academic Press, London and New York, 209-221.
MOORE, R.E. (1988) (ed.), Reliability in Computing: The Role of Interval Methods.
Academic Press, New York.
MUELLER, R.K. (1970), A Method for Solving the Indefinite Quadratic Program-
ming Problem. Management Science, 16, 333-339.
MUKHAMEDIEV, B.M. (1982), Approximate Methods for Solving Concave Pro-
gramming Problems. USSR Computational Mathematics and Mathematical Physics,
22, 238-245 (in Russian).
MURTY, K.G. (1969), Solving the Fixed-Charge Problem by Ranking the Extreme
Points. Operations Research, 16, 268-279.
MURTY, K.G. (1974), Note on a Bard-Type Scheme for Solving the Complemen-
tarity Problem. Operations Research, 11, 123-130.
MURTY, K.G. (1988), Linear Complementarity, Linear and Nonlinear Program-
ming. Heldermann Verlag, Berlin.
MURTY, K.G. and KABADI, S.N. (1987), Some NP-Complete Problems in
Quadratic and Nonlinear Programming. Mathematical Programming, 39, 117-130.
MUU, L.D. (1985), A Convergent Algorithm for Solving Linear Programs with an
Additional Reverse Convex Constraint. Kybernetika, 21, 428-435.
MUU, L.D. (1993), An Algorithm for Solving Convex Programs with an Additional
Convex-Concave Constraint. Mathematical Programming, 61, 75-87.
MUU, L.D. and OETTLI, W. (1989), An Algorithm for Indefinite Quadratic
Programming with Convex Constraints. Operations Research Letters, 10, 323-327.
NEFERDOV, V.N. (1987), The Search for the Global Maximum of a Function of
Several Variables on a Set Defined by Constraints of Inequality Type. USSR
Computational Mathematics and Mathematical Physics, 27, 23-32.
NEMHAUSER, G.L. and WOLSEY, L.A. (1988), Integer and Combinatorial Optimi-
zation. John Wiley & Sons, New York.
NETZER, D. and PASSY, W. (1975), A Note on the Maximum of Quasiconvex
Functions. Journal of Optimization Theory and Applications, 16, 565-569.
NGHIA, N.D. and HIEU, N.D. (1986), A Method for Solving Reverse Convex
Programming Problems. Acta Mathematica Vietnamica, 11,241-252.
NGUYEN, V.H. and STRODIOT, J.J. (1988), Computing a Global Optimal Solution
to a Design Centering Problem. Technical Report 88/12, Facultes Universitaires de
Namur, Namur, Belgium.
NGUYEN, V.H. and STRODIOT, J.J. (1992), Computing a Global Optimal Solution
to a Design Centering Problem. Mathematical Programming, 53, 111-123.
NGUYEN, V.H., STRODIOT, J.J. and THOAI, N.V. (1985), On an Optimum
Shape Design Problem. Technical Report 85/5. Department of Mathematics,
Facultes Universitaires de Namur.
PANG, J.S. (1995), Complementarity Problems. In: Horst, R. and Pardalos, P.M.
(eds.), Handbook of Global Optimization, 271-338, Kluwer, Dordrecht-Boston-London.
PARDALOS, P.M., GLICK, J.H. and ROSEN, J.B. (1987), Global Minimization of
Indefinite Quadratic Problems. Computing, 39, 281-291.
PARDALOS, P.M. and GUPTA, S. (1988), A Note on a Quadratic Formulation for
Linear Complementarity Problems. Journal of Optimization Theory and Applica-
tions, 57, 197-202.
PARDALOS, P.M. and KOVOOR, N. (1990), An Algorithm for Singly Constrained
Quadratic Programs. Mathematical Programming 46, 321-328.
PARDALOS, P.M. and PHILLIPS, A.T. (1990), A Global Optimization Approach
for the Maximum Clique Problem. International Journal of Computer Mathematics,
33, 209-216.
PARDALOS, P.M. and PHILLIPS, A.T. (1991), Global Optimization of Fractional
Programs. Journal of Global Optimization, 1, 173-182.
PARDALOS, P.M. and ROSEN, J.B. (1986), Methods for Global Concave Minimi-
zation: A Bibliographic Survey. SIAM Review, 28, 367-379.
PARDALOS, P.M. and ROSEN, J.B. (1987), Constrained Global Optimization:
Algorithms and Applications. Lecture Notes in Computer Science, 268,
Springer-Verlag, Berlin.
PARDALOS, P.M. and ROSEN, J.B. (1988), Global Optimization Approach to the
Linear Complementarity Problem. SIAM Journal on Scientific and Statistical
Computing, 9, 341-353.
PARDALOS, P.M. and SCHNITGER, G. (1987), Checking Local Optimality in Con-
strained Quadratic Programming is NP-hard. Operations Research Letters, 7,
33-35.
PIYAVSKII, S.A. (1972), An Algorithm for Finding the Absolute Extremum of a
Function. USSR Computational Mathematics and Mathematical Physics, 12, 57-67.
POLAK, E. (1982), An Implementable Algorithm for the Optimal Design Centering,
Tolerancing and Tuning Problem. Journal of Optimization Theory and Applications,
37, 45-67.
RATSCHEK, H. and ROKNE, J. (1984), Computer Methods for the Range of Func-
tions. Ellis Horwood Series Mathematics and its Applications, Wiley, New York.
RATSCHEK, H. and ROKNE, J. (1988), New Computer Methods for Global Optimi-
zation. Ellis Horwood, Chichester.
RATSCHEK, H. and ROKNE, J. (1995), Interval Methods. In: Horst, R. and
Pardalos, P.M. (eds.), Handbook of Global Optimization, 751-828, Kluwer,
Dordrecht-Boston-London.
REEVES, G.R. (1975), Global Minimization in Nonconvex All-Quadratic Program-
ming. Management Science, 22, 76-86.
RINNOOY KAN, A.H.G. and TIMMER, G.T. (1987), Stochastic Global Optimiza-
tion Methods. Part I: Clustering Methods. Mathematical Programming, 39, 27-56.
RINNOOY KAN, A.H.G. and TIMMER, G.T. (1987), Stochastic Global Optimiza-
tion Methods. Part II: Multi-Level Methods. Mathematical Programming, 39, 57-78.
RITTER, K. (1965), Stationary Points of Quadratic Maximum Problems. Zeitschrift
für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 4, 149-158.
ROSEN, J.B. (1960), The Gradient Projection Method for Nonlinear Programming,
I: Linear Constraints. SIAM Journal on Applied Mathematics, 8, 181-217.
ROSEN, J.B. (1966), Iterative Solution of Nonlinear Optimal Control Problems.
SIAM Journal on Control, 4, 223-244.
ROSEN, J.B. (1983), Global Minimization of a Linearly Constrained Concave
Function by Partition of Feasible Domain. Mathematics of Operations Research, 8,
215-230.
ROSEN, J.B. (1983a), Parametric Global Minimization for Large Scale Problems.
Technical Report 83-11 (revised), Computer Sci. Dept., Univ. of Minnesota.
ROSEN, J.B. (1984), Performance of Approximate Algorithms for Global Minimi-
zation. Math. Progr. Study, 22, 231-236.
ROSEN, J.B. (1984a), Computational Solution of Large-Scale Constrained Global
Minimization Problems. Numerical Optimization 1984 (P.T. Boggs, R.H. Byrd,
R.B. Schnabel, eds.), SIAM, Philadelphia, 263-271.
SEN, S. and SHERALI, H.D. (1985), On the Convergence of Cutting Plane Algo-
rithms for a Class of Nonconvex Mathematical Programs. Mathematical Program-
ming, 31, 42-56.
SEN, S. and SHERALI, H.D. (1986), Facet Inequalities from Simple Disjunctions in
Cutting Plane Theory. Mathematical Programming, 34, 72-83.
SEN, S. and SHERALI, H.D. (1987), Nondifferentiable Reverse Convex Programs
and Facial Convexity Cuts via a Disjunctive Characterization. Mathematical
Programming, 37, 169-183.
SEN, S. and WHITESON, A. (1985), A Cone Splitting Algorithm for Reverse Convex
Programming. Proceedings of IEEE Conference on Systems, Man, and Cybernetics,
Tucson, Az. 656-660.
SHEN, Z. and ZHU, Y. (1987), An Interval Version of Shubert's Iterative Method for
the Localization of the Global Maximum. Computing, 38, 275-280.
SHEPILOV, M.A. (1987), Determination of the Roots and of the Global Extremum of
a Lipschitz Function. Cybernetics, 23, 233-238.
SHERALI, H.D. and SHETTY, C.M. (1980), Optimization with Disjunctive Con-
straints. Lecture Notes in Economics and Mathematical Systems, 181, Springer-
Verlag, Berlin.
SHERALI, H.D. and SHETTY, C.M. (1980a), A Finitely Convergent Algorithm for
Bilinear Programming Problems Using Polar and Disjunctive Face Cuts.
Mathematical Programming, 19, 14-31.
SHERALI, H.D. and SHETTY, C.M. (1980b), Deep Cuts in Disjunctive Program-
ming. Naval Research Logistics, 27, 453-476.
SHIAU, T.R. (1984), Finding the Largest fl -ball in a Polyhedral Set. Technical
Summary Report, University of Wisconsin.
SHOR, N.Z. (1987), Quadratic Optimization Problems. Technicheskaya Kibernetika,
1, 128-139 (in Russian).
SHUBERT, B.O. (1972), A Sequential Method Seeking the Global Maximum of a
Function. SIAM Journal on Numerical Analysis, 9, 379-388.
SINGER, I. (1979), A Fenchel-Rockafellar Type Duality Theorem for Maximization.
Bulletin of the Australian Mathematical Society, 20, 193-198.
SINGER, I. (1992), Some Further Duality Theorems for Optimization Problems with
Reverse Convex Constraint Sets. Journal of Mathematical Analysis and Applications,
171, 205-219.
SOLAND, R.M. (1971), An Algorithm for Separable Nonconvex Programming Prob-
lems II: Nonconvex Constraints. Management Science, 17, 759-773.
SOLAND, R.M. (1974), Optimal Facility Location with Concave Costs. Operations
Research, 22, 373-382.
STEINBERG, D.I. (1970), The Fixed Charge Problem. Naval Research Logistics, 17,
217-235.
STUART, E.L., PHILLIPS, A.T. and ROSEN, J.B. (1988), Fast Approximate
Solution of Large-Scale Constrained Global Optimization Problems. Technical
Report No. 88-9, Computer Science Department, University of Minnesota.
TAM, B.T. and BAN, V.T. (1985), Minimization of a Concave Function Under Li-
near Constraints. Ekonomika i Matematicheskie Metody, 21, 709-714 (in Russian).
TAMMER, K. (1976), Möglichkeiten zur Anwendung der Erkenntnisse der para-
metrischen Optimierung für die Lösung indefiniter quadratischer Optimierungs-
probleme. Mathematische Operationsforschung und Statistik, Serie Optimization, 7,
206-222.
THACH, P.T. (1985), Convex Programs with Several Additional Reverse Convex
Constraints. Acta Mathematica Vietnamica, 10, 35-57.
THACH, P.T. (1987), D.C. Sets, D.C. Functions and Systems of Equations.
Preprint, Institute of Mathematics, Hanoi.
THACH, P.T. (1988), The Design Centering Problem as a D.C. Programming
Problem. Mathematical Programming, 41, 229-248.
THACH, P.T. (1990), A Decomposition Method for the Min Concave Cost Flow
Problem with a Special Structure. Japan Journal of Applied Mathematics 7, 103-120.
THACH, P.T. (1990a), Convex Minimization Under Lipschitz Constraints. Journal
of Optimization Theory and Applications 64, 595-614.
THACH, P.T. (1993a), Global Optimality Criterion and Duality with Zero Gap in
Nonconvex Optimization Problems. SIAM J. Math. Anal., 24, 2537-2556.
THACH, P.T., BURKARD, R.E. and OETTLI, W. (1991), Mathematical Programs
with a Two Dimensional Reverse Convex Constraint. Journal of Global
Optimization, 1, 145-154.
THACH, P.T., THOAI, N.V. and TUY, H. (1987), Design Centering Problem with
Lipschitzian Structure. Preprint, Institute of Mathematics, Hanoi.
THACH, P.T. and TUY, H. (1987), Global Optimization under Lipschitzian Con-
straints. Japan Journal of Applied Mathematics, 4, 205-217.
THACH, P.T. and TUY, H. (1990), The Relief Indicator Method for Constrained
Global Optimization, Naval Research Logistics 37, 473-497.
THAKUR, L. (1990), Domain Contraction in Nonlinear Programming: Minimizing a
Quadratic Concave Objective over a Polyhedron. Mathematics of Operations
Research 16, 390-407.
THIEU, T.V. (1980), Relationship between Bilinear Programming and Concave
Programming. Acta Mathematica Vietnamica, 2, 106-113.
THIEU, T.V. (1984), A Finite Method for Globally Minimizing Concave Functions
over Unbounded Convex Sets and its Applications. Acta Mathematica Vietnamica, 9,
173-191.
THIEU, T.V. (1987), Solving the Lay-Out Planning Problem with Concave Cost. In:
Essays in Nonlinear Analysis and Optimization, Institute of Mathematics, Hanoi,
101-110.
THOAI, N.V. (1987), On Canonical D.C. Programs and Applications. In: Essays on
Nonlinear Analysis and Optimization Problems, Institute of Mathematics, Hanoi,
88-100.
THOAI, N.V. (1988), A Modified Version of Tuy's Method for Solving D.C.
Programming Problems. Optimization, 19, 665-674.
THOAI, N.V. (1994), On the Construction of Test Problems for Concave
Minimization Algorithms. Journal of Global Optimization 5, 399-402.
THUONG, T.V. and TUY, H. (1984), A Finite Algorithm for Solving Linear
Programs with an Additional Reverse Convex Constraint. Lecture Notes in
Economics and Mathematical Systems, 225, Springer, 291-302.
TOLAND, J.F. (1979), A Duality Principle for Nonconvex Optimisation and the
Calculus of Variations. Archive of Rational Mechanics and Analysis, 71, 41-61.
TOMLIN, J.A. (1978), Robust Implementation of Lemke's Method for the Linear
Complementarity Problem. Mathematical Programming Study, 7, 55-60.
TOPKIS, D.M. (1970), Cutting Plane Methods without Nested Constraint Sets.
Operations Research, 18, 404-413.
TUY, H. (1964), Concave Programming under Linear Constraints. Soviet Mathema-
tics, 5, 1437-1440.
TUY, H., AL-KHAYYAL, F.A. and ZHOU, F. (1994), D.C. Optimization Method
for Single Facility Problems. Conference State of the Art in Global Optimization:
Computational Methods and Applications, Princeton, April 1995.
TUY, H., DAN, N.D. and GHANNADAN, S. (1992), Strongly Polynomial Time
Algorithm for Certain Concave Minimization Problems on Networks. Operations
Research Letters, 14, 99-109.
TUY, H. and THAI, N.Q. (1983), Minimizing a Concave Function Over a Compact
Convex Set. Acta Mathematica Vietnamica, 8, 12-20.
TUY, H., THIEU, T.V. and THAI, N.Q. (1985), A Conical Algorithm for Globally
Minimizing a Concave Function Over a Closed Convex Set. Mathematics of
Operations Research, 10, 498-515.
TUY, H. and THUONG, N.V. (1985), Minimizing a Convex Function over the
Complement of a Convex Set. Methods of Operations Research, 49, 85-89.
TUY, H. and THUONG, N.V. (1988), On the Global Minimization of a Convex
Function under General Nonconvex Constraints. Applied Mathematics and
Optimization, 18, 119-142.
VAISH, H. and SHETTY, C.M. (1976), The Bilinear Programming Problem. Naval
Research Logistics Quarterly, 23, 303-309.
VAISH, H. and SHETTY, C.M. (1977), A Cutting Plane Algorithm for the Bilinear
Programming Problem. Naval Research Logistics Quarterly, 24, 83-94.
VAN der HEYDEN, L. (1980), A Variable Dimension Algorithm for the Linear
Complementarity Problem. Mathematical Programming, 19, 328-346.
VASIL'EV, N.S. (1985), Minimum Search in Concave Problems Using the Sufficient
Condition for a Global Extremum. USSR Computational Mathematics and
Mathematical Physics, 25, 123-129 (in Russian).
VASIL'EV, S.B. and GANSHIN, G.S. (1982), Sequential Search Algorithm for the
Largest Value of a Twice Differentiable Function. Mathematical Notes of the
Academy of Sciences of the USSR 31, 312-316.
VEINOTT, A.F. (1967), The Supporting Hyperplane Method for Unimodal Program-
ming. Operations Research, 15, 147-152.
VEINOTT, A.F. (1969), Minimum Concave Cost Solution of Leontief Substitution
Models of Multi-Facility Inventory Systems. Operations Research, 17, 262-291.
VERGIS, A., STEIGLITZ, K. and DICKINSON, B. (1986), The Complexity of
Analog Computation. Mathematics and Computers in Simulation, 28, 91-113.
VIDIGAL, L.M. and DIRECTOR, S.W. (1982), A Design Centering Algorithm for
Nonconvex Regions of Acceptability. IEEE Transactions on Computer-Aided-Design
of Integrated Circuits and Systems, CAD-1, 13-24.
WARGA, J. (1992), A Necessary and Sufficient Condition for a Constrained
Minimum. SIAM Journal on Optimization, 2, 665-667.
YAGED, B. (1971), Minimum Cost Routing for Static Network Models. Networks, 1,
139-172.
YAJIMA, Y. and KONNO, H. (1990), Efficient Algorithms for Solving Rank Two
and Rank Three Bilinear Programming Problems. Journal of Global Optimization 1,
155-172.
ZALIZNYAK, N.F. and LIGUN, A.A. (1978), Optimal Strategies for Seeking the
Global Maximum of a Function. USSR Computational Mathematics and Mathemati-
cal Physics, 18, 31-38.
ZANG, I. and AVRIEL, M. (1975), On Functions whose Local Minima are Global.
Journal of Optimization Theory and Applications, 16, 183-190.
ZANG, I., CHOO, E.W. and AVRIEL, M. (1976), A Note on Functions whose Local
Minima are Global. Journal of Optimization Theory and Applications, 18, 556-559.
ZANGWILL, W.I. (1966), A Deterministic Multi-Product, Multi-Facility Production
and Inventory Model. Operations Research, 14, 486-507.
ZANGWILL, W.I. (1968), Minimum Concave Cost Flows in Certain Networks.
Management Science, 14, 429-450.
NOTATION

I                   identity matrix
diag(α) =
diag(α_1,...,α_n)   diagonal matrix with entries α_1,...,α_n
                    (where α = (α_1,...,α_n))
A^T                 transpose of matrix A
A^(-1)              inverse of matrix A
det A               determinant of A
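As a small worked instance of the diag notation (the numbers here are chosen purely for illustration and are not taken from the text): for α = (α_1, α_2) = (2, 5),

    diag(α) = ( 2  0 )
              ( 0  5 )

so that diag(α)^T = diag(α) (every diagonal matrix is symmetric), diag(α)^(-1) = diag(1/2, 1/5), and det diag(α) = 2 · 5 = 10.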
H

hemisphere, 655
hierarchy of valid cuts, 211
Hölder continuity, 664
Horst-Thoai-de Vries method, 78
hydraulic network, 13
hypograph, 226, 279

I, J

indefinite quadratic problem, 13, 32, 46
infeasible partition set, 118, 134
initial polytope, 71 f., 87
inner approximation, 225, 243
integer linear programming problem, 19
integer quadratic programming problem, 19
integer programming, 14, 104 ff.
integrated circuits, 14
interactive fixed charge problem, 12
intersection cut, 105
inventory, 11, 13
investment, 33
jointly constrained biconvex programming problem, 36, 592

K

KCG algorithm, 61, 64, 228, 235
Konno's cutting method, 217
Krein-Milman Theorem, 11

L

Leontiev substitution system, 13
level set, 8, 181
linear complementarity problem, 24
linearization of the constraints, 638
Lipschitz constant, 6, 50
Lipschitz function, 7, 50, 604
Lipschitz inequality, 8
Lipschitz optimization, 43, 145, 603
local d.c. function, 29
local minimizer, 4
local minimum, 4

M

Markovian assignment problem, 20
max-min problem, 25
minimum concave cost flow problem, 421 ff.
mixed integer program, 399, 645
modified exact simplicial algorithm, 353
mountain climbing procedure, 450
multicommodity network flow, 20
multiextremal global optimization problem, 6
multilevel fixed charge problem, 12

N

national development, 14
network, 11, 13
node, 421
noncanonical d.c. problem, 541 ff.
nondegenerate subdivision of cones, 300 ff.
nondegenerate subdivision of simplices, 347 ff.
nonlinear cut, 53, 68, 660
normal cone splitting process, 297 ff.
normal conical algorithm, 308 ff.
normal rectangular algorithm, 295
normal rectangular subdivision, 295, 365 ff.
normal simplicial algorithm, 345
normal simplicial subdivision, 343 ff.

O

ω-subdivision, 306 ff., 349 ff., 371
one-step optimality, 614
optimality condition, 6, 520, 528, 546
ordered sequential algorithm, 611
outer approximation by convex polyhedral sets, 58
outer approximation by projection, 65 ff.
outer approximation method, 53 ff.
T
Thieu-Tam-Ban method, 82 ff.
transversal facet, 247 ff.
trivial simplex, 356 ff.
U
uncapacitated minimum concave
cost flow problem, 422, 426, 437
uncertain partition set, 118
univariate Lipschitz optimization,
604, 609 ff.
unstable d.c. problem, 537
upper bound, 117, 119, 133, 145, 164
upper level set, 184
utility, 14, 30, 33, 34
V
valid cut, 89 ff., 182 ff., 211 ff.
variable cost, 12
variational inequality, 24
vertex minima, 146, 165, 166
vertex problem, 191
vertices, 53, 71 ff., 96, 111, 189 ff.,
249 ff., 341 ff., 385 f., 406 ff.
W,X,Y,Z
Weierstraß Theorem, 3, 11, 30
zero-one integer programming, 15,
399, 644