Lecture Notes in Computer Science 1610: Edited by G. Goos, J. Hartmanis and J. Van Leeuwen
Integer Programming
and Combinatorial
Optimization
7th International IPCO Conference
Graz, Austria, June 9-11, 1999
Proceedings
Series Editors
Gerhard Goos, Karlsruhe University, Germany
Juris Hartmanis, Cornell University, NY, USA
Jan van Leeuwen, Utrecht University, The Netherlands
Volume Editors
Gérard Cornuéjols
GSIA, Carnegie Mellon University
Schenley Park, Pittsburgh, PA 15213, USA
E-mail: [email protected]
Rainer E. Burkard
Gerhard J. Woeginger
Institut für Mathematik, Technische Universität Graz
Steyrergasse 30, A-8010 Graz, Austria
E-mail: {burkard,gwoegi}@opt.math.tu-graz.ac.at
ISSN 0302-9743
ISBN 3-540-66019-4 Springer-Verlag Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are
liable for prosecution under the German Copyright Law.
© Springer-Verlag Berlin Heidelberg 1999
Printed in Germany
Typesetting: Camera-ready by author
SPIN: 10705123 06/3142 – 5 4 3 2 1 0 Printed on acid-free paper
Preface
This volume contains the papers selected for presentation at IPCO VII, the
Seventh Conference on Integer Programming and Combinatorial Optimization,
Graz, Austria, June 9–11, 1999. This meeting is a forum for researchers and prac-
titioners working on various aspects of integer programming and combinatorial
optimization. The aim is to present recent developments in theory, computa-
tion, and applications of integer programming and combinatorial optimization.
Topics include, but are not limited to: approximation algorithms, branch and
bound algorithms, computational biology, computational complexity, computa-
tional geometry, cutting plane algorithms, diophantine equations, geometry of
numbers, graph and network algorithms, integer programming, matroids and
submodular functions, on-line algorithms, polyhedral combinatorics, scheduling
theory and algorithms, and semidefinite programs.
IPCO was established in 1988 when the first IPCO program committee was
formed. IPCO I took place in Waterloo (Canada) in 1990, IPCO II was held in
Pittsburgh (USA) in 1992, IPCO III in Erice (Italy) 1993, IPCO IV in Copen-
hagen (Denmark) 1995, IPCO V in Vancouver (Canada) 1996, and IPCO VI in
Houston (USA) 1998. IPCO is held every year in which no MPS (Mathematical
Programming Society) International Symposium takes place: 1990, 1992, 1993,
1995, 1996, 1998, 1999, 2001, 2002, 2004, 2005, 2007, 2008, … Since the MPS
meeting is triennial, IPCO conferences are held twice in every three-year period.
As a rule, in even years IPCO is held somewhere in North America, and in
odd years it is held somewhere in Europe.
In response to the call for papers for IPCO’99, the program committee re-
ceived 99 submissions, indicating a strong and growing interest in the conference.
The program committee met on January 10 and January 11, 1999, in Oberwol-
fach (Germany) and selected 33 contributed papers for inclusion in the scientific
program of IPCO’99. The selection was based on originality and quality, and
reflects many of the current directions in integer programming and optimization
research. The overall quality of the submissions was extremely high. As a result,
many excellent papers could not be chosen.
We thank all the referees who helped us in evaluating the submitted papers:
Karen Aardal, Norbert Ascheuer, Peter Auer, Imre Bárány, Therese Biedl, Hans
Bodlaender, Andreas Brandstädt, Dan Brown, Peter Brucker, Alberto Caprara,
Eranda Çela, Sebastian Ceria, Chandra Chekuri, Joseph Cheriyan, Fabian Chudak, William H. Cunningham, Jesús De Loera, Friedrich Eisenbrand, Matteo
Fischetti, Michel Goemans, Albert Gräf, Jens Gustedt, Leslie Hall, Christoph
Helmberg, Winfried Hochstättler, Stan van Hoesel, Han Hoogeveen, Mark Jer-
rum, Olaf Jahn, Michael Jünger, Howard Karloff, Samir Khuller, Bettina Klinz,
Dieter Kratsch, Monique Laurent, Jan Karel Lenstra, Martin Loebl, Alexander
Martin, Ross McConnell, S. Tom McCormick, Petra Mutzel, Michael Naatz, Karl
Nachtigall, John Noga, Andreas Nolte, Alessandro Panconesi, Chris Potts, Maurice Queyranne, Jörg Rambau, R. Ravi, Gerhard Reinelt, Franz Rendl, Günter
Rote, Juan José Salazar, Rüdiger Schultz, Andreas S. Schulz, Petra Schuurman,
András Sebő, Jay Sethuraman, Martin Skutella, Frits Spieksma, Angelika Steger,
Cliff Stein, Mechthild Stoer, Frederik Stork, Leen Stougie, Éva Tardos, Gottfried
Tinhofer, Zsolt Tuza, Marc Uetz, Vijay Vazirani, Albert Wagelmans, Dorothea
Wagner, Robert Weismantel, David Williamson, Laurence Wolsey, Günter M.
Ziegler, and Uwe Zimmermann. This list of referees is as complete as we could
make it, and we apologize for any omissions or errors.
The organizing committee for IPCO’99 essentially consisted of Eranda Çela,
Bettina Klinz, and Gerhard Woeginger. IPCO’99 was conducted in coopera-
tion with the Mathematical Programming Society (MPS), and it was sponsored
by the Austrian Ministry of Science, by Graz University of Technology, by the
Province of Styria, and by the City of Graz.
A Fast Algorithm for Computing Minimum 3-Way and 4-Way Cuts . . . . . . 377
H. Nagamochi and T. Ibaraki
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 1–16, 1999.
© Springer-Verlag Berlin Heidelberg 1999
2 Karen Aardal et al.
d_i = ⌊(1/2) Σ_{j=1}^{n} a_{ij}⌋, 1 ≤ i ≤ m. This corresponds to a market split where c_i = 1/2
for 1 ≤ i ≤ m. Cornuéjols and Dawande [2] argued that with this choice of
data most of the instances of the feasibility problem (1) are infeasible, which
implies that the optimization variant (3) has an objective value greater than
zero. If branch-and-bound is used to solve OPT (3), then, due to the symmetry
of the input, the value of the LP-relaxation remains at zero even after many
variables have been fixed. Cornuéjols and Dawande observed that branch-and-bound needed to evaluate about 2^{αn} nodes, where α typically takes values between 0.6 and 0.7.
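The instance generation just described can be sketched as follows. This is a hypothetical reconstruction, not the authors' code: the coefficient range D = 100 is an assumption (the standard Cornuéjols-Dawande setup), while d_i = ⌊(1/2)Σ_j a_{ij}⌋ and n = 10(m − 1) are taken from the text.

```python
import random

# Hedged sketch of the instance generator described above.  The coefficient
# range D = 100 is an assumption; the right-hand sides d_i are floors of half
# the row sums, as stated in the text.
def market_split_instance(m, D=100, seed=0):
    n = 10 * (m - 1)                     # the relation used in this paper
    rng = random.Random(seed)
    A = [[rng.randrange(D) for _ in range(n)] for _ in range(m)]
    d = [sum(row) // 2 for row in A]     # d_i = floor of half the row sum
    return A, d
```

With the seed fixed, the same instance is reproduced across runs, which is convenient when comparing solution strategies.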
The algorithm we use in our study is described briefly in Section 2. The
algorithm was developed by Aardal, Hurkens, and Lenstra [1] for solving a system
of linear diophantine equations with bounds on the variables, such as problem
(2), and is based on Lovász’ lattice basis reduction algorithm as described by
Lenstra, Lenstra, and Lovász [6]. Aardal et al. motivate their choice of basis
reduction as the main ingredient of their algorithm by arguing that one can
interpret problem (2) as checking whether there exists a short integral vector
satisfying the system Ax = d. Given the lattice, the basis reduction algorithm
finds a basis spanning that lattice such that the basis consists of short, nearly
orthogonal vectors. Hence, a lattice is chosen that seems particularly useful for
problem (2). An initial basis that spans the given lattice is derived, and the
basis reduction algorithm is applied to this basis. The parameters of the initial
basis are chosen such that the reduced basis contains one vector xd satisfying
Axd = d, and n− m linearly independent vectors x0 satisfying Ax0 = 0. Due to
the basis reduction algorithm all these vectors are relatively short. If the vector
xd satisfies the bounds, then the algorithm terminates, and if not, one observes
that A(xd + λx0 ) = d for any integer multiplier λ and any vector x0 such that
Ax0 = 0. Hence, one can branch on integer linear combinations of vectors x0
satisfying Ax0 = 0 in order to obtain either a vector satisfying the diophantine
equations as well as the lower and upper bounds, or a proof that no such vector
exists.
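The branching idea of the last paragraph can be illustrated on a toy example. The data below are hypothetical and a crude brute-force search stands in for the actual branching; no basis reduction is performed:

```python
import itertools

# Since A(x_d + X0^T lam) = d for every integer multiplier vector lam, it
# suffices to search multiplier space for a vector that also satisfies the
# bounds 0 <= x <= 1.
A = [2, 3, 5, 7]                      # a single diophantine equation A.x = 10
d = 10
x_d = [5, 0, 0, 0]                    # solves the equation, violates x <= 1
X0 = [[1, 1, -1, 0],                  # n - m = 3 kernel vectors: A.x0 = 0
      [1, 0, 1, -1],
      [3, -2, 0, 0]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

assert dot(A, x_d) == d and all(dot(A, x0) == 0 for x0 in X0)

def search(lo=-3, hi=3):
    # Enumerate integer multiplier vectors lam in a small box.
    for lam in itertools.product(range(lo, hi + 1), repeat=len(X0)):
        x = [x_d[j] + sum(l * v[j] for l, v in zip(lam, X0))
             for j in range(len(A))]
        if all(0 <= xj <= 1 for xj in x):
            return x
    return None

sol = search()   # a 0-1 vector with A.sol = 10
```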
In our computational study we solve both feasibility and optimization ver-
sions of the market split problem. The optimization version can be solved by
a slightly adapted version of the Aardal-Hurkens-Lenstra algorithm. We have
solved instances with up to 7 equations and 60 variables. To our knowledge,
the largest feasibility instances solved so far had 6 constraints and 50 variables,
and the largest optimization instances had 4 constraints and 30 variables. These
results were reported by Cornuéjols and Dawande [2]. Our computational expe-
rience is presented in Section 3.
When performing the computational study we observed that the larger the
instances became, the more often feasible instances were generated. This mo-
tivated us to analyse the expected number of solutions for instances generated
according to Cornuéjols and Dawande. Our conclusion is that for a given value
of m > 4, one needs to generate slightly fewer variables than is given by the expression n = 10(m − 1) (keeping all other parameters the same). We present our
analysis together with numerical support in Section 4.
The vectors of B span the lattice L ⊂ ℝ^{n+m+1} that contains vectors of the form
(x^T, N_1 y, N_2(a_1 x − d_1 y), . . . , N_2(a_m x − d_m y))^T = B (x^T, y)^T, (6)
where y is a variable associated with the right-hand-side vector d.
Market Split and Basis Reduction 5
belongs to the lattice L, and the integer vector x_0 satisfies Ax_0 = 0 if and only if the vector
(x_0^T, 0, 0^{(1×m)})^T = B (x_0^T, 0)^T (8)
belongs to L.
Unless x_d satisfies the bound constraints, one needs to branch on the variables λ_j in order to check whether the polytope P = {λ ∈ ℤ^{n−m} : l − x_d ≤ X_0 λ ≤ u − x_d} is empty. The basis reduction algorithm runs in polynomial time. If one
wants an overall algorithm that runs in polynomial time for a fixed number of
variables, one needs to apply the algorithms of H.W. Lenstra, Jr. [7] or of Lovász
and Scarf [9]. Otherwise, one can, for instance, apply integral branching on
the unit vectors in λ-space or linear programming based branch-and-bound. By
integral branching in the λ-space we mean the following. Assume we are at node k of the search tree. Take any unit vector e_j, i.e., the jth vector of the (n − m)-dimensional identity matrix, that has not yet been considered at the predecessors of k. Measure the width of the polytope P in this direction, i.e., determine u_k = max{e_j^T λ : λ ∈ P ∩ {λ_j’s fixed at predecessors of k}} and l_k = min{e_j^T λ : λ ∈ P ∩ {λ_j’s fixed at predecessors of k}}. Create ⌊u_k⌋ − ⌈l_k⌉ + 1 subproblems at node k of the search tree by fixing λ_j to the values ⌈l_k⌉, . . . , ⌊u_k⌋. The question is now in which order we should choose the unit vectors in our branching scheme. One alternative is to just take them in any predetermined order, and another is to determine, at node k, which unit vector yields the smallest value of ⌊u_k⌋ − ⌈l_k⌉. This branching scheme is similar to the scheme proposed by Lovász and Scarf [9]
except that, in general, we are not sure whether the number of branches created at each level is bounded by some constant depending only on the dimension.
What we hope for, and what seems to be the case given our computational
results, is that ⌊u_k⌋ − ⌈l_k⌉ is small at most nodes of the tree. A natural question
in the context of branching is whether we may hope that linear programming
based branch-and-bound is more efficient on the polytope P as compared to the
polytope X. As can be observed in Section 3 we typically reduce the number of
nodes if we branch on P instead of X by several orders of magnitude. One way
of explaining this is that we obtain a scaling effect by going from description (2)
of our problem to description (10), see Aardal et al. [1].
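The width computation behind integral branching can be sketched on a tiny hypothetical instance; here brute-force enumeration stands in for the linear programs that determine u_k and l_k:

```python
import itertools

# Hypothetical toy data (not from the paper's experiments): one equation
# A.x = d, an integer solution x_d violating the 0-1 bounds, and a kernel
# basis X0, so that P = {integer lam : 0 <= x_d + X0^T lam <= 1}.
A = [2, 3, 5, 7]
d = 10
x_d = [5, 0, 0, 0]
X0 = [[1, 1, -1, 0], [1, 0, 1, -1], [3, -2, 0, 0]]

def x_of(lam):
    return [x_d[j] + sum(l * v[j] for l, v in zip(lam, X0)) for j in range(4)]

# Enumerate a generous box of multipliers and keep the feasible ones
# (a crude stand-in for solving the LPs that give u_k and l_k).
feasible = [lam for lam in itertools.product(range(-10, 11), repeat=3)
            if all(0 <= xj <= 1 for xj in x_of(lam))]

# Width of P along each unit direction e_j; branch on the thinnest one.
widths = [max(lam[j] for lam in feasible) - min(lam[j] for lam in feasible)
          for j in range(3)]
thinnest = min(range(3), key=widths.__getitem__)
```

On this toy instance two of the three unit directions already have width zero, so branching along them fixes the corresponding multiplier immediately.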
3 Computational Experience
3.1 The Feasibility Version
We solved 17 instances of problem FP (1) reformulated as problem (10). Three
of the instances were feasible and 14 infeasible. The input was generated as de-
scribed in Section 1. The instances M16 and M17 are the instances “markshare1”
and “markshare2” of MIPLIB [10].
17.2. The times reported for this strategy are the actual times obtained with the Alphaserver, multiplied by a factor of 2 in order to make it easier to compare
pare the times of the three strategies. The basis reduction in the algorithm by
Aardal, Hurkens, and Lenstra is done using LiDIA, a library for computational
number theory [8]. The average computing times for the Aardal-Hurkens-Lenstra algorithm were 1.6, 3.1, and 4.8 seconds for the three instance sizes, respectively.
These computations were all carried out on the Sun Ultra Enterprise 2. For the
LP-based branch-and-bound on the variables λj we used CPLEX 6.5 [4], and for
the other two strategies we used the enumeration scheme developed by Verweij [12]. The linear programming subproblems that arise in these strategies when determining ⌊u_k⌋ and ⌈l_k⌉ are solved using CPLEX version 6.0.1 [3].
An important conclusion of the experiments is that the reformulation itself
is essential. Cornuéjols and Dawande [2] could only solve instances up to size
4 × 30 using CPLEX version 4.0.3. We also applied CPLEX versions 6.0 and
6.5 to the initial problem formulation (1) and observed a similar behaviour,
whereas CPLEX produced very good results on the reformulated problem (10).
Cornuéjols and Dawande did solve feasibility instances of size 6 × 50 by a group
relaxation approach. Their computing times are a factor of 3-10 slower than
ours. No previous computational results for instances of size 7 × 60 have, to our
knowledge, been reported.
If we consider the number of nodes that we enumerate in the three strategies, we note that branching on the thinnest unit vector (in λ-space) yields the fewest nodes for all instances except instance M12, which is a feasible instance. We do, however, need to determine the unit vector that yields the
thinnest direction at each node of the tree, which explains the longer computing
times. Branching on unit vectors in the predetermined order j = n − m, . . . , 1,
also requires fewer nodes for most instances than the linear programming based
branching on the variables λj . In terms of computing times, linear program-
ming based branch-and-bound is for most instances the fastest strategy, but
does not differ too much from the times needed for branching on unit vectors
e_j, j = n − m, . . . , 1. This indicates that integral branching is an interesting strategy, in particular if we can find reasonably good branching directions quickly,
as in the third strategy. In our case it seems as if the unit vectors in λ-space
yield thin branching directions. To investigate this we applied the generalized
basis reduction algorithm of Lovász and Scarf [9] to our polytope P. The reduced basis vectors yielded thinner directions than the strategy “thinnest e_j”
in only about 6% of the cases for the instances of size 5 × 40. This implies that
the unit vectors in λ-space, in some order, basically form a reduced basis in the
Lovász-Scarf sense. The computations involved in determining a Lovász-Scarf
reduced basis are fairly time consuming. For a problem of dimension 5 × 40,
at the root node of the tree, one has to solve at least 100 linear programs to
determine the basis. At each level of the tree, the number of linear programs solved per node decreases as the dimension of the subproblems decreases. If the unit basis generated bad search directions, a heuristic version of the Lovász-Scarf algorithm would be a possibility.
The algorithm by Aardal, Hurkens, and Lenstra [1] was primarily designed to solve feasibility problems, but with simple adaptations it can be used to solve the optimization version (3) of the market split instances as well. Below, we report on
the results obtained by using three different strategies to solve the optimization
version. All strategies are based on linear programming based branch-and-bound.
Σ_{j=1}^{n} a_{ij} x_j + s_i − w_i = d_i, 1 ≤ i ≤ m,   Σ_{i=1}^{m} (s_i + w_i) = v? (11)
These feasibility problems are then reformulated as problems of type (10) using
the algorithm of Aardal et al. For each of these feasibility problems we apply
linear programming based branch-and-bound on the variables λj . Here, we also
investigate the influence of the choice of objective function on the search that CPLEX is performing. In Strategy 1 we use the objective function max Σ_{j=1}^{n−m} λ_j.
Strategy 2: This is the same as Strategy 1 except that the objective function is a
perturbation of the objective function zero. Here we sketch the principle of the perturbation: the perturbed objective function is constructed by perturbing the variables of the linear programming dual as follows. Notice
that the number of constraints in the linear relaxation of the reformulation (10)
of the feasibility problem (11) is p = 2n + 2m; we have 2n constraints corre-
sponding to the upper and lower bounds on the x-variables, and 2m constraints
corresponding to the nonnegativity requirements on the slack variables si and
w_i, 1 ≤ i ≤ m. Let ε = 10^{−6} and, for i = 1, . . . , p, let Z_i be drawn uniformly and independently from the interval [0, 1]. Let δ_i = εZ_i. If the dual variable
yi ≤ 0 in the original formulation we let yi ≤ δi , and if yi ≥ 0 we let yi ≥ −δi .
For yi such that yi ≤ δi , make the substitution Yi = yi − δi , and for yi ≥ −δi
we substitute yi by Yi = yi + δi . This substitution implies a perturbation of the
primal objective function.
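Our reading of this construction can be sketched as follows; the function name and the restriction to the bound rows are our own simplification, not the authors' code:

```python
import random

# Hedged sketch of Strategy 2's perturbation (our reading of the text): each
# bound row i gets delta_i = eps * Z_i with Z_i uniform on [0, 1]; since each
# variable x_j appears in exactly one lower- and one upper-bound row, pushing
# the deltas through the dual leaves x_j with the tiny objective coefficient
# delta_up(j) - delta_lo(j) in place of zero, breaking ties and symmetry.
def perturbed_objective(n_vars, eps=1e-6, seed=0):
    rng = random.Random(seed)
    delta_lo = [eps * rng.random() for _ in range(n_vars)]
    delta_up = [eps * rng.random() for _ in range(n_vars)]
    return [du - dl for dl, du in zip(delta_lo, delta_up)]
```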
From the results in Table 2 we can conclude that instances of sizes up to 7×60 are
relatively easy to solve to optimality after using the reformulation of the problem
implied by the algorithm of Aardal, Hurkens, and Lenstra [1]. This represents a
large improvement over earlier results, where the largest optimization instances
had dimension 4×30, see Cornuéjols and Dawande [2]. If we consider the number
of nodes that we enumerate when applying linear programming based branch-
and-bound on the variables λj , we observe that this number is significantly
smaller than the number 2^{αn} for α between 0.6 and 0.7 that Cornuéjols and Dawande observed when applying branch-and-bound on the x_j-variables. For instances of size 4 × 30 they enumerated between 10^6 and 2 × 10^6 nodes. For the
same number of enumeration nodes we can solve instances of more than twice
that size. We can also observe that solving the reformulated optimization version
(Strategy 3) instead of a sequence of feasibility problems (Strategies 1 and 2) is
more time consuming in most cases. One reason is that the optimum objective
value is small, so the number of problems we need to solve in Strategies 1 and 2
is small, although for the infeasible instances it is greater than one. If one expects
the optimum value to be large and no reasonable bounds are known, then it is
probably better to consider Strategy 3.
In addition to the instances we report on here, we also generated another five instances of size 7 × 60. All these instances were feasible, so we decided not to report on those computations. For the size 6 × 50 we also had
to generate quite a few instances to obtain infeasible ones. This motivated us
to investigate whether the relation n = 10(m − 1), as Cornuéjols and Dawande
suggested, is actually likely to produce infeasible instances if we keep all other
parameters as they suggested. Our probabilistic analysis is presented in the next
section.
Pr[Z_i(S) = 0] = Pr[Y_i(S) = −U_i] = Σ_{k=0}^{Q−1} Pr[Y_i(S) = −k/Q]. (13)
We can compute this probability exactly using the probability generating function of Y_i(S), see Section 4.2, or give an approximation using the normal distribution as described in Section 4.3. In either case, we obtain an expression Pr[Z_i(S) = 0] = q(n, D, |S|), i.e., the probability that the x induced by S defines a solution for (Ax)_i = d_i depends only on n, D, and the size of S. The probability that S induces a solution for Ax = d is given by q(n, D, |S|)^m. The expected
number of solutions is derived by summing over all subsets S, i.e.,
E[#solutions] = Σ_{S⊂{1,...,n}} q(n, D, |S|)^m = Σ_{s=0}^{n} C(n, s) q(n, D, s)^m. (14)
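For the case p = 1/2, equation (14) can be evaluated exactly by convolving the distributions of the column sums inside and outside S. This is a sketch assuming coefficients uniform on {0, . . . , D − 1} with D = 100 (the Cornuéjols-Dawande setup, an assumption here); it uses the observation that Σ_{j∈S} a_{ij} = ⌊(1/2)Σ_j a_{ij}⌋ holds exactly when the difference of the two partial sums lies in {−1, 0}.

```python
from math import comb

D = 100  # assumed coefficient range {0, ..., D-1}

def sum_distributions(k_max):
    # dists[k][v] = Pr[a_1 + ... + a_k = v] for i.i.d. uniform a_j on {0..D-1}.
    dists = [[1.0]]
    for _ in range(k_max):
        prev = dists[-1]
        cur = [0.0] * (len(prev) + D - 1)
        for v, p in enumerate(prev):
            for a in range(D):
                cur[v + a] += p / D
        dists.append(cur)
    return dists

def expected_solutions(n, m):
    # Equation (14): sum over subset sizes s of C(n, s) * q(n, D, s)^m, where
    # q(n, D, s) = Pr[U - V in {-1, 0}], U the sum over the s columns of S and
    # V the sum over the remaining n - s columns.
    dists = sum_distributions(n)
    total = 0.0
    for s in range(n + 1):
        U, V = dists[s], dists[n - s]
        q = sum(pu * ((V[u] if u < len(V) else 0.0) +
                      (V[u + 1] if u + 1 < len(V) else 0.0))
                for u, pu in enumerate(U))
        total += comb(n, s) * q ** m
    return total
```

For n = 20 and m = 4 this should reproduce the first "gen" entry of Table 3, about 4 · 10^{-4}.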
G_{a_{ij}}(x) = E[x^{a_{ij}}] = (1/D) Σ_{i=0}^{D−1} x^i = (1/D)(1 + x + · · · + x^{D−1}) = (1/D) · (x^D − 1)/(x − 1). (15)
a_j = Σ_{k=0}^{min{T, ⌊j/D⌋}} (−1)^k C(T, k) C(j − Dk + T − 1, j − Dk). (20)
Pr[Z_i(S) = 0] = q(n, D, |S|) = (1/D^n) (a_{(D−1)(n−|S|)} + a_{(D−1)(n−|S|)−1}). (21)
For p ≠ 1/2, we use (20) to compute the coefficients of each factor in expression (18) and derive the c_j-coefficients by convolution of the two power series obtained.
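Formula (20) is easy to cross-check against a direct expansion of (1 + x + · · · + x^{D−1})^T; the function names below are ours.

```python
from math import comb

def a_coeff(j, T, D):
    # Formula (20): the coefficient of x^j in (1 + x + ... + x^{D-1})^T,
    # obtained from (1 - x^D)^T * (1 - x)^{-T}.
    return sum((-1) ** k * comb(T, k) * comb(j - D * k + T - 1, j - D * k)
               for k in range(min(T, j // D) + 1))

def poly_power_coeffs(T, D):
    # Direct expansion of (1 + x + ... + x^{D-1})^T, for cross-checking.
    coeffs = [1]
    for _ in range(T):
        new = [0] * (len(coeffs) + D - 1)
        for i, c in enumerate(coeffs):
            for r in range(D):
                new[i + r] += c
        coeffs = new
    return coeffs
```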
(D^2 − 1)n/48. For rational p = P/Q, the probability that subset S induces a solution for row i is given by Pr[Z_i(S) = 0] = Pr[1/(2Q) − 1 ≤ Y_i(S) ≤ 1/(2Q)].
We can approximate this expression, using the Central Limit Theorem [5], by
the normal distribution giving
Pr[1/(2Q) − 1 < Y_i(S) < 1/(2Q)] ≈ (1/√(2π)) ∫_α^β exp(−u²/2) du, (22)
with
α = (1/(2Q) − 1 − E[Y_i(S)]) / √Var[Y_i(S)], and β = (1/(2Q) − E[Y_i(S)]) / √Var[Y_i(S)]. (23)
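For p = 1/2 (so Q = 2), the approximation (22)-(23) can be sketched as follows, assuming Y_i(S) = Σ_{j∈S} a_{ij} − (1/2) Σ_j a_{ij}, which is consistent with the variance (D² − 1)n/48 quoted above:

```python
from math import comb, erf, sqrt

def q_normal(n, s, D=100):
    # Normal approximation (22)-(23) for p = 1/2, i.e. Q = 2.
    mu = (s - n / 2) * (D - 1) / 2            # E[Y_i(S)]
    sigma = sqrt(n * (D * D - 1) / 48)        # sqrt(Var[Y_i(S)])
    Phi = lambda t: 0.5 * (1.0 + erf(t / sqrt(2)))
    alpha = (1 / 4 - 1 - mu) / sigma
    beta = (1 / 4 - mu) / sigma
    return Phi(beta) - Phi(alpha)

def expected_solutions_approx(n, m, D=100):
    # Equation (14) with q replaced by its normal approximation.
    return sum(comb(n, s) * q_normal(n, s, D) ** m for s in range(n + 1))
```

The resulting values should track the "approx" columns of Table 3, e.g. roughly 4 · 10^{-4} for n = 20, m = 4.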
Pr[#solutions = 0] ≈ Π_{S⊂{1,...,n}} Pr[S is no solution] (24)
= exp log Π_{S⊂{1,...,n}} (1 − q(n, D, |S|)^m) (25)
= exp Σ_{S⊂{1,...,n}} log(1 − q(n, D, |S|)^m) (26)
≈ exp(−Σ_{S⊂{1,...,n}} q(n, D, |S|)^m) (27)
greater than 0.9. This confirms our experience with the instances we drew for our
computational experiments reported on in Section 3. If one wants to generate
infeasible instances for m ≥ 6 with high probability, then one needs to generate
slightly fewer columns for a given value of m than the relation n = 10(m − 1)
indicates.
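The effect can be quantified with the heuristic (24)-(27), Pr[instance infeasible] ≈ exp(−E[#solutions]). Plugging in the m = 6 "gen" column of Table 3 shows why n = 50 = 10(m − 1) is too large to produce infeasible instances reliably:

```python
from math import exp

# Expected solution counts from Table 3 (m = 6), fed through the heuristic
# Pr[infeasible] ~ exp(-E[#solutions]) of (24)-(27).
expected = {45: 0.0391, 46: 0.0733, 47: 0.1375, 50: 0.9144}
prob_infeasible = {n: exp(-E) for n, E in expected.items()}
# n = 45 gives about 0.96, while n = 50 gives only about 0.40
```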
Table 3. The expected number of solutions computed exactly using the prob-
ability generating function (gen) and approximated by the normal distribution
(approx)
n        m = 4               m = 5               m = 6               m = 7               m = 8
         gen      approx     gen      approx     gen      approx     gen      approx     gen      approx
20 0.0004 0.0004 2.3091e-6 2.3692e-6 1.3071e-8 1.3519e-8 7.5214e-11 7.8457e-11 4.3865e-13 4.6155e-13
21 0.0007 0.0008 4.0743e-6 4.1700e-6 2.2299e-8 2.2964e-8 1.2316e-10 1.2759e-10 6.8386e-13 7.1256e-13
22 0.0014 0.0014 7.2921e-6 7.4623e-6 3.9302e-8 4.0507e-8 2.1514e-10 2.2339e-10 1.1927e-12 1.2479e-12
23 0.0025 0.0026 0.00001 0.00001 6.8320e-8 7.0205e-8 3.6171e-10 3.7380e-10 1.9270e-12 2.0023e-12
24 0.0046 0.0047 0.00002 0.00002 1.2125e-7 1.2461e-7 6.3449e-10 6.5644e-10 3.3604e-12 3.5006e-12
25 0.0086 0.0087 0.00004 0.00004 2.1378e-7 2.1924e-7 1.0881e-9 1.1219e-9 5.5765e-12 5.7801e-12
26 0.0159 0.0161 0.00008 0.00008 3.8208e-7 3.9177e-7 1.9192e-9 1.9797e-9 9.7509e-12 1.0121e-11
27 0.0295 0.0298 0.0001 0.0001 6.8118e-7 6.9740e-7 3.3415e-9 3.4385e-9 1.6516e-11 1.7079e-11
28 0.0548 0.0555 0.0003 0.0003 1.2257e-6 1.2544e-6 5.9298e-9 6.1020e-9 2.9002e-11 3.0013e-11
29 0.1023 0.1035 0.0005 0.0005 2.2050e-6 2.2540e-6 1.0449e-8 1.0733e-8 4.9914e-11 5.1513e-11
30 0.1913 0.1936 0.0009 0.0009 3.9929e-6 4.0798e-6 1.8657e-8 1.9159e-8 8.8096e-11 9.0939e-11
31 0.3585 0.3626 0.0016 0.0016 7.2371e-6 7.3878e-6 3.3199e-8 3.4045e-8 1.5357e-10 1.5819e-10
32 0.6732 0.6808 0.0030 0.0030 0.00001 0.00001 5.9626e-8 6.1124e-8 2.7250e-10 2.8068e-10
33 1.2668 1.2805 0.0055 0.0055 0.00002 0.00002 1.0696e-7 1.0953e-7 4.8000e-10 4.9361e-10
34 2.3879 2.4131 0.0102 0.0103 0.00004 0.00004 1.9319e-7 1.9774e-7 8.5633e-10 8.8043e-10
35 4.5090 4.5552 0.0189 0.0192 0.00008 0.00008 3.4895e-7 3.5685e-7 1.5215e-9 1.5623e-9
36 8.5279 8.6128 0.0353 0.0358 0.0001 0.0001 6.3353e-7 6.4759e-7 2.7288e-9 2.8010e-9
37 16.1533 16.3098 0.0659 0.0668 0.0003 0.0003 1.1512e-6 1.1758e-6 4.8844e-9 5.0085e-9
38 30.6410 30.9297 0.1234 0.1250 0.0005 0.0005 2.1000e-6 2.1440e-6 8.8038e-9 9.0240e-9
39 58.2021 58.7363 0.2314 0.2343 0.0009 0.0009 3.8357e-6 3.9137e-6 1.5859e-8 1.6241e-8
40 110.6973 111.6878 0.4345 0.4400 0.0017 0.0018 7.0282e-6 7.1681e-6 2.8719e-8 2.9400e-8
41 210.7999 212.6397 0.8174 0.8274 0.0032 0.0033 0.00001 0.00001 5.2021e-8 5.3216e-8
42 401.8960 405.3196 1.5399 1.5582 0.0060 0.0061 0.00002 0.00002 9.4624e-8 9.6756e-8
43 767.0835 773.4649 2.9050 2.9388 0.0112 0.0114 0.00004 0.00004 1.7224e-7 1.7601e-7
44 1465.6670 1477.5812 5.4876 5.5499 0.0209 0.0212 0.00008 0.00008 3.1459e-7 3.2134e-7
45 2803.3082 2825.5860 10.3793 10.4945 0.0391 0.0397 0.0001 0.0002 5.7517e-7 5.8721e-7
46 5366.9806 5408.6988 19.6556 19.8690 0.0733 0.0743 0.0003 0.0003 1.0545e-6 1.0761e-6
47 1.0285e4 1.0363e4 37.2658 37.6617 0.1375 0.1393 0.0005 0.0005 1.9356e-6 1.9744e-6
48 1.9726e4 1.9874e4 70.7327 71.4684 0.2582 0.2617 0.0010 0.0010 3.5613e-6 3.6313e-6
49 3.7868e4 3.8145e4 134.3989 135.7680 0.4856 0.4920 0.0018 0.0018 6.5607e-6 6.6867e-6
50 7.2753e4 7.3275e4 255.6339 258.1855 0.9144 0.9262 0.0033 0.0034 0.00001 0.00001
51 1.3989e5 1.4087e5 486.7099 491.4721 1.7238 1.7457 0.0062 0.0062 0.00002 0.00002
52 2.6918e5 2.7103e5 927.5451 936.4449 3.2537 3.2940 0.0116 0.0117 0.00004 0.00004
53 5.1834e5 5.2184e5 1769.2812 1785.9346 6.1478 6.2225 0.0216 0.0220 0.00008 0.00008
54 9.9884e5 1.0054e6 3377.8566 3409.0581 11.6287 11.7673 0.0405 0.0411 0.0001 0.0001
55 1.9261e6 1.9386e6 6454.3702 6512.8979 22.0181 22.2758 0.0761 0.0772 0.0003 0.0003
56 3.7165e6 3.7402e6 1.2343e4 1.2453e4 41.7308 42.2104 0.1429 0.1449 0.0005 0.0005
57 7.1757e6 7.2207e6 2.3622e4 2.3830e4 79.1668 80.0606 0.2687 0.2724 0.0009 0.0009
58 1.3863e7 1.3948e7 4.5245e4 4.5635e4 150.3239 151.9916 0.5058 0.5127 0.0017 0.0017
59 2.6799e7 2.6961e7 8.6723e4 8.7457e4 285.6905 288.8057 0.9531 0.9658 0.0032 0.0033
60 5.1835e7 5.2143e7 1.6634e5 1.6772e5 543.4188 549.2448 1.7978 1.8214 0.0060 0.0061
61 1.0031e8 1.0090e8 3.1928e5 3.2189e5 1034.5026 1045.4103 3.3943 3.4383 0.0112 0.0114
62 1.9424e8 1.9535e8 6.1324e5 6.1817e5 1970.9501 1991.3942 6.4148 6.4966 0.0211 0.0214
63 3.7629e8 3.7843e8 1.1786e6 1.1879e6 3757.9874 3796.3444 12.1341 12.2862 0.0395 0.0401
64 7.2936e8 7.3343e8 2.2666e6 2.2842e6 7170.6817 7242.7199 22.9726 23.2561 0.0733 0.0754
65 1.4144e9 1.4221e9 4.3616e6 4.3951e6 1.3692e4 1.3828e4 43.5290 44.0578 0.1397 0.1417
66 2.7440e9 2.7589e9 8.3980e6 8.4614e6 2.6164e4 2.6419e4 82.5476 83.5352 0.2629 0.2666
67 5.3262e9 5.3545e9 1.6179e7 1.6299e7 5.0029e4 5.0511e4 156.6664 158.5126 0.4952 0.5021
68 1.0343e10 1.0396e10 3.1186e7 3.1412e7 9.5728e4 9.6634e4 297.5662 301.0208 0.9337 0.9465
69 2.0092e10 2.0196e10 6.0146e7 6.0581e7 1.8328e5 1.8499e5 565.6096 572.0801 1.7619 1.7858
70 3.9050e10 3.9248e10 1.1606e8 1.1688e8 3.5114e5 3.5437e5 1075.8876 1088.0185 3.3274 3.3720
71 7.5923e10 7.6305e10 2.2406e8 2.2563e8 6.7315e5 6.7925e5 2047.9738 2070.7375 6.2893 6.3723
72 1.4767e11 1.4840e11 4.3279e8 4.3578e8 1.2912e6 1.3027e6 3901.0477 3943.8021 11.8969 12.0517
73 2.8734e11 2.8875e11 8.3636e8 8.4207e8 2.4781e6 2.4999e6 7435.8186 7516.1883 22.5215 22.8106
74 5.5932e11 5.6202e11 1.6170e9 1.6278e9 4.7588e6 4.8001e6 1.4183e4 1.4334e4 42.6664 43.2066
75 1.0891e12 1.0943e12 3.1277e9 3.1484e9 9.1434e6 9.2218e6 2.7068e4 2.7354e4 80.8888 81.8993
76 2.1215e12 2.1314e12 6.0523e9 6.0920e9 1.7577e7 1.7725e7 5.1694e4 5.2231e4 153.4610 155.3527
77 4.1339e12 4.1531e12 1.1717e10 1.1792e10 3.3807e7 3.4089e7 9.8781e4 9.9795e4 291.3437 294.8880
78 8.0580e12 8.0948e12 2.2693e10 2.2837e10 6.5057e7 6.5593e7 1.8887e5 1.9078e5 553.4833 560.1297
79 1.5712e13 1.5783e13 4.3968e10 4.4245e10 1.2525e8 1.2627e8 3.6133e5 3.6494e5 1052.1712 1064.6447
80 3.0646e13 3.0782e13 8.5223e10 8.5754e10 2.4126e8 2.4319e8 6.9164e5 6.9847e5 2001.4513 2024.8798
[Figure: the expected number of solutions (ExpHits) as a function of n and m, shown from two viewpoints.]
Acknowledgements
We would like to thank Bram Verweij for his assistance in implementing our
integral branching algorithm using his enumeration scheme [12]. We also want
to thank David Applegate and Bill Cook for their many useful comments on our
work and for allowing us to use their DEC Alphaservers.
References
1. K. Aardal, C. Hurkens, A. K. Lenstra (1998). Solving a system of linear diophantine equations with lower and upper bounds on the variables. Research report UU-CS-1998-36, Department of Computer Science, Utrecht University.
2. G. Cornuéjols, M. Dawande (1998). A class of hard small 0-1 programs. In: R. E.
Bixby, E. A. Boyd, R. Z. Rı́os-Mercado (eds.) Integer Programming and Combinato-
rial Optimization, 6th International IPCO Conference. Lecture Notes in Computer
Science 1412, pp 284–293, Springer-Verlag, Berlin Heidelberg.
3. CPLEX 6.0 Documentation Supplement (1998). ILOG Inc., CPLEX Division, In-
cline Village NV.
4. CPLEX 6.5 Documentation Supplement (1999). ILOG Inc., CPLEX Division, In-
cline Village NV.
5. G. R. Grimmett, D. R. Stirzaker (1982). Probability and Random Processes, Oxford
University Press, Oxford.
6. A. K. Lenstra, H. W. Lenstra, Jr., L. Lovász (1982). Factoring polynomials with
rational coefficients. Mathematische Annalen 261, 515–534.
7. H. W. Lenstra, Jr. (1983). Integer programming with a fixed number of variables.
Mathematics of Operations Research 8, 538–548.
8. LiDIA – A library for computational number theory. TH Darmstadt / Universität
des Saarlandes, Fachbereich Informatik, Institut für Theoretische Informatik.
http://www.informatik.th-darmstadt.de/pub/TI/LiDIA
1 Introduction
Rounding of linear relaxations is today one of the most effective techniques for designing approximation algorithms with proven worst-case bounds for discrete optimization problems. The quality characteristics of a rounding-based algorithm are highly dependent on the choice of an
integer program reformulation and a rounding scheme. When applying popular
random roundings one encounters the additional problem of derandomization.
This problem may prove to be extremely difficult or quite intractable. For exam-
ple, the widely known derandomization method of conditional probabilities [1]
succeeds, as is easily seen, only under some very special conditions; in particu-
lar, if the relaxation has a subset of active variables that determine the optimal
values of the remaining ones and the optimization problem with respect to these
active variables is unconstrained. If one adds even a single cardinality constraint
connecting the active variables, the method fails. In this paper we present a
simple deterministic (“pipage”) rounding method to tackle some problems of
this sort. So far the method has proved applicable to two well-known
This research was partially supported by the Russian Foundation for Basic Research,
grant 97-01-00890.
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 17–30, 1999.
c Springer-Verlag Berlin Heidelberg 1999
18 Alexander A. Ageev and Maxim I. Sviridenko
In the maximum coverage problem (MCP for short), given a family F = {Sj :
j ∈ J} of subsets of a set I = {1, . . . , n} with associated nonnegative weights
wj and a positive integer p, it is required to find a subset X ⊆ I with |X| = p so
as to maximize the total weight of the sets in F having nonempty intersections
with X. Polynomial-time solvability of MCP would clearly imply that of the set cover problem, and so MCP is NP-hard. In a sense, MCP can be treated as an inverse of the set cover problem and, like the latter, has numerous applications (see, e.g., [10]). It is well known that a simple greedy algorithm solves MCP approximately within a factor of 1 − (1 − 1/p)^p of the optimum (Cornuejols, Fisher and
Nemhauser [2]). Feige [4] proves that no polynomial-time algorithm can have a better
performance guarantee provided that P ≠ NP. Another result concerning MCP is
due to Cornuejols, Nemhauser and Wolsey [3] who prove that the greedy algo-
rithm almost always finds an optimal solution to MCP in the case of two-element
sets. We show below that MCP can be solved in polynomial time approximately
within a factor of 1 − (1 − 1/k)k of the optimum, where k = max{|Sj | : j ∈ J}.
Although 1 − (1 − 1/k)^k, like 1 − (1 − 1/p)^p, can be arbitrarily close to 1 − 1/e,
the parameter k looks more interesting: for each fixed k (k = 2, 3, . . .) MCP
still remains NP-hard. For example, in the case k = 2, which is the inverse of
the vertex cover problem, the performance guarantee of the greedy algorithm
has the same value of 1 − 1/e [3], whereas our algorithm finds a solution within
a factor of 3/4 of the optimum. In summary, the performance guarantee of our
algorithm beats that of the greedy algorithm for each bounded k, and coincides
with it when k is unbounded. Note also that our
result is similar in a sense to the well-known result [9] that the set cover problem
can be approximated in polynomial time within a factor of r of the optimum,
where r is the maximum number of sets containing an element.
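For concreteness, the greedy algorithm discussed above repeatedly picks the element whose selection covers the largest additional weight of still-uncovered sets. A minimal sketch (function and variable names are ours, not the paper's):

```python
def greedy_mcp(sets, weights, p, n):
    """Greedy for maximum coverage: choose p elements of {0, ..., n-1},
    each time taking the element that covers the largest additional
    weight among the sets S_j not yet hit."""
    chosen = set()
    uncovered = set(range(len(sets)))  # indices j of sets not yet hit
    for _ in range(p):
        best, best_gain = None, -1
        for i in range(n):
            if i in chosen:
                continue
            gain = sum(weights[j] for j in uncovered if i in sets[j])
            if gain > best_gain:
                best, best_gain = i, gain
        chosen.add(best)
        uncovered = {j for j in uncovered if best not in sets[j]}
    covered = sum(weights[j] for j in range(len(sets)) if sets[j] & chosen)
    return chosen, covered
```

On the instance sets = [{0,1}, {1,2}, {2,3}] with unit weights and p = 1, the sketch picks element 1, covering weight 2.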
Let J = {1, . . . , m}. MCP can be equivalently reformulated as a constrained
version of MAX SAT over variables y1 , . . . , yn with m clauses C1 , . . . , Cm such
that Cj is the collection of yi with i ∈ Sj and has weight wj . It is required to
assign “true” values to exactly p variables so as to maximize the total weight of
satisfied clauses. Furthermore, analogously to MAX SAT (see, e. g. [6]), MCP
can be stated as the following integer program:
max Σ_{j=1}^m wj zj (4)

s. t. Σ_{i∈Sj} xi ≥ zj , j = 1, . . . , m, (5)

Σ_{i=1}^n xi = p, (6)

xi ∈ {0, 1}, i = 1, . . . , n, (7)

0 ≤ zj ≤ 1, j = 1, . . . , m. (8)
as desired.
To check property (B) it suffices to observe that in this case the function
ϕ(ε, x, i, j) is convex in ε: it is a quadratic polynomial in ε whose leading
coefficient is nonnegative for each pair of indices i and j and each x ∈ [0, 1]^n.
Thus by concretizing the general scheme described in the introduction we obtain
a (1 − (1 − 1/k)k )-approximation algorithm for the maximum coverage problem.
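The pipage step itself is simple to state in code: while two components of x are fractional, move mass between them along e_i − e_j (which preserves the cardinality constraint (6)) to whichever endpoint of the feasible interval does not decrease F; convexity of ϕ guarantees that one endpoint works. The sketch below is our own rendering under these assumptions, not the authors' implementation:

```python
def coverage_value(x, sets, weights):
    # F(x) = sum_j w_j * (1 - prod_{i in S_j} (1 - x_i))
    total = 0.0
    for S, w in zip(sets, weights):
        prod = 1.0
        for i in S:
            prod *= 1.0 - x[i]
        total += w * (1.0 - prod)
    return total

def pipage_round(x, sets, weights, tol=1e-9):
    """Round a fractional x with integral sum to a 0/1 vector without
    decreasing F, handling two fractional coordinates at a time."""
    x = list(x)
    while True:
        frac = [i for i, v in enumerate(x) if tol < v < 1.0 - tol]
        if len(frac) < 2:
            break
        i, j = frac[0], frac[1]
        # Feasible steps keeping x in [0,1]^n and the sum unchanged.
        eps1 = min(x[i], 1.0 - x[j])       # move x_i down, x_j up
        eps2 = min(1.0 - x[i], x[j])       # move x_i up, x_j down
        cand_down = x[:]; cand_down[i] -= eps1; cand_down[j] += eps1
        cand_up = x[:];   cand_up[i] += eps2;  cand_up[j] -= eps2
        # By convexity of phi, one endpoint does not decrease F.
        x = max(cand_down, cand_up,
                key=lambda y: coverage_value(y, sets, weights))
    return [round(v) for v in x]
```

Each step makes at least one coordinate integral, so at most n steps are needed.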
We now demonstrate that the integrality gap of (4)–(8) can be arbitrarily
close to (1 − (1 − 1/k)k ) and thus the rounding scheme described above is best
Approximation Algorithms for Maximum Coverage and Max Cut 21
possible for the integer program (4)–(8). Set n = kp, wj = 1 for all j and let F be
the collection of all subsets of {1, . . . , n} with cardinality k. Then, by symmetry,
any binary vector with exactly p units maximizes L(x) subject to (6)–(7) and
so the optimal value of this problem is equal to L* = C_n^k − C_{n−p}^k (here
C_n^k denotes the binomial coefficient "n choose k"). On the other
hand, the vector with all components equal to 1/k provides an optimal solution
of weight L0 = C_n^k to the linear relaxation in which the objective is to maximize
L(x) subject to (6) and 0 ≤ xi ≤ 1 for all i. Now it is easy to derive an upper
bound on the ratio
L*/L0 = (C_n^k − C_{n−p}^k) / C_n^k
      = 1 − [(n−p)! / (k!(n−p−k)!)] · [k!(n−k)! / n!]
      = 1 − [(n−p)/n] · [(n−p−1)/(n−1)] · · · [(n−p−k+1)/(n−k+1)]
      ≤ 1 − [(n−p)/n] · [(n−p−1)/n] · · · [(n−p−k+1)/n]
      = 1 − (1 − 1/k)(1 − 1/k − 1/n) · · · (1 − 1/k − (k−1)/n)
      ≤ 1 − (1 − 1/k − (k−1)/n)^k,

which tends to 1 − (1 − 1/k)^k when k is fixed and n → ∞.
Remark 1. The algorithm and the argument above can easily be adapted to
yield the same performance guarantee for the more general problem
in which the constraint (6) is replaced by the constraints
Σ_{i∈It} xi = pt , t = 1, . . . , r.
Remark 2. It can be easily observed that from the very beginning (and with
the same ultimate result) we could consider objective functions of the following
more general form:
F(x) = Σ_{j=1}^m wj (1 − Π_{i∈Sj} (1 − xi)) + Σ_{t=1}^l ut (1 − Π_{i∈Rt} xi),
where Sj and Rt are arbitrary subsets of {1, . . . , n}, and ut , wj are nonnegative
weights. The problem with such objective functions can be reformulated as the
constrained MAX SAT in which each clause either contains no negations or
contains nothing but negations.
over all binary vectors x with exactly p units. Notice first that the function F
has property (B) for the same reason as in the previous section. Consider
the following integer program:
max Σ_{i<j} wij zij (10)
Here we take as a nice relaxation the continuous relaxation of (15)–(17) (in which
(17) is replaced by 0 ≤ xi ≤ 1 for all i ∈ V (G)). Let x be an optimal solution to
the nice relaxation. We claim that F (x) ≥ 1/2L(x). Indeed, it suffices to check
that

xi + xj − 2xi xj ≥ 1/2 min{xi + xj , 2 − xi − xj} (18)

for all pairs i, j with i < j. Fix some i < j and assume first that xi + xj ≤ 1.
Then (18) is equivalent to xi + xj ≥ 4xi xj , which follows from
xi + xj ≥ (xi + xj)^2 ≥ 4xi xj .
The case when xi + xj ≥ 1 is symmetric to the above, as is easily seen via the
substitution xi = 1 − yi , xj = 1 − yj . Thus property (A) with C = 1/2
does hold, and we obtain a 1/2-approximation algorithm for MCGS.
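The two-case inequality underlying this argument, xi + xj − 2xi xj ≥ 1/2 min{xi + xj, 2 − xi − xj}, is also easy to confirm numerically on a grid; a quick sanity check of our own, not part of the paper:

```python
def check_pairwise_bound(steps=50):
    # Verify x + y - 2*x*y >= 0.5 * min(x + y, 2 - x - y) on a grid of [0,1]^2.
    for a in range(steps + 1):
        for b in range(steps + 1):
            x, y = a / steps, b / steps
            lhs = x + y - 2 * x * y
            rhs = 0.5 * min(x + y, 2 - x - y)
            if lhs < rhs - 1e-12:
                return False
    return True
```

Equality holds, for example, at x = y = 1/2, matching the tight instance analyzed next.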
Consider an instance of MCGS in which 2p ≤ n, wij = 1 if both i and j
lie in a fixed subset of V (G) having cardinality 2p, and wij = 0 otherwise. The
optimal value of this instance is obviously equal to p2 , whereas the optimal value
of the linear relaxation is p(p − 1)/2. Hence
p^2 / (p(p − 1)/2) = 2p^2 / (p^2 − p) = 2 / (1 − 1/p),
which can be arbitrarily close to 2. Thus, again, our rounding scheme is best
possible for the integer program (10)–(14).
In this section we illustrate how the “pipage” technique can be extended to apply
to a natural generalization of MCGS.
In the maximum k-cut problem with given sizes of parts (k-MCGS), given a
complete undirected graph G with V (G) = {1, . . . , n}, nonnegative edge weights
wij , and a collection of nonnegative numbers {p1 , . . . , pk }, it is required to find a
partition {P1 , . . . , Pk } of V (G) with |Pt | = pt for each t, maximizing the function
Σ_{r<s} Σ_{i∈Pr} Σ_{j∈Ps} wij .
authors suggest a 2-approximation algorithm for the similar special case of this
problem.
We further present an approximation algorithm which produces a feasible so-
lution to k-MCGS of weight within a factor of 1/2 of the optimum. The problem
can be reformulated as the following nonlinear integer program:
max F(x) = Σ_{i<j} wij (1 − Σ_{t=1}^k xit xjt) (19)

s. t. Σ_{i=1}^n xit = pt , t = 1, . . . , k, (20)

Σ_{t=1}^k xit = 1, i ∈ V(G), (21)

xit ∈ {0, 1}, i ∈ V(G), t = 1, . . . , k. (22)
We claim that F (x) ≥ 1/2L(x) for all feasible x = (xit ). Fix a pair of indices i
and j with i < j. It suffices to show that
1 − Σ_{t=1}^k xit xjt ≥ 1/4 Σ_{t=1}^k min{xit + xjt , 2 − xit − xjt}. (29)
Indeed, we have already proved in the above section (see (18)) that for each t
xit + xjt − 2xit xjt ≥ 1/2 min{xit + xjt , 2 − xit − xjt }. (30)
Adding together (30) over t from 1 to k and taking into account (27) we obtain
that
2 − 2 Σ_{t=1}^k xit xjt ≥ 1/2 Σ_{t=1}^k min{xit + xjt , 2 − xit − xjt},
which implies (29).
We now describe the "pipage" step. Let x be an optimal solution to the linear
relaxation of (23)–(28) (that is, with (28) replaced by 0 ≤ xjt ≤ 1 and 0 ≤ z^t_ij ≤ 1).
Define the bipartite graph H with the bipartition ({1, . . . , n}, {1, . . . , k}) so
that jt ∈ E(H) if and only if xjt is fractional. Note that (26) and (27) imply
that each vertex of H is either isolated or has degree at least 2. Assume that x
has fractional components. Since H is bipartite it follows that H has a circuit
C of even length. Let M1 and M2 be the matchings of H whose union is the
circuit C. Define a new solution x(ε) by the following rule: if jt is not an edge
of C, then xjt (ε) coincides with xjt , otherwise, xjt (ε) = xjt + ε if jt ∈ M1 , and
xjt (ε) = xjt − ε if jt ∈ M2 .
By definition x(ε) is a feasible solution to the linear relaxation of (23)–(28)
for all ε ∈ [−ε1 , ε2 ], where

ε1 = min{ min_{jt∈M1} xjt , min_{jt∈M2} (1 − xjt) }

and

ε2 = min{ min_{jt∈M1} (1 − xjt), min_{jt∈M2} xjt }.
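One full pipage step on H as just described (find an even circuit among the fractional entries, split it into the matchings M1 and M2, compute ε1 and ε2) can be sketched as follows; the data layout and all names are our own:

```python
def find_cycle(adj):
    """Walk without immediate backtracking until a vertex repeats.
    Valid when every non-isolated vertex has degree >= 2, as the
    fractional part of x guarantees here."""
    start = next(v for v, nb in adj.items() if nb)
    path, pos, prev = [start], {start: 0}, None
    while True:
        cur = path[-1]
        nxt = next(u for u in adj[cur] if u != prev)
        if nxt in pos:
            return path[pos[nxt]:]
        pos[nxt] = len(path)
        path.append(nxt)
        prev = cur

def cycle_step(x):
    """One pipage step: x maps fractional entries (j, t) to values in (0,1).
    Returns the matchings M1, M2 and the step sizes eps1, eps2."""
    adj = {}
    for (j, t) in x:
        adj.setdefault(('j', j), []).append(('t', t))
        adj.setdefault(('t', t), []).append(('j', j))
    cyc = find_cycle(adj)            # even length, since H is bipartite
    def edge(u, v):                  # normalize an edge to a (j, t) pair
        ju, tu = (u, v) if u[0] == 'j' else (v, u)
        return (ju[1], tu[1])
    edges = [edge(cyc[i], cyc[(i + 1) % len(cyc)]) for i in range(len(cyc))]
    M1, M2 = edges[0::2], edges[1::2]   # alternating edges form matchings
    eps1 = min(min(x[e] for e in M1), min(1 - x[e] for e in M2))
    eps2 = min(min(1 - x[e] for e in M1), min(x[e] for e in M2))
    return M1, M2, eps1, eps2
```

Applying ε = −ε1 or ε = ε2 makes at least one entry of x integral, so repeating the step terminates.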
max F(x) = Σ_{j=1}^m wj (1 − Π_{i∈Sj} (1 − xi)) (36)
subject to (33)–(35).
Set k = max{|Sj | : j ∈ J}. Denote by IP [I0 , I1 ] and LP [I0 , I1 ] the integer
program (31)–(35) and its linear relaxation (31)–(34) respectively, subject to the
additional constraints: xi = 0 for i ∈ I0 and xi = 1 for i ∈ I1 where I0 and I1
are disjoint subsets of I. By a solution to IP [I0 , I1 ] and LP [I0 , I1 ] we shall mean
only a vector x, since the optimal values of zj are trivially computed if x is fixed.
We first describe an auxiliary algorithm A which finds a feasible solution
xA to the linear program LP [I0 , I1 ]. The algorithm is divided into two phases.
Since the weight of x̂s does not exceed the weight of the output vector of the
algorithm this would prove that (1 − (1 − 1/k)k ) is the performance guarantee
of the algorithm.
In the following argument for brevity we shall use alternately the sets and
their incidence vectors as the arguments of F .
Indeed, the function F can also be treated as a set function F(X) defined on
all subsets X ⊆ I. It is well known that F(X) is submodular and, consequently,
has the property that
Recall that the integral vector x̂s is obtained from an "almost integral" vector
x^s returned by the algorithm A by replacing its single fractional component
x^s_{is} with zero. It follows that
Let (x^LP, z^LP) denote an optimal solution to LP[I^s_0, I1]. Using (35), (40) and (41)
we finally obtain
F(x̂s) = F(X̂s)
       = F(X̂s ∪ {is}) − (F(X̂s ∪ {is}) − F(X̂s))
       ≥ F(X̂s ∪ {is}) − 1/4 F(I1)                                   (by (40))
       ≥ F(x^s) − 1/4 F(I1)                                          (by (41))
       = Σ_{j∈J1} wj + Σ_{j∈J\J1} wj (1 − Π_{i∈Sj} (1 − x^s_i)) − 1/4 Σ_{j∈J1} wj
       ≥ 3/4 Σ_{j∈J1} wj + (1 − (1 − 1/k)^k) Σ_{j∈J\J1} wj min{1, Σ_{i∈Sj} x^LP_i}   (by (37))
       ≥ (1 − (1 − 1/k)^k) ( Σ_{j∈J1} wj + Σ_{j∈J\J1} wj min{1, Σ_{i∈Sj} x^LP_i} ).
Acknowledgement
The authors wish to thank Refael Hassin for helpful comments and for pointing
out some references.
References
1. N. Alon and J. H. Spenser, The probabilistic method, John Wiley and Sons, New
York, 1992.
2. G. Cornuejols, M. L. Fisher and G. L. Nemhauser, Location of bank accounts to
optimize float: an analytic study exact and approximate algorithms, Management
Science 23 (1977) 789–810.
3. G. Cornuejols, G. L. Nemhauser and L. A. Wolsey, Worst-case and probabilistic
analysis of algorithms for a location problem, Operations Research 28 (1980) 847–
858.
4. U. Feige, A threshold of ln n for approximating set cover, J. of ACM. 45 (1998)
634–652.
5. A. Frieze and M. Jerrum, Improved approximation algorithms for MAX k-CUT
and MAX BISECTION, Algorithmica 18 (1997) 67–81.
6. M. X. Goemans and D. P. Williamson, New 3/4-approximation algorithms for
MAX SAT, SIAM J. Discrete Math. 7 (1994) 656–666.
7. N. Guttmann-Beck and R. Hassin, Approximation algorithms for min-sum p-
clustering, Discrete Appl. Math. 89 (1998) 125–142.
8. N. Guttmann-Beck and R. Hassin, Approximation algorithms for minimum K-cut,
to appear in Algorithmica.
9. D. S. Hochbaum, Approximation algorithms for the set covering and vertex cover
problems, SIAM J. on Computing 11 (1982) 555–556.
10. D. S. Hochbaum, Approximating covering and packing problems: Set Cover, Vertex
Cover, Independent Set, and related problems, in: D. S. Hochbaum, ed., Approxi-
mation algorithms for NP-hard problems ( PWS Publishing Company, New York,
1997) 94–143.
11. S. Khuller, A. Moss and J. Naor, The budgeted maximum coverage problem, to
appear in Inform. Proc. Letters.
12. S. Sahni, Approximate algorithms for the 0–1 knapsack problem, J. of ACM 22
(1975) 115–124.
13. L. A. Wolsey, Maximizing real-valued submodular functions: primal and dual
heuristics for location problems, Math. Oper. Res. 7 (1982) 410–425.
Solving the Convex Cost Integer Dual Network Flow
Problem
1 Introduction
In this paper, we consider the following integer programming problem with convex
costs:
Minimize Σ(i, j)∈Q Fij(wij) + Σi∈P Bi(µi) (1a)
subject to
µi − µj = wij for all (i, j) ∈ Q, (1b)
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 31-44, 1999.
Springer-Verlag Berlin Heidelberg 1999
32 Ravindra K. Ahuja, Dorit S. Hochbaum, and James B. Orlin
using an adaptation of the preflow-push cost-scaling algorithm for the minimum cost
flow problem due to Goldberg and Tarjan [1987] and show that a solution to the
Lagrangian multiplier can be used to solve the dual network flow problem.
Our paper makes the following contributions: (i) We show that the cost functions
of the minimum convex cost network flow problem obtained through the Lagrangian
relaxation technique are piecewise linear convex with integer slopes. Integrality of the
slopes allows us to solve this problem using Goldberg and Tarjan's cost scaling
algorithm for the minimum cost flow problem. (The minimum convex cost network
flow problem with general cost structure cannot be solved by Goldberg and Tarjan’s
cost scaling algorithm.) (ii) We use another special property of the cost function to
implement the cost scaling algorithm in a manner so that the running time of the cost
scaling algorithm is the same as for solving the linear cost minimum cost flow
problem. The running time of our algorithm is O(nm log(n2/m) log(nU)), which gives
the best running time to solve the dual network flow problem. According to
McCormick [1998], the cancel and tighten algorithm due to Karzanov and
McCormick [1997] can be used to solve the convex cost integer dual network flow
problem in O(nm log n log(nU)) time. Our algorithm improves this cancel and tighten
approach for all network densities.
We use the Lagrangian relaxation technique of Rockafellar [1984] to solve the dual
network flow problem. He showed that the Lagrangian multiplier problem is a
minimum convex cost network flow problem. We review his approach here and
include some additional results. Our approach differs from his approach in a couple
of aspects. First of all, we are focused on finding an optimal integral solution.
Second, we are focused on developing formally efficient algorithms. We also include
this material in this paper because our notation and basic network terminology are
substantially different from that of Rockafellar.
As per Rockafellar [1984], we dualize the constraints (2b) using the vector x,
obtaining the following Lagrangian subproblem:
Minimize Σ(i, j)∈Q Fij(wij) + Σi∈P Bi(µi) − Σ(i, j)∈Q (wij + µj − µi) xij (3a)
subject to
lij ≤ wij ≤ uij for all (i, j) ∈ Q, (3b)
li ≤ µi ≤ ui for all i ∈ P. (3c)
Notice that
Σ(i, j)∈Q (µj - µi) xij = Σi∈P µi(Σ{j:(j, i)∈Q} xji - Σ{j:(i, j)∈Q} xij) = Σi∈P µi xi0, (4)
where xi0 = Σ{j:(j,i)∈Q} xji - Σ{j:(i,j)∈Q} xij for each i ∈ P. Substituting (4) into (3)
yields
L(x) = min Σ(i, j)∈Q {Fij(wij) − xij wij} + Σi∈P {Bi(µi) − xi0 µi} (5a)
subject to
result in the theory of Lagrangian relaxation (see, for example, Ahuja, Magnanti and
Orlin [1993], Chapter 16). ♦
We will now focus on solving the Lagrangian multiplier problem (6). We will
show that the Lagrangian multiplier problem can be transformed into a network flow
problem with nonlinear arc costs. We define a directed network G = (N, A) with the
node set N and the arc set A. The node set N contains a node i for each element i ∈ P
and an extra node, node 0. The arc set A contains an arc (i, j) for each (i, j) ∈ Q and
an arc (i, 0) for each i ∈ P. For each (i, 0), i ∈ P, we let wi0 = µi, li0 = li, ui0 = ui, and
Fi0(wi0) = Bi(µi). Let us define the function Hij(xij) for each arc (i, j) in the following
manner:
Hij(xij) = min{Fij(wij) - xijwij : lij ≤ wij ≤ uij}. (7)
In terms of this notation, the Lagrangian multiplier problem (6) can be restated as
subject to
In this section, we show that for any arc (i, j), Hij(xij) is a piecewise linear concave
function with at most (uij – lij + 1) linear segments and the slopes of these segments
are integers between –uij and –lij. We shall use these properties in the next section to
develop an efficient cost scaling algorithm to solve the Lagrangian multiplier
problem. Let bij(θ) = Fij(θ +1) – Fij(θ) for lij ≤ θ ≤ uij – 1, θ integral. The fact that
Fij(θ) is a convex function of θ can be used to prove the following results:
Property 1. bij(θ) ≤ bij(θ + 1) ≤ bij(θ + 2) ≤ · · · .
The following theorem obtains the expression for Hij(xij):
Theorem 2. The function Hij(xij) = min{Fij(wij) – wijxij : lij ≤ wij ≤ uij and wij integer}
is a piecewise linear concave function of xij, and is described in the following manner:
Hij(xij) = &
K:
KKFij ( θ ) − θxij
(9)
if bij ( θ − 1) ≤ xij ≤ bij ( θ )
KK:F (u ) − u x
' ij ij ij ij if bij (uij − 1) ≤ xij ≤ ∞
Proof Sketch: The function Hij(xij) is defined as Hij(xij) = min{Fij(θ) − θ xij :
lij ≤ θ ≤ uij, θ integer}. Thus Hij(xij) is the lower envelope of the (uij − lij + 1)
linear functions Fij(θ) − θ xij of xij, and hence is a piecewise linear concave
function. Using Property 1 it can be shown that each line Fij(θ) − θ xij
contributes a segment to the lower envelope Hij(xij); the segment
starts at bij(θ − 1) and ends at bij(θ). ♦
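Theorem 2 can be checked computationally by brute force: H is the pointwise minimum of the lines Fij(θ) − θ·x, and the candidate breakpoints are the differences bij(θ). A small sketch with names of our own choosing:

```python
def H(x, F, lo, hi):
    """Lower envelope H(x) = min over integer theta in [lo, hi] of
    F(theta) - theta * x; piecewise linear and concave in x (Theorem 2)."""
    return min(F(t) - t * x for t in range(lo, hi + 1))

def breakpoints(F, lo, hi):
    # b(theta) = F(theta + 1) - F(theta); Property 1 says these are
    # nondecreasing when F is convex, and they are the breakpoints of H.
    return [F(t + 1) - F(t) for t in range(lo, hi)]
```

For the convex cost F(w) = w^2 on [−2, 2], the breakpoints come out as −3, −1, 1, 3, a nondecreasing sequence as Property 1 predicts.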
We have so far transformed the Lagrangian multiplier problem into a concave
cost flow maximization problem. We can alternatively restate this problem as
subject to
The problem (10) is a convex cost flow problem in which the cost associated with the
flow on an arc (i, j) is a piecewise linear convex function containing at most U linear
segments. It is well known that such problems can be transformed to a minimum cost
flow problem by introducing an arc for every linear segment in the cost function (see,
for example, Ahuja, Magnanti and Orlin [1993, Chapter 14]). In the case that the
number of breakpoints for each cost function is bounded by a constant, one can solve
the resulting minimum cost flow problem efficiently; otherwise this approach yields
an exponential-time algorithm. The algorithm by Hochbaum and Shantikumar [1990]
solves this problem in time polynomial in terms of the log of the number of
breakpoints. In this paper, we develop another algorithm for solving the minimum
convex cost network flow problem which has better time complexity.
Our algorithm is an adaptation of the cost scaling algorithm due to Goldberg and
Tarjan [1987]. The running time of our algorithm for the dual network flow problem
is O(nm log(n2/m) log(nU)) which is comparable to the time taken by the Goldberg-
Tarjan (GT) algorithm for the minimum cost flow problem. We refer to our algorithm
as the convex cost scaling algorithm to differentiate it from GT’s linear cost scaling
algorithm. Our subsequent discussion requires some familiarity with the linear cost
scaling algorithm and we refer the readers to the paper by Goldberg and Tarjan [1987]
or the book of Ahuja, Magnanti and Orlin [1993] for a description of this algorithm.
The cost scaling algorithm maintains a pseudoflow at each step. A pseudoflow x is
any function x: A → R satisfying -M ≤ xij ≤ M for all (i, j) ∈ A. For any pseudoflow
x, we define the imbalance of node i as
e(i) = Σ{j: (j, i)∈A} xji − Σ{j: (i, j)∈A} xij for all i ∈ N. (11)
If e(i) > 0 for some node i, we refer to node i as an excess node and refer to e(i) as
the excess of node i. We refer to a pseudoflow x with e(i) = 0 for all i ∈ N as a flow.
The cost scaling algorithm also maintains a value π(i) for each node i ∈ N. We refer
to the vector π as a vector of node potentials.
The cost scaling algorithm proceeds by constructing and manipulating the residual
network G(x) defined as follows with respect to a pseudoflow x. For each (i, j) ∈ A,
the residual network G(x) contains two arcs (i, j) and (j, i). The cost of sending yij
units of flow in arc (i, j) is Cij(xij + yij) – Cij(xij); that is, it is the incremental cost of
increasing the flow in (i, j) by yij units. Notice that yij can be positive as well as
negative. We set the cost of arc (i, j) in the residual network as cij = Cij(xij + 1) –
Cij(xij) and set the cost of arc (j, i) as cji = Cij(xij – 1) – Cij(xij). Thus cij is the
marginal cost of increasing the flow in arc (i, j) and cji is the marginal cost of
decreasing flow in arc (i, j). By Theorem 2, both cij and cji are integer valued. The
following result is well known for convex cost flows (see for example, Rockafellar
[1984]).
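For a convex arc cost Cij, the two residual costs are just the right and left unit differences at the current flow xij; when the slopes are integral (Theorem 2), both are integers, and convexity guarantees cij + cji ≥ 0. A small illustration with a hypothetical cost function of our own:

```python
def residual_costs(C, x):
    """Costs of arcs (i, j) and (j, i) in the residual network G(x):
    the marginal cost of raising, resp. lowering, the flow by one unit."""
    c_fwd = C(x + 1) - C(x)   # c_ij
    c_rev = C(x - 1) - C(x)   # c_ji
    # convexity of C implies c_fwd + c_rev >= 0
    return c_fwd, c_rev
```

For example, with C(w) = w^2 and current flow x = 2, the forward residual cost is 5 and the reverse residual cost is −3.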
Theorem 3. A flow x is an optimal flow of (10) if and only if G(x) contains no
negative cost directed cycle.
For a given residual network G(x) and a set of node potentials π, we define the
reduced cost of an arc (i, j) as cπij = cij - π(i) + π(j). Just like its linear counterpart,
the convex cost scaling algorithm relies on the concept of approximate optimality. A
flow or a pseudoflow x is said to be ε-optimal for some ε ≥ 0 if for some node
potentials π, the pair (x, π) satisfies the following ε-optimality conditions:
cπij ≥ - ε for every arc (i, j) in G(x). (12)
We call an arc (i, j) admissible if - ε ≤ cπij < 0. We shall use the following result
in our algorithm. This result uses the fact that each arc in G(x) has an integer cost.
Lemma 1. (Bertsekas [1979]). If ε < 1/n, then any ε-optimal feasible flow of (10) is
an optimal flow.
We next define the residual capacity rij of an arc (i, j) in G(x) with respect to a flow
x and a set of node potentials π. The residual capacity rij of an arc (i, j) in G(x) is the
additional flow we need to send on the arc (i, j) so that the reduced costs of both the
arcs (i, j) and (j, i) become nonnegative, that is, if one sends rij units of flow in (i, j)
and then recomputes the residual network, then cπij ≥ 0 and cπji ≥ 0. Alternatively, the
residual capacity rij of an arc (i, j) in G(x) is the flow yij that minimizes (Cij(xij + yij):
yij ≥ 0). The following lemma obtains a formula for rij.
Lemma 2. Let θ = π(i) - π(j). Suppose that (i, j) is an arc with cijπ < 0. Then, rij =
bij(θ) - xij, if θ ≤ uij –1; otherwise rij = M - xij.
Proof Sketch: If θ ≤ uij − 1, then sending rij units of flow makes the flow on this arc
equal to bij(θ). Using Theorem 2, it can be shown that both the reduced costs cπij and
cπji are then nonnegative. Otherwise (θ ≥ uij), we augment M − xij units of flow, which
eliminates the arc (i, j) from the residual network and makes cπji nonnegative. ♦
The convex cost scaling algorithm treats ε as a parameter and iteratively obtains ε-
optimal flows for successively smaller values of ε. Initially, ε = U. The algorithm
performs cost scaling phases by repeatedly applying an improve-approximation
procedure that transforms an ε-optimal flow into an ε/2-optimal flow. After log(nU)
+ 1 scaling phases, ε < 1/n and the algorithm terminates with an optimal flow. Figure
1 gives the algorithm description.
procedure improve-approximation;
begin
for each arc (i, j) in G(x) do
if cπij < 0 then send rij amount of flow on arc (i, j);
while the network contains an excess node do
begin
select an excess node i;
if G(x) contains an admissible arc (i, j) then
push δ := min{e(i), rij} units of flow from node i to node j
else π(i) := π(i) + ε/2;
end;
end;
In the algorithm, we call the operation of sending δ = min{e(i), rij} units of flow
over the admissible arc (i, j) a push. If δ = rij, we refer to the push as saturating,
and as nonsaturating otherwise. We also refer to the operation of increasing the potential of
node i from π(i) to π(i) + ε/2 as a relabel operation. The purpose of a relabel
operation is to create new admissible arcs emanating from this node.
Goldberg and Tarjan observed that their linear cost scaling algorithm could
be extended to treat convex cost flows; moreover, the bounds on the number of pushes
per scaling phase are the same as for the linear cost flow problem. Tarjan [1998]
communicated this to one of the co-authors of this paper. The algorithm gives an ε-
optimal flow at the end of the ε-scaling phase. However, for the general convex cost
functions, the algorithm is not guaranteed to give an optimal flow no matter how
small ε is. But in our case, the cost functions have integer slopes, and the algorithm
gives an optimal flow if ε < 1. The running time analysis of the cost scaling algorithm
for the convex cost case does not appear in the published literature; hence we give a
brief sketch of this analysis here.
Theorem 4. The convex cost scaling algorithm correctly solves the dual network flow
problem in O(nm log(n2/m) log(nU)) time.
Proof Sketch: The running time of O(nm log(n2/m) log(nU)) for the linear cost
scaling algorithm relies on the following two critical facts: (i) in a scaling phase any
node is relabeled O(n) times; and (ii) in between two consecutive saturating pushes of
any arc (i, j) node i must be relabeled at least once. These two facts hold for the
convex cost scaling algorithm too. The bound on the number of non-saturating pushes
also works as before. The dynamic tree data structure also works as it does for the
linear cost case. All we need for it to work is that the residual capacities behave
well in the following sense: if the residual capacity is rij and one sends k
units of flow in the arc, then the residual capacity reduces to rij − k. This follows
from the manner in which we define the residual capacities. Hence the convex cost
scaling algorithm runs in O(nm log(n2/m)) time per scaling phase and in O(nm
log(n2/m) log(nU)) time overall. ♦
The cost scaling algorithm upon termination gives an optimal flow x* and
optimal node potentials π*. Both x* and π* may be non-integer. Since the
objective function in the convex cost flow problem (10) is piecewise linear, it follows
that there always exist integer optimal node potentials π. To determine those, we construct
G(x*) and solve a shortest path problem to determine the shortest path distance d(i) from
node 0 to every other node i ∈ N. Then π(i) = −d(i) for each i ∈ N gives an integer
optimal set of node potentials for the problem (10). Now recall that Cij(xij) = −Hij(xij)
for each (i, j) ∈ A. This implies that µ(i) = −π(i) = d(i) for each i ∈ N gives optimal
dual variables for (8), and these µ(i) together with wij = µ(i) − µ(j) for each (i, j) ∈ A give
an optimal solution of the dual network flow problem (2).
In our formulation of the dual network flow problem we have assumed that the
constraints µi - µj = wij are in the equality form. We will show in this section that the
constraints of the forms µi - µj ≤ wij can be transformed to the equality form; hence
there is no loss of generality by restricting the constraints to the equality form.
Suppose that we wish to solve the following problem:
Minimize Σ(i, j)∈Q Fij ( w ij ) + Σi∈P Bi (µ i ) (13a)
subject to
µi - µj ≤ wij for all (i, j) ∈ Q, (13b)
lij ≤ wij ≤ uij for all (i, j) ∈ Q, (13c)
li ≤ µi ≤ ui for all i ∈ P. (13d)
Let w*_ij denote the value of wij for which Fij(wij) is minimum. In case there are
multiple such values, choose the smallest one. Let
us define the function Eij(wij) in the following manner:
The following lemma establishes a relationship between optimal solutions of (13) and
(15).
Lemma 3. For every optimal solution (w̄, µ̄) of (13), there is an optimal solution
(ŵ, µ̄) of (15) of the same cost, and the converse also holds.
Proof Sketch: Consider an optimal solution (w̄, µ̄) of (13). We can show that the
solution (ŵ, µ̄) with ŵij = min{µ̄i − µ̄j , w̄ij} is an optimal solution of (15) with
the same cost. Similarly, it can be shown that if (ŵ, µ̂) is an optimal solution of (15),
then the solution (w̄, µ̂) constructed as w̄ij = max{µ̂i − µ̂j , ŵij} is an optimal
solution of (13) of the same cost. ♦
Consider an undirected network G = (N, A) with the node set N and the arc set A. Let
n = |N| and m = |A|. We assume that N = {1, 2, ... , n} and A = {a1, a2, ..., am}. Let cj
denote the cost of the arc aj. In the inverse spanning tree problem we are given a
spanning tree T0 of G which may or may not be a minimum spanning tree of G and
we wish to perturb the arc cost vector c to d so that T0 becomes a minimum spanning
∑ j=1
n
tree with d as the cost vector and Fj(dj – cj) is minimum, where each Fj(dj –
cj) is a convex function of dj. Sokkalingam, Ahuja and Orlin [1999], and Ahuja and
Orlin [1998] have studied special cases of the inverse spanning tree problem with cost
∑ j=1
n
functions as Fj|dj – cj| and max{|dj – cj| : 1 ≤ j ≤ m}.
We assume without any loss of generality that T0 = {a1, a2, ... , an-1}. We refer to
the arcs in T0 as tree arcs and the arcs not in T0 as nontree arcs. In the given
spanning tree T0, there is a unique path between any two nodes; we denote by W[aj]
the set of tree arcs on the path between the two endpoints of the arc aj. It is well known
(see, for example, Ahuja, Magnanti and Orlin [1993]) that T0 is a minimum spanning
tree with respect to the arc cost vector d if and only if
di ≤ dj for each ai ∈ W[aj] and for each j = n, n+1, ..., m. (17)
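Condition (17) is directly checkable: for each nontree arc aj, every tree arc on the path W[aj] must be no more expensive. A sketch under our own naming (tree paths found by a simple DFS; the paper gives no implementation):

```python
def tree_path_edges(tree_adj, u, v):
    """Edge ids of the unique u-v path in a tree.
    tree_adj: node -> list of (neighbor, edge_id)."""
    stack = [(u, None, [])]
    while stack:
        node, parent, edges = stack.pop()
        if node == v:
            return edges
        for nb, eid in tree_adj[node]:
            if nb != parent:
                stack.append((nb, node, edges + [eid]))
    raise ValueError("v not reachable from u")

def is_min_spanning_tree(tree_adj, nontree, d):
    """Check condition (17): d_i <= d_j for every tree arc a_i on the
    path W[a_j] between the endpoints of each nontree arc a_j."""
    for (u, v, j) in nontree:
        for i in tree_path_edges(tree_adj, u, v):
            if d[i] > d[j]:
                return False
    return True
```

On a triangle with tree arcs a0 = (1, 2), a1 = (2, 3) and nontree arc a2 = (1, 3), the check passes for d = (1, 2, 3) and fails once d0 exceeds d2.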
The isotonic regression problem can be defined as follows. Given A = (a1, a2,
. . . , an) ∈ Rn, find X = (x1, x2, . . . , xn) ∈ Rn so as to

Minimize Σ_{j=1}^n Bj(xj − aj) (21a)
subject to
xj ≤ xj+1 for all j = 1, 2, … , n-1, (21b)
lj ≤ xj ≤ uj for all j = 1, 2, … , n-1, (21c)
where Bj(xj − aj) is a convex function for every j, 1 ≤ j ≤ n. The isotonic regression
problem arises in statistics, production planning, and inventory control (see, for
example, Barlow et al. [1972] and Robertson et al. [1988]). As an application of the
isotonic regression, consider a fuel tank where fuel is being consumed at a slow rate
and measurements of the fuel tank are being taken at different points in time.
Suppose these measurements are a1, a2, . . . , an. Due to measurement errors,
these numbers may not be in non-increasing order despite the fact that the true
amounts of fuel remaining in the tank are non-increasing. However, we need to
estimate these measurements as accurately as possible. One possible way to
accomplish this is to perturb these numbers to x1 ≥ x2 ≥ . . . ≥ xn so that the cost
of perturbation, given by Σ_{j=1}^n Bj(xj − aj), is minimum, where each Bj(xj − aj) is a
convex function. We can transform this problem to the isotonic regression problem
by replacing the xj's by their negatives.
If we define P = {1, 2, … , n} and Q = {(j, j+1) : j = 1, 2, … , n-1} and require that
xj must be integer, then the isotonic regression problem can be cast as a dual network
flow problem. However, it is a very special case of the dual network flow problem,
and more efficient algorithms can be developed for it than for the general
problem. Ahuja and Orlin [1998] recently developed an O(n log U)
algorithm to solve the isotonic regression problem.
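For the classical quadratic case Bj(xj − aj) = (xj − aj)^2, isotonic regression is commonly solved by the pool-adjacent-violators algorithm (PAVA); the sketch below is this standard method, not the Ahuja–Orlin O(n log U) algorithm mentioned above:

```python
def pava(a):
    """Pool-adjacent-violators for least-squares isotonic regression:
    returns a nondecreasing x minimizing sum (x_j - a_j)^2."""
    blocks = []  # each block is [sum, count]; its fitted value is sum/count
    for val in a:
        blocks.append([val, 1])
        # merge while the last two block means violate monotonicity
        while len(blocks) > 1 and \
                blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    x = []
    for s, c in blocks:
        x.extend([s / c] * c)
    return x
```

For example, pava([1, 3, 2]) pools the violating pair (3, 2) into its mean, giving [1.0, 2.5, 2.5].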
References
1 Introduction
We consider the following combinatorial problem related to infeasible linear in-
equality systems.
Max FS: Given an infeasible system Σ : {Ax ≤ b} with A ∈ IRm×n and
b ∈ IRm , find a feasible subsystem containing as many inequalities as possible.
This problem has several interesting applications in various fields such as
statistical discriminant analysis, machine learning and linear programming (see
[2, 26, 22] and the references therein). In the latter case, it arises when the
LP formulation phase yields infeasible models and one wishes to diagnose and
resolve infeasibility by deleting as few constraints as possible, which is the com-
plementary version of Max FS [19, 27, 12]. In most situations this cannot be
done by inspection and the need for effective algorithmic tools has become more
acute with the considerable increase in model size. In fact, Max FS turns out to
be NP-hard [10], and it does not admit a polynomial-time approximation scheme
unless P = NP [3]. The above complementary version, in which the goal is to
delete as few inequalities as possible in order to achieve feasibility, is equivalent
to Max FS with respect to exact optimization, but is much harder to approximate [5, 4].
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 45–59, 1999.
c Springer-Verlag Berlin Heidelberg 1999
46 Edoardo Amaldi, Marc E. Pfetsch, and Leslie E. Trotter, Jr.
First we briefly recall the main known structural results regarding IISs. For
notational simplicity, we use the same A and b, with A ∈ IRm×n and b ∈ IRm ,
to denote either the original system Σ or one of its IISs.
The known characterizations of IISs are based on the following version of
the Farkas Lemma. For any system Σ : {Ax ≤ b}, either Ax ≤ b is feasible or
∃ y ∈ IRm , y ≥ 0, such that yA = 0 and yb < 0, but not both.
P := {y ∈ IRm | yA = 0, yb ≤ −1, y ≥ 0}.
The inequality in the alternative system can obviously be replaced by the equation
yb = −1. Note that, by using the transformation into Karmarkar’s standard
form, any polytope can be expressed as {y ∈ IRm | yA = 0, y1 = 1, y ≥ 0}, where
1 denotes the all-ones vector, for an appropriate matrix A. Theorem 2 can also be
stated in terms of rays [27] and elementary vectors [18].
Definition 1. An elementary vector of a subspace L ⊆ IRm is a nonzero vector
y that has a minimal number of nonzero components (when expressed with respect
to the standard basis of IRm ). In other words, if x ∈ L and supp(x) ⊂ supp(y)
then x = 0, where supp(y) denotes the support of y.
We now determine the complexity status of the following problem for which
heuristics have been proposed in [14, 13, 26, 27].
Min IIS: Given an infeasible system Σ : {Ax ≤ b} as above, find a minimum
cardinality IIS.
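To make these objects concrete, here is a small exact sketch (our own illustration, not an algorithm from the paper): feasibility of {Ax ≤ b} is decided by Fourier–Motzkin elimination over the rationals, and a minimum-cardinality IIS is found by enumerating subsystems in order of increasing size. This is viable only for tiny systems; note that a smallest infeasible subsystem is automatically irreducible, since all of its proper subsystems are smaller and hence feasible.

```python
from fractions import Fraction
from itertools import combinations

def feasible(A, b):
    """Decide whether {Ax <= b} has a solution, by exact Fourier-Motzkin elimination."""
    rows = [([Fraction(v) for v in ai], Fraction(bi)) for ai, bi in zip(A, b)]
    n = len(A[0])
    for k in range(n):
        pos = [r for r in rows if r[0][k] > 0]
        neg = [r for r in rows if r[0][k] < 0]
        rows = [r for r in rows if r[0][k] == 0]
        for ap, bp in pos:
            for an, bn in neg:
                # positive combination cancelling the coefficient of x_k
                coef = [-an[k] * ap[j] + ap[k] * an[j] for j in range(n)]
                rows.append((coef, -an[k] * bp + ap[k] * bn))
    # all variables eliminated: the system now reads 0 <= b'_i for each remaining row
    return all(bi >= 0 for _, bi in rows)

def min_iis(A, b):
    """Index set of a minimum-cardinality IIS of the infeasible system {Ax <= b}."""
    for size in range(1, len(A) + 1):
        for S in combinations(range(len(A)), size):
            if not feasible([A[i] for i in S], [b[i] for i in S]):
                return S  # smallest infeasible subsystem, hence irreducible
    raise ValueError("system is feasible")
```

For the system x ≤ 0, −x ≤ −1, x ≤ 5 the call min_iis returns (0, 1): the first two inequalities form the unique IIS.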
To settle the issue left open in [19, 14, 27], we prove that Min IIS is not only
NP-hard to solve optimally but also hard to approximate. Note that, where
DTIME(T(m)) denotes the class of problems solvable in time T(m), the assumption
NP ⊄ DTIME(m^polylog m) is stronger than P ≠ NP, but its failure is also
believed to be extremely unlikely. Results that hold under such an assumption
are often referred to as almost NP-hard.
Theorem 4. Assuming P ≠ NP, no polynomial time algorithm is guaranteed
to yield an IIS whose cardinality is at most c times larger than the minimum
one, for any constant c ≥ 1. Assuming NP ⊄ DTIME(m^polylog m), Min IIS
cannot be approximated within a factor 2^(log^(1−ε) m) for any ε > 0, where m is the
number of inequalities.
Proof. We proceed by reduction from the following problem: Given a feasible
linear system Dz = d, with D ∈ IRm′×n′ and d ∈ IRm′, find a solution z satisfy-
ing all equations with as few nonzero components as possible. In [4] we establish
that it is (almost) NP-hard to approximate this problem within the same type
of factors, but with m replaced by n, the number of variables. Note that the
above nonconstant factor grows faster than any polylogarithmic function, but
slower than any polynomial one.
For each instance of the latter problem with an optimal solution containing
s nonzero components, we construct a particular instance of Min IIS with a
minimum cardinality IIS containing s + 1 inequalities. Given any instance (D, d),
consider the system

Dz+ − Dz− − d z0 = 0,   −z0 < 0,   z+ ≥ 0, z− ≥ 0, z0 ≥ 0.   (1)

Since the strict inequality implies z0 > 0, the system Dz = d has a solution with
s nonzero components if and only if (1) has one with s + 1 nonzero components.
Now, applying Corollary 1, (1) has such a solution if and only if the system

Dt x ≤ 0,   −Dt x ≤ 0,   −dt x ≤ −1   (2)

has an IIS of cardinality s + 1. Since (2) is the alternative system of (1), the
Farkas Lemma implies that exactly one of these is feasible; as (1) is feasible, (2)
must be infeasible. Thus (2) is a particular instance of Min IIS with m = 2n′ + 1
inequalities in n = m′ variables.
Given that the polynomial time reduction preserves the objective function
modulo an additive unit constant, we obtain the same type of non-approx-
imability factors for Min IIS. ⊓⊔
Note that for the similar (but not directly related) problem of determining
minimum witnesses of infeasibility in network flows, NP-hardness is established
in [1].
Lemma 1. For any IIS {Ax ≤ b}, Ai has linearly independent rows, ∀ i; i.e.,
rank(Ai ) = m − 1.
It is interesting to note that this lemma together with Theorem 1 imply that an
infeasible system {Ax ≤ b} is an IIS if and only if rank(Ai ) = m − 1 for all i,
1 ≤ i ≤ m.
We thus have the following simplex decomposition result for IISs.
Proof. (⇒) To see feasibility of {Ax ≥ b}, delete constraint ai x ≥ bi to get the
equality system {Ai x = bi}. By Lemma 1, this system has a solution, say xi,
and we must have ai xi > bi, else xi satisfies {Ax ≤ b}. Applying the polyhedral
resolution theorem, P := {x ∈ IRn | Ax ≥ b} ≠ ∅ can be written as P = K + Q,
where K = {x ∈ IRn | Ax ≥ 0} is its recession cone and Q ⊆ P is a polytope
generated by representatives of its minimal nonempty faces.
If x satisfies Ax ≥ 0 and ai x > 0 for some row ai, then xi − εx satisfies
A(xi − εx) ≤ b for sufficiently large ε > 0 and the original system {Ax ≤ b}
would be feasible. Therefore we must have ai x = 0 for all 1 ≤ i ≤ m and
x ∈ K, and we get that in fact K = L := {x ∈ IRn | Ax = 0}.
For Q, minimal nonempty faces of P are given by changing a maximal set of
inequalities into equalities (all but one relation). Thus the vectors xi obtained
3 IIS-Hypergraphs
Proof. We show that for any instance of the Steinitz problem we can construct
in polynomial time a special instance of the above-mentioned restricted IIS Re-
alizability problem such that the answer to the first instance is affirmative if
and only if the answer to the second instance is affirmative. Since face lattices
Consider an infeasible system Σ : {Ax ≤ b} and let [m] = {1, . . . , m} be the set
of indices of all inequalities in Σ. If I denotes the set of all feasible subsystems
of Σ, ([m], I) is clearly an independence system and its set of circuits C(I)
corresponds to the set of all IISs. We denote by PFS the polytope given by
the convex hull of all the incidence vectors of feasible subsystems.
Let us first briefly recall some definitions and facts about independence sys-
tem polytopes. To any independence system (E, I) with the family of circuits
denoted by C(I) we can associate the polytope P(I) = P(C(I)) = conv({y ∈
{0, 1}|E| | y is the incidence vector of an I ∈ I}). The rank function is defined
by r(S) = max{|I| : I ⊆ S, I ∈ I} for all S ⊆ E. For any S ⊆ E, the rank
inequality for S is ∑_{e∈S} ye ≤ r(S), which is clearly valid for P(I). A subset
S ⊆ E is closed if r(S ∪ {t}) ≥ r(S) + 1 for all t ∈ E − S, and nonseparable if
r(S) < r(T) + r(S − T) for all T ⊂ S, T ≠ ∅. For any set S ⊆ E, S must be
closed and nonseparable for the corresponding rank inequality to define a facet
of P(I). These conditions generally are only necessary, but sufficient conditions
can be stated using the following concept [21]. For S ⊆ E, the critical graph
GS(I) = (S, F) is defined as follows: (e, e′) ∈ F, for e, e′ ∈ S, if and only if
there exists an independent set I such that I ⊆ S, |I| = r(S), e ∈ I, e′ ∉ I,
and I − e + e′ ∈ I. It is shown in [21] that if S is a closed subset of E and the critical
graph GS(I) of I on S is connected, then the corresponding rank inequality
induces a facet of the polytope P(I). See references in [15].
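These definitions are easy to check by brute force on small ground sets. The sketch below is our own illustration (the function names and the explicit-list encoding of the independence system are assumptions): the independent sets are given as a set of frozensets, and closedness and connectivity of the critical graph GS(I) are tested directly from the definitions.

```python
def rank(ind, S):
    """r(S): size of a largest independent set contained in S."""
    return max(len(I) for I in ind if I <= S)

def is_closed(ind, E, S):
    """S is closed if adding any outside element raises the rank."""
    return all(rank(ind, S | {t}) >= rank(ind, S) + 1 for t in E - S)

def critical_graph_connected(ind, S):
    """Build the critical graph G_S and test connectivity by depth-first search."""
    r = rank(ind, S)
    adj = {e: set() for e in S}
    for I in ind:
        if I <= S and len(I) == r:  # maximum independent subsets of S
            for e in I:
                for f in S - I:
                    if (I - {e}) | {f} in ind:  # the exchange defining an edge
                        adj[e].add(f); adj[f].add(e)
    seen, stack = set(), [next(iter(S))]
    while stack:
        e = stack.pop()
        if e not in seen:
            seen.add(e)
            stack.extend(adj[e] - seen)
    return seen == S
```

For the independence system on E = {1, 2, 3} whose independent sets are the empty set and the singletons (every 2-subset a circuit), r(E) = 1, E is closed, and the critical graph on E is connected, so by the quoted result the rank inequality y1 + y2 + y3 ≤ 1 is facet-inducing.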
[Figure: the cones K1, K2, K3 at the points x1, x2, x3, with the directions
d1, d2, d3 and the point x̂; the hatching of the original image is omitted.]
For each i ∈ S, consider the unique xi = (AS\{i})−1 bS\{i}. By the proof of
Theorem 5, we know that x1, . . . , xn+1 are affinely independent. If di := (xi − x̂)
for all i, 1 ≤ i ≤ n + 1, where x̂ := (1/(n+1)) ∑_{i=1}^{n+1} xi, then d1, . . . , dn+1 are also
affinely independent. Clearly ∑_{i=1}^{n+1} di = 0 and the di’s generate IRn. Since each xi
satisfies exactly n of the n + 1 inequalities in S with equality and ai xi > bi
(otherwise S would be feasible), we have x̂ ∈ {x ∈ IRn | AS x ≥ bS}, i.e., x̂
satisfies the reversed inequalities of the IIS. In fact, x̂ is an interior point of the
above “reversed” polyhedron.
where we used the fact that AS\{i} x̂ ≥ bS\{i} . Now consider an arbitrary in-
equality ãx ≤ b̃ with ã 6= 0. We will verify that H := {x ∈ IRn | ãx ≤ b̃} has
a nonempty intersection with at least one of the Ki ’s, 1 ≤ i ≤ n + 1. Thus, for
any t ∈ E − S we have rank(S ∪ {t}) = rank(S) + 1 = n + 1, which means that
the IIS defined by S is closed.
Since d1, . . . , dn+1 generate IRn and ∑_{i=1}^{n+1} di = 0, we have ∑_{i=1}^{n+1} ãdi =
ã(∑_{i=1}^{n+1} di) = 0, and therefore ã ≠ 0 implies that we cannot have ãdi =
0 for all i, 1 ≤ i ≤ n + 1. Thus there exists at least one i such that ãdi < 0.
But this implies that Ri ∩ H ≠ ∅. In other words, Ki ∩ H ≠ ∅ and this proves
the theorem for maximal IISs.
b) The result can be easily extended to non-maximal IISs, i.e., with |S| < n + 1.
From Theorem 5 we know that P := {x ∈ IRn | Ax ≥ b} = L + Q with Q ⊆ L⊥.
Since P is full-dimensional (x̂ is an interior point), n = dim(P) = dim(L) +
dim(Q) and dim(Q) = rank(AS) = |S| − 1 < n imply that dim(L) ≥ 1.
Two cases can arise:
i) If the above-mentioned ã is in lin({a1, . . . , am}) = L⊥, the linear hull of the
rows of A, then since dim(L⊥) = dim(Q), we can apply the above result to L⊥.
ii) If ã ∉ lin({a1, . . . , am}) = L⊥, then the projection of H= := {x ∈ IRn | ãx =
b̃} onto L yields the whole of L, and therefore H = {x ∈ IRn | ãx ≤ b̃} must
have a nonempty intersection with all the cones corresponding to the maximal
consistent subsystems of {AS x ≤ bS}. ⊓⊔
It is worth noting that closedness of every IIS makes PFS quite special among
all independence system polyhedra, since the circuits of a general independence
system need not be closed.
The separation problem for IIS-inequalities is defined as follows: Given an
infeasible system Σ and an arbitrary vector y ∈ IRm , show that y satisfies all
IIS-inequalities or find at least one violated by y.
In view of the trivial valid inequalities, we can assume that y ∈ [0, 1]m. Moreover,
we may assume, with no loss of generality, that the nonzero components of y
correspond to an infeasible subsystem of Σ.
Let (A, b) and K define an arbitrary instance of the above decision problem.
Consider the particular instance of the separation problem given by the same
infeasible system together with the vector y such that yi = 1 − 1/(K + 1) for all
i, 1 ≤ i ≤ m.
Suppose that Σ has an IIS of cardinality at most K which is indexed by the
set S. Then the corresponding IIS-inequality ∑_{i∈S} yi ≤ |S| − 1 is violated by
the vector y because
∑_{i∈S} yi = ∑_{i∈S} (1 − 1/(K+1)) = |S| − |S|/(K+1) > |S| − 1,
where the strict inequality is implied by |S| ≤ K. Thus the vector y can be
separated from PFS.
Conversely, if there exists an IIS-inequality violated by y, then
∑_{i∈S} yi = |S| − |S|/(K+1) > |S| − 1
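The two directions of this argument are easy to verify numerically. The following snippet is our own illustration (the helper name and the small tolerance guarding against floating-point noise are assumptions):

```python
def violated_iis_inequality(y, S):
    """Is the IIS-inequality sum_{i in S} y_i <= |S| - 1 violated by y?"""
    S = list(S)
    return sum(y[i] for i in S) > len(S) - 1 + 1e-12  # tolerance for float noise

# the point used in the reduction: y_i = 1 - 1/(K+1) for every inequality
m, K = 10, 3
y = [1.0 - 1.0 / (K + 1)] * m
print(violated_iis_inequality(y, range(K)))      # an IIS with |S| <= K: violated
print(violated_iis_inequality(y, range(K + 1)))  # |S| = K + 1: satisfied
```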
In the case of PFS, the ground set is the set of indices of inequalities in the
infeasible system Σ under consideration.
Proof. We invoke the following result (Proposition 3.15 of [21]). For any S ⊆ E,
let CS = {C ∈ C | C ⊆ S} denote the family of circuits of the independence
system induced by (E, I) on S. Then the rank inequality ∑_{e∈S} ye ≤ r(S) induces
a facet of P(C) if and only if S is closed and the rank inequality induces a facet
of P(CS). Hence it suffices to consider the case S = E and CS = AW(m, t, q).
It is easy to verify that the only (m, t, q)-generalized antiwebs that can arise
in IIS-hypergraphs are those with q = t. Suppose that q < t and consider E1,
an arbitrary circuit C ∈ AW(m, t, q) with C ⊆ E1, and an arbitrary element
e ∈ E1 \ C. By definition of AW(m, t, q), any q-subset of E1 is a circuit. This
must be true in particular for all subsets containing e and q − 1 elements of C.
But then C cannot be closed because r(C ∪ {e}) = r(C), and thus we have a
contradiction to the fact that all IISs are closed (Theorem 7). Hence the only
generalized cliques that can arise are those with m = t = q, that is, in which the
whole ground set E is an IIS. ⊓⊔
The generalized antiwebs which are not ruled out by the above proof, i.e.,
AW(m, t, q) with q = t, clearly correspond to simple circular sequences of IISs
of cardinality t given by the subsets Ei, i ∈ M, of the definition. For t = q = 2, it
is easy to see that the only possible cases that can arise as induced hypergraphs
of IIS-hypergraphs are those with m = 4 and m = 2. In fact, we conjecture that
no other (m, t, q)-generalized antiwebs can occur besides the cases m = t = q
with q ≥ 2, m = 4 and t = q = 2 as well as the trivial cases in which q = 1.
In this respect it is interesting to note that the remark following Theorem 5
implies that the lineality spaces L associated to all the IISs E i , i ∈ M , in any
given generalized antiweb are identical. Therefore we can assume that they are
all maximal IISs contained in L⊥ and exploit the special geometric structure
of such IISs revealed by the proof of Theorem 7. An intermediate step would
then be to show that no sequence of more than 3 such successive IISs E i can
occur without other additional IISs involving t nonsequential elements. In the
case m = 5 and t = 2, this observation is clearly valid.
Besides settling the above-mentioned issue, we are investigating other rank
and non-rank facets of PFS. For rank facets, it is also of interest to consider the
extent to which the sufficient condition involving connectedness of the critical
graph could also be necessary. By enumerating all independence systems on at
most 6 elements, we have verified that all cases with rank facets different from
IIS-inequalities and with a nonconnected critical graph occur in independence
systems which cannot be realized as PFS.
For non-rank facets, we can specialize some known facet classes for general
independence system polytopes and set covering polytopes, e.g., the class of all
facets with (0, 1, 2)-valued coefficients characterized in [7]. A simple example of a
PFS polytope with such a non-rank facet is as follows. The original system contains
six inequalities in three variables. In addition to the rank inequalities defined
by the five maximal IISs ({3456}, {2345}, {1346}, {1246}, {1245}) and to the
trivial (0, 1)-bounding inequalities, the single additional constraint x1 + x2 +
x3 + 2x4 + x5 + x6 ≤ 5 is required to provide the full description.
References
[1] C. C. Aggarwal, R. K. Ahuja, J. Hao, and J. B. Orlin, Diagnosing infeasi-
bilities in network flow problems, Mathematical Programming, 81 (1998), pp. 263–
280.
[2] E. Amaldi, From finding maximum feasible subsystems of linear systems to feed-
forward neural network design, PhD thesis, Dep. of Mathematics, EPF-Lausanne,
1994.
[3] E. Amaldi and V. Kann, The complexity and approximability of finding maxi-
mum feasible subsystems of linear relations, Theoretical Comput. Sci., 147 (1995),
pp. 181–210.
[4] , On the approximability of minimizing nonzero variables or unsatisfied re-
lations in linear systems, Theoretical Comput. Sci., 209 (1998), pp. 237–260.
[5] S. Arora, L. Babai, J. Stern, and Z. Sweedyk, The hardness of approximate
optima in lattices, codes, and systems of linear equations, J. Comput. Syst. Sci.,
54 (1997), pp. 317–331.
[6] A. Bachem and M. Grötschel, New aspects of polyhedral theory, in Optimiza-
tion and Operations Research, A. Bachem, ed., Modern Applied Mathematics,
North Holland, 1982, ch. I.2, pp. 51–106.
[7] E. Balas and S. M. Ng, On the set covering polytope: All the facets with coeffi-
cients in {0,1,2}, Mathematical Programming, 43 (1989), pp. 57–69.
[8] L. Blum, F. Cucker, M. Shub, and S. Smale, Complexity and Real Compu-
tation, Springer-Verlag, 1997.
[9] J. Bokowski and B. Sturmfels, Computational Synthetic Geometry, no. 1355
in Lecture Notes in Mathematics, Springer-Verlag, 1989.
[10] N. Chakravarti, Some results concerning post-infeasibility analysis, Eur. J.
Oper. Res., 73 (1994), pp. 139–143.
[11] J. Chinneck, Computer codes for the analysis of infeasible linear programs, J.
Oper. Res. Soc., 47 (1996), pp. 61–72.
[12] , An effective polynomial-time heuristic for the minimum-cardinality IIS set-
covering problem, Annals of Mathematics and Artificial Intelligence, 17 (1996),
pp. 127–144.
[13] , Feasibility and viability, in Advances in Sensitivity Analysis and Parametric
Programming, T. Gál and H. Greenberg, eds., Kluwer Academic Publishers, 1997.
[14] J. Chinneck and E. Dravnieks, Locating minimal infeasible constraint sets in
linear programs, ORSA Journal on Computing, 3 (1991), pp. 157–168.
[15] M. Dell’Amico, F. Maffioli, and S. Martello, Annotated Bibliographies in
Combinatorial Optimization, John Wiley, 1997.
[16] K. Fan, On systems of linear inequalities, in Linear Inequalities and Related
Systems, H. W. Kuhn and A. W. Tucker, eds., no. 38 in Annals of Mathematical
Studies, Princeton University Press, NJ, 1956, pp. 99–156.
[17] J. Gleeson and J. Ryan, Identifying minimally infeasible subsystems of inequal-
ities, ORSA Journal on Computing, 2 (1990), pp. 61–63.
1 Introduction
The single node fixed-charge flow polyhedron, studied by Padberg et al. [9] and
Van Roy and Wolsey [12], arises as an important relaxation of many 0-1 mixed
integer programming problems with fixed charges, including lot-sizing problems
[4,10] and capacitated facility location problems [1]. The valid inequalities de-
rived for the single node fixed-charge flow polyhedron have proven to be effective
for solving these types of problems. Here we study a generalization of the sin-
gle node fixed-charge flow polyhedron that arises as a relaxation of network flow
problems with additive variable upper bounds, such as network design/expansion
problems and production planning problems with setup times. We derive several
classes of strong valid inequalities for this polyhedron. Our computational ex-
perience with network expansion problems indicates that these inequalities are
very effective in improving the quality of the linear programming relaxations.
In a network design problem, given a network and demands on the nodes, we
are interested in installing capacities on the edges of the network so that the total
cost of flows and capacity installation is minimized. If some of the edges already
* This research is supported, in part, by NSF Grant DMI-9700285 to the Georgia
Institute of Technology.
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 60–72, 1999.
c Springer-Verlag Berlin Heidelberg 1999
Valid Inequalities for Problems with Additive Variable Upper Bounds 61
where dit denotes the demand for item i in period t, ut the total production
capacity in period t and ai the setup time required for item i if the machine is
setup for this item. Aggregating the demand constraints (3) and the production
variables yit for each period, we arrive at the same structure as in (1)-(2).
In the next section we introduce four classes of valid inequalities for P
and give conditions under which these inequalities are facet-defining for conv(P ).
In Section 3 we present a summary of computational results on the use of the
new inequalities in a branch-and-cut algorithm for network expansion problems.
2 Valid Inequalities
The proofs of the results given in the sequel are abbreviated or omitted due to
space considerations. For detailed proofs and for further results and explanations,
the reader is referred to Atamtürk [2].
Let N = {1, 2, . . . , n} be the index set of binary variables and N(S) be
the subset of N appearing in the additive variable upper bound constraints
associated with S ⊆ M = {1, 2, . . . , m}. For notational simplicity we use N(i)
for N({i}). We define u(S) = ∑_{i∈S} (ui + ∑_{j∈N(i)} aij) for S ⊆ M and aj(S) =
∑_{i∈S} aij for j ∈ N(S). Again for notational simplicity we use u(i) for u({i}).
Throughout we make the following assumptions on the data of the model:
(A.1) aij > 0 for all i ∈ M, j ∈ N (i).
(A.2) u(i) > 0 for all i ∈ M .
(A.3) b + u(M − ) > 0.
(A.4) u(i) − aij ≥ 0 for all i ∈ M, j ∈ N (i).
(A.5) b + u(M − ) − aj (M − ) ≥ 0 for all j ∈ N .
Assumptions (A.2)–(A.5) are made without loss of generality. If u(i) < 0 or
b + u(M−) < 0, then P = ∅. If u(i) = 0 (resp. b + u(M−) = 0), then yi = 0
(resp. yi = 0 for all i ∈ M+) in every feasible solution and the variables can be
eliminated. Similarly, if u(i) − aij < 0 or b + u(M−) − aj(M−) < 0, then xj = 1
in every feasible solution and can be eliminated. Note that given (A.1), if
N(i) ≠ ∅ for all i ∈ M, then (A.4) implies (A.2) and (A.5) implies (A.3).
Assumption (A.1) is made for convenience. Results presented in the sequel can
easily be generalized to the case with aij < 0. Note that, for a particular j ∈ N,
if aij < 0 for all i ∈ M, then xj can be complemented to satisfy (A.1). If there
is no overlap of additive variable upper bounds, i.e., N(i) ∩ N(k) = ∅ for all
i, k ∈ M, then M(j) is a singleton for all j ∈ N and (A.1) can be satisfied by
complementing the binary variables when aij < 0.
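The preprocessing conditions (A.1)–(A.5) can be verified mechanically. The sketch below is our own illustration (the dictionary encoding of the data is an assumption): u maps i to ui, a maps (i, j) to aij, N maps i to N(i), and u(i), u(M−), aj(M−) are computed from the definitions above.

```python
def check_assumptions(u, a, N, M_minus, b):
    """Verify (A.1)-(A.5) for data u_i, a_ij, N(i), the set M^- and right-hand side b."""
    ucap = {i: u[i] + sum(a[i, j] for j in N[i]) for i in u}                # u(i)
    all_j = set().union(*N.values()) if N else set()
    uMm = sum(ucap[i] for i in M_minus)                                    # u(M^-)
    ajMm = {j: sum(a[i, j] for i in M_minus if j in N[i]) for j in all_j}  # a_j(M^-)
    assert all(a[i, j] > 0 for i in N for j in N[i]), "(A.1) violated"
    assert all(ucap[i] > 0 for i in u), "(A.2) violated"
    assert b + uMm > 0, "(A.3) violated"
    assert all(ucap[i] - a[i, j] >= 0 for i in N for j in N[i]), "(A.4) violated"
    assert all(b + uMm - ajMm[j] >= 0 for j in all_j), "(A.5) violated"
    return True
```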
Proposition 1. Conv(P ) is full-dimensional.
If (N(C+)+ ∩ T) ∪ (N(L−)+ \ T) = ∅, then
lhs ≤ ∑_{i∈C+} ȳi − ∑_{j∈N(L−)−\T} aj(L−) − ∑_{i∈K} ȳi ≤ b + u(C−) + γ.
To see that the second inequality is valid, observe that
∑_{i∈C+} yi − ∑_{i∈L−} yi − ∑_{i∈K} yi ≤ b + u(C−) is valid for P and that
−∑_{i∈L−} yi ≤ u(L−) − ∑_{j∈N(L−)+} aj(L−) − ∑_{j∈N(L−)−∩T} aj(L−) is valid
for (x̄, ȳ) since N(L−)+ ⊆ T. Adding these two
inequalities gives the result. Now, suppose (N(C+)+ ∩ T) ∪ (N(L−)+ \ T) ≠ ∅.
Then
lhs ≤ u(C+) − ∑_{j∈N(C+)∩T} aj(C+) + ∑_{j∈N(C+)+∩T} aj(C+) +
∑_{j∈N(C+)+∩T} (γ − λ) − ∑_{j∈N(L−)+\T} (λ − γ) − ∑_{j∈N(L−)−\T} aj(L−)
Remark 1. For the single node fixed-charge flow model, where (2) is replaced
with yi ≤ ui xi, the additive flow cover inequality reduces to the flow cover
inequality [12]
∑_{i∈C+} yi + ∑_{i∈C+} (ui − λ)+ (1 − xi) − ∑_{i∈L−} min{ui, λ} xi − ∑_{i∈K} yi ≤ b + ∑_{i∈C−} ui.
1. C − = ∅,
2. maxj∈N (C + ) aj (C + ) > λ − γ,
3. aj (L− ) > λ − γ for some j ∈ N (i) for all i ∈ L− with ui = 0,
4. ui ≥ 0 for all i ∈ L− ,
5. N (L− ) ∩ N (M \ L− ) = ∅.
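As an illustration of Remark 1, the flow cover inequality is easy to evaluate at a fractional point. The sketch below is our own (the function name, the list encoding, and the convention λ = ∑_{i∈C+} ui − b − ∑_{i∈C−} ui for the excess of the cover are stated here as assumptions); a positive return value signals a violated cut.

```python
def flow_cover_violation(y, x, u, b, Cp, Cm, Lm, K):
    """Evaluate lhs - rhs of the flow cover inequality of [12] (Remark 1):
    sum_{C+} y_i + sum_{C+} (u_i - lam)^+ (1 - x_i)
      - sum_{L-} min(u_i, lam) x_i - sum_K y_i <= b + sum_{C-} u_i."""
    lam = sum(u[i] for i in Cp) - b - sum(u[i] for i in Cm)  # assumed cover excess
    lhs = (sum(y[i] for i in Cp)
           + sum(max(u[i] - lam, 0.0) * (1 - x[i]) for i in Cp)
           - sum(min(u[i], lam) * x[i] for i in Lm)
           - sum(y[i] for i in K))
    return lhs - (b + sum(u[i] for i in Cm))
```

With u = (2, 2), b = 3 (so λ = 1), C+ = {0, 1} and C− = L− = K = ∅, the fractional point y = (2, 1), x = (1, 0.5) violates the inequality by 0.5.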
If (N(L+)+ \ T) ∪ (N(C−)+ ∩ T) = ∅, then
lhs = ∑_{i∈C+∪L+} ȳi − ∑_{j∈N(L+)−\T} aj(L+) − ∑_{i∈K} ȳi ≤ u(C+) + γ.
Otherwise,
lhs ≤ b + u(C−) − ∑_{j∈N(C−)∩T} aj(C−) − ∑_{j∈N(L+)+\T} (µ − γ) −
∑_{j∈N(L+)−\T} aj(L+) + ∑_{j∈N(C−)+∩T} (aj(C−) − µ + γ)
Remark 2. For the single node fixed-charge flow model, the additive flow packing
inequality reduces to the flow packing inequality [3]
∑_{i∈C+∪L+} yi − ∑_{i∈L+} min{ui, µ} xi + ∑_{i∈C−} (ui − µ)+ (1 − xi) − ∑_{i∈K} yi ≤ ∑_{i∈C+} ui.
γ = ∑_{i∈L−} ui < λ. Then, from Section 2.1 we have the following valid additive
flow cover inequality for PF:
∑_{i∈C+} yi + ∑_{j∈N(C+)\F} (aj(C+) − λ + γ)+ (1 − xj) −
∑_{j∈N(L−)\F} min{aj(L−), λ − γ} xj − ∑_{i∈K} yi ≤ b + ū(C−) + γ. (7)
Note that inequality (7) is not necessarily valid for P . We assume that
the conditions of Proposition 3 are satisfied and hence (7) is facet-defining for
conv(PF ). In order to derive a generalized additive flow cover inequality for
P , we lift (7) in two phases. In the first phase we lift the inequality with the
variables in N (L− ∪ C − ) ∩ F . Then in the second phase we lift the resulting in-
equality with the variables in N (C + ) ∩ F . When lifting the variables in phases,
for convenience, we make the following assumption:
(A.6) (N (C + ) ∩ F ) ∩ (N (L− ∪ C − ) ∩ F ) = ∅.
Even if (A.6) is not satisfied, the lifted inequality is still valid for P , but it
may not be facet-defining for conv(P ). Now, let (7) be lifted first with variable
xl , l ∈ N (L− ∪ C − ) ∩ F . Then the lifting coefficient associated with xl is equal
to
f(al(L− ∪ C−)) = b + ū(C−) + γ − max_{(x,y)∈PF\l, xl=1} { ∑_{i∈C+} yi +
∑_{j∈N(C+)\F} (aj(C+) − λ + γ)+ (1 − xj) − ∑_{j∈N(L−)\F} min{aj(L−), λ − γ} xj − ∑_{i∈K} yi }.
Since (7) satisfies the conditions of Proposition 3, it follows that ū(C + ) >
λ − γ or equivalently b + ū(C − ) + γ > 0. Then the lifting problem has an optimal
66 Alper Atamtürk, George L. Nemhauser, and Martin W. P. Savelsbergh
solution such that yi = 0 for all i ∈ (M + \ C + )∪K. Let (x̄, ȳ) be such an optimal
solution and let S = {j ∈ N (C + )\F : x̄j = 0} and T = {j ∈ N (L− )\F : x̄j = 1}.
Clearly, we may assume that S ⊆ {j ∈ N (C + ) \ F : aj (C + ) > λ − γ} and
T ⊆ {j ∈ N (L− ) \ F : aj (L− ) > λ − γ}; otherwise we can obtain a solution with
the same or better objective value by considering a subset of S or T satisfying
these conditions. There are two cases to consider when determining the value of
f(al(L− ∪ C−)), depending on how ∑_{i∈C+} yi is bounded in an optimal solution.
We analyze f(al(L− ∪ C−)) separately for each case.
Case 1: λ − γ ≤ ∑_{j∈S} aj(C+) + ∑_{j∈T} aj(L−) + al(L− ∪ C−). Then
f(al(L− ∪ C−)) = b + ū(C−) + γ − [ū(C+) − ∑_{j∈S} aj(C+) +
∑_{j∈S} (aj(C+) − λ + γ) − ∑_{j∈T} (λ − γ)]
= (|S ∪ T| − 1)(λ − γ).
Case 2: λ − γ > ∑_{j∈S} aj(C+) + ∑_{j∈T} aj(L−) + al(L− ∪ C−). Then
f(al(L− ∪ C−)) = b + ū(C−) + γ − [b + ū(C−) + γ + ∑_{j∈T} aj(L−) +
al(L− ∪ C−) + ∑_{j∈S} (aj(C+) − λ + γ) − ∑_{j∈T} (λ − γ)]
= |S ∪ T|(λ − γ) − ∑_{j∈S} aj(C+) − ∑_{j∈T} aj(L−) − al(L− ∪ C−).
The lifting problem has an optimal solution such that yi = 0 for all i ∈
(M+ \ C+) ∪ K, xj = 1 for all j ∈ N(C+) \ F such that aj(C+) ≤ λ − γ, xj = 0
for all j ∈ N(L− ∪ C−) ∩ F such that aj(L− ∪ C−) ≤ λ − γ, and xj = 0 for all
j ∈ N(L−) \ F such that aj(L−) ≤ λ − γ. Let (x̄, ȳ) be such an optimal solution.
Let R = {j ∈ N(L− ∪ C−) ∩ F : x̄j = 1}, S = {j ∈ N(C+) \ F : x̄j = 0}, and
T = {j ∈ N(L−) \ F : x̄j = 1}. Again, there are two cases when determining
the value of g(al(C+)), depending on how ∑_{i∈C+} yi is bounded in an optimal
solution.
Case 1: λ − γ ≤ ∑_{j∈S} aj(C+) + ∑_{j∈R} aj(L− ∪ C−) + ∑_{j∈T} aj(L−) − al(C+). Then
g(al(C+)) = (|R ∪ S ∪ T| − 1)(λ − γ) − al(C+).
Case 2: λ − γ > ∑_{j∈S} aj(C+) + ∑_{j∈R} aj(L− ∪ C−) + ∑_{j∈T} aj(L−) − al(C+). Then
g(al(C+)) = b + ū(C−) + γ − [b + ū(C−) + γ + ∑_{j∈R} aj(L− ∪ C−) +
∑_{j∈T} aj(L−) + ∑_{j∈S} (aj(C+) − λ + γ) − ∑_{j∈R∪T} (λ − γ)]
= |R ∪ S ∪ T|(λ − γ) − ∑_{j∈S} aj(C+) − ∑_{j∈R} aj(L− ∪ C−) − ∑_{j∈T} aj(L−).
Now, let
vj = aj(C+) if j ∈ N(C+) \ F;  vj = aj(L− ∪ C−) if j ∈ N(L− ∪ C−) ∩ F;
vj = aj(L−) if j ∈ N(L−) \ F,
x′j = 1 − xj if j ∈ N(C+) \ F;  x′j = xj if j ∈ N(L−) ∪ (N(C−) ∩ F),
and let {j1, j2, . . . , jr} = {j ∈ (N(C+) \ F) ∪ N(L−) ∪ (N(C−) ∩ F) : vj > λ − γ}
be ordered such that vjk ≥ vjk+1 for k = 1, 2, . . . , r − 1. We also define the partial
sums w0 = 0, wk = ∑_{i=1}^{k} vji for k = 1, 2, . . . , r.
It is not hard to show that there is a monotone optimal solution to the
lifting problem. That is, there exists an optimal solution such that x̄′jk ≥ x̄′jk+1
for k = 1, 2, . . . , r − 1. Therefore g(al(C+)) can be expressed in closed form as
follows:
g(al(C+)) = k(λ − γ) − al(C+)  if wk < al(C+) ≤ wk+1 − λ + γ, k = 0, 1, . . . , r − 1;
g(al(C+)) = k(λ − γ) − wk  if wk − λ + γ < al(C+) ≤ wk, k = 1, 2, . . . , r;
g(al(C+)) = r(λ − γ) − wr  if wr < al(C+).
It can be shown that g is superadditive on IR− , which implies that the lifting
function g remains unchanged as the projected variables in N (C + ) ∩ F are
introduced to inequality (8) sequentially [7,13]. Hence we have the following
result.
with
αj = k(λ − γ) − aj(C+)  if wk < aj(C+) ≤ wk+1 − λ + γ, k = 0, 1, . . . , r − 1;
αj = k(λ − γ) − wk  if wk − λ + γ < aj(C+) ≤ wk, k = 1, 2, . . . , r;
αj = r(λ − γ) − wr  if wr < aj(C+),
is valid for P.
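The closed form for g is straightforward to evaluate from the partial sums wk. The following sketch is our own illustration (the function name is an assumption; delta stands for λ − γ > 0, and vs for the values vj defined above, of which only those exceeding delta enter the ordered list j1, . . . , jr):

```python
def lifting_g(v, vs, delta):
    """Evaluate the closed-form lifting function at v, with delta = lam - gamma."""
    big = sorted((t for t in vs if t > delta), reverse=True)  # v_{j_1} >= ... >= v_{j_r}
    w = [0.0]
    for t in big:
        w.append(w[-1] + t)  # partial sums w_0 = 0, w_k = v_{j_1} + ... + v_{j_k}
    r = len(big)
    for k in range(r):  # first case: w_k < v <= w_{k+1} - delta
        if w[k] < v <= w[k + 1] - delta:
            return k * delta - v
    for k in range(1, r + 1):  # second case: w_k - delta < v <= w_k
        if w[k] - delta < v <= w[k]:
            return k * delta - w[k]
    return r * delta - w[r]  # third case: v > w_r
```

For vs = (5, 4, 3) and λ − γ = 2 the partial sums are w = (0, 5, 9, 12), and, for instance, g(6) = 1·2 − 6 = −4 by the first case with k = 1.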
Then from Section 2.2 we have the following valid additive flow packing inequal-
ity for PF:
∑_{i∈C+∪L+} yi − ∑_{j∈N(L+)\F} min{aj(L+), µ − γ} xj +
∑_{j∈N(C−)\F} (aj(C−) − µ + γ)+ (1 − xj) − ∑_{i∈K} yi ≤ ū(C+) + γ. (10)
We assume that the conditions of Proposition 5 are satisfied and hence (10)
is facet-defining for conv(PF ). To introduce the variables in F into inequality
(10), we lift (10) in two phases. First we lift the inequality with the variables in
N (C + ∪ L+ ) ∩ F . Then in the second phase we lift the resulting inequality with
variables in N (C − ) ∩ F . When employing this two phase lifting procedure, for
convenience, we assume that
(A.7) (N (L+ ∪ C + ) ∩ F ) ∩ (N (C − ) ∩ F ) = ∅.
The lifting of inequality (10) proceeds similarly to the lifting of inequality (7).
Therefore, we only give the final result here.
with
αj = k(µ − γ) − aj(C−)  if wk < aj(C−) ≤ wk+1 − µ + γ, k = 0, 1, . . . , r − 1;
αj = k(µ − γ) − wk  if wk − µ + γ < aj(C−) ≤ wk, k = 1, 2, . . . , r;
αj = r(µ − γ) − wr  if wr < aj(C−),
is valid for P.
3 Computational Results
In this section, we present our computational results on solving network ex-
pansion problems with a branch-and-cut algorithm. We implemented heuristic
separation algorithms for the generalized additive flow cover and flow packing
inequalities for the single node relaxation of the problem. We also used the lifted
cover inequalities [6] for surrogate 0-1 knapsack relaxations of the single node
relaxation, where the continuous flow variables are replaced with either their
0-1 additive variable upper bound variables or with their lower bounds. The
branch-and-cut algorithm was implemented with MINTO [8] (version 3.0) using
CPLEX (version 6.0) as the LP solver. All of the experiments were performed on a
SUN Ultra 10 workstation with a one hour CPU time limit and a search tree
limit of 100,000 nodes.
We present a summary of two experiments. The first experiment is performed
to test the effectiveness of the cuts in solving a set of randomly generated net-
work expansion problems with 20 vertices and 70% edge density. The instances
were solved using MINTO first with its default settings and then with the above
mentioned cutting planes generated throughout the search tree. In Table 1, we
report the number of AVUB variables per flow variable (avubs) and the average
values for the LP relaxation at the root node of the search tree (zroot), the
best lower bound (zlb) and the best upper bound (zub) on the optimal value at
termination, the percentage gap between zlb and zub (endgap), the number of
generalized additive flow cover cuts (gafcov), generalized additive flow packing
cuts (gafpack), surrogate knapsack cover cuts (skcov) added, the number of
nodes evaluated (nodes), and the CPU time elapsed in seconds (time) for five
random instances. While none of the problems could be solved to optimality
without adding the cuts within 100,000 nodes, all of the problems were solved
easily when the cuts were added. We note that MINTO does not generate any
flow cover inequalities for these problems, since it does not recognize that
additive variable upper bounds can be relaxed to simple variable upper bounds.
Observe that the addition of the cuts improves the lower bounds as well as the
upper bounds significantly, which leads to much smaller search trees and overall
solution times. Table 1 clearly shows the effectiveness of the cuts.
In the second experiment, we report the initial and root LP relaxation gaps
(initgap, rootgap) in addition to endgap, gafcov, gafpack, skcov, nodes, and
time for five random instances with 50, 100, and 150 vertices. We note that these problems are much
larger than ones for which computations are provided in the literature [5,11].
Although all of the instances with 50 vertices could be solved to optimality, for
the larger instances the gap between the best lower bound and the best upper
bound could not be completely closed for most of the problems with 4 or 8 avubs
within an hour of CPU time. Nevertheless, the improvement in LP relaxations
is significant, ranging between 50% and 98%.
vertices avubs initgap rootgap endgap gafcov gafpack skcov  nodes  time
   50      1    25.09    1.05   0.00     48      56    16     24      2
   50      2    20.12    2.20   0.00    115     102    38    110     11
   50      4    36.34    3.31   0.00    192     139   106    721     49
   50      8    92.06    3.72   0.00    416     216   164   7712    960
  100      1    14.80    0.94   0.00    182     148    41     44     39
  100      2    12.27    4.44   0.00    590     324   191   1236    961
  100      4    38.04    3.74   2.37    707     382   448   2437   2171
  100      8    92.75    9.15   8.75    828     408   509   1717   3600
  150      1    16.56    0.29   0.00    480     307    57    346    586
  150      2    10.81    5.39   4.34    586     395   318    813   2965
  150      4    43.69   12.22  12.22    862     527   672    711   3600
  150      8    93.08   15.02  15.02    659     449   503    477   3600
For most of the unsolved problems, the best lower bound and the best upper
bound were found at the root node in a few minutes; no improvement in the gap
was observed later in the search tree. For instance nexp.100.8.5, the value of
the initial LP relaxation was 18.23. After adding 916 cuts in 42 rounds the root
LP relaxation improved to 220.81, which was in fact the best lower bound found
in the search tree, in 617 seconds. The best upper bound 259 was again found at
the root node by a simple heuristic which installs the least cost integral capacity
feasible for the flow on each edge provided by the LP relaxation. Therefore, for
the unsolved problems it is likely that the actual duality gaps of the improved
LP relaxations are much smaller.
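The per-edge rounding step of this heuristic can be sketched as follows; the module sizes, costs, and the brute-force subset search below are our own illustrative assumptions, not the authors' implementation:

```python
from itertools import combinations

def least_cost_cover(flow, modules):
    """Cheapest subset of 0-1 capacity modules (capacity, cost) whose
    total capacity covers the given flow value; brute force, small lists only."""
    best_cost, best_subset = None, None
    for r in range(len(modules) + 1):
        for subset in combinations(modules, r):
            if sum(cap for cap, _ in subset) >= flow:
                cost = sum(c for _, c in subset)
                if best_cost is None or cost < best_cost:
                    best_cost, best_subset = cost, subset
    return best_cost, best_subset

# Cover an LP flow of 2.5 with modules of capacity 1, 2, and 4.
cost, chosen = least_cost_cover(2.5, [(1, 3), (2, 5), (4, 8)])
print(cost)  # 8
```

Applying this independently on every edge to the LP flows yields a feasible integral capacity installation, i.e., an upper bound.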
More detailed experiments to compare the relative effectiveness of the differ-
ent classes of cuts revealed that the generalized additive flow cover inequalities
were the most effective, and that the lifted surrogate knapsack inequalities were
more effective than the generalized additive flow packing inequalities. However,
the use of all three classes of cuts delivered the best performance in most cases.
From these computational results, we conclude that the valid inequalities derived
from the single node relaxations are very effective in improving the LP bounds
for network design/expansion problems.
References
1. K. Aardal, Y. Pochet, and L. A. Wolsey. Capacitated facility location: Valid
inequalities and facets. Mathematics of Operations Research, 20:562–582, 1995.
2. A. Atamtürk. Conflict graphs and flow models for mixed-integer linear optimization
problems. PhD thesis, ISyE, Georgia Institute of Technology, Atlanta, USA, 1998.
3. A. Atamtürk. Flow packing facets of the single node fixed-charge flow polytope.
Technical report, IEOR, University of California at Berkeley, 1998.
4. I. Barany, T. J. Van Roy, and L. A. Wolsey. Uncapacitated lot sizing: The convex
hull of solutions. Mathematical Programming Study, 22:32–43, 1984.
5. D. Bienstock and O. Günlük. Capacitated network design - Polyhedral structure
and computation. INFORMS Journal on Computing, 8:243–259, 1996.
6. Z. Gu, G. L. Nemhauser, and M. W. P. Savelsbergh. Lifted cover inequalities
for 0-1 integer programs: Computation. Technical Report LEC-94-9, Georgia
Institute of Technology, Atlanta GA, 1994. (to appear in INFORMS Journal on
Computing).
7. Z. Gu, G. L. Nemhauser, and M. W. P. Savelsbergh. Sequence independent lifting.
Technical Report LEC-95-08, Georgia Institute of Technology, Atlanta, 1995.
8. G. L. Nemhauser, M. W. P. Savelsbergh, and G. S. Sigismondi. MINTO, a Mixed
INTeger Optimizer. Operations Research Letters, 15:47–58, 1994.
9. M. W. Padberg, T. J. Van Roy, and L. A. Wolsey. Valid linear inequalities for
fixed charge problems. Operations Research, 32:842–861, 1984.
10. Y. Pochet. Valid inequalities and separation for capacitated economic lot sizing.
Operations Research Letters, 7:109–115, 1988.
11. M. Stoer and G. Dahl. A polyhedral approach to multicommodity survivable
network design. Numerische Mathematik, 68:149–167, 1994.
12. T. J. Van Roy and L. A. Wolsey. Valid inequalities for mixed 0-1 programs. Discrete
Applied Mathematics, 14:199–213, 1986.
13. L. A. Wolsey. Valid inequalities and superadditivity for 0/1 integer programs.
Mathematics of Operations Research, 2:66–77, 1977.
A Min-Max Theorem on Feedback Vertex Sets
(Preliminary Version)
Key words. feedback vertex set, bipartite tournament, totally dual integrality,
min-max relation, approximation algorithm.
1 Introduction
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 73–86, 1999.
© Springer-Verlag Berlin Heidelberg 1999
74 Mao-cheng Cai, Xiaotie Deng, and Wenan Zang
[Figure 1: Two Forbidden Bipartite Subtournaments: F1 on vertices p1, . . . , p6 (left) and F2 on vertices p1, . . . , p7 (right)]
that the packing problem has an integral optimum solution for any nonnegative
integral weight function w on vertices. In Section 4, we establish a TDI system
associated with the cycle-covering problem, which yields an integral optimum
solution to the LP relaxation for the minimum cycle-covering problem on these
bipartite tournaments. As a result, we obtain a min-max theorem—the cycle-
packing number equals the cycle-covering number for any bipartite tournament
with no F1 nor F2 . In addition, we present strongly polynomial time algorithms
for the cycle-packing and cycle-covering problems. In Section 5, we show the NP-
completeness of the feedback set problem on general bipartite tournaments. Thus
it is natural to consider the approximation problem. Clearly, a 4-approximation
algorithm for this problem can be obtained by the primal-dual method [5]. Based
on the TDI system, we shall be able to improve the ratio from 4 to 3.5. This
exhibits another example of applying the duality structures of linear programs
to approximation algorithms. In Section 6, we conclude this paper with remarks
and discussion.
2 Preliminaries
Proof. Let us reserve the symbol w for a vertex in T with minimum indegree
throughout the proof. For i = 0, 1, 2, . . ., define
Let us now extend the partial order ≺ to a linear order on the whole Vi as
follows: for any two incomparable (according to ≺) vertices u, v ∈ Vi , assign an
arbitrary order between u and v. Then ≺ is the desired linear order.
(2.8) For any arc (u, v) ∈ E, we have (u, x) ∈ E whenever v ≺ x and
(x, v) ∈ E whenever x ≺ u.
Otherwise, we have path xuv, implying x ≺ v in the former case and have
path uvx, implying u ≺ x in the latter, a contradiction.
The proof is complete. □
Vi,h2 ∪ (∪_{j=h+1}^{ℓi} Vi,j), there is a directed path xvy. So x ≺ y and thus the original
order of Pi is preserved after the replacement of Vi,h .) We then scan the next
vertex in Xi and repeat the process until no vertex is unscanned.
Since each v ∈ Xi can be scanned in time O(|V|), the total time complexity
is Σ_{i=2}^{k} O(|V||Xi|) + O(|V|²) = O(|V|²). □
Then we have
(3.5) P ∧ P 0 , P ∨ P 0 ∈ P4 and P ∧ P 0 ≺ P ≺ P 0 ≺ P ∨ P 0 .
Indeed, for each subscript h between s + 1 and s + 3, if neither P nor P 0
contains (vh , vh−1 ), then vh ≺ vh0 and (vh0 , vh−1 ) is an arc of P or P 0 . Thus by
Lemma 2.1 (b), (vh , vh−1 ) ∈ E. It follows that P ∧P 0 ∈ P4 . Similarly, P ∨P 0 ∈ P4 .
Since u0j ∈ P ∧ P 0 and ui ∈ P ∨ P 0 , we have P ∧ P 0 ≺ P ≺ P 0 ≺ P ∨ P 0 .
Proof. Assume the contrary: ȳ^T e_m > ŷ^T e_m for any optimum solution ȳ to (1).
For convenience, choose ȳ to be the lexicographically largest among all the op-
timum solutions to (1). Since ŷ is the lexicographically largest, there exists a
directed path Q∗ = v4 v3 v2 v1 such that ŷ(Q∗ ) > ȳ(Q∗ ) and ŷ(Q) = ȳ(Q) for all
Q ∈ P4 with Q ≺ Q∗ . Let i be the subscript with vj ∈ Vi−1+j for each 1 ≤ j ≤ 4
hereafter.
(3.6) Set R = {Q ∈ P4 | ȳ(Q) > 0, Q ≻ Q∗, V(Q) ∩ V(Q∗) ≠ ∅}. Then R ≠ ∅.
Otherwise, let ỹ be the vector obtained from ȳ by replacing ȳ(Q∗) with ŷ(Q∗).
Then ỹ^T H ≤ w^T and ỹ^T e_m > ȳ^T e_m, a contradiction.
(3.7) No Q ∈ R contains any vertex in {v ∈ Vi−1+j | v ≺ vj} for any 1 ≤ j ≤ 4.
Assume the contrary: some Q′ ∈ R contains a vertex v∗ ∈ Vi−1+j with
v∗ ≺ vj. Then Q′ and Q∗ form a crossing pair (recall (3.4)). By (3.5), Q′ ∧ Q∗ ∈ P4.
Otherwise, define

    ỹ(Q) = ȳ(Q) + δ   if Q = Q∗,
    ỹ(Q) = ȳ(Q) − δ   if Q = Q′,
    ỹ(Q) = ȳ(Q)       otherwise,

where δ = min{ȳ(Q′), ŷ(Q∗) − ȳ(Q∗)}. It is easy to see that ỹ is also an optimum
solution to (1). Since ȳ ≺ ỹ, we reach a contradiction. Hence
(3.9) There exist two vertices vh, vj ∈ Q∗ and two paths R, R′ ∈ R such that
vh ∈ R, vj ∉ R, vj ∈ R′, vh ∉ R′.
Let us show that R and R′ are crossing. Since Q∗ ≺ R and Q∗ ≺ R′, neither
R nor R′ contains any vertex in Vr for any r < i. Without loss of generality, we
may assume h < j. Then R must contain a vertex u ∈ Vi−1+j. From (3.7) and
vj ∈ Vi−1+j, it follows that vj ≺ u. Thus R′ ≺ R, for otherwise they are crossing.
Similarly, R′ must contain a vertex u′ ∈ Vi−1+h. By (3.7), we have vh ≺ u′,
implying that R and R′ are crossing. Now set δ = min{ȳ(R), ȳ(R′)} and define

    ỹ(Q) = ȳ(Q) + δ   if Q = R ∧ R′ or Q = R ∨ R′,
    ỹ(Q) = ȳ(Q) − δ   if Q = R or Q = R′,
    ỹ(Q) = ȳ(Q)       otherwise.
Then it is easy to verify that ỹ ≥ 0, ỹ^T H = ȳ^T H, and ỹ^T e_m = ȳ^T e_m. Hence ỹ
is an optimum solution to (1) with ȳ ≺ ỹ, contradicting the choice of ȳ.
This completes the proof. □
Lemma 3.2 In addition to the hypothesis of Lemma 3.1, if the weight w is
integral, then the lexicographically largest packing ȳ ∈ {y ∈ R^m_+ | y^T H ≤ w^T} is
integral.
Proof. According to the statement of Lemma 3.1, the lexicographically largest
packing is optimum to (1). Based on this observation, we can come up with a
greedy algorithmic proof of the present statement as follows.
At the current step, let i ≤ k − 3 be the smallest subscript with Vi ≠ ∅
and let v∗_j be the smallest vertex in Vj with respect to the linear order ≺ as
defined in Lemma 2.1. If (v∗_{j+1}, v∗_j) ∉ E for j = i, i + 1, or i + 2, then remove
v∗_j from D; else, set Q∗ = v∗_{i+3} v∗_{i+2} v∗_{i+1} v∗_i, y(Q∗) = min{w(v∗_j) | i ≤ j ≤ i + 3},
w(v∗_j) := w(v∗_j) − y(Q∗) for i ≤ j ≤ i + 3, and remove all the vertices v∗_j with
is finite for every integral vector c for which the maximum is finite, then the
minimum has an integral optimum solution. Based on this theorem, Theorem
4.1 and Theorem 3.1, we can instantly establish the following min-max result.
The min-max relation together with the above minimum cycle covering algorithm
leads to a 3.5-approximation algorithm for the feedback vertex set problem on
general bipartite tournaments, which relies on “eliminating” the problematic
subdigraphs, F1 and F2 , from T .
Given a bipartite tournament T = (V, E) such that each vertex v ∈ V is
associated with a positive integer w(v), by Lemma 2.2 we can find an F1 or F2,
or a partition {V1 , V2 , . . . , Vk } of V as described in Lemma 2.1 in time O(|V |2 ).
Set C 0 = ∅. If an Fj , where j = 1 or 2, is output, then set δ = min{w(v) | v ∈
V (Fj )}, w(v) = w(v) − δ for all v ∈ V (Fj ), C0 = {v ∈ V (Fj ) | w(v) = 0}, C 0 =
C 0 ∪ C0 and T = T − C0 ; else, construct the digraph D as described in (3.1)
and apply the minimum P4 -covering algorithm to D to get a minimum P4 -
covering C 00 for D. Then C 0 ∪ C 00 is a C4 -covering of T . It is easy to see that the
performance guarantee of the algorithm is 3.5.
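The weight-reduction loop described above can be sketched as follows; `find_forbidden` is a hypothetical stand-in for the O(|V|²) routine of Lemma 2.2 supplied by the caller, and the whole sketch is illustrative rather than the authors' code:

```python
def eliminate_forbidden(vertices, w, find_forbidden):
    """Local-ratio style weight reduction: while a forbidden F1/F2 remains,
    subtract its minimum weight delta, move zero-weight vertices into the
    partial cover C', and delete them from the tournament."""
    vertices, w = set(vertices), dict(w)
    c_prime = set()
    while True:
        forb = find_forbidden(vertices)  # vertex set of an F1 or F2, or None
        if forb is None:
            return c_prime, vertices, w
        delta = min(w[v] for v in forb)
        for v in forb:
            w[v] -= delta
        zeros = {v for v in forb if w[v] == 0}
        c_prime |= zeros
        vertices -= zeros

# Toy run: pretend {1, 2, 3} spans a forbidden subtournament while present.
stub = lambda vs: {1, 2, 3} if {1, 2, 3} <= vs else None
cover, rest, _ = eliminate_forbidden(range(1, 6), {v: v for v in range(1, 6)}, stub)
print(cover, rest)  # {1} {2, 3, 4, 5}
```

Once no forbidden subtournament remains, the minimum P4-covering algorithm is applied to the residual tournament, as described in the text.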
On the other hand, the problem is NP-complete in general.
Theorem 5.1 The feedback vertex set problem (given c > 0, decide whether there
is a feedback vertex set of size at most c) on bipartite tournaments is NP-complete
and approximable within 3.5.
Proof. The approximation ratio follows from the above argument. Let us show
the NP-completeness. Obviously, the problem is in NP. To prove the assertion,
it suffices to reduce the 3-SATISFIABILITY problem (3SAT ) to the feedback
vertex set problem on bipartite tournaments. Let U = {u1 , u2 , . . . , un } be the
set of variables and let C = {c1 , c2 , . . . , cm } be the set of clauses in an arbitrary
instance of 3SAT . We aim to construct a bipartite tournament T = (V, E) such
that T has a feedback vertex set of size n + 2m if and only if C is satisfiable.
The construction consists of several components: truth-setting components,
satisfaction testing components, and membership components, which are aug-
mented by some additional arcs so that the resulting digraph is a bipartite
tournament.
V∗ = ∪_{i=1}^{n} Vi,  V′ = ∪_{j=1}^{m} V′_j,  V̂ = ∪_{j=1}^{m} ∪_{i=1}^{3} V̂^i_j.
• Finally, for each clause Cj and each literal z^i_j ∈ Cj, reverse the arc (z^i_j, x^i_j),
that is, if z^i_j = uk for some k, then replace (uk, x^i_j) by (x^i_j, uk); if z^i_j = ūk for
some k, then replace (ūk, x^i_j) by (x^i_j, ūk).
The construction is completed. It is easy to see that the construction can be
accomplished in polynomial time and the resulting digraph is a bipartite tour-
nament with 12m + 4n vertices.
Let us show that T has a feedback vertex set of size n + 2m if and only if
C is satisfiable. Our proof heavily relies on the following observation: if B is a
6 Concluding Remarks
We generalize the approach developed in our previous work [1] to the feed-
back vertex set problem on bipartite tournaments (which is shown to be NP-
complete). The new structural characterization given here is of independent interest,
and the proof is much simplified. The TDI characterization yields a 3.5-approximation
algorithm for the feedback vertex set problem on general bipartite tournaments
when combined with the subgraph removal technique. We are still interested in
knowing whether this method of applying TDI characterizations can be extended
to a wider range of combinatorial optimization problems and would like to pursue
this direction further.
References
1. M. Cai, X. Deng, and W. Zang, A TDI System and Its Application to Approxima-
tion Algorithm, Proc. 39th IEEE Symposium on Foundations of Computer Science,
Palo Alto, 1998, pp. 227-231.
1 Introduction
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 87–98, 1999.
© Springer-Verlag Berlin Heidelberg 1999
88 Alberto Caprara, Matteo Fischetti, and Adam N. Letchford
Problem (TSP) and led to the very successful branch-and-cut approach intro-
duced by Padberg and Rinaldi [24]. Most of the known methods have been
originally proposed for the TSP, a prototype in combinatorial optimization and
integer programming.
In spite of the large research effort, however, polynomial-time exact separa-
tion procedures are known for only a few classes of facet-defining TSP cuts. In
particular, no efficient separation procedure is known at present for the famous
class of comb inequalities [19]. The only exact method is due to Carr [7], and
requires O(n^{2t+3}) time for separation of comb inequalities with t teeth on a
graph of n nodes. Recently, Letchford [21] proposed an O(|V |3 )-time separation
procedure for a superclass of comb inequalities, applicable when the fractional
point to be separated has a planar support.
Applegate, Bixby, Chvátal and Cook [1] recently suggested concentrating on
maximally violated combs, i.e., on comb inequalities which are violated by 1/2
by the given fractional point x∗ to be separated. This is motivated by the fact
that maximally violated combs exhibit a very strong combinatorial structure,
which can be exploited for separation. Their approach is heuristic in nature,
and is based on the solution of a suitably-defined system of mod-2 congru-
ences. Following this approach, Fleischer and Tardos [16] were able to design
an O(|V |2 log |V |)-time exact separation procedure for maximally violated comb
inequalities for the case where the support graph G∗ = (V, E ∗ ) of the fractional
point x∗ is planar.
It is well known that comb inequalities can be obtained by adding-up and
rounding a convenient set of TSP degree equations and subtour elimination con-
straints weighted by 1/2, i.e., they are {0, 1/2}-cuts in the terminology of Caprara
and Fischetti [6]. These authors studied {0, 1/2}-cuts in the context of general
ILP’s. They showed that the associated separation problem is equivalent to
the problem of finding a minimum-weight member of a binary clutter, i.e., a
minimum-weight {0, 1}-vector satisfying a certain set of mod-2 congruences.
This problem is NP-hard in general, as it subsumes the max-cut problem as
a special case.
In this paper we address the separation of Chvátal rank-1 inequalities in the
context of general ILP’s of the form min{cT x : Ax ≤ b, x integer}, where A is
an m × n integer matrix and b an m-dimensional integer vector. In particular,
for any given integer k we study mod-k cuts of the form λ^T Ax ≤ ⌊λ^T b⌋ for any
λ ∈ {0, 1/k, . . . , (k − 1)/k}^m such that λ^T A is integer. We show that, for any
given k, separation of maximally violated mod-k cuts requires O(mn min{m, n})
time as it is equivalent to finding a {0, 1, . . . , k − 1}-vector satisfying a certain
set of mod-k congruences. We also discuss the separation of maximally violated
mod-k cuts in the context of the TSP. In particular, we show how to separate
efficiently maximally violated members of a family of cuts that properly contains
comb inequalities. Interestingly, this family contains facet-inducing cuts which
are not comb inequalities. We also show how to reduce from O(|V |2 ) to O(|V |)
the number of tight constraints to be considered in the mod-k congruence system,
where |V | is the number of nodes of the underlying graph. We investigate specific
On the Separation of Maximally Violated mod-k Cuts 89
classes of (sometimes new) mod-k facet-defining cuts for the TSP and then give
some concluding comments.
Following [6], this problem can equivalently be restated in terms of the integer
multiplier vector µ := kλ ∈ {0, 1, . . . , k − 1}^m. For any given z ∈ Z and k ∈ Z+,
let z mod k := z − ⌊z/k⌋k. As is customary, notation a ≡ b (mod k) stands
for a mod k = b mod k. Given an integer matrix Q = (qij) and k ∈ Z+, let
Q̄ = (q̄ij) := Q mod k denote the mod-k support of Q, where q̄ij := qij mod k for
all i, j. Then, mod-k SEP is equivalent to the following optimization problem.
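In code, the floor-based definition z mod k := z − ⌊z/k⌋k (which lands in {0, . . . , k − 1} even for negative z) and the entrywise mod-k support can be sketched as follows; an illustrative sketch, not from the paper:

```python
def mod_k(z: int, k: int) -> int:
    """z mod k := z - floor(z/k) * k, always in {0, ..., k-1}."""
    return z - (z // k) * k  # Python's // is floor division

def mod_k_support(Q, k):
    """Entrywise mod-k support of an integer matrix (list of rows)."""
    return [[mod_k(q, k) for q in row] for row in Q]

print(mod_k(-7, 3))                          # 2, since -7 - (-3)*3 = 2
print(mod_k_support([[5, -4], [9, 0]], 3))   # [[2, 2], [0, 0]]
```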
By construction, (s∗^T µ − θ)/k gives the slack of the mod-k cut λ^T Ax ≤ ⌊λ^T b⌋
for λ := µ/k, computed with respect to the given point x∗ . Hence, there exists a
mod-k cut violated by x∗ if and only if the minimum δ ∗ in (1) is strictly less than
0. Observe that s∗ ≥ 0 and θ ≤ k − 1 imply δ ∗ ≥ 1 − k, i.e., no mod-k cut can be
violated by more than (k − 1)/k. This bound is attained for θ = k − 1, when the
mod-k congruence system (2)–(4) has a solution µ with µi = 0 whenever s∗i > 0.
In this case, the resulting mod-k cut is said to be maximally violated.
Even for k = 2, mod-k SEP is NP-hard as it is equivalent to finding a
minimum-weight member of a binary clutter [6]. However, finding a maximally
violated mod-k cut amounts to finding any feasible solution of the congruence
system (2)–(4) after having fixed θ = k − 1 and having removed all the rows of
(A, b) associated with a strictly positive slack s∗i . For any k prime this solution,
if any exists, can be found in O(mn min{m, n}) time by standard Gaussian
elimination in GF (k).
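A minimal sketch of this elimination step for prime k, applied to a generic system Ax ≡ d (mod k); the matrices in the example are our own illustration, not the separation system itself, which would be assembled from the zero-slack rows with right-hand side θ = k − 1:

```python
def solve_mod_p(A, d, p):
    """Find one solution of A x ≡ d (mod p), p prime, by Gauss-Jordan
    elimination over GF(p); returns a list, or None if inconsistent."""
    m, n = len(A), len(A[0])
    M = [[a % p for a in row] + [b % p] for row, b in zip(A, d)]  # augmented
    pivots, r = [], 0
    for c in range(n):
        pr = next((i for i in range(r, m) if M[i][c] != 0), None)
        if pr is None:
            continue  # no pivot in this column; variable stays free
        M[r], M[pr] = M[pr], M[r]
        inv = pow(M[r][c], p - 2, p)  # inverse via Fermat, needs p prime
        M[r] = [(v * inv) % p for v in M[r]]
        for i in range(m):
            if i != r and M[i][c]:
                f = M[i][c]
                M[i] = [(vi - f * vr) % p for vi, vr in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == m:
            break
    if any(M[i][n] != 0 for i in range(r, m)):
        return None  # a zero row with nonzero right-hand side
    x = [0] * n  # free variables set to 0
    for row_idx, c in enumerate(pivots):
        x[c] = M[row_idx][n]
    return x

print(solve_mod_p([[1, 2], [3, 4]], [1, 0], 5))   # [3, 4]
print(solve_mod_p([[1, 1], [2, 2]], [1, 1], 3))   # None
```

The running time is the stated O(mn min{m, n}) bound for dense systems.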
For k nonprime GF (k) is not a field, hence Gaussian elimination cannot be
performed. On the other hand, there exists an O(mn min{m, n})-time algorithm
to find, if any, a solution of the mod-k congruence system (2)–(4) even for k
nonprime, provided a prime factorization of k is known; see, e.g., Cohen [11].
The above considerations lead to the following result.
Theorem 1. For any given k, maximally violated mod-k cuts can be found in
O(mn min{m, n}) time, provided a prime factorization of k is known.
It is worth noting that mod-k SEP with µi = 0 whenever s∗i > 0 can be solved
efficiently by fixing θ to any value in {1, . . . , k − 1}. We call the corresponding
solutions of (2)–(5) totally tight mod-k cuts. The following theorem shows that,
for k prime, the existence of a totally tight mod-k cut implies the existence of a
maximally violated mod-k cut.
Theorem 2. For any k prime, a maximally violated mod-k cut exists if and
only if a totally tight mod-k cut exists.
Proof. One direction is trivial, as a maximally violated mod-k cut is also a totally
tight mod-k cut. Assume now that a totally tight mod-k cut exists, associated
with a vector (µ, θ) satisfying (2)–(5) and such that µi = 0 for all s∗i > 0. If
θ 6= k − 1 and k is prime, µ can always be scaled by a factor w ∈ {2, . . . , k − 1}
such that A^T (wµ) ≡ 0 (mod k) and b^T (wµ) ≡ k − 1 (mod k).
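The scaling factor w in this proof step can be written down explicitly for prime k as w := (k − 1)·θ^{−1} mod k, so that wθ ≡ k − 1 (mod k) while the homogeneous congruences are preserved. A small illustrative sketch, not from the paper:

```python
def scale_factor(theta, k):
    """For prime k and theta in {1, ..., k-1}: w with w*theta ≡ k-1 (mod k)."""
    return ((k - 1) * pow(theta, k - 2, k)) % k  # Fermat inverse of theta

w = scale_factor(2, 5)
print(w, (w * 2) % 5)  # 2 4
```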
violated whenever the one associated with µ is. Hence one is motivated in finding
maximally violated mod-k cuts which are associated with minimal solutions. This
can be done with no extra computational effort for k prime since, for any fixed
θ, all basic solutions to (2)–(4) are minimal by construction. Unfortunately, the
algorithm for k nonprime does not guarantee finding a minimal solution. On the
other hand, the following result holds.
Theorem 3. If there exists a maximally violated mod-k cut for some k nonprime,
then a maximally violated mod-ℓ cut also exists for every ℓ which is a prime
factor of k.
Proof. First of all, observe that Qy ≡ d (mod k) implies Qy ≡ d (mod ℓ) for each
prime factor ℓ of k. Hence, given a solution (µ, θ) of (2)–(5) with θ = k − 1, the
vector (µ, θ) mod ℓ yields a totally tight mod-ℓ cut, as θ mod ℓ = (k − 1) mod ℓ ≠ 0.
The claim then follows from Theorem 2.
We next address the separation of maximally violated mod-k cuts that can
be obtained from (6)–(8). Given a point x∗ ∈ RE satisfying (6)–(8), we call tight
any node set S with x∗ (E(S)) = |S| − 1. It is well known that only O(|V |2 ) tight
sets exist, which can be represented by an O(|V |)-sized data structure called
cactus tree [13]. A cactus tree associated with x∗ can be found efficiently in
O(|E ∗ ||V | log(|V |2 /|E ∗ |)), where E ∗ := {e ∈ E : x∗e > 0} is the support of x∗ ;
see [15] and also [20]. Moreover, we next show that only O(|V |) tight sets need
be considered explicitly in the separation of maximally violated mod-k cuts.
[Figure 1: a fractional point x∗ arranged as a necklace with beads B1, . . . , B6; the drawn edges take values 1 and 1/2]
Applegate, Bixby, Chvátal and Cook [1] and Fleischer and Tardos [16] showed
that tight sets can be arranged in necklaces. A necklace of size q ≥ 3 is a partition
of V into a cyclic sequence of tight sets B1 , . . . , Bq called beads; see Figure 1 for
an illustration. To simplify notation, the subscripts in B1 , . . . , Bq are intended
modulo q, i.e., Bi = Bi+hq for all integer h. Beads in a necklace satisfy:
(i) Bi ∪ Bi+1 ∪ . . . ∪ Bi+t is a tight set for all i = 1, . . . , q and t = 0, . . . , q − 2,
(ii) x∗ (E(Bi : Bj )) is equal to 1 if j ∈ {i + 1, i − 1}, and 0 otherwise.
A pair (Bi , Bi+1 ) of consecutive beads in a necklace is called a domino. We
allow for degenerate necklaces with q = 2 beads, in which x∗ (E(B1 : B2 )) = 2.
Degenerate necklaces have no dominoes.
Given x∗ satisfying (6)–(8), one can find in time O(|E ∗ ||V | log(|V |2 /|E ∗ |)) a
family F (x∗ ) of O(|V |) necklaces with the property that every tight set is the
union of consecutive beads in a necklace of the family. The next theorem shows
that the columns in the congruence system (2)–(4) corresponding to tight SEC’s
are linearly dependent, in GF (k), on a set of columns associated with degree
equations, tight nonnegativity constraints, and tight SEC’s corresponding to
beads and dominoes in F (x∗ ).
Theorem 4. If any TSP mod-k cut is maximally violated by x∗ , then there
exists a maximally violated mod-k cut whose Chvátal-Gomory derivation uses
SEC’s associated with beads and dominoes (only) of necklaces of F (x∗ ).
Proof. Let S be any tight set whose SEC is used in the Chvátal-Gomory deriva-
tion of some maximally violated mod-k cut. By the properties of F (x∗ ), S is the
union of consecutive beads B1 , . . . , Bt of a certain necklace B1 , . . . , Bq in F (x∗ ),
1 ≤ t ≤ q − 1. If t ≤ 2, then S is either a bead or a domino, and there is nothing
to prove. Assume then t ≥ 3, as in Figure 2, and add together:
[Figure 2: a tight set S formed by consecutive beads B1, . . . , Bt of a necklace B1, . . . , Bq]
α0 := |S| + k|Bt−1 | − k − 1.
All the inequalities used in the combination are tight at x∗ . Moreover, all the
coefficients in αT x ≤ α0 are identical, modulo k, to the coefficients of the SEC
x(E(S)) ≤ |S| − 1. So we can use the inequalities in the derivation of αT x ≤ α0
in place of the original SEC to obtain a (different) maximally violated mod-k
cut. Applying this procedure recursively yields the result.
In this section we analyze specific classes of facet-defining mod-k cuts for the
Symmetric TSP. We also briefly mention some analogous results for the Asym-
metric TSP which will be presented in detail in the full paper.
We first address mod-2 cuts that can be obtained from (6)–(8). A well known
class of such cuts is that of comb inequalities, as introduced by Edmonds [14] in
the context of matching theory, and extended by Chvátal [10] and by Grötschel
and Padberg [17,18] for the TSP. Comb inequalities are defined as follows. We
are given a handle set H ⊂ V and t ≥ 3, t odd, tooth sets T1 , . . . , Tt ⊂ V such
that Ti ∩ H 6= ∅ and Ti \ H 6= ∅ hold for any i = 1, . . . , t. The comb inequality
associated with H, T1 , . . . , Tt reads:
x(E(H)) + Σ_{i=1}^{t} x(E(Ti)) ≤ |H| + Σ_{i=1}^{t} (|Ti| − 1) − (t + 1)/2.    (9)
The simplest case of comb inequalities arises for |Ti| = 2, i = 1, . . . , t, leading
to Edmonds' 2-matching constraints. It is well known that comb inequalities
define facets of the TSP polytope [19]. Also well known is that comb inequalities
are mod-2 cuts.
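To make (9) concrete, the following sketch (our own toy example, not from the paper) evaluates a 2-matching constraint at the classical fractional point built from a triangle of 1/2-valued edges with three pendant 1-valued edges; the constraint is violated by exactly 1/2, i.e., maximally:

```python
from itertools import combinations

def comb_violation(x, H, teeth):
    """LHS minus RHS of comb inequality (9); positive means violated.
    x maps frozenset({u, v}) -> fractional edge value (missing edges are 0)."""
    def weight(S):  # x(E(S)): total value of edges with both endpoints in S
        return sum(x.get(frozenset(e), 0.0) for e in combinations(sorted(S), 2))
    t = len(teeth)
    lhs = weight(H) + sum(weight(T) for T in teeth)
    rhs = len(H) + sum(len(T) - 1 for T in teeth) - (t + 1) / 2
    return lhs - rhs

# Handle {1, 2, 3}: a triangle of 1/2-edges; teeth {i, i+3}: pendant 1-edges.
x = {frozenset(e): 0.5 for e in [(1, 2), (2, 3), (1, 3)]}
x.update({frozenset((i, i + 3)): 1.0 for i in (1, 2, 3)})
H = {1, 2, 3}
teeth = [{1, 4}, {2, 5}, {3, 6}]
print(comb_violation(x, H, teeth))  # 0.5
```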
As already mentioned, no polynomial-time exact separation algorithm for
comb inequalities is known at present. A heuristic scheme for maximally violated
comb inequalities has been recently proposed by Applegate, Bixby, Chvátal and
Cook [1], and elaborated by Fleischer and Tardos [16] to give a polynomial-time
exact method for the case of x∗ with planar support. Here, comb separation is
viewed as the problem of “building-up” a comb structure starting with a given
set of dominoes. The interested reader is referred to [1] and [16] for a detailed
description of the method.
Fig. 3. (a) The support graph of a simple extended comb inequality; all the
drawn edges, as well as the edges in E(H), have coefficient 1. (b) A mod-2
derivation, obtained by combining the degree equations on the black nodes and
the SEC’s on the sets drawn in continuous line (the nonnegativity inequalities
used in the derivation are not indicated).
[Figure: support graphs (a) and (b) of two fractional points; the drawn edges take values 1, 1/2, and 1/8]
5 Concluding Remarks
References
1. D. Applegate, R. Bixby, V. Chvátal, W. Cook (1995). Finding cuts in the TSP (A
preliminary report). Technical Report Technical Report 95–05, DIMACS, Rutgers
University, New Brunswick, NJ.
2. E. Balas, S. Ceria, G. Cornuéjols (1993). A lift-and-project cutting plane algorithm
for mixed 0-1 programs. Math. Program. (A) 58, 295–324.
3. E. Balas, S. Ceria, G. Cornuéjols (1996). Mixed 0-1 programming by lift-and-
project in a branch-and-cut framework. Management Sci. 42, 1229–1246.
4. E. Balas, S. Ceria, G. Cornuéjols, N. Natraj (1996). Gomory cuts revisited. Oper.
Res. Lett. 19, 1–9.
5. E. Balas, M. Fischetti (1993). A lifting procedure for the asymmetric traveling
salesman polytope and a large new class of facets. Math. Program. (A) 58, 325–
352.
6. A. Caprara, M. Fischetti (1996). {0, 1/2}-Chvátal-Gomory cuts. Math. Program. (A)
74, 221–235.
7. R. Carr (1995). Separating clique tree and bipartition inequalities in polynomial
time. E. Balas, J. Clausen (eds.). Integer Programming and Combinatorial Op-
timization 4, Lecture Notes in Computer Science, 920, Berlin. Springer-Verlag,
40–49.
8. S. Ceria, G. Cornuéjols, M. Dawande (1995). Combining and strengthening Go-
mory cuts. E. Balas, J. Clausen (eds.). Integer Programming and Combinatorial
Optimization 4, Lecture Notes in Computer Science, 920, Berlin. Springer-Verlag,
438–451.
9. T. Christof, M. Jünger, G. Reinelt (1991). A complete description of the traveling
salesman polytope on 8 nodes. Oper. Res. Lett. 10, 497–500.
10. V. Chvátal (1973). Edmonds polytopes and weakly Hamiltonian graphs. Math.
Program. 5, 29–40.
11. H. Cohen (1995). A Course in Computational Algebraic Number Theory, Springer-
Verlag, Berlin.
12. G. Dantzig, D. Fulkerson, S. Johnson (1954). Solution of a large scale traveling-
salesman problem. Oper. Res. 2, 393–410.
13. E.A. Dinitz, A.V. Karzanov, M.V. Lomosonov (1976). On the structure of a fam-
ily of minimal weighted cuts in a graph. A.A. Fridman (ed.) Studies in Discrete
Optimization, Moscow Nauka, 290–306 (in Russian).
14. J. Edmonds (1965). Maximum matching and a polyhedron with 0,1-vertices. J.
Res. Natl. Bureau of Standards 69, 125–130.
15. L. Fleischer (1998). Building the chain and cactus representations of all minimum
cuts from Hao-Orlin in same asymptotic run time. R. Bixby, E. Boyd, R. Rios
Mercado (eds.). Integer Programming and Combinatorial Optimization 6, Lecture
Notes in Computer Science, Berlin. Springer-Verlag.
16. L. Fleischer, É. Tardos (1996). Separating maximally violated comb inequalities
in planar graphs. W. Cunningham, S. McCormick, M. Queyranne (eds.). Integer
Programming and Combinatorial Optimization 5, Lecture Notes in Computer Sci-
ence, 1084, Berlin. Springer-Verlag, 475–489. Revised version to appear in Math.
Oper. Res.
17. M. Grötschel, M. Padberg (1979). On the symmetric traveling salesman problem I:
Inequalities. Math. Program. 16, 265–280.
18. M. Grötschel, M. Padberg (1979). On the symmetric traveling salesman problem II:
lifting theorems and facets. Math. Program. 16, 281–302.
1 Introduction
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 99–113, 1999.
© Springer-Verlag Berlin Heidelberg 1999
100 Fabián A. Chudak and David P. Williamson
The CFLP is NP-hard even in the case that U = ∞, sometimes called the
uncapacitated facility location problem (UFLP) [7]. Thus we turn our attention
to approximation algorithms. We say we have an α-approximation algorithm for
the CFLP if the algorithm runs in polynomial time and returns a solution of
value no more than α times the value of an optimal solution. The value α is
sometimes called the performance guarantee of the algorithm.
It is possible to express any instance of the well-known set cover problem as
an instance of the UFLP of the same cost, which implies that, unless P = NP,
there is no approximation algorithm for the UFLP with performance guarantee
better than c ln |D|, where c is some constant [13,8,16,1]. Thus we turn to special
cases of the CFLP. In particular, we assume that for any k, l ∈ F ∪ D a service
cost ckl is defined, and the service costs are symmetric and obey the triangle
inequality. This is a natural assumption, since service costs are often associated
with the distance between points in Euclidean space representing facilities and
clients. From now on, when we refer to the CFLP or UFLP, we refer to this
metric case.
Recently, Korupolu, Plaxton, and Rajaraman (KPR) gave the first approx-
imation algorithm for the CFLP with constant performance guarantee [10,11].
Surprisingly, Korupolu et al. show that a simple local search heuristic is guaranteed to run in polynomial time and to terminate with a solution of value no more than (8 + ε) times optimum, for any ε > 0. The central contribution of our
paper is to simplify and improve their analysis of the heuristic, showing that it
is a 6(1 + ε)-approximation algorithm for the CFLP. Although our proof follows
theirs closely at many points, we show that some case distinctions (e.g. “cheap”
versus “expensive” facilities) are unnecessary and some proofs can be simplified
and strengthened by using standard tools from mathematical programming. For
example, using the supermodularity of the cost function of the CFLP reduces a
six and a half page proof to a half page, and using the notion of a transshipment
problem and the integrality of its polyhedron allows us to get rid of the extra-
neous concept of a “refined β-allocation,” which in turn leads to the improved
performance guarantee.
We are also able to use a concept translated from KPR to get an improved
approximation algorithm for a variant of the CFLP. The variant we consider is
the one in which a solution may open up to k copies of facility i, each at cost
fi and having capacity U , and we denote this problem the k-CFLP (so that the
ordinary CFLP is the same as the 1-CFLP). Shmoys, Tardos, and Aardal [17] give a polynomial-time algorithm for the 7/2-CFLP which produces a solution of value
no more than 7 times the optimal value of the 1-CFLP. Chudak and Shmoys [6],
building on previous work [4,5] for the UFLP, give a 3-approximation algorithm
for the ∞-CFLP. Here we show how to take any solution for the ∞-CFLP and
produce from it a solution for the 2-CFLP adding cost no more than twice the
optimal value of the 1-CFLP. Thus by using the Chudak-Shmoys algorithm, we
are able to produce solutions in polynomial time for the 2-CFLP of cost no more
than 5 times the optimal value of the 1-CFLP, improving the previous result of
Shmoys et al. [17].
Improved Capacitated Facility Algorithms 101
least c(S)/p(n, ε) an admissible operation; thus the algorithm runs until there
are no more admissible operations. This heuristic runs in polynomial time, as
Korupolu et al. argued: start with some arbitrary feasible solution (for instance,
setting S = F). Since in each step the value of the solution improves by a factor of (1 − 1/p(n, ε)), after p(n, ε) operations the value of the solution will have improved by a constant factor. Since the value of the solution can't be smaller than c(S*), after O(p(n, ε) log(c(F)/c(S*))) operations the algorithm will terminate. Each local search step can be implemented in polynomial time, and O(p(n, ε) log c(F)) is a polynomial in the input size, so overall the algorithm takes polynomial time.
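For illustration, the local search loop just described can be sketched generically; this is our own sketch, not the authors' implementation, and `cost`, `neighbors`, and `p` are placeholder names:

```python
def local_search(initial, cost, neighbors, p):
    """Repeat any admissible operation, i.e. one that strictly decreases
    the cost by at least cost(S)/p.  As argued in the text, each admissible
    move shrinks the cost by a factor of at most (1 - 1/p), so the number
    of moves is O(p * log(cost(initial)/optimum))."""
    S = initial
    while True:
        threshold = cost(S) / p            # admissibility threshold c(S)/p(n, eps)
        for T in neighbors(S):             # candidate add/drop/swap moves
            if cost(S) - cost(T) >= threshold and cost(T) < cost(S):
                S = T                      # take the admissible move
                break
        else:                              # no admissible move: stop
            return S
```

For instance, minimizing s² over the integers with moves s ± 1 and p = 10 runs down from 10 and terminates at 0, each accepted move clearing the current threshold.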
We now turn to proving some preliminary lemmas. These lemmas use the
fact that the cost function c is supermodular; that is, if A, B ⊆ F, we have that c(A) + c(B) ≤ c(A ∩ B) + c(A ∪ B). (See Babayev [2], and Propositions 3.3 and 3.4 of Nemhauser, Wolsey, and Fisher
[15].) In particular, cs is supermodular, while cf is modular (that is, cf (A) +
cf (B) = cf (A ∩ B) + cf (A ∪ B)). We will use the fact that supermodularity holds
even for multisets.
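As a quick sanity check of the modularity of cf, here is a toy sketch with made-up opening costs fi (the supermodular inequality for c itself depends on the service costs and is not reproduced here):

```python
from itertools import combinations

f = {1: 4.0, 2: 7.0, 3: 1.5, 4: 3.0}   # hypothetical facility opening costs

def cf(A):
    """Modular opening cost: cf(A) is just the sum of f_i over i in A."""
    return sum(f[i] for i in A)

subsets = [set(s) for r in range(len(f) + 1) for s in combinations(f, r)]
# cf(A) + cf(B) = cf(A & B) + cf(A | B) holds with equality for every pair
assert all(abs(cf(A) + cf(B) - cf(A & B) - cf(A | B)) < 1e-9
           for A in subsets for B in subsets)
```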
We show the following three lemmas:
Lemma 2.3 (KPR [11], Lemma 9.3). If there is no admissible add operation, then cs(S) ≤ c(S*) + nc(S)/p(n, ε).
Before proving the lemmas, we show how they lead to the 6(1 + ε)-approximation
algorithm for the CFLP.
f (S + Wk−1 ) + f (S + uk ) ≤ f (S ∗ ) + f (S)
f (S + Wk−2 ) + f (S + uk−1 ) ≤ f (S + Wk−1 ) + f (S)
..
.
f (S + W2 ) + f (S + u3 ) ≤ f (S + W3 ) + f (S)
f (S + u1 ) + f (S + u2 ) ≤ f (S + W2 ) + f (S).
Summing the inequalities and subtracting ∑_{i=2}^{k−1} f(S + Wi) from both sides, we obtain

    ∑_{i=1}^{k} f(S + ui) ≤ f(S*) + (k − 1)f(S),
104 Fabián A. Chudak and David P. Williamson
Proof of Lemma 2.1: It follows from Lemma 2.6 that if c(S) ≥ (1 + ε)c(S*), then there exists an add operation that changes the cost by no more than (1/n)(1/(1 + ε) − 1)c(S) ≤ −c(S)/p(n, ε), for p(n, ε) ≥ n(1 + ε)/ε. So there is an admissible add operation. ⊓⊔
Proof of Lemma 2.3: By modifying the last few lines of the proof of Lemma 2.6, it follows that if f(S*) ≤ f(S) − β, then there exists a ui ∈ S* − S such that f(S + ui) − f(S) ≤ −β/|S* − S|. Suppose it is the case that cs(S) > c(S*) + nc(S)/p(n, ε). Then by adding cf(S) to the left-hand side, cf(S − S*) to the right-hand side, and observing that cs(S ∪ S*) ≤ cs(S*), we have that c(S) > c(S* ∪ S) + nc(S)/p(n, ε). Setting β = nc(S)/p(n, ε) and applying the above gives us that there is an admissible add operation, proving the lemma. ⊓⊔
Proof. The proof of Lemma 2.7 is similar to that of Lemma 2.6, and so we omit it. ⊓⊔
Proof of Lemma 2.2: It follows from Lemma 2.7 that if c(S) ≥ (1 + ε)c(S*), then there exists a drop operation that changes the cost by no more than (1/n)(1/(1 + ε) − 1)c(S) ≤ −c(S)/p(n, ε), for p(n, ε) ≥ n(1 + ε)/ε. So there is an admissible drop operation. ⊓⊔
Note then that we need p(n, ε) ≥ n(1 + ε)/ε (from the proofs of Lemmas 2.1 and 2.2) and p(n, ε) ≥ 8n/ε (from the proof of Theorem 2.5). Thus p(n, ε) = 8n/ε suffices.
path decomposition is useful in comparing the value of our current solution with
the optimal solution. The swap graph will be used in the analysis of the local
search algorithm (in the proof of Theorem 2.4) and will be used in the algorithm
and analysis of our result for the 2-CFLP.
To obtain the path decomposition, we start with some current solution S and
the optimal solution S ∗ . We construct the following directed graph: we include
a node j for each client j ∈ D, and a node i for each facility i ∈ S ∪ S ∗ . We
include an arc (j, i) of weight w(j, i) = x(S ∗ , i, j) for all i ∈ S ∗ , j ∈ D when
x(S ∗ , i, j) > 0, and an arc (i, j) of weight w(i, j) = x(S, i, j) for all i ∈ S, j ∈ D
when x(S, i, j) > 0. Observe that by the properties of x, the total weight of all
arcs incoming to a node j for j ∈ D is dj , as is the total weight of all outgoing
arcs. The total weight of arcs incoming to any node i for i ∈ S ∗ is at most U ,
and the total weight of arcs going out of any node i for i ∈ S is also at most U. Furthermore, notice that ∑_{(i,j)} cij w(i, j) = cs(S*) + cs(S).
By standard path-stripping arguments, we can decompose this graph into a set of paths P and cycles. We ignore the cycles; the paths start at nodes in S and end at nodes in S*. Let the weight of a path P be denoted w(P), and, overloading notation somewhat, let c(P) denote its cost (c(P) = ∑_{(i,j)∈P} cij). Then ∑_{P∈P} c(P)w(P) ≤ cs(S) + cs(S*). For any subset of paths P′ ⊆ P, let P′(A, ·) denote the set of paths in P′ starting at nodes i ∈ A for A ⊆ S, and let P′(·, B) denote the set of paths ending at nodes i′ ∈ B for B ⊆ S*. Then P′(A, B) denotes the set of paths in P′ from i ∈ A ⊆ S to i′ ∈ B ⊆ S*. Also, let w(P′) = ∑_{P∈P′} w(P), and val(P′) = ∑_{P∈P′} w(P)c(P). Thus, for instance, val(P) ≤ cs(S) + cs(S*).
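The path-stripping step can be sketched as a generic flow-decomposition routine; this is our own sketch, assuming arc weights are conserved at intermediate nodes, as they are at the client nodes above:

```python
def strip_paths(arcs, sources, sinks):
    """Decompose a weighted digraph into source-to-sink paths by repeatedly
    walking from a source and subtracting the bottleneck weight; cycles met
    along the way are simply removed ('we ignore the cycles').
    arcs: dict mapping (u, v) to a positive weight."""
    w = {a: float(v) for a, v in arcs.items() if v > 0}

    def subtract(seq):
        m = min(w[a] for a in seq)     # bottleneck weight on this walk
        for a in seq:
            w[a] -= m
            if w[a] <= 1e-12:
                del w[a]
        return m

    paths = []
    while True:
        start = next((u for (u, v) in w if u in sources), None)
        if start is None:
            return paths               # no arc leaves a source: done
        walk, pos, node = [], {}, start
        while node not in sinks:
            if node in pos:            # revisited a node: remove the cycle
                subtract(walk[pos[node]:])
                walk, pos, node = [], {}, start
                continue
            arc = next((a for a in w if a[0] == node), None)
            if arc is None:            # dead end (cannot happen under exact
                if walk:               # flow conservation); discard the walk
                    subtract(walk)
                break
            pos[node] = len(walk)
            walk.append(arc)
            node = arc[1]
        if node in sinks and walk:
            paths.append((list(walk), subtract(walk)))
```

For example, an arc of weight 3 feeding two sink arcs of weights 2 and 1 decomposes into two paths of weights 2 and 1.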
By a small variation of Lemma 2.3, we know that we must have cs (S̃) ≤ c(S ∗ ).
Since the cost of the solution Ŝ obtained after swapping is at most
c(S̃) plus the cost of the solution to the transshipment problem, we know that
≤ cs (S̃) + cs (S ∗ ) − cf (S 0 ) + cf (S ∗ ),
where the inequality ckl ≤ c(P) follows from the triangle inequality. ⊓⊔
Proof. We apply the 3-approximation algorithm of Chudak and Shmoys [6] for
the ∞-CFLP to obtain our initial solution S. Since the cost of the optimal
solution for ∞-CFLP is at most the cost of the optimal solution for the 1-CFLP,
the corollary follows. ⊓⊔
We now use the path decomposition and swap graph tools from the previous
section to complete our analysis of the local search algorithm, and prove Theorem
2.4. The lemmas we derive below are roughly similar to those of Korupolu et al.
[11]: Lemma 4.2 corresponds to their Claim 9.7, and Lemma 4.3 to their Claims
9.8 and 9.9. However, we do not need an analogue of their “refined β-allocation”,
which gives us an improvement in the analysis in Lemma 4.2.
Let S be a solution meeting the conditions of Theorem 2.4, and let P be the
path decomposition for S and an optimal solution S ∗ . We will be particularly
interested in three subsets of paths from P. The first set is the set of all paths
from nodes in S − S ∗ to S ∩ S ∗ (if any); we call these the transfer paths and
denote them T = P(S − S ∗ , S ∩ S ∗ ). The basic idea of these paths in the proof is
that for any path P ∈ T , we claim we can transfer w(P ) of the demand assigned
108 Fabián A. Chudak and David P. Williamson
to the start node of the path to the end node of the path at a cost of c(P )w(P )
without violating the capacity constraints. We establish this claim later.
The next subset of paths of interest is the set of all paths from S − S* to S* − S; we call these the swap paths and denote them S = P(S − S*, S* − S).
We use the swap paths to get a fractional feasible solution for a transshipment
problem from S − S ∗ to S ∗ − S in the swap graph, and get an integral solution
of swaps whose cost is a simple expression in terms of cs (S), cs (S ∗ ), cf (S − S ∗ ),
and cf (S ∗ − S). Thus if no swap can be performed that improves the cost of the
current solution by a certain amount, this implies a bound on cf (S − S ∗ ).
This idea does not quite work as stated because the weight of swap paths
from i ∈ S −S ∗ could be quite small. Thus, as in Korupolu et al. [11], we split the
nodes of S −S ∗ into two types: heavy nodes H such that the weight of paths from
any i ∈ H to S ∗ − S is at least U/2 (i.e., H = {i ∈ S − S ∗ : w(S(i, ·)) ≥ U/2}),
and light nodes (all the rest: L = S − S ∗ − H). We will be able to set up
a transshipment problem for the nodes in H, which will give us a bound on
cf (H). To get a bound on cf (L), we will have to set up a transshipment problem
in a different manner and use the observation that we can transfer the demand
assigned from one light node to another light node without violating capacity
constraints.
To build towards our proof of Theorem 2.4, we now formalize the statements
above in a series of lemmas.
Lemma 4.1. The weight w(T(i, ·)) of demand assigned to facility i in the current
assignment can be transferred to other nodes in S at a cost increase of at most
val(T (i, ·)).
Proof. To prove the lemma, consider a path P ∈ T (i, ·), with start node i and end
node i0 . We observe that the first edge (i, j) in path P corresponds to a demand
w(P ) assigned to i by client j in the current assignment. We reassign this demand
to i0 ∈ S ∩ S ∗ ; the increase in cost is at most (ci0 ,j − ci,j )w(P ) ≤ c(P )w(P ) by
the triangle inequality. We now must show that such a reassignment does not
violate the capacity constraints at i0 . To see this, observe that by the properties
of path-stripping, the total weight of paths incoming to any node i0 ∈ S ∗ ∩ S
is the difference between the total weight of arcs coming into node i0 and the
total weight of arcs going out of node i0 . Since the total weight of arcs coming
into node i0 corresponds to the total amount of demand assigned to i0 by the
optimal solution, and the total weight of arcs going out of node i0 corresponds
to the total amount of demand assigned to i0 by the current solution, and the
optimal solution must be feasible, we can increase the demand serviced by i0 by
this difference and still remain feasible. ⊓⊔
where the first inequality follows by the definition of H and the second since the total weight of paths adjacent to any node is at most U. The cost of this fractional solution is

    ∑_{k∈H, l∈S*−S} ĉkl x̃kl = ∑_{k∈H, l∈S*−S} (w(S(k, l))/w(S(k, ·))) (U ckl + fl − fk)
        ≤ ∑_{k∈H, l∈S*−S} [ (w(S(k, l))/(U/2)) (U ckl + fl) − (w(S(k, l))/w(S(k, ·))) fk ]
        ≤ ∑_{k∈H, l∈S*−S} 2 ckl w(S(k, l)) + 2 cf(S* − S) − cf(H)
        ≤ 2 val(S(H, ·)) + 2 cf(S* − S) − cf(H).    ⊓⊔
Lemma 4.3 (KPR [11], Claims 9.8 and 9.9). If there are no admissible drop
and swap operations, then
Proof. The proof of this lemma is similar to the proof of the previous lemma,
although here we will have to set up a transshipment problem to capture both
swap and drop operations. One difficulty with translating the previous proof to
this case is ensuring that one can find a feasible fractional solution such that
each facility in F − S is in no more than a small constant number of swap/drop
operations. We do this by choosing exactly one “primary” facility k in L that
can be swapped for a facility l in F − S; i.e. xkl > 0 for exactly one k ∈ L. We
make a careful choice of this facility k so that any other facility i to which we
might otherwise normally make a fractional assignment xil > 0, we can drop i
and reassign its demand to k, the primary facility of l, at not much more cost.
We do this by setting up a transshipment problem from L to (F − S) ∪ L, in which we set cost ĉkl = w(S(k, ·))ckl + fl − fk for l ∈ F − S, ĉkl = w(S(k, ·))(ckl + θ(l)) − fk for l ∈ L, l ≠ k, and ĉkk = ∞, where θ(l) for l ∈ L is the cost per unit demand for making U/2 units of capacity available at node l, either via the unused capacity at l or by transferring demand via the paths T(l, ·). Note that since l ∈ L, w(S(l, ·)) ≤ U/2, and thus the unused capacity at node l plus w(T(l, ·)) is at least U/2. Thus (U/2)θ(l) ≤ val(T(l, ·)). The transshipment problem
is then:
    Min  ∑_{k∈L, l∈(F−S)∪L} ĉkl xkl
    subject to:
        ∑_{l∈(F−S)∪L} xkl = 1    ∀k ∈ L
        ∑_{k∈L} xkl ≤ 1          ∀l ∈ F − S
        xkl ≥ 0                  ∀k ∈ L, l ∈ (F − S) ∪ L.
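Since the constraints above are those of a bipartite transshipment (the constraint matrix is totally unimodular), the LP has an integral optimal solution. The toy sketch below, with hypothetical costs of our own, simply enumerates the integral assignments to illustrate the structure:

```python
from itertools import product

def min_transshipment(L, F_minus_S, cost):
    """Each k in L picks exactly one target, either in F - S or in L other
    than k; each l in F - S may be chosen at most once.  Brute force over
    the integral assignments, which is enough here because total
    unimodularity guarantees an integral LP optimum anyway."""
    L = list(L)
    targets = [[t for t in list(F_minus_S) + L if t != k] for k in L]
    best = float('inf')
    for choice in product(*targets):
        picked = [t for t in choice if t in F_minus_S]
        if len(picked) != len(set(picked)):   # some l in F - S used twice
            continue
        best = min(best, sum(cost[k][t] for k, t in zip(L, choice)))
    return best
```

With two light facilities and one target in F − S, the two feasible ways of sharing the single target tie at cost 7 in the example below.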
and transfer the demand w(S(k, ·)) assigned to k at change in cost at most ĉkl .
By Lemma 4.1, we can transfer the remaining demand w(T (k, ·)) assigned to
k to nodes in S ∩ S ∗ at change in cost at most val(T (k, ·)). When xki = 1 for
k ∈ L, i ∈ L, k 6= i, we can drop facility k from S and transfer the demand
w(S(k, ·)) assigned to k to i at change in cost ĉki = w(S(k, ·))(cki + θ(i)), as this
cost covers transferring these units of demand to i and transferring the same
amount of demand from i to nodes in S ∩ S ∗ . By Lemma 4.1, we can transfer the
remaining demand w(T (k, ·)) assigned to k to nodes in S ∩ S ∗ at change in cost
at most val(T (k, ·)). By the hypothesis of the lemma, we know that any swap or
drop of a facility results in a change in cost of at least −c(S)/p(n, ). Summing
over all swaps and drops for k ∈ L given by the solution to the transshipment
problem, we have that
    2 val(S(L, ·)) + 2 val(T(L, ·)) − cf(L) + cf(S* − S) ≥ −|L|c(S)/p(n, ε).
Rearranging terms gives us the lemma.
To complete the proof, we give a fractional solution x̃ for this transshipment problem. For each l ∈ S* − S we find k ∈ L that minimizes ckl + θ(k) and designate k as the primary node π(l) for l. We then set x̃kl as follows. For each l ∈ S* − S, if k is the primary node for l, we set x̃kl = w(S(k, l))/w(S(k, ·)), otherwise x̃kl = 0. For each i ∈ L, we set

    x̃ki = ∑_{l∈S*−S : i=π(l), k≠π(l)} w(S(k, l))/w(S(k, ·)).

This solution is feasible since certainly ∑_{l∈F} x̃kl = 1 for all k ∈ L. Also, since for at most one k ∈ L is x̃kl > 0 for l ∈ F − S, ∑_{k∈L} x̃kl ≤ 1. Observe that when x̃ki > 0 for k ∈ L, i = π(l), l ∈ S* − S, then
since cil + θ(i) ≤ ckl + θ(k) by the definition of primary nodes. Then the cost of
this fractional solution is
    ∑_{k∈L, l∈F} ĉkl x̃kl
        = ∑_{k∈L, l∈S*−S, k=π(l)} ĉkl w(S(k, l))/w(S(k, ·)) + ∑_{k∈L, i∈L} ∑_{l∈S*−S : i=π(l), k≠π(l)} ĉki w(S(k, l))/w(S(k, ·))
        ≤ ∑_{k∈L, l∈S*−S, k=π(l)} ĉkl w(S(k, l))/w(S(k, ·)) + ∑_{k∈L, l∈S*−S, i=π(l), k≠i} ĉki w(S(k, l))/w(S(k, ·))
        ≤ ∑_{k∈L, l∈S*−S, k=π(l)} [w(S(k, ·))ckl + fl − fk] w(S(k, l))/w(S(k, ·))
            + ∑_{k∈L, l∈S*−S, k≠π(l)} [w(S(k, ·))(2ckl + θ(k)) − fk] w(S(k, l))/w(S(k, ·))
        ≤ ∑_{k∈L, l∈S*−S} [w(S(k, ·))(2ckl + θ(k)) − fk] w(S(k, l))/w(S(k, ·)) + cf(S* − S)
        ≤ ∑_{k∈L, l∈S*−S} 2ckl w(S(k, l)) + ∑_{k∈L} val(T(k, ·)) − cf(L) + cf(S* − S)
        ≤ 2 val(S(L, ·)) + val(T(L, ·)) − cf(L) + cf(S* − S).    ⊓⊔
References
14. P. Mirchandani and R. Francis, eds. Discrete Location Theory. John Wiley and
Sons, Inc., New York, 1990.
15. G.L. Nemhauser, L.A. Wolsey, and M.L. Fisher. An analysis of approximations for
maximizing submodular set functions – I. Mathematical Programming 14:265–294,
1978.
16. R. Raz and S. Safra. A sub-constant error-probability low-degree test, and a sub-
constant error-probability PCP characterization of NP. In Proceedings of the 29th
ACM Symposium on Theory of Computing, pages 475–484, 1997.
17. D. Shmoys, É. Tardos, and K. Aardal. Approximation algorithms for facility loca-
tion problems. In Proceedings of the 29th ACM Symposium on Theory of Computing,
pages 265–274, 1997.
18. M. Sviridenko, July, 1998. Personal communication.
Optimal 3-Terminal Cuts and Linear
Programming
1 Introduction
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO'99, LNCS 1610, pp. 114–125, 1999. © Springer-Verlag Berlin Heidelberg 1999
Chopra and Rao [3] and Cunningham [4] investigated linear programming
relaxations of the 3-cut problem, showing results on classes of facets and separa-
tion algorithms. Here are the two simplest relaxations. (By a T-path we mean the
edge-set of a path joining two of the terminals. By a wye we mean the edge-set
of a tree having exactly three nodes of degree one, each of which is a terminal. For a set A, a subset B of A, and a vector z ∈ R^A, z(B) denotes ∑_{j∈B} zj.)
    minimize ∑_{e∈E} ce xe
(LP 1)  subject to
        x(P) ≥ 1,    P a T-path
        xe ≥ 0,      e ∈ E.

    minimize ∑_{e∈E} ce xe
(LP 2)  subject to
        x(P) ≥ 1,    P a T-path
        x(Y) ≥ 2,    Y a wye
        xe ≥ 0,      e ∈ E.
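The separation argument can be made concrete: the x(P) ≥ 1 constraints of (LP 1) all hold iff the x-shortest path between every pair of terminals has length at least 1, which Dijkstra's algorithm checks in polynomial time. The following is our own sketch with illustrative names:

```python
import heapq
from itertools import combinations

def shortest(adj, s):
    """Dijkstra with edge lengths x_e; adj maps node -> [(neighbor, x_e)]."""
    dist = {s: 0.0}
    heap = [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                   # stale heap entry
        for v, le in adj.get(u, []):
            if d + le < dist.get(v, float('inf')):
                dist[v] = d + le
                heapq.heappush(heap, (d + le, v))
    return dist

def min_T_path_length(adj, terminals):
    """Smallest x-length of a path joining two terminals; all T-path
    constraints of (LP 1) are satisfied iff this value is >= 1."""
    return min(shortest(adj, s).get(t, float('inf'))
               for s, t in combinations(terminals, 2))
```

For the metric that places every terminal at x-distance 1/2 from one common middle node, the minimum T-path length is exactly 1, so (LP 1) is satisfied, while the wye through the middle node has x(Y) = 3/2 < 2 and so violates (LP 2).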
It follows from some simple observations about shortest paths, and the equivalence of optimization and separation, that both problems can be solved in polynomial time. It was proved in [4] that the approximation algorithm of [5] delivers a 3-cut of value at most 4/3 times the optimal value of (LP 1). (In particular, the minimum weight of a 3-cut is at most 4/3 times the optimal value of (LP 1).) It was conjectured that the minimum weight of a 3-cut is at most 16/15 times the optimal value of (LP 2). The examples in Figure 1 (from [4]) show that this conjecture, if true, is best possible. In both examples, the values of a feasible solution
x of (LP 2) are shown in the figure. The weights ce are all 2 for the example on
the left. For the one on the right they are 1 for the edges of the interior triangle,
and 2 for the other edges. In both cases the minimum 3-cut value is 8, but the
given feasible solution of (LP 2) has value 7.5.
Fig. 1. Two examples (terminals t1, t2, t3); the values xe of a feasible solution of (LP 2) are shown on the edges.
Recently, Călinescu, Karloff, and Rabani [1] gave a new linear programming
relaxation. Although their approach applies to any number k of terminals, we
116 William H. Cunningham and Lawrence Tang
continue to restrict attention to the case when k = 3. They need to assume that G is a complete graph. (Of course, if any missing edges are added with weight zero, the resulting 3-cut problem is equivalent to the given one, so this assumption is not limiting.) The relaxation is based on the following observations. First, every minimal 3-cut is of the form β(R1, R2, R3), where ti ∈ Ri for all i. Here, for a family R of disjoint subsets of V, β(R) denotes the set of all edges of G joining nodes in different members of the family. Since c ≥ 0, there is an
optimal 3-cut of this form. Second, the incidence vector x of a minimal 3-cut
is a kind of distance function, that is, it defines a function d(v, w) = xvw on
pairs of nodes of G which is non-negative, symmetric, and satisfies the triangle
inequality. Finally, with respect to d the distance between any two terminals
is 1, and the sum of the distances from any node v to the terminals is 2. The
resulting linear-programming relaxation is:
    minimize ∑_{e∈E} ce xe
(LP 3)  subject to
        xvw = 1,                 v, w ∈ T, v ≠ w
        ∑_{v∈T} xvw = 2,         w ∈ V
        xuv + xvw − xuw ≥ 0,     u, v, w ∈ V
        xe ≥ 0,                  e ∈ E.
This relaxation is at least as tight as (LP 2). To see this, suppose that (af-
ter adding missing edges to make G complete), we have a feasible solution x
to (LP 3). Then for any path P of G joining u to v, x(P ) ≥ xuv , by applying
the triangle inequality. It follows that x(P ) ≥ 1 for any T -path P . Moreover,
any wye Y is the disjoint union of paths P1, P2, P3 from some node v to the terminals. It follows that x(Y) ≥ ∑_{w∈T} xvw = 2. Thus every feasible solution
of (LP 3) gives a feasible solution of (LP 2) having the same objective value. The
first example of Figure 1 shows that the optimal value of (LP 3) can be strictly
greater than the optimal value of (LP 2). On the other hand, the second example
shows that there is no hope to prove in general that the minimum weight of a 3-cut is less than 16/15 times the optimal value of (LP 3).
It was proved in [1] that the minimum weight of a 3-cut is at most 7/6 times the optimal value of (LP 3). As a consequence, an approximation algorithm for the optimal 3-cut problem having a performance guarantee of 7/6 was derived. (It is clear that (LP 3) can be solved in polynomial time, since it is of polynomial size.) However, it was left open whether this result could be strengthened; the
second example of Figure 1 shows an example for which the minimum weight of
a 3-cut can be as large as 16/15 times the optimal value of (LP 3), and this is
the worst example given in [1]. (To see that x of that example does extend to a
feasible solution of (LP 3), we simply define x on each missing edge uv to be the
minimum length, with respect to lengths xe , of a path from u to v.)
In this paper it is shown that the minimum weight of a 3-cut is at most 12/11 times the optimal value of (LP 3), and that this is best possible. (This result has been obtained independently by Karger, Klein, Stein, Thorup, and Young [7].) As a consequence we obtain an approximation algorithm for the optimal 3-cut problem having a performance guarantee of 12/11.
2 Triangle Embeddings
Călinescu, Karloff, and Rabani [1] introduced an extremely useful geometric
relaxation, which they showed was equivalent to the linear-programming relaxation (LP 3). Let △ denote the convex hull of the three elementary vectors e1 = (1, 0, 0), e2 = (0, 1, 0), and e3 = (0, 0, 1) in R³. By a triangle embedding of G we mean a mapping y from V into △ such that y(ti) = ei for i = 1, 2, 3.
A triangle embedding y determines a vector x ∈ RE as follows. For each edge
uv, let xuv be one-half the L1 distance from y(u) to y(v). It is easy to see
that this x is a feasible solution to (LP 3). Conversely, a feasible solution x
of (LP 3) determines a triangle embedding y as follows. For each node v, let
y(v) = (1 − xt1 v , 1 − xt2 v , 1 − xt3 v ).
Given a triangle embedding y we can obtain x as above, and then use x to
obtain a triangle embedding y 0 . It is easy to see that y = y 0 . It is not true, on
the other hand, that every feasible solution of (LP 3) arises in this way from a
triangle-embedding. However, it is “almost true”. The following result is implicit
in [1], and we include a proof for completeness.
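The two correspondences are easy to state in code; the following is our own sketch, and the four-node instance, with one non-terminal at the centroid, is made up:

```python
def x_from_embedding(y):
    """x_uv = one-half of the L1 distance between y(u) and y(v)."""
    return {(u, v): sum(abs(a - b) for a, b in zip(y[u], y[v])) / 2
            for u in y for v in y if u < v}

def embedding_from_x(x, terminals, nodes):
    """y(v) = (1 - x_{t1 v}, 1 - x_{t2 v}, 1 - x_{t3 v})."""
    def val(u, v):
        return 0.0 if u == v else x[(min(u, v), max(u, v))]
    return {v: tuple(1.0 - val(t, v) for t in terminals) for v in nodes}

# terminals at the unit vectors, one extra node at the centroid of the triangle
y = {'a': (1.0, 0.0, 0.0), 'b': (0.0, 1.0, 0.0), 'c': (0.0, 0.0, 1.0),
     'd': (1/3, 1/3, 1/3)}
x = x_from_embedding(y)
y2 = embedding_from_x(x, ['a', 'b', 'c'], list(y))
# the round trip y -> x -> y' recovers the original embedding
assert all(max(abs(p - q) for p, q in zip(y[v], y2[v])) < 1e-9 for v in y)
```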
Theorem 1. Let x be a feasible solution of (LP 3), let y be the triangle embed-
ding determined by x and let x0 be the feasible solution of (LP 3) determined by
y. Then x0 ≤ x, and if x is an optimal solution of (LP 3), so is x0 .
Proof. First, observe that the second statement is a consequence of the first and
the non-negativity of c. Now let uv ∈ E. Both y(u) and y(v) have component-
sum 1. Therefore, y(u) − y(v) has component-sum zero, and so one-half of the
L1 distance between y(u) and y(v) is the sum of the non-negative components
of y(u) − y(v). Hence we may assume, perhaps by interchanging u with v and
relabelling the terminals, that one-half of the L1 distance between y(u) and y(v)
is the sum of the first two components of y(u) − y(v). Therefore,
    (1/2)‖y(u) − y(v)‖₁ = y₁(u) − y₁(v) + y₂(u) − y₂(v)
                        = (1 − xut₁) − (1 − xvt₁) + (1 − xut₂) − (1 − xvt₂)
                        = (2 − xvt₃) − (2 − xut₃)
                        ≤ xuv,
as required. ⊓⊔
The approximation algorithm of Călinescu, Karloff, and Rabani uses the
following ideas. Suppose that (LP 3) is solved, and an optimal solution x∗ that
arises from a triangle embedding is found. For a number α between 0 and 1 that
is different from x∗rv for every v ∈ V and r ∈ T , and an ordering r, s, t of T ,
define Rr = {v ∈ V : x∗rv < α}, Rs = {v ∈ V \Rr : x∗sv < α}, Rt = V \(Rr ∪ Rs ).
We call the 3-cut β(Rr , Rs , Rt ) uniform (with respect to this x∗ ). It is easy to
see that there are O(n) uniform 3-cuts. The algorithm of [1] simply chooses the
uniform 3-cut having minimum weight. It is proved to have weight at most 7/6 times the minimum weight of a 3-cut.
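The rounding step can be sketched as follows; this is our own sketch on a toy instance, where `xstar[(r, v)]` stores x*_{rv} with x*_{rr} = 0 supplied explicitly:

```python
from itertools import permutations

def min_uniform_3cut(nodes, terminals, xstar, c):
    """For each threshold alpha (chosen strictly between consecutive x*
    values) and each ordering r, s, t of the terminals, form
    Rr = {v : x*_rv < alpha}, Rs = {v not in Rr : x*_sv < alpha},
    Rt = the rest, and return the cheapest of these uniform 3-cuts.
    c maps an edge (u, v), u < v, to its weight."""
    vals = sorted({xstar[(r, v)] for r in terminals for v in nodes} | {0.0, 1.0})
    alphas = [(a + b) / 2 for a, b in zip(vals, vals[1:])]
    best = float('inf')
    for alpha in alphas:
        for r, s, _ in permutations(terminals):
            Rr = {v for v in nodes if xstar[(r, v)] < alpha}
            Rs = {v for v in nodes if v not in Rr and xstar[(s, v)] < alpha}
            side = {v: (0 if v in Rr else (1 if v in Rs else 2)) for v in nodes}
            cut = sum(wt for (u, v), wt in c.items() if side[u] != side[v])
            best = min(best, cut)
    return best
```

On K4 with unit weights and one non-terminal at x*-distance 2/3 from every terminal, every uniform 3-cut cuts 5 of the 6 edges.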
Note that the variables are the weights ce ! It may seem that the hypothesis that
G and x∗ are known is very strong, but it turns out that we can assume that
there are not many choices for them. First, we may assume that x∗ is rational,
since it is an optimal solution of a linear-programming problem having rational
data. Therefore, there exists a positive integer q such that qx∗ is integer-valued.
Second, we may assume that x* arises from a triangle embedding y*, and it is easy to see that qy* is integral as well. Therefore, we can think of y* as embedding the nodes of G into a finite subset △q of △, consisting of those points y ∈ △ for which qy is integral. We define the planar graph Gq = (△q, Eq) by uv ∈ Eq if and only if the L1 distance between u and v is 2/q. Figure 2 shows G9; the numbers there are explained later. For nodes u, v of Gq, we denote by dq(u, v) the least number of edges of a path in Gq from u to v. (It is easy to see that dq(u, v) is q/2 times the L1 distance from u to v.)
Fig. 2. G9
true for G if it is true for G0 . It follows that we may assume that y ∗ is onto.
Therefore, we may assume that V = 4q , and that y ∗ is the identity mapping.
Now suppose that there exists uv ∈ E\Eq , such that cuv = ε > 0. Let P be
the edge-set of a path in Gq from u to v such that |P | = dq (u, v). Decrease cuv
to zero, and increase ce by ε for all e ∈ P . We denote the new c by c0 . Then,
since every 3-cut using e uses an edge from P , the minimum weight of a 3-cut
with respect to c0 is not less than that with respect to c. (Similarly, every flat
3-cut has value with respect to c0 not less than that with respect to c.) Now
c0 x∗ = cx∗ − εdq (u, v) + εdq (u, v) = cx∗ . This argument can be repeated as long
as there is such an edge uv. ⊓⊔
It is a consequence of the above theorem that it is enough to study the 3-cut problem on graphs Gq with x*e = 1/q for all e ∈ Eq. (That is, to obtain the
best bound on the ratio of the optimal weight of a 3-cut to the optimal value
of (LP 3), it suffices to analyze such graphs and weights.) In particular, for each
positive integer q, we are interested in the optimal value of the following linear
programming problem.
    minimize (1/q) ∑_{e∈Eq} ce
(Pq)    subject to
        c(S) ≥ 1,    S a 3-cut of Gq
        ce ≥ 0,      e ∈ Eq

The dual problem is

    maximize ∑_S zS
(Dq)    subject to
        ∑_{S : e∈S} zS ≤ 1/q,    e ∈ Eq
        zS ≥ 0,    S a 3-cut of Gq.
We actually solved these problems numerically for several values of q, and then
were able to find solutions for general q.
Theorem 3. For q ≥ 4 the optimal value of (Pq) and of (Dq) is equal to

    f(q) = 11/12 + 1/(12(q + 1)),          if q ≡ 0 mod 3
    f(q) = 11/12 + 1/(12q),                if q ≡ 1 mod 3
    f(q) = 11/12 + 1/(12q) − 1/(12q²),     if q ≡ 2 mod 3
5 Proof of Theorem 3
To prove Theorem 3, it is enough to give feasible solutions of (Pq ) and of (Dq )
having objective value f (q). For simplicity, we will actually do something weaker.
For the case when q ≡ 0 mod 3, we give a feasible solution of (Pq ) having objec-
tive value f (q), and a feasible solution to (Dq ) using only variables corresponding
to flat 3-cuts having objective value 11/12. Although this does not quite prove Theorem 3, it is enough to prove Theorems 4 and 5, since a common denominator for the components of x* can always be chosen to be a multiple of 3.
First, we describe our feasible solution to (Pq). Consider Figure 2, which shows G9. Let c′e be the number next to edge e, or 1 if no number appears. It is easy to see that the minimum value of a 3-cut is 40, so c = c′/40 is a feasible solution to (P9). Its objective value is the sum of the components of c divided by 9, which is 37/40.
Here is the general construction (when q is a multiple of 3) for an optimal solution of (Pq). If q = 3m, divide △q into three "corner triangles" of side m together with the "middle hexagon". Put c′e = 3m + 1 for all edges incident with the terminals. Put c′e = 2m + 2 for all other edges on the boundary of △q. Put c′e = m − 1 for each edge e in a corner triangle that is parallel to an outside edge and at distance 1 from it. Put c′e = 1 for all other edges in the middle hexagon (including its boundary). Put c′e = 0 for all other edges.
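The construction can be checked mechanically. The sketch below is our own encoding of the geometry: nodes of Gq are the triples (a, b, c) of nonnegative integers with a + b + c = q, edges join triples at L1 distance 2, the corner triangle at a terminal consists of the points whose corresponding coordinate is at least 2m, and the middle hexagon of the points with all coordinates at most 2m:

```python
def c_prime(m):
    """Edge weights c' on G_q, q = 3m, following the construction above."""
    q = 3 * m
    nodes = [(a, b, q - a - b) for a in range(q + 1) for b in range(q + 1 - a)]
    terminals = {(q, 0, 0), (0, q, 0), (0, 0, q)}
    edges = [(u, v) for i, u in enumerate(nodes) for v in nodes[i + 1:]
             if sum(abs(x - y) for x, y in zip(u, v)) == 2]
    w = {}
    for u, v in edges:
        if u in terminals or v in terminals:
            w[(u, v)] = 3 * m + 1       # edges incident with the terminals
        elif any(u[i] == 0 and v[i] == 0 for i in range(3)):
            w[(u, v)] = 2 * m + 2       # other boundary edges
        elif any(u[i] >= 2 * m and v[i] >= 2 * m and u[j] == 1 and v[j] == 1
                 for i in range(3) for j in range(3) if j != i):
            w[(u, v)] = m - 1           # corner-triangle edges parallel to an
                                        # outside edge, at distance 1 from it
        elif max(u) <= 2 * m and max(v) <= 2 * m:
            w[(u, v)] = 1               # middle hexagon, boundary included
        else:
            w[(u, v)] = 0
    return w
```

For q = 9 (m = 3) this reproduces the weight counts stated below, a total c′-weight of 333, and hence the objective value 333/(9 · 40) = 37/40 = f(9).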
It is easy to convince oneself that the minimum weight of a 3-cut with respect to c′ is 4(3m + 1), and hence that c = c′/(4(3m + 1)) is a feasible solution to (Pq).
Here is a sketch of a proof. (The ideas come, essentially, from the result of
Dahlhaus, et al. [5], showing that there is a polynomial-time algorithm to solve
the optimal multiterminal cut problem when G is planar and the number of
terminals is fixed.) Any minimal 3-cut of Gq has the form β(R1 , R2 , R3 ). There
are two kinds of such 3-cuts, corresponding to the case in which there is a pair
i, j for which there is no edge joining a node in Ri to a node in Rj , and the
one where this is not true. The minimum value of a 3-cut of the first type is
simply the sum of the weights of two cuts, each separating a terminal from the
other two. In the case of Gq with c′ described above, to show that any such cut has weight at least 4(3m + 1), it is enough to show (due to the symmetry of c′) that any cut separating one terminal from the other two has weight at least
2(3m + 1). This is done by exhibiting an appropriate flow of this value from one
terminal to the other two.
The second type of 3-cut corresponds to the union of three paths in the
planar dual of Gq , such that the three paths begin at the same face triangle
and end with edges that are on different sides of the outside face. Finding a
minimum-weight such 3-cut can be accomplished by, for each choice of the face
triangle, solving a shortest path problem. Therefore, to show that any 3-cut of
the second type has c0 -weight at least 4(3m + 1), one shows that, for each choice
of face triangle, there is an appropriate “potential” on the faces of Gq .
To compute the objective value of this feasible solution to (Pq), note that there are 6 edges e having c′e = 3m + 1, 3(3m − 2) edges e having c′e = 2m + 2, 6(m − 1) edges e having c′e = m − 1, and 9m² edges e having c′e = 1. From this we get that the total c′-weight of all the edges is 3m(11m + 4). To obtain the objective value of the resulting c in (Pq), we divide by 4(3m + 1)(3m), and this gives f(q) for q = 3m.
Now we need to show a feasible solution of (Dq ) having objective value 11/12.
This requires a weighting of the flat 3-cuts of Gq . We assign positive dual vari-
Optimal 3-Terminal Cuts and Linear Programming 123
ables to two kinds of 3-cuts. For each integer j, 1 ≤ j < m, and each choice of two
terminals r, s we consider the (uniform) 3-cut β(Rr (j), Rs (j), V \(Rr (j)∪Rs (j)))
where, for t = r, s, Rt (j) = {v ∈ Vq : dq (t, v) < j}. There are 3m such 3-cuts S,
and for each of them we set zS = 1/(4q). Notice that these variables contribute to the
left-hand side of the main constraint of (Dq ) only for certain edges, namely, those
that are contained in the corner triangles and are parallel to one of the two sides
of △ that meet at that corner. For each of these edges, the total contribution is
exactly 1/(2q).
[Figure 3: a weighting of the face triangles of G9 ; the weights are 5 on the outermost layer of the middle hexagon, 3 on the next layer, and 1 on the innermost triangles.]
The weights assigned to the second type of flat cut are determined by a
weighting of the face triangles of Gq that are contained in the middle hexagon.
See Figure 3, where such a weighting of the face triangles is indicated for G9 . Let
us use the term row in the following technical sense. It is defined by a straight
line through the centre of a face triangle and parallel to one of its three sides.
When we speak of the face triangles in the row, we mean all of the face triangles
that are intersected by the line. When we speak of the edges in the row, we
mean all of the edges that are intersected by the line. Notice that in the figure,
the sum of the weights of the face triangles in each row is the same, namely 35.
It is obvious how to extend this pattern to find a weighting with this property
for any q = 3m. Then the sum of the weights of the face triangles in any row is
4m2 − 1.
Given a face triangle, consider the set of all edges in the three rows containing
the triangle. It is possible to choose two flat 3-cuts of Gq whose union is this
124 William H. Cunningham and Lawrence Tang
set, and whose intersection is a single edge, or is the set of edges of the face
triangle. (There is more than one way to do this.) For each of these two 3-cuts,
assign a weight equal to the weight of the triangle divided by 2q(4m2 − 1). (Note
that a 3-cut S may be assigned weight by two different face triangles; these
weights are added to form the variable zS .) Now consider the constraint of (Dq )
corresponding to an edge e. The contribution of the variables just defined to the
left-hand side of the constraint, is at most the sum of the weights of the face
triangles in rows containing the edge. If the edge is in the middle hexagon, or is in
a corner triangle and is not parallel to one of the edges incident with the corner,
then it gets contributions from triangles in two different rows, and otherwise,
it gets contributions from triangles in one row. Therefore, the contribution for
the first type of edge is at most (4m2 − 1)/((4m2 − 1)q) = 1/q. For the second type
of edge the total contribution is at most half this, that is, at most 1/(2q). But the
second group of edges consists precisely of the ones that get a contribution from
the dual variables assigned to the uniform 3-cuts, and that contribution is 1/(2q).
So the total contribution of all of the dual variables to the left-hand side of the
constraint of (Dq ) corresponding to any edge e is at most 1/q, so we have defined
a feasible solution of (Dq ).
Now the objective value of this solution can be computed as follows. There
are 3m variables corresponding to uniform 3-cuts, each given value 1/(4q). Therefore,
the contribution to the objective function of variables of this type is 3m/(12m) = 1/4.
The contribution of the other variables is the sum, over the 2m horizontal
rows in the middle hexagon, of the total weight of a row divided by q(4m2 − 1).
Therefore, it is
2m(4m2 − 1)/(q(4m2 − 1)) = 2/3.
Therefore, the objective value of our feasible solution to (Dq ) is
1/4 + 2/3 = 11/12.
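The arithmetic in the last two displays can be retraced mechanically with exact rational arithmetic; the following sketch simply re-derives the fractions above for a sample value of m (any positive integer would do).

```python
from fractions import Fraction as F

m = 5                                           # sample value; q = 3m as in the text
q = 3 * m
uniform = 3 * m * F(1, 4 * q)                   # 3m uniform cuts, each z_S = 1/(4q)
row_total = 4 * m * m - 1                       # total weight of one row of face triangles
flat = 2 * m * F(row_total, q * row_total)      # 2m rows in the middle hexagon
assert uniform == F(1, 4)
assert flat == F(2, 3)
assert uniform + flat == F(11, 12)
```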
6 Remarks
All of the results of Călinescu et al. [1] quoted above for k = 3 are special cases
of their results for general k. They give a linear-programming relaxation that
generalizes (LP 3), and a corresponding generalization of the notion of triangle-
embedding, an embedding into a (k − 1)-dimensional simplex in which the ter-
minals are mapped to the extreme points. They show that the optimal value of a
k-cut is at most (3k − 2)/(2k) times the optimal value of this linear-programming
problem. As a result, they obtain an approximation algorithm for the optimal k-cut
problem having performance guarantee (3k − 2)/(2k). The recent paper [7], which has
most of our results for k = 3, also has results for k > 3, improving the bounds
given by [1]. For example, [7] gives bounds of 1.1539 for k = 4 and 1.3438 for
all k > 6. The problem of giving a tight analysis for k > 3, as we now have for
k = 3, remains open.
Acknowledgment. We are grateful to Gruia Călinescu, Joseph Cheriyan, Kevin
Cheung, and Levent Tunçel for conversations about this work.
References
1. G. Călinescu, H. Karloff, and Y. Rabani: An improved approximation algorithm
for MULTIWAY CUT, Proceedings of the Symposium on Theory of Computing, ACM,
1998.
2. Kevin Cheung, private communication, 1999.
3. S. Chopra and M.R. Rao, “On the multiway cut polyhedron”, Networks 21(1991),
51–89.
4. W.H. Cunningham, “The optimal multiterminal cut problem”, in: C. Monma and
F. Hwang (eds.), Reliability of Computer and Communications Networks, American
Math. Soc., 1991, pp. 105–120.
5. E. Dahlhaus, D. Johnson, C. Papadimitriou, P. Seymour, and M. Yannakakis, “The
Complexity of multiway cuts”, extended abstract, 1983.
6. E. Dahlhaus, D. Johnson, C. Papadimitriou, P. Seymour, and M. Yannakakis, “The
Complexity of multiterminal cuts”, SIAM J. Computing, 23(1994), 864–894.
7. D. Karger, P. Klein, C. Stein, M. Thorup, and N. Young, “Rounding algorithms
for a geometric embedding of minimum multiway cut,” Proceedings of Symposium
on Theory of Computing, ACM, 1999, to appear.
Semidefinite Programming Methods for the
Symmetric Traveling Salesman Problem
Dragoš Cvetković, Mirjana Čangalović, and Vera Kovačević-Vujčić
1 Introduction
Semidefinite programming (SDP) has many applications to various classes of
optimization problems (see e.g. [33]). In particular, there is a growing interest in
the application of SDP to combinatorial optimization, where it is used in order to
get satisfactory bounds on the optimal objective function value (see [15], [31] for
a survey). Some examples are recently introduced semidefinite relaxations for the
max-cut problem (Goemans, Williamson [16]), graph colouring problem (Karger,
Motwani, Sudan [20]) and traveling salesman problem (Cvetković, Čangalović,
Kovačević-Vujčić [7], [8]). It is the purpose of this paper to investigate the power
of semidefinite relaxations for traveling salesman problem in a branch-and-bound
framework.
The traveling salesman problem (TSP) is one of the best-known NP-hard
combinatorial optimization problems. There is an extensive literature on both
theoretical and practical aspects of TSP. The most important theoretical results
on TSP can be found in [24] (see also [4], [9]). A large number of both exact
algorithms and heuristics for TSP have been proposed; for a review we refer to
Laporte [22], [23]. We shall mention here only the most important approaches for
finding an exact solution of the symmetric traveling salesman problem (STSP).
Two classical relaxations of STSP have been extensively discussed in literature.
The first exploits the fact that the cost of an optimal STSP-tour cannot be less
than that of a shortest 1-tree. Several algorithms of branch-and-bound type are
based on this relaxation first proposed by Christofides [3]. The basic algorithm
was developed by Held and Karp [19] and further improved by Helbig-Hansen
and Krarup [18], Smith and Thompson [32], Volgenant and Jonker [34], Gav-
ish and Srikanth [14] and, more recently, Carpaneto, Fischetti and Toth [2].
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 126–136, 1999.
© Springer-Verlag Berlin Heidelberg 1999
Fiedler shows that the notion of the Laplacian and the algebraic connectivity
can be generalized to graphs with positively weighted edges.
A C-edge-weighted graph GC = (V, E, C) is defined by graph G = (V, E)
and a symmetric nonnegative weight matrix C such that cij > 0 if and only if
{i, j} ∈ E. Now the Laplacian L(GC ) is defined as L(GC ) = diag(r1 , . . . , rn ) − C,
where ri is the sum of the i-th row of C. The Laplacian L(GC ) has characteristics
similar to those of L(G): it is symmetric and positive semidefinite, with
smallest eigenvalue λ1 = 0 and corresponding eigenvector e. As before, the
algebraic connectivity a(GC ) is the second smallest eigenvalue of L(GC ), which
enjoys similar properties to those in Theorem 1.
Theorem 2 (M. Fiedler [13]). The generalized algebraic connectivity a(GC )
has the following properties:
(i) a(GC ) = min_{x∈S} xT L(GC )x;
(ii) a(GC ) ≥ 0, and a(GC ) > 0 if and only if GC is connected.
In the sequel we shall assume that G = (V, E) is a complete undirected
graph, where, as before, V = {1, . . . , n} is the set of vertices and E is the set
of edges. To each edge {i, j} ∈ E a distance (cost) dij is associated such that
the distance matrix D = [dij ]n×n is symmetric and dii = 0, i = 1, . . . , n. Now
the symmetric traveling salesman problem (STSP) can be formulated as follows:
find a Hamiltonian circuit of G with minimal cost.
Algebraic connectivity of a Hamiltonian circuit is well known in the theory
of graph spectra (see e.g. [10]). The Laplacian of a circuit with n vertices has
the spectrum
2 − 2 cos(2πj/n), j = 1, . . . , n
and the second smallest eigenvalue is obtained for j = 1 and j = n − 1, i.e. λ2 =
λ3 = 2 − 2 cos(2π/n). This value will be denoted by hn , i.e. hn = 2 − 2 cos(2π/n).
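This spectrum is easy to verify numerically. The sketch below (an illustrative check, not part of the original text) applies the circuit Laplacian directly to the cosine vectors, using only the identity (Lx)i = 2xi − xi−1 − xi+1 for a cycle.

```python
import math

def apply_cycle_laplacian(x):
    # For the n-cycle, (L x)_i = 2 x_i - x_{i-1} - x_{i+1} (indices mod n).
    n = len(x)
    return [2 * x[i] - x[i - 1] - x[(i + 1) % n] for i in range(n)]

n = 8
eigs = []
for j in range(n):
    lam = 2 - 2 * math.cos(2 * math.pi * j / n)
    x = [math.cos(2 * math.pi * j * i / n) for i in range(n)]
    y = apply_cycle_laplacian(x)
    # x is an eigenvector of the cycle Laplacian with eigenvalue lam
    assert all(abs(y[i] - lam * x[i]) < 1e-9 for i in range(n))
    eigs.append(lam)

h_n = 2 - 2 * math.cos(2 * math.pi / n)
# the second smallest eigenvalue h_n is attained at j = 1 and j = n - 1
assert abs(sorted(eigs)[1] - h_n) < 1e-9
```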
The next theorem, which gives a basis for the discrete semidefinite program-
ming model of STSP, has been proved in [8] as a consequence of a more general
result. For the sake of completeness we supply here a self-contained proof.
Theorem 3. Let H be a spanning subgraph of G such that d(i) = 2, i = 1, . . . , n,
where d(i) is the degree of vertex i with respect to H, and let L(H) = [lij ]n×n
be the corresponding Laplacian. Let α and β be real parameters such that α >
hn /n, 0 < β ≤ hn . Then H is a Hamiltonian circuit if and only if the matrix
X = L(H) + αJ − βI is positive semidefinite, where J is the n × n matrix with
all entries equal to one and I is the unit matrix of order n.
Proof. Let 0 = λ1 ≤ λ2 ≤ . . . ≤ λn be the eigenvalues of L(H) and let x1 = e
and xi ∈ S, i = 2, . . . , n, be the corresponding eigenvectors which form a basis
for IRn . It is easy to check that J has two eigenvalues: 0, with multiplicity n − 1
and the corresponding eigenvectors x2 , . . . , xn , and n with e as its eigenvector.
Therefore
X e = (L + αJ − βI)e = (αn − β)e
X xi = (L + αJ − βI)xi = (λi − β)xi , i = 2, . . . , n
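A numerical illustration of Theorem 3 (a sketch with hand-picked parameters, not from the paper): for n = 6, the quadratic form of X = L(H) + αJ − βI takes a negative value on a suitable x ∈ S when H is the union of two triangles (a non-Hamiltonian 2-factor), while for the 6-cycle the same quadratic form stays nonnegative, consistent with X being positive semidefinite.

```python
import math

def laplacian(n, edges):
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][i] += 1.0
        L[j][j] += 1.0
        L[i][j] -= 1.0
        L[j][i] -= 1.0
    return L

def quad_X(L, x, alpha, beta):
    """x^T (L + alpha*J - beta*I) x, using J = e e^T and I directly."""
    n = len(L)
    s = sum(x)                                   # e^T x, so x^T J x = s^2
    lx = sum(L[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    return lx + alpha * s * s - beta * sum(v * v for v in x)

n = 6
h_n = 2 - 2 * math.cos(2 * math.pi / n)          # = 1 for n = 6
alpha, beta = h_n / n + 1.0, h_n                 # alpha > h_n/n, 0 < beta <= h_n
cycle = [(i, (i + 1) % n) for i in range(n)]
two_triangles = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]

# x in S (unit length, orthogonal to e) separating the two triangles:
x = [1 / math.sqrt(6)] * 3 + [-1 / math.sqrt(6)] * 3
assert quad_X(laplacian(n, two_triangles), x, alpha, beta) < 0   # X not psd
assert quad_X(laplacian(n, cycle), x, alpha, beta) >= -1e-9      # consistent with psd
```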
lii = 2, i = 1, . . . , n (1)
subject to
xii = 2 + α − β, i = 1, . . . , n (4)
∑_{j=1}^n xij = nα − β, i = 1, . . . , n (5)
X ≥0 (7)
where X ≥ 0 denotes that the matrix X = [xij ]n×n is symmetric and positive
semidefinite and α and β are chosen according to Theorem 3. Matrix L = X +
βI − αJ represents the Laplacian of a Hamiltonian circuit if and only if X
satisfies (4)-(7). Indeed, constraints (4)-(6) provide that L has the form of a
Laplacian with diagonal entries equal to 2, while condition (7) guarantees that
L corresponds to a Hamiltonian circuit. Therefore, if X ∗ is an optimal solution
of problem (3)-(7) then L∗ = X ∗ + βI − αJ is the Laplacian of an optimal
Hamiltonian circuit of G with the objective function value −(1/2) ∑_{i=1}^n ∑_{j=1}^n dij l∗ij = F (X ∗ ).
A natural semidefinite relaxation of the traveling salesman problem is ob-
tained when discrete conditions (6) are replaced by inequality conditions:
130 Dragoš Cvetković, Mirjana Čangalović, and Vera Kovačević-Vujčić
subject to
xii = 2 + α − β, i = 1, . . . , n (9)
∑_{j=1}^n xij = nα − β, i = 1, . . . , n (10)
subject to
lii = 2, i = 1, . . . , n (14)
∑_{j=1}^n lij = 0, i = 1, . . . , n (15)
Since s ∈ S, it follows that L ◦ ssT ≥ hn .
(ii) It is sufficient to prove that the hyperplane L ◦ ssT = hn and the set
W have nonempty intersection. We shall construct a point in the intersection of
the form Lγ = γL∗ + (1 − γ)L̂, where L̂ = (2 + 2/(n − 1))I − (2/(n − 1))J. The matrix
L̂ has the eigenvalue 2 + 2/(n − 1) with multiplicity n − 1 and each x ∈ S as its
eigenvector, and 0 with multiplicity 1. Since λ2 (L̂) = 2 + 2/(n − 1) > hn it follows
that L̂ ∈ ri W .
Let us prove that for each γ ∈ (0, 1) the vector s is an eigenvector corre-
sponding to λ2 (Lγ ). Indeed, for each x ∈ S
Lγ ◦ xxT = γL∗ ◦ xxT + (1 − γ)L̂ ◦ xxT ≥ γL∗ ◦ ssT + (1 − γ)L̂ ◦ ssT = Lγ ◦ ssT .
Moreover,
λ2 (Lγ ) = Lγ ◦ ssT = γλ2 (L∗ ) + (1 − γ)(2 + 2/(n − 1)).
For γ = (2 + 2/(n − 1) − hn )/(2 + 2/(n − 1) − λ2 (L∗ )) we have λ2 (Lγ ) = Lγ ◦ ssT = hn . ⊓⊔
L ◦ ssT ≥ hn (18)
Algorithm 2. The first non-integer entry of the SDP solution matrix is replaced
in the sons by 0 and 1, respectively.
For solving the SDP relaxation tasks we used the CSDP 2.2 software package
developed by Borchers [1] in the C language. Inequality conditions (11) were handled
by adding n2 − n slack variables, each represented by a 1 × 1 block, as accepted by
the software.
Our numerical experiments included 55 randomly generated STSP instances
of dimension 10 ≤ n ≤ 20 already treated in [8]. Entries of the distance matrix
are uniformly distributed in the range from 1 to 999. The experiments were
performed on an Alpha 800 5/400 computer. In a time-sharing system it took no
more than 1 minute of real time to get a solution of an SDP relaxation task related
to an STSP instance of dimension n ≤ 20. Computational results are presented
in Table 1.
Table 1.
1 2 3 4 5 6 7 8 9
1 1680.9950 1681 1681 1 / 1 /
2 2777.9920 2778 2778 1 / 1 /
10 3 1626.1300 1714 1630 7 5 5 3
4 2058.9950 2059 2059 1 / 1 /
5 2672.4910 2801 2713 24 18 11(1) 6
1 2884.0000 2884 2884 1 / 1 /
2 2258.4940 2283 2283 10 7 7 4
11 3 1565.0000 1565 1565 1 / 1 /
4 1226.8920 1229 1229 7 5 5 3
5 1999.0000 2019 2019 4 3 3 2
1 2962.0000 2962 2962 1 / 1 /
2 2416.0000 2416 2416 1 / 1 /
12 3 1267.0010 1267 1267 1 / 1 /
4 2434.0000 2434 2434 1 / 1 /
5 1981.7260 2021 2021 15 11 5 3
1 1742.0000 1742 1742 1 / 1 /
2 2064.4350 2072 2072 49 35 7(2) 4
13 3 1786.0010 1786 1786 1 / 1 /
4 2650.3250 2688 2686 10 7 11(1) 6
5 2458.0000 2458 2458 1 / 1 /
1 1503.0000 1838 1503 1 / 1 /
2 2269.0000 2269 2269 1 / 1 /
14 3 1985.5090 2091 2091 351 242 57(51) 29
4 2170.5680 2173 2173 11 8 5 3
5 2000.0000 2000 2000 1 / 1 /
1 1548.0010 1926 1548 1 / 1 /
2 1415.0000 1415 1415 1 / 1 /
15 3 1813.0000 2082 1849 4 3 3 2
4 2455.6730 2471 2471 8 6 7(2) 4
5 1749.0000 1749 1749 1 / 1 /
Table 1. (continued)
1 2 3 4 5 6 7 8 9
1 2579.0000 2579 2579 1 / 1 /
2 2189.0000 2189 2189 1 / 1 /
16 3 2147.9210 2247 2181 54 38 51(20) 26
4 1447.7110 1473 1473 11 8 7(1) 4
5 2595.0010 2896 2595 1 / 1 /
1 1183.9970 1184 1184 1 / 1 /
2 2606.9930 2607 2607 1 / 1 /
17 3 1664.9950 1665 1665 1 / 1 /
4 1568.4940 1579 1576 13 9 9 5
5 2192.2420 2233 2208 53 36 5(1)∗ 3
1 2606.5870 2676 2651 ∗ 59(1) 30
2 2273.2220 2356 2275 21 15 11 6
18 3 1562.0080 1673 1562 1 / 1 /
4 2490.0000 2490 2490 1 / 1 /
5 1815.9110 1994 1824 ∗ 17 9
1 1223.9960 1420 1224 1 / 1 /
2 2039.9930 2073 2068 89 60 19(3)∗ 10
19 3 1417.9960 1418 1418 1 / 1 /
4 1897.4480 2121 1926 ∗ 29(5) 15
5 2015.5880 2166 2035 * 15(5) 8
1 1953.2990 2250 2011 * 71(19) 36
2 2410.2790 2501 2420 452 309 7(3) 4
20 3 2585.1740 2680 2589 4 3 13(2) 7
4 1758.4360 1784 1777 20 14 15(2) 8
5 1817.7610 1838 1838 50 34 27(1) 14
References
1. Borchers, B.: CSDP, A C Library for Semidefinite Programming. Optimization
Methods and Software (to appear)
2. Carpaneto G., Fischetti M., Toth P.: New Lower Bounds for the Symmetric Trav-
elling Salesman Problem. Math. Program. 45 (1989) 233–254.
3. Christofides N.: The Shortest Hamiltonian Chain of a Graph. SIAM J. Appl. Math.
19 (1970) 689–696.
4. Cook, W., Cunningham, W., Pulleyblank, W., Schrijver, A.: Combinatorial Opti-
mization. John Wiley & Sons, New York Chichester Weinheim Brisbane Singapore
Toronto (1998)
5. Crowder H., Padberg M.W.: Solving Large-Scale Symmetric Travelling Salesman
Problems to Optimality. Management Sci. 26 (1980) 495–509
6. Cvetković, D., Čangalović, M., Dimitrijević, V., Kraus, L., Milosavljević, M., Simić,
S.: TSP-SOLVER - A Programming Package for the Traveling Salesman Problem.
Univ. Beograd, Publ. Elektrotehn. Fak. Ser. Mat., 1 (1990) 41–47
7. Cvetković, D., Čangalović, M., Kovačević-Vujčić, V.: Semidefinite Programming
and Traveling Salesman Problem. In: Petrović, R., Radojević, D. (eds.): Proceed-
ings of Yugoslav Symposium on Operations Research. Herceg Novi, Yugoslavia
(1998) 239–242
8. Cvetković, D., Čangalović, M., Kovačević-Vujčić, V.: Semidefinite Relaxations of
Travelling Salesman Problem. (to appear)
9. Cvetković, D., Dimitrijević, V., Milosavljević, M.: Variations on the Travelling
Salesman Theme. Libra Produkt, Belgrade (1996)
10. Cvetković, D., Doob, M., Sachs, H.: Spectra of Graphs. 3rd edn. Johann Ambrosius
Barth, Heidelberg Leipzig (1995)
11. Dantzig G.B., Fulkerson D.R., Johnson S.M.: Solution of a Large-Scale Traveling
Salesman Problem. Operations Research 2 (1954) 393–410
12. Fiedler M.: Algebraic Connectivity of Graphs. Czechoslovak Math. J. 23 (1973)
298–305
13. Fiedler, M.: Laplacian of Graphs and Algebraic Connectivity. In: Combinatorics
and Graph Theory, Vol. 25, Banach center publications, PWN-Polish scientific
publishers Warsaw (1989) 57–70
14. Gavish B., Srikanth K.N.: An Optimal Solution Method for Large-Scale Multiple
Travelling Salesman Problems. Operations Research 34 (1986) 698–717
15. Goemans, M.: Semidefinite Programming in Combinatorial Optimization. Math.
Program. 79 (1997) 143–161
16. Goemans M.X., Williamson D.P.: Improved Approximation Algorithms for Max-
imum Cut and Satisfiability Problems Using Semidefinite Programming. J. ACM
42 (1995) 1115–1145
17. Grötschel M., Holland O.: Solution of Large-Scale Symmetric Travelling Salesman
Problems. Math. Program. 51 (1991) 141–202
18. Helbig-Hansen K., Krarup J.: Improvements of the Held-Karp Algorithm for the
Symmetric Traveling Salesman Problem. Math. Program. 7 (1974) 87–96
19. Held M., Karp R.M.: The Travelling Salesman Problem and Minimum Spanning
Trees. Part II, Math. Program. 1 (1971) 6–25
20. Karger D., Motwani R., Sudan M.: Approximate Graph Coloring by Semidefinite
Programming. J. ACM 45 (1998) 246–265
21. Land A.H.: The Solution of Some 100-City Travelling Salesman Problems. Working
Paper. London School of Economics (1979)
22. Laporte, G.: The Traveling Salesman Problem: An Overview of Exact and Approx-
imate Algorithms. European J. Operational Research 59 (1992) 231–247
23. Laporte G.: Exact Algorithms for the Traveling Salesman Problem and the Vehicle
Routing Problem. Les Cahiers du GERAD G-98-37 July (1998)
24. Lawler, E.L., Lenstra, J.K., Rinnooy Kan, A.H.G., Shmoys, D.B.: The Traveling
Salesman Problem. John Wiley & Sons, Chichester New York Brisbane Toronto
Singapore (1985)
25. Martin G.T.: Solving the Travelling Salesman Problem by Integer Programming.
Working Paper. CEIR, New York (1966)
26. Miliotis P.: Integer Programming Approaches to the Travelling Salesman Problem.
Math. Program. 10 (1976) 367–378
27. Miliotis P.: Using Cuting Planes to Solve the Symmetric Travelling Salesman Prob-
lem. Math. Program. 15 (1978) 177–188
28. Padberg M.W., Hong S.: On the Symmetric Travelling Salesman Problem: A Com-
putational Study. Math. Program. Study 12 (1980) 78–107
29. Padberg M.W., Rinaldi G.: Optimization of a 532-City Symmetric Traveling Sales-
man Problem by Branch and Cut. Operations Research Letters 6 (1987) 1–7
30. Padberg M.W., Rinaldi G.: A Branch-and-Cut Algorithm for the Resolution of
Large Scale Symmetric Traveling Salesman Problems. SIAM Review 33 (1991)
66–100
31. Rendl, F.: Semidefinite Programming and Combinatorial Optimization. Technical
Report Woe-19, TU Graz, Austria December (1997)
32. Smith T.H.C., Thompson G.L.: A LIFO Implicit Enumeration Search Algorithm
for the Symmetric Traveling Salesman Problem Using Held and Karp’s 1-Tree
Relaxation. Annals Disc. Math. 1 (1977) 479–493
33. Vandenberghe, L., Boyd, S.: Semidefinite Programming. SIAM Review 38 (1996)
49–95
34. Volgenant T., Jonker R.: A Branch and Bound Algorithm for the Symmetric Trav-
eling Salesman Problem Based on the 1-Tree Relaxation. European J. Operational
Research 9 (1982) 83–89
Bounds on the Chvátal Rank of Polytopes in the
0/1-Cube
1 Introduction
Chvátal [11] established cutting-plane proofs as a way to certify certain properties
of combinatorial problems, e.g., that there are no k pairwise
non-adjacent nodes in a given graph, that there is no acyclic subdigraph with k
arcs in a given digraph, or that there is no tour of length at most k in a pre-
scribed instance of the traveling salesperson problem. In this paper we discuss
the length of such proofs. Let us first recall the notion of a cutting-plane proof.
A sequence of inequalities
c1 x ≤ δ1 , c2 x ≤ δ2 , . . . , cm x ≤ δm (1)
is called a cutting-plane proof of c x ≤ δ from a given system of linear inequalities
A x ≤ b, if c1 , . . . , cm are integral, cm = c, δm = δ, and if ci x ≤ δ′i is a
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 137–150, 1999.
© Springer-Verlag Berlin Heidelberg 1999
138 Friedrich Eisenbrand and Andreas S. Schulz
system for propositional logic in which every tautology has a short proof. Here,
the length of the proof is measured by the total number of symbols in it and short
means polynomial in the length of the tautology. This question is equivalent to
whether or not NP equals co-NP. Cook, Coullard, and Turán [14] were the first to
consider cutting-plane proofs as a propositional proof system. In particular, they
pointed out that the cutting-plane proof system is a strengthening of resolution
proofs. Since the work of Haken [25] exponential lower bounds are known for
the latter. Results of Chvátal, Cook, and Hartmann [13], of Bonet, Pitassi, and
Raz [7], of Impagliazzo, Pitassi, and Urquhart [30], and of Pudlák [34] imply
exponential lower bounds on the length of cutting-plane proofs as well. On the
other hand, there is no upper bound on the length of cutting-plane proofs in
terms of the dimension of the corresponding polyhedron as the following well-
known example shows. The Chvátal rank of the polytope defined by
−t x1 + x2 ≤ 1
t x1 + x2 ≤ t + 1
x1 ≤ 1
x1 , x2 ≥ 0
grows with t. Here, t is an arbitrary positive number. This fact is rather counterintuitive
since the corresponding integer hull is a 0/1-polytope, i.e., all its vertices
have components 0 or 1 only. That is, for any 0/1-polytope there is a simple
certificate of the validity of an inequality c x ≤ δ: just list all (at most 2^n ) possible
assignments of 0/1-values to the variables. One of our main results helps to
meet the natural expectation. We give a polynomial bound in the dimension for
the Chvátal rank of any polytope contained in the 0/1-cube. Then, Theorem 1
implies the existence of exponentially long cutting-plane proofs, matching the
known exponential lower bounds.
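The example polytope above can be probed with exact rational arithmetic. This sketch (illustrative, not from the paper) confirms that the fractional vertex (1/2, 1 + t/2) lies in the polytope while every integer point satisfies x2 ≤ 1; so x2 ≤ 1 is valid for the integer hull, yet parts of the polytope lie arbitrarily far above it as t grows, which is what forces the Chvátal rank up.

```python
from fractions import Fraction as F

def in_P(t, x1, x2):
    # the four constraint families of the example polytope
    return (-t * x1 + x2 <= 1 and t * x1 + x2 <= t + 1
            and 0 <= x1 <= 1 and x2 >= 0)

for t in (F(1), F(2), F(10), F(100)):
    apex = (F(1, 2), 1 + t / 2)     # intersection of the two slanted facets
    assert in_P(t, *apex)
    # every integer point of P lies in the unit square, so x2 <= 1 is valid for P_I
    integer_points = [(x1, x2) for x1 in range(2) for x2 in range(int(t) + 2)
                      if in_P(t, F(x1), F(x2))]
    assert all(x2 <= 1 for _, x2 in integer_points)
```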
In polyhedral combinatorics, it has been quite common to consider the depth
of a class of inequalities if not as an indicator of quality at least as a measure
of its complexity. Hartmann, Queyranne, and Wang [29] give conditions under
which an inequality has depth at most 1 and use them to establish that sev-
eral classes of inequalities for the traveling salesperson polytopes have depth at
least 2, as was claimed before in [3, 8, 9, 10, 18, 20, 24]. However, it follows
from a recent result in [16] that deciding whether a given inequality c x ≤ δ
has depth at least 2 can in general not be done in polynomial time, unless
P = NP. Chvátal, Cook, and Hartmann [13] (see also [27]) answered questions
and proved conjectures of Schrijver, of Barahona, Grötschel, and Mahjoub [4], of
Jünger, of Chvátal [12], and of Grötschel and Pulleyblank [24] on the behavior
of the depth of certain inequalities relative to popular relaxations of the stable
set polytope, the bipartite-subgraph polytope, the acyclic-subdigraph polytope,
and the traveling salesperson polytope, resp. They obtained similar results for
the set-covering and the set-partitioning polytope, the knapsack polytope, and
the maximum-cut polytope, and so did Schulz [38] for the transitive packing,
the clique partitioning, and the interval order polytope. The observed increase
of the depth was never faster than a linear function of the dimension; we prove
that this indeed has to be the case as the depth of any inequality with coeffi-
cients bounded by a constant is O(n), relative to a polytope in the 0/1-cube.
Naturally, most polytopes associated with combinatorial optimization problems
are 0/1-polytopes.
Main Results. We present two new upper bounds on the depth of inequalities
relative to polytopes in the 0/1-cube. For notational convenience, let P be any
polytope contained in the 0/1-cube, i.e., P ⊆ [0, 1]n , and let c x ≤ δ, c ∈ Zn , be
an arbitrary inequality valid for the integer hull PI of P .
We prove first that the depth of c x ≤ δ relative to P is at most 2(n2 +
n lg ‖c‖∞ ). This yields an O(n2 lg n) bound on the Chvátal rank of P since
any 0/1-polytope PI can be represented by a system of inequalities Ax ≤ b
with A ∈ Zm×n , b ∈ Zm such that each absolute value of an entry in A is
bounded by n^{n/2} . Note that the latter bound is sharp, i.e., there exist 0/1-
polytopes with facets for which any inducing inequality a x ≤ β, a ∈ Zn , satisfies
‖a‖∞ ∈ Ω(n^{n/2} ) [1].
Second, we show that the depth of c x ≤ δ relative to P is no more than
‖c‖1 + n. A similar result was only known for monotone polyhedra [13]. In fact,
we present a reduction to the monotone case that is of interest in its own right
because of the smooth interplay of unimodular transformations and rounding
operations. The second bound gives an asymptotic improvement by a factor of n
over the aforementioned bound if the components of c are bounded by a constant.
Third, we construct a family of polytopes in the n-dimensional 0/1-cube
whose Chvátal rank is at least (1 + ε)n, for some ε > 0. In other words, if r(n)
denotes the maximum Chvátal rank over all polytopes that are contained in
[0, 1]n , then it is one outcome of our study that this function satisfies (1 + ε)n ≤ r(n) ≤ O(n2 lg n).
Finally, we also show that the number of inequalities in any linear description
of a polytope P ⊆ [0, 1]n with empty integer hull is exponential in n, whenever
there is an inequality of depth n.
Related Work. Via a geometric argument, Bockmayr and Eisenbrand [5] derived
the first polynomial upper bound of 6n3 lg n on the Chvátal rank of polytopes in
the n-dimensional 0/1-cube. Subsequently, Schulz [39] and Hartmann [28] independently
obtained both a considerably simpler proof and a slightly better bound
of n2 lg(n^{n/2} ), by using bit-scaling. The reader is referred to the joint journal
version of their papers [6], where the authors actually show that the depth of any
inequality c x ≤ δ, c ∈ Zn , which is valid for PI is at most n2 lg ‖c‖∞ , relative to
P . For monotone polytopes P , Chvátal, Cook, and Hartmann [13] showed that
the depth of any inequality c x ≤ δ that is valid for PI is at most ‖c‖1 . Moreover,
they also identified polytopes stemming from relaxations of combinatorial
optimization problems that have Chvátal rank at least n.
Eventually, our study of r(n) can also be seen as a continuation of the in-
vestigation of combinatorial properties of 0/1-polytopes, like their diameter [32],
2 Preliminaries
where the intersection ranges over all rational half-spaces containing P . We refer
to an application of the ′ operation as one iteration of the Gomory-Chvátal procedure.
If we set P (0) = P and P (i+1) = (P (i) )′ , for i ≥ 0, then the Chvátal rank
of P is the smallest number t such that P (t) = PI . The depth of an inequality
c x ≤ δ with respect to P is the smallest k such that c x ≤ δ is valid for P (k) .
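In dimension one a single iteration already reaches the integer hull: every valid inequality x ≤ b (resp. −x ≤ −a) rounds to x ≤ ⌊b⌋ (resp. x ≥ ⌈a⌉). A minimal sketch of this observation, contrasting with the two-dimensional example whose rank grows with t:

```python
import math

def gc_iteration_interval(a, b):
    """One Gomory-Chvatal iteration for the 1-dimensional polytope [a, b]:
    round the right-hand sides of x <= b and -x <= -a."""
    return (math.ceil(a), math.floor(b))

assert gc_iteration_interval(0.3, 2.7) == (1, 2)   # the integer hull, in one step
assert gc_iteration_interval(0.0, 1.0) == (0, 1)   # already integral: fixed point
```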
Let P ⊆ Rn be a polyhedron. A polyhedron Q with Q ⊇ P is called a
weakening of P , if QI = PI . If c x ≤ δ is valid for PI , then the depth of this
inequality with respect to Q is an upper bound on the depth of this inequality
with respect to P . It is easy to see that each polytope P ⊆ [0, 1]n has a rational
weakening in the 0/1-cube.
The following important lemma can be found in [37, p. 340]. (For a very
nice treatment, see also [15, Lemma 6.33].) It allows us to use induction on the
dimension of the polyhedra considered and provides the key for the termination
of the Gomory-Chvátal procedure, which was shown by Schrijver for rational
polyhedra in [36].
The polytopes in this example have exponentially many inequalities, and this
indeed has to be the case.
Proposition 1. Let P ⊆ [0, 1]n be a polytope in the 0/1-cube with PI = ∅ and
rank(P ) = n. Any inequality description of P has at least 2n inequalities.
Proof. For a polytope P ⊆ Rn and for some i ∈ {1, . . . , n} and ℓ ∈ {0, 1} let
Piℓ ⊆ Rn−1 be the polytope defined by
Piℓ = {x ∈ [0, 1]n−1 | (x1 , . . . , xi−1 , ℓ, xi+1 , . . . , xn )T ∈ P }.
Notice that, if P is contained in a facet (xi = ℓ) of [0, 1]n for some ℓ ∈ {0, 1}
and some i ∈ {1, . . . , n}, then the Chvátal rank of P is the Chvátal rank of Piℓ .
We will prove now that any one-dimensional face F1 of the cube satisfies
F1 ∩ P ≠ ∅. We proceed by induction on n.
If n = 1, this is certainly true since P is not empty and since F1 is the cube
itself. For n > 1, observe that any one-dimensional face F1 of the cube lies in a
facet (xi = ℓ) of the cube, for some ℓ ∈ {0, 1} and for some i ∈ {1, . . . , n}. Since
P has Chvátal rank n it follows that P̃ = (xi = ℓ) ∩ P has Chvátal rank n − 1.
If the Chvátal rank of P̃ were less than that, P would vanish after n − 1 steps.
It follows by induction that (F1 )ℓi ∩ P̃iℓ ≠ ∅, and thus F1 ∩ P ≠ ∅.
Now, each 0/1-point has to be cut off from P by some inequality, as PI = ∅.
If an inequality c x ≤ δ cuts off two different 0/1-points simultaneously, then it
must also cut off a 1-dimensional face of [0, 1]n . Because of our previous observation
this is not possible, and hence there is at least one inequality for each
0/1-point which cuts off only this point. Since there are 2^n different 0/1-points
in the cube, the claim follows. ⊓⊔
where ⌊y⌋ denotes the largest integer smaller than or equal to y. Note that lg n
is the number of bits in the binary representation of n. For a vector x ∈ Rn , ⌊x⌋
denotes the vector obtained by component-wise application of ⌊·⌋.
Proof. Notice that Fβ′ = ∅ for each β > α. The proof is by induction on γ − α.
If α = γ, there is nothing to prove. So let γ − α > 0. Since Fγ′ = ∅, Lemma 1
implies that c x ≤ γ − ε is valid for P ′ for some ε > 0 and thus the inequality
c x ≤ γ − 1 is valid for P (2) . ⊓⊔
Proof. We can assume that c ≥ 0 holds and that PI ≠ ∅. (It is shown in [6] that
polytopes with empty integer hull have Chvátal rank at most n.) The proof is
by induction on n and lg ‖c‖∞ . The claim holds for n = 1, 2 since the Chvátal
rank of a polytope in the 1- or 2-dimensional 0/1-cube is at most 4.
So let n > 2. If lg ‖c‖∞ = 1, then the claim follows, e.g., from Theorem 3
below. So let lg ‖c‖∞ > 1. Write c = 2c1 + c2 , where c1 = ⌊c/2⌋ and c2 ∈ {0, 1}n .
By induction, it takes at most 2(n2 + n lg ‖c1 ‖∞ ) = 2(n2 + n lg ‖c‖∞ ) − 2n
Proof. The proof is by induction on ‖w‖1. If ‖w‖1 = 0, the claim follows trivially.
W.l.o.g., we can assume that w ≥ 0 holds. Let γ = max{w x | x ∈ P} and let
J = {j | wj > 0}. If max{Σ_{j∈J} xj | x ∈ P} = |J|, then, since P is monotone, the point x̂
with

x̂i = 1 if i ∈ J, and x̂i = 0 otherwise

is in P. Also w x̂ = γ must hold. So γ = δ and the claim follows trivially. If
max{Σ_{j∈J} xj | x ∈ P} < |J|, then Σ_{j∈J} xj ≤ |J| − 1 has depth at most 1. If
‖w‖1 = 1 this also implies the claim, so assume ‖w‖1 ≥ 2. By induction, the
valid inequalities w x − xj ≤ δ, j ∈ J, have depth at most ‖w‖1 − δ − 1. Adding
up the inequalities w x − xj ≤ δ, j ∈ J, and Σ_{j∈J} xj ≤ |J| − 1 yields

w x ≤ δ + (|J| − 1)/|J|.
Consider a unimodular transformation

u : R^n → R^n, x ↦ U x + v.

For a halfspace (c x ≤ δ) one has

{x ∈ R^n | c u^{−1}(x) ≤ δ} = {x ∈ R^n | c U^{−1} x ≤ δ + c U^{−1} v} = (c U^{−1} x ≤ δ + c U^{−1} v).

It follows that

u(P′) = ⋂_{(c x ≤ δ) ⊇ P, c ∈ Z^n} (c U^{−1} x ≤ ⌊δ⌋ + c U^{−1} v).
The switching operation πi is defined by

πi : R^n → R^n, (x1, . . . , xn) ↦ (x1, . . . , xi−1, 1 − xi, xi+1, . . . , xn).

It has a representation

πi : R^n → R^n, x ↦ U x + ei,

where U coincides with the identity matrix In except for U(i,i), which is −1. Note
that the switching operation is a bijection of [0, 1]^n. For the set (c x ≤ δ) one
has πi(c x ≤ δ) = (c̃ x ≤ δ − ci), where c̃ coincides with c except for a change of
sign in the i-th component.
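The identity for the image of a halfspace under a switching can be checked mechanically; the following sketch (our notation, not the paper's) verifies it on every 0/1-point:

```python
from itertools import product

def switch_point(i, x):
    # pi_i flips the i-th coordinate of a 0/1-point
    y = list(x)
    y[i] = 1 - y[i]
    return y

def switch_ineq(i, c, delta):
    # pi_i maps the set (c x <= delta) to (c~ x <= delta - c_i),
    # where c~ equals c with the sign of the i-th component changed
    c_tilde = list(c)
    c_tilde[i] = -c_tilde[i]
    return c_tilde, delta - c[i]

c, delta, i = [3, -2, 5], 2, 1
c_tilde, delta_tilde = switch_ineq(i, c, delta)
for x in product([0, 1], repeat=3):
    lhs = sum(cj * xj for cj, xj in zip(c, x)) <= delta
    rhs = sum(cj * yj for cj, yj in zip(c_tilde, switch_point(i, x))) <= delta_tilde
    assert lhs == rhs  # x satisfies the inequality iff pi_i(x) satisfies its image
```

The check works because c̃ · πi(x) = c · x − ci for every x, so membership is preserved exactly when the right-hand side drops by ci.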
0 ≤ xi ≤ 1, i ∈ {1, . . . , n},
ax̂ x ≤ γx̂, x̂ ∈ {0, 1}^n, x̂ ∉ P.
Proof. One can assume that c is nonnegative, since one can apply a series of
switching operations. Notice that this can change the right-hand side δ, but in
the end δ has to be nonnegative since P ≠ ∅. Let K = {x ∈ [0, 1]^n | c x ≤ δ}
and consider the polytope Q = conv(K, P). The inequality c x ≤ δ is valid for
QI, and the depth of c x ≤ δ with respect to P is at most the depth of c x ≤ δ
with respect to Q. By Lemma 8, Q^(n) has a monotone weakening S. The depth
of c x ≤ δ with respect to Q^(n) is at most the depth of c x ≤ δ with respect to
S. But it follows from Lemma 6 that the depth of c x ≤ δ with respect to S is
at most ‖c‖1 − δ ≤ ‖c‖1. □
The construction relies on the lower bound result for the fractional stable set
polytope due to Chvátal, Cook, and Hartmann [13].
Let G = (V, E) be a graph on n vertices, let C be the family of all cliques of
G, and let Q ⊆ R^n be the fractional stable set polytope of G, defined by the
nonnegativity constraints x ≥ 0 and the clique inequalities Σ_{v∈C} xv ≤ 1, C ∈ C.
Let e be the vector of all ones. The following lemma is proved in [13, Proof
of Lemma 3.1].
Lemma 9. Let k < s be positive integers and let G be a graph with n vertices
such that every subgraph of G with s vertices is k-colorable. If P is a polyhedron
that contains QI and the point u = (1/k) e, then P^(j) contains the point
xj = (s/(s + k))^j u.
Let α(G) be the size of the largest independent subset of the nodes of G. It
follows that e x ≤ α(G) is valid for QI. One has

e xj = (n/k) (s/(s + k))^j ≥ (n/k) e^{−jk/s},

and thus xj does not satisfy the inequality e x ≤ α(G) for all j < (s/k) ln(n/(k α(G))).
Erdős proved in [17] that for every positive t there exist a positive integer
c, a positive number δ, and arbitrarily large graphs G with n vertices and cn edges
such that α(G) < tn and every subgraph of G with at most δn vertices is 3-colorable.
One wants that ln(n/(k α(G))) > 1 and that s/k grows linearly, so by choosing some
t < 1/(3e), k = 3, and s = ⌊δn⌋ one has that xj does not satisfy the inequality
e x ≤ α(G) for all j < s/k.
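The two estimates above are easy to check numerically; the following sketch (parameter values are ours, chosen only for illustration) confirms that e·xj = (n/k)(s/(s + k))^j stays above the exponential lower bound (n/k)e^{−jk/s}:

```python
import math

def e_dot_xj(n, k, s, j):
    # the all-ones objective at the point x_j = (s/(s+k))^j * (1/k) e of Lemma 9
    return (n / k) * (s / (s + k)) ** j

# hypothetical instance: n vertices, k = 3 colors, s = floor(delta * n) with delta = 1/10
n, k = 10**6, 3
s = n // 10
for j in range(0, s // k, 997):
    assert e_dot_xj(n, k, s, j) >= (n / k) * math.exp(-j * k / s)
```

The inequality holds because s/(s + k) = 1/(1 + k/s) ≥ e^{−k/s}, which follows from ln(1 + k/s) ≤ k/s.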
We now give the construction. Let P be the polytope that results from the
convex hull of Pn defined in (2) and Q. Pn ⊆ P contributes to the fact that
(1/2) e is in P^(n−1) [13, Lemma 7.2]. Thus x0 = (1/3) e is in P^(n−1). Since the
integer hull of P is QI, it follows from the above discussion that the depth of
e x ≤ α(G) with respect to P^(n−1) is Ω(n). Thus the depth of e x ≤ α(G) is at
least (n − 1) + Ω(n) ≥ (1 + ε)n for infinitely many n, where ε > 0.
Acknowledgments
The authors are grateful to Alexander Bockmayr, Volker Priebe, and Günter
Ziegler for helpful comments on an earlier version of this paper.
References
[1] N. Alon and V. H. Vu. Anti-Hadamard matrices, coin weighing, threshold gates,
and indecomposable hypergraphs. Journal of Combinatorial Theory, Series A,
79:133–160, 1997.
Lisa Fleischer
1 Introduction
In the 1960s, Ford and Fulkerson introduced dynamic network flows to include
time in the standard network flow model. Since then, dynamic network flows
have been used widely to model network-structured, decision-making problems
over time: problems in electronic communication, production and distribution,
economic planning, cash flow, job scheduling, and transportation. For examples,
see the surveys of Aronson [4] and Powell, et al. [20].
The maximum dynamic flow problem generalizes the standard maximum flow
problem by introducing time. A standard network consists of a set of nodes V
and a set of arcs E which is a subset of V × V . The capacity function from
the arcs to the real numbers bounds the amount of flow allowed on each arc. In
a dynamic network, the capacity function u limits the rate of flow into an arc.
In addition, a dynamic network has a transit-time vector % associated with the
arcs. The transit time of an arc is the amount of time it takes for flow to travel
from one end of the arc to the other.
Ford and Fulkerson [7] consider the dynamic maximum flow problem: given
a dynamic network with a specified source and sink, determine the maximum
amount of flow that can be sent from source to sink in T time units. They show
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 151–165, 1999.
© Springer-Verlag Berlin Heidelberg 1999
that the problem can be solved in polynomial time by using information obtained
from a minimum cost flow computation in a related network of comparable size.
A generalization of dynamic maximum flows, a universally maximum flow is a
flow that simultaneously maximizes the amount of flow reaching a specified sink
node d, sent from a specified source node s, by time t, for all 0 < t ≤ T . Wilkin-
son [24] and Minieka [14] showed that such a flow exists, and they both provide
algorithms to solve this problem, but these do not run in polynomial time. There
is no known polynomial time algorithm that solves the universally maximum flow
problem. It is not even known if the optimal solution is polynomial in the size
of the input. Hoppe and Tardos [11,12] present an O(mε^{−1}(m + n log n) log U)
algorithm that computes a dynamic flow with the property that the quantity of
flow reaching the sink by time t is within a factor (1 − ε) of the maximum possible
for all 0 ≤ t ≤ T.
Another generalization of the maximum dynamic flow problem is the problem
with multiple sources and sinks. The dynamic transshipment problem is a single
commodity flow problem that asks to send specified supplies located at source
nodes to satisfy specified demands at sink nodes within a given time bound T .
A universally quickest transshipment is a dynamic flow that satisfies as much
demand as possible in every interval of time (0, t] for 0 < t ≤ T . If there is
more than one sink, a universally quickest transshipment may not exist, so the
universally quickest transshipment problem refers to a dynamic transshipment
problem with one sink only.
One interesting special case of dynamic flow problems is when all transit
times are zero. In this setting, the universally maximum flow problem is solved
by sending flow at the rate of a maximum flow in the static network at every
moment of time in [0, T ]. (The static network is the interpretation of the dy-
namic network as a standard network. Transit times are ignored, and capacities
are interpreted as bounds on the total flow allowed on an arc.) A universally
quickest transshipment with zero transit times has a more complicated struc-
ture. This problem models the situation when the supplies and demands exceed
network capacity, and it is necessary to send flow over time, in rounds. Unlike
standard flow problems with multiple sources and sinks, this problem cannot
be modeled as an equivalent s–t dynamic maximum flow problem. The reason
such a transformation does not work here is that the arc capacities are
upper bounds on the rate of flow per time period, and do not correspond to
the total amount of flow an arc may carry throughout an interval. The dynamic
transshipment problem with zero transit times is discussed in [6,10,15,23]. Hajek
and Ogier [10] describe the first polynomial time algorithm to find a universally
quickest transshipment for a dynamic network with all zero transit times. Their
algorithm uses n maximum flow computations on the underlying static network.
This is improved in [6] with an algorithm that solves this problem in the same
asymptotic time as a preflow-push maximum flow algorithm.
All of the above mentioned problems may be generalized to allow storage
of flow at the nodes. In this setting, there may be node storage capacities that
limit the total excess flow allowed at a node. Flow deficit is not allowed. In
Universally Maximum Flow with Piecewise-Constant Capacities 153
the above mentioned problems, even when node excess is allowed, there always
exists an optimal solution that does not use any node storage, and the algorithms
mentioned above find such a solution. For the universally quickest transshipment
with piecewise-constant capacity functions, the subject of this paper, this will
not be the case: node storage will be used in an optimal solution, and the amount
of available storage will affect the solution.
Many other generalizations of this problem have been considered in the lit-
erature, however few other generalizations are known to be polynomial-time
solvable. One exception is the integer dynamic transshipment problem with
general transit times, for which Hoppe and Tardos [13] give the only known
polynomial-time algorithm. Their algorithm repeatedly calls an
oracle to minimize submodular functions. Orlin [17] provides polynomial time
algorithms for some infinite horizon problems, where the objective function is
related to average throughput. Burkard, et al. [5] find the minimum feasible time
to send specified flow from source to sink faster than incorporating the Ford-
Fulkerson algorithm in a binary search framework. Most other work has focused
on characterizing the structure of problems with more general capacity functions,
and determining when optimal solutions exist [1,2,3,18,19,21,22]. None of these
problems are known to be polynomial time solvable. Among the earlier work,
Anderson, Nash, and Philpott [2] consider the problem of finding a maximum
flow in a network with zero transit times, and with capacities and node storage
that are much more general functions of time, and develop a duality theory for
continuous network flows.
We consider the generalization of the zero transit time, universally maximum
flow problem that allows both arc capacities and node storage capacities to be
piecewise-constant functions on [0, T ] with at most k breakpoints. As with the
general universally maximum flow problem, it is not clear that a universally
maximum flow exists in these circumstances. Ogier [16] proves it does, and provides
a polynomial time algorithm that finds a universally maximum flow with at
most kn breakpoints. These breakpoints can be computed with nk maximum
flow computations on a network with nk vertices and (m + n)k arcs. After the
breakpoints are computed, additional maximum flow computations on the same
network are used to calculate the flow between two breakpoints. Thus the to-
tal run time of Ogier’s algorithm is determined by the time to solve O(nk)
maximum flow problems on a network with nk vertices and (n + m)k arcs. As
Ogier [16] demonstrates, this problem also generalizes the universally quickest
transshipment problem.
The main contribution of this paper is to (a) recognize that these problems
can be solved by solving a parametric maximum flow problem on a suitably
defined graph, and (b) generalize the parametric maximum flow algorithm of Gallo,
Grigoriadis, and Tarjan [8] to fit the needs of this problem. The end result is
that all the computations described in [16] can be performed in the same asymp-
totic time as one preflow-push maximum flow computation on a network with
nk vertices and (n + m)k arcs. This improves the previous strongly polynomial
run time by a factor of O(nk).
obeys the node capacity constraint 0 ≤ pj(t) ≤ uj(t) for all j ∈ V − {s, d} and
all 0 ≤ t ≤ k. Denote the set of feasible dynamic flows by D. The value of flow x
at time τ, denoted vτ(x), is the total flow reaching the sink by time τ. That is,

vτ(x) := ∫_0^τ xN,d(t) dt.
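For the piecewise-constant flows considered in this paper, the integral defining vτ(x) reduces to a finite sum; a minimal sketch (the function name is ours):

```python
def flow_value(rates, tau):
    """v_tau(x): integrate a sink inflow rate that is constant on unit intervals.
    rates[i] is the rate x_{N,d}(t) on the interval (i, i+1]."""
    total = 0.0
    for i, r in enumerate(rates):
        if tau <= i:
            break
        total += r * (min(tau, i + 1) - i)  # full or partial unit interval
    return total
```

With rates [2, 1, 3] and τ = 2.5, the value is 2 + 1 + 1.5 = 4.5.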
[Figure 1: the initial network G with source s and sink d, and the expanded
networks G1, G1.5, and H1.5 over Θ = 0, 1, 2, 3; the arcs carry capacities
u = u(Θ), u = (1/2)u(Θ), and u = 0.]
wτ(C) = ∫_0^τ u(C(t),C̄(t))(t) dt + Σ_{θ=1}^{⌈τ⌉−1} Σ_{j∈C̄(θ+1)∩C(θ)} uj(θ).    (1)
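Under the additional assumption that cut sides and capacities are constant on unit intervals (as in the discretization used later in the paper), the dynamic cut capacity (1) can be evaluated directly; a sketch with hypothetical accessor functions of our own naming:

```python
import math

def dynamic_cut_weight(tau, cut_cap, side, node_cap):
    """Evaluate the capacity (1) of a dynamic cut, assuming all data are
    constant on each unit interval (theta-1, theta].
    cut_cap(theta): capacity of the arcs crossing the cut during (theta-1, theta]
    side(theta):    the source side C(theta) as a set of nodes
    node_cap(j, theta): storage capacity of node j at time theta."""
    T = math.ceil(tau)
    # the integral term: full unit intervals plus a partial last interval
    w = sum((min(theta, tau) - (theta - 1)) * cut_cap(theta) for theta in range(1, T + 1))
    # the storage term: nodes that move from the source side to the sink side
    w += sum(node_cap(j, theta)
             for theta in range(1, T)
             for j in side(theta) - side(theta + 1))
    return w
```

For a cut whose source side never changes, the storage term vanishes and wτ(C) is just the integral of the crossing-arc capacity.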
A dynamic cut can be interpreted as a set of arcs entering a cut in the time-
expanded network. This will be made more explicit following Theorem 1. This
theorem is comparable to strong duality in standard network flows. The proof
of even the weak duality case is a little more involved than the corresponding
static version and is omitted. A slightly weaker statement is proved by Anderson,
Nash, and Philpott [1], but for more general capacity functions.
Fig. 2. Top left: The time-expanded graph Gk for k = 3. Bottom left: Cuts Rτ
for τ = 2, 2.5, 3. Top right: A τ-maximum flow xτ for τ = 2.5.
and that such an optimal flow can be defined piecemeal by special τ -maximum
flows [16].
Let xτ be a τ-maximum flow that is constant on each interval (θ − 1, θ] for
θ ∈ {1, 2, . . . , ⌊τ⌋} and also constant on the interval (⌊τ⌋, τ]. In addition, the excess
function pτ of xτ satisfies pτj(τ) = 0 for all j ∈ V − {s, d}. That such a flow exists
is implied by Corollary 2: compute a maximum flow in Hτ. This flow saturates
the arcs entering Rτ, hence has the value of a τ-maximum flow. It does not use
any node storage arc leaving N(⌈τ⌉), hence, in the dynamic setting, completes
by time τ. (See also Figure 2.) Define x0 to be the dynamic flow such that, for
all t, f = x0(t) is a maximum flow for the static network N with capacities u(t),
sources Rt(⌈t⌉), and sink d. Theorem 2 below extends the static flow concept of
complementary slackness to UMFP.
For each node j of the dynamic network N, define

qj(t) := max{τ | τ ≥ t, j(⌈t⌉) ∈ Rτ} if this set is non-empty, and qj(t) := 0 otherwise. (2)

In words, qj(t) is the largest value of τ for which there is no path from j(⌈t⌉) to
d(⌈τ⌉) in the residual network of a τ-maximum flow. Thus if qi(t) > qj(t), then
there is a τ such that for all xσ, σ > τ, if i(⌈t⌉) has a residual path to d(⌈σ⌉),
then j(⌈t⌉) has a residual path to d(⌈σ⌉), and there is some such σ for which
i(⌈t⌉) does not have a residual path to d(⌈σ⌉) while j(⌈t⌉) does. Since Rτ is
left-continuous, qj(t) is well-defined.
time [8,9] (assuming the number of values of θ is not more than m log(n^2/m)).
This is because the bound on the run-time of the preflow-push algorithm depends
only on the number of times a node is relabeled, node labels never decrease in
either algorithm, and all node labels are ≤ 2n.
Using this parametric preflow algorithm, Gallo, Grigoriadis, and Tarjan [8]
describe an algorithm that finds all breakpoints of κ(θ), the minimum cut func-
tion of graph G, in the same asymptotic time. Like Ogier’s algorithm, this al-
gorithm relies on the concavity of κ. Once the breakpoints are found, the para-
metric preflow algorithm can be invoked again to compute the maximum flows
and minimum cuts corresponding to these breakpoints.
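The breakpoint search of [8] exploits the fact that the minimum cut function is the minimum of finitely many linear functions of θ, hence concave and piecewise linear. A generic sketch of the recursion (the oracle `line_at` is our assumption, standing in for a min-cut computation that returns the slope and intercept of a minimum cut's capacity at θ):

```python
def breakpoints(line_at, lo, hi, eps=1e-9):
    """Find all breakpoints of a concave piecewise-linear function kappa on (lo, hi).
    line_at(theta) returns (slope, intercept) of a line attaining kappa at theta."""
    a1, b1 = line_at(lo)
    a2, b2 = line_at(hi)
    if abs(a1 - a2) < eps:
        return []                          # same line throughout: no breakpoint inside
    theta = (b2 - b1) / (a1 - a2)          # intersection of the two endpoint lines
    a3, b3 = line_at(theta)
    if abs(a3 * theta + b3 - (a1 * theta + b1)) < eps:
        return [theta]                     # both endpoint lines are minimal here
    # otherwise a third line is strictly below: recurse on both halves
    return breakpoints(line_at, lo, theta, eps) + breakpoints(line_at, theta, hi, eps)
```

With κ(θ) = min(3θ, θ + 2, 4) the recursion returns the breakpoints 1 and 2. Each oracle call corresponds to one maximum flow computation; the parametric machinery of [8] batches all these calls into the asymptotic time of a single one.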
In this section, we discuss the main contribution of this paper which is a gen-
eralization of the parametric maximum flow algorithm and the breakpoint al-
gorithm of Gallo, Grigoriadis, and Tarjan [8] to solve the universally maximum
flow problem with piecewise constant capacities. Our generalization of [8] en-
ables us to reduce the time needed to compute the set of breakpoints of the
universally maximum dynamic flow W , and the τ -maximum flows xτ , τ ∈ W
that are necessary to compute the optimal flow x∗ as detailed in Section 1.2.
We require O(k^2 nm log(kn^2/m)) time to do this. Since x0 can be computed
in O(knm log(n^2/m)) time using k calls to Goldberg and Tarjan's push-relabel
maximum flow algorithm [9], this implies that the universally maximum dynamic
flow x∗ can also be computed in O(k^2 nm log(kn^2/m)) time, which improves the
algorithm of Ogier [16] by a factor of O(kn).
Our algorithm integrates the work of Gallo, et al. [8] into the framework
of the Ogier algorithm. In Step 1, we use the parametric preflow algorithm of
Gallo, et al. to compute the minimum cuts Rθ and θ-maximum flows xθ in Hθ
for θ = 1, . . . , k. In Step 2, we generalize the breakpoint algorithm of Gallo, et
al. to compute the minimum cuts Rτ and corresponding maximum flows xτ for
all τ ∈ (θ − 1, θ] ∩ W, for each θ = 1, . . . , k.
We consider a parametric flow problem based on the graphs Hτ . Instead of
considering the graphs Hτ for τ ∈ (0, k] as separate graphs, we consider one
graph H on the same vertex set but with parameterized capacities, so that the
capacities of arcs of H at time τ equal the capacities of Hτ . That is, H(τ ) = Hτ .
More precisely, using Nθ to denote the θth copy of network N in H, an arc in
Nθ has capacity 0 for 0 ≤ t ≤ θ − 1, capacity (t − θ + 1) ue(θ) for θ − 1 < t ≤ θ,
and capacity ue(θ) for θ < t ≤ k. An arc from Nθ to Nθ+1 has capacity 0 for
0 ≤ t ≤ θ and capacity uj(θ) for θ < t ≤ k.
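The parameterized capacity of an arc of Nθ is thus the following piecewise-linear function of t (a direct transcription; the function name is ours):

```python
def arc_capacity(theta, u_theta, t):
    """Capacity at parameter t of an arc in N_theta whose static capacity is u_theta.
    Zero until t = theta - 1, then growing linearly, then constant."""
    if t <= theta - 1:
        return 0.0
    if t <= theta:
        return (t - theta + 1) * u_theta
    return u_theta
```

An arc of N2 with ue(2) = 4 has capacity 0 at t = 0.5, capacity 2 at t = 1.5, and capacity 4 for t > 2; the storage arcs between consecutive copies use only the zero and constant pieces.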
For the correctness and speed of their algorithm, Gallo et al. require that
all arcs with parameterized capacities either leave the source or enter the sink,
and this does not hold for H. However, the Gallo et al. requirements as stated
in Section 1.3 are merely sufficient conditions. The following conditions are also
sufficient, but more general [8].
The second item is necessary for us to find the breakpoints W efficiently using a
modified version of the Goldberg and Tarjan maximum flow algorithm and holds
true here by Lemma 2. The first item is necessary to compute all corresponding
maximum flows and minimum cuts in the same asymptotic time as one — this
is a requirement of the parametric preflow algorithm as discussed in Section 1.3.
We establish below (in Step 2) how to satisfy the first condition.
The two steps of our algorithm are detailed below. Figure 4 briefly summa-
rizes the algorithm.
Step 1: Computing Rθ and xθ for θ ∈ {0, 1, . . . , k}. To compute the
θ-maximum flows xθ , θ ∈ {0, . . . , k}, we construct a parametric flow problem
on a modified Gk (= Hk ). Recall that si is the copy of the source in Ni , and
di is the corresponding sink. We introduce a super source sS with infinite ca-
pacity arcs (sS , si ) for each i ∈ {0, . . . , k}, a super sink dS with arcs (di , dS ) for
each i ∈ {0, . . . , k} with capacity function that is infinite when i ≤ θ and zero
otherwise, and call this parametric network Ĝk . (See Figure 3.) We then solve
the parametric flow problem in Ĝk to find the maximum flows corresponding
to each parameter θ ∈ {0, 1, . . . , k}. These maximum flows are the flows xθ for
θ ∈ {0, 1, . . . , k}. This is because, by Corollary 1, they correspond to cuts in
H(θ) ≡ Hθ of the same value; and, as a cut in Hθ , they keep this value even
when all sinks di that are not in the original sink side of the cut are moved to
the sink side: If di is not in the sink side of the cut for parameter θ in Ĝk , the
capacity of (di , dS ) for parameter θ must be finite, and is therefore zero. Thus
the capacity of all arcs entering di in Hθ is also zero, and di can be moved to
the sink side of the cut.
To solve this parametric flow problem, we reverse Ĝk by reversing the di-
rection of every arc in Ĝk , and apply the algorithm in [8] to this network. This
parametric flow problem is of the form that is solved in [8]: all arcs that vary
with θ are leaving the source, and the capacities are all increasing functions of θ.
Thus, xθ and Rθ for all θ ∈ {0, . . . , k} can be computed in the same asymptotic
time as one maximum flow computation in the graph Ĝk: O(k^2 nm log(kn^2/m))
time.
Step 2: Computing Rτ and xτ for all τ ∈ W . To find the elements
of W , and the corresponding τ -maximum flows within interval (θ − 1, θ], we
generalize the version of the parametric maximum flow algorithm [8] that finds
all breakpoints of the minimum cut function κ(θ). We start with Rθ−1 and Rθ .
The corresponding graphs are H(θ − 1) and H(θ). As τ increases from θ − 1 to
θ, the capacities of the arcs in H ∩ Nθ increase linearly from 0 to uE (θ). Because
the change in the capacity of the arcs is linear, the change in the minimum cut
function is piecewise linear. By Lemma 2, this minimum cut function is also
concave; thus it remains to show how to satisfy Condition 1 of [8].
[Figure 3: the parametric network Ĝk built from Gk: a super source sS with
infinite-capacity arcs into each copy of the source, and arcs (di, dS) into a super
sink dS with capacities ui(t) that are 0 for t ∈ [0, i) and ∞ for t ∈ [i, 3].]
Fig. 4. The two-step algorithm to compute the τ-maximum flows needed for
constructing a universally maximum flow.
Acknowledgments
I am grateful to Éva Tardos and Kevin Wayne for providing helpful comments
on earlier drafts of this paper.
References
1. E. J. Anderson and P. Nash. Linear Programming in Infinite-Dimensional Spaces.
John Wiley & Sons, 1987.
2. E. J. Anderson, P. Nash, and A. B. Philpott. A class of continuous network flow
problems. Mathematics of Operations Research, 7:501–14, 1982.
3. E. J. Anderson and A. B. Philpott. A continuous-time network simplex algorithm.
Networks, 19:395–425, 1989.
4. J. E. Aronson. A survey of dynamic network flows. Annals of Operations Research,
20:1–66, 1989.
5. R. E. Burkard, K. Dlaska, and B. Klinz. The quickest flow problem. ZOR Methods
and Models of Operations Research, 37(1):31–58, 1993.
6. L. Fleischer. Faster algorithms for the quickest transshipment problem with zero
transit times. In Proceedings of the Ninth Annual ACM/SIAM Symposium on
Discrete Algorithms, pages 147–156, 1998. Submitted to SIAM Journal on Optimization.
7. L. R. Ford and D. R. Fulkerson. Flows in Networks. Princeton University Press,
1962.
8. G. Gallo, M. D. Grigoriadis, and R. E. Tarjan. A fast parametric maximum flow
algorithm and applications. SIAM J. Comput., 18(1):30–55, 1989.
9. A. V. Goldberg and R. E. Tarjan. A new approach to the maximum flow problem.
Journal of the ACM, 35:921–940, 1988.
10. B. Hajek and R. G. Ogier. Optimal dynamic routing in communication networks
with continuous traffic. Networks, 14:457–487, 1984.
11. B. Hoppe. Efficient Dynamic Network Flow Algorithms. PhD thesis, Cornell Uni-
versity, June 1995. Department of Computer Science Technical Report TR95-1524.
12. B. Hoppe and É. Tardos. Polynomial time algorithms for some evacuation prob-
lems. In Proc. of 5th Annual ACM-SIAM Symp. on Discrete Algorithms, pages
433–441, 1994.
13. B. Hoppe and É. Tardos. The quickest transshipment problem. In Proc. of 6th
Annual ACM-SIAM Symp. on Discrete Algorithms, pages 512–521, 1995.
14. E. Minieka. Maximal, lexicographic, and dynamic network flows. Operations Re-
search, 21:517–527, 1973.
1 Introduction
A graph G = (V, E) is called 2-edge connected if for every pair of nodes (u, v)
there are at least two edge-disjoint paths between u and v. Given a graph
G = (V, E) and a weight function w which associates to each edge e a weight
w(e), the 2-edge connected subgraph problem (TECSP) consists of finding a 2-edge
connected subgraph H = (V, F) of G, spanning all the nodes of G and such that
Σ_{e∈F} w(e) is minimum. This problem arises in the design of reliable
transportation and communication networks [23], [24]. It is NP-hard in general. It has been
shown to be polynomial in series-parallel graphs [26] and Halin graphs [25].
Given a graph G = (V, E) and an edge subset F ⊆ E, the 0–1 vector x^F
of R^E such that x^F(e) = 1 if e ∈ F and x^F(e) = 0 if e ∈ E \ F is called the
incidence vector of F. The convex hull of the incidence vectors of the edge sets
of the 2-edge connected spanning subgraphs of G, denoted by TECP(G), is called
the 2-edge connected spanning subgraph polytope of G.
Let G = (V, E) be a graph. Given b : E → R and F a subset of E, b(F) will
denote Σ_{e∈F} b(e). For W ⊆ V we let W̄ = V \ W. If W ⊂ V is a node subset of
G, then the set of edges that have only one node in W is called a cut and denoted
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 166–182, 1999.
© Springer-Verlag Berlin Heidelberg 1999
Critical Extreme Points 167
by δ(W). (Note that δ(W) = δ(W̄).) If W = {v} for some node v ∈ V, then we
write δ(v) for δ({v}); δ(v) will be called a node cut. An edge cutset F ⊆ E of G
is a set of edges such that F = δ(S) for some non-empty set S ⊂ V. Given two
vectors w, x ∈ R^E, we let ⟨w, x⟩ = Σ_{e∈E} w(e)x(e).
If x^F is the incidence vector of the edge set F of a 2-edge connected spanning
subgraph of G, then x^F is a feasible solution of the polytope P(G) defined by

x(e) ≥ 0 ∀ e ∈ E, (1.1)
x(e) ≤ 1 ∀ e ∈ E, (1.2)
x(δ(S)) ≥ 2 ∀ S ⊂ V, S ≠ ∅. (1.3)

Conversely, any integer solution of P(G) is the incidence vector of the edge set
of a 2-edge connected subgraph of G. Inequalities (1.1) and (1.2) are called trivial
constraints and constraints (1.3) are called cut constraints.
The polytope P (G) is a relaxation of the TECP(G). It is also a relaxation of
the subtour polytope of the traveling salesman problem, the set of all the solu-
tions of the system given by inequalities (1.1)-(1.3) together with the constraints
x(δ(v)) = 2 for all v ∈ V. Thus minimizing ⟨w, x⟩ over P(G) may provide a
good lower bound for both the TECSP and the traveling salesman problem (see
[12], [21]).
Using network flows [8], [9], one can compute in polynomial time a minimum
cut in a weighted undirected graph. Hence the separation problem for inequalities
(1.3) (i.e., the problem of deciding whether a given solution ȳ ∈ R^E
satisfies inequalities (1.3) and, if not, of finding an inequality that is violated
by ȳ) can be solved in polynomial time. This implies, by the ellipsoid method
[13], that the TECSP can be solved in polynomial time on graphs G for which
TECP(G) = P(G). Mahjoub [20] called these graphs perfectly 2-edge connected
graphs (perfectly-TEC). Thus an interesting question would be to characterize
these graphs. In [19], Mahjoub showed that series-parallel graphs are perfectly-
TEC. In [20] he described sufficient conditions for a graph to be perfectly-TEC.
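To make the separation problem concrete, here is a brute-force sketch for small graphs (exponential enumeration standing in for the polynomial minimum-cut computation the text refers to; names are ours):

```python
from itertools import combinations

def separate_cut_constraints(nodes, y, tol=1e-9):
    """Find S with y(delta(S)) < 2, i.e. a violated inequality (1.3), or None.
    y maps frozenset({u, v}) -> fractional edge value."""
    nodes = list(nodes)
    for r in range(1, len(nodes)):
        for S in combinations(nodes, r):
            S = set(S)
            cut = sum(v for e, v in y.items() if len(e & S) == 1)
            if cut < 2 - tol:
                return S          # a violated cut constraint x(delta(S)) >= 2
    return None
```

On a triangle with edge values 1, 1, 0.5 the routine returns a violated node cut; with all values 1, every cut has value 2 and it returns None. A polynomial implementation would replace the enumeration by a global minimum cut computation.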
In [10], Fonlupt and Naddef studied the graphs for which the polyhedron
given by the inequalities (1.1) and (1.3) is the convex hull of the tours of G. (Here
a tour is a cycle going through each node at least once.) This is in connection with
the graphical traveling salesman problem. They gave a characterization of these
graphs in terms of excluded minors. (A minor of a graph G is a graph obtained
from G by deletions and contractions of edges.) A natural question that may arise
here is whether or not one can obtain a similar characterization for perfectly-
TEC graphs. The answer to this question is, unfortunately, in the negative. If we
add the constraints x(e) ≤ 1 for all e ∈ E, the approach developed by Fonlupt
and Naddef [10] would not be appropriate. In fact, consider a non-perfectly-TEC
graph G (for instance a complete graph on four nodes or more). Subdivide G
by inserting a new node of degree 2 on each edge and let G′ = (V′, E′) be the
resulting graph. Clearly, each edge e of G′ belongs to a 2-edge cutset. So x(e) = 1
for all e ∈ E′ is the unique solution of P(G′), and hence G′ is perfectly-TEC.
However, the graph G, which is a minor of G′, is not.
In this paper we study the fractional extreme points of the polytope P (G).
We introduce an ordering on the extreme points of P (G) and give necessary
168 Jean Fonlupt and Ali Ridha Mahjoub
We consider finite, undirected and loopless 2-edge connected graphs, which may
have multiple edges. We denote a graph by G = (V, E) where V is the node set
and E is the edge set of G. If e is an edge with endnodes u and v, then we write
e = uv. If u is the endnode of an edge e, we say that u (resp. e) is incident to e
(resp. u). If W ⊆ V , we denote by G(W ) the subgraph of G induced by W and
by E(W ) the set of edges having both endnodes in W .
Given a solution x ∈ R^E of P(G), we denote by E0(x), E1(x), Ef(x) the sets
of edges e such that x(e) = 0, x(e) = 1, 0 < x(e) < 1, respectively. A cut δ(S)
is said to be tight for x if x(δ(S)) = 2. A node v ∈ V is called tight for x if the
cut δ(v) is tight for x. We will denote by τ(x) the set of tight cuts for x. A cut
δ(W) will be called proper if |W| ≥ 2 and |W̄| ≥ 2. Given a polyhedron P, we
denote by dim(P) the dimension of P.
Let G = (V, E) be a graph. Two cuts δ(W1) and δ(W2) of G are said to be
crossing if W1 ∩ W2 ≠ ∅, W1 ⊄ W2, W2 ⊄ W1, and V \ (W1 ∪ W2) ≠ ∅. A
family of cuts {δ(W1), . . . , δ(Wk)} is said to be laminar if δ(W1), . . . , δ(Wk) are
pairwise non-crossing.
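These definitions translate directly into code; a small sketch (function names are ours):

```python
def crossing(W1, W2, V):
    # delta(W1) and delta(W2) cross iff the four sets
    # W1 & W2, W1 \ W2, W2 \ W1, and V \ (W1 | W2) are all non-empty
    W1, W2, V = set(W1), set(W2), set(V)
    return bool(W1 & W2) and bool(W1 - W2) and bool(W2 - W1) and bool(V - (W1 | W2))

def laminar(family, V):
    # a family of cuts is laminar iff its members are pairwise non-crossing
    return all(not crossing(family[i], family[j], V)
               for i in range(len(family)) for j in range(i + 1, len(family)))
```

In V = {1, . . . , 5}, the sets {1, 2} and {2, 3} cross, while {1, 2} and {1, 2, 3} do not (one contains the other).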
This lemma implies that any tight cut that crosses δ(W ) induces a constraint
which is redundant with respect to system (2.1). Note that system (2.1) may
contain redundant equalities.
If x is a vector of R^F, where F ⊆ E, and T is a subset of F, then xT
will denote the restriction of x to T. If S ⊂ V, we let Γ(S) = E(S) ∪ δ(S). If
W1, W2 ⊂ V, (W1, W2) will denote the set of edges between W1 and W2.
Throughout this paper, x̄ will denote a non-integer extreme point of P(G).
All the structures (polytopes, affine subspaces, . . .) that will be considered in the
sequel will refer to x̄. So in the rest of the paper we will write τ(W), τ(W̄)
instead of τ(W, x̄), τ(W̄, x̄).
Let δ(W) be a cut of G tight for x̄. Let L0 be the affine subspace of R^{Γ(W)}
given by the constraints

x(e) = 0 ∀ e ∈ E0(x̄) ∩ Γ(W),
x(e) = 1 ∀ e ∈ E1(x̄) ∩ Γ(W),

and let

P(W) = {x ∈ L0 | 0 ≤ x(e) ≤ 1 ∀ e ∈ Γ(W), x(δ(S)) ≥ 2 ∀ S ⊂ W, S ≠ ∅}.
In this section we introduce the concept of critical extreme points and give our
main result.
Suppose that x̄ is a non-integer extreme point of P(G). Let x̄′ be the solution
obtained by replacing some (but at least one) non-integer components of x̄ by 0
or 1 (and keeping all the other components of x̄ unchanged). Assume that x̄′ is
feasible for P(G). Then x̄′ can be written as a convex combination of extreme
points of P(G). If ȳ is such an extreme point, then we shall say that x̄ dominates
ȳ, and we shall write x̄ ≻ ȳ. Note that if x̄′ is itself an extreme point of P(G),
then ȳ = x̄′. Also note that Ef(ȳ) ⊂ Ef(x̄), and if x̄(e) = 0 or 1 for some edge
e ∈ E, then ȳ(e) = x̄(e). Moreover, if a cut δ(W) is tight for both x̄ and x̄′,
then δ(W) is also tight for ȳ. There may, however, exist cuts which are tight for
x̄ (resp. ȳ) but not tight for ȳ (resp. x̄).
It is clear that the relation ≻ defines a partial ordering on the extreme points
of P(G). The minimal elements of this ordering (i.e., the extreme points x for
which there is no extreme point y such that x ≻ y) correspond to the integer
extreme points of P(G). In what follows we define, in a recursive way, a rank
function on the set of extreme points of P(G).
The minimal extreme points of P(G) will be called extreme points of rank 0.
An extreme point x of P(G) will be said to be of rank k, for fixed k, if every
extreme point it dominates has rank at most k − 1 and it dominates at least one
extreme point of rank k − 1.
In [20], Mahjoub introduced the following operations:
θ1 : Delete an edge.
θ2 : Contract an edge both of whose endnodes have degree two.
θ3 : Contract a node subset W such that G(W ) is 2-edge connected.
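As an illustration (ours, not from the paper), the three reductions can be sketched on a multigraph stored as an edge list. All function names are ours, and the 2-edge-connectivity test is a deliberately naive brute-force one:

```python
from collections import defaultdict

def degree(edges, v):
    return sum((u == v) + (w == v) for u, w in edges)

def theta1(edges, e):
    """theta_1: delete one occurrence of the edge e."""
    out = list(edges)
    out.remove(e)
    return out

def contract(edges, keep, drop):
    """Merge node `drop` into `keep`, discarding the loops created."""
    ren = lambda v: keep if v == drop else v
    return [(ren(u), ren(w)) for u, w in edges if ren(u) != ren(w)]

def theta2(edges, e):
    """theta_2: contract an edge both of whose endnodes have degree two."""
    u, w = e
    assert degree(edges, u) == 2 and degree(edges, w) == 2
    return contract(edges, u, w)

def is_2ec(edges, nodes):
    """Brute force: connected, and still connected after removing any
    single edge (i.e. no bridge)."""
    if len(nodes) == 1:
        return True
    def connected(es):
        adj = defaultdict(list)
        for u, w in es:
            adj[u].append(w)
            adj[w].append(u)
        seen, stack = set(), [next(iter(nodes))]
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                stack.extend(adj[v])
        return seen >= set(nodes)
    return connected(edges) and all(
        connected(edges[:i] + edges[i + 1:]) for i in range(len(edges)))

def theta3(edges, W):
    """theta_3: contract a node subset W such that G(W) is 2-edge connected."""
    inner = [(u, w) for u, w in edges if u in W and w in W]
    assert is_2ec(inner, W)
    keep = next(iter(W))
    for v in set(W) - {keep}:
        edges = contract(edges, keep, v)
    return edges
```

For instance, applying θ2 to a 4-cycle yields a triangle, and applying θ3 to a triangle with one pendant edge attached leaves just that edge.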
A pair (G, x), where G = (V, E) is a graph and x ∈ ℝ^E, is called a basic pair if:
1. V = V1 ∪ V2 with V1 ∩ V2 = ∅,
   E = E1 ∪ E2 with E1 ∩ E2 = ∅,
   G1 = (V1, E1) is an odd cycle,
   G2 = (V1 ∪ V2, E2) is a forest whose set of pendant nodes is V1,
   and all the nodes of V2 have degree at least 3.
2. x(e) = 1/2 for e ∈ E1 and x(e) = 1 for e ∈ E2.
3. For any proper cut δ(W) of G, x(δ(W)) > 2.
Figure 1 shows some examples of basic pairs. Figure 1 (a) is a wheel; Figure
1 (b) shows an example where one component of the forest is an edge with two
pendant nodes in V1. In Figure 1 (c) the forest is not a star. If (G, x̄) is a basic
pair, then we will say that G is a basic graph and x̄ is a basic point.
Lemma 3.5. If (G, x̄) is a basic pair, then x̄ is an extreme point of P (G) .
Proof. Easy.
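As a numerical sanity check (ours, not part of the paper), Lemma 3.5 can be verified for the 5-wheel of Figure 1 (a): the point with value 1/2 on the rim and 1 on the spokes satisfies |E| = 10 linearly independent constraints of P(G) with equality, hence is an extreme point. numpy is assumed available for the rank computation.

```python
import numpy as np
from itertools import combinations

# 5-wheel: hub 0, rim nodes 1..5; rim edges get 1/2, spokes get 1.
rim = [(i, i % 5 + 1) for i in range(1, 6)]
spokes = [(0, i) for i in range(1, 6)]
edges = rim + spokes
col = {e: j for j, e in enumerate(edges)}
x = np.array([0.5] * 5 + [1.0] * 5)

def cut(W):
    """Column indices of the edges with exactly one endnode in W."""
    return [col[e] for e in edges if (e[0] in W) != (e[1] in W)]

rows = []
for e in spokes:                       # tight bound constraints x(e) = 1
    r = np.zeros(10)
    r[col[e]] = 1
    rows.append(r)
for k in range(1, 6):                  # tight cut constraints x(delta(W)) = 2
    for W in combinations(range(6), k):
        c = cut(set(W))
        if abs(x[c].sum() - 2) < 1e-9:
            r = np.zeros(10)
            r[c] = 1
            rows.append(r)

rank = np.linalg.matrix_rank(np.array(rows))
print(rank)   # 10 = |E|, so x is an extreme point of P(G)
```

The tight rows here are the five spoke bounds and the degree cuts at the rim nodes (together with their complements); eliminating the spoke variables leaves the edge system of an odd cycle, which is nonsingular.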
The graphs of Ω are not perfectly-TEC. In fact, let G̃ = (Ṽ, Ẽ) and x̃ ∈ ℝ^Ẽ form a basic
pair, and let G = (V, E) be a graph of Ω obtained from G̃ by inserting new nodes on
some edges of the forest of G̃. Let x ∈ ℝ^E be the solution of P(G) such that
x(e) = x̃(e) if e ∈ Ẽ and x(e) = 1 if e ∈ E \ Ẽ. As, by Lemma 3.5, x̃ is an
extreme point of P(G̃), x is a fractional extreme point of P(G), and thus G is
not perfectly-TEC.
A consequence of Theorem 3.6 is the following.
Corollary 3.7. A graph G is perfectly-TEC if and only if G is not reducible
to a graph of Ω by means of the operations θ1, θ2, θ3.
Proof. Assume that G = (V, E) reduces to a graph G′ = (V′, E′) ∈ Ω by
means of the operations θ1, θ2, θ3. By the remark above, G′ is not perfectly-TEC,
and thus, by Theorem 3.1, neither is G.
Conversely, if x̄ is a fractional extreme point of P(G), then there must exist
an extreme point ȳ of P(G) of rank 1. By Lemma 3.3 together with Theorem
3.6, G and ȳ can be reduced by operations θ′1, θ′2, θ′3 to a basic pair (G′, ȳ′). If
instead of applying θ′2 we apply θ2, we obtain a graph of Ω.
The proof of Theorem 3.6 will be given in Section 5. In what follows we are
going to establish some structural properties of the critical extreme points of
P (G). These properties will be useful in the sequel.
Let x̄ be a critical extreme point of P(G). We have the following lemmas;
for the omitted proofs, see [11].
Lemma 4.1.
i) x̄(e) > 0 for all e ∈ E.
ii) G does not contain nodes of degree 2.
iii) If W ⊆ V is a node set such that G(W) is 2-edge connected, then there exists
an edge f in E(W) with 0 < x̄(f) < 1.
iv) If δ(W) is a cut of G tight for x̄, then both G(W) and G(W̄) are connected.
Lemma 4.2. Every edge f ∈ Ef (x̄) belongs to at least two tight cuts of
system (2.1).
Critical Extreme Points 173
Lemma 4.3. Let δ(W ) be a cut tight for x̄ with W = {u, v}. Then
i) u and v are linked by exactly one edge, say f ,
ii) u and v are tight and x̄(f ) = 1,
iii) δ(W ) ⊆ Ef (x̄).
Remark 4.4. The converse of Lemma 4.3 also holds. That is, if u, v ∈ V
are two nodes satisfying i) and ii) of Lemma 4.3, then the cut δ(W ) is tight for
x̄, where W = {u, v}. This implies that the constraint x̄(δ(W )) = 2 is redundant
with respect to the equations x̄(δ(u)) = 2, x̄(δ(v)) = 2 and x̄(f ) = 1.
Lemma 4.5. Let δ(W) be a proper cut tight for x̄. If G(W) and G(W̄) are
2-edge connected, then for every two edges f, g ∈ δ(W), x̄(f) + x̄(g) ≤ 1.
Lemma 4.6. Let δ(W ) be a cut tight for x̄. Then
i) δ(W ) contains at least three edges,
ii) if δ(W) ∩ E1(x̄) ≠ ∅, then either δ(W) or δ(W̄) is a node cut.
Proof. i) If δ(W) is a 2-edge cutset, then we can show, in a similar way as a
result in [20], that either W or W̄ is reduced to a single node. But this implies
that G contains a node of degree 2, which contradicts Lemma 4.1 ii).
ii) Suppose that δ(W) contains an edge e0 ∈ E1(x̄). By Lemma 4.3 iii), |W| ≠
2 ≠ |W̄|. So let us assume that δ(W) is a proper cut. We claim that G(W) and
G(W̄) are both 2-edge connected. Indeed, since δ(W) is tight for x̄, by Lemma
4.1 iv), G(W) and G(W̄) are both connected. Now suppose, for instance, that
G(W) is not 2-edge connected. Then there exists a partition W1, W2 of W with
(W1, W2) = {f} for some f ∈ E. W.l.o.g. we may suppose that e0 ∈ (W1, W̄). We
have x̄(δ(W)) = x̄(δ(W1)) + x̄(δ(W2)) − 2x̄(f) = 2. As x̄(δ(W1)) ≥ 2, x̄(δ(W2)) ≥
2 and x̄(f) ≤ 1, it follows that x̄(f) = 1 and that δ(W1) and δ(W2) are both tight
for x̄. Hence δ(W1) = {e0, f}, contradicting i).
Consequently, both graphs G(W) and G(W̄) are 2-edge connected. Now let e1
be an edge of δ(W) \ {e0}. As x̄(e1) > 0, we have that x̄(e0) + x̄(e1) > 1. Since
δ(W) is proper, this contradicts Lemma 4.5.
Lemma 4.7. If δ(W) is a proper cut tight for x̄, then
i) E(W) ∩ Ef(x̄) ≠ ∅ and E(W̄) ∩ Ef(x̄) ≠ ∅,
ii) G(W) and G(W̄) are both 2-edge connected,
iii) x̄(e) < 1 for all e ∈ δ(W),
iv) |δ(W)| ≥ 4,
v) P(W) ∩ H(W), P(W) ∩ H+(W), P(W̄) ∩ H(W̄) and P(W̄) ∩ H+(W̄) are
integer polyhedra.
Proof. i) Assume for instance that E(W) ∩ Ef(x̄) = ∅. Hence G(W) cannot
contain an induced 2-edge connected subgraph, for otherwise G would be
reducible by θ′3 to a smaller graph, which contradicts the fact that x̄ is critical.
In consequence, by Lemma 4.1 iv), it follows that G(W) is a tree. Let
u and v be two pendant nodes of this tree. As x̄(δ(u)) ≥ 2, it follows that
x̄(δ(u) ∩ δ(W)) ≥ 1. Similarly we have x̄(δ(v) ∩ δ(W)) ≥ 1. Since δ(W) is tight
for x̄, this implies that these inequalities are satisfied with equality and that
δ(W) = (δ(u) ∩ δ(W)) ∪ (δ(v) ∩ δ(W)). Thus u and v are the only pendant
nodes of G(W) and, in consequence, G(W) is a path P. Moreover, as G does
174 Jean Fonlupt and Ali Ridha Mahjoub
not contain nodes of degree 2, P must consist of a single edge. Thus W = {u, v},
which contradicts the fact that δ(W ) is a proper cut.
ii) Suppose for instance that G(W) is not 2-edge connected. As, by Lemma
4.1 iv), G(W) is connected, there exists a partition W1, W2 of W and an edge f ∈
E(W) such that (W1, W2) = {f}. We have x̄(δ(W)) = x̄(δ(W1)) + x̄(δ(W2)) −
2x̄(f) = 2. As x̄(δ(W1)) ≥ 2, x̄(δ(W2)) ≥ 2 and x̄(f) ≤ 1, it follows that
x̄(f) = 1 and that δ(W1) and δ(W2) are both tight for x̄. By Lemma 4.6 ii) this
implies that |W1| = |W2| = 1. Hence |W| = 2, a contradiction.
iii) If x̄(e0 ) = 1 for some e0 ∈ δ(W ), as x̄(e) > 0 for all e ∈ δ(W ), by ii)
together with Lemma 4.5 it follows that x̄(e) = 0 for all e ∈ δ(W ) \ {e0 }. Thus
x̄(δ(W )) < 2, which is impossible.
iv) This is a consequence of ii), iii), Lemma 4.5 and the fact that δ(W ) is
tight for x̄.
v) Suppose, for instance, that P(W) ∩ H+(W) is not integer, and let y ∈ ℝ^{Γ(W)}
be a fractional extreme point of P(W) ∩ H+(W). Let x′ ∈ ℝ^E be the solution
such that

  x′(e) = y(e) if e ∈ Γ(W),
  x′(e) = 1    if e ∈ E(W̄).

Since by ii) G(W̄) is 2-edge connected, x′ is a solution of P(G). Moreover, as x′
is the unique solution of the system given by the constraints defining y together
with x(e) = 1 for all e ∈ E(W̄), x′ is an extreme point of P(G). Furthermore,
as E(W̄) ∩ Ef(x̄) ≠ ∅, we have that x̄ ≻ x′. As x′ is fractional, this contradicts
the fact that x̄ is critical.
Thus P(W) ∩ H+(W) is integer. Since P(W) ∩ H(W) is a face of P(W) ∩
H+(W), it is also integer. The proof for P(W̄) ∩ H(W̄) and P(W̄) ∩ H+(W̄) is
similar.
  ⟨ai, x⟩ > bi,
  ⟨aj, x⟩ = bj.

In what follows we will denote by Lq(W) the projection of L(W) onto ℝ^{E′}.
Claim 4. i) Lp(W) is the projection of Lq(W) onto ℝ^{δ(W)}.
ii) A solution x ∈ ℝ^{E′} belongs to Lq(W) if and only if x_{δ(Wi)} ∈ Lp(Wi) ∩ H(Wi)
for i = 1, . . . , r.
Proof. See [11].
By Claim 4, Lq(W) is given by a system of the form

  Lq(W) = {x ∈ ℝ^{E′} | ⟨Ai, x⟩ = bi, i ∈ I},

obtained from the systems describing Lp(Wi) ∩ H(Wi) for i = 1, . . . , r. Let
⟨Ai0, x⟩ = bi0 be an equation with i0 ∈ I, and suppose that ⟨Ai0, x⟩ = bi0 is
relaxable. Consider the affine subspace obtained from Lq(W) by relaxing this
equation.
We say that the relaxation of ⟨Ai0, x⟩ = bi0 is valid if at least one point of
that subspace satisfies the constraint x(δ(W)) ≥ 2.
Remark 5.2. If Lq(W) ⊄ H(W), then y can be chosen so that y(δ(W)) = 2,
and thus the relaxation of ⟨Ai0, x⟩ = bi0 is valid.
Now suppose that Lp(W) ∩ H(W) is a plane. So w.l.o.g. we may suppose that
Lp(W) ∩ H(W) = Pl1, and therefore a description of Lp(W) ∩ H(W) is given by
system (5.3). The cut δ(W ) is said to be good for W if dim (Lp (W ) ∩ H(W )) = 2
and if in the description given by (5.3), at least one of the equations has a valid
relaxation.
Now we turn to the crucial point in the proof.
Claim 5. All the proper cuts tight for x̄ are good.
Proof. Suppose for instance that δ(W) is not good for W. We may suppose
that |W | is minimum, that is, all the proper tight cuts δ(Z) with Z ⊂ W are
good for Z. Consequently, by Claim 3, these cuts form a laminar family. Let
δ(W1 ), . . . , δ(Wr ) be the maximal tight cuts of W . Let E 0 and Lq (W ) be as
defined in Claim 4. In what follows we are going to give a description of the affine
subspace Lq (W ). For this, let us first note that, by Claim 1, either |δ(Wi )| = 3
or |δ(Wi )| = 4, for i = 1, . . . , r.
If |Wi | = 1, and δ(Wi ) = {e1 , e2 , e3 , e4 }, (resp. δ(Wi ) = {e1 , e2 , e3 } with
e3 ∈ E1 (x̄)), then δ(Wi ) produces in Lq (W ) the constraint:
x(e1 ) + x(e2 ) + x(e3 ) + x(e4 ) = 2, (5.4)
(resp. x(e1 ) + x(e2 ) = 1). (5.5)
Note here that Lp(Wi) ∩ H(Wi) is given by equation (5.4) (resp. (5.5) together
with x(e3) = 1), and thus (5.4) (resp. (5.5)) is relaxable.
If Wi = {u, v} for some nodes u, v ∈ V , by Lemma 4.3 i) we have that
uv ∈ E, x̄(uv) = 1 and δ(u) and δ(v) are both tight for x̄. This implies that
x̄(δ(Wi )) = 2 is redundant with respect to x̄(δ(u)) = 2, x̄(δ(v)) = 2 and x̄(uv) =
1. And thus δ(Wi ) produces two equations of type (5.5).
If δ(Wi ) is a proper cut, then by our minimality assumption, δ(Wi ) is a
good cut, and by Claim 1 ii), |δ(Wi )| = 4. Moreover if, for instance, δ(Wi ) =
{e1 , e2 , e3 , e4 }, we may suppose, w.l.o.g. that the plane Lp (Wi ) ∩ H(Wi ) is given
by system (5.3), and that at least one of the constraints of that system has a
valid relaxation.
Let Ax = b be the system given by the constraints of type (5.3), (5.4), (5.5),
and let k be the number of constraints of this system. Let E∗ = E′ ∩ Ef(x̄), and
let L∗q(W) be the projection of L(W) onto ℝ^{E∗}. Note that the projection of L∗q(W)
onto ℝ^{δ(W)} is Lp(W). Also note that the maps L(W) −→ Lq(W) −→ L∗q(W) −→
Lp(W) are bijections. By Claim 4 it follows that

  L∗q(W) = {x ∈ ℝ^{E∗} | Ax = b}.
Let E1 be a connected component of the column set of A, and let I1 be the set of
indices i ∈ {1, . . . , k} such that some x(f) with f ∈ E1 appears in the equation
⟨Ai, x⟩ = bi. (The submatrix A^{E1}_{I1} is a block of A.) A point y ∈ ℝ^{E∗} is a
solution of Ax = b if and only if y_{E1} is a solution of the system A^{E1}_{I1} x = b_{I1} and
y_{E∗\E1} is a solution of A^{E2}_{I2} x = b_{I2}, where E2 = E∗ \ E1 and I2 = {1, . . . , k} \ I1.
A block A^{E1}_{I1} is called active if the projection of {x ∈ ℝ^{E∗} | ⟨Ai, x⟩ = bi, i ∈ I1}
onto ℝ^{δ(W)} is contained in a hyperplane of the form

  H = {x ∈ ℝ^{δ(W)} | Σ_{i=1,...,4} c(ei)x(ei) = β}.

In other words, a block A^{E1}_{I1} is active if an equation of the form Σ_{i=1,...,4} c(ei)x(ei) =
β is redundant with respect to that block. In that case we will say that the
hyperplane H is produced by A^{E1}_{I1}.
Since the equation Σ_{i=1,...,4} c(ei)x(ei) = β can be obtained as a linear combination
of the equations ⟨Ai, x⟩ = bi, i ∈ I1, there must exist a vector u = (u1, . . . , u_{|I1|})
such that Σ_{i=1,...,4} c(ei)x(ei) − β = Σ_{i∈I1} ui(⟨Ai, x⟩ − bi). As u ≠ 0, we may
suppose, w.l.o.g., that u_{i0} = 1 for some i0 ∈ I1. As Σ_{i∈I1} ui Ai(f) = 0 for every
f ∈ E1 \ δ(W), for every row i ∈ I1 one must have either ui = +1 or ui = −1,
in such a way that if an edge f of E1 \ δ(W) appears in two rows i and j, then
ui + uj = 0. Moreover, starting from u_{i0} = +1, the coefficients ui, i ∈ I1 \ {i0},
are determined in a unique manner. As c(ei) ≠ 0 for at least one edge ei ∈ δ(W),
this implies that the rows i ∈ I1 are linearly independent and that at most one
hyperplane may be produced by an active block.
Let t = |E1 ∩ δ(W)| and L1 = {x ∈ ℝ^{E1∪δ(W)} | ⟨Ai, x⟩ = bi, i ∈ I1}.
As the map Lq(W) −→ Lp(W) is a bijection, so is the projection of
L1 onto ℝ^{δ(W)}. Since the projection of L1 onto ℝ^{δ(W)} is a hyperplane,
it follows that dim(L1) = 3. Also, as A^{E1}_{I1} x = b_{I1} is a non-redundant system,
it follows that

  |I1| = |E1 ∪ δ(W)| − dim(L1) = |E1| + (4 − t) − 3 = |E1| − t + 1. (5.6)

Let l be the number of constraints i ∈ I1 of type (5.4). (Note that for this type
of constraint we have bi = 2, and bi = 1 otherwise.) We then have

  Σ_{i∈I1} ⟨Ai, x̄⟩ = |I1| + l. (5.7)
On the other hand, as x̄(e) = 1/2 for all e ∈ Ef(x̄), and every column corresponding
to an edge of E1 \ δ(W) (resp. E1 ∩ δ(W)) contains exactly two 1's (resp.
one 1), we have

  Σ_{i∈I1} ⟨Ai, x̄⟩ = (|E1| − t) + t/2 = |E1| − t/2. (5.8)
Combining (5.6)-(5.8) we get l = t/2 − 1. Thus t is even, and therefore either t = 2
(and l = 0) or t = 4 (and l = 1). As Lp(W) is the projection of Lq(W) onto
ℝ^{δ(W)}, each hyperplane in the description of Lp(W) is produced by one active
block. We distinguish three cases:
Case 1) There is only one active block A^{E1}_{I1}, with |E1 ∩ δ(W)| = 2, and thus
dim(Lp(W)) = 3.
Case 2) There is only one active block A^{E1}_{I1}, with |E1 ∩ δ(W)| = 4. We claim
that in this case E1 = E∗ and I1 = {1, . . . , k}. In fact, if there is a further
block A^{E2}_{I2}, then as E1 ∩ δ(W) ≠ ∅ ≠ E2 ∩ δ(W) there must exist an edge
g ∈ E1 ∩ E2 ∩ δ(W), which is impossible. Note that here there is a constraint of
type (5.4) (l = 1) and, as in Case 1), dim(Lp(W)) = 3.
Case 3) There are two active blocks A^{E1}_{I1}, A^{E2}_{I2}, with I1 ∪ I2 = {1, . . . , k}
and E1 ∪ E2 = E∗, and there is a point ȳ with

  ⟨A1, ȳ⟩ > 1,
  ⟨Ai, ȳ⟩ = bi, i = 2, . . . , k,
  Σ_{i=1,...,4} ȳ(ei) = 2.
Note that bi = 1 for all i ∈ I1 (since l = 0). Let I1+ (resp. I1−) be the set of
rows for which the coefficient ui is equal to +1 (resp. −1). We have Σ_{i∈I1} ui =
|I1+| − |I1−| = 1. We can define in a similar way I2+ and I2− for the second block
A^{E2}_{I2}. Thus |I1+| + |I2+| = 2 + |I1−| + |I2−|. Note that all the constraints of Ax = b
are of type (5.3) and (5.5), and thus the number of relaxable rows is greater than
or equal to the number of rows which are not relaxable. Thus there exists an
equation ⟨Ai0, x⟩ = bi0, i0 ∈ I1+ ∪ I2+, which is relaxable. W.l.o.g. we may
assume that i0 ∈ I2+ and that ⟨Ai0, x⟩ = bi0 is the constraint x(e3) + x(e4) = 1. So
there is a solution y′ ∈ ℝ^{E∗} such that

  ⟨Ai0, y′⟩ > bi0,
  ⟨Ai, y′⟩ = bi, i ∈ {1, . . . , k} \ {i0}.
Hence y′(e1) + y′(e2) = 1 and y′(e3) + y′(e4) > 1. Moreover, y′ is the projection of a
point of P(W) ∩ L′(W), where L′(W) is an affine subspace containing L(W). Since
y′(δ(W)) ≥ 2, this implies that δ(W) is a good cut. But this is a contradiction,
and our claim is proved.
Now, as δ(S) is a proper cut tight for x̄, by Claim 5, δ(S) is good for S and
for S̄. Thus dim(Lp (S)∩H(S)) = 2 and dim(Lp (S̄)∩H(S)) = 2. In consequence,
the affine space Lp (S) ∩ Lp (S̄) ∩ H(S) contains one of the three lines L1 , L2 , L3 .
But this contradicts Lemma 2.2 i), which finishes the proof of the proposition.
Since x̄ is critical, x̄(e) > 0 for all e ∈ E. Let Vf (x̄) be the subset of nodes
incident to at least one edge of Ef (x̄). From Lemma 4.2 together with Proposition
5.1, it follows that every node of Vf (x̄) is tight for x̄. Let Gf = (Vf (x̄), Ef (x̄)).
We claim that Gf does not contain pendant nodes. In fact, assume the contrary
References
1. Baı̈ou, M., Mahjoub, A.R.: Steiner 2-edge connected subgraphs polytopes on series
parallel graphs. SIAM Journal on Discrete Mathematics 10 (1997) 505-514
2. Barahona, F., Mahjoub, A.R.: On two-connected subgraph polytopes. Discrete
Mathematics 147 (1995) 19-34
3. Chopra, S.: Polyhedra of the equivalent subgraph problem and some edge con-
nectivity problems. SIAM Journal on Discrete Mathematics 5 (1992) 321-337
4. Chopra, S.: The k-edge connected spanning subgraph polyhedron. SIAM Journal
on Discrete Mathematics 7 (1994) 245-259
5. Cornuéjols, G., Fonlupt, J., Naddef, D.: The traveling salesman problem on a
graph and some related integer polyhedra. Mathematical Programming 33 (1985)
1-27
6. Coullard, R., Rais, A., Rardin, R.L., Wagner, D.K.: The 2-connected Steiner
subgraph polytope for series-parallel graphs. Report No. CC-91-23, School of In-
dustrial Engineering, Purdue University (1991)
7. Coullard, R., Rais, A., Rardin, R.L., Wagner, D.K.: The dominant of the 2-
connected Steiner subgraph polytope for W4 -free graphs. Discrete Applied Math-
ematics 66 (1996) 33-43
8. Dinits, E.A.: Algorithm for solution of a problem of maximum flow in a network
with power estimation. Soviet Math. Dokl. 11 (1970) 1277-1280
9. Edmonds, J., Karp, R.M.: Theoretical improvements in algorithmic efficiency for
network flow problems. Journal of the ACM 19 (1972) 248-264
10. Fonlupt, J., Naddef, D.: The traveling salesman problem in graphs with some
excluded minors. Mathematical Programming 53 (1992) 147-172
11. Fonlupt, J., Mahjoub, A.R.: Critical extreme points of the 2-edge connected span-
ning subgraph polytope. Preprint (1998)
12. Goemans, M.X., Bertsimas, D.J.: Survivable networks, linear programming relaxations and
the parsimonious property. Mathematical Programming 60 (1993) 145-166
13. Grötschel, M., Lovász, L., Schrijver, A.: The ellipsoid method and its consequences
in combinatorial optimization. Combinatorica 1 (1981) 169-197
14. Grötschel, M., Monma, C.: Integer polyhedra arising from certain network design
problems with connectivity constraints. SIAM Journal on Discrete Mathematics 3
(1990) 502-523
15. Grötschel, M., Monma, C., Stoer, M.: Facets for polyhedra arising in the design
of communication networks with low-connectivity constraints. SIAM Journal on
Optimization 2 (1992) 474-504
16. Grötschel, M., Monma, C., Stoer, M.: Polyhedral approaches to network surviv-
ability. In: Roberts, F., Hwang, F., Monma, C. (eds.): Reliability of Computer and
Communication Networks, Vol. 5, Series in Discrete Mathematics and Computer
Science AMS/ACM (1991) 121-141
17. Grötschel, M., Monma, C., Stoer, M.: Computational results with a cutting
plane algorithm for designing communication networks with low-connectivity con-
straints. Operations Research 40/2 (1992) 309-330
18. Grötschel, M., Monma, C., Stoer, M.: Polyhedral and computational investigations
for designing communication networks with high survivability requirements. Oper-
ations Research 43/6 (1995) 1012-1024
19. Mahjoub, A.R.: Two-edge connected spanning subgraphs and polyhedra. Mathe-
matical Programming 64 (1994) 199-208
20. Mahjoub, A.R.: On perfectly 2-edge connected graphs. Discrete Mathematics 170
(1997) 153-172
21. Monma, C., Munson, B.S., Pulleyblank, W.R.: Minimum-weight two connected
spanning networks. Mathematical Programming 46 (1990) 153-171
22. Rockafellar, R.T.: Convex Analysis. Princeton University Press (1970)
23. Steiglitz, K., Weiner, P., Kleitman, D.J.: The design of minimum-cost survivable
networks. IEEE Transactions on Circuit Theory 16 (1969) 455-460
24. Stoer, M.: Design of Survivable Networks. Lecture Notes in Mathematics Vol. 1531,
Springer, Berlin (1992)
25. Winter, P.: Generalized Steiner Problem in Halin Graphs. Proceedings of the 12th
International Symposium on Mathematical Programming, MIT (1985)
26. Winter, P.: Generalized Steiner Problem in series-parallel networks. Journal of
Algorithms 7 (1986) 549-566
An Orientation Theorem with Parity Conditions
1 Introduction
Theorem 1. [9] A connected graph G = (V, E) has a spanning tree F for which
each connected component of G − E(F ) has an even number of edges if and only
if
|A| ≥ c(G − A) + b(G − A) − 1 (1)
holds for every A ⊆ E, where c(G − A) denotes the number of connected com-
ponents of G − A and b(G − A) denotes the number of those components D of
G − A for which |V(D)| + |E(D)| − 1 is odd. ⊓⊔
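On small graphs, condition (1) can be checked directly by brute force over all edge subsets A. The sketch below is our illustration (exponential in |E|; all names are ours):

```python
from itertools import combinations

def components(nodes, edges):
    """Connected components of (nodes, edges), as (node set, edge count) pairs."""
    left = set(nodes)
    comps = []
    while left:
        comp, stack = set(), [left.pop()]
        while stack:
            v = stack.pop()
            comp.add(v)
            for u, w in edges:
                if u == v and w not in comp:
                    stack.append(w)
                if w == v and u not in comp:
                    stack.append(u)
        left -= comp
        m = sum(u in comp and w in comp for u, w in edges)
        comps.append((comp, m))
    return comps

def nebesky_condition(nodes, edges):
    """|A| >= c(G - A) + b(G - A) - 1 for every subset A of the edges."""
    for k in range(len(edges) + 1):
        for A in combinations(range(len(edges)), k):
            rest = [e for i, e in enumerate(edges) if i not in A]
            comps = components(nodes, rest)
            c = len(comps)
            # b counts components D with |V(D)| + |E(D)| - 1 odd
            b = sum((len(vs) + m - 1) % 2 for vs, m in comps)
            if k < c + b - 1:
                return False
    return True
```

For instance, a triangle fails already at A = ∅ (there |V| + |E| − 1 = 5 is odd), while K4 minus an edge satisfies the condition and indeed has the required spanning tree.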
* Supported by the Hungarian National Foundation for Scientific Research, Grant
OTKA T029772.
** Supported in part by the Danish Natural Science Research Council, grant no. 28808.
† Basic Research in Computer Science, Centre of the Danish National Research Foundation.
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 183–190, 1999.
© Springer-Verlag Berlin Heidelberg 1999
184 András Frank, Tibor Jordán, and Zoltán Szigeti
Proof (of Theorem 5). To see the necessity of condition (5), consider an orien-
tation of G with the required properties and some partition P = {V1 , ..., Vt } of
V . The following fact is easy to observe.
Proof. We handle the two cases simultaneously. Suppose that there exists a
partition P′ of t′ elements in G[Vj] violating (5) (with respect to T ∩ Vj in case
(a), or with respect to (T ∩ Vj) ⊕ v, for some v ∈ Vj, in case (b)). By Proposition 3
this implies k(t′ − 1) + s′ ≥ e(P′) + 2, where s′ denotes the number of odd elements
of P′. Consider the partition P″ = (P − Vj) ∪ P′, consisting of t″ elements of
which s″ are odd. Clearly, e(P″) = e(P) + e(P′) and t″ = t + t′ − 1. Furthermore,
s″ ≥ s + s′ − 2, since the parity of at most two elements may be changed (these
are Vj – only in case (b) – and the element of P′ which contains the vertex v –
only in case (b) again). Since (5) holds for P″ by the assumption of the theorem,
we have k(t″ − 1) + s″ ≤ e(P″) = e(P) + e(P′) ≤ k(t − 1) + s + k(t′ − 1) + s′ − 2 =
k((t + t′ − 1) − 1) + s + s′ − 2 ≤ k(t″ − 1) + s″. Thus P″ is a tight partition with
t″ > t, which contradicts the choice of P. ⊓⊔
Proof. To prove the lemma we have to verify that the two conditions (2) and
(4) of Theorem 4 are satisfied. First we can see that g(V (H)) = g(A) + g(B) =
s(k + 1) + (t − s)k − k = k(t − 1) + s = |E(H)| by the definition of g and by (6).
Thus (2) is satisfied.
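The chain of equalities g(V(H)) = s(k + 1) + (t − s)k − k = k(t − 1) + s is elementary algebra; a mechanical check (ours) over a grid of small non-negative integers:

```python
# Verify s*(k+1) + (t-s)*k - k == k*(t-1) + s for small integer values.
for s in range(6):
    for t in range(6):
        for k in range(6):
            assert s * (k + 1) + (t - s) * k - k == k * (t - 1) + s
print("identity verified on the sample grid")
```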
To verify (4), let us choose an arbitrary non-empty subset X of V(H). Let
us define the partition P∗ of V(G) by P∗ := {Vj : vj ∈ V(H) − X} ∪ {∪_{vj∈X} Vj}.
Then P∗ has t∗ = t − |X| + 1 elements and the number of its odd elements s∗ is
at least s − |X ∩ A|. Applying (5) for P∗, it follows that k(t∗ − 1) + s∗ ≤ e(P∗).
3 Corollaries
As we reformulated Theorem 1 in terms of odd orientations and spanning ar-
borescences, we can similarly reformulate Theorem 5 in terms of even compo-
nents and spanning trees.
Proof. As we observed, G has an orientation for which the in-degree of every vertex
is even if and only if each connected component of G contains an even number of
edges. Thus the desired spanning trees exist in G if and only if G has a T ⊕R-odd
4 Remarks
References
1. I. Anderson, Perfect matchings of a graph, J. Combin. Theory Ser. B, 10 (1971),
183-186.
2. O. Chevalier, F. Jaeger, C. Payan and N.H. Xuong, Odd rooted orientations and
upper-embeddable graphs, Annals of Discrete Mathematics 17, (1983) 177-181.
3. J. Edmonds, Edge-disjoint branchings, in: R. Rustin (Ed.), Combinatorial Algo-
rithms, Academic Press, (1973) 91-96.
4. A. Frank, Orientations of graphs and submodular flows, Congr. Numer. 113 (1996),
111-142.
5. A. Frank, Z. Király, Parity constrained k-edge-connected orientations, Proc.
Seventh Conference on Integer Programming and Combinatorial Optimization
(IPCO), Graz, 1999. LNCS, Springer, this issue.
6. M.L. Furst, J.L. Gross, and L.A. McGeoch, Finding a maximum genus graph
imbedding, J. of the ACM, Vol. 35, No. 3, July 1988, 523-534.
7. M. Jungerman, A characterization of upper embeddable graphs, Trans. Amer.
Math. Soc. 241 (1978), 401-406.
8. C. St. J. A. Nash-Williams, Edge-disjoint spanning trees of finite graphs, J. London
Math. Soc. 36 (1961), 445-450.
9. L. Nebeský, A new characterization of the maximum genus of a graph, Czechoslo-
vak Mathematical Journal, 31 (106) 1981, 604-613.
10. L. Lovász, Selecting independent lines from a family of lines in a space, Acta Sci.
Univ. Szeged 42, 1980, 121-131.
11. N.H. Xuong, How to determine the maximum genus of a graph, J. Combin. Theory
Ser. B 26 (1979), 217-225.
Parity Constrained k-Edge-Connected Orientations
1 Introduction
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 191–201, 1999.
© Springer-Verlag Berlin Heidelberg 1999
192 András Frank and Zoltán Király
In some cases the two areas overlap (for example, when we are interested
in finding paths or subgraphs of given properties and the parity comes in the
characterization). For example, Seymour’s theorem [S81] on minimum T -joins
implies a result on the edge-disjoint paths problem in planar graphs. In [FST84]
some informal analogy was pointed out between results on parity and on connec-
tivity, but in order to understand better the relationship of these two big aspects
of combinatorial optimization, it is desirable to explore further problems where
both parity and connectivity requirements are imposed. For example, Nebeský
provided a characterization of graphs having an orientation in which every node
is reachable from a given node by a directed path and the in-degree of every
node is odd. Recently, [FJS98] extended this result to the case where, besides the
parity constraints, the existence of k edge-disjoint paths was required from the
specific node to every other one.
The goal of the present paper is to provide a new contribution to this broader
picture. The main result is about orientations of undirected graphs simultane-
ously satisfying connectivity and parity requirements. The following concepts of
connectivity will be used.
By the definition, D is l⁻-ec if and only if ϱ(X) ≥ pl(X) holds for every X ⊆ V.
By Menger's theorem the l⁻-edge-connectivity of D is equivalent to requiring
that D has l edge-disjoint paths from s to every node and l − 1 edge-disjoint
paths from every node to s.
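This Menger-type characterization yields a direct computational test for l⁻-edge-connectivity via unit-capacity maximum flow. The sketch below is ours (all names are ours) and uses a plain Edmonds-Karp augmenting-path loop:

```python
from collections import deque, defaultdict

def max_arc_disjoint_paths(arcs, s, t):
    """Unit-capacity Edmonds-Karp: the maximum number of arc-disjoint
    s-t paths (= the max-flow value, by Menger's theorem)."""
    cap = defaultdict(int)
    nodes = set()
    for u, v in arcs:
        cap[(u, v)] += 1
        nodes |= {u, v}
    flow = 0
    while True:
        pred = {s: None}
        q = deque([s])
        while q and t not in pred:     # BFS in the residual digraph
            u = q.popleft()
            for v in nodes:
                if v not in pred and cap[(u, v)] > 0:
                    pred[v] = u
                    q.append(v)
        if t not in pred:
            return flow
        v = t                          # augment along the path found
        while pred[v] is not None:
            u = pred[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

def is_l_minus_ec(arcs, s, l):
    """l arc-disjoint paths from s to every node, l - 1 from every node to s."""
    nodes = {u for a in arcs for u in a}
    return all(max_arc_disjoint_paths(arcs, s, v) >= l and
               max_arc_disjoint_paths(arcs, v, s) >= l - 1
               for v in nodes - {s})
```

For instance, a directed triangle is 1⁻-ec but not 2⁻-ec, while adding all reverse arcs makes it 2⁻-ec.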
eG (F ) ≥ lt − 1 (1.2a)
For two sets X and Y , X − Y denotes the set of elements of X which are
not in Y , and X ⊕ Y := (X − Y ) ∪ (Y − X) denotes the symmetric difference.
A one-element set is called singleton. We often will not distinguish between a
singleton and its element. In particular, the in-degree of a singleton {x} will be
denoted by %(x) rather than %({x}). For a set X and an element r, we denote
X ∪ {r} by X + r.
For a directed or undirected graph G, let iG(X) = i(X) denote the number
of edges having both end-nodes in X. Let d(X, Y) (respectively, d̄(X, Y)) denote
194 András Frank and Zoltán Király
  ϱ(X − Y) + ϱ(Y − X) = ϱ(X) + ϱ(Y) − d̄(X, Y) − [ϱ(X ∩ Y) − δ(X ∩ Y)]. (1.3b)
Let f be an edge and r a node of G. Then G−f and G−r denote, respectively,
the (di-)graphs arising from G by deleting edge f or node r.
We do not know any simple proof of this result even in the special case of
k = 1. W. Mader [M82] proved a structural characterization of l-ec digraphs.
We are going to show an analogous result for l− -ec digraphs which will be used
in the proof of Theorem 2.1 but may perhaps be interesting for its own sake, as
well.
and

  eG(F) ≥ Σ_i p(V − Vi) (2.2b)
3 l⁻-Edge-Connected Digraphs
Let D = (V, A) be a digraph which is l⁻-ec with respect to root-node s, that is,
ϱ(X) ≥ pl(X) for every X ⊂ V, where pl is defined in (1.1). We call a set X tight
if ϱ(X) = pl(X). A node r of D, and the set {r} as well, will be called special if
ϱ(r) = l = δ(r) + 1. (Since δ(s) ≥ l, s is not special.)
Lemma 3.1. Suppose that a digraph D = (V, A) with |V| ≥ 2 is l⁻-ec, where
l ≥ 2. Then there is an edge f = ur of D which does not enter any non-special
tight set.
that every l⁻-ec digraph having fewer edges than D is constructible,
in the sense that it can be constructed as described in the theorem.
If D has an edge f so that D′ = D − f is l⁻-ec, then D′ is constructible, and
then we obtain D from D′ by adding back f, that is, by operation (j). Therefore
we may assume that the deletion of any edge destroys l⁻-edge-connectivity.
By Lemma 3.1 there is an edge f = ur of D so that r is special and so
that ϱ′(X) ≥ pl(X) for every subset X ⊆ V distinct from {r}, where ϱ′ denotes
the in-degree function of the digraph D′ := D − f. Since r is special, we have
ϱ′(r) = l − 1 = δ′(r), where δ′ is the out-degree function of D′.
Let D1 = (U, A1) be the digraph arising from D by deleting r (where U :=
V − r), and let ϱ1 denote the in-degree function of D1. Let mi(u) (respectively,
mo(u)) denote the number of parallel edges in D′ from r to u (from u to r). From
ϱ′(r) = δ′(r) we have mo(U) = mi(U). Let p(X) := (pl(X) − ϱ1(X))+ (X ⊂ U)
and p(∅) := p(U) := 0. Since both pl and −ϱ1 are crossing supermodular, p is
positively crossing supermodular.
We claim that (2.5a) holds. Indeed, for every ∅ ⊂ X ⊂ U one has pl(X) ≤
ϱ′(X) = ϱ1(X) + mi(X), from which p(X) = (pl(X) − ϱ1(X))+ ≤ mi(X), which
is (2.5a).
We claim that (2.5b) holds as well. Indeed, for every ∅ ⊂ X ⊂ U we have
pl(X) = pl(X + r) ≤ ϱ′(X + r) = ϱ1(X) + mo(U − X), from which p(X) =
(pl(X) − ϱ1(X))+ ≤ mo(U − X), which is (2.5b).
By Theorem 2.5, there exists a digraph H = (U, F) satisfying (2.3) and (2.4).
It follows from (2.3) and from the definition of p that the digraph D1 + H :=
(U, A1 ∪ F) is l⁻-ec. Then D1 + H is constructible by induction, as |A1 ∪ F| =
|A| − (2l − 1) + (l − 1) < |A|. By (2.4), D arises from D1 + H by applying operation
(jj) with the choice F, proving that D is also constructible. ⊓⊔
Remark.
The proof of Theorem 2.5 in [F94] is algorithmic and gives rise to a combina-
torial strongly polynomial algorithm if an oracle for handling p is available. We
applied Theorem 2.5 to the function p defined by p(X) := (pl(X) − ϱ1(X))+, and
in this case the oracle can indeed be constructed via a network flow algorithm.
Hence we can find in polynomial time a digraph H = (U, F) for which D1 + H
is l⁻-ec. By applying this method at most |A| times one can find the sequence
of operations (j) and (jj) guaranteed by Theorem 2.2.
(2) → (3) Let s be any node of G. By Theorem 2.3, G has a (k + 1)− -ec
orientation, denoted by D = (V, A). By Theorem 2.2, D can be constructed
from s by a sequence of operations (j) and (jj). The corresponding sequence of
operations (i) and (ii) provides G.
(3) → (1). We use induction on the number of edges. There is nothing to prove
if G has no edges, so suppose that E is non-empty. Let T be a G-even subset of
V. Let G′ denote the graph from which G is obtained by one of the operations
(i) and (ii). By induction, we may assume that G′ has a k-ec T′-odd orientation
for every G′-even set T′.
Suppose first that G arises from G′ by adding a new edge f = xy. Let
T′ := T ⊕ {y}. Clearly, T′ is G′-even. By induction, there exists a k-ec T′-odd
References
[CG97] W. Cunningham and J. Geelen: The optimal path-matching problem, Combi-
natorica, Vol. 17, No. 3 (1997) pp. 315-338.
[F80] A. Frank: On the orientation of graphs, J. Combinatorial Theory, Ser. B,
Vol. 28, No. 3 (1980) pp. 251-261.
[FST84] A. Frank, A. Sebő and É. Tardos: Covering directed and odd cuts, Mathe-
matical Programming Studies 22 (1984) pp. 99-112.
[FJS98] A. Frank, T. Jordán and Z. Szigeti: An orientation theorem with parity con-
straints, to appear in IPCO ’99.
[F94] A. Frank: Connectivity augmentation problems in network design, in: Mathe-
matical Programming: State of the Art 1994, eds. J.R. Birge and K.G. Murty,
The University of Michigan, pp. 34-63.
[G82] R. Giles: Optimum matching forests, I-II-III, Mathematical Programming,
22 (1982) pp. 1-51.
[L70] L. Lovász: Subgraphs with prescribed valencies, J. Combin. Theory, 8 (1970)
pp. 391-416.
[L80] L. Lovász: Selecting independent lines from a family of lines in a space, Acta
Sci. Univ. Szeged, 42 (1980) pp. 121-131.
[M78] W. Mader: Über die Maximalzahl kantendisjunkter A-Wege, Archiv der Math-
ematik (Basel) 30 (1978) pp. 325-336.
[M82] W. Mader: Konstruktion aller n-fach kantenzusammenhängenden Digraphen,
Europ. J. Combinatorics 3 (1982) pp. 63-67.
[N60] C.St.J.A. Nash-Williams: On orientations, connectivity, and odd vertex pair-
ings in finite graphs, Canad. J. Math. 12 (1960) pp. 555-567.
[N81] L. Nebeský: A new characterization of the maximum genus of a graph,
Czechoslovak Mathematical Journal, 31 (106), (1981) pp. 604-613
[S81] P.D. Seymour: On odd cuts and plane multicommodity flows, Proceedings of
the London Math.Soc. 42 (1981) pp. 178-192.
[T61] W.T. Tutte: On the problem of decomposing a graph into n connected factors,
J. London Math. Soc. 36 (1961) pp. 221-230.
[T47] W.T. Tutte: The factorization of linear graphs, J. London Math. Soc. 22
(1947) pp. 107-111.
Approximation Algorithms for MAX 4-SAT and
Rounding Procedures for Semidefinite Programs
Eran Halperin and Uri Zwick

1 Introduction
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 202–217, 1999.
© Springer-Verlag Berlin Heidelberg 1999
4-SAT and perhaps even MAX SAT. It turns out, however, that the new family
falls just short of this mission. The experiments that we have made suggest that
a rounding procedure from the family, which we explicitly describe, can be used
to obtain a 0.8721-approximation algorithm for MAX 4-SAT. Unfortunately, no
rounding procedure from the family yields an approximation algorithm for MAX
4-SAT with a performance guarantee larger than 0.8724.
We have more success with MAX {2, 3, 4}-SAT, the version of MAX SAT in
which the clauses are of size two, three or four. We present a second rounding
procedure from the family that seems to yield an optimal 7/8-approximation
algorithm for MAX {2, 3, 4}-SAT. A 7/8-approximation algorithm for MAX
{2, 3, 4}-SAT yields immediately an optimal 7/8-approximation algorithm for
satisfiable instances of MAX 4-SAT, as clauses of size one can be easily elimi-
nated from satisfiable instances.
To determine the performance guarantee obtained using a given rounding
procedure R, or at least a lower bound on this ratio, we have to find the global
minimum of a function ratio R (v0 , v1 , v2 , v3 , v4 ), given a set of constraints on
the unit vectors v0 , v1 , . . . , v4 ∈ IR5 . The function ratio R is a fairly complicated
function determined by the rounding procedure R. As five unit vectors are de-
termined, up to rotations, by the (5 choose 2) = 10 angles between them, the function
ratio R is actually a function of 10 real variables. Finding the global minimum
of ratio R analytically is a formidable task. In the course of our investigation we
experimented with hundreds of rounding procedures. Finding these minima ‘by
hand’ was not really an option. We have implemented a set of Matlab functions
that use numerical techniques to find these minima.
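The shape of that numerical search can be sketched in a few lines. The following is a minimal pure-Python stand-in for the Matlab routines mentioned above; the search strategy (random restarts with a shrinking single-coordinate perturbation) and the toy objective are ours, chosen only to illustrate the setup of minimizing a function of 10 variables.

```python
import math
import random

def minimize(obj, dim, lo, hi, restarts=3, steps=2000, seed=0):
    """Crude numerical minimizer: random restarts, then repeated
    single-coordinate Gaussian perturbations with a shrinking scale.
    (A stand-in for the Matlab-based search described in the text.)"""
    rng = random.Random(seed)
    best_x, best_v = None, float("inf")
    for _ in range(restarts):
        x = [rng.uniform(lo, hi) for _ in range(dim)]
        v, scale = obj(x), (hi - lo) / 2.0
        for _ in range(steps):
            j = rng.randrange(dim)
            y = list(x)
            y[j] = min(hi, max(lo, y[j] + rng.gauss(0.0, scale)))
            w = obj(y)
            if w < v:
                x, v = y, w
            scale *= 0.997          # slowly focus the search
        if v < best_v:
            best_x, best_v = x, v
    return best_x, best_v

# Toy stand-in for ratio_R: a function of 10 'angles' with known minimum
# 0 at (0.5, ..., 0.5); the paper's real objective is far harder.
obj = lambda x: sum((t - 0.5) ** 2 for t in x)
x_min, v_min = minimize(obj, dim=10, lo=0.0, hi=math.pi)
```

The real difficulty, of course, lies not in the loop but in the objective: each evaluation of ratio R requires computing separation probabilities of up to five vectors.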
The discussion so far centered on the quality of the rounding procedures con-
sidered. We also consider the quality of the suggested semidefinite programming
relaxation itself. The integrality ratio of the MAX 4-SAT relaxation cannot be
more than 7/8, as it is also a relaxation of MAX 3-SAT. We also show that the
integrality ratio of the relaxation, considered as a relaxation of the problem MAX
{1, 4}-SAT, is at most 0.8753. The fact that this ratio is, at best, just above 7/8
is another indication of the difficulty of obtaining an optimal 7/8-approximation
algorithm for MAX 4-SAT and MAX SAT. It may also indicate that a stronger
semidefinite programming relaxation would be needed to accomplish this goal.
The fact that numerical optimization techniques were used to compute the
performance guarantees of the algorithms means that we cannot claim the ex-
istence of a 0.8721-approximation algorithm for MAX 4-SAT, and of a 7/8-ap-
proximation algorithm for MAX {2, 3, 4}-SAT as theorems. We believe, however,
that it is possible to prove these claims analytically and promote them to the
status of theorems, as was eventually done with the optimal 7/8-approximation
algorithm for MAX 3-SAT. This would require, however, considerable effort. It
may make more sense, therefore, to look for an approximation algorithm that
seems to be a 7/8-approximation algorithm for MAX 4-SAT before proceeding
to this stage.
In addition to implementing a set of Matlab functions that try to find the per-
formance guarantee of a given rounding procedure from the family considered,
we have also implemented a set of functions that search for good rounding pro-
cedures. The whole project required about 3000 lines of code. The two rounding
procedures mentioned above, and several other interesting rounding procedures
mentioned in Section 5, were found automatically using this system, with some
manual help from the authors. The total running time used in the search for
good rounding procedures is measured in months.
We end this section with a short survey of related results. The 7/8-approxi-
mation algorithm for MAX 3-SAT is based on the MAX CUT approximation
algorithm of Goemans and Williamson [12]. A 0.931-approximation algorithm
for MAX 2-SAT was obtained by Feige and Goemans [9]. Asano [4] obtained
a 0.770-approximation algorithm for MAX SAT. Trevisan [25] obtained a 0.8-
approximation algorithm for satisfiable MAX SAT instances. The last two results
are also the best published results for MAX 4-SAT.
Karloff and Zwick [17] describe a canonical way of obtaining semidefinite pro-
gramming relaxations for any constraint satisfaction problem. We now describe
the canonical relaxation of MAX 4-SAT obtained using this approach.
Assume that x1 , . . . , xn are the variables of the MAX 4-SAT instance. We let
x0 = 0 and xn+i = x̄i , for 1 ≤ i ≤ n. The semidefinite program corresponding
to the instance has a variable unit vector vi , corresponding to each literal xi ,
and scalar variables zi , zij , zijk or zijkl corresponding to the clauses xi , xi ∨ xj ,
xi ∨ xj ∨ xk and xi ∨ xj ∨ xk ∨ xl of the instance, where 1 ≤ i, j, k, l ≤ 2n. Note
that all clauses, including those that contain negated literals, can be expressed
in this form. Clearly, we require vn+i = −vi , or vi · vn+i = −1, for 1 ≤ i ≤ n.
The objective of the semidefinite program is to maximize the function
∑i wi zi + ∑i,j wij zij + ∑i,j,k wijk zijk + ∑i,j,k,l wijkl zijkl ,
where the wi ’s, wij ’s, wijk ’s and wijkl ’s are the non-negative weights of the
different clauses, subject to the following collection of constraints. For ease of
notation, we write down the constraints that correspond to the clauses x1 , x1 ∨x2 ,
x1 ∨x2 ∨x3 and x1 ∨x2 ∨x3 ∨x4 . The constraints corresponding to the other clauses
are easily obtained by plugging in the corresponding indices. The constraints
corresponding to x1 and x1 ∨ x2 are quite simple:
z1 = (1 − v0 · v1 )/2 ,   z12 ≤ (3 − v0 · v1 − v0 · v2 − v1 · v2 )/4 ,   z12 ≤ 1 .
It is not difficult to check that the first three constraints above are equivalent to
the requirement that
z123 ≤ (4 − (vi0 · vi1 + vi1 · vi2 + vi2 · vi3 + vi3 · vi0 ))/4 ,
for any permutation i0 , i1 , i2 , i3 on 0, 1, 2, 3. We will encounter similar constraints
for the 4-clauses. The constraints corresponding to x1 ∨ x2 ∨ x3 ∨ x4 are even
more complicated. For any permutation i0 , i1 , i2 , i3 , i4 on 0, 1, 2, 3, 4 we require:
z1234 ≤ (5 − (vi0 · vi1 + vi1 · vi2 + vi2 · vi3 + vi3 · vi4 + vi4 · vi0 ))/4 ,

z1234 ≤ (5 − (vi0 + vi4 ) · (vi1 + vi2 + vi3 ) + vi0 · vi4 )/4 ,   z1234 ≤ 1 .
The first line above contributes 12 different constraints, the second line con-
tributes 10 different constraints. Together with the constraint z1234 ≤ 1 we
get a total of 23 constraints per 4-clause. In addition, for every distinct 0 ≤
i1 , i2 , i3 , i4 , i5 ≤ 2n, we require
∑1≤j<k≤3 vij · vik ≥ −1   and   ∑1≤j<k≤5 vij · vik ≥ −2 .
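Given explicit unit vectors, the largest value of z1234 permitted by the 23 constraints above is simply a minimum of linear expressions in inner products. The helper below is our own sketch (pure Python; the test vectors are ours, chosen so that the bound is easy to verify by hand); iterating over all 120 permutations covers the 12 + 10 + 1 distinct constraints.

```python
from itertools import permutations

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def relax4(v):
    """Largest z_1234 allowed by the constraints in the text, where
    v = [v0, v1, v2, v3, v4] are unit vectors."""
    best = 1.0                                  # the constraint z_1234 <= 1
    for i0, i1, i2, i3, i4 in permutations(range(5)):
        # first family: 5-cycles of inner products
        cyc = (dot(v[i0], v[i1]) + dot(v[i1], v[i2]) + dot(v[i2], v[i3])
               + dot(v[i3], v[i4]) + dot(v[i4], v[i0]))
        best = min(best, (5 - cyc) / 4)
        # second family: (v_i0 + v_i4) against the middle three vectors
        ends = [a + b for a, b in zip(v[i0], v[i4])]
        mid = [a + b + c for a, b, c in zip(v[i1], v[i2], v[i3])]
        best = min(best, (5 - dot(ends, mid) + dot(v[i0], v[i4])) / 4)
    return best

v0 = (1.0, 0.0, 0.0, 0.0, 0.0)
neg_v0 = tuple(-a for a in v0)
```

With vi = −v0 for every i (each literal fully 'true') the bound evaluates to 1, and with vi = v0 it evaluates to 0, as one would expect.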
3 Rounding Procedures
In this section we consider various procedures that can be used to round so-
lutions of semidefinite programming relaxations and examine the performance
guarantees that we get for MAX 4-SAT using them. We start with simple round-
ing procedures and then move on to more complicated ones. The new family of
rounding procedures is then presented in Section 3.5.
Thus, the probability that certain five unit vectors are separated by a random
hyperplane can be expressed as a combination of probabilities of events that
involve at most four vectors and no further numerical integration is needed. More
generally, probH (v0 , . . . , vk ), for any even k, can be expressed as combinations of
probabilities involving at most k vectors. The same does not hold, unfortunately,
when k is odd.
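For intuition, and as a cheap sanity check on such closed-form expressions, probH can also be estimated by direct sampling: a uniformly random hyperplane through the origin is obtained from a standard Gaussian normal vector. This Monte Carlo sketch is our own, not the evaluation method used in the paper.

```python
import random

def prob_h(vectors, trials=20000, seed=1):
    """Monte Carlo estimate of probH: the probability that a random
    hyperplane through the origin separates the given unit vectors,
    i.e. that they do not all lie on the same side."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    separated = 0
    for _ in range(trials):
        r = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        sides = {d > 0 for d in
                 (sum(a * b for a, b in zip(r, v)) for v in vectors)}
        separated += len(sides) > 1
    return separated / trials

v0 = (1.0, 0.0)
```

For two vectors, probH (v0 , v1 ) = θ01 /π, so orthogonal vectors are separated about half the time and antipodal ones always.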
We let relax(v0 , v1 , . . . , vi ), where 1 ≤ i ≤ 4, denote the ‘relaxed’ value of
the clause x1 ∨ . . . ∨ xi , i.e., the maximal value to which the scalar z12...i can
be set, while still satisfying all the relevant constraints, given the unit vectors
v0 , v1 , . . . , vi . We let ratio H (v0 , v1 , . . . , vi ) = probH (v0 , v1 , . . . , vi )/relax(v0 , v1 , . . . , vi ) denote the corresponding performance ratio on this configuration.
Feige and Goemans [9] introduced the following variation of random hyperplane
rounding. Let f : [0, π] → [0, π] be a continuous function satisfying f (0) = 0 and
f (π − θ) = π − f (θ), for 0 ≤ θ ≤ π. Before rounding the vectors using a random
hyperplane, the vector vi is rotated into a new vector vi′ , in the plane spanned
by v0 and vi , so that the angle θ0i′ between v0 and vi′ is θ0i′ = f (θ0i ). The
rotations of the vectors v1 , . . . , vn affect, of course, the angles between these
vectors. Let θij′ be the angle between vi′ and vj′ . It is not difficult to see (see [9])
that, for i, j > 0, i ≠ j, we have

cos θij′ = cos θ0i′ cos θ0j′ + sin θ0i′ sin θ0j′ · (cos θij − cos θ0i cos θ0j )/(sin θ0i sin θ0j ) .
The vectors v0 , v1′ , . . . , vn′ are then rounded using a random hyperplane. The
condition f (0) = 0 is required to ensure the continuity of the transformation
vi → vi′ . The condition f (π − θ) = π − f (θ) ensures that unnegated and negated
literals are treated in the same manner.
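The rotation itself is elementary linear algebra in the plane spanned by v0 and vi. The helper below is our own sketch; the rotation function used in the example is for illustration only and is not a valid f for the algorithm (it violates f (π − θ) = π − f (θ)).

```python
import math

def rotate(v0, vi, f):
    """Return vi', the rotation of vi in the plane spanned by v0 and vi,
    such that the angle between v0 and vi' is f(theta) instead of theta."""
    c = max(-1.0, min(1.0, sum(a * b for a, b in zip(v0, vi))))
    theta = math.acos(c)
    # unit vector u: the component of vi orthogonal to v0
    u = [a - c * b for a, b in zip(vi, v0)]
    norm = math.sqrt(sum(a * a for a in u))
    if norm < 1e-12:              # vi is (anti)parallel to v0; nothing to rotate
        return tuple(vi)
    u = [a / norm for a in u]
    t = f(theta)
    return tuple(math.cos(t) * b + math.sin(t) * a for a, b in zip(u, v0))

v0 = (1.0, 0.0)
vi = (math.cos(0.7), math.sin(0.7))
wi = rotate(v0, vi, lambda th: th / 2)   # illustrative rotation only
```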
Feige and Goemans [9] use rotations to obtain a 0.931-approximation algo-
rithm for MAX 2-SAT. Rotations are also used in [29] and [30]. Can we use
rotations to get a better approximation algorithm for MAX 4-SAT? The answer
is that rotations on their own help, but very little. Consider the configuration
v0 , v1 , v2 , v3 , v4 in which θ0i = π/2, for 1 ≤ i ≤ 4, and θij = arccos(1/3), for
1 ≤ i < j ≤ 4. For this configuration we get relax = 1 and ratio H ≈ 0.8503. As
every rotation function f must satisfy f (π/2) = π/2, rotations have no effect on
this configuration.
A different type of rotations was recently used by Nesterov [20] and Zwick [31].
These outer-rotations are used in [31] to obtain some improved approximation
algorithms for MAX SAT and MAX NAE-SAT. We were not able to use them,
however, to improve the results that we get in this paper for MAX 4-SAT.
The semidefinite programming relaxation of MAX 4-SAT that we are using here
is stronger than the linear programming relaxation suggested by Goemans and
probI (v0 , v1 , . . . , vi ) = 1 − ∏_{j=1}^{i} (1 − g(θ0j )/π) .
Note that as each vector is rounded independently, the angles θij , where i, j > 0,
between the vectors, have no effect this time. It may be worthwhile to note
that the choice g(θ) = π/2, for every 0 ≤ θ ≤ π, corresponds to choosing
the assignment to the variables x1 , x2 , . . . , xn uniformly at random, a ‘rounding
procedure’ that yields a ratio of 7/8 for clauses of size 3 and 15/16 for clauses
of size 4. Goemans and Williamson [11] describe several functions g with which
a 3/4-approximation algorithm for MAX SAT may be obtained.
Independent rounding performs well for long clauses. It cannot yield a ratio
larger than 3/4, however, for clauses of size 2. To see this, consider the configura-
tion v0 , v1 , v2 in which θ01 = θ02 = π/2 and θ12 = π. We have relax(v0 , v1 , v2 ) = 1
and probI (v0 , v1 , v2 ) = 3/4, for any function g.
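The independent-rounding probability can be computed from the angles alone. A short sketch of the product formula above (our own helper), reproducing the figures quoted in the text for the uniformly random assignment g(θ) = π/2 and for the limiting 2-clause configuration:

```python
import math

def prob_i(thetas, g):
    """probI for a clause whose literal vectors make the given angles
    with v0: literal j is set to true with probability g(theta_0j)/pi,
    independently, and the clause fails only if all literals fail."""
    fail = 1.0
    for th in thetas:
        fail *= 1.0 - g(th) / math.pi
    return 1.0 - fail

uniform = lambda th: math.pi / 2        # uniformly random assignment
```

For θ01 = θ02 = π/2 the value is 1 − (1/2)(1/2) = 3/4 for any admissible g, matching the 3/4 limit discussed above.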
We have seen that hyperplane rounding works well for short clauses and that in-
dependent rounding works well for long clauses. It is therefore natural to consider
a combination of the two.
Perhaps the most natural combination of hyperplane rounding and independent rounding is the following. Let 0 ≤ ε ≤ 1. With probability 1 − ε round
the vectors using a random hyperplane. With probability ε choose a random
assignment. It turns out that the best choice of ε here is ε ≈ 0.086553. With this
value of ε, we get α1 = α2 = α4 ≈ 0.853150 while α3 = 7/8. Thus, we again get
a small improvement but we are still far from 7/8.
Instead of rounding all vectors using a random hyperplane, or choosing random values for all variables, we can round some of the vectors using a random
hyperplane, and assign some of the variables random values. More precisely, we
choose one random hyperplane. Each vector is now rounded using this random
hyperplane with probability 1 − ε, or is assigned a random value with probability ε. The decisions for the different variables are made independently. Letting
ε ≈ 0.073609, we get α1 = α2 = α4 ≈ 0.856994, while α3 ≈ 0.874496. This is
again slightly better but still far from 7/8.
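One rounding experiment of this per-vector mixture is easy to state in code. The sketch below is ours, with our own convention that a vector landing on the side of the hyperplane opposite v0 means the corresponding literal is set to true.

```python
import random

def mixed_round(v0, vectors, eps, rng):
    """One experiment of the per-vector mixture: draw one random
    hyperplane; each vector is rounded by it with probability 1 - eps
    and is assigned a fair random value with probability eps."""
    r = [rng.gauss(0.0, 1.0) for _ in range(len(v0))]
    side0 = sum(a * b for a, b in zip(r, v0)) > 0
    values = []
    for v in vectors:
        if rng.random() < eps:
            values.append(rng.random() < 0.5)        # independent coin
        else:
            side = sum(a * b for a, b in zip(r, v)) > 0
            values.append(side != side0)             # opposite v0 = 'true'
    return values

rng = random.Random(2)
v0 = (1.0, 0.0)
v = (0.6, 0.8)
neg_v = (-0.6, -0.8)
```

With eps = 0 this is plain hyperplane rounding (antipodal vectors always receive opposite values); with eps = 1 it is a uniformly random assignment.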
where

pr(S) = ∏_{j∈S} (1 − ε(θ0j )) · ∏_{j∉S} ε(θ0j ) ,
and where S ranges over all subsets of {1, 2, . . . , i}. Recall that probH (u1 , u2 , . . . ,
uk ) is the probability that the set of vectors u1 , u2 , . . . , uk is separated by a
random hyperplane, and that probI (v0 , u1 , . . . , uk ) is the probability that at
least one of the vectors u1 , u2 , . . . , uk is assigned the value 1 when all these
vectors are rounded independently using the function g.
We have made some experiments with an even wider family of rounding pro-
cedures but we were not able to improve on the results obtained using rounding
procedures selected from the family described here. More details will be given
in the full version of the paper.
Can we select a rounding procedure from the proposed family of rounding
procedures using which we can get an optimal, or an almost optimal, approxi-
mation algorithm for MAX 4-SAT?
        f                         g                         ε
( 0        , 0        )   ( 0        , 0        )   ( 0        , 0.250000 )
( 0.777843 , 1.210627 )   ( 0.750000 , 0        )   ( 0.744611 , 0.357201 )
( 1.038994 , 1.445975 )   ( 1.072646 , 0        )   ( 1.039987 , 0.255183 )
( 1.248362 , 1.394099 )   ( 1.248697 , 0.872552 )   ( 1.072689 , 0.222928 )
( π/2      , π/2      )   ( π/2      , π/2      )   ( π/2      , 0.131681 )

        f                         g                         ε
( 0        , 0        )   ( 0        , 0.550000 )   ( 0        , 0.650000 )
( 1.394245 , 1.544705 )   ( 1.155432 , 1.154866 )   ( 0.413021 , 0.163085 )
( π/2      , π/2      )   ( 1.394111 , 0.931661 )   ( π/2      , 0.160924 )
                          ( π/2      , π/2      )
It is interesting to note the non-monotonicity of the function g(θ) and the fact that only one
intermediate point is needed for f (θ) and ε(θ), and only two intermediate points
are needed for g(θ).
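Each column of the tables lists breakpoints (θ, value) of a piecewise-linear function on [0, π/2]; values on (π/2, π] follow from the symmetry conditions stated earlier. The evaluator below is our own sketch, using the breakpoints of f from the first table and the symmetry f (π − θ) = π − f (θ).

```python
import math

def pw_linear(breaks):
    """Piecewise-linear interpolation through (theta, value) breakpoints
    given on [0, pi/2]."""
    def h(t):
        for (t0, y0), (t1, y1) in zip(breaks, breaks[1:]):
            if t0 <= t <= t1:
                return y0 if t1 == t0 else y0 + (y1 - y0) * (t - t0) / (t1 - t0)
        raise ValueError("theta outside [0, pi/2]")
    return h

# Breakpoints of f from the first table above.
f_half = pw_linear([(0.0, 0.0), (0.777843, 1.210627), (1.038994, 1.445975),
                    (1.248362, 1.394099), (math.pi / 2, math.pi / 2)])

def f(t):
    """Extend f to [0, pi] via the symmetry f(pi - t) = pi - f(t)."""
    return f_half(t) if t <= math.pi / 2 else math.pi - f_half(math.pi - t)
```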
A 7/8-approximation algorithm for MAX {2, 3, 4}-SAT is of course optimal
as a ratio better than 7/8 cannot be obtained even for MAX {3}-SAT, which is
a subproblem of MAX {2, 3, 4}-SAT.
then relax(v0 , v1 , v2 , v3 , v4 ) = 1.
Let 0 < θ1 < θ2 ≤ π/2 be two angles. Consider the configuration (v0 , v1 ) in
which θ01 = π − θ1 , and the two configurations (v0 , v1^1 , v2^1 , v3^1 , v4^1 ) and
(v0 , v1^2 , v2^2 , v3^2 , v4^2 ) in which

(θ01^1 , θ02^1 , θ03^1 , θ04^1 ) = (θ1 , θ1 , θ1 , π − θ2 ) ,   (θ01^2 , θ02^2 , θ03^2 , θ04^2 ) = (θ2 , θ2 , θ2 , θ2 )
and in which the angles θjk^i , for 1 ≤ j < k ≤ 4, are determined according to
the relations above so that relax(v0 , v1^i , v2^i , v3^i , v4^i ) = 1, for i = 1, 2. It is not
difficult to check that

θ12^1 = θ13^1 = θ23^1 = arccos((1 + 2 cos θ2 )/3) ,   θ14^1 = θ24^1 = θ34^1 = arccos((1 − 3 cos θ1 − cos θ2 )/3) ,

and that θjk^2 = arccos((1 − 2 cos θ2 )/3), for 1 ≤ j < k ≤ 4.
Assume that the configurations (v0 , v1^i , v2^i , v3^i , v4^i ), for i = 1, 2, are feasible. For
every rounding procedure C we have

α(C) ≤ min{ ratio C (v0 , v1 ), ratio C (v0 , v1^1 , v2^1 , v3^1 , v4^1 ), ratio C (v0 , v1^2 , v2^2 , v3^2 , v4^2 ) }.
As the only angles between v0 and the other vectors in these three configurations
are θ1 , θ2 , π − θ1 and π − θ2 , and as f (π − θ) = π − f (θ), g(π − θ) = π − g(θ) and
ε(π − θ) = ε(θ), we get that for every rounding procedure from our family, the
three ratios ratio C (v0 , v1 ), ratio C (v0 , v1^1 , v2^1 , v3^1 , v4^1 ) and ratio C (v0 , v1^2 , v2^2 , v3^2 , v4^2 )
depend only on the six parameters f (θ1 ), f (θ2 ), g(θ1 ), g(θ2 ), ε(θ1 ) and ε(θ2 ).
Take θ1 = 0.95 and θ2 = arccos(1/5) ≈ 1.369438. It is possible to check
that the resulting two configurations (v0 , v1^1 , v2^1 , v3^1 , v4^1 ) and (v0 , v1^2 , v2^2 , v3^2 , v4^2 ) are
feasible. The choice of the six parameters that maximizes the minimum ratio
over the three configurations was found, again, using numerical optimization.
With this choice of parameters, the three ratios evaluate to about 0.8724. No
rounding procedure from the family can therefore attain a ratio of more than
0.8724 simultaneously on these three specific configurations. No rounding proce-
dure from the family can therefore yield a performance ratio greater than 0.8724
for MAX 4-SAT, even if the functions f , g and ε are not piecewise linear.
8 Concluding Remarks
We have come frustratingly close to obtaining an optimal 7/8-approximation al-
gorithm for MAX 4-SAT. We have seen that devising a 7/8-approximation algo-
rithm for MAX {1, 4}-SAT is already a challenging problem. Note that Håstad’s
7/8 upper bound for MAX 3-SAT and MAX 4-SAT does not apply to MAX
{1, 4}-SAT, as clauses of length three are not allowed in this problem. A gadget
(see [26]) supplied by Greg Sorkin shows that no polynomial time approximation
algorithm for MAX {1, 4}-SAT can have a performance ratio greater than 9/10,
unless P=NP.
We believe that optimal 7/8-approximation algorithms for MAX 4-SAT and
MAX SAT do exist. The fact that we have come so close to obtaining such
algorithms may in fact be seen as cause for optimism. There is still a possibility
that simple extensions of ideas laid out here could be used to achieve this goal. If
this fails, it may be necessary to attack the problems from a more global point of
view. Note that the analysis carried out here was very local in nature. We only
considered one clause of the instance at a time. As a result we only obtained
lower bounds on the performance ratios of the algorithms considered. It may
even be the case that the algorithms from the family of algorithms considered
here do give a performance ratio of 7/8 for MAX 4-SAT although a more global
analysis is required to show it.
We also hope that MAX 4-SAT would turn out to be the last barrier on the
road to an optimal approximation algorithm for MAX SAT. The almost optimal
algorithms for MAX 4-SAT presented here may be used to obtain an almost
optimal algorithm for MAX SAT. We have not yet worked out the exact bounds
that we can get for MAX SAT as we still hope to get an optimal algorithm for
MAX 4-SAT before proceeding with MAX SAT.
Finally, a word on our methodology. Our work is a bit unusual as we use
experimental and numerical means to obtain theoretical results. We think that
the nature of the problems that we are trying to solve calls for this approach.
No one can rule out, of course, the possibility that some clever new ideas would
dispense with most of the technical difficulties that we are facing here. Until
that happens, however, we see no alternative to the current techniques. The use
of experimental and numerical means does not mean that we have to give up
the rigor of the results. Once we obtain the ‘right’ result, we can devote
efforts to proving it rigorously, possibly using automated means.
References
1. F. Alizadeh. Interior point methods in semidefinite programming with applications
to combinatorial optimization. SIAM Journal on Optimization, 5:13–51, 1995.
2. S. Arora, C. Lund, R. Motwani, M. Sudan, and M. Szegedy. Proof verification and
the hardness of approximation problems. Journal of the ACM, 45:501–555, 1998.
3. S. Arora and S. Safra. Probabilistic checking of proofs: A new characterization
of NP. Journal of the ACM, 45:70–122, 1998.
4. T. Asano. Approximation algorithms for MAX SAT: Yannakakis vs. Goemans-
Williamson. In Proceedings of the 3rd Israel Symposium on Theory and Computing
Systems, Ramat Gan, Israel, pages 24–37, 1997.
5. T. Asano, T. Ono, and T. Hirata. Approximation algorithms for the maximum
satisfiability problem. Nordic Journal of Computing, 3:388–404, 1996.
6. M. Bellare, O. Goldreich, and M. Sudan. Free bits, PCPs, and
nonapproximability—towards tight results. SIAM Journal on Computing, 27:804–
915, 1998.
7. M. Bellare, S. Goldwasser, C. Lund, and A. Russell. Efficient probabilistically
checkable proofs and applications to approximation. In Proceedings of the 25th
Annual ACM Symposium on Theory of Computing, San Diego, California, pages
294–304, 1993. See Errata in STOC’94.
8. H.S.M. Coxeter. The functions of Schläfli and Lobatschefsky. Quarterly Journal
of Mathematics (Oxford), 6:13–29, 1935.
9. U. Feige and M.X. Goemans. Approximating the value of two prover proof systems,
with applications to MAX-2SAT and MAX-DICUT. In Proceedings of the 3rd
Israel Symposium on Theory and Computing Systems, Tel Aviv, Israel, pages 182–
189, 1995.
10. U. Feige, S. Goldwasser, L. Lovász, S. Safra, and M. Szegedy. Interactive proofs
and the hardness of approximating cliques. Journal of the ACM, 43:268–292, 1996.
11. M.X. Goemans and D.P. Williamson. New 3/4-approximation algorithms for the
maximum satisfiability problem. SIAM Journal on Discrete Mathematics, 7:656–
666, 1994.
12. M.X. Goemans and D.P. Williamson. Improved approximation algorithms for max-
imum cut and satisfiability problems using semidefinite programming. Journal of
the ACM, 42:1115–1145, 1995.
13. M. Grötschel, L. Lovász, and A. Schrijver. Geometric Algorithms and Combinato-
rial Optimization. Springer Verlag, 1993. Second corrected edition.
14. J. Håstad. Some optimal inapproximability results. In Proceedings of the 29th
Annual ACM Symposium on Theory of Computing, El Paso, Texas, pages 1–10,
1997. Full version available as E-CCC Report number TR97-037.
15. W.Y. Hsiang. On infinitesimal symmetrization and volume formula for spherical or
hyperbolic tetrahedrons. Quarterly Journal of Mathematics (Oxford), 39:463–468,
1988.
On the Chvátal Rank of Certain Inequalities
Mark Hartmann, Maurice Queyranne, and Yaoguang Wang

1 Introduction
We consider the problem of deciding whether the Chvátal rank of an inequality,
relative to a given polyhedron, is greater than one or not. Formally, given a
finite set E, we let IRE denote the space of vectors with real-valued components
indexed by E, and ZZ E the set of all vectors in IRE all of whose components
are integer. The integral part ⌊y⌋ of a real number y is the largest integer less
than or equal to y. Given a polyhedron P in IRE , let PI := conv(P ∩ ZZ E ) be
its integral hull, that is, the convex hull of the integer points in P . Following
Chvátal et al. [11], let
P ′ := { x ∈ P : ax ≤ a0 whenever a ∈ ZZ E , a0 ∈ ZZ and max{ax : x ∈ P } < a0 + 1 } .
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 218–233, 1999.
© Springer-Verlag Berlin Heidelberg 1999
a rational polyhedron, each P (j) thus defined is itself a rational polyhedron and
there exists a nonnegative integer r such that P (r) = PI . (Chvátal [10] shows
this when P is bounded.) Consider any inequality ax ≤ a0 with a ∈ ZZ E and
a0 ∈ ZZ, which is valid for PI . Its Chvátal rank rP (a, a0 ) is the smallest j such
that ax ≤ a0 is valid for P (j) . Thus rP (a, a0 ) = 0 iff the inequality ax ≤ a0 is
valid for P ; rP (a, a0 ) = 1 iff it is valid for P ′ but not for P ; and rP (a, a0 ) ≥ 2
iff it is valid for PI but not for P ′ .
The Chvátal rank of an inequality is often considered as a measure of its
“complexity” relative to the formulation defining polyhedron P . Rank-one in-
equalities may be derived by a generally straightforward rounding argument. In
contrast, inequalities of rank two or greater require a more complicated proof,
involving several stages of rounding. In this paper, we consider the problem of
determining whether or not the Chvátal rank of a given inequality is at least two,
as considered, among others, in [2] and [18], and in [4], [5], [6], [16] and [20] for
several classes of inequalities for travelling salesman polytopes. See also Chvátal
et al. [11] for references to a number of results and conjectures regarding lower
bounds on the Chvátal rank of inequalities and polyhedra related to various
combinatorial optimization problems and more recently Bockmayr et al. [3] and
Eisenbrand and Schulz [15] for polynomial upper bounds on the Chvátal rank of
0–1 polytopes. See also Caprara and Fischetti [7], Caprara et al. [8] and Eisen-
brand [14] regarding the separation problem for inequalities with Chvátal rank
one.
A related notion of disjunctive rank was introduced in the seminal work of
Balas et al. [1]; see also [9]. Their lift-and-project approach, which does not rely
on rounding, applies to polyhedra whose integer variables are restricted to be
0-1 valued. It does not apply (directly) to general integer variables; it is, in this
sense, less general than the Gomory-Chvátal approach. On the other hand, it
applies directly to mixed 0-1 programs, thus allowing continuous variables. It
also leads to interesting mathematical and computational properties. We outline
below some of those properties that are most directly related to the present
paper. Let K ⊆ IRN be a polyhedron and E ⊆ N be the set of 0-1 variables. For
S ⊆ E and any assignment x̄S ∈ {0, 1}S of 0-1 values to the variables in S, let
KS (x̄S ) = K∩{x ∈ IRN : xj = x̄j ∀j ∈ S}. Let KS = {KS (x̄S ) : x̄S ∈ {0, 1}S },
so we have K = K∅ ⊇ KS ⊇ KE , and KE is the set of all feasible mixed 0-1
solutions associated with K and E. An inequality ax ≤ a0 , valid for KE , has
disjunctive rank r if r is the smallest integer such that there exists a subset
S of E with |S| = r and ax ≤ a0 valid for KS . It follows immediately that
the disjunctive rank of any such inequality is at most |E|, the number of 0-
1 variables. This is in sharp contrast with the fact that the Chvátal rank of
certain inequalities can be unbounded, even for polytopes with two variables and
integral hull contained in {0, 1}2 (e.g., [23], page 227). Eisenbrand and Schulz
[15] show that, if P = K is contained in the unit cube, then the Chvátal rank
of any inequality, valid for its integral hull PI = KN , is O(|N |2 log |N |); this
upper bound is still much larger than the corresponding bound of |N | for the
disjunctive rank. Balas et al. also show that the disjunctive and Chvátal ranks of
an inequality are “incomparable”; that is, for certain inequalities, the disjunctive
rank is larger than their Chvátal rank, whereas the converse holds for other
inequalities. An important computational property of the disjunctive rank is
that, for any fixed r, it can be decided in polynomial time whether or not the
disjunctive rank of a given inequality ax ≤ a0 is at most r. Indeed, enumerate
all (|E| choose r) = O(|E|^r ) subsets S ⊆ E with |S| = r; for any such S, the inequality is
valid for KS if and only if it is valid for all 2^r polyhedra KS (x̄S ) with x̄S ∈ {0, 1}S
(if KS (x̄S ) = ∅ then the inequality is trivially valid); since r is fixed, validity
for KS can thus be verified by solving a fixed number 2^r of linear programs
max{ax : x ∈ KS (x̄S )}, each about the same size as the linear system defining
K; this yields the desired polynomial time algorithm. In contrast, it follows
from a result of Eisenbrand [14] and the equivalence between separation and
optimization (e.g., [23]), that it is NP-hard to decide whether or not the Chvátal
rank of an inequality is at most one. In this paper we present sufficient conditions
for a negative answer to this decision problem.
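The enumeration just described is short to write down. In the sketch below, the per-fixing LP test is abstracted into a callback is_valid (our own scaffolding, with a toy oracle in place of real polyhedral data), so only the enumeration structure is faithful to the argument above.

```python
from itertools import combinations, product

def disjunctive_rank_at_most(r, E, is_valid):
    """Decide whether the disjunctive rank of an inequality is at most r:
    look for a subset S of E with |S| = r such that the inequality is
    valid for K_S(x_S) under every 0-1 fixing x_S of the variables in S.
    is_valid(fixing) abstracts the LP test max{ax : x in K_S(x_S)} <= a0
    (and should return True when K_S(x_S) is empty)."""
    for S in combinations(sorted(E), r):
        if all(is_valid(dict(zip(S, bits)))
               for bits in product((0, 1), repeat=r)):
            return True
    return False

# Toy oracle: the inequality becomes valid exactly when x1 is fixed.
toy_oracle = lambda fixing: "x1" in fixing
```

For fixed r this performs O(|E|^r) subset choices and 2^r oracle calls per subset, matching the running-time analysis above.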
For a ∈ ZZ E and a0 ∈ ZZ, a straightforward method for proving that the
Chvátal rank rP (a, a0 ) is zero or one relies on the optimal value zP (a) of the
linear programming problem

zP (a) := max{ax : x ∈ P } .
This is stated explicitly in Lemma 3.1 in [6] and is used in Theorem 3.3 therein to
prove that the Chvátal rank of certain clique tree inequalities for the symmetric
Travelling Salesman Problem (TSP) is at least two, where P is the corresponding
subtour polytope (see Sect. 3 below for details), by Giles and Trotter [18] (p. 321)
to disprove a conjecture of Edmonds that each non-trivial facet of the stable set
polytope of a K1,3 -free graph has Chvátal rank one, where P is described by the
non-negativity and clique constraints, and by Balas and Saltzman in Proposition
3.8 in [2] to show that the Chvátal rank of inequalities induced by certain cliques
of the complete tri-partite graph Kn,n,n is at least (in fact, exactly) two, where
P is the linear programming relaxation of the (axial) three-index assignment
problem. This result is also implicit in related statements in [4] (p. 265) for the
envelope inequality for the symmetric TSP, and in [16] (p. 263) for a class of
T -lifted comb inequalities for the asymmetric TSP.
Example 1 below shows that implication (1) may fail if the inequality is not
properly scaled.
Example 1: Let P := { x = (x1 , x2 ) ∈ IR2 : 2x1 + 3x2 ≤ 11, x ≥ 0 }. The inequality ax ≤ a0 with a = (3, 3) and a0 = 15 is facet-inducing for PI =
conv{(0, 0), (0, 3), (1, 3), (4, 1), (5, 0)}. We have zP (a) = a · (5 1/2 , 0) = 16 1/2 , so
a0 < ⌊zP (a)⌋. Yet rP (a, a0 ) = 1, since 3x1 + 3x2 ≤ 15 is equivalent to x1 + x2 ≤ 5,
which is valid for P ′ since zP (1, 1) = 5 1/2 . ⊓⊔
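Example 1 can be verified exactly: P is a bounded polyhedron in two variables, so zP (a) is attained at a vertex, and the vertices can be enumerated as feasible intersections of pairs of constraint boundaries. The enumeration helper below is our own sketch, in exact rational arithmetic.

```python
from fractions import Fraction as F
from itertools import combinations

# P = {x : 2x1 + 3x2 <= 11, -x1 <= 0, -x2 <= 0}; one row (p, q, r)
# per inequality p*x1 + q*x2 <= r.
rows = [(F(2), F(3), F(11)), (F(-1), F(0), F(0)), (F(0), F(-1), F(0))]

def vertices(rows):
    """Feasible intersection points of pairs of constraint boundaries."""
    vs = []
    for (a1, a2, b), (c1, c2, d) in combinations(rows, 2):
        det = a1 * c2 - a2 * c1
        if det == 0:
            continue                       # parallel boundaries
        x = (b * c2 - a2 * d) / det        # Cramer's rule
        y = (a1 * d - b * c1) / det
        if all(p * x + q * y <= r for p, q, r in rows):
            vs.append((x, y))
    return vs

def z_P(a):
    """max{ax : x in P}; P is bounded, so some vertex is optimal."""
    return max(a[0] * x + a[1] * y for x, y in vertices(rows))
```

Indeed zP ((3, 3)) = 33/2, whose integral part 16 exceeds a0 = 15, while zP ((1, 1)) = 11/2 certifies that x1 + x2 ≤ 5 holds for P ′.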
λ ∈ IRm(Q) and a scalar s > 0 such that (a, a0 ) = s(c1 , c10 ) + λ(C, d). If z is the
“certificate” from condition (iii), then

1 = az = s(c1 z) + λ(Cz) = s(c1 z) ,

which implies that s ≤ 1 since c1 z is a (positive) integer. Since zP (c1 ) < c10 + 1
and 0 < s ≤ 1, we have

zP (a) = s zP (c1 ) + λd < s(c10 + 1) + λd = a0 + s ≤ a0 + 1 .
Example 4: This example shows that we cannot drop the requirement that
m(P ) = m(Q). Let

P := { x = (x1 , x2 ) ∈ IR2 : 4x1 + 3x2 ≤ 4, x1 + x2 ≥ 1, 4x1 + x2 ≥ 0 } .
has full row rank. Theorem 5.2 of Schrijver [26] implies that there exists z ∈ ZZ E
with az = n and Cz = 0 if and only if nH^{-1} e is integral, where [H|0] is the
Hermite normal form of A and e is the first unit vector. By choice of n, the
components of nH^{-1} e are relatively prime integers, so there exists w ∈ ZZ m(Q)+1
such that w(nH^{-1} e) = 1. This implies that wH^{-1} = (1/n, λ) for some λ ∈ IRm(Q) .
Theorem 5.2 of [26] also implies that A = [H|0]U for some unimodular (and
hence integral) matrix U , and thus

a′ := (1/n)a + λC = wH^{-1} A = wH^{-1} [H|0]U = wU .
Corollary 3 Let Q be any integral polyhedron and let P be any rational polyhe-
dron such that conv(P ∩ ZZ E ) = Q.
(A) If m(P ) = m(Q) then for every facet F of Q, there exists an inequality
ax ≤ a0 , inducing facet F and with integral components, such that equivalence (2)
holds.
(B) If Q is full dimensional then every facet-inducing inequality ax ≤ a0 for Q,
scaled so its components are relatively prime integers, verifies equivalence (2).
Proof. For (B), recall that, since Q is a rational polyhedron, every facet-inducing
inequality for Q may be scaled so its components are relatively prime integers.
Since m(Q) = 0, condition (iii) then holds trivially so the inequality is in reduced
integral form. The rest of the proof follows directly from Theorems 1 and 2. ⊓⊔
such that conv(P ∩ ZZ E ) = Q. In Giles and Trotter [18], the stable-set polytope
is full dimensional and the facet-inducing inequality in question has coefficients
of 1, 2 and 3, so equivalence (2) holds by Corollary 3(B).
We now consider the case where Q is not full-dimensional. This case is im-
portant for many combinatorial optimization problems, such as the travelling
salesman problems discussed in the next sections. Although the proof of Theo-
rem 2 is constructive, it may not be easy to apply directly. We present below
a simple condition for obtaining a reduced integral form when Q is not full-
dimensional.
Consider any B ⊆ E such that |B| = m(Q) and the |B| × |B| submatrix CB
of C with columns indexed by B is nonsingular. Subset B (and by extension,
matrix CB ) is called a basis of the equality system. Defining N := E \ B, we may
write C = (CB , CN ) and, for any vector a ∈ IRE , a = (aB , aN ). An inequality
ax ≤ a0 is in B-canonical form if aB = 0. It is in integral B-canonical form if the
components of aN are relatively prime integers. Note that, for any given basis B
of the equality system for a rational polyhedron Q, every facet of Q is induced by
a unique inequality in integral B-canonical form. (Indeed, the equations aB = 0
define the facet-inducing inequality ax ≤ a0 uniquely, up to a positive multiple;
this multiple is determined by the requirement that the components of a be
relatively prime integers.)
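Since every facet has a unique inducing inequality in integral B-canonical form, the normalization is just a gcd computation. As a small illustrative sketch (the function name is ours, not the paper's; the input vector is assumed nonzero), exact rational arithmetic makes this robust:

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def integral_canonical(a_N):
    """Scale the (nonzero) rational vector a_N by a positive factor so that
    its entries become relatively prime integers -- the unique scaling used
    in the integral B-canonical form."""
    fracs = [Fraction(x) for x in a_N]
    # clear denominators with their least common multiple
    lcm = reduce(lambda p, q: p * q // gcd(p, q),
                 (f.denominator for f in fracs), 1)
    ints = [int(f * lcm) for f in fracs]
    g = reduce(gcd, (abs(v) for v in ints))
    return [v // g for v in ints]

print(integral_canonical([Fraction(3, 2), Fraction(9, 4), 3]))  # [2, 3, 4]
```

For example, (3/2, 9/4, 3) is cleared to (6, 9, 12) and then divided by gcd 3.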
Define a basis B of Cx = d to be dual-slack integral if the matrix CB^{-1}CN is integral. Note that this definition is equivalent to requiring that the vector c̄ := c − cB CB^{-1}C be integral for every integral c ∈ ZZ^E. The vector c̄ may thus be
interpreted as that of basic dual slack variables (or reduced costs) for the linear
programming problem max{cx : Cx = d, x ≥ 0}. Note also that the property
of being dual-slack integral is independent of scaling and, more generally, of the
linear system used to represent the given equality system. (Indeed, if C′x = d′ represents the same system as Cx = d, then (C′, d′) = M(C, d) where M is a nonsingular matrix; it follows that c − cB(C′B)^{-1}C′ = c̄.)
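The definition of a dual-slack integral basis can be tested directly with exact rational arithmetic. The following sketch (helper names are ours; CB is assumed square and nonsingular, given as dense integer row lists) computes CB^{-1}CN by Gauss-Jordan elimination over the rationals and checks integrality:

```python
from fractions import Fraction

def solve_exact(CB, CN):
    """Return X = CB^{-1} CN by Gauss-Jordan elimination over the rationals
    (assumes CB nonsingular)."""
    n = len(CB)
    M = [[Fraction(v) for v in CB[i]] + [Fraction(v) for v in CN[i]]
         for i in range(n)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def dual_slack_integral(CB, CN):
    """B is dual-slack integral iff every entry of CB^{-1} CN is an integer."""
    return all(v.denominator == 1 for row in solve_exact(CB, CN) for v in row)

print(dual_slack_integral([[2, 1], [1, 1]], [[3], [1]]))  # True
print(dual_slack_integral([[2, 0], [0, 1]], [[1], [0]]))  # False
```

In the first example CB is unimodular, so CB^{-1}CN is automatically integral; in the second, CB^{-1}CN contains the entry 1/2.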
Proposition 4 Let B be a dual-slack integral basis of the equality system for an integral polyhedron Q. Then every facet-inducing inequality for Q in integral B-canonical form is in reduced integral form.
Proof. Let B be a dual-slack integral basis of the equality system for integral
polyhedron Q. Let ax ≤ a0 be a facet-inducing inequality in integral B-canonical
form. We only need to verify condition (iii). Since the components of aN are
relatively prime, there exists zN ∈ ZZ^N such that aN zN = 1. Define zB := −CB^{-1}CN zN, so that Cz = CB zB + CN zN = 0; since B is dual-slack integral, zB is integral. Then z is the “certificate” in condition (iii). □
Remark. This result can also be interpreted in terms of the Hermite normal form of the matrix A = (AB, AN). If ax ≤ a0 is in B-canonical form for a dual-slack integral basis B, then CB^{-1}CN is integral and aB = 0, so integer multiples of the columns of AB can be added to the columns of AN to zero out CN. But then, since the components of aN are relatively prime, the Hermite normal form of A will be [H|0], where

    H = ( 1   0
          0   HB )

and HB is the Hermite normal form of CB.
Thus if P is a rational polyhedron with the same equality set as an inte-
gral polyhedron Q with conv(P ∩ ZZ E ) = Q, to prove that the Chvátal rank
rP (a, a0 ) of a facet-inducing inequality ax ≤ a0 with relatively prime inte-
ger components is at least two, it suffices to exhibit a dual-slack integral ba-
sis comprised only of edges e with ae = 0, and a point x∗ ∈ P with ax∗ ≥
a0 + 1. In Balas and Saltzman [2], the (axial) three-index assignment polytope
is not full dimensional, but the facet-inducing inequality induced by the clique
{(3, 3, 3), (2, 2, 3), (2, 3, 2), (3, 2, 2)} of Kn,n,n is in B-canonical form with respect
to the dual-slack integral basis described in the proof of Lemma 3.1 ibid. Hence
equivalence (2) holds for this class of inequalities.
Further examples of integral B-canonical forms for which Proposition 4 ap-
plies will be presented in the following sections, for integral polytopes related
to symmetric and asymmetric travelling salesman problems, respectively. Note,
however, that dual-slack integral bases need not exist (see, e.g., Example 3), so
Proposition 4 will not always apply.
Proof. Recall that, since graph Kn is connected and contains some odd cycles,
B is a basis of its incidence matrix C if and only if every connected component
of GB contains exactly one cycle and every such cycle is odd (see, e.g., [22]).
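The odd-unicyclic criterion quoted in this proof is easy to check computationally: each connected component must have exactly as many edges as nodes (one cycle) and be non-bipartite (that cycle is odd). A sketch under our own encoding of edges as node pairs (function name ours):

```python
from collections import deque

def is_incidence_basis(n, B):
    """Test whether edge set B (|B| = n) of K_n is a column basis of the
    node-edge incidence matrix via the quoted criterion: every connected
    component of (V, B) is unicyclic with an odd cycle, i.e., has as many
    edges as nodes and is non-bipartite."""
    if len(B) != n:
        return False
    adj = {v: [] for v in range(n)}
    for u, v in B:
        adj[u].append(v)
        adj[v].append(u)
    seen = {}
    for s in range(n):
        if s in seen:
            continue
        # BFS: 2-color the component, count its nodes and edge endpoints
        seen[s] = 0
        comp, ends, bipartite = [s], 0, True
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                ends += 1
                if w not in seen:
                    seen[w] = 1 - seen[u]
                    comp.append(w)
                    queue.append(w)
                elif seen[w] == seen[u]:
                    bipartite = False
        # ends counts each component edge twice
        if ends != 2 * len(comp) or bipartite:
            return False  # not unicyclic, or its unique cycle is even
    return True

print(is_incidence_basis(3, [(0, 1), (1, 2), (0, 2)]))          # triangle
print(is_incidence_basis(4, [(0, 1), (1, 2), (2, 3), (0, 3)]))  # even cycle
```

The triangle is an odd cycle (the incidence matrix has determinant ±2, hence nonsingular), while a 4-cycle is even and its incidence matrix is singular.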
On the Chvátal Rank of Certain Inequalities 227
Theorem 6 A clique tree inequality has Chvátal rank one, relative to the STS subtour polytope, if and only if it is a comb.
Proof. Theorem 3.2 in [6] states that a clique tree has Chvátal rank zero if and
only if it consists of a single tooth. Theorem 2.2 ibid. implies that the Chvátal
rank is one if the clique tree is a comb. So, we only need to prove that every
clique tree with at least two handles has Chvátal rank at least two. Let H denote
the union of all the handles, and let T1 and T2 be any two distinct teeth. There
exist two nodes t1 ∈ T1 \ H and t2 ∈ T2 \ H, and a node h ∈ V \ (T1 ∪ T2 ). Define
Thus, B forms a dual-slack integral basis, with odd cycle (t1, t2, h). The clique tree inequality ax ≤ a0, as defined in equation (2.1) ibid., is in integral B-canonical form. Hence, by Proposition 4, it is in reduced integral form. By Theorem 2.2 ibid., there exists x∗ ∈ P with ax∗ ≥ a0 + 1. The result then follows by applying Proposition 4 and Theorem 1. □
B = {(3, 4), (3, 5), (3, 6), (3, 7), (2, 4), (1, 5), (5, 6)} .
A solution x∗ in P with ax∗ = 9 is given on page 265, ibid. It then follows that
the envelope inequality for n = 7 is of Chvátal rank at least two, relative to the
subtour polytope, as stated in [4].
Such results may easily be extended to other classes of STS facet-inducing
inequalities. This is left to the interested reader.
We may first apply Proposition 7, in the spirit of the work in [24], to symmet-
ric ATS inequalities. A symmetric ATS inequality ax ≤ a0 is one that satisfies
aij = aji for all ij ∈ A. The corresponding STS inequality a0 y ≤ a0 is defined
by a0ij = aij for all i, j ∈ V (with i < j). Recall that a symmetric inequality is
valid for ATS polytope if and only if the corresponding STS inequality is valid
for the STS polytope.
Corollary 9 A clique tree inequality has Chvátal rank one, relative to the ATS subtour polytope, if and only if it is a comb.
respect to the STS subtour polytope is an upper bound on the Chvátal rank
of ax ≤ a0 with respect to the ATS subtour polytope (see Lemma 2.2.1 of
Hartmann [21]).
Proposition 7 may also apply to asymmetric ATS inequalities. We conclude
this paper by investigating an asymmetric lifting method due to Fischetti [16].
Let ax ≤ a0 be a valid inequality for Q and assume that:
(a) its components are nonnegative integers;
(b) there exist two distinct isolated nodes p and q, i.e., such that ahi = aih = 0
for all i ∈ V and h ∈ {p, q}; and
(c) there exists a node w ∈ V \ {p, q} such that
We now give a sufficient condition for the T -lifted inequality to have Chvátal
rank at least two, as was claimed for an example in [16]. We shall use the following notation in the rest of this paper. Let Ṽ := V \ {p, q}. We use tildes to denote any object associated with Ṽ. Thus Ã is the associated arc set; P̃ and Q̃ are the subtour polytope and ATS polytope, respectively, defined on Ṽ; ãx̃ ≤ a0 is the restriction to IR^Ã of ax ≤ a0; and δ̃(w) := δ̃+(w) ∪ δ̃−(w) is the set of all arcs in Ã incident with w.
Then Bw satisfies the conditions of Proposition 7 relative to the node set V \{w},
and therefore defines a basis of the corresponding STS polytope. Adding the two arcs wp ∈ δ+(w) and qw ∈ δ−(w) thus yields a (dual-slack integral) basis B := Bw ∪ {wp, qw} for Q, for which βx ≤ β0 is in integral B-canonical form. Define x∗ ∈ IR^A as follows:

    x∗e :=  x̃e         for e ∈ Ã \ δ̃(w);
            x̃e − ξe    for e ∈ δ̃(w);
            ξiw         for e = ip with i ∈ Ṽ \ {w};
            ξwi         for e = qi with i ∈ Ṽ \ {w};
            1/2         for e ∈ {pw, wq, pq}; and
            0           for e ∈ {wp, qw, qp}.
Since x∗ satisfies the degree constraints Cx∗ = d and

    βx∗ = ãx̃ + Σ_{i≠w} awi (x̃wi − ξwi) + Σ_{i≠w} aiw (x̃iw − ξiw) + 3/2 ≥ a0 + 2,
we only need to show that x∗ satisfies, for all nonempty S ⊂ V, the subtour elimination constraint x∗(δ+(S)) ≥ 1. First, it is straightforward to verify that these constraints are satisfied for all nonempty S ⊆ {w, p, q} and all S ⊇ Ṽ \ {w}. Next, for disjoint node sets S and T, let x(S : T) := Σ {xij : i ∈ S, j ∈ T} for all x ∈ IR^A. For any ∅ ≠ T ⊂ Ṽ \ {w}, let T̄ := Ṽ \ (T ∪ {w}), so V = T ∪ T̄ ∪ {w, p, q} and these three subsets are disjoint. We have

    1 ≤ x̃(T : T̄ ∪ {w}) = x∗(T : T̄ ∪ {w, p, q})
      < x∗(T ∪ {q} : T̄ ∪ {w, p}) ≤ x∗(T ∪ {p, q} : T̄ ∪ {w})
Proposition 11 A T -lifted clique tree inequality (5) has Chvátal rank at least 2,
relative to the ATS subtour polytope, if it is facet-inducing for the ATS polytope
and either
Proof. By Theorem 10, we only need to construct the required x̃ and ξ. For case (a), recall that, if a clique tree has at least one handle, then there exists ỹ in the STS subtour polytope such that a′ỹ − a0 ≥ 1/2. Further, from step (6) in the proof of Theorem 2.2 in [6] (p. 170), for every w as described in (a), we can construct such a ỹ with ỹwk = ỹwl = 1/2 for two edges wk and wl with a′wk = a′wl = 0. Then x̃ defined by x̃ij = x̃ji = ỹij for all ij ∈ E, and ξ defined by ξij = 1/4 if ij ∈ {wk, kw, wl, lw}, and 0 otherwise, satisfy the conditions of Theorem 10. The proof is complete for case (a). Now consider case (b). The solution ỹ constructed in the proof of Theorem 2.2 in [6] satisfies a′ỹ − a0 ≥ 3/2. If w ∈ T \ H where T is a pendent tooth which intersects exactly two handles, then ỹe = 1 for the two edges wk and wl with k, l ∈ T, and 0 for all other edges (see step (3) in the proof of Theorem 2.2 ibid.). With x̃ as defined above, let the only nonzero components of ξ be ξwk = ξkw = 1/2. For every choice of w other than an isolated node, one as in (a), or one as just discussed, we can apply the construction in the proof of Theorem 2.2 in [6] so that the solution ỹ satisfies ỹwk = ỹwl = 1/2 for two nodes k, l in the same handle as w, or in the same nonpendent tooth T (intersecting at least three handles) as w. With x̃ as defined above, let the only nonzero components of ξ be ξwk = ξkw = ξwl = ξlw = 1/4. Finally, note that aij = a′ij = 1 for all arcs ij ∈ A such that ξij > 0, so ξ satisfies all the requisite conditions. The proof is complete. □
References
1. E. Balas, S. Ceria and G. Cornuéjols (1993), “A lift-and-project cutting plane
algorithm for mixed 0-1 programs,” Mathematical Programming 58, 295–324.
2. E. Balas and M.J. Saltzman (1989), “Facets of the three-index assignment poly-
tope,” Discrete Applied Mathematics 23, 201–229.
3. A. Bockmayr, F. Eisenbrand, M. Hartmann and A.S. Schulz (to appear), “On
the Chvátal rank of polytopes in the 0/1 cube,” to appear in Discrete Applied
Mathematics.
4. S.C. Boyd and W.H. Cunningham (1991), “Small travelling salesman polytopes,”
Mathematics of Operations Research 16, 259–271.
5. S.C. Boyd, W.H. Cunningham, M. Queyranne and Y. Wang (1995), “Ladders for
travelling salesmen,” SIAM Journal on Optimization 5, 408–420.
6. S.C. Boyd and W.R. Pulleyblank (1991), “Optimizing over the subtour polytope
of the travelling salesman problem,” Mathematical Programming 49, 163–187.
7. A. Caprara and M. Fischetti (1996), “0-1/2 Chvátal-Gomory cuts,” Mathematical
Programming 74, 221–236.
8. A. Caprara, M. Fischetti and A.N. Letchford (1997), “On the separation of max-
imally violated mod-k cuts,” DEIS-OR Technical Report OR-97-8, Dipartimento
di Elettronica, Informatica e Sistemistica, Università degli Studi di Bologna, Italy,
May 1997.
9. S. Ceria (1993), “Lift-and-project methods for mixed 0-1 programs,” Ph.D. dis-
sertation, Carnegie-Mellon University.
10. V. Chvátal (1973), “Edmonds polytopes and a hierarchy of combinatorial prob-
lems,” Discrete Mathematics 4, 305–337.
11. V. Chvátal, W. Cook and M. Hartmann (1989), “On cutting-plane proofs in
combinatorial optimization,” Linear Algebra and Its Applications 114/115, 455–
499.
12. P.D. Domich, R. Kannan and L.E. Trotter, Jr. (1987), “Hermite normal form
computation using modulo determinant arithmetic,” Mathematics of Operations
Research 12, 50–59.
13. J. Edmonds, L. Lovász and W.R. Pulleyblank (1982), “Brick decompositions and
the matching rank of graphs,” Combinatorica 2, 247–274.
14. F. Eisenbrand (1998), “A note on the membership problem for the elementary
closure of a polyhedron,” Preprint TU-Berlin No. 605/1998, Fachbereich Mathe-
matik, Technische Universität Berlin, Germany, November 1998.
15. F. Eisenbrand and A.S. Schulz (1999), “Bounds on the Chvátal rank of polytopes in the 0/1-cube,” this volume.
16. M. Fischetti (1992), “Three facet-lifting theorems for the asymmetric traveling
salesman polytope”, pp. 260–273 in E. Balas, G. Cornuéjols and R. Kannan, eds,
Integer Programming and Combinatorial Optimization (Proceedings of the IPCO2
Conference), GSIA, Carnegie Mellon University.
17. M. Fischetti (1995), “Clique tree inequalities define facets of the asymmetric
travelling salesman polytope,” Discrete Applied Mathematics 56, 9–18.
18. R. Giles and L.E. Trotter, Jr. (1981), “On stable set polyhedra for K1,3 -free
graphs,” Journal of Combinatorial Theory Ser. B 31, 313–326.
19. M. Grötschel and M.W. Padberg (1985), “Polyhedral theory,” pp. 251–305 in:
E.L. Lawler et al., eds., The Travelling Salesman Problem, Wiley, New York.
20. M. Grötschel and W.R. Pulleyblank (1986), “Clique tree inequalities and the
symmetric travelling salesman problem,” Mathematics of Operations Research 11,
537–569.
21. M. Hartmann (1988), “Cutting planes and the complexity of the integer hull,”
Ph.D. thesis, Cornell University.
22. E.L. Johnson (1965), “Programming in networks and graphs”, Research Report
ORC 65-1, Operations Research Center, University of California–Berkeley.
23. G.L. Nemhauser and L.A. Wolsey (1988), Integer and Combinatorial Optimiza-
tion, Wiley, New York.
24. M. Queyranne and Y. Wang (1995), “Symmetric inequalities and their composi-
tion for asymmetric travelling salesman polytopes,” Mathematics of Operations
Research 20, 838–863.
25. A. Schrijver (1980), “On cutting planes,” in: M. Deza and I.G. Rosenberg, eds.,
Combinatorics 79 Part II, Ann. Discrete Math. 9, 291–296.
26. A. Schrijver (1986), Theory of Linear and Integer Programming, Wiley, Chich-
ester.
The Square-Free 2-Factor Problem in Bipartite
Graphs
David Hartvigsen
1 Introduction
A 2-factor in a simple undirected graph is a spanning subgraph whose compo-
nents are all simple cycles (i.e., the degrees of all nodes in the subgraph are 2). A
k-restricted 2-factor, for an integer k ≥ 2, is a 2-factor with no cycles of length
k or less. Let us denote by Pk the problem of finding a k-restricted 2-factor in
a graph. Thus P2 is the problem of finding a 2-factor in a graph and Pk , for
n/2 ≤ k ≤ n − 1 (where n is the number of nodes in the graph), is the problem
of finding a Hamilton cycle in a graph.
The basic problem P2 has been extensively studied. A polynomial
algorithm exists due to a reduction (see Tutte [11]) to the classical matching
problem (see Edmonds [6]). Also well known is a “good” characterization of those
graphs that have a 2-factor (due to Tutte [10]) and a polyhedral characterization
of the convex hull of incidence vectors of 2-factors (which can also be obtained
using Tutte’s reduction and Edmonds’ polyhedral result for matchings [5]). For
a more complete history of this problem see Schrijver [8].
Of course a polynomial time algorithm for Pk , for n/2 ≤ k ≤ n−1, is unlikely
since the problem of deciding if a graph has a Hamilton tour is NP-complete. In
fact, Papadimitriou (see [2]) showed that deciding if a graph has a k-restricted
2-factor for k ≥ 5 is NP-complete. A polynomial time algorithm for P3 was given
in [7]. This leaves P4, whose status is open.
In this note we consider the problem P4 on bipartite graphs, which we call
the bipartite square-free 2-factor problem. (The idea of studying this problem
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 234–241, 1999.
© Springer-Verlag Berlin Heidelberg 1999
was suggested to the author by Cunningham and Geelen [3].) Observe that
for bipartite graphs the problems Pk of interest are for k ≥ 2 and even. We
introduce here a polynomial time algorithm for the bipartite square-free 2-factor
problem. The algorithm uses techniques similar to those used for the problem
P3 on general graphs, but the resulting algorithm is simpler. We also present in
this paper, for the bipartite square-free 2-factor problem, three theorems that
are the appropriate versions of the following classical results for matchings in
graphs:
Our three theorems follow from the validity of our algorithm. We present our
Tutte-type theorem in Section 2. We introduce a slightly more general version
of the k-restricted 2-factor problem in Section 3, in terms of which we state
our remaining results. In particular, we present our “augmenting path” theorem
in Section 4 and our polyhedral theorem in Section 5. We introduce some of
the ideas of the algorithm in Section 6. We contrast all of our results with the
comparable results for the problem P2 on bipartite graphs.
Before ending this section let us briefly mention two related classes of prob-
lems. First, consider the problems Pk , for k ≥ 6, on bipartite graphs. The status
of these problems is open. Second, consider the “weighted” versions of the prob-
lems Pk . That is, given a graph with nonnegative weights, let weighted-Pk be the
problem of finding a k-restricted 2-factor such that the sum of the weights of its
edges is a maximum. Vornberger [12] showed that weighted-P4 is NP-hard. The
status of weighted-P3 remains open. Cunningham and Wang [4] have studied
the closely related polytopes given by the convex hull of incidence vectors of
k-restricted 2-factors.
2 Tutte-Type Theorems
In this section we present a theorem that characterizes those bipartite graphs
that have a 2-factor and a theorem that characterizes those bipartite graphs that
have a square-free 2-factor. The first theorem is essentially a bipartite version of
Tutte’s 2-factor theorem (see [10]). The necessity of the condition is straightfor-
ward to establish. The sufficiency follows from a max-min theorem of Edmonds
(see [5]). The sufficiency also follows from the proof of the validity of the first
algorithm we present in Section 6.
For a graph H = (V, E), let A(H) denote the number of nodes of H that
have degree 0; let B(H) denote the number of nodes of H that have degree 1.
For S ⊆ V , we let H[S] denote the subgraph of H induced by S.
Theorem 1. A bipartite graph G = (V, E) has a 2-factor if and only if for every
subset V′ ⊆ V, 2|V′| ≥ 2A(G[V \ V′]) + B(G[V \ V′]).
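For small graphs, the condition of Theorem 1 can be verified exhaustively. The sketch below (exponential in |V|, so illustrative only; the graph encoding and function names are ours) confirms that K3,3, which has a 2-factor, passes the test, while the star K1,3, which has none, fails it:

```python
from itertools import combinations

def rhs(nodes, edges, Vp):
    """Right-hand side 2*A(H) + B(H) of Theorem 1 for H = G[V \\ V']."""
    H = [v for v in nodes if v not in Vp]
    deg = {v: 0 for v in H}
    for u, v in edges:
        if u in deg and v in deg:
            deg[u] += 1
            deg[v] += 1
    A = sum(1 for v in H if deg[v] == 0)  # isolated nodes of H
    B = sum(1 for v in H if deg[v] == 1)  # degree-1 nodes of H
    return 2 * A + B

def has_2factor_condition(nodes, edges):
    """Check Theorem 1's condition over all subsets V' (exponential;
    small illustrative graphs only)."""
    nodes = list(nodes)
    return all(2 * k >= rhs(nodes, edges, set(Vp))
               for k in range(len(nodes) + 1)
               for Vp in combinations(nodes, k))

k33 = [(u, w) for u in (0, 1, 2) for w in (3, 4, 5)]
print(has_2factor_condition(range(6), k33))                       # True
print(has_2factor_condition(range(4), [(0, 1), (0, 2), (0, 3)]))  # False
```

For the star, already V′ = ∅ fails: the three leaves give B = 3 > 0 = 2|V′|.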
With a few more definitions we can present a variant of this result that holds
for square-free 2-factors. Again, let H = (V, E) be a graph. We refer to a cycle in
H of length 4 as a square. Let C(H) denote the number of isolated squares of H
(i.e., those squares all of whose nodes have degree 2 in H); and let D(H) denote
the number of squares of H for which two nonadjacent nodes have degree 2 (in
H) but at least one of the other two nodes has degree > 2 (in H). The necessity
of the condition in the following theorem is straightforward to establish. The
sufficiency appears to be considerably more difficult; it follows from the validity
of the second algorithm we introduce in Section 6.
Theorem 2. A bipartite graph G = (V, E) has a square-free 2-factor if and only
if for every subset V′ ⊆ V,
2|V′| ≥ 2A(G[V \ V′]) + B(G[V \ V′]) + 2C(G[V \ V′]) + D(G[V \ V′]).
Fig. 1. (Figure not reproduced: a bipartite graph on nodes a, b, c, d, e, f, g, h, with cycles C1 and C2 marked.)
5 Polyhedral Descriptions
In this section we present polyhedral results for our two problems. Let G = (V, E)
be a graph. For v ∈ V , let δ(v) denote the edges of G incident with v and let
x ∈ IR^E. For a simple 2-matching M, x is called an incidence vector for M if
xe = 1 when e ∈ M, and xe = 0 when e ∉ M. For S ⊆ E, we let x(S) denote
Σ_{e∈S} xe.
Consider the following linear program (P ).
max x(E)
s.t. x(δ(v)) ≤ 2 for all v ∈ V
0 ≤ xe ≤ 1 for all e ∈ E
Observe that integral solutions to (P ) are incidence vectors for simple 2-
matchings and hence an optimal integral solution is a maximum cardinality
simple 2-matching. When the graph is bipartite, it is well known that the constraint matrix for (P) is totally unimodular; hence all the extreme points of the
polyhedron are incidence vectors for simple 2-matchings. In particular, we have
the following result.
Theorem 5. For any bipartite graph G, (P ) has an integer optimal solution
(which is an incidence vector of a maximum cardinality simple 2-matching).
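On tiny instances, total unimodularity can be confirmed by brute force: every square submatrix must have determinant −1, 0, or 1. An illustrative sketch (exponential, not how TU is established in practice; the K2,3 encoding is ours):

```python
from itertools import combinations

def det(M):
    # Laplace expansion along the first row; fine for the tiny
    # submatrices enumerated here
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        if entry:
            minor = [row[:j] + row[j + 1:] for row in M[1:]]
            total += (-1) ** j * entry * det(minor)
    return total

def totally_unimodular(M):
    """Brute-force TU test: every square submatrix has determinant
    in {-1, 0, 1}."""
    m, n = len(M), len(M[0])
    for k in range(2, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                if det([[M[r][c] for c in cols] for r in rows]) not in (-1, 0, 1):
                    return False
    return all(v in (-1, 0, 1) for row in M for v in row)

# node-edge incidence matrix of K2,3 (rows: nodes 0..4, cols: edges uw)
edges = [(u, w) for u in (0, 1) for w in (2, 3, 4)]
K23 = [[1 if v in e else 0 for e in edges] for v in range(5)]
triangle = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]
print(totally_unimodular(K23))       # True
print(totally_unimodular(triangle))  # False
```

The triangle's incidence matrix has determinant −2, which is exactly why the odd-cycle case fails for non-bipartite graphs.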
Next, consider the following linear program (P 0 ).
max x(E)
s.t. x(δ(v)) ≤ 2 for all v ∈ V
x(S) ≤ 3 for all edge sets S of squares
0 ≤ xe ≤ 1 for all e ∈ E
This theorem can be proved from the structure available at the end of the
algorithm. In particular, we can produce an integral feasible primal solution and
a half-integral feasible dual solution that satisfy complementary slackness.
6 The Algorithms
We begin this section with a complete statement of the algorithm for finding
a maximum cardinality simple 2-matching in a bipartite graph. This algorithm
can be obtained using Tutte’s reduction [11] to the matching problem and then
applying Edmonds’ matching algorithm [6]. We present this algorithm in the
same form that is used for the square-free algorithm so that we may discuss
some of the features of the square-free algorithm later in this section.
In the course of the algorithm for finding a maximum cardinality simple
2-matching in a bipartite graph, we construct a subgraph S of G called an
alternating structure. We next describe some properties of this structure that
are maintained throughout the algorithm.
The nodes of S are either odd or even and have the following properties.
– Odd nodes are incident with two edges of M (zero, one, or two of these can
be in S); the other endnodes of these edges are even nodes.
– Even nodes may be saturated or deficient.
– If an even node is saturated (i.e., it is incident with two edges of M), then exactly one of these two edges of M must be in S and the other endnode of this edge must be odd.
S is partitioned into the union of node disjoint trees. These trees have the
following properties:
Consider an arbitrary path in such a tree that contains the tree’s root as an
endnode. Then this path has the following properties:
Algorithm :
Input: Bipartite graph G.
Output: A maximum cardinality simple 2-matching of G.
Step 0: Let M be any simple 2-matching of G.
Step 1: If M is a 2-factor, done. Otherwise, initialize S with the deficient nodes of G; these nodes are the roots of S and are even.
Step 2: If every edge of G that is incident with an even node is either an
edge of M or has an odd node as its other endnode, then M has maximum
cardinality; done. Otherwise, select an edge j ∉ M joining an even node v
to a node w that is not an odd node.
Case 1: w ∉ S. Go to Step 3.
Case 2: w is an even node. Go to Step 4.
Step 3: [Grow the structure] w ∉ S.
w must be incident with two edges in M: wu1 and wu2 (otherwise w is
deficient, hence a root in S). Neither u1 nor u2 is odd, since then w would
be even. Thus each ui is either even or not in S. Make w odd and add it and the
edge vw to the tree that contains v. If ui ∉ S, then make it even and add it
and the edge wui to the tree that contains v. If ui is even, do not change its
status.
Step 4: [Augment the matching] w is an even node.
Observe that v and w must be in different trees (otherwise there exists an
odd cycle). Consider the path consisting of the path from v to the root of
its tree, the path from w to the root of its tree, plus the edge vw. The edges
along this path are alternately in and not in M, with the end edges not in M
and the endnodes deficient (i.e., it is an augmenting path). Interchange
edges in and out of the matching along this path. Go to Step 1.
End .
The validity of this algorithm can be proved using arguments similar to those
used by Edmonds [6] to prove the validity of his matching algorithm.
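A brute-force oracle is useful for checking the algorithm's output on small instances. The following sketch (exponential; our own encoding, not the polynomial algorithm above) searches for a largest edge set in which every node has degree at most two:

```python
from itertools import combinations

def max_simple_2matching(nodes, edges):
    """Brute-force maximum-cardinality simple 2-matching: a largest edge
    subset in which every node meets at most two chosen edges.
    Exponential -- only a reference oracle for small test instances."""
    for k in range(len(edges), -1, -1):
        for sub in combinations(edges, k):
            deg = {v: 0 for v in nodes}
            for u, v in sub:
                deg[u] += 1
                deg[v] += 1
            if all(d <= 2 for d in deg.values()):
                return list(sub)
    return []

# K2,3: left nodes 0, 1 and right nodes 2, 3, 4
k23 = [(u, w) for u in (0, 1) for w in (2, 3, 4)]
print(len(max_simple_2matching(range(5), k23)))  # 4
```

In K2,3 the two left nodes cap the total degree at four, so four edges is optimal; on an even cycle, the oracle returns all of its edges.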
We next introduce some of the key ideas of our square-free version of the
above algorithm by considering an example. To begin, consider an application
of the above algorithm to the graph in Figure 1. Let us begin in Step 0 with
the matching M illustrated: bold lines are edges in M ; non-bold lines are edges
not in M . Observe that nodes a and f are the only deficient nodes, hence the
algorithm starts in Step 1 by making these two nodes into even nodes and the
roots of two trees in S. The algorithm may next select the edge ab in Step 2.
It would then go to Step 3 and make b an odd node, and c and g even nodes.
It would add the edges ab, bc, and bg to the tree containing a. The algorithm
may next select the edge cd in Step 2. As above it would then make d an odd
node, and e and h even nodes. It would add the edges cd, de, and dh to the tree
containing a. The algorithm may next select the edge ef in Step 2. This then
calls for augmenting the matching in Step 4. However, if we were to do so, we
would create a square in the matching. The way in which this problem is dealt
with in the square-free version of the algorithm is as follows: When we consider
edge cd, we make only h an even node. Thus we only look to grow the tree from
References
1. Berge, C., Two theorems in graph theory. Proceedings of the National Academy of
Sciences (U.S.A.) 43 (1957) 842-844.
2. Cornuéjols, G. and W.R. Pulleyblank. A matching problem with side conditions.
Disc. Math. 29 (1980) 135-159.
3. Cunningham, W.H. and J. Geelen. Personal communication (1997).
4. Cunningham, W.H. and Y. Wang. Restricted 2-factor polytopes. Working paper,
Dept. of Comb. and Opt., University of Waterloo (Feb. 1997).
5. Edmonds, J., Maximum matching and a polyhedron with 0,1 vertices. J. Res. Nat.
Bur. Standards Sect. B 69 (1965) 73-77.
6. Edmonds, J., Paths, trees, and flowers. Canad. J. Math. 17 (1965) 449-467.
7. Hartvigsen, D. Extensions of Matching Theory. Ph.D. Thesis, Carnegie-Mellon
University (1984).
8. Schrijver, A., Min-max results in combinatorial optimization. in Mathematical Pro-
gramming, the State of the Art: Bonn, 1982, Eds.: A. Bachem, M. Grötschel, and
B. Korte, Springer-Verlag, Berlin (1983) 439-500.
9. Tutte, W.T., The factorization of linear graphs. J. London Math. Soc. 22 (1947)
107-111.
10. Tutte, W.T., The factors of graphs. Canad. J. Math. 4 (1952) 314-328.
11. Tutte, W.T., A short proof of the factor theorem for finite graphs. Canad. J. Math.
6 (1954) 347-352.
12. Vornberger, O., Easy and hard cycle covers. Preprint, Universität Paderborn
(1980).
The m-Cost ATSP
Christoph Helmberg
Konrad-Zuse-Zentrum Berlin
Takustraße 7, 14195 Berlin, Germany
[email protected], http://www.zib.de/helmberg
1 Introduction
For Herlitz PBS AG, Berlin, we are developing a software package for the fol-
lowing scheduling problem. Gift wrap has to be printed on two non-identical
printing machines. The gift wrap is printed in up to six colors on various kinds
of materials in various widths. The colors may differ considerably from one gift
wrap to the next. The machines differ in speed and capabilities, not all jobs can
be processed on both machines. Setup times for the machines are quite large in
comparison to printing time. They depend strongly on whether material, width,
or colors have to be changed, so in particular on the sequence of the jobs. The
task is to find an assignment of the jobs to the machines and an ordering on
these machines such that the last job is finished as soon as possible, i.e., minimize the makespan for a scheduling problem on two (non-identical) machines
with sequence-dependent setup times.
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 242–258, 1999.
© Springer-Verlag Berlin Heidelberg 1999
Let D = (V, A) be a digraph on node set V and arc set A without multiple arcs
but possibly with loops. An arc is an ordered pair (v, w) of nodes v, w ∈ V ,
v is called tail, w head of (v, w). Most of the time we will simply write vw
instead of (v, w). For a subset S ⊆ V , δ − (S) = {vw ∈ A : v ∈ V \ S, w ∈ S} is
the set of arcs of A that have their head in S and their tail in V \ S. Similarly,
δ + (S) = {vw ∈ A : v ∈ S, w ∈ V \ S} is the set of arcs of A having tail in S and
head in V \ S. By definition, loops are not contained in any set δ − (S) or δ + (S).
For a node v ∈ V we will write δ−(v) instead of δ−({v}). For an arc set Â ⊂ A,
V(Â) = {v ∈ V : vw ∈ Â or wv ∈ Â} is the set of nodes that are tail or head of
an arc in Â. For a node set S ⊂ V, A(S) = {vw ∈ A : v ∈ S and w ∈ S} is the
set of arcs having both head and tail in S. For a vector x ∈ Q^{|A|}, the sum of the
weights over an arc set Â ⊂ A is denoted by x(Â) = Σ_{a∈Â} xa.
A dicycle is a set of k > 0 arcs C = {v1 v2 , v2 v3 , . . . , vk v1 } ⊆ A with distinct
vertices v1 , . . . , vk ∈ V . The number of arcs k is referred to as the length of the
dicycle. A tour is a dicycle of length |V |. A loop is a dicycle of length 1.
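These cut-set definitions translate directly into code. A minimal sketch (arc sets as collections of ordered node pairs; function names are ours):

```python
def delta_plus(A, S):
    """Arcs leaving node set S (tail in S, head outside); loops are
    excluded automatically since a loop has both endpoints in or out of S."""
    return {(v, w) for (v, w) in A if v in S and w not in S}

def delta_minus(A, S):
    """Arcs entering node set S (head in S, tail outside)."""
    return {(v, w) for (v, w) in A if w in S and v not in S}

def weight(x, arcs):
    """x(Â): sum of the weights x_a over an arc set."""
    return sum(x[a] for a in arcs)

# complete digraph on {0, 1, 2} plus the loop (0, 0)
V = (0, 1, 2)
A = [(0, 0)] + [(v, w) for v in V for w in V if v != w]
print(sorted(delta_plus(A, {0})))   # [(0, 1), (0, 2)]
print(sorted(delta_minus(A, {0})))  # [(1, 0), (2, 0)]
```

Note that the loop (0, 0) appears in neither cut set, matching the convention above.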
Now consider the problem of scheduling n jobs on m distinct machines. Let
J = {1, . . . , n} denote the set of jobs and M = {1, . . . , m} the set of machines.
In order to model all possible schedules we introduce for each machine k ∈ M
a digraph D′k = (Vk, Ak) with Vk = J ∪ {0} and arc set Ak = {(0, 0)k} ∪
{(i, j)k : i, j ∈ Vk, i ≠ j} (node 0 allows us to model the initial and final state of each
machine as well as idle costs). We will write ijk instead of (i, j)k. Usually we
will regard the nodes in Vk and Vi as the same objects. Sometimes, however, we
will have to distinguish between nodes in Vk and Vi; then we will do so by adding a
subscript, e.g., 0k for 0 ∈ Vk . For S ⊆ Vk the subscript in the symbols δk− (S) and
δk+ (S) will indicate on which arc set we are working.
We will frequently need the union of all arc sets, which we denote by A = ∪_{k=1}^{m} Ak. With each arc ijk ∈ A
we associate a cost or weight cijk.
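One way to materialize the machine digraphs is to tag every arc with its machine index, as in the following sketch (function names are ours):

```python
def machine_arcs(n, k):
    """Arc set A_k of the machine digraph D'_k: the idle loop (0,0)_k plus
    all ordered pairs of distinct nodes of V_k = {0, 1, ..., n} (depot node
    0 and job nodes 1..n); arcs carry the machine index k as a tag."""
    V = range(n + 1)
    return [(0, 0, k)] + [(i, j, k) for i in V for j in V if i != j]

def all_arcs(n, m):
    """The union A of all machine arc sets for m machines."""
    return [a for k in range(1, m + 1) for a in machine_arcs(n, k)]

print(len(machine_arcs(3, 1)))  # 13 arcs: 1 loop + 4*3 ordered pairs
print(len(all_arcs(3, 2)))      # 26
```

Each |Ak| = 1 + (n+1)n, so |A| = m(1 + (n+1)n).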
A feasible solution AF = C1 ∪ . . . ∪ Cm is a set of m + n arcs that is the union
of m dicycles Ck ⊆ Ak for k ∈ M , such that 0 ∈ Vk (Ck ) for all k ∈ M and for
each v ∈ J there is a k ∈ M with v ∈ Vk (Ck ). The set of feasible solutions is
Pm
Fnm = C1 ∪ . . . ∪ Cm : k=1 |Ck | = n + m,
∀k ∈ M : Ck ⊆ Ak is a dicyclewith 0 ∈ Vk (Ck ),
∀v ∈ J : ∃k ∈ M : v ∈ Vk (Ck ) .
So Pnm is the convex hull of the feasible solutions of (1)–(5). In order to deter-
mine the dimension of Pnm we investigate equations (1)–(3) which we collect, for
convenience, in a matrix B and a right hand side vector b,
Bx = b.
Lemma 2. The coefficient matrix B ∈ {−1, 0, 1}^{(m+n+mn)×|A|} corresponding
to (1)–(3) has full row rank m + n + mn.
To complement this upper bound on the dimension of Pnm with a lower bound
we will need a standard construction of linearly independent tours for the ATSP.
Proof. Follows directly from [10], proof 1 of Theorem 7 and Theorem 20. □
Proof. It follows from Lemma 2 that dim(Pnm) ≤ mn^2 − n. To prove equality, let
aTx = a0 be a valid equation for Pnm. We will show that it is linearly dependent
with respect to Bx = b. By the proof of Lemma 2, the columns of B corresponding
to the arcs in F = ∪_{k∈M} Fk are linearly independent, so we can compute λ ∈
Q^{n+m+nm} such that aijk = (λTB)ijk for ijk ∈ F. The equation dTx = (aT −
λT B)x = a0 − λT b is again valid for Pnm and dijk = 0 for ijk ∈ F . We will show
that all other coefficients dijk have to be zero as well.
First observe that on the support A1 \ {001}, d has exactly n(n − 1) − 1
‘unknowns’. Employing the n(n − 1) linearly independent tours of Lemma 1,
construct linearly independent feasible solutions Ah ∈ Fnm for h = 1, . . . , n(n−1)
that use these tours on machine 1, such that the x^{Ah} are linearly independent on
the index set {ij1 : i ≠ j, i, j ∈ J}. There exists an ĥ, 1 ≤ ĥ ≤ n(n−1), such that
the n(n − 1) − 1 vectors x^{Ah} − x^{Aĥ} (h ≠ ĥ) are linearly independent on the index
set {ij1 : i ≠ j, i, j ∈ J} \ {121}. These difference vectors have no support outside
A1 \ {001} and satisfy dT(x^{Ah} − x^{Aĥ}) = 0. Collect the n(n − 1) − 1 difference
vectors as columns in a matrix X. Since dTX = 0 and dij1 = 0 for ij1 ∈ F1, we
conclude that dij1 = 0 for all ij1 ∈ A1 \ {001} by the linear independence of the
columns of X.
We proceed to show that dij2 = 0 for all ij2 ∈ A2 . The following two arc sets
A1 , A2 ∈ Fnm correspond to a tour on machine 1 and to the same tour shortened
by ‘moving’ node n to machine 2,
Consider the subproblem on one machine where this machine is not required
to process all jobs, but may process any subset of jobs, or even choose the idle
loop. We model this problem on a digraph Ds = (Vs , As ) with Vs = {0} ∪ J and
As = {(0, 0)} ∪ {(i, j) : i, j ∈ Vs, i ≠ j}. The feasible set is
The corresponding polytope defined by the incidence vectors of the arc sets in
the feasible set is
Pns = conv{x^A : A ∈ Fns}.
We can model the incidence vectors by the following constraints,
Proof. The if part is easy to check, so consider the only if part. Let x be a
solution of (6)–(9). Then, by the flow conservation constraints (7), x is the
incidence vector of a union of r dicycles C1, . . . , Cr. Because of (6) exactly one
of these dicycles, w.l.o.g. C1, satisfies 0 ∈ V(C1). Furthermore, for all Ci we must
have 0 ∈ V(Ci), because otherwise S = V(Ci) would lead to a violated v0-cut
inequality (8) with arbitrary v ∈ S. Thus r = 1 and the cycle C1 fulfills the
requirements for Fns. ⊓⊔
Pns is therefore the convex hull of the feasible solutions of (6)–(9). We determine
the dimension of Pns by the same steps as in Section 3. We collect equations (6)
and (7) in a matrix Bs and a right hand side vector bs , Bs x = bs .
Lemma 4. The coefficient matrix Bs ∈ {−1, 0, 1}^{(n+1)×|As|} corresponding to
(6) and (7) has full row rank n + 1.
Fs = {00} ∪ {0i : i ∈ J}
corresponds to the sets Fk for k > 1 of the proof of Lemma 2. The proof can be
completed analogously to the proof there. ⊓⊔
Theorem 3. dim(Pns ) = n2 .
The m-Cost ATSP 249
C00 = {00}
C0i = {0i, i0}          i ∈ J \ {1}
Cij = {0i, ij, j0}      i ∈ J \ {1}, j ∈ J, i ≠ j
C1j = {0i, i1, 1j, j0}  j ∈ J \ {1}, with some i ∈ J \ {1, j}
are again easily seen to satisfy x01 = 0 and to be linearly independent (proceed
as in the proof of Theorem 3). A similar construction shows that xv0 is facet
defining for v ∈ J. ⊓⊔
The following corollary will be especially useful in relating Pns with Pnm .
Corollary 1. Any facet defining inequality aT x ≤ a0 of Pns not equivalent to
x00 ≥ 0 satisfies aT x^{C00} = a0 where C00 = {00}.
Proof. Let aT x ≤ a0 be facet defining with C00 ∉ Fa = {A ∈ Fns : aT x^A = a0}.
Since A = C00 is the only set A ∈ Fns satisfying x^A_{00} = 1, it follows that
Fa ⊂ {A ∈ Fns : x^A_{00} = 0}, so aT x ≤ a0 is equivalent to the facet x00 ≥ 0. ⊓⊔
The polyhedron Pns (n) corresponding to the convex hull of the points satisfying
constraints (6)–(9) and (10) is not exactly P0 of [2], but the only differences are
that now the “long” cycle is required to pass through node 0 and that this long
cycle may also be of length 1 as opposed to at least two. This close relationship
allows us to copy the lifting Theorem 3.1 of [2] more or less verbatim. Denote by
Pns (k) the polytope obtained from the ATSP polytope P (with respect to the
complete digraph on nodes J ∪ {0}) by introducing the first k loop variables xjj
for j = 0, . . . , k.
Theorem 7. Let aT x ≤ a0 be any facet defining inequality for the ATSP poly-
tope P with respect to the complete digraph on Vs. For k = 0, . . . , n define
bjk = a0 − z(Pns(k)), where

z(Pns(k)) = max{ aT x + Σ_{i=0}^{k−1} bji xjiji : x ∈ Pns(k), xjkjk = 1 }.

Then the inequality aT x + Σ_{i=0}^{k−1} bji xjiji ≤ a0 is valid and facet defining for
Pns(k).
Proof. As in [2], but with P0(k) replaced with Pns(k). ⊓⊔
After having determined the lifting coefficients we eliminate variables xjj for
j ∈ J by using (10) in order to obtain the lifted inequality for Pns .
For example, the v0-cut inequalities can be seen to be liftings of the subtour
elimination constraints for 0 ∉ S ⊂ Vs, |S| ≥ 2, with respect to the sequence
j1, . . . , j|S| = v ∈ S and afterwards for j ∈ Vs \ S. Here, we need the direct proof
of Theorem 5 in order to arrive at Theorem 11.
It is not yet clear for which other classes of facet defining inequalities of the
ATSP polytope it is possible to determine the lifting coefficients explicitly as in
[2], because the special role of node 0 does not allow these results to be used
without some careful checking.
For the following class of facet defining inequalities we do not know whether
there exists a corresponding class in the ATSP or PCTSP literature.
Theorem 8. Let S0 ⊂ S1 ⊂ . . . ⊂ Sk = J be a nested sequence of proper subsets
with k ≥ 2 and let t0 ∈ S0 , ti ∈ Si \ Si−1 for i = 1, . . . , k. For
Ac = ∪_{i=1,...,k} δ+(Si−1) \ {vti : v ∈ Si−1}
Theorem 10. For m ≥ 3 and n ≥ 3 all facet defining inequalities âT x ≤ â0
of Pns give rise to facet defining inequalities aT x ≤ a0 for Pnm by identifying As
with Ah for some h ∈ M , i.e., by setting aijh = âij for ij ∈ As , aijk = 0 for
ijk ∈ A \ Ah , and a0 = â0 .
We do not know whether the theorem also holds for m = 2. The main
difficulty is that one has to construct an additional solution that exploits the
variable 001. We can do this for the v0-cut and the nested conflict inequalities.
Theorem 11. The v0-cut inequalities (4) are facet defining for Pnm for m ≥ 2
and n ≥ 3.
Theorem 12. For x ∈ Q^{|A|}, x ≥ 0 satisfying (3), the v0-cut inequalities (4)
can be separated in polynomial time.
Theorem 13. The nested conflict inequalities (11) are facet defining for Pnm
for m ≥ 2 and n ≥ 3.
254 Christoph Helmberg
On the support A \ A1 \ {012, . . . , 0n2, t0 02} the incidence vector (with respect
to A) of this tour is a linear combination of incidence vectors (with respect to
A) of dicycles from 3, 10, 14, and the dicycle 0, tk , t0 , 0 of 15 as specified in the
proof of Theorem 8. None of these dicycles are tours on machine 2, so 001 is not
in the support of any of these incidence vectors. However, 001 is in the support
of the new tour. Thus, by the linear independence of the dicycles of the proof
of Theorem 8, its incidence vector is linearly independent with respect to the
n(n−1) tours on machine 1 of the proof of Theorem 2 and the n2 −1 dicycles (the
idle dicycle is covered by the tours on machine 1) of the proof of Theorem 8. ⊓⊔
B = {0k i, i0k : k ∈ M, i ∈ J}
  ∪ {ij : i ≠ j, i, j ∈ J}
  ∪ {0k 0k+1 : k ∈ M \ {m}}
  ∪ {0m 01}.

y0k i = x0ik            i ∈ J, k ∈ M
yi01 = xi0m             i ∈ J
yi0k = xi0(k−1)         i ∈ J, k ∈ {2, . . . , m}
yij = Σ_{k∈M} xijk      i ≠ j, i, j ∈ J
y0k 0k+1 = x00k         k ∈ {1, . . . , m − 1}
y0m 01 = x00m
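The transformation above folds the m-machine problem into a single ATSP over depot copies 01, . . . , 0m and the job nodes. A small sketch of building the transformed arc set B (encoding and function name are ours, not from the paper):

```python
def transformed_arcs(n, m):
    """Arc set B of the single-ATSP reformulation: depot copies ('0', k)
    for the m machines, job nodes 1..n, job-to-job arcs shared by all
    machines, and a directed cycle through the depot copies."""
    J = range(1, n + 1)
    M = range(1, m + 1)
    arcs = [(('0', k), i) for k in M for i in J]           # 0_k -> i
    arcs += [(i, ('0', k)) for k in M for i in J]          # i -> 0_k
    arcs += [(i, j) for i in J for j in J if i != j]       # i -> j, one copy
    arcs += [(('0', k), ('0', k + 1)) for k in range(1, m)]
    arcs.append((('0', m), ('0', 1)))                      # close depot cycle
    return arcs
```

The arc count is 2mn + n(n − 1) + m, matching the four unions in the definition of B.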
7 Numerical Results
The scheduling problem at Herlitz PBS AG asks for a solution minimizing the
makespan over two non-identical machines. Conflicts arising from jobs using the
same printing cylinders are avoided by additional side constraints that require
that these jobs be executed on the same machine. The problem is thus not a
pure m-Cost ATSP, but the basic structure is the same. Our data instances
Column n gives the size of the instance, feas displays the best integral so-
lution we know, v0-cut shows the optimal value of the relaxation based on the
v0-cut inequalities; in parentheses we give the relative gap 1 − (v0-cut/feas). secs
is the number of seconds the code needed to solve this relaxation, including sepa-
ration. We compare the value of relaxation v0-cut to the relaxations obtained by
separating the subtour inequalities on the transformed problem and to the relaxation
combining subtour and Dk+/Dk− inequalities (for the definition of Dk+/Dk−
inequalities see [10], for a separation heuristic [7]). With respect to the latter
relaxations the v0-cut inequalities close the gap by roughly 60%. Optimal so-
lutions to the subtour relaxations without v0-cut inequalities would typically
exhibit several subtours on the one machine subproblems. There is still much
room for improvement in the separation of the v0-cut inequalities, so it should
be possible to reduce the computation time in more sophisticated implementations.
It is surprising that the Dk+/Dk− inequalities are not very effective, quite
contrary to the experience with usual ATSP problems.
In Table 2 we summarize results on instances with 40 to 100 jobs. For each
problem size we generated 10 instances, the table displays the average relative
gap of the relaxations as well as the average computation time. In the application
in question the error in the data is certainly significantly larger than 0.2%, so
there is no immediate need to improve these solutions further. On the other
References
1. E. Balas. The prize collecting traveling salesman problem. Networks, 19:621–636,
1989.
2. E. Balas. The prize collecting traveling salesman problem: II. polyhedral results.
Networks, 25:199–216, 1995.
3. E. Balas and M. Fischetti. A lifting procedure for the asymmetric traveling sales-
man polytope and a large new class of facets. Mathematical Programming, 58:325–
352, 1993.
4. M. O. Ball, T. L. Magnanti, C. L. Monma, and G. L. Nemhauser, editors. Network
Routing, volume 8 of Handbooks in Operations Research and Management Science.
Elsevier Sci. B.V., Amsterdam, 1995.
5. J. Desrosiers, Y. Dumas, M. M. Solomon, and F. Soumis. Time Constrained Rout-
ing and Scheduling, chapter 2, pages 35–139. Volume 8 of Ball et al. [4], 1995.
6. M. Fischetti, J. Salazar González, and P. Toth. A branch-and-cut algorithm for the
generalized travelling salesman problem. Technical report, University of Padova,
1994.
7. M. Fischetti and P. Toth. A polyhedral approach to the asymmetric traveling
salesman problem. Management Science, 43:1520–1536, 1997.
8. M. Fischetti and P. Toth. An additive approach for the optimal solution of the
prize-collecting travelling salesman problem. In B. Golden and A. Assad, editors,
Vehicle Routing: Methods and Studies, pages 319–343. Elsevier Science Publishers
B.V. (North-Holland), 1998.
9. M. Gendreau, G. Laporte, and F. Semet. A branch-and-cut algorithm for the
undirected selective traveling salesman problem. Networks, 32:263–273, 1998.
10. M. Grötschel and M. Padberg. Polyhedral theory. In Lawler et al. [12], chapter 8.
11. M. Jünger, G. Reinelt, and G. Rinaldi. The traveling salesman problem. In M. Ball,
T. Magnanti, C. Monma, and G. Nemhauser, editors, Network Models, volume 7
of Handbooks in Operations Research and Management Science, chapter 4, pages
225–330. North Holland, 1995.
1 Introduction
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 259–272, 1999.
© Springer-Verlag Berlin Heidelberg 1999
260 Satoru Iwata, S. Thomas McCormick, and Maiko Shigeno
trivial way so that it will generalize to both a weakly and a strongly polynomial
algorithm for SF.
The only other polynomial cut canceling algorithm we know for SF is the most
helpful total cut canceling algorithm of [21]. However that algorithm is provably
not strongly polynomial, and it needs an oracle to compute exchange capacities
for a derived submodular function which appears to be achievable only through
the ellipsoid algorithm, even if we have a non-ellipsoid oracle for the original
submodular function. By contrast, the present algorithm can compute exchange
capacities in derived networks using only an oracle for the original function, and
is the first dual strongly polynomial algorithm for SF that we know of. Our
algorithm also appears to be the first strongly polynomial algorithm (primal or
dual) for SF that avoids scaling or rounding the data. Furthermore, it seems
very likely to us that the present algorithm will be able to be further generalized
so as to apply to the M-convex case, unlike its cycle canceling cousin.
2 Submodular Flow
An instance of submodular flow looks much like an instance of MCF. We are
given a directed graph G = (N, A) with node set N of cardinality n and arc set
A of cardinality m. We are also given lower and upper bounds `, u ∈ RA on the
arcs, and costs c ∈ RA on the arcs.
We need some notation to talk about relaxed conservation. If w ∈ R^X and
Y ⊆ X, then we abbreviate Σ_{y∈Y} wy by w(Y) as usual. For node subset S,
define ∆+S as {i → j ∈ A | i ∈ S, j ∉ S}, ∆−S as {i → j ∈ A | i ∉ S, j ∈ S},
and ∆S = ∆+S ∪ ∆−S. We say that arc a crosses S if a ∈ ∆S. If ϕ is a flow on
f (S) + f (T ) ≥ f (S ∪ T ) + f (S ∩ T )
for all S, T ⊆ N . For a submodular function f with f (∅) = 0, the base polyhedron
B(f ) is defined by
A vector in B(f ) is called a base. Since ∂ϕ(∅) = ∂ϕ(N ) = 0 for any flow ϕ,
requiring that f (N ) = 0 and ∂ϕ(S) ≤ f (S) for all S ⊆ N is equivalent to
requiring that ∂ϕ belong to B(f ).
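The cut and boundary notation can be made concrete with a small sketch (function names are ours):

```python
def delta_plus(arcs, S):
    """Arcs i -> j with i in S, j not in S (written Delta+ S above)."""
    return [(i, j) for (i, j) in arcs if i in S and j not in S]

def delta_minus(arcs, S):
    """Arcs i -> j with i not in S, j in S (written Delta- S above)."""
    return [(i, j) for (i, j) in arcs if i not in S and j in S]

def boundary(phi, arcs, S):
    """Boundary of flow phi on S: flow leaving S minus flow entering S.
    phi is a dict keyed by arc; boundary(phi, arcs, N) is always 0."""
    return (sum(phi[a] for a in delta_plus(arcs, S))
            - sum(phi[a] for a in delta_minus(arcs, S)))
```

On a directed cycle, for instance, the boundary of every node set vanishes, which is the flow-conservation case of ∂ϕ(S) ≤ f(S).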
cπa > 0 ⇒ ϕa = `a ,
cπa < 0 ⇒ ϕa = ua ,
and ∂ϕ ∈ B(f π ).
approximate optimality. This checking routine will form the core of our cut
canceling algorithm, as it will produce the cuts that we cancel. For this paper
a cut is just a subset S of N . We cancel a cut by increasing πi by some step
length β for i ∈ S, which has the effect of increasing cπa on ∆+ S, decreasing cπa
on ∆− S, and changing the level sets of π.
Given a node potential π, we define modified bounds `π and uπ as follows. If
cπa > 0 then `πa = uπa = `a; if cπa = 0 then `πa = `a and uπa = ua; if cπa < 0 then
`πa = uπa = ua. Then Theorem 1 implies that π is optimal if and only if there is
a feasible flow in the network Gπ with bounds `π, uπ, and fπ.
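The bound modification simply pins the flow on every arc with nonzero reduced cost, in line with the complementary slackness conditions of Theorem 1. A minimal sketch (names are ours):

```python
def modified_bounds(c_pi, lower, upper):
    """Modified bounds from reduced costs, per Theorem 1's conditions:
    positive reduced cost -> flow pinned at the lower bound,
    negative reduced cost -> flow pinned at the upper bound,
    zero reduced cost     -> bounds left unchanged."""
    l_pi, u_pi = {}, {}
    for a in c_pi:
        if c_pi[a] > 0:
            l_pi[a] = u_pi[a] = lower[a]
        elif c_pi[a] < 0:
            l_pi[a] = u_pi[a] = upper[a]
        else:
            l_pi[a], u_pi[a] = lower[a], upper[a]
    return l_pi, u_pi
```

Optimality of π then reduces to a feasibility question with these bounds.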
For the proof of correctness of our algorithm we will need some details of
how feasibility of Gπ is checked.
We use a variant of an algorithm of Frank [6] to check feasibility of Gπ. It
starts with any initial base x ∈ B(fπ) and any initial ϕ satisfying `π and uπ.
We now define a residual network on N. If ϕij > `πij, then we have a backward
residual arc j → i with residual capacity rji = ϕij − `πij; if ϕij < uπij, then we
have a forward residual arc i → j with rij = uπij − ϕij. To deal with relaxed
conservation, we have jumping arcs: For every i, j ∈ N with i ≠ j and x ∈ B(fπ),
define

r̃(x, i, j) = max{α | x + αχi − αχj ∈ B(fπ)};

thus r̃(x, i, j) > 0 means that there is no x-tight set containing i but not j. Make
a jumping arc j → i with capacity r̃(x, i, j) whenever r̃(x, i, j) > 0. Note that S
is x-tight if and only if there are no jumping arcs w.r.t. x entering S.
The Feasibility Algorithm finds directed residual paths from N + = {i ∈ N |
xi > ∂ϕ({i})} to N − = {i ∈ N | xi < ∂ϕ({i})} with a minimum number of
arcs in the residual network. On each path it augments flow ϕ on the residual
arcs, and modifies x as per the jumping arcs, which monotonically reduces the
difference of x and ∂ϕ. By using a lexicographic selection rule, the algorithm
terminates in finite time. At termination, either x coincides with ∂ϕ, which
implies the δ-optimality of π; or there is no directed path from N + to N − .
In this last case, define T ⊆ N as the set of nodes from which N − is reachable
by directed residual paths. No jumping arcs enter T , so it must be tight for the
final x, and it must contain all i with xi < ∂ϕ({i}). Furthermore, we have
ϕ(∆+ T) = `π(∆+ T) and ϕ(∆− T) = uπ(∆− T). Thus we get

V^π(T) ≡ `π(∆+A T) − uπ(∆−A T) − fπ(T) = ∂ϕ(T) − x(T) > 0.
We call a node subset S with V π (S) > 0 a positive cut. Similar reasoning shows
that for any other S ⊆ N , we have V π (S) ≤ ∂ϕ(S) − x(S) ≤ ∂ϕ(T ) − x(T ) =
V π (T ), proving that T is a most positive cut. Intuitively, V π (T ) measures how
far away from optimality π is. We summarize as follows:
Lemma 1. Node potentials π are optimal if and only if there are no positive
cuts w.r.t. π. When π is not optimal, the output of the Feasibility Algorithm is
a most positive cut.
We denote the running time of the Feasibility Algorithm by FA. As usual, we
assume that we have an oracle to compute exchange capacities, and denote its
A max mean cut attains the maximum in δ(π) ≡ max_S V^π(S). By standard
LP duality arguments δ(π) also equals the minimum δ such that there is a
feasible flow in Ĝ^π with bounds `^{π,δ}, u^{π,δ}, and fπ. Define U to be the maximum
absolute value of any `a, ua, or f({i}). A max mean cut can be computed
using O(min{m′, log(nU)/(1 + log log(nU) − log log n)}) calls to the Feasibility
Algorithm in the framework of Newton’s Algorithm, see Ervolina and McCormick [22]
or Radzik [26].
4 Cut Cancellation
We will start out with δ large and drive δ towards zero, since π is 0-optimal if
and only if it is optimal. The next lemma says that δ need not start out too big,
and need not end up too small. Its proof is similar to [5, Lemma 5.1].
Lemma 2. Suppose that `, u, and f are integral. Then any node potentials π
are 2U-optimal. Moreover, when δ < 1/m′, any δ-optimal node potentials are
optimal.
Our relaxed δ-MPC Canceling algorithm will start with δ = 2U and will
execute scaling phases, where each phase first sets δ := δ/2. The input to a
phase will be a 2δ-optimal set of node potentials from the previous phase, and
its output will be a δ-optimal set of node potentials. Lemma 2 says that after
O(log(nU)) scaling phases we have δ < 1/m′, and we are optimal. Within a
scaling phase we use the Feasibility Algorithm to find a δ-MPC T . We then
cancel T by adding a constant step length β to πi for each node in T to get π 0 .
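The outer structure just described — scaling phases that halve δ, each canceling δ-MPCs until δ-optimality — can be sketched as follows; find_delta_mpc and cancel are hypothetical stand-ins for the Feasibility Algorithm and the AdjustFlow-based cancel step:

```python
def mpc_canceling(U, m_prime, find_delta_mpc, cancel):
    """Skeleton of the relaxed delta-MPC canceling loop. Starts at
    delta = 2U (any potentials are 2U-optimal by Lemma 2) and stops
    once delta < 1/m', when delta-optimal potentials are optimal.
    find_delta_mpc(delta) returns a delta-MPC or None if delta-optimal."""
    delta = 2.0 * U
    while delta >= 1.0 / m_prime:
        delta /= 2.0                      # a phase first halves delta
        cut = find_delta_mpc(delta)
        while cut is not None:            # cancel until delta-optimal
            cancel(cut, delta)
            cut = find_delta_mpc(delta)
    return delta
```

This makes the O(log(nU)) bound on the number of phases visible: δ is halved from 2U down past 1/m′.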
In ordinary min-cost flow we choose the step length β based on the first arc
in A whose reduced cost hits zero (as long as the associated flow is within the
bounds, see [29, Figure 2]). This bound on β is
η ≡ min{ min{|cπa| : a ∈ ∆+A T, cπa < 0, ϕa ≥ `a},
         min{cπa : a ∈ ∆−A T, cπa > 0, ϕa ≤ ua} }.
(Note that optimality of T implies that every arc a with ϕa ≥ `a has negative
reduced cost, and every arc a with ϕa ≤ ua has positive reduced cost.) We say
that an arc achieving the min for η determines η.
Here we must also worry about the level set structure of π changing during
the cancel. We need to increase flows on jumping arcs leaving T so that the L^{π′}_k
will be tight. We try to do this by decreasing flow on arcs of ∆−E T, all of which
have flow δ since T is tight. If the exchange capacity of such an arc is at most δ
we can do this. Otherwise, this E-arc will determine β, and the large exchange
capacity will allow us to show that the algorithm makes sufficient progress.
These considerations lead to the subroutine AdjustFlow below. It takes as
input the optimal max flow ϕ from the Feasibility Algorithm, and its associated
base x ∈ B(f π ). It computes an update ψ ∈ RE to ϕ and base x0 so that x0 will
be the base associated with ϕ0 ≡ ϕ − ψ. Note that in AdjustFlow we always
Algorithm AdjustFlow:
begin
    ψe := 0 for e ∈ E;
    H := {j → i | i ∈ T, j ∈ N − T, 0 < πj − πi < η};
    while H ≠ ∅ do
    begin
        x′ := x − ∂E ψ;
        select e = j → i ∈ H with minimum πj − πi;
        set H := H − {e};
        if r̃(x′, i, j) < δ then ψe := r̃(x′, i, j)
        else return β := πj − πi [β is determined by jumping arc j → i];
    end
    return β := η [β is determined by the A-arc determining η];
end.
compute r̃(x, i, j) w.r.t. f, never w.r.t. fπ, so that jumping arcs are w.r.t. f, not
fπ. We call an arc that determines β a determining arc.
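A Python transcription of AdjustFlow may clarify the control flow; exch_cap is a hypothetical stand-in for the exchange-capacity oracle (computed w.r.t. f), and the function returns the jumping-arc flow update ψ together with the step length β:

```python
def adjust_flow(T, N, pi, eta, delta, exch_cap):
    """Sketch of AdjustFlow. exch_cap(psi, i, j) stands in for the
    exchange capacity of the base x - boundary(psi) between i and j.
    Returns (psi, beta): the flow update and the step length."""
    psi = {}
    # H: jumping arcs j -> i entering T whose potential gap is below eta
    H = sorted((pi[j] - pi[i], j, i)
               for i in T for j in N - T
               if 0 < pi[j] - pi[i] < eta)
    for gap, j, i in H:                   # smallest gap first
        cap = exch_cap(psi, i, j)
        if cap < delta:
            psi[(j, i)] = cap             # saturate this jumping arc
        else:
            return psi, gap               # beta determined by arc j -> i
    return psi, eta                       # beta determined by an A-arc
```

The two return statements correspond exactly to the two ways an arc can become the determining arc.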
The full description of canceling a cut is now: Use the Feasibility Algorithm
to find δ-MPC T , max flow ϕ, and base x. Run AdjustFlow to modify ϕ to
ϕ0 = ϕ−ψ and x to x0 = x−∂E ψ, and to choose β. Finally, increase πi by β for all
i ∈ T . A scaling phase cancels δ-MPCs in this manner until π is δ-optimal. The
next lemma shows that canceling a δ-MPC cannot increase the δ-MPC value,
and that V π,δ (S) will decrease by at least δ under some circumstances.
fπ. Each T̂i nests with the L^π_k, so we have fπ(T̂i) = f(T̂i), so each T̂i is in fact
ϕ′-tight for f. This implies that the only possible jumping arcs entering level set
L0 of π′ are those from Tj to Ti for p < j < i ≤ q. But these jumping arcs all
belong to H and were removed by AdjustFlow. Thus L0 is ϕ′-tight for f, and
so x′ ∈ B(f^{π′}).
Since ψe > 0 implies ϕe = δ, every e ∈ E satisfies 0 ≤ ϕ′e ≤ δ. Therefore we
have

V^{π′,δ}(S) = `^{π′,δ}(∆+I S) − u^{π′,δ}(∆−I S) − f^{π′}(S)   (def’n of V^{π′,δ}(S))
           ≤ ∂I ϕ′(S) − x′(S)                              (feas. of ϕ′, x′ ∈ B(f^{π′}))
Proof. Suppose that the first cancellation has initial node potentials π^0 and
cancels δ-MPC T^1 to get π^1, the next cancels T^2 to get π^2, . . . , and the nth
cancels T^n to get π^n. Each cancellation makes at least one arc into a determining
arc. Consider the subgraph of these determining arcs. If a cut creates a determining
arc whose ends are in the same connected component of this subgraph, then
this cut must be crossed by a determining arc from an earlier cut. We can avoid
this only if each new determining arc strictly decreases the number of connected
components in the subgraph. This can happen at most n − 1 times, so it must
happen at least once within n iterations.
Let k be an iteration where T^k shares a determining arc with an earlier cut
T^h. By Corollary 1 applied to T = T^h and S = T^k, we have V^{π^{h−1},δ}(T^h) ≥
V^{π^h,δ}(T^k) + δ. Note that the δ-MPC value at iteration i is V^{π^{i−1},δ}(T^i). If
V^{π^h,δ}(T^k) ≥ V^{π^{k−1},δ}(T^k), Lemma 3 says that V^{π^0,δ}(T^1) ≥ V^{π^{h−1},δ}(T^h) ≥
V^{π^h,δ}(T^k) + δ ≥ V^{π^{k−1},δ}(T^k) + δ ≥ V^{π^{n−1},δ}(T^n) + δ.
If instead V^{π^h,δ}(T^k) < V^{π^{k−1},δ}(T^k), then let p be the latest iteration between
h and k with V^{π^{p−1},δ}(T^k) < V^{π^p,δ}(T^k). The only way for the value of T^k to
increase like this is if there is an arc a ∈ A with c^{π^{p−1}}_a = 0 that crosses T^k in the
reverse orientation to its orientation in T^p, or f^{π^{p−1},δ}(T^k) > f^{π^p,δ}(T^k) holds. Let
ϕ^p be a flow proving optimality of T^p. If a ∈ ∆+A T^p ∩ ∆−A T^k (a ∈ ∆−A T^p ∩ ∆+A T^k),
then we have u^{π^p}_a = ϕ^p_a + 2δ ≥ ϕ^p_a + δ (`^{π^p}_a = ϕ^p_a − 2δ ≤ ϕ^p_a − δ). When
f^{π^{p−1},δ}(T^k) > f^{π^p,δ}(T^k) holds, there exist i, j ∈ N such that i ∈ T^k \ T^p and
j ∈ T^p \ T^k. It follows from i → j ∈ ∆−I T^p and j → i ∈ ∆+I T^p that ϕ^p_e = δ =
`^{π^p,δ}_e + δ for e = i → j and ϕ^p_{e′} = 0 = u^{π^p,δ}_{e′} − δ for e′ = j → i. In either case,
we have `^{π^p,δ}(∆+I T^k) − u^{π^p,δ}(∆−I T^k) ≤ ∂I ϕ^p(T^k) − δ. Thus the equations of the proof
of Lemma 3 apply, showing that V^{π^{p−1},δ}(T^p) − δ ≥ V^{π^p,δ}(T^k). Then Lemma 3
and the choice of p say that V^{π^0,δ}(T^1) ≥ V^{π^{p−1},δ}(T^p) ≥ V^{π^p,δ}(T^k) + δ ≥
V^{π^{k−1},δ}(T^k) + δ ≥ V^{π^{n−1},δ}(T^n) + δ. ⊓⊔
most m′δ, and this is an upper bound on the value of the first δ-MPC in this
phase. By Lemma 3, it is then also a bound on the value of every δ-MPC in the
phase. ⊓⊔
Putting Lemmas 4 and 5 together yields our first bound on the running time
of δ-MPC Canceling:
Proof. Lemma 5 shows that the δ-MPC value of the first cut in a phase is
at most m′δ. It takes at most n iterations to reduce this by δ, so there are at
most m′n = O(n³) iterations per phase. The time per iteration is dominated by
computing a δ-MPC, which is O(FA). The number of phases is O(log(nU)) by
Lemma 2. ⊓⊔
Cut Canceling for Submodular Flow 269
x′(N+) ≤ m′δ′ + ∂A ϕ̂(N+).   (2)
We now apply the successive shortest path algorithm for submodular flow
of Fujishige [8] starting from ϕ̂. (As originally stated, this algorithm is finite
only for integer data, but the lexicographic techniques of Schönsleben [27] and
Lawler and Martel [20] show that it can be made finite for any data.) This
algorithm looks for an augmenting path from a node i ∈ N+ to a node j ∈ N−,
where residual capacities on A-arcs come from ϕ̂, and residual capacities on
jumping arcs come from x′. It chooses a shortest augmenting path (using the
current reduced costs as lengths; such a path can be shown to always exist) and
augments flow along the path, updating π′ by the shortest path distances, and x′
as per the jumping arcs. This update maintains the properties that the current
ϕ̂ satisfies the bounds and is complementary slack with the current π′, and the
current x′ belongs to B(f). The algorithm terminates with optimal flow ϕ∗ once
the boundary of the current ϕ̂ coincides with the current x′. By (2), the total
amount of flow pushed by this algorithm is at most m′δ′.
This implies that for each a ∈ A, ϕ̂a differs from ϕ∗a by at most m′δ′, so ϕ′a
differs from ϕ∗a by at most (m′ + 1)δ′. In particular, if ϕ′a < ua − (m′ + 1)δ′, then
ϕ∗a < ua, implying that c^{π∗}_a ≥ 0, and similarly for ϕ′a > `a + (m′ + 1)δ′.
Suppose that P is an augmenting path chosen by the algorithm, and that
flow is augmented by amount τP along P. Since P has at most n/2 jumping
arcs, the boundary of any S ⊆ N changes by at most (n/2)τP due to P. Since
Σ_P τP ≤ m′δ′ by (2), the total change in ∂A ϕ̂(S) during the algorithm is at most
(m′n/2)δ′. Since |∂I ϕ′(S) − ∂A ϕ̂(S)| ≤ m′δ′, the total change in ∂I ϕ′(S) is at
most (m′n/2)δ′. Thus ∂I ϕ′(T′) > f^{π′}(T′) + (m′n/2)δ′ implies that ∂A ϕ∗(T′) >
f^{π′}(T′). This says that some level set of π′ is not ϕ∗-tight. This implies that
there is some E-arc i → j with π′i > π′j but π∗i ≤ π∗j. ⊓⊔
u
We now modify our algorithm a little bit. We divide our scaling phases into
blocks of log2(m′ + 1) phases each. At the beginning of each block we compute a
max mean cut T^0 with mean value δ^0 = δ(π^0) and cancel T^0 (including calling
AdjustFlow). This ensures that our current flow is δ^0-optimal, so we set δ = δ^0
and start the block of scaling phases. It will turn out that only 2m′ blocks are
sufficient to attain optimality.
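The block structure can be sketched as follows; max_mean_cut and run_phase are hypothetical oracles standing in for the max mean cut computation (with its cancel) and for one scaling phase:

```python
import math

def blocked_canceling(m_prime, max_mean_cut, run_phase):
    """Skeleton of the modified algorithm's block structure: 2m' blocks,
    each opening with a max mean cut cancellation that resets delta to
    the current mean value, followed by log2(m'+1) scaling phases.
    max_mean_cut() returns (cut, delta); delta == 0 means optimal."""
    phases_per_block = math.ceil(math.log2(m_prime + 1))
    for _ in range(2 * m_prime):
        cut, delta = max_mean_cut()       # cancel the max mean cut first
        if delta == 0:                    # 0-optimal means optimal
            return
        for _ in range(phases_per_block):
            delta /= 2.0                  # each phase halves delta
            run_phase(delta)
```

After one block, δ has been halved log2(m′ + 1) times, giving exactly the relation δ′ < δ^0/(m′ + 1) used in the proof of Theorem 3.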
Theorem 3. This modified version of the algorithm takes O(n^5 log n · FA) =
O(n^8 h log n) time.
Proof. Let T^0 be the max mean cut canceled at the beginning of the block,
with associated node potential π^0, flow ϕ^0, and mean value δ^0. Complementary
slackness for T^0 implies that ϕ^0_a = u^{π^0}_a + δ^0 for all a ∈ ∆−I T^0, that
ϕ^0_a = `^{π^0}_a − δ^0 for all a ∈ ∆+A T^0, that ϕ^0_a = `^{π^0}_a for all a ∈ ∆+I T^0,
and that ∂I ϕ^0(T^0) = f^{π^0}(T^0).
Let π′, ϕ′, and δ′ be the similar values after the last phase in the block. Since
each scaling phase cuts δ in half, we have δ′ < δ^0/(m′ + 1).
Subtracting ϕ′(∆+I T^0) − ϕ′(∆−I T^0) − ∂I ϕ′(T^0) = 0 from V^{π^0}(T^0) yields

(|∆A T^0| + |T^0| · |N − T^0|)δ^0 = V^{π^0}(T^0)
    = (`^{π^0} − ϕ′)(∆+I T^0) + (ϕ′ − u^{π^0})(∆−I T^0)    (3)
      + (∂I ϕ′(T^0) − f^{π^0}(T^0)).

Now apply Lemma 6 to ϕ′ and π′. If the term for arc a of (`^{π^0} − ϕ′)(∆+A T^0) is at
least δ^0 > (m′ + 1)δ′, then we must have that `^{π^0}_a = ua and so ϕ′a < ua − (m′ + 1)δ′,
and we can conclude that c^{π∗}_a ≥ 0. But each a in ∆+A T^0 had c^{π^0}_a < 0, so this is
a new sign constraint on c^{π^0}. The case for terms of (ϕ′ − u^{π^0})(∆−A T^0) is similar.
Suppose instead that all the terms in the ∆+A T^0 and ∆−A T^0 sums of (3)
These techniques should lead to a (strongly) polynomial maximum mean cut can-
celing algorithm for the submodular flow problem. It should be very straightfor-
ward to extend this algorithm to the separable convex submodular flow problem
as was done for its cycle canceling cousin. We are also optimistic about extend-
ing it to the M-convex cost submodular flow problem [24,25]. If we can do this,
then we would have a unified approach to a variety of optimization problems
including separable convex cost optimization in totally unimodular spaces [19].
Acknowledgment
We heartily thank Lisa Fleischer for many valuable conversations about this
paper and helpful comments on previous drafts of it.
References
1. R. K. Ahuja, T. L. Magnanti, and J. B. Orlin: Network Flows — Theory, Algo-
rithms, and Applications, Prentice Hall, 1993.
2. W. Cui and S. Fujishige: A primal algorithm for the submodular flow problem with
minimum-mean cycle selection, J. Oper. Res. Soc. Japan, 31 (1988), 431–440.
3. W. H. Cunningham and A. Frank: A primal-dual algorithm for submodular flows,
Math. Oper. Res., 10 (1985), 251–262.
4. J. Edmonds and R. Giles: A min-max relation for submodular functions on graphs,
Ann. Discrete Math., 1 (1977), 185–204.
5. T. R. Ervolina and S. T. McCormick: Two strongly polynomial cut canceling algo-
rithms for minimum cost network flow, Discrete Appl. Math., 46 (1993), 133–165.
6. A. Frank: Finding feasible vectors of Edmonds-Giles polyhedra, J. Combinatorial
Theory, B36 (1984), 221–239.
7. A. Frank and É. Tardos: An application of simultaneous Diophantine approxima-
tion in combinatorial optimization, Combinatorica, 7 (1987), 49–65.
8. S. Fujishige: Algorithms for solving the independent-flow problems, J. Oper. Res.
Soc. Japan, 21 (1978), 189–204.
9. S. Fujishige: Capacity-rounding algorithm for the minimum-cost circulation prob-
lem: A dual framework of the Tardos algorithm, Math. Programming, 35 (1986),
298–309.
10. S. Fujishige: Submodular Functions and Optimization, North-Holland , 1991.
11. S. Fujishige and X. Zhang: New algorithms for the intersection problem of sub-
modular systems, Japan J. Indust. Appl. Math., 9 (1992), 369–382.
12. S. Fujishige, H. Röck, and U. Zimmermann: A strongly polynomial algorithm for
minimum cost submodular flow problems, Math. Oper. Res., 14 (1989), 60–69.
13. A. V. Goldberg and R. E. Tarjan: A new approach to the maximum flow problem,
J. ACM, 35 (1988), 921–940.
14. A. V. Goldberg and R. E. Tarjan: Finding minimum-cost circulations by canceling
negative cycles, J. ACM, 36 (1989), 873–886.
15. M. Grötschel, L. Lovász, and A. Schrijver: Geometric Algorithms and Combinato-
rial Optimization, Springer-Verlag, 1988.
16. R. Hassin: Algorithm for the minimum cost circulation problem based on maxi-
mizing the mean improvement, Oper. Res. Lett., 12 (1992), 227–233.
17. S. Iwata: A capacity scaling algorithm for convex cost submodular flows,
Math. Programming, 76 (1997), 299–308.
18. S. Iwata, S. T. McCormick, and M. Shigeno: A faster algorithm for minimum cost
submodular flows, Proceedings of the Ninth Annual ACM-SIAM Symposium on
Discrete Algorithms (1998), 167–174.
19. A. V. Karzanov and S. T. McCormick: Polynomial methods for separable convex
optimization in unimodular linear spaces with applications, SIAM J. Comput., 26
(1997), 1245–1275.
20. E. L. Lawler and C. U. Martel: Computing maximal polymatroidal network flows,
Math. Oper. Res., 7 (1982), 334–347.
21. S. T. McCormick and T. R. Ervolina: Cancelling most helpful total submodular
cuts for submodular flow, Integer Programming and Combinatorial Optimization
(Proceedings of the Third IPCO Conference), G. Rinaldi and L. A. Wolsey eds.
(1993), 343–353.
22. S. T. McCormick and T. R. Ervolina: Computing maximum mean cuts, Discrete
Appl. Math., 52 (1994), 53–70.
23. S. T. McCormick, T. R. Ervolina and B. Zhou: Mean canceling algorithms for
general linear programs and why they (probably) don’t work for submodular flow,
UBC Faculty of Commerce Working Paper 94-MSC-011 (1994).
24. K. Murota: Discrete convex analysis, Math. Programming, 83 (1998), 313–371.
25. K. Murota: Submodular flow problem with a nonseparable cost function, Combi-
natorica, to appear.
26. T. Radzik: Newton’s method for fractional combinatorial optimization, Proceed-
ings of the 33rd IEEE Annual Symposium on Foundations of Computer Science
(1992), 659–669; see also: Parametric flows, weighted means of cuts, and fractional
combinatorial optimization, Complexity in Numerical Optimization, P. Pardalos,
ed. (World Scientific, 1993), 351–386.
27. P. Schönsleben: Ganzzahlige Polymatroid-Intersektions Algorithmen, Dissertation,
Eidgenössische Technische Hochschule Zürich, 1980.
28. A. Schrijver: Total dual integrality from directed graphs, crossing families, and
sub- and supermodular functions, Progress in Combinatorial Optimization, W. R.
Pulleyblank, ed. (Academic Press, 1984), 315–361.
29. M. Shigeno, S. Iwata, and S. T. McCormick: Relaxed most negative cycle and
most positive cut canceling algorithms for minimum cost flow, Math. Oper. Res.,
submitted.
30. É. Tardos: A strongly polynomial minimum cost circulation algorithm, Combina-
torica, 5 (1985), 247–255.
31. C. Wallacher and U. Zimmermann: A polynomial cycle canceling algorithm for
submodular flows, Math. Programming, to appear.
32. U. Zimmermann: Negative circuits for flows and submodular flows, Discrete Appl.
Math., 36 (1992), 179–189.
Edge-Splitting Problems with Demands
Tibor Jordán
1 Introduction
Edge-splitting is a well-known and useful method to solve graph problems which
involve certain edge-connectivity properties. Splitting off two edges su, sv means
deleting su, sv and adding a new edge uv. Such an operation may reduce the local
edge-connectivity between some pairs of vertices and may decrease the global
⋆ Part of this work was done while the author visited the Department of Applied Mathematics and Physics, Kyoto University, supported by the Monbusho International Scientific Research Program no. 09044160.
⋆⋆ Basic Research in Computer Science, Centre of the Danish National Research Foundation.
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 273–288, 1999.
© Springer-Verlag Berlin Heidelberg 1999
274 Tibor Jordán
for which the two sets of edges incident to s in G and H coincide. Let G and H
be k- and l-edge-connected in V , respectively. Then there exists a pair of edges
su, sv which is admissible in G and H (with respect to k and l, respectively)
simultaneously, provided d(s) ≥ 6. This property may not hold if d(s) = 4 but
assuming k and l are both even we prove that a simultaneously admissible pair
always exists, showing that a simultaneously admissible complete splitting exists
as well.
Using these splitting results and the g-polymatroid intersection theorem we give a min-max theorem and a polynomial algorithm for the simultaneous edge-connectivity augmentation problem. In this problem two graphs G′ = (V, E′), H′ = (V, K′) and two integers k and l are given, and the goal is to find a smallest set of edges whose addition makes G′ (and H′) k-edge-connected (l-edge-connected, respectively) simultaneously. Our algorithm finds an optimal solution if k and l are both even. In the remaining cases the size of the solution does not exceed the optimum by more than one.
Graphs in this paper are undirected and may contain parallel edges. Let G =
(V, E) be a graph. A subpartition of V is a collection of pairwise disjoint subsets
of V . The subgraph of G induced by a subset X of vertices is denoted by G[X].
A set consisting of a single vertex v is simply denoted by v. An edge joining
vertices x and y is denoted by xy. Sometimes xy will refer to an arbitrary copy
of the parallel edges between x and y but this will not cause any confusion.
Adding or deleting an edge e from a graph G is often denoted by G + e or G − e,
respectively.
For X, Y ⊆ V, d(X, Y) denotes the number of edges with one endvertex in X − Y and the other in Y − X. We define the degree of a subset X as d(X) = d(X, V − X). For example, d(v) denotes the degree of vertex v. The degree function of a graph G′ is denoted by d′. The set of neighbours of v (or v-neighbours, for short), that is, the set of vertices adjacent to vertex v, is denoted by N(v). A graph G = (V, E) is k-edge-connected if d(X) ≥ k holds for every nonempty proper subset X ⊂ V.
The operation of splitting off a pair of edges sv, st at a vertex s means replacing sv, st by a new edge vt. If v = t then the resulting loop is deleted. We use G_{v,t} to denote the graph obtained after splitting off the edges sv, st in G (the vertex s will always be clear from the context). A complete splitting at a vertex s (with even degree) is a sequence of d(s)/2 splittings of pairs of edges incident to s.
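These definitions are easy to exercise on small examples. Below is a minimal Python sketch (the helper names are ours, not from the paper; the connectivity test is brute force and exponential in |V|) of d(X, Y), d(X), splitting off, and the k-edge-connectivity test on multigraphs given as edge lists:

```python
from itertools import combinations

def deg(edges, X, Y):
    """d(X, Y): number of edges with one endvertex in X - Y and the other in Y - X."""
    X, Y = set(X), set(Y)
    return sum(1 for u, v in edges
               if (u in X - Y and v in Y - X) or (v in X - Y and u in Y - X))

def d(edges, V, X):
    """d(X) = d(X, V - X), the degree of the vertex set X."""
    return deg(edges, X, set(V) - set(X))

def is_k_edge_connected(edges, V, k):
    """Brute force: d(X) >= k for every nonempty proper subset X of V."""
    V = list(V)
    return all(d(edges, V, X) >= k
               for r in range(1, len(V))
               for X in combinations(V, r))

def split_off(edges, s, u, v):
    """Splitting off su, sv: delete both edges and add uv; a loop (u == v) is discarded."""
    e = list(edges)
    e.remove((s, u) if (s, u) in e else (u, s))
    e.remove((s, v) if (s, v) in e else (v, s))
    if u != v:
        e.append((u, v))
    return e

# Example: G - s is the 4-cycle a-b-c-d-a, and s is joined to a and c.
G = [('a', 'b'), ('b', 'c'), ('c', 'd'), ('d', 'a'), ('s', 'a'), ('s', 'c')]
H = [e for e in split_off(G, 's', 'a', 'c') if 's' not in e]
```

Splitting off sa, sc and deleting s leaves the 4-cycle plus the chord ac, which is 2-edge-connected but not 3-edge-connected.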
2 Preliminaries
This section contains some basic results. The degree function satisfies the following well-known equalities.
Lemma 1. (a) A maximal dangerous set does not cross any critical set.
(b) If X is dangerous then d(s, V − X) ≥ d(s, X).
(c) If k is even then two maximal dangerous sets X, Y which are crossing have d(s, X ∩ Y) = 0. ⊓⊔
An s-neighbour v for which alternative (i) (resp. (ii)) holds is called nor-
mal (special, respectively). Note that if k is even then there exist no special
s-neighbours by Lemma 2.
The previous lemmas include all ingredients of Frank’s proof [5] for the next
splitting off theorem due to Lovász.
k is even then B(s) is disconnected. Thus for any connected demand graph D(s)
there exists a D(s)-split. This simple observation, which illustrates our proof
method, will be used later.
Now let us focus on the case when k ≥ 3 is odd. In this case we do not give a
complete characterization of the non-admissibility graphs but characterize those
graphs G for which B(s) is 2-edge-connected. To state our result we need some
definitions. In a cyclic partition X = (X₀, ..., X_{t−1}) of V the t partition classes X₀, ..., X_{t−1} are cyclically ordered. Thus we use the convention X_t = X₀, and so on. In a cyclic partition two classes X_i and X_j are neighbouring if |j − i| = 1 and non-neighbouring otherwise. We say that G′ = (V′, E′) is a C^l_p-graph for some p ≥ 3 and some even l ≥ 2 if there exists a cyclic partition Y = (Y₀, ..., Y_{p−1}) of V′ for which d′(Y_i) = l (0 ≤ i ≤ p − 1) and d′(Y_i, Y_j) = l/2 for each pair Y_i, Y_j of neighbouring classes of Y (which implies d′(Y_{i′}, Y_{j′}) = 0 for each pair of non-neighbouring classes Y_{i′}, Y_{j′}). A cyclic partition of G′ with these properties is called uniform.
Let G = (V + s, E) satisfy (1) for some odd k ≥ 3. Such a G is called round (from vertex s) if G − s is a C^{d(s)}_{k−1}-graph. Note that by (1) this implies that d(s, V_i) = 1 for each class V_i (0 ≤ i ≤ d(s) − 1) of a uniform partition V of G − s.
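The defining conditions of a C^l_p-graph can be verified mechanically. The sketch below (helper names are ours) checks that a given cyclic partition is uniform; note that every class having degree l with l/2 edges to each cyclic neighbour already forces 0 edges between non-neighbouring classes. As a sanity check, the 5-cycle with singleton classes is a C²₅-graph, while a path admits no such partition:

```python
def edges_between(edges, X, Y):
    """Number of edges with one end in X and the other in Y (X, Y disjoint)."""
    X, Y = set(X), set(Y)
    return sum(1 for u, v in edges if (u in X and v in Y) or (u in Y and v in X))

def is_uniform(edges, classes, l):
    """Does the cyclic partition `classes` witness a C^l_p-graph?"""
    p = len(classes)
    if p < 3 or l % 2 != 0:
        return False
    for i in range(p):
        # degree of the class: edges leaving it towards all other classes
        if sum(edges_between(edges, classes[i], classes[j])
               for j in range(p) if j != i) != l:
            return False
        # exactly l/2 edges to the next class in the cyclic order
        if edges_between(edges, classes[i], classes[(i + 1) % p]) != l // 2:
            return False
    return True

# The 5-cycle 0-1-2-3-4-0 with singleton classes is a C^2_5-graph.
cycle5 = [(i, (i + 1) % 5) for i in range(5)]
```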
The following lemma will not be used in the proof of the main result of this
section but will be used later in the applications. We omit the proof.
Lemma 3. Let G = (V + s, E) satisfy (1) for some odd k ≥ 3. Suppose that
G is round from s and let V = (V0 , ..., Vr ) be a uniform partition of V , where
r = d(s) − 1. Then
(a) G − s is (k−1)-edge-connected and for every X ⊂ V with d_{G−s}(X) = k − 1 either X ⊆ V_i or V − X ⊆ V_i holds for some 0 ≤ i ≤ r, or X = V_i ∪ V_{i+1} ∪ · · · ∪ V_{i+j} for some 0 ≤ i ≤ r, 1 ≤ j ≤ r − 1 (indices taken modulo r + 1).
(b) For any set I of r edges which induces a spanning tree on N (s) the graph
(G − s) + I is k-edge-connected.
(c) The uniform partition of G − s is unique.
(d) B(s) is a cycle on d(s) vertices (which follows the ordering of V). ⊓⊔
The main structural result is the following.
Theorem 3. Suppose that G = (V + s, E) satisfies (1), d(s) is even, |N (s)| ≥ 2
and k ≥ 3 is odd. Then B(s) is 2-edge-connected if and only if G is round from
s and d(s) ≥ 4.
Proof. First suppose that G is round from s and d(s) ≥ 4. Lemma 3(d) shows
B(s) is a cycle and hence B(s) is 2-edge-connected.
In what follows we prove the other direction and assume that B(s) is 2-edge-
connected. If there is no special s-neighbour then the proof of Theorem 2 shows
that B(s) is disconnected. Thus there exists at least one special s-neighbour.
Lemma 4. Suppose that t is special and let X and Y be the two maximal t-
dangerous sets. Let u be a special s-neighbour in Y − X. Let Y and Z denote
the two maximal u-dangerous sets in G and let su, sv be a non-admissible pair
with v ∈ Z − Y . Then Z ∩ X = ∅, Y = (X ∩ Y ) ∪ (Z ∩ Y ) and d(s, Y ) = 2.
Proof. Notice that for every special vertex u′ ∈ Y − X one of the two maximal u′-dangerous sets must be equal to Y, thus Y is indeed one of the two maximal u-dangerous sets.
First suppose v ∈ X − Y. Since X and Y are the only maximal t-dangerous sets we have t ∉ Z. This shows that v is also special and the two maximal v-dangerous sets are X and Z. Lemma 2 shows d(Z) = k + 1 and d(X − Y) = d(Y − X) = k. Hence by (3), applied to Z and X − Y and to Z and Y − X, we obtain X − Y ⊂ Z and Y − X ⊂ Z. By Lemma 2, d(X ∩ Y) = k. Hence Z ∩ X ∩ Y = ∅ by Lemma 1, provided Z ∪ (X ∩ Y) ≠ V. Moreover, Z ∪ (X ∩ Y) = V implies d(s, Z) ≥ d(s) − 1 > d(s, V − Z), using d(s) ≥ 4. Since Z is dangerous, this contradicts Lemma 1(b). Thus we conclude Z ∩ X ∩ Y = ∅.
The above verified properties of Z and Lemma 2 imply that k + 1 = d(Z) ≥
d(X ∩Y, Z)+ d(s, Z) ≥ d(X ∩Y, X − Y )+ d(X ∩Y, Y − X)+ 2 = k − 1 + 2 = k + 1.
This shows that d(s, Z) = 2 and hence d(s, Z ∪ (X ∩ Y )) = 3 holds. We also get
d(Z, V − Z − (X ∩ Y )) = 0 and by Lemma 2 we have d(X ∩ Y, V − Z − (X ∩
Y )) = 0 as well. Therefore d(Z ∪ (X ∩ Y ), V − Z − (X ∩ Y )) = 0 and hence
d(s, Z ∪ (X ∩ Y )) = d(Z ∪ X ∪ Y ) = 3 ≤ k. This shows Z ∪ X ∪ Y is dangerous,
contradicting the maximality of X.
Thus we may assume that v ∈ V − (X ∪ Y). By Lemma 2 we have d(Z) = k + 1, d(X − Y) = d(Y − X) = d(X ∩ Y) = k and there exists an s-neighbour w ∈ X − Y. Clearly, t ∉ Z and by the previous argument we may assume w ∉ Z. We claim that Z ∩ (X − Y) = ∅. Indeed, otherwise Z and X − Y would cross (observe that t ∉ Z ∪ (X − Y)), contradicting Lemma 1(a). We claim that Z ∩ (X ∩ Y) = ∅ holds as well. This claim follows similarly: since w ∉ Z ∪ (X ∩ Y), Lemma 1(a) leads to a contradiction.
A third application of Lemma 1(a) shows Y − X ⊂ Z. To see this observe that t ∉ Z ∪ (Y − X) and hence (Y − X) − Z ≠ ∅ would imply that Z and Y − X cross, a contradiction.
Summarizing the previous observations we obtain Z ∩ X = ∅ and Y = (X ∩ Y) ∪ (Z ∩ Y). By Lemma 2 this implies d(s, Y) = 2. This proves the lemma. ⊓⊔
From Lemma 2 we can see that for every normal s-neighbour v the v-
neighbours in B(s) induce a complete subgraph of B(s) and for every special
s-neighbour t the t-neighbours in B(s) can be divided into two nonempty parts
(depending on whether a t-neighbour belongs to X − Y or Y − X for the two
maximal t-dangerous sets X and Y ), such that each of them induces a complete
subgraph of B(s). By Lemma 4 we obtain that there are no edges between these
two complete subgraphs of B(s) and hence this bipartition of the t-neighbours
in B(s) into complete subgraphs is unique for every special s-neighbour t.
To deduce further properties of B(s) let us fix a special s-neighbour t and take an s-neighbour r (r ≠ t) which is not adjacent to t in B(s) (that is, for which st, sr is admissible). Such an r exists by Proposition 2. Let the two maximal t-dangerous sets be X and Y. Since B(s) is 2-edge-connected, there exist two edge-disjoint paths P₁, P₂ from t to r in B(s). Without loss of generality we may assume that for the first edge uv of P₁ which leaves X ∪ Y (that is, for which u ∈ X ∪ Y and v ∉ X ∪ Y) we have u ∈ Y − X. Clearly, u is a special
4 Applications
In this section we apply Theorems 2 and 3 for proving the existence of complete
admissible splittings (or properties of maximal admissible splitting sequences)
when the set of split edges has to satisfy some additional requirement.
then Theorem 2 and the fact that D(s) is connected (while the non-admissibility graph B(s) is disconnected) show that (*) holds, and hence the proof of Theorem 4 is complete in this case. (Note that during the process of iteratively splitting off consecutive admissible pairs the demand cycle C has to be modified whenever s loses some neighbour w by splitting off the last copy of the edge sw.)
Now consider the case k = 3. The above argument and Theorem 3 show (using the fact that D(s) is 2-edge-connected) that by splitting off consecutive admissible pairs as long as possible either we find a complete admissible splitting which preserves planarity or we get stuck in a graph G′ which is round from s and for which B_{G′}(s) = D_{G′}(s) holds. In the latter case we need to re-embed some parts of G′ in order to complete the splitting sequence. We may assume that s is inside one of the bounded faces F of G′ − s. Let V₀, ..., V_{2m−1} be the uniform partition of V in G′ − s (where 2m := d_{G′}(s)) and let v_i be the neighbour of s in V_i (0 ≤ i ≤ 2m − 1). By Lemma 3 we can easily see that adding the edges v₀v_m and v_i v_{2m−i} (1 ≤ i ≤ m − 1) to G′ − s results in a 3-edge-connected graph G″. Observe that G″ is obtained from G′ by a complete admissible splitting and all these edges can be added within F in such a way that in the resulting embedding of G″ every edge crossing involves the edge v₀v_m. To avoid these edge crossings in G″ we "flip" V₀ and/or V_m, that is, re-embed the subgraphs induced by V₀ and V_m in such a way that after the flippings both v₀ and v_m occur on the boundary of the unbounded face. Since G′ is round and k = 3 it is easy to see that this can be done. Then we can connect v₀ and v_m within the unbounded face and obtain a planar embedding of the resulting graph. Thus the proof of Theorem 4 is complete. ⊓⊔
The theorem does not hold if k ≥ 5 is odd, see [9]. Note that the above proof
implies that the graphs obtained by a maximal planarity preserving admissible
splitting sequence are round for every odd k ≥ 5. Moreover, it shows that if
k = 3 then at most two “flippings” are sufficient.
Let G = (V + s, E) be a graph for which (1) holds and d(s) is even and let
P = {P1 , P2 , . . . , Pr }, 2 ≤ r ≤ |V | be a prescribed partition of V . In order to
solve a more general partition-constrained augmentation problem, Bang-Jensen,
Gabow, Jordán and Szigeti [1] investigated the existence of complete admissible
splittings at s for which each split edge connects two distinct elements of P.
They proved that if k is even and an obvious necessary condition holds (namely,
d(s, Pi ) ≤ d(s)/2 for every Pi ) then such a complete admissible splitting exists
and for odd k they characterized those graphs for which such a complete splitting
does not exist.
This partition-constrained edge-splitting problem can also be formulated as
an edge-splitting problem with demands. Here the demand graph is a complete
multipartite graph, which is either 2-edge-connected or is a star. Thus Theorems
2 and 3 can be applied and for several lemmas from [1] somewhat shorter proofs
can be obtained. We omit these proofs and note that, as we will point out, the
respectively, and let |F | be even, where F denotes the set of edges incident to
s. We say that a pair su, sv is legal if it is admissible in G as well as in H. A
complete splitting sequence at s is legal if the resulting graphs (after deleting s)
satisfy (1) with respect to k and l, respectively. Let d(s) := dG (s) = dH (s).
Theorem 5. If d(s) ≥ 6 then there exists a legal pair su, sv. If k and l are both
even then there exists a complete legal splitting at s.
Proof. The property of being legal can be formulated in terms of a demand graph
D(s). Namely, a pair su, sv (u ≠ v) is legal if and only if su, sv is a D(s)-split
in G with respect to D(s) = B̄H (s). Thus the existence of a legal pair follows
if we show that B̄H (s) cannot be a subgraph of BG (s). Let D := B̄H (s) and
A := BG (s). We may assume |N (s)| ≥ 4. (Otherwise, by Proposition 2, BH (s)
has an isolated vertex and hence D has a vertex which is connected to all the
other vertices. A has no such vertex.)
First suppose that k and l are both even. In this case Theorem 2 shows that
D is connected (since it is the complement of a disconnected graph) and A is
disconnected. This implies that a legal pair exists for arbitrary even d(s). Hence
a complete legal splitting exists as well. Now suppose that k is odd and l is
even. As above, we can see that D, which is a complete multipartite graph by
Theorem 2, is either 2-edge-connected (and is not a cycle) or contains a vertex
which is connected to all the other vertices or is a four-cycle. In the first two
cases Theorem 3 and Proposition 2 show that D cannot be a subgraph of A.
In the last case A is 2-edge-connected if no legal split exists. Theorem 3 shows
that this may happen only if A = C4 and G is round. In that case there are no
parallel edges incident to s and d(s) = 4. This proves the theorem when one of
k and l is even.
Finally, suppose that k and l are both odd. In this case the proof is more
complicated. First we prove the following.
Lemma 5. Let H = (V + s, E) satisfy (1) with respect to some odd k ≥ 3. Let
B := BH (s) be the non-admissibility graph of H and let D := B̄. Then one of
the following holds:
(i) D is 2-edge-connected and D is not a cycle,
(ii) D has a vertex which is adjacent to all the other vertices,
(iii) D = C4 ,
(iv) B arises from a complete graph Km (m ≥ 2) by attaching a path of
length two to some vertex of Km ,
(v) B = C4 .
Proof. If B is 2-edge-connected then Theorem 3 gives B = Cl for some even
l ≥ 4. If l = 4 then (v) holds, otherwise (i) holds. Hence we may assume that B
is not 2-edge-connected. In what follows suppose that neither (i) nor (ii) holds.
Let S := N (s) denote the set of vertices of B and D.
Case I: B is disconnected.
Since (ii) does not hold, S has a bipartition S = X ∪ Y , X ∩ Y = ∅, for
which there are no edges from X to Y in B and |X|, |Y | ≥ 2. Let p := |X| and
case d(s) = 4 follows, or A = P4 . Observe that Lemma 4 implies that the two
inner vertices of A are special in G. Now B = D̄ is a P4 as well. Hence the
two inner vertices of B (which are disjoint from the inner vertices of A) are
special in H. From this it follows that there are no parallel edges incident to s
and d(s) = 4. Furthermore, B = P4 implies that there exists a dangerous set
X in H which contains the two inner vertices x, y of B. It is easy to see that
V − X is also dangerous in H and contains the other two neighbours u, v of s,
which correspond to the two (non-adjacent) vertices of B with degree one. Thus
u and v should also be adjacent in B, a contradiction. This solves case (iv) when
m = 2.
Finally assume that (iv) holds with m ≥ 3 and D is a subgraph of A. Using the notation of Lemma 5 we obtain that the edge vw is not present in A by Proposition 2. Hence all the vertices in S − X − {x} are special in G by Lemma 2. If w is normal in G then A[S − X] is complete. Thus each vertex u in S − X − {x} is adjacent to all the other vertices in A, contradicting Proposition 2. (Now such a u exists since r ≥ 3.) Thus w is special in G. By Lemma 2 and Lemma 4, S − X can be partitioned into two non-empty complete subgraphs J and L in A according to the two maximal w-dangerous sets J′ and L′ in G. Without loss of generality assume that x ∈ J. Let u ∈ L. Now Lemma 4, applied to L′, u and v, gives L = {u}. If there exists a vertex y ∈ J − x then Lemma 4, applied to J′, y and v, gives J = {y}, a contradiction. Thus we conclude that s has four neighbours {v, w, u, x} only, contradicting m ≥ 3. This completes the proof of the theorem. ⊓⊔
From the above proof we can easily deduce the following.
Corollary 1. Suppose that k or l is odd, d(s) = 4 and there exists no legal pair. Then B_G(s) = C₄ or B_H(s) = C₄ and hence at least one of G and H is round. ⊓⊔
Note that a complete splitting sequence which is simultaneously admissible
in three (or more) graphs does not necessarily exist, even if each of the edge-
connectivity values is even. We also remark that the partition-constrained split-
ting problem can be reduced to a simultaneous edge-splitting problem where at
least one of k and l is even. To see this suppose that an instance of the partition-
constrained splitting problem is given as in the beginning of Section 4.2. Let
dm := maxi {dG (s, Pi )} in G = (V + s, E + F ) and let S := NG (s). Build graph
H = (S + x + s, K + F ) as follows. For each set S ∩ Pi in G let the corresponding
set in H induce a (2dm )-edge-connected graph (say, a complete graph with suf-
ficiently many parallel edges or a singleton). The edges incident to s in G and
H coincide. Then from vertex x of H add 2dm − dG (s, Pi ) parallel edges to some
vertex of S ∩ Pi (1 ≤ i ≤ r). Now H satisfies (1) with respect to l := 2dm . It can
be seen that a complete admissible splitting satisfying the partition-constraints
in G exists if and only if there exists a complete legal splitting in the pair G, H.
(Now the sets of vertices of G and H may be different. However, it is easy to
see that the assumption V (G) = V (H) is not essential in the simultaneous edge-
splitting problem.) This shows that characterizing the pairs G, H for which a
complete legal splitting does not exist (even if one of k and l is even) is at least
as difficult as the solution of the partition-constrained problem [1].
(Step 1) Find a common augmentation vector z for G and H for which z(V) is as small as possible.
(Step 2) Add a new vertex s to each of G and H and z(v) parallel edges from s to v for every v ∈ V. If z(V) is odd then add one more edge sw for some w ∈ V.
(Step 3) Find a maximal legal splitting sequence S at s in the resulting pair of graphs. If S is complete, let the solution F consist of the set of split edges. Otherwise splitting off S results in a pair of graphs G′, H′ for which the degree of s is four. In this case delete s and add a (common) set I of three properly chosen edges to G′ − s and H′ − s. Let the solution F be the union of the split edges and I.
The following theorem shows the correctness of the above algorithm and
proves that the solution set F is (almost) optimal. Let us define
Φ_{k,l}(G, H) = max{ Σ_{i=1}^{r} (k − d_G(X_i)) + Σ_{i=r+1}^{t} (l − d_H(X_i)) :
{X₁, ..., X_t} is a subpartition of V; 0 ≤ r ≤ t }.
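For very small instances the deficiency bound Φ_{k,l}(G, H) can be evaluated by brute force directly from the definition. In the sketch below (exponential enumeration, illustrative only; function names are ours) we read the subpartition as a family of nonempty proper subsets of V — this is our reading, since X = V always has degree 0 and would contribute trivially:

```python
from itertools import combinations

def d(edges, X):
    """Degree of the vertex set X: edges with exactly one end in X."""
    X = set(X)
    return sum(1 for u, v in edges if (u in X) != (v in X))

def subpartitions(V):
    """All ordered families of pairwise disjoint nonempty proper subsets of V."""
    V = list(V)
    n = len(V)
    def rec(avail):
        yield []
        for r in range(1, min(len(avail), n - 1) + 1):  # proper subsets only
            for X in combinations(avail, r):
                rest = [e for e in avail if e not in X]
                for tail in rec(rest):
                    yield [set(X)] + tail
    return rec(V)

def phi(k, l, V, edges_G, edges_H):
    """Phi_{k,l}(G, H): max over subpartitions {X_1, ..., X_t} and 0 <= r <= t of
    sum_{i<=r} (k - d_G(X_i)) + sum_{i>r} (l - d_H(X_i))."""
    best = 0
    for fam in subpartitions(V):
        for r in range(len(fam) + 1):
            best = max(best,
                       sum(k - d(edges_G, X) for X in fam[:r]) +
                       sum(l - d(edges_H, X) for X in fam[r:]))
    return best
```

For G = H the empty graph on {a, b, c} and k = l = 2, the three singletons give Φ = 6, so any simultaneous augmenting set needs at least ⌈6/2⌉ = 3 edges; a triangle achieves this.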
The size of a smallest simultaneous augmenting set for G and H (with respect
to k and l, resp.) is denoted by OP Tk,l (G, H).
Theorem 7. dΦk,l (G, H)/2e ≤ OP Tk,l (G, H) ≤ dΦk,l (G, H)/2e + 1. If k and l
are both even then OP Tk,l (G, H) = dΦk,l (G, H)/2e holds.
Proof. It is easy to see that dΦk,l (G, H)/2e ≤ OP Tk,l (G, H) holds. We will show
that the above algorithm results in a simultaneous augmenting set F with size
at most dΦk,l (G, H)/2e + 1 (and with size dΦk,l (G, H)/2e, if k and l are both
even). From the g-polymatroid intersection theorem and our remarks on common
augmentation vectors it can be verified that for the vector z that we obtain in
Step 1 of the above algorithm we have z(V ) = Φk,l (G, H) and hence we have
2dΦk,l (G, H)/2e edges incident to s at the end of Step 2. By Theorem 5 we can
find a maximal sequence of legal splittings in Step 3 which is either complete or
results in a pair of graphs G0 , H 0 , where the degree of s is four. In the former case
the set F of split edges, which is clearly a feasible simultaneous augmenting set,
has size dΦk,l (G, H)/2e, and hence is optimal. If k and l are both even then such
a complete legal splitting always exists, proving OP Tk,l (G, H) = dΦk,l (G, H)e/2.
In the latter case by Corollary 1 one of G0 and H 0 , say G0 , is round. Now there
exists a complete admissible splitting in H 0 by Theorem 1. Let e = uv, f = xy
be the two edges obtained by such a complete splitting. Let g = vx. By Lemma
3(b) adding the edge set I := {e, f, g} to G0 yields a k-edge-connected graph.
Thus the set of edges F which is the union of the edges obtained by the maximal
legal splitting sequence and the edge set I is a simultaneous augmenting set. Now
|F | = dΦk,l (G, H)/2e + 1, as required. t
u
References
1. J. Bang-Jensen, H.N. Gabow, T. Jordán and Z. Szigeti, Edge-connectivity aug-
mentation with partition constraints, Proc. 9th Annual ACM-SIAM Symposium
on Discrete Algorithms (SODA) 1998, pp. 306-315. To appear in SIAM J. Discrete
Mathematics.
2. G.R. Cai and Y.G. Sun, The minimum augmentation of any graph to a k-edge-
connected graph, Networks 19 (1989), 151–172.
3. W.H. Cunningham, A. Frank, A primal-dual algorithm for submodular flows,
Mathematics of Operations Research, 10 (1985) 251-262.
4. J. Edmonds, R. Giles, A min-max relation for submodular functions on graphs,
Annals of Discrete Mathematics 1 (1977) 185-204.
5. A. Frank, Augmenting graphs to meet edge–connectivity requirements, SIAM J.
Discrete Mathematics, 5 (1992) 22–53.
6. A. Frank, Connectivity augmentation problems in network design, in: Mathemat-
ical Programming: State of the Art 1994, (Eds. J.R. Birge and K.G. Murty), The
University of Michigan, Ann Arbor, MI, 34-63, 1994.
7. A. Frank and É. Tardos, Generalized polymatroids and submodular flows, Math-
ematical Programming 42, 1988, pp 489-563.
8. L. Lovász, Combinatorial Problems and Exercises, North-Holland, Amsterdam,
1979.
9. H. Nagamochi and P. Eades, Edge-splitting and edge-connectivity augmentation in planar graphs, in: R.E. Bixby, E.A. Boyd, and R.Z. Ríos-Mercado (Eds.), IPCO VI, LNCS 1412, Springer-Verlag, Berlin Heidelberg, 1998, pp. 96–111.
Integral Polyhedra Associated with Certain
Submodular Functions Defined on 012-Vectors
1 Introduction
J. Edmonds and R. Giles introduced the notion of total dual integrality of sys-
tems of inequalities [1]. Several combinatorial optimization problems can be for-
mulated as linear programming problems, the sets of whose feasible solutions
are described by totally dual integral systems of inequalities [9], [10]. Such sys-
tems of linear inequalities have in common a part derived from submodular or
supermodular set functions. Linear inequalities defined by set functions deter-
mine hyperplanes whose normal vectors are 01-vectors. Thus facets of feasible
polyhedra for such problems have only 01-vectors as their normal vectors.
K. Murota, in his theory of "discrete convex analysis" [7], introduced functions on integer lattice points, called M-convex and L-convex functions. An L-convex function is a generalization of a submodular set function and satisfies submodularity on integer lattice points:

g(p) + g(q) ≥ g(p ∨ q) + g(p ∧ q)  (p, q ∈ ZZ^S). (1)

He proved duality theorems for L- and M-convex functions based on the discrete separation theorem by A. Frank [2] and showed that submodularity may be regarded as a discrete version of convexity.
While L-convex functions satisfy (1), the submodularity of L-convex functions
is essentially that of set functions. (It corresponds to the fact that the effective
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 289–303, 1999.
© Springer-Verlag Berlin Heidelberg 1999
290 Kenji Kashiwabara, Masataka Nakamura, and Takashi Takabatake
2 Preliminaries
Throughout this paper, S denotes a nonempty finite set. For a set X ⊆ S, the symbol χ^X denotes the characteristic vector of X. For an element e ∈ S, the characteristic vector of {e} is denoted by χ_e. For a vector p ∈ IR^S, the symbol supp(p) denotes {e ∈ S | p(e) ≠ 0} and supp^k(p) denotes {e ∈ S | p(e) = k} for any real k. For vectors p, q ∈ IR^S, the vectors s and t defined by s(e) := max{p(e), q(e)} and t(e) := min{p(e), q(e)} for e ∈ S are denoted by p ∨ q and p ∧ q, respectively. The 1-norm of p ∈ IR^S is Σ_{e∈S} |p(e)|, denoted by ||p||.
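In code, with vectors represented as dictionaries over S, these componentwise operations are one-liners (the function names `join`, `meet`, etc. are ours, chosen for illustration):

```python
def join(p, q):
    """p ∨ q: componentwise maximum."""
    return {e: max(p[e], q[e]) for e in p}

def meet(p, q):
    """p ∧ q: componentwise minimum."""
    return {e: min(p[e], q[e]) for e in p}

def supp(p):
    """supp(p) = {e | p(e) != 0}."""
    return {e for e, x in p.items() if x != 0}

def supp_k(p, k):
    """supp^k(p) = {e | p(e) = k}."""
    return {e for e, x in p.items() if x == k}

def norm1(p):
    """1-norm ||p|| = sum of |p(e)| over e in S."""
    return sum(abs(x) for x in p.values())
```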
The symbol ZZ₊ (IR₊) denotes the set of nonnegative integers (reals), respectively.
A chain of a partially ordered set (S, ⪯) is a subset of S which is totally ordered by ⪯. A chain is maximal if it is a proper subset of no chain. We write (p₁, ..., p_m) for a chain {p₁, ..., p_m} with p_i ⪯ p_{i+1} for i ∈ {1, ..., m − 1}.
The support function of a convex set C ⊆ IR^n is the function δ*_C : IR^n → IR defined by δ*_C(x) := sup{xᵀy | y ∈ C}. Any support function f is known to be positively homogeneous [8], that is, f(kp) = kf(p) holds for any positive real k. A set C ⊆ IR^S is called down-monotone if for each y ∈ C all vectors x ∈ IR^S with x ≤ y are in C.
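For a concrete convex set the support function and its positive homogeneity can be checked directly. For the unit box C = [0, 1]^n the supremum is attained at a vertex, so a finite maximum suffices (an illustrative sketch; the helper name is ours):

```python
from itertools import product

def support_unit_box(x):
    """Support function of C = [0, 1]^n: sup{x·y | y in C}, attained at a vertex of the box."""
    n = len(x)
    return max(sum(xi * yi for xi, yi in zip(x, y))
               for y in product((0, 1), repeat=n))
```

For the box this evaluates to the sum of the positive components of x, and doubling x doubles the value, as positive homogeneity requires.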
A system of inequalities Ax ≤ b, where A is an m × n matrix and b is an m-dimensional vector, is totally dual integral (TDI) if (i) A and b are rational and (ii) for each integral vector c such that the LP problem max{cᵀx | Ax ≤ b} has a finite optimal value, its dual LP problem min{yᵀb | yᵀA = cᵀ, y ≥ 0} has an integral solution. The system is box TDI if the system

Ax ≤ b;  l ≤ x ≤ u

is TDI for each pair of rational vectors l and u [1]. It is proved in [1] that if a system of inequalities Ax ≤ b is TDI with an integral vector b, then {x ∈ IR^n | Ax ≤ b} is an integral polyhedron.
A set function f : 2^S → IR is submodular if it satisfies f(X) + f(Y) ≥ f(X ∪ Y) + f(X ∩ Y) for any X, Y ⊆ S. We assume f(∅) = 0 in the following.
A polyhedron P(f) is associated with a submodular set function f on 2^S with f(∅) = 0 by P(f) = {x ∈ IR^S | Σ_{e∈X} x(e) ≤ f(X) (X ⊆ S)}. Such a polyhedron is known to be TDI and box TDI [1].
A submodular set function f : 2^S → IR is extended to a convex function f̂ on IR^S₊ by f̂(p) := max{pᵀx | x ∈ P(f)} (p ∈ IR^S₊) in [6]. This extended function is the Lovász extension of f.
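Both submodularity and membership in P(f) are finite conditions and can be checked by brute force (exponential in |S|; here set functions are dictionaries keyed by frozensets, and the function names are ours):

```python
from itertools import combinations

def powerset(S):
    """All subsets of S as frozensets."""
    S = list(S)
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

def is_submodular(f, S):
    """f(X) + f(Y) >= f(X ∪ Y) + f(X ∩ Y) for all X, Y ⊆ S."""
    P = powerset(S)
    return all(f[X] + f[Y] >= f[X | Y] + f[X & Y] for X in P for Y in P)

def in_P(f, S, x):
    """x ∈ P(f): sum_{e in X} x(e) <= f(X) for every nonempty X ⊆ S."""
    return all(sum(x[e] for e in X) <= f[X] for X in powerset(S) if X)

# Example: the submodular rank-type function f(X) = min(|X|, 2).
S = {'a', 'b', 'c'}
f = {X: min(len(X), 2) for X in powerset(S)}
```

For this f the vector (1, 1, 0) lies in P(f), while (1, 1, 1) violates the constraint for X = S.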
It is obvious from (L2) that the values of the Lovász extension at nonnegative vectors are nonnegative combinations of its values at 01-vectors.
A function g : ZZ^S → ZZ ∪ {+∞} is called L-convex if g satisfies (L1) and (L2) and if g(p) ∈ ZZ for some p ∈ ZZ^S [7]. The class of L-convex functions thereby contains the restriction to ZZ^S of the Lovász extension of any integer-valued submodular set function.
f((1, 1)) := 14, f((2, 1)) := 22, f((1, 2)) := 22. But f fails (L2) and is not the restriction of any L-convex function. (This function f is not the restriction of any L♮-convex function introduced in [4], either.)
Functions on {0, ±1}^S, such as bisubmodular functions and functions determining ternary semimodular polyhedra, correspond to certain functions on {0, 1, 2}^S via a one-to-one mapping between {0, ±1} and {0, 1, 2}. (See [3] for definitions of bisubmodular functions and ternary semimodular polyhedra.) Functions on {0, 1, 2}^S corresponding to bisubmodular functions do not satisfy (G2) with any mapping. On the other hand, there exists a mapping with which functions on {0, 1, 2}^S corresponding to those on {0, ±1}^S determining ternary semimodular polyhedra satisfy (G2). But the greedy-type function f described in the previous paragraph also fails the necessary condition for functions determining ternary semimodular polyhedra.
The focus of this paper is to study polyhedra associated with greedy-type
functions in the following way.
Definition 4 (Greedy-type polyhedra). A polyhedron P ⊆ IR^S is called a greedy-type polyhedron if there exists a greedy-type function f on {0, 1, 2}^S such that P = {x ∈ IR^S | pᵀx ≤ f(p) (p ∈ {0, 1, 2}^S)}. Then the polyhedron P is also denoted by P(f), where f is such a greedy-type function.
We give examples of such polyhedra and polyhedra associated with submod-
ular set functions in Fig. 1.
We write Ir(S)₂ for {0, 1, 2}^S \ {0, 2}^S, which is the set of "irreducible" vectors in {0, 1, 2}^S.
Now let S consist of n elements s₁, ..., s_n. A permutation σ of {1, ..., n} gives a total order ≤_σ on S such that s_{σ(i)} ≤_σ s_{σ(j)} for every i, j with i ≤ j. We write ≤_σ for this total order. For k₁ ∈ {0, 1, ..., n}, we write vec_σ(k₁) for the characteristic vector of the set {s_{σ(i)} ∈ S | 1 ≤ i ≤ k₁}. The symbol vec_σ(k₂, k₁) denotes vec_σ(k₂) + vec_σ(k₁) in what follows. We sometimes omit σ when there is no danger of confusion.
A vector p ∈ {0, 1, 2}^S is simply non-increasing under ≤_σ if p(s_{σ(i)}) ≥ p(s_{σ(j)}) for any i, j ∈ {1, ..., n} with i ≤ j.
With these definitions, we introduce a partial order, the set of whose maximal
chains is the main issue of this section.
Definition 5 (Partial order ⪯₂). We define a partial order ⪯₂ on Ir(S)₂ so that p ⪯₂ q if both p and q are simply non-increasing under some total order on S and if
(a) p ≤ q and supp(p) = supp(q), or,
(b) p ≤ q and supp(p) ⊆ supp²(q).
The check of well-definedness of this partial order is easy and left to the reader.
Here we describe the maximal chains of the poset (Ir(S)₂, ⪯₂).
Lemma 6. Let S be a nonempty set which consists of n elements 1, ..., n. Then a sequence of vectors (p₁, ..., p_k) in Ir(S)₂ is a maximal chain of the partial order ⪯₂ if and only if
(I) the number k of vectors equals n, and,
(II) there exists a total order ≤_σ on S such that
(i) p_n = vec_σ(n − 1, n), and,
(ii) p_i is either vec_σ(i − 1) + χ_{supp(p_{i+1})} or vec_σ(i − 1, i) for i = 1, ..., n − 1.
Proof. Obviously a sequence of n vectors (p₁, ..., p_n) satisfying (I) and (II) makes a chain of the partial order. From Definition 5, two vectors in a chain must have different numbers of 2's. Thus this chain is maximal.
We now prove the "only if" part. It is obvious that any chain of ⪯₂ consists of at most n vectors. Let q₁, q₂ be arbitrary vectors in Ir(S)₂ with q₁ ⪯₂ q₂ and |supp²(q₁)| + 1 < |supp²(q₂)|. Then, for any e ∈ supp²(q₂) \ supp²(q₁), the vector q₂ − χ_e is between q₁ and q₂ with respect to ⪯₂. Thus any maximal chain consists of exactly n vectors. The inclusion relationship of the supp²(p_i)'s determines one unique total order ≤_σ on S under which all p_i's are simply non-increasing. It is obvious that (i) and (ii) must be satisfied for this total order. ⊓⊔
Figure 2 shows the set of maximal chains of the partial order 2 , where |S|
is three.
A maximal chain (p1 , . . . , pn ) of the poset (Ir(S)2 , 2 ) together with a func-
tion on Ir(S)2 determines a point in IRS as the intersection of hyperplanes
pTi x = f (pi ) (i = 1, . . . , n). We state this fact as the following lemma. We
call this point the point corresponding to the maximal chain in what follows.
We would like to show the fact that any point corresponding to a maximal
chain belongs to P (f ) if f is a greedy-type function. Thus such points are vertices
of P (f ). The following two lemmas are used to prove this fact.
Lemma 10. Let f be a greedy-type function on {0, 1, 2}S and let p, q be vec-
tors in {0, 1, 2}S such that either supp1 (p)\supp(q) or supp1 (q)\supp(p) is not
empty. Then f (p) + f (q) > S S
= f ((p + q) ∧ 2χ ) + f (p + q − (p + q) ∧ 2χ ) holds.
Proof. If supp1 (p) ∩ supp1 (q) = ∅, the inequality of the lemma is nothing but
that of (G2). So we may assume that supp1 (p) ∩ supp1 (q) is not empty. Let r
2
denote 2χsupp (p∨q) . By submodularity, we have
f (p ∨ q) + f ((p ∧ q) ∨ r) > X Y X Z
= f (χ + χ ) + f (χ + χ )
= f (χY + χZ ) + f (2χX )
= f ((p + q) ∧ 2χS ) + f (r) . (3)
Now we have reached the point to state the main theorem of this section,
which ensures that every maximal chain of the poset (Ir(S)2 , 2 ) has its corre-
sponding vertex on P (f ) for any greedy-type function f .
Proof. Let n be |S|. Choose an arbitrary maximal chain (p1 , . . . , pn ). Let x ∈ IRS
be the vertex corresponding to the chain. Let < =σ denote the total order on
S under which p1 , . . . , pn are simply non-increasing. We omit the symbol σ
for simplicity. Let P denote the matrix (p1 , . . . , pn )T and b denote the vector
(f (p1 ), . . . , f (pn ))T . In this proof, lst(p) denotes the last coordinate that is not
0, more precisely, min{i ∈ {0, 1, . . . , n} | p(j) = 0 for every j > i}. The symbol
lst2 (p) denotes min{i ∈ {0, 1, . . . , n} | p(j) < 2 for every j > i}.
This theorem is true if x satisfies the linear inequality
pT x <
= f (p) (4)
P P
where a is the only vector in IRS with p = ni=1 a(i)pi . If p + pk = mi qi holds
for (i) some vector pk in the maximal chain, (ii) some nonnegative integers mi ’s,
and (iii) some vectors qi ’s for which (5) is shown, then the following inequality
X
f (p) + f (pk ) >
= mi f (qi ) (6)
implies (5). We thereby prove this theorem by showing (5) or (6) for every
p ∈ {0, 1, 2}S .
In this proof we call a vector p ∈ {0, 1, 2}S hole-less under ≤σ if χsupp(p) =
vecσ (k1 ) for some k1 ∈ {0, 1, . . . , n}. We say that a vector has a hole if it is not
a hole-less vector.
We first show these inequalities for vectors simply non-increasing under
< . Then for hole-less vectors in {0, 1, 2}S and lastly for arbitrary vectors in
=σ
{0, 1, 2}S . It should be noted that (5) holds if the vector p belongs to the chain.
We show (5) or (6) for a simply non-increasing vector p, using induction on
lst(p). Inequality (5) is trivial if lst(p) = 0, that is p = 0 . Suppose that (5)
holds for any simply non-increasing vector p ∈ {0, 1, 2}S with lst(p) < = k − 1. We
examine whether (5) or (6) holds for a vector p ∈ {0, 1, 2}S with lst(p) = k.
We first assume that pk = vec(k − 1, k). We also assume that neither p
nor 12 p belongs to the chain, since (5) is trivial otherwise. By Corollary 7, this
assumption implies that supp(p1 ) ( {1, . . . , k}. Let pl be the last vector in the
chain with supp(pl ) ( {1, . . . , k}. We see from Corollary 7 that pl = vec(l − 1, l),
pl+1 = vec(l, k), and supp2 (p) ⊆ {1, . . . , l − 1} hold. Then p + pl = p − χ{l,... ,k} +
pl+1 holds. By Lemma 10, we have f (p)+f (pl ) > = f (pl+1 )+f (p−χ
{l,... ,k}
). Since
{l,... ,k} {l,... ,k}
p−χ is a simply non-increasing vector with lst(p − χ ) < k, (6) is
proved.
Now we assume that pk = vec(k − 1, j) for some j > k. Then p + pk =
p − χk + pk+1 holds. By Lemma 10, we have f (p) + f (pk ) > = f (p − χk ) + f (pk+1 ).
Since p − χk is a simply non-increasing vector with lst(p − χk ) < k, we have
shown (6).
We show (6) for a hole-less vector p in {0, 1, 2}S , using induction on ||p||. If
||p|| <
= 2, then p is a simply non-increasing vector, for which we have already
shown (5). Suppose that (5) holds for every hole-less vector p ∈ {0, 1, 2}S with
||p|| <
= k − 1. We examine whether (6) holds for a hole-less vector p ∈ {0, 1, 2}S
with ||p|| = k. Let l denote lst (p).
2
pT x < S
= f (p) (p ∈ {0, 1, 2} ) (7)
Proof. Let the set S be {1, . . . , n}. Considering permutations of S, we only have
to show this theorem for simply non-increasing vector c. This theorem is proved
as a corollary of the claim stated in the following paragraph.
Let j be some integer in {1, . . . , n} and let c be a simply non-increasing
vector. Let l denote min{i ∈ {j, . . . , n} | c(i0 ) = 0 (i0 > i)}. If both c(i) = c(j)
for i = j + 1, . . . , l and c(i) = 0 for i = l + 1, . . . , n are satisfied, there exists
a maximal chain (p1 , . . . , pn ) of (Ir(S)2 , 2 ) such that (i) c is a nonnegative
combination of p1 , . . . , pj and (ii) supp(pj ) = {1, . . . , l}. Moreover if c is integral,
the coefficients may be taken from ZZ + .
We use induction on j to prove this claim. If j = 1, the claim is trivial.
Suppose that the claim holds for any integer j with j < = k. We now show the
claim for j = k + 1.
We consider the following two cases according to the vector c. If c(k) < =
c(k+1)−c(k), let c0 denote the vector c−c(k)vec(k, l). And then c0 is nonnegative,
non-increasing, and c0 (i) = 0 for i = k + 1, . . . , n. If c(k) > c(k + 1) − c(k), let c0
denote c − (c(k + 1) − c(k))vec(k, l) And then c0 is nonnegative, non-increasing,
and c0 (i) = c0 (k) for i = k, . . . , l and c0 (i) = 0 for i = l + 1, . . . , n.
In either case, by the assumption of induction, c0 is represented as a nonnega-
tive combination of vectors p1 , . . . , pk , which make a chain of (Ir(S)2 , 2 ). Thus
c can be represented as a nonnegative combination of p1 , . . . , pk , vec(k, l). Since
either supp(pk ) ⊆ supp2 (vec(k, l)) or supp(pk ) = supp(vec(k, l)) holds, vectors
p1 , . . . , pk , vec(k, l) also make a chain.
If c is integral, the coefficient for vec(k, l) is an integer and the other coeffi-
cients are also integral since the vector c0 is also integral.
Lemma 12 follows from the claim for j = n. t
u
Corollary 13. For every vertex of a greedy-type polyhedron in IRS , there exists
a maximal chain of (Ir(S)2 , 2 ) corresponding to the vertex.
298 Kenji Kashiwabara, Masataka Nakamura, and Takashi Takabatake
pT x <
= f (p) (p ∈ {0, 1, 2}S )
Remark 16. Such a system of linear inequalities is not box TDI. Let f be a
greedy-type function derived from the polyhedron {x ∈ IR{s1 ,s2 ,s3 } | x(s1 ) < =
2, x(s2 ) < < <
= 4, x(s3 ) = 1, 2x(s1 ) + x(s2 ) + 2x(s3 ) = 6}. Adding inequalities corre-
sponding to the box (0, 0, 0)T < <
= x = (2, 4, 1) to the system of linear inequalities
T
T < {s1,s2 ,s3 }
p x = f (p) (p ∈ {0, 1, 2} ) makes a system of inequalities which is not
TDI. For example, the objective vector (3, 2, 1)T is maximized by the vertex
(1, 4, 0), while (3, 2, 1)T is not any nonnegative integral combination of vectors
(2, 1, 2)T , (0, 0, −1)T , (0, 1, 0)T , which correspond to the vertex (1, 4, 0). (See
Fig 3.)
In the end of this section, we propose a dual greedy algorithm which solves
the LP problem max{cT x | pT x < S
= f (p) (p ∈ {0, 1, 2} )} for a greedy-type
function f : {0, 1, 2} → IR and a nonnegative objective vector c ∈ IRS+ . Readers
S
may easily see that Algorithm 17 gives a representation of the objective vector
as a nonnegative linear combination of vectors in a maximal chain. Then the
validity of Algorithm 17 follows from duality theorem of linear programming
and Theorem 11. (It is also proved in [11].)
Input A greedy-type function f : {0, 1, 2}S → IR, where S is a finite set with n
elements. A nonnegative objective vector c ∈ IRS+ .
Output The maximum value of cT x among x ∈ P (f ). A vertex x ∈ P (f ) which
gives the maximum value.
Step1 Sort the support S = {1, 2, . . . , n} so that c(i) > < <
= c(i + 1) (1 = i = n − 1).
Integral Polyhedra Associated with Certain Submodular Functions 299
s2
(0,4,1)
1
4 s3
(2,0,1)
0
(2,1,2) x < 6 (1,4,0)
2 (2,2,0)
s1
(2,4,−1)
for i = 1, . . . , n.
For an arbitrary vector p in {0, 1, 2}S , let argmin(p, γ) denote an element l
of S that satisfies p(l)/γ(l) <
= p(i)/γ(i) for any i ∈ S. (If there exists more than
one such elements, any choice may be possible for the following argument.) We
see that f (p) is attained by the vertex corresponding to vectors in
p(argmin(p,γ))
We write cntrb(p) for γ(argmin(p,γ)) in this proof.
Then we have
f (p) + f (q) − f (p ∨ q) − f (p ∧ q) =
X
(cntrb(p) + cntrb(q) − cntrb(p ∨ q) − cntrb(p ∧ q))[ γ(j)f (χj ) − f (γ)] .
j∈S
Integral Polyhedra Associated with Certain Submodular Functions 301
P
It is clear that j∈S γ(j)f (χj ) − f (γ) > >
= 0 and cntrb(p) + cntrb(q) = cntrb(p ∨
q) + cntrb(p ∧ q) hold. Thus (G2) is proved and (G3) follows from a similar
argument.
Since the normal vectors of P ’s facets are 012-vectors, we see that P = P (f )
holds. t
u
Now all we have to prove is that the sum of greedy-type polyhedra is the
polyhedron associated with the sum of those functions, which is also a greedy-
type function.
Lemma 20. Let S be a nonempty finite set. Let f and g be greedy-type functions
from {0, 1, 2}S to IR. Then the vector sum of P (f ) and P (g) coincides with
P (f + g).
Proof. Let S contain n elements. Since it is obvious that P (f +g) ⊇ P (f )+P (g),
all we have to show is that P (f + g) ⊆ P (f ) + P (g).
To prove our claim, we show that δP∗ (f +g) (p) > ∗
= δP (f )+P (g) (p) holds for any
vector p ∈ IRS . This inequality is trivial if p 6∈ IRS+ . (Then both sides are equal
to +∞.) Let q be an arbitrary vector in IRS+ . By Lemma 12, q is a nonnegative
combination of vectors p1 , . . . , pn that compose a maximal chain of (Ir(S)2 , 2 ).
Let P denote the n × n-matrix that has pi as its ith row vector for i = 1, . . . , n.
Then q T = y T P for some nonnegative vector y ∈ IRn+ . Let b1 (b2 ) be the n
dimensional vector that has f (pi ) (g(pi )) as its ith component, respectively. Since
the point corresponding to the chain belongs to P (f + g), we have δP∗ (f +g) (q) =
y T (b1 + b2 ) by the duality theorem of linear programming. Since f and g are
greedy-type, P (f ) contains P −1 b1 and P (g) contains P −1 b2 . The polyhedron
P (f ) + P (g) thereby contains P −1 (b1 + b2 ). Then δP∗ (f )+P (g) (q) > T −1
= q P (b1 +
b2 ) = y T (b1 + b2 ) = δP∗ (f +g) (q), which we have tried to show, holds. t
u
Theorem 21. Let P be the output polyhedron of a bipartite network with {1, 2}-
gain and let f be the {0, 1, 2}-support function of P . Then f is a greedy-type
function and P (f ) = P holds.
From Lemma 20, we have Theorem 22, which is a kind of separation theorem.
Theorem 22. Let S be a nonempty finite set. Let f and g be functions from
{0, 1, 2}S to IR such that f and −g are greedy-type. If f (p) >
= g(p) holds for every
p ∈ {0, 1, 2}S , there exists a vector h ∈ IRS which satisfies f (p) > T >
= h p = g(p)
S
for every p ∈ {0, 1, 2} .
Remark 23. Property (G3) is essential for Theorem 22. We give two functions
defined on {0, 1, 2}S that satisfy (G1) and (G2) and that are not separated by
any linear functions.
Let f be the 012-support of the polytope
References
[1] Edmonds, J., Giles, R.: A min-max relation for submodular functions on graphs.
Annals of Discrete Mathematics 1 (1977) 185–204
Integral Polyhedra Associated with Certain Submodular Functions 303
[2] Frank, A.: An algorithm for submodular functions on graphs. Annals of Discrete
Mathematics 16 (1982) 97–120
[3] Fujishige, S.: Submodular Functions and Optimization. Annal of Discrete Math-
ematics 47 (1991)
[4] Fujishige, S., Murota., K.: On the relationship between L-convex functions and
submodular integrally convex functions. RIMS Preprint 1152, Research Institute
for Mathematical Sciences, Kyoto University (1997)
[5] Kashiwabara, K.: Set Functions and Polyhedra. PhD thesis, Tokyo Institute of
Technology (1998)
[6] Lovász, L.: Submodular functions and convexity. in Grötschel M,. Bachem, A.,
Korte B. (eds.), Mathematical Programming — The State of the Art. Springer-
Verlag, Berlin (1983) 235–257
[7] Murota, K.: Discrete convex analysis. Mathematical Programming. 83 (1998) 313–
371
[8] Rockafellar, R.T.: Convex Analysis. Princeton University Press (1970)
[9] Scrijver, A.: Total dual integrality from directed graphs, crossing families, and
sub- and supermodular functions. in Pulleyblank, W.R. (ed.), Progress in Com-
binatorial Optimization. Academic Press (1984) 315–361
[10] Schrijver, A.: Theory of Linear and Integer Programming. Wiley (1986)
[11] Takabatake, T.: Generalizations of Submodular Functions and Delta-Matroids.
PhD thesis, Universtiy of Tokyo (1998)
Optimal Compaction of
Orthogonal Grid Drawings?
(Extended Abstract)
1 Introduction
The compaction problem has been one of the challenging tasks in vlsi–design.
The goal is to minimize the area or total edge length of the circuit layout while
mainly preserving its shape. In graph drawing the compaction problem also plays
an important role. Orthogonal grid drawings, in particular the ones produced by
the algorithms based on bend minimization (e.g., [16, 6, 10]), suffer from missing
compaction algorithms. In orthogonal grid drawings every edge is represented
as a chain of horizontal and vertical segments; moreover, the vertices and bends
are placed on grid points. Two orthogonal drawings have the same shape if one
can be obtained from the other by modifying the lengths of the horizontal and
?
This work is partially supported by the Bundesministerium für Bildung, Wissen-
schaft, Forschung und Technologie (No. 03–MU7MP1–4).
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 304–319, 1999.
c Springer-Verlag Berlin Heidelberg 1999
Optimal Compaction of Orthogonal Grid Drawings 305
vertical segments without changing the angles formed by them. The orthogo-
nal drawing standard is in particular suitable for Entity–Relationship diagrams,
state charts, and pert–diagrams with applications in data bases, case–tools,
algorithm engineering, work–flow management and many more.
So far, in graph drawing only heuristics have been used for compacting or-
thogonal grid drawings. Tamassia [16] suggested “refining” the shape of the
drawing into one with rectangular faces by introducing artificial edges. If all the
faces are rectangles, the compaction problem can be solved in polynomial time
using minimum–cost network flow algorithms. In general, however, the solution
is far from the optimal solution for the original graph without the artificial edges
(see also Sect. 4). Other heuristics are based on the idea of iteratively fixing the
x– and y–coordinates followed by a one–dimensional compaction step. In one–
dimensional compaction, the goal is to minimize the width or height of the layout
while preserving the coordinates of the fixed dimension. It can be solved in poly-
nomial time using the so–called layout graphs. The differences of the methods are
illustrated in Fig. 1. It shows the output of four different compaction methods
for a given graph with fixed shape. The big vertices are internally represented
as a face that must have rectangular shape. Figure 1(a) illustrates the method
proposed in [16]. Figures 1(b) and 1(c) show the output of two one–dimensional
compaction strategies applied to the drawing in Fig. 1(a). Both methods use the
underlying layout graphs, the first algorithm is based on longest paths computa-
tions, the second on computing maximum flows. Figure 1(d) shows an optimally
compacted drawing computed by our algorithm.
The compaction problems in vlsi–design and in graph drawing are closely
related but are not the same. Most versions of the compaction problem in vlsi–
design and the compaction problem in graph drawing are proven to be NP–
hard [11, 14]. Research in vlsi–design has concentrated on one–dimensional
methods; only two papers have suggested optimal algorithms for two–dimensio-
nal compaction (see [15, 8]). The idea is based on a (0, 1)–non–linear integer
programming formulation of the problem and solving it via branch–and–bound.
Unfortunately, the methods have not been efficient enough for practical use.
Lengauer states the following: “The difficulty of two–dimensional compaction
lies in determining how the two dimensions of the layout must interact to mini-
mize the area” [11, p. 581]. The two–dimensional compaction method presented
in this paper exactly attacks this point. We provide a necessary and sufficient
condition for all feasible solutions of a given instance of the compaction problem.
This condition is based on existing paths in the so–called constraint graphs in
x– and y–direction. These graphs are similar to the layout graphs known from
one–dimensional compaction methods. The layout graphs, however, are based
on visibility properties whereas the constraint graphs arise from the shape of
the drawing.
Let us describe our idea more precisely: The two constraint graphs Dh and
Dv specify the shape of the given orthogonal drawing. We characterize exactly
those extensions of the constraint graphs which belong to feasible orthogonal
grid drawings. The task is to extend the given constraint graphs to a complete
306 Gunnar W. Klau and Petra Mutzel
pair of constraint graphs by adding new arcs for which the necessary and suffi-
cient condition is satisfied and the total edge–length of the layout is minimized.
Hence, we have transformed the geometrical problem into a graph–theoretical
one. Furthermore, we can detect constructively those instances having only one
complete extension. For these cases, we can solve the compaction problem in
polynomial time.
We formulate the resulting problem as an integer linear program that can be
solved via branch–and–cut or branch–and–bound methods. Our computational
results on a benchmark set of 11,582 graphs [4] have shown that we are able
to solve the two–dimensional compaction problem for all the instances in short
computation time. Furthermore, they have shown that often it is worthwhile
looking for the optimally compacted drawing. The total edge lengths have been
improved up to 37, 0% and 65, 4% as compared to iterated one–dimensional
compaction and the method proposed in [16].
Optimal Compaction of Orthogonal Grid Drawings 307
The shape of a simple drawing is given by the angles inside the faces, i.e.,
the angles occurring at consecutive edges of a face cycle. Note that the notion
of shape induces a partitioning of drawings in equivalence classes. Often, the
shape of an orthogonal drawing is given by a so–called orthogonal representation
H. Formally, for a simple orthogonal drawing, H is a function from the set of
faces F to clockwise ordered lists of tuples (er , ar ) where er is an edge, and ar is
308 Gunnar W. Klau and Petra Mutzel
the angle formed with the following edge inside the appropriate face. Using the
definitions, we can specify the compaction problem:
Definition 1. The compaction problem for orthogonal drawings (cod) is stated
by the following: Given a simple orthogonal grid drawing Γ with orthogonal rep-
resentation H, find a drawing Γ 0 with orthogonal representation H of minimum
total edge length.
The compaction problem (cod) has been shown to be NP–hard [14]. So far,
in practice, one–dimensional graph–based compaction methods have been used.
But the results are often not satisfying. Figure 3(a) shows an orthogonal drawing
with total edge length 2k+5 which cannot be improved by one–dimensional com-
paction methods. The reason for this lies in the fact that one–dimensional com-
paction methods are based on visibility properties. An exact two–dimensional
compaction method computes Fig. 3(b) in which the total edge length is only
k + 6.
Let Γ be a simple orthogonal grid drawing of a graph G = (V, E). It induces a
partition of the set of edges E into the horizontal set Eh and the vertical set Ev .
A horizontal (resp. vertical) subsegment in Γ is a connected component in (V, Eh )
(resp. (V, Ev )). If the component is maximally connected it is also referred to
as a segment. We denote the set of horizontal and vertical subsegments by Sh ,
and Sv resp., and the sets of segments by Sh ⊆ Sh , Sv ⊆ Sv . We further define
S = Sh ∪ Sv and S = Sh ∪ Sv . The following observations are of interest (see also
Fig. 4).
1. Each edge is a subsegment, i.e., Eh ⊆ Sh , Ev ⊆ Sv .
2. Each vertex v belongs to one horizontal and one vertical segment, denoted
by hor(v) and vert(v).
3. Each subsegment s is contained in exactly one segment, referred to as seg(s).
4. The limits of a subsegment s are given as follows: Let vl , vr , vb , and vt be
the leftmost, rightmost, bottommost, and topmost vertices on s. Then l(s) =
vert(vl ), r(s) = vert(vr ), b(s) = hor(vb ), and t(s) = hor(vt ).
The following lemma implies that the total number of segments is 2|V | − |E|.
Lemma 1. (proof omitted) Let Γ be a simple orthogonal grid drawing of a graph
G = (V, Eh ∪ Ev ). Then |Sh | = |V | − |Eh | and |Sv | = |V | − |Ev |.
| {z }
k edges
(a) (b)
v3 v4 Sh = {s1 , s2 , s3 } Sv = {s4 , s5 }
s3
s5 i V (si ) E(si ) l(si ) r(si ) b(si ) t(si )
s2 1 {v1 } {} s4 s4 s1 s1
v2 v5
2 {v2 , v5 } {(v2 , v5 )} s4 s5 s2 s2
s4 3 {v3 , v4 } {(v3 , v4 )} s4 s5 s3 s3
4 {v1 , v2 , v3 } {(v1 , v2 ), (v2 , v3 )} s4 s4 s1 s3
v1 s1 5 {v4 , v5 } {(v4 , v5 )} s5 s5 s2 s3
The two digraphs characterize known relationships between the segments that
must hold in any drawing of the graph because of its shape properties. Let
a = (si , sj ) be an arc in Ah ∪ Av . If a ∈ Av then the horizontal segment si
must be placed at least one grid unit below segment sj . For vertical segments
the arc a ∈ Ah expresses the fact that si must be to the left of sj . Each arc
is defined by at least one edge in E. Clearly, each vertical edge determines the
relative position of two horizontal segments and vice versa. Figure 5 illustrates
the shape description of a simple orthogonal grid drawing.
s4 s5
s6 s8 s3
s9
s2
s7
s1
∗
For two vertices v and w we use the notation v −→ w if there is a path from
v to w. Shape descriptions have the following property:
Proof. We prove the lemma for horizontal subsegments. The proof for vertical
subsegments is similar. By definition, the lemma holds for edges. Let s be a
horizontal subsegment consisting of k consecutive edges e1 , . . . , ek . With l(e1 ) =
l(s), r(ek ) = r(s) and (l(ei ), r(ei )) ∈ Ah and b(s) = t(s) = seg(s) the result
follows. t
u
In an orthogonal drawing at least one of the four conditions must be satisfied for
any such pair (si , sj ). The following two observations show that we only need to
consider separated segments of opposite direction (proofs omitted).
Observation 2. Assume that the arcs between the segments form an acyclic
graph. Then the following is true: All segment pairs are separated if and only if
all segment pairs of opposite directions are separated.
The following lemma shows that we can restrict our focus to separated seg-
ments that share a common face. For a face f , we write S(f ) for the segments
containing the horizontal and vertical edges on the boundary of f .
Lemma 3. All segment pairs are separated if and only if for every face f the
segment pairs (si , sj ) ∈ S(f ) × S(f ) are separated.
We will see next that any shape description σ = h(Sv , Ah ), (Sh , Av )i can be
extended so that the resulting constraint graphs correspond to a feasible orthog-
onal planar drawing. We give a characterization of these complete extensions in
terms of properties of their constraint graphs.
1. Ah ⊆ Bh , Av ⊆ Bv .
2. Bh and Bv are acyclic.
3. Every segment pair is separated.
The following theorem characterizes the set of feasible solutions for the com-
paction problem.
Proof. To prove the first part of the theorem, we consider a simple orthogonal
grid drawing Γ with shape description σ = h(Sv , Ah ), (Sh , Av )i. Let c(si ) denote
the fixed coordinate for segment si ∈ Sh ∪Sv . We construct a complete extension
τ = h(Sv , Bh ), (Sh , Bv )i for σ as follows: Bh = {(si , sj ) ∈ Sv ×Sv | c(si ) < c(sj )},
i.e., we insert an arc from every vertical segment to each vertical segment lying
to the right of si . Similarly, we construct the set Bv . Clearly, we have Ah ⊆ Bh
and Av ⊆ Bv . We show the completeness by contradiction: Assume first that
there is some pair (si , sj ) which is not separated. According to the construction
this is only possible if the segments cross in Γ , which is a contradiction. Now
assume that there is a cycle in one of the arc sets. Again, the construction of Bh
and Bv forbids this case. Hence τ is a complete extension of σ.
We give a constructive proof for the second part of the theorem by specifying
a simple orthogonal grid drawing for the complete extension τ . To accomplish
this task we need to assign lengths to the segments. A length assignment for a
complete extension of a shape description τ = h(Sv , Bh ), (Sh , Bv )i is a function
c : Sv ∪ Sh → IN with the property (si , sj ) ∈ Bh ∪ Bv ⇒ c(si ) < c(sj ). Given τ ,
such a function can be computed using any topological sorting algorithm in the
acyclic graphs in τ , e.g., longest paths or maximum flow algorithms in the dual
graph. For a fixed length assignment, the following simple and straightforward
method assigns coordinates to the vertices. Let x ∈ INV and y ∈ INV be the
coordinate vectors. Then simply setting xv = c(vert(v)) and yv = c(hor(v)) for
every vertex v ∈ V results in a correct grid drawing. The following points have
to be verified:
312 Gunnar W. Klau and Petra Mutzel
t(sj )
sj
l(si ) r(si ) l(sj ) sj r(sj )
si
Fig. 7. Three types of shape descriptions. Dotted lines show the orthogonal grid
drawings, thin arrows arcs in shape descriptions and thick gray arcs possible
completions.
X X
(ILP) min cr(e) − cl(e) + ct(e) − cb(e) subject to
e∈Eh e∈Ev
The objective function expresses the total edge length in a drawing for G.
Note that the formulation also captures the related problem of minimizing the
length of the longest edge in a drawing. In this case, the constraints cr(e) −cl(e) ≤
lmax or ct(e) − cb(e) ≤ lmax must be added for each edge e and the objective
function must be substituted by min lmax . Furthermore, it is possible to give
314 Gunnar W. Klau and Petra Mutzel
each edge an individual weight in the objective function. In this manner, edges
with higher values are considered more important and will preferably be assigned
a shorter length.
We first give an informal motivation of the three different types of constraints
and then show that any feasible solution of the ILP formulation indeed corre-
sponds to an orthogonal grid drawing.
(1.1) Shape constraints. We are looking for an extension of shape description σ.
Since any extension must contain the arc sets of σ, the appropriate entries
of x must be set to 1.
(1.2) Completeness constraints. This set of constraints guarantees completeness
of the extension. The respective inequalities model Definition 3.3.
(1.3) Consistency constraints. The vector c corresponds to the length assignment
and thus must fulfill the property (si , sj ) ∈ Bh ∪ Bv ⇒ c(si ) < c(sj ). If
xij = 1, then the inequality reads cj − ci ≥ 1, realizing the property for the
arc (i, j). The case xij = 0 yields ci − cj ≤ M which is true if M is set to
the maximum of the width and the height of Γ .
The following observation shows that we do not need to require integrality
for the variable vector c. The subsequent lemma motivates the fact that no
additional constraints forbidding the cycles are necessary.
Observation 3. Let (x, cf ) with x ∈ {0, 1}|C| and cf ∈ Q|Sh ∪Sv | be a feasible
solution for (ILP) and let zf be the value of the objective function. Then there is
a feasible solution (x, c) with c ∈ IN|Sh ∪Sv | and objective function value z ≤ zf .
Proof. Since x is part of a feasible solution, its components must be either zero
or one. Then (ILP) reads as the dual of a maximum flow problem with integer
capacities. It follows that there is an optimal integer solution [2].
Lemma 4. Let (x, c) be a feasible solution for (ILP) and let Dh and Dv be the
digraphs corresponding to x. Then Dh and Dv are acyclic.
Proof. Assume without loss of generality that there is a cycle (si1 , si2 , . . . , sik ) of
length k in Dh . This implies xi1 ,i2 = xi2 ,i3 = . . . = xik ,i1 = 1 and the appropriate
consistency constraints read
ci1 − ci2 + (M + 1) ≤ M
ci2 − ci3 + (M + 1) ≤ M
..
.
cik − ci1 + (M + 1) ≤ M .
Summed up, this yields k · (M + 1) ≤ k · M , a contradiction. t
u
Theorem 2. For each feasible solution (x, c) of (ILP) for a shape description
σ, there is a simple orthogonal grid drawing Γ whose shape corresponds to σ
and vice versa. The total edge length of Γ is equal to the value of the objective
function.
Optimal Compaction of Orthogonal Grid Drawings 315
Proof. For the first part of the proof, let x and c be the solution vectors of (ILP).
According to Observation 3 we can assume that both vectors are integer. Vector
x describes a complete extension τ of σ; the extension property is guaranteed
by constraints 1.1, completeness by constraints 1.2 and acyclicity according to
Lemma 4. The consistency constraints 1.3 require c to be a length assignment
for τ . The result follows with Theorem 1.
Again, we give a constructive proof for the second part: Theorem 1 allows us
to use Γ for the construction of a complete extension τ for σ. Setting ci to the
fixed coordinate of segment si and xij to 1 if the arc (si , sj ) is contained in τ and
to 0 otherwise, results in a feasible solution for the ILP. Evidently, the bounds
and integrality requirements for c and x are not violated and Constraints 1.1
and 1.2 are clearly satisfied. To show that the consistency constraints hold, we
consider two arbitrary vertical segments si and sj . Two cases may occur: Assume
first that si is to the left of sj . Then the corresponding constraint reads cj −ci ≥ 1
which is true since c(j) > c(i) and the values are integer. Now suppose that si is
not to the left of sj . In this case the constraint becomes ci −cj ≤ M which is true
for a sufficiently large M . A similar discussion applies to horizontal segments.
Obviously, the value of the objective function is equal to the total edge length
in both directions of the proof. t
u
Proof. Note that any optimal drawing Γ has width w ≤ |Sv | and height h ≤ |Sh |,
otherwise a one–dimensional compaction step could be applied. M has to be big
enough to “disable” Constraints 1.3 if the corresponding entry of x is zero; it
has to be an upper bound for the distance between any pair of segments. Setting
it to the size of the bigger of the two sets fulfills this requirement. t
u
Our implementation which we will refer to as opt throughout this section, solves
the integer linear program described in Sect. 3. It is realized as a compaction
module inside the AGD–library [3, 1] and is written in C++ using LEDA [13, 12].
In a preprocessing phase, the given shape description σ is completed as far as
possible. Starting with a list L of all segment pairs, we remove a pair (si , sj ) from
L if it either fulfills the definition of separation or there is only one possibility
to meet this requirement. In the latter case we insert the appropriate arc and
proceed the iteration. The process stops if no such pair can be found in L. If the
list is empty at the end of the preprocessing phase, we can solve the compaction
problem in polynomial time by optimizing over the integral polyhedron given by
the corresponding inequalities.
316 Gunnar W. Klau and Petra Mutzel
1dim and orig. We computed the minimal, maximal and average improvement
for the graphs of the same size. The average improvement values are quite inde-
pendent from the graph size, and the minimum and maximum values converge
to them with increasing number of vertices. Note that opt yields in some cases
improvements of more than 30% in comparison to the previously best strategy;
Fig. 1 in Sect. 1 shows the drawings for such a case (here, the improvement is
28%). For the area, the data look similar, with the restriction that in a few cases
the values produced by 1dim are slightly better than those from opt; short
edge length does not necessarily lead to low area consumption. The average area
improvements compared to 1dim and orig are 2.7% and 29.3%, however.
In general, we could make the following observations: Instances of the com-
paction problem cod divide into easy and hard problems, depending on the
structure of the corresponding graphs. On the one hand, we are able to solve
some randomly generated instances of biconnected planar graphs with 1,000
vertices in less than five seconds. In these cases, however, the improvement com-
pared to the results computed by 1dim is small. On the other hand, graphs
containing tree–like structures have proven to be hard to compact, since their
number of fundamentally different drawings is in general very high. For these
cases, however, the improvement is much higher.
We have introduced the constraint graphs describing the shape of a simple or-
thogonal grid drawing. Furthermore, we have established a direct connection
between these graphs by defining complete extensions of the constraint graphs
that satisfy certain connectivity requirements in both graphs. We have shown
that complete extensions characterize the set of feasible drawings with the given
shape. For a given complete extension we can solve the compaction problem cod
in polynomial time. The graph–theoretical characterization allows us to formu-
late cod as an integer linear program. The preprocessing phase of our algorithm
detects those instances having only one complete extension and, for them, the
optimal algorithm runs in polynomial time. Our experiments show that the re-
sulting ILP can be solved within short computation time for instances as big as
1,000 vertices.
There are still open questions concerning the two–dimensional compaction
problem: We are not satisfied having the ‘big’ M in our integer linear program-
ming formulation. Our future plans include the extensive study of the 0/1–
polytope Pcod that characterizes the set of complete extensions. We hope that
the results will lead to a running time improvement of our branch–and–cut al-
gorithm. Independently, Bridgeman et al. [5] have developed an approach based
on the upward planar properties of the layout graphs. We want to investigate a
possible integration of their ideas. Furthermore, we plan to adapt our method
to variations of the two–dimensional compaction problem.
[Two plots: improvement in % (y-axis, scales up to 40% and 60%) against the number of vertices (x-axis, 50–300).]
References
[1] AGD. AGD User Manual. Max-Planck-Institut Saarbrücken, Universität Halle,
Universität Köln, 1998. http://www.mpi-sb.mpg.de/AGD.
[2] R. K. Ahuja, T. L. Magnanti, and J. B. Orlin. Network Flows: Theory, Algorithms,
and Applications. Prentice Hall, Englewood Cliffs, NJ, 1993.
[3] D. Alberts, C. Gutwenger, P. Mutzel, and S. Näher. AGD–Library: A library of
algorithms for graph drawing. In G. Italiano and S. Orlando, editors, WAE ’97
(Proc. of the Workshop on Algorithm Engineering), Venice, Italy, Sept. 11-13,
1997. http://www.dsi.unive.it/~wae97.
[4] G. D. Battista, A. Garg, G. Liotta, R. Tamassia, E. Tassinari, and F. Vargiu.
An experimental comparison of four graph drawing algorithms. CGTA: Compu-
tational Geometry: Theory and Applications, 7:303 – 316, 1997.
[5] S. Bridgeman, G. Di Battista, W. Didimo, G. Liotta, R. Tamassia, and L. Vis-
mara. Turn–regularity and optimal area drawings of orthogonal representations.
Technical report, Dipartimento di Informatica e Automazione, Università degli
Studi di Roma Tre, 1999. To appear.
[6] U. Fößmeier and M. Kaufmann. Drawing high degree graphs with low bend
numbers. In F. J. Brandenburg, editor, Graph Drawing (Proc. GD ’95), volume
1027 of Lecture Notes in Computer Science, pages 254–266. Springer-Verlag, 1996.
[7] M. Jünger and S. Thienel. Introduction to ABACUS - A Branch-And-CUt System.
Operations Research Letters, 22:83–95, March 1998.
[8] G. Kedem and H. Watanabe. Graph optimization techniques for IC–layout and
compaction. IEEE Transact. Comp.-Aided Design of Integrated Circuits and Sys-
tems, CAD-3 (1):12–20, 1984.
[9] G. W. Klau and P. Mutzel. Optimal compaction of orthogonal grid draw-
ings. Technical Report MPI–I–98–1–031, Max–Planck–Institut für Informatik,
Saarbrücken, December 1998.
[10] G. W. Klau and P. Mutzel. Quasi–orthogonal drawing of planar graphs. Technical
Report MPI–I–98–1–013, Max–Planck–Institut für Informatik, Saarbrücken, May
1998.
[11] T. Lengauer. Combinatorial Algorithms for Integrated Circuit Layout. John Wiley
& Sons, New York, 1990.
[12] K. Mehlhorn, S. Näher, M. Seel, and C. Uhrig. LEDA Manual Version
3.7.1. Technical report, Max-Planck-Institut für Informatik, 1998. http://
www.mpi-sb.mpg.de/LEDA.
[13] K. Mehlhorn and S. Näher. LEDA: A platform for combinatorial and geometric
computing. Communications of the ACM, 38(1):96–102, 1995.
[14] M. Patrignani. On the complexity of orthogonal compaction. Technical Report
RT–DIA–39–99, Dipartimento di Informatica e Automazione, Università degli
Studi di Roma Tre, January 1999.
[15] M. Schlag, Y.-Z. Liao, and C. K. Wong. An algorithm for optimal two–dimensional
compaction of VLSI layouts. Integration, the VLSI Journal, 1:179–209, 1983.
[16] R. Tamassia. On embedding a graph in the grid with the minimum number of
bends. SIAM J. Comput., 16(3):421–444, 1987.
On the Number of Iterations for Dantzig-Wolfe
Optimization and Packing-Covering
Approximation Algorithms
1 Introduction
We start with definitions given by Plotkin, Shmoys, and Tardos [16]. Given
A ∈ IR^{m×n}, b ∈ IR^m and a polytope P ⊆ IR^n, the fractional packing problem is
to find an x ∈ P such that Ax ≤ b if such an x exists. An ε-approximate solution
to this problem is an x ∈ P such that Ax ≤ (1 + ε)b. An ε-relaxed decision
procedure always finds an ε-approximate solution if an exact solution exists.
A Dantzig-Wolfe-type algorithm for a fractional packing problem x ∈
P, Ax ≤ b is an algorithm that accesses P only by queries to P of the following
form: “given a vector c, what is an x ∈ P minimizing c · x?”
There are Dantzig-Wolfe-type ε-relaxed decision procedures (e.g. [16]) that
require O(ρ ε⁻² log m) queries to P , where ρ is the width of the problem instance,
defined as follows:
ρ(A, P ) = max_{x∈P} max_i Ai · x / bi

where Ai denotes the i-th row of A.
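When P is the n-simplex, the inner maximum of the linear function Ai · x over P is attained at a vertex, i.e. at a unit vector, so the width reduces to max_i max_j A_ij / bi. A minimal sketch under that assumption:

```python
def width_over_simplex(A, b):
    """Width rho(A, P) = max_{x in P} max_i (A_i . x) / b_i when P is the
    n-simplex: the maximum of A_i . x over the simplex is attained at a
    unit vector, i.e. at the largest entry of row A_i."""
    return max(max(row) / bi for row, bi in zip(A, b))
```

For example, `width_over_simplex([[1.0, 0.0], [0.0, 2.0]], [1.0, 1.0])` gives 2.0.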
In this paper we give a natural probability distribution of fractional packing
instances such that, for an instance chosen at random, with probability 1 − o(1)
any Dantzig-Wolfe-type ε-relaxed procedure must make at least Ω(ρ ε⁻² log m)
queries to P . This lower bound matches the aforementioned upper bound, pro-
viding evidence that the unfortunate linear dependence of the running times of
these algorithms on the width and on ε⁻² is an inherent aspect of the Dantzig-
Wolfe approach.
The specific probability distribution we study here is as follows. Given m and
ρ, let A be a random {0, 1}-matrix with m rows and n = √m columns, where
each entry of A has probability 1/ρ of being 1. Let P be the n-simplex, and let
b be the m-vector whose every entry is some v, where v is as small as possible
so that Ax ≤ b for some x ∈ P .
The class of Dantzig-Wolfe-type algorithms encompasses algorithms and al-
gorithmic methods that have been actively studied since the 1950’s through the
current time, including:
* Research supported by NSF Grant CCR-9700146.
** Research supported by NSF CAREER award CCR-9720664.
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 320–327, 1999.
© Springer-Verlag Berlin Heidelberg 1999
where Ai denotes the i-th row of A and x ranges over the n-vectors with non-
negative entries summing to 1.

V (B) > (1 + c ε̄)V (A)
Corollary 1. Let m ∈ IN, ρ > 2, and ε > 0 be given such that ρ ε⁻² = O(m^{0.5−δ})
for some constant δ > 0.
For p = 1/ρ, and n = m^{0.5}, let A be a random {0, 1} m × n matrix as in the
theorem. Let b denote the m-element vector whose every element is V (A). Let
P = {x ∈ R^n : x ≥ 0, Σ_i xi = 1} be the n-simplex.
Then with probability 1 − o(1), the fractional packing problem instance x ∈
P, Ax ≤ b has width O(ρ), and any Dantzig-Wolfe-type ε-relaxed decision pro-
cedure requires at least Ω(ρ ε⁻² log m) queries to P when given the instance as
input.
Assuming the theorem for a moment, we prove the corollary. Suppose that
the matrix A indeed has the two properties that hold with probability 1 − o(1)
according to the theorem. It follows from the definition of V (A) that there exists
x∗ ∈ P such that Ax∗ ≤ b. That is, there exists a (non-approximate) solution
to the fractional packing problem.
To bound the width, let x̄ be any vector in P . By definition of P and A, for
any row Aj of A we have Aj · x̄ ≤ 1. On the other hand, from the theorem we
know that V (A) = Ω(p) = Ω(1/ρ). Since bj = V (A), it follows that Aj · x̄/bj is
O(ρ). Since this is true for every j and x̄ ∈ P , this bounds the width.
Now consider any Dantzig-Wolfe-type ε-relaxed decision procedure. Suppose
for a contradiction that it makes no more than s ≤ ρ(ε/c)⁻² ln m calls to the
oracle that optimizes over P . In each of these calls, the oracle returns a vertex
of P , i.e. a vector of the form

(0, 0, . . . , 0, 1, 0, . . . , 0, 0)

Let S be the set of vertices returned, and let P (S) be the convex hull of these
vertices. Every vector in P (S) has at most s non-zero entries, for its only non-
zero entries can occur in positions for which there is a vector in S having a 1 in
that position. Hence, by the theorem with ε̄ = ε/c, there is no vector x ∈ P (S)
that satisfies Ax ≤ (1 + ε)b.
3 Proof of Theorem 1
For any m-row n-column matrix A, define the value of A (considered as a two-
player zero-sum matrix game) to be
V (A) = min_x max_{1≤i≤m} Ai · x
where Ai denotes the ith row of A and x ranges over the n-vectors with non-
negative entries summing to 1.
Before we give the proof of Theorem 1, we introduce some simple tools for
reasoning about V (X) for a random {0, 1} matrix X.
By the definition of V , V (X) is at most the maximum, over all rows, of the
average of the row’s entries. Suppose each entry in X is 1 with probability q, and
within any row of X the entries are independent. Then for any δ with 0 < δ < 1, a
standard Chernoff bound implies that the probability that a given row’s average
exceeds (1 + δ)q is exp(−Θ(δ 2 qnX )), where nX is the number of columns of
X. Thus, by a naive union bound Pr[V (X) ≥ (1 + δ)q] ≤ mX exp(−Θ(δ 2 qnX ))
where mX is the number of rows of X. For convenience we rewrite this bound
as follows. For any q ∈ [0, 1] and β ∈ (0, 1], assuming mX /β → ∞,

Pr[V (X) ≥ (1 + δ)q] = o(β) for some δ = O(√(ln(mX /β)/(q nX ))). (1)
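The naive upper bound (1) is easy to check empirically: for a random {0, 1} matrix, the maximum row average rarely exceeds (1 + δ)q once δ is a few standard deviations. A Monte Carlo sketch (parameter values are illustrative):

```python
import random

def max_row_average_exceeds(m, n, q, delta, trials, seed=0):
    """Estimate Pr[max row average of a random {0,1} m x n matrix,
    entries 1 with probability q, is at least (1 + delta) * q]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        worst = max(sum(rng.random() < q for _ in range(n)) / n
                    for _ in range(m))
        if worst >= (1 + delta) * q:
            hits += 1
    return hits / trials
```

With m = 20 rows, n = 200 columns, q = 0.5 and δ = 0.3, the event is a more than four standard deviation excursion per row, so the estimated probability is essentially zero.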
We use an analogous lower bound on V (X). By von Neumann’s Min-Max The-
orem
V (X) = max_y min_i X′_i · y

(where X′ denotes the transpose of X). Thus, reasoning similarly, if within any
column of X (instead of any row) the entries are independent,
Pr[V (X) ≤ (1 − δ)q] = o(β) for some δ = O(√(ln(nX /β)/(q mX ))), (2)
assuming nX /β → ∞. We will refer to (1) and (2) as the naive upper and lower
bounds on V (X), respectively.
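For matrices with only two columns, the game value V (A) = min_x max_i Ai · x can be computed exactly without an LP solver: writing x = (t, 1 − t), each row gives a line in t, the upper envelope is convex piecewise-linear, and its minimum lies at t = 0, t = 1, or a crossing of two lines. A sketch (not part of the paper's proof, just an illustration of the definition):

```python
def game_value_two_columns(A):
    """V(A) = min over x = (t, 1-t) of max_i A_i . x, for an m x 2 matrix.
    Row (a, c) gives the line f_i(t) = (a - c) * t + c; the minimum of the
    convex upper envelope occurs at an endpoint or a pairwise crossing."""
    lines = [(a - c, c) for a, c in A]           # (slope, intercept)
    candidates = [0.0, 1.0]
    for i in range(len(lines)):
        for j in range(i + 1, len(lines)):
            (m1, c1), (m2, c2) = lines[i], lines[j]
            if m1 != m2:
                t = (c2 - c1) / (m1 - m2)        # crossing point
                if 0.0 <= t <= 1.0:
                    candidates.append(t)
    return min(max(m * t + c for m, c in lines) for t in candidates)
```

For the identity-like matrix [[1, 0], [0, 1]] this returns 1/2, attained at x = (1/2, 1/2).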
Proof of Theorem 1.
The naive lower bound on V (A) shows that

Pr[V (A) ≤ p(1 − δ0 )] = o(1) for some δ0 = O(√(ln n/(pm))) = o(1). (3)
Thus, V (A) ≥ Ω(p) with probability 1 − o(1).
324 Philip Klein and Neal Young
the probability that any given row is good is at least 2r/m and the expectation
of G is at least 2r. Since G is a sum of independent {0, 1} random variables,
Pr[G < r] < exp(−r/8).
By the choice of r, this is o(1/n^s ), so with probability 1 − o(1/n^s ), B has at
least r good rows.
Suppose this is indeed the case and select any r good rows. Let C be the r × s
submatrix of B formed by the chosen rows. In any column of C, the entries are
independent and by symmetry each has probability at least p(1 + δ1 ) of being 1.
Applying the naive lower bound (2) to V (C), we find
Pr[V (C) ≤ p(1 + δ1 )(1 − δ2 )] = o(1/n^s ) for some δ2 = O(√(s ln n/(pr))). (5)
4 Historical Discussion
Historically, there are three lines of research within what we might call the
Dantzig-Wolfe model. One line of work began with a method proposed by Ford
and Fulkerson for computing multicommodity flow. Dantzig and Wolfe noticed
that this method was not specific to multicommodity flow; they suggested de-
composing an arbitrary linear program into two sets of constraints, writing it
as
min{cx : x ≥ 0, Ax ≥ b, x ∈ P } (7)
and solving the linear program by an iterative procedure: each iteration in-
volves optimizing over the polytope P . This approach, now called Dantzig-Wolfe
decomposition, is especially useful when P can be written as a cross-product
P1 × · · · × Pk , for in this case minimization over P can be accomplished by
minimizing separately over each Pi . Often, for example, distinct Pi ’s constrain
disjoint subsets of variables. In practice, this method tends to require many it-
erations to obtain a solution with value optimum or nearly optimum, often too
many to be useful.
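The iterative structure — every iteration queries an oracle that optimizes over P — can be sketched in the multiplicative-weights style of [16] for packing over the simplex. This is an illustrative sketch, not Dantzig and Wolfe's original procedure; the step size `alpha` and iteration count are assumptions:

```python
import math

def mw_packing(A, b, iters=200, alpha=0.1):
    """Dantzig-Wolfe-type iteration for fractional packing: x in the
    simplex with Ax <= b.  Each round queries the oracle "min c.x over P"
    with c = y^T A (y = row weights); here the oracle just returns the
    cheapest vertex e_j.  Rows that get loaded are reweighted upward."""
    m, n = len(A), len(A[0])
    y = [1.0] * m                 # one weight per packing constraint
    avg = [0.0] * n               # running average of oracle answers
    for _ in range(iters):
        c = [sum(y[i] * A[i][j] for i in range(m)) for j in range(n)]
        j = min(range(n), key=lambda k: c[k])     # oracle call
        avg[j] += 1.0 / iters
        for i in range(m):                        # penalize loaded rows
            y[i] *= math.exp(alpha * A[i][j] / b[i])
    return avg
```

On the toy instance A = [[1, 0], [0, 1]], b = (0.6, 0.6) the averaged iterate converges near (0.5, 0.5), an approximately feasible packing, after a number of iterations governed by the width and accuracy as discussed above.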
Lagrangean Relaxation
A second line of research is represented by the work of Held and Karp [9, 10]. In
1970 they proposed a method for estimating the minimum cost of a traveling-
salesman tour. Their method was based on the concept of a 1-tree, which is a
slight variant of a spanning tree. They proposed two ways to calculate this esti-
mate; one involved formulating the estimate as the solution to the mathematical
program
max_u [ ub + min_{x∈P} (c − uA)x ] (8)
where P is the polytope whose vertices are the 1-trees. They suggested an it-
erative method to find an optimal or near-optimal solution: Starting with
some initial assignment to u, find a minimum-cost 1-tree with respect to the
edge-costs c − uA. Next, update the node-prices u based on the degrees of the
nodes in the 1-tree found. Find a min-cost 1-tree with respect to the modified
costs, update the node-prices accordingly, and so on.
Like Dantzig and Wolfe’s method, this method’s only dependence on the
polytope P is via repeatedly optimizing over it. In the case of Held and Karp’s
estimate, optimizing over P amounts to finding a minimum-cost spanning tree.
Their method of obtaining an estimate for the solution to a discrete-optimization
problem came to be known as Lagrangean relaxation, and has been applied to
a variety of other problems.
Held and Karp’s method for finding the optimal or near-optimal solution
to (8) turns out to be the subgradient method, which dates back to the early
sixties. Under certain conditions this method can be shown to converge in the
limit, but, like Dantzig and Wolfe's method, it can be rather slow. (One author
refers to "the correct combination of artistic expertise and luck" [19] needed
to make progress in subgradient optimization.)
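The Lagrangean bound (8) is a concave, piecewise-linear function of u, and the subgradient method steps from the current point in the direction of a subgradient with diminishing step sizes. A one-dimensional toy sketch (the function and step rule are illustrative, not Held and Karp's actual instance):

```python
def subgradient_maximize(pieces, u0=0.0, rounds=50):
    """Maximize L(u) = min_k (slope_k * u + const_k), a concave piecewise
    linear function, by subgradient ascent with diminishing steps 1/k.
    A subgradient at u is the slope of any active (minimizing) piece."""
    u, best = u0, float("-inf")
    for k in range(1, rounds + 1):
        vals = [s * u + c for s, c in pieces]
        val = min(vals)
        best = max(best, val)
        g = pieces[vals.index(val)][0]   # slope of an active piece
        u += g / k
    return best
```

For L(u) = min(u, 2 − u) the maximum value 1 is reached at u = 1; the iterates oscillate around it, which is exactly the behavior that makes step-size tuning in subgradient optimization delicate.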
{x : Ax ≤ b, x ∈ P } (9)
References
[1] B. Awerbuch and T. Leighton. A simple local-control approximation algorithm
for multicommodity flow. In Proc. of the 34th IEEE Annual Symp. on Foundation
of Computer Science, pages 459–468, 1993.
[2] B. Awerbuch and T. Leighton. Improved approximation algorithms for the multi-
commodity flow problem and local competitive routing in dynamic networks. In
Proc. of the 26th Ann. ACM Symp. on Theory of Computing, pages 487–495, 1994.
[3] J. F. Benders. Partitioning procedures for solving mixed-variables programming
problems. Numerische Mathematik, 4:238–252, 1962.
[4] G. B. Dantzig and P. Wolfe. Decomposition principle for linear programs. Oper-
ations Res., 8:101–111, 1960.
[5] Y. Freund and R. Schapire. Adaptive game playing using multiplicative weights.
J. Games and Economic Behavior, to appear.
[6] N. Garg and J. Könemann. Faster and simpler algorithms for multicommodity
flow and other fractional packing problems. In Proc. of the 39th Annual Symp.
on Foundations of Computer Science, pages 300–309, 1998.
1 Introduction
In the single-source unsplittable flow problem (Ufp), we are given a network
G = (V, E, u), a source vertex s and a set of k commodities with sinks t1 , . . . , tk
and associated real-valued demands ρ1 , . . . , ρk . We seek to route the demand
ρi of each commodity i, along a single s-ti flow path, so that the total flow
routed across any edge e is bounded by the edge capacity ue . See Figure 1 for
an example instance. The minimum edge capacity is assumed to have value at
least maxi ρi .
The requirement of routing each commodity on a single path bears resem-
blance to integral multicommodity flow and in particular to the multiple-source
edge-disjoint path problem. For the latter problem, a large amount of work ex-
ists either for solving interesting special cases exactly (e.g. [10], [33], [32]) or for
approximating, with limited success, various objective functions (e.g. [31],[11],
* Part of this work was performed at the Department of Computer Science, Dartmouth
College, while partially supported by NSF CAREER Award CCR-9624828.
** Research partly supported by NSF CAREER Award CCR-9624828.
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 328–344, 1999.
© Springer-Verlag Berlin Heidelberg 1999
Unsplittable Flow 329
[Figure: two copies of an example network rooted at the source s, with vertex demands 1 and 1/2; the legend distinguishes edges carrying 1/2 flow units from edges carrying 1 flow unit.]
Fig. 1. Example Ufp instance. Numbers on the vertices denote demands; all
edge capacities are equal to 1.
The feasibility question for Ufp, i.e., whether all commodities can be routed
unsplittably, is strongly NP-complete [19]. Thus research has focused on efficient
algorithms to obtain approximate solutions. For a minimization (resp. maxi-
mization) problem, a ρ-approximation algorithm with ρ > 1 (resp. ρ < 1) outputs
in polynomial time a solution with objective value at most (resp. at least) ρ times
the optimum. Different optimization versions can be defined, such as minimizing
congestion, maximizing the routable demand and minimizing the number of rounds
[19]. In this work we focus on the congestion metric. The reader is referred to
[20,24,7] for theoretical work on the other metrics.
Given a flow f , define the utilization ze of edge e as the ratio fe /ue . We
seek an unsplittable flow f satisfying all demands, which minimizes congestion,
i.e. the quantity max_e max{ze , 1}. That is, the minimum congestion is the smallest
number α ≥ 1 such that, if we multiplied all capacities by α, f would satisfy the
capacity constraints.
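The congestion of a given routing follows directly from this definition. A small helper, with edge flows and capacities held in dictionaries (the container choice is an assumption for illustration):

```python
def congestion(flow, capacity):
    """Congestion = max over edges e of max(z_e, 1), where z_e = f_e / u_e.
    A flow respecting all capacities therefore has congestion exactly 1."""
    return max(max(flow[e] / capacity[e], 1.0) for e in capacity)
```

For example, an edge routed at twice its capacity yields congestion 2, while an underloaded network yields congestion 1.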
In this paper we initiate the experimental study of approximation algorithms
for the unsplittable flow problem. Our study examines the quality of the effective
approximation achieved by different algorithms, how this quality is affected by
330 Stavros G. Kolliopoulos and Clifford Stein
input structure, and the effect of heuristics on approximation and running time.
The experimental results indicate performance which is typically better than
the theoretical guarantees. Moreover modifications to the algorithms to achieve
better theoretical results translate to improvements in practice as well. The main
focus of our work is to examine the quality of the approximation and we have
not extensively optimized our codes for speed. Thus we have largely opted for
maintaining the relative simplicity and modularity that we believe is one of the
advantages of the algorithms we test. However keeping the running time low
allows the testing of larger instances. Therefore we report on wall clock running
times, at a granularity of seconds, as an indication of the time complexity in
practice. Our codes run typically much faster than multicommodity flow codes
on similar size instances [28,12,30] but worse than maximum flow codes [5] with
which they have a comparable theoretical running time [24].
We chose to focus our experimentation on the congestion metric for a variety
of reasons. The approximation ratios and integrality gaps known for congestion
are in general better than those known for the other metrics [19,24,7]. Moreover
congestion is relevant for maximizing routed demand in the following sense: an
unsplittable solution with congestion λ provides a guarantee that 1/λ fraction of
each demand can be routed while respecting capacities. Finally, the congestion
metric has been extensively studied theoretically in the setting of the randomized
rounding technique [31] and for its connections to multicommodity flow and cuts
(e.g. [26,17,25,18]).
zero edges are not exploited in the theoretical analysis. Finally, we test codes
which combine both the sparsification and the cost heuristic.
Test data. We test our codes on five main families of networks. For each family we
have generated an extensive set of instances by varying appropriate parameters.
We test mostly on synthetic data adapted from previous experimental work for
multicommodity flow [28,12] and cuts [3] together with data from some new
generators. The instances produced are designed to accommodate a variety of
path and cut characteristics. Establishing a library of interesting data generators
for disjoint-path and unsplittable flow problems is an important task in its own
right given the practical significance of this class of problems. We believe our
work makes a first contribution in this direction. We attempted to obtain access
to real-world data from telecommunication and VLSI applications, but were
unsuccessful due to their proprietary nature. However one of the input families
we test (LA networks) is adapted from synthetic data, used in the restoration
capacity study of Deng et al. [6], which was created based on actual traffic
forecasts on realistic networks.
2 Preliminaries
assumption that states that the maximum demand is less than or equal to the
minimum edge capacity, i.e., a demand can be routed on any edge. This assump-
tion is partly justified by the splitting of data into small size packets in actual
networks so that no commodity has arbitrarily large demand. Even without the
balance assumption a constant factor approximation is possible [24]. We will re-
fer to instances that satisfy the balance assumption as balanced, and ones which
violate it as unbalanced, or more precisely, ones in which the maximum demand
is ρ times the minimum capacity as ρ-unbalanced. Unless otherwise stated, we
use n, m to denote |V |, |E| respectively.
The following theorem is an easy consequence of the well-known augmenting
path algorithm ([9], [8]) and will be of use.
Theorem 1. Let (G = (V, E, u), s, T ) be a Ufp on an arbitrary network G with
all demands ρi , 1 ≤ i ≤ k, equal to the same value σ, and edge capacities ue
equal to integral multiples of σ for all e ∈ E. If there is a fractional flow of value
f, then there is an unsplittable flow of value at least f. This unsplittable flow can
be found in time O(km) where k is the number of sinks in T .
In the same way a routing corresponds to an edge flow, an edge flow can be
represented as a path flow. Our algorithms will make use of the well-known
flow decomposition theorem [1]. Given problem (G, s, T ) let a flow solution f be
represented as an edge flow, i.e. an assignment of flow values to edges. Then one
can find, in O(nm) time, a representation of f as a path and cycle flow, where
the paths join s to sinks in T . In the remainder of the paper, we will assume
that the cycles are deleted from the output of a flow decomposition.
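The path-decomposition step can be sketched as follows for an acyclic s-rooted edge flow: repeatedly walk from s along positive-flow edges to a sink and strip the bottleneck amount. This is a minimal sketch, not the O(nm) procedure of [1]; cycle removal is omitted and the edge-flow container is an assumption:

```python
def decompose_paths(flow, s, sinks):
    """Decompose an acyclic edge flow (dict (u, v) -> value) rooted at s
    into path flows.  Each pass follows positive-flow edges from s to a
    sink and subtracts the bottleneck; at most |E| paths result."""
    paths = []
    while True:
        path, v = [s], s
        while v not in sinks:
            nxt = next(((a, b) for (a, b) in flow
                        if a == v and flow[(a, b)] > 1e-12), None)
            if nxt is None:
                break
            path.append(nxt[1])
            v = nxt[1]
        if v not in sinks or len(path) == 1:
            return paths                      # no more flow leaving s
        edges = list(zip(path, path[1:]))
        amt = min(flow[e] for e in edges)     # bottleneck on this path
        for e in edges:
            flow[e] -= amt
        paths.append((path, amt))
```

On a flow sending one unit from s through a to two sinks, the decomposition recovers two paths carrying 1/2 each.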
can find in polynomial time an unsplittable routing g where the flow ge through
an edge e is at most fe + 1/2^{x+1}.
Algorithm Partition finds a routing for each subproblem by scaling up sub-
problem capacities to ensure they conform to the requirements of Lemma 1. The
new algorithm operates in phases, during which the number of distinct paths
used to route flow for a particular commodity is progressively decreased. Call
granularity the minimum amount of flow routed along a single path to any com-
modity that has not yet been routed unsplittably. Algorithm Partition starts
with a fractional solution of arbitrary granularity and essentially “in parallel”
for all subproblems, rounds this fractional solution to an unsplittable one. In
the new algorithm, we delay the rounding process in the sense that we keep
computing a fractional solution of successively increasing granularity. The gran-
ularity will always be a power of 1/2 and this unit will be doubled after each
iteration. Once the granularity surpasses 1/2^x, for some x, we guarantee that all
commodities with demands 1/2^x or less have already been routed unsplittably.
Every iteration improves the granularity at the expense of increasing the con-
gestion. The method has the advantage that by Lemma 1, a fractional solution
of granularity 1/2^x requires only a 1/2^x additive offset to current capacities to
find a new flow of granularity 1/2^{x−1}. Therefore, if the algorithm starts with
a fractional solution of granularity 1/2^λ, for some potentially large λ, the total
increase to an edge capacity from then on would be at most Σ_{j=1}^{λ} 1/2^j < 1.
Expressing the granularity in powers of 1/2 requires an initial rounding of the
demand values; this rounding will force us to scale capacities by at most a factor
of 2. We first provide an algorithm for the case in which demand values are
powers of 1/2. The algorithm for the general case will then follow easily.
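The initial rounding maps each demand in (0, 1] up to the next power of 1/2; since consecutive powers differ by a factor of 2, this is where the factor-2 capacity scaling comes from. A sketch:

```python
import math

def round_up_to_power_of_half(d):
    """Smallest 2^(-v), with v a nonnegative integer, such that
    2^(-v) >= d, for a demand 0 < d <= 1.  The ratio rounded/d always
    lies in [1, 2), so capacities need scaling by at most a factor of 2."""
    v = math.floor(-math.log2(d))
    return 2.0 ** (-v)
```

For instance, a demand of 0.3 is rounded up to 1/2, and a demand already of the form 2^(-v) is unchanged.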
Theorem 3. Let Π = (G = (V, E, u), s, T ) be a Ufp, in which all demand
values are of the form 2^{−ν} with ν ∈ N. There is an algorithm, 2H Partition,
which obtains, in polynomial time, an unsplittable routing such that the flow
through any edge e is at most z·ue + ρmax − ρmin, where z is the optimum congestion.
Remark: We first obtained a relative (1 + ρmax − ρmin )-approximation for the
case with demand values in {1/2, 1} in [24]. Dinitz, Garg and Goemans [7]
first extended that theorem to arbitrary powers of 1/2. Their derivation uses a
different approach.
multiple re′ of 1/2^λ. If the optimum congestion is 1 for Π, the optimum congestion
is 1 for Π′ as well.
Proof of Claim. Let G be the network with capacity function r and G′ be the
network with rounded capacities r′. The amount re − re′ of capacity will be unused
by any optimal unsplittable flow in G. The reason is that the sum Σ_{i∈S} 1/2^i,
for any finite multiset S ⊂ N, is equal to a multiple of 1/2^{i0}, i0 = min_{i∈S} i.
We describe now the relaxed decision procedure. Let u′ denote the scaled ca-
pacity function xu. Let ρmin = 1/2^λ. The relaxed decision procedure has at most
λ − log(ρmax^{−1}) + 1 iterations. During the first iteration, split every commodity in
T , called an original commodity, with demand 1/2^x, 0 ≤ x ≤ λ, into 2^{λ−x} virtual
commodities each with demand 1/2^λ. Call T1 the resulting set of commodities.
Round down all capacities to the closest multiple of 1/2^λ. By Theorem 1 we can
find a fractional flow solution f^0 ≡ g^1, for T1, in which the flow through any edge
is a multiple of 1/2^λ. If this solution does not satisfy all demands the decision
procedure returns no; by Claim 3.1 no unsplittable solution is possible without
increasing the capacities u′. Otherwise the procedure continues as follows.
Observe that f^0 is a fractional solution for the set of original commodities
as well, a solution with granularity 1/2^λ. Let u_e^j denote the total flow through
an edge after iteration j. The flow u_e^1 is trivially f_e^0 ≤ u′_e ≤ ue. At iteration
i > 1, we extract from g^{i−1}, using flow decomposition, the set of flow paths f^{i−1}
used to route the set S_{i−1} of virtual commodities in T_{i−1} which correspond to
original commodities in T with demand 1/2^{λ−i+1} or more. Pair up the virtual
commodities in S_{i−1} corresponding to the same original commodity; this is pos-
sible since the demand of an original commodity is inductively an even multiple
of the demands of the virtual commodities. Call the resulting set of virtual com-
modities Ti. Ti contains virtual commodities with demand 1/2^{λ−i+1}. The set of
flow paths f^{i−1} is a fractional solution for the commodity set Ti. By Lemma 1,
we can find an unsplittable routing g^i for Ti such that the flow g_e^i through an
edge e is at most f_e^{i−1} + 1/2^{λ−i+2}. Now the granularity has been doubled to
1/2^{λ−i+1}. A crucial observation is that from this quantity, g_e^i, only an amount
of at most 1/2^{λ−i+2} corresponds to new flow, added to e during iteration i. It is
easy to see that the algorithm maintains the following two invariants.
INV1: After iteration i, all original commodities with demands 1/2^{λ−i+1} or
less have been routed unsplittably.
INV2: After iteration i > 1, the total flow u_e^i through edge e, which is due to
all i first iterations, satisfies the relation u_e^i ≤ u_e^{i−1} + 1/2^{λ−i+2}.
Given that u_e^1 ≤ u′_e, the total flow through e used to route unsplittably all com-
modities in T is at most

u_e^1 + Σ_{i=2}^{λ−log(ρmax^{−1})+1} 1/2^{λ−i+2}.

Therefore the relaxed decision procedure returns the claimed guarantee. For
the running time, we observe that in each of the l = λ − log(ρmax^{−1}) + 1 iterations,
4 Heuristics Used
The heuristics we designed fall into two major categories. We tested them both
individually and in combination with each other.
The theoretical analysis of Partition and H Partition is not affected if
after finding the initial solution to the maximum flow relaxation, edges carrying
zero flow, called zero edges, are discarded. The sparsification heuristic implements
this observation. The motivation behind the sparsification heuristic is to
decrease the running time and also observe how the theoretical versions of the
algorithms behave in practice since the zero edges are not exploited in the the-
oretical analysis. On random graph instances where maximum flow solutions
use typically a small fraction of the edges, the sparsification heuristic resulted
in drastic reductions of the running time. We observed that the approximation
guarantee typically degrades by a small amount when a large number of edges
is discarded. However the reduction on running time and space requirements
makes the sparsification heuristic attractive for large instances. We also tested
an intermediate approach where a randomly chosen fraction of the zero edges
(on the average between 30 and 50 percent) is retained.
The cost heuristic invokes a minimum-cost flow subroutine when the original
algorithm would invoke maximum flow to compute partial unsplittable solutions.
In algorithm Partition we use minimum cost flow at Step 3 to find the partial
unsplittable solution to each subproblem. In algorithm H Partition minimum-
cost flow is used to compute at iteration i the unsplittable routing gi for the set Ti
of virtual commodities (cf. proof of Theorem 3). That is we use a minimum-cost
flow routine to successively improve the granularity of the fractional solution.
The cost of an edge e is defined as an appropriate increasing function of the cur-
rent utilization of e. Let λ denote the current congestion. For robustness we use
the ratio ze /λ in the calculation of the cost c(e). We tested using functions that
depend either linearly or exponentially on ze /λ. The objective was to discour-
age the repeated use of edges which were already heavily utilized. Exponential
cost functions tend to penalize heavily the edge with the maximum utilization
in a given set of edges. Their theoretical success in multicommodity flow work
[34,18] motivated their use in our setting. In particular we let c_L(e) = β1 ze/λ
and c_E(e) = 2^{β2 ze/λ}, and tested both functions for different values of β1, β2.
Observe that an edge with utilization 0 so far will incur a cost of 1 under c_E, while
0 seems a more natural cost assignment for this case. However, experiments
suggested that the c_E function behaved marginally better than c_E − 1 by imposing
limited moderation on the introduction of new edges in the solution.
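The two cost functions translate directly into code. We assume the extraction-garbled "2β2 ze /λ" denotes the exponential 2^(β2·ze/λ), consistent with the remark that an unused edge incurs cost 1 under the exponential function; the β values are free parameters:

```python
def linear_cost(z_e, lam, beta1):
    # c_L(e) = beta1 * z_e / lambda
    return beta1 * z_e / lam

def exponential_cost(z_e, lam, beta2):
    # c_E(e) = 2 ** (beta2 * z_e / lambda); an unused edge (z_e = 0) costs 1
    return 2 ** (beta2 * z_e / lam)
```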
5 Experimental Framework
Codes tested. For the implementation of the maximum flow and minimum cost
flow subroutines of our algorithms we used the codes described in [5] and [4]
respectively. We also used the Cherkassky-Goldberg code to implement the flow
338 Stavros G. Kolliopoulos and Clifford Stein
Input classes. We generated data from four input classes. For each class we
generated a variety of instances varying different parameters. The generators
use randomness to produce different instances for the same parameter values.
To make our experiments repeatable the seed of the pseudorandom generator is
an input parameter for all generators. If no seed is given, a fixed default value is
chosen. We used the default seed in generating all inputs. The four classes used
follow.
genrmf. This is adapted from the GENRMF generator of Goldfarb and Grigo-
riadis [15,2]. The input parameters are a b c1 c2 k d. The generated net-
work has b frames (grids) of size a × a, for a total of a · a · b vertices. In each
frame each vertex is connected with its neighbors in all four directions. In
addition, the vertices of a frame are connected one-to-one with the vertices
of the next frame via a random permutation of those vertices. The source is
the lower left vertex of the first frame. Vertices become sinks with probabil-
ity 1/k and their demand is chosen uniformly at random from the interval
[1, d]. The capacities are randomly chosen integers from (c1, c2) in the case
of interframe edges, and (1, c2 ∗ a ∗ a) for the in-frame edges.
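The frame construction can be sketched as follows. This is a loose re-implementation of the description above, not the original GENRMF code; in particular, excluding the source from the sink lottery is our assumption:

```python
import random

def genrmf_like(a, b, c1, c2, k, d, seed=0):
    """Sketch of a GENRMF-style instance: b frames of a*a vertices.
    Vertex (f, i, j) lives in frame f at grid position (i, j)."""
    rng = random.Random(seed)
    edges = {}
    for f in range(b):
        for i in range(a):
            for j in range(a):
                # connect to grid neighbours within the frame
                for di, dj in ((1, 0), (0, 1)):
                    ni, nj = i + di, j + dj
                    if ni < a and nj < a:
                        edges[((f, i, j), (f, ni, nj))] = rng.randint(1, c2 * a * a)
        if f + 1 < b:
            # one-to-one connection to the next frame via a random permutation
            cells = [(i, j) for i in range(a) for j in range(a)]
            perm = cells[:]
            rng.shuffle(perm)
            for (i, j), (pi, pj) in zip(cells, perm):
                edges[((f, i, j), (f + 1, pi, pj))] = rng.randint(c1, c2)
    source = (0, 0, 0)  # lower-left vertex of the first frame
    sinks = {v: rng.randint(1, d)
             for v in ((f, i, j) for f in range(b)
                       for i in range(a) for j in range(a))
             if v != source and rng.random() < 1.0 / k}
    return edges, source, sinks
```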
noigen. This is adapted from the noigen generator used in [3,29] for minimum-
cut experimentation. The input parameters are n d t p k. The network has
n nodes and an edge between a pair of vertices is included with probability
i =             2          6          12         25         50         100
program name    cong. time cong. time cong. time cong. time cong. time cong. time
3al 1.64 1 2.37 1 2.14 1 1.59 1 1.18 1 1.23 3
3al2 2.50 0 2.51 0 2.41 0 1.47 1 1.25 1 1.19 2
c/h prf103 1.49 0 2.91 0 2.51 0 1.85 1 1.67 1 1.55 2
c/h prfexp10 1.49 0 2.17 0 2.15 1 1.85 1 1.44 1 1.59 2
c3/3al103 2.56 0 2.29 1 1.96 2 1.46 2 1.25 4 1.16 7
c3/3alexp10 1.49 0 1.86 1 1.74 1 1.81 2 1.27 2 1.18 3
c3 2/3al2103 1.49 0 2.51 0 1.76 1 1.92 1 1.57 2 1.44 2
c3 2/3al2exp10 1.49 0 2.13 0 2.04 1 1.59 1 1.32 1 1.31 2
h prf 2.00 0 2.47 0 2.51 0 2.15 0 1.50 0 1.45 1
h prf2 2.00 0 2.47 0 2.51 0 2.14 0 1.50 1 1.41 2
h prf3 2.00 0 2.75 0 2.20 0 1.74 0 1.36 0 1.46 1
Table 1. family noi commodities: noigen 1000 1 2 10 i; 7981 edges; 2-unbalanced
family; i is the expected percentage of sinks in the non-source component.
i =             2          4          8          16         32
program name    cong. time cong. time cong. time cong. time cong. time
3al 1.18 1 1.12 2 1.09 3 1.04 3 1.06 3
3al2 1.25 1 1.15 1 1.09 2 1.08 1 1.05 1
c/h prf103 1.67 1 1.47 2 1.33 2 1.24 2 1.24 2
c/h prfexp10 1.44 1 1.40 1 1.26 1 1.29 1 1.25 2
c3/3al103 1.25 4 1.22 6 1.13 7 1.12 5 1.07 6
c3/3alexp10 1.27 2 1.12 3 1.11 3 1.12 3 1.11 4
c3 2/3al2103 1.57 2 1.24 1 1.09 2 1.14 2 1.10 1
c3 2/3al2exp10 1.32 1 1.16 2 1.15 2 1.13 2 1.12 2
h prf 1.50 1 1.42 1 1.30 1 1.22 1 1.24 2
h prf2 1.55 1 1.40 1 1.22 2 1.23 2 1.18 2
h prf3 1.36 0 1.22 1 1.38 0 1.28 1 1.20 1
Table 2. family noi components: noigen 1000 1 i 10 50; 7981 edges; 2-unbalanced
family; i is the number of components.
6 Experimental Results
In this section we give an overview of experimental results. We experimented
on 13 different families and present data on several representative ones. Tables
1-6 provide detailed experimental results. The running time is given in seconds.
The congestion achieved by algorithm Partition, on a balanced input, was
typically less than 2.3 without any heuristics. The largest value observed was
2.89. Similarly for H Partition the congestion, on a balanced input, was typi-
cally less than 1.65 and always at most 1.75. The theoretically better algorithm
H Partition does better in practice as well. To test the robustness of the ap-
proximation guarantee we relaxed, on several instances, the balance assumption
allowing the maximum demand to be twice the minimum capacity or more, see
Tables 1-5. The approximation guarantee was not significantly affected. Even in
the extreme case when the balance assumption was violated by a factor of 16, as
in Table 3, the best code achieved a congestion of 7.5. Performance on the smallest
instances of the LA family (not shown here) was well within the typical values
reported above. On larger LA instances, the exponential distribution generated
an excessive number of commodities (on the order of 30,000 and up) for our
codes to handle in the time frames we considered (a few minutes).
The sparsification heuristic proved beneficial in terms of efficiency as it typ-
ically cut the running time by half. One would expect that it would incur a
degradation in congestion as it discards potentially useful capacity. This turned
out not to be true in our tests. There is no clear winner in our experiments
between h prf and h prf3 or between 3al and 3al2. One possible explanation
is that in all cases the same maximum flow code is running in subroutines, which
would tend to route flow in a similar fashion.
i =             2          4          16         64
program name    cong. time cong. time cong. time cong. time
3al 2.26 0 2.18 0 1.48 9 1.21 214
3al2 1.82 0 2.72 0 1.46 4 1.20 102
c/h prf103 2.45 0 3.10 0 1.88 5 1.68 140
c/h prfexp10 2.45 0 3.01 0 1.98 4 1.73 117
c3/3al103 1.42 0 2.03 1 1.53 13 1.24 225
c3/3alexp10 1.72 0 2.11 1 1.57 7 1.22 145
c3 2/3al2103 1.50 0 2.40 1 1.70 5 1.32 113
c3 2/3al2exp10 1.61 0 2.17 0 1.59 5 1.27 112
h prf 1.85 0 3.56 0 2.05 6 1.72 154
h prf2 1.85 0 3.56 0 1.94 6 1.75 171
h prf3 2.38 0 3.12 0 1.92 3 1.72 87
Table 5. family rmf depthDem: genrmf -a 10 -b i -c1 64 -c2 128 -k 40 -d 128;
820, 1740, 7260, 29340 edges; 2-unbalanced family; i is the number of frames.
i =             1          2          6          12         25         50
program name    cong. time cong. time cong. time cong. time cong. time cong. time
3al 1.55 1 1.56 1 1.67 5 1.64 11 1.64 27 1.70 64
3al2 1.45 0 1.67 1 1.78 3 1.57 6 1.67 15 1.75 35
c/h prf103 1.80 1 2.11 0 2.22 2 2.15 4 2.33 11 2.33 26
c/h prfexp10 1.92 0 2.11 1 2.11 2 2.15 5 2.33 11 2.33 24
c3/3al103 1.44 1 1.64 2 1.44 7 1.67 14 1.56 30 1.56 130
c3/3alexp10 1.30 1 1.33 2 1.56 7 1.57 14 1.56 34 1.56 78
c3 2/3al2103 1.15 1 1.40 1 1.44 3 1.56 6 1.56 15 1.56 34
c3 2/3al2exp10 1.27 0 1.44 1 1.44 3 1.56 6 1.56 14 1.57 34
h prf 1.92 0 2.11 1 2.11 2 2.33 4 2.22 11 2.33 27
h prf2 1.92 0 2.22 0 2.33 2 2.56 4 2.60 12 2.89 28
h prf3 2.00 0 2.20 1 2.18 1 2.33 1 2.22 3 2.42 9
Table 6. family sat density: satgen -a 1000 -b i -c1 8 -c2 16 -k 10000 -d 8;
9967, 20076, 59977, 120081, 250379, 500828 edges; 22, 61, 138, 281, 682, 1350
commodities; i is the expected percentage of pairs of vertices joined by an edge.
7 Future Work
There are several directions to pursue in future work. The main task would be
to implement the new 2-approximation algorithm of Dinitz et al. [7] and see
how it behaves in practice. Testing on data with real network characteristics is
important if such instances can be made available. Designing new generators
that push the algorithms to meet their theoretical guarantees will enhance our
understanding of the problem. Finally, a problem of minimizing makespan on
References
1. R. K. Ahuja, T. L. Magnanti, and J. B. Orlin. Network flows: Theory, Algorithms
and Applications. Prentice Hall, 1993.
2. Tamas Badics. genrmf. ftp://dimacs.rutgers.edu/pub/netflow/generators/network/genrmf/, 1991.
3. C. Chekuri, A. V. Goldberg, D. Karger, M. Levine, and C. Stein. Experimental
study of minimum cut algorithms. In Proc. of 8th SODA, 324–333, 1997.
4. B. V. Cherkassky and A. V. Goldberg. An efficient implementation of a scaling
minimum-cost flow algorithm. Journal of Algorithms, 22:1–29, 1997.
5. B. V. Cherkassky and A. V. Goldberg. On implementing the push-relabel method
for the maximum flow problem. Algorithmica, 19:390–410, 1997.
6. M. Deng, D. F. Lynch, S. J. Phillips, and J. R. Westbrook. Algorithms for restora-
tion planning in a telecommunications network. In ALENEX99, 1999.
7. Y. Dinitz, N. Garg, and M. Goemans. On the single-source unsplittable flow
problem. In Proc. of 39th FOCS, 290–299, November 1998.
8. P. Elias, A. Feinstein, and C. E. Shannon. Note on maximum flow through a
network. IRE Transactions on Information Theory, IT-2:117–119, 1956.
9. L. R. Ford and D. R. Fulkerson. Maximal flow through a network. Canad. J.
Math., 8:399–404, 1956.
10. A. Frank. Packing paths, cuts and circuits - a survey. In B. Korte, L. Lovász,
H. J. Prömel, and A. Schrijver, editors, Paths, Flows and VLSI-Layout, 49–100.
Springer-Verlag, Berlin, 1990.
11. N. Garg, V. Vazirani, and M. Yannakakis. Primal-dual approximation algorithms
for integral flow and multicut in trees. Algorithmica, 18:3–20, 1997.
12. A. V. Goldberg, J. D. Oldham, S. A. Plotkin, and C. Stein. An implementation
of a combinatorial approximation algorithm for minimum-cost multicommodity
flow. In Proc. of the 6th Conference on Integer Programming and Combinatorial
Optimization, 338–352, 1998. Published as Lecture Notes in Computer Science
1412, Springer-Verlag.
13. A. V. Goldberg and R. E. Tarjan. A new approach to the maximum flow problem.
Journal of the ACM, 35:921–940, 1988.
14. A.V. Goldberg and S. Rao. Beyond the flow decomposition barrier. In Proc. of
the 38th Annual Symposium on Foundations of Computer Science, 2–11, 1997.
15. D. Goldfarb and M. Grigoriadis. A computational comparison of the Dinic and
Network Simplex methods for maximum flow. Annals of Operations Research,
13:83–123, 1988.
16. D. S. Hochbaum and D. B. Shmoys. Using dual approximation algorithms for
scheduling problems: theoretical and practical results. Journal of the ACM, 34:144–
162, 1987.
1 Introduction
We consider the following network design problem for directed networks. Given a
directed graph with nonnegative costs on the arcs find a minimum cost subgraph
where the number of arcs leaving set S is at least f (S) for all subsets S. Formally,
given a directed graph G = (V, E) and a requirement function f : 2^V → Z, the
network design problem is the following integer program:
    minimize Σ_{e∈E} ce xe    (1)
    subject to
        Σ_{e∈δG(S)} xe ≥ f(S),    for each S ⊆ V,
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 345–360, 1999.
c Springer-Verlag Berlin Heidelberg 1999
346 Vardges Melkonian and Éva Tardos
where δG(S) denotes the set of arcs leaving S. For simplicity of notation we will
use x(δG(S)) to denote Σ_{e∈δG(S)} xe.
The following are some special cases of interest. When f(S) = k for all
∅ ≠ S ⊂ V the problem is that of finding a minimum cost k-connected subgraph.
The directed Steiner tree problem is to find the minimum cost directed tree
rooted at r that contains a subset of vertices D ⊆ V . This problem is a network
design problem where f(S) = 1 if r ∉ S and S ∩ D ≠ ∅, and f(S) = 0 otherwise.
Both of these special cases are known to be NP-complete. In fact, the directed
Steiner tree problem contains the Set Cover problem as a special case, and hence
no polynomial time algorithm can achieve an approximation better than O(log n)
unless P=NP [10].
In the last 10 years there has been significant progress in designing approxi-
mation algorithms for undirected network design problems [1,5,12,4,8], the analog
of this problem where the graph G is undirected. General techniques have been
developed for the undirected case, e.g., primal-dual algorithms [1,5,12,4]. Re-
cently, Kamal Jain [8] gave a 2-approximation algorithm for the undirected case
when the function f is weakly supermodular. The algorithm is based on a new
technique: Jain proved that in any basic solution to the linear programming
relaxation of the problem there is a variable whose value is at least a half.
There has been very little progress made on directed network design problems.
The main general technique used for approximating undirected network design
problems, the primal-dual method, does not extend to the directed case. Charikar
et al. [2] gave the only nontrivial approximation algorithm for the directed Steiner
tree problem. Their method provides an O(n^ε)-approximation algorithm for any
fixed ε > 0. Khuller and Vishkin [9] gave a simple 2-approximation algorithm for
the special case of the k-connected subgraph problem in undirected graphs. A
similar method also works for the directed case.
In this paper, we consider the case when f is crossing supermodular, i.e., for
every A, B ⊆ V such that A ∩ B ≠ ∅ and A ∪ B ≠ V we have that
    f(A) + f(B) ≤ f(A ∪ B) + f(A ∩ B).    (2)
Note that the k-connected subgraph problem is defined by the function f(S) = k
for all ∅ ≠ S ⊂ V, which is crossing supermodular. Hence the network design
problem with a crossing supermodular requirement function f is still NP-hard.
problem with a crossing supermodular requirement function f is still NP-hard.
However, the function defining the directed Steiner tree problem is not crossing
supermodular.
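For small ground sets, crossing supermodularity can be checked by brute force. The checker below is illustrative only (ours, not from the paper); it confirms that the k-connectivity requirement f(S) = k passes, while a non-supermodular function fails:

```python
from itertools import combinations

def crossing_supermodular(f, V):
    """Brute-force check of f(A)+f(B) <= f(A|B)+f(A&B) for all crossing
    pairs A, B (A&B nonempty, A|B != V).  Exponential in |V|; a sanity
    check for tiny ground sets only."""
    elems = sorted(V)
    n = len(elems)
    subsets = [frozenset(e for i, e in enumerate(elems) if mask >> i & 1)
               for mask in range(1, 2 ** n)]
    for A, B in combinations(subsets, 2):
        if A & B and (A | B) != frozenset(V):
            if f(A) + f(B) > f(A | B) + f(A & B):
                return False
    return True
```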
The paper contains two main results: a 2-approximation algorithm for the
integer program, and a description of the basic solutions of the LP-relaxation.
The network design problem with a crossing supermodular requirement func-
tion f is a natural extension of the case of intersecting supermodular function
considered by Frank [11], i.e., when the inequality (2) holds whenever A and B
are intersecting. Frank [11] has shown that this special case can be solved opti-
mally. However, when f is crossing supermodular the problem is NP-hard and
no approximation results are known for this general case. In this paper, we com-
bine Frank’s result and the k-connected subgraph approximation of Khuller and
Approximation Algorithms for a Directed Network Design Problem 347
subject to
    Σ_{e∈δG(S)} xe ≥ f(S),    for each S ∈ ρ,
where δG(S) denotes the set of the arcs leaving S, and f(S) is a crossing super-
modular function on the set system ρ ⊆ 2^V.
Frank [11] has considered the special case of this problem when f is inter-
secting supermodular. He showed that in this case, the LP relaxation is totally
dual integral (TDI), and gave a direct combinatorial algorithm to solve the prob-
lem. The case of crossing supermodular function is NP-hard (as it contains the
k-connected subgraph problem as a special case). Here we use Frank’s theorem
to get a simple 2-approximation algorithm.
Let r be a vertex of G. Divide all the sets in ρ into two groups as follows:
ρ1 = {A ∈ ρ : r ∉ A};    (4)
ρ2 = {A ∈ ρ : r ∈ A}.    (5)
The idea is that using Frank’s technique we can solve the network design
problem separately for ρ1 and ρ2 , and then combine the solutions to get a 2-
approximation for the original problem.
Lemma 1. For the set systems ρ1 and ρ2 , the problem (3) can be solved opti-
mally.
Proof. For any S ∈ ρ1 , define the requirement function as f1 (S) = f (S). Then
the problem for the first collection ρ1 is
    minimize Σ_{e∈E} ce xe    (6)
    subject to
        Σ_{e∈δG(S)} xe ≥ f1(S),    for each S ∈ ρ1,
subject to
    Σ_{e∈δ_{Gr}(S)} xe ≥ f2^r(S),    for each S ∈ ρ2^r,
Note that f2^r is intersecting supermodular on the set system ρ2^r: Suppose
S1, S2 ∈ ρ2^r such that S1 ∩ S2 ≠ ∅. Then S̄1 ∪ S̄2 = V \ (S1 ∩ S2) ≠ V, and we have
that r ∈ S̄1 ∩ S̄2 ≠ ∅, so the sets S̄1 and S̄2 are crossing, and from the crossing
supermodularity of f we get
That is, f2^r(S) is intersecting supermodular on ρ2^r, and hence the linear pro-
gramming relaxation of (7) is TDI, and the integer program can be solved opti-
mally. □
We have the following simple algorithm: solve (6) and (7) optimally, and
return the union of the optimal arc sets of the two problems as a solution to (3).
Theorem 1. The algorithm given above returns a solution to (3) which is within
a factor of 2 of the optimal.
Combining the optimal arc sets of (6) and (7) we will get a solution for (3)
of cost at most Σ_{e∈E} ce x̃e + Σ_{e∈E} ce x̂e, which is within a factor of 2 of the
optimal. □
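The algorithm of Theorem 1 is a three-line recipe: split the set system at a fixed vertex r, solve each half exactly, and take the union of the two arc sets. The sketch below takes the exact solver as a black box; `solve_exactly` is our placeholder (standing in, e.g., for Frank's combinatorial algorithm), not an API from the paper:

```python
def two_approx_union(rho, r, solve_exactly):
    """Split rho by membership of r (sets (4) and (5)), solve each
    subproblem exactly, and return the union of the two arc sets."""
    rho1 = [S for S in rho if r not in S]   # sets avoiding r
    rho2 = [S for S in rho if r in S]       # sets containing r
    arcs1 = solve_exactly(rho1)
    arcs2 = solve_exactly(rho2)
    return set(arcs1) | set(arcs2)
```

Each half is solved optimally, so each arc set costs at most the optimum of (3); their union therefore costs at most twice the optimum.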
The second main result of the paper is the following property of the basic
solutions of (9).
The rest of this section gives a proof of this theorem. The outline of the
proof is analogous to Jain’s [8] proof. Consider a basic solution x to the linear
program.
– First we may assume without loss of generality that xe > 0 for all arcs e.
To see this simply delete all the arcs from the graph that have xe = 0. Also
assume xe < 1 for all arcs e; in the opposite case the theorem obviously
holds.
– If x is a basic solution with m non-zero variables, then there must be m
linearly independent inequalities in the problem that are satisfied with
equality. Each inequality corresponds to a set S, and those satisfied with
equality correspond to tight sets, i.e., sets S such that x(δG(S)) = f(S). We use the
well-known structure of tight sets to show that there are m linearly indepen-
dent such equations that correspond to a cross-free family of m tight sets.
(A family of sets is cross-free if for all pairs of sets A and B in the family one
of the four sets A \ B, B \ A, A ∩ B or the complement of A ∪ B is empty.)
– We will use the fact that a cross-free family of sets can be naturally repre-
sented by a forest.
– We will prove the theorem by contradiction. Assume that the statement is
not true, and all variables have value xe < 1/4. We consider any subgraph
of the forest; using induction on the size of the subgraph we show that the
k sets in this family have more than 2k endpoints. Applying this result for
the whole forest we get that the m tight sets of the cross-free family have
more than 2m endpoints. This is a contradiction since m arcs can have only
2m endpoints.
The hard part of the proof is the last step in this outline. While Jain can
establish the same contradiction based on the assumption that all variables have
xe < 1/2, we will need the stronger assumption that xe < 1/4 to reach a con-
tradiction.
We start the proof by discussing the structure of tight sets. Call a set S
tight if x(δG (S)) = f (S). Two sets A and B are called intersecting if A ∩ B,
A \ B, B \ A are all nonempty. A family of sets is laminar if no two sets in it are
intersecting, i.e., every pair of sets is either disjoint or one is contained in the
other.
Two sets are called crossing if they are intersecting and A∪B is not the whole
set V . A family of sets is cross-free if no two sets in the family are crossing. The
key lemma for network design problems with crossing supermodular requirement
function is that crossing tight sets have tight intersection and union, and the rows
corresponding to these four sets in the constraint matrix are linearly dependent.
Let AG (S) denote the row corresponding to the set S in the constraint matrix
of (1).
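The definitions of intersecting, crossing, laminar, and cross-free translate directly into code; the following small checker is illustrative only:

```python
def intersecting(A, B):
    # A and B intersect: A & B, A - B, and B - A are all nonempty
    return bool(A & B) and bool(A - B) and bool(B - A)

def crossing(A, B, V):
    # crossing: intersecting and the union is not the whole ground set
    return intersecting(A, B) and (A | B) != V

def is_laminar(family):
    # no two members intersect: every pair is disjoint or nested
    fam = [set(S) for S in family]
    return not any(intersecting(A, B)
                   for i, A in enumerate(fam) for B in fam[i + 1:])

def is_cross_free(family, V):
    fam = [set(S) for S in family]
    V = set(V)
    return not any(crossing(A, B, V)
                   for i, A in enumerate(fam) for B in fam[i + 1:])
```

Note how the same pair of sets can be cross-free on one ground set but not on a larger one, since enlarging V can turn an intersecting pair into a crossing pair.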
Lemma 2. If two crossing sets A and B are tight then their intersection and
union are also tight, and if xe > 0 for all arcs then AG (A) + AG (B) = AG (A ∪
B) + AG (A ∩ B).
Proof. First observe that the function x(δG(S)) is submodular, and that the sub-
modular inequality holds with equality if and only if there are no positive weight
arcs crossing from A to B or from B to A, i.e., if x(δG(A, B)) = x(δG(B, A)) = 0,
where δG(A, B) denotes the arcs that are leaving A and entering B.
Next consider the chain of inequalities, using the crossing submodularity of
x(δG (.)), supermodularity of f and the fact that A and B are tight.
f(A ∪ B) + f(A ∩ B) ≥ f(A) + f(B) = x(δG(A)) + x(δG(B))
    ≥ x(δG(A ∪ B)) + x(δG(A ∩ B)) ≥ f(A ∪ B) + f(A ∩ B).
This implies that both A ∩ B and A ∪ B are tight and x(δG (A, B)) =
x(δG (B, A)) = 0. By our assumption that xe > 0 on all arcs e, this im-
plies that δG (A, B) and δG (B, A) are both empty, and so AG (A) + AG (B) =
AG(A ∪ B) + AG(A ∩ B). □
The main tool for the proof is a cross-free family of |E(G)| linearly indepen-
dent tight sets.
Lemma 3. For any basic solution x that has xe > 0 for all arcs e, there exists
a cross-free family Q of tight subsets of V satisfying the following:
– |Q| = |E(G)|.
– The vectors AG (S), S ∈ Q are independent.
– For every set S ∈ Q, f (S) ≥ 1.
Proof. For any basic solution x that has xe > 0 for all arcs e there must be a
set of |E(G)| tight sets so that the vectors AG (S) corresponding to the sets S
are linearly independent.
Let Q be the family of all tight sets S. We use the technique of [7] to uncross
sets in Q and to create the family of tight sets in the lemma. Note that the rank
of the set of vectors AG (S) is |E(G)|. Suppose there are two crossing tight sets
A and B in the family. We can delete either of A or B from our family and add
both the union and the intersection. This step maintains the properties that
– all the sets in our family are tight,
– the rank of the set of vectors AG (S) for S ∈ Q is |E(G)|.
This is true, as the union and intersection are tight by Lemma 2, and by the
same lemma the vector of the deleted element B can be expressed as a linear
combination of the other three as AG (B) = AG (A ∪ B) + AG (A ∩ B) − AG (A).
In [7] it was shown that a polynomial sequence of such steps will result in a
family whose sets are cross-free. Now we can delete dependent elements to get
the family of sets satisfying the first two properties stated in the lemma.
To see the last property observe that the assumption that xe > 0 for all arcs
e implies that all the tight sets S that have a nonzero vector AG (S) must have
f(S) ≥ 1. □
Lemma 4. For any cross-free family Q, the round sets R ∈ Q, together with the
sets V \ S for the square sets S ∈ Q, form a laminar family.
The laminar family as given by Lemma 4 has the following forest rep-
resentation. Let R1 , ..., Rk be the round sets and S1 , . . . , Sl be the square
sets of a cross-free family Q. Consider the corresponding laminar family L =
{R1, ..., Rk, S̄1, ..., S̄l}. Define the forest F as follows. The node set of F is L,
and there is an arc from U ∈ L to W ∈ L if W is the smallest subset in L con-
taining U . See the figure below for an example of a laminar family L obtained
from a cross-free family Q and the tree representation of L.
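The forest representation can be computed by giving each member of L its smallest proper superset in L as parent; a sketch, assuming the members of L are given as distinct sets:

```python
def laminar_forest(L):
    """Forest representation of a laminar family: the parent of U is the
    smallest member of L properly containing U (None for roots)."""
    sets = sorted((frozenset(S) for S in L), key=len)
    parent = {}
    for i, U in enumerate(sets):
        # only larger sets can properly contain U, so scan the tail
        candidates = [W for W in sets[i + 1:] if U < W]
        parent[U] = min(candidates, key=len) if candidates else None
    return parent
```

In a laminar family the smallest proper superset is unique: two incomparable supersets of U would intersect each other, contradicting laminarity.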
Now let’s give some intuition for our main proof. Lemma 3 says that
    |Q| = |E(G)|.    (10)
The idea of the proof comes from the following observation. Assuming that
the statement of Theorem 2 is not true, that is, xe < 1/4 for every e ∈ E(G),
distribute the endpoints such that each subset of Q gets at least 2 endpoints
and some subsets get more than 2 endpoints. This will contradict (10). How to
find this kind of distribution? The concept of incidence defined below gives the
necessary connection between endpoints and subsets.
We say that an arc e = i → j of G = (V, E) leaves a subset U ∈ Q if i ∈ U
and j ∈ Ū. Consider an arc e = i → j of G = (V, E). We will define to which sets
the head and the tail of this arc are incident. If a node of the graph is the head or
the tail of many arcs, then each head and tail at the same node may be incident
to different sets. If U ∈ L is the smallest round subset of L such that e is leaving
some tree structures and for each of them state a different k as the minimum
number of endpoints that the root of that particular subtree can get.
Next we define the necessary tree structures. See the figures for examples.
A chain is a subtree of nodes, all of which are unary except maybe the tail
(the lowest node in the chain). Note that a node of any arity is itself a chain.
A chain with a non-unary tail is called the survival chain of its head (the
highest node in the chain). So if a node is not unary then its survival chain is
the node itself.
A chain of nodes having the same shape and requirement 1 is called a uniform-chain.
A chain-merger is a union of a round-shape uniform-chain and a square-
shape uniform-chain such that the head of one of the chains is a child of a node
of the other chain.
A subtree is called chameleon if
– the root has one child;
– the survival chain of the root consists of nodes of both shapes;
– the tail of the survival chain has two children.
The rest of this section is the following main lemma which completes the
proof of the theorem.
Lemma 5. Suppose every arc takes value strictly less than 1/4. Then for any
rooted subtree of the forest, we can distribute the endpoints contained in it such
that every vertex gets at least 2 endpoints and the root gets at least
– 5 if the subtree is a uniform-chain;
– 6 if the subtree is a chameleon;
– 7 if the subtree is a chain-merger;
– 10 if the root has at least 3 children;
– 8 otherwise.
Case 1: If R has at least four children, by induction it gets label at least 4 · 3 = 12,
since each child has an excess of at least 3.
Case 3: Suppose R has two children, S1 and S2 . Then R needs label 7 if its
subtree is a chain-merger and label 8 otherwise. Let’s consider cases in that
order.
(Figure: case (a), when f(R) = f(T) + 1, and case (b).)
Case 4: Suppose R has one child, S. Consider cases dependent on the structure
of the subtrees of R and S.
The idea of the algorithm is to iteratively round the solutions of the linear
programs derived from the main IP formulation as described below.
Based on Theorem 2, if we include the arcs with value at least a quarter in
the solution of the main integer program (1) then the factor that we lose in the
cost is at most 4. These arcs might not form a feasible solution for (1) yet. We
reduce the LP to reflect the set of edges included in the solution, and apply the
method recursively.
Formally, let x∗ be an optimal basic solution of (9) and E1/4+ be the set of
the arcs which take value at least 1/4 in x∗ . Let Eres = E − E1/4+ . Consider
the residual graph Gres = (V, Eres ) and the corresponding residual LP:
    minimize Σ_{e∈Eres} ce xe    (11)
    subject to
        Σ_{e∈δ_{Gres}(S)} xe ≥ f(S) − |E1/4+ ∩ δG(S)|,    for each S ⊆ V,
This residual program has the same structure as (9); the difference is that the
graph G and the requirements f (S) are reduced respectively to Gres and f (S) −
|E1/4+ ∩ δG (S)| considering that the arcs from E1/4+ are already included in the
integral solution. Theorem 2 can be applied to (11) if the reduced requirement
function is crossing supermodular which is shown next.
A function f : 2^V → Z is called submodular if −f is supermodular, i.e., if for
all A, B ⊆ V, f(A) + f(B) ≥ f(A ∩ B) + f(A ∪ B). The functions |δG(.)| and more
generally x(δG(.)) for any nonnegative vector x are the most classical examples of
submodular functions. The requirement function in the residual problem is the
difference of a crossing supermodular function f and this submodular function,
so it is also crossing supermodular.
Theorem 3. Let G̃ = (V, Ẽ) be a subgraph of the directed graph G = (V, E). If
f : 2^V → Z is a crossing supermodular function, then f(S) − |δG̃(S)| is also a
crossing supermodular function.
Theorem 4. The iterative algorithm given above returns a solution for (1)
which is within a factor of 4 of the optimal.
Proof. We prove by induction that given a basic solution x∗ to (9) the method
finds a feasible solution to the integer program (1) of cost at most 4cx∗ . Consider
an iteration of the method. We add the arcs E1/4+ to the integer solution. The
cost of these arcs is at most P 4 times the corresponding part of the fractional
solution x∗ , i.e., c(E1/4+ ) ≤ 4 e∈E1/4+ ce x∗e . A feasible solution to the residual
problem can be obtained by projecting the current solution x∗ to the residual
arcs. Using purification we obtain a basic solution to the residual linear program
of cost at most the cost of x∗ restricted to the arcs Eres . We recursively apply
the method to find an integer solution to the residual problem of cost at most
4 times this cost, i.e., at most 4 Σ_{e∈Eres} ce x∗e. This proves the theorem. □
The algorithm stated above assumes that we solve a linear program in every
iteration. However, as seen by the proof, it suffices to solve the linear program
(9) once, and use purification to obtain a basic solution to the residual problems
in subsequent iterations.
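The iterative rounding loop can be sketched as follows. Here `solve_lp_basic` is our placeholder for an LP solver returning a basic optimal solution (not an actual API), and for simplicity the sketch re-solves the LP in every iteration instead of modeling purification:

```python
def iterative_rounding(solve_lp_basic, E, f, threshold=0.25):
    """Iterated 1/4-rounding sketch: solve the LP relaxation, fix every
    arc whose value is at least `threshold`, reduce the requirements,
    and repeat on the residual graph.  `solve_lp_basic(E, f)` must
    return a dict arc -> fractional value."""
    chosen = []
    E = list(E)
    while E:
        x = solve_lp_basic(E, f)
        fixed = [e for e in E if x[e] >= threshold]
        if not fixed:          # cannot happen by Theorem 2; guard anyway
            break
        chosen += fixed
        E = [e for e in E if x[e] < threshold]
        f_old = f
        # residual requirement: f(S) minus the chosen arcs leaving S
        f = lambda S, f_old=f_old, fixed=tuple(fixed): \
            f_old(S) - sum(1 for (u, v) in fixed if u in S and v not in S)
    return chosen
```

Each round loses at most a factor 4 in cost on the fixed arcs, and by Theorem 3 the residual requirement stays crossing supermodular, so Theorem 2 applies again.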
References
1. A. Agrawal, P. Klein, and R. Ravi. When trees collide: An approximation algorithm
for generalized Steiner tree problems on networks. In Proceedings of the 23rd ACM
Symposium on Theory of Computing, pages 134–144, 1991.
2. M. Charikar, C. Chekuri, T. Cheung, Z. Dai, A. Goel, S. Guha, and M. Li. Ap-
proximation algorithms for directed Steiner problems. In Proceedings of the 9th
Annual Symposium of Discrete Algorithms, pages 192–200, 1998.
3. J. Edmonds. Edge-disjoint branchings. In R. Rustin, editor, Combinatorial Algo-
rithms, pages 91–96. Academic Press, New York, 1973.
4. M. X. Goemans, A. Goldberg, S.Plotkin, D. Shmoys, E. Tardos, and D. P.
Williamson. Approximations algorithms for network design problems. In Pro-
ceedings of the 5th Annual Symposium on Discrete Algorithms, pages 223–232,
1994.
5. M. X. Goemans and D. P. Williamson. A general approximation technique for
constrained forest problem. SIAM Journal on Computing, 24:296–317, 1995.
6. M. Grötschel, L. Lovász, and A. Schrijver. Geometric algorithms and combinatorial
optimization. Springer-Verlag, 1988.
7. C. A. Hurkens, L. Lovász, and A. Schrijver, and É. Tardos. How to tidy up your
set-system? In Proceedings of Colloquia Mathematica Societatis János Bolyai 52,
Combinatorics, pages 309–314, 1987.
8. K. Jain. A factor 2 approximation algorithm for the generalized Steiner network
problem. In Proceedings of the 39th Annual Symposium on the Foundation of
Computer Science, pages 448–457, 1998.
9. S. Khuller and U. Vishkin. Biconnectivity approximations and graph carvings.
Journal of the Association for Computing Machinery, 41:214–235, 1994.
10. R. Raz and S. Safra. A sub-constant error-probability low-degree test, and a sub-
constant error-probability PCP characterization of NP. In Proceedings of the 29th
Annual ACM Symposium on the Theory of Computing, pages 475–484, 1997.
11. A. Frank. Kernel systems of directed graphs. Acta Sci. Math (Szeged), 41:63–76,
1979.
12. D. P. Williamson, M. X. Goemans, M. Mihail, and V. V. Vazirani. A primal-dual
approximation algorithm for generalized Steiner network problems. Combinatorica,
15:435–454, 1995.
Optimizing over All Combinatorial Embeddings
of a Planar Graph
(Extended Abstract)
Abstract. We study the problem of optimizing over the set of all com-
binatorial embeddings of a given planar graph. Our objective function
prefers certain cycles of G as face cycles in the embedding. The motiva-
tion for studying this problem arises in graph drawing, where the chosen
embedding has an important influence on the aesthetics of the drawing.
We characterize the set of all possible embeddings of a given biconnected
planar graph G by means of a system of linear inequalities with {0, 1}-
variables corresponding to the set of those cycles in G which can appear
in a combinatorial embedding. This system of linear inequalities can be
constructed recursively using SPQR-trees and a new splitting operation.
Our computational results on two benchmark sets of graphs are surpris-
ing: The number of variables and constraints seems to grow only linearly
with the size of the graphs although the number of embeddings grows
exponentially. For all tested graphs (up to 500 vertices) and linear objec-
tive functions, the resulting integer linear programs could be generated
within 10 minutes and solved within two seconds on a Sun Enterprise
10000 using CPLEX.
1 Introduction
A graph is called planar if it admits a drawing in the plane without edge
crossings. There are infinitely many different drawings of every planar graph,
but they can be divided into a finite number of equivalence classes. We call two
planar drawings of the same graph equivalent if the sequence of the edges in
clockwise order around each node is the same in both drawings. The equivalence
classes of planar drawings are called combinatorial embeddings. A combinatorial
embedding also defines the set of cycles in the graph that bound faces in a planar
drawing.
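Concretely, a combinatorial embedding can be stored as a rotation system: for each node, the cyclic (say clockwise) order of its neighbours. The face cycles can then be recovered by a standard face-tracing rule. The following sketch is our own illustration (function names and the clockwise convention are assumptions, not from this paper); which side of a traced cycle is "empty" depends on the orientation convention chosen.

```python
def trace_faces(rotation):
    """Enumerate the directed face cycles of a combinatorial embedding.

    rotation[v] is the cyclic (clockwise) order of the neighbours of v.
    Face-tracing rule: from the directed edge (u, v), the next edge of the
    same face is (v, w), where w follows u in the rotation at v.
    """
    darts = {(u, v) for u in rotation for v in rotation[u]}
    faces = []
    while darts:
        u, v = start = next(iter(darts))
        cycle = []
        while True:
            cycle.append((u, v))
            darts.discard((u, v))
            nbrs = rotation[v]
            w = nbrs[(nbrs.index(u) + 1) % len(nbrs)]
            u, v = v, w
            if (u, v) == start:
                break
        faces.append(cycle)
    return faces

# A planar rotation system for K4 (node 0 drawn inside the triangle 1-2-3).
rotation = {0: [1, 2, 3], 1: [0, 3, 2], 2: [0, 1, 3], 3: [0, 2, 1]}
faces = trace_faces(rotation)
assert len(faces) == 4  # Euler: m - n + 2 = 6 - 4 + 2 face cycles
```

The assertion checks the count of face cycles against Euler's formula; changing the rotation at a node generally produces a different set of face cycles, i.e. a different combinatorial embedding.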
The complexity of embedding planar graphs has been studied by various
authors in the literature [4, 3, 5]. For example, Bienstock and Monma have given
polynomial-time algorithms for computing an embedding of a planar graph that
* Partially supported by DFG-Grant Mu 1129/3-1, Forschungsschwerpunkt "Effiziente
Algorithmen für diskrete Probleme und ihre Anwendungen".
** Supported by the Graduiertenkolleg "Effizienz und Komplexität von Algorithmen
und Rechenanlagen".
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 361–376, 1999.
© Springer-Verlag Berlin Heidelberg 1999
362 Petra Mutzel and René Weiskircher
minimizes various distance functions to the outer face [4]. Moreover, they have
shown that computing an embedding that minimizes the diameter of the dual
graph is NP-hard.
In this paper we deal with the following optimization problem concerning
embeddings: Given a planar graph and a cost function on its cycles, find an
embedding Π such that the sum of the costs of the cycles that appear as
face cycles in Π is minimized. When we choose cost 1 for all cycles of length
greater than or equal to five and cost 0 for all other cycles, the problem is NP-hard [13].
Our motivation to study this optimization problem and in particular its inte-
ger linear programming formulation arises in graph drawing. Most algorithms for
drawing planar graphs need not only the graph as input but also a combinatorial
embedding. The aesthetic properties of the drawing often changes dramatically
when a different embedding is chosen.
Fig. 1. Two drawings of the same graph, generated from different combinatorial
embeddings: drawing (a) has 13 bends, drawing (b) has 7 bends
Figure 1 shows two different drawings of the same graph that were generated
using the bend minimization algorithm by Tamassia [12]. The algorithm used
different combinatorial embeddings as input. Drawing 1(a) has 13 bends while
drawing 1(b) has only 7 bends. It makes sense to look for the embedding that
will produce the best drawing. Our original motivation was the following.
In graph drawing, it is often desirable to optimize some cost function over all
possible embeddings of a planar graph. In general, these optimization problems
are NP-hard [9]. For example: The number of bends in an orthogonal planar
drawing highly depends on the chosen planar embedding. In the planarization
method, the number of crossings highly depends on the chosen embedding when
the deleted edges are reinserted into a planar drawing of the rest-graph. Both
problems can be formulated as flow problems in the geometric dual graph. A
flow between vertices in the geometric dual graph corresponds to a flow between
adjacent face cycles in the primal graph. Once we have characterized the set of all
feasible embeddings (via an integer linear formulation on the variables associated
with each cycle), we can use this in an ILP-formulation for the corresponding flow
problem. Here, the variables consist of ‘flow variables’ and ‘embedding variables’.
This paper introduces an integer linear program whose set of feasible solu-
tions corresponds to the set of all possible combinatorial embeddings of a given
biconnected planar graph. One way of constructing such an integer linear pro-
gram is by using the fact that every combinatorial embedding corresponds to a
2-fold complete set of cycles (see MacLane [11]). The variables in such a program
correspond to the simple cycles of the graph; the constraints guarantee that the
chosen subset of all simple cycles is complete and that no edge of the graph
appears in more than two simple cycles of the subset.
We have chosen another way of formulating the problem. The advantage
of our formulation is that we only introduce variables for those simple cycles
that form the boundary of a face in at least one combinatorial embedding of
the graph, thus reducing the number of variables tremendously. Furthermore,
the constraints are derived using the structure of the graph. We achieve this
by constructing the program recursively using a data structure called SPQR-
tree suggested by Di Battista and Tamassia ([1]) for the on-line maintenance
of triconnected components. The static variant of this problem was studied in
[10]. SPQR-trees can be used to code and enumerate all possible combinatorial
embeddings of a biconnected planar graph. Furthermore we introduce a new
splitting operation which enables us to construct the linear description recur-
sively.
Our computational results on two benchmark sets of graphs have been quite
surprising. We expected that the size of the linear system would grow exponentially
with the size of the graph. Surprisingly, we observed only linear growth.
The time for generating the system grows sub-exponentially, but for
practical instances it is still reasonable: For a graph with 500 vertices and 10^19
different combinatorial embeddings, the construction of the ILP took about 10
minutes. Even more surprising, solving the generated ILPs
took at most 2 seconds using CPLEX.
Section 2 gives a brief description of the data structure SPQR-tree. In Sec-
tion 3 we describe the recursive construction of the linear constraint system using
a new splitting operation. Our computational results are described in Section 4.
2 SPQR-Trees
In this section, we give a brief description of the SPQR-tree data structure for
biconnected planar graphs. A cut vertex of a graph G = (V, E) is a vertex whose
removal increases the number of connected components; a connected graph that
has no cut vertex is called biconnected. A set of two vertices whose removal
increases the number of connected components is called a separation pair; a
connected graph without a separation pair is called triconnected.
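The cut-vertex condition can be tested with the classical depth-first-search low-point computation of Hopcroft and Tarjan. Below is a minimal sketch, assuming a simple undirected graph given as an adjacency dict; the function name and representation are our own, not from this paper.

```python
def is_biconnected(adj):
    """Return True iff the graph is connected and has no cut vertex.

    adj: dict mapping each vertex to the list of its neighbours.
    Uses the DFS low-point method: a non-root vertex v is a cut vertex
    iff some DFS child w has low[w] >= disc[v]; the root is a cut vertex
    iff it has more than one DFS child.  (Simple graphs only.)
    """
    verts = list(adj)
    if len(verts) < 2:
        return False
    disc, low = {}, {}
    timer = [0]
    found_cut = [False]

    def dfs(v, parent):
        disc[v] = low[v] = timer[0]
        timer[0] += 1
        children = 0
        for w in adj[v]:
            if w == parent:
                continue
            if w in disc:                       # back edge
                low[v] = min(low[v], disc[w])
            else:                               # tree edge
                children += 1
                dfs(w, v)
                low[v] = min(low[v], low[w])
                if parent is not None and low[w] >= disc[v]:
                    found_cut[0] = True
        if parent is None and children > 1:
            found_cut[0] = True

    dfs(verts[0], None)
    return len(disc) == len(verts) and not found_cut[0]

assert is_biconnected({0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]})  # C4
assert not is_biconnected({0: [1], 1: [0, 2], 2: [1]})               # path: 1 is a cut vertex
```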
Fig. 2. Pertinent graphs and skeletons of the different node types of an SPQR-tree
When we see the SPQR-tree as an unrooted tree, we get the same tree no
matter what edge of the graph was marked as the reference edge. The skeletons
of the nodes are also independent of the choice of the reference edge. Thus,
we can define a unique SPQR-tree for each biconnected planar graph. Another
important property of these trees is that their size (including the skeletons) is
linear in the size of the original graph and they can be constructed in linear time
([1]).
As described in [1], SPQR-trees can be used to represent all combinatorial
embeddings of a biconnected planar graph. This is done by choosing embeddings
for the skeletons of the nodes in the tree. The skeletons of S- and Q-nodes are
simple cycles, so they have only one embedding. The skeletons of R-nodes are
always triconnected graphs. In most publications, combinatorial embeddings are
defined in such a way that only one combinatorial embedding of a triconnected
planar graph exists (note that a combinatorial embedding does not fix the outer
face of a drawing which realizes the embedding). Our definition distinguishes
between two combinatorial embeddings which are mirror-images of each other
(the order of the edges around each node in clockwise order is reversed in the
second embedding). When the skeleton of a P-node has k edges, there are (k−1)!
different embeddings of its skeleton.
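The counting rule just described (S- and Q-node skeletons have a unique embedding, each R-node skeleton has 2 mirror embeddings, and a P-node skeleton with k edges has (k − 1)! embeddings) can be sketched as follows; the function name is our own.

```python
from math import factorial, prod

def num_embeddings(r_nodes, p_degrees):
    """Number of combinatorial embeddings encoded by an SPQR-tree.

    r_nodes:   number of R-nodes (each contributes a factor 2).
    p_degrees: list with the number of skeleton edges of each P-node
               (a P-node with k edges contributes (k - 1)!).
    """
    return 2 ** r_nodes * prod(factorial(k - 1) for k in p_degrees)

assert num_embeddings(0, []) == 1        # only S- and Q-nodes: unique embedding
assert num_embeddings(3, [3, 4]) == 96   # 2^3 * 2! * 3!
```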
Every combinatorial embedding of the original graph defines a unique com-
binatorial embedding for each skeleton of a node in the SPQR-tree. Conversely,
when we define an embedding for each skeleton of a node in the SPQR-tree, we
define a unique embedding for the original graph. The reason for this fact is that
each skeleton is a simplified version of the original graph where the split com-
ponents of some split pair are replaced by single edges. Thus, if the SPQR-tree
contains r R-nodes and k P-nodes whose skeletons have L1 , . . . , Lk edges, then
the number of combinatorial embeddings represented by the tree is

    2^r · (L1 − 1)! · (L2 − 1)! · · · (Lk − 1)! .
The skeletons of P-nodes are multi-graphs, i.e., they may have multiple edges
between the same pair of nodes. Because we want to talk about directed cycles,
it is convenient to work with bidirected graphs. A directed graph
is called bidirected if there exists a bijective function r : E → E such that for
every edge e = (v, w) with r(e) = e^R we have e^R = (w, v) and r(e^R) = e. We can
turn an undirected graph into a bidirected graph by replacing each undirected
edge by two directed edges that go in opposite directions. The undirected graph
G that can be transformed in this way to get the bidirected graph G′ is called
the underlying graph of G′.
A directed cycle in a bidirected graph G = (V, E) is a sequence of edges
c = ((v1 , v2 ), (v2 , v3 ), . . . , (vk , v1 )) = (e1 , e2 , . . . , ek ) with the
properties that every node of the graph is contained in at most two edges of c
and, if k = 2, that e1 ≠ e2 holds. A planar drawing of a bidirected graph
is a drawing of its underlying graph, so the embeddings of a bidirected graph
are identical to the embeddings of the underlying graph.
A face cycle in a combinatorial embedding of a bidirected planar graph is a
directed cycle of the graph, such that in any planar drawing that realizes the
embedding, the left side of the cycle is empty. Note that the number of face
cycles of a planar biconnected graph is m − n + 2 where m is the number of
edges in the graph and n the number of nodes.
Now we are ready to construct an integer linear program (ILP) in which the
feasible solutions correspond to the combinatorial embeddings of a biconnected
planar bidirected graph. The variables of the program are binary and they corre-
spond to directed cycles in the graph. As objective function, we can choose any
linear function on the directed cycles of the graph. With every cycle c we asso-
ciate a binary variable xc . In a feasible solution of the integer linear program, a
variable xc has value 1 if the associated cycle is a face cycle in the represented
combinatorial embedding, and value 0 otherwise.
We use a recursive approach to construct the variables and constraints of the ILP.
Therefore, we need an operation that constructs a number of smaller problems
out of our original problem such that we can use the variables and constraints
computed for the smaller problems to compute the ILP for the original problem.
This is done by splitting the SPQR-tree at some decision-node v. Let e be an
edge incident to v whose other endpoint is not a Q-node. Deleting e splits the
tree into two trees T1 and T2 . We add a new edge with a Q-node attached to
both trees to replace the deleted edge and thus ensure that T1 and T2 become
complete SPQR-trees again. The edges corresponding to the new Q-nodes are
called split edges. For incident edges of v, whose other endpoint is a Q-node, the
splitting is not necessary. Doing this for each edge incident to v results in d + 1
smaller SPQR-trees, called the split-trees of v, where d is the number of inner
nodes adjacent to v . This splitting process is shown in Fig. 4. Since the new
trees are SPQR-trees, they represent planar biconnected graphs which are called
the split graphs of v. We will show how to compute the ILP for the original graph
using the ILPs computed for the split graphs.
Fig. 4. Splitting an SPQR-tree at the decision node v: the tree is split into the
split-trees T1 , T2 , T3 , . . . , and new Q-nodes are attached to replace the
deleted edges
3.3 The Integer Linear Program for SPQR-Trees with One Inner
Node
We observe that a graph whose SPQR-tree has only one inner node is isomorphic
to the skeleton of this inner node. The split-tree of v that includes v, called
the center split-tree of v, represents a graph that is isomorphic to the whole
graph.
The ILPs for SPQR-trees with only one inner node are defined as follows:
– S-node: When the only inner node of the SPQR-tree is an S-node, the whole
graph is a simple cycle. Thus it has two directed cycles, and both are face
cycles in the only combinatorial embedding of the graph. So the ILP consists
of two variables, both of which must be equal to one.
– R-node: In this case, the whole graph is triconnected. According to our
definition of planar embedding, every triconnected graph has exactly two
embeddings, which are mirror-images of each other. When the graph has m
edges and n nodes, we have k = 2(m − n + 2) variables and two feasible
solutions. The constraints are given by the convex hull of the points in
k-dimensional space that correspond to the two solutions.
– P-node: The whole graph consists only of two nodes connected by k edges
with k ≥ 3. Every directed cycle in the graph is a face cycle in at least one
embedding of the graph, so the number of variables is equal to the number
of directed cycles in the graph. The number of cycles is l = 2·(k choose 2) = k(k − 1),
because pairing two edges always gives an undirected cycle and, since we are talking
about directed cycles, we get twice the number of pairs of edges. As already
mentioned, the number of embeddings is (k − 1)!. The constraints are given
as the convex hull of the points in l-dimensional space that represent these
embeddings.
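As an illustration of the P-node case, the sketch below enumerates the (k − 1)! embeddings of a P-node skeleton by fixing one edge and permuting the others. A face cycle is identified with an ordered pair (i, j), meaning edge i is traversed from one pole to the other and edge j back; this representation is our own, chosen only to make the counts checkable.

```python
from itertools import permutations

def p_node_embeddings(k):
    """Enumerate the (k-1)! embeddings of a P-node skeleton with k
    parallel edges between two poles.  An embedding is a cyclic order
    of the edges; its k face cycles are the pairs of consecutive edges
    in that order (one edge traversed forwards, the next backwards)."""
    embeddings = []
    for rest in permutations(range(1, k)):   # fix edge 0, permute the rest
        order = (0,) + rest
        faces = {(order[i], order[(i + 1) % k]) for i in range(k)}
        embeddings.append(faces)
    return embeddings

embs = p_node_embeddings(4)
assert len(embs) == 6                        # (k-1)! = 3! embeddings
assert all(len(f) == 4 for f in embs)        # m - n + 2 = k face cycles each
assert len({c for f in embs for c in f}) == 12   # l = k(k-1) directed cycles
```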
3.4 Construction of the ILP for SPQR-Trees with More than One
Inner Node
We define how to construct the ILP of an SPQR-tree T from the ILPs of the
split-trees of a decision node v of T . Let G be the graph that corresponds to T
and T1 , . . . , Tk the split-trees of v representing the graphs G1 to Gk . We assume
that T1 is the center split-tree of v. Now we consider the directed cycles of G.
We can distinguish two types:
1. Local cycles are cycles of G that also appear in one of the graphs G1 , . . . , Gk .
2. Global cycles of G are not contained in any of the Gi .
For a cycle c of a split graph Gi , let R(c) denote the set of cycles of G that are
represented by c. So far, we have defined all the variables for the integer linear
program of G.
The set C of all constraints of the ILP of T is given by
C = Cl ∪ Cc ∪ CG .
First we define the set Cl , the set of lifted constraints of T . Each of the
graphs G1 , . . . , Gk is a simplified version of the original graph G: it can be
generated from G by replacing some split components of one or more split pairs
by single edges. When a constraint is valid for a split graph, a
weaker version of this constraint is still valid for the original graph. The process
of generating these new constraints is called lifting, because we introduce new
variables that cause the constraint to describe a higher-dimensional half-space
or hyperplane. Let

    ∑_{j=1}^{l} a_j x_{c_j}  ⊙  R

be a constraint of a split-tree, where ⊙ stands for a relation in {≤, ≥, =}, and
let X be the set of all variables of T . Then the lifted constraint for the tree T
is

    ∑_{j=1}^{l} a_j ∑_{c ∈ R(c_j) ∩ X} x_c  ⊙  R .
We define Cl as the set of lifted constraints of all the split-trees. The number of
constraints in Cl is the sum of all constraints in all split-trees.
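The lifting step can be sketched as a small transformation on constraints. Here a constraint is a triple (coefficient map, relation, right-hand side); this data layout is our own illustration, not the authors' implementation.

```python
def lift(constraint, R, X):
    """Lift a split-tree constraint  sum_j a_j * x_{c_j}  (rel)  b
    to the full tree T: each variable x_{c_j} is replaced by the sum of
    the variables of the cycles in R(c_j) that exist in the full tree.

    constraint: (coeffs, rel, b) with coeffs a dict cycle -> a_j.
    R: dict mapping each split-graph cycle to its set of represented cycles.
    X: set of cycles that carry a variable in the full tree.
    """
    coeffs, rel, b = constraint
    lifted = {}
    for cj, a in coeffs.items():
        for c in R[cj] & X:
            lifted[c] = lifted.get(c, 0) + a
    return lifted, rel, b

# Hypothetical example: c1 represents cycles d1, d2; c2 represents d3.
lifted = lift(({'c1': 1, 'c2': 2}, '<=', 1),
              {'c1': {'d1', 'd2'}, 'c2': {'d3'}},
              {'d1', 'd2', 'd3'})
assert lifted == ({'d1': 1, 'd2': 1, 'd3': 2}, '<=', 1)
```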
The set Cc is the set of choice constraints. For a cycle c in Gi , which includes
a split edge, we have |R(c)| > 1. All the cycles in R(c) share either at least one
directed edge or they pass a split graph of the split node in the same direction.
Therefore, only one of the cycles in R(c) can be a face cycle in any combinatorial
embedding of G (proof omitted). For each variable xc in a split tree with |R(c)| >
1 we have therefore one constraint that has the following form:
    ∑_{c′ ∈ R(c), x_{c′} ∈ X} x_{c′}  ≤  1 .
The set CG consists of only one constraint, called the center graph constraint.
Let F be the number of face cycles in a combinatorial embedding of G1 , let C_G
be the set of all global cycles of G, and let C_L be the set of all local cycles of
G1 . Then the constraint is

    ∑_{c ∈ (C_G ∪ C_L) ∩ C} x_c  =  F .
Because the proof of the theorem is quite complex and the space is limited,
we can only give a sketch of the proof. The proof is split into three lemmas.
Proof (sketch): To show the first part of the lemma, we start with a drawing
Z of G that realizes the embedding Γ . When G′i is the graph Gi without its split
edge, we get a drawing Z1 of G1 by replacing in Z the drawings of the G′i with
2 ≤ i ≤ d′ by drawings of single edges that are drawn inside the area of the
plane formerly occupied by the drawing of G′i . We can show that each drawing of
G1 that we construct in this way realizes the same embedding Γ1 . We construct
a planar drawing of each Gi with 2 ≤ i ≤ d′ by deleting all nodes and edges
from Z that are not contained in Gi and drawing the split edge inside the area
of the plane formerly occupied by the drawing of a path in G between
the poles of Gi that does not pass through Gi . Again we can show that all
drawings produced in this way realize the same embedding Γi .
To show the second part of the lemma, we start with special planar drawings
Zi of the Gi that realize the embeddings Γi . We assume that Z1 is a straight-line
drawing (such a drawing always exists [8]) and that each Zi with 2 ≤ i ≤ d′ is
drawn inside an ellipse with the split edge on the outer face and the poles drawn
as the vertices on the major axis of the ellipse. Then we can construct a drawing
of G by replacing the drawings of the straight edges in Z1 by the drawings Zi
of the Gi from which the split edges have been deleted. We can show that every
drawing Z we construct in this way realizes the same embedding Γ of G.
To prove the main theorem, we first have to define the incidence vector of a
combinatorial embedding. Let C be the set of all directed cycles in the graph
that are face cycles in at least one combinatorial embedding of the graph. Then
the incidence vector of an embedding Γ is given as a vector in {0, 1}^{|C|} where
the components representing the face cycles of Γ have value one and all other
components have value zero.
Proof (sketch): We prove the lemma by induction on the number n of decision
nodes in the SPQR-tree T of G. The value χ(c) is the value of the component
of χ associated with the cycle c. We do not consider the case n = 0, because G
is a simple cycle in this case and has only one combinatorial embedding.
1. n = 1:
No splitting of the SPQR-tree is necessary, the ILP is defined directly by T .
The variables are defined as the set of all directed cycles that are face cycles
in at least one combinatorial embedding of G. Since the constraints of the
ILP are defined as the convex hull of all incidence vectors of combinatorial
embeddings of G, χΓ satisfies all constraints of the ILP.
2. n > 1:
From the previous lemma we know that Γ uniquely defines embeddings Γi
with incidence vectors χi for the split graphs Gi . We will use the induction
hypothesis to show that χΓ satisfies all lifted constraints. We know that the
choice constraints are satisfied by χΓ because in any embedding there can be
only one cycle passing a certain split pair in the same direction. When lifting
constraints, we replace certain variables by sums of new variables, and the
choice constraints guarantee that each such sum is either 0 or 1. Using this fact
and the construction of the χi from χΓ , we can show that the sums of the
values of the new variables are always equal to the value of the old variable.
Therefore, all lifted constraints are satisfied.
To see that the center graph constraint is satisfied, we observe that any
embedding of the skeleton of the split node has F faces. We can construct
any embedding of G from an embedding Γ1 of this skeleton by replacing
edges by subgraphs. The faces in Γ that are global cycles are represented
by faces in Γ1 and the faces that are local cycles in G are also faces in Γ1 .
Therefore the center graph constraint is also satisfied.
Lemma 3. Let G be a biconnected planar graph and χ ∈ {0, 1}^{|C|} a vector
satisfying all constraints of the ILP. Then χ is the incidence vector of a
combinatorial embedding Γ of G.
Proof: Again, we use induction on the number n of decision nodes in the
SPQR-tree T of G, and we disregard the case n = 0.
1. n = 1:
As in the previous lemma, the claim holds by the definition of the ILP.
2. n > 1:
The proof works in two stages: First we construct vectors χi for each split
graph from χ and prove that these vectors satisfy the ILPs of the Gi ; by the
induction hypothesis, they are therefore incidence vectors of embeddings Γi of the Gi .
In the second stage, we use the Γi to construct an embedding Γ for G and
show that χ is the incidence vector of Γ .
The construction of the χi works as follows: When x is a variable in the
ILP of Gi and the corresponding cycle is contained in G, then x gets the
value of the corresponding variable in χ. Otherwise, we define the value of
x as the sum of the values of all variables in χ whose cycles are represented
by the cycle of x. This value is either 0 or 1 because χ satisfies the choice
constraints.
Because χ satisfies the lifted constraints, the χi must satisfy the original
constraints, and by the induction hypothesis we know that each χi represents
an embedding Γi of Gi . Using these embeddings for the split graphs, we can
construct an embedding Γ of G as in Lemma 1.
To show that χ is the incidence vector of Γ , we define χΓ as the incidence
vector of Γ and show that χ and χΓ are identical. By construction of Γ and
χΓ , the components in χΓ and χ corresponding to local cycles must be equal.
The number of global cycles whose variable in χ has value 1 must be equal
to the number of faces in Γ consisting of global cycles. This is guaranteed
by the center graph constraint. Using the fact that for every face cycle in Γ1
there must be a represented cycle in G whose component in χ and in χΓ is
1, we can show that both vectors agree on the values of the variables of the
global cycles, and thus must be identical.
4 Computational Results
In our computational experiments, we tried to get statistical data about the size
of the integer linear program and the times needed to compute it. Our
implementation works for biconnected planar graphs with maximal degree four.
Fig. 5. Time needed to generate the ILPs (seconds) vs. number of nodes
Fig. 6. Number of combinatorial embeddings vs. number of nodes (logarithmic y-axis)
Fig. 7. Number of constraints in the ILPs vs. number of nodes
Fig. 8. Number of variables in the ILPs vs. number of nodes
Fig. 9. Maximum time needed to solve the ILPs (seconds) vs. number of nodes
The times for generating the ILPs were below one minute; the ILPs were
quite small, and CPLEX was able to solve all of them very quickly.
In order to study the limits of our method, we started test runs on extremely
difficult graphs. We used the random graph generator developed by the group
around G. Di Battista in Rome that creates biconnected planar graphs with
maximal degree four with an extremely high number of embeddings (see [2] for
detailed information). We generated graphs with the number of nodes ranging
from 25 to 500, proceeding in steps of 25 nodes and generating 10 random graphs
for each number of nodes. For each of the 200 graphs, we generated the ILP and
measured the time needed to do this. The times are shown in Fig. 5. They
grow sub-exponentially and the maximum time needed was 10 minutes on a Sun
Enterprise 10000.
The number of embeddings of each graph is shown in Fig. 6. It grows
exponentially with the number of nodes, so we used a logarithmic scale for the
y-axis. There was one graph with more than 10^19 combinatorial embeddings.
These numbers were computed by counting the number of R- and P-nodes in
the SPQR-tree of each graph. Each R-node doubles the number of combinatorial
embeddings while each P-node multiplies it by 2 or 6 depending on the number
of edges in its skeleton. Figures 7 and 8 show the number of constraints and
variables in each ILP. Surprisingly, both of them grow linearly with the number
of nodes. The largest ILP has about 2500 constraints and 1000 variables.
To test how difficult it is to optimize over the ILPs, we have chosen 10 random
objective functions for each ILP with integer coefficients between 0 and 100 and
computed a maximal integer solution using the mixed integer solver of CPLEX.
Figure 9 shows the maximum time needed for any of the 10 objective functions.
The computation time always stayed below 2 seconds.
Our future goal is to extend our formulation such that each solution represents
not only a combinatorial embedding but also an orthogonal drawing of the
graph. This will give us a chance to find drawings with the minimum number of
bends or drawings with fewer crossings. Of course, this will make the solution of
the ILP much more difficult.
References
[1] G. Di Battista and R. Tamassia. On-line planarity testing. SIAM Journal on
Computing, 25(5):956–997, October 1996.
[2] P. Bertolazzi, G. Di Battista, and W. Didimo. Computing orthogonal draw-
ings with the minimum number of bends. Lecture Notes in Computer Science,
1272:331–344, 1998.
[3] D. Bienstock and C. L. Monma. Optimal enclosing regions in planar graphs.
Networks, 19(1):79–94, 1989.
[4] D. Bienstock and C. L. Monma. On the complexity of embedding planar graphs
to minimize certain distance measures. Algorithmica, 5(1):93–109, 1990.
[5] J. Cai. Counting embeddings of planar graphs using DFS trees. SIAM Journal
on Discrete Mathematics, 6(3):335–352, 1993.
[6] G. Di Battista, A. Garg, G. Liotta, R. Tamassia, E. Tassinari, and F. Vargiu.
An experimental comparison of four graph drawing algorithms. Comput. Geom.
Theory Appl., 7:303–326, 1997.
[7] P. Eades and P. Mutzel. Algorithms and theory of computation handbook, chapter
9 Graph drawing algorithms. CRC Press, 1999.
[8] I. Fáry. On straight line representation of planar graphs. Acta Sci. Math.
(Szeged), 11:229–233, 1948.
[9] A. Garg and R. Tamassia. On the computational complexity of upward and
rectilinear planarity testing. Lecture Notes in Computer Science, 894:286–297,
1995.
[10] J. E. Hopcroft and R. E. Tarjan. Dividing a graph into triconnected components.
SIAM Journal on Computing, 2(3):135–158, August 1973.
[11] S. MacLane. A combinatorial condition for planar graphs. Fundamenta Mathe-
maticae, 28:22–32, 1937.
[12] R. Tamassia. On embedding a graph in the grid with the minimum number of
bends. SIAM Journal on Computing, 16(3):421–444, 1987.
[13] G. J. Woeginger. Personal communication, July 1998.
A Fast Algorithm for Computing Minimum
3-Way and 4-Way Cuts*

Hiroshi Nagamochi and Toshihide Ibaraki
1 Introduction
Let G = (V, E) stand for an undirected graph with a set V of vertices and a
set E of edges being weighted by non-negative real numbers, and let n and m
denote the numbers of vertices and edges, respectively. For an integer k ≥ 2, a
k-way cut is a partition {V1 , V2 , . . . , Vk } of V consisting of k non-empty subsets.
The problem of partitioning V into k non-empty subsets so as to minimize the
weight sum of the edges between different subsets is called the minimum k-way
cut problem. The problem has several important applications such as cutting
planes in the traveling salesman problem [26], VLSI design [21], task allocation
in distributed computing systems [20] and network reliability. The 2-way cut
problem (i.e., the problem of computing the edge-connectivity) can be solved
by Õ(nm) time deterministic algorithms [7,22] and by Õ(n^2 ) and Õ(m) time
randomized algorithms [13,14,15]. For an unweighted planar graph, Hochbaum
and Shmoys [9] proved that the minimum 3-way cut problem can be solved in
O(n^2 ) time. However, the complexity status of the problem for general k ≥
3 in an arbitrary graph G has been open for several years. Goldschmidt and
Hochbaum proved that the problem is NP-hard if k is an input parameter [6]. In
* This research was partially supported by the Scientific Grant-in-Aid from the Ministry
of Education, Science, Sports and Culture of Japan, and the subsidy from the Inamori
Foundation.
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 377–390, 1999.
© Springer-Verlag Berlin Heidelberg 1999
378 Hiroshi Nagamochi and Toshihide Ibaraki
the same article [6], they presented an O(n^{k^2/2 − 3k/2 + 4} F (n, m)) time algorithm
for solving the minimum k-way cut problem, where F (n, m) denotes the time
required to find a minimum (s, t)-cut (i.e., a minimum 2-way cut that separates
two specified vertices s and t) in an edge-weighted graph with n vertices and
m edges, which can be obtained by applying a maximum flow algorithm. This
running time is polynomial for any fixed k. Afterwards, Karger and Stein [15]
proposed a randomized algorithm that solves the minimum k-way cut problem
with high probability in O(n^{2(k−1)} (log n)^3 ) time. For general k, a deterministic
O(n^{2k−3} m) time algorithm for the minimum k-way cut problem is claimed in
[11] (where no full proof is available).
For k = 3, Kapoor [12] and Kamidoi et al. [10] showed that the problem can
be solved in O(n^3 F (n, m)) time, which was then improved to Õ(mn^3 ) by Burlet
and Goldschmidt [1]. For k = 4, Kamidoi et al. [10] gave an O(n^4 F (n, m)) =
Õ(mn^5 ) time algorithm.
Let us call a non-empty and proper subset X of V a cut. Clearly, if we can
identify the first cut V1 in a minimum k-way cut {V1 , . . . , Vk }, then the remaining
cuts V2 , . . . , Vk can be obtained by solving the minimum (k − 1)-way cut problem
in the graph induced by V − V1 . For k = 3, Burlet and Goldschmidt [1] succeeded
in characterizing a set of O(n^2 ) cuts which contains at least one such
cut V1 . Thus, by solving O(n^2 ) minimum (k − 1)-way cut problems, a minimum
k-way cut can be computed. They showed that, given a minimum 2-way cut
{X, V − X} in G, such V1 is a cut whose weight is less than 4/3 of the weight
of a minimum 2-way cut in G or in the induced subgraphs G[X] and G[V − X].
Since it is known [24] that there are O(n2 ) cuts with weight less than 4/3 of the
weight of a minimum 2-way cut, and all those cuts can be enumerated in Õ(mn3 )
time, their approach yields an Õ(mn3 ) time minimum 3-way cut algorithm.
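For very small instances, the reduction just described can be checked against a brute-force baseline that tries every labelling of the vertices. This is exponential in n and purely illustrative; it is not the algorithm of this paper.

```python
from itertools import product

def min_kway_cut(edges, verts, k):
    """Brute-force minimum k-way cut of an edge-weighted undirected graph.

    edges: iterable of (u, v, weight); verts: iterable of vertices.
    Tries all k^n label assignments and keeps the cheapest one in which
    every one of the k parts is non-empty.
    """
    verts = list(verts)
    best = None
    for labels in product(range(k), repeat=len(verts)):
        if len(set(labels)) < k:          # some part would be empty
            continue
        part = dict(zip(verts, labels))
        w = sum(wt for u, v, wt in edges if part[u] != part[v])
        if best is None or w < best:
            best = w
    return best

# Unit-weight 4-cycle: min 2-way cut weight 2, min 3-way cut weight 3.
c4 = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0)]
assert min_kway_cut(c4, range(4), 2) == 2.0
assert min_kway_cut(c4, range(4), 3) == 3.0
```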
In this paper, we consider the minimum k-way cut problem for k = 3, 4, and
give a new characterization of a set of O(n) candidate cuts for the
first cut V1 , based on the submodularity of cut functions. We also show that
these O(n) cuts can be obtained in O(n^2 F (n, m)) time by using Vazirani and
Yannakakis's algorithm for enumerating small cuts [28]. Therefore, we can find
a minimum 3-way cut in O(n^2 F (n, m) + nC2 (n, m)) = O(mn^3 log(n^2 /m)) time
and a minimum 4-way cut in O(n^3 F (n, m) + n^2 C2 (n, m)) = O(mn^4 log(n^2 /m))
time, where C2 (n, m) denotes the time required to find a minimum 2-way cut
in an edge-weighted graph with n vertices and m edges. The bound for k = 3
matches the current best deterministic bound Õ(mn^3 ) for weighted graphs, but
improves the bound Õ(mn^3 ) to O(min{mn^{8/3} , m^{3/2} n^2 }) for unweighted graphs
(since F (n, m) = O(min{mn^{2/3} , m^{3/2} }) is known for unweighted graphs [2,12]).
The bound Õ(mn^4 ) for k = 4 improves the previous best randomized bound
Õ(n^6 ) (for m = o(n^2 )). In the case of an edge-weighted planar graph G, we also
show that the algorithm can be implemented to run in O(n^3 ) time for k = 3
and in O(n^4 ) time for k = 4, respectively. The algorithm is then generalized to
the problem of finding a minimum 3-way cut in a symmetric submodular system.
In the next section, we review some basic results of cuts and symmetric
submodular functions. In section 3, we present a new algorithm for computing
A Fast Algorithm for Computing Minimum 3-Way and 4-Way Cuts 379
2 Preliminaries
A set function f on a finite set V is called symmetric if f(X) = f(V − X) holds for
all X ⊆ V, and submodular if f(X) + f(Y) ≥ f(X ∩ Y) + f(X ∪ Y) holds for all
X, Y ⊆ V; a pair (V, f) of a finite set V and a symmetric submodular function f
is called a system.
Given a system (V, f ), we define the minimum k-way cut problem as follows.
A k-way cut is a partition π = {V1 , V2 , . . . , Vk } of V consisting of k non-empty
subsets. The weight ωf(π) of a k-way cut π is defined by

    ωf(π) = (1/2)(f(V1) + f(V2) + · · · + f(Vk)).          (2.4)
A k-way cut is called minimum if it has the minimum weight among all k-way
cuts in (V, f ).
Let G = (V, E) be an undirected graph with a set V of vertices and a set
E of edges weighted by non-negative reals. For a non-empty subset X ⊆ V , let
G[X] denote the graph induced from G by X. For a subset X ⊂ V , its cut value,
denoted by c(X), is defined to be the sum of weights of edges between X and
V − X, where c(∅) and c(V ) are defined to be 0. This set function c on V is
called the cut function of G. The cut function c is symmetric and submodular,
as easily verified. For a k-way cut π = {V1 , V2 , . . . , Vk } of V , its weight in G is
defined by
    ωc(π) = (1/2)(c(V1) + c(V2) + · · · + c(Vk)),

which equals the total weight of the edges joining different parts of π.
In [28], Vazirani and Yannakakis presented an algorithm that finds all the 2-way
cuts in G in the order of non-decreasing weights. The algorithm finds the next
2-way cut by solving at most 2n − 3 maximum flow problems. We describe this
result in a slightly more general way in terms of set functions. This algorithm
will be used in Section 3 to obtain minimum 3-way and 4-way cuts.
Theorem 2. [28] For a system (V, f) with n = |V|, the 2-way cuts in (V, f) can
be enumerated in the order of non-decreasing weights with O(n Ff) time delay
between two consecutive outputs, where Ff is the time required to find a minimum
2-way cut that separates two specified disjoint subsets S, T ⊂ V in (V, f).
    Sµ ⊆ X ⊆ V − Tµ
(Note that there is at most one vector µ^n_p in which all entries are 1.)
Given a p-dimensional vector µ, where 2 ≤ p ≤ n, we can find a minimum
2-way cut {Y, V − Y} over all the 2-way cuts in C(µ) in Ff time by applying
procedure A(Sµ, Tµ) if Tµ ≠ ∅, i.e., µ contains at least one 0-valued entry (recall
that v1 ∈ Sµ); otherwise (if Tµ = ∅, i.e., µ(i) = 1 for all i ∈ {1, . . . , p}), such a
minimum 2-way cut {Y, V − Y} can be found by applying procedure A(Sµ, T)
at most (n − 2) times, choosing T = {vi} for each vi ∈ V − Sµ. Let µ∗(µ)
denote the n-dimensional {0, 1}-vector that represents the minimum 2-way cut
{Y, V − Y} ∈ C(µ) obtained by this procedure, and let f(µ) be its weight.
With these notations, an algorithm for enumerating all 2-way cuts in the
order of non-decreasing weights can now be described. Initially we compute
µ∗ (µ1 ) and f(µ1 ) for the 1-dimensional vector µ1 with µ1 (1) = 1. Let Q := {µ1 }.
Then we can enumerate 2-way cuts one after another in the non-decreasing order
by repeatedly executing the next procedure B.
Procedure B
  Choose a vector α ∈ Q with the minimum weight f(α); µ^n := µ∗(α);
  Output µ^n; Q := Q − {α};
  Let a be the dimension of α, and for each β = µ^n_p, p = a + 1, a + 2, . . . , n,
  compute µ∗(β) and f(β), and set Q := Q ∪ {β}.
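The branching scheme above can be illustrated in code. The following is a minimal Python sketch of our own; a brute-force subroutine stands in for procedure A (which the actual algorithm implements with maximum-flow computations), and all helper names are ours. Vectors µ are tuples fixing the sides of v1, . . . , vp, with v1 always on the 1-side.

```python
import heapq
from itertools import combinations

def cut_weight(adj, X):
    """Weight of the 2-way cut {X, V - X}; adj maps edges (u, v) to weights."""
    return sum(w for (u, v), w in adj.items() if (u in X) != (v in X))

def min_cut_A(adj, V, S, T):
    """Procedure A: a minimum 2-way cut {X, V - X} with S ⊆ X ⊆ V - T.
    Brute force here; a max-flow computation in the real algorithm."""
    free = [v for v in V if v not in S and v not in T]
    best = None
    for r in range(len(free) + 1):
        for extra in combinations(free, r):
            X = frozenset(S) | frozenset(extra)
            if X == V:
                continue
            cand = (cut_weight(adj, X), X)
            if best is None or cand[0] < best[0]:
                best = cand
    return best

def enumerate_2way_cuts(adj, vertices):
    """Yield (weight, X) for all 2-way cuts in non-decreasing order of
    weight, following the branching of Procedure B."""
    V = frozenset(vertices)
    order = sorted(vertices)            # v1, v2, ..., vn
    n = len(order)

    def best_in_class(mu):
        S = {order[i] for i in range(len(mu)) if mu[i] == 1}
        T = {order[i] for i in range(len(mu)) if mu[i] == 0}
        if T:
            return min_cut_A(adj, V, S, T)
        if S == V:                      # the class contains no proper cut
            return None
        # all entries of mu are 1: try T = {v} for every remaining vertex
        return min(min_cut_A(adj, V, S, {v}) for v in V - S)

    Q = []
    w, X = best_in_class((1,))
    heapq.heappush(Q, (w, (1,), X))
    while Q:
        w, alpha, X = heapq.heappop(Q)
        yield w, set(X)
        full = tuple(1 if order[i] in X else 0 for i in range(n))
        for p in range(len(alpha) + 1, n + 1):   # children: flip entry p
            beta = full[:p - 1] + (1 - full[p - 1],)
            b = best_in_class(beta)
            if b is not None:
                heapq.heappush(Q, (b[0], beta, b[1]))
```

On a weighted triangle this yields the 2^{n−1} − 1 = 3 cuts in sorted order, since the child classes generated from each output partition the remaining cuts.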
Proof. It is known [17] that s, t-paths can be enumerated in the order of non-
decreasing lengths with O(S(n, m)) time delay between two consecutive outputs.
Based on this, we enumerate cycles in G in the order of non-decreasing lengths
as follows. Let E = {e1 = (s1 , t1 ), e2 = (s2 , t2 ), . . . , em = (sm , tm )}. Then the
set S of all cycles in G can be partitioned into m sets Si , i = 1, . . . , m, of cycles,
where Si is the set of cycles C ⊆ E such that ei ∈ C ⊆ E − {e1 , . . . , ei−1 }. Also,
a cycle C ∈ Si is obtained by combining an si , ti -path P ⊆ E − {e1 , . . . , ei−1 }
and the edge ei = (si , ti ) and vice versa. The shortest cycle in G can be obtained
as a cycle with the minimum length among Pi∗ ∪ {ei }, i = 1, . . . , m, where Pi∗ is
the shortest si , ti -path in the graph (V, E − {e1 , . . . , ei−1 }). If Pj∗ ∪ {ej } is chosen
as a cycle of the minimum length, then we compute the next shortest sj , tj -path
P in the graph (V, E − {e1 , . . . , ej−1 }), and update Pj∗ by this P . Then the
second shortest cycle in G can be obtained as a cycle with the minimum length among
Pi∗ ∪ {ei }, i = 1, . . . , m. Thus, by updating Pj∗ by the next shortest sj , tj -path
P after Pj∗ ∪ {ej } is chosen, we can repeatedly enumerate cycles in the order of
non-decreasing weights, with O(S(n, m)) time delay. ⊓⊔
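The partition idea in the proof is easy to exercise for the first (shortest) cycle. A small Python sketch of ours, with Dijkstra as the shortest-path subroutine (names and data layout are our assumptions, not the paper's):

```python
import heapq

def shortest_path(adj, s, t, banned):
    """Dijkstra: length of a shortest s-t path avoiding edges in `banned`."""
    dist = {s: 0}
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == t:
            return d
        if d > dist.get(u, float('inf')):
            continue
        for v, w, idx in adj[u]:
            if idx in banned:
                continue
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return float('inf')

def shortest_cycle(n, edges):
    """Shortest cycle as the minimum over i of w(e_i) plus a shortest
    s_i-t_i path in (V, E - {e_1, ..., e_i}); realizes the partition
    of all cycles into the classes S_1, ..., S_m."""
    adj = [[] for _ in range(n)]
    for idx, (u, v, w) in enumerate(edges):
        adj[u].append((v, w, idx))
        adj[v].append((u, w, idx))
    best = float('inf')
    for i, (u, v, w) in enumerate(edges):
        # ban e_1..e_{i-1}, and e_i itself (a simple cycle uses e_i once)
        best = min(best, w + shortest_path(adj, u, v, set(range(i + 1))))
    return best
```

Repeating the per-class update with the next shortest s_j, t_j-path, as in the proof, extends this from the shortest cycle to an enumeration in non-decreasing order.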
be the first r smallest 2-way cuts in the order of non-decreasing weights, where
r is the first integer such that Xr crosses some Xq (1 ≤ q < r). Let us denote
Y1 = Xq − Xr , Y2 = Xq ∩ Xr , Y3 = Xr − Xq , and Y4 = V − (Xq ∪ Xr ). Then:
(i) There is a minimum 3-way cut {V1 , V2 , V3 } of (V, f ) such that V1 = Xi or
V1 = V − Xi for some i ∈ {1, 2, . . . , r − 1}, or {V1 , V2 , V3 } = {Yj , Yj+1 , V −
(Yj ∪ Yj+1 )} for some j ∈ {1, 2, 3, 4} (where Yj+1 = Y1 for j = 4).
(ii) There is a minimum 4-way cut {V1 , . . . , V4 } of (V, f ) such that V1 = Xi or
V1 = V − Xi for some i ∈ {1, 2, . . . , r − 1}, or {V1 , . . . , V4 } = {Y1 , Y2 , Y3 , Y4 }.
Assume that f(Xh − Xj ) ≤ f(Xr ) holds for {h, j} = {q, r}. Similarly, we obtain
min{f(Xq ∩ Xr ), f(Xq ∪ Xr )} ≤ f(Xr ) by (2.1). Thus,
This implies that {Yj , Yj+1 , V − (Yj ∪ Yj+1 )} is also a minimum 3-way cut.
(ii) Let a minimum 4-way cut in G be denoted by {V1 , V2 , V3 , V4 } with f(V1 ) ≤
min{f(V2 ), f(V3 ), f(V4 )}. Assume that f(V1 ) ≥ f(Xr ) holds, since otherwise (i.e.,
f(V1 ) < f(Xr )), we are done. Thus, 2f(Xr ) ≤ 2f(V1 ) ≤ ωf ({V1 , V2 , V3 , V4 }).
From inequalities (2.1) - (2.3), we then obtain
    2f(Xr) ≥ f(Xq) + f(Xr) ≥ (1/2)[f(Xq − Xr) + f(Xr − Xq) + f(Xq ∩ Xr) + f(V − (Xq ∪ Xr))].
Therefore, we have (1/2)[f(Xq − Xr) + f(Xr − Xq) + f(Xq ∩ Xr) + f(V − (Xq ∪ Xr))] ≤
f(Xq) + f(Xr) ≤ 2f(Xr) ≤ ωf({V1, V2, V3, V4}), indicating that {Xq − Xr, Xr −
Xq, Xq ∩ Xr, V − (Xq ∪ Xr)} is also a minimum 4-way cut in (V, f). ⊓⊔
Now we are ready to describe a new algorithm for computing minimum 3-way
and 4-way cuts in an edge-weighted graph G. In the algorithm, minimum 2-way
cuts are stored in X in non-decreasing order until X becomes crossing,
and O(n) k-way cuts are stored in C, from which a k-way cut
of minimum weight is chosen.
MULTIWAY
Input: An edge-weighted graph G = (V, E) with |V | ≥ 4 and an integer k ∈ {3, 4}.
Output: A minimum k-way cut π.
1 X := C := ∅; i := 1;
2 while X is non-crossing do
3 Find the i-th minimum 2-way cut {Xi , V − Xi } in G;
4 X := X ∪ {Xi };
5   if |V − Xi| ≥ k − 1 then find a minimum (k − 1)-way cut {Z1, . . . , Zk−1}
      in the induced subgraph G[V − Xi], and add the k-way cut
      {Xi, Z1, . . . , Zk−1} to C;
6   if |Xi| ≥ k − 1 then find a minimum (k − 1)-way cut {Z'1, . . . , Z'k−1}
      in the induced subgraph G[Xi], and add the k-way cut
      {Z'1, . . . , Z'k−1, V − Xi} to C;
7 i := i + 1
8 end; /* while */
9 Let Xr be the last cut added to X , and choose a cut Xq ∈ X that crosses
Xr , where we denote Y1 = Xq − Xr , Y2 = Xq ∩ Xr , Y3 = Xr − Xq , and
Y4 = V − (Xq ∪ Xr );
10 if k = 3 then add to C 3-way cuts {Yj , Yj+1 , V − (Yj ∪ Yj+1 )},
j ∈ {1, 2, 3, 4} (where Yj+1 = Y1 for j = 4);
11 if k = 4 then add 4-way cut {Y1 , Y2 , Y3 , Y4 } to C;
12 Output a k-way cut π in C with the minimum weight.
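The listing above can be turned into a runnable sketch for k = 3. In the version below, brute-force subroutines stand in for the enumeration of Theorem 2 and for the minimum 2-way cut computation; all helper names are ours.

```python
from itertools import combinations, product

def cut_value(adj, X):
    return sum(w for (u, v), w in adj.items() if (u in X) != (v in X))

def kway_weight(adj, parts):
    return sum(cut_value(adj, p) for p in parts) / 2

def min_kway_brute(adj, V, k):
    """Minimum k-way cut by exhaustive labelling (for the subproblems)."""
    Vl, best = sorted(V), None
    for rest in product(range(k), repeat=len(Vl) - 1):
        labels = (0,) + rest                     # fix the first vertex's part
        parts = [frozenset(v for v, l in zip(Vl, labels) if l == j)
                 for j in range(k)]
        if any(not p for p in parts):
            continue
        w = kway_weight(adj, parts)
        if best is None or w < best[0]:
            best = (w, parts)
    return best

def two_way_cuts_sorted(adj, V):
    """All 2-way cuts sorted by weight (stands in for the enumeration)."""
    v0, rest = min(V), sorted(v for v in V if v != min(V))
    cuts = [(cut_value(adj, X), X)
            for r in range(len(rest))
            for extra in combinations(rest, r)
            for X in [frozenset({v0} | set(extra))]]
    return sorted(cuts, key=lambda t: t[0])

def crosses(X, Y, V):
    return all([X - Y, Y - X, X & Y, V - (X | Y)])

def multiway3(adj, V):
    """Sketch of algorithm MULTIWAY for k = 3."""
    V = frozenset(V)
    C, Xlist = [], []
    for _, Xi in two_way_cuts_sorted(adj, V):
        Xq = next((Y for Y in Xlist if crosses(Y, Xi, V)), None)
        if Xq is not None:                      # steps 9-10: crossing pair
            Y = [Xq - Xi, Xq & Xi, Xi - Xq, V - (Xq | Xi)]
            for j in range(4):
                a, b = Y[j], Y[(j + 1) % 4]
                C.append([a, b, V - (a | b)])
            break
        Xlist.append(Xi)
        for side in (Xi, V - Xi):               # steps 5-6
            if len(V - side) >= 2:
                sub = {e: w for e, w in adj.items() if set(e) <= V - side}
                _, parts = min_kway_brute(sub, V - side, 2)
                C.append([side] + parts)
    return min(C, key=lambda parts: kway_weight(adj, parts))
```

On a small 4-vertex instance the weight returned agrees with exhaustive search over all 3-way cuts, as the characterization theorem above guarantees.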
time in total. The total time required to check whether X is crossing or not is
O(rn^2) = O(n^3). By Theorem 2, the r minimum 2-way cuts can be enumerated in
non-decreasing order in O(r n F(n, m)) = O(n^2 F(n, m)) time, where F(n, m)
denotes the time required to find a minimum (s, t)-cut (i.e., a minimum 2-way
cut that separates two specified vertices s and t) in an edge-weighted graph with
n vertices and m edges. Summing up these, we have

    C3(n, m) = O(mn^3 log(n^2/m)) or O(min{n^{2/3}, m^{1/2}} mn log(n^2/m) log U),
    C4(n, m) = O(mn^4 log(n^2/m)) or O(min{n^{2/3}, m^{1/2}} mn^2 log(n^2/m) log U).
For unweighted graphs, we have C3(n, m) = O(min{mn^{2/3}, m^{3/2}} n) and
C4(n, m) = O(min{mn^{2/3}, m^{3/2}} n^2), since F(n, m) = O(min{mn^{2/3}, m^{3/2}}) is
known for unweighted graphs [2,12]. For a planar graph G, we can obtain better
bounds by applying Corollary 4.
where w(e) is the weight of edge e. It is easy to observe that the cut function
c in a hypergraph H is also symmetric and submodular. We define the weight
ωc (π) of a k-way cut {V1 , . . . , Vk } also by (2.4).
Remark: The weight ωc(π) of a minimum 2-way cut π is equal to the minimum
weight sum of edges whose removal makes the hypergraph disconnected. This is
the same as in the case of the cut function of a graph. For k ≥ 3, however, ωc(π)
of a minimum k-way cut π = {V1, . . . , Vk} may not equal the minimum weight
sum γH of edges whose removal generates at least k components, where

    · · · + (3/2) Σ {w(e) | e ∈ E, |{i | Vi ∩ V(e) ≠ ∅}| = 3}.
388 Hiroshi Nagamochi and Toshihide Ibaraki
where for each hyper-edge e ∈ E', the set V'(e) of its end vertices and its weight
w'(e) are redefined by V'(e) = V(e) − V1 and

    w'(e) = w(e)     if V'(e) = V(e) (i.e., V(e) ∩ V1 = ∅),
    w'(e) = w(e)/2   if V'(e) ⊂ V(e) (i.e., V(e) ∩ V1 ≠ ∅).

Hence, for a 2-way cut π' = {V2, V3} in H[V1] = (V', E'), its weight
ωc'(π') = Σ{w'(e) | e ∈ E', V'(e) ∩ V2 ≠ ∅ ≠ V'(e) ∩ V3} (where c' denotes the cut function
in H[V1]) is equal to Σ{w(e) | e ∈ E', V(e) ∩ V2 ≠ ∅ ≠ V(e) ∩ V3, V(e) ∩ V1 =
∅} + Σ{w(e)/2 | e ∈ E', V'(e) ∩ V2 ≠ ∅, V'(e) ∩ V3 ≠ ∅, V(e) ∩ V1 ≠ ∅}.
Thus, c(V1) + ωc'(π') = ωc(π) holds for a 3-way cut π = {V1, V2, V3} and the cut
function c in H. Therefore, for a fixed cut V1, we can find a minimum 3-way cut
{V1, V2, V3} in H by computing a minimum 2-way cut {V2, V3} in H[V1]. It is
known that a minimum 2-way cut in a hypergraph H = (V, E) can be obtained
in O(|V| dH + |V|^2 log |V|) time [18]. Since we need to solve O(n)
minimum 2-way cut problems in our algorithm, the time complexity becomes
O(|V|^2 F(|V| + |E|, dH) + |V|^2 (dH + |V| log |V|)) = Õ(|V|^2 |E| dH), assuming |E| ≥
|V|.
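The halving rule and the identity c(V1) + ωc'(π') = ωc(π) are easy to check numerically. A small sketch of ours (the toy hypergraph is our own example):

```python
def hyper_cut(edges, X):
    """c(X): total weight of hyperedges with endpoints inside and outside X."""
    return sum(w for ends, w in edges if ends & X and ends - X)

def omega(edges, parts):
    """Weight of a k-way cut by formula (2.4)."""
    return sum(hyper_cut(edges, p) for p in parts) / 2

def contract(edges, V1):
    """Build H[V1]: delete V1 from every hyperedge; an edge that met V1
    keeps half its weight (the w(e)/2 rule); emptied edges are dropped."""
    out = []
    for ends, w in edges:
        rest = ends - V1
        if rest:
            out.append((frozenset(rest), w / 2 if ends & V1 else w))
    return out
```

For instance, on the hypergraph with edges {1,2,3} (weight 2), {3,4} (weight 1), {1,4} (weight 4) and the 3-way cut {{1}, {2,3}, {4}}, both sides of the identity evaluate to 7.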
5 Concluding Remark
Acknowledgments
This research was partially supported by a Scientific Grant-in-Aid from the
Ministry of Education, Science, Sports and Culture of Japan, and by a subsidy
from the Inamori Foundation. We would like to thank Professor Naoki Katoh
and Professor Satoru Iwata for their valuable comments.
References
1. M. Burlet and O. Goldschmidt, A new and improved algorithm for the 3-cut problem,
Operations Research Letters, vol.21, (1997), pp. 225–227.
2. S. Even and R. E. Tarjan, Network flow and testing graph connectivity, SIAM J.
Computing, vol.4, (1975), pp. 507–518.
3. A. V. Goldberg and S. Rao, Beyond the flow decomposition barrier, Proc. 38th IEEE
Annual Symp. on Foundations of Computer Science, (1997), pp. 2–11.
4. A. V. Goldberg and R. E. Tarjan, A new approach to the maximum flow problem,
J. ACM, vol.35, (1988), pp. 921–940.
5. M. Grötschel, L. Lovász and A. Schrijver, Geometric Algorithms and Combinatorial
Optimization, Springer, Berlin (1988).
6. O. Goldschmidt and D. S. Hochbaum, A polynomial algorithm for the k-cut problem
for fixed k, Mathematics of Operations Research, vol.19, (1994), pp. 24–37.
7. J. Hao and J. Orlin, A faster algorithm for finding the minimum cut in a directed
graph, J. Algorithms, vol. 17, (1994), pp. 424–446.
1 Introduction
We consider one of the most basic scheduling problems: scheduling parallel ma-
chines when jobs arrive over time with the objective of minimizing the makespan.
This problem is formulated as follows: There are m machines and n jobs. Each
job has a release time rj and a processing time pj . An algorithm must assign
each job to a machine and fix a start time. No machine can run more than one
job at a time and no job may start prior to being released. For job j let sj be
the start time in the algorithm’s schedule. We define the completion time for
job j to be cj = sj + pj . The makespan is maxj cj . The algorithm’s goal is to
minimize the makespan.
We study an online version of this problem, where jobs are completely un-
known until their release times. In contrast, in the offline version all jobs are
known in advance. Since it is in general impossible to solve the problem optimally
online, we consider algorithms which approximate the best possible solution.
Competitive analysis is a type of worst case analysis where the performance
of an online algorithm is compared to that of the optimal offline algorithm. This
approach to analyzing online problems was initiated by Sleator and Tarjan, who
used it to analyze the List Update problem [17]. The term competitive analysis
originated in [12]. For a given job set σ, let costA (σ) be the cost incurred by
an algorithm A on σ. Let cost(σ) be the cost of the optimal schedule for σ. A
scheduling algorithm A is ρ-competitive if costA(σ) ≤ ρ · cost(σ)
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 391–399, 1999.
© Springer-Verlag Berlin Heidelberg 1999
392 John Noga and Steve Seiden
for all job sets σ. The competitive ratio of A is the infimum of the set of values
ρ for which A is ρ-competitive. Our goal is to find the algorithm with smallest
possible competitive ratio.
A great deal of work has been done on parallel machine scheduling when
jobs are presented in a list and must be immediately scheduled prior to any
processing [1, 2, 3, 4, 5, 7, 8, 9, 11, 14, 15]. On the other hand, very little
work has been done on the version which we address here, despite the fact
that this version of the problem seems more realistic. The known results are
as follows: Hall and Shmoys show that the List algorithm of Graham is 2-
competitive for all m [9, 10]. In [16], Shmoys, Wein and Williamson present a
general online algorithm for scheduling with release dates. This algorithm uses as
a subroutine an offline algorithm for the given problem. If the offline algorithm is
an α-approximation, then the online algorithm is 2α-competitive. They also show
a lower bound of 10/9 on the competitive ratio for online scheduling of parallel
machines with release times. The LPT algorithm starts the job with largest
processing time whenever a machine becomes available. Chen and Vestjens show
that LPT is 3/2-competitive for all m [6]. These same authors also show a lower
bound of 3 − ϕ ≈ 1.38197 for m = 2 and a general lower bound of 1.34729, where
ϕ = (1 + √5)/2 ≈ 1.61803 is the golden ratio. Stougie and Vestjens [18] show a
randomized lower bound of 4 − 2√2 ≈ 1.17157 which is valid for all m.
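For concreteness, online LPT with release times can be simulated in a few lines. A Python sketch of ours (names and data layout are our assumptions):

```python
import heapq

def lpt_makespan(m, jobs):
    """Online LPT: whenever a machine becomes available, start the available
    job with the largest processing time; if none is available, the machine
    waits for the next release. jobs: list of (release_time, processing_time)."""
    jobs = sorted(jobs)                 # by release time
    free = [0.0] * m                    # machine ready times (min-heap)
    heapq.heapify(free)
    avail, i, makespan = [], 0, 0.0     # avail: max-heap of -processing_time
    while i < len(jobs) or avail:
        t = heapq.heappop(free)         # earliest available machine
        if not avail:                   # machine waits for the next release
            t = max(t, jobs[i][0])
        while i < len(jobs) and jobs[i][0] <= t:
            heapq.heappush(avail, -jobs[i][1])
            i += 1
        p = -heapq.heappop(avail)       # largest available job
        makespan = max(makespan, t + p)
        heapq.heappush(free, t + p)
    return makespan
```

On Graham's classical instance with m = 2 and processing times 3, 3, 2, 2, 2 all released at time 0, LPT produces makespan 7 against an optimum of 6.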
We present an algorithm for scheduling two parallel machines, called Sleepy,
and show that it is (3−ϕ)-competitive, and thus has the best possible competitive
ratio. We also show a randomized lower bound of
    1 + max_{0≤u<1} u/(1 − 2 ln(1 − u)) > 1.21207.
there are no available jobs we say that the machine is waiting. If a machine is
waiting we call the next job released tardy.
We claim that Sleepy is (1+α)-competitive. By considering two jobs of size 1
released at time 0 we see that Sleepy is no better than (1 + α)-competitive.
3 Analysis
We will show, by contradiction, that no set of jobs with optimal offline makespan
1 causes Sleepy to have a makespan more than 1 + α. This suffices, since we
can rescale any nonempty job set to have optimal offline makespan 1.
Without loss of generality, we identify the jobs with the integers 1, . . . , n so that
c1 ≤ c2 ≤ · · · ≤ cn. Assume that ϱ = {(pi, ri)}_{i=1}^n is a minimum size set of
jobs with optimal offline makespan 1 which causes Sleepy to have a makespan
cn > 1 + α. Let A be the machine which runs job n and B be the other machine.
The remainder of the analysis consists of verifying a number of claims. Intu-
itively, what we want to show is: 1) the last three jobs to be released are the last
three jobs completed by Sleepy, 2) two of the last three jobs have processing
time greater than α and the third has processing time greater than 1 − α, and
3) replacing the last three jobs with one job results in a smaller counterexample.
Claim (Chen and Vestjens [6]). Without loss of generality, there is no time
period during which Sleepy has both machines idle.
Proof. Assume both machines are idle during (t1 , t2 ) and that there is no time
later than t2 when both machines are idle. If there is a job with release time prior
to t2 then removing this job results in a set with fewer jobs, the same optimal
offline cost, and the same cost for Sleepy. If no job is released before t2 then
ϱ' = {(pi/(1 − t2), (ri − t2)/(1 − t2))}_{i=1}^n is a job set with optimal makespan 1,
greater cost for Sleepy, and no time period when Sleepy has both machines
idle. Note ϱ' is simply ϱ with all release times decreased by t2 and all jobs rescaled
so that the optimal cost remains 1.
Proof. Since there is no period when Sleepy has both machines idle, the last
job on machine B must complete at or after sn and so cn−1 ≥ sn .
Let j be the last tardy job and k be the job running when the last tardy
job is released. If there are no tardy jobs let rj = ck = 0. At time rj the
optimal algorithm may or may not have completed all jobs released before rj .
Let p (possibly 0) be the amount of processing remaining to be completed by
the optimal algorithm at time rj on jobs released before rj .
Since ϱ is a minimum size counterexample, ck ≤ (1 + α)(rj + p). Note that
between time rj and cn the processing done by Sleepy is exactly the processing
done by the optimal algorithm plus ck − rj − p.
It follows from the previous claim that there is a set of jobs σ1 , . . . , σ` such
that sσ1 ≤ ck < cσ1 , sσi ≤ sσi+1 , and cσ` = cn−1 .
Consider the time period (ck , cσ1 ). During this period neither machine is
waiting. So, at least (2 − α)(cσ1 − ck ) processing is completed by Sleepy. Sim-
ilarly, during the time period (cσi , cσi+1 ) at least (2 − α)(cσi+1 − cσi ) processing
is completed by Sleepy for i = 1, 2, . . . , ` − 1. Therefore, during the period
(ck , cn−1 ), Sleepy completes at least (2 − α)(cn−1 − ck ) processing.
Since the optimal algorithm can process at most 2(1 − rj ) after rj ,
Therefore,

    (1 − α)(cn−1 − ck) ≤ 2 − rj − p − cn,

and further

    cn−1 ≤ (2 − rj − p − cn)/(1 − α) + ck
        = (2 − α)(2 − rj − p − cn) + ck
        < (2 − α)(1 − α − rj − p) + ck
        = 1 − (2 − α)(rj + p) + ck
        ≤ 1 + (2α − 1)(rj + p)
        ≤ 1.
Claim. cn−2 = sn .
Proof. Since sn > sn−1 + α · pn−1, either cn−2 = sn or machine A is idle im-
mediately before time sn. In the latter case, we have sn = rn and therefore
cn = sn + pn = rn + pn ≤ 1, which is a contradiction.
Proof. It is easily seen that pn−1 ≥ cn−2 − sn−1 . If cn−2 − sn−1 ≤ α then
pn−1 = (cn−1 − cn−2 ) + (cn−2 − sn−1 ) ≤ (cn−1 − cn−2 ) + α < pn . Therefore
pn−1 − cn−1 + cn−2 ≤ α and further rn ≥ sn−1 . This means cn = sn−1 + pn−1 +
pn − (cn−1 − cn−2 ) ≤ rn + pn + α ≤ 1 + α, a contradiction.
Scheduling Two Machines with Release Times 395
    pj ≥ cn−2 − sj
       ≥ (cn−2 − sk)/(1 − α)
       = (2 − α)(cn−2 − max{sn−1, sn−2})
       = (2 − α) min{cn−2 − sn−1, pn−2}
       > (2 − α)α
       = 1 − α.
Claim. sj + α · pj = sk .
Proof. No job other than n, n − 1, and n − 2 requires more than α processing
time. Note that jobs k and n must go on the same machine in the optimal
schedule. This means either k or n must be released before time 1 − 2α. Suppose
job i has si ≤ sj and ci ≥ sj + α · pj . Then si + α · pi ≤ sj and pi ≥ α · pi +
α · pj ≥ α · pi + α(1 − α). This implies that pi ≥ α, a contradiction. Therefore,
any job that starts before job j must complete before time sj + α · pj. Since
α · pj > α(1 − α) = 1 − 2α, we have sk = sj + α · pj .
Claim. min{rn , rn−1 , rn−2 } > maxi≤n−3 {si }.
Proof. Since jobs n, n − 1, and n − 2 are larger than all other jobs and start later,
they must have been released after maxi≤n−3 {si }.
Claim. pn + pk − pj ≥ 0.
Proof. We have
pn + pk − pj = pn + ck − sk − (cj − sj )
= p n + ck − cj − α · p j
≥ p n + ck − cj − α
= cn − cn−1 + cn−1 − cn−2 + ck − cj − α
> cn−1 − cn−2 + ck − cj
≥ 0.
Proof. Since we can assume that on each of the machines individually the optimal
algorithm orders its jobs by release times, this set can be processed identically
to ϱ for the first n − 3 jobs. The new job (pn + pk − pj, min{rn, rn−1, rn−2}) runs
on the machine which would have run k. This schedule has cost no more than
1 − pj .
Proof. The first n − 3 jobs are served identically. The final job starts at time sj .
The makespan is at least sj +pn +pk −pj = sk +pk +pn −(1+α)pj ≥ cn −(1+α)pj .
First we show a lower bound which improves on the previous result of Stougie
and Vestjens [18, 19]:
Theorem 2. No randomized algorithm for m ≥ 2 machines is ρ-competitive
with ρ less than
    1 + max_{0≤u<1} u/(1 − 2 ln(1 − u)) > 1.21207.
Proof. We make use of the von Neumann/Yao principle [21, 20]. Let 0 ≤ u < 1 be
a real constant. We show a distribution over job sets such that the expectation
of the competitive ratio of all deterministic algorithms is at least 1 + u/(1 −
2 ln(1 − u)). Let
    q = 1/(1 − 2 ln(1 − u)),    p(y) = 1/((y − 1) ln(1 − u)).
We now define the distribution. In all cases, we give two jobs of size 1 at
time 0. With probability q, we give no further jobs. With probability 1 − q, we
pick y at random from the interval [0, u] using the probability density p(y), and
release m − 1 jobs of size 2 − y at time y. Note that q ≥ 0 and that ∫_0^u p(y) dy = 1,
and therefore this is a valid distribution.
We now analyze the expected competitive ratio incurred by a deterministic
online algorithm on this distribution. Let x be the time at which the algorithm
would start the second of the two size one jobs, given that it never receives the
jobs of size 2 − y. If the third job is not given, the algorithm’s cost is 1 + x while
the optimal offline cost is 1. If the jobs of size 2 − y are given at time y ≤ x,
then the algorithm’s cost is at least 2 and this is also the optimal offline cost.
If the jobs of size 2 − y are given at time y > x, then the algorithm’s cost is at
least 3 − y while the optimal offline cost is again 2. First consider x ≥ u. In this
case, the expected competitive ratio is
    E_σ[cost_A(σ)/cost(σ)] ≥ q(1 + x) + (1 − q) ∫_0^u p(y) dy
      ≥ (1 + u)/(1 − 2 ln(1 − u)) + [−2 ln(1 − u)/(1 − 2 ln(1 − u))] · ∫_0^u dy/((y − 1) ln(1 − u))
      = 1 + u/(1 − 2 ln(1 − u)).
Since the adversary may choose u, the theorem follows. Choosing u = 0.575854
yields a bound of at least 1.21207.
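The maximization over u is elementary to verify numerically; a quick sketch:

```python
import math

def bound(u):
    """The lower bound 1 + u / (1 - 2 ln(1 - u)) from Theorem 2."""
    return 1 + u / (1 - 2 * math.log(1 - u))

# a crude grid search locates the maximizer near u = 0.575854
best_u = max((i / 10000 for i in range(1, 10000)), key=bound)
```

Evaluating bound(0.575854) confirms the value 1.21207 quoted above.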
One natural class of randomized online algorithms is the class of barely random
algorithms [13]. A barely random online algorithm is one which is a distribution
over a constant number of deterministic algorithms. Such algorithms are often
easier to analyze than general randomized ones. We show a lower bound for any
barely random algorithm which is a distribution over two deterministic algo-
rithms:
We also conclude that t1 < 1/4. Now suppose we give the algorithm m − 1 jobs of
size 2 − t1 − δ at time t1 + δ. Then its competitive ratio is at least

    [p(3 − t1 − δ) + (1 − p) · 2]/2 = ((1 − t1 − δ)/2) p + 1 > ((1 − t1 − δ)(4t2 − 1))/(8(t2 − t1)) + 1.

Since this quantity must be less than 5/4, it follows that

    t2 < (1 − 3t1 − δ)/(2 − 4t1 − 4δ).
Since this is true for all δ > 0 we have
    t2 < (1 − 3t1)/(2 − 4t1).
If instead we give m − 1 jobs of size 2 − t2 − δ at time t2 + δ, then the competitive
ratio is at least
    (3 − t2 − δ)/2 > (5 − 9t1 − 2δ + 4t1 δ)/(4 − 8t1).
Again, this holds for all δ > 0 and so we have
    (5 − 9t1)/(4 − 8t1) < 5/4,

which is impossible for 0 ≤ t1 < 1/4.
5 Conclusions
We have shown a deterministic online algorithm for scheduling two machines
with release times with the best possible competitive ratio. Further, we have im-
proved the lower bound on the competitive ratio of randomized online algorithms
for this problem.
We feel there are a number of interesting and related open questions. First,
is there a way to generalize Sleepy to an arbitrary number of machines and
have a competitive ratio smaller than LPT? Or, an even stronger result: for
a fixed value of ε > 0, is there an algorithm which is (3/2 − ε)-competitive
for all m? Second, does randomization actually reduce the competitive ratio?
It seems like randomization should help to reduce the competitive ratio a great
deal. However, the best known randomized algorithms are actually deterministic
(Sleepy for m = 2 and LPT for m > 2). Third, how much does the competitive
ratio decrease if restarts are allowed? In many real world situations a job can be
killed and restarted later with only the loss of processing already completed.
6 Acknowledgement
This research has been supported by the START program Y43-MAT of the Aus-
trian Ministry of Science. The authors would like to thank Gerhard Woeginger
for suggesting the problem.
References
[1] Albers, S. Better bounds for online scheduling. In Proceedings of the 29th ACM
Symposium on Theory of Computing (1997), pp. 130–139.
[2] Bartal, Y., Fiat, A., Karloff, H., and Vohra, R. New algorithms for an
ancient scheduling problem. Journal of Computer and System Sciences 51, 3 (Dec
1995), 359–366.
[3] Bartal, Y., Karloff, H., and Rabani, Y. A better lower bound for on-line
scheduling. Information Processing Letters 50, 3 (May 1994), 113–116.
[4] Chen, B., van Vliet, A., and Woeginger, G. A lower bound for randomized
on-line scheduling algorithms. Information Processing Letters 51, 5 (Sep 1994),
219–222.
[5] Chen, B., van Vliet, A., and Woeginger, G. New lower and upper bounds
for on-line scheduling. Operations Research Letters 16, 4 (Nov 1994), 221–230.
[6] Chen, B., and Vestjens, A. Scheduling on identical machines: How good is
LPT in an online setting? Operations Research Letters 21, 4 (Nov 1997), 165–169.
[7] Faigle, U., Kern, W., and Turàn, G. On the performance of on-line algorithms
for partition problems. Acta Cybernetica 9, 2 (1989), 107–119.
[8] Galambos, G., and Woeginger, G. An online scheduling heuristic with better
worst case ratio than Graham’s list scheduling. SIAM Journal on Computing 22,
2 (Apr 1993), 349–355.
[9] Graham, R. L. Bounds for certain multiprocessing anomalies. Bell Systems
Technical Journal 45 (1966), 1563–1581.
[10] Hall, L., and Shmoys, D. Approximation schemes for constrained schedul-
ing problems. In Proceedings of the 30th IEEE Symposium on Foundations of
Computer Science (1989), pp. 134–139.
[11] Karger, D., Phillips, S., and Torng, E. A better algorithm for an ancient
scheduling problem. Journal of Algorithms 20, 2 (Mar 1996), 400–430.
[12] Karlin, A., Manasse, M., Rudolph, L., and Sleator, D. Competitive snoopy
caching. Algorithmica 3, 1 (1988), 79–119.
[13] Reingold, N., Westbrook, J., and Sleator, D. Randomized competitive
algorithms for the list update problem. Algorithmica 11, 1 (Jan 1994), 15–32.
[14] Seiden, S. S. Randomized algorithms for that ancient scheduling problem. In
Proceedings of the 5th Workshop on Algorithms and Data Structures (Aug 1997),
pp. 210–223.
[15] Sgall, J. A lower bound for randomized on-line multiprocessor scheduling. In-
formation Processing Letters 63, 1 (Jul 1997), 51–55.
[16] Shmoys, D., Wein, J., and Williamson, D. Scheduling parallel machines on-
line. In Proceedings of the 32nd Symposium on Foundations of Computer Science
(Oct 1991), pp. 131–140.
[17] Sleator, D., and Tarjan, R. Amortized efficiency of list update and paging
rules. Communications of the ACM 28, 2 (Feb 1985), 202–208.
[18] Stougie, L., and Vestjens, A. P. A. Randomized on-line scheduling: How low
can’t you go? Manuscript, 1997.
[19] Vestjens, A. P. A. On-line Machine Scheduling. PhD thesis, Eindhoven Uni-
versity of Technology, The Netherlands, 1997.
[20] Von Neumann, J., and Morgenstern, O. Theory of games and economic
behavior, 1st ed. Princeton University Press, 1944.
[21] Yao, A. C. Probabilistic computations: Toward a unified measure of complexity.
In Proceedings of the 18th IEEE Symposium on Foundations of Computer Science
(1977), pp. 222–227.
An Introduction to Empty Lattice Simplices
András Sebő
1 Introduction
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 400–414, 1999.
© Springer-Verlag Berlin Heidelberg 1999
can always be obviously reduced to L = Z^n. Therefore we will not care about
more general lattices.
A set of the form cone(V) := {Σ_{v∈V} λ_v v : λ_v ≥ 0} is a cone; if V is linearly
independent, the cone is called simplicial.
We refer to Schrijver [13] for basic facts about polytopes, cones and other
notions of polyhedral combinatorics, as well as for standard notations such as
the affine or linear hull of vectors, etc.
Let us call a polytope P empty if it is integer and, denoting the set
of its vertices by V, (P ∩ Z^n) \ V = ∅. (The definition is similar for an arbitrary
lattice L instead of Z^n.) Empty lattice polytopes have been studied in the past
four decades, let us mention as landmarks Reeve [11], White [15], Reznick [10],
Scarf [12] and Haase, Ziegler [5]. (However, there is no unified terminology about
them: the terms range from ‘lattice-point-free lattice polytopes’ to ‘elementary
or fundamental lattice polytopes’.) The latter paper is devoted to the volume
and the width of empty lattice simplices.
In this paper we study the structure of empty lattice simplices, the correlation
of emptiness and the width of lattice simplices, including the computational com-
plexity of both. (Note that deciding whether a (not necessarily integer) simplex
contains an integer point is trivially NP-complete, since the set of feasible solu-
tions of the knapsack problem K := {x ∈ R^n : ax = b, x ≥ 0} (a ∈ N^n, b ∈ N)
is a simplex. However, in Section 2 we will decide whether a knapsack lattice
simplex is empty in polynomial time.)
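The polynomial-time test is developed in Section 2; for intuition, emptiness of a knapsack simplex can at least be checked by brute force. A sketch of ours (exponential in general, fine for tiny instances):

```python
def knapsack_points(a, b):
    """All nonnegative integer solutions of a·x = b."""
    sols = []
    def rec(i, rem, partial):
        if i == len(a) - 1:
            if rem % a[i] == 0:
                sols.append(tuple(partial + [rem // a[i]]))
            return
        for x in range(rem // a[i] + 1):
            rec(i + 1, rem - a[i] * x, partial + [x])
    rec(0, b, [])
    return sols

def knapsack_simplex_is_empty(a, b):
    """K = {x >= 0 : a·x = b} is an empty lattice simplex iff every integer
    point of K is one of the integer vertices (b/a_i) e_i."""
    n = len(a)
    verts = {tuple(b // a[i] if j == i else 0 for j in range(n))
             for i in range(n) if b % a[i] == 0}
    return all(s in verts for s in knapsack_points(a, b))
```

For example, with a = (2, 3) the simplex is empty for b = 6 (the only solutions (3, 0) and (0, 2) are vertices) but not for b = 7 (the point (2, 1) lies in K while neither vertex is integer).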
Deciding whether a lattice simplex is empty is the simplest of a possible range
of problems that can be formulated as follows: given linearly independent integer
vectors a1, . . . , an ∈ Z^n, is there an integer vector v in the cone they generate such
that the uniquely determined coefficients λ1, . . . , λn ∈ Q_+ for which λ1 a1 + · · · +
λn an = v satisfy some given linear inequalities? The problem investigated here
corresponds to the inequality λ1 + · · · + λn ≤ 1. Another interesting variant is the
existence of an integer point where the λi (i = 1, . . . , n) satisfy some lower and
upper bounds. For instance the ‘Lonely Runner Problem’ (or ‘View Obstruction
Problem’) for a velocity vector v ∈ R^n (see [4]) can be restated as the problem
of the existence of an integer vector with 1/(n + 1) ≤ λi ≤ n/(n + 1) for all
i = 1, . . . , n in cone(e1, . . . , en, (v, d)), where the ei (i = 1, . . . , n) are unit vectors
and d is the least common multiple of all the numbers vi + vj (i, j = 1, . . . , n).
The width W(S) in R^n of a set S ⊆ R^n is the minimum of max{w^T(x − y) :
x, y ∈ S} over all vectors w ∈ Z^n \ {0}. If the rank of S is r < n, then the
width of S in R^n is 0, and it is more interesting to speak about its width in
aff(S) = lin(S), defined as the minimum of max{w^T(x − y) : x, y ∈ S} over all
vectors w ∈ Z^n not orthogonal to lin(S). For short, the width of S will mean its
width in aff(S).
If 0 ∉ V ⊆ Z^n and V is linearly independent, then define par(V) := {x ∈
Z^n : x = Σ_{v∈V} λ_v v, 1 > λ_v ≥ 0 (v ∈ V)} and call it a parallelepiped. If in
addition |V| = n, then |par(V)| = det(V), where det(V) denotes the absolute
value of the determinant of the matrix whose rows are the elements of V (see
for instance Cassels [3], or for an elementary proof see Sebő [14]). In particular,
par(V ) = {0} if and only if det(V ) = 1.
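The identity |par(V)| = det(V) is easy to verify by brute force for small instances. A sketch of ours, using exact rational arithmetic via Cramer's rule:

```python
from fractions import Fraction
from itertools import product

def det(M):
    """Integer determinant by cofactor expansion (fine for small n)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def par_points(V):
    """Lattice points of par(V) = {x = sum λ_v v : 1 > λ_v >= 0}."""
    n = len(V)
    A = [[V[j][i] for j in range(n)] for i in range(n)]   # columns are the v's
    D = det(A)
    lo = [sum(min(v[i], 0) for v in V) for i in range(n)]
    hi = [sum(max(v[i], 0) for v in V) for i in range(n)]
    pts = []
    for p in product(*(range(l, h + 1) for l, h in zip(lo, hi))):
        lams = [Fraction(det([[p[r] if c == i else A[r][c] for c in range(n)]
                              for r in range(n)]), D)
                for i in range(n)]                        # Cramer's rule
        if all(0 <= lam < 1 for lam in lams):
            pts.append(p)
    return pts
```

For V = {(2, 1), (0, 1)}, det(V) = 2 and par(V) consists of exactly the two points (0, 0) and (1, 1).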
The problem of deciding the emptiness of parallelepipeds generated by integer
vectors is therefore easy. On the other hand the same problem is still open for
simplices, and that is the topic of this paper.
If n ≤ 3, it is well-known [15] that the width of an empty integer simplex is
1 (the converse is even easier – for both directions see Corollary 4.5 below).
The following example shows that in general the width of empty integer
simplices can be large. The problem might have been overlooked before: the
simple construction below is apparently the first explicit example of empty inte-
ger simplices of arbitrarily high width. The best result known so far was Kantor's
non-constructive proof [8] of the existence of integer simplices of width n/e.
For a survey concerning previous results on the correlation of the width and the
volume of empty lattice simplices see Haase, Ziegler [5].
It is easy to construct integer simplices of width n without integer points
at all (not even the vertices), for instance by the following well-known example
[6]: for arbitrary ε ∈ IR, ε > 0, the simplex {x ∈ IRn : x1 + . . . + xn ≤ n − ε/2, xi ≥
ε/2 for all i = 1, . . . , n} has width n − ε and no integer point. The vertices of this
simplex have exactly one big coordinate. We define an integer simplex which is
‘closest possible’ to this one:
Let k ∈ IIN (the best choice will be k = n − 2). Sn (k) := conv(s0 , s1 , ..., sn ),
s0 := 0, s1 := (1, k, 0, . . . , 0), s2 := (0, 1, k, 0, . . . , 0), . . ., sn−1 := (0, . . . , 0, 1, k),
sn := (k, 0, . . . , 0, 1). Let si,j denote the j-th coordinate of si , that is, si,j = 0
if |i − j| ≥ 2, si,i = 1, si,i+1 = k, (i, j ∈ {1, . . . , n}). The notation i + 1 is
understood mod n.
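A small computational sketch (ours, not part of the paper; the function names are made up) that builds the nonzero vertices of Sn (k) and computes the determinant of the matrix with rows s1 , . . . , sn ; for this almost-circulant matrix the determinant equals 1 − (−k)n , so for odd n the parallelepiped spanned by s1 , . . . , sn contains k n + 1 lattice points.

```python
from fractions import Fraction

def simplex_vertices(n, k):
    """Nonzero vertices s_1, ..., s_n of S_n(k): s_{i,i} = 1, s_{i,i+1} = k
    with the index i+1 taken mod n; all other coordinates are 0."""
    verts = []
    for i in range(n):
        s = [0] * n
        s[i] = 1
        s[(i + 1) % n] = k
        verts.append(s)
    return verts

def det(rows):
    """Exact determinant by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in rows]
    n, sign = len(M), 1
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            sign = -sign
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    result = Fraction(sign)
    for c in range(n):
        result *= M[c][c]
    return result

# det(I + k*P) = 1 - (-k)^n for the cyclic shift P.
for n, k in [(3, 1), (4, 2), (5, 3)]:
    assert det(simplex_vertices(n, k)) == 1 - (-k) ** n
```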
2 Knapsack Simplices
If a, b ∈ ZZ, lcm(a, b) denotes the least common multiple of a and b, that is, the
smallest positive integer which is a multiple of both a and b.
In this section we state two lemmas that will be needed in the sequel. The
first collects some facts about the structure of parallelepipeds. The second is a
result of number theoretic character. The structure provided by the first raises
the problem solved by the second.
A unimodular transformation is a linear transformation defined by an in-
teger matrix whose determinant is ±1. An equivalent definition: a unimodular
transformation is the composition of a finite number of reflections fi (x) :=
(x1 , . . . , −xi , . . . , xn ), and sums fi,j (x) := (x1 , . . . , xi−1 , xi + xj , xi+1 , . . . , xn ),
An Introduction to Empty Lattice Simplices 405
(i, j = 1, . . . , n). The equivalence of the two definitions is easy to prove using
the Hermite normal form (see the definition in [13]).
We will use unimodular transformations of a set of vectors V by putting
them into a matrix M as rows, and then using column operations to determine
the Hermite normal form M 0 of M . Then the rows of M 0 can be considered to
provide an ‘isomorphic’ representation of V .
The residue of x ∈ IR mod d ∈ IIN will be denoted by mod(x, d), 0 ≤
mod(x, d) < d.
Let V := {v1 , . . . , vn } ⊆ ZZn be a basis of IRn , and d := det(v1 , . . . , vn ).
By Cramer’s rule every integer vector is a linear combination of V with coeffi-
cients that are integer multiples of 1/d. For x ∈ IRn the coefficient vector λ =
(λ1 , . . . , λn ) ∈ IRn defined by the unique combination x = (λ1 v1 + . . . + λn vn )/d
will be called the V -coefficient vector of x. In other words λ = dV −1 x (where
V also denotes the n × n matrix whose i-th column is vi ). If x ∈ ZZn then all
V -coefficients of x are integer.
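As an illustration (ours, with made-up numbers), the V -coefficient vector can be computed exactly with rational arithmetic as λ = dV −1 x, with the vi as the columns of the matrix:

```python
from fractions import Fraction

def solve_exact(A, b):
    """Solve A y = b over the rationals by Gauss-Jordan elimination
    (A nonsingular, given as a list of rows)."""
    n = len(A)
    M = [[Fraction(A[i][j]) for j in range(n)] + [Fraction(b[i])] for i in range(n)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]

def v_coefficients(V, x, d):
    """V-coefficient vector lambda of x, i.e. x = (lambda_1 v_1 + ... + lambda_n v_n)/d,
    computed as lambda = d * V^{-1} x with the v_i as columns."""
    n = len(V)
    cols = [[V[j][i] for j in range(n)] for i in range(n)]  # columns are the v_i
    return [d * mu for mu in solve_exact(cols, x)]

# Hypothetical basis with d = |det| = 5:
V = [(1, 0, 0), (0, 1, 0), (1, 2, 5)]
lam = v_coefficients(V, (1, 1, 2), 5)
assert lam == [3, 1, 2]   # integral, since x is an integer vector
```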
Clearly, V -coefficient vectors are unchanged by linear transformations, and
the width is unchanged under unimodular transformations. (The inverse of a
unimodular transformation is also unimodular. Unimodular transformations can
be considered to be the ‘isomorphisms’ of polytopes with respect to their integer
vectors.)
A par(V )-coefficient vector is a vector λ ∈ IRn which is the V -coefficient
vector of some x ∈ par(V ). We will often exploit the fact that parallelepipeds are
symmetric objects: if x ∈ par(v1 , . . . , vn ), then v1 + . . .+ vn − x ∈ par(v1 , . . . , vn ).
In other words, if (λ1 , . . . , λn ) ∈ IINn is a par(V )-coefficient vector, then (d −
λ1 , . . . , d − λn ) is also one (extensively used in [10] and [14]).
We will use par(V )-coefficients and some basic facts in the same way as we did
in [14]. These have been similarly used in books, for n = 3 in White’s paper [15],
or in Reznick [10] through ‘barycentric coordinates’, whose similarity was kindly
pointed out to the author by Jean-Michel Kantor.
An important part of the literature is in fact involved more generally with
the number of integer points in three (or higher) dimensional simplices. Our
ultimate goal is to find only simple general ‘good certificates’ (or polynomial
algorithms) for deciding when this number is zero, and par(V )-coefficients seem
to be helpful to achieve this task. We are able to achieve only much less: we treat
small dimensional cases in a simple new way, bringing to the surface some more
general facts and conjectures.
If x ∈ IRn , then clearly, there exists a unique integer vector p so that x +
p ∈ par(V ). If x ∈ ZZn and λ ∈ ZZn is the V -coefficient vector of x, then the
V -coefficient vector of x + p is (mod(λ1 , d)/d, . . . , mod(λn , d)/d). The par(V )-
coefficient vectors form a group G = G(V ) with respect to mod d addition. This
is the factor-group of the additive group of integer vectors with respect to the
subgroup generated by V , moreover the following well-known facts hold:
Lemma 3.1 Let V := {v1 , . . . , vn } ⊆ ZZn . Then:
(a) par(V \vn ) = {0} if and only if there exists a unimodular transformation (and
possibly permutation of the coordinates) such that vi := ei (i = 1, . . . , n − 1),
vn = (a1 , . . . , an−1 , d), where d = det(v1 , . . . , vn ) and 0 < ai < d for all
i = 1, . . . , n − 1.
(b) If par(V \ vn ) = {0}, then G(V ) is a cyclic group.
(c) If par(V \ vn ) = {0} then par(V \ vi ) = {0} for some i ∈ {1, . . . , n − 1} if
and only if in (a) gcd(ai , d) = 1.
Indeed, let the rows of a matrix M be the vectors vi ∈ ZZn (i = 1, . . . , n)
and consider the Hermite normal form of (the column lattice of) M . If par(V \
vn ) = {0}, then deleting the last row and column we get an (n − 1) × (n − 1)
identity matrix, and (a) follows. Now (b) is easy, since par(V ) = {0} ∪
{⌈(i/d)(a1 , . . . , an−1 , d)⌉ : i = 1, . . . , d − 1}, that is, the par(V )-coefficients are
equal to {mod(ih, d) : i = 1, . . . , d − 1}, where h := (d − a1 , . . . , d − an−1 , 1).
Statement (c) is also easy, since gcd(ai , d) = gcd(d − ai , d) > 1 if and only if
there exists j ∈ IIN, 0 < j < d, so that the i-th coordinate of mod(jh, d) is 0.
Statement (b) is very useful for proving properties of the whole parallelepiped
P : one can generate the V := {v1 , . . . , vn }-coefficients of all the d − 1 nonzero
points of P by taking the mod d multiples of the V -coefficient vector of some
generator h = (h1 , . . . , hn ) ∈ P (cf. [10], or [14]). If the polytope we are consid-
ering is described in terms of inequalities satisfied by a function of the par(V )-
coefficients, then it is useful to understand how the par(V )-coefficient vector
mod((i + 1)h, d) changes compared to mod(ih, d).
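A small sketch of this generation (our own numbers), in the setting of Lemma 3.1 with vi = ei for i < n and vn = (a1 , . . . , an−1 , d): the nonzero integer points of par(V ) together with their coefficient vectors, which are exactly the mod d multiples of the generator h = (d − a1 , . . . , d − an−1 , 1).

```python
def par_points_and_coeffs(a, d):
    """For V = {e_1, ..., e_{n-1}, v_n} with v_n = (a_1, ..., a_{n-1}, d):
    the d-1 nonzero integer points p_i = ceil((i/d) * v_n) of par(V), paired
    with their V-coefficient vectors mod(i*h, d), h = (d-a_1,...,d-a_{n-1},1)."""
    h = [d - aj for aj in a] + [1]
    out = []
    for i in range(1, d):
        p = [-((-i * aj) // d) for aj in a] + [i]   # exact ceil(i*a_j/d)
        lam = [(i * hj) % d for hj in h]
        out.append((p, lam))
    return out

# Example: n = 3, v_3 = (1, 2, 5), d = 5.  Check that d*p_i really equals
# lam_1*e_1 + lam_2*e_2 + lam_3*v_3, i.e. the coefficient vectors are the
# mod-5 multiples of the generator h = (4, 3, 1).
for p, lam in par_points_and_coeffs([1, 2], 5):
    combo = [lam[0] + lam[2] * 1, lam[1] + lam[2] * 2, lam[2] * 5]
    assert combo == [5 * pi for pi in p]
```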
For instance, for the simplex S := conv(0, v1 , . . . , vn ) to be empty means exactly
that the sum of the V -coefficients of any nonzero vector in P is strictly greater than d. For
any coordinate 0 < a ≤ d−1 of h one simply has mod((i+1)a, d) = mod(ia, d)+a,
unless the interval (mod(ia, d), mod(ia, d) + a] contains a multiple of d, that
is, if and only if mod(ia, d) + a ≥ d. In this latter case mod((i + 1)a, d) =
mod(ia, d) + a − d, and we will say that i is a jump-coefficient of a mod d.
Hence mod d inequalities can be treated as ordinary inequalities, corrected
by keeping track of jump-coefficients. We will need only the following simply stated
Lemma 3.2, relating containment relations between sets of jump-coefficients
to divisibility relations.
We say that i ∈ {1, . . . , d − 1} is a jump-coefficient of a ∈ IIN mod d (1 ≤ a <
d), if ⌊(i + 1)a/d⌋ > ⌊ia/d⌋ (equivalently, if mod((i + 1)a, d) < mod(ia, d)). If
a = 1, then Ja (d) = ∅, and if a ≥ 2, the set of jump-coefficients of a mod d is
(∗) Ja (d) := {⌊id/a⌋ : i = 1, . . . , a − 1}.
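The two descriptions of Ja (d) are easy to check against each other numerically (a sketch of ours; we let i run only up to d − 2, so that the wrap-around at i = d − 1 does not interfere, and we assume gcd(a, d) = 1 so that no id/a is an integer):

```python
from math import gcd

def jumps(a, d):
    """J_a(d) via (*): { floor(i*d/a) : i = 1, ..., a-1 }."""
    return {(i * d) // a for i in range(1, a)}

def jumps_by_definition(a, d):
    """i is a jump-coefficient iff mod(i*a, d) + a >= d, equivalently
    floor((i+1)*a/d) > floor(i*a/d); restricted to i = 1, ..., d-2."""
    return {i for i in range(1, d - 1) if (i * a) % d + a >= d}

# the two descriptions agree whenever gcd(a, d) = 1
for d in range(3, 40):
    for a in range(1, d):
        if gcd(a, d) == 1:
            assert jumps(a, d) == jumps_by_definition(a, d)
```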
Let us illustrate our goal and the statement we want to prove on the easy
example of a = 2 and d odd: let us show that Ja (d) ⊆ Jb (d) if and only if b is
even. Indeed, 2 has just one jump-coefficient, ⌊d/2⌋. So Ja (d) ⊆ Jb (d) if and only
if ⌊d/2⌋ ∈ Jb , that is, if and only if the interval (⌊d/2⌋b, (⌊d/2⌋ + 1)b] contains a
multiple of d. It does contain a multiple of d/2, namely bd/2, and since b/2 < d/2 this
is the only multiple of d/2 it contains. But bd/2 is a multiple of d if and only if
b is even, as claimed.
Lemma 3.2 states a generalization of this statement for arbitrary a. Let us
first visualise the statement, and (∗) – a basic tool in the proof :
Let d ∈ IIN be arbitrary, and 0 < a, b < d. The points A := {id/a : i =
1, . . . , a − 1} divide the interval [0, d] into a equal parts. Each of these parts has
length bigger than 1, so the points of A lie in different intervals (i, i + 1]. Now
(∗) means exactly that
(∗∗) i ∈ Ja if and only if the interval (i, i + 1] contains an element of A.
If b > a, then clearly, there is an interval (i, i + 1] containing a point of
B := {id/b : i = 1, . . . , b − 1} and not containing a point of A. If b is a multiple
of a, then obviously, B ⊇ A. If d − a is a multiple of d − b, then again Ja ⊆ Jb
is easy to prove (see the Fact below).
If a, b ≤ d/2, then d − b | d − a cannot hold (unless a = b). The following lemma states that
under this condition the above remark can be reversed: if a does not divide b,
then A \ B ≠ ∅. If a ≤ d/2 and b > d/2, then Ja (d) ⊆ Jb (d) can hold without
either a | b or d − b | d − a being true (see the example below).
Let us first note the following statement that will be frequently used in the
sequel:
Fact: {Ja (d), Jd−a (d)} is a bipartition of {1, . . . , d − 1}. (Easy.)
The following lemma is a simple and quite basic statement:

Lemma 3.2 Let a, b, d ∈ IIN, 0 < a, b < d/2, gcd(a, d) = 1. If Ja (d) ⊆ Jb (d),
then a divides b.

Proof. Let a, b ∈ IIN, 0 < a, b < d/2, Ja (d) ⊆ Jb (d). Then a ≤ b. Suppose
a does not divide b, and let us show Ja \ Jb ≠ ∅. We have then 2 ≤ a < b,
and Ja \ Jb ≠ ∅ means exactly the existence of k ∈ {1, . . . , a − 1} such that
⌊kd/a⌋ ∉ Jb (see (∗)).
Claim: Let k ∈ {1, . . . , a − 1}. Then ⌊kd/a⌋ ∉ Jb if and only if both
mod(kd, a)/ mod(kb, a) < d/b and (a − mod(kd, a))/(a − mod(kb, a)) < d/b
hold.
This statement looks somewhat frightening, but we will see that it expresses
exactly what ⌊kd/a⌋ ∉ Jb means if one exploits (∗) for b, and (∗∗) for a:
Indeed, ⌊kd/a⌋ ∉ Jb if and only if for all i = 1, . . . , b − 1: id/b ∉
(⌊kd/a⌋, ⌊kd/a⌋ + 1]. Let us realize that, instead of checking this condition for
all i, it is enough to check it for those i for which id/b has a
chance of being in (⌊kd/a⌋, ⌊kd/a⌋ + 1]:
Since kd/a = (kb/a)(d/b), and hence ⌊kb/a⌋(d/b) ≤ kd/a ≤ ⌈kb/a⌉(d/b),
there are only two such values, i = ⌊kb/a⌋ and i = ⌈kb/a⌉. The condition for
these two values is ⌊kb/a⌋(d/b) ≤ ⌊kd/a⌋, and ⌈kb/a⌉(d/b) > ⌊kd/a⌋ + 1.
Writing these out with ⌊kb/a⌋ = (kb − mod(kb, a))/a, ⌈kb/a⌉ = (kb + a −
mod(kb, a))/a and ⌊kd/a⌋ = (kd − mod(kd, a))/a, the first turns into the first
inequality of the claim, and the second into
((a − mod(kb, a))/a)(d/b) > 1 − mod(kd, a)/a,
which is the second inequality. The claim is proved.
Let g := gcd(a, b). The values mod(ib, a) (i ∈ {0, 1, . . . , a − 1}) are from the
set {jg : j = 0, . . . , (a/g) − 1}, and each number in this set is taken g times.
Depending on whether a/g is even or odd, a/2 or (a − g)/2 is in this set.
So there exist g different values of i for which a/2 − g/2 ≤ mod(ib, a) ≤ a/2,
and since for all of these g values mod(id, a) is different (because of gcd(a, d) = 1),
for at least one of them mod(id, a) ≤ a − g. For this i we have
mod(id, a)/ mod(ib, a) ≤ (a − g)/(a/2 − g/2) = 2, and since a − mod(ib, a) ≥ a/2,
also (a − mod(id, a))/(a − mod(ib, a)) ≤ 2. Since d/b > 2 by assumption, the
condition of the claim is satisfied and we conclude ⌊id/a⌋ ∉ Jb . □
Indeed, according to the Fact proved before the Lemma, Ja (d) ⊆ Jb (d) im-
plies Jd−b (d) ⊆ Jd−a (d), and clearly, 0 < d − b, d − a < d/2 and gcd(d − b, d) =
gcd(d − a, d) = 1. Hence the Lemma can be applied to d − b, d − a and it estab-
lishes d − b | d − a.
The following example shows that the Lemma and its corollary are not nec-
essarily true if a < d/2, b > d/2, even if the condition of the lemma is ‘asymp-
totically true’ (d/b → 2 as k → ∞): let k ∈ IIN, k ≥ 3; d := 6k − 1, a = 3,
b = 3k + 4.
Then Ja = {2k − 1, 4k − 1} ⊆ Jb – let us check for instance that 2k − 1 ∈ Jb :
(2k − 1)b = 6k² + 5k − 4 = (6k − 1)k + 6k − 4 ≡ 6k − 4 mod 6k − 1. Since
3k + 4 > (6k − 1) − (6k − 4) = 3, 2k − 1 is a jump-coefficient of b = 3k + 4
mod d.
For k = 2 we do not get a real example: a = 3, b = 10, d = 11; Ja ⊆ Jb
is true, and 3 is not a divisor of 10, but the corollary applies to d − a = 8 and
d − b = 1. One gets the smallest example with Ja ⊆ Jb and neither a|b nor
d − b | d − a by substituting k = 3: then a = 3, b = 13, d = 17.
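Both the lemma and the example family are easy to sanity-check numerically (a brute-force sketch of ours, over a small range — a check, not a proof):

```python
from math import gcd

def jumps(a, d):
    """J_a(d) via (*): { floor(i*d/a) : i = 1, ..., a-1 }."""
    return {(i * d) // a for i in range(1, a)}

# Lemma 3.2: for 0 < a, b < d/2 with gcd(a, d) = 1,
# the containment J_a(d) <= J_b(d) forces a | b.
for d in range(3, 50):
    for a in range(1, (d + 1) // 2):
        if gcd(a, d) != 1:
            continue
        for b in range(1, (d + 1) // 2):
            if jumps(a, d) <= jumps(b, d):
                assert b % a == 0, (a, b, d)

# The example above with k = 3, i.e. a = 3, b = 13, d = 17 (here b > d/2):
a, b, d = 3, 13, 17
assert jumps(a, d) == {5, 11}                   # {2k-1, 4k-1}
assert jumps(a, d) <= jumps(b, d)               # although 3 does not divide 13 ...
assert b % a != 0 and (d - a) % (d - b) != 0    # ... and d-b does not divide d-a
```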
4 A Polynomial Certificate
also holds, except if λi = 0 for all i = 1, . . . , n for which ai > 0. We suppose that
this exception does not occur. (This is automatically the case if all proper faces
of par(V ) are empty, or if ai > 0 for all i = 1, . . . , n.) Then, in order to certify
the validity of the above inequality for the entire par(V ), one only has to check
that the congruence holds for a generating set.
Such inequalities (induced exactly by the ‘orthogonal’ space to G(V ) mod d)
can then be combined in the usual way of linear (or integer) programming, in
order to yield new inequalities. We do not have an example where this procedure
would not provide a ‘short’ certificate for a lattice simplex to be empty, or more
generally, for the inequality λ1 /d + . . . + λn /d > k (k ∈ IIN) to hold.
By the symmetry of the parallelepiped, the maximum of k for which such an
inequality can be satisfied is k = n/2.
For this extreme case (which occurs in the open cases of the integer Caratheo-
dory property – and slightly changes for odd n) we conjecture that the simplest
possible ‘good certificate’ can work:
Theorem 4.2 Let S ⊆ IR3 be an integer simplex with vertices O = 0 ∈ IR3 , and
A, B, C linearly independent vectors, d := det(A, B, C), V := {A, B, C}. The
following statements are equivalent:
Proof. If (i) holds, then (α, β, γ) ∈ G(V ) is a generator, so (ii) also holds. If
(ii) holds, then for all x ∈ par(A, B, C) the sum of the first two V -coefficients is
divisible by d, and it follows that it is equal to d; since par(A, B) = {0}, the third
V -coefficient of x is nonzero, so the sum of the V -coefficients is greater than d,
which means exactly that x ∉ S. So S is empty.
The main part of the proof is (iii) implies (i). We follow the proof of [14]
Theorem 2.2:
Let S ⊆ IR3 be an empty integer simplex. Then every face of S is also integer
and empty. Therefore (since the faces are two-dimensional) par(V 0 ) = {0} for
every proper subset V 0 ⊂ V . Now by Lemma 3.1, the group G(V ) is cyclic. Let
the par(V )-coefficient (α, β, γ) be a generator.
Claim 1: d < mod(iα, d) + mod(iβ, d) + mod(iγ, d) < 2d (i = 1, . . . , d − 1).
Indeed, since S is empty, mod(iα, d) + mod(iβ, d) + mod(iγ, d) > d, and
(d − mod(iα, d)) + (d − mod(iβ, d)) + (d − mod(iγ, d)) > d, for all i = 1, . . . , d − 1.
Claim 2: There exists a generator (α, β, γ) ∈ G(V ) such that α + β + γ = d + 1.
Note that mod(iα, d)+mod(iβ, d)+mod(iγ, d) is mod d different for different
i ∈ {1, . . . , d − 1}, because if for j, k, 0 ≤ j < k ≤ d − 1 the values were equal,
then for i = k − j the expression would be divisible by d, contradicting Claim
1. So {mod(iα, d) + mod(iβ, d) + mod(iγ, d) : i = 1, . . . , d − 1} is the same as
the set {d + 1, . . . , 2d − 1}, in particular there exists k ∈ {1, . . . , d − 1} such that
mod (kα, d) + mod (kβ, d) + mod (kγ, d) = d + 1. Then clearly, mod (ikα, d) +
mod (ikβ, d) + mod (ikγ, d) = d + i, so (kα, kβ, kγ) is also a generator, and the
claim is proved.
Choose now (α, β, γ) to be as in Claim 2.
Claim 3. Each i = 1, . . . , d − 1 is a jump-coefficient of exactly one of α, β and γ.
Indeed, because of Claim 1 we have mod((i + 1)α, d) + mod((i + 1)β, d) +
mod((i + 1)γ, d) = mod(iα, d) + mod(iβ, d) + mod(iγ, d) + α + β + γ − d and the
claim follows.
Fix now the notation so that α ≥ β ≥ γ. If we apply Claim 3 to i = 1 we get
that α > d/2, β < d/2, γ < d/2.
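A quick numerical illustration of Claims 1–3 (with hand-picked triples of our own, not from the paper): for (α, β, γ) = (d − q, q, 1) with gcd(q, d) = 1 the coefficient sums are mod(i(d − q), d) + mod(iq, d) + i = d + i for i = 1, . . . , d − 2, so Claims 1 and 2 hold, and the jump-coefficient sets of the three values partition {1, . . . , d − 2}:

```python
from math import gcd

def jumps(a, d):
    """J_a(d) via (*) in Section 2."""
    return {(i * d) // a for i in range(1, a)}

for d, q in [(7, 3), (11, 4), (13, 5)]:
    assert gcd(q, d) == 1
    alpha, beta, gamma = d - q, q, 1
    assert alpha + beta + gamma == d + 1          # Claim 2 normalization
    for i in range(1, d - 1):
        # coefficient sum d + i lies strictly between d and 2d (Claim 1) ...
        s = (i * alpha) % d + (i * beta) % d + (i * gamma) % d
        assert d < s < 2 * d and s == d + i
        # ... and i is a jump-coefficient of exactly one of the three (Claim 3)
        assert sum(i in jumps(a, d) for a in (alpha, beta, gamma)) == 1
```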
Hence Lemma 3.2 can be applied to d − α, β, γ: Claim 3 means Jβ ∩ Jα = ∅,
Jγ ∩ Jα = ∅, whence, because of the Fact noticed before Lemma 3.2, Jβ , Jγ ⊆
Proof. The if part is obvious, since for an arbitrary par(V )-coefficient vector
(α, β, γ): α + β = d and γ > 0, so α/d + β/d + γ/d > 1.
The only if part is an obvious consequence of the theorem: let the two par-
ticular rows mentioned in Theorem 4.2 (i) be the first two rows of a matrix M ,
and let the remaining row be the third. Then M is a 3 × 3 matrix, and with
the notation of Theorem 4.2, the rows of the Hermite normal form of M are
the following : the first two rows are (1, 0, 0), (0, 1, 0) since the parallelepiped
generated by any two nonzero extreme rays is {0}; the third is (d − α, d − β, d).
Since by Theorem 4.2 α + β = d, the statement follows. □
Proof. If n ≤ 2 the statement is obvious (see above). The only if part follows
from the previous corollary: after applying a unimodular transformation, the
three nonzero vertices of S will be (1, 0, 0), (0, 1, 0) and (1, b, d). The vector
w := (1, 0, 0) shows that the width of S is 1.
In spite of some counterexamples (see Section 1), the width and the number
of lattice points of a lattice simplex are correlated, and some of the remarks
above are about this relation. It is interesting to note that the complexities of
computing these two numbers seem to show some analogy: it is hard to compute
the number of integer points of a polytope, but according to a recent result of
Barvinok [2] this problem is polynomially solvable if the dimension is bounded;
we show below that computing the width of quite particular simplices is already
NP-hard, however, there is a simple algorithm that finds the width in polynomial
time if the dimension is fixed. The proofs are quite easy:
for an integer point in conv(X ∪{0}) but not in X ∪{0}. Mixed integer programs
with a fixed number of integer variables can be solved in polynomial time by
Lenstra’s algorithm [9], see also Schrijver [13], page 260.
Haase and Ziegler [5] express W (S), where S is a lattice simplex, as the optimal
value of a direct integer linear program. Their method is much simpler, and
probably quicker. The finite set of points (‘finite basis’) provided by Theorem 5.2
can be useful for presenting a finite list of vectors that include a width-defining
w ∈ ZZn , for arbitrary polytopes.
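For very small instances the width can simply be brute-forced (a sketch of ours, with a heuristic cutoff on the search box for w; it is not a substitute for the integer-programming approaches just mentioned):

```python
from itertools import product

def width(vertices, bound=3):
    """Lattice width of conv(vertices): minimum over nonzero integer w in the
    box [-bound, bound]^n of max w.x - min w.x over the vertices.  The cutoff
    `bound` is a heuristic; in general the minimizing w may lie outside it."""
    n = len(vertices[0])
    best = None
    for w in product(range(-bound, bound + 1), repeat=n):
        if any(w):
            vals = [sum(wi * xi for wi, xi in zip(w, v)) for v in vertices]
            spread = max(vals) - min(vals)
            if best is None or spread < best:
                best = spread
    return best

# S_3(1) from Section 1: w = (1, 0, 0) gives spread 1, and a full-dimensional
# integer simplex always has width at least 1.
assert width([(0, 0, 0), (1, 1, 0), (0, 1, 1), (1, 0, 1)]) == 1
```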
The negative results of the paper do not exclude that the emptiness of integer
simplices, and the width of empty integer simplices are decidable in polynomial
time. The positive results show some relations between these notions, involving
both complexity and bounds.
Acknowledgment: I am thankful to Jean-Michel Kantor for introducing me
to the notions and the references of the subject; to Imre Bárány and Bernd
Sturmfels for further helpful discussions.
References
1. W. Banaszczyk, A.E. Litvak, A. Pajor, S.J. Szarek, The flatness theorem for
non-symmetric convex bodies via the local theory of Banach spaces, preprint,
1998.
2. A. Barvinok, A polynomial time algorithm for counting integral points in poly-
hedra when the dimension is fixed, Math. Oper. Res., 19 (1994), 769–779.
3. J.W.S. Cassels, An Introduction to the Geometry of Numbers, Springer, Berlin,
1959.
4. W. Bienia, L. Goddyn, P. Gvozdjak, A. Sebő, M. Tarsi, Flows, View Obstruc-
tions, and the Lonely Runner, J. Combin. Theory Ser. B, 72 (1998), No. 1.
5. C. Haase, G. Ziegler, On the maximal width of empty lattice simplices, preprint,
July 1998/January 1999, 10 pages, European Journal of Combinatorics, to ap-
pear.
6. R. Kannan, L. Lovász, Covering minima and lattice-point-free convex bodies,
Annals of Mathematics, 128 (1988), 577-602.
7. L. Lovász and M. D. Plummer, Matching Theory, Akadémiai Kiadó, Budapest,
1986.
8. J-M. Kantor, On the width of lattice-free simplices, Compositio Mathematica,
1999.
9. H.W. Lenstra Jr., Integer programming with a fixed number of variables, Math-
ematics of Operations Research, 8, (1983), 538–548.
10. B. Reznick, Lattice point simplices, Discrete Mathematics, 60, 1986, 219–242.
11. J.E. Reeve, On the volume of lattice polyhedra, Proc. London Math. Soc. (3) 7
(1957), 378–395.
12. H. Scarf, Integral polyhedra in three space, Math. Oper. Res., 10 (1985), 403–438.
13. A. Schrijver, Theory of Integer and Linear Programming, Wiley, Chichester,
1986.
14. A. Sebő, Hilbert bases, Caratheodory’s theorem and Combinatorial Optimiza-
tion, IPCO1, (R. Kannan and W. Pulleyblank eds), University of Waterloo Press,
Waterloo 1990, 431–456.
15. G. K. White, Lattice tetrahedra, Canadian J. Math. 16 (1964), 389–396.
On Optimal Ear-Decompositions of Graphs
Zoltán Szigeti
1 Introduction
This paper is concerned with matchings, matroids and ear-decompositions. The
results presented here are closely related to the ones in [7]. As in [7], we focus
our attention on ear-decompositions (called optimal) that have the minimum
number ϕ(G) of even ears. A. Frank showed in [3] how an optimal ear-decomposition
can be constructed in polynomial time for any 2-edge-connected graph. For an
application we refer the reader to [1], where we gave a 17/12-approximation algo-
rithm for the minimum 2-edge-connected spanning subgraph problem. ϕ(G) can
also be interpreted as the minimum size of an edge-set (called critical making)
whose contraction leaves a factor-critical graph, because, by a result of Lovász
(see Theorem 1), ϕ(G) = 0 if and only if G is factor-critical. It was shown in
[7] (see also [8]) that the minimal critical making edge sets form the bases of a
matroid. We call this matroid the ear-matroid of the graph; it will be denoted
by MG . For a better understanding of this matroid we give here a simple
description of the blocks of MG . As we shall see, two edges belong to the same
block of the ear-matroid if and only if there exists an optimal ear-decomposition
so that these two edges are contained in the same even ear. Let us denote by
D(G) the graph obtained from G by deleting each edge which is always in an
odd ear in all optimal ear-decompositions of G. We shall show that the edge sets
of the blocks of this graph D(G) coincide with the blocks of MG .
Let us turn our attention to matching theory. Matching-covered graphs and
saturated graphs have ϕ(G) = 1. By generalizing the Cathedral Theorem of
Lovász [6] on saturated graphs we proved in [7] a structure theorem on graphs
with ϕ(G) = 1. As a new application of that result we present here a simple
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 415–428, 1999.
proof for the Tight Cut Lemma due to Edmonds, Lovász and Pulleyblank [2].
Edmonds et al. determined in [2] the dimension of the perfect matching polytope
of an arbitrary graph. They used the so-called brick decomposition and the most
complicated part was to show that the result is true for bricks. In this case the
problem is equivalent to saying that no brick contains a non-trivial tight cut.
This is the Tight Cut Lemma. Their proof for this lemma contains some linear
programming arguments, the description of the perfect matching polytope, the
uncrossing technique and some graph theoretic arguments. Here we provide a
purely graph theoretic proof.
Let us call a graph ϕ-covered if for each edge e there exists an optimal
ear-decomposition so that e is contained in an even ear. We shall generalize
matching-covered graphs via the following fact.
(×) A graph G is matching-covered if and only if G/e is factor-critical for each
edge e of G, in other words, G is ϕ-covered and ϕ(G) = 1.
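Fact (×) is easy to test on small graphs; the following sketch (ours — brute force, fine only for tiny instances) checks it on the cycles C5 and C4:

```python
def has_perfect_matching(vertices, edges):
    """Simple backtracking test for a perfect matching (tiny graphs only)."""
    vertices = sorted(vertices)
    if not vertices:
        return True
    if len(vertices) % 2:
        return False
    u = vertices[0]
    for v in vertices[1:]:
        if frozenset((u, v)) in edges:
            rest = [x for x in vertices if x not in (u, v)]
            if has_perfect_matching(rest, edges):
                return True
    return False

def is_factor_critical(vertices, edges):
    """G is factor-critical iff G - v has a perfect matching for every v."""
    return all(has_perfect_matching([x for x in vertices if x != v], edges)
               for v in vertices)

def contract(vertices, edges, e):
    """Contract edge e = {u, v}; parallel edges and loops are discarded,
    which does not affect matchings."""
    u, v = sorted(e)
    new_edges = set()
    for f in edges:
        g = frozenset(u if x == v else x for x in f)
        if len(g) == 2:
            new_edges.add(g)
    return [x for x in vertices if x != v], new_edges

# Odd cycle C5 is factor-critical (phi = 0); C4 is not, but C4/e is
# factor-critical for every edge e, so C4 is matching-covered (phi = 1).
C5_V = list(range(5)); C5_E = {frozenset((i, (i + 1) % 5)) for i in range(5)}
C4_V = list(range(4)); C4_E = {frozenset((i, (i + 1) % 4)) for i in range(4)}
assert is_factor_critical(C5_V, C5_E)
assert not is_factor_critical(C4_V, C4_E)
assert all(is_factor_critical(*contract(C4_V, C4_E, e)) for e in C4_E)
```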
Thus ϕ-covered graphs can be considered as the generalization of matching-
covered graphs. We shall extend several results on matching-covered graphs to
ϕ-covered graphs. For example Little’s theorem, combined with a result of
Lovász and Plummer, states that if e and f are two arbitrary edges of a
matching-covered graph G, then G has an optimal ear-decomposition such that
the first ear is even and it contains e and f . This is also true for ϕ-covered graphs. Lovász
and Plummer [6] proved that each matching-covered graph has a 2-graded ear-
decomposition. For a simple proof of this theorem see [9]. This result may also
be generalized for ϕ-covered graphs.
We say that an edge set of a graph G is critical making if its contraction leaves
a factor-critical graph. An edge of a graph G is allowed if it lies in some perfect
matching of G. If G has a perfect matching then N (G) denotes the subgraph of
G induced by the allowed edges of G. A connected graph G is matching-covered
if each edge of G is allowed. The following theorem is obtained by combining the
result of Little [4] and a result of Lovász and Plummer [6].
Proof. If at least one of the two end vertices of e is contained in one of the odd
components of H − X, then let us denote this component by K, otherwise let K
be an arbitrary odd component of H − X. Let f be a ϕ-extreme edge in H which
connects K to X, such an edge exists by Theorem 4.3 in [3]. The proposition is
true for f (see above the other direction), thus there exists a base Bf ∈ B(G)
so that f ∈ Bf and |Bf ∩ E(G[V (H)])| = 1. The edge e is ϕ-extreme in G, that
is, it is independent in M(G). Thus it can be extended to a base Be ∈ B(G)
using elements in Bf . We show that Be has the required properties. Clearly,
e ∈ Be . |Be ∩ E(G[V (H)])| ≤ 2 by construction. Let us denote by X′ (by H′)
the vertex set (subgraph) corresponding to X (H) in G/Be . G/Be is factor-
critical because Be ∈ B(G), whence it trivially follows that co (H′ − X′) < |X′|.
Thus, by construction, |X| − 1 = co (H − X) − 1 ≤ co (H′ − X′) ≤ |X′| − 1 ≤
|X| − 1. Thus co (H′ − X′) = co (H − X) − 1 and |X′| = |X|. It follows that
|Be ∩ E(G[V (H)])| = 1. □
The first two statements of the following theorem were proved in [7]. Since
the fourth statement is trivial, we prove here only the third one.
On Optimal Ear-Decompositions of Graphs 419
Proof. Let us denote by U the vertex set on the right hand side in Theorem
(4.3). By (∗), V (B(G)) ⊆ U. To show the other direction let v ∈ U. Let e be an
allowed edge of G incident to v. Let G′ := G/e and let us denote by u the vertex
of G′ corresponding to the two end vertices of e. G′ − u has a perfect matching
because e is an allowed edge of G. If G′ is not factor-critical, then it is easy to
see (using Lemma 1.8 in [8]) that G′ contains a strong subgraph H so that H
does not contain u. It follows that H is a strong subgraph of G not containing
v, which is a contradiction since v ∈ U. Thus G′ is factor-critical, that is, e is a
critical making edge of G, thus v ∈ V (B(G)), and we are done. □
Lemma 2. Let e and f be two edges of a 2-edge-connected graph G. The
following are equivalent for the ear-matroid M(G):
a.) there exists an optimal ear-decomposition of G so that the first ear is even
and it contains e and f .
b.) e and f are in the same block of M(G).
The following result generalizes Theorem (4.2) and gives some information
about the structure of D(G) for an arbitrary 2-edge-connected graph G.
Proof. We prove the theorem by induction on ϕ(G). First suppose that ϕ(G) = 1.
Then Theorem (4.2) proves a.), and Theorem (4.2) and Fact (×) in the Intro-
duction implies b.), c.) and d.).
Now suppose that ϕ(G) ≥ 2. Let H be a strong subgraph of G with strong
barrier X. By Theorem 3/b, H is almost critical, and by Theorem (4.2), B(H) is
matching-covered. Since B(H) is 2-vertex-connected, by Lemma 1, it is included
in some Gi , say G1 . Consider the graph G′ := G/B(H). Since H/B(H) is factor-
critical by Theorem (4.2), ϕ(G′) = ϕ(G/H) and E(D(G/H)) = E(D(G′)). By
Lemma 1, E(D(G)) − E(D(H)) = E(D(G/H)), so E(D(G′)) = E(D(G)) −
E(D(H)). Thus the blocks G′1 , . . . , G′l of D(G′) are exactly the blocks of G1 /B(H)
and G2 , . . . , Gk . By Theorem 3/b, ϕ(G′) = ϕ(G/H) = ϕ(G) − 1, thus, by the
induction hypothesis, the theorem is true for G′.
Proposition 3. B(H) is a strong subgraph of G1 .
Proof. For an edge e ∈ E(H), e is ϕ-extreme in H if and only if e is ϕ-extreme in
G by Lemma 1; thus, deleting the non-ϕ-extreme edges of G from H, the resulting
graph is exactly D(H). The factor-critical components of H − X correspond
to odd components of B(H) − X because the graph B(H) is nice in H by
Theorem (4.1). Thus X is a barrier of B(H). Let Y be a maximal barrier of
B(H) including X. Then, since B(H) is matching-covered by Theorem (4.2), Y
is a strong barrier of B(H). Since X separates the factor-critical components of
H − X from G − V (H) in G, Y separates B(H) − Y from G1 − V (B(H)). It
follows that B(H) is a strong subgraph of G1 with strong barrier Y. □
a.) Since S(G) = S(G′) (in the second case we contracted G1 in two steps,
namely first B(H) and then the blocks of G1 /B(H)), the statement follows
from the induction hypothesis.
b.) By Proposition 3 and Theorem 3/b, ϕ(G1 /B(H)) = ϕ(G1 ) − 1. Hence
ϕ(G) = ϕ(G′) + 1 = (ϕ(G′1 ) + . . . + ϕ(G′l )) + 1 = ϕ(G1 ) − 1 + (ϕ(G2 ) + . . . +
ϕ(Gk )) + 1 = ϕ(G1 ) + . . . + ϕ(Gk ).
c.) By Theorem 1/b, ϕ(G) ≤ ϕ(G/Gi ) + ϕ(Gi ) and ϕ(G/Gi ) ≤ ϕ((G/Gi )/ ∪j≠i
Gj ) + Σj≠i ϕ(Gj ). By adding them, and using that ϕ((G/Gi )/ ∪j≠i Gj ) = 0 by
a.), and ϕ(G1 ) + . . . + ϕ(Gk ) = ϕ(G) by b.), we have ϕ(G) ≤ ϕ(G1 ) + . . . +
ϕ(Gk ) = ϕ(G). Thus equality holds everywhere, hence ϕ(G) = ϕ(G/Gi ) + ϕ(Gi ),
as we claimed.
d.) For i ≥ 2 the statement follows from the induction hypothesis. For G1 it
follows from the induction hypothesis, Proposition 3 and from Lemma 1. □
Theorem 7. The edge sets of the blocks of D(G) and the blocks of M(G) coin-
cide.
Proof. (a) Let e and f be two edges of G so that they belong to the same block
of M(G). By Lemma 2, there exists an optimal ear-decomposition of G so that
the starting even ear C contains e and f. Since every edge of C is ϕ-extreme,
the edges of this cycle belong to the same block of D(G).
(b) Let e and f be two edges of G so that they belong to the same block of
D(G), say to G1 . By Theorem 6/d, G1 is ϕ-covered, thus, by Theorem 5 and
Lemma 2, there exists an optimal ear-decomposition of G1 so that the starting
even ear contains e and f . By Theorem 6/c, ϕ(G/G1 ) = ϕ(G) − ϕ(G1 ), so this
ear-decomposition can be extended to an optimal ear-decomposition of G so that
the starting even ear contains e and f . Then by Lemma 2, e and f belong to
the same block of M(G). □
Let G′′ := G − {ei : ui ∈ R}. Using (6.1), the induction hypothesis implies
that there exists uj ∈ R so that uj ∈ V (B(G′′)).
4.6/b in [3], Lemma 2.6 in [7] and Theorem (4.1). Let M2 be a perfect matching
of G′ containing g. g is the only edge of M2 leaving F ∗ because X is a barrier in
G′. Thus M1 (F ∗ ) ∪ M2 (G′ − V (F ∗ )) ∪ {g} is a perfect matching of G′ containing
f. □
Proposition 7 gives a contradiction because, by the definition of T, there is
no allowed edge leaving T in G′. □
Proof. By (∗∗), G′′ has a perfect matching. If G′′ is not almost critical, then by
Corollary 4.8 in [3], there exist two disjoint strong subgraphs H1 and H2 in G′′
with strong barriers X1 and X2 . By Proposition (10.2), we may suppose that
X1 ⊆ (S − v). By Proposition (10.1), (S − u) ∩ V (H1 ) ≠ ∅, thus (S − u) ⊂
V (H1 ) because G[S − u] is connected. Hence (S − u) ∩ V (H2 ) = ∅, contradicting
Proposition (10.1). □
5 ϕ-Covered Graphs
The aim of this section is to extend earlier results of Lovász and Plummer [6]
on matching-covered graphs to ϕ-covered graphs. First we prove a technical lemma.
Proof. First suppose that G has a perfect matching. Then, by Corollary 4.8
in [3], G has two vertex disjoint strong subgraphs. Clearly, for one of them, say
H, we have e ∈ E(G/H). Secondly, suppose that G has no perfect matching. Let X be a
maximal vertex set for which co (G − X) > |X|. Then each component of G − X
is factor-critical and, by assumption, X ≠ ∅.
i.) If a component F contains an end vertex of e, then by Lemma 1.8 in [8], G
has a strong subgraph H so that V (H) ⊆ V (G) − V (F ) so we are done.
ii.) Otherwise, by Lemma 1.8 in [8], G has a strong subgraph H with strong
barrier Y ⊆ X so that each component of H − Y is a component of G − X. We
claim that e ∈ E(G/H). If not then the two end vertices u and v of e belong
to Y because we are in ii.). By Lemma 1, H is ϕ-covered and e is ϕ-extreme in
H. Thus by Theorem 3/b and Fact (×), H is matching-covered, that is H/e is
factor-critical. However, co ((H/e) − (Y /e)) = |Y | > |Y /e|, so obviously, H/e is
not factor-critical, a contradiction. ⊓⊔
Lovász and Plummer [6] proved that each matching-covered graph has a
2-graded ear-decomposition. This result may also be generalized for ϕ-covered
graphs. A sequence (G0 , G1 , ..., Gm ) of subgraphs of G is a generalized 2-graded
ear-decomposition of G if G0 is a vertex, Gm = G, for every i = 0, ..., m − 1 :
Gi+1 is ϕ-covered, Gi+1 is obtained from Gi by adding at most two disjoint
paths (ears) which are openly disjoint from Gi but their end-vertices belong to
Gi , if we add two ears then both are of odd length, and ϕ(Gi ) ≤ ϕ(Gi+1 ). This is
the natural extension of the original definition of Lovász and Plummer. Indeed,
if G is matching-covered then ϕ(G) = 1, thus the first ear will be even and all
the other ears will be odd.
Let Gj be the first subgraph of Q which contains v and let a and b be the two
edges of Gj incident to v. G/H/Q is also ϕ-covered, since a graph is ϕ-covered if
and only if each block of it is ϕ-covered. By induction G/H/Q has a generalized
2-graded ear-decomposition (G∗0 , G∗1 , ..., G∗p ) so that the starting ear contains an
edge incident to v.
i.) First suppose that a and b are incident to the same vertex u of X in G. Let c be
an edge of H incident to u. H has a 2-graded ear-decomposition (G′0 , G′1 , ..., G′l )
so that the starting ear contains c. Let G″i = Gi if 0 ≤ i ≤ j, let G″i be the graph
obtained from G′i−j−1 by replacing the vertex u by Gj if j + 1 ≤ i ≤ j + l + 1, let
G″i be the graph obtained from Gi−l−1 by replacing the vertex v by G″j+l+1 if
j + l + 2 ≤ i ≤ k + l + 1 and finally let G″i be the graph obtained from Gi−k−l−1
by replacing the vertex v by G″k+l+1 if k + l + 2 ≤ i ≤ k + l + p + 1. As above,
it is easy to see that (G″0 , G″1 , ..., G″k+l+p+1 ) is the desired generalized 2-graded
ear-decomposition.
ii.) Secondly, suppose that a and b are incident to different vertices u and w of
X in G. Let c and d be edges of H incident to u and w respectively. By Theorem
5.4.2 in [6] and Theorem 2, H has a 2-graded ear-decomposition (G′0 , G′1 , ..., G′l )
so that the starting ear P1 contains c and d. u and w divide P1 (which is an
even ear) into two paths D1 and D2 . By Proposition 2, D1 and D2 are of even
length. Let G″i = Gi if 0 ≤ i ≤ j − 1, let G″j be the graph obtained from Gj by
replacing the vertex v by D1 , let G″j+1 be the graph obtained from Gj by replacing
the vertex v by P1 , let G″i be the graph obtained from G′i−j−1 by replacing P1
by G″j+1 if j + 2 ≤ i ≤ j + l + 1, let G″i be the graph obtained from Gi−l−1 by
replacing the vertex v by G″j+l+1 if j + l + 2 ≤ i ≤ k + l + 1 and finally, as above,
let G″i be the graph obtained from Gi−k−l−1 by replacing the vertex v by G″k+l+1
if k + l + 2 ≤ i ≤ k + l + p + 1. It is easy to see that (G″0 , G″1 , ..., G″k+l+p+1 ) is the
desired generalized 2-graded ear-decomposition. In this case, however, in order to
see that each subgraph G″i (j + 2 ≤ i ≤ j + l + 1) is ϕ-covered, we have to use
that the subgraph of G″i corresponding to G′i−j−1 of H constructed so far is a
strong subgraph of G″i . ⊓⊔
The two ear theorem on matching-covered graphs follows easily from the
following: if the addition of some new edge set F to a matching-covered graph
G results in a matching-covered graph then we can choose at most two edges
of F so that by adding them to G the graph obtained is matching-covered. The
next theorem is the natural generalization of this. However, we cannot prove
Theorem 10 using this result.
Theorem 11. Let F := {e1 , . . . , ek } be a set of edges not in a ϕ-covered
graph G. Suppose that G + F is ϕ-covered and ϕ(G) = ϕ(G + F ). Then there
exist i ≤ j so that G + ei + ej is ϕ-covered.
Proof. We prove the theorem by induction on ϕ(G). If ϕ(G) = 1, then G
is matching-covered by Fact (×), so by the theorem of Lovász and Plummer
(Lemma 5.4.5 in [6]), we are done. In the following we suppose that ϕ(G) ≥ 2.
Let F ′ ⊆ F be a minimal set so that G′ := G + F ′ is ϕ-covered. We claim
Example. We give an example which shows that the above theorem is not true
if we delete the condition that ϕ(G) = ϕ(G + F ). Let G be the graph obtained
from a star on four vertices by duplicating each edge and let F be the three
edges in E(G). Then G and G + F are ϕ-covered but for every nonempty proper
subset F ′ of F, G + F ′ is not ϕ-covered. Note that ϕ(G) = 3 and ϕ(G + F ) = 1.
Acknowledgement. This research was done while the author visited the Research
Institute for Discrete Mathematics, Lennéstraße 2, 53113 Bonn, Germany,
supported by an Alexander von Humboldt fellowship. All the results presented in
this paper can be found in [10] or in [11].
References
1. J. Cheriyan, A. Sebő, Z. Szigeti, An improved approximation algorithm for the
minimum 2-edge-connected spanning subgraph problem, Proceedings of IPCO6,
eds.: R.E. Bixby, E.A. Boyd, R.Z. Rios-Mercado, (1998) 126-136.
2. J. Edmonds, L. Lovász, W. R. Pulleyblank, Brick decompositions and the matching
rank of graphs, Combinatorica 2 (1982) 247-274.
3. A. Frank, Conservative weightings and ear-decompositions of graphs, Combinator-
ica 13 (1) (1993) 65-81.
4. C.H.C. Little, A theorem on connected graphs in which every edge belongs to a
1-factor, J. Austral. Math. Soc. 18 (1974) 450-452.
5. L. Lovász, A note on factor-critical graphs, Studia Sci. Math. Hungar. 7 (1972)
279-280.
6. L. Lovász, M. D. Plummer, “Matching Theory,” North Holland, Amsterdam,
(1986).
7. Z. Szigeti, On Lovász’s Cathedral Theorem, Proceedings of IPCO3, eds.: G. Ri-
naldi, L.A. Wolsey, (1993) 413-423.
8. Z. Szigeti, On a matroid defined by ear-decompositions of graphs, Combinatorica
16 (2) (1996) 233-241.
9. Z. Szigeti, On the two ear theorem of matching-covered graphs, Journal of Com-
binatorial Theory, Series B 74 (1998) 104-109.
10. Z. Szigeti, Perfect matchings versus odd cuts, Technical Report No. 98865 of Re-
search Institute for Discrete Mathematics, Bonn, 1998.
11. Z. Szigeti, On generalizations of matching-covered graphs, Technical Report of
Research Institute for Discrete Mathematics, Bonn, 1998.
Gale-Shapley Stable Marriage Problem
Revisited: Strategic Issues and Applications
(Extended Abstract)
1 Introduction
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 429–438, 1999.
© Springer-Verlag Berlin Heidelberg 1999
430 Chung-Piaw Teo, Jay Sethuraman, and Wee-Peng Tan
One of the main difficulties with using the men-optimal stable matching
mechanism is that it is manipulable by the women: some women can intentionally
misrepresent their preferences to obtain a better stable partner. Such (strategic)
questions have been studied for the stable marriage problem by mathematical
economists and game theorists; essentially, this approach seeks to understand
and quantify the potential gains of a deceitful player. Roth [7] proved that when
the men-optimal stable-matching mechanism is used, a man will never be better
off by falsifying his preferences, if all the other people in the problem reveal their
true preferences. Falsifying preferences will at best result in the (original) match
that he obtains when he reveals his true preferences. Gale and Sotomayor [4]
showed that when the man-propose algorithm is used, a woman w can still force
the algorithm to match her to her women-optimal partner, denoted by µ(w), by
falsifying her preference list. The optimal cheating strategy for woman w is to
declare men who rank below µ(w) as unacceptable. Indeed, the cheating strat-
egy proposed by Gale and Sotomayor is optimal in the sense that it enables the
women to obtain their women-optimal stable partner even when the man-propose
mechanism is being adopted. Prior to this study, we know of no analogous results
when the women are required to submit complete preference lists. This question
is especially relevant to the Singapore school-admissions problem: All assign-
ments are done via the centralized posting exercise, and no student is allowed
to approach the schools privately for admission purposes. In fact, in the current
system, students not assigned to a school on their list are assigned to schools
according to some pre-determined criterion set by the Ministry. Thus effectively
this is a matching system where the students are not allowed to remain “single”
at the end of the posting exercise. To understand whether the stable matching
mechanism is a viable alternative, we first need to know whether there is any
incentive for the students to falsify their preferences, so that they can increase
their chances of being assigned to “better” schools. The one-to-one version of
this problem is exactly the question studied in this paper: in the stable marriage
model with complete preferences, with the men-optimal matching mechanism,
is there an incentive for the women to cheat? If so, what is the optimal cheating
strategy for a woman? To our knowledge, the only result known about this prob-
lem is an example due to Josh Benaloh (cf. Gusfield and Irving [5]), in which
the women lie by permuting their preference lists, and still manage to force the
men-optimal matching mechanism to return the women-optimal solution.
1. Run the man-propose algorithm, and reject all proposals made to w. At the
end, w and a man, say m, will remain single.
2. Among all the men who proposed to w in Step 1, let the best man (according
to w) be m1 .
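The two-step procedure above can be sketched in code. The following is an illustrative Python sketch, not the authors' implementation; the toy instance, the encoding of preference lists, and the names `gale_shapley` and `op_best_proposer` are our assumptions.

```python
def gale_shapley(men_pref, women_pref, reject_all_for=None):
    """Man-propose Gale-Shapley. Preferences map each person to a list of
    partners, most preferred first. If reject_all_for names a woman, she
    rejects every proposal (Step 1 of OP); the set of men who proposed to
    her is returned alongside the matching (woman -> man)."""
    rank = {w: {m: r for r, m in enumerate(p)} for w, p in women_pref.items()}
    nxt = {m: 0 for m in men_pref}      # index of the next woman m proposes to
    match, proposers = {}, set()
    free = list(men_pref)
    while free:
        m = free.pop()
        if nxt[m] == len(men_pref[m]):  # rejected by all women: m stays single
            continue
        w = men_pref[m][nxt[m]]
        nxt[m] += 1
        if w == reject_all_for:
            proposers.add(m)
            free.append(m)              # w rejects m outright
        elif w not in match:
            match[w] = m
        elif rank[w][m] < rank[w][match[w]]:
            free.append(match[w])       # w trades up; her old partner is freed
            match[w] = m
        else:
            free.append(m)
    return match, proposers

def op_best_proposer(men_pref, women_pref, w):
    """Step 2 of OP: the best man, by w's true list, among those who
    proposed to w in Step 1; this is m1."""
    _, proposers = gale_shapley(men_pref, women_pref, reject_all_for=w)
    return min(proposers, key=women_pref[w].index)

# A toy 3x3 instance (hypothetical data).
men = {'m1': ['w1', 'w2', 'w3'], 'm2': ['w1', 'w2', 'w3'], 'm3': ['w1', 'w2', 'w3']}
women = {'w1': ['m3', 'm2', 'm1'], 'w2': ['m1', 'm2', 'm3'], 'w3': ['m1', 'm2', 'm3']}
```

On this instance, `gale_shapley(men, women)[0]` yields the men-optimal matching {w1: m3, w2: m1, w3: m2}, and `op_best_proposer(men, women, 'w2')` returns m1, who is indeed w2's women-optimal partner here.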
Proof: Let µ(w) denote the women-optimal partner for w. We modify w’s pref-
erence list by inserting the option to remain single in the list, immediately after
µ(w). (We declare all men that are inferior to µ(w) as unacceptable to w.) Con-
sequently, in the man-propose algorithm, all proposals inferior to µ(w) will be
rejected. Nevertheless, since there exists a stable matching with w matched to
µ(w), our modification does not destroy this solution. It is also well known that
the set of people who are single is the same for all stable matchings (cf. Roth
and Sotomayor [8], pg. 42). Thus, w must be matched in all stable matchings
with the modified preference list. The men-optimal matching for this modified
preference list must match w to µ(w). In particular, µ(w) must have proposed to
w during the execution of the man-propose algorithm. Note that until µ(w) pro-
poses to w, the man-propose algorithm for the modified list runs exactly in the
same way as in Step 1 of OP . The difference is that Step 1 of OP will reject the
proposal from µ(w), while the man-propose algorithm for the modified list will
accept the proposal from µ(w), as w prefers µ(w) to being single. Hence, clearly
µ(w) is among those who proposed to w in Step 1 of OP , and so m1 ≥w µ(w).
Suppose m1 >w µ(w). Consider the modified list in which we place the option
of remaining single immediately after m1 . We run the man-propose algorithm
with this modified list. Again, until m1 proposes to w, the algorithm runs ex-
actly the same as in Step 1 of OP , after which the algorithm returns a stable
partner for w who is at least as good as m1 . This gives rise to a contradiction
as we assumed µ(w) to be the best stable partner for w.
Observe that under this approach, the true preference list of w is only used
to compare the men who have proposed to w. We do not need to know her exact
preference list; we only need to know which man is the best among a given set of
men, according to w. Hence the information set needed here to find the women-
optimal partner of w is much less than that needed when the woman-propose
algorithm is used. This is useful for the construction of the cheating strategy as
the information on the “optimal” preference list is not given a-priori and is to
be determined.
1. Run the man-propose algorithm with the true preference list P (w) for woman
w. Keep track of all men who propose to w. Let the men-optimal partner for
w be m, and let P (m, w) be the true preference list P (w).
2. Suppose mj proposed to w in the Gale-Shapley algorithm. By moving mj to
the front of the list P (m, w), we obtain a preference list for w such that the
men-optimal partner will be mj . Let P (mj , w) be this altered list. We say
that mj is a potential partner for w.
3. Repeat step 2 for every man who proposed to woman w in the algorithm;
after this, we say that we have exhausted man m, the men-optimal partner
obtained with the preference list P (m, w).
4. If a potential partner for w, say man u, has not been exhausted, run the
Gale-Shapley algorithm with P (u, w) as the preference list of w. Identify
new possible partners and their associated preference lists, and classify man
u as exhausted.
5. Repeat Step 4 until all possible partners of w are exhausted. Let N denote
the set of all possible (and hence exhausted) partners for w.
6. Among the men in N let ma be the man woman w prefers most. Then
P (ma , w) is an optimal cheating strategy for w.
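Steps 1–6 above can be sketched as follows. This is an illustrative Python transcription of the heuristic (the data layout and function names are our assumptions), exercised on the preference lists as read from the table in Example 1 below.

```python
def gale_shapley(men_pref, women_pref):
    """Man-propose Gale-Shapley with complete lists; returns the matching
    (woman -> man) and, for each woman, the set of men who proposed to her."""
    rank = {w: {m: r for r, m in enumerate(p)} for w, p in women_pref.items()}
    nxt = {m: 0 for m in men_pref}
    match = {}
    proposals = {w: set() for w in women_pref}
    free = list(men_pref)
    while free:
        m = free.pop()
        w = men_pref[m][nxt[m]]
        nxt[m] += 1
        proposals[w].add(m)
        if w not in match:
            match[w] = m
        elif rank[w][m] < rank[w][match[w]]:
            free.append(match[w])
            match[w] = m
        else:
            free.append(m)
    return match, proposals

def best_cheating_strategy(men_pref, women_pref, w):
    """Steps 1-6: grow the set N of potential partners of w by repeatedly
    moving a new proposer to the front of the list under which he proposed;
    return (best partner by w's true list, the list P(ma, w) achieving him)."""
    true_list = women_pref[w]
    plist = {}                                   # potential partner -> P(., w)
    match, _ = gale_shapley(men_pref, women_pref)
    plist[match[w]] = true_list                  # P(m, w) is the true list
    frontier = [match[w]]                        # men not yet exhausted
    while frontier:
        u = frontier.pop()
        _, proposals = gale_shapley(men_pref, {**women_pref, w: plist[u]})
        for m in proposals[w]:
            if m not in plist:                   # new potential partner
                plist[m] = [m] + [x for x in plist[u] if x != m]
                frontier.append(m)
    best = min(plist, key=true_list.index)
    return best, plist[best]

# Preference lists of Example 1 (most preferred first).
men = {1: [2, 3, 4, 5, 1], 2: [3, 4, 5, 1, 2], 3: [5, 1, 4, 2, 3],
       4: [3, 1, 2, 4, 5], 5: [1, 5, 2, 3, 4]}
women = {1: [1, 2, 3, 5, 4], 2: [2, 1, 4, 5, 3], 3: [3, 2, 5, 1, 4],
         4: [4, 5, 1, 2, 3], 5: [5, 1, 2, 3, 4]}
```

On Example 1, woman 1's men-optimal partner is man 5, while `best_cheating_strategy(men, women, 1)` finds man 3 via the list (3, 4, 1, 2, 5), matching the run of the algorithm described below.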
The men in the set N at the end of the algorithm have the following crucial
properties:
– For each man m in N , there is an associated preference list for w such that
the Gale-Shapley algorithm returns m as the men-optimal partner for w with
this list.
– All other proposals in the course of the Gale-Shapley algorithm come from
other men in N . (Otherwise, there will be some possible partners who are
not exhausted.)
Proof: (by contradiction) We use the convention that r(m) = k if man m is the
k th man on woman w’s list. Let π ∗ = {mp1 , mp2 , . . . , mpn } be the preference list
that gives rise to the best stable partner for w under the man-propose algorithm.
Let this man be denoted by mpb , and let woman w strictly prefer mpb to ma
(under her true preference list). Recall that we use r(m) to represent the rank
of m under the true preferences of w; by our assumption, r(mpb ) < r(ma ),
i.e., mpb is ranked higher than ma . Observe that the order of the men who
do not propose to woman w is irrelevant and does not affect the outcome of
the Gale-Shapley’s algorithm. Furthermore, men of rank higher than r(mpb ) do
not get to propose to w, otherwise we can cheat further and improve on the
best partner for w, contradicting the optimality of π ∗ . Thus we can arbitrarily
alter the order of these men, without affecting the outcome. Without loss of
generality, we may assume that 1 = r(mp1 ) < 2 = r(mp2 ) < . . . < q = r(mpb ).
Since r(mpb ) < r(ma ), ma must appear after mpb in π ∗ : thus, ma can
appear in any position from mpb+1 to mpn .
Now, we modify π ∗ such that all men who (numerically) rank lower than ma
but higher than mpb (under true preferences) are put in order according to their
ranks. This is accomplished by moving all these men before ma in π ∗ . With that
alteration, we obtain a new list π̃ = {mq1 , mq2 , . . . , mqn } such that:
Note that the men-optimal partner of w under π̃ cannot come from the set
{mqs+1 , mqs+2 , . . . mqn }. Otherwise, since the set of men who proposed in the
course of the algorithm must come from {mqs+1 , mqs+2 , . . . mqn }, and since the
preference list π ∗ retains the relative order of the men in this set, the same
partner would be obtained under π ∗ . This leads to a contradiction as π ∗ is
supposed to return a better partner for w. Hence, we can see that under π̃, we
already get a better partner than under π.
Now, since the preference list π returns ma with r(ma ) = s + 1, we may
conclude that the set N (obtained from the final stage of the algorithm) does not
contain any man of rank smaller than s + 1. Thus N ⊆ {mqs+1 , mqs+2 , . . . mqn }.
Suppose mqs+1 , mqs+2 , . . . , mqw do not belong to the set N , and mqw+1 is the
first man after mqs who belongs to the set N . By construction of N , there
exists a permutation π̂ with mqw+1 as the stable partner for w under the men-
optimal matching mechanism. Furthermore, all of those who propose to w in
the course of the algorithm are in N , and hence they are no better than ma to
w. Furthermore, all proposals come from men in {mqw+1 , mqw+2 , . . . mqn }, since
N ⊆ {mqs+1 , mqs+2 , . . . mqn }.
By altering the order of those who did not propose to w, we may assume that
π̂ is of the form {mq1 , mq2 , . . . , mqs−1 , mqs , . . . , mqw , mqw+1 , . . .}, where the first
w + 1 men in the list are identical to those in π̃. But the men-optimal stable so-
lution obtained using π̂ must also be stable under π̃, since w is matched to mqw+1 ,
and the set of men she strictly prefers to mqw+1 is identical in both π̂ and π̃.
This is a contradiction, as π̃ is supposed to return a men-optimal solution better
than ma . Thus π ∗ does not exist, and so π is optimal and ma is the best stable
partner w can get by permuting her preference list.
True Preferences of the Men      True Preferences of the Women
Man 1: 2 3 4 5 1                 Woman 1: 1 2 3 5 4
Man 2: 3 4 5 1 2                 Woman 2: 2 1 4 5 3
Man 3: 5 1 4 2 3                 Woman 3: 3 2 5 1 4
Man 4: 3 1 2 4 5                 Woman 4: 4 5 1 2 3
Man 5: 1 5 2 3 4                 Woman 5: 5 1 2 3 4
– Step 1: Run Gale-Shapley with the true preference list for woman 1; her men-
optimal partner is man 5. Man 4 is the only other man who proposes to her
during the Gale-Shapley algorithm. So P (man5, woman1) = (1, 2, 3, 5, 4).
– Step 2-3: Man 4 is moved to the head of woman 1’s preference list; i.e.,
P (man4, woman1) = (4, 1, 2, 3, 5). Man 5 is exhausted, and man 4 is a
potential partner.
– Step 4: As man 4 is not yet exhausted, we run the Gale-Shapley algorithm
with P (man4, woman1) as the preference list for woman 1. Man 4 will be
exhausted after this, and man 3 is identified as a new possible partner, with
P (man3, woman1) = (3, 4, 1, 2, 5).
– Repeat Step 4: As man 3 is not yet exhausted, we run Gale-Shapley with
P (man3, woman1) as the preference list for woman 1. Man 3 will be ex-
hausted after this. No new possible partner is found, and so the algorithm
terminates.
Example 1 shows that woman 1 could cheat and get a partner better than the
men-optimal solution. However, her women-optimal partner in this case turns
out to be man 1. Hence Example 1 also shows that woman 1 cannot always assure
herself of getting the women-optimal partner through cheating, in contrast to
the case when rejection is allowed in the cheating strategy.
3.1 The Best Possible Partners (Obtained from Cheating) May Not
Be Women-Optimal
In the two-sided matching model with rejection, it is not difficult to see that the
women can always force the man-propose algorithm to return the women-optimal
solution (e.g. each woman rejects all those who are inferior to her women-optimal
partner). In our model, where rejection is forbidden, the influence of the women
is far less, even if they collude. A simple example is when each woman is ranked
first by exactly one man. In this case, there is no conflict among the men, and
in the men-optimal solution, each man is matched to the woman he ranks first.
(This situation arises whenever each man ranks his men-optimal partner as his
first choice.) In this case, the algorithm will terminate with the men-optimal
solution, regardless of how the women rank the men in their lists. So ruling out
the strategic option of remaining single for the women significantly affects their
ability to change the outcome of the game by cheating.
By repeating the above analysis for all the other women in Example 1, we
conclude that the best possible partners for women 1, 2, 3, 4, and 5 are, re-
spectively, men 3, 1, 2, 4, and 3. An interesting observation is that woman 5
cannot benefit by cheating alone (she can only get her men-optimal partner no
matter how she cheats). However, if woman 1 cheats using the preference list
(3, 4, 1, 2, 5), woman 5 will also benefit by being matched to man 5, who is first
in her list.
Suppose each woman w announces a preference list P (w). The set of strategies
{π(1), π(2), . . . , π(n)} is said to be in strategic equilibrium if none of the women
has an incentive to deviate unilaterally from this announced strategy. It is easy
to see that if a woman benefits from announcing a different permutation list
(instead of her true preference list), then every other woman would also benefit,
i.e. every other woman will get a partner who is at least as good as her men-
optimal partner (cf. Roth and Sotomayor [8] ).
Theorem 3. If a single woman can benefit by cheating, then the game has mul-
tiple strategic equilibria.
Roth [7] shows that under the man-propose mechanism, the men have no incen-
tives to alter their true preference lists. In the rejection model, however, Gale
and Sotomayor [3] show that a woman has an incentive to cheat as long as she
has at least two distinct stable partners. Pittel [6] shows that the average
number of stable solutions is asymptotic to n log n/e, and that, with high probability,
the ranks of the women-optimal and men-optimal partners of a woman are re-
spectively log n and n/ log n. Thus in typical instances of the stable marriage
game under the rejection model, most of the women will not reveal their true
preference lists.
Many researchers have argued that the troubling implications of these
studies are not relevant in practical stable marriage games, as the model assumes
that the women have full knowledge of each individual’s preference list and the
set of all the players in the game. For the model we consider, it is natural to
ask whether it pays (as in the rejection model) for a typical woman to solicit
information about the preferences of all other participants in the game. We run
the cheating algorithm on 1000 instances, generated uniformly at random, for
n = 8; the number of women who benefit from cheating is tabulated in
Table 1.
The number of women who can gain from cheating is surprisingly
low. In fact, in 74% of the instances, the men-optimal solution is the women's only
option, no matter how they cheat. The average percentage of women who benefit
from cheating is merely 5.06%.
To look at the typical strategic behaviour on larger instances of the stable
marriage problem, we run the heuristic on 1000 random instances for n = 100.
The cumulative plot is shown in Figure 1. In particular, in more than 60% of
the instances at most 10 women (out of 100) benefited from cheating, and in
more than 96% of the instances at most 20 women benefited from cheating.
The average percentage of women who benefited from cheating is 9.52%. Thus, the
chances that a typical woman can benefit from acquiring complete information
(i.e., knowing the preferences of the other players) are pretty slim in our model.
We have repeated the above experiment for large instances of the Gale-
Shapley model. Due to computational requirements, we can only run the ex-
periment on 100 random instances of the problem with 500 men and women.
Again the insights obtained from the 100 by 100 cases carry over: the number
of women who benefited from cheating is again not more than 10% of the to-
tal number of the women involved. In fact, the average was close to 6% of the
women population in the problem. This suggests that the number of women who
can benefit from cheating in the Gale-Shapley model with n women grows at a
rate which is slower than a linear function of n. However, detailed probabilistic
Fig. 1. Cumulative plot over 1000 random instances with n = 100: number of
instances (out of 1000) against the number of people who benefit from cheating.
References
1. Dubins, L.E., D.A. Freedman (1981). Machiavelli and the Gale-Shapley Algorithm.
American Mathematical Monthly 88, 485-494.
2. Gale, D., L. S. Shapley (1962). College Admissions and the Stability of Marriage.
American Mathematical Monthly 69 9-15.
3. Gale, D., M. Sotomayor (1985a). Some Remarks on the Stable Matching Problem.
Discrete Applied Mathematics 11 223-232.
4. Gale, D., M. Sotomayor (1985b). Ms Machiavelli and the Stable Matching Problem.
American Mathematical Monthly 92 261-268.
5. Gusfield, D., R. W. Irving (1989). The Stable Marriage Problem: Structure and
Algorithms, MIT Press, Massachusetts.
6. Pittel, B. (1989). The Average Number of Stable Matchings. SIAM Journal on
Discrete Mathematics, 530-549.
7. Roth, A.E. (1982). The Economics of Matching: Stability and Incentives. Mathe-
matics of Operations Research 7 617-628.
8. Roth, A.E., M. Sotomayor (1991). Two-sided Matching: A Study in Game-theoretic
Modelling and Analysis, Cambridge University Press, Cambridge.
Vertex-Disjoint Packing of Two Steiner Trees:
Polyhedra and Branch-and-Cut
1 Introduction
The Vertex-disjoint Packing of Steiner Trees problem (VPST) is the following:
given a connected graph G = (V, E) and K disjoint vertex-sets N1 , . . . , NK ,
called terminal sets, find disjoint vertex-sets T1 , . . . , TK such that, for each k ∈
{1, . . . , K}, Nk ⊆ Tk and G[Tk ] (the subgraph induced by Tk ) is connected.
Deciding whether a VPST instance is feasible is NP-complete, and often a
very challenging task in practice. VPST has wide application in the layout of printed
circuit boards and chips, as it can accurately model the routing of electrical
connections among the circuit components (see Korte et al. [3] or Lengauer [4]).
This work studies the case where K = 2, the Vertex-Disjoint Packing of
Two Steiner Trees (2VPST). This restricted problem is still NP-complete [3].
Although most VPST applications involve more than two terminal sets, many
problems in circuit layout have only two large terminal sets, for instance the
ground and the Vcc terminals. Finding a solution for these two large sets may
represent a major step in solving the whole routing problem. In fact, it is usual in
chip layout to route the ground and Vcc connections over special layers, separated
from the remaining connections. This particular routing problem leads to the
2VPST.
We address an optimization version of the 2VPST: find two disjoint vertex-
sets T1 and T2 , such that G[T1 ] and G[T2 ] are connected, minimizing |N1 \ T1 | +
|N2 \ T2 |. In other words, we look for a solution leaving the minimum possible
G. Cornuéjols, R.E. Burkard, G.J. Woeginger (Eds.): IPCO’99, LNCS 1610, pp. 439–452, 1999.
© Springer-Verlag Berlin Heidelberg 1999
440 Eduardo Uchoa and Marcus Poggi de Aragão
2 Formulation
For each vertex i ∈ V define a binary variable xi , where xi = 1 if i belongs to
T1 and xi = 0 if i belongs to T2 . This choice of variables obliges every vertex to
belong either to T1 or T2 . Of course, terminals of N1 in T2 and terminals of N2
in T1 are considered unconnected. A formulation for 2VPST follows.
Min z = − Σi∈N1 wi + Σi∈V wi xi                                        (1)

(F)  s.t.  Σi∈S xi − Σi∈R xi ≤ 1         ∀S ⊆ V, |S| = 2; ∀R ∈ ∆(S)    (2)

           −Σi∈S xi + Σi∈R xi ≤ |R| − 1  ∀S ⊆ V, |S| = 2; ∀R ∈ ∆(S)    (3)

           xi ∈ {0, 1}                   ∀i ∈ V                        (4)
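On very small instances, the optimization version can be checked by brute force. The sketch below is ours, not the paper's branch-and-cut code; the instance, names, and adjacency-list encoding are assumptions, and it sidesteps the exponential constraint families (2)-(3) by testing connectivity of the induced subgraphs directly.

```python
from itertools import combinations

def connected(adj, verts):
    """True if the subgraph induced by the vertex set verts is connected
    (the empty set counts as connected)."""
    if not verts:
        return True
    seen, stack = set(), [next(iter(verts))]
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(u for u in adj[v] if u in verts)
    return seen == verts

def opt_2vpst(adj, n1, n2):
    """Brute force over all bipartitions (T1, T2) of V with G[T1] and
    G[T2] connected; returns min |N1 \\ T1| + |N2 \\ T2| (exponential)."""
    verts = list(adj)
    best = None
    for r in range(len(verts) + 1):
        for sub in combinations(verts, r):
            t1 = set(sub)
            t2 = set(verts) - t1
            if connected(adj, t1) and connected(adj, t2):
                cost = len(n1 - t1) + len(n2 - t2)
                best = cost if best is None else min(best, cost)
    return best

# Path 0-1-2-3-4 with N1 = {0, 4} and N2 = {2}: terminal 2 separates the
# two N1 terminals, so some terminal is always left unconnected.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
```

Here `opt_2vpst(path, {0, 4}, {2})` returns 1, e.g. via T1 = {0, 1}, T2 = {2, 3, 4}, which leaves only terminal 4 of N1 out.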
3 Polyhedral Investigations
Many combinatorial optimization problems can be naturally decomposed into
simpler subproblems. The 2VPST decomposes into two Steiner-like subproblems.
There are cases where a good polyhedral description of the subproblems yields
a good description of the main polyhedron. On the other hand, sometimes even
Fig. 1. The graphs G1 (on vertices a, b, . . . , l) and G2 (on vertices a, b, . . . , f).
Constraints F.2 are 2-separators. In graph G1, xa + xf − xb − xe − xj ≤ 1 is an
example of such an inequality. In the same graph, xa + xf + xg − xb − 2xe − 2xk ≤ 1
is a 3-separator valid for P1 (G1). Sometimes taking αi = |SiR | − 1 for all i ∈ R
does not yield a valid inequality for P1 . For instance, in graph G2 (Fig. 1), taking
S = {a, b, c, d} and R = {e, f }, the 4-separator Σi∈S xi − xe − xf ≤ 1 is not valid
for P1 (G2). But the 4-separators Σi∈S xi − 2xe − xf ≤ 1 and Σi∈S xi − xe − 2xf ≤ 1
are valid and facet-defining.
Theorem 3. For each S and each R ∈ ∆(S), there is at least one p-separator
inequality which is valid and facet-defining for P1 (G).
Corollary 1. If the p-separator Σi∈S xi − Σi∈R (|SiR | − 1)xi ≤ 1 is valid for
P1 (G), then it defines a facet of P1 (G).
The above theorem does not tell how to calculate the α coefficients of a p-
separator facet. But given sets S and R ∈ ∆(S), the coefficients of a p-separator
valid for P1 can be computed by the following procedure:
Procedure Compute α
M ←− ∅;
For each i ∈ R {
If (exists j ∈ M such that SiR ∩ SjR = ∅)
Then { αi ←− |SiR |; }
Else { αi ←− |SiR | − 1; M ←− M ∪ {i}; }
}
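A direct Python transcription of the procedure (a sketch; representing R as an ordered list and the sets SiR as an explicit input dict is our assumption, and the example sets below are hypothetical):

```python
def compute_alpha(R, S):
    """Transcription of procedure 'Compute alpha'. R is the list of
    vertices of R in scanning order; S[i] is the set S_i^R. Returns the
    coefficient alpha_i for each i in R."""
    alpha, M = {}, []
    for i in R:
        # exists j in M with S_i^R disjoint from S_j^R?
        if any(not (S[i] & S[j]) for j in M):
            alpha[i] = len(S[i])
        else:
            alpha[i] = len(S[i]) - 1
            M.append(i)
    return alpha

# Hypothetical data in the spirit of the G2 example: R = {e, f} with
# disjoint sets S_e^R = {a, b} and S_f^R = {c, d}.
sets = {'e': {'a', 'b'}, 'f': {'c', 'd'}}
```

With these made-up sets the procedure yields αe = 1 and αf = 2, i.e. an inequality of the shape Σi∈S xi − xe − 2xf ≤ 1.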
Theorem 4. Suppose G is biconnected. Let Σi∈S xi − Σi∈R (|SiR | − 1)xi ≤ 1 be
a valid p-separator, where R induces a structure without B-components and
every A-component is a double v-tree. If for every i ∈ R, there is a double v-tree
Ui containing {i} ∪ SiR such that Ui ∩ R = {i}, this inequality defines a facet of
P (G).
Proof. We show that for every double v-tree T containing i, γT = |S ∩ T | −
Σi∈R∩T αi ≤ 0. If T contains SiR , then |S ∩ T | = |SiR |. As T must contain another
For example, in graph G1, the 2-separator xa + xc − xc − xf − xe − xh − xl ≤ 1
does not define a facet of P (G1); by Corollary 2, αe or αh can be lifted to 0. The
Fig. 2. The graphs G3 and G4.
4 Separation Procedures
5 Computational Results
The branch-and-cut for 2VPST was implemented in C++ using the ABACUS
framework [8] and CPLEX 3.0 LP solver, running on a Sun ULTRA workstation.
The code used 2-separators, 3-separators (heuristically separated) and p-crosses.
In all experiments described in Table 1, G is a square grid graph (grid graphs
are typical in circuit layout applications). Two kinds of instances were generated:
In Table 1, the LBr column presents the lower bound obtained in the root
node, z∗ is the value of the optimal integer solution, Nd is the number of nodes
explored in the branch-and-bound tree, LPs is the total number of solved LPs
and T is the total CPU time in seconds spent by the algorithm. Note that the
LBr bounds include p-crosses and 3-separators. The lower bounds given by the
linear relaxation of formulation F alone are rather weak; for example, in instance
u1600 1 it is 8.7. Figure 3 shows an optimal solution of instance u1600 1: the
black boxes represent the terminals in N1 , the empty boxes are the terminals in
N2 , the full lines are edges of a spanning tree for G[T1 ] and the dotted lines of a
spanning tree for G[T2 ].
The code turned out to be robust under changes in terminal placement, ter-
minal density and weight distribution. In fact, we used the same set of parameters
for solving all instances in Table 1, weighted and unweighted. An interesting ex-
periment is to generate a random terminal placement avoiding the most obvious
causes of infeasibility: terminals on the outer face, two pairs of opposite termi-
nals crossed over the same grid square and terminals blocked by their neighbors.
An optimal solution of such an instance (called t1600 1) with 800 terminals and
unitary weights is shown in Fig. 4. The unconnected terminals are in coordinates
(4,33), (8,21), (33,36) and (34,15). This solution was obtained in 21 seconds.
Fig. 3. An optimal solution of instance u1600 1 on the 40 × 40 grid.
Appendix
Fig. 4. An optimal solution of instance t1600 1 on the 40 × 40 grid.
The following lemma and definition are used in the proof of Theorem 3.
Proof (Theorem 3). Consider the structure induced by R (see Definition 7).
Inequality Σi∈S xi ≤ 1 is valid for P1 (G)Z0 , where Z0 = R ∪ (∪qj=1 Bj ). Define
F = {x ∈ P1 (G)Z0 | Σi∈S xi = 1}. For each j ∈ {1, . . . , p}, Aj is a v-tree.
Construct a sequence of nested v-trees from sj to Aj . For every T in these
sequences, χT ∈ F . So there are Σpj=1 |Aj | affinely independent vectors in F and
Σi∈S xi ≤ 1 defines a facet of P1 (G)Z0 .
References
1. M. Grötschel, A. Martin and R. Weismantel, Packing Steiner Trees: polyhedral in-
vestigations, Mathematical Programming, Vol. 72, 101-123, 1996.
2. M. Grötschel, A. Martin and R. Weismantel, Packing Steiner Trees: a cutting plane
algorithm and computational results, Mathematical Programming, Vol. 72, 125-145,
1996.
3. B. Korte, H. Prömel and A. Steger, Steiner Trees in VLSI-Layout, in B. Korte, L.
Lovász, H. Prömel, A. Schrijver (Eds.) “Paths, Flows and VLSI-Layout”, Springer-
Verlag, Berlin, 1989.
4. T. Lengauer, Combinatorial Algorithms for Integrated Circuit Layout, Wiley, Chich-
ester, 1990.
5. G. Nemhauser and L. Wolsey, Integer and Combinatorial Optimization, Wiley, New
York, 1988.
6. M. Poggi de Aragão and E. Uchoa, The γ-Connected Assignment Problem, To
appear in European Journal of Operational Research, 1998.
7. Y. Shiloach, A Polynomial Solution to the Undirected Two Paths Problem, Journal
of the ACM, Vol. 27, No.3, 445-456, 1980.
8. S. Thienel, ABACUS - A Branch-And-Cut System, PhD thesis, Universität zu Köln,
1995.
Author Index