Editorial Board
David Hutchison
Lancaster University, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Alfred Kobsa
University of California, Irvine, CA, USA
Friedemann Mattern
ETH Zurich, Switzerland
John C. Mitchell
Stanford University, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz
University of Bern, Switzerland
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
TU Dortmund University, Germany
Madhu Sudan
Microsoft Research, Cambridge, MA, USA
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max-Planck Institute of Computer Science, Saarbruecken, Germany
Friedrich Eisenbrand F. Bruce Shepherd (Eds.)
Integer Programming
and Combinatorial
Optimization
Volume Editors
Friedrich Eisenbrand
École Polytechnique Fédérale de Lausanne
Institute of Mathematics
1015 Lausanne, Switzerland
E-mail: [email protected]
F. Bruce Shepherd
McGill University
Department of Mathematics and Statistics
805 Sherbrooke West, Montreal, Quebec, H3A 2K6, Canada
E-mail: [email protected]
ISSN 0302-9743
ISBN-10 3-642-13035-6 Springer Berlin Heidelberg New York
ISBN-13 978-3-642-13035-9 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer. Violations are liable
to prosecution under the German Copyright Law.
springer.com
© Springer-Verlag Berlin Heidelberg 2010
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper 06/3180
Preface
Programme Committee
Alper Atamtürk UC Berkeley
David Avis McGill
Friedrich Eisenbrand EPFL
Marcos Goycoolea Adolfo Ibáñez
Oktay Günlük IBM
Satoru Iwata Kyoto
Tamás Király Eötvös Budapest
François Margot CMU
Bruce Shepherd (Chair) McGill
Levent Tunçel Waterloo
Santosh Vempala Georgia Tech
Peter Winkler Dartmouth
Neal E. Young UC Riverside
Local Organization
Michel Bierlaire
Jocelyne Blanc
Friedrich Eisenbrand (Chair)
Thomas Liebling
Martin Niemeier
Thomas Rothvoß
Laura Sanità
External Reviewers
Tobias Achterberg John Birge
Ernst Althaus Jaroslaw Byrka
Reid Andersen Alberto Caprara
Matthew Andrews Deeparnab Chakrabarty
Elliot Anshelevich Chandra Chekuri
Gary Au Kevin Cheung
Mourad Baiou Marek Chrobak
Nina Balcan Jose Coelho de Pina
Nikhil Bansal Michelangelo Conforti
Andre Berger Miguel Constantino
Attila Bernáth Jose Correa
Dan Bienstock Sanjeeb Dash
Zero-Coefficient Cuts (Kent Andersen and Robert Weismantel) . . . 57
Solving LP Relaxations of Large-Scale Precedence Constrained Problems

Daniel Bienstock and Mark Zuckerberg

1 Introduction
We consider problems involving the scheduling of jobs over several periods sub-
ject to precedence constraints among the jobs as well as side-constraints. We
must choose the subset of jobs to be performed, and, for each of these jobs,
how to perform it, choosing from among a given set of options (representing
facilities or modes of operation). Finally, there are side-constraints to be satis-
fied, including period-wise, per-facility processing capacity constraints, among
others. There are standard representations of these problems as (mixed) integer
programs.
Our data sets originate in the mining industry, where problems typically have a small number of side constraints (often well under one hundred) but may contain millions of jobs and tens of millions of precedences, and may span multiple planning periods. Appropriate formulations often achieve small integrality gaps in practice; unfortunately, the linear programming relaxations are far beyond the practical reach of commercial software.
We present a new iterative algorithm for solving the LP relaxation of this prob-
lem. The algorithm incorporates, at a low level, ideas from Lagrangian relaxation
and column generation, but is however based on fundamental observations on
the underlying combinatorial structure of precedence constrained, capacitated
optimization problems. Rather than updating dual information, the algorithm
uses primal structure gleaned from the solution of subproblems in order to ac-
celerate convergence. The general version of our ideas should be applicable to
a wide class of problems. The algorithm can be proved to converge to optimality; in practice we have found that even for problems with millions of variables it terminates within a small number of iterations (see Section 5).
¹ The first author was partially funded by a gift from BHP Billiton Ltd., and ONR Award N000140910327.
Subject to:

$$\sum_{\tau=1}^{t} y_{i,\tau} \;\le\; \sum_{\tau=1}^{t} y_{j,\tau}, \qquad \forall (i,j) \in A,\; 1 \le t \le T \tag{2}$$

$$Dx \le d \tag{3}$$

$$y_{j,t} \;=\; \sum_{f=1}^{F} x_{j,t,f}, \qquad \forall j \in N,\; 1 \le t \le T \tag{4}$$

$$\sum_{t=1}^{T} y_{j,t} \;\le\; 1, \qquad \forall j \in N \tag{5}$$

$$x \ge 0. \tag{6}$$
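To make the indexing in (2)-(6) concrete, the following is a minimal feasibility checker for a candidate (y, x) pair. It is an illustration only; the function name `check_pcpsp` and the dictionary-based encoding of y and x are our own assumptions, not part of the paper.

```python
# Hypothetical feasibility checker for constraints (2)-(6).
# y[j, t]: job j is started in period t; x[j, t, f]: processing of j at
# facility f in period t; D_rows: side constraints as (coeffs, rhs) pairs.
def check_pcpsp(N, T, F, A, D_rows, y, x, tol=1e-9):
    for (i, j) in A:                         # (2): cumulative precedence
        for t in range(1, T + 1):
            lhs = sum(y[i, tau] for tau in range(1, t + 1))
            rhs = sum(y[j, tau] for tau in range(1, t + 1))
            if lhs > rhs + tol:
                return False
    for coeffs, d in D_rows:                 # (3): D x <= d
        if sum(c * x[key] for key, c in coeffs.items()) > d + tol:
            return False
    for j in N:
        for t in range(1, T + 1):            # (4): y couples to x
            if abs(y[j, t] - sum(x[j, t, f] for f in range(1, F + 1))) > tol:
                return False
        if sum(y[j, t] for t in range(1, T + 1)) > 1 + tol:
            return False                     # (5): each job done at most once
    return all(v >= -tol for v in x.values())   # (6): x >= 0
```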
2.2 Background
The Open Pit Mine Scheduling Problem. The practical motivating prob-
lem behind our study is the open pit mine scheduling problem. We are given a
three-dimensional region representing a mine to be exploited; this region is di-
vided into “blocks” (jobs, from a scheduling perspective) corresponding to units
of earth (“cubes”) that can be extracted in one step. In order for a block to be
extracted, the set of blocks located (broadly speaking) in a cone above it must
be extracted first. This gives rise to a set of precedences, i.e. to a directed graph
whose vertices are the blocks, and whose arcs represent the precedences. Finally,
the extraction of a block entails a certain (net) profit or cost.
The problem of selecting which blocks to extract so as to maximize profit can
be stated as follows:
$$\max\{\, c^T x \;:\; x_i \le x_j \;\; \forall (i,j) \in A,\;\; x_j \in \{0,1\} \;\forall j \,\},$$
where as before A indicates the set of precedences. This is the so-called maximum weight closure problem: in a directed graph, a closure is a set S of vertices such that there exist no arcs (i, j) with i ∈ S and j ∉ S. It can be solved as a minimum s-t cut problem in a related graph of roughly the same size. See [P76], and also
[J68], [Bal70] and [R70]. Further discussion can be found in [HC00], where the
authors note (at the end of Section 3.4) that it can be shown by reduction from
max clique that adding a single cardinality constraint to a max closure problem
is enough to make it NP-hard. For additional related material see [F06], [LG65],
[CH03], and references therein.
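The reduction of maximum weight closure to a minimum s-t cut mentioned above ([P76]) admits a compact sketch. The following illustration is ours and assumes the networkx package; the function name and data layout are assumptions as well.

```python
import networkx as nx

# Sketch of the classical closure-to-min-cut reduction (illustrative).
# Arcs (i, j) encode x_i <= x_j: taking i forces taking j, matching the
# closure definition above.
def max_weight_closure(nodes, arcs, weight):
    G = nx.DiGraph()
    G.add_node("s")
    G.add_node("t")
    for i, j in arcs:
        G.add_edge(i, j, capacity=float("inf"))   # closure arcs: uncuttable
    for v in nodes:
        if weight[v] > 0:
            G.add_edge("s", v, capacity=weight[v])
        elif weight[v] < 0:
            G.add_edge(v, "t", capacity=-weight[v])
    cut_value, (source_side, _) = nx.minimum_cut(G, "s", "t")
    closure = set(source_side) - {"s"}
    value = sum(w for w in weight.values() if w > 0) - cut_value
    return closure, value
```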
The problem we are concerned with here, by contrast, also incorporates pro-
duction scheduling. When a block is extracted it will be processed at one of
several facilities with different operating capabilities. The processing of a given
block i at a given facility f consumes a certain amount of processing capacity
vif and generates a certain net profit pif . This overall planning problem spans
several time periods; in each period we will have one or more knapsack (capac-
ity) constraints for each facility. We usually will also have additional, ad-hoc,
non-knapsack constraints. In this version the precedence constraints apply across
periods as per (2): if (i, j) ∈ A then j can only be extracted in the same or in a
later period than i.
Typically, we need to produce schedules spanning 10 to 20 periods. Addi-
tionally, we may have tens of thousands (or many more) blocks; this can easily
make for an optimization problem with millions of variables and tens of millions
of precedence constraints, but with (say) on the order of one hundred or fewer
processing capacity constraints (since the total number of processing facilities is
typically small).
Previous work. A great deal of research has been directed toward algorithms
for the maximum weight closure problems, starting with [LG65] and culminat-
ing in the very efficient method described in [H08] (also see [CH09]). A “nested
shells” heuristic for the capacitated, multiperiod problem, based on the work in
[LG65], is applicable to problems with a single capacity constraint, among other
3 Our Results
Empirically, it can be observed that formulation (1-6) frequently has small in-
tegrality gap. We present a new algorithm for solving the continuous relaxation
of this formulation and generalizations. Our algorithm is applicable to problems
with an arbitrary number of process options and arbitrary side constraints, and
it requires no aggregation. On very large, real-world instances our algorithm
proves very efficient.
² We interacted with [BDFG09] as part of an industrial partnership, but our work was performed independently.
$$z_{j,t,f} \;=\; \sum_{\tau=1}^{t-1} \sum_{f'=1}^{F} x_{j,\tau,f'} \;+\; \sum_{f'=1}^{f} x_{j,t,f'},$$

and conversely. Thus, for an appropriate system D̄z ≤ d̄ (with the same number of rows as Dx ≤ d) and objective c̄ᵀz, PCPSP is equivalent to the linear program:

$$\min\{\, \bar{c}^T z \;:\; \bar{D}z \le \bar{d}, \text{ and constraints (11)-(15)} \,\}.$$
Let Ā be the submatrix of A containing the binding constraints at x̂. Then the vectors θ^r are linearly independent and belong to the null space of Ā. As a consequence, k ≤ q.

Proof: First we prove that Āθ^r = 0. Given a precedence constraint x_i − x_j ≤ 0, if the constraint is binding then x̂_i = x̂_j. Thus if x̂_i = α_r, so that θ^r_i = 1, then x̂_j = α_r also, and so θ^r_j = 1 as well, and so θ^r_i − θ^r_j = 0. By the same token, if x̂_i ≠ α_r then x̂_j ≠ α_r and again θ^r_i − θ^r_j = 0. If a constraint x_i ≥ 0 or x_i ≤ 1 is binding at x̂ then naturally θ^r_i = 0 for all r, as x̂_i is not fractional. The supports of the θ^r vectors are disjoint, yielding linear independence. Finally, k ≤ q follows from Lemma 2.
where μ ≥ 0 and, for each q, v^q ∈ {0,1}ⁿ is the incidence vector of a closure S^q ⊂ N. (In fact, the S^q can be assumed to be nested.) So for any i, j ∈ N, x*_j = x*_i if i and j belong to precisely the same family of sets S^q. Also, Lemma 3
states that the number of distinct values that x∗j can take is small, if the number
of side constraints is small. Therefore it can be shown that when the number
of side constraints is small the number of closures (terms) in (19) must also be
small. In the full paper we show that a rich relationship exists between the max closures produced by Lagrangian problems and the optimal dual and primal solutions to GPCP. Next, we will develop an algorithm that solves GPCP by attempting to "guess" the correct representation (19).
First, we present a result that partially generalizes Lemma 3.
Comment: Note that rank(Ā) can be high and thus condition (d) is not quite
as strong as Lemma 3; nevertheless q is small in any case and so we obtain
a decomposition of x̂ into “few” terms when the number of side-constraints is
“small”. Theorem 2 can be strengthened for specific families of totally unimodu-
lar matrices. For example, when A is the node-arc incidence matrix of a digraph,
the θ vectors are incidence vectors of cycles, which yields the following corollary.
Corollary 1. Let P be the feasible set for a minimum cost network flow problem
with integer data and side constraints. Let x̂ be an extreme point of P , and let
q be the number of linearly independent side constraints that are binding at x̂.
Let ζ = {j : x̂j integral}. Then x̂ can be decomposed into the sum of an integer
vector v satisfying all network flow (but not necessarily side) constraints, and
with vj = x̂j ∀j ∈ ζ, and a sum of no more than q fractional cycle flows, over a
set of cycles disjoint from ζ.
$$(P_1): \quad \max\; c^T x \quad \text{s.t.}\;\; Ax \le b,\;\; Dx \le d. \tag{21}$$

Denote by L(P₁, μ) the Lagrangian relaxation in which constraints (21) are dualized with penalties μ, i.e. the problem max{cᵀx + μᵀ(d − Dx) : Ax ≤ b}.
One can approach problem P1 by means of Lagrangian relaxation, i.e. an algo-
rithm that iterates by solving multiple problems L(P1 , μ) for different choices of
μ; the multipliers μ are updated according to some procedure. A starting point
for our work concerns the fact that traditional Lagrangian relaxation schemes
(such as subgradient optimization) can prove frustratingly slow to achieve con-
vergence, often requiring seemingly instance-dependent choices of algorithmic
parameters. They also do not typically yield optimal feasible primal solutions; in fact they frequently fail to deliver sufficiently accurate solutions (primal or dual). However, as observed in [B02] (see also [BA00]), Lagrangian relaxation schemes can discover useful "structure."
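As a point of reference, a bare-bones version of such a traditional subgradient scheme for L(P₁, μ) might look as follows; `solve_inner` stands for an assumed oracle for the combinatorial inner problem (e.g., a max-closure solver), and the step-size rule is one common choice among many.

```python
import numpy as np

# Bare-bones subgradient scheme for min over mu >= 0 of
# L(P1, mu) = max{ c'x + mu'(d - Dx) : Ax <= b }; solve_inner(obj) is an
# assumed oracle returning an optimal x of the inner combinatorial problem.
def lagrangian_bound(c, D, d, solve_inner, iters=100, step0=1.0):
    mu = np.zeros(D.shape[0])                 # multipliers for Dx <= d
    best = np.inf                             # best upper bound found so far
    for k in range(1, iters + 1):
        x = solve_inner(c - D.T @ mu)         # maximize (c - D'mu)'x over Ax <= b
        best = min(best, (c - D.T @ mu) @ x + mu @ d)
        g = D @ x - d                         # subgradient direction
        mu = np.maximum(0.0, mu + (step0 / k) * g)   # projected step
    return best, mu
```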
The restricted linear program includes all constraints, and thus could (poten-
tially) still be very hard – the idea is that the structure we have imposed renders
the LP much easier. Further, the LP includes all constraints, and thus the solu-
tion we obtain is fully feasible for P1 , thus proving a lower bound. Moreover, if
our guess as to “structure” is correct, we also obtain a high-quality dual feasible
vector, and our use of this vector so as to restart the Lagrangian scheme should
result in accelerated convergence (as well as proving an upper bound on P1 ). In
[B02] these observations were experimentally verified in the context of several
problem classes.
$$(P_2^k): \quad \max\; c^T x \quad \text{s.t.}\;\; Ax \le b,\; Dx \le d,\; H^k x = h^k.$$
where the first equality follows by duality and the second by definition of w^k in Step 2 since H^{k−1} w^k = h^{k−1}. Also, clearly z^{k−1} ≤ z*, and so in summary
note later that this partition never needs more than a small number of elements
for the algorithm to converge.
At iteration k, we denote by C^k = {C^k_1, ..., C^k_{r_k}} the constructed partition of N. Our basic application of the template is as follows:
GPCP Algorithm
1. Set μ⁰ = 0. Set r₀ = 1, C⁰₁ = N, C⁰ = {C⁰₁}, z⁰ = −∞, and k = 1.
2. Let y^k be an optimal solution to L(P₁, μ^{k−1}), and define
   I^k = {j ∈ N : y^k_j = 1}   (23)
   and define
   O^k = {j ∈ N : y^k_j = 0}.   (24)
   If k > 1 and, for 1 ≤ h ≤ r_{k−1}, either C^{k−1}_h ∩ I^k = ∅ or C^{k−1}_h ∩ O^k = ∅, then STOP.
3. Let C^k = {C^k_1, ..., C^k_{r_k}} consist of all nonempty sets in the collection
   { I^k ∩ C^{k−1}_h : 1 ≤ h ≤ r_{k−1} } ∪ { O^k ∩ C^{k−1}_h : 1 ≤ h ≤ r_{k−1} }.
   Let H^k x = h^k consist of the constraints
   x_i = x_j, for 1 ≤ h ≤ r_k and each pair i, j ∈ C^k_h.
4. Let P₂^k consist of P₁, plus the additional constraints H^k x = h^k.
5. Solve P₂^k, with optimal solution x^k, and let μ^k denote the optimal duals corresponding to the side-constraints Dx ≤ d. If μ^k = μ^{k−1}, STOP.
6. Set k = k + 1 and go to Step 2.
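A schematic rendering of Steps 1-6 may help fix ideas. In the sketch below, `solve_lagrangian` and `solve_restricted_lp` are assumed oracles (a max-closure solve for L(P₁, μ), and the restricted LP P₂^k with its side-constraint duals); everything else follows the steps above.

```python
# Schematic sketch of the GPCP Algorithm (Steps 1-6); illustrative only.
# solve_lagrangian(mu) -> 0/1 dict y over jobs (mu=None means mu = 0),
# solve_restricted_lp(partition) -> (x, mu): restricted LP with
# x_i = x_j inside every class, and its side-constraint duals mu.
def gpcp(jobs, solve_lagrangian, solve_restricted_lp, max_iters=50):
    partition = [set(jobs)]                       # Step 1: C^0 = {N}
    x, mu = None, None                            # mu^0 = 0
    for k in range(1, max_iters + 1):
        y = solve_lagrangian(mu)                  # Step 2
        I = {j for j in jobs if y[j] == 1}        # (23)
        O = {j for j in jobs if y[j] == 0}        # (24)
        refined = [part for C in partition
                   for part in (C & I, C & O) if part]   # Step 3
        if k > 1 and len(refined) == len(partition):
            break                                 # no class was split: STOP
        partition = refined
        x, new_mu = solve_restricted_lp(partition)    # Steps 4-5
        if new_mu == mu:
            break                                 # duals repeated: STOP
        mu = new_mu
    return x, mu, partition
```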
We have:
Lemma 4. (a) For each k, problem P2k is an instance of GPCP with rk variables
and the same number of side-constraints as in Dx ≤ d. (b) If P21 is feasible, the
above algorithm terminates finitely with an optimal solution.
Proof: full paper.
Comments: Since each problem P2k is a GPCP, its extreme point solution xk
never attains more than m + 2 distinct values (where m is the number of linearly
independent rows in D), and thus the partition C k can be coarsened while main-
taining the feasibility of xk by merging the sets Cjk with common xk values. Note
also that in choosing C k+1 to be a refinement of C k the LP solution xk remains
available to the problem P2k+1 . The above algorithm is a basic application of
the template. Finer partitions than {I k , Ok } may also be used. The feasibility
assumption in (b) of Lemma 4 can be bypassed. Details will be provided in the
full paper.
In the full paper an analysis is presented that explains why the structure
exposed by the Lagrangian solutions can be expected to point the algorithm in
the right direction. In particular, the solution to the Lagrangian obtained by
using optimal duals for the side constraints can be shown to exhibit significant
structure.
Algorithm Performance
Iterations to 10⁻⁵ optimality       8     8     9    13    30
Time to 10⁻⁵ optimality (sec)      10    60   344     1  1076
Iterations to comb. optimality     11    12    16    15    39
Time to comb. optimality (sec)     15    96   649     1  1583
5 Computational Experiments
In this section we present results from some of our experiments. A more complete
set of results will be presented in the full paper. All these tests were conducted
using a single core of a dual quad-core 3.2 GHz Xeon machine with 64 GB of
memory. The LP solver we used was Cplex, version 12 and the min cut solver
we used was our implementation of Hochbaum’s pseudoflow algorithm ([H08]).
The tests reported on in Tables 1 and 2 are based on three real-world examples provided by BHP Billiton³, to which we refer as 'Mine1', 'Mine2' and
’Mine3’ and a synthetic but realistic model called ’Marvin’ which is included
with Gemcom’s Whittle [W] mine planning software. ’Mine1B’ is a modifica-
tion of Mine1 with a denser precedence graph. Mine3 comes in two versions to
which we refer as ’big’ and ’small’. Using Mine1, we also obtained smaller and
larger problems by modifying the data in a number of realistic ways. Some of
the row entries in these tables are self-explanatory; the others have the following
meaning:
³ Data was masked.
– Problem arcs. The number of arcs in the graph that the algorithm creates to
represent the scheduling problem (i.e., the size of the min cut problems we solve).
– Iterations, time to 10−5 optimality. The number of iterations (resp.,
the CPU time) taken by the algorithm until it obtained a solution it could
certify as having ≤ 10−5 relative optimality error.
– Iterations, time to combinatorial optimality. The number of iterations
(resp., the CPU time) taken by the algorithm to obtain a solution it could cer-
tify as optimal as per the stopping criteria in Steps 2 or 5. Notice that this
implies that the solution is optimal as per the numerical tolerances of Cplex.
Finally, an entry of "—" indicates that Cplex was unable to terminate after 100000 seconds of CPU time. More detailed analyses will appear in the full paper.
Algorithm Performance
Iterations to 10⁻⁵ optimality       6     6     8     7    10
Time to 10⁻⁵ optimality (sec)       0     1     7    45  2875
Iterations to comb. optimality      7     7    11     9    20
Time to comb. optimality (sec)      0     2    10    61  6633
References
[Bal70] Balinski, M.L.: On a selection problem. Management Science 17, 230–231 (1970)
[BA00] Barahona, F., Anbil, R.: The Volume Algorithm: producing primal solu-
tions with a subgradient method. Math. Programming 87, 385–399 (2000)
Computing Minimum Multiway Cuts in Hypergraphs

Takuro Fukunaga

1 Introduction
k ≥ 5. They also showed that, for the hypergraph 4-cut problem, their algorithm
achieves approximation factor 4/3.
The rest of this paper is organized as follows. Section 2 introduces basic facts and notation. Section 3 explains the outline of our result and presents our algorithm. Section 4 shows that a maximum hypertree packing contains a hypertree sharing at most a constant number of hyperedges with a minimum k-cut. Section 5 discusses a property of a set of hypertrees constructed greedily. Section 6 concludes this paper and mentions future work.
2 Preliminaries
The first step of our result is to prove the following theorem originally proven for
graphs by Thorup [18]. A recursively maximum hypertree packing is a maximum
hypertree packing that satisfies some condition, which will be defined formally
in Section 4.
components of V, |δ_{E'}(V)| ≥ Σ_{U∈V} Σ_{e'∈δ_{E'}(U)} (1/|e'|). Since each e ∈ E has |e| copies in E', Σ_{e'∈δ_{E'}(U)} (1/|e'|) = Σ_{e∈δ_E(U)} 1 = |δ_E(U)|. Moreover, |δ_E(U)| ≥ 1 because H is connected. Thus |δ_{E'}(V)| ≥ Σ_{U∈V} 1 = |V|, which implies that H' is partition-connected. Hence by Theorem 1, H' contains a hypertree.
Let us discuss the running time of this algorithm. For each hypertree, there are O(n^h) ways to choose h hyperedges. By Corollary 1, shrinking n − 1 − h hyperedges in a hypertree results in a hypergraph with at most h + 1 vertices. Hence Step 3-2 can be done in O(k^{h+1}) time. This means that Step 3 of the algorithm runs in O(k^{h+1} n^h) time per hypertree in T*.
To bound the running time of all the steps, we must consider how to compute
a recursively maximum hypertree packing and how large it is. A recursively maximum hypertree packing can be computed in polynomial time. However, we know of no algorithm that computes small recursively maximum hypertree packings.
know no algorithm to compute small recursively maximum hypertree packings.
Hence this paper follows the approach taken by Thorup [18] for the graph k-cut
problem. We show that a set of hypertrees constructed as below approximates a
recursively maximum hypertree packing well. It enables us to avoid computing
a recursively maximum hypertree packing.
4 Proof of Theorem 2
Let V be an arbitrary partition of V, and (T, α) be an arbitrary hypertree packing of H. Since every hypertree T ∈ T satisfies |T| = |V| − 1 and has at most |U| − 1 hyperedges contained in U for each U ∈ V, we have

$$|\delta_T(\mathcal{V})| \;=\; |T| - \sum_{U \in \mathcal{V}} |T[U]| \;\ge\; |V| - 1 - \sum_{U \in \mathcal{V}} (|U| - 1) \;=\; |\mathcal{V}| - 1. \tag{1}$$

Moreover,

$$c(e) \;\ge\; \alpha(\mathcal{T}_e) \quad \text{for each } e \in \delta(\mathcal{V}) \tag{2}$$

by the definition of hypertree packings. Thus it follows that

$$\frac{c(\delta(\mathcal{V}))}{|\mathcal{V}| - 1} \;\ge\; \frac{\sum_{e \in \delta(\mathcal{V})} \alpha(\mathcal{T}_e)}{|\mathcal{V}| - 1} \;=\; \frac{\sum_{T \in \mathcal{T}} \alpha(T)\, |\delta_T(\mathcal{V})|}{|\mathcal{V}| - 1} \;\ge\; \alpha(\mathcal{T}). \tag{3}$$
From now on, we let (T*, α*) stand for a recursively maximum hypertree packing. For U ∈ V*, let T' = {T[U] | T ∈ T*} and α'(T[U]) = α*(T)·λ_{H[U]}/α*(T*), where λ_{H[U]} denotes the partition-connectivity of H[U]. The definition of (T*, α*) implies that (T', α') is a recursively maximum hypertree packing of H[U] for any U ∈ V*.
From T ∗ and given k, define Vk as the k-partition of V constructed by the
following algorithm.
Algorithm 4: Computing V_k
Input: A connected hypergraph H = (V, E) with capacity c : E → Q₊, and an integer k ≥ 2.
Output: A k-partition of V.
Step 1: Define V_k := {V}.
Step 2: Let U ∈ V_k be a set attaining min{λ_{H[U]} | U ∈ V_k, |U| ≥ 2}. Compute a weakest partition U = {U₁, U₂, ..., U_{|U|}} of H[U], where we assume that Σ_{T∈T*} α*(T)|δ(U_i) ∩ T[U]| ≤ Σ_{T∈T*} α*(T)|δ(U_j) ∩ T[U]| for 1 ≤ i < j ≤ |U|.
Step 3: If |V_k| − 1 + |U| < k, then V_k := (V_k \ {U}) ∪ U and return to Step 2.
Step 4: If |V_k| − 1 + |U| ≥ k, then V_k := (V_k \ {U}) ∪ {U₁, U₂, ..., U_{k−|V_k|}, U \ ∪_{i=1}^{k−|V_k|} U_i}, and output V_k.
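A direct transcription of Algorithm 4 is short. In the sketch below, `pc` and `weakest_partition` are assumed oracles for the partition-connectivity and the weakest partition of an induced sub-hypergraph, with parts pre-ordered as required in Step 2; the names are ours.

```python
# Direct transcription of Algorithm 4 (illustrative). pc(U) returns the
# partition-connectivity of H[U]; weakest_partition(U) returns a weakest
# partition of H[U] as a list of sets, ordered as required in Step 2.
# Assumes 2 <= k <= |V| so that a splittable class always exists.
def compute_Vk(V, k, pc, weakest_partition):
    Vk = [set(V)]                                          # Step 1
    while True:
        U = min((S for S in Vk if len(S) >= 2), key=pc)    # Step 2
        parts = weakest_partition(U)
        if len(Vk) - 1 + len(parts) < k:                   # Step 3
            Vk.remove(U)
            Vk.extend(parts)
        else:                                              # Step 4
            Vk.remove(U)
            m = k - (len(Vk) + 1)          # m = k - |Vk| (with U counted)
            remainder = set().union(*parts[m:])            # U minus U_1..U_m
            Vk.extend(parts[:m] + [remainder])
            return Vk
```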
Lemma 2. $\sum_{e \in \delta(\mathcal{V}_k)} \alpha^*(\mathcal{T}^*_e) \;<\; (\gamma k - 2)\, \alpha^*(\mathcal{T}^*).$
Proof. In this proof, V'_k stands for V_k immediately before executing Step 4 of Algorithm 4, and V_k stands for the partition output by Algorithm 4.

By the definition of recursively maximum hypertree packings, T[U] is a hypertree of H[U] for every pair of U ∈ V'_k and T ∈ T*. Thus

$$|\delta(\mathcal{V}'_k) \cap T| \;=\; |T| - \sum_{U \in \mathcal{V}'_k} |T[U]| \;=\; |V| - 1 - \sum_{U \in \mathcal{V}'_k} (|U| - 1) \;=\; |\mathcal{V}'_k| - 1$$

holds for each T ∈ T*, and hence Σ_{e∈δ(V'_k)} α*(T*_e) = Σ_{T∈T*} α*(T)|δ(V'_k) ∩ T| = (|V'_k| − 1)α*(T*) holds.

Let U be the element of V'_k and U = {U₁, U₂, ..., U_{|U|}} be the weakest partition of U computed in Step 2 immediately before executing Step 4. Note that |U| > k − |V'_k| holds by the condition of Step 4. For the same reason as above, |δ(U) ∩ T[U]| = |U| − 1 holds for each T ∈ T*. Hence Σ_{T∈T*} α*(T)|δ(U) ∩ T[U]| = (|U| − 1)α*(T*).

Let V_U = {U₁, U₂, ..., U_{k−|V'_k|}, U \ ∪_{j=1}^{k−|V'_k|} U_j}. Then

$$\sum_{T \in \mathcal{T}^*} \alpha^*(T)\, |\delta(\mathcal{V}_U) \cap T[U]| \;\le\; \sum_{j=1}^{k-|\mathcal{V}'_k|} \sum_{T \in \mathcal{T}^*} \alpha^*(T)\, |\delta(U_j) \cap T[U]|.$$

The elements in U are ordered so that they satisfy Σ_{T∈T*} α*(T)|δ(U_i) ∩ T[U]| ≤ Σ_{T∈T*} α*(T)|δ(U_j) ∩ T[U]| for 1 ≤ i < j ≤ |U|. Hence it holds that

$$\sum_{j=1}^{k-|\mathcal{V}'_k|} \sum_{T \in \mathcal{T}^*} \alpha^*(T)\, |\delta(U_j) \cap T[U]| \;\le\; \frac{k - |\mathcal{V}'_k|}{|\mathcal{U}|} \sum_{j=1}^{|\mathcal{U}|} \sum_{T \in \mathcal{T}^*} \alpha^*(T)\, |\delta(U_j) \cap T[U]|.$$
Lemma 3. For each e ∈ δ(V_k) and f ∈ E \ δ(V_k), α*(T*_e)/c(e) ≥ α*(T*_f)/c(f) holds.
Proof. Let U' ∈ V_k denote the set containing f (i.e., f ∈ E[U']). Let V'_k denote V_k immediately before e enters δ(V_k) in Algorithm 4. Assume that e is contained in U'' ∈ V'_k (i.e., e ∈ E[U'']). Moreover, let λ_{U'} (resp., λ_{U''}) denote the packing value of H[U'] (resp., H[U'']).

From (T*, α*), define T' = {T[U''] | T ∈ T*}. Moreover, define α'(T[U'']) = α*(T)·λ_{U''}/α*(T*) for T ∈ T*. By the definition of recursively maximum hypertree packings,
Proof. Let η = min_{e∈δ(V_k)} α*(T*_e)/c(e). By Lemma 3, each hyperedge e ∈ δ(V^opt) \ δ(V_k) satisfies α*(T*_e)/c(e) ≤ η. Hence it holds that

$$\sum_{e \in \delta(\mathcal{V}^{opt}) \setminus \delta(\mathcal{V}_k)} \alpha^*(\mathcal{T}^*_e) \;=\; \sum_{e \in \delta(\mathcal{V}^{opt}) \setminus \delta(\mathcal{V}_k)} c(e)\, \frac{\alpha^*(\mathcal{T}^*_e)}{c(e)} \;\le\; \eta\, c(\delta(\mathcal{V}^{opt}) \setminus \delta(\mathcal{V}_k)).$$

The definition of V^opt implies that c(δ(V^opt)) ≤ c(δ(V_k)), and hence c(δ(V^opt) \ δ(V_k)) ≤ c(δ(V_k) \ δ(V^opt)). Thus

$$\begin{aligned}
\sum_{T \in \mathcal{T}^*} \alpha^*(T)\, |\delta(\mathcal{V}^{opt}) \cap T| &= \sum_{e \in \delta(\mathcal{V}^{opt}) \cap \delta(\mathcal{V}_k)} \alpha^*(\mathcal{T}^*_e) + \sum_{e \in \delta(\mathcal{V}^{opt}) \setminus \delta(\mathcal{V}_k)} \alpha^*(\mathcal{T}^*_e) \\
&\le \sum_{e \in \delta(\mathcal{V}^{opt}) \cap \delta(\mathcal{V}_k)} \alpha^*(\mathcal{T}^*_e) + \eta\, c(\delta(\mathcal{V}^{opt}) \setminus \delta(\mathcal{V}_k)) \\
&\le \sum_{e \in \delta(\mathcal{V}^{opt}) \cap \delta(\mathcal{V}_k)} \alpha^*(\mathcal{T}^*_e) + \eta\, c(\delta(\mathcal{V}_k) \setminus \delta(\mathcal{V}^{opt})) \\
&\le \sum_{e \in \delta(\mathcal{V}_k)} \alpha^*(\mathcal{T}^*_e).
\end{aligned}$$
5 Proof of Theorem 5
In this section, we present a proof of Theorem 5. Although it is almost the same as the proof for γ = 2 presented by Thorup [18], we sketch it for self-containedness.
Throughout this section, we let H = (V_H, E_H) be a hypergraph such that each e ∈ E_H has at least |e| − 1 copies in E_H \ {e} of the same capacity. We denote |E_H| by γm, and the capacity of hyperedges in H by c_H in order to avoid confusion. Moreover, we assume that a recursively maximum hypertree packing (T*, α*) of H satisfies α*(T*_e) = α*(T*_{e'}) for e ∈ E_H and a copy e' ∈ E_H of e.

For a set T of hypertrees of H and e ∈ E_H, define u^T_H(e) = |T_e|/(c_H(e)|T|). For each e ∈ E_H, we also define u*_H(e) as α*(T*_e)/(c_H(e)α*(T*)) from a recursively maximum hypertree packing (T*, α*) of H. Since c_H(e) ≥ α*(T*_e) for all e ∈ E_H, 1/u*_H(e) is at least the packing value of H, i.e., 1/u*_H(e) ≥ α*(T*). Moreover, since c_H(e) = α*(T*_e) holds for some e ∈ E_H by the maximality of (T*, α*), min_{e∈E_H} 1/u*_H(e) = α*(T*) holds.
Recall that Algorithm 3 updates V* by partitioning non-singleton sets in V* repeatedly until no such sets exist. For e ∈ E_H, define U_e as the last set in V* such that e ∈ E_H[U_e] during the execution of the algorithm. Then max_{e'∈E_H[U_e]} u*_{H[U_e]}(e') = u*_{H[U_e]}(e). The definition of recursively maximum hypertree packings implies that u*_{H[U_e]}(e') = u*_H(e') for each e' ∈ E_H[U_e], because α*(T*_{e'})/α*(T*) = β(S_{e'})/β(S) holds with a maximum hypertree packing (S, β) of H[U_e]. Therefore, the partition-connectivity of H[U_e] is 1/u*_H(e).
Lemma 5. Let I be a subgraph of H and assume that each hyperedge e in I has capacity c_I(e) such that c_min ≤ c_I(e) ≤ c_H(e). Let C = Σ_{e∈E_I} c_I(e), and u_I = max_{e∈E_I} u*_I(e). Moreover, let ε be an arbitrary real such that 0 < ε < 1/2, and T^g be a set of hypertrees in H constructed by Algorithm 2 with t ≥ 3 ln(C/c_min)/(c_min u_I ε²). Then

$$u^{\mathcal{T}^g}_H(e) \;<\; (1 + \varepsilon)\, u_I \tag{4}$$

holds for each e ∈ E_I.

Proof. Scaling hyperedge capacities has no effect on the claim. Hence we assume without loss of generality that c_min = 1.

Let T denote the set of hypertrees kept by Algorithm 2 at some moment while it is running to compute T^g. The key is the following quantity:

$$\sum_{e \in E_I} c_I(e)\, \frac{(1 + \varepsilon)^{|\mathcal{T}_e|/c_H(e)}\, (1 + \varepsilon u_I)^{t - |\mathcal{T}|}}{(1 + \varepsilon)^{(1+\varepsilon)\, u_I t}}. \tag{5}$$
6 Concluding Remarks
References
1. Chekuri, C., Korula, N.: Personal Communication (2010)
2. Frank, A., Király, T., Kriesell, M.: On decomposing a hypergraph into k connected
sub-hypergraphs. Discrete Applied Mathematics 131(2), 373–383 (2003)
3. Garg, N., Vazirani, V.V., Yannakakis, M.: Multiway cuts in node weighted graphs.
Journal of Algorithms 50, 49–61 (2004)
4. Gasieniec, L., Jansson, J., Lingas, A., Östlin, A.: On the complexity of constructing
evolutionary trees. Journal of Combinatorial Optimization 3, 183–197 (1999)
5. Goldberg, A.V., Tarjan, R.E.: A new approach to the maximum flow problem.
Journal of the ACM 35, 921–940 (1988)
6. Goldschmidt, O., Hochbaum, D.: A polynomial algorithm for the k-cut problem
for fixed k. Mathematics of Operations Research 19, 24–37 (1994)
7. Kamidoi, Y., Yoshida, N., Nagamochi, H.: A deterministic algorithm for finding
all minimum k-way cuts. SIAM Journal on Computing 36, 1329–1341 (2006)
8. Karger, D.R., Stein, C.: A new approach to the minimum cut problem. Journal of
the ACM 43, 601–640 (1996)
9. Klimmek, R., Wagner, F.: A simple hypergraph min cut algorithm. Internal Report
B 96-02, Bericht FU Berlin Fachbereich Mathematik und Informatik (1995)
10. Lawler, E.L.: Cutsets and partitions of hypergraphs. Networks 3, 275–285 (1973)
11. Lorea, M.: Hypergraphes et matroides. Cahiers Centre Etudes Rech. Oper. 17,
289–291 (1975)
12. Lovász, L.: A generalization of König’s theorem. Acta. Math. Acad. Sci. Hungar. 21,
443–446 (1970)
13. Mak, W.-K., Wong, D.F.: A fast hypergraph min-cut algorithm for circuit parti-
tioning. Integ. VLSI J. 30, 1–11 (2000)
14. Nagamochi, H.: Algorithms for the minimum partitioning problems in graphs. IE-
ICE Transactions on Information and Systems J86-D-1, 53–68 (2003)
15. Okumoto, K., Fukunaga, T., Nagamochi, H.: Divide-and-conquer algorithms for
partitioning hypergraphs and submodular systems. In: Dong, Y., Du, D.-Z., Ibarra,
O. (eds.) ISAAC 2009. LNCS, vol. 5878, pp. 55–64. Springer, Heidelberg (2009)
16. Stoer, M., Wagner, F.: A simple min-cut algorithm. Journal of the ACM 44, 585–591 (1997)
17. Thorup, M.: Fully-dynamic min-cut. Combinatorica 27, 91–127 (2007)
18. Thorup, M.: Minimum k-way cuts via deterministic greedy tree packing. In: Pro-
ceedings of the 40th Annual ACM Symposium on Theory of Computing, pp. 159–
166 (2008)
19. Xiao, M.: Finding minimum 3-way cuts in hypergraphs. In: Agrawal, M., Du, D.-
Z., Duan, Z., Li, A. (eds.) TAMC 2008. LNCS, vol. 4978, pp. 270–281. Springer,
Heidelberg (2008)
20. Xiao, M.: An improved divide-and-conquer algorithm for finding all minimum k-
way cuts. In: Hong, S.-H., Nagamochi, H., Fukunaga, T. (eds.) ISAAC 2008. LNCS,
vol. 5369, pp. 208–219. Springer, Heidelberg (2008)
21. Zhao, L., Nagamochi, H., Ibaraki, T.: A unified framework for approximating mul-
tiway partition problems. In: Eades, P., Takaoka, T. (eds.) ISAAC 2001. LNCS,
vol. 2223, pp. 682–694. Springer, Heidelberg (2001)
Eigenvalue Techniques for Convex Objective,
Nonconvex Optimization Problems
Daniel Bienstock
1 Introduction
We consider problems with the general form
After obtaining the solution x∗ to the given relaxation for problem F , our meth-
ods will use techniques of convex analysis, of eigenvalue optimization, and com-
binatorial estimations, in order to quickly obtain a valid lower bound on F̄ which
is strictly larger than F (x∗ ). Our methods apply if F is not quadratic but there
is a convex quadratic G such that F (x) − F (x∗ ) ≥ G(x − x∗ ) for all feasible x.
We will describe an important class of problems where our method, applied
to a “cheap” but weak formulation, produces bounds comparable to or better
than those obtained by much more sophisticated formulations, and at a small
fraction of the computational cost.
Cardinality constrained optimization problems. Here, for some integer 0 < K ≤ n, K = { x ∈ Rⁿ : ‖x‖₀ ≤ K }, where the zero-norm ‖v‖₀ of a vector v denotes the number of nonzeros in v. This constraint arises in portfolio optimization (see e.g. [2]), but modern applications involving this constraint arise in statistics, machine learning [13], and, especially, in engineering and biology [19]. Problems related to compressive sensing have an explicit cardinality constraint (see www.dsp.ece.rice.edu/cs for material). Also see [7].
The simplest canonical example of problem F is as follows:
This problem is strongly NP-hard, and it does arise in practice, exactly as stated. In spite of its difficulty, this example already incorporates the fundamental difficulty alluded to above: clearly, conv{ x ∈ Rⁿ₊ : Σ_j x_j = 1, ‖x‖₀ ≤ K } = { x ∈ Rⁿ₊ : Σ_j x_j = 1 }.
1.1 Techniques
Our methods embody two primary techniques:
(a) The S-lemma (see [20], also [1], [5], [15]). Let f, g : Rⁿ → R be quadratic functions and suppose there exists x̄ ∈ Rⁿ such that g(x̄) > 0. Then f(x) ≥ 0 for every x with g(x) ≥ 0 if and only if there exists μ ≥ 0 such that f(x) − μg(x) ≥ 0 for all x ∈ Rⁿ.
Simple Template
S.1 Compute an optimal solution x∗ to the given relaxation to problem F .
S.2 Obtain the quantity D(x∗ ).
S.3 Apply the S-lemma as in (7), using F (x) for p(x), and (the exterior of)
the ball centered at x∗ with radius D(x∗ ) for q(x) − β.
[Figures 1 and 2 omitted: points x*, y, and x_F.]
In Section 1.4 we will address how to produce the quadratic q(x) and the value D²(y, q) when K is defined by a cardinality constraint.
Let c = ∇F (x∗ ) (other choices for c discussed in full paper). Note that for
any x ∈ P ∩ K, cT (x − x∗ ) ≥ 0. For α ≥ 0, let pα = x∗ + αc, and let H α be the
hyperplane through pα orthogonal to c. Finally, define
and let y α attain the minimum. Note: computing V (α) entails an application of
the S-lemma, “restricted” to H α . See Figure 3. Clearly, V (α) ≤ F̄ . Then
Thus, F(x*) ≤ inf_{α≥0} V(α) ≤ F̄; the first inequality being strict in the positive-definite case. [It can be shown that the "inf" is a "min".] Each value V(α) incorporates combinatorial information (through the quantity D²(p^α, q)) and thus the computation of min_{α≥0} V(α) cannot be obtained through direct convex optimization techniques. As a counterpoint to Theorem 1 one can prove (using the notation in eq. (8)):
Theorem 2. In (9), if C has one row and q(x) = Σ_j x_j², then V ≤ inf_{α≥0} V(α).
[Figure 3 omitted: the hyperplane H^α through p^α = x* + αc, orthogonal to c, and the point y^α attaining V(α).]
Updated Template
The idea here is that if (for all i) α^{(i+1)} − α^{(i)} is small then V(α^{(i)}) ≈ V(α^{(i+1)}). Thus the quantity output in (2) will closely approximate min_{α≥0} V(α).
and therefore

$$PMP w_i = PM w_i = \hat{\lambda}_i P w_i = \hat{\lambda}_i w_i,$$

as desired.
Altogether, Lemma 1 produces q − 1 eigenvalue/eigenvector pairs of PMP. The vector in (e.2) should not be explicitly computed; rather, the factorized form in (e.2) will suffice. The root of the equation in (e.1) can be quickly obtained using numerical methods (such as golden section search) since the expression in (e.1) is monotonically increasing in (α_i, α_{i+1}) (it may also be possible to adapt the basic trust-region algorithm [14], which addresses a similar but not identical problem).
Lemma 2. Let α be an eigenvalue of M , V α the set of columns of Q with
eigenvalue α, and A = A(α) denote the acute members of V α . If |A| > 0, then
we can construct |A| − 1 eigenvectors of P M P corresponding to eigenvalue α,
each of which is a linear combination of elements of A and is orthogonal to c.
Proof: Write m = |A|, and let H be the m × m Householder matrix [9] corresponding to d_A, i.e. H is a symmetric matrix with H² = I_m such that

$$H d_A = (\|d_A\|_2, 0, \ldots, 0)^T \in \mathbb{R}^m.$$

Let Q_A be the n × m submatrix of Q consisting of the columns corresponding to A, and define

$$W = Q_A H. \tag{12}$$

Then cᵀW = d_Aᵀ H = (‖d_A‖₂, 0, ..., 0). In other words, the columns of the submatrix Ŵ consisting of the last m − 1 columns of W are orthogonal to c. Denoting by Ĥ the submatrix of H consisting of the last m − 1 columns of H, we therefore have

$$\hat{W} = Q_A \hat{H}, \quad \text{and} \quad PMP\hat{W} = PQ\Lambda Q^T \hat{W} = PQ\Lambda Q^T Q_A \hat{H} = \alpha P Q_A \hat{H} = \alpha \hat{W}.$$

Finally, ŴᵀŴ = ĤᵀĤ = I_{m−1}, as desired.
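Numerically, the construction in the proof amounts to a single Householder reflection. The following small sketch (ours, using numpy; the function name is an assumption) returns the |A| − 1 vectors orthogonal to c.

```python
import numpy as np

# Illustrative numerical version of Lemma 2's construction: given the
# n x m matrix Q_A of acute eigenvectors for a common eigenvalue alpha,
# reflect d_A = Q_A^T c onto a multiple of e_1; the last m-1 columns of
# Q_A @ H are then eigenvectors of PMP orthogonal to c.
def acute_eigenvectors(Q_A, c):
    d = Q_A.T @ c                        # d_A: inner products with c
    e1 = np.zeros_like(d)
    e1[0] = np.linalg.norm(d)
    u = d - e1                           # Householder vector: H d = ||d|| e_1
    if u @ u < 1e-14:                    # d already aligned with e_1
        H = np.eye(len(d))
    else:
        H = np.eye(len(d)) - 2.0 * np.outer(u, u) / (u @ u)
    W = Q_A @ H                          # W = Q_A H as in (12)
    return W[:, 1:]                      # columns orthogonal to c
```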
Now suppose that α₁ < α₂ < ... < α_q denote the distinct acute eigenvalues of M (possibly q = 0). Let p denote the number of columns of Q which are perpendicular eigenvectors. Writing m_i = |A(α_i)| > 0 for 1 ≤ i ≤ q, we have that

$$n \;=\; \sum_{i=1}^{q} m_i + p.$$
Here we take up the problem of computing strong lower bounds on the Euclidean
distance from a point to the set P ∩ K. In this abstract we will focus on the
cardinality constrained problem, but results of a similar flavor hold for the case
of disjunctive sets.
Let a ∈ Rⁿ, b ∈ R, K < n be a positive integer, and ω ∈ Rⁿ. Consider the problem

$$D^2_{\min}(\omega, a) \;:=\; \min\Big\{\, \sum_{j=1}^{n} (x_j - \omega_j)^2 \;:\; a^T x = b \;\text{ and }\; \|x\|_0 \le K \,\Big\}. \tag{13}$$
Clearly, the sum of the smallest n − K values ω_j² constitutes a ("naive") lower bound for problem (13). But it is straightforward to show that an exact solution to (13) is obtained by choosing S ⊆ {1, ..., n} with |S| ≤ K, so as to minimize

$$\frac{\big(b - \sum_{j \in S} a_j \omega_j\big)^2}{\sum_{j \in S} a_j^2} \;+\; \sum_{j \notin S} \omega_j^2. \tag{14}$$
[We use the convention that 0/0 = 0.] Empirically, the naive bound mentioned
above is very weak since the first term in (14) is typically at least an order of
magnitude larger than the second; and it is the bound, rather than the set S
itself, that matters.
Suppose aj = 1 for all j. It can be shown, using (14), that the optimal set S
has the following structure: S = P ∪ N , where |P | + |N | ≤ K, and P consists of
the indices of the |P | smallest nonnegative ωj (resp., N consists of the indices
of the |N| smallest |ω_j| with ω_j < 0). The optimal S can be computed in O(K) time.
$$\sum_{j=1}^{n} (\hat{x}_j - \omega_j)^2 \;\le\; (1 + \varepsilon)\, D^2_{\min}(\omega, a),$$
Writing ω̄_j = a_j ω_j (for all j), this becomes

$$\min\Big\{\, \sum_{j=1}^{n} (x_j - \bar{\omega}_j)^2 \;:\; \sum_j x_j = b \;\text{ and }\; \|x\|_0 \le K \,\Big\},$$

which as noted above can be efficiently solved.
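For small instances, (14) can also be evaluated by brute force over candidate supports S, which is useful for sanity-checking the O(K) structured method above. The following sketch is illustrative only (its cost grows exponentially with K); the function name is ours.

```python
from itertools import combinations

# Brute-force evaluation of (13) via formula (14), for tiny n only
# (illustrative). Uses the 0/0 = 0 convention stated above.
def dmin_squared(omega, a, b, K):
    n = len(omega)
    total = sum(w * w for w in omega)
    best = float("inf")
    for size in range(K + 1):
        for S in combinations(range(n), size):
            num = (b - sum(a[j] * omega[j] for j in S)) ** 2
            den = sum(a[j] ** 2 for j in S)
            if num == 0:
                first = 0.0                 # 0/0 = 0 convention
            elif den == 0:
                continue                    # b unreachable on this support
            else:
                first = num / den
            second = total - sum(omega[j] ** 2 for j in S)
            best = min(best, first + second)
    return best
```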
This is a simple task, since in [0, λ̃1 ) the objective in (19) is concave in μ.
Remarks:
(1) Our updated template in Section 1.2 requires the solution of multiple prob-
lems of the form (19) but just one computation of Q̃ and Λ̃.
(2) Consider any integer 1 ≤ p < n − 1. When μ < λ̃₁, the expression maximized in (19) is lower bounded by

$$\mu\beta \;-\; \sum_{i=1}^{p} \frac{\tilde{v}_i^2}{\tilde{\lambda}_i - \mu} \;-\; \frac{\sum_{i=p+1}^{n-1} \tilde{v}_i^2}{\tilde{\lambda}_{p+1} - \mu}.$$

This, and related facts, yield an approximate version of our approach which only asks for the first p elements of the eigenspace of PMP (and M).
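Since the objective in (19) is concave in μ on [0, λ̃₁), a derivative-free method such as golden section search suffices. The following sketch is an illustration only; names and tolerances are our own assumptions.

```python
# Golden section search for the concave objective of (19) on [0, lam[0]):
# g(mu) = mu*beta - sum_i v[i]^2 / (lam[i] - mu). Illustrative sketch;
# lam holds the projected eigenvalues with lam[0] the smallest.
def maximize_secular(beta, v, lam, tol=1e-10):
    def g(mu):
        return mu * beta - sum(vi * vi / (li - mu) for vi, li in zip(v, lam))
    lo, hi = 0.0, lam[0] * (1.0 - 1e-9)     # stay strictly below lam[0]
    phi = (5 ** 0.5 - 1) / 2
    while hi - lo > tol:
        a = hi - phi * (hi - lo)
        b = lo + phi * (hi - lo)
        if g(a) < g(b):                     # concavity: maximizer in [a, hi]
            lo = a
        else:
            hi = b
    mu = 0.5 * (lo + hi)
    return mu, g(mu)
```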
Capturing the second eigenvalue. We see that Γ < λ̃1 β (and frequently this
bound is close). In experiments, the solution y ∗ to (16) often “cheats” in that y1∗
is close to zero. We can then improve on our procedure if the second projected
eigenvalue, λ̃2 , is significantly larger than λ̃1 . Assuming that is the case, pick a
value θ with y₁*²/β < θ < 1.
(a) If we assert that y₁² ≥ θβ then we may be able to strengthen the constraint in (15) to Σ_{i=1}^n δ_i(x_i − x̂_i)² ≥ γ, where γ = γ(θ) > β. See Lemma 3 below. So the assertion amounts to applying the S-lemma, but using γ in place of β.
(b) Otherwise, we have that Σ_{i=2}^{n−1} y_i² ≥ (1 − θ)β. In this case, instead of the right-hand side of (19), we will have

$$\max\Big\{\, \mu(1-\theta)\beta \;-\; \sum_{i=2}^{n-1} \frac{\tilde{v}_i^2}{\tilde{\lambda}_i - \mu} \;:\; 0 \le \mu \le \tilde{\lambda}_2 \,\Big\}. \tag{20}$$
The minimum of the quantities obtained in (a) and (b) yields a valid lower
bound on Γ ; we can evaluate several candidates for θ and choose the strongest
bound. When λ̃2 is significantly larger than λ̃1 we often obtain an improvement
over the basic approach as in Section 1.5.
Note: the approach in this section constitutes a form of branching and in our
testing has proved very useful when λ2 > λ1 . It is, intrinsically, a combinatorial
approach, and thus not easily reproducible using convexity arguments alone.
To complete this section, we point out that the construction of the quantities γ(θ) above is based on the following observation:
2 Computational Experiments
We consider problems min{ xᵀMx + vᵀx : Σ_j x_j = 1, x ≥ 0, ‖x‖₀ ≤ K }. The matrix M ⪰ 0 is given in its eigenvector/eigenvalue factorization QΛQᵀ. To stress-test our linear algebra routines, we construct Q as the product of random rotations: as the number of rotations increases, so does the number of nonzeroes in Q, and the overall "complexity" of M. We ran our procedure after computing the solution to the (diagonalized) "weak" formulation

$$\min\Big\{\, y^T \Lambda y + v^T x \;:\; Q^T x = y,\;\; \sum_j x_j = 1,\;\; x \ge 0 \,\Big\}.$$
We also ran the (again, diagonalized) perspective formulation [10], [12], a strong conic formulation (here, λ_min is the minimum λ_i):

$$\begin{aligned}
\min\;\; & \lambda_{\min} \sum_j w_j + \sum_j (\lambda_j - \lambda_{\min})\, y_j^2 \\
\text{s.t.}\;\; & Q^T x = y, \quad \sum_j x_j = 1, \\
& x_j^2 - w_j z_j \le 0, \quad 0 \le z_j \le 1 \;\;\forall j, \qquad (21) \\
& \sum_j z_j \le K, \quad x_j \le z_j \;\;\forall j, \quad x, w \in \mathbb{R}^n_+.
\end{aligned}$$
We used the Updated Template given above, with c = ∇F(x*) and with the α^{(i)} quantities set according to the following method: (a) J = 100, and (b) α^{(J)} = argmax{α ≥ 0 : H^α ∩ S_{n−1} ≠ ∅} (S_{n−1} is the unit simplex). The improvement technique involving the second eigenvalue was applied in all cases.
For the experiments in Tables 1 and 2, we used Cplex 12.1 on a single core
of a 2.66 GHz quad-core Xeon machine with 16 GB of RAM, which was never
exceeded. In the tests in Table 1, n = 2443 and the eigenvalues are from a
finance application. Q is the product of 5000 random rotations, resulting in
142712 nonzeros in Q.
Here, rQMIP refers to the weak formulation, PRSP to the perspective for-
mulation, and SLE to the approach in this paper. “LB” is the lower bound
produced by a given approach, and “sec” is the CPU time in seconds. The second
eigenvalue technique proved quite effective in all these tests.
In Table 2 we consider examples with n = 10000 and random Λ. In the table, Nonz indicates the number of nonzeroes in Q; as this number increases the quadratic becomes less diagonally dominant.
QPMIP
  mip emph 3    4   10000   41685   0.0314   0.241    (16.67 sec/node)
PRSP-MIP
  mip emph 2   16   14000   39550   0        0.8265   (90.4 sec/node)
  mip emph 3   16    7000   19817   0        0.8099   (45.30 sec/node)
LPRSP-MIP
References
1. Ben-Tal, A., Nemirovsky, A.: Lectures on Modern Convex Optimization: Analysis,
Algorithms, and Engineering Applications. MPS-SIAM Series on Optimization.
SIAM, Philadelphia (2001)
2. Bienstock, D.: Computational study of a family of mixed-integer quadratic pro-
gramming problems. Math. Programming 74, 121–140 (1996)
3. Bienstock, D., Zuckerberg, M.: Subset algebra lift algorithms for 0-1 integer pro-
gramming. SIAM J. Optimization 105, 9–27 (2006)
4. Bienstock, D., McClosky, B.: Tightening simple mixed-integer sets with guaranteed
bounds (submitted 2008)
5. Boyd, S., El Ghaoui, L., Feron, E., Balakrishnan, V.: Linear matrix inequalities in
system and control theory. SIAM, Philadelphia (1994)
6. Cook, W., Kannan, R., Schrijver, A.: Chvátal closures for mixed integer programs.
Math. Programming 47, 155–174 (1990)
7. De Farias, I., Johnson, E., Nemhauser, G.: A polyhedral study of the cardinality
constrained knapsack problem. Math. Programming 95, 71–90 (2003)
8. Golub, G.H.: Some modified matrix eigenvalue problems. SIAM Review 15, 318–
334 (1973)
9. Golub, G.H., van Loan, C.: Matrix Computations. Johns Hopkins University Press,
Baltimore (1996)
10. Frangioni, A., Gentile, C.: Perspective cuts for a class of convex 0-1 mixed integer
programs. Mathematical Programming 106, 225–236 (2006)
11. Frangioni, A., Gentile, C.: SDP Diagonalizations and Perspective Cuts for a Class
of Nonseparable MIQP. Oper. Research Letters 35, 181–185 (2007)
12. Günlük, O., Linderoth, J.: Perspective Relaxation of Mixed Integer Nonlinear Pro-
grams with Indicator Variables. In: Lodi, A., Panconesi, A., Rinaldi, G. (eds.)
IPCO 2008. LNCS, vol. 5035, pp. 1–16. Springer, Heidelberg (2008)
13. Moghaddam, B., Weiss, Y., Avidan, S.: Generalized spectral bounds for sparse
LDA. In: Proc. 23rd Int. Conf. on Machine Learning, pp. 641–648 (2006)
14. Moré, J.J., Sorensen, D.C.: Computing a trust region step. SIAM J. Sci. Stat.
Comput. 4, 553–572 (1983)
15. Pólik, I., Terlaky, T.: A survey of the S-lemma. SIAM Review 49, 371–418 (2007)
16. Rendl, F., Wolkowicz, H.: A semidefinite framework for trust region subproblems
with applications to large scale minimization. Math. Program 77, 273–299 (1997)
17. Stern, R.J., Wolkowicz, H.: Indefinite trust region subproblems and nonsymmetric
eigenvalue perturbations. SIAM J. Optim. 5, 286–313 (1995)
18. Sturm, J., Zhang, S.: On cones of nonnegative quadratic functions. Mathematics
of Operations Research 28, 246–267 (2003)
19. Miller, W., Wright, S., Zhang, Y., Schuster, S., Hayes, V.: Optimization methods for
selecting founder individuals for captive breeding or reintroduction of endangered
species (2009) (manuscript)
20. Yakubovich, V.A.: S-procedure in nonlinear control theory, vol. 1, pp. 62–77. Vest-
nik Leningrad University (1971)
21. Ye, Y., Zhang, S.: New results on quadratic minimization. SIAM J. Optim. 14,
245–267 (2003)
Restricted b-Matchings
in Degree-Bounded Graphs
Kristóf Bérczi and László A. Végh

1 Introduction
Let G = (V, E) be an undirected graph and let b : V → Z+ be an upper bound
on the nodes. An edge set F ⊆ E is called a b-matching if dF (v), the number of
edges in F incident to v, is at most b(v) for each node v. (This is often called
simple b-matching in the literature.) For some integer t ≥ 2, by a t-matching
we mean a b-matching with b(v) = t for every v ∈ V . Let K be a set consisting
of Kt,t ’s, complete bipartite subgraphs of G on two colour classes of size t, and
Kt+1 ’s, complete subgraphs of G on t + 1 nodes. The node-set and the edge-set
of a subgraph K ∈ K are denoted by VK and EK , respectively. By a K-free b-
matching we mean a b-matching not containing any member of K. In this paper,
we give a min-max formula on the size of K-free b-matchings and a polynomial
time algorithm for finding one with maximum size (that is, a K-free b-matching
F ⊆ E with maximum cardinality) under the assumptions that for any K ∈ K
and any node v of K,
VK spans no parallel edges (1)
b(v) = t (2)
dG (v) ≤ t + 1. (3)
Note that this is a generalization of the problem mentioned in the abstract. The
most important special case of K-free b-matching is to find a maximum C3 -free
Supported by the Hungarian National Foundation for Scientific Research (OTKA)
grant K60802.
the graph is subcubic was solved by the first author and Kobayashi [1]. In terms
of connectivity augmentation, the equivalent problem is augmenting an (n − 4)-
connected graph to (n − 3) connected. Our theorem is a generalization of this
result.
It is worth mentioning that the polynomial solvability of the above problems
seems to show a strong connection with jump systems. In [18], Szabó proved that
for a list K of forbidden Kt,t and Kt+1 subgraphs the degree sequences of K-free t-
matchings form a jump system in any graph. Concerning bipartite graphs,
Kobayashi and Takazawa showed [14] that the degree sequences of C≤k -free 2-
matchings do not always form a jump system for k ≥ 6. These results are con-
sistent with the polynomial solvability of the C≤k -free 2-matching problem, even
when restricting it to bipartite graphs. Similar results are known about even fac-
tors due to [13]. Although Szabó’s result suggests that finding a maximum K-free
t-matching should be solvable in polynomial time, the problem is still open.
Among our assumptions, (1) and (2) may be considered as natural ones as
they hold for the maximum Kt,t -free t-matching problem in a simple graph.
We exclude parallel edges on the node sets of members of K in order to avoid
having two different Kt,t ’s on the same two colour classes or two Kt+1 ’s on the
same ground set. However, the degree bound (3) is a restrictive assumption and
dissipates essential difficulties. Our proof strongly relies on this and the theorem
cannot be straightforwardly generalized, as it can be shown by using the example
in Chapter 6 of [20].
The proof and algorithm use the contraction technique of [11], [16] and [1].
Our contribution on the one hand is the extension of this technique for t ≥ 2 and
forbidding Kt+1 ’s as well, while on the other hand the argument is significantly
simpler than the argument in [1].
Throughout the paper we use the following notation. For an undirected graph
G = (V, E), the set of edges induced by X ⊆ V is denoted by E[X]. For disjoint
subsets X, Y of V , E[X, Y ] denotes the set of edges between X and Y . The set of
nodes in V − X adjacent to X by some edge from F ⊆ E is denoted by Γ_F(X). We let d_F(v) denote the number of edges in F ⊆ E incident to v, where loops in G are counted twice, while d_F(X, Y) stands for the number of edges going between disjoint subsets X and Y. For a node v ∈ V, we sometimes abbreviate the set {v} by v, e.g. d_F(v, X) is the number of edges between v and X. For a set X ⊆ V, let h_F(X) = Σ_{v∈X} d_F(v), the sum of the number of edges incident to X and twice the number of edges spanned by X. We use b(U) = Σ_{v∈U} b(v) for a function b : V → Z₊ and a set U ⊆ V.
Let K be the list of forbidden Kt,t and Kt+1 subgraphs. For disjoint subsets
X, Y of V we denote by K[X] and K[X, Y ] the members of K contained in X
and having edges only between X and Y , respectively. That is, K[X, Y ] stands
for forbidden Kt,t ’s whose colour classes are subsets of X and Y . Recall that
VK and EK denote the node-set and edge-set of the forbidden graph K ∈ K,
respectively.
The rest of the paper is organized as follows. In Section 2 we formalize the
theorem and prove the trivial max ≤ min direction. Two shrinking operations
are introduced in Section 3, and Section 4 contains the proof of the max ≥ min
direction. Finally, the algorithm is presented in Section 5.
2 Main Theorem
Before stating our theorem, let us recall the well-known min-max formula on the
maximum size of a b-matching (see e.g. [17, Vol A, p. 562.]).
where U and W are disjoint subsets of V , and T ranges over the connected
components of G − U − W .
Let us now formulate our theorem. There are minor technical difficulties when
t = 2 that do not occur for larger t. In order to make both the formulation and
the proof simpler it is worth introducing the following definitions. We refer to
forbidden K2,2 and K3 subgraphs as squares and triangles, respectively.
For fixed U, W, P and K̇ the value of (5) is denoted by τ (U, W, P, K̇). It is easy
to see that the contribution of a square-full component to (5) is always 3 and
a maximum K-free b-matching contains exactly 3 of its edges. Hence we may
count these components of G separately, so the following theorem immediately
implies the general one.
Proof (of max ≤ min in Theorem 5). Let M be a K-free b-matching. Then clearly
|M ∩(E[U ]∪E[U, V −U ])| ≤ b(U ) and |M ∩E[W ]| ≤ |E[W ]|−|K̇[W ]|. Moreover,
for each T ∈ P we have
3 Shrinking
In the proof of max ≥ min we use two shrinking operations to get rid of the Kt,t
and Kt+1 subgraphs in K.
When identifying the nodes in KA and KB , the edges (and also loops) spanned
by KA and KB are replaced by loops on ka and kb , respectively. Each edge
[Figures 1 and 2 omitted: shrinking a forbidden K_{t,t} with colour classes K_A, K_B into two nodes k_a, k_b joined by a bundle of t − 1 parallel edges, and shrinking a forbidden K_{t+1} on V_K into a single node with ⌊(t+1)/2⌋ − 1 loops.]
Proof (of Lemma 8). First we show that if M is a K-free b-matching in G then there is a K°-free b°-matching M° in G° with |M°| ≥ |M| − (t² − t). Let M' = M − E_K. Clearly, |M ∩ E_K| ≤ t² − max{1, h_{M'}(K_A), h_{M'}(K_B)}. In G°, let M° be the union of M' and t − max{1, d_{M'}(k_a), d_{M'}(k_b)} parallel edges from the shrunk bundle between k_a and k_b. It is easy to see that M° is a K°-free b°-matching in G° with |M°| ≥ |M| − (t² − t).

The proof is completed by showing that for an arbitrary K°-free b°-matching M° in G° there exists a K-free b-matching M in G with |M| ≥ |M°| + (t² − t). Let H denote the set of parallel edges in the shrunk bundle between k_a and k_b, and let M' = M° − H. Now |M° ∩ H| ≤ t − max{1, d_{M'}(k_a), d_{M'}(k_b)} and, by Claim 10, M' may be extended to a K-free b-matching in G with |M ∩ E_K| = t² − max{1, h_{M'}(K_A), h_{M'}(K_B)}, that is
[Figures 3-5 omitted. Fig. 5 caption: Choice of P for t = 2 in the proof of Claim 11; legend: edges in M, edges in P, edges in E \ (P ∪ M).]
Proof (of Lemma 9). First we show that if M is a K-free b-matching in G then there is a K°-free b°-matching M° in G° with |M°| ≥ |M| − ⌊t²/2⌋. Let M' = M − E_K. Clearly,

$$|M \cap E_K| \;\le\; \binom{t+1}{2} - \Big\lceil \tfrac{\max\{1,\, h_{M'}(V_K)\}}{2} \Big\rceil.$$

In G°, let M° be the union of M' and ⌊(t − max{1, d_{M'}(k)})/2⌋ or ⌊(t + 1 − max{1, d_{M'}(k)})/2⌋ loops on k, depending on whether t is even or odd, respectively. It is easy to see that M° is a K°-free b°-matching in G° with |M°| ≥ |M| − ⌊t²/2⌋.

Let H denote the set of loops on k obtained when shrinking K, and let M' = M° − H. Now |M° ∩ H| ≤ ⌊(t − max{1, d_{M'}(k)})/2⌋ if t is even and |M° ∩ H| ≤ ⌊(t + 1 − max{1, d_{M'}(k)})/2⌋ if t is odd. By Claim 10, M' can be extended to a K-free b-matching M in G with |M ∩ E_K| = \binom{t+1}{2} − ⌈max{1, h_{M'}(V_K)}/2⌉, that is,

$$|M| = |M°| - |M° \cap H| + |M \cap E_K| \;\ge\; |M°| - \tfrac{t - \max\{1,\, d_{M'}(k)\}}{2} + \binom{t+1}{2} - \Big\lceil \tfrac{\max\{1,\, h_{M'}(V_K)\}}{2} \Big\rceil \;\ge\; |M°| + \tfrac{t^2}{2}$$

if t is even, and

$$|M| = |M°| - |M° \cap H| + |M \cap E_K| \;\ge\; |M°| - \tfrac{t + 1 - \max\{1,\, d_{M'}(k)\}}{2} + \binom{t+1}{2} - \Big\lceil \tfrac{\max\{1,\, h_{M'}(V_K)\}}{2} \Big\rceil \;\ge\; |M°| + \Big\lfloor \tfrac{t^2}{2} \Big\rfloor$$

if t is odd.
4 Proof of Theorem 5
We prove max ≥ min by induction on |K|. For K = ∅, this is simply a consequence
of Theorem 1.
Assume now that K = ∅ and let K be a forbidden subgraph such that K is
uncovered if t = 2. Let G◦ = (V ◦ , E ◦ ) denote the graph obtained by shrinking
K, let b◦ be defined as in Lemma 8 or 9 depending on whether K is a Kt,t or a
Kt+1 . We denote by K◦ the list of forbidden subgraphs disjoint from K.
By induction, the maximum size of a K°-free b°-matching in G° is equal to the minimum value of τ(U°, W°, P°, K̇°). Let us choose an optimal U°, W°, P°, K̇° so that |U°| is minimal. The following claim gives a useful property of U°.
Claim 12. Assume that v ∈ U is such that d(v, W) + |Γ(v) ∩ (V − W)| ≤ b(v) + 1. Then τ(U − v, W, P', K̇) ≤ τ(U, W, P, K̇), where P' is obtained from P by replacing its members incident to v by their union plus v.
Proof. By removing v from U , b(U ) decreases by b(v). |E[W ]| − |K̇[W ]| remains
unchanged, while the bound on d(v, W ) + |Γ (v) ∩ (V − W )| implies that the
increment in the sum over the components of G − U − W is at most b(v).
[Figure 6 omitted (Extending M°): a forbidden K_{3,3} with τ(U, W, P, K̇) = 5 + 3² − 3 = 11; after shrinking, τ(U°, W°, P°, K̇°) = 5.]
Case 2: K is a K_{t+1}.
By Lemma 9, the difference between the maximum size of a K-free b-matching in G and the maximum size of a K°-free b°-matching in G° is ⌊t²/2⌋. We show that the pre-images U, W and P of U°, W° and P° with K̇ = K̇° ∪ {K} satisfy

$$\tau(U, W, P, \dot{K}) \;=\; \tau(U°, W°, P°, \dot{K}°) + \Big\lfloor \tfrac{t^2}{2} \Big\rfloor. \tag{8}$$

After shrinking K = (V_K, E_K) we get a new node k with ⌊(t+1)/2⌋ − 1 loops on it. (3) implies that there are at most t + 1 non-loop edges incident to k. Since b°(k) ≥ t, Claim 12 implies k ∉ U. Hence we have the following two cases (T° denotes a member of P°).

• k ∈ W°: |E[W]| = |E°[W°]| + \binom{t+1}{2} − ⌊(t+1)/2⌋ + 1 and |K̇[W]| = |K̇°[W°]| + 1.
• k ∈ T°: b(T) = b°(T°) + t² if t is even and b(T) = b°(T°) + t² − 1 for an odd t.

(8) is satisfied in both cases, hence we are done. We may also leave out K from K̇ in the second case, as it is not counted in any term.
5 Algorithm
In this section we show how the proof of Theorem 5 immediately yields an
algorithm for finding a maximum K-free b-matching in strongly polynomial time.
In such problems, an important question from an algorithmic point of view is how
K is represented. For example, in the K-free b-matching problem for bipartite
graphs solved by Pap in [16], the set of excluded subgraphs may be exponentially
large. Therefore Pap assumes that K is given by a membership oracle, that is,
a subroutine is given for determining whether a given subgraph is a member
of K. However, with such an oracle there is no general method for determining
whether K = ∅. Fortunately, we do not have to tackle such problems: by the next
claim, we may assume that K is given explicitly, as its size is linear in n. We use
n = |V |, m = |E| for the number of nodes and edges of the graph, respectively.
Claim 13. If the graph G = (V, E) satisfies (1) and (3), then the total number of K_{t,t} and K_{t+1} subgraphs is bounded by (t+3)n/2.
Constructing a maximal set H of pairwise disjoint forbidden subgraphs can be done in O(t³n) time as follows. Maintain an array of size m that encodes for each edge whether it is used in one of the selected forbidden subgraphs or not. When increasing H, one only has to check whether any of the edges of the examined forbidden subgraph is already used, which takes O(t²) time. This and Claim 13 together give an O(t³n) bound.
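The greedy construction just described can be sketched directly; here `forbidden` is assumed to list each candidate subgraph by its edge indices, a representation we introduce for illustration.

```python
# Greedy construction of a maximal family H of disjoint forbidden
# subgraphs with a used-edge array, as described above (illustrative;
# `forbidden` lists each candidate K by its edge indices in 0..m-1).
def greedy_disjoint(forbidden, m):
    used = [False] * m                       # edges already covered by H
    H = []
    for K_edges in forbidden:                # O((t+3)n/2) candidates (Claim 13)
        if any(used[e] for e in K_edges):    # O(t^2) disjointness check
            continue
        H.append(K_edges)
        for e in K_edges:
            used[e] = True
    return H
```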
Let us shrink the members of H simultaneously (this can easily be done since they are disjoint), resulting in a graph G' = (V', E') with a bound b' : V' → Z₊ and no forbidden subgraphs, since H was maximal. One can find a maximum b'-matching M' in G' in O(|V'||E'| log |V'|) = O(nm log m) time as in [5]. Using the constructions described in Lemmas 8 and 9 for H_k, ..., H_1, this can be modified into a maximum K-free b-matching M. Note that, for t = 2, H_i is always uncovered in the actual graph by the selection rule. A dual optimal solution U, W, P, K̇ can be constructed simultaneously as in the proof of Theorem 5. The best time bound of the shrinking and extension steps may depend on the data structure used and the representation of the graph. In any case, one such step may be performed in O(m) time and |H| = O(n), hence the total running time is O(t³n + nm log m).
References
1. Bérczi, K., Kobayashi, Y.: An Algorithm for (n − 3)–Connectivity Augmentation
Problem: Jump System Approach. Technical report, Department of Mathematical
Engineering, University of Tokyo, METR 2009-12
2. Cornuéjols, G., Pulleyblank, W.: A Matching Problem With Side Conditions. Dis-
crete Math. 29, 135–139 (1980)
3. Frank, A.: Restricted t-matchings in Bipartite Graphs. Discrete Appl. Math. 131,
337–346 (2003)
4. Frank, A., Jordán, T.: Minimal Edge-Coverings of Pairs of Sets. J. Comb. Theory
Ser. B 65, 73–110 (1995)
5. Gabow, H.N.: An Efficient Reduction Technique for Degree-Constrained Subgraph
and Bidirected Network Flow Problems. In: STOC ’83: Proceedings of the fifteenth
annual ACM symposium on Theory of computing, pp. 448–456. ACM, New York
(1983)
6. Hartvigsen, D.: Extensions of Matching Theory. PhD Thesis, Carnegie-Mellon Uni-
versity (1984)
7. Hartvigsen, D.: The Square-Free 2-factor Problem in Bipartite Graphs. In:
Cornuéjols, G., Burkard, R.E., Woeginger, G.J. (eds.) IPCO 1999. LNCS, vol. 1610,
pp. 234–241. Springer, Heidelberg (1999)
8. Hartvigsen, D.: Finding maximum square-free 2-matchings in bipartite graphs. J.
Comb. Theory Ser. B 96, 693–705 (2006)
9. Hartvigsen, D., Li, Y.: Triangle-Free Simple 2-matchings in Subcubic Graphs (Ex-
tended Abstract). In: Fischetti, M., Williamson, D.P. (eds.) IPCO 2007. LNCS,
vol. 4513, pp. 43–52. Springer, Heidelberg (2007)
10. Király, Z.: C4 -free 2-factors in Bipartite Graphs. Technical report, Egerváry Re-
search Group, Department of Operations Research, Eötvös Loránd University, Bu-
dapest, TR-2001-13 (2001)
11. Király, Z.: Restricted t-matchings in Bipartite Graphs. Technical report, Egerváry
Research Group, Department of Operations Research, Eötvös Loránd University,
Budapest, TR-2009-04 (2009)
56 K. Bérczi and L.A. Végh
Otto-von-Guericke-University of Magdeburg,
Department of Mathematics/IMO, Universitätsplatz 2,
39106 Magdeburg, Germany
{andersen,weismant}@mail.math.uni-magdeburg.de
1 Introduction
This paper concerns mixed integer linear sets of the form:
Observe that P (B) can be obtained from P by deleting the non-negativity con-
straints on the basic variables xi for i ∈ B. Also observe that P (B) can be
F. Eisenbrand and B. Shepherd (Eds.): IPCO 2010, LNCS 6080, pp. 57–70, 2010.
c Springer-Verlag Berlin Heidelberg 2010
58 K. Andersen and R. Weismantel
where x̄B ∈ Rn is the basic solution associated with B, and the vectors rj,B ∈ Rn
for j ∈ N \B are the extreme rays of P (B).Finally observe that every non-trivial
valid inequality for PI (B) is of the form j∈N \B αj xj ≥ 1 with αj ≥ 0 for all
j ∈ N \ B. We say that a valid inequality j∈N \B αj xj ≥ 1 for PI (B) is a valid
cut for PI that can be derived from the basis B.
Several classes of cuts can be derived from a basis. Some of these cuts are
derived from a single equation. This equation may be one of the equalities
x = x̄B + j∈N \B xj rj,B , or an integer combination of these equalities. The
integrality constraints on the variables are then used to obtain a valid cut. Single-
row cuts are named either Mixed Integer Gomory (MIG) cuts [10], Mixed Integer
Rounding (MIR) cuts [11] or Split Cuts [8].
Recent research has attempted to use several equations simultaneously to
generate valid cuts from a basis. In [3,9], two equations were considered, and
cuts named disection cuts and lifted two-variable cuts were derived. All these
cuts are intersection cuts [4], and their validity is based on lattice-free polyhedra.
This paper is motivated by the following question: Which properties should
cuts derived from bases of the linear relaxation have in order to be effective
cutting planes for mixed integer programs ? Such properties could be useful for
identifying classes of cuts that are effective in practice.
A first realization is that such cuts must be sparse, i.e., the cuts must have
many zero coefficients. Dense cuts are hard for linear programming solvers to
handle, and they have not been shown to be effective in closing the integrality
gap of mixed integer programs.
Secondly, deciding exactly which variables should have a zero coefficient in a
cut seems hard. It therefore seems natural to consider a specific point x̄ ∈ P and
aim at cuts that form solutions to the following variant of a separation problem:
min{ αj x̄j : αj xj ≥ 1 is valid for PI (B) for some basis B} (5)
j∈N \B j∈N \B
valid inequality j∈N \B αj xj ≥ 1 for PI (B) which satisfies j∈N \B αj x̄j = 0 if
such an inequality exists. The cuts we identify are split cuts, and we show that, if
there exists a zero-coefficient cut wrt. x̄, then there also exists a zero-coefficient
cut wrt. x̄ which is a split cut. The cut is computed by first pivoting to an
appropriate basis, and then computing a lattice basis of a well chosen lattice.
It has been shown that, in general, the separation problem for split cuts is
NP-hard [7]. Our result demonstrates that, if one insists on a maximally violated
split cut, then the separation problem can be solved in polynomial time. Zero-
coefficient cuts therefore seem to provide a reasonably efficient alternative to
optimizing over the split closure of a mixed integer set. The quality of the split
closure as an approximation of a mixed integer set was demonstrated in [5].
The performance of zero-coefficient cuts is tested computationally on instances
from miplib 3.0 [6] and miplib 2003 [1]. We restrict our experiments to the corner
polyhedron PI (B ∗ ) obtained from an optimal basis B ∗ of the LP relaxation. In
other words we do not examine the effect of pivoting in our experiments. On
several test problems, zero-coefficient close substantially more integrality gap
than the MIG cuts obtained from the equations defining the optimal simplex
tableau.
The remainder of the paper is organized as follows. In Sect. 2 we derive an
infeasibility certificate for the set PI (B) for a given basis B. This certificate
is key for deriving zero-coefficient cuts. Zero-coefficient cuts are motivated and
presented in Sect. 3. Our main theorem is proved in Sect. 4. Finally our compu-
tational results are presented in Sect. 5.
xj = xj , for j ∈ N \ B,
xj ≥ 0 for j ∈ N \ B }. (6)
Defining the following vectors r ∈ R for j ∈ N \ B:
j n
⎧
⎨ −āk,j if k ∈ B,
rkj := 1 if k = j, (7)
⎩
0 otherwise,
the representation (6) of P (B) can be re-written in the form
P (B) = {x ∈ Rn : x = x̄ + sj rj and sj ≥ 0 for j ∈ N \ B}. (8)
j∈N \B
60 K. Andersen and R. Weismantel
Hence PI (B) is empty if and only if the translated cone x̄ + cone({rj }j∈N \B )
does not contain mixed integer points. Our certificate for proving PI (B) is empty
is a split disjunction. A split disjunction is of the form π T x ≤ π0 ∨ π T x ≥ π0 + 1,
where (π, π0 ) ∈ Zn+1 and πj = 0 for all j ∈ N \ NI . All mixed integer points
x ∈ PI (B) satisfy all split disjunctions.
Our point of departure is a result characterizing when an affine set contains
integer points (see [2]). Specifically, consider the affine set:
T a := f + span({q j }j∈J ), (9)
Since PI (B) is the set of mixed integer points in a translate of a polyhedral cone,
we now have the following certificate for when PI (B) is empty.
Corollary 1. We have PI (B) = ∅ if and only there exists π ∈ Zn such that
π T x̄ ∈
/ Z, π T rj = 0 for all j ∈ N \ B and πj = 0 for all j ∈ N \ NI .
We now use the certificate obtained in Sect. 2 to derive zero-coefficient cuts for
a corner polyhedron PI (B) for a fixed basis B. As in Sect. 2, we let x̄ := x̄B and
rj := rj,B for j ∈ N \ B. We consider an optimization problem (MIP):
min{ cj xj : x ∈ PI (B)},
j∈N \B
where c ∈ R|N \B| denotes the objective coefficients. The linear programming
relaxation of (MIP) is denoted (LP). The set of feasible solutions to (LP) is the
set P (B). We assume cj ≥ 0 for all j ∈ N \B since otherwise (LP) is unbounded.
To motivate zero-coefficient cuts, we first consider a generic cutting plane
|N \B|
algorithm for strengthening the LP relaxation (LP) of (MIP). Let V ⊂ Q+
be a family of valid inequalities for PI (B), i.e., we have that j∈N \B αj xj ≥ 1 is
valid for PI (B) for all α ∈ V . Let x ∈ P (B) be arbitrary. A separation problem
(SEP) wrt. x can be formulated as:
min{ αj xj : α ∈ V }.
j∈N \B
In (CPalg) above, only one cut is added in every iteration. It is also possible
to add several optimal and sub-optimal solutions to (SEP). Furthermore, for
many classes V of valid cutting planes for PI (B), (SEP) can not necessarily be
62 K. Andersen and R. Weismantel
is a zero-coefficient
cut wrt. xk :
j∈N \B αj xj ≥ 1 to (LP) and re-optimize.
k
(a) Add
Let xk+1 be an optimal solution.
(b) Set k := k + 1.
End.
We next prove that, for a given point x ∈ P (B), if there exists any valid
inequality for PI (B) which is maximally violated wrt. x , then there also exists
a split cut which is maximally violated wrt. x .
Theorem 1. Let x ∈ P (B) be arbitrary. If there exists a valid inequality for
PI (B) which is a zero-coefficient cut wrt. x , then there also exists a split cut for
PI (B) which is a zero-coefficient cut wrt. x .
Proof. Let x ∈ P (B), and let j∈N \B αj xj ≥ 1 be a valid inequality for PI (B)
which is a zero-coefficient cut wrt. x . Since j∈N \B αj xj = 0 and αj , sj ≥ 0
for all j ∈ N \ B, we must have αj = 0 for all j ∈ N \ B satisfying xj > 0. Let
X := {j ∈ N \ B : xj > 0}. It follows that 0 ≥ 1 is valid for:
Observe that L(x ) is a lattice, i.e., for any π 1 , π 2 ∈ L(x ) and k ∈ Z, we have
kπ 1 ∈ L(x ) and π 1 + π 2 ∈ L(x ). For any lattice it is possible to compute a basis
for the lattice in polynomial time. Hence we can find vectors π 1 , . . . , π p ∈ L(x )
in polynomial time such that:
p
L(x ) = {π ∈ Zn : π = λi π i and λi ∈ Z for i = 1, 2, . . . , p}.
i=1
64 K. Andersen and R. Weismantel
Now, if there exists a lattice basis vector π ī ∈ L(x ) with ī ∈ {1, 2, . . . , p} such
that (π ī )T x̄ ∈
/ Z, then the split cut derived from π ī is maximally violated wrt.
x . Conversely, if we have (π i )T x̄ ∈ Z for all i ∈ {1, 2, . . . , p}, then π T x̄ ∈ Z for
all π ∈ L(x ). We therefore have the following.
Corollary 2. Let x ∈ P (B) be arbitrary. If there exists a valid inequality for
PI (B) that is maximally violated wrt. x , then it is possible to find such an
inequality in polynomial time.
Based on the above results, we have the following implementation of the cutting
plane algorithm (InitCPalg) presented earlier:
Implementation of (InitCPalg):
(1) Set k := 0. Let xk := x̄ be an optimal solution to (LP).
(2) Find a lattice basis π 1 , . . . , π pk for L(xk ).
Let I(xk ) := {i ∈ {1, . . . , pk } : (π i )T x̄ ∈
/ Z}.
(3) While I(xk ) = ∅:
(a) Add all split cuts generated from vectors π i
with i ∈ I(xk ) to (LP) and re-optimize.
Let xk+1 be an optimal solution.
(b) Find a lattice basis π 1 , . . . , π pk+1 for L(xk+1 ).
Let I(xk+1 ) := {i ∈ {1, . . . , pk+1 } : (π i )T x̄ ∈
/ Z}.
(c) Set k := k + 1.
End.
We next argue that mixed integer Gomory cuts play a natural role in the above
implementation of (InitCPalg). Consider the computation of the lattice basis for
L(x0 ) in step (2) of (InitCPalg). Observe that, since x0 = x̄, we have L(x0 ) = Zn ,
and therefore π 1 := e1 , . . . , π n := en is a lattice basis for L(x0 ), where e1 , . . . , en
denote the unit vectors in Rn . Since a split cut (10) obtained from a unit vector
is a mixed integer Gomory cut, the first cuts added in step (3).(a) of the above
implementation of (InitCPalg) are the mixed integer Gomory cuts. A natural
computational question therefore seems to be how much more integrality gap
can be closed by continuing (InitCPalg) and generating the remaining zero-
coefficient cuts.
Proof. For simplicity let x̄ := x̄B and x̄ := x̄B . Also let ā.j := (AB )−1 a.j for all
j ∈ N \ B and ā.j := (AB )−1 a.j for all j ∈ N \ B , where AB and AB denote
the basis matrices associated with the bases B and B respectively.
Suppose z ∈ PI (B). We have zi = x̄i + j∈N \B āi,j zj for all i ∈ B , zj ≥ 0 for
all j ∈ N \ (B ∪ {ī}) and zj ∈ Z for all j ∈ NI . If zī ≥ 0, we are done, so suppose
zī < 0. Choose an integer k > 0 such that kā.ī ∈ Zm and zī + k ≥ 0. Defining
zi := zi + kāi,ī for all i ∈ B , zī := zī + k and zj := zj for all j ∈ N \ (B ∪{ī}), we
have z ∈ PI (B ). Hence PI (B) = ∅ implies PI (B ) = ∅. The opposite direction
is symmetric.
From Lemma 3 it follows that either all corner polyhedra PI (B) associated with
bases B for P are empty, or they are all non-empty. We next present a pivot
operation from a basis B to an adjacent basis B with the property that, if a zero-
coefficient cut wrt. a point x ∈ P can be derived from B, then a zero-coefficient
cut wrt. x can also be derived from B .
Lemma 4. Let B be a basis for P , let x ∈ P and define X := {j ∈ N : xj > 0}.
Also let B := (B \ {ī}) ∪ {j̄} be an adjacent basis to B, where ī ∈ B \ X and
j̄ ∈ X \ B. If a zero-coefficient cut wrt. x can be derived from B, then a zero-
coefficient cut wrt. x can also be derived from B .
Proof. Given a set S ⊆ N , we will use sets obtained from P , PI , P (B) and
PI (B) by setting xj = 0 for all j ∈ N \ S. For S ⊆ N , define Q(S) := {x ∈ P :
xj = 0 for j ∈ N \ S} and QI (S) := {x ∈ PI : xj = 0 for j ∈ N \ S}. Also,
given a basis B ⊆ S, define Q(B, S) := {x ∈ P (B) : xj = 0 for j ∈ N \ S} and
QI (B, S) := {x ∈ PI (B) : xj = 0 for j ∈ N \ S}.
Assume a zero-coefficient cut wrt. x can be derived from B. Observe that this
implies PI (B, B ∪X ) = ∅. Now, PI (B, B ∪X ) is a corner polyhedron associated
with PI (B ∪ X ), and PI (B , B ∪ X ) is also a corner polyhedron associated with
PI (B ∪ X ). Since any two bases of P (B ∪ X ) can be obtained from each other
by pivoting, it follows from Lemma 3 that also PI (B , B ∪ X ) = ∅. Corollary 1
now gives a split cut which is a zero-coefficient cut wrt. x derived from B .
Lemma 4 shows that, for the purpose of identifying zero-coefficient cuts wrt. x ,
the interesting bases to consider are those bases for which it is not possible to
pivot a variable xj with j ∈ X into the basis.
Definition 1. Let x ∈ P , and define X := {j ∈ N : xj > 0}. A basis B for P
is called maximal wrt. x if (B \ {ī}) ∪ {j̄} is not a basis for P for all ī ∈ B \ X
and j̄ ∈ X \ B.
From the above results it is not clear whether it is necessary to investigate all
maximal bases wrt. x in order to identify a zero-coefficient cut wrt. x . However,
the following lemma shows that it is sufficient to examine just a single arbitrarily
chosen maximal basis wrt. x . In other words, if there exists a basis from which
a zero-coefficient cut wrt. x can be derived, then a zero-coefficient cut wrt. x
can be derived from every maximal basis wrt. x .
66 K. Andersen and R. Weismantel
Lemma 5. If there exists a basis B for P from which a zero-coefficient cut wrt.
x can be derived, then a zero-coefficient cut can be derived from every basis for
P which is maximal wrt. x .
Proof. Suppose B is a basis from which a zero-coefficient cut wrt. x can be
derived. Let J := N \ B, Bx := B ∩ X and Jx := J ∩ X . Also let x̄ := x̄B and
ā.j := (AB )−1 a.j for j ∈ J, where AB denotes the basis matrix associated with
B. Lemma 4 shows that we may assume B is maximal, i.e., we may assume that
the simplex tableau associated with B is of the form:
xi = 0 + āi,j xj for all i ∈ B \ Bx , (15)
j∈J\Jx
xi = x̄i + āi,j xj + āi,j xj for all i ∈ Bx . (16)
j∈Jx j∈J\Jx
Furthermore, from any basis B for P which is maximal wrt. x , a basic poly-
hedron T (B ) of T can be associated of the above form, and a zero-coefficient
cut wrt. x can be derived from B if and only if T (B ) does not contain mixed
integer points. Since T (B) does not contain mixed integer points, it follows from
Lemma 3 that every basic polyhedron T (B ) for T does not contain mixed in-
teger points. Hence a zero-coefficient cut can be derived from every basis B for
P which is maximal wrt. x .
Since a maximal basis wrt. x ∈ P can be obtained in polynomial time, we
immediately have our main theorem.
Theorem 2. Let x ∈ P be arbitrary. If there exists basis B, and a valid inequal-
ity for PI (B) which is a zero-coefficient cut wrt. x , then such a zero-coefficient
cut can be obtained in polynomial time.
5 Computational Results
We now test the performance of the cutting plane algorithm (InitCPalg) de-
scribed in Sect. 3. In our implementation, we use CPLEX 9.1 for solving linear
Zero-Coefficient Cuts 67
programs, and the open source software NTL for the lattice computations. We
use instances from miplib 3.0 [6] and miplib 2003 [1] in our experiments. All in-
stances are minimization problems, and we use the preprocessed version of each
instance, i.e., when we refer to an instance, we refer to the instance obtained
after applying the preprocessor of CPLEX 9.1.
For each instance, we formulate the optimization problem over the corner
polyhedron associated with an optimal basis of the LP relaxation. To distin-
guish the optimization problem over the corner polyhedron from the original
mixed integer program, we use the following notation: The original mixed inte-
ger program is denoted (MIP), and the mixed integer program over the corner
polyhedron is denoted (MIPc ). The optimal objective of (MIP) is denoted z MIP ,
c
and the optimal objective value of (MIPc ) is denoted z MIP . The LP relaxation
of (MIP) is denoted (LP), and the optimal objective value of (LP) is denoted
z LP .
We assume the (original) mixed integer program (MIP) has n variables, and
includes slack, surplus and artificial variables in the formulation:
min cT x
such that
aTi. x = bi , for all i ∈ M, (18)
lj ≤ xj ≤ uj , for all j ∈ N, (19)
xj ∈ Z, for all j ∈ NI . (20)
where M is an index set for the constraints, c ∈ Qn+|M| denotes the objective
coefficients, N := {1, 2, . . . , (n + |M |)} is an index set for the variables, NI ⊆ N
denotes those variables that are integer constrained, l and u are the lower and
upper bounds on the variables respectively and (ai. , bi ) ∈ Q|N |+1 for i ∈ M
denotes the coefficients in the ith constraint. The variables xn+i for i ∈ M are
either slack, surplus or artificial variables.
The problem (MIPc ) is formulated as follows. An optimal basis for (LP) is an
|M |-subset B ∗ ⊆ N of basic variables. Let J ∗ := N \ B ∗ denote the non-basic
variables. The problem (MIPc ) can be constructed from (MIP) by eliminating
∗
certain bounds on the variables. Let JA denote the non-basic artificial variables,
∗ ∗ ∗
let JL ⊆ J \ JA denote the non-basic structural variables on lower bound, and
let JU∗ ⊆ J ∗ \ JA
∗
denote the non-basic structural variables on upper bound. By
re-defining the bounds on the variables xj for j ∈ N to:
⎧ ∗
⎧ ∗
⎨0 if j ∈ JA , ⎨0 if j ∈ JA ,
∗ ∗
lj (B ) := lj if j ∈ JL , and uj (B ):= uj if j ∈ JU∗ ,
∗
(21)
⎩ ⎩
−∞ otherwise, +∞ otherwise,
then the problem (MIPc ) associated with B ∗ is given by:
min cT x
such that
aTi. x = bi , for all i ∈ M, (22)
68 K. Andersen and R. Weismantel
Table 2. Instances where the increase in objective with all ZC cuts was substantially
larger than the increase in objective with only MIG cuts
ΔObj. MIG cuts
Problem # Constr. # Var. # MIG # Additional ΔObj. All cuts × 100%
cuts ZC cuts
l152lav 97 1988 53 6 0.00%
mkc∗ 1286 3230 119 29 0.00%
p0201 107 183 42 27 0.00%
p2756 702 2642 201 7 6.02%
rout 290 555 52 36 6.24%
swath 482 6260 80 26 7.58%
vpm1 114 188 15 40 22.38%
vpm2 128 188 29 29 22.73%
flugpl 13 14 13 5 23.36%
fixnet6 477 877 12 19 27.06%
timtab2∗ 287 648 237 653 29.85%
timtab1∗ 166 378 133 342 30.93%
egout 35 47 8 5 41.47%
qnet1 363 1417 55 47 45.05%
p0282 160 200 34 53 49.60%
air04 614 7564 290 30 50.53%
modglob 286 384 60 28 56.77%
mas76 12 148 11 11 65.77%
pp08a 133 234 51 34 66.16%
10teams 210 1600 179 76 66.67%
mod008 6 319 5 10 69.77%
nsrand-ipx 590 4162 226 91 73.17%
Table 2 contains those instances where MIG cuts closed less than 80% of the
total integrality gap that can be closed with zero-coefficient cuts. We observe
that for the first 16 instances in Table 2, continuing (InitCPalg) beyond MIG
cuts closed at least twice as much integrality gap as would have been achieved
70 K. Andersen and R. Weismantel
by using only MIG cuts. For the remaining instances in Table 2, it was not at
least a factor of two which was achieved, but still a substantial improvement.
The instances marked with an asterisk in Table 2 are instances where we were
unable to solve (MIPc ). For those instances, the results are based on the best
possible solution we were able to find.
The remaining class of instances are those instances where MIG cuts closed
more than 80% of the total integrality gap that can be closed with zero-coefficient
cuts. There were 28 of these instances. For these instances, continuing (InitC-
Palg) beyond MIG cuts was therefore not beneficial. However, we observe that
for all except two of these instances (markshare1 and markshare2), this was
because very few zero-coefficient cuts were generated that are not MIG cuts.
Detecting that it is not beneficial to continue (InitCPalg) beyond the generation
of MIG cuts was therefore done after only very few lattice basis computations.
References
1. Achterberg, T., Koch, T.: MIPLIB 2003. Operations Research Letters 34, 361–372
(2006)
2. Andersen, K., Louveaux, Q., Weismantel, R.: Certificates of linear mixed integer
infeasibility. Operations Research Letters 36, 734–738 (2008)
3. Andersen, K., Louveaux, Q., Weismantel, R., Wolsey, L.A.: Inequalities from Two
Rows of a Simplex Tableau. In: Fischetti, M., Williamson, D.P. (eds.) IPCO 2007.
LNCS, vol. 4513, pp. 1–15. Springer, Heidelberg (2007)
4. Balas, E.: Intersection Cuts - a new type of cutting planes for integer programming.
Operations Research 19, 19–39 (1971)
5. Balas, E., Saxena, A.: Optimizing over the split closure. Mathematical Program-
ming, Ser. A 113, 219–240 (2008)
6. Bixby, R.E., Ceria, S., McZeal, C.M., Savelsbergh, M.W.P.: An updated mixed
integer programming library: MIPLIB 3. 0. Optima 58, 12–15 (1998)
7. Caprara, A., Letchford, A.: On the separation of split cuts and related inequalities.
Mathematical Programming, Ser. A 94, 279–294 (2003)
8. Cook, W.J., Kannan, R., Schrijver, A.: Chvátal closures for mixed integer pro-
gramming problems. Mathematical Programming 47, 155–174 (1990)
9. Cornuéjols, G., Margot, F.: On the Facets of Mixed Integer Programs with Two
Integer Variables and Two Constraints. Mathematical Programming, Ser. A 120,
429–456 (2009)
10. Gomory, R.E.: An algorithm for the mixed integer problem. Technical Report RM-
2597, The Rand Corporation (1960a)
11. Nemhauser, G., Wolsey, L.A.: A recursive procedure to generate all cuts for 0-1
mixed integer programs. Mathematical Programming, Ser. A 46, 379–390 (1990)
Prize-Collecting Steiner Network Problems
1 Introduction
Prize-collecting Steiner problems are well-known network design problems with
several applications in expanding telecommunications networks (see for exam-
ple [JMP00, SCRS00]), cost sharing, and Lagrangian relaxation techniques (see
e.g. [JV01, CRW01]). A general form of these problems is the Prize-Collecting
Steiner Forest problem1 : given a network (graph) G = (V, E), a set of source-
sink pairs P = {{s1 , t1 }, {s2 , t2 }, . . . , {sk , tk }}, a non-negative cost function
c : E → + , and a non-negative penalty function π : P → + , our goal is
Part of this work was done while the authors were meeting at DIMACS. We would
like to thank DIMACS for hospitality.
Partially supported by NSF Award Grant number 0829959.
1
In the literature, this problem is also called “prize-collecting generalized Steiner
tree”.
F. Eisenbrand and B. Shepherd (Eds.): IPCO 2010, LNCS 6080, pp. 71–84, 2010.
c Springer-Verlag Berlin Heidelberg 2010
72 M. Hajiaghayi et al.
a minimum-cost way of installing (buying) a set of links (edges) and paying the
penalty for those pairs which are not connected via installed links. When all
penalties are ∞, the problem is the classic APX-hard Steiner Forest problem, for
which the best known approximation ratio is 2 − n2 (n is the number of nodes of
the graph) due to Agrawal, Klein, and Ravi [AKR95] (see also [GW95] for a more
general result and a simpler analysis). The case of Prize-Collecting Steiner Forest
problem when all sinks are identical is the classic Prize-Collecting Steiner Tree
problem. Bienstock, Goemans, Simchi-Levi, and Williamson [BGSLW93] first
considered this problem (based on a problem earlier proposed by Balas [Bal89])
and gave for it a 3-approximation algorithm. The current best ratio for this
problem is 1.992 by Archer, Bateni, Hajiaghayi,
! and Karloff [ABHK09], im-
proving upon a primal-dual 2 − n−1 1
-approximation algorithm of Goemans
and Williamson [GW95]. When in addition all penalties are ∞, the problem is
the classic Steiner Tree problem, which is known to be APX-hard [BP89] and
for which the best approximation ratio is 1.55 [RZ05]. Very recently, Byrka et
al. [BGRS10] have announced an improved approximation algorithm for the
Steiner tree problem.
The general form of the Prize-Collecting Steiner Forest problem first has been
formulated by Hajiaghayi and Jain [HJ06]. They showed how by a primal-dual
algorithm to a novel integer programming formulation of the problem with
doubly-exponential variables, we can obtain a 3-approximation algorithm for
the problem. In addition, they show that the factor 3 in the analysis of their
algorithm is tight. However they show how a direct randomized LP-rounding al-
gorithm with approximation factor 2.54 can be obtained for this problem. Their
approach has been generalized by Sharma, Swamy, and Williamson [SSW07] for
network design problems where violated arbitrary 0-1 connectivity constraints
are allowed in exchange for a very general penalty function. The work of Ha-
jiaghayi and Jain has also motivated a game-theoretic version of the problem
considered by Gupta et al. [GKL+ 07].
In this paper, we consider a much more general high-connectivity version of
Prize-Collecting Steiner Forest, called Prize-Collecting Steiner Network, in which
we are also given connectivity requirements ruv for pairs of nodes u and v and
a penalty function in case we do not satisfy all ruv . Our goal is to find a mini-
mum way of constructing a network (graph) in which we connect u and v with
ruv ≤ ruv edge-disjoint paths and paying a penalty for all violated connectivity
between source-sink pairs. This problem can arise in real-world network design,
in which a typical client not only might want to connect to the network but
also might want to connect via a few disjoint paths (e.g., to have a higher band-
width or redundant connections in case of edge failures) and a penalty might
be charged if we cannot satisfy its connectivity requirement. When all penalties
are ∞, the problem is the classic Steiner Network problem. Improving on a long
line of earlier research that applied primal-dual methods, Jain [Jai01] obtained
a 2-approximation algorithm for Steiner Network using the iterative rounding
method. This algorithm was generalized to so called “element-connectivity” by
Fleischer, Jain, and Williamson [FJW01] and by Cheriyan, Vempala, and Vetta
Prize-Collecting Steiner Network Problems 73
[CVV06]. Recently, some results were obtained for the node-connectivity version;
the currently best known ratios for the node-connectivity case are O(R3 log n)
for general requirements [CK09] and O(R2 ) for rooted requirements [Nut09],
where R = maxu,v∈V ruv is the maximum requirement. See also the survey by
Kortsarz and Nutov [KN07] for various min-cost connectivity problems.
Hajiaghayi and Nasri [HN10] generalize the iterative rounding approach of
Jain to Prize-Collecting Steiner Network when there is a separate non-increasing
marginal penalty function for each pair u, v whose ruv -connectivity requirement
is not satisfied. They obtain an iterative rounding 3-approximation algorithm for
this case. For the special case when penalty functions are linear in the violation
of the connectivity requirements, Nagarajan, Sharma, and Williamson [NSW08]
using Jains iterative rounding algorithm as a black box give a 2.54-factor approx-
imation algorithm. They also generalize the 0-1 requirements of Prize-Collecting
Steiner Forest problem introduced by Sharma, Swamy, and Williamson [SSW07]
to include general connectivity requirements. Assuming the monotone submod-
ular penalty function of Sharma et al. is generalized to a multiset function that
can be decomposed into functions in the same type as that of Sharma et al.,
they give an O(log R)-approximation algorithm (recall that R is the maximum
connectivity requirement). In this algorithm, they assume that we can use each
edge possibly many times (without bound). They raise the question whether we
can obtain a constant ratio without all these assumptions, when penalty is a sub-
modular multi-set function of the set of disconnected pairs? More importantly
they pose as an open problem to design a good approximation algorithm for the
all-or-nothing version of penalty functions: penalty functions which charge the
penalty even if the connectivity requirement is slightly violated. In this paper,
we answer affirmatively all these open problems by proving the first constant
factor 2.54-approximation algorithm which is based on a novel LP formulation
of the problem. We further generalize our results for element-connectivity and
node-connectivity. In fact, for all types of connectivities, we prove a very gen-
eral result (see Theorem 1) stating that if Steiner Network (the version without
penalties) admits an LP-based ρ-approximation algorithm, then the correspond-
ing prize-collecting version admits a (ρ + 1)-approximation algorithm.
is when there is a “root” s that belongs to all pairs {ui , vi }. We consider the
following “prize-collecting” version of GSN.
k
val (H) = c(H) + pi (λSH (ui , vi )).
i=1
between u and v is “mixed”, meaning it may contain both edges in the graph
and nodes from S. Note that if T (i, S) then δ(T ) ∪ (V \ (T ∪ T )) is such
a mixed cut that separates between ui and vi . Intuitively, Menger’s Theorem
for S-connectivity (c.f. [KN07]) states that the S-connectivity between ui and
vi equals the minimum size of such a mixed cut. Formally, for a node pair ui , vi
of a graph H = (V, E) and S ⊆ V we have:
λSH (ui , vi ) = min (|δ(T )|+ |V \ (T ∪T )|) = min (|δ(T )|+ |V |− (|T |+ |T |))
T (i,S) T (i,S)
Hence if λSH (ui , vi ) ≥ ri for a graph H = (V, E), then for any setpair T with
T (i, S) we must have |δ(T )| ≥ ri (T ), where ri (T ) = max{ri + |T | + |T | −
|V |, 0}. Consequently, a standard “cut-type” LP-relaxation of the GSN problem
is as follows (c.f. [KN07]):
⎧ ⎫
⎨ ⎬
min ce xe | xe ≥ ri (T ) ∀T (i, S), ∀i ∈ K, xe ∈ [0, 1] ∀e . (1)
⎩ ⎭
e∈E e∈δ(T )
2 A New LP Relaxation
We use the following LP-relaxation for the PC-GSN problem. We introduce vari-
ables xe for e ∈ E (xe = 1 if e ∈ H), fi,e for i ∈ K and e ∈ E (fi,e = 1 if
i ∈ unsat(H) and e appears on a chosen set of ri S-disjoint {ui , vi }-paths in
H), and zI for I ⊆ K (zI = 1 if I = unsat(H)).
Minimize e∈E ce xe + I⊆K π(I)zI
Subject to fi,e ≥ (1 − I:i∈I zI )ri (T ) ∀i ∀T (i, S)
e∈δ(T )
fi,e ≤ 1 − I:i∈I zI ∀i ∀e
xe ≥ fi,e ∀i ∀e (2)
I⊆K zI =1
xe , fi,e , zI ∈ [0, 1] ∀i ∀e ∀I
for all i and T (i, S). Here is an example demonstrating that the integrality
gap of this LP can be as large as R = maxi ri even for edge-connectivity. Let G
consist of R − 1 edge-disjoint paths between two nodes s and t. All the edges
have cost 0. There is only one pair {u1 , v1 } = {s, t} that has requirement r1 = R
and penalty π({1}) = 1. Let π(∅) = 0. Clearly, π is submodular and monotone
non-decreasing. We have S = ∅. No integral solution can satisfy the requirement
r1 , hence an optimal integral solution pays the penalty π({1}) and has value 1.
A feasible fractional solution (without the flow variables) sets xe = 1 for all e,
sets z{1} = 1/R, z∅ = 1 − 1/R. The new set of constraints is satisfied since
and
e∈δ(T ) xe ≥ (1 − 1/R) · R = (1 − z{1} )r1 (T ) for any {s, t}-cut T . Thus the
optimal LP-value is at most 1/R, giving a gap of at least R.
With flow variables, however, we have an upper bound f1,e ≤ 1 − z{1} . Since
there
is an {s, t}-cut T with |δ(T )| = R − 1, we cannot satisfy the constraints
e∈δ(T ) f1,e ≥ (1 − z{1} )r1 (T ) and f1,e ≤ 1 − z{1} simultaneously unless we set
z{1} = 1. Thus in this case, our LP (2) with flow variables has the same optimal
value of as the integral optimum.
78 M. Hajiaghayi et al.
We will prove the following two statements that together imply Theorem 1.
Lemma 2. Any basic feasible solution to (2) has a polynomial number of non-
zero variables. Furthermore, an optimal basic solution to (2) (the non-zero en-
tries) can be computed in polynomial time.
Lemma 3. Suppose that there exists a polynomial time algorithm that computes
an integral solution to LP (1) of cost at most ρ times the optimal value of LP (1)
for any subset of node pairs. Then there exists a polynomial time algorithm that
given a feasible solution to (2) computes as a solution to PC-GSN a subgraph H
of G so that val(H) = c(H) + π(unsat(H)) is at most (1 − e−1/ρ )−1 times the
value of this solution, assuming π is submodular and monotone non-decreasing.
Before proving these lemmas, we prove some useful results. The following state-
ment can be deduced from a theorem of Edmonds for polymatroids (c.f. [KV02,
Chapter 14.2]), as the dual LP d(γ) in the lemma seeks to optimize a linear
function over a polymatroid. We provide a direct proof for completeness of ex-
position.
Lemma 4. Let γ ∈ [0, 1]k be a vector. Consider a primal LP
⎧ ⎫
⎨ ⎬
p(γ) := min π(I)zI | zI ≥ γi ∀i ∈ K, zI ≥ 0 ∀I ⊆ K
⎩ ⎭
I⊆K I:i∈I
Let σ be a permutation of K such that γσ(1) ≤ γσ(2) ≤ . . . ≤ γσ(k) . Let us also use
the notation that γσ(0) = 0. The optimum solutions to p(γ) and d(γ) respectively
are given by
γσ(i) − γσ(i−1) , for I = {σ(i), . . . , σ(k)}, i ∈ K
zI =
0, otherwise;
and
k
k
= γi · (π({i, . . . , k}) − π({i + 1, . . . , k})) = γi · yi .
i=1 i=1
Thus from weak LP duality, they in fact form optimum solutions to primal and
dual LPs respectively.
Recall that a sub-gradient of a convex function g : k → at a point γ ∈ k
is a vector d ∈ k such that for any γ ∈ k , we have g(γ ) − g(γ) ≥ d ·
(γ − γ). For a differentiable convex function g, the sub-gradient corresponds
to gradient ∇g. The function p(γ) defined in Lemma 4 is essentially Lovasz’s
continuous extension of the submodular function π. The fact that p is convex
and its subgradient can be computed efficiently is given in [Fuj05]. We provide
a full proof for completeness of exposition.
Lemma 5. The function p(γ) in Lemma 4 is convex and given γ ∈ [0, 1]k , both
p(γ) and its sub-gradient ∇p(γ) can be computed in polynomial time.
Proof. We first prove that p is convex. Fix γ1 , γ2 ∈ [0, 1]k and α ∈ [0, 1]. To show
that p is convex, we will show p(αγ1 + (1 − α)γ2 ) ≤ αp(γ1 ) + (1 − α)p(γ2 ). Let
{zI1 } and {zI2 } be the optimum solutions of the primal LP defining p for γ1 and
γ2 respectively. Note that the solution {αzI1 + (1 − α)zI2 } is feasible for this LP
for γ = αγ1 + (1 − α)γ2 . Thus the optimum solution has value not greater than
the value of this solution which is αp(γ1 ) + (1 − α)p(γ2 ).
From Lemma 4, it is clear that given γ ∈ [0, 1]k , the value p(γ) can be com-
puted in polynomial time. Lemma 4 also implies that the optimum dual solution
y ∗ = (y1∗ , . . . , yk∗ ) ∈ k+ can be computed in polynomial time. We now argue that
y ∗ is a sub-gradient of p at γ. Fix any γ ∈ k . First note that, from LP duality,
p(γ) = y ∗ · γ. Thus we have
p(γ) + y ∗ · (γ − γ) = y ∗ · γ + y ∗ · (γ − γ) = y ∗ · γ ≤ p(γ ).
The last inequality holds from weak LP duality since y ∗ is a feasible solution for
the dual LP d(γ ) as well. The lemma follows.
80 M. Hajiaghayi et al.
3 Proof of Lemma 3
Proof. Consider the LP-relaxation (1) of the GSN problem with good require-
ments only, with K replaced by Kg ; namely, we seek a minimum cost sub-
graph H of G that satisfies the set Kg of good requirements. We claim that
x∗∗ ∗
e = min {1, xe /(1 − α)} for each e ∈ E is a feasible solution to LP (1). Thus
the optimum value of LP (1) is at most e∈E ce x∗∗ e . Consequently, using the
algorithm that computes an integral solution to LP (1) of cost at most ρ times
the optimal value of LP (1), we can construct a subgraph H that satisfies all
ρ
good requirements and has cost at most c(H) ≤ ρ e∈E ce x∗∗ e ≤ 1−α e c e x∗
e,
as desired.
∗∗
We now∗∗ show that {xe } is a feasible solution to LP (1), namely, that
x ≥ ri (T ) for any i ∈ Kg and any T (i, S). Let i ∈ Kg and let ζi =
) e
e∈δ(T
1 − I:i∈I zI∗ . Note that ζi ≥ 1 − α, by the definition of Kg . By the second and
the third sets of constraints in LP (2), for every e ∈ E we have min{ζi , x∗e } ≥ fi,e∗
.
∗
∗
fi,e f∗
= i,e
ζi 1− zI∗ .
Consequently, combining with the first set of constraints in
I:i∈I
∗
e∈δ(T ) fi,e
LP (2), for any T (i, S) we obtain that e∈δ(T ) x∗∗
e ≥ 1−
z ∗ ≥ ri (T ).
I:i∈I I
Let H be as in Lemma 6, and recall that unsat(H) denotes the set of require-
ments not satisfied by H. Clearly each requirement i ∈ unsat(H) is bad. The
following lemma bounds the total penalty we pay for unsat(H).
Lemma 7. π(unsat(H)) ≤ α1 · I π(I)zI∗ .
To complete the proof of β1 -approximation, we now argue that the above expec-
tation is at most β1 · e∈E (ce x∗e + I π(I)zI∗ ).
$ %
Since Eα 1−αρ
= β1 , the first term in (3) is at most β1 · e∈E ce x∗e . Since
unsat(H) ⊆ {i | I:i∈I zI∗ ≥ α} and & sinceπ is monotone'non-decreasing, the
second term in (3) is at most Eα π {i | I:i∈I zI∗ ≥ α} . Lemma 8 bounds
this quantity as follows. The ideas used here are also presented in Sharma et
al. [SSW07].
Lemma 8. We have
( ) *+
1
Eα π {i | zI∗ ≥ α} ≤ · π(I)zI∗ . (4)
β
I:i∈I I
Proof. Let γi = I:i∈I zI∗ for all i ∈ K. Let us, without loss of generality, order
the elements i ∈ K such that γ1 ≤ γ2 ≤ · · · ≤ γk . We also use the notation
γ0 = 0. Note that {zI∗ } forms a feasible solution to the primal LP p(γ) given in
Lemma 4. Therefore, from Lemma 4, its objective value is at least that of the
optimum solution:
k
π(I)zI∗ ≥ [(γi − γi−1 ) · π({i, . . . , k})] . (5)
I i=1
We now observe that the LHS of (4) can be expressed as follows. Since α is picked
uniformly at random from (0, β], we have that for all 1 ≤ i ≤ k, with probability
at most γi −γ i−1
, the random variable α lies in the interval (γi−1 , γi ]. When this
β
event happens, we get that {i | I:i ∈I zI∗ ≥ α} = {i | γi ≥ α} = {i, . . . , k}.
Thus the expectation in LHS of (4) is at most
k "
#
γi − γi−1
· π({i, . . . , k}) . (6)
i=1
β
4 Proof of Lemma 2
We next show that even if LP (2) has exponential number of variables and
constraints, the following lemma holds.
Lemma 9. Any basic feasible solution to LP (2) has a polynomial number of
non-zero variables.
Proof. Fix a basic feasible solution {x∗e , fi,e
∗
, zi∗ } to (2). For i ∈ K, let
∗
min e∈δ(T ) fi,e
T :T i
γi = 1 − and γi = 1 − max fi,e
∗
.
ri e
Now fix the values of variables {xe , fi,e } to {x∗e , fi,e
∗
} and project the LP (2) onto
variables {zI } as follows.
⎧
⎨
ce x∗e + min π(I)zI |
⎩
e∈E I⊆K
⎫
⎬
zI = 1, γi ≤ zI ≤ γi ∀i ∈ K, zI ≥ 0 ∀I ⊆ K . (7)
⎭
I⊆K I:i∈I
sub-gradient of the function e∈E ce xe + p(γ) w.r.t. variables {xe , γi }. The sub-
gradient of e∈E ce xe w.r.t. x is simply the cost vector c. The sub-gradient
of p(γ) w.r.t. γ is computed using Lemma 5, denote it by y ∈ k+ . From the
definition of sub-gradient, we have that the sub-gradient (c, y) to the objective
function at point (x, γ) satisfies
) * ) *
ce xe + p(γ ) − ce xe + p(γ) ≥ (c, y) · ((x , γ ) − (x, γ)) .
e∈E e∈E
Now fix any feasible solution (x∗ , γ ∗ ), i.e., the one that satisfies e∈E ce x∗e +
p(γ ∗ ) ≤ opt. Substituting (x , γ ) = (x∗ , γ ∗ ) in the above equation we get,
) * ) *
0 = opt − opt > ce x∗e + p(γ ∗ ) − ce xe + p(γ)
e∈E e∈E
≥ (c, y) · (x∗ , γ ∗ ) − (c, y) · (x, γ).
Thus (c, y) defines a separating hyperplane between the point (x, γ) and any
point (x∗ , γ ∗ ) that satisfies e∈E ce x∗e + p(γ ∗ ) ≤ opt. Hence we have a polyno-
mial time separation oracle for the objective function as well.
Thus we can solve (8) using the ellipsoid algorithm. The proof of Lemma 2 is
hence complete.
References
[ABHK09] Archer, A., Bateni, M., Hajiaghayi, M., Karloff, H.: A technique for im-
proving approximation algorithms for prize-collecting problems. In: Proc.
50th IEEE Symp. on Foundations of Computer Science, FOCS (2009)
[AKR95] Agrawal, A., Klein, P., Ravi, R.: When trees collide: an approximation
algorithm for the generalized Steiner problem on networks. SIAM J. Com-
put. 24(3), 440–456 (1995)
[Bal89] Balas, E.: The prize collecting traveling salesman problem. Networks 19(6),
621–636 (1989)
[BGRS10] Byrka, J., Grandoni, F., Rothvoss, T., Sanita, L.: An improved lp-based
approximation for steiner tree. In: Proceedings of the 42nd annual ACM
Symposium on Theory of computing, STOC (2010)
[BGSLW93] Bienstock, D., Goemans, M., Simchi-Levi, D., Williamson, D.: A note on
the prize collecting traveling salesman problem. Math. Programming 59(3,
Ser. A), 413–420 (1993)
[BP89] Bern, M., Plassmann, P.: The Steiner problem with edge lengths 1 and 2.
Information Processing Letters 32, 171–176 (1989)
[CK09] Chuzhoy, J., Khanna, S.: An O(k3 log n)-approximation algorithms for
vertex-connectivity network design. In: Proceedings of the 50th Annual
IEEE Symposium on Foundations of Computer Science, FOCS (2009)
[CRW01] Chudak, F., Roughgarden, T., Williamson, D.: Approximate k-MSTs and
k-Steiner trees via the primal-dual method and Lagrangean relaxation. In:
Aardal, K., Gerards, B. (eds.) IPCO 2001. LNCS, vol. 2081, pp. 60–70.
Springer, Heidelberg (2001)
84 M. Hajiaghayi et al.
[CVV06] Cheriyan, J., Vempala, S., Vetta, A.: Network design via iterative rounding
of setpair relaxations. Combinatorica 26(3), 255–275 (2006)
[FJW01] Fleischer, L., Jain, K., Williamson, D.: An iterative rounding 2-
approximation algorithm for the element connectivity problem. In: Proc.
of the 42nd IEEE Symp. on Foundations of Computer Science (FOCS), pp.
339–347 (2001)
[Fuj05] Fujishige, S.: Submodular functions and optimization. Elsevier, Amster-
dam (2005)
[GKL+ 07] Gupta, A., Könemann, J., Leonardi, S., Ravi, R., Schäfer, G.: An efficient
cost-sharing mechanism for the prize-collecting steiner forest problem. In:
Proc. of the 18th ACM-SIAM Symposium on Discrete algorithms (SODA),
pp. 1153–1162 (2007)
[GW95] Goemans, M., Williamson, D.: A general approximation technique for con-
strained forest problems. SIAM J. Comput. 24(2), 296–317 (1995)
[HJ06] Hajiaghayi, M., Jain, K.: The prize-collecting generalized Steiner tree prob-
lem via a new approach of primal-dual schema. In: Proc. of the 17th ACM-
SIAM Symp. on Discrete Algorithms (SODA), pp. 631–640 (2006)
[HN10] Hajiahayi, M., Nasri, A.: Prize-collecting Steiner networks via iterative
rounding. In: LATIN (to appear, 2010)
[Jai01] Jain, K.: A factor 2 approximation algorithm for the generalized Steiner
network problem. Combinatorica 21(1), 39–60 (2001)
[JMP00] Johnson, D., Minkoff, M., Phillips, S.: The prize collecting Steiner tree
problem: theory and practice. In: Proceedings of the Eleventh Annual
ACM-SIAM Symposium on Discrete Algorithms, pp. 760–769 (2000)
[JV01] Jain, K., Vazirani, V.: Approximation algorithms for metric facility loca-
tion and k-median problems using the primal-dual schema and Lagrangian
relaxation. J. ACM 48(2), 274–296 (2001)
[KN07] Kortsarz, G., Nutov, Z.: Approximating minimum cost connectivity prob-
lems. In: Gonzales, T.F. (ed.) Approximation Algorithms and Metahueris-
tics, ch. 58. CRC Press, Boca Raton (2007)
[KV02] Korte, B., Vygen, J.: Combinatorial Optimization: Theory and Algorithms.
Springer, Berlin (2002)
[NSW08] Nagarajan, C., Sharma, Y., Williamson, D.: Approximation algorithms for
prize-collecting network design problems with general connectivity require-
ments. In: Bampis, E., Skutella, M. (eds.) WAOA 2008. LNCS, vol. 5426,
pp. 174–187. Springer, Heidelberg (2009)
[Nut09] Nutov, Z.: Approximating minimum cost connectivity problems via un-
crossable bifamilies and spider-cover decompositions. In: Proc. of the 50th
IEEE Symposium on Foundations of Computer Science, FOCS (2009)
[RZ05] Robins, G., Zelikovsky, A.: Tighter bounds for graph Steiner tree approx-
imation. SIAM J. on Discrete Mathematics 19(1), 122–134 (2005)
[SCRS00] Salman, F., Cheriyan, J., Ravi, R., Subramanian, S.: Approximating the
single-sink link-installation problem in network design. SIAM J. on Opti-
mization 11(3), 595–610 (2000)
[SSW07] Sharma, Y., Swamy, C., Williamson, D.: Approximation algorithms for
prize collecting forest problems with submodular penalty functions. In:
Proceedings of the 18th ACM-SIAM Symposium on Discrete Algorithms
(SODA), pp. 1275–1284 (2007)
On Lifting Integer Variables in Minimal
Inequalities
1 Introduction
There has been a renewed interest recently in the study of cutting planes for
general mixed integer linear programs (MILPs) that cut off a basic solution
of the linear programming relaxation. More precisely, consider a mixed integer
linear set in which the variables are partitioned into a basic set B and a nonbasic
set N , and K ⊆ B ∪ N indexes the integer variables:
xi = fi − j∈N aij xj for i ∈ B
x≥0 (1)
xk ∈ Z for k ∈ K.
F. Eisenbrand and B. Shepherd (Eds.): IPCO 2010, LNCS 6080, pp. 85–95, 2010.
c Springer-Verlag Berlin Heidelberg 2010
86 A. Basu et al.
B ⊆ K, i.e. all basic variables are integer. Andersen, Louveaux, Weismantel and
Wolsey [1] studied the corner polyhedron when |B| = 2 and B = K, i.e. all
nonbasic variables are continuous. They give a complete characterization of the
corner polyhedron using intersection cuts (Balas [2]) arising from splits, triangles
and quadrilaterals. This very elegant result has been extended to |B| > 2 and
B = K by showing a correspondence between minimal valid inequalities and
maximal lattice-free convex sets [5], [7]. These results and their extensions [6],
[10] are best described in an infinite model, which we motivate next.
A classical family of cutting planes for (1) is that of Gomory mixed integer
cuts.
For a given row of the tableau, the Gomory mixed integer cut is of the form
j∈N \K ψ(aij )xj + j∈N ∩K π(aij )xj ≥ 1 where ψ and π are functions given by
simple formulas. A nice feature of the Gomory mixed integer cut is that, for fixed
fi , the same functions ψ, π are used for any possible choice of the aij s in (1). It is
well known that the Gomory mixed integer cuts are also valid for X. More gener-
ij , i ∈ B; we are interested
ally, let aj be the vector with entries a in pairs (ψ, π)
of functions such that the inequality j∈N \K ψ(aj )xj + j∈N ∩K π(aj )xj ≥ 1
is valid for X for any possible choice of the nonbasic coefficients aij . Since we
are interested in nonredundant inequalities, we can assume that the function
(ψ, π) is pointwise minimal. While a general characterization of minimal valid
functions seems hopeless (see for example [4]), when N ∩ K = ∅ the minimal
valid functions ψ are well understood in terms of maximal lattice-free convex
sets, as already mentioned. Starting from such a minimal valid function ψ, an
interesting question is how to generate a function π such that (ψ, π) is valid and
minimal. Recent papers [8], [9] study when such a function π is unique. Here we
prove a theorem that generalizes and unifies results from these two papers.
In order to formalize the concept of valid function (ψ, π), we introduce the
following infinite model. In the setting below, we also allow further linear con-
straints on the basic variables. Let S be the set of integral points in some rational
polyhedron in Rn such that dim(S) = n (for example, S could be the set of non-
negative integer points). Let f ∈ Rn \ S. Consider the following semi-infinite
relaxation of (1), introduced in [10].
x= f+ rsr + ryr , (2)
r∈Rn r∈Rn
x ∈ S,
sr ∈ R+ , ∀r ∈ Rn ,
yr ∈ Z+ , ∀r ∈ Rn ,
s, y have finite support
where the nonbasic continuous variables have been renamed s and the nonbasic
integer variables have been renamed y. Given π : Rn → R, (ψ, π)
two functions ψ,
is said to be valid for (2) if the inequality r∈Rn ψ(r)sr + r∈Rn π(r)yr ≥ 1
holds for every (x, s, y) satisfying (2). We also consider the semi-infinite model
where we only have continuous nonbasic variables.
On Lifting Integer Variables in Minimal Inequalities 87
x= f+ rsr (3)
r∈Rn
x ∈ S,
sr ∈ R+ , ∀r ∈ Rn ,
s has finite support.
A function ψ : Rn → R is said to be valid for (3) if the inequality r∈Rn ψ(r)sr ≥
1 holds for every (x, s) satisfying (3). Given a valid function ψ for (3), a function
π is a lifting of ψ if (ψ, π) is valid for (2). One is interested only in (pointwise)
minimal valid functions, since non-minimal ones are implied by some minimal
valid function. If ψ is a minimal valid function for (3) and π is a lifting of ψ such
that (ψ, π) is a minimal valid function for (2) then we say that π is a minimal
lifting of ψ.
While minimal valid functions for (3) have a simple characterization [6], min-
imal valid functions for (2) are not well understood. A general idea to derive
minimal valid functions for (2) is to start from some minimal valid function ψ
for (3), and construct a minimal lifting π of ψ. While there is no general tech-
nique to compute such minimal lifting π, it is known that there exists a region
Rψ , containing the origin in its interior, where ψ coincides with π for any mini-
mal lifting π. This latter fact was observed by Dey and Wolsey [9] for the case
of S = Z2 and by Conforti, Cornuéjols and Zambelli [8] for the general case.
Furthermore, it is remarked in [8] and [10] that, if π is a minimal lifting of ψ,
then π(r) = π(r ) for every r, r ∈ Rn such that r − r ∈ Zn ∩ lin(conv(S)).
Therefore the coefficients of any minimal lifting π are uniquely determined in
the region Rψ + (Zn ∩ lin(conv(S))). In particular, whenever translating Rψ by
integer vectors in lin(conv(S)) covers Rn , ψ has a unique minimal lifting. The
purpose of this paper is to give a precise description of the region Rψ .
To state our main result, we need to explain the characterization of minimal
valid functions for (3). We say that a convex set B ⊆ Rn is S-free if B does not
contain any point of S in its interior. A set B is a maximal S-free convex set if it
is an S-free convex set that is not properly contained in any S-free convex set.
It was proved in [6] that maximal S-free convex sets are polyhedra containing a
point of S in the relative interior of each facet.
Given an S-free polyhedron B ⊆ Rn containing f in its interior, B can be
uniquely written in the form
B = {x ∈ Rn : ai (x − f ) ≤ 1, i ∈ I},
ψB (r) = max ai r, ∀r ∈ Rn .
i∈I
Note in particular that, since maximal S-free convex sets are polyhedra, the
above function is defined for all maximal S-free convex sets B.
88 A. Basu et al.
Theorem 1. [6] Let ψ be a minimal valid function for (3). Then the set
Bψ := {x ∈ Rn | ψ(x − f ) ≤ 1}
We define ,
Rψ := R(x).
x∈S∩Bψ
In the above expression, equality holds if and only if h ∈ I(r) and h ∈ I(x−f −r).
x3 R(x3) R(x ) x f
f 2 2
x1 x2
Bψ
R(x2)
R(x1)
R(x1)
x1
Bψ
l1 l
(a) A maximal Z2 -free triangle with (b) A wedge
three integer points
x3 R(x3)
x1
R(x1)
Bψ x1 x2
f
x6 R(x1)
x2
R(x2) R(x6) R(x2)
f
R(x5)
R(x4)
Bψ
x3 x4 x5
R(x3)
(c) A maximal Z2 -free triangle with integer (d) A truncated wedge
vertices
Fig. 1. Regions R(x) for some maximal S-free convex sets in the plane. The thick dark
line indicates the boundary of Bψ . For a particular x, the dark gray regions denote
R(x). The jagged lines in a region indicate that it extends to infinity. For example, in
Figure 1(b), R(x1 ) is the strip between lines l1 and l. Figure 1(c) shows an example
where R(x) is full-dimensional for x2 , x4 , x6 , but is not full-dimensional for x1 , x3 , x5 .
90 A. Basu et al.
Given
a minimal valid function ψ for (3) and scalar λ, we say that the inequality
r∈Rn ψ(r)sr + λyr ∗ ≥ 1 is valid for (4) if it holds for every (x,
s, yr∗ ) satisfy-
ing (4). We denote by ψ ∗ (r∗ ) the minimum value of λ for which r∈Rn ψ(r)sr +
λyr∗ ≥ 1 is valid for (4).
We observe that, for any lifting π of ψ, we have
ψ ∗ (r∗ ) ≤ π(r∗ ).
Indeed, r∈Rn ψ(r)sr + π(r∗ )yr∗ ≥ 1 is valid for (4), since, for any (s̄, ȳr∗ )
satisfying (4), the vector (s̄, ȳ), defined by ȳr = 0 for all r ∈ Rn \{r∗ }, satisfies (2).
Moreover, the following fact was shown in [8].
Lemma 1. If ψ is a minimal valid function for (3) and π is a minimal lifting
of ψ, then π ≤ ψ.
So we have the following relation for every minimal lifting π of ψ :
In general ψ ∗ is not a lifting of ψ, but if it is, then the above relation implies
that it is the unique minimal lifting of ψ.
Remark 1. For any r ∈ Rn such that ψ ∗ (r) = ψ(r), we have π(r) = ψ(r) for
every minimal lifting π of ψ. Conversely, if ψ ∗ (r∗ ) < ψ(r∗ ) for some r∗ ∈ Rn ,
then there exists some lifting π of ψ such that π(r∗ ) < ψ(r∗ ).
Proof. The first part follows from ψ ∗ ≤ π ≤ ψ. For the second part, given
r∗ ∈ Rn such that ψ ∗ (r∗ ) < ψ(r∗ ), we can define π by π(r∗ ) = ψ ∗ (r∗ ) and
π(r) = ψ(r) for all r ∈ Rn , r = r∗ . Since ψ is valid for (3), it follows by the
definition of ψ ∗ (r∗ ) that π is a lifting of ψ.
Remark 2. The proof of Theorem 3 in [6] implies the following. Given a maximal
S-free convex set B, there exists δ > 0 such that there is no point of S \ B at
distance less than δ from B.
ai r̄ ≤ 0, i ∈ I,
92 A. Basu et al.
This implies that there exists some ε̄ > 0 such that for all ε ≤ ε̄,
μi ai + γC = 0
i∈I
∗
(λ − ε)( μi ) − ( μi ai )r∗ > 0,
i∈I i∈I
since λ∗ − ai r∗ ≥ 0, i ∈ I.
Corollary 1. Let ψ be a minimal valid function for (3). Then ψ ∗ (r∗ ) = ψ(r∗ )
if and only if there exists x̄ ∈ S such that
We show the converse. Since ψ is a valid function for (3), ψ(x̄−f −r∗ )+ψ(r∗ ) ≥ 1.
Since ψ is a minimal valid function for (3), by Theorem 1 there exists a maximal
S-free convex set B such that ψ = ψB . Let Bψ = {x ∈ Rn | ai (x − f ) ≤ 1, i ∈ I}.
∗ ∗ ∗
x̄Assume ψ∗ (r ) = ψ(r ). By Theorem 5, there exists a point x̄ ∈ S such that
1 ∈ B(ψ(r )). Therefore
ai (x̄ − f ) + ψ(r∗ ) − ai r∗ ≤ 1, i ∈ I.
Thus
4 Conclusion
In this paper we give an exact characterization of the region where a minimal
valid inequality ψ and any minimal lifting π of ψ coincide. This was exhibited in
Theorem 2, which generalizes results from [8] and [9] about liftings of minimal
valid inequalities.
As already mentioned in the introduction, the following theorem was proved
in [8].
Theorem 6. Let ψ be a minimal valid function for (3). If Rψ + (Zn ∩ lin
(conv(S))) covers all of Rn , then there exists a unique minimal lifting π of ψ.
We conjecture that the converse also holds.
Conjecture 7 Let ψ be a minimal valid function for (3). There exists a unique
minimal lifting π of ψ if and only if Rψ + (Zn ∩ lin(conv(S))) covers all of Rn .
Acknowledgements
The authors would like to thank Marco Molinaro for helpful discussions about
the results presented in this paper.
References
1. Andersen, K., Louveaux, Q., Weismantel, R., Wolsey, L.A.: Cutting Planes from
Two Rows of a Simplex Tableau. In: Fischetti, M., Williamson, D.P. (eds.) IPCO
2007. LNCS, vol. 4513, pp. 1–15. Springer, Heidelberg (2007)
2. Balas, E.: Intersection Cuts - A New Type of Cutting Planes for Integer Program-
ming. Operations Research 19, 19–39 (1971)
3. Barvinok, A.: A Course in Convexity. In: Graduate Studies in Mathematics, vol. 54.
American Mathematical Society, Providence (2002)
4. Basu, A., Conforti, M., Cornuejols, G., Zambelli, G.: A Counterexample to a Con-
jecture of Gomory and Johnson. Mathematical Programming Ser. A (to appear
2010)
5. Basu, A., Conforti, M., Cornuejols, G., Zambelli, G.: Maximal Lattice-free Convex
Sets in Linear Subspaces (2009) (manuscript)
6. Basu, A., Conforti, M., Cornuejols, G., Zambelli, G.: Minimal Inequalities for an
Infinite Relaxation of Integer Programs. SIAM Journal of Discrete Mathematics
(to appear 2010)
7. Borozan, V., Cornuéjols, G.: Minimal Valid Inequalities for Integer Constraints.
Mathematics of Operations Research 34, 538–546 (2009)
8. Conforti, M., Cornuejols, G., Zambelli, G.: A Geometric Perspective on Lifting
(May 2009) (manuscript)
9. Dey, S.S., Wolsey, L.A.: Lifting Integer Variables in Minimal Inequalities corre-
sponding to Lattice-Free Triangles. In: Lodi, A., Panconesi, A., Rinaldi, G. (eds.)
IPCO 2008. LNCS, vol. 5035, pp. 463–475. Springer, Heidelberg (2008)
10. Dey, S.S., Wolsey, L.A.: Constrained Infinite Group Relaxations of MIPs (March
2009) (manuscript)
On Lifting Integer Variables in Minimal Inequalities 95
11. Gomory, R.E.: Some Polyhedra related to Combinatorial Problems. Linear Algebra
and its Applications 2, 451–558 (1969)
12. Gomory, R.E., Johnson, E.L.: Some Continuous Functions Related to Corner Poly-
hedra, Part I. Mathematical Programming 3, 23–85 (1972)
13. Johnson, E.L.: On the Group Problem for Mixed Integer Programming. In: Math-
ematical Programming Study, pp. 137–179 (1974)
14. Schrijver, A.: Theory of Linear and Integer Programming. John Wiley & Sons,
New York (1986)
15. Meyer, R.R.: On the Existence of Optimal Solutions to Integer and Mixed-Integer
Programming Problems. Mathematical Programming 7, 223–235 (1974)
Efficient Edge Splitting-Off Algorithms
Maintaining All-Pairs Edge-Connectivities

L.C. Lau and C.K. Yung
1 Introduction
The edge splitting-off operation plays an important role in many basic graph problems, both in proving theorems and in obtaining efficient algorithms. Splitting-off a pair of edges (xu, xv) means deleting these two edges and adding a new edge uv if u ≠ v. This operation was introduced by Lovász [18], who showed that splitting-off can be performed to maintain the global edge-connectivity of a graph. Mader extended Lovász's result significantly to prove that splitting-off can be performed to maintain the local edge-connectivity for all pairs:
Theorem 1 (Mader [19]). Let G = (V, E) be an undirected graph that has at least r(s, t) edge-disjoint paths between s and t for all s, t ∈ V − x. If there is no cut edge incident to x and d(x) ≠ 3, then some edge pair (xu, xv) can be split-off so that in the resulting graph there are still at least r(s, t) edge-disjoint paths between s and t for all s, t ∈ V − x.
These splitting-off theorems have applications in various graph problems. Lovász
[18] and Mader [19] used their splitting-off theorems to derive Nash-Williams’ graph
orientation theorems [23]. Subsequently these theorems and their extensions have
found applications in a number of problems, including edge-connectivity augmen-
tation problems [4, 8, 9], network design problems [7, 13, 16], tree packing problems
[1, 6, 17], and graph orientation problems [11].
Efficient splitting-off algorithms have been developed to give fast algorithms
for the above problems [4, 6, 12, 20, 22]. However, most of the efficient algorithms
are developed only in the global edge-connectivity setting, although there are
important applications in the more general local edge-connectivity setting.
Mader’s theorem can be applied repeatedly until d(x) = 0 when d(x) is even and
there is no cut edge incident to x. This is called a complete edge splitting-off at
x, which is a key subroutine in algorithms for connectivity augmentation, graph
orientation, and tree packing.
A straightforward algorithm to compute a complete splitting-off sequence is
to split-off (xu, xv) for every pair u, v ∈ N (x) where N (x) is the neighbor set
of x, and then check whether the connectivity requirements are violated by
computing all-pairs edge-connectivities in the resulting graph, and repeat this
procedure until d(x) = 0.
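To make this baseline concrete, here is a small sketch of the straightforward algorithm. It is not from the paper: the multigraph encoding (a symmetric dict of edge multiplicities), the helper names, and the requirement dictionary r are illustrative assumptions; local connectivity is computed by a plain BFS-based max-flow, and pairs with u = v (two parallel edges to a single neighbor) are ignored for simplicity.

```python
from collections import deque
from itertools import combinations

def local_conn(mult, s, t):
    """lambda(s, t): max-flow where each parallel edge is one unit of capacity.
    mult is symmetric: mult[u][v] == mult[v][u] == number of parallel edges."""
    res = {u: dict(nb) for u, nb in mult.items()}   # residual capacities
    flow = 0
    while True:
        parent, queue = {s: None}, deque([s])
        while queue and t not in parent:            # BFS for an augmenting path
            u = queue.popleft()
            for v, c in res.get(u, {}).items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        aug = min(res[u][v] for u, v in path)
        for u, v in path:                           # push flow, open reverse arcs
            res[u][v] -= aug
            res[v][u] = res[v].get(u, 0) + aug
        flow += aug

def split_pair(mult, x, u, v):
    """Copy of mult with the edge pair (xu, xv) split off into a new edge uv."""
    m = {a: dict(nb) for a, nb in mult.items()}
    for a, b in ((x, u), (x, v)):
        m[a][b] -= 1; m[b][a] -= 1
        if m[a][b] == 0:
            del m[a][b]; del m[b][a]
    m[u][v] = m[u].get(v, 0) + 1
    m[v][u] = m[v].get(u, 0) + 1
    return m

def naive_complete_splitting(mult, x, r):
    """Repeat: try every pair of distinct x-neighbors, keep the first split
    that preserves all requirements r[(s, t)], until d(x) = 0."""
    while mult[x]:
        for u, v in combinations(sorted(mult[x]), 2):
            cand = split_pair(mult, x, u, v)
            if all(local_conn(cand, s, t) >= k for (s, t), k in r.items()):
                mult = cand
                break
        else:
            raise ValueError("no admissible pair at x")
    return mult
```

Each round makes up to O(|N(x)|²) splitting-off attempts, and each attempt triggers connectivity checks for all required pairs; these are exactly the two costs that Sections 3.2 and 3.4 reduce.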
Several efficient algorithms have been proposed for the complete splitting-off problem, but only Gabow's algorithm [12] can be used in the local edge-connectivity setting, with running time O(rmax² · n³). Our algorithms improve the running time of Gabow's algorithm by a factor of Ω̃(n). In applications where rmax is small, the improvement of the randomized algorithm could be a factor of Ω̃(n²).
Theorem 2. In the local edge-connectivity setting, there is a deterministic Õ(rmax² · n²)-time algorithm and a randomized Õ(m + rmax³ · n)-time algorithm for the complete edge splitting-off problem in unweighted graphs.
These edge splitting-off algorithms can be used directly to improve the running time of various graph algorithms [7, 9, 12, 13, 17, 23]. For instance, using Theorem 2 in Gabow's local edge-connectivity augmentation algorithm [12] in unweighted graphs, the running time can be improved from Õ(rmax² n³) to Õ(rmax² n²). Similarly, using Theorem 2 in Gabow's orientation algorithm [12], one can find a well-balanced orientation in unweighted graphs in Õ(rmax³ n²) expected time, improving the O(rmax² n³) result by Gabow [12]. We will not discuss the details of these applications in this paper.
Our edge splitting-off algorithms are conceptually very simple and can be seen as refinements of the straightforward algorithm. The improvements come from some new structural results, and from a recent fast Gomory-Hu tree construction algorithm by Bhalgat, Hariharan, Kavitha, and Panigrahi [5]. First, in Section 3.2, we show how to find a complete edge splitting-off sequence by using at most O(|N(x)|) splitting-off attempts, instead of the O(|N(x)|²) attempts made by the straightforward algorithm. This is based on an alternative proof of Mader's theorem in Section 3.1. Then, in Section 3.4, we show how to reduce the problem of checking local edge-connectivities for all pairs to the problem of checking local edge-connectivities for a much smaller set of pairs.
2 Preliminaries
Let G = (V, E) be a graph. For X, Y ⊆ V, denote by δ(X, Y) the set of edges with one endpoint in X − Y and the other endpoint in Y − X, let d(X, Y) := |δ(X, Y)|, and also define d̄(X, Y) := d(X ∩ Y, V − (X ∪ Y)). For X ⊆ V, define δ(X) := δ(X, V − X) and the degree of X as d(X) := |δ(X)|. Denote the degree of a vertex as d(v) := d({v}). Also denote the set of neighbors of v by N(v), and call a vertex in N(v) a v-neighbor.
Let λ(s, t) be the maximum number of edge-disjoint paths between s and t in G, and let r(s, t) be an edge-connectivity requirement for s, t ∈ V. The connectivity requirement is global if r(s, t) = k for all s, t ∈ V; otherwise it is local. We say a graph G satisfies the connectivity requirements if λ(s, t) ≥ r(s, t) for any s, t ∈ V. The requirement r(X) of a set X ⊆ V is the maximum edge-connectivity requirement between u and v with u ∈ X and v ∈ V − X. By Menger's theorem, to satisfy the requirements, it suffices to guarantee that d(X) ≥ r(X) for all X ⊂ V. The surplus s(X) of a set X ⊆ V is defined as d(X) − r(X). A graph satisfies the edge-connectivity requirements if s(X) ≥ 0 for all ∅ ≠ X ⊂ V. For X ⊂ V − x, X is called dangerous if s(X) ≤ 1 and tight if s(X) = 0. The following proposition will be used throughout our proofs.
Proposition 4 ([10] Proposition 2.3). For X, Y ⊆ V, at least one of the following inequalities holds:

s(X) + s(Y) ≥ s(X ∩ Y) + s(X ∪ Y) + 2d(X, Y),
s(X) + s(Y) ≥ s(X − Y) + s(Y − X) + 2d̄(X, Y).
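As a concrete reading of these definitions, the following minimal helper (ours, reusing the edge-multiplicity encoding from the earlier sketch; the dictionary r maps sorted vertex pairs to requirements) evaluates d(X), r(X) and s(X). A set X is then tight when s(X) = 0 and dangerous when s(X) ≤ 1.

```python
def degree(mult, X):
    """d(X): number of edges (with multiplicity) leaving the vertex set X."""
    X = set(X)
    return sum(c for u in X for v, c in mult.get(u, {}).items() if v not in X)

def requirement(r, X, nodes):
    """r(X): largest requirement r(u, v) separated by the cut (X, V - X)."""
    X = set(X)
    pairs = ((u, v) for u in X for v in set(nodes) - X)
    return max((r.get((min(p), max(p)), 0) for p in pairs), default=0)

def surplus(mult, r, X, nodes):
    """s(X) = d(X) - r(X); X is tight iff this is 0, dangerous iff at most 1."""
    return degree(mult, X) - requirement(r, X, nodes)
```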
Lemma 7 ([7] Lemma 2.7). If d(x) ≠ 3 and there is no cut edge incident to x, then there are no maximal dangerous sets X, Y, Z and u, v, w ∈ N(x) with u ∈ X ∩ Y, v ∈ X ∩ Z, w ∈ Y ∩ Z and u, v, w ∉ X ∩ Y ∩ Z.
Nagamochi and Ibaraki [21] gave a fast algorithm to find a sparse subgraph that
satisfies edge-connectivity requirements, which will be used in Section 3.3 as a
preprocessing step.
Theorem 8 ([21] Lemma 2.1). There is an O(m)-time algorithm to construct
a subgraph with O(rmax · n) edges that satisfies all the connectivity requirements.
As a key tool in checking local edge-connectivities, we need to construct a
Gomory-Hu tree, which is a compact representation of all pairwise min-cuts
of an undirected graph. Let G = (V, E) be an undirected graph; a Gomory-Hu tree of G is a weighted tree T = (V, F) with the following property. Consider any
s, t ∈ V , the unique s-t path P in T , an edge e = uv on P with minimum
weight, and any component K of T − e. Then the local edge-connectivity be-
tween s and t in G is equal to the weight of e in T , and δ(K) is a minimum s-t
cut in G. To check whether the connectivity requirements are satisfied, we only
need to check the pairs with λ(u, v) ≤ rmax . A partial Gomory-Hu tree Tk of G is
obtained from a Gomory-Hu tree T of G by contracting all edges with weight at
least k. Therefore, each node in Tk represents a subset of vertices S in G, where
the local edge-connectivity between each pair of vertices in S is at least k. For
vertices u, v ∈ G in different nodes of Tk , their local edge-connectivity (which
is less than k) is determined in the same way as in an ordinary Gomory-Hu
tree. Bhalgat et al. [5] gave a fast randomized algorithm to construct a partial
Gomory-Hu tree. We will use the following theorem by setting k = rmax . The
following result can be obtained by using the algorithm in [15], with the fast tree
packing algorithm in [5].
Theorem 9 ([5, 15]). A partial Gomory-Hu tree Tk can be constructed in
Õ(km) expected time.
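The contraction defining Tk is straightforward to carry out once a full Gomory-Hu tree is available; the sketch below is our own illustration under that assumption, with the tree given as a list of weighted edges.

```python
def partial_gomory_hu(tree_edges, k):
    """Contract edges of weight >= k in a Gomory-Hu tree.
    tree_edges: iterable of (u, v, w). Returns (node_of, light_edges), where
    node_of[v] is the super-node (a frozenset of vertices) containing v."""
    parent = {}

    def find(a):                              # union-find with path halving
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for u, v, w in tree_edges:
        if w >= k:
            parent[find(u)] = find(v)         # merge across heavy edges
    groups = {}
    for u, v, w in tree_edges:
        for a in (u, v):
            groups.setdefault(find(a), set()).add(a)
    node_of = {a: frozenset(g) for g in groups.values() for a in g}
    light_edges = [(node_of[u], node_of[v], w) for u, v, w in tree_edges if w < k]
    return node_of, light_edges
```

Pairs inside one super-node have local edge-connectivity at least k, and for pairs in different super-nodes the minimum weight on the path through light_edges gives λ(u, v), exactly as in an ordinary Gomory-Hu tree.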
Proof. We prove the lemma by a simple induction. The statement holds trivially for |U| = 2 by Proposition 5. Consider U = {u1, u2, . . . , uk+1} ⊆ N(x) where every pair (ui, uj) is non-admissible. By induction, since every pair (ui, uj) is non-admissible, there are maximal dangerous sets X, Y such that {u1, . . . , uk−1, uk} ⊆ X and {u1, . . . , uk−1, uk+1} ⊆ Y. Since (uk, uk+1) is non-admissible, by Proposition 5, there is a dangerous set Z containing uk and uk+1. If uk+1 ∉ X and uk ∉ Y and there is some ui ∉ Z, then X, Y and Z form a 3-dangerous-set structure with u = ui, v = uk, w = uk+1. Hence either X, Y or Z contains U.
Proof. We consider three cases based on the size of C. When |C| = 0, we simply
assign C = {u}. When |C| = 1, pick the vertex v ∈ C, and split-off (u, v) to
capacity. Either case (1) applies when either u or v becomes void, or case (2)
applies in the resulting graph after (u, v) is split-off to capacity. Hence, when
|C| ≤ 1, either case (1) or case (2) applies after only one splitting-off attempt.
that Trmax has at least two nodes. Let X be the node in Trmax that contains x
in G, and U1 , . . . , Up be the nodes adjacent to X in Trmax , and let XU1 be the
edge in Trmax with largest weight among XUi for 1 ≤ i ≤ p. See Figure (a).
[Figure: (a) the node X of Trmax containing x, with adjacent nodes U1, U2, . . . , Up; (b) a node t with adjacent nodes W1, . . . , Wq.]
Since each iteration reduces the degree of x by 2^(q−2), with at most 2^(l+1) = O(rmax) successful iterations we can then reduce d(x) to 2^(l+q−1), i.e., reduce d(x) by half. This procedure is applicable as long as q ≥ 3. Therefore, we can reduce d(x) to 2^(l+2) by using this procedure O(log n) times. The total running time is thus Õ(2^(l+1) · log n · rmax² · n) = Õ(rmax³ · n). Note that there are at most Õ(rmax) iterations and the failure probability of each iteration is at most 1/n^c. By the union bound, the probability that the above randomized algorithm fails is at most 1/n^(c−1). Therefore, with high probability, the algorithm succeeds in Õ(rmax³ · n) time to reduce d(x) to O(rmax). Since the correctness of the solution can be verified by a Gomory-Hu tree, this also gives a Las Vegas algorithm with the same expected running time.
Lemma 16 ([2] Lemma 5.4, [24] Lemma 2.6). If |DP | ≥ 3, then inequal-
ity (4a) holds for every X, Y ∈ DP . Furthermore, X ∩ Y = {v} and is a tight
set for any X, Y ∈ DP .
Lemma 17. |DP | ≤ rmax − 1 when |DP | ≥ 3.
Proof. By Lemma 16, we have X ∩ Y = {v} for any X, Y ∈ DP . For each set
X ∈ DP , we have d(x, v) ≥ 1 and d(x, X − v) ≥ 1 by the minimality of DP .
Therefore, we must have d(v, X − v) ≥ 1 by Proposition 15. By Lemma 16, X − v
and Y − v are disjoint for each pair X, Y ∈ DP . Since d(v, X − v) ≥ 1 for each
X ∈ DP and d(x, v) ≥ 1, it follows that |DP | ≤ d(v) − 1. By Lemma 16, {v} is
a tight set, and thus |DP | ≤ d(v) − 1 ≤ rmax − 1.
An Inductive Argument: The goal is to prove that |P | ≤ rmax − 1 + |DP |.
By Lemma 17, this holds if d(x, X − v) = 1 for every dangerous set X ∈ DP .
Hence we assume that there is a dangerous set A ∈ DP with d(x, A − v) ≥ 2;
this property will only be used at the very end of the proof. By Lemma 16,
inequality (4a) holds for A and B for every B ∈ DP . By the minimality of DP ,
there exists an x-neighbor a ∈ A which is not contained in any other set in DP.
Similarly, there exists b ∈ B which is not contained in any other set in DP . The
following lemma shows that the edge pair (xa, xb) is admissible.
Lemma 18. For any A, B ∈ DP satisfying inequality (4a), an edge pair (xa, xb)
is admissible if a ∈ A − B and b ∈ B − A.
Proof. Suppose, by way of contradiction, that (xa, xb) is non-admissible. Then, by Proposition 5, there exists a maximal dangerous set C containing a and b. We claim that v ∈ C; otherwise there exists a 3-dangerous-set structure, contradicting Lemma 7. Then d(x, A ∩ C) ≥ d(x, {v, a}) ≥ 2, and so inequality (4b) cannot hold for A and C, since 1 + 1 ≥ s(A) + s(C) ≥ s(A − C) + s(C − A) + 2d̄(A, C) ≥ 0 + 0 + 2 · 2. Therefore, inequality (4a) must hold for A and C. Since A and C are maximal dangerous sets, A ∪ C cannot be a dangerous set, and thus 1 + 1 ≥ s(A) + s(C) ≥ s(A ∪ C) + s(A ∩ C) + 2d(A, C) ≥ 2 + s(A ∩ C) + 0, which implies that A ∩ C is a tight set; but this contradicts the assumption that each tight set is a singleton, as {v, a} ⊆ A ∩ C.
After splitting-off (xa, xb), let the resulting graph be G′ and let P′ = P − {xa, xb}. Clearly, since each edge in P′ is a non-admissible partner of xv in G, every edge in P′ is still a non-admissible partner of xv in G′. Furthermore, by contracting non-trivial tight sets in G′, each edge in P′ is still a non-admissible partner of xv by Lemma 6. Hence we assume all tight sets in G′ are singletons. Let D′P be a minimal set of maximal dangerous sets such that (i) each set D ∈ D′P covers the edge xv and (ii) each edge in P′ is covered by some set D ∈ D′P. The following lemma shows that there exists D′P with |D′P| ≤ |DP| − 2.
Lemma 19. When |DP| ≥ 3, the edges in P′ can be covered by a set D′P of maximal dangerous sets in G′ such that (i) each set in D′P covers xv, (ii) each edge in P′ is covered by some set in D′P, and (iii) |D′P| ≤ |DP| − 2.
Proof. We will use the dangerous sets in DP to construct D′P. Since each pair of sets in DP satisfies inequality (4a), we have s(A ∪ D) = 2 before splitting-off (xa, xb) for each D ∈ DP. Also, before splitting-off (xa, xb), for A, B, C ∈ DP, inequality (4b) cannot hold for A ∪ B and C because 2 + 1 = s(A ∪ B) + s(C) ≥ s((A ∪ B) − C) + s(C − (A ∪ B)) + 2d̄(A ∪ B, C) ≥ 2 + 0 + 2 · 1, where the last inequality follows since v ∈ A ∩ B ∩ C and (A ∪ B) − C is not dangerous (as it covers the admissible edge pair (xa, xb)). Therefore inequality (4a) must hold for A ∪ B and C, which implies that s(A ∪ B ∪ C) ≤ 3 since 2 + 1 = s(A ∪ B) + s(C) ≥ s((A ∪ B) ∪ C) + s((A ∪ B) ∩ C). For A and B as defined before Lemma 18, since s(A ∪ B) = 2 before splitting-off (xa, xb), A ∪ B becomes a tight set after splitting-off (xa, xb). For any other set C ∈ DP − A − B, since s(A ∪ B ∪ C) ≤ 3 before splitting-off (xa, xb), A ∪ B ∪ C becomes a dangerous set after splitting-off (xa, xb). Hence, after splitting-off (xa, xb) and contracting the tight set A ∪ B into v, each set in DP − A − B becomes a dangerous set. Then D′P = DP − A − B is a set of dangerous sets covering each edge in P′, satisfying properties (i)-(iii). By replacing a dangerous set C ∈ D′P by a maximal dangerous set C′ ⊇ C and removing redundant dangerous sets in D′P so that it minimally covers P′, we have found D′P as required by the lemma.
Recall that we chose A with d(x, A − v) ≥ 2, and hence d(x, v) ≥ 2 after the splitting-off and contraction of tight sets. Therefore, inequality (4a) holds for every two maximal dangerous sets in D′P. By induction, when |D′P| ≥ 3, we have |P| = |P′| + 2 ≤ rmax − 1 + |D′P| + 2 ≤ rmax − 1 + |DP|. In the base case when |D′P| = 2 and A, B ∈ D′P satisfy (4a), the same argument as in Lemma 19 can be used to show that the edges in P′ are covered by one tight set after splitting-off (xa, xb), and thus |P| = |P′| + 2 ≤ rmax − 1 + 2 ≤ rmax − 1 + |DP|. This completes the proof that |P| ≤ rmax − 1 + |DP|, proving the theorem.
5 Concluding Remarks
Theorem 3 can be applied to constrained edge splitting-off problems, and gives additive approximation algorithms for constrained augmentation problems. The efficient algorithms can also be adapted to these problems. We refer the reader to [25] for these results.
References
1. Bang-Jensen, J., Frank, A., Jackson, B.: Preserving and increasing local edge-
connectivity in mixed graphs. SIAM J. Disc. Math. 8(2), 155–178 (1995)
2. Bang-Jensen, J., Jordán, T.: Edge-connectivity augmentation preserving simplicity.
SIAM Journal on Discrete Mathematics 11(4), 603–623 (1998)
3. Bernáth, A., Király, T.: A new approach to splitting-off. In: Lodi, A., Panconesi, A.,
Rinaldi, G. (eds.) IPCO 2008. LNCS, vol. 5035, pp. 401–415. Springer, Heidelberg
(2008)
4. Benczúr, A.A., Karger, D.R.: Augmenting undirected edge connectivity in Õ(n²) time. Journal of Algorithms 37(1), 2–36 (2000)
5. Bhalgat, A., Hariharan, R., Kavitha, T., Panigrahi, D.: An Õ(mn) Gomory-Hu
tree construction algorithm for unweighted graphs. In: STOC 2007, pp. 605–614
(2007)
6. Bhalgat, A., Hariharan, R., Kavitha, T., Panigrahi, D.: Fast edge splitting and
Edmonds’ arborescence construction for unweighted graphs. In: SODA ’08, pp.
455–464 (2008)
7. Chan, Y.H., Fung, W.S., Lau, L.C., Yung, C.K.: Degree Bounded Network Design
with Metric Costs. In: FOCS ’08, pp. 125–134 (2008)
8. Cheng, E., Jordán, T.: Successive edge-connectivity augmentation problems. Math-
ematical Programming 84(3), 577–593 (1999)
9. Frank, A.: Augmenting graphs to meet edge-connectivity requirements. SIAM
Journal on Discrete Mathematics 5(1), 25–53 (1992)
10. Frank, A.: On a theorem of Mader. Ann. of Disc. Math. 101, 49–57 (1992)
11. Frank, A., Király, Z.: Graph orientations with edge-connection and parity con-
straints. Combinatorica 22(1), 47–70 (2002)
12. Gabow, H.N.: Efficient splitting off algorithms for graphs. In: STOC ’94, pp. 696–
705 (1994)
13. Goemans, M.X., Bertsimas, D.J.: Survivable networks, linear programming relax-
ations and the parsimonious property. Math. Prog. 60(1), 145–166 (1993)
14. Gomory, R.E., Hu, T.C.: Multi-terminal network flows. Journal of the Society for
Industrial and Applied Mathematics 9(4), 551–570 (1961)
15. Hariharan, R., Kavitha, T., Panigrahi, D.: Efficient algorithms for computing all
low st edge connectivities and related problems. In: SODA ’07, pp. 127–136 (2007)
16. Jordán, T.: On minimally k-edge-connected graphs and shortest k-edge-connected
Steiner networks. Discrete Applied Mathematics 131(2), 421–432 (2003)
17. Lau, L.C.: An approximate max-Steiner-tree-packing min-Steiner-cut theorem.
Combinatorica 27(1), 71–90 (2007)
18. Lovász, L.: Lecture. Conference of Graph Theory, Prague (1974); See also Combi-
natorial problems and exercises. North-Holland (1979)
19. Mader, W.: A reduction method for edge-connectivity in graphs. Annals of Discrete
Mathematics 3, 145–164 (1978)
20. Nagamochi, H.: A fast edge-splitting algorithm in edge-weighted graphs. IEICE
Transactions on Fundamentals of Electronics, Communications and Computer Sci-
ences, 1263–1268 (2006)
21. Nagamochi, H., Ibaraki, T.: Linear time algorithm for finding a sparse k-connected
spanning subgraph of a k-connected graph. Algorithmica 7(1), 583–596 (1992)
22. Nagamochi, H., Ibaraki, T.: Deterministic Õ(nm) time edge-splitting in undirected graphs. Journal of Combinatorial Optimization 1(1), 5–46 (1997)
23. Nash-Williams, C.S.J.A.: On orientations, connectivity and odd vertex pairings in
finite graphs. Canadian Journal of Mathematics 12, 555–567 (1960)
24. Szigeti, Z.: Edge-splittings preserving local edge-connectivity of graphs. Discrete
Applied Mathematics 156(7), 1011–1018 (2008)
25. Yung, C.K.: Edge splitting-off and network design problems. Master thesis, The
Chinese University of Hong Kong (2009)
On Generalizations of Network Design Problems with Degree Bounds

N. Bansal, R. Khandekar, J. Könemann, V. Nagarajan, and B. Peis
Abstract. Iterative rounding and relaxation have arguably become the method of
choice in dealing with unconstrained and constrained network design problems.
In this paper we extend the scope of the iterative relaxation method in two direc-
tions: (1) by handling more complex degree constraints in the minimum spanning
tree problem (namely laminar crossing spanning tree), and (2) by incorporating
‘degree bounds’ in other combinatorial optimization problems such as matroid
intersection and lattice polyhedra. We give new or improved approximation al-
gorithms, hardness results, and integrality gaps for these problems.
1 Introduction
Iterative rounding and relaxation have arguably become the method of choice in dealing
with unconstrained and constrained network design problems. Starting with Jain’s ele-
gant iterative rounding scheme for the generalized Steiner network problem in [14], an
extension of this technique (iterative relaxation) has more recently led to breakthrough
results in the area of constrained network design, where a number of linear constraints
are added to a classical network design problem. Such constraints arise naturally in
a wide variety of practical applications, and model limitations in processing power,
bandwidth or budget. The design of powerful techniques to deal with these problems is
therefore an important goal.
The most widely studied constrained network design problem is the minimum-cost
degree-bounded spanning tree problem. In an instance of this problem, we are given an
undirected graph, non-negative costs for the edges, and positive, integral degree-bounds
for each of the nodes. The problem is easily seen to be NP-hard, even in the absence
of edge-costs, since finding a spanning tree with maximum degree two is equivalent to
finding a Hamiltonian Path. A variety of techniques have been applied to this problem
[5,6,11,17,18,23,24], culminating in Singh and Lau’s breakthrough result in [27]. They
presented an algorithm that computes a spanning tree of at most optimum cost whose
degree at each vertex v exceeds its bound by at most 1, using the iterative relaxation
framework developed in [20,27].
The iterative relaxation technique has been applied to several constrained network
design problems: spanning tree [27], survivable network design [20,21], directed graphs
with intersecting and crossing super-modular connectivity [20,2]. It has also been ap-
plied to degree bounded versions of matroids and submodular flow [15].
F. Eisenbrand and B. Shepherd (Eds.): IPCO 2010, LNCS 6080, pp. 110–123, 2010.
c Springer-Verlag Berlin Heidelberg 2010
On Generalizations of Network Design Problems with Degree Bounds 111
In this paper we further extend the applicability of iterative relaxation, and obtain
new or improved bicriteria approximation results for minimum crossing spanning tree
(MCST), crossing matroid intersection, and crossing lattice polyhedra. We also provide
hardness results and integrality gaps for these problems.
Notation. As is usual, when dealing with an undirected graph G = (V, E), for any S ⊆ V we let δG(S) := {(u, v) ∈ E | u ∈ S, v ∉ S}. When the graph is clear from context, the subscript is dropped. A collection {U1, · · · , Ut} of vertex-sets is called laminar if for every pair Ui, Uj in this collection, we have Ui ⊆ Uj, Uj ⊆ Ui, or Ui ∩ Uj = ∅.
Ui ∩ Uj = ∅. A (ρ, f (b)) approximation for minimum cost degree bounded problems
refers to a solution that (1) has cost at most ρ times the optimum that satisfies the degree
bounds, and (2) satisfies the relaxed degree constraints in which a bound b is replaced
with a bound f (b).
results also cannot be used to obtain a small additive violation, especially if b is large. In particular, both the results [4,8] for general MCST are based on the natural LP relaxation, for which there is an integrality gap of b + Ω(√n) even without regard to costs and when m = O(n) [26] (see also [3]). On the other hand, Theorem 1 shows that a purely additive O(log n) guarantee on degree (relative to the LP relaxation and even in the presence of costs) is indeed achievable for MCST, when the degree-bounds arise from a laminar cut-family.
The algorithm in Theorem 1 is based on iterative relaxation and uses two main new
ideas. Firstly, we drop a carefully chosen constant fraction of degree-constraints in each
iteration. This is crucial as it can be shown that dropping one constraint at a time as in
the usual applications of iterative relaxation can indeed lead to a degree violation of
Ω(Δ). Secondly, the algorithm does not just drop degree constraints, but in some itera-
tions it also generates new degree constraints, by merging existing degree constraints.
All previous applications of iterative relaxation to constrained network design treat
connectivity and degree constraints rather asymmetrically. While the structure of the
connectivity constraints of the underlying LP is used crucially (e.g., in the ubiquitous
uncrossing argument), the handling of degree constraints is remarkably simple. Con-
straints are dropped one by one, and the final performance of the algorithm is good only
if the number of side constraints is small (e.g., in recent work by Grandoni et al. [12]),
or if their structure is simple (e.g., if the ‘frequency’ of each element is small). In con-
trast, our algorithm for laminar MCST exploits the structure of degree constraints in a
non-trivial manner.
Hardness Results. We obtain the following hardness of approximation for the general
MCST problem (and its matroid counterpart). In particular this rules out any algorithm
for MCST that has additive constant degree violation, even without regard to costs.
Theorem 2. Unless NP has quasi-polynomial time algorithms, the MCST problem admits no polynomial time O(log^α m) additive approximation for the degree bounds for some constant α > 0; this holds even when there are no costs.

The proof for this theorem is given in Section 3, and uses a two-step reduction from the well-known Label Cover problem. First, we show hardness for a uniform matroid instance. In a second step, we then demonstrate how this implies the result for MCST claimed in Theorem 2.
Note that our hardness bound nearly matches the result obtained by Chekuri et al. in [8]. We note however that in terms of purely additive degree guarantees, a large gap remains. As noted above, there is a much stronger lower bound of b + Ω(√n) for LP-based algorithms [26] (even without regard to costs), which is based on discrepancy. In light of the small number of known hardness results for discrepancy type problems, it is unclear how our bounds for MCST could be strengthened.
Degree Bounds in More General Settings. We consider crossing versions of other clas-
sic combinatorial optimization problems, namely matroid intersection and lattice poly-
hedra. We discuss our results briefly and defer the proofs to the full version of the
paper [3].
We remark that there are alternate definitions of matroid intersection (e.g., see Schri-
jver [25]) and that our result below extends to those as well.
Let Δ = maxe∈E |{i ∈ [m] | e ∈ Ei }| be the largest number of sets Ei that any
element of E belongs to, and refer to it as frequency.
Theorem 3. Any optimal basic solution x∗ of the linear relaxation of the minimum
crossing matroid intersection problem can be rounded into an integral solution x̂ such
that x̂(S) ≥ max{r1 (S), r2 (S)} for all S ⊆ E and
The algorithm for this theorem again uses iterative relaxation, and its proof is based on
a ‘fractional token’ counting argument similar to the one used in [2].
An interesting special case is for the bounded-degree arborescence problem (where
Δ = 1). As the set of arborescences in a digraph can be expressed as the intersection
of partition and graphic matroids, Theorem 3 readily implies a (2, 2b) approximation
for this problem. This is an improvement over the previously best-known (2, 2b + 2)
bound [20] for this problem.
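As an aside, the matroid-intersection view of arborescences is easy to verify computationally; this small checker (ours, not from the paper) confirms that a set of arcs is a common basis of the partition matroid (in-degree one at every non-root node) and the graphic matroid (underlying edges form a spanning tree).

```python
def is_arborescence(arcs, nodes, root):
    """Check the matroid-intersection characterization of an r-arborescence."""
    indeg = {v: 0 for v in nodes}
    for u, v in arcs:
        indeg[v] += 1
    # partition matroid basis: in-degree 0 at the root, 1 elsewhere
    if indeg[root] != 0 or any(indeg[v] != 1 for v in nodes if v != root):
        return False
    # graphic matroid basis: n - 1 arcs whose undirected version is connected
    if len(arcs) != len(nodes) - 1:
        return False
    adj = {v: [] for v in nodes}
    for u, v in arcs:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {root}, [root]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(nodes)

assert is_arborescence([("r", "a"), ("a", "b")], {"r", "a", "b"}, "r")
```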
The bounded-degree arborescence problem is potentially of wider interest since it is
a relaxation of ATSP, and it is hoped that ideas from this problem lead to new ideas
for ATSP. In fact Theorem 3 also implies an improved (2, 2b)-approximation for the
bounded-degree arborescence packing problem, where the goal is to pack a given num-
ber of arc-disjoint arborescences while satisfying degree-bounds on vertices (arbores-
cence packing can again be phrased as matroid intersection). The previously best known
bound for this problem was (2, 2b + 4) [2]. We also give the following integrality gap.
Theorem 4. For any ε > 0, there exists an instance of unweighted minimum crossing arborescence for which the LP is feasible, and any integral solution must violate the bound on some set Ei by a multiplicative factor of at least 2 − ε. Moreover, this instance has Δ = 1, and just one non-degree constraint.
Thus Theorem 3 is the best one can hope for, relative to the LP relaxation. First,
Theorem 4 implies that the multiplicative factor in the degree cannot be improved be-
yond 2 (even without regard to costs). Second, the lower bound for arborescences with
costs presented in [2] implies that no cost-approximation ratio better than 2 is possible,
without violating degrees by a factor greater than 2.
Crossing Lattice Polyhedra. Classical lattice polyhedra form a unified framework for
various discrete optimization problems and go back to Hoffman and Schwartz [13] who
proved their integrality. They are polyhedra of type
setting. [15] also considered a degree-bounded version of the submodular flow problem and gave a (1, b + 1) approximation guarantee.

The bounded-degree arborescence problem was considered in Lau et al. [20], where a (2, 2b + 2) approximation guarantee was obtained. Subsequently Bansal et al. [2] designed an algorithm that for any 0 < ε ≤ 1/2, achieves a (1/ε, bv/(1 − ε) + 4) approximation guarantee. They also showed that this guarantee is the best one can hope for via the natural LP relaxation (for every 0 < ε ≤ 1/2). In the absence of edge-costs, [2] gave an algorithm that violates degree bounds by at most an additive two. Recently Nutov [22] studied the arborescence problem under weighted degree constraints, and gave a (2, 5b) approximation for it.
Lattice polyhedra were first investigated by Hoffman and Schwartz [13] and the nat-
ural LP relaxation was shown to be totally dual integral. Even though greedy-type algo-
rithms are known for all examples mentioned earlier, so far no combinatorial algorithm
has been found for lattice polyhedra in general. Two-phase greedy algorithms have been
established only in cases where an underlying rank function satisfies a monotonicity
property [10], [9].
min Σ_{e∈E} ce xe
s.t. x(E(V)) = |V| − |F| − 1
x(E(U)) ≤ |U| − |F(U)| − 1  ∀U ⊂ V     (1)
x(δE(S)) ≤ b(S)  ∀S ∈ L
xe ≥ 0  ∀e ∈ E

Let x be an optimal extreme point solution to this LP. By reducing degree bounds b(S), if needed, we assume that x satisfies all degree bounds at equality (the degree bounds may therefore be fractional-valued). Let α := 24.
Definition 2. An edge e ∈ E is said to be local for S ∈ L if e has at least one end-point
in S but is neither in E(C) nor in δ(C) ∩ δ(S) for any grandchild C of S. Let local(S)
denote the set of local edges for S. A node S ∈ L is said to be good if |local(S)| ≤ α.
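Definition 2 can be transcribed directly; the helper below is our own sketch, assuming that S, its grandchildren, and the edges are given explicitly, and assuming a simple graph so that an edge is identified by its endpoint pair.

```python
def local_edges(S, grandchildren, edges):
    """local(S): edges with an endpoint in S that lie neither inside some
    grandchild C of S nor in delta(C) ∩ delta(S) for such a C.
    S and each C are sets of vertices; edges is a list of pairs (u, v)."""
    S = set(S)

    def delta(X):
        return {frozenset(e) for e in edges if len(set(e) & X) == 1}

    excluded = set()
    for C in grandchildren:
        C = set(C)
        excluded |= {frozenset(e) for e in edges if set(e) <= C}  # inside E(C)
        excluded |= delta(C) & delta(S)
    return [e for e in edges if set(e) & S and frozenset(e) not in excluded]
```

In the analysis below, a node S is good precisely when this list has length at most α = 24.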
[Figure: a set S, its children B1 and B2, and grandchildren C1, . . . , C4; edges in local(S) are drawn solid, non-local ones are shown dashed.]

Initially, E is the set of edges in the given graph, F ← ∅, L is the original laminar family of vertex sets for which there are degree bounds, and an arbitrary linear ordering is chosen on the children of each node in L. In a generic iteration (F, E, L, b), the algorithm performs one of the following steps (see also Figure 1):
Assuming that one of the above steps applies at each iteration, the algorithm terminates when E = ∅ and outputs the final set F as a solution. It is clear that the algorithm outputs a spanning tree of G. An inductive argument (see e.g. [20]) can be used to show that the LP (1) is feasible at each iteration and that c(F) + zcur ≤ zo, where zo is the original LP value, zcur is the current LP value, and F is the chosen edge-set at the current iteration. Thus the cost of the final solution is at most the initial LP optimum zo.
Lemma 1. In each iteration, one of the four steps above applies.
Fig. 1. Examples of the degree constraint modifications DropN and DropL (left: a DropN step at a good non-leaf S; right: a DropL step at good leaves N1, . . . , N5)
Proof. Let x∗ be the optimal basic solution of (1), and suppose that the first two steps do not apply. Hence, we have 0 < x∗e < 1 for all e ∈ E. The fact that x∗ is a basic solution together with a standard uncrossing argument (e.g., see [14]) implies that x∗ is uniquely defined by

x(E(S)) = |S| − |F(S)| − 1 for S ∈ S, and x(δE(S)) = b(S) for S ∈ L′,

where S is a laminar subset of the tight spanning tree constraints, L′ is a subset of tight degree constraints, and |E| = |S| + |L′|.

A simple counting argument (see, e.g., [27]) shows that there are at least 2 edges induced on each S ∈ S that are not induced on any of its children; so 2|S| ≤ |E|. Thus we obtain |E| ≤ 2|L′| ≤ 2|L|.
From the definition of local edges, we get that any edge e = (u, v) is local to at most the following six sets: the smallest set S1 ∈ L containing u, the smallest set S2 ∈ L containing v, the parents P1 and P2 of S1 and S2 respectively, the least common ancestor L of P1 and P2, and the parent of L. Thus Σ_{S∈L} |local(S)| ≤ 6|E|. From the above, we conclude that Σ_{S∈L} |local(S)| ≤ 12|L|. Thus at least |L|/2 sets S ∈ L must have |local(S)| ≤ α = 24, i.e., must be good. Now either at least |L|/4 of them must be non-leaves or at least |L|/4 of them must be leaves. In the first case, step 3 applies, and in the second case, step 4 applies.
It remains to bound the violation in the degree constraints, which turns out to be rather
challenging. We note that this is unlike usual applications of iterative rounding/relaxation,
where the harder part is in showing that one of the iterative steps applies.
It is clear that the algorithm reduces the size of L by at least |L|/8 in each DropN or
DropL iteration. Since the initial number of degree constraints is at most 2n − 1, we get
the following lemma.
Lemma 2. The number of drop iterations (DropN and DropL) is T := O(log n).
Performance guarantee for degree constraints. We begin with some notation. The
iterations of the algorithm are broken into periods between successive drop iterations:
there are exactly T drop-iterations (Lemma 2). In what follows, the t-th drop iteration
is called round t. The time t refers to the instant just after round t; time 0 refers to the
start of the algorithm. At any time t, consider the following parameters.
– Lt denotes the laminar family of degree constraints.
– Et denotes the undecided edge set, i.e., the support of the current LP optimal solution.
– For any set B of consecutive siblings in Lt, Bnd(B, t) = Σ_{N∈B} b(N) equals the sum of the residual degree bounds on nodes of B.
– For any set B of consecutive siblings in Lt, Inc(B, t) equals the number of edges from δEt(∪_{N∈B} N) included in the final solution.
Recall that b denotes the residual degree bounds at any point in the algorithm. The
following lemma is the main ingredient in bounding the degree violation.
Lemma 3. For any set B of consecutive siblings in Lt (at any time t), Inc(B, t) ≤
Bnd(B, t) + 4α · (T − t).
Observe that this implies the desired bound on each original degree constraint S: using
t = 0 and B = {S}, the violation is bounded by an additive 4α · T term.
Proof. The proof of this lemma is by induction on T − t. The base case t = T is trivial since the only iterations after this correspond to including 1-edges: hence there is no violation in any degree bound, i.e., Inc({N}, T) ≤ b(N) for all N ∈ LT. Hence for any B ⊆ LT, Inc(B, T) ≤ Σ_{N∈B} Inc({N}, T) ≤ Σ_{N∈B} b(N) = Bnd(B, T).
Now suppose t < T , and assume the lemma for t + 1. Fix a consecutive B ⊆ Lt . We
consider different cases depending on what kind of drop occurs in round t + 1.
DropN round. Here either all nodes in B get dropped or none gets dropped.
Case 1: None of B is dropped. Then observe that B is consecutive in Lt+1 as well;
so the inductive hypothesis implies Inc(B, t + 1) ≤ Bnd(B, t + 1) + 4α · (T − t − 1).
Since the only iterations between round t and round t + 1 involve edge-fixing, we have
Inc(B, t) ≤ Bnd(B, t) − Bnd(B, t + 1) + Inc(B, t + 1) ≤ Bnd(B, t) + 4α · (T − t − 1) ≤
Bnd(B, t) + 4α · (T − t).
Case 2: All of B is dropped. Let C denote the set of all children (in Lt) of nodes in B. Note that C consists of consecutive siblings in Lt+1, and inductively Inc(C, t + 1) ≤ Bnd(C, t + 1) + 4α · (T − t − 1). Let S ∈ Lt denote the parent of the B-nodes; so the C-nodes are grandchildren of S in Lt. Let x denote the optimal LP solution just before round t + 1 (when the degree bounds are still given by Lt), and let H = Et+1 be the support edges of x. At that point, we have b(N) = x(δ(N)) for all N ∈ B ∪ C. Also let Bnd′(B, t + 1) := Σ_{N∈B} b(N) be the sum of bounds on B-nodes just before round t + 1. Since S is a good node in round t + 1, |Bnd′(B, t + 1) − Bnd(C, t + 1)| = |Σ_{N∈B} b(N) − Σ_{M∈C} b(M)| = |Σ_{N∈B} x(δ(N)) − Σ_{M∈C} x(δ(M))| ≤ 2α. The last inequality follows since S is good; the factor of 2 appears since some edges, e.g., the edges between two children or two grandchildren of S, may get counted twice. Note also that the symmetric difference of δH(∪_{N∈B} N) and δH(∪_{M∈C} M) is contained in local(S). Thus δH(∪_{N∈B} N) and δH(∪_{M∈C} M) differ in at most α edges.

Again since all iterations between time t and t + 1 are edge-fixing:
The first inequality follows from simple counting; the second uses Claim 6, the third is the induction hypothesis (since C is consecutive), and the fourth is Bnd(C, t + 1) ≤ Bnd′(B, t + 1) + 2α (from above).
This completes the proof of the inductive step and hence Lemma 3.
3 Hardness Results
We now prove Theorem 2. The first step to proving this result is a hardness for the more general minimum crossing matroid basis problem: given a matroid M on a ground set V of elements, a cost function c : V → R+, and degree bounds specified by pairs (Ei, bi) for i = 1, . . . , m (where each Ei ⊆ V and bi ∈ N), find a minimum cost basis I in M such that |I ∩ Ei| ≤ bi for all i ∈ [m].
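For intuition only, here is a brute-force solver (ours; exponential in the ground set and usable only on toy instances) for the crossing version of the uniform matroid, where a basis is simply any t-subset.

```python
from itertools import combinations

def min_cost_crossing_basis(ground, t, cost, bounds):
    """Exhaustive search: a basis of the rank-t uniform matroid is any
    t-subset of ground; bounds is a list of pairs (Ei, bi)."""
    best = None
    for basis in combinations(ground, t):
        if any(len(set(basis) & set(Ei)) > bi for Ei, bi in bounds):
            continue  # violates a crossing degree bound
        c = sum(cost[e] for e in basis)
        if best is None or c < best[0]:
            best = (c, set(basis))
    return best

# tiny example: 4 elements, rank 2, one crossing bound |B ∩ {a, b}| <= 1
print(min_cost_crossing_basis("abcd", 2, {"a": 1, "b": 2, "c": 3, "d": 4},
                              [({"a", "b"}, 1)]))   # -> (4, {'a', 'c'})
```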
Theorem 7. Unless NP has quasi-polynomial time algorithms, the unweighted minimum crossing matroid basis problem admits no polynomial time O(log^c m) additive approximation for the degree bounds for some fixed constant c > 0.
Proof. We reduce from the label cover problem [1]. The input is a graph G = (U, E)
where the vertex set U is partitioned into pieces U1 , · · · , Un each having size q, and all
edges in E are between distinct pieces. We say that there is a superedge between Ui and
Uj if there is an edge connecting some vertex in Ui to some vertex in Uj . Let t denote
the total number of superedges; i.e.,

t = |{ {i, j} ⊆ [n] : there is an edge in E between Ui and Uj }|.
The goal is to pick one vertex from each part {Ui }ni=1 so as to maximize the number of
induced edges. This is called the value of the label cover instance and is at most t.
It is well known that there exists a universal constant γ > 1 such that for every k ∈ N, there is a reduction from any instance of SAT (having size N) to a label cover instance ⟨G = (U, E), q, t⟩ such that:
– If the SAT instance is satisfiable, the label cover instance has optimal value t.
– If the SAT instance is not satisfiable, the label cover instance has optimal value < t/γ^k.
– |G| = N^O(k), q = 2^k, |E| ≤ t², and the reduction runs in time N^O(k).
We consider a uniform matroid M with rank t on ground set E (recall that any subset
of t edges is a basis in a uniform matroid). We now construct a crossing matroid basis
instance I on M. There is a set of degree bounds corresponding to each i ∈ [n]: for
every collection C of edges incident to vertices in Ui such that no two edges in C are
incident to the same vertex in Ui , there is a degree bound in I requiring at most one
element to be chosen from C. Note that the number of degree bounds m is at most |E|^q ≤ N^O(k·2^k). The following claim links the SAT and crossing matroid instances. Its proof is deferred to the full version of this paper.
Claim 8. [Yes instance] If the SAT instance is satisfiable, there is a basis (i.e., a subset B ⊆ E with |B| = t) satisfying all degree bounds.
[No instance] If the SAT instance is unsatisfiable, every subset B′ ⊆ E with |B′| ≥ t/2 violates some degree bound by an additive ρ = γ^(k/2)/√2.
The steps described in the above reduction can be done in time polynomial in m and |G|. Also, instead of randomly choosing vertices from the sets Wi, we can use conditional expectations to derive a deterministic algorithm that recovers at least t/ρ² edges. Setting k = Θ(log log N) (recall that N is the size of the original SAT instance), we obtain an instance of bounded-degree matroid basis of size max{m, |G|} = N^(log^a N) and ρ = log^b N, where a, b > 0 are constants. Note that log m = log^(a+1) N, which implies ρ = log^c m for c = b/(a + 1) > 0, a constant. Thus it follows that for this constant c > 0 the bounded-degree matroid basis problem has no polynomial time O(log^c m) additive approximation for the degree bounds, unless NP has quasi-polynomial time algorithms.
We now prove Theorem 2.

Proof. [Proof of Theorem 2] We show how the bases of a uniform matroid can be represented in a suitable instance of the crossing spanning tree problem. Let the uniform matroid from Theorem 7 consist of e elements and have rank t ≤ e; recall that t ≥ √e and clearly m ≤ 2^e. We construct a graph as in Figure 2, with vertices v1, · · · , ve corresponding to elements in the uniform matroid. Each vertex vi is connected to the root r by two vertex-disjoint paths: ⟨vi, ui, r⟩ and ⟨vi, wi, r⟩. There are no costs in this instance. Corresponding to each degree bound (in the uniform matroid) of b(C) on a subset C ⊆ [e], there is a constraint to pick at most |C| + b(C) edges from δ({ui | i ∈ C}). Additionally, there is a special degree bound of 2e − t on the edge-set E′ = ∪_{i=1}^{e} δ(wi); this corresponds to picking a basis in the uniform matroid.
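The construction can be generated mechanically; the sketch below is our own rendering, with string vertex labels and an explicit list of derived degree constraints as illustrative choices.

```python
def crossing_tree_instance(e, t, matroid_bounds):
    """matroid_bounds: list of (C, b) with C a subset of {1, ..., e}.
    Returns the edge list and the degree constraints (edge_subset, bound)."""
    edges = []
    for i in range(1, e + 1):
        edges += [("r", f"u{i}"), (f"u{i}", f"v{i}"),
                  ("r", f"w{i}"), (f"w{i}", f"v{i}")]
    constraints = []
    for C, b in matroid_bounds:
        us = {f"u{i}" for i in C}
        cut = [ed for ed in edges if len(set(ed) & us) == 1]  # delta({ui : i in C})
        constraints.append((cut, len(C) + b))
    # the special bound on E' = union of delta(wi): at most 2e - t edges
    special = [ed for ed in edges if any(y[0] == "w" for y in ed)]
    constraints.append((special, 2 * e - t))
    return edges, constraints
```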
[Figure 2: the reduction graph, with root r and each vi joined to r through ui and through wi.]

Observe that for each i ∈ [e], any spanning tree must choose exactly three edges amongst {(r, ui), (ui, vi), (r, wi), (wi, vi)}; in fact any three such edges suffice. Hence every spanning tree T in this graph
Since B∗ satisfies its degree-bounds, T∗ satisfies all degree bounds derived from the crossing matroid instance. For the special degree bound on E′, note that |T∗ ∩ E′| = 2e − |B∗| = 2e − t; so this is also satisfied. Thus there is a spanning tree satisfying all the degree bounds.

No instance. Every subset B′ ⊆ [e] with |B′| ≥ t/2 (i.e., a near basis) violates some degree bound by an additive ρ = Ω(log^c m) term, where c > 0 is a fixed constant. Consider any spanning tree T that corresponds to a subset X ⊆ [e] as described above.
References
1. Arora, S., Babai, L., Stern, J., Sweedyk, Z.: The hardness of approximate optima in lattices,
codes, and systems of linear equations. J. Comput. Syst. Sci. 54(2), 317–331 (1997)
2. Bansal, N., Khandekar, R., Nagarajan, V.: Additive guarantees for degree bounded network
design. In: STOC, pp. 769–778 (2008)
3. Bansal, N., Khandekar, R., Könemann, J., Nagarajan, V., Peis, B.: On Generalizations of Network Design Problems with Degree Bounds (full version), Technical Report (2010)
4. Bilo, V., Goyal, V., Ravi, R., Singh, M.: On the crossing spanning tree problem. In: Jansen,
K., Khanna, S., Rolim, J.D.P., Ron, D. (eds.) RANDOM 2004 and APPROX 2004. LNCS,
vol. 3122, pp. 51–60. Springer, Heidelberg (2004)
5. Chaudhuri, K., Rao, S., Riesenfeld, S., Talwar, K.: What would Edmonds do? Augment-
ing paths and witnesses for degree-bounded MSTs. In: Chekuri, C., Jansen, K., Rolim,
J.D.P., Trevisan, L. (eds.) APPROX 2005 and RANDOM 2005. LNCS, vol. 3624, pp. 26–39.
Springer, Heidelberg (2005)
6. Chaudhuri, K., Rao, S., Riesenfeld, S., Talwar, K.: Push relabel and an improved approxima-
tion algorithm for the bounded-degree MST problem. In: Bugliesi, M., Preneel, B., Sassone,
V., Wegener, I. (eds.) ICALP 2006. LNCS, vol. 4051, pp. 191–201. Springer, Heidelberg
(2006)
7. Chazelle, B.: The Discrepancy Method: Randomness and Complexity. Cambridge University
Press, Cambridge (2000)
8. Chekuri, C., Vondrák, J., Zenklusen, R.: Dependent Randomized Rounding for Matroid Poly-
topes and Applications (2009), http://arxiv.org/abs/0909.4348
9. Faigle, U., Peis, B.: Two-phase greedy algorithms for some classes of combinatorial linear
programs. In: SODA, pp. 161–166 (2008)
10. Frank, A.: Increasing the rooted connectivity of a digraph by one. Math. Programming 84,
565–576 (1999)
11. Goemans, M.X.: Minimum Bounded-Degree Spanning Trees. In: FOCS, pp. 273–282 (2006)
12. Grandoni, F., Ravi, R., Singh, M.: Iterative Rounding for Multiobjective Optimization Prob-
lems. In: Fiat, A., Sanders, P. (eds.) ESA 2009. LNCS, vol. 5757, pp. 95–106. Springer,
Heidelberg (2009)
13. Hoffman, A., Schwartz, D.E.: On lattice polyhedra. In: Hajnal, A., Sos, V.T. (eds.) Proceed-
ings of Fifth Hungarian Combinatorial Coll, pp. 593–598. North-Holland, Amsterdam (1978)
14. Jain, K.: A factor 2 approximation algorithm for the generalized Steiner network problem.
In: Combinatorica, pp. 39–61 (2001)
15. Király, T., Lau, L.C., Singh, M.: Degree bounded matroids and submodular flows. In: Lodi,
A., Panconesi, A., Rinaldi, G. (eds.) IPCO 2008. LNCS, vol. 5035, pp. 259–272. Springer,
Heidelberg (2008)
On Generalizations of Network Design Problems with Degree Bounds 123
16. Klein, P.N., Krishnan, R., Raghavachari, B., Ravi, R.: Approximation algorithms for finding
low degree subgraphs. Networks 44(3), 203–215 (2004)
17. Könemann, J., Ravi, R.: A matter of degree: Improved approximation algorithms for degree
bounded minimum spanning trees. SIAM J. on Computing 31, 1783–1793 (2002)
18. Könemann, J., Ravi, R.: Primal-Dual meets local search: approximating MSTs with nonuni-
form degree bounds. SIAM J. on Computing 34(3), 763–773 (2005)
19. Korte, B., Vygen, J.: Combinatorial Optimization, 4th edn. Springer, New York (2008)
20. Lau, L.C., Naor, J., Salavatipour, M.R., Singh, M.: Survivable network design with degree or
order constraints (full version). In: STOC, pp. 651–660 (2007)
21. Lau, L.C., Singh, M.: Additive Approximation for Bounded Degree Survivable Network
Design. In: STOC, pp. 759–768 (2008)
22. Nutov, Z.: Approximating Directed Weighted-Degree Constrained Networks. In: Goel, A.,
Jansen, K., Rolim, J.D.P., Rubinfeld, R. (eds.) APPROX 2008 and RANDOM 2008. LNCS,
vol. 5171, pp. 219–232. Springer, Heidelberg (2008)
23. Ravi, R., Marathe, M.V., Ravi, S.S., Rosenkrantz, D.J., Hunt, H.B.: Many birds with one
stone: Multi-objective approximation algorithms. In: STOC, pp. 438–447 (1993)
24. Ravi, R., Singh, M.: Delegate and Conquer: An LP-based approximation algorithm for Min-
imum Degree MSTs. In: Bugliesi, M., Preneel, B., Sassone, V., Wegener, I. (eds.) ICALP
2006. LNCS, vol. 4051, pp. 169–180. Springer, Heidelberg (2006)
25. Schrijver, A.: Combinatorial Optimization. Springer, Heidelberg (2003)
26. Singh, M.: Personal Communication (2008)
27. Singh, M., Lau, L.C.: Approximating minimum bounded degree spanning trees to within one
of optimal. In: STOC, pp. 661–670 (2007)
A Polyhedral Study of the Mixed Integer Cut

S. Tyber and E.L. Johnson
1 Introduction
Consider the integer program

min{cx : Ax = b, x ∈ Z^n_+},  (1)

where A ∈ Z^(m×n), b ∈ Z^m, and c ∈ R^n. Given a basis B of the LP relaxation of (1), the group relaxation is obtained by relaxing non-negativity on xB, i.e.,

XGR = {x : B xB + N xN = b, xB ∈ Z^m, xN ∈ Z^(n−m)_+}.

It follows that for an integer vector xN, xB is integral if and only if N xN ≡ b (mod B); that is, N xN − b belongs to the lattice generated by the columns of B.
Consider the group G of equivalence classes of Z^n modulo B. Let N be the set of distinct equivalence classes represented by the columns of N, and let g0 be the equivalence class represented by b. The group polyhedron is given by

P(N, g0) = conv{ t ∈ Z_+^|N| : Σ_{g∈N} g t(g) = g0 }.
When A consists of a single row, the master group polyhedron is of the form

P(CD, r) = conv{ t ∈ Z_+^(|D|−1) : Σ_{i=1}^{|D|−1} i · ti ≡ r (mod D) }.
Further, we will always assume that r > 0 and that n ≥ 3. By observing that the
recession cone of P (Cn , r) is the non-negative orthant, one notes that P (Cn , r) is
of dimension n − 1. It is also easily observed that P (Km ) is of dimension m − 1.
By the assumption that n ≥ 3, it follows that the non-negativity constraints
are facet defining. In our discussion, these shall be referred to as the trivial facets.
When speaking of valid inequalities for the master knapsack polytopes, we shall use the same notation, where entries are understood to be of appropriate dimension. Denote the mixed integer cut by (μ, 1), where

μi = i/r for i ≤ r, and μi = (n − i)/(n − r) for i > r.
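For a quick sanity check (our own illustration, using exact rationals; the box parameter merely truncates the enumeration), one can tabulate μ and verify that (μ, 1) is satisfied by small integer points of P(Cn, r).

```python
from fractions import Fraction
from itertools import product

def mu(i, n, r):
    """Mixed integer cut coefficient for the group element i of C_n."""
    return Fraction(i, r) if i <= r else Fraction(n - i, n - r)

def spot_check(n, r, box=2):
    """Check mu·t >= 1 for all t with 0 <= t_i <= box and sum i*t_i = r (mod n)."""
    for t in product(range(box + 1), repeat=n - 1):
        if sum(i * ti for i, ti in enumerate(t, start=1)) % n == r:
            assert sum(mu(i, n, r) * ti for i, ti in enumerate(t, start=1)) >= 1
    return True

print(spot_check(6, 3))   # True
```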
For completeness, we include the following theorem to which we have already
referred:
Theorem 1 (Gomory [3]). (μ, 1) is a facet of P (Cn , r).
We consider the mixed integer cut as the polytope
PMIC (n, r) = P (Cn , r) ∩ {t : μt = 1}.
Since (μ, 1) is a facet of P (Cn , r) and P (Cn , r) is integral, PMIC (n, r) is also
integral. Note that a facet (π, π0 ) is adjacent to (μ, 1) if and only if it is a facet
of PMIC (n, r). We assume that 1 < r < n − 1, since otherwise the non-trivial
facets of PMIC (n, r) are clearly knapsack facets.
We shall now discuss the connection between PMIC (n, r) and the master knap-
sack polytopes P (Kr ) and P (Kn−r ). The following proposition highlights an
operation that we will call extending a knapsack solution.
Proposition 1. If x ∈ P (Kr ), x = (x1 , . . . , xr ), then t = (x1 , . . . , xr , 0, . . . , 0)
belongs to PMIC (n, r). Likewise, if x ∈ P (Kn−r ), x = (x1 , . . . , xn−r ), then t =
(0, . . . , 0, xn−r , . . . , x1 ) belongs to PMIC (n, r).
Proof. For x ∈ P (Kr ), the result is trivial. So take x ∈ P (Kn−r ). Since P (Kn−r )
is convex and integral, we may assume that x is integral. Rewriting i = n−(n−i)
for i = 1, . . . , r and applying the assumption that x is an integral knapsack
solution, the proposition follows.
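The two extensions in Proposition 1 are mechanical; the helper below is our own transcription, and the asserts illustrate the case n = 7, r = 3, where both extensions satisfy μ · t = 1 whenever x lies on the corresponding knapsack equation.

```python
def extend_low(x, n):
    """x in P(K_r) with r = len(x); pad with zeros into a point of P_MIC(n, r)."""
    return tuple(x) + (0,) * (n - 1 - len(x))

def extend_high(x, n):
    """x in P(K_{n-r}); reverse x into the coordinates r, ..., n-1."""
    r = n - len(x)
    return (0,) * (r - 1) + tuple(reversed(x))

# n = 7, r = 3: x = (1, 1, 0) lies in P(K_3) since 1*1 + 2*1 = 3, and
# mu applied to its extension gives 1/3 + 2/3 = 1.
assert extend_low((1, 1, 0), 7) == (1, 1, 0, 0, 0, 0)
# x = (2, 1, 0, 0) lies in P(K_4) since 1*2 + 2*1 = 4.
assert extend_high((2, 1, 0, 0), 7) == (0, 0, 0, 0, 1, 2)
```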
In terms of facets, we shall focus on a family of facets introduced in [1]. Before
stating the theorem, we note that for any non-trivial knapsack facet, by taking
an appropriate linear combination with the knapsack equation, we may assume
the following:
Proposition 2. Let (ρ, ρ0 ) be a non-trivial facet of P (Km ). Without loss of
generality we may assume that (ρ, ρ0 ) ≥ 0, ρ0 = ρm = 1. Moreover, we may
assume there exists some i = m such that ρi = 0.
Theorem 2 (Aráoz et al. [1]). Let (ρ, ρr) be a non-trivial facet of P(Kr) such that ρ ≥ 0, ρi = 0 for at least one i, and ρr = 1. Let

ρ′ = (ρ1, . . . , ρr = 1, (n − r − 1)/(n − r), . . . , 1/(n − r)).

Then (ρ′, 1) is a facet of P(Cn, r).
Proof. We argue for facets tilted from P(Kr); an analogous argument proves the result for facets tilted from P(Kn−r).

Let (π, π0) be tilted from (ρ, 1), let ρ′ be as described in Theorem 2, and let α be the corresponding tilting coefficient. Since (ρ, 1) is a facet of P(Kr), there exist r − 1 affinely independent extreme points x1, . . . , xr−1 satisfying (ρ, 1) at equality. As described in Proposition 1, these points may be extended to points t1, . . . , tr−1 ∈ PMIC(n, r), and clearly this preserves affine independence. Moreover, for i = 1, . . . , r − 1, μti = 1 and ρ′ti = ρxi = 1, thus
Consider a tilted knapsack facet (π, π0) arising from the facet (ρ, 1) of P(Kr) with tilting coefficient α. Letting μ′ denote the first r coefficients of μ, the same facet of P(Kr) is described by (γ, 0) = (ρ, 1) − (μ′, 1). In particular, letting

it follows that (π, π0) = (γ̄, 0) + (1 + α)(μ, 1). The same applies to tilted knapsack facets arising from P(Kn−r).
Proof. For convenience, say that P(Kr) has non-trivial facets (ρ1, 0), . . . , (ρM, 0) and that P(Kn−r) has non-trivial facets (γ1, 0), . . . , (γN, 0). Let (ρ̄i, 0) and (γ̄i, 0) denote the tilted knapsack facets obtained from (ρi, 0) and (γi, 0) respectively. We shall show that the system

min c · t
s.t. μ · t = 1
ρ̄i · t ≥ 0, i = 1, . . . , M     (2)
γ̄i · t ≥ 0, i = 1, . . . , N
t ≥ 0

attains an integer optimum that belongs to PMIC(n, r) for every c.
Let c′ = (c1, . . . , cr) and c′′ = (cn−1, . . . , cr), and let μ′ = (1/r, . . . , (r − 1)/r, 1) and μ′′ = (1/(n − r), . . . , (n − r − 1)/(n − r), 1). Consider the systems

min c′ · x
s.t. μ′ · x = 1
ρi · x ≥ 0, i = 1, . . . , M     (3)
x ≥ 0

and

min c′′ · x
s.t. μ′′ · x = 1
γi · x ≥ 0, i = 1, . . . , N     (4)
x ≥ 0
representing P(Kr) and P(Kn−r) respectively. Since both systems are integral, the minima are attained at integer extreme points x′ and x′′ respectively. Now let t∗ be obtained by extending the solution achieving the smaller objective value to a feasible point of PMIC(n, r). Indeed this t∗ is feasible and integral; it remains to show that it is optimal.

We now consider the duals; the dual of (3) is given by (5), the dual of (4) by (6), and the dual of (2) by (7).

Let (λ̂1, α0) and (λ̂2, β0) attain the maxima in (5) and (6) respectively. Setting λ̂ = min(λ̂1, λ̂2), it easily follows from the zero pattern of (2) and the non-negativity of μ that (λ̂, α0, β0) is feasible to (7). Moreover λ̂ = c · t∗, proving optimality.
Further observe that PMIC (n, r) is pointed, and so from this same proof we get
the following characterization of extreme points:
Theorem 4. A point t is an extreme point of PMIC (n, r) if and only if it can
be obtained by extending an extreme point of P (Kr ) or P (Kn−r ).
(1/r) t1 + · · · + ((r − 1)/r) tr−1 + tr + ((n − r − 1)/(n − r)) tr+1 + · · · + (1/(n − r)) tn−1 = 1

or

(1/r) t1 + · · · + ((r − 1)/r) tr−1 + tr = 1 − ( ((n − r − 1)/(n − r)) tr+1 + · · · + (1/(n − r)) tn−1 ).  (9)
⇒ ((r + 1)/r − (n − r − 1)/(n − r)) tr+1 + · · · + ((n − 1)/r − 1/(n − r)) tn−1 = (n/r) β
⇒ (n/r) · (1/(n − r)) tr+1 + · · · + (n/r) · ((n − r − 1)/(n − r)) tn−1 = (n/r) β
⇒ (1/(n − r)) tr+1 + · · · + ((n − r − 1)/(n − r)) tn−1 = β
⇒ [tr+1 + · · · + tn−1] − ( ((n − r − 1)/(n − r)) tr+1 + · · · + (1/(n − r)) tn−1 ) = β,

where the first bracketed term is (∗) and the subtracted sum is (∗∗). Since t is integral, (∗) is an integer, while (9) implies that (∗∗) must be fractional. But this contradicts that β is integral. Therefore (I) and (II) cannot simultaneously hold.
Here we review some general properties of facets of the master group polyhedra
and discuss extensions of our previous results. Throughout, some basic knowledge
of algebra is assumed.
Let G be an abelian group with identity 0, G+ = G \ {0}, and g0 ∈ G+. The master group polyhedron P(G, g0) is defined by

P(G, g0) = conv{ t ∈ Z_+^(|G|−1) : Σ_{g∈G+} g t(g) = g0 }.
Because |G| g = 0 for all g ∈ G+, the recession cone of P(G, g0) is the non-negative orthant, and since P(G, g0) is nonempty, the polyhedron is of full dimension.

As before, let (π, π0) denote the inequality

Σ_{g∈G+} π(g) t(g) ≥ π0.

If |G| − 1 ≥ 2, then the inequality t(g) ≥ 0 is facet defining for all g ∈ G+, and it is easily verified that these are the only facets with π0 = 0. Likewise, we call these the trivial facets of P(G, g0).
4.1 Automorphisms
Proof. Since (γ, γ0) and (π, π0) define adjacent facets, there exist affinely independent points t1, . . . , t(|G|−2) satisfying both at equality. By the previous remarks, we may define points (t1)′, . . . , (t(|G|−2))′ satisfying both (π′, π0) and (γ′, γ0) at equality. Since these are all defined by the same permutation of the indices of t1, . . . , t(|G|−2), affine independence is preserved.
4.2 Homomorphisms
– Set N ← N \ N(i).
3. For each k ∈ K+, define s^k by s^k = s^1 + |G| e_k.
where the first equality comes from the fact that ψ is a homomorphism and the second equality follows by how we defined the above points. Therefore,

Σ_{g∈G+\K} g s(g) ∈ g0 K,

and by construction,

Σ_{k∈K} k s(k) = g0 − Σ_{g∈G+\K} g s(g).

Thus s ∈ P(G, g0).
Note that we have the |H| − 2 points s^1, . . . , s^(|H|−2). By Proposition 5, we obtain (|H| − 1)(|K| − 1) points of the form s^(k,h) for k ∈ K+ and h ∈ H+, and lastly, we obtain |K| − 1 points s^k for k ∈ K+. Using the identity |G| = |K||H|, it immediately follows that we have |G| − 2 points.
The affine independence of these points is easily verified. By constructing a matrix in which the first |K| − 1 columns correspond to K, the next |H| − 1 columns correspond to ϕ(H), and the remaining columns are arranged in blocks by the cosets, it is readily observed, by letting each row be one of the above points and using the affine independence of t1, . . . , t(|H|−2), that the newly defined points are affinely independent.
Given a point s ∈ P(G, g0) that satisfies the lifted facets at equality, we can obtain a point t ∈ P(H, h0) that satisfies (π, π0) and (γ, γ0) at equality under the mapping t(h) = Σ_{g∈G : ψ(g)=h} s(g). By a fairly routine exercise in linear algebra, one can use this to verify that s is in the affine hull of the points described above. Hence we obtain the following theorem:
Hence we obtain the following theorem:
Theorem 10. Let (π, π0) and (γ, γ0) be non-trivial facets of P(H, h0), and let (π′, π0) and (γ′, γ0) be the facets of P(G, g0) obtained by homomorphic lifting using the homomorphism ψ. Then (π′, π0) and (γ′, γ0) are adjacent if and only if (π, π0) and (γ, γ0) are adjacent.
Now consider G = C_{n′}, g0 = r′, a homomorphism ψ : C_{n′} → C_n with ψ(r′) = r ≠ 0, and let (μ′, 1) be obtained by applying homomorphic lifting to (μ, 1). Similarly, by applying Theorem 10, we know that the only lifted facets under ψ that are adjacent to (μ′, 1) come from tilted knapsack facets. Stated precisely:
Symmetry Matters for the Sizes of Extended
Formulations
1 Introduction
Linear Programming techniques have proven to be extremely fruitful for com-
binatorial optimization problems with respect to both structural analysis and
the design of algorithms. In this context, the paradigm is to represent the prob-
lem by a polytope P ⊆ Rm whose vertices correspond to the feasible solutions
of the problem in such a way that the objective function can be expressed by
a linear functional x ↦ ⟨c, x⟩ on R^m (with some c ∈ R^m). If one succeeds in
finding a description of P by means of linear constraints, then algorithms as
well as structural results from Linear Programming can be exploited. In many
cases, however, the polytope P has exponentially (in m) many facets, thus P
can only be described by exponentially many inequalities. Also it may be that
the inequalities needed to describe P are too complicated to be identified.
In some of these cases one may find an extended formulation for P , i.e., a
(preferably small and simple) description by linear constraints of another poly-
hedron Q ⊆ Rd in some higher dimensional space that projects to P via some
(simple) linear map p : R^d → R^m with p(y) = Ty for all y ∈ R^d (and some matrix T ∈ R^{m×d}). Indeed, if p′ : R^m → R^d with p′(x) = T^t·x for all x ∈ R^m denotes the linear map that is adjoint to p (with respect to the standard bases), then we have max{⟨c, x⟩ : x ∈ P} = max{⟨p′(c), y⟩ : y ∈ Q}.
As for an example, let us consider the spanning tree polytope P_spt(n) = conv{χ(T) ∈ {0, 1}^{E_n} : T ⊆ E_n spanning tree of K_n}, where K_n = ([n], E_n) denotes the complete graph with node set [n] = {1, . . . , n} and edge set E_n = {{v, w} : v, w ∈ [n], v ≠ w}, and χ(A) ∈ {0, 1}^B is the characteristic vector of the subset A ⊆ B of B, i.e., for all b ∈ B, we have χ(A)_b = 1 if and only if b ∈ A. Thus, P_spt(n) is the polytope associated with the bases of the graphical matroid of K_n, and hence (see [7]) it consists of all x ∈ R^{E_n}_+ satisfying x(E_n) = n − 1 and x(E_n(S)) ≤ |S| − 1 for all S ⊆ [n] with 2 ≤ |S| ≤ n − 1, where R^{E_n}_+ is the nonnegative orthant of R^{E_n}, we denote by E_n(S) the subset of all edges with both nodes in S, and x(F) = Σ_{e∈F} x_e for F ⊆ E_n. This linear description of P_spt(n) has an exponential (in n) number of constraints, and as all the inequalities define pairwise distinct facets, none of them is redundant.
The following much smaller extended formulation for P_spt(n) (with O(n³) variables and constraints) appears in [5] (and a similar one in [17], who attributes it to [13]). Let us introduce additional 0/1-variables z_{e,v,u} for all e ∈ E_n, v ∈ e, and u ∈ [n] \ e. While each spanning tree T ⊆ E_n is represented by its characteristic vector x^{(T)} = χ(T) in P_spt(n), in the extended formulation it will be represented by the vector y^{(T)} = (x^{(T)}, z^{(T)}) with z^{(T)}_{e,v,u} = 1 (for e ∈ E_n, v ∈ e, u ∈ [n] \ e) if and only if e ∈ T and u is contained in the component of v in T \ e. The polyhedron Q_spt(n) ⊆ R^d defined by the nonnegativity constraints x ≥ 0, z ≥ 0, the equations x(E_n) = n − 1, x_{{v,w}} − z_{{v,w},v,u} − z_{{v,w},w,u} = 0 for all pairwise distinct v, w, u ∈ [n], as well as x_{{v,w}} + Σ_{u∈[n]\{v,w}} z_{{v,u},u,w} = 1 for all distinct v, w ∈ [n], satisfies p(Q_spt(n)) = P_spt(n), where p : R^d → R^{E_n} is the orthogonal projection onto the x-variables.
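One direction of this claim, namely that every spanning tree lifts to a point of Q_spt(n), can be checked mechanically for small n; the Python sketch below is our own illustration (not the authors' code) of exactly that check.

    from itertools import combinations, product

    # Sketch: for n = 4, lift each spanning tree T of K_n to y(T) = (x(T), z(T))
    # as described above and verify the stated constraints of Q_spt(n).
    n = 4
    V = range(n)
    E = list(combinations(V, 2))

    def comp_labels(edges):
        # connected-component label of every node under the given edge set
        label = {v: v for v in V}
        def find(v):
            while label[v] != v:
                v = label[v]
            return v
        for a, b in edges:
            label[find(a)] = find(b)
        return {v: find(v) for v in V}

    for T in combinations(E, n - 1):
        if len(set(comp_labels(T).values())) != 1:
            continue                                  # not a spanning tree
        x = {e: int(e in T) for e in E}
        z = {}
        for e in E:
            cl = comp_labels([f for f in T if f != e])
            for v in e:
                for u in V:
                    if u not in e:
                        z[e, v, u] = int(e in T and cl[u] == cl[v])
        assert sum(x.values()) == n - 1               # x(E_n) = n - 1
        for v, w, u in product(V, repeat=3):
            if len({v, w, u}) == 3 and v < w:
                assert x[v, w] == z[(v, w), v, u] + z[(v, w), w, u]
        for v in V:
            for w in V:
                if v != w:
                    total = x[tuple(sorted((v, w)))] + sum(
                        z[tuple(sorted((v, u))), u, w]
                        for u in V if u not in (v, w))
                    assert total == 1
    print("all spanning trees of K_4 lift to points of Q_spt(4)")

The converse direction, that the projection contains nothing more, is the substance of the references cited above.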
For many other polytopes (with exponentially many facets) associated with
polynomial time solvable combinatorial optimization problems polynomially sized
extended formulations can be constructed as well (see, e.g., the recent survey [5]).
Probably the most prominent problem in this class for which, however, no such
small formulation is known, is the matching problem. In fact, Yannakakis [17]
proved that no symmetric polynomially sized extended formulation of the match-
ing polytope exists.
Here, symmetric refers to the symmetric group S(n) of all permutations
π : [n] → [n] of the node set [n] of Kn acting on En via π.{v, w} = {π(v), π(w)}
for all π ∈ S(n) and {v, w} ∈ En . Clearly, this action of S(n) on En induces
an action on the set of all subsets of En . For instance, this yields an action
on the spanning trees of Kn , and thus, on the vertices of Pspt (n). The ex-
tended formulation of Pspt (n) discussed above is symmetric in the sense that,
for every π ∈ S(n), replacing all indices associated with edges e ∈ En and
nodes v ∈ [n] by π.e and π.v, respectively, does not change the set of constraints
in the formulation. Phrased informally, all subsets of nodes of Kn of equal cardi-
nality play the same role in the formulation. For a general definition of symmetric
extended formulations see Section 2.
In order to describe the main results of Yannakakis' paper [17] and the contributions of the present paper, let us denote by M_ℓ(n) = {M ⊆ E_n : M matching in K_n, |M| = ℓ} the set of all matchings of size ℓ (a matching
being a subset of edges no two of which share a node), and by P^ℓ_match(n) = conv{χ(M) ∈ {0, 1}^{E_n} : M ∈ M_ℓ(n)} the associated polytope. According to Edmonds [6] the perfect matching polytope P^{n/2}_match(n) (for even n) is described by

    P^{n/2}_match(n) = {x ∈ R^{E_n}_+ : x(δ(v)) = 1 for all v ∈ [n],
                        x(E_n(S)) ≤ (|S| − 1)/2 for all S ⊆ [n] with |S| odd}    (1)
(with δ(v) = {e ∈ E_n : v ∈ e}). Yannakakis [17, Thm. 1 and its proof] shows that there is a constant C > 0 such that, for every extended formulation for P^{n/2}_match(n) (with n even) that is symmetric in the sense above, the number of variables and constraints is at least C·(n choose n/4) = 2^{Ω(n)}. This in particular implies that there is no polynomial size symmetric extended formulation for the matching polytope of K_n (the convex hull of characteristic vectors of all matchings in K_n), of which the perfect matching polytope is a face.
Yannakakis [17] also obtains a similar (maybe less surprising) result on traveling salesman polytopes. Denoting the set of all (simple) cycles of length ℓ in K_n by C_ℓ(n) = {C ⊆ E_n : C cycle in K_n, |C| = ℓ}, and the associated polytopes by P^ℓ_cycl(n) = conv{χ(C) ∈ {0, 1}^{E_n} : C ∈ C_ℓ(n)}, the traveling salesman polytope is P^n_cycl(n). Identifying P^{n/2}_match(n) (for even n) with a suitable face of P^{3n}_cycl(3n), Yannakakis concludes that all symmetric extended formulations for P^n_cycl(n) have size at least 2^{Ω(n)} as well [17, Thm. 2 and its proof].
Yannakakis' results illuminate, in a fascinating way, the borders of our principal ability to express combinatorial optimization problems like the matching or the traveling salesman problem by means of linear constraints. However, they only refer to linear descriptions that respect the inherent symmetries in the problems.
In fact, the second open problem mentioned in the concluding section of [17] is
described as follows: “We do not think that asymmetry helps much. Thus, prove
that the matching and TSP polytopes cannot be expressed by polynomial size
LP’s without the asymmetry assumption.”
The contribution of our paper is to show that, in contrast to the assumption
expressed in the quotation above, asymmetry can help much, or, phrased differ-
ently, that symmetry requirements on extended formulations indeed can matter
significantly with respect to the minimal sizes of extended formulations. Our main results are that both P^{log n}_match(n) and P^{log n}_cycl(n) do not admit symmetric extended formulations of polynomial size, while they have non-symmetric extended formulations of polynomial size (see Cor. 1 and 2 for matchings, as well as Cor. 3 and 4 for cycles). The corresponding theorems from which these corollaries are derived provide some more general and more precise results for P^ℓ_match(n) and P^ℓ_cycl(n). In order to establish the lower bounds for symmetric extensions,
we generalize the techniques developed by Yannakakis [17]. The constructions
of the compact non-symmetric extended formulations rely on small families of
perfect hash functions [1,8,15].
The paper is organized as follows. In Section 2, we provide definitions of
extensions, extended formulations, their sizes, the crucial notion of a section
3 Yannakakis’ Method
Here, we provide an abstract view on the method used by Yannakakis [17] in or-
der to bound from below the sizes of symmetric extensions for perfect matching
polytopes, without referring to these concrete polytopes. That method is capable
of establishing lower bounds on the number of variables of weakly symmetric
subspace extensions of certain polytopes. By the following lemma, which is ba-
sically Step 1 in the proof of [17, Theorem 1], such bounds imply similar lower
bounds on the dimension of the ambient space and the number of facets for
general symmetric extensions (that are not necessarily subspace extensions).
Due to Lemma 5 one can prove that subspace extensions of some polytope P
with certain properties do not exist by finding, for such a hypothetical extension,
    (π.s_j)(x) = s_{κ_{π^{−1}}(j)}(x) = (κ_{π^{−1}}.s(x))_j = s_j(π^{−1}.x) for all x ∈ X,    (6)

from which one deduces 1.s_j = s_j for the one-element 1 in G as well as (ππ′).s_j = π.(π′.s_j) for all π, π′ ∈ G. The isotropy group of s_j ∈ S under this action is iso_G(s_j) = {π ∈ G : π.s_j = s_j}. From (6) one sees that, for all x ∈ X and
Proof. For odd ℓ, this follows from Theorem 1 using Lemmas 1, 2, and 4. For even ℓ, the polytope P^{ℓ−1}_match(n − 2) is (isomorphic to) a face of P^{ℓ−1}_match(n) defined
implies (7). Thus, for each j ∈ [d], the equivalence classes of the equivalence
relation defined by (8) refine the partitioning of X into orbits under Hj , and
we may use the collection of all these equivalence classes (for all j ∈ [d]) as the
family F in Remark 1. With
face the inequality x(V : V̄) ≥ 1 is valid (where (V : V̄) is the set of all edges having one node in V and the other one in V̄), since every matching M ∈ M intersects (V : V̄) in an odd number of edges. Therefore, in order to derive the desired contradiction, it suffices to find c_x ∈ R (for all x ∈ X) with Σ_{x∈X} c_x = 1, Σ_{x∈X} c_x·1_F(x) ≥ 0_F, and Σ_{x∈X} c_x Σ_{e∈(V:V̄)} x_e = 0. For the details on how this can be done we refer to [12].
    P_i = {x ∈ R^E_+ : x_{E\E_i} = 0, x(δ(φ_i^{−1}(s))) = 1 for all s ∈ [2ℓ],
           x(E_i(φ_i^{−1}(S))) ≤ (|S| − 1)/2 for all S ⊆ [2ℓ], |S| odd},

where E_i = E \ ⋃_{j∈[2ℓ]} E(φ_i^{−1}(j)). This follows by Lemma 11 from Edmonds' linear description (1) of the perfect matching polytope P^ℓ_match(2ℓ) of K_{2ℓ}. As the sum of the number of variables and the number of inequalities in the description of P_i is at most 2^{O(ℓ)} + n² (the summand n² comes from the nonnegativity constraints on x ∈ R^E_+, and the constant in O(ℓ) is independent of i), we obtain an extension of P^ℓ_match(n) of size 2^{O(ℓ)}·n²·log n by Lemma 10. This proves Theorem 3.
7 Conclusions
The results presented in this paper demonstrate that there are polytopes which
have compact extended formulations though they do not admit symmetric ones.
These polytopes are associated with matchings (or cycles) of some prescribed
cardinalities (see [4] for a recent survey on general cardinality restricted com-
binatorial optimization problems). Similarly, for the permutahedron associated
with [n] there is a gap between the smallest sizes Θ(n log n) of a non-symmetric
extension [9] and Θ(n²) of a symmetric extension [14].
Nevertheless, the question whether there are compact extended formulations for general matching polytopes (or for perfect matching polytopes) remains one of the most interesting open questions here. In fact, it is even unknown whether there are (non-symmetric) extended formulations of these polytopes of size 2^{o(n)}.
Actually, it seems that there are almost no lower bounds known on the sizes
of (not necessarily symmetric) extensions, except for the one obtained by the
observation that every extension Q of a polytope P with f faces has at least f
faces itself, thus Q has at least log f facets (since a face is uniquely determined
by the subset of facets it is contained in) [9]. It would be most interesting to
obtain other lower bounds, including special ones for 0/1-polytopes.
References
1. Alon, N., Yuster, R., Zwick, U.: Color-coding. J. Assoc. Comput. Mach. 42(4),
844–856 (1995)
2. Balas, E.: Disjunctive programming and a hierarchy of relaxations for discrete
optimization problems. SIAM J. Algebraic Discrete Methods 6(3), 466–486 (1985)
3. Bochert, A.: Ueber die Zahl der verschiedenen Werthe, die eine Function gegebener
Buchstaben durch Vertauschung derselben erlangen kann. Math. Ann. 33(4), 584–
590 (1889)
4. Bruglieri, M., Ehrgott, M., Hamacher, H.W., Maffioli, F.: An annotated bibliog-
raphy of combinatorial optimization problems with fixed cardinality constraints.
Discrete Appl. Math. 154(9), 1344–1357 (2006)
5. Conforti, M., Cornuéjols, G., Zambelli, G.: Extended formulations in combinatorial
optimization. Tech. Rep., Università di Padova (2009)
6. Edmonds, J.: Maximum matching and a polyhedron with 0, 1-vertices. J. Res. Nat.
Bur. Standards Sect. B 69B, 125–130 (1965)
7. Edmonds, J.: Matroids and the greedy algorithm. Math. Programming 1, 127–136
(1971)
8. Fredman, M.L., Komlós, J., Szemerédi, E.: Storing a sparse table with O(1) worst
case access time. J. Assoc. Comput. Mach. 31(3), 538–544 (1984)
9. Goemans, M.: Smallest compact formulation for the permutahedron,
http://www-math.mit.edu/~goemans/publ.html
10. Held, M., Karp, R.M.: A dynamic programming approach to sequencing problems.
J. Soc. Indust. Appl. Math. 10, 196–210 (1962)
11. Kaibel, V., Loos, A.: Branched polyhedral systems. In: Eisenbrand, F., Shepherd,
B. (eds.) IPCO 2010. LNCS, vol. 6080, pp. 177–190. Springer, Heidelberg (2010)
12. Kaibel, V., Pashkovich, K., Theis, D.O.: Symmetry matters for the sizes of extended
formulations. arXiv:0911.3712v1 [math.CO]
13. Kipp Martin, R.: Using separation algorithms to generate mixed integer model
reformulations. Tech. Rep., University of Chicago (1987)
14. Pashkovich, K.: Tight lower bounds on the sizes of symmetric extensions of per-
mutahedra and similar results (in preparation)
15. Schmidt, J.P., Siegel, A.: The spatial complexity of oblivious k-probe hash func-
tions. SIAM J. Comput. 19(5), 775–786 (1990)
16. Wielandt, H.: Finite permutation groups. Translated from the German by Bercov,
R. Academic Press, New York (1964)
17. Yannakakis, M.: Expressing combinatorial optimization problems by linear pro-
grams. J. Comput. System Sci. 43(3), 441–466 (1991)
A 3-Approximation for Facility Location with
Uniform Capacities
1 Introduction
In a facility location problem we are given a set of clients C and facility locations
F . Opening a facility at location i ∈ F costs fi (the facility cost). The cost of
servicing a client j by a facility i is given by c_{i,j} (the service cost) and these costs form a metric: for facilities i, i′ and clients j, j′, c_{i′,j′} ≤ c_{i′,j} + c_{i,j} + c_{i,j′}.
The objective is to determine which locations to open facilities in, so that the
total cost for opening the facilities and for serving all the clients is minimized.
Note that in this setting each client would be served by the open facility which
offers the smallest service cost.
When the number of clients that a facility can serve is bounded, we have a
capacitated facility location problem. In this paper we assume that these capac-
ities are the same, U , for all facilities. For this problem of uniform capacities
the first approximation algorithm was due to Chudak and Williamson [2] who
analyzed a local search algorithm and proved that any locally optimum solution
has cost no more than 6 times the facility cost plus 5 times the service cost of
a (global) optimum solution. In this paper we refer to such a guarantee as a
(6,5)-approximation; note that this is different from the bi-criterion guarantees
for which this notation is typically used. The result of Chudak and Williamson built on earlier work of Korupolu, Plaxton and Rajaraman [3], who were the first to analyze local search algorithms for facility location problems. (This work was done as part of the "Approximation Algorithms" partner group of MPI-Informatik, Germany.)
Given the set of open facilities, the best way of serving the clients can be
determined by solving an assignment problem. Thus any solution is completely
determined by the set of open facilities. The local search procedure analyzed
by Chudak and Williamson starts with an arbitrary set of open facilities and
then updates this set, using one of the operations add, delete, swap, whenever
that operation reduces the total cost of the solution. We show that a solution
which is locally optimum with respect to this same set of operations is a (3,3)-
approximation. We then show that our analysis of this local search algorithm is
best possible by demonstrating an instance where the locally optimum solution
is three times the optimum solution.
When facilities have different capacities, the best result known is a (6,5)-
approximation by Zhang, Chen and Ye [7]. The local search in this case relies
on a multi-exchange operation, in which, loosely speaking, a subset of facilities
from the current solution is exchanged with a subset not in the solution. This
result improves on an (8,4)-approximation by Mahdian and Pál [4] and a (9,5)-approximation by Pál, Tardos and Wexler [6].
For capacitated facility location, the only algorithms known are based on local
search. One version of capacitated facility location arises when we are allowed
to make multiple copies of the facilities. Thus if facility i has capacity U_i and opening cost f_i, then to serve k > U_i clients by facility i we need to open ⌈k/U_i⌉ copies of i and incur an opening cost f_i·⌈k/U_i⌉. This version is usually referred
to as “facility location with soft capacities” and the best known algorithm for
this problem is a 2-approximation [5].
All earlier work for capacitated facility location (uniform or non-uniform)
reroutes all clients in a swap operation from the facility which is closing to one
of the facilities being opened. This however can be quite expensive and cannot
lead to the tight bounds that we achieve in this paper. We use the idea of
Arya et al. [1] to reassign clients of the facility being closed in a swap operation
to other facilities in our current solution. However, to be able to handle the
capacity constraints in this reassignment we need to extend the notion of the
mapping between clients used in [1] to a fractional assignment. As in earlier work,
we use the fact that when we have a local optimum, no operation leads to an
improvement in cost. However, we now take carefully defined linear combinations
of the inequalities capturing this local optimality. All previous work that we are
aware of seems to only use the sum of such inequalities and therefore requires
additional properties like the integrality of the assignment polytope to carry the
argument through [2]. Our approach is therefore more general and amenable
to better analysis. The idea of doing things fractionally appears more often
in our analysis. Thus, when analyzing the cost of an operation we assign clients
fractionally to the facilities and rely on the fact that such a fractional assignment
cannot be better than the optimum assignment.
In Section 4 we give a tight example that relies on the construction of a suitable triangle-free set-system. While this construction itself is quite straightforward, this is the first instance we know of where such an idea has been applied to prove a large locality gap.
2 Preliminaries
Let C be the set of clients and F denote the facility locations. Let S (resp. O)
be the set of open facilities in our solution (resp. optimum solution). We abuse
notation and use S (resp. O) to denote our solution (resp. optimum solution).
Initially S is an arbitrary set of facilities which can serve all the clients. Let
cost(S) denote the total cost (facility plus service) of solution S. The three
operations that make up our local search algorithm are
Add. For s ∉ S, if cost(S + {s}) < cost(S) then S ← S + {s}.
Delete. For s ∈ S, if cost(S − {s}) < cost(S) then S ← S − {s}.
Swap. For s ∈ S and s′ ∉ S, if cost(S − {s} + {s′}) < cost(S) then S ← S − {s} + {s′}.
S is locally optimum if none of the three operations is possible, and at this point our algorithm stops. Polynomial running time can be ensured, at the expense of an additive ε in the approximation guarantee, by doing a local step only if the cost reduces by more than a (1 − ε/n) factor, for ε > 0.
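The following Python sketch is our own illustration of this local search (not the authors' implementation; the helper names and the brute-force assignment via facility duplication are our choices). Here c[j][i] is the service cost, f[i] the facility cost and U the uniform capacity.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def cost(S, f, c, U):
        S = sorted(S)
        m = len(c)                               # number of clients
        if len(S) * U < m:
            return float("inf")                  # cannot serve all clients
        cols = [i for i in S for _ in range(U)]  # U slots per open facility
        C = np.array([[c[j][i] for i in cols] for j in range(m)])
        rows, chosen = linear_sum_assignment(C)  # optimal capacitated assignment
        return sum(f[i] for i in S) + C[rows, chosen].sum()

    def local_search(F, f, c, U):
        S = set(F)                               # any feasible starting set
        improved = True
        while improved:
            improved = False
            moves = ([S - {s} for s in S] +                                  # delete
                     [S | {t} for t in F if t not in S] +                    # add
                     [(S - {s}) | {t} for s in S for t in F if t not in S])  # swap
            for S2 in moves:
                if cost(S2, f, c, U) < cost(S, f, c, U):
                    S, improved = S2, True
                    break
        return S

As written the loop accepts any improving move; the (1 − ε/n) rule above would be added to the comparison to bound the number of iterations.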
We use fi , i ∈ F to denote the cost of opening a facility at location i. Let
Sj , Oj denote the service-cost of client j in the solutions S and O respectively.
N_S(s) denotes the clients served by facility s in the solution S. Similarly N_O(o) denotes the clients served by facility o in solution O. N_s^o denotes the set of clients served by facility s in solution S and by facility o in solution O.
The presence of the add operation ensures that the total service cost of the
clients in any locally optimum solution is at most the total cost of the optimum
solution [2]. Hence in this paper we only consider the problem of bounding the
facility cost of a locally optimum solution which we show is no more than 2 times
the cost of an optimum solution.
To determine wt(j) so that these two properties are satisfied we start by assigning
wt(j) = init-wt(j). However, this assignment might violate the second property.
A facility s ∈ S captures a facility o ∈ O if init-wt(N_s^o) > init-wt(N_O(o))/2. Note that at most one facility in S can capture a facility o. If s does not capture o then for all j ∈ N_s^o define wt(j) = init-wt(j). However, if s captures o then for all j ∈ N_s^o define wt(j) = α·init-wt(j), where α < 1 is such that wt(N_s^o) = wt(N_O(o))/2.
For a facility o ∈ O we define a fractional assignment π_o : N_O(o) × N_O(o) → R₊ with the following properties.

separation. π_o(j, j′) > 0 only if j and j′ are served by different facilities in S.
balance. Σ_{j′∈N_O(o)} π_o(j′, j) = Σ_{j′∈N_O(o)} π_o(j, j′) = wt(j) for all j ∈ N_O(o).

The fractional assignment π_o can be obtained along the same lines as the mapping in [1]. The individual fractional assignments π_o are extended to a fractional assignment over all clients, π : C × C → R₊, in the obvious way: π(j, j′) = π_o(j, j′) if j, j′ ∈ N_O(o), and π(j, j′) = 0 otherwise.
To bound the facility cost of a facility s ∈ S we will close the facility and assign the clients served by s to other facilities in S and, possibly, some facility in O. The reassignment of the clients served by s to the facilities in S is done using the fractional assignment π. Thus if client j is served by s in the solution S and π(j, j′) > 0 then we assign a π(j, j′) fraction of j to the facility σ(j′). Note that

1. σ(j′) ≠ s, and this follows from the separation property of π.
2. j is reassigned to the facilities in S to a total extent of wt(j) (balance property).
3. A facility s′ ∈ S, s′ ≠ s, would get some additional clients. The total extent to which these additional clients are assigned to s′ is at most wt(N_S(s′)) (balance property). Since

       wt(N_S(s′)) ≤ init-wt(N_S(s′)) ≤ U − |N_S(s′)|,

   the total extent to which clients are assigned to s′ is at most U.
Let Δ(s) denote the increase in the service-cost of the clients served by s due to
the above reassignment.
Lemma 1. Σ_{s∈S} Δ(s) ≤ Σ_{j∈C} 2·O_j·wt(j).

Proof. Consider j, j′ with π(j, j′) > 0. When the facility σ(j) is closed and a π(j, j′) fraction of client j is assigned to facility σ(j′), the increase in service cost is π(j, j′)(c_{j,σ(j′)} − c_{j,σ(j)}). Since c_{j,σ(j′)} ≤ O_j + O_{j′} + S_{j′}, we have

    Σ_{s∈S} Δ(s) = Σ_{j,j′∈C} π(j, j′)(c_{j,σ(j′)} − c_{j,σ(j)})
                 ≤ Σ_{j,j′∈C} π(j, j′)(O_j + O_{j′} + S_{j′} − S_j)
                 = 2·Σ_{j∈C} O_j·wt(j)
If wt(j) < 1 then some part of j remains unassigned. The quantity 1−wt(j) is the
residual weight of client j and is denoted by res-wt(j). Clearly 0 ≤ res-wt(j) ≤
1. Note that
1. If we close facility s ∈ S and assign the residual weight of all clients served
by s to a facility o ∈ O − S then the total extent to which clients are assigned
to o equals res-wt(NS (s)) which is less than U .
2. The service cost of a client j which is assigned to o would increase by c_{j,o} − c_{j,s}. Let

       c_{s,o} = max_{j∈C} (c_{j,o} − c_{j,s}),

where

    λ_{s,o} = res-wt(N_s^o) / res-wt(N_S(s))
and λ_{s,o} = 0 if res-wt(N_S(s)) = 0. Let S′ be the subset of facilities in the solution S for which res-wt(N_S(s)) = 0. A facility s ∈ S′ can be deleted from S and its clients reassigned completely to the other facilities in S. This implies

    −f_s + Δ(s) ≥ 0.

We write such an inequality for each s ∈ S′ and add them to inequality (2). Note that for all s ∈ S − S′, Σ_o λ_{s,o} = 1. This implies that

    Σ_{s∈S′} f_s + Σ_{s,o} λ_{s,o}·f_s = Σ_s f_s    (3)

and

    Σ_{s∈S′} Δ(s) + Σ_{s,o} λ_{s,o}·Δ(s) = Σ_s Δ(s) ≤ Σ_{j∈C} 2·O_j·wt(j).    (4)

However, the reason for defining λ_{s,o} as above is to ensure the following property.
Lemma 2. Σ_{s,o} λ_{s,o}·c_{s,o}·res-wt(N_S(s)) ≤ Σ_{j∈C} res-wt(j)(O_j + S_j).

Proof. The left hand side of the inequality is Σ_{s,o} c_{s,o}·res-wt(N_s^o). Since for each client j ∈ N_s^o we have c_{s,o} ≤ O_j + S_j,

    Σ_{s,o} c_{s,o}·res-wt(N_s^o) = Σ_{s,o} c_{s,o} Σ_{j∈N_s^o} res-wt(j)
                                  ≤ Σ_{s,o} Σ_{j∈N_s^o} res-wt(j)(O_j + S_j)
                                  = Σ_{j∈C} res-wt(j)(O_j + S_j).
Incorporating equations (3), (4) and Lemma 2 into inequality (2) we get

    Σ_s f_s ≤ Σ_{s,o} λ_{s,o}·f_o + Σ_{j∈C} res-wt(j)(O_j + S_j) + Σ_{j∈C} 2·O_j·wt(j)
            = Σ_{s,o} λ_{s,o}·f_o + 2·Σ_{j∈C} O_j + Σ_{j∈C} res-wt(j)(S_j − O_j).    (5)
We now need to bound the number of times a facility of the optimum solution
may be opened.
Lemma 3. For all o ∈ O, Σ_s λ_{s,o} ≤ 2.

    res-wt(j) = 1 − wt(j) = 1 − init-wt(j) = 2 − U/|N_S(s)|.

    res-wt(j) = 1 − wt(j) ≥ 1 − init-wt(j) = 2 − U/|N_S(s)|.

    = Σ_{s∉I∪{s′}} |N_s^o|·(U − |N_S(s)|)/|N_S(s)|
    = Σ_{s∉I∪{s′}} ( U·|N_s^o|/|N_S(s)| − |N_s^o| )

Now

    Σ_{s∉I∪{s′}} λ_{s,o} ≤ Σ_{s∉I∪{s′}} |N_s^o|/|N_S(s)| ≤ 1.
β_{s,o} is defined so that o gets at most U clients. Let Δ′(s, o) denote the increase in service cost of the clients of N_O(o) due to this reassignment. Hence

    Δ′(s, o) = β_{s,o} Σ_{j∈N_O(o)} res-wt(j)(O_j − S_j).

The inequality (1) corresponding to the swap ⟨s, o⟩ would now get an additional term Δ′(s, o) on the left. Hence the term Σ_{s,o} λ_{s,o}·Δ′(s, o) would appear on the left in inequality (2) and on the right in inequality (5). To bound this term note that

    Σ_s λ_{s,o}·Δ′(s, o) = Σ_s λ_{s,o}·β_{s,o} Σ_{j∈N_O(o)} res-wt(j)(O_j − S_j)
                         = ( Σ_s λ_{s,o}·β_{s,o} ) Σ_{j∈N_O(o)} res-wt(j)(O_j − S_j).

If Σ_s λ_{s,o}·β_{s,o} > 1 then we reduce some β_{s,o} so that the sum is exactly 1. On the other hand, if Σ_s λ_{s,o}·β_{s,o} = 1 − γ_o, γ_o > 0, then we take the inequalities corresponding to the operation of adding the facility o ∈ O,

    f_o + Σ_{j∈N_O(o)} res-wt(j)(O_j − S_j) ≥ 0,    (6)
and add these to inequality (2) with a weight γ_o. Hence the total increase in the left hand side of inequality (2) is

    Σ_{s,o} λ_{s,o}·Δ′(s, o) + Σ_o γ_o ( f_o + Σ_{j∈N_O(o)} res-wt(j)(O_j − S_j) )
    = Σ_o Σ_{j∈N_O(o)} (1 − γ_o)·res-wt(j)(O_j − S_j)
      + Σ_o γ_o·f_o + Σ_o γ_o Σ_{j∈N_O(o)} res-wt(j)(O_j − S_j)
    = Σ_o Σ_{j∈N_O(o)} res-wt(j)(O_j − S_j) + Σ_o γ_o·f_o
    = Σ_{j∈C} res-wt(j)(O_j − S_j) + Σ_o γ_o·f_o

    = Σ_o ( γ_o + Σ_s λ_{s,o} )·f_o + 2·Σ_{j∈C} O_j
    = Σ_o ( 1 + Σ_s λ_{s,o}(1 − β_{s,o}) )·f_o + 2·Σ_{j∈C} O_j
    ≤ 2 ( Σ_o f_o + Σ_{j∈C} O_j )

Hence

    Σ_s λ_{s,o}(1 − β_{s,o}) ≤ Σ_s res-wt(N_s^o)/res-wt(N_O(o)) = 1.
4 When S ∩ O ≠ ∅
We now consider the case when S ∩ O ≠ ∅. We construct a bipartite graph,
G, on the vertex set C ∪ F as in [2]. Every client j ∈ C has an edge from the
facility σ(j) ∈ S and an edge to the facility τ (j) ∈ O, where τ (j) is the facility
in O serving client j. Thus each client has one incoming and one outgoing edge.
A facility s ∈ S has |NS (s)| outgoing edges and a facility o ∈ O has |NO (o)|
incoming edges.
Decompose the edges of G into a set of maximal paths, P, and cycles, C. Note
that all facilities on a cycle are from S ∩ O. Consider a maximal path, p ∈ P
which starts at a vertex s ∈ S and ends at a vertex o ∈ O. Let head(p) denote
the client served by s on this path and tail(p) be the client served by o on this
path. Let s = s0 , j0 , s1 , j1 , . . . , sk , jk , o be the sequence of vertices on this path.
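Such a decomposition can be computed greedily. The sketch below is our own helper (illustrative names, not the authors' code); it compresses each client j into a directed edge σ(j) → τ(j) between facilities and peels off maximal paths first, so that only cycles remain.

    from collections import defaultdict

    # clients: j -> (sigma(j), tau(j)); returns the client sequences along
    # maximal paths and along cycles.
    def decompose(clients):
        out_edges = defaultdict(list)
        surplus = defaultdict(int)          # out-degree minus in-degree
        for j, (s, o) in clients.items():
            out_edges[s].append((o, j))
            surplus[s] += 1
            surplus[o] -= 1

        def walk(v):                        # follow unused edges until stuck
            trail = []
            while out_edges[v]:
                v, j = out_edges[v].pop()
                trail.append(j)
            return trail

        # one maximal path per unit of positive surplus (these start in S)
        paths = [walk(v) for v in list(surplus)
                 for _ in range(max(surplus[v], 0))]
        # the remaining edges are balanced at every node, i.e. form cycles
        cycles = []
        for v in list(out_edges):
            while out_edges[v]:
                cycles.append(walk(v))
        return paths, cycles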
We can similarly define a shift along a cycle. The increase in service cost equals the sum of O_j − S_j over all clients j in the cycle, and since the assignment of clients to facilities is done optimally both in our solution and in the global optimum, this sum is zero. Thus

    Σ_{j on cycles} (O_j − S_j) = 0.

Consider the operation of adding a facility o ∈ O. We shift along all paths which end at o. The increase in service cost due to these shifts equals the sum of O_j − S_j over all clients j on these paths, and this quantity is at least −f_o. Thus

    Σ_{j on paths} (O_j − S_j) ≥ −Σ_{o∈O} f_o.

Hence

    Σ_{j∈C} (O_j − S_j) = Σ_{j on paths} (O_j − S_j) + Σ_{j on cycles} (O_j − S_j) ≥ −Σ_{o∈O} f_o,

which implies that the service cost of S is bounded by Σ_{o∈O} f_o + Σ_{j∈C} O_j.
To bound the cost of facilities in S − O we only need the paths that start from a vertex in S − O. Hence we throw away all cycles and all paths that start at a facility in S ∩ O; this is done by removing all clients on these cycles and paths. Let P′ denote the remaining paths and C′ the remaining clients. Every client in C′ either belongs to a path which ends in S ∩ O (a transfer path) or to a path which ends in O − S (a swap path). Let T denote the set of transfer paths and S the set of swap paths.
Let N_s^o be the set of paths that start at s ∈ S and end at o ∈ O. Define

    N_S(s) = ∪_{o∈O−S} N_s^o.

Note that we do not include the transfer paths in the above definition. Similarly for all o ∈ O define

    N_O(o) = ∪_{s∈S−O} N_s^o.

Just as we defined the init-wt, wt and res-wt of a client, we can define the init-wt, wt and res-wt of a swap path. Thus for a path p which starts from s ∈ S − O we define

    init-wt(p) = min{ 1, (U − |N_S(s)|)/|N_S(s)| }.
The notion of capture remains the same and we reduce the initial weights on the paths to obtain their weights. Thus wt(p) ≤ init-wt(p) and, for every s ∈ S and o ∈ O, wt(N_s^o) ≤ wt(N_O(o))/2. For every o ∈ O − S we define a fractional mapping π_o : N_O(o) × N_O(o) → R₊ such that

separation. π_o(p, p′) > 0 only if p and p′ start at different facilities in S − O.
balance. Σ_{p′∈N_O(o)} π_o(p′, p) = Σ_{p′∈N_O(o)} π_o(p, p′) = wt(p) for all p ∈ N_O(o).
This fractional mapping can be constructed in the same way as done earlier.
The way we use this fractional mapping, π, will differ slightly. When facility s
is closed, we will use π to partly reassign the clients served by s in the solution S to other facilities in S. If p is a path starting from s and π(p, p′) > 0, then we shift along p and the client tail(p) is assigned to s′, where s′ is the facility from which p′ starts. This whole operation is done to an extent of π(p, p′).

Let Δ(s) denote the total increase in service cost due to the reassignment of clients on all swap paths starting from s. Define the length of the path p as

    length(p) = Σ_{c∈C∩p} (O_c + S_c).
Then

    Σ_s Δ(s) ≤ Σ_s Σ_{p∈N_S(s)} Σ_{p′∈P′} π(p, p′)(shift(p) + length(p′))
             = Σ_{p∈S} wt(p)(shift(p) + length(p))
Since, initially, s′ was serving |N_S(s′)| + |T(s′)| clients, the total number of clients that s′ is serving after the reassignment is at most U.

Consider a transfer path q starting from s. We would shift once along path q when we close facility s. We would also be shifting along q to an extent of Σ_{p′∈N_S(s)} t(p′, q), which is at most 1. Let Δ′(s) denote the total increase in service cost due to shifts on all transfer paths starting from s. Then

    Δ′(s) ≤ 2·Σ_{q∈T(s)} shift(q)    (7)
    Σ_{s∈S−O} f_s ≤ 2·Σ_{o∈O−S} f_o + Σ_{p∈S} wt(p)(shift(p) + length(p))
                    + Σ_{p∈S} res-wt(p)(length(p) + shift(p)) + 2·Σ_{p∈T} shift(p)
                  = 2·Σ_{o∈O−S} f_o + Σ_{p∈S} (shift(p) + length(p)) + 2·Σ_{p∈T} shift(p)
                  ≤ 2 ( Σ_{o∈O−S} f_o + Σ_{j∈C} O_j )
5 A Tight Example
Our tight example consists of r facilities in the optimum solution O, r facilities
in the locally optimum solution S and rU clients. The facilities are F = O ∪ S.
Since no facility can serve more than U clients, each facility in S and O serves
exactly U clients. Our instance has the property that a facility in O and a facility
in S share at most one client.
We can view our instance as a set-system — the set of facilities O is the
ground set and for every facility s ∈ S we have a subset Xs of this ground set.
o ∈ Xs iff there is a client which is served by s in the solution S and by o in the
solution O. This immediately implies that each element of the ground set is in
exactly U sets and that each set is of size exactly U .
A triangle in the set-system is a collection of 3 elements o_1, o_2, o_3 and 3 sets X_{s_1}, X_{s_2}, X_{s_3} such that o_i is not in X_{s_i} but belongs to the other two sets. An
important property of our instance is that the corresponding set-system has no
triangles.
We now show how to construct a set system with the three properties mentioned above. With every o ∈ O we associate a distinct point x^o = (x^o_1, x^o_2, . . . , x^o_U) in a U-dimensional space, where for all i, x^o_i ∈ {1, 2, 3, . . . , U}. For every choice of a coordinate i, 1 ≤ i ≤ U, we form U^{U−1} sets, each of which contains all points differing only in coordinate i. Thus the total number of sets we form is r = U^U, which is the same as the number of points. Each set can be viewed as a line in U-dimensional space. To see that this set system satisfies all the three properties, note that each line contains U points and each point is on exactly U lines. Further, we do not have 3 points and 3 lines such that each line includes two of the points and excludes one.
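For small U these properties can be verified mechanically; the following Python sketch (ours, purely illustrative) builds the points and lines for U = 3 and asserts U-uniformity, U-regularity and triangle-freeness.

    from itertools import product, combinations

    U = 3
    points = list(product(range(1, U + 1), repeat=U))
    lines = []
    for i in range(U):                        # the coordinate that varies
        for rest in product(range(1, U + 1), repeat=U - 1):
            lines.append(frozenset(rest[:i] + (v,) + rest[i:]
                                   for v in range(1, U + 1)))

    assert len(lines) == len(points) == U ** U
    assert all(len(L) == U for L in lines)                       # U points per line
    assert all(sum(p in L for L in lines) == U for p in points)  # U lines per point

    def has_triangle(L1, L2, L3):
        for o1 in (L2 & L3) - L1:
            for o2 in (L1 & L3) - L2:
                for o3 in (L1 & L2) - L3:
                    if len({o1, o2, o3}) == 3:
                        return True
        return False

    assert not any(has_triangle(*T) for T in combinations(lines, 3))
    print("set system is U-uniform, U-regular and triangle-free")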
We now define the facility and the service costs. For a facility o ∈ O, f_o = 2U, while for a facility s ∈ S, f_s = 6U − 6. For a client j ∈ N_s^o, we have c_{s,j} = 3 and c_{o,j} = 1. All other service costs are given by the metric property.
Since the service cost of each client in O is 1 and the facility cost of each facility in O is 2U, we have cost(O) = 3U^{U+1}. Similarly, cost(S) = (3 − 2/U)·3U^{U+1}
and hence cost(S) = (3 − 2/U )cost(O). We now need to prove that S is indeed
a locally optimum solution with respect to the local search operations of add,
delete and swap.
References
1. Arya, V., Garg, N., Khandekar, R., Meyerson, A., Munagala, K., Pandit, V.: Lo-
cal search heuristics for k-median and facility location problems. SIAM J. Com-
put. 33(3), 544–562 (2004)
2. Chudak, F., Williamson, D.P.: Improved approximation algorithms for capacitated
facility location problems. Math. Program. 102(2), 207–222 (2005)
3. Korupolu, M.R., Plaxton, C.G., Rajaraman, R.: Analysis of a local search heuristic
for facility location problems. J. Algorithms 37(1), 146–188 (2000)
4. Mahdian, M., Pál, M.: Universal facility location. In: Di Battista, G., Zwick, U.
(eds.) ESA 2003. LNCS, vol. 2832, pp. 409–421. Springer, Heidelberg (2003)
5. Mahdian, M., Ye, Y., Zhang, J.: A 2-approximation algorithm for the soft-
capacitated facility location problem. In: Arora, S., Jansen, K., Rolim, J.D.P., Sa-
hai, A. (eds.) RANDOM 2003 and APPROX 2003. LNCS, vol. 2764, pp. 129–140.
Springer, Heidelberg (2003)
6. Pál, M., Tardos, É., Wexler, T.: Facility location with nonuniform hard capacities.
In: FOCS ’01: Proceedings of the 42nd IEEE symposium on Foundations of Com-
puter Science, Washington, DC, USA, p. 329. IEEE Computer Society, Los Alamitos
(2001)
7. Zhang, J., Chen, B., Ye, Y.: A multiexchange local search algorithm for the capac-
itated facility location problem. Math. Oper. Res. 30(2), 389–403 (2005)
Secretary Problems via Linear Programming
1 Introduction
In the classical secretary problem an employer would like to choose the best
candidate among n competing candidates. The candidates are assumed to arrive
in a random order. After each interview, the position of the interviewee in the
total order is revealed vis-à-vis the already interviewed candidates. The interviewer
has to decide, irrevocably, whether to accept the candidate for the position or
to reject the candidate. The objective in the basic problem is to accept the
best candidate with high probability. A mechanism used for choosing the best
candidate is to interview the first n/e candidates for the purpose of evaluation,
and then hire the first candidate that is better than all previous candidates.
Analysis of the mechanism shows that it hires the best candidate with probability
1/e and that it is optimal [8,18].
This basic setting, with n elements arriving in a random order and irrevocable decisions made by an algorithm, has been explored extensively over the years. We refer the reader to the survey by Ferguson [9] on the historical and extensive work on different variants of the secretary problem. Recently, there has been an interest in the secretary problem due to its application to the online auction problem [13,3].
This has led to the study of variants of the secretary problem which are motivated
by this application. For example, [15] studied a setting in which the mechanism
is allowed to select multiple candidates and the goal is to maximize the expected
profit. Imposing other combinatorial structure on the set of selected candidates,
for example, selecting elements which form an independent set of a matroid [4],
selecting elements that satisfy a given knapsack constraint [2], selecting elements
that form a matching in a graph or hypergraph [16], have also been studied. Other
variants include when the profit of selecting a secretary is discounted with time [5].
Therefore, finding new ways of abstracting, as well as analyzing and designing
algorithms, for secretary type problems is of major interest.
We next demonstrate these ideas by exploring some natural variants of the sec-
retary problem.
Kleinberg [15] gave an asymptotically tight mechanism for the cardinal version of the problem. However, this mechanism is randomized, and also not tight for small values of k. Better mechanisms, even restricted to small values of k, are helpful not only for solving the original problem, but also for improving mechanisms that are based upon them. For example, a mechanism for the secretary knapsack [2] uses a mechanism that is 1/e-competitive for maximizing the expected profit for small values of k (k ≤ 27). Analyzing the LP asymptotically for any value of n is a challenge even for small values of k. However, using our characterization we solve the problem easily for small values of k and n, which gives an idea of how the competitive ratio behaves for small values of k. Our results appear in Table 1. We also give a complete asymptotic analysis for the (1, 2)- and (2, 1)-secretary problems.
Table 1. Competitive ratios for maximizing expected profit; experimental results for n = 100
The basic secretary problem was introduced in a puzzle by Martin Gardner [11].
Dynkin [8] and Lindley [18] gave the optimal solution and showed that no other
strategy can do better (see the historical survey by Ferguson [9] on the history of
the problem). Subsequently, various variants of the secretary problem have been
studied with different assumptions and requirements [20] (see the survey [10]).
More recently, there has been significant work using generalizations of secre-
tary problems as a framework for online auctions [2,3,4,13,15]. Incentives issues
in online mechanisms have been studied in several models [1,13,17]. These works
designed mechanisms where incentive issues were considered for both value and
time strategies. For example, Hajiaghayi et al. [13] studied a limited supply
online auction problem, in which an auctioneer has a limited supply of identical
goods and bidders arrive and depart dynamically. In their problem bidders also
have a time window which they can lie about.
Our linear programming technique is similar to the technique of factor reveal-
ing linear programs that have been used successfully in many different settings
[7,12,14,19]. A factor revealing linear program formulates the performance of an algorithm for a problem as a linear program (or sometimes a more general convex program). The objective function is the approximation factor of the algorithm on the problem. Thus solving the linear program gives an upper bound on the worst case instance which an adversary could choose to maximize/minimize the approximation factor. Our technique, in contrast, captures the information structure of the problem itself by a linear program. We do not a priori assume
any algorithm but formulate a linear program which captures every possible
algorithm. Thus optimizing our linear program not only gives us an optimal
algorithm, but it also proves that the algorithm itself is the best possible.
In this section, we give a simple linear program which we show characterizes all
possible mechanisms for the secretary problem. We stress that the LP captures
not only thresholding mechanisms, but any mechanism including probabilistic
mechanisms. Hence, finding the best mechanism for the secretary problem is
equivalent to finding the optimal solution to the linear program. The linear
program and its dual appear in Figure 1. The following two lemmas show that
(P)  max (1/n)·Σ_{i=1}^n i·p_i
     s.t. i·p_i ≤ 1 − Σ_{j=1}^{i−1} p_j   for all 1 ≤ i ≤ n
          p_i ≥ 0                         for all 1 ≤ i ≤ n

(D)  min Σ_{i=1}^n x_i
     s.t. Σ_{j=i+1}^n x_j + i·x_i ≥ i/n   for all 1 ≤ i ≤ n
          x_i ≥ 0                         for all 1 ≤ i ≤ n

Fig. 1. Linear program and its dual for the secretary problem
the linear program exactly characterizes all feasible mechanisms for the secretary
problem.
Proof. Let p_i^π be the probability with which mechanism π selects candidate i. A mechanism cannot increase its chances of hiring the best candidate by selecting a candidate that is not the best so far; therefore we may consider only such mechanisms. We now show that p^π satisfies the constraints of the linear program.
Lemma 1 shows that the optimal solution to (P) is an upper-bound on the per-
formance of the mechanism. The following lemma shows that every LP solution
actually corresponds to a mechanism which performs as well as the objective
value of the solution.
(P)  max (1/n)·Σ_{i=1}^n i·p_i + q(1 − Σ_{i=1}^n p_i)
     s.t. i·p_i ≤ 1 − Σ_{j=1}^{i−1} p_j       for all 1 ≤ i ≤ n
          p_i ≥ 0                             for all 1 ≤ i ≤ n

(D)  min Σ_{i=1}^n x_i + q
     s.t. Σ_{j=i+1}^n x_j + i·x_i ≥ i/n − q   for all 1 ≤ i ≤ n
          x_i ≥ 0                             for all 1 ≤ i ≤ n

Fig. 2. Linear program and its dual for the rehiring secretary problem
Proof. First, notice that the mechanism is well defined since, for any i, i·p_i/(1 − Σ_{j<i} p_j) ≤ 1. We prove by induction that the probability that the mechanism selects the candidate at position i is exactly p_i. The base case is trivial. Assume this is true up to i − 1. At step i, the probability we choose i is the probability that we didn't choose candidates 1 to i − 1, which is 1 − Σ_{j<i} p_j, times the probability that the current candidate is the best so far, which is 1/i, times i·p_i/(1 − Σ_{j<i} p_j), which is exactly p_i.
The probability of hiring the ith candidate given that the ith candidate is the
best candidate is equal to the probability of hiring the ith candidate given the ith
candidate is the best candidate among candidates 1 to i. Otherwise, it means
that the mechanism is able to distinguish between the event of seeing the relative
ranks and the absolute ranks which is a contradiction to the definition of the
secretary problem. Since the ith candidate is best so far with probability 1/i,
the latter probability equals i·p_i (the mechanism hires only the best candidate so far). Summing over all possible positions we get that the mechanism π hires the best candidate with probability (1/n)·Σ_{i=1}^n i·p_i.
Using the above equivalence between LP solutions and the mechanisms, it is easy
to show that the optimal mechanism can hire the best candidate with probability
of no more than 1/e. The proof is simply by constructing a feasible solution to
the dual linear program.
Lemma 3 ([8]). No mechanism can hire the best candidate with probability
better than 1/e + o(1).
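Both lemmas and the 1/e bound can be seen numerically; the sketch below (our own check, using scipy, with n = 100 an arbitrary choice) solves (P) and recovers the classical threshold structure.

    import numpy as np
    from scipy.optimize import linprog

    n = 100
    c = -np.arange(1, n + 1) / n             # maximize (1/n) * sum_i i*p_i
    # row i encodes  sum_{j<i} p_j + i*p_i <= 1
    A = np.tril(np.ones((n, n)), -1) + np.diag(np.arange(1, n + 1))
    res = linprog(c, A_ub=A, b_ub=np.ones(n), bounds=[(0, None)] * n)
    print("optimal value:", -res.fun)        # ~ 0.371, tending to 1/e
    print("first nonzero p_i at i =", int(np.argmax(res.x > 1e-8)) + 1)  # ~ n/e

The optimal p_i vanish up to a threshold near n/e, exactly the classical wait-then-hire rule.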
One natural extension of the secretary problem is the case when one is allowed
to rehire the best secretary at the end with certain probability. That is, suppose
that after the interviewer has seen all n candidates, he is allowed to hire the
best candidate with certain probability q if no other candidate has been hired.
Observe that if q = 0, the problem reduces to the classical secretary problem
(P1) (incentive compatible): max (1/n)·Σ_{i=1}^n f_i s.t. p ≤ 1/n; f_i + (i − 1)·p ≤ 1 for all i; f_i ≤ i·p for all i; p, f_i ≥ 0.
(P2) (regret free): the same with f_i = i·p for all i in place of f_i ≤ i·p.
(P3) (must-hire): the same as (P1) with p = 1/n in place of p ≤ 1/n.

Fig. 3. Linear formulations for incentive compatible, regret free, and must-hire mechanisms
while if q = 1, then the optimal strategy is to wait till the end and then hire the best candidate. We give a tight description of strategies as q changes. This can be achieved simply by modifying the linear program: add the term q(1 − Σ_{i=1}^n p_i) to the objective function. That is, if the mechanism did not hire any candidate, you may hire the best candidate with probability q. Solving the primal and the corresponding dual (see Figure 2) gives the following tight result. The proof is omitted.
Theorem 2. There is a mechanism for the rehire variant that selects the best secretary with probability e^{−(1−q)} + o(1), and it is optimal.
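The modification is literally a change of cost vector; the sketch below (ours, illustrative) checks Theorem 2 numerically for q = 1/2.

    import numpy as np
    from scipy.optimize import linprog

    # rehire objective: (1/n) sum_i i*p_i + q(1 - sum_i p_i)
    #                 = sum_i (i/n - q) p_i + q
    n, q = 200, 0.5
    c = -(np.arange(1, n + 1) / n - q)
    A = np.tril(np.ones((n, n)), -1) + np.diag(np.arange(1, n + 1))
    res = linprog(c, A_ub=A, b_ub=np.ones(n), bounds=[(0, None)] * n)
    print(-res.fun + q, np.exp(-(1 - q)))    # both close to 0.6065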
3 Incentive Compatibility
In this section we study incentive compatible mechanisms for the secretary prob-
lem. We design a set of mechanisms Mp and show that with certain parameters
these mechanisms are the optimal mechanisms for certain secretary problems.
To this end, we derive linear formulations that characterize the set of possible
incentive compatible mechanisms and also analyze the dual linear programs.
The basic linear formulation that characterizes all incentive compatible mech-
anisms appears in Figure 3. We give a set of three linear formulations. The for-
mulation (P 1) characterizes all mechanisms that are incentive compatible, (P 2)
captures mechanisms that are also regret free and (P 3) captures mechanisms
that are must-hire mechanisms. This is formalized in the following two lemmas.
Proof. The proof follows the same ideas as in the proof of Lemma 1. The con-
dition of incentive compatibility implies that pi = pj = p for any two positions
i and j.
Also, in the original secretary problem, every mechanism could be modified to be a regret free mechanism. This is not true for an incentive compatible mechanism. Indeed, we have the constraint f_i ≤ i·p_i, since the probability of hiring in the ith position is at least the probability of hiring in the ith position given that the candidate is the best so far, times 1/i. If the mechanism is also supposed to be regret free then equality must hold for each i. In the must-hire part we demand that the sum of the p_i is 1. The resulting formulations, given in Figure 3, are stated after simplification.
Lemma 4 shows that the optimal solution to the linear formulations is an upper-
bound on the performance of the mechanism. To show the converse we define a
family of mechanisms, parameterized by the probability 0 ≤ p ≤ 1/n of selecting a candidate at each position, and show that the set of feasible solutions to (P1) corresponds to the set of mechanisms M_p defined here.
Incentive Compatible Mechanism M_p:

– Let 0 ≤ p ≤ 1/n. For each 1 ≤ i ≤ n, while no candidate is selected, do
  • If 1 ≤ i ≤ 1/(2p), select the ith candidate with probability i/(1/p − i + 1) if she is the best candidate so far.
  • If 1/(2p) < i ≤ n, let r = i/(1/p − i + 1). Select the ith candidate with probability 1 if her rank is in the top ⌊r⌋, and with probability r − ⌊r⌋ if her rank is ⌊r⌋ + 1.
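A quick Monte Carlo run (our own sketch; the sampling code and the parameters n and the number of trials are illustrative choices) estimates the efficiency of M_p, i.e. the probability that it hires the overall best candidate, and matches the values stated in Theorem 3 below.

    import random

    def hires_best(n, p):
        order = random.sample(range(1, n + 1), n)   # arrival order of values
        best_so_far = 0
        for i in range(1, n + 1):
            v = order[i - 1]
            if i <= 1 / (2 * p):
                if v > best_so_far and random.random() < i / (1 / p - i + 1):
                    return v == n
            else:
                r = i / (1 / p - i + 1)
                rank = sum(u >= v for u in order[:i])   # 1 = best so far
                if rank <= int(r) or (rank == int(r) + 1
                                      and random.random() < r - int(r)):
                    return v == n
            best_so_far = max(best_so_far, v)
        return False

    n, trials = 200, 20000
    for p in (1 / (2 ** 0.5 * n), 1 / (2 * n), 1 / n):
        est = sum(hires_best(n, p) for _ in range(trials)) / trials
        print(f"p = {p:.6f}: estimated efficiency {est:.3f}")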
The following lemma shows that every LP solution to (P 1) corresponds to a
mechanism which performs as well as the objective value of the solution.
Proof. For any p, the optimal values of f_i are given by the following: f_i = i·p for 1 ≤ i ≤ 1/(2p) and f_i = 1 − (i − 1)·p for i > 1/(2p). For ease of calculations, we ignore the fact that the fractions need not be integers. These are exactly the values achieved by the mechanism M_p for any value of p.
Optimizing the linear programs (P 1), (P 2) and (P 3) exactly, we get the following
theorem. The optimality of the mechanisms can also be shown by exhibiting an
optimal dual solution.
Theorem 3. The family of mechanisms M_p achieves the following.
1. Mechanism M_{1/(√2·n)} is incentive compatible with efficiency 1 − 1/√2 ≈ 0.29.
2. Mechanism M_{1/(2n)} is incentive compatible and regret free with efficiency 1/4.
3. Mechanism M_{1/n} is incentive compatible and must-hire with efficiency 1/4.
Moreover, all these mechanisms are optimal for efficiency along with the additional property.
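These efficiencies can be checked directly from the closed form for f_i in the proof above; the numerical sketch below is ours.

    # evaluate (1/n) * sum_i f_i with f_i = i*p below the threshold 1/(2p)
    # and f_i = 1 - (i-1)*p above it, for the three choices of p
    n = 10**6

    def efficiency(p):
        thr = 1 / (2 * p)
        return sum(i * p if i <= thr else 1 - (i - 1) * p
                   for i in range(1, n + 1)) / n

    print(efficiency(1 / (2 ** 0.5 * n)))   # ~ 1 - 1/sqrt(2) ~ 0.293
    print(efficiency(1 / (2 * n)))          # ~ 0.25
    print(efficiency(1 / n))                # ~ 0.25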
    max F(q) = (1/n)·Σ_{i=1}^n Σ_{j=1}^J Σ_{k=1}^K Σ_{ℓ=1}^k [ C(i−1, ℓ−1)·C(n−i, k−ℓ) / C(n−1, k−1) ]·q_i^{j|ℓ}
    s.t.
    ∀ 1 ≤ i ≤ n, 1 ≤ j ≤ J:              p_i^j = (1/i)·Σ_{k=1}^{min{i,K}} q_i^{j|k}
    ∀ 1 ≤ i ≤ n, 1 ≤ k ≤ K:              q_i^{1|k} ≤ 1 − Σ_{ℓ<i} p_ℓ^1
    ∀ 1 ≤ i ≤ n, 1 ≤ k ≤ K, 2 ≤ j ≤ J:   q_i^{j|k} ≤ Σ_{ℓ<i} p_ℓ^{j−1} − Σ_{ℓ<i} p_ℓ^j
    ∀ 1 ≤ i ≤ n, 1 ≤ k ≤ K, 1 ≤ j ≤ J:   p_i^j, q_i^{j|k} ≥ 0

(here C(a, b) denotes the binomial coefficient a choose b)
Then (p(π), q(π)) is a feasible solution and the expected number of top K candidates selected is at most F(p(π), q(π)).
Proof. Let us prove the first type of constraint, of the form p_i^j = (1/i)·Σ_{k=1}^{min{i,K}} q_i^{j|k}.
It is clear that there is no reason for any mechanism to select a candidate which
is not among the K best so far. Such a candidate cannot even potentially be one
of the K best globally and therefore is not profitable for the mechanism. Thus,
the probability any mechanism selects the ith candidate in the jth round is the
sum of the probability of selecting the ith candidate in the jth round given that
the candidate is the kth best candidate so far times 1/i, which is the probability
that the candidate is the kth best so far. We sum until the minimum between
i and K to get the desired equality, which holds for every mechanism. Let us now prove the third type of constraints (the second type follows by the same arguments). Consider any mechanism, some position i and some round j.
    q_i^{j|k} = Pr[π selects candidate i in round j | candidate i is kth best so far]
             ≤ Pr[π selects exactly j − 1 candidates out of candidates {1, . . . , i − 1}
                  | candidate i is kth best so far]
             = Pr[π selects exactly j − 1 candidates out of candidates {1, . . . , i − 1}]
             = Σ_{ℓ<i} p_ℓ^{j−1}(π) − Σ_{ℓ<i} p_ℓ^j(π)

The inequality follows since, in order to select candidate i in round j, the mechanism must have selected exactly j − 1 candidates out of the previous i − 1 candidates. The next equality follows since the decisions made by the policy with respect to the first i − 1 candidates depend only on the relative ranks of those i − 1 candidates, and are independent of the rank of the ith candidate with respect to these candidates. The final equality follows since the event of selecting j − 1 candidates contains the event of selecting j candidates, which concludes our proof.
Finally, let us consider the objective function and prove that it upper bounds the performance of the mechanism. For analysis purposes let us consider the probabilities f_i^{j|k}, defined as the probability of selecting the ith candidate in the jth round given that the kth best candidate is in the ith position. Note that the main difference between f_i^{j|k} and q_i^{j|k} is that while the former considers the kth best candidate overall, the latter only looks from the mechanism's perspective and therefore considers the event that the candidate is the kth best among the first i candidates. It is easy to state the objective function using the first set of variables as simply the sum of f_i^{j|k} over all values of i, j and k, multiplied by 1/n. To finish we simply express each f_i^{j|k} in terms of the q_i^{j|ℓ}, which proves the lemma.
Claim. For each 1 ≤ i ≤ n, 1 ≤ j ≤ J and 1 ≤ k ≤ K, we must have

    f_i^{j|k} = Σ_{ℓ=1}^k [ C(i−1, ℓ−1)·C(n−i, k−ℓ) / C(n−1, k−1) ]·q_i^{j|ℓ}.
The proof is omitted due to lack of space. The proof of Lemma 7 follows directly
from the claim.
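The coefficient in the claim has a direct probabilistic meaning: it is the probability that the kth best candidate overall, conditioned on sitting at position i, is the ℓth best among the first i candidates. The short sketch below (ours, illustrative) checks that these weights form a distribution over ℓ.

    from math import comb

    def weight(n, i, k, l):
        # C(i-1, l-1) * C(n-i, k-l) / C(n-1, k-1)
        return comb(i - 1, l - 1) * comb(n - i, k - l) / comb(n - 1, k - 1)

    n, i, k = 10, 4, 3
    assert abs(sum(weight(n, i, k, l) for l in range(1, k + 1)) - 1) < 1e-12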
Lemma 7 shows that the optimal solution to (P) is an upper-bound on the per-
formance of the mechanism. The following lemma shows that every LP solution
actually corresponds to a mechanism which performs as well as the objective
value of the solution.
Lemma 8 (LP solution to mechanism). Let (p, q) be any feasible LP solution to (P). Then consider the mechanism π defined inductively as follows. For each position 1 ≤ i ≤ n,

– If the mechanism has not selected any candidate among positions {1, . . . , i − 1} and the rank of candidate i among {1, . . . , i} is k for some 1 ≤ k ≤ K, then select candidate i with probability q_i^{1|k} / (1 − Σ_{ℓ<i} p_ℓ^1).
– If the mechanism has selected j − 1 candidates in positions 1, . . . , i − 1 for some 2 ≤ j ≤ J and the rank of candidate i among {1, . . . , i} is k for some 1 ≤ k ≤ K, then select candidate i with probability q_i^{j|k} / ( Σ_{ℓ<i} p_ℓ^{j−1} − Σ_{ℓ<i} p_ℓ^j ).
– Else do not select candidate i.

Then the expected number of top K candidates selected by π is exactly F(p, q).
Proof (Sketch). The proof is by induction on the steps of the mechanism. It can be verified easily that the procedure above maintains, by induction, that p_i^j(π) = p_i^j and q_i^{j|k}(π) = q_i^{j|k}. That is, the probability that the mechanism selects the ith candidate in the jth round is the same as in the LP. As stated in Lemma 7, there is a correspondence between the values of q_i^{j|k}(π) and f_i^{j|k}(π), the probabilities of hiring the ith candidate in the jth round given that the candidate is the kth best. Thus, the objective function of π is exactly F(p, q).
We now give optimal mechanisms for the (1, 2)- and (2, 1)-secretary problems. Observe that the (1, 1)-secretary problem is the traditional secretary problem.

Theorem 4. There exist mechanisms which achieve a performance of
1. 1/e + 1/e^{3/2} ≈ 0.591 for the (2, 1)-secretary problem.
2. ≈ 0.572284 for the (1, 2)-secretary problem.
Moreover, all these mechanisms are (nearly) optimal.
Proof. (Sketch) To give a mechanism, we will give a primal solution to LP(J, K). The optimality is shown by exhibiting a dual solution of the same value. Due to lack of space we only prove the (2, 1) case.

(2,1)-secretary. Let t_1 = n/e^{3/2} and t_2 = n/e. Consider the following mechanism: select the ith candidate if the ith candidate is the best so far, t_1 ≤ i < t_2, and no other candidate has been selected, or if t_2 ≤ i ≤ n, the ith candidate is the best so far, and at most one candidate has been selected. The performance of this mechanism is 1/e + 1/e^{3/2}. The mechanism corresponds to the primal LP solution where p_i^1 = 0 for 1 ≤ i < t_1 and p_i^1 = (t_1 − 1)/(i(i − 1)) for t_1 ≤ i ≤ n; p_i^2 = 0 for 1 ≤ i < t_2 and p_i^2 = (t_2 − t_1)/(i(i − 1)) + ((t_1 − 1)/(i(i − 1)))·Σ_{j=t_2}^{i−1} 1/(j − 1) for t_2 ≤ i ≤ n; and q_i^{j|1} = i·p_i^j for each 1 ≤ j ≤ 2 and 1 ≤ i ≤ n.
Dual Solution. We first simplify the primal linear program by eliminating the q_i^{j|k} variables using the first set of constraints. Let y_i denote the dual variables corresponding to the second set of constraints and z_i the variables corresponding to the third set of constraints. Then the following dual solution is of value 1/e + 1/e^{3/2} − o(1). Set z_i = 0 for 1 ≤ i < t_2 and z_i = (1/n)(1 − Σ_{j=i+1}^n 1/j) for t_2 ≤ i ≤ n. Set y_i = 0 for 1 ≤ i < t_1, y_i = (1/n)(1 − Σ_{j=i+1}^n 1/j) + Σ_{j=t_2}^n (1/(i·n))(1 − Σ_{k=j+1}^n 1/k) for t_1 ≤ i < t_2, and y_i = (1/n)(1 − Σ_{j=i+1}^n 1/j) + Σ_{j=i}^n (1/(i·n))(1 − Σ_{k=j+1}^n 1/k) for t_2 ≤ i ≤ n.
5 Further Discussion
Characterizing the set of mechanisms in secretary type problems as a linear poly-
tope possesses many advantages. In contrast to methods of factor revealing LPs
in which linear programs are used to analyze a single algorithm, here we char-
acterize all mechanisms by a linear program. One direction for future research
is trying to capture more complex settings of a more combinatorial nature. One
such example is the clean problem studied in [4] in which elements of a matroid
arrive one-by-one. This problem seems extremely appealing since matroid con-
straints are exactly captured by a linear program. Another promising direction
is obtaining upper bounds. While the linear program which characterizes the
performance may be too complex to obtain a simple mechanism, the dual linear
may still be used for obtaining upper bounds on the performance of any mech-
anism. We believe that linear programming and duality is a powerful approach
for studying secretary problems and will be applicable in more generality.
References
1. Awerbuch, B., Azar, Y., Meyerson, A.: Reducing Truth-Telling Online Mechanisms
to Online Optimization. In: Proceedings of ACM Symposium on Theory of Com-
puting, pp. 503–510 (2003)
2. Babaioff, M., Immorlica, N., Kempe, D., Kleinberg, R.: A Knapsack Secretary
Problem with Applications. In: Charikar, M., Jansen, K., Reingold, O., Rolim,
J.D.P. (eds.) RANDOM 2007 and APPROX 2007. LNCS, vol. 4627, pp. 16–28.
Springer, Heidelberg (2007)
3. Babaioff, M., Immorlica, N., Kempe, D., Kleinberg, R.: Online Auctions and Gen-
eralized Secretary Problems. SIGecom Exchange 7, 1–11 (2008)
4. Babaioff, M., Immorlica, N., Kleinberg, R.: Matroids, Secretary Problems, and
Online Mechanisms. In: Proceedings 18th ACM-SIAM Symposium on Discrete
Algorithms (2007)
5. Babaioff, M., Dinitz, M., Gupta, A., Immorlica, N., Talwar, K.: Secretary problems:
weights and discounts. In: SODA ’09: Proceedings of the Nineteenth Annual ACM-SIAM
Symposium on Discrete Algorithms, pp. 1245–1254. Society for Industrial
and Applied Mathematics, Philadelphia (2009)
6. Buchbinder, N., Singh, M., Jain, K.: Incentives in Online Auctions and Secretary
Problems via Linear Programming (2009) (manuscript)
7. Buchbinder, N., Jain, K., Naor, J(S.): Online primal-dual algorithms for maximiz-
ing ad-auctions revenue. In: Arge, L., Hoffmann, M., Welzl, E. (eds.) ESA 2007.
LNCS, vol. 4698, pp. 253–264. Springer, Heidelberg (2007)
8. Dynkin, E.B.: The Optimum Choice of the Instant for Stopping a Markov Process.
Sov. Math. Dokl. 4 (1963)
9. Ferguson, T.S.: Who Solved the Secretary Problem? Statist. Sci. 4, 282–289 (1989)
10. Freeman, P.R.: The Secretary Problem and its Extensions: A Review. International
Statistical Review 51, 189–206 (1983)
11. Gardner, M.: Mathematical Games. Scientific American, 150–153 (1960)
12. Goemans, M., Kleinberg, J.: An improved approximation ratio for the minimum
latency problem. In: SODA ’96: Proceedings of the seventh annual ACM-SIAM
symposium on Discrete algorithms, pp. 152–158 (1996)
13. Hajiaghayi, M.T., Kleinberg, R., Parkes, D.C.: Adaptive Limited-Supply Online
Auctions. In: Proceedings of the 5th ACM Conference on Electronic Commerce
(2004)
14. Jain, K., Mahdian, M., Markakis, E., Saberi, A., Vazirani, V.V.: Greedy facil-
ity location algorithms analyzed using dual fitting with factor-revealing lp. J.
ACM 50(6), 795–824 (2003)
15. Kleinberg, R.: A Multiple-Choice Secretary Algorithm with Applications to Online
Auctions. In: Proceedings of the Sixteenth Annual ACM-SIAM Symposium on
Discrete algorithms (2005)
16. Korula, N., Pál, M.: Algorithms for secretary problems on graphs and hypergraphs.
In: Albers, S., Marchetti-Spaccamela, A., Matias, Y., Nikoletseas, S., Thomas, W.
(eds.) ICALP 2009. LNCS, vol. 5556, pp. 508–520. Springer, Heidelberg (2009)
17. Lavi, R., Nisan, N.: Competitive Analysis of Incentive Compatible On-line Auc-
tions. In: Proceedings of 2nd ACM Conf. on Electronic Commerce, pp. 233–241
(2000)
18. Lindley, D.V.: Dynamic Programming and Decision Theory. Applied Statistics 10,
39–51 (1961)
19. Mehta, A., Saberi, A., Vazirani, U., Vazirani, V.: Adwords and generalized online
matching. J. ACM 54(5), 22 (2007)
20. Samuels, S.M.: Secretary Problems. In: Handbook of Sequential Analysis, vol. 118,
pp. 381–405 (1991)
Branched Polyhedral Systems
1 Introduction
An extended formulation for a polyhedron P ⊆ Rⁿ is a linear system Ay ≤ b defining a polyhedron Q = {y ∈ R^d | Ay ≤ b} such that there is a projection (linear map) p : R^d → Rⁿ with p(Q) = P. With respect to optimization of linear functionals x ↦ ⟨c, x⟩ over P, such an extended formulation is similarly useful as a description of P by means of linear inequalities in Rⁿ, since we have max{⟨c, x⟩ | x ∈ P} = max{⟨p*(c), y⟩ | Ay ≤ b} with the map p* : Rⁿ → R^d that is adjoint to p, i.e., p*(x) = Tᵗx for all x ∈ Rⁿ if T ∈ R^{n×d} is the matrix with p(y) = Ty for all y ∈ R^d.
Extended formulations play an increasingly important role in polyhedral com-
binatorics and mixed integer programming. For a survey on extended formula-
tions in combinatorial optimization we refer to [2], and for examples of recent
interesting results on extended formulations for mixed integer problems to, e.g.,
[3,4]. The reason for the importance of extended formulations lies in the fact that they can have far fewer constraints than a description in the original space. Moreover, in many cases the extensions reflect the structure of the underlying problem much better, because not all relevant aspects can be expressed linearly in the original variables. In fact, in several cases, (small) extended formulations are rather easy to derive, while it seems extremely difficult to come up with a linear description in the original space.
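A standard small example illustrating this (ours, not from this paper): the cross-polytope P = {x ∈ Rⁿ | Σ_{i=1}^n |x_i| ≤ 1} needs all 2ⁿ facet inequalities Σ_{i=1}^n ε_i x_i ≤ 1 (ε ∈ {−1, +1}ⁿ) in the original space, whereas the system

    x = y − z,    Σ_{i=1}^n (y_i + z_i) ≤ 1,    y ≥ O, z ≥ O

in R³ⁿ together with the projection (x, y, z) ↦ x is an extended formulation with only 2n + 1 inequalities (for the non-trivial inclusion, take y_i = max{x_i, 0} and z_i = max{−x_i, 0}).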
2 The Concept
A branched polyhedral system (BPS) is a pair B = (D, (P^(v))) of an acyclic directed graph D = (V, A) with a unique source s ∈ V and a family P^(v) ⊆ R^{N^out(v)} (v ∈ V \ V_T) of non-empty polyhedra, such that, for every v ∈ V \ V_T, there is an admissible pair (G_cv^(v), G_cc^(v)) of generating sets, i.e., finite sets ∅ ≠ G_cv^(v) ⊆ R^{N^out(v)} and G_cc^(v) ⊆ R^{N^out(v)} with

    P^(v) = conv(G_cv^(v)) + ccone(G_cc^(v))

satisfying the following for all x̃ ∈ G_cv^(v) ∪ G_cc^(v):
(S.1) x̃_w > 0 for all w ∈ supp(x̃) \ V_T
(S.2) R(w) ∩ R(w′) = ∅ for all w, w′ ∈ supp(x̃), w ≠ w′
If P^(v) is pointed (i.e., it has vertices), then we only need to consider the vertex set G_cv^(v) of P^(v) and some set G_cc^(v) that contains exactly one non-zero vector from each extreme ray of the recession cone of P^(v). As we can identify N^out(v) with δ^out(v), we will also consider the polyhedra P^(v) as subsets of R^{δ^out(v)}.
For the remainder of this section, let B = (D, (P^(v))) be a BPS with D = (V, A) and source node s ∈ V. We fix one family (G_cv^(v), G_cc^(v)) of admissible pairs of generating sets.
With respect to the fixed family of admissible pairs of generating sets, we define two finite sets G_cv(B), G_cc(B) ⊆ R^V that will be used to define a polyhedron associated with the BPS later (it will turn out that this polyhedron does not depend on the particular family, but on the BPS only).
We start by constructing G_cv(B) ⊆ R^V as the set that contains all points x ∈ R^V for which the following holds:
(V.1) x_s = 1
(V.2) For each v ∈ supp(x) \ V_T, we have (1/x_v) · x_{N^out(v)} ∈ G_cv^(v).
(V.3) For each v ∈ supp(x) \ {s}, we have x_{N^in(v)} ≠ O_{N^in(v)}.
Note that G_cv(B) ≠ ∅ holds, since we required the polyhedra P^(v) to be non-empty. Furthermore, looking at the nodes in the order of a topological ordering of the acyclic digraph D (i.e., v appears before w in the ordering for all (v, w) ∈ A), one finds |G_cv(B)| < ∞.
For a node v ∈ V, the truncation of B at v is the BPS B_v induced by B on the reach R(v) of v in D (with v as its source node). Clearly, a family of admissible pairs of generating sets for B induces such a family for B_v, to which we refer in the following characterization of G_cv(B) that in particular clarifies the name branched polyhedral system. For a vector x ∈ R^V, we denote by A[x] the set of arcs (v, w) ∈ A with x_v, x_w ≠ 0. We omit the proof of the following proposition in this extended abstract.
Proposition 1. For each x ∈ R^V, we have x ∈ G_cv(B) if and only if
(a) x_s = 1,
(b) (1/x_v) · x_{R(v)} ∈ G_cv(B_v) for each v ∈ supp(x) \ V_T, and
(c) A[x] is a branching in D with root s, supp(x) being the set of covered nodes.
We continue by constructing G_cc(B) as the set of all x ∈ R^V for which there is some node v_x ∈ V \ V_T (the root node of x) with:
(R.1) x_{N^out(v_x)} ∈ G_cc^{(v_x)}
(R.2) x_{V̄} = O for V̄ = V \ ∪{R(w) | w ∈ N^out(v_x) ∩ supp(x)}
(R.3) (1/x_w) · x_{R(w)} ∈ G_cv(B_w) for all w ∈ N^out(v_x) ∩ supp(x)
Note that, since G_cv(B_w) is finite for all w ∈ V \ V_T, we also have |G_cc(B)| < ∞ (we already observed |G_cv(B)| < ∞).
G_cv(B(C)) = {χ(F) ∈ {0,1}^V | F ⊆ V feasible for C} (and, clearly, G_cc(B(C)) = {O}). In particular, Proposition 1 (c) implies that A ∩ (F × F) is a branching with root s, and, with P(C) = P(B(C)), we have P(C) = conv{χ(F) ∈ {0,1}^V | F ⊆ V feasible for C}.
3 Inequality Descriptions
For a non-empty polyhedron ∅ ≠ Q ⊆ R^d with recession cone rec(Q) = {z ∈ R^d | z̃ + z ∈ Q for all z̃ ∈ Q}, the homogenization

    homog(Q) = cone{(z, 1) | z ∈ Q} + {(z, 0) | z ∈ rec(Q)} ⊆ R^d ⊕ R

of Q = {z ∈ R^d | Az ≤ b} with A ∈ R^{m×d}, b ∈ R^m is the polyhedral cone

    homog(Q) = {(z, ξ) ∈ R^d ⊕ R | Az − ξb ≤ 0, ξ ≥ 0}.
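As a small worked example (ours): for Q = {z ∈ R | z ≥ 1}, i.e., Az ≤ b with A = (−1) and b = (−1), we have rec(Q) = {z ∈ R | z ≥ 0} and

    homog(Q) = {(z, ξ) ∈ R ⊕ R | −z + ξ ≤ 0, ξ ≥ 0} = {(z, ξ) | z ≥ ξ ≥ 0},

which is exactly cone{(z, 1) | z ≥ 1} + {(z, 0) | z ≥ 0}.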
The following result, showing how to obtain, for a BPS B, an extended formula-
tion for P(B) from extended formulations for the polyhedra P (v) , makes up the
core of the paper.
equations (7)) and ν_v = c̄_v for all v ∈ V (for equations (5) and (6)). Obviously, the vector (ν, μ, λ) has dual objective function value ν_s = c̄_s. We omit the calculation showing that (ν, μ, λ) is dually feasible (and revealing the redundancy of (9)) in this extended abstract.
Corollary 1. Let B = (D, (P (v) )) be a branched polyhedral system with a di-
graph D = (V, A) with source node s ∈ V , and let VT ⊆ V be the set of sinks
of D. Then, with the polyhedron Q(B) ⊆ R^V ⊕ R^A defined by the system

    x_s = 1                                                    (10)
    x_v = y(δ^in(v))               for all v ∈ V \ {s}         (11)
    (y_{δ^out(v)}, x_v) ∈ homog(P^(v))   for all v ∈ V \ V_T   (12)

and the orthogonal projection π : R^V ⊕ R^A → R^V, we have P(B) = π(Q(B)).
Remark 2. For each BPS B, the orthogonal projection of Q(B) (defined by (10),
(11), (12)) to the y-space is isomorphic to Q(B) (due to (10), (11)).
In case that the digraph D is a branching itself, the projection P(B) of Q(B) to
the x-space is isomorphic to Q(B) as well. Thus, in this case, from descriptions
of the polyhedra P (v) (v ∈ V \ VT ) we even obtain a description of P(B) in the
original space.
Proof. This follows readily from Corollary 1, because for branchings D = (V, A),
(11) means xw = y(v,w) for all (v, w) ∈ A.
4 Applications
4.1 Unions of Polyhedra
The following extended formulation for the convex hull of the union of finitely
many polyhedra is basically due to Balas [1] (see also [2, Thm. 5.1]). We show
how Balas’ result can be established as a consequence of Theorem 1. The slight
generalization to the case that the given polyhedra themselves are specified by
extended formulations is used, e.g., in [8] in order to construct compact extended
formulations of the polytopes associated with the cycles of length log n in a
complete graph with n nodes. We use conv(S) to denote the topological closure
of the convex hull of S.
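The theorem established here is Balas' classical result; since its displayed statement and the numbered system (the constraints referred to as (2)–(4) and (8) in the proof below) lie outside this excerpt, we recall the shape of such a system in our own notation as a sketch: given extended formulations P^(i) = π^(i)(Q^(i)) with Q^(i) = {z^(i) | A^(i) z^(i) ≤ b^(i)} for i ∈ [q], one takes the polyhedron Q of all (z^(1), . . . , z^(q), x) with

    A^(i) z^(i) ≤ x_i b^(i)  (i ∈ [q]),    Σ_{i=1}^q x_i = 1,    x ≥ O,

together with the projection p(z^(1), . . . , z^(q), x) := Σ_{i=1}^q π^(i)(z^(i)); the claim is then conv(P^(1) ∪ · · · ∪ P^(q)) = p(Q), with conv taken in the closed sense defined above.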
Proof. We define a BPS B on the digraph D = (V, A) with node set V = {s} ∪ [q] ∪ {t₁, . . . , t_n} and arc set A = ({s} × [q]) ∪ ([q] × {t₁, . . . , t_n}); thus, V_T = {t₁, . . . , t_n} is the set of sinks of D. Identifying Rⁿ with R^{V_T}, the polyhedra associated with the non-sink nodes are P^(i) ⊆ R^{V_T} (for i ∈ [q]) and P^(s) = conv(G_cv^(s)) ⊆ R^{[q]} with G_cv^(s) = {e₁, . . . , e_q} (and G_cc^(s) = ∅). We choose any finite sets G_cv^(i), G_cc^(i) ⊆ R^{V_T} with P^(i) = conv(G_cv^(i)) + ccone(G_cc^(i)) for all i ∈ [q].
For each x ∈ R^V we have x ∈ G_cv(B) if and only if x_s = 1 and there is some i ∈ [q] with x_{[q]} = e_i ∈ R^{[q]} as well as x_{V_T} ∈ G_cv^(i); similarly, x ∈ G_cc(B) if and only if x_s = 0, x_{[q]} = O, and x_{V_T} ∈ G_cc^(i) for some i ∈ [q]. Denoting by [ · ]_{V_T} the orthogonal projection of a set of vectors to the x_{V_T}-space, we thus find [conv(G_cv(B))]_{V_T} = conv(∪_{i=1}^q G_cv^(i)) and [ccone(G_cc(B))]_{V_T} = ccone(∪_{i=1}^q G_cc^(i)),
from which conv(P^(1) ∪ · · · ∪ P^(q)) ⊆ [P(B)]_{V_T} follows, which implies conv(P^(1) ∪
from Theorem 1, and the inclusion from the fact that (2) and (3) imply x_{V_T} = Σ_{i=1}^q π^(i)(z^(i)). Thus, we have conv(P^(1) ∪ · · · ∪ P^(q)) ⊆ p(Q).
To see the reverse inclusion, let (z^(1), . . . , z^(q), x) ∈ Q. With I = {i ∈ [q] | x_i > 0} we find (1/x_i) A^(i) z^(i) ≤ b^(i) from (4) or (8), hence (1/x_i) z^(i) ∈ Q^(i) for all i ∈ I, and thus (as we have Σ_{i∈I} x_i = 1 due to (4) or (8) as well as x ≥ O)

    v = Σ_{i∈I} π^(i)(z^(i)) ∈ conv(P^(1) ∪ · · · ∪ P^(q)).   (15)
If I = [q], we have p(z^(1), . . . , z^(q), x) = v, and (15) proves the claim. Otherwise, for each i ∈ [q] \ I (again from (4) resp. (8)) we find A^(i) z^(i) ≤ O, hence z^(i) ∈ rec(Q^(i)), and thus π^(i)(z^(i)) ∈ rec(P^(i)). Therefore, choosing x^(i) ∈ P^(i) ≠ ∅ arbitrarily for all i ∈ [q] \ I, we have

    w^(ε) = Σ_{i∈[q]\I} (1/(q−|I|)) · ( x^(i) + ((q−|I|)/ε) · π^(i)(z^(i)) ) ∈ conv(P^(1) ∪ · · · ∪ P^(q))   (16)

for all ε > 0. By (15) and (16) we obtain (1−ε)v + εw^(ε) ∈ conv(P^(1) ∪ · · · ∪ P^(q)) for all 0 < ε ≤ 1. Due to p(z^(1), . . . , z^(q), x) = lim_{ε→0} ((1−ε)v + εw^(ε)), this proves the claim.
where, for layout reasons, we use y_{v,i+1,j,ℓ} = y_{(v,a(i+1,j,ℓ))} + y_{(v,b(i+1,j,ℓ))}; (17) has to be included into the system for all v ∈ V \ {s}, (18) for all 1 ≤ k ≤ q−1, as well as (19) and (20) for all i ∈ [p−1], j, ℓ ∈ [q] with j ≤ ℓ, v ∈ {a(i, j, ℓ), b(i, j, ℓ)}, and 1 ≤ k ≤ ℓ−1 (where the latter only refers to (20)).
    Σ_{(s,S)∈H} z_{(s,S)} = 1                                                              (21)
    Σ_{(v,S)∈H} z_{(v,S)} = Σ_{(u,S′)∈H : v∈S′} z_{(u,S′)}   for all v ∈ V \ ({s} ∪ V_T)   (22)
    z ≥ O.                                                                                 (23)
System (21), (22), (23) arises from Theorem 1 in the following way. We construct
a BCS on the acyclic digraph D = (V, A) on the node set V of the hypergraph
H, where A consists of all arcs (v, w) for which there is some (v, S) ∈ H with
w ∈ S. Clearly, VT (the set of boundary states) is the set of all sinks of D.
Defining S^(v) = {S ⊆ V | (v, S) ∈ H} for every v ∈ V \ V_T, we obtain a BCS C (for this, the existence of the reference sets R(v) is crucial) whose feasible sets are exactly the node sets of hyperpaths in H. Describing the polytopes P^(v) = conv{χ(S) | S ∈ S^(v)}, for all v ∈ V \ V_T, by their trivial extended formulations P^(v) = π^(v)(Q^(v)) with Q^(v) = {z^(v) ∈ R₊^{S^(v)} | Σ_{S∈S^(v)} z_S^(v) = 1} and π^(v)(z^(v)) = Σ_{S∈S^(v)} z_S^(v) · χ(S), Theorem 1 yields an extended formulation of the convex hull P(C) of all node sets of hyperpaths in H that is isomorphic to (21), (22), (23).
As the feasible sets for the BCS C are the node sets of the hyperpaths in H
(representing the set of states used in order to construct a solution during the
dynamic programming), quite often, the polytope associated with the combina-
torial optimization problem solved by the dynamic programming algorithm is
a projection of P(C), and thus, (21), (22), (23) provides an extended formula-
tion for that polytope. In principle, the concept of BCS allows more flexibility
here, because we can choose other representations of the polytopes P (v) than
their trivial extended formulations. This might be advantageous, e.g., if there are
states v for which the number of hyperedges (v, S) is much larger than the num-
ber of states. For the full orbitope example in Section 4.2 this is not the case.
In fact, one could derive the extended formulation in Theorem 2 also within the
hypergraph framework (and eliminating some redundant variables afterwards).
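As a minimal illustration (our example, not taken from the paper): if every hyperedge of H has a singleton head set, i.e., H = {(v, {w}) | (v, w) ∈ A} for an acyclic digraph D = (V, A), then the hyperpaths are exactly the ordinary paths from s to the sinks, the variables z_{(v,{w})} can be read as arc variables, (21) sends one unit of flow out of s, and (22) is flow conservation at the internal nodes; the system (21), (22), (23) then specializes to the standard network-flow extended formulation of the path polytope of D.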
    P^(0) = {x ∈ R₊^{E₂} | x(δ(i)) = 1 (i ∈ [2]), x(δ(I)) ≥ 1 (I ⊆ [2], |I| odd)}

(due to Edmonds [5]) and P^({i,j}) = {x ∈ R₊^{N_{i,j}} | x(N_{i,j}) = 1} for all {i, j} ∈ E₂, the system (27), (28), (29) provides the linear description
for all {i, j} ∈ E₂. As for each P^({i,j}) the equation x(W_i : W_j) = p holds, the system (27), (28), (29) for the modified example yields the linear description
Acknowledgements. We are grateful to the referees for careful reading and some very helpful remarks.
References
1. Balas, E.: Disjunctive programming and a hierarchy of relaxations for discrete
optimization problems. SIAM J. Algebraic Discrete Methods 6(3), 466–486 (1985)
2. Conforti, M., Cornuéjols, G., Zambelli, G.: Extended Formulations in Combinato-
rial Optimization. Technical Report (2009)
3. Conforti, M., Cornuéjols, G., Zambelli, G.: Polyhedral approaches to mixed inte-
ger linear programming. In: Jünger, M., Liebling, T., Naddef, D., Nemhauser, G.,
Pulleyblank, W., Reinelt, G., Rinaldi, G., Wolsey, L. (eds.) 50 Years of Integer
Programming 1958-2008. Springer, Heidelberg (2010)
4. Conforti, M., Di Summa, M., Eisenbrand, F., Wolsey, L.: Network formulations of
mixed-integer programs. Math. Oper. Res. 34, 194–209 (2009)
5. Edmonds, J.: Maximum matching and a polyhedron with 0, 1 vertices. Journal of
Research of the National Bureau of Standards 69B, 125–130 (1965)
6. Faenza, Y., Kaibel, V.: Extended formulations for packing and partitioning or-
bitopes. Math. Oper. Res. 34(3), 686–697 (2009)
7. Faenza, Y., Oriolo, G., Stauffer, G.: The hidden matching structure of the compo-
sition of strips: a polyhedral perspective. In: 14th Aussois Workshop on Combina-
torial Optimization, Aussois (January 2010)
8. Kaibel, V., Pashkovich, K., Theis, D.O.: Symmetry matters for the sizes of extended
formulations. In: Eisenbrand, F., Shepherd, B. (eds.) IPCO 2010. LNCS, vol. 6080,
pp. 135–148. Springer, Heidelberg (2010)
9. Kaibel, V., Pfetsch, M.: Packing and partitioning orbitopes. Math. Program. 114(1,
Ser. A), 1–36 (2008)
10. Margot, F.: Composition de Polytopes Combinatoires: Une Approche par Projec-
tion. Ph.D. thesis, École Polytechnique Fédérale de Lausanne (1994)
11. Martin, R.K., Rardin, R.L., Campbell, B.A.: Polyhedral characterization of discrete
dynamic programming. Oper. Res. 38(1), 127–138 (1990)
12. Schaffers, M.: On Links Between Graphs with Bounded Decomposability, Existence
of Efficient Algorithms, and Existence of Polyhedral Characterizations. Ph.D. the-
sis, Université Catholique de Louvain (1994)
Hitting Diamonds and Growing Cacti
1 Introduction
Graphs in this paper are finite, undirected, and may contain parallel edges but
no loops. We study the following combinatorial optimization problem: given a
vertex-weighted graph, remove a minimum cost subset of vertices so that all
the cycles in the resulting graph are edge-disjoint. We call this problem the
diamond hitting set problem, because it is equivalent to covering all subgraphs
which are diamonds with a minimum cost subset of vertices, where a diamond
is any subdivision of the graph consisting of three parallel edges.
The diamond hitting set problem can be thought of as a generalization of
the vertex cover and feedback vertex set problems: Suppose you wish to remove
a minimum cost subset of vertices so that the resulting graph has no pair of
vertices linked by k internally disjoint paths. Then, for k = 1 and k = 2, this
is respectively the vertex cover problem and feedback vertex set problem, while
for k = 3 this corresponds to the diamond hitting set problem.
This work was supported by the “Actions de Recherche Concertées” (ARC) fund
of the “Communauté française de Belgique”.
G. Joret is a Postdoctoral Researcher of the Fonds National de la Recherche
Scientifique (F.R.S.–FNRS).
This work was done while U.P. was at Département de Mathématique - Université
Libre de Bruxelles as a Postdoctoral Researcher of the F.R.S.–FNRS.
It is well-known that both the vertex cover and feedback vertex set problems admit constant-factor approximation algorithms¹. Hence, it is natural to ask whether the same is true for the diamond hitting set problem. Our main contribution is a positive answer to this question.
Although there exists a simple 2-approximation algorithm for the vertex cover
problem, there is strong evidence that approximating the problem with a fac-
tor of 2 − ε might be hard, for every ε > 0 [8]. It should be noted that the
feedback vertex set and diamond hitting set problems are at least as hard to
approximate as the vertex cover problem, in the sense that the existence of a
ρ-approximation algorithm for one of these two problems implies the existence of
a ρ-approximation algorithm for the vertex cover problem, where ρ is a constant.
Concerning the feedback vertex set problem, the first approximation algo-
rithm is due to Bar-Yehuda, Geiger, Naor, and Roth [2] and its approximation
factor is O(log n). Later, 2-approximation algorithms have been proposed by
Bafna, Berman, and Fujito [1], and Becker and Geiger [3]. Chudak, Goemans,
Hochbaum and Williamson [4] showed that these algorithms can be seen as de-
riving from the primal-dual method (see for instance [9,7]). Starting with an
integer programming formulation of the problem, these algorithms simultane-
ously construct a feasible integral solution and a feasible dual solution of the
linear programming relaxation, such that the values of these two solutions are
within a constant factor of each other.
These algorithms also lead to a characterization of the integrality gap² of
two different integer programming formulations of the problem, as we now ex-
plain. Let C(G) denote the collection of all the cycles C of G. A natural integer
programming formulation for the feedback vertex set problem is as follows:
    Min Σ_{v∈V(G)} c_v x_v
    s.t. Σ_{v∈V(C)} x_v ≥ 1   ∀C ∈ C(G)   (1)
         x_v ∈ {0, 1}         ∀v ∈ V(G).
A better formulation has been introduced by Chudak et al. [4]. For S ⊆ V (G),
denote by E(S) the set of the edges of G having both ends in S, by G[S] the
subgraph of G induced by S, and by dS (v) the degree of v in G[S]. Then, the
following is a formulation for the feedback vertex set problem:
    Min Σ_{v∈V(G)} c_v x_v
    s.t. Σ_{v∈S} (d_S(v) − 1) x_v ≥ |E(S)| − |S| + 1   ∀S ⊆ V(G) : E(S) ≠ ∅   (2)
         x_v ∈ {0, 1}   ∀v ∈ V(G).
Chudak et al. [4] showed that the integrality gap of this integer program asymptotically equals 2. Constraints (2) derive from the simple observation that the removal of a feedback vertex set X from G generates a forest having at most |G| − |X| − 1 edges. Notice that the covering inequalities (1) are implied by (2): if no vertex of a cycle C were picked, then taking S = V(C) in (2) would give 0 on the left-hand side, while |E(S)| ≥ |S| makes the right-hand side at least 1.
First, we obtain an O(log n)-approximation algorithm for the diamond hitting set problem, leading to a proof that the integrality gap of the natural LP formulation is Θ(log n). Then, we develop a 9-approximation algorithm. Both the O(log n)- and the 9-approximation algorithm are based on the primal-dual method.
Our first key idea is contained in the following observation: every simple graph of order n and minimum degree at least 3 contains an O(log n)-size diamond. This directly yields an O(log n)-approximation algorithm for the diamond hitting set problem in the unweighted case. However, the weighted case requires more work.
Our second key idea is to generalize constraints (2) by introducing ‘sparsity inequalities’, which enable us to derive a constant-factor approximation algorithm
for the diamond hitting set problem: First, by using reduction operations, we
ensure that every vertex of G has at least three neighbors. Then, if G contains
a diamond with at most 9 edges, we raise the dual variable of the corresponding
covering constraint. Otherwise, no such small diamond exists in G, and we can
use this information to select the right sparsity inequality, and raise its dual
variable. This inequality would not be valid in case G contained a small diamond.
The way we use the non-existence of small diamonds is perhaps best explained
via an analogy with planar graphs: An n-vertex planar simple graph G has at
most 3n − 6 edges. However, if we know that G has no small cycle, then this
upper bound can be much strengthened. (For instance, if G is triangle-free then
G has at most 2n − 4 edges.)
We remark that this kind of local/global trade-off did not appear in the work
of Chudak et al. [4] on the feedback vertex set problem, because the cycle cov-
ering inequalities are implied by their more general inequalities. In our case, the
covering inequalities and the sparsity inequalities form two incomparable classes
of inequalities, and examples show that the sparsity inequalities alone are not
enough to derive a constant-factor approximation algorithm.
2 Preliminaries
A cactus is a connected graph where each edge belongs to at most one cycle.
Equivalently, a connected graph is a cactus if and only if each of its blocks is
isomorphic to either K2 or a cycle. Thus, a connected graph is a cactus if and
only if it does not contain a diamond as a subgraph. A graph without diamonds
is called a forest of cacti (see Figure 1).
A diamond hitting set (or simply hitting set) of a graph is a subset of vertices
that hits every diamond of the graph. A minimum (diamond) hitting set of a
weighted graph is a hitting set of minimum total cost, and its cost is denoted by OPT.
Let D(G) denote the collection of all diamonds contained in G. From the
standard IP formulation for a covering problem, we obtain the following LP
relaxation for the diamond hitting set problem:
    Min Σ_{v∈V(G)} c_v x_v
    s.t. Σ_{v∈V(D)} x_v ≥ 1   ∀D ∈ D(G)   (3)
         x_v ≥ 0              ∀v ∈ V(G).
3 Reductions
In this section, we define two reduction operations on graphs: First, we define the
‘shaving’ of an arbitrary graph, and then introduce a ‘bond reduction’ operation
for shaved graphs.
The aim of these two operations is to modify a given graph so that the follow-
ing useful property holds: each vertex either has at least three distinct neighbors,
or is incident to at least three parallel edges.
Fig. 2. (a) A shaved graph G with two maximal bonds (in grey). (b) Reduction of the
first bond. (c) Reduction of the second bond. The graph is now reduced.
The algorithm relies on the simple fact that every hitting set of a reduced graph G̃ of a graph G is also a hitting set of G itself. The set of diamonds computed by the algorithm yields a collection D of pairwise vertex-disjoint diamonds in G. In particular, the size of a minimum hitting set is at least |D|. For each diamond in D, at most 6 log_{3/2} n + 8 vertices were added to the hitting set X. Hence, the approximation factor of the algorithm is O(log n).
    x_v ≥ 0   ∀v ∈ V(G),

where k is the total number of iterations of the algorithm. The dual of (LP) is:

    (D)  Max Σ_{i=1}^k β_i y_i
         s.t. Σ_{i=1}^k a_{i,v} y_i ≤ c_v   ∀v ∈ V(G)
              y_i ≥ 0   ∀i ∈ {1, . . . , k}.
then ends and we check whether x satisfies all diamond inequalities. If so, then
we exit the loop, perform a reverse delete step, and output the current primal
solution.
The precise way the violated blended diamond inequality is chosen depends among other things on the residual cost (or slack) of the vertices. The residual cost of vertex v at the ith iteration is the number c_v − Σ_{j=1}^{i−1} a_{j,v} y_j. Note that the residual cost of a vertex is always nonnegative, and zero if and only if the vertex is tight.
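Schematically, the loop described above can be summarized as follows (our Python sketch; find_violated and is_feasible stand for the separation and feasibility subroutines of the actual algorithm, which we leave abstract, and we assume every returned inequality has at least one positive coefficient on a non-tight vertex):

    def primal_dual(cost, find_violated, is_feasible):
        residual = dict(cost)                  # residual[v] = c_v - sum_i a_{i,v} y_i
        X = []                                 # primal solution, in order of addition
        while not is_feasible(X):
            S, a = find_violated(X)            # violated inequality with coeffs a[v], v in S
            # raise its dual variable until the first vertex of S becomes tight
            theta = min(residual[v] / a[v] for v in S if a[v] > 0)
            for v in S:
                if a[v] > 0:
                    residual[v] -= theta * a[v]
                    if abs(residual[v]) < 1e-9 and v not in X:
                        X.append(v)            # tight vertices enter the solution
        for v in list(reversed(X)):            # reverse delete
            if is_feasible([u for u in X if u != v]):
                X.remove(v)
        return X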
We can prove that the integrality gap of the natural LP relaxation for the prob-
lem (see page 194) is Θ(log n). This result is obtained using expander graphs
with large girth [6, Section 5.4].
6 A 9-Approximation Algorithm
In this section we give a primal-dual 9-approximation algorithm for the diamond
hitting set problem. We start with a sketch of the algorithm in Section 6.1. The
algorithm makes use of the sparsity inequalities. In order to describe them, we
first bound the number of edges in a forest of cacti in Section 6.2; using this
bound, in Sections 6.3 and 6.4 we introduce the sparsity inequalities and prove
their validity. Once again, missing proofs and details can be found in the full
version of this paper [6, Section 6].
In the whole section, q is a global parameter of our algorithm which is set to
5. (We remark that the analysis below could be adapted to other values of q,
but this would not give an approximation factor better than 9.)
    γ_{>q}(F) ≤ (1/q) ( |F| − k − Σ_{i=2}^q (i − 1) γ_i(F) ).   (5)

    ||F|| ≤ ((q+1)/q) (|F| − k) + Σ_{i=2}^q ((q−i+1)/q) γ_i(F).
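For orientation, here is our reconstruction of the counting behind these two bounds (it is not part of this extract): a forest of cacti F with k components satisfies ||F|| = |F| − k + Σ_{i=2}^q γ_i(F) + γ_{>q}(F), since each of its edge-disjoint cycles adds exactly one edge beyond a spanning forest, and, the cycles being edge-disjoint, ||F|| ≥ Σ_{i=2}^q i·γ_i(F) + (q+1)·γ_{>q}(F). Combining the two relations yields (5), and substituting (5) back into the identity gives the bound on ||F||.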
In the following, d̃_G(v) := d_G(v) − Σ_{i=2}^q λ_i γ_i(G, v), where, for i ∈ {2, . . . , q}, γ_i(G, v) denotes the number of cycles of length i incident to v in G and

    λ_i := (q − i + 1)/(⌊i/2⌋ · q).
Lemma 3. Let X be a hitting set of a graph G where no two cycles of length at most q share an edge. Then,

    Σ_{v∈X} ( d̃_G(v) − (q+1)/q ) ≥ ||G|| − ((q+1)/q)·|G| − Σ_{i=2}^q ((q−i+1)/q)·γ_i(G) + (q+1)/q.   (6)
Proof. For i ∈ {2, . . . , q} and j ∈ {0, . . . , i}, we denote by ξ_i^j the number of cycles of G that have length i and exactly j vertices in X.
Letting ||X|| and |δ(X)| respectively denote the number of edges of G with both ends in X and the number of edges of G having an end in X and the other in V(G) − X, we have

    Σ_{v∈X} d̃_G(v) = 2||X|| + |δ(X)| − Σ_{i=2}^q Σ_{j=1}^i j·λ_i·ξ_i^j
                    = ||X|| + ||G|| − ||G − X|| − Σ_{i=2}^q Σ_{j=1}^i j·λ_i·ξ_i^j
                    ≥ ||X|| + ||G|| − ((q+1)/q)·(|G − X| − 1) − Σ_{i=2}^q ((q−i+1)/q)·ξ_i^0 − Σ_{i=2}^q Σ_{j=1}^i j·λ_i·ξ_i^j,

where the last inequality follows from Lemma 2 applied to the forest of cacti G − X (notice that γ_i(G − X) = ξ_i^0).
Because no two cycles of length at most q share an edge and, in a cycle of length i, each subset of size j induces a subgraph that contains at least 2j − i edges, we have

    ||X|| ≥ Σ_{i=2}^q Σ_{j=⌊i/2⌋+1}^i (2j − i)·ξ_i^j.
Thus, we obtain

    Σ_{v∈X} d̃_G(v) ≥ Σ_{i=2}^q Σ_{j=⌊i/2⌋+1}^i (2j − i)·ξ_i^j + ||G|| − ((q+1)/q)·(|G − X| − 1)
                      − Σ_{i=2}^q ((q−i+1)/q)·ξ_i^0 − Σ_{i=2}^q Σ_{j=1}^i j·λ_i·ξ_i^j.
We leave it to the reader to check that, in the right-hand side of the last inequality, the total coefficient of ξ_i^j is at least −(q−i+1)/q, for all i ∈ {2, . . . , q} and j ∈ {0, . . . , i}. Hence,
for H̃ yields a valid inequality also for H, where the coefficient of each variable corresponding to a vertex that was removed in the construction of H̃ is zero. However, as it is, the inequality is useless. We have to raise the coefficients of those variables in such a way that the resulting inequality remains valid.
First, we associate to each edge of H̃ a corresponding primitive subgraph in H, defined as follows. Consider an edge e ∈ E(H̃). If e was already present in H, then its primitive subgraph is the edge itself and its two ends. Otherwise, the primitive subgraph of e is the bond whose reduction produced e. In particular, if e has a twin edge e′, then the primitive subgraphs of e and e′ coincide. The primitive subgraph J of a subgraph J̃ ⊆ H̃ is defined simply as the union of the primitive subgraphs of every edge in E(J̃).
Thus, the graph H can be uniquely decomposed into simple or double pieces corresponding respectively to edges or pairs of parallel edges in H̃. Here, the pieces of H are defined as follows: let v and w be two adjacent vertices of H̃, and let J̃ denote the subgraph of H̃ induced by {v, w} (thus J̃ is either an edge or a pair of parallel edges, together with the endpoints v and w). The primitive subgraph of J̃ in H, say J, is a piece of H with ends v and w. Such a piece is simple if J̃ has exactly one edge and double otherwise (see Fig. 3), that is, if J̃ has two parallel edges. The vertices of H are of two types: the branch vertices are those that survive in H̃, and the other vertices are internal to some piece of H.
where

    a_v :=  d̃_{H̃}(v) − (q+1)/q   if v is a branch vertex,
            1                      if v is an internal vertex of a simple piece,
            2                      if v is an internal vertex of a double piece
                                   and does not belong to any handle,
            0                      if v belongs to the top handle
                                   of a cycle C with t(C) < b(C),
            2                      if v belongs to the bottom handle
                                   of a cycle C with t(C) < b(C),
            1                      if v belongs to a handle of a cycle C with t(C) = b(C)
                                   or C is special,
and

    β := ||H̃|| − ((q+1)/q)·|H̃| − Σ_{i=2}^q ((q−i+1)/q)·γ_i(H̃) + (q+1)/q.
In the definition of av above, we always assume that the cycle C is contained in
a double piece of H. The next lemma is proved in [6, Lemma 6.3].
The following lemma is the key to our analysis of Algorithm 3. Notice that the constant involved is smaller than the approximation guarantee of our algorithm.
The number 9 comes from the fact that we use blended diamond inequalities for
diamonds with up to 9 pieces.
Acknowledgements
We thank Dirk Oliver Theis for his valuable input in the early stage of this
research. We also thank Jean Cardinal and Marcin Kamiński for stimulating
discussions.
References
1. Bafna, V., Berman, P., Fujito, T.: A 2-approximation algorithm for the undirected
feedback vertex set problem. SIAM Journal on Discrete Mathematics 12(3), 289–297
(1999)
2. Bar-Yehuda, R., Geiger, D., Naor, J., Roth, R.M.: Approximation algorithms for
the feedback vertex set problem with applications to constraint satisfaction and
Bayesian inference. SIAM Journal on Computing 27(4), 942–959 (1998)
3. Becker, A., Geiger, D.: Optimization of Pearl’s method of conditioning and greedy-
like approximation algorithms for the vertex feedback set problem. Artificial Intel-
ligence 83, 167–188 (1996)
4. Chudak, F.A., Goemans, M.X., Hochbaum, D.S., Williamson, D.P.: A primal-dual
interpretation of two 2-approximation algorithms for the feedback vertex set problem
in undirected graphs. Operations Research Letters 22, 111–118 (1998)
5. Even, G., Naor, J., Schieber, B., Zosin, L.: Approximating minimum subset feedback
sets in undirected graphs with applications. SIAM Journal on Discrete Mathemat-
ics 13(2), 255–267 (2000)
6. Fiorini, S., Joret, G., Pietropaoli, U.: Hitting diamonds and growing cacti (2010),
http://arxiv.org/abs/0911.4366v2
7. Goemans, M.X., Williamson, D.P.: The primal-dual method for approximation al-
gorithms and its application to network design problems. In: Approximation Al-
gorithms for NP-Hard Problems, ch. 4, pp. 144–191. PWS Publishing Company
(1997)
8. Khot, S., Regev, O.: Vertex cover might be hard to approximate to within 2 − ε.
Journal of Computer and System Sciences 74(3), 334–349 (2008)
9. Papadimitriou, C.H., Steiglitz, K.: Combinatorial Optimization: Algorithms and
Complexity. Prentice-Hall, Englewood Cliffs (1982)
Approximability of 3- and 4-Hop Bounded
Disjoint Paths Problems
1 Introduction
Two major concerns in the design of modern communication networks are the
protection against potential failures and the permanent provision of a guaran-
teed minimum level of service quality. A wide variety of models expressing such
requirements may be found in the literature, e.g. [1,2,3,4]. Coping simultane-
ously with both requirements naturally leads to length-restricted disjoint paths
problems: In order to ensure that a pair of nodes remains connected also after
some nodes or edges of the network fail, one typically demands the existence
Supported by the DFG Research Center Matheon – Mathematics for key technologies.
Table 1. Known and new (bold) complexity results for node- and edge-disjoint ℓ-bounded paths problems
Fig. 1. Construction of the hop-extended digraph G′ (right) from the given graph G (left) in Step 1 of algorithm ExFlow. Arcs with cost 0 in G′ are thick.
ExFlow
1. Compute a min-cost max (s₀, t₄)-flow f in the hop-extended digraph G′.
   Let F := {P₁, . . . , P_k} be the corresponding 4-bounded simple paths in G.
2. Create the conflict graph H := (F, {P_iP_j | P_i ∩ P_j ≠ ∅}).
3. Compute an independent set S ⊆ F in H with |S| ≥ (1/2)·|F|.
4. Return S.
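For illustration, the layered construction behind Step 1 can be sketched as follows (our Python sketch; in particular, padding shorter paths with zero-cost arcs along the copies of t is our reading of the zero-cost arcs in Figure 1):

    def hop_extended_digraph(edges, cost, s, t, H=4):
        # Node (v, h) represents "v reached with exactly h hops"; every (s, t)-path
        # of G with at most H edges maps to an ((s, 0), (t, H))-path of G'.
        arcs = {}
        for h in range(H):
            for (v, w) in edges:
                for (a, b) in ((v, w), (w, v)):      # G is undirected
                    if a != t and b != s:            # never leave t, never re-enter s
                        arcs[((a, h), (b, h + 1))] = cost[(v, w)]
            arcs[((t, h), (t, h + 1))] = 0.0         # zero-cost padding at t
        return arcs                                  # arc -> cost, for the flow in Step 1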
Fig. 2. Graph G_i for variable x_i occurring as literals x_i^1, x̄_i^2, x̄_i^3.
Fig. 3. Graph H_l for clause C_l = (x̄_i^3 ∨ x_j^1 ∨ x̄_k^2).
In order to show that MEDP(4) is APX-hard, i.e., that there is some c > 1 such that approximating MEDP(4) within a factor less than c is NP-hard, we construct an approximation-preserving reduction from the Max-k-Sat(3) problem to MEDP(4). Given a set X of boolean variables and a collection C of disjunctive clauses such that each clause contains at most k literals and each variable occurs at most 3 times as a literal, the Max-k-Sat(3) problem is to find a truth assignment to the variables that maximizes the number of satisfied clauses. Max-k-Sat(3) is known to be APX-complete [27].
Theorem 3. MEDP(4) is APX-hard.
where w_i^4 = w_i^1 for notational simplicity. The nodes s and t are contained in all subgraphs and serve as source and destination for all paths. For each l ∈ L, we construct a clause graph H_l = (W_l, F_l) as shown in Figure 3. In addition to the nodes and edges it shares with the variable graphs, H_l contains 2 nodes and
Fig. 4. Union of G_i and H_l for variable x_i and clause C_l = (x̄_i^3 ∨ . . . ). Thick lines are paths in P̄_i and path Q_i^l.
The goal in the constructed MEDP(4) instance is to find the maximum number of
edge-disjoint 4-bounded (s, t)-paths in the simple undirected graph G obtained as
the union of all variable and clause (sub)-graphs. It is clear that the constructions
can be performed in polynomial time.
For notational convenience, we denote for each i ∈ I and j ∈ J the paths

    P_i^j = (s, u_i^j, v_i^j, w̄_i^j, t),    P_i^j′ = (s, ā_i^j, w̄_i^j, w_i^{j+1}, t) if x̄_i^j occurs,  and P_i^j′ = (s, ā_i^j, a_i^j, w_i^{j+1}, t) if x_i^j occurs,
    P̄_i^j = (s, u_i^j, v_i^j, w_i^j, t),    P̄_i^j′ = (s, a_i^j, ā_i^j, w̄_i^j, t) if x̄_i^j occurs,     and P̄_i^j′ = (s, a_i^j, w_i^{j+1}, w̄_i^j, t) if x_i^j occurs.
For each clause C_l that is satisfied by this truth assignment, let i_l(x̂) be one of the variables whose literal evaluates to true in C_l. We define

    S = S(x̂) := ∪_{i∈I : x̂_i=true} P_i ∪ ∪_{i∈I : x̂_i=false} P̄_i ∪ {Q_{i_l(x̂)}^l | l ∈ L : C_l(x̂) = true}.

Clearly, all paths in S contain at most 4 edges, |S| = 6|I| + r, and all paths in S ∩ ∪_i (P_i ∪ P̄_i) are edge-disjoint. Note that if some path Q_i^l is contained in S, then either the negated literal x̄_i^j occurring in clause C_l evaluates to true, which implies that x_i = false and P̄_i^j′ ∈ S, or the unnegated literal x_i^j occurring in C_l evaluates to true and, hence, P_i^j′ ∈ S. Furthermore, observe that the paths P_i^j′ and P̄_i^j′ are the only paths that may be contained in S and share an edge with Q_{i_l}^l. Consequently, each path Q_i^l ∈ S is edge-disjoint to any other path in S and, thus, all paths in S are edge-disjoint.
In the second part of the proof we show that any set S of 6|I| + r edge-disjoint 4-bounded (s, t)-paths in G can be transformed into a truth assignment x̂(S) that satisfies at least r clauses of the given Max-k-Sat(3) instance. We may ignore path sets with |S| < 6|I|, as the path set ∪_{i∈I} P_i is a feasible solution for the constructed MEDP(4) instance with 6|I| paths. Furthermore, we may restrict our attention to path sets S that satisfy the property that, for each i ∈ I, either P_i ⊆ S or P̄_i ⊆ S. Any path set S that does not satisfy this property can be turned into a path set S′ with |S′| ≥ |S| that does, as follows:
Suppose that, for some i, neither P_i ⊆ S nor P̄_i ⊆ S. Let S_i ⊆ S be the set of paths in S that are fully contained in the variable subgraph G_i. As there are only 6 edges adjacent to t in G_i, we have |S_i| ≤ 6. Observe that each 4-bounded (s, t)-path in G is either of the form Q_i^l or it is fully contained in one of the
variable subgraphs G_i. Furthermore, all (s, t)-paths of length exactly 4 in G_i are contained in P_i ∪ P̄_i. The only other 4-bounded paths in G_i are the three paths of length 3, which we denote P̄_i^j″ = (s, ā_i^j, w̄_i^j, t) for the negated literals x̄_i^j and P_i^j″ = (s, a_i^j, w_i^{j+1}, t) for the unnegated literals x_i^j. In terms of edge-disjointness, however, the paths P_i^j″ and P̄_i^j″ conflict with the same 4-bounded (s, t)-paths as the paths P_i^j′ or P̄_i^j′, respectively. Replacing all paths P_i^j″ and P̄_i^j″ in S by the paths P_i^j′ and P̄_i^j′, respectively, thus yields a set of edge-disjoint 4-bounded paths of the same size as S. Hence, we can assume that S_i ⊆ P_i ∪ P̄_i.
Now consider the paths Q_i^l corresponding to the clauses C_l in which variable x_i occurs. Recall that variable x_i occurs exactly 3 times in the clauses, so there are at most 3 paths Q_i^l in S that may share an edge with the paths in P_i ∪ P̄_i. If variable x_i occurs uniformly in all 3 clauses, negated or unnegated, then these three paths Q_i^l are edge-disjoint from either all 6 paths in P_i or from all 6 paths in P̄_i. Replacing the paths in S_i by P_i or P̄_i, respectively, yields an edge-disjoint path set S′ with |S′| ≥ |S|. If variable x_i occurs non-uniformly, then either the paths in P_i or the paths in P̄_i conflict with at most one of the three Q_i^l paths. In this case we have |S_i| ≤ 5, as the only edge-disjoint path sets of size 6 in P_i ∪ P̄_i are P_i and P̄_i themselves. Replacing the at most 5 paths in S_i and the 1 potentially conflicting path Q_i^l (if it is contained in S at all) by either P_i or P̄_i thus yields
To see that x̂(S) satisfies at least r clauses, consider the (s, t)-cut in G formed by the edges adjacent to node t. As S contains either P_i or P̄_i for each i ∈ I, which amounts to a total of 6|I| paths, each of the remaining r paths in S must be of the form Q_i^l for some i ∈ I and l ∈ L. Path Q_i^l, however, can be contained in S only if clause C_l evaluates to true. Otherwise it would intersect with the path P_i^j′ or P̄_i^j′ in S that corresponds to the literal of x_i occurring in clause C_l. Hence, at least r clauses of the given Max-k-Sat(3) instance are satisfied by the truth assignment x̂(S).
It now follows in a straightforward way that MEDP(4) is APX-hard. Suppose there is an algorithm ALG to approximate MEDP(4) within a factor of α > 1 and denote by S the solution computed by this algorithm. Let r(S) be the number of clauses satisfied by the truth assignment x̂(S) and let |S*| and r* be the optimal solution values of MEDP(4) and Max-k-Sat(3), respectively. The fact that at least half of the clauses in any Max-k-Sat(3) instance can be satisfied implies r* ≥ (1/2)·|L| and, further on, r* ≥ (3/(2k))·|I|. Applying the problem transformation and algorithm ALG to a given Max-k-Sat(3) instance, we get

    r(S) ≥ |S| − 6|I| ≥ (1/α)·|S*| − 6|I| ≥ (1/α)·(r* + 6|I|) − 6|I| ≥ ((1 + 4k − 4kα)/α)·r*
Proof. Let S and T denote the sets of neighbors of node s and t in the given graph G, respectively. We may assume w.l.o.g. that each node in G is contained in {s, t} ∪ S ∪ T, for otherwise it may not appear in any 3-bounded (s, t)-path.
We reduce WNDP(3) to the problem of finding a minimum weight matching with cardinality k in an auxiliary graph G′ = (V′, E′), which is constructed as follows: For each node v ∈ S (resp. w ∈ T), there is an associated node u_v ∈ V′ (resp. u′_w ∈ V′). For each node v ∈ S ∩ T, there is an associated edge e_v = (u_v, u′_v) ∈ E′ with weight c_{sv} + c_{vt}. Choosing this edge in the matching corresponds to choosing the path (s, v, t) in G. For each edge (v, w) ∈ (S × T) \ (S ∩ T)², there is an associated edge (ũ_v, ũ_w) ∈ E′, where ũ_z = u′_z if z ∈ T and ũ_z = u_z otherwise, for any z ∈ V. The weight of edge (ũ_v, ũ_w) is set to c_{sv} + c_{vw} + c_{wt}. Choosing (ũ_v, ũ_w) in the matching in G′ corresponds to choosing path (s, v, w, t) in G. For each edge (v, w) ∈ (S ∩ T)², there is an associated edge (u_v, u′_w) ∈ E′ with weight min{c_{sv} + c_{vw} + c_{wt}, c_{sw} + c_{wv} + c_{vt}}, which represents the paths (s, v, w, t) and (s, w, v, t) in G. For each edge (s, t) ∈ E, there is an associated edge (u_t, u′_s) ∈ E′ with weight c_{st}.
Clearly, this construction can be performed in polynomial time. One easily verifies that any set of k vertex-disjoint 3-bounded (s, t)-paths in G corresponds to a matching of size k and the same cost in G′, and vice versa. Since the cardinality-constrained minimum weight matching problem can be solved in polynomial time [28,29], the claim follows.
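In sketch form, the construction reads as follows (our Python; the labels ('S', v) and ('T', w) play the roles of u_v and u′_w, we assume the cost dictionary c contains both orientations of every edge, and the cardinality-constrained matching solver is left abstract):

    def build_auxiliary_graph(c, S, T, E, s, t):
        aux = {}
        def put(edge, w):                            # keep the cheaper of parallel edges
            aux[edge] = min(w, aux.get(edge, float('inf')))
        for v in S & T:                              # path (s, v, t)
            put((('S', v), ('T', v)), c[(s, v)] + c[(v, t)])
        for (v, w) in E:
            for (x, y) in ((v, w), (w, v)):
                if x in S and y in T and x != y and x not in (s, t) and y not in (s, t):
                    # both orientations of an (S ∩ T)^2 edge land on one aux edge
                    key = (min(x, y), max(x, y)) if (x in T and y in S) else (x, y)
                    put((('S', key[0]), ('T', key[1])),
                        c[(s, x)] + c[(x, y)] + c[(y, t)])   # path (s, x, y, t)
        if (s, t) in E:                              # the direct path (s, t)
            put((('S', t), ('T', s)), c[(s, t)])
        return aux   # min-weight matching of cardinality k in aux <-> k paths in G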
For ℓ = 4, the problem becomes at least APX-hard in the general case.
Theorem 7. WNDP(4) is (at least) APX-hard.
Proof. We use a construction similar to the one presented in the previous section to reduce Max-k-Sat(3) to WNDP(4). Again, we let x_i, i ∈ I, be the boolean variables and C_l, l ∈ L, be the clauses of the given Max-k-Sat(3) instance and we denote the three occurrences of variable x_i by x_i^j, j ∈ J := {1, 2, 3}.
For each l ∈ L, we construct a clause graph Hl = (Wl , Fl ) exactly as in the
proof of Theorem 3 and shown in Figure 3. For each i ∈ I, we construct a variable
graph Gi = (Vi , Ei ) as
where r_i^4 = r_i^1. Figure 5 illustrates these graphs. The graph G is obtained as the union of all G_i and H_l (sub-)graphs. Finally, we assign weight 1 to all edges su_i^j and sū_i^j and weight 0 to all other edges in G. The goal in the constructed WNDP(4) instance is to find a minimum cost subgraph of G that contains (at least) 6|I| + |L| node-disjoint 4-bounded (s, t)-paths.
For each i ∈ I and j ∈ J, we denote the paths

    P_i^j = (s, u_i^j, v_i^j, w_i^j, t),    P_i^j′ = (s, ā_i^j, ū_i^j, r_i^{j+1}, t),    P_i^j″ = (s, ū_i^j, r_i^{j+1}, t),
    P̄_i^j = (s, ū_i^j, v_i^j, w_i^j, t),    P̄_i^j′ = (s, a_i^j, u_i^j, r_i^j, t),       P̄_i^j″ = (s, u_i^j, r_i^j, t).

Note that these are the only 4-bounded (s, t)-paths in G. Furthermore, we let P_i := {P_i^j, P_i^j′ | j ∈ J}, P̄_i := {P̄_i^j, P̄_i^j′ | j ∈ J}, and Q_l := {Q_i^l | i ∈ I : x_i occurs in C_l}. Figure 5 illustrates the paths in P̄_i and the path Q_i^l.
As in the proof of Theorem 3, one finds that a truth assignment x̂ that satisfies r clauses of the given Max-k-Sat(3) instance corresponds to a path set

    S = S(x̂) := ∪_{i∈I : x̂_i=true} P_i ∪ ∪_{i∈I : x̂_i=false} P̄_i ∪ {Q_{i_l(x̂)}^l | l ∈ L : C_l(x̂) = true}

with |S| = 6|I| + r and cost c(S) = 3|I|. In order to obtain a set of 6|I| + |L| paths, we modify S as follows: For each l ∈ L with C_l(x̂) = false, we arbitrarily choose one i such that x_i^j or x̄_i^j occurs in C_l, add the path Q_i^l to S, and replace the
Fig. 5. Union of G_i and H_l for variable x_i and clause C_l = (x̄_i^3 ∨ . . . ). Thick lines are paths in P̄_i and path Q_i^l.
path P_i^j′ or P̄_i^j′ in S with P_i^j″ or P̄_i^j″, respectively. This modification maintains the node-disjointness of the paths in S and increases both the size and the cost of S by |L| − r. The cost of the resulting path set S thus is

    c(S) = 3|I| + |L| − r.   (1)
Conversely, one finds that any set S of 6|I| + |L| node-disjoint 4-bounded (s, t)-paths must contain one path from each set Q_l and 6 paths within each variable subgraph G_i. The only way to have 6 node-disjoint 4-bounded paths within G_i, however, is to have either all 3 paths P_i^j or all 3 paths P̄_i^j, complemented with 3 paths of the type P_i^j′ and P_i^j″ or with 3 paths of the type P̄_i^j′ and P̄_i^j″, respectively. The cost of such a path set is equal to the number of P_i^j and P̄_i^j paths it contains, which amounts to a total of 3|I|, plus the number of P_i^j″ and P̄_i^j″ paths. Note that the paths P_i^j″ and P̄_i^j″ contain only a subset of the nodes in P_i^j′ and P̄_i^j′, respectively, and that the cost induced by P_i^j″ and P̄_i^j″ is 1, while the cost induced by P_i^j′ and P̄_i^j′ is 0. Thus, we may assume that S contains path P_i^j″ or P̄_i^j″ only if it contains path Q_i^l for the clause l in which literal x_i^j occurs.
Let x̂(S) be the truth assignment defined as

    x̂_i(S) := true if P_i^1 ∈ S, and false otherwise,   for all i ∈ I.
Consider a path Q_i^l ∈ S and suppose C_l(x̂(S)) = false. Then also the literal x_i^j or x̄_i^j occurring in C_l evaluates to false. Since S contains either P_i^j or P̄_i^j, it also must contain P_i^j″ or P̄_i^j″, respectively. As these paths induce a cost of one, the number of clauses satisfied by x̂(S) is

    r(x̂(S)) ≥ |L| − (c(S) − 3|I|).   (2)
As in the proof of Theorem 3, it follows straightforwardly from (1) and (2) that approximation ratios are transformed linearly by the presented reduction and, hence, WNDP(4) is APX-hard.
Unfortunately, it remains open whether WNDP(4) is approximable within a constant factor or not. The best known approximation ratio for WNDP(4) is O(k), which is achieved by a simple greedy algorithm.
Theorem 8. WNDP(4) can be approximated within a factor of 4k.
Proof. Consider the algorithm which adds the edges in order of non-decreasing cost until the constructed subgraph contains k node-disjoint 4-bounded (s, t)-paths, and then returns the subgraph defined by these paths. As, in each iteration, we can check in polynomial time whether such paths exist or not [18], this algorithm runs in polynomial time. Furthermore, the optimal solution must contain at least one edge whose cost is at least as big as the cost of the last edge added by the greedy algorithm. Therefore, the total cost of the greedy solution is at most 4k times the optimal solution's cost.
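In sketch form (our Python; contains_k_paths stands for the polynomial-time feasibility check from [18], treated as a black box that returns k node-disjoint 4-bounded (s, t)-paths or None):

    def greedy_wndp4(edges, cost, k, contains_k_paths):
        chosen = []
        for e in sorted(edges, key=lambda e: cost[e]):   # non-decreasing cost
            chosen.append(e)
            paths = contains_k_paths(chosen, k)
            if paths is not None:
                return [f for p in paths for f in p]     # edge set of the k paths
        return None                                      # instance infeasible

Each returned path has at most 4 edges, each of cost at most that of the last edge added, which in turn lower-bounds the optimum; this is exactly the 4k bound of the theorem.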
4 Conclusion
In this paper we show that the maximum edge-disjoint 4-bounded paths problem MEDP(4) is APX-complete and that the corresponding weighted edge-disjoint paths problem WEDP(4) is NPO-complete. The weighted node-disjoint ℓ-bounded paths problem was proven to be polynomially solvable for ℓ = 3 and to be at least APX-hard for ℓ = 4. This closes all basic complexity issues that were left open in [18,19]. In addition, we presented a 2-approximation algorithm for WEDP(4) and a 4k-approximation algorithm for WNDP(4). It remains open whether WNDP(4) is approximable within a factor better than O(k) or if there is a stronger, non-constant approximation threshold.
The hardness results and algorithms presented in this paper also hold for directed graphs and for graphs containing parallel edges.
References
1. Alevras, D., Grötschel, M., Wessäly, R.: Capacity and survivability models
for telecommunication networks. ZIB Technical Report SC-97-24, Konrad-Zuse-
Zentrum für Informationstechnik Berlin (1997)
2. Gouveia, L., Patricio, P., Sousa, A.D.: Compact models for hop-constrained node
survivable network design: An application to MPLS. In: Anandaligam, G., Ragha-
van, S. (eds.) Telecommunications Planning: Innovations in Pricing, Network De-
sign and Management. Springer, Heidelberg (2005)
3. Gouveia, L., Patricio, P., Sousa, A.D.: Hop-constrained node survivable network
design: An application to MPLS over WDM. Networks and Spatial Economics 8(1)
(2008)
4. Grötschel, M., Monma, C., Stoer, M.: Design of Survivable Networks. In: Hand-
books in Operations Research and Management Science, Volume Networks, pp.
617–672. Elsevier, Amsterdam (1993)
5. Chakraborty, T., Chuzhoy, J., Khanna, S.: Network design for vertex connectivity.
In: Proceedings of the 40th Annual ACM Symposium on the Theory of Computing
(STOC ’08), pp. 167–176 (2008)
6. Chekuri, C., Khanna, S., Shepherd, F.: An O(√n) approximation and integrality
gap for disjoint paths and unsplittable flow. Theory of Computing 2, 137–146
(2006)
7. Lando, Y., Nutov, Z.: Inapproximability of survivable networks. Theoretical Com-
puter Science 410, 2122–2125 (2009)
8. Menger, K.: Zur allgemeinen Kurventheorie. Fund. Mathematicae 10, 96–115
(1927)
9. Lovász, L., Neumann-Lara, V., Plummer, M.: Mengerian theorems for paths of
bounded length. Periodica Mathematica Hungarica 9(4), 269–276 (1978)
10. Exoo, G.: On line disjoint paths of bounded length. Discrete Mathematics 44,
317–318 (1983)
11. Niepel, L., Safarikova, D.: On a generalization of Menger’s theorem. Acta Mathe-
matica Universitatis Comenianae 42, 275–284 (1983)
12. Ben-Ameur, W.: Constrained length connectivity and survivable networks. Net-
works 36, 17–33 (2000)
218 A. Bley and J. Neto
13. Pyber, L., Tuza, Z.: Menger-type theorems with restrictions on path lengths. Dis-
crete Mathematics 120, 161–174 (1993)
14. Baier, G.: Flows with path restrictions. PhD thesis, Technische Universität Berlin
(2003)
15. Martens, M., Skutella, M.: Length-bounded and dynamic k-splittable flows. In:
Operations Research Proceedings 2005, pp. 297–302 (2006)
16. Mahjoub, A., McCormick, T.: Max flow and min cut with bounded-length paths:
Complexity, algorithms and approximation. Mathematical Programming (to ap-
pear)
17. Baier, G., Erlebach, T., Hall, A., Köhler, E., Kolman, P., Panagrác, O., Schilling,
H., Skutella, M.: Length-bounded cuts and flows. ACM Transactions on Algorithms
(to appear)
18. Itai, A., Perl, Y., Shiloach, Y.: The complexity of finding maximum disjoint paths
with length constraints. Networks 12, 277–286 (1982)
19. Bley, A.: On the complexity of vertex-disjoint length-restricted path problems.
Computational Complexity 12, 131–149 (2003)
20. Perl, Y., Ronen, D.: Heuristics for finding a maximum number of disjoint bounded
paths. Networks 14 (1984)
21. Wagner, D., Weihe, K.: A linear-time algorithm for edge-disjoint paths in planar
graphs. Combinatorica 15(1), 135–150 (1995)
22. Botton, Q., Fortz, B., Gouveia, L.: The k-edge 3-hop constrained network design
polyhedron. In: Proceedings of the 9th INFORMS Telecommunications Conference,
Available as preprint at Université Catholique de Louvain: Le polyedre du probleme
de conception de reseaux robustes K-arête connexe avec 3 sauts (2008)
23. Dahl, G., Huygens, D., Mahjoub, A., Pesneau, P.: On the k edge-disjoint 2-hop-
constrained paths polytope. Operations Research Letters 34, 577–582 (2006)
24. Huygens, D., Mahjoub, A.: Integer programming formulations for the two 4-hop
constrained paths problem. Networks 49(2), 135–144 (2007)
25. Golovach, P., Thilikos, D.: Paths of bounded length and their cuts: Parameterized
complexity and algorithms. In: Chen, J., Fomin, F. (eds.) IWPEC 2009. LNCS,
vol. 5917, pp. 210–221. Springer, Heidelberg (2009)
26. Guruswami, V., Khanna, S., Shepherd, B., Rajaraman, R., Yannakakis, M.: Near-
optimal hardness results and approximation algorithms for edge-disjoint paths and
related problems. In: Proceedings of the 31st Annual ACM Symposium on the
Theory of Computing (STOC ’99), pp. 19–28 (1999)
27. Ausiello, G., Crescenzi, P., Gambosi, G., Kann, V., Marchetti-Spaccamela, A., Pro-
tasi, M.: Complexity and Approximation: Combinatorial Optimization Problems
and their Approximability Properties. Springer, Heidelberg (1999)
28. Edmonds, J.: Maximum matching and a polyhedron with 0-1 vertices. Journal of
Research of the National Bureau of Standards 69B, 125–130 (1965)
29. Schrijver, A.: Combinatorial Optimization: Polyhedra and Efficiency. Springer, Hei-
delberg (2003)
A Polynomial-Time Algorithm for Optimizing
over N -Fold 4-Block Decomposable Integer
Programs
1 Introduction
Let A ∈ Z^{d×n} be a matrix. We associate with A a finite set G(A) of vectors with remarkable properties. Consider the set ker(A) ∩ Zⁿ. Then we put into G(A) all nonzero vectors v ∈ ker(A) ∩ Zⁿ that cannot be written as a sum v = v′ + v″ of nonzero vectors v′, v″ ∈ ker(A) ∩ Zⁿ that lie in the same orthant (or equivalently, have the same sign pattern in {≥ 0, ≤ 0}ⁿ) as v. The set G(A) has been named the Graver basis of A, since Graver [6] introduced this set G(A) in 1975 and showed that it constitutes an optimality certificate for a whole family of integer linear programs that share the same problem matrix A. By this we mean that G(A) provides an augmenting vector/step to any non-optimal feasible solution and hence allows the design of a simple augmentation algorithm to solve the integer linear program.
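As an illustration of such an augmentation scheme, here is a small sketch (ours, in Python, for min{c·x : Ax = b, l ≤ x ≤ u, x integer}; it assumes the Graver basis is given as a list of integer vectors and applies improving steps with maximal step length until none remains):

    def graver_augment(c, x, graver, lo, hi):
        def max_step(x, g):                       # largest feasible step length
            lam = None
            for xi, gi, lo_i, hi_i in zip(x, g, lo, hi):
                if gi > 0:
                    cand = (hi_i - xi) // gi
                elif gi < 0:
                    cand = (xi - lo_i) // (-gi)
                else:
                    continue
                lam = cand if lam is None else min(lam, cand)
            return 0 if lam is None else lam
        improved = True
        while improved:
            improved = False
            for g in list(graver) + [tuple(-gi for gi in g) for g in graver]:
                if sum(ci * gi for ci, gi in zip(c, g)) < 0:   # improving direction
                    lam = max_step(x, g)
                    if lam >= 1:                  # x + lam*g stays feasible; g in ker(A)
                        x = [xi + lam * gi for xi, gi in zip(x, g)]
                        improved = True
        return x

Since a feasible solution admits no improving Graver step only if it is optimal, the loop stops at an optimum; the results discussed below concern making the number of such steps polynomial.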
In the last 10 years, a tremendous theoretical progress has been made in the
theory of Graver bases. It has been shown that G(A) constitutes an optimal-
ity certificate for a much wider class of integer minimization problems, namely
Theorem 2. If A ∈ Z^{d_A×n_A}, B ∈ Z^{d_A×n_B}, C ∈ Z^{d_C×n_B}, and D ∈ Z^{d_C×n_A} are fixed matrices, then max{ ||v||₁ : v ∈ G((C D; B A)^{(N)}) } is bounded by a polynomial in N, where (C D; B A) denotes the 4-block matrix with first row (C D) and second row (B A).
It should be noted that in the two special cases of N -fold IPs and of two-stage
stochastic IPs each component of any Graver basis element is bounded by a
constant (depending only on the fixed problem matrices and not on N ). Hence,
Theorem 2 specialized to these two cases is essentially trivial. In the general
N -fold 4-block situation, however, each component of any Graver basis element is
bounded only by a polynomial in N . This fact demonstrates that N -fold 4-block
IPs are much richer and more difficult to solve than the two special cases of
N -fold IPs and of two-stage stochastic IPs. Moreover, a proof of Theorem 2 in
this general setting is not obvious.
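For orientation, we recall the shape of the N-fold 4-block matrix, whose formal definition lies outside this excerpt (this is our rendering of the construction, with C and D repeated along the top row, B repeated down the first column, and A along the diagonal):

    ( C D ; B A )^{(N)} =
        ( C  D  D  · · ·  D )
        ( B  A  0  · · ·  0 )
        ( B  0  A  · · ·  0 )
        ( ⋮  ⋮        ⋱   ⋮ )
        ( B  0  0  · · ·  A )

Roughly speaking, the N-fold IP case corresponds to the blocks B and C being void, and the two-stage stochastic IP case to the blocks C and D being void.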
In the next section, we present two sample applications of Theorem 1: stochas-
tic integer programming with second-order dominance constraints [5,11] and
stochastic integer multi-commodity flows [12,16]. For both cases we will develop
tractability results based on our general theory. We do, however, not claim that
these algorithms are particularly useful in practice. While the first application
has an N -fold 4-block matrix as problem matrix, the second application can be
2 Sample Applications
In this section we present two N -fold 4-block decomposable integer programming
problems that are polynomial-time solvable for given fixed blocks and variable
N by Theorem 1 and its Corollary 1.
Theorem 3. For a given fixed network, the two-stage stochastic integer linear
multi-commodity flow problem is solvable in polynomial time in the number M
of commodities, in the number N of scenarios, and in the encoding lengths of
the input data.
Proof. The only issue that prevents us from applying Corollary 1 directly is the
fact that M and N are different. But by introducing additional commodities or
scenarios, we can easily obtain an equivalent (bigger) problem with M = N to
which we can apply Corollary 1. If M < N, we introduce additional commodities
with zero flow, and if M > N, we take one scenario, copy it an additional M − N
times, and assign each of these M − N + 1 identical scenarios 1/(M − N + 1)
times the original cost vector. So, in total, these M − N + 1 scenarios are equiv-
alent to the one we started from.
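The padding argument in this proof is easy to make mechanical. The sketch below illustrates it under an assumed data layout (lists of per-commodity demands, per-scenario data, and per-scenario cost vectors); none of these names come from the paper.

```python
def equalize_commodities_and_scenarios(demands, scenarios, costs):
    """Sketch of the padding step in the proof of Theorem 3.  The data
    layout (lists of per-commodity demands, per-scenario data and cost
    vectors) is illustrative, not taken from the paper."""
    M, N = len(demands), len(scenarios)
    if M < N:
        demands = demands + [0] * (N - M)     # zero-flow dummy commodities
    elif M > N:
        k = M - N + 1                         # replicate one scenario k times
        scenarios = [scenarios[0]] * k + scenarios[1:]
        # each copy carries 1/k of the original cost vector, so the k
        # identical scenarios are together equivalent to the original one
        costs = [[c / k for c in costs[0]]] * k + costs[1:]
    return demands, scenarios, costs
```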
It should be noted that we can extend the problem and still get the same poly-
nomiality result. For example, we may assume that we are allowed to change
the routing of the M commodities in the second-stage decision. Penalties could
be enforced for the amount of change of the first-stage decision or only for the
amount of additional flow on edges compared to the first-stage decision. Writ-
ing down the constraints and introducing suitable additional variables with zero
lower and upper bounds, one obtains again a problem matrix that allows the
application of Corollary 1.
We assume now that all variables are integral and, for simplicity of exposition,
we assume that the inequalities of the polyhedron X are incorporated into the
constraints T x + W ylk = zl . Moreover, we assume that all scenarios have the
same probability, that is, πl = 1/L, l = 1, . . . , L.
Theorem 4. For given fixed matrices T and W and for a fixed number K, problem
(SIP) is solvable in polynomial time in the number L of (data) scenarios and in
the encoding lengths of the input data.
Proof. We transform the problem in such a way that Theorem 1 can be applied.
First, we include the constraints c⊤x + q⊤y_{lk} − a_k ≤ v_{lk} into the constraint
Tx + Wy_{lk} = z_l (by adding slack variables to get an equation). Then, we set
$$\bar T = \begin{pmatrix} T\\ \vdots\\ T \end{pmatrix} \qquad\text{and}\qquad \bar W = \begin{pmatrix} W & & \\ & \ddots & \\ & & W \end{pmatrix},$$
in which we use K copies of T and W, respectively. As T, W, and K are assumed
to be fixed, so are T̄ and W̄. With this, the problem matrix now becomes
$$\begin{pmatrix} 0 & I & \cdots & I & I\\ \bar T & \bar W & & & 0\\ \vdots & & \ddots & & \vdots\\ \bar T & & & \bar W & 0 \end{pmatrix}.$$
Introducing suitable additional variables with zero lower and upper bounds, we
obtain a problem matrix of the form $\left(\begin{smallmatrix} C & D\\ B & A \end{smallmatrix}\right)^{(L)}$ with A = (W̄ 0), B = T̄, C = 0,
and D = (I I). Thus, we can apply Theorem 1 and the result follows.
we have
v = (G(A)μ) + (G(A)ν)
and thus, the first claim is proved. The second claim is a trivial consequence of
the first.
Let us now extend this corollary to a form that we need to prove Theorem 1.
Corollary 3. Let A ∈ Z^{d×n} and let B ∈ Z^{m×n}. Let the entries of B be bounded
by M in absolute value. Moreover, put $C := \binom{A}{B}$. Then we have
$$\max\{\|v\|_1 : v \in \mathcal{G}(C)\} \le (2nM)^{2^m - 1}\,\bigl(\max\{\|v\|_1 : v \in \mathcal{G}(A)\}\bigr)^{2^m}.$$
Proof. This claim follows by simple induction, adding one row of B at a time,
and by using the second inequality of Corollary 2 to bound the sizes of the
intermediate Graver bases in comparison to the Graver basis of the matrix with
one row of B fewer.
In order to prove Theorem 1, let us consider the submatrix $\left(\begin{smallmatrix} 0 & 0\\ B & A \end{smallmatrix}\right)^{(N)}$. A main
result from [9] is the following.
Lemma 3. Let A ∈ Z^{d_A×n_A} and B ∈ Z^{d_A×n_B}. There exists a number g ∈ Z_+
depending only on A and B but not on N such that for every N ∈ Z_+ and
for every $v \in \mathcal{G}\left(\left(\begin{smallmatrix} 0 & 0\\ B & A \end{smallmatrix}\right)^{(N)}\right)$, the components of v are bounded by g in absolute
value. In particular, $\|v\|_1 \le (n_B + N n_A)\,g$ for all $v \in \mathcal{G}\left(\left(\begin{smallmatrix} 0 & 0\\ B & A \end{smallmatrix}\right)^{(N)}\right)$.
Combining this result with Corollary 3, we get a bound for the 1-norms of the
Graver basis elements of $\left(\begin{smallmatrix} C & D\\ B & A \end{smallmatrix}\right)^{(N)}$. Note that the second claim of the following
corollary is exactly Theorem 2.
Corollary 4. Let A ∈ Z^{d_A×n_A}, B ∈ Z^{d_A×n_B}, C ∈ Z^{d_C×n_B}, D ∈ Z^{d_C×n_A} be
given matrices. Moreover, let M be a bound on the absolute values of the entries
in C and D, and let g ∈ Z_+ be the number from Lemma 3. Then for any N ∈ Z_+
we have
$$\max\left\{\|v\|_1 : v \in \mathcal{G}\left(\left(\begin{smallmatrix} C & D\\ B & A \end{smallmatrix}\right)^{(N)}\right)\right\} \le \bigl(2(n_B + N n_A)M\bigr)^{2^{d_C}-1} \left(\max\left\{\|v\|_1 : v \in \mathcal{G}\left(\left(\begin{smallmatrix} 0 & 0\\ B & A \end{smallmatrix}\right)^{(N)}\right)\right\}\right)^{2^{d_C}}$$
$$\le \bigl(2(n_B + N n_A)M\bigr)^{2^{d_C}-1} \bigl((n_B + N n_A)\,g\bigr)^{2^{d_C}}.$$
Proof. While the first claim is a direct consequence of Lemma 3 and Corollary 3,
the polynomial bound for fixed matrices A, B, C, D and varying N follows
immediately by observing that n_A, n_B, d_C, M, g are constants, as they depend
only on the fixed matrices A, B, C, D.
Now we are ready to prove our main theorem.
Proof of Theorem 1. Let N ∈ Z_+, l, u ∈ Z^{n_B + N n_A}, b ∈ Z^{d_C + N d_A}, and a
separable convex function f : R^{n_B + N n_A} → R be given. To prove claim (a),
observe that one can turn any integer solution to $\left(\begin{smallmatrix} C & D\\ B & A \end{smallmatrix}\right)^{(N)} z = b$ (which can
be found in polynomial time using, for example, the Hermite normal form of
$\left(\begin{smallmatrix} C & D\\ B & A \end{smallmatrix}\right)^{(N)}$) into a feasible solution (one that in addition fulfills l ≤ z ≤ u) by a
sequence of linear integer programs (with the same problem matrix $\left(\begin{smallmatrix} C & D\\ B & A \end{smallmatrix}\right)^{(N)}$)
that "move" the components of z in the direction of the given bounds, see [7].
This step is similar to phase I of the simplex method in linear programming.
In order to solve these linear integer programs, it suffices (by the result of [18])
to find Graver basis augmentation vectors from $\mathcal{G}\left(\left(\begin{smallmatrix} C & D\\ B & A \end{smallmatrix}\right)^{(N)}\right)$ for a directed
augmentation oracle. So, claim (b) will imply both claim (a) and claim (c).
Let us now assume that we are given a feasible solution z_0 = (x, y_1, ..., y_N)
and that we wish to decide whether there exists another feasible solution z_1
with f(z_1) < f(z_0). By the main result in [13], it suffices to decide whether
there exists some vector v = (x̂, ŷ_1, ..., ŷ_N) in the Graver basis of $\left(\begin{smallmatrix} C & D\\ B & A \end{smallmatrix}\right)^{(N)}$
such that z_0 + v is feasible and f(z_0 + v) < f(z_0). By Corollary 4 and by the
fact that n_B is constant, there is only a polynomial number of candidates for
the x̂-part of v. For each such candidate x̂, we can find a best possible choice
for ŷ_1, ..., ŷ_N by solving the following separable convex N-fold IP:
$$\min\left\{ f\!\begin{pmatrix} x+\hat x\\ y_1+\hat y_1\\ \vdots\\ y_N+\hat y_N \end{pmatrix} \;:\; \left(\begin{smallmatrix} C & D\\ B & A \end{smallmatrix}\right)^{(N)} \begin{pmatrix} x+\hat x\\ y_1+\hat y_1\\ \vdots\\ y_N+\hat y_N \end{pmatrix} = b,\;\; l \le \begin{pmatrix} x+\hat x\\ y_1+\hat y_1\\ \vdots\\ y_N+\hat y_N \end{pmatrix} \le u,\;\; \hat y_1,\dots,\hat y_N \in \mathbb{Z}^{n_A} \right\}$$
for given z_0 = (x, y_1, ..., y_N) and x̂. Observe that the problem (IP)_{N,b,l,u,f} does
indeed simplify to a separable convex N-fold IP with problem matrix $\left(\begin{smallmatrix} 0 & D\\ 0 & A \end{smallmatrix}\right)^{(N)}$,
because z_0 = (x, y_1, ..., y_N) and x̂ are fixed. For fixed matrices A and D,
however, each such N-fold IP is solvable in polynomial time [8]. If the N-fold IP
is feasible and if for the resulting optimal vector v := (x̂, ŷ_1, ..., ŷ_N) we have
f(z_0 + v) ≥ f(z_0), then no augmenting vector can be constructed using this
particular choice of x̂. If on the other hand we have f(z_0 + v) < f(z_0), then v
is a desired augmenting vector for z_0 and we can stop. As we solve polynomially
many polynomially solvable N-fold IPs, claim (b) and thus also claims (a) and
(c) follow.
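Structurally, the overall method is a plain augmentation loop. The following sketch shows that structure under two assumptions not spelled out in the paper: a hypothetical oracle best_nfold_step that solves the separable convex N-fold IP displayed above, and a list xhat_candidates enumerating the polynomially many possible x̂-parts.

```python
def augmentation_scheme(z0, f, xhat_candidates, best_nfold_step):
    """Sketch of the augmentation scheme from the proof of Theorem 1.
    `xhat_candidates` is a list of the polynomially many possible
    x-hat parts of Graver basis elements (their number is bounded via
    Corollary 4); `best_nfold_step(z, xhat)` is an assumed oracle that
    solves the separable convex N-fold IP above and returns the best
    augmenting vector v = (xhat, yhat_1, ..., yhat_N), or None if the
    subproblem is infeasible."""
    z = tuple(z0)
    while True:
        for xhat in xhat_candidates:
            v = best_nfold_step(z, xhat)
            if v is not None:
                znew = tuple(zi + vi for zi, vi in zip(z, v))
                if f(znew) < f(z):
                    z = znew
                    break          # restart the candidate scan from the new z
        else:
            return z               # no candidate improves z: z is optimal
```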
References
1. Aschenbrenner, M., Hemmecke, R.: Finiteness theorems in stochastic integer pro-
gramming. Foundations of Computational Mathematics 7, 183–227 (2007)
2. De Loera, J.A., Hemmecke, R., Onn, S., Rothblum, U., Weismantel, R.: Convex
integer maximization via Graver bases. Journal of Pure and Applied Algebra 213,
1569–1577 (2009)
3. De Loera, J.A., Hemmecke, R., Onn, S., Weismantel, R.: N-fold integer program-
ming. Discrete Optimization 5, 231–241 (2008)
4. Diaconis, P., Graham, R., Sturmfels, B.: Primitive partition identities. In: Miklós,
D., Sós, V.T., Szőnyi, T. (eds.) Combinatorics, Paul Erdős is Eighty, pp. 173–192.
Janos Bolyai Mathematical Society, Budapest (1996)
5. Gollmer, R., Gotzes, U., Schultz, R.: A note on second-order stochastic dominance
constraints induced by mixed-integer linear recourse. Mathematical Programming
(to appear, 2010), doi:10.1007/s10107-009-0270-0
6. Graver, J.E.: On the foundation of linear and integer programming I. Mathematical
Programming 9, 207–226 (1975)
7. Hemmecke, R.: On the positive sum property and the computation of Graver test
sets. Mathematical Programming 96, 247–269 (2003)
8. Hemmecke, R., Onn, S., Weismantel, R.: A polynomial oracle-time algorithm for
convex integer minimization. Mathematical Programming, Series A (to appear,
2010), doi:10.1007/s10107-009-0276-7
9. Hemmecke, R., Schultz, R.: Decomposition of test sets in stochastic integer pro-
gramming. Mathematical Programming 94, 323–341 (2003)
10. Hoşten, S., Sullivant, S.: Finiteness theorems for Markov bases of hierarchical mod-
els. Journal of Combinatorial Theory, Series A 114(2), 311–321 (2007)
11. Luedtke, J.: New formulations for optimization under stochastic dominance con-
straints. SIAM J. Optim. 19, 1433–1450 (2008)
12. Mirchandani, P.B., Soroush, H.: The stochastic multicommodity flow problem. Net-
works 20, 121–155 (1990)
13. Murota, K., Saito, H., Weismantel, R.: Optimality criterion for a class of nonlinear
integer programs. Operations Research Letters 32, 468–472 (2004)
14. Onn, S.: Theory and Applications of N -fold Integer Programming. In: IMA Volume
on Mixed Integer Nonlinear Programming. Frontier Series. Springer, Heidelberg (in
preparation 2010)
15. Onn, S., Rothblum, U.: Convex combinatorial optimization. Discrete Computa-
tional Geometry 32, 549–566 (2004)
16. Powell, W.B., Topaloglu, H.: Dynamic-Programming Approximations for Stochas-
tic Time-Staged Integer Multicommodity-Flow Problems. INFORMS Journal on
Computing 18, 31–42 (2006)
17. Santos, F., Sturmfels, B.: Higher Lawrence configurations. Journal of Combinato-
rial Theory, Series A 103, 151–164 (2003)
18. Schulz, A.S., Weismantel, R.: A polynomial time augmentation algorithm for in-
teger programming. In: Proc. of the 10th ACM-SIAM Symposium on Discrete
Algorithms, Baltimore (1999)
19. Schrijver, A.: Theory of linear and integer programming. Wiley, Chichester (1986)
20. Seymour, P.D.: Decomposition of regular matroids. Journal of Combinatorial The-
ory, Series B 28, 305–359 (1980)
Universal Sequencing on a Single Machine
1 Introduction
Traditional scheduling problems normally assume that jobs run on an ideal machine that
provides a constant performance throughout time. While in some settings this is a good
enough approximation of real life machine behavior, in other situations this assump-
tion is decidedly unreasonable. Our machine, for example, can be a server shared by
multiple users; if other users suddenly increase their workload, this can cause a general
slowdown; or even worse, the machine may become unavailable for a given user due
to priority issues. In other cases, our machine may be a production unit that can break
down altogether and remain offline for some time until it is repaired. In these cases, it
is crucial to have schedules that take such unreliable machine behavior into account.
Different machine behaviors will typically lead to widely different optimal sched-
ules. This creates a burden on the scheduler who would have to periodically recompute
the schedule from scratch. In some situations, recomputing the schedule may not even
be feasible: when submitting a set of jobs to a server, a user can choose the order in
which it presents these jobs, but cannot alter this ordering later on. Therefore, it is de-
sirable in general to have a fixed master schedule that will perform well regardless of
the actual machine behavior. In other words, we want a universal schedule that, for any
given machine behavior, has cost close to that of an optimal clairvoyant algorithm.
In this paper we initiate the study of universal scheduling by considering the problem
of sequencing jobs on a single machine to minimize average completion times. Our
main result is an algorithm for computing a universal schedule that is always a constant
factor away from an optimal clairvoyant algorithm. We complement this by showing
that our upper bound is best possible among universal schedules. We also consider
the case when jobs have release dates. Here we provide an almost logarithmic lower
bound on the performance of universal schedules, thus showing a drastic difference
with respect to the setting without release dates. Finally, we design an algorithm with
constant performance for the interesting case of scheduling jobs with release dates and
proportional weights. Our hope is that these results stimulate the study of universal
solutions for other scheduling problems, and, more broadly, the study of more realistic
scheduling models. In the rest of this section we introduce our model formally, discuss
related work, and explain our contributions in detail.
The model. We are given a job set J with processing times pj ∈ Q+ and weights
wj ∈ Q+ for each job j ∈ J. Using a standard scaling argument, we can assume
w.l.o.g. that wj ≥ 1 for j ∈ J. The problem is to find a sequence π of jobs to be
scheduled on a single machine that minimizes the total sum of weighted completion
times. The jobs are processed in the prescribed order π no matter how the machine may
change its processing speed or whether it becomes unavailable. In case of a machine
breakdown, the currently running job is preempted, and its processing is resumed once
the machine becomes available again. We analyze the worst
case performance by comparing the solution value provided by an algorithm with that
of an optimal clairvoyant algorithm that knows the machine behavior in advance, and
that is even allowed to preempt jobs at any time.
We also consider the more general problem in which each job j ∈ J has its individ-
ual release date rj ≥ 0, which is the earliest point in time when it can start processing.
In this model, it is necessary to allow job preemption, otherwise no constant perfor-
mance guarantee is possible as simple examples show. We allow preemption in the
actual scheduling procedure, however, as in the case without release dates, we aim for
non-adaptive universal solutions. That is, a schedule will be specified by a total ordering
of the jobs. At any point in time we work on the first job in this ordering that has not
finished yet and that has already been released. This procedure is called preemptive list
scheduling [9, 28]. Note that a newly released job will preempt the job that is currently
running if it comes earlier than the current job in the ordering.
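For concreteness, here is a minimal simulation of preemptive list scheduling on an ideal unit-speed machine; the function names and the data layout are illustrative, not from the paper.

```python
def preemptive_list_schedule(order, p, r):
    """Simulate preemptive list scheduling on an ideal unit-speed
    machine: at any time, run the first job in `order` that is released
    and unfinished.  `p` and `r` map job -> processing time / release
    date.  Returns the completion time of every job.  (A sketch for an
    ideal machine; breakdowns would enter as intervals of speed zero.)"""
    remaining = dict(p)
    completion = {}
    t = 0.0
    while len(completion) < len(p):
        available = [j for j in order if j not in completion and r[j] <= t]
        if not available:
            t = min(r[j] for j in order if j not in completion)  # idle until a release
            continue
        j = available[0]                      # highest-priority available job
        higher = order[:order.index(j)]       # jobs that would preempt j
        next_release = min((r[k] for k in higher
                            if k not in completion and r[k] > t),
                           default=float("inf"))
        run = min(remaining[j], next_release - t)
        t += run
        remaining[j] -= run
        if remaining[j] <= 1e-9:
            completion[j] = t
    return completion

# A newly released higher-priority job preempts the running one:
# job 1 (first in the order) is released at time 1 and interrupts job 0.
print(preemptive_list_schedule(order=[1, 0],
                               p={0: 2.0, 1: 3.0},
                               r={0: 0.0, 1: 1.0}))   # {1: 4.0, 0: 5.0}
```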
Related work. The concept of universal solutions, which perform well for every single
input from a superset of possible inputs, has been used for decades in different
contexts, e.g., in hashing [4] and routing [31]. The latter is also known as oblivious
routing and has been studied extensively; see [26] for a state-of-the-art overview. Jia et
al. [12] considered universal approximations for TSP, Steiner Tree, and Set Cover prob-
lems. All this research falls broadly into the field of robust optimization [3]. The term
robust is not used consistently in the literature. In particular, the term robust scheduling
refers mainly to robustness against uncertain processing times; see, e.g., [17, chap. 7]
and [23]. There, quite strong restrictions on the input or weakened notions of robustness
are necessary to guarantee meaningful worst-case solutions. We emphasize that our
results in this paper are robust in the most conservative, classical notion of robustness
originating with Soyster [30], also called strict robustness [22], and in this regard we
follow the terminology of universal solutions.
Scheduling with limited machine availability is a subfield of machine scheduling
that has been studied for over twenty years; see, e.g., the surveys [27, 20, 7]. Different
objective functions, stochastic breakdowns, as well as the offline problem with known
availability periods have been investigated. Nevertheless, only few results are known on
the problem of scheduling to minimize the total weighted completion time, and none of
these deal with release dates. If all jobs have equal weights, a simple interchange argu-
ment shows that sequencing jobs in non-decreasing order of processing times is optimal,
as it is in the setting with continuous machine availability [29]. Obviously, this result
immediately transfers to the universal setting in which machine breakdowns or changes
in processing speeds are not known beforehand. The special case of proportional jobs,
in which the processing time of each job is proportional to its weight, has been stud-
ied in [32]. The authors showed that scheduling in non-increasing order of processing
times (or weights) yields a 2-approximation for preemptive scheduling. However, for
the general problem with arbitrary job weights, it remained an open question [32]
whether a polynomial-time algorithm with constant approximation ratio exists, even
without release dates. In this case, the problem is strongly NP-hard [32].
A major line of research within this area focused on the offline scheduling prob-
lem with a single unavailable period. This problem is weakly NP-hard in both the
preemptive [19] and the non-preemptive variant [1, 21]. Several approximation results
have been derived, see [19, 21, 32, 13, 24]. Only very recently, and independently of
us, Kellerer and Strusevich [16] derived FPTASes with running time O(n^4/ε^2) for the
non-preemptive problem and O(n^6/ε^3) in the preemptive case. An even further improved
non-preemptive FPTAS with running time O(n^2/ε^2) is claimed in [14]. However, the proof
seems incomplete in bounding the deviation of an algorithm's solution from an optimal
one; in particular, the claim after Ineq. (11) in the proof of Lem. 1 is not proved.
Our results. Our main results are algorithms that compute deterministic and randomized
universal schedules for jobs without release dates. These algorithms run in polynomial
time and output an ordering of the jobs such that scheduling the jobs in this order will
always yield a solution that remains within a multiplicative factor of 4, and within a
multiplicative factor of e in expectation, of any given schedule. Furthermore, we show
that our algorithms can be adapted to solve more general problem instances with certain
types of precedence constraints without losing performance quality. We also show that our
upper bounds are best possible for universal scheduling. This is done by establishing an
interesting connection between our problem and a certain online bidding problem [5].
It may seem rather surprising that universal schedules with constant performance
guarantee should always exist. In fact, our results immediately answer affirmatively
an open question in the area of offline scheduling with limited machine availability:
whether there exists a constant factor approximation algorithm for scheduling jobs on a
machine with multiple unavailable periods that are known in advance.
To derive our results, we study the objective of minimizing the total weight of un-
completed jobs at any point in time. First, we show that the performance guarantee
is given directly by a bound on the ratio between the remaining weight of our algo-
rithm and that of an optimal clairvoyant algorithm at every point in time on an ideal machine.
Then, we devise an algorithm that computes the job sequence iteratively backwards: in
each iteration we find a subset of jobs with largest total processing time subject to a
bound on their total weight. The bound is doubled in each iteration. Our approach is
related to, but not equivalent to, an algorithm of Hall et al. [9] for online scheduling
on ideal machines—the doubling there happens in the time horizon. Indeed, this type
of doubling strategy has been applied successfully in the design of algorithms for var-
ious problems; the interested reader is referred to the excellent survey of Chrobak and
Kenyon-Mathieu [6] for a collection of such examples.
The problem of minimizing the total weight of uncompleted jobs at any time was
previously considered [2] in the context of on-line scheduling to minimize flow time on
a single machine; there, a constant approximation algorithm is presented with a worst
case bound of 24. Our results imply an improved 4-approximation for this problem.
Furthermore, we show that the same guarantee holds for the setting with release dates;
unfortunately, unlike in the case without release dates, this does not translate into the
same performance guarantee for universal schedules. In fact, when jobs have individual
release dates, the problem changes drastically.
In Section 4 we show that in the presence of release dates, even if all weights are
equal, there are instances for which the ratio between the value of any universal solution
and that of an optimal schedule is Ω(log n/ log log n). Our proof relies on the classical
theorem of Erdős and Szekeres [8] on the existence of long increasing/decreasing sub-
sequences of a given sequence of numbers. Motivated by this hardness, we study the
class of instances with proportional jobs. We present a non-trivial algorithm and prove
a performance guarantee of 5. Additionally, we give a lower bound of 3 for all universal
solutions in this special case.
Our last result, Section 5, is a fully polynomial time approximation scheme (FPTAS)
for offline scheduling on a machine with a single unavailable period. Compared to the
FPTAS presented recently in [16], our scheme, which was discovered independently
of the former, is faster and seems to be simpler, even though the basic ideas are
similar. Our FPTAS for the non-preemptive variant has running time O(n^3/ε^2) and for
the preemptive variant O(n^4/ε^3 · log p_max).
$$\sum_{j\in J} w_j C_j^{\pi} = \int_0^{\infty} W^{\pi}(t)\,dt. \qquad (1)$$
Clearly, breaks or fluctuations in the speed of the machine delay the completion times.
To describe a particular machine behavior, let f : R+ → R+ be a non-decreasing con-
tinuous function, with f (t) being the aggregated amount of processing time available
on the machine up to time t. We refer to f as the machine capacity function. If the
derivative of f at time t exists, it can be interpreted as the speed of the machine at that
point in time.
For a given capacity function f, let S(π, f) denote the single machine schedule
obtained when applying preemptive list scheduling to permutation π, and let $C_j^{S(\pi,f)}$ denote
the completion time of job j in this particular schedule. For a point in time t ≥ 0,
let $W^{S(\pi,f)}(t)$ denote the total weight of jobs that are not yet completed by time t in
schedule S(π, f). Then,
$$\sum_{j\in J} w_j C_j^{S(\pi,f)} = \int_0^{\infty} W^{S(\pi,f)}(t)\,dt.$$
For t ≥ 0, let $W^{S^*(f)}(t) := \min_{\pi} W^{S(\pi,f)}(t)$.
Observation 1. For a given machine capacity function f, the objective value of any
schedule, and in particular of an optimal clairvoyant one, is at least
$$\int_0^{\infty} W^{S^*(f)}(t)\,dt. \qquad (2)$$
We prove the "only if" part by contradiction. Assume that $W^{S(\pi,f)}(t_0) > c\,W^{S^*(f)}(t_0)$
for some $t_0$ and f. For any $t_1 > t_0$ consider the following machine capacity function
$$\bar f(t) = \begin{cases} f(t) & \text{if } t \le t_0,\\ f(t_0) & \text{if } t_0 < t \le t_1,\\ f(t - t_1 + t_0) & \text{if } t > t_1, \end{cases}$$
which equals f up to time $t_0$ and then remains constant at value $f(t_0)$ throughout the
time interval $[t_0, t_1]$. Hence,
$$\sum_{j\in J} w_j C_j^{S(\pi,\bar f)} = \sum_{j\in J} w_j C_j^{S(\pi,f)} + (t_1 - t_0)\, W^{S(\pi,f)}(t_0). \qquad (3)$$
On the other hand, let π* be a sequence of jobs with $W^{S(\pi^*,f)}(t_0) = W^{S^*(f)}(t_0)$.
Then,
$$\sum_{j\in J} w_j C_j^{S(\pi^*,\bar f)} = \sum_{j\in J} w_j C_j^{S(\pi^*,f)} + (t_1 - t_0)\, W^{S^*(f)}(t_0). \qquad (4)$$
As $t_1$ tends to infinity, the ratio of (3) and (4) tends to $W^{S(\pi,f)}(t_0)/W^{S^*(f)}(t_0) > c$,
a contradiction.
In case that all release dates are equal, approximating the sum of weighted completion
times on a machine with unknown processing behavior is equivalent to approximating
the total remaining weight at any point in time on an ideal machine, f(t) = t for t ≥ 0.
Scheduling according to sequence π on such a machine yields, for each j,
$C_j^{\pi} := \sum_{k:\pi(k)\le\pi(j)} p_k$. The completion time under machine capacity function f is
$$C_j^{S(\pi,f)} = \min\{\, t \mid f(t) \ge C_j^{\pi} \,\}.$$
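A small sketch of this inversion, assuming (only for this illustration) that f is given as a piecewise-linear, non-decreasing list of breakpoints:

```python
from bisect import bisect_left
from itertools import accumulate

def completions_under_capacity(p_in_order, breakpoints):
    """Evaluate C_j^{S(pi,f)} = min{ t : f(t) >= C_j^pi } for jobs
    without release dates.  `p_in_order` lists processing times in the
    order pi; `breakpoints` describes a piecewise-linear, non-decreasing
    capacity function f as sorted (t, f(t)) pairs starting implicitly
    at (0, 0) and ending with f at least the total processing time."""
    fs = [fv for _, fv in breakpoints]
    out = []
    for c in accumulate(p_in_order):          # c = C_j^pi, the prefix sums
        i = bisect_left(fs, c)                # first breakpoint with f(t) >= c
        t0, f0 = breakpoints[i - 1] if i > 0 else (0.0, 0.0)
        t1, f1 = breakpoints[i]
        # f1 > f0 here, since i is the first index with f >= c and f0 < c
        out.append(t0 + (c - f0) * (t1 - t0) / (f1 - f0))
    return out

# Unit speed with a breakdown during [2, 4]:
f = [(2.0, 2.0), (4.0, 2.0), (8.0, 6.0)]
print(completions_under_capacity([1.5, 2.0], f))      # [1.5, 5.5]
```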
Observation 2. For any machine capacity function f and any sequence π of jobs with-
out release dates, $W^{S(\pi,f)}(t) = W^{\pi}(f(t))$ for all t ≥ 0.
Simple counterexamples show that this lemma is only true if all release dates are equal;
otherwise, Observation 2 is simply not true.
Algorithm DOUBLE:
1. For i ∈ {0, 1, ..., ⌈log w(J)⌉}, find a subset J_i^* of jobs of total weight w(J_i^*) ≤ 2^i
and maximum total processing time p(J_i^*). Notice that $J^*_{\lceil\log w(J)\rceil} = J$.
2. Construct a permutation π as follows. Start with an empty sequence of jobs. For i =
⌈log w(J)⌉ down to 0, append the jobs in $J_i^* \setminus \bigcup_{k=0}^{i-1} J_k^*$ in any order at the end of
the sequence.
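A direct, brute-force rendering of DOUBLE may help; it solves the knapsack subproblems exactly (and hence in exponential time), whereas the paper replaces this step by a knapsack FPTAS as discussed below.

```python
import math
from itertools import combinations

def double(p, w):
    """Sketch of algorithm DOUBLE.  `p` and `w` map job -> processing
    time / weight (w[j] >= 1).  The knapsack subproblems J_i^* are
    solved here by brute force, so this sketch is exponential; the
    paper instead uses a knapsack FPTAS with weight budgets relaxed
    to (1 + eps/4) * 2^i to stay polynomial."""
    jobs = list(p)
    levels = max(0, math.ceil(math.log2(sum(w.values()))))
    best = {}
    for i in range(levels + 1):
        budget = 2 ** i
        best_set, best_p = frozenset(), -1.0
        for k in range(len(jobs) + 1):        # brute-force knapsack
            for S in combinations(jobs, k):
                if sum(w[j] for j in S) <= budget:
                    pS = sum(p[j] for j in S)
                    if pS > best_p:
                        best_set, best_p = frozenset(S), pS
        best[i] = best_set
    pi = []
    for i in range(levels, -1, -1):           # largest weight budget first
        lower = set().union(*[best[k] for k in range(i)]) if i else set()
        pi.extend(sorted(best[i] - lower))    # "in any order" in the paper
    return pi
```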
Proof. Using Lemma 2, it is sufficient to show that W^π(t) < 4W^*(t) for all t ≥ 0.
Let t ≥ 0 and let i be minimal such that p(J_i^*) ≥ p(J) − t. By construction of π, only
jobs j in $\bigcup_{k=0}^{i} J_k^*$ have a completion time C_j^π > t. Thus,
$$W^{\pi}(t) \le \sum_{k=0}^{i} w(J_k^*) \le \sum_{k=0}^{i} 2^k = 2^{i+1} - 1. \qquad (5)$$
In case i = 0, the claim is trivially true: since w_j ≥ 1 for any j ∈ J, either W^π(t) = 0,
or t < p(J) and then W^π(t) ≤ 1 ≤ W^*(t). Suppose i ≥ 1; then, by our choice of i, it
holds that p(J_{i-1}^*) < p(J) − t. Therefore, in any sequence π′, the total weight of jobs
completing after time t is larger than 2^{i−1}, because otherwise we would get a
contradiction to the maximality of p(J_{i-1}^*). That is, W^*(t) > 2^{i−1}. Together with (5),
this concludes the proof.
Notice that the algorithm takes exponential time, since finding the subsets of jobs J_i^* is
a KNAPSACK problem and, thus, NP-hard [15]. However, we adapt the algorithm by,
instead of J_i^*, computing a subset of jobs J_i of total weight w(J_i) ≤ (1 + ε/4)2^i and
processing time p(J_i) ≥ max{p(J′) | J′ ⊆ J and w(J′) ≤ 2^i}. This can be done in
time polynomial in the input size and 1/ε by adapting, e.g., the FPTAS in [11] for KNAP-
SACK. The subsets J_i obtained in this way are turned into a sequence π as in DOUBLE.
Theorem 2. Let ε > 0. For every scheduling instance, we can construct a permuta-
tion π in time polynomial in the input size and 1/ε such that the value $\sum_{j\in J} w_j C_j^{S(\pi,f)}$
is less than 4 + ε times the optimum for all machine capacity functions f.
Proof. Again, by Lemma 2 it is sufficient to prove that W^π(t) < (4 + ε)W^*(t) for all t ≥ 0.
Instead of inequality (5) we get the slightly weaker bound
$$W^{\pi}(t) \le \sum_{k=0}^{i} w(J_k) \le \sum_{k=0}^{i} (1 + \varepsilon/4)\,2^k = (1 + \varepsilon/4)(2^{i+1} - 1) < (4 + \varepsilon)\, 2^{i-1}.$$
Theorem 3. Let ε > 0. For every scheduling instance, randomized DOUBLE constructs
a permutation π in time that is polynomial in the input size and 1/ε such that the objec-
tive value $\sum_{j\in J} w_j C_j^{S(\pi,f)}$ is in expectation less than e + ε times the optimum value
for all machine capacity functions f.
A natural generalization of the universal sequencing problem requires that jobs be se-
quenced in compliance with given precedence constraints. We extend the results in
Theorems 1 and 3 to this model for certain classes of precedence constraints, such as
directed out-trees, two-dimensional orders, and complements of chordal bipartite
orders.
Chrobak et al. [5] gave lower bounds of 4 − ε and e − ε, for any ε > 0, for deterministic
and randomized algorithms, respectively.
Theorem 4. For any ε > 0, there exists an instance of the universal scheduling problem
without release dates on which the performance ratio of any deterministic schedule is
at least 4 − ε and the performance ratio of any randomized schedule is at least e − ε.
Proof. Take an instance of the online bidding problem and create the following instance
of the scheduling problem: for each j ∈ U create a job j with weight w_j = j and
processing time p_j = j^j. Consider any permutation π of the jobs in U. For any j ∈ U,
let k(j) be the largest index such that π_{k(j)} ≥ j. Since $p_j > \sum_{i=1}^{j-1} p_i$, at time t =
p(U) − p_j we have $W^{\pi}(t) = \sum_{k=k(j)}^{n} w_{\pi_k}$, while $W^{*}(t) = w_j$. If the sequence π yields an
α-approximation, then
$$\sum_{k=k(j)}^{n} \pi_k \le \alpha\, j \qquad \forall\, j \in U; \qquad (7)$$
that is, the bid set induced by the sequence π must be α-competitive. Since there
is a lower bound of 4 − ε on the competitiveness of deterministic strategies for on-
line bidding, the same bound holds for the performance ratio of deterministic universal
schedules.
The same approach yields the lower bound for randomized strategies.
In our lower bound instance each job j has w_j = 1, j = 0, 1, ..., n−1. Their processing
times form a geometric series, p_j = 2^j for j = 0, 1, ..., n−1, and they are released in
reversed order, $r_j = \sum_{i>j} 2^i = \sum_{i>j} p_i$, j = 0, 1, ..., n−1.
To show the bound, we rely on a classic theorem of Erdős and Szekeres [8] or, more
precisely, on Hammersley's proof [10] of this result.
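The instance itself is easy to write down explicitly; a minimal generator:

```python
def release_date_lower_bound_instance(n):
    """The instance from Section 4: unit weights, processing times
    p_j = 2^j, and release dates in reversed order,
    r_j = sum_{i > j} 2^i = 2^n - 2^(j+1)."""
    w = [1] * n
    p = [2 ** j for j in range(n)]
    r = [2 ** n - 2 ** (j + 1) for j in range(n)]
    return w, p, r
```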
Proof. Let j be the first job in ℓ. The machine has breakdowns [r_j, r_0] and [r_0 + 2^j −
1, L] for large L. At time r_0 all jobs have been released. 2^j − 1 time units later, at
the start of the second breakdown, all jobs in ℓ belong to the set of jobs uncompleted
by the universal schedule, whereas an optimal solution can complete all jobs except j.
Choosing L large enough implies the lemma.
Proof. For each job j in ℓ_i there is a breakdown [r_j, r_j + ε]. For each job j in ℓ_{i+1}, ..., ℓ_k
there is a breakdown [r_j, r_j + p_j] = [r_j, r_j + 2^j]. As a consequence, at time 2^n − 1 the
universal schedule has all jobs in ℓ_i and all jobs in ℓ_{i+1}, ..., ℓ_k uncompleted, whereas
a schedule exists that leaves only the last job of ℓ_i and all jobs in ℓ_{i+1}, ..., ℓ_k uncompleted.
Therefore, a breakdown [2^n − 1, L] for L large enough implies the lemma.
Proof (Proof of Theorem 5). Consider an arbitrary universal scheduling solution and its
decomposition into increasing subsequences ℓ_1, ..., ℓ_k as in Lemma 3, and let α be its
performance guarantee. Using Lemma 5, one can easily prove by induction that |ℓ_i| ≤
α^{k−i+1}. Since ℓ_1, ..., ℓ_k is a partition of the jobs, we have
$$n = \sum_{i=1}^{k} |\ell_i| \le \sum_{i=1}^{k} \alpha^{k-i+1} \le \alpha^{k+1}.$$
Motivated by the negative result in the previous section, we turn our attention to the
special case of proportional weights, that is, there exists a fixed γ ∈ Q such that w_j =
γ p_j for all j ∈ J. Using a standard scaling argument, we can assume w.l.o.g. that p_j =
w_j for all j. We provide an algorithm with performance guarantee 5 and prove a
lower bound of 3 on the performance guarantee of any universal scheduling algorithm.
Proof. Let π be the job sequence computed by SORTCLASS. By Lemma 1, it is suffi-
cient to prove
$$W^{S(\pi,f)}(t) \le 5\, W^{S^*(f)}(t) \qquad \forall\, t > 0. \qquad (8)$$
Take any time t and any machine capacity function f. Let j ∈ J_i be the job being
processed at time t according to the schedule S(π, f). We say that a job other than
job j is in the stack at time t if it was processed for a positive amount of time before t.
The algorithm needs to complete all jobs in the stack, job j, and the jobs that did not
start before t; the latter have a total weight of at most p(J) − f(t), the amount of
processing time remaining at time t to be done by the algorithm.
Since jobs within a class are ordered by release times, there is at most one job per
class in the stack at any point in time. Since jobs in higher classes have higher priority
and job j ∈ J_i is processed at time t, there are no jobs of J_{i+1}, ..., J_z in the stack at
time t. Thus the weight of the jobs in the stack together with the weight of job j is at
most $\sum_{k=0}^{i} 2^k = 2^{i+1} - 1$. Hence,
$$W^{S(\pi,f)}(t) \le p(J) - f(t) + 2^{i+1} - 1. \qquad (9)$$
A first obvious lower bound on the remaining weight of any schedule at time t is
$$W^{S^*(f)}(t) \ge p(J) - f(t). \qquad (10)$$
For another lower bound, let t′ be the last time before t at which the machine is avail-
able but is either idle or processing a job of a class J_{i′} with i′ < i. Note that t′
is well-defined. By definition, all jobs processed during the time interval [t′, t] are in
classes with index at least i; moreover, they are released in the interval [t′, t], since at t′ a
job of a lower class was processed or the machine was idle. Since at time t at least one
of these jobs is unfinished in S(π, f), even though the machine continuously processed
only those jobs, no algorithm can complete all these jobs. Thus, at time t, an optimal
schedule also still needs to complete at least one job of weight at least 2^{i−1}:
$$W^{S^*(f)}(t) \ge 2^{i-1}. \qquad (11)$$
Combining (9), (10), and (11) yields (8) and thus the upper bound of the theorem.
We omit the example that shows that the analysis is tight.
We complement this result by a lower bound of 3, but have to omit the proof.
Theorem 7. There is no algorithm with performance guarantee strictly smaller than 3
for universal scheduling of jobs with release dates and wj = pj , for all j ∈ J.
Due to space limitations we defer all details to the full version of the paper. The idea for
our FPTAS is based on a natural non-preemptive dynamic programming algorithm, used
also in [16]. Given a non-available time interval [s, t], the job set must be partitioned
into the jobs that complete before s and the jobs that complete after t. Clearly, the jobs in
each individual set are scheduled in non-increasing order of the ratios w_j/p_j. This order is
known to be optimal on an ideal machine [29].
The main challenge in designing the FPTAS is to discretize the range of possible
total processing times of jobs scheduled before s in an appropriate way. Notice that
we cannot afford to round these values since they contain critical information on how
much processing time remains before the break. Perturbing this information causes a
considerable change in the set of feasible schedules that cannot be controlled easily. The
intuition behind our algorithm is to reduce the number of states by removing those with
the same (rounded) objective value and nearly the same total processing time before the
break. Among them, we want to store those with smallest amount of processing before
the break in order to make sure that enough space remains for further jobs that need to
be scheduled there.
The algorithm can easily be extended to the preemptive (resumable) problem. We
can assume, w.l.o.g., that in an optimal solution there is at most one job j interrupted
by the break [s, t], and that it resumes processing as soon as the machine is available again.
For a given job j with start time S_j, we define a non-preemptive problem with non-
available period [S_j, S_j + p_j + t − s], which we solve by the FPTAS above. Thus, we
can solve the preemptive problem by running the non-preemptive FPTAS O(n log p_max) times.
6 Further Remarks
In Section 4 we have shown that the performance of universal scheduling algorithms
may deteriorate drastically when generalizing the universal scheduling problem slightly.
Other generalizations do not admit any (exponential time) algorithm with bounded per-
formance guarantee. If a non-adaptive algorithm cannot guarantee to finish within the
minimum makespan, then an adversary creates an arbitrarily long breakdown at the mo-
ment that an optimal schedule has completed all jobs. Examples of such variations are
the problem with two or more machines instead of a single machine, or the problem in
which preempting or resuming a job requires (even the slightest amount of) extra work.
The offline version of our problem (without release dates) in which preemption is
not allowed or causes extra work is not approximable in polynomial time; a reduction
from 2-PARTITION shows that the problem with two or more non-available periods is
not approximable, unless P=NP, even if all jobs have equal weight. A reduction in that
spirit has been used in [33] for a scheduling problem with some jobs having a fixed
position in the schedule. Similarly, we can rule out constant approximation factors for
any preemptive problem variant in which the makespan cannot be computed exactly in
polynomial time. This is shown by simple reductions from the corresponding decision
version of the makespan minimization problem. Such variations of our problem are
scheduling with release dates and scheduling with general precedence constraints.
References
1. Adiri, I., Bruno, J., Frostig, E., Rinnooy Kan, A.: Single machine flow-time scheduling with
a single breakdown. Acta Informatica 26(7), 679–696 (1989)
2. Becchetti, L., Leonardi, S., Marchetti-Spaccamela, A., Pruhs, K.: Online weighted flow time
and deadline scheduling. In: Goemans, M.X., Jansen, K., Rolim, J.D.P., Trevisan, L. (eds.)
RANDOM 2001 and APPROX 2001. LNCS, vol. 2129, pp. 36–47. Springer, Heidelberg
(2001)
3. Ben-Tal, A., Nemirovski, A.: Robust solutions of linear programming problems contami-
nated with uncertain data. Mathematical Programming 88, 411–424 (2000)
4. Carter, J., Wegman, M.: Universal classes of hash functions. Journal of Computer and System
Sciences 18, 143–154 (1979)
5. Chrobak, M., Kenyon, C., Noga, J., Young, N.E.: Incremental medians via online bidding.
Algorithmica 50(4), 455–478 (2008)
6. Chrobak, M., Kenyon-Mathieu, C.: SIGACT news online algorithms column 10: competitive-
ness via doubling. SIGACT News 37(4), 115–126 (2006)
7. Diedrich, F., Jansen, K., Schwarz, U.M., Trystram, D.: A survey on approximation algo-
rithms for scheduling with machine unavailability. In: Lerner, J., Wagner, D., Zweig, K.A.
(eds.) Algorithmics of Large and Complex Networks. LNCS, vol. 5515, pp. 50–64. Springer,
Heidelberg (2009)
8. Erdős, P., Szekeres, G.: A combinatorial problem in geometry. Compositio Mathematica 2,
463–470 (1935)
9. Hall, L., Schulz, A.S., Shmoys, D., Wein, J.: Scheduling to minimize average comple-
tion time: off-line and on-line approximation algorithms. Mathematics of Operations Re-
search 22, 513–544 (1997)
10. Hammersley, J.: A few seedlings of research. In: Proceedings Sixth Berkeley Symp. Math.
Statist. and Probability, vol. 1, pp. 345–394. University of California Press, Berkeley (1972)
11. Ibarra, O.H., Kim, C.E.: Fast approximation algorithms for the knapsack and sum of subset
problems. Journal of the ACM 22(4), 463–468 (1975)
12. Jia, L., Lin, G., Noubir, G., Rajaraman, R., Sundaram, R.: Universal approximations for TSP,
Steiner tree, and set cover. In: Proceedings of the 37th Annual ACM Symposium on Theory
of Computing (STOC ’05), pp. 386–395 (2005)
13. Kacem, I.: Approximation algorithm for the weighted flow-time minimization on a single
machine with a fixed non-availability interval. Computers & Industrial Engineering 54(3),
401–410 (2008)
14. Kacem, I., Mahjoub, A.R.: Fully polynomial time approximation scheme for the weighted
flow-time minimization on a single machine with a fixed non-availability interval. Computers
& Industrial Engineering 56(4), 1708–1712 (2009)
15. Karp, R.M.: Reducibility among combinatorial problems. In: Complexity of computer com-
putations (Proc. Sympos., IBM Thomas J. Watson Res. Center), pp. 85–103. Plenum, New
York (1972)
16. Kellerer, H., Strusevich, V.A.: Fully polynomial approximation schemes for a symmetric
quadratic knapsack problem and its scheduling applications (2009) (accepted for publication
in Algorithmica)
17. Kouvelis, P., Yu, G.: Robust Discrete Optimization and Its Applications. Springer, Heidelberg
(1997)
18. Lawler, E.L.: A dynamic programming algorithm for preemptive scheduling of a single ma-
chine to minimize the number of late jobs. Annals of Operations Research 26, 125–133 (1990)
19. Lee, C.Y.: Machine scheduling with an availability constraint. Journal of Global Optimiza-
tion 9, 395–416 (1996)
20. Lee, C.Y.: Machine scheduling with availability constraints. In: Leung, J.Y.T. (ed.) Handbook
of scheduling. CRC Press, Boca Raton (2004)
21. Lee, C.Y., Liman, S.D.: Single machine flow-time scheduling with scheduled maintenance.
Acta Informatica 29(4), 375–382 (1992)
22. Liebchen, C., Lübbecke, M., Möhring, R.H., Stiller, S.: Recoverable robustness. Technical
report ARRIVAL-TR-0066, ARRIVAL Project (2007)
23. Mastrolilli, M., Mutsanas, N., Svensson, O.: Approximating single machine scheduling with
scenarios. In: Goel, A., Jansen, K., Rolim, J.D.P., Rubinfeld, R. (eds.) APPROX and RAN-
DOM 2008. LNCS, vol. 5171, pp. 153–164. Springer, Heidelberg (2008)
24. Megow, N., Verschae, J.: Note on scheduling on a single machine with one non-availability
period (2008) (unpublished)
25. Pruhs, K., Woeginger, G.J.: Approximation schemes for a class of subset selection problems.
Theoretical Computer Science 382(2), 151–156 (2007)
26. Räcke, H.: Survey on oblivious routing strategies. In: Ambos-Spies, K., Löwe, B., Merkle, W.
(eds.) Mathematical Theory and Computational Practice, Proceedings of 5th Conference on
Computability in Europe (CiE). LNCS, vol. 5635, pp. 419–429. Springer, Heidelberg (2009)
27. Schmidt, G.: Scheduling with limited machine availability. European Journal of Operational
Research 121(1), 1–15 (2000)
28. Schulz, A.S., Skutella, M.: The power of α-points in preemptive single machine scheduling.
Journal of Scheduling 5, 121–133 (2002)
29. Smith, W.E.: Various optimizers for single-stage production. Naval Research Logistics Quar-
terly 3, 59–66 (1956)
30. Soyster, A.: Convex programming with set-inclusive constraints and applications to inexact
linear programming. Operations Research 21(4), 1154–1157 (1973)
31. Valiant, L.G., Brebner, G.J.: Universal schemes for parallel communication. In: Proc. of
STOC, pp. 263–277 (1981)
32. Wang, G., Sun, H., Chu, C.: Preemptive scheduling with availability constraints to minimize
total weighted completion times. Annals of Operations Research 133, 183–192 (2005)
33. Yuan, J., Lin, Y., Ng, C., Cheng, T.: Approximability of single machine scheduling with fixed
jobs to minimize total completion time. European Journal of Operational Research 178(1),
46–56 (2007)
Fault-Tolerant Facility Location:
A Randomized Dependent LP-Rounding
Algorithm
1 Introduction
In Facility Location problems we are given a set of clients C that require a certain
service. To provide such a service, we need to open a subset of a given set of
2 Dependent Rounding
$$g_{\lambda,S}(x) = \sum_{i=0}^{s} \lambda_i \cdot I(\mathrm{Sum}_S(x) = i),$$
where I(·) denotes the indicator function. Thus, g_{λ,S}(x) = λ_i if Sum_S(x) = i.
Let R(y) be a random vector in {0, 1}N obtained by independently rounding
each yi to 1 with probability yi , and to 0 with the complementary probability
of 1 − y_i. Suppose, as above, that ŷ is a random vector in {0, 1}^N obtained
by applying the dependent rounding technique to y. We start with a general
theorem and then specialize it to Theorem 2, which will be very useful for us:
The above two theorems yield a key corollary that we will use:
Corollary 1
Proofs will appear in the full version of the paper (see also [3]).
3 Algorithm
3.1 LP-Relaxation
The FTFL problem is defined by the following Integer Program (IP).
minimize i∈F fi yi + j∈C i∈F cij xij (1)
subject to: i xij ≥ rj ∀j ∈ C (2)
xij ≤ yi ∀j ∈ C ∀i ∈ F (3)
yi ≤ 1 ∀i ∈ F (4)
xij , yi ∈ Z≥0 ∀j ∈ C ∀i ∈ F, (5)
3.2 Scaling
We may assume, without loss of generality, that for any client j ∈ C there ex-
ists at most one facility i ∈ F such that 0 < x*_ij < y*_i. Moreover, this facility
can be assumed to be at the largest distance from client j among the facilities that
fractionally serve j in (x*, y*).
We first set x̃_ij = ỹ_i = 0 for all i ∈ F, j ∈ C. Then we scale up the fractional
solution by the constant γ ≈ 1.7245 to obtain a fractional solution (x̂, ŷ). To be
precise: we set x̂_ij = min{1, γ · x*_ij} and ŷ_i = min{1, γ · y*_i}. We open each facility
i with ŷ_i = 1 and connect each client-facility pair with x̂_ij = 1. More pre-
cisely, we modify ŷ, ỹ, x̂, x̃ and the service requirements r as follows. For each
facility i with ŷ_i = 1, set ŷ_i = 0 and ỹ_i = 1. Then, for every pair (i, j) such
that x̂_ij = 1, set x̂_ij = 0, x̃_ij = 1 and decrease r_j by one. When this process is
finished, we denote the resulting r, ŷ, and x̂ by r̄, ȳ, and x̄. Note that the connections
that we made in this phase can be paid for by the difference in the connection
cost between x̂ and x̄. We will show that the remaining connection cost of the
solution of the algorithm is in expectation at most the cost of x̄.
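A compact sketch of this scaling and pre-opening phase, with an assumed dictionary-based data layout (the names are illustrative, not from the paper):

```python
def scale_and_preopen(x_star, y_star, r, gamma=1.7245):
    """Sketch of the scaling phase (Section 3.2): scale (x*, y*) by
    gamma, truncate at 1, then move fully opened facilities and fully
    connected pairs into the integral part (y_tilde, x_tilde),
    decreasing the requirements accordingly.  Returns the residual
    fractional solution (x_bar, y_bar, r_bar) and the integral part."""
    y_hat = {i: min(1.0, gamma * v) for i, v in y_star.items()}
    x_hat = {ij: min(1.0, gamma * v) for ij, v in x_star.items()}
    y_tilde = {i for i, v in y_hat.items() if v == 1.0}
    y_bar = {i: (0.0 if i in y_tilde else v) for i, v in y_hat.items()}
    x_tilde, x_bar, r_bar = set(), dict(x_hat), dict(r)
    for (i, j), v in x_hat.items():
        if v == 1.0:                # pre-connect pair (i, j)
            x_tilde.add((i, j))
            x_bar[(i, j)] = 0.0
            r_bar[j] -= 1
    return x_bar, y_bar, r_bar, x_tilde, y_tilde
```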
For the feasibility of the final solution, it is essential that if we connected
client j to facility i in this initial phase, we will not connect it again to i in the
rest of the algorithm. There will be two ways of connecting clients in the process
of rounding x̄. The first one connects client j to a subset of the facilities serving j in
x̄. Recall that if j was connected to facility i in the initial phase, then x̄_ij = 0,
and no additional i-j connection will be created.
The connections of the second type will be created in a process of clustering.
The clustering that we will use is a generalization of the one of Chudak and Shmoys
for the UFL problem [4]. As a result of this clustering process, client j will be
allowed to connect itself via a different client j′ to a facility opened around j′. Such a
j′ will be called a cluster center for a subset of facilities, and it will make sure that
at least some guaranteed number of these facilities will get opened.
To be certain that client j does not get connected again to facility i by a
path via client j′, facility i will never be a member of the set of facilities clustered
by client j′. We call a facility i special for client j iff ỹ_i = 1 and 0 < x̄_ij < 1.
Note that, by our earlier assumption, there is at most one special facility for
each client j, and that a special facility must be at maximal distance among the
facilities serving j in x̄. When rounding the fractional solution in Section 3.5, we
take care that special facilities are not members of the formed clusters.
For every client j consider the following construction. Let i_1, i_2, ..., i_{|F|} be the
ordering of the facilities in F in nondecreasing order of the distances c_ij to client j. Let
i_k be the facility in this ordering such that $\sum_{l=1}^{k-1} \bar x_{i_l j} < \bar r_j$ and $\sum_{l=1}^{k} \bar x_{i_l j} \ge \bar r_j$.
Define
$$x^{(c)}_{i_l j} = \begin{cases} \bar x_{i_l j} & \text{for } l < k,\\ \bar r_j - \sum_{l'=1}^{k-1} \bar x_{i_{l'} j} & \text{for } l = k,\\ 0 & \text{for } l > k, \end{cases}$$
and define $x^{(d)}_{ij} = \bar x_{ij} - x^{(c)}_{ij}$ for all i ∈ F, j ∈ C.
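This greedy split along the distance ordering can be phrased compactly; a sketch with an assumed dictionary-based data layout:

```python
def close_distant_split(xbar_j, dist_j, r_bar_j):
    """Split client j's fractional assignment (Section 3.3) into the
    close part x_c and the distant part x_d.  `xbar_j` and `dist_j`
    map facility -> assignment / distance; r_bar_j is the residual
    requirement of j after the pre-opening phase."""
    x_c, x_d, acc = {}, {}, 0.0
    for i in sorted(xbar_j, key=lambda i: dist_j[i]):   # nondecreasing c_ij
        take = min(xbar_j[i], max(r_bar_j - acc, 0.0))  # close mass caps at r_bar_j
        x_c[i], x_d[i] = take, xbar_j[i] - take
        acc += take
    return ({i: v for i, v in x_c.items() if v > 0},    # close facilities C_j
            {i: v for i, v in x_d.items() if v > 0})    # distant facilities D_j
```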
We will call the set of facilities i ∈ F with $x^{(c)}_{ij} > 0$ the set of close
facilities of client j, and we denote it by C_j. By analogy, we will call the set of
facilities i ∈ F with $x^{(d)}_{ij} > 0$ the set of distant facilities of client j and
denote it D_j. Observe that for a client j the intersection of C_j and D_j is either
empty or contains exactly one facility. In the latter case, we will say that this
facility is both distant and close. Note that, unlike in the UFL problem, we
cannot simply split this facility into a close and a distant part, because it is
essential that we make at most one connection to this facility in the final integral
solution. Let $d^{(max)}_j = c_{i_k j}$ be the distance from client j to the farthest of its
close facilities.
3.4 Clustering
We will now construct a family S ⊆ 2^F of subsets of facilities. The subsets S ∈
S will be called clusters, and they will guide the rounding procedure described
next. There will be a client related to each cluster, and each single client j will
be related to at most one cluster, which we call S_j.
Not all the clients participate in the clustering process. Clients j with r̄_j = 1
and a special facility i ∈ C_j (recall that a special facility is a facility that is fully
open in ỹ but only partially used by j in x̄) will be called special and will not
take part in the clustering process. Let C′ denote the set of all other, non-special
clients. Observe that, as a result of scaling, clients j with r̄_j ≥ 2 do not have any
special facilities among their close facilities (since $\sum_i \hat x_{ij} \ge \gamma r_j > r_j + 1$). As
a consequence, there are no special facilities among the close facilities of clients
from C′, the only clients actively involved in the clustering procedure.
For each client j ∈ C′ we will keep two families A_j and B_j of disjoint subsets
of facilities. Initially A_j = {{i} : i ∈ C_j}, i.e., A_j is initialized to contain a
singleton set for each close facility of client j; Bj is initially empty. Aj will be
used to store these initial singleton sets, but also clusters containing only close
facilities of j; Bj will be used to store only clusters that contain at least one
close facility of j. When adding a cluster to either Aj or Bj we will remove all
the subsets it intersects from both Aj and Bj , therefore subsets in Aj ∪ Bj will
always be pairwise disjoint.
The family of clusters that we will construct will be a laminar family of subsets
of facilities, i.e., any two clusters are either disjoint or one entirely contains the
other. One can imagine facilities being leaves and clusters being internal nodes
of a forest that eventually becomes a tree, when all the clusters are added.
We will use y(S) as a shorthand for $\sum_{i\in S} y_i$, and we write ⌊y⌋(S) for ⌊y(S)⌋. As
a consequence of using the family of clusters to guide the rounding process, by
Property (P2') of the dependent rounding procedure when applied to a cluster,
the quantity ⌊y⌋(S) lower bounds the number of facilities that will certainly be
opened in cluster S. Additionally, let us define the residual requirement of client
j to be $rr_j = \bar r_j - \sum_{S\in(A_j\cup B_j)} \lfloor y\rfloor(S)$, that is, r̄_j minus a lower bound on the
number of facilities that will be opened in clusters from A_j and B_j.
We use the following procedure to compute clusters. While there exists a client
j ∈ C′ such that rr_j > 0, take such a j with minimal $d^{(max)}_j$ and do the following:
1. Take X_j to be an inclusion-wise minimal subset of A_j such that
$\sum_{S\in X_j}(y(S) - \lfloor y\rfloor(S)) \ge rr_j$. Form the new cluster $S_j = \bigcup_{S\in X_j} S$.
2. Make S_j a new cluster by setting S ← S ∪ {S_j}.
3. Update A_j ← (A_j \ X_j) ∪ {S_j}.
4. For each client j′ with rr_{j′} > 0 do
– If X_j ⊆ A_{j′}, then set A_{j′} ← (A_{j′} \ X_j) ∪ {S_j}.
– If X_j ∩ A_{j′} ≠ ∅ and X_j \ A_{j′} ≠ ∅,
then set A_{j′} ← A_{j′} \ X_j and B_{j′} ← {S ∈ B_{j′} : S ∩ S_j = ∅} ∪ {S_j}.
Eventually, add a cluster S_r = F containing all the facilities to the family S.
We call a client j active in a particular iteration if, before this iteration, its
residual requirement $rr_j = \bar r_j - \sum_{S\in(A_j\cup B_j)} \lfloor y\rfloor(S)$ was positive. During the above
procedure, all active clients j have in their sets A_j and B_j only maximal subsets
of facilities, meaning subsets that are not contained in any other cluster (i.e., they
are roots of their trees in the current forest). Therefore, when a new cluster
S_j is created, it contains all the other clusters with which it has a nonempty
intersection (i.e., the new cluster S_j becomes a root of a new tree).
We shall now argue that there is enough fractional opening in the clusters in A_j
to cover the residual requirement rr_j when cluster S_j is to be formed. Consider
a fixed client j ∈ C′. Recall that at the start of the clustering we have A_j =
{{i} : i ∈ C_j}, and therefore $\sum_{S\in A_j}(y(S) - \lfloor y\rfloor(S)) = \sum_{i\in C_j} y_i \ge \bar r_j = rr_j$. It
remains to show that $\sum_{S\in A_j}(y(S) - \lfloor y\rfloor(S)) - rr_j$ does not decrease over time
until client j is considered. When a client j′ with $d^{(max)}_{j'} \le d^{(max)}_j$ is considered
and cluster S_{j′} is created, the following cases are possible:
1. $S_{j'} \cap \bigl(\bigcup_{S\in A_j} S\bigr) = \emptyset$: then A_j and rr_j do not change;
2. $S_{j'} \subseteq \bigcup_{S\in A_j} S$: then A_j changes its structure, but $\sum_{S\in A_j} y(S)$ and
$\sum_{S\in B_j} \lfloor y\rfloor(S)$ do not change; hence $\sum_{S\in A_j}(y(S) - \lfloor y\rfloor(S)) - rr_j$ also does not
change;
3. $S_{j'} \cap \bigl(\bigcup_{S\in A_j} S\bigr) \ne \emptyset$ and $S_{j'} \setminus \bigl(\bigcup_{S\in A_j} S\bigr) \ne \emptyset$: then, by the inclusion-wise minimal-
ity of the set X_{j′}, we have $\lfloor y\rfloor(S_{j'}) - \sum_{S\in B_j,\, S\subseteq S_{j'}} \lfloor y\rfloor(S) - \sum_{S\in A_j,\, S\subseteq S_{j'}} y(S) \ge 0$;
hence, $\sum_{S\in A_j}(y(S) - \lfloor y\rfloor(S)) - rr_j$ cannot decrease.
Let A′_j = A_j ∩ S be the set of clusters in A_j. Recall that all facilities in the clusters
in A′_j are close facilities of j. Note also that each cluster S_{j′} ∈ B_j was created
from close facilities of a client j′ with $d^{(max)}_{j'} \le d^{(max)}_j$. We also have for each
Note that our clustering is related to, but more complex than, the one of Chudak
and Shmoys [4] for UFL and of Swamy and Shmoys [11] for FTFL, where clusters
are pairwise disjoint and each contains facilities whose fractional opening sums
up to, or slightly exceeds, the value of 1.
4 Analysis
We will now estimate the expected cost of the solution (x̃, ỹ). The tricky part is
to bound the connection cost, which we do as follows. We argue that a certain
fraction of the demand of client j can be satisfied from its close facilities, then
some part of the remaining demand can be satisfied from its distant facilities.
Eventually, the remaining (not too large in expectation) part of the demand is
satisfied via clusters.
Proof. Recall that $\sum_{i\in F} x^{(c)}_{ij} = \bar r_j$ and $\sum_{i\in F} x^{(d)}_{ij} \ge (\gamma - 1)\cdot \bar r_j$. Therefore, we
have $(d^{(d)}_j - d_j)\cdot(\gamma - 1) \le (d_j - d^{(c)}_j)\cdot 1 = R_j\cdot d_j$, which can be rearranged to
get $d^{(d)}_j \le d_j\left(1 + \frac{R_j}{\gamma - 1}\right)$.
Finally, observe that the average distance from j to the distant facilities of j
gives an upper bound on the maximal distance to any of the close facilities of j.
Namely, $d^{(max)}_j \le d^{(d)}_j$.
[Figure: the distances of client j. The axis from 0 to γ·r̄_j is split into the close facilities (the first r̄_j of the scaled demand) and the distant facilities (the rest); $d^{(c)}_j = d_j(1 − R_j)$ is the average distance to the close facilities and $d^{(max)}_j$ the maximal distance to the close facilities.]
split such a facility into a close and a distant part. Note that we can only do
this for this part of the analysis, but not for the actual rounding algorithm from
Section 3.5. Applying the above-described split of the undecided facility, we get
that the total fractional opening of the close facilities of client j is exactly r̄_j, and
the total fractional opening of both close and distant facilities is at least γ · r̄_j.
Therefore, Corollary 1 yields the following:
Corollary 4. The number of close facilities used by client j in a solution de-
scribed in Section 3.5 is in expectation at least $(1 - \frac{1}{e})\cdot \bar r_j$.
Corollary 5. The number of close and distant facilities used by client j in a
solution described in Section 3.5 is in expectation at least $(1 - \frac{1}{e^{\gamma}})\cdot \bar r_j$.
Motivated by the above bounds, we design a selection method to choose a (large
enough in expectation) subset of facilities opened around client j:
Lemma 2. For j ∈ C′ we can select a subset F_j of open facilities from C_j ∪ D_j
such that:
$$|F_j| \le \bar r_j \quad\text{(with probability 1)},$$
$$E[\,|F_j|\,] = \left(1 - \frac{1}{e^{\gamma}}\right)\cdot \bar r_j,$$
$$E\Big[\sum_{i\in F_j} c_{ij}\Big] \le \bigl((1 - 1/e)\cdot \bar r_j\bigr)\cdot d^{(c)}_j + \Bigl(\Bigl(\bigl(1 - \tfrac{1}{e^{\gamma}}\bigr) - (1 - 1/e)\Bigr)\cdot \bar r_j\Bigr)\cdot d^{(d)}_j.$$
4.3 Calculation
Proof. First observe that the solution produced by ALG is trivially feasible for
the original problem (1)-(5), as we simply choose r_j different facilities for client
j in Step 5. What is less trivial is that all the r_j facilities used by j are within
a certain small distance. Let us now bound the expected connection cost of the
obtained solution.
For each client j ∈ C we get r_j − r̄_j facilities opened in Step 2. As we already
argued in Section 3.2, we can afford to connect j to these facilities and pay the
connection cost from the difference between $\sum_i c_{ij}\hat x_{ij}$ and $\sum_i c_{ij}\bar x_{ij}$. We will
now argue that client j can connect to the remaining r̄_j facilities with an expected
connection cost bounded by $\sum_i c_{ij}\bar x_{ij}$.
For a special client j ∈ (C \ C′) we have r̄_j = 1, and already in Step 2 one special
facility at distance $d^{(max)}_j$ from j is opened. We cannot always just connect j
to this facility, since $d^{(max)}_j$ may potentially be bigger than γ · d_j. What we do
instead is first look at the close facilities of j which, as a result of the rounding
in Step 4, give, with a certain probability, one open facility at a small distance.
By Corollary 4 this probability is at least 1 − 1/e. It is easy to observe that
the expected connection cost to this open facility is at most $d^{(c)}_j$. Only if no
close facility is open do we use the special facility, which results in the expected
connection cost of client j being at most
$$\left(1 - \frac{1}{e}\right) d^{(c)}_j + \frac{1}{e}\, d^{(max)}_j \;\le\; \left(1 - \frac{1}{e}\right)(1 - R_j)\, d_j + \frac{1}{e}\left(1 + \frac{R_j}{\gamma - 1}\right) d_j \;\le\; \gamma\, d_j,$$
where the first inequality is a consequence of Lemma 1, and the last one is a
consequence of the choice of γ ≈ 1.7245.
In the remainder, we only consider non-special clients $j \in C'$. By Lemma 2, client $j$ can choose to connect itself to the subset of open facilities $F_j$, and pay for this connection in expectation at most $((1 - 1/e) \cdot \bar r_j) \cdot d_j^{(c)} + (((1 - \frac{1}{e^\gamma}) - (1 - 1/e)) \cdot \bar r_j) \cdot d_j^{(d)}$. The expected number of facilities needed on top of those from $F_j$ is $\bar r_j - E[|F_j|] = \frac{1}{e^\gamma} \cdot \bar r_j$. These remaining facilities client $j$ gets deterministically within distance at most $3 \cdot d_j^{(max)}$, which is possible by the properties of the rounding procedure described in Section 3.5; see Corollary 3. Therefore, the expected connection cost to facilities not in $F_j$ is at most $(\frac{1}{e^\gamma} \cdot \bar r_j) \cdot (3 \cdot d_j^{(max)})$.
5 Concluding Remarks
We have presented improved approximation algorithms for the metric Fault-
Tolerant Uncapacitated Facility Location problem. The main technical innova-
tion is the usage and analysis of dependent rounding in this context. We believe
that variants of dependent rounding will also be fruitful in other location prob-
lems. Finally, we conjecture that the approximation threshold for both UFL and
FTFL is the value 1.46 · · · suggested by [5]; it would be very interesting to prove
or refute this.
Acknowledgment. We thank the IPCO referees for their helpful comments.
References
1. Ageev, A., Sviridenko, M.: Pipage rounding: a new method of constructing algo-
rithms with proven performance guarantee. Journal of Combinatorial Optimiza-
tion 8(3), 307–328 (2004)
2. Byrka, J.: An optimal bifactor approximation algorithm for the metric uncapaci-
tated facility location problem. In: APPROX-RANDOM, pp. 29–43 (2007)
3. Byrka, J., Srinivasan, A., Swamy, C.: Fault-tolerant facility location: a randomized
dependent LP-rounding algorithm. arXiv:1003.1295v1
4. Chudak, F.A., Shmoys, D.B.: Improved approximation algorithms for the unca-
pacitated facility location problem. SIAM J. Comput. 33(1), 1–25 (2003)
5. Guha, S., Khuller, S.: Greedy strikes back: Improved facility location algorithms.
J. Algorithms 31(1), 228–248 (1999)
6. Guha, S., Meyerson, A., Munagala, K.: A constant factor approximation algorithm
for the fault-tolerant facility location problem. J. Algorithms 48(2), 429–440 (2003)
7. Jain, K., Vazirani, V.V.: An approximation algorithm for the fault tolerant metric
facility location problem. Algorithmica 38(3), 433–439 (2003)
8. Lin, J.-H., Vitter, J.S.: Epsilon-approximations with minimum packing constraint
violation (extended abstract). In: STOC, pp. 771–782 (1992)
9. Shmoys, D.B., Tardos, É., Aardal, K.: Approximation algorithms for facility loca-
tion problems (extended abstract). In: STOC, pp. 265–274 (1997)
10. Srinivasan, A.: Distributions on level-sets with applications to approximation al-
gorithms. In: FOCS, pp. 588–597 (2001)
11. Swamy, C., Shmoys, D.B.: Fault-tolerant facility location. ACM Transactions on
Algorithms 4(4) (2008)
Appendix
Proof (of Lemma 2, a sketch). Given client $j$, fractional facility opening vector $y$, distances $c_{ij}$, requirement $\bar r_j$, and facility subsets $C_j$ and $D_j$, we will describe how to randomly choose a subset of at most $k = \bar r_j$ open facilities from $C_j \cup D_j$ with the desired properties.
For this argument we assume that all the numbers are rational. Recall that the opening of facilities is decided by a dependent-rounding routine that, in a single step, couples two fractional entries so as to leave at most one of them fractional. Observe that, for the purpose of this argument, we can split a single facility into many identical copies with smaller fractional openings. One can think of the input facilities and their original openings as having been obtained along the process of dependent rounding applied to the multiple “small” copies that we prefer to consider here. Therefore, without loss of generality, we can assume that all the facilities have fractional opening equal to some common value $\epsilon$, i.e., $y_i = \epsilon$ for all $i \in C_j \cup D_j$. Moreover, we can assume that the sets $C_j$ and $D_j$ are disjoint.
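The coupling step admits a compact description. The following minimal Python sketch shows one such step, in the spirit of the schemes of [1, 10]; the specific pairing rule is the standard one and is shown for illustration only, not as the authors' exact implementation. It preserves both marginals and the sum of the two entries, and makes at least one entry integral.

import random

def couple(y, i, j):
    # One coupling step: y[i], y[j] are both strictly fractional on entry.
    # Afterwards at least one of them is integral, the sum y[i] + y[j] is
    # unchanged, and the expectations E[y[i]], E[y[j]] are preserved.
    d1 = min(1 - y[i], y[j])      # amount shifted from j to i
    d2 = min(y[i], 1 - y[j])      # amount shifted from i to j
    if random.random() < d2 / (d1 + d2):
        y[i] += d1
        y[j] -= d1
    else:
        y[i] -= d2
        y[j] += d2

def dependent_round(y):
    # Repeat until at most one fractional entry remains (exact arithmetic assumed).
    frac = [i for i, v in enumerate(y) if 0 < v < 1]
    while len(frac) >= 2:
        couple(y, frac[0], frac[1])
        frac = [i for i in frac if 0 < y[i] < 1]
    return y

A direct check shows the marginals are preserved: the expected change of y[i] is d1 * d2/(d1+d2) - d2 * d1/(d1+d2) = 0, and symmetrically for y[j].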
By renaming facilities we obtain that $C_j = \{1, 2, \ldots, |C_j|\}$, $D_j = \{|C_j| + 1, \ldots, |C_j| + |D_j|\}$, and $c_{ij} \leq c_{i'j}$ for all $1 \leq i < i' \leq |C_j| + |D_j|$.
Consider the random set $S_0 \subseteq C_j \cup D_j$ created as follows. Let $\hat y$ be the outcome of rounding the fractional opening vector $y$ with the dependent-rounding procedure, and define $S_0 = \{i : \hat y_i = 1,\ \sum_{j' < i} \hat y_{j'} < k\}$. By Corollary 1, we have that
$E[|S_0|] \geq k \cdot (1 - \exp(-\mathrm{Sum}_{C_j \cup D_j}(y)/k))$. Define the random set $S_\alpha$ for $\alpha \in (0, |C_j| + |D_j|]$ as follows. For $i = 1, 2, \ldots, |C_j| + |D_j| - \lceil \alpha \rceil$ we have $i \in S_\alpha$ if and only if $i \in S_0$. For $i = |C_j| + |D_j| - \lfloor \alpha \rfloor$ (the boundary index, when $\alpha$ is not integral), in case $i \in S_0$ we toss a (suitably biased) coin and include $i$ in $S_\alpha$ with probability $\lceil \alpha \rceil - \alpha$. For $i > |C_j| + |D_j| - \lfloor \alpha \rfloor$ we deterministically have $i \notin S_\alpha$.
Observe that $E[|S_\alpha|]$ is a continuous monotone non-increasing function of $\alpha$, hence there is $\alpha_0$ such that $E[|S_{\alpha_0}|] = k \cdot (1 - \exp(-\mathrm{Sum}_{C_j \cup D_j}(y)/k))$. We fix $F_j = S_{\alpha_0}$ and claim that it has the desired properties. By definition, we have $E[|F_j|] = k \cdot (1 - \exp(-\mathrm{Sum}_{C_j \cup D_j}(y)/k)) = (1 - \frac{1}{e^\gamma}) \cdot \bar r_j$. We next show that the expected total connection cost between $j$ and the facilities in $F_j$ is not too large.
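As an aside, the existence argument for $\alpha_0$ is effective: under the truncation rule above, $E[|S_\alpha|]$ is a simple function of the marginals $p_i^0 = \Pr[i \in S_0]$, and $\alpha_0$ can be located by bisection. The following hypothetical Python sketch is ours; the proof itself only needs existence, not computation.

import math

def expected_size(p0, alpha):
    # E[|S_alpha|] under the truncation rule: indices beyond the boundary are
    # dropped, the boundary index survives with probability ceil(alpha) - alpha.
    n = len(p0)
    c = math.ceil(alpha)
    value = sum(p0[: n - c])
    if c >= 1:
        value += (c - alpha) * p0[n - c]
    return value

def find_alpha0(p0, target, tol=1e-9):
    # E[|S_alpha|] is continuous and non-increasing in alpha, so bisection applies.
    lo, hi = 0.0, float(len(p0))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_size(p0, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)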
Let $p_i^\alpha = \Pr[i \in S_\alpha]$ and $p_i = p_i^{\alpha_0} = \Pr[i \in F_j]$. Consider the cumulative probability defined as $cp_i^\alpha = \sum_{j' \leq i} p_{j'}^\alpha$. Observe that an application of Corollary 1 to the subset consisting of the first $i$ elements of $C_j \cup D_j$ yields $cp_i^0 \geq k \cdot (1 - \exp(-i\epsilon/k))$ for $i = 1, \ldots, |C_j| + |D_j|$. Since $(1 - \exp(-i\epsilon/k))$ is a monotone increasing function of $i$, one easily gets that also $cp_i^\alpha \geq k \cdot (1 - \exp(-i\epsilon/k))$ for $\alpha \leq \alpha_0$ and $i = 1, \ldots, |C_j| + |D_j|$. In particular, we get $cp_{|C_j|}^{\alpha_0} \geq k \cdot (1 - \exp(-|C_j|\epsilon/k)) = k \cdot (1 - 1/e)$, since the total fractional opening of the close facilities is $|C_j| \cdot \epsilon = \bar r_j = k$. As the facilities are ordered by non-decreasing distance from $j$, these cumulative bounds imply that, in the worst case, an expected mass of $k(1 - 1/e)$ of $F_j$ falls on close facilities (at average distance at most $d_j^{(c)}$) and the remaining expected mass of $(1 - \frac{1}{e^\gamma})k - (1 - \frac{1}{e})k$ on distant facilities (at average distance at most $d_j^{(d)}$). Hence the expected total connection cost between $j$ and $F_j$ is at most
$$((1 - 1/e) \cdot \bar r_j) \cdot d_j^{(c)} + \left(\left(1 - \tfrac{1}{e^\gamma}\right) - \left(1 - \tfrac{1}{e}\right)\right) \cdot \bar r_j \cdot d_j^{(d)},$$
as claimed.
Integer Quadratic Quasi-polyhedra
Adam N. Letchford
1 Introduction
In recent years, there has been increasing interest in Mixed-Integer Non-Linear
Programming (MINLP), due to the realisation that it has a wealth of applica-
tions. This paper is concerned with a special case of MINLP: Integer Quadratic
Programming (IQP). It is assumed that instances of IQP are written in the
following standard form:
$$\min \left\{ c^T x + x^T Q x \,:\, Ax = b,\ x \in \mathbb{Z}^n_+ \right\}. \qquad (1)$$
(This standard form entails no loss of generality: a maximisation problem can be converted into one of minimising, inequalities can be converted into equations using slack variables, and free variables can be expressed as the difference between two non-negative variables.)
We assume (without loss of generality) that the matrix Q is symmetric, but
we do not require it to be positive semidefinite. That is, we do not assume that
the objective function is convex.
Polyhedral combinatorics — the study of polyhedra associated with combi-
natorial problems — has proven to be a very useful tool for deriving strong
formulations of Mixed-Integer Linear Programs (e.g., [1,16]). The purpose of
this paper is to apply it to IQP. It turns out, however, that one has to deal with
‘quasi-polyhedra’: convex sets that are the intersection of a countably infinite
number of half-spaces. For this reason, polyhedral theory has to be combined
with elements of convex analysis. (A similar strategy was used in [5] to study a
continuous quadratic optimisation problem.)
2 The Quasi-polyhedra
A standard trick when dealing with quadratic optimisation problems is to lin-
earise the objective and/or constraints by introducing additional variables (e.g.,
[14,17,18]). More precisely, for 1 ≤ i ≤ j ≤ n, we define a new variable yij , which
represents the product xi xj . The IQP (1) can then be reformulated as:
$$\min \left\{ c^T x + q^T y \,:\, Ax = b,\ x \in \mathbb{Z}^n_+,\ y_{ij} = x_i x_j\ (1 \leq i \leq j \leq n) \right\},$$
where $q \in \mathbb{Q}^{\binom{n+1}{2}}$ is defined appropriately. Notice that the non-linearity (and non-convexity, if any) is now captured in the constraints $y_{ij} = x_i x_j$.
It is an interesting fact that the linear equations can be eliminated from the
problem. Indeed, we can delete an arbitrary linear equation $a^T x = r$, provided that we add $M(a^T x - r)^2$ to the objective function, where $M$ is a suitably large
integer. For this reason, one can concentrate on the unconstrained case, in which
the linear system Ax = b is vacuous.
The set of feasible solutions to an unconstrained IQP, in the extended (x, y)-
space, is:
$$F_n^+ := \left\{ (x, y) \in \mathbb{Z}_+^{\,n + \binom{n+1}{2}} \,:\, y_{ij} = x_i x_j\ (1 \leq i \leq j \leq n) \right\}.$$
and all other variables to zero. On the other hand, the point with y11 = 1 and
all other variables at zero does not lie in the convex hull of Fn . Since the convex
hull does not contain all of its limit points, it is not closed.
We are therefore led to look at the closure of the convex hull, which we denote by $IQ_n^+$. Figure 1 represents $IQ_1^+$. It can be seen that it is described by the non-negativity inequality $x_1 \geq 0$, together with the inequalities $y_{11} \geq (2t + 1)x_1 - t(t + 1)$ for all $t \in \mathbb{Z}_+$. (A similar observation was made by Michaels & Weismantel [11] for a closely-related family of polytopes.)
Fig. 1. The convex set $IQ_1^+$
The second technical issue is that $IQ_n^+$ is, in fact, not a polyhedron. A polyhedron is defined as the intersection of a finite number of half-spaces, but we have seen that $IQ_1^+$ is the intersection of a countably infinite number of half-spaces. (The results that we give in Sect. 4 show that the same holds when $n > 1$ as well.) The correct term for such sets is quasi-polyhedra (see, e.g., Anderson et al. [2]). Fortunately, this issue does not cause any difficulty in what follows.
For the purposes of what follows, we introduce a closely-related family of
quasi-polyhedra, obtained by omitting the non-negativity requirement. Specifically, we define
$$F_n := \left\{ (x, y) \in \mathbb{Z}^{n + \binom{n+1}{2}} \,:\, y_{ij} = x_i x_j\ (1 \leq i \leq j \leq n) \right\},$$
and then let $IQ_n$ denote the closure of the convex hull of $F_n$. One can check that $IQ_1$ is described by the inequalities $y_{11} \geq (2t + 1)x_1 - t(t + 1)$ for all $t \in \mathbb{Z}$.
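For $n = 1$ these inequalities have a one-line derivation (a worked instance of the linearisation idea used for the split inequalities in Sect. 4): every integer $x_1$ satisfies $(x_1 - t)(x_1 - t - 1) \geq 0$ for each $t \in \mathbb{Z}$, and expanding and replacing $x_1^2$ by $y_{11}$ gives
$$x_1^2 - (2t + 1)x_1 + t(t + 1) \geq 0 \quad\Longrightarrow\quad y_{11} \geq (2t + 1)x_1 - t(t + 1).$$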
Next, we present two simple complexity results:
Next, we present two simple complexity results:
Proposition 2. Minimising a linear function over $IQ_n^+$ is NP-hard in the strong sense.
Proof. It follows from the above discussion that this problem is equivalent to
IQP. Now, IQP is clearly N P-hard in the strong sense, since it contains Integer
Linear Programming as a special case.
ing the objective function leaves the optimal solution unchanged. The resulting problem is one of minimising a quadratic function over the integer lattice $\mathbb{Z}^n$. It follows from the definitions that this is equivalent to minimising a linear function over $IQ_n$.
We therefore cannot expect to obtain complete linear descriptions of $IQ_n$ or $IQ_n^+$ for general $n$.
On a more positive note, we have the following result:
Proposition 4. Minimising a linear function over IQn is solvable in polynomial
time when n is fixed.
Proof. As already mentioned, minimising a linear function over $IQ_n$ is equivalent to minimising a quadratic function over the integer lattice $\mathbb{Z}^n$. Now, if the function is not convex, the problem is easily shown to be unbounded. If, on the other hand, the function is convex, then the problem can be solved for fixed $n$ by the algorithm of Khachiyan & Porkolab [9].
There is therefore some hope of obtaining a complete linear description of $IQ_n$ for small values of $n$ (just as we have already done for the case $n = 1$). We do not know the complexity of minimising a linear function over $IQ_n^+$ for fixed $n$.
Proof. Let $\bar x$ be an arbitrary point in $\mathbb{Z}^n$, and let $(\bar x, \bar y)$ be the corresponding member of $F_n$. The quadratic function $\sum_{i=1}^n (x_i - \bar x_i)^2$ has a unique minimum at $x = \bar x$. Since every point in $F_n$ satisfies $y_{ij} = x_i x_j$ for all $1 \leq i \leq j \leq n$, the linear function $\sum_{i=1}^n (y_{ii} - 2\bar x_i x_i + \bar x_i^2)$ has a unique minimum over $F_n$ at $(\bar x, \bar y)$. Therefore $(\bar x, \bar y)$ is an extreme point of $IQ_n$. The proof for $IQ_n^+$ is similar.
Proof. Trivial.
Theorem 1. Let $U$ be a unimodular integral square matrix of order $n$, and let $w \in \mathbb{Z}^n$ be an arbitrary integer vector. Consider the affine transformation that takes any $(x, y) \in \mathbb{R}^{n + \binom{n+1}{2}}$ and maps it to the point $(x', y') \in \mathbb{R}^{n + \binom{n+1}{2}}$, where
– $x' = Ux + w$;
– $y'_{ij} = x'_i x'_j$ for all $1 \leq i \leq j \leq n$.
This transformation maps $IQ_n$ onto itself.
Proof. Let $(x, y)$ be an extreme point of $IQ_n$, and let $(x', y')$ be its image under the transformation. Since $U$ and $w$ are integral, $x'$ is integral. Moreover, since $y'_{ij} = x'_i x'_j$, $(x', y')$ is an extreme point of $IQ_n$. For the reverse direction, let $(x', y')$ be an extreme point of $IQ_n$, and let $(x, y)$ be its image under the inverse transformation. Note that $x = U^{-1}(x' - w)$, and is therefore integral. Moreover, $y_{ij} = x_i x_j$ for all $1 \leq i \leq j \leq n$, which implies that $(x, y)$ is an extreme point of $IQ_n$.
Proposition 7 simply states that $IQ_n^+$ is invariant under a permutation of the index set $\{1, \ldots, n\}$, which is unsurprising. Theorem 1, on the other hand, has a very useful corollary:
– $x' = x - v$;
– $y'_{ij} = x'_i x'_j$ for all $1 \leq i \leq j \leq n$.
Proof. Let $d = n + \binom{n+1}{2}$. Since the original inequality $\alpha^T x + \beta^T y \geq \gamma$ in-
Since $\hat Y$ is the product of a vector and its transpose, it must be psd. Equivalently, $v^T Y v + (2s)v^T x + s^2 \geq 0$ for all vectors $v \in \mathbb{R}^n$ and scalars $s \in \mathbb{R}$. This yields:
Proposition 8. The following ‘psd inequalities’ are valid for $IQ_n$ (and therefore also for $IQ_n^+$):
$$(2s)v^T x + \sum_{i=1}^n v_i^2 y_{ii} + 2 \sum_{1 \leq i < j \leq n} v_i v_j y_{ij} + s^2 \geq 0 \qquad (\forall v \in \mathbb{R}^n,\ s \in \mathbb{R}). \qquad (2)$$
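For instance (a small example of our own, with $n = 2$): taking $v = (1, -1)^T$ and $s = 0$ in (2) gives $y_{11} - 2y_{12} + y_{22} \geq 0$, which is simply the linearisation of $(x_1 - x_2)^2 \geq 0$.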
To the knowledge of the author, the validity of the psd inequalities for extended
formulations of quadratic optimisation problems was first observed by Ramana
[15]. The inequalities can be shown to induce proper faces of $IQ_n$ and $IQ_n^+$ under
mild conditions. We will see in the next section, however, that they never induce
facets.
Now recall that a symmetric matrix $M \in \mathbb{R}^{n \times n}$ is called completely positive if it can be factorised as $AA^T$ for some non-negative real matrix $A$. The set of completely positive matrices of order $n$ also forms a convex cone in $\mathbb{R}^{n \times n}$. Using exactly the same argument as above, any valid inequality for the completely positive cone yields a valid inequality for $IQ_n^+$. Unfortunately, this additional information does not help us much, because a complete linear description of the completely positive cone is unknown, and unlikely to be found for general $n$ [12].
We close this section by pointing out a connection between $IQ_n$, $IQ_n^+$ and the so-called boolean quadric polytope. The boolean quadric polytope of order $n$ is denoted by $BQP_n$ and is defined as:
$$BQP_n := \operatorname{conv}\left\{ (x, y) \in \{0, 1\}^{n + \binom{n}{2}} \,:\, y_{ij} = x_i x_j\ (1 \leq i < j \leq n) \right\}.$$
Note that the $y_{ii}$ variables are not present in the case of $BQP_n$.
The boolean quadric polytope was defined by Padberg [14] in the context of
quadratic 0-1 programming. It has many applications in other fields and has
been studied in great depth [7].
We will need the following lemma:
Lemma 1. For all $1 \leq i \leq n$, the inequality $y_{ii} \geq x_i$ is valid for $IQ_n$.
Proof. This follows from the fact that all members of $F_n$ satisfy $y_{ii} = x_i^2$, and the fact that $t^2 \geq t$ for any integer $t$.
The following proposition states that BQPn is essentially nothing but a face of
IQn :
Proposition 9. Let H be the face of IQn obtained by setting the inequality
yii ≥ xi to an equation for all 1 ≤ i ≤ n. The boolean quadric polytope BQPn is
an affine image of H.
Proof. Note that $t^2 = t$ if and only if $t \in \{0, 1\}$. Therefore, the extreme points of $H$ are precisely the members of $F_n$ that satisfy $x \in \{0, 1\}^n$. So there is a
induces a facet of $BQP_n$. Then there exists at least one facet-inducing inequality for $IQ_n$ of the form
$$\sum_{i=1}^n (a_i - \lambda_i) x_i + \sum_{i=1}^n \lambda_i y_{ii} + \sum_{1 \leq i < j \leq n} b_{ij} y_{ij} \leq c,$$
with $\lambda \in \mathbb{Q}^n$.
Proof. To see that the inequalities of the form $y_{ij} \geq 0$ induce facets, simply note that all but one of the affinely-independent points listed in the proof of Proposition 5 satisfy $y_{ij} = 0$. To see that the inequalities of the form $y_{ii} \geq 0$ do not induce facets, simply note that they are dominated by the inequalities $x_i \geq 0$ and $y_{ii} \geq x_i$ (refer to Fig. 1). The inequalities of the form $x_i \geq 0$ are a little more tricky: one can easily construct $n + \binom{n}{2}$ affinely-independent points with $x_i = 0$, but to complete the proof one needs an additional $n$ extreme rays of $IQ_n^+$ having $x_i = 0$. The proof of Proposition 1 shows that there is a ray with $y_{ii} = 1$ and all other variables zero. Using a similar argument, one can show that, for all $j \neq i$, there is a ray with $x_j = y_{ij} = 1$ and all other variables zero.
The non-negativity inequalities are of course not valid for $IQ_n$.
is illustrated in Fig. 2.
Fig. 2. The split disjunction $(x_1 - 2x_2 \leq -2) \vee (x_1 - 2x_2 \geq -1)$
$$(2s + 1)v^T x + \sum_{i=1}^n v_i^2 y_{ii} + 2 \sum_{1 \leq i < j \leq n} v_i v_j y_{ij} + s(s + 1) \geq 0. \qquad (3)$$
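For example (an instance of our own, matching the disjunction of Fig. 2): taking $v = (1, -2)^T$ and $s = 1$, inequality (3) reads
$$y_{11} - 4y_{12} + 4y_{22} + 3(x_1 - 2x_2) + 2 \geq 0,$$
the linearisation of $(x_1 - 2x_2 + 1)(x_1 - 2x_2 + 2) \geq 0$, which holds at every integer point because each such point lies on one side of the split $(x_1 - 2x_2 \leq -2) \vee (x_1 - 2x_2 \geq -1)$.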
Theorem 4. The split inequalities (3) dominate the psd inequalities (2).
Proof. First, suppose that a psd inequality is derived using an integral vector $v$ and an integral scalar $s$. Recall that the psd inequality can be written as $v^T Y v + (2s)v^T x + s^2 \geq 0$. This is dominated by the two inequalities $v^T Y v + (2s + 1)v^T x + s(s + 1) \geq 0$ and $v^T Y v + (2s - 1)v^T x + s(s - 1) \geq 0$, which are both split inequalities.
To complete the proof, we must show that the psd inequalities derived from integral $v$ and $s$ dominate all the others. Suppose a point $(x^*, y^*)$ violates a psd inequality with non-integral $v$ or $s$, and let $\epsilon$ be a small positive quantity. Let $v'$ be a rational vector such that $|v'_i - v_i| < \epsilon$ for all $i$, and let $s'$ be a rational number such that $|s' - s| < \epsilon$. Provided $\epsilon$ is small enough, the psd inequality obtained by using $v'$ and $s'$ in place of $v$ and $s$ will also be violated by $(x^*, y^*)$. Now let $M$ be a positive integer such that $Mv' \in \mathbb{Z}^n$ and $Ms' \in \mathbb{Z}$. The psd inequality with
Proof. First, note that the trivial inequality $y_{11} \geq x_1$ is a split inequality, obtained by linearising the quadratic inequality $(x_1 - 1)x_1 \geq 0$. This trivial split inequality induces a facet of $IQ_n$, because all but one of the affinely-independent points listed in the proof of Proposition 5 satisfy $y_{11} = x_1$.
Now consider a non-trivial split inequality of the form (3), and assume that the non-zero components of $v$ are relatively prime. A well-known result on integral matrices (see, e.g., p. 15 of Newman [13]) implies that there exists a unimodular matrix $U \in \mathbb{Z}^{n \times n}$ having $v$ as its first row. Let $U$ be such a matrix, and let $w \in \mathbb{Z}^n$ be an arbitrary vector satisfying $w_1 = s + 1$. Note that, if $(x, y)$ is an extreme point of $IQ_n$ and $(x', y')$ is the transformed extreme point described in Theorem 1, then $x'_1 = v^T x + s + 1$ and $y'_{11} = (x'_1)^2 = v^T Y v + 2(s + 1)v^T x + (s + 1)^2$. Thus, if we apply the transformation mentioned in Corollary 1 to the trivial split inequality $y_{11} \geq x_1$, we obtain the inequality $v^T Y v + 2(s + 1)v^T x + (s + 1)^2 \geq v^T x + s + 1$. This is equivalent to the non-trivial split inequality. By Corollary 1, it induces a facet of $IQ_n$.
Proof. First, note that when $v$ satisfies the stated condition, there exists a vector $w \in \mathbb{Z}^n$ such that $v^T w = 0$ and such that $w_i > 0$ for all $i$. To see this, let $k$ and $k'$ be the numbers of components of $v$ that are positive and negative, respectively, and let $m$ be the product of the non-zero components of $v$. The desired vector $w$ can be obtained by setting $w_i$ to $k'|m|/v_i$ when $v_i > 0$, to $k|m|/|v_i|$ when $v_i < 0$, and to $1$ otherwise.
Second, observe that an extreme point $(\bar x, \bar y)$ of $IQ_n$ satisfies the split inequality (3) at equality if and only if $v^T \bar x \in \{-s - 1, -s\}$. Therefore, if $(\bar x, \bar y)$ is such an
If the non-zero components of the vector $v$ all have the same sign, then the split inequality need not induce even a proper face of $IQ_n^+$ (because there may not exist a lattice point $x \in \mathbb{Z}^n_+$ such that $v^T x \in \{-s - 1, -s\}$). Theorem 2 implies
and of the same sign. Then there exists an integer $s$, of the opposite sign, such that the split inequality (3) induces a facet of $IQ_n^+$.
To close this subsection, we remark that Propositions 9 and 10 imply the validity of the following inequalities for $BQP_n$:
$$\sum_{i=1}^n v_i(v_i + 2s + 1)x_i + 2 \sum_{1 \leq i < j \leq n} v_i v_j y_{ij} + s(s + 1) \geq 0 \qquad (\forall v \in \mathbb{Z}^n,\ s \in \mathbb{Z}). \qquad (4)$$
We have seen that $IQ_1^+$ is completely described by the split inequalities and the non-negativity inequality $x_1 \geq 0$. A natural question is whether the split and non-negativity inequalities are enough to describe $IQ_2^+$. This is unfortunately not the case, as we now explain.
Consider the two lines in $\mathbb{R}^2$ defined by the equations $x_1 + x_2 = 3$ and $x_1 + 2x_2 = 4$. Every point of $\mathbb{Z}^2_+$ lies either above both lines (satisfying $x_1 + x_2 \geq 3$ and $x_1 + 2x_2 \geq 4$), or below both lines (satisfying $x_1 + x_2 \leq 3$ and $x_1 + 2x_2 \leq 4$). This implies that all points in $F_2^+$ satisfy the non-linear inequality $(x_1 + x_2 - 3)(x_1 + 2x_2 - 4) \geq 0$. This implies that the linear inequality
$$y_{11} + 3y_{12} + 2y_{22} - 7x_1 - 10x_2 + 12 \geq 0$$
is valid for $IQ_2^+$.
[Figure: the points of $\mathbb{Z}^2_+$ together with the two lines $x_1 + x_2 = 3$ and $x_1 + 2x_2 = 4$.]
x1
5 Concluding Remarks
References
1. Aardal, K.I., Weismantel, R.: Polyhedral combinatorics. In: Dell’Amico, M., Mafi-
oli, F., Martello, S. (eds.) Annotated Bibliographies in Combinatorial Optimiza-
tion. Wiley, New York (1997)
2. Anderson, E.J., Goberna, M.A., López, M.A.: Simplex-like trajectories on quasi-
polyhedral sets. Math. Oper. Res. 26, 147–162 (2001)
3. Balas, E.: Disjunctive programming. Ann. Discr. Math. 5, 3–51 (1979)
4. Boros, E., Hammer, P.L.: Cut-polytopes, Boolean quadric polytopes and nonneg-
ative quadratic pseudo-Boolean functions. Math. Oper. Res. 18, 245–253 (1993)
5. Burer, S., Letchford, A.N.: On non-convex quadratic programming with box con-
straints. SIAM J. Opt. 20, 1073–1089 (2009)
6. Cook, W., Kannan, R., Schrijver, A.: Chvátal closures for mixed integer program-
ming problems. Math. Program. 47, 155–174 (1990)
7. Deza, M.M., Laurent, M.: Geometry of Cuts and Metrics. Springer, Berlin (1997)
8. van Emde Boas, P.: Another NP-complete problem and the complexity of com-
puting short vectors in a lattice. Technical Report 81-04, Mathematics Institute,
University of Amsterdam (1981)
9. Khachiyan, L., Porkolab, L.: Integer optimization on convex semialgebraic sets.
Discr. Comput. Geom. 23, 207–224 (2000)
10. Lovász, L., Schrijver, A.J.: Cones of matrices and set-functions and 0-1 optimiza-
tion. SIAM J. Opt. 1, 166–190 (1991)
11. Michaels, D., Weismantel, R.: Polyhedra related to integer-convex polynomial sys-
tems. Math. Program. 105, 215–232 (2006)
12. Murty, K.G., Kabadi, S.N.: Some N P-complete problems in quadratic and nonlin-
ear programming. Math. Program. 39, 117–129 (1987)
13. Newman, M.: Integral Matrices. Academic Press, New York (1972)
14. Padberg, M.W.: The boolean quadric polytope: some characteristics, facets and
relatives. Math. Program. 45, 139–172 (1989)
15. Ramana, M.: An Algorithmic Analysis of Multiquadratic and Semidefinite Pro-
gramming Problems. PhD thesis, Johns Hopkins University, Baltimore, MD (1993)
16. Schrijver, A.: Combinatorial Optimization: Polyhedra and Efficiency. Springer,
Berlin (2003)
17. Sherali, H.D., Adams, W.P.: A Reformulation-Linearization Technique for Solving
Discrete and Continuous Nonconvex Problems. Kluwer, Dordrecht (1998)
18. Yajima, Y., Fujie, T.: A polyhedral approach for nonconvex quadratic programming
problems with box constraints. J. Global Opt. 13, 151–170 (1998)
An Integer Programming and Decomposition
Approach to General Chance-Constrained
Mathematical Programs
James Luedtke
1 Introduction
We introduce a new approach for exactly solving general chance-constrained
mathematical programs (CCMPs). A chance constraint states that the chosen
decision vector should, with high probability, lie within a region that depends
on a set of random variables. A generic CCMP can be stated as
$$\min \left\{ f(x) \,\middle|\, P\{x \in P(\omega)\} \geq 1 - \epsilon,\ x \in X \right\}, \qquad (1)$$
and a two-stage problem with the possibility to take recourse after observing the
random outcome has P (ω) = {x | ∃y with T (ω)x + W (ω)y ≥ b(ω)}. (In §5.1 we
describe an example application.)
Our approach works for CCMPs with discrete (and finite support) distribu-
tion. Specifically, we assume that1 P{ω = ω k } = 1/N for k = 1, . . . , N . We refer
to the possible outcomes as scenarios. While this is a restriction, recent results
on using sample average approximations on problems with general distributions
[1] demonstrate that such finite support approximations, when obtained from a
Monte Carlo sample of the original distribution, can be used to find good feasible
solutions to the original problem and statistical bounds on solution quality. We
also assume that the sets Pk := P (ω k ) are polyhedra described by
we are interested in strong valid inequalities for the projection of the feasible
region of (3) into the space of x and z variables. Specifically, we define this
projection as
$$F = \left\{ (x, z) \in X \times \{0, 1\}^N \,:\, \exists y \in \mathbb{R}^{d \times N}_+ \text{ s.t. (3b)–(3c) hold} \right\}. \qquad (4)$$
Note that $(x, z) \in F$ if and only if $x \in X$, $z \in \{0, 1\}^N$ satisfies (3c), and $x \in P_k$ for each $k$ with $z_k = 0$. Given this definition of $F$, we can then succinctly state a reformulation of the original chance-constrained problem (1) as:
$$\alpha x + \pi z \geq \beta \qquad (6)$$
The proof of this result is almost identical to an argument in [19] and follows
from the observation that zk = 0 implies αx ≥ hk (α), whereas (3c) implies that
zk = 0 for at least one of the p + 1 largest values of hk (α).
Now, as was done in [19,8], we can apply the star inequalities of [25], or
equivalently in this case, the mixing inequalities of [26] to “mix” the inequalities
(8) to obtain additional strong valid inequalities.
Theorem 1 ([25,26]). Let $T = \{t_1, t_2, \ldots, t_l\} \subseteq \{\sigma_1, \ldots, \sigma_p\}$ be such that $h_{t_i}(\alpha) \geq h_{t_{i+1}}(\alpha)$ for $i = 1, \ldots, l$, where $h_{t_{l+1}}(\alpha) := h_{\sigma_{p+1}}(\alpha)$. Then the inequality
$$\alpha x + \sum_{i=1}^{l} \left( h_{t_i}(\alpha) - h_{t_{i+1}}(\alpha) \right) z_{t_i} \geq h_{t_1}(\alpha) \qquad (9)$$
is valid for F .
These inequalities are strong in the sense that, if we consider the set $Y$ defined by
$$Y = \left\{ (y, z) \in \mathbb{R} \times \{0, 1\}^p \,:\, y + (h_{\sigma_i}(\alpha) - h_{\sigma_{p+1}}(\alpha)) z_{\sigma_i} \geq h_{\sigma_i}(\alpha),\ i = 1, \ldots, p \right\},$$
then the inequalities (9), with $\alpha x$ replaced by $y$, define the convex hull of $Y$ [25].
Furthermore, the inequalities of Theorem 1 are facet-defining for the convex hull of $Y$ (again with $y = \alpha x$) if and only if $h_{t_1}(\alpha) = h_{\sigma_1}(\alpha)$, which suggests that when searching for a valid inequality of the form (9), one should always include $\sigma_1 \in T$. In particular, the valid inequalities
$$\alpha x + (h_{\sigma_1}(\alpha) - h_{\sigma_i}(\alpha)) z_{\sigma_1} + (h_{\sigma_i}(\alpha) - h_{\sigma_{p+1}}(\alpha)) z_{\sigma_i} \geq h_{\sigma_1}(\alpha), \quad i = 1, \ldots, p, \qquad (10)$$
dominate the inequalities (8), which can be obtained by aggregating (10) with the valid inequalities $z_{\sigma_1} \leq 1$ with a weight of $(h_{\sigma_1}(\alpha) - h_{\sigma_i}(\alpha))$ on the latter.
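For illustration, a most-violated inequality of the form (9), restricted to chains starting at $\sigma_1$ as suggested above, can be found by a simple dynamic program. The following Python sketch is ours; indices are positions in the sorted order $h_{\sigma_1}(\alpha) \geq \cdots \geq h_{\sigma_{p+1}}(\alpha)$.

def mixing_chain(h, zhat):
    # h[0..p]: the p+1 largest values h_{sigma_1}(alpha) >= ... >= h_{sigma_{p+1}}(alpha).
    # zhat[0..p-1]: current (fractional) values of z_{sigma_1}, ..., z_{sigma_p}.
    # Returns the chain T (positions into h) maximizing
    #   sum_i (h[t_i] - h[t_{i+1}]) * (1 - zhat[t_i])
    # among chains with t_1 = sigma_1, together with that value.
    p = len(h) - 1
    best = [0.0] * (p + 1)   # best[j]: value of the best chain starting at j
    nxt = [p] * (p + 1)      # successor of j in that chain; p = sentinel sigma_{p+1}
    for j in range(p - 1, -1, -1):
        best[j] = (h[j] - h[p]) * (1.0 - zhat[j])       # chain {j} alone
        for k in range(j + 1, p):
            val = (h[j] - h[k]) * (1.0 - zhat[j]) + best[k]
            if val > best[j]:
                best[j], nxt[j] = val, k
    T, j = [], 0             # facet-defining cuts start at sigma_1 (position 0)
    while j != p:
        T.append(j)
        j = nxt[j]
    return T, best[0]

By the telescoping identity $h_{t_1}(\alpha) = h_{\sigma_{p+1}}(\alpha) + \sum_i (h_{t_i}(\alpha) - h_{t_{i+1}}(\alpha))$, the cut built from the returned chain is violated at $(\hat x, \hat z)$ exactly when $\alpha \hat x < h[p] + \mathrm{best}[0]$.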
4 Decomposition Algorithm
We are now ready to describe the branch-and-cut decomposition algorithm. The
algorithm works with a master relaxation defined as follows:
$$\mathrm{RP}^*(N_0, N_1, C) := \min\ f(x) \qquad (11a)$$
$$\text{s.t.} \quad \sum_{k=1}^{N} z_k \leq p, \quad (x, z) \in C, \quad x \in X,\ z \in [0, 1]^N, \qquad (11b)$$
$$z_k = 0,\ k \in N_0, \qquad z_k = 1,\ k \in N_1. \qquad (11c)$$
Here N0 is the set of binary variables currently fixed to 0, N1 is the set of binary
variables currently fixed to 1, and C is the relaxation defined by all the globally
valid inequalities added so far when the relaxation is solved. At the root node
in the search tree, we set $N_0 = N_1 = \emptyset$ and $C = \mathbb{R}^n \times \mathbb{R}^N$.
Algorithm 1 presents a simple version of the proposed approach. The algo-
rithm is a basic branch-and-bound algorithm, with branching being done on the
binary variables zk , with the only important difference being how the nodes are
processed (Step 2 in the algorithm). In this step, the current node relaxation
(11) is solved repeatedly until no cuts have been added to the description of
C or the lower bound exceeds the incumbent objective value U . Whenever an
integer feasible solution ẑ is found, and optionally otherwise, the cut separation
routine SepCuts is called. The SepCuts routine must be called when ẑ is integer
feasible to check whether the solution (x̂, ẑ) is truly feasible to the set F . The
routine is optionally called otherwise to possibly improve the lower bound.
The SepCuts routine, described in Algorithm 2, attempts to find strong vio-
lated inequalities using the approach described in §3. The key here is the method
for selecting the coefficients α that are taken as given in §3. The idea is to con-
sider all scenarios k such that ẑk = 0, so that the associated constraints x ∈ Pk
are supposed to be satisfied, and for such scenarios test whether indeed this
holds. If $\hat x \in P_k$, then the condition that $\hat z_k = 0$ should imply $\hat x \in P_k$ is not violated. However, if $\hat x \notin P_k$, this contradicts the value of $\hat z_k$, and hence we seek
to find an inequality that cuts off this infeasible solution. We therefore find an
inequality, say αx ≥ β, that is facet-defining for Pk , and that separates x̂ from
Pk . We then use the coefficients α to generate one or more strong valid inequal-
ities as derived in §3. While stated as two separate steps, the test of x̂ ∈ Pk
(line 3) and subsequent finding of a facet-defining inequality of Pk that cuts off
x̂ if not would typically be done together. For example, if we have an inequality
description of Pk (possibly in a lifted space such as in (2)) then this can be ac-
complished by solving an appropriate linear program. If Pk has special structure
Proof (Sketch). The details of the proof are left out of this extended abstract.
However, the first main point is that the algorithm terminates finitely because it
is based on branching on a finite number of binary variables, and the processing
of each node terminates finitely because the valid inequalities are derived from
a finite number of facet-defining inequalities (for the polyhedral sets Pk ). The
second point is that the algorithm never cuts off an optimal solution because the
branching never excludes part of the feasible region and only valid inequalities
for the set F are added. The final point is that no solutions that are not in the
feasible region F are accepted for updating the incumbent objective value U (in
line 13 of the algorithm) because the SepCuts routine is always called for integer
feasible solutions ẑ and it can be shown that it is guaranteed to find a violated
inequality if (x̂, ẑ) ∈
/ F.
Aside from solving the master relaxation (11) the main work of Algorithm 1
happens within the SepCuts routine. An advantage of this approach is that
most of this work is done for one scenario at a time and can be implemented
to be done in parallel. In particular, checking whether $\hat x \in P_k$ (and finding a violated facet-defining inequality if not) for any $k$ such that $\hat z_k < 1$ can be done in parallel. The subsequent work of generating a strong valid inequality is dominated by the calculation of the values $h_k(\alpha)$ as in (7), which can also be done in parallel.
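In outline, the routine can be sketched as follows in Python (our paraphrase of the description above; separate_facet and h_value are hypothetical placeholders for problem-specific subroutines, namely the scenario LP of the previous paragraph and the computation (7), and mixing_chain is the sketch given earlier):

def sep_cuts(xhat, zhat, scenarios, separate_facet, h_value, p):
    # separate_facet(k, xhat): coefficient vector alpha of a facet-defining
    #   inequality of P_k violated by xhat, or None if xhat lies in P_k.
    # h_value(alpha, k): computes h_k(alpha) as in (7).
    cuts = []
    for k in scenarios:
        if zhat[k] >= 1.0:
            continue                     # x in P_k is not required when z_k = 1
        alpha = separate_facet(k, xhat)  # independent per scenario: parallelizable
        if alpha is None:
            continue                     # xhat in P_k: z_k = 0 is not contradicted
        # h_k(alpha) for every scenario (also parallelizable), sorted decreasingly
        hk = sorted(((h_value(alpha, kk), kk) for kk in scenarios), reverse=True)
        h = [v for v, _ in hk[:p + 1]]       # h_{sigma_1}, ..., h_{sigma_{p+1}}
        z = [zhat[kk] for _, kk in hk[:p]]   # z in the same sigma order
        T, value = mixing_chain(h, z)        # strong mixing cut from Section 3
        cuts.append((alpha, [hk[t][1] for t in T]))
    return cuts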
We have stated our approach in relatively simple form in Algorithm 1. How-
ever, as this approach is essentially a variant of branch-and-cut for solving a
(particularly structured) integer programming problem, we can also use all the
computational enhancements commonly used in such algorithms. In particular,
using heuristics to find good feasible solutions early and using some sort of
pseudocost branching [27], strong branching [28], or reliability branching [29]
approach for choosing which variable to branch on would be important. In our
implementation (described in §5.2) we have embedded the key cut generation
step of our algorithm within the CPLEX commercial integer programming solver
which has such enhancements already implemented.
In our definition of the master relaxation (11), we have enforced the constraints $x \in X$. If $X$ is a polyhedron and $f(x)$ is linear, (11) is a linear program.
However, if X is not a polyhedron, suitable modifications to the algorithm could
be made to ensure that the relaxations solved remain linear programming prob-
lems. For example, if X is defined by a polyhedron Q with integrality constraints
on some of the variables, then we could instead define the master relaxation to
enforce x ∈ Q, and then perform branching both on the integer-constrained x
variables and on the zk variables. Such a modification is also easy to imple-
ment within existing integer programming solvers. Note also that, in this case,
Q would be a natural choice for the relaxation X̄ of X used in §3 when obtaining
the hk (α) values as in (7).
Here μij is the service rate of server type i when serving customer type j (μij = 0
if server type i cannot serve customer type j). This formulation aims to choose
minimum cost staffing levels such that the probability of meeting quality of
service targets is high.
When generating the test instances, we first generated the service rates, and
the mean and covariance of the arrival rate vector. We then generated the cost
vector in such a way that “more useful” server types were generally more ex-
pensive, in order to make the solutions nontrivial. Finally, to generate specific
instances with finite support, we sampled N joint-normally distributed arrival
rate vectors independently using the fixed mean and covariance matrix for vari-
ous sample sizes $N$. In all our test instances we use $\epsilon = 0.1$ as the risk tolerance.
We see that this problem has the two-stage structure given in (2), and hence
available methods for finding exact solutions (or even any solution with a bound
on optimality error) are very limited. However, we point out that the form of
P (λ) still possesses some special structure in that the second-stage constraints
have no random coefficients (i.e., in the form of (2) the matrices $T^k$ and $W^k$
do not vary with k). In addition, the constraints x ∈ X are very simple for
this problem; we simply have $X = \mathbb{R}^n_+$. Thus, while this test problem is certainly
beyond the capabilities of existing approaches, it is not yet a test of the algorithm
in the most general settings.
Technically, this problem does not satisfy our assumptions given in §2 because the sets $\mathbb{R}^n_+ \cap P(\lambda)$ are not bounded. However, our approach really only requires
that the optimal solutions to (7) always exist for any coefficient vector α of a
facet-defining inequality for P (λ). As valid inequalities for P (λ) necessarily have
non-negative coefficients, this clearly holds.
5.3 Results
We compared our algorithm against the Big-M formulation (3) (with the M
values chosen as small as possible) and also against a basic decomposition algo-
rithm that does not use the strong valid inequalities of §3. We compare against
this simple decomposition approach to understand whether the success of our
algorithm is due solely to the decomposition, or whether the strong inequalities
are also important. The difference between the basic decomposition algorithm
and the strengthened version is in the type of cuts that are added in the SepCuts
routine. Specifically, in the case of an uncertainty set Pk of the form (12), if we
find a scenario k with ẑk = 0, and a valid inequality αx ≥ β for the set Pk that
is violated by x̂, the basic decomposition algorithm simply adds the inequality
αx ≥ βzk .
It is not hard to see that when the sets Pk have the form (12), this inequality is
valid for F because x ≥ 0 and any valid inequality for Pk has α ≥ 0. Furthermore,
this inequality successfully cuts off the infeasible solution (x̂, ẑ).
Table 1 presents the results of these three approaches for varying problem
size in terms of the number of agent types ($n$), the number of customer types ($m$), which is also the dimension of the random vector $\Lambda$, and the number of scenarios $N$.
These tests were done with a time limit of one hour. Unless stated otherwise,
Table 1. Results on call center staffing instances; solution time (sec) or final optimality
gap (%)
each entry is an average over ten randomly generated samples from the same
base underlying instance (i.e., the instance is fixed, but ten different samples of
the N scenarios are taken). The big-M formulation (3) only successfully solves
the LP relaxation and finds a feasible solution for the two smallest instance
sizes. The entries ‘-’ in the other cases mean that either no solution was found
in the time limit, or that the LP relaxation did not solve in the time limit.
For the largest instances, CPLEX failed with an out-of-memory error before
the time limit was reached. Using the basic decomposition approach makes a
significant improvement over the big-M formulation in that feasible solutions
are now found for all instances. However, only the smallest of the instances (and
only 9 of 10 of them) could be solved to optimality, and the larger instances had
very large optimality gaps after the limit. Combining decomposition with strong
valid inequalities (“Strong Decomp” in the table) we are able to solve all the
instances to optimality in an average of less than five minutes.
To understand these results a little better, we present in Table 2 the root
gaps (relative to the optimal values) after all cuts have been added for the
6 Discussion
We have presented a promising approach for solving general CCMPs, although
additional computational tests are needed on problems having more general
structures than the test problem we considered. The approach uses both de-
composition, to enable processing subproblems corresponding to one scenario at
a time, and integer programming techniques, to yield strong valid inequalities.
From a stochastic programming perspective, it is not surprising that decom-
position is necessary to yield an efficient algorithm, as this is well-known for
traditional two-stage stochastic programs. From an integer programming per-
spective, it is not surprising that using strong valid inequalities has an enormous
impact. The approach presented here represents a successful merger of these
approaches to solve CCMPs.
References
1. Luedtke, J., Ahmed, S.: A sample approximation approach for optimization with
probabilistic constraints. SIAM J. Optim. 19, 674–699 (2008)
2. Charnes, A., Cooper, W.W., Symonds, G.H.: Cost horizons and certainty equiva-
lents: an approach to stochastic programming of heating oil. Manage. Sci. 4, 235–
263 (1958)
3. Prékopa, A.: On probabilistic constrained programming. In: Kuhn, H.W. (ed.)
Proceedings of the Princeton Symposium on Mathematical Programming, Prince-
ton, NJ, pp. 113–138. Princeton University Press, Princeton (1970)
4. Charnes, A., Cooper, W.W.: Deterministic equivalents for optimizing and satisfic-
ing under chance constraints. Oper. Res. 11, 18–39 (1963)
5. Calafiore, G., El Ghaoui, L.: On distributionally robust chance-constrained linear
programs. J. Optim. Theory Appl. 130, 1–22 (2006)
6. Beraldi, P., Ruszczyński, A.: The probabilistic set-covering problem. Oper. Res. 50,
956–967 (2002)
7. Dentcheva, D., Prékopa, A., Ruszczyński, A.: Concavity and efficient points of dis-
crete distributions in probabilistic programming. Math. Program. 89, 55–77 (2000)
8. Luedtke, J., Ahmed, S., Nemhauser, G.L.: An integer programming approach for
linear programs with probabilistic constraints. Math. Program. 122, 247–272 (2010)
9. Saxena, A., Goyal, V., Lejeune, M.: MIP reformulations of the probabilistic set
covering problem. Math. Program. 121, 1–31 (2009)
10. Ruszczyński, A.: Probabilistic programming with discrete distributions and prece-
dence constrained knapsack polyhedra. Math. Program. 93, 195–215 (2002)
11. Tanner, M., Ntaimo, L.: IIS branch-and-cut for joint chance-constrained programs
with random technology matrices (2008)
12. Ben-Tal, A., Nemirovski, A.: Robust solutions of linear programming problems
contaminated with uncertain data. Math. Program. 88, 411–424 (2000)
13. Bertsimas, D., Sim, M.: The price of robustness. Oper. Res. 52, 35–53 (2004)
14. Calafiore, G., Campi, M.: Uncertain convex programs: randomized solutions and
confidence levels. Math. Program. 102, 25–46 (2005)
15. Nemirovski, A., Shapiro, A.: Scenario approximation of chance constraints. In:
Calafiore, G., Dabbene, F. (eds.) Probabilistic and Randomized Methods for Design
Under Uncertainty, pp. 3–48. Springer, London (2005)
16. Nemirovski, A., Shapiro, A.: Convex approximations of chance constrained pro-
grams. SIAM J. Optim. 17, 969–996 (2006)
17. Erdoğan, E., Iyengar, G.: Ambiguous chance constrained problems and robust op-
timization. Math. Program. 107, 37–61 (2006)
18. Erdoğan, E., Iyengar, G.: On two-stage convex chance constrained problems. Math.
Meth. Oper. Res. 65, 115–140 (2007)
19. Luedtke, J., Ahmed, S., Nemhauser, G.: An integer programming approach for
linear programs with probabilistic constraints. In: Fischetti, M., Williamson, D.P.
(eds.) IPCO 2007. LNCS, vol. 4513, pp. 410–423. Springer, Heidelberg (2007)
20. Birge, J., Louveaux, F.: Introduction to stochastic programming. Springer, New
York (1997)
21. Van Slyke, R., Wets, R.J.: L-shaped linear programs with applications to optimal
control and stochastic programming. SIAM J. Appl. Math. 17, 638–663 (1969)
22. Higle, J.L., Sen, S.: Stochastic decomposition: an algorithm for two-stage stochastic
linear programs. Math. Oper. Res. 16, 650–669 (1991)
23. Shen, S., Smith, J., Ahmed, S.: Expectation and chance-constrained models and
algorithms for insuring critical paths (2009) (submitted for publication)
24. Codato, G., Fischetti, M.: Combinatorial benders’ cuts for mixed-integer linear
programming. Oper. Res. 54, 756–766 (2006)
25. Atamtürk, A., Nemhauser, G.L., Savelsbergh, M.W.P.: The mixed vertex packing
problem. Math. Program. 89, 35–53 (2000)
26. Günlük, O., Pochet, Y.: Mixing mixed-integer inequalities. Math. Program. 90,
429–457 (2001)
27. Linderoth, J., Savelsbergh, M.: A computational study of search strategies for
mixed integer programming. INFORMS J. Comput. 11, 173–187 (1999)
28. Applegate, D., Bixby, R., Chvátal, V., Cook, W.: Finding cuts in the TSP. Tech-
nical Report 95-05, DIMACS (1995)
29. Achterberg, T., Koch, T., Martin, A.: Branching rules revisited. Oper. Res. Lett. 33,
42–54 (2004)
30. Gurvich, I., Luedtke, J., Tezcan, T.: Call center staffing with uncertain arrival
rates: a chance-constrained optimization approach. Technical report (2009)
An Effective Branch-and-Bound Algorithm for
Convex Quadratic Integer Programming
1 Introduction
Nonlinear integer optimization has attracted a lot of attention recently. Besides
its practical importance, it is challenging from a theoretical and methodological
point of view. While intensive research has led to tremendous progress in the
practical solution of integer linear programs in the last decades [9], practical
methods for the nonlinear case are still rare [5]. This is true even in special
cases, such as Convex Quadratic Integer Programming (CQIP):
$$\min\ f(x) = x^\top Q x + L^\top x + c \qquad \text{s.t.}\ x \in \mathbb{Z}^n \cap X, \qquad (1)$$
well we used CPLEX MIQP as a reference, which was able to solve instances
with n up to about 55.
some cases it will be convenient to use apices to indicate vectors, such as $x^i$, and in this case we will denote the transposed vector by $(x^i)^\top$. As customary, given a matrix $Q$ we will let $q_{i,j}$ denote its entry in the $i$th row and $j$th column. Given a scalar $a \in \mathbb{R}$, we will let $\lfloor a \rceil$ denote the integer value closest to $a$. Analogously, given a vector $x$ we will let $\lfloor x \rceil$ denote the componentwise rounding of $x$ to the nearest integer.
A box $X \subseteq \mathbb{R}^n$ is a set of the form $X = \{x \in \mathbb{R}^n : l \leq x \leq u\}$, where $l, u \in \mathbb{R}^n$, $l \leq u$. Let $Q'$ be a positive semidefinite matrix. For $x' \in \mathbb{R}^n$ we consider the corresponding ellipsoid
$$E(Q', x') := \{x \in \mathbb{R}^n : (x - x')^\top Q' (x - x') \leq 1\},$$
and we write $E(Q')$ as shorthand for $E(Q', 0)$. Given a closed convex set $X$, we let $\mathrm{int}\, X$ denote the interior of $X$ and $\mathrm{bd}\, X$ the border of $X$.
2 Lower Bounds
As anticipated, the main inspiring observation for our method is that the com-
putation of the minimum of the objective function (neglecting all constraints
including integrality) simply requires solving a system of linear equations.
Remark 1. The unique minimum of $f(x) = x^\top Q x + L^\top x + c$ in case $Q$ is positive definite is attained at $\bar x = -\frac{1}{2} Q^{-1} L$ and has value $c - \frac{1}{4} L^\top Q^{-1} L$. Moreover, for every $x \in \mathbb{R}^n$, $f(x) = f(\bar x) + (x - \bar x)^\top Q (x - \bar x)$.
Our aim in this section is to get stronger bounds by exploiting the integrality of
the variables and possibly the structure of X.
Let $\mu(Q', x')$ be the scaling factor $\alpha$ such that the ellipsoid $\alpha E(Q', x')$ contains some integer point on its border but no integer point in its interior.
Observation 1. $\mu(Q', x') = \max\{\alpha : (x - x')^\top Q' (x - x') \geq \alpha^2 \text{ for each } x \in \mathbb{Z}^n\}$.
Fig. 1. Improving the lower bounds: the light gray ellipsoid is $E(Q', \bar x)$ scaled by $\mu(Q', \bar x)$; the dark gray ellipsoid is $E(Q, \bar x)$ scaled by $\lambda(Q, Q')\mu(Q', \bar x)$
Note that, given our objective function $f(x) = x^\top Q x + L^\top x + c$ and the associated continuous minimum $\bar x$, the level sets of $f(x)$ are precisely the borders of ellipsoids of the form $\alpha E(Q, \bar x)$. Given this, it is easy to visualize the fact that finding the integer point that minimizes $f$ is equivalent to scaling $E(Q, \bar x)$ by $\alpha$, starting from $\alpha = 0$ and stopping as soon as the border of $\alpha E(Q, \bar x)$ contains an integer point. This is the same as computing $\mu(Q, \bar x)$. Since this is as hard as solving (1) when $X = \mathbb{R}^n$, we rather do the same scaling for some other ellipsoid $E(Q', \bar x)$, and then scale $E(Q, \bar x)$ in turn until it touches the border of the first ellipsoid; see Figure 1. This requires one to be able to compute $\mu(Q', \bar x)$ as well as the maximum $\alpha \in \mathbb{R}_+$ such that $\alpha E(Q)$ is contained in $E(Q')$:
$$\lambda(Q, Q') := \max\{\alpha : \alpha E(Q) \subseteq E(Q')\},$$
noting that this latter value has nothing to do with $\bar x$.
Observation 2. $\lambda(Q, Q') = \max\{\alpha : Q - \alpha^2 Q' \succeq 0\} = \min\{1/\sqrt{x^\top Q' x} : x \in E(Q)\}$.
Proposition 1. Given $f(x) = x^\top Q x + L^\top x + c$ with $Q$ positive definite and continuous minimum $\bar x$, and a positive semidefinite matrix $Q'$ of the same size as $Q$,
$$\min\{f(x) : x \in \mathbb{Z}^n\} \geq f(\bar x) + \lambda^2(Q, Q')\,\mu^2(Q', \bar x).$$
Note that, in order to find hopefully strong lower bounds, one would like to have matrices $Q'$ such that on the one hand $E(Q')$ is as close as possible to $E(Q)$ and on the other $\mu(Q', \bar x)$ is fast to compute. It is particularly fast to compute $\mu(Q', \bar x)$ if $Q'$ is a split, i.e., if $Q' = dd^\top$ for some vector $d \in \mathbb{Z}^n \setminus \{0\}$ with $d_1, \ldots, d_n$ coprime. In this case, we have
Observation 3. $\mu(dd^\top, x') = |d^\top x' - \lfloor d^\top x' \rceil|$.
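Numerically, the bound of Proposition 1 for a split $Q' = dd^\top$ amounts to a few lines of linear algebra. The following Python/NumPy sketch is our illustration, not part of the paper: $\lambda$ is obtained from Observation 2 via a Cholesky factor of $Q$, and $\mu$ from Observation 3 (all inputs are NumPy arrays).

import numpy as np

def split_lower_bound(Q, L, c, d):
    # Lower bound f(xbar) + lambda^2(Q, dd^T) * mu^2(dd^T, xbar) on
    # min x^T Q x + L^T x + c over Z^n (Proposition 1, Observations 2 and 3).
    Qinv_L = np.linalg.solve(Q, L)
    xbar = -0.5 * Qinv_L                   # continuous minimum (Remark 1)
    fbar = c - 0.25 * L @ Qinv_L           # f(xbar)
    # lambda(Q, dd^T): largest alpha with Q - alpha^2 dd^T psd.  With Q = C C^T
    # this equals 1 / ||C^{-1} d||, since C^{-1} d d^T C^{-T} has top
    # eigenvalue ||C^{-1} d||^2.
    C = np.linalg.cholesky(Q)
    lam = 1.0 / np.linalg.norm(np.linalg.solve(C, d))
    # mu(dd^T, xbar): distance from d^T xbar to the nearest integer
    t = d @ xbar
    mu = abs(t - round(t))
    return fbar + lam ** 2 * mu ** 2

For the matrix $Q$ of Example 1 below and $d = (0, 1)^\top$ this yields $\lambda = 2$, matching $\lambda(Q, Q_1) = 2$.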
In order to derive strong lower bounds, we aim at splits $Q'$ that yield large factors $\lambda(Q, Q')$. To this end, we consider flat directions of the ellipsoid $E(Q)$, i.e., vectors $d \in \mathbb{Z}^n \setminus \{0\}$ minimizing the width of $E(Q)$ along $d$, defined as
$$\max\{d^\top x : x \in E(Q)\} - \min\{d^\top x : x \in E(Q)\} = 2\max\{d^\top x : x \in E(Q)\}.$$
Note that $\lambda(Q, Q_0)$ can be strictly smaller than one, so that the lower bound derived from $Q_0$,
$$f(\bar x) + \lambda^2(Q, Q_0)\mu^2(Q_0, \bar x) = f(\bar x) + \lambda^2(Q, Q_0) \sum_{i=1}^n \lambda^2(Q, Q_i)\mu^2(Q_i, \bar x),$$
can be weaker than the bound $f(\bar x) + \lambda^2(Q, Q_i)\mu^2(Q_i, \bar x)$ derived from $Q_i$ for some $i \geq 1$. In general, which $Q_i$ gives the strongest lower bound depends on the position of $\bar x$.
Example 1. We illustrate the ideas in Section 2 by an example in the plane. Let
$$Q = \begin{pmatrix} 1 & -2 \\ -2 & 8 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -2 & -2 \end{pmatrix}\begin{pmatrix} 1 & -2 \\ 0 & -2 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & -2 \\ 0 & -2 \end{pmatrix}, \qquad B^{-1} = \begin{pmatrix} 1 & -1 \\ 0 & -1/2 \end{pmatrix}.$$
The ellipse $E(Q)$ is shown in Figure 2. Short vectors of the lattice generated by the rows of $B^{-1}$, the vectors $(1, -1)$ and $(0, -1/2)$, are $(0, -1/2)^\top$ and $(1, 0)^\top$. These correspond to the transformation matrix
$$T = \begin{pmatrix} 0 & 1 \\ 1 & -2 \end{pmatrix}, \qquad T^{-1} = \begin{pmatrix} 2 & 1 \\ 1 & 0 \end{pmatrix},$$
and hence to the (hopefully) flat directions $(0, 1)^\top$ and $(1, -2)^\top$. Thus
$$Q_1 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad Q_2 = \begin{pmatrix} 1 & -2 \\ -2 & 4 \end{pmatrix}$$
and $\lambda(Q, Q_1) = 2$, $\lambda(Q, Q_2) = 1$. The ellipses $E(Q_1)$ and $E(Q_2)$ are illustrated in Figure 2. Finally, in this case we are lucky to obtain $Q_0 = 4Q_1 + Q_2 = Q$, so that the improved lower bound given by $Q_0$ agrees with the optimal solution of the problem, independently of $L$ and $c$.
Fig. 2. The ellipse $E(Q)$ in (a); the split $E(Q_1)$ given by $(1, 0)^\top$ in (b); the split $E(Q_2)$ given by $(1, -2)^\top$ in (c)
By the convexity of f and its symmetry with respect to x̄, the continuous minima
with respect to these fixings are non-decreasing, so that we can stop as soon as
one of these minima exceeds the current upper bound. In particular, we get a
finite algorithm even without bounds on the variables, since we assume that f
is strictly convex.
In order to enumerate subproblems as quickly as possible, our aim is to perform the most time-consuming computations in a preprocessing phase. In particular, having fixed $d$ variables, we get the reduced objective function $\bar f : \mathbb{R}^{n-d} \to \mathbb{R}$ of the form
$$\bar f(x) = x^\top \bar Q_d x + \bar L^\top x + \bar c.$$
If $x_i$ is fixed to $r_i$ for $i = 1, \ldots, d$, we have $\bar c = c + \sum_{i=1}^d L_i r_i + \sum_{i=1}^d \sum_{j=1}^d q_{ij} r_i r_j$ and $\bar L_{j-d} = L_j + 2\sum_{i=1}^d q_{ij} r_i$ for $j = d + 1, \ldots, n$. On the other hand, the matrix $\bar Q_d$ is obtained from $Q$ by deleting the first $d$ rows and columns, and therefore is positive definite and does not depend on the values at which the first $d$ variables are fixed.
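The incremental nature of these formulas is what makes fast enumeration possible: fixing one more variable updates $\bar c$ and $\bar L$ in $O(n - d)$ time. A small Python sketch (our illustration; plain lists are used for simplicity) of the update when one more variable is fixed:

def fix_next_variable(Q, Lbar, cbar, d, r):
    # Fix variable x_{d+1} (0-based index d) to the value r and return the
    # updated linear and constant parts of the reduced objective.
    # Q is the full n x n matrix; Lbar has length n - d on input and
    # length n - d - 1 on output.
    n = len(Q)
    cbar_new = cbar + Lbar[0] * r + Q[d][d] * r * r
    # each remaining coefficient absorbs the cross term 2 * q_{d+1,j} * r
    Lbar_new = [Lbar[j - d] + 2.0 * Q[d][j] * r for j in range(d + 1, n)]
    return Lbar_new, cbar_new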
See Algorithm 1 for an outline of our method, for the case in which $X = \mathbb{R}^n$ and we simply use the continuous lower bound. Clearly, the running time of this algorithm is exponential in general. However, every node in the enumeration tree can be processed in $O(n^2)$ time (if the primal heuristics obey the same runtime bound), the bottleneck being the computation of the continuous minimum given the pre-computed inverse matrix $\bar Q_d^{-1}$. Note that Algorithm 1 can easily be adapted to the constrained case where $X \neq \mathbb{R}^n$. In this case, we just prune all nodes with invalid variable fixings.
then we get
$$\bar f(\bar x + \alpha z^d) = \bar f(\bar x) + \alpha(\bar x^\top v^d + \bar L^\top z^d) + \alpha^2 s^d.$$
Since $\bar L$ can be computed incrementally in $O(n - d)$ time, we get:
Proposition 4. If, in the preprocessing phase of the algorithm in Figure 1, we
compute z d , v d , sd as defined above for d = 0, . . . , n − 1, then the computation
of the continuous minimum and the associated lower bound can be carried out
in O(n − d) time per node.
When improving lower bounds by ellipsoids as illustrated in Section 2, the following natural restriction leads to linear time per node: if the splits in the root node are defined by the columns of the transformation matrix $T$, then the splits on level $d$ are defined by the columns of the matrix $\bar T_d$ arising from $T$ by deleting the first $d$ rows and columns. Indeed, in this case, for the continuous minimum $\bar{\bar x} = \bar x + (r_{d+1} - \bar x_1) z^d$ obtained after having fixed variable $x_{d+1}$ to $r_{d+1}$, we have to compute $\bar T_{d+1}(\bar{\bar x}_2, \ldots, \bar{\bar x}_{n-d})^\top$ (recall the notation in Section 1.4) in order to determine the scaling factors $\mu(\bar Q_i, (\bar{\bar x}_2, \ldots, \bar{\bar x}_{n-d})^\top)$. If we define
$$w^{d+1} := \bar T_{d+1}(z_2^d, \ldots, z_{n-d}^d)^\top \in \mathbb{R}^{n-d-1},$$
we have
$$\bar T_{d+1}(\bar{\bar x}_2, \ldots, \bar{\bar x}_{n-d})^\top = \bar T_{d+1}(\bar x_2, \ldots, \bar x_{n-d})^\top + (r_{d+1} - \bar x_1)\,\bar T_{d+1}(z_2^d, \ldots, z_{n-d}^d)^\top$$
$$= ((\bar T_d \bar x)_2, \ldots, (\bar T_d \bar x)_{n-d})^\top - \bar x_1(t_{d+1,d+2}, \ldots, t_{d+1,n})^\top + (r_{d+1} - \bar x_1)\,w^{d+1}.$$
Now $\bar T_d \bar x$ has already been determined, hence each of the $n - d$ factors $\mu(\bar Q_i, (\bar{\bar x}_2, \ldots, \bar{\bar x}_{n-d})^\top)$ can be computed in constant extra time. After that, the last factor $\mu(\bar Q_0, (\bar{\bar x}_2, \ldots, \bar{\bar x}_{n-d})^\top)$ can be computed in $O(n - d)$ time by Observation 5. In the constrained case, we can determine improved bounds as explained in Section 3.1. In summary, we thus have
Proposition 5. If, in the preprocessing phase of the algorithm in Figure 1, we compute $w^{d+1}$ and all $t_i^{\min}$ and $t_i^{\max}$ for $d = 0, \ldots, n - 1$, then the lower bound improvement as illustrated in Section 2 can be carried out in $O(n - d)$ time per node.
4 Computational Results
In this section, we present experimental results for the two special cases of CQIP
mentioned in the introduction, namely the cases $X = [-1, 1]^n$ (Section 4.1) and $X = \mathbb{R}^n$ (Section 4.2). In all cases, we compare our algorithm with the
CPLEX MIQP solver [6], which, as mentioned in the introduction, turned out
to be by far the best method among those available when we approached the
problem. For our algorithm, we also state results obtained without the lower
bound improvements discussed in Section 2.
We implemented the versions of our algorithm running in quadratic and linear
time per node in C. All experiments were run on Intel Xeon processors running
at 2.33 GHz. For basis reduction, we used the LLL-algorithm [7] implemented in
LiDIA [8]. The runtime limit for all instances and solution methods was 8 hours.
Besides total running times, we investigated the time needed for preprocessing
and the time per node in the enumeration tree. Moreover, we state the total
number of nodes processed.
columns marked “tt/s” contain the total time in seconds needed to solve the
instance to optimality, while “pt/s” states the time in seconds needed for the
preprocessing. Finally, “nodes” contains the total number of nodes processed in
the enumeration tree. The last column contains the value of the optimal solution.
As is obvious from Table 1, our algorithm outperforms CPLEX by several or-
ders of magnitude. CPLEX could not solve instances with more than 50 variables
within the time limit of 8 hours, while the linear-time version of our algorithm
solves all instances up to 120 variables. As expected, quadratic running time
per node leads to much slower total times. However, even the quadratic version
clearly outperforms CPLEX.
References
1. Bonami, P., Biegler, L.T., Conn, A.R., Cornuéjols, G., Grossmann, I.E., Laird,
C.D., Lee, J., Lodi, A., Margot, F., Sawaya, N., Wächter, A.: An algorithmic
framework for convex mixed integer nonlinear programs. Discrete Optimization 5,
186–204 (2008)
2. Callegari, S., Bizzarri, F., Rovatti, R., Setti, G.: Discrete quadratic programming
problems by ΔΣ modulation: the case of circulant quadratic forms. Technical re-
port, Arces, University of Bologna (2009)
3. Eisenbrand, F.: Integer programming and algorithmic geometry of numbers. In:
Jünger, M., Liebling, T., Naddef, D., Nemhauser, G., Pulleyblank, W., Reinelt,
G., Rinaldi, G., Wolsey, L.A. (eds.) 50 Years of Integer Programming 1958-2008.
The Early Years and State-of-the-Art Surveys. Springer, Heidelberg (2009)
4. Frangioni, A., Lodi, A., Rinaldi, G.: Optimizing over semimetric polytopes. In:
Bienstock, D., Nemhauser, G.L. (eds.) IPCO 2004. LNCS, vol. 3064, pp. 431–443.
Springer, Heidelberg (2004)
5. Hemmecke, R., Köppe, M., Lee, J., Weismantel, R.: Nonlinear integer program-
ming. In: Jünger, M., Liebling, T., Naddef, D., Nemhauser, G., Pulleyblank, W.,
Reinelt, G., Rinaldi, G., Wolsey, L.A. (eds.) 50 Years of Integer Programming 1958-
2008. The Early Years and State-of-the-Art Surveys. Springer, Heidelberg (2009)
6. ILOG, Inc. ILOG CPLEX 12.1 (2009), http://www.ilog.com/products/cplex
7. Lenstra, A.K., Lenstra Jr., H.W., Lovász, L.: Factoring polynomials with rational
coefficients. Mathematische Annalen 261, 515–534 (1982)
8. LiDIA. LiDIA: A C++ Library For Computational Number Theory (2006),
http://www.cdc.informatik.tu-darmstadt.de/TI/LiDIA/
9. Lodi, A.: MIP computation and beyond. In: Jünger, M., Liebling, T., Naddef, D.,
Nemhauser, G., Pulleyblank, W., Reinelt, G., Rinaldi, G., Wolsey, L.A. (eds.) 50
Years of Integer Programming 1958-2008. The Early Years and State-of-the-Art
Surveys. Springer, Heidelberg (2009)
10. Micciancio, D., Goldwasser, S.: Complexity of Lattice Problems: A Cryptographic
Perspective. Springer, Heidelberg (2002)
11. Moré, J.J., Toraldo, G.: On the solution of large quadratic programming problems
with bound constraints. SIAM Journal on Optimization 1, 93–113 (1991)
12. Rajan, D., Dash, S., Lodi, A.: {−1, 0, 1} unconstrained quadratic programs us-
ing max-flow based relaxations. Technical Report OR/05/13, DEIS, University of
Bologna (2005)
13. De Simone, C.: The cut polytope and the boolean quadric polytope. Discrete Math-
ematics 79, 71–75 (1989)
Extending SDP Integrality Gaps to
Sherali-Adams with Applications to
Quadratic Programming and MaxCutGain
Abstract. We show how under certain conditions one can extend con-
structions of integrality gaps for semidefinite relaxations into ones that
hold for stronger systems: those SDPs to which the so-called k-level con-
straints of the Sherali-Adams hierarchy are added. The value of k above
depends on properties of the problem. We present two applications, to the
Quadratic Programming problem and to the MaxCutGain problem.
Our technique is inspired by a paper of Raghavendra and Steurer [Raghavendra and Steurer, FOCS 09] and our result gives a doubly exponential improvement for Quadratic Programming on another result by the same authors [Raghavendra and Steurer, FOCS 09]. They provide a tight integrality gap for the system above which is valid up to $k = (\log \log n)^{\Omega(1)}$, whereas we give such a gap for up to $k = n^{\Omega(1)}$.
1 Introduction
300 S. Benabbas and A. Magen
quality of such relaxations and the level parameter. See [3, 5, 10, 13, 14, 16, 17]
for a few examples.
Our main problem of interest will be Quadratic Programming as defined by [4]. Here the input is a matrix A_{n×n} and the objective is to find x ∈ {−1, 1}^n that maximizes the quadratic form Σ_{i≠j} a_ij x_i x_j. The natural semidefinite programming relaxation of this problem replaces the integer (±1) valued x_i's with unit vectors, and products with inner products. This problem has applications in correlation clustering, and its relaxation is related to the well-known Grothendieck inequality [6] (see [1].) Charikar and Wirth [4] show that the integrality gap of this relaxation is at most O(log n), and a matching lower bound was later established by [1, 2, 7]. To lower-bound the integrality gap, Khot and O'Donnell [7] construct a solution for the SDP relaxation whose objective value is a factor Θ(log n) larger than that of any integral (±1) solution.
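As a concrete companion to this definition, the following sketch (ours, not from the paper) brute-forces the integral optimum of the quadratic form on a tiny instance and evaluates the vector objective for an arbitrary feasible assignment of unit vectors; any such feasible vector value only lower-bounds the SDP optimum.

```python
# Illustrative sketch (ours): integral vs. vector objective for Quadratic
# Programming, sum_{i != j} a_ij x_i x_j, on a tiny random instance.
import itertools
import numpy as np

def integral_optimum(A):
    """Brute-force max of sum_{i != j} a_ij x_i x_j over x in {-1,1}^n."""
    off = A - np.diag(np.diag(A))          # drop diagonal terms (i = j)
    best = -np.inf
    for signs in itertools.product([-1, 1], repeat=A.shape[0]):
        x = np.array(signs)
        best = max(best, x @ off @ x)
    return best

def vector_value(A, V):
    """Objective sum_{i != j} a_ij <v_i, v_j> for rows of V on the unit sphere."""
    off = A - np.diag(np.diag(A))
    return float(np.sum(off * (V @ V.T)))  # Gram matrix gives all inner products

rng = np.random.default_rng(0)
n = 8
A = rng.standard_normal((n, n))
V = rng.standard_normal((n, n))
V /= np.linalg.norm(V, axis=1, keepdims=True)   # feasible: unit vectors
print(integral_optimum(A), vector_value(A, V))  # vector_value <= SDP optimum
```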
It is then natural to ask whether strengthening the SDP with a lift-and-project method will reduce this Θ(log n) gap, resulting in a better approximation algorithm. We investigate the performance of one of the stronger lift-and-project systems, the so-called Sherali-Adams system, for this problem. We show that for Quadratic Programming the integrality gap does not asymptotically decrease even after level n^δ of the Sherali-Adams hierarchy, for some constant δ. The technique used is somewhat general and can be applied to other problems; as an example, we apply it to the related problem of MaxCutGain. We can prove the following two theorems.
Theorem 1. The standard SDP formulation of Quadratic Programming admits a Θ(log N) integrality gap even when strengthened with N^{Ω(1)} levels of the Sherali-Adams hierarchy.
Theorem 2. The standard SDP formulation of MaxCutGain has integrality
gap ω(1) even when strengthened with ω(1) levels of the Sherali-Adams hierarchy.
It should be mentioned that it is known [2, 7] that, assuming the Unique Games Conjecture, no polynomial time algorithm can improve upon the Θ(log n) factor for Quadratic Programming or the ω(1) factor for MaxCutGain. However, our results do not rely on any such assumptions. Also, given that the Sherali-Adams strengthening can only be solved in polynomial time for constant levels, our results preclude an algorithm based on the standard relaxation with Sherali-Adams strengthening that runs in time 2^{n^δ} for some small enough constant δ.
Here is an outline of our approach. We start with a known integrality gap
instance for the SDP relaxation and its SDP solution A. Now consider an in-
tegral solution B to the same instance (obviously B is a solution to an SDP
strengthened with an arbitrary level of the Sherali-Adams hierarchy.) We com-
bine A and B into a candidate solution C that has some of the good qualities
of both, namely it has high objective value (like A) and has almost the same
behavior with regards to the Sherali-Adams hierarchy as B. In particular, the
only conditions of the Sherali-Adams hierarchy C violates are some positivity
conditions, and those violations can be bounded by some quantity ξ. To handle
these violations we take a convex combination of C with a solution in which
these violated conditions are satisfied with substantial enough slack. In fact a uniform distribution over all integral points (considered as a solution) has this property. The weights we need to give to C and the uniform solution in the convex combination will depend on ξ and on the amount of slack in the uniform solution, both of which are in turn functions of the level parameter k; these two weights then determine the objective value of the resulting solution.
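As a toy illustration of this repair step (the function names and the exact mixing rule below are ours, not the paper's construction), one can mix a signed function that sums to 1 with the uniform distribution, using just enough weight to cancel the worst negativity ξ:

```python
# Minimal sketch (ours): repair a signed "pseudo-distribution" on {-1,1}^s by
# mixing it with the uniform distribution until all values are nonnegative.
import itertools

def smooth(f, s):
    """f: dict from +/-1 tuples of length s to reals summing to 1."""
    points = list(itertools.product([-1, 1], repeat=s))
    uniform = 1.0 / len(points)
    xi = max(0.0, -min(f[p] for p in points))  # worst negativity violation
    lam = xi / (xi + uniform)                  # (1-lam)*(-xi) + lam*uniform = 0
    return {p: (1 - lam) * f[p] + lam * uniform for p in points}
```

The heavier the violation ξ, the larger the weight on the uniform part, and the more the objective value of the combined solution is diluted; this is exactly the tradeoff described above.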
This idea of “smoothening” a solution like the above to satisfy Sherali-Adams constraints is due to Raghavendra and Steurer [12]. They use this idea together with dimensionality reduction to give a rounding algorithm that is optimal for a large class of constraint satisfaction problems. In another relevant recent result [13], the same authors show how to get integrality gaps for Sherali-Adams SDPs similar to ours. However, in [13] the main vehicle for obtaining the results is reductions from the Unique Games problem, and “smoothening” is used only as a small step.
It is interesting to compare our results with [13]. While [13] gives a more general result in that it applies to a broader spectrum of problems, it only holds up to a relatively low level of the Sherali-Adams hierarchy. In particular, for Quadratic Programming their result is valid for Sherali-Adams SDPs of level up to (log log n)^{Ω(1)}, whereas ours is valid up to level n^{Ω(1)}. We note that the two results exhibit different level/integrality-gap tradeoffs as well: [13] provides the same integrality gap asymptotically until a “critical level”, at which point it breaks down completely, while our result supplies a smoother tradeoff, with the integrality gap dropping “continuously” as the level increases. An additional difference is that our result (which does not use Unique Games reductions) is elementary.
The rest of the paper is organized as follows. In Section 2 we introduce our notation and some basic definitions and identities from Fourier analysis on the cube. In the same section we review the relevant definition of the Sherali-Adams hierarchy. In Section 3 we state and prove our main technical lemma. Section 4 presents an application of the lemma to Quadratic Programming. There we prove an integrality gap for the SDP relaxation of this problem strengthened to the n^δ-th level of the Sherali-Adams hierarchy (for some δ > 0.) In Section 5 we present our application to the MaxCutGain problem. In particular, we present super-constant integrality gaps for some super-constant level of the Sherali-Adams SDP hierarchy. We conclude in Section 6 by pointing out some of the limitations of our approach and some open problems.
2 Preliminaries
For a distribution μ on {−1, 1}^n and for S ⊆ [n] we let Mar_S μ be the marginal distribution of μ on the set S. In particular, for y ∈ {−1, 1}^S we have (Mar_S μ)(y) = Pr_{z∼μ}[z_S = y_S]. Alternatively,

(Mar_S μ)(y) = Σ_{z ∈ {−1,1}^{[n]\S}} μ(y ◦ z),

where y ◦ z denotes the assignment that agrees with y on S and with z on [n] \ S. We will use S^{d−1} for the (d − 1)-dimensional unit sphere, i.e., the set of unit vectors in R^d. Throughout the paper we will use bold letters for real vectors, and ⟨v, u⟩ for the inner product of two vectors v, u ∈ R^d.
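A direct transcription of the marginal operator above (a hypothetical helper of ours; μ is stored as a dictionary from full ±1 assignments to probabilities):

```python
# Sketch (ours): the marginal Mar_S mu of a distribution mu on {-1,1}^n.
import itertools

def marginal(mu, S):
    """S: sorted list of coordinates; mu: dict from +/-1 tuples to probabilities."""
    marg = {}
    for y in itertools.product([-1, 1], repeat=len(S)):
        # sum mu(y o z) over all completions z on the remaining coordinates
        marg[y] = sum(p for z, p in mu.items()
                      if all(z[S[j]] == y[j] for j in range(len(S))))
    return marg
```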
max c(x_1, . . . , x_n)
subject to x_i ∈ {−1, 1} for all i.

The goal of the present work is to show that (for particular problems), even for large values of k, the Sherali-Adams SDP of level k is not much stronger than the canonical relaxation. In particular, it has asymptotically the same integrality gap.
Before we prove the lemma we briefly describe its use. For simplicity we only describe the case where ν is the uniform distribution on {−1, 1}^n. To use the lemma one starts with an integrality gap instance for the canonical SDP relaxation of the problem of interest (say Quadratic Programming) and its vector solution v_0, v_1, . . . , v_n. Given that the instance is an integrality gap instance, one knows that the objective value attained by v_0, v_1, . . . , v_n is much bigger than what is attainable by any integral solution. The simplest thing one can hope for is to show that this vector solution is also a solution to the Sherali-Adams SDP of level k, or, in the language of Fact 1, that the vectors are compatible with a set of distributions {μ_S}_{S⊂[n],|S|≤k}. However, this is not generally possible. Instead we use the lemma (in the simplest case with ν being the uniform distribution on {−1, 1}^n) to get another set of vectors u_0, . . . , u_n which are in fact compatible with some {μ_S}_{S⊂[n],|S|≤k}. Given that in the problems we consider the objective function is a quadratic polynomial, and given the promise of the lemma that the inner products of the u_i's are ε times those of the v_i's, it follows that the objective value attained by the u_i's is ε times that attained by the v_i's. As the v_i's are the vector solution for an integrality gap instance, it follows that the integrality gap will decrease by a multiplicative factor of at most ε when one goes from the canonical SDP relaxation to the Sherali-Adams SDP of level k.
How big an ε one can take (satisfying the requirements of the lemma) will then determine the quality of the resulting integrality gap. In the simplest case one can take ν to be the uniform distribution on {−1, 1}^n and argue that in this case the requirements of the lemma are satisfied as long as 2k²ε ≤ 1, and in particular for ε = 1/2k². In fact our application to the MaxCutGain problem detailed in Section 5 follows this simple outline. For Quadratic Programming we can get a better result by taking a close look at the particular structure of the v_i's for the integrality gap instance of [7] and using a more appropriate distribution for ν.
We will now prove Lemma 2.
Proof (of Lemma 2). Our proof proceeds in two steps. First, we construct a family of functions {f_S : {−1, 1}^S → R}_{S⊂[n],|S|≤k} that satisfy all the required conditions except being the probability mass functions of distributions. In particular, while for any S the sum of the values of f_S is 1, the function can take negative values on some inputs. The construction of f_S uses the distribution ν and guarantees that while f_S may take negative values, these values are not too negative. In the second step we take a convex combination of the f_S's with the uniform distribution on {−1, 1}^S to get the desired family of distributions {μ_S}_{|S|≤k}.
For any subset S ⊆ [n] with |S| ≤ k we define f_S as a “hybrid” object that uses both the vectors {v_i} and the distribution ν. Given that a function is uniquely determined by its Fourier expansion, we define f_S in terms of its Fourier coefficients:

f̂_S(∅) = 2^{−|S|} = 2^{n−|S|} ν̂(∅),
∀i ∈ S:  f̂_S({i}) = 2^{−|S|} ⟨v_0, v_i⟩,
∀i ≠ j ∈ S:  f̂_S({i, j}) = 2^{−|S|} ⟨v_i, v_j⟩,
∀T ⊆ S, |T| > 2:  f̂_S(T) = 2^{n−|S|} ν̂(T).
Comparing the above definition with (1), f_S is exactly like the marginal distribution of ν on the set S except that it has different degree-one and degree-two Fourier coefficients. First, observe that for any S, the sum of the values of f_S is 1:

Σ_{x∈{−1,1}^S} f_S(x) = 2^{|S|} E_x[f_S(x)] = 2^{|S|} f̂_S(∅) = 1.
So the f_S's satisfy (P1) and (P2) and are compatible with the v_i's (except that they are not distributions). Next we show that f_S(y) cannot be too negative:

f_S(y) = Σ_{T⊆S} f̂_S(T) χ_T(y)
       = Σ_{T⊆S} 2^{n−|S|} ν̂(T) χ_T(y) + Σ_{T⊆S} (f̂_S(T) − 2^{n−|S|} ν̂(T)) χ_T(y)
       = (Mar_S ν)(y) + Σ_{T⊆S} (f̂_S(T) − 2^{n−|S|} ν̂(T)) χ_T(y)      by (2)
       ≥ − Σ_{T⊆S} |f̂_S(T) − 2^{n−|S|} ν̂(T)|
       = − Σ_{i∈S} |f̂_S({i}) − 2^{n−|S|} ν̂({i})| − Σ_{i≠j∈S} |f̂_S({i, j}) − 2^{n−|S|} ν̂({i, j})|
       = −2^{−|S|} ( Σ_{i∈S} |⟨v_0, v_i⟩ − E_{x∼ν}[x_i]| + Σ_{i≠j∈S} |⟨v_i, v_j⟩ − E_{x∼ν}[x_i x_j]| )
       ≥ −2^{−|S|}/(2ε),

where the last step follows from the condition on ε and because |S| ≤ k. This completes the first step of the proof.
Next, define π to be the uniform distribution on {−1, 1}n and μS as a convex
combination of fS and MarS π, i.e.,
Here, the w_i's are defined to be perpendicular to all the v_i's and to each other. It is easy to check that the u_i's are compatible with the μ_S's and satisfy all the required properties of the lemma (except that the μ_S's can potentially be negative.) In fact, (P1)–(P3) are linear, and given that they hold for {f_S} and {v_i} and for {Mar_S π} and {w_i}, they must hold for {μ_S} and {u_i}. Finally, for any S ⊆ [n] with |S| ≤ k and y ∈ {−1, 1}^S we have

Pr_{x∼Mar_S π}[x = y] ≥ δ(k).
One would need a stronger condition on ε in this case, one that depends on δ(k). The inner product of the resulting vectors would of course depend on the distribution π, namely,
Furthermore, the v_i's are produced in this simple manner based on the u_j's: divide the N variables into n classes of size m each, and assign the vector u_j/‖u_j‖ to the variables in the j-th class. Formally, v_i = u_{⌈i/m⌉}/‖u_{⌈i/m⌉}‖.
We will need the following property of the v_i's, which follows easily from well-known properties of random unit vectors. We prove it in Appendix A for completeness.
Fact 3. In the SDP solution of [7], with probability at least 1 − 4/n⁴, for all pairs of indices 1 ≤ i, j ≤ N the inner product ⟨v_i, v_j⟩ has the following property:

⟨v_i, v_j⟩ = 1 if i and j are in the same class, i.e. ⌈i/m⌉ = ⌈j/m⌉,
|⟨v_i, v_j⟩| ≤ √((12 log n)/d) if i and j are in different classes, i.e. ⌈i/m⌉ ≠ ⌈j/m⌉.
Recall that in Lemma 2 a choice of a distribution ν on {−1, 1}^n is required. In particular, if for every pair of variables i, j, E_{x∼ν}[x_i x_j] is close to ⟨v_i, v_j⟩, one can choose a big value for ε in the lemma, which in turn means that the resulting u_i's will have inner products close to those of the v_i's.
Indeed, the key to Theorem 1 is using a ν which is “agreeable” with Fact 3: two variables will have a large or a small covariance depending on whether they are from the same or different classes, respectively. Luckily, this is easily achievable by identifying variables in the same class and assigning values independently across classes. In other words, the distribution ν will choose a random value from {−1, 1} for x_1 = x_2 = · · · = x_m, an independently chosen value for x_{m+1} = · · · = x_{2m}, and similarly an independently chosen value for x_{nm−m+1} = · · · = x_{nm}. Such a ν clearly satisfies

E_{x∼ν}[x_i x_j] = 1 if i and j are in the same class, and 0 otherwise.
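The following quick empirical check of this construction (all parameter names below are ours) samples from ν and confirms the stated correlations:

```python
# Sketch (ours): sample nu by drawing one uniform sign per class of m variables
# and check that E[x_i x_j] is ~1 within a class and ~0 across classes.
import numpy as np

def sample_nu(n_classes, m, rng):
    class_signs = rng.choice([-1, 1], size=n_classes)
    return np.repeat(class_signs, m)   # x_1=...=x_m, x_{m+1}=...=x_{2m}, ...

rng = np.random.default_rng(0)
xs = np.array([sample_nu(4, 3, rng) for _ in range(20000)])
print(np.round(xs.T @ xs / len(xs), 2))   # empirical correlation matrix
```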
Consider the vector solution of [7], v_1, . . . , v_{nm}, and define v_0 as a vector perpendicular to all other v_i's. Consider the distribution ν defined above and apply Lemma 2 for v_0, . . . , v_{nm}, ν, k = d^{0.2}, and ε = 1/2. By Fact 3 the inner products of the v_i vectors are close to the corresponding correlations of the distribution ν. It is easy to check that the conditions of Lemma 2 are satisfied:

2k²ε |⟨v_0, v_i⟩ − E_{x∼ν}[x_i]| = 2k²ε |0 − 0| = 0,
2k²ε |⟨v_i, v_j⟩ − E_{x∼ν}[x_i x_j]| ≤ d^{0.4} √((12 log n)/d) = √(12 log n)/d^{0.1} ≪ 1,

and the lemma applies for large enough n; thus the resulting vectors, the u_i's, are a solution to the Sherali-Adams SDP of level k. It is now easy to see that (4) implies a big objective value for this solution:

Σ_{i≠j} a_ij ⟨u_i, u_j⟩ = ε Σ_{i≠j} a_ij ⟨v_i, v_j⟩ ≥ Ω(ξ).
5 Application to MaxCutGain
The MaxCutGain problem is an important special case of the Quadratic Programming problem where all entries of the matrix A are nonpositive. In other words, the input is a nonpositive n-by-n matrix A and the objective is to find x ∈ {−1, 1}^n that maximizes the quadratic form Σ_{i≠j} a_ij x_i x_j. The problem gets its name and main motivation from the study of algorithms for the MaxCut problem that perform well on graphs whose maximum cut value is close to half the edges. See [7] for a discussion.
Naturally, constructing integrality gap instances for MaxCutGain is harder than for Quadratic Programming. The best integrality gap instances are due to Khot and O'Donnell [7] who, for any constant Δ, construct an instance of integrality gap at least Δ. The following is a restatement of their Theorem 4.1 tailored to our application.
Theorem 3 ([7]). The standard SDP relaxation for MaxCutGain has a super-constant integrality gap. Specifically, for any constant ξ > 0, there is a big enough n and a matrix A_{n×n} such that:
1. Σ_{i≠j} a_ij x_i x_j ≤ ξ/log(1/ξ) for all x ∈ {−1, 1}^n.
2. There are unit vectors v_1, . . . , v_n such that Σ_{i≠j} a_ij ⟨v_i, v_j⟩ ≥ Ω(ξ).
It should be mentioned that the proof of [7] is continuous in nature, and it is not entirely clear how n grows as a function of 1/ξ (once some form of discretization is applied to their construction.) However, an integrality gap of f(n) for some function f(n) = ω(1) is implicit in the above theorem.
Consider the instance from Theorem 3 and the function f(n) as above, and let g(n) = ∛f(n). We know that for every n there are unit vectors v_1, . . . , v_n such that Σ_{i≠j} a_ij ⟨v_i, v_j⟩ ≥ Ω(ξ). Let v_0 be a unit vector perpendicular to all the v_i's, set k = g(n) and ε = g(n)^{−2}/2, and let ν be the uniform distribution on {−1, 1}^n. Note that for all i < j,

2k²ε |⟨v_0, v_i⟩ − E_{x∼ν}[x_i]| = |⟨v_0, v_i⟩| = 0,
2k²ε |⟨v_i, v_j⟩ − E_{x∼ν}[x_i x_j]| = |⟨v_i, v_j⟩| ≤ 1,
and we obtain that the conditions of Lemma 2 are satisfied.
6 Discussion
We saw that the Sherali-Adams SDP of level n^{Ω(1)} for Quadratic Programming has the same asymptotic integrality gap as the canonical SDP, namely Ω(log n). It would be interesting to find other problems for which this kind of construction can prove meaningful integrality gap results. It is easy to see that as long as a problem does not have “hard” constraints and a super-constant integrality gap for the canonical SDP relaxation is known, one can get super-constant integrality gaps for super-constant levels of the Sherali-Adams SDP just by plugging in the uniform distribution for ν in Lemma 2.
It is possible to show that the same techniques apply when the objective
function is a polynomial of degree greater than 2 (but still constant.) This is
particularly relevant to Max-CSP(P ) problems. When formulated as a maxi-
mization problem of a polynomial q over ±1 valued variables, q will have degree
r, the arity of P . In fact, the canonical SDP formulation for the case r > 2 will
be very similar to Sherali-Adams SDP of level r in our case. In order to adapt
Lemma 2 to this setting, the Fourier expansion of fS ’s should be adjusted appro-
priately. Specifically, their Fourier expansion would match that of the starting
SDP solution up to level r and that of ν beyond level r. It is also possible to
define “gain” versions of Max-CSP(P ) problems in this setting and extend ex-
isting superconstant integrality gaps to the Sherali-Adams SDP of superconstant
level (details omitted in this extended abstract.)
The first obvious open problem is to extend these constructions to be appli-
cable to problems with “hard” constraints for which SDP integrality gaps have
been (or may potentially be) shown. A possibly reasonable candidate along these
lines would be the Sparsest Cut problem in which we have one normalizing con-
straint in the SDP relaxation, and for which superconstant integrality gaps are
known. In contrast, it seems quite unlikely that these techniques can be extended
for Vertex Cover where the integrality gap is constant. Another direction is to
extend these techniques to the Lasserre hierarchy for which very few integrality
gaps are known. We believe that this is possible but more sophisticated “merge
operators” of distributions and vectors à la Lemma 2 will be necessary.
References
1. Alon, N., Makarychev, K., Makarychev, Y., Naor, A.: Quadratic forms on graphs.
Invent. Math. 163(3), 499–522 (2006)
2. Arora, S., Berger, E., Kindler, G., Safra, M., Hazan, E.: On non-approximability for
quadratic programs. In: FOCS ’05: Proceedings of the 46th Annual IEEE Sympo-
sium on Foundations of Computer Science, pp. 206–215. IEEE Computer Society,
Washington (2005)
3. Charikar, M., Makarychev, K., Makarychev, Y.: Integrality gaps for Sherali-Adams
relaxations. In: STOC ’09: Proceedings of the 41st annual ACM symposium on
Theory of computing, pp. 283–292. ACM, New York (2009)
4. Charikar, M., Wirth, A.: Maximizing quadratic programs: Extending
Grothendieck's inequality. In: FOCS '04: Proceedings of the 45th Annual IEEE
Symposium on Foundations of Computer Science, pp. 54–60. IEEE Computer
Society, Washington (2004)
5. Georgiou, K., Magen, A., Tulsiani, M.: Optimal Sherali-Adams gaps from pairwise
independence. In: APPROX-RANDOM, pp. 125–139 (2009)
6. Grothendieck, A.: Résumé de la théorie métrique des produits tensoriels
topologiques. Bol. Soc. Mat. São Paulo 8, 1–79 (1953)
7. Khot, S., O'Donnell, R.: SDP gaps and UGC-hardness for MaxCutGain. In: FOCS '06:
Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer
Science, pp. 217–226. IEEE Computer Society, Washington (2006)
8. Lasserre, J.B.: An explicit exact SDP relaxation for nonlinear 0-1 programs. In:
Aardal, K., Gerards, B. (eds.) IPCO 2001. LNCS, vol. 2081, pp. 293–303. Springer,
Heidelberg (2001)
9. Lovász, L., Schrijver, A.: Cones of matrices and set-functions and 0–1 optimization.
SIAM Journal on Optimization 1(2), 166–190 (1991)
10. Mathieu, C., Sinclair, A.: Sherali-Adams relaxations of the matching polytope.
In: STOC ’09: Proceedings of the 41st annual ACM symposium on Theory of
computing, pp. 293–302. ACM, New York (2009)
11. Matousek, J.: Lectures on Discrete Geometry. Springer, New York (2002)
12. Raghavendra, P., Steurer, D.: How to round any CSP. In: FOCS '09: Proceedings
of the 50th Annual IEEE Symposium on Foundations of Computer Science (to
appear 2009)
13. Raghavendra, P., Steurer, D.: Integrality gaps for strong SDP relaxations of Unique Games. In: FOCS '09: Proceedings of the 50th Annual IEEE Symposium on Foun-
dations of Computer Science (to appear 2009)
14. Schoenebeck, G.: Linear level Lasserre lower bounds for certain k-CSPs. In: FOCS
’08: Proceedings of the 2008 49th Annual IEEE Symposium on Foundations of
Computer Science, pp. 593–602. IEEE Computer Society, Washington (2008)
15. Sherali, H.D., Adams, W.P.: A hierarchy of relaxations between the continuous and
convex hull representations for zero-one programming problems. SIAM Journal on
Discrete Mathematics 3(3), 411–430 (1990)
16. Tulsiani, M.: CSP gaps and reductions in the Lasserre hierarchy. In: STOC '09:
Proceedings of the 41st annual ACM symposium on Theory of computing, pp.
303–312. ACM, New York (2009)
17. de la Vega, W.F., Kenyon-Mathieu, C.: Linear programming relaxations of MaxCut.
In: SODA ’07: Proceedings of the eighteenth annual ACM-SIAM symposium on
Discrete algorithms, pp. 53–61. Society for Industrial and Applied Mathematics,
Philadelphia (2007)
A Proof of Fact 3
We will need the following simple and well-known lemma stating that most of the area of S^{d−1} is concentrated around the equator.

Lemma 4. For any unit vector v ∈ S^{d−1}, if a unit vector x ∈ S^{d−1} is chosen uniformly at random, then |⟨v, x⟩| is sharply concentrated:

Pr[|⟨v, x⟩| ≥ t/√d] ≤ 4e^{−t²/2}.

Proof. Define f(x) := ⟨v, x⟩ and apply Lévy's lemma (see Theorem 14.3.2 of [11]), observing that f(x) is 1-Lipschitz. We have

Pr[|⟨v, x⟩| ≥ t/√d] = Pr[|f(x) − median(f)| ≥ t/√d] ≤ 4e^{−(t/√d)² d/2} = 4e^{−t²/2}.
Now, proving the fact is a matter of looking at the actual definition of the solution vectors and applying Lemma 4.
Fact 3 (restated). In the SDP solution of [7], with probability at least 1 − 4/n⁴, for all pairs of indices 1 ≤ i, j ≤ N the inner product ⟨v_i, v_j⟩ has the following property: ⟨v_i, v_j⟩ = 1 if ⌈i/m⌉ = ⌈j/m⌉, and |⟨v_i, v_j⟩| ≤ √((12 log n)/d) otherwise.
Proof. The first case follows from the definition of the v_i's. For the second case, v_i and v_j are independent d-dimensional vectors distributed uniformly on S^{d−1}. Fixing a particular choice of v_i, according to Lemma 4,

Pr_{v_j}[|⟨v_i, v_j⟩| ≥ √((12 log n)/d)] ≤ 4e^{−6 log n} = 4n^{−6}.

Applying a union bound over all n² pairs of classes shows that the condition of the lemma holds for all pairs with probability at least

1 − n² · 4n^{−6} = 1 − 4/n⁴.
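As an empirical companion to this proof (the Monte Carlo parameters below are ours), one can check that the threshold √(12 log n / d) is essentially never exceeded by a random pair of unit vectors:

```python
# Sketch (ours): Monte Carlo check of the concentration bound of Lemma 4.
import numpy as np

rng = np.random.default_rng(1)
d, n, trials = 400, 50, 2000
threshold = np.sqrt(12 * np.log(n) / d)
hits = 0
for _ in range(trials):
    v = rng.standard_normal(d); v /= np.linalg.norm(v)  # uniform on S^{d-1}
    w = rng.standard_normal(d); w /= np.linalg.norm(w)
    hits += abs(v @ w) >= threshold
print(hits / trials)   # empirically ~0, consistent with the 4*n**-6 bound
```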
The Price of Collusion in Series-Parallel
Networks
1 Introduction
In a routing game, players have a fixed amount of flow which they route in
a network [16,18,24]. The flow on any edge in the network faces a delay, and
the delay on an edge is a function of the total flow on that edge. We look at
routing games in which each player routes flow to minimize his own delay, where
a player’s delay is the sum over edges of the product of his flow on the edge and
the delay of the edge. This objective measures the average delay of his flow and
is commonly used in traffic planning [11] and network routing [16].
Routing games are used to model traffic congestion on roads, overlay routing
on the Internet, transportation of freight, and scheduling tasks on machines.
Players in these games can be of two types, depending on the amount of flow they
control. Nonatomic players control only a negligible amount of flow, while atomic
players control a larger, non-negligible amount of flow. Further, atomic players
may or may not be able to split their flow along different paths. Depending on the type of players, there are three kinds of routing games: games with (i) nonatomic players, (ii) atomic players who pick a single path to route their flow, and (iii) atomic players who can split their flow along several paths. These are nonatomic
atomic players who can split their flow along several paths. These are nonatomic
[21,22,24], atomic unsplittable [3,10] and atomic splittable [8,16,19] routing games
respectively. We study atomic splittable routing games in this work. These games
are less well-understood than either nonatomic or atomic unsplittable routing
games. One significant challenge here is that, unlike most other routing games,
each player has an infinite strategy space. Further, unlike nonatomic routing games, the players are asymmetric, since each player may have a different flow value.
This work was supported in part by NSF grant CCF-0728869.
Research supported by an Alexander von Humboldt fellowship.
We first consider the case when all delays are affine. We show that, in the setting described above with affine delays, the total delay at equilibrium is largest when the players are symmetric, i.e. all players have the same flow value (Section 3). To do this, we first show that the equilibrium flow for a player i
(Section 3). To do this, we first show that the equilibrium flow for a player i
remains unchanged if we modify the game by changing slightly the value of flow
of any player with larger flow value than player i. Then starting from a game
with symmetric players, we show that if one moves flow from a player i evenly to
all players with higher flow value the cost of the corresponding equilibrium flow
never increases. Since it is known that the price of collusion is 1 if the players
are symmetric [8], this shows that the bound extends to arbitrary players with
affine delays.
In Section 4, we extend the result to general convex delays, by showing that the worst-case price of collusion is attained when the delays are affine.
In contrast to Theorem 1 which presents a bound on the price of collusion,
we also present a new bound on the price of anarchy of atomic splittable routing
games in series-parallel graphs.
This bound was proven earlier for parallel links in [12]. For nonatomic routing games, bounds on the price of anarchy depend on the delay functions in the graph; in the case of polynomial delays of degree d, the price of anarchy is bounded by O(d/log d). These bounds are known to be tight even on simple graphs consisting of 2 parallel links [20]. Theorem 2 improves on the bounds obtained by Theorem 1 when k ≤ d/log d. All missing proofs are contained in the full version [6].
2 Preliminaries
Let G = (V, E) be a directed graph, with two special vertices s and t called the source and sink. The vector f, indexed by edges e ∈ E, is defined as a flow of value v if the following conditions are satisfied:

Σ_w f_uw − Σ_w f_wu = 0,   ∀u ∈ V − {s, t}   (1)
Σ_w f_sw − Σ_w f_ws = v   (2)
f_e ≥ 0,   ∀e ∈ E.

Here f_uw represents the flow on arc (u, w). If there are several flows f^1, f^2, · · · , f^k, we define f := (f^1, f^2, · · · , f^k), and f^{−i} is the vector of the flows except f^i. In this case the flow on an edge is f_e = Σ_{i=1}^k f_e^i.
Let C be the class of differentiable nondecreasing convex functions. Each edge e is associated with a delay function l_e : R_+ → R drawn from C. Note that we allow delay functions to be negative. For a given flow f, the induced delay on edge e is l_e(f_e). We define the total delay on an edge e as the product of the flow on the edge and the induced delay, C_e(f_e) := f_e l_e(f_e). The marginal delay on an edge e is the rate of change of the total delay: L_e(f_e) := f_e l_e′(f_e) + l_e(f_e). The total delay of a flow f is C(f) = Σ_{e∈E} f_e l_e(f_e).
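For affine delays l_e(x) = a_e·x + b_e these quantities have simple closed forms, transcribed in the following sketch (the coefficient names are ours):

```python
# Sketch (ours): delay quantities for affine delays l_e(x) = a*x + b.
def induced_delay(a, b, fe):
    return a * fe + b                      # l_e(f_e)

def edge_total_delay(a, b, fe):
    return fe * induced_delay(a, b, fe)    # C_e(f_e) = f_e * l_e(f_e)

def edge_marginal_delay(a, b, fe):
    return 2 * a * fe + b                  # L_e(f_e) = f_e*l_e'(f_e) + l_e(f_e)

def total_delay(edges, f):
    """edges: list of (a_e, b_e); f: list of edge flows. C(f) = sum_e C_e."""
    return sum(edge_total_delay(a, b, fe) for (a, b), fe in zip(edges, f))
```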
An atomic splittable routing game is a tuple (G, v, l, s, t) where l is a vector of delay functions for edges in G and v = (v^1, v^2, · · · , v^k) is a tuple indicating the flow values of the players from 1 to k. We always assume that the players are indexed in order of decreasing flow value, hence v^1 ≥ v^2 ≥ · · · ≥ v^k. All players have source s and destination t. Player i has a strategy space consisting of all possible s-t flows of volume v^i. Let (f^1, f^2, · · · , f^k) be a strategy vector. Player i incurs a delay C_e^i(f_e^i, f_e) := f_e^i l_e(f_e) on each edge e, and his objective is to minimize his delay C^i(f) := Σ_{e∈E} C_e^i(f_e^i, f_e). A set of players are symmetric if they all have the same flow value.
For player i, the marginal delay on edge e is defined as the rate of change of his delay on the edge: L_e^i(f_e^i, f_e) := l_e(f_e) + f_e^i l_e′(f_e). For any s-t path p, the marginal delay on path p is defined as the rate of change of the total delay of player i when he adds flow along the edges of the path: L_p^i(f) := Σ_{e∈p} L_e^i(f_e^i, f_e).
The following lemma follows from the Karush-Kuhn-Tucker optimality conditions for convex programs [15] applied to player i's minimization problem.
Lemma 4. Flow f is a Nash equilibrium flow if and only if for any player i and any two directed paths p and q between the same pair of vertices such that f_e^i > 0 on all edges e ∈ p, we have L_p^i(f) ≤ L_q^i(f).
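On parallel s-t links every link is itself a path, so this characterization can be tested directly; the sketch below (all names ours, affine delays assumed) does exactly that:

```python
# Sketch (ours): test the Nash condition of Lemma 4 on parallel links with
# affine delays l_e(x) = a*x + b.
def is_equilibrium(edges, flows, tol=1e-9):
    """edges: list of (a_e, b_e); flows[i][e]: flow of player i on link e."""
    total = [sum(f[e] for f in flows) for e in range(len(edges))]
    for fi in flows:
        # player i's marginal delay L^i_e = l_e(f_e) + f^i_e * l_e'(f_e)
        L = [a * total[e] + b + fi[e] * a for e, (a, b) in enumerate(edges)]
        used = [L[e] for e in range(len(edges)) if fi[e] > tol]
        if used and max(used) > min(L) + tol:
            return False   # some used link is worse than an alternative
    return True
```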
In directed series-parallel graphs, all edges are directed from the source to the
destination and the graph is acyclic in the directed edges. This is without loss
of generality, since any edge not on an s-t path is not used in an equilibrium
flow, and no flow is sent along a directed cycle. The following lemma describes
a basic property of flows in a directed series-parallel graph.
Vectors and matrices in the paper, except for flow vectors, will be referred to
using boldface. 1 and 0 refer to the column vectors consisting of all ones and
all zeros respectively. When the size of the vector or matrix is not clear from
context, we use a subscript to denote it, e.g. 1n .
For technical reasons, for proving Theorem 1 we also require that every s-t path
in the graph has at least one edge with strictly increasing delay. We modify the
graph in the following way: we add a single edge e in series with graph G, with
delay function le (x) = x. It is easy to see that for any flow, this increases the
total delay by exactly v 2 where v is the value of the flow, and does not change
the value of flow on any edge at equilibrium. In addition, if the price of collusion
in the modified graph is less than one, then the price of collusion in the original
graph is also less than one. The proof of Theorem 2 does not use this assumption.
Theorem 8. In a series-parallel graph with affine delay functions, the total de-
lay of a Nash equilibrium is bounded by that of a Wardrop equilibrium of the
same total flow value.
We first present the high-level ideas of our proof. Given a series-parallel graph G, terminals s and t, and edge delay functions l, let f(·) : R_+^k → R_+^{m×k} denote the function mapping a vector of flow values to the equilibrium flow in the atomic
splittable routing game. By Lemma 7, the equilibrium flow is unique and hence
the function f (·) is well-defined. Let (G, u, l, s, t) be an atomic splittable routing
game. Our proof consists of the following three steps:
Step 1. Start with v^i = Σ_{j=1}^k u^j/k for each player i, i.e. the players are symmetric.
Step 2. Gradually adjust the flow values v of the k players so that the total delay of the equilibrium flow f(v) is monotonically nonincreasing.
Step 3. Stop the flow redistribution process when v^i = u^i for each i.
Lemma 9 ([8]). Let (G, v, l, s, t) denote an atomic splittable routing game with k symmetric players, and let g be a Wardrop equilibrium of the same total flow value Σ_{i=1}^k v^i. Then C(f(v)) ≤ C(g).
Step 2 is the heart of our proof. The flow redistribution works as follows. Let v^i denote the current flow value of player i. Initially, each player i has v^i = Σ_{j=1}^k u^j/k. Consider each player in turn from k to 1. We decrease the flow of the k-th player and give it evenly to the first k−1 players until v^k = u^k. Similarly, when we consider the r-th player, for any r < k, we decrease v^r and give the flow evenly to the first r−1 players until v^r = u^r. Throughout the following discussion and proofs, player r refers specifically to the player whose flow value is currently being decreased in our flow redistribution process.
Our flow redistribution strategy traces out a curve S in R_+^k, where points on the curve correspond to flow value vectors v.
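The sketch below (ours, purely illustrative) traces a discretized walk along S: starting from the symmetric point, it drains players k, k−1, . . . , 2 down to their targets, spreading the surplus evenly over the preceding players:

```python
# Sketch (ours): discretized walk along the redistribution curve S.
def redistribution_path(u, steps_per_phase=10):
    """u: target flow values with u[0] >= u[1] >= ... >= u[k-1]."""
    k = len(u)
    v = [sum(u) / k] * k                      # Step 1: symmetric start
    yield list(v)
    for j in range(k - 1, 0, -1):             # drain players k, k-1, ..., 2
        surplus = v[j] - u[j]                 # nonnegative since u is sorted
        start = list(v)
        for s in range(1, steps_per_phase + 1):
            v = list(start)
            v[j] = start[j] - s * surplus / steps_per_phase
            for i in range(j):                # spread over the first j players
                v[i] = start[i] + s * surplus / (steps_per_phase * j)
            yield list(v)

# e.g. list(redistribution_path([5, 3, 1], steps_per_phase=2))
```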
Lemma 10. For all e ∈ E, i ∈ [k], the function f (v) is continuous and piece-
wise linear along the curve S, with breakpoints occurring where the set of edges
used by any player changes.
The proof of Lemma 11 is described in Section 3.2. Here we highlight the main ideas. To simplify notation, when the vector of flow values is clear from the context, we use f instead of f(v) to denote the equilibrium flow.
By the chain rule, we have ∂C(f)/∂v^i = Σ_{e∈E} (∂C_e(f_e)/∂f_e)(∂f_e/∂v^i) = Σ_{e∈E} L_e(f_e)(∂f_e/∂v^i). The exact expressions for ∂C(f)/∂v^i, for 1 ≤ i ≤ r, are given in Lemmas 18 and 19 in Section 3.2. Our derivations use the fact that it is possible to simplify the expression ∂f_e/∂v^i using the following “nesting property” of a series-parallel graph.
A graph is said to satisfy the nesting property if every equilibrium flow f satisfies the following condition: for any players i and j with flow values v^i and v^j, v^i > v^j if and only if on every edge e ∈ E either f_e^i = f_e^j = 0 or f_e^i > f_e^j.
Lemma 13 ([5]). A series-parallel graph satisfies the nesting property for any
choice of non-decreasing, convex delay functions.
If a graph satisfies the nesting property, symmetric players have identical flows at equilibrium. When the flow value of player r is decreased in Step 2, the first r − 1 players are symmetric. Thus, by Lemma 13, these players have identical flows at equilibrium. Hence, for any player i < r, f_e^i = f_e^1 and L_e^i(f_e^i, f_e) = L_e^1(f_e^1, f_e) for every edge e. With affine delays, the nesting property has the following implication.
∂f_e/∂v^r = Σ_{i=1}^k ∂f_e^i/∂v^r = (r − 1) ∂f_e^1/∂v^r + ∂f_e^r/∂v^r.   (3)

∂f_e/∂v^i = Σ_{j=1}^k ∂f_e^j/∂v^i = ∂f_e^i/∂v^i,   ∀i < r.   (4)
Lemma 15. If player h has positive flow on every edge of two directed paths p and q between the same pair of vertices, then ∂L_p^h(f)/∂v^i = ∂L_q^h(f)/∂v^i.
Lemma 16. Let G be a directed acyclic graph. For an atomic splittable routing game (G, v, l, s, t) with equilibrium flow f, let c and κ be defined as in Lemma 6. Then Σ_{e∈E} c(e) ∂f_e^i(v)/∂v^j = κ if i = j, and is zero otherwise.
Summing (5) over all players h in S^j \ {j} and subtracting it from |S^j| times (5) for player j gives the proof.
Let G^j = (V, E^j). By definition, all players h ∈ S^j have flow on every s-t path in this graph. Lemma 15 implies that for any s-t paths p, q in G^j and any player h ∈ S^j, ∂L_p^h(f)/∂v^i = ∂L_q^h(f)/∂v^i. The expression on the left-hand side of Claim 17 is thus equal for any path p ∈ P^j, and therefore so is the expression on the right.
For the base case j = k, the set {h : E^h(f) ⊂ E^j(f)} is empty. Hence, the second term on the right of Claim 17 is zero, and by the previous discussion the quantity Σ_{e∈p} a_e ∂f_e^k/∂v^i is equal for any path p ∈ P^k. Define c(e) = a_e ∂f_e^k/∂v^i for each e ∈ E^k and κ = Σ_{e∈p} a_e ∂f_e^k/∂v^i for any s-t path p in G^k. By Lemma 16,

Σ_{e∈E^j(f)} c(e) ∂f_e^k/∂v^i = Σ_{e∈E^j(f)} a_e (∂f_e^k/∂v^i)² = 0.

Hence ∂f_e^k/∂v^i = 0 for all e ∈ E.
For the induction step j < k, by the inductive hypothesis ∂f_e^h/∂v^i = 0 for h > j. Since by the nesting property E^h(f) ⊂ E^j(f) implies h > j, the second term on the right of Claim 17 is again zero. By the same argument as in the base case, ∂f_e^j/∂v^i = 0 for each e ∈ E, proving the lemma.
Let P denote the set of all s-t paths in G, and for equilibrium flow f, let {f_p^i}_{p∈P,i∈[k]} denote a path-flow decomposition of f. For players i, j ∈ [r], with player r defined as in the flow redistribution, we will be interested in the rate of change of the marginal delay of player i along an s-t path p as the value of flow controlled by player j changes. Given a decomposition {f_p^i}_{p∈P,i∈[k]} of the equilibrium flow along paths, this rate of change can be expressed in terms of the path flows.
Lemma 21. For equilibrium flow f and sets P^i ⊆ P as described in Lemma 20, for all players i ∈ [k], W_i ≥ W_{i+1} and W_k > 0.
The next lemma gives the rate of change of marginal delay at equilibrium.
Lemma 22. For player r defined as in the flow redistribution process and any player i < r, for f = f(v),
(i) ∂L^i(f(v))/∂v^i = 2/W_i,
(ii) ∂L^i(f)/∂v^r = 1/W_i,
(iii) ∂L^r(f)/∂v^r = ((r+1)/r)(1/W_r) + ((r−1)/r)(1/W_1).
two players, when r = 2 the fourth term on the right-hand side of (7) has a nonzero contribution. Calculating this term is complicated. However, we show the following inequality for this expression.
Lemma 23. For f = f(v) and the player r as defined in the flow redistribution process,

Σ_{e∈E} a_e f_e^1 (∂f_e^r/∂v^1) ≥ v^1/W_1 − (v^r/r)(1/W_1 − 1/W_r).
Proof of Lemma 11. For any player i < r, substituting the expression for ∂L^i(f)/∂v^i from Lemma 22 into Lemma 18, and observing that L^i(f) = L^1(f) and W_i = W_1 since the flows of the first r − 1 players are identical,

∂C(f)/∂v^i = L^1(f) + (Σ_{j=2}^k v^j)/W_1.   (9)

From Lemma 21 we know that W_1 ≥ W_r. Also, Σ_{i=2}^k v^i = (r − 2)v^1 + Σ_{i=r}^k v^i ≥ (r − 2)(v^1 − v^r). Hence, the expression on the right of (11) is nonnegative.
Theorem 24. The price of collusion on a series-parallel graph with delay func-
tions taken from the set C is at most the price of collusion with linear delay
functions.
This theorem, combined with Theorem 8, suffices to prove Theorem 1. The following lemma is proved by Milchtaich.¹
¹ Milchtaich in fact shows the same result for undirected series-parallel graphs. In our
context, every simple s-t path in the underlying undirected graph is also an s-t path
in the directed graph G.
If the nesting property does not hold, the total delay can increase as we decrease
the flow of a smaller player and increase the flow of a larger player, thus causing
our flow redistribution strategy presented in Section 3.2 to break down. See the
full version for an example.
References
1. Ahuja, R.K., Magnanti, T.L., Orlin, J.B.: Network Flows. Prentice-Hall, Englewood
Cliffs (1993)
2. Altman, E., Basar, T., Jimenez, T., Shimkin, N.: Competitive routing in networks
with polynomial costs. IEEE Transactions on Automatic Control 47(1), 92–96
(2002)
3. Awerbuch, B., Azar, Y., Epstein, A.: The price of routing unsplittable flow. In:
STOC, pp. 57–66. ACM, New York (2005)
4. Beckmann, M., McGuire, C.B., Winsten, C.B.: Studies in the Economics of Trans-
portation. Yale University Press, New Haven (1956)
5. Bhaskar, U., Fleischer, L., Hoy, D., Huang, C.-C.: Equilibria of atomic flow games
are not unique. In: SODA, pp. 748–757 (2009)
6. Bhaskar, U., Fleischer, L., Huang, C.-C.: The price of collusion in series-parallel
networks (unpublished manuscript 2010)
7. Catoni, S., Pallottino, S.: Traffic equilibrium paradoxes. Transportation Sci-
ence 25(3), 240–244 (1991)
8. Cominetti, R., Correa, J.R., Stier-Moses, N.E.: The impact of oligopolistic compe-
tition in networks. Operations Research 57(6), 1421–1437 (2009)
1 Introduction
Nonlinear Integer Programming has received significant attention from the Integer Programming (IP) community in recent times. Although some special classes are efficiently solvable [32], even simple nonlinear IP problems can be NP-hard or undecidable [33]. However, there has been considerable progress in the development of practical algorithms that can be effective for many important applica-
tions (e.g. [1,8,9,10,32,36,37]). Building on work for linear IP, practical algorithms
for nonlinear IP have benefited from the development of several classes of cutting
planes or valid inequalities (e.g. [3,4,5,6,13,14,25,29,30,31,35,28,39,40,43]). Many
of these inequalities are based on the generalization of ideas used in linear IP. For
example, [4,5,39,14] exploit the interaction between superadditive functions and
nonlinear constraints to develop techniques that can yield several strong valid
inequalities.
Following the success of such approaches we study some theoretical properties
of this interaction when the superadditive function is the integer round down
operation ⌊·⌋ and the nonlinear constraints are convex. Specifically we study
the polyhedrality of the (first) Chvátal-Gomory (CG) closure [15,26,27,41] of
a non-polyhedral convex set. The study of properties of the CG closure of a
rational polyhedron has yielded many well known results for linear IP. In this
case, the closure is a rational polyhedron [41] for which the associated optimiza-
tion, separation and membership problems are NP-hard even for restricted cases
[11,12,21,34]. However, optimization over the CG closure of a polyhedron has
been successfully used to show its strength computationally [22,23]. Similar re-
sults have also been obtained for closures associated to other valid inequalities
such as split cuts [2,7,12,17,19,20,44].
CG cuts for non-polyhedral sets are considered implicitly in [15,41] and explic-
itly in [14], but only [41] deals with the polyhedrality of the CG closure. Although
[41] shows that for rational polyhedra the closure is a rational polyhedron, the
result does not automatically extend to non-polyhedral sets. Furthermore, nei-
ther of the known proofs of the result for rational polyhedra [16,41,42] can be
easily adapted to consider other convex sets. In fact, as noted in [41] even the
polyhedrality of the CG closure of non-rational polytopes remains unknown.
Because of this, we study the polyhedrality of the CG closure of an ellipsoid as
the first natural step towards understanding the closure of other non-polyhedral
convex sets.
Let a rational ellipsoid be the image of a Euclidean ball under a rational affine transformation. Our main result is to show that the CG closure of a full-
dimensional bounded rational ellipsoid is a rational polytope. To the best of
our knowledge, this is the first extension to a non-polyhedral set of the well
known result for rational polyhedra. Additionally, the proof of our main result
reveals some general sufficient conditions for the polyhedrality of the CG closure
and other interesting properties. For example, we show that every non-integral
point on the boundary of an ellipsoid can be separated by a CG cut. We recently
verified [18] that this geometrically natural property holds for some other classes
of convex sets.
The rest of the paper is organized as follows. In Section 2, we give some back-
ground on CG cuts, formally state the main result of the paper and present an
outline of its proof. In Section 3, we present notation and review some standard
results from convex analysis. In Section 4, we consider two separation results that
are needed for the proof of the main theorem, which we present in Section 5. We
end with some remarks in Section 6.
To show that Step 1 can be accomplished, we first show that every non-integral
point on the boundary of T can be separated by a CG cut. If there are no integral
points on the boundary of T , then this separation result allows us to cover the
boundary of T with a possibly infinite number of open sets that are associated
to the CG cuts. We then use compactness of the boundary of T to obtain a finite
sub-cover that yields a finite number of CG cuts that separate every point on
the boundary of T . If there are integer points on the boundary, then we use a
stronger separation result and a similar argument to show that there is a finite
set of CG cuts that separate every non-integral point on the boundary of T .
To show that Step 2 terminates finitely, we simply show that the set of CG
cuts that separate at least one point in Q \ CC(Zn , T ) is finite.
We note that the separation of non-integral points on the boundary of T using CG cuts, required in Step 1 of Figure 1, is not straightforward. A natural first approach to separate a non-integral point u on the boundary of T is to take an inequality ⟨a, x⟩ ≤ σ_T(a) that is supporting at u, scale it so that a ∈ Z^n, and then generate the CG cut ⟨a, x⟩ ≤ ⌊σ_T(a)⌋. If σ_T(a) ∉ Z, then the CG cut will separate u because a was selected such that ⟨a, u⟩ = σ_T(a). Unfortunately, as the following examples show, this approach can fail, either because a cannot be scaled to be integral or because σ_T(a) ∈ Z for every scaling that yields a ∈ Z^n.
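For the Euclidean ball T = {x : ‖x − c‖ ≤ r} the support function is σ_T(a) = ⟨a, c⟩ + r‖a‖ (a standard fact), so the recipe just described can be transcribed directly; the following is a sketch under that assumption, with function names of our choosing:

```python
# Sketch (ours): the naive CG-cut recipe for a ball T = {x : ||x - c|| <= r}.
import math

def sigma_ball(a, c, r):
    """Support function: max of <a, x> over x in T, i.e. <a,c> + r*||a||."""
    dot = sum(ai * ci for ai, ci in zip(a, c))
    return dot + r * math.sqrt(sum(ai * ai for ai in a))

def cg_rhs(a, c, r):
    """Right-hand side floor(sigma_T(a)) of the CG cut for integral a."""
    return math.floor(sigma_ball(a, c, r))

# Example 1's fix below: a' = (1, 1) on the unit ball gives sigma = sqrt(2),
# hence the CG cut x_1 + x_2 <= 1, which cuts off u = (1/2, sqrt(3)/2).
print(cg_rhs((1, 1), (0.0, 0.0), 1.0))   # -> 1
```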
Example 1. Let T := {x ∈ R² : √(x₁² + x₂²) ≤ 1} and u = (1/2, √3/2)^T ∈ bd(T). We have that the supporting inequality for u is a₁x₁ + a₂x₂ ≤ σ_T(a) where a = u. Since u is irrational in only one component, observe that a cannot be scaled to be integral.
For Example 1, it is easy to see that selecting an alternative integer left-hand-side vector a′ resolves the issue. We can use a′ = (1, 1), which has σ_T(a′) = √2, to obtain the CG cut x₁ + x₂ ≤ 1. In Example 1 this CG cut separates every non-negative non-integral point on the boundary of T. In Section 4, we will show that
given any non-integral point u on the boundary of T such that the left-hand side of its supporting hyperplane cannot be scaled to be integral, there exists an alternative left-hand-side integer vector a′ such that the CG cut ⟨a′, x⟩ ≤ ⌊σ_T(a′)⌋ separates u. This vector a′ will be systematically obtained using simultaneous Diophantine approximation of the left-hand side of an inequality describing the supporting hyperplane at u.
Example 2. Let T := {x ∈ R² : √(x₁² + x₂²) ≤ 5} and u = (25/13, 60/13)^T ∈ bd(T). We have that the supporting inequality for u can be scaled to a₁x₁ + a₂x₂ ≤ σ_T(a) for a = (5, 12)^T, which has σ_T(a) = 65. Because 5 and 12 are coprime and σ_T(·) is positively homogeneous, a cannot be scaled so that a ∈ Z² and σ_T(a) ∉ Z.
Observe that Example 2 is not an isolated case. In fact, these cases are closely related to primitive Pythagorean triples. For T := {x ∈ R² : √(x₁² + x₂²) ≤ r}, select any primitive Pythagorean triple v₁² + v₂² = v₃², and consider the point r(v₁/v₃, v₂/v₃) (such that r(v₁/v₃, v₂/v₃) ∉ Z²). Then, since v₁ and v₂ are coprime, the behavior in Example 2 will be observed. Also note that these examples are not restricted only to Euclidean balls in R², since it is easy to construct integers a₁, . . . , aₙ, a_{n+1} such that Σ_{i=1}^n a_i² = a_{n+1}² (e.g. 3² + 4² + 12² = 13²). For the class of points u ∈ bd(T) where the left-hand side of an inequality describing the supporting hyperplane is scalable to an integer vector a, we will show in Section 4 that there exists a systematic method to obtain a′ ∈ Z^n such that ⟨a′, x⟩ ≤ ⌊σ_T(a′)⌋ separates u.
4 Separation
To prove Theorem 2 we need two separation results. The first one simply states
that every non-integral point on the boundary of T can be separated by a CG
cut.
Proposition 1. If u ∈ bd(T ) \ Zn , then there exists a CG cut that separates
point u.
An integer point u ∈ bd(T) ∩ Z^n cannot be separated by a CG cut, but Proposition 1 states that every non-integral point in bd(T) that is close enough to u will be separated by a CG cut. However, for the compactness argument to work we need a stronger separation result for points on the boundary that are close to integral boundary points. This second result states that all points in bd(T) that are sufficiently close to an integral boundary point can be separated by a finite number of CG cuts.

Proposition 2. Let u ∈ bd(T) ∩ Z^n. Then there exists ε_u > 0 and a finite set W_u ⊂ Z^n such that

⟨w, u⟩ = ⌊σ(w)⌋   ∀w ∈ W_u,   (2)
∀v ∈ bd(T) ∩ {x ∈ R^n : ‖x − u‖ < ε_u} \ {u}  ∃w ∈ W_u s.t. ⟨w, v⟩ > ⌊σ(w)⌋,   (3)
and
∀v ∈ int(T)  ∃w ∈ W_u s.t. ⟨w, v⟩ < ⌊σ(w)⌋.   (4)
The main ideas used in the proof of Proposition 2 are as follows. First, it is verified that for any nonzero integer vector q, there exists a finite i ∈ Z_+ such that the CG cut of the form ⟨q + iλs(u), x⟩ ≤ ⌊σ(q + iλs(u))⌋ satisfies (2) (here λs(u) ∈ Z^n for some scalar λ ≠ 0). Second, it is verified that by carefully selecting a finite number of integer vectors and applying the above construction, all points in a sufficiently small neighborhood of u can be separated. Finally, (4) is established by adding the supporting hyperplane at u, which is trivially a CG cut. Although this proof of Proposition 2 is similar to the proof of Proposition 1, it is significantly more technical. We therefore refer the reader to [18], where a more general version of Proposition 2 is proven, and confine our discussion to an outline of the proof of Proposition 1 here.
Neither condition holds for every sequence such that s_i/‖s_i‖ → s(u)/‖s(u)‖. For instance, for s(u) = (0, 1)^T the sequence s_k = (k, k²) does not comply with condition C1. For this reason we need the following proposition.
Proposition 3. Let u ∈ bd(T) \ Z^n and let e_l be the l-th unit vector for some l ∈ {1, . . . , n} such that u_l ∉ Z.
(a) If there exists λ > 0 such that p := λs(u) ∈ Z^n and σ(λs(u)) ∈ Z, then s_i := e_l + ip complies with conditions C1 and C2.
(b) If λs(u) ∉ Z^n for all λ > 0, then let {(p_i, q_i)}_{i∈N} ⊂ Z^n × (Z_+ \ {0}) be the coefficients obtained using Dirichlet's theorem to approximate s(u). That is, {(p_i, q_i)}_{i∈N} is such that

|q_i s(u)_j − p_{ij}| < 1/i   ∀j ∈ {1, . . . , n}.

For M ∈ Z_+ such that Mc ∈ Z^n, we have that s_i := e_l + Mp_i complies with conditions C1 and C2.
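Dirichlet's theorem only asserts that such pairs (p_i, q_i) exist; for illustration, the brute-force scan below (ours, not an efficient method) finds one pair for a given accuracy 1/i:

```python
# Sketch (ours): brute-force simultaneous Diophantine approximation of s,
# returning (p, q) with |q*s_j - p_j| < 1/i for all j.
import math

def dirichlet_approx(s, i, q_max=10**6):
    for q in range(1, q_max + 1):
        p = [round(q * sj) for sj in s]
        if all(abs(q * sj - pj) < 1.0 / i for sj, pj in zip(s, p)):
            return p, q
    return None

print(dirichlet_approx([math.sqrt(2), math.sqrt(3)], i=50))
```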
With this proposition we can proceed to the proof of Proposition 1.
Proof (Proof of Proposition 1). Let u ∈ bd(T) \ Z^n. There are three possible cases:
1. There exists λ > 0 such that λs(u) ∈ Z^n and σ(λs(u)) ∉ Z.
2. There exists λ > 0 such that λs(u) ∈ Z^n and σ(λs(u)) ∈ Z.
3. λs(u) ∉ Z^n for all λ > 0.
Case 1: ⟨λs(u), x⟩ ≤ ⌊σ(λs(u))⌋ is a CG cut that separates u.
Cases 2 and 3: From Proposition 3, we have that in both cases there exists a sequence {s_i}_{i∈N} ⊂ Z^n satisfying conditions C1 and C2. Together, conditions C1 and C2 yield that for sufficiently large i we have ⟨s_i, u⟩ − ⌊σ(s_i)⌋ > 0, and hence ⟨s_i, x⟩ ≤ ⌊σ(s_i)⌋ separates u.
We discuss the proof of Proposition 3 in the next two subsections.
k + δ ≤ σ(s_i) ≤ k + δ + ε   ∀i ≥ N_ε.   (7)

Now, k ∈ Z and

‖A⁻¹s_i‖ = ‖A⁻¹(Mp_i + e_l)‖
         = ‖Mq_i A⁻¹s(u) + A⁻¹e_l + MA⁻¹ε̄_i‖
         ≤ ‖Mq_i A⁻¹s(u) + A⁻¹e_l‖ + M‖A⁻¹ε̄_i‖
         = √((⟨A⁻¹s(u), A⁻¹e_l⟩ + Mq_i)² + t) + M‖A⁻¹ε̄_i‖,

where ε̄_i := p_i − q_i s(u) (so that ‖ε̄_i‖_∞ < 1/i) and t := ‖A⁻¹e_l‖² − (⟨A⁻¹s(u), A⁻¹e_l⟩/‖A⁻¹s(u)‖)². Since ‖A⁻¹s(u)‖ = ‖A(u − c)‖ = 1, t = ‖A⁻¹e_l‖² − ⟨A(u − c), A⁻¹e_l⟩², which is non-negative by the Cauchy-Schwarz inequality. Therefore, by setting α := ⟨A⁻¹s(u), A⁻¹e_l⟩ and β_i := Mq_i, we can use Lemma 2 and the fact that ‖A⁻¹s(u)‖ = ‖A(u − c)‖ = 1 to obtain that for every ε > 0 there exists N_ε such that
The separation results from Section 4 allow the construction of the set required in Proposition 4, which proves our main result.
Proof (Proof of Theorem 2). Let I := bd(T) ∩ Z^n be the finite (and possibly empty) set of integer points on the boundary of T. We divide the proof into the following cases:
1. CC(Z^n) = ∅.
2. CC(Z^n) ≠ ∅ and CC(Z^n) ∩ int(T) = ∅.
3. CC(Z^n) ∩ int(T) ≠ ∅.
For the first case, the result follows directly. For the second case, by Proposition 1 and the strict convexity of T, we have that |I| = 1 and CC(Z^n) = I, so the result again follows directly. For the third case we show the existence of a set S complying with conditions (12) presented in Proposition 4.
For each u ∈ I, let ε_u > 0 be the value from Proposition 2. Let D := bd(T) \ ∪_{u∈I} {x ∈ R^n : ‖x − u‖ < ε_u}. Observe that D ∩ Z^n = ∅ by construction and
6 Remarks
We note that the proof of Proposition 4 only uses the fact that T is a convex
body and Theorem 2 uses the fact that T is additionally an ellipsoid only through
Proposition 1 and Proposition 2. Therefore, we have the following general suf-
ficient conditions for the polyhedrality of the first CG closure of a compact
convex set.
A condition similar to (12) was considered in [41] for polytopes that are not
necessarily rational. Specifically the author stated that if P is a polytope in real
space such that CC(Zn , P ) ∩ bd(P ) = ∅, then CC(Zn , P ) is a rational polytope.
We imagine that the proof he had in mind could have been something along the
lines of Proposition 4.
We also note that Step 2 of the procedure described in Section 2 can be directly
turned into a finitely terminating algorithm by simple enumeration. However, it
is not clear how to obtain a finitely terminating algorithmic version of Step 1
because it requires obtaining a finite subcover of the boundary of T from a quite
complicated infinite cover.
References
1. Abhishek, K., Leyffer, S., Linderoth, J.T.: FilMINT: An outer-approximation-
based solver for nonlinear mixed integer programs. In: Preprint ANL/MCS-P1374-
0906, Argonne National Laboratory, Mathematics and Computer Science Division,
Argonne, IL (September 2006)
2. Andersen, K., Cornuéjols, G., Li, Y.: Split closure and intersection cuts. Mathe-
matical Programming 102, 457–493 (2005)
3. Atamtürk, A., Narayanan, V.: Cuts for conic mixed-integer programming. In: Fis-
chetti and Williamson [24], pp. 16–29
4. Atamtürk, A., Narayanan, V.: Lifting for conic mixed-integer programming. Re-
search Report BCOL.07.04, IEOR, University of California-Berkeley, October 2007,
Forthcoming in Mathematical Programming (2007)
5. Atamtürk, A., Narayanan, V.: The submodular 0-1 knapsack polytope. Discrete
Optimization 6, 333–344 (2009)
6. Atamtürk, A., Narayanan, V.: Conic mixed-integer rounding cuts. Mathematical
Programming 122, 1–20 (2010)
7. Balas, E., Saxena, A.: Optimizing over the split closure. Mathematical Program-
ming 113, 219–240 (2008)
8. Belotti, P., Lee, J., Liberti, L., Margot, F., Waechter, A.: Branching and bound
tightening techniques for non-convex MINLP. Optimization Methods and Soft-
ware 24, 597–634 (2009)
9. Bonami, P., Biegler, L.T., Conn, A.R., Cornuéjols, G., Grossmann, I.E., Laird,
C.D., Lee, J., Lodi, A., Margot, F., Sawaya, N., Waechter, A.: An algorithmic
framework for convex mixed integer nonlinear programs. Discrete Optimization 5,
186–204 (2008)
10. Bonami, P., Kilinç, M., Linderoth, J.: Algorithms and software for convex mixed in-
teger nonlinear programs, Technical Report 1664, Computer Sciences Department,
University of Wisconsin-Madison (October 2009)
11. Caprara, A., Fischetti, M.: {0, 12 }-Chvátal-Gomory cuts. Mathematical Program-
ming 74, 221–235 (1996)
12. Caprara, A., Letchford, A.N.: On the separation of split cuts and related inequal-
ities. Mathematical Programming 94, 279–294 (2003)
13. Ceria, S., Soares, J.: Perspective cuts for a class of convex 0-1 mixed integer pro-
grams. Mathematical Programming 86, 595–614 (1999)
14. Çezik, M.T., Iyengar, G.: Cuts for mixed 0-1 conic programming. Mathematical
Programming 104, 179–202 (2005)
15. Chvátal, V.: Edmonds polytopes and a hierarchy of combinatorial problems. Dis-
crete Mathematics 4, 305–337 (1973)
16. Cook, W.J., Cunningham, W.H., Pulleyblank, W.R., Schrijver, A.: Combinatorial
optimization. John Wiley and Sons, Inc., Chichester (1998)
17. Cook, W.J., Kannan, R., Schrijver, A.: Chvátal closures for mixed integer pro-
gramming problems. Mathematical Programming 58, 155–174 (1990)
18. Dadush, D., Dey, S.S., Vielma, J.P.: The Chvátal-Gomory closure of strictly convex
sets. Working paper, Geogia Institute of Technology (2010)
19. Dash, S., Günlük, O., Lodi, A.: On the MIR closure of polyhedra. In: Fischetti and
Williamson [24], pp. 337–351
20. Dash, S., Günlük, O., Lodi, A.: MIR closures of polyhedral sets. Mathematical
Programming 121, 33–60 (2010)
21. Eisenbrand, F.: On the membership problem for the elementary closure of a poly-
hedron. Combinatorica 19, 297–300 (1999)
22. Fischetti, M., Lodi, A.: Optimizing over the first Chvátal closure. In: Jünger, M.,
Kaibel, V. (eds.) IPCO 2005. LNCS, vol. 3509, pp. 12–22. Springer, Heidelberg
(2005)
23. Fischetti, M., Lodi, A.: Optimizing over the first Chvátal closure. Mathematical
Programming, Series B 110, 3–20 (2007)
24. Fischetti, M., Williamson, D.P. (eds.): IPCO 2007. LNCS, vol. 4513. Springer,
Heidelberg (2007)
25. Frangioni, A., Gentile, C.: Perspective cuts for a class of convex 0-1 mixed integer
programs. Mathematical Programming 106, 225–236 (2006)
26. Gomory, R.E.: Outline of an algorithm for integer solutions to linear programs.
Bulletin of the American Mathematical Society 64, 275–278 (1958)
27. Gomory, R.E.: An algorithm for integer solutions to linear programs. In: Recent
advances in mathematical programming, pp. 269–302. McGraw-Hill, New York
(1963)
28. Grossmann, I., Lee, S.: Generalized convex disjunctive programming: Nonlinear
convex hull relaxation. Computational Optimization and Applications 26, 83–100
(2003)
29. Günlük, O., Lee, J., Weismantel, R.: MINLP strengthening for separable convex
quadratic transportation-cost UFL, IBM Research Report RC24213, IBM, York-
town Heights, NY (March 2007)
30. Günlük, O., Linderoth, J.: Perspective relaxation of mixed integer nonlinear pro-
grams with indicator variables. In: Lodi, et al. (eds.) [38], pp. 1–16
31. Günlük, O., Linderoth, J.: Perspective relaxation of mixed integer nonlinear pro-
grams with indicator variables. Mathematical Programming, Series B (to appear
2009)
32. Hemmecke, R., Köppe, M., Lee, J., Weismantel, R.: Nonlinear integer program-
ming. IBM Research Report RC24820, IBM, Yorktown Heights, NY (December
2008); Juenger, M., Liebling, T., Naddef, D., Nemhauser, G., Pulleyblank, W.,
Reinelt, G., Rinaldi, G., Wolsey, L.: 50 Years of Integer Programming 1958–2008:
The Early Years and State-of-the-Art Surveys. Springer, Heidelberg (to appear
2010), ISBN 3540682740.
33. Jeroslow, R.: There cannot be any algorithm for integer programming with
quadratic constraints. Operations Research 21, 221–224 (1973)
34. Letchford, A.N., Pokutta, S., Schulz, A.S.: On the membership problem for the
{0, 1/2}-closure. Working paper, Lancaster University (2009)
35. Letchford, A.N., Sørensen, M.M.: Binary positive semidefinite matrices and asso-
ciated integer polytopes. In: Lodi, et al. (eds.) [38], pp. 125–139
36. Leyffer, S., Linderoth, J.T., Luedtke, J., Miller, A., Munson, T.: Applications and
algorithms for mixed integer nonlinear programming. Journal of Physics: Confer-
ence Series 180 (2009)
37. Leyffer, S., Sartenaer, A., Wanufelle, E.: Branch-and-refine for mixed-integer non-
convex global optimization. In: Preprint ANL/MCS-P1547-0908, Argonne National
Laboratory, Mathematics and Computer Science Division, Argonne, IL (September
2008)
38. Lodi, A., Panconesi, A., Rinaldi, G. (eds.): IPCO 2008. LNCS, vol. 5035. Springer,
Heidelberg (2008)
39. Richard, J.-P.P., Tawarmalani, M.: Lifting inequalities: a framework for generating
strong cuts for nonlinear programs. Mathematical Programming 121, 61–104 (2010)
40. Saxena, A., Bonami, P., Lee, J.: Disjunctive cuts for non-convex mixed integer
quadratically constrained programs. In: Lodi, et al. (eds.) [38], pp. 17–33
41. Schrijver, A.: On cutting planes. Annals of Discrete Mathematics 9, 291–296 (1980);
Combinatorics 79 (Proc. Colloq., Univ. Montréal, Montreal, Que., 1979), Part II
(1979)
42. Schrijver, A.: Theory of linear and integer programming. John Wiley & Sons, Inc.,
New York (1986)
43. Stubbs, R.A., Mehrotra, S.: A branch-and-cut method for 0-1 mixed convex pro-
gramming. Mathematical Programming 86, 515–532 (1999)
44. Vielma, J.P.: A constructive characterization of the split closure of a mixed integer
linear program. Operations Research Letters 35, 29–35 (2007)
A Pumping Algorithm for Ergodic Stochastic
Mean Payoff Games with Perfect Information
This research was partially supported by DIMACS, Center for Discrete Mathematics
and Theoretical Computer Science, Rutgers University, and by the Scientific Grant-
in-Aid from Ministry of Education, Science, Sports and Culture of Japan.
1 Introduction
1.1 BWR-Games
We consider two-person zero-sum stochastic games with perfect information and
mean payoff: Let G = (V, E) be a digraph whose vertex-set V is partitioned
into three subsets V = VB ∪ VW ∪ VR that correspond to black, white, and
random positions, controlled, respectively, by two players, Black (the minimizer)
and White (the maximizer), and by nature. We also fix a local reward function
r : E → R, and probabilities p(v, u) for all arcs (v, u) going out of v ∈ VR .
Vertices v ∈ V and arcs e ∈ E are called positions and moves, respectively. In
a personal position v ∈ VW or v ∈ VB the corresponding player White or Black
selects an arc (v, u), while in a random position v ∈ VR a move (v, u) is chosen
with the given probability p(v, u). In all cases Black pays White the reward
r(v, u).
From a given initial position v0 ∈ V the game produces an infinite walk (called
a play). White’s objective is to maximize the limiting mean payoff
    c = lim inf_{n→∞} (Σ_{i=0}^{n} b_i) / (n + 1),                    (1)

where b_i is the reward incurred at step i of the play, while the objective of Black
is the opposite, that is, to minimize lim sup_{n→∞} (Σ_{i=0}^{n} b_i) / (n + 1).
For this model it was shown in [7] that a saddle point exists in pure positional
uniformly optimal strategies. Here “pure” means that the choice of a move (v, u)
in a personal position v ∈ VB ∪ VW is deterministic, not random; “positional”
means that this choice depends solely on v, not on previous positions or moves;
finally, “uniformly optimal” means that it does not depend on the initial position
v0 , either. The results and methods in [7] are similar to those of Gillette [17];
see also Liggett and Lippman [28]: First, we analyze the so-called discounted
version, in which the payoff is discounted by a factor β^i at step i, giving the
effective payoff a_β = (1 − β) Σ_{i=0}^{∞} β^i b_i, and then we proceed to the limit as the
discount factor β ∈ [0, 1) tends to 1.
This class of BWR-games was introduced in [19]; see also [10]. It was
recently shown in [7] that the BWR-games and classical Gillette games [17] are
polynomially equivalent. The special case when there are no random positions,
VR = ∅, is known as cyclic, or mean payoff, or BW-games. They were introduced
for the complete bipartite digraphs in [32,31], for all (not necessarily complete)
bipartite digraphs in [15], and for arbitrary digraphs in [19]. A more special
case was considered extensively in the literature under the name of parity games
[2,3,11,20,22,24], and later generalized also to include random nodes in [10]. A
BWR-game reduces to a minimum mean cycle problem in case VW = VR = ∅;
see, for example, [25]. If one of the sets VB or VW is empty, we obtain a Markov
decision process; see, for example, [30]. Finally, if both are empty, VB = VW = ∅,
we get a weighted Markov chain.
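To make the simplest of these special cases concrete, here is a small sketch (ours, not from the paper) of Karp's minimum-mean-cycle dynamic program [25], which handles the case VW = VR = ∅; the multi-source initialization d_0 ≡ 0 is a standard variant of Karp's algorithm:

def min_cycle_mean(vertices, arcs):
    # arcs[v] = list of (u, reward) pairs; the digraph is assumed strongly connected.
    n = len(vertices)
    INF = float("inf")
    # d[k][v] = minimum total reward of a walk with exactly k arcs ending at v.
    d = [{v: (0.0 if k == 0 else INF) for v in vertices} for k in range(n + 1)]
    for k in range(1, n + 1):
        for v in vertices:
            for (u, r) in arcs[v]:
                if d[k - 1][v] + r < d[k][u]:
                    d[k][u] = d[k - 1][v] + r
    # Karp's formula: min over v of max over k of (d_n(v) - d_k(v)) / (n - k).
    best = INF
    for v in vertices:
        if d[n][v] < INF:
            best = min(best, max((d[n][v] - d[k][v]) / (n - k)
                                 for k in range(n) if d[k][v] < INF))
    return best

# A 2-cycle with rewards 1 and 3 has minimum cycle mean (1 + 3)/2 = 2.
print(min_cycle_mean(["a", "b"], {"a": [("b", 1.0)], "b": [("a", 3.0)]}))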
It was noted in [9] that “parlor games”, like Backgammon (and even Chess),
can be solved in pure positional uniformly optimal strategies, based on their
BWR-model.
In the special case of a BWR-game, when all rewards are zero except at a
single node t called the terminal, at which there is a self-loop with reward 1,
we obtain the so-called simple stochastic games (SSG), introduced by Condon
[12,13] and considered in several papers [18,20]. In these games, the objective
of White is to maximize the probability of reaching the terminal, while Black
wants to minimize this probability. Recently, it was shown that Gillette games
(and hence BWR-games by [7]) are equivalent to SSGs under polynomial-time
reductions [1]. Thus, by recent results of Björklund and Vorobyov [5] and Halman
[20], all these games can be solved in randomized strongly subexponential time
2^{O(√(n_d log n_d))}, where n_d = |VB| + |VW| is the number of deterministic vertices.
Let us note that several pseudo-polynomial and subexponential algorithms exist
for BW-games [19,26,33,6,4,20,36]; see also [14] for a so-called policy iteration
method, and [24] for parity games.
Besides their many applications (see e.g. [29,23]), all these games are of in-
terest to Complexity Theory: Karzanov and Lebedev [26] (see also [36]) proved
that the decision problem “whether the value of a BW-game is positive” is in
the intersection of NP and co-NP. Yet, no polynomial algorithm is known for
these games; see, e.g., the recent survey by Vorobyov [35]. A similar complexity
claim can be shown to hold for SSGs and BWR-games; see [1,7].
While there are numerous pseudo-polynomial algorithms known for the BW-
case, it is a challenging open question whether a pseudo-polynomial algorithm
exists for SSGs or BWR-games.
This is not a complete surprise, since this case is at least as hard as SSGs, for
which no pseudo-polynomial algorithm is known.
Theorem 1. Consider a BWR-game with k random nodes, a total of n vertices,
and integer rewards in the range [−R, R], and assume that all probabilities are
rational numbers whose common denominator is bounded by W. Then there is an
algorithm that runs in time n^{O(k)} W^{O(k²)} R log(nRW) and either brings the game by
a potential transformation to canonical form, or proves that it is non-ergodic.
Let us remark that the ergodic case is frequent in applications. For instance,
it is the case when G = (VW ∪ VB ∪ VR , E) is a complete tripartite digraph
(where p(v, u) > 0 for all v ∈ VR and (v, u) ∈ E); see Section 3 for more general
sufficient conditions.
Theorem 1 states that our algorithm is pseudo-polynomial if the number
of random nodes is fixed. As far as we know, this is the first algorithm with
such a guarantee (in comparison, for example, to strategy improvement meth-
ods [4,21,34], for which exponential lower bounds are known [16]; it is worth
mentioning that the algorithm of [21] also works only for the ergodic case). In
fact, we are not aware of any previous results bounding the running time of an
algorithm for a class of BWR-games in terms of the number of random nodes,
except for [18] which shows that simple stochastic games on k random nodes can
be solved in time O(k!(|V ||E| + L)), where L is the maximum bit length of a
transition probability. It is worth remarking here that even though BWR-games
are polynomially reducible to simple stochastic games, under this reduction the
number of random nodes k becomes a polynomial in n, even if the original BWR-
game has constantly many random nodes. In particular, the result in [18] does
not imply a bound similar to that of Theorem 1 for general BWR-games.
One should also contrast the bound in Theorem 1 with the subexponential
bounds in [20]: roughly, the algorithm of Theorem 1 will be more efficient if |VR|
is o((|VW| + |VB|)^{1/4}) (assuming that W and R are polynomials in n). However,
our algorithm could be practically much faster since it can stop much earlier
than its estimated worst-case running time (unlike the subexponential algorithms
[20], or those based on dynamic programming [36]). In fact, as our preliminary
experiments indicate, to approximate the value of a random game on up to
15,000 nodes within an additive error ε = 0.001, the algorithm takes no more
than a few hundred iterations, even if the maximum reward is very large. One
more desirable property of this algorithm is that it is of the certifying type (see
e.g. [27]), in the sense that, given an optimal pair of strategies, the vector of
potentials provided by the algorithm can be used to verify optimality in linear
time (otherwise verifying optimality requires solving two linear programs).
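To illustrate this certifying property, the sketch below (our own, with assumed data structures; the sign convention r_x(v, u) = r(v, u) + x(u) − x(v) for the potential transformation is one common choice) checks in linear time that a potential vector brings every local value to within ε of a common value μ:

def certifies(game, x, mu, eps=1e-9):
    # game: dict v -> (kind, arcs), kind in {'W', 'B', 'R'},
    # arcs = list of (u, reward, prob); prob is used only at random nodes.
    for v, (kind, arcs) in game.items():
        rx = [(r + x[u] - x[v], p) for (u, r, p) in arcs]  # transformed rewards
        if kind == 'W':
            val = max(r for r, _ in rx)          # White maximizes
        elif kind == 'B':
            val = min(r for r, _ in rx)          # Black minimizes
        else:
            val = sum(r * p for r, p in rx)      # nature: expectation
        if abs(val - mu) > eps:
            return False
    return True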
The algorithm iteratively modifies a vector of potentials until the locally optimal rewards at different nodes become sufficiently close to each
other, or a proof of non-ergodicity is obtained in the form of a certain partition
of the nodes. The upper bound on the running time consists of three technical
parts. The first one is to show that if the number of iterations becomes too large,
then there is a large enough potential gap to ensure an ergodic partition. In the
second part, we show that the range of potentials can be kept sufficiently small
throughout the algorithm, namely ‖x*‖_∞ ≤ nRk(2W)^k, and hence the range of
the transformed rewards does not explode. The third part concerns the required
accuracy. It can be shown that it is enough in our algorithm to get the value of
the game within an accuracy of
    ε = 1 / ( n^{2(k+1)} k^{2k} (2W)^{4k+2k²+2} ),                    (2)
in order to guarantee that it is equal to the exact value. As far as we know, such
a bound in terms of k is new, and it could be of independent interest. We also
show the lower bound W^{Ω(k)} on the running time of the algorithm of Theorem 1
by providing an instance of the problem with only random nodes.
The paper is organized as follows. In the next section, we formally define
BWR-games, canonical forms, and state some useful propositions. In Section 3,
we give a sufficient condition for the ergodicity of a BWR-game, which will be
used as one possible stopping criterion in our algorithm. We give the algorithm in
Section 4.1, and prove that it converges in Section 4.2. In Section 5, we show that this
convergence proof can, in fact, be turned into a quantitative statement giving
the precise bounds stated in Theorem 1. The last section gives a lower bound
example for the algorithm. Due to lack of space, most of the proofs are omitted
(see [8] for details).
2 Preliminaries
2.1 BWR-Games
where p* : V → [0, 1] is the limiting distribution for G_s starting from v_0. Doing
this for all possible strategies of Black and White, we obtain a matrix game
C_{v_0} : S_W × S_B → R, with entries C_{v_0}(s_W, s_B) defined by (3).
In particular, the properties above imply that the induced subgraphs G[V^W]
and G[V^B] have no terminal vertex.
A partition Π_r : V = V^W ∪ V^B ∪ V^R satisfying (i), (ii), and (iii) will be called
a contra-ergodic partition for the digraph G = (VW ∪ VB ∪ VR, E).
Theorem 2. A digraph G is ergodic iff it has no contra-ergodic partition.
The “only if” part can be strengthened as follows:
Proposition 2. Given a BWR-game G whose graph has a contra-ergodic partition,
if m(v) > m(u) for every v ∈ V^W, u ∈ V^B, then μ(v) > μ(u) for every
v ∈ V^W, u ∈ V^B.
Definition 1. A contra-ergodic decomposition of G is a contra-ergodic partition
Π_r : V = V^W ∪ V^B ∪ V^R such that m(v) > m(u) for every v ∈ V^W and u ∈ V^B.
By Proposition 2, G is not ergodic whenever it has such a decomposition.
    N = 8n²R_x / (M_x θ^{k−1}) + 1,                    (7)
this will be turned into a quantitative argument with the precise bound on the
running time. Yet, in Section 6, we will show that this time can be exponential
already for R-games.
Algorithm 1. PUMP(G, ε)
Input: A BWR-game G = (G = (V, E), P, r) and a desired accuracy ε
Output: a potential x : V → R s.t. |m_x(v) − m_x(u)| ≤ ε for all u, v ∈ V if the game
is ergodic, and a contra-ergodic decomposition otherwise
1: let x_0(v) := x(v) := 0 for all v ∈ V; i := 1
2: let t0 , t1 , . . . , t4 , and N be as defined by (6) and (7)
3: while i ≤ N do
4: if Mx ≤ ε then
5: return x
6: end if
7: δ := max{δ | m_x^δ(v) ≥ t_1 for all v ∈ V_{x_0}[t_2, t_4] and m_x^δ(v) ≤ t_3 for all v ∈ V_{x_0}[t_0, t_2)}
8: if δ = ∞ then
9: return the contra-ergodic partition V_{x_0}[t_0, t_2) ∪ V_{x_0}[t_2, t_4]
10: end if
11: x(v) := x(v) − δ for all v ∈ Vx0 [t2 , t4 ]
12: if Vx0 [t0 , t1 ) = ∅ or Vx0 (t3 , t4 ] = ∅ then
13: x := x0 :=REDUCE-POTENTIALS(G, x); i := 1
14: recompute the thresholds t0 , t1 , . . . , t4 and N using (6) and (7)
15: else
16: i := i + 1;
17: end if
18: end while
19: V^W ∪ V^B ∪ V^R := FIND-PARTITION(G, x)
20: return the contra-ergodic partition V^W ∪ V^B ∪ V^R
One problem that arises during the pumping procedure is that the potentials
can increase exponentially in the number of phases, making our bounds on the
number of iterations per phase also exponential in n. For the BW-case Pisaruk
[33] solved this problem by giving a procedure that reduces the range of the
potentials after each round, while keeping all its desired properties needed for the
running time analysis. Pisaruk’s potential reduction procedure can be thought
of as a combinatorial procedure for finding an extreme point of a polyhedron,
given a point in it. Indeed, given a BWR-game and a potential x, let us assume
without loss of generality, by shifting the potentials if necessary, that x ≥ 0, and
let E′ = {(v, u) ∈ E : r_x(v, u) ∈ [m_x^−, m_x^+], v ∈ VB ∪ VW}, where r_x is the
transformed local reward function.
[Figure: the lower-bound instance of Section 6, a path u_l, …, u_1, u_0 = v_0, v_1, …, v_l
with self-loops at u_l and v_l (rewards 1 and −1); the arc probabilities 1/(W+1),
W/(W+1), and 1/2 are as specified below.]
We show now that the execution time of the algorithm, in the worst case, can
be exponential in the number of random nodes k, already for weighted Markov
chains, that is, for R-games. Consider the following example. Let G = (V, E)
be a digraph on k = 2l + 1 vertices u_l, …, u_1, u_0 = v_0, v_1, …, v_l, with the
set of arcs shown in the figure above.
Let W ≥ 1 be an integer. All nodes are random, with the following transition
probabilities: p(u_l, u_l) = p(v_l, v_l) = 1 − 1/(W+1), p(u_0, u_1) = p(u_0, v_1) = 1/2,
p(u_{i−1}, u_i) = p(v_{i−1}, v_i) = 1 − 1/(W+1) for i = 2, …, l, and p(u_i, u_{i−1}) =
p(v_i, v_{i−1}) = 1/(W+1) for i = 1, …, l. The local rewards are zero on every arc,
except for r(u_l, u_l) = −r(v_l, v_l) = 1. Clearly this Markov chain consists of a
single recurrent class, and it is easy to verify that the limiting distribution p* is
as follows:

    p*(u_0) = (W − 1) / ((W + 1)W^l − 2),
    p*(u_i) = p*(v_i) = W^{i−1}(W² − 1) / (2((W + 1)W^l − 2))   for i = 1, …, l.
References
1. Andersson, D., Miltersen, P.B.: The complexity of solving stochastic games on
graphs. In: Dong, Y., Du, D.-Z., Ibarra, O. (eds.) ISAAC 2009. LNCS, vol. 5878,
pp. 112–121. Springer, Heidelberg (2009)
2. Beffara, E., Vorobyov, S.: Adapting Gurvich-Karzanov-Khachiyan’s algorithm for
parity games: Implementation and experimentation. Technical Report 2001-020,
Department of Information Technology, Uppsala University (2001),
https://www.it.uu.se/research/reports/#2001
3. Beffara, E., Vorobyov, S.: Is randomized Gurvich-Karzanov-Khachiyan’s algorithm
for parity games polynomial? Technical Report 2001-025, Department of Informa-
tion Technology, Uppsala University (2001),
https://www.it.uu.se/research/reports/#2001
4. Björklund, H., Sandberg, S., Vorobyov, S.: A combinatorial strongly sub-
exponential strategy improvement algorithm for mean payoff games. DIMACS
Technical Report 2004-05, DIMACS, Rutgers University (2004)
5. Björklund, H., Vorobyov, S.: Combinatorial structure and randomized subexponen-
tial algorithms for infinite games. Theoretical Computer Science 349(3), 347–360
(2005)
6. Björklund, H., Vorobyov, S.: A combinatorial strongly sub-exponential strategy im-
provement algorithm for mean payoff games. Discrete Applied Mathematics 155(2),
210–229 (2007)
7. Boros, E., Elbassioni, K., Gurvich, V., Makino, K.: Every stochastic game with
perfect information admits a canonical form. RRR-09-2009, RUTCOR. Rutgers
University (2009)
8. Boros, E., Elbassioni, K., Gurvich, V., Makino, K.: A pumping algorithm for er-
godic stochastic mean payoff games with perfect information. RRR-19-2009, RUT-
COR. Rutgers University (2009)
9. Boros, E., Gurvich, V.: Why chess and backgammon can be solved in pure posi-
tional uniformly optimal strategies? RRR-21-2009, RUTCOR. Rutgers University
(2009)
10. Chatterjee, K., Henzinger, T.A.: Reduction of stochastic parity to stochastic mean-
payoff games. Inf. Process. Lett. 106(1), 1–7 (2008)
11. Chatterjee, K., Jurdziński, M., Henzinger, T.A.: Quantitative stochastic parity
games. In: SODA ’04: Proceedings of the fifteenth annual ACM-SIAM symposium
on Discrete algorithms, pp. 121–130. Society for Industrial and Applied Mathe-
matics, Philadelphia (2004)
12. Condon, A.: The complexity of stochastic games. Information and Computation 96,
203–224 (1992)
13. Condon, A.: An algorithm for simple stochastic games. In: Advances in computa-
tional complexity theory. DIMACS series in discrete mathematics and theoretical
computer science, vol. 13 (1993)
14. Dhingra, V., Gaubert, S.: How to solve large scale deterministic games with mean
payoff by policy iteration. In: Valuetools ’06: Proceedings of the 1st international
conference on Performance evaluation methodologies and tools, vol. 12. ACM, New
York (2006)
15. Ehrenfeucht, A., Mycielski, J.: Positional strategies for mean payoff games. Inter-
national Journal of Game Theory 8, 109–113 (1979)
16. Friedmann, O.: An exponential lower bound for the parity game strategy improve-
ment algorithm as we know it. In: Symposium on Logic in Computer Science, pp.
145–156 (2009)
17. Gillette, D.: Stochastic games with zero stop probabilities. In: Dresher, M., Tucker,
A.W., Wolfe, P. (eds.) Contribution to the Theory of Games III. Annals of Mathe-
matics Studies, vol. 39, pp. 179–187. Princeton University Press, Princeton (1957)
18. Gimbert, H., Horn, F.: Simple stochastic games with few random vertices are
easy to solve. In: Amadio, R.M. (ed.) FOSSACS 2008. LNCS, vol. 4962, pp. 5–
19. Springer, Heidelberg (2008)
19. Gurvich, V., Karzanov, A., Khachiyan, L.: Cyclic games and an algorithm to find
minimax cycle means in directed graphs. USSR Computational Mathematics and
Mathematical Physics 28, 85–91 (1988)
20. Halman, N.: Simple stochastic games, parity games, mean payoff games and dis-
counted payoff games are all LP-type problems. Algorithmica 49(1), 37–50 (2007)
21. Hoffman, A.J., Karp, R.M.: On nonterminating stochastic games. Management
Science, Series A 12(5), 359–370 (1966)
22. Jurdziński, M.: Deciding the winner in parity games is in UP ∩ co-UP. Inf. Process.
Lett. 68(3), 119–124 (1998)
23. Jurdziński, M.: Games for Verification: Algorithmic Issues. PhD thesis, Faculty of
Science, University of Aarhus, Denmark (2000)
24. Jurdziński, M., Paterson, M., Zwick, U.: A deterministic subexponential algorithm
for solving parity games. In: SODA ’06: Proceedings of the seventeenth annual
ACM-SIAM symposium on Discrete algorithm, pp. 117–123. ACM, New York
(2006)
25. Karp, R.M.: A characterization of the minimum cycle mean in a digraph. Discrete
Math. 23, 309–311 (1978)
26. Karzanov, A.V., Lebedev, V.N.: Cyclical games with prohibition. Mathematical
Programming 60, 277–293 (1993)
27. Kratsch, D., McConnell, R.M., Mehlhorn, K., Spinrad, J.P.: Certifying algorithms
for recognizing interval graphs and permutation graphs. In: SODA ’03: Proceedings
of the fourteenth annual ACM-SIAM symposium on Discrete algorithms, pp. 158–
167. Society for Industrial and Applied Mathematics, Philadelphia (2003)
28. Liggett, T.M., Lippman, S.A.: Stochastic games with perfect information and time-
average payoff. SIAM Review 11(4), 604–607 (1969)
29. Littman, M.L.: Algorithm for sequential decision making, CS-96-09. PhD thesis,
Dept. of Computer Science, Brown Univ., USA (1996)
30. Mine, H., Osaki, S.: Markovian decision process. American Elsevier Publishing Co.,
New York (1970)
31. Moulin, H.: Extension of two person zero sum games. Journal of Mathematical
Analysis and Application 5(2), 490–507 (1976)
32. Moulin, H.: Prolongement des jeux à deux joueurs de somme nulle. Bull. Soc. Math.
France, Memoire 45 (1976)
33. Pisaruk, N.N.: Mean cost cyclical games. Mathematics of Operations Re-
search 24(4), 817–828 (1999)
34. Vöge, J., Jurdzinski, M.: A discrete strategy improvement algorithm for solving
parity games. In: Emerson, E.A., Sistla, A.P. (eds.) CAV 2000. LNCS, vol. 1855,
pp. 202–215. Springer, Heidelberg (2000)
35. Vorobyov, S.: Cyclic games and linear programming. Discrete Applied Mathemat-
ics 156(11), 2195–2231 (2008)
36. Zwick, U., Paterson, M.: The complexity of mean payoff games on graphs. Theo-
retical Computer Science 158(1-2), 343–359 (1996)
On Column-Restricted and Priority Covering
Integer Programs
1 Introduction
In a 0,1-covering integer program (0,1-CIP, in short), we are given a constraint
matrix A ∈ {0,1}^{m×n}, demands b ∈ Z^m_+, non-negative costs c ∈ Z^n_+, and upper
bounds d ∈ Z^n_+, and the goal is to solve the following integer linear program:

    min{ c^T x : Ax ≥ b, 0 ≤ x ≤ d, x integer }.
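For concreteness, here is a brute-force sketch of this integer program (ours, purely illustrative; real instances of any size would go to an ILP solver):

from itertools import product

def solve_01_cip(A, b, c, d):
    # min{ c^T x : Ax >= b, 0 <= x <= d, x integer } by enumeration.
    m, n = len(A), len(c)
    best = None
    for x in product(*(range(dj + 1) for dj in d)):
        if all(sum(A[i][j] * x[j] for j in range(n)) >= b[i] for i in range(m)):
            cost = sum(c[j] * x[j] for j in range(n))
            if best is None or cost < best[0]:
                best = (cost, x)
    return best

# A toy set multi-cover instance: two elements (rows), three sets (columns).
print(solve_01_cip(A=[[1, 1, 0], [0, 1, 1]], b=[1, 2], c=[1, 2, 1], d=[1, 2, 2]))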
Problems that can be expressed as 0,1-CIPs are essentially equivalent to set
multi-cover problems, where sets correspond to columns and elements correspond
to rows. This directly implies that 0,1-CIPs are rather well understood in terms
of approximability: the class admits efficient O(log n) approximation algorithms
and this is best possible unless NP = P. Nevertheless, in many cases one can
get better approximations by exploiting the structure of matrix A. For example,
it is well known that whenever A is totally unimodular (TU) (e.g., see [18]), the
Supported by NSERC grant no. 288340 and by an Early Research Award.
We believe that priority covering problems are interesting in their own right,
and they arise quite naturally in covering applications where one wants to model
quality of service (QoS) or priority restrictions. For instance, in the tree cover
problem defined above, suppose each segment j has a quality of service (QoS)
or priority supply sj associated with it and suppose each edge e has a QoS or
priority demand πe associated with it. We say that a segment j covers e iff j
contains e and the priority supply of j exceeds the priority demand of e. The
goal is to find a minimum cost subset of segments that covers every edge. This
is the priority tree cover problem.
Besides being a natural covering problem to study, we show that the priority
tree cover problem is a special case of a classical geometric covering problem:
that of finding a minimum cost cover of points by axis-parallel rectangles in 3
dimensions. Finding a constant factor approximation algorithm for this problem,
even when the rectangles have uniform cost, is a long standing open problem.
We show that although the tree cover problem is polynomial-time solvable, the
priority tree cover problem is APX-hard. We complement this with a factor-2 approx-
imation for the problem. Furthermore, we present constant upper bounds for
the integrality gap of this PCIP in a number of special cases, implying constant
upper bounds on the corresponding CCIPs in these special cases. We refer the
reader to Section 1.2 for a formal statement of our results, which we give after
summarizing works related to our paper.
bounds on the variables, Srinivasan [19] gave an O(1 + log α)-approximation to the
problem (where α is the dilation as before). Later on, Kolliopoulos and Young
[15] obtained the same approximation factor while respecting the upper bounds.
However, these algorithms did not give any better results when special structure
of the constraint matrix was known. On the hardness side, Trevisan [21] showed
that it is NP-hard to obtain a (log α − O(log log α))-approximation algorithm even
for 0,1-CIPs.
The most relevant work to this paper is that of Kolliopoulos [12]. The author
studies CCIPs which satisfy a rather strong assumption, called the no bottleneck
assumption, that the supply of any column is smaller than the demand of any
row. Kolliopoulos [12] shows that if one is allowed to violate the upper bounds
by a multiplicative constant, then the integrality gap of the CCIP is within a
constant factor of that of the original 0,1-CIP1 . As the author notes such a
violation is necessary; otherwise the CCIP has unbounded integrality gap. If one
is not allowed to violated upper bounds, nothing better than the result of [15]
is known for the special case of CCIPs.
Our work on CCIPs parallels a large body of work on column-restricted
packing integer programs (CPIPs). Assuming the no-bottleneck assumption, Kol-
liopoulos and Stein [14] show that CPIPs can be approximated asymptotically
as well as the corresponding 0,1-PIPs. Chekuri et al. [7] subsequently improve
the constants in the result from [14]. These results imply constant factor approx-
imations for the column-restricted tree packing problem under the no-bottleneck
assumption. Without the no-bottleneck assumption, however, only polylogarith-
mic approximation is known for the problem [6].
The only work on priority versions of covering problems that we are aware
of is due to Charikar, Naor and Schieber [5] who studied the priority Steiner
tree and forest problems in the context of QoS management in a network multi-
casting application. Charikar et al. present a O(log n)-approximation algorithm
for the problem, and Chuzhoy et al. [8] later show that no efficient o(log log n)
approximation algorithm can exist unless NP ⊆ DTIME(n^{log log log n}) (n is the
number of vertices).
To the best of our knowledge, the column-restricted or priority versions of the
line and tree cover problems have not been studied. The best approximation
algorithm known for both is the O(log n) factor implied by the results of [15]
stated above. However, upon completion of our work, Nitish Korula [16] pointed
out to us that a 4-approximation for column-restricted line cover is implicit in
a result of Bar-Noy et al. [2]. We remark that their algorithm is not LP-based,
although our general result on CCIPs is.
supremum of the ratio of optimal IP value to optimal LP value, taken over all
non-negative integral vectors b, c, and d. The integrality gap of an IP captures
how much the integrality constraint affects the optimum, and is an indicator of
the strength of a linear programming formulation.
CCIPs: Suppose the CCIP is Cov(A[s], b, c, d). We make the following two as-
sumptions about the integrality gaps of the 0,1 covering programs, both the
original 0,1-CIP and the priority version of the 0,1-CIP.
Assumption 1. The integrality gap of the original 0,1-CIP is γ ≥ 1. Specifi-
cally, for any non-negative integral vectors b, c, and d, if the canonical LP re-
laxation to the CIP has a fractional solution x, then one can find in polynomial
time an integral feasible solution to the CIP of cost at most γ · cT x. We stress
here that the entries of b, c, d could be 0 as well as ∞.
Assumption 2. The integrality gap of the PCIP is ω ≥ 1. Specifically, for any
non-negative integral vectors s, π, c, if the canonical LP relaxation to the PCIP
has a fractional solution x, then one can find in polynomial time, an integral
feasible solution to the PCIP of cost at most ω · cT x.
We give an LP-based approximation algorithm for solving CCIPs. Since the
canonical LP relaxation of a CCIP can have unbounded integrality gap, we
strengthen it by adding a set of valid constraints called the knapsack cover
constraints. We show that the integrality gap of this strengthened LP is O(γ +ω),
and can be used to give a polynomial time approximation algorithm.
Theorem 1. Under Assumptions 1 and 2, there is a (24γ + 8ω)-approximation
algorithm for column-restricted CIPs.
Knapsack cover constraints to strengthen LP relaxations were introduced in
[1,10,22]; Carr et al. [3] were the first to employ them in the design of approximation
algorithms. The paper of Kolliopoulos and Young [15] also uses these to get their
result on general CIPs.
The main technique in the design of algorithms for column-restricted problems
is grouping-and-scaling developed by Kolliopoulos and Stein [13,14] for packing
problems, and later used by Kolliopoulos [12] in the covering context. In this
technique, the columns of the matrix are divided into groups of ‘close-by’ supply
values; in a single group, the supply values are then scaled to be the same; for
a single group, the integrality gap of the original 0,1-CIP is invoked to get an
integral solution for that group; the final solution is a ‘union’ of the solutions
over all groups.
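The grouping step is simple to state in code; the sketch below (ours) illustrates only the bucketing of columns by supply value rounded down to a power of 2:

import math
from collections import defaultdict

def group_columns_by_supply(supplies):
    # bucket column j by the largest power of 2 that is at most s_j
    groups = defaultdict(list)
    for j, s in enumerate(supplies):
        groups[2.0 ** math.floor(math.log(s, 2))].append(j)
    return dict(groups)

print(group_columns_by_supply([5.0, 1.2, 8.0, 0.3, 6.5]))
# -> {4.0: [0, 4], 1.0: [1], 8.0: [2], 0.25: [3]}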
There are two issues in applying the technique to the new strengthened LP
relaxation of our problem. Firstly, although the original constraint matrix is
column-restricted, the new constraint matrix with the knapsack cover constraints
is not. Secondly, unless additional assumptions are made, the current grouping-
and-scaling analysis doesn’t give a handle on the degree of violation of the upper
bound constraints. This is the reason why Kolliopoulos [12] needs the strong
no-bottleneck assumption.
We get around the first difficulty by grouping the rows as well, into those that
get most of their coverage from columns not affected by the knapsack constraints,
and the remainder. On the first group of rows, we apply a subtle modification to
the vanilla grouping-and-scaling analysis and obtain a O(γ) approximate feasible
solution satisfying these rows; we then show that one can treat the remainder
of the rows as a PCIP and get a O(ω) approximate feasible solution satisfying
them, using Assumption 2. Combining the two gives the O(γ + ω) factor. The
full details are given in Section 2.
We stress here that apart from the integrality gap assumptions on the 0,1-
CIPs, we do not make any other assumption (like the no-bottleneck assumption).
In fact, we can use the modified analysis of the grouping-and-scaling technique to
get a similar result as [12] for approximating CCIPs violating the upper-bound
constraints, under a weaker assumption than the no-bottleneck assumption. The
no-bottleneck assumption states that the supply of any column is less than the
demand of any row. In particular, even though a column has entry 0 on a certain
row, its supply needs to be less than the demand of that row. We show that if we
weaken the no-bottleneck assumption to assuming that the supply of a column
j is less than the demand of any row i only if A[s]ij is positive, a similar result
can be obtained via our modified analysis.
Priority Covering Problems. In the following, we use PLC and PTC to refer
to the priority versions of the line cover and tree cover problems, respectively.
Recall that the constraint matrices for line and tree cover problems are totally
unimodular, and the integrality gap of the corresponding 0,1-covering problems is
therefore 1 in both cases. It is interesting to note that the 0,1-coefficient matrices
for PLC and PTC are not totally unimodular in general. The following integrality
gap bound is obtained via a primal-dual algorithm.
Theorem 3. The canonical LP for priority line cover has an integrality gap of
at least 3/2 and at most 2.
In the case of tree cover, we obtain constant upper bounds on the integrality gap
for the case c = 1, that is, for the minimum cardinality version of the problem.
We believe that the PCIP for the tree cover problem with general costs also has
a constant integrality gap. On the negative side, we can show an integrality gap
of at least e/(e − 1).
We obtain the upper bound by taking a given PTC instance and a fractional so-
lution to its canonical LP, and decomposing it into a collection of PLC instances
with corresponding fractional solutions, with the following two properties. First,
the total cost of the fractional solutions of the PLC instances is within a constant
of the cost of the fractional solution of the PTC instance. Second, the union of
integral solutions to the PLC instances gives an integral solution to the PTC
instance. The upper bound follows from Theorem 3. Using Theorem 1, we get
the following as an immediate corollary.
Theorem 6. PTC is APX-hard, even when all the costs are unit.
Theorem 8. The priority tree covering problem is a special case of the rectangle
cover problem in 3-dimensions.
Due to space restrictions, we omit many proofs. A full version of the paper is
available [4].
min{ c^T x : A[s]x ≥ b, 0 ≤ x ≤ d }
by adding valid knapsack cover constraints. In the following we use C for the set
of columns and R for the set of rows of A.
is valid for the set of all integer solutions x for Cov(A[s], b, c, d). Adding the set
of all KC inequalities yields the following stronger LP formulation. We note
that the LP is not column-restricted, in that different values can appear in the
same column of the new constraint matrix.
    opt_P := min Σ_{j∈C} c_j x_j                                   (P)
    s.t.  Σ_{j∈C} A^F[s]_{ij} x_j ≥ b^F_i,   ∀F ⊆ C, ∀i ∈ R        (3)
          0 ≤ x_j ≤ d_j,                     ∀j ∈ C
It is not known whether (P) can be solved in polynomial time. For α ∈ (0, 1),
call a vector x∗ α-relaxed if its cost is at most optP , and if it satisfies (3) for
F = {j ∈ C : x∗j ≥ αdj }. An α-relaxed solution to (P) can be computed
efficiently for any α. To see this note that one can check whether a candidate
solution satisfies (3) for a set F ; we are done if it does, and otherwise we have
found an inequality of (P) that is violated, and we can make progress via the
ellipsoid method. Details can be found in [3] and [15].
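This check can be sketched as follows (ours; we assume the standard knapsack-cover definitions from [3], b^F_i = max(0, b_i − Σ_{j∈F} s_ij d_j) and A^F[s]_ij = min(s_ij, b^F_i) for j ∉ F, since the excerpt does not restate them):

def kc_violation(s, b, d, x, alpha):
    # Check inequality (3) for the single set F = {j : x_j >= alpha * d_j}.
    m, n = len(s), len(d)
    F = {j for j in range(n) if x[j] >= alpha * d[j]}
    for i in range(m):
        bF = max(0.0, b[i] - sum(s[i][j] * d[j] for j in F))
        lhs = sum(min(s[i][j], bF) * x[j] for j in range(n) if j not in F)
        if lhs < bF - 1e-9:
            return i   # violated row: a separating cut for the ellipsoid method
    return None        # x is an alpha-relaxed solution for this F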
We fix an α ∈ (0, 1), specifying its precise value later. Compute an α-relaxed
solution, x*, for (P), and let F = {j ∈ C : x*_j ≥ αd_j}. Define x̄ by x̄_j = x*_j if
j ∈ C \ F, and x̄_j = 0 otherwise. Since x* is an α-relaxed solution, we get that x̄
is a feasible fractional solution to the residual CIP, Cov(AF [s], bF , c, αd). In the
next subsection, our goal will be to obtain an integral feasible solution to the
covering problem Cov(AF [s], bF , c, d) using x̄. The next lemma shows how this
implies an approximation to our original CIP.
Lemma 1. If there exists an integral feasible solution x^{int} to Cov(A^F[s], b^F, c, d)
with c^T x^{int} ≤ β · c^T x̄, then there exists a max{1/α, β}-factor approximation to
Cov(A[s], b, c, d).
and small otherwise. Note that Lemma 2 together with the fact that each column
in row i’s support is either small or large implies,
    Σ_{j∈L_i} A^F[s̄]_{ij} y_j ≥ b̄_i / 2,   for all large rows i, and
    Σ_{j∈S_i} A^F[s̄]_{ij} y_j ≥ b̄_i / 2,   for all small rows i.
Proof. (Sketch) Since the rows are small, for any row i we can zero out the entries
that are larger than b̄_i, and 2y will still be a feasible solution. Note that now, in
each row, the entries are < b̄_i, and thus are at most b̄_i/2 (everything being a power
of 2). We stress that the b̄_i of some row could be less than an entry in some other
row; that is, we do not have the no-bottleneck assumption. However, once a
particular row i is fixed, b̄_i is at least as large as any entry of the matrix in the ith
row. Our modified analysis of grouping and scaling then makes the proof go through.
We group the columns into classes that have s_j equal to the same power of 2, and
for each row i we let b̄_i^{(t)} be the contribution of the class-t columns towards the
demand of row i. The columns of class t, the small rows, and the demands b̄_i^{(t)}
form a CIP where all non-zero entries of the matrix are the same power of 2.
We scale both the constraint matrix and b̄_i^{(t)} down by that power of 2 to get a
0,1-CIP, and using Assumption 1, we get an integral solution to this 0,1-CIP. Our
final integral solution is obtained by concatenating all these integral solutions
over all classes.
Up to this point the algorithm is the standard grouping-and-scaling algorithm. The
difference lies in our analysis proving that this integral solution is feasible for
the original CCIP. Originally, the no-bottleneck assumption was used to prove
this. However, we show that, since the column values in different classes are
geometrically decreasing, the weaker assumption that b̄_i is at least any entry
in the ith row is enough to make the analysis go through.
This completes the sketch of the proof.
Large rows. The large rows can be shown to form a PCIP, and thus
Assumption 2 can be invoked to get a lemma analogous to Lemma 3.
Lemma 4. We can find an integral solution x^{int,L} such that
a) x_j^{int,L} ≤ 1 for all j,
b) Σ_{j∈C} c_j x_j^{int,L} ≤ 8ω Σ_{j∈C} c_j x̄_j, and
c) for every large row i ∈ R^L, Σ_{j∈C} A^F[s]_{ij} x_j^{int,L} ≥ b^F_i.
    (Primal)  min Σ_{j∈S} c_j x_j
              s.t. Σ_{j∈S : j covers e} x_j ≥ 1,   ∀e ∈ E
                   x ∈ R^S_+

    (Dual)    max Σ_{e∈E} y_e
              s.t. Σ_{e∈E : j covers e} y_e ≤ c_j,   ∀j ∈ S
                   y ∈ R^E_+
j∈S:j covers e
segment in Q covers e and let U be the set of unsatisfied edges. The algorithm
picks the largest edge in U and raises the dual value ye till some segments
becomes tight. The segments with the farthest left-end point and the farthest
right-end point are picked in Q, and all edges contained in any of them are
removed from U . Note that since we choose the largest in U , all such edges are
covered. The algorithm repeats this process till U becomes ∅, that is, all edges
are covered. The final set of segments is obtained by a reverse delete step, where
a segment is deleted if its deletion doesn’t make any edge uncovered.
The algorithm is a factor-2 approximation algorithm. To show this, it suffices,
by a standard argument for analyzing primal-dual algorithms, to prove that any
edge with a positive dual y_e is contained in at most two segments in Q. These two
segments correspond to the left-most and the right-most segments that cover e;
it is not too hard to show that if anything else covers e, then either e has zero dual,
or the third segment is removed in the reverse delete step.
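A compact sketch of this primal-dual algorithm (our reading of the description above; the encodings and tie-breaking are our own choices). Edges are 0..m−1 on the line with priority demands pi[e]; a segment j = (lo, hi, supply, cost) covers edge e iff lo ≤ e ≤ hi and supply ≥ pi[e]:

def covers(seg, e, pi):
    lo, hi, sup, _ = seg
    return lo <= e <= hi and sup >= pi[e]

def priority_line_cover(m, pi, segs):
    # instance assumed feasible: every edge is covered by some segment
    y, paid = [0.0] * m, [0.0] * len(segs)   # duals and per-segment dual load
    Q, U = [], set(range(m))                 # picked segments, unsatisfied edges
    while U:
        e = max(U, key=lambda f: pi[f])      # "largest" = highest-priority edge
        cand = [j for j in range(len(segs)) if covers(segs[j], e, pi)]
        slack = min(segs[j][3] - paid[j] for j in cand)
        y[e] += slack                        # raise y_e until something is tight
        for j in cand:
            paid[j] += slack
        tight = [j for j in cand if paid[j] >= segs[j][3] - 1e-9]
        left = min(tight, key=lambda j: segs[j][0])    # farthest-left tight segment
        right = max(tight, key=lambda j: segs[j][1])   # farthest-right tight segment
        for j in (left, right):
            if j not in Q:
                Q.append(j)
        U = {f for f in U if not covers(segs[left], f, pi)
                          and not covers(segs[right], f, pi)}
    for j in reversed(list(Q)):              # reverse delete
        rest = [t for t in Q if t != j]
        if all(any(covers(segs[t], e, pi) for t in rest) for e in range(m)):
            Q = rest
    return Q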
Proof of Theorem 7: For any two vertices t (top) and b (bottom) of the tree T
such that t is an ancestor of b, let P_tb be the unique path from b to t. Note that
P_tb, together with the restrictions of the segments in S to P_tb, defines an instance
of PLC. Therefore, for each pair t and b, we can compute the optimal solution
to the corresponding PLC instance using the exact algorithm; let the cost of
this solution be c_tb. Create an instance of the 0,1-tree cover problem with T and
segments S′ := {(t, b) : t is an ancestor of b} with costs c_tb. Solve the 0,1-tree
cover instance exactly (recall we are in the rooted version) and, for the segments
(t, b) in S′ returned, return the solution of the corresponding PLC instance of
cost c_tb.
One now uses the decomposition above to obtain a solution to the 0,1-tree
cover problem (T, S′) of cost at most 2 times the cost of S*. This proves the
theorem. The segments in S′ picked are precisely the segments corresponding to
the paths E_i, i = 1, …, p, and each S_i is a solution to the PLC instance. Since we
find the optimum PLC solution, there is a solution to (T, S′) with costs c of cost less
than the total cost of the segments in S_1 ∪ ⋯ ∪ S_p. But that cost is at most twice
the cost of S* since each segment of S* is in at most two S_i's.
2-Priority Line Cover (2-PLC). The input is similar to PLC, except that each
segment and edge now has an ordered pair of priorities, and a segment covers an
edge it contains iff each of the priorities of the segment exceeds the corresponding
priority of the edge. The goal, as in PLC, is to find a minimum cost cover.
It is not too hard to show that 2-PLC is a special case of rectangle cover. The edges
correspond to points in 3 dimensions and segments correspond to rectangles in
3 dimensions; the dimensions are encoded by the linear coordinates on the line and
the two priority values. In general, p-PLC can be shown to be a special case of
(p + 1)-dimensional rectangle cover.
What is more involved is to show that PTC is a special case of 2-PLC. To do so, we
run two DFS orderings on the tree, where the order in which the children of a node
are visited is completely opposite in the two orderings. The first ordering
gives the order in which the edges must be placed on a line. The second gives
one of the priorities for the edges. The second priority of the edges comes from
the original priority in PTC. It can be shown that the segment priorities can
be set so that the feasible solutions are precisely the same in both instances,
proving Theorem 8.
5 Concluding Remarks
In this paper we studied column restricted covering integer programs. In particu-
lar, we studied the relationship between CCIPs and the underlying 0,1-CIPs. We
conjecture that the approximability of a CCIP should be asymptotically within
a constant factor of the integrality gap of the original 0,1-CIP. We could not show
this; however, if the integrality gap of a PCIP is shown to be within a constant of
the integrality gap of the 0,1-CIP, then we will be done. At this point, we do not
even know how to prove that PCIPs of special 0,1-CIPs, those whose constraint
matrices are totally unimodular, have constant integrality gap. Resolving the
case of PTC is an important step in this direction, and hopefully in resolving
our conjecture regarding CCIPs.
References
1. Balas, E.: Facets of the knapsack polytope. Math. Programming 8, 146–164 (1975)
2. Bar-Noy, A., Bar-Yehuda, R., Freund, A., Naor, J., Schieber, B.: A unified approach
to approximating resource allocation and scheduling. J. ACM 48(5), 1069–1090
(2001)
3. Carr, R.D., Fleischer, L.K., Leung, V.J., Phillips, C.A.: Strengthening integrality
gaps for capacitated network design and covering problems. In: Proceedings of
ACM-SIAM Symposium on Discrete Algorithms, pp. 106–115 (2000)
4. Chakrabarty, D., Grant, E., Könemann, J.: On column-restricted and priority cov-
ering integer programs. arXiv eprint (2010)
5. Charikar, M., Naor, J., Schieber, B.: Resource optimization in qos multicast routing
of real-time multimedia. IEEE/ACM Trans. Netw. 12(2), 340–348 (2004)
6. Chekuri, C., Ene, A., Korula, N.: Unsplittable flow in paths and trees and column-
restricted packing integer programs. In: Proceedings of International Workshop on
Approximation Algorithms for Combinatorial Optimization Problems (2009) (to
appear)
7. Chekuri, C., Mydlarz, M., Shepherd, F.B.: Multicommodity demand flow in a tree
and packing integer programs. ACM Trans. Alg. 3(3) (2007)
8. Chuzhoy, J., Gupta, A., Naor, J., Sinha, A.: On the approximability of some net-
work design problems. ACM Trans. Alg. 4(2) (2008)
9. Dobson, G.: Worst-case analysis of greedy heuristics for integer programming with
non-negative data. Math. Oper. Res. 7(4), 515–531 (1982)
10. Hammer, P., Johnson, E., Peled, U.: Facets of regular 0-1 polytopes. Math. Pro-
gramming 8, 179–206 (1975)
11. Hochbaum, D.S.: Approximation algorithms for the set covering and vertex cover
problems. SIAM Journal on Computing 11(3), 555–556 (1982)
12. Kolliopoulos, S.G.: Approximating covering integer programs with multiplicity con-
straints. Discrete Appl. Math. 129(2-3), 461–473 (2003)
13. Kolliopoulos, S.G., Stein, C.: Approximation algorithms for single-source unsplit-
table flow. SIAM Journal on Computing 31(3), 919–946 (2001)
14. Kolliopoulos, S.G., Stein, C.: Approximating disjoint-path problems using packing
integer programs. Math. Programming 99(1), 63–87 (2004)
15. Kolliopoulos, S.G., Young, N.E.: Approximation algorithms for covering/packing
integer programs. J. Comput. System Sci. 71(4), 495–505 (2005)
16. Korula, N.: Private Communication (2009)
17. Rajagopalan, S., Vazirani, V.V.: Primal-dual RNC approximation algorithms for
(multi)set (multi)cover and covering integer programs. In: Proceedings of IEEE
Symposium on Foundations of Computer Science (1993)
18. Schrijver, A.: Combinatorial optimization. Springer, New York (2003)
19. Srinivasan, A.: Improved approximation guarantees for packing and covering inte-
ger programs. SIAM Journal on Computing 29(2), 648–670 (1999)
20. Srinivasan, A.: An extension of the Lovász Local Lemma, and its applications to
integer programming. SIAM Journal on Computing 36(3), 609–634 (2006)
21. Trevisan, L.: Non-approximability results for optimization problems on bounded
degree instances. In: Proceedings of ACM Symposium on Theory of Computing,
pp. 453–461 (2001)
22. Wolsey, L.: Facets for a linear inequality in 0-1 variables. Math. Programming 8,
168–175 (1975)
On k-Column Sparse Packing Programs
1 Introduction
Packing integer programs (PIPs) are those of the form:
    max{ w^T x : Sx ≤ c, x ∈ {0,1}^n },   where w ∈ R^n_+, c ∈ R^m_+, and S ∈ R^{m×n}_+.
Related Previous Work: Various special cases of k-CS-PIP have been extensively
studied. An important special case is the k-set packing problem, where given a
collection of sets of cardinality at most k, the goal is to find the maximum weight
sub-collection of mutually disjoint sets. This is equivalent to k-CS-PIP where the
constraint matrix S is 0-1 and the capacity c is all ones. Note that for k = 2 this
is maximum weight matching which can be solved in polynomial time, and for
k = 3 the problem becomes APX-hard [18]. After a long line of work [19,2,12,9],
the best-known approximation ratio for this problem is (k+1)/2 + ε, obtained using
local search techniques [9]. An improved bound of k/2 + ε is also known [19] for the
unweighted case, i.e., the weight vector w = 1. It is also known that the natural
LP relaxation for this problem has integrality gap at least k − 1 + 1/k, and in
particular this holds for the projective plane instance of order k − 1. Hazan et
al. [18] showed that k-set packing is Ω(k/log k)-hard to approximate.
Another special case of k-CS-PIP is the independent set problem in graphs
with maximum degree at most k. This is equivalent to k-CS-PIP where the
constraint matrix S is 0-1, capacity c is all ones, and each row is 2-sparse. This
problem has an O(k log log k/ log k)-approximation [17], and is Ω(k/ log2 k)-hard
to approximate [3], assuming the Unique Games Conjecture [20].
Shepherd and Vetta [26] studied the demand matching problem on graphs,
which is k-CS-PIP with k = 2, with the further restriction that in each column
the non-zero entries are equal, and that no two columns have non-zero entries in
the same two rows. They gave an LP-based 3.264-approximation algorithm [26],
and showed that the natural LP relaxation for this problem has integrality gap
at least 3. They also showed the demand matching problem to be APX-hard even
on bipartite graphs. For larger values of k, problems similar to demand matching
have been studied under the name of column-restricted PIPs [21], which arise in
the context of routing flow unsplittably (see also [6,7]). In particular, an 11.54k-
approximation algorithm was known [15] where (i) in each column all non-zero
entries are equal, and (ii) the maximum entry in S is at most the minimum entry
in c (this is also known as the no-bottleneck assumption); later, it was observed
in [13] that even without the second of these conditions, one can obtain an 8k
approximation. The literature on unsplittable flow is quite extensive; we refer
the reader to [4,13] and references therein.
For the general k-CS-PIP, Pritchard [25] gave a 2^k·k²-approximation algo-
rithm, which was the first result with approximation ratio depending only on k.
Pritchard’s algorithm was based on solving an iterated LP relaxation, and then
applying a randomized selection procedure. Independently, [14] and [11] showed
that this final step could be derandomized, yielding an improved bound of O(k 2 ).
All these previous results crucially use the structural properties of basic feasible
solutions of the LP relaxation.
We say that item i participates in constraint j if sij > 0. For each i ∈ [n], let
N (i) := {j ∈ [m] | sij > 0} be the set of constraints that i participates in. In a
k-column sparse PIP, we have |N (i)| ≤ k for each i ∈ [n]. The goal is to find the
maximum weight subset of items such that all the constraints are satisfied.
We define the slack as B := mini∈[n],j∈[m] cj /sij . By scaling the constraint
matrix, we may assume that cj = 1 for all j ∈ [m]. We also assume that sij ≤ 1
for each i, j; otherwise, we can just fix xi = 0. Finally, for each constraint j, we
let P (j) denote the set of items participating in this constraint. Note that |P (j)|
can be arbitrarily large.
    M·x_1 + x_2 + x_3 + x_4 + ⋯ + x_M ≤ M
1 will also be chosen (since M ≫ 1 and we pick each item independently with
probability x_i/(2k) = 1/4); in this case, item 1 will be discarded. Thus the final
solution will almost always not contain item 1, violating the claim that it lies in
the final solution with probability at least x1 /(4k) = 1/8.
The key point is that we must consider the probability of an item being
discarded by some constraint, conditional on it being chosen in the set S (for
item 1 in the above example, this probability is close to one, not at most half).
This is not a problem if either all item sizes are small (say sij ≤ cj /2), or all item
sizes are large (say sij ≈ cj ). The algorithm we analyze shows that the difficult
case is indeed when some constraints contain both large and small items, as in
the example above.
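The scheme being analyzed can be sketched as follows (our reconstruction from the events B_ij and G_j used in the analysis below; α = 4 and all capacities are scaled to c_j = 1):

import random

def sample_and_alter(x, s, k, alpha=4.0):
    # s[i][j] = size of item i in constraint j (0 if i does not participate)
    n, m = len(s), len(s[0])
    S = [i for i in range(n) if random.random() < x[i] / (alpha * k)]
    kept = []
    for i in S:
        safe = True
        for j in range(m):
            if s[i][j] == 0:
                continue
            others = [i2 for i2 in S if i2 != i and s[i2][j] > 0]
            # event B_ij: some other big item for j was sampled
            big = any(s[i2][j] > 0.5 for i2 in others)
            # event G_j: small items for j, other than i, exceed total size 1/2
            small = sum(s[i2][j] for i2 in others if s[i2][j] <= 0.5)
            if big or small > 0.5:
                safe = False
                break
        if safe:
            kept.append(i)
    return kept

Under this rule the output is feasible: in any constraint, two sampled big items discard each other, a surviving big item forces out every sampled small item via B_ij, and in the absence of a big survivor the small survivors total at most 1/2 + 1/2 = 1.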
To see that (1) implies the theorem, for any item i, simply take the union bound
over all j ∈ N (i). Thus, the probability that i is deleted from S conditional on
it being chosen in S is at most 2/α. Equivalently, Pr[i ∈ S′ | i ∈ S] ≥ 1 − 2/α.
We now prove (1) using the following intuition: The total extent to which the
LP selects items that are big for any constraint cannot be more than 2 (each
big item has size at least 1/2); therefore, Bij is unlikely to occur since we scaled
down probabilities by factor αk. Ignoring for a moment the conditioning on
i ∈ S, event Gj is also unlikely, by Markov’s Inequality. But items are selected
for S independently, so if i is big for constraint j, then its presence in S does
not affect the event Gj at all. If i is small for constraint j, then even if i ∈ S,
the total size of S-items is unlikely to exceed 1.
To prove (1) formally, let B(j) denote the set of items that are big for constraint j,
and Y_j := Σ_{ℓ∈B(j)} x_ℓ. By the LP constraint for j, it follows that Y_j ≤ 2
(since each ℓ ∈ B(j) has size s_{ℓj} > 1/2). Now by a union bound,

    Pr[B_ij | i ∈ S] ≤ Σ_{ℓ∈B(j)\{i}} x_ℓ/(αk) ≤ Y_j/(αk) ≤ 2/(αk).        (2)
Now, let G_{−i}(j) denote the set of items that are small for constraint j, not
counting item i, even if it is small. Using the LP constraint j, we have:

    Σ_{ℓ∈G_{−i}(j)} s_{ℓj} · x_ℓ ≤ 1 − Σ_{ℓ∈B(j)} s_{ℓj} · x_ℓ ≤ 1 − Y_j/2.        (3)

Since each item ℓ is chosen into S with probability x_ℓ/(αk), inequality (3) implies
that the expected total size of S-items in G_{−i}(j) is at most (1/(αk))(1 − Y_j/2).
By Markov's inequality, the probability that the total size of these S-items exceeds
1/2 is at most (2/(αk))(1 − Y_j/2). Since items are chosen independently and
i ∉ G_{−i}(j), we obtain this probability even conditioned on i ∈ S.
Whether i is big or small for j, event G_j can occur only if the total size of
S-items in G_{−i}(j) exceeds 1/2. Thus,

    Pr[G_j | i ∈ S] ≤ (2/(αk))(1 − Y_j/2) = 2/(αk) − Y_j/(αk),

which, combined with inequality (2), yields (1).
Using the theorem above, we obtain the desired approximation:
Theorem 2. There is a randomized 8k-approximation algorithm for k-CS-PIP.
Proof. From Lemma 1, our algorithm always outputs a feasible solution. To
bound the objective value, recall that Pr[i ∈ S] = x_i/(αk) for all i ∈ [n]. Hence
Theorem 1 implies that for all i ∈ [n],

    Pr[i ∈ S′] ≥ Pr[i ∈ S] · Pr[i ∈ S′ | i ∈ S] ≥ (x_i/(αk)) · (1 − 2/α).

Finally, using linearity of expectation and α = 4, we obtain the theorem.
Remarks: We note that the analysis above only uses Markov’s inequality con-
ditioned on a single item being chosen in set S. Thus a pairwise independent
distribution suffices to choose the set S, and hence the algorithm can be eas-
ily derandomized. More generally, one could consider k-CS-PIP with arbitrary
upper-bounds on the variables: the above 8k-approximation algorithm extends
easily to this setting (details in the full version).
Stronger LP relaxation. Recall that entries are scaled so that all capacities are
one. An item i is called big for constraint j iff s_ij > 1/2. For each constraint
j ∈ [m], let B(j) = {i ∈ [n] | s_ij > 1/2} denote the set of big items. Since no
two items that are big for some constraint can be chosen in an integral solution,
the inequality Σ_{i∈B(j)} x_i ≤ 1 is valid for each j ∈ [m]. The strengthened LP
relaxation that we consider is as follows.

    max Σ_{i=1}^{n} w_i x_i                            (4)
    s.t. Σ_{i=1}^{n} s_ij · x_i ≤ c_j,   ∀j ∈ [m]      (5)
         Σ_{i∈B(j)} x_i ≤ 1,             ∀j ∈ [m]      (6)
         0 ≤ x_i ≤ 1,                    ∀i ∈ [n].     (7)
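The new constraints (6) only require the index sets B(j); a two-line sketch (ours):

def big_item_sets(s, n, m):
    # B(j) = { i : s_ij > 1/2 }, with sizes scaled so that c_j = 1
    return {j: [i for i in range(n) if s[i][j] > 0.5] for j in range(m)}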
Analysis: It is clear that S′ is feasible with probability one. The improved ap-
proximation ratio comes from four different steps: First, we use the stronger LP
relaxation. Second, the more careful alteration step does not discard items un-
necessarily; the previous algorithm sometimes deleted items from S even when
constraints were not violated. Third, in analyzing the probability that constraint
Lemma 2. For every item i ∈ [n] and constraint j ∈ N(i), we have
Pr[E_ij | i ∈ S] ≤ (1/(αk)) · (1 + (2/(αk))^{1/3}).
Proof Sketch. Let ℓ := (4αk)^{1/3}. We classify items in relation to constraints as
big, medium, or small. In case 1, E_ij occurs only if some other big item for
constraint j is chosen in S; the new constraints (6) of the strengthened LP bound
this probability. In case 2, E_ij can occur only if some big item or at least two
medium items other than i are selected for S; we argue that the latter probability
is much smaller than 1/(αk). In case 3, E_ij can occur only if the total size (in
constraint j) of items in S \ {i} is greater than 1 − 1/ℓ; Markov's inequality gives
the desired result.
Thus, for any item i and constraint j ∈ N(i), Pr[E_ij | i ∈ S] ≤
(1/(αk)) max{(1 + 2/ℓ), (1 + ℓ²/(2αk))}. From the choice of ℓ = (4αk)^{1/3}, which
makes the probabilities in parts 2 and 3 of the claim equal, we obtain the lemma.
We now prove the main result of this section.
Theorem 3. For each i ∈ [n], Pr[i ∈ S′ | i ∈ S] ≥ (1 − (1/(αk))(1 + (2/(αk))^{1/3}))^k.
Proof. For any item i and constraint j ∈ N (i), the conditional event (¬Eij | i ∈ S)
is a decreasing function over the choice of items in set [n] \ {i}. Thus, by the FKG
inequality [1], for any fixed item i ∈ [n], the probability that no event (Eij | i ∈ S)
occurs is: ⎡ ⎤
H - I
Pr ⎣ ¬Eij - i ∈ S ⎦ ≥ Pr[¬Eij | i ∈ S]
j∈N (i) j∈N (i)
From Lemma 2, Pr[¬Eij | i ∈ S] ≥ 1 − αk 1 2 1/3
1 + ( αk ) . As each item is in at
most k constraints, we obtain the theorem.
Note that F(x) = f(x) for x ∈ {0,1}^n, and hence F is an extension of f. Even though F is a non-linear function, using the continuous greedy algorithm of Vondrák [29] we can obtain a (1 − 1/e)-approximation algorithm for the following fractional relaxation of (8):

    max { F(x) : Σ_{i=1}^n s_ij · x_i ≤ c_j, ∀j ∈ [m];  0 ≤ x_i ≤ 1, ∀i ∈ [n] }.   (9)

In order to apply the algorithm from [29], one needs to solve in polynomial time the problem of maximizing a linear objective over the constraints {Σ_{i=1}^n s_ij · x_i ≤ c_j, ∀j ∈ [m]; 0 ≤ x_i ≤ 1, ∀i ∈ [n]}. This is indeed possible since it is a linear program on n variables and m constraints.
The Rounding Algorithm and Analysis. The rounding algorithm is identi-
cal to that for k-CS-PIP. Let x denote any feasible solution to Problem (9). We
apply the rounding algorithm from the previous section to first obtain a (possibly infeasible) solution S ⊆ [n] and then a feasible integral solution S′ ⊆ [n].
However, the analysis approach in Theorem 3 does not work. The problem is that even though S (which is chosen by random sampling) has good expected profit, i.e., E[f(S)] = Ω(1/k)·F(x), the alteration step used to obtain S′ from S may end up throwing away essentially all the profit. This was not an issue for linear objective functions, since our alteration procedure guarantees that Pr[i ∈ S′ | i ∈ S] = Ω(1) for each i ∈ [n]; if f is linear, this implies E[f(S′)] = Ω(1)·E[f(S)]. However, this property is not enough for general monotone submodular functions. Consider the following:
Remark: Note that if S′ itself were chosen randomly from S such that Pr[i ∈ S′ | S = T] = Ω(1) for every T ⊆ [n] and i ∈ T, then we would be done by Feige's Subadditivity Lemma [16]. Unfortunately, this is too much to hope for. In our rounding procedure, for any particular choice of S, the set S′ is a fixed subset of S; and there could be (bad) sets S where, after the alteration step, we end up with sets S′ such that |S′| ≪ |S|.
However, it turns out that we can use the following two additional properties of our algorithm to argue that S′ has reasonable profit. First, the sets S we construct are drawn from a product distribution on the items. Second, our alteration procedure has the following 'monotonicity' property: suppose i ∈ T1 ⊆ T2 ⊆ [n], and i ∈ S′ when S = T2. Then we are guaranteed that i ∈ S′ when S = T1. (That is, if S contains additional items, it is more likely that i will be discarded by some constraint it participates in.) The example above does not satisfy either of these properties. Corollary 1 shows that these properties suffice. Roughly speaking, the intuition is that since f is submodular, the marginal contribution of item i to S is largest when S is "small"; this is also the case when i is most likely to be retained for S′. That is, for every i ∈ [n], both Pr[i ∈ S′ | i ∈ S] and the marginal contribution of i to f(S) are decreasing functions of S.
(see [5]) the following generalization of Feige’s Subadditivity Lemma.
Theorem 5. Let [n] denote a ground set and x ∈ [0,1]^n, and for each B ⊆ [n] define p(B) = Π_{i∈B} x_i · Π_{j∉B} (1 − x_j). Associated with each B ⊆ [n] there is an arbitrary distribution over subsets of B, where each set A ⊆ B has probability q_B(A); so Σ_{A⊆B} q_B(A) = 1 for all B ⊆ [n]. That is, we choose B from a product distribution, and then retain a subset A of B by applying a randomized alteration. Suppose that the system satisfies the following conditions.

Marginal Property: for all i ∈ [n],

    Σ_{B⊆[n]} p(B) Σ_{A⊆B: i∈A} q_B(A) ≥ β · Σ_{B⊆[n]: i∈B} p(B).   (10)
We are now ready to prove the performance guarantee of our algorithm. Observe that our rounding algorithm satisfies the hypothesis of Corollary 1 with β = 1/(e + o(1)) when the parameter α = 1. Moreover, one can show that E[f(S)] ≥ F(x)/(αk). Thus,

    E[f(S′)] ≥ (1/(e + o(1))) · E[f(S)] ≥ (1/(ek + o(k))) · F(x).

Combined with the fact that x is an e/(e−1)-approximate solution to the continuous relaxation (9), we have proved our main result:
Theorem 6. There is a randomized algorithm for maximizing any monotone submodular function over k-column sparse packing constraints achieving approximation ratio (e²/(e−1))·k + o(k).
References
1. Alon, N., Spencer, J.: The Probabilistic Method, 3rd edn. Wiley-Interscience,
New York (2008)
2. Arkin, E.M., Hassin, R.: On Local Search for Weighted k-Set Packing. In: European
Symposium on Algorithms, pp. 13–22 (1997)
3. Austrin, P., Khot, S., Safra, S.: Inapproximability of Vertex Cover and Independent
Set in Bounded Degree Graphs. In: Comp. Complexity Conference (2009)
4. Bansal, N., Friggstad, Z., Khandekar, R., Salavatipour, M.R.: A logarithmic ap-
proximation for unsplittable flow on line graphs. In: SODA (2009)
5. Bansal, N., Korula, N., Nagarajan, V., Srinivasan, A.: On k-Column Sparse Packing
Programs (full version), arXiv (2010)
1 Introduction
In the Steiner tree problem, we are given an undirected graph G = (V, E), non-
negative costs ce for all edges e ∈ E, and a set of terminal vertices R ⊆ V . The
goal is to find a minimum-cost tree T spanning R, and possibly some Steiner
vertices from V \R. We can assume that the graph is complete and that the costs
induce a metric. The problem takes a central place in the theory of combinatorial
optimization and has numerous practical applications. Since the Steiner tree
problem is NP-hard¹, we are interested in approximation algorithms for it. The best published approximation algorithm for the Steiner tree problem is due to Robins and Zelikovsky [20], which, for any fixed ε > 0, achieves a performance ratio of 1 + (ln 3)/2 + ε ≈ 1.55 in polynomial time; an improvement is currently in press [2], see also Remark 1.
Supported by NSERC grant no. 288340 and by an Early Research Award.
¹ Chlebík and Chlebíková show that no (96/95 − ε)-approximation algorithm can exist for any positive ε unless P=NP [5].
In this paper, we study linear programming (LP) relaxations for the Steiner
tree problem, and their properties. Numerous such formulations are known (e.g.,
see [7,11,16,17,24,25]), and their study has led to impressive running time im-
provements for integer programming based methods. Despite the significant body
of work in this area, none of the known relaxations is known to exhibit an in-
tegrality gap provably smaller than 2. The integrality gap of a relaxation is the
maximum ratio of the cost of integral and fractional optima, over all instances.
It is commonly regarded as a measure of strength of a formulation. One of the
contributions of this paper are improved bounds on the integrality gap for a
number of Steiner tree LP relaxations.
A Steiner tree relaxation of particular interest is the bidirected cut relaxation
[7,25] (precise definitions will follow in Section 1.2). This relaxation has a flow
formulation using O(|E||R|) variables and constraints, which is much more com-
pact than the other relaxations we study. It is also widely believed to have
an integrality gap significantly smaller than 2 (e.g., see [3,19,23]). The largest
lower bound on the integrality gap known is 8/7 (by Martin Skutella, reported
in [15]), and Chakrabarty et al. [3] prove an upper bound of 4/3 in so called
quasi-bipartite instances (where Steiner vertices form an independent set).
Another class of formulations are the so called hypergraphic LP relaxations
for the Steiner tree problem. These relaxations are inspired by the observation
that the minimum Steiner tree problem can be encoded as a minimum cost
hyper-spanning tree (see Section 1.2) of a certain hypergraph on the terminals.
They are known to be stronger than the bidirected cut relaxation [18], and it is
therefore natural to try to use them to get better approximation algorithms, by
drawing on the large corpus of known LP techniques. In this paper, we focus on
one hypergraphic LP in particular: the partition LP of Könemann et al. [15].
leads to the bidirected cut relaxation (B) (shown in Figure 1 with dual) which
has a variable for each arc a ∈ A, and a constraint for every valid set U . Here
and later, δ out (U ) denotes the set of arcs in A whose tail is in U and whose head
lies in V \ U . When there are no Steiner vertices, Edmonds’ work [7] implies this
relaxation is exact.
    (B)   min  Σ_{a∈A} c_a x_a
          s.t. Σ_{a∈δ^out(U)} x_a ≥ 1   for all U ∈ valid(V),   x ∈ R^A_{≥0}

    (B_D) max  Σ_{U∈valid(V)} z_U
          s.t. Σ_{U: a∈δ^out(U)} z_U ≤ c_a   for all a ∈ A,   z ∈ R^{valid(V)}_{≥0}

Fig. 1. The bidirected cut relaxation (B) and its dual (B_D)
Goemans & Myung [11] made significant progress in understanding the LP,
by showing that the bidirected cut LP has the same value independent of which
terminal is chosen as the root, and by showing that a whole “catalogue” of very
different-looking LPs also has the same value; later Goemans [10] showed that
if the graph is series-parallel, the relaxation is exact. Rajagopalan and Vazirani
[19] were the first to show a non-trivial integrality gap upper bound of 3/2 on
quasibipartite graphs; this was subsequently improved to 4/3 by Chakrabarty et
al. [3], who gave another alternate formulation for (B).
Fig. 2. Black nodes are terminals and white nodes are Steiner nodes. Left: a Steiner
tree for this instance. Middle: the Steiner tree’s edges are partitioned into full com-
ponents; there are four full components. Right: the hyperedges corresponding to these
full components.
    Σ_{K∈K} x_K ρ(K ∩ S) ≤ ρ(S),   ∀S ⊂ R

    (D)   min  Σ_{K∈K, i∈K} C_K x_{K^i},   with a variable x_{K^i} ≥ 0 for every K ∈ K and i ∈ K
          s.t. Σ_{K^i ∈ Δ^out(U)} x_{K^i} ≥ 1   for all valid U ⊆ R

    (D_D) max  Σ_U z_U,   z ∈ R^{valid(R)}_{≥0}
          s.t. Σ_{U: K∩U≠∅, i∉U} z_U ≤ C_K   for all K ∈ K, i ∈ K

Fig. 3. The directed hypergraph relaxation (D) and its dual (D_D)
Lemma 1 ([18]). For any instance, OPT (D) ≥ OPT (B). There are instances
for which this inequality is strict.
Könemann et al. [15], inspired by the work of Chopra [6], described a partition-
based relaxation which captures that given any partition of the terminals, any
hyper-spanning tree must have sufficiently many “cross hyperedges”. More for-
mally, a partition π is a collection of pairwise disjoint nonempty terminal sets (π_1, ..., π_q) whose union equals R. The number of parts q of π is referred to as the partition's rank and denoted r(π). Let Π_R be the set of all partitions of R. Given a partition π = {π_1, ..., π_q}, define the rank contribution rc^π_K of hyperedge K ∈ K for π as the rank reduction of π obtained by merging the parts of π that are touched by K; i.e., rc^π_K := |{i : K ∩ π_i ≠ ∅}| − 1. Then a hyper-spanning tree (R, K′) must satisfy Σ_{K∈K′} rc^π_K ≥ r(π) − 1. The partition-based LP of [15] and its dual are given in Figure 4.
    (P)   min  Σ_{K∈K} C_K x_K,   x ∈ R^K_{≥0}
          s.t. Σ_{K∈K} rc^π_K x_K ≥ r(π) − 1   for all π ∈ Π_R

    (P_D) max  Σ_{π∈Π_R} (r(π) − 1) y_π,   y ∈ R^{Π_R}_{≥0}
          s.t. Σ_{π∈Π_R} rc^π_K y_π ≤ C_K   for all K ∈ K

Fig. 4. The unbounded partition relaxation (P) and its dual (P_D)
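Both quantities are mechanical to compute; the following small sketch (our own notation: hyperedges as frozensets, a partition as a list of disjoint sets) evaluates rc^π_K and checks one partition inequality of (P).

def rank_contribution(K, partition):
    # rc^pi_K = (number of parts of pi touched by K) - 1
    return sum(1 for part in partition if part & K) - 1

def partition_inequality_holds(x, partition):
    """Check sum_K x_K * rc^pi_K >= r(pi) - 1 for one partition pi.

    x: dict mapping hyperedges (frozensets of terminals) to values;
    partition: list of disjoint terminal sets whose union is R.
    """
    lhs = sum(xK * rank_contribution(K, partition) for K, xK in x.items())
    return lhs >= len(partition) - 1 - 1e-9

# Four terminals, one full component on {1,2,3} plus the edge {3,4}:
x = {frozenset({1, 2, 3}): 1.0, frozenset({3, 4}): 1.0}
print(partition_inequality_holds(x, [{1}, {2}, {3}, {4}]))  # True: 2+1 >= 3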
The feasible region of (P) is unbounded, since if x is a feasible solution for (P) then so is any x′ ≥ x. We obtain a bounded partition LP relaxation, denoted by (P′) and shown below, by adding a valid equality constraint to the LP.
2 Uncrossing Partitions
In this section we are interested in uncrossing a minimal set of tight partitions
that uniquely define a basic feasible solution to (P). We start with a few pre-
liminaries necessary to state our result formally.
2.1 Preliminaries
We introduce some needed well-known properties of partitions that arise in com-
binatorial lattice theory [22].
Given a graph G and a partition π of V (G), we say that G induces π if the parts
of π are the vertex sets of the connected components of G.
Definition 3 (Join of partitions). Let (R, E) be a graph that induces π, and let (R, E′) be a graph that induces π′. Then the graph (R, E ∪ E′) induces π ∨ π′.
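Definition 3 gives a direct recipe for computing joins: overlay spanning forests of the parts of the two partitions and read off the connected components. A hedged union-find sketch (helper names ours):

def join(pi1, pi2, R):
    """Join of two partitions of R: connected components after overlaying
    spanning forests of the parts of pi1 and pi2."""
    parent = {v: v for v in R}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for pi in (pi1, pi2):
        for part in pi:
            vs = list(part)
            for u in vs[1:]:               # a spanning star of the part
                parent[find(u)] = find(vs[0])
    comps = {}
    for v in R:
        comps.setdefault(find(v), set()).add(v)
    return list(comps.values())

# Example: {1,2}{3}{4} joined with {1}{2,3}{4} gives {1,2,3}{4}.
print(join([{1, 2}, {3}, {4}], [{1}, {2, 3}, {4}], {1, 2, 3, 4}))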
Given a feasible solution x to (P), a partition π is tight if Σ_{K∈K} x_K rc^π_K = r(π) − 1. Let tight(x) be the set of all tight partitions. We are interested in uncrossing this set of partitions. More precisely, we wish to find a cross-free set of partitions (a chain) which uniquely defines x. One way would be to prove the following.

Property 1. If two crossing partitions π and π′ are in tight(x), then so are π ∧ π′ and π ∨ π′.
This type of property is already well-used [9,14,21] for sets (with meets and joins replaced by intersections and unions, respectively), and the standard approach is the following. The typical proof considers the constraints in (P) corresponding to π and π′ and uses the "supermodularity" of the right-hand side and the "submodularity" of the coefficients on the left-hand side. In particular, if the following is true,
Any chain of distinct partitions of R that does not contain π has size at most
|R| − 1, and this is an upper bound on the rank of the system in (3). Elementary
linear programming theory immediately yields the following corollary.
Corollary 1. Any basic solution x∗ of (P) has at most |R| − 1 non-zero
coordinates.
3 Equivalence of Formulations
In this section we describe our equivalence results. A summary of the known and
new results is given in Figure 6.
For lack of space, we present only sketches for our main equivalence results in
this extended abstract, and refer the reader to [4] for details.
⁴ In this hypothetical scenario we get r(π) + r(π′) − 2 = Σ_K x_K (rc^π_K + rc^{π′}_K) ≥ Σ_K x_K (rc^{π∧π′}_K + rc^{π∨π′}_K) ≥ r(π ∧ π′) + r(π ∨ π′) − 2 ≥ r(π) + r(π′) − 2; thus the inequalities hold with equality, and the middle one shows π ∧ π′ and π ∨ π′ are tight.
Theorem 2. The LPs (P′) and (P) have the same optimal value.

Proof sketch. To show this, it suffices to find an optimum solution of (P) which satisfies the equality constraint in (P′); i.e., we want to find a solution for which the maximal-rank partition π is tight. We pick the optimum solution to (P) which minimizes the sum Σ_{K∈K} x_K |K|. Using Property 1, we show that either π is tight or there is a shrinking operation which decreases Σ_{K∈K} x_K |K| without increasing the cost. Since the latter is impossible, the theorem is proved.
Proof sketch. We show that the inequalities defining (P′) are valid for (S), and vice versa. Note that both have the same equality and non-negativity constraints. To show that the partition inequality of (P′) for π holds for any x ∈ (S), we use the subtour inequalities in (S) for every part of π. For the other direction, given any subset S ⊆ R, we invoke the inequality in (P′) for the partition π consisting of S as one part and the remaining terminals as singletons.
Theorem 4. On quasibipartite Steiner tree instances, OPT (B) ≥ OPT (D).
Proof sketch. We look at the duals of the two LPs and we show OPT (BD ) ≥
OPT (DD ) in quasibipartite instances. Recall that the support of a solution to
(DD ) is the family of sets with positive zU . A family of sets is called laminar if
for any two of its sets A, B we have A ⊆ B, B ⊆ A, or A ∩ B = ∅. The following
fact follows along the standard line of “set uncrossing” argumentation.
Lemma 2. There is an optimal solution to (DD ) with laminar support.
Given the above result, we may now assume that we have a solution z to (DD )
whose support is laminar. The heart of the proof of Theorem 4 is to show that
z can be converted into a feasible solution to (BD ) of the same value.
Comparing (DD ) and (BD ) one first notes that the former has a variable for
every valid subset of the terminals, while the latter assigns values to all valid
subsets of the entire vertex set. We say that an edge uv is satisfied for a candidate solution z if both (a) Σ_{U: u∈U, v∉U} z_U ≤ c_uv and (b) Σ_{U: v∈U, u∉U} z_U ≤ c_uv hold; z is then feasible for (B_D) if all edges are satisfied.
Let z be a feasible solution to (DD ). One easily verifies that all terminal-
terminal edges are satisfied. On the other hand, terminal-Steiner edges may
initially not be satisfied; e.g., consider the Steiner vertex v and its neighbours
depicted in Figure 7 below. Initially, none of the sets in z’s support contains
v, and the load on the edges incident to v is quite skewed: the left-hand side
of condition a) above may be large, while the left-hand side of condition b) is
initially 0.
To construct a valid solution for (B_D), we therefore lift the initial value z_S of each terminal subset S to supersets of S, by adding Steiner vertices. The lifting procedure processes each Steiner vertex v one at a time; when processing v, we change z by moving dual from some sets U to U ∪ {v}. Such a dual transfer decreases the left-hand side of condition (a) for edge uv, and increases the (initially 0) left-hand sides of condition (b) for edges connecting v to neighbours other than u.

Fig. 7. Lifting variable z_U
We are able to show that there is a way of
carefully lifting duals around v that ensures that all edges incident to v become
satisfied. The definition of our procedure will ensure that these edges remain
satisfied for the rest of the lifting procedure. Since there are no Steiner-Steiner
edges, all edges will be satisfied once all Steiner vertices are processed.
Throughout the lifting procedure, we maintain that z remains unchanged when projected to the terminals. The main consequence of this is that the objective value Σ_{U⊆V} z_U remains constant throughout, and the objective value of z in (B_D) is not affected by the lifting. This yields Theorem 4.
In this extended abstract, we show the improved bound of 73/60 for uniformly quasi-bipartite graphs and, due to space restrictions, we only show the weaker (2√2 − 1) ≈ 1.828 upper bound on general graphs.
Procedure RatioGreedy
1: Initialize the set of acyclic components L to ∅.
2: Let L* be a minimizer of C_L/(|L| − 1) over all full components L such that |L| ≥ 2 and L ∪ {L} is acyclic.
3: Add L* to L.
4: Continue until (R, L) is a hyper-spanning tree and return L.
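A literal transcription of RatioGreedy follows (our helper names). The acyclicity test is the standard hyperforest criterion, namely that a hyperedge may be added iff its vertices lie in pairwise distinct components of the current selection; the instance is assumed to contain enough full components for a hyper-spanning tree to exist.

def ratio_greedy(R, full_components, cost):
    """Greedy hyper-spanning tree on terminal set R: repeatedly add the
    full component L minimizing C_L / (|L| - 1) that keeps the selection
    acyclic. full_components: frozensets of terminals; cost: dict."""
    parent = {v: v for v in R}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    def keeps_acyclic(L):
        # L may be added iff its vertices lie in |L| distinct components.
        return len({find(v) for v in L}) == len(L)

    selected = []
    while len({find(v) for v in R}) > 1:
        best = min((L for L in full_components
                    if len(L) >= 2 and keeps_acyclic(L)),
                   key=lambda L: cost[L] / (len(L) - 1))
        selected.append(best)
        root = find(next(iter(best)))
        for v in best:
            parent[find(v)] = root
    return selected

R = {1, 2, 3, 4}
comps = [frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4}),
         frozenset({1, 2, 3, 4})]
cost = {comps[0]: 2, comps[1]: 2, comps[2]: 2, comps[3]: 5}
print(ratio_greedy(R, comps, cost))  # picks the 4-terminal component: 5/3 < 2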
Proof sketch. Let t denote the number of iterations and L := {L_1, ..., L_t} be the ordered sequence of full components obtained. We now define a dual solution y to (P_D). Let π(i) denote the partition induced by the connected components of {L_1, ..., L_i}. Let θ(i) denote C_{L_i}/(|L_i| − 1) and note that θ is nondecreasing. Define θ(0) = 0 for convenience. We define a dual solution y with

where H denotes the harmonic series; this is obtained by using the greedy nature of the algorithm and the fact that, in uniformly quasi-bipartite graphs, C_{K′} ≤ C_K·|K′|/|K| whenever K′ ⊂ K. Now, (|K| − 1 + H(|K| − 1))/|K| is always at most 73/60. Thus (4) implies that (60/73)·y is a feasible dual solution, which completes the proof.
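The constant 73/60 is simply the maximum of (|K| − 1 + H(|K| − 1))/|K| over integers |K| ≥ 2, attained at |K| = 5, as a quick computation confirms:

from fractions import Fraction

def H(r):
    return sum(Fraction(1, i) for i in range(1, r + 1))

vals = {k: (k - 1 + H(k - 1)) / k for k in range(2, 40)}
best = max(vals, key=vals.get)
print(best, vals[best])   # -> 5 73/60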
and analyze the algorithm we introduce some notation. For a minimum terminal
spanning tree T = mtst(G; c) define dropT (K; c) := c(T ) − mtst(G/K; c). We
also define gainT (K; c) := dropT (K) − c(K), where c(K) is the cost of full
component K. A tree T is called gainless if for every full component K we have
gainT (K; c) ≤ 0. The following useful fact is implicit in [15] (see also [4]).
We now give the algorithm and its analysis, which uses a reduced-cost trick introduced by Chakrabarty et al. [3].
Procedure Reduced One-Pass Heuristic
1: Define costs c′_e by c′_e := c_e/√2 for all terminal-terminal edges e, and c′_e := c_e for all other edges. Let G_1 := G, T_1 := mtst(G_1; c′), and i := 1.
2: The algorithm considers the full components in any order. When we examine a full component K, if gain_{T_i}(K; c′) > 0, let K_i := K, G_{i+1} := G_i/K_i, T_{i+1} := mtst(G_{i+1}; c′), and i := i + 1.
3: Let f be the final value of i. Return the tree T_alg := T_f ∪ ⋃_{i=1}^{f−1} K_i.
Note that the full components are scanned in an arbitrary order and need not be known a priori. Hence the algorithm works just as well if the full components arrive "online," which might be useful for some applications.
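The heuristic can be transcribed compactly once mtst, contraction, and cost evaluation are available; the sketch below treats them as supplied helpers (assumptions noted in the docstring) rather than as a full implementation.

def reduced_one_pass(G, full_components, mtst, contract, cost):
    """One-pass heuristic with reduced costs c' (assumed: c'_e = c_e/sqrt(2)
    on terminal-terminal edges, c'_e = c_e otherwise).

    mtst(G)        -> minimum terminal spanning tree of G under c'
    contract(G, K) -> the contraction G/K
    cost(X)        -> total c'-cost of an edge set or full component X
    """
    chosen = []
    Gi, Ti = G, mtst(G)
    for K in full_components:          # any order; may even arrive online
        drop = cost(Ti) - cost(mtst(contract(Gi, K)))
        if drop - cost(K) > 0:         # gain_{T_i}(K; c') > 0
            chosen.append(K)
            Gi = contract(Gi, K)
            Ti = mtst(Gi)
        # else: K is discarded and never reconsidered
    return Ti, chosen                  # T_alg = T_f plus the chosen K_i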
Theorem 7. c(T_alg) ≤ (2√2 − 1)·OPT(P).
Proof. First we claim that gain_{T_f}(K; c′) ≤ 0 for all K. To see this, there are two cases. If K = K_i for some i, then we immediately see that drop_{T_j}(K) = 0 for all j > i, so gain_{T_f}(K) = −c(K) ≤ 0. Otherwise (if K ≠ K_i for all i), K had nonpositive gain when examined by the algorithm; and the well-known contraction lemma (e.g., see [12, §1.5]) immediately implies that gain_{T_i}(K) is nonincreasing in i, so gain_{T_f}(K) ≤ 0.
By Theorem 6, c′(T_f) equals the value of (P) on the graph G_f with costs c′. Since c′ ≤ c, and since at each step we only contract terminals, the value of this optimum must be at most OPT(P). Using the fact that c(T_f) = √2·c′(T_f), we get

    c(T_f) = √2·c′(T_f) ≤ √2·OPT(P).   (5)
Furthermore, for every i we have gain_{T_i}(K_i; c′) > 0, that is, drop_{T_i}(K_i; c′) > c′(K_i) = c(K_i). The equality follows since K_i contains no terminal-terminal edges. However, drop_{T_i}(K_i; c′) = (1/√2)·drop_{T_i}(K_i; c), because all edges of T_i are terminal-terminal. Thus we get, for every i = 1, ..., f − 1, drop_{T_i}(K_i; c) > √2·c(K_i).
Since drop_{T_i}(K_i; c) := mtst(G_i; c) − mtst(G_{i+1}; c), we have

    Σ_{i=1}^{f−1} drop_{T_i}(K_i; c) = mtst(G; c) − c(T_f).
Thus, we have

    Σ_{i=1}^{f−1} c(K_i) ≤ (1/√2)·Σ_{i=1}^{f−1} drop_{T_i}(K_i; c) = (1/√2)·(mtst(G; c) − c(T_f))
                        ≤ (1/√2)·(2·OPT(P) − c(T_f)),
where we use the fact that mtst(G; c) is at most twice OPT(P)⁵. Therefore

    c(T_alg) = c(T_f) + Σ_{i=1}^{f−1} c(K_i) ≤ (1 − 1/√2)·c(T_f) + √2·OPT(P).

Finally, using c(T_f) ≤ √2·OPT(P) from (5), the proof of Theorem 7 is complete.
References
1. Borchers, A., Du, D.: The k-Steiner ratio in graphs. SIAM J. Comput. 26(3), 857–
869 (1997)
2. Byrka, J., Grandoni, F., Rothvoß, T., Sanità, L.: An improved LP-based approxi-
mation for Steiner tree. In: Proc. 42nd STOC (to appear 2010)
3. Chakrabarty, D., Devanur, N.R., Vazirani, V.V.: New geometry-inspired relax-
ations and algorithms for the metric Steiner tree problem. In: Lodi, A., Panconesi,
A., Rinaldi, G. (eds.) IPCO 2008. LNCS, vol. 5035, pp. 344–358. Springer, Heidel-
berg (2008)
4. Chakrabarty, D., Könemann, J., Pritchard, D.: Hypergraphic LP relaxations for
Steiner trees. Technical Report 0910.0281, arXiv (2009)
5. Chlebík, M., Chlebíková, J.: Approximation hardness of the Steiner tree problem
on graphs. In: Penttonen, M., Schmidt, E.M. (eds.) SWAT 2002. LNCS, vol. 2368,
pp. 170–179. Springer, Heidelberg (2002)
6. Chopra, S.: On the spanning tree polyhedron. Operations Research Letters 8, 25–29
(1989)
7. Edmonds, J.: Optimum branchings. Journal of Research of the National Bureau of
Standards B 71B, 233–240 (1967)
8. Edmonds, J.: Matroids and the greedy algorithm. Math. Programming 1, 127–136
(1971)
9. Edmonds, J., Giles, R.: A min-max relation for submodular functions on graphs.
Annals of Discrete Mathematics 1, 185–204 (1977)
10. Goemans, M.X.: The Steiner tree polytope and related polyhedra. Math. Pro-
gram. 63(2), 157–182 (1994)
11. Goemans, M.X., Myung, Y.: A catalog of Steiner tree formulations. Networks 23,
19–28 (1993)
12. Gröpl, C., Hougardy, S., Nierhoff, T., Prömel, H.J.: Approximation algorithms for
the Steiner tree problem in graphs. In: Cheng, X., Du, D. (eds.) Steiner trees in
industries, pp. 235–279. Kluwer Academic Publishers, Norvell (2001)
⁵ This follows using standard arguments, and can be seen, for instance, by applying Theorem 6 to the cost function with all terminal-terminal costs divided by 2, and using short-cutting.
13. Gröpl, C., Hougardy, S., Nierhoff, T., Prömel, H.J.: Steiner trees in uniformly
quasi-bipartite graphs. Inform. Process. Lett. 83(4), 195–200 (2002); Preliminary
version appeared as a Technical Report at TU Berlin (2001)
14. Jain, K.: A factor 2 approximation algorithm for the generalized Steiner network
problem. Combinatorica 21(1), 39–60 (2001); Preliminary version appeared in Proc.
39th FOCS, pp. 448–457 (1998)
15. Könemann, J., Pritchard, D., Tan, K.: A partition-based relaxation for Steiner
trees. Math. Programming (2009) (in press)
16. Polzin, T.: Algorithms for the Steiner Problem in Networks. PhD thesis, Universität
des Saarlandes (February 2003)
17. Polzin, T., Vahdati Daneshmand, S.: A comparison of Steiner tree relaxations. Dis-
crete Applied Mathematics 112(1-3), 241–261 (2001); Preliminary version appeared
at COS 1998 (1998)
18. Polzin, T., Vahdati Daneshmand, S.: On Steiner trees and minimum spanning trees
in hypergraphs. Oper. Res. Lett. 31(1), 12–20 (2003)
19. Rajagopalan, S., Vazirani, V.V.: On the bidirected cut relaxation for the metric
Steiner tree problem. In: Proceedings of ACM-SIAM Symposium on Discrete Al-
gorithms, pp. 742–751 (1999)
20. Robins, G., Zelikovsky, A.: Tighter bounds for graph Steiner tree approximation.
SIAM J. Discrete Math. 19(1), 122–134 (2005); Preliminary version appeared as
Improved Steiner tree approximation in graphs at SODA 2000 (2000)
21. Singh, M., Lau, L.C.: Approximating minimum bounded degree spanning trees to
within one of optimal. In: Proc. 39th STOC, pp. 661–670 (2007)
22. Stanley, R.P.: Enumerative Combinatorics, vol. 1. Wadsworth & Brooks/Cole
(1986)
23. Vazirani, V.: Recent results on approximating the Steiner tree problem and its
generalizations. Theoret. Comput. Sci. 235(1), 205–216 (2000)
24. Warme, D.: Spanning Trees in Hypergraphs with Applications to Steiner Trees.
PhD thesis, University of Virginia (1998)
25. Wong, R.T.: A dual ascent approach for Steiner tree problems on a directed graph.
Math. Programming 28, 271–287 (1984)
Efficient Deterministic Algorithms for Finding a
Minimum Cycle Basis in Undirected Graphs
1 Introduction
2 Previous Work
from the pair involving the vertex in C with smallest index, assuming a given numbering of the vertices from 1 to n. All redundant representations (x, [u, v]) can be discarded at no additional cost by checking beforehand whether the path p_xu or the path p_xv contains a vertex with index smaller than x. Non-isometric cycles are discarded if they do not admit a representation for their vertex of smallest index.
Since an isometric cycle has a representation for all its vertices and at least one
of them belongs to any FVS, the FVS-based set of candidate cycles considered
in [18] contains the set I of isometric cycles, while the reverse is not true. An
example is illustrated in Fig. 1. As we shall see in Section 4, restricting attention
to isometric cycles leads to a substantially smaller set of candidate cycles w.r.t.
just removing all the duplicate ones or focusing on the FVS-based ones.
We first present the underlying idea of the witness update procedure in [15] in an iterative, rather than recursive, manner. For the sake of exposition, we assume w.l.o.g. that the initial number of witnesses is a power of 2, with null vectors beyond the ν-th position. There are always O(m) such vectors. At the i-th iteration, for 1 ≤ i ≤ ν, the i-th witness S_i is considered and the lightest cycle C_i with ⟨C_i, S_i⟩ = 1 is found. The index i can be expressed in a unique way as i = 2^q·r, where q is a nonnegative integer and r is a positive odd integer. Obviously, q = 0 if and only if i is odd. In the update phase, the 2^q witnesses {S_{i+1}, ..., S_{i+2^q}} are updated so that they become orthogonal to {C_{i+1−2^q}, ..., C_i}, namely to the last 2^q cycles added to the basis. In [15] it is shown that the overall cost of all update phases is O(m^ω). In this work, we proceed by blocks of b := log n witnesses and consider at the i-th iteration the i-th block instead of a single witness. For exposition purposes, we assume w.l.o.g. that b divides the initial number of witnesses and that the number of blocks is a power of 2. For the i-th block, with i = 2^q·r, after finding the corresponding b cycles of a minimum basis in the way we shall describe below, the next 2^q blocks of witnesses, namely the 2^q·b witnesses contained in those blocks, are updated so that they become orthogonal to the last 2^q·b cycles added to the basis. Since we consider blocks of witnesses, the total amount of work needed for all the witness updates is lower than that in [15], and still O(m^ω).
In order to find the b cycles of a minimum basis corresponding to the i-th block, we proceed as follows. At the beginning of the i-th iteration, we have already selected s = (i − 1)·b cycles. We must specify how to find the next b cycles C_{s+1}, ..., C_{s+b} using the corresponding b witnesses S_{s+1}, ..., S_{s+b}. We code the witnesses S_{s+j}, with 1 ≤ j ≤ b, in the following way: for each edge e ∈ E we have a word W(e) in which the j-th bit is set to 1 if the e-th component of S_{s+j} is 1. Then, we scan the isometric cycles searching for the lightest cycle in the subspace spanned by S_{s+1}, ..., S_{s+b}, i.e., non-orthogonal to at least one witness S_{s+j}, for 1 ≤ j ≤ b. A cycle C ∈ I is tested by taking the x-or of the logarithmic-size words W(e), for each edge e ∈ C. Let W(C) denote the resulting word. If W(C) is null, then C is orthogonal to all considered witnesses; otherwise C is the selected cycle. The witnesses S_{s+j} non-orthogonal to C are those whose j-th bit in W(C) is 1. Any one of these witnesses, say S_{s+t}, can be discarded while the remaining ones are updated. This can be done implicitly in O(m) time as follows: for each edge e ∈ E, if the e-th component of S_{s+t} is 1 then W(e) := W(e) x-or W(C); otherwise W(e) remains unchanged. This is equivalent to performing the update according to the standard rule, but has the advantage of preserving the witness coding in the desired form. Note that this internal update is necessary only after finding a cycle of a minimum basis, so that its cost over all phases is O(m²). After selecting C, we search for the other b − 1 cycles in the same way. To select the b cycles we examine each isometric cycle at most once. Due to the sparseness property, this takes O(nm) time. Since the above step must be applied O(m/log n) times, a minimum cycle basis is obtained in O(m²n/log n) time. This dominates the O(m^ω) cost of the witness update and thus is the overall complexity of the algorithm.
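The word coding and the implicit update translate almost verbatim into code; in the sketch below (names ours), Python's arbitrary-precision integers stand in for the b-bit machine words, so an x-or on W(e) acts on all b witnesses of the block at once.

def select_block_cycles(isometric_cycles, weight, W, edges, b):
    """Find up to b minimum-basis cycles against one block of witnesses.

    W: dict edge -> integer; bit j of W[e] is the e-th component of the
    j-th witness of the block (Python ints stand in for the b-bit words).
    Cycles are scanned once, in order of non-decreasing weight.
    """
    chosen = []
    for C in sorted(isometric_cycles, key=weight):
        wc = 0
        for e in C:                      # W(C) = x-or of W(e) over e in C
            wc ^= W.get(e, 0)
        if wc == 0:                      # orthogonal to the whole block
            continue
        chosen.append(C)                 # lightest non-orthogonal cycle
        t = wc & -wc                     # bit of one witness S_t to discard
        for e in edges:                  # implicit O(m) update:
            if W.get(e, 0) & t:          # if the e-th component of S_t is 1
                W[e] ^= wc               # fold W(C) into W(e)
        if len(chosen) == b:
            break
    return chosen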
Note that our algorithm is simpler than the O(m²n/log n + mn²) one proposed in [18] and has a better complexity for sparse graphs with m < n·log n.
4 Computational Results
To evaluate the practical performance of our AICE algorithm, we carried out an
extensive computational campaign and compared the results with those provided
by the main alternative algorithms, namely, the improved de Pina’s one described
in [15], the hybrid one in [17], the hybrid one based on FVS [18] and an efficient
implementation of Horton’s method. In the previous computational work [17],
implementations are based on LEDA [19] and the executables are available. For
fairness of comparison, we have implemented all algorithms in C with the same
data structures. Since, due to vector sparsity, we do not expect actual speed-up
from fast matrix multiplication and bit packing tricks, we have not included
these features in the implementation of the previous algorithms and we have not
tested the algorithm described in Section 3.1.
As benchmark we use random graphs and hypercubes like those in [17] but
of larger size, and Euclidean graphs like in [2]. Random graphs are generated
using the G(n; p) model [9], for p = 0.3, 0.5, 0.9 and for p = 4/n (sparse graphs,
m ≈ n), and hypercubes are in dimension d = 7, ..., 10, with n = 2^d. Both unweighted and weighted cases are considered, with integer weights randomly chosen from the uniform distribution over the range [0, 2^16]. Euclidean graphs have density p = 0.3, 0.5, 0.9, and the weight of an edge is the Euclidean distance between its endpoints, supposing that vertices are randomly distributed on a 10×10 square. The algorithms are run on a Xeon 2.0 GHz 64-bit machine with 3 GB of memory, under Linux. We use the GNU gcc 4.1 compiler (g++ 4.1 for LEDA
implementations) with the -O3 optimization flag. For each type of graph, results
are reported as average (for ν and the size of the witnesses the averages are
rounded up) on 20 randomly generated instances with the same characteristics.
Table 1 reports the ratio between the size of the sets of candidate cycles and
the cyclomatic number ν. Both sets H and HFVS (the one of FVS-based candidate
cycles) are without duplicates. The set HFVS does not contain the representations
(x, [u, v]) whose corresponding cycle has a smaller vertex than x also belonging to
the given FVS. A close-to-minimum FVS is obtained by the heuristic in [5]. For
weighted graphs, the number of isometric cycles is very close to ν, so that very few
of them must be tested for independence. Also the standard deviation is in general
very small, less than 2%. The size of H and HFVS is up to 20 times larger than ν,
with a higher standard deviation, 10 − 20%. Thus even a close-to-minimum FVS
does not help. If duplicate cycles are not deleted, the size of both H and HFVS turns
out to be up to 10 times larger (for lack of space not shown in the table).
For unweighted graphs, isometric cycles are less effective for larger density
because in this case a random graph tends to become a complete graph with nν/3
isometric triangles. The particular weighting of Euclidean graphs also reduces
the impact of the restriction to isometric cycles.
In order to try to further trim the set of isometric cycles, we have removed isometric cycles that admit a wheel decomposition, i.e., that can be expressed as the composition of more than two cycles of smaller weight: there is a vertex r not incident to any edge in C such that each co-tree edge in C relative to T_r closes a lighter fundamental cycle in T_r. For an example see Fig. 2(a). The
Table 1. Comparison of the number of candidate cycles divided by ν for the Horton cycles (H), the FVS-based cycles (HFVS), and the isometric cycles (I). Both H and HFVS do not include duplicates. For unweighted graphs, the isometric cycles which do not admit a wheel decomposition (InoWh) are also considered, with the generation time (in seconds). The label "x" indicates memory limit exceeded (3 GB).

                            Weighted                    Unweighted
              n      ν      H    HFVS    I      H    HFVS     I   InoWh (Time)
Random 0.3  100   1375   6.67   6.35  1.07    9.28  10.86   4.57   1.97 (0.03)
            200   5775  11.65  11.51  1.06   13.02  15.94   7.78   2.44 (0.42)
            300  13173  16.39  16.19  1.05   15.89  18.94  10.74   2.67 (1.70)
            400  23527  19.40  19.23  1.04   19.13  23.82  13.76   2.83 (4.46)
            500  36903  23.42  23.29  1.04   22.12  27.56  16.78   2.91 (9.46)
Random 0.5  100   2379   6.29   6.21  1.04   10.81  12.55   9.05   1.76 (0.06)
            200   9749  11.65  11.59  1.03   19.06  20.18  17.35   1.83 (0.65)
            300  22119  15.45  15.38  1.03   27.47  29.29  25.67   1.89 (2.56)
            400  39556  20.01  19.88  1.03   35.98  40.43  34.19   1.93 (5.93)
            500  61871  24.14  24.17  1.02   44.31  46.25  42.41   1.97 (11.66)
Random 0.9  100   4360   6.62   6.66  1.02   27.34  27.36  27.12   1.06 (0.10)
            200  17706  11.15  11.15  1.02   54.26  54.58  54.04   1.05 (1.03)

Weighted hypercubes:
 d      ν      H    HFVS     I
 7    321   7.67   7.59   1.56
 8    769  11.99  12.22   1.69
 9   1793  20.12  20.70   1.79
10   4097  35.31  34.44   1.92

Weighted random sparse:
   n     ν      H    HFVS     I
 500   510  18.21  13.72   2.15
 750   761  25.59  19.45   2.51
1000  1010  31.86  23.10   2.73

Euclidean, n = 200:
  p     ν      H    HFVS      I   InoWh (Time)
0.1  1781  27.13  25.26   3.32   1.08 (0.08)
0.3  5770  41.69  40.96   7.92   1.03 (0.30)
0.5  9733  49.54  49.21  17.65   1.02 (0.68)
Table 2. Time in seconds and size of the witnesses for the AICE independence test and the original de Pina's test restricted to the isometric cycles. For AICE, the percentage of the selected cycles which do not require a witness update (% c.n.u.) is also reported.

                              Weighted                                          Unweighted
                     De Pina's test on I          AICE test              De Pina's test on I          AICE test
              n      ν    Time  max|Si| avg|Si|  Time max|Si| avg|Si| %c.n.u.  Time  max|Si| avg|Si|  Time max|Si| avg|Si| %c.n.u.
Random 0.3  100   1375    0.10    628      9    0.00   12      2     97.20    0.08    665     11    0.00   39      2     74.73
            200   5775    2.02   2656     13    0.00   16      2     98.08    1.57   2767     16    0.01   56      2     87.18
            300  13173   15.88   6185     15    0.01   26      2     98.75   13.03   6399     18    0.02   39      2     92.51
Random 0.5  100   2379    0.29   1105     10    0.00   12      2     98.15    0.21   1117      9    0.00    9      2     96.41
            200   9749    5.67   4560     13    0.01   17      2     98.93    4.82   4567     11    0.02    8      2     98.33
            300  22119   74.68  10411     15    0.02   24      2     99.20   40.70  10461     12    0.07    8      2     98.83
Random 0.9  100   4360    0.92   2007      9    0.00    8      2     99.16    0.54   1973      7    0.01    2      2     99.80
            200  17706   30.25   8273     13    0.02   13      2     99.44   11.96   8074      7    0.07    2      2     99.91
Table 3. Comparison of the running times in seconds for the main algorithms, for our C implementations as well as the available LEDA ones. The labels "-" and "x" indicate that the time limit of 900 seconds and the memory limit of 3 GB, respectively, are exceeded.
LEDA implementation Our implementation LEDA implementation Our implementation
De Pina Hybrid Hyb FVS De Pina Hybrid Hyb FVS Horton AICE De Pina Hybrid Hyb FVS De Pina Hybrid Hyb FVS Horton AICE
Fig. 2. (a) A small example of wheel decomposition centered in vertex 7 (T7 in bold):
the isometric hexagonal cycle of weight 6 can be obtained as the composition of the
6 inner triangles of weight 5. (b) A wheel decomposition centered in vertex 4 (T4 in
bold) of the outer cycle of weight 8. Since the non-isometric cycle C(4, [7, 8]) of weight
9 can be separated by [5, 6] (dashed edge) in two cycles of strictly smaller weight (6
and 7), when decomposing the outer cycle the weight of the heavier of the two cycles,
7, is considered instead of 9.
isometric cycles are checked for a wheel decomposition after they are sorted by
non-decreasing weight. If a fundamental cycle of a possible wheel decomposition
is non-isometric, we consider the weight of the heavier of the two cycles in which
it can be separated, see Fig. 2(b). Since the total number of edges is bounded
by nν (sparseness property) and there are n vertices, it is easy to see that all
candidate cycles that admit a wheel decomposition can be discarded in O(mn2 ).
In Table 1 the set of isometric cycles that are left after checking for wheel
decomposition is denoted by InoWh . Although its size is very close to ν, in partic-
ular for Euclidean graphs, the time needed to obtain InoWh is very high compared
to the total time of AICE (reported in Table 3). This suggests that, due to the
efficiency of the independence test, it does not pay to further reduce the number
of cycles with a similar technique.
In Table 2, we assess the impact of the adaptive independence test of AICE
w.r.t. the original de Pina’s test. For a fair comparison, they are both applied
to the set I of isometric cycles. AICE results are averages on 20 instances,
whereas de Pina’s test results are averages on n instances (n = 100, 200, 300)
corresponding to n different randomly generated spanning trees that induce the
initial (standard) basis {S1 , . . . , Sν }. As in [17], we report statistics on the size
(number of nonzero components) of the witnesses Si , for 1 ≤ i ≤ ν, used in the
independence test, namely the maximum and the rounded up average cardinality.
Note that in the AICE tests the rounded up average of the size of the witnesses
Si , with 1 ≤ i ≤ ν, is always equal to 2. The maximum has a large standard
deviation, since it depends on the specific instance, but it is always much smaller
than that of de Pina’s test, whose standard deviation is less than 10%. Not only
the size of the witnesses is very small but for almost all cycles identified by AICE
no witness update is needed (Step 11 in Algorithm 1). Since many unnecessary
operations are avoided, the overall computing time is greatly reduced.
In Table 3, we compare the running times of the main algorithms. First, we
consider the algorithms whose implementation based on LEDA [19] is available,
Efficient Deterministic Algorithms for Finding a Minimum Cycle Basis 409
Table 4. Comparison of the running times in seconds for large instances. In most cases,
the previous algorithms exceed the time limit of 1800 seconds (“-”).
namely de Pina’s [15], the Hybrid [17] and the FVS-based Hybrid [18] algo-
rithms. For a fair comparison, these algorithms have also been implemented in
C within the same environment used for AICE. De Pina’s algorithm is imple-
mented with the heuristics suggested in [15] and the Hybrid method is that
in [17] but duplicates are removed from H. In the FVS-based Hybrid algorithm
a close-to-minimum FVS is obtained by the heuristic in [5], but the heuristic
computing time is neglected, and duplicate cycles are also removed. We also de-
vised an efficient version of Horton’s algorithm using H without duplicates and
an ad hoc Gaussian elimination exploiting operations on GF(2). For Euclidean
graphs LEDA algorithms cannot be tested since they require integer weights.
The time limit is set to 900 seconds. For all results the standard deviation is in
general less than 10%. Our C implementation of the previous algorithms turns
out to be more effective than the ones based on LEDA. It is worth pointing out
that, except for dense unweighted random graphs, the ad hoc implementation of
Horton’s algorithm is substantially faster than sophisticated algorithms based
on de Pina’s idea. However, AICE outperforms all previous algorithms, in most
cases by one or two order of magnitude.
Finally, in Table 4 we report the results of our C implementations for larger
weighted instances. AICE finds an optimal solution for graphs with up to 3000
vertices within the time limit of 1800 seconds, while the other algorithms cannot
solve most of the instances.
An interesting open question is whether it is possible to do without the in-
dependence test, even though in practice it is unlikely to lead to an efficient
algorithm.
References
1. Amaldi, E., Iuliano, C., Jurkiewicz, T., Mehlhorn, K., Rizzi, R.: Breaking the
O(m2 n) barrier for minimum cycle bases. In: Fiat, A., Sanders, P. (eds.) ESA
2009. LNCS, vol. 5757, pp. 301–312. Springer, Heidelberg (2009)
2. Amaldi, E., Liberti, L., Maculan, N., Maffioli, F.: Edge-swapping algorithms for the
minimum fundamental cycle basis problem. Mathematical Methods of Operations
Research 69(12), 205–233 (2009)
3. Bafna, V., Berman, P., Fujito, T.: A 2-approximation algorithm for the undirected
feedback vertex set problem. SIAM J. Discrete Math. 12(3), 289–297 (1999)
4. Bollobás, B.: Modern Graph Theory. Graduate Texts in Mathematics, vol. 184. Springer, Heidelberg (2nd printing)
5. Brunetta, L., Maffioli, F., Trubian, M.: Solving the feedback vertex set problem on
undirected graphs. Discrete Applied Mathematics 101(1-3), 37–51 (2000)
6. Coppersmith, D., Winograd, S.: Matrix multiplication via arithmetic progressions.
J. Symb. Comput. 9(3), 251–280 (1990)
7. De Pina, J.C.: Applications of shortest path methods. Ph.D. thesis, University of
Amsterdam, The Netherlands (1995)
8. Deo, N., Prabhu, G., Krishnamoorthy, M.S.: Algorithms for generating fundamen-
tal cycles in a graph. ACM Trans. on Mathematical Software 8(1), 26–42 (1982)
9. Erdös, P., Rényi, A.: On random graphs, I. Publicationes Mathematicae (Debre-
cen) 6, 290–297 (1959)
10. Gleiss, P.M.: Short cycles: minimum cycle bases of graphs from chemistry and
biochemistry. Ph.D. thesis, Universität Wien, Austria (2001)
11. Golynski, A., Horton, J.D.: A polynomial time algorithm to find the minimum
cycle basis of a regular matroid. In: Penttonen, M., Schmidt, E.M. (eds.) SWAT
2002. LNCS, vol. 2368, pp. 200–209. Springer, Heidelberg (2002)
12. Hartvigsen, D., Mardon, R.: The all-pairs min cut problem and the minimum cycle
basis problem on planar graphs. SIAM J. Discrete Math. 7(3), 403–418 (1994)
13. Horton, J.D.: A polynomial-time algorithm to find the shortest cycle basis of a
graph. SIAM J. Computing 16(2), 358–366 (1987)
14. Kavitha, T., Liebchen, C., Mehlhorn, K., Michail, D., Rizzi, R., Ueckerdt, T., Zweig, K.A.: Cycle bases in graphs: characterization, algorithms, complexity, and applications. Computer Science Review 3(4), 199–243 (2009)
15. Kavitha, T., Mehlhorn, K., Michail, D., Paluch, K.E.: An Õ(m²n) algorithm for minimum cycle basis of graphs. Algorithmica 52(3), 333–349 (2008)
16. Liebchen, C., Rizzi, R.: Classes of cycle bases. Discrete Applied Mathemat-
ics 155(3), 337–355 (2007)
17. Mehlhorn, K., Michail, D.: Implementing minimum cycle basis algorithms. ACM
Journal of Experimental Algorithmics 11 (2006)
18. Mehlhorn, K., Michail, D.: Minimum cycle bases: Faster and simpler. Accepted for
publication in ACM Trans. on Algorithms (2007)
19. Mehlhorn, K., Näher, S.: LEDA: A Platform for Combinatorial and Geometric
Computing. Cambridge University Press, Cambridge (1999)
20. Rizzi, R.: Minimum weakly fundamental cycle bases are hard to find. Algorith-
mica 53(3), 402–424 (2009)
21. Stepanec, G.F.: Basis systems of vector cycles with extremal properties in graphs.
Uspekhi Mat. Nauk II 19, 171–175 (1964) (in Russian)
22. Zykov, A.A.: Theory of Finite Graphs. Nauka, Novosibirsk (1969) (in Russian)
Efficient Algorithms for Average Completion
Time Scheduling
René Sitters
1 Introduction
There is a vast literature on minimizing average completion time in machine scheduling. Most of it appeared in the combinatorial optimization community in the last fifteen years. The papers by Schulz and Skutella [22] and Correa and Wagner [6] give a good overview.
The shortest remaining processing time (SRPT) algorithm is a well-known
and simple online procedure for preemptive scheduling of jobs. It produces an
optimal schedule on a single machine with respect to the average completion time
objective [20]. The example in Figure 1 shows that this is not true when SRPT
is applied to parallel machines. The best known upper bound on its competitive
ratio was 2 [18] until recently (SODA 2010), when Chung et al. [5] showed that the ratio is at most 1.86. Moreover, they show that the ratio is not better than 21/19 > 1.105. In this paper, we show that the competitive ratio of SRPT is at
most 1.25.
The SRPT algorithm has a natural generalization to the case where jobs have
given weights. Unfortunately, our proof does not carry over to this case. No
algorithm is known to have a competitive ratio less than 2. Remarkably, even
for the offline problem, the only ratio less than 2 results from the approximation
scheme given by Afrati et al. [1]. Schulz and Skutella [22] give a randomized 2-
approximate algorithm which can be derandomized and applied online (although
not at the same time). A deterministic online algorithm for the preemptive case
is given by Megow and Schulz [16], and for the non-preemptive case by Correa and Wagner [6]. The ratios are, respectively, 2 and 2.62. The first bound is tight; the latter probably is not. On the single machine, no
Supported by a research grant from the Netherlands Organization for Scientific Re-
search (NWO-veni grant).
Fig. 1. There are two machines. At time 0, two jobs of length 1 and one job of length
2 are released and at time 2, two jobs of length 1 are released. The picture shows the
suboptimal SRPT schedule and the optimal schedule.
Table 1. Known lower and upper bounds on the competitive ratio for randomized and deterministic online algorithms

Problem (Online)        L.B. Rand.     U.B. Rand.           L.B. Det.    U.B. Det.
1|rj, pmtn| Σ Cj        1              1 [20]               1            1 [20]
1|rj, pmtn| Σ wj Cj     1.038 [7]      4/3 [21]             1.073 [7]    1.57 [24]
1|rj| Σ Cj              e/(e−1) [26]   e/(e−1) ≈ 1.58 [3]   2 [27]       2 [11,15,18]
1|rj| Σ wj Cj           e/(e−1) [26]   1.69 [9]             2 [27]       2 [2,19]
P|rj, pmtn| Σ Cj        1              1.86 → 5/4 [5]       1.047 [27]   1.86 → 5/4 [5]
P|rj, pmtn| Σ wj Cj     1              2 → 1.791 [22,16]    1.047 [27]   2 → 1.791 [16]
P|rj| Σ Cj              1.157 [23]     2 → 1.791 [22]       1.309 [27]   2 → 1.791 [14]
P|rj| Σ wj Cj           1.157 [23]     2 → 1.791 [22]       1.309 [27]   2.62 → 1.791 [6]
The algorithm has to construct the schedule online, i.e., the number of machines is known a priori but jobs are only revealed at their release times. Even the number of jobs n = |J| is unknown until the last job has been scheduled. Given a schedule, we denote the completion time of job j by C_j. The value of a schedule is the weighted average completion time (1/n)·Σ_{j∈J} w_j C_j, and the objective is to find a schedule with small value. We say that an algorithm is c-competitive if it finds for any instance a schedule with value at most c times the optimal value.
Let t = 1. Repeat:
If there are more than m jobs available for slot t, then process m jobs in slot t that have
the shortest remaining processing times among all available jobs. Otherwise, process
all available jobs. Let t = t + 1.
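In this discrete-slot form, SRPT takes only a few lines; the sketch below (variable names ours) simulates it and reproduces the suboptimal schedule of Figure 1.

def srpt(jobs, m):
    """Slot-by-slot SRPT on m identical machines.

    jobs: list of (release, processing) integer pairs; slot t is (t-1, t].
    Returns the list of completion times.
    """
    n = len(jobs)
    remaining = [p for _, p in jobs]
    completion = [None] * n
    done, t = 0, 0
    while done < n:
        t += 1
        avail = [j for j in range(n)
                 if jobs[j][0] <= t - 1 and remaining[j] > 0]
        avail.sort(key=lambda j: remaining[j])
        for j in avail[:m]:
            remaining[j] -= 1
            if remaining[j] == 0:
                completion[j] = t
                done += 1
    return completion

# The instance of Figure 1: SRPT yields completion times [1, 1, 3, 3, 4],
# total 12, while the optimum achieves 11.
print(srpt([(0, 1), (0, 1), (0, 2), (2, 1), (2, 1)], m=2))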
The SRPT algorithm as defined here is not deterministic since it may need
to choose between jobs with the same remaining processing time. We say that
a schedule σ is an SRPT schedule for instance I if it is a possible output of the
SRPT algorithm applied to I. Note that the values μi (σ) do not depend on the
non-deterministic choices of the algorithm, i.e., if σ and σ′ are SRPT schedules for the same instance on n jobs, then μ_i(σ) = μ_i(σ′) for all i ∈ {1, 2, ..., n}.
All four lemmas are quite intuitive. For the first lemma, imagine that for
a given instance we reduce the release time of some job by δ and increase its
processing time by at least the same amount. Then, the optimum value cannot
improve since there is no advantage in starting a job earlier if this is undone
by an increase in its processing time. The first lemma shows that SRPT has an
even stronger property in this case.
Lemma 1. Let I and I′ satisfy J = J′ and, for each j ∈ J, r′_j = r_j − δ_j ≥ 0 and p′_j ≥ p_j + δ_j, for some integers δ_j ≥ 0. Let σ and σ′ be SRPT schedules for, respectively, I and I′. Then, for every i ∈ {1, 2, ..., n},

    μ_i(σ) ≤ μ_i(σ′).
Proof. We prove it by induction on the makespan of σ. Let q_j(t) and q′_j(t) be the remaining processing time of job j at time t in, respectively, σ and σ′. Define the multiset Q(t) = {q_j(t) | r_j ≤ t}, i.e., it contains the remaining processing times of all jobs released at time t or earlier. Let Q′(t) contain the remaining processing times of the same set of jobs in σ′, i.e., Q′(t) = {q′_j(t) | r_j ≤ t}. Note that we take r_j, and not r′_j, in Q′. Let Q_i(t) and Q′_i(t) be the i-th smallest element in, respectively, Q(t) and Q′(t). We claim that for any time point t,

    Q_i(t) ≤ Q′_i(t)   for all i ∈ {1, ..., |Q(t)|}.   (1)

If we can show (1) then the proof follows directly, since μ_i(σ) (respectively μ_i(σ′)) is the smallest t such that Q(t) (respectively Q′(t)) has at least i zero elements.

The proof is by induction on t. The claim is true for t = 0 since, for every job with r_j = 0, we have q_j(0) = p_j ≤ p′_j = q′_j(0). Now consider an arbitrary time t_0 and assume the claim is true for all t ≤ t_0.

First we analyze the changes when no job is released at time t_0 + 1. If σ processes fewer than m jobs in slot t_0, then all non-zero elements in Q(t_0) are reduced by one, implying Q_i(t_0 + 1) ≤ Q′_i(t_0 + 1) for all i ≤ |Q(t_0 + 1)|, since every element of Q′ decreases by at most one per slot. Now assume σ processes exactly m jobs in slot t_0. Then it processes jobs with remaining processing times Q_{k+1}(t_0), Q_{k+2}(t_0), ..., Q_{k+m}(t_0) for some k ≥ 0, while Q_j(t_0) = 0 for any j ≤ k. Since Q′_{k+1}(t_0), Q′_{k+2}(t_0), ..., Q′_{k+m}(t_0) are also non-zero, only values Q′_s(t_0) with s ≤ k + m are reduced in σ′. Again, Q_i(t_0 + 1) ≤ Q′_i(t_0 + 1) for all i ≤ |Q(t_0 + 1)|.

Now assume some jobs are released at time t_0 + 1. We may use the analysis above and only consider the effect of the newly added jobs. For any new job j we have q_j(t_0 + 1) = p_j ≤ q′_j(t_0 + 1). Clearly, (1) remains valid after the addition of these jobs.
The next lemma follows directly from Lemma 1 and was given before by Phillips et al. (Lemma 4.3 in [18]).

Lemma 2. Let instance I′ be obtained from I by removing some of the jobs from I. Let σ and σ′ be SRPT schedules for, respectively, I and I′, and let n, n′ be the numbers of jobs in I and I′. Then, for every i ≤ n′,

    μ_i(σ) ≤ μ_i(σ′).

Proof. For each job j that is included in I but not in I′, we add a job j to I′ with the same release time r_j and p_j = ∞ (or some large enough number). In the SRPT schedule for the extended instance, the added jobs will complete last and the other jobs are scheduled as in σ′. Now the lemma follows directly from Lemma 1 with δ_j = 0 for all j. (N.B. Phillips et al. [18] use the same argument. However, we do need the stronger version of Lemma 1 with arbitrary δ_j ≥ 0 to prove Lemma 4.)
Proof. Let t = μ_{n−m}(ρ). We change the instance I into I′ as follows, so that no job is released after time t in the new instance: every job j with r_j ≥ t + 1 gets release time r′_j = t and processing time p_j + r_j − t. Let τ and τ′ be SRPT schedules for I and I′, respectively. Then, by Lemma 1, we have

    μ_i(τ) ≤ μ_i(τ′),   for any i ∈ {1, 2, ..., n}.   (2)

On the other hand, we can change ρ into a feasible schedule ρ′ for I′ without changing any of the completion times, since at most m jobs are processed after time t in ρ. Hence, we may assume

    μ_i(ρ) = μ_i(ρ′),   for any i ∈ {1, 2, ..., n}.   (3)
Let W_t(τ′) and W_t(ρ′) be the total remaining processing time at time t in, respectively, τ′ and ρ′. Since the last m jobs complete at time t or later in ρ′, we have

    Σ_{i=n−m+1}^{n} μ_i(ρ′) ≥ mt + W_t(ρ′).   (4)

Since no jobs are released after t, the SRPT schedule satisfies

    Σ_{i=n−m+1}^{n} μ_i(τ′) ≤ mt + W_t(τ′).   (5)

Fig. 2. A tight example for Lemma 3. Take m = 2, two jobs of length 1, and one job of length 2, all released at time 0. It is possible to complete the jobs by time 2. The remaining volume at time t = 2 in the SRPT schedule is 1 = mt/4.
Hence, we see from (7) and (10) that the theorem follows by partitioning the com-
pletion times in groups of size m. The first group may be smaller.
3 Weighted Jobs
The SRPT algorithm has a natural generalization to the case where jobs have
given weights. Unfortunately, our proof does not carry over to this case. A com-
mon approach in the analysis of the weighted average completion time is to use
the mean busy time of a job which is defined as the average point in time that a
job is processed. Given a schedule σ, let Z(σ) be the sum of weighted completion times and Z^R(σ) the sum of weighted mean busy times. On a single machine,
the average (or total) weighted mean busy time is minimized by scheduling jobs
preemptively in order of highest ratio of wj /pj [8]. This is called the preemptive
weighted shortest processing time (WSPT) schedule. The WSPT-schedule is not
unique but its total mean busy time is. Now consider a fast single machine that
runs each job m times faster, i.e., job j has release time rj and processing time
pj /m. For a given instance I, let σm (I) be its preemptive WSPT-schedule on
the fast single machine. The following inequality is a well-known lower bound on the optimal value of a preemptive or non-preemptive schedule [4,22]:

    Z^R(σ_m(I)) + (1/2)·Σ_j w_j p_j ≤ Opt(I).   (11)
Our algorithm uses the same two steps as the algorithms by Schulz and Skutella
[22] and Correa and Wagner [6]: First, the jobs are scheduled on the fast single
machine and then, as soon as an α-fraction of a job is processed, a job is placed
as early as possible on one of the parallel machines. The algorithm in [22] uses
random values of α and a random assignment to machines. The deterministic
algorithm of [6] optimizes over α and simply takes the first available machine
for each job. Our algorithm differs at three points: First, we take a fast single
machine schedule of a modified instance I′ instead of I. Second, we do not apply preemptive WSPT but use non-preemptive WSPT instead. Third, we simply take α = 0 for each job. The behavior of our algorithm depends on the input I and a real number ε > 0.
Theorem 2. With ε = 1/√m, algorithm Online(ε) is δ_m-competitive for minimizing total weighted completion time, where δ_m = (1 + 1/√m)²·(3e − 2)/(2e − 2). The ratio holds for preemptive and non-preemptive scheduling on m identical parallel machines.
We denote the start and completion times of job j in the fast-machine schedule ρ_m by, respectively, s_j and c_j, and in the parallel-machine schedule ρ by S_j and C_j. First, we prove that the optimal value does not change much by the modification made in step (i).
Lemma 5. Opt(I′) ≤ (1 + ε)·Opt(I).

Proof. Let σ* be an optimal schedule for I and, for any job j, let C*_j be the completion time of j in σ*. We stretch the schedule by a factor 1 + ε, such that each job j completes at time (1 + ε)·C*_j and starts at time (1 + ε)·C*_j − p_j ≥ r_j + ε·p_j = r′_j (using C*_j ≥ r_j + p_j). We see that the schedule is feasible for I′ and its value is exactly 1 + ε times the optimal value of I.
Since we apply non-preemptive WSPT, the schedule ρ_m derived in step (ii) will in general not be the same as the fast single machine schedule σ_m(I′), which is derived by preemptive WSPT. Hence, we cannot use inequality (11) directly. We define a new instance I″ such that ρ_m is the fast machine schedule of I″. We shall prove this in Lemma 7 but first we introduce I″ and bound its optimal value like we did in the previous lemma. Let I″ = {(p_j″, w_j″, r_j″) | j = 1 . . . n} with p_j″ = p_j′, w_j″ = w_j′ and r_j″ = min{γ r_j′, s_j}, where γ = 1 + 1/(εm).
Lemma 6. Opt(I″) ≤ (1 + 1/(εm))Opt(I′).
We see that the schedule is feasible for I″ and its value is exactly γ times the optimal value of I′.
Algorithm Online(ε):
(i) Let I′ = {(p_j′, w_j′, r_j′) | j = 1 . . . n} with p_j′ = p_j, w_j′ = w_j and r_j′ = r_j + εp_j.
(ii) Apply non-preemptive WSPT to I′ on the fast single machine. Let ρ_m be this schedule and let s_j be the start time of job j in ρ_m.
(iii) Each job j is placed at time s_j on one of the parallel machines as early as possible (but not before s_j). Let ρ be the final schedule.
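Read operationally, and under the reconstruction of step (i) above (r_j′ = r_j + εp_j), the three steps amount to the following sketch; identifiers are ours, not the author's:

```python
import heapq

def online_eps(jobs, m, eps):
    """Sketch of Online(eps).  jobs: list of (r_j, p_j, w_j).
    Step (i): shift releases to r'_j = r_j + eps*p_j.  Step (ii):
    non-preemptive WSPT on a single machine that is m times faster.
    Step (iii): place job j at time s_j on the first machine that is
    free, but never before s_j.  Returns the completion times."""
    n = len(jobs)
    rel = [r + eps * p for r, p, _ in jobs]
    # step (ii): non-preemptive WSPT on the fast single machine
    order = sorted(range(n), key=lambda j: rel[j])
    t, i, s = 0.0, 0, [0.0] * n
    pending = []                                 # (-w_j/p_j, j)
    while i < n or pending:
        if not pending:                          # idle to next release
            t = max(t, rel[order[i]])
        while i < n and rel[order[i]] <= t:
            j = order[i]
            heapq.heappush(pending, (-jobs[j][2] / jobs[j][1], j))
            i += 1
        _, j = heapq.heappop(pending)
        s[j] = t
        t += jobs[j][1] / m                      # fast machine: p_j/m
    # step (iii): list-schedule on the m parallel machines
    free, C = [0.0] * m, [0.0] * n
    for j in sorted(range(n), key=lambda j: s[j]):
        k = min(range(m), key=lambda q: free[q])  # first available machine
        start = max(free[k], s[j])
        C[j] = start + jobs[j][1]
        free[k] = C[j]
    return C
```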
Clearly, Opt(I) ≤ Opt(I′) since we only shift release times forward. Combining Lemmas 5 and 6 we see that Opt(I″) ≤ (1 + 1/(εm))(1 + ε)Opt(I). Choosing ε = 1/√m we obtain the following corollary.
Corollary 1. Opt(I) ≤ Opt(I″) ≤ (1 + 1/√m)² Opt(I).   (12)
We see that job k was available at the time we started job j in step (ii). Hence,
we must have wk /pk ≤ wj /pj .
Combining this with Corollary 1 and Lemma 7, we finally get a useful lower
bound on the optimal solution.
Corollary 2. Z^R(ρ_m) + (1/2) ∑_j w_j p_j ≤ (1 + 1/√m)² Opt(I).
The lower bound of Corollary 2 together with the obvious lower bound Opt(I) ≥ ∑_j w_j p_j results in the following lemma.
C_j = S_j + p_j ≤ α s_j + p_j < α b_j + p_j = α (b_j + p_j/2) + (1 − α/2) p_j.

Next, we add weights and take the sum over all jobs:

∑_j w_j C_j ≤ α ( Z^R(ρ_m) + (1/2) ∑_j w_j p_j ) + (1 − α/2) ∑_j w_j p_j.
Now we use Corollary 2 and use that Opt(I″) ≥ Opt(I) ≥ ∑_j w_j p_j. For any α ≤ 2 we have

∑_j w_j C_j ≤ α (1 + 1/√m)² Opt(I) + (1 − α/2) Opt(I) ≤ (1 + α/2)(1 + 1/√m)² Opt(I).
First we give a short proof that α ≤ 2. This shows that the competitive ratio is
at most 2 + o(m).
Lemma 9. Sj ≤ 2sj for any job j.
Proof. Consider an arbitrary job j. At time sj , the total processing time of jobs k
with sk < sj is at most msj . Since these are the only jobs processed on the parallel
machines between time sj and Sj we have msj ≥ m(Sj − sj ). Hence, Sj ≤ 2sj .
The bound of the next lemma is stronger. The proof is given in the technical report [25]. Lemma 8 tells us that the competitive ratio is at most 1 + e/(2(e − 1)) ≈ 1.791 in the limit.

Lemma 10. S_j ≤ (e/(e − 1)) s_j and this bound is tight.
4 Conclusion
We have shown that approximation ratios less than 2 can be obtained for parallel
machines by simple and efficient online algorithms. The lower bounds indicate
that competitive ratios close to 1 are possible for randomized algorithms, espe-
cially when preemption is allowed. Our analysis for SRPT is tight and it seems
that a substantially different proof is needed to get below 1.25. Already, the gap
with the lower bound, 1.105, is quite small. Muthukrishnan et al. [17] show that SRPT is at most 14-competitive w.r.t. the average stretch of jobs. Possibly, our
result can reduce this ratio substantially. The analysis for algorithm Online is
not tight and a slight modification of the algorithm and analysis may give a ratio
e/(e − 1) + o(m) ≈ 1.58 + o(m). Moreover, the analysis is not parameterized by
m. A refined analysis will reduce the o(m) for small values of m.
References
1. Afrati, F., Bampis, E., Chekuri, C., Karger, D., Kenyon, C., Khanna, S., Milis, I.,
Queyranne, M., Skutella, M., Stein, C., Sviridenko, M.: Approximation schemes
for minimizing average weighted completion time with release dates. In: FOCS ’99,
pp. 32–44 (1999)
2. Anderson, E.J., Potts, C.N.: Online scheduling of a single machine to minimize
total weighted completion time. Math. Oper. Res. 29, 686–697 (2004)
3. Chekuri, C., Motwani, R., Natarajan, B., Stein, C.: Approximation techniques for
average completion time scheduling. SIAM Journal on Computing 31, 146–166
(2001)
4. Chou, M.C., Queyranne, M., Simchi-Levi, D.: The asymptotic performance ratio
of an on-line algorithm for uniform parallel machine scheduling with release dates.
Mathematical Programming 106, 137–157 (2006)
5. Chung, C., Nonner, T., Souza, A.: SRPT is 1.86-competitive for completion time
scheduling. In: Proceedings of the 21st Annual ACM-SIAM Symposium on Discrete
Algorithms (Austin, Texas), pp. 1373–1388 (2010)
6. Correa, J.R., Wagner, M.R.: LP-based online scheduling: from single to parallel
machines. Mathematical Programming 119, 109–136 (2009)
7. Epstein, L., van Stee, R.: Lower bounds for on-line single-machine scheduling.
Theoretical Computer Science 299, 439–450 (2003)
8. Goemans, M.X.: Improved approximation algorithms for scheduling with release
dates. In: Proc. 8th Symp. on Discrete Algorithms, New Orleans, Louisiana, United
States, pp. 591–598 (1997)
9. Goemans, M.X., Queyranne, M., Schulz, A.S., Skutella, M., Wang, Y.: Single ma-
chine scheduling with release dates. SIAM Journal on Discrete Mathematics 15,
165–192 (2002)
10. Hall, L.A., Schulz, A.S., Shmoys, D.B., Wein, J.: Scheduling to minimize average
completion time: Off-line and on-line approximation algorithms. Mathematics of
Operations Research 22, 513–544 (1997)
11. Hoogeveen, J.A., Vestjens, A.P.A.: Optimal on-line algorithms for single-machine
scheduling. In: Cunningham, W.H., Queyranne, M., McCormick, S.T. (eds.) IPCO
1996. LNCS, vol. 1084, pp. 404–414. Springer, Heidelberg (1996)
12. Hussein, M.E., Schwiegelshohn, U.: Utilization of nonclairvoyant online schedules.
Theoretical Computer Science 362, 238–247 (2006)
13. Jaillet, P., Wagner, M.R.: Almost sure asymptotic optimality for online routing
and machine scheduling problems. Networks 55, 2–12 (2009)
14. Liu, P., Lu, X.: On-line scheduling of parallel machines to minimize total comple-
tion times. Computers and Operations Research 36, 2647–2652 (2009)
15. Lu, X., Sitters, R.A., Stougie, L.: A class of on-line scheduling algorithms to min-
imize total completion time. Operations Research Letters 31, 232–236 (2002)
16. Megow, N., Schulz, A.S.: On-line scheduling to minimize average completion time
revisited. Operations Research Letters 32, 485–490 (2004)
17. Muthukrishnan, S., Rajaraman, R., Shaheen, A., Gehrke, J.E.: Online scheduling
to minimize average stretch. SIAM J. Comput. 34, 433–452 (2005)
18. Phillips, C., Stein, C., Wein, J.: Minimizing average completion time in the presence
of release dates, networks and matroids; sequencing and scheduling. Mathematical
Programming 82, 199–223 (1998)
19. Queyranne, M.: On the Anderson-Potts single machine on-line scheduling algo-
rithm (2001) (unpublished manuscript)
20. Schrage, L.: A proof of the optimality of the shortest remaining processing time
discipline. Operations Research 16(3), 687–690 (1968)
21. Schulz, A.S., Skutella, M.: The power of α-points in single machine scheduling.
Journal of Scheduling 5, 121–133 (2002)
22. Schulz, A.S., Skutella, M.: Scheduling unrelated machines by randomized rounding.
SIAM Journal on Discrete Mathematics 15, 450–469 (2002)
23. Seiden, S.: A guessing game and randomized online algorithms. In: Proceedings of
the 32nd ACM Symposium on Theory of Computing, pp. 592–601 (2000)
24. Sitters, R.A.: Complexity and approximation in routing and scheduling, Ph.D.
thesis, Eindhoven University of Technology, the Netherlands (2004)
25. Sitters, R.A.: Efficient algorithms for average completion time scheduling, Tech. Re-
port 2009-58, FEWEB research memorandum, Free University Amsterdam (2009)
26. Stougie, L., Vestjens, A.P.A.: Randomized on-line scheduling: How low can’t you
go? Operations Research Letters 30, 89–96 (2002)
27. Vestjens, A.P.A.: On-line machine scheduling, Ph.D. thesis, Department of Math-
ematics and Computing Science, Technische Universiteit Eindhoven, Eindhoven,
the Netherlands (1997)
Experiments with Two Row Tableau Cuts
1 Introduction
{(z, y) ∈ Z² × R₊ⁿ | z = f + ∑_{j=1}^{n} r^j y_j},   (1)
where f ∈ Q2 \ {(0, 0)}, rj ∈ Q2 ∀j ∈ {1, ..., n}. Various variants of (1) have been
studied; see for example [3,4,9,10,11,12,13,15,16,26] and [14] for a recent survey on
the topic. The relaxation (1) can be obtained in three steps. First select two rows of
a simplex tableau corresponding to integer basic variables currently at fractional
value(s). Then relax the non-negativity of the integer basic variables. Finally relax
the non-basic integer variables to be continuous variables. Two appealing features
of this relaxation are the complete characterization of all facet-defining inequal-
ities using the so-called lattice-free convex sets and the advantage of obtaining
the strongest possible coefficients for continuous variables in the resulting cutting
planes for MIPs. Note that the importance of obtaining cutting planes where the
continuous variables have strong coefficients has been emphasized by previous the-
oretical and computational studies; see for example [2,18].
Since the continuous variables receive strong coefficients, the focus next is to
improve the coefficients of non-basic integer variables. One way to strengthen
the coefficients of non-basic integer variables is to consider lifting them to obtain
valid inequalities for
{(z, y, x) ∈ Z² × R₊ⁿ × Z₊ᵏ | z = f + ∑_{j=1}^{n} r^j y_j + ∑_{j=n+1}^{n+k} r^j x_j},   (2)
where all data is rational. One possible approach for lifting is by the use of
the so-called ‘fill-in functions’ [21,23] or ‘monoidal strengthening’ [8]. Observe
that such lifted inequalities do not provide the complete list of facet-defining
inequalities of the convex hull of (2). In fact, (2) is a face of the mixed integer
master group relaxation and a complete description of the convex hull of the
master group relaxation is unknown. (See discussion in [20,22]).
The goal of this paper is to evaluate the quality of the two-row lifted cutting
planes described above computationally. In particular, we would like to under-
stand how to select a small subset of these cutting planes which are useful in prac-
tice, to discover what are the potential weaknesses of these cutting planes, and to
evaluate how far these weaknesses can be attributed to the different relaxations or
the lack of knowledge of the complete description of the convex hull of (2). We work
with a specific subclass of facet-defining inequality of the convex hull of (1) and
consider its lifted version. We attempt to answer the following specific question.
How good are these cutting planes in the the presence of integer non-basic vari-
ables? Is there a cut off on the ratio of number of integer and continuous non-basic
variables for these cutting planes to be useful? What is the strength of these two
row cuts in the presence of multiple rows? Can the sparsity structure be used to se-
lect important two row relaxations? How important is the effect of non-negativity
of the basic variables? Through experiments designed to answer each of these ques-
tions individually by keeping all other parameters constant, we hope to gain insight
into the strength of relaxations of m-row sets consisting of the intersection of the
convex hulls of two-row relaxations, and possibly obtain guidelines for selecting
the most useful two-row lifted cutting planes.
The outline of this paper is the following. In Section 2, we summarize the rele-
vant results regarding valid inequalities of (1) and the lifting of these inequalities.
In Section 3, we briefly discuss the algorithmic aspects of generating the lifted two
row inequalities. In Section 4, we describe various experiments conducted and the
results. We conclude in Section 5.
2 Basics
We begin with a definition of maximal lattice-free convex sets and describe these
sets in R2 .
Definition 1 ([24]). A set M ⊆ Rm is called lattice-free if int(M ) ∩ Zm = ∅.
A lattice-free convex set M is maximal if there exists no lattice-free convex set M′ ≠ M such that M ⊊ M′.
Proposition 1. Let M be a full-dimensional maximal lattice-free convex set in
R2 . Then M is one of the following:
1. A split set {(x1 , x2 ) | b ≤ a1 x1 + a2 x2 ≤ b + 1} where a1 and a2 are coprime
integers and b is an integer,
2. A triangle which is one of the following:
(a) A type 1 triangle: triangle with integral vertices and exactly one integral
point in the relative interior of each edge,
(b) A type 2 triangle: triangle with at least one fractional vertex v, exactly
one integral point in the relative interior of the two edges incident to v
and at least two integral points on the third edge,
(c) A type 3 triangle: triangle with exactly three integral points on the bound-
ary, one in the relative interior of each edge.
3. A quadrilateral containing exactly one integral point in the relative interior
of each of its edges.
A maximal lattice-free convex set M containing f in its interior can be used to derive the intersection cut ∑_{j=1}^{n} π(r^j) y_j ≥ 1 ([7]) for (1), where the coefficients are obtained as

π(r^j) = λ if ∃ λ > 0 s.t. f + (1/λ) r^j ∈ boundary(M), and π(r^j) = 0 if r^j belongs to the recession cone of M.   (3)
All non-trivial valid inequalities for (1) are intersection cuts and can be derived
from maximal lattice-free convex sets using (3). We refer the readers to [5,13]
for a complete characterization of facet-defining inequalities of the convex hull
of (1).
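In R², the coefficient (3) is the gauge of M − f evaluated at r^j and, for a bounded polygon M, can be computed by intersecting the ray f + μ r^j with each edge of M. The sketch below is ours (for a bounded M the recession-cone case of (3) cannot occur):

```python
def intersection_cut_coefficient(f, r, polygon, tol=1e-9):
    """pi(r) from (3) for a bounded lattice-free polygon M given by its
    vertex list: find the largest mu > 0 with f + mu*r on the boundary
    of M and return 1/mu.  Requires f in the interior of M, r != 0."""
    best_mu = None
    k = len(polygon)
    for i in range(k):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % k]
        ex, ey = x2 - x1, y2 - y1               # edge direction
        det = -r[0] * ey + r[1] * ex            # solve mu*r - s*e = P1 - f
        if abs(det) < tol:
            continue                            # ray parallel to this edge
        bx, by = x1 - f[0], y1 - f[1]
        mu = (-bx * ey + by * ex) / det
        s = (r[0] * by - r[1] * bx) / det
        if mu > tol and -tol <= s <= 1 + tol:   # hit within the edge
            best_mu = mu if best_mu is None else min(best_mu, mu)
    if best_mu is None:
        raise ValueError("f must be interior to M and r nonzero")
    return 1.0 / best_mu
```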
Next consider the strengthening of the coefficients for integer variables. It is possible to obtain the valid inequality ∑_{j=1}^{n} π(r^j) y_j + ∑_{j=n+1}^{n+k} φ(r^j) x_j ≥ 1 for (2), where φ(r^j) = inf_{u∈Z²} {π(r^j + u)} ([21,23],[8]). For an infinite version of the
relaxation (2), it has been shown [15] that this strengthening yields extreme
inequalities if π was obtained using (3) and M is a type 1 or type 2 triangle.
Moreover, in the case in which M is a type 1 or type 2 triangle [15], the function φ can be evaluated as

φ(r^j) = min_{u∈Z²} { π(r^j + u) | f + r^j + u ∈ M }.
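Since f + r^j + u ∈ M holds exactly when π(r^j + u) ≤ 1, this restricted lifting can reuse the gauge computation sketched above. Scanning a small box of integer translates u is a heuristic of ours, reasonable only because M is bounded:

```python
def lifted_coefficient(f, r, polygon, box=3):
    """phi(r): minimum of pi(r+u) over integer u with f + r + u in M,
    i.e. with pi(r+u) <= 1.  The box radius is an assumption, large
    enough once it covers all integer translates landing inside M."""
    best = float("inf")
    for u0 in range(-box, box + 1):
        for u1 in range(-box, box + 1):
            rr = (r[0] + u0, r[1] + u1)
            if abs(rr[0]) < 1e-12 and abs(rr[1]) < 1e-12:
                best = 0.0               # r + u = 0: f itself lies in M
                continue
            pi = intersection_cut_coefficient(f, rr, polygon)
            if pi <= 1.0:                # f + r + u lies in M
                best = min(best, pi)
    return best
```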
min{ e_i^T z | λ_1 r^1 + λ_2 r^2 = −f + z, λ_1, λ_2 ≥ 0, z ∈ Z² }.

Let the optimal objective value be w^{i+1} and the optimal solution be (λ̄_1^{i+1}, λ̄_2^{i+1}, z̄^{i+1}). Let e_{i+1} ∈ R² be such that e_{i+1}^T z̄^0 = e_{i+1}^T z̄^{i+1} = 1 and e_{i+1}^T f < 1. If w^{i+1} = 1, then denote the point z̄^{i+1} as z̄^{i0} and stop. If w^{i+1} < 1, then set i ← i + 1 and repeat this step.
One final check is needed to verify that z̄^{i0} is indeed a vertex of conv(Y²). This check becomes relevant when conv(Y²) has only one vertex. Verify that (z̄^{i0} − z̄^0) and (−z̄^{i0} + z̄^0) do not belong to the cone formed by r^1 and r^2.
2. Lifting the third continuous variable: From the previous step we obtain two
integer points z̄ 0 and z̄ i0 that are tight for the inequality we will obtain.
Let z^{v1} and z^{v2} be the two points obtained by extending the line segment passing through z̄^0 and z̄^{i0} until it intersects the two half lines f + μ_j r^j, μ_j ≥ 0,
j ∈ {1, 2}. The next step is to identify two other integer points which lie in
the relative interior of the other two sides of the triangle. Let (a, b) = z̄ 0 − z̄ i0 .
Update a and b by dividing them by their greatest common divisor and let
p, q ∈ Z such that pa + qb = ±1 and (q, −p)T r3 > 0. Then the two other
integer points are of the form z̄ 0 + (q, −p) + k(a, b) and z̄ 0 + (q, −p) + (k +
1)(a, b) for some integer k. The integer k can be calculated as follows. There
are two cases:
(a) f lies in the set {(w_1, w_2) | z̄_1^0(−b) + z̄_2^0 a ≤ w_1(−b) + w_2 a ≤ z̄_1^0(−b) + z̄_2^0 a + 1}. In this case solve z̄^0 + (q, −p) + λ(a, b) = f + r³ μ, μ ≥ 0, for λ and μ and set k = ⌊λ⌋.
(b) f lies in the set {(w_1, w_2) | w_1(−b) + w_2 a ≥ z̄_1^0(−b) + z̄_2^0 a + 1}. Then solve (q, −p) + λ(a, b) = μ(f − z̄^0), μ ≥ 0, for λ and μ and set k = ⌊λ⌋.
Denote z̄^0 + (q, −p) + k(a, b) as z^L and z̄^0 + (q, −p) + (k + 1)(a, b) as z^R. Construct the triangle by joining the two vertices z^{v1} and z^{v2} obtained in the previous step to z^L and z^R, and extending these line segments until they intersect at the third vertex.
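The pair (p, q) with pa + qb = ±1 comes from the extended Euclidean algorithm. A sketch of this number-theoretic step (helper names are ours; the caller still has to fix the sign so that (q, −p)^T r³ > 0 and to compute the shift k(a, b) from cases (a)–(b)):

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y = g and |g| = gcd(|a|, |b|)."""
    if b == 0:
        return (a, 1, 0)
    g, x, y = extended_gcd(b, a % b)
    return (g, y, x - (a // b) * y)

def primitive_direction_and_shift(z0, zi0):
    """Given the two tight integer points, return the primitive
    direction (a, b) of the segment and the point z0 + (q, -p) on the
    neighbouring lattice line, where p*a + q*b = +-1."""
    a, b = z0[0] - zi0[0], z0[1] - zi0[1]
    g, _, _ = extended_gcd(abs(a), abs(b))
    a, b = a // g, b // g                # divide out the gcd
    _, p, q = extended_gcd(a, b)         # p*a + q*b = +-1
    return (a, b), (z0[0] + q, z0[1] - p)
```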
We observe here that the method described above is a heuristic and not an exact method for selecting the best possible type 2 triangles. This is for at least two reasons. First, due to the method employed to lift the third continuous variable in step 2, the resulting cut may not always be facet-defining. More importantly, for every choice of two continuous variables that corresponds to the set Y² in step 1, we select only one inequality, whereas there are typically many possible candidates.
4 Computational Experiments
To begin to answer some of the questions raised above we have decided to use
randomly generated instances of which we can control the size and (partially)
the structure. In particular, we used multidimensional knapsack instances ob-
tained by the random generator of Atamturk [6], kindly provided by the author.
In addition, we considered a single round of cuts and we did not apply any
cut/triangle selection rule.
Of course, r̂j is only one of the possible liftings of variable xj , not nec-
essarily the best one.
The experiments with m = 2 are aimed at analyzing the case in which the
two constraint relaxation (1) is supposed to capture much of the structure of the
original problem (see Section 4.2 below). Of course, real problems have generally
(many) more constraints but in this context we limited the setting to m = 5
because we do not apply any cut selection mechanism, and otherwise the number
of separated type 2 triangle cuts would have been too big.
The code has been implemented in C++, using IBM ILOG Cplex 10.0 as
LP solver. In order to limit possible numerical issues in the cut generation, we
adopted several tolerances and safeguards. In particular, concerning the GMIC
generation, we did not generate a GMIC from a tableau row if the corresponding
basic variable has a fractional part smaller (resp. greater) than 0.001 (resp.
0.999). Similarly, we generated cuts from a triangle only if the fractional point
f is safely in its interior. More precisely, a triangle is discarded if the Euclidean
distance of f from its boundary is smaller than 0.001. In addition, both for
GMICs and triangle cuts, we discarded all the generated cuts with “dynamism”
greater than 109 , where the dynamism is computed as the ratio between the
greatest and smallest nonzero absolute value of the cut coefficients.
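The dynamism safeguard is a one-liner; a sketch (ours):

```python
def cut_is_numerically_safe(coeffs, max_dynamism=1e9):
    """Keep a cut only if the ratio between its greatest and smallest
    nonzero absolute coefficient does not exceed the threshold."""
    nz = [abs(c) for c in coeffs if c != 0.0]
    return bool(nz) and max(nz) / min(nz) <= max_dynamism
```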
Before going into the details of the numbers, it is important to note that the
way used to derive the instances accomplishes a very important goal. The cuts
generated are exactly the same for the three sets A, B and C (except for the
fact that no integer rays are present in case A). Indeed, what changes is the
impact of the cuts on the solution process whereas (i) the number of continuous
variables is constant from A to C, (ii) the number of integer variables is the same
from B to C, and (iii) the rays of all variables stay the same from A to C. The
changes are instead due to the fact that (1) no integer variables are present in A
and (2) the objective function coefficients of the continuous variables have been
decreased from B to C.
For problems with only two rows and no integer non-basic variables (set A), model
(1) is not a relaxation but coincides with the problem itself. Thus, type 2 triangle
cuts are supposed to be really strong and the experiments are aimed at asserting
their real effect in practice. On the other hand, sets B and C allow one to evaluate
the impact of non-basic integer variables on the strength of the two constraint
relaxation (1).
Moreover, in order to better understand the impact of bounds on the generated
instances, we also considered introducing bounds on instances of sets B and C
in the following way. For the generic set Z (Z ∈ {B, C}):
Table 1 compares one round of Gomory Mixed Integer Cuts (GMIC in the table)
with type 2 triangle cuts either in the setting (y) above – only rays associated
with continuous variables, Tr.(y) in the table – or in the setting (x+y) – rays
from both integer and continuous variables, Tr.(x+y) in the table. In particular,
all entries are average values of the results over 90 instances and for each set,
the first 5 columns report the characteristics of the instances in the set: namely,
number of y variables (ny), number of x variables (nx), number of non-basic
slacks that are nonzero in the MIP optimal solution (nz.s), number of non-basic
y variables that are nonzero in the MIP optimal solution (nz.y), number of non-
basic x variables that are nonzero in the MIP optimal solution (nz.x). Then,
for each cut family, Table 1 reports three columns: percentage gap closed
(%gap), number of generated cuts (ncuts) and number of cuts tight at the second
optimal tableau, i.e., after reoptimization (ntight).
Table 1 does not report computing times for the separation. It is easy to see
that separating type 2 triangle cuts is more time consuming than separating
GMICs, mainly because of the procedure for finding the edge of the triangle
containing more than one integral point (see Section 3). However, the separation
time is quite reasonable: the average 2740.8 type 2 triangle cuts for instances
of type B and C require a separation time of 8 CPU seconds on a workstation
with an Intel(R) 2.40 GHz processor running the SUSE Linux 10.1 Operating
System. Again, developing effective selection strategies is likely to heavily reduce
this computational effort.
Finally, we did not report detailed results concerning the use of GMICs and type 2 triangle cuts together because the improvement over the latter is negligible (even if consistent).
– Type 2 triangle cuts vs other lattice-free cuts: Type 2 triangle cuts appear
to be quite important: a few of them together with the GMICs close 98% of the gap in the instances of set A.
– Type 2 triangle cuts vs GMICs: The fact that the triangles close more of the
gap than GMICs is very interesting. In fact, the number of triangles needed
is not much more than the number of GMICs. This suggests that effort spent
in finding the “right triangles” is probably very useful especially in the case
in which the continuous variables are more important.
– Need for study of new relaxations: The instances when the integer variables
are more important show that the performance of both the GMICs and the
triangle inequalities deteriorates. This suggests that analysis of other relax-
ations based on integer non-basic variables should be pursued. The impor-
tance of generating inequalities with strong coefficients on integer variables is
again illustrated by the improvement obtained by the triangle cutting planes
Cut type    Separated/Tight        % by min angle:  0–15   15–30   30–45   45–60
Tr.(y) Separated in A,B,C 64.66 25.23 8.42 1.70
Tight in A 62.23 29.79 6.38 1.60
Tight in B 72.11 24.21 3.16 0.53
Tight in C 78.16 17.82 3.45 0.57
Tr.(y) Separated in B.n 64.66 25.23 8.42 1.70
Tight in B 72.11 24.21 3.16 0.53
Tight in B.1 66.50 27.50 4.50 1.50
Tight in B.2 62.81 28.64 6.53 2.01
Tight in B.3 71.66 24.60 3.21 0.53
Tr.(y) Separated in C.n 64.66 25.23 8.42 1.70
Tight in C 78.16 17.82 3.45 0.57
Tight in C.1 74.76 17.62 6.67 0.95
Tight in C.2 74.77 19.16 3.74 2.34
Tight in C.3 78.16 17.82 3.45 0.57
Tr.(x+y) Separated in B,C 61.86 24.94 10.52 2.67
Tight in B 77.63 15.35 5.26 1.75
Tight in C 91.21 5.86 2.56 0.37
based on all nonbasic variables over the triangle cutting planes obtained us-
ing only the continuous non-basic variables.
– Effect of bounds on variables: Although bounds on different types of variables
deteriorate the quality of the triangle cutting planes, this reduction in the
quality of the cutting planes is not significant. Note that the gap for set B (continuous variables more important) deteriorates when we have bounds on non-basic continuous variables. Similarly, the gap for set C (integer variables more important) deteriorates when we have bounds on non-basic integer variables. This illustrates that if a certain type of variable is important, then adding bounds on these variables deteriorates the effect of the triangle inequalities and GMICs.
– Shape of triangles and their importance: Almost consistently, the percentage of triangle cutting planes which have a lower min angle and are tight after resolving is greater than the percentage of total triangle cutting planes with a lower min angle. This observation may relate to the fact that a thin angle gives strong coefficients to integer non-basic variables. However, further study is needed to understand which triangles are more important.
Generated in C.s 67.68 23.42 7.59 1.31 8.69 0.00 55.39 35.92
Tight in C.s 69.29 23.60 6.37 0.75 32.21 0.00 28.84 38.95
Tr.(x+y) Separated in B,C 61.80 25.36 10.15 2.69 0.00 0.00 0.00 100.00
Tight in B 81.41 14.57 3.52 0.50 0.00 0.00 0.00 100.00
Tight in C 94.12 5.29 0.59 0.00 0.00 0.00 0.00 100.00
Separated in C.s 61.57 25.36 10.45 2.62 7.29 0.00 43.90 48.81
Tight in C.s 76.86 16.29 6.57 0.29 35.43 0.00 24.86 39.71
5 Conclusions
References
1. Achterberg, T., Koch, T., Martin, A.: MIPLIB 2003. Operations Research Let-
ters 34, 361–372 (2006), http://miplib.zib.de
2. Andersen, K., Cornuéjols, G., Li, Y.: Reduce-and-split cuts: Improving the perfor-
mance of mixed integer Gomory cuts. Management Science 51, 1720–1732 (2005)
3. Andersen, K., Louveaux, Q., Weismantel, R.: An analysis of mixed integer linear
sets based on lattice point free convex sets. Mathematics of Operations Research 35,
233–256 (2010)
4. Andersen, K., Louveaux, Q., Weismantel, R.: Mixed-integer sets from two rows of
two adjacent simplex bases (2009), http://hdl.handle.net/2268/35089
5. Andersen, K., Louveaux, Q., Weismantel, R., Wolsey, L.A.: Cutting planes from
two rows of a simplex tableau. In: Fischetti, M., Williamson, D.P. (eds.) Pro-
ceedings 12th Conference on Integer and Combinatorial Optimization, pp. 1–15.
Springer, Heidelberg (2007)
6. Atamturk, A.: http://ieor.berkeley.edu/~atamturk/data/
7. Balas, E.: Intersection cuts - a new type of cutting planes for integer programming.
Operations Research 19, 19–39 (1971)
8. Balas, E., Jeroslow, R.: Strengthening cuts for mixed integer programs. European Journal of Operational Research 4, 224–234 (1980)
9. Basu, A., Conforti, M., Cornuéjols, G., Zambelli, G.: Maximal lattice-free convex
sets in linear subspaces (2009), http://www.math.unipd.it/~giacomo/
10. Basu, A., Conforti, M., Cornuéjols, G., Zambelli, G.: Minimal inequalities for an
infinite relaxation of integer programs (2009),
http://www.math.unipd.it/~ giacomo/
11. Borozan, V., Cornuéjols, G.: Minimal valid inequalities for integer constraints.
Mathematics of Operations Research 34, 538–546 (2009)
12. Conforti, M., Cornuéjols, G., Zambelli, G.: A geometric perspective on lifting
(2009), http://www.math.unipd.it/~ giacomo/
13. Cornuéjols, G., Margot, F.: On the facets of mixed integer programs with two
integer variables and two constraints. Mathematical Programming 120, 429–456
(2009)
14. Dey, S.S., Tramontani, A.: Recent developments in multi-row cuts. Optima 80, 2–8
(2009)
15. Dey, S.S., Wolsey, L.A.: Lifting integer variables in minimal inequalities correspond-
ing to lattice-free triangles. In: Lodi, A., Panconesi, A., Rinaldi, G. (eds.) Proceed-
ings 13th Conference on Integer and Combinatorial Optimization, pp. 463–475.
Springer, Heidelberg (2008)
16. Dey, S.S., Wolsey, L.A.: Constrained infinite group relaxations of MIPs, Tech. Re-
port CORE DP 33, Université catholique de Louvain, Louvain-la-Neuve, Belgium
(2009)
17. Espinoza, D.: Computing with multiple-row Gomory cuts. In: Lodi, A., Panconesi,
A., Rinaldi, G. (eds.) IPCO 2008. LNCS, vol. 5035, pp. 214–224. Springer, Heidel-
berg (2008)
18. Fischetti, M., Saturni, C.: Mixed integer cuts from cyclic groups. Mathematical
Programming 109, 27–53 (2007)
19. Gomory, R.E.: An algorithm for integer solutions to linear programs. In: Graves,
R.L., Wolfe, P. (eds.) Recent Advances in Mathematical Programming, pp. 269–
308. McGraw-Hill Book Company Inc., New York (1963)
20. Gomory, R.E., Johnson, E.L.: Some continuous functions related to corner polyhe-
dra, part I. Mathematical Programming 3, 23–85 (1972)
21. Gomory, R.E., Johnson, E.L.: Some continuous functions related to corner polyhe-
dra, part II. Mathematical Programming 3, 359–389 (1972)
22. Gomory, R.E., Johnson, E.L.: T-space and cutting planes. Mathematical Program-
ming 96, 341–375 (2003)
23. Johnson, E.L.: On the group problem for mixed integer programming. Mathemat-
ical Programming Study 2, 137–179 (1974)
24. Lovász, L.: Geometry of numbers and integer programming. Mathematical Pro-
gramming: Recent Developments and Applications, 177–210 (1989)
25. Nemhauser, G.L., Wolsey, L.A.: A recursive procedure to generate all cuts for 0-1
mixed integer programs. Mathematical Programming 46, 379–390 (1990)
26. Zambelli, G.: On degenerate multi-row Gomory cuts. Operations Research Let-
ters 37, 21–22 (2009)
An OPT + 1 Algorithm for the Cutting Stock
Problem with Constant Number of Object
Lengths
1 Introduction
When computing time complexities we use the log-cost RAM model, where each
arithmetic operation requires time proportional to the logarithm of the size of its
operands. Our algorithm uses a variation of the integer programming formula-
tion (IP) for the cutting stock problem of Gilmore and Gomory [6]; furthermore,
we take advantage of a result by Eisenbrand and Shmonin [3] stating that IP
has an optimum solution with only a constant number of positive variables.
By partitioning the set of objects into two groups of small and big objects, we
can re-write IP so that only a constant number of constraints is needed to restrict
the placement of big objects in the bins. Then, by relaxing the integrality con-
straints on the variables controlling the packing for the small objects, we obtain
a mixed integer program with a constant number of integer variables. We show
that this mixed integer program can be solved in polynomial time using Lenstra’s
algorithm [15], and a simple rounding procedure can be then used to transform
this solution into a feasible solution for the cutting stock problem that uses at
most OPT + 1 bins.
C_i^S its small configuration. Let C^B be the set of all big configurations. Note that C^B includes the configuration with no big objects in it.
IP, then, can be re-written as the following feasibility problem.
IP1:
    ∑_{C_i ∈ C : C_i^B = B′} x_{C_i} ≤ y_{B′},   for all B′ ∈ C^B   (2)
    ∑_{B′ ∈ C^B} y_{B′} = m∗   (3)
    ∑_{B′ ∈ C^B} a(B′, j) y_{B′} ≥ n_j,   for all j = 1, . . . , α   (4)
    ∑_{C_i ∈ C} a(C_i, j) x_{C_i} ≥ n_j,   for all j = α + 1, . . . , d   (5)
    x_{C_i} ∈ Z_{≥0},   for all C_i ∈ C
    y_{B′} ∈ Z_{≥0},   for all B′ ∈ C^B
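IP1 ranges over the set C of all configurations, which is finite but in general exponentially large. To make C and the big configurations concrete, here is a brute-force enumeration sketch of ours, sensible only for tiny instances:

```python
def configurations(lengths, beta):
    """Enumerate all configurations: integer vectors (a_1, ..., a_d) of
    object multiplicities whose total length fits into a bin of
    (integer) capacity beta."""
    d = len(lengths)
    def rec(i, room, cur):
        if i == d:
            yield tuple(cur)
            return
        for a in range(room // lengths[i] + 1):
            cur.append(a)
            yield from rec(i + 1, room - a * lengths[i], cur)
            cur.pop()
    yield from rec(0, beta, [])

def big_part(config, lengths, eps, beta):
    """C_i^B: keep only the multiplicities of the big objects."""
    return tuple(a if lengths[i] >= eps * beta else 0
                 for i, a in enumerate(config))
```

The set C^B is then {big_part(c, lengths, eps, beta) for c in configurations(lengths, beta)}.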
From Corollary 6 in [3] and Lemma 1, IP1 has an optimum solution (x∗, y∗) with at most 2^d non-zero variables y∗_{B′}. Let S^{B∗} be the set of at most 2^d big configurations corresponding to the non-zero variables in y∗. Thus, we can reduce the number of constraints of type (2) by not considering all big configurations C^B, but only those in S^{B∗}. Since we do not know the optimum solution (x∗, y∗) we do not know either which big configurations to select. However, as the number of big configurations is only 1/ε^d, there is a constant number \binom{ε^{−d}}{2^d} of subsets S^B of 2^d big configurations, so we can try them all
knowing that one of them will lead to an optimum solution. Furthermore, in
IP1 the value of m∗ is unknown. However, since m∗ ≤ n, we can use binary
search to find in O(log n) iterations the smallest value for m∗ for which IP1
has a feasible solution. Finally, we consider a mixed integer linear program-
ming relaxation of IP1 by relaxing the integrality constraints on the variables
xCi :
MILP(m, S^B):
    ∑_{C_i ∈ C : C_i^B = B′} x_{C_i} ≤ y_{B′},   for all B′ ∈ S^B   (6)
    ∑_{B′ ∈ S^B} y_{B′} = m   (7)
    ∑_{B′ ∈ S^B} a(B′, j) y_{B′} ≥ n_j,   for all j = 1, . . . , α   (8)
    ∑_{C_i ∈ C : C_i^B ∈ S^B} a(C_i, j) x_{C_i} ≥ n_j,   for all j = α + 1, . . . , d   (9)
    x_{C_i} ≥ 0,   for all C_i ∈ C with C_i^B ∈ S^B
    y_{B′} ∈ Z_{≥0},   for all B′ ∈ S^B
3 Rounding
In Section 4 we show how to solve MILP(m, S^B) using Lenstra's algorithm [15]. Let (x⁺, y⁺) be the solution produced by Lenstra's algorithm for MILP(m∗, S^{B∗}); as we show below (see Theorem 2), in this solution at most 2^d + d + 1 of the variables x⁺ have non-zero value. We show now how to obtain an integer solution from (x⁺, y⁺) that uses at most m∗ + 1 bins.
For each big configuration B′ ∈ S^{B∗} such that y⁺_{B′} > 0, the solution (x⁺, y⁺) uses y⁺_{B′} bins so that the big objects stored in them conform to B′. Let Δ⁺_{B′} = y⁺_{B′} − ∑_{C_i ∈ C : C_i^B = B′} x⁺_{C_i}. If for some big configuration B′, Δ⁺_{B′} > 0, then we select any C_h ∈ C such that C_h^B = B′ and increase the value of x⁺_{C_h} by Δ⁺_{B′}. Note that this change does not affect the number of bins that the solution uses, but it ensures that constraint (6) is satisfied with equality for every B′ ∈ S^{B∗}; we need this property for our rounding procedure. For simplicity, let us denote the new solution as (x⁺, y⁺).
For each C_i ∈ C such that x⁺_{C_i} > 0, let x̂_{C_i} = x⁺_{C_i} − ⌊x⁺_{C_i}⌋, so x⁺_{C_i} can be split into an integer part ⌊x⁺_{C_i}⌋ and a fractional one x̂_{C_i}. We round each x̂_{C_i} to an integer value as follows.
Note that since condition (6) holds with equality for every B′ ∈ S^{B∗} and y⁺_{B′} is integer, the set Q always exists, unless C = ∅.
Take any configuration C_p ∈ Q and split x̂_{C_p} into two parts x̂′_{C_p}, x̂″_{C_p} so that x̂_{C_p} = x̂′_{C_p} + x̂″_{C_p} and

x̂′_{C_p} + ∑_{C_i ∈ Q\{C_p}} x̂_{C_i} = 1.   (10)
∑_{j=1}^{d} C_{Qj} p_{π(j)} = ∑_{j=1}^{d} ( ∑_{C_i ∈ Q} a(C_i, j) x̂_{C_i} ) p_{π(j)}
    = ∑_{C_i ∈ Q} x̂_{C_i} ( ∑_{j=1}^{d} a(C_i, j) p_{π(j)} )
    ≤ ∑_{C_i ∈ Q} x̂_{C_i} β,   as each C_i is a configuration, so ∑_{j=1}^{d} a(C_i, j) p_{π(j)} ≤ β
    = β,   by (10).
So, the total size of the objects in C_Q is at most β. But C_Q might not be a feasible configuration, as some of its components C_{Qi} might not be integer. Let C̄_Q = (⌊C_{Q1}⌋, ⌊C_{Q2}⌋, . . . , ⌊C_{Qd}⌋) and C̃_Q = (C_{Q1} − ⌊C_{Q1}⌋, C_{Q2} − ⌊C_{Q2}⌋, . . . , C_{Qd} − ⌊C_{Qd}⌋). Clearly C̄_Q is a valid configuration and each component C_{Qi} − ⌊C_{Qi}⌋ of C̃_Q has value smaller than 1.
Note that the first α components of C̃_Q are zero because for all configurations C_j ∈ Q, C_j^B = B′; hence, for j ≤ α, C_{Qj} = ∑_{C_i ∈ Q} a(C_i, j) x̂_{C_i} = ∑_{C_i ∈ Q} a(B′, j) x̂_{C_i} = a(B′, j) ∑_{C_i ∈ Q} x̂_{C_i} = a(B′, j), by (10); observe that a(B′, j) is integer. Thus, each C̃_Q is of the form

C̃_Q = (0, . . . , 0, C̃_{Q,α+1}, . . . , C̃_{Q,d}),   where 0 ≤ C̃_{Q,i} < 1 for all i = 1, . . . , d.   (11)
We can think of C̃_Q as containing only a fraction (maybe equal to zero) of an object of each different small length. The fractional items in C̃_Q are set aside for the time being.
4. Remove Q from C and if x̂″_{C_p} > 0, add C_p back to C and set x̂_{C_p} = x̂″_{C_p}.
5. If C ≠ ∅ go back to Step 2.
The above procedure yields a solution that uses m∗ bins, but the objects from the configurations C̃_Q identified in Step 3 are not yet packed. Let C_Q be the set of all these configurations.
Lemma 3. The above rounding procedure packs all the objects in at most m∗ +1
bins.
Algorithm BinPacking(P, U, β)
Input: Sets P = {p1 , . . . , pd }, U = {n1 , . . . , nd } of object lengths and their
multiplicities; capacity β of each bin.
Output: A packing for the objects into at most OP T + 1 bins.
1. Set ε = 1/(d(2^d + d + 1)) and then partition the set of objects into big (of length at least εβ) and small (of length smaller than εβ). Set m∗ = n.
2. For each set S^B of 2^d big configurations do :
Use Lenstra’s algorithm and binary search over the set V =
{1, 2, . . . , m∗ } to find the smallest value j ∈ V , if any, for which
MILP(j, S B ) has a solution.
If a value j < m∗ was found for which MILP(j, S B ) has a solution,
then set m∗ = j and let (x+ , y + ) be the solution computed by Lenstra’s
algorithm for MILP(j, S B ).
3. Round (x+ , y + ) as described above and output the corresponding packing
of the objects into m∗ + 1 bins.
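Since adding bins can only help, feasibility of MILP(j, S^B) is monotone in j, which is what makes the binary search in step 2 sound. A driver sketch that treats the Lenstra-based solver as a black box (solve_milp is an assumed oracle, not the authors' code):

```python
def smallest_feasible_m(solve_milp, big_sets, n):
    """For each candidate set SB of big configurations, binary search
    the smallest m in {1, ..., n} with MILP(m, SB) feasible; return the
    best m found and a corresponding solution (or None)."""
    best_m, best_sol = n, None
    for SB in big_sets:
        lo, hi = 1, best_m
        while lo <= hi:
            mid = (lo + hi) // 2
            sol = solve_milp(mid, SB)
            if sol is not None:          # feasible: try fewer bins
                best_m, best_sol = mid, sol
                hi = mid - 1
            else:                        # infeasible: need more bins
                lo = mid + 1
    return best_m, best_sol
```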
Lenstra’s algorithm [15] can be used to solve mixed integer linear programs
in which the number of integer variables is constant; the time complexity of
the algorithm is O(P (N )), where P is a polynomial and N = O(M η log k)
is the maximum number of bits needed to specify the input, where M is the
number of constraints in the mixed integer linear program, η is the number of
variables, and k is the maximum of the absolute values of the coefficients of
the constraints. Since MILP(m, S B ) has O(nd ) variables, it might seem that
the time complexity of Lenstra’s algorithm is too high for our purposes as an
instance of the high multiplicity bin packing problem is specified with only N =
d
i=1 (log pi + log ni ) + log β = O(log β + log n) bits, and so P (N ) is not a
polynomial function of N . In this section we show that Lenstra’s algorithm can,
in fact, be implemented to run in time polynomial in N .
K = {(y, x) ∈ R^{2^d + n^d} | A(y, x) ≤ b}
and let
Algorithm Lenstra(K)
Input: Closed convex set K of dimension D.
Output: A point in K ∩ ZD , if one exists, or null if K ∩ ZD = ∅.
1. Reduce the dimension of K until we ensure that K has positive volume.
2. Compute a linear transformation τ that maps K into a ball-like set τK such that there is a point σ and radii r, R with R/r = 2D^{3/2} for which B(σ, r) ⊂ τK ⊂ B(σ, R), where B(σ, z) ⊂ R^D is the closed ball with center σ and radius z.
3. Compute a reduced basis b_1, b_2, . . . , b_D for the lattice τZ^D: a basis such that ∏_{i=1}^{D} ‖b_i‖ ≤ 2^{D(D−1)/4} · |determinant(b_1, b_2, . . . , b_D)|, where ‖·‖ denotes the Euclidean norm.
4. Find a point v ∈ τZ^D such that ‖v − σ‖ ≤ (1/2)√D · max{‖b_i‖ | i = 1, . . . , D}.
5. If v ∈ τ K then output τ −1 v.
6. If v ∉ τK let H = ∑_{i=1}^{D−1} R b_i be the (D − 1)-hyperplane spanned by b_1, . . . , b_{D−1}.
   For each integer i such that H + i b_D intersects B(σ, R) do :
      Let K′ be the intersection of K with H + i b_D.
      If v′ = Lenstra(K′) is not null, then output τ^{−1}(v′, i b_D).
Output null.
LP :  max ∑_{B′ ∈ S^B} f_{B′} y_{B′}
 s.t.  y_{B′} − ∑_{C_i ∈ C : C_i^B = B′} x_{C_i} ≥ 0,   for all B′ ∈ S^B
       − ∑_{B′ ∈ S^B} y_{B′} ≥ −m
       ∑_{B′ ∈ S^B} a(B′, j) y_{B′} ≥ n_j,   for all j = 1, . . . , α
       ∑_{C_i ∈ C : C_i^B ∈ S^B} a(C_i, j) x_{C_i} ≥ n_j,   for all j = α + 1, . . . , d
−δ_{B′} + ∑_{j=α+1}^{d} a(C_i, j) λ_j ≤ 0,   for all C_i ∈ C,
 λ_j ≥ 0,   j = 1, . . . , d
We use the ellipsoid algorithm [13,7] to solve DLP. Note that DLP has only 2^d + d + 1 variables, but it has a large number O(n^d) of constraints, so for the ellipsoid algorithm to solve DLP in time polynomial in N, we need an efficient separation oracle that, given a vector δ̄ = (λ_1, . . . , λ_d, δ_0, . . . , δ_{2^d}), either determines that δ̄ is a feasible solution for DLP or finds a constraint of DLP that is violated by δ̄.
To design this separation oracle, we can think that each object o_i ∈ O has length p_i and value λ_i. Each constraint (12) can be tested in constant time. However, constraints (13) are a bit harder to test. Since a configuration C_i for which C_i^B = B′ ∈ S^B includes small objects of total length at most β − β_{B′}, where β_{B′} is the total length of the big objects in B′, constraints (13) check that for each C_i ∈ C and B′ ∈ S^B such that C_i^B = B′, the set of small objects in C_i has total value at most δ_{B′}.
Hence (as was also observed by Karmarkar and Karp [12]), to determine whether the constraints (13) are satisfied we need to solve an instance of the knapsack problem where the input is the set of small objects and the knapsack has capacity β − β_{B′}. If the maximum value of any subset of small objects of total length at most β − β_{B′} is larger than δ_{B′}, then we know that a constraint of type (13) is violated; furthermore, the solution of the knapsack problem indicates exactly which constraint is not satisfied by δ̄.
Therefore, a separation oracle for DLP needs to be able to efficiently solve any instance of the knapsack problem formed by a set of objects of d′ ≤ d different types, where objects of type i have length p_i and value λ_i. This knapsack problem can be formulated as an integer program with a constant number of variables and so it can be solved, for example, by using Kannan's algorithm [11] in O(d^{9d} (d 2^{8d} log² n + log β)³) time.
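For intuition only, the oracle can be prototyped with a textbook bounded-knapsack DP; this is pseudo-polynomial in β (and assumes an integer capacity), so it is not the method of the paper, which needs Kannan's algorithm to stay polynomial in log β:

```python
def constraint_13_violated(small, capacity, delta):
    """small: list of (length_i, value_i, multiplicity n_i) for the
    small object types; capacity: beta - beta_B' (an integer here).
    Returns True iff some subset of small objects fitting into the
    capacity has total value > delta, i.e. a constraint is violated."""
    best = [0.0] * (capacity + 1)
    for length, value, mult in small:
        for _ in range(mult):              # naive bounded knapsack
            for c in range(capacity, length - 1, -1):
                cand = best[c - length] + value
                if cand > best[c]:
                    best[c] = cand
    return best[capacity] > delta
```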
By Lemmas 2.1 and 8.5 of [20] the basic feasible solutions of DLP are 2^d-vectors of rational numbers whose numerators and denominators have absolute values at most L = (2^d)! n^{d 2^d}. Therefore, we can use the version of the ellipsoid algorithm described in [7] with precision L^{−1} to solve DLP in time polynomial in log n and log β.
Lemma 5. The maximum number of bits needed to encode each value in the solution (δ, λ) computed by the ellipsoid algorithm for DLP is O(d 2^{8d} log² n).
Lemma 6. The running time of the ellipsoid algorithm when solving DLP is O(d^{9d} 2^{4d} (d 2^{8d} log² n + log β)³ log n).
Proof. We have shown that steps 1–5 of the algorithm can be performed in time T = O(d^{9d} 2^{6d} (d 2^{8d} log² n + log β)³ log² n). As shown in [15], in the “for” loop of step 6 we need to consider 2^{1+2d+2^d(2^d−1)/4} different values for i. In each iteration of the “for” loop we perform a recursive call to the algorithm, and the recursive call dominates the running time of every iteration of the loop.
Let F(D) be the time complexity of the algorithm when the input convex set has dimension D. Then F(D) satisfies the following recurrence:

F(D) = T + 2^{1+d+2^d(2^d−1)/4} F(D − 1).

Therefore, F(D) = O( T (2^{1+d+2^d(2^d−1)/4})^D ). Since D = 2^d, the complexity of the algorithm is O(d^{9d} 2^{2^{3d}+6d} (d 2^{8d} log² n + log β)³ log² n).
Theorem 2. A solution for MILP(m∗, S^{B∗}) in which at most 2^d + d + 1 variables x_{C_i} have positive value can be computed in O(d^{18d} 2^{2^{3d}+6d} (d 2^{8d} log² n + log β)³ log² n) time.
Proof of Theorem 1: By Lemma 4, algorithm BinPacking produces a solution for the high multiplicity bin packing problem using at most OPT + 1 bins. By Theorem 2, the time complexity of BinPacking is O(d^{21d} 2^{2^{3d+3}} (log² n + log β)⁴), as the algorithm needs to solve O((ε^{−d})! log n) mixed integer linear programs.
References
1. Coffman Jr., E.G., Garey, M.R., Johnson, D.S.: Approximation algorithms for bin
packing: a survey. In: Hochbaum, D.S. (ed.) Approximation algorithms for NP-hard
problems, pp. 46–86. PWS Publishing Company (1997)
2. Eisemann, K.: The trim problem. Management Science 3(3), 279–284 (1957)
3. Eisenbrand, F., Shmonin, G.: Carathéodory bounds for integer cones. Operations
Research Letters 34, 564–568 (2006)
4. Filippi, C., Agnetis, A.: An asymptotically exact algorithm for the high-multiplicity
bin packing problem. Mathematical Programming 104(1), 21–57 (2005)
5. Filippi, C.: On the bin packing problem with a fixed number of object weights.
European Journal of Operational Research 181, 117–126 (2007)
6. Gilmore, P.C., Gomory, R.E.: A linear programming approach to the cutting stock
problem. Operations Research 9, 849–859 (1961)
7. Grötschel, M., Lovász, L., Schrijver, A.: The ellipsoid method and its consequences
in Combinatorial Optimization. Combinatorica 1(2), 169–197 (1981)
8. Hochbaum, D.S., Shamir, R.: Strongly polynomial algorithms for the high multi-
plicity scheduling problem. Operations Research 39, 648–653 (1991)
9. Kaltofen, E., Villard, G.: On the complexity of computing determinants. Compu-
tational Complexity 13(3-4), 91–130 (2004)
10. Kannan, R., Bachem, A.: Polynomial algorithms for computing the Smith and
Hermite normal forms of an integer matrix. SIAM Journal on Computing 8, 499–
507 (1979)
11. Kannan, R.: Minkowski’s convex body theorem and integer programming. Mathe-
matics of Operations Research 12(3), 415–440 (1987)
12. Karmarkar, N., Karp, R.M.: An efficient approximation scheme for the one-
dimensional bin packing problem. In: Proceedings FOCS, pp. 312–320 (1982)
13. Khachiyan, L.G.: A polynomial algorithm in linear programming. Dokl. Akad.
Nauk. SSSR 244, 1093-1096 (1979); English translation: Soviet Math. Dokl. 20,
191–194 (1979)
14. Lenstra, A.K., Lenstra Jr., H.W., Lovász, L.: Factoring Polynomials with rational
coefficients. Math. Ann. 261, 515–534 (1982)
15. Lenstra Jr., H.W.: Integer programming with a fixed number of variables. Mathe-
matics of Operations Research 8(4), 538–548 (1983)
16. Lovász, L.: Complexity of algorithms (1998)
17. Marcotte, O.: The cutting stock problem and integer rounding. Mathematical Pro-
gramming 33, 82–92 (1985)
18. McCormick, S.T., Smallwood, S.R., Spieksma, F.C.R.: A polynomial algorithm for
multiprocessor scheduling with two job lengths. Math. Op. Res. 26, 31–49 (2001)
19. Orlin, J.B.: A polynomial algorithm for integer programming covering problems
satisfying the integer round-up property. Mathematical Programming 22, 231–235
(1982)
20. Papadimitriou, C.H., Steiglitz, K.: Combinatorial optimization: Algorithms and
complexity. Prentice-Hall, Inc., Englewood Cliffs (1982)
21. Schrijver, A.: Theory of Linear and Integer Programming. John Wiley, Chichester
(1986)
On the Rank of Cutting-Plane Proof Systems
1 Introduction
Cutting planes are a fundamental, theoretically and practically relevant tool in
combinatorial optimization and integer programming. Cutting planes help to
eliminate irrelevant fractional solutions from polyhedral relaxations while pre-
serving the feasibility of integer solutions. There are several well-known pro-
cedures to systematically derive valid inequalities for the integer hull PI of a
rational polyhedron P = {x ∈ Rn : Ax ≤ b} ⊆ [0, 1]n (see, e.g., [8, 9]). This
includes Gomory-Chvátal cuts [5, 17, 18, 19], lift-and-project cuts [1], Sherali-
Adams cuts [28], and the matrix cuts of Lovász and Schrijver [21], to name just
a few. Repeated application of these operators is guaranteed to yield a linear
description of the integer hull, and the question naturally arises of how many
rounds are, in fact, necessary. This gives rise to the notion of rank. For exam-
ple, it is known that the Gomory-Chvátal rank of a polytope contained in the
n-dimensional 0/1-cube is at most O(n2 log n) [14], whereas the rank of all other
Related work. A lower bound of n for the rank of the Gomory-Chvátal pro-
cedure for polytopes P ⊆ [0, 1]n with PI = ∅ was established in [6]. This was
later used to provide a lower bound of (1 + ε)n on the Gomory-Chvátal rank of arbitrary polytopes P ⊆ [0, 1]^n, showing that in contrast to most other cutting-plane procedures, the Gomory-Chvátal procedure does not have an upper bound of n if PI ≠ ∅ [14]. The upper bound is known to be n if PI = ∅ [2]. Lower
bounds of n for the matrix cut operators N0 , N , and N+ of Balas, Ceria, and
Cornuéjols [1], Sherali and Adams [28], and Lovász and Schrijver [21] were given
in [7, 10, 16]. Lower bounds for the split cut operator SC were obtained in
[11]. These operators (and some strengthenings thereof) have recently regained
attention [15, 20, 24], partly due to an interesting connection between the inap-
proximability of certain combinatorial optimization problems and the integrality
gaps of their (LP and SDP) relaxations. For example, in [27] it was shown that
the integrality gaps of the vertex cover and the max cut problem remain at least 2 − ε after n rounds of the Sherali-Adams operator. A related result for the stronger Lovász-Schrijver operator established an integrality gap of 2 − ε after Ω(√(log n / log log n)) rounds [15]. In [26] it was shown that even for the stronger Lasserre hierarchy [20] one cannot expect to be able to prove the unsatisfiability of certain k-CSP formulas within Ω(n) rounds. As a result, a 7/6 − ε integrality gap for the vertex cover problem after Ω(n) rounds of the Lasserre hierarchy follows. In [4], the strength of the Sherali-Adams operator is studied in terms of
follows. In [4], the strength of the Sherali-Adams operator is studied in terms of
integrality gaps for well-known problems like max cut, vertex cover, and spars-
est cut, and in [22] integrality gaps for the fractional matching polytope, which
has Gomory-Chvátal rank 1, are provided, showing that although the match-
ing problem can be solved in polynomial time, it cannot be approximated well
with a small number of rounds of the Sherali-Adams operator. In addition, it
was shown that for certain tautologies that can be expressed in first-order logic,
the Lovász-Schrijver N+ rank can be constant, whereas the Sherali-Adams rank
grows poly-logarithmically [12].
Note that these conditions are quite natural and are indeed satisfied by all linear
cutting-plane procedures mentioned above; proofs and references are given in the
full version of this paper. Condition (1) ensures that M (P ) is a relaxation of PI
that is not worse than P itself. Condition (2) establishes the monotonicity of
the procedure; as any inequality that is valid for Q is also valid for P , the same
should hold for the corresponding M -cuts. Condition (3) states that the order in
which we fix certain variables to 0 or 1 and apply the operator should not matter.
Condition (4) makes sure that an admissible procedure is able to derive the most
basic conclusions, while Condition (5) makes certain that the natural symmetry
of the 0/1-cube is maintained. Finally, Condition (6) guarantees that admissible
cutting-plane procedures cannot be too powerful; otherwise even M (P ) = PI
would be included, and the family of admissible procedures would be too broad
to derive interesting results. Note also that (6) is akin to an independence of
irrelevant alternatives axiom.
A cutting-plane operator can be applied iteratively; we define M (i+1) (P ) :=
M (M (i) (P )).1 For consistency, we let M (1) (P ) := M (P ) and M (0) (P ) := P . Ob-
viously, PI ⊆ M (i+1) (P ) ⊆ M (i) (P ) ⊆ · · · ⊆ M (1) (P ) ⊆ M (0) (P ) = P . In gen-
eral, it is not clear whether there exists a finite k ∈ Z+ such that PI = M (k) (P ).
However, we will see that for polytopes P ⊆ [0, 1]n with PI = ∅ this follows
from properties (3) and (4). In this sense, every admissible cutting-plane proce-
dure can be viewed as a system for proving the unsatisfiability of propositional
formulas in conjunctive normal form (which can be naturally represented as sys-
tems of integer inequalities), which is the setting considered here. The rank of
P with respect to M is the minimal k ∈ Z+ such that PI = M (k) (P ). We write
rkM (P ) = k (and drop the index M if it is clear from the context).
In the following, we put together some useful properties of admissible cutting-
plane procedures that follow almost directly from the definition. We define L∩M
as (L ∩ M )(P ) := L(P ) ∩ M (P ) for all polytopes P ⊆ [0, 1]n .
1
To avoid technical difficulties, we assume that M (P ) is again a rational polytope
whenever we apply M repeatedly.
3.1 Results for Gomory-Chvátal Cuts, Matrix Cuts, and Split Cuts
We immediately obtain the following corollary from Theorem 10, which shows
that the Gomory-Chvátal procedure is, in some sense, weakest possible: When-
ever the rank of some admissible cutting-plane procedure is maximal, then so is
the Gomory-Chvátal rank. More precisely:
Corollary 11. Let M be admissible and let P ⊆ [0, 1]n be a polytope with PI = ∅
and rkM (P ) = n. Then rkGC (P ) = n.
Proof. By Theorem 10 (1) we have that P ∩ F ≠ ∅ for all one-dimensional faces F of [0, 1]^n. With Theorem 9 we therefore obtain rk_GC(P) = n.
Note that Corollary 11 does not hold for polytopes P ⊆ [0, 1]^n with PI ≠ ∅: Let P_n = {x ∈ [0, 1]^n | ∑_{i∈[n]} x_i ≥ 1/2}. Then (P_n)_I = {x ∈ [0, 1]^n | ∑_{i∈[n]} x_i ≥ 1} ≠ ∅. In [7, Section 3] it was shown that rk_GC(P_n) = 1, but rk_{N_0}(P_n) = n.
We can also derive a slightly weaker relation between the rank of matrix cuts,
split cuts, and other admissible cutting-plane procedures. First we will establish
lower bounds for the rank of Bn . The following result was provided in [7, Lemma
3.3] for matrix cuts and in [8, Lemma 6] for split cuts.
Lemma 12. Let P ⊆ [0, 1]n be a polytope and let Fk ⊆ P . Then Fk+1 ⊆ N+ (P )
and Fk+1 ⊆ SC(P ).
This yields:
Lemma 13. Let M ∈ {N0 , N, N+ , SC}. Then rkM (Bn ) = n − 1.
Proof. As Bn = conv(F2 ), Lemma 12 implies that Fn ⊆ M (n−2) (Bn ), and thus
rk(Bn ) ≥ n − 1. Together with Lemma 7 it follows that rk(Bn ) = n − 1.
We also obtain the following corollary that shows that the M -rank with M ∈
{N0 , N, N+ , SC} is at least n − 1 whenever it is n with respect to any other
admissible cutting-plane procedure.
Corollary 14. Let L be an admissible cutting-plane procedure, let M ∈
{N0 , N, N+ , SC}, and let P ⊆ [0, 1]n be a polytope with PI = ∅ and rkL (P ) = n.
Then rkM (P ) ≥ n − 1 and, if P is half-integral, then rkM (P ) = n.
Proof. If rkL (P ) = n, then Bn ⊆ P by Theorem 10 and rkM (Bn ) = n − 1
by Lemma 13. So the first part follows from Lemma 3. In order to prove the
second part observe that P ∩ F ≠ ∅ for all one-dimensional faces F of [0, 1]^n. Thus P ∩ F ≅ F_2 for all two-dimensional faces F of [0, 1]^n and, by Lemma 12, M(P) ∩ F = {(1/2) e_F}. Therefore B_n ⊆ M(P). The claim now follows from
Lemma 13.
We will now consider the case where P ⊆ [0, 1]n is half-integral with PI = ∅ in
detail. The polytope An ⊆ [0, 1]n is defined by
A_n := { x ∈ [0, 1]^n | ∑_{i∈S} x_i + ∑_{i∈[n]\S} (1 − x_i) ≥ 1/2  for all S ⊆ [n] }.
Lemma 20. Let P ⊆ [0, 1]n be a polytope. Then PI = ∅ if and only if there
exists an inequality cx ≤ δ valid for PI with V (c, δ) = {0, 1}n.
Next we establish an upper bound on the growth of the size of V (c, δ).
Lemma 21. Let M be admissible with verification degree p(n). Further, let
P = {x : Ax ≤ b} ⊆ [0, 1]n be a polytope with PI = ∅ and define k :=
maxi∈[m] |V (ai , bi )|. If cx ≤ δ has been derived by M from Ax ≤ b within
We are ready to prove a universal lower bound on the rank of admissible cutting-
plane procedures:
Proof. We will first show that if P ∩ F ≠ ∅ for all k-dimensional faces F of [0, 1]^n and cx ≤ δ is a valid inequality for P, then cx ≤ δ can cut off at most (2n)^k 0/1 points, i.e., |V(c, δ)| ≤ (2n)^k. Without loss of generality, we may assume that c ≥ 0 and that c_i ≥ c_j whenever i ≤ j; otherwise we can apply coordinate flips and variable permutations. Define l := min{j ∈ [n] : ∑_{i=1}^{j} c_i > δ}. Suppose l ≤ n − k. Define F := ∩_{i=1}^{n−k} {x_i = 1}. Observe that dim(F) = k and cx > δ for all x ∈ F as l ≤ n − k. Thus P ∩ F = ∅, which contradicts our assumption that P ∩ F ≠ ∅ for all k-dimensional faces F of [0, 1]^n. Therefore, k ≥ n − l + 1.
By the choice of l, every 0/1 point x^0 cut off by cx ≤ δ has to have at least l coordinates equal to 1. The number ζ of 0/1 points of dimension n with this property is bounded by

ζ ≤ 2^{n−l} \binom{n}{l} ≤ 2^k \binom{n}{n−l} ≤ 2^k \binom{n}{k−1} ≤ 2^k n^k ≤ (2n)^k.
Note that the third inequality holds as k ≤ n/2, by assumption. It follows that
|V (c, δ)| ≤ (2n)k .
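For small n, V(c, δ) can be tabulated exactly, which gives a convenient sanity check of the (2n)^k bound; a brute-force sketch of ours:

```python
from itertools import product

def cut_off_points(c, delta, n):
    """V(c, delta): all 0/1 points violating cx <= delta."""
    return [x for x in product((0, 1), repeat=n)
            if sum(ci * xi for ci, xi in zip(c, x)) > delta]
```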
As we have seen, any inequality πx ≤ π_0 valid for P can cut off at most (2n)^k 0/1-points. In order to prove that PI = ∅, we have to derive an infeasibility certificate cx ≤ δ with V(c, δ) = {0, 1}^n, by Lemma 20. Thus, |V(c, δ)| = 2^n is a necessary condition for cx ≤ δ to be such a certificate. If cx ≤ δ is derived in ℓ rounds by M from Ax ≤ b then, by Lemma 21, we have that |V(c, δ)| ≤ p(n)^ℓ (2n)^k. Hence, ℓ ∈ Ω(n/ log n) and, therefore, rk(P) ∈ Ω(n/ log n).
Corollary 23. Let M be admissible. Then rk(Bn ) ∈ Ω(n/ log n) and rk(An ) ∈
Ω(n/ log n).
Corollary 24. Let P ⊆ [0, 1]n be a polytope with PI = ∅ and let L, M be two
admissible cutting-plane procedures. If rkL (P ) = n, then rkM (P ) ∈ Ω(n/ log n).
Definition 25. Let P ⊆ [0, 1]^n be a polytope. The cutting-plane procedure “+” is defined as follows. Let J̃ ⊆ [n] with |J̃| ≤ log n and let I ⊆ Ĩ ⊆ [n] with Ĩ ∩ J̃ = ∅. If there exists ε > 0 such that

∑_{i∈I} x_i + ∑_{i∈Ĩ\I} (1 − x_i) + ∑_{i∈J} x_i + ∑_{i∈J̃\J} (1 − x_i) ≥ ε

is valid for P for every J ⊆ J̃, then ∑_{i∈I} x_i + ∑_{i∈Ĩ\I} (1 − x_i) ≥ 1 is a +-cut for P.
Let us first prove that +-cuts are indeed valid; i.e., they do not cut off any integer
points contained in P . At the same time, the proof of the following lemma helps
to establish that the +-operator satisfies Property (6) of Definition 1.
Lemma 26. Let P ⊆ [0, 1]n be a polytope. Every +-cut is valid for PI .
Taking one half times each of the two valid inequalities

∑_{i∈I} x_i + ∑_{i∈Ĩ\I} (1 − x_i) + ∑_{i∈J} x_i + ∑_{i∈J_0\J} (1 − x_i) + x_{i_0} ≥ 1,

∑_{i∈I} x_i + ∑_{i∈Ĩ\I} (1 − x_i) + ∑_{i∈J} x_i + ∑_{i∈J_0\J} (1 − x_i) + (1 − x_{i_0}) ≥ 1,

and adding them yields

∑_{i∈I} x_i + ∑_{i∈Ĩ\I} (1 − x_i) + ∑_{i∈J} x_i + ∑_{i∈J_0\J} (1 − x_i) ≥ 1/2.
We can again round up the right-hand side and iteratively repeat this process
until |J0 | = 0.
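A brute-force sanity check of Lemma 26, under the reading of Definition 25 given above; the concrete index sets are our own toy choices, not from the paper:

    from itertools import combinations, product

    # Every 0/1 point satisfying all premises of Definition 25 (one inequality
    # per J ⊆ J~, right-hand side eps > 0) also satisfies the +-cut
    #   sum_{i in I} x_i + sum_{i in I~\I} (1 - x_i) >= 1.
    n, eps = 5, 0.5
    I, I_tilde, J_tilde = {0}, {0, 1}, (3, 4)   # I ⊆ I~, I~ ∩ J~ = ∅

    def base(x):                                # left-hand side of the +-cut
        return sum(x[i] for i in I) + sum(1 - x[i] for i in I_tilde - I)

    def j_part(x, J):                           # contribution of J ⊆ J~
        return sum(x[i] if i in J else 1 - x[i] for i in J_tilde)

    for x in product([0, 1], repeat=n):
        premises = all(base(x) + j_part(x, set(J)) >= eps
                       for r in range(len(J_tilde) + 1)
                       for J in combinations(J_tilde, r))
        if premises:
            assert base(x) >= 1                 # the +-cut is valid at x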
By Definition 25,

Σ_{i∈I} x_i + Σ_{i∈Ĩ\I} (1 − x_i) + Σ_{i=k+1}^{n} x_i ≥ 1
In the following we will show that, for any given polytope P ⊆ [0,1]^n with
P_I = ∅, rk_+(P) ∈ O(n/ log n). This is a direct consequence of the following
lemma; we use P^{(k)} to denote the k-th closure of the "+" operator.

Lemma 27. Let P ⊆ [0,1]^n be a polytope with P_I = ∅. Then Σ_{i∈I} x_i + Σ_{i∈Ĩ\I} (1 −
x_i) ≥ 1 with I ⊆ Ĩ ⊆ [n], |Ĩ| ≥ n − k⌈log n⌉ is valid for P^{(k+1)}.
We are ready to establish an upper bound on the rank of the “+” operator:
Theorem 28. Let P ⊆ [0,1]^n be a polytope with P_I = ∅. Then rk_+(P) ∈
O(n/ log n).
Proof. It suffices to derive the inequalities x_i ≥ 1 and x_i ≤ 0. By Lemma 27 we
have that Σ_{i∈I} x_i + Σ_{i∈Ĩ\I} (1 − x_i) ≥ 1 with I ⊆ Ĩ ⊆ [n], |Ĩ| ≥ n − k⌈log n⌉ is
valid for P^{(k+1)}. Thus Σ_{i∈I} x_i + Σ_{i∈Ĩ\I} (1 − x_i) ≥ 1 with I ⊆ Ĩ = {i} is valid for
P^{(k+1)} whenever k ≥ (n − 1)/⌈log n⌉. Observe that for I = {i} and I = ∅ we obtain that x_i ≥ 1
and x_i ≤ 0 are valid for P^{(k+1)}, respectively.
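The resulting round count is easy to tabulate; a small sketch (ours; base-2 logarithms assumed, matching the ⌈log n⌉ above):

    from math import ceil, log2

    # Rounds sufficient for the "+" procedure by the proof of Theorem 28:
    # k + 1 with k = ceil((n - 1) / ceil(log2 n)), i.e., O(n / log n),
    # in contrast to procedures that may need the full rank n.
    for n in (16, 64, 256, 1024):
        k = ceil((n - 1) / ceil(log2(n)))
        print(f"n = {n:4d}:  rk_+ <= {k + 1:4d}   (vs. n = {n})")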
References
[1] Balas, E., Ceria, S., Cornuéjols, G.: A lift-and-project cutting plane algorithm for
mixed 0-1 programs. Mathematical Programming 58, 295–324 (1993)
[2] Bockmayr, A., Eisenbrand, F., Hartmann, M., Schulz, A.: On the Chvátal rank of
polytopes in the 0/1 cube. Discrete Applied Mathematics 98, 21–27 (1999)
[3] Bonet, M., Pitassi, T., Raz, R.: Lower bounds for cutting planes proofs with small
coefficients. In: Proceedings of the 27th Annual ACM Symposium on Theory of
Computing, pp. 575–584 (1995)
[4] Charikar, M., Makarychev, K., Makarychev, Y.: Integrality gaps for Sherali-Adams
relaxations. In: Proceedings of the 41st Annual ACM Symposium on Theory of
Computing, pp. 283–292 (2009)
[5] Chvátal, V.: Edmonds polytopes and a hierarchy of combinatorial problems. Dis-
crete Mathematics 4, 305–337 (1973)
[6] Chvátal, V., Cook, W., Hartmann, M.: On cutting-plane proofs in combinatorial
optimization. Linear Algebra and its Applications 114, 455–499 (1989)
[7] Cook, W., Dash, S.: On the matrix-cut rank of polyhedra. Mathematics of Oper-
ations Research 26, 19–30 (2001)
[8] Cornuéjols, G.: Valid inequalities for mixed integer linear programs. Mathematical
Programming 112, 3–44 (2008)
[9] Cornuéjols, G., Li, Y.: Elementary closures for integer programs. Operations Re-
search Letters 28, 1–8 (2001)
[10] Cornuéjols, G., Li, Y.: A connection between cutting plane theory and the geom-
etry of numbers. Mathematical Programming 93, 123–127 (2002)
[11] Cornuéjols, G., Li, Y.: On the rank of mixed 0,1 polyhedra. Mathematical Pro-
gramming 91, 391–397 (2002)
[12] Dantchev, S.: Rank complexity gap for Lovász-Schrijver and Sherali-Adams proof
systems. In: Proceedings of the 39th Annual ACM Symposium on Theory of Com-
puting, pp. 311–317 (2007)
[13] Dash, S.: An exponential lower bound on the length of some classes of branch-
and-cut proofs. Mathematics of Operations Research 30, 678–700 (2005)
[14] Eisenbrand, F., Schulz, A.: Bounds on the Chvátal rank of polytopes in the 0/1-
cube. Combinatorica 23, 245–261 (2003)
[15] Georgiou, K., Magen, A., Pitassi, T., Tourlakis, I.: Integrality gaps of 2 − o(1) for
vertex cover SDPs in the Lovász-Schrijver hierarchy. In: Proceedings of the 48th Annual
IEEE Symposium on Foundations of Computer Science, pp. 702–712 (2007)
[16] Goemans, M., Tunçel, L.: When does the positive semidefiniteness constraint help
in lifting procedures? Mathematics of Operations Research 26, 796–815 (2001)
[17] Gomory, R.: Outline of an algorithm for integer solutions to linear programs.
Bulletin of the American Mathematical Society 64, 275–278 (1958)
[18] Gomory, R.: Solving linear programming problems in integers. In: Bellman, R.,
Hall, M. (eds.) Proceedings of Symposia in Applied Mathematics X, pp. 211–215.
American Mathematical Society, Providence (1960)
[19] Gomory, R.: An algorithm for integer solutions to linear programs. In: Recent Ad-
vances in Mathematical Programming, pp. 269–302. McGraw-Hill, New York (1963)
[20] Lasserre, J.: An explicit exact SDP relaxation for nonlinear 0-1 programs. In:
Aardal, K., Gerards, B. (eds.) IPCO 2001. LNCS, vol. 2081, pp. 293–303. Springer,
Heidelberg (2001)
[21] Lovász, L., Schrijver, A.: Cones of matrices and set-functions and 0-1 optimization.
SIAM Journal on Optimization 1, 166–190 (1991)
[22] Mathieu, C., Sinclair, A.: Sherali-Adams relaxations of the matching polytope. In:
Proceedings of the 41st Annual ACM Symposium on Theory of Computing, pp.
293–302 (2009)
[23] Pokutta, S., Schulz, A.: A note on 0/1 polytopes without integral points and with
maximal rank (2009) (preprint)
[24] Pokutta, S., Schulz, A.: On the connection of the Sherali-Adams closure and border
bases (2009) (submitted)
[25] Pudlák, P.: On the complexity of propositional calculus. In: Sets and Proofs, In-
vited Papers from Logic Colloquium '97, pp. 197–218. Cambridge University Press,
Cambridge (1999)
[26] Schoenebeck, G.: Linear level Lasserre lower bounds for certain k-CSPs. In: Pro-
ceedings of the 49th Annual IEEE Symposium on Foundations of Computer Sci-
ence, pp. 593–602 (2008)
[27] Schoenebeck, G., Trevisan, L., Tulsiani, M.: Tight integrality gaps for Lovász-
Schrijver LP relaxations of vertex cover and max cut. In: Proceedings of the 39th
Annual ACM Symposium on Theory of Computing, pp. 302–310 (2007)
[28] Sherali, H., Adams, W.: A hierarchy of relaxations between the continuous and
convex representations for zero-one programming problems. SIAM Journal on Dis-
crete Mathematics 3, 411–430 (1990)
Author Index