Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems (CPAIOR 2010)
Editorial Board
David Hutchison
Lancaster University, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Alfred Kobsa
University of California, Irvine, CA, USA
Friedemann Mattern
ETH Zurich, Switzerland
John C. Mitchell
Stanford University, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz
University of Bern, Switzerland
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
TU Dortmund University, Germany
Madhu Sudan
Microsoft Research, Cambridge, MA, USA
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max-Planck Institute of Computer Science, Saarbruecken, Germany
Andrea Lodi Michela Milano
Paolo Toth (Eds.)
Integration
of AI and OR Techniques
in Constraint Programming
for Combinatorial
Optimization Problems
Volume Editors
Andrea Lodi
Michela Milano
Paolo Toth
DEIS
University of Bologna
Viale Risorgimento 2, 40136 Bologna, Italy
E-mail: {andrea.lodi,michela.milano,paolo.toth}@unibo.it
ISSN 0302-9743
ISBN-10 3-642-13519-6 Springer Berlin Heidelberg New York
ISBN-13 978-3-642-13519-4 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer. Violations are liable
to prosecution under the German Copyright Law.
springer.com
© Springer-Verlag Berlin Heidelberg 2010
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper 06/3180
Preface
Special thanks go to Marco Gavanelli and Andrea Roli, the Conference Chairs
who took care of the many details concerning the organization, and to Vanessa
Grotti (Planning Congressi), for her work on budgeting, planning and booking.
Finally, we would like to thank the sponsors who made it possible to organize this conference: the ARTIST Design Network of Excellence, the Institute
for Computational Sustainability (ICS), the Cork Constraint Computation Cen-
ter, the Association for Constraint Programming (ACP), the Optimization for
Sustainable Development (OSD) Chair, IBM and FICO.
A special mention should be made of FONDAZIONE DEL MONTE - 1473
for its generous support of the publication of these proceedings and of ALMA
MATER STUDIORUM - Università di Bologna for the continuous help and
support of the organization of CPAIOR 2010.
Program Chairs
Andrea Lodi DEIS, University of Bologna, Italy
Michela Milano DEIS, University of Bologna, Italy
Paolo Toth DEIS, University of Bologna, Italy
Program Committee
Philippe Baptiste École Polytechnique, France
Roman Barták Charles University, Czech Republic
Christopher Beck University of Toronto, Canada
Andrew Davenport IBM, USA
Matteo Fischetti University of Padova, Italy
Bernard Gendron University of Montréal, Canada
Youssef Hamadi Microsoft Research, United Kingdom
Susanne Heipcke Xpress Team, FICO, France
Holger Hoos University of British Columbia, Canada
Narendra Jussien École des Mines de Nantes, France
Thorsten Koch ZIB, Germany
Laurent Michel University of Connecticut, USA
Barry O’Sullivan University College Cork, Ireland
Laurent Perron Google, France
Gilles Pesant École Polytechnique de Montréal, Canada
Jean-Charles Régin University of Nice-Sophia Antipolis, France
Louis-Martin Rousseau École Polytechnique de Montréal, Canada
Meinolf Sellmann Brown University, USA
Paul Shaw IBM, France
Helmut Simonis 4C, Ireland
Michael Trick Carnegie Mellon University, USA
Pascal Van Hentenryck Brown University, USA
Willem-Jan van Hoeve Carnegie Mellon University, USA
Petr Vilim IBM, Czech Republic
Mark Wallace Monash University, Australia
Tallys Yunes University of Miami, USA
Local Organization
Marco Gavanelli University of Ferrara, Italy
Andrea Roli DEIS, University of Bologna, Italy
Enrico Malaguti DEIS, University of Bologna, Italy
Fabio Parisini DEIS, University of Bologna, Italy
External Reviewers
Alejandro Arbelaez Robert Mateescu
Nicolas Beldiceanu Luis Paquete
Nicolas Chapados Fabio Parisini
Marco Chiarandini Claude-Guy Quimper
Emilie Danna Philippe Refalo
Yves Deville Fabrizio Riguzzi
Ambros Gleixner Jerome Rogerie
Stefan Heinz Domenico Salvagnin
Vincent Jost Horst Samulowitz
Serdar Kadioglu Joachim Schimpf
Philippe Laborie James Styles
Michele Lombardi Maxwell Young
Yuri Malitsky
Matteo Fischetti
Cutting planes (cuts) are very popular in the OR community, where they are
used to strengthen the Linear Programming (LP) relaxation of Mixed-Integer
Programs (MIPs) in the hope of improving the performance of an exact LP-
based solver. In particular, an intense research effort has been devoted to the
study of families of general cuts, whose validity does not require the presence
of a specific MIP structure—as opposed to problem-specific cuts such as, e.g.,
subtour elimination or comb inequalities for the traveling salesman problem.
Among general cuts, Gomory’s Mixed-Integer Cuts (GMICs) play a central
role both in theory and in practice. These cuts were introduced by Ralph Gomory about 50 years ago in his seminal paper [1]. Though elegant and com-
putationally cheap, they were soon abandoned because they were considered of
little practical use [2]. The situation changed radically more than 30 years later,
when Balas, Ceria, Cornuéjols and Natraj [3] found how to take advantage of
exactly the same cuts but in a different framework. In our view, this is a good
example of the importance of a sound framework for MIP cuts.
Even today, MIP solvers are quite conservative in the use of general cuts, and
in particular of GMICs, because of known issues due to the iterative accumu-
lation of the cuts in the optimal LP basis. This leads to numerical instability
because of a typically exponential growth of the determinant of the LP basis.
Following our recent joint work with Balas and Zanette [4, 5], in this talk we
argue that the known issues with cutting plane methods are largely due to the
overall framework where the cuts are used, rather than to the cuts themselves.
This is because the two main cutting plane modules (the LP solver and the cut
generator) form a closed-loop system that is intrinsically prone to instability.
Hence a kind of “decoupling filter” needs to be introduced in the loop if one
wants to exploit the full power of a given family of cuts.
A main goal of the talk is to refocus part of the current research effort from
the definition of new cut families to the way the cuts are actually used. In fact,
cutting planes still miss an overall “meta-scheme” to control cut generation and
to escape local optima by means of diversification phases—very much in the spirit
of Tabu or Variable Neighborhood Search meta-schemes for primal heuristics.
The development of sound meta-schemes on top of a basic separation tool is
therefore an interesting new avenue for future research, with contributions ex-
pected from all the three CP/AI/OR communities. The relax-and-cut framework
for GMICs recently proposed in the joint work with Salvagnin [6] can be viewed
as a first step in this direction.
References
1. Gomory, R.E.: An algorithm for the mixed integer problem. Technical Report RM-2597, The RAND Corporation (1960)
2. Cornuéjols, G.: Revival of the Gomory cuts in the 1990’s. Annals of Operations
Research 149(1), 63–66 (2006)
3. Balas, E., Ceria, S., Cornuéjols, G., Natraj, N.: Gomory cuts revisited. Operations
Research Letters 19, 1–9 (1996)
4. Zanette, A., Fischetti, M., Balas, E.: Lexicography and degeneracy: can a pure cut-
ting plane algorithm work? Mathematical Programming (2009), doi:10.1007/s10107-
009-0300-y
5. Balas, E., Fischetti, M., Zanette, A.: On the enumerative nature of Gomory’s dual
cutting plane method. Mathematical Programming B (to appear, 2010)
6. Fischetti, M., Salvagnin, D.: A relax-and-cut framework for Gomory’s mixed-integer
cuts. In: CPAIOR 2010 Proceedings (2010)
Challenges for CPAIOR in
Computational Sustainability
Carla P. Gomes
Cornell University
Ithaca, NY, USA
[email protected]
spaces and hidden problem structure, plays as prominent a role as formal analy-
sis [2]. Such an approach differs from the traditional computer science approach,
based on abstract mathematical models, mainly driven by worst-case analyses.
While formulations of real-world computational tasks frequently lead to worst-case intractable problems, such real-world tasks often contain hidden structure enabling scalable methods. It is therefore important to develop new approaches
to identify and exploit real-world structure, combining principled experimen-
tation with mathematical modeling, that will lead to scalable and practically
effective solutions.
In summary, the new field of Computational Sustainability brings together
computer scientists, operations researchers, applied mathematicians, biologists,
environmental scientists, and economists, to join forces to study and provide so-
lutions to computational problems concerning sustainable development, offering
challenges but also opportunities for the advancement of the state of the art of
computing and information science and related fields.
Acknowledgments
The author is the lead Principal Investigator of an NSF Expedition in Computing
grant on Computational Sustainability (Award Number: 0832782). The author
thanks the NSF Expeditions in Computing grant team members for their many
contributions towards the development of a vision for Computational Sustain-
ability, in particular, Chris Barrett, Antonio Bento, Jon Conrad, Tom Dietterich,
John Guckenheimer, John Hopcroft, Ashish Sabharwal, Bart Selman, David
Shmoys, Steve Strogatz, and Mary Lou Zeeman.
References
[1] Gomes, C.: Computational sustainability: Computational methods for a sustain-
able environment, economy, and society. The Bridge, National Academy of Engi-
neering 39(4) (Winter 2009)
[2] Gomes, C., Selman, B.: The science of constraints. Constraint Programming Let-
ters 1(1) (2007)
[3] UNEP. Our common future. Published as annex to the General Assembly document
A/42/427, Development and International Cooperation: Environment. Technical
report, United Nations Environment Programme (UNEP) (1987)
Lazy Clause Generation: Combining the Power
of SAT and CP (and MIP?) Solving
Peter J. Stuckey
1 Introduction
decisions which has already proven to be unhelpful (in particular 1UIP nogoods),
and efficient search heuristics which concentrate on the hard parts of the prob-
lem combined with restarting to escape from early commitment to choices. These
changes, all effectively captured in Chaff [1], have made SAT solvers able to solve
problems orders of magnitude larger than previously possible.
Can we combine these two techniques in a way that inherits the strengths of each, and avoids their weaknesses? Lazy clause generation [2,3] is a hybridization of the two approaches that attempts to do this. The core of lazy clause generation is simple enough: we examine a propagation-based solver and understand its
actions as applying to an underlying set of Boolean variables representing the
integer (and set of integer) variables of the CP model.
In this invited talk we will first introduce the basic theoretical concepts that
underlie lazy clause generation. We discuss the relationship of lazy clause gen-
eration to SAT modulo theories [4]. We then explore the difficulties that arise
in the simple theoretical hybrid, and examine design choices that ameliorate
some of these difficulties. We discuss how complex global constraints interact
with lazy clause generation. We then examine some of the remaining challenges
for lazy clause generation: incorporating the advantages of mixed integer pro-
gramming (MIP) solving, and building hybrid adaptive search strategies. The
remainder of this short paper will simply introduce the basic concepts of lazy
clause generation.
Consider the usual bounds propagator for the constraint x = y×z (see e.g. [5]).
Suppose the domain of x is [ −10 .. 10 ], y is [ 2 .. 10 ] and z is [ 3 .. 10 ]. The bounds
propagator determines that the lower bound of x should be 6. In doing so it only
made use of the lower bounds of y and z. We can record this as the clause

(c1 ) : ¬[[y ≤ 1]] ∧ ¬[[z ≤ 2]] → ¬[[x ≤ 5]]
It also determines the upper bound of z is 5 using the upper bound of x and the
lower bound of y, and similarly the upper bound of y is 3. These can be recorded
as
(c2 ) : [[x ≤ 10]] ∧ ¬[[y ≤ 1]] → [[z ≤ 5]]
(c3 ) : [[x ≤ 10]] ∧ ¬[[z ≤ 2]] → [[y ≤ 3]]
Similarly if the domain of x is [ −10 .. 10 ], y is [ −2 .. 3 ] and z is [ −3 .. 3 ], the
bounds propagator determines that the upper bound of x is 9. In doing so it
made use of both the upper and lower bounds of y and z. We can record this as
a clause
¬[[y ≤ −3]] ∧ [[y ≤ 3]] ∧ ¬[[z ≤ −4]] ∧ [[z ≤ 3]] → [[x ≤ 9]]
In fact we could strengthen this explanation since the upper bound of x will remain 9 even if the lower bound of z was −4, or if the lower bound of y were
−3. So we could validly record a stronger explanation of the propagation as
¬[[y ≤ −3]] ∧ [[y ≤ 3]] ∧ ¬[[z ≤ −5]] ∧ [[z ≤ 3]] → [[x ≤ 9]]
or
¬[[y ≤ −4]] ∧ [[y ≤ 3]] ∧ ¬[[z ≤ −4]] ∧ [[z ≤ 3]] → [[x ≤ 9]]
In a lazy clause generation system every time a propagator determines a domain
change of a variable it records a clause that explains the domain change. We
can understand this process as lazily creating a clausal representation of the
information encapsulated in the propagator. Recording the clausal reasons for
domain changes creates an implication graph of domain changes. When conflict is
detected (an unsatisfiable constraint) we can construct a reason for the conflict,
just as in a SAT (or SMT) solver.
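A minimal Python sketch of this recording step: a bounds propagator for x = y × z that returns the bound it deduces together with an explanation clause in the style of c1–c3 above. The literal encoding, the function names, and the restriction to non-negative lower bounds of y and z are illustrative assumptions, not part of any actual solver.

```python
# Hypothetical literal encoding: "[[x <= 5]]" and its negation "~[[x <= 5]]".
def lit(var, bound, positive=True):
    atom = f"[[{var} <= {bound}]]"
    return atom if positive else "~" + atom

def propagate_lower_bound(dom):
    """dom maps 'x', 'y', 'z' to (lb, ub) pairs, assuming lb(y) >= 0 and lb(z) >= 0.
    Returns (new lower bound of x, explanation clause), or None if nothing propagates."""
    (xl, _), (yl, _), (zl, _) = dom['x'], dom['y'], dom['z']
    new_xl = yl * zl                  # only the lower bounds of y and z are used
    if new_xl <= xl:
        return None
    # The implication  ~[[y <= yl-1]] /\ ~[[z <= zl-1]]  ->  ~[[x <= new_xl-1]]
    # is stored as the clause  [[y <= yl-1]] \/ [[z <= zl-1]] \/ ~[[x <= new_xl-1]].
    clause = [lit('y', yl - 1), lit('z', zl - 1), lit('x', new_xl - 1, positive=False)]
    return new_xl, clause

# With the domains of the example (x in [-10,10], y in [2,10], z in [3,10]) this
# yields the bound 6 and a clause corresponding to (c1).
print(propagate_lower_bound({'x': (-10, 10), 'y': (2, 10), 'z': (3, 10)}))
```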
Suppose the domain of x is [ 6 .. 20 ], domain of y is [ 2 .. 20 ], z is [ 3 .. 10 ] and
t is [ 0 .. 20 ] and we have constraints x ≤ t, x = y × z and y ≥ 4 ∨ z ≥ 7. Suppose
search adds the new constraint t ≤ 10 (represented by [[t ≤ 10]]). The inequality
changes the upper bound of x to 10 with explanation (c4 ) : [[t ≤ 10]] → [[x ≤ 10]].
The multiplication changes the upper bound of z to 5 ([[z ≤ 5]]), and of y to 3
([[y ≤ 3]]) with the explanations c2 and c3 above, and the disjunctive constraint
(which is equivalent to (c5 ) : ¬[[y ≤ 3]] ∨ ¬[[z ≤ 6]]) makes ¬[[z ≤ 6]] true which
by the domain constraints makes (c6 ) : [[z ≤ 5]] → [[z ≤ 6]] unsatisfiable. The
implication graph is illustrated in Figure 1.
We can explain the conflict by any cut that separates the conflict node from
the earlier parts of the graph. The first unique implication point (1UIP) cut
chooses the closest literal to the conflict where all paths from the last decision
to the conflict flow through that literal, and draws the cut just after this literal. The 1UIP cut for Figure 1 is shown as the dashed line.

Fig. 1. Implication graph of the example: [[t ≤ 10]] implies [[x ≤ 10]] via c4 ; together with ¬[[y ≤ 1]] and ¬[[z ≤ 2]], clauses c2 and c3 imply [[z ≤ 5]] and [[y ≤ 3]]; c5 then forces ¬[[z ≤ 6]], and c6 gives the conflict. The dashed line marks the 1UIP cut.

The resulting nogood is

¬[[y ≤ 1]] ∧ ¬[[z ≤ 2]] → ¬[[x ≤ 10]]
Note that if we ever reach a situation in the future where the lower bound of y
is at least 2, and the lower bound of z is at least 3, then the lower bound of x
will become at least 11 using this clause.
Since we are explaining conflicts completely analogously to a SAT (or SMT)
solver, we can attach activities to the Boolean variables representing the original integer variables. Each Boolean variable examined during the creation of
the explanation (including those appearing in the final nogood) has their activ-
ity bumped. Every once in a while all activity counts are decreased, so that more recent activity counts for more. This allows us to implement an activity-based VSIDS search heuristic for the hybrid solver. We can also attach activity coun-
ters to clauses, which are bumped when they are involved in the explanation
process.
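The activity bookkeeping just described can be sketched as follows; the decay factor, bump amount, and class interface are assumptions made for illustration rather than the scheme of any particular solver.

```python
class ActivityTable:
    """VSIDS-style activities for Boolean variables (the same table can be kept for clauses)."""

    def __init__(self, decay=0.95, bump=1.0):
        self.activity = {}
        self.decay_factor = decay
        self.bump_amount = bump

    def bump(self, item):
        # called for every variable (or clause) examined while building an explanation
        self.activity[item] = self.activity.get(item, 0.0) + self.bump_amount

    def decay_all(self):
        # "every once in a while all activity counts are decreased"
        for item in self.activity:
            self.activity[item] *= self.decay_factor

    def pick(self, candidates):
        # activity-based branching: prefer the candidate with the highest activity
        return max(candidates, key=lambda c: self.activity.get(c, 0.0))
```

A second table of this kind serves for the clause activities mentioned above, and, as discussed next, clauses with low activity are natural candidates for removal.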
Since all of the clauses generated are redundant information we can at any
stage remove any of the generated clauses. This gives us the opportunity to
control the size of the clausal representation of the problem. Just as in a SAT
solver we can use clausal activities to decide which generated clauses are most
worthwhile retaining.
3 Concluding Remarks
The simple description of lazy clause generation in the previous section does
not lead to an efficient lazy clause generation solver, except for some simple
kinds of examples. In practice we need to also lazily generate the Boolean vari-
ables required to represent the original integer (and set of integer) variables. We
may also choose to either eagerly generate the explanation clauses as we execute forward propagation, or lazily generate explanations on demand during conflict analysis.
References
1. Moskewicz, M., Madigan, C., Zhao, Y., Zhang, L., Malik, S.: Chaff: Engineering
an efficient SAT solver. In: Proceedings of 38th Conference on Design Automation
(DAC 2001), pp. 530–535 (2001)
2. Ohrimenko, O., Stuckey, P., Codish, M.: Propagation via lazy clause generation.
Constraints 14(3), 357–391 (2009)
3. Feydy, T., Stuckey, P.: Lazy clause generation reengineered. In: Gent, I.P. (ed.) CP
2009. LNCS, vol. 5732, pp. 352–366. Springer, Heidelberg (2009)
4. Nieuwenhuis, R., Oliveras, A., Tinelli, C.: Abstract DPLL and abstract DPLL
modulo theories. In: Baader, F., Voronkov, A. (eds.) LPAR 2004. LNCS (LNAI),
vol. 3452, pp. 36–50. Springer, Heidelberg (2005)
5. Marriott, K., Stuckey, P.: Programming with Constraints: an Introduction. MIT
Press, Cambridge (1998)
6. Schutt, A., Feydy, T., Stuckey, P., Wallace, M.: Why cumulative decomposition is
not as bad as it sounds. In: Gent, I.P. (ed.) CP 2009. LNCS, vol. 5732, pp. 746–761.
Springer, Heidelberg (2009)
7. Gange, G., Stuckey, P., Lagoon, V.: Fast set bounds propagation using a BDD-SAT
hybrid. Journal of Artificial Intelligence Research (to appear, 2010)
On Matrices, Automata, and Double Counting
Abstract. Matrix models are ubiquitous for constraint problems. Many such
problems have a matrix of variables M, with the same constraint defined by a
finite-state automaton A on each row of M and a global cardinality constraint
gcc on each column of M. We give two methods for deriving, by double count-
ing, necessary conditions on the cardinality variables of the gcc constraints from
the automaton A. The first method yields linear necessary conditions and simple
arithmetic constraints. The second method introduces the cardinality automaton,
which abstracts the overall behaviour of all the row automata and can be encoded
by a set of linear constraints. We evaluate the impact of our methods on a large
set of nurse rostering problem instances.
1 Introduction
Several authors have shown that matrix models are ubiquitous for constraint problems.
Despite this fact, only a few constraints that consider a matrix and some of its con-
straints as a whole have been considered: the allperm [8] and lex2 [7] constraints
were introduced for breaking symmetries in a matrix, while the colored matrix con-
straint [13] was introduced for handling a conjunction of gcc constraints on the rows and
columns of a matrix. We focus on another recurring pattern, especially in the context of
personnel rostering, which can be described in the following way.
Given three positive integers R, K, and V , we have an R × K matrix M of decision
variables that take their values within the finite set of values {0, 1, . . . , V − 1}, as well
as a V ×K matrix M# of cardinality variables that take their values within the finite set
of values {0, 1, . . . , R}. Each row r (with 0 ≤ r < R) of M is subject to a constraint
defined by a finite-state automaton A [2,12]. For simplicity, we assume that each row is
subject to the same constraint. Each column k (with 0 ≤ k < K) of M is subject to a
gcc constraint that restricts the number of occurrences of the values according to column
k of M# : let #vk denote the number of occurrences of value v (with 0 ≤ v < V ) in
column k of M, that is, the cardinality variable in row v and column k of M# . We
call this pattern the matrix-of-automata-and-gcc pattern. In the context of personnel
rostering, a possible interpretation of this pattern is:
Fig. 1. Automaton associated with the global contiguity constraint, with initial state s0 , final
states s0 , s1 , s2 , and transitions t0 , t1 , t2 , t3 , t4 labelled by values 0 or 1. The missing transition
for value 1 from state s2 is assumed to go to a dead state. The automaton has been annotated with
counters [2]: the final value of counter c is the number of stretches of value 0, whereas d is an
auxiliary counter.
containing two 1s. But since we have a global contiguity constraint on each row of the
matrix and since the matrix only has three rows, there is a contradiction.
– Methods for deriving necessary conditions on the cardinality variables of the gcc
constraints from string properties that hold for an automaton A (Sections 2.1 to 2.3).
– A method for annotating an automaton A with counter variables extracting string
properties from A (Section 2.4).
– Another method for deriving necessary conditions on the cardinality variables,
called the cardinality automaton, which simulates the overall behaviour of all the
row automata (Section 3).
– An evaluation of the impact of our methods in terms of runtime and search effort
on a large set of nurse rostering problem instances (Section 4).
Since our methods essentially generate linear constraints as necessary conditions, they
may also be relevant in the context of linear programming.
We develop a first method for deriving necessary conditions for the matrix-of-automata-
and-gcc pattern. The key idea is to approximate the set of solutions to the row constraint
by string properties such as:
– Bounds on the number of letters, words, prefixes, or suffixes (see Section 2.1).
– Bounds on the number of stretches of a given value (see Section 2.2).
– Bounds on the lengths of stretches of a given value (see Section 2.3).
Necessary Conditions. Let |w| denote the length of word w, and let w_j denote the j-th letter of word w. The following bounds

lw_k(w) = max( Σ_{j=0}^{|w|−1} #^{w_j}_{k+j} − (|w| − 1) · R, 0 )    (1)

uw_k(w) = min_{j=0}^{|w|−1} #^{w_j}_{k+j}    (2)

can be aggregated over all the columns of the matrix, giving by double counting the necessary conditions

Σ_{k=0}^{K−|w|} lw_k(w) ≤ Σ_{r=0}^{R−1} UW_r(w)    (3a)

Σ_{k=0}^{K−|w|} uw_k(w) ≥ Σ_{r=0}^{R−1} LW_r(w)    (3b)

Note that (3b) trivially holds when all LW_r(w) are zero.
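These conditions can be evaluated directly from the cardinality variables. The following Python sketch assumes that card[v][k] holds the value of #^v_k and that LW_rows[r] and UW_rows[r] are the per-row lower and upper bounds on the number of occurrences of the word w; the data layout is an assumption made only for illustration.

```python
def lw(card, w, k, R):
    """Bound (1): a lower bound on the number of rows spelling w at columns k..k+|w|-1."""
    total = sum(card[w[j]][k + j] for j in range(len(w)))
    return max(total - (len(w) - 1) * R, 0)

def uw(card, w, k):
    """Bound (2): an upper bound on the number of rows spelling w at columns k..k+|w|-1."""
    return min(card[w[j]][k + j] for j in range(len(w)))

def word_conditions_hold(card, w, R, K, LW_rows, UW_rows):
    """Check the aggregated necessary conditions (3a) and (3b) for the word w."""
    positions = range(K - len(w) + 1)          # k ranges over 0..K-|w|
    return (sum(lw(card, w, k, R) for k in positions) <= sum(UW_rows)
            and sum(uw(card, w, k) for k in positions) >= sum(LW_rows))
```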
We get the same necessary conditions as before. Note that (4) and (5) specialise respec-
tively to (1) and (2) when all wj are singleton sets.
lw_0(w) ≤ Σ_{r=0}^{R−1} UWP_r(w)    (6a)

uw_0(w) ≥ Σ_{r=0}^{R−1} LWP_r(w)    (6b)

lw_{K−|w|}(w) ≤ Σ_{r=0}^{R−1} UWS_r(w)    (7a)

uw_{K−|w|}(w) ≥ Σ_{r=0}^{R−1} LWS_r(w)    (7b)
Note that (6b) trivially holds when all LWP r (w) are zero, and that (7b) trivially holds
when all LWS r (w) are zero. Note that these necessary conditions also hold when each
letter of a constrained prefix or suffix is replaced by a set of letters.
Necessary Conditions. The following bounds (under the convention that #^v_{−1} = 0 for each value v)

ls⁺_k(v) = max(0, #^v_k − #^v_{k−1})    (8)

us⁺_k(v) = #^v_k − max(0, #^v_{k−1} + #^v_k − R)    (9)

By aggregating these bounds for all the columns of the matrix, we get the following necessary conditions through double counting:

Σ_{k=0}^{K−1} ls⁺_k(v) ≤ Σ_{r=0}^{R−1} US_r(v)    (10a)

Σ_{k=0}^{K−1} us⁺_k(v) ≥ Σ_{r=0}^{R−1} LS_r(v)    (10b)
Similarly, the following bounds (under the convention that #^v_K = 0 for each value v)

ls⁻_k(v) = max(0, #^v_k − #^v_{k+1})    (11)

us⁻_k(v) = #^v_k − max(0, #^v_{k+1} + #^v_k − R)    (12)

give, by the same double-counting argument, the necessary conditions

Σ_{k=0}^{K−1} ls⁻_k(v) ≤ Σ_{r=0}^{R−1} US_r(v)    (13a)

Σ_{k=0}^{K−1} us⁻_k(v) ≥ Σ_{r=0}^{R−1} LS_r(v)    (13b)
Note that (10b) and (13b) trivially hold when all LS r (v) are zero.
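Under the same assumed layout, with card_v the list [#^v_0, . . . , #^v_{K−1}] for a value v and LS_rows[r], US_rows[r] the per-row bounds on the number of stretches of v, the bounds (8)–(13) can be sketched as follows.

```python
def ls_plus(card_v, k):
    prev = card_v[k - 1] if k > 0 else 0            # convention #^v_{-1} = 0
    return max(0, card_v[k] - prev)                 # (8)

def us_plus(card_v, k, R):
    prev = card_v[k - 1] if k > 0 else 0
    return card_v[k] - max(0, prev + card_v[k] - R)  # (9)

def ls_minus(card_v, k):
    nxt = card_v[k + 1] if k + 1 < len(card_v) else 0   # convention #^v_K = 0
    return max(0, card_v[k] - nxt)                  # (11)

def us_minus(card_v, k, R):
    nxt = card_v[k + 1] if k + 1 < len(card_v) else 0
    return card_v[k] - max(0, nxt + card_v[k] - R)  # (12)

def stretch_conditions_hold(card_v, R, LS_rows, US_rows):
    K = len(card_v)
    return (sum(ls_plus(card_v, k) for k in range(K)) <= sum(US_rows)         # (10a)
            and sum(us_plus(card_v, k, R) for k in range(K)) >= sum(LS_rows)  # (10b)
            and sum(ls_minus(card_v, k) for k in range(K)) <= sum(US_rows)    # (13a)
            and sum(us_minus(card_v, k, R) for k in range(K)) >= sum(LS_rows))  # (13b)
```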
Σ_{k=0}^{K−1} ls⁺_k(v̂) ≤ Σ_{v∈v̂} Σ_{r=0}^{R−1} US_r(v)    (18a)

Σ_{k=0}^{K−1} us⁺_k(v̂) ≥ Σ_{v∈v̂} Σ_{r=0}^{R−1} LS_r(v)    (18b)

Σ_{k=0}^{K−1} ls⁻_k(v̂) ≤ Σ_{v∈v̂} Σ_{r=0}^{R−1} US_r(v)    (19a)

Σ_{k=0}^{K−1} us⁻_k(v̂) ≥ Σ_{v∈v̂} Σ_{r=0}^{R−1} LS_r(v)    (19b)
Note that (18a), (18b), (19a), and (19b) specialise respectively to (10a), (10b), (13a),
and (13b) when v̂ = {v}.
Necessary Conditions

∀k ∈ [0, K − 1] : #^v_k ≥ Σ_{j=max(0, k−LLS(v)+1)}^{k} ls⁺_j(v)    (20)

∀k ∈ [0, K − 1] : #^v_k ≥ Σ_{j=k}^{min(K−1, k+LLS(v)−1)} ls⁻_j(v)    (21)
The intuition behind (20) resp. (21) is that the stretches starting resp. ending at the
considered columns j must overlap column k.
∀k ∈ [0, K − 1 − ULS(v)] : ls⁺_k(v) + Σ_{j=LLS(v)}^{ULS(v)} #^v_{k+j} − (ULS(v) − LLS(v) + 1) · R ≤ 0    (22)

∀k ∈ [ULS(v), K − 1] : ls⁻_k(v) + Σ_{j=LLS(v)}^{ULS(v)} #^v_{k−j} − (ULS(v) − LLS(v) + 1) · R ≤ 0    (23)
The intuition behind (22) is as follows. Consider a stretch beginning at column k. Then
there must be an element distinct from v in column j ∈ [k + LLS (v), k + ULS (v)]
of the same row. So at least one of the terms in the summation of (22) will get a zero
contribution from the given row. The reasoning in (23) is similar but considers stretches
ending at column k.
Table 1. For each annotation in the first column, the second column gives the number of new counters, the third column gives their initial values, and the fourth column shows the string property variable among the final counter values. In the first three rows, ℓ is the word length.
Table 2. Given an annotation and a transition of the automaton reading letter u, the table gives the counter update formulae to be used in this transition. For each annotation in the first column, the second column shows the counter names, and the third column shows the update formulae. The fourth column shows the condition under which each formula is used. In the first three multirows, ℓ is the word length.
– For each letter, lower and upper bounds on the number of its occurrences.
– For each letter, lower and upper bounds on the number or length of its stretches.
– Each word of length at most 3 that cannot occur at all.
– Each word of length at most 3 that cannot occur as a prefix or suffix.
These properties are derived, one at a time, as follows. We annotate the automaton as
described in the previous section by the candidate string property. Then we compute by
labelling the feasible values of the counter variable reflecting the given property, giving
up if the computation does not finish within 5 CPU seconds. Among the collected word,
prefix, suffix, and stretch properties, some properties are subsumed by others and are
thus filtered away. Other properties could certainly have been derived, e.g., not only
forbidden words, but also bounds on the number of occurrences of words. Our choice
was based on (a) which properties we are able to derive necessary conditions for, and
(b) empirical observations of what actually pays off in our benchmarks.
A state ⟨c_0, . . . , c_{p−1}⟩ is initial (resp. final) if c_i = 0 whenever s_i is not the initial (resp. a final) state of A.
The number of states of #A^R is the number of ordered partitions of p, and thus exponential in p. However, it is possible to have a compact encoding via constraints. Toward this, we use K + 1 sequences of p decision variables S_i^k in the domain {0, 1, . . . , R} to encode the states of an arbitrary path of length K (the number of columns) in #A^R. For k ∈ {1, . . . , K}, the sequence ⟨S_0^k, S_1^k, . . . , S_{p−1}^k⟩ has as possible values the states of #A^R after it has consumed column k − 1; the sequence ⟨T^k_{(a_0,ℓ_0,b_0)}, T^k_{(a_1,ℓ_1,b_1)}, . . . , T^k_{(a_{q−1},ℓ_{q−1},b_{q−1})}⟩ gives the numbers of automata A with transition (a_0, ℓ_0, b_0), (a_1, ℓ_1, b_1), . . . , (a_{q−1}, ℓ_{q−1}, b_{q−1}) upon reading the character of their row in column k. We get the following constraint for column k:

T^k_{(a_0,ℓ_0,b_0)} + T^k_{(a_1,ℓ_1,b_1)} + · · · + T^k_{(a_{q−1},ℓ_{q−1},b_{q−1})} = R    (26)
A reformulation with linear constraints when R = 1 and there are no column constraints
is described in [6].
The necessary constraints above on the state and transition variables only handle the row
constraints, but they can also be used to handle column constraints of the considered
kinds. These necessary constraints can thus be seen as a communication channel for
enhancing the propagation between row and column constraints.
If column k has a gcc, then we constrain the number of occurrences of value v in
column k to equal the number of transitions on v when reading column k:
∀v ∈ {0, . . . , V − 1} : #^v_k = Σ_{(a_i,ℓ_i,b_i)∈T : ℓ_i=v} T^k_{(a_i,ℓ_i,b_i)}    (29)
If column k constrains the sum of the column, then we constrain that sum to equal the
value-weighted number of transitions on v when reading column k:
Σ_{r=0}^{R−1} M[r, k] = Σ_{v=0}^{V−1} v · ( Σ_{(a_i,ℓ_i,b_i)∈T : ℓ_i=v} T^k_{(a_i,ℓ_i,b_i)} )    (30)
Furthermore, for more propagation, we can link the variables S_i^k back to the state variables [2] of the R automata A. For this purpose, let the variables Q_i^0, Q_i^1, . . . , Q_i^K (with 0 ≤ i < R) denote the K + 1 states visited by automaton A on row i of length K. We get the following gcc necessary constraints:

∀k ∈ {0, . . . , K} : gcc(⟨Q_0^k, Q_1^k, . . . , Q_{R−1}^k⟩, ⟨s_0 : S_0^k, s_1 : S_1^k, . . . , s_{p−1} : S_{p−1}^k⟩)    (31)
S_0^k = T^k_{(s_0,0,s_0)} + T^k_{(s_0,1,s_1)}    (transitions that exit state s_0)
S_1^k = T^k_{(s_1,1,s_1)} + T^k_{(s_1,0,s_2)}    (transitions that exit state s_1)
S_2^k = T^k_{(s_2,0,s_2)}    (transitions that exit state s_2)
S_0^{k+1} = T^k_{(s_0,0,s_0)}    (transitions that enter state s_0)
S_1^{k+1} = T^k_{(s_0,1,s_1)} + T^k_{(s_1,1,s_1)}    (transitions that enter state s_1)
S_2^{k+1} = T^k_{(s_1,0,s_2)} + T^k_{(s_2,0,s_2)}    (transitions that enter state s_2)
#^0_k = T^k_{(s_0,0,s_0)} + T^k_{(s_1,0,s_2)} + T^k_{(s_2,0,s_2)}    (transitions labelled by value 0)
#^1_k = T^k_{(s_0,1,s_1)} + T^k_{(s_1,1,s_1)}    (transitions labelled by value 1)
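The whole encoding can be summarised by a small consistency check. The data layout below is an assumption chosen for illustration, not the paper's implementation: transitions is a list of triples (a, v, b), T[k] maps each transition to the number of row automata taking it while reading column k, S[k] maps each state to the number of row automata in that state before column k is read (so S has K + 1 entries), and card[v][k] is the cardinality #^v_k.

```python
def encoding_consistent(transitions, T, S, card, R, K):
    states = {a for (a, v, b) in transitions} | {b for (a, v, b) in transitions}
    values = {v for (a, v, b) in transitions}
    for k in range(K):
        # (26): each of the R row automata takes exactly one transition per column
        if sum(T[k][t] for t in transitions) != R:
            return False
        for s in states:
            exiting = sum(T[k][t] for t in transitions if t[0] == s)
            entering = sum(T[k][t] for t in transitions if t[2] == s)
            # state counts before and after column k must match the transition counts
            if S[k][s] != exiting or S[k + 1][s] != entering:
                return False
        # (29): occurrences of value v in column k equal the transitions labelled by v
        for v in values:
            if card[v][k] != sum(T[k][t] for t in transitions if t[1] == v):
                return False
    return True
```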
of some nurse): this can be modelled by gcc constraints on the rows. There are even
lower and upper bounds on the cumulative number of occurrences of the working shifts
1, . . . , S − 1 in any row: this can be modelled by gcc constraints on the off-duty value
S and always gives tighter occurrence bounds on S than in the previous gcc constraints.
For each shift s, there are also lower and upper bounds on the length of any stretch of
value s in any row: this can be modelled by stretch path constraints on the rows. Fi-
nally, there are lower and upper bounds on the length of any stretch of the working shifts
1, . . . , S − 1 in any row: this can be modelled by generalised stretch path partition
constraints [3] on the rows. We stress that the constraints on any two rows are the same.
There are 8 case files for the N × 7 rosters, and another 8 case files for the N × 28
rosters. We automatically generated (see [3] for details) deterministic finite automata
(DFA) for all the row constraints of each case, but used their minimised product DFA
instead (obtained through standard DFA algorithms), thereby getting domain consis-
tency on the conjunction of all row constraints [2]. For each case, string properties were
automatically selected off-line as described in Section 2.5, and cardinality automata
were automatically constructed off-line as described in Section 3.
Under these choices, the NSPLib benchmark corresponds to the pattern studied in
this paper. To reduce the risk of reporting improvements where another search proce-
dure can achieve much of the same impact, we use a two-phase search that exploits the
fact that there is a single domain-consistent constraint on each row and column:
– Phase 1 addresses the column (coverage) constraints only: it seeks to assign enough
nurses to given shifts on given days to satisfy all but one coverage constraint. To
break row symmetries, an equivalence relation is maintained: two rows (nurses) are
in the same equivalence class while they are assigned to the same shifts and days.
– In Phase 2, one column constraint and all row constraints remain to be satisfied.
But these constraints form a Berge-acyclic CSP [1], and so the remaining decision
variables can be trivially labelled without search.
This search procedure is much more efficient than row-wise labelling under decreasing
value ordering (value S always has the highest average number of occurrences per row)
in the presence of a decreasing lexicographic ordering constraint on the rows.
The objective of our experiments is to measure the impact in runtime and backtracks
when using either or both of our methods. The experiments were run under SICStus
Prolog 4.1.1 and Mac OS X 10.6.2 on a 2.8 GHz Intel Core 2 Duo with 4 GB of RAM.
All runs were allocated 1 CPU minute. For each case and nurse count N , we used the
first 10 instances for each configuration of the NSPLib coverage complexity indicators,
that is instances 1–270 for the N × 7 rosters and 1–120 for the N × 28 rosters.
Table 3 summarises the running of these 3120 instances using neither, either, and
both of our methods. Each row first indicates the number of known instances of some
satisfiability status (‘sat’ for satisfiable, and ‘unsat’ for unsatisfiable) for a given case
and nurse count N , and then the performance of each method to the first solution,
namely the number of instances decided to be of that status without timing out, as well
as the total runtime (in seconds) and the total number of backtracks on all instances
where none of the four methods timed out (it is very important to note that this means
that these totals are comparable, but also that they do not reveal any performance gains
on instances where at least one of the methods timed out). Numbers in boldface indicate
best performance in a row. It turned out that Cases 1–6, 9–10, 12–14 are very simple
(in the absence of preference constraints), so that our methods only decrease backtracks
on one of those 2220 instances, but increase runtime. It also turned out that Case 11 is
very difficult (even in the absence of preference constraints), so that even our methods
systematically time out, because the product automaton of all row constraints is very
big; we could have overcome this obstacle by using the built-in gcc constraint and the
product automaton of the remaining row constraints, but we wanted to compare all the
cases under the same scenario. Hence we do not report any results on Cases 1–6, 9–14.
An analysis of Table 3 reveals that our methods decide more instances without tim-
ing out, and that they often drastically reduce the runtime and number of backtracks
(by up to four orders of magnitude), especially on the shared unsatisfiable instances.
However, runtimes are often increased (by up to one order of magnitude) on the shared
satisfiable instances. String properties are only rarely defeated by the cardinality DFA
on any of the three performance measures, but their combination is often the overall
winner, though rarely by a large margin. A more fine-grained evaluation is necessary
to understand when to use which string properties without increasing runtime on the
satisfiable instances. The good performance of our methods on unsatisfiable instances
is indicative of gains when exploring the whole search space, such as when solving an
optimisation problem or using soft (preference) constraints.
With constraint programming, NSPLib instances (without the soft preference con-
straints) were also used in [4,5], but under row constraints that are different from those
of the NSPLib case files that we used. NSP instances from a different repository were
used in [11], though with soft global constraints: one of the insights reported there was
the need for more interaction between the global constraints, and our paper shows steps
that can be taken in that direction.
Since both our methods essentially generate linear constraints, they may also be
relevant in the context of linear programming. Future work may also consider the inte-
gration of our techniques with the multicost-regular constraint [10], which allows the
direct handling of a gcc constraint in the presence of automaton constraints (as on the
rows of NSPLib instances) without explicitly computing the product automaton, which
can be very big.
References
1. Beeri, C., Fagin, R., Maier, D., Yannakakis, M.: On the desirability of acyclic database
schemes. Journal of the ACM 30, 479–513 (1983)
2. Beldiceanu, N., Carlsson, M., Petit, T.: Deriving filtering algorithms from constraint check-
ers. In: Wallace, M. (ed.) CP 2004. LNCS, vol. 3258, pp. 107–122. Springer, Heidelberg
(2004)
3. Beldiceanu, N., Carlsson, M., Rampon, J.-X.: Global constraint catalog. Technical Report
T2005-08, Swedish Institute of Computer Science (2005), The current working version is at:
www.emn.fr/x-info/sdemasse/gccat/doc/catalog.pdf
4. Bessière, C., Hebrard, E., Hnich, B., Kiziltan, Z., Walsh, T.: SLIDE: A useful special case of
the CARDPATH constraint. In: ECAI 2008, pp. 475–479. IOS Press, Amsterdam (2008)
5. Brand, S., Narodytska, N., Quimper, C.-G., Stuckey, P.J., Walsh, T.: Encodings of the se-
quence constraint. In: Bessière, C. (ed.) CP 2007. LNCS, vol. 4741, pp. 210–224. Springer,
Heidelberg (2007)
6. Côté, M.-C., Gendron, B., Rousseau, L.-M.: Modeling the regular constraint with integer
programming. In: Van Hentenryck, P., Wolsey, L.A. (eds.) CPAIOR 2007. LNCS, vol. 4510,
pp. 29–43. Springer, Heidelberg (2007)
7. Flener, P., Frisch, A.M., Hnich, B., Kiziltan, Z., Miguel, I., Pearson, J., Walsh, T.: Breaking
row and column symmetries in matrix models. In: Van Hentenryck, P. (ed.) CP 2002. LNCS,
vol. 2470, pp. 462–476. Springer, Heidelberg (2002)
8. Frisch, A.M., Jefferson, C., Miguel, I.: Constraints for breaking more row and column sym-
metries. In: Rossi, F. (ed.) CP 2003. LNCS, vol. 2833, pp. 318–332. Springer, Heidelberg
(2003)
9. Jukna, S.: Extremal Combinatorics. Springer, Heidelberg (2001)
10. Menana, J., Demassey, S.: Sequencing and counting with the multicost-regular constraint.
In: van Hoeve, W.-J., Hooker, J.N. (eds.) CPAIOR 2009. LNCS, vol. 5547, pp. 178–192.
Springer, Heidelberg (2009)
11. Métivier, J.-P., Boizumault, P., Loudni, S.: Solving nurse rostering problems using soft global
constraints. In: Gent, I.P. (ed.) CP 2009. LNCS, vol. 5732, pp. 73–87. Springer, Heidelberg
(2009)
12. Pesant, G.: A regular language membership constraint for finite sequences of variables. In:
Wallace, M. (ed.) CP 2004. LNCS, vol. 3258, pp. 482–495. Springer, Heidelberg (2004)
13. Régin, J.-C., Gomes, C.: The cardinality matrix constraint. In: Wallace, M. (ed.) CP 2004.
LNCS, vol. 3258, pp. 572–587. Springer, Heidelberg (2004)
14. Vanhoucke, M., Maenhout, B.: On the characterization and generation of nurse schedul-
ing problem instances. European Journal of Operational Research 196(2), 457–467 (2009);
NSPLib is at: www.projectmanagement.ugent.be/nsp.php
The Increasing Nvalue Constraint
1 Introduction
The Nvalue constraint was introduced by Pachet et al. in [10] to express a re-
striction on the number of distinct values assigned to a set of variables. Even if finding out whether an Nvalue constraint has a solution or not is NP-hard [6], a number of filtering algorithms have been developed over the last years [4,3]. Motivated by symmetry breaking, this paper considers the conjunction of an Nvalue constraint with a chain of non-strict inequality constraints, which we call Increasing Nvalue. We come up with a filtering algorithm that achieves general
arc-consistency (GAC) for Increasing Nvalue in O(ΣDi ) time, where ΣDi is the
sum of domain sizes. This algorithm is more efficient than those obtained by
using generic approaches such as encoding Increasing Nvalue as a finite deter-
ministic automaton [12] or as a Slide constraint [5], which respectively require
O(n(∪Di )3 ) and O(nd4 ) time complexities for achieving GAC, where n denotes
the number of variables, ∪Di is the total number of potential values in the do-
mains, and d the maximum size of a domain. Part of its efficiency relies on
a specific data structure, i.e. a matrix of ordered sparse arrays, which allows
multiple ordered queries (i.e., Set and Get) to the columns of a sparse matrix.
Experiments proposed in this paper are based on a real-life resource allocation
problem related to the management of clusters. Entropy is a Virtual Machine
(VM) manager for clusters [8], which provides an autonomous and flexible engine
to manipulate the state and the position of VMs on the different working nodes
composing the cluster. The constraint programming part affects the VMs (the
2 Preliminaries
Given a sequence of variables X, the domain D(x) of a variable x ∈ X is the
finite set of integer values that can be assigned to variable x. D is the union of
all domains in X. We use the notations min(x) for the minimum value of D(x)
and max(x) for the maximum value of D(x). The sum of domain sizes over D is ΣDi = Σ_{xi∈X} |D(xi)|. A[X] denotes an assignment of values to variables in X.
Given x ∈ X, A[x] is the value of x in A[X]. A[X] is valid iff ∀xi ∈ X, A[xi ] ∈
D(xi ). An instantiation I[X] is a valid assignment of X. Given x ∈ X, I[x] is
the value of x in I[X]. A constraint C(X), specifies the allowed combinations
of values for a set of variables X. It defines a subset RC (D) of the cartesian
product of the domains Πxi ∈X D(xi ). A feasible instantiation of C(X) is an
instantiation which is in RC (D). If I[X] is a feasible instantiation of C(X) then
I[X] satisfies C(X). W.l.o.g., we consider that X contains at least two variables.
Given X = [x0 , x1 , . . . , xn−1 ] and i, j two integers such that 0 ≤ i < j ≤ n − 1,
I[xi , . . . , xj ] is the projection of I[X] on the sequence [xi , . . . , xj ].
Definition 1. The constraint Increasing Nvalue(N, X) is defined by a variable
N and a sequence of n variables X = [x0 , x1 , . . . , xn−1 ]. Given an instantiation
of [N, x0 , x1 , . . . , xn−1 ], Increasing Nvalue(N, X) is satisfied iff:
1. N is equal to the number of distinct values assigned to the variables in X.
2. ∀i ∈ [0, n − 2], xi ≤ xi+1 .
Proof. From Definition 1, if I[X] satisfies the constraint then ∀i ∈ [0, n − 2],
I[xi ] ≤ I[xi+1 ]. By transitivity of ≤, the Lemma holds.
Proof. I[X] is well-ordered then, for any i and j s.t. 0 ≤ i < j ≤ n − 1, we have
I[xi ] ≤ I[xj ]. Consequently, if xi and xj belong to two distinct stretches and
i < j then I[xi ] < I[xj ].
It is possible to evaluate for each value v in each domain D(xi ) the exact min-
imum and maximum number of stretches of well-ordered suffix instantiations I[xi , . . . , xn−1 ] such that I[xi ] = v, and similarly for prefix instantiations. This
evaluation is performed w.r.t. the domains of variables xj such that j > i.
Notation 1. Let X = [x0 , x1 , . . . , xn−1 ] be a sequence of variables and let v be
a value of D. The exact minimum number of stretches among all well-ordered
instantiations I[xi , . . . , xn−1 ] such that I[xi ] = v is denoted by s(xi , v). By con-
vention, if v ∈
/ D(xi ) then s(xi , v) = +∞. Similarly, the exact minimum number
of stretches among all well-ordered instantiations I[x0 , . . . , xi ] such that I[xi ] = v
is denoted by p(xi , v). By convention, if v ∈ / D(xi ) then p(xi , v) = +∞.
1. If i = n − 1: s(xi , v) = 1,
2. If i < n − 1: s(xi , v) = min( s(xi+1 , v), minw>v (s(xi+1 , w)) + 1 ).
Proof. By induction. When |X| = 1 there is one stretch. Thus, if i = n−1, for any
v ∈ D(xi ), we have s(xi , v) = 1. Consider now, a variable xi , i < n−1, and a value
v ∈ D(xi ). Instantiations s.t. I[xi+1 ] < v cannot be augmented with value v for xi
to form a well-ordered instantiation I[xi , . . . , xn−1 ]. Thus, let I[xi+1 , . . . , xn−1 ]
be an instantiation s.t. I[xi+1 ] ≥ v, which minimizes the number of stretches
in [xi+1 , . . . , xn−1 ]. Either I[xi+1 ] = v and s(xi , v) = s(xi+1 , v) since the first
stretch of I[xi+1 , . . . , xn−1 ] is extended when augmenting I[xi+1 , . . . , xn−1 ] with
value v for xi , or I[xi+1 ] = v and s(xi , v) = s(xi+1 , I[xi+1 ]) + 1 since value v
creates a new stretch. By construction, instantiations of [xi+1 , . . . , xn−1 ] that do
not minimize the number of stretches cannot lead to a value s(xi , v) strictly less
than min(s(xi+1 , w), w > v) + 1, even if I[xi+1 ] = v.
1. If i = n − 1: s̄(xi , v) = 1,
2. If i < n − 1: s̄(xi , v) = max( s̄(xi+1 , v), max_{w>v}(s̄(xi+1 , w)) + 1 ),

where s̄(xi , v) denotes the exact maximum number of stretches among all well-ordered instantiations I[xi , . . . , xn−1 ] such that I[xi ] = v (both recurrences are evaluated in the sketch below).
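A direct, non-optimised rendering of the two recurrences, evaluated by a sweep from x_{n−1} down to x_0, is sketched below; encoding absent values by ±∞ is an assumption mirroring the convention of Notation 1, and the paper's algorithm reaches the O(ΣDi) bound with the sparse-matrix machinery described later.

```python
import math

def stretch_counts(domains):
    """domains[i] is the sorted list of values of D(x_i); returns the matrices s and s-bar."""
    n = len(domains)
    s_min = [dict() for _ in range(n)]   # s(x_i, v)
    s_max = [dict() for _ in range(n)]   # s-bar(x_i, v)
    for v in domains[n - 1]:
        s_min[n - 1][v] = s_max[n - 1][v] = 1
    for i in range(n - 2, -1, -1):
        for v in domains[i]:
            same_min = s_min[i + 1].get(v, math.inf)     # v absent from D(x_{i+1}) -> +inf
            same_max = s_max[i + 1].get(v, -math.inf)
            greater_min = min((s_min[i + 1][w] for w in domains[i + 1] if w > v),
                              default=math.inf)
            greater_max = max((s_max[i + 1][w] for w in domains[i + 1] if w > v),
                              default=-math.inf)
            s_min[i][v] = min(same_min, greater_min + 1)
            s_max[i][v] = max(same_max, greater_max + 1)
    return s_min, s_max
```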
This section enumerates the properties that link the natural ordering of values in
a domain D(xi ) with the minimum and maximum number of stretches that can
be obtained in the sub-sequence xi , xi+1 , . . . , xn−1 . We consider only well-ordered
values, which may be part of a feasible instantiation of Increasing Nvalue.
Properties on a Single Value. The next three properties are directly deduced,
by construction, from Lemmas 3 and 4.
Property 1. Any value v ∈ D(xi ) well-ordered w.r.t. xi is such that s(xi , v) ≤ s̄(xi , v).
Ordering on Values. The two following properties establish the links between
the natural ordering of values in D(xi ) and the minimum and maximum number
of stretches in the sub-sequence starting from xi .
Property 4. Let X = [x0 , x1 , . . . , xn−1 ] be a sequence of variables and let i ∈
[0, n−1] be an integer. ∀v, w ∈ D(xi ) two well-ordered values, v ≤ w ⇒ s(xi , v) ≤
s(xi , w) + 1.
Proof. The first line of Algorithm 1 ensures that either [s(X), s̄(X)] ∩ D(N ) ≠ ∅ and k belongs to [s(X), s̄(X)] ∩ D(N ), or there is no solution (from Propositions 1 and 2). At each new iteration of the for loop, by Lemmas 3 and 4
and Proposition 2, either the condition (line 6) is satisfied and a new stretch
begins at i + 1 with a greater value (which guarantees that I[{x1 , . . . , xi+1 }]
is well-ordered) and k is decreased by 1, or it is possible to keep the current
value v for I[xi+1 ]. Therefore, at the start of a for loop (line 4), ∃v ∈ D(xi )
s.t. k ∈ [s(xi , v), s̄(xi , v)]. When i = n − 1, by construction k = 1 and ∀vn−1 ∈ D(xn−1 ), s(xn−1 , vn−1 ) = s̄(xn−1 , vn−1 ) = 1; I[X] is well-ordered and
contains k stretches. From Lemma 2, instantiation I[{N } ∪ X] with I[N ] = k is
a solution of Increasing Nvalue(N, X) with k distinct values in X.
4.2 Algorithms
1. Within our algorithm we always iterate over p(xi , v), p̄(xi , v), s(xi , v) and s̄(xi , v) by scanning the values of D(xi ) in increasing or decreasing order.
2. For a value v that does not belong to D(xi ), 0 (resp. n) is the default value for p(xi , v) and s(xi , v) (resp. p̄(xi , v) and s̄(xi , v)).
For this purpose we create a data structure for handling such sparse matrices
for which write and read accesses are always done by iterating in increasing
or decreasing order through the rows in a given column. The upper part of
the next table describes the three primitives on ordered sparse matrices as well as
their time complexity. The lower part gives the primitives used for accessing or
modifying the domain of a variable. Primitives which restrict the domain of a
variable x return true if D(x) ≠ ∅ after the operation, false otherwise.
Primitives for accessing the ordered sparse matrices:
– ScanInit(mats, i, dir): indicates that we will iterate through the i-th column of the matrices in mats in increasing order (dir = ↑) or decreasing order (dir = ↓). Complexity: O(1).
– Set(mat, i, j, info): performs the assignment mat[i, j] := info. Complexity: O(1).
– Get(mat, i, j): int: returns the content of entry mat[i, j], or the default value if such an entry does not belong to the sparse matrix. Complexity: amortized O(1) (a set of q consecutive calls to Get on the same column i and in increasing or decreasing row indexes is in O(q)).

Primitives for accessing the variables:
– adjust min(x, v): boolean: adjusts the minimum of variable x to value v. Complexity: O(1).
– adjust max(x, v): boolean: adjusts the maximum of variable x to value v. Complexity: O(1).
– remove val(x, v): boolean: removes value v from domain D(x). Complexity: O(1).
– instantiate(x, v): boolean: fixes variable x to value v. Complexity: O(1).
– get prev(x, v): int: returns the largest value w in D(x) such that w < v if it exists, returns v otherwise. Complexity: O(1).
– get next(x, v): int: returns the smallest value w in D(x) such that w > v if it exists, returns v otherwise. Complexity: O(1).
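A possible Python rendering of these matrix primitives is sketched below, assuming the stated access discipline: the entries of a column are written in increasing row order and, after ScanInit, the reads on that column proceed monotonically, so a moving cursor gives the amortized O(1) Get. It is an illustration, not the authors' implementation.

```python
class OrderedSparseMatrix:
    def __init__(self, default):
        self.default = default
        self.columns = {}     # column index -> list of (row, value) in increasing row order
        self.scan_col = None
        self.increasing = True
        self.cursor = 0

    def set(self, i, j, info):
        # Set(mat, i, j, info): rows of a column are assumed to arrive in increasing order
        self.columns.setdefault(i, []).append((j, info))

    def scan_init(self, i, increasing=True):
        # ScanInit(mats, i, dir): prepare to read column i in monotone row order
        self.scan_col, self.increasing = i, increasing
        col = self.columns.get(i, [])
        self.cursor = 0 if increasing else len(col) - 1

    def get(self, i, j):
        # Get(mat, i, j): slide the cursor towards row j; default value if the entry is absent
        col = self.columns.get(i, [])
        c = self.cursor
        if self.increasing:
            while c < len(col) and col[c][0] < j:
                c += 1
            self.cursor = c
            if c < len(col) and col[c][0] == j:
                return col[c][1]
        else:
            while c >= 0 and col[c][0] > j:
                c -= 1
            self.cursor = c
            if c >= 0 and col[c][0] == j:
                return col[c][1]
        return self.default
```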
to the minimum and maximum number of stretches on the prefix and suffix matrices p, p̄, s, s̄. Finally, based on this information, it adjusts the bounds of N and does the necessary pruning on each variable x0 , x1 , . . . , xn−1 . Using Lemmas 3 and 4, Algorithm 2 builds the suffix matrices s and s̄ used in Algorithm 3 (p and p̄ are constructed in a similar way):
of the for loop we initialise the ith columns of s and s̄. Observe that we scan columns i + 1 of matrices mins and maxs in decreasing row indices.
Consequently, Algorithm 2 takes O(ΣDi ) time and Algorithm 3 prunes all the
values that are not arc-consistent in Increasing Nvalue in O(ΣDi ).1
This section provides a set of experiments for the Increasing Nvalue constraint.
First, Section 5.1 presents a constraint programming reformulation of a Nvalue
constraint into a Increasing Nvalue constraint to deal with symmetry breaking.
Next, Section 5.2 evaluates the Increasing Nvalue constraint on a real-life application based on constraint programming technology. In the following, all experiments were performed with the Choco constraint programming system [1], on an Intel Core 2 Duo 2.4GHz with 4GB of RAM, and 128 MB allocated to the Java Virtual
Machine.
classes, in addition to the Nvalue. Thus, given a set E(X) of equivalence classes
among the variables in X, the pruning of the global constraint Nvalue(X, N )
can be strengthened in the following way:
Nvalue(N, X)    (1)
∀E ∈ E(X), Increasing Nvalue(NE , E)    (2)
max_{E∈E(X)} NE ≤ N ≤ Σ_{E∈E(X)} NE    (3)
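As a tiny illustration of the pruning that (3) provides, suppose that an interval [lbE, ubE] on NE is already known for every equivalence class E (hypothetical input data):

```python
def bounds_on_N(class_intervals):
    """class_intervals: list of (lb_E, ub_E) pairs, one per equivalence class of E(X)."""
    lower = max(lb for lb, _ in class_intervals)   # N >= max_E N_E
    upper = sum(ub for _, ub in class_intervals)   # N <= sum_E N_E
    return lower, upper
```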
the propagation gain (in terms of nodes and failures) is not significant, while the solving time overhead could be important.

Unsurprisingly, Table 2 shows that the number of holes in the variable domains impacts the performance of the Increasing Nvalue constraint model. However, we notice that when the number of holes in the domains increases, the number of solved instances decreases. Such a phenomenon is directly related to the fact that the propagation of Nvalue is less efficient when there are holes in the variable domains.
6 Related Work
GAC for the Increasing Nvalue constraint can be also obtained by at least two
different generic techniques, namely by using a finite deterministic automaton
with a polynomial number of transitions or by using the Slide constraint.
Given a constraint C of arity k and a sequence X of n variables, the
Slide(C,X) constraint [5] is a special case of the cardpath constraint. The
slide constraint holds iff C(Xi , Xi+1 , . . . , Xi+k−1 ) holds for all i ∈ [1, n − k + 1]. The main result is that GAC can be enforced in O(nd^k) time where d is the maximum domain size. An extension called slide_j(C, X) holds iff C(Xij+1 , Xij+2 , . . . , Xij+k ) holds for all i ∈ [0, (n − k)/j]. Given X = {xi | i ∈ [1; n]}, the Increasing Nvalue constraint can be encoded as Slide_2(C, [xi , ci ]i∈[1;n] ) where (a) c1 , c2 , . . . , cn are variables taking their value within [1, n] with c1 = 1 and cn = N , and (b) C(xi , ci , xi+1 , ci+1 ) is the constraint (b ⇔ xi ≠ xi+1 ) ∧ (ci+1 = ci + b) ∧ (xi ≤ xi+1 ). This leads to a time bound of O(nd^4) for achieving GAC on the Increasing Nvalue constraint.
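A small checker of this decomposition (an illustration of its semantics only, not of the Slide filtering algorithm):

```python
def increasing_nvalue_by_slide(xs, cs, N):
    """Check c_1 = 1, c_n = N, and the local constraint C on every window."""
    n = len(xs)
    if cs[0] != 1 or cs[n - 1] != N:
        return False
    for i in range(n - 1):
        b = 1 if xs[i] != xs[i + 1] else 0       # b <=> (x_i != x_{i+1})
        if not (xs[i] <= xs[i + 1] and cs[i + 1] == cs[i] + b):
            return False
    return True
```

For instance, increasing_nvalue_by_slide([2, 2, 5, 7], [1, 1, 2, 3], 3) returns True, since the chain is non-decreasing and uses three distinct values.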
The reformulation based on a finite deterministic automaton is detailed in the global constraint catalog [2]. If we use Pesant's algorithm [12], this reformulation leads to a worst-case time complexity of O(n(∪Di)^3) for achieving GAC, where ∪Di denotes the total number of potential values in the variable domains.
7 Conclusion
References
1. Choco: An open source Java CP library, documentation manual (2009),
http://choco.emn.fr/
2. Beldiceanu, N., Carlsson, M., Rampon, J.-X.: Global constraint catalog, working
version of January 2010. Technical Report T2005-08, Swedish Institute of Com-
puter Science (2005), www.emn.fr/x-info/sdemasse/gccat
3. Beldiceanu, N., Carlsson, M., Thiel, S.: Cost-Filtering Algorithms for the two Sides
of the sum of weights of distinct values Constraint. Technical report, Swedish In-
stitute of Computer Science (2002)
4. Bessière, C., Hebrard, E., Hnich, B., Kiziltan, Z., Walsh, T.: Filtering Algorithms
for the nvalue Constraint. In: Barták, R., Milano, M. (eds.) CPAIOR 2005. LNCS,
vol. 3524, pp. 79–93. Springer, Heidelberg (2005)
5. Bessière, C., Hebrard, E., Hnich, B., Kiziltan, Z., Walsh, T.: SLIDE: A useful
special case of the CARDPATH constraint. In: ECAI 2008, Proceedings, pp. 475–
479 (2008)
6. Bessière, C., Hebrard, E., Hnich, B., Walsh, T.: The Complexity of Global Con-
straints. In: 19th National Conference on Artificial Intelligence (AAAI 2004), pp.
112–117. AAAI Press, Menlo Park (2004)
7. Demassey, S., Pesant, G., Rousseau, L.-M.: A cost-regular based hybrid column
generation approach. Constraints 11(4), 315–333 (2006)
8. Hermenier, F., Lorca, X., Menaud, J.-M., Muller, G., Lawall, J.: Entropy: a con-
solidation manager for clusters. In: VEE 2009: Proceedings of the 2009 ACM SIG-
PLAN/SIGOPS International Conference on Virtual Execution Environments, pp.
41–50 (2009)
9. Katsirelos, G., Narodytska, N., Walsh, T.: Combining Symmetry Breaking and
Global Constraints. In: Oddi, A., Fages, F., Rossi, F. (eds.) CSCLP 2008. LNCS,
vol. 5655, pp. 84–98. Springer, Heidelberg (2009)
10. Pachet, F., Roy, P.: Automatic Generation of Music Programs. In: Jaffar, J. (ed.)
CP 1999. LNCS, vol. 1713, pp. 331–345. Springer, Heidelberg (1999)
11. Pesant, G.: A filtering algorithm for the stretch constraint. In: Walsh, T. (ed.) CP
2001. LNCS, vol. 2239, pp. 183–195. Springer, Heidelberg (2001)
12. Pesant, G.: A Regular Language Membership Constraint for Finite Sequences of
Variables. In: Wallace, M. (ed.) CP 2004. LNCS, vol. 3258, pp. 482–495. Springer,
Heidelberg (2004)
13. Ågren, M., Beldiceanu, N., Carlsson, M., Sbihi, M., Truchet, C., Zampelli, S.:
Six Ways of Integrating Symmetries within Non-Overlapping Constraints. In: van
Hoeve, W.-J., Hooker, J.N. (eds.) CPAIOR 2009. LNCS, vol. 5547, pp. 11–25.
Springer, Heidelberg (2009)
Improving the Held and Karp Approach with
Constraint Programming
1 Introduction
Held and Karp have proposed, in the early 1970s, a relaxation for the Travel-
ing Salesman Problem (TSP) as well as a branch-and-bound procedure that can
solve small to modest-size instances to optimality [4, 5]. It has been shown that
the Held-Karp relaxation produces very tight bounds in practice, and this relax-
ation is therefore applied in TSP solvers such as Concorde [1]. In this short paper
we show that the Held-Karp approach can benefit from well-known techniques
in Constraint Programming (CP) such as domain filtering and constraint prop-
agation. Namely, we show that filtering algorithms developed for the weighted
spanning tree constraint [3, 8] can be adapted to the context of the Held and Karp
procedure. In addition to the adaptation of existing algorithms, we introduce a
special-purpose filtering algorithm based on the underlying mechanisms used
in Prim’s algorithm [7]. Finally, we explored two different branching schemes
to close the integrality gap. Our initial experimental results indicate that the
addition of the CP techniques to the Held-Karp method can be very effective.
The paper is organized as follows: section 2 describes the Held-Karp approach
while section 3 gives some insights on the Constraint Programming techniques
and branching scheme used. In section 4 we demonstrate, through preliminary
experiments, the impact of using CP in combination with Held and Karp based
branch-and-bound on small to modest-size instances from the TSPlib.
2 The Held-Karp Approach
A 1-tree of G consists of a spanning tree on G \ {1} together with two edges incident to
vertex 1. The degree of a vertex is the number of edges in the 1-tree incident to that
vertex, and we denote it by deg(i) for i ∈ V . To see that the 1-tree is a relaxation for
the TSP, observe that every tour in the graph is a 1-tree, and if a minimum-weight 1-tree
is a tour, it is an (optimal) solution to the TSP. Note that the 1-tree is a tour if and only
if the degree of every vertex is two.
The iterative approach proposed by [4, 5] uses Lagrangian relaxation to
produce a sequence of connected graphs which increasingly resemble tours. We
start by computing an initial minimum-weight 1-tree, by computing a minimum-
spanning tree on G \ {1}, and adding the two edges with lowest cost incident
to vertex 1. If the optimal 1-tree is a tour, we have found an optimal tour.
Otherwise, the degree constraint on some of the vertices must be violated, i.e.,
it is not equal to two. In that case, we proceed by penalizing the degree of such
vertices for being different from two by perturbing the edge costs of the graph, as
follows. For each vertex i ∈ V , a ‘node potential’ πi is introduced. Then, for each
edge (i, j) ∈ E, the edge weight c̃ij is defined as c̃ij = cij + πi + πj . [4] shows that
the optimal TSP tour is invariant under these changes, but the optimal 1-tree
is not. One choice for the node potentials is to define πi = (2 − deg(i)) · C, for
a fixed constant C. The Held-Karp procedure re-iterates by solving the 1-tree
problem and perturbing the edge costs until it reaches a fixed point or meets a
stopping criterion. The best lower bound, i.e., the maximum among all choices
of the node potentials, is known as the Held-Karp bound and will be denoted
by HK.
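As a concrete illustration of the procedure just described, the Python sketch below computes minimum 1-trees with a dense-graph Prim step and applies the fixed-step potential update πi ← πi + (2 − deg(i)) · C. It is not the authors' implementation; the symmetric cost callback, the step constant, and the iteration limit are assumptions made for the example.

    def minimum_one_tree(vertices, cost):
        # Minimum-weight 1-tree: an MST on V \ {1} plus the two cheapest
        # edges incident to vertex 1.  `cost(i, j)` is a symmetric edge cost.
        rest = [v for v in vertices if v != 1]
        in_tree, edges = {rest[0]}, []
        while len(in_tree) < len(rest):            # Prim's algorithm on V \ {1}
            i, j = min(((a, b) for a in in_tree for b in rest if b not in in_tree),
                       key=lambda e: cost(*e))
            edges.append((i, j))
            in_tree.add(j)
        edges += sorted(((1, v) for v in rest), key=lambda e: cost(*e))[:2]
        return edges

    def held_karp_bound(vertices, cost, step=1.0, iterations=100):
        # Iterative Held-Karp scheme with the fixed-step potential update
        # pi_i += (2 - deg(i)) * step; returns the best lower bound found.
        pi = {v: 0.0 for v in vertices}
        best = float("-inf")
        for _ in range(iterations):
            def c_tilde(i, j, pi=dict(pi)):        # perturbed costs c~_ij = c_ij + pi_i + pi_j
                return cost(i, j) + pi[i] + pi[j]
            one_tree = minimum_one_tree(vertices, c_tilde)
            deg = {v: 0 for v in vertices}
            for i, j in one_tree:
                deg[i] += 1
                deg[j] += 1
            # lower bound: perturbed 1-tree weight minus twice the sum of potentials
            best = max(best, sum(c_tilde(i, j) for i, j in one_tree) - 2 * sum(pi.values()))
            if all(d == 2 for d in deg.values()):  # the 1-tree is a tour: stop
                break
            for v in vertices:                     # penalise degree-constraint violations
                pi[v] += (2 - deg[v]) * step
        return best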
The overall Held-Karp approach solves the TSP through branch-and-bound,
a technique that has been widely used on this problem (see [2] for a survey).
A good upper bound, UB, can be computed easily with any of the popular
heuristics that have been devised for this problem, e.g., [6].
The marginal cost of an edge e ∉ T is defined as ce = w(T (e)) − w(T ), that is,
the marginal increase of the weight of the minimum 1-tree if e is forced into the
1-tree.
The following algorithm can compute, in O(mn), the marginal costs of the edges
e ∉ T . Each non-tree edge e = (i, j) links two nodes i, j, and defines a unique i-j
path, say P e , in T . The marginal cost of (i, j) is then ce − max(ca | a ∈
P e ), that is, the cost of (i, j) minus the cost of the largest edge on the path from i to
j in the 1-tree T . Finding P e can be achieved through DFS in O(n) for each of the
O(m) edges not in T . If HK + ce > UB, then e can be safely removed from E.
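A minimal sketch of this filtering step is given below; it is an illustration under assumptions, not the authors' code. The tree part of the 1-tree is passed as an adjacency map, the unique tree path is traversed by DFS, and a non-tree edge is dropped as soon as HK plus its marginal cost exceeds the upper bound.

    def max_edge_on_tree_path(tree_adj, cost, i, j):
        # Walk the unique i-j path in the (acyclic) tree part of the 1-tree
        # and return the largest edge cost encountered on it.
        stack, seen = [(i, float("-inf"))], {i}
        while stack:
            node, path_max = stack.pop()
            if node == j:
                return path_max
            for nxt in tree_adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, max(path_max, cost(node, nxt))))
        raise ValueError("nodes are not connected in the tree")

    def filter_edges(edges, tree_adj, cost, hk_bound, upper_bound):
        # Keep an edge unless forcing it into the 1-tree provably exceeds UB.
        kept = []
        for i, j in edges:
            if j in tree_adj.get(i, ()):                  # tree edge: keep as is
                kept.append((i, j))
                continue
            marginal = cost(i, j) - max_edge_on_tree_path(tree_adj, cost, i, j)
            if hk_bound + marginal <= upper_bound:        # otherwise e can be removed
                kept.append((i, j))
        return kept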
Recall that Prim’s algorithm computes a minimum spanning tree of G (which
is easily transformed into a 1-tree) in the following manner. Starting from any
node i, it first partitions the graph into the disjoint subsets S = {i} and S̄ = V \ {i}
and creates an empty tree T . Then it iteratively adds to T the minimum-cost edge
(i, j) ∈ (S, S̄), defined as the set of edges with i ∈ S and j ∈ S̄, and moves j
from S̄ to S.
Since we are using MST computations as part of a Held-Karp relaxation of
the TSP, we know that there must be at least two edges in each possible cut (S, S̄)
of V (this property yields one of the well-known subtour elimination constraints of
the TSP). Therefore, whenever we encounter a cut (S, S̄) that contains only two
edges during the computation of the MST with Prim’s algorithm, we can force
these edges to be mandatory in T .
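The sketch below illustrates this observation (again under assumptions about the data layout, not the authors' implementation): whenever the cut examined by Prim's algorithm contains only two surviving edges, both are recorded as mandatory.

    def mandatory_edges_from_prim(vertices, adj, cost):
        # Prim's algorithm over the filtered graph; `adj[v]` is the set of
        # remaining neighbours of v and `cost(i, j)` a symmetric edge cost.
        start = next(iter(vertices))
        S, mandatory = {start}, set()
        while len(S) < len(vertices):
            cut = [(i, j) for i in S for j in adj[i] if j not in S]
            if len(cut) == 2:
                # every tour crosses each cut at least twice, so both edges
                # of a two-edge cut must belong to any remaining tour
                mandatory.update(frozenset(e) for e in cut)
            i, j = min(cut, key=lambda e: cost(*e))       # standard Prim step
            S.add(j)
        return mandatory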
Once the initial Held-Karp bound has been computed and the filtering has been
performed it is necessary to apply a branching procedure in order to identify the
optimal TSP solution. We have investigated two orthogonal branching schemes,
both based on the 1-tree associated to the best Held-Karp bound, say T . These
strategies consist in selecting, at each branch-and-bound node, one edge e and
splitting the search in two subproblems, one where e is forced in the solution and
one where it is forbidden. In the strategy out we pick e ∈ T and first branch on
the subproblem where it is forbidden, while in the strategy in we choose e ∉ T
and first try to force it into the solution.
Since there are O(n) edges in T and O(n2 ) edges not in T , the first strategy will
tend to create search trees which are narrower but also deeper than the second
one. However, since the quality of the HK bound improves rapidly as we go down the
search tree, it is generally possible to cut uninteresting branches before we get too
deep. Preliminary experiments, not reported here, have confirmed that strategy
out is generally more effective than strategy in.
4 Experimental Results
References
[1] Applegate, D.L., Bixby, R.E., Chvátal, V., Cook, W.J.: The Traveling Salesman
Problem: A Computational Study. Princeton University Press, Princeton (2006)
[2] Balas, E., Toth, P.: Branch and Bound Methods. In: Lawler, E.L., Lenstra, J.K.,
Rinnooy Kan, A.H.G., Shmoys, D.B. (eds.) The Traveling Salesman Problem: A
Guided Tour of Combinatorial Optimization, ch. 10. Wiley, Chichester (1985)
[3] Dooms, G., Katriel, I.: The “not-too-heavy spanning tree” constraint. In: Van
Hentenryck, P., Wolsey, L.A. (eds.) CPAIOR 2007. LNCS, vol. 4510, pp. 59–70.
Springer, Heidelberg (2007)
[4] Held, M., Karp, R.M.: The Traveling-Salesman Problem and Minimum Spanning
Trees. Operations Research 18, 1138–1162 (1970)
[5] Held, M., Karp, R.M.: The Traveling-Salesman Problem and Minimum Spanning
Trees: Part II. Mathematical Programming 1, 6–25 (1971)
[6] Helsgaun, K.: An Effective Implementation of the Lin-Kernighan Traveling Sales-
man Heuristic. European Journal of Operational Research 126(1), 106–130 (2000)
[7] Prim, R.C.: Shortest connection networks and some generalizations. Bell System
Tech. J. 36, 1389–1401 (1957)
[8] Régin, J.-C.: Simpler and Incremental Consistency Checking and Arc Consistency
Filtering Algorithms for the Weighted Spanning Tree Constraint. In: Perron, L.,
Trick, M.A. (eds.) CPAIOR 2008. LNCS, vol. 5015, pp. 233–247. Springer, Heidel-
berg (2008)
Characterization and Automation
of Matching-Based Neighborhoods
Thierry Benoist
Abstract. This paper shows that some matching-based neighborhoods can
be automatically designed by searching for stable sets in a graph. This move
generation algorithm is illustrated and investigated within the LocalSolver
framework.
1 Introduction
Autonomous search is a challenging trend in optimization solvers. It consists in
partially or totally automating the solving process. The ultimate goal is to reach a
model&run paradigm where the user merely models the problem to be solved and
relies on the solver to find solutions. Automation can apply to various components of
an optimization solver, as detailed in [8]. In this note we focus on the automatic
design of Very Large Scale Neighborhoods (VLSN) [1] for local search solvers.
Neighborhood search is one of the most effective approaches to solve combinatorial
optimization problems. It basically consists in moving from one solution to another by
applying local changes, tending to improve the objective function. In this context,
VLSN are sometimes useful to improve the convergence on difficult instances. In
particular, neighborhoods of exponential size explored in polynomial time are often
appreciated by researchers [4, 5, 9, 10]. For instance, in the car sequencing problem,
Estellon et al. noticed in [7] that selecting a set of K positions sufficiently distant
from one another allows these K cars to be optimally repositioned through the
resolution of a transportation problem.
We will show that some of these VLSN encountered in different contexts share
common bases and thus can be implemented in a unified way in an autonomous
solver. More precisely, we prove in section 2 that some matching-based neighborhoods
can be automatically designed by searching for stable sets in a graph; then we propose
in section 3 a convenient way to implicitly pre-compute billions of stable sets.
Throughout this paper we will use the Eternity II edge matching puzzle
(http://us.eternityii.com) for illustrating the introduced ideas and finally for
experimenting with the algorithms, within the LocalSolver1 framework. The goal of this
puzzle is to place 256 square tiles on a 16×16 board so as to maximize the number of
matching edges (namely, adjacent tiles have matching colors along their common edge).

1 The author is grateful to the LocalSolver team: Bertrand Estellon, Frédéric Gardi and Karim
Nouioua. LocalSolver is a free local search solver for combinatorial optimization problems,
based on a simple 0-1 formalism. The interested reader is referred to e-lab.bouygues.com for a
description of its functionalities and algorithms.
Property. For any sum c ∈ C, we denote by T (c) the set of nodes of the DAG reachable
from a variable of c without crossing a node of C. ∀ c, d ∈ C1 such that T (c) ∩ T (d)
contains only linear nodes, ∀ s a valid solution, ∀ u, v ∈ F such that u (resp. v) only
involves values of c (resp. d) and u(s) (resp. v(s)) satisfies all constraints of T (c)
(resp. T (d)): if the combined transformation u ◦ v preserves the sums in C, then it is
valid and its impact is additive: Δu◦v (s) = Δu (s) + Δv (s). We will say that c
and d are non-adjacent.
Proof. First, u ◦ v is well defined since c and d share no variable. Then, since u ◦ v
preserves the sums in C, the constraints on these nodes remain satisfied. Other constraints
are in T (c) ∪ T (d), whose intersection contains no constraints (only linear nodes), that
is to say that constraints in T (c) (resp. T (d)) are only impacted by variables of c (resp.
d). Our hypothesis ensures their satisfaction in this case, hence u ◦ v is valid.
Finally, since T (c) and T (d) share linear expressions only, we have Δu◦v (s) =
Δu (s) + Δv (s).
In terms of the edge-matching puzzle, two “one tile per cell” constraints have the
above property if and only if they refer to non-adjacent cells. Indeed, in this case, no
edge-matching detection expression involves variables of both constraints. For
instance, in the DAG below (a simple 2×2 board), the sums “cell1” and “cell3” are
non-adjacent: swapping the tiles in these cells preserves the “assignment sums” (C),
and the impact of the two transformations can thus be evaluated additively. When the
constraints in a subset S1 of C1 are pairwise non-adjacent, we say that S1 is a stable
set with respect to this definition of adjacency.
[Figure: DAG of a 2×2 board, with objective Maximize ∑ edgeOK, edge-matching nodes
edgeOK1,2 , edgeOK2,3 , edgeOK3,4 , edgeOK1,4 , tile sums tileA–tileD (∑ = 1) and cell
sums cell1–cell4 (∑ = 1).]
Besides, if all constraints in S1 are equalities to 1, then this additive property allows us to
define the following large neighborhood for S1 . We define a bipartite graph based on
the two sets C1 and C2 such that for each variable v of V involved in c1∈C1 and c2∈C2,
we define an edge from c1 to c2 if v = 1 and an edge from c2 to c1 if v = 0. For each
constraint c in C1, we assign weights to incoming edges (those associated to variables
equal to 0) as follows. For each variable v of c equal to 0, let g be the function setting
to 0 the variable currently instantiated to 1 in c and setting variable v to 1. Then the
weight of the edge representing variable v is Δg(s) = Ω(g(s)) - Ω(s), with s the current
solution. Outgoing edges receive weight 0. To any cycle in this bipartite graph we
associate a move flipping the values of all variables corresponding to edges of the
cycle. Such a move preserves the sums in C because each node involved in the cycle
has exactly one incoming and one outgoing edge. Besides the cost of this move is the
sum of the weights of all involved edges, because:
parent graph in O(m) (see [6] for other cycle detection strategies). For a given set S1,
the complexity is bounded by O(|S1|3). For edge-matching puzzles, this neighborhood
is the same as the one explored by Schaus & Deville in [11], namely the optimal
reassignment of a set of non adjacent cells.
A variant of this algorithm consists in setting small negative costs on the edges
associated with variables equal to 1, so as to favor the detection of a different assignment
with the same cost, if any (diversification move). In practice we first look for
an improving transformation and then for a diversification transformation in the
same graph.
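For intuition, the equivalent “optimal reassignment” view mentioned above can also be written directly as an assignment problem. The sketch below uses SciPy's Hungarian-style solver instead of the negative-cycle search on the bipartite graph described in the text; the `placement_cost` callback (the local objective contribution of putting a tile on a cell, to be minimised) and the data layout are assumptions.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def reassign_non_adjacent_cells(cells, tiles, placement_cost):
        # Because the K cells are pairwise non-adjacent, the cost of placing a
        # tile on a cell is independent of the other reassignments, so the best
        # move over this stable set reduces to a K x K assignment problem.
        cost = np.array([[placement_cost(t, c) for c in cells] for t in tiles])
        rows, cols = linear_sum_assignment(cost)      # minimises total cost
        return {cells[c]: tiles[r] for r, c in zip(rows, cols)}

For a maximisation objective such as the number of matching edges, the negated score can be used as the cost.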
The above neighborhood is based on a stable set S1 in C1 , but the complexity of
computing a stable set of size smaller than K with a greedy algorithm (our goal is not
to find a maximum stable set) is in the worst case O(Kd), where d is the maximum
degree of the graph. Indeed, each time a node is added to the stable set, its neighbor
nodes must be removed from the graph. In the absence of an upper bound on the degree,
this worst-case complexity is O(K|C1 |). Besides, such a greedy algorithm may build
very small stable sets in some cases, for instance if the first selected node is connected
to all others. For these reasons it is useful to precompute structures that allow
extracting stable sets of size K in O(K). The structure that we define for this purpose
is a collection of K disjoint subsets of C1 such that any pair of nodes appearing in two
different sets of this collection is non-adjacent. We call such a collection a stable
set generator. Building such generators can be achieved with a simple algorithm
starting with an empty collection and based on two procedures: Grow adds to the
growing collection a new non-adjacent singleton and Merge adds an adjacent node to
one of the sets of the collection. In order to maximize the number of different
stable sets that can be generated from this collection, we need to maximize the
product of their sizes. Hence our heuristic consists in applying the merge procedure on
each new singleton, and in favoring the grow procedure otherwise. Once the target
size K is reached, the merge procedure is applied until no node can be added.
Applying this randomized algorithm around 10 times at the beginning of the
search procedure, we implicitly generate a huge number of stable sets when such
stable sets exist.
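A simplified reading of this construction is sketched below. The tie-breaking used to maximise the product of sizes is replaced here by balanced merging, and `adjacent(u, v)` is an assumed adjacency test; it is an illustration, not the exact Grow/Merge heuristic.

    def build_stable_set_generator(nodes, adjacent, K):
        # Collection of up to K disjoint subsets such that any two nodes taken
        # from two *different* subsets are non-adjacent; picking one node per
        # subset therefore always yields a stable set, and the collection
        # implicitly encodes (product of subset sizes) many stable sets.
        collection = []
        for v in nodes:
            touching = [S for S in collection if any(adjacent(v, u) for u in S)]
            if len(touching) > 1:
                continue                              # v would link two subsets: drop it
            if len(touching) == 1:
                touching[0].add(v)                    # Merge into the only compatible subset
            elif len(collection) < K:
                collection.append({v})                # Grow a new singleton
            else:
                min(collection, key=len).add(v)       # Merge, keeping sizes balanced
        return collection

Extracting a stable set of size up to K from the resulting collection then takes O(K): pick one node from each subset.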
The above graphs represent local search descents for this edge-matching puzzle
with a neighborhood ranging from 2 tiles to 32 tiles. The left chart is extracted from
[11] while the right one was obtained with LocalSolver setting the maximum size of
the stable sets to 2, 4, 8, etc. The similarity of these curves confirms that, without prior
knowledge of the structure of the problem, we manage to explore the same
neighborhood, with the same efficiency. We tested this generic implementation of
matching-based neighborhoods on various problems [3]. For the largest instance of
each problem, the table below reports the size of the detected bipartite structure, the
number of stable sets implicitly generated, the total time of this analysis and the
number of moves per second during local search. In these experiments the size of
stable sets was limited to 8.
These results show that some very large-scale neighborhoods can be automatically
generated thanks to an analysis of the model. Similarly to the small neighborhoods
offered by default in LocalSolver, these moves preserve the feasibility of the solution
and are similar to what an OR researcher would implement.
References
1. Ahuja, R.K., Ergun, O., Orlin, J.B., Punnen, A.P.: A survey of very large-scale neighborhood
search techniques. Discrete Appl. Math. 123(1-3), 75 (2002)
2. Angel, E., Bampis, E., Pascual, F.: An exponential (matching based) neighborhood for the
vehicle routing problem. Journal of Combinatorial Optimization 15, 179–190 (2008)
3. Benoist, T.: Autonomous Local Search with Very Large-Scale Neighborhoods (manuscript
in preparation)
4. Bozejko, W., Wodecki, M.: A Fast Parallel Dynasearch Algorithm for Some Scheduling
Problems. Parallel Computing in Electrical Engineering, 275–280 (2006)
5. Cambazard, H., Horan, J., O’Mahony, E., O’Sullivan, B.: Fast and Scalable Domino
Portrait Generation. In: Perron, L., Trick, M.A. (eds.) CPAIOR 2008. LNCS, vol. 5015,
pp. 51–65. Springer, Heidelberg (2008)
6. Cherkassky, B.V., Goldberg, A.V.: Negative-Cycle Detection Algorithms. In: Díaz, J. (ed.)
ESA 1996. LNCS, vol. 1136, pp. 349–363. Springer, Heidelberg (1996)
7. Estellon, B., Gardi, F., Nouioua, K.: Large neighborhood improvements for solving car
sequencing problems. RAIRO Operations Research 40(4), 355–379 (2006)
8. Hamadi, Y., Monfroy, E., Saubion, F.: What is Autonomous Search? Microsoft research
report. MSR-TR-2008-80 (2008)
9. Hurink, J.: An exponential neighborhood for a one-machine batching problem. OR Spectrum
21(4), 461–476 (1999)
10. Mouthuy, S., Deville, Y., Van Hentenryck, P.: Toward a Generic Comet Implementation
of Very Large-Scale Neighborhoods. In: 22nd National Conference of the Belgian
Operations Research Society, Brussels, January 16-18 (2008)
11. Schaus, P., Deville, Y.: Hybridization of CP and VLNS for Eternity II. In: JFPC 2008
Quatrième Journées Francophones de Programmation par Contraintes, Nantes (2008)
12. Voudouris, C., Dorne, R., Lesaint, D., Liret, A.: iOpt: A Software Toolkit for Heuristic
Search Methods. In: Walsh, T. (ed.) CP 2001. LNCS, vol. 2239, pp. 716–719. Springer,
Heidelberg (2001)
Rapid Learning for Binary Programs
1 Introduction
Constraint programming (CP) and integer programming (IP) are two comple-
mentary ways of tackling discrete optimization problems. Hybrid combinations
of the two approaches have been used for more than a decade. Recently both
technologies have incorporated new nogood learning capabilities that derive ad-
ditional valid constraints from the analysis of infeasible subproblems extending
methods developed by the SAT community.
The idea of nogood learning, deriving additional valid conflict constraints from
the analysis of infeasible subproblems, has had a long history in the CP commu-
nity (see e.g. [1], chapter 6) although until recently it has had limited applicabil-
ity. More recently adding carefully engineered nogood learning to SAT solving [2]
has led to a massive increase in the size of problems SAT solvers can deal with.
The most successful SAT learning approaches use so-called first unique implication
point (1UIP) learning, which in some sense captures the nogood closest to
the failure that can infer new information.
Constraint programming systems have adapted the SAT style of nogood learn-
ing [3,4], using 1UIP learning and efficient SAT representation for nogoods, lead-
ing to massive improvements for certain highly combinatorial problems.
Nogood learning has been largely ignored in the IP community until very
recently (although see [5]). Achterberg [6] describes a fast heuristic to derive
small conflict constraints by constructing a dual ray with minimal nonzero ele-
ments. He shows that nogood learning for general mixed integer problems can
result in an average speedup of 10%. Kılınç Karzan et al. [7] suggest restarting
the IP solver and using a branching rule that selects variables which appear in
small conflict constraints for the second run. Achterberg and Berthold [8] pro-
pose a hybrid branching scheme for IP that incorporates conflict-based SAT and
impact-based CP style search heuristics as dynamic tie-breakers.

This research was partially funded by the DFG Research Center Matheon in Berlin.
NICTA is funded by the Australian Government as represented by the Department of
Broadband, Communications and the Digital Economy and the Australian Research
Council.
2 Rapid Learning
The power of nogood learning arises because often search algorithms implicitly
repeat the same search in a slightly different context in another part of the
search tree. Nogoods are able to recognize such situations and avoid redundant
work. As a consequence, the more search is performed by a solver and the earlier
nogoods are detected, the greater the chance for nogood learning to be beneficial.
Although the nogood learning methods of SAT, CP, and IP approaches are
effectively the same, one should note that because of differences in the amount of
work per node each solver undertakes there are different design tradeoffs in each
implementation. An IP solver will typically spend much more time processing
each node than either a SAT or CP solver. For that reason SAT and CP systems
with nogoods use 1UIP learning and frequent restarts to tackle problems while
this is not the case for IP. IP systems with nogoods typically only restart at the
root, and use learning methods which potentially generate several nogoods for
each infeasibility (see [6]).
The idea of Rapid Learning is based on the fact that a CP solver can typically
perform a partial search on a few hundred or thousand nodes in a fraction of
the time that an IP solver needs for processing the root node of the search tree.
Rapid Learning applies a fast CP branch-and-bound search for a few hundred
or thousand nodes, before we start the IP search, but after IP presolving and
cutting plane separation.
Each piece of information collected in this rapid CP search can be used to
guide the IP search or even deduce further reductions during root node process-
ing. Since the CP solver is solving the same problem as the IP solver
– each generated conflict constraint is valid for the IP search,
– each global bound change can be applied at the IP root node,
– each feasible solution can be added to the IP solver’s solution pool,
– the branching statistics can initialize a hybrid IP branching rule [8], and
– if the CP solver completely solves the problem, the IP solver can abort.
All five types of information may potentially help the IP solver. Rapid Learning
performs a limited CP search at the root node, after most of the IP presolving
is done to collect potential new information for the IP solver.
The basic idea of Rapid Learning is related to the concept of Large Neigh-
borhood Search heuristics in IP. But rather than doing a partial search on a
sub-problem using the same (IP search) algorithm, we perform a partial search
on the same problem using a much faster algorithm. Rapid Learning also differs
from typical IP heuristics in the sense that it can improve both primal and dual
bounds at the same time.
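The overall control flow can be summarised as in the sketch below. The solver objects and their methods are placeholders for illustration only; they do not correspond to the SCIP or CP interfaces used by the authors.

    def rapid_learning(ip_solver, cp_solver, node_limit, conflict_limit=1000):
        # Run a short CP branch-and-bound on the presolved problem and feed
        # everything it learns back into the IP root node (hypothetical API).
        result = cp_solver.solve(node_limit=node_limit, conflict_limit=conflict_limit)

        for nogood in result.conflict_constraints:       # valid for the IP search
            ip_solver.add_constraint(nogood)
        for var, lb, ub in result.global_bound_changes:  # applicable at the IP root
            ip_solver.tighten_bounds(var, lb, ub)
        for solution in result.feasible_solutions:       # primal side information
            ip_solver.add_to_solution_pool(solution)
        ip_solver.init_branching_statistics(result.branching_statistics)

        if result.solved_to_optimality:                  # the IP search can be skipped
            return result.best_solution
        return ip_solver.solve()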
3 Computational Results
Our computational study is based on the branch-cut-and-price framework SCIP
(Solving Constraint Integer Programs). This system incorporates the idea of
Constraint Integer Programming [9,10] and implements several state-of-the-art
techniques for IP solving, combined with solving techniques from CP and SAT,
including nogood learning. The Rapid Learning heuristic presented in this article
was implemented as a separator plugin.
For our experiments, we used SCIP 1.2.0.5 with CPLEX 12.10 as the underlying LP
solver, running on an Intel Core 2 Extreme X9650 CPU with 6 MB cache
and 8 GB RAM. We used default settings and a time limit of one hour for the
main SCIP instance which performs the IP search.
For solving the CP problem, we used a secondary SCIP instance with “empha-
sis cpsolver” (which among other things turns off LP solving) and “presolving
fast” settings (which turns off probing and pairwise comparison of constraints)
and the parameter “conflict/maxvarsfac” set to 0.05 (which only creates no-
goods using at most 5% of the variables of the problem). As node limit we used
max(500, min(niter , 5000)), with niter being the number of simplex iterations
used for solving the root LP in the main instance. We further aborted the CP
search as soon as 1000 conflicts were created, or no useful information was gained
after 20% of the node limit.
As test set we chose all 41 binary programs (BPs) of the Miplib 3.0 [11],
the Miplib 2003 [12] and the IP collection of Hans Mittelmann [13] which have
fewer than 10 000 variables and constraints after SCIP presolving. BPs, in which
all variables take values 0 or 1, are an important subclass of IPs and of finite
domain CPs. Note that for a BP, all conflict constraints are Boolean clauses, hence
linear constraints.
Table 1 compares the performance of SCIP with and without Rapid Learn-
ing applied at the root node (columns “SCIP” and “SCIP-RL”). Columns “RL”
provide detailed information on the performance of Rapid Learning. “Ngds” and
“Bds” present the number of applied nogoods and global bound changes, respec-
tively, whereas “S” indicates, whether a new incumbent solution was found. For
instances which could not be solved within the time limit, we present the lower
and upper bounds at termination.
Note first that Rapid Learning is indeed rapid: it rarely consumes more than
a small fraction of the overall time (except for mitre). We observe that for
many instances the application of Rapid Learning does not make a difference.
However, there are some, especially the acc problems, for which the performance
improves dramatically. There are also a few instances, such as qap10, for which
Rapid Learning deteriorates the performance. The solution time decreases by
12% in geometric mean, the number of branch-and-bound nodes by 13%. For
the four unsolved instances, we see that Rapid Learning leads to a better primal
bound in three cases. The dual bound is worse for the instance protfold. For
the instances acc-2 and nug08, Rapid Learning completely solved the problem.
Additional experiments indicate that the biggest impact of Rapid Learning
comes from nogoods and learning new bounds, but all the other sources of in-
formation are also beneficial to the IP search on average.
References
1. Dechter, R.: Constraint Processing. Morgan Kaufmann, San Francisco (2003)
2. Moskewicz, M., Madigan, C., Zhao, Y., Zhang, L., Malik, S.: Chaff: Engineering
an efficient SAT solver. In: Proceedings of DAC 2001, pp. 530–535 (2001)
3. Katsirelos, G., Bacchus, F.: Generalised nogoods in CSPs. In: Proceedings of AAAI
2005, pp. 390–396 (2005)
4. Ohrimenko, O., Stuckey, P., Codish, M.: Propagation via lazy clause generation.
Constraints 14(3), 357–391 (2009)
5. Davey, B., Boland, N., Stuckey, P.: Efficient intelligent backtracking using linear
programming. INFORMS Journal of Computing 14(4), 373–386 (2002)
6. Achterberg, T.: Conflict analysis in mixed integer programming. Discrete Opti-
mization 4(1), 4–20 (2007); Special issue: Mixed Integer Programming
7. Kılınç Karzan, F., Nemhauser, G.L., Savelsbergh, M.W.P.: Information-based
branching schemes for binary linear mixed-integer programs. Math. Progr. C 1(4),
249–293 (2009)
8. Achterberg, T., Berthold, T.: Hybrid branching. In: van Hoeve, W.-J., Hooker, J.N.
(eds.) CPAIOR 2009. LNCS, vol. 5547, pp. 309–311. Springer, Heidelberg (2009)
9. Achterberg, T.: Constraint Integer Programming. PhD thesis, TU Berlin (2007)
10. Achterberg, T., Berthold, T., Koch, T., Wolter, K.: Constraint integer program-
ming: A new approach to integrate CP and MIP. In: Perron, L., Trick, M.A. (eds.)
CPAIOR 2008. LNCS, vol. 5015, pp. 6–20. Springer, Heidelberg (2008)
11. Bixby, R.E., Ceria, S., McZeal, C.M., Savelsbergh, M.W.: An updated mixed inte-
ger programming library: MIPLIB 3.0. Optima (58), 12–15 (1998)
12. Achterberg, T., Koch, T., Martin, A.: MIPLIB 2003. Operations Research Let-
ters 34(4), 1–12 (2006)
13. Mittelmann, H.: Decision tree for optimization software: Benchmarks for optimiza-
tion software (2010), http://plato.asu.edu/bench.html
Hybrid Methods for the Multileaf Collimator
Sequencing Problem
1 Introduction
Radiation therapy represents one of the main treatments against cancer, with an esti-
mated 60% of cancer patients requiring radiation therapy as a component of their treat-
ment. The aim of radiation therapy is to deliver a precisely measured dose of radiation to
a well-defined tumour volume whilst sparing the surrounding normal tissue, achieving
an optimum therapeutic ratio. At the core of advanced radiotherapy treatments are hard
combinatorial optimisation problems. In this paper we focus on the multileaf collimator
sequencing in intensity-modulated radiotherapy (IMRT).
What is Intensity-Modulated Radiotherapy? IMRT is an advanced mode of high-
precision radiotherapy that utilises computer controlled x-ray accelerators to deliver
precise radiation doses to a malignant tumour. The treatment plan is carefully devel-
oped based on 3D computed tomography images of the patient, in conjunction with
computerised dose calculations to determine the dose intensity pattern that will best
conform to the tumour shape. There are three optimisation problems relevant to this
treatment. Firstly, the geometry problem considers the best positions for the beam head
from which to irradiate. Secondly, the intensity problem is concerned with computing
the exact levels of radiation to use in each area of the tumour. Thirdly, the realisation
problem, tackled in this paper, deals with the delivery of the intensities computed in the intensity problem.
This work was supported by Science Foundation Ireland under Grant Number 05/IN/I886.
Fig. 1. A simplified view of the optimisation problem associated with sequencing multileaf colli-
mators in IMRT, Figure 1(b) has been adapted from [3]
Contribution of this Paper. In our earlier work in this area we presented a novel ap-
proach to multileaf collimator sequencing using an approach based on shortest paths
[10]. It was shown that such a model significantly out-performed the state-of-the-art
and brought clinical-sized instances of the problem within the reach of constraint pro-
gramming (CP). We now show that the shortest path idea can be exploited to give greater
scalability by coupling the CP model with Lagrangian relaxation and column genera-
tion techniques. Our shortest-path approach to this problem uniquely provides a basis
for benefitting from these techniques. The results presented define the current state-of-
the-art for this challenging problem from the domain of cancer treatment planning.
The CP model presented in [10], is briefly introduced in Section 2. We show how to
strengthen the CP model with a Lagrangian relaxation in Section 3. An alternative for-
mulation in which the paths are represented explicitly, along with a column generation
(CG) model, is presented in Section 4. Section 5 demonstrates that these approaches
significantly out-perform the state-of-the-art for this problem.
must contain an integer partition of every intensity. We will represent integer parti-
tions with the following notation: P (a) is the set of partitions of integer a, p ∈ P (a)
is a particular partition of a, and |p| the number of integer summands in p. We de-
note by occ(v, p) the number of occurrences of value v in p. For example, P (5) =
{⟨5⟩, ⟨4, 1⟩, ⟨3, 2⟩, ⟨3, 1, 1⟩, ⟨2, 2, 1⟩, ⟨2, 1, 1, 1⟩, ⟨1, 1, 1, 1, 1⟩}, and if p = ⟨3, 1, 1⟩ then
|p| = 3 and occ(1, p) = 2. Observe that the DC problem can be formulated as a shortest
path problem in a weighted directed acyclic graph, G, which we refer to as a partition
graph. A partition graph G of a row matrix I = ⟨I1 , . . . , In ⟩ is a layered graph with
n+2 layers, the nodes of each layer j corresponding to the set of integer partitions of the
row matrix element Ij . The size of this graph is therefore exponential in the maximum
intensity. Source and sink nodes, located on layers 0 and n + 1 respectively, are associ-
ated with the empty partition ∅. Two adjacent layers form a complete bipartite graph and
the cost added to an edge, pu → pv , between two partitions, pu and pv of adjacent lay-
ers, represents the number of additional weights that need to be added to the decompo-
sition to satisfy the C1 property when decomposing the two consecutive elements with
the corresponding partitions. The cost of each edge pu → pv in the partition graph is
c(pu , pv ) = \sum_{b=1}^{M} c(b, pu , pv ), where c(b, pu , pv ) = max(occ(b, pv ) − occ(b, pu ), 0).
Figure 2 shows the partition graph of the row I = [3, 2, 3, 1].
Fig. 2. A partition graph showing transition weights for the single row I = [3, 2, 3, 1]
By following the path {{2, 1}, {1, 1}, {2, 1}, {1}}, we build a corresponding decomposition of the row.
The length of the path represents the cardinality of the decomposition and a shortest
path therefore provides a decomposition with minimum cardinality. The key idea is that
as one moves along a path in this graph, the partition chosen to decompose the element
at layer j contains the only weights that can be reused to decompose the element at layer
j + 1 because of the C1 property. Consider the previous example and the solution given.
A coefficient 2 is used by the first partition but not by the second and thus becomes
forbidden to decompose any other intensity values. The previous partition alone tells
us the available coefficients to decompose the present intensity value. This is why the
cardinality cost can be defined between consecutive partitions and the whole problem
mapped to a shortest path. We could also restrict the cost to a given weight b to obtain
the cardinality of this particular coefficient. We will use this idea in the CP model.
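The sketch below makes this construction concrete (an illustration, not the model used in the paper): it enumerates the partitions of each row element, weights the edges with c(pu, pv) = Σ_b max(occ(b, pv) − occ(b, pu), 0), and extracts a minimum-cardinality path by dynamic programming over the layers.

    from collections import Counter

    def partitions(n, max_part=None):
        # All integer partitions of n as tuples of summands in non-increasing order.
        if n == 0:
            return [()]
        max_part = n if max_part is None else max_part
        result = []
        for first in range(min(n, max_part), 0, -1):
            for rest in partitions(n - first, first):
                result.append((first,) + rest)
        return result

    def edge_cost(pu, pv):
        # Number of new weights that must be opened when moving from pu to pv.
        occ_u, occ_v = Counter(pu), Counter(pv)
        return sum(max(occ_v[b] - occ_u[b], 0) for b in occ_v)

    def min_cardinality_decomposition(row):
        # Shortest path through the layered partition graph of a single row.
        layers = [[()]] + [partitions(a) for a in row] + [[()]]
        dist, back = {(): 0}, [dict() for _ in layers]
        for depth in range(1, len(layers)):
            new_dist = {}
            for pv in layers[depth]:
                cost, pred = min((dist[pu] + edge_cost(pu, pv), pu)
                                 for pu in layers[depth - 1])
                new_dist[pv], back[depth][pv] = cost, pred
            dist = new_dist
        node, path = (), []
        for depth in range(len(layers) - 1, 1, -1):       # walk back from the sink
            node = back[depth][node]
            path.append(node)
        return dist[()], list(reversed(path))

    print(min_cardinality_decomposition([3, 2, 3, 1]))
    # prints the minimum cardinality and one optimal partition per element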
If the optimal value of the RCSPP, z ∗ , is less than or equal to B, then there is a solution
to constraints CP3 , CP4 , and CP5 , otherwise there is an inconsistency. The first two
constraints in the formulation are resource constraints and the last three are the flow
conservation constraints enforcing that the x variables define a path.
Lagrangian relaxation is a technique that moves the “complicating constraints” into
the objective function with a multiplier, λ ≥ 0, to penalise their violation. For a given
value of λ, the resulting problem is the Lagrangian subproblem and, in the context of
minimisation, provides a lower bound on the objective of the original problem. The
typical approach is to relax the resource constraints, so the Lagrangian function is:
f (x, λ) = \sum_j \sum_{u,v} c_3 (p_u , p_v ) × x^j_{uv} + λ_0 ( \sum_j \sum_{u,v} c_1 (p_u , p_v ) × x^j_{uv} − K )
           + \sum_{1≤b≤M} λ_b ( \sum_j \sum_{u,v} c_2 (b, p_u , p_v ) × x^j_{uv} − N_b )        (2)
The Lagrangian subproblem in this setting is, therefore, a shortest path problem w(λ) =
minx f (x, λ) and the Lagrangian dual is to find the set of multipliers that provide the
best possible lower bound by maximising w(λ) over λ. A central result in Lagrangian
relaxation is that w(λ) is a piecewise linear concave function, and various algorithms
can be used to optimise it efficiently.
Solving the Lagrangian Dual. We followed the approach from [23] and used a sub-
gradient method [8]. The algorithm iteratively solves w(λ) for different values of λ,
initialised to 0 at the first iteration. The values of λ are updated by following the direc-
tion of a supergradient of w at the current value λ for a given step length μ. The step
lengths have to be chosen to guarantee convergence (see [8]). We refer the reader to
[23] for more details. At each iteration t, we solved the shortest path problem with the
penalised costs on the edges:
c(p_u , p_v ) = c_3 (p_u , p_v ) + λ^t_0 c_1 (p_u , p_v ) + \sum_{1≤b≤M} λ^t_b c_2 (b, p_u , p_v ).
Then we perform a reversed traversal from the sink to the source to get the values of the
shortest path SDa from all nodes a to the sink. At the end of the iteration we mark all
the nodes (partitions) that are infeasible in the current Lagrangian subproblem, i.e., those with
SO_a + SD_a > B + λ^t_0 K + \sum_{1≤b≤M} λ^t_b N_b .
At the end of the process, all nodes marked during the iterations are pruned from the
domains. This is Lagrangian relaxation-based filtering [24]: if a value is proven in-
consistent in at least one Lagrangian subproblem, then it is inconsistent in the original
problem. The Lagrangian relaxation is incorporated into the constraint model as a global
constraint for each line. The independent path constraints are kept and propagated first,
whereas the propagation of the resource constrained path constraint is delayed since it
is an expensive constraint to propagate.
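A minimal sketch of the two ingredients of one iteration is given below; the multiplier names and edge-cost layout are assumptions, and the step-length schedule of [8] is not reproduced. It shows the penalised edge costs used in the shortest-path subproblem and the projected supergradient update of the multipliers.

    def penalised_cost(c1, c2, c3, lam):
        # Penalised edge cost of the Lagrangian subproblem:
        # c3 + lambda_0 * c1 + sum_{1<=b<=M} lambda_b * c2[b];
        # `c2` and `lam` are dictionaries indexed by the weight b (0 for lambda_0).
        return c3 + lam[0] * c1 + sum(lam[b] * c2.get(b, 0) for b in lam if b != 0)

    def update_multipliers(lam, violations, step):
        # Projected supergradient step: lambda_r <- max(0, lambda_r + step * g_r),
        # where g_r is (resource used on the current shortest path) minus its
        # capacity, e.g. sum_e c1(e) - K for the cardinality resource.
        return {r: max(0.0, lam[r] + step * violations[r]) for r in lam}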
Numerous linear models have been designed for this problem, see e.g. [14], but the
shortest path approach [10] opens the door for a totally new formulation of the prob-
lem to be considered. In [10] we designed a linear model representing every integer
partition. We now consider an alternative formulation that, rather than representing the
partition graph, explicitly encodes the set of possible paths in the partition graph of
each line. The resulting formulation is very large, but such models are typical in many
settings, e.g. vehicle routing problems. The optimisation of these models can be per-
formed using column generation [12]. The key idea is that the Simplex algorithm does
not need to have access to all variables (columns) to find a pivot point towards an im-
proving solution. The Simplex algorithm proceeds by iterating from one basic solution
to another while improving the value of the objective function. At each iteration, the
algorithm looks for a non-basic variable to enter the basis. This is the pricing problem.
Typically, for a linear minimisation problem written min \sum_i c_i x_i subject to
\sum_i a_{ij} x_i ≥ b_j for all j and x_i ≥ 0, the pricing problem is to find the i (a variable
or column) that minimises c_i − \sum_j π_j a_{ij} , where π_j is the dual variable associated
with constraint j. The explicit enumeration of
all i is impossible when the number of variables is exponential. Therefore, the column
generation works with a restricted set of variables, which define the restricted master
problem (RMP) and evaluates reduced costs by implicit enumeration e.g., by solving a
combinatorial problem. We now apply these concepts to our shortest path model.
K ≥ 0, B ≥ 0, ∀b N_b ≥ 0, and ∀i, ∀k ∈ Ω_i : pt^k_i ∈ {0, 1}.
This master problem optimises over a set of paths Ωi per line i. The task of generating
improving columns or paths is delegated to the sub-problem which is partitioned into
m problems. The reduced cost of a path in a given line does not affect the computation
of the reduced cost on another line. This is a typical structure for Dantzig-Wolfe decomposition,
and the constraints of the RMP involving the Nb variables are the coupling, or
complicating, constraints. An improving column for line i is a path of negative reduced
cost, where the reduced cost is defined by c_i − \sum_j π_j a_{ij} . This translates as follows in
our context. The M different costs on each edge are modified by a multiplier corre-
sponding to the dual variables of constraints C3 – C5 . We denote by δi , πi1 , πib2 , and
πi3 the dual variables associated with constraints C2 to C5 , respectively. The subprob-
lem of line i, PP (i), is a shortest path problem where the cost of an edge c(p_u , p_v ) is
c(p_u , p_v ) = −π^1_i × c_1 (p_u , p_v ) − \sum_b π^2_{ib} × c_2 (b, p_u , p_v ) − π^3_i × c_3 (p_u , p_v ).
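In code, the dual-weighted edge cost and the column-acceptance test look as follows; this is a sketch with the dual values passed explicitly, and the data layout is an assumption.

    def pricing_edge_cost(c1, c2, c3, pi1, pi2, pi3):
        # Reduced-cost weight of one partition-graph edge in the pricing problem
        # of a line: -pi1*c1 - sum_b pi2[b]*c2[b] - pi3*c3 (duals of C3-C5);
        # `c2` and `pi2` are dictionaries indexed by the weight b.
        return -pi1 * c1 - sum(pi2[b] * c2.get(b, 0) for b in pi2) - pi3 * c3

    def is_improving_column(gamma_i, delta_i, eps=1e-6):
        # A path is added to the restricted master problem only if its reduced
        # cost gamma_i - delta_i is negative (below -eps to absorb rounding).
        return gamma_i - delta_i < -eps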
The column generation procedure is summarised in Algorithm 1. Notice that the
bound provided by column generation is no better than the one given by the compact
linear model because the pricing problem has the integrality property. The utility of this
Algorithm 1. ColumnGeneration
Data: Intensity Matrix – A matrix of positive integers
Result: A lower bound on the optimal decomposition.
1  Ω = ∅, DB = −∞, UB = +∞, ε = 10^{-6} ;
2  for i ≤ m do
3      add the path made of {1, . . . , 1} partitions for each integer of line i to Ω;
4      set π^1_i = π^3_i = π^2_{ib} = −1 for all b, solve PP (i) and add the shortest path to Ω
5  repeat
6      add the paths in Ω to the restricted master problem, RMP;
7      solve RMP, set UB to the corresponding optimal value and record the dual values
       (δ_i , π^1_i , π^3_i , π^2_{ib} );
8      Ω = ∅;
9      for i ≤ m do
10         solve the pricing problem PP (i) and record its optimal value γ_i ;
11         if (γ_i − δ_i ) < −ε then
12             add the optimal path to Ω
i.e., the sum of the dual objective function and the best reduced cost (see [12]). The dual
bound provides a lower bound on the original problem and can be used for termination.
Typically, we can stop as soon as the optimal value is known to be in the interval ]a, a +
1], in which case one can immediately return a + 1 as the integer lower bound on the
original problem (a condition comparing DB and UB). This last condition is useful to avoid
a convergence problem and saves many calls to the subproblems. The use of an ε is to
avoid rounding issues arising from the use of continuous values.
Solving the Pricing Problem. The pricing problem involves solving a shortest path in
a graph whose size is exponential in the maximum element, M . Storing the partition
graph explicitly requires O(n× P 2 ) space, where P is the (exponentially large) number
of partitions of M . Memory remains an issue for the column generation if we solve the
pricing problems by explicitly representing the complete graph. To save memory as M
increases, the column generation procedure can avoid representing the whole graph by
only storing the nodes, thus consuming only O(nP ) space. In this case the costs on
each edge must be computed on demand as they cannot be stored. In practice, the pre-
vious compromise with O(nP ) space consumption is perfectly acceptable as instances
become very hard before the space again becomes an issue. In our implementation we
use a combined approach whereby we store the edges when the two consecutive layers
are small enough and only recompute the cost on the fly if it is not recorded.
Speeding up the Column Generation Procedure. The column generation process is
known to suffer from convergence problems [12]. In our case, an increase in the value
of M implies more time-consuming pricing problems, and the bottleneck lies entirely
in this task in practice. We obtained some improvement with a simple stabilisation tech-
nique [13, 22] to reduce degeneracy. We added surplus variables, y, to each constraint
(except for the convexity constraints) so that constraints C3 to C5 read as:
\sum_{k∈Ω_i} c^k_{i1} × pt^k_i − y_{3i} ≤ K;   \sum_{k∈Ω_i} c^k_{ib2} × pt^k_i − y_{4ib} ≤ N_b ;   \sum_{k∈Ω_i} c^k_{i3} × pt^k_i − y_{5i} ≤ B.
We also added slack variables, z, to constraints C0 and C1 , which now read
\sum_{b≤M} N_b − y_0 + z_0 = K and \sum_{b≤M} b × N_b − y_1 + z_1 = B. The slack and surplus variables are
constrained in a box: y ≤ ψ, z ≤ ψ, and they are penalised in the objective function by
a coefficient ρ. The objective function then reads as:
w_1 K + w_2 B + ρ \sum_a y_a + ρ z_0 + ρ z_1 .
This tries to avoid the dual solutions jumping from one extreme to another by restrain-
ing the dual variables in a box as long as no relevant dual information is known.
ρ and ψ are updated during the process and must end with ρ = ∞ or ψ = 0 to
ensure the sound termination of the algorithm. We simply fix the value of ψ to a
small constant (10% of the upper bound given by the heuristic [15]) and we update
ρ when the column generation algorithm stalls, using a predefined sequence of values:
[0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, ∞].
Branching on Nb raises an issue from the column generation perspective: the subprob-
lem becomes a shortest path with resource constraints, one resource per b ≤ M limited
by the current upper bound on the Nb variables of the CP model. This also means that
finding a feasible set of columns to initialise the master problem becomes difficult.
Interaction with CP. Solving the shortest path problem with multi-resource constraints
is far too costly. Recall that the original CP model is relaxing the multi-resource path
into a set of independent paths. The propagation obtained from this relaxation removes
partitions in the partition graph. We can therefore take advantage of this information to
prune the graph used by the subproblem of the column generation and solve a short-
est path in a restricted graph. We therefore solve a relaxation of the real subproblem
that we obtained from the CP propagation. The current bounds on the domains of the
Nb variables are also enforced in the master problem RMP. Propagation allows us to
strengthen both the master and the subproblems of the column generation.
Initialisation. The initialisation issue can be easily solved by adding slack variables
for constraints C0 , C1 , C3 , C4 , and C5 of the RMP and adding them to the objective
function with a sufficiently large coefficient to ensure they will be set to 0 in an optimal
solution. Then one simply needs to independently find a path in the current filtered
partition graph of each line to obtain a feasible solution.
Column Management. From one node of the search tree to another, we simply keep
the columns that are still feasible based on the domains of the Nb and Pij variables
and remove all the others. In addition to these removals, if the number of columns
increases beyond a threshold (set to 10000 columns in practice), we delete half of the
pool starting with the oldest columns to prevent the linear solver from stalling due to
the accumulation of too many variables.
Reduced cost propagation. The CG provides a lower bound on the objective function
but also the set of optimal reduced costs for the Nb variables. Propagation based on
these reduced costs can be performed in the CP model following [17]. At a given node,
once the RMP has been solved optimally, we denote by ub and lb the current bounds
on the objective variable; ub + 1 corresponds to the value of the best solution found so
far and lb is the optimal value of the RMP at the corresponding node. We denote by rcb
the reduced cost of variable Nb at the optimal solution of the RMP; rcb represents the
increase in the objective function for an increase of one unit of Nb . The upper bound on
each Nb in the CP model can therefore be adjusted to lb(Nb ) + (ub − lb)/rcb .
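As a small illustration of this adjustment (rounding down is an assumption, justified by the integrality of Nb):

    import math

    def tighten_nb_upper_bound(lb_nb, rc_b, lb, ub):
        # Any unit increase of N_b raises the relaxation value by at least rc_b,
        # so N_b cannot exceed lb(N_b) + (ub - lb) / rc_b without exceeding the
        # incumbent; the bound is rounded down since N_b is integral.
        if rc_b <= 0:
            return None                     # no tightening from this reduced cost
        return lb_nb + math.floor((ub - lb) / rc_b)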
5 Experimental Results
We evaluated our methods using both randomly generated and clinical problem in-
stances.1 We used the randomly generated instances of [3, 9], which comprise 17 cate-
gories of 20 instances ranging in size from 12×12 to 40×40 with an M between 10 and
15, which we denote as m-n-M in our results tables. We added 9 additional categories
with matrix sizes reaching 80 × 80 and a maximum intensity value, M , of 25, giving
520 instances in total. The suite of 25 clinical instances we used are those from [25].
1 All the benchmarks are available from http://www.4c.ucc.ie/datasets/imrt
Table 1. Comparing quality and time of LP/CG/CG-STAB on the Lex objective function
Inst      Gap (%)   LP Time  | CG: Time  NbPath   NbIter  Gain  | Stabilised CG: Time  NbPath   NbIter  Gain
mean      0.64      109.08   |     40.28  849.10  147.54  10.83 |               20.42  579.53   62.47  14.97
median    0.30       14.73   |      1.44  660.18  123.30  10.58 |                1.13  434.35   54.48  14.32
min       0.00        0.81   |      0.12  262.75   65.95   6.45 |                0.12  198.95   38.85   6.72
max       5.00     1196.51   |    762.31 1958.10  297.60  21.96 |              368.90 1404.80  104.60  34.46
Table 2. Comparing the effect of the Lagrangian filtering on the Shortest Path Model CPSP
The experiments ran as a single thread on a Dual Quad Core Xeon CPU, 2.66GHz with
12MB of L2 cache per processor and 16GB of RAM overall, running Linux 2.6.25 x64.
A time limit of two hours and a memory limit of 3GB was used for each run.
Experiment 1: Evaluation of the LP Model. Firstly, we examine the quality and speed
of the linear models (solved with CPLEX 10.0.0). We use a lexicographic objective
function to perform this comparison, i.e. seek a minimum cardinality decomposition
for the given minimum beam on-time. In the result tables LP refers to the continuous
relaxation of the linear model representing every partition [10], CG to the model based
on paths and CG-STAB to its stabilised version. Table 1 reports the average gap (in
percentage terms) to the optimal value, the average times for the three algorithms as
well as the number of iterations and paths for CG and CG-STAB. The improvement
in time over LP is also given (column Gain). The mean, median, min and max across
all categories are finally reported as well. The linear relaxation leads to excellent lower
bounds but LP becomes quite slow as M grows and could not solve the instances with
M = 25 due to memory errors. CG improves the resolution time significantly and
offers better scalability in terms of memory. Its stabilised version clearly performs fewer
iterations and proves to be approximately twice as fast on average.
Table 4. Comparing the shortest path CP model and the Branch and Price algorithm against [25]
average number of nodes. The Lagrangian relaxation reduces the search space by an
order-of-magnitude but turns out to be very slow when M grows.
Experiment 3: Evaluation of the Branch and Price Model. We evaluate the Branch
and Price algorithm against the previous CP models with the lexicographic objective
function and also using the more general objective function to perform a direct compar-
ison with [25] on clinical instances. Following [25] we set w1 = 7 and w2 = 1. The
upper bound of the algorithm is initialised using the heuristic given in [15] whose run-
ning time is always well below a second. Table 3 compares the shortest path CP model
(CPSP) with two versions of the Branch and Price using the lex objective function. The
first version referred to as Branch and Price (light) only solves the CG during the first
branching phase on the Nb variables whereas the other version solves the CG at each
node of the search tree, including when the branching is made on the partition variables.
The Branch and Price significantly improves on the CP model and is able to optimally solve
the entire benchmark, whereas the CPSP solves 455 out of the 520 instances.
The light version is often much faster but does not scale to the last two larger sets of
instances (70 × 70 and 80 × 80 matrices). Both branch and price algorithms outperform
CPSP on hard instances by orders of magnitude in search space reduction.
Finally, we evaluate the CPSP and the light Branch and Price on 25 clinical instances
with the general objective function. Table 4 reports the resolution time, the number
of nodes explored (Nodes) and the value of the objective function (Obj). The times
reported in [25] are quoted in the table and were obtained on a Pentium 4 at 3 GHz.3
The CP model alone already brings significant improvements over the algorithm of
[25]. The Branch and Price algorithm shows even more robustness by decreasing the
average, median and maximum resolution times.
3 Two optimal values reported in [25] (for c3b5 and c4b5) are incorrect. The corresponding
solutions are pruned by their algorithms during search although it accepts them as valid
solutions if enforced as hard constraints.
6 Conclusion
References
1. Agazaryan, N., Solberg, T.D.: Segmental and dynamic intensity-modulated radiotherapy de-
livery techniques for micro-multileaf collimator. Medical Physics 30(7), 1758–1767 (2003)
2. Ahuja, R.K., Hamacher, H.W.: A network flow algorithm to minimize beam-on time for un-
constrained multileaf collimator problems in cancer radiation therapy. Netw. 45(1), 36–41
(2005)
3. Baatar, D., Boland, N., Brand, S., Stuckey, P.J.: Minimum cardinality matrix decomposition
into consecutive-ones matrices: CP and IP approaches. In: Van Hentenryck, P., Wolsey, L.A.
(eds.) CPAIOR 2007. LNCS, vol. 4510, pp. 1–15. Springer, Heidelberg (2007)
4. Baatar, D., Hamacher, H.W., Ehrgott, M., Woeginger, G.J.: Decomposition of integer ma-
trices and multileaf collimator sequencing. Discrete Applied Mathematics 152(1-3), 6–34
(2005)
5. Bahr, G.K., Kereiakes, J.G., Horwitz, H., Finney, R., Galvin, J., Goode, K.: The method of
linear programming applied to radiation therapy planning. Radiology 91, 686–693 (1968)
6. Boland, N., Hamacher, H.W., Lenzen, F.: Minimizing beam-on time in cancer radiation treat-
ment using multileaf collimators. Networks 43(4), 226–240 (2004)
7. Bortfeld, T.R., Kahler, D.L., Waldron, T.J., Boyer, A.L.: X-ray field compensation with mul-
tileaf collimators. International Journal of Radiation Oncology Biology Physics 28(3), 723–
730 (1994)
8. Boyd, S., Xiao, L., Mutapic, A.: Subgradient methods. In: Notes for EE392o, Standford
University (2003)
9. Brand, S.: The sum-of-increments constraints in the consecutive-ones matrix decomposition
problem. In: SAC 2009: 24th Annual ACM Symposium on Applied Computing (2009)
10. Cambazard, H., O’Mahony, E., O’Sullivan, B.: A shortest path-based approach to the multi-
leaf collimator sequencing problem. In: van Hoeve, W.-J., Hooker, J.N. (eds.) CPAIOR 2009.
LNCS, vol. 5547, pp. 41–55. Springer, Heidelberg (2009)
11. Collins, M.J., Kempe, D., Saia, J., Young, M.: Nonnegative integral subset representations of
integer sets. Inf. Process. Lett. 101(3), 129–133 (2007)
12. Desaulniers, G., Desrosiers, J., Solomon, M.M.: Column Generation. Springer, Heidelberg
(2005)
13. du Merle, O., Villeneuve, D., Desrosiers, J., Hansen, P.: Stabilized column generation. Dis-
crete Math. 194(1-3), 229–237 (1999)
14. Ehrgott, M., Güler, Ç., Hamacher, H.W., Shao, L.: Mathematical optimization in intensity
modulated radiation therapy. 4OR 6(3), 199–262 (2008)
15. Engel, K.: A new algorithm for optimal multileaf collimator field segmentation. Discrete
Applied Mathematics 152(1-3), 35–51 (2005)
16. Ernst, A.T., Mak, V.H., Mason, L.A.: An exact method for the minimum cardinality problem
in the planning of IMRT. INFORMS Journal on Computing (2009) (to appear)
17. Focacci, F., Lodi, A., Milano, M.: Cost-based domain filtering. In: Jaffar, J. (ed.) CP 1999.
LNCS, vol. 1713, pp. 189–203. Springer, Heidelberg (1999)
18. Hamacher, H.W., Ehrgott, M.: Special section: Using discrete mathematics to model multi-
leaf collimators in radiation therapy. Discrete Applied Mathematics 152(1-3), 4–5 (2005)
19. Kalinowski, T.: The complexity of minimizing the number of shape matrices subject to mini-
mal beam-on time in multileaf collimator field decomposition with bounded fluence. Discrete
Applied Mathematics (in press)
20. Kalinowski, T.: A duality based algorithm for multileaf collimator field segmentation with
interleaf collision constraint. Discrete Applied Mathematics 152(1-3), 52–88 (2005)
21. Langer, M., Thai, V., Papiez, L.: Improved leaf sequencing reduces segments or monitor
units needed to deliver imrt using multileaf collimators. Medical Physics 28(12), 2450–2458
(2001)
22. Lübbecke, M.E., Desrosiers, J.: Selected topics in column generation. Oper. Res. 53(6),
1007–1023 (2005)
23. Menana, J., Demassey, S.: Sequencing and counting with the multicost-regular constraint.
In: van Hoeve, W.-J., Hooker, J.N. (eds.) CPAIOR 2009. LNCS, vol. 5547, pp. 178–192.
Springer, Heidelberg (2009)
24. Sellmann, M.: Theoretical foundations of CP-based lagrangian relaxation. In: Wallace, M.
(ed.) CP 2004. LNCS, vol. 3258, pp. 634–647. Springer, Heidelberg (2004)
25. Caner Taskin, Z., Cole Smith, J., Edwin Romeijn, H., Dempsey, J.F.: Collimator leaf se-
quencing in imrt treatment planning. Operations Research 119 (2009) (submitted)
26. Wake, G.M.G.H., Boland, N., Jennings, L.S.: Mixed integer programming approaches to
exact minimization of total treatment time in cancer radiotherapy using multileaf collimators.
Comput. Oper. Res. 36(3), 795–810 (2009)
Automatically Exploiting Subproblem
Equivalence in Constraint Programming
1 Introduction
When solving a search problem, it is common for the search to do redundant
work, due to different search paths leading to subproblems that are somehow
“equivalent”. There are a number of different methods to avoid this redundancy,
such as caching solutions (e.g. [19]), symmetry breaking (e.g. [8]), and nogood
learning (e.g. [14]). This paper focuses on caching, which works by storing infor-
mation in a cache regarding every new subproblem explored during the search.
Whenever a new subproblem is about to be explored, the search checks whether
there is an already explored subproblem in the cache whose information (such as
solutions or a bound on the objective function) can be used for the current sub-
problem. If so, it does not explore the subproblem and, instead, uses the stored
information. Otherwise, it continues exploring the subproblem. For caching to
be efficient, the lookup operation must be efficient. A popular way is to store the
information using a key in such a way that problems that can reuse each other’s
information are mapped to the same (or similar) key.
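As a generic illustration of this scheme (not the algorithm developed in this paper, which replaces the exact-key test below with the U-dominance check introduced later), a depth-first search with caching can be organised as follows; all callbacks and the `all_fixed` method are placeholders.

    def cached_search(problem, cache, propagate, cache_key, branch):
        # Generic caching backtracking search for a satisfaction problem.
        problem = propagate(problem)              # constraint propagation
        if problem is None:                       # failed: no solution below this node
            return False
        key = cache_key(problem)                  # maps reusable subproblems together
        if key in cache:                          # an equivalent subproblem was explored
            return cache[key]
        if problem.all_fixed():                   # hypothetical: every variable is fixed
            result = True
        else:
            result = any(cached_search(child, cache, propagate, cache_key, branch)
                         for child in branch(problem))
        cache[key] = result
        return result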
This paper explores how to use caching automatically to avoid redundancy in
constraint programming (CP) search. Caching has been previously used in CP
search, but either relies on the careful manual construction of the key for each
model and search strategy (e.g. [19]), or exploits redundancy when the remain-
ing subproblem can be decomposed into independent components (e.g. [10,12]).
Instead, we describe an approach that can automatically detect and exploit
caching opportunities in arbitrary optimization problems, and does not rely on
decomposition. The principal insight of our work is to define a key that can
be efficiently computed during the search and can uniquely identify a relatively
general notion of reusability (called U -dominance). The key calculation only re-
quires each primitive constraint to be extended to backproject itself on the fixed
variables involved. We experimentally demonstrate the effectiveness of our ap-
proach, which has been implemented in a competitive CP solver, Chuffed. We
also provide interesting insight into the relationships between U -dominance and
dynamic programming, symmetry breaking and nogood learning.
2 Background
Let ≡ denote syntactic identity and vars(O) denote the set of variables of ob-
ject O. A constraint problem P is a tuple (C, D), where D is a set of domain
constraints of the form x ∈ sx (we will use x = d as shorthand for x ∈ {d}),
indicating that variable x can only take values in the fixed set sx , and C is a set
of constraints such that vars(C) ⊆ vars(D). We will assume that for every two
x ∈ sx , y ∈ sy in D : x ≢ y. We will define DV , the restriction of D to variables
V , as {(x ∈ sx ) ∈ D|x ∈ V }. Each set D and C is logically interpreted as the
conjunction of its elements.
A literal of P ≡ (C, D) is of the form x → d, where ∃(x ∈ sx ) ∈ D s.t. d ∈ sx .
A valuation θ of P over set of variables V ⊆ vars(D) is a set of literals of P
with exactly one literal per variable in V . It is a mapping of variables to values.
The projection of valuation θ over a set of variables U ⊆ vars(θ) is the valuation
θU = {x → θ(x)|x ∈ U }. We denote by fixed(D) the set of fixed variables in D,
{x|(x = d) ∈ D}, and by fx(D) the associated valuation {x → d|(x = d) ∈ D}.
Define fixed(P ) = fixed(D) and fx(P ) = fx(D) when P ≡ (C, D).
A constraint c ∈ C can be considered a set of valuations solns(c) over the
variables vars(c). Valuation θ satisfies constraint c iff vars(c) ⊆ vars(θ) and
θvars(c) ∈ c. A solution of P is a valuation over vars(P ) that satisfies every
constraint in C. We let solns(P ) be the set of all its solutions. Problem P is
satisfiable if it has at least one solution and unsatisfiable otherwise.
Finally, we use ∃V .F to denote ∃v1 .∃v2 · · · ∃vn .F where F is a formula and V is
the set of variables {v1 , v2 , . . . , vn }. Similarly, we use ∃̄V .F to denote the formula
∃vars(F )−V .F . We let ⇔ denote logical equivalence and ⇒ logical entailment of
formulae.
Given a constraint problem P ≡ (C, D), constraint programming solves P
by a search process that first uses a constraint solver to determine whether
P can immediately be classified as satisfiable or unsatisfiable. We assume a
propagation solver, denoted by solv, which when applied to P returns a new set
D' of domain constraints such that D' ⇒ D and C ∧ D ⇔ C ∧ D'. The solver
cache search(C, D)
  D' := solv(C, D)
  if (D' ⇔ false) return false
  if (∃U. ∃P ∈ Cache where P U -dominates (C, D')) return false
  if fixed(D') ≡ vars(D)
    [SAT] return D'
  foreach (x ∈ s) ∈ split(C, D')
    S := cache search(C, join(x, s, D'))
    if (S ≢ false) return S
  Cache := Cache ∪ {(C, D')}
  return false
Fig. 1. Computing the first solution under subproblem equivalence
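For concreteness, the recursion of Fig. 1 can also be sketched in Python as follows. This is only an illustrative skeleton, not the authors' implementation: domains are represented as a dict mapping variables to value sets, solv, split, join and key_of are hypothetical stand-ins for the propagation solver, the branching strategy, domain refinement and the key construction described below, and U-dominance is approximated here by exact key equality.

def cache_search(C, D, cache, solv, split, join, key_of):
    # Sketch of Fig. 1: find a first solution while caching failed subproblems.
    D1 = solv(C, D)                      # propagate; returns strengthened domains or None on failure
    if D1 is None:
        return None                      # D' <=> false
    k = key_of(C, D1)                    # key of the projected subproblem
    if k in cache:                       # an equivalent (dominating) subproblem already failed
        return None
    if all(len(s) == 1 for s in D1.values()):
        return D1                        # all variables fixed: a solution
    for (x, s) in split(C, D1):          # branch
        sol = cache_search(C, join(x, s, D1), cache, solv, split, join, key_of)
        if sol is not None:
            return sol
    cache.add(k)                         # no solution below this node: remember it
    return None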
The last step holds because all variables being projected out in every c ∧ DF
were already fixed. Importantly, this allows each constraint c ∈ C to be treated
independently.
We can automatically convert this information into a key by back projecting
the projected constraints of this problem to determine conditions on the fixed
variables F . We define the back projection of constraint c ∈ C for DF as a con-
straint BP (c, DF ) over variables F ∩ vars(c) such that ∃̄U .(c ∧ BP (c, DF )) ⇔
∃̄U .(c ∧ DF ). Clearly, while DF ∩vars(c) is always a correct back projection, our
aim is to define the most general possible back projection that ensures the equiva-
lence. Note that if c has no variables in common with F , then BP (c, DF ) ≡ true.
Note also that when c is implied by DF , that is ∃̄U .(c ∧ DF ) ⇔ true, then c can
be eliminated. We thus define BP (c, DF ) ≡ red(c), where red(c) is simply a
name representing the disjunction of all constraints that force c to be redundant
(we will see later how to remove these artificial constraints). The problem key
for P ≡ (C, D) is then defined as key(C, D) ≡ ∧c∈C BP (c, DF ) ∧ DU .
Example 3. Consider the problem C ≡ {alldiff ([x1 , x2 , x3 , x4 , x5 , x6 ]), x1 +2x2 +
x3 + x4 + 2x5 ≤ 20} and domain D ≡ {x1 = 3, x2 = 4, x3 = 5, x4 ∈ {0, 1, 2}, x5 ∈
{0, 1, 2}, x6 ∈ {1, 2, 6}}. Then F = {x1 , x2 , x3 } and U = {x4 , x5 , x6 }. The pro-
jected subproblem is characterized by alldiff ([x4 , x5 , x6 ]) ∧ x4 + 2x5 ≤ 4 ∧ DU .
A correct back projection for alldiff onto {x1 , x2 , x3 } is {x1 , x2 , x3 } = {3, 4, 5}.
A correct back projection of the linear inequality is x1 + 2x2 + x3 = 16. Thus,
key(C, D) ≡ {x1 , x2 , x3 } = {3, 4, 5} ∧ x1 + 2x2 + x3 = 16 ∧ x4 ∈ {0, 1, 2} ∧ x5 ∈
{1, 2} ∧ x6 ∈ {1, 2, 6}.
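To make the key construction of Example 3 concrete, the small sketch below (our own illustration; the helper names and tuple layout are not from the paper) computes the two back-projection components and assembles a key from them.

# Data of Example 3; the fixed part D_F and unfixed part D_U of the domain.
fixed = {"x1": 3, "x2": 4, "x3": 5}
unfixed = {"x4": {0, 1, 2}, "x5": {0, 1, 2}, "x6": {1, 2, 6}}

# Back projection of alldiff([x1,...,x6]) onto the fixed variables:
# only the set of values taken by the fixed variables matters.
bp_alldiff = ("alldiff", frozenset(fixed), frozenset(fixed.values()))

# Back projection of x1 + 2 x2 + x3 + x4 + 2 x5 <= 20 onto the fixed variables:
# only the partial sum over the fixed variables matters.
coeffs = {"x1": 1, "x2": 2, "x3": 1, "x4": 1, "x5": 2}
partial = sum(coeffs[v] * d for v, d in fixed.items() if v in coeffs)   # 3 + 2*4 + 5 = 16

key = (bp_alldiff, ("linear", partial),
       tuple(sorted((v, frozenset(s)) for v, s in unfixed.items())))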
The second and sixth (marked) equivalences hold because, again, all variables
being projected out in each c ∧ DF and c ∧ DF were already fixed.
constraint c          | DF                    | ∃̄U .c                                         | BP (c, DF )
alldiff ([x1 , . . . , xn ]) | ∧i=1..m xi = di | alldiff ([d1 , . . . , dm , xm+1 , . . . , xn ]) | {x1 , . . . , xm } = {d1 , . . . , dm }
Σi=1..n ai xi = a0    | ∧i=1..m xi = di       | Σi=m+1..n ai xi = a0 − Σi=1..m ai di          | Σi=1..m ai xi = Σi=1..m ai di
Σi=1..n ai xi ≤ a0    | ∧i=1..m xi = di       | Σi=m+1..n ai xi ≤ a0 − Σi=1..m ai di          | Σi=1..m ai xi = Σi=1..m ai di
x0 = mini=1..n xi     | ∧i=1..m xi = di       | x0 = min(mini=1..m di , mini=m+1..n xi )      | mini=1..m xi = mini=1..m di
x0 = mini=1..n xi     | x0 = d0               | ∧i=1..n xi ≥ d0 ∧ ∨i=1..n xi = d0             | x0 = d0
∨i=1..n xi            | x1 = true             | true                                          | red(∨i=1..n xi )
∨i=1..n xi            | ∧i=1..m xi = false    | ∨i=m+1..n xi                                  | ∧i=1..m xi = false
Fig. 2. Example constraints with their fixed variables, projections and resulting back
projection
Suppose that by the time we reach P', a better solution k' < k has been
discovered. Effectively the constraint Σi=1..n ai xi ≤ k − 1 has been replaced by
Σi=1..n ai xi ≤ k' − 1. Now we are only interested in finding a solution where
Σi=m+1..n ai xi ≤ k' − 1 − p'. To see if this is dominated by a previous subproblem,
the stored value p is not enough. We also need the optimal value k when the
key was stored. There is a simple fix: rather than storing p in the key for P we
store q = k − 1 − p. We can detect dominance if k' − 1 − p' ≤ q, and this value
q is usable for all future dominance tests. Note that q implicitly represents the
partial objective bound on the subproblem P .
5 Related Work
Problem specific approaches to dominance detection/subproblem equivalence are
widespread in combinatorial optimization (see e.g. [6,19]). There is also a signif-
icant body of work on caching that relies on problem decomposition by fixing
variables (e.g. [10,12]). This work effectively looks for equivalent projected prob-
lems, but since they do not take into account the semantics of the constraints,
they effectively use DF ∩vars(c) for every constraint c as the projection key, which
finds strictly fewer equivalent subproblems than back-projection. The success of
these approaches in finding equivalent subproblems relies on decomposing the
projected subproblem into disjoint parts. We could extend our approach to also
split the projected problem into connected components but this typically does
not occur in the problems of interest to us. Interestingly, [10] uses symmetry
detection to make subproblem equivalence detection stronger, but the method
used does not appear to scale.
knp(j, w) = 0                                               if j = 0 ∨ w ≤ 0
knp(j, w) = max(knp(j − 1, w), knp(j − 1, w − wj ) + pj )   otherwise
The DP solution is O(nW ) since values for knp(j, w) are cached and only
computed once. Consider a CP solver using a fixed search order x1 , . . . , xn .
A subproblem fixing x1 , . . . , xm to d1 , . . . , dm respectively generates key value
Σi=1..m wi di for the constraint Σi=1..n wi xi ≤ W and key value k + 1 − Σi=1..m pi di for
the optimization constraint Σi=1..n pi xi ≥ k + 1 where k is the best solution found
so far. The remaining variable domains are all unchanged so they do not need to
be explicitly stored (indeed domains of Boolean or 0-1 variables never need to
be stored as they are either fixed or unchanged). The projection key is simply
the set of fixed variables {x1 , . . . , xm } and the two constants. The complexity is
hence O(nW u) where u is the initial upper bound on profit.
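A minimal sketch of this key construction (our own illustration, not code from the paper): for a prefix assignment the key needs only the set of fixed variables, the weight already used, and the profit still required; the dominance test anticipates the comparison discussed just below.

def knapsack_key(weights, profits, fixed_values, best_so_far):
    # weights, profits: w_1..w_n and p_1..p_n; fixed_values: d_1..d_m for x_1..x_m (0/1);
    # best_so_far: k, the best objective value found so far.
    m = len(fixed_values)
    used_weight = sum(w * d for w, d in zip(weights[:m], fixed_values))
    profit_needed = best_so_far + 1 - sum(p * d for p, d in zip(profits[:m], fixed_values))
    return (m, used_weight, profit_needed)       # m identifies the set {x1, ..., xm}

def dominates(cached, current):
    # A cached subproblem dominates the current one if, on the same set of fixed
    # variables, it used no more weight and required no more remaining profit.
    m1, w1, p1 = cached
    m2, w2, p2 = current
    return m1 == m2 and w1 <= w2 and p1 <= p2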
The solutions are in fact quite different: the DP approach stores the optimal
profit for each set of unfixed variables and remaining weight limit, while the CP
approach stores the fixed variables and uses weight plus the remaining profit
required. The CP approach in fact implements a form of DP with bounding [16].
In particular, the CP approach can detect subproblem dominance: a problem
with used weight w and remaining profit required p is dominated by a problem
with used weight w' ≤ w and remaining profit required p' ≤ p. The DP solution
must examine both subproblems since the remaining weights are different.
In practice the number of remaining profits arising for the same set of fixed
variables and used weight is O(1) and hence the practical number of subproblems
visited by the CP approach is O(nW ).
Note that while adding a side constraint like x3 ≥ x8 destroys the DP ap-
proach (or at least forces it to be carefully reformulated), the CP approach with
automatic caching works seamlessly.
6 Experiments
We compare our solver Chuffed, with and without caching, against Gecode
3.2.2 [17] – widely recognized as one of the fastest constraint programming sys-
tems (to illustrate we are not optimizing a slow system) – against the G12 FD
solver [15] and against the G12 lazy clause generation solver [5] (to compare
against nogood learning). We use the MurmurHash 2.0 hash function. We use
models written in the modelling language MiniZinc [13]. This facilitates a fair
comparison between the solvers, as all solvers use the same model and search
strategy. Note that caching does not interfere with the search strategies used
here, as all it can do is fail subtrees earlier. Thus, Chuffed with caching (de-
noted as ChuffedC) always finds the same solution as the non-caching version
and the other solvers, and any speedup observed comes from a reduced search.
Considerable engineering effort has gone into making caching as efficient as
possible: defining as general as possible back projections for each of the many
primitive constraints defined in the solver, exploiting small representations for
back projections and domains, and eliminating information that never needs
storing, e.g. binary domain consistent constraints and Boolean domains.
The experiments were conducted on Xeon Pro 2.4GHz processors with a 900
second timeout. Table 1 presents the number of variables and constraints as
reported by Chuffed, the times for each solver in seconds, and the speedup and
node reduction obtained from using automatic caching in Chuffed. We discuss
the results for each problem below. All the MiniZinc models and instances are
available at www.cs.mu.oz.au/~pjs/autocache/
Knapsack. 0-1 knapsack is ideal for caching. The non-caching solvers all timeout
as n increases, as their time complexity is O(2n ). This is a worst case for lazy
clause generation since the nogoods generated are not reusable. ChuffedC,
on the other hand, is easily able to solve much larger instances (see Table 1).
The node to nW ratio (not shown) stays fairly constant as n increases (varying
between 0.86 and 1.06), showing that it indeed has search (node) complexity
O(nW ). The time to nW ratio grows as O(n) though, since we are using a
general CP solver where the linear constraints take O(n) to propagate at each
node, while DP requires constant work per node. Hence, we are not as efficient
as pure DP.
MOSP. The minimal open stacks problem (MOSP) aims at finding a schedule
for manufacturing all products in a given set that minimizes the maximum num-
ber of active customers, i.e., the number of customers still waiting for at least
one of their products to be manufactured. This problem was the subject of the
2005 constraint modelling challenge [18]. Of the 13 entrants only 3 made use of
the subproblem equivalence illustrating that, in general, it may not be easy to
detect. Our MOSP model uses customer search and some complex conditional
dominance breaking constraints that make the (non-caching) search much faster.
We use random instances from [3]. Automatic caching gives up to two orders of
magnitude speedup. The speedup grows exponentially with problem size. Lazy
clause is also capable of exploiting this subproblem equivalence, but the overhead
is so large that it can actually slow the solver down.
Blackhole. In the Blackhole patience game, the 52 cards are laid out in 17 piles
of 3, with the ace of spades starting in a “blackhole”. Each turn, a card at the
top of one of the piles can be played into the blackhole if it is +/-1 from the card
that was played previously. The aim is to play all 52 cards. This was one of two
examples used to illustrate CP with caching in [19]. The remaining subproblem
only depends on the set of unplayed cards, and the value of the last card played.
Thus, there is subproblem equivalence. We use a model from [7] which includes
conditional symmetry breaking constraints. We generated random instances and
used only the hard ones for this experiment. The G12 solvers do not use a domain
consistent table constraint for this problem and are several orders of magnitude
slower. Automatic caching gives a modest speedup of around 2-3. The speedup
is relatively low on this problem because the conditional symmetry breaking
constraints have already removed many equivalent subproblems, and the caching
is only exploiting the ones which are left. Note that the manual caching reported
in [19] achieves speedups in the same range (on hard instances).
Radiation Therapy. In the Radiation Therapy problem [1], the aim is to decom-
pose an integral intensity matrix describing the radiation dose to be delivered
to each area, into a set of patterns to be delivered by a radiation source, while
minimising the amount of time the source has to be switched on, as well as the
number of patterns used (setup time of machine). The subproblem equivalence
arises because there are equivalent methods to obtain the same cell coverages,
e.g. radiating one cell with two intensity 1 patterns is the same as radiating it
with one intensity 2 pattern, etc. We use random instances generated as in [1].
Both automatic caching and lazy clause generation produce orders of magnitude
speedup, though lazy clause generation times are often slightly better.
Instance vars cons. ChuffedC Chuffed Gecode G12 fd G12 lazyfd Speedup Node red.
knapsack-20 21 2 0.01 0.01 0.01 0.01 0.10 1.00 2.9
knapsack-30 31 2 0.02 0.83 0.76 1.168 534.5 41.5 67
knapsack-40 41 2 0.03 38.21 34.54 58.25 >900 1274 1986
knapsack-50 51 2 0.07 >900 >900 >900 >900 >12860 >20419
knapsack-60 61 2 0.10 >900 >900 >900 >900 >9000 >14366
knapsack-100 101 2 0.40 >900 >900 >900 >900 >2250 > 2940
knapsack-200 201 2 2.36 >900 >900 >900 >900 >381 > 430
knapsack-300 301 2 6.59 >900 >900 >900 >900 >137 >140
knapsack-400 401 2 13.96 >900 >900 >900 >900 >65 >65
knapsack-500 501 2 25.65 >900 >900 >900 >900 >35 > 34
mosp-30-30-4-1 1021 1861 1.21 4.80 24.1 50.29 29.70 4.0 4.91
mosp-30-30-2-1 1021 1861 6.24 >900 >900 >900 201.8 >144 >187
mosp-40-40-10-1 1761 3281 0.68 0.66 5.85 15.07 29.80 1.0 1.1
mosp-40-40-8-1 1761 3281 1.03 1.15 9.92 27.00 56.96 1.1 1.3
mosp-40-40-6-1 1761 3281 3.79 11.30 75.36 183.9 165.2 3.0 3.5
mosp-40-40-4-1 1761 3281 19.07 531.68 >900 >900 840.4 28 37
mosp-40-40-2-1 1761 3281 60.18 >900 >900 >900 >900 >15 > 18
mosp-50-50-10-1 2701 5101 2.83 3.17 40.70 92.74 134.1 1.1 1.2
mosp-50-50-8-1 2701 5101 6.00 9.12 113.0 292.0 295.9 1.5 1.8
mosp-50-50-6-1 2701 5101 39.65 404.16 >900 >900 >900 10.2 13.1
blackhole-1 104 407 18.35 39.77 103.6 >900 >900 2.17 2.90
blackhole-2 104 411 14.60 21.52 60.06 >900 >900 1.47 1.94
blackhole-3 104 434 18.31 26.14 31.43 >900 >900 1.43 1.81
blackhole-4 104 393 15.77 30.84 69.13 >900 >900 1.96 2.55
blackhole-5 104 429 24.88 58.77 159.5 >900 >900 2.36 3.45
blackhole-6 104 448 11.31 33.27 85.65 >900 >900 2.94 5.11
blackhole-7 104 407 28.02 47.31 127.6 >900 >900 1.69 2.49
blackhole-8 104 380 24.09 43.60 89.02 >900 >900 1.81 2.45
blackhole-9 104 404 38.74 93.92 215.1 >900 >900 2.42 3.52
blackhole-10 104 364 67.85 159.4 418.0 >900 >900 2.35 3.16
curriculum 8 838 1942 0.01 0.01 0.01 0.02 0.08 1.00 1.00
curriculum 10 942 2214 0.01 0.01 0.01 0.03 0.09 1.00 1.00
curriculum 12 1733 4121 0.01 0.01 0.01 0.10 0.23 1.00 1.00
bacp-medium-1 1121 2654 11.47 34.90 29.31 62.4 6.90 3.04 3.03
bacp-medium-2 1122 2650 9.81 >900 >900 >900 0.22 >92 >115
bacp-medium-3 1121 2648 2.42 380.7 461.62 838.6 0.23 157 190
bacp-medium-4 1119 2644 0.61 4.59 5.74 9.92 1.10 7.52 10.1
bacp-medium-5 1119 2641 2.40 56.46 54.03 126.9 0.76 23.5 26.5
bacp-hard-1 1121 2655 54.66 >900 >900 >900 0.16 >16 >16
bacp-hard-2 1118 2651 181.9 >900 >900 >900 0.22 >5 >7
radiation-6-9-1 877 942 12.67 >900 >900 >900 2.89 >71 >146
radiation-6-9-2 877 942 27.48 >900 >900 >900 5.48 >32 >86
radiation-7-8-1 1076 1168 0.84 >900 >900 >900 1.40 >1071 >5478
radiation-7-8-2 1076 1168 0.65 89.18 191.4 173.6 0.93 137 633
radiation-7-9-1 1210 1301 2.39 143.0 315.6 241.9 2.70 59 266
radiation-7-9-2 1210 1301 7.26 57.44 144.4 101.9 8.83 8 34
radiation-8-9-1 1597 1718 27.09 >900 >900 >900 6.21 >33 >114
radiation-8-9-2 1597 1718 12.21 >900 >900 >900 6.53 >74 >267
radiation-8-10-1 1774 1894 22.40 12.17 15.45 12.90 33.2 0.54 1.10
radiation-8-10-2 1774 1894 59.66 >900 >900 >900 12.05 >15 >78
7 Conclusion
We have described how to automatically exploit subproblem equivalence in a
general constraint programming system by automatic caching. Our automatic
caching can produce orders of magnitude speedup over our base solver Chuffed,
which (without caching) is competitive with current state of the art constraint
programming systems like Gecode. With caching, it can be much faster on prob-
lems that have subproblem equivalences.
The automatic caching technique is quite robust. It can find and exploit sub-
problem equivalence even in models that are not “pure”, e.g. MOSP with dom-
inance and conditional symmetry breaking constraints, Blackhole with condi-
tional symmetry breaking constraints, and BACP which can be seen as bin pack-
ing with lots of side constraints and some redundant constraints. The speedups
from caching tend to grow exponentially with problem size/difficulty, as sub-
problem equivalences also grow exponentially.
Our automatic caching appears to be competitive with lazy clause genera-
tion in exploiting subproblem equivalence, and is superior on some problems, in
particular those with large linear constraints.
The overhead for caching is quite variable (it can be read from the tables
as the ratio of node reduction to speedup). For large problems with little vari-
able fixing it can be substantial (up to 5 times for radiation), but for problems
that fix variables quickly it can be very low. Automatic caching of course relies
on subproblem equivalence occurring to be of benefit. Note that for dynamic
searches this is much less likely to occur. Since it is trivial to invoke, it seems al-
ways worthwhile to try automatic caching for a particular model, and determine
empirically if it is beneficial.
References
1. Baatar, D., Boland, N., Brand, S., Stuckey, P.J.: Minimum cardinality matrix de-
composition into consecutive-ones matrices: CP and IP approaches. In: Van Hen-
tenryck, P., Wolsey, L.A. (eds.) CPAIOR 2007. LNCS, vol. 4510, pp. 1–15. Springer,
Heidelberg (2007)
2. Bellman, R.: Dynamic Programming. Princeton University Press, Princeton (1957)
3. Chu, G., Stuckey, P.J.: Minimizing the maximum number of open stacks by cus-
tomer search. In: Gent, I.P. (ed.) CP 2009. LNCS, vol. 5732, pp. 242–257. Springer,
Heidelberg (2009)
4. Fahle, T., Schamberger, S., Sellmann, M.: Symmetry breaking. In: Walsh, T. (ed.)
CP 2001. LNCS, vol. 2239, pp. 93–107. Springer, Heidelberg (2001)
5. Feydy, T., Stuckey, P.J.: Lazy clause generation reengineered. In: Gent, I.P. (ed.)
CP 2009. LNCS, vol. 5732, pp. 352–366. Springer, Heidelberg (2009)
6. Fukunaga, A., Korf, R.: Bin completion algorithms for multicontainer packing,
knapsack, and covering problems. J. Artif. Intell. Res. (JAIR) 28, 393–429 (2007)
7. Gent, I., Kelsey, T., Linton, S., McDonald, I., Miguel, I., Smith, B.: Conditional
symmetry breaking. In: van Beek, P. (ed.) CP 2005. LNCS, vol. 3709, pp. 256–270.
Springer, Heidelberg (2005)
8. Gent, I., Petrie, K., Puget, J.-F.: Symmetry in Constraint Programming. In: Hand-
book of Constraint Programming, pp. 329–376. Elsevier, Amsterdam (2006)
9. Hnich, B., Kiziltan, Z., Walsh, T.: Modelling a balanced academic curriculum prob-
lem. In: Proceedings of CPAIOR 2002, pp. 121–131 (2002)
10. Kitching, M., Bacchus, F.: Symmetric component caching. In: Proceedings of IJCAI
2007, pp. 118–124 (2007)
11. Lynce, I., Baptista, L., Marques-Silva, J.: Complete search restart strategies for
satisfiability. In: IJCAI Workshop on Stochastic Search Algorithms, pp. 1–5 (2001)
12. Marinescu, R., Dechter, R.: And/or branch-and-bound for graphical models. In:
Proceedings of IJCAI 2005, pp. 224–229 (2005)
13. Nethercote, N., Stuckey, P.J., Becket, R., Brand, S., Duck, G.J., Tack, G.: Minizinc:
Towards a standard CP modelling language. In: Bessière, C. (ed.) CP 2007. LNCS,
vol. 4741, pp. 529–543. Springer, Heidelberg (2007)
14. Ohrimenko, O., Stuckey, P.J., Codish, M.: Propagation via lazy clause generation.
Constraints 14(3), 357–391 (2009)
15. Stuckey, P.J., Garcia de la Banda, M., Maher, M.J., Marriott, K., Slaney, J.K.,
Somogyi, Z., Wallace, M., Walsh, T.: The G12 project: Mapping solver independent
models to efficient solutions. In: van Beek, P. (ed.) CP 2005. LNCS, vol. 3709, pp.
13–16. Springer, Heidelberg (2005)
16. Puchinger, J., Stuckey, P.J.: Automating branch-and-bound for dynamic programs.
In: Proceedings of PEPM 2008, pp. 81–89 (2008)
17. Schulte, C., Lagerkvist, M., Tack, G.: Gecode, http://www.gecode.org/
18. Smith, B., Gent, I.: Constraint modelling challenge report 2005 (2005),
http://www.cs.st-andrews.ac.uk/~ipg/challenge/ModelChallenge05.pdf
19. Smith, B.M.: Caching search states in permutation problems. In: van Beek, P. (ed.)
CP 2005. LNCS, vol. 3709, pp. 637–651. Springer, Heidelberg (2005)
Single-Facility Scheduling over Long Time Horizons by
Logic-Based Benders Decomposition
1 Introduction
Logic-based Benders decomposition has been successfully used to solve planning and
scheduling problems that naturally decompose into an assignment and a scheduling
portion. The Benders master problem assigns jobs to facilities using mixed integer pro-
gramming (MILP), and the subproblems use constraint programming (CP) to schedule
jobs on each facility.
In this paper, we use a similar technique to solve pure scheduling problems with
long time horizons. Rather than assign jobs to facilities, the master problem assigns
jobs to segments of the time horizon. The subproblems schedule jobs within each time
segment.
In particular, we solve single-facility scheduling problems with time windows in
which the objective is to find a feasible solution, minimize makespan, or minimize total
tardiness. We assume that each job must be completed within one time segment. The
boundaries between segments might therefore be regarded as weekends or shutdown
times during which jobs cannot be processed. In future research we will address in-
stances in which jobs can overlap two or more segments.
Logic-based Benders decomposition was introduced in [2,8]. Its application to as-
signment and scheduling via CP/MILP was proposed in [3] and implemented in [9].
This and subsequent work shows that the Benders approach can be orders of magnitude
faster than stand-alone MILP or CP methods on problems of this kind [1,7,4,5,6,10,11].
For the pure scheduling problems considered here, we find that the advantage of Ben-
ders over both CP and MILP increases rapidly as the problem scales up.
2 The Problem
Each job j has a release time rj , a deadline (or due date) dj , and a processing time pj . The time
horizon consists of intervals [zi , zi+1 ] for i = 1, . . . , m. The problem is to assign each
job j a start time sj so that time windows are observed (rj ≤ sj ≤ dj − pj ), jobs run
consecutively (sj + pj ≤ sk or sk + pk ≤ sj for all k ≠ j), and each job is completed
within one segment (zi ≤ sj ≤ zi+1 − pj for some i). We minimize makespan by
minimizing maxj {sj + pj }. To minimize tardiness, we drop the constraint sj ≤ dj − pj
and minimize Σj max{0, sj + pj − dj }.
3 Feasibility
When the goal is to find a feasible schedule, the master problem seeks a feasible as-
signment of jobs to segments, subject to the Benders cuts generated so far. Because we
solve the master problem with MILP, we introduce 0-1 variables yij with yij = 1 when
job j is assigned to segment i. The master problem becomes
Σi yij = 1, all j
Benders cuts, relaxation                                   (1)
yij ∈ {0, 1}, all i, j
The master problem also contains a relaxation of the subproblem, similar to those de-
scribed in [4,5,6], that helps reduce the number of iterations.
Given a solution ȳij of the master problem, let Ji = {j | ȳij = 1} be the set of jobs
assigned to segment i. The subproblem decomposes into a CP scheduling problem for
each segment i:
rj ≤ sj ≤ dj − pj ,   zi ≤ sj ≤ zi+1 − pj ,   all j ∈ Ji                  (2)
disjunctive ({sj | j ∈ Ji })
where the disjunctive global constraint ensures that the jobs assigned to segment
i do not overlap.
Each infeasible subproblem generates a Benders cut as described below, and the cuts
are added to the master problem. The master problem and corresponding subproblems
are repeatedly solved until every segment has a feasible schedule, or until the master
problem is infeasible, in which case the original problem is infeasible.
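The iteration just described can be sketched as follows; this is our own illustrative skeleton, with solve_master and schedule_segment standing in as hypothetical wrappers around the MILP master problem (1) and the CP subproblems (2).

def benders_feasibility(solve_master, schedule_segment):
    # solve_master(cuts)     -> {segment: set of jobs} respecting (1) and the cuts, or None
    # schedule_segment(i, J) -> a feasible schedule of jobs J on segment i, or None
    cuts = []
    while True:
        assignment = solve_master(cuts)
        if assignment is None:
            return None                      # master infeasible: original problem infeasible
        schedules, new_cuts = {}, []
        for i, J in assignment.items():
            sched = schedule_segment(i, J)
            if sched is None:
                new_cuts.append((i, frozenset(J)))   # nogood cut over this job set (cut (3) below)
            else:
                schedules[i] = sched
        if not new_cuts:
            return schedules                 # every segment feasible: done
        cuts.extend(new_cuts)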
Strengthened nogood cuts. The simplest Benders cut is a nogood cut that excludes
assignments that cause infeasibility in the subproblem. If there is no feasible schedule
for segment i, we generate the cut
Σj∈Ji yij ≤ |Ji | − 1, all i                                              (3)
The cut can be strengthened by removing jobs one by one from Ji until a feasible
schedule exists for segment i. This requires re-solving the ith subproblem repeatedly,
but the effort generally pays off because the subproblems are much easier to solve than
the master problem. We now generate a cut (3) with the reduced Ji .
The cut may be stronger if jobs less likely to cause infeasibility are removed from Ji
first. Let the effective time window [r̃ij , d˜ij ] of job j on segment i be its time window
adjusted to reflect the segment boundaries. Thus
r̃ij = max {min{rj , zi+1 }, zi } , d˜ij = min {max{dj , zi }, zi+1 }
Let the slack of job j on segment i be d˜ij − r̃ij − pj . We can now remove the jobs in
order of decreasing slack.
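A sketch of this strengthening step under the slack ordering (again our own illustration; segment_feasible is a hypothetical oracle that re-solves subproblem (2) for a candidate job set):

def strengthen_nogood(i, J, release, due, proc, z, segment_feasible):
    # Shrink the job set of an infeasible segment i before posting cut (3).
    def slack(j):
        r_eff = max(min(release[j], z[i + 1]), z[i])      # effective release on segment i
        d_eff = min(max(due[j], z[i]), z[i + 1])          # effective deadline on segment i
        return d_eff - r_eff - proc[j]
    reduced = sorted(J, key=slack, reverse=True)          # try to drop high-slack jobs first
    for j in list(reduced):
        if segment_feasible(i, [k for k in reduced if k != j]):
            break                                         # removing j would restore feasibility: stop
        reduced.remove(j)
    return reduced           # still infeasible; post cut (3) over this reduced set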
4 Minimizing Makespan
Here the master problem minimizes μ subject to (1) and μ ≥ 0 . The subproblems
minimize μ subject to (2) and μ ≥ sj + pj for all j ∈ Ji .
Strengthened nogood cuts. When one or more subproblems are infeasible, we use
strengthened nogood cuts (3). Otherwise, for each segment i we use the nogood cut
μ ≥ μ∗i (1 − Σj∈Ji (1 − yij ))
where μ∗i is the minimum makespan for subproblem i. These cuts are strengthened by
removing jobs from Ji until the minimum makespan on segment i drops below μ∗i .
We also strengthen the cuts as follows. Let μi (J) be the minimum makespan that
results when the jobs in J are assigned to segment i, so that in particular μi (Ji ) = μ∗i .
Let Zi be the set of jobs that can be removed, one at a time, without affecting makespan,
so that Zi = {j ∈ Ji | μi (Ji \ {j}) = μ∗i }. Then for each i we have the cut
μ ≥ μi (Ji \ Zi ) (1 − Σj∈Ji \Zi (1 − yij ))
when one or more jobs are removed from segment i, μ ≥ 0 when all jobs are removed,
and μ ≥ μ∗i otherwise. This can be linearized:
μ ≥ μ∗i − Σj∈Ji pj (1 − yij ) − wi − μ∗i Σj∈Ji (1 − yij ) − μ∗i qi ,      qi ≤ 1 − yij , j ∈ Ji
wi ≤ (maxj∈Ji {dj } − minj∈Ji {dj }) Σj∈Ji (1 − yij ),      wi ≤ maxj∈Ji {dj } − minj∈Ji {dj }
5 Minimizing Tardiness
Here the master problem minimizes τ subject to (1), and each subproblem minimizes
Σj∈Ji τj subject to τj ≥ sj + pj − dj and τj ≥ 0.
Benders cuts. We use strengthened nogood cuts and relaxations similar to those used
for minimizing makespan. We also develop the analytic Benders cuts
τ ≥ Σi τ̂i

τ̂i ≥ (τi∗ − Σj∈Ji (rimax + Σℓ∈Ji pℓ − dj )(1 − yij ))+        if rimax + Σℓ∈Ji pℓ ≤ zi+1
τ̂i ≥ τi∗ (1 − Σj∈Ji (1 − yij ))                               otherwise
where the bound on τ̂i is included for all i for which τi∗ > 0. Here τi∗ is the minimum
tardiness in subproblem i, rimax = max{max{rj | j ∈ Ji }, zi }, and α+ = max{0, α}.
Table 1. Computation times in seconds (computation terminated after 600 seconds). The number
of segments is 10% the number of jobs. Tight time windows have (γ1 , γ2 , α) = (1/2, 1, 1/2)
and wide time windows have (γ1 , γ2 , α) = (1/4, 1, 1/2). For feasibility instances, β = 0.028
for tight windows and 0.035 for wide windows. For makespan instances, β = 0.025 for 130 or
fewer jobs and 0.032 otherwise. For tardiness instances, β = 0.05.
Tight time windows Wide time windows
Feasibility Makespan Tardiness Feasibility Makespan Tardiness
Jobs CP MILP Bndrs CP MILP Bndrs CP MILP Bndrs CP MILP Bndrs CP MILP Bndrs CP MILP Bndrs
50 0.91 8.0 1.5 0.09 9.0 4.0 0.05 1.3 1.1 0.03 7.7 2.5 0.13 13 3.5 0.13 1.3 1.1
60 1.1 12 2.8 0.09 18 5.5 0.14 1.8 1.5 0.05 12 1.6 0.94 29 5.7 0.11 2.3 1.4
70 0.56 17 3.3 0.11 51 6.7 1.3 3.9 2.1 0.13 17 2.3 0.11 39 6.2 0.16 3.0 1.9
80 600 21 2.8 600 188 7.6 0.86 6.0 4.5 600 24 5.0 600 131 7.3 1.9 6.4 5.0
90 600 29 7.5 600 466 10 21 11 4.6 600 32 9.7 600 600 8.5 5.9 9.5 11
100 600 36 12 600 600 16 600 11 2.0 600 44 9.7 600 600 19 600 24 22
110 600 44 20 600 600 17 600 600 600 600 49 17 600 600 24 600 600 600
120 600 62 18 600 600 21 600 15 3.3 600 80 15 600 600 23 600 12 3.1
130 600 68 20 600 600 29 600 17 3.9 600 81 43 600 600 31 600 18 3.9
140 600 88 21 600 600 30 600 600 600 600 175 27 600 * 35 600 600 14
150 600 128 27 600 600 79 600 386 8.5 600 600 43
160 600 408 82 600 600 34 600 174 5.2 600 600 53
170 600 192 5.9 600 600 37 600 172 5.9 600 600 600
180 600 600 6.6 600 * 8.0 600 251 6.5 600 600 56
190 600 600 7.2 600 * 8.5 600 600 7.3 600 * 78
200 600 600 8.0 600 * 85 600 600 8.2 600 * 434
∗ MILP solver ran out of memory.
References
1. Harjunkoski, I., Grossmann, I.E.: Decomposition techniques for multistage scheduling prob-
lems using mixed-integer and constraint programming methods. Computers and Chemical
Engineering 26, 1533–1552 (2002)
2. Hooker, J.N.: Logic-based benders decomposition. Technical report, CMU (1995)
3. Hooker, J.N.: Logic-Based Methods for Optimization: Combining Optimization and Con-
straint Satisfaction. Wiley, New York (2000)
4. Hooker, J.N.: A hybrid method for planning and scheduling. Constraints 10, 385–401 (2005)
5. Hooker, J.N.: An integrated method for planning and scheduling to minimize tardiness. Con-
straints 11, 139–157 (2006)
6. Hooker, J.N.: Planning and scheduling by logic-based benders decomposition. Operations
Research 55, 588–602 (2007)
7. Hooker, J.N., Ottosson, G.: Logic-based benders decomposition. Mathematical Program-
ming 96, 33–60 (2003)
8. Hooker, J.N., Yan, H.: Logic circuit verification by Benders decomposition. In: Principles
and Practice of Constraint Programming: The Newport Papers, pp. 267–288. MIT Press,
Cambridge (1995)
9. Jain, V., Grossmann, I.E.: Algorithms for hybrid MILP/CP models for a class of optimization
problems. INFORMS Journal on Computing 13(4), 258–276 (2001)
10. Maravelias, C.T., Grossmann, I.E.: A hybrid MILP/CP decomposition approach for the con-
tinuous time scheduling of multipurpose batch plants. Computers and Chemical Engineer-
ing 28, 1921–1949 (2004)
11. Timpe, C.: Solving planning and scheduling problems with combined integer and constraint
programming. OR Spectrum 24, 431–448 (2002)
Integrated Maintenance Scheduling for
Semiconductor Manufacturing
Andrew Davenport
1 Introduction
2 Problem Description
2.1 Objectives
In practice feasible schedules satisfying release dates, due dates and capacity con-
straints on the availability of technicians are usually easy to generate. Resource
contention is not the main challenge in generating good maintenance schedules.
Rather, the challenge is in handling multiple, non-convex objectives. We describe
these objectives in the following sections.
Resource leveling. The first objective that we consider relates to the utilization
of the available technicians for a toolset over time. When the use of technicians
is spread out over time, it is more likely that there will be some idle techni-
cians available to carry out any unforeseen, unplanned maintenance. In time
periods when many technicians are busy, there are fewer idle technicians, so any
neccessary unplanned maintenance may disrupt planned maintenance or require
contracting outside technicians. Unplanned maintenance can be very expensive
and disruptive to production operations, so in general it is preferred that we
“levelize” the use of maintenance technicians over time so that some technicians
are always available to handle unplanned maintenance should it become nec-
essary (as illustrated in Figure 1(right)). We formulate this requirement as an
objective: we minimize the number of technicians utilized throughout the entire
schedule to perform all of the pending maintenance operations (equivalently, we
minimize the maximum number of technicians that are active, i.e. performing
maintenance, in any one time period).
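As a sketch of this objective in our own notation (not necessarily the formulation used in the deployed system): with ajt = 1 if maintenance operation j is in progress in time period t and nj the number of technicians it requires, the levelling objective is the min–max

minimize M    subject to    Σj nj ajt ≤ M    for every time period t.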
Fig. 2. Illustration of a maintenance schedule for a single toolset. Each row presents a
Gantt chart representation of a schedule for a single tool. The plotted line represents
the projected work in progress (y-axis) for the tool over each time period (x-axis) in
the schedule.
to as the Work In Process (WIP), which is specified for a tool and a time pe-
riod. When maintenance is performed on a tool, all production is stopped on
that tool. As well as delaying wafers that need to be processed on the tool, this
can lead to starvation of downstream processes for the wafers, resulting in tool
under-utilization. Ideally, we would like to minimize such disruption by perform-
ing maintenance operations on tools during time periods when there is as little
WIP as possible.
The difficulty we face with using this objective is that typically we do not know
what the WIP levels will be in the fab for each tool over the scheduling horizon.
Operations in semiconductor fab are usually very dynamic, as wafers can make
multiple passes through various processes based on the results of tests of the
effectiveness of each production step. Uncertainty also arises due to unplanned
tool breakdowns. Detailed production scheduling is usually done using dispatch
rules applied whenever a tool becomes available for processing. As such, there is
no longer term production schedule we can refer to in order to determine what
the WIP levels will be for each tool that we could use in a formulation of the
3 Solution Approach
4 Summary
We have presented a maintenance scheduling problem for a semi-conductor man-
ufacturing facility. We have developed a goal programming approach combining
both constraint programming and mixed-integer programming, which exploits
the strengths of both solution techniques. The scheduling system we have devel-
oped based on this solution approach has been deployed within IBM.
References
1. Bagchi, S., Chen-Ritzo, C., Shikalgar, S., Toner, M.: A full-factory simulator as a
daily decision-support tool for 300mm wafer fabrication productivity. In: Mason, S.,
Hill, R., Moench, L., Rose, O. (eds.) Proceedings of the 2008 Winter Simulation
Conference (2008)
2. Baptiste, P., Pape, C.L., Nuijten, W.: Constraint-Based Scheduling - Applying Con-
straint Programming to Scheduling Problems. International Series in Operations
Research and Management Science. Springer, Heidelberg (2001)
A Constraint Programming Approach for the
Service Consolidation Problem
1 Introduction
2 Problem Formulation
xij ≤ yj , ∀i ∈ C, j ∈ S, (4)
+ rule-based constraints
yj ∈ {0, 1}, ∀j ∈ S,
xij ∈ {0, 1}, ∀i ∈ C, j ∈ S.
Constraints (2) ensure that, for each hour of the daily profile, the resource
requirements of the candidates assigned to a server do not exceed that server's
capacity. Constraints (3) force each application to be assigned to exactly one
server. The activation constraints (4) force a candidate to be assigned only to a
server that is activated.
To define the rule-based constraints, let C1 , C2 ⊂ C and S1 ⊂ S be the sets
over which these constraints are defined, then we have the following:
– Candidate-candidate REQUIRE (CCR) rule states that candidates in
C1 should be placed on the same server with candidates in C2 :
∀j ∈ S1 , i1 ∈ C1 , xi1 j = 0. (9)
The CTR-rule provides a strict lower bound on the number of servers needed for
consolidation. Numerically, running the above model with an ILP solver shows
that the problem becomes harder as the size of the system increases and, in
particular, depends on the number of servers.
For lack of space, we just sketch the CP model used in our application.
For each candidate there is an integer variable Xi with domain equal to the set
of available servers S. For each server j there is a 0–1 variable Yj indicating
whether the corresponding server is used or not. By using the same parameters
as in the ILP model, the linear knapsack constraints (2) are translated into a
set of multi-knapsack constraints (using the Comet syntax):
forall(t in T, l in R)
cp.post(multiknapsack(X, all(i in C) r[i,l,t], all(j in S) u[j,l]));
The rule-based constraints (5)–(9) are easily translated into logical and reified
constraints on the integer Xi variables.
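Purely for illustration (this is not the Comet model used at Neptuny), rules of this kind reduce to simple logical constraints over the placement variables in any CP system; a sketch using the OR-Tools CP-SAT Python API, with hypothetical candidate indices a, b and server index j:

from ortools.sat.python import cp_model

model = cp_model.CpModel()
num_candidates, num_servers = 4, 3
# X[i] is the server assigned to candidate i.
X = [model.NewIntVar(0, num_servers - 1, f"X{i}") for i in range(num_candidates)]

a, b, j = 0, 1, 2                # hypothetical indices referenced by the rules

model.Add(X[a] == X[b])          # candidate-candidate REQUIRE: a and b share a server
model.Add(X[a] != j)             # candidate-server FORBID: a is never placed on server j

solver = cp_model.CpSolver()
status = solver.Solve(model)
print(solver.StatusName(status))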
3 Computational Experiments
We report experimental results for the ILP and CP approaches that are both
encoded using Comet [4]. As ILP solver we have used the version of SCIP (ver.
1.1) delivered with Comet (ver. 2.0).
We have considered different scenarios, and we report the results on the most
challenging instances, which have |S|=20 and |S|=30 servers, a number of can-
didates ranging from |C|=100 to |C|=250, and four resources.
Table 1. Challenging instances: averaged objective values after 60 sec. and 1000 sec.
(averaged over 5 instances for each row). The symbol ‘-’ means that no solution was
found.
SCIP CP (Comet)
|S| |C| 60 sec. 1000 sec. 60 sec. 1000 sec.
20 100 - 40 100 80
150 - 40 100 100
200 - - 290 280
250 - - 320 300
30 100 - - 110 80
150 - - 245 200
200 - - 310 300
250 - - 525 500
Each instance is involved in at least two rule-based constraints, making the whole problem demand-
ing for the ILP solver. Table 1 reports the cost and the computation time with both
the ILP and CP solver averaged over 5 different instances of the same dimension
(for a total of 40 instances). For both methods, we set two time limits, the first at
60 sec. and the second to 1000 sec. Note that the ILP solver is never able to pro-
duce an admissible solution within 60 sec, while CP does. Things are only slightly
different for the results after 1000 sec. In many cases the ILP solver does not even
find a feasible solution, but when it does, the solutions are of very good quality.
4 Conclusion
We presented the Service Consolidation Problem solved at Neptuny, using both
an ILP and a CP approach. The ILP approach is very effective when the number
of rule-based constraints is limited. However, when the rule-based constraints are
many, as it is the case in our real-life applications, the ILP approach fails in find-
ing a feasible solution in a short time (where short is defined in terms of usability
within an application). The CP approach is very effective in finding feasible solu-
tions, and even for the largest instances always finds a solution within 60 seconds.
In addition, for other tests not reported here, it is also very fast in detecting in-
feasibility. As future work, there are two open issues: the first issue concerns the
use of explanation techniques (like QuickXplain [5]) to be able to obtain a set of
conflicting constraints to show to the final user, and the second issue is to improve
the efficiency of the solver in order to tackle even larger instances.
References
1. Almeida, J., Almeida, V., Ardagna, D., Francalanci, C., Trubian, M.: Resource man-
agement in the autonomic service-oriented architecture. In: Proc. of International
Conference on Autonomic Computing, pp. 84–92 (2006)
2. Anselmi, J., Amaldi, E., Cremonesi, P.: Service consolidation with end-to-end re-
sponse time constraints. In: Proc. of Euromicro Conference Software Engineering
and Advanced Applications, pp. 345–352. IEEE Computer Society, Los Alamitos
(2008)
3. Bichler, M., Setzer, T., Speitkamp, B.: Capacity planning for virtualized servers. In:
Proc. of the Workshop on Information Technologies and Systems (2006)
4. DYNADEC. Comet release 2.0 (2009), http://www.dynadec.com
5. Junker, U.: Quickxplain: Conflict detection for arbitrary constraint propagation al-
gorithms. In: Proc. of Workshop on Modelling and Solving Problems with Con-
straints, Seattle, WA, USA (2001)
6. Lawler, E.: Recent results in the theory of machine scheduling. In: Bachem, A.,
Grotschel, M., Korte, B. (eds.) Mathematical Programming: The State of the Art.
Springer, Heidelberg (1983)
7. Michel, L., Shvartsman, A., Sonderegger, E., Van Hentenryck, P.: Optimal Deploy-
ment of Eventually-Serializable Data Services. In: Perron, L., Trick, M.A. (eds.)
CPAIOR 2008. LNCS, vol. 5015, pp. 188–202. Springer, Heidelberg (2008)
8. Rolia, J., Andrzejak, A., Arlitt, M.F.: Automating enterprise application placement
in resource utilities. In: Brunner, M., Keller, A. (eds.) DSOM 2003. LNCS, vol. 2867,
pp. 118–129. Springer, Heidelberg (2003)
Solving Connected Subgraph Problems in
Wildlife Conservation
1 Introduction
Subgraph Problem with Node Profits and Node Costs. This problem is known
to be NP-hard even for the case of no terminals [2]. Removing the connectivity
constraint, we have a 0-1 knapsack problem. On the other hand, the connectivity
constraint relates it to other important classes of well-studied problems such as
the Traveling Salesman Problem and the Steiner Tree problem. The connectivity
constraint plays a key role in the combinatorics of this problem and we propose
new mathematical formulations to better capture the structure of the problem
w.r.t. the connectivity constraint.
Our work is motivated by an important instance of this problem that arises
in Conservation Planning. The general problem consists of selecting a set of land
parcels for conservation to ensure species viability. This problem is also known
in the literature in its different variants as site selection, reserve network design,
and corridor design. Biologists have highlighted the importance of addressing
the negative ecological impacts of habitat fragmentation when selecting parcels
for conservation. To this effect, ways to increase the spatial coherence among
the set of parcels selected for conservation have been investigated ( see [14] for
a review). We look at the problem of designing so-called wildlife corridors to
connect areas of biological significance (e.g. established reserves). Wildlife cor-
ridors are an important conservation method in that they increase the genetic
diversity and allow for greater mobility (and hence better response to predation
and stochastic events such as fire, as well as long term climate change). Specifi-
cally, in the wildlife corridor design problem, we are given a set of land parcels,
a set of reserves (land parcels that correspond to biologically significant areas),
and the cost (e.g. land value) and utility (e.g. habitat suitability) of each parcel.
The goal is to select a subset of the parcels that forms a connected network
including all reserves. This problem is clearly an instance of the Connected Sub-
graph Problem with node profits and node costs, where the nodes correspond to
parcels, the terminal nodes correspond to the reserves and the edges correspond
to adjacency of parcels. Conservation and land use planners generally operate
with a limited budget while striving to secure the land that results in the corri-
dor with best habitat suitability. This results in the budget-constrained version
of the connected subgraph problem.
The connected subgraph problem in the context of designing wildlife corridors
was recently studied in [2, 7]. Conrad et al. [2] designate one of the terminals as
a root node and encode the connectivity constraints as a single commodity flow
from the root to the selected nodes in the subgraph. This encoding is small and
easy to enforce. They present computational results which show an easy-hard-
easy runtime pattern with respect to the allowed budget on a benchmark of syn-
thetic instances [7]. Further, when solving large scale real world instances of this
optimization problem, the authors report extremely large running time. Here,
we try to improve the state-of-the-art for this problem by proposing alternative
formulations. We show that the easy-hard-easy pattern in runtime solution for
finding optimal solutions observed for synthetic instances aligns with a similar
pattern in the relative integrality gap of the LP relaxation of the model. This
observation suggests that formulations that have tighter LP relaxations might
also lead to faster solution times for finding optimal solutions. To this effect, we
propose two additional formulations.
One possible alternative which we explore in this paper is to establish the
connectivity of each selected node to the root node by a separate commodity flow.
This results in a multi-commodity flow encoding of the connectivity constraints.
Although the multi-commodity flow encoding is larger than the single commodity
encoding (yet still polynomial size), it can result in a stronger LP relaxation of
the problem.
A completely different avenue is to adapt ideas from the vast literature on the
Steiner Tree Problem. Encodings of the connectivity requirement with respect
to edge decisions successfully applied to the Steiner Tree problem involve ex-
ponential number of constraints. The Steiner Tree variants involve costs and/or
profits on edges and hence such models explicitly model binary decisions of in-
cluding or excluding edges from the selected subgraph. In particular, for the
Steiner Tree Problem with Node Revenues and Budgets, Costa et al. [4] sug-
gest using the directed Dantzig-Fulkerson-Johnson formulation [5] with subtour
elimination constraints enforcing the tree structure of the selected subgraph. For
variants of the Connection Subgraph Problem that involve edge costs or edge
profits one needs to model explicitly decisions about inclusion of edges in the
selected subgraph. Given a graph G = (V, E), in the problem variant we study
we only need to make explicit decisions of which nodes to include (i.e., V ⊆ V )
and connectivity needs to be satisfied on the induced subgraph G(V ) that only
contains edges of G whose endpoints belong to V . Nevertheless, we adapt the
directed Dantzig-Fulkerson-Johnson formulation to our problem, therefore con-
sidering the graph edges as decision variables instead of the nodes, which in
general results in dramatically increasing the search space size from 2^|V| to 2^|E| .
Although at first glance this change seems counterproductive, the added strength
that results from explicitly enforcing the connectivity of each selected node to
a predefined terminal, in fact, results in a tighter formulation. This formula-
tion involves an exponential number of connectivity constraints that cannot be
represented explicitly for real life sized instances. To address this, we present a
Bender’s decomposition approach that iteratively adds connectivity constraints
to a relaxed master problem [1, 12].
We provide computational results on the three different encodings of the con-
nectivity constraints: 1) the single-commodity flow (SCF) encoding [2]; 2) a
multi-commodity flow (MCF) encoding; 3) a modified directed Dantzig-Fulkerson-
Johnson (DFJ) formulation using node costs. On a benchmark of synthetic in-
stances consisting of grid graphs with random costs and revenues, we show that
indeed the multi-commodity encoding provides better LP relaxation bounds than
the single commodity flow, and that the directed Dantzig-Fulkerson-Johnson for-
mulation provides the best bounds. Most importantly, the advantage of the bounds
provided by the directed Dantzig-Fulkerson-Johnson formulation over the single-
commodity flow encoding are greatest exactly in the hard region. The tighter
bounds turn out to have a critical effect on the solution times for finding opti-
mal integer feasible solutions. Despite the large size of the DFJ encoding, it works
remarkably well for finding integer feasible solutions. The easy-hard-easy pattern
with respect to the budget exhibited strongly by the SCF encoding is much less
pronounced when using the DFJ encoding – this encoding is considerably more ro-
bust to the budget level. We show that the DFJ encoding finds optimal solutions
two orders of magnitude faster than the SCF encoding in the interval of budget
values that are hardest. This result is particularly relevant when solving real-world
instances because the hard region usually falls over a budget interval close to the
minimum cost solution to find a connected subgraph – i.e. it helps find solutions
for tight budgets.
We test our formulations on real problem instances concerning the design of
a Grizzly Bear Wildlife Corridor connecting three existing reserves [2]. We show
that, for critically constrained budgets, the DFJ encoding proposed here can find
optimal or close to optimal solutions, dramatically speeding up runtime. For the
same problem instances and budget levels, the single flow encoding can only
find considerably worse feasible solutions and has much worse objective upper
bounds. For example, for a budget level which is 10% above the minimum cost
required to connect all reserves, the DFJ encoding finds an optimal solution and
proves optimality in 25 mins, while the SCF encoding after 10 hours has found
an inferior solution and has proven an optimality gap of 31%. Similar behavior
is observed for a budget of 20% above the minimum cost. Working with budgets close
to the minimum cost solution is a very likely scenario in a resource-constrained
setting such as conservation planning. Hence, with the little money available, it
is important to find the best possible solutions. The new DFJ encoding proposed
here allows us to find optimal solutions to large scale wildlife corridor problems
in exactly the budget levels that are most relevant in practice and that are out of
reach in terms of computational time for the previously proposed formulations.
The DFJ encoding is better at capturing the combinatorial structure of the
connectivity constraints which is reflected in the tightness of the LP relaxation
as well as in the fact that it finds integer feasible solutions much faster and with
very strong guarantees in terms of optimality (often proving optimality or within
less than 1% of optimality), both when considering the synthetic instances as
well as the real-world wildlife corridor instances.
2 Related Work
One of the most studied variant of the Connected Subgraph Problem is perhaps
the Steiner Tree which involves a graph G = (V, E), a set of terminal vertices
T ⊂ V , and costs associated with edges. In the Minimum Steiner Tree Problem
the goal is to select a subgraph G' = (V' ⊆ V, E' ⊆ E) of the smallest cost
possible that is a tree and contains all terminals (T ⊆ V' ). Although including a
budget constraint has important practical motivation, budget-constrained vari-
ants of the Steiner tree problem are not as nearly widely studied as the minimum
Steiner tree or the prize-collecting variant. The variant that is more relevant here
is the Budget Prize Collecting Tree Problem where in addition to costs associ-
ated with edges, there are also revenues associated with nodes. The goal is to
select a Steiner tree with total edge cost satisfying a budget constraint while
maximizing the total node revenue of the selected tree. Levin [10] gives a (4 + )-
approximation scheme for this problem. Costa et al. [3, 4] study mathematical
formulations and solution techniques for this problem in the presence of addi-
tional so-called hop constraints. They use a directed rooted tree encoding with an
exponential number of connectivity constraints and a Branch-and-Cut solution
technique. One can easily see that the Budget Prize Collecting Tree Problem is a
special case of the Budget-Constrained Steiner Connected Subgraph with Node
Profits and Node Costs by replacing each edge with an artificial node with the
corresponding cost and adding edges to the endpoints of the original edge. We
adapt some of the vast amount of work on tight formulations for the variants
of the Steiner Tree problem with edge costs to the more general node-weighted
problem.
Restricted variants of Budget-Constrained Steiner Connected Subgraph Prob-
lem with Node Profits and Node Costs have been addressed previously in the
literature. Lee and Dooly [9] study the Maximum-weight Connected Subgraph
Problem where profits and unit costs are associated with nodes and the goal is to
find a connected subgraph of maximal weight and at most a specified R number
of nodes. In the constrained variant they consider a designated root node that
needs to be included in the selected subgraph.
Moss and Rabani [13] also study the connected subgraph problem with node
costs and node profits and refer to this problem as the Constrained Node Weighted
Steiner Tree Problem. They also only consider the special case where there is
either no terminals or only one terminal - a specified root node. For all three
optimization variants - the budget, quota and prize-collecting, Moss and Rabani
[13] provide an approximation guarantee of O(log n), where n is the number of
nodes in the graph. However, for the budget variant, the result is a bi-criteria
approximation, i.e. the cost of the selected nodes can exceed the budget by some
fraction. Finding an approximation algorithm for the budget-constrained vari-
ant is still an open question, as well as dealing with multiple terminals. Demaine
et al. [6] have recently shown that one can improve the O(log n) approximation
guarantee to a constant factor guarantee when restricting the class of graphs
to planar but only in the case of the minimum cost Steiner Tree Problem with
costs on nodes (but no profits). It is an open research question whether for
planar graphs one can design a better approximation scheme for the budget-
constrained variant. This is of particular interest because the Wildlife Corridor
Design problem corresponds to finding a connected subgraph in a planar graph.
3 Mathematical Formulations
The Connected Subgraph Problem with Node Profits and Node Costs is specified
by a connected graph G = (V, E) along with a set of terminal nodes T ⊆ V, a
cost function on nodes c : V → R, and a profit function on nodes u : V → R.
The goal is to select a subset of the nodes V' ⊆ V such that all terminal nodes
are included (T ⊆ V') and the induced subgraph G(V') is connected. In
flow injected into the network corresponds to the total system flow. Each of
the vertices with a positive incoming flow retains one unit of flow, i.e., (yij >
0) ⇒ (xj = 1), ∀(i, j) ∈ A enforced by Constraint (6). The flow conservation is
modeled in Constraint (7). Finally, Constraint (8) enforces that the flow absorbed
by the network corresponds to the flow injected into the system. This encoding
requires 2|E| + 1 additional continuous variables and ensures that all selected
nodes form a connected component.
For all nodes k, if node k is selected, then k is a sink for flow of commodity
k (Constraint (12, 13)). Constraint (14) imposes conservation of flow for each
commodity type k at each node different from the sink k and the source r, and
Constraint (11) imposes that the root does not have any incoming flow. Finally,
the capacity of each edge is zero if either end node is not selected, and 1 otherwise
(Constraint (15, 16)).
This encoding requires (|V | − 1)2|E| additional continuous variables – con-
siderably more than the SCF encoding. However, we will see that enforcing the
connectivity of each node to the root separately results in tighter LP relaxation.
Σ_{j∈δ(i)} y_ji = 1    ∀i ∈ T    (20)
Σ_{j∈δ(i)} y_ji ≤ 1    ∀i ∈ V − T    (21)
4 Solution Approaches
Conrad et al. [2], Gomes et al. [7] outline a preprocessing technique for the
Budget-constrained Connected Subgraph problem which effectively reduces the
problem size for tight budgets. The procedure computes all-pairs shortest paths
in the graph and uses these distances to compute for each node the minimal
Steiner Tree cost that covers all three terminals as well as the node under con-
sideration. If this minimum cost exceeds the allowed budget, the node does not
belong to any feasible solution and hence its variable can be fixed to 0.
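The exact test above relies on the full Steiner-tree cost; the sketch below implements a weaker but still sound variant of the same idea, discarding a node as soon as a single shortest-path distance that any feasible solution containing it must support already exceeds the budget. It is not the authors' code: it assumes a networkx graph whose nodes carry a 'cost' attribute, and it charges half of each endpoint's cost to the incident edge, which underestimates node-weighted path costs and therefore keeps the pruning valid.

```python
# A weaker but sound variant of the pruning test (not the authors' code).
# Edge weights of (c(u)+c(v))/2 underestimate node-weighted path costs, so any
# node discarded here cannot belong to a feasible solution.
import networkx as nx

def prune_nodes(G, terminals, budget):
    H = nx.Graph()
    for u, v in G.edges():
        H.add_edge(u, v, weight=0.5 * (G.nodes[u]["cost"] + G.nodes[v]["cost"]))
    dist = dict(nx.all_pairs_dijkstra_path_length(H, weight="weight"))
    base = max(dist[a][b] for a in terminals for b in terminals)  # terminal-terminal paths
    keep = set()
    for v in G.nodes():
        lb = max(base, max(dist[v][t] for t in terminals))  # bound on any solution containing v
        if lb <= budget:
            keep.add(v)
    return keep            # variables of nodes outside 'keep' can be fixed to 0
```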
Gomes et al. [7] also outline a greedy method for finding feasible solutions to
the Budget-constrained Connected Subgraph problem by first computing the
minimum cost Steiner tree covering all the terminal nodes and then greed-
ily adding additional nodes until the allowed budget is exhausted. They show
that providing this greedy solution to their encoding of the Connected Sub-
graph Problem (the single commodity flow encoding) significantly improves
performance.
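One possible reading of this greedy construction is sketched below (not the authors' implementation): it seeds the solution with an approximate Steiner tree over the terminals, using networkx's steiner_tree on the auxiliary edge-weighted graph H from the pruning sketch above, and then repeatedly adds the adjacent node with the best profit/cost ratio while the budget allows; the selection criterion and the node attributes 'cost' and 'profit' are assumptions.

```python
# A sketch of the greedy construction (not the authors' implementation); the
# profit/cost selection rule and the node attributes are assumptions.
from networkx.algorithms.approximation import steiner_tree

def greedy_solution(G, H, terminals, budget):
    tree = steiner_tree(H, list(terminals), weight="weight")   # heuristic Steiner tree
    selected = set(tree.nodes())
    spent = sum(G.nodes[v]["cost"] for v in selected)
    while True:
        frontier = {u for v in selected for u in G.neighbors(v)} - selected
        affordable = [u for u in frontier if spent + G.nodes[u]["cost"] <= budget]
        if not affordable:
            return selected        # used as a starting solution for the SCF encoding
        best = max(affordable,
                   key=lambda u: G.nodes[u]["profit"] / max(G.nodes[u]["cost"], 1e-9))
        selected.add(best)
        spent += G.nodes[best]["cost"]
```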
We use both of these techniques. We apply the preprocessing step to all prob-
lem instances. In addition, we provide the greedy solution as a starting point to
the SCF encoding.
Our approach to solving the DFJ encoding is based on a cutting plane or
Benders decomposition approach. We solve a relaxed “master” problem which
omits the exponential number of connectivity constraints. In a first pass of this
procedure all edge variables are relaxed from binary variables to continuous
variables ∈ [0, 1]. In this first phase, we solve a sequence of progressively tighter
LP master problems and in effect this corresponds to a cutting plane approach.
Once we find a (fractional) optimal solution to the LP master problem that does
not violate any connectivity constraints, we have obtained the optimal solution
to the LP relaxation of the DFJ formulation. If that solution is integral, then
we have an optimal solution to the original problem. If the LP solution is not
integral, we enforce the integrality constraints for all edge variables. We continue
the same iteration steps where now the master problem includes the cuts learned
during solving the LP relaxation as well as the integrality constraints. In the
second phase, we need to solve a sequence of MIP master problems, which is in
effect a Benders decomposition approach. At each iteration, the optimal solution
to the MIP master might not be connected and more connectivity cuts would
need to be added. Once we find an optimal MIP master solution, we have found
an optimal integer solution to the original problem. The detailed algorithm is
outlined below:
Master Algorithm:
0. (Initialize) Define the initial relaxation P0 of the problem by Constraints (18,
19,20, 21, 22) as well as the integrality Constraint (24) relaxed to only enforce
the bounds. Set iteration count t = 0.
1. (Master optimization) Solve Pt and obtain an optimal (edge) solution yt .
Let the associated vertex solution be xt . If the associated vertex solution xt is
integral, go to Step 2, otherwise go to Step 3.
2. (Additional Check) Check the connectivity of the induced graph G(xt ). If it is
connected, then xt is optimal, and the algorithm returns solution xt . Otherwise,
continue to Step 3.
3. (Master separation) Check if yt satisfies all the connectivity constraints (23). If
it does, go to Step 4. If a violated constraint is found, then add the corresponding
cut to the master problem and let Pt+1 be the problem obtained. Set t = t + 1
and return to Step 1.
4. (Optimality check) If the associated vertex solution xt is integral, then xt is
optimal, and the algorithm returns solution xt . Otherwise, add the integrality
constraints (24) back in to the problem, and let Pt+1 be the problem obtained.
Set t = t + 1 and go to Step 1.
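The control flow above can be summarized by the following skeleton (a sketch, not the authors' code): solve_master, is_integral, induced_connected and separate_connectivity_cut are hypothetical helpers standing in for the LP/MIP solver, the integrality test on the vertex solution, the Step 2 check on G(xt), and the Step 3 max-flow separation, respectively.

```python
# Sketch of the two-phase cutting plane / Benders loop; all helpers are hypothetical.
def dfj_master_loop(master):
    integrality_enforced = False
    while True:
        y, x = solve_master(master)                  # Step 1: solve the current relaxation P_t
        if is_integral(x) and induced_connected(x):  # Step 2: weaker connectivity test on G(x_t)
            return x                                 #   x_t is optimal
        cut = separate_connectivity_cut(y, x)        # Step 3: max-flow separation of (23)
        if cut is not None:
            master.add_cut(cut)                      # tighten the master and resolve
            continue
        if is_integral(x):                           # Step 4: no violated cut and x_t integral
            return x                                 #   -> optimal
        if not integrality_enforced:                 # otherwise switch to the MIP phase
            master.add_integrality_constraints()
            integrality_enforced = True
```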
Checking the exponential number of connectivity constraints (23) given an edge
solution yt in Step 3 is done through a polynomial time separation procedure.
The separation procedure checks the connectivity of each selected vertex to the
root and terminates as soon as it finds a disconnected node and infers a cut
to be added. It first checks the connectivity of the terminals to the root and
then other selected vertices. We solve a max-flow problem in the directed graph
G' = (V, A) between the root and each node k ∈ V − r selected in the proposed
solution, i.e., each node with x_t(k) > 1 − ε in the associated vertex solution. The
capacities of the edges are the current values of the edge variables y_t in the master
solution. If the maximum flow is less than the sum of the incoming arc values into k, we have found
a violated constraint. The dual variables of the max-flow subproblem indicate
the partition of nodes {S, V \S} that define the minimum cut (let r ∈ V \S).
Now, we can add the cut enforcing that at least one edge across the partition
needs to be selected if parcel k is selected:
Σ_{(i,j)∈A: i∈V\S, j∈S} y_ij ≥ x_k    (25)
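A sketch of this separation routine, using networkx max-flow/min-cut, is given below; it is not the authors' code. Here arcs is the arc set A, y maps arcs to their current fractional values, x maps nodes to the associated vertex values, r is the root, and terminals is the terminal set; the min-cut value is compared directly with x_t(k), which plays the role of the total incoming arc value of k used in the text.

```python
# Sketch of the max-flow separation of the connectivity constraints (not the authors' code).
import networkx as nx

def separate_connectivity_cut(arcs, y, x, r, terminals, eps=1e-6, tol=1e-6):
    D = nx.DiGraph()
    D.add_nodes_from(x)
    for (i, j) in arcs:
        D.add_edge(i, j, capacity=y.get((i, j), 0.0))
    selected = [v for v in x if v != r and v not in terminals and x[v] > 1.0 - eps]
    for k in [t for t in terminals if t != r] + selected:   # terminals first, as in the text
        cut_value, (side_r, side_k) = nx.minimum_cut(D, r, k)
        if cut_value < x[k] - tol:                          # violated connectivity constraint
            crossing = [(i, j) for (i, j) in arcs if i in side_r and j in side_k]
            return k, crossing           # add the cut: sum of y_ij over 'crossing' >= x_k
    return None                          # y_t satisfies all connectivity constraints
```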
Step 2 of the algorithm is a special step that applies to the Connected Subgraph
Problems with node costs and node profits. Given a solution yij of the DFJ
formulation and the associated vertex vector xt we can infer a set of selected
nodes V' = {k ∈ V | x_t(k) = 1}. The original problem only requires that for
the selected subset of vertices V' the induced graph G(V') is connected, while
the DFJ formulation poses a much stronger requirement to select a subset of
edges forming a tree. Hence, it can be the case that V' induces a connected
subgraph in G, but the selected edges E' = {(i, j) ∈ A | y_ij = 1} do not form a
single connected component. To illustrate this, imagine that the selected edges
E' form two vertex-disjoint cycles C1 and C2 such that u ∈ C1, v ∈ C2, and
(u, v) ∈ E. The edge set E' clearly does not form a single connected component;
however, the subgraph induced by the selected vertices is connected because of
the edge (u, v). Without Step 2, our separation procedure in Step 3 will infer a new cut and
will wrongly conclude that the selected master solution is not a feasible solution.
To avoid such cases, we introduce Step 2 to check the weaker connectivity in
terms of the induced subgraph. If this connectivity check fails, then we use the
max-flow separation procedure in Step 3 to infer a new connectivity cut to add
to the master.
The solution procedure described above solves a series of progressively tighter
relaxations of the original problem, and therefore the first solution that is feasible
w.r.t. all the constraints of the original problem is in fact the optimal solution. One
problem with this approach is that we need to wait until the very end to obtain
an integer feasible solution, which is also the optimal one. Ideally, one would
like to have integer feasible solutions as soon as possible. We achieve this in
the context of this solution technique by noticing that while solving the MIP
master to optimality we discover a sequence of integer solutions. Some of these
integer solutions might satisfy all connectivity constraints (i.e. they are feasible
solutions to the original problem), but are discarded by the master as sub-
optimal – there might be disconnected solutions to the master of better quality.
To detect the discovery of feasible solutions to the original problem while solving
the master problem, we introduce a connectivity check at each MIP master
incumbent solution (not described in our algorithm outline above). If a MIP
master incumbent is connected and is better than any other connected integer
solution discovered so far, we record this solution as an incumbent to our original
problem.
5 Experimental Results
[Figure 1 plots: left panel y-axis “Rel. integrality gap”, right panel y-axis “Runtime secs”; x-axis “budget slack % (w.r.t. mincost)”; series SCF, MCF, DFJ.]
Fig. 1. Optimality gap and run times of LP relaxations of the three encodings on 10x10
lattices with 3 reserves, median over 100 runs
[Figure 2 plot: y-axis “Runtime secs (logscale)”; series SCF, DFJ.]
Fig. 2. Compare run times of DFJ and SCF for finding optimal integer solutions on
10x10 lattices with 3 reserves, median over 100 runs
the variant we are studying here. More importantly, one would like to use the
strength of this encoding to find the optimal integer feasible solution.
Figure 2 compares the running time of the SCF encoding and the DFJ encod-
ing. An easy-hard-easy pattern of the running time with respect to the budget
was already observed by Gomes et al. [7]. Here, we clearly see that the DFJ
encoding is in fact most beneficial in exactly the hard budget region. For large
budgets, the DFJ encoding in fact has worse running time than the single com-
modity flow encoding. More importantly, however, it improves the running time in the
hard region by two orders of magnitude.
We are interested in the running time performance of the SCF and the DFJ
encoding when looking for integer feasible solutions. We evaluate the perfor-
mance on a real-world Wildlife Corridor design problem attempting to connect
three existing reserves. We tackle this problem at two different spatial scales.
The coarser scale considers parcels corresponding to grid cells of size 40 by 40 km and has 242
parcels (nodes). The finer spatial scale considers parcels of size 10 by 10 km and
has 3299 parcels (nodes).
Figure 3 clearly demonstrates the advantage of the DFJ encoding on the 40km
problem instance both in terms of the LP relaxation bound (left) and in terms of
finding integer optimal solutions (right). The single flow encoding is fast for very
tight and very large budgets, but for a critically constrained region the running
time is much higher. The DFJ encoding on the other hand shows robust running
times which do not vary much with the budget level.
We compare the running time to find integer solutions for the much larger in-
stance at spatial resolution of 10 km. We set the budget at different (tight) levels
as percent slack above the minimum cost required to connect the reserves. Table
1 presents solution quality, running times and optimality gap results for three dif-
ferent levels. For comparison, we also include the quality of the solution obtained
by the greedy algorithm from [7] (which is usually much worse than the optimal).
The results in Table 1 show that the DFJ encoding is much faster at finding op-
timal or near-optimal solutions to the problem than the SCF encoding. Given an
8 hour cutoff time, for all three budget levels DFJ finds equal or better feasible
solutions than SCF and also provides a very tight optimality guarantee (< 1% in all
cases). On the other hand, in all three cases SCF can only guarantee that the best
solution it has found is within 28% of optimality at best.
[Figure 3 plots: “Real data set 40km resolution - LP relaxation” (left, y-axis: rel. integrality gap) and “Real data set 40km resolution” (right, y-axis: time (secs)); x-axis: budget slack % (w.r.t. mincost); series DFJ, MCF, SCF.]
Table 1. The performance of the SCF and DFJ encoding on a large real world instance
with an 8 hour cutoff time
6 Conclusion
The budget-constrained Connected Subgraph Problem is a computationally chal-
lenging problem with many real-world applications. Capturing well the combina-
torial structure of the connectivity constraint is critical to effectively solving large-
scale instances. In this work, we proposed a novel solution approach to this prob-
lem that uses an adapted directed Dantzig-Fulkerson-Johnson formulation with
subtour elimination constraints in the context of a cut-generation approach. This
results in significant speed-ups in run times when the budget level falls in the inter-
val that yields the most computationally challenging instances. We evaluate perfor-
mance on a relatively large instance of the Wildlife Corridor Design Problem and
find optimal solutions for different budget levels. This work is a good example of
identifying and extending relevant Computer Science results for problems arising
in the area of Computational Sustainability.
Acknowledgments
This research was supported by IISI, Cornell University (AFOSR grant FA9550-
04-1-0151), NSF Expeditions in Computing award for Computational Sustain-
ability (Grant 0832782) and NSF IIS award (Grant 0514429).
References
[1] Benders, J.: Partitioning procedures for solving mixed-variables programming
problems. Numerische Mathematik 4, 238–252 (1962)
[2] Conrad, J., Gomes, C.P., van Hoeve, W.-J., Sabharwal, A., Suter, J.: Connec-
tions in networks: Hardness of feasibility versus optimality. In: Van Hentenryck,
P., Wolsey, L.A. (eds.) CPAIOR 2007. LNCS, vol. 4510, pp. 16–28. Springer, Hei-
delberg (2007)
[3] Costa, A.M., Cordeau, J.-F., Laporte, G.: Steiner tree problems with profits. IN-
FOR: Information Systems and Operational Research 4(2), 99–115 (2006)
[4] Costa, A.M., Cordeau, J.-F., Laporte, G.: Models and branch-and-cut algorithms
for the steiner tree problem with revenues, budget and hop constraints. Net-
works 53(2), 141–159 (2009)
[5] Dantzig, G., Fulkerson, R., Johnson, S.: Solution of a Large-Scale Traveling-
Salesman Problem. Operations Research 2(4), 393–410 (1954)
[6] Demaine, E.D., Hajiaghayi, M.T., Klein, P.: Node-weighted steiner tree and group
steiner tree in planar graphs. In: Albers, S., Marchetti-Spaccamela, A., Matias,
Y., Nikoletseas, S., Thomas, W. (eds.) ICALP 2009. LNCS, vol. 5555. Springer,
Heidelberg (2009)
[7] Gomes, C.P., van Hoeve, W.-J., Sabharwal, A.: Connections in networks: A hybrid
approach. In: Perron, L., Trick, M.A. (eds.) CPAIOR 2008. LNCS, vol. 5015, pp.
303–307. Springer, Heidelberg (2008)
[8] ILOG S.A.: CPLEX 11.0 Reference Manual (2007)
[9] Lee, H.F., Dooly, D.R.: Decomposition algorithms for the maximum-weight con-
nected graph problem. Naval Research Logistics 45(8), 817–837 (1998)
[10] Levin, A.: A better approximation algorithm for the budget prize collecting tree
problem. Operations Research Letters 32(4), 316–319 (2004)
[11] Ljubic, I., Weiskircher, R., Pferschy, U., Klau, G.W., Mutzel, P., Fis-
chetti, M.: Solving the prize-collecting steiner tree problem to optimality. In:
ALENEX/ANALCO, pp. 68–76 (2005)
[12] McDaniel, D., Devine, M.: A Modified Benders’ Partitioning Algorithm for Mixed
Integer Programming. Management Science 24(3), 312–319 (1977)
[13] Moss, A., Rabani, Y.: Approximation algorithms for constrained node weighted
steiner tree problems. SIAM J. Comput. 37(2), 460–481 (2007)
[14] Williams, J.C., Snyder, S.A.: Restoring habitat corridors in fragmented landscapes
using optimization and percolation models. Environmental Modeling and Assess-
ment 10(3), 239–250 (2005)
Consistency Check for the Bin Packing Constraint
Revisited
1 Introduction
The bin packing problem (BP) consists in finding the minimum number of bins neces-
sary to pack a set of items so that the total size of the items in each bin does not exceed
the bin capacity C. The bin capacity is common for all the bins.
This problem can be solved in Constraint Programming (CP) by introducing one
placement variable xi for each item and one load variable lj for each bin.
The Pack([x1 , . . . , xn ], [w1 , . . . , wn ], [l1 , . . . , lm ]) constraint introduced by Shaw
[1] links the placement variables x1 , . . . , xn of n items having weights w1 , . . . , wn
with the load variables of m bins l1 , . . . , lm with domains {0, . . . , C}. More precisely
the constraint ensures that ∀j ∈ {1, . . . , m} : l_j = Σ_{i=1}^{n} (x_i = j) · w_i, where (x_i =
j) is reified to 1 if the equality holds and to 0 otherwise. The Pack constraint was
successfully used in several applications.
In addition to the decomposition constraints ∀j ∈ {1, . . . , m} : l_j = Σ_{i=1}^{n} (x_i =
j) · w_i and the redundant constraint Σ_{i=1}^{n} w_i = Σ_{j=1}^{m} l_j, Shaw introduced:
Shaw describes in [1] a fast failure detection procedure for the Pack constraint using a
bin packing lower bound (BPLB). The idea is to reduce the current partial solution (i.e.
where some items are already assigned to a bin) of the Pack constraint to a bin packing
problem. Then a failure is detected if the BPLB is larger than the number of available
bins m.
We propose two new reductions of a partial solution to a bin packing problem. The
first one can in some cases dominate Shaw’s reduction and the second one theoretically
dominates the other two.
Paul Shaw’s reduction: R0. Shaw’s reduction consists in creating a bin packing prob-
lem with the following characteristics. The bin capacity is the largest upper bound of
the load variables, i.e., c = max_{j∈{1,...,m}} l_j^max. All items that are not packed in the
constraint are part of the items of the reduced problem. Furthermore, for each bin, a vir-
tual item is added to the reduced problem to reflect (1) the upper bound dissimilarities
of the load variables and (2) the already packed items. More precisely, the size of the
virtual item for a bin j is (c − l_j^max + Σ_{i|x_i=j} w_i), that is, the bin capacity c reduced by
the actual capacity of the bin in the constraint plus the total size of the already packed
items in this bin. An example is shown in Figure 1(b).
[Figure panels: (a) Initial problem, (b) R0, (c) RMin, (d) RMax.]
Fig. 1. Example of the three reductions for the bin packing problem
RMin. We introduce RMin, which is obtained from R0 by reducing the capacity of the
bins and the size of all the virtual items by the size of the smallest virtual item. The
virtual items have a size of (c − l_j^max + Σ_{i|x_i=j} w_i − min_k (c − l_k^max + Σ_{i|x_i=k} w_i)).
This reduction is illustrated in Figure 1(c).
RMax. We propose RMax, which consists in increasing the capacity and the size of the
virtual items by a common quantity, so that, when distributing the items with a bin
packing algorithm, it is guaranteed that each virtual item will occupy a different bin. In
order to achieve this, each virtual item’s size must be larger than half the bin capacity.
In R0, let p be the size of the smallest virtual item, and c the capacity of the bins.
The size of the virtual items and the capacity must be increased by (c − 2p + 1). The
smallest virtual item will then have a size of s = (c − p + 1) and the capacity of the bins will
be (2c − 2p + 1) = 2s − 1. As one can observe, the smallest virtual item is larger than
half of the capacity. If c = 2p − 1, this reduction is equivalent to Shaw’s reduction.
Note that if c < 2p − 1, the capacity and the virtual items will be reduced.
The virtual items have a size of (2c − 2p + 1 − l_j^max + Σ_{i|x_i=j} w_i). This reduction
is illustrated in Figure 1(d).
Generic reduction: Rδ. All these reductions are particular cases of a generic reduc-
tion (Rδ) which, based on R0, consists in adding a positive or negative delta (δ) to the
capacity and to all the virtual items’ sizes.
For R0, δ = 0. For RMin, δ is the minimum possible value that keeps all sizes
non-negative. A smaller δ would create an inconsistency, as the smallest virtual item would
have a negative size. δRMin is always negative or equal to zero. For RMax, δ is the
smallest value guaranteeing that virtual items cannot pile up. Note that in some cases,
δRMin or δRMax can be zero. Also note that δR0 can be larger than the others.
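As an illustration, the sketch below (not the authors' code) builds the reduced bin packing instance for a given δ and computes the δ values corresponding to RMin and RMax (R0 corresponds to δ = 0); l_max, packed and unpacked are assumed to describe the current partial solution of the Pack constraint.

```python
# Sketch of the generic reduction R_delta (not the authors' code).
def reduce_partial_solution(l_max, packed, unpacked, delta=0):
    c = max(l_max)                                   # capacity of the R0 reduction
    virtual = [c - l_max[j] + packed[j] for j in range(len(l_max))]
    return c + delta, [v + delta for v in virtual] + list(unpacked)

def delta_rmin(l_max, packed):
    c = max(l_max)
    return -min(c - l_max[j] + packed[j] for j in range(len(l_max)))

def delta_rmax(l_max, packed):
    c = max(l_max)
    p = min(c - l_max[j] + packed[j] for j in range(len(l_max)))
    return c - 2 * p + 1        # every virtual item becomes larger than half the capacity
```

The failure test then applies a bin packing lower bound (L3 in this paper) to the reduced instance and fails as soon as the bound exceeds the number of bins m.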
Proof. If a partial solution of the Pack constraint can be extended to a solution with
every item placed, then Rδ also has a solution: if each virtual item is placed in its initial
bin, then the free space of each bin is equal to its free space in the partial solution, and
so all the unplaced items can be placed in the same bin as in the extended solution from
the partial assignment.
Theorem 2. R0 does not dominate RMin and RMin does not dominate R0.
Proof. Consider the partial packing {4, 2} of two bins of capacity 6, and the unpacked
items {3, 3}. R0 only needs two bins, whereas RMin needs three bins.
Now consider the partial packing {2, 3, 1} of three bins of capacity 4, and the un-
packed items {3, 3}. In this case, R0 needs four bins, whereas RMin only needs three
bins.
From a theoretical standpoint, the RMax reduction is always better than or equivalent to R0,
RMin, and any other instance of Rδ. In practice, though, this is not always the case, as
is shown in the next section.
4 Experimental Comparison
The failure test of Shaw [1] uses the bin packing lower bound L2 of Martello and Toth
[2] that can be computed in linear time. Recently the lower bound L3 of Labbé [3] has
been proved [4] to be always larger than or equal to L2 and to benefit from a better
worst case asymptotic performance ratio (3/4 for L3 [4] and 2/3 for L2 [2]), while still
having a linear computation time. Experiments show us that L3 can help detect about
20% more failures than L2 . Throughout the next experiments, we are using L3 .
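For reference, the sketch below computes the bound L2 in the form we recall from [2]; it is only an illustration (assuming integer item sizes), not the code used in the experiments, which rely on the stronger bound L3.

```python
# Sketch of the Martello-Toth bound L2, as we recall it from [2] (integer sizes assumed).
import math

def l2_lower_bound(weights, C):
    best = 0
    for alpha in {0} | {w for w in weights if w <= C // 2}:
        j1 = [w for w in weights if w > C - alpha]        # need a bin each, no companion fits
        j2 = [w for w in weights if C - alpha >= w > C / 2]
        j3 = [w for w in weights if C / 2 >= w >= alpha]
        free = len(j2) * C - sum(j2)                      # residual capacity of the J2 bins
        extra = max(0, math.ceil((sum(j3) - free) / C))   # further bins forced by J3 items
        best = max(best, len(j1) + len(j2) + extra)
    return best
```

On the R0 instance of the proof of Theorem 2 (items {4, 2, 3, 3} with C = 6), for example, the bound evaluates to 2, matching the two bins stated there.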
Although in theory, RMax always outperforms R0 and RMin, the practical results
are less systematic. This is because L3 (as well as L2 ) is not monotonic, which means
that a BP instance requiring a larger number of bins than a second instance can have a
lower bound smaller than the second one. In fact, L3 is more adapted to instances where
most item sizes are larger than a third of the capacity. RMax increases the capacity,
making unpacked items proportionally smaller. For each of R0, RMin and RMax, there
are instances where they contribute to detecting a failure, while the other two do not.
Table 1 presents the performance of the failure detection using each one of the reduc-
tions. It shows the ratio of failures found using each reduction over the total number of
failures found by at least one filter. Additional reductions have been experimented, with
δ being respectively 25%, 50% and 75% on the way between δRMin and δRMax . These
results were obtained by generating more than 1,000 random instances and computing
L3 on each of their reductions. Here is how the instances were produced:
Inst1. Number of bins, number of items and capacity C each randomly chosen between
30 and 50. Bins already filled up to 1..C. Random item sizes in {1, . . . , C}.
Inst2. 50 bins. Capacity = 100. Number of items is 100 or 200. Size with normal distri-
bution (μ = 5000/n, σ ∈ {3n, 2n, n, n/2, n/3} where n is the number of items).
Among these, percentage of items already placed ∈ {10%, 20%, 30%, 40%, 50%}.
Inst3. Same as Inst2, but the percentage of placed items is 90% or 95%.
This reveals that some types of instances are more adapted to R0 or RMin, while
some are more adapted to RMax. The intermediate reductions R25, R50 and R75 were
never better on average than RMin and RMax. Thus, they were not considered in the
following experiments.
Fig. 2. Proportions of failure detections using each reduction on SALBP-1 instances (left) and BP
instances (right)
These results show that R0 detects a larger number of failures. But (almost) all of its
failures are also detected by one of the others. Hence, combining RMin and RMax is
better than using R0 alone. It is also useless to combine R0 with RMin and RMax.
Impact on a CP search. We compared the effect of applying the failure detection strat-
egy in a CP search on Scholl’s bin packing instances N1 and N2 (360 instances), using
R0, RMin, RMax and then RMin and RMax combined, with a time limit of five min-
utes for each instance. For the instances for which all reductions led to the same
solution, the mean computation time of the searches was computed. All these results
are presented in Table 2. One can observe that RMin and RMax combined find more
optimal solutions (though there is no significant difference with R0), and lead to the
solution faster than the others (33% speedup compared to R0).
5 Conclusion
We presented two new reductions of a partial solution of the Pack constraint to a bin
packing problem. Through a CP search, these reductions are submitted to a bin packing
lower bound algorithm in order to detect failures of the Pack constraint as suggested
by Shaw in [1].
We proved that our second reduction (RMax) theoretically provides a better failure
detection than the others, assuming a perfect lower-bound algorithm. We conclude that
the best strategy is to consider both RMin and RMax filters in a CP search.
Acknowledgments. The first author is supported by the Belgian FNRS (National Fund
for Scientific Research). This research is also partially supported by the Interuniversity
Attraction Poles Programme (Belgian State, Belgian Science Policy) and the FRFC
project 2.4504.10 of the Belgian FNRS.
References
1. Shaw, P.: A constraint for bin packing. In: Wallace, M. (ed.) CP 2004. LNCS, vol. 3258, pp.
648–662. Springer, Heidelberg (2004)
2. Martello, S., Toth, P.: Lower bounds and reduction procedures for the bin packing problem.
Discrete Appl. Math. 28(1), 59–70 (1990)
3. Labbé, M., Laporte, G., Mercure, H.: Capacitated vehicle routing on trees. Operations Re-
search 39(4), 616–622 (1991)
4. Bourjolly, J.M., Rebetez, V.: An analysis of lower bound procedures for the bin packing prob-
lem. Comput. Oper. Res. 32(3), 395–405 (2005)
5. Scholl, A.: Data of assembly line balancing problems. Technische Universität Darmstadt (1993)
6. Scholl, A., Klein, R., Jürgens, C.: Bison: A fast hybrid procedure for exactly solving the one-
dimensional bin packing problem. Computers & Operations Research 24(7), 627–645 (1997)
A Relax-and-Cut Framework
for Gomory’s Mixed-Integer Cuts
1 Introduction
Gomory’s Mixed-Integer Cuts (GMICs) are of fundamental importance for
branch-and-cut Mixed-Integer Program (MIP) solvers, which however are quite
conservative in their use because of known issues due to the iterative accumu-
lation of GMICs in the optimal Linear Programming (LP) basis, which leads to
numerical instability due to a typically exponential growth of the determinant of
the LP basis.
Recent work on the subject suggests however that stability issues are largely
due to the overall framework where GMICs are used, rather than to the GMICs
themselves. Indeed, the two main cutting plane modules (the LP solver and the
cut generator) form a closed-loop system that is intrinsically prone to instability—
unless a “decoupling filter” is introduced in the loop. Breaking the feedback is
therefore a must if one wants to really exploit the power of GMICs.
2 Literature
GMICs for general MIPs were introduced by Ralph Gomory about 50 years
ago in his seminal paper [4]. However, these cuts were not used in practice until
the work of Balas, Ceria, Cornuéjols and Natraj [5], who found for the first time
an effective way to exploit them in a branch-and-cut context [6]. In particular,
the authors stressed the importance of generating GMICs in rounds, i.e., from
all the tableau rows with fractional right hand side.
The explanation of GMIC instability in terms of closed-loop systems was
pointed out by Zanette, Fischetti and Balas [7], who presented computational
experiments showing that reading the LP optimal solution to cut and the Go-
mory cuts from the same LP basis almost invariably creates a dangerous feedback
in the long run.
The same explanation applies to other cutting plane procedures that de-
rive cuts directly from tableau information of the enlarged LP that includes
previously-generated cuts (e.g., those related to Gomory’s corner polyhedron,
including cyclic-group cuts, intersection cuts, multi-row cuts, etc.). This is not
necessarily the case when using methods based on an external cut generation LP
(e.g., disjunctive methods using disjunctions not read from the optimal tableau),
or when the derived cuts are post-processed so as to reduce their correlation with
the optimal tableau (e.g., through lexicographic search [7] or by cut strengthen-
ing methods [2, 8]).
A different framework for Gomory’s cuts was recently proposed by Fischetti
and Lodi [9]. The aim of that paper was actually to compute the best possible
bound obtainable with rank-1 fractional Gomory’s cuts. The fact of restricting
to rank-1 cuts forced the authors to get rid of the classical separation scheme,
(i) Balas and Perregaard [8] perform a sequence of pivots on the tableau of the
large LP leading to a (possibly non-optimal or even infeasible) basis of the
same large LP that produces a deeper cut w.r.t. the given x∗ .
(ii) Dash and Goycoolea [2] heuristically look for a basis B of the original LP
that is “close enough to B ∗ ”, in the hope of cutting the given x∗ with rank-1
GMICs associated with B; this is done, e.g., by removing from A all the
columns that are nonbasic with respect to x∗ , thus restricting B to be a
submatrix of B ∗ .
The approach of relaxing cuts right after their separation is known in the lit-
erature as the Relax-and-Cut strategy. It was introduced independently by Lu-
cena [12], and by Escudero, Guignard and Malik [13]—who actually proposed
the relax-and-cut name; see Lucena [14] for a survey of the technique and of its
applications. Very recently, Lucena [15] applied a relax-and-cut approach to the
solution of hard single 0-1 knapsack problems, where fractional Gomory’s cuts
were used, for the first time, in a Lagrangian framework.
min{cx : Ax = b, x ≥ 0, xj integer ∀j ∈ J}
α_i x ≥ α_{i0},    i = 1, . . . , M    (1)
In our basic application, family (1) consists of the GMICs associated with all
possible primal-feasible bases of system Ax = b, i.e., z1 is a (typically very tight)
lower bound on the first GMIC closure addressed by Balas and Saxena [10] and
by Dash, Günlük, and Lodi [11]. However, as discussed in the computational
section, family (1) is in principle allowed to contain any kind of linear inequalities,
including problem-specific cuts and/or GMICs of any rank, or even invalid linear
conditions related, e.g., to branching conditions.
A standard solution approach for (2) consists in dualizing cuts (1) in a La-
grangian fashion, thus obtaining the Lagrangian dual problem
max_{u≥0} L(u) := min{ cx + Σ_{i=1}^{M} u_i (α_{i0} − α_i x) : x ∈ P }    (3)
s_i^k := α_{i0} − α_i x^k,    i = 1, . . . , M
(s_i^k > 0 for violated cuts, and s_i^k ≤ 0 for slack/tight cuts). In particular, the
ability of computing the subgradient is essential for the convergence of overall
scheme—this is not a trivial task when the cut family is described only implicitly.
In our context, family (1) is by far too large to be computed explicitly, so
we store only some of its members, using a data structure called cut pool. Cut
4 Implementations
We next describe three very basic heuristics for the Lagrangian dual problem (3),
that are intended to evaluate the potential of using GMICs in a relax-and-cut
framework. The investigation of more sophisticated schemes such as the bundle
method is left to future investigation.
where avgLB is the average value of L(u) in the last p iterations. Finally, if
L(u) < bestLB − Δ for 10 consecutive iterations we halve μk and backtrack to
the best uk so far.
In the following, we will denote by subg our implementation of a pure subgra-
dient method for solving (3), with a limit of 10, 000 iterations. The starting step
size parameter is aggressively set to μ0 = 10. This is justified by the fact that in
our scenario the convergence of the method is not guaranteed (and is also un-
likely in practice), because we are dealing explicitly only with a small subset of
cuts. In particular, we always deal with truncated subgradients and, even more
importantly, we have no way of generating violated GMICs apart from reading
them from the LP tableau. According to our computational experience, in this
scenario a small initial value for μ is quite inappropriate because it causes the
method to saturate far from an optimal Lagrangian dual solution u∗ , with no
possibility for recovery.
Finally, to avoid overloading the cut pool, we read a round of GMICs at
every K-th subgradient iteration, where K = 10 in our implementation. In
addition, the length of the Lagrangian vector uk is not increased every time
new cuts are added to the pool, but only every 50 subgradient iterations, so as
to let the subgradient method stabilize somehow before adding new Lagrangian
components. In this view, our implementation is between the so-called delayed
and non-delayed relax-and-cut methods [14].
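The sketch below illustrates the kind of truncated projected-subgradient loop described above; it is a sketch, not the authors' implementation. solve_lagrangian and read_gmics_from_tableau are hypothetical helpers (the Lagrangian subproblem over P and the extraction of a round of GMICs from the current LP basis), each pooled cut is assumed to expose a right-hand side rhs and a dot product, and the step-size rule is a generic Polyak-type rule since the exact rule is only partially reproduced above.

```python
# Sketch of a truncated projected-subgradient loop with a growing cut pool
# (all helpers and the cut object interface are hypothetical).
def subgradient(pool, UB, mu0=10.0, max_iter=10000, read_every=10):
    u = [0.0] * len(pool)                      # Lagrangian multipliers, one per pooled cut
    best_lb, best_u, mu, stalled = float("-inf"), list(u), mu0, 0
    for it in range(max_iter):
        x, lb = solve_lagrangian(u, pool)      # L(u) and a Lagrangian optimizer x
        if lb > best_lb:
            best_lb, best_u, stalled = lb, list(u), 0
        else:
            stalled += 1
            if stalled >= 10:                  # no progress: halve the step and backtrack
                mu, stalled = mu / 2.0, 0
                u = best_u + [0.0] * (len(pool) - len(best_u))
        s = [cut.rhs - cut.dot(x) for cut in pool]               # (truncated) subgradient
        step = mu * (UB - lb) / (sum(si * si for si in s) or 1.0)
        u = [max(0.0, ui + step * si) for ui, si in zip(u, s)]   # projection onto u >= 0
        if it % read_every == 0:               # periodically read a round of GMICs
            for cut in read_gmics_from_tableau():
                pool.append(cut)
                u.append(0.0)                  # (the paper delays enlarging u; simplified here)
    return best_lb, best_u
```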
variants to the work of Ralphs, Kopman, Pulleyblank, and Trotter [20], where
a similar idea was applied to separate capacity cuts for the Capacitated Vehicle
Routing Problem—the fractional CVRP vertex being expressed as the convex
combination of m-TSP integer solutions, each of which is easily evaluated to find
violated capacity cuts.
5 Computational Results
We tested our variants of the relax-and-cut framework for GMICs on the problem
instances in MIPLIB 3.0 [21] and MIPLIB 2003 [22]. Following [2], we omitted all
instances where there is no improvement after one round of GMICs read from the
optimal tableau, or where no integer solution is known. Moreover, we excluded
instances mod011 and rentacar, because of the presence of ranged constraints
in the formulation, that are not handled by our current GMIC code. In the end,
we were left with 52 instances from MIPLIB 3.0, and 20 instances from MIPLIB
2003. For the sake of space, we will only report aggregated statistics; detailed
tables are available, on request, from the authors.
We implemented our code in C++, using IBM ILOG Cplex 11.2 as black
box LP solver (its primal heuristics were also used to compute the subgradient
upper bound U B). All tests have been performed on a PC with an Intel Q6600
CPU running at 2.40GHz, with 4GB of RAM (only one CPU was used by each
process). As far as the GMIC generation is concerned, for a given LP basis we
try to generate a GMIC from every row where the corresponding basic variable
has a fractionality of at least 0.001. The cut is however discarded if its final
dynamism, i.e., the ratio between the greatest and smallest absolute value of the
cut coefficients, is greater than 10^10.
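For completeness, the following sketch derives a GMI cut from a single tableau row in the standard textbook form and applies the fractionality and dynamism filters just described; it is not the code used in the experiments and assumes all nonbasic variables are at a lower bound of zero.

```python
# Sketch of the textbook GMI cut from one tableau row (not the code used here).
# Row: x_b + sum_j abar[j] * x_j = bbar, all nonbasic x_j at lower bound 0; the
# returned coefficients g define the cut  sum_j g[j] * x_j >= 1.
import math

def gmi_from_row(abar, bbar, is_integer, min_frac=0.001, max_dynamism=1e10):
    f0 = bbar - math.floor(bbar)
    if min(f0, 1.0 - f0) < min_frac:               # basic variable not fractional enough
        return None
    g = {}
    for j, a in abar.items():
        if is_integer[j]:
            fj = a - math.floor(a)
            g[j] = fj / f0 if fj <= f0 else (1.0 - fj) / (1.0 - f0)
        else:
            g[j] = a / f0 if a >= 0 else -a / (1.0 - f0)
    nz = [abs(v) for v in g.values() if v != 0.0]
    if nz and max(nz) / min(nz) > max_dynamism:    # discard numerically dangerous cuts
        return None
    return g
```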
In our first set of experiments we compared the ability (and speed) of the pro-
posed methods in approximating the first GMI closure for the problems in our
testbed. The first GMI closure has received quite a lot of attention in the last
years, and it was computationally proved that it can provide a tight approxi-
mation of the convex hull of the feasible solutions. In addition, rank-1 GMICs
are read from the original tableau, hence they are generally considered safe from
the numerical point of view. Note that our method can only generate cuts from
primal-feasible bases, hence it can produce a weaker bound than that associated
with the first GMI closure [23].
In Table 1 we report the average gap closed by all methods that generate
rank-1 GMICs only, as well as the corresponding computing times (geometric
means). We recall that for a given instance, the gap closed is defined as 100 · (z −
z0 )/(opt − z0 ), where z0 is the value of the initial LP relaxation, z is the value
of the final LP relaxation, and opt is the best known solution. For comparison,
we report also the average gap closed by one round of GMIC read from the
the first optimal tableau (1gmi), as well as the average gap closed with the
default method proposed by Dash and Goycoolea (dgDef), as reported in [2].
Table 1. Average gap closed and computing times for rank-1 methods
All computing times are given in CPU seconds on our Intel machine running
at 2.4 GHz, except for dgDef where we just report the computing times given
in [2], without any speed conversion—the entry for MIPLIB 3.0 refers to a 1.4
GHz PowerPC machine (about 2 times slower than our PC), while the entry
for MIPLIB 2003 refers to a 4.2 GHz PowerPC machine (about twice as fast as
our PC).
According to the table, the relax-and-cut methods performed surprisingly well,
in particular for the hard MIPLIB 2003 instances where all of them outperformed
dgDef in terms of both quality and speed.
As far as the bound quality is concerned, the best method appears to be
hybr, mainly because of its improved convergence with respect to subg, and
of the much larger number of subgradient iterations (and hence of LP bases)
generated with respect to the two fast versions.
The two fast versions also performed very well, in particular faster, which
proved to be really fast (more than 10 times faster than dgDef) and quite accu-
rate. It is worth observing that about 75% of the computing time for fast and
faster was spent in the sampling phase: 40% for LP reoptimizations, and 35%
for actually reading the GMICs from the tableau and projecting slack variables
away. Quite surprisingly, the solution of the large LPs through a dynamic pricing
of the pool cuts required just 15% of the total computing time.
a) Generate k rounds of GMICs in a standard way, use them to initialize the cut
pool, and then apply our method to add rank-1 GMICs on top of them. This
very simple strategy turned out not to work very well in practice, closing
significantly less gap than the rank-1 version.
b) Apply one of the relax-and-cut variants of the previous subsection until a
termination condition is reached. At this point add to the original formula-
tion (some of) the GMICs that are tight at the large-LP optimal solution,
and repeat k times. This approach works quite well as far as the final bound
is concerned, but it is computationally expensive because we soon have to
work with bigger (and denser) tableaux.
c) Stick to rank-1 GMICs in the sampling phase, never enlarging the original
system. However, each time a large LP is solved to recompute the dual
multipliers (this can happen at most k times), add to the pool (but not
to the original formulation) all the GMICs read from the large-LP optimal
basis.
d) As before, stick to rank-1 GMICs in the sampling phase. If however no cut
separating the previous large-LP solution x∗ is found in the sampling phase,
then add to the pool all GMICs read from the large LP optimal basis, and
continue. This way, the generation of higher-rank cuts acts as a diversification
step, used to escape a local deadlock, after which standard rank-1 separation
is resumed.
Table 2. Average gap closed and computing times for higher rank methods
In Table 2 we report the average gap closed by our fast versions when higher-
rank GMICs are generated according to scheme d) above. Computing times
(geometric means) are also reported. Rank-1 rows are taken from the previous
table.
In the table, row gmi refers to 1, 2 or 5 rounds of GMICs. For the sake of
comparison, we also report the average gap closed by 10 rounds of Lift&Project
cuts (L&P), as described in [1]. To obtain the Lift&Project bounds and running
times we ran the latest version of separator CglLandP [24] contained in the COIN-
OR [25] package Cgl 0.55, using Clp 1.11 as black box LP solver (the separator
did not work with Cplex because of the lack of some pivoting procedures). This
separation procedure was run with default settings, apart from the minimum
fractionality of the basic variables used to generate cuts, which was set to 0.001
as in the other separators. All computing times are given in seconds on our Intel
machine running at 2.4 GHz.
Our fast procedures proved quite effective also in this setting, providing sig-
nificantly better bounds than L&P in a comparable or shorter amount of time,
even when restricting to rank-1 GMICs. As expected, increasing the cut rank
improves the quality of the bound by a significant amount, though it is not clear
whether this improvement is worth the time overhead—also taking into account
that GMICs of higher rank tend to be numerically less reliable. Similarly, it is
not clear whether the bound improvement achieved by fast w.r.t. faster is
worth the increased computing time.
the preprocessed model and the generated cuts (stored in the cut pool) can
be provided as input to our relax-and-cut scheme, in the attempt of reducing
even further the integrality gap at the root node.
– During Lagrangian optimization, a large number of (possibly slightly frac-
tional or even integer) vertices of P are generated, that could be used heuris-
tically (e.g., through rounding) to provide good primal MIP solutions.
Finally, in the process of developing our method we realized that cutting plane
schemes miss an overall “meta-scheme” to control cut generation and to escape
“local optima” by means of diversification phases—very well in the spirit of
Tabu or Variable Neighborhood Search meta-schemes for primal heuristics. The
development of sound meta-schemes on top of a basic separation tool is therefore
an interesting topic for future investigations—our relax-and-cut framework for
GMICs can be viewed as a first step in this direction.
References
1. Balas, E., Bonami, P.: Generating lift-and-project cuts from the LP simplex
tableau: open source implementation and testing of new variants. Mathematical
Programming Computation 1(2-3), 165–199 (2009)
2. Dash, S., Goycoolea, M.: A heuristic to generate rank-1 GMI cuts. Technical report,
IBM (2009)
3. Cornuéjols, G.: Valid inequalities for mixed integer linear programs. Mathematical
Programming 112(1), 3–44 (2008)
4. Gomory, R.E.: An algorithm for the mixed integer problem. Technical Report RM-
2597, The RAND Cooperation (1960)
5. Balas, E., Ceria, S., Cornuéjols, G., Natraj, N.: Gomory cuts revisited. Operations
Research Letters 19, 1–9 (1996)
6. Cornuéjols, G.: Revival of the Gomory cuts in the 1990’s. Annals of Operations
Research 149(1), 63–66 (2006)
7. Zanette, A., Fischetti, M., Balas, E.: Lexicography and degeneracy: can a pure
cutting plane algorithm work? Mathematical Programming (2009)
8. Balas, E., Perregaard, M.: A precise correspondence between lift-and-project cuts,
simple disjunctive cuts, and mixed integer Gomory cuts for 0-1 programming.
Mathematical Programming 94(2-3), 221–245 (2003)
9. Fischetti, M., Lodi, A.: Optimizing over the first Chvàtal closure. Mathematical
Programming 110(1), 3–20 (2007)
10. Balas, E., Saxena, A.: Optimizing over the split closure. Mathematical Program-
ming 113(2), 219–240 (2008)
11. Dash, S., Günlük, O., Lodi, A.: MIR closures of polyhedral sets. Mathematical
Programming 121(1), 33–60 (2010)
12. Lucena, A.: Steiner problems in graphs: Lagrangian optimization and cutting
planes. COAL Bulletin (21), 2–8 (1982)
13. Escudero, L.F., Guignard, M., Malik, K.: A Lagrangian relax-and-cut approach for
the sequential ordering problem with precedence relationships. Annals of Opera-
tions Research 50(1), 219–237 (1994)
14. Lucena, A.: Non delayed relax-and-cut algorithms. Annals of Operations Re-
search 140(1), 375–410 (2005)
An In-Out Approach to Disjunctive Optimization
Abstract. Cutting plane methods are widely used for solving convex
optimization problems and are of fundamental importance, e.g., to pro-
vide tight bounds for Mixed-Integer Programs (MIPs). This is obtained
by embedding a cut-separation module within a search scheme. The
importance of a sound search scheme is well known in the Constraint
Programming (CP) community. Unfortunately, the “standard” search
scheme typically used for MIP problems, known as the Kelley method,
is often quite unsatisfactory because of saturation issues.
In this paper we address the so-called Lift-and-Project closure for 0-
1 MIPs associated with all disjunctive cuts generated from a given set
of elementary disjunctions. We focus on the search scheme embedding
the generated cuts. In particular, we analyze a general meta-scheme for
cutting plane algorithms, called in-out search, that was recently proposed
by Ben-Ameur and Neto [1]. Computational results on test instances
from the literature are presented, showing that using a more clever meta-
scheme on top of a black-box cut generator may lead to a significant
improvement.
1 Introduction
Cutting plane methods are widely used for solving convex optimization problems
and are of fundamental importance, e.g., to provide tight bounds for Mixed-
Integer Programs (MIPs). These methods are made of two equally important
components: (i) the separation procedure (oracle) that produces the cut(s) used
to tighten the current relaxation, and (ii) the overall search framework that
actually uses the generated cuts and determines the next point to cut.
In the last 50 years, a considerable research effort has been devoted to the
study of effective families of MIP cutting planes, as well as to the definition
of sound separation procedures and cut selection criteria [2, 3]. However, the
search component was much less studied, at least in the MIP context where one
typically cuts a vertex of the current LP relaxation, and then reoptimizes the
new LP to get a new vertex to cut—a notable exception is the recent paper [4]
2 In-Out Search
Let us consider a generic MIP of the form
min{cT x : Ax = b, l ≤ x ≤ u, xj ∈ Z ∀j ∈ I}
procedure in the attempt of cutting the middle point y := (x∗ + q)/2. (In the
original proposal, the separation point is more generally defined as y := αx∗ +
(1 − α)q for a given α ∈ (0, 1].) If a violated cut is returned, we add it to the
current LP that is reoptimized to update x∗ , hopefully improving the current lower
bound cT x∗ . Otherwise, we update q := y, thus improving the upper bound and
actually halving the current uncertainty interval.
The basic scheme above can perform poorly in its final iterations. Indeed, it
may happen that x∗ already belongs to P1 , but the search is not stopped because
the internal point q is still far from x∗ . We then propose a simple but quite
effective modification of the original scheme where we just count the number of
consecutive updates to q, say k, and separate directly x∗ in case k > 3. If the
separation is unsuccessful, then we can terminate the search, otherwise we reset
counter k and continue with the usual strategy of cutting the middle point y.
As to the initialization of q ∈ P1 , this is a simple task in many practical
settings, including the MIP applications where finding a feasible integer solution
q is not difficult in practice.
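The modified in-out loop just described can be sketched as follows (not the authors' code): solve_lp, separate and add_cut are hypothetical stand-ins for the LP solver and the disjunctive cut generator, q0 is assumed to be a feasible point of P1 (e.g., obtained from a heuristic integer solution), and alpha plays the role of the parameter in y := αx* + (1 − α)q.

```python
# Sketch of the modified in-out search; all solver/oracle helpers are hypothetical.
def in_out(lp, q0, alpha=0.5, max_stalls=3):
    q = q0                              # point of P1: gives an upper bound on z1
    x_star = solve_lp(lp)               # LP optimum: gives a lower bound
    stalls = 0                          # consecutive updates of q without a cut
    while True:
        y = [alpha * xs + (1.0 - alpha) * qi for xs, qi in zip(x_star, q)]
        cut = separate(y)
        if cut is not None:
            add_cut(lp, cut)
            x_star = solve_lp(lp)       # the lower bound may improve
            stalls = 0
        else:
            q = y                       # move q towards x*, shrinking the uncertainty interval
            stalls += 1
            if stalls > max_stalls:     # too many stalls: try to cut x* directly
                cut = separate(x_star)
                if cut is None:
                    return x_star       # x* already belongs to P1: stop
                add_cut(lp, cut)
                x_star = solve_lp(lp)
                stalls = 0
```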
3 Disjunctive Cuts
Consider the generic MIP of the previous section. To simplify notation, we con-
centrate on 0-1 MIPs where lj = 0 and uj = 1 for all j ∈ I. Our order of
business is to optimize over the Lift-and-Project closure, say P1 , obtained from
P by adding all linear inequalities valid for P j := conv({x ∈ P : xj ≤ 0} ∪ {x ∈
P : xj ≥ 1}) for j ∈ I. To this end, given a point x∗ ∈ P (not necessarily a
vertex), for each j ∈ I with 0 < x∗j < 1 we construct a certain Cut Generation
Linear Program (CGLP) whose solution allows us to detect a valid inequality
for P j violated by x∗ (if any). Various CGLPs have been proposed in the litera-
ture; the one chosen for our tests has a size comparable with that of the original
LP, whereas other versions require to roughly double this size. Given x∗ and a
disjunction xj ≤ 0 ∨ xj ≥ 1 violated by x∗ , our CGLP reads:
max xj − d∗ (1)
Ax = d∗ b (2)
d∗ l ≤ x ≤ d∗ l + (x∗ − l) (3)
d∗ u − (u − x∗ ) ≤ x ≤ d∗ u (4)
where d∗ = x∗j > 0 (the two sets of bound constraints can of course be merged).
Given the optimal dual multipliers (λ, −σ , σ , −τ , τ ) associated with the con-
straints of the CGLP, it is possible to derive a most-violated disjunctive cut γx ≥
γ0 , where γ = σ −τ −u0 ej , γ0 = σ l−τ u, and u0 = 1−λb−(σ −σ )+(τ −τ )u.
MIPLIB 3.0 [15] and 2003 [16], and on 15 set covering instances from ORLIB [17].
We used IBM ILOG Cplex 11.2 as black-box LP solver, and to compute a first
heuristic solution to initialize the in-out internal point q. Both schemes are given
a time limit of 1 hour, and generate only one cut at each iteration–taken from
the disjunction associated to the most fractional variable. Cumulative results are
reported in Table 1, where time denotes the geometric mean of the computing
times (CPU seconds on an Intel Q6600 PC running at 2.4 GHz), itr denotes
the geometric mean of the number of iterations (i.e., cuts), cl.gap denotes the
average gap closed w.r.t the best known integer solution, and L&P cl.gap de-
notes the average gap closed w.r.t. the best known upper bound on z1 (this
upper bound is obtained as the minimum between the best-known integer solu-
tion value and the last upper bound on z1 computed by the in-out algorithm).
The results clearly show the effectiveness of in-out search, in particular for set
covering instances.
References
1. Ben-Ameur, W., Neto, J.: Acceleration of cutting-plane and column generation
algorithms: Applications to network design. Networks 49(1), 3–17 (2007)
2. Cornuéjols, G.: Valid inequalities for mixed integer linear programs. Mathematical
Programming 112(1), 3–44 (2008)
3. Cornuéjols, G., Lemaréchal, C.: A convex analysis perspective on disjunctive cuts.
Mathematical Programming 106(3), 567–586 (2006)
4. Naoum-Sawaya, J., Elhedhli, S.: An interior-point branch-and-cut algorithm for
mixed integer programs. Technical report, Department of Management Sciences,
University of Waterloo (2009)
5. Kelley, J.E.: The cutting plane method for solving convex programs. Journal of the
SIAM 8, 703–712 (1960)
6. Elzinga, J., Moore, T.J.: A central cutting plane algorithm for the convex program-
ming problem. Mathematical Programming 8, 134–145 (1975)
7. Ye, Y.: Interior Point Algorithms: Theory and Analysis. John Wiley, New York
(1997)
8. Tarasov, S., Khachiyan, L., Erlikh, I.: The method of inscribed ellipsoids. Soviet
Mathematics Doklady 37, 226–230 (1988)
9. Bland, R.G., Goldfarb, D., Todd, M.J.: The ellipsoid method: a survey. Operations
Research 29(6), 1039–1091 (1981)
10. Atkinson, D.S., Vaidya, P.M.: A cutting plane algorithm for convex programming
that uses analytic centers. Mathematical Programming 69, 1–43 (1995)
11. Nesterov, Y.: Cutting plane algorithms from analytic centers: efficiency estimates.
Mathematical Programming 69(1), 149–176 (1995)
12. Goffin, J.L., Vial, J.P.: On the computation of weighted analytic centers and dual
ellipsoids with the projective algorithm. Mathematical Programming 60, 81–92
(1993)
13. Boyd, S., Vandenberghe, L.: Localization and cutting-plane methods (2007),
http://www.stanford.edu/class/ee364b/notes/
localization methods notes.pdf
14. Balas, E.: Disjunctive programming. Annals of Discrete Mathematics 5, 3–51 (1979)
15. Bixby, R.E., Ceria, S., McZeal, C.M., Savelsbergh, M.W.P.: An updated mixed
integer programming library: MIPLIB 3.0. Optima 58, 12–15 (1998),
http://www.caam.rice.edu/bixby/miplib/miplib.html
16. Achterberg, T., Koch, T., Martin, A.: MIPLIB 2003. Operations Research Let-
ters 34(4), 1–12 (2006), http://miplib.zib.de
17. Beasley, J.: OR-Library: distributing test problems by electronic mail. Journal of
the Operational Research Society 41(11), 1069–1072 (1990),
http://people.brunel.ac.uk/~mastjjb/jeb/info.html
A SAT Encoding for Multi-dimensional Packing
Problems
1 Introduction
The multi-dimensional Orthogonal Packing Problem (OPP) consists in determining if
a set of items of known sizes can be packed in a given container. Although this prob-
lem is NP-complete, efficient algorithms are crucial since they may be used to solve
optimization problems like the strip packing problem, the bin-packing problem or the
optimization problem with a single container.
S. P. Fekete et al. introduced a new characterization for OPP [1]. For each dimen-
sion i, a graph Gi represents the items overlaps in the ith dimension. In these graphs,
the vertices represent the items. The authors proved that solving the d-dimensional or-
thogonal packing problem is equivalent to finding d graphs G1 , . . . , Gd such that (P1)
each graph Gi is an interval graph, (P2) in each graph Gi, any stable set is i-feasible,
that is the sum of the sizes of its vertices is not greater than the size of the container
in dimension i, and (P3) there is no edge which occurs in each of the d graphs. They
propose a complete search procedure [1] which consists in enumerating all possible d
interval graphs, choosing for each edge in each graph whether it belongs to the graph or not.
The condition (P3) is kept satisfied by forbidding, in the remaining graph, the choice of any
edge which already occurs in d − 1 graphs. Each time a graph Gi is an interval graph, the
i-feasibility of its stable sets is verified, computing its maximum weight stable set (the
weights are the sizes of the items in the dimension i). As soon as the three conditions
are satisfied the search stops and the d graphs represent then a class of equivalent solu-
tions to the packing problem. Figure 1 shows an example in two dimensions with two
packings among many others corresponding to the same pair of interval graphs.
There are very few SAT approaches for packing. In 2008 T. Soh et al. proposed a
SAT encoding for the strip packing problem in two dimensions (SPP) [2].
This work is supported by Region Provence-Alpes-Cote-d’Azur and the ICIA Technologies
company. We also thank P. Jegou and D. Habet for helpful discussions.
Fig. 1. Two packings corresponding to the same interval graphs in a two-dimensional space
This problem consists in finding the minimal height of a fixed-width container containing all the items.
For that purpose they perform successive searches with different heights (selected with
a dichotomy search strategy). Each time, the decision problem is encoded in a SAT
formula which is solved with an external SAT solver (Minisat). In their formulation the
variables represent the exact positions of the items in the container. Additional variables
represent the relative positions of the items one with the others (on the left, on the right,
above, under). T. Soh et al. also introduce constraints to avoid reconsidering symmetric
equivalent packings. Finally the new clauses that the SAT solver Minisat generates to
represent the conflicts are memorised and re-used in further searches. This is possible
since successive searches are incremental SAT problems. T. Soh et al. SAT encoding
involves O(W × H × n + n2 ) Boolean variables for a problem with n items and a
container of width W and height H.
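The dichotomic search over candidate heights can be sketched as follows, with a hypothetical oracle is_packable(h) standing for one SAT call on the decision problem for height h (an illustration of the strategy described above, not code from [2]):

```python
def minimal_height(lower, upper, is_packable):
    """Dichotomic search for the smallest height for which the SAT
    instance is satisfiable, between a lower and an upper bound."""
    best = None
    while lower <= upper:
        mid = (lower + upper) // 2
        if is_packable(mid):
            best = mid           # feasible: try to pack in a smaller height
            upper = mid - 1
        else:
            lower = mid + 1      # infeasible: the optimum is strictly larger
    return best
```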
The formulas (1) force each item to occur in at least one clique, while the formulas (2) force each item to occur in consecutive cliques (Fekete et al. property P1: the graphs are interval graphs). The formulas (3) state that no two objects may intersect in all the dimensions (Fekete et al. property P3). The stable set feasibility is enforced by the formulas (4): for each infeasible stable set N ∈ S^i in dimension i, a clause ensures that at least two items of the stable set intersect each other. In fact only the minimal infeasible stable sets are considered. For example, if two items x and y are too large to be packed side by side in the i-th dimension, then {x, y} is a stable set of S^i and the unit clause e^i_{x,y} is generated. The SAT solver will then immediately assign the value true to the variable e^i_{x,y} and propagate it. The formulas (5) forbid empty cliques. Finally, the formulas (6) establish the relations between the Boolean variables.
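For instance, the unit clauses arising from the minimal infeasible stable sets of size two can be enumerated as in the sketch below (the triples returned stand for the literals e^i_{x,y}; the data structures and function name are illustrative, not the authors' generator):

```python
from itertools import combinations

def unit_overlap_clauses(sizes, container):
    """For every dimension i and every pair of items x, y that are too large
    to be placed side by side in dimension i, emit the unit clause e^i_{x,y}."""
    clauses = []
    for i, capacity in enumerate(container):
        for x, y in combinations(range(len(sizes)), 2):
            if sizes[x][i] + sizes[y][i] > capacity:
                clauses.append((i, x, y))  # forces e^i_{x,y} to true
    return clauses
```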
The following constraints are not necessary but they may help during the search:
7. [Consecutive linear ordering (bis)] For all x ∈ O, 1 ≤ a ≤ n, 1 ≤ i ≤ d:
   (c^i_{x,a} ∧ ¬c^i_{x,a+1}) ⇒ (¬c^i_{x,a+2} ∧ . . . ∧ ¬c^i_{x,n})
   (c^i_{x,a} ∧ ¬c^i_{x,a−1}) ⇒ (¬c^i_{x,a−2} ∧ . . . ∧ ¬c^i_{x,1})
8. [Maximal cliques] For all 1 ≤ a ≤ n, 1 ≤ i ≤ d:
   u^i_a ⇔ (c^i_{1,a} ∨ . . . ∨ c^i_{n,a})
   (u^i_a ∧ u^i_{a+1}) ⇒ ((c^i_{1,a} ∧ ¬c^i_{1,a+1}) ∨ . . . ∨ (c^i_{n,a} ∧ ¬c^i_{n,a+1}))
   (u^i_a ∧ u^i_{a+1}) ⇒ ((¬c^i_{1,a} ∧ c^i_{1,a+1}) ∨ . . . ∨ (¬c^i_{n,a} ∧ c^i_{n,a+1}))
3 Experimental Results
3.1 Orthogonal Packing Problem
The problem consists in determining whether a given set of items can be packed into a given container. We compared our approach with that of Fekete et al. on a selection of two-dimensional problems, using as reference the results published by Clautiaux et al. [3]. Table 1 shows the characteristics of the instances, the results of Fekete et al. (FS), and the results of our approach with two models: model M1 corresponds to the formulas (1) to (6) and (9), while model M2 contains, in addition, the optional formulas (7) and (8). All our experiments were run on Pentium IV 3.2 GHz processors with 1 GB of RAM, using Minisat 2.0.
Instance                        FS          M1                             M2
Name     Space  Feas.  n        Time (s)    Time (s)  #var.   #claus.      Time (s)  #var.   #claus.
E02F17 02 F 17 7 4.95 5474 26167 13.9 6660 37243
E02F20 02 F 20 - 5.46 8720 55707 1.69 10416 73419
E02F22 02 F 22 167 7.62 11594 105910 21.7 13570 129266
E03N16 03 N 16 2 39.9 4592 20955 47.3 5644 30259
E03N17 03 N 17 0 4.44 5474 27401 9.32 6660 38477
E04F17 04 F 17 13 0.64 5474 26779 1.35 6660 37855
E04F19 04 F 19 560 3.17 7562 46257 1.43 9040 61525
E04F20 04 F 20 22 5.72 8780 59857 2.22 10416 77569
E04N18 04 N 18 10 161 6462 32844 87.7 7790 45904
E05F20 05 F 20 491 6.28 8780 59710 0.96 10416 77422
Average > 217 23.9 7291 46159 18.8 8727 60894
Our approach outperforms FS on satisfiable instances; the instance E02F20 is not even solved by Fekete et al. within the timeout (15 minutes). On unsatisfiable instances they perform better, probably because they compute very strong bounds (see the DFFs in [4]), which help them detect dead ends very early during the search.
Instance                   Soh et al.   M1                                M2
Name     n   Width   LB                 Height  #var.  #claus.  Time      Height  #var.  #claus.  Time
HT01 16 20 20 20 20 4592 22963 13.3 20 5644 32267 19.4
HT02 17 20 20 20 20 5474 28669 744 20 6660 39745 444
HT03 16 20 20 20 20 4592 24222 18.5 20 5644 33526 25.5
HT04 25 40 15 15 16 16850 271500 1206 19 19396 305392 521
HT05 25 40 15 15 16 16850 337395 438 16 19396 372287 536
HT06 25 40 15 15 16 16850 494500 146 16 19396 528392 295
CGCUT01 16 10 23 23 23 4592 26745 5.89 23 5644 36049 9.71
CGCUT02 23 70 63 65 66 13202 115110 1043 70 15360 188222 1802
GCUT01 10 250 1016 1016 1016 1190 4785 0.11 1016 1606 7237 0.04
GCUT02 23 250 1133 1196 1259 8780 105810 37.3 1196 10416 123522 1241
NGCUT01 10 10 23 23 23 1190 5132 0.23 23 1606 7584 0.09
NGCUT02 17 10 30 30 30 5474 29662 1.6 30 6660 40738 2.74
NGCUT03 21 10 28 28 28 10122 108138 273 28 11924 128542 580
NGCUT04 7 10 20 20 20 434 1661 0.01 20 640 2577 0.01
NGCUT05 14 10 36 36 36 3122 15558 6.01 36 3930 21906 4.44
NGCUT06 15 10 31 31 31 3810 18629 1.92 31 4736 26361 2.91
NGCUT07 8 20 20 20 20 632 2535 0 20 900 3855 0
NGCUT08 13 20 33 33 33 2522 11870 2.74 33 3220 17010 9.73
NGCUT09 18 20 49 50 50 6462 33765 391 50 7790 46825 53.3
NGCUT10 13 30 80 80 80 2522 11790 0.75 80 3220 16930 0.39
NGCUT11 15 30 50 52 52 3810 18507 19.7 52 4736 26239 25.9
NGCUT12 22 30 79 87 87 11594 173575 886 87 13570 196931 24.5
We have proposed a SAT encoding which significantly outperforms the method of Fekete et al. on satisfiable instances. Moreover, we have evaluated this encoding on strip packing problems. In future work we will try to integrate the DFF computation to improve the search on unsolvable problems. We will also try to characterize the situations in which the conflict clauses generated by the SAT solver may be re-used. This occurs in particular when successive calls to the solver are performed, for example when searching for the minimal height in strip packing problems.
References
1. Fekete, S.P., Schepers, J., van der Veen, J.: An exact algorithm for higher-dimensional orthog-
onal packing. Operations Research 55(3), 569–587 (2007)
2. Soh, T., Inoue, K., Tamura, N., Banbara, M., Nabeshima, H.: A SAT-based Method for Solving
the Two-dimensional Strip Packing Problem. In: Proceedings of the 15th RCRA Workshop on
Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion
(2008)
3. Clautiaux, F., Carlier, J., Moukrim, A.: A new exact method for the two-dimensional orthog-
onal packing problem. European Journal of Operational Research 183(3), 1196–1211 (2007)
4. Fekete, S.P., Schepers, J.: A general framework for bounds for higher-dimensional orthogonal
packing problems. Mathematical Methods of Operations Research 60(2), 311–329 (2004)
5. Martello, S., Monaci, M., Vigo, D.: An exact approach to the strip-packing problem. INFORMS Journal on Computing 15(3), 310–319 (2003)
Job Shop Scheduling with Setup Times and Maximal Time-Lags: A Simple Constraint Programming Approach
D. Grimes and E. Hebrard
1 Introduction
Scheduling problems have proven fertile research ground for constraint programming
and other combinatorial optimization techniques. There are numerous such problems occurring in industry, and whilst they are relatively simple in their formulation (they typically involve only sequencing and resource constraints) they remain extremely challenging to solve. After such a long period as an active research topic (more than half a century, back to Johnson's seminal work [18]), it is natural to think that methods specifically engineered for each class of problems would dominate approaches with a broader spectrum. However, it was recently shown [27,15,26] that generic SAT or constraint programming models can approach or even outperform state-of-the-art algorithms for open shop and job shop scheduling. In particular, in a previous work [15] we introduced a constraint model that advantageously trades inference strength for brute-force search speed, combined with adaptive learning-based search heuristics, randomized restarts and a form of nogood learning.
Local search algorithms are generally the most efficient approach for solving job
shop scheduling problems. The best algorithms are based on tabu search, e.g. i-TSAB
[21], or use a CP/local search hybrid [29]. Pure CP approaches can also be efficient,
especially when guided by powerful search strategies that can be thought of as meta-
heuristics [4]. The best CP approach uses inference from the Edge-finding algorithm
[8,22] and dedicated variable ordering heuristics such as Texture [3]. On the other hand,
we take a minimalistic approach to modelling the problem. In particular, whilst most
algorithms consider resource constraints as global constraints, devising specific algo-
rithms to filter them, we simply decompose them into primitive disjunctive constraints
ensuring that two tasks sharing a resource do not run concurrently. To this naive propa-
gation framework, we combine slightly more sophisticated, although generic heuristics
and restart policies. In this work, we have also incorporated the idea of solution guided
search [4].
We showed recently that this approach can be very effective with respect to the state
of the art. However, it is even more evident on variants of these archetypal problems
where dedicated algorithms cannot be applied in a straightforward manner. In the first
variant, running a task on a machine requires a setup time, dependent on the task itself,
and also on the previous task that ran on the same machine. In the second variant, max-
imum time lags between the starting times of successive tasks of each job are imposed.
In both cases, most approaches decompose the problem into two subproblems, for the
former the traveling salesman problem with time windows [1,2] is used, while the latter
can be decomposed into sequencing and timetabling subproblems [10]. On the other
hand, our approach can be easily adapted to handle these additional constraints. Indeed,
it found a number of new best solutions and proved optimality for the first time on some
instances from a set of known benchmarks.
It may appear surprising that such a method, not reliant on domain specific knowl-
edge, and whose components are known techniques in discrete optimization, could be
so effective. We therefore devised some experiments to better understand how the key
component of our approach, the constraint weighting, affects search on these problems.
These empirical results reveal that, although the use of constraint weighting is generally extremely important to our approach, this is not always so. In particular, on no-wait job shop scheduling problems (i.e., problems with a maximal time lag of 0 between tasks), where our approach often outperforms the state of the art, the weights even seem to be detrimental to the algorithm.
In Section 2, we describe our approach. In Section 3, after outlining the experimental
setup, we provide an experimental comparison of our approach with the state-of-the-art
on standard benchmarks for these two problems. Finally we detail the results of our
analysis of the impact of weight learning in these instances in Section 4.
In this section we describe the common ground of constraint models we used to model
the variants of JSP tackled in this paper. We shall consider the minimization of the total
makespan (Cmax ) as the objective function in all cases.
Each pair of tasks t_i, t_j sharing a resource is associated with a Boolean variable b_ij encoding the corresponding disjunct:

  b_ij = 0 ⇔ t_i + p_i ≤ t_j,    b_ij = 1 ⇔ t_j + p_j ≤ t_i

Boolean variables are selected according to the ratio of the task domain sizes over the weight of the disjunctive constraint (heuristic tdom/bweight):

  (dom(t_i) + dom(t_j)) / w(i, j)    (2.4)
Moreover, one can also use the weighted degree associated with the task variables. Let Γ(t_j) denote the set of tasks sharing a resource with t_j. We call w(t_j) = Σ_{t_i ∈ Γ(t_j)} w(i, j) the sum of the weights of every ternary disjunctive constraint involving t_j. We can then define an alternative variable ordering (tdom/tweight) as follows:

  (dom(t_i) + dom(t_j)) / (w(t_i) + w(t_j))    (2.5)
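Read as a selection procedure, the two ratios (2.4) and (2.5) could be implemented as in the following sketch (hypothetical containers: dom_size[t] is the current domain size of task t, weight[(i, j)] the weight w(i, j) of the disjunct, and task_weight[t] the sum w(t)):

```python
def select_disjunct(unassigned, dom_size, weight, task_weight, use_tweight=False):
    """Return the pair (i, j) whose Boolean b_ij minimises ratio (2.4)
    (tdom/bweight) or, if use_tweight is set, ratio (2.5) (tdom/tweight)."""
    def score(pair):
        i, j = pair
        numerator = dom_size[i] + dom_size[j]
        if use_tweight:
            return numerator / (task_weight[i] + task_weight[j])
        return numerator / weight[(i, j)]
    return min(unassigned, key=score)
```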
Value Selection: Our value ordering is based on the solution guided approach (SGM-
PCS) proposed by Beck for JSPs [4]. This approach involves using previous solution(s)
as guidance for the current search, intensifying search around a previous solution in a
similar manner to i-TSAB [21]. In SGMPCS, a set of elite solutions is initially gener-
ated. Then, at the start of each search attempt, a solution is randomly chosen from the
set and is used as a value ordering heuristic for search. When an improving solution
is found, it replaces the solution in the elite set that was used for guidance. The logic
behind this approach is its combination of intensification (through solution guidance)
and diversification (through maintaining a set of diverse solutions).
Interestingly Beck found that the intensification aspect was more important than the
diversification. Indeed, for the JSPs studied, there was little difference in performance
between an elite set of size 1 and larger elite sets (although too large a set did result in
a deterioration in performance). We use an elite set of size 1 in our approach, i.e., once an initial solution has been found, this solution is used, and updated, throughout the search.
Furthermore, up until the first solution is found during dichotomic search, we use a value ordering based on the principle of best promise [11]. The value 0 for b_ij is visited first iff the domain reduction directly induced by the corresponding precedence (t_i + p_i ≤ t_j) is less than that of the opposite precedence (t_j + p_j ≤ t_i).
Restart policy: It has previously been shown that randomization and restarts can greatly
improve systematic search performance on combinatorial problems [12]. We use a geometric restarting strategy [28] with random tie-breaking. The geometric strategy is of the form s, sr, sr², sr³, . . . , where s is the base and r is the multiplicative factor. In our
experiments the base was 64 failures and the multiplicative factor was 1.3. We also
incorporate the nogood recording from restarts strategy of Lecoutre et al. [19], where
nogoods are generated from the final search state when the cutoff has been reached. To
that effect, we use a global constraint which essentially simulates the unit propagation
procedure of a SAT solver. After every restart, for every minimal subset of decisions
leading to a failure, the clause that prevents exploring the same path on subsequent
restarts is added to the base. This constraint is not weighted when a conflict occurs.
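With a base of 64 failures and a factor of 1.3, the sequence of restart cutoffs can be generated as in this small sketch (an illustration of the geometric policy, not the solver's code):

```python
def geometric_cutoffs(base=64, factor=1.3):
    """Yield the geometric restart cutoffs s, s*r, s*r^2, ... in failures."""
    cutoff = float(base)
    while True:
        yield int(round(cutoff))
        cutoff *= factor

# The first cutoffs produced with base 64 and factor 1.3 are 64, 83, 108, 141, 183, ...
```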
3 Experimental Evaluation
We compare our model with state-of-the-art solvers (both systematic and non-systematic) on two variants of the JSP: job shop problems with sequence-dependent setup times and job shop problems with time lags. All our experiments were run on an Intel Xeon 2.66 GHz machine with 12 GB of RAM on Fedora 9. Due to the random component of our algorithm, each instance was solved ten times and we report our results in terms of both the best and the average makespan found per problem. Each algorithm run on a problem had an overall time limit of 3600s.
The number of algorithms we need to compare against makes it extremely difficult to run all experiments in a common setting.¹ We therefore decided to compare with the results taken from the associated papers. Since they were obtained on different machines with overall cutoffs based on different criteria, a direct comparison of CPU time is not possible. However, an improvement on the best known makespan is sufficient to show that our approach is competitive. Therefore, we focus our analysis of the results on the objective value (although we do include the average CPU time over the 10 runs for problems where we proved optimality).
State of the art: This problem represents a challenge for CP and systematic approaches
in general, since the inference from the Edge-finding algorithm is seriously weakened
as it cannot easily take into account the setup times. Therefore there are two main
approaches to this problem. The first by Artigues et al. [1] (denoted AF08 in Table 1)
tries to adapt the reasoning for simple unary resources to unary resources with setup
times. The approach relies on solving a TSP with time windows to find the shortest
permutation of tasks, and is therefore computationally expensive.
¹ The code may be written for a different OS, not publicly available, or not open source.
The second type of approach relies on metaheuristics. Balas et al. [2] proposed combining a shifting bottleneck algorithm with guided local search (denoted BSV08 in Table 1; their results for t2-pss-*06-11 and 14-15 are taken from http://www.andrew.cmu.edu/user/neils/tsp/outt2.txt), where the problem is also decomposed into a TSP with time windows. Hybrid genetic algorithms have also been proposed by González et al. for this problem: first a hybrid GA with local search [13], and more recently a GA combined with tabu search [14] (denoted GVV08 and GVV09 respectively in Table 1). For both GA hybrids, the problem is modeled using the disjunctive graph representation.
In our model, the setup times are simply added within the ternary disjuncts: b_ij = 0 ⇔ t_i + p_i + s_{i,j} ≤ t_j, and b_ij = 1 ⇔ t_j + p_j + s_{j,i} ≤ t_i.
Evaluation: Table 1 summarizes the results of the state of the art and of our approach on a set of benchmarks proposed by Brucker and Thiele [7]. The problems are grouped based on the number of jobs and machines (n×m): *01-05 are of size 10×5, *06-10 are of size 15×5, while *11-15 are of size 20×5. Each step of the dichotomic search had a 30-second cutoff, and the search heuristic used was tdom/bweight. We use the following
notation for Table 1 (we shall reuse it for Tables 3 and 4): underlined values denote that optimality was proven, boldface values denote the best value achieved by any method, and values marked with a star (*) denote instances where our approach improved on the best known solution or built the first proof of optimality. We also include the average time over the 10 runs when optimality was proven (a dash means that optimality was not proven before reaching the one-hour cutoff).
We report the first proof of optimality for four instances (t2-ps09, t2-pss06,
t2-pss07, t2-pss10) and 8 new upper bounds for t2-pss* instances (however it
should be noted that there is no comparison available for GVV09 on these 8 instances).
In general, our approach is competitive with the state-of-the-art (GVV09) and outper-
forms both dedicated systematic and non-systematic solvers.
This type of constraint arises in many situations. For instance, in the steel industry, the time lag between the heating of a piece of steel and its moulding must be small. Similarly, when scheduling chemical reactions, the reactants often cannot be stored for a long period of time between two stages of a process, to avoid interactions with external elements. This type of problem has been studied in a number of areas, including the steel and chemical industries [24].
State of the art: Caumond et al. introduced in 2008 a genetic algorithm able to deal
with general time lag constraints [9]. However most of the algorithms introduced in the
literature have been designed for a particular case of this problem: the no-wait job shop. In this case the maximum time lag is zero, i.e., each task of a job must start immediately after its preceding task has finished.
For the no-wait job shop problem, the best methods are a tabu search method
by Schuster (TS [25]), another metaheuristic introduced by Framinian and Schuster
(CLM [10]) and a hybrid constructive/tabu search algorithm introduced by Bozėjko
and Makuchowski in 2009 (HTS [6]). We report the best results of each paper. It should
be noted that for HTS, the authors reported two sets of results, the ones we report for
the “hard” instances were “without limit of computation time”.
Specific Implementation Choices: The constraints representing time lags between two tasks of a job are simple precedences in our model. For instance, a time lag l_i between t_i and t_{i+1} is represented by the constraint t_{i+1} − (p_i + l_i) ≤ t_i.
Although our generic model was relatively efficient on these problems, we made a
simple improvement for the no-wait class based on the following observation: if no
delay is allowed between any two consecutive tasks of a job, then the start time of every
task is functionally dependent on the start time of any other task in the job. The tasks
of each job can thus be viewed as one block. In other words we really need only one
task in our model to represent all the tasks of a job. We therefore use only n variables
standing for the jobs: {Jx | 1 ≤ x ≤ n}.
Let h_i be the total duration of the tasks coming before task t_i in its job. That is, if job J = {t_1, . . . , t_m}, we have h_i = Σ_{k<i} p_k. For every pair of tasks t_i ∈ J_x, t_j ∈ J_y sharing a machine, we use the same Boolean variables to represent the disjuncts as in the original model, now linked by the following constraints: b_ij = 0 ⇔ J_x + h_i + p_i − h_j ≤ J_y, and b_ij = 1 ⇔ J_y + h_j + p_j − h_i ≤ J_x.
Notice that while the variables and constants are different, these are still exactly the
same ternary disjuncts used in the original model.
The no-wait job shop scheduling problem can therefore be reformulated as follows,
where the variables J1 , . . . , Jn represent the start time of the jobs, Jx(i) stands for the
job of task ti , and f (i, j) = hi + pi − hj .
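The offsets h_i and the constants f(i, j) used in this reformulation are straightforward to precompute, as in the sketch below (jobs given as ordered lists of task indices, p[t] the processing time of task t; names are illustrative):

```python
def head_offsets(jobs, p):
    """h[t] = total duration of the tasks that precede task t in its job."""
    h = {}
    for job in jobs:
        elapsed = 0
        for t in job:
            h[t] = elapsed
            elapsed += p[t]
    return h

def f(i, j, h, p):
    """Constant f(i, j) = h_i + p_i - h_j of the job-level disjunct."""
    return h[i] + p[i] - h[j]
```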
our model on the instances used in that paper, where instances are grouped based on
type (car (4 instances) / la (3 instances)) and maximum time lag (0.5 / 1 / 2).
For the no-wait job shop problem, we first present our results in terms of each solver's average percentage relative deviation (PRD) from the reference values given in [6] per problem set in Table 2b. The PRD is given by the formula PRD = 100 · (C_Alg − C_Ref)/C_Ref, where C_Alg is the best makespan found by the algorithm and C_Ref is the reference makespan for the instance given in [6]. There are 82 instances overall.
Interestingly, the search heuristic tdom/tweight performed much better with our no-
wait model than tdom/bweight, thus we report the results for this heuristic. This was
somewhat surprising because this heuristic is less discriminatory as the task weights
for a Boolean are the weights of the two jobs, which will be the same for all Booleans
between these two jobs. Further investigation revealed that ignoring the weight yielded
better results on a number of problems. Thus we also include the heuristic tdom.
Our approach was better than the local search approaches on the smaller problem sets, and remained competitive on the larger problem sets. In Table 3 we provide results for the instances regarded as easy in [6]; these had been proven optimal by Mascis and Pacciarelli [20].
Table 3. NW-JSP: Comparison vs state-of-the-art on easy instances (best & mean Cmax , 10
runs).
Table 4. NW-JSP: Improvement on hard instances (best & mean Cmax , 10 runs)
We proved optimality on all these instances, in under 10s for most cases. It is of interest
to note that tdom was nearly always quicker than tdom/tweight at proving optimality.
In Table 4, we report results for the “hard” instances where our approach found an
improving solution, and the first proofs of optimality for 10 (la12, la21-25, la36 and
la38-40) of the 53 open problems.
To analyse the impact of the weights on search, we parameterise the variable ordering heuristic with a constant K added to the weight (denoted tdom/(K + bweight)):

  (dom(t_i) + dom(t_j)) / (w(i, j) + K)    (4.1)
We can therefore tune the impact of the weights in the variable choice, by setting the
constant K. As K increases, the role of the weights is increasingly restricted to a tie
breaker. We selected a subset of instances small enough to be solved by tdom/(∞ +
bweight). For the selected subset of small instances, we ran each version of the heuristic
ten times with different random seeds. We report the average CPU time across the ten runs in Table 5. When a run went over the one-hour time cutoff, we report instead the deviation from the optimal solution (as a percentage).
Table 5. Weight evaluation: CPU time or deviation from the optimal for increasing values of K

                 tdom/(K + bweight)
Instance         K = 0     K = 10    K = 100   K = 1000   K = 10000   K = 100000   K = ∞
t2-ps07          26.55     23.33     26.67     41.60      77.27       403.90       +12.9%
t2-ps08          41.08     35.85     93.60     128.96     194.96      665.28       +9.8%
t2-ps09          971.83    956.63    948.28    957.85     1164.94     1649.19      +8.8%
t2-ps10          13.04     13.95     13.63     19.44      100.25      422.24       +15.7%
la07_0_3         +0.0%     +0.0%     +0.0%     +0.0%      +0.0%       +0.0%        +5.8%
la08_0_3         15.63     12.45     23.03     30.22      117.50      391.99       3098.87
la09_0_3         1.61      0.51      1.44      10.16      129.62      169.02       2115.98
la10_0_3         3.42      2.25      0.41      0.69       1.39        3.44         39.66
la07_0_0         1751.16   549.58    392.71    151.70     66.18       49.67        57.28
la08_0_0         2231.18   575.44    309.04    113.95     42.04       35.74        38.63
la09_0_0         2402.76   1291.29   691.96    407.68     147.73      89.28        102.03
la10_0_0         3274.86   833.28    214.51    53.75      26.85       26.51        30.82
For the job shop with setup times, the best compromise is K = 10. For very large values of K, the domain size of the tasks takes complete precedence over the weights, and the performance degrades. However, as long as the weights are present in the selection process, even simply as a tie breaker, the CPU time stays within one order of magnitude of the best value of K. On the other hand, when the weights are completely ignored, the algorithm is not able to solve any of the instances; indeed the gap to optimality is quite large, around 9% to 15%.
For the job shop with time lags, the situation is a little different. As in the previous case, the best compromise is K = 10 and the performance degrades slowly as K increases. However, even when the weights are completely ignored, the CPU time stays within a few orders of magnitude of the best case. Finally, for the no-wait job shop, we observe that the opposite is true: rather than increasing with K, the CPU time actually decreases when K grows.
One important feature of a heuristic is its capacity to focus the search on a small
subset of variables that would constitute a backdoor of the problem. It is therefore inter-
esting to find out if there is a correlation between a high level of inequality in the weight
distribution and the capacity to find small backdoors. We used the Gini coefficient to
characterize the weight distribution. The Gini coefficient is a metric of inequality, used
for instance to analyse distribution of wealth in social science.
The Gini coefficient is based on the Lorenz curve, mapping the cumulative pro-
portion of income y of a fraction x of the poorest population. When the distribution
is perfectly fair, the Lorenz curve is y = x. The Gini coefficient is the ratio of the
area lying between the Lorenz curve and x = y, over the total area below x = y.
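For reference, the Gini coefficient of a weight distribution can be computed from the Lorenz curve exactly as described, e.g. as in the following sketch (a standard textbook formulation, not the code used in the experiments):

```python
def gini(weights):
    """Gini coefficient of a list of non-negative weights: 0 for a perfectly
    even distribution, close to 1 when a few weights dominate."""
    xs = sorted(weights)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    area = 0.0          # area under the Lorenz curve, by trapezoids
    cumulative = 0.0
    for x in xs:
        previous = cumulative
        cumulative += x
        area += (previous + cumulative) / (2.0 * total) * (1.0 / n)
    return 1.0 - 2.0 * area
```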
When the search focuses toward a given set of variables from which a short proof of unsatisfiability can be extracted, the Gini coefficient of the weight distribution typically increases. Figure 2 plots, for t2-ps09 and la09_0_0, the ratios of searched variables and of choice points against the search depth (left), and the Gini coefficient of the weight distribution against the number of searched nodes (right).
Fig. 2. Search tree and weight distribution for t2-ps09 and la09_0_0
of the number of choice points, that is nodes of the search tree, at each depth, over the
total number of explored nodes.
Clearly, for t2-ps09, where the weights are useful, the search is more focused at lower depths, and on a smaller ratio of variables. Indeed, the cumulative ratio of searched variables tops out at 0.3 (see Figure 2a). On the other hand, for la09_0_0, new choice points are opened even very deep in the tree (the ratio of choice points is more spread out), and they involve a large proportion of new variables (the cumulative ratio of searched variables increases almost linearly up to 0.6). The evolution of the Gini coefficient during search is, however, very similar in both cases (see Figure 2b).
One possibility is that the build-up of contention is more important for the no-wait problems, due to the stronger propagation between tasks of the same job. Preliminary results suggest that, initially, both tdom and tdom/bweight repeatedly select Booleans between the same pair of jobs once a pair has been selected. The heuristics diverge when search backs up from deep in the tree: tdom will still often choose Booleans from the same pair of jobs as the variable above the choice point, while the weights learnt deep in the search may lead the heuristics that use bweight and tweight to choose variables associated with a different pair of jobs. Obviously, this effect will be stronger for bweight, as the weights are associated with individual Booleans.
5 Conclusions
We have shown how our constraint model can be easily extended to handle two variants
of the job shop scheduling problem. In both cases we found our approach to be compet-
itive with the state-of-the-art, most notably in proving optimality on some of the open
problems of both problem types.
Whereas it appeared to uniformly improve search efficiency for standard job shop
and open shop scheduling problems, our analysis of constraint weighting revealed that
it can actually be detrimental for some variants of these problems.
References
1. Artigues, C., Feillet, D.: A branch and bound method for the job-shop problem with
sequence-dependent setup times. Annals OR 159(1), 135–159 (2008)
2. Balas, E., Simonetti, N., Vazacopoulos, A.: Job shop scheduling with setup times, deadlines
and precedence constraints. J. of Scheduling 11(4), 253–262 (2008)
3. Beck, J.C., Davenport, A.J., Sitarski, E.M., Fox, M.S.: Texture-Based Heuristics for Schedul-
ing Revisited. In: AAAI 1997, pp. 241–248 (1997)
4. Beck, J.C.: Solution-Guided Multi-Point Constructive Search for Job Shop Scheduling. Journal of Artificial Intelligence Research 29, 49–77 (2007)
5. Boussemart, F., Hemery, F., Lecoutre, C., Sais, L.: Boosting Systematic Search by Weighting
Constraints. In: ECAI 2004, pp. 482–486 (2004)
6. Bozejko, W., Makuchowski, M.: A fast hybrid tabu search algorithm for the no-wait job shop
problem. Computers & Industrial Engineering 56(4), 1502–1509 (2009)
7. Brucker, P., Thiele, O.: A branch and bound method for the general-shop problem with sequence-dependent setup times. Operations Research Spektrum 18, 145–161 (1996)
8. Carlier, J., Pinson, E.: An Algorithm for Solving the Job-shop Problem. Management Sci-
ence 35(2), 164–176 (1989)
9. Caumond, A., Lacomme, P., Tchernev, N.: A memetic algorithm for the job-shop with time-
lags. Computers & OR 35(7), 2331–2356 (2008)
10. Framinan, J.M., Schuster, C.J.: An enhanced timetabling procedure for the no-wait job shop
problem: a complete local search approach. Computers & OR 33, 1200–1213 (2006)
11. Geelen, P.A.: Dual viewpoint heuristics for binary constraint satisfaction problems. In: Proc.
Tenth European Conference on Artificial Intelligence, ECAI 1992, pp. 31–35 (1992)
12. Gomes, C.P., Selman, B., Kautz, H.: Boosting combinatorial search through randomization.
In: AAAI 1998, pp. 431–437 (1998)
13. González, M.A., Vela, C.R., Varela, R.: A new hybrid genetic algorithm for the job shop
scheduling problem with setup times. In: ICAPS, pp. 116–123. AAAI, Menlo Park (2008)
14. González, M.A., Vela, C.R., Varela, R.: Genetic algorithm combined with tabu search for the
job shop scheduling problem with setup times. In: Mira, J., Ferrández, J.M., Álvarez, J.R.,
de la Paz, F., Toledo, F.J. (eds.) IWINAC 2009. LNCS, vol. 5601, pp. 265–274. Springer,
Heidelberg (2009)
15. Grimes, D., Hebrard, E., Malapert, A.: Closing the Open Shop: Contradicting Conventional
Wisdom. In: Gent, I.P. (ed.) CP 2009. LNCS, vol. 5732, pp. 400–408. Springer, Heidelberg
(2009)
16. Grimes, D., Hebrard, E., Malapert, A.: Closing the Open Shop: Contradicting Conventional
Wisdom on Disjunctive Temporal Problems. In: 14th ERCIM International Workshop on
Constraint Solving and Constraint Logic Programming, CSCLP 2009 (2009)
17. Hodson, A., Muhlemann, A.P., Price, D.H.R.: A microcomputer based solution to a practi-
cal scheduling problem. The Journal of the Operational Research Society 36(10), 903–914
(1985)
18. Johnson, S.M.: Optimal two- and three-stage production schedules with setup times included.
Naval Research Logistics Quarterly 1(1), 61–68 (1954)
19. Lecoutre, C., Sais, L., Tabary, S., Vidal, V.: Nogood Recording from Restarts. In: IJCAI
2007, pp. 131–136 (2007)
20. Mascis, A., Pacciarelli, D.: Job-shop scheduling with blocking and no-wait constraints. Eu-
ropean Journal of Operational Research 143(3), 498–517 (2002)
21. Nowicki, E., Smutnicki, C.: An Advanced Tabu Search Algorithm for the Job Shop Problem.
Journal of Scheduling 8(2), 145–159 (2005)
22. Nuijten, W.: Time and Resource Constraint Scheduling: A Constraint Satisfaction Approach.
PhD thesis, Eindhoven University of Technology (1994)
23. Raaymakers, W.H.M., Hoogeveen, J.A.: Scheduling multipurpose batch process industries
with no-wait restrictions by simulated annealing. European Journal of Operational Re-
search 126(1), 131–151 (2000)
24. Rajendran, C.: A no-wait flowshop scheduling heuristic to minimize makespan. The Journal
of the Operational Research Society 45(4), 472–478 (1994)
25. Schuster, C.J.: No-wait job shop scheduling: Tabu search and complexity of problems. Math.
Meth. Oper. Res. 63, 473–491 (2006)
26. Schutt, A., Feydy, T., Stuckey, P.J., Wallace, M.: Why cumulative decomposition is not as
bad as it sounds. In: Gent, I.P. (ed.) CP 2009. LNCS, vol. 5732, pp. 746–761. Springer,
Heidelberg (2009)
27. Tamura, N., Taga, A., Kitagawa, S., Banbara, M.: Compiling finite linear CSP into SAT. In:
Benhamou, F. (ed.) CP 2006. LNCS, vol. 4204, pp. 590–603. Springer, Heidelberg (2006)
28. Walsh, T.: Search in a Small World. In: IJCAI 1999, pp. 1172–1177 (1999)
29. Watson, J.-P., Beck, J.C.: A Hybrid Constraint Programming / Local Search Approach to
the Job-Shop Scheduling Problem. In: Perron, L., Trick, M.A. (eds.) CPAIOR 2008. LNCS,
vol. 5015, pp. 263–277. Springer, Heidelberg (2008)
On the Design of the Next Generation Access Networks
S. Gualandi, F. Malucelli, and D.L. Sozzi
1 Introduction
In the last decade, network design has been one of the most important application domains for Integer Programming methods. Typical application areas are transportation and telecommunications, where even a small optimization factor can have an important economic impact.
Network design has also been a source of applications for Constraint Programming; see, e.g., Simonis [1] for a recent overview.
In this paper, we present the challenges that arise in optimizing the design of Next Generation Access Networks, which are completely based on fiber cable technology and, in certain cases, may reach individual users, for which reason they are called Fiber To The Home (FTTH) networks.
The new network characteristics and the upcoming deployment motivate the investigation of quantitative optimization models and planning algorithms that can help investors decide which type of fiber network to select and how to implement it operationally: where to install the central offices (the centers, connected to the backbone network, that manage customer connections) and the possible intermediate cabinets, and how to reach the users while taking the network link capacities into account.
Fig. 1. A small example on the Politecnico di Milano campus, with two candidate sites for the central offices O_i (downward trapezia), three candidate sites for the cabinets C_j (upward trapezia), and nine basements S_l (circles)
(i, j) with i ∈ O and j ∈ C if dij ≤ L1 , and there is the edge (j, l) with j ∈ C
and l ∈ S if djl ≤ L2 .
The two level nature of the problem is quite evident. Further on we will
use superscript 1 to denote the level between central offices and cabinets, and
superscript 2 to denote that between cabinets and customers.
The problem studied here, to the best of our knowledge, was not considered be-
fore in the optimization literature. A related network design problem is presented
in [3], where, given the positions of the central offices and of the users, the problem consists in finding the positions of the optical splitters so as to minimize the overall cost. In that model there is no actual list of candidate sites, since a rural (or greenfield) scenario is considered, and the coordinates of the splitter positions are part of the decision variables of the problem. A mixed-integer nonlinear model is presented with the sole purpose of formulating the problem; it is not exploited in the heuristic algorithm.
Our problem is related to the Two-level Uncapacitated Facility Location prob-
lem (TUFL), well studied in the literature (e.g., see [4] for a polyhedral study
and see [5] for recent advances in approximation algorithms). However, there are two important differences from our problem: first, the capacity constraints, which are not considered in the literature, and, second, the multiplexing technology constraints, which make the problem much more complex. Note that, unlike in the TUFL, in our case distinct paths from a central office to the customers are not profitable, since the multiplexing occurring at the cabinets can merge several links coming from the secondary network into a single link of the primary network. Thus the techniques developed for the TUFL can hardly be exploited in our case.
Hybrid constraint and integer programming methods for network problems are presented in [1]. Recently, [6] tackled a problem of routing and wavelength assignment on optical networks: for each demand the set of frequencies is given, and the problem consists in deciding which demands to select and how to route them. To solve this problem a decomposition approach is implemented, using a MIP model to solve the allocation subproblem, i.e., to select and route a subset of demands, while the wavelength assignment problem is formalized as a graph coloring problem and solved with constraint programming. In case the CP subproblem becomes infeasible, the MIP allocation problem is relaxed. Another network design problem is presented in [7], where particular attention is paid to the problem of breaking symmetries.
A hybrid local search and constraint propagation method for a network routing problem is presented in [8]: given a directed capacitated network and a set of traffic demands, the problem consists in minimizing the cost of the lost traffic demands.
activated in site j. We need another set of binary variables x²_jl whose value is 1 if basement l is assigned to cabinet j. Integer variables x¹_ij give the number of fibers connecting central office i with cabinet j. The last two sets of variables are defined for all pairs (i, j) and (j, l) such that the distance between the corresponding sites is less than or equal to the maximum allowed distance. In order to consider only pairs of sites within a feasible distance, we introduce a set E including all pairs (i, j) with i ∈ O and j ∈ C such that d_ij ≤ L1, and a set F of pairs (j, l) with j ∈ C and l ∈ S such that d_jl ≤ L2.
The Integer Programming model is as follows:

  min   Σ_{i∈O} s¹_i y¹_i + Σ_{j∈C} Σ_{t∈T} s²_jt y²_jt + Σ_{(i,j)∈E} c¹_ij x¹_ij + Σ_{(j,l)∈F} c²_jl x²_jl    (1)

  s.t.  Σ_{j:(j,l)∈F} x²_jl = 1                             ∀l ∈ S,             (2)
        Σ_{j:(i,j)∈E} x¹_ij ≤ M¹_i y¹_i                     ∀i ∈ O,             (3)
        Σ_{t∈T} y²_jt ≤ 1                                   ∀j ∈ C,             (4)
        Σ_{l:(j,l)∈F} x²_jl ≤ M²_j Σ_{t∈T} y²_jt            ∀j ∈ C,             (5)
        Σ_{t∈T} y²_jt ≤ Σ_{i:(i,j)∈E} x¹_ij                 ∀j ∈ C,             (6)
        m_t Σ_{i:(i,j)∈E} x¹_ij ≥ Σ_{l:(j,l)∈F} x²_jl − M²_j (1 − y²_jt)    ∀j ∈ C, ∀t ∈ T.    (7)
Constraints (2) state that each user must be connected to a cabinet. Constraints (3) are twofold: they force the activation of central office i (i.e., they set variable y¹_i to 1) if at least one cabinet j is assigned to it, and they limit the number of cabinets assigned to i according to its capacity. Constraints (4) state that either a cabinet is not active (when the left-hand side is equal to 0) or at most one multiplexing technology is assigned to it. Constraints (6) state that if a cabinet is activated it must be connected to a central office, while constraints (7) relate the number of fibers entering a cabinet from the users to the number of fibers going out towards the central office. This number must account for the multiplexing factor installed in the cabinet. Note that, in the group of constraints (7) referring to a given cabinet, at most one is significant, while the others are made redundant by the big constants.
The objective function (1) accounts for the cost s¹_i of each activated central office, the cost s²_jt of installing technology t in cabinet j, and the connection costs for the fibers between central offices and cabinets and between cabinets and the users.
Note that for real-life instances the ILP model (1)–(11) has more than one million variables and constraints, and cannot be solved in a reasonable time with a pure Integer Programming approach.
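For concreteness, a compact sketch of model (1)-(7) written with the PuLP library is shown below (an illustration only: the data containers O, C, S, T, E, F and the cost, capacity and multiplexing parameters are assumed to be given, and the remaining integrality and bound constraints are implicit in the variable declarations):

```python
import pulp

def build_ftth_model(O, C, S, T, E, F, s1, s2, c1, c2, M1, M2, m):
    """Sketch of ILP (1)-(7); E holds (office, cabinet) pairs, F holds
    (cabinet, basement) pairs within the allowed distances."""
    prob = pulp.LpProblem("FTTH", pulp.LpMinimize)
    y1 = {i: pulp.LpVariable("y1_%s" % i, cat="Binary") for i in O}
    y2 = {(j, t): pulp.LpVariable("y2_%s_%s" % (j, t), cat="Binary")
          for j in C for t in T}
    x1 = {e: pulp.LpVariable("x1_%s_%s" % e, lowBound=0, cat="Integer") for e in E}
    x2 = {f: pulp.LpVariable("x2_%s_%s" % f, cat="Binary") for f in F}

    # (1) installation costs plus connection costs
    prob += (pulp.lpSum(s1[i] * y1[i] for i in O)
             + pulp.lpSum(s2[j][t] * y2[(j, t)] for j in C for t in T)
             + pulp.lpSum(c1[e] * x1[e] for e in E)
             + pulp.lpSum(c2[f] * x2[f] for f in F))

    for l in S:   # (2) every basement is connected to exactly one cabinet
        prob += pulp.lpSum(x2[(j, l)] for j in C if (j, l) in F) == 1
    for i in O:   # (3) office activation and capacity
        prob += pulp.lpSum(x1[(i, j)] for j in C if (i, j) in E) <= M1[i] * y1[i]
    for j in C:
        offices = [i for i in O if (i, j) in E]
        basements = [l for l in S if (j, l) in F]
        prob += pulp.lpSum(y2[(j, t)] for t in T) <= 1                      # (4)
        prob += (pulp.lpSum(x2[(j, l)] for l in basements)
                 <= M2[j] * pulp.lpSum(y2[(j, t)] for t in T))              # (5)
        prob += (pulp.lpSum(y2[(j, t)] for t in T)
                 <= pulp.lpSum(x1[(i, j)] for i in offices))                # (6)
        for t in T:   # (7) multiplexing factor of the installed technology
            prob += (m[t] * pulp.lpSum(x1[(i, j)] for i in offices)
                     >= pulp.lpSum(x2[(j, l)] for l in basements)
                        - M2[j] * (1 - y2[(j, t)]))
    return prob
```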
3 Computational Approaches
We have developed two approaches to solve the FTTH problem. The first approach
is an LP-based Randomized Rounding algorithm, the second is a Constraint-based
Local Search algorithm. Both approaches are implemented exploiting features of
the Comet constraint language [9].
The continuous relaxation (P) is obtained by relaxing the integrality requirements on the variables:

  0 ≤ y¹_i ≤ 1,   0 ≤ y²_jt ≤ 1,   x¹_ij ≥ 0,   0 ≤ x²_jl ≤ 1.    (12)
Problem (P) can be solved easily with standard linear programming software.
An alternative option could utilize the Volume algorithm, but we leave this to
future investigations.
Our LP-based Randomized Rounding algorithm is based on the observation that once we have decided which central offices and which cabinets to open, that is, once the variables y¹ and y² have been fixed to either 1 or 0, the remaining problem reduces to a generalized minimum cost flow problem on a two-level bipartite graph. Even if the generalized minimum cost flow problem is polynomial (e.g., see [11]), we solve it with linear programming software.
We define three auxiliary subproblems:
1. The Continuous Generalized Minimum Cost Flow Problem (C-GFP), obtained by fixing all the location variables y¹_i and y²_jt either to 1 or to 0.
2. The Partial Generalized Minimum Cost Flow Problem (P-GFP), obtained by fixing to 1 some selected y¹_i and y²_jt variables and leaving the remaining ones open (that is, we do not fix any variable to 0).
3. The Integer Generalized Minimum Cost Flow Problem (I-GFP), obtained by adding the integrality constraints to (C-GFP).
This corresponds to normalizing, for each central office i, the sum of the values x̄¹_ij assigned in the LP relaxation to the link variables entering i by the sum of all the link variables x̄¹_i′j. For the variables y², on the contrary, we perform a standard randomized rounding. The randomized rounding is preceded by a preprocessing phase that fixes to 1 all the facility variables having a value greater than δ.
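A possible reading of this rounding step is sketched below (illustrative names: y1_bar, y2_bar and x1_bar hold the LP values, and the threshold delta is a parameter of the preprocessing phase):

```python
import random

def round_offices(O, C, E, y1_bar, x1_bar, delta, rng=random):
    """Open office i either because its LP value exceeds delta (preprocessing)
    or with a probability given by its normalized incident LP flow."""
    total_flow = sum(x1_bar[e] for e in E)
    opened = set()
    for i in O:
        if y1_bar[i] >= delta:
            opened.add(i)
            continue
        flow_i = sum(x1_bar[(i, j)] for j in C if (i, j) in E)
        if total_flow > 0 and rng.random() < flow_i / total_flow:
            opened.add(i)
    return opened

def round_cabinets(C, T, y2_bar, rng=random):
    """Standard randomized rounding of the cabinet/technology variables."""
    return {(j, t) for j in C for t in T if rng.random() < y2_bar[(j, t)]}
```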
CBLS Model. The CBLS approach proposed in this paper is based on a model
different from (1)–(11), and it relies on the use of invariants (see [13]) to incre-
mentally maintain the necessary information to guide the search procedure. In
order to use a different notation from the ILP formulation we will use upper case
letters to denote the variables of the CBLS model.
The decision variables are the following:
– For each basement l there is an integer variable X²_l whose domain is the subset of cabinets C_l ⊆ C reachable from l, i.e., C_l = {j | (j, l) ∈ F}. X²_l = j means that basement l is linked to cabinet j.
– For each cabinet j there is an integer variable Z_j with domain T. Z_j = t means that the t-th multiplexing technology is installed in cabinet j.
– For each possible link (i, j) ∈ E there is an integer variable X¹_ij (equivalent to the variable x¹_ij), which gives the number of fibers installed between central office i and cabinet j.
These are the actual decision variables: once they have been determined, we can derive which central offices and which cabinets are open (through the corresponding invariants).
Once we have assigned a value to each decision variable, and these values have
propagated to the invariants, the objective function is computed as follows:
  Σ_{i∈O} s¹_i Y¹_i + Σ_{j∈C: Y²_j=1} s²_{j,Z_j} + Σ_{(i,j)∈E} c¹_ij X¹_ij + Σ_{l∈S} c²_{X²_l,l}    (14)
Note that in the second and the fourth terms we use variable subscripting, as in an element constraint: a variable appears in the subscript of a cost parameter, which is not possible in Integer Linear Programming.
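The variable subscripting in (14) translates directly into an evaluation of the objective for a complete assignment, for example as in this sketch (illustrative Python data structures; the actual implementation maintains the value incrementally through Comet invariants):

```python
def cbls_objective(X2, Z, X1, open_offices, open_cabinets, s1, s2, c1, c2):
    """Evaluate objective (14): X2[l] is the cabinet serving basement l,
    Z[j] the technology installed in cabinet j, X1[(i, j)] the number of
    fibers on link (i, j)."""
    cost = sum(s1[i] for i in open_offices)
    cost += sum(s2[j][Z[j]] for j in open_cabinets)        # s2_{j, Z_j}
    cost += sum(c1[e] * n for e, n in X1.items())
    cost += sum(c2[(X2[l], l)] for l in X2)                # c2_{X2_l, l}
    return cost
```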
4 Computational Results
The two approaches presented in this paper are evaluated on realistic instances. By realistic we mean that they are randomly generated in such a way as to be as close as possible to the real scenario of the metropolitan area of the city of Rome. Using the street graph of Rome, with the link lengths in meters, we have generated 21 different instances, using values for the installation and deployment costs and for the central office and cabinet capacities as provided by our collaborators working at Alcatel-Lucent.
The biggest instance has 35 candidate sites for the central offices (consider that the currently operated traditional network in Milan has 28 central offices, but these would be more than really necessary in a fiber-based network), 150 candidate sites for the cabinets, and 10,000 basements. This is equivalent to serving approximately 300,000 final users. The smaller instances are generated in order to compare our heuristics with an exact ILP method.
Programming solver. The tests were carried out on a computer running 32-bit Linux Ubuntu, with an Intel Q6600 CPU (quad-core 2.4 GHz, 8 MB L2 cache, 1066 MHz FSB) and 4 GB of RAM.
                          LP-RR                         CBLS
Inst.  |O|  |C|  |S|      Cost    Time   Best-Cost      Cost    Time   Best-Cost
1 3 10 100 2383 31 2383 2383 0.6 2383
2 10 35 400 6979 716 6966 6864 1.2 6860
3 15 65 841 13630 1735 13599 13349 44.6 13306
4 20 100 1521 25499 2465 25427 24850 316 24752
5 25 120 3025 55073 4768 55052 51752 330 51646
6 30 140 6084 121794 7705 121974 118224 1105 118135
7 35 150 10000 239668 26915 239668 229677 1817 229244
have computed the average over each run. Each instance is described in terms of the number of central office and cabinet candidate sites and the number of basements. For each instance the table reports the Cost and the running Time averaged over 5 runs, the corresponding standard deviations (stdev), the Best-Cost, and the optimal solution (IP) found with an ILP solver. The last column gives the percentage gap with respect to the optimal solution. Note that the CBLS is fast, and it also provides solutions with a very small percentage gap; in particular, for the smaller instances it does find the optimum.
Table 2. Solving small Rome instances with the CBLS approach: gaps with respect to
the optimal solution computed with SCIP. Cost and Time (in seconds) are averaged
over 5 runs for each instance.
Inst. |O| |C| |S| Cost (stdev) Time (stdev) Best-Cost IP Gap
8 5 10 109 3244 0.04% 1.5 2.7% 3243 3243 0.0%
9 10 20 204 13888 0.00% 2.3 1.4% 13888 13888 0.0%
10 20 100 1462 419929 0.04% 87.1 0.6% 419823 417554 0.5%
11 25 120 3139 1011821 0.02% 567.7 0.7% 1011457 1009710 0.2%
Finally, Table 3 reports additional results for the other, bigger Rome instances, reporting the percentage gap (LP-Gap) computed with respect to the value of the linear relaxation of the problem. The CBLS is quite stable, both in the quality of the solutions and in the computation time required. For the bigger instances, those with 10,000 basements, the computation time can be more than one hour (see instances 18, 20, and 21), but it is still always better than that of the LP-RR algorithm. We remark that the percentage gap with respect to the lower bound computed by solving the linear relaxation (P) is at worst 2.4%.
Table 3. Solving big Rome instances with the CBLS approach: gaps computed with
respect to the linear relaxation (P).
Inst. |O| |C| |S| Cost (stdev) Time (stdev) Best-Cost LP-Gap
12 30 140 5960 4558323 (0.02%) 1350.1 (0.46%) 4557601 1.1%
13 30 140 5981 3954325 (0.01%) 1008.4 (0.05%) 3953619 1.2%
14 30 140 5982 4561215 (0.01%) 1803.6 (0.14%) 4560780 0.9%
15 30 140 5995 4164941 (0.01%) 2168.7 (0.69%) 4164724 1.1%
16 30 140 6014 3462920 (0.01%) 1426.9 (0.35%) 3462857 1.4%
17 35 150 10020 3126763 (0.02%) 2511.8 (0.44%) 3126385 2.4%
18 35 150 10040 5937585 (0.01%) 3484.7 (0.55%) 5936733 1.1%
19 35 150 10072 6663950 (0.01%) 1183.6 (0.54%) 6663481 0.9%
20 35 150 9978 6261704 (0.01%) 4252.8 (0.49%) 6261046 1.0%
21 35 150 9983 5980627 (0.01%) 3846.9 (0.65%) 5979618 1.1%
5 Conclusions
Acknowledgments
The authors are indebted to Carlo Spinelli of Alcatel-Lucent for helpful discussions and for providing details about the technological aspects of the problem.
References
1. Simonis, H.: Constraint applications in networks. In: Rossi, F., van Beek, P., Walsh,
T. (eds.) Handbook of Constraint Programming. Elsevier, Amsterdam (2006)
2. Kramer, G., Pesavento, G.: Ethernet Passive Optical Network (EPON): building
a next-generation optical access network. IEEE Communications Magazine 40(2),
66–73 (2002)
3. Li, J., Shen, G.: Cost Minimization Planning for Greenfield Passive Optical Net-
works. Journal of Optical Communications and Networking 1(1), 17–29 (2009)
4. Aardal, K., Labbe, M., Leung, J., Queyranne, M.: On the two-level uncapacitated
facility location problem. INFORMS Journal on Computing 8, 289–301 (1996)
5. Zhang, J.: Approximating the two-level facility location problem via a quasi-greedy
approach. Mathematical Programming 108(1), 159–176 (2006)
6. Simonis, H.: A Hybrid Constraint Model for the Routing and Wavelength As-
signment Problem. In: Gent, I.P. (ed.) CP 2009. LNCS, vol. 5732, pp. 104–118.
Springer, Heidelberg (2009)
7. Smith, B.: Symmetry and Search in a Network Design Problem. In: Barták, R., Mi-
lano, M. (eds.) CPAIOR 2005. LNCS, vol. 3524, pp. 336–350. Springer, Heidelberg
(2005)
8. Lever, J.: A local search/constraint propagation hybrid for a network routing prob-
lem. International Journal of Artificial Intelligence Tools 14(1-2), 43–60 (2005)
9. Van Hentenryck, P., Michel, L.: Control abstractions for local search. Con-
straints 10(2), 137–157 (2005)
10. Barahona, F., Chudak, F.: Near-optimal solutions to large-scale facility location
problems. Discrete Optimization 2(1), 35–50 (2005)
11. Ahuja, R., Magnanti, T., Orlin, J.: Network Flows: Theory, Algorithms, and Ap-
plications. Prentice-Hall, Englewood Cliffs (1993)
12. Van Hentenryck, P., Michel, L.: Constraint-based local search. MIT Press, Cam-
bridge (2005)
13. Van Hentenryck, P., Michel, L.: Differentiable invariants. In: Benhamou, F. (ed.)
CP 2006. LNCS, vol. 4204, pp. 604–619. Springer, Heidelberg (2006)
Vehicle Routing for Food Rescue Programs: A Comparison of Different Approaches
C. Gunes, W.-J. van Hoeve, and S. Tayur
1 Introduction
The 1-Commodity Pickup and Delivery Vehicle Routing Problem (1-PDVRP)
asks to deliver a single commodity from a set of supply nodes to a set of demand
nodes, which are unpaired. That is, a demand node can be served by any supply
node. In this paper, we further assume that the supply and demand is unsplit-
table, which implies that we can visit each node only once. The 1-PDVRP arises
in several practical contexts, ranging from bike-sharing programs in which bikes
at each station need to be redistributed at various points in time, to food rescue
programs in which excess food is collected from, e.g., restaurants and schools,
and redistributed through agencies to people in need. The latter application is
the main motivation of our study.
Pickup and delivery vehicle routing problems have been studied extensively;
see, e.g., [1] for a recent survey. However, the 1-commodity pickup and delivery
vehicle routing problem (1-PDVRP) has received limited attention. When only
one vehicle is considered, the problem can be regarded as a traveling salesman
problem, or 1-PDTSP. For the 1-PDTSP, different solution methods have been
proposed, including [3, 4]. On the other hand, the only paper that addresses
the 1-PDVRP is by [2], to the best of our knowledge. [2] present different
approaches, including MIP, CP and Local Search, which are applied to instances
involving up to nine locations.
The main goal of this work is to compare off-the-shelf solution methods for
the 1-PDVRP, using state-of-the-art solvers. In particular, how many vehicles,
and how many locations, can still be handled (optimally) by these methods?
The secondary goal of this work is to evaluate the potential (cost) savings in
the context of food rescue programs. We note that the approaches we consider
(MIP, CP, CBLS) are similar in spirit to those of [2]. Our MIP model is quite
different, however. Further, although the CP and CBLS models are based on
the same modeling concepts, the underlying solver technology has been greatly
improved over the years.
Our MIP model is based on column generation. The master problem of our column generation procedure consists of a set of 'columns' S representing feasible routes. The routes are encoded as binary vectors over the index set V of locations; that is, the actual order of each route is implicitly encoded. The columns are assumed to be grouped together in a matrix A of size |V| by |S|. The lengths of the routes are represented by a 'cost' vector c ∈ R^|S|. We let z ∈ {0,1}^|S| be a vector of binary variables representing the selected routes. The master problem can then be encoded as the following set covering model:

  min c^T z   s.t.   Az = 1    (1)

For our column generation procedure, we will actually solve the continuous relaxation of (1), which allows us to use the shadow prices corresponding to the constraints. We let λ_j denote the shadow price of constraint j in (1), where j ∈ V.
The subproblem for generating new feasible routes uses a model that employs
a flow-based representation on a layered graph, where each layer consists of nodes
representing all locations. The new route comprises M steps, where each step
represents the next location to be visited. We can safely assume that M is the
minimum of |V | + 1 and (an estimate on) the maximum number of locations
that ‘fit’ in the horizon H for each vehicle.
We let x_ijk be a binary variable that represents whether we travel from location i to location j in step k. We further let y_j be a binary variable representing whether we visit location j at any time step. The vector of variables y will represent the column to be generated. Further, variable I_k represents the inventory of the vehicle, while variable D_k represents the total distance traveled up to step k, where k = 0, . . . , M. We let D_0 = 0, while 0 ≤ I_0 ≤ Q. The problem of finding an improving route can then be modeled as presented in Figure 1.
In this model, the first four sets of constraints ensure that we leave from and
finish at the origin. The fifth set of constraints enforce that we can enter the
origin at any time, but not leave it again. The sixth set of constraints models the flow conservation at each node, while the seventh set of constraints prevents the route from visiting a location more than
once. The following four sets of constraints represent the capacity constraints
of the vehicle in terms of quantities picked up and delivered, and in terms of
Fig. 1. The subproblem model for generating an improving route (excerpt):

  min  Σ_{i∈V} Σ_{j∈V} Σ_{k=1}^{M} d_ij x_ijk − Σ_{j∈V} λ_j y_j

  s.t. Σ_{j∈V} x_{O,j,1} = 1
       Σ_{j∈V} x_{i,j,1} = 0                              ∀i ∈ V \ {O}
       Σ_{i∈V} x_{i,O,M} = 1
       Σ_{i∈V} x_{i,j,M} = 0                              ∀j ∈ V \ {O}
       Σ_{i∈V} Σ_{k=1}^{M} x_ijk ≤ 1                      ∀j ∈ V \ {O}
       I_k = I_{k−1} + Σ_{i∈V} Σ_{j∈V} q_i x_ijk          ∀k ∈ [1..M]
       0 ≤ I_k ≤ Q                                        ∀k ∈ [0..M]
       D_k = D_{k−1} + Σ_{i∈V} Σ_{j∈V} d_ij x_ijk         ∀k ∈ [1..M]
distance. The last set of constraints link together the ‘flow’ variables x with the
new column represented by the variables y.
As noted above, throughout the iterative process, we apply a continuous re-
laxation of the master problem (1). When this process terminates (it reaches a
fixed point, or it meets a stopping criterion), we run the master problem as an
integer program. Therefore, our procedure may not provably find the optimal
solution, but it does provide a guaranteed optimality gap.
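Putting the pieces together, the overall procedure could be organised as in the sketch below, where solve_pricing(duals) stands for the subproblem of Figure 1 and returns a new route together with its cost and reduced cost (PuLP and its access to dual values are used purely for illustration; this is not the implementation reported in the paper):

```python
import pulp

def column_generation(V, initial_routes, route_cost, solve_pricing,
                      max_iters=100, eps=1e-6):
    """Iterate: solve the continuous restricted master, read the shadow
    prices, call the pricing subproblem, and add improving columns."""
    routes = list(initial_routes)        # each route is a set of locations
    for _ in range(max_iters):
        master = pulp.LpProblem("master", pulp.LpMinimize)
        z = [pulp.LpVariable("z%d" % k, lowBound=0) for k in range(len(routes))]
        master += pulp.lpSum(route_cost(r) * z[k] for k, r in enumerate(routes))
        for j in V:
            master += (pulp.lpSum(z[k] for k, r in enumerate(routes) if j in r)
                       == 1), "cover_%s" % j
        master.solve()
        duals = {j: master.constraints["cover_%s" % j].pi for j in V}
        route, cost, reduced_cost = solve_pricing(duals)
        if route is None or reduced_cost >= -eps:
            break                        # no improving column: stop
        routes.append(route)
    return routes
```

A final run of the master with the z variables restricted to binary values then yields the integer solution and the guaranteed optimality gap mentioned above.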
As a final remark, when only one vehicle is involved, the MIP model amounts to solving only the subproblem, to which we add the constraints that all locations must be visited.
Fig. 2. Snapshots of the ILOG Scheduler model (left) and ILOG Dispatcher model
(right), for a single vehicle
3 Evaluation
Our experimental results are based on data provided by the Pittsburgh Food Bank. Their food rescue program visits 130 locations per week. The provided data allowed us to extract a fairly accurate estimate of the expected pickup amount for the donor locations. The precise delivery amounts were unknown, and
we therefore approximate the demand based on the population served by each
location (which is known accurately), scaled by the total supply. We allow the
total demand to be slightly smaller than the total supply, to avoid pathological
behavior of the algorithm. We note however, that although this additional ‘slack’
influences the results, the qualitative behavior of the different techniques remains
the same. The MIP model is solved using ILOG CPLEX 11.2, while the CP and
CBLS models are solved using ILOG Solver 6.6, all on a 2.33GHz Intel Xeon
machine.
The first set of instances is for individual vehicles, on routes serving 13 to 18
locations (corresponding to a daily schedule). The second set of instances groups
together schedules over multiple days, ranging from 30 to 130 locations. The
results are presented in Figure 3. We report for each instance the cost savings (in
terms of total distance traveled) with respect to the current operational schedule.
Here, |V | and |T | denote the number of locations and vehicles, respectively.
The optimal solutions found with MIP and CP took several (2–3) minutes to
compute, while the solutions found with CBLS took several seconds or less. The
time limit was set to 30 minutes.
Our experimental results indicate that on this problem domain, our MIP model is
outperformed by our CP model in finding an optimal solution (we note that a
specialized 1-PDTSP MIP approach such as [4] might perform better than our
‘generic’ MIP model on the single-vehicle instances). Further, the CP model is
able to find optimal solutions for up to 18 locations and one vehicle; for a higher
number of locations or vehicles, the CP model is unable to find even a single
solution. Lastly, the CBLS approach is able to handle large-scale instances, up to
130 locations and 9 vehicles. The expected savings are substantial, being at least
10% on the largest instance.

|V |  |T |   MIP   CP   CBLS
 13    1    12%   12%   12%
 14    1    15%   15%   14%
 15    1     -     7%    6%
 16    1     -     5%    3%
 18    1     -    16%   15%
 30    2     -     -     4%
 60    4     -     -     8%
130    9     -     -    10%

Fig. 3. Savings obtained with different approaches
References
[1] Berbeglia, G., Cordeau, J.F., Gribkovskaia, I., Laporte, G.: Static pickup and de-
livery problems: A classification scheme and survey. TOP 15(1), 1–31 (2007)
[2] Dror, M., Fortin, D., Roucairol, C.: Redistribution of self-service electric cars: A
case of pickup and delivery. Technical Report W.P. 3543, INRIA-Rocquencourt
(1998)
[3] Hernández-Pérez, H., Salazar-González, J.J.: A branch-and-cut algorithm for a
traveling salesman problem with pickup and delivery. Discrete Applied Mathemat-
ics 145, 126–139 (2004)
[4] Hernández-Pérez, H., Salazar-González, J.J.: The one-commodity pickup-and-
delivery traveling salesman problem: Inequalities and algorithms. Networks 50,
258–272 (2007)
Constraint Programming and Combinatorial
Optimisation in Numberjack
1 Introduction
We present Numberjack1, a Python-based constraint programming system. Number-
jack brings the power of combinatorial optimisation to Python programmers by sup-
porting the specification of complex problem models and specifying how these should
be solved. Numberjack provides a common API for constraint programming, mixed-
integer programming and satisfiability solvers. Currently supported are: the CP solvers
Mistral and Gecode; a native Python CP solver; the MIP solver SCIP; and the satisfi-
ability solver MiniSat2 . Users of Numberjack can write their problems once and then
specify which solver should be used. Users can incorporate combinatorial optimisation
capabilities into any Python application they build, with all the benefits that it brings.
2 Modelling in Numberjack
Numberjack is provided as a Python module. To use Numberjack one must import all
Numberjack’s classes, using the command: from Numberjack import *. Simi-
larly, one needs to import the modules corresponding to the solvers that will be invoked
in the program, for instance: import Mistral or import Gecode. The Number-
jack module essentially provides a class Model whereas the solver modules provide a
class Solver, which is built from a Model. The structure of a typical Numberjack
program is presented in Figure 1. Notice that it is possible to use several types of solver
to solve the same model by explicitly invoking the modules. To solve a model, the
various methods implemented in the back-end solvers can be invoked through Python.
Supported by Science Foundation Ireland Grant Number 05/IN/I886.
1 Available under LGPL from http://numberjack.ucc.ie
2 Mistral: http://4c.ucc.ie/~ehebrard/Software.html; Gecode: http://gecode.org;
SCIP: http://scip.zib.de/; MiniSat: http://minisat.se
N = 10
row = [Variable(1,N) for i in range(N)]
model = Model()
model += AllDiff(row)
for y in range(N-2):
    model += AllDiff([row[i] - row[i+y+1] for i in range(N-y-1)])
solver = Mistral.Solver(model)
solver.solve()
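Since the Model object is independent of the back-end, the same model can in principle be handed to another solver module simply by constructing a different Solver. A minimal usage sketch, assuming the Gecode module mentioned above is installed; it reuses the model built in the listing just shown:

import Gecode
gecode_solver = Gecode.Solver(model)   # same model, a second back-end solver
gecode_solver.solve()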
# fragment of a second model from the source (the variable list M is defined there):
model = Model()
model += AllDiff([M[i]-M[j] for i in range(1,N) for j in range(i)])
Magic Square. In this problem one wants every number between 1 and N^2 to be placed
in an N × N matrix such that every row, column and diagonal sums to the same number.
A model for that problem making use of the Matrix class is presented in Figure 4.
N = 10
sum_val = N*(N*N+1)/2
square = Matrix(N,N,1,N*N)
model = Model(
    # ... (the constraints of Figure 4 are truncated in the source)
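To give an idea of what such a model can look like, the following is a hedged sketch, not the authors' Figure 4; it reuses N, sum_val, and square defined just above, and it assumes that Matrix elements can be accessed as square[i][j] and that Sum/AllDiff expressions combine with == and += as in the earlier listings.

model = Model()
model += AllDiff([square[i][j] for i in range(N) for j in range(N)])
for i in range(N):
    model += Sum([square[i][j] for j in range(N)]) == sum_val   # row i
    model += Sum([square[j][i] for j in range(N)]) == sum_val   # column i
model += Sum([square[i][i] for i in range(N)]) == sum_val       # main diagonal
model += Sum([square[i][N-1-i] for i in range(N)]) == sum_val   # anti-diagonal
solver = Mistral.Solver(model)
solver.solve()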
N = 8
x = Matrix(N,N,N)
model = Model(
    # ... (the remainder of this model is truncated in the source)
and fibre channels. When a node fails in the network, all lightpaths passing through that
node are affected. Monitors attached to the nodes present in the affected lightpaths trig-
ger alarms. Hence, a single fault will generate multiple alarms. By placing monitors in
the right way, we can minimize the number of alarms generated for a fault while keep-
ing the fault-detection coverage maximum. In the problem we model below, we add the
additional constraint that for any node failure that might occur, it triggers a unique set
of alarms. This problem requires that each combination of monitor alarms is unique for
each node fault. This requires that every pair of vectors of variables differ on at least one
element. This can be specified in Numberjack by introducing a HammingDistance
constraint. The Numberjack model for this problem is presented as Figure 6.
class HammingDistance(Expression):
    def decompose(self):
        return [Sum([(var1 != var2) for var1, var2 in zip(self.rows[0], self.rows[1])])]
Fig. 6. A Numberjack model for the optical network monitoring problem [1]
3 Experiments
Experiments were run on an Intel Xeon 2.66GHz machine with 12GB of RAM running Fedora 9.
Experiment 1: Overhead due to Numberjack. We first assess the overhead of using a
solver within Numberjack. We ran three back-end solvers, Mistral, MiniSat and SCIP,
on three arithmetic puzzles (Magic Square, Costas Array and Golomb Ruler). For each
run, we used a profiler to separate the time spent executing Python code from the time
spent executing code from the back-end solver. We report the results in Table 1. For
every problem we report results averaged across 7 instances4 of various size and 10
randomized runs each. The time spent executing the Python code is very modest, and
of course independent of the hardness of the instance.
4 Conclusion
Numberjack is a Python-based constraint programming system. It brings the power of
combinatorial optimisation to Python programmers by supporting the specification of
complex models and specifying how these should be solved. We presented the features
of Numberjack through the use of several combinatorial problems.
Reference
1. Nayek, P., Pal, S., Choudhury, B., Mukherjee, A., Saha, D., Nasipuri, M.: Optimal monitor
placement scheme for single fault detection in optical network. In: Proceedings of 2005 7th
International Conference, vol. 1, pp. 433–436 (2005)
4 The results of SCIP on the 3 hardest Magic Square instances are not taken into account since
the cutoff of 1000 seconds was reached.
Automated Configuration of
Mixed Integer Programming Solvers
University of British Columbia, 2366 Main Mall, Vancouver BC, V6T 1Z4, Canada
{hutter,hoos,kevinlb}@cs.ubc.ca
1 Introduction
Current state-of-the-art mixed integer programming (MIP) solvers are highly parame-
terized. Their parameters give users control over a wide range of design choices, includ-
ing: which preprocessing techniques to apply; what balance to strike between branching
and cutting; which types of cuts to apply; and the details of the underlying linear (or
quadratic) programming solver. Solver developers typically take great care to identify
default parameter settings that are robust and achieve good performance across a variety
of problem types. However, the best combinations of parameter settings differ across
problem types, which is of course the reason that such design choices are exposed as pa-
rameters in the first place. Thus, when a user is interested only in good performance for
a given family of problem instances—as is the case in many application situations—it
is often possible to substantially outperform the default configuration of the solver.
When the number of parameters is large, finding a solver configuration that leads to
good empirical performance is a challenging optimization problem. (For example, this
is the case for C PLEX: in version 12, its 221-page parameter reference manual describes
135 parameters that affect the search process.) MIP solvers exist precisely because hu-
mans are not good at solving high-dimensional optimization problems. Nevertheless,
parameter optimization is usually performed manually. Doing so is tedious and labori-
ous, requires considerable expertise, and often leads to results far from optimal.
There has been recent interest in automating the process of parameter optimization
for MIP. The idea is to require the user to only specify a set of problem instances of
interest and a performance metric, and then to trade machine time for human time to
automatically identify a parameter configuration that achieves good performance. No-
tably, IBM ILOG C PLEX—the most widely used commercial MIP solver—introduced
an automated tuning tool in version 11. In our own recent work, we proposed several
methods for the automated configuration of various complex algorithms [20, 19, 18, 15].
While we mostly focused on solvers for propositional satisfiability (based on both local
and tree search), we also conducted preliminary experiments that showed the promise of
our methods for MIP. Specifically, we studied the automated configuration of CPLEX
10.1.1, considering 5 types of MIP instances [19].
The main contribution of this paper is a thorough study of the applicability of one
of our black-box techniques to the MIP domain. We go beyond previous work by con-
figuring three different MIP solvers (G UROBI, LPSOLVE, and the most recent C PLEX
version 12.1); by considering a wider range of instance distributions; by considering
multiple configuration objectives (notably, performing the first study on automatically
minimizing the optimality gap); and by comparing our method to C PLEX’s automated
tuning tool. We show that our approach consistently sped up all three MIP solvers and
also clearly outperformed the C PLEX tuning tool. For example, for a set of real-life
instances from computational sustainability, our approach sped up C PLEX by a factor
of 52 while the tuning tool returned the C PLEX defaults. For G UROBI, speedups were
consistent but small (up to a factor of 2.3), and for LPSOLVE we obtained speedups up
to a factor of 153.
The remainder of this paper is organized as follows. In the next section, we describe
automated algorithm configuration, including existing tools and applications. Then, we
describe the MIP solvers we chose to study (Section 3) and discuss the setup of our
experiments (Section 4). Next, we report results for optimizing both the runtime of
the MIP solvers (Section 5) and the optimality gap they achieve within a fixed time
(Section 6). We then compare our approach to the C PLEX tuning tool (Section 7) and
conclude with some general observations and an outlook on future work (Section 8).
Fig. 1. A configuration procedure (short: configurator) executes the target algorithm with speci-
fied parameter settings on one or more problem instances, observes algorithm performance, and
uses this information to decide which subsequent target algorithm runs to perform. A configura-
tion scenario includes the target algorithm to be configured and a collection of instances.
a value for one parameter can be incompatible with a value for another parameter; for
example, some types of preprocessing are incompatible with the use of certain data
structures. Thus, some parts of parameter configuration space are forbidden; they can
be described succinctly in the form of forbidden partial instantiations of parameters
(i.e., constraints).
We refer to instances of this algorithm configuration problem as configuration sce-
narios, and we address these using automatic methods that we call configuration pro-
cedures; this is illustrated in Figure 1. Observe that we treat algorithm configuration as
a black-box optimization problem: a configuration procedure executes the target algo-
rithm on a problem instance and receives feedback about the algorithm’s performance
without any access to the algorithm’s internal state. (Because the C PLEX tuning tool is
proprietary, we do not know whether it operates similarly.)
achieved manually by their developers. F-Race and its extensions have been used to op-
timize numerous algorithms, including iterated local search for the quadratic assignment
problem, ant colony optimization for the travelling salesperson problem, and the best-
performing algorithm submitted to the 2003 timetabling competition [8].
Our group successfully used various versions of PARAM ILS to configure algorithms
for a wide variety of problem domains. So far, the focus of that work has been on the
configuration of solvers for the propositional satisfiability problem (SAT); we optimized
both tree search [16] and local search solvers [21], in both cases substantially advancing
the state of the art for the types of instances studied. We also successfully configured
algorithms for the most probable explanation problem in Bayesian networks, global
continuous optimization, protein folding, and algorithm configuration itself (for details,
see Ref. 15).
The configuration procedure used in this work is an instantiation of the PARAM ILS
framework [20, 19]. However, we do not mean to argue for the use of PARAM ILS in
particular, but rather aim to provide a lower bound on the performance improvements
that can be achieved by applying general-purpose automated configuration tools to MIP
solvers; future tools may achieve even better performance.
PARAM ILS performs an iterated local search (ILS) in parameter configuration space;
configurations are evaluated by running the target algorithm with them. The search is
initialized at the best out of ten random parameter configurations and the target al-
gorithm’s default configuration. Next, PARAM ILS performs a first-improvement local
search that ends in a local optimum. It then iterates three phases: (1) a random per-
turbation to escape the local optimum; (2) another local search phase resulting in a
new local optimum; and (3) an acceptance criterion that typically accepts the new local
optimum if it is better than the previous one. The PARAM ILS instantiation we used
here is F OCUSED ILS version 2.4, which aggressively rejects poor configurations and
focuses its efforts on the evaluation of good configurations. Specifically, it starts with
performing only a single target algorithm run for each configuration considered, and
performs additional runs for good configurations as the search progresses. This process
guarantees that—given enough time and a training set that is perfectly representative of
unseen test instances—F OCUSED ILS will identify the best configuration in the given
design space [20, 19]. (Further details of PARAM ILS and F OCUSED ILS can be found
in our previous publications [20, 19].)
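The skeleton of such an iterated local search over configurations can be sketched as follows. This is illustrative only, not the actual PARAM ILS implementation: it omits F OCUSED ILS's incremental management of the number of runs per configuration and the adaptive capping discussed below, and evaluate, sample_random_config, and neighbours are assumed callables that estimate the cost of a configuration, draw a random configuration, and enumerate one-parameter changes.

import random

def local_search(config, neighbours, evaluate):
    # First-improvement local search: move to the first better neighbour until none exists.
    improved = True
    while improved:
        improved = False
        current_cost = evaluate(config)
        for candidate in neighbours(config):
            if evaluate(candidate) < current_cost:
                config, improved = candidate, True
                break
    return config

def param_ils_sketch(default_config, sample_random_config, neighbours, evaluate,
                     iterations=100, perturbation_strength=3):
    # Initialization: best of ten random configurations and the default.
    candidates = [default_config] + [sample_random_config() for _ in range(10)]
    incumbent = local_search(min(candidates, key=evaluate), neighbours, evaluate)
    for _ in range(iterations):
        # (1) random perturbation to escape the local optimum
        perturbed = incumbent
        for _ in range(perturbation_strength):
            perturbed = random.choice(list(neighbours(perturbed)))
        # (2) another first-improvement local search, resulting in a new local optimum
        candidate = local_search(perturbed, neighbours, evaluate)
        # (3) acceptance criterion: keep the new local optimum if it is better
        if evaluate(candidate) < evaluate(incumbent):
            incumbent = candidate
    return incumbent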
In practice, we are typically forced to work with finite sets of benchmark instances,
and performance on a small training set is often not very representative for performance
on other, unseen instances of similar origin. PARAM ILS (and any other configuration
tool) can only optimize performance on the training set it is given; it cannot guarantee
that this leads to improved performance on a separate set of test instances. In particular,
with very small training sets, a so-called over-tuning effect can occur: given more time,
automated configuration tools find configurations with better training but worse test
performance [8, 20].
Table 1. Target algorithms and characteristics of their parameter configuration spaces. For details,
see http://www.cs.ubc.ca/labs/beta/Projects/MIP-Config/

Algorithm       Parameter type   # parameters of this type   # values considered   Total # configurations
C PLEX          Boolean          6 (7)                       2                     1.90 · 10^47
MILP (MIQCP)    Categorical      45 (43)                     3–7                   (3.40 · 10^45)
                Integer          18                          5–7
                Continuous       7                           5–8
G UROBI         Boolean          4                           2                     3.84 · 10^14
                Categorical      16                          3–5
                Integer          3                           5
                Continuous       2                           5
LPSOLVE         Boolean          40                          2                     1.22 · 10^15
                Categorical      7                           3–8

Since target algorithm runs with some parameter configurations may take a very long
(potentially infinite) time, PARAM ILS requires the user to specify a so-called captime
κmax, the maximal amount of time after which PARAM ILS will terminate a run of
the target algorithm as unsuccessful. F OCUSED ILS version 2.4 also supports adaptive
capping, a speedup technique that sets the captimes κ ≤ κmax for individual target
algorithm runs, thus permitting substantial savings in computation time.
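The adaptive-capping idea can be illustrated with a small helper. This is a sketch under simplifying assumptions: run_target is an assumed callable that executes the target algorithm on one instance with a given captime and returns the measured runtime (equal to the captime on a timeout), and bound_to_beat is the total runtime of the configuration being compared against.

def capped_total_runtime(config, instances, run_target, bound_to_beat, kappa_max):
    # Evaluate 'config' on 'instances', never spending more time than is needed
    # to conclude that it is worse than the configuration to beat.
    total = 0.0
    for instance in instances:
        captime = min(kappa_max, bound_to_beat - total)
        if captime <= 0:
            return float('inf')          # already provably worse: stop early
        total += run_target(config, instance, captime)
    return total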
F OCUSED ILS is a randomized algorithm that tends to be quite sensitive to the order-
ing of its training benchmark instances. For challenging configuration tasks some of its
runs often perform much better than others. For this reason, in previous work we adopted
the strategy to perform 10 independent parallel runs of F OCUSED ILS and use the result
of the run with best training performance [16, 19]. This is sound since no knowledge
of the test set is required in order to make the selection; the only drawback is a 10-fold
increase in overall computation time. If none of the 10 F OCUSED ILS runs encounters
any successful algorithm run, then our procedure returns the algorithm default.
3 MIP Solvers
We now discuss the three MIP solvers we chose to study and their respective parameter
configuration spaces. Table 1 gives an overview.
IBM ILOG C PLEX is the most-widely used commercial optimization tool for solv-
ing MIPs. As stated on the C PLEX website (http://www.ilog.com/products/cplex/),
currently over 1 300 corporations and government agencies use C PLEX, along
with researchers at over 1 000 universities. C PLEX is massively parameterized and end
users often have to experiment with these parameters:
“Integer programming problems are more sensitive to specific parameter set-
tings, so you may need to experiment with them.” (ILOG C PLEX 12.1 user
manual, page 235)
Thus, the automated configuration of C PLEX is very promising and has the potential to
directly impact a large user base.
We used C PLEX 12.1 (the most recent version) and defined its parameter configu-
ration space as follows. Using the C PLEX 12 “parameters reference manual”, we iden-
tified 76 parameters that can be modified in order to optimize performance. We were
careful to keep all parameters fixed that change the problem formulation (e.g., param-
eters such as the optimality gap below which a solution is considered optimal).
G UROBI is a recent commercial MIP solver that is competitive with C PLEX on some
types of MIP instances [23]. We used version 2.0.1 and defined its configuration space
as follows. Using the online description of G UROBI’s parameters,1 we identified 26
parameters for configuration. These consisted of 12 mostly-categorical parameters that
determine how aggressively to use each type of cuts, 7 mostly-categorical simplex pa-
rameters, 3 MIP parameters, and 4 other mostly-Boolean parameters. After disallowing
some problematic parts of configuration space (see Section 4.2), we considered 25 of
these 26 parameters, which led to a configuration space of size 3.84 · 1014 .
LPSOLVE is one of the most prominent open-source MIP solvers. We determined 52 pa-
rameters based on the information at http://lpsolve.sourceforge.net/. These
parameters are rather different from those of G UROBI and C PLEX: 7 parameters are
categorical, and the rest are Boolean switches indicating whether various solver mod-
ules should be employed. 17 parameters concern presolving; 9 concern pivoting; 14
concern the branch & bound strategy; and 12 concern other functions. After disallow-
ing problematic parts of configuration space (see Section 4.2), we considered 47 of
these 52 parameters. Taking into account one conditional parameter, these gave rise to
1.22 · 1015 distinct parameter configurations.
4 Experimental Setup
We now describe our experimental setup: benchmark sets, how we identified problem-
atic parts in the configuration spaces of G UROBI and LPSOLVE, and our computational
environment.
4.1 Benchmark Sets
We collected a wide range of MIP benchmarks from public benchmark libraries and
other researchers, and split each of them 50:50 into disjoint training and test sets; we
detail these in the following.
1 http://www.gurobi.com/html/doc/refman/node378.html#sec:Parameters
MJA. This set comprises 343 machine-job assignment instances encoded as mixed in-
teger quadratically constrained programming (MIQCP) problems [2]. We obtained it
from the Berkeley Computational Optimization Lab (BCOL).2 On average, these in-
stances contain 2 769 variables and 2 255 constraints (with standard deviations 2 133
and 1 592, respectively).
MIK. This set comprises 120 mixed-integer knapsack instances encoded as mixed in-
teger linear programming (MILP) problems [4]; we also obtained it from BCOL. On
average, these instances contain 384 variables and 151 constraints (with standard devi-
ations 309 and 127, respectively).
CLS. This set of 100 MILP-encoded capacitated lot-sizing instances [5] was also ob-
tained from BCOL. Each instance contains 181 variables and 180 constraints.
R EGIONS 100. This set comprises 2 000 instances of the combinatorial auction win-
ner determination problem, encoded as MILP instances. We generated them using the
regions generator from the Combinatorial Auction Test Suite [22], with parameters
goods=100 and bids=500. On average, the resulting MILP instances contain 501 vari-
ables and 193 inequalities (with standard deviations 1.7 and 2.5, respectively).
R EGIONS 200. This set contains 2 000 instances similar to those in R EGIONS 100 but
larger; we created it with the same generator using goods=200 and bids=1 000. On
average, the resulting MILP instances contain 1 002 variables and 385 inequalities (with
standard deviations 1.7 and 3.4, respectively).
MASS. This set comprises 100 integer programming instances modelling multi-activity
shift scheduling [10]. On average, the resulting MILP instances contain 81 994 variables
and 24 637 inequalities (with standard deviations 9 725 and 5 391, respectively).
CORLAT. This set comprises 2 000 MILP instances based on real data used for the
construction of a wildlife corridor for grizzly bears in the Northern Rockies region
(the instances were described by Gomes et al. [11] and made available to us by Bistra
Dilkina). All instances had 466 variables; on average they had 486 constraints (with
standard deviation 25.2).
Table 2. Results for minimizing the runtime required to find an optimal solution and prove its
optimality. All results are for test sets disjoint from the training sets used for the automated
configuration. We report the percentage of timeouts after 24 CPU hours as well as the mean
runtime for those instances that were solved by both approaches. Bold-faced entries indicate
better performance of the configurations found by PARAM ILS than for the default configuration.
(To reduce the computational burden, results for LPSOLVE on R EGIONS 200 and CORLAT are
only based on 100 test instances sampled uniformly at random from the 1000 available ones.)
% test instances unsolved in 24h mean runtime for solved [CPU s] Speedup
Algorithm Scenario default PARAMILS default PARAMILS factor
MJA 0% 0% 3.40 1.72 1.98×
MIK 0% 0% 4.87 1.61 3.03×
R EGIONS 100 0% 0% 0.74 0.35 2.13×
C PLEX R EGIONS 200 0% 0% 59.8 11.6 5.16×
CLS 0% 0% 47.7 12.1 3.94×
MASS 0% 0% 524.9 213.7 2.46×
CORLAT 0% 0% 850.9 16.3 52.3×
MIK 0% 0% 2.70 2.26 1.20×
R EGIONS 100 0% 0% 2.17 1.27 1.71×
R EGIONS 200 0% 0% 56.6 40.2 1.41×
G UROBI CLS 0% 0% 58.9 47.2 1.25×
MASS 0% 0% 493 281 1.75×
CORLAT 0.3% 0.2% 103.7 44.5 2.33×
MIK 63% 63% 21 670 21 670 1×
R EGIONS 100 0% 0% 9.52 1.71 5.56×
R EGIONS 200 12% 0% 19 000 124 153×
LPSOLVE
CLS 86% 42% 39 300 1 440 27.4×
MASS 83% 83% 8 661 8 661 1×
CORLAT 50% 8% 7 916 229 34.6×
In our first set of experiments, we studied the extent to which automated configuration
can improve the time performance of C PLEX, G UROBI, and LPSOLVE for solving the
seven types of instances discussed in Section 4.1. This led to 3 · 6 + 1 = 19 configura-
tion scenarios (the quadratically constrained MJA instances could only be solved with
C PLEX).
For each configuration scenario, we allowed a total configuration time budget of 2
CPU days for each of our 10 PARAM ILS runs, with a captime of κmax = 300 seconds
for each MIP solver run. In order to penalize timeouts, during configuration we used
the penalized average runtime criterion (dubbed “PAR-10” in our previous work [19]),
counting each timeout as 10 · κmax . For evaluation, we report timeouts separately.
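For concreteness, the penalized average runtime used during configuration can be computed as in the following sketch, where runtimes is a list of measured runtimes and None marks a run that hit the captime (both names are illustrative):

def par10(runtimes, kappa_max=300.0):
    # PAR-10: each timeout is counted as 10 times the captime.
    penalized = [10.0 * kappa_max if t is None else t for t in runtimes]
    return sum(penalized) / len(penalized)

# e.g. par10([1.5, 20.0, None], kappa_max=300.0) == (1.5 + 20.0 + 3000.0) / 3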
For each configuration scenario, we compared the performance of the parameter con-
figuration identified using PARAM ILS against the default configuration, using a test set
of instances disjoint from the training set used during configuration. We note that this
default configuration is typically determined using substantial time and effort; for ex-
ample, the C PLEX 12.1 user manual states (on p. 478):
Table 2 describes our configuration results. For each of the benchmark sets, our ap-
proach improved C PLEX’s performance. Specifically, we achieved speedups ranging
from 2-fold to 52-fold. For G UROBI, the speedups were also consistent, but less pro-
nounced (1.2-fold to 2.3-fold). For the open-source solver LPSOLVE, the speedups were
most substantial, but there were also 2 cases in which PARAM ILS did not improve over
LPSOLVE ’s default, namely the MIK and MASS benchmarks. This occurred because,
within the maximum captime of κmax = 300s we used during configuration, none of
the thousands of LPSOLVE runs performed by PARAM ILS solved a single benchmark
instance for either of the two benchmark sets. For the other benchmarks, speedups were
very substantial, reaching up to a factor of 153 (on R EGIONS 200).

[Per-instance scatter plots of the runtime of the default configuration (x-axis, CPU s)
against the runtime of the configuration found by ParamILS (y-axis, CPU s), for training
and test instances; only two panel captions are recoverable:]
(c) LPSOLVE, R EGIONS 200. Speedup factors: train 162×, test 153×.
(d) G UROBI, MIK. Speedup factors: train 2.17×, test 1.20×.
Fig. 2. Results for configuration of MIP solvers to reduce the time for finding an optimal solution
and proving its optimality. The dashed blue line indicates the captime (κmax = 300s) used
during configuration.
Figure 2 shows the speedups for 4 configuration scenarios. Figures 2(a) to (c) show
the scenario with the largest speedup for each of the solvers. In all cases, PARAM -
ILS’s configurations scaled better to hard instances than the algorithm defaults, which
in some cases timed out on the hardest instances. PARAM ILS’s worst performance was
for the 2 LPSOLVE scenarios for which it simply returned the default configuration; in
Figure 2(d), we show results for the more interesting second-worst case, the configura-
tion of G UROBI on MIK. Observe that here, performance was actually rather good for
most instances, and that the poor speedup in test performance was due to a single hard
test instance. Better generalization performance would be achieved if more training in-
stances were available.
Sometimes, we are interested in minimizing a criterion other than mean runtime. Algo-
rithm configuration procedures such as PARAM ILS can in principle deal with various
optimization objectives; in our own previous work, for example, we have optimized me-
dian runlength, average speedup over an existing algorithm, and average solution qual-
ity [20, 15]. In the MIP domain, constraints on the time available for solving a given
MIP instance might preclude running the solver to completion, and in such cases, we
may be interested in minimizing the optimality gap (also known as MIP gap) achieved
within a fixed amount of time, T .
To investigate the efficacy of our automated configuration approach in this context,
we applied it to C PLEX, G UROBI and LPSOLVE on the 5 benchmark distributions with
Table 3. Results for configuration of MIP solvers to reduce the relative optimality gap reached
within 10 CPU seconds. We report the percentage of test instances for which no feasible solution
was found within 10 seconds and the mean relative gap for the remaining test instances. Bold
face indicates the better configuration (recall that our lexicographic objective function cares first
about the number of instances with feasible solutions, and then considers the mean gap among
feasible instances only to break ties).
% test instances for which no feas. sol. was found mean gap when feasible Gap reduction
Algorithm   Scenario   default   PARAMILS   default   PARAMILS   factor
MIK 0% 0% 0.15% 0.02% 8.65×
CLS 0% 0% 0.27% 0.15% 1.77×
C PLEX R EGIONS 200 0% 0% 1.90% 1.10% 1.73×
CORLAT 28% 1% 4.43% 1.22% 2.81×
MASS 88% 86% 1.91% 1.52% 1.26×
MIK 0% 0% 0.02% 0.01% 2.16×
CLS 0% 0% 0.53% 0.44% 1.20×
G UROBI R EGIONS 200 0% 0% 3.17% 2.52% 1.26×
CORLAT 14% 5% 3.22% 2.87% 1.12×
MASS 68% 68% 76.4% 52.2% 1.46×
MIK 0% 0% 652% 14.3% 45.7×
CLS 0% 0% 29.6% 7.39% 4.01×
LPSOLVE R EGIONS 200 0% 0% 10.8% 6.60% 1.64×
CORLAT 68% 13% 4.19% 3.42% 1.20×
MASS 100% 100% - - -
the longest average runtimes, with the objective of minimizing the average relative op-
timality gap achieved within T = 10 CPU seconds. To deal with runs that did not find
feasible solutions, we used a lexicographic objective function that counts the fraction of
instances for which feasible solutions were found and breaks ties based on the mean rel-
ative gap for those instances. For each of the 15 configuration scenarios, we performed
10 PARAM ILS runs, each with a time budget of 5 CPU hours.
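The lexicographic objective can be expressed as a simple comparison key; a sketch, where gaps is a list containing the relative optimality gap of each run and None for runs without a feasible solution (the names are illustrative):

def lexicographic_score(gaps):
    # Smaller is better: first minimize the fraction of instances without a
    # feasible solution, then break ties by the mean gap over the feasible ones.
    feasible = [g for g in gaps if g is not None]
    frac_infeasible = (len(gaps) - len(feasible)) / len(gaps)
    mean_gap = sum(feasible) / len(feasible) if feasible else float('inf')
    return (frac_infeasible, mean_gap)

# Configurations are then compared by tuple ordering, e.g. with min(..., key=...).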
Table 3 shows the results of this experiment. For all but one of the 15 configuration sce-
narios, the automatically-found parameter configurations performed substantially better
than the algorithm defaults. In 4 cases, feasible solutions were found for more instances,
and in 14 scenarios the relative gaps were smaller (sometimes substantially so; consider,
e.g., the 45-fold reduction for LPSOLVE, and note that the gap is not bounded by 100%).
For the one configuration scenario where we did not achieve an improvement, LPSOLVE
on MASS, the default configuration of LPSOLVE could not find a feasible solution for
any of the training instances in the available 10 seconds, and the same turned out to be
the case for the thousands of configurations considered by PARAM ILS.
Table 4. Comparison of our approach against the C PLEX tuning tool. For each benchmark set,
we report the time t required by the C PLEX tuning tool (it ran out of time after 2 CPU days for
R EGIONS 200 and CORLAT, marked by ’*’) and the C PLEX name of the configuration it judged
best. We report the mean runtime of the default configuration; the configuration the tuning tool
selected; and the configurations selected using 2 sets of 10 PARAM ILS runs, each allowed time
t/10 and 2 days, respectively. For the PARAM ILS runs, in parentheses we report the speedup
over the C PLEX tuning tool. Boldface indicates improved performance.
C PLEX tuning tool stats C PLEX mean runtime [CPU s] on test set, with respective configuration
Scenario   Tuning time t   Name of result   Default   C PLEX tuning tool   10× PARAMILS(t/10)   10× PARAMILS(2 days)
CLS 104 673 ’defaults’ 48.4 48.4 15.1(3.21×) 10.1(4.79×)
R EGIONS 100 3 117 ’easy’ 0.74 0.86 0.48(1.79×) 0.34(2.53×)
R EGIONS 200 172 800* ’defaults’ 59.8 59.8* 14.2(4.21×) 11.9(5.03×)
MIK 36 307 ’long test1’ 4.87 3.56 1.46(2.44×) 0.98(3.63×)
MJA 2 266 ’easy’ 3.40 3.18 2.71(1.17×) 1.64(1.94×)
MASS 28 844 ’branch dir’ 524.9 425.8 627.4(0.68×) 478.9(0.89×)
CORLAT 172 800* ’defaults’ 850.9 850.9* 161.1(5.28×) 18.2(46.8×)
[Six plots of test performance vs. configuration time budget; panels:]
(a) CORLAT   (b) R EGIONS 100   (c) MIK   (d) MJA   (e) MASS   (f) CLS
Fig. 3. Comparison of the default configuration and the configurations returned by the C PLEX
tuning tool and by our approach. The x-axis gives the total time budget used for configuration
and the y-axis the performance (C PLEX mean CPU time on the test set) achieved within that
budget. For PARAM ILS, we perform 10 runs in parallel and count the total time budget as the
sum of their individual time requirements. The plot for R EGIONS 200 is qualitatively similar to
the one for R EGIONS 100, except that the gains of PARAM ILS are larger.
solvers and, indeed, arbitrary parameterized algorithms. In contrast, the few configura-
tions in the C PLEX tuning tool appear to have been selected based on substantial domain
insights, and the fact that different parameter configurations are tried for different types
of instances leads us to believe that it relies upon MIP-specific instance characteristics.
While in principle this could be an advantage, in its current form it appears to be rather
restrictive.
We compared the performance of the configurations found by the C PLEX tuning
tool to that of configurations found by PARAM ILS. For this comparison, we used the
tuning tool’s default settings to optimize mean runtime on the same training sets used
for PARAM ILS, and tested performance on the same test sets (disjoint from the train-
ing sets). We ran both configuration approaches with a time limit of 2 CPU days. In
most cases, the C PLEX tuning tool finished before that time limit was reached and—in
contrast to PARAM ILS—could not use the remaining time in order to further improve
performance. As before, we used 10 independent parallel runs of PARAM ILS, at each
time step reporting the performance of the one with best training performance.
First, we discuss the performance of the C PLEX tuning tool, summarized in Table 4.
We note that in two cases (R EGIONS 200 and CORLAT), it reached the time limit of
2 CPU days and returned the algorithm defaults in both cases. Out of the remaining
5 cases, it returned the default configuration in 1 (CLS), yielded a configuration with
slightly worse performance than the default in 1 (R EGIONS 100), and moderately im-
proved performance in the remaining 3 (up to a factor of 1.37). The 3 non-default con-
figurations it returned only differed in the following few parameters from the default:
‘easy’ (perform only 1 cutting plane pass, apply the periodic heuristic every 50 nodes,
and branch based on pseudo-reduced costs); ‘long test1’ (use aggressive probing and
aggressive settings for 8 types of cuts); and ‘branch dir’ (at each node, select the up
branch first).
PARAM ILS outperformed the tuning tool for 6 of the 7 configuration scenarios,
sometimes substantially so. Specifically, PARAM ILS found configurations with up to
5.2 times lower mean runtime when its total time budget was set to exactly the amount
of time t the C PLEX tuning tool ran before terminating (i.e., t/10 for each of the 10
PARAM ILS runs; t varied widely across the scenarios, see Table 4). For the one remain-
ing scenario, MASS, the configuration it found was slower by a factor of 1/0.68 = 1.47
(which we attribute to an over-tuning effect to be discussed shortly). With a fixed time
budget of two days for each PARAM ILS run, PARAM ILS’s performance improved for
all seven domains, reaching a speedup factor of up to 46.
Figure 3 visualizes the anytime test performance of PARAM ILS compared to the
default and the configuration found by the C PLEX tuning tool. Typically, PARAM ILS
found good configurations quickly and improved further when given more time. The
main exception was configuration scenario MASS (see Figure 3(e)), the one scenario
where PARAM ILS performed worse than the C PLEX tuning tool in Table 4. Here, test
performance did not improve monotonically: given more time, PARAM ILS found con-
figurations with better training performance but worse test performance. This example
of the over-tuning phenomenon mentioned in Section 2.3 clearly illustrates the prob-
lems arising from benchmark sets that are too small (and too heterogeneous): good
results for 50 (rather variable) training instances are simply not enough to confidently
draw conclusions about the performance on additional unseen test instances. On all
other 6 configuration scenarios, training and test sets were similar enough to yield near-
monotonic improvements over time, and large speedups over the C PLEX tuning tool.
Acknowledgements
We thank the authors of the MIP benchmark instances we used for making them avail-
able, in particular Louis-Martin Rousseau and Bistra Dilkina, who provided the previ-
ously unpublished instance sets MASS and CORLAT, respectively. We also thank IBM
and Gurobi Optimization for making a full version of their MIP solvers freely available
for academic purposes; and Westgrid for support in using their compute cluster. FH
gratefully acknowledges support from a postdoctoral research fellowship by the Cana-
dian Bureau for International Education. HH and KLB gratefully acknowledge support
from NSERC through their respective discovery grants, and from the MITACS NCE for
seed project funding.
References
[1] Adenso-Diaz, B., Laguna, M.: Fine-tuning of algorithms using fractional experimental de-
sign and local search. Operations Research 54(1), 99–114 (2006)
[2] Aktürk, S.M., Atamtürk, A., Gürel, S.: A strong conic quadratic reformulation for machine-
job assignment with controllable processing times. Research Report BCOL.07.01, Univer-
sity of California-Berkeley (2007)
[3] Ansotegui, C., Sellmann, M., Tierney, K.: A gender-based genetic algorithm for the auto-
matic configuration of solvers. In: Gent, I.P. (ed.) CP 2009. LNCS, vol. 5732, pp. 142–157.
Springer, Heidelberg (2009)
[4] Atamtürk, A.: On the facets of the mixed–integer knapsack polyhedron. Mathematical Pro-
gramming 98, 145–175 (2003)
[5] Atamtürk, A., Muñoz, J.C.: A study of the lot-sizing polytope. Mathematical Program-
ming 99, 443–465 (2004)
[6] Audet, C., Orban, D.: Finding optimal algorithmic parameters using the mesh adaptive di-
rect search algorithm. SIAM Journal on Optimization 17(3), 642–664 (2006)
[7] Bartz-Beielstein, T.: Experimental Research in Evolutionary Computation: The New Ex-
perimentalism. Natural Computing Series. Springer, Berlin (2006)
[8] Birattari, M.: The Problem of Tuning Metaheuristics as Seen from a Machine Learning
Perspective. PhD thesis, Université Libre de Bruxelles, Brussels, Belgium (2004)
[9] Birattari, M., Stützle, T., Paquete, L., Varrentrapp, K.: A racing algorithm for configuring
metaheuristics. In: Proc. of GECCO 2002, pp. 11–18 (2002)
[10] Cote, M., Gendron, B., Rousseau, L.: Grammar-based integer programing models for multi-
activity shift scheduling. Technical Report CIRRELT-2010-01, Centre interuniversitaire de
recherche sur les réseaux d’entreprise, la logistique et le transport (2010)
[11] Gomes, C.P., van Hoeve, W.-J., Sabharwal, A.: Connections in networks: A hybrid ap-
proach. In: Perron, L., Trick, M.A. (eds.) CPAIOR 2008. LNCS, vol. 5015, pp. 303–307.
Springer, Heidelberg (2008)
[12] Gratch, J., Chien, S.A.: Adaptive problem-solving for large-scale scheduling problems: A
case study. JAIR 4, 365–396 (1996)
[13] Huang, D., Allen, T.T., Notz, W.I., Zeng, N.: Global optimization of stochastic black-box
systems via sequential kriging meta-models. Journal of Global Optimization 34(3), 441–
466 (2006)
[14] Hutter, F.: On the potential of automatic algorithm configuration. In: SLS-DS2007: Doc-
toral Symposium on Engineering Stochastic Local Search Algorithms, pp. 36–40. Techni-
cal report TR/IRIDIA/2007-014, IRIDIA, Université Libre de Bruxelles, Brussels, Belgium
(2007)
[15] Hutter, F.: Automated Configuration of Algorithms for Solving Hard Computational Prob-
lems. PhD thesis, University of British Columbia, Department of Computer Science, Van-
couver, Canada (2009)
[16] Hutter, F., Babić, D., Hoos, H.H., Hu, A.J.: Boosting Verification by Automatic Tuning of
Decision Procedures. In: Proc. of FMCAD 2007, Washington, DC, USA, pp. 27–34. IEEE
Computer Society, Los Alamitos (2007a)
[17] Hutter, F., Hoos, H.H., Leyton-Brown, K., Murphy, K.P.: An experimental investigation
of model-based parameter optimisation: SPO and beyond. In: Proc. of GECCO 2009, pp.
271–278 (2009a)
[18] Hutter, F., Hoos, H.H., Leyton-Brown, K., Murphy, K.P.: Time-bounded sequential param-
eter optimization. In: Proc. of LION-4. LNCS. Springer, Heidelberg (to appear, 2010)
[19] Hutter, F., Hoos, H.H., Leyton-Brown, K., Stützle, T.: ParamILS: an automatic algorithm
configuration framework. Journal of Artificial Intelligence Research 36, 267–306 (2009b)
[20] Hutter, F., Hoos, H.H., Stützle, T.: Automatic algorithm configuration based on local search.
In: Proc. of AAAI 2007, pp. 1152–1157 (2007b)
[21] KhudaBukhsh, A., Xu, L., Hoos, H.H., Leyton-Brown, K.: SATenstein: Automatically
building local search SAT solvers from components. In: Proc. of IJCAI 2009, pp. 517–524
(2009)
[22] Leyton-Brown, K., Pearson, M., Shoham, Y.: Towards a universal test suite for combinato-
rial auction algorithms. In: Proc. of EC 2000, pp. 66–76. ACM, New York (2000)
[23] Mittelmann, H.: Mixed integer linear programming benchmark, serial codes (2010),
http://plato.asu.edu/ftp/milpf.html (version last visited on January 26, 2010)
Upper Bounds on the Number of Solutions
of Binary Integer Programs
1 Introduction
Solution counting has become a new and exciting topic in combinatorial research.
Counting solutions of combinatorial problem instances is relevant for example for new
branching methods [23,24]. It is also relevant to give user feedback in interactive set-
tings such as configuration systems. Moreover, it plays an ever more important role
in post-optimization analysis to give the user of an optimization system an idea how
many solutions there are within a certain percentage of the optimal objective value. The
famous mathematical programming tool Cplex for example now includes a solution
counting method. Finally, from a research perspective the problem is interesting in its
own right as it constitutes a natural extension of the mere optimization task.
Solution counting is probably best studied in the satisfiability (SAT) community
where a number of approaches have been developed to estimate the number of solu-
tions of under-constrained instances. First attempts to count the number of solutions
often simply consisted in extending the run of a solution finding systematic search after
a first solution has been found [3]. More sophisticated randomized methods estimate
upper and lower bounds with high probability. In [8], e.g., in a trial an increasing num-
ber of random XOR constraints are added to the problem. The upper and lower bounds
This work was supported by the National Science Foundation through the Career: Cornflower
Project (award number 0644113).
on the number of solutions depend on how many XORs can be added before the in-
stance becomes infeasible, whereby the probability that the bound is correct depends
on the number of trials where (at least or at most) the same number of XORs can be
added before the instance changes its feasibility status.
An interesting trend in constraint programming (CP) is to estimate solution den-
sity via solution counting for individual constraints [23,24]. Since the solution density
information is used for branching, it is important that these methods run very fast. Con-
sequently, they are constraint-based and often give estimates on the number of solutions
rather than hard upper and lower bounds or bounds that hold with high probability.
In mathematical programming, finally, the IBM Cplex IP solution counter [5,10]
enumerates all solutions while aiming at finding a diverse set of solutions, and the Scip
solution counter finds the number of all feasible solutions using a technique to collect
several solutions at once [1]. Stopped prematurely at some desired time-limit, these
solvers provide lower bounds on the number of solutions.
Considering the literature, we find that a big emphasis has been laid on the computa-
tion of lower bounds on the number of solutions of a given problem instance. Apart from
the work in [8] and the upper bounding routine for SAT in [11], we are not aware of
any other approaches that provide hard or high probability upper bounds on the number
of solutions. Especially solution counters that are part of the IBM Cplex and the Scip
solver would benefit if an upper bound on the number of solutions could be provided
alongside the lower bound in case that counting needs to be stopped prematurely.
With this study we attempt to make a first step to close this gap. In particular, we con-
sider binary integer programs and propose a general method for computing hard upper
bounds on the number of feasible solutions. Our approach is based on the exploitation
of relaxations, in particular surrogate and Lagrangian relaxations. Experimental results
on automatic recording and market split problems provide a first proof of concept.
(BIP ) pT x ≥ B
Ax ≤ b
xi ∈ {0, 1}.
Wlog, we assume that the profit coefficients are integers. Although we could multiply
the first inequality by minus one, we make it stand out as the original objective of the
binary integer program (BIP) that was to be maximized. Usually, in branch and bound
approaches, we consider relaxations to compute upper bounds on that objective. For
example, we may solve the linear program (LP)
Maximize L = pT x
Ax ≤ b
0 ≤ xi ≤ 1
and check whether L ≥ B for an incumbent integer solution with value B to prune the
search.
For our task of computing upper bounds on the number of solutions, relaxing the
problem is the first thing that comes to mind. However, standard LP relaxations are
not likely to be all that helpful for this task. Assume that there are two (potentially
fractional) solutions that have an objective value greater or equal B. Then, there exist
infinitely many fractional solutions that have the same property.
Consequently, we need to look for a relaxation which preserves the discrete character
of the original problem. We propose to use the surrogate relaxation for this purpose.
In the surrogate relaxation, we choose multipliers λi ≥ 0 for each linear inequality
constraint i and then aggregate all constraints into one. We obtain:
Maximize S = pT x
λT Ax ≤ λT b
xi ∈ {0, 1}.
This problem is well known: it is a knapsack problem (that may have negative weights
and/or profits). Let us set w ← wλ ← AT λ and C ← Cλ ← λT b. Then, we insert the
profit threshold B back into the formulation. This is sound as S is a relaxation of the
original problem. We obtain a knapsack constraint
(KP ) pT x ≥ B
wT x ≤ C
xi ∈ {0, 1}.
Fig. 1. The figure shows dynamic programming tables for a knapsack constraint with four vari-
ables, profits pT = (50, 40, 30, 20), and profit constraint threshold B = 82. In the left figure
the weights are wT = (3, 3, 4, 5), and the knapsack’s capacity is C = 10. In the right figure
the weights are wT = (13/5, 5/3, 3, 2), and the capacity is C = 19/3. The node-labels are
defined by their row and column number, the sink node t is marked separately. The value of
non-horizontal arcs that cross a vertical line is given under that line, horizontal arcs have weight
0. Hollow nodes and dashed arcs mark those nodes and arcs that are removed by the filtering
algorithm, because there exists no path from M0,0 to t with weight lower or equal C that visits
them.
In [12], the resulting DP is analyzed using a technique from [20,21] to identify which
variables cannot take value 0 or value 1. We do not perform this last step. Instead,
we use the resulting DP to count the number of paths from source to sink using the
technique in [23,24]. Note that any solution to (BIP) fulfills the redundant constraint
(KP) and therefore corresponds to an admissible path in our DP. Moreover, two different
solutions also define two different paths in the DP. Therefore, the number of paths in
the DP gives an upper bound on the number of solutions in (BIP).
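Counting the paths of the DP is a standard dynamic program over the acyclic graph; a minimal sketch, assuming the DP is given as an adjacency list of successor nodes (the representation is illustrative):

from functools import lru_cache

def count_paths(successors, source, sink):
    # Number of source-to-sink paths in a DAG, via memoized recursion over successor lists.
    @lru_cache(maxsize=None)
    def paths_from(node):
        if node == sink:
            return 1
        return sum(paths_from(nxt) for nxt in successors.get(node, ()))
    return paths_from(source)

# Example: a tiny layered graph with two source-to-sink paths.
# count_paths({'s': ['a', 'b'], 'a': ['t'], 'b': ['t']}, 's', 't') == 2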
Now, the quality of the upper bound will depend on the choice of the initial vector
λ. In ordinary optimization, we aim for a choice of λ for which the surrogate relaxation
gives the tightest relaxation value. However, for the purpose of filtering we know that
sub-optimal multipliers λ can provide better filtering effectiveness [14]. Consider the
following example:
(EX) 50x1 + 40x2 + 30x3 + 20x4 ≥ 82
3x1 + x2 + 3x3 ≤ 5
2x2 + x3 + 5x4 ≤ 5
xi ∈ {0, 1}.
If we use λ = (1, 1)T , e.g., then we get the knapsack constraint as shown in the
left graph of Figure 1 with a relaxation value of 120 (as that is the highest profit
visited by the remaining admissible paths). On the other hand, had we chosen λ =
(13/15, 6/15)T , we would have obtained the knapsack constraint in the right graph of
Figure 1 with an improved upper bound of 110.
Comparing the two DPs, we find that the two choices for λ yield incomparable fil-
tering effectiveness. Although the second set of multipliers gives a strictly better upper
bound, it cannot remove the edge (M90,3 , M110,4 ). On the other hand, the second choice
for λ allows us to remove the edges (M90,2 , M120,3 ) and (M120,3 , M120,4 ). This effect
has been studied before in [14]. The explanation for the different filtering behavior is
that, in principle, each edge has its own vector λ that maximally challenges admissibil-
ity (as measured by the shortest path length through that edge).
In principle, we could employ a probing procedure. For each edge, we remove all
edges on the same level, thus enforcing that each path from source to sink must pass
through this edge. Then, we start with some selection for λ and compute the shortest
path length according to the corresponding weights wλ as well as the corresponding
BIP solution xλ . If wλT xλ > Cλ , then we can remove the edge. Otherwise, we modify
λ to minimize Cλ − wλT xλ as much as possible. From the theory of Lagrangian relax-
ation (see for example [2]) we know that finding the optimal choice for λ consists in
minimizing a piecewise linear convex function. Consequently, we can use a subgradient
search algorithm to find the vector λ ≥ 0 which will minimize Cλ − wλT xλ as much
as possible and thus enable us to decide whether any λ exists that would allow us to
remove the edge under consideration.
The problem with this procedure is of course that it takes way too much time to probe
each individual edge. Instead, we follow the same method as in CP-based Lagrangian
relaxation [15]. That is, we employ a subgradient search to find a vector λ that mini-
mizes Cλ − wλT xλ in the DP. Then, for each λ that the subgradient search considers, we
use our edge-filtering algorithms to remove edges from the graph. That way, we hope
to visit a range of different settings for λ that will hopefully remove a large percentage
of edges in the DP that can be discarded.
Consider again our example (EX) from before. If we first prune the graph with re-
spect to the weight vector w from the left graph in Figure 1 and then, in the pruned
graph, remove edges based on the weight vector w from the right graph in Figure 1,
then we end up with only one path which corresponds to the only solution to (EX)
which is x = (1, 1, 0, 0)T .
The complete procedure is sketched in Algorithm 1. Note how we first increase the num-
ber of solutions by considering the cardinality of the set R ← {x ∈ {0, 1}^n | pT x ≥
B} instead of P ← {x ∈ {0, 1}^n | pT x ≥ B & Ax ≤ b}. Then, to reduce the number
of solutions again, we heuristically remove edges from the DP that has exactly one path
for each x ∈ R by propagating constraints λT Ax ≤ λT b for various choices of λ in the
DP. The resulting number of paths in the DP gives a hard upper bound on the number
of solutions to the original BIP.
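Since Algorithm 1 itself is not reproduced here, the following sketch outlines the procedure at a high level. It is schematic only: build_profit_dp, filter_edges, shortest_path_solution, and count_dp_paths are assumed helpers that build the DP over R, remove edges not supported by any path of weight at most C_λ, extract the minimum-weight solution x_λ, and count the remaining paths.

def upper_bound_on_solution_count(p, B, A, b, build_profit_dp, filter_edges,
                                  shortest_path_solution, count_dp_paths,
                                  subgradient_steps=50, step_size=0.1):
    # DP with one path for each x in R = {x in {0,1}^n : p^T x >= B}.
    dp = build_profit_dp(p, B)
    m, n = len(b), len(p)
    lam = [1.0] * m                                      # multipliers lambda >= 0
    for _ in range(subgradient_steps):
        w = [sum(A[i][j] * lam[i] for i in range(m)) for j in range(n)]   # w = A^T lambda
        C = sum(lam[i] * b[i] for i in range(m))                          # C = lambda^T b
        filter_edges(dp, w, C)                           # propagate  w^T x <= C  in the DP
        x = shortest_path_solution(dp, w)                # x_lambda minimizing w^T x over the DP
        # Subgradient step on  C_lambda - w_lambda^T x_lambda, projected onto lambda >= 0.
        subgradient = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        lam = [max(0.0, lam[i] - step_size * subgradient[i]) for i in range(m)]
    return count_dp_paths(dp)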
A nice property of our approach is that we can use all the usual methods for strength-
ening linear continuous relaxations, such as preprocessing and especially adding valid
inequalities, so-called cutting planes, to the BIP which tighten the continuous
relaxation.
To strengthen the upper bound on the solution count further, we can embed our pro-
cedure in a branch-and-bound tree search algorithm which we truncate at some given
depth-limit. The sum of all solutions at all leafs of the truncated tree then gives an upper
bound on the number of solutions.
For very hard combinatorial counting problems we may consider doing even more.
In our outline above, we use the profit constraint to define the graph G. In principle, we
could use any vector μ of natural numbers and consider the constraint (pT − μT A)x ≥
B − μT b to set up the DP. This is needed in particular when there is no designated
objective function. We notice, however, that we do not need to restrict ourselves to using just
one DP. Instead, we can set up multiple DPs for different choices of μ.
The simplest way to strengthen the upper bound on the number of solutions is to take
the minimum count over all DPs. However, we can do much better than that. Following
an idea presented in [9], we can compute compatibility labels between the different
DPs: Let us denote with GA and GB the graphs that correspond to two different DPs
for our problem. Our filtering algorithm ensures that each edge in the graph is visited
by at least one admissible path. The compatibility labels from [9] aim to ensure that
an edge in GA is also supported by a (not necessarily admissible) path in GB . More
precisely, for each edge in GA we ensure that there is a path from source to sink in GA
that visits the edge and which corresponds to a solution which also defines a path from
source to sink in GB .
Finally, if we have found an upper bound on the solution count that is rather small,
we can generate all potential solutions which is very easy given our DAG G. Then, we
test each assignment for feasibility and thus provide an exact count.
3 Numerical Results
3.1 Automatic Recording
We first consider the automatic recording problem (ARP) that was introduced in [15].
$$\begin{aligned}
\text{Maximize } & p^T x\\
\text{s.t. } & w^T x \leq K\\
& x_i + x_j \leq 1 \quad \forall\, 0 \leq i < j \leq n,\ I_i \cap I_j \neq \emptyset\\
& x \in \{0,1\}^n
\end{aligned} \qquad (\text{ARP 1})$$
where $p_i$ and $w_i$ represent the profit and the storage requirement of program i, K is the
storage capacity, and $I_i := [\mathrm{startTime}(i), \mathrm{endTime}(i)]$ corresponds to the broadcasting interval of program i. The objective function maximizes the user satisfaction, while
the first constraint enforces the storage restriction. Constraints of the form $x_i + x_j \leq 1$
ensure that at most one program is recorded at each point in time.
This formulation can be tightened by considering the conflict graph and adding the
corresponding clique constraints to the formulation [15].
$$\begin{aligned}
\text{Maximize } & p^T x\\
\text{s.t. } & w^T x \leq K\\
& x_i + x_j \leq 1 \quad \forall\, 0 \leq i < j \leq n,\ I_i \cap I_j \neq \emptyset\\
& \sum_{i \in C_p} x_i \leq 1 \quad \forall\, 0 \leq p \leq m\\
& x \in \{0,1\}^n
\end{aligned} \qquad (\text{ARP 2})$$
Although finding maximum cliques is NP-hard on general graphs, finding the maximal cliques of the graph
defined by our application is simple:
Definition 2. A graph G = (V, E) is called an interval graph if there exist intervals
$I_1, \ldots, I_{|V|} \subset \mathbb{R}$ such that $\forall v_i, v_j \in V: (v_i, v_j) \in E \iff I_i \cap I_j \neq \emptyset$.
On interval graphs, the computation of maximal cliques can be performed in O(n log n)
[7]. Hence, ARP 2 can be obtained in polynomial time.
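For illustration, here is a sweep-line sketch of how the maximal cliques of an interval graph can be enumerated in O(n log n) time. The code is our own; the interval representation and the tie-breaking rule for shared endpoints are assumptions, not taken from [7].

```python
def maximal_cliques(intervals):
    """Enumerate the maximal cliques of an interval graph.

    `intervals` maps a program id to a closed interval (start, end).
    A maximal clique is emitted at every right endpoint that is preceded by at
    least one left endpoint since the previous clique was emitted.
    """
    events = []
    for pid, (s, e) in intervals.items():
        events.append((s, 0, pid))   # starts before ends on ties (closed intervals)
        events.append((e, 1, pid))
    events.sort()
    active, cliques, new_start = set(), [], False
    for _, kind, pid in events:
        if kind == 0:
            active.add(pid)
            new_start = True
        else:
            if new_start:
                cliques.append(frozenset(active))
                new_start = False
            active.remove(pid)
    return cliques

print(maximal_cliques({1: (1, 3), 2: (2, 5), 3: (4, 6)}))  # two maximal cliques: {1, 2} and {2, 3}
```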
Table 1. Numerical Results for the ARP Problem. We present the upper bound on the number of
solutions and the CPU-time in seconds for the binary constraint model (ARP-1) and the maximal
clique model (ARP-2). The table on the left is for the small sized data set (20-720) with 20
channels and 720 minute time horizon, and the table on the right is for the large sized data set
(50-1440) with 50 channels and 1440 minute time horizon. In this experiment, we do not generate
and check solutions for feasibility.
For our experiments, we use the class usefulness (CU) instances. We consider a small sized
data set which spans half a day (720 minutes) and consists of 20 channels, and a large
sized data set which spans a full day (1440 minutes) and consists of 50 channels. Profits
for each program are chosen based on the class that a program belongs to. This class
also determines the parameters according to which its length is randomly chosen. On
average, these instances have 300 and 1500 programs, respectively. All experiments
in this paper were performed on a machine with Intel Core 2 Quad Q6600 2.4 GHz
CPUs and 2 GB of RAM, running Linux Debian 5.0.3 (32-bit). In all experiments,
we enforced a time limit of 3 hours of CPU time.
Our first evaluation compares the effectiveness of the models described by ARP 1
and ARP 2 in terms of the upper bound on the solution count that they provide and the
time they take. Specifically, we are interested in the increase of the number of solutions
as we move away from the optimal value. To this end, we introduce the Gap parameter
which indicates the percentage gap between a threshold and the optimal value. We only
consider solutions that achieve an objective value above the threshold. We experiment
with objective gaps of 0%, 1% and 2% and truncate the search at depth 5. Table 1 shows
that the ARP 2 formulation which includes the clique cuts provides much better upper
bounds than ARP 1 in substantially less time. This indicates that the common
methods for strengthening LP relaxations can also be exploited effectively to compute
superior upper bounds on the number of solutions of BIPs. That ARP 2 actually
runs faster can be attributed to the cutting planes enabling much more effective edge
filtering. Therefore, the DP contains far fewer edges higher up in the
tree, which leads to much faster times per choice point.
We next compare our approach (UBound) with the Cplex IP solution counter which
enumerates all solutions [10,5] and the Scip solution counter which collects several
solutions at a time. Note that Cplex and Scip provide only a lower bound in case they
time out or reach the memory limit. We again consider objective gaps 0%, 1% and 2%.
Table 2. Numerical Results for the ARP Problem with 0% objective gap. We present the upper
bound on the number of solutions and the CPU-time in seconds at depth 5. The table on the left
is for the small sized data set (20-720) with 20 channels and 720 minute time horizon, and the
table on the right is for the large sized data set (50-1440) with 50 channels and 1440 minute time
horizon. ’T’ means that the time limit has been reached. The numbers in bold show exact counts
and the numbers in parentheses are our upper bounds before we generate and check solutions for
feasibility.
For 0% gap, we run our method with depth 5 which is adequate to achieve the exact
counts. For higher gaps, we present the results for depths 5, 10, and 15.
Our results are presented in Table 2, Table 3, and Table 4. For the optimal objective
threshold, UBound provides exact counts for all test instances. In terms of running time,
UBound does not perform as quickly as the IBM Cplex and the Scip solution counter.
There is only one notable exception to this rule, instance 50-1440-1. On this instance,
Scip takes 100 seconds and Cplex times out after three hours while our method could
have provided the 35 solutions to the problem in 20 seconds.
This discrepancy becomes more evident when we are interested in the number of
solutions that are within 1% or 2% of the optimum. As we can see from Table 3 and
Table 4 the number of solutions increases very rapidly even for those small objective
gaps. Not surprisingly, the counts obtained by Cplex and Scip are limited by the number
of solutions they can enumerate within the memory and the time constraints, yielding
a count of roughly 1E+5 to 1E+7 solutions in most cases. Due to the explosion in
the number of solutions, Cplex and Scip are never able to give exact counts for the
large instances but only give a lower bound. Cplex hits the time cutoff in 17 out of
20 large instances and reaches the memory limit for the remaining 3, and Scip times
out in all large instances. In most cases where Cplex or Scip are able to find the exact
counts, UBound is able to provide tight upper bounds that are not more than an order
of magnitude bigger. In Figure 2, we show how the upper and lower bounds obtained
by UBound, Cplex, and Scip progress as they approach the exact count.
We also compared our approach with the method from [8] which provides very good
bounds on the number of solutions for constraint satisfaction problems. The method
is based on the addition of random XOR-constraints. Unfortunately, we found that, in
combination with an integer programming problem, the method does not perform well.
We tried using the vanilla code1 which was designed for pure CSPs. It did not perform
well for the ARP. So we modified the code, providing better branching variables for the
tree search and using linear bounds to prune the search. That improved the performance.
With this approach we are able to compute lower bounds, but computing these takes
more time and the counts are worse than those provided by Cplex and Scip. Upper
bounds take even more time as the XOR constraints involve more variables. We could
not obtain upper bounds within the time limit of three hours. We conjecture that a tight
integration between the XOR constraints and linear inequalities would be needed to
make this approach, which gives very good results for CSPs, work well for optimization
problems.
3.2 Market Split

We next consider the market split problem (MSP), a benchmark that was suggested for
knapsack constraints in [20,21].
The original definition goes back to [4,22]: A large company has two divisions D1 and
D2 . The company supplies retailers with several products. The goal is to allocate each
retailer to either division D1 or D2 so that D1 controls A% of the company’s market
for each product and D2 the remaining (100-A)%. Formulated as an integer program,
the problem reads:
$$\sum_j a_{ij} x_j = \frac{A}{100} \sum_j a_{ij} \quad \forall\, 0 \leq i < m, \qquad x_j \in \{0,1\} \quad \forall\, 0 \leq j < n,$$
where m denotes the number of products, n is the number of retailers, and $a_{ij}$ is the
demand of retailer j for product i. MSPs are generally very hard to solve, especially
the randomly generated instances proposed by Cornuejols and Dawande, where the weight
coefficients are randomly chosen in [1, . . . , 100] and A = 50. Special CP approaches
for the MSP have been studied in [20,21,14,9].

¹ Many thanks to Ashish Sabharwal for providing us the source code!

Table 3. Numerical Results for the ARP Problem with 1% objective gap. We present the upper
bound on the number of solutions and the CPU-time in seconds. ’T’ means that the time limit
has been reached and ’M’ indicates a solver has reached the memory limit. The numbers in
bold show exact counts and the numbers in square brackets denote the best count UBound could
achieve within the time limit.
Table 4. Numerical Results for the ARP Problem with 2% objective gap. We present the upper
bound on the number of solutions and the CPU-time in seconds. ’T’ means that the time limit
has been reached and ’M’ indicates a solver has reached the memory limit. The numbers in
bold show exact counts and the numbers in square brackets denote the best count UBound could
achieve within the time limit.

Compatibility Labels, and Generate and Test: For the MSP, we strengthen the solution counts by employing the compatibility labels introduced in [9]. We additionally
set up the DPs for the original equations in the problem. If there are m > 3 constraints
in the MSP, we set up m − 2 DPs, where the k-th DP is defined as the sum of the k-th
constraint plus five times the (k+1)-st constraint plus 25 times the (k+2)-nd constraint.
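A small helper illustrating this aggregation step (our own sketch; the paper gives no code). Each constraint is given as a coefficient row together with its right-hand side.

```python
def aggregate_constraints(A, b, weights=(1, 5, 25)):
    """Combine consecutive triples of equality constraints into single equations.

    Result row k is weights[0]*row_k + weights[1]*row_{k+1} + weights[2]*row_{k+2},
    matching the 1/5/25 weighting described in the text.
    """
    m, n = len(A), len(A[0])
    agg = []
    for k in range(m - 2):
        coeffs = [sum(w * A[k + t][j] for t, w in enumerate(weights)) for j in range(n)]
        rhs = sum(w * b[k + t] for t, w in enumerate(weights))
        agg.append((coeffs, rhs))
    return agg

# Three constraints over two variables yield a single aggregated equation.
print(aggregate_constraints([[1, 2], [3, 4], [5, 6]], [10, 20, 30]))  # [([141, 172], 860)]
```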
Often, the number of solutions to MSP instances is comparably low, and checking
feasibility is very fast. If we find an upper bound of less than 50,000, we simply
generate and check those solutions for feasibility. Therefore, each number that is less
than 50,000 is actually an exact count.
Table 5. Numerical Results for the MSP Problem. We present the upper bound on the number
of solutions found and the CPU-time taken in seconds for the binary constraint model and the
maximal clique model. ’T’ means that the time limit has been reached. The numbers in bold
show exact counts. The numbers in parentheses are our upper bounds before we generate and
check solutions for feasibility.
more solutions. We compare UBound again with the counts provided by IBM Cplex
and Scip. As before, Cplex and Scip provide a lower bound in case they time out. We
consider MSPs of orders 3 and 4 with an increasing number of variables between 24
and 38.
We present our results in Table 5. As we can see, UBound provides high quality
upper bounds very quickly as shown in the counts given in brackets. Using the generate
and test technique, on all instances we are able to provide exact counts in considerably
less time than Cplex and Scip.
Again, we also compared our results with the XOR approach from [8]. After the
vanilla implementation from [8] did not provide competitive results, we devised a
code that can solve pure MSPs efficiently and added XOR constraints to it.
Again, we found that the problems augmented by XORs are much harder to solve, which
resulted in the approach timing out on our entire benchmark. We attribute this behavior
to our inability to integrate the XOR constraints tightly with the subset-sum constraints
in the problem.
4 Conclusions
We presented a new method for computing upper bounds on the number of solutions of
BIPs. We demonstrated its efficiency on automatic recording and market split problems.
We showed that standard methods for tightening the LP relaxation by means of cutting
planes can be exploited also to provide better bounds on the number of solutions. More-
over, we showed that a recent new method for integrating graph-based constraints more
tightly via so-called compatibility labels can be exploited effectively to count solutions
for market split problems.
We do not see this method so much as a competitor to the existing solution counting
methods that are parts of IBM Cplex and Scip. Instead, we believe that these solvers
could benefit greatly from providing upper bounds on the number of solutions. This
obviously makes sense when the number of solutions is very large and solution enu-
meration must fail. However, as we saw on the market split problem, considering upper
bounds can also boost performance dramatically on problems that have only few
solutions. In this case, our method can be used to give a superset of potential
solutions whose feasibility can be checked very quickly.
References
1. Achterberg, T.: SCIP - A Framework to Integrate Constraint and Mixed Integer Program-
ming, http://www.zib.de/Publications/abstracts/ZR-04-19/
2. Ahuja, R.K., Magnanti, T.L., Orlin, J.B.: Network Flows. Prentice Hall, Englewood Cliffs
(1993)
3. Birnbaum, E., Lozinskii, E.L.: The Good Old Davis-Putnam Procedure Helps Counting Mod-
els. Journal of Artificial Intelligence Research 10, 457–477 (1999)
4. Cornuejols, G., Dawande, M.: A class of hard small 0-1 programs. In: Bixby, R.E., Boyd,
E.A., Rı́os-Mercado, R.Z. (eds.) IPCO 1998. LNCS, vol. 1412, pp. 284–293. Springer, Hei-
delberg (1998)
5. Danna, E., Fenelon, M., Gu, Z., Wunderling, R.: Generating Multiple Solutions for Mixed
Integer Programming Problems. In: Fischetti, M., Williamson, D.P. (eds.) IPCO 2007. LNCS,
vol. 4513, pp. 280–294. Springer, Heidelberg (2007)
6. Frangioni, A.: Object Bundle Optimization Package,
www.di.unipi.it/optimize/Software/Bundle.html
7. Golumbic, M.C.: Algorithmic Graph Theory and Perfect Graphs. Academic Press, New York
(1991)
8. Gomes, C.P., Hoeve, W., Sabharwal, A., Selman, B.: Counting CSP Solutions Using Gener-
alized XOR Constraints. In: 22nd Conference on Artificial Intelligence (AAAI), pp. 204–209
(2007)
9. Hadzic, T., O’Mahony, E., O’Sullivan, B., Sellmann, M.: Enhanced Inference for the Market
Split Problem. In: 21st IEEE International Conference on Tools with Artificial Intelligence
(ICTAI), pp. 716–723 (2009)
10. IBM. IBM CPLEX Reference manual and user manual. V12.1, IBM (2009)
11. Kroc, L., Sabharwal, A., Selman, B.: Leveraging Belief Propagation, Backtrack Search, and
Statistics for Model Counting. In: Perron, L., Trick, M.A. (eds.) CPAIOR 2008. LNCS,
vol. 5015, pp. 278–282. Springer, Heidelberg (2008)
12. Sellmann, M.: Approximated Consistency for Knapsack Constraints. In: Rossi, F. (ed.) CP
2003. LNCS, vol. 2833, pp. 679–693. Springer, Heidelberg (2003)
13. Sellmann, M.: Cost-Based Filtering for Shorter Path Constraints. In: Rossi, F. (ed.) CP 2003.
LNCS, vol. 2833, pp. 694–708. Springer, Heidelberg (2003)
14. Sellmann, M.: Theoretical Foundations of CP-based Lagrangian Relaxation. In: Wallace, M.
(ed.) CP 2004. LNCS, vol. 3258, pp. 634–647. Springer, Heidelberg (2004)
15. Sellmann, M., Fahle, T.: Constraint Programming Based Lagrangian Relaxation for the Au-
tomatic Recording Problem. Annals of Operations Research (AOR), 17–33 (2003)
16. Sellmann, M.: Approximated Consistency for the Automatic Recording Constraint. In: van
Beek, P. (ed.) CP 2005. LNCS, vol. 3709, pp. 822–826. Springer, Heidelberg (2005)
17. Sellmann, M.: Approximated Consistency for the Automatic Recording Constraint. Comput-
ers and Operations Research 36(8), 2341–2347 (2009)
18. Sellmann, M.: ARP: A Benchmark Set for the Automatic Recording Problem maintained,
http://www.cs.brown.edu/people/sello/arp-benchmark.html
19. TIVOtm System, http://www.tivo.com
20. Trick, M.: A Dynamic Programming Approach for Consistency and Propagation for Knap-
sack Constraints. In: 3rd Int. Workshop on Integration of AI and OR Techniques in Constraint
Programming for Combinatorial Optimization Problems (CPAIOR), pp. 113–124 (2001)
21. Trick, M.: A Dynamic Programming Approach for Consistency and Propagation for Knap-
sack Constraints. Annals of Operations Research 118, 73–84 (2003)
22. Williams, H.P.: Model Building in Mathematical Programming. Wiley, Chichester (1978)
23. Zanarini, A., Pesant, G.: Solution counting algorithms for constraint-centered search heuris-
tics. In: Bessière, C. (ed.) CP 2007. LNCS, vol. 4741, pp. 743–757. Springer, Heidelberg
(2007)
24. Zanarini, A., Pesant, G.: Solution counting algorithms for constraint-centered search heuris-
tics. Constraints 14(3), 392–413 (2009)
Matrix Interdiction Problem

S.P. Kasiviswanathan and F. Pan
1 Introduction
2 Matrix Interdiction
Let [n] denote the set {1, . . . , n}. For a matrix M of dimension m × n, let $M_{i,j}$
denote the (i, j)-th (i-th row and j-th column) entry of M. For a set J ⊆ [n], let
$M|_J$ denote the submatrix of M obtained by picking only the columns of M
indexed by J. Define
$$\mathrm{val}(M|_J) = \sum_{i=1}^{m} \max_{j \in J}\{M_{i,j}\}.$$
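A direct transcription of this definition (our own sketch, with M given as a list of rows):

```python
def val(M, J):
    """val(M|_J): sum over all rows of the maximum entry among the columns in J."""
    return sum(max(row[j] for j in J) for row in M)

M = [[1, 0, 2],
     [0, 3, 1]]
print(val(M, [0, 2]))  # rows contribute max(1, 2) = 2 and max(0, 1) = 1, giving 3
```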
Fig. 1. Bipartite network interdiction for border control with 3 sources, 2 destinations,
and 3 border crossings. The figure to the right is the corresponding bipartite network.
where
$$h(x, \omega) = \max_{i \in [n]}\{[a_i^\omega - b_i^\omega x_i]^+\}. \qquad (2)$$
Here, $[\cdot]^+ = \max\{\cdot, 0\}$. This is a stochastic resource allocation problem with a
simple structure. The assumption is that changing a component's performance
does not affect the performance of any other component. It is this simple structure that allows the conversion from a min-expected-max two-stage stochastic
program to a matrix interdiction problem. For each scenario, by subtracting a
constant $c^\omega = \max_i\{[a_i^\omega - b_i^\omega]^+\}$ from each constraint in (2), we obtain
$$h(x, \omega) = \max_{i \in [n]}\{[\hat{a}_i^\omega - \hat{a}_i^\omega x_i]^+\}, \qquad (3)$$
where $\hat{a}_i^\omega = a_i^\omega - c^\omega$.
The optimization problems (2) and (3) are equivalent in the sense that they
have the same optimal solutions, and their optimal values differ only by a constant. The full process of this simplification step involves elaborate algebraic manipulations, and we refer the reader to [21, 20], which discuss this
in the context of interdiction. After the simplification, the two-stage stochastic
program becomes
$$\min_x \sum_{\omega \in \Omega} p^\omega h(x, \omega), \quad \sum_i x_i = k,\ x \in \{0,1\}^n, \quad \text{where } h(x, \omega) = \max_{i \in [n]}\{[\hat{a}_i^\omega - \hat{a}_i^\omega x_i]^+\}. \qquad (4)$$
For more details on the bipartite network interdiction, see [21, 20].
We can construct a matrix M from the bipartite network interdiction problem as
follows. The dimension of M is set as n = |B| and m = |S| × |T|, and the entry
$M_{i,j} = r_{sjt}\, p_{st}$, where i is the node index in the bipartite network for source-destination pair (s, t). With this construction, the entries of M are positive real
values between 0 and 1, and the optimal solution of the matrix interdiction
problem for input M is exactly the k optimal border crossings to be interdicted
in Equation (5). Moreover, the two problems have the same optimal objective
values. This leads to the following theorem.
Theorem 1. Bipartite network interdiction problem is a special case of the ma-
trix interdiction problem.
The above theorem can be summarized as saying that every instance of the bipar-
tite interdiction problem is also an instance of the matrix interdiction problem.
The NP-hardness (Section 3) and the hardness of approximation (Section 4)
results that we obtain for the matrix interdiction problem also hold for the
bipartite network interdiction problem. Also, since the approximation guaran-
tee of the greedy algorithm (Section 5) holds for every instance of the matrix
interdiction problem, it also holds for every instance of the bipartite network in-
terdiction problem. In the following sections, we concentrate only on the matrix
interdiction problem.
3 NP-Hardness Result
In this section, we show that the matrix interdiction problem is NP-hard. Thus,
assuming P ≠ NP there exists no polynomial time algorithm that can exactly
solve the matrix interdiction problem. For establishing the NP-hardness we re-
duce the clique problem to the matrix interdiction problem. The clique problem
is defined as follows.
Definition 2 (Clique Problem [8]). Let G = (V, E) be an undirected graph,
where V is the set of vertices and E is the set of edges of G. For a subset S ⊆ V ,
we let G(S) denote the subgraph of G induced by S. A clique C is a subset of
V such that the induced graph G(C) is complete (i.e., ∀u, v ∈ C an edge exists
between u and v). The clique problem is the optimization problem of finding a
clique of maximum size in the graph. As a decision problem, it requires us to
decide whether there exists a clique of a given size k in the graph.
Proof. To show the first part of the lemma, notice that each row of M contributes
1 to val(M); in other words, each edge in G contributes 1 to val(M). Consider a
row of M and assume it corresponds to some edge (u, v) in G. Now notice that,
to obtain M*, if one only deletes the column corresponding to u or the column
corresponding to v, then the contribution of this row to val(M*) still remains 1
(because the row has two 1's and only one of these gets removed). So, to reduce
the contribution of this row to 0, one needs to delete the columns corresponding to
both u and v.
A clique C of size k has exactly $\binom{k}{2}$ edges between the vertices in C. Therefore,
by deleting the columns corresponding to vertices in C, one can create a submatrix M* of dimension m × (n − k) with val(M*) = val(M) − $\binom{k}{2}$. We now argue
that M* is an (optimum) solution to the matrix interdiction problem. Consider
a set J ⊆ [n], |J| = n − k, and let $\bar{J}$ = [n] − J. Deleting the columns of M in
$\bar{J}$ creates $M|_J$, in which the number of rows with all zero entries is the same as
the number of edges present between the vertices corresponding to entries in $\bar{J}$.
In other words, val($M|_J$) = val(M) − e($\bar{J}$), where e($\bar{J}$) is the number of edges
in G that are present between the vertices corresponding to entries in $\bar{J}$. Since,
for any $\bar{J}$, e($\bar{J}$) ≤ $\binom{k}{2}$, for all J,
$$\mathrm{val}(M|_J) \geq \mathrm{val}(M) - \binom{k}{2}.$$
Therefore, M*, whose value equals val(M) − $\binom{k}{2}$, is an (optimum) solution to the
matrix interdiction problem.
To show the second part of the lemma, notice that if there exists no clique of
size k in G, then for all $\bar{J}$,
$$\mathrm{val}(M|_J) = \mathrm{val}(M) - e(\bar{J}) > \mathrm{val}(M) - \binom{k}{2},$$
as in the absence of a clique of size k, e($\bar{J}$) is always less than $\binom{k}{2}$.
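The proof suggests the underlying construction: one row of M per edge of G, one column per vertex, with 1's in the two endpoint columns. Below is a sketch under that reading (the omitted lemma may state the construction differently).

```python
def clique_reduction_matrix(n_vertices, edges):
    """Build the 0/1 matrix M(G) with one row per edge and one column per vertex.

    The row for edge (u, v) has M[row][u] = M[row][v] = 1, so val(M) = |E| and a
    row's contribution drops to 0 only when both endpoint columns are deleted.
    """
    M = []
    for (u, v) in edges:
        row = [0] * n_vertices
        row[u] = row[v] = 1
        M.append(row)
    return M

# Triangle on vertices {0, 1, 2}: deleting any single column leaves every row nonzero.
print(clique_reduction_matrix(3, [(0, 1), (0, 2), (1, 2)]))
```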
4 Inapproximability Result
In this section, we show that there exists a fixed constant γ such that the matrix
interdiction problem is NP-hard to approximate to within an $n^\gamma$ additive factor.
More precisely, we show that assuming P ≠ NP there exists no polynomial time
approximation algorithm for the matrix interdiction problem that can achieve
better than an $n^\gamma$ additive approximation. Note that this statement is stronger
than Theorem 2: whereas Theorem 2 shows that assuming P ≠ NP there exists
no polynomial time algorithm that can solve the matrix interdiction problem
exactly, this inapproximability statement shows that unless P = NP it is not
even possible to design a polynomial time algorithm which gives a solution close to an
optimum solution for the matrix interdiction problem.
To show the inapproximability bound, we reduce a problem with a known inapproximability bound to the matrix interdiction problem. We will use a reduction
that is similar to that in the previous section. It will be convenient to use a
variant of the clique problem known as k-clique.
Definition 3 (k-clique Problem). In the k-clique problem, the input consists
of a positive integer k and a k-partite graph G (that is, a graph whose vertices can be
partitioned into k disjoint independent sets) along with its k-partition. The goal
is to find the largest clique in G. Define the function k-clique(G) as the size of the
largest clique in G divided by k.
Since in a k-partite graph G a clique can have at most one vertex in common with
each independent set, the size of the largest clique in G is at most k. Therefore,
k-clique(G) ≤ 1.
Theorem 3 (Arora et al. [1]). There exists a fixed 0 < δ < 1 such that
approximating the k-clique problem to within an nδ multiplicative factor is NP-
hard.
Proof Sketch. The proof presented in [1] (see also Chapter 10 in [16]) proceeds
by showing a polynomial time reduction τ from the SAT problem (the problem
of determining whether the variables of a Boolean formula can be assigned in a
way that makes the formula satisfiable) to the k-clique problem. The reduction
τ ensures for all instances I of SAT:
Theorem 4. There exists a fixed constant γ > 0, such that the matrix interdic-
tion problem is NP-hard to approximate within an additive factor of nγ .
Proof. From Theorem 3, we know there exists a constant δ such that it is NP-hard to approximate the k-clique problem to within an $n^\delta$ multiplicative factor.
From Lemma 3, we know that for a k-partite graph G, there exists a matrix
M = M(G) such that
$$\text{if k-clique}(G) = 1 \;\Rightarrow\; \mathrm{val}(M^*) = \mathrm{val}(M) - \frac{k^2}{2} - \frac{k}{2},$$
$$\text{if k-clique}(G) \leq \frac{1}{n^\delta} \;\Rightarrow\; \mathrm{val}(M^*) \geq \mathrm{val}(M) - \frac{k^2}{2n^\delta} - \frac{k}{2} \geq \mathrm{val}(M) - \frac{k^2}{2} - \frac{k}{2} + \frac{n^{2\delta}}{2} - \frac{n^\delta}{2}.$$
By comparing the above two equations, we see that if we can approximate the
matrix interdiction problem within an $n^{2\delta}/2 - n^\delta/2$ additive factor, then we can
approximate the k-clique problem to within an $n^\delta$ multiplicative factor. Since
the latter is NP-hard, an $n^{2\delta}/2 - n^\delta/2$ additive approximation
of the matrix interdiction problem is also NP-hard. Setting γ such that $n^\gamma = n^{2\delta}/2 - n^\delta/2$ proves the theorem.
5 Greedy Algorithm

Algorithm Greedy(M)
1. For every j ∈ [n], compute $c_j = \sum_{i=1}^{m} M_{i,j}$, i.e., $c_j$ is the sum of the entries in the j-th column.
2. Pick the top k columns ranked according to the column sums.
3. Delete the k columns picked in Step 2 to create a submatrix Mg of M .
4. Output Mg .
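A direct Python transcription of this procedure (our own sketch; ties between equal column sums are broken arbitrarily):

```python
def greedy_interdiction(M, k):
    """Delete the k columns with the largest column sums and return the submatrix M_g."""
    n = len(M[0])
    col_sums = [sum(row[j] for row in M) for j in range(n)]
    # Columns ranked by column sum, largest first; the remaining n - k columns are kept.
    deleted = set(sorted(range(n), key=lambda j: col_sums[j], reverse=True)[:k])
    kept = [j for j in range(n) if j not in deleted]
    return [[row[j] for j in kept] for row in M]

M = [[1, 0, 2],
     [0, 3, 1]]
print(greedy_interdiction(M, 1))  # deletes column index 1 (sum 3): [[1, 2], [0, 1]]
```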
Let Soln denote the set of n − k columns that remain after running Greedy, and let Opt denote the set of n − k columns remaining in an optimum solution M*. We claim that
$$\mathrm{val}(M_g) \leq (n - k)\,\mathrm{val}(M^*),$$
which follows from the chain
$$\mathrm{val}(M_g) = \sum_{i=1}^{m} \max_{j \in Soln}\{M_{i,j}\} \leq \sum_{i=1}^{m} \sum_{j \in Soln} M_{i,j} = \sum_{j \in Soln} \sum_{i=1}^{m} M_{i,j} \leq \sum_{j \in Opt} \sum_{i=1}^{m} M_{i,j} = \sum_{i=1}^{m} \sum_{j \in Opt} M_{i,j} \leq \sum_{i=1}^{m} (n - k) \max_{j \in Opt}\{M_{i,j}\} = (n - k) \sum_{i=1}^{m} \max_{j \in Opt}\{M_{i,j}\} = (n - k)\,\mathrm{val}(M^*).$$
The second inequality follows because the Greedy algorithm deletes the k columns
with the largest column sums. The third inequality follows because for any real
vector $v = (v_1, \ldots, v_{n-k})$, $\sum_{p=1}^{n-k} v_p \leq (n - k)\max\{v\}$.
The above argument shows that the Greedy algorithm achieves an (n − k)
multiplicative approximation ratio for the matrix interdiction problem.
6 Conclusion
Motivated by security applications, we introduced the matrix interdiction prob-
lem. Our main contribution is in providing a complexity analysis and an ap-
proximation algorithm for this problem. We proved that the matrix interdiction
problem is NP-hard, and furthermore, unless P = NP there exists no $n^\gamma$ additive
approximation algorithm for this problem. We then presented a simple greedy
algorithm for the matrix interdiction problem and showed that this algorithm
has an (n − k) multiplicative approximation ratio. It is also possible to design
a dynamic programming based algorithm that achieves the same approximation
ratio. An interesting open question would be to either design a better approxi-
mation algorithm or to show a better hardness of approximation result.
References
[1] Arora, S., Lund, C., Motwani, R., Sudan, M., Szegedy, M.: Proof verification and
the hardness of approximation problems. Journal of the ACM (JACM) 45(3), 555
(1998)
[2] Assimakopoulos, N.: A network interdiction model for hospital infection control.
Comput. Biol. Med. 17(6), 413–422 (1987)
[3] Bar-Noy, A., Khuller, S., Schieber, B.: The complexity of finding most vital arcs
and nodes. Technical report, University of Maryland (1995)
[4] Birge, J.R., Louveaux, F.: Introduction to stochastic programming. Springer, New
York (1997)
[5] Boros, E., Fedzhora, L., Kantor, P.B., Saeger, K., Stroud, P.: Large scale lp model
for finding optimal container inspection strategies. Naval Research Logistics Quar-
terly 56(5), 404–420 (2009)
230 S.P. Kasiviswanathan and F. Pan
[6] Burch, C., Carr, R., Krumke, S., Marathe, M., Phillips, C., Sundberg, E.: A
decomposition-based pseudoapproximation algorithm for network flow inhibition.
In: Network Interdiction and Stochastic Integer Programming. Operations Re-
search/Computer Science Interfaces, vol. 22, pp. 51–68. Springer, US (2003)
[7] Corley, H.W., Sha, D.Y.: Most vital links and nodes in weighted networks. Oper-
ations Research Letters 1(4), 157–160 (1982)
[8] Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to algorithms.
MIT Press, Cambridge (2001)
[9] Cormican, K.J., Morton, D.P., Wood, K.R.: Stochastic network interdiction. Op-
erations Research 46(2), 184–197 (1998)
[10] Dimitrov, N., Michalopoulos, D.P., Morton, D.P., Nehme, M.V., Pan, F., Popova,
E., Schneider, E.A., Thoreson, G.G.: Network deployment of radiation detectors
with physics-based detection probability calculations. Annals of Operations Re-
search (2009)
[11] Dimitrov, N.B., Morton, D.P.: Combinatorial design of a stochastic markov deci-
sion process. In: Operations Research and Cyber-Infrastructure. Operations Re-
search/Computer Science Interfaces, vol. 47, pp. 167–193. Springer, Heidelberg
(2009)
[12] Fulkerson, D.R., Harding, G.C.: Maximizing the minimum source-sink path sub-
ject to a budget constraint. Mathematical Programming 13(1), 116–118 (1977)
[13] Ghare, P.M., Montgomery, D.C., Turner, W.C.: Optimal interdiction policy for a
flow network. Naval Research Logistics Quarterly 18(1), 37 (1971)
[14] Golden, B.: A problem in network interdiction. Naval Research Logistics Quar-
terly 25, 711–713 (1978)
[15] Gutfraind, A., Hagberg, A., Pan, F.: Optimal interdiction of unreactive markovian
evaders. In: van Hoeve, W.-J., Hooker, J.N. (eds.) CPAIOR 2009. LNCS, vol. 5547,
pp. 102–116. Springer, Heidelberg (2009)
[16] Hochbaum, D.S. (ed.): Approximation algorithms for NP-hard problems. PWS
Publishing Co., Boston (1997)
[17] Israeli, E., Kevin Wood, R.: Shortest-path network interdiction. Networks 40(2),
97–111 (2002)
[18] Karabati, S., Kouvelis, P.: A min-sum-max resource allocation problem. IIE
Transactions 32(3), 263–271 (2000)
[19] McMasters, A.W., Mustin, T.M.: Optimal interdiction of a supply network. Naval
Research Logistics Quarterly 17(3), 261 (1970)
[20] Morton, D.P., Pan, F., Saeger, K.J.: Models for nuclear smuggling interdiction.
IIE Transactions 39(1), 3–14 (2007)
[21] Pan, F.: Stochastic Network Interdiction: Models and Methods. PhD dissertation,
University of Texas at Austin, Operations Research (2005)
[22] Pan, F., Charlton, W., Morton, D.P.: Interdicting smuggled nuclear material. In:
Woodruff, D.L. (ed.) Network Interdiction and Stochastic Integer Programming,
pp. 1–19. Kluwer Academic Publishers, Boston (2003)
[23] Pan, F., Morton, D.P.: Minimizing a stochastic maximum-reliability path. Net-
works 52(3), 111–119 (2008)
[24] Salmeron, J., Wood, K., Baldick, R.: Worst-case interdiction analysis of large-scale
electric power grids. IEEE Transactions on Power Systems 24(1), 96–104 (2009)
[25] Washburn, A., Wood, K.R.: Two-person zero-sum games for network interdiction.
Operations Research 43(2), 243–251 (1995)
[26] Wein, L.M., Wilkins, A.H., Baveja, M., Flynn, S.E.: Preventing the importation
of illicit nuclear materials in shipping containers. Risk Analysis 26(5), 1377–1393
(2006)
[27] Wollmer, R.: Removing Arcs from a Network. Operations Research 12(6), 934–940
(1964)
[28] Wood, R.K.: Deterministic network interdiction. Mathematical and Computer
Modeling 17, 1–18 (1997)
[29] Zenklusen, R.: Matching interdiction. Arxiv preprint arXiv:0804.3583 (2008)
Strong Combination of Ant Colony Optimization
with Constraint Programming Optimization

M. Khichane, P. Albert, and C. Solnon
1 Introduction
Let us point out that our main objective is not to compete with state-of-the-
art algorithms which are dedicated to solving specific problems, but to show
that sampling the search space with ACO can significantly improve the solution
process of a generic B&P&B approach for solving COPs. For this, we chose CP
Optimizer as our reference.
The rest of this paper is organized as follows. In Section 2, we recall some
definitions about COP, CP, and ACO. Section 3 describes the CPO-ACO algo-
rithm. In section 4, we give some experimental results on the multidimensional
knapsack problem, the quadratic assignment problem and the maximum inde-
pendent set problem. We conclude with a discussion on some other related work
and further work.
2 Background
2.1 COP
A COP is defined by a tuple P = (X, D, C, F ) such that X = {x1 , . . . , xn }
is a set of n decision variables; for every variable xi ∈ X, D(xi ) is a finite
set of integer values defining the domain of xi ; C is a set of constraints; and
F : D(x1 ) × . . . × D(xn ) −→ R is an objective function to optimize.
An assignment A is a set of variable-value couples denoted < xi , vi > which
correspond to the assignment of a value vi ∈ D(xi ) to a variable xi . An assign-
ment A is complete if all variables of X are assigned in A; it is partial otherwise.
An assignment is inconsistent if it violates a constraint and it is consistent other-
wise. A feasible solution is a complete consistent assignment. A feasible solution
A of P is optimal if, for every other feasible solution A′ of P, F(A) ≤ F(A′) if
P is a minimization problem, or F(A) ≥ F(A′) if P is a maximization problem.
domains of some variables. Refalo [9] has defined the impact of the assignment
xi = vi as the proportion of search space removed. He has defined the impact
of a value as the average of its observed impacts and the impact of a variable
as the average of the impact of its remaining values. He has shown that these
impacts may be used to define valuable ordering heuristics.
3 Description of CP O − ACO
ACO has shown to be very effective for quickly finding good solutions to many
COPs. However, designing ACO algorithms for new COPs implies a lot of pro-
gramming: while the procedures for managing and exploiting pheromone are very similar
from one COP to another, so that one can easily reuse them, solving a new COP
requires writing procedures for propagating and checking problem-dependent
constraints. Hence, a first motivation for combining ACO with CP is to reuse
the numerous available procedures for managing constraints. Moreover, combin-
ing ACO with CP optimizer allows us to take the best of these two approaches:
– During a first phase, CP Optimizer is used to sample the space of feasible
solutions, and pheromone trails are used to progressively intensify the search
around the best feasible solutions.
– During a second phase, CP Optimizer is used to search for an optimal solu-
tion, and the pheromone trails collected during the first phase are used to
guide CP Optimizer in this search.
12 if Abest is strictly better than all feasible solutions of {A1, ..., AnbAnts} then
13   foreach <xi, vi> ∈ Abest do τ(xi, vi) ← min(τmax, τ(xi, vi) + 1)
14 until time spent ≥ tmax1 or number of cycles without improvement of Abest ≥ itmax or average distance of {A1, ..., AnbAnts} ≤ dmin;
15 return Abest and τ
contain the best feasible solutions with respect to the objective function. This
pheromone structure associates a pheromone trail τ (xi , vi ) with each variable
xi ∈ X and each value vi ∈ D(xi ). Each pheromone trail is bounded between
two given bounds τmin and τmax , and is initialized to τmax (line 1) as proposed
in [10]. At the end of the first phase, the pheromone structure τ is returned so
that it can be used in the second phase as a value ordering heuristic.
where impact (vi ) is the observed impact of value vi as defined in [9], and α and β
are two parameters that weight the pheromone and impact factors respectively.
Hence, during the first cycle, values are randomly chosen with respect to impacts
only as all pheromone trails are initialized to the same value (i.e., τmax ). However,
at the end of each cycle, pheromone trails are updated so that these probabilities
are progressively biased with respect to past constructions.
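As an illustration only, a common ACO-style way to turn pheromone trails and a heuristic score into a randomized value choice looks like the sketch below. The product form, the roulette-wheel selection, and the way impacts are turned into a score `eta` are our assumptions here, not the paper's exact rule.

```python
import random

def choose_value(xi, domain, tau, eta, alpha=1.0, beta=1.0):
    """Pick a value for variable xi with probability proportional to
    tau[(xi, v)]**alpha * eta[v]**beta (roulette-wheel selection).

    tau holds the pheromone trails; eta is some heuristic desirability score,
    e.g. derived from the observed impacts in the sense of [9].
    """
    weights = [tau[(xi, v)] ** alpha * eta[v] ** beta for v in domain]
    r = random.uniform(0.0, sum(weights))
    acc = 0.0
    for v, w in zip(domain, weights):
        acc += w
        if r <= acc:
            return v
    return domain[-1]

tau = {("x1", 0): 5.0, ("x1", 1): 5.0}   # all trails start at tau_max
eta = {0: 0.3, 1: 0.7}
print(choose_value("x1", [0, 1], tau, eta))
```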
It is worth mentioning here that our CPO-ACO framework is designed to
solve underconstrained COPs that have a rather large number of feasible solu-
tions (such as, for example, MKPs or QAPs): when solving these problems, the
difficulty is not to build a feasible solution, but to find the feasible solution that
optimizes the objective function. Hence, on these problems CP Optimizer is able
to build feasible solutions very quickly, with very few backtracks. Our CPO-ACO
framework may be used to solve more tightly constrained COPs. However, for
these problems, CP Optimizer may backtrack a lot (and therefore need more
CPU time) to compute each feasible solution. In this case, pheromone learning
will be based on a very small set of feasible solutions so that it may not be very
useful and CPO-ACO will simply behave like CP Optimizer.
Pheromone laying step: At the end of each cycle, good feasible solutions (with
respect to the objective function) are rewarded in order to intensify the search
around them. Lines 8-11, the best feasible solutions of the cycle are rewarded.
Lines 12-13, the best feasible solution built so far is rewarded if it is better than
the best feasible solutions of the cycle (otherwise it is not rewarded as it belongs
to the best feasible solutions of the cycle that have already been rewarded).
In both cases, a feasible solution A is rewarded by increasing the quantity of
pheromone lying on every couple <xi, vi> of A, thus increasing the probability
of assigning xi to vi . The quantity of pheromone added is inversely proportional
to the gap between F (A) and F (Abest ).
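For illustration, a bounded reward step in this spirit might look as follows. The 1/(1 + gap) amount is our assumption; the text only states that the added quantity is inversely proportional to the gap, and that trails stay within [τmin, τmax] as in [10].

```python
def reward(assignment, F_A, F_best, tau, tau_min=0.01, tau_max=10.0):
    """Increase the pheromone on every (variable, value) couple of a rewarded assignment.

    The added quantity shrinks as the gap between F(A) and F(A_best) grows, and
    every trail is kept inside the interval [tau_min, tau_max].
    """
    delta = 1.0 / (1.0 + abs(F_best - F_A))
    for (xi, vi) in assignment:
        current = tau.get((xi, vi), tau_max)
        tau[(xi, vi)] = min(tau_max, max(tau_min, current + delta))

tau = {("x1", 1): 2.0}
reward([("x1", 1), ("x2", 0)], F_A=95, F_best=100, tau=tau)
print(tau)  # ("x1", 1) grows by 1/6; ("x2", 0) is created and clamped at the tau_max ceiling
```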
Termination conditions: The first phase is stopped either if the CPU time
limit of the first phase tmax1 has been reached, or if Abest has not been improved
since itmax iterations, or if the average distance between the assignments com-
puted during the last cycle is smaller than dmin , thus indicating that pheromone
trails have allowed the search to converge. We define the distance between two
assignments with respect to the number of variable/value couples they share,
i.e., the distance between A1 and A2 is (|X| − |A1 ∩ A2|) / |X|.
– We have considered a framework where, during the first phase, the sampling
is done randomly, without using pheromone for biasing probabilities (i.e., α
is set to 0). This framework also obtains significantly worse results, showing
us that it is worth using an ACO learning mechanism.
– We have considered a framework where, during the second phase, the value
ordering heuristic is defined by the probabilistic rule used in the first phase,
instead of selecting the value that maximizes the formula. This framework
obtains results that are not significantly different on most instances.
The Maximum Independent Set (MIS) involves selecting the largest subset of
vertices of a graph such that no two selected vertices are connected by an edge
(this problem is equivalent to searching for a maximum clique in the inverse
graph). The associated CP model is such that
– X = {x1 , . . . , xn } associates a decision variable xi with every vertex i;
– ∀xi ∈ X, D(xi ) = {0, 1} so that xi = 0 if vertex i is not selected, and 1
otherwise;
– C associates a binary constraint c_ij with every edge (i, j) of the graph.
This constraint ensures that i and j are not both selected, i.e., c_ij =
(x_i + x_j < 2).
– the objective function to maximize is $F = \sum_{i=1}^{n} x_i$ (a small illustrative rendering of this model is sketched below).
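To make the model concrete, the following naive sketch enumerates all {0,1} assignments of a tiny graph and keeps the best one satisfying every edge constraint; it only illustrates the model itself, not how CPO or CPO-ACO actually search.

```python
from itertools import product

def solve_mis(n, edges):
    """Brute-force the MIS model: x_i in {0,1}, x_i + x_j <= 1 for every edge,
    maximize sum(x)."""
    best, best_x = -1, None
    for x in product((0, 1), repeat=n):
        if all(x[i] + x[j] <= 1 for (i, j) in edges):
            if sum(x) > best:
                best, best_x = sum(x), x
    return best, best_x

# Path graph 0-1-2: the maximum independent set is {0, 2}.
print(solve_mis(3, [(0, 1), (1, 2)]))  # (2, (1, 0, 1))
```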
We have considered instances of the MIS problem which are available at
http://www.nlsde.buaa.edu.cn/∼kexu/benchmarks/graph-benchmarks.htm.
For both CPO and CPO-ACO, we performed 30 runs per problem instance
with a different random-seed for each run.
Table 1. Comparison of CPO and CPO-ACO on the MKP, the QAP and the MIS.
Each line successively gives: the name of the class, the number of instances in the class
(#I), the average number of variables in these instances (#X), the results obtained by
CPO (resp. CPO-ACO), i.e., the percentage of deviation from the best known solution
(average (avg) and standard deviation (sd)), the percentage of instances for which
CPO (resp. CPO-ACO) has obtained strictly better average results (>avg ), and the
percentage of instances for which CPO (resp. CPO-ACO) is significantly better w.r.t.
the statistical test.
Let us first note that CPO and CPO-ACO (nearly) never reach the best known
solution: indeed, best known solutions have usually been computed with state-of-
the-art dedicated approaches. Both CPO and CPO-ACO are completely generic
approaches that do not aim at competing with these dedicated approaches which
have often required a lot of programming and tuning work. Also, we have chosen
a reasonable CPU time limit (300 seconds) in order to allow us to perform a
significant number of runs per instance, thus allowing us to use statistical tests.
Within this rather short time limit, CPO-ACO obtains competitive results with
dedicated approaches on the MKP (less than 1% of deviation from best known
solutions); however, it is rather far from best known solutions on many instances
of the QAP and the MIS.
Let us now compare CPO with CPO-ACO. Table 1 shows us that using ACO
to guide CPO search improves the search process on all classes except two. How-
ever, this improvement is more important for the MKP than for the two other
problems. As the two approaches have obtained rather close results on some in-
stances, we have used statistical tests to determine if the results are significantly
different or not: we have performed the Student’s t-test with significance level
of 0.05, using the R Stats Package available at http://sekhon.berkeley.edu/doc-
/html/index.html. For each class, we report the percentage of instances for which
an approach has obtained significantly better results than the other one (column
>t−test of table 1). For the MKP, CPO-ACO is significantly better than CPO
for 57 instances, whereas it is not significantly different for 3 instances. For
the QAP, CPO-ACO is significantly better than CPO on a large number of in-
stances. However, CPO is better than CPO-ACO on one instance of the class
tai* of the QAP. For the MIS, CPO-ACO is significantly better than CPO on
35% of instances, but it is not significantly different on all other instances.
5 Conclusion
We have proposed CPO-ACO, a generic approach for solving COPs defined by
means of a set of constraints and an objective function. This generic approach
combines a complete B&P&B approach with ACO. One of the main ideas behind
this combination is the utilization of the effectiveness of (i) ACO to explore the
search space and quickly identify promising areas, and (ii) CP Optimizer to strongly
exploit the neighborhood of the best solutions found by ACO. This combination
allows us to reach a good balance between diversification and intensification
of the search: diversification is mainly ensured during the first phase by ACO;
intensification is ensured by CP optimizer during the second phase.
It is worth noting that, thanks to the modular nature of IBM ILOG CP Optimizer, which clearly separates the modeling part of the problem from its resolution
part, the proposed combination of ACO and CP was made in a natural way. Hence,
the CPO-ACO program used was exactly the same for the experiments on the
different problems considered in this work.
We have shown through experiments on three different COPs that CPO-ACO
is significantly better than CP Optimizer.
Acknowledgments
The authors thank Renaud Dumeur, Jérôme Rogerie, Philippe Refalo and
Philippe Laborie for the fruitful and animated discussions about combinato-
rial optimization in general case and, particularly, the search strategies. Also,
we thank Renaud Dumeur and Paul Shaw for proofreading drafts of this paper.
References
1. Nemhauser, G., Wolsey, A.: Integer and combinatorial optimization. John Wiley
& Sons, New York (1988)
2. Papadimitriou, C., Steiglitz, K.: Combinatorial optimization–Algorithms and com-
plexity. Dover, New York (1982)
3. Kirkpatrick, S., Gellat, C., Vecchi, M.: Optimization by simulated annealing. Sci-
ence 220, 671–680 (1983)
4. Glover, F., Laguna, M.: Tabu Search. Kluwer Academic Publishers, Dordrecht
(1997)
5. Lourenço, H., Martin, O., Stützle, T.: Iterated local search. In: Glover, F., Kochen-
berger, G. (eds.) Handbook of Metaheuristics. International Series in Operations
Research Management Science, vol. 57, pp. 321–353. Kluwer Academic Publisher,
Dordrecht (2001)
6. Hansen, P., Mladenović, N.: An introduction to variable neighborhood search. In:
Voß, S., Martello, S., Osman, I., Roucairol, C. (eds.) Meta-heuristics: advances and
trends in local search paradigms for optimization, pp. 433–438. Kluwer Academic
Publisher, Dordrecht (1999)
7. Hentenryck, P.V., Michel, L.: Constraint-Based Local Search. MIT Press, Cam-
bridge (2005)
8. Dorigo, M., Stützle, T.: Ant Colony Optimization. MIT Press, Cambridge (2004)
9. Refalo, P.: Impact-based search strategies for constraint programming. In: Wallace,
M. (ed.) CP 2004. LNCS, vol. 3258, pp. 557–571. Springer, Heidelberg (2004)
10. Stützle, T., Hoos, H.: MAX − MIN Ant System. Journal of Future Generation
Computer Systems 16, 889–914 (2000)
11. Solnon, C.: Ants can solve constraint satisfaction problems. IEEE Transactions on
Evolutionary Computation 6(4), 347–357 (2002)
12. Khichane, M., Albert, P., Solnon, C.: Integration of ACO in a constraint program-
ming language. In: Dorigo, M., Birattari, M., Blum, C., Clerc, M., Stützle, T.,
Winfield, A.F.T. (eds.) ANTS 2008. LNCS, vol. 5217, pp. 84–95. Springer, Heidel-
berg (2008)
13. Solnon, C., Fenet, S.: A study of ACO capabilities for solving the Maximum Clique
Problem. Journal of Heuristics 12(3), 155–180 (2006)
14. Alaya, I., Solnon, C., Ghedira, K.: Ant Colony Optimization for Multi-objective
Optimization Problems. In: 19th IEEE International Conference on Tools with
Artificial Intelligence (ICTAI), pp. 450–457. IEEE Computer Society, Los Alamitos
(2007)
15. Birattari, M., Stutzle, T., Paquete, L., Varrentrapp, K.: A Racing Algorithm for
Configuring Metaheuristics. In: GECCO, pp. 11–18 (2002)
16. Meyer, B.: Hybrids of constructive meta-heuristics and constraint programming:
A case study with ACO. In: Blesa, M.J., Blum, C., Cotta, C., Fernández, A.J.,
Gallardo, J.E., Roli, A., Sampels, M. (eds.) HM 2008. LNCS, vol. 5296. Springer,
Heidelberg (2008)
Service-Oriented Volunteer Computing
for Massively Parallel Constraint Solving
Using Portfolios

Z. Kiziltan and J. Mauro
1 Introduction
Recent years have witnessed growing interest in parallelising constraint solving
based on tree search (see [1] for a brief overview). One approach is search-space
splitting in which different parts of the tree are explored in parallel (e.g. [2]).
Another approach is the use of algorithm portfolios. This technique exploits
the significant variety in performance observed between different algorithms and
combines them in a portfolio [3]. In constraint solving, an algorithm can be a
solver or a tuning of a solver. Portfolios have often been run in an interleaving
fashion (e.g. [4]). Their use in a parallel context is more recent ([5], [1]).
Considering the complexity of the constraint problems and thus the compu-
tational power needed to tackle them, it is appealing to benefit from large-scale
parallelism and push for a massive number of CPUs. Bordeaux et. al have inves-
tigated this in [1] . By using the portfolio and search-space splitting strategies,
they have conducted experiments on constraint problems using a parallel com-
puter with the number of processors up to 128. They reported that the parallel
portfolio approach scales very well in SAT, in the sense that utilizing more pro-
cessors consistently helps solving more instances in a fixed amount of time.
As done also in [1], most of the prior work in parallel constraint solving as-
sumes a parallel computer with multiple CPUs. This architecture is fairly reliable
and has low communication overhead. However, such a computer is costly and
is not always at our disposal, especially if we want to push for massive paral-
lelism. Jaffar et al. addressed this problem in [2] by using a collection of computers
in a network (called “volunteer computing” in what follows). They employed 61
computers in a search-space splitting approach and showed that such a method
is effectively scalable in ILP.
In this paper, we combine the benefits of [1] and [2] when solving constraint
satisfaction problems (CSPs). We present an architecture in which a massive number of volunteer computers can run several (tunings of) constraint solvers in parallel in a portfolio approach so as to solve many CSPs in a fixed amount of time.
The architecture is implemented using the service-oriented computing paradigm
and is thus modular, open to useful extensions, and able to utilise even
computers behind a firewall. We report experiments with up to 100 computers. As
the results confirm, the architecture is effectively scalable.
3 Our Architecture
Fig. 1 depicts our architecture using a notation similar to UML communication
diagrams. When services are used, we can have two kinds of messages: a one-way
message, denoted by the string "message name(data sent)", and a request-response
message, denoted by "message name(data sent)(data received)".
The figure is read as follows. The user utilises the redirecting service to get the
location of the preprocessing service and then sends to the preprocessing service
a problem instance ik to be solved. Once ik is sent, the preprocessing service
contacts the CBR service which runs a case-based reasoning system to provide
the expected solving time tk of ik . The preprocessing server then sends tk and ik
to the instance distributor service. This service is responsible for scheduling the
instances for different (tunings of) solvers and assigning the related jobs to the
volunteer computers. This can be done in a more intelligent way thanks to tk
provided by the CBR service. This value can be used for instance to minimize the
average solving time. Finally, the volunteer service asks the redirecting service
the location of the instance distributor service and then requests a job from
it using a request response message. This is the only way a job can be sent
to the volunteer service. Note that the use of the redirecting service makes it
possible to have multiple preprocessing and instance distributor services in the
future.
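As a rough in-process illustration of this interaction (class, method, and solver names below are invented for the sketch; the real components are web services), the distributor can keep pending instances ordered by the expected solving time t_k supplied by the CBR service and hand out one job per volunteer request:

```python
import heapq

class InstanceDistributor:
    """Keeps a priority queue of (expected_time, instance) pairs and hands out
    one (solver, instance) job per volunteer request."""

    def __init__(self, solvers):
        self.solvers = solvers
        self.queue = []

    def submit(self, instance, expected_time):
        heapq.heappush(self.queue, (expected_time, instance))

    def request_job(self):
        """Called by a volunteer: returns the next job, or None if nothing is pending."""
        if not self.queue:
            return None
        expected_time, instance = heapq.heappop(self.queue)
        solver = self.solvers[len(self.queue) % len(self.solvers)]  # naive portfolio rotation
        return solver, instance

dist = InstanceDistributor(solvers=["minisat", "gecode"])
dist.submit("instance-42.cnf", expected_time=120.0)
dist.submit("instance-7.cnf", expected_time=15.0)
print(dist.request_job())  # the instance with the smallest expected time goes out first
```

Ordering by expected solving time is only one possible policy; the point of the t_k estimate is that the distributor can schedule instances more intelligently than first-come-first-served.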
Fig. 1. Architecture
n°    Easy SAT (30 min)   Easy UNSAT (30 min)   Hard SAT (1h)   Hard UNSAT (1h)
 20      15  14  15          17  18  18            3   3   6       7   7   9
 40     132 128 135         150 150 150            8   8   7      16  17  13
 60     141 140 140         320 318 322           19  15  14      23  23  22
 80     144 145 151         335 323 328           25  21  25      29  30  30
100     179 179 192         336 345 334           25  25  25      44  33  36
runs. The results are promising. Even without the CBR service and the differ-
ent tunings of solvers, the number of the instances solved in a fixed amount of
time increases as the number of computers increases, no matter how busy the
volunteer computers are. Note that only one computer is used to run the prepro-
cessing and the instance distributor services, and yet the system can handle 100
computers without any problems. The main reason for not always obtaining a
linear speed-up is that some solvers cannot solve even the easy instances in less
than 30 minutes. Hence, many computers are spending more than 30 minutes
for solving an already solved instance. This has happened 104 times in the tests
of easy SAT instances with 100 volunteer computers. In the same tests, we also
encountered 35 solver failures. These observations suggest that we should allow the
interruption of a computation if the related instance is already solved.
5 Related Work
References
1. Bordeaux, L., Hamadi, Y., Samulowitz, H.: Experiments with massively parallel
constraint solving. In: Proceedings of the 21st International Joint Conference on
Artificial Intelligence (IJCAI 2009), pp. 443–448 (2009)
2. Jaffar, J., Santosa, A.E., Yap, R.H.C., Zhu, K.Q.: Scalable distributed depth-first
search with greedy work stealing. In: Proceedings of the 16th IEEE International
Conference on Tools with Artificial Intelligence (ICTAI 2004), pp. 98–103 (2004)
3. Gomes, C., Selman, B.: Algorithm portfolios. Artificial Intelligence 1-2, 43–62 (2001)
4. O’Mahony, E., Hebrard, E., Holland, A., Nugent, C., O’Sullivan, B.: Using case-
based reasoning in an algorithm portfolio for constraint solving. In: Proceedings of
the 19th Irish Conference on Artificial Intelligence, AICS 2008 (2008)
5. Hamadi, Y., Jabbour, S., Sais, L.: Manysat: Solver description. In: SAT-Race 2008
(2008)
Constraint Programming with Arbitrarily Large
Integer Variables
Anna Moss
1 Introduction
One of the basic components in constraint modeling is the integer constraint variable,
which is used to represent signed integers with a finite domain. The existing CP tools
use some programming language integer types, e.g. signed 32 or 64 bit integers, to
represent domain values of an integer variable. In this approach, upper and lower
bounds on integer variable domains are inherently limited by the maximal and mini-
mal values that can be represented by the corresponding integer type. This limitation
becomes in fact an implicit constraint on integer variables, reducing the expressive
capabilities of integer modeling.
The limitation above does not impair the modeling in most combinatorial problems
that constitute the conventional application domain of CP technology, since the inte-
ger types typically suffice to represent values in this kind of problems. However, in
recent years the application domain of CP technology has expanded to additional
areas. In particular, new CP applications emerged where integers modeled by integer
variables can be arbitrarily large. An example of such application is the functional test
generation [1], [2], [3], [4]. Functional tests are required in the task of simulation
based approach to hardware design verification. In this approach, the design under
test (DUT) is examined against a large amount of functional tests, which exercise
numerous execution scenarios in order to expose potential bugs. The work of valida-
tion engineers on developing test suites is facilitated by the use of automated test
generation tools. CP technology provides powerful means in performing automated
functional test generation. In particular, constraint modeling provides the capability to
declaratively describe the DUT specification that defines a valid test as well as to
describe a specific test scenario. Moreover, advanced CP algorithms can be used by
automated test generation tools to produce tests that answer the architecture and sce-
nario requirements.
In the application described above, integer variables are used to model hardware
architectural components of large bit width, e.g. memory addresses or operand data.
The domain size of corresponding variables cannot be accommodated by common
integer types. Moreover, as hardware designs become more complex, the sizes of
such architectural components keep growing. This poses a requirement for integer
variables of arbitrarily large domain size.
The CP application to functional test generation also poses additional requirements
on operations supported by integer variables in CP modeling. In the examples above,
integers are viewed as unsigned machine words of a fixed bit width. The common
operations performed on this kind of integers are not confined to the set of standard
arithmetic operations provided in the traditional CP framework. They include, for
example, addition/subtraction modulo the bit width, left/right shift, and other opera-
tions that can be performed on machine words by a processor.
To the best of our knowledge, very few attempts have been made to extend the
traditional CP framework to accommodate integer variables of arbitrarily large size.
The only example we are aware of comes from the domain of Constraint Logic Pro-
gramming (CLP) [5]. This work reports the implementation of a Prolog based solver
supporting integer variables of arbitrarily large size. However, neither theoretical
results describing the implemented algorithms nor experimental results demonstrating
the performance have been published. Aside from the example above, the problem of
domain size limitation for integer variables has never been addressed in the CP litera-
ture and none of the existing CP tools supports integer variables with arbitrarily large
domain size. Such a limitation impairs the applicability of CP technology to a range
of problems where modeling of large integers is required. For example, in tools for
automated test generation, the problem of large integer variables and an extended
arithmetic operation set is typically addressed by developing specialized solvers,
e.g. [6]. Such an approach is not generic, incurs a large development cost, and makes
it hard to take advantage of cutting-edge CP technology. This paper addresses this
problem and extends the traditional CP framework by introducing a new CP modeling
means, namely, an integer variable supporting arbitrarily large domain size as well as
an extended operation set. We refer to this new version of integer variable as LiVar
(standing for Large Integer VARiable). We show how the LiVar abstraction can be real-
ized within a traditional CP framework by means of global constraints over standard
integer variables. The same ideas can be applied to implement a native variable of
LiVar type in a CP tool.
2 Background
For the sake of completeness, in this section we provide CP background required to
facilitate the presentation of the rest of this paper. An in-depth survey of the tradi-
tional CP can be found in [7].
The CP paradigm comprises the modeling of a problem as a CSP, constraint propa-
gation, search algorithms, and heuristics. A CSP is defined by:
• a set of constrained variables. Each variable is associated with a (finite) domain
defined as a collection of values that the variable is allowed to take;
• a set of constraints. A constraint is a relation defined on a subset of variables which
restricts the combinations of values that the variables can take simultaneously.
A solution to a CSP is an assignment of values to variables so that each variable is
assigned a value from its domain and all the constraints are satisfied.
A CSP formulation of a problem is processed by a constraint solver, which at-
tempts to find a solution using a search algorithm combined with reductions of vari-
able domains based on constraint information. The latter mechanism is known as
constraint propagation. During the constraint propagation, domains of the variables
involved in the constraint are reduced until some type of consistency is achieved. For
example, one of the possible types of consistency is the generalized arc consistency
(GAC), also known as domain consistency [8], which implies that for each value in
the domain of a variable, there exists a value in the domain of each of the other vari-
ables participating in the constraint so that the assignment of these values to the vari-
ables satisfies the constraint. Another commonly used consistency type is the bounds
consistency, also known as interval consistency [8], which implies that for the mini-
mal and maximal domain values of a variable there exists a value within the domain
bounds of each of the other variables participating in the constraint so that the as-
signment of these values to the variables satisfies the constraint. To ensure the re-
quired type of consistency, a solver associates each constraint type with its specific
propagation algorithm. There can be a number of propagation algorithms tied to one
constraint type, where each algorithm is responsible for propagating a specific domain
change to domains of other variables involved in the constraint. Such algorithms are
known as propagation demons.
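As a minimal illustration (ours, not taken from the surveyed literature), the following Python sketch shows a bounds-consistency propagator for a constraint x + c ≤ y over interval domains; its two updates play the role of propagation demons reacting to bound changes.

def propagate_leq(dom_x, dom_y, c=0):
    """Bounds-consistency propagation for x + c <= y on interval domains given as
    (min, max) pairs; returns the reduced domains, or None on domain wipe-out."""
    x_min, x_max = dom_x
    y_min, y_max = dom_y
    x_max = min(x_max, y_max - c)    # demon triggered by a change of max(y)
    y_min = max(y_min, x_min + c)    # demon triggered by a change of min(x)
    if x_min > x_max or y_min > y_max:
        return None                   # wipe-out: the constraint cannot be satisfied
    return (x_min, x_max), (y_min, y_max)

if __name__ == "__main__":
    print(propagate_leq((0, 10), (3, 7), c=2))   # ((0, 5), (3, 7))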
3 LiVar Definition
In this section we formally introduce the notion of LiVar.
We define LiVar as a variable type used to represent unsigned integers of a fixed
bit width in a CSP model. LiVar is specified by a single natural number n indicating
the bit width of the corresponding object. Let A(n) be a LiVar, then the domain of A is
the integer range [0…2^n−1]. An arithmetic operation on LiVars results in an expres-
sion also associated with a bit width. We refer to an arithmetic expression involving
LiVars as LiExpr. LiVar can be viewed as a special case of LiExpr.
The following arithmetic operations are defined on a LiExpr A(n):
• addition/subtraction modulo n of a constant c
• addition/subtraction of another LiExpr B(m) modulo max(n,m)
• shift left by k bits
• shift right by k bits
• multiplication by a constant 2^c
• division by a constant 2^c
• sub-range extraction; given the range start bit index bstart and the range
end bit index bend, this operation returns the expression equal to the value
of the sub-range of A within the specified range bounds, A[bstart:bend]
In addition, a LiExpr A(n) supports the arithmetic comparison constraints (=, ≠, >, ≥,
<, ≤) defined on the standard integer variables, where the comparison can be per-
formed with either a constant value or another LiExpr.
We observe that the domain size of a LiVar is 2^n for any specified bit width n and
it is not limited by the size of an integer type.
We also note that the definition of LiVar presented above is motivated by applica-
tions where integer variables are used to model unsigned machine words. However,
the ideas presented in this paper can be also applied to implement arbitrarily large
signed integer variables.
We represent a LiVar (or a LiExpr) A(n) as an array of standard integer CSP variables
A1,A2,…,Am so that each variable Ai, 1≤i≤m, represents the corresponding sub-range of
A(n). Specifically, we choose a granularity parameter k so that the value 2^k−1 is small
enough to be representable by a standard integer type. Then the number of integer
variables required for the representation of A(n) is m = ⌈n/k⌉. Assuming the bits of A(n) are
numbered from 0 (the least significant bit) to n−1 (the most significant bit), each vari-
able Ai, with the possible exception of Am, represents a sub-range of A(n) of size k
between the bits k·(i−1) and k·i−1, for 1≤i≤m−1. The last variable Am represents a
possibly smaller sub-range of the most significant bits of A(n), namely, the range
between the bits k·(m−1) and n−1. Consequently, the variables Ai, 1≤i≤m−1, have the
domain [0…2^k−1], and Am has the domain [0…2^(n+k−k·m)−1].
To facilitate the understanding, the proposed representation can be thought of as a
“byte” representation of A(n), with the “byte” size of k bits. We will refer to the inte-
ger variables in the representation of LiVar A as the byte variables of A.
To illustrate the proposed LiVar representation, we consider an example of a LiVar
A(20) for k=8. For this value of k, A(20) is represented by three standard integer vari-
ables A1[0…FFh], A2[0…FFh] and A3[0…Fh]. The state in which there remain two
possible values in the domain of A, namely 91BE2h and A1B3Ch (hexadecimal
representation of values is used for the ease of transformation), corresponds to the
state of domains of A1, A2, A3 as shown in Fig. 1.
Fig. 1. The domains of the byte variables A1 (bits 7–0), A2 (bits 15–8) and A3 (bits 19–16) in the state where A has the two remaining values 91BE2h and A1B3Ch
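To make the decomposition concrete, the following Python sketch (ours; the function names byte_domains and to_bytes are illustrative) computes the byte-variable domains for a LiVar A(n) with granularity k and splits a concrete value into its byte values, reproducing the A(20), k = 8 example above.

import math

def byte_domains(n, k):
    """Initial domains (0, max) of the byte variables A1..Am for a LiVar A(n)
    with granularity k, following the decomposition m = ceil(n / k)."""
    m = math.ceil(n / k)
    domains = [(0, 2 ** k - 1)] * (m - 1)              # A1 .. A(m-1): full k-bit ranges
    domains.append((0, 2 ** (n - k * (m - 1)) - 1))    # Am: the remaining high-order bits
    return domains

def to_bytes(value, n, k):
    """Split an n-bit value into its byte-variable values, least significant first."""
    m = math.ceil(n / k)
    return [(value >> (k * i)) & (2 ** k - 1) for i in range(m)]

if __name__ == "__main__":
    print(byte_domains(20, 8))                          # [(0, 255), (0, 255), (0, 15)]
    print([hex(b) for b in to_bytes(0x91BE2, 20, 8)])   # ['0xe2', '0x1b', '0x9']
    print([hex(b) for b in to_bytes(0xA1B3C, 20, 8)])   # ['0x3c', '0x1b', '0xa']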
In this subsection we show how the arithmetic comparison constraints (=, ≠, >, ≥, <,
≤) can be implemented for the LiVar representation defined in Section 4.1. The proposed
implementation for all of these constraints is by means of standard comparison con-
straints on integer variables.
We start with constraint formulations for comparisons between two LiVars. The
equality and inequality constraint implementation is straightforward. Let A(n1) and
B(n2) be two LiVars, and let Ai, 1≤i≤m1 and Bj, 1≤j≤m2 be the byte variables in the
representations of these LiVars as defined in Section 4.1. We assume without loss
of generality that m1 ≥ m2. Then the equality constraint A = B is equivalent to
∧_{1≤i≤m2} (Ai = Bi) ∧ ∧_{m2<j≤m1} (Aj = 0),
and the disequality constraint A ≠ B is equivalent to
∨_{1≤i≤m2} (Ai ≠ Bi) ∨ ∨_{m2<j≤m1} (Aj ≠ 0).
The straightforward disjunctive formulation of the greater than constraint A > B,
∨_{1≤i≤m} ( ∧_{m≥j>i} (Aj = Bj) ∧ (Ai > Bi) ),
is weak due to poor propagation of the disjunction constraint. For instance, a combi-
nation of greater/less than relations that uniquely determines the values for some of
the byte variables of A or B, would not be propagated to those variables, leaving their
domains unchanged.
Instead we propose the following formulation that achieves efficient propagation
between integer variables in LiVar representation. We define for each 2≤i≤m the
following constraint, denoted EqualPrefix(i): ∧_{m≥j≥i} (Aj = Bj). Then the constraint A > B
is formulated as follows:
(Am ≥ Bm) ∧ ∧_{m≥i≥3} (EqualPrefix(i) ⇒ (Ai−1 ≥ Bi−1)) ∧ (EqualPrefix(2) ⇒ (A1 > B1))   (4.1)
The formulation above is conjunction based and propagates any relevant change in
domains of byte variables. If A and B have different representation lengths mA and mB,
then the constraint above should be augmented as follows. If mA > mB, then the resulting
constraint is the disjunction of the constraint (4.1) with ∨_{mB<i≤mA} (Ai > 0). Otherwise,
the resulting constraint is the conjunction of the constraint (4.1) with ∧_{mA<i≤mB} (Bi = 0).
The implementations of less than and less than or equal constraints are analogous to
those of greater than and greater than or equal constraints presented above.
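As an illustration (our sketch, not the paper's implementation), the following Python code evaluates formulation (4.1) on fully assigned byte vectors of equal length m (index 0 holds the least significant byte) and checks it exhaustively against ordinary integer comparison on small instances.

from itertools import product

def greater_than_decomposition(A, B):
    """Evaluate formulation (4.1) on fully assigned byte vectors A, B of equal length m."""
    m = len(A)
    def equal_prefix(i):                 # EqualPrefix(i): A_j = B_j for all m >= j >= i
        return all(A[j - 1] == B[j - 1] for j in range(i, m + 1))
    ok = A[m - 1] >= B[m - 1]
    for i in range(3, m + 1):            # m >= i >= 3
        ok = ok and ((not equal_prefix(i)) or (A[i - 2] >= B[i - 2]))
    if m >= 2:
        ok = ok and ((not equal_prefix(2)) or (A[0] > B[0]))
    return ok

def to_int(byte_values, k=4):
    return sum(b << (k * i) for i, b in enumerate(byte_values))

if __name__ == "__main__":
    # Exhaustive check on three "nibbles" (k = 4), with values restricted to 0..3 for speed.
    for A in product(range(4), repeat=3):
        for B in product(range(4), repeat=3):
            assert greater_than_decomposition(list(A), list(B)) == (to_int(A) > to_int(B))
    print("formulation (4.1) agrees with integer comparison")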
We observe that when LiVar representations of A and B have the same length, the
comparison constraints between A and B are equivalent to the lexicographic ordering
constraints on vectors of variables presented in [10]. The constraint formulations
above are similar to one of the alternative formulations for the lexicographic ordering
constraint given in [10]. The latter work also presents a GAC propagation algorithm
for lexicographic ordering and demonstrates experimentally that for selected combi-
natorial problems this algorithm outperforms the alternative formulation. However, in
the context of this paper where vectors of variables represent LiVars and allowed
constraint types are confined to the set defined in Section 3, the consistency level
achieved by the formulation shown above is typically sufficient to eliminate back-
tracking and GAC enforcement is not required.
Finally, the comparison constraints of LiVar A with a constant c are implemented
similarly to comparisons of LiVar A with LiVar B. In this case, the values of c in the
corresponding sub-ranges of size k participate in the constraint formulations in place
of the variables Bj.
Fig. 2. Alignment of a byte variable Ai of A with the overlapping byte variables Clow(i) and Chigh(i) of C = A[bstart:bend], with offset δ
The following filtering procedures for Chigh(i) and Clow(i) are performed when the
domain boundaries of Ai change. The filtering procedure for Chigh(i) relies on the
equality between the least significant bits (the suffix) of Ai and the most significant
bits (the prefix) of Chigh(i) to reduce the allowed range of the latter. Due to the modulo
operation, this reduction can lead to removing an internal sub-range of Chigh(i).
UpdateHigh(i):
  d ← δ
  if (d = 0)
    d ← k
  if (DomainMax(Ai) − DomainMin(Ai) < 2^d − 1)
    SuffixMax ← DomainMax(Ai) mod 2^d
    SuffixMin ← DomainMin(Ai) mod 2^d
    if (SuffixMax < SuffixMin)
      removeDomainRange(Chigh(i), (SuffixMax+1)·2^(k−d), (SuffixMin−1)·2^(k−d) + 2^(k−d) − 1)
    else
      setDomainMin(Chigh(i), SuffixMin·2^(k−d))
      setDomainMax(Chigh(i), SuffixMax·2^(k−d) + 2^(k−d) − 1)
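The following Python sketch (ours) mirrors the reconstructed UpdateHigh procedure on an explicit, set-based domain of Chigh(i), under the assumption stated above that the d least significant bits of Ai must equal the d most significant bits of the k-bit variable Chigh(i); the actual propagator operates on interval domains via removeDomainRange and setDomainMin/Max.

def update_high(dom_ai, dom_chigh, k, delta):
    """Filter dom_chigh given that the d low bits of Ai equal the d high bits
    of the k-bit variable Chigh(i); d = delta, or k when delta = 0."""
    d = delta if delta != 0 else k
    lo, hi = min(dom_ai), max(dom_ai)
    if hi - lo < 2 ** d - 1:                       # otherwise every suffix is still possible
        suffix_min, suffix_max = lo % 2 ** d, hi % 2 ** d
        if suffix_max < suffix_min:                # the suffix range wraps around
            allowed = set(range(suffix_min, 2 ** d)) | set(range(0, suffix_max + 1))
        else:
            allowed = set(range(suffix_min, suffix_max + 1))
        dom_chigh = {v for v in dom_chigh if (v >> (k - d)) in allowed}
    return dom_chigh

if __name__ == "__main__":
    # k = 8, delta = 3: Ai in [250, 252] has low 3 bits in {2, 3, 4},
    # so the top 3 bits of Chigh(i) must be 2, 3 or 4.
    dom = update_high(set(range(250, 253)), set(range(256)), k=8, delta=3)
    assert dom == set(range(2 * 32, 5 * 32))
    print(len(dom), "values remain")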
The filtering procedure for Clow(i) relies on the equality between the prefix of Ai
and the suffix of Clow(i) to remove the sub-ranges corresponding to forbidden suffixes
from the domain of the latter.
UpdateLow(i):
  if (δ > 0)
    PrefixMax ← DomainMax(Ai) / 2^(k−δ)
    PrefixMin ← DomainMin(Ai) / 2^(k−δ)
    forbiddenMin1 ← 0
    forbiddenMax1 ← PrefixMin − 1
    forbiddenMin2 ← PrefixMax + 1
    forbiddenMax2 ← 2^(k−δ) − 1
    for each prefix: PrefixMin ≤ prefix ≤ PrefixMax
      removeDomainRange(Clow(i), prefix·2^(k−δ) + forbiddenMin1, prefix·2^(k−δ) + forbiddenMax1)
      removeDomainRange(Clow(i), prefix·2^(k−δ) + forbiddenMin2, prefix·2^(k−δ) + forbiddenMax2)
We observe that the procedures presented above address the main case, in which all
the byte variables involved in the filtering algorithm are of size k and fall inside the
range of A. There is a substantial number of corner cases to be considered, where the
most significant byte variables of smaller size, or byte variables of C that are not
fully contained in the range of A, are involved in the filtering procedure. The filtering
algorithms for these cases follow principles similar to those presented above and are
omitted here for conciseness.
The same filtering procedures are performed in the opposite direction when the
domain boundary change occurs in a byte variable Ci of C. In this case, the filtering is
done for the byte variables Aj=Ahigh(i) and Aj+1=Alow(i) that overlap with Ci.
To summarize the propagation algorithm for SubRangeConstraint(A,C,bstart,bend),
the propagation demon for Ai performs the filtering procedures UpdateHigh(i) and
UpdateLow(i) for the corresponding byte variables of C, and the propagation demon
for Ci performs the symmetric filtering procedures for the corresponding byte vari-
ables of A.
5 Experimental Results
To demonstrate the effectiveness of the proposed LiVar realization, we implemented
the approach described in this paper within an existing traditional CP environment and
compared the performance and constraint propagation quality of the two methods for
arithmetic expression implementation presented in Section 4.3.
We observe that, as is common in CP, there is a tradeoff between the time spent
on domain filtering and the time spent on enumerating variable assignments. We
experimented with different levels of consistency for the global constraint presented in
Section 4.3.2. We found that while increasing the consistency level to GAC results in
fewer search failures, it incurs too high a time cost, leading in most cases to an overall
slowdown. Based on our experiments, we believe that the consistency level of the
algorithms presented in this paper achieves a good tradeoff between these two phases,
optimizing the overall search time.
Table. Comparison of the integer expressions method and the global constraints method (experimental results).
The presented experimental results show that the expression implementation based
on global constraints has a clear advantage, sometimes by several orders of magnitude,
over the naïve method based on integer expressions. This result can be explained by
the fact that the global constraints method, with its custom domain filtering algorithms,
achieves better propagation and therefore fewer search failures than the standard integer
expression propagation algorithms, which are not tuned for this specific problem.
6 Conclusion
The main contribution of this paper is the extension of the traditional CP framework
to accommodate integer constraint variables of arbitrarily large domain size. Such
an extension makes it possible to apply CP to a range of problems where such applica-
tion has not been possible due to integer variable size limitations of the existing CP
tools.
We proposed a method to represent such variables on top of the traditional CP
framework. The paper shows how constraints and expressions on integer variables
can be implemented through standard CP means for the proposed representation. The
set of arithmetic operations on integer variables considered in this paper was defined
to accommodate common requirements on large integers. We proposed a method to
implement expressions on large integer variables by means of customized global
constraints.
We presented experimental results to demonstrate the effectiveness of the proposed
method. The results show that the proposed extension can be efficiently integrated
into the standard CP framework by means of global constraints.
Finally, we observe that the ideas presented in this paper can be used to implement
a native large integer variable within a CP tool. For this purpose, one can represent
domains of such variables as a vector of bytes (or a partition with any sufficiently
small granularity) and perform domain reductions based on the algorithms presented
in the paper.
Acknowledgements
The author would like to thank Boris Gutkovich and Yevgeny Schreiber for their
valuable comments and helpful discussions.
References
1. Bin, E., Emek, R., Shurek, G., Ziv, A.: Using a constraint satisfaction formulation and
solution techniques for random test program generation. IBM Systems Journal 41(3), 386–
402 (2002)
2. Naveh, Y., Rimon, M., Jaeger, I., Katz, Y., Vinov, M., Marcus, E., Shurek, G.: Constraint-
based random stimuli generation for hardware verification. AI Magazine 28(3), 13–18
(2007)
3. Gutkovich, B., Moss, A.: CP with architectural state lookup for functional test generation.
In: 11th Annual IEEE International Workshop on High Level Design Validation and Test,
pp. 111–118 (2006)
4. Moss, A.: Constraint patterns and search procedures for CP-based random test generation.
In: Yorav, K. (ed.) HVC 2007. LNCS, vol. 4899, pp. 86–103. Springer, Heidelberg (2008)
5. Triska, M.: Generalising Constraint Solving over Finite Domains. In: Garcia de la Banda,
M., Pontelli, E. (eds.) ICLP 2008. LNCS, vol. 5366, pp. 820–821. Springer, Heidelberg
(2008)
6. SystemVerilog, IEEE Std. 1800 (2005)
7. Smith, B.M.: Modeling for constraint programming. In: The 1st Constraint Programming
Summer School (2005)
8. Van Hentenryck, P., Saraswat, V., Deville, Y.: Design, implementation and evaluation of
the constraint language cc(FD). Journal of Logic Programming 31(1-3), 139–164 (1998)
9. Bessière, C., Van Hentenryck, P.: To be or not to be ... a global constraint. In: Rossi, F.
(ed.) CP 2003. LNCS, vol. 2833, pp. 789–794. Springer, Heidelberg (2003)
10. Frisch, A., Hnich, B., Kiziltan, Z., Miguel, I., Walsh, T.: Global constraints for lexico-
graphic orderings. In: Van Hentenryck, P. (ed.) CP 2002. LNCS, vol. 2470, pp. 93–108.
Springer, Heidelberg (2002)
Constraint-Based Local Search
for Constrained Optimum Paths Problems
Q.D. Pham, Y. Deville, and P. Van Hentenryck
1 Introduction
Constrained Optimum Path (COP) problems appear in many real-life applica-
tions, especially in communication and transportation networks (e.g., [5]). They
aim at finding one or more paths from some origins to some destinations sat-
isfying some constraints and optimizing an objective function. For instance, in
telecommunication networks, routing problems supporting multiple services in-
volve the computation of paths minimizing transmission costs while satisfying
bandwidth and delay constraints [3,6]. Similarly, the problem of establishing
routes for connection requests between network nodes is one of the basic op-
erations in communication networks and it is typically required that no two
routes interfere with each other due to quality-of-service and survivability re-
quirements. This problem can be modeled as edge-disjoint paths problem [4].
Most of COP problems are NP-hard. They are often approached by dedicated
algorithms, such as the Lagrangian-based branch and bound in [3] and the vertex
labeling algorithm from [7]. These techniques exploit the structure of constraints
and objective functions but are often difficult to extend and reuse.
This paper proposes a constraint-based local search (CBLS) [10] framework
for COP applications to support the compositionality, reuse, and extensibility
at the core of CBLS and CP systems. It follows the trend of defining domain-
specific CBLS frameworks, capturing modeling abstractions and neighborhoods
for classes of applications exhibiting significant structures. The COP framework
can also be viewed as an extension of the LS(Graph & Tree) framework [8] for
those applications where the output of the optimization model is one or more
elementary paths (i.e., paths with no repeated nodes). As is traditional for CBLS,
the resulting COP framework allows the model to be compositional and easy to
extend, and provides a clean separation of concerns between the model and
the search procedure. Moreover, the framework captures structural moves that
are fundamental in obtaining high-quality solutions for COP applications. The
key technical contribution underlying the COP framework is a novel connected
neighborhood for COP problems based on rooted spanning trees. More precisely,
the COP framework incrementally maintains, for each desired elementary path,
a rooted spanning tree that specifies the current path and provides an efficient
data structure to obtain its neighboring paths and their evaluations.
The availability of high-level abstractions (the “what”) and the underlying
connected neighborhood for elementary paths (the “how”) make the COP frame-
work particularly appealing for modeling and solving complex COP applications.
The COP framework, implemented in COMET, was evaluated experimentally on
two classes of applications: Resource-Constrained Shortest Path (RCSP) prob-
lems with and without side constraints and Edge-Disjoint Path (EDP) problems.
The experimental results show the potential of the approach.
The rest of this paper is organized as follows. Section 2 gives the basic def-
initions and notations. Section 3 specifies our novel neighborhoods for COP
applications and Section 4 presents the modeling framework. Section 5 applies
the framework to two various COP applications and Section 6 concludes the
paper.
Graphs. Given an undirected graph g, we denote the set of nodes and the set
of edges of g by V (g) and E(g), respectively. A path on g is a sequence of nodes
< v1 , v2 , ..., vk > (k > 1) in which vi ∈ V (g) and (vi , vi+1 ) ∈ E(g) for i = 1, . . . , k − 1.
The nodes v1 and vk are the origin and the destination of the path. A path is
called simple if there is no repeated edge and elementary if there is no repeated
node. A cycle is a path in which the origin and the destination are the same. This
paper only considers elementary paths and hence we use “path” and “elementary
path” interchangeably if there is no ambiguity. A graph is connected if and only
if there exists a path from u to v for all u, v ∈ V (g).
tr, which is denoted by f atr (u). Given a rooted tree tr and a node s ∈ V (tr),
we use the following notations:
– root (tr ) for the root of tr,
– pathtr (v) for the path from v to root(tr) on tr. For each node u of pathtr (v),
we say that u dominates v in tr (u is a dominator of v, v is a descendant of
u) which we denote by u Domtr v.
– pathtr (u, v) for the path from u to v in tr (u, v ∈ V (tr)).
– ncatr (u, v) for the nearest common ancestor of two nodes u and v. In other
words, ncatr (u, v) is the common dominator of u and v such that there is no
other common dominator of u and v that is a descendant of ncatr (u, v).
Rooted Spanning Trees. Given an undirected graph g and a target node t ∈ V (g),
our COP neighborhood maintains a spanning tree of g rooted at t. Moreover,
since we are interested in elementary paths between a source s and a target
t, the data structure also maintains the source node s and is called a rooted
spanning tree (RST) over (g, s, t). An RST tr over (g, s, t) specifies a unique
path from s to t in g: pathtr (s) =< v1 , v2 , ..., vk > in which s = v1 , t = vk and
vi+1 = f atr (vi ), ∀i = 1, . . . , k − 1. By maintaining RSTs for COP problems, our
framework avoids an explicit representation of paths and enables the definition
of an connected neighborhood that can be explored efficiently. Indeed, the tree
structure directly captures the path structure from a node s to the root and
simple updates to the RST (e.g., an edge replacement) will induce a new path
from s to the root.
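As a minimal illustration (ours; the framework itself is implemented in COMET), a rooted spanning tree stored as father pointers implicitly represents the path from s to the root t, assuming fa maps every non-root node to its father:

def path_to_root(fa, s, t):
    """Return the unique path from s to the root t induced by the father map fa."""
    path = [s]
    while path[-1] != t:
        path.append(fa[path[-1]])
    return path

if __name__ == "__main__":
    # A small RST over a 6-node graph rooted at t.
    fa = {'s': 'a', 'a': 'b', 'b': 't', 'c': 'b', 'd': 'c'}
    print(path_to_root(fa, 's', 't'))   # ['s', 'a', 'b', 't']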
The Basic Neighborhood. We now consider the definition of our COP neighbor-
hood. We first show how to update an RST tr over (g, s, t) to generate a new
rooted spanning tree tr′ over (g, s, t) which induces a new path from s to t in g:
pathtr′ (s) ≠ pathtr (s).
Given an RST over (g, s, t), an edge e = (u, v) such that e ∈ E(g) \ E(tr)
is called a replacing edge of tr, and we denote by rpl(tr) the set of replacing edges
of tr. Given a replacing edge e = (u, v) of tr and a tree edge e′ on the cycle that e
creates in tr, a basic move consists of the following two actions:
1. Insert the edge e = (u, v) into tr. This creates an undirected graph g′ with a
cycle C containing the edge e.
2. Remove e′ from g′.
The application of these two actions yields a new rooted spanning tree tr′ of g,
denoted tr′ = rep(tr, e′, e). The neighborhood N(tr) of tr could then be defined as
the set of all rooted spanning trees obtained from tr by such an edge replacement.
It is easy to observe that two RSTs tr1 and tr2 over (g, s, t) may induce the
same path from s to t. For this reason, we now show how to compute a subset
N^k(tr) ⊆ N(tr) such that pathtr′ (s) ≠ pathtr (s), ∀tr′ ∈ N^k(tr).
We first give some notations to be used in the following presentation. Given
an RST tr over (g, s, t) and a replacing edge e = (u, v), the nearest common
ancestors of s and the two endpoints u, v of e are both located on the path from s
to t. We denote by lowncatr (e, s) and upncatr (e, s) the nearest common ancestors
of s on the one hand and one of the two endpoints of e on the other hand, with the
condition that upncatr (e, s) dominates lowncatr (e, s). We denote by lowtr (e, s),
uptr (e, s) the endpoints of e such that ncatr (s, lowtr (e, s)) = lowncatr (e, s) and
ncatr (s, uptr (e, s)) = upncatr (e, s). Figure 1 illustrates these concepts. The left
part of the figure depicts the graph g and the right side depicts an RST tr over
(g, s, t). Edge (8,10) is a replacing edge of tr; ncatr (s, 10) = 12 since 12 is the
common ancestor of s and 10. ncatr (s, 8) = 7 since 7 is the common ancestor of
s and 8. lowncatr ((8, 10), s) = 7 and upncatr ((8, 10), s) = 12 because 12 Domtr 7;
lowtr ((8, 10)) = 8; uptr ((8, 10)) = 10.
We now specify the replacements that induce a new path from s to t.
Fig. 1. The graph g (left) and an RST tr over (g, s, t) (right)
Fig. 2. a. current tree tr; b. tr′ = rep(tr, (7, 11), (8, 10))
Proof. The proposition is proved by showing how to generate such an instance tr^k.
This can be done by Algorithm 1. The idea is to traverse the sequence of nodes of
P on the current tree tr. Whenever we get stuck (we cannot go from the current
node x to the next node y of P by an edge (x, y) on tr because (x, y) is not in
tr), we modify tr by inserting (x, y) and removing a replacable edge of (x, y) that
has not been traversed. The edge (x, y) in line 7 is a replacing edge of tr because
this edge is not in tr but it is an edge of g. Line 8 chooses a replacable edge eo of
ei that is not in S. This choice is always possible because the set of replacable
edges of ei that are not in S is not empty (at least the edge (y, fatr(y)) belongs
to this set). Line 9 performs the move that replaces the edge eo by the edge ei in
tr. Hence Algorithm 1 always terminates and returns a rooted spanning tree
inducing P. The variable S (line 1) stores the set of traversed edges.
Algorithm 1. Moves
Input: An instance tr0 of RST on (g, s, t) and a path P on g, s = firstNode(P),
t = lastNode(P)
Output: A tree inducing P, computed by taking k ≤ l basic moves (l is the length of P)
1  S ← ∅;
2  tr ← tr0;
3  x ← firstNode(P);
4  while x ≠ lastNode(P) do
5    y ← nextNode(x, P);
6    if (x, y) ∉ E(tr) then
7      ei ← (x, y);
8      eo ← a replacable edge of ei that is not in S;
9      tr ← rep(tr, eo, ei);
10   S ← S ∪ {(x, y)};
11   x ← y;
12 return tr;
1. LSGraphSolver ls();
2. VarPath path(ls,g,s,t);
3. PreferredReplacingEdges prefReplacing(path);
4. PreferredReplacableEdges prefReplacable(path);
...
9. int d = MAXINT;
10. forall(ei in prefReplacing.getSet())
11. forall(eo in prefReplacable.getSet(ei))
12. d = min(d,C.getReplaceEdgeDelta(path,eo,ei));
5 Applications
5.1 The Resource Constrained Shortest Path (RCSP) Problem
The Resource constrained shortest path problem (RCSP) [3] is the problem
of finding the shortest path between two vertices on a network satisfying the
constraints over resources consumed along the path. There are some variations of
this problem, but we first consider a simplified version introduced and evaluated
in [3] over instances from the OR-Library [2]. Given a directed graph G =
(V, E), each arc e is associated with a length c(e) and a vector r(e) of resources
consumed in traversing the arc e. We are given a source node s, a destination node t,
and two vectors L, U of resources corresponding to the minimum and maximum
amounts that can be used on the chosen path (i.e., a lower and an upper limit
on the resources consumed on the path). The length of a path P is defined as
f(P) = Σ_{e∈P} c(e), and the resources consumed in traversing P are defined as
r(P) = Σ_{e∈P} r(e). The formulation of the RCSP is then given by:
min f (P)
s.t. L ≤ r(P) ≤ U
P is an elementary path from s to t on G.
The RCSP problem with only constraints on the maximum resources consumed
is also considered in [5,7]. The algorithm based on Lagrangian relaxation and
enumeration from [5] and the vertex-labeling algorithm combined with several
preprocessing techniques in [7] are known to be state-of-the-art for this problem.
We give a Tabu search model (RCSP TABU) for solving the RCSP problem
considering both constraints of minimum and maximum resources consumed.
This problem is rather pure and does not highlight the benefits of our framework
but it is interesting as a starting point and a basis for comparison.
The RCSP Modeling. The model using the COP framework is as follows:
void stateModel{
1. LSGraphSolver ls();
2. VarPath path(ls,g,s,t);
3. range Resources = 1..K;
4. GraphObjective go[Resources];
5. forall(k in Resources)
6. go[k] = PathCostOnEdges(path,k);
7. PathCostOnEdges len(path,0);
8. GraphConstraintSystem gcs(ls);
9. forall(k in Resources){
10. gcs.post(L[k] <= go[k]);
11. gcs.post(go[k] <= U[k]);
12 }
13. gcs.close();
14. GraphObjectiveCombinator goc(ls);
15. goc.add(len,1);
16. goc.add(gcs,1000);
17. goc.close();
18. PreferredReplacingEdges prefReplacing(path);
19. PreferredReplacableEdges prefReplacable(path);
20. ls.close();
21.}
Line 1 declares a LSGraphSolver ls. A VarPath variable representing an RST
over (g, s, t) is declared and initialized in line 2. Lines 3–6 create K graph objec-
tives representing resources consumed in traversing the path from s to t where
PathCostOnEdges(ls,path,k) (line 6) is a modeling abstraction representing
the total weight of type k accumulated along the path from s to t. The variable
len represents the length of the path from s to t (line 7). Lines 9–12 initial-
ize and post to the GraphConstraintSystem gcs (line 8) the constraints over
resources consumed in traversing the path from s to t.
In this model, we combine the graph constraint gcs with coefficient 1000 and the
graph objective len with coefficient 1 in a global GraphObjectiveCombinator goc
to be minimized (lines 14–17). We introduce two graph invariants prefReplacing
and prefReplacable to represent the set of preferred replacing edges of path and
the sets preferred replacable edges of all preferred replacing edges of path (lines
18–19).
The search procedure, not described here in detail, is based on tabu search over the
neighborhood N^2, because the exploration of the basic neighborhood N^1 gave poor
results. At each local move, we consider the best neighbor, i.e., the one which minimizes
goc. We take this neighbor if it improves the current solution. Otherwise, a random
neighbor which is not tabu is taken. Solutions are made tabu by putting the edges
appearing in the replacements into two tabu lists: one list for storing the edges
to be added and another one for storing the edges to be removed. The lengths of
these lists are set to the number of vertices divided by 5.
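A possible sketch of this tabu mechanism in Python (ours, not the COMET implementation; the exact tabu condition is an assumption): recently added and recently removed edges are kept in two bounded FIFO lists, and a candidate replacement could be declared tabu if it would re-insert a recently removed edge or re-remove a recently added one.

from collections import deque

class EdgeTabu:
    """Two fixed-length tabu lists over edges, sized |V| / 5 as in the model above."""
    def __init__(self, num_vertices):
        length = max(1, num_vertices // 5)
        self.recently_added = deque(maxlen=length)
        self.recently_removed = deque(maxlen=length)

    def is_tabu(self, edge_out, edge_in):
        # Forbid undoing a recent move: re-inserting a removed edge or re-removing an added one.
        return edge_in in self.recently_removed or edge_out in self.recently_added

    def record(self, edge_out, edge_in):
        self.recently_removed.append(edge_out)
        self.recently_added.append(edge_in)

if __name__ == "__main__":
    tabu = EdgeTabu(num_vertices=25)
    tabu.record(edge_out=(7, 11), edge_in=(8, 10))
    print(tabu.is_tabu(edge_out=(8, 10), edge_in=(7, 11)))   # True: this move undoes the last one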
instances opt t* min max % avr t min t max t std dev min’ max’ %
rcsp1 131 0.62 131 131 100 0.26 0.25 0.28 0.01 131 131 100
rcsp2 131 0.05 131 131 100 0.26 0.24 0.26 0.01 131 131 100
rcsp3 2 0.60 2 2 100 2.11 0.48 5.82 1.29 2 2 100
rcsp4 2 0.09 2 2 100 3.82 0.82 10.19 2.96 2 7 100
rcsp5 100 0.84 100 100 100 0.83 0.6 0.97 0.1 100 100 100
rcsp6 100 0.84 100 100 100 0.75 0.6 0.95 0.1 100 119 100
rcsp7 6 1.40 6 6 100 21.44 3.48 55.08 15.17 6 9 100
rcsp8 14 1.58 14 14 100 51.28 1.22 183.94 47.03 14 ∞ 80
rcsp9 420 0.04 420 420 100 122.5 2.14 483.43 115.26 420 ∞ 60
rcsp10 420 0.03 420 420 100 71.04 4.14 416.6 92.89 420 ∞ 90
rcsp11 6 0.11 6 6 100 7.75 1.84 18.44 3.81 6 7 100
rcsp12 6 0.09 6 6 100 9.12 2.34 25.12 6.52 6 6 100
rcsp13 448 0.44 448 448 100 90.06 7.94 276.02 66.81 448 ∞ 70
rcsp14 - - - - - - - - - - - -
rcsp15 9 9.28 9 9 100 93.43 31.89 284.25 60.53 9 ∞ 70
rcsp16 17 8.84 17 17 100 279.89 33.43 1049.57 253.27 17 ∞ 30
rcsp17 652 55.91 652 652 100 56.64 19.9 106.07 23.65 652 652 100
rcsp18 652 56.45 652 652 100 57.27 25.61 116.98 22.56 652 652 100
rcsp19 6 0.59 6 6 100 28.15 7.92 66.72 13.32 6 6 100
rcsp20 6 1.07 6 6 100 46.85 12.79 118.63 31.08 6 15 100
rcsp21 858 3.20 858 858 100 242.3 61.68 679.96 190.7 858 ∞ 50
rcsp22 858 1.86 858 858 100 294.94 108.41 827.04 186.13 858 ∞ 50
rcsp23 4 50.74 4 4 100 280.36 11.92 1053.61 279.03 4 ∞ 90
rcsp24 5 54.05 5 ∞ 80 719.92 24.13 1737.43 574.94 5 ∞ 20
solutions. Column 7 presents the percentage of runs that find optimal solutions. The
final column reports the average number of moves. The experimental results show
that our RCSP TABU model found optimal solutions faster than the RCSP BEA
model in most cases. The reason is that, on these instances, the constraints over
resources consumed are easy to satisfy but the search space is much larger. The
reduction techniques in [3] do not reduce the search space much and the search
procedure of the RCSP BEA model is thus much slower.
Experimental Results. We evaluate the model on the benchmark from the
OR-Library, where the set of subsets S1 , S2 , ..., SQ is generated as follows. We take
a random feasible solution to the RCSP problem which is an elementary path
v1 , v2 , ..., vq satisfying the resource constraints. Then, we partition V into Q = 3∗
q sets S1 , S2 , ..., SQ where vj ∈ Sj , ∀j ∈ {1, 2, ..., q} and the size differences are at
most one. This ensures that there exists at least one feasible solution v1 , v2 , ..., vq
to the MC RCSP problem. The model is executed 10 times with a time window of
10 minutes per instance. Experimental results are shown in Table 1
(columns 11–13) where column 13 presents the rate of finding feasible solutions.
Columns 11–12 present the minimal and maximal value of the best solution in
different executions. In most cases, the rate for finding feasible solutions is high
except for instances 16 and 24. In some cases, the model finds optimal solutions.
which has most edges in common with other paths until all remaining paths
are mutually edge-disjoint as suggested in [4]. In our local search model, we
extend this idea by taking a simple greedy algorithm over the remaining paths
after that extraction procedure in the hope of improving the number of edge-disjoint
paths.
Experimental Results. For the experimentation, we re-implemented in COMET the
Multi-start Greedy Algorithm (MSGA) and the ACO (the extended version)
algorithm described in [4] and compare them with our local search model. The
instances in the original paper [4] are not available (except some graphs). As a
result, we use the instance generator described in [4] and generate new instances
as follows. We take 4 graphs from [4]. For each graph, we generate randomly
different sets of commodities with different sizes depending on the size of the
graph: for each graph of size n, we generate randomly 20 instances with 0.10*n,
0.25*n and 0.40*n commodities. In total, we have 240 problem instances. Due
to the high complexity of the problem, we execute each problem instance once
with a time limit of 30 minutes for each execution. Experimental results are
shown in Table 3. The time window for the MSGA and ACO algorithms is also
30 minutes. The table reports the average values of the objective function
and the average execution times for obtaining the best solutions of 20 instances
(a graph G = (V, E) and a set of r ∗ |V | commodities, r = 0.10, 0.25, 0.40).
Table 3 shows that our local search model gives very competitive results. It
finds better solutions than MSGA in 217/240 instances, while MSGA finds better
solutions in 4/240 instances. On the other hand, in comparison with the ACO
model, our model finds better solutions in 144/240 instances, while the ACO
model finds better solutions in 11/240 instances. This clearly demonstrates the
potential benefits of our COP framework, from a modeling and computational
standpoint.
6 Conclusion
This paper considered Constrained Optimum Path (COP) problems which arise
in many real-life applications. It proposes a domain-specific constraint-based lo-
cal search (CBLS) framework for COP applications, enabling models to be high
level, compositional, and extensible and allowing for a clear separation between
model and search. The key technical contribution to support the COP frame-
work is a novel neighborhood based on a rooted spanning tree that implicitly
defines a path between the source and the target and its neighbors, and pro-
vides an efficient data structure for differentiation. The paper proved that the
neighborhood obtained by swapping edges in this tree is connected and pre-
sented a larger neighborhood involving multiple independent moves. The COP
framework, implemented in COMET, was applied to Resource Constrained Short-
est Path problems (with and without side constraints) and to the edge-disjoint
paths problem. Computational results showed the potential significance of the
approach, both from a modeling and computational standpoint.
Acknowledgments. We would like to thank Maria José Blesa Aguilera who has
kindly provided some graphs for the experimentation. This research is partially
supported by the Interuniversity Attraction Poles Programme (Belgian State,
Belgian Science Policy) and the FRFC project 2.4504.10 of the Belgian FNRS
(National Fund for Scientific Research).
References
1. Ahuja, R.K., Magnanti, T.L., Orlin, J.B.: Network Flows: Theory, Algorithms, and
Applications. Prentice Hall, New Jersey (1993)
2. Beasley, J.E.: Or-library,
http://people.brunel.ac.uk/~mastjjb/jeb/info.html
3. Beasley, J.E., Christofides, N.: An algorithm for the resource constrained shortest
path problem. Networks 19, 379–394 (1989)
4. Blesa, M., Blum, C.: Finding edge-disjoint paths in networks: An ant colony op-
timization algorithm. Journal of Mathematical Modelling and Algorithms 6(3),
361–391 (2007)
5. Carlyle, W.M., Wood, R.K.: Lagrangian relaxation and enumeration for solving
constrained shortest-path problems. In: Proceedings of the 38th Annual ORSNZ
Conference (2003)
6. Clı́maco, J.C.N., Craveirinha, J.M.F., Pascoal, M.M.B.: A bicriterion approach for
routing problems in multimedia networks. Networks 41, 206–220 (2003)
7. Dumitrescu, I., Boland, N.: Improved preprocessing, labeling and scaling algo-
rithms for the weight-constrained shortest path problem. Networks 42, 135–153
(2003)
8. Pham, Q.D., Deville, Y., Van Hentenryck, P.: Ls(graph & tree): A local search
framework for constraint optimization on graphs and trees. In: Proceedings of the
24th Annual ACM Symposium on Applied Computing (SAC 2009) (2009)
9. Smilowitz, K.: Multi-resource routing with flexible tasks: an application in drayage
operation. IIE Transactions 38(7), 555–568 (2006)
10. Van Hentenryck, P., Michel, L.: Constraint-based local search. The MIT Press,
London (2005)
Stochastic Constraint Programming
by Neuroevolution with Filtering
1 Introduction
A satisfying policy tree is a policy tree in which each chance constraint is satisfied with respect to the
tree. A chance constraint h ∈ C is satisfied with respect to a policy tree if it is satisfied
under some fraction φ ≥ θ(h) of all possible paths in the tree.
Most current SCP approaches are complete and do not seem practicable for large
multi-stage problems, but the authors recently proposed a more scalable method called
Evolved Parameterised Policies (EPP) [3]. In this paper we hybridise EPP with con-
straint filtering, and show theoretically and empirically that this improves learning. An
upcoming technical report will contain details omitted from this short paper.
EPP [3] uses an evolutionary algorithm to find an artificial neural network (ANN) whose
input is a representation of a policy tree node, and whose output is a domain value for
the decision variable to be assigned at that node. The ANN describes a policy func-
tion: it is applied whenever a decision variable is to be assigned, and can be used to
represent or recreate a policy tree. The evolutionary fitness function penalises chance
constraint violations, and is designed to be optimal for ANNs representing satisfying
policy trees. In experiments on random SCSPs, EPP was orders of magnitude faster
than state-of-the-art complete algorithms [3]. Because it evolves an ANN it is classed
as a neuroevolutionary method (see for example [6]).
A drawback with EPP is that it treats hard constraints in the same way as chance
constraints. This is not incorrect, but a problem containing many hard constraints may
require a complex ANN with more parameters to tune, leading to longer run times. We
now describe a constraint-based technique for the special case of finite domain SCSPs
that allows more complex policies to be learned by simpler ANNs.
We modify EPP so that the ANN output is not used to compute a decision variable
value directly, but instead to compute a recommended value. As we assign values to the
decision and stochastic variables under some scenario ω, we apply constraint filtering
algorithms using only the hard constraints, which may remove values from both de-
cision and stochastic variable domains. If domain wipe-out occurs on any decision or
stochastic variable then we stop assigning variables under ω and every constraint is arti-
ficially considered to be violated in ω; otherwise we continue. On assigning a stochastic
variable s we choose ω(s), but if ω(s) has been removed from dom(s) then we stop as-
signing variables under ω and every constraint h is artificially considered to be violated
in ω; otherwise we continue. On assigning a decision variable x we compute the rec-
ommended value then choose the first remaining domain value after it in cyclic order.
For example suppose that initially dom(x) = {1, 2, 3, 4, 5} but this has been reduced
to {2, 4}, and the recommended value is 5. This value is no longer in dom(x) so we
choose the cyclically next remaining value 2. If all variables are successfully assigned
in ω then we check by inspection whether each constraint is violated or satisfied.
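The value-selection rule can be sketched as follows in Python (our illustration; whether the recommended value itself is taken when it survives filtering is an assumption of this sketch):

def choose_value(initial_domain, current_domain, recommended):
    """Starting from the ANN's recommended value, return the first value of the
    initial (ordered) domain, scanned cyclically, that is still in the filtered domain."""
    assert current_domain, "domain wipe-out should have been detected earlier"
    start = initial_domain.index(recommended)
    for offset in range(len(initial_domain)):
        candidate = initial_domain[(start + offset) % len(initial_domain)]
        if candidate in current_domain:
            return candidate

if __name__ == "__main__":
    # dom(x) was {1,2,3,4,5}, filtering reduced it to {2,4}, and the ANN recommends 5:
    print(choose_value([1, 2, 3, 4, 5], {2, 4}, recommended=5))   # -> 2, as in the example above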
Some points should be clarified here. Firstly, it might be suspected that filtering
a stochastic variable domain violates the principle that these variables are randomly
assigned. But stochastic variables are assigned values from their unfiltered domains.
Secondly, the value assigned to a decision variable must depend only upon the values
assigned to stochastic variables occurring earlier in the stage structure.
Fig. 1. An example SCSP:
Constraints:
c1 : Pr {x = s ⊕ t} = 1
Decision variables:
x ∈ {0, 1}
Stochastic variables:
s, t ∈ {0, 1}
Stage structure:
V1 = ∅ S1 = {s, t}
V2 = {x} S2 = ∅
L = [V1 , S1 , V2 , S2 ]
Does filtering the domains of stochastic variables that occur later violate this principle? No: constraint
filtering makes no assumptions on the values of unassigned variables; it only tells us
when assigning a value to a decision variable would inevitably lead to a hard constraint
violation. Thirdly, we consider all constraints to be violated if either domain wipe-out
occurs, or if the selected value for a stochastic variable has been removed earlier by fil-
tering. This might appear to make the evolutionary fitness function incorrect. But both
these cases correspond to hard constraint violations, and considering constraints to be
violated in this way is similar to using a penalty function in a genetic or local search
algorithm: it only affects the objective function value for non-solutions.
We call the modified method Filtered Evolved Parameterised Policies (FEPP) and
now state two useful properties.
Proposition 1. FEPP can learn more policies than EPP with a given ANN.
Proof sketch. We can show that any policy that can be learned by EPP can also be
learned by FEPP. Conversely, we show by example that there exists an SCSP that can
be solved by FEPP but not by EPP using a given ANN. Suppose that the ANN is a
single perceptron [2] whose inputs are the s and t values and whose output is used to
select a domain value for x, the SCSP is as shown in Figure 1, and FEPP enforces arc
consistency. A single perceptron cannot learn the ⊕ (exclusive-OR) function [2] so EPP
cannot solve the SCSP. But arc consistency removes the incorrect value from dom(x)
so FEPP makes the correct assignment irrespective of the ANN. □
Proposition 2. Increasing the level of consistency increases the set of policies that can
be learned by FEPP with a given ANN.
Proof sketch. We can show that any policy that can be learned by FEPP with a given
ANN and filtering algorithm A can also be learned with a stronger filtering algorithm
B. Conversely, we show by example that there exists an SCSP, an ANN, and filtering
algorithms A and B, such that the SCSP can be solved by FEPP with B but not A. Let
the SCSP be as shown in Figure 2, A enforce pairwise arc consistency on the disequality
constraints comprising c2 , B enforce hyper-arc consistency on c2 using the algorithm of
[5], and both A and B enforce arc consistency on c1 . In any satisfying policy x = s ⊕ t.
The proof rests on the fact that B reduces dom(x) to {0, 1} before search begins so ⊕
Fig. 2. A second example SCSP:
Constraints:
c1 : Pr { x < 2 → x = s ⊕ t} = 1
c2 : Pr {alldifferent(x, y, u)} = 1
Decision variables:
x ∈ {0, 1, 2, 3}
y ∈ {2, 3}
Stochastic variables:
s, t ∈ {0, 1}
u ∈ {2, 3}
Stage structure:
V1 = ∅ S1 = {s, t}
V2 = {x} S2 = {u}
V3 = {y} S3 = ∅
L = [V1 , S1 , V2 , S2 , V3 , S3 ]
3 Experiments
We now test two hypotheses: does filtering enable an ANN to learn more complex poli-
cies in practice as well as in theory (proposition 1)? And where a policy can be learned
without filtering, does filtering speed up learning (as we hope is implied by proposition
3)? For our experiments we use Quantified Boolean Formula (QBF) instances. QBF and
SCSP are closely related as there is a simple mapping from QBF to Stochastic Boolean
Satisfiability, which is a special case of SCSP [1]. QBF-as-SCSP is an interesting test
for FEPP because all its constraints are hard.
We have implemented a prototype FEPP using a weak form of constraint filtering
called backchecking. We use the same ANN as in [3]: a periodic perceptron [4], which
has been shown to learn faster and require fewer weights than a standard perceptron.
Results for EPP and FEPP are shown in Table 1, both tuned roughly optimally to each
instance. All times were obtained on a 2.8 GHz Pentium (R) 4 with 512 MB RAM
and are medians of 30 runs. “—” indicates that the problem was never solved despite
multiple runs with different EPP parameter settings. These preliminary results support
both our hypotheses: there are problems that can be solved by FEPP but not (as far as
we can tell) by EPP; and where both can solve a problem FEPP is faster. So far we have
found no QBF instance on which EPP beats FEPP.
4 Conclusion
FEPP is a true hybrid of neuroevolution and constraint programming, able to benefit
from improvements to its evolutionary algorithm, its neural network and its filtering
algorithms. In future work we will work on all three of these aspects and test FEPP on
real-world optimisation problems involving uncertainty.
References
1. Majercik, S.M.: Stochastic Boolean Satisfiability. In: Handbook of Satisfiability, ch. 27, pp.
887–925. IOS Press, Amsterdam (2009)
2. Minsky, M., Papert, S.: Perceptrons: An Introduction to Computational Geometry. The MIT
Press, Cambridge (1972)
3. Prestwich, S.D., Tarim, S.A., Rossi, R., Hnich, B.: Evolving Parameterised Policies for
Stochastic Constraint Programming. In: Gent, I.P. (ed.) CP 2009. LNCS, vol. 5732, pp. 684–
691. Springer, Heidelberg (2009)
4. Racca, R.: Can Periodic Perceptrons Replace Multi-Layer Perceptrons? Pattern Recognition
Letters 21, 1019–1025 (2000)
5. Régin, J.-C.: A Filtering Algorithm for Constraints of Difference in CSPs. In: 12th National
Conference on Artificial Intelligence, pp. 362–367. AAAI Press, Menlo Park (1994)
6. Stanley, K.O., Miikkulainen, R.: A Taxonomy for Artificial Embryogeny. Artificial Life 9(2),
93–130 (2003)
7. Walsh, T.: Stochastic Constraint Programming. In: 15th European Conference on Artificial
Intelligence (2002)
The Weighted Spanning Tree Constraint
Revisited
1 Introduction
2 Existing Approaches
algorithm. Assuming that the edges are sorted by non-decreasing weight, this
can be done in almost linear time [1]. Identifying inconsistent edges (that cannot
participate in a spanning tree of weight at most K) is more involved, however.
Dooms and Katriel [3] observed that inconsistent edges can be detected as fol-
lows [5]. Let T be a minimum spanning tree, and let (i, j) be a non-tree edge
that we wish to evaluate. We now find the maximum-weight edge on the unique
i-j path in T . If replacing that maximum-weight edge with (i, j) yields a tree of
weight more than K, (i, j) is inconsistent. Similar reasoning can be applied to
determine whether a tree edge is mandatory, i.e., when replacing it would always
yield a tree of weight more than K [3]. Therefore, the detection of inconsis-
tent and mandatory edges amounts to computing the ‘replacement cost’ of the
edges. Régin [4] also applies the replacement cost for non-tree edges to detect
inconsistent edges, but tree edges (and mandatory edges) were not considered.
Several algorithms have been proposed to compute the replacement cost of
the edges, for example by Tarjan [5] and Dixon, Rauch, and Tarjan [2]. These
algorithms allow to compute all replacement costs in time O(mα(m, n)) on a
graph with n nodes and m edges, where α(m, n) is the inverse Ackermann func-
tion stemming from the complexity of the ‘union-find’ algorithm [6]. Other ap-
proaches, such as those referenced by [3] are based on (or resemble) the algo-
rithms of [5] or [2]. Even though these algorithms can theoretically find the replacement
costs in almost linear time, the potential savings may not offset the added complexity
in practice, as argued by Tarjan [5]. Moreover, it is not
obvious how to apply the algorithms incrementally. Therefore, Régin proposed
a different algorithm running in O(n + m + n log n) time [4]. We next briefly
describe the main components of this algorithm for later use.
Régin [4] applies Kruskal’s algorithm to find a minimum spanning tree. That
is, we start from a forest consisting of all nodes in the graph. We then successively
add edges, whereby each added edge joins two separate trees. We ensure that
the next selected edge has minimum weight among all edges whose extremities
are not in the same tree. We use a so-called ccTree (‘connected component tree’)
to represent these merges. The leaves of the ccTree are the original graph nodes,
while the internal nodes of the ccTree represent the merging of two trees (or
connected components), defined in the order in which the edges were added to
the tree. An internal node thus represents the edge with which two components
have been merged; see Figure 1a and 1b for an example. Therefore, the ccTree
contains n − 1 internal nodes, where n is the number of nodes in the graph. The
computation of the replacement cost of a non-tree edge (i, j) can now be done
by finding the lowest common ancestor (LCA) of nodes i and j in the ccTree:
the weight of (i, j) minus the weight of the edge corresponding to the LCA is
exactly the replacement cost of (i, j). We refer to [4] for further details.
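A compact Python sketch of this ccTree-based computation (ours, for illustration; it uses a plain path-compressed union–find rather than the incremental structures of [4]): Kruskal's algorithm records one internal ccTree node per merge, and the replacement cost of a non-tree edge (i, j) is its weight minus the weight stored at the lowest common ancestor of i and j in the ccTree.

def build_cctree(nodes, edges):
    """Kruskal's algorithm with a ccTree; edges is a list of (w, u, v).
    Returns ccTree parent pointers (leaves are graph nodes), the weight labelling
    each internal ccTree node, and the MST edges."""
    parent = {v: v for v in nodes}                    # union-find over graph nodes

    def find(x):
        root = x
        while parent[root] != root:
            root = parent[root]
        while parent[x] != root:                      # path compression
            parent[x], x = root, parent[x]
        return root

    cc_parent = {v: None for v in nodes}
    cc_weight = {}
    comp_cc = {v: v for v in nodes}                   # component root -> its ccTree node
    mst_edges, next_id = [], 0
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            internal = ('cc', next_id)
            next_id += 1
            cc_weight[internal] = w
            cc_parent[comp_cc[ru]] = internal         # the two merged components become children
            cc_parent[comp_cc[rv]] = internal
            cc_parent[internal] = None
            parent[ru] = rv                           # union
            comp_cc[rv] = internal
            mst_edges.append((w, u, v))
    return cc_parent, cc_weight, mst_edges

def replacement_cost_nontree(cc_parent, cc_weight, w, i, j):
    """Replacement cost of the non-tree edge (i, j) of weight w: w minus the
    weight of the lowest common ancestor of i and j in the ccTree."""
    ancestors, x = set(), i
    while x is not None:
        ancestors.add(x)
        x = cc_parent[x]
    x = j
    while x not in ancestors:
        x = cc_parent[x]
    return w - cc_weight[x]

if __name__ == "__main__":
    nodes = [1, 2, 3, 4]                                         # a small illustrative graph
    edges = [(1, 1, 2), (2, 2, 3), (3, 1, 3), (4, 3, 4), (5, 2, 4)]
    cc_parent, cc_weight, mst = build_cctree(nodes, edges)
    print(mst)                                                   # MST edges (1,2), (2,3), (3,4)
    print(replacement_cost_nontree(cc_parent, cc_weight, 3, 1, 3))  # 1
    print(replacement_cost_nontree(cc_parent, cc_weight, 5, 2, 4))  # 1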
from a generic (and relatively complex) algorithm presented in [5]. Our contri-
bution is a description of a more practical algorithm, specific to the problem of
computing replacement costs, having the same time complexity. We will apply
the algorithm to detect mandatory edges.
Let G = (V, E) be the graph under consideration, with a ‘weight’ function w :
E → R, and let T be a minimum spanning tree of G. For a subset of edges S ⊆ E,
we let w(S) denote Σ_{e∈S} w(e). The replacement cost of an edge e in T is defined
as w(Te ) − w(T ), where Te is a minimum spanning tree of G \ e. It represents the
marginal increase of the weight of the minimum spanning tree if e is not used. It can
be shown that the new minimum spanning tree can be obtained by replacing e with
exactly one other edge, which is called the replacement edge. In fact, the replacement
cost of e is the weight of its replacement edge minus the weight of e itself.
Let us first describe a basic algorithm for computing the replacement costs
for tree edges. We start by computing a minimum spanning tree T , and we
label all tree edges as ‘unmarked’. We then consider the edges of the graph,
ordered by non-decreasing weight. If we encounter a non-tree edge (i, j), we do
the following. First, observe that there is a unique i-j path in T , and (i, j) serves
as replacement edge for all unmarked edges on this path. Therefore, we will mark
a tree edge as soon as we have identified its first replacement edge. For example,
in Figure 1, the first non-tree edge that we consider is (3, 4). We thus label the
tree edges (1, 3) and (1, 4) as marked, with associated replacement cost 1 and 2,
respectively. The next non-tree edge is (1, 2), which is used to mark tree edge
(2, 4) with associated replacement cost 2 (edge (1, 4) is already marked).
It can be shown that this basic algorithm computes the replacement costs of
all tree edges. Unfortunately, its time complexity is rather high: we may need up
to n steps to identify the unmarked edges, which gives an overall time complexity
of O(mn). Fortunately, we can efficiently reduce this complexity by contracting
the marked edges of the tree, i.e., we merge the extremities of marked tree edges.
This contraction will be performed by using a ‘union-find’ data structure [6, 1].
First, we root the minimum spanning tree, i.e., we designate an arbitrary root
node, and we organize the nodes in a directed tree with parent information. In
addition, each node is associated with a pointer p to its parent in the union-find
data structure. Initially the pointer p of every node points to the node itself.
When an unmarked edge is discovered, we ‘contract’ the edge by letting the
pointer p now point to its father. We then apply the classical ‘find’ function,
associated with its classical updates. That is, the pointers of the union-find data
structure are used to traverse the path between the two extremities of a non-
tree edge. Note that we move up in parallel in the tree from the two extremities.
We stop when the same node is reached by the two traversals (one from each
extremity). For example in Figure 1, suppose we let node 1 be the root of the
tree. After processing the first non-tree edge (3, 4), the updated pointers are
p(3) = 1 and p(4) = 1. For the next non-tree edge (1, 2), the algorithm directly
proceeds from the parent of 2 (node 4) to p(4), which is node 1.
The advantage of this method is that it is easy to implement. Moreover, we
will have at most n − 1 contractions because the tree contains n − 1 edges. In
addition we will have at most m requests, thus we obtain the classical union-find
complexity of O(mα(m, n)). We note that the replacement cost for tree edges
can be used to identify mandatory edges: an edge is mandatory if its replacement
cost is higher than K − w(T ), see also [3].
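To make the discussion concrete, the following Python sketch (ours, not the authors' implementation) computes the replacement costs of all tree edges with the marking-and-contraction scheme described above; the graph representation and helper names are our own, and the union-find structure uses path compression as in [6, 1].

def mst_replacement_costs(n, edges, tree_edges):
    # Sketch (not the authors' code): replacement costs of MST edges by
    # scanning non-tree edges in non-decreasing weight order and contracting
    # marked tree edges with a union-find structure. Assumes a connected
    # graph with nodes 0..n-1; edges is a list of (weight, u, v) tuples and
    # tree_edges a set of frozenset({u, v}) describing the MST T.
    adj, wt = {u: [] for u in range(n)}, {}
    for w, u, v in edges:
        if frozenset((u, v)) in tree_edges:
            adj[u].append(v); adj[v].append(u); wt[frozenset((u, v))] = w

    parent, order, seen = {0: None}, [0], {0}      # root T at node 0 (BFS)
    for u in order:
        for v in adj[u]:
            if v not in seen:
                seen.add(v); parent[v] = u; order.append(v)
    depth = {0: 0}
    for u in order[1:]:
        depth[u] = depth[parent[u]] + 1

    uf = list(range(n))                            # union-find pointers
    def find(x):
        while uf[x] != x:
            uf[x] = uf[uf[x]]                      # path compression
            x = uf[x]
        return x

    repl = {}
    non_tree = sorted(e for e in edges if frozenset(e[1:]) not in tree_edges)
    for w, u, v in non_tree:                       # non-decreasing weight
        a, b = find(u), find(v)
        while a != b:                              # climb towards the LCA
            if depth[a] < depth[b]:
                a, b = b, a                        # move the deeper side up
            e = frozenset((a, parent[a]))
            repl[e] = w - wt[e]                    # first replacement edge
            uf[a] = parent[a]                      # contract the marked edge
            a = find(a)
    return repl

An edge e of T can then be declared mandatory whenever repl[e] > K − w(T), following the criterion above.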
Fig. 1. The minimum spanning tree (MST, in bold) for a small example (a.), its ccTree
(b.), and the updated ccTree after the addition of the mandatory edge (4, 5) (c.)
weight of the parent of cj , we let ci be its parent and continue. If the weight of
the parent of cj is less than the weight of the parent of ci , we insert cj between
ci and its parent. In other words, cj has as ‘left’ child ci , and as ‘right’ child
its subtree in the path from j to the LCA (which is always a single node). We
then update cj to be its original parent in the j-LCA path, and repeat the
process until the two paths are fully combined (i.e., we reach the position of
the previous LCA). Figure 1 provides an example of our second method. To the
example presented in Figure 1.a, we introduce the mandatory edge (4, 5). From
the ccTree in Figure 1.b, we determine that the LCA for nodes 4 and 5 is the
internal node marked with edge (3, 5) with weight 5, which will disappear from
the ccTree. Execution of our second method yields the repaired ccTree, depicted
in Figure 1.c. The main benefit of this second method is that it needs to update
the minimum spanning tree (and the ccTree) only locally. In the worst case, its
time complexity may be O(n), but the expected time complexity is much lower.
References
[1] Cormen, T.H., Leiserson, C.E., Rivest, R.L.: Introduction to Algorithms. MIT
Press, Cambridge (1990)
[2] Dixon, B., Rauch, M., Tarjan, R.: Verification and sensitivity analysis of minimum
spanning trees in linear time. SIAM J. Comput. 21(6), 1184–1192 (1992)
[3] Dooms, G., Katriel, I.: The “not-too-heavy spanning tree” constraint. In: Van
Hentenryck, P., Wolsey, L.A. (eds.) CPAIOR 2007. LNCS, vol. 4510, pp. 59–70.
Springer, Heidelberg (2007)
[4] Régin, J.-C.: Simpler and Incremental Consistency Checking and Arc Consistency
Filtering Algorithms for the Weighted Spanning Tree Constraint. In: Perron, L.,
Trick, M.A. (eds.) CPAIOR 2008. LNCS, vol. 5015, pp. 233–247. Springer, Heidel-
berg (2008)
[5] Tarjan, R.E.: Applications of path compression on balanced trees. Journal of the
ACM 26(4), 690–715 (1979)
[6] Tarjan, R.E.: Efficiency of a good but not linear set union algorithm. Journal of
the ACM 22, 215–225 (1975)
Constraint Reasoning with Uncertain Data Using
CDF-Intervals
1 Introduction
Interval coefficients have been introduced in Operations Research and Constraint Pro-
gramming to specify uncertain data in order to provide reliable solutions to convex
models. They are at the heart of paradigms such as robust optimization [3, 12] in Op-
erations Research as well as mixed CSP [8], reliable constraint reasoning [15, 16], and
quantified CSP [17] in Constraint Programming. These paradigms specify erroneous
and incomplete data using uncertainty sets that denote a deterministic and bounded formulation of ill-defined data. To remain computationally tractable, the uncertainty sets are approximated by convex structures such as intervals (extreme values within the uncertainty set), and interval reasoning can be applied, ensuring effective computations. The concept of convex modeling was coined to formalize the idea of enclosing uncertainty sets and yielding reliable solutions, i.e., solutions guaranteed to contain any solution produced
by any possible realization of the data [5, 2, 16]. As a result, the outcome of such sys-
tems is a solution set that can be refined when more knowledge is acquired about the
data, and does not exclude any potential solution. The benefits of these approaches are
that they deal with real data measurements, produce robust/reliable solutions, and do
so in a computationally tractable manner. However, the solution set can sometimes be
too large to be meaningful since it encloses all solutions that can be constructed us-
ing the data intervals. Furthermore, each derived solution has equal uncertainty weight and thus does not reflect any degree of knowledge about the data. For instance,
consider a collected set of data measurements in traffic flow analysis [9] or in image recognition [7]: the data is generally ill-defined, but some data values occur more often than others or have a darker shade of grey (hence a greater degree of knowledge or certainty). This quantitative information is available during data collection, but it is lost during the reasoning because it is not accounted for in the representation of the uncertain data. As a consequence, it is not available in the solution set produced. In summary, reliable models offer reliability, robustness and tractability, but they do not account for quantitative information about the data.
This paper addresses this problem. Basically we extend the interval data models
with a second dimension: a quantitative dimension added to the measured input data.
The main idea introduced in this paper is to show that we can preserve the tractabil-
ity of convex modeling while enriching the uncertain data sets with a representation of
the degree of knowledge available. Our methodology consists of building data intervals
employing two dimensional points as extreme values. We assume that with each uncer-
tain data value comes its frequency of occurrence or density function. We then compute
the cumulative distribution function (cdf) over this function. The cdf is an aggregated
quantitative measurement indicating for a given uncertain value, the probability that the
actual data value lies before it. It has been used in different models under uncertainty
to analyze the distribution of data sets (e.g. [14] and [10]). It enjoys three main prop-
erties that fit an interval representation of data uncertainty: i) the cdf is a monotone, non-decreasing function, like the arithmetic ordering, suitable for interval computations and
pruning, ii) it directly represents the aggregated probability that a quantity lies within
bounds, thus showing the confidence interval of this uncertain data, iii) it brings flex-
ibility to the problem modeling assumptions (e.g. by choosing the data value bounds
based on the cdf values, or its sought confidence interval). We introduce the concept
of cdf-intervals to represent such convex sets, following the concept of interval coeffi-
cients. This requires the decision variables to range over cdf-intervals as well. Basically,
in our framework, the elements of a variable’s domain are points in a 2D-space, the first
dimension for its data value, the second for its aggregated cdf value. It is defined as a
cdf-interval specified by its lower and upper bounds. A new domain ordering is defined
within the 2D space. This raises the question of performing arithmetic computations
over such variables to infer bound consistency. We define the constraint domain over
which the calculus in this new domain structure can be performed, including the infer-
ence rules.
This paper contains the following contributions: (1) a new representation of uncertain
data, (2) a formal framework for solving systems of arithmetic constraints over cdf-
intervals, (3) a practical framework including the inference rules within the usual fixed
point semantics, (4) an application to interval linear systems. The paper is structured along the contributions listed above.
2 Basic Concepts
This section recalls basic concepts we use to characterize the degree of knowledge, and
introduces our notations. These definitions can be found in [10].
2.3 Notations
We assume that a data value lies in the set of real numbers R, denoted by a, b, c. A cdf value F^p_X(x) is associated with the uncertainty curve of a given point p. For simplicity, F^p_X(x) is written F^p_x, i.e., the cdf value of an uncertain data p at value x. We have p_x ∈ R × [0, 1] with coordinates (x, F^p_x). Data points are denoted by p, q, r, possibly subscripted by a data value. Variables are denoted by X, Y, Z and take their value in U = R × [0, 1]. Intervals of elements from U are denoted I, J, K. We denote by F^I the approximated linear curve relative to the cdf curve between the bounds of I, and by F^I_a the cdf value of the data value a plotted on the 2D-interval I.
Given a measured (or possibly randomly generated) data set denoting the population of an uncertain data item, we construct our cdf-intervals as detailed in Algorithm 1. The algorithm runs in O(n), where n is the number of distinct values in the data set. It receives three parameters: the size of the data population m, a sorted list (in ascending order) of the distinct measured data values, and a list of their corresponding frequencies. Both lists are of size n. The algorithm first computes the cdf in a cumulative manner. The turning points are then extracted by recording the first data value whose density exceeds the average step value (m/n), and the value whose cdf equals 98%.
Fig.1 illustrates an example of an interval data construction. For a data set size n =
11, and a population size m = 30, Arr[n] = [12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42],
and its corresponding frequencies Freq[n] = [1, 3, 6, 5, 3, 4, 2, 2, 2, 1, 1], the computed
cdf-interval has the following bounds [(15, 0.13), (39.6, 0.98)].
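The construction can be illustrated with a small Python sketch (ours; Algorithm 1 itself is not reproduced here, and the interpolation convention for the 98% point is an assumption on our part, so it need not reproduce the reported upper bound of 39.6 exactly):

def cdf_interval(values, freqs):
    # Illustrative sketch of the interval construction described above.
    # Assumptions (ours): the lower turning point is the first value whose
    # frequency exceeds the average step m/n; the 98% point is obtained by
    # linear interpolation on the cdf.
    m, n = sum(freqs), len(values)
    cdf, total = [], 0
    for f in freqs:
        total += f
        cdf.append(total / m)                      # cumulative distribution
    lb_idx = next(i for i, f in enumerate(freqs) if f > m / n)
    lb = (values[lb_idx], round(cdf[lb_idx], 2))   # lower turning point
    target = 0.98                                  # upper turning point
    for i in range(1, n):
        if cdf[i] >= target:
            x = values[i - 1] + (target - cdf[i - 1]) / (cdf[i] - cdf[i - 1]) \
                * (values[i] - values[i - 1])
            return lb, (round(x, 1), target)
    return lb, (values[-1], cdf[-1])

Arr  = [12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42]   # data of the example
Freq = [1, 3, 6, 5, 3, 4, 2, 2, 2, 1, 1]              # m = 30, n = 11
print(cdf_interval(Arr, Freq))   # lower bound (15, 0.13) as in the text;
                                 # the 98% point depends on the convention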
Consider the practical meaning of the interval [pa , pb ] that we have sought to obtain.
This interval is built according to two main sources of information: 1) the monotony
and non-decreasing properties of the cdf curve to account for degree of knowledge, 2)
the extreme turning points over such a curve. Recall that the cdf curve indicates the
aggregated distribution function of a data set. Plotting a point on this curve tells us
what are the chances that the actual data value lies on or before this point. The extreme
turning points we have considered are such that the lower bound indicates when the
slope (thus frequency of occurrence over the population) increases more than the aver-
age; and the upper bound that of the cdf reaching a plateau. The measure of this upper
bound has been associated with the cdf value of 98%. This corresponds to the distance avg + 3σ when the distribution is normal. Such an interpretation is a conservative view that can be revised by the decision maker.
It is also important to note the effectiveness of using the cdf as an indicator of the degree of knowledge. Given a measurement of the data p such that (x, F^p_x) is any point, we have the following, due to the monotone non-decreasing property of F^p:

  a ≤ x ≤ b  ⟹  F^p_a ≤ F^p_x ≤ F^p_b
This implies that we can order (partially) points in this 2D-space U = R × [0, 1].
Thus we can construct an algebra over variables taking their value in this space. In
particular, we can approximate the cdf curve associated with a data population by the
linear (increasing or constant) slope between the two turning points.
  ∀ a < x < b:  F^I_x = ((F^I_b − F^I_a) / (b − a)) · (x − a) + F^I_a    (5)
4 cdf-Intervals
Our approach follows real interval arithmetic. It adds a second dimension to each un-
certain value, requiring us to define a new ordering among points in a two dimensional
space, together with new inference rules.
Consider a data population with its cdf curve. Create the set of points such that each point p_x is specified by (x, F_x) ∈ R × [0, 1]. The set U = R × [0, 1] is a set of partially ordered tuples and constitutes a poset with a unique greatest lower bound and least upper bound. Similarly to reliable computing methods, the constraint system will
produce a solution set as opposed to a solution point. The variables thus denote inter-
vals within the cdf-interval structure and constraint processing needs to be extended to
perform arithmetic operations within this algebraic structure.
4.1 cdf-Interval Ordering
Definition 6 (Ordering over U, ⪯). Let p_x = (x, F^p_x), q_y = (y, F^q_y) ∈ U. The ordering ⪯ is a partial order defined by:

  p_x ⪯ q_y  ⇔  x ≤ y and F^p_x ≤ F^q_y
Example 2. Consider the three points p_x = (1, 0.3), q_y = (2, 0.5) and r_z = (2, 0.1). We have p_x ⪯ q_y as well as r_z ⪯ q_y, but p_x and r_z are not comparable.
Fig. 3 illustrates an example which computes the glb and lub of two points p_x = (1, 0.3) and q_y = (2, 0.1): glb(p_x, q_y) = (1, 0.05) and lub(p_x, q_y) = (2, 0.6).
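Definition 6 can be transcribed directly; the small Python sketch below (ours) reproduces the comparisons of Example 2:

def precedes(p, q):
    # p precedes q in U = R x [0, 1] iff both coordinates are ordered.
    return p[0] <= q[0] and p[1] <= q[1]

def comparable(p, q):
    return precedes(p, q) or precedes(q, p)

px, qy, rz = (1, 0.3), (2, 0.5), (2, 0.1)
assert precedes(px, qy) and precedes(rz, qy)
assert not comparable(px, rz)       # px and rz are incomparable (Example 2)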
Fig. 4. cdf distribution resulting from superimposing two intervals for x and y with a relation: (a) addition (b) multiplication (c) subtraction. x ∈ [a, b] with a cdf F^I and y ∈ [c, d] with a cdf F^J.
Any two intervals, each shaping a different distribution cdf, can be involved in a
relation given by a function. This relation in turn shapes a cdf that is based on a double
integration of their joint cdf over the set of values per interval under the curve of the
function [10]. From this generic methodology we derive cdf lower and upper bound
equations for each binary arithmetic operation. Derived equations are shown by the
dark shaded area under the curve of the relation depicted in fig. 4. Proofs are omitted for space reasons.
  F^{I+J}_{lb_+} = 0.5 [ F^J_{lb_+} + F^J_c + F^I_a + F^I_{lb_+} ]
  F^{I+J}_{ub_+} = 0.5 [ F^I_b + F^J_d ]
Example 4. Given two data populations with associated cdf curves, approximated by the cdf-intervals I = [(1, 0.3), (7, 0.65)] and J = [(2, 0.46), (9, 0.6)], the addition of two uncertain data values from I and J respectively is specified by the cdf-interval [r_{a+c}, r_{b+d}] = [(3, 0.38), (16, 0.625)], as shown in fig. 5. Note that in the absence of the second dimension we obtain regular interval arithmetic addition.
such that the first dimension follows the conventional real interval arithmetic multiplication. The lower and upper bounds are defined by lb_× and ub_×. Recall that data values can be negative:
The 2nd-dimension cdf values for the resulting interval bounds are computed as follows:

  F^{I×J}_{lb_×} = 0.5 (F^I_b F^J_c + F^I_a F^J_d)
  F^{I×J}_{ub_×} = max(F^J_d, F^I_b)    (11)
Example 5. Given two variables ranging over the respective cdf-intervals I = [(1, 0.3), (4, 0.65)] and J = [(1.5, 0.46), (5, 0.6)], as illustrated in fig. 6, the result of the multiplication is the cdf-interval specified by [r_{a×c}, r_{b×d}] = [(1.5, 0.24), (20, 0.65)].
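Assuming the bound formulas of Eq. (11) as reconstructed above, the following Python sketch (ours) reproduces Example 5; the rounding of the lower-bound cdf is our own choice for display.

def cdf_mul(I, J):
    # cdf-interval multiplication: the first dimension follows real interval
    # arithmetic; the second dimension uses the bounds of Eq. (11).
    (a, Fa), (b, Fb) = I
    (c, Fc), (d, Fd) = J
    products = [a * c, a * d, b * c, b * d]
    lb, ub = min(products), max(products)
    F_lb = 0.5 * (Fb * Fc + Fa * Fd)
    F_ub = max(Fd, Fb)
    return ((lb, round(F_lb, 2)), (ub, F_ub))

I = ((1, 0.3), (4, 0.65))
J = ((1.5, 0.46), (5, 0.6))
print(cdf_mul(I, J))        # ((1.5, 0.24), (20, 0.65)) as in Example 5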
6 Implementation
The constraint system behaves like a solver over real intervals, based on the relational arithmetic of real intervals, where arithmetic expressions are interpreted as relations [6]. The relations are handled using the following transformation rules, which extend the ones over real intervals with inferences over the cdf values. The rules are handled by a relaxation algorithm which resembles the arc consistency algorithm AC-3 [13]. The solver converges to a fixed point or infers failure. Termination of the generic constraint propagation algorithm is ensured because the cdf-domain ordering is reflexive, antisymmetric and transitive. Hereafter we present the main transformation rules for the basic arithmetic operations. For space reasons, we use the following notation, and a domain that remains unchanged is simply carried over: I = [p_a, p_b], J = [q_c, q_d] and K = [r_e, r_f]. The cdf-variables are denoted by X, Y and Z. Failure is detected if some domain bounds do not preserve the ordering ⪯.
Ordering constraint X ⪯ Y
  p_b' = glb(p_b, q_d),  q_c' = lub(p_a, q_c)
  {X ∈ I, Y ∈ J, X ⪯ Y}  −→  {X ∈ [p_a, p_b'], Y ∈ [q_c', q_d], X ⪯ Y}

Equality constraint X = Y
  p_b' = glb(p_b, q_d),  p_a' = lub(p_a, q_c)
  {X ∈ I, Y ∈ J, X = Y}  −→  {X ∈ [p_a', p_b'], Y ∈ [p_a', p_b'], X = Y}

Ternary addition constraint X +_U Y = Z
  r_f' = (ub_+, F^{I+J}_{ub_+}),  r_e' = (lb_+, F^{I+J}_{lb_+})
  {X ∈ I, Y ∈ J, Z ∈ K, Z = X +_U Y}  −→  {X ∈ I, Y ∈ J, Z ∈ [r_e', r_f'], Z = X +_U Y}
The projection onto X's domain uses subtraction (the projection onto Y's domain is symmetrical):
  p_b' = (ub_−, F^{K−J}_{ub_−}),  p_a' = (lb_−, F^{K−J}_{lb_−})
  {X ∈ I, Y ∈ J, Z ∈ K, X = Z −_U Y}  −→  {X ∈ [p_a', p_b'], Y ∈ J, Z ∈ K, X = Z −_U Y}

Ternary multiplication constraint X ×_U Y = Z
  r_f' = (ub_×, F^{I×J}_{ub_×}),  r_e' = (lb_×, F^{I×J}_{lb_×})
  {X ∈ I, Y ∈ J, Z ∈ K, Z = X ×_U Y}  −→  {X ∈ I, Y ∈ J, Z ∈ [r_e', r_f'], Z = X ×_U Y}
The projection onto X's domain uses division (the projection onto Y's domain is symmetrical):
  p_b' = (ub_÷, F^{K÷J}_{ub_÷}),  p_a' = (lb_÷, F^{K÷J}_{lb_÷})
  {X ∈ I, Y ∈ J, Z ∈ K, X = Z ÷_U Y}  −→  {X ∈ [p_a', p_b'], Y ∈ J, Z ∈ K, X = Z ÷_U Y}
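The relaxation algorithm mentioned above can be outlined as a generic worklist loop; the Python sketch below is our own AC-3-style rendering, with the rule-specific revision functions left abstract (each one implements one of the transformation rules and returns the narrowed domains of its variables, or None when the ordering ⪯ is violated).

from collections import deque

def propagate(domains, constraints, revisers, watching):
    # Generic fixed-point loop over cdf-interval domains (a sketch of ours).
    # domains: dict variable -> cdf-interval; watching: variable -> list of
    # constraints involving it; revisers: constraint -> revision function.
    queue = deque(constraints)
    while queue:
        con = queue.popleft()
        narrowed = revisers[con](domains)
        if narrowed is None:                   # bounds violate the ordering
            return None                        # failure
        for var, dom in narrowed.items():
            if dom != domains[var]:
                domains[var] = dom             # domain was narrowed:
                queue.extend(c for c in watching[var] if c not in queue)
    return domains                             # fixed point reached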
Remark. If we remove any density information for the uncertain coefficients by assigning the values 0 and 1 to the lower and upper bounds of the cdf dimension, this amounts to considering that the uncertain data is uniformly distributed along the given interval. The output in this case is X1 ∈ [(0.00, 0.00), (2.50, 1.00)] and X2 ∈ [(0.00, 0.00), (5.00, 1.00)]. In this sense the cdf-interval constraint model generalizes real interval arithmetic constraint reasoning.
  ∀ j,  p_{x_j} ∈ R^+ × [0, 1]    (14)
The transformation of the above model yields (m × 2^{n+1}) inequalities per dimension. The produced solution set is S_{d_i} = {S^k_{d_i} | k = 1, 2, ..., 2^{n+1}}, where the upper-bound value range is lub_{k=1..2^{n+1}} S^k_{d_i}, the lower-bound value range is glb_{k=1..2^{n+1}} S^k_{d_i}, and d indicates the dimension, 1 or 2.
The transformation is performed in two steps: first, each equality constraint is trans-
formed into two cdf-interval inequalities; then the cdf-constraints are transformed into
linear constraints. The output of the transformation applied to example 6 will be:
A =
  (−2, 0.32)      (1, 0.41)
  (−2, 0.35)      (−1.001, 0.31)
  (1, 0.13)       (1, 0.25)
  (5.999, 0.28)   (1.5, 0.34)
  (−6, 0.14)      (−3, 0.04)

with the relation ⪯ in every row, and

b =
  (4, 0.88)
  (5, 0.78)
  (5, 0.7)
  (15, 0.85)
  (−4, 0.71)
Accordingly, the 1st constraint row is the inequality constraint (−2, 0.32)p_{x_1} +_U (1, 0.41)p_{x_2} ⪯ (4, 0.88). This constraint in turn is transformed into two components: −2x_1 + x_2 ≤ 4 and 0.25 [0.41 F^2_ub + 0.95 F^2_lb + 0.32 F^1_ub + 0.83 F^1_lb] ≤ 0.88. The linear equations resulting from the transformation of the above output are presented below, where equations (15) and (16) are in the 1st and 2nd dimensions respectively:
  −2x_1 + x_2 ≤ 4
  −2x_1 − 1.001x_2 ≤ 5
  x_1 + x_2 ≤ 5
  5.999x_1 + 1.5x_2 ≤ 15
  −6x_1 − 3x_2 ≤ −4    (15)
  0.25 [0.41 F^2_ub + 0.95 F^2_lb + 0.32 F^1_ub + 0.83 F^1_lb] ≤ 0.88
  0.25 [0.31 F^2_ub + 0.75 F^2_lb + 0.35 F^1_ub + 0.87 F^1_lb] ≤ 0.78
  0.5 [F^1_ub + F^2_ub] ≤ 0.7
  0.25 [0.34 F^2_ub + 0.96 F^2_lb + 0.28 F^1_ub + 0.86 F^1_lb] ≤ 0.85
  0.5 [F^1_ub + F^2_ub] ≤ 0.71
  F^1_lb ≤ F^1_ub
  F^2_lb ≤ F^2_ub    (16)
For a point p_{x_i}: x_i is the point value in the 1st dimension, and F^i_lb and F^i_ub are the 2nd-dimension lower-bound and upper-bound values that the point can take.
An additional constraint has been added to the linear equations in the 2nd dimension; it ensures that the lower bound value is less than its upper bound value. It is clear that the resulting equations in both dimensions are linear and can be solved using linear programming techniques.
7 Conclusion
The framework of reliable computing offers robust and tractable approaches to reasoning with uncertain data by means of convex models of uncertainty sets (e.g., using interval coefficients). It does not account, however, for any degree of knowledge about the data, such as the density function; thus all solutions in the solution set have equal uncertainty weight. This paper addressed this issue and showed how to embed a degree of knowledge in the form of the cumulative distribution function. The paper proposed the novel concept of cdf-intervals, whereby an uncertainty set is specified by an interval of points whose first coordinate is the data uncertainty value and whose second coordinate is its cdf value. Since the uncertain data and consequently the decision variables are specified by their confidence interval, so is the solution set. We also presented the constraint system over this new domain by extending real interval arithmetic to cdf-interval arithmetic, using the monotone non-decreasing property of the cdf. Finally, the application of cdf-intervals to extend the approach to Interval Linear Systems was showcased. We are currently applying this new approach to larger systems of constraints, with applications to vehicle routing and networking, but also finance and image recognition, where the uncertain data is enriched with a degree of knowledge (i.e., a density function drawn from historical data trends or randomly generated).
References
1. Beaumont, O.: Solving interval linear systems with linear programming techniques. Linear
Algebra and its Applications 281(1-3), 293–309 (1998)
2. Ben-Haim, Y., Elishakoff, I.: Discussion on: A non-probabilistic concept of reliability. Struc-
tural Safety 17(3), 195–199 (1995)
3. Ben-Tal, A., Nemirovski, A.: Robust solutions of uncertain linear programs. Operations
Research Letters 25(1), 1–14 (1999)
4. Benhamou, F., de Vinci, R.: Interval constraint logic programming. In: Constraint program-
ming: basics and trends: Châtillon Spring School, France (1994)
5. Chinneck, J., Ramadan, K.: Linear programming with interval coefficients. Journal of the
Operational Research Society, 209–220 (2000)
6. Cleary, J.: Logical arithmetic. Future Computing Systems 2(2), 125–149 (1987)
7. Deruyver, A., Hodé, Y.: Qualitative spatial relationships for image interpretation by using a
conceptual graph. Image and Vision Computing 27(7), 876–886 (2009)
8. Fargier, H., Lang, J., Schiex, T.: Mixed constraint satisfaction: A framework for decision
problems under incomplete knowledge. In: Proceedings of the National Conference on Arti-
ficial Intelligence, Citeseer, pp. 175–180 (1996)
9. Grossglauser, M., Rexford, J.: Passive Traffic Measurement for Internet Protocol Operations.
In: The Internet as a Large-Scale Complex System, p. 91 (2005)
306 A. Saad, C. Gervet, and S. Abdennadher
10. Gubner, J.: Probability and Random processes for electrical and computer Engineers. Cam-
bridge Univ. Pr., Cambridge (2006)
11. Hansen, E.: Global optimization using interval analysis: the one-dimensional case. Journal
of Optimization Theory and Applications 29(3), 331–344 (1979)
12. Hoffman, K.: Combinatorial Optimization: Current successes and directions for the future.
Journal of Computational and Applied Mathematics 124(1-2), 341–360 (2000)
13. Mackworth, A.: Consistency in networks of relations. Artificial Intelligence 8(1), 99–118
(1977)
14. Tversky, A., Kahneman, D.: Advances in prospect theory: Cumulative representation of un-
certainty. Journal of Risk and Uncertainty 5(4), 297–323 (1992)
15. Yorke-Smith, N.: Reliable constraint reasoning with uncertain data. PhD thesis, IC-Parc,
Imperial College London (2004)
16. Yorke-Smith, N., Gervet, C.: Certainty closure: Reliable constraint reasoning with incom-
plete or erroneous data. ACM Transactions on Computational Logic (TOCL) 10(1), 3 (2009)
17. Zhou, K., Doyle, J.C., Glover, K.: Robust and optimal control. Prentice-Hall, Inc., Upper
Saddle River (1996)
Revisiting the Soft Global Cardinality
Constraint
1 Introduction
Régin et al. [3] suggested softening global constraints by introducing a cost variable measuring the violation of the constraints. This has the advantage that over-constrained satisfaction problems can be turned into constrained optimization problems solvable by traditional CP solvers; furthermore, specialized filtering algorithms can be employed to filter the variables involved in soft constraints. One of the global constraints most used to solve practical problems is the Global Cardinality Constraint (gcc) introduced in [2]:
Definition 1
where
  viol(d_1, ..., d_n) = Σ_{d ∈ D_X} max(0, |{d_i | d_i = d}| − u_d, l_d − |{d_i | d_i = d}|).
The violation represents the sum of excess or shortage for each value. For space
reasons, we only consider the value-based violation version of the constraint in
this paper as the extension to variable violation follows directly (as in [5]).
The domain filtering algorithm for softgcc introduced in [5] exploits matching theory, and we use the same notation for consistency and clarity. The consistency check and the filtering of the violation variable Z are briefly summarized
in Section 2. The main contribution of the paper is in Section 3 where the cor-
rected filtering algorithm for the X variables is presented. Please refer to [5] for
some basic notions about matching theory.
3 Filtering of X
While the consistency check is correct, the original paper [5] overlooked the case in which the lower or upper bounds of the value occurrences are zero, and it does not characterize the conditions to achieve domain consistency in such cases (see Example 1 below). In this section, we review and correct the theorems on which the filtering algorithm is built. Theorem 3 gives the conditions under which a vertex x is matched in every maximum matching.
Theorem 3. A variable vertex x is matched in every maximum matching of Gu (Go) iff it is matched in a maximum matching Mu (Mo) and there does not exist an M-alternating path starting from a free variable vertex and finishing in x.
Proof
⇒ Suppose there exists an even M-alternating path P starting from a free variable vertex x' such that P = {x', ..., x}; the alternating path is even as x' and x belong to the same vertex partition; furthermore, x must be matched, as P is an alternating path and x' is free. Then M' = M ⊕ P is still a maximum matching in which x is free.
⇐ Suppose there exists a maximum matching M' in which x is free. All of the adjacent vertices of x are matched, otherwise M' is not maximum. We can build an even alternating path starting from x by choosing one of the adjacent vertices of x and then by following the edge belonging to M'. By using such an even alternating path, it is possible to build a new maximum matching in which x is matched and in which there exists an M-alternating path starting from a free variable vertex.
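Theorem 3 amounts to a reachability test: starting from the free variable vertices of a maximum matching M and alternately following non-matching and matching edges, every variable vertex that is reached admits a maximum matching in which it is free. The Python sketch below is our own rendering and assumes the value vertices have been duplicated so that a standard bipartite matching applies.

from collections import deque

def matched_in_all_maximum_matchings(variables, adj, match):
    # Sketch (ours): variable vertices matched in EVERY maximum matching.
    # adj: variable vertex -> adjacent value vertices; match: matched vertex
    # -> its partner (in both directions), for a maximum matching M.
    free = [x for x in variables if x not in match]
    reached = set(free)                    # ends of even alternating paths
    queue = deque(free)
    while queue:
        x = queue.popleft()
        for v in adj[x]:
            if v in match and match[v] != x:   # non-matching edge x--v,
                y = match[v]                   # then the matching edge v--y
                if y not in reached:
                    reached.add(y)
                    queue.append(y)
    return {x for x in variables if x in match and x not in reached}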
Proof. If l_v > 0, then there exists a matching (maybe not maximum) using the edge {x, v}. Consequently, if e = {x, v} does not belong to any maximum matching of Gu, the size of a maximum matching would decrease by one when forcing the assignment x ← v. Now if l_v = 0, then the edge {x, v} belongs to no matching of Gu, so the size of the maximum matching decreases only if the vertex x is matched in every maximum matching.
Theorem 6. Let Go and Gu be the value graphs with respectively upper and lower bound capacities, and let Mo and Mu be maximum matchings in Go and Gu respectively; let BOF = |X| − |Mo| and BUF = Σ_{d∈D} l_d − |Mu|. The constraint softgcc(X, l, u, Z) is domain consistent on X if and only if min DZ ≤ BOF + BUF and either:
1. BOF + BUF < (max DZ − 1), or
2. BOF + BUF = (max DZ − 1) and for each edge e = {x, v} either
   – it belongs to a maximum matching in Gu, or l_v = 0 and the vertex x is not matched in every maximum matching in Gu, or
   – it belongs to a maximum matching in Go, or u_v = 0 and the vertex x is not matched in every maximum matching in Go, or
3. BOF + BUF = max DZ and for each edge e = {x, v} we have that
   – it belongs to a maximum matching in Gu, or l_v = 0 and the vertex x is not matched in every maximum matching in Gu, and
   – it belongs to a maximum matching in Go, or u_v = 0 and the vertex x is not matched in every maximum matching in Go.
Proof. Common to the proof of all the cases is the fact that there exists an assignment with violation BOF + BUF. If BOF + BUF > max DZ then the constraint is inconsistent, since BOF + BUF is a lower bound on the violation.
1. Forcing the assignment of one variable cannot increase BUF by more than one and cannot increase BOF by more than one. So in the worst case the assignment of a variable to a value results in BUF' = BUF + 1 and BOF' = BOF + 1. By Theorem 2 it is possible to build an assignment for all the variables with violation BUF' + BOF' = BUF + BOF + 2 ≤ max DZ, which is thus consistent.
2. Since BOF + BUF = (max DZ − 1) and because of Theorem 2, at most one of BUF or BOF can increase by one by forcing the assignment x ← v. Theorems 4 and 5 give the conditions, say Cu and Co, under which the assignment increases BUF and BOF respectively by one. At most one of these conditions can be satisfied. Hence the condition expressed is nothing else than ¬(Cu ∧ Co) ≡ ¬Cu ∨ ¬Co.
3. This is similar to the previous point, except that neither BUF nor BOF can increase by one. Hence the condition expressed is ¬Cu ∧ ¬Co.
Note that once we compute the alternating paths to detect edges belonging to a
maximum matching, we get for free also the variable vertices that are matched in
every maximum matching (Theorem 3). Therefore the complexity of the filtering
algorithm remains unchanged w.r.t. [5].
The algorithm described in this paper is implemented in the constraints
softAtLeast, softAtMost and softCardinality of Comet [1].
References
1. Comet 2.0, http://www.dynadec.com
2. Régin, J.-C.: Generalized arc consistency for global cardinality constraint. In: AAAI
1996, pp. 209–215 (1996)
3. Régin, J.C., Petit, T., Bessière, C., Puget, J.-F.: An original constraint based ap-
proach for solving over constrained problems. In: Dechter, R. (ed.) CP 2000. LNCS,
vol. 1894, p. 543. Springer, Heidelberg (2000)
4. van Hoeve, W.J., Pesant, G., Rousseau, L.-M.: On global warming: Flow-based soft
global constraints. J. Heuristics 12(4-5), 347–373 (2006)
5. Zanarini, A., Milano, M., Pesant, G.: Improved algorithm for the soft global cardi-
nality constraint. In: Beck, J.C., Smith, B.M. (eds.) CPAIOR 2006. LNCS, vol. 3990,
pp. 288–299. Springer, Heidelberg (2006)
A Constraint Integer Programming Approach
for Resource-Constrained Project Scheduling
1 Introduction
The resource-constrained project scheduling problem (RCPSP) is not only theo-
retically hard [5] but consistently resists computational attempts to obtain solu-
tions of proven high quality even for instances of moderate size. As the problem
is of high practical relevance, it is an ideal playground for different optimization communities, such as integer programming (IP), constraint programming (CP), and satisfiability testing (SAT), which has led to a variety of publications; see [6].
Supported by the DFG Research Center Matheon Mathematics for key technologies
in Berlin.
The three areas each come with their own strengths to reduce the size of the search space. Integer programming solvers build on lower bounds obtained from linear relaxations. These relaxations can often be considerably strengthened by additional valid inequalities (cuts), which spawned the rich theory of polyhedral combinatorics. Constraint programming techniques cleverly learn logical implications (between variable settings), which are used to strengthen the bounds on variables (domain propagation). Moreover, the constraints in a CP model are usually much more expressive than in the IP world. Satisfiability testing, or SAT for short, draws from unsatisfiable or conflicting structures, which helps to quickly find reasons for, and exclude, infeasible parts of the search space. The RCPSP offers footholds to attacks from all three fields, but no single
one alone has been able to crack the problem. So it is not surprising that the
currently best known approach [11] is a hybrid between two areas, CP and SAT.
Conceptually, it is the logical next step to integrate the IP world as well. It is
the purpose of this study to evaluate the potential of such a hybrid and to give
a proof-of-concept.
Related Work. For an overview on models and techniques for solving the
RCPSP we refer to the recent survey of [6]. Several works on scheduling problems
already combine solving techniques in hybrid approaches. For the best current
results on instances of PSPLib, we refer to [11], where a constraint programming
approach is supported by lazily creating a SAT model during the branch-and-
bound process by which new constraints, so called no-goods, are generated.
2 Problem Description
  min max_{j∈J} (S_j + p_j)
For the implementation we use the CIP solver scip, which performs a complete search in a branch-and-bound manner. The question to answer is how strongly conflict analysis and LP techniques contribute to the solving process by pruning the search tree. Therefore, first versions of separation and conflict analysis methods are implemented for the cumulative constraint.
As IP model we use the formulation of [9] with binary start time variables.
In the cumulative constraint we generate knapsack constraints [1] from the
capacity cuts. Propagation of variable bounds and repropagations of bound
changes are left to the solver scip. For the cumulative constraint bounds are
updated according to the concept of core-times [7]. The core-time of a job j is the interval [ub_j, lb_j + p_j]. A job's lower bound can be updated from lb_j to lb'_j if its demand plus the demands of the cores exceeds the resource capacity in certain time intervals. An explanation of this bound change is given by the set of jobs that have a core during this interval. More formally, let C ⊂ J be the set of jobs whose core is non-empty, i.e., ub_j < lb_j + p_j holds for j ∈ C. The delivered explanation is the local lower bound of job j itself and the local lower and upper bounds of all jobs k ∈ {i ∈ C : ub_i < lb'_j and lb_i + p_i > lb_j}.
This poses the interesting, still open question of whether it is NP-hard to find a minimum set of jobs from which the bound change can be derived.
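A much simplified, discrete-time Python sketch of this core-based bound update is given below; it is our own illustration (not the scip implementation), and lb, ub, p, demand are assumed to be per-job dictionaries of start-time bounds, processing times and demands.

def update_lower_bound(j, lb, ub, p, demand, capacity, horizon):
    # Accumulate the cores [ub_k, lb_k + p_k) of the other jobs and lift the
    # lower bound of job j past every time point where its demand plus the
    # core demands would exceed the resource capacity.
    profile = [0] * horizon
    for k in lb:
        if k != j and ub[k] < lb[k] + p[k]:          # non-empty core
            for t in range(ub[k], min(lb[k] + p[k], horizon)):
                profile[t] += demand[k]
    start = lb[j]
    t = start
    while t < min(start + p[j], horizon):
        if profile[t] + demand[j] > capacity:        # conflict at time t
            start = t + 1                            # j cannot overlap t
            t = start
        else:
            t += 1
    return start                                     # new lower bound lb'_j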
To speed up the propagation process, we filter from the cumulative constraints all pairs of jobs that cannot be executed in parallel and propagate them in a global disjunctive bounds constraint. This constraint propagates and checks in a more efficient manner and can separate further cuts based on forbidden sets. To get tight primal bounds, we apply a primal heuristic that is based on a fast list scheduling algorithm [8]. If an LP solution is available, the list of jobs is sorted according to the start times of the jobs, otherwise by weighted local bounds and α-points [8]. Furthermore, we apply a justification improvement heuristic as
Table 1. Summary of the computational results. Detailed results are given in [4].

                      480 instances with 30 jobs                        480 instances with 60 jobs
                              Nodes          Time in [s]                         Nodes          Time in [s]
Setting      opt best wor. total(k)  geom   total(k) geom   opt best wor.   total(k)  geom   total(k)  geom
default      460  476   4    3 513   173.2     93.0   7.8   385  395  85      34 104  364.3     350.9  27.3
noconflict   436  467  13    8 665   246.6    175.0  11.6   381  390  90      38 099  381.8     362.9  28.3
norelax      454  467  13    7 444   194.0    106.8   6.5   384  390  90     127 684  591.2     355.8  26.1
none         446  465  15    9 356   217.5    135.5   7.7   382  389  91     126 714  599.3     364.8  26.9
bestset      460  476   4       –       –        –     –    391  401  79          –      –         –     –
lazyFD       480  480   0       –       –        –     –    429  429  51          –      –         –     –
described in [13] whenever a better solution was found. We use hybrid branching [2]
only on integer variables.
4 Computational Results
In this section, we analyze the impact of the two features LP relaxation and
conflict analysis for the RCPSP using the test sets of the PSPLib [10]. Due
to the lack of space we restrict ourselves mainly to the test sets containing 30
and 60 jobs. For instances with 120 jobs we report improved lower bounds.
All computations were obtained on Intel Xeon Core 2.66 GHz computers (in
64 bit mode) with 4 MB cache, running Linux, and 8 GB of main memory. We
used scip [12] version 1.2.0.6 and integrated cplex release version 12.10 as
underlying LP solver. A time limit of one hour was enforced for each instance.
Table 1 presents the results for different settings which differ by disabled features. The setting "norelax" does not take advantage of the LP relaxation, "noconflict" disables conflict analysis, "none" stands for disabling both of these features, whereas "default" enables both. The setting "bestset" is the best of the previous four settings for each instance, and the last line reports the results for the solver lazyFD. We compare for how many instances optimality ("opt") was proven, the best known primal solution ("best") was found, and the primal solution was worse ("wor.") than the best known. Besides that, we state the total time and number of branch-and-bound nodes over all instances in the test set and the shifted geometric means¹ ("geom") of these two performance measures.
First of all, the results show that our approach is competitive with the currently best known method [11]. We observe further that using both features leads to a tremendous reduction of the search space. This does not directly transfer to the running time. From that point of view, the relaxation seems to be more expensive than the conflict analysis. On the other hand, the relaxation prunes a greater portion of the search space compared to the reduction achieved by the conflict analysis. Using both features, however, leads to the best performance and indicates the potential of this highly integrated approach.
¹ The shifted geometric mean of values t_1, ..., t_n is defined as (Π_i (t_i + s))^{1/n} − s with shift s. We use a shift s = 10 for time and s = 100 for nodes in order to decrease the strong influence of the very easy instances in the mean values.
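For reference, the shifted geometric mean used in Table 1 can be computed as follows (a small helper of ours):

import math

def shifted_geometric_mean(values, shift):
    # ((t_1 + s) * ... * (t_n + s))^(1/n) - s, computed in log space.
    n = len(values)
    return math.exp(sum(math.log(t + shift) for t in values) / n) - shift

# e.g. shifted_geometric_mean(times, 10) for times, shift 100 for node counts.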
Already with this basic implementation we can improve the lower bounds of five large instances with 120 jobs. These are j12018_3 (and j12018_9), where the new lower bound is 100 (and 88). For j12019_6 (and j12019_9) we obtain lower bounds of 89 (and 87). Finally, we prove a lower bound of 75 for j12020_3.
5 Conclusions
We have shown the power of integrating CP, IP, and SAT techniques into a single approach to solve the RCPSP. Already with our basic implementation we are competitive with both the best known upper and lower bounds, and we even improve on a few. There is ample room for improvement, like strengthening the LP relaxation by cutting planes or a dynamic edge-finding, which can be exploited using scip's re-propagation capabilities. This is subject to current research.
References
1. Achterberg, T.: SCIP: Solving Constraint Integer Programs. Math. Programming
Computation 1, 1–41 (2009)
2. Achterberg, T., Berthold, T.: Hybrid branching. In: van Hoeve, W.-J., Hooker, J.N.
(eds.) CPAIOR 2009. LNCS, vol. 5547, pp. 309–311. Springer, Heidelberg (2009)
3. Baptiste, P., Pape, C.L.: Constraint propagation and decomposition techniques
for highly disjunctive and highly cumulative project scheduling problems. Con-
straints 5, 119–139 (2000)
4. Berthold, T., Heinz, S., Lübbecke, M.E., Möhring, R.H., Schulz, J.: A constraint
integer programming approach for resource-constrained project scheduling, ZIB-
Report 10-03, Zuse Institute Berlin (2010)
5. Blażewicz, J., Lenstra, J.K., Kan, A.H.G.R.: Scheduling subject to resource con-
straints: classification and complexity. Discrete Appl. Math. 5, 11–24 (1983)
6. Hartmann, S., Briskorn, D.: A survey of variants and extensions of the resource-constrained project scheduling problem. Eur. J. Oper. Res. (2009) (in press, corrected proof)
7. Klein, R., Scholl, A.: Computing lower bounds by destructive improvement: An
application to resource-constrained project scheduling. Eur. J. Oper. Res. 112,
322–346 (1999)
8. Möhring, R.H., Schulz, A.S., Stork, F., Uetz, M.: Solving project scheduling prob-
lems by minimum cut computations. Manage. Sci. 49, 330–350 (2003)
9. Pritsker, A.A.B., Watters, L.J., Wolfe, P.M.: Multi project scheduling with limited
resources: A zero-one programming approach. Manage. Sci. 16, 93–108 (1969)
10. PSPLib, Project Scheduling Problem LIBrary, http://129.187.106.231/psplib/
(last accessed 2010/February/01)
11. Schutt, A., Feydy, T., Stuckey, P., Wallace, M.: Why cumulative decomposition is
not as bad as it sounds. In: Gent, I.P. (ed.) CP 2009. LNCS, vol. 5732, pp. 746–761.
Springer, Heidelberg (2009)
12. SCIP, Solving Constraint Integer Programs, http://scip.zib.de/
13. Valls, V., Ballestı́n, F., Quintanilla, S.: Justification and RCPSP: A technique that
pays. Eur. J. Oper. Res. 165, 375–386 (2005)
Strategic Planning for Disaster Recovery with
Stochastic Last Mile Distribution
Section 2 of this paper reviews similar work on disaster preparation and recovery problems. Section 3 presents a mathematical formulation of the disaster recovery problem and sets up the notation for the rest of the paper. Section 4 presents the overall approach using (hopefully) intuitive models. Section 5 presents a number of modeling and algorithmic improvements that refine each of the initial models; it also presents the final version of the optimization algorithm for SCAPs. Section 6 reports experimental results of our complete algorithm on some benchmark instances to validate the approach, and Section 7 concludes the paper.
2 Previous Work
The operations research community has been investigating the field of human-
itarian logistics since the 1990s but recent disasters have brought increased at-
tention to these kinds of logistical problems [18,4,10,9]. Humanitarian logistics
is filled with a wide variety of optimization problems that combine aspects from
classic problems in inventory routing, supply chain management, warehouse lo-
cation, and vehicle routing. The problems posed by humanitarian logistics add
significant complexity to their classical variants. The operations research com-
munity recognizes that novel research in this area is required to solve these kinds
of problems [18,4]. Some of the key features that characterize these problems are
as follows:
Objectives. The objective function aims at minimizing three factors: (1) The
amount of unsatisfied demands; (2) the time it takes to meet those demands; (3)
the cost of storing the commodity. Since these values are not expressed in the
same units, it is not always clear how to combine them into a single objective
function. Furthermore, their relative importance is typically decided by policy
makers on a case-by-case basis. For these reasons, this paper uses weights Wx ,
Wy , and Wz to balance the objectives and to give control to policy makers.
Side Constraints. The first set of side constraints concerns the nodes of the
graph which represent the repositories in the populated area. Each repository
Ri∈1..n has a maximum capacity RCi to store the commodity. It also has a one-
time initial cost RIi (the investment cost) and an incremental cost RMi for each
unit of commodity to be stored. As policy makers often work within budget
constraints, the sum of all costs in the system must be less than a budget B.
The second set of side constraints concerns the deliveries. We are given a
fleet of m vehicles Vi∈1..m which are homogeneous in terms of their capacity
V C. Each vehicle has a unique starting depot Di+ and ending depot Di− . Unlike
Given:
  Repositories R_{i∈1..n}:
    Capacity: RC_i
    Investment Cost: RI_i
    Maintenance Cost: RM_i
  Vehicles V_{i∈1..m}:
    Capacity: VC
    Start Depot: D_i^+
    End Depot: D_i^-
  Scenario Data S_{i∈1..a}:
    Scenario Probability: P_i
    Available Sites: AR_i ⊂ {1..n}
    Site Demand: C_{i,1..n}
    Travel Time Matrix: T_{i,1..l,1..l}
  Weights: W_x, W_y, W_z
  Budget: B

Output:
  The amount stored at each warehouse
  Delivery schedules for each vehicle

Minimize:
  W_x * Unserved Demands +
  W_y * MAX_{i∈1..m} Tour Time_i +
  W_z * Investment Cost +
  W_z * Maintenance Cost

Subject To:
  Vehicle and site capacities
  Vehicles' start and end locations
  Costs ≤ B

Notes:
  Every warehouse that stores a unit must be visited at least once
classic vehicle routing problems [17], customer demands in SCAPs often exceed
the vehicle capacity and hence multiple deliveries are often required to serve a
single customer.
Stochasticity. SCAPs are specified by a set of a different disaster scenarios
Si∈1..a , each with an associated probability Pi . After a disaster, some sites are
damaged and each scenario has a set ARi of available sites where the stored
commodities remain intact. Moreover, each scenario specifies, for each site R_i, the demand C_i. Note that a site may have a demand even if it is not available. Finally, site-to-site travel times T_{i,1..l,1..l} (where l = |V|) are given for each scenario and capture infrastructure damage.
Unique Features. Although different aspects of this problem were studied before
in the context of vehicle routing, location routing, inventory management, and
humanitarian logistics, SCAPs present unique features. Earlier work in location-
routing problems (LRP) assumes that (1) customers and warehouses (storage
locations) are disjoint sets; (2) the number of warehouses is ≈ 3..10; (3) customer
demands are less than the vehicle capacity; (4) customer demands are atomic.
None of these assumptions hold in the SCAP context. In a SCAP, it may not
only be necessary to serve a customer with multiple trips but, due to the storage
capacity constraints, those trips may need to come from different warehouses.
The key features of SCAP are: (1) each site can be a warehouse and/or customer;
(2) one warehouse may have to make many trips to a single customer; (3) one
customer may be served by many warehouses; (4) the number of available vehicles
is fixed; (5) vehicles may start and end in different depots; (6) the objective is to
minimize the time of the last delivery. Minimizing the time of the last delivery
is one of the most difficult aspects of this problem, as demonstrated in [6].
This section presents the basic approach to the SCAP problem, in order to simplify the reading of the paper. Modeling and algorithmic improvements are presented in Section 5. Previous work on location routing [7,1,15] has shown that reasoning over both the storage problem and the routing problem simultaneously is extremely hard computationally. To address this difficulty, we present a three-stage algorithm that decomposes the storage, customer allocation, and routing decisions. The decisions of each stage are independent and can use the optimization technique most appropriate to their nature. The first stage is formulated as a MIP,
the second stage is solved optimally using constraint programming, and the third
stage uses large neighborhood search (LNS).
Storage & Customer Allocation. The first stage captures the cost and demand
objectives precisely but approximates the routing aspects. In particular, the
model only considers the time to move the commodity from the repository to
a customer, not the maximum delivery times. Let D be a set of delivery triples
of the form ⟨source, destination, quantity⟩. The delivery-time component of the objective is replaced by

  W_y · Σ_{⟨s,d,q⟩ ∈ D} T_{s,d} · q/VC
Figure 2 presents the stochastic MIP model, which scales well with the number
of disaster scenarios because the number of integer variables only depends on
the number of sites n. The meaning of the decision variables is explained in the
figure. Once the storage and customer allocation are computed, the uncertainty
is revealed and the second stage reduces to a deterministic multi-depot, multiple-
vehicle capacitated routing problem whose objective consists in minimizing the
latest delivery. To our knowledge, this problem has not been studied before.
One of its difficulties in this setting is that the customer demand is typically
much larger than the vehicle capacity. As a result, we tackle it in two steps. We
first consider each repository independently and determine a number of vehicle
trips to serve the repository customers (Repository Routing). A trip is a tour
that starts at the depot, visits customers, returns to the depot, and satisfies the
vehicle capacity constraints. We then determine how to route the vehicles to
perform all the trips and minimize the latest delivery time (Fleet Routing).
Variables of the stochastic MIP (Figure 2):
  Stored_{i∈1..n} ∈ [0, RC_i] – units stored
  Open_{i∈1..n} ∈ {0, 1} – more-than-zero-units-stored flag
Repository Routing. Figure 3 shows how to create the inputs for repository
routing from the outputs of the MIP model. For a given scenario s, the idea is
to compute the customers of each repository w, the number of full-capacity trips
F ullT ripss,w,c and the remaining demand Demands,w,c needed to serve each
such customer c. The full trips are only considered in the fleet routing since they
must be performed by a round-trip. The minimum number of trips required to
serve the remaining customers is also computed using a bin-packing algorithm.
The repository routing then finds a set of trips serving these customers with
minimal travel time. The repository routing is solved using a simple CP model
depicted in Figure 4. The model uses two depots for each possible trip (a starting
and an ending depot localized at the repository) and considers nodes consisting of
the depots and the customers. Its decision variables are the successor variables
specifying which node to visit next and the trip variables associating a trip
with each customer. The circuit constraint expresses that the successor variables
constitute a circuit, the vehicle capacity constraint is enforced with a multi-
knapsack constraint, and the remaining constraints associate a trip number with
every node. This model is then solved to optimality.
Let:
  Depots+_{s,w} = {d+_1, d+_2, ..., d+_{MinTrips_{s,w}}}
  Depots−_{s,w} = {d−_1, d−_2, ..., d−_{MinTrips_{s,w}}}
  Nodes_{s,w} = Depots+_{s,w} ∪ Depots−_{s,w} ∪ Customers_{s,w}
  Trips_{s,w} = {1, 2, ..., MinTrips_{s,w}}
Variables:
  Successor[Nodes_{s,w}] ∈ Nodes_{s,w} – node traversal order
  Trip[Nodes_{s,w}] ∈ Trips_{s,w} – node trip assignment
Minimize:
  Σ_{n ∈ Nodes_{s,w}} T_{s,n,Successor[n]}
Subject To:
  circuit(Successor)
  multiknapsack(Trip, {Demand_{s,w,c} : c ∈ Customers_{s,w}}, VC)
  for w+_i ∈ Depots+_{s,w}: Trip[w+_i] = i
  for w−_i ∈ Depots−_{s,w}: Trip[w−_i] = i
  for n ∈ Customers_{s,w} ∪ Depots+_{s,w}: Trip[n] = Trip[Successor[n]]
Fleet Routing. It then remains to decide how to schedule the trips for the fleet
to perform and to minimize the latest delivery time. The capacity constraints
can be ignored now since each trip satisfies them. Each trip is abstracted into a
task at the warehouse location and a service time capturing the time to perform
Let:
  Vehicles_s = {1, 2, ..., m}
  StartNodes_s = {D+_1, ..., D+_m}
  EndNodes_s = {D−_1, ..., D−_m}
  Nodes_s = StartNodes_s ∪ EndNodes_s ∪ ⋃_{w∈1..n} Tasks_{s,w}
Variables:
  Successor[Nodes_s] ∈ Nodes_s – node traversal order
  Vehicle[Nodes_s] ∈ Vehicles_s – node vehicle assignment
  DelTime[Nodes_s] ∈ {0, ..., ∞} – delivery time
Minimize:
  MAX_{n ∈ Nodes_s} DelTime(n)
Subject To:
  circuit(Successor)
  for n ∈ StartNodes_s such that n = D+_i:
    Vehicle[n] = i
    DelTime[n] = Time_{s,n}
    DelTime[Successor[n]] = DelTime[n] + TripTime_n + T_{s,n,Successor[n]}
  for each n ∈ EndNodes_s such that n = D−_i:
    Vehicle[n] = i
  for n ∈ Nodes_s \ StartNodes_s \ EndNodes_s:
    Vehicle[n] = Vehicle[Successor[n]]
    DelTime[Successor[n]] = DelTime[n] + TripTime_n + T_{s,n,Successor[n]}
the trip. The fleet routing problem then consists of using the vehicles to perform
all these tasks while minimizing the latest delivery.
Figure 5 depicts how to compute the inputs for fleet routing given the results of
the earlier steps, which consists of computing the proper service times T ripT imet
for each trip t. The model for the fleet routing is depicted in Figure 6 and is a
standard CP formulation for multiple vehicle routing adapted to minimize the
latest delivery time. For each node, the decision variables are its successor, its
vehicle, and its delivery time. The objective minimizes the maximum delivery
time and the rest of the model expresses the subtour elimination constraints, the
vehicle constraints, and the delivery time computation.
The fleet routing problem is solved using LNS [16] to obtain high-quality so-
lutions quickly given the significant number of nodes arising in large instances.
At each optimization step, the LNS algorithm selects 15% of the trips to relax, keeping the rest of the routing fixed. The neighborhood is explored using constraint programming, allowing up to (0.15 |Nodes_s|)^3 backtracks.
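Schematically, this LNS loop can be written as follows; the sketch is ours, with the CP re-optimization of a neighborhood and the solution object (with its latest_delivery field) left abstract.

import random

def lns_fleet_routing(initial, trips, reoptimize, iterations=1000):
    # LNS skeleton: relax 15% of the trips, re-optimize the neighborhood with
    # CP under a backtrack budget, and keep improving solutions.
    best = initial
    for _ in range(iterations):
        relaxed = random.sample(trips, max(1, int(0.15 * len(trips))))
        budget = int((0.15 * len(trips)) ** 3)     # backtrack limit
        candidate = reoptimize(best, relaxed, budget)
        if candidate.latest_delivery < best.latest_delivery:
            best = candidate
    return best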
  Σ_{i=⟨s,d,q⟩ ∈ D} T_{s,d} · ⌈q/VC⌉
but this ceiling function is too difficult for the stochastic MIP model. Instead, we
decompose the problem further and separate the storage and allocation decisions.
The stochastic MIP now decides which repository to open and how much of the
commodity to store at each of them. Once these decisions are taken and once
the uncertainty is revealed (i.e., the scenario s becomes known), we solve a
customer allocation problem, modeled as a MIP (see Figure 7). This problem
must be solved quickly since it is now considered after the uncertainty is revealed.
Unfortunately, even this simplified problem can be time consuming to solve
optimally. However, a time limit of between 30 and 90 seconds results in solutions
within 1% (on average) of the best solution found in one hour. Our results
indicate that even suboptimal solutions to this problem yield better customer
allocation than those produced by the stochastic MIP.
Path-Based Routing. The delivery plans produced by the basic approach exhibit
an obvious limitation. By definition of a trip, the vehicle returns to the repository
at the end of trip. In the case where the vehicle moves to another repository next, it
is more efficient to go directly from its last delivery to the next repository (assum-
ing a metric space which is the case in practice). To illustrate this point, consider
Figure 8 which depicts a situation where a customer (white node) receives deliver-
ies from multiple repositories (shaded nodes). The figure shows the savings when
moving from a tour-based (middle picture) to a path-based solution (right pic-
ture). It is not difficult to adapt the algorithm from a tour-based to a path-based
routing. In the repository routing, it suffices to ignore the last edge of a trip and
to remember where the path ends. In the fleet routing, only the time matrix needs
to be modified to account for the location of the last delivery.
Variables:
  Sent_{i∈1..n, j∈1..n} ∈ [0, Stored_i] – units moved from i to j
  VehicleTrips_{i∈1..n, j∈1..n} ∈ [0, ⌈Stored_i/VC⌉] – trips needed from i to j
Minimize:
  W_x * Σ_{i∈1..n} ( C_{s,i} − Σ_{j∈1..n} Sent_{j,i} ) +
  W_y * Σ_{i∈1..n} Σ_{j∈1..n} T_{s,i,j} * VehicleTrips_{i,j}
Subject To:
  Σ_{j∈1..n} Sent_{i,j} ≤ Stored_i    ∀i
  Σ_{j∈1..n} Sent_{j,i} ≤ C_{s,i}    ∀i
  Sent_{i,j} = 0    ∀i, j where i not in AR_s
  VehicleTrips_{i,j} ≥ Sent_{i,j} / VC    ∀i, j
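For illustration, the customer-allocation model above can be written with the PuLP library in Python as sketched below; this is our own rendering (the authors' implementation uses Comet), and stored, demand, travel, available, VC, Wx, Wy are assumed scenario inputs.

import pulp

def customer_allocation(n, stored, demand, travel, available, VC, Wx, Wy):
    # Sketch of the customer-allocation MIP (Figure 7) for one scenario.
    sites = range(n)
    prob = pulp.LpProblem("customer_allocation", pulp.LpMinimize)
    sent = pulp.LpVariable.dicts("Sent", (sites, sites), lowBound=0)
    trips = pulp.LpVariable.dicts("Trips", (sites, sites), lowBound=0,
                                  cat=pulp.LpInteger)
    # Weighted unserved demand plus approximate travel time of the trips.
    prob += (Wx * pulp.lpSum(demand[i] - pulp.lpSum(sent[j][i] for j in sites)
                             for i in sites)
             + Wy * pulp.lpSum(travel[i][j] * trips[i][j]
                               for i in sites for j in sites))
    for i in sites:
        prob += pulp.lpSum(sent[i][j] for j in sites) <= stored[i]
        prob += pulp.lpSum(sent[j][i] for j in sites) <= demand[i]
        for j in sites:
            prob += trips[i][j] >= sent[i][j] * (1.0 / VC)
            if i not in available:                 # unavailable repository
                prob += sent[i][j] == 0
    prob.solve()
    return {(i, j): sent[i][j].value() for i in sites for j in sites}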
scales much better than the pure CP formulation described earlier. Indeed, if the customer demands d_1, ..., d_c are uniformly distributed in the range 0..VC, the number of sets satisfying the vehicle capacity is smaller than c^3 when c is not too large (e.g., c ≤ 50). This observation inspires the following formulation:
1. Use CP to enumerate all customer sets satisfying the capacity constraint.
2. Use CP to compute an optimal trip for those customer sets.
3. Use MIP to find a partition of customers with minimal delivery time.
This hybrid model is more complex but each subproblem is small and it scales
much better than the pure CP model.
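The first two steps can be sketched as follows in Python (our own illustration; the set-partitioning MIP of step 3 is omitted, and for a path-based variant the last edge back to the repository would simply be dropped):

from itertools import combinations, permutations

def feasible_trips(w, customers, demand, VC, travel):
    # Enumerate capacity-feasible customer sets and, for each, the cost of an
    # optimal trip starting and ending at repository w (brute force, which is
    # acceptable for the small sets discussed above).
    trips = {}
    for r in range(1, len(customers) + 1):
        for subset in combinations(customers, r):
            if sum(demand[c] for c in subset) <= VC:
                trips[subset] = min(
                    sum(travel[a][b] for a, b in zip((w,) + perm, perm + (w,)))
                    for perm in permutations(subset))
    return trips    # step 3 partitions the customers using these trip costs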
Aggregate Fleet Routing. The most computationally intense phase is the fleet routing, and we now investigate how to initialize the LNS search with a high-quality solution. Recall that the fleet routing problem associates a node with every trip. Given a scenario s, a lower bound on the number of trips is

  Σ_{i∈1..n} ⌈StoredSent_{s,i} / VC⌉
Multi-Stage-SCAP(G)
1 D ← StochasticStorageMIP(G)
2 for s ∈ 1..a
3   do C ← CustomerAllocationProblem(G_s, D_s)
4      for w ∈ 1..n
5        do T ← RepositoryPathRoutingProblem(G_s, C_w)
6      I ← AggregateFleetRouting(G_s, T)
7      S_s ← TripBasedFleetRouting(G_s, T, I)
8 return S
Clearly, the size and complexity of this problem grows with the amount of com-
modities moved. To find high-quality solutions to the fleet routing subtask, the
idea is to aggregate the trips to remove this dependence on the amount of com-
modities delivered. More precisely, we define an aggregate fleet routing model in
which all trips at a repository are replaced by an aggregate trip whose service
time is the sum of all the trip service times. The number of nodes in the aggre-
gate problem is now proportional to the number of repositories. Finding a good
initial solution is not important for smaller problems (e.g., n ≈ 25, m ≈ 4), but it
becomes critical for larger instances (e.g., n ≈ 100, m ≈ 20). Since the aggregate
problem is much simpler, it often reaches high-quality solutions quickly.
The Final Algorithm. The final algorithm for solving a SCAP instance G is
presented in Figure 9.
The Algorithm Implementation and the Baseline Algorithm. The final algorithm
was implemented in the Comet system [14] and the experiments were run on
Intel Xeon CPU 2.80GHz machines running 64-bit Linux Debian.
Table 2. Runtime results (in seconds)
Benchmark  μ(T1)  σ(T1)  μ(T∞)  σ(T∞)   μ(STO)  σ(STO)  μ(CA)  μ(RR)   μ(AFR)  μ(FR)
BM1        196.3  18.40   78.82   9.829  0.9895  0.5023  11.78  0.2328   23.07   30.00
BM2        316.9  59.00  120.2   20.97   0.5780  0.2725  16.83  0.2343   28.33   60.00
BM3        178.4  15.89  102.1   15.02   0.3419  0.1714   7.192 0.1317   11.98   40.00
BM4        439.8  48.16  169.0   22.60   0.9093  0.4262  22.71  0.2480   33.28   90.00
BM5        3179   234.8  1271   114.5   46.71   25.05    91.06  1.0328  351.7   600.0
To validate
our results, we compare our delivery schedules with those of an agent-based al-
our results, we compare our delivery schedules with those of an agent-based al-
gorithm. The agent-based algorithm uses the storage model but builds a routing
solution without any optimization. Each vehicle works independently to deliver
as much commodity as possible using the following heuristic:
Greedy-Truck-Agent()
1 while ∃ commodity to be picked up ∧ demands to be met
2 do if I have some commodity
3 then drop it off at the nearest demand location
4 else pick up some water from the nearest warehouse
5 goto final destination
Efficiency Results. Table 2 depicts the runtime results. In particular, the table
reports, on average, the total time in seconds for all scenarios (T1 ), the total
time when the scenarios are run in parallel (T∞ ), the time for the storage model
(STO), the client-allocation model (CA), the repository routing (RR), the ag-
gregate fleet routing (AFR), and fleet routing (FR). The first three fields (T1,
T∞, STO) are averaged over ten identical runs on each of the budget param-
eters. The last four fields (CA, RR, AFR, FR) are averaged over ten identical
runs on each of the budget parameters and each scenario. Since these are aver-
ages, the times of the individual components do not sum to the total time. The
results show that the approach scales well with the size of the problems and is
a practical approach for solving SCAPs.
Table 4. Correlations for the Distances in Customer Allocation and Fleet Routing
Quality of the Results. Table 3 depicts the improvement of our SCAP algorithm
over the baseline algorithm. Observe the significant and uniform benefits of our
approach which systematically delivers about a 50% reduction in delivery time.
Table 4 describes the correlations between the distances in the customer allo-
cation and fleet routing models. The results show strong correlations, indicating
that the distances in the customer allocation model are a good approximation
of the actual distances in the fleet routing model. Table 5 also reports results
on the absolute and relative differences between vehicles in the solutions. They
indicate that the load is nicely balanced between the vehicles. More precisely,
the maximum delivery times are often within 10% of each other on average, giv-
ing strong evidence of the quality of our solutions. Benchmark 5 is an exception
because it models emergency response at a state level, not at a city level. In that
benchmark, some vehicles have a significantly reduced load because they would
have to travel to the other side of the state to acquire more load, which would
take too much time to help reduce the maximum delivery time objective.
[Figures: expected delivery time and expected percentage of demand met as a function of the budget, comparing the Greedy baseline with PFR (with standard deviations), and solution time as a function of the budget for AFR, TFR, and PFR.]
7 Conclusion
This paper studied a novel problem in the field of humanitarian logistics, the
Single Commodity Allocation Problem (SCAP). The SCAP models the strate-
gic planning process for disaster recovery with stochastic last mile distribution.
The paper proposed a multi-stage stochastic hybrid optimization algorithm that
yields high quality solutions to real-world benchmarks provided by Los Alamos
National Laboratory. The algorithm uses a variety of technologies, including
MIP, constraint programming, and large neighborhood search, to exploit the
structure of each individual optimization subproblem. The experimental results
on water allocation benchmarks indicate that the algorithm is practical from a
computational standpoint and produces significant improvements over existing
relief delivery procedures. This work is currently deployed at LANL as part of
its mission to aid federal organizations in planning and responding to disasters.
References
1. Albareda-Sambola, M., Diaz, J.A., Fernandez, E.: A compact model and tight
bounds for a combined location-routing problem. Computers & Operations Re-
search 32, 407–428 (2005)
2. Balcik, B., Beamon, B., Smilowitz, K.: Last mile distribution in humanitarian relief.
Journal of Intelligent Transportation Systems 12(2), 51–63 (2008)
3. Barbarosoglu, G., Özdamar, L., Çevik, A.: An interactive approach for hierarchical
analysis of helicopter logistics in disaster relief operations. European Journal of
Operational Research 140(1), 118–133 (2002)
4. Beamon, B.: Humanitarian relief chains: Issues and challenges. In: 34th Interna-
tional Conference on Computers & Industrial Engineering, pp. 77–82 (2008)
5. Bianchi, L., Dorigo, M., Gambardella, L., Gutjahr, W.: A survey on metaheuristics
for stochastic combinatorial optimization. Natural Computing 8(2) (2009)
6. Campbell, A.M., Vandenbussche, D., Hermann, W.: Routing for relief efforts.
Transportation Science 42(2), 127–145 (2008)
7. Burke, L.I., Tuzun, D.: A two-phase tabu search approach to the location routing
problem. European Journal of Operational Research 116, 87–99 (1999)
8. Duran, S., Gutierrez, M., Keskinocak, P.: Pre-positioning of emergency items
worldwide for CARE International. Interfaces (2008) (submitted)
9. Fritz Institute (2008), http://www.fritzinstitute.org
10. United States Government: The federal response to Hurricane Katrina: Lessons
learned (2006)
11. Griffin, P., Scherrer, C., Swann, J.: Optimization of community health center loca-
tions and service offerings with statistical need estimation. IIE Transactions (2008)
12. Gunnec, D., Salman, F.: A two-stage multi-criteria stochastic programming model
for location of emergency response and distribution centers. In: INOC (2007)
13. Kall, P., Wallace, S.W.: Stochastic Programming. Wiley Interscience Series in Sys-
tems and Optimization. John Wiley & Sons, Chichester (1995)
14. Comet 2.1 User Manual. Dynadec website, http://dynadec.com/
15. Nagy, G., Salhi, S.: Nested heuristic methods for the location-routing problem.
Journal of the Operational Research Society 47, 1166–1174 (1996)
16. Shaw, P.: Using constraint programming and local search methods to solve vehicle
routing problems. In: Maher, M.J., Puget, J.-F. (eds.) CP 1998. LNCS, vol. 1520,
pp. 417–431. Springer, Heidelberg (1998)
17. Toth, P., Vigo, D.: The Vehicle Routing Problem. SIAM Monographs on Discrete
Mathematics and Applications, Philadelphia, Pennsylvania (2001)
18. Van Wassenhove, L.: Humanitarian aid logistics: supply chain management in high
gear. Journal of the Operational Research Society 57(1), 475–489 (2006)
Massively Parallel Constraint Programming for
Supercomputers: Challenges and Initial Results
1 Introduction
In this paper we present initial results for implementing a constraint program-
ming solver on a massively parallel supercomputer where coordination between
processing elements is achieved through message passing. Previous work on mes-
sage passing based constraint programming has been targeted towards clusters
of computers (see [1,2] for some examples). Our target hardware platform is the
IBM Blue Gene supercomputer. Blue Gene is designed to use a large number
of relatively slow (800MHz) processors in order to achieve lower power con-
sumption, compared to other supercomputing platforms. Blue Gene/P, the sec-
ond generation of Blue Gene, can run continuously at 1 PFLOPS and can be
scaled to 884,736 processors to achieve 3 PFLOPS performance. We present a
dynamic scheme for allocating sub-problems to processors in a parallel, limited
discrepancy tree search [3]. We evaluate this parallelization scheme on resource
constrained project scheduling problems from PSPLIB [4].
We developed a simple dynamic load balancing scheme for distributed hardware en-
vironments based on message passing. The basic idea behind the approach is that
(a) the processors are divided into master and worker processes; (b) each worker
processor is assigned sub-trees to explore by a master; and (c) the master proces-
sors are responsible for coordinating the sub-trees assigned to worker processors.
A master process has a global view of the full search tree. It keeps track of which
sub-trees have been explored and which are to be explored. Typically a single mas-
ter processor is coordinating many worker processors. Each worker processor im-
plements a tree-based search. The master processor assigns sub-problems to each
worker, where each sub-problem is specified by a set of constraints. These con-
straints are communicated in a serialized form using message-passing. A worker
may receive a new sub-problem from its assigned master processor either at the be-
ginning of problem solving, or during problem solving after exhausting tree search
on its previously assigned sub-problem. On receiving a message from its master
specifying a sub-problem as a set of constraints, a worker processor will establish
an initial state to start tree search by creating and posting constraints to its con-
straint store based on this message.
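The master/worker exchange described above can be illustrated with an mpi4py skeleton. The message tags, the job representation (a picklable list of constraint descriptions), and the stub functions next_subproblem and solve_subtree are all hypothetical; job expansion and the multiple-master variant are omitted.

# Illustrative master/worker message-passing skeleton (mpi4py); not the authors' Blue Gene code.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
REQUEST, JOB, STOP = 0, 1, 2             # hypothetical message-tag convention

def next_subproblem():
    """Master side: pop an unexplored sub-tree from the job tree (stub)."""
    return None                           # None stands for 'no work left'

def solve_subtree(constraints):
    """Worker side: post the received constraints and run tree search (stub)."""
    return None

if rank == 0:                             # the single master process
    active = size - 1
    while active > 0:
        status = MPI.Status()
        comm.recv(source=MPI.ANY_SOURCE, tag=REQUEST, status=status)
        job = next_subproblem()
        if job is None:
            comm.send(None, dest=status.Get_source(), tag=STOP)
            active -= 1
        else:
            comm.send(job, dest=status.Get_source(), tag=JOB)
else:                                     # worker processes
    while True:
        comm.send(rank, dest=0, tag=REQUEST)          # ask the master for work
        status = MPI.Status()
        job = comm.recv(source=0, status=status)
        if status.Get_tag() == STOP:
            break
        solve_subtree(job)                # exhaust the assigned sub-problem, then loop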
The first phase of work generation involves creating the initial job tree for each of the master processors. The master
processor creates its initial job tree by exploring some part of the search space
of the problem, up to some (small) bound on the number of nodes to explore.
If during this initial search a solution is found, the master can terminate. Oth-
erwise, the master initializes its job tree from the search tree explored during
this phase. The master processor then enters into a job dispatching loop where
it responds to requests for job assignments from the worker processors.
The second phase of work generation occurs as workers themselves explore the
search space of their assigned sub-problems and detect that they are exploring
a large search tree which requires further parallelization. Job expansion is a
mechanism for a worker to release free jobs if it detects that it is working on a
large subtree. We use a simple scheme based on a threshold of searched nodes as
a rough indicator of the “largeness” of the job subtree. If the number of nodes
searched by a worker exceeds this threshold without exhausting the subtree or
finding a solution, the worker will send a job expansion request to its master and
pick a smaller part of the job to keep working on. Meanwhile, the master updates
the job tree using the information offered by the worker, eventually dispatching
the remaining parts of the original search tree to other worker processors.
Job expansion has two side effects. First, it introduces communication over-
head because the job expansion information needs to be sent from the worker
processor to the master processor. Secondly, the size of the job tree may become
large, slowing down the search for unexplored nodes in response to worker job
requests. The job tree can be pruned when all siblings of some explored node n
are explored. In this case, the parent of node n can be marked as explored and
the siblings can be removed from the job tree.
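A minimal sketch of this pruning rule on a job-tree data structure is given below; the JobNode representation is an illustrative stand-in for the authors' job tree, and only the part relevant to the rule is shown.

# Sketch of the job-tree pruning rule: once every sibling of an explored node is
# explored, the parent is marked explored and the children are removed.
class JobNode:
    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        self.explored = False

def mark_explored(node):
    node.explored = True
    parent = node.parent
    # Collapse upwards as long as a whole sibling group has been explored.
    while parent is not None and parent.children and all(c.explored for c in parent.children):
        parent.children = []          # drop the siblings from the job tree
        parent.explored = True        # the parent is now treated as explored
        parent = parent.parent

root = JobNode()
root.children = [JobNode(root) for _ in range(3)]
for child in list(root.children):
    mark_explored(child)
print(root.explored, len(root.children))   # True 0: the explored subtree collapsed into the root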
Figure 1 reports scaling results for up to 1024 processors (on the 120-activity RCPSP instance 1-2 from PSPLIB). We manage to
achieve good linear scaling up to 256 processors. However the single master
process becomes a bottleneck when we have more than 256 worker processors,
where we see overall execution time actually slow down as we increase the number
of processors beyond 256.
Fig. 1. Scaling with one master process (left) and multiple (right) master processes
Table 1. Execution time (in seconds) for solving fixed makespan satisfaction PSPLIB
resource-constrained project scheduling problems with 60 and 90 activities
Problem (makespan)  Size                        CPU time
                          p=16    p=32    p=64    p=128    p=256    p=512
14-4 (65) 60 30 14 7.0 3.1 2.1 2.0
26-3 (76) 60 >600 >600 90 75 24 10
26-6 (74) 60 63 18 8.1 5.0 2.0 1.0
30-10 (86) 60 >600 >600 >600 >600 216 88
42-3 (78) 60 >600 >600 >600 >600 256 81
46-3 (79) 60 148 27 13 6.0 3.1 2.0
46-4 (74) 60 >600 >600 >600 >600 104 77
46-6 (90) 60 >600 >600 477 419 275 122
14-6 (76) 90 >600 371 218 142 48 25
26-2 (85) 90 294 142 86 35 16 9.0
22-3 (83) 90 50 24 12 5 3.0 0.07
References
1. Michel, L., See, A., Van Hentenryck, P.: Transparent parallelization of constraint
programming. INFORMS Journal on Computing (2009)
2. Duan, L., Gabrielsson, S., Beck, J.: Solving combinatorial problems with parallel
cooperative solvers. In: Ninth International Workshop on Distributed Constraint
Reasoning (2007)
3. Harvey, W.D., Ginsberg, M.L.: Limited discrepancy search. In: 14th International
Joint Conference on Artificial Intelligence (1995)
4. Kolisch, R., Sprecher, A.: PSPLIB - a project scheduling library. European Journal
of Operational Research 96, 205–216 (1996)
5. Bordeaux, L., Hamadi, Y., Samulowitz, H.: Experiments with massively parallel
constraint solving. In: Twenty-first International Joint Conference on Artificial In-
telligence, IJCAI 2009 (2009)
6. Blumofe, R.D., Joerg, C.F., Kuszmaul, B.C., Leiserson, C.E., Randall, K.H., Zhou,
Y.: Cilk: An efficient multithreaded runtime system. In: Proceedings of the Fifth
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
(PPoPP), pp. 207–216 (1995)
7. Michel, L., See, A., Van Hentenryck, P.: Parallelizing constraint programs trans-
parently. In: Bessière, C. (ed.) CP 2007. LNCS, vol. 4741, pp. 514–528. Springer,
Heidelberg (2007)
8. Baptiste, P., Pape, C.L., Nuijten, W.: Constraint-Based Scheduling - Applying Con-
straint Programming to Scheduling Problems. International Series in Operations
Research and Management Science. Springer, Heidelberg (2001)
Boosting Set Constraint Propagation
for Network Design
1 Introduction
This paper reconsiders the deployment of synchronous optical networks
(SONET), an optimization problem originally studied in the operations research
community [1]. The SONET problem is defined in terms of a set of clients and
a set of communication demands between pairs of clients who communicate
through optical rings. The task is to allocate clients on (possibly multiple) rings,
satisfying the bandwidth constraints on the rings and minimizing the equipment
cost. This problem has been tackled previously using mixed integer programming
(MIP) [1] and constraint programming (CP) [2,3]. Much attention was devoted to
variable branching heuristics and breaking ring symmetries (since all rings are
identical). It was shown that sophisticated symmetry-breaking techniques dra-
matically reduce the computational times, both for MIP and CP formulations.
The difficulty of finding good branching heuristics, which do not clash with sym-
metry breaking, was also mentioned.
This paper takes another look at the problem and studies the possibility that
the thrashing behavior experienced in earlier attempts is primarily due to lack
of pruning. The key observation is that existing models mainly consist of bi-
nary constraints and lack a global perspective. Instead of focusing on symmetry
breaking and branching heuristics, we study how to strengthen constraint prop-
agation by adding redundant global set-constraints. We propose two classes of
                                                     sb-domain             sbc-domain
NonEmptyIntersection       |X ∩ Y| ≥ 1               Polynomial            Polynomial (Thm. 2)
AllNonEmptyIntersection    ∀i, |X ∩ Yi| ≥ 1          Polynomial (Thm. 3)   NP-hard (Thm. 5)
SubsetOfUnion              ∪i Yi ⊇ X                 Polynomial (Thm. 6)   ?
SubsetOfOpenUnion          ∪i∈Y Xi ⊇ s               Polynomial (Thm. 8)   NP-hard (Thm. 9)
The Basic CP Model. The core CP model [6,2] includes three types of variables:
Set variable Xi represents the set of nodes assigned to ring i, set variable Yu
represents the set of rings assigned to node u, and integer variable Zi,e represents
the amount of bandwidth assigned to demand pair e on ring i. The model is
minimize Σ_{i∈R} |Xi| subject to
  |Yu ∩ Yv| ≥ 1                          ∀(u, v) ∈ E             (1)
  Z_{i,(u,v)} > 0 ⇒ i ∈ (Yu ∩ Yv)        ∀i ∈ R, (u, v) ∈ E      (2)
  Σ_{i∈R} Z_{i,e} = d(e)                 ∀e ∈ E                  (3)
  u ∈ Xi ⇔ i ∈ Yu                        ∀i ∈ R, u ∈ N           (4)
  |Xi| ≤ a                               ∀i ∈ R                  (5)
  Σ_{e∈E} Z_{i,e} ≤ c                    ∀i ∈ R                  (6)
  Xi ≼ Xj                                ∀i, j ∈ R : i < j       (7)
(1) ensures that the nodes of every demand pair lie on at least one common ring. (2)
ensures that there is a flow for a demand pair on a particular ring i only if both
clients are on that ring. (3) guarantees that every demand is satisfied. (4) channels
between the first two types of variables. (5) makes sure that there are at most a
ADMs on each ring. (6) makes sure that the total traffic flow on each ring does
not exceed the bandwidth capacity. (7) is a symmetry-breaking constraint that
removes symmetric solutions caused by interchangeability of rings.
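To make the model concrete, the sketch below checks a candidate assignment of the X, Y, and Z variables against constraints (1)-(6) on a tiny invented instance; the symmetry-breaking constraint (7) is omitted, and none of the data corresponds to the paper's benchmarks.

# Illustrative checker for constraints (1)-(6) of the SONET model on made-up data.
def check_sonet(X, Y, Z, demands, a, c):
    ok = True
    for (u, v), d in demands.items():
        ok &= len(Y[u] & Y[v]) >= 1                          # (1) a shared ring per demand pair
        ok &= sum(Z[i][(u, v)] for i in X) == d              # (3) the demand is fully routed
        for i in X:
            if Z[i][(u, v)] > 0:                             # (2) flow only if both clients are on ring i
                ok &= i in Y[u] and i in Y[v]
    for i in X:
        ok &= all((u in X[i]) == (i in Y[u]) for u in Y)     # (4) channeling between X and Y
        ok &= len(X[i]) <= a                                 # (5) at most a ADMs per ring
        ok &= sum(Z[i].values()) <= c                        # (6) ring bandwidth capacity
    return ok

demands = {("u", "v"): 3}
X = {1: {"u", "v"}, 2: set()}                                # nodes assigned to each ring
Y = {"u": {1}, "v": {1}}                                     # rings assigned to each node
Z = {1: {("u", "v"): 3}, 2: {("u", "v"): 0}}                 # bandwidth per ring and demand pair
print(check_sonet(X, Y, Z, demands, a=4, c=10))              # True for this toy assignment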
  |Xi| ≠ 1                               ∀i ∈ R                  (8)
  |Yu| ≥ |δu| / (a − 1)                  ∀u ∈ N                  (9)
  Yu = {i} ⇒ δu ∪ {u} ⊆ Xi               ∀u ∈ N, i ∈ R           (10)
  Yu = {i, j} ⇒ δu ∪ {u} ⊆ Xi ∪ Xj       ∀u ∈ N, i, j ∈ R        (11)
subsetOfUnion (12) generalizes (8): it forces a node not to lie on rings with
no contribution. subsetOfOpenUnion (13) generalizes (9), (10), and (11) and
ensures that the rings of a node accommodate all its neighbors.
We now give the definition of bound consistency for these set domains.
Definition 3 (sbc-bound consistency). A set constraint C(X1 , ..., Xm ) (Xi
are set variables using the sbc-domain) is said to be sbc-bound consistent if and
only if ∀1 ≤ i ≤ m,
Algorithm 1. bc^sbc nonEmptyIntersection(X_sbc = sbc⟨RX, PX, cX, ĉX⟩, Y_sbc)
Require: X_sbc, Y_sbc are both bound consistent
1: PXEY, EXPY ← PX \ (RX ∪ PY), PY \ (RY ∪ PX)
2: PXPY, RXRY ← (PX ∩ PY) \ (RX ∪ RY), RX ∩ RY
3: RXPY, PXRY ← RX ∩ (PY \ RY), (PX \ RX) ∩ RY
4: if |RXRY| > 0 then
5:    return true
6: if |PXPY| + |RXPY| + |PXRY| = 0 then
7:    return ⊥
8: else if |PXPY| + |RXPY| + |PXRY| = 1 then
9:    insert e into X_sbc, Y_sbc (where {e} = PX ∪ PY)
10: else
11:    cX, cY ← ĉX − |RX|, ĉY − |RY|
12:    if cX = 1 ∧ RXPY = ∅ then
13:       exclude PXEY from X_sbc
14:    if cY = 1 ∧ PXRY = ∅ then
15:       exclude EXPY from Y_sbc
16: return true
Theorem 3. bc^sb allNonEmptyIntersect(X, {Y1,..,Yn}) is decomposable.
Proof. (sketch) From reference [5], bc^sb (∀i < j, |Yi ∩ Yj| ≥ 1) is decomposable.
Our constraint is a special case of it which can be transformed to the general
case by adding a dummy element to the possible set of each Yi.
Unfortunately, the result does not hold for the sbc-domain.
Theorem 4. bc^sbc allNonEmptyIntersect(X, {Y1,..,Yn}) is strictly stronger
than enforcing BC on its decomposition (i.e., ∀1 ≤ i ≤ n, bc^sbc |X ∩ Yi| ≥ 1).
Proof. Consider allNonEmptyIntersect(X, {Y1, Y2, Y3}) with X ∈
sbc⟨∅, {1..6}, 2, 2⟩, Y1 ∈ sbc⟨∅, {1, 2}, 1, 1⟩, Y2 ∈ sbc⟨∅, {3, 4}, 1, 1⟩, and
Y3 ∈ sbc⟨∅, {5, 6}, 1, 1⟩. Each constraint in the decomposition is bound
consistent. However, there is no solution since X can only take two
elements and the possible sets of Y1, Y2 and Y3 are disjoint.
Theorem 5. bc^sbc allNonEmptyIntersect(X, {Y1,..,Yn}) is NP-hard.
Proof. Reduction from 3-SAT. Instance: a set of n literals and m clauses over the
literals such that each clause contains exactly 3 literals. Question: is there a
satisfying truth assignment for all clauses?
We construct a set-CSP with three types of variables. The first type corre-
sponds to literals: for each literal, we construct a set variable Xi with domain
sbc⟨∅, {i, ¬i}, 1, 1⟩; the values in the possible set correspond to true and false. The
second type corresponds to clauses: for every clause j (xp ∨ ¬xq ∨ xr), we in-
troduce one set variable Yj with domain sbc⟨∅, {p, −q, r}, 1, 3⟩. The third type
contains just one set variable Z corresponding to the assignment; its domain is
sbc⟨∅, {1, −1, .., n, −n}, n, n⟩. The constraint is of the form
allNonEmptyIntersect(Z, {X1,..,Xn, Y1,..,Ym})
Set variables Xi guarantee that Z is a valid assignment (i.e., for every i, it can
only pick either i or −i, but not both). Yj and Z overlap if and only if at least one
of the literals is satisfied. The constraint has a solution if and only if the 3-SAT
instance is satisfiable. Therefore, enforcing bound consistency is NP-hard.
6 Subset of Union
This section considers constraint (12), which is an instance of
  subsetOfUnion(X, {Y1,..,Ym}) ≡ ∪1≤i≤m Yi ⊇ X                    (23)
Constraint (12) is justified by the following reasoning for a node u and a ring i it
belongs to: If i is not used by any of u’s neighbors, u does not need to use i. As
a result, the rings of node u must be a subset of the rings of its neighbors. We
first propose two simple inference rules to perform deductions on this constraint.
Rule 1 (SubsetOfUnion: Element Not in Union)
  i ∈ PX ∧ ∀1 ≤ j ≤ m, i ∉ PYj
  subsetOfUnion(X, {Y1,..,Ym}) −→ i ∉ X ∧ subsetOfUnion(X, {Y1,..,Ym})
Rule 2 (SubsetOfUnion: Element Must Be in Union)
  i ∈ RX ∧ i ∈ PYk ∧ |{j | i ∈ PYj, 1 ≤ j ≤ m}| = 1
  subsetOfUnion(X, {Y1,..,Ym}) −→ i ∈ Yk ∧ subsetOfUnion(X, {Y1,..,Ym})
The two rules above are sufficient to enforce bound consistency on the sb-domain
but not on the sbc-domain. It is an open issue to determine whether bound consistency
can be enforced in polynomial time on the sbc-domain.
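A small sketch of these two rules, iterated to a fix-point on (required, possible) set bounds, is shown below; representing a domain as a pair of Python sets is an illustrative simplification of the sb-domain, and failure detection is left out.

# Rules 1 and 2 for subsetOfUnion(X, {Y1,...,Ym}) on (required, possible) bounds.
# A dict {'req': set, 'pos': set} is an illustrative stand-in for the sb-domain.
def filter_subset_of_union(X, Ys):
    changed = True
    while changed:
        changed = False
        # Rule 1: a value possible for X but possible in no Y_j cannot belong to X.
        union_pos = set().union(*(Y['pos'] for Y in Ys))
        drop = X['pos'] - X['req'] - union_pos
        if drop:
            X['pos'] -= drop
            changed = True
        # Rule 2: a value required in X and possible in exactly one Y_k must be in Y_k.
        for v in X['req']:
            holders = [Y for Y in Ys if v in Y['pos']]
            if len(holders) == 1 and v not in holders[0]['req']:
                holders[0]['req'].add(v)
                changed = True
    return X, Ys

X = {'req': {1}, 'pos': {1, 2, 9}}
Ys = [{'req': set(), 'pos': {1, 2}}, {'req': set(), 'pos': {2, 3}}]
filter_subset_of_union(X, Ys)
print(X, Ys)   # 9 is pruned from X's possible set; 1 becomes required in the first Y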
Theorem 6. bc^sb subsetOfUnion(X, {Y1,..,Ym}) is equivalent to enforcing
rule 1 and rule 2 until they reach the fix-point.
Proof. Consider an element e ∈ PX . It has a support or otherwise it would be
removed by rule 1. It does not belong to all solutions since, given any feasible
assignment to the constraint that contains e, removing e from X still leaves us
with a feasible solution. Hence e does not belong to the required set. An element
e ∈ PYi always has a support since adding e to any feasible assignment would
not make it invalid. An element e ∈ PYi belongs to all solutions if it must be in
the union and Yi is the only variable that contains e (rule 2).
Theorem 7. bc^sbc subsetOfUnion(X, {Y1,..,Ym}) is strictly stronger than
enforcing rule 1 and rule 2 until they reach the fix-point.
Proof. Consider the domains X ∈ sbc⟨∅, {1,..,6}, 0, 2⟩, Y1 ∈ sbc⟨∅, {1, 2}, 1, 1⟩,
Y2 ∈ sbc⟨∅, {3, 4}, 1, 1⟩ and Y3 ∈ sbc⟨∅, {1,..,5}, 2, 2⟩. Applying the domain re-
duction rules, the domain of X becomes sbc⟨∅, {1,..,5}, 2, 2⟩. The value 5 ∈ PY3
cannot be extended to a solution since X has only two empty spots, one for {1, 2}
and the other for {3, 4}, as Y1 and Y2 are disjoint. The constraint is thus not
bound consistent.
Theorem 8. bc^sb subsetOfOpenUnion(s, Y, {X1,..,Xm}) is equivalent to en-
forcing rule 3 and rule 4 until they reach a fix-point.
Theorem 9. bc^sbc subsetOfOpenUnion(s, Y, {X1,..,Xm}) is NP-hard.
Proof. Reduction from Dominating Set. The problem of dominating set is defined
as follows. Input instance: a graph G = ⟨V, E⟩ and an integer k ≤ |V|. Question:
does there exist a subset V′ of V such that |V′| ≤ k and every node in V \ V′
is a neighbor of some node in V′?
(ĉXi − |RXi|) + |RXi ∩ s| = ĉXi − |RXi \ s| ≥ |Xi ∩ s|
Proof. When Yu is bounded, the scope for the union is fixed. The union con-
straint requires that the union of the sets be a superset of s, hence each
element of s has to be taken at least once. The channeling constraint requires that
each variable Xi contain element u and, as Yu defines the scope, element u has
to be taken exactly |Yu| times. It reduces to a GCC_lb.
9 Experimental Evaluation
The MIP Formulation. The problem was first solved using an MIP formulation[1].
The input was preprocessed before the search and some variables were pre-
assigned. Valid inequalities were added during the search in order to tighten the
model representation. Several variable-ordering heuristics, mainly based on the
neighborhood and demand of nodes, were devised and tested. Several symmetry-
breaking constraints were evaluated too, Table 1 in [10] indicates minuscule dif-
ferences in performance among different symmetry-breaking constraints.
were also investigated. To avoid clashing with variable ordering, SBDS (symme-
try breaking during search) was used. SBDS was very effective on the SONET
problems, although it generated a huge number of no-good constraints, inducing
a significant overhead to the system. Recall also that Smith’s model included a
few simple redundant constraints reasoning on the cardinality of node variables
(Yu ). Please refer to Section 5 in [2] for a detailed discussion.
Another CP model was proposed in [3] and it broke symmetries by adding a
lexicographic bound to set variable domain. With the additional lexicographic
component, the solver obtained a tighter approximation of the set-variable do-
main. The lexicographical information was used not only for breaking symme-
tries, but also for cardinality reasoning. This method provided a much simpler
mechanism to remove symmetries. However, as mentioned by the authors, differ-
ent components of the set domain (the membership component, the cardinality
restriction, and the lexicographical bound) did not interact effectively.
Our Search Procedure. Our CP algorithm Boosting implements all the con-
straints presented in this paper and uses a static four-stage search inspired by
Smith’s heuristics [2]. The algorithm first branches on the objective value, start-
ing from the minimum value and increasing the value by one at a time from the
infeasible region. The first feasible solution is thus optimal. Then it decides the
cardinality of Yu . Third, it decides the value of Yu . Last, the algorithm decides
the flow assigned to each pair of nodes on a ring. Proposition 2 in [1] shows
that there is an integral solution as long as all the demands are integral and the
algorithm only needs to branch on integers. In each stage, variables are labeled
in the order given by the instance.
Benchmarks and Implementations. The benchmarks include all the large ca-
pacitated instances from [1]. Small and medium instances take negligible time
and are omitted. Our algorithm was evaluated on an Intel Core 2 Duo 2.4GHz
laptop with 4GB of memory. The MIP model [1] used CPLEX on a Sun Ultra
10 Workstation. Smith's algorithm [2] used ILOG Solver on one 1.7GHz proces-
sor. Hybrid [3] was run using the ECLiPSe constraint solver on a Pentium 4 2GHz
processor, with a timeout of 3000 seconds.
Comparison of the Approaches. Table 2 reports the CPU time and number of
backtracks (bt) required for each approach to prove the optimality of each in-
stance. Our Boosting algorithm is, on average, more than 3700 times faster than
the MIP and Hybrid approaches and visits several orders of magnitude fewer nodes
than them. Boosting is more than 16 times faster than the SBDS approach when
the machines are scaled and produces significantly higher speedups on the most
difficult instances (e.g., instance 9). The SBDS method performs fewer back-
tracks in 6 out of 15 instances, because it eliminates symmetric subtrees earlier
than our static symmetry-breaking constraint. However, even when the CPU
speed is scaled, none of the 15 instances is solved by SBDS faster than Boosting.
This is explained by the huge number of symmetry-breaking constraints added
during search. The empirical results confirm the strength of the light-weight
and effective propagation algorithms proposed in this paper. While earlier at-
tempts focused on branching heuristics and sophisticated symmetry-breaking
techniques, the results demonstrate that effective filtering algorithms are key
to obtaining strong performance on this problem. The remaining experimental
results give empirical evidence justifying this observation.
The Impact of Branching Heuristics. We now study the impact of the branch-
ing heuristics and evaluate various variable orderings for the static labeling
procedure of Boosting. Various variable orderings were studied in [1,2]. Most
of them are based on the node demands and degrees. Our experiments con-
sidered four different heuristics: minimum-degree-first, maximum-degree-first,
minimum-demand-first, and maximum-demand-first. To avoid a clash between
the variable heuristics and the symmetry-breaking constraint, the lexicographic
constraint uses the same static order as the branching heuristic. Table 3 (Left)
reports the average number of backtracks and time to solve all 15 instances,
where row Given is the node ordering from the instance data. The results show
that, with the exception of the max-demand heuristic, all variable orderings pro-
duce very similar numbers of backtracks and similar runtime performance. Moreover, the
max-demand heuristic is still orders of magnitude faster than earlier attempts.
This indicates that the variable ordering is not particularly significant when
stronger filtering algorithms are available.
[Table: average number of backtracks and average time (in seconds) for combinations of the redundant constraints NonEmptyIntersection (|X ∩ Y| ≥ 1), SubsetOfUnion (∪i Yi ⊇ X), and SubsetOfOpenUnion (∪i∈Y Xi ⊇ s); the recorded (avg bt, avg time) values are (1167.13, 0.23), (2018.67, 0.33), (1190.13, 0.23), (1556.67, 0.36), (2177.93, 0.33), (13073.73, 2.19), (1670.47, 0.37), and (17770.93, 2.82).]
cardinalities, while the 01-lex orders sets based on their characteristic vectors.
Table 3 (Right) reports the results of the Boosting algorithm with both types of
symmetry-breaking constraints. The difference is still negligible when compared
with the benefits of global constraints, although the 01-lexicographic order seems
more effective on these benchmarks on average.
10 Conclusion
This paper reconsiders the SONET problem. While earlier attempts focused on
symmetry breaking and the design of effective search strategies, this paper took
an orthogonal view and aimed at boosting constraint propagation by studying a
variety of global constraints arising in the SONET application. From a modeling
standpoint, the main contribution was to isolate two classes of redundant con-
straints that provide a global view to the solver. From a technical standpoint, the
scientific contributions included novel hardness proofs, propagation algorithms,
and filtering rules. The technical contributions were also evaluated on a simple
and static model that performs a few orders of magnitude faster than earlier
attempts. Experimental results also demonstrated the minor impact of variable
orderings and symmetry-breaking techniques, once advanced constraint propa-
gation is used. More generally, these results indicate the significant benefits of
constraint programming for this application and the value of developing effective
constraint propagation over sets.
References
1. Sherali, H.D., Smith, J.C., Lee, Y.: Enhanced model representations for an intra-
ring synchronous optical network design problem allowing demand splitting. IN-
FORMS Journal on Computing 12(4), 284–298 (2000)
2. Smith, B.M.: Symmetry and search in a network design problem. In: Barták, R.,
Milano, M. (eds.) CPAIOR 2005. LNCS, vol. 3524, pp. 336–350. Springer, Heidel-
berg (2005)
3. Sadler, A., Gervet, C.: Enhancing set constraint solvers with lexicographic bounds.
J. Heuristics 14(1), 23–67 (2008)
4. Sadler, A., Gervet, C.: Global reasoning on sets. In: FORMUL, CP 2001 (2001)
5. Bessière, C., Hebrard, E., Hnich, B., Walsh, T.: Disjoint, partition and intersection
constraints for set and multiset variables. In: Wallace, M. (ed.) CP 2004. LNCS,
vol. 3258, pp. 138–152. Springer, Heidelberg (2004)
6. Sadler, A., Gervet, C.: Hybrid set domains to strengthen constraint propagation
and reduce symmetries. In: Wallace, M. (ed.) CP 2004. LNCS, vol. 3258, pp. 604–
618. Springer, Heidelberg (2004)
7. Van Hentenryck, P., Yip, J., Gervet, C., Dooms, G.: Bound consistency for binary
length-lex set constraints. In: AAAI 2008, pp. 375–380 (2008)
8. van Hoeve, W.J., Sabharwal, A.: Filtering atmost1 on pairs of set variables. In:
CPAIOR 2005, pp. 382–386 (2005)
9. van Hoeve, W.J., Régin, J.C.: Open constraints in a closed world. In: Beck, J.C.,
Smith, B.M. (eds.) CPAIOR 2006. LNCS, vol. 3990, pp. 244–257. Springer, Hei-
delberg (2006)
10. Sherali, H.D., Smith, J.C.: Improving discrete model representations via symmetry
considerations. Manage. Sci. 47(10), 1396–1407 (2001)
11. Gervet, C., Van Hentenryck, P.: Length-lex ordering for set CSPs. In: AAAI 2006
(2006)
More Robust Counting-Based Search Heuristics
with Alldifferent Constraints
1 Introduction
AI problem solving relies on effective inference and search. This is true in partic-
ular for Constraint Programming where, after many years of advances on infer-
ence, there has been a more recent focus on search heuristics. The kind of search
heuristics considered in this paper rely on counting the solutions to individ-
ual substructures of the problem [13]. Given a constraint γ defined on the set of
variables {x1, . . . , xk} and respective finite domains Di, 1 ≤ i ≤ k, let #γ(x1, . . . , xk)
denote the number of solutions of constraint γ. Given a variable xi in the scope
of γ, and a value d ∈ Di, we call
  σ(xi, d, γ) = #γ(x1, . . . , xi−1, d, xi+1, . . . , xk) / #γ(x1, . . . , xk)
the solution density¹ of pair (xi, d) in γ. It measures how often a certain assign-
ment is part of a solution of the constraint γ. One simple — yet very effective —
solution counting-based heuristic is maxSD which, after collecting the solution
densities from the problem constraints, branches on the variable-value pair with
the highest solution density [13].
¹ Also referred to as marginal in some of the literature.
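The branching rule itself is a one-liner once the densities have been collected; a minimal sketch with a made-up density table is shown below.

# maxSD in a nutshell: branch on the (variable, value) pair of highest solution density.
# The density table and variable names are made up.
def max_sd(densities, unbound):
    candidates = {k: d for k, d in densities.items() if k[0] in unbound}
    return max(candidates, key=candidates.get)

densities = {("x1", 1): 0.20, ("x1", 2): 0.80, ("x2", 3): 0.55}
print(max_sd(densities, unbound={"x1", "x2"}))   # ('x1', 2)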
For some constraints, computing solution densities can be done efficiently and
even, in some cases, at (asymptotically) no extra cost given the filtering algo-
rithm already implemented in the constraint. For the alldifferent constraint,
computing the number of solutions is equivalent to the problem of computing
the permanent of the related (0-1) adjacency matrix A that is built such that aij
is equal to 1 iff j ∈ Di. The permanent of an n × n matrix A is formally defined as
  per(A) = Σ_{σ∈Sn} ∏_{i} a_{i,σ(i)}                              (1)
where Sn denotes the symmetric group, i.e. the set of n! permutations of [n].
Given a specific permutation, the product is equal to 1 if and only if all the ele-
ments are equal to 1 i.e. the permutation is a valid assignment for the
alldifferent constraint. Hence, the sum over all the permutations gives us
the total number of alldifferent solutions.
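To make the correspondence in formula (1) explicit, the brute-force sketch below counts the solutions of a tiny alldifferent by summing over permutations; it is only usable for very small n and is purely illustrative.

# Brute-force permanent of the 0-1 domain matrix = number of alldifferent solutions.
from itertools import permutations

def permanent(domains, values):
    """domains: one set per variable; values: the n candidate values (square case)."""
    n = len(domains)
    total = 0
    for sigma in permutations(range(n)):                 # sigma maps variable i to value index sigma[i]
        if all(values[sigma[i]] in domains[i] for i in range(n)):
            total += 1                                   # the product of matrix entries is 1
    return total

values = [1, 2, 3]
domains = [{1, 2}, {2, 3}, {1, 3}]                       # x1 in {1,2}, x2 in {2,3}, x3 in {1,3}
print(permanent(domains, values))                        # 2 solutions: (1,2,3) and (2,3,1)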
The problem of computing the permanent has been studied for the last two
centuries and it is still a challenging problem to address. Even though the an-
alytic formulation of the permanent resembles that of the determinant, there
have been few advances on its exact computation. In 1979, Valiant [12] proved
that the problem is #P -complete, even for 0-1 matrices, that is, under reason-
able assumptions, it cannot be computed in polynomial time in the general case.
The focus then moved to approximating the permanent. A sampling approach
proposed by Rasmussen was improved in [13] by adding propagation. Although
providing a very good approximation, it is time consuming and suitable mainly
for hard instances where the accuracy of the heuristic can balance the time spent
in computing the solution densities.
In this paper we explore a different approach, trading some of the accuracy for
a significant speedup in the counting procedure, in order to provide an algorithm
that performs well on easy instances while keeping the lead in solving hard
ones. A portfolio of heuristics could have been an alternative, first trying a
computationally cheap heuristic to take care of easy instances and switching to
our counting-based heuristic after a certain time limit. But as we shall see, our
proposal not only improves the performance on easy instances but also on hard
ones.
In the rest of this paper, Section 2 presents some known upper bounds for
the permanent and their integration in solution counting-based heuristics for
the alldifferent constraint. Section 3 evaluates our proposal on benchmark
problems. Final comments are given in Section 4.
(with ri = Σ_{j=1..n} aij, the sum of row i of A). Note that the permanent is defined
on square matrices, i.e., the related bipartite graph needs to have |V1| = |V2|. In
order to overcome this limitation, we can augment the graph by adding |V2| − |V1|
fake vertices to V1 (without loss of generality |V1| ≤ |V2|), each one connected to
all vertices in V2. The effect on the number of maximum matchings is stated in
the following theorem.
Theorem 1. Let G(V1 ∪ V2, E) be a bipartite graph with |V1| ≤ |V2| and the
related augmented graph G′(V′1 ∪ V2, E′) a graph such that V′1 = V1 ∪ V_fake with
|V_fake| = |V2| − |V1| and the edge set E′ = E ∪ E_fake with E_fake = {(vi, vj) | vi ∈
V_fake, vj ∈ V2}. Let |M_G| and |M_G′| be the number of maximum matchings
respectively in G and G′. Then |M_G| = |M_G′| / |V_fake|!.
Proof. Given a maximum matching m ∈ M_G of size |V1|, since m covers all the
vertices in V1, there exist exactly |V2| − |V1| vertices in V2 not matched. In
the corresponding matching (possibly not maximum) m′ = m in G′, the vertices
in V2 that are not matched can be matched with any of the vertices in V_fake.
Since each vertex in V_fake is connected to every vertex in V2, there exist
exactly |V_fake|! permutations to obtain a perfect matching in G′ starting from a
maximum matching m in G. If there is no maximum matching of size |V1| for
G then clearly there is none of size |V2| for G′ either.
For simplicity in the rest of the paper we assume |X| = |DX |.
In 1963, Minc [6] conjectured that the permanent can be bounded from above
by the following formula:
  perm(A) ≤ ∏_{i=1..n} (ri!)^{1/ri}.                              (2)
Proved only in 1973 by Brégman [1], it was considered for decades the best upper
bound for the permanent. Recently, Liang and Bai [4], inspired by Rasmussen's
work, proposed a new upper bound (with qi = min{(ri + 1)/2, i/2}):
  perm(A)² ≤ ∏_{i=1..n} qi (ri − qi + 1).                         (3)
Neither of the two upper bounds strictly dominates the other. In the following
we denote by UB^BM(A) the Brégman-Minc upper bound and by UB^LB(A) the
Liang-Bai upper bound. Jurkat and Ryser proposed in [3] another bound:
  perm(A) ≤ ∏_{i=1..n} min(ri, i).
However, it is considered generally weaker than UB^BM(A) (see [11] for a com-
prehensive literature review). Soules proposed in [10] some general sharpening
techniques that can be employed on any existing permanent upper bound in
order to improve it. The basic idea is to apply an appropriate combination
of functions (such as row or column permutation, matrix transposition, row or
column scaling) and to recompute the upper bound on the modified matrix.
Note that in Formulas (2) and (3), the ri are equal to |Di|; since the |Di| range from
0 to n, the factors can be precomputed and stored in a vector BMfactors[r] =
(r!)^{1/r}, r = 0, . . . , n for the first bound, and similarly for the second one (with
factors depending on both |Di| and i). Assuming that |Di| is returned in O(1),
computing the formulas takes O(n) time.
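Both bounds are immediate to compute from the domain sizes alone. The sketch below implements formulas (2) and (3) exactly as stated above (in particular, qi is taken literally from the text); the toy domain sizes are invented.

# Bregman-Minc (2) and Liang-Bai (3) upper bounds computed from the row sums r_i.
import math

def bregman_minc(row_sums):
    return math.prod(math.factorial(r) ** (1.0 / r) for r in row_sums)

def liang_bai(row_sums):
    prod = 1.0
    for i, r in enumerate(row_sums, start=1):            # rows indexed from 1 as in the formula
        q = min((r + 1) / 2.0, i / 2.0)
        prod *= q * (r - q + 1)
    return math.sqrt(prod)                               # formula (3) bounds perm(A) squared

row_sums = [2, 2, 3]                                     # |D_i| for the toy alldifferent used earlier
print(bregman_minc(row_sums), liang_bai(row_sums))       # both values are >= the exact count of 2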
Recall that matrix element aij = 1 iff j ∈ Di. Assigning j to variable xi
translates to replacing the ith row by the unit vector e(j) (i.e., setting the ith
row of the matrix to 0 except for the element in column j). We write A_{xi=j} to
denote matrix A except that xi is fixed to j. We call local probe the assignment
xi = j performed to compute A_{xi=j}, i.e., a temporary assignment that does
not propagate to any other constraint except the one being processed. Solution
densities are then approximated by normalizing the resulting upper bounds over the
values in the domain of xi:
  σ(xi, j, alldifferent) ≈ UB(A_{xi=j}) / Σ_{l∈Di} UB(A_{xi=l})
This bound on the solution count depends on the consistency level enforced
in the alldifferent constraint during the local probes. It is the one we will
evaluate in Section 2.3.
If we want to compute σ(xi, j, alldifferent) for all i = 1, . . . , n and for all
j ∈ Di then a trivial implementation would compute A_{xi=j} for each variable-
value pair; the total time complexity would be O(mP + mn) (where m is the
sum of the cardinalities of the variable domains and P the time complexity of
the filtering).
Although unable to improve on the worst-case complexity, in the following
we propose an algorithm that performs considerably better in practice. We first
introduce some additional notation: we write D′i for the variable domains after
enforcing θ-consistency³ on that constraint alone, and Ĩ for the set of indices of the
variables that were subject to a domain change due to a local probe and the
ensuing filtering, that is, i ∈ Ĩ iff |D′i| ≠ |Di|. We describe the algorithm for the
Brégman-Minc bound; it can be easily adapted for the Liang-Bai bound.
The basic idea is to compute the bound for the matrix A and reuse it to speed
up the computation of the bounds for A_{xi=j} for all i = 1, . . . , n and j ∈ Di. Let
  γk = BMfactors[1] / BMfactors[|Dk|]         if k = i
  γk = BMfactors[|D′k|] / BMfactors[|Dk|]     if k ∈ Ĩ \ {i}
  γk = 1                                      otherwise
Then
  UB^BM(A_{xi=j}) = ∏_{k=1..n} BMfactors[|D′k|] = ∏_{k=1..n} γk BMfactors[|Dk|] = UB^BM(A) ∏_{k=1..n} γk
Note that γk with k = i (i.e., the factor used when computing UB^BM(A_{xi=j})) does not depend
on j; however Ĩ does depend on j because of the domain filtering.
³ Any form of consistency.
Algorithm 1 shows the pseudo-code for computing UB^BM(A_{xi=j}) for all i =
1, . . . , n and j ∈ Di. Initially, it computes the bound for matrix A (line 1);
then, for a given i, it computes γi and the upper bound is modified accordingly
(line 3). Afterwards, for each j ∈ Di, θ-consistency is enforced (line 7) and the
algorithm iterates over the set of modified variables (lines 9-10) to compute all the
γk that are different from 1. We store the upper bound for variable i and value j in the
structure VarValUB[i][j]. Before computing the bound for the other variable-value
pairs, the assignment xi = j needs to be undone (line 12). Finally, we normalize
the upper bounds in order to correctly return solution densities (lines 13-14). The
time complexity is O(mP + mĨ).
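The incremental update itself is straightforward to sketch: given the bound for A and the domain sizes before and after a local probe, only the factors of the probed row and of the rows changed by filtering are corrected. The filtering step is not shown since it depends on the consistency level used.

# Sketch of the incremental Bregman-Minc update: UB(A_{x_i=j}) is obtained from UB(A)
# by multiplying the gamma_k correction factors of the changed rows only.
import math

def bm_factor(r):
    return math.factorial(r) ** (1.0 / r) if r > 0 else 0.0

def ub_after_probe(ub_A, sizes_before, sizes_after, probed):
    """sizes_before/after: domain sizes per variable before/after the local probe."""
    ub = ub_A
    for k, (old, new) in enumerate(zip(sizes_before, sizes_after)):
        if k == probed:
            ub *= bm_factor(1) / bm_factor(old)           # the probed row shrinks to one value
        elif new != old:                                   # k belongs to the changed set I~
            ub *= bm_factor(new) / bm_factor(old)
        # otherwise gamma_k = 1 and nothing is recomputed
    return ub

sizes_before = [2, 2, 3]
ub_A = math.prod(bm_factor(r) for r in sizes_before)
sizes_after = [1, 2, 2]          # hypothetical effect of probing x_0 plus filtering on x_2
print(ub_after_probe(ub_A, sizes_before, sizes_after, probed=0))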
If the matrix A is dense, we expect |Ĩ| ≈ n, therefore most of the γk are
different from 1 and need to be computed. As soon as the matrix becomes sparse
enough, |Ĩ| ≪ n and only a small fraction of the γk needs to be computed, and
that is where Algorithm 1 has an edge. In preliminary tests conducted over the
benchmark problems presented in Section 3, Algorithm 1 with arc consistency
performed on average 25% better than the trivial implementation.
But how accurate is the counting information we compute from these bounds?
We compared the algorithm based on upper bounds with the previous ap-
proaches: Rasmussen’s algorithm, Furer’s algorithm and the sampling algorithm
proposed in [13]. We generated alldifferent instances of size n ranging from
10 to 20 variables; variable domains were partially shrunk with a percentage of
removal of values p varying from 20% to 80% in steps of 10%. We computed
the exact number of solutions and removed those instances that were infeasible
or for which enumeration took more than 2 days (leaving about one thousand
instances). As a reference, the average solution count for the alldifferent in-
stances with 20% to 60% of values removed is close to one billion solutions (and
up to 10 billions), with 70% of removals it decreases to a few millions and with
80% of removals to a few thousands.
Randomized algorithms were run 10 times and we report the average of the
results. In order to verify the performance with varying sampling time, we set
a timeout of respectively 1, 0.1, 0.01, and 0.001 second. The running time of
the counting algorithm based on upper bounds is bounded by the completion of
Algorithm 1. The measures used for the analysis are the following:
counting error: relative error on the solution count of the constraint (com-
puted as the absolute difference between the exact solution count and the
estimated one and then divided by the exact solution count)
maximum solution density error: maximum absolute error on the solution
densities (computed as the maximum of the absolute differences between the
exact solution densities and the approximated ones)
average solution density error: average absolute error on the solution den-
sities (computed as the average of the absolute differences between the exact
solution densities and the approximated ones)
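Written out explicitly, the three measures amount to the following small functions; exact and approx are dictionaries of solution densities indexed by (variable, value) pairs, and the sample data is invented.

# The three accuracy measures used in the comparison.
def counting_error(exact_count, approx_count):
    return abs(exact_count - approx_count) / exact_count

def max_density_error(exact, approx):
    return max(abs(exact[k] - approx[k]) for k in exact)

def avg_density_error(exact, approx):
    return sum(abs(exact[k] - approx[k]) for k in exact) / len(exact)

exact = {("x1", 1): 0.5, ("x1", 2): 0.5}
approx = {("x1", 1): 0.6, ("x1", 2): 0.4}
print(counting_error(2, 3), max_density_error(exact, approx), avg_density_error(exact, approx))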
Fig. 1. Counting Error for one thousand alldifferent instances with varying variable
domain sizes
Note that we computed absolute errors for the solution densities because
counting-based heuristics usually compare the absolute value of the solution
densities.
Plot 1 shows the counting error for the sampling algorithm, Rasmussen’s and
Furer’s with varying timeout. Different shades of gray indicate different per-
centages of removals; series represent different algorithms and they are grouped
based on the varying timeouts.
The relative counting error is maintained reasonably low for 1 and 0.1 second
of sampling, however it increases considerably if we further decrease the timeout.
Note that at 0.001 the sampling algorithm reaches its limit being able to sample
only a few dozen solutions (both Rasmussen's and Furer's are on the order of
hundreds of samples). We left out the results of the algorithm based on upper
bounds to avoid a scaling problem: the counting error varies from about 40%
up to 2300% when enforcing domain consistency in Algorithm 1 (UB-DC) and
up to 3600% with arc consistency (UB-AC) or 4800% with forward checking
(UB-FC). Despite being tight upper bounds, they are obviously not suitable
to approximate the solution count. Note nonetheless their remarkable running
times: UB-DC takes about one millisecond whereas UB-AC and UB-FC about
a tenth of a millisecond (with UB-FC being slightly faster).
Despite the poor performance in approximating the solution count, they pro-
vide a very good tradeoff in approximation accuracy and computation time when
deriving solution densities.
Figure 2 and 3 show respectively the maximum and average solution density
errors (note that the maximum value in the y-axis is different in the two plots).
Again the sampling algorithm shows a better accuracy w.r.t. Rasmussen’s and
Furer’s. Solution density errors are very well contained when using the upper
bound approach: they are the best when compared to the algorithms with
an equivalent timeout and on average comparable to the results obtained by the
sampling algorithm with a timeout of 0.01 seconds. Therefore, upper bounds
offer a good accuracy despite employing just a tenth (UB-DC) or a hundredth
Fig. 2. Maximum Solution Density Error for one thousand alldifferent instances with
varying variable domain sizes
Fig. 3. Average Solution Density Error for one thousand alldifferent instances with
varying variable domain sizes
(UB-AC, UB-FC) of the time of the sampling algorithm with comparable ac-
curacy. Furthermore, errors for the upper bound algorithm are quite low when
the domains are dense (low removal percentage) and on par with the sampling
algorithm with a timeout of 0.1 or even 1 second. Note that in the context of
search heuristics dense domains are more likely to happen closer to the root of
the search tree hence when it is important to have a good heuristic guidance.
Finally, as expected, enforcing a higher level of consistency during the local
probes brings more accuracy, however the difference between UB-DC, UB-AC
and UB-FC is not striking.
3 Experimental Results
In addition to counting accuracy, we measured the performance of search heuris-
tics using such information to solve combinatorial problems by running
Table 1. Average solving time (in seconds), median solving time and average number
of backtracks for 100 QWH instances of order 30
therefore the initial exact enumeration is more likely to time out. In Figure 4 we
can see that the sampling algorithm alone is able to solve some instances within
few seconds whereas maxSD exact+sampl-DC does not solve any instance within
40 seconds because of the high overhead due to exact enumeration. Sampling
alone struggles more with the hard instances and it ends up solving just 85% of
the instances whereas maxSD exact+sampl-DC solves 97% of the instances.
The previous heuristics were significantly outperformed in all the instance sets
by the heuristics based on upper bounds. As shown in Figure 6, maxSD UB-DC,
maxSD UB-AC, maxSD UB-FC are very quick in solving easy instances and yet
they are capable of solving the same number of instances as maxSD exact+sampl-
DC. The latter heuristic shows its limit already in the set of instances with 45%
of holes where no instance is solved within a hundred seconds, whilst maxSD
based on upper bounds almost instantaneously solves all the instances. maxSD
UB-AC was overall the best of the set on all the instances with up to a two
orders of magnitude advantage over IBS in terms of solving time and up to
four orders of magnitude for the number of backtracks. Enforcing a higher level
of consistency leads to better approximated solution densities and to a lower
number of backtracks, but it is more time consuming than simple arc consistency.
A weaker level of consistency like forward checking can pay off on easy instances
but it falls short compared to UB-AC on the hard ones. Note also that maxSD
UB-DC increases the solving time, despite lowering the backtracks, when the
!"
#!
$##"
Fig. 4. Percentage of solved instances vs time for QWH instances with 42% of holes
#!
$##"
!"
Fig. 5. Percentage of solved instances vs time for QWH instances with 45% of holes
instances have more holes (apart from the 42% holes instances): in those cases m
increases and the overhead of propagation becomes important (see Section 2.2).
However we could not reuse the maximum matching and the strongly connected
components (see [9]) computed for the propagation (there is no access to the
underlying propagation code) — a more coupled integration of the counting
algorithm with the propagation algorithm could lead to a performance gain. We
did not consider attempting exact counting before computing the upper bound
(exact+UB) because this would have caused timeouts on the easier instances,
The Travelling Tournament Problem with Predefined Venues (TTPPV) was in-
troduced in [5] and consists of finding an optimal single round robin schedule
for a sport event. Given a set of n teams, each team has to play against each
other team. In each game, a team is supposed to play either at home or away,
however no team can play more than three consecutive times at home or away.
The particularity of this problem resides on the venues of each game, that are
predefined, i.e., if team a plays against b we already know whether the game is
played at the venue of a or at the venue of b.
A variable xij = k means that team i plays against team k at round j. Constraint
(7) enforces that if team a plays against b then b plays against a in the same
round; constraint (4) enforces that each team plays against every other team;
the home-away pattern associated to the predefined venues of team i (P Vi ) is
defined through a regular constraint (5). Finally constraint (6) is redundant and
used to achieve additional filtering.
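The home/away restriction behind the regular constraint (5) corresponds to the language of venue patterns with no more than three consecutive identical venues. The sketch below encodes that language as a small automaton and checks patterns against it; it only illustrates the pattern language, not the regular-constraint propagator or its counting algorithm.

# A tiny automaton for home/away patterns with at most three consecutive equal venues,
# i.e., the language enforced by the regular constraint (5). States track the current run.
def make_pattern_dfa(max_run=3):
    def step(state, venue):                       # venue is 'H' or 'A'
        if state is None:                         # initial state
            return (venue, 1)
        last, run = state
        if venue == last:
            return (venue, run + 1) if run + 1 <= max_run else None   # None = reject
        return (venue, 1)
    return step

def accepts(pattern, max_run=3):
    step, state = make_pattern_dfa(max_run), None
    for venue in pattern:
        state = step(state, venue)
        if state is None:
            return False
    return True

print(accepts("HHAHAAHH"))    # True: no run longer than three
print(accepts("HAAAAH"))      # False: four consecutive away games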
We tested 40 balanced and 40 unbalanced instances borrowed from [5] with
sizes ranging from 14 to 20. For the regular constraint we used the counting
algorithm proposed in [13]. Results are reported in Table 2 for balanced and
unbalanced instances (timeout instances are included in the averages). Figure 7
shows the percentage of unbalanced instances solved within a given time limit
(time is not cumulative).
Table 2. Average solving time (in seconds), number of backtracks, and percentage of
instances solved for 80 TTPPV instances
balanced unbalanced
heuristic time bckts %solved time bckts % solved
dom/ddeg; lexico 0.1 27 100% 901.2 2829721 25%
IBS 10.7 8250 100% 631.7 1081565 50%
maxSD sampl-DC 25.9 2 100% 140.7 3577 91%
maxSD exact+sampl-DC 120.4 1 100% 216.9 1210 91%
maxSD UB-DC 6.7 1 100% 36.8 245 98%
maxSD UB-AC 0.6 1 100% 30.6 2733 98%
maxSD UB-FC 0.5 1 100% 30.5 2906 98%
!"
#!
$##"
Fig. 7. Percentage of solved instances vs time for non balanced instances of the TTPPV
Balanced instances do not present a challenge for any of the heuristics tested:
the lightweight heuristic dom/ddeg; lexico is the one performing best, together
with maxSD based on upper bounds. The sampling algorithm here shows its
main drawback, i.e., it is not competitive in solving easy instances: the number
of backtracks is indeed low, but the time spent sampling is simply wasted on easy
instances, though crucial on difficult ones. Exact enumeration adds
another constant overhead to the counting procedure, with the result of being
three orders of magnitude slower than upper bounds based on arc consistency
or forward checking.
Unbalanced instances are harder to solve and none of the heuristics were
able to solve all 40 instances within the time limit — note that in this set
of instances six out of forty are infeasible. maxSD is significantly faster than
any other heuristic: counting based on upper bounds also allowed the com-
puting time to be cut by almost 80% w.r.t. the sampling algorithm and by 85% w.r.t.
exact enumeration and sampling. 90% of the instances are solved in 100 sec-
onds by maxSD sampl-DC whereas maxSD UB-AC and maxSD UB-FC take less
than 2 seconds to solve 97.5% of the instances (maxSD UB-FC takes slightly
more).
Remarkably, maxSD with upper bounds proved the infeasibility of five of the
six instances, and with small search trees. None of the other heuristics tested were
able to prove the infeasibility of any of the six instances. Gains are remarkable
also in the number of backtracks (three orders of magnitude better than the
other heuristics). maxSD with upper bound-based counting turned out to be the
most consistent heuristic, performing very well both on hard and easy instances
with an average solving time up to 20 times better than IBS.
References
1. Bregman, L.M.: Some Properties of Nonnegative Matrices and their Permanents.
Soviet Mathematics Doklady 14(4), 945–949 (1973)
2. Gomes, C., Shmoys, D.: Completing Quasigroups or Latin Squares: A Structured
Graph Coloring Problem. In: COLOR 2002: Proceedings of Computational Sym-
posium on Graph Coloring and Generalizations, pp. 22–39 (2002)
3. Jurkat, W.B., Ryser, H.J.: Matrix Factorizations of Determinants and Permanents.
Journal of Algebra 3, 1–27 (1966)
4. Liang, H., Bai, F.: An Upper Bound for the Permanent of (0,1)-Matrices. Linear
Algebra and its Applications 377, 291–295 (2004)
5. Melo, R.A., Urrutia, S., Ribeiro, C.C.: The traveling tournament problem with
predefined venues. Journal of Scheduling 12(6), 607–622 (2009)
6. Minc, H.: Upper Bounds for Permanents of (0, 1)-matrices. Bulletin of the Ameri-
can Mathematical Society 69, 789–791 (1963)
7. Pryor, J.: Branching Variable Direction Selection in Mixed Integer Programming.
Master’s thesis, Carleton University (2009)
8. Refalo, P.: Impact-Based Search Strategies for Constraint Programming. In: Wal-
lace, M. (ed.) CP 2004. LNCS, vol. 3258, pp. 557–571. Springer, Heidelberg (2004)
9. Régin, J.-C.: A Filtering Algorithm for Constraints of Difference in CSPs. In: AAAI
1994: Proceedings of the Twelfth National Conference on Artificial Intelligence,
vol. 1, pp. 362–367. American Association for Artificial Intelligence, Menlo Park
(1994)
10. Soules, G.W.: New Permanental Upper Bounds for Nonnegative Matrices. Linear
and Multilinear Algebra 51(4), 319–337 (2003)
11. Soules, G.W.: Permanental Bounds for Nonnegative Matrices via Decomposition.
Linear Algebra and its Applications 394, 73–89 (2005)
12. Valiant, L.: The Complexity of Computing the Permanent. Theoretical Computer
Science 8(2), 189–201 (1979)
13. Zanarini, A., Pesant, G.: Solution counting algorithms for constraint-centered
search heuristics. Constraints 14(3), 392–413 (2009)
Author Index