Network Optimization - Continuous and Discrete Models
Dimitri P. Bertsekas
Massachusetts Institute of Technology
Email: [email protected]
WWW: http://www.athenasc.com
ISBN 1-886529-02-7
ATHENA SCIENTIFIC
OPTIMIZATION AND COMPUTATION SERIES
Contents
1. Introduction
   1.1. Graphs and Flows
        1.1.1. Paths and Cycles
        1.1.2. Flow and Divergence
        1.1.3. Path Flows and Conformal Decomposition
   1.2. Network Flow Models – Examples
        1.2.1. The Minimum Cost Flow Problem
        1.2.2. Network Flow Problems with Convex Cost
        1.2.3. Multicommodity Flow Problems
        1.2.4. Discrete Network Optimization Problems
   1.3. Network Flow Algorithms – An Overview
        1.3.1. Primal Cost Improvement
        1.3.2. Dual Cost Improvement
        1.3.3. Auction
        1.3.4. Good, Bad, and Polynomial Algorithms
   1.4. Notes, Sources, and Exercises
References
Index
Preface
Network optimization lies in the middle of the great divide that separates
the two major types of optimization problems, continuous and discrete.
The ties between linear programming and combinatorial optimization can
be traced to the representation of the constraint polyhedron as the convex
hull of its extreme points. When a network is involved, however, these ties
become much stronger because the extreme points of the polyhedron are in-
teger and represent solutions of combinatorial problems that are seemingly
unrelated to linear programming. Because of this structure and also be-
cause of their intuitive character, network models provide ideal vehicles for
explaining many of the fundamental ideas in both continuous and discrete
optimization.
Aside from their interesting methodological characteristics, network
models are also used extensively in practice, in an ever expanding spectrum
of applications. Indeed, collectively, network problems such as shortest
path, assignment, max-flow, transportation, transhipment, spanning
tree, matching, traveling salesman, generalized assignment, vehicle rout-
ing, and multicommodity flow constitute the most common class of practi-
cal optimization problems. There has been steady progress in the solution
methodology of network problems, and in fact the progress has accelerated
in the last fifteen years thanks to algorithmic and technological advances.
The purpose of this book is to provide a fairly comprehensive and up-
to-date development of linear, nonlinear, and discrete network optimization
problems. The interplay between continuous and discrete structures has
been highlighted, the associated analytical and algorithmic issues have been
treated quite extensively, and a guide to important network models and
applications has been provided.
Regarding continuous network optimization, we focus on two ideas,
which are also fundamental in general mathematical programming: dual-
ity and iterative cost improvement. We provide an extensive treatment of
iterative algorithms for the most common linear cost problem, the mini-
mum cost flow or transhipment problem, and for its convex cost extensions.
The discussion of duality is comprehensive: it starts with linear network
Chapter 7: This chapter starts with the auction algorithm for the assign-
ment problem, and proceeds to show how this algorithm can be extended
to more complex problems. In this way, preflow-push methods for the
max-flow problem and the ε-relaxation method for the minimum cost flow
problem are obtained. Several additional variants of auction algorithms
are developed.
Chapter 8: This is an important chapter that marks the transition from
linear to nonlinear network optimization. The primary focus is on continu-
ous (convex) problems, and their associated broad variety of structures and
methodology. In particular, there is an overview of the types of algorithms
from nonlinear programming that are useful in connection with various con-
vex network problems. There is also some discussion of discrete (integer)
problems with an emphasis on their ties with continuous problems.
Chapter 9: This is a fairly sophisticated chapter that is directed primar-
ily towards the advanced and/or research-oriented reader. It deals with
separable convex problems, discusses their connection with classical net-
work equilibrium problems, and develops their rich theoretical structure.
The salient features of this structure are a particularly sharp duality the-
ory, and a combinatorial connection of descent directions with the finite
set of elementary vectors of the subspace defined by the conservation of
flow constraints. Besides treating convex separable network problems, this
chapter provides an introduction to monotropic programming, which is the
largest class of nonlinear programming problems that possess the strong
duality and combinatorial properties of linear programs. This chapter also
develops auction algorithms for convex separable problems and provides an
analysis of their running time.
Chapter 10: This chapter deals with the basic methodological approaches
for integer-constrained problems. There is a treatment of exact methods
such as branch-and-bound, and the associated methods of Lagrangian re-
laxation, subgradient optimization, and cutting plane. There is also a
description of approximate methods based on local search, such as genetic
algorithms, tabu search, and simulated annealing. Finally, there is a dis-
cussion of rollout algorithms, a relatively new and broadly applicable class
of approximate methods, which can be used in place of, or in conjunction
with local search.
The book can be used for a course on network optimization or for part
of a course on introductory optimization at the first-year graduate level.
With the exception of some of the material in Chapter 9, the prerequisites
are fairly elementary. The main one is a certain degree of mathematical
maturity, as provided for example by a rigorous mathematics course beyond
the calculus level. One may cover most of the book in a course on linear
and nonlinear network optimization. A shorter version of this course may
consist of Chapters 1-5, and 8. Alternatively, one may teach a course that
[Figure: chapter prerequisite diagram, showing among the course reading sequences Chapters 1-5 (Intro/Linear) and Chapter 9 (Convex).]
Dimitri P. Bertsekas
Cambridge, Mass.
Spring 1998
1

Introduction
Network flow problems form one of the most important and most frequently
encountered classes of optimization problems. They arise naturally in the
analysis and design of large systems, such as communication, transporta-
tion, and manufacturing networks. They can also be used to model impor-
tant classes of combinatorial problems, such as assignment, shortest path,
and traveling salesman problems.
Loosely speaking, network flow problems consist of supply and de-
mand points, together with several routes that connect these points and
are used to transfer the supply to the demand. These routes may contain
intermediate transhipment points. Often, the supply, demand, and tran-
shipment points can be modeled by the nodes of a graph, and the routes can
be modeled by the paths of the graph. Furthermore, there may be multiple
“types” of supply/demand (or “commodities”) sharing the routes. There
may also be some constraints on the characteristics of the routes, such as
their carrying capacities, and some costs associated with using particu-
lar routes. Such situations are naturally modeled as network optimization
problems whereby, roughly speaking, we try to select routes that minimize
the cost of transfer of the supply to the demand.
This book deals with a broad spectrum of network optimization prob-
lems, involving linear and nonlinear cost functions. We pay special atten-
tion to four major classes of problems:
(a) The transhipment or minimum cost flow problem, which involves a
single commodity and a linear cost function. This problem has several
important special cases, such as the shortest path, the max-flow, the
assignment, and the transportation problems.
(b) The single commodity network flow problem with convex cost. This
problem is identical to the preceding transhipment problem, except
that the cost function is convex rather than linear.
(c) The multicommodity network flow problem with linear or convex cost.
This problem generalizes the preceding two classes of problems to the
case of multiple commodities.
(d) Discrete network optimization problems. These are problems where
the quantities transferred along the routes of the network are re-
stricted to take one of a finite number of values. Many combinatorial
optimization problems can be modeled in this way, including some
problems where the network structure is not immediately apparent.
Some discrete optimization problems are computationally very diffi-
cult, and in practice can only be solved approximately. Their algorith-
mic solution often involves the solution of “continuous” subproblems
that belong to the preceding three classes.
All of the network flow problems above can be mathematically mod-
eled in terms of graph-related notions. In Section 1.1, we introduce the
associated notation and terminology. In Section 1.2, we provide mathe-
† Some authors use a single symbol, such as a, to denote an arc, and use
something like s(a) and e(a) to denote the start and end nodes of a, respectively.
This notational method allows the existence of multiple arcs with the same start
and end nodes, but is also more cumbersome and less suggestive.
Figure 1.1: Illustration of various types of paths and cycles. The cycle in (b)
is not a Hamiltonian cycle; it is simple and contains all the nodes of the graph,
but it is not forward. Note that for the path (c), in order to resolve ambiguities,
it is necessary to specify the sequence of arcs of the path (rather than just the
sequence of nodes) because both (n3 , n4 ) and (n4 , n3 ) are arcs.
yi = ∑_{j|(i,j)∈A} xij − ∑_{j|(j,i)∈A} xji, ∀ i ∈ N. (1.1)

Thus, yi is the total flow departing from node i less the total flow arriving
at i; it is referred to as the divergence of i.

We say that node i is a source (respectively, sink) for the flow vector
x if yi > 0 (respectively, yi < 0). If yi = 0 for all i ∈ N, then x is called
a circulation. These definitions are illustrated in Fig. 1.3. Note that by
adding Eq. (1.1) over all i ∈ N, we obtain

∑_{i∈N} yi = 0.
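In code, the divergence of each node is obtained by a single pass over the arcs. The following minimal Python sketch (not part of the text; the flow values are those of the circulation in Fig. 1.3(b)) also confirms the identity above.

```python
# A small sketch of flow and divergence; the numbers are those of the
# circulation in Fig. 1.3(b), and the code itself is only an illustration.

def divergences(nodes, flow):
    """flow maps each arc (i, j) to its flow xij; returns the divergence yi."""
    y = {i: 0 for i in nodes}
    for (i, j), x in flow.items():
        y[i] += x   # flow departing from node i
        y[j] -= x   # flow arriving at node j
    return y

flow = {(1, 2): 1, (2, 4): -1, (2, 3): 1, (3, 2): -1, (1, 3): -1, (3, 4): 1}
y = divergences([1, 2, 3, 4], flow)
assert all(v == 0 for v in y.values())   # a circulation: yi = 0 for all i
assert sum(y.values()) == 0              # holds for every flow vector x
```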
Given a flow vector x that satisfies these bounds, we say that a path P is
unblocked with respect to x if, roughly speaking, we can send some positive
flow along P without violating the bound constraints; that is, if flow can
[Figure 1.3(a): flows x12 = 1, x24 = −2, x23 = 1, x32 = 0, x13 = 0, x34 = 2,
with divergences y1 = 1 (source), y2 = −2 (sink), y3 = 1 (source), and
y4 = 0 (neither a source nor a sink).
Figure 1.3(b): a circulation, with x12 = 1, x24 = −1, x23 = 1, x32 = −1,
x13 = −1, x34 = 1, and yi = 0 for all i.]
Figure 1.3: Illustration of flows xij and the corresponding divergences yi . The
flow in (b) is a circulation because yi = 0 for all i.
In this section we introduce some of the major classes of problems that will
be discussed in this book. We begin with the minimum cost flow problem,
which, together with its special cases, will be the subject of the following
six chapters.
Figure 1.4: Decomposition of a flow vector x into three simple path flows con-
forming to x. Consistent with the definition of conformance of a path flow, each
arc (i, j) of the three component paths carries positive (or negative) flow only if
xij > 0 (or xij < 0, respectively). The first two paths [(1, 2) and (3, 4, 2)] are not
cycles, but they start at a source and end at a sink, as required. Arcs (1, 3) and
(3, 2) do not belong to any of these paths because they carry zero flow. In this
example, the decomposition is unique, but in general this need not be the case.
This problem is to find a set of arc flows that minimize a linear cost function,
subject to the constraints that they produce a given divergence vector and
they lie within some given bounds; that is,
minimize ∑_{(i,j)∈A} aij xij (1.3)

subject to

∑_{j|(i,j)∈A} xij − ∑_{j|(j,i)∈A} xji = si, ∀ i ∈ N, (1.4)

bij ≤ xij ≤ cij, ∀ (i, j) ∈ A, (1.5)

where aij, si, bij, and cij are given scalars.
Suppose that each arc (i, j) of a graph is assigned a scalar cost aij , and suppose
that we define the cost of a forward path to be the sum of the costs of its
arcs. Given a pair of nodes, the shortest path problem is to find a forward
path that connects these nodes and has minimum cost. An analogy here is
made between arcs and their costs, and roads in a transportation network and
their lengths, respectively. Within this transportation context, the problem
becomes one of finding the shortest route between two geographical points.
Based on this analogy, the problem is referred to as the shortest path problem,
and the arc costs and path costs are commonly referred to as the arc lengths
and path lengths, respectively.
The shortest path problem arises in a surprisingly large number of con-
texts. For example in a data communication network, aij may denote the
average delay of a packet to cross the communication link (i, j), in which case
a shortest path is a minimum average delay path that can be used for routing
the packet from its origin to its destination. As another example, if pij is
the probability that a given arc (i, j) in a communication network is usable,
and each arc is usable independently of all other arcs, then the product of the
probabilities of the arcs of a path provides a measure of reliability of the path.
With this in mind, it is seen that finding the most reliable path connecting
two nodes is equivalent to finding the shortest path between the two nodes
with arc lengths (− ln pij ).
The shortest path problem also arises often as a subroutine in algo-
rithms that solve other more complicated problems. Examples are the primal-
dual algorithm for solving the minimum cost flow problem (see Chapter 6),
and the conditional gradient and projection algorithms for solving multicom-
modity flow problems (see Chapter 8).
It is possible to cast the problem of finding a shortest path from node
s to node t as the following minimum cost flow problem:
minimize ∑_{(i,j)∈A} aij xij

subject to

∑_{j|(i,j)∈A} xij − ∑_{j|(j,i)∈A} xji = { 1 if i = s; −1 if i = t; 0 otherwise }, (1.6)

0 ≤ xij, ∀ (i, j) ∈ A.
To see this, let us associate with any forward path P from s to t the flow
vector x with components given by

xij = { 1 if (i, j) belongs to P; 0 otherwise }. (1.7)
Then x is feasible for problem (1.6) and the cost of x is equal to the length
of P . Thus, if a vector x of the form (1.7) is an optimal solution of problem
(1.6), the corresponding path P is shortest.
Conversely, it can be shown that if problem (1.6) has at least one op-
timal solution, then it has an optimal solution of the form (1.7), with a
corresponding path P that is shortest. This is not immediately apparent, but
its proof can be traced to a remarkable fact that we will show in Chapter 5
about minimum cost flow problems with node supplies and arc flow bounds
that are integer: if such a problem has an optimal solution, it has an
integer optimal solution, that is, a set of optimal arc flows that are integer
(an alternative proof of this fact is sketched in Exercise 1.34). From this it
follows that if problem (1.6) has an optimal solution, it has one with arc flows
that are 0 or 1, and which is of the form (1.7) for some path P. This path is
shortest because its length is equal to the optimal cost of problem (1.6), so it
must be less than or equal to the cost of any other flow vector of the form (1.7),
and therefore also less than or equal to the length of any other path from s to t.
Thus the shortest path problem is essentially equivalent to the minimum
cost flow problem (1.6).
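As a toy check of this equivalence (the graph and arc lengths below are made up, and the code is only an illustration), one can enumerate all flow vectors of the form (1.7), that is, all simple forward paths from s to t, and take the cheapest:

```python
# Toy check: among flow vectors of the form (1.7), i.e., simple forward
# paths from s to t, the cheapest one is a shortest path.
arcs = {('s', 'a'): 1, ('s', 'b'): 4, ('a', 'b'): 1, ('a', 't'): 5, ('b', 't'): 1}

def simple_paths(u, t, visited):
    """Yield every simple forward path from u to t as a list of arcs."""
    if u == t:
        yield []
        return
    for (i, j) in arcs:
        if i == u and j not in visited:
            for rest in simple_paths(j, t, visited | {j}):
                yield [(i, j)] + rest

# The cost of the flow vector (1.7) of a path P equals the length of P, so
# minimizing over such vectors is exactly the shortest path problem.
best = min(simple_paths('s', 't', {'s'}), key=lambda P: sum(arcs[a] for a in P))
assert best == [('s', 'a'), ('a', 'b'), ('b', 't')]
assert sum(arcs[a] for a in best) == 3
```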
Suppose that there are n persons and n objects that we have to match on a
one-to-one basis. There is a benefit or value aij for matching person i with
object j, and we want to assign persons to objects so as to maximize the total
Figure 1.5: The graph representation of an assignment problem. Each person
i is a source with supply 1, each object j is a sink with demand 1, and arc
(i, j) carries the benefit aij.
0 ≤ xij ≤ 1, ∀ (i, j) ∈ A.
The constraint

∑_{j|(i,j)∈A} xij = 1
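For very small instances, the assignment problem can be solved by brute force over all one-to-one matchings; the benefits below are made up, and this sketch is only an illustration (practical methods are discussed later in the book).

```python
# Brute-force solution of a tiny n x n assignment problem; a[i][j] is the
# (made-up) benefit of matching person i with object j.
from itertools import permutations

a = [[5, 1, 2],
     [4, 6, 3],
     [2, 3, 8]]
n = len(a)

# perm[i] is the object assigned to person i; try all one-to-one matchings.
best = max(permutations(range(n)),
           key=lambda perm: sum(a[i][perm[i]] for i in range(n)))
assert best == (0, 1, 2)
assert sum(a[i][best[i]] for i in range(n)) == 19
```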
In the max-flow problem, we have a graph with two special nodes: the source,
denoted by s, and the sink , denoted by t. Roughly, the objective is to move as
much flow as possible from s into t while observing the capacity constraints.
More precisely, we want to find a flow vector that makes the divergence of all
nodes other than s and t equal to 0 while maximizing the divergence of s.
[Figure: the max-flow problem; an artificial arc (t, s) connects the sink back
to the source, and its flow xts equals the total flow from s to t.]

maximize xts

subject to

∑_{j|(i,j)∈A} xij − ∑_{j|(j,i)∈A} xji = 0, ∀ i ∈ N with i ≠ s and i ≠ t,

∑_{j|(s,j)∈A} xsj = ∑_{i|(i,t)∈A} xit = xts,

bij ≤ xij ≤ cij, ∀ (i, j) ∈ A. One can also place bounds on the flow xts
of the artificial arc,
but these bounds are actually redundant since they are implied by the other
upper and lower arc flow bounds.
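The following is a minimal augmenting-path sketch of the max-flow problem (the classical Ford-Fulkerson idea with breadth-first search; the graph and capacities are made up, lower flow bounds are taken to be zero, and the code is not taken from the text):

```python
# Max-flow by BFS augmenting paths on a made-up graph with capacities c_ij.
from collections import deque

def max_flow(cap, s, t):
    """cap: dict (i, j) -> capacity c_ij. Returns the maximum s-to-t flow."""
    # Residual capacities: forward arcs start at c_ij, reverse arcs at 0.
    res = dict(cap)
    for (i, j) in cap:
        res.setdefault((j, i), 0)
    total = 0
    while True:
        # BFS for a path of positive residual capacity (an unblocked path).
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for (i, j), r in res.items():
                if i == u and r > 0 and j not in parent:
                    parent[j] = u
                    q.append(j)
        if t not in parent:
            return total
        # Trace the path and push the bottleneck amount of flow along it.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        delta = min(res[a] for a in path)
        for (i, j) in path:
            res[(i, j)] -= delta
            res[(j, i)] += delta
        total += delta

cap = {('s', 'a'): 3, ('s', 'b'): 2, ('a', 'b'): 1, ('a', 't'): 2, ('b', 't'): 3}
assert max_flow(cap, 's', 't') == 5
```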
This problem is the same as the assignment problem except that the node
supplies need not be 1 or −1, and the numbers of sources and sinks need not
be equal. It has the form
minimize ∑_{(i,j)∈A} aij xij

subject to

∑_{j|(i,j)∈A} xij = αi, ∀ i = 1, …, m, (1.9)

∑_{i|(i,j)∈A} xij = βj, ∀ j = 1, …, n.

Here αi and βj are positive scalars, which for feasibility must satisfy

∑_{i=1}^{m} αi = ∑_{j=1}^{n} βj.
∑_{i=1}^{m} xij ≤ bj, ∀ j = 1, …, n,

∑_{j=1}^{n} xij = 1, ∀ i = 1, …, m,

and the cost to be minimized is ∑_{i=1}^{m} ∑_{j=1}^{n} aij xij. To convert
the problem to the transportation format, we introduce a node 0 together
with slack variables x0j satisfying

∑_{j=1}^{n} x0j = ∑_{j=1}^{n} bj − m,

and we change the inequality constraints ∑_{i=1}^{m} xij ≤ bj to

x0j + ∑_{i=1}^{m} xij = bj.
The resulting problem has the transportation structure of problem (1.9), and
is equivalent to the original problem.
A more general version of the minimum cost flow problem arises when the
cost function is convex rather than linear. An important special case is the
problem
minimize ∑_{(i,j)∈A} fij(xij)

subject to

∑_{j|(i,j)∈A} xij − ∑_{j|(j,i)∈A} xji = si, ∀ i ∈ N,

xij ∈ Xij, ∀ (i, j) ∈ A,

where fij is a convex function of the flow xij of arc (i, j), si are given
scalars, and Xij are convex intervals of real numbers, such as for example
Xij = [bij, cij], where bij and cij are given scalars. We refer to this as the separable convex
cost network flow problem, because the cost function separates into the sum
of cost functions, one per arc. This problem will be discussed in detail in
Chapters 8 and 9.
Here the problem is to find an m × n matrix X that has given row sums and
column sums, and approximates a given m × n matrix M in some optimal
manner. We can formulate such a problem in terms of a graph consisting of
m sources and n sinks. In this graph, the set of arcs consists of the pairs
(i, j) for which the corresponding entry xij of the matrix X is allowed to be
nonzero. The given row sums ri and the given column sums cj are expressed
as the constraints
∑_{j|(i,j)∈A} xij = ri, i = 1, …, m,

∑_{i|(i,j)∈A} xij = cj, j = 1, …, n.
There may be also bounds for the entries xij of X. Thus, the structure of
this problem is similar to the structure of a transportation problem. The cost
function to be optimized has the form
∑_{(i,j)∈A} fij(xij),
and expresses the objective of making the entries of X close to the
corresponding entries of the given matrix M. Commonly used examples are
the quadratic function

fij(xij) = wij(xij − mij)²,

where wij are given positive weights, and the logarithmic function

fij(xij) = xij( ln(xij/mij) − 1 ),

where we assume that mij > 0 for all (i, j) ∈ A. Note that the logarithmic
function is not defined for xij ≤ 0, so to obtain a problem that fits our
framework, we must use a constraint interval of the form Xij = (0, ∞) or
Xij = (0, cij], where cij is a positive scalar.
An example of a practical problem that can be addressed using the
preceding optimization model is to predict the distribution matrix X of tele-
phone traffic between m origins and n destinations. Here we are given the
total supplies ri of the origins and the total demands cj of the destinations,
and we are also given some matrix M that defines a nominal traffic pattern
obtained from historical data.
There are other types of network flow problems with convex cost that
often arise in practice. We generically represent such problems in the form
minimize f (x)
subject to x ∈ F
[Figure: the origin im and destination jm of an OD pair (im, jm), each with throughput rm.]
Such a cost function is often based on a queueing model of average delay (see
for example the data network textbook by Bertsekas and Gallager [1992]).
The constraint that each node has a single incoming and a single outgoing arc
on the tour is expressed by the following two conservation of flow equations:
∑_{j=1,…,N, j≠i} xij = 1, i = 1, …, N,

∑_{i=1,…,N, i≠j} xij = 1, j = 1, …, N.
the subgraph with node set N and arc set {(i, j) | xij = 1} is connected.
If this constraint were not present, the problem would be an ordinary
assignment problem. Unfortunately, this constraint is essential, since without it,
there would be feasible solutions involving multiple disconnected cycles.
Despite the similarity, the traveling salesman problem is far more
difficult than the assignment problem. Solving problems with a mere few
hundred nodes can be very challenging. By contrast, assignment problems
with hundreds of thousands of nodes can be solved in reasonable time
with the presently available methodology.
Primal cost improvement algorithms for the minimum cost flow problem
start from an initial feasible flow vector and then generate a sequence of
feasible flow vectors, each having a better cost than the preceding one.
Let us derive an important characterization of the differences between suc-
cessive vectors, which is the basis for algorithms as well as for optimality
conditions.
Let x and x̄ be two feasible flow vectors, and consider their difference
z = x̄ − x. This difference must be a circulation with components

zij = x̄ij − xij,

since both x and x̄ are feasible. Furthermore, if the cost of x̄ is smaller
than the cost of x, the circulation z must have negative cost, i.e.,

∑_{(i,j)∈A} aij zij < 0.
We can decompose z into the sum of simple cycle flows by using the
conformal realization theorem (Prop. 1.1). In particular, for some positive
integer K, we have

z = ∑_{k=1}^{K} wk ξk,

where wk are positive scalars, and ξk are simple cycle flows whose nonzero
components ξkij are 1 or −1, depending on whether zij > 0 or zij < 0,
respectively. It is seen that the cost of z is

∑_{(i,j)∈A} aij zij = ∑_{k=1}^{K} wk ck,

where ck is the cost of the simple cycle flow ξk. Thus, since the scalars wk
are positive, if the cost of z is negative, at least one ck must be negative.
Note that if Ck is the cycle corresponding to ξk, we have

ck = ∑_{(i,j)∈A} aij ξkij = ∑_{(i,j)∈Ck+} aij − ∑_{(i,j)∈Ck−} aij,

where Ck+ and Ck− are the sets of forward and backward arcs of the cycle
Ck, respectively. We refer to the expression in the right-hand side above
as the cost of the cycle Ck.
The preceding argument has shown that if x is feasible but not optimal,
and x̄ is feasible and has smaller cost than x, then at least one of the
cycles corresponding to a conformal decomposition of the circulation x̄ − x
as above has negative cost. This is used to prove the following important
optimality condition.
Proposition: A feasible flow vector x∗ is optimal if and only if every simple
cycle C that is unblocked with respect to x∗ has nonnegative cost.

Proof: Let x∗ be an optimal flow vector and let C be a simple cycle that
is unblocked with respect to x∗. Then there exists an ε > 0 such that
increasing (decreasing) the flow of arcs of C+ (of C−, respectively) by ε
results in a feasible flow that has cost equal to the cost of x∗ plus ε times
the cost of C. Thus, since x∗ is optimal, the cost of C must be nonnegative.

Conversely, suppose, to arrive at a contradiction, that x∗ is feasible
and has the nonnegative cycle property stated in the proposition, but is not
optimal. Let x be a feasible flow vector with cost smaller than that of
x∗, and consider a conformal decomposition of the circulation z = x − x∗.
From the discussion preceding the proposition, we see that there is a simple
cycle C with negative cost, such that x∗ij < xij for all (i, j) ∈ C+, and such
that x∗ij > xij for all (i, j) ∈ C−. Since x is feasible, we have bij ≤ xij ≤ cij
for all (i, j). It follows that x∗ij < cij for all (i, j) ∈ C+, and x∗ij > bij for
all (i, j) ∈ C−, so that C is unblocked with respect to x∗. This contradicts
the hypothesis that every simple cycle that is unblocked with respect to x∗
has nonnegative cost. Q.E.D.
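This optimality condition can be tested mechanically: each way of increasing or decreasing an arc flow without violating its bounds gives an arc of a residual graph, an unblocked simple cycle corresponds to a cycle there, and a negative-cost cycle can be detected with a Bellman-Ford pass. A sketch with made-up data (zero lower bounds; the code is an illustration, not the book's development):

```python
# Made-up instance: arcs maps (i, j) -> (cost a_ij, capacity c_ij), with
# lower bounds 0, and x is a feasible flow. A simple cycle unblocked with
# respect to x corresponds to a cycle of the residual graph built below;
# x is optimal iff no such cycle has negative cost.

def has_negative_unblocked_cycle(arcs, x):
    residual = []  # (tail, head, cost) of each unblocked flow-change direction
    nodes = set()
    for (i, j), (a, c) in arcs.items():
        nodes |= {i, j}
        if x[(i, j)] < c:
            residual.append((i, j, a))    # can increase x_ij: forward arc
        if x[(i, j)] > 0:
            residual.append((j, i, -a))   # can decrease x_ij: backward arc
    # Bellman-Ford from an implicit super-source (all labels start at 0):
    # a further relaxation after |N| - 1 rounds means a negative-cost cycle.
    dist = {v: 0 for v in nodes}
    for _ in range(len(nodes) - 1):
        for (u, v, w) in residual:
            dist[v] = min(dist[v], dist[u] + w)
    return any(dist[u] + w < dist[v] for (u, v, w) in residual)

arcs = {(1, 2): (1, 2), (2, 3): (1, 2), (3, 1): (-5, 2)}
x0 = {(1, 2): 0, (2, 3): 0, (3, 1): 0}   # cycle 1-2-3 is unblocked, cost -3
assert has_negative_unblocked_cycle(arcs, x0)
x1 = {(1, 2): 2, (2, 3): 2, (3, 1): 2}   # cycle saturated: no improvement
assert not has_negative_unblocked_cycle(arcs, x1)
```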
Consider the n × n assignment problem (cf. Example 1.2) and suppose that
we have a feasible assignment, that is, a set of n pairs (i, j) involving each
person i exactly once and each object j exactly once. In order to improve
this assignment, we could consider a two-person exchange, that is, replacing
two pairs (i1 , j1 ) and (i2 , j2 ) from the assignment with the pairs (i1 , j2 ) and
(i2 , j1 ). The resulting assignment will still be feasible, and it will have a
higher value if and only if

ai1 j2 + ai2 j1 > ai1 j1 + ai2 j2.
We note here that, in the context of the minimum cost flow representation of
the assignment problem, a two-person exchange can be identified with a cycle
involving the four arcs (i1 , j1 ), (i2 , j2 ), (i1 , j2 ), and (i2 , j1 ). Furthermore, this
cycle is the difference between the assignment before and the assignment after
the exchange, while the preceding inequality is equivalent to the cycle having
a positive value.
Unfortunately, it may be impossible to improve the current assignment
by a two-person exchange, even if the assignment is not optimal; see Fig.
1.8. An improvement, however, is possible by means of a k-person exchange,
for some k ≥ 2, where a set of pairs (i1 , j1 ), . . . , (ik , jk ) from the current as-
signment is replaced by the pairs (i1 , j2 ), . . . , (ik−1 , jk ), (ik , j1 ). To see this,
note that in the context of the minimum cost flow representation of the as-
signment problem, a k-person exchange corresponds to a simple cycle with
k forward arcs (corresponding to the new assignment pairs) and k backward
arcs (corresponding to the current assignment pairs that are being replaced);
see Fig. 1.9. Thus, performing a k-person exchange is equivalent to pushing
one unit of flow along the corresponding simple cycle. The k-person exchange
improves the assignment if and only if

∑_{m=1}^{k−1} aim jm+1 + aik j1 > ∑_{m=1}^{k} aim jm.
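The two-person exchange rule can be written as a simple local search; as just noted, termination (no improving exchange) does not guarantee optimality. The benefits below are made up and the code is only a sketch.

```python
# Local search over two-person exchanges for an assignment problem with
# benefit matrix a (made-up data); assign[i] is the object of person i.
def two_exchange_search(a, assign):
    """Greedily apply two-person exchanges until none raises the value."""
    n = len(a)
    improved = True
    while improved:
        improved = False
        for i1 in range(n):
            for i2 in range(i1 + 1, n):
                j1, j2 = assign[i1], assign[i2]
                # Profitable iff the corresponding 4-arc cycle has positive value.
                if a[i1][j2] + a[i2][j1] > a[i1][j1] + a[i2][j2]:
                    assign[i1], assign[i2] = j2, j1
                    improved = True
    return assign

assert two_exchange_search([[1, 5], [6, 2]], [0, 1]) == [1, 0]   # value 3 -> 11
assert two_exchange_search([[5, 1], [6, 2]], [0, 1]) == [0, 1]   # no improving exchange
```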
Duality theory deals with the relation between the original network opti-
mization problem and another optimization problem called the dual . To
develop an intuitive understanding of duality, we will focus on an n × n as-
signment problem (cf. Example 1.2) and consider a closely related economic
equilibrium problem. In particular, let us consider matching the n objects
where

A(i) = { j | (i, j) ∈ A }
is the set of objects that can be assigned to person i. When this condition
holds for all persons i, we say that the assignment and the price vector
p = (p1 , . . . , pn ) satisfy complementary slackness (CS for short); this name
is standard in linear programming. The economic system is then at equi-
librium, in the sense that no person would have an incentive to unilaterally
seek another object. Such equilibrium conditions are naturally of great
interest to economists, but there is also a fundamental relation with the
assignment problem. We have the following proposition.
Furthermore, the value of the optimal assignment and the optimal cost
of the dual problem are equal.
for any set of prices {pj | j = 1, …, n}, since the first term of the right-hand
side is no less than

∑_{i=1}^{n} (aiki − pki),

while the second term is equal to ∑_{i=1}^{n} pki. On the other hand, the given
assignment and set of prices, denoted by {(i, ji) | i = 1, …, n} and {pj |
j = 1, …, n}, respectively, satisfy the CS conditions, so we have

∑_{i=1}^{n} aiki ≤ ∑_{i=1}^{n} ( max_{j∈A(i)} { aij − pj } + pji )

= ∑_{i=1}^{n} aiji

≤ ∑_{i=1}^{n} max_{j∈A(i)} { aij − pj } + ∑_{j=1}^{n} pj,
1.3.3 Auction
Naive Auction
[Figure: surfaces of equal dual cost in the space of the prices (p1, p2).]
seen shortly. Nonetheless, this flaw will help motivate a more sophisticated
and correct algorithm.
The naive auction algorithm proceeds in iterations and generates a
sequence of price vectors and partial assignments. By a partial assignment
we mean an assignment where only a subset of the persons have been
matched with objects. A partial assignment should be contrasted with a
feasible or complete assignment where all the persons have been matched
with objects on a one-to-one basis. At the beginning of each iteration, the
CS condition [cf. Eq. (1.10)]

aiji − pji = max_{j∈A(i)} { aij − pj }

is satisfied for all pairs (i, ji) of the partial assignment. If all persons
are assigned, the algorithm terminates. Otherwise some person who is
unassigned, say i, is selected. This person finds an object ji which offers
maximal value, that is,

ji ∈ arg max_{j∈A(i)} { aij − pj },
and then:
(a) Gets assigned to the best object ji ; the person who was assigned to
ji at the beginning of the iteration (if any) becomes unassigned.
(b) Sets the price of ji to the level at which he/she is indifferent between
ji and the second best object; that is, he/she sets pji to

pji + γi,

where

γi = vi − wi, (1.13)

vi is the best object value,

vi = max_{j∈A(i)} { aij − pj }, (1.14)

and wi is the best value over the remaining objects,

wi = max_{j∈A(i), j≠ji} { aij − pj }. (1.15)

(Note that as pji is increased, the value aiji − pji offered by object ji
to person i is decreased. γi is the largest increment by which pji can
be increased, while maintaining the property that ji offers maximal
value to i.)
This process is repeated in a sequence of iterations until each person has
been assigned to an object.
We may view this process as an auction where at each iteration the
bidder i raises the price of a preferred object by the bidding increment γi.
Note that γi cannot be negative, since vi ≥ wi [compare Eqs. (1.14) and
(1.15)], so the object prices tend to increase. The choice of γi is illustrated
in Fig. 1.11. Just as in a real auction, bidding increments and price in-
creases spur competition by making the bidder’s own preferred object less
attractive to other potential bidders.
ε-Complementary Slackness
Unfortunately, the naive auction algorithm does not always work (although
it is an excellent initialization procedure for other methods, such as primal-
dual or relaxation, and it is useful in other specialized contexts). The diffi-
culty is that the bidding increment γi is 0 when two or more objects are tied
in offering maximum value for the bidder i. As a result, a situation may be
created where several persons contest a smaller number of equally desirable
objects without raising their prices, thereby creating a never ending cycle;
see Fig. 1.12.
To break such cycles, we introduce a perturbation mechanism, moti-
vated by real auctions where each bid for an object must raise its price by
a minimum positive increment, and bidders must on occasion take risks to
win their preferred objects. In particular, let us fix a positive scalar ε, and
Figure 1.11: In the naive auction algorithm, even after the price of the best
object ji is increased by the bidding increment γi , ji continues to be the best
object for the bidder i, so CS is satisfied at the end of the iteration. However, we
have γi = 0 if there is a tie between two or more objects that are most preferred
by i.
aij − pj ≥ max_{k∈A(i)} { aik − pk } − ε

for all assigned pairs (i, j). In words, to satisfy ε-CS, all assigned persons
of the partial assignment must be assigned to objects that are within ε of
being best.
We now reformulate the previous auction process so that the bidding in-
crement is always at least equal to ε. The resulting method, the auction
algorithm, is the same as the naive auction algorithm, except that the
bidding increment γi is

γi = vi − wi + ε, (1.16)

rather than γi = vi − wi as in Eq. (1.13). With this choice, the ε-CS
condition is satisfied, as illustrated in Fig. 1.13. The particular increment
γi = vi − wi + ε used in the auction algorithm is the maximum amount with
this property. Smaller increments γi would also work as long as γi ≥ ε,
but using the largest possible increment accelerates the algorithm. This
is consistent with experience from real auctions, which tend to terminate
faster when the bidding is aggressive.
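To make the iteration concrete, here is a minimal Python sketch of the auction algorithm with the bidding increment γi = vi − wi + ε; the dense benefit-matrix representation and the function name are our own choices, not from the text.

```python
def auction_assignment(a, eps):
    """Auction algorithm for an n x n assignment problem.

    a[i][j] is the benefit of assigning person i to object j, and eps is
    the minimum bidding increment.  Returns (owner, p), where owner[j] is
    the person assigned to object j and p is the final price vector.
    """
    n = len(a)
    p = [0.0] * n                  # object prices
    owner = [None] * n             # owner[j] = person currently holding j
    unassigned = list(range(n))
    while unassigned:
        i = unassigned.pop()
        # Best value v_i, best object j_i, and second-best value w_i.
        values = [a[i][j] - p[j] for j in range(n)]
        ji = max(range(n), key=values.__getitem__)
        vi = values[ji]
        wi = max((values[j] for j in range(n) if j != ji), default=vi)
        # Bid: raise the price of j_i by gamma_i = v_i - w_i + eps,
        # which preserves eps-CS [cf. Eq. (1.16)].
        p[ji] += vi - wi + eps
        if owner[ji] is not None:
            unassigned.append(owner[ji])   # previous owner is displaced
        owner[ji] = i
    return owner, p
```

With integer benefits and ε < 1/n, the complete assignment found on termination is optimal, per the discussion that follows in the text.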
Sec. 1.3 Network Flow Algorithms – An Overview 31
Figure 1.12: Illustration of how the naive auction algorithm may never terminate
for a problem involving three persons and three objects. Here objects 1 and 2
offer benefit C > 0 to all persons, and object 3 offers benefit 0 to all persons. The
algorithm cycles as persons 2 and 3 alternately bid for object 2 without changing
its price, because they equally prefer objects 1 and 2.
Figure 1.13: In the auction algorithm, even after the price of the preferred
object ji is increased by the bidding increment γi , ji will be within ε of being
most preferred, so the ε-CS condition holds at the end of the iteration.
Figure 1.14: Illustration of how the auction algorithm, by making the bidding
increment at least ε, overcomes the cycling difficulty for the example of Fig. 1.12.
The table shows one possible sequence of bids and assignments generated by
the auction algorithm, starting with all prices equal to 0 and with the partial
assignment {(1, 1), (2, 2)}. At each iteration except the last, the person assigned
to object 3 bids for either object 1 or 2, increasing its price by ε in the first iteration
and by 2ε in each subsequent iteration. In the last iteration, after the prices of 1
and 2 reach or exceed C, object 3 receives a bid and the auction terminates.
Suppose now that the benefits aij are all integer, which is the typical
practical case. (If aij are rational numbers, they can be scaled up to integer
by multiplication with a suitable common number.) Then the total benefit
of any assignment is integer, so if nε < 1, any complete assignment that is
within nε of being optimal must be optimal. It follows that if

ε < 1/n

and the benefits aij are all integer, then the assignment obtained upon ter-
mination of the auction algorithm is optimal .
Figure 1.15 shows the sequence of generated object prices for the ex-
ample of Fig. 1.12 in relation to the contours of the dual cost function.
It can be seen from this figure that each bid has the effect of setting the
price of the object receiving the bid nearly equal (within ε) to the price
that minimizes the dual cost with respect to that price, with all other
prices held fixed (this will be shown rigorously in Section 7.1). Successive
minimization of a cost function along single coordinates is a central fea-
ture of coordinate descent and relaxation methods, which are popular for
unconstrained minimization of smooth functions and for solving systems
of smooth equations. Thus, the auction algorithm can be interpreted as
an approximate coordinate descent method; as such, it is related to the
relaxation method discussed in the previous subsection.
Scaling
C = max(i,j)∈A |aij |.
[Figure 1.15: Contours of the dual function in the space of the object prices
p1 and p2 , with the price p3 fixed at 0, for the example of Fig. 1.12. The
generated object prices move in increments of at least ε toward the level C.]
(c) The ability to perform sensitivity analysis (resolve the problem with
slightly different problem data) quickly.
(d) The ability to take advantage of parallel computing hardware.
Given the diversity of these considerations, it is not surprising that
there is no algorithm that will dominate the others in all or even most
practical situations. Otherwise expressed, every type of algorithm that we
will discuss is best given the right type of practical situation. Thus, to
make intelligent choices, the practitioner needs to understand the proper-
ties of different algorithms relating to speed of convergence, flexibility, par-
allelization, and suitability for specific problem structures. For challenging
problems, the choice of algorithm is often settled by experimentation with
several candidates.
A theoretical analyst may also have difficulty ranking different algo-
rithms for specific types of problems. The most common approach for this
purpose is worst-case computational complexity analysis. For example, for
the minimum cost flow problem, one tries to bound the number of elemen-
tary numerical operations needed by a given algorithm with some measure
of the “problem size,” that is, with some expression of the form
Kf (N, A, C, U, S),
where
N is the number of nodes,
A is the number of arcs,
C is the arc cost range max(i,j)∈A |aij |,
U is the maximum arc flow range max(i,j)∈A (cij − bij ),
S is the supply range maxi∈N |si |,
f is some known function,
K is a (usually unknown) constant.
If a bound of this form can be found, we say that the running time or
operation count of the algorithm is O(f (N, A, C, U, S)). If f (N, A, C, U, S)
can be written as a polynomial function of the number of bits needed to
express the problem data, the algorithm is said to be polynomial . Exam-
ples of polynomial complexity bounds are O(N^α A^β) and O(N^α A^β log C),
where α and β are positive integers, and the numbers aij are assumed in-
teger. The bound O(N^α A^β) is sometimes said to be strongly polynomial
because it involves only the graph size parameters. A bound of the form
O(N^α A^β C) is not polynomial, even assuming that the aij are integer, be-
cause C is not a polynomial expression of log C, the number of bits needed
to express a single number aij . Bounds like O(N^α A^β C), which are poly-
nomial in the problem data rather than in the number of bits needed to
express the data, are called pseudopolynomial .
Sec. 1.4 Notes, Sources, and Exercises 37
Network problems are discussed in many books (Berge [1962], Berge and
Ghouila-Houri [1962], Ford and Fulkerson [1962], Dantzig [1963], Busacker
and Saaty [1965], Hu [1969], Iri [1969], Frank and Frisch [1970], Christofides
[1975], Zoutendijk [1976], Minieka [1978], Jensen and Barnes [1980], Ken-
nington and Helgason [1980], Papadimitriou and Steiglitz [1982], Chvatal
[1983], Gondran and Minoux [1984], Luenberger [1984], Rockafellar [1984],
Bazaraa, Jarvis, and Sherali [1990], Bertsekas [1991a], Murty [1992], Bert-
simas and Tsitsiklis [1997]). Several of these books discuss linear program-
ming first and develop linear network optimization as a special case. An
alternative approach that relies heavily on duality is given by Rockafellar
[1984]. The conformal realization theorem (Prop. 1.1) has been developed
in different forms in several sources, including Ford and Fulkerson [1962],
Busacker and Saaty [1965], and Rockafellar [1984].
The primal cost improvement approach for network optimization was
initiated by Dantzig [1951], who specialized the simplex method to the
transportation problem. The extensive subsequent work using this ap-
proach is surveyed at the end of Chapter 5.
The dual cost improvement approach was initiated by Kuhn [1955]
who proposed the Hungarian method for the assignment problem. (The
name of the algorithm honors its connection with the research of the Hun-
garian mathematicians Egervary [1931] and König [1931].) Work using this
approach is surveyed in Chapter 6.
The auction approach was initiated in Bertsekas [1979a] for the as-
signment problem, and in Bertsekas [1986a], [1986b] for the minimum cost
flow problem. Work using this approach is surveyed at the end of Chapter
7.
EXERCISES
1.1
Figure 1.16: Flow vector for Exercise 1.1. The arc flows are the numbers shown
next to the arcs.
1.2

Prove the conformal realization theorem (Prop. 1.1) by completing the details
of the following argument. Assume first that x is a circulation. Consider the
following procedure by which, given x, we obtain a simple cycle flow x′ that
conforms to x (see Fig. 1.17). Choose an arc (i, j) with xij ≠ 0. Assume that xij > 0. (A
similar procedure can be used when xij < 0.) Construct a sequence of node
subsets T0 , T1 , . . ., as follows: Take T0 = {j}. For k = 0, 1, . . ., given Tk , let
Tk+1 = {n ∉ ∪kp=0 Tp | there is a node m ∈ Tk , and either an arc (m, n)
such that xmn > 0 or an arc (n, m) such that xnm < 0},
and mark each node n ∈ Tk+1 with the label “(m, n)” or “(n, m),” where m
is a node of Tk such that xmn > 0 or xnm < 0, respectively. The procedure
terminates when Tk+1 is empty.
At the end of the procedure, trace labels backward from i until node j is
reached. (How do we know that i belongs to one of the sets Tk ?) In particular,
let “(i1 , i)” or “(i, i1 )” be the label of i, let “(i2 , i1 )” or “(i1 , i2 )” be the label
of i1 , etc., until a node ik with label “(ik , j)” or “(j, ik )” is found. The cycle
C = (j, ik , ik−1 , . . . , i1 , i, j) is simple, it contains (i, j) as a forward arc, and is
such that all its forward arcs have positive flow and all its backward arcs have
negative flow. Let a = min(m,n)∈C |xmn | > 0. Then the simple cycle flow x′,
where

x′ij = a if (i, j) ∈ C + ,   x′ij = −a if (i, j) ∈ C − ,   x′ij = 0 otherwise,

has the required properties.
Now subtract x′ from x. We have xij − x′ij > 0 only for arcs (i, j) with
xij > 0, xij − x′ij < 0 only for arcs (i, j) with xij < 0, and xij − x′ij = 0 for at
least one arc (i, j) with xij ≠ 0. If x is integer, then x′ and x − x′ will also be
integer. We then repeat the process (for at most A times) with the circulation x
replaced by the circulation x − x′, and so on, until the zero flow is obtained.

Figure 1.17: Construction of a cycle of arcs with nonzero flow used in the proof
of the conformal realization theorem.
If x is not a circulation, we form an enlarged graph by introducing a new
node s and by introducing for each node i ∈ N an arc (s, i) with flow xsi equal
to the divergence yi . The resulting flow vector is seen to be a circulation in the
enlarged graph (why?). This circulation, by the result just shown, can be decom-
posed into at most A + N simple cycle flows of the enlarged graph, conforming
to the flow vector. Out of these cycle flows, we consider those containing node
s, and we remove s and its two incident arcs while leaving the other cycle flows
unchanged. As a result we obtain a set of at most A+N path flows of the original
graph, which add up to x. These path flows also conform to x, as required.
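As an illustration of the construction in this argument, here is a Python sketch that peels conforming simple cycle flows off a circulation. The dict-of-arcs representation and the walking scheme (which follows positive flow forward and negative flow backward, rather than building the label sets Tk explicitly) are our own choices.

```python
def decompose_circulation(x):
    """Decompose a circulation x (dict mapping arcs (i, j) to flows, with
    zero divergence at every node) into conforming simple cycle flows.

    Returns a list of (cycle, a) pairs, where cycle is a list of
    (arc, sign) with sign +1 for forward arcs and -1 for backward arcs,
    and a > 0 is the cycle's flow value.
    """
    x = {arc: f for arc, f in x.items() if f != 0}

    def step(u):
        # Leave node u along an arc with positive flow, or backward along
        # an arc with negative flow; zero divergence guarantees such an
        # arc exists as long as some arc incident to u is nonzero.
        for (m, n), f in x.items():
            if m == u and f > 0:
                return ((m, n), +1, n)
            if n == u and f < 0:
                return ((m, n), -1, m)
        raise ValueError("x is not a circulation")

    cycles = []
    while x:
        u = next(iter(x))[0]          # start the walk at any nonzero arc
        path, seen = [], {}
        while u not in seen:
            seen[u] = len(path)
            arc, sign, u = step(u)
            path.append((arc, sign))
        cycle = path[seen[u]:]        # the walk closed a simple cycle
        a = min(abs(x[arc]) for arc, _ in cycle)
        for arc, sign in cycle:       # subtract the cycle flow; at least
            x[arc] -= sign * a        # one arc flow drops to zero
            if x[arc] == 0:
                del x[arc]
        cycles.append((cycle, a))
    return cycles
```

Each extraction zeroes at least one arc, so the loop runs at most A times, matching the bound in the argument above.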
1.3
Use the algorithm of Exercise 1.2 to decompose the flow vector of Fig. 1.16 into
conforming simple path flows.
1.4

(a) Use the conformal realization theorem (Prop. 1.1) to show that a forward
path P can be decomposed into a (possibly empty) collection of simple
forward cycles, together with a simple forward path that has the same
start node and end node as P . (Here “decomposition” means that the
union of the arcs of the component paths is equal to the set of arcs of P
with the multiplicity of repeated arcs properly accounted for.)
(b) Suppose that a graph is strongly connected and that a length aij is given for
every arc (i, j). Show that if all forward cycles have nonnegative length,
then there exists a shortest path from any node s to any node t. Show
also that if there exists a shortest path from some node s to some node t,
then all forward cycles have nonnegative length. Why is the connectivity
assumption needed?
1.5

Consider a graph such that each of the nodes has even degree.
(a) Give an algorithm to decompose the graph into a collection of simple cycles
that are disjoint, in the sense that they share no arcs (although they may
share some nodes). (Here “decomposition” means that the union of the
arcs of the component cycles is equal to the set of arcs of the graph.) Hint:
Given a connected graph where each of the nodes has even degree, the
deletion of the arcs of any cycle creates some connected subgraphs where
each of the nodes has even degree (including possibly some isolated nodes).
(b) Assume in addition that the graph is connected. Show that there is an
Euler cycle, i.e., a cycle that contains all the arcs of a graph exactly once.
Hint: Apply the decomposition of part (a), and successively merge an Euler
cycle of a subgraph with a simple cycle.
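The merging argument of the hint is essentially Hierholzer's algorithm; here is a Python sketch for a connected undirected graph with all degrees even, using an adjacency-list representation of our own choosing.

```python
def euler_cycle(adj, start):
    """Construct an Euler cycle of a connected undirected graph in which
    every node has even degree (cf. part (b) of the exercise).

    adj maps each node to a list of neighbors; each edge {u, v} appears
    in both adj[u] and adj[v].  Returns the cycle as a list of nodes,
    with cycle[0] == cycle[-1].
    """
    remaining = {u: list(vs) for u, vs in adj.items()}   # unused edges
    stack, cycle = [start], []
    while stack:
        u = stack[-1]
        if remaining[u]:
            v = remaining[u].pop()
            remaining[v].remove(u)       # consume the edge {u, v}
            stack.append(v)              # extend the current walk
        else:
            cycle.append(stack.pop())    # node exhausted: emit it
    return cycle
```

The stack holds the walk in progress; whenever a node's edges are exhausted, the walk backtracks and the node is emitted, which implicitly performs the cycle merging described in the hint.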
1.6
In the graph of Fig. 1.16, consider the graph obtained by deleting node 1 and
arcs (1, 2), (1, 3), and (5, 4). Decompose this graph into a collection of simple
cycles that are disjoint (cf. Exercise 1.5) and construct an Euler cycle.
1.7
For every square of its color, there should be exactly one illegal move that
either starts or ends at that square.
1.8

Consider a graph and the question whether there exists a forward cycle that
passes through each arc of the graph exactly once. Show that such a cycle exists
if and only if the graph is connected and the number of incoming arcs to each
node is equal to the number of outgoing arcs from the node.
1.9
Consider a graph and the question whether there exists a path that passes through
each arc of the graph exactly once. Show that such a path exists if and only if
the graph is connected, and either the degrees of all the nodes are even, or else
the degrees of all the nodes except two are even.
1.11
In shatranj, the old version of chess, the firz (or vizier, the predecessor to the
modern queen) can move one square diagonally in each direction. Show that
starting at a corner of an n × n chessboard where n is even, the firz can reach
the opposite corner after making each of the possible moves along its diagonals
exactly once and in one direction only [of the two moves (a, b) and (b, a) only one
should be made].
1.12
Show that the number of nodes with odd degree in a graph is even.
1.13
Assume that all the nodes of a graph have degree greater than one. Show that
the graph must contain a cycle.
1.14
(a) Show that every tree with at least two nodes has at least two nodes with
degree one.
(b) Show that a graph is a tree if and only if it is connected and the number
of arcs is one less than the number of nodes.
1.15
Consider a volleyball net that consists of a mesh with m squares in the horizontal
dimension and n squares in the vertical. What is the maximum number of strings
that can be cut before the net falls apart into two pieces?
minimize Σ(i,j)∈A aij xij

subject to si ≤ Σ{j|(i,j)∈A} xij − Σ{j|(j,i)∈A} xji ≤ s̄i , ∀ i ∈ N,

where the bounds si and s̄i on the divergence of node i are given. Show that
this problem can be converted to a standard (equality constrained) minimum
cost flow problem by adding an extra node A and an arc (A, i) from this node to
every other node i, with feasible flow range [0, s̄i − si ].
Consider the minimum cost flow problem with the additional constraints that
the total flow of the outgoing arcs from each node i must lie within a given range
[ti , t̄i ], that is,

ti ≤ Σ{j|(i,j)∈A} xij ≤ t̄i .
Convert this problem into the standard form of the minimum cost flow problem
by splitting each node into two nodes with a connecting arc.
Consider the minimum cost flow problem with the difference that, instead of the
linear form aij xij , each arc’s cost function has the piecewise linear form

fij (xij ) = a1ij xij if bij ≤ xij ≤ mij ,
fij (xij ) = a1ij mij + a2ij (xij − mij ) if mij ≤ xij ≤ cij ,

where mij , a1ij , and a2ij are given scalars satisfying bij ≤ mij ≤ cij and a1ij ≤ a2ij .
(a) Show that the problem can be converted to a linear minimum cost flow
problem where each arc (i, j) is replaced by two arcs with arc cost coeffi-
cients a1ij and a2ij , and arc flow ranges [bij , mij ] and [0, cij − mij ], respec-
tively.
(b) Generalize to the case of piecewise linear cost functions with more than
two pieces.
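The conversion in part (a) rests on the condition a1ij ≤ a2ij, which ensures that the cheaper of the two replacement arcs fills up first. The following Python sketch (our own illustration, not from the text) checks this equivalence for a single arc:

```python
def f_original(x, b, m, c, a1, a2):
    """The two-piece arc cost f_ij of the exercise."""
    assert b <= x <= c
    return a1 * x if x <= m else a1 * m + a2 * (x - m)

def f_split(x, b, m, c, a1, a2):
    """Minimum cost of carrying total flow x on the two replacement arcs,
    with coefficients a1, a2 and flow ranges [b, m] and [0, c - m].
    Because a1 <= a2, it is optimal to load the a1-arc first."""
    assert b <= x <= c and b <= m <= c and a1 <= a2
    y1 = min(x, m)        # flow on the arc with cost coefficient a1
    y2 = x - y1           # remainder goes on the a2-arc
    return a1 * y1 + a2 * y2
```

Note that if a1ij > a2ij the split would fail, since an optimal solution of the linear problem could use the a2-arc before the a1-arc is full.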
Consider an assignment problem where the number of objects is larger than the
number of persons, and we require that each person be assigned to one object.
The associated linear program (cf. Example 1.2) is
maximize Σ(i,j)∈A aij xij

subject to Σ{j|(i,j)∈A} xij = 1, ∀ i = 1, . . . , m,

Σ{i|(i,j)∈A} xij ≤ 1, ∀ j = 1, . . . , n,

0 ≤ xij ≤ 1, ∀ (i, j) ∈ A,

where m < n.
(a) Show how to formulate this problem as a minimum cost flow problem by
introducing extra arcs and nodes.
(b) Repeat part (a) for the case where there may be some persons that are
left unassigned; that is, the constraint Σ{j|(i,j)∈A} xij = 1 is replaced by
Σ{j|(i,j)∈A} xij ≤ 1. Give an example of a problem with aij > 0 for all
(i, j) ∈ A, which is such that in the optimal assignment some persons are
left unassigned, even though there exist feasible assignments that assign
every person to some object.
(c) Formulate an asymmetric transportation problem where the total supply
is less than the total demand, but some demand may be left unsatisfied,
and appropriately modify your answers to parts (a) and (b).
Bipartite matching problems are assignment problems where the coefficients aij
are all equal to 1. In such problems, we want to maximize the cardinality of the
assignment, that is, the number of assigned pairs (i, j). Formulate a bipartite
matching problem as an equivalent max-flow problem.
Σi=0,...,N−1 (ai xi + bi ui ),
where ai and bi are given scalars for each i. Formulate this problem as a minimum
cost flow problem. Hint: For each i, introduce a node that connects to a special
artificial node.
Σi=0,...,N−1 (ai xi + bi ui )
is minimized, where ai and bi are given scalars for each i. Formulate the problem
as a minimum cost flow problem.
Consider a transhipment context for the minimum cost flow problem where the
problem is to optimally transfer flow from some supply points to some demand
points over arcs of limited capacity. In a dynamic version of this context, the
transfer is to be performed over N time units, and transferring flow along an arc
(i, j) requires time τij , which is a given positive integer number of time units.
This means that at each time t = 0, . . . , N − τij , we may send from node i along
arc (i, j) a flow xij ∈ [0, cij ], which will arrive at node j at time t+τij . Formulate
this problem as a minimum cost flow problem involving a copy of the given graph
for each time period.
1.26
Consider a round-robin chess tournament involving n players that play each other
once. A win scores 1 for the winner and 0 for the loser, while a draw scores 1/2
for each player. We are given a set of final scores (s1 , . . . , sn ) for the players, from
the range [0, n − 1], whose sum is n(n − 1)/2, and we want to check whether these
scores are feasible [for example, in a four-player tournament, a set of final scores
of (3, 3, 0, 0) is impossible]. Show that this is equivalent to checking feasibility of
some transportation problem.
Consider the k-color problem, which is to assign one out of k colors to each node
of a graph so that for every arc (i, j), nodes i and j have different colors.
(a) Suppose we want to choose the colors of countries in a world map so that
no two adjacent countries have the same color. Show that if the number of
available colors is k, the problem can be formulated as a k-color problem.
(b) Show that the k-color problem has a solution if and only if the nodes can
be partitioned into k or fewer disjoint subsets such that there is no
arc connecting a pair of nodes from the same subset.
(c) Show that when the graph is a tree, the 2-color problem has a solution.
Hint: First color some node i and then color the remaining nodes based on
their “distance” from i.
(d) Show that if each node has at most k − 1 neighbors, the k-color problem
has a solution.
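The hint in part (c) translates directly into a breadth-first traversal that colors each node by the parity of its distance from the starting node i; here is a Python sketch under our own adjacency-list representation.

```python
from collections import deque

def two_color_tree(adj, root):
    """2-color a tree: color each node by the parity of its distance from
    the root (cf. the hint in part (c)).  adj maps each node to the list
    of its neighbors.  Returns a dict mapping each node to a color in
    {0, 1}."""
    color = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in color:             # tree: each node reached once
                color[v] = 1 - color[u]    # opposite color to its parent
                queue.append(v)
    return color
```

Since a tree has no cycles, every edge joins a node to its parent in the traversal, so the endpoints of every edge receive different colors.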
xj := fj (x), j = 1, . . . , n,
Consider the minimum cost flow problem and let pj be a scalar price for each
node j. Show that if the arc cost coefficients aij are replaced by aij + pj − pi ,
we obtain a problem that is equivalent to the original (except for a scalar shift
in the cost function value).
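As a quick numerical check of this equivalence (our own illustration, not from the text): for any flow vector x, replacing aij by aij + pj − pi shifts the cost by Σ(i,j) (pj − pi) xij = −Σi pi si, which depends only on the divergences si, not on x itself.

```python
def flow_cost(a, x):
    """Cost of flow x (dict arc -> flow) under coefficients a."""
    return sum(a[e] * x[e] for e in x)

def shifted(a, p):
    """Replace each coefficient a_ij by a_ij + p_j - p_i."""
    return {(i, j): aij + p[j] - p[i] for (i, j), aij in a.items()}

def divergence(x):
    """Divergence (outflow minus inflow) of each node touched by x."""
    s = {}
    for (i, j), f in x.items():
        s[i] = s.get(i, 0) + f
        s[j] = s.get(j, 0) - f
    return s
```

Two flows with equal divergences therefore keep the same cost difference under the transformed coefficients, so the two problems have the same optimal solutions.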
1.30
Consider the assignment problem. Let pj denote the price of object j, let T be a
subset of objects, and let
S = {i | the maximum of aij − pj over j ∈ A(i) is attained by some element of T }.
Assume that:
(1) For each i ∈ S, the maximum of aij − pj over j ∈ A(i) is attained only by
elements of T .
(2) S has more elements than T .
Show that the direction d = (d1 , . . . , dn ), where dj = 1 if j ∈ T and dj = 0 if
j ∉ T , is a direction of dual cost improvement. Note: Directions of this type are
used by the most common dual cost improvement algorithms for the assignment
problem.
1.32
Use ε-CS to verify that the assignment of Fig. 1.18 is optimal and obtain a bound
on how far from optimal the given price vector is. State the dual problem and
verify the correctness of the bound by comparing the dual value of the price
vector with the optimal dual value.
Figure 1.18: Graph of an assignment problem. Objects 1 and 2 have value C for
all persons, and object 3 has value 0 for all persons. The object prices are
p1 = C − 1/8, p2 = C + 1/8, and p3 = 0. The thick lines indicate the given
assignment.
Consider the minimum cost flow problem of Exercise 1.33, where the upper
bounds cij are given positive integers and the supplies si are given integers.
Assume that the problem has at least one feasible solution. Show that there
exists an optimal flow vector that is integer. Hint: Show that the flow vectors
generated by the negative cycle algorithm of Exercise 1.33 are integer.
The origins of the traveling salesman problem can be traced (among others) to the
work of the Irish mathematician Sir William Hamilton. In 1856, he developed a
system of commutative algebra, which inspired a puzzle marketed as the “Icosian
Game.” The puzzle is to find a cycle that passes exactly once through each
of the 20 nodes of the graph shown in Fig. 1.19, which represents a regular
dodecahedron. Find a Hamiltonian cycle on this graph using as first four nodes
the ones marked 1-4 (all arcs are considered bidirectional).
Figure 1.19: Graph for the Icosian Game (cf. Exercise 1.35). The arcs and nodes
correspond to the edges and vertices of the regular dodecahedron, respectively.
The name “icosian” comes from the Greek word “icosi,” which means twenty.
Adjacent nodes of the dodecahedron correspond to adjacent faces of the regular
icosahedron.
52 The Shortest Path Problem Chap. 2
This path is said to be shortest if it has minimum length over all forward
paths with the same origin and destination nodes. The length of a shortest
path is also called the shortest distance. The shortest path problem deals
with finding shortest distances between selected pairs of nodes. [Note that
here we are optimizing over forward paths; when we refer to a path (or a
cycle) in connection with the shortest path problem, we implicitly assume
that the path (or the cycle) is forward.]
The range of applications of the shortest path problem is very broad.
In the next section, we will provide some representative examples. We
will then develop a variety of algorithms. Most of these algorithms can be
viewed as primal cost or dual cost improvement algorithms for an appro-
priate special case of the minimum cost flow problem, as we will see later.
However, the shortest path problem is simple, so we will discuss it based
on first principles, and without much reference to cost improvement. This
serves a dual purpose. First, it provides an opportunity to illustrate some
basic graph concepts in the context of a problem that is simple and rich in
intuition. Second, it allows the early development of some ideas and results
that will be used later in a variety of other algorithmic contexts.
simply a path with minimum number of links. More generally, the length
of a link may depend on its transmission capacity and its projected traffic
load. The idea here is that a shortest path should contain relatively few and
uncongested links, and therefore be desirable for routing. Sophisticated rout-
ing algorithms also allow the length of each link to change over time and to
depend on the prevailing congestion level of the link. Then a shortest path
may adapt to temporary overloads and route packets around points of con-
gestion. Within this context, the shortest path routing algorithm operates
continuously, solving the shortest path problem with lengths that vary over
time.
A peculiar feature of shortest path routing algorithms is that they are
often implemented using distributed and asynchronous communication and
computation. In particular, each node of the communication network mon-
itors the traffic conditions of its adjacent links, calculates estimates of its
shortest distances to various destinations, and passes these estimates to other
nodes who adjust their own estimates, etc. This process is based on stan-
dard shortest path algorithms that will be discussed in this chapter, but it
is also executed asynchronously, and with out-of-date information because of
communication delays between the nodes. Despite this fact, it turns out that
these distributed asynchronous algorithms maintain much of the validity of
their synchronous counterparts (see the textbooks by Bertsekas and Tsitsiklis
[1989], and Bertsekas and Gallager [1992] for related analysis).
where uk is a control that takes values from a given finite set, which may
depend on the index k. This transition involves a cost gk (xk , uk ). The final
transition, from xN−1 to xN , involves an additional terminal cost G(xN ).
Here, the functions fk , gk , and G are given.
Given a control sequence (u0 , . . . , uN −1 ), the corresponding state se-
quence (x0 , . . . , xN ) is determined from the given initial state x0 and the
system of Eq. (2.1). The objective in dynamic programming is to find a
control sequence and a corresponding state sequence such that the total cost
G(xN ) + Σk=0,...,N−1 gk (xk , uk )
is minimized.
For an example, consider an inventory system that operates over N
time periods, and let xk and uk denote the number of items held in stock and
number of items purchased at the beginning of period k, respectively. We
require that uk be an integer from a given range [0, rk ]. We assume that the
stock evolves according to the equation
xk+1 = xk + uk − vk ,
where vk is a known integer demand for period k; this is the system equa-
tion [cf. Eq. (2.1)]. A negative xk here indicates unsatisfied demand that is
backordered. A common type of cost used in inventory problems has the form
gk (xk , uk ) = hk (xk ) + ck uk ,
where ck is a given cost per unit stock at period k, and hk (xk ) is a cost either
for carrying excess inventory (xk > 0) or for backordering demand (xk < 0).
For example, hk (xk ) = max{ak xk , −bk xk } and hk (xk ) = dk x2k , where ak , bk , and
dk are positive scalars, are both reasonable choices for the cost function. Finally,
we could take G(xN ) = 0 to indicate that the final stock xN has no value
[otherwise G(xN ) indicates the cost (or negative salvage value) of xN ]. The
objective in this problem is roughly to determine the sequence of purchases
over time to minimize the costs of excess inventory and backordering demand
over the N time periods.
To convert the dynamic programming problem to a shortest path prob-
lem, we introduce a graph such as the one of Fig. 2.1, where the arcs corre-
spond to transitions between states at successive stages and each arc has a
cost associated with it. To handle the final stage, we also add an artificial
terminal node t. Each state xN at stage N is connected to the terminal node
t with an arc having cost G(xN ). Control sequences correspond to paths
originating at the initial state x0 and terminating at one of the nodes corre-
sponding to the final stage N . The optimal control sequence corresponds to a
shortest path from node x0 to node t. For an extensive treatment of dynamic
programming and associated shortest path algorithms we refer to Bertsekas
[1995a].
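The equivalence can be exercised with a small backward recursion over the graph of Fig. 2.1; the code below is our own sketch, with f, g, and G passed in as Python functions and states assumed hashable.

```python
from functools import lru_cache

def dp_optimal_cost(x0, N, controls, f, g, G):
    """Optimal cost of the N-stage problem: the length of a shortest path
    from the node (0, x0) to the artificial terminal node t in the graph
    of Fig. 2.1.  controls(k, x) lists the feasible controls at (k, x);
    f(k, x, u) is the system function, g(k, x, u) the stage cost, and
    G(x) the terminal cost."""

    @lru_cache(maxsize=None)
    def J(k, x):                       # optimal cost-to-go from (k, x)
        if k == N:
            return G(x)                # terminal arc into node t
        return min(g(k, x, u) + J(k + 1, f(k, x, u))
                   for u in controls(k, x))

    return J(0, x0)
```

For the inventory example, the states are the integer stock levels xk, the controls are the purchases uk ∈ [0, rk], and f(k, x, u) = x + u − vk.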
Figure 2.1: Graph representation of the dynamic programming problem. The
nodes are the initial state x0 and the states x1 , x2 , . . . , xN of the successive
stages; the arcs correspond to transitions between states at successive stages.
Each final state xN is connected to the artificial terminal node t by a terminal
arc with cost equal to the terminal cost G(xN ).
known in advance. We want to find the time required to complete the project,
as well as the critical activities, those that even if slightly delayed will result
in a corresponding delay of completion of the overall project.
The problem can be represented by a graph where nodes represent
completion of some phase of the project (cf. Fig. 2.2). An arc (i, j) represents
an activity that starts once phase i is completed and has known duration
tij > 0. A phase (node) j is completed when all activities or arcs (i, j) that
are incoming to j are completed. Two special nodes 1 and N represent the
start and end of the project, respectively. Node 1 has no incoming arcs,
while node N has no outgoing arcs. Furthermore, there is at least one path
from node 1 to every other node. An important characteristic of an activity
network is that it is acyclic. This is inherent in the problem formulation and
the interpretation of nodes as phase completions.
For any path p = ((1, j1 ), (j1 , j2 ), . . . , (jk , i)) from node 1 to a node
i, let Dp be the duration of the path, defined as the sum of the durations of its
activities. Then the time Ti required to complete phase i is

Ti = max{p: path from 1 to i} Dp .
The maximum above is attained by some path because there can be only a
finite number of paths from 1 to i, since the network is acyclic. Thus to find
Figure 2.2: An example activity network, with the arc durations shown next to
the arcs. The activities, between the start and end nodes of the project, include
ordering and transporting material, hiring and training personnel, and
construction.
Ti , we should find the longest path from 1 to i. Because the graph is acyclic,
this problem may also be viewed as a shortest path problem with the length
of each arc (i, j) being −tij . In particular, finding the duration of the project
is equivalent to finding the shortest path from 1 to N . For further discussion
of project management problems, we refer to the literature, e.g., the textbook
by Elmaghraby [1978].
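Because the network is acyclic, the longest-path times Ti can be computed in a single pass in topological order; the following Python sketch (our own, with a dict-of-arcs input) uses this observation directly rather than literally negating the arc lengths.

```python
from collections import deque

def earliest_times(arcs):
    """Longest-path times T_i in an acyclic activity network.

    arcs maps each activity (i, j) to its duration t_ij > 0.  Nodes with
    no incoming arcs (here, the start node) get time 0; every other node
    is assumed reachable from the start.  Returns a dict node -> T_i.
    """
    succ, indeg, nodes = {}, {}, set()
    for (i, j), t in arcs.items():
        succ.setdefault(i, []).append(j)
        indeg[j] = indeg.get(j, 0) + 1
        nodes |= {i, j}
    T = {u: 0 for u in nodes}
    queue = deque(u for u in nodes if u not in indeg)   # sources
    while queue:                       # Kahn's topological order
        i = queue.popleft()
        for j in succ.get(i, []):
            T[j] = max(T[j], T[i] + arcs[(i, j)])       # longest path so far
            indeg[j] -= 1
            if indeg[j] == 0:
                queue.append(j)
    return T
```

The project duration is then T at the end node, and critical activities are those arcs (i, j) with T[j] = T[i] + t_ij on some longest path to the end node.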
The shortest path problem can be posed in a number of ways; for example,
finding a shortest path from a single origin to a single destination, or finding
a shortest path from each of several origins to each of several destinations.
We focus initially on problems with a single origin and many destinations.
For concreteness, we take the origin node to be node 1. The arc lengths aij
are given scalars. They may be negative and/or noninteger, although on
occasion we will assume in our analysis that they are nonnegative and/or
integer, in which case we will state so explicitly.
In this section, we develop a broad class of shortest path algorithms
for the single origin/all destinations problem. These algorithms maintain
and adjust a vector (d1 , d2 , . . . , dN ), where each dj , called the label of node
j, is either a scalar or ∞. The use of labels is motivated by a simple
optimality condition, which is given in the following proposition.
Proof: By adding Eq. (2.3) over the arcs of P , we see that the length of
P is equal to the difference dik − di1 of labels of the end node and start
node of P . By adding Eq. (2.2) over the arcs of any other path P starting
at i1 and ending at ik , we see that the length of P must be no less than
dik − di1 . Therefore, P is a shortest path. Q.E.D.
The conditions (2.2) and (2.3) are called the complementary slackness
(CS) conditions for the shortest path problem. This terminology is moti-
vated by the connection of the shortest path problem with the minimum
cost flow problem (cf. Section 1.2.1); we will see in Chapter 4 that the CS
conditions of Prop. 2.1 are a special case of a general optimality condition
(also called CS condition) for the equivalent minimum cost flow problem
(in fact they are a special case of a corresponding CS condition for general
linear programs; see e.g., Bertsimas and Tsitsiklis [1997], Dantzig [1963]).
Furthermore, we will see that the scalars di in Prop. 2.1 are related to dual
variables.
Let us now describe a prototype shortest path method that contains
several interesting algorithms as special cases. In this method, we start
with some vector of labels (d1 , d2 , . . . , dN ), we successively select arcs (i, j)
that violate the CS condition (2.2), i.e., dj > di + aij , and we set
dj := di + aij .
Remove a node i from the candidate list V . For each outgoing arc
(i, j) ∈ A, if dj > di + aij , set

    dj := di + aij

and add j to V if it does not already belong to V .
† In the case of the origin node 1, we will interpret the label d1 as either the
length of a cycle that starts and ends at 1, or (in the case d1 = 0) the length of
the trivial “path” from 1 to itself.
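The iteration just described can be sketched in a few lines of code. The following is a minimal Python sketch, not from the text: the function name and the dict-based adjacency list representation are illustrative, and any rule may be used to select the node removed from the candidate list (here, an arbitrary set pop).

```python
INF = float("inf")

def generic_shortest_path(graph, origin):
    """Generic single origin/all destinations shortest path method.

    graph: dict mapping each node to a list of (successor, arc length)
    pairs. Returns the final labels {node: d}. Termination is
    guaranteed only if no negative length cycle is reachable from the
    origin.
    """
    d = {i: INF for i in graph}
    d[origin] = 0
    V = {origin}                       # candidate list
    while V:
        i = V.pop()                    # any removal rule is admissible
        for j, a_ij in graph[i]:
            if d[j] > d[i] + a_ij:     # arc (i, j) violates the CS condition
                d[j] = d[i] + a_ij
                V.add(j)               # j (re)enters the candidate list
    return d
```

Different rules for choosing the node removed from V yield the label setting and label correcting methods discussed in the sections that follow.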
Sec. 2.2 A Generic Shortest Path Algorithm 59
[Figure: a four-node graph with origin 1, with the arc lengths shown next to the arcs, together with the successive contents of the candidate list and node labels, tabulated below.]

Iteration  Candidate list V   Node labels    Node out of V
    1            {1}          (0, ∞, ∞, ∞)         1
    2           {2, 3}        (0, 3, 1, ∞)         2
    3           {3, 4}        (0, 3, 1, 5)         3
    4           {4, 2}        (0, 2, 1, 4)         4
    5            {2}          (0, 2, 1, 4)         2
                  Ø           (0, 2, 1, 4)
Figure 2.3: Illustration of the generic shortest path algorithm. The numbers
next to the arcs are the arc lengths. Note that node 2 enters the candidate list
twice. If in iteration 2 node 3 had been removed from V instead of node 2, each
node would have entered V only once. Thus, the order in which nodes are removed
from V is significant.
It can be seen that, in the course of the algorithm, the labels are
monotonically nonincreasing. Furthermore, we have
di < ∞ ⇐⇒ i has entered V at least once.
Figure 2.3 illustrates the algorithm. The following proposition gives its
main properties.
(b) If the algorithm terminates, then upon termination, for all j with
dj < ∞, dj is the shortest distance from 1 to j and

    dj = min_{(i,j)∈A} {di + aij}    if j ≠ 1,
    dj = 0                           if j = 1.        (2.4)
Proof: (a) We prove (i) by induction on the iteration count. Indeed, (i)
holds at the end of the first iteration since the nodes j ≠ 1 with dj < ∞
are those for which (1, j) is an arc and their labels are dj = a1j , while for
the origin 1, we have d1 = 0, which by convention is viewed as the length
of the trivial “path” from 1 to itself. Suppose that (i) holds at the start
of some iteration at which the node removed from V is i. Then di < ∞
(which is true for all nodes of V by the rules of the algorithm), and (by the
induction hypothesis) di is the length of some path Pi starting at 1 and
ending at i. When a label dj changes as a result of the iteration, dj is set
to di + aij , which is the length of the path consisting of Pi followed by arc
(i, j). Thus property (i) holds at the end of the iteration, completing the
induction proof.
To prove (ii), note that for any i, each time i is removed from V , the
condition dj ≤ di + aij is satisfied for all (i, j) ∈ A by the rules of the
algorithm. Up to the next entrance of i into V , di stays constant, while
the labels dj for all j with (i, j) ∈ A cannot increase, thereby preserving
the condition dj ≤ di + aij .
(b) We first introduce the sets
I = {i | di = ∞ upon termination},
condition (i) of part (a) that we cannot have dj < ∞ upon termination, so
j ∈ I.
We show now that for all j ∈ I, upon termination, dj is the shortest
distance from 1 to j and Eq. (2.4) holds. Indeed, conditions (i) and (ii) of
part (a) imply that upon termination we have, for all i ∈ I,
When some arc lengths are negative, Prop. 2.2 points to a way to
detect existence of a path that starts at the origin 1 and contains a cycle
of negative length. If such a path exists, it can be shown under mild
assumptions that the label of at least one node will diverge to −∞ (see
Exercise 2.32). We can thus monitor whether for some j we have
When this condition occurs, the path from 1 to j whose length is equal to
dj [as per Prop. 2.2(a)] must contain a negative cycle [if it were simple, it
would consist of at most N − 1 arcs, and its length could not be smaller
than (N − 1) min(i,j)∈A aij ; a similar argument would apply if it were not
simple but it contained only cycles of nonnegative length].
When all cycles have nonnegative length and there exists a path from node
1 to every node j, then Prop. 2.2 shows that the generic algorithm termi-
nates and that, upon termination, all labels are equal to the corresponding
shortest distances, and satisfy d1 = 0 and
...
obtaining a_{i2 i1} + · · · + a_{ik ik−1} + a_{i1 ik} = 0.] Since the subgraph is connected
and has N − 1 arcs, it must be a spanning tree. We call this subgraph a
shortest path spanning tree, and we note its special structure: it has a root
(node 1) and every arc of the tree is directed away from the root. The
preceding argument can also be used to show that Bellman’s equation has
no solution other than the shortest distances; see Exercise 2.5.
A shortest path spanning tree can also be constructed in the process
of executing the generic shortest path algorithm by recording the arc (i, j)
every time dj is decreased to di + aij ; see Exercise 2.4.
Advanced Initialization
The generic algorithm need not be started with the initial conditions
V = {1}, d1 = 0, di = ∞, ∀ i ≠ 1,
    di = min_{j∈V} dj .
dj := di + aij
Figure 2.5 illustrates the label setting method. Some insight into the
method can be gained by considering the set W of nodes that have already
been in V but are not currently in V :
    W = {i | di < ∞, i ∉ V }.        (2.9)
We will prove later, in Prop. 2.3(a), that as a consequence of the policy of
removing from V a minimum label node, W contains nodes with “small”
labels throughout the algorithm, in the sense that
    dj ≤ di ,    if j ∈ W and i ∉ W.        (2.8)
Using this property and the assumption aij ≥ 0, it can be seen that when
a node i is removed from V , we have, for all j ∈ W for which (i, j) is an
arc,
dj ≤ di + aij .
[Figure: a five-node graph with origin 1, with the arc lengths shown next to the arcs, together with the successive contents of the candidate list and node labels, tabulated below.]

Iteration  Candidate list V    Node labels      Node out of V
    1            {1}          (0, ∞, ∞, ∞, ∞)         1
    2           {2, 3}        (0, 2, 1, ∞, ∞)         3
    3           {2, 4}        (0, 2, 1, 4, ∞)         2
    4           {4, 5}        (0, 2, 1, 3, 2)         5
    5            {4}          (0, 2, 1, 3, 2)         4
                  Ø           (0, 2, 1, 3, 2)
Figure 2.5: Example illustrating the label setting method. At each iteration,
the node with the minimum label is removed from V . Each node enters V only
once.
Hence, once a node enters W , it stays in W and its label does not change
further. Thus, W can be viewed as the set of permanently labeled nodes,
that is, the nodes that have acquired a final label, which by Prop. 2.2, must
be equal to their shortest distance from the origin.
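The minimum label removal policy can be sketched in code as follows. This is a minimal Python sketch under the nonnegative arc length assumption; the function name and the dict-based adjacency list representation are illustrative, and the minimum label node is found by a linear scan, which corresponds to the straightforward O(N²) implementation.

```python
INF = float("inf")

def label_setting(graph, origin):
    """Label setting (Dijkstra) method: remove a minimum label node.

    graph: dict {node: [(successor, arc length), ...]}, all arc
    lengths nonnegative. Each node exits the candidate list V at most
    once, and its label is final when it does.
    """
    d = {i: INF for i in graph}
    d[origin] = 0
    V = {origin}
    W = set()                            # permanently labeled nodes
    while V:
        i = min(V, key=lambda n: d[n])   # minimum label node of V
        V.remove(i)
        W.add(i)                         # d[i] is now final
        for j, a_ij in graph[i]:
            if d[j] > d[i] + a_ij:
                d[j] = d[i] + a_ij
                V.add(j)                 # j cannot already be in W
    return d
```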
The following proposition makes the preceding argument precise and
proves some additional facts.
(iii) For each node j, consider simple paths that start at 1, end
at j, and have all their other nodes in W at the end of the
iteration. Then the label dj at the end of the iteration is
equal to the length of the shortest of these paths (dj = ∞
if no such path exists).
(b) The label setting method will terminate, and all nodes with a
final label that is finite will be removed from the candidate list
V exactly once in order of increasing shortest distance from node
1; that is, if the final labels of i and j are finite and satisfy di < dj ,
then i will be removed before j.
Proof: (a) Properties (i) and (ii) will be proved simultaneously by induc-
tion on the iteration count. Clearly (i) and (ii) hold for the initial iteration
at which node 1 exits V and enters W .
Suppose that (i) and (ii) hold for iteration k − 1, and suppose that
during iteration k, node i satisfies di = min_{j∈V} dj and exits V . Let W
and W̄ be the set of Eq. (2.9) at the start and at the end of iteration k,
respectively. Let dj and d̄j be the label of each node j at the start and at
the end of iteration k, respectively. Since by the induction hypothesis we
have dj ≤ di for all j ∈ W , and aij ≥ 0 for all arcs (i, j), it follows that
dj ≤ di + aij for all arcs (i, j) with j ∈ W . Hence, a node j ∈ W cannot
enter V at iteration k. This completes the induction proof of property (i),
and shows that

    W̄ = W ∪ {i}.

Thus, at iteration k, the only labels that may change are the labels d̄j
of nodes j ∉ W̄ such that (i, j) is an arc; the label d̄j at the end of the
iteration will be min{dj , di + aij }. Since aij ≥ 0, di ≤ dj for all j ∉ W ,
and d̄i = di , we must have d̄i ≤ d̄j for all j ∉ W̄ . Since by the induction
hypothesis we have dm ≤ di and d̄m = dm for all m ∈ W , it follows that
d̄m ≤ d̄j for all m ∈ W̄ and j ∉ W̄ . This completes the induction proof of
property (ii).
To prove property (iii), choose any node j and consider the subgraph
consisting of the nodes W ∪ {j} together with the arcs that have both
end nodes in W ∪ {j}. Consider also a modified shortest path problem
involving this subgraph, and the same origin and arc lengths as in the
original shortest path problem. In view of properties (i) and (ii), the label
setting method applied to the modified shortest path problem yields the
same sequence of nodes exiting V and the same sequence of labels as when
applied to the original problem up to the current iteration. By Prop.
2.2, the label setting method for the modified problem terminates with the
labels equal to the shortest distances of the modified problem at the current
iteration. This means that the labels at the end of the iteration have the
property stated in the proposition.
(b) Since there is no cycle with negative length, by Prop. 2.2(d), we see
that the label setting method will terminate. At each iteration the node
removed from V is added to W , and according to property (i) (proved
above), no node from W is ever returned to V . Therefore, each node
with a final label that is finite will be removed from V and simultaneously
entered in W exactly once, and, by the rules of the algorithm, its label
cannot change after its entrance in W . Property (ii) then shows that each
new node added to W has a label at least as large as the labels of the nodes
already in W . Therefore, the nodes are removed from V in the order stated
in the proposition. Q.E.D.
Here the nodes are organized as a binary heap on the basis of label values
and membership in V ; see Fig. 2.6. The node at the top of the heap is the
node of V that has minimum label, and the label of every node in V is no
larger than the labels of all the nodes that are in V and are its descendants
in the heap. Nodes that are not in V may be in the heap but may have no
descendants that are in V .
Figure 2.6: A binary heap organized on the basis of node labels is a binary
balanced tree such that the label of each node of V is no larger than the labels of
all its descendants that are in V . Nodes that are not in V may have no descendants
that are in V . The topmost node, called the root, has the minimum label. The
tree is balanced in that the numbers of arcs in the paths from the root to any
nodes with no descendants differ by at most 1. If the label of some node decreases,
the node must be moved upward toward the root, requiring O(log N ) operations.
[It takes O(1) operations to compare the label of a node i with the label of one
of its descendants j, and to interchange the positions of i and j if the label of j
is smaller. Since there are log N levels in the tree, it takes at most log N such
comparisons and interchanges to move a node upward to the appropriate position
once its label is decreased.] Similarly, when the topmost node is removed from V ,
moving the node downward to the appropriate level in the heap requires at most
log N steps and O(log N ) operations. (Each step requires the interchange of the
position of the node and the position of one of its descendants. The descendant
must be in V for the step to be executed; if both descendants are in V , the one
with smaller label is selected.)
At each iteration, the top node of the heap is removed from V . Fur-
thermore, the labels of some nodes already in V may decrease, so these
may have to be repositioned in the heap; also, some other nodes may enter
V for the first time and have to be inserted in the heap at the right place.
It can be seen that each of these removals, repositionings, and insertions
can be done in O(log N ) time. There are a total of N removals and N node
insertions, so the number of operations for maintaining the heap is
O((N + R) log N ), where R is the total number of node repositionings.
There is at most one repositioning per arc, since each arc is examined at
most once, so we have R ≤ A and the total operation count for maintaining
the heap is O(A log N ). This dominates the O(A) operation count to ex-
amine all arcs, so the worst-case running time of the method is O(A log N ).
On the other hand, practical experience indicates that the number of node
repositionings R is usually a small multiple of N , and considerably less
than the upper bound A. Thus, the running time of the method in prac-
tice typically grows approximately like O(A + N log N ).
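A heap-based implementation can be sketched as follows. Note one deviation from the repositioning scheme described above, which is an implementation choice and not from the text: Python's heapq module has no decrease-key (upward repositioning) operation, so this sketch pushes a fresh (label, node) entry whenever a label decreases and discards stale entries when they reach the top, a standard workaround with the same O(A log N) worst-case bound.

```python
import heapq

INF = float("inf")

def dijkstra_heap(graph, origin):
    """Label setting with a binary heap keyed on label values.

    graph: dict {node: [(successor, arc length), ...]}, all arc
    lengths nonnegative. Instead of repositioning a node in the heap
    when its label decreases, a new (label, node) entry is pushed and
    outdated entries are skipped on removal.
    """
    d = {i: INF for i in graph}
    d[origin] = 0
    heap = [(0, origin)]
    done = set()                   # permanently labeled nodes
    while heap:
        di, i = heapq.heappop(heap)
        if i in done:              # stale entry: i already finalized
            continue
        done.add(i)
        for j, a_ij in graph[i]:
            if d[j] > di + a_ij:
                d[j] = di + a_ij
                heapq.heappush(heap, (d[j], j))
    return d
```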
This algorithm, due to Dial [1969], requires that all arc lengths are non-
negative integers. It uses a naive yet often surprisingly effective method
for finding the minimum label node in V . The idea is to maintain for every
possible label value, a list of the nodes that have that value. Since every
finite label is equal to the length of some path with no cycles [Prop. 2.3(a),
part (iii)], the possible label values range from 0 to (N − 1)C, where
    C = max_{(i,j)∈A} aij .
Thus, we may scan the (N − 1)C + 1 possible label values (in ascending
order) and look for a label value with nonempty list, instead of scanning
the candidate list V .
To visualize the algorithm, it is useful to think of each integer in
the range [0, (N − 1)C] as some kind of container, referred to as a bucket.
Each bucket b holds the nodes with label equal to b. Tracing steps, we see
that the method starts with the origin node 1 in bucket 0 and all other
buckets empty. At the first iteration, each node j with (1, j) ∈ A enters
the candidate list V and is inserted in bucket a1j . After we are done with
bucket 0, we proceed to check bucket 1. If it is nonempty, we repeat the
process, removing from V all nodes with label 1 and moving other nodes
to smaller numbered buckets as required; if not, we check bucket 2, and so
on. Figure 2.7 illustrates the method with an example.
Let us now consider the efficient implementation of the algorithm. We
first note that a doubly linked list (see Fig. 2.8) can be used to maintain the
set of nodes belonging to a given bucket, so that checking the emptiness of
a bucket and inserting or removing a node from a bucket are easy, requiring
O(1) operations. With such a data structure, the time required for minimum
label node searching is O(N C), and the time required for adjusting
node labels and repositioning nodes between buckets is O(A). Thus the
total running time of the algorithm is O(A + N C).
Sec. 2.3 Label Setting (Dijkstra) Methods 71
[Figure 2.7: illustration of Dial's algorithm for a five-node graph with origin 1 and integer arc lengths, with the successive contents of the candidate list, node labels, and buckets tabulated below.]

Iteration  Candidate list V    Node labels      Bucket contents          Node out of V
    1            {1}          (0, ∞, ∞, ∞, ∞)   0: 1                           1
    2           {2, 3}        (0, 2, 1, ∞, ∞)   0: 1; 1: 3; 2: 2               3
    3           {2, 4}        (0, 2, 1, 4, ∞)   0: 1; 1: 3; 2: 2; 4: 4         2
    4           {4, 5}        (0, 2, 1, 3, 2)   0: 1; 1: 3; 2: 2, 5; 3: 4      5
    5            {4}          (0, 2, 1, 2, 2)   0: 1; 1: 3; 2: 2, 4, 5         4
                  Ø           (0, 2, 1, 2, 2)   0: 1; 1: 3; 2: 2, 4, 5
When the minimum arc length

    a = min_{(i,j)∈A} aij

is greater than 1, wider buckets can be used, with each bucket b holding the
nodes whose labels satisfy

    ab ≤ di ≤ a(b + 1) − 1.

The running time of the algorithm is then reduced to O(A + N C/a).
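Dial's bucket scheme can be sketched as follows. This is a minimal Python sketch with unit-width buckets; the function name is illustrative, and plain Python lists stand in for the doubly linked lists of the efficient implementation discussed here. Stale entries (nodes that have been repositioned to a lower bucket) are simply skipped when their old bucket is scanned.

```python
INF = float("inf")

def dial(graph, origin):
    """Dial's algorithm; nonnegative integer arc lengths assumed.

    graph: dict {node: [(successor, arc length), ...]}. Bucket b
    holds the candidate nodes whose current label equals b, and the
    buckets are scanned in ascending order of label value.
    """
    N = len(graph)
    C = max((a for arcs in graph.values() for _, a in arcs), default=0)
    d = {i: INF for i in graph}
    d[origin] = 0
    buckets = [[] for _ in range((N - 1) * C + 1)]
    buckets[0].append(origin)
    for b, bucket in enumerate(buckets):
        for i in bucket:              # appends during the scan are seen
            if d[i] != b:             # stale: i moved to a lower bucket
                continue
            for j, a_ij in graph[i]:
                if d[j] > b + a_ij:
                    d[j] = b + a_ij   # reposition j into bucket d[j]
                    buckets[d[j]].append(j)
    return d
```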
Bucket b       0   1   2   3   4   5   6   7   8
FIRST(b)       1   0   3   2   0   6   0   0   0

Node i         1   2   3   4   5   6   7
Label di       0   3   2   2   2   5   3
NEXT(i)        0   7   4   5   0   0   0
PREVIOUS(i)    0   0   0   3   4   0   2
Figure 2.8: Illustration of a doubly linked list data structure to maintain the can-
didate list V in buckets. In this example, the nodes in V are numbered 1, 2, . . . , 7,
and the buckets are numbered 0, 1, . . . , 8. A node i belongs to bucket b if di = b.
As shown in the first table, for each bucket b we maintain the first node of
the bucket in an array element FIRST(b), where FIRST(b) = 0 if bucket b is
empty.
As shown in the second table, for every node i we maintain two array
elements, NEXT(i) and PREVIOUS(i), giving the next node and the pre-
ceding node, respectively, of node i in the bucket where i is currently residing
[NEXT(i) = 0 or PREVIOUS(i) = 0 if i is the last node or the first node in its
bucket, respectively].
intelligent strategies of this type, one may obtain label setting methods
with very good polynomial complexity bounds; see Johnson [1977], Denardo
and Fox [1979], Ahuja, Mehlhorn, Orlin, and Tarjan [1990]. In practice,
however, the simpler algorithm of Dial has been more popular than these
methods.
The simplest label correcting method uses a first-in first-out rule to update
the queue that is used to store the candidate list V . In particular, a node is
always removed from the top of the queue, and a node, upon entrance in the
candidate list, is placed at the bottom of the queue. Thus, it can be seen
that the method operates in cycles of iterations: the first cycle consists of
just iterating on node 1; in each subsequent cycle, the nodes that entered
the candidate list during the preceding cycle, are removed from the list
in the order that they were entered. We will refer to this method as the
Bellman-Ford method , because it is closely related to a method proposed
by Bellman [1957] and Ford [1956] based on dynamic programming ideas
(see Exercise 2.6).
The complexity analysis of the method is based on the following prop-
erty, which we will prove shortly:
Bellman-Ford Property
For each node i and integer k ≥ 1, let

    d_i^k = shortest distance from 1 to i using paths that have k arcs or less,
In the case where all cycles have nonnegative length, the shortest
distance of every node can be achieved with a path having N − 1 arcs or
less, so the above Bellman-Ford property implies that the method finds
all the shortest distances after at most N − 1 cycles. Since each cycle
of iterations requires a total of O(A) operations (each arc is examined at
most once in each cycle), the running time of the Bellman-Ford method is
O(N A).
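The FIFO rule can be sketched in code as follows. This is a minimal Python sketch, with the sentinel-based counting of cycles and the ValueError on negative cycle detection being implementation choices, not from the text; the names and the dict-based adjacency list representation are illustrative.

```python
from collections import deque

INF = float("inf")

def bellman_ford(graph, origin):
    """FIFO label correcting (Bellman-Ford) method.

    Nodes are removed from the front of the queue and enter at the
    back, so the method operates in the cycles of iterations
    described above; the sentinel marks the end of each cycle. If the
    queue is still nonempty after N cycles, some path from the origin
    contains a cycle of negative length.
    """
    N = len(graph)
    d = {i: INF for i in graph}
    d[origin] = 0
    SENTINEL = object()
    queue = deque([origin, SENTINEL])
    in_queue = {origin}
    cycles = 0
    while queue:
        i = queue.popleft()
        if i is SENTINEL:
            cycles += 1
            if cycles >= N or not queue:
                break
            queue.append(SENTINEL)
            continue
        in_queue.discard(i)
        for j, a_ij in graph[i]:
            if d[j] > d[i] + a_ij:
                d[j] = d[i] + a_ij
                if j not in in_queue:
                    in_queue.add(j)
                    queue.append(j)
    if in_queue:
        raise ValueError("path through a negative length cycle detected")
    return d
```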
To prove the Bellman-Ford property, we first note that

    d_j^{k+1} = min[ d_j^k , min_{(i,j)∈A} {d_i^k + aij } ],    for all j and k ≥ 1,    (2.10)

since d_j^{k+1} is either the length of a path from 1 to j with k arcs or less,
in which case it is equal to d_j^k , or else it is the length of some path that
starts at 1, goes to a predecessor node i with k arcs or less, and then goes
to j using arc (i, j). We now prove the Bellman-Ford property by induction.
At the end of the 1st cycle, we have for all i,

    di = 0      if i = 1,
    di = a1i    if i ≠ 1 and (1, i) ∈ A,
    di = ∞      if i ≠ 1 and (1, i) ∉ A,

while

    d_i^1 = a1i    if (1, i) ∈ A,
    d_i^1 = ∞      if (1, i) ∉ A,

so that di ≤ d_i^1 for all i. Let d̄i and V be the node labels and the contents
of the candidate list at the end of the kth cycle, respectively. Let also d̂i be
the node labels at the end of the (k + 1)st cycle. We assume that d̄i ≤ d_i^k
for all i, and we will show that d̂i ≤ d_i^{k+1} for all i. Indeed, by condition
(ii) of Prop. 2.2(a), we have
We also have
since at the time when i is removed from V , its current label, call it d˜i ,
satisfies d˜i ≤ di , and the label of j is set to d˜i + aij if it exceeds d˜i + aij .
By combining Eqs. (2.11) and (2.12), we see that
where the last equality holds by Eq. (2.10). This completes the induction
proof of the Bellman-Ford property.
The Bellman-Ford method can be used to detect the presence of a
negative cycle. Indeed, from Prop. 2.2, we see that the method fails to
terminate if and only if there exists a path that starts at 1 and contains
a negative cycle. Thus in view of the Bellman-Ford property, such a path
exists if and only if the algorithm has not terminated by the end of N − 1
cycles.
The best practical implementations of label correcting methods are
more sophisticated than the Bellman-Ford method. Their worst-case run-
ning time is no better than the O(N A) time of the Bellman-Ford method,
and in some cases it is considerably slower. Yet their practical performance
is often considerably better. Next, we discuss three different types of
implementations.
In this method, a node is always removed from the top of the queue used
to maintain the candidate list V . A node, upon entrance in the queue, is
placed at the bottom of the queue if it has never been in the queue before;
otherwise it is placed at the top.
The idea here is that when a node i is removed from the queue, its
label affects the labels of a subset Bi of the neighbor nodes j with (i, j) ∈ A.
When the label of i changes again, it is likely that the labels of the nodes
in Bi will require updating also. It is thus argued that it makes sense to
place the node at the top of the queue so that the labels of the nodes in Bi
get a chance to be updated as quickly as possible.
While this rationale is not quite convincing, it seems to work well in
practice for a broad variety of problems, including types of problems where
there are some negative arc lengths. On the other hand, special examples
have been constructed (Kershenbaum [1981], Shier and Witzgall [1981]),
where the D’Esopo-Pape algorithm performs very poorly. In particular, in
these examples, the number of entrances of some nodes in the candidate
list V is not polynomial. Computational studies have also shown that for
some classes of problems, the practical performance of the D’Esopo-Pape
algorithm can be very poor (Bertsekas [1993a]). Pallottino [1984], and
Gallo and Pallottino [1988] give a polynomial variant of the algorithm,
whose practical performance, however, is roughly similar to the one of the
original version.
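The D'Esopo-Pape insertion rule can be sketched as follows, a minimal Python sketch with illustrative names; it differs from the Bellman-Ford method only in where a reentering node is placed in the queue.

```python
from collections import deque

INF = float("inf")

def desopo_pape(graph, origin):
    """D'Esopo-Pape rule: a node entering the candidate list goes to
    the back of the queue if it has never been in the queue before,
    and to the front otherwise. (The worst case is not polynomial.)
    """
    d = {i: INF for i in graph}
    d[origin] = 0
    queue = deque([origin])
    in_queue = {origin}
    seen = {origin}                 # nodes that have ever entered the queue
    while queue:
        i = queue.popleft()
        in_queue.discard(i)
        for j, a_ij in graph[i]:
            if d[j] > d[i] + a_ij:
                d[j] = d[i] + a_ij
                if j not in in_queue:
                    in_queue.add(j)
                    if j in seen:
                        queue.appendleft(j)   # has been in the queue before
                    else:
                        seen.add(j)
                        queue.append(j)       # first entrance: back of queue
    return d
```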
These methods are motivated by the hypothesis that when the arc lengths
are nonnegative, the queue management strategy should try to place nodes
with small labels near the top of the queue. For a supporting heuristic
argument, note that for a node j to reenter V , some node i such that
di + aij < dj must first exit V . Thus, the smaller dj was at the previous
exit of j from V the less likely it is that di +aij will subsequently become less
than dj for some node i ∈ V and arc (i, j). In particular, if dj ≤ mini∈V di
and the arc lengths aij are nonnegative, it is impossible that subsequent
to the exit of j from V we will have di + aij < dj for some i ∈ V .
We can think of Dijkstra’s method as implicitly placing at the top of
an imaginary queue the node with the smallest label, thereby resulting in
the minimal number N of iterations. The methods of this section attempt
to emulate approximately the minimum label selection policy of Dijkstra’s
algorithm with a much smaller computational overhead. They are primarily
suitable for the case of nonnegative arc lengths. While they will work even
when there are some negative arc lengths as per Prop. 2.2, there is no
reason to expect that in this case they will terminate faster (or slower)
than any of the other label correcting methods that we will discuss.
A simple strategy for placing nodes with small label near the top of the
queue is the Small Label First method (SLF for short). Here the candidate
list V is maintained as a double ended queue Q. At each iteration, the
node exiting V is the top node of Q. The rule for inserting new nodes is
given below:
SLF Strategy
Whenever a node j enters Q, its label dj is compared with the label
di of the top node i of Q. If dj ≤ di , node j is entered at the top of
Q; otherwise j is entered at the bottom of Q.
The SLF strategy provides a rule for inserting nodes in Q, but always
removes (selects for iteration) nodes from the top of Q. A more sophis-
ticated strategy is to make an effort to remove from Q nodes with small
labels. A simple possibility, called the Large Label Last method (LLL for
short) works as follows: At each iteration, when the node at the top of Q
has a larger label than the average node label in Q (defined as the sum of
the labels of the nodes in Q divided by the cardinality |Q| of Q), this node
is not removed from Q, but is instead repositioned to the bottom of Q.
LLL Strategy
Let i be the top node of Q, and let

    a = (Σ_{j∈Q} dj) / |Q|.

If di > a, move i to the bottom of Q. Repeat until a node i such that
di ≤ a is found and is removed from Q.
It is simple to combine the SLF queue insertion and the LLL node
removal strategies, thereby obtaining a method referred to as SLF/LLL.
Experience suggests that, assuming nonnegative arc lengths, the SLF,
LLL, and combined SLF/LLL algorithms perform substantially faster than
the Bellman-Ford and the D’Esopo-Pape methods. The strategies are also
well-suited for parallel computation (see Bertsekas, Guerriero, and Mus-
manno [1996]). The combined SLF/LLL method consistently requires a
smaller number of iterations than either SLF or LLL, although the gain in
number of iterations is sometimes offset by the extra overhead.
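The combined strategy can be sketched as follows. This is a minimal Python sketch with illustrative names, using a double ended queue; it handles arbitrary lengths but, as noted above, is intended for nonnegative arc lengths.

```python
from collections import deque

INF = float("inf")

def slf_lll(graph, origin):
    """Combined SLF/LLL label correcting method.

    SLF insertion: a node enters at the top of Q if its label is no
    larger than the top node's label, else at the bottom. LLL
    removal: nodes with above-average label are rotated to the bottom
    of Q before a node is removed.
    """
    d = {i: INF for i in graph}
    d[origin] = 0
    Q = deque([origin])
    in_Q = {origin}
    while Q:
        avg = sum(d[n] for n in Q) / len(Q)
        while d[Q[0]] > avg:        # LLL: large label goes to the bottom
            Q.rotate(-1)
        i = Q.popleft()
        in_Q.discard(i)
        for j, a_ij in graph[i]:
            if d[j] > d[i] + a_ij:
                d[j] = d[i] + a_ij
                if j not in in_Q:
                    in_Q.add(j)
                    if Q and d[j] <= d[Q[0]]:   # SLF insertion rule
                        Q.appendleft(j)
                    else:
                        Q.append(j)
    return d
```

The rotation loop always terminates, since the minimum label in Q cannot exceed the average.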
Regarding the theoretical worst-case performance of the SLF and the
combined SLF/LLL algorithms, an example has been constructed by Chen
and Powell [1997], showing that these algorithms do not have polynomial
complexity in their pure form. However, nonpolynomial behavior seems
to be an extremely rare phenomenon in practice. In any case, one may
construct polynomial versions of the SLF and LLL algorithms, when the
arc lengths are nonnegative. A simple approach is to first sort the outgoing
arcs of each node by length. That is, when a node i is removed from Q, first
examine the outgoing arc from i that has minimum length, then examine
the arc of second minimum length, etc. This approach, due to Chen and
Powell [1997], can be shown to have complexity O(N A²) (see Exercise
2.9). Note, however, that sorting the outgoing arcs of a node by length
may involve significant overhead.
There is also another approach to construct polynomial versions of
the SLF and LLL algorithms (as well as other label correcting methods),
which leads to O(N A) complexity, assuming nonnegative arc lengths. To
see how this works, suppose that in the generic label correcting algorithm,
there is a set of increasing iteration indices t1 , t2 , . . . , tn+1 such that t1 = 1,
and for i = 1, . . . , n, all nodes that are in V at the start of iteration ti
are removed from V at least once prior to iteration ti+1 . Because all arc
lengths are nonnegative, this guarantees that the minimum label node of
V at the start of iteration ti will never reenter V after iteration ti+1 . Thus
the candidate list must have no more than N − i nodes at the start of
iteration ti+1 , and must become empty prior to iteration tN +1 . Thus, if
the running time of the algorithm between iterations ti and ti+1 is bounded
by R, the total running time of the algorithm will be bounded by N R, and
if R is polynomially bounded, the running time of the algorithm will also
be polynomially bounded.
Specializing now to the SLF and LLL cases, assume that between
iterations ti and ti+1 , each node is inserted at the top of Q for a number
of times that is bounded by a constant and that (in the case of SLF/LLL)
the total number of repositionings is bounded by a constant multiple of
A. Then it can be seen that the running time of the algorithm between
iterations ti and ti+1 is O(A), and therefore the complexity of the algorithm
is O(N A).
To modify SLF or SLF/LLL so that they have an O(N A) worst-case
complexity, based on the preceding result, it is sufficient that we fix an inte-
ger k > 1, and that we separate the iterations of the algorithm in successive
blocks of kN iterations each. We then impose an additional restriction that,
within each block of kN iterations, each node can be inserted at most k − 1
times at the top of Q [that is, after the (k − 1)th insertion of a node to the
top of Q within a given block of kN iterations, all subsequent insertions of
that node within that block of kN iterations must be at the bottom of Q].
In the case of SLF/LLL, we also impose the additional restriction that the
total number of repositionings within each block of kN iterations should
be at most kA (that is, once the maximum number of kA repositionings is
reached, the top node of Q is removed from Q regardless of the value of its
label). The worst-case running times of the modified algorithms are then
O(N A). In practice, it is highly unlikely that the restrictions introduced
into the algorithms to guarantee O(N A) complexity will ever be exercised
if k is larger than a small number such as 3 or 4.
nently labeled at time t and never reenter the candidate list. We may thus
interpret the threshold algorithm as a block version of Dijkstra’s method ,
whereby a whole subset of nodes becomes permanently labeled when the
queue Q gets exhausted.
The preceding interpretation suggests that the threshold algorithm is
suitable primarily for the case of nonnegative arc lengths (even though it
will work in general). Furthermore, the performance of the algorithm is
quite sensitive to the method used to adjust the threshold. For example, if
s is taken to be equal to the current minimum label, the method is identical
to Dijkstra’s algorithm; if s is larger than all node labels, Q is empty and
the algorithm reduces to the generic label correcting method. With an
effective choice of threshold, the practical performance of the algorithm
is very good. A number of heuristic approaches have been developed for
selecting the threshold (see Glover, Klingman, and Phillips [1985], and
Glover, Klingman, Phillips, and Schneider [1985]). If all arc lengths are
nonnegative, a bound O(N A) on the operation count of the algorithm can
be shown; see Exercise 2.8(c).
time, these combined methods require considerably less overhead than Di-
jkstra’s method.
Let us now try to compare the two major special cases of the generic
algorithm, label setting and label correcting methods, assuming that the
arc lengths are nonnegative.
We mentioned earlier that label setting methods offer a better guar-
antee of good performance than label correcting methods, because their
worst-case running time is more favorable. In practice, however, there
are several considerations that argue in favor of label correcting methods.
One such consideration is that label correcting methods, because of their
inherent flexibility, are better suited for exploiting advanced initialization.
Another consideration is that when the graph is acyclic, label cor-
recting methods can be adapted to exploit the problem’s structure, so that
each node enters and exits the candidate list only once, thereby nullifying
the major advantage of label setting methods (see Exercise 2.10). The cor-
responding running time is O(A), which is the minimum possible. Note
that an important class of problems involving an acyclic graph is dynamic
programming (cf. Fig. 2.1).
A third consideration is that in practice, the graphs of shortest path
problems are often sparse; that is, the number of arcs is much smaller
than the maximum possible N². In this case, efficient label correcting
methods tend to have a faster practical running time than label setting
methods. To understand the reason, note that all shortest path methods
require the unavoidable O(A) operations needed to scan once every arc, plus
some additional time which we can view as “overhead.” The overhead of
the popular label setting methods is roughly proportional to N in practice
(perhaps times a slowly growing factor, like log N ), as argued earlier for the
binary heap method and Dial’s algorithm. On the other hand, the overhead
of label correcting methods grows linearly with A (times a factor that likely
grows slowly), because for the most popular methods, the average number
of node entrances in the queue per node is typically not much larger than
1. Thus, we may conclude that the overhead ratio of label correcting to
label setting methods is roughly
(A/N) · (a constant factor).
The constant factor above depends on the particular method used and
may vary slowly with the problem size, but is typically much less than 1.
Thus, the overhead ratio favors label correcting methods for a sparse graph
(A ≪ N²), and label setting methods for a dense graph (A ≈ N²). This
is consistent with empirical observations.
Sec. 2.5 Single Origin/Single Destination Methods 81
Let us finally note that label setting methods can take better advan-
tage of situations where only a small subset of the nodes are destinations, as
will be seen in the next section. This is also true of the auction algorithms
to be discussed in Section 2.6.
Suppose that we use the label setting method. Then we can stop the
method when the destination t becomes permanently labeled; further com-
putation will not improve the label dt (Exercise 2.13 sharpens this criterion
in the case where min_{i|(i,t)∈A} ait > 0). If t is closer to the origin than
many other nodes, the saving in computation time will be significant. Note
that this approach can also be used when there are several destinations.
The method is stopped when all destinations have become permanently
labeled.
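As an illustration of this stopping criterion, here is a heap-based label setting sketch that stops as soon as every node in a given set of destinations is permanently labeled; the graph encoding (a dict of adjacency lists), the function name, and the assumption of nonnegative arc lengths are ours.

```python
import heapq

def dijkstra_early_stop(graph, origin, destinations):
    """Label setting (Dijkstra) that stops once every destination in the
    given set is permanently labeled.  graph[i] is a list of (j, length)
    pairs; all lengths are assumed nonnegative.  Illustrative sketch."""
    remaining = set(destinations)
    dist = {origin: 0}
    permanent = set()
    heap = [(0, origin)]
    while heap and remaining:
        d, i = heapq.heappop(heap)
        if i in permanent:
            continue
        permanent.add(i)            # the label of i is now permanent
        remaining.discard(i)
        for j, a in graph.get(i, []):
            nd = d + a
            if nd < dist.get(j, float('inf')):
                dist[j] = nd
                heapq.heappush(heap, (nd, j))
    return {t: dist.get(t, float('inf')) for t in destinations}
```

If the destinations are close to the origin, the loop exits long before all nodes are scanned, which is exactly the saving discussed above.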
Another possibility is to use a two-sided label setting method ; that is,
a method that simultaneously proceeds from the origin to the destination
and from the destination to the origin. In this method, we successively label
permanently the closest nodes to the origin (with their shortest distance
from the origin) and the closest nodes to the destination (with their shortest
distance to the destination). It can be shown that when some node gets
permanently labeled from both sides, the labeling can stop; by combining
the forward and backward paths of each labeled node and by comparing
the resulting origin-to-destination paths, one can obtain a shortest path.
Exercise 2.14 develops in some detail this approach, which can often lead
to a dramatic reduction in the total number of iterations. However, the
approach does not work when there are multiple destinations.
Unfortunately, when label correcting methods are used, it may not be easy
to realize the savings just discussed in connection with label setting. The
difficulty is that even after we discover several paths to the destination t
(each marked by an entrance of t into V ), we cannot be sure that better
paths will not be discovered later. In the presence of additional problem
structure, however, the number of times various nodes will enter V can be
reduced considerably, as we now explain.
Suppose that at the start of the algorithm we have, for each node i, an
underestimate ui of the shortest distance from i to t (we require ut = 0).
For example, if all arc lengths are nonnegative we may take ui = 0 for
all i. (We do not exclude the possibility that ui = −∞ for some i, which
corresponds to the case where no underestimate is available for the shortest
distance of i.) The following is a modified version of the generic shortest
path algorithm.
Initially,
V = {1}, d1 = 0, di = ∞, ∀ i ≠ 1.
The algorithm proceeds in iterations and terminates when V is empty. The
typical iteration (assuming V is nonempty) is as follows. We remove a node
i from V ; then, for each outgoing arc (i, j) ∈ A, if
di + aij < min{dj , dt − uj },
we set
dj := di + aij
and add j to V if it does not already belong to V . The difference from the
generic algorithm is the extra test di + aij < dt − uj for entry of j into V :
since uj underestimates the shortest distance from j to t, if
di + aij + uj ≥ dt ,
then no path from 1 to t that starts with the new path to j can be shorter
than dt . It is unnecessary to consider such paths, and for this reason node j
need not be entered in V . In this way, the number of node entrances in V
may be sharply reduced.

[Figure 2.9 graphics omitted. The table traces the algorithm on a five-node
example, showing at each iteration the candidate list V , the labels
(d1 , d2 , d3 , d4 , d5 ), and the node removed from V :

Iteration   V        (d1, d2, d3, d4, d5)   Node out of V
1           {1}      (0, ∞, ∞, ∞, ∞)        1
2           {2, 3}   (0, 2, 1, ∞, ∞)        2
3           {3, 5}   (0, 2, 1, ∞, 2)        3
4           {5}      (0, 2, 1, ∞, 2)        5
            Ø        (0, 2, 1, ∞, 2)                     ]
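A sketch of this modified method, with a FIFO queue as the candidate list; the data layout, the queue policy, and the names are ours, and u maps each node to its underestimate (with u[t] = 0).

```python
from collections import deque

def sp_with_underestimates(graph, origin, t, u):
    """Generic single origin/single destination label correcting sketch:
    node j enters the candidate list V only if the new label passes the
    test d_i + a_ij < min(d_j, d_t - u_j), where u_j underestimates the
    shortest distance from j to t.  graph[i] is a list of (j, a_ij)."""
    INF = float('inf')
    d = {origin: 0}
    V = deque([origin])
    while V:
        i = V.popleft()
        for j, a in graph.get(i, []):
            # the extra term d_t - u_j discards nodes that cannot
            # lead to a path shorter than the current d_t
            if d[i] + a < min(d.get(j, INF), d.get(t, INF) - u.get(j, 0)):
                d[j] = d[i] + a
                if j not in V:
                    V.append(j)
    return d.get(t, INF)
```

With all underestimates equal to zero (valid for nonnegative lengths) this reduces to the generic algorithm with the test di + aij < dt added.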
Figure 2.9 illustrates the algorithm. The following proposition proves
its validity.
Proof: (a) The proof is identical to the corresponding part of Prop. 2.2.
(b) If upon termination we have dt = ∞, then the extra test di + aij + uj <
dt for entering V is always passed, so the algorithm generates the same
label sequences as the generic (all destinations) shortest path algorithm.
Therefore, Prop. 2.2(b) applies and shows that there is no path from 1 to t.
It will thus be sufficient to prove this part assuming that we have dt < ∞
upon termination.
Let d̄j denote the final value of the label dj obtained upon termination,
and suppose that d̄t < ∞. Assume, to arrive at a contradiction, that there
is a path Pt = (1, j1 , j2 , . . . , jk , t) that has length Lt with Lt < d̄t . For
m = 1, . . . , k, let Ljm be the length of the path Pm = (1, j1 , j2 , . . . , jm ).
Let us focus on the node jk preceding t on the path Pt . We claim that
Ljk < d̄jk . Indeed, if this were not so, then jk must have been removed at
some iteration from V with a label djk satisfying djk ≤ Ljk . If dt is the
label of t at the start of that iteration, we would then have
From the above two equations, it follows that the label of jk would be
reduced at that iteration from djk to djk−1 + ajk−1 t , which is less than the
final label d̄jk – a contradiction.
Proceeding similarly, we obtain Ljm < d̄jm for all m = 1, . . . , k, and
in particular a1j1 = Lj1 < d̄j1 . Since
Advanced Initialization
We finally note that similar to the all-destinations case, the generic sin-
gle origin/single destination method need not be started with the initial
conditions
V = {1}, d1 = 0, di = ∞, ∀ i ≠ 1.
The algorithm works correctly using several other initial conditions. One
possibility is to use for each node i, an initial label di that is either ∞ or
else it is the length of a path from 1 to i, and to take V = {i | di < ∞}.
A more sophisticated alternative is to initialize V so that it contains all
nodes i such that
di + aij < min{dj , dt − uj } for some (i, j) ∈ A.
This kind of initialization can be extremely useful when a “good”
path
P = (1, i1 , . . . , ik , t)
from 1 to t is known or can be found heuristically, and the arc lengths are
nonnegative so that we can use the underestimate ui = 0 for all i. Then
we can initialize the algorithm with
di = length of the portion of path P from 1 to i, if i ∈ P ,
di = ∞, if i ∉ P ,
V = {1, i1 , . . . , ik }.
If P is a near-optimal path and consequently the initial value dt is near its
final value, the test for future admissibility into the candidate list V will
be relatively tight from the start of the algorithm and many unnecessary
entrances of nodes into V may be saved. In particular, it can be seen that
all nodes whose shortest distances from the origin are greater than or equal to
the length of P will never enter the candidate list.
destination. This behavior is typical when the initial prices are all zero (see
Exercise 2.19).
Figure 2.10: An example illustrating the auction algorithm starting with P = (1)
and p = 0. [Graphics omitted; the arc lengths are shown next to the arcs, node 1
is the origin, node 4 is the destination, and the prices shown are p1 = 2.5,
p2 = 1.5, p3 = 3, p4 = 0.]
[Figure 2.11 graphics omitted: (a) a shortest path problem with arc lengths
shown next to the arcs; node 1 is the origin and node 4 is the destination.
(b) the corresponding balls-and-strings model with prices p1 = 2.5, p2 = 1.5,
p3 = 0.5, p4 = 0. (c) the model hanging from the origin.]
Figure 2.11: Illustration of the CS conditions for the shortest path problem. If
each node is a ball, and for every arc (i, j), nodes i and j are connected with a
string of length aij , the vertical coordinates pi of the nodes satisfy pi − pj ≤ aij ,
as shown in (b) for the problem given in (a). If the model is picked up and left
to hang from the origin node s, then ps − pi gives the shortest distance to each
node i, as shown in (c).
[Figure 2.12 graphics omitted: (a) a shortest path problem with arc lengths
shown next to the arcs; node 1 is the origin and node 4 is the destination.
(b) successive stages of the corresponding balls-and-strings model.]
Proposition 2.5: If there exists at least one path from the origin
to the destination, the auction algorithm terminates with a shortest
path from the origin to the destination. Otherwise the algorithm never
terminates and ps → ∞.
implying the CS condition (2.16) for arcs outgoing from that node as well.
Finally, since p̄i > pi and p̄k = pk for all k ≠ i, we have p̄k ≤ akj + p̄j for
all arcs (k, j) outgoing from nodes k ∉ P . This completes the induction
proof that (P, p) satisfies CS throughout the algorithm.
Assume first that there is a path from node s to the destination t.
By adding the CS condition (2.16) along that path, we see that ps − pt is
an underestimate of the (finite) shortest distance from s to t. Since ps is
monotonically nondecreasing, and pt is fixed throughout the algorithm, it
follows that ps must stay bounded.
We next claim that pi must stay bounded for all i. Indeed, in order to
have pi → ∞, node i must become the terminal node of P infinitely often.
To solve the problem with multiple destinations and a single origin, one
can simply run the algorithm until every destination becomes the terminal
node of the path at least once. Also, to solve the problem with multiple
origins and a single destination, one can combine several versions of the
algorithm – one for each origin. However, the different versions can share a
common price vector, since regardless of the origin considered, the condition
pi ≤ aij + pj is always maintained. There are several ways to operate such
a method; they differ in the policy used for switching between different
origins. One possibility is to run the algorithm for one origin and, after the
shortest path is obtained, to switch to the next origin (without changing the
price vector), and so on, until all origins are exhausted. Another possibility,
which is probably preferable in most cases, is to rotate between different
origins, switching from one origin to another, if a contraction at the origin
occurs or the destination becomes the terminal node of the current path.
For problems with one origin and one destination, a two-sided version of
the algorithm is particularly effective. This method maintains, in addition
to the path P , another path R that ends at the destination. To understand
this version, we first note that in shortest path problems, one can exchange
the role of origins and destinations by reversing the direction of all arcs.
It is therefore possible to use a destination-oriented version of the auction
algorithm that maintains a path R that ends at the destination and changes
at each iteration by means of a contraction or an extension. This algorithm,
called the reverse algorithm, is mathematically equivalent to the earlier
(forward) auction algorithm. Initially, in the reverse algorithm, R is any
path ending at the destination, and p is any price vector satisfying CS
together with R; for example,
ij = arg max_{i|(i,j)∈A} (pi − aij )
destinations are reached by the forward portion faster than other nodes,
thereby leading to faster termination.
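For concreteness, here is a sketch of the forward auction iteration in our own notation: the path P is extended or contracted at its terminal node while the prices keep satisfying the CS condition pi ≤ aij + pj. Nonnegative arc lengths (so that p = 0 satisfies CS initially) and the existence of a path from origin to destination are assumed; the data layout and names are ours.

```python
def auction_shortest_path(graph, origin, dest):
    """Forward auction sketch: maintain a path P and prices p with
    p_i <= a_ij + p_j, and repeatedly contract or extend P at its
    terminal node.  graph[i] is a list of (j, a_ij) pairs."""
    p = {i: 0 for i in graph}
    p.setdefault(dest, 0)
    P = [origin]
    while P[-1] != dest:
        i = P[-1]
        # cheapest extension out of the terminal node i
        j_best, val = None, float('inf')
        for j, a in graph.get(i, []):
            if a + p.get(j, 0) < val:
                j_best, val = j, a + p.get(j, 0)
        if j_best is None or p[i] < val:
            p[i] = val                 # contraction: raise the price of i
            if len(P) > 1:
                P.pop()
        else:
            p.setdefault(j_best, 0)
            P.append(j_best)           # extension: CS holds with equality
    # upon termination P is shortest and its length is p_origin - p_dest
    return P, p[origin] - p[dest]
```

Since CS holds with equality along P, the returned price difference equals the length of P, which is the shortest distance.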
Proposition 2.7: Every minimum cost flow problem with arc costs
aij such that all simple forward cycles have nonnegative cost is equivalent
to another minimum cost flow problem involving the same graph and
nonnegative arc costs âij of the form
âij = aij + di − dj , ∀ (i, j) ∈ A,
where the scalars di are obtained by solving an associated single origin/all
destinations shortest path problem.
where si is the supply of node i. Thus, the two cost functions ∑_{(i,j)∈A} âij xij
and ∑_{(i,j)∈A} aij xij differ by the constant ∑_{i∈N} di si . Q.E.D.
It can be seen now that the all-pairs shortest path problem can be
solved by using a label correcting method to solve the single origin/all
destinations problem described in the above proof, thereby obtaining the
scalars di and
âij = aij + di − dj , ∀ (i, j) ∈ A,
and by then applying a label setting method N times to solve the all-
pairs shortest path problem involving the nonnegative arc lengths âij . The
shortest distance Dij from i to j is obtained by subtracting di −dj from the
shortest distance from i to j found by the label setting method. To estimate
the running time of this approach, note that the label correcting method
requires O(N A) computation using the Bellman-Ford method, and each
of the N applications of the label setting method requires less than O(N²)
computation (the exact count depends on the method used). Thus the
overall running time is less than the O(N³) required by the Floyd-Warshall
algorithm, at least for sparse graphs.
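The scheme just described can be sketched as follows; the encoding, the function name, and the simplifying assumption that every node is reachable from node 1 (so that all the scalars di are finite) are ours.

```python
import heapq

def all_pairs_shortest(graph, N):
    """Sketch: a Bellman-Ford pass from node 1 gives scalars d_i; the
    reduced lengths a_ij + d_i - d_j are then nonnegative, so Dijkstra
    can be run from every origin, and the true distance is recovered by
    subtracting d_s - d_t.  graph[i] = [(j, a_ij), ...], nodes 1..N;
    no negative cycles and full reachability from node 1 are assumed."""
    INF = float('inf')
    d = {i: INF for i in range(1, N + 1)}
    d[1] = 0
    for _ in range(N - 1):                      # label correcting pass
        for i in range(1, N + 1):
            for j, a in graph.get(i, []):
                if d[i] + a < d[j]:
                    d[j] = d[i] + a
    D = {}
    for s in range(1, N + 1):                   # label setting passes
        dist, heap, done = {s: 0}, [(0, s)], set()
        while heap:
            du, u = heapq.heappop(heap)
            if u in done:
                continue
            done.add(u)
            for v, a in graph.get(u, []):
                nd = du + a + d[u] - d[v]       # reduced length >= 0
                if nd < dist.get(v, INF):
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
        for t in range(1, N + 1):
            if t in dist:
                D[(s, t)] = dist[t] - (d[s] - d[t])   # undo the reduction
    return D
```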
Still another possibility for solving the all-pairs shortest path problem
is to solve N separate single origin/all destinations problems but to also use
the results of the computation for one origin to start the computation for
the next origin; see our earlier discussion of initialization of label correcting
methods and also the discussion at the end of Section 5.2.
The work on the shortest path problem is very extensive, so we will re-
strict ourselves to citing the references that relate most to the material
presented. Literature surveys are given by Dreyfus [1969], Deo and Pang
[1984], and Gallo and Pallottino [1988]. The latter reference also contains
codes for the most popular shortest path methods, and extensive compu-
tational comparisons. A survey of applications in transportation networks
is given in Pallottino and Scutellà [1997a]. Parallel computation aspects of
shortest path algorithms, including asynchronous versions of some of the
algorithms developed here, are discussed in Bertsekas and Tsitsiklis [1989],
and Kumar, Grama, Gupta, and Karypis [1994].
The generic algorithm was proposed as a unifying framework of many
of the existing shortest path algorithms in Pallottino [1984], and Gallo
and Pallottino [1986]. The first label setting method was suggested in
Dijkstra [1959], and also independently in Dantzig [1960], and Whitting
and Hillier [1960]. The binary heap method was proposed by Johnson
[1972]. Dial’s algorithm (Dial [1969]) received considerable attention after
the appearance of the paper by Dial, Glover, Karney, and Klingman [1979];
see also Denardo and Fox [1979].
The Bellman-Ford algorithm was proposed in Bellman [1957] and
Ford [1956] in the form given in Exercise 2.6, where the labels of all nodes
are iterated simultaneously. The D’Esopo-Pape algorithm appeared in
Pape [1974] based on an earlier suggestion of D’Esopo. The SLF and
SLF/LLL methods were proposed by Bertsekas [1993a], and by Bertsekas,
Guerriero, and Musmanno [1996]. Chen and Powell [1997] gave a simple
polynomial version of the SLF method (Exercise 2.9). The threshold al-
Sec. 2.8 Notes, Sources, and Exercises 99
EXERCISES
2.1
Consider the graph of Fig. 2.14. Find a shortest path from 1 to all nodes using
the binary heap method, Dial’s algorithm, the D’Esopo-Pape algorithm, the SLF
method, and the SLF/LLL method.
2.2
Suppose that the only arcs that have negative lengths are outgoing from the
origin node 1. Show how to adapt Dijkstra’s algorithm so that it solves the
all-destinations shortest path problem in at most N − 1 iterations.
1
2 4
4 5 2
0 Figure 2.14: Graph for Exercise
1 8 2.1. The arc lengths are the num-
6
bers shown next to the arcs.
9
1
0 5
3 5
5
2.3
Give an example of a problem where the generic shortest path algorithm will
reduce the label of node 1 to a negative value.
2.4

Consider the single origin/all destinations shortest path problem and assume that
all cycles have nonnegative length. Consider the generic algorithm of Section 2.2,
and assume that each time a label dj is decreased to di +aij the arc (i, j) is stored
in an array PRED(j ). Consider the subgraph of the arcs PRED(j ), j ∈ N , j = 1.
Show that at the end of each iteration this subgraph is a tree rooted at the origin,
and that upon termination it is a tree of shortest paths.
2.5

Assume that all cycles have positive length. Show that if the scalars d1 , d2 , . . . , dN
satisfy
dj = min_{(i,j)∈A} {di + aij }, ∀ j ≠ 1,
d1 = 0,
then for all j, dj is the shortest distance from 1 to j. Show by example that this
need not be true if there is a cycle of length 0. Hint: Consider the arcs (i, j)
attaining the minimum in the above equation and consider the paths formed by
these arcs.
2.6

Consider the single origin/all destinations shortest path problem. The Bellman-
Ford method, as originally proposed by Bellman and Ford, updates the labels of
all nodes simultaneously in a single iteration. In particular, it starts with the
initial conditions
d^0_1 = 0, d^0_j = ∞, ∀ j ≠ 1,
Sec. 2.8 Notes, Sources, and Exercises 101
(a) Show that for all j ≠ 1 and k ≥ 1, d^k_j is the shortest distance from 1 to
j using paths with k arcs or less, where d^k_j = ∞ means that all the paths
from 1 to j have more than k arcs.
(b) Assume that all cycles have nonnegative length. Show that the algorithm
terminates after at most N iterations, in the sense that for some k ≤ N we
have d^k_j = d^{k−1}_j for all j. Conclude that the running time of the algorithm
is O(N A).
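A sketch of the all-labels-at-once iteration, under the natural form d^k_j = min(d^{k-1}_j, min over (i,j) of d^{k-1}_i + a_ij); this particular form, the layout, and the names are our assumptions.

```python
def bellman_ford_rounds(graph, N):
    """All-labels-at-once Bellman-Ford sketch: each round computes
    d^k from d^{k-1} and stops when the two coincide (at most N rounds
    when all cycles have nonnegative length).  graph[i] = [(j, a_ij)]."""
    INF = float('inf')
    d = {j: INF for j in range(1, N + 1)}
    d[1] = 0
    for k in range(1, N + 1):
        new = dict(d)                      # start from d^{k-1}
        for i in range(1, N + 1):
            for j, a in graph.get(i, []):
                if d[i] + a < new[j]:
                    new[j] = d[i] + a
        if new == d:
            return d, k                    # d^k = d^{k-1}: terminated
        d = new
    return d, N                            # no convergence within N rounds
```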
2.7

Consider the single origin/all destinations shortest path problem and the following
variant of the Bellman-Ford method of Exercise 2.6:
where each of the initial iterates d^0_i is an arbitrary scalar or ∞, except that
d^0_1 = 0. We say that the algorithm terminates after k iterations if d^k_i = d^{k−1}_i
for all i.
(a) Given nodes i ≠ 1 and j ≠ 1, define
w^k_ij = minimum path length over all paths starting at i, ending at j,
and having k arcs (w^k_ij = ∞ if there is no such path).
(b) Assume that there exists a path from 1 to every node i and that all cycles
have positive length. Show that the method terminates at some iteration
k, with d^k_i equal to the shortest distances d^*_i. Hint: For all i ≠ 1 and j ≠ 1,
lim_{k→∞} w^k_ij = ∞, while for all j ≠ 1, w^k_1j = d^*_j for all k ≥ N − 1.
(c) Under the assumptions of part (b), show that if d^0_i ≥ d^*_i for all i ≠ 1, the
method terminates after at most m* + 1 iterations, where
m* = max_{i≠1} mi ≤ N − 1,
102 The Shortest Path Problem Chap. 2
and assume that β > 0. Show that the method terminates after at most
k + 1 iterations, where k = N − 1 if the graph is acyclic, and k = N − 2 −
β/L if the graph has cycles, where
L = min_{all simple cycles} (length of the cycle)/(number of arcs on the cycle)
is the so-called minimum cycle mean of the graph. Note: See Section 4.1 of
Bertsekas and Tsitsiklis [1989] for related analysis, and an example showing
that the given upper bound on the number of iterations for termination is
tight.
(e) (Finding the minimum cycle mean) Consider the following Bellman-Ford-
like algorithm:
d^k(i) = min_{(i,j)∈A} {aij + d^{k−1}(j)}, ∀ i = 1, . . . , N,
d^0(i) = 0, ∀ i = 1, . . . , N.
We assume that there exists at least one cycle, but we do not assume that
all cycles have positive length. Show that the minimum cycle mean L of
part (d) is given by
L = min_{i=1,...,N} max_{k=0,...,N−1} (d^N(i) − d^k(i))/(N − k).
Hint: Show that d^k(i) is equal to the minimum path length over all paths
that start at i and have k arcs.
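Part (e) translates directly into code; the layout (graph[i] = list of (j, a_ij)) and the names are ours, and at least one cycle is assumed to exist.

```python
def min_cycle_mean(graph, N):
    """Sketch of part (e): compute d^k(i), k = 0, ..., N, via
    d^k(i) = min over (i,j) of (a_ij + d^{k-1}(j)) with d^0(i) = 0,
    then L = min_i max_k (d^N(i) - d^k(i)) / (N - k)."""
    INF = float('inf')
    d = [{i: 0 for i in range(1, N + 1)}]
    for _ in range(N):
        prev = d[-1]
        d.append({i: min((a + prev[j] for j, a in graph.get(i, [])),
                         default=INF)
                  for i in range(1, N + 1)})
    best = INF
    for i in range(1, N + 1):
        if d[N][i] == INF:
            continue            # no path with N arcs starts at i
        best = min(best, max((d[N][i] - d[k][i]) / (N - k)
                             for k in range(N)))
    return best
```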
2.8

Consider the generic algorithm, assuming that all arc lengths are nonnegative.
(a) Consider a node j satisfying at some time
dj ≤ di , ∀ i ∈ V.
Show that this relation will be satisfied at all subsequent times and that j
will never again enter V . Furthermore, dj will remain unchanged.
(b) Suppose that the algorithm is structured so that it removes from V a node
of minimum label at least once every k iterations (k is some integer). Show
that the algorithm will terminate in at most kN iterations.
(c) Show that the running time of the threshold algorithm is O(N A). Hint:
Define a cycle to be a sequence of iterations between successive repartition-
ings of the candidate list V . In each cycle, the node of V with minimum
label at the start of the cycle will be removed from V during the cycle.
2.9

The purpose of this exercise, due to Chen and Powell [1997], is to show one way
to use the SLF method so that it has polynomial complexity. Suppose that the
outgoing arcs of each node have been presorted in increasing order by length.
The effect of this, in the context of the generic shortest path algorithm, is that
when a node i is removed from the candidate list, we first examine the outgoing
arc from i that has minimum length, then we examine the arc of second minimum
length, etc. Show an O(N A²) complexity bound for the method.
2.10

Consider the problem of finding shortest paths from the origin node 1 to all
destinations, and assume that the graph does not contain any forward cycles.
Let Tk be the set of nodes i such that every path from 1 to i has k arcs or less,
and there exists a path from 1 to i with exactly k arcs. For each i, if i ∈ Tk define
INDEX(i) = k. Consider a label setting method that selects a node i from the
candidate list that has minimum INDEX(i).
(a) Show that the method terminates and that each node visits the candidate
list at most once.
(b) Show that the sets Tk can be constructed in O(A) time, and that the
running time of the algorithm is also O(A).
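A sketch in the spirit of this exercise: the INDEX-based scanning order is realized here with a standard topological ordering (Kahn's algorithm), which serves the same purpose in a graph with no forward cycles; the layout and the names are ours.

```python
def acyclic_shortest_paths(graph, N, origin=1):
    """Shortest paths in a graph with no forward cycles: scanning nodes
    in a topological order lets each node be scanned exactly once, for
    an O(A) method.  graph[i] = [(j, a_ij), ...], nodes 1..N."""
    INF = float('inf')
    indeg = {i: 0 for i in range(1, N + 1)}
    for i in range(1, N + 1):
        for j, _ in graph.get(i, []):
            indeg[j] += 1
    stack = [i for i in range(1, N + 1) if indeg[i] == 0]
    order = []
    while stack:                         # Kahn's algorithm, O(A)
        i = stack.pop()
        order.append(i)
        for j, _ in graph.get(i, []):
            indeg[j] -= 1
            if indeg[j] == 0:
                stack.append(j)
    d = {i: INF for i in range(1, N + 1)}
    d[origin] = 0
    for i in order:                      # single pass over all arcs
        if d[i] == INF:
            continue
        for j, a in graph.get(i, []):
            d[j] = min(d[j], d[i] + a)
    return d
```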
2.11
Consider the graph of Fig. 2.14. Find a shortest path from node 1 to node 6
using the generic single origin/single destination method of Section 2.5 with all
distance underestimates equal to zero.
2.12
Consider the problem of finding a shortest path from the origin 1 to a single des-
tination t, subject to the constraint that the path includes a given node s. Show
how to solve this problem using the single origin/single destination algorithms of
Section 2.5.
2.13

Consider a label setting approach for finding shortest paths from the origin node
1 to a selected subset of destinations T . Let
a = min_{(i,t)∈A, t∈T} ait ,
and assume that a > 0. Show that one may stop the method when the node of
minimum label in V has a label dmin that satisfies
dmin + a ≥ max_{t∈T} dt .
2.14

Consider the shortest path problem from an origin node 1 to a destination node
t, and assume that all arc lengths are nonnegative. This exercise considers an
algorithm where label setting is applied simultaneously and independently from
the origin and from the destination. In particular, the algorithm maintains a
subset of nodes W , which are permanently labeled from the origin, and a subset
of nodes V , which are permanently labeled from the destination. When W and
V have a node i in common the algorithm terminates. The idea is that a shortest
path from 1 to t cannot contain a node j ∉ W ∪ V ; any such path must be longer
than a shortest path from 1 to i followed by a shortest path from i to t (unless j
and i are equally close to both 1 and to t).
Consider two subsets of nodes W and V with the following properties:
(1) 1 ∈ W and t ∈ V .
(2) W and V have nonempty intersection.
(3) If i ∈ W and j ∉ W , then the shortest distance from 1 to i is less than or
equal to the shortest distance from 1 to j.
(4) If i ∈ V and j ∉ V , then the shortest distance from i to t is less than or
equal to the shortest distance from j to t.
Let d1i be the shortest distance from 1 to i using paths all the nodes of which,
with the possible exception of i, lie in W (d1i = ∞ if no such path exists), and let
dti be the shortest distance from i to t using paths all the nodes of which, with
the possible exception of i, lie in V (dti = ∞ if no such path exists).
(a) Show that such W , V , d1i , and dti can be found by applying a label setting
method simultaneously for the single origin problem with origin node 1 and
for the single destination problem with destination node t.
(b) Show that the shortest distance D1t from 1 to t is given by
D1t = min_{i∈W} (d1i + dti ) = min_{i∈W∪V} (d1i + dti ) = min_{i∈V} (d1i + dti ).
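A sketch of the two-sided method of this exercise: Dijkstra runs from the origin and, on the reversed graph, from the destination, alternately making one label permanent per side; once the permanently labeled sets meet, the labels are combined as in part (b). The layout, names, and the assumptions of nonnegative lengths and the existence of a path are ours.

```python
import heapq

def bidirectional_dijkstra(graph, origin, dest):
    """Two-sided label setting sketch.  graph[i] = [(j, a_ij), ...]."""
    INF = float('inf')
    rev = {}
    for i, arcs in graph.items():
        for j, a in arcs:
            rev.setdefault(j, []).append((i, a))
    # each side: (tentative labels, heap, permanently labeled set, adjacency)
    fwd = ({origin: 0}, [(0, origin)], set(), graph)
    bwd = ({dest: 0}, [(0, dest)], set(), rev)
    sides, turn = (fwd, bwd), 0
    while not (fwd[2] & bwd[2]):          # stop when the two sides meet
        dist, heap, perm, adj = sides[turn]
        while heap:
            d, i = heapq.heappop(heap)
            if i in perm:
                continue                  # stale heap entry
            perm.add(i)                   # one new permanent label per turn
            for j, a in adj.get(i, []):
                if d + a < dist.get(j, INF):
                    dist[j] = d + a
                    heapq.heappush(heap, (d + a, j))
            break
        turn = 1 - turn
    # combine the forward and backward labels over all labeled nodes
    return min(fwd[0].get(i, INF) + bwd[0].get(i, INF)
               for i in fwd[0].keys() | bwd[0].keys())
```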
2.15
Apply the forward/reverse auction algorithm to the example of Fig. 2.13, and
show that it terminates in a number of iterations that does not depend on the
large arc length L. Construct a related example for which the number of iterations
of the forward/reverse algorithm is not polynomially bounded.
2.16

In order to initialize the auction algorithm, one needs a price vector p satisfying
the condition
pi ≤ aij + pj , ∀ (i, j) ∈ A. (2.19)
Such a vector may not be available if some arc lengths are negative. Further-
more, even if all arc lengths are nonnegative, there are many cases where it is
important to use a favorable initial price vector in place of the default choice
p = 0. This possibility arises in a reoptimization context with slightly different
arc length data, or with a different origin and/or destination. This exercise gives
an algorithm to obtain a vector p satisfying the condition (2.19), starting from
another vector p satisfying the same condition for a different set of arc lengths
aij .
Suppose that we have a vector p̄ and a set of arc lengths {āij }, satisfying
p̄i ≤ āij + p̄j for all arcs (i, j), and we are given a new set of arc lengths {aij }.
(For the case where some arc lengths aij are negative, this situation arises with
p̄ = 0 and āij = max{0, aij }.) Consider the following algorithm that maintains a
subset of arcs E and a price vector p, and terminates when E is empty. Initially
pi := aij + pj
and add to E every arc (k, i) with k = t that does not already belong to E.
Assuming that each node i is connected to the destination t with at least one
path, and that all cycle lengths are positive, show that the algorithm terminates
with a price vector p satisfying pi ≤ aij + pj for all (i, j) ∈ A [cf. Eq. (2.19)].
2.17

Extend the auction algorithm for the case where all arcs have nonnegative length
but some cycles may consist exclusively of zero length arcs. Hint: Any cycle of
zero length arcs generated by the algorithm can be treated as a single node. An
alternative is the idea of graph reduction discussed in Section 2.6.
2.18
Consider the two single origin/single destination shortest path problems shown
in Fig. 2.15.
(a) Show that the number of iterations required by the forward auction algo-
rithm is estimated accurately by
nt − 1 + ∑_{i∈I, i≠t} (2ni − 1),
[Figure 2.15 graphs omitted: (a) a line graph 1, 2, 3, . . . , N −1, t; (b) a graph
in which node 1 is connected to t through each of the intermediate nodes
3, . . . , N −1.]
Figure 2.15: Shortest path problems for Exercise 2.18. In problem (a) the arc
lengths are equal to 1. In problem (b), the length of each arc (1, i) is i, and the
length of each arc (i, t) is N .
2.19
In the auction algorithm of Section 2.6, let ki be the first iteration at which node
i becomes the terminal node of the path P . Show that if ki < kj , then the
shortest distance from 1 to i is less than or equal to the shortest distance from 1 to j.
2.20

Consider the single origin/single destination shortest path problem and assume
that all arc lengths are nonnegative. Let node 1 be the origin, let node t be
the destination, and assume that there exists at least one path from 1 to t.
This exercise provides a forward/reverse version of Dijkstra’s algorithm, which
is motivated by the balls-and-strings model analogy of Figs. 2.11 and 2.12. In
particular, the algorithm may be interpreted as alternately lifting the model
upward from the origin (the following Step 1), and pulling the model downward
from the destination (the following Step 2). The algorithm maintains a price
vector p and two node subsets W1 and Wt . Initially, p satisfies the CS condition
pi ≤ aij + pj for all (i, j) ∈ A, W1 = {1}, and Wt = {t}. One may view W1 and
Wt as the sets of permanently
labeled nodes from the origin and from the destination, respectively. The algo-
rithm terminates when W1 and Wt have a node in common. The typical iteration
is as follows:
Step 1 (Forward Step): Find
γ+ = min{aij + pj − pi | (i, j) ∈ A, i ∈ W1 , j ∉ W1 }
and let
V1 = {j ∉ W1 | γ+ = aij + pj − pi for some i ∈ W1 }.
Set
pi := pi + γ+ if i ∈ W1 , and pi := pi if i ∉ W1 .
Set
W1 := W1 ∪ V1 .
Step 2: Find
γ− = min{aji + pi − pj | (j, i) ∈ A, i ∈ Wt , j ∉ Wt }
and let
Vt = {j ∉ Wt | γ− = aji + pi − pj for some i ∈ Wt }.
Set
pi := pi − γ− if i ∈ Wt , and pi := pi if i ∉ Wt .
Set
Wt := Wt ∪ Vt .
γ− = min{ min_{j∉W1 , j∉Wt} (vj− − p0j ), −p1 + min_{j∈W1 , j∉Wt} (vj− + d1j + pt ) }. (2.22)
Use these relations to calculate γ + and γ − in O(N ) time.
(d) Show how the algorithm can be implemented using binary heaps so that
its running time is O(A log N ). Hint: One possibility is to use four heaps
to implement the minimizations in Eqs. (2.21) and (2.22).
(e) Apply the two-sided version of Dijkstra’s algorithm with arc lengths aij +
pj − pi of Exercise 2.14, and with the termination criterion of part (c) of
that exercise. Show that the resulting algorithm is equivalent to the one of
the present exercise.
2.21
Consider the all-pairs shortest path problem, and suppose that the minimum
distances dij to go from any i to any j have been found. Suppose that a single
arc length amn is reduced to a value āmn < amn . Show that if dnm + āmn ≥ 0,
the new shortest distances can be obtained by
d̂ij := min{dij , dim + āmn + dnj }, ∀ i, j.
2.22

The doubling algorithm for solving the all-pairs shortest path problem is given by
D^1_ij = aij if (i, j) ∈ A, D^1_ij = 0 if i = j, D^1_ij = ∞ otherwise,
D^{2k}_ij = min_m {D^k_im + D^k_mj } if i ≠ j, D^{2k}_ij = 0 if i = j,
for k = 1, 2, . . . , log(N − 1).
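Under the reading that each stage min-plus "squares" the current distance matrix (doubling the number of arcs allowed per path), the algorithm can be sketched as follows; this interpretation of the iteration count, the encoding, and the names are ours, and no negative cycles are assumed.

```python
import math

def doubling_all_pairs(a, N):
    """Doubling sketch: D holds one-arc distances; each min-plus squaring
    D^{2k}_ij = min_m (D^k_im + D^k_mj) doubles the arcs allowed, so
    about log2(N - 1) squarings suffice.  `a` maps arcs (i, j) with
    nodes 1..N to lengths; rows/columns of D are 0-indexed."""
    INF = float('inf')
    D = [[0 if i == j else a.get((i, j), INF) for j in range(1, N + 1)]
         for i in range(1, N + 1)]
    for _ in range(max(1, math.ceil(math.log2(max(N - 1, 1))))):
        # min-plus matrix squaring; the i == j entries stay at zero
        D = [[0 if i == j else min(D[i][m] + D[m][j] for m in range(N))
              for j in range(N)] for i in range(N)]
    return D
```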
2.23

Consider the dynamic programming problem of Example 2.2. The standard
dynamic programming algorithm is given by the recursion
Jk (xk ) = min_{uk} [ gk (xk , uk ) + Jk+1 (xk+1 ) ], k = 0, . . . , N − 1,
starting with
JN (xN ) = G(xN ).
(a) In terms of the shortest path reformulation in Fig. 2.1, interpret Jk (xk ) as
the shortest distance from node xk at stage k to the terminal node t.
(b) Show that the dynamic programming algorithm can be viewed as a spe-
cial case of the generic label correcting algorithm with a special order for
selecting nodes to exit the candidate list.
(c) Assume that gk (xk , uk ) ≥ 0 for all xk , uk , and k. Suppose that by us-
ing some heuristic we can construct a “good” suboptimal control sequence
(u0 , u1 , . . . , uN −1 ). Discuss how to use this sequence for initialization of a
single origin/single destination label correcting algorithm (cf. the discussion
of Section 2.5).
2.24

Given a problem of finding a shortest path from node s to node t, we can obtain
an equivalent “reverse” shortest path problem, where we want to find a shortest
path from t to s in a graph derived from the original by reversing the direction of
all the arcs, while keeping their length unchanged. Apply this transformation to
the dynamic programming problem of Example 2.2 and Exercise 2.23, and derive
a dynamic programming algorithm that proceeds forwards rather than backwards
in time.
The purpose of this exercise, due to Shier [1979], and Guerriero, Lacagnina,
Musmanno, and Pecorella [1997], is to introduce an approach for extending the
generic algorithm to the solution of a class of multiple shortest path problems.
Consider the single origin/many destinations shortest path context, where node
1 is the origin, assuming that no cycles of negative length exist. Let di (1) denote
the shortest distance from node 1 to node i. Sequentially, for k = 2, 3, . . ., denote
by di (k) the minimum of the lengths of paths from 1 to i that have length greater
than di (k − 1) [if there is no path from 1 to i with length greater than di (k − 1),
then di (k) = ∞]. We call di (k) the k-level shortest distance from 1 to i.
(a) Show that for k > 1, {di (k) | i = 1, . . . , N } are the k-level shortest distances
if and only if di (k − 1) ≤ di (k) with strict inequality if di (k − 1) < ∞, and
furthermore
di (k) = min_{(i,j)∈A} {li (k, j) + aij }, i = 1, . . . , N,
where
li (k, j) = di (k − 1) if dj (k − 1) < di (k − 1) + aij ,
li (k, j) = di (k) if dj (k − 1) = di (k − 1) + aij .
(b) Extend the generic shortest path algorithm of Section 2.2 so that it simul-
taneously finds the k-level shortest distances for all k = 1, 2, . . . , K, where
K is some positive integer.
2.27 (Clustering)
Consider the framework of the shortest path problem. For any path P , define
the bottleneck arc of P as an arc that has maximum length over all arcs of P .
Consider the problem of finding a path connecting two given nodes and having
minimum length of bottleneck arc. Derive an analog of Prop. 2.1 for this problem.
Consider also a single origin/all destinations version of this problem. Develop an
analog of the generic algorithm of Section 2.2 and prove an analog of Prop. 2.2.
Hint: Replace di + aij with max{di , aij }.
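Following the hint, a sketch of the generic algorithm with di + aij replaced by max{di, aij}; the layout, the FIFO queue policy, and the choice of −∞ as the origin's label (the identity element for max) are ours.

```python
from collections import deque

def bottleneck_paths(graph, origin, N):
    """Single origin/all destinations bottleneck sketch: upon termination
    d_j is the minimum, over paths from the origin to j, of the maximum
    arc length (bottleneck arc) on the path.  graph[i] = [(j, a_ij)]."""
    INF = float('inf')
    d = {i: INF for i in range(1, N + 1)}
    d[origin] = -INF                 # empty path: no bottleneck arc yet
    V = deque([origin])
    while V:
        i = V.popleft()
        for j, a in graph.get(i, []):
            if max(d[i], a) < d[j]:  # d_i + a_ij replaced by max(d_i, a_ij)
                d[j] = max(d[i], a)
                if j not in V:
                    V.append(j)
    return d
```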
Consider the problem of finding a simple forward path between an origin and a
destination node that has minimum length. Show that even if there are negative
cycles, the problem can be formulated as a minimum cost flow problem involving
node throughput constraints of the form

0 ≤ Σ_{j|(i,j)∈A} xij ≤ 1,    ∀ i.
Given a graph (N , A) and a weight wij for each arc (i, j), consider the problem
of finding a spanning tree with minimum sum of arc weights. This is not a
shortest path problem and in fact it is not even a special case of the minimum
cost flow problem. However, it has a similar graph structure to the one of the
shortest path problem. Note that the orientation of the arcs does not matter
here. In particular, if (i, j) and (j, i) are arcs, any one of them can participate
in a spanning tree solution, and the arc having greater weight can be a priori
eliminated.
(a) Consider the problem of finding a shortest path from node 1 to all nodes
with arc lengths equal to wij . Give an example where the shortest path
spanning tree is not a minimum weight spanning tree.
(b) Let us define a fragment to be a subgraph of a minimum weight spanning
tree; for example the subgraph consisting of any subset of nodes and no
arcs is a fragment. Given a fragment F , let us denote by A(F ) the set of
arcs (i, j) such that either i or j belong to F , and if (i, j) is added to F
no cycle is closed. Show that if F is a fragment, then by adding to F an
arc of A(F ) that has minimum weight over all arcs of A(F ) we obtain a
fragment.
(c) Consider a greedy algorithm that starts with some fragment, and at each
iteration, adds to the current fragment F an arc of A(F ) that has minimum
weight over all arcs of A(F ). Show that the algorithm terminates with a
minimum weight spanning tree.
(d) Show that the complexity of the greedy algorithm is O(N A), where N is
the number of nodes and A is the number of arcs.
(e) The Prim-Dijkstra algorithm is the special case of the greedy algorithm
where the initial fragment consists of a single node. Provide an O(N^2) implementation of this algorithm. Hint: Together with the kth fragment Fk, maintain for each j ∉ Fk the node nk(j) ∈ Fk such that the arc connecting j and nk(j) has minimum weight.
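A sketch of the O(N^2) implementation suggested in part (e), in Python; the weight-matrix interface and the array names best and nearest (playing the role of the hint's labels nk(j)) are conventions chosen here:

```python
def prim_mst(w):
    """O(N^2) sketch of the Prim-Dijkstra greedy algorithm of part (e).

    w is an N x N symmetric weight matrix with w[i][j] = float('inf') if
    there is no arc (a simplifying convention; orientation is irrelevant
    here).  The graph is assumed connected.  Per the hint, best[j] holds
    the minimum weight of an arc connecting j to the current fragment,
    and nearest[j] the fragment node attaining it.
    """
    n = len(w)
    in_frag = [False] * n
    in_frag[0] = True                     # initial fragment: node 0
    best = list(w[0])
    nearest = [0] * n
    tree, cost = [], 0
    for _ in range(n - 1):
        # O(N) scan for a minimum weight arc of A(F)
        j = min((k for k in range(n) if not in_frag[k]),
                key=lambda k: best[k])
        tree.append((nearest[j], j))
        cost += best[j]
        in_frag[j] = True
        for k in range(n):                # update labels for the new node
            if not in_frag[k] and w[j][k] < best[k]:
                best[k] = w[j][k]
                nearest[k] = j
    return tree, cost
```

Each of the N − 1 iterations does two O(N) scans, for an O(N^2) total.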
112 The Shortest Path Problem Chap. 2
2.32
Consider the one origin-all destinations problem and the generic algorithm of
Section 2.2. Assume that there exists a path that starts at node 1 and contains
a cycle with negative length. Assume also that the generic algorithm is operated
so that if a given node belongs to the candidate list for an infinite number of
iterations, then it also exits the list an infinite number of times. Show that there
exists at least one node j such that the sequence of labels dj generated by the algorithm diverges to −∞. Hint: Argue that if the limits dj of all the node labels are finite, then we have dj ≤ di + aij for all arcs (i, j).
Consider the problem of finding a shortest path from node 1 to a node t, as-
suming that there exists at least one such path and that all cycles have positive
length. This exercise deals with a modified version of the auction algorithm,
which was developed in Bertsekas [1992b], motivated by a similar earlier algo-
rithm by Cerulli, De Leone, and Piacente [1994]. This modified version aims to
use larger price increases than the original method. The algorithm maintains a
price vector p and a simple path P that starts at the origin, and is initialized
with P = (1) and any price vector p satisfying

pi ≤ aij + pj,    ∀ (i, j) ∈ A,

where the graph is assumed augmented with an artificial self-arc (i, i) of length

aii = 0,    ∀ i ∈ N.

At each iteration, let i be the terminal node of P and let

ji = arg min_{j|(i,j)∈A} {aij + pj},

with the extra requirement that ji = i whenever possible; that is, we choose ji = i whenever the minimum above is attained for j = i. Set

pji := min_{j|(i,j)∈A, j≠ji} {aij + pj} − aiji.
(a) Show that throughout the algorithm, we have

πi = aij + pj,    ∀ (i, j) ∈ P,
πi = pi,    ∀ i ∉ P,

where

πi = min{ pi, min_{j|(i,j)∈A} {aij + pj} },    ∀ i ∈ N.
(b) Throughout the algorithm, P is a shortest path between its endnodes. Hint:
Show that if P̃ is another path with the same endnodes, we have
Length of P̃ − Length of P = Σ_{k|k∈P̃, k∉P} (πk − pk) − Σ_{k|k∈P, k∉P̃} (πk − pk) ≥ 0.
(c) The algorithm terminates with a shortest path from 1 to t. Note: This is
challenging. A proof is given in Bertsekas [1992b].
(d) Convert the shortest path problem to an equivalent assignment problem
for which the conditions of part (a) are the complementary slackness con-
ditions. Show that the algorithm is essentially equivalent to a naive auction
algorithm applied to the equivalent assignment problem.
where r(·) is a given nonnegative and continuous function. The final time T and
the control trajectory {u(t) | 0 ≤ t ≤ T } are subject to optimization. Suppose
we discretize the plane with a mesh of size δ that passes through x(0) and x(T ),
and we introduce a shortest path problem of going from x(0) to x(T ) using moves
of the following type: from each mesh point x = (x1 , x2 ) we can go to each of
the mesh points (x1 + δ, x2 ), (x1 − δ, x2 ), (x1 , x2 + δ), and (x1 , x2 − δ), at a cost
r(x)δ. Show by example that this is a bad discretization of the original problem
in the sense that the shortest distance need not approach the optimal cost of
the original problem as δ → 0. Note: This exercise illustrates a common pitfall.
The difficulty is that the control constraint set (the surface of the unit sphere)
should be finely discretized as well. For a proper treatment of the problem
of discretization, see the original papers by Gonzalez and Rofman [1985], and
Falcone [1987], the survey paper by Kushner [1990], the monograph by Kushner
and Dupuis [1992], and the references cited there. For analogs of the label setting
and label correcting algorithms of the present chapter, see the papers by Tsitsiklis
[1995], and by Polymenakos, Bertsekas, and Tsitsiklis [1998].
3

The Max-Flow Problem
116 The Max-Flow Problem Chap. 3
The key idea in the max-flow problem is very simple: a feasible flow x can
be improved if we can find a path from s to t that is unblocked with respect
to x. Pushing a positive increment of flow along such a path results in larger
divergence out of s, while maintaining flow feasibility. Most (though not
all) of the available max-flow algorithms are based on iterative application
of this idea.
We may also ask the reverse question. If we can’t find an unblocked
path from s to t, is the current flow maximal? The answer is positive,
Sec. 3.1 The Max-Flow and Min-Cut Problems 117
Figure 3.1: Converting the feasibility problem to a max-flow problem. A source node s is connected to each supply node i ∈ I+ (nodes 1 and 2 in the figure) by an arc with flow range [0, si], and each demand node i ∈ I− (nodes 4 and 5) is connected to a sink node t by an arc with flow range [0, −si].
Denote by I + = {i | si > 0} the set of source nodes ({1, 2} in the figure) and by
I − = {i | si < 0} the set of sink nodes ({4, 5} in the figure). If both these sets
are empty, the zero vector is a feasible flow, and we are done. Otherwise, these
sets are both nonempty (since Σi si = 0). We introduce a node s, and for all
i ∈ I + , the arcs (s, i) with flow range [0, si ]. We also introduce a node t, and
for all i ∈ I − , the arcs (i, t) with flow range [0, −si ]. Now consider the max-flow
problem of maximizing the divergence out of s and into t, while observing the
capacity constraints. Then there exists a solution to the feasibility problem of
Eqs. (3.1) and (3.2), if and only if the maximum divergence out of s is equal to Σ_{i∈I+} si. If this condition is satisfied, solutions of the feasibility problem are in
one-to-one correspondence with optimal solutions of the max-flow problem.
If the capacity constraints involve lower bounds, bij ≤ xij ≤ cij , we may
convert first the feasibility problem to one with zero lower flow bounds by a
translation of variables, which replaces each variable xij with a variable zij =
xij − bij .
Also, a max-flow problem can (in principle) be solved by an algorithm
that solves the feasibility problem (we try to find a sequence of feasible flows
with monotonically increasing divergence out of s, stopping with a maximum flow
when no further improvement is possible). In fact, this is the main idea of the
Ford-Fulkerson method, to be discussed in Section 3.2.
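As an illustration of the conversion of Fig. 3.1, the following Python sketch builds the max-flow problem from the supplies si; the node names 's' and 't' and the returned tuple format are conventions chosen here, not part of the text.

```python
def feasibility_as_max_flow(supplies, arcs):
    """Construct the max-flow problem of the Fig. 3.1 conversion (sketch).

    supplies: dict node -> s_i, with sum of all s_i equal to 0.
    arcs: list of (i, j, c_ij) for the flow ranges [0, c_ij].
    Returns (cap, source, sink, target): the feasibility problem has a
    solution iff the maximum s -> t flow equals target, the sum of the
    positive supplies.  Node names 's' and 't' are assumed unused.
    """
    assert sum(supplies.values()) == 0
    cap = {(i, j): c for i, j, c in arcs}
    s, t = 's', 't'
    target = 0
    for i, si in supplies.items():
        if si > 0:
            cap[(s, i)] = si          # arc (s, i) with flow range [0, s_i]
            target += si
        elif si < 0:
            cap[(i, t)] = -si         # arc (i, t) with flow range [0, -s_i]
    return cap, s, t, target
```

Any max-flow routine can then be applied to cap; feasibility holds exactly when the maximum flow value equals target.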
although the reason is not entirely obvious. For a brief justification, con-
sider the minimum cost flow formulation of the max-flow problem, given in
Example 1.3, which involves the artificial feedback arc (t, s) (see Fig. 3.2).
Then, a cycle has negative cost if and only if it includes the arc (t, s), since
this arc has cost -1 and is the only arc with nonzero cost. By Prop. 1.2, if
a feasible flow vector x is not optimal, there must exist a simple cycle with
negative cost that is unblocked with respect to x; this cycle must consist
of the arc (t, s) and a path from s to t, which is unblocked with respect to
x. Thus, if there is no path from s to t that is unblocked with respect to a
given flow vector x, then there is no cycle of negative cost and x must be
optimal.
Figure 3.2: The minimum cost flow formulation of the max-flow problem, with the artificial feedback arc (t, s) connecting the sink t back to the source s.
A cut Q in a graph (N , A) is a partition of the node set N into two nonempty subsets, a set S and its complement N − S; we write

Q = [S, N − S].
Note that the partition is ordered in the sense that the cut [S, N − S] is
distinct from the cut [N − S, S]. For a cut Q = [S, N − S], we use the
notation
Q+ = { (i, j) ∈ A | i ∈ S, j ∉ S },
Q− = { (i, j) ∈ A | i ∉ S, j ∈ S },
and we say that Q+ and Q− are the sets of forward and backward arcs of
the cut, respectively. We say that the cut Q is nonempty if Q+ ∪ Q− ≠ Ø; otherwise we say that Q is empty. We say that the cut [S, N − S] separates node s from node t if s ∈ S and t ∉ S. These definitions are illustrated in
Fig. 3.3.
Let us recall from Section 1.1.2 the definition of the divergence of a node i:
yi = Σ_{j|(i,j)∈A} xij − Σ_{j|(j,i)∈A} xji,    ∀ i ∈ N.
Given a flow vector x, the flux across a nonempty cut Q = [S, N − S] is defined as

F(Q) = Σ_{(i,j)∈Q+} xij − Σ_{(i,j)∈Q−} xij.

The following calculation shows that F(Q) is also equal to the sum of the divergences yi of the nodes in S:

F(Q) = Σ_{(i,j)∈A|i∈S, j∉S} xij − Σ_{(i,j)∈A|i∉S, j∈S} xij
     = Σ_{i∈S} ( Σ_{j|(i,j)∈A} xij − Σ_{j|(j,i)∈A} xji )        (3.3)
     = Σ_{i∈S} yi.
(The second equality holds because the flow of an arc with both end nodes
in S cancels out within the parentheses; it appears twice, once with a
positive and once with a negative sign.)
Given lower and upper flow bounds bij and cij for each arc (i, j), the
capacity of a nonempty cut Q is
C(Q) = Σ_{(i,j)∈Q+} cij − Σ_{(i,j)∈Q−} bij.        (3.4)
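For concreteness, the flux (3.3) and capacity (3.4) of a cut can be computed directly from the definitions; a small Python sketch follows (the arc data format is a convention chosen here):

```python
def flux_and_capacity(S, arcs):
    """Flux F(Q) and capacity C(Q) of the cut Q = [S, N - S].

    S: set of nodes on the source side of the cut.
    arcs: dict (i, j) -> (b_ij, c_ij, x_ij) giving flow bounds and flow
    (a format chosen here for illustration).  Follows Eqs. (3.3), (3.4).
    """
    F = C = 0
    for (i, j), (b, c, x) in arcs.items():
        if i in S and j not in S:      # forward arc of the cut
            F += x
            C += c
        elif i not in S and j in S:    # backward arc of the cut
            F -= x
            C -= b
    return F, C
```

With this convention, a cut Q is saturated with respect to x exactly when the two returned values are equal.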
Clearly, for any capacity-feasible flow vector x, the flux F (Q) across Q is
no larger than the cut capacity C(Q). If F (Q) = C(Q), then Q is said to
be a saturated cut with respect to x; the flow of each forward (backward)
arc of such a cut must be at its upper (lower) bound. By convention, every
empty cut is also said to be saturated. The following is a simple but useful
result.
and mark each node n ∈ Tk+1 with the label “(m, n)” or “(n, m),”
where m is a node of Tk and (m, n) or (n, m) is an arc with the property
stated in the above equation, respectively.
Consider now the max-flow problem, where we want to maximize the diver-
gence out of s over all capacity-feasible flow vectors having zero divergence
for all nodes other than s and t. Given any such flow vector and any cut
Q separating s from t, the divergence out of s is equal to the flux across Q
[cf. Eq. (3.3)], which in turn is no larger than the capacity of Q. Thus, the value of the maximum flow is no larger than the minimum cut capacity over all cuts Q separating s from t:

max-flow value ≤ min_{Q separating s from t} C(Q).        (3.5)
Thus the theory of Chapter 5 (or the Weierstrass theorem) guarantees that
the max-flow problem has an optimal solution in this case. This is stated
as part (b) of the following theorem, even though its complete proof must
await the developments of Chapter 5.
Proof: (a) Let F ∗ be the value of the maximum flow, that is, the diver-
gence out of s corresponding to x∗ . There cannot exist an unblocked path
P from s to t with respect to x∗ , since by increasing the flow of the forward
arcs of P and by decreasing the flow of the backward arcs of P by a com-
mon positive increment, we would obtain a flow vector with a divergence
out of s larger than F ∗ . Therefore, by Prop. 3.1, there must exist a cut
Q, that is saturated with respect to x∗ and separates s from t. The flux
across Q is equal to F ∗ and is also equal to the capacity of Q [since Q is
saturated; see Eqs. (3.3) and (3.4)]. Since we know that F ∗ is less than or equal
to the minimum cut capacity [cf. Eq. (3.5)], the result follows.
(b) See the discussion preceding the proposition. Q.E.D.
Q = [S, N − S].

Clearly Q separates s and t. If (i, j) ∈ Q+, then we have x∗ij = cij because i belongs to one of the sets S such that [S, N − S] is a saturated cut, and j does not belong to S since j ∉ S. Thus we have x∗ij = cij for all (i, j) ∈ Q+.
Similarly, we obtain x∗ij = bij for all (i, j) ∈ Q−. Thus Q is a saturated cut
separating s and t, and in view of its definition, it is the maximal such cut.
By using set intersection in place of set union in the preceding argument, it
is seen that we can similarly form the minimal saturated cut that separates
s and t.
The maximal and minimal saturated cuts can be used to deal with
infeasibility in the context of various network flow problems, as we discuss
next.
I + = {i | si > 0}
I − = {i | si < 0}.
Then it may make sense to minimize the cost function over the set of all
maximally feasible flows, which is the set of flow vectors x whose diver-
gences
yi = Σ_{j|(i,j)∈A} xij − Σ_{j|(j,i)∈A} xji
satisfy
yi ≥ 0 if i ∈ I+,
yi ≤ 0 if i ∈ I−,
yi = 0 if i ∉ I+ ∪ I−,
Sec. 3.2 The Ford-Fulkerson Algorithm 125
and minimize
Σ_{i∈N} |si − yi|.
respectively, separating the supply node set P from the demand node set D.
Furthermore, the flows of all arcs (i, j) that belong to these cuts are equal to
x∗ij for every maximally feasible flow vector. It can now be seen that given
x∗ , we can decompose the problem of minimizing the cost function over the
set of maximally feasible flows into two or three feasible and independent
subproblems, depending on whether Smin = Smax or not. The node sets of
these problems are Smin, N − Smax, and Smax − Smin (if Smax ≠ Smin).
The supplies for these problems are appropriately adjusted to take into
account the arc flows x∗ij for the arcs (i, j) of the corresponding cuts, as
illustrated in Fig. 3.5.
Figure 3.5: Decomposition into independent subproblems based on the minimal and maximal saturated cuts. The max-flow and capacity are shown next to each arc; all lower flow bounds are 0.
(2) The problem involving the nodes 3 and 4, with conservation of flow con-
straint (for both nodes) x34 = 2.
(3) The problem involving the nodes 5, 6, and t, with conservation of flow
constraints
Note that while in this example the second problem is trivial (it has only one feasible solution), the first and third problems have multiple feasible solutions.
Figure 3.7: An example showing that if the augmenting paths used in the Ford-Fulkerson algorithm do not have a number of arcs that is as small as possible, the number of iterations may be very large. The four outer arcs have flow range [0, C], where C is a large integer, and the middle arc connecting nodes 2 and 3 has flow range [0, 1]. The maximum flow is 2C, and can be produced after a sequence of 2C augmentations using the three-arc augmenting paths shown in the figure (one such path for odd numbered iterations, and another for even numbered iterations). Thus, the running time is pseudopolynomial (it is proportional to C). If on the other hand the two-arc augmenting paths (1, 2, 4) and (1, 3, 4) are used, only two augmentations are needed.
Using “shortest” augmenting paths (paths with as few arcs as possible) not only guarantees termination of the Ford-Fulkerson algorithm; it turns out that it also results in polynomial running time, as the example of Fig. 3.7
illustrates. In particular, the number of augmentations of the algorithm
with shortest augmenting paths can be estimated as O(N A); see Exercise
3.12. This yields an O(N A2 ) running time to solve the problem, since
each augmentation requires O(A) operations to execute the unblocked path
search method and to carry out the subsequent flow update.
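The shortest augmenting path variant just described can be sketched as follows, with each augmenting path found by breadth-first search over the arcs with positive residual capacity. This is an illustration only; for simplicity it assumes integer capacities, zero lower flow bounds, and no antiparallel arc pairs.

```python
from collections import deque

def max_flow(cap, s, t):
    """Ford-Fulkerson with shortest (fewest-arc) augmenting paths.

    cap: dict (i, j) -> capacity, with lower bounds 0 and integer
    capacities assumed; antiparallel arc pairs are assumed absent.
    Returns (value, flow) with flow a dict over the original arcs.
    """
    flow = {a: 0 for a in cap}
    res = dict(cap)                    # residual capacities
    for i, j in cap:
        res.setdefault((j, i), 0)      # reverse arcs start at 0
    adj = {}
    for i, j in res:
        adj.setdefault(i, []).append(j)
    value = 0
    while True:
        # BFS for a fewest-arc unblocked path from s to t
        pred = {s: None}
        q = deque([s])
        while q and t not in pred:
            i = q.popleft()
            for j in adj.get(i, []):
                if j not in pred and res[(i, j)] > 0:
                    pred[j] = i
                    q.append(j)
        if t not in pred:
            return value, flow         # no augmenting path: flow is maximal
        path, j = [], t
        while pred[j] is not None:
            path.append((pred[j], j))
            j = pred[j]
        delta = min(res[a] for a in path)   # bottleneck residual capacity
        for i, j in path:                   # push delta along the path
            res[(i, j)] -= delta
            res[(j, i)] += delta
            if (i, j) in flow:
                flow[(i, j)] += delta
            else:
                flow[(j, i)] -= delta
        value += delta
```

Each BFS costs O(A), consistent with the O(N A^2) bound cited in the text for the shortest augmenting path method.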
Much research has been devoted to developing max-flow algorithms
with better than O(N A2 ) running time. The algorithms that we will discuss
can be grouped into two main categories:
(a) Variants of the Ford-Fulkerson algorithm, which use special data
structures and preprocessing calculations to generate augmenting paths
efficiently. We will describe some algorithms of this type in what fol-
lows in this chapter.
(b) Algorithms that depart from the augmenting path approach, but in-
stead move flow from the source to the sink in a less structured fash-
ion than the Ford-Fulkerson algorithm. These algorithms, known as
preflow-push methods, will be discussed in Section 7.3. Their underly-
ing mechanism is related to the one of the auction algorithm described
in Section 1.3.3.
The algorithms that have the best running times at present are the preflow-
push methods. In particular, in Section 7.3 we will demonstrate an O(N 3 )
running time for one of these methods, and we will describe another method
with an O(N 2 A1/2 ) running time. Preflow-push algorithms with even bet-
ter running times exist (see the discussion in Chapter 7). It is unclear,
however, whether the best preflow-push methods outperform in practice
the best of the Ford-Fulkerson-like algorithms of this chapter.
In the remainder of this chapter, we will discuss efficient variants of
the Ford-Fulkerson algorithm. These variants are motivated by a clear
inefficiency of the unblocked path search algorithm: it discards all the la-
beling information collected from the construction of each augmenting path.
Since, in a large graph, an augmentation typically has a relatively small ef-
fect on the current flow vector, each augmenting path problem is similar to
the next augmenting path problem. One would thus think that the search
for an augmenting path could be organized to preserve information for use
in subsequent augmentations.
A prime example of an algorithm that cleverly preserves such infor-
mation is the historically important algorithm of Dinic [1970], illustrated
in Figure 3.8. Let us assume for simplicity that each lower arc flow bound
is zero. One possible implementation of the algorithm starts with the zero
flow vector and operates in phases. At the start of each phase, we have a
feasible flow vector x and we construct an acyclic network, called the lay-
ered network , which is partitioned in layers (subsets) of nodes as follows:
Layer 0 consists of just the sink node t, and layer k consists of all nodes
i such that the shortest unblocked path from i to t has k arcs. Let
k(i) be the layer number of each node i [k(i) = ∞ if i does not belong
to any layer].
If the source node s does not belong to any layer, there must exist a
saturated cut separating s from t, so the current flow is maximal and
the algorithm terminates. Otherwise, we form the layered network as
follows: we delete all nodes i such that k(i) ≥ k(s) and their incident
arcs, and we delete all remaining arcs except the arcs (i, j) such that
k(i) = k(j) + 1 and xij < cij , or k(j) = k(i) + 1 and xij > 0.
Figure 3.8: Illustration of Dinic’s algorithm for the problem shown at the top
left (node 1 is the source and node 6 is the sink).
In the first phase, there are three layers, as shown in the top right figure.
There are three augmentations in the layered network (1 → 2 → 6, 1 → 3 → 6,
and 1 → 4 → 6), and the resulting flows are shown in the middle left figure. In
the second phase, there are four layers, as shown in the bottom right figure. There
is only one augmenting path in the layered network (1 → 2 → 4 → 6), and the
resulting flows are shown in the bottom left figure. The algorithm then terminates
because in constructing the layered network, no augmenting paths from 1 to 6
can be found.
Notice a key property of the algorithm: with each new phase, the layer
number of the source node is strictly increased (from 2 to 3 in this example).
This property shows that the number of phases is at most N − 1.
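The construction of the layered network at the start of a phase can be sketched as follows (Python; a simple illustration rather than the more efficient per-phase implementation discussed in the exercises). Layer numbers are found by a breadth-first search backward from the sink along unblocked traversals, and the arc set is then filtered by the rules given above.

```python
from collections import deque

def layered_network(num_nodes, cap, x, s, t):
    """Layered network of one phase of Dinic's algorithm (sketch).

    x is the current feasible flow; all lower flow bounds are zero, as in
    the text.  cap, x: dicts (i, j) -> c_ij and x_ij.  Returns (k, arcs):
    k[i] is the layer number of node i (None plays the role of k(i) =
    infinity), and arcs is the arc set of the layered network; an empty
    arc set signals a saturated cut separating s from t.
    """
    inc = [[] for _ in range(num_nodes)]   # arcs incident to each node
    for (u, v) in cap:
        inc[u].append((u, v))
        inc[v].append((u, v))
    k = [None] * num_nodes
    k[t] = 0
    q = deque([t])
    while q:                               # BFS backward from the sink
        n = q.popleft()
        for (u, v) in inc[n]:
            # the step u -> v is unblocked if x < c; v -> u if x > 0
            if v == n and x[(u, v)] < cap[(u, v)] and k[u] is None:
                k[u] = k[n] + 1
                q.append(u)
            elif u == n and x[(u, v)] > 0 and k[v] is None:
                k[v] = k[n] + 1
                q.append(v)
    if k[s] is None:
        return k, []                       # current flow is maximal
    keep = lambda i: k[i] is not None and (k[i] < k[s] or i == s)
    arcs = [(i, j) for (i, j) in cap if keep(i) and keep(j)
            and ((k[i] == k[j] + 1 and x[(i, j)] < cap[(i, j)])
                 or (k[j] == k[i] + 1 and x[(i, j)] > 0))]
    return k, arcs
```

The phase then performs augmentations within the layered network until no augmenting path remains, and the construction is repeated for the next phase.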
Figure 3.9: Illustration of the reduced graph corresponding to a given flow vector. Node 1 is the source, and node 6 is the sink. Figure (a) shows the original graph, and the flow and upper flow bound next to each arc (all lower flow bounds are 0). Figure (b) shows the reduced graph. The arc (4,2) is added because the flow of arc (2,4) is strictly between the arc flow bounds. The arcs (1,2) and (4,6) are reversed because their flows are at the corresponding upper bounds. Note that every forward path in the reduced graph, such as (1, 4, 2, 6), corresponds to an unblocked path in the original graph.
pi ≤ pj + 1        (3.6)
for all arcs (i, j) of the reduced graph. Furthermore, upon discovery of a
shortest augmenting path, there holds
pi = pj + 1
for all arcs (i, j) of the augmenting path. It can be seen that this equality
guarantees that following a flow augmentation, the CS condition (3.6) will
be satisfied for all newly created arcs of the reduced graph. As a result,
following an augmentation along a shortest path found by the auction al-
gorithm, the node prices can be reused without modification to start the
auction algorithm for finding the next shortest augmenting path.
The preceding observations can be used to formally define a max-
flow algorithm, where each augmenting path is found as a shortest path
from s to t in the reduced graph using the auction algorithm as a shortest
path subroutine. The initial node prices can be all equal to 0, and the
prevailing prices upon discovery of a shortest augmenting path are used as
the starting prices for searching for the next augmenting path. The auction
algorithm maintains a path starting at s, which is contracted or extended
at each iteration. The price of the terminal node of the path increases by at
least 1 whenever there is a contraction. An augmentation occurs whenever
the terminal node of the path is the sink node t. The overall algorithm is
terminated when the price of the terminal node exceeds N − 1, indicating
that there is no path starting at s and ending at t.
It is possible to show that, with proper implementation, the max-
flow algorithm just described has an O(N 2 A) running time. Unfortunately,
however, the practical performance of the algorithm is not very satisfactory,
because the computation required by the auction/shortest path algorithm
is usually much larger than what is needed to find an augmenting path. The
reason is that one needs just a path from s to t in the reduced graph and
Sec. 3.3 Price-Based Augmenting Path Algorithms 135
pi ≤ pj + 1, ∀ (i, j) ∈ A, (3.7)
ps < N, pt = 0, (3.8)
pi ≥ pj,    ∀ (i, j) ∈ P.        (3.9)
Set P = (s), and select p such that Eqs. (3.7) and (3.8) hold.
Step 1 (Check for contraction or extension): Let nk be the
end node of the current path P and if nk = s, let pred(nk ) be the
predecessor node of nk on P . If the set of downstream neighbors of
nk ,
N (nk ) = {j | (nk , j) ∈ A},
is empty, set pnk = N and go to Step 3. Otherwise, find a node in
N (nk ) with minimal price and denote it succ(nk ):

psucc(nk) = min_{j∈N (nk )} pj.        (3.10)

Set

pnk = psucc(nk ) + 1.        (3.11)
If nk = s, or if
(2) The conditions (3.7)-(3.9) are satisfied each time Step 1 is entered.
The proof is by induction. These conditions hold initially by assump-
tion. Condition (3.8) is maintained by the algorithm, since termi-
nation occurs as soon as ps ≥ N or t becomes the end node of P .
To verify conditions (3.7) and (3.9), we note that only the price of
nk can change in Step 1, and by Eqs. (3.10) and (3.11), this price
change maintains condition (3.7) for all arcs, and condition (3.9) for all arcs of P, except possibly for the arc (pred(nk ), nk ) in the case of
an extension with the condition
ppred(nk ) > psucc(nk )
holding. In the latter case, we must have
ppred(nk ) ≥ psucc(nk ) + 1
because the prices are integer, so by Eq. (3.11), we have
ppred(nk ) ≥ pnk
Proof: We first note that the prices of the nodes of P are upper bounded
by N in view of Eqs. (3.8) and (3.9). Next we observe that there is a
price change of at least one unit with each contraction, and since the prices
of the nodes of P are upper bounded by N , there can be only a finite
number of contractions. Since P never contains a cycle, there can be at
most N − 1 successive extensions without a contraction, so the algorithm
must terminate. Throughout the algorithm, we have pt = 0 and pi ≤ pj + 1
for all arcs (i, j). Hence, if a forward path from s to t exists, we must have
ps < N throughout the algorithm, including at termination, and since
termination via Step 3 requires that ps ≥ N , it follows that the algorithm
must terminate via Step 2 with a path from s to t. If a forward path from
s to t does not exist, termination can only occur via Step 3, in which case
we must have ps ≥ N . Q.E.D.
Sec. 3.4 Notes, Sources, and Exercises 139
pi ≥ pj
for all arcs (i, j) of P . It follows that as the reduced graph changes due to
the corresponding augmentation, for every newly created arc (j, i) of the
reduced graph, the arc (i, j) must belong to P , so that pi ≥ pj . Hence the
newly created arc (j, i) of the reduced graph will also satisfy the required
condition pj ≤ pi + 1 [cf. Eq. (3.12)].
For a practically efficient implementation of the max-flow algorithm
just described, a number of fairly complex modifications may be needed. A
description of these and a favorable computational comparison with other
competing methods can be found in Bertsekas [1995c], where an O(N 2 A)
complexity bound is also shown for a suitable variant of the method.
EXERCISES
3.1
Figure 3.11: Max-flow problem for Exercise 3.1. The arc capacities are shown next to the arcs.
and mark each node n ∈ Tk+1 with the label “(m, n)” or “(n, m),” where
m is a node of Tk such that (m, n) or (n, m) is an arc, respectively. The
algorithm terminates if either (1) Tk+1 is empty or (2) j ∈ Tk+1 . Show
that case (1) occurs if and only if there is no path from i to j. If case (2)
occurs, how would you use the labels to construct a path from i to j?
(b) Show that a path found by breadth-first search has a minimum number of
arcs over all paths from i to j.
(c) Modify the algorithm of part (a) so that it finds a forward path from i to
j.
Show that the minimum cost flow problem introduced in Section 1.2.1 has a feasible solution if and only if Σ_{i∈N} si = 0 and for every cut Q = [S, N − S] we have

Capacity of Q ≥ Σ_{i∈S} si.
Show also that feasibility of the problem can be determined by solving a max-
flow problem with zero lower flow bounds. Hint: Assume first that all lower flow
bounds bij are zero. Use the conversion to a max-flow problem of Fig. 3.1, and
apply the max-flow/min-cut theorem. In the general case, transform the problem
to one with zero lower flow bounds.
Describe an algorithm of the Ford-Fulkerson type for checking the feasibility and
finding a feasible solution of a minimum cost flow problem (cf., Section 1.2.1). If
the supplies si and the arc flow bounds bij and cij are integer, your algorithm
should be guaranteed to find an integer feasible solution (assuming at least one
feasible solution exists). Hint: Use the conversion to a max-flow problem of Fig.
3.1.
Given a graph (N , A) and a flow vector x with integer divergence, show that
there exists an integer flow vector x having the same divergence vector as x and
satisfying
|xij − xij | < 1, ∀ (i, j) ∈ A.
Hint: For each arc (i, j), define the integer flow bounds

bij = ⌊xij⌋,    cij = ⌈xij⌉.
3.6
Consider a graph with arc flow range [0, cij ] for each arc (i, j), and let x be a
capacity-feasible flow vector.
(a) Consider any subset S of nodes all of which have nonpositive divergence
and at least one of which has negative divergence. Show that there must
exist at least one arc (i, j) with i ∉ S and j ∈ S such that xij > 0.
(b) Show that for each node with negative divergence there is an augmenting
path that starts at that node and ends at a node with positive divergence.
Hint: Construct such a path using an algorithm that is based on part (a).
This counterexample (from Chvatal [1983]) illustrates how the version of the Ford-
Fulkerson method where augmenting paths need not have as few arcs as possible
may not terminate for a problem with irrational arc flow bounds. Consider the
max-flow problem shown in Fig. 3.12.
(a) Verify that an infinite sequence of augmenting paths is characterized by
the table of Fig. 3.12; each augmentation increases the divergence out of
the source s but the sequence of divergences converges to a value, which
can be arbitrarily smaller than the maximum flow.
(b) Solve the problem with the Ford-Fulkerson method (where the augmenting
paths involve a minimum number of arcs, as given in Section 3.2).
Let s and t be two nodes in a directed graph. Use the max-flow/min-cut theorem
to show that:
(a) The maximum number of forward paths from s to t that do not share any
arcs is equal to the minimum number of arcs that when removed from the
graph, eliminate all forward paths from s to t.
(b) The maximum number of forward paths from s to t that do not share any
nodes (other than s and t) is equal to the minimum number of nodes that
when removed from the graph, eliminate all forward paths from s to t.
Figure 3.12: Max-flow problem illustrating that if the augmenting paths in the
Ford-Fulkerson method do not have a minimum number of arcs, then the method
may not terminate. All lower arc flow bounds are zero. The upper flow bounds
are larger than one, with the exception of the thick-line arcs; these are arc (3, 6)
which has upper flow bound equal to one, and arcs (1, 2) and (4, 6) which have
upper flow bound equal to σ = (−1 + √5)/2. (Note a crucial property of σ: it satisfies σ^{k+2} = σ^k − σ^{k+1} for all integer k ≥ 0.) The table gives a sequence of
augmentations.
Consider a bipartite graph consisting of two sets of nodes S and T such that
every arc has its start node in S and its end node in T . A matching is a subset of
arcs such that all the start nodes of the arcs are distinct and all the end nodes of
the arcs are distinct. A maximal matching is a matching with a maximal number
of arcs.
(a) Show that the problem of finding a maximal matching can be formulated
as a max-flow problem.
(b) Define a cover C to be a subset of S ∪ T such that for each arc (i, j), either
i ∈ C or j ∈ C (or both). A minimal cover is a cover with a minimal number
of nodes. Show that the number of arcs in a maximal matching and the
number of nodes in a minimal cover are equal. (Variants of this theorem
were independently published by König [1931] and Egervary [1931].) Hint:
Use the max-flow/min-cut theorem.
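For part (a), the max-flow formulation with unit capacities makes each Ford-Fulkerson augmentation a search for an alternating path, which can be coded directly. The Python sketch below is an illustration (the interface and names are conventions chosen here): adding a source s with arcs [0, 1] to every node of S and a sink t with arcs [0, 1] from every node of T, an augmentation either matches a free node of T or rematches an already matched one.

```python
def max_matching(S, T, arcs):
    """Maximal matching in a bipartite graph via augmenting paths.

    The Ford-Fulkerson construction of part (a), specialized to unit
    capacities: each augmentation becomes a search for an alternating
    path.  arcs: list of (i, j) with i in S and j in T.
    Returns (size, matching) with matching a set of arcs.
    """
    adj = {i: [] for i in S}
    for i, j in arcs:
        adj[i].append(j)
    mate = {}                      # mate[j] = node of S matched to j

    def try_augment(i, seen):
        for j in adj[i]:
            if j not in seen:
                seen.add(j)
                # j is free, or its current mate can be rematched elsewhere
                if j not in mate or try_augment(mate[j], seen):
                    mate[j] = i
                    return True
        return False

    size = sum(try_augment(i, set()) for i in S)
    return size, {(i, j) for j, i in mate.items()}
```

By the max-flow/min-cut theorem, the size returned equals the minimum number of nodes in a cover, which is the content of part (b).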
3.11
(c) For any path P that is unblocked with respect to xk , let nk (P ) be the
number of arcs of P , let a+k (i) be the minimum of nk (P ) over all unblocked
P from s to i, and let a− k (i) be the minimum of nk (P ) over all unblocked
P from i to t. Show that for all i and k we have
a+_k(i) ≤ a+_{k+1}(i),    a−_k(i) ≤ a−_{k+1}(i).
(d) Show that if k < k′ and arc (i, j) is both a k+ -bottleneck and a k′+ -bottleneck, or is both a k− -bottleneck and a k′− -bottleneck, then a+_k(t) < a+_{k′}(t).
(e) Show that the algorithm terminates after O(N A) augmentations, for an
O(N A2 ) running time.
Consider the algorithm described near the end of Section 3.2, which uses phases
and augmentations through a layered network.
(a) Provide an algorithm for constructing the layered network of each phase in
O(A) time.
(b) Show that the number of augmentations in each phase is at most A, and
provide an implementation whereby these augmentations require O(N A)
total time.
(c) Show that with each phase, the layer number k(s) of the source node s
increases strictly, so that there can be at most N − 1 phases.
(d) Show that with the implementations of (a) and (b), the running time of
the algorithm is O(N 2 A).
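Parts (a)-(d) outline what is essentially Dinic's method. A compact sketch, with two liberties (layers are computed from the source by breadth-first search, rather than from the sink as in Section 3.2, and a per-node arc pointer keeps each phase within O(NA)), might look as follows; the instance at the end is made up.

```python
from collections import deque

def max_flow_dinic(n, arcs, s, t):
    """Dinic-style max-flow: repeat phases of (1) building a layered network
    by breadth-first search and (2) saturating it with augmentations found by
    depth-first search.  arcs: list of (u, v, capacity); nodes are 0..n-1."""
    graph = [[] for _ in range(n)]   # graph[u]: indices of arcs leaving u
    to, cap = [], []
    def add_arc(u, v, c):
        graph[u].append(len(to)); to.append(v); cap.append(c)
        graph[v].append(len(to)); to.append(u); cap.append(0)   # reverse arc
    for u, v, c in arcs:
        add_arc(u, v, c)

    flow = 0
    while True:
        # Phase: layer numbers from the source, computed in O(A) [cf. part (a)].
        level = [-1] * n
        level[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for e in graph[u]:
                if cap[e] > 0 and level[to[e]] < 0:
                    level[to[e]] = level[u] + 1
                    q.append(to[e])
        if level[t] < 0:
            return flow               # no unblocked path: the flow is maximal
        ptr = [0] * n                 # per-node arc pointer [cf. part (b)]

        def dfs(u, f):
            if u == t:
                return f
            while ptr[u] < len(graph[u]):
                e = graph[u][ptr[u]]
                v = to[e]
                if cap[e] > 0 and level[v] == level[u] + 1:
                    d = dfs(v, min(f, cap[e]))
                    if d > 0:
                        cap[e] -= d
                        cap[e ^ 1] += d   # arcs e and e^1 are reverse pairs
                        return d
                ptr[u] += 1
            return 0

        while True:
            f = dfs(s, float('inf'))
            if f == 0:
                break
            flow += f

print(max_flow_dinic(4, [(0, 1, 2), (0, 2, 2), (1, 3, 2), (2, 3, 2), (1, 2, 1)], 0, 3))  # 4
```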
Consider the max-flow problem in the special case where the arc flow range is
[0,1] for all arcs.
(a) Show that each path from the source to the sink that is unblocked with respect to the zero flow has at most 2N/√M arcs, where M is the value of the maximum flow. Hint: Let N_k be the number of nodes i such that the shortest unblocked path from s to i has k arcs. Argue that N_k N_{k+1} ≥ M.
(b) Show that the running time of the layered network algorithm (cf. Fig. 3.8) is reduced to O(N^{2/3}A). Hint: Argue that each arc of the layered network can be part of at most one augmenting path in a given phase, so the augmentations of each phase require O(A) computation. Use part (a) to show that the number of phases is O(N^{2/3}).
3.15
(a) Solve the problem of Exercise 3.1 using the layered network algorithm (cf.
Fig. 3.8).
(b) Construct an example of a max-flow problem where the layered network
algorithm requires N − 1 phases.
3.16
Solve the problem of Exercise 3.1 using the max-flow algorithm of Section 3.3.2.
The purpose of this exercise (from Bertsekas [1995c]) is to show the connection
of the path construction algorithm of Section 3.3.1 with the assignment auction
algorithm of Section 1.3.3.
(a) Show that the path construction problem can be converted into the problem
of finding a solution of a certain assignment problem with all arc values
equal to 0, as shown by example in Fig. 3.13. In particular, a forward path
of a directed graph G that starts at node s and ends at node t corresponds
to a feasible solution of the assignment problem, and conversely.
(b) Show how to relate the node prices in the path construction algorithm with the object prices of the assignment problem, so that if we apply the auction algorithm with ε = 1, the sequence of generated prices and assignments corresponds to the sequence of prices and paths generated by the path construction algorithm.
Figure 3.13: Converting the path construction problem into an equivalent feasibility problem of assigning “persons” to “objects.” Each arc (i, j) of the graph G, with j ≠ t, is replaced by an object labeled (i, j). Each node i ≠ t is replaced by R(i) persons, where R(i) is the number of arcs of G that are incoming to node i (for example, node 2 is replaced by the two persons 2 and 2′). Finally, there is one person corresponding to node s and one object corresponding to node t. For every arc (i, j) of G, with j ≠ t, there are R(i) + R(j) incoming arcs from the persons corresponding to i and j. For every arc (i, t) of G, there are R(i) incoming arcs from the persons corresponding to i. Each path that starts at s and ends at t can be associated with a feasible assignment. Conversely, given a feasible assignment, one can construct an alternating path (a sequence of alternately assigned and unassigned pairs) starting at s and ending at t, which defines a path from s to t.
3.20
Consider a feasible max-flow problem. Show that if the upper flow bound of each
arc is increased by α > 0, then the value of the maximum flow is increased by no
more than αA, where A is the number of arcs.
3.21
A town has m dating agencies that match men and women. Agency i has a
list of men and a list of women, and may match a maximum of ci man/woman
pairs from its lists. A person may be in the list of several agencies but may be
matched with at most one other person. Formulate the problem of maximizing
the number of matched pairs as a max-flow problem.
3.22
3.23
Consider the “opposite” to the max-flow problem, which is to minimize the divergence out of s over all capacity-feasible flow vectors having zero divergence for all nodes other than s and t.
(a) Show how to solve this problem by first finding a feasible solution, and by
then using a max-flow algorithm.
(b) Derive an analog to the max-flow/min-cut theorem.
468 Network Problems with Integer Constraints Chap. 10
faster in practice. Two possibilities of this type, local search methods and
rollout algorithms, are discussed in Sections 10.4 and 10.5, respectively.
x_ij = 0 or 1, ∀ (i, j) ∈ A,

the subgraph (N, {(i, j) | x_ij = 1}) is connected. (10.1)

Note that, given the 0-1 constraints on the arc flows and the conservation of flow equations, the last constraint can be expressed through the set of side constraints

Σ_{i∈S, j∉S} (x_ij + x_ji) ≥ 2,  ∀ nonempty proper subsets S of nodes.
If these constraints were not present, the problem would be an ordinary as-
signment problem. Unfortunately, however, these constraints are essential,
since without them, there would be feasible solutions involving multiple dis-
connected cycles, as illustrated in Fig. 10.1.
Given a tour, one may try to improve its cost by using some method
that changes the tour incrementally. In particular, a popular method for the
symmetric case (a_ij = a_ji for all i and j) is the k-OPT heuristic, which creates a new tour by exchanging k arcs of the current tour with another k arcs that do not belong to the tour (see Fig. 10.2). The k arcs are chosen to optimize the cost of the new tour with O(N^k) computation. The method stops when no improvement of the current tour is possible through a k-interchange.
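For k = 2, the interchange amounts to reversing a segment of the tour, so a 2-OPT sketch can be written directly; the four-city instance below is made up.

```python
import math

def tour_cost(tour, d):
    n = len(tour)
    return sum(d(tour[k], tour[(k + 1) % n]) for k in range(n))

def two_opt(tour, d):
    """2-OPT for the symmetric case: exchanging the two tour arcs (i, j) and
    (k, l) for (i, k) and (j, l) amounts to reversing a segment of the tour;
    do so whenever it lowers the cost, and stop at a 2-interchange optimum."""
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for a in range(n - 1):
            for b in range(a + 2, n):
                i, j = tour[a], tour[a + 1]
                k, l = tour[b], tour[(b + 1) % n]
                if l == i:                       # the two arcs share node i
                    continue
                if d(i, k) + d(j, l) < d(i, j) + d(k, l) - 1e-12:
                    tour[a + 1:b + 1] = reversed(tour[a + 1:b + 1])
                    improved = True
    return tour

# Four cities at the corners of the unit square, starting from a crossing tour.
pts = [(0, 0), (1, 0), (1, 1), (0, 1)]
d = lambda i, j: math.dist(pts[i], pts[j])
tour = two_opt([0, 2, 1, 3], d)
print(tour_cost(tour, d))  # 4.0: the non-crossing tour
```

As the text notes, a 2-interchange optimum is only a local optimum; for hard instances the method offers no guarantee of global optimality.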
xij = 0 or 1, ∀ (i, j) ∈ A,
yj = 0 or 1, j = 1, . . . , n,
There are many network applications where one needs to construct an optimal
tree subject to some constraints. For example, in data networks, a spanning
tree is often used to broadcast information from some central source to all
the nodes. In this context, it makes sense to assign a cost or weight aij to
each arc (communication link) (i, j) and try to find a spanning tree that has
minimum total weight (minimum sum of arc weights). This is the minimum
weight spanning tree problem, which we have briefly discussed in Chapter 2
(see Exercise 2.30).
We can formulate this problem as an integer-constrained problem in
several ways. For example, let xij be a 0-1 integer variable indicating whether
arc (i, j) belongs to the spanning tree. Then the problem can be written as
minimize Σ_{(i,j)∈A} a_ij x_ij

subject to Σ_{(i,j)∈A} x_ij = N − 1,

Σ_{i∈S, j∉S} (x_ij + x_ji) ≥ 1,  ∀ nonempty proper subsets S of nodes,

x_ij = 0 or 1, ∀ (i, j) ∈ A.

The first two constraints guarantee that the graph defined by the set {(i, j) | x_ij = 1} has N − 1 arcs and is connected, so it is a spanning tree.
In Exercise 2.30, we discussed how the minimum weight spanning tree
problem can be solved with a greedy algorithm. An example is the Prim-
Dijkstra algorithm, which builds an optimal spanning tree by generating a
sequence of subtrees. It starts with a subtree consisting of a single node
and it iteratively adds to the current subtree an incident arc that has min-
imum weight over all incident arcs that do not close a cycle. We indicated
in Exercise 2.30 that this algorithm can be implemented so that it has an
O(N 2 ) running time. This is remarkable, because except for the minimum
cost flow problems discussed in Chapters 2-7, very few other types of network
optimization problems can be solved with a polynomial-time algorithm.
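A sketch of the Prim-Dijkstra iteration just described (using a binary heap, which gives an O(A log N) bound rather than the O(N^2) array implementation mentioned above; the weights are made up):

```python
import heapq

def prim_mst(n, weight):
    """Prim-Dijkstra: grow a single subtree from node 0, at each step adding
    a minimum-weight arc incident to the subtree that does not close a cycle.
    weight: dict {(i, j): w} over undirected arcs; returns the tree weight."""
    adj = {i: [] for i in range(n)}
    for (i, j), w in weight.items():
        adj[i].append((w, j))
        adj[j].append((w, i))
    in_tree = [False] * n
    in_tree[0] = True
    heap = list(adj[0])
    heapq.heapify(heap)
    total, count = 0, 1
    while heap and count < n:
        w, j = heapq.heappop(heap)
        if in_tree[j]:
            continue                  # this arc would close a cycle
        in_tree[j] = True
        total += w
        count += 1
        for arc in adj[j]:
            if not in_tree[arc[1]]:
                heapq.heappush(heap, arc)
    return total

print(prim_mst(4, {(0, 1): 1, (1, 2): 2, (0, 2): 3, (2, 3): 1}))  # 4
```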
There are a number of variations of the minimum weight spanning tree
problem. Here are some examples:
(a) There is a constraint on the number of tree arcs that are incident to
a single given node. This is known as the degree constrained minimum
weight spanning tree problem. It is possible to solve this problem using a
polynomial version of the greedy algorithm (see Exercise 10.10). On the
other hand, if there is a degree constraint on every node, the problem
turns out to be much harder. For example, suppose that the degree of
each node is constrained to be at most 2. Then a spanning tree subject
to this constraint must be a path that goes through each node exactly
once, so the problem is essentially equivalent to a symmetric traveling
salesman problem (see Exercise 10.6).
(b) The capacitated spanning tree problem. Here the arcs of the tree are to
be used for routing specified supplies from given supply nodes to given
demand nodes. The tree specifies the routes that will carry the flow
from the supply points to the demand points, and hence also specifies
the corresponding arc flows. We require that the tree is selected so
that the flow of each arc does not exceed a given capacity constraint.
This is an integer-constrained problem, which is not polynomially solv-
able. However, there are some practical heuristic algorithms, such as
an algorithm due to Esau and Williams [1966] (see Fig. 10.5).
(c) The Steiner tree problem, where the requirement that all nodes must be
included in the tree is relaxed. Instead, we are given a subset S of the
nodes, and we want to find a tree that includes the subset S and has
minimum total weight. [J. Steiner (1796-1863), “the greatest geome-
ter since Apollonius,” posed the problem of finding the shortest tree
spanning a given set of points on the plane.] An important application
of the Steiner tree problem arises in broadcasting information over a
communication network from a special node to a selected subset S of
nodes. This broadcasting is most efficiently done over a Steiner tree,
where the cost of each arc corresponds to the cost of communication
over that arc. The Steiner tree problem also turns out to be a difficult combinatorial problem.
Sec. 10.1 Formulation of Integer-Constrained Problems 475
xij = 0 or 1, ∀ (i, j) ∈ A.
The constraint (10.2) expresses the requirement that an object can be matched
with at most one other object. In a variant of the problem, it is specified that
the matching should be perfect; that is, every object should be matched with
some other object. In this case, the constraint (10.2) should be changed to
Σ_{j|(i,j)∈A} x_ij + Σ_{j|(j,i)∈A} x_ji = 1,  ∀ i ∈ N. (10.3)
The special case where aij = 1 for all arcs (i, j) is the maximum cardinal-
ity matching problem, i.e., finding a matching with a maximum number of
matched pairs.
It is possible to view nonbipartite matching as an optimal network flow
problem of the assignment type with integer constraints and with the side con-
straints defined by Eq. (10.2) or Eq. (10.3) (see Exercise 10.15). We would
thus expect that the problem is a difficult one, and that it is not polynomi-
ally solvable (cf. the discussion of Section 8.4). However, this is not so. It
turns out that nonbipartite matching has an interesting and intricate struc-
ture, which is quite unique among combinatorial and network optimization
problems. In particular, nonbipartite matching problems can be solved with
polynomial-time algorithms. These algorithms share some key structures with
their bipartite counterparts, such as augmenting paths, but they generally be-
come simpler and run faster when specialized to bipartite matching. One such
algorithm, due to Edmonds [1965], can be implemented so that it has O(N^3)
running time. Furthermore, nonbipartite matching can be formulated as a
linear program without integer constraints, and admits an analysis based on
linear programming duality. We refer to the literature cited at the end of the
chapter for an account.
may be infeasible because their number may exceed the number of vehicles K.
One may then try to work towards feasibility by combining routes in a way
that satisfies the vehicle capacity constraints, while keeping the cost as small
as possible. Alternatively, one may start with a solution of a K-traveling
salesmen problem (see Exercise 10.9), corresponding to the K vehicles, and
then try to improve on this solution by interchanging customers between
routes, while trying to satisfy the capacity constraints. These heuristics often
work well, but generally they offer no guarantee of good performance, and
may occasionally result in a solution that is far from optimal.
An alternative possibility, which is ultimately also based on heuristics,
is to formulate the problem mathematically in a way that emphasizes its
connections to both the generalized assignment problem and the traveling
salesman problem. In particular, we introduce the integer variables
y_ik = 1 if node i is visited by vehicle k, and y_ik = 0 otherwise.

minimize Σ_{k=1}^K f_k(y_k)

subject to Σ_{k=1}^K y_ik = K if i = 0, and Σ_{k=1}^K y_ik = 1 if i = 1, . . . , N,

Σ_{i=0}^N d_i y_ik ≤ c_k,  k = 1, . . . , K,

y_ik = 0 or 1,  i = 0, . . . , N,  k = 1, . . . , K,
Arc routing problems are similar to vehicle routing problems, except that
the emphasis regarding cost and constraints is placed on arc traversals rather
than node visits. Here each arc (i, j) has a cost aij , and we want to find a
set of arcs that satisfy certain constraints and have minimum sum of costs.
For example, a classical arc routing problem is the Chinese postman problem,
where we want to find a cycle that traverses every arc of a graph, and has
minimum sum of arc costs; here traversals in either direction and multiple
traversals are allowed.† The costs of all arcs must be assumed nonnegative
here in order to guarantee that the problem has an optimal solution (otherwise
cycles of arbitrarily small cost would be possible by crossing back and forth
an arc of negative cost).
An interesting related question is whether there exists an Euler cycle
in the given graph, i.e., a cycle that contains every arc exactly once, with
arc traversals in either direction allowed (such a cycle, if it exists, solves the
Chinese postman problem since the arc costs are assumed nonnegative). This
† An analogy here is made with a postman who must traverse each arc of the
road network of some town (in at least one direction), while walking the minimum
possible distance. The problem was first posed by the Chinese mathematician
Kwan Mei-Ko [1962].
question was posed by Euler in connection with the famous Königsberg bridge
problem (see Fig. 10.6). The solution is simple: there exists an Euler cycle
if and only if the graph is connected and every node has even degree (in an
Euler cycle, the number of entrances to a node must be equal to the number
of exits, so the number of incident arcs to each node must be even; for a proof
of the converse, see Exercise 1.5). It turns out that even when there are nodes
of odd degree, a solution to the Chinese postman problem can be obtained
by constructing an Euler cycle in an expanded graph that involves some ad-
ditional arcs. These arcs can be obtained by solving a nonbipartite matching
problem involving the nodes of odd degree (see Exercise 10.17). Thus, since
the matching problem can be solved in polynomial time as noted in Example
10.4, the Chinese postman problem can also be solved in polynomial time
(see also Edmonds and Johnson [1973], who explored the relation between
matching and the Chinese postman problem).
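When every node has even degree, the Euler cycle itself can be constructed by Hierholzer's classical algorithm, sketched below on a made-up graph:

```python
def euler_cycle(adj):
    """Hierholzer's construction of an Euler cycle in a connected undirected
    graph in which every node has even degree.  adj: dict node -> list of
    neighbors, with each undirected arc listed in both directions."""
    rem = {u: list(vs) for u, vs in adj.items()}   # remaining arcs (consumed)
    start = next(iter(rem))
    stack, cycle = [start], []
    while stack:
        u = stack[-1]
        if rem[u]:
            v = rem[u].pop()
            rem[v].remove(u)        # consume the arc in both directions
            stack.append(v)
        else:
            cycle.append(stack.pop())
    return cycle                    # closed walk; length = number of arcs + 1

# Two triangles sharing node 0: degrees are 4, 2, 2, 2, 2 (all even).
adj = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1], 3: [0, 4], 4: [0, 3]}
c = euler_cycle(adj)
print(len(c))  # 7: an Euler cycle over 6 arcs
```

Since each arc is consumed exactly once, the construction runs in O(A) time (with a suitable arc-deletion data structure).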
of forward Euler cycles, in roughly the same way as the undirected Chinese
postman problem was related above to the construction of an (undirected)
Euler cycle. Exercise 1.8 states the basic result about the existence of a
forward Euler cycle: such a cycle exists if and only if the number of incoming
arcs to each node is equal to the number of its outgoing arcs. A forward Euler
cycle, if it exists, is also a solution to the directed Chinese postman problem.
More generally, it turns out that a solution to the directed Chinese postman
problem (assuming one exists) can be obtained by finding a directed Euler
cycle in an associated graph obtained by solving a certain minimum cost flow
problem (see Exercise 10.17).
By introducing different constraints, one may obtain a large variety of
arc routing problems. For example, a variant of the Chinese postman problem
is to find a cycle of minimum cost that traverses only a given subset of the arcs.
This is known as the rural postman problem. Other variants are characterized
by arc time-windows and arc precedence constraints, similar to vehicle routing
problem variants discussed earlier. In fact, it is always possible to convert
an arc routing problem to a “node routing problem,” where the constraints
are placed on some of the nodes rather than on the arcs. This can be done
by replacing each arc (i, j) with two arcs (i, kij ) and (kij , j) separated by an
artificial middle node kij . Traversal of an arc (i, j) then becomes equivalent
to visiting the artificial node kij . However, this transformation often masks
important characteristics of the problem. For example, it would be awkward
to pose the question of existence of an Euler cycle as a node routing problem.
where βjm and γmw are given scalars. In this case, there is no coupling
between jobs and workers, and the problem can be solved by solving two
decoupled (2-dimensional) assignment problems: one involving the pairing of
jobs and machines, with the βjm as values, and the other involving the pairing
of machines and workers, with the γmw as values. In general, however, the 3-
dimensional assignment problem is a difficult integer programming problem,
for which there is no known polynomial algorithm.
A simple heuristic approach is based on relaxing each of the constraints
in turn. In particular, suppose that the constraint on the workers is neglected
first. It can then be seen that the problem takes the 2-dimensional assignment
form
maximize Σ_{j∈J} Σ_{m∈M} b_jm y_jm

subject to Σ_{m∈M} y_jm = 1,  ∀ j ∈ J,

Σ_{j∈J} y_jm = 1,  ∀ m ∈ M,

y_jm = 0 or 1,  ∀ j ∈ J, m ∈ M,

where

b_jm = max_{w∈W} a_jmw, (10.4)
and y_jm = 1 indicates that job j must be performed at machine m. For each m ∈ M, let j_m be the job assigned to machine m, according to the solution of this problem. We can now optimally assign machines m to workers w, using as assignment values

c_mw = a_{j_m m w},

|β_jm + γ_mw − a_jmw| ≤ ε,  ∀ j ∈ J, m ∈ M, w ∈ W,
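A tiny sketch of the two-stage relaxation heuristic just described (with a brute-force routine standing in for a real 2-dimensional assignment algorithm, and made-up values a[j][m][w]):

```python
from itertools import permutations

def best_assignment(values, n):
    """Brute-force 2-dimensional assignment (adequate for a tiny sketch):
    returns (total value, perm), where perm[row] is the column of each row."""
    return max(((sum(values[r][p[r]] for r in range(n)), p)
                for p in permutations(range(n))), key=lambda t: t[0])

def three_dim_heuristic(a, n):
    """The relaxation heuristic of the text: first neglect the workers and
    assign jobs to machines using b_jm = max_w a_jmw; then fix that pairing
    and assign machines to workers using c_mw = a_{j_m, m, w}."""
    b = [[max(a[j][m]) for m in range(n)] for j in range(n)]
    _, job_to_machine = best_assignment(b, n)
    job_of = {job_to_machine[j]: j for j in range(n)}   # machine m -> job j_m
    c = [[a[job_of[m]][m][w] for w in range(n)] for m in range(n)]
    value, machine_to_worker = best_assignment(c, n)
    return value, job_to_machine, machine_to_worker

# Hypothetical 2x2x2 values a[j][m][w].
a = [[[3, 1], [2, 2]],
     [[1, 1], [4, 1]]]
print(three_dim_heuristic(a, 2)[0])  # 5
```

In practice one would replace `best_assignment` by a polynomial 2-dimensional assignment algorithm, such as the auction algorithm of Section 1.3.3.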
Sec. 10.2 Branch-and-Bound 483
10.2 BRANCH-AND-BOUND
minimize f(x)
subject to x ∈ F,
⋃_{i=1}^n Y_i = Y.

f_Y ≤ min_{x∈Y} f(x).
Branch-and-Bound Algorithm
f_{Y_j} < UPPER,

set

UPPER = f(x)

and mark x as the best solution found so far.
Step 2: (Termination Test) If OPEN is nonempty, go to Step 1. Otherwise, terminate; the best solution found so far is optimal.
f_{Y_j} < UPPER

with

f_{Y_j} < UPPER − ε

in Step 1. This variant may terminate much faster, while the best solution obtained upon termination is guaranteed to be within ε of optimality.
Other variations of branch-and-bound relate to the method for se-
lecting a node from OPEN in Step 1. For example, a possible strategy
is to choose the node with minimal lower bound; alternatively, one may
choose the node containing the best solution found so far. In fact it is
neither practical nor necessary to generate a priori the branch-and-bound
tree. Instead, one may adaptively decide on the order and the manner in
which the nodes are partitioned into descendants based on the progress of
the algorithm.
Branch-and-bound typically uses “continuous” network optimization
problems (without integer constraints) to obtain lower bounds to the op-
timal costs of the restricted problems minx∈Y f (x) and to construct corre-
sponding feasible solutions. For example, suppose that our original problem
has a convex cost function, and a feasible set F that consists of convex set
constraints and side constraints, plus the additional constraint that all the
arc flows must be 0 or 1 . Then a restricted subset Y may specify that the
flows of some given subset of arcs are fixed at 0 or at 1, while the remaining
arc flows may take either the value 0 or the value 1. A lower bound to the
restricted optimal cost minx∈Y f (x) is then obtained by relaxing the 0-1
constraint on the latter arc flows, thereby allowing them to take any value
in the interval [0, 1] and resulting in a convex network problem with side
constraints. Thus the solution by branch-and-bound of a network problem
with convex cost and side constraints plus additional integer constraints re-
quires the solution of many convex network problems with side constraints
but without integer constraints.
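The mechanics of the algorithm can be sketched on a small 0-1 covering problem (minimize c′y subject to a′y ≥ b), with the greedy fractional solution of the [0, 1]-relaxation supplying the lower bound; the problem data below are made up.

```python
def lp_bound(c, a, b, fixed):
    """Lower bound for: minimize c'y subject to a'y >= b, y_j in {0, 1},
    given a partial fixing. Relax the free variables to [0, 1] and satisfy
    the residual demand greedily, fractionally at the end (the LP optimum)."""
    cost = sum(c[j] for j, v in fixed.items() if v == 1)
    need = b - sum(a[j] for j, v in fixed.items() if v == 1)
    for j in sorted((j for j in range(len(c)) if j not in fixed),
                    key=lambda j: c[j] / a[j]):
        if need <= 0:
            break
        frac = min(1.0, need / a[j])
        cost += frac * c[j]
        need -= frac * a[j]
    return cost if need <= 1e-9 else float('inf')

def branch_and_bound(c, a, b):
    n = len(c)
    UPPER, best = float('inf'), None
    OPEN = [{}]                        # nodes Y = partial 0-1 fixings
    while OPEN:                        # Step 2: stop when OPEN is empty
        fixed = OPEN.pop()
        lb = lp_bound(c, a, b, fixed)  # lower bound for the node
        if lb >= UPPER:
            continue                   # fathom the node
        if len(fixed) == n:            # complete integer solution
            UPPER, best = lb, dict(fixed)
            continue
        j = len(fixed)                 # variables are fixed in index order
        OPEN.append({**fixed, j: 0})   # partition Y into two descendants
        OPEN.append({**fixed, j: 1})
    return UPPER, best

print(branch_and_bound([3, 2, 4], [2, 1, 3], 4))  # (6, {0: 0, 1: 1, 2: 1})
```

Popping from the end of OPEN gives a depth-first node selection; choosing the node with minimal lower bound instead is the best-first variant mentioned above.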
x_ij = 0 or 1, ∀ (i, j) ∈ A,
y_j = 0 or 1, j = 1, . . . , n,
where J0 and J1 are disjoint subsets of the index set {1, . . . , n} of facility
locations. Thus, F (J0 , J1 ) is the subset of feasible solutions such that:
a facility is placed at the locations in J1 ,
no facility is placed at the locations in J0 ,
a facility may or may not be placed at the remaining locations.
For each node/subset F (J0 , J1 ), we may obtain a lower bound and a feasible
solution by solving the linear program where all integer constraints are relaxed
except for the variables yj , j ∈ J0 ∪ J1 , which have been fixed at either 0 or
1:
f* = 5.

y_1 = 1/3,  y_2 = 2/3,

f_Y = 4.66.

ȳ_1 = 1,  ȳ_2 = 1,

UPPER = 7,
and we place in OPEN the two descendants (J0 = {1}, J1 = Ø) and (J0 = Ø, J1 = {1}), corresponding to fixing y_1 at 0 and at 1, respectively.
We proceed with the left branch of the branch-and-bound tree, and consider the node (J0 = {1}, J1 = Ø), corresponding to fixing y_1 as well as the corresponding flows x_11, x_21, and x_31 to 0. The associated (relaxed) linear program is
The optimal solution (in fact the only feasible solution) of this program is

x_ij = 1 if (i, j) = (1, 2), (2, 2), (3, 2), and x_ij = 0 otherwise,

y_2 = 1,

and the corresponding optimal cost (lower bound) is

f_Y = 5.

The optimal solution of the relaxed problem is integer, and its cost, 5, is lower than the current value of UPPER, so we set

UPPER = 5.
The two descendants, (J0 = {1}, J1 = {2}) and (J0 = {1, 2}, J1 = Ø), corresponding to fixing y_2 at 1 and at 0, respectively, are placed in OPEN.
We proceed with the right branch of the branch-and-bound tree, and consider the node (J0 = Ø, J1 = {1}), corresponding to fixing y_1 to 1. The associated (relaxed) linear program is
y_2 = 2/3,

and the corresponding optimal cost (lower bound) is

f_Y = 6.66.

This is larger than the current value of UPPER, so the node can be fathomed, and its two descendants are not placed in OPEN.
We conclude that one of the two descendants of the left node, (J0 = {1}, J1 = {2}) and (J0 = {1, 2}, J1 = Ø) (the only nodes in OPEN), contains
the optimal solution. We can proceed to solve the relaxed linear programs
corresponding to these two nodes, and obtain the optimal solution. However,
there is also a shortcut here: since these are the only two remaining nodes
and the upper bound corresponding to these nodes coincides with the lower
bound, we can conclude that the lower bound is equal to the optimal cost
and the corresponding integer solution (y1 = 0, y2 = 1) is optimal.
x_ij ≤ y_j, ∀ (i, j) ∈ A,
x_ij = 0 or 1, ∀ (i, j) ∈ A,
y_j = 0 or 1, j = 1, . . . , n.
Figure 10.9: Illustration of the effect of additional side constraints. They do not affect the set of feasible integer solutions, but they reduce the set of “relaxed solutions,” that is, those x that satisfy all the constraints except for the integer constraints. This results in improved lower bounds and a faster branch-and-bound solution.
minimize a′x

subject to Σ_{j|(i,j)∈A} x_ij − Σ_{j|(j,i)∈A} x_ji = s_i,  ∀ i ∈ N,

c_t′x ≤ d_t,  t = 1, . . . , r,

x_ij ∈ X_ij,  ∀ (i, j) ∈ A,
where a and c_t are given vectors, d_t are given scalars, and each X_ij is a finite subset of contiguous integers (i.e., the convex hull of X_ij contains all the integers in X_ij, as for example in the cases X_ij = {0, 1} or X_ij = {1, 2, 3, 4}). We assume that the supplies s_i are integer, so that if the side constraints c_t′x ≤ d_t were not present, the problem would become a minimum cost flow problem that has integer optimal solutions, according
to the theory developed in Chapter 5. Note that for this it is not necessary
that the arc cost coefficients aij (the components of the vectors a) be
integer.
In the Lagrangian relaxation approach, we eliminate the side constraints c_t′x ≤ d_t by adding to the cost function the terms µ_t(c_t′x − d_t), thereby forming the Lagrangian function

L(x, µ) = a′x + Σ_{t=1}^r µ_t(c_t′x − d_t),
where the first inequality follows because the minimum of the Lagrangian in
the next-to-last expression is taken over a subset of F̃ , and the last inequal-
ity follows using the nonnegativity of µt . The lower bound minx∈F̃ L(x, µ)
can in turn be used in the branch-and-bound procedure discussed earlier.
Since in the context of branch-and-bound, it is important to use as
tight a lower bound as possible, we are motivated to search for an optimal
lower bound through adjustment of the vector µ. To this end, we form the
following dual function (cf. Section 8.7)
q(µ) = min_{x∈F̃} L(x, µ),
x_ij = 0 or 1, ∀ (i, j) ∈ A,

Σ_{(i,j)∈A} c^k_ij x_ij ≤ d_k,  k = 1, . . . , K.
Here, a path P from s to t is optimal if and only if the flow vector x defined by

x_ij = 1 if (i, j) belongs to P, and x_ij = 0 otherwise,

is an optimal solution of the problem (10.7).
To apply Lagrangian relaxation, we eliminate the side constraints, and
we form the corresponding Lagrangian function assigning a nonnegative mul-
tiplier µk to the kth constraint. Minimization of the Lagrangian now becomes
a shortest path problem with respect to corrected arc lengths â_ij given by

â_ij = a_ij + Σ_{k=1}^K µ_k c^k_ij.
(We assume here that there are no negative length cycles with respect to the
arc lengths âij ; this will be so if all the aij and ckij are nonnegative.) We
then obtain µ∗ that solves the dual problem maxµ≥0 q(µ) and we obtain a
corresponding optimal cost/lower bound. We can then use µ∗ to obtain a
feasible solution (a path that satisfies the side constraints) as discussed in
Example 8.6.
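For a fixed µ ≥ 0, the computation of q(µ) in this example can be sketched as follows (Dijkstra's method stands in for a generic shortest path solver, assuming nonnegative corrected lengths; the instance is made up):

```python
import heapq

def shortest_path(n, arcs, lengths, s, t):
    """Dijkstra's method (nonnegative lengths assumed); returns (cost, path)."""
    adj = [[] for _ in range(n)]
    for idx, (u, v) in enumerate(arcs):
        adj[u].append((v, lengths[idx]))
    dist = [float('inf')] * n
    prev = [None] * n
    dist[s] = 0
    pq = [(0, s)]
    while pq:
        dd, u = heapq.heappop(pq)
        if dd > dist[u]:
            continue
        for v, w in adj[u]:
            if dd + w < dist[v]:
                dist[v], prev[v] = dd + w, u
                heapq.heappush(pq, (dd + w, v))
    path, u = [], t
    while u is not None:
        path.append(u)
        u = prev[u]
    return dist[t], path[::-1]

def dual_value(n, arcs, a, c, d, mu, s, t):
    """q(mu): a shortest path with corrected lengths a_ij + sum_k mu_k c^k_ij,
    minus the constant sum_k mu_k d_k."""
    corrected = [a[i] + sum(mu[k] * c[k][i] for k in range(len(mu)))
                 for i in range(len(arcs))]
    cost, path = shortest_path(n, arcs, corrected, s, t)
    return cost - sum(mu[k] * d[k] for k in range(len(mu))), path

# Hypothetical instance: two s-to-t routes, one cheap but resource-hungry.
arcs = [(0, 1), (1, 3), (0, 2), (2, 3)]
a = [1, 1, 2, 2]       # top route costs 2, bottom costs 4
c = [[3, 3, 1, 1]]     # one side constraint: top uses 6 units, bottom 2
d = [4]                # resource budget
print(dual_value(4, arcs, a, c, d, [0.0], 0, 3))  # (2.0, [0, 1, 3])
print(dual_value(4, arcs, a, c, d, [0.5], 0, 3))  # q(0.5) = 3.0
```

Note how the dual values bracket the optimal feasible cost (the bottom route, cost 4), consistent with weak duality.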
is a (linear) minimum cost flow problem that can be solved using the
methodology of Chapters 2-7: the Lagrangian L(x, µ) is linear in x and
the integer constraints do not matter, and can be replaced by the inter-
val constraints xij ∈ X̂ij , where X̂ij is the convex hull of the set Xij .
This should be contrasted with the integer constraint relaxation approach,
where we eliminate just the integer constraints, while leaving the side con-
straints unaffected (see the facility location problem that we solved using
branch-and-bound in Example 10.8). As a result, the minimum cost flow
methodology of Chapters 2-7 does not apply when there are side constraints
and the integer constraint relaxation approach is used. This is the main
reason for the widespread use of Lagrangian relaxation in combination with
branch-and-bound.
Actually, in Lagrangian relaxation it is not mandatory to eliminate
just the side constraints. One may eliminate the conservation of flow con-
straints, in addition to or in place of the side constraints. (The multipliers
corresponding to the conservation of flow constraints should be uncon-
strained in the dual problem, because the conservation of flow is expressed
in terms of equality constraints; cf. the discussion in Section 8.7.) One
still obtains a lower bound to the optimal cost of the original problem,
because of the weak duality property (cf. Section 8.7). However, the mini-
mization of the Lagrangian is not a minimum cost flow problem anymore.
Nonetheless, by choosing properly the constraints to eliminate and by tak-
ing advantage of the special structure of the problem, the minimization
of the Lagrangian over the remaining set of constraints may be relatively
simple. The following is an illustrative example.
the subgraph (N, {(i, j) | x_ij = 1}) is connected. (10.11)
We may express the connectivity constraint (10.11) in several different
ways, leading to different Lagrangian relaxation and branch-and-bound algo-
rithms. One of the most successful formulations is based on the notion of
a 1-tree, which consists of a tree that spans nodes 2, . . . , N , plus two arcs
that are incident to node 1. Equivalently, a 1-tree is a connected subgraph
that contains a single cycle passing through node 1 (see Fig. 10.10). Note that if the conservation of flow constraints (10.8) and (10.9), and the integer constraints (10.10) are satisfied, then the connectivity constraint (10.11) is equivalent to the constraint that the subgraph (N, {(i, j) | x_ij = 1}) is a 1-tree.
Figure 10.10: Illustration of a 1-tree. It consists of a tree that spans nodes 2, . . . , N, plus two arcs that are incident to node 1.
Let X_1 be the set of all x with 0-1 components, and such that the subgraph (N, {(i, j) | x_ij = 1}) is a 1-tree. Let us consider a Lagrangian relaxation approach based on elimination of the conservation of flow equations. Assigning multipliers u_i and v_j to the constraints (10.8) and (10.9), respectively, the Lagrangian function is

L(x, u, v) = Σ_{i,j, i≠j} (a_ij + u_i + v_j) x_ij − Σ_{i=1}^N u_i − Σ_{j=1}^N v_j.

The minimization of the Lagrangian is over all 1-trees, leading to the problem

min_{x∈X_1} Σ_{i,j, i≠j} (a_ij + u_i + v_j) x_ij.
that spans the nodes 2, . . . , N , and then adding two arcs that are incident to
node 1 and have minimum modified cost. The minimum cost spanning tree
problem can be easily solved using the Prim-Dijkstra algorithm (see Exercise
2.30).
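A sketch of this minimum-weight 1-tree computation (node 0 plays the role of node 1 of the text, a simple Prim loop is used for the spanning tree part, and the cost matrix is made up):

```python
def min_weight_one_tree(n, w):
    """Minimum-weight 1-tree for costs w (a symmetric matrix, node 0 playing
    the role of node 1 in the text): a minimum weight spanning tree on nodes
    1, ..., n-1, plus the two cheapest arcs incident to node 0."""
    in_tree = {1}
    total = 0
    while len(in_tree) < n - 1:      # Prim's algorithm on nodes 1..n-1
        c, j = min((w[i][j], j) for i in in_tree
                   for j in range(1, n) if j not in in_tree)
        in_tree.add(j)
        total += c
    total += sum(sorted(w[0][1:])[:2])   # two cheapest arcs at node 0
    return total

w = [[0, 2, 3, 9],
     [2, 0, 4, 1],
     [3, 4, 0, 7],
     [9, 1, 7, 0]]
print(min_weight_one_tree(4, w))  # MST {(1,3), (1,2)} = 5, plus arcs 2 and 3: 10
```

Applied to the modified costs a_ij + u_i + v_j, this routine computes the Lagrangian minimum over 1-trees described above.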
where

F̃ = {x | x_ij ∈ X_ij, x satisfies the conservation of flow constraints},

and L(x, µ) is the Lagrangian function

L(x, µ) = a′x + Σ_{t=1}^r µ_t(c_t′x − d_t).
Recall here that the set F̃ is finite, because we have assumed that each Xij
is a finite set of contiguous integers.
We note that for a fixed x ∈ F̃ , the Lagrangian L(x, µ) is a linear
function of µ. Thus, because the set F̃ is finite, the dual function q is the
minimum of a finite number of linear functions of µ – there is one such
function for each x ∈ F̃. For conceptual simplification, we may write q in the following generic form:

q(µ) = min_{i∈I} {α_i′µ + β_i}, (10.12)

where I is some finite index set, and α_i and β_i are suitable vectors and scalars, respectively (see Fig. 10.11).
Of particular interest for our purposes are the “slopes” of q at various vectors µ, i.e., the vectors α_{i_µ}, where i_µ ∈ I is an index attaining the minimum of α_i′µ + β_i over i ∈ I [cf. Eq. (10.12)]. If i_µ is the unique index attaining the minimum, then q is differentiable (in fact linear) at µ, and its gradient is α_{i_µ}. If there are multiple indices i attaining the minimum, then q is nondifferentiable at µ (see Fig. 10.11). To deal with such nondifferentiabilities, we generalize the notion of a gradient. In particular, we define a subgradient of q at a given µ ≥ 0 to be any vector g such that

q(ν) ≤ q(µ) + (ν − µ)′g,  ∀ ν ≥ 0

(see Fig. 10.11). The right-hand side of the above inequality provides a linear approximation to the dual function q using the function value q(µ) at the given µ and the corresponding subgradient g. The approximation is exact at the vector µ, and is an overestimate at other vectors ν. Some further properties of subgradients are summarized in Appendix A.
We now consider the calculation of subgradients of the dual function. For any µ, let x_µ minimize the Lagrangian L(x, µ) over x ∈ F̃. Then the vector g(x_µ) with components

g_t(x_µ) = c_t′x_µ − d_t,  t = 1, . . . , r,

is a subgradient of q at µ, since for every ν ≥ 0 we have

q(ν) = min_{x∈F̃} L(x, ν)
≤ L(x_µ, ν)
= a′x_µ + ν′g(x_µ)
= a′x_µ + µ′g(x_µ) + (ν − µ)′g(x_µ)
= q(µ) + (ν − µ)′g(x_µ),
Sec. 10.3 Lagrangian Relaxation 499
Figure 10.11: Illustration of the dual function q and its subgradients. The
generic form of q is
q(µ) = min_{i∈I} {αi′µ + βi },
where I is some finite index set, and αi and βi are suitable vectors and scalars,
respectively. Given µ, and an index iµ ∈ I attaining the minimum in the above
equation, the vector αiµ is a subgradient at µ. Furthermore, any subgradient
at µ is a convex combination of vectors αiµ such that iµ ∈ I and iµ attains the
minimum of αi′µ + βi over i ∈ I. For example, at the vector µ̄ shown in the figure,
there is a unique subgradient, the vector α1 . At the vector µ̃ shown in the figure,
the set of subgradients is the line segment connecting the vectors α2 and α3 .
We now turn to algorithms that use subgradients for solving the dual problem.
The subgradient method consists of the iteration

µk+1 = [µk + sk g k ]+ , (10.14)

where g k is the subgradient

g k = g(xµk ),

xµk minimizes the Lagrangian L(x, µk ) over x ∈ F̃ , g(x) is the vector of side
constraint functions with components

gt (x) = ct′x − dt , t = 1, . . . , r,

[·]+ denotes projection on the nonnegative orthant, and sk is a positive
stepsize. An important fact here is that the iteration may not improve the
dual cost; it is possible that

q([µk + sk g k ]+ ) < q(µk ), ∀ sk > 0

(see Fig. 10.12). What makes the subgradient method work is that for
sufficiently small stepsize sk , the distance of the current iterate to the optimal
solution set is reduced, as illustrated in Fig. 10.12, and as shown in the
following proposition.
Figure 10.12: Illustration of how it may not be possible to improve the dual
function by using the subgradient iteration µk+1 = [µk + sk g k ]+ , regardless of
the value of the stepsize sk . However, the distance to any optimal solution µ∗ is
reduced using a subgradient iteration with a sufficiently small stepsize. The crucial
fact, which follows from the definition of a subgradient, is that the angle between
the subgradient g k and the vector µ∗ − µk is less than 90 degrees. As a result,
for sk small enough, the vector µk + sk g k is closer to µ∗ than µk . Furthermore,
the vector [µk + sk g k ]+ is closer to µ∗ than µk + sk g k is.
Proposition: If µk is not optimal, then for every dual optimal solution µ∗ ,
we have

‖µk+1 − µ∗ ‖ < ‖µk − µ∗ ‖

for all stepsizes sk such that

0 < sk < 2 (q(µ∗ ) − q(µk )) / ‖g k ‖2 . (10.15)
Proof: We have

‖µk + sk g k − µ∗ ‖2 = ‖µk − µ∗ ‖2 + 2sk (µk − µ∗ )′g k + (sk )2 ‖g k ‖2 ,

and, by using the subgradient inequality q(µ∗ ) ≤ q(µk ) + (µ∗ − µk )′g k ,

‖µk + sk g k − µ∗ ‖2 ≤ ‖µk − µ∗ ‖2 − 2sk (q(µ∗ ) − q(µk )) + (sk )2 ‖g k ‖2 .

We can now verify that for the range of stepsizes of Eq. (10.15) the sum of
the last two terms in the above relation is negative. In particular, with a
straightforward calculation, we can write this relation as

‖µk + sk g k − µ∗ ‖2 ≤ ‖µk − µ∗ ‖2 − γ k (2 − γ k ) (q(µ∗ ) − q(µk ))2 / ‖g k ‖2 , (10.16)

where

γ k = sk ‖g k ‖2 / (q(µ∗ ) − q(µk )).
If the stepsize sk satisfies Eq. (10.15), then 0 < γ k < 2, so Eq. (10.16)
yields

‖µk + sk g k − µ∗ ‖ < ‖µk − µ∗ ‖.

We now observe that since µ∗ ≥ 0, we have

‖[µk + sk g k ]+ − µ∗ ‖ ≤ ‖µk + sk g k − µ∗ ‖,

and from the last two inequalities, we obtain ‖µk+1 − µ∗ ‖ < ‖µk − µ∗ ‖.
Q.E.D.
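To make the mechanics concrete, here is a minimal sketch of the projected subgradient iteration (10.14) for a hypothetical one-dimensional dual function of the generic form (10.12). The two linear pieces and the constant stepsize are illustrative choices, not data from the text.

```python
# Projected subgradient method sketch for maximizing a piecewise linear
# concave dual function q(mu) = min_i (alpha_i*mu + beta_i) over mu >= 0.
# Hypothetical 1-D instance: q(mu) = min(mu, 2 - mu), maximized at mu* = 1.

PIECES = [(1.0, 0.0), (-1.0, 2.0)]   # (alpha_i, beta_i) pairs

def q(mu):
    return min(a * mu + b for a, b in PIECES)

def subgradient(mu):
    # The slope of a piece attaining the minimum is a subgradient at mu.
    return min(PIECES, key=lambda p: p[0] * mu + p[1])[0]

def subgradient_method(mu0, step, iters):
    mu = mu0
    for _ in range(iters):
        mu = max(0.0, mu + step * subgradient(mu))  # iteration (10.14)
    return mu

mu = subgradient_method(0.0, 0.01, 500)
```

With a constant stepsize the iterates eventually oscillate in a small neighborhood of the maximizer µ∗ = 1; a diminishing stepsize would drive them to the maximizer itself.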
A common stepsize rule for the subgradient method is

sk = αk (q k − q(µk )) / ‖g k ‖2 , (10.17)

where q k is an approximation to the optimal dual cost and αk is a scalar
satisfying

0 < αk < 2.
Note that we can estimate the optimal dual cost from below with the
best current dual cost
q̂ k = max_{0≤i≤k} q(µi ).
As an overestimate of the optimal dual cost, we can use the cost f (x̄) of any
primal feasible solution x̄; in many circumstances, primal feasible solutions
are naturally obtained in the course of the algorithm. Finally, the special
structure of many problems can be exploited to yield improved bounds to
the optimal dual cost.
Here are two common ways to choose αk and q k in the stepsize formula
(10.17):
(a) q k is the best known upper bound to the optimal dual cost at the kth
iteration and αk is a number, which is initially equal to one and is
decreased by a certain factor (say, two) every few (say, five or ten)
iterations. An alternative formula for αk is
αk = m / (k + m),
where m is a positive integer.
(b) αk = 1 for all k and q k is given by
q k = (1 + β(k)) q̂ k , (10.18)
Consider again the dual problem

maximize q(µ)
subject to µ ≥ 0.
The cutting plane method, at the kth iteration, replaces the dual function
q by a polyhedral approximation Qk , constructed using the vectors µi and
corresponding subgradients g i , i = 0, 1, . . . , k − 1, obtained so far. It then
solves the problem
maximize Qk (µ)
subject to µ ≥ 0.
In particular, for k = 1, 2, . . ., Qk is given by

Qk (µ) = min_{i=0,...,k−1} { q(µi ) + (µ − µi )′g i }, (10.19)

where g i is a subgradient of q at µi , i.e., g i = g(xµi ), and the side constraint
functions are

gt (x) = ct′x − dt , t = 1, . . . , r.
Proof: For notational convenience, let us write the dual function in the
polyhedral form
q(µ) = min_{i∈I} {αi′µ + βi },
Figure 10.13: Illustration of the cutting plane method. With each new iterate
µi , a new hyperplane q(µi ) + (µ − µi ) g i is added to the polyhedral approximation
of the dual function. The method converges finitely, since if µk is not optimal, a
new cutting plane will be added at the corresponding iteration, and there can be
only a finite number of cutting planes.
Since

Qk (µk ) = min_{0≤m≤k−1} {αim′µk + βim },
it follows that the pair (αik , βik ) is not equal to any of the preceding pairs
(αi0 , βi0 ), . . . , (αik−1 , βik−1 ). Since the index set I is finite, it follows that
there can be only a finite number of iterations for which the termination
criterion (10.23) is not satisfied. Q.E.D.
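The finite termination just proved can be illustrated with a small sketch on the same kind of one-dimensional piecewise linear dual function. The brute force grid maximization below stands in for the linear program that solves the master problem; all data are hypothetical.

```python
# Cutting plane method sketch for the toy dual q(mu) = min(mu, 2 - mu).
# The master problem max_{mu >= 0} Q^k(mu) is solved by brute force over
# a grid, standing in for a linear programming solver.

PIECES = [(1.0, 0.0), (-1.0, 2.0)]   # (alpha_i, beta_i) pairs

def q(mu):
    return min(a * mu + b for a, b in PIECES)

def subgradient(mu):
    return min(PIECES, key=lambda p: p[0] * mu + p[1])[0]

def cutting_plane(mu0, grid, tol=1e-9):
    cuts = []                         # list of (mu_i, q(mu_i), g_i)
    mu = mu0
    while True:
        cuts.append((mu, q(mu), subgradient(mu)))
        # Q^k(mu) = min over cuts of q(mu_i) + (mu - mu_i) * g_i
        Qk = lambda m: min(qi + (m - mi) * gi for mi, qi, gi in cuts)
        mu_new = max(grid, key=Qk)
        if Qk(mu_new) <= q(mu_new) + tol:   # approximation exact: terminate
            return mu_new
        mu = mu_new

grid = [i / 100 for i in range(301)]        # grid on [0, 3]
mu_star = cutting_plane(0.0, grid)
```

On this instance the method terminates after two cuts, at the dual optimum µ∗ = 1.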
Despite its finite convergence property, the cutting plane method may
converge slowly, and in practice one may have to stop it short of finding an
optimal solution [the error bounds (10.22) may be used for this purpose].
An additional drawback of the method is that it can take large steps away
from the optimum even when it is close to (or even at) the optimum. This
phenomenon is referred to as instability, and has another undesirable effect,
namely, that µk−1 may not be a good starting point for the algorithm that
minimizes Qk (µ). A way to limit the effects of this phenomenon is to add
to the polyhedral function approximation a quadratic term that penalizes
large deviations from the current point. In this method, µk is obtained as

µk = arg max_{µ≥0} { Qk (µ) − (1/(2ck )) ‖µ − µk−1 ‖2 },

where ck is a positive scalar parameter.
[Figure: Illustration of the central cutting plane method, showing the set S 2 ,
the central pair (µ2 , z 2 ), the cutting planes q(µ0 ) + (µ − µ0 )′g 0 and
q(µ1 ) + (µ − µ1 )′g 1 , and the dual function q(µ).]
There are several possible methods for finding the central pair (µk , z k ).
Roughly, the idea is that the central pair should be “somewhere in the
middle” of S k . For example, consider the case where S k is polyhedral with
nonempty interior. Then (µk , z k ) could be the analytic center of S k , where
for any polyhedron
P = {y | ap′y ≤ cp , p = 1, . . . , m}
with nonempty interior, its analytic center is the unique maximizer of
Σ_{p=1}^{m} ln(cp − ap′y) over y ∈ P . Another possibility is the ball center of
S, that is, the center of the largest inscribed sphere in S. Assuming that
the polyhedron P given above has nonempty interior, its ball center can
be obtained by solving the following problem with optimization variables
(y, σ):

maximize σ
subject to ap′(y + d) ≤ cp , ∀ ‖d‖ ≤ σ, p = 1, . . . , m.

It can be seen that this problem is equivalent to the linear program

maximize σ
subject to ap′y + ‖ap ‖σ ≤ cp , p = 1, . . . , m.
While the central cutting plane methods are not guaranteed to ter-
minate finitely, their convergence properties are satisfactory. Furthermore,
the methods have benefited from advances in the implementation of interior
point methods; see the references cited at the end of the chapter.
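As a small illustration of the ball center computation, the sketch below maximizes the inscribed-ball radius by brute force over a grid instead of solving the equivalent linear program. The polyhedron (a unit square) is a made-up example.

```python
import math

# Ball center sketch: for P = {y | a_p'y <= c_p, p = 1,...,m}, the radius
# of the largest ball centered at y and inscribed in P is
#     r(y) = min_p (c_p - a_p'y) / ||a_p||,
# and the ball center maximizes r(y). A grid search stands in for the
# equivalent linear program. The unit square below is illustrative.

HALFSPACES = [((-1.0, 0.0), 0.0), ((1.0, 0.0), 1.0),
              ((0.0, -1.0), 0.0), ((0.0, 1.0), 1.0)]

def radius(y):
    return min((c - a[0] * y[0] - a[1] * y[1]) / math.hypot(a[0], a[1])
               for a, c in HALFSPACES)

grid = [(i / 20, j / 20) for i in range(21) for j in range(21)]
center = max(grid, key=radius)
```

For the unit square the grid search recovers the exact ball center (0.5, 0.5) with radius 0.5.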
Consider now a multicommodity flow problem of the form

minimize Σ_{m=1}^{M} Σ_{(i,j)∈A} aij (m)xij (m) (10.24)

subject to the conservation of flow constraints

Σ_{{j|(i,j)∈A}} xij (m) − Σ_{{j|(j,i)∈A}} xji (m) = si (m), ∀ i ∈ N , m = 1, . . . , M, (10.25)

the integer constraints

xij (m) ∈ Xij (m), ∀ (i, j) ∈ A, m = 1, . . . , M, (10.26)

and the side constraints

Σ_{m=1}^{M} A(m)x(m) ≤ b. (10.27)
Here si (m) are given supply integers for the mth commodity, A(m) are
given matrices, b is a given vector, and x(m) is the flow vector of the
mth commodity, with components xij (m), (i, j) ∈ A. Furthermore, each
Xij (m) is a finite subset of contiguous integers.
The dual function is obtained by relaxing the side constraints (10.27),
and by minimizing the corresponding Lagrangian function. This minimiza-
tion separates into M independent minimizations, one per commodity:

q(µ) = −µ′b + Σ_{m=1}^{M} min_{x(m)∈F (m)} ( a(m) + A(m)′µ )′ x(m), (10.28)
where a(m) is the vector with components aij (m), (i, j) ∈ A, and
F (m) = { x(m) satisfying Eq. (10.25) | xij (m) ∈ Xij (m), ∀ (i, j) ∈ A }.

If xµ (m) attains the minimum of the mth term in Eq. (10.28), then the
vector

g(xµ ) = Σ_{m=1}^{M} A(m)xµ (m) − b

is a subgradient of q at µ.
Let us now discuss the computational solution of the dual problem
maxµ≥0 q(µ). The application of the subgradient method is straightfor-
ward, so we concentrate on the cutting plane method, which leads to a
method known as Dantzig-Wolfe decomposition. This method consists of
the iteration
µk = arg max_{µ≥0} Qk (µ),

where Qk is the polyhedral approximation of the dual function given by
Eq. (10.19). This maximization can be written as the linear program

maximize v
subject to q(µi ) + (µ − µi )′g i ≥ v, i = 0, . . . , k − 1, (10.30)
µ ≥ 0.
This is a linear program in the variables v and µ. We can form its dual
problem by assigning a Lagrange multiplier ξ i to each of the constraints
minimize Σ_{i=0}^{k−1} ξ i ( q(µi ) − µi′g i )

subject to Σ_{i=0}^{k−1} ξ i = 1, Σ_{i=0}^{k−1} ξ i g i ≤ 0, (10.31)

ξ i ≥ 0, i = 0, . . . , k − 1.
Using the equations

q(µi ) = −µi′b + Σ_{m=1}^{M} ( a(m) + A(m)′µi )′ xµi (m),

g i = Σ_{m=1}^{M} A(m)xµi (m) − b,

the problem (10.31) can be written as

minimize Σ_{m=1}^{M} a(m)′ ( Σ_{i=0}^{k−1} ξ i xµi (m) )

subject to Σ_{i=0}^{k−1} ξ i = 1, Σ_{m=1}^{M} A(m) ( Σ_{i=0}^{k−1} ξ i xµi (m) ) ≤ b, (10.32)

ξ i ≥ 0, i = 0, . . . , k − 1.
The typical iteration of the Dantzig-Wolfe decomposition method is as
follows:

Step 1: Solve the master problem (10.32), and obtain a vector µk of
Lagrange multipliers of the side constraints

Σ_{m=1}^{M} A(m) ( Σ_{i=0}^{k−1} ξ i xµi (m) ) ≤ b.
Step 2: For each m = 1, . . . , M , obtain a solution xµk (m) of the
minimum cost flow problem

min_{x(m)∈F (m)} ( a(m) + A(m)′µk )′ x(m).
Step 3: Use xµk (m) to modify the master problem by adding one
more variable ξ k , and go to the next iteration.
By introducing auxiliary vectors y(m), one per commodity, we can write
the problem as

minimize Σ_{m=1}^{M} a(m)′x(m)

subject to x(m) ∈ F (m), m = 1, . . . , M,

Σ_{m=1}^{M} y(m) = b, A(m)x(m) ≤ y(m), m = 1, . . . , M.
Equivalently, the problem can be written as

minimize Σ_{m=1}^{M} min_{ x(m)∈F (m), A(m)x(m)≤y(m) } a(m)′x(m) (10.33)

subject to Σ_{m=1}^{M} y(m) = b, y(m) ∈ Y (m), m = 1, . . . , M,
where Y (m) is the set of all vectors y(m) for which the inner minimization
problem

minimize a(m)′x(m)
subject to x(m) ∈ F (m), A(m)x(m) ≤ y(m) (10.34)

is feasible. Denoting by pm (y(m)) the optimal cost of this problem, we can
write problem (10.33) as

minimize Σ_{m=1}^{M} pm (y(m))

subject to Σ_{m=1}^{M} y(m) = b, y(m) ∈ Y (m), m = 1, . . . , M.
This problem, called the master problem, may be solved with nondifferen-
tiable optimization methods, and in particular with the subgradient and
the cutting plane methods. Note, however, that the commodity problems
(10.34) involve the side constraints A(m)x(m) ≤ y(m), and need not be
of the minimum cost flow type, except in special cases. We refer to the
literature cited at the end of the chapter for further details.
10.4 LOCAL SEARCH METHODS

Local search methods are a broad and important class of heuristics for
discrete optimization. They apply to the general problem of minimizing
a function f (x) over a finite set F of (feasible) solutions. In principle,
one may solve the problem by global enumeration of the entire set F of
solutions (this is what branch-and-bound does). A local search method
tries to economize on computation by using local enumeration, based on
the notion of a neighborhood N (x) of a solution x, which is a (usually very
small) subset of F , containing solutions that are “close” to x in some sense.
In particular, given a solution x, the method selects among the solutions
in the neighborhood N (x) a successor solution x̄, according to some
rule. The process is then repeated with x̄ replacing x (or stops when some
termination criterion is met). Thus a local search method is characterized
by:
(a) The method for choosing a starting solution.
(b) The definition of the neighborhood N (x) of a solution x.
(c) The rule for selecting a successor solution from within N (x).
(d) The termination criterion.
For an example of a local search method, consider the k-OPT heuristic
for the traveling salesman problem that we discussed in Example 10.1. Here
the starting tour is obtained by using some method, based for example on
subtour elimination or a minimum weight spanning tree, as discussed in
Example 10.1. The neighborhood of a tour T is defined as the set N (T )
of all tours obtained from T by exchanging k arcs that belong to T with
another k arcs that do not belong to T . The rule for selecting a successor
tour is based on cost improvement; that is, the tour selected from N (T )
has minimum cost over all tours in N (T ) that have smaller cost than T .
Finally, the algorithm terminates when no tour in N (T ) has smaller cost
than T . Another example of a local search method is provided by the
Esau-Williams heuristic of Fig. 10.5.
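The four ingredients just listed can be captured in a generic sketch. The bit-flip instance at the bottom is a made-up stand-in for a real problem; the successor rule used is best strictly improving neighbor.

```python
# Generic local search skeleton: repeatedly move to the best strictly
# improving neighbor; stop at a local minimum.

def local_search(x0, f, neighbors):
    x = x0
    while True:
        better = [y for y in neighbors(x) if f(y) < f(x)]
        if not better:
            return x                 # no improving neighbor: local minimum
        x = min(better, key=f)       # successor rule: best improving neighbor

# Toy instance: minimize f over 4-bit strings; neighbors differ in one bit.
def f(x):
    return sum(x) * (4 - sum(x))     # minimized when all bits are equal

def neighbors(x):
    return [x[:i] + (1 - x[i],) + x[i + 1:] for i in range(len(x))]

best = local_search((1, 0, 1, 0), f, neighbors)
```

Because each move strictly improves the cost and the feasible set is finite, the loop always terminates, in accordance with the cost-improvement discussion above.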
The definition of a neighborhood often involves intricate calculations
and suboptimizations that aim to bring to consideration promising neigh-
bors. Here is an example, due to Kernighan and Lin [1970]:
Consider a graph (N , A) with 2n nodes, and a cost aij for each arc (i, j). We
want to find a partition of N into two subsets N1 and N2 , each with n nodes,
so that the total cost of the arcs connecting N1 and N2 ,

Σ_{(i,j), i∈N1 , j∈N2} aij + Σ_{(i,j), i∈N2 , j∈N1} aij ,
Sec. 10.4 Local Search Methods 513
is minimized.
Here a natural neighborhood of a partition (N1 , N2 ) is the k-exchange
neighborhood . This is the set of all partitions obtained by selecting a fixed
number k of pairs of nodes (i, j) with i ∈ N1 and j ∈ N2 , and interchang-
ing them, that is, moving i into N2 and j into N1 . The corresponding local
search algorithm moves from a given solution to its minimum cost neighbor,
and terminates when no neighbor with smaller cost can be obtained. Unfortunately,
the amount of work needed to generate a k-exchange neighborhood
increases exponentially with k [there are (m choose k) different ways to select
k objects out of m]. One may thus consider a variable depth neighborhood that involves
multiple successive k-exchanges with small k. As an example, for k = 1 we
obtain the following algorithm:
Given the starting partition (N1 , N2 ), consider all pairs (i, j) with i ∈
N1 and j ∈ N2 , and let c(i, j) be the cost change that results from moving i
into N2 and j into N1 . If (i, j) is the pair that minimizes c(i, j), move i into
N2 and j into N1 , and let c1 = c(i, j). Repeat this process a fixed number M
of times, obtaining a sequence c2 , c3 , . . . , cM of minimal cost changes resulting
from the sequence of exchanges. Then find

m̄ = arg min_{m=1,...,M} Σ_{l=1}^{m} cl ,

and accept as the next partition the one involving the first m̄ exchanges.
This type of algorithm avoids the exponential running time of k-exchange
neighborhoods, while still considering neighbors differing by as many as M
node pairs.
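A compact sketch of this variable depth procedure for k = 1 follows. The four-node instance and its symmetric costs are invented for illustration; `best_swap` recomputes the cut cost from scratch, which a serious implementation would avoid.

```python
import itertools

# Variable depth neighborhood search (k = 1 exchanges) in the spirit of
# Kernighan-Lin: make M successive best single-pair swaps, then accept
# the prefix of swaps with the most negative cumulative cost change.

def cut_cost(a, n1, n2):
    return sum(a.get((i, j), 0) + a.get((j, i), 0) for i in n1 for j in n2)

def best_swap(a, n1, n2):
    # Swap pair (i, j), i in n1 and j in n2, minimizing the new cut cost.
    base = cut_cost(a, n1, n2)
    best = None
    for i, j in itertools.product(sorted(n1), sorted(n2)):
        change = cut_cost(a, (n1 - {i}) | {j}, (n2 - {j}) | {i}) - base
        if best is None or change < best[0]:
            best = (change, i, j)
    return best

def variable_depth_step(a, n1, n2, M):
    changes, partitions = [], []
    for _ in range(M):
        change, i, j = best_swap(a, n1, n2)
        n1, n2 = (n1 - {i}) | {j}, (n2 - {j}) | {i}
        changes.append(change)
        partitions.append((n1, n2))
    sums = list(itertools.accumulate(changes))
    m = sums.index(min(sums))
    return partitions[m] if sums[m] < 0 else None   # None: no improvement

# Illustrative instance: nodes 1 and 2 are tightly coupled, as are 3 and 4.
a = {(1, 2): 10, (3, 4): 10, (1, 3): 1, (2, 4): 1}
result = variable_depth_step(a, {1, 3}, {2, 4}, 2)
```

On this instance one exchange already reaches the optimal partition ({3, 4}, {1, 2}) with cut cost 2, and the prefix rule keeps exactly that exchange.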
local minima, and paying the cost of increased computation per iteration.
Note that there is an important advantage to a cost improving method: it
can never repeat the same solution, so that in view of the finiteness of the
feasible set F , it will always terminate with a local minimum.
An alternative type of neighbor selection and termination criterion,
used by simulated annealing and tabu search, is to allow successor solutions
to have worse cost than their predecessors, but to also provide mechanisms
that ensure the future generation of improved solutions with substantial
likelihood. The advantage of accepting solutions of worse cost is that stop-
ping at a local minimum becomes less of a difficulty. For example, the
method of simulated annealing cannot be trapped at a local minimum,
as we will see shortly. Unfortunately, methods that do not enforce cost
improvement run the danger of cycling through repetition of the same so-
lution. It is therefore essential in these methods to provide a mechanism
by virtue of which cycling is either precluded, or becomes highly unlikely.
As a final remark, we note an important advantage of local search
methods. While they offer no solid guarantee of finding an optimal or
near-optimal solution, they offer the promise of substantial improvement
over any heuristic that can be used to generate the starting solution. Unfor-
tunately, however, one can seldom be sure that this promise will be fulfilled
in a given practical problem.
Genetic Algorithms

These are local search methods where the neighborhood generation mechanism
is inspired by real-life processes of genetics and evolution. In par-
ticular, the current solution is modified by “splicing” and “mutation” to
obtain neighboring solutions. A typical example is provided by problems of
scheduling, such as the traveling salesman problem. The neighborhood of
a schedule T may be a collection of other schedules obtained by modifying
some contiguous portion of T in some way, while keeping the remainder of
the schedule T intact. Alternatively, the neighborhood of a schedule may
be obtained by interchanging the position of a few tasks, as in the k-OPT
traveling salesman heuristic.
In a variation of this approach, a pool of solutions may be maintained.
Some of these solutions may be modified, while some pairs of these solutions
may be combined to form new solutions. These solutions are added to the
pool if some criterion, typically based on cost improvement, is met, and
some of the solutions of the existing pool may be dropped. In this way, it
is argued, the pool is “evolving” in a Darwinian way through a “survival
of the fittest” process.
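A minimal sketch of such an evolving pool is given below. This is an illustrative implementation, not the specific phased scheme described in the text; the splicing and mutation operators and the toy objective (minimize the number of 1-bits) are invented.

```python
import random

# Evolving solution pool sketch: pairs are combined ("spliced") and
# mutated, and the n cheapest solutions survive each round.

def f(x):
    return sum(x)                       # toy objective: number of 1-bits

def combine(x, y):                      # splicing: half of x, half of y
    return x[:len(x) // 2] + y[len(y) // 2:]

def mutate(x, rng):                     # mutation: flip one random bit
    i = rng.randrange(len(x))
    return x[:i] + (1 - x[i],) + x[i + 1:]

def evolve(pool, rounds=50, seed=0):
    rng = random.Random(seed)
    n = len(pool)
    for _ in range(rounds):
        x, y = rng.sample(pool, 2)
        children = [mutate(combine(x, y), rng), mutate(x, rng)]
        pool = sorted(pool + children, key=f)[:n]   # survival of the fittest
    return pool

init = [tuple(random.Random(s).choice((0, 1)) for _ in range(8))
        for s in range(6)]
best = min(evolve(init), key=f)
```

Since only the n cheapest solutions are retained, the best member of the pool can never get worse from one round to the next.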
A specific example implementation of this approach operates in phases.
At the beginning of a phase, we have a population X consisting of n feasible
solutions x1 , . . . , xn . The phase proceeds as follows:
ber of practical problems. These heuristics regulate the size of the current
neighborhood, the criterion of selecting a new solution from the current
neighborhood, the criterion for termination, etc. These heuristics may also
involve selective memory storage of previously generated solutions or their
attributes, penalization of the constraints with (possibly time-varying)
penalty parameters, and multiple tabu lists. We refer to the literature
cited at the end of the chapter for further details.
the steady-state probability of a solution x is

e−f (x)/T / Σ_{x′∈F} e−f (x′)/T .
Essentially, this says that for very small T and far into the future, the
current solution is almost always optimal.
When the condition pxy = pyx does not hold, one cannot obtain a
closed-form expression for the steady-state probabilities of the various so-
lutions. However, as long as the underlying Markov chain is irreducible,
the behavior is qualitatively similar: the steady-state probability of nonop-
timal solutions diminishes to 0 as T approaches 0. There is also related
analysis for the case where the temperature parameter T is time-varying
and converges to 0; see the references cited at the end of the chapter.
The results outlined above should be viewed with a grain of salt. In
practice, speed of convergence is as important as eventual convergence to
the optimum, and solving a given problem by simulated annealing can be
very slow. A nice aspect of the method is that it depends very little on
the structure of the problem being solved, and this enhances its value for
relatively unstructured problems that are not well-understood. For other
problems, where there exists a lot of accumulated insight and experience,
simulated annealing is usually inferior to other local search approaches.
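For concreteness, here is a minimal sketch of the simulated annealing iteration with a geometric temperature schedule. The quadratic toy problem and all parameter values are illustrative, and the fixed schedule is only one of many possibilities.

```python
import math
import random

# Simulated annealing sketch: a worse neighbor is accepted with
# probability exp(-(f(y) - f(x)) / T); the temperature T is driven
# toward zero by a geometric cooling schedule.

def anneal(x0, f, neighbors, T0=1.0, cooling=0.995, iters=2000, seed=0):
    rng = random.Random(seed)
    x, T = x0, T0
    best = x
    for _ in range(iters):
        y = rng.choice(neighbors(x))
        delta = f(y) - f(x)
        if delta <= 0 or rng.random() < math.exp(-delta / T):
            x = y                       # accept the (possibly worse) neighbor
        if f(x) < f(best):
            best = x
        T *= cooling
    return best

# Toy problem: minimize f(i) = (i - 3)^2 over the integers 0,...,10;
# neighbors differ by one unit.
def f(i):
    return (i - 3) ** 2

def neighbors(i):
    return [j for j in (i - 1, i + 1) if 0 <= j <= 10]

best = anneal(10, f, neighbors)
```

Early on, the high temperature lets the walk climb out of poor regions; as T shrinks, uphill moves are rejected with overwhelming probability and the iterate settles near the minimum.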
and at the end of the iteration, we augment this solution with some more
arc flows. The steps of the iteration are as follows:
for each y ∈ FT .
Step 3: Choose from the set FT the arc flows ȳ = {ȳij | (i, j) ∈ T }
that minimize the heuristic cost H(Py+ ); that is, find

ȳ = arg min_{y∈FT } H(Py+ ). (10.35)

Step 4: Augment the current partial solution {xij | (i, j) ∈ S} with
the arc flows {ȳij | (i, j) ∈ T } thus obtained, and proceed with the
next iteration.
and at the kth iteration the flows of all of these arcs are set to 0, except
for arc (ik , ik+1 ) whose flow is set to 1.
Note that a rollout algorithm requires considerably more computation
than the base heuristic. For example, in the case where the subset T in Step
of the initial solution x0 produced by the base heuristic. For further elab-
oration of the sequential consistency property, we refer to the paper by
Bertsekas, Tsitsiklis, and Wu [1997], which also discusses some underlying
connections with the policy iteration method of dynamic programming.
A condition that is more general than sequential consistency is that
the algorithm be sequentially improving, in the sense that at each iteration
there holds
H(P + ) ≤ H(P ).
This property also guarantees that the cost of the solutions produced by the
rollout algorithm is monotonically nonincreasing. The paper by Bertsekas,
Tsitsiklis, and Wu [1997] discusses situations where this property holds,
and shows that with fairly simple modification, a rollout algorithm can be
made sequentially improving (see also Exercise 10.22).
There are a number of variations of the basic rollout algorithm de-
scribed above. Here are some examples:
(1) We may adapt the rollout framework to use multiple heuristic al-
gorithms. In particular, let us assume that we have K algorithms
H1 , . . . , HK . The kth of these algorithms, given an augmented par-
tial solution Py+ , produces a heuristic cost Hk (Py+ ). We may then use
in the flow selection via Eq. (10.35) a heuristic cost of the form

H(Py+ ) = min{ H1 (Py+ ), . . . , HK (Py+ ) },

or of the form

H(Py+ ) = Σ_{k=1}^{K} rk Hk (Py+ ),
where rk are some fixed scalar weights obtained by trial and error.
(2) We may incorporate multistep lookahead or selective depth lookahead
into the rollout framework. Here we consider augmenting the current
partial solution P = {xij | (i, j) ∈ S} with all possible values for
the flows of a finite sequence of arcs that are not in S. We run the
base heuristic from each of the corresponding augmented partial so-
lutions, we select the sequence of arc flows with minimum heuristic
cost, and then augment the current partial solution P with the first
arc flow in this sequence. As an illustration, let us recall the traveling
salesman problem with the nearest neighbor method used as the base
heuristic. An example rollout algorithm with two-step lookahead op-
erates as follows: We begin each iteration with a path {i1 , . . . , ik }.
We run the nearest neighbor heuristic starting from each of the paths
{i1 , . . . , ik , i} with i = i1 , . . . , ik , and obtain a corresponding tour.
We then form the subset I consisting of the m nodes i = i1 , . . . , ik
that correspond to the m best tours thus obtained. We run the near-
est neighbor heuristic starting from each of the paths {i1 , . . . , ik , i, j}
Consider a person who walks on a straight line and at each time period takes
either a unit step to the left or a unit step to the right. There is a cost
function assigning cost f (i) to each integer i. Given an integer starting point
on the line, the person wants to minimize the cost of the point where he will
end up after a given and fixed number N of steps.
We can formulate this problem as a problem of selecting a path in a
graph (see Fig. 10.15). In particular, without loss of generality, let us assume
that the starting point is the origin, so that the person’s position after N
steps will be some integer in the interval [−N, N ]. The nodes of the graph
are identified with pairs (k, m), where k is the number of steps taken so far
(k = 0, 1, . . . , N ) and m is the person’s position (m ∈ [−k, k]). A node (k, m)
with k < N has two outgoing arcs with end nodes (k+1, m−1) (corresponding
to a left step) and (k+1, m+1) (corresponding to a right step). Let us consider
paths whose starting node is (0, 0) and the destination node is of the form
(N, m), where m is of the form N − 2l and l ∈ [0, N ] is the number of left
steps taken. The problem then is to find the path of this type such that f (m)
is minimized.
Let the base heuristic be the algorithm, which, starting at a node (k, m),
takes N − k successive steps to the right and terminates at the node (N, m +
N − k). It can be seen that this algorithm is sequentially consistent [the base
heuristic generates the path (k, m), (k + 1, m + 1), . . . , (N, m + N − k) starting
from (k, m), and also the path (k + 1, m + 1), . . . , (N, m + N − k) starting
from (k + 1, m + 1), so the criterion for sequential consistency is fulfilled].
Sec. 10.5 Rollout Algorithms 523
The rollout algorithm, at node (k, m) compares the cost of the des-
tination node (N, m + N − k) (corresponding to taking a step to the right
and then following the base heuristic) and the cost of the destination node
(N, m + N − k − 2) (corresponding to taking a step to the left and then fol-
lowing the base heuristic). Let us say that an integer i ∈ [−N + 2, N − 2] is
a local minimum if f (i − 2) ≥ f (i) and f (i) ≤ f (i + 2). Let us also say that
N (or −N ) is a local minimum if f (N ) ≤ f (N − 2) [or f (−N ) ≤ f (−N + 2),
respectively]. Then it can be seen that starting from the origin (0, 0), the
rollout algorithm obtains the local minimum that is closest to N (see Fig.
10.15). This is no worse (and typically better) than the integer N obtained
by the base heuristic. This example illustrates how the rollout algorithm may
exhibit “intelligence” that is totally lacking from the base heuristic.
Consider next the case where the base heuristic is the algorithm that,
starting at a node (k, m), compares the cost f (m + N − k) (corresponding to
taking all of the remaining N −k steps to the right) and the cost f (m−N +k)
(corresponding to taking all of the remaining N − k steps to the left), and
accordingly moves to node
(N, m + N − k) if f (m + N − k) ≤ f (m − N + k),
or to node

(N, m − N + k) if f (m − N + k) < f (m + N − k).

It can be seen that this base heuristic is not sequentially consistent, but is
instead sequentially improving. It can then be verified that starting from the
origin (0, 0), the rollout algorithm obtains the global minimum of f in the
interval [−N, N ], while the base heuristic obtains the better of the two points
−N and N .
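The walk example can be coded directly: with the all-right base heuristic, the rollout reduces at each step to a two-way comparison of final costs. The cost function f below is a made-up instance, not one from the text.

```python
# Rollout for the walk example: the base heuristic takes all remaining
# steps to the right; the rollout compares, at each step, the final costs
# of stepping right vs. left and then following the base heuristic.

def rollout_walk(f, N):
    m = 0                                  # start at the origin
    for k in range(N):
        right_end = m + 1 + (N - k - 1)    # step right, then all right
        left_end = m - 1 + (N - k - 1)     # step left, then all right
        m = m + 1 if f(right_end) <= f(left_end) else m - 1
    return m

def f(i):
    return abs(i - 2)                      # global minimum at i = 2

final = rollout_walk(f, 6)
```

With this f and N = 6, the base heuristic alone terminates at position 6 with cost 4, while the rollout terminates at position 2 with cost 0, illustrating the “intelligence” discussed above.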
A rollout algorithm starts with the trivial path P = (s), where s is some
initial node, progressively constructs a sequence of paths P = (s, i1 , . . . , im ),
m = 1, . . . , N − 1, consisting of distinct nodes, and then completes a tour by
adding the arc (iN −1 , s). The rollout procedure is as follows.
We introduce nonnegative penalty coefficients µk for the side constraints,
and we form modified arc traversal costs âij , given by
T ∗ (Pe ) satisfies the side constraints, the algorithm adds to the current path
the node j for which the tour T̂ (Pe ) has minimum cost.
One of the drawbacks of the scheme just described is that it requires the
approximate solution of a large number of traveling salesman problems. A
faster variant is obtained if the arc set Am above is restricted to be a suitably
chosen subset of the eligible arcs (im , j), such as for example those whose length
does not exceed a certain threshold.
EXERCISES
10.1
Consider the symmetric traveling salesman problem with the graph shown in Fig.
10.16.
(a) Find a suboptimal solution using the nearest neighbor heuristic starting
from node 1.
(b) Find a suboptimal solution by first solving an assignment problem, and by
then merging subtours.
(c) Try to improve the solutions found in (a) and (b) by using the 2-OPT
heuristic.
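Parts (a) and (c) can be sketched as follows on a hypothetical four-city symmetric instance (the costs of Fig. 10.16 are not reproduced here, so the numbers below are invented).

```python
import itertools

# Nearest neighbor construction plus 2-OPT improvement for a small
# symmetric traveling salesman instance (illustrative data).

COST = {}
def set_cost(i, j, c):
    COST[(i, j)] = COST[(j, i)] = c      # symmetric costs

for (i, j), c in {(1, 2): 2, (1, 3): 9, (1, 4): 5,
                  (2, 3): 8, (2, 4): 9, (3, 4): 1}.items():
    set_cost(i, j, c)

def tour_cost(tour):
    return sum(COST[(tour[k], tour[(k + 1) % len(tour)])]
               for k in range(len(tour)))

def nearest_neighbor(start, nodes):
    tour, rest = [start], set(nodes) - {start}
    while rest:
        nxt = min(rest, key=lambda j: COST[(tour[-1], j)])
        tour.append(nxt)
        rest.remove(nxt)
    return tour

def two_opt(tour):
    improved = True
    while improved:
        improved = False
        for a, b in itertools.combinations(range(len(tour)), 2):
            cand = tour[:a] + tour[a:b + 1][::-1] + tour[b + 1:]
            if tour_cost(cand) < tour_cost(tour):
                tour, improved = cand, True
    return tour

tour = two_opt(nearest_neighbor(1, [1, 2, 3, 4]))
```

On this instance the nearest neighbor tour 1-2-3-4-1 of cost 16 happens to be optimal, so 2-OPT leaves it unchanged; on larger instances 2-OPT typically improves the constructed tour.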
Figure 10.16: Data for a symmetric traveling salesman problem (cf. Exercise
10.1). The arc costs are shown next to the arcs. Each arc is bidirectional.
10.2

Consider a strongly connected graph with a nonnegative cost for each arc. We
want to find a forward cycle of minimum cost that contains all nodes but is not
necessarily simple; that is, a node or an arc may be traversed multiple times.
(a) Convert this problem into a traveling salesman problem. Hint: Construct
a complete graph with cost of an arc (i, j) equal to the shortest distance
from i to j in the original graph.
(b) Apply your method of part (a) to the graph of Fig. 10.17.
Figure 10.17: Data for a minimum cost cycle problem (cf. Exercise 10.2). The
arc costs are shown next to the arcs.
10.3
Consider the problem of checking whether a given graph contains a simple cycle
that passes through all the nodes. (The cycle need not be forward.) Formulate
this problem as a symmetric traveling salesman problem. Hint: Consider a
complete graph where the cost of an arc (i, j) is 1 if (i, j) or (j, i) is an arc
of the original graph, and is 2 otherwise.
10.4
Show that an asymmetric traveling salesman problem with nodes 1, . . . , N and arc
costs aij can be converted to a symmetric traveling salesman problem involving
a graph with nodes 1, . . . , N, N + 1, . . . , 2N, and the arc costs
ai(N +j) = aij if i, j = 1, . . . , N with i ≠ j, and ai(N +j) = −M if i = j,
where M is a sufficiently large number. Hint: All arcs with cost −M must be
included in an optimal tour of the symmetric version.
10.5
Consider the problem of finding a shortest (forward) path from an origin node s
to a destination node t of a graph with given arc lengths, subject to the additional
constraint that the path passes through every node exactly once.
(a) Show that the problem can be converted to a traveling salesman problem
by adding an artificial arc (t, s) of length −M , where M is a sufficiently
large number.
(b) (Longest Path Problem) Consider the problem of finding a simple forward
path from s to t that has a maximum number of arcs. Show that the
problem can be converted to a traveling salesman problem.
10.6
Consider the problem of finding a shortest (forward) path in a graph with given
arc lengths, subject to the constraint that the path passes through every node
exactly once (the choice of start and end nodes of the path is subject to opti-
mization). Formulate the problem as a traveling salesman problem.
10.7

Consider a symmetric traveling salesman problem where the arc costs are
nonnegative and satisfy the following triangle inequality:

aij ≤ aik + akj , for all nodes i, j, k.
10.8

Consider a symmetric traveling salesman problem where the arc costs are
nonnegative and satisfy the triangle inequality (cf. the preceding exercise). Let R
Sec. 10.6 Notes, Sources, and Exercises 531
Figure 10.18: Data for a symmetric traveling salesman problem (cf. Exercises
10.7 and 10.8). The arc costs are shown next to the arcs.
be a minimum cost spanning tree of the graph (cf. Exercise 2.30), and let S be
the subset of the nodes that has an odd number of incident arcs in R. A perfect
matching of the nodes of S is a subset of arcs such that every node of S is an end
node of exactly one arc of the subset and each arc of the subset has end nodes
in S. Suppose that M is a perfect matching of the nodes of S that has minimum
sum of arc costs. Construct a tour that consists of the arcs of M and some of the
arcs of R, and show that its weight is no more than 3/2 times the optimal tour
cost. Solve the problem of Fig. 10.18 using this heuristic, and find the ratio of
the solution cost to the optimal tour cost. Hint: Note that the total cost of the
arcs of M is at most 1/2 the optimal tour cost. Also, use the fact that if a graph
is connected and each of its nodes has even degree, there is a cycle that contains
all the arcs of the graph exactly once (cf. Exercise 1.5).
10.9

Consider the version of the traveling salesman problem where there are K
salesmen that start at city 1, return to city 1, and collectively must visit all other
cities exactly once. Transform the problem into an ordinary traveling salesman
problem. Hint: Split city 1 into K cities.
10.10

Consider the minimum weight spanning tree problem, subject to the additional
constraint that the number of tree arcs that are incident to a single given node
s should be no greater than a given integer k. Consider adding a nonnegative
weight w to the weight of all incident arcs of node s, solving the corresponding
unconstrained spanning tree problem, and gradually increasing w until the degree
constraint is satisfied.
(a) State a polynomial algorithm for doing this and derive its running time.
(b) Use this algorithm to solve the problem of Fig. 10.19, where the degree of
node 1 is required to be no more than 2.
Figure 10.19: Data for a minimum weight spanning tree problem (cf. Exercises
10.10 and 10.11). The arc weights are shown next to the arcs.
We are given a connected graph G with a nonnegative weight aij for each arc
(i, j) ∈ A. We assume that if an arc (i, j) is present, the reverse arc (j, i) is also
present, and aij = aji . Consider the problem of finding a tree in G that spans a
given subset of nodes S and has minimum weight over all such trees.
(a) Let W ∗ be the weight of this tree. Consider the graph I(G), which has
node set S and is complete (has an arc connecting every pair of its nodes).
Let the weight for each arc (i, j) of I(G) be equal to the shortest distance in
the graph G from the node i ∈ S to the node j ∈ S. Let T be a minimum
weight spanning tree of I(G). Show that the weight of T is no greater
than 2W ∗ . Hint: Consider a minimum weight tour in I(G). Show that the
weight of this tour is no less than the weight of T and no more than 2W ∗ .
(b) Construct a heuristic based on part (a) and apply it to the problem of Fig.
10.19, where S = {1, 3, 5}.
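A sketch of the part (b) heuristic, with our own function names: compute shortest distances in G (Floyd-Warshall with path reconstruction), take a minimum spanning tree of the metric closure I(G) restricted to S, and expand each closure arc back into a shortest path of G.

```python
def steiner_heuristic(n, edges, S):
    # edges: dict {(i, j): w} with both directions present and w_ij = w_ji.
    INF = float('inf')
    d = [[0 if a == b else INF for b in range(n)] for a in range(n)]
    nxt = [[None] * n for _ in range(n)]
    for (i, j), w in edges.items():
        if w < d[i][j]:
            d[i][j] = w
            nxt[i][j] = j
    # Floyd-Warshall with path reconstruction.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][m] + d[m][j] < d[i][j]:
                    d[i][j] = d[i][m] + d[m][j]
                    nxt[i][j] = nxt[i][m]
    # MST of the metric closure I(G) restricted to S (Prim).
    S = list(S)
    in_tree, closure_edges = {S[0]}, []
    while len(in_tree) < len(S):
        i, j = min(((i, j) for i in in_tree for j in S if j not in in_tree),
                   key=lambda e: d[e[0]][e[1]])
        in_tree.add(j)
        closure_edges.append((i, j))
    # Expand each closure arc into a shortest path of G.
    tree_arcs = set()
    for i, j in closure_edges:
        v = i
        while v != j:
            u = nxt[v][j]
            tree_arcs.add((min(v, u), max(v, u)))
            v = u
    return tree_arcs
```

By part (a), the weight of the arcs returned is no greater than twice the weight of a minimum tree spanning S.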
Consider a minimum weight spanning tree problem with an additional side con-
straint denoted by C (for example, a degree constraint on each node). A general
heuristic (given by Deo and Kumar [1997]) is to solve the problem neglecting
the constraint C, and then to add a scalar penalty to the cost of the arcs that
“contribute most” to violation of C. This is then repeated as many times as
desired.
(a) Construct a heuristic of this type for the capacitated spanning tree problem
(cf. Example 10.3).
(b) Adapt this heuristic to a capacitated Steiner tree problem.
10.13
(a) Suppose that there existed a second bridge connecting the islands B and
C, and also another bridge connecting the land areas A and D. Construct
an Euler cycle that crosses each of the bridges exactly once.
(b) Suppose the bridge connecting the islands B and C has collapsed. Con-
struct an Euler path, i.e., a path (not necessarily a cycle) that passes
through each arc of the graph exactly once.
(c) Construct an optimal postman cycle assuming all arcs have cost 1.
10.14
Formulate the capacitated spanning tree problem given in Fig. 10.5 as an integer-
constrained network flow problem.
∑_i xij ≤ 1, ∀ j,

the integer constraints xij ∈ {0, 1}, and the side constraints

∑_{j|(i,j)∈A} xij + ∑_{j|(j,i)∈A} xji ≤ 1, ∀ i ∈ N,

or

∑_{j|(i,j)∈A} xij + ∑_{j|(j,i)∈A} xji = 1, ∀ i ∈ N.
Given a Chinese postman problem, delete all nodes of even degree together with
all their incident arcs. Find a perfect matching of minimum cost in the remaining
graph. Create an expanded version of the original problem’s graph by adding an
extra copy of each arc of the minimum cost matching. Show that an Euler cycle
of the expanded graph is an optimal solution to the Chinese postman problem.
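For small graphs, the matching construction behind this exercise can be done by brute force. A common variant pairs the odd-degree nodes using shortest-path distances; the sketch below (our naming and our variant, not a transcription of the exercise) returns the odd-node pairs to be joined by duplicated shortest paths, which makes every degree even so that an Euler cycle exists.

```python
def postman_duplications(n, edges):
    # edges: dict {(i, j): cost} with i < j, for a connected undirected graph.
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    odd = [v for v in range(n) if deg[v] % 2 == 1]
    # Shortest path distances (Floyd-Warshall).
    INF = float('inf')
    d = [[0 if a == b else INF for b in range(n)] for a in range(n)]
    for (i, j), w in edges.items():
        d[i][j] = d[j][i] = min(d[i][j], w)
    for m in range(n):
        for a in range(n):
            for b in range(n):
                d[a][b] = min(d[a][b], d[a][m] + d[m][b])
    # Brute-force minimum cost perfect matching of the odd-degree nodes.
    def match(nodes):
        if not nodes:
            return 0, []
        first, rest = nodes[0], nodes[1:]
        best = (INF, [])
        for k, partner in enumerate(rest):
            cost, pairs = match(rest[:k] + rest[k + 1:])
            cand = (d[first][partner] + cost, [(first, partner)] + pairs)
            best = min(best, cand, key=lambda t: t[0])
        return best
    return match(odd)
```

Duplicating the arcs along the matched shortest paths yields the expanded graph whose Euler cycle solves the postman problem.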
Consider expanding the graph of the directed Chinese postman problem by duplicating arcs so that the number of incoming arcs to each node is equal to the number of its outgoing arcs. A forward Euler cycle of the expanded graph corresponds to a solution of the directed Chinese postman problem. Show that the optimal expanded graph is obtained by minimizing

∑_{(i,j)∈A} aij xij

subject to the constraints

∑_{j|(i,j)∈A} xij − ∑_{j|(j,i)∈A} xji = di, ∀ i ∈ N,

0 ≤ xij, ∀ (i, j) ∈ A,

where xij is the number of additional copies of arc (i, j), and di is the difference between the number of incoming arcs to i and the number of outgoing arcs from i.
minimize f (x1 , . . . , xn )
subject to x ∈ X, xi ∈ {0, 1}, i = 1, . . . , n,
where X is some set. Construct a branch-and-bound tree that starts with a sub-
problem where the integer constraints are relaxed, and proceeds with successive
restriction of the variables x1 , . . . , xn to the values 0 or 1.
(a) Show that the original integer-constrained problem is equivalent to a single
origin/single destination shortest path problem that involves the branch-
and-bound tree. Hint: As an example, for the traveling salesman problem,
nodes of the tree correspond to sequences (i1 , . . . , ik ) of distinct cities, and
arcs correspond to pairs of nodes (i1 , . . . , ik ) and (i1 , . . . , ik , ik+1 ).
(b) Modify the label correcting method of Section 2.5.2 so that it becomes
similar to the branch-and-bound method (see also the discussion in Section
2.5.2).
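A generic depth-first branch-and-bound loop over binary variables, in the spirit of this exercise, can be sketched as follows. Each tree node is a partial assignment, and a subtree is pruned when its lower bound cannot beat the incumbent; the problem instance in the test and all names are our own illustrations.

```python
def branch_and_bound(n, lower_bound, evaluate):
    # Depth-first branch and bound over x in {0, 1}^n.
    # lower_bound(partial) must underestimate the best completion cost.
    best_cost, best_x = float('inf'), None
    stack = [()]                         # partial assignments (tuples)
    while stack:
        partial = stack.pop()
        if lower_bound(partial) >= best_cost:
            continue                     # prune: cannot improve the incumbent
        if len(partial) == n:
            cost = evaluate(partial)
            if cost < best_cost:
                best_cost, best_x = cost, partial
        else:
            stack.append(partial + (0,))
            stack.append(partial + (1,))
    return best_cost, best_x
```

Viewing the stack discipline as a path search through the tree of partial assignments is precisely the shortest path correspondence of part (a).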
10.19
Use the branch-and-bound method to solve the capacitated spanning tree problem
of Fig. 10.5.
Sec. 10.6 Notes, Sources, and Exercises 535
In the context of simulated annealing, assume that T is kept constant and let pxy
be the probability that when the current solution is x, the next solution sampled
is y. Consider the special case where pxy = pyx for all feasible solutions x and y,
and assume that the Markov chain defined by the probabilities pxy is irreducible,
in the sense that there is positive probability to go from any x to any y, with one
or more samples. Show that the steady-state probability of a solution x is

πx = e^{−f(x)/T} / C,

where

C = ∑_{x∈F} e^{−f(x)/T}.
Hint: This exercise assumes some basic knowledge of the theory of Markov chains.
Let qxy be the probability that y is the next solution if x is the current solution,
i.e.,

qxy = pxy e^{−(f(y)−f(x))/T} if f(y) > f(x), and qxy = pxy otherwise.

Show that for all x and y we have πy qyx = πx qxy, and that πy = ∑_{x∈F} πx qxy.
This equality together with ∑_{x∈F} πx = 1 is sufficient to show the result.
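The detailed balance relation in the hint is easy to verify numerically for a small solution set. The sketch below (our naming; uniform symmetric proposals pxy = 1/(|F|−1)) builds the Metropolis transition probabilities qxy and the Boltzmann distribution πx.

```python
import math

def transition_matrix(costs, T):
    # Metropolis chain: uniform symmetric proposals over all other solutions.
    n = len(costs)
    p = 1.0 / (n - 1)                     # p_xy, symmetric
    q = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if x == y:
                continue
            if costs[y] > costs[x]:
                q[x][y] = p * math.exp(-(costs[y] - costs[x]) / T)
            else:
                q[x][y] = p
        q[x][x] = 1.0 - sum(q[x])         # self-loop absorbs rejected moves
    return q

def boltzmann(costs, T):
    # Steady-state distribution pi_x = e^{-f(x)/T} / C claimed in the exercise.
    w = [math.exp(-f / T) for f in costs]
    C = sum(w)
    return [wi / C for wi in w]
```

Both detailed balance πx qxy = πy qyx and stationarity ∑x πx qxy = πy can then be checked to machine precision.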
In the context of the rollout algorithm, suppose that given a partial solution P = {xij | (i, j) ∈ S}, we have an estimate c(P) of the optimal cost over all feasible solutions that are consistent with P, in the sense that there exists a complementary solution P̄ = {x̄ij | (i, j) ∉ S} such that P ∪ P̄ is feasible. Consider a heuristic algorithm, which is greedy with respect to c(P), in the sense that it starts from S = Ø, and given the partial solution P = {xij | (i, j) ∈ S}, it selects a set of arcs T, forms the collection FT of all possible values of the arc flows y = {yij | (i, j) ∈ T}, and finds

ȳ = arg min_{y∈FT} c(Py+),    (10.36)

where

Py+ = {xij | (i, j) ∈ S} ∪ {yij | (i, j) ∈ T}.

It then augments P with the arc flows ȳ thus obtained, and repeats up to obtaining a complete solution. Assume that the set of arcs T selected depends only on P. Furthermore, the ties in the minimization of Eq. (10.36) are resolved in a fixed manner that depends only on P. Show that the rollout algorithm that uses the greedy algorithm as a base heuristic is sequentially consistent.
Consider a variant of the rollout algorithm that starts with the empty set of arcs, and maintains, in addition to the current partial solution P = {xij | (i, j) ∈ S}, a complementary solution P̄ = {x̄ij | (i, j) ∉ S}, and the corresponding (complete) flow vector x = P ∪ P̄. At the typical iteration, we select a subset T of arcs that are not in S, and we consider the collection FT of all possible values of the arc flows y = {yij | (i, j) ∈ T}. Then, if the best cost over y ∈ FT of the complete solution generated by the base heuristic starting from Py+ is no larger than the cost of x, we augment the current partial solution {xij | (i, j) ∈ S} with the arc flows ȳ = {ȳij | (i, j) ∈ T} that attain the minimum above, and we set x equal to the complete solution generated by the base heuristic starting from Pȳ+. Otherwise, we augment the current partial solution {xij | (i, j) ∈ S} with the arc flows {x̄ij | (i, j) ∈ T} and we leave x unchanged. Prove that this rollout algorithm is sequentially improving, in the sense that the heuristic costs of the partial solutions generated are monotonically nonincreasing.
A machine can be used to perform a subset of N given tasks over T time periods.
At each time period t, only a subset A(t) of tasks can be performed. Each task
j has value vj (t) when performed at period t.
(a) Formulate the problem of finding the sequence of tasks of maximal total
value as an assignment problem. Hint: Assign time periods to tasks.
(b) Suppose that there are in addition some precedence constraints of the general form: Task j must be performed before task j′ can be performed.
Formulate the problem as an assignment problem with side constraints
and integer constraints. Give an example where the integer constraints are
essential.
(c) Repeat part (b) for the case where there are no precedence constraints, but
instead some of the tasks require more than one time period.
α^{Ti} Ri / (1 − α^{Ti}) ≥ α^{Tj} Rj / (1 − α^{Tj}).

Conclude that scheduling jobs in order of decreasing α^{Ti} Ri / (1 − α^{Ti}) is optimal.
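The interchange argument can be checked by enumeration on small instances. In the sketch below we assume (our reading of the truncated statement) that job i, completed at time ti, earns the discounted reward α^{ti} Ri; the function names are ours.

```python
def discounted_reward(order, T, R, alpha):
    # Reward alpha**t_i * R_i, where t_i is the completion time of job i.
    t, total = 0, 0.0
    for i in order:
        t += T[i]
        total += alpha ** t * R[i]
    return total

def index_order(T, R, alpha):
    # Schedule in decreasing order of alpha**T_i * R_i / (1 - alpha**T_i).
    key = lambda i: alpha ** T[i] * R[i] / (1 - alpha ** T[i])
    return sorted(range(len(T)), key=key, reverse=True)
```

Brute force over all permutations confirms the index order on small instances, which is what the interchange argument proves in general.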
We want to schedule N tasks, the ith of which requires Ti time units. Let ti
denote the time of completion of the ith task, i.e.,
ti = Ti + ∑_{tasks k completed before i} Tk.
10.26
ti = Ti + ∑_{projects k completed before i} Tk.
The professor wants to order the projects so as to minimize the maximum tardi-
ness, given by
max_{i∈{1,...,N}} max[0, ti − di].
Consider a quiz contest where a person is given a list of N questions and can
answer these questions in any order he chooses. Question i will be answered
correctly with probability pi , independently of earlier answers, and the person will
then receive a reward Ri . At the first incorrect answer, the quiz terminates and
the person is allowed to keep his previous rewards. The problem is to maximize
the expected reward by choosing optimally the ordering of the questions.
(a) Show that to maximize the expected reward, questions should be answered
in decreasing order of pi Ri /(1 − pi ). Hint: Use an interchange argument
(cf. Exercise 10.24).
(b) Consider the variant of the problem where there is a maximum number
of questions that can be answered, which is smaller than the number of
questions that are available. Show that it is not necessarily optimal to
answer the questions in order of decreasing pi Ri /(1 − pi ). Hint: Try the
case where only one out of two available questions can be answered.
(c) Give a 2-OPT algorithm to solve the problem where the number of available
questions is one more than the maximum number of questions that can be
answered.
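Part (a) can be verified by brute force on small instances; a minimal sketch with our own function names:

```python
def expected_reward(order, p, R):
    # Quiz stops at the first incorrect answer; earlier rewards are kept.
    prob_alive, total = 1.0, 0.0
    for i in order:
        total += prob_alive * p[i] * R[i]   # reward R_i earned with prob. alive*p_i
        prob_alive *= p[i]
    return total

def greedy_order(p, R):
    # Answer in decreasing order of p_i R_i / (1 - p_i).
    return sorted(range(len(p)), key=lambda i: p[i] * R[i] / (1 - p[i]),
                  reverse=True)
```

Restricting the number of questions that may be answered breaks this ordering, which is the point of part (b).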
Consider the quiz problem of Exercise 10.28 for the case where the maximum
number of questions that can be answered is less than or equal to the number of
questions that are available. Consider the heuristic which answers questions in
decreasing order of pi Ri /(1 − pi ), and use it as a base heuristic in a rollout
algorithm. Show that the cost of the rollout algorithm is no worse than the cost
of the base heuristic. Hint: Prove sequential consistency of the base heuristic.
10.30
This exercise shows that nondifferentiabilities of the dual function given in Section 10.3 often tend to arise at the most interesting points and thus cannot be ignored.
Show that if there is a duality gap, then the dual function q is nondifferentiable
at every dual optimal solution. Hint: Assume that q has a unique subgradient at
a dual optimal solution µ∗ and derive a contradiction by showing that any vector
xµ∗ that minimizes L(x, µ∗ ) is primal optimal.
|βjm + γmw − ajmw| ≤ ε, ∀ j ∈ J, m ∈ M, w ∈ W,
where ajmw is the value of the triplet (j, m, w).
(a) Show that if the problem is solved with ajmw replaced by βjm + γmw, the
3-dimensional assignment obtained achieves the optimal cost of the original
problem within 2nε.
(b) Suppose that we don’t know β jm and γ mw , and that we use the enforced
separation approach of Example 10.7. Thus, we first solve the jobs-to-
machines 2-dimensional assignment problem with values
bjm = max ajmw .
w∈W
minimize ∑_{j=1}^{n} fj(xj)

subject to ∑_{j=1}^{n} xj ≤ A,

xj ∈ {0, 1, . . . , mj}, j = 1, . . . , n,
where A and m1 , . . . , mn are given positive integers, and each function fj is con-
vex over the interval [0, mj ]. Consider an iterative algorithm (due to Ibaraki and
Katoh [1988]) that starts at (0, . . . , 0) and maintains a feasible vector (x1 , . . . , xn ).
At the typical iteration, we consider the set of indices J = {j | xj < mj}. If
J is empty or ∑_{j=1}^{n} xj = A, the algorithm terminates. Otherwise, we find an
index j ∈ J that maximizes fj (xj ) − fj (xj + 1). If fj (xj ) − fj (xj + 1) ≤ 0, the
algorithm terminates. Otherwise, we increase xj by one unit, and go to the next
iteration. Show that upon termination, the algorithm yields an optimal solution.
Note: The book by Ibaraki and Katoh [1988] contains a lot of material on this
problem, and addresses the issues of efficient implementation.
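A direct transcription of the iteration, with our own function names (fine for small instances, without the efficient data structures that Ibaraki and Katoh discuss):

```python
def greedy_allocation(f, m, A):
    # f: list of functions f_j, each convex on {0, ..., m_j}; allocate up to
    # A units one at a time, always to the largest positive decrement.
    n = len(f)
    x = [0] * n
    while True:
        J = [j for j in range(n) if x[j] < m[j]]
        if not J or sum(x) == A:
            return x
        j = max(J, key=lambda j: f[j](x[j]) - f[j](x[j] + 1))
        if f[j](x[j]) - f[j](x[j] + 1) <= 0:
            return x          # no unit increment decreases the cost
        x[j] += 1
```

Convexity of each fj makes the decrements nonincreasing along each coordinate, which is what the optimality proof asked for in the exercise rests on.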
The purpose of this exercise is to compare the lower bounds obtained by relaxing
integer constraints and by dualizing the side constraints. Consider the nonlinear
network optimization problem with a cost function f (x), the conservation of flow
constraints, and the additional constraint
x ∈ X = {x | xij ∈ Xij, (i, j) ∈ A, gt(x) ≤ 0, t = 1, . . . , r},
where Xij are given subsets of the real line and the functions gt are linear. We
assume that f is convex over the entire space of flow vectors x. We introduce a
Lagrange multiplier µt for each of the side constraints gt(x) ≤ 0, and we form
the corresponding Lagrangian function

L(x, µ) = f(x) + ∑_{t=1}^{r} µt gt(x).
Let C denote the set of all x satisfying the conservation of flow constraints, let
f∗ denote the optimal primal cost,

f∗ = inf_{x∈C, xij∈Xij, gt(x)≤0, t=1,...,r} f(x),

and let q∗ denote the optimal dual cost.
Let X̂ij denote the interval which is the convex hull of the set Xij, and denote
by fˆ the optimal cost of the problem where each set Xij is replaced by X̂ij,

fˆ = inf_{x∈C, xij∈X̂ij, gt(x)≤0, t=1,...,r} f(x).    (10.37)
Note that this is a convex problem even if Xij embodies integer constraints.
(a) Show that fˆ ≤ q ∗ ≤ f ∗ . Hint: Use Prop. 8.3 to show that problem (10.37)
has no duality gap and compare its dual cost with q ∗ .
(b) Assume that f is linear. Show that fˆ = q ∗ . Hint: The problem involved
in the definition of the dual function of problem (10.37) is a minimum cost
flow problem.
(c) Assume that C is a general polyhedron; that is, C is specified by a finite
number of linear equality and inequality constraints (rather than the con-
servation of flow constraints). Provide an example where f is linear and
we have fˆ < q ∗ .
maximize ∑_{i=1}^{n} vi xi
(a) Let f ∗ and q ∗ be the optimal primal and dual costs, respectively. Show
that
0 ≤ q∗ − f∗ ≤ max_{i=1,...,n} vi.
This exercise provides a convergence result for a common variation of the subgra-
dient method (the result is due to Brannlund [1993]; see also Goffin and Kiwiel
[1996]). Consider the iteration µk+1 = [µk + sk g k ]+ , where
sk = (q̃ − q(µk)) / ‖gk‖^2.
(a) Suppose that q̃ is an underestimate of the optimal dual cost q ∗ such that
q(µk ) < q̃ ≤ q ∗ . [Here q̃ is fixed and the algorithm stops at µk if q(µk ) ≥ q̃.]
Use the fact that {g k } is bounded to show that either for some k̄ we have
q(µk̄) ≥ q̃ or else q(µk) → q̃. Hint: Consider the function min{q(µ), q̃}
and use the results of Exercise 10.36.
(b) Suppose that q̃ is an overestimate of the optimal dual cost, that is, q̃ > q ∗ .
Use the fact that {g k } is bounded to show that the length of the path
traveled by the method is infinite, that is,

∑_{k=0}^{∞} sk ‖gk‖ = ∑_{k=0}^{∞} (q̃ − q(µk)) / ‖gk‖ = ∞.
(c) Let δ0 and B be two positive scalars. Consider the following version of
the subgradient method. Given µk , apply successive subgradient iterations
with q̃ = q(µk ) + δ k in the stepsize formula in place of q(µ∗ ), until one of
the following two occurs:
(1) The dual cost exceeds q(µk ) + δ k /2.
(2) The length of the path traveled starting from µk exceeds B.
Then set µk+1 to the iterate with highest dual cost thus far. Furthermore,
in case (1), set δ k+1 = δ k , while in case (2), set δ k+1 = δ k /2. Use the fact
that {g k } is bounded to show that q(µk ) → q ∗ .
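The dynamic stepsize of this exercise is easy to experiment with on a toy concave piecewise linear dual. The sketch below uses our own names and a one-dimensional q(µ) = min_i (a_i µ + b_i) over µ ≥ 0; with an underestimate q̃ < q∗, the iteration stops once q(µk) ≥ q̃, as in part (a).

```python
def subgradient_ascent(pieces, mu0, q_tilde, max_iter=1000):
    # Maximize the concave piecewise linear dual q(mu) = min_i (a_i mu + b_i)
    # over mu >= 0, with dynamic stepsize s_k = (q_tilde - q(mu_k)) / ||g_k||^2.
    mu = mu0
    for _ in range(max_iter):
        q, g = min((a * mu + b, a) for a, b in pieces)  # value and a subgradient
        if q >= q_tilde:
            return mu, q            # the target level q_tilde has been reached
        s = (q_tilde - q) / (g * g)
        mu = max(0.0, mu + s * g)   # projected subgradient step
    return mu, min(a * mu + b for a, b in pieces)
```

With an overestimate q̃ > q∗ the same iteration travels an infinite path, which is the phenomenon part (b) asks to establish.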
10.38 (Convergence Rate of the Subgradient Method)
(a) Show that lim inf_{k→∞} √k (q(µ∗) − q(µk)) = 0. Hint: Use Eq. (10.16)
to show that ∑_{k=0}^{∞} (q(µ∗) − q(µk))^2 < ∞. Assume that
√k (q(µ∗) − q(µk)) ≥ ε for some ε > 0 and arbitrarily large k, and
reach a contradiction.
(b) Assume that for some a > 0 and all k, we have q(µ∗) − q(µk) ≥ a‖µ∗ − µk‖.
Use Eq. (10.16) to show that for all k we have

‖µk+1 − µ∗‖ ≤ r ‖µk − µ∗‖,

where r = √(1 − a^2/b^2) and b is an upper bound on ‖gk‖.
10.39
Consider the rollout algorithm for the traveling salesman problem using as base
heuristic the nearest neighbor method, whereby we start from some simple path
and at each iteration, we add a node that does not close a cycle and minimizes
the cost of the enlarged path (see the paragraph following the description of
the rollout algorithm iteration in Section 10.5). Write a computer program to
apply this algorithm to the problem involving Hamilton’s 20-node graph (Exercise
1.35) for the case where all arcs have randomly chosen costs from the range
[0, 1]. For node pairs for which there is no arc, introduce an artificial arc with
cost randomly chosen from the range [100, 101]. Compare the performances of
the rollout algorithm and the nearest neighbor heuristic, and compile relevant
statistics by running a suitable large collection of randomly generated problem
instances. Verify that the rollout algorithm performs at least as well as the
nearest neighbor heuristic for each instance (since it is sequentially consistent).
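A compact version of the requested program (our function names; a small random symmetric instance stands in for Hamilton's 20-node graph) is sketched below. The claim of the exercise, that rollout is never worse than nearest neighbor, can then be checked instance by instance.

```python
def tour_cost(tour, c):
    # Cost of the closed tour visiting the nodes of `tour` in order.
    return sum(c[tour[i]][tour[i + 1]] for i in range(len(tour) - 1)) \
        + c[tour[-1]][tour[0]]

def nearest_neighbor(c, path):
    # Complete the given partial path by repeatedly appending the closest
    # unvisited node; returns a full tour.
    n = len(c)
    tour = list(path)
    while len(tour) < n:
        last = tour[-1]
        tour.append(min((v for v in range(n) if v not in tour),
                        key=lambda v: c[last][v]))
    return tour

def rollout(c):
    # At each step, try each feasible next node, complete the tour with the
    # base heuristic, and keep the node whose completed tour is cheapest.
    n = len(c)
    path = [0]
    while len(path) < n:
        best = min((v for v in range(n) if v not in path),
                   key=lambda v: tour_cost(nearest_neighbor(c, path + [v]), c))
        path.append(best)
    return path
```

Since nearest neighbor with fixed tie-breaking is sequentially consistent, the rollout tour cost is guaranteed to be at most the nearest neighbor tour cost on every instance.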
References
Aarts, E., and Lenstra, J. K., 1997. Local Search in Combinatorial Opti-
mization, Wiley, N. Y.
Ahuja, R. K., Magnanti, T. L., and Orlin, J. B., 1989. “Network Flows,”
in Handbooks in Operations Research and Management Science, Vol. 1,
Optimization, Nemhauser, G. L., Rinnooy-Kan, A. H. G., and Todd M. J.
(eds.), North-Holland, Amsterdam, pp. 211-369.
Ahuja, R. K., Mehlhorn, K., Orlin, J. B., and Tarjan, R. E., 1990. “Faster
Algorithms for the Shortest Path Problem,” J. ACM, Vol. 37, 1990, pp.
213-223.
Ahuja, R. K., and Orlin, J. B., 1987. Private Communication.
Ahuja, R. K., and Orlin, J. B., 1989. “A Fast and Simple Algorithm for
the Maximum Flow Problem,” Operations Research, Vol. 37, pp. 748-759.
Amini, M. M., 1994. “Vectorization of an Auction Algorithm for Linear
Cost Assignment Problem,” Comput. Ind. Eng., Vol. 26, pp. 141-149.
Arezki, Y., and Van Vliet, D., 1990. “A Full Analytical Implementation of
the PARTAN/Frank-Wolfe Algorithm for Equilibrium Assignment,” Trans-
portation Science, Vol. 24, pp. 58-62.
Assad, A. A., and Golden, B. L., 1995. “Arc Routing Methods and Appli-
cations,” Handbooks in OR and MS, Ball, M. O., Magnanti, T. L., Monma,
C. L., and Nemhauser, G. L., (eds.), Vol. 8, North-Holland, Amsterdam,
pp. 375-483.
Atkinson, D. S., and Vaidya, P. M., 1995. “A Cutting Plane Algorithm for
Convex Programming that Uses Analytic Centers,” Math. Programming,
Vol. 69, pp. 1-44.
Auchmuty, G., 1989. “Variational Principles for Variational Inequalities,”
Numer. Functional Analysis and Optimization, Vol. 10, pp. 863-874.
Auslender, A., 1976. Optimization: Methodes Numeriques, Mason, Paris.
Balas, E., Miller, D., Pekny, J., and Toth, P., 1991. “A Parallel Shortest
Path Algorithm for the Assignment Problem,” J. ACM, Vol. 38, pp. 985-
1004.
Balas, E., and Toth, P., 1985. “Branch and Bound Methods,” in The Trav-
eling Salesman Problem, Lawler, E., Lenstra, J. K., Rinnooy Kan, A. H. G.,
and Shmoys, D. B. (eds.), Wiley, N. Y., pp. 361-401.
Bertsekas, D. P., Gafni, E. M., and Gallager, R. G., 1984. “Second Deriva-
tive Algorithms for Minimum Delay Distributed Routing in Networks,”
IEEE Trans. on Communications, Vol. 32, pp. 911-919.
Bertsekas, D. P., and Gallager, R. G., 1992. Data Networks, (2nd Ed.),
Prentice-Hall, Englewood Cliffs, N. J.
Bertsekas, D. P., Guerriero, F., and Musmanno, R., 1996. “Parallel Asyn-
chronous Label Correcting Methods for Shortest Paths,” J. of Optimization
Theory and Applications, Vol. 88, pp. 297-320.
Bertsekas, D. P., Hosein, P., and Tseng, P., 1987. “Relaxation Methods for
Network Flow Problems with Convex Arc Costs,” SIAM J. on Control and
Optimization, Vol. 25, pp. 1219-1243.
Bertsekas, D. P., and Mitter, S. K., 1971. “Steepest Descent for Optimiza-
tion Problems with Nondifferentiable Cost Functionals,” Proc. 5th Annual
Princeton Confer. Inform. Sci. Systems, Princeton, N. J., pp. 347-351.
Bertsekas, D. P., and Mitter, S. K., 1973. “Descent Numerical Methods for
Optimization Problems with Nondifferentiable Cost Functions,” SIAM J.
on Control, Vol. 11, pp. 637-652.
Bertsekas, D. P., Pallottino, S., and Scutellà, M. G., 1995. “Polynomial
Auction Algorithms for Shortest Paths,” Computational Optimization and
Applications, Vol. 4, pp. 99-125.
Bertsekas, D. P., Polymenakos, L. C., and Tseng, P., 1997a. “An ε-Relaxation
Method for Separable Convex Cost Network Flow Problems,” SIAM J. on
Optimization, Vol. 7, pp. 853-870.
Bertsekas, D. P., Polymenakos, L. C., and Tseng, P., 1997b. “Epsilon-
Relaxation and Auction Methods for Separable Convex Cost Network Flow
Problems,” in Network Optimization, Pardalos, P. M., Hearn, D. W., and
Hager, W. W. (eds.), Lecture Notes in Economics and Mathematical Sys-
tems, Springer-Verlag, N. Y., pp. 103-126.
Bertsekas, D. P., and Tseng, P., 1988a. “Relaxation Methods for Minimum
Cost Ordinary and Generalized Network Flow Problems,” Operations Re-
search, Vol. 36, pp. 93-114.
Bertsekas, D. P., and Tseng, P., 1988b. “RELAX: A Computer Code for
Minimum Cost Network Flow Problems,” Annals of Operations Research,
Vol. 13, pp. 127-190.
Bertsekas, D. P., and Tseng, P., 1990. “RELAXT-III: A New and Improved
Version of the RELAX Code,” Lab. for Information and Decision Systems
Report P-1990, M.I.T., Cambridge, MA.
Bertsekas, D. P., and Tseng, P., 1994. “RELAX-IV: A Faster Version of
the RELAX Code for Solving Minimum Cost Flow Problems,” Laboratory
for Information and Decision Systems Report P-2276, M.I.T., Cambridge,
MA.
Bertsekas, D. P., and Tsitsiklis, J. N., 1989. Parallel and Distributed Com-
putation: Numerical Methods, Prentice-Hall, Englewood Cliffs, N. J. (re-
published in 1997 by Athena Scientific, Belmont, MA).
Bertsekas, D. P., and Tsitsiklis, J. N., 1996. Neuro-Dynamic Programming,
Athena Scientific, Belmont, MA.
Bertsekas, D. P., Tsitsiklis, J. N., and Wu, C., 1997. “Rollout Algorithms
for Combinatorial Optimization,” Heuristics, Vol. 3, pp. 245-262.
Bertsimas, D., and Tsitsiklis, J. N., 1993. “Simulated Annealing,” Stat.
Sci., Vol. 8, pp. 10-15.
Bertsimas, D., and Tsitsiklis, J. N., 1997. Introduction to Linear Optimiza-
tion, Athena Scientific, Belmont, MA.
Birkhoff, G., and Diaz, J. B., 1956. “Nonlinear Network Problems,” Quart.
Appl. Math., Vol. 13, pp. 431-444.
Bland, R. G., and Jensen, D. L., 1985. “On the Computational Behavior
of a Polynomial-Time Network Flow Algorithm,” Tech. Report 661, School
of Operations Research and Industrial Engineering, Cornell University.
Blackman, S. S., 1986. Multi-Target Tracking with Radar Applications,
Artech House, Dedham, MA.
Bogart, K. P., 1990. Introductory Combinatorics, Harcourt Brace Jovano-
vich, Inc., New York, N. Y.
Bradley, G. H., Brown, G. G., and Graves, G. W., 1977. “Design and Imple-
mentation of Large-Scale Primal Transshipment Problems,” Management
Science, Vol. 24, pp. 1-38.
Brannlund, U., 1993. On Relaxation Methods for Nonsmooth Convex Opti-
mization, Doctoral Thesis, Royal Institute of Technology, Stockholm, Sweden.
Brown, G. G., and McBride, R. D., 1984. “Solving Generalized Networks,”
Management Science, Vol. 30, pp. 1497-1523.
Burkard, R. E., 1990. “Special Cases of Traveling Salesman Problems and
Heuristics,” Acta Math. Appl. Sin., Vol. 6, pp. 273-288.
Busacker, R. G., and Gowen, P. J., 1961. “A Procedure for Determining a
Family of Minimal-Cost Network Flow Patterns,” O.R.O. Technical Report
No. 15, Operational Research Office, Johns Hopkins University, Baltimore,
MD.
Busacker, R. G., and Saaty, T. L., 1965. Finite Graphs and Networks: An
Introduction with Applications, McGraw-Hill, N. Y.
Cheney, E. W., and Goldstein, A. A., 1959. “Newton’s Method for Convex
Programming and Tchebycheff Approximation,” Numer. Math., Vol. I, pp.
253-268.
Cheriyan, J., and Maheshwari, S. N., 1989. “Analysis of Preflow Push
Algorithms for Maximum Network Flow,” SIAM J. Computing, Vol. 18,
pp. 1057-1086.
Cherkasky, R. V., 1977. “Algorithm for Construction of Maximum Flow in
Networks with Complexity of O(V^2 √E) Operations,” Mathematical Meth-
ods of Solution of Economical Problems, Vol. 7, pp. 112-125.
Christofides, N., 1975. Graph Theory: An Algorithmic Approach, Aca-
demic Press, N. Y.
Chvatal, V., 1983. Linear Programming, W. H. Freeman and Co., N. Y.
Connors, D. P., and Kumar, P. R., 1989. “Simulated Annealing Type
Markov Chains and their Order Balance Equations,” SIAM J. on Control
and Optimization, Vol. 27, pp. 1440-1461.
Cook, W., Cunningham, W., Pulleyblank, W., and Schrijver, A., 1998.
Combinatorial Optimization, Wiley, N. Y.
Cornuejols, G., Fonlupt, J., and Naddef, D., 1985. “The Traveling Salesman
Problem on a Graph and Some Related Polyhedra,” Math. Programming,
Vol. 33, pp. 1-27.
Cottle, R. W., and Pang, J. S., 1982. “On the Convergence of a Block
Successive Over-Relaxation Method for a Class of Linear Complementarity
Problems,” Math. Progr. Studies, Vol. 17, pp. 126-138.
Croes, G. A., 1958. “A Method for Solving Traveling Salesman Problems,”
Operations Research, Vol. 6, pp. 791-812.
Cunningham, W. H., 1976. “A Network Simplex Method,” Math. Program-
ming, Vol. 4, pp. 105-116.
Cunningham, W. H., 1979. “Theoretical Properties of the Network Simplex
Method,” Math. of Operations Research, Vol. 11, pp. 196-208.
Dafermos, S., 1980. “Traffic Equilibrium and Variational Inequalities,”
Transportation Science, Vol. 14, pp. 42-54.
Dafermos, S., 1982. “Relaxation Algorithms for the General Asymmetric
Traffic Equilibrium Problem,” Transportation Science, Vol. 16, pp. 231-240.
Dafermos, S., and Sparrow, F. T., 1969. “The Traffic Assignment Problem
for a General Network,” J. Res. Nat. Bureau of Standards, Vol. 73B, pp.
91-118.
Dantzig, G. B., 1951. “Application of the Simplex Method to a Transporta-
tion Problem,” in Activity Analysis of Production and Allocation, T. C.
Koopmans (ed.), Wiley, N. Y.
Derigs, U., 1985. “The Shortest Augmenting Path Method for Solving As-
signment Problems – Motivation and Computational Experience,” Annals
of Operations Research, Vol. 4, pp. 57-102.
Derigs, U., and Meier, W., 1989. “Implementing Goldberg’s Max-Flow Al-
gorithm – A Computational Investigation,” Zeitschrift für Operations Research,
Vol. 33, pp. 383-403.
Desrosiers, J., Dumas, Y., Solomon, M. M., and Soumis, F., 1995. “Time
Constrained Routing and Scheduling,” Handbooks in OR and MS, Ball,
M. O., Magnanti, T. L., Monma, C. L., and Nemhauser, G. L. (eds.), Vol.
8, North-Holland, Amsterdam, pp. 35-139.
Dial, R. B., 1969. “Algorithm 360: Shortest Path Forest with Topological
Ordering,” Comm. ACM, Vol. 12, pp. 632-633.
Dial, R., Glover, F., Karney, D., and Klingman, D., 1979. “A Compu-
tational Analysis of Alternative Algorithms and Labeling Techniques for
Finding Shortest Path Trees,” Networks, Vol. 9, pp. 215-248.
Dijkstra, E., 1959. “A Note on Two Problems in Connexion with Graphs,”
Numer. Math., Vol. 1, pp. 269-271.
Dinic, E. A., 1970. “Algorithm for Solution of a Problem of Maximum Flow
in Networks with Power Estimation,” Soviet Math. Doklady, Vol. 11, pp.
1277-1280.
Dreyfus, S. E., 1969. “An Appraisal of Some Shortest-Path Algorithms,”
Operations Research, Vol. 17, pp. 395-412.
Duffin, R. J., 1947. “Nonlinear Networks. IIa,” Bull. Amer. Math. Soc.,
Vol. 53, pp. 963-971.
Eastman, W. L., 1958. Linear Programming with Pattern Constraints,
Ph.D. Thesis, Harvard University, Cambridge, MA.
Eckstein, J., 1994. “Nonlinear Proximal Point Algorithms Using Bregman
Functions, with Applications to Convex Programming,” Math. of Opera-
tions Research, Vol. 18, pp. 202-226.
Edmonds, J., 1965. “Paths, Trees, and Flowers,” Canadian J. of Math.,
Vol. 17, pp. 449-467.
Edmonds, J., and Johnson, E. L., 1973. “Matching, Euler Tours, and the
Chinese Postman,” Math. Programming, Vol. 5, pp. 88-124.
Edmonds, J., and Karp, R. M., 1972. “Theoretical Improvements in Al-
gorithmic Efficiency for Network Flow Problems,” J. ACM, Vol. 19, pp.
248-264.
Eiselt, H. A., Gendreau, M., and Laporte, G., 1995a. “Arc Routing Prob-
lems, Part 1: The Chinese Postman Problem,” Operations Research, Vol.
43, pp. 231-242.
Eiselt, H. A., Gendreau, M., and Laporte, G., 1995b. “Arc Routing Prob-
lems, Part 2: The Rural Postman Problem,” Operations Research, Vol. 43,
pp. 399-414.
Elias, P., Feinstein, A., and Shannon, C. E., 1956. “Note on Maximum Flow
Through a Network,” IRE Trans. Info. Theory, Vol. IT-2, pp. 117-119.
Egervary, J., 1931. “Matrixok Kombinatoricus Tulajonsagairol,” Mat. Es
Fiz. Lapok, Vol. 38, pp. 16-28.
El Baz, D., 1989. “A Computational Experience with Distributed Asyn-
chronous Iterative Methods for Convex Network Flow Problems,” Proc.
of the 28th IEEE Conference on Decision and Control, Tampa, Fl., pp.
590-591.
El Baz, D., 1996. “Asynchronous Gradient Algorithms for a Class of Con-
vex Separable Network Flow Problems,” Computational Optimization and
Applications, Vol. 5, pp. 187-205.
El Baz, D., Spiteri, P., Miellou, J. C., and Gazen, D., 1996. “Asynchronous
Iterative Algorithms with Flexible Communication for Nonlinear Network
Flow Problems,” J. of Parallel and Distributed Computing, Vol. 38, pp.
1-15.
Elam, J., Glover, F., and Klingman, D., 1979. “A Strongly Convergent
Primal Simplex Algorithm for Generalized Networks,” Math. of Operations
Research, Vol. 4, pp. 39-59.
Elmaghraby, S. E., 1978. Activity Networks: Project Planning and Control
by Network Models, Wiley, N. Y.
Elzinga, J., and Moore, T. G., 1975. “A Central Cutting Plane Algorithm
for the Convex Programming Problem,” Math. Programming, Vol. 8, pp.
134-145.
Engquist, M., 1982. “A Successive Shortest Path Algorithm for the Assign-
ment Problem,” INFOR, Vol. 20, pp. 370-384.
Ephremides, A., 1986. “The Routing Problem in Computer Networks,”
in Communication and Networks, Blake, I. F., and Poor, H. V. (eds.),
Springer-Verlag, N. Y., pp. 299-325.
Ephremides, A., and Verdu, S., 1989. “Control and Optimization Methods
in Communication Network Problems,” IEEE Trans. on Automatic Con-
trol, Vol. 34, pp. 930-942.
Esau, L. R., and Williams, K. C., 1966. “On Teleprocessing System Design.
A Method for Approximating the Optimal Network,” IBM System J., Vol.
5, pp. 142-147.
Escudero, L. F., 1985. “Performance Evaluation of Independent Superbasic
Sets on Nonlinear Replicated Networks,” Eur. J. Operations Research, Vol.
Ford, L. R., Jr., and Fulkerson, D. R., 1956b. “Maximal Flow Through a
Network,” Can. J. of Math., Vol. 8, pp. 399-404.
Ford, L. R., Jr., and Fulkerson, D. R., 1957. “A Primal-Dual Algorithm
for the Capacitated Hitchcock Problem,” Naval Res. Logist. Quart., Vol.
4, pp. 47-54.
Ford, L. R., Jr., and Fulkerson, D. R., 1962. Flows in Networks, Princeton
Univ. Press, Princeton, N. J.
Fox, B. L., 1993. “Integrating and Accelerating Tabu Search, Simulated
Annealing, and Genetic Algorithms,” Annals of Operations Research, Vol.
41, pp. 47-67.
Fox, B. L., 1995. “Faster Simulated Annealing,” SIAM J. Optimization,
Vol. 41, pp. 47-67.
Frank, H., and Frisch, I. T., 1970. Communication, Transmission, and
Transportation Networks, Addison-Wesley, Reading, MA.
Fratta, L., Gerla, M., and Kleinrock, L., 1973. “The Flow-Deviation Method:
An Approach to Store-and-Forward Computer Communication Network
Design,” Networks, Vol. 3, pp. 97-133.
Fredman, M. L., and Tarjan, R. E., 1984. “Fibonacci Heaps and their Uses
in Improved Network Optimization Algorithms,” Proc. 25th Annual Symp.
on Found. of Comp. Sci., pp. 338-346.
Fukushima, M., 1984a. “A Modified Frank-Wolfe Algorithm for Solving
the Traffic Assignment Problem,” Transportation Research, Vol. 18B, pp.
169–177.
Fukushima, M., 1984b. “On the Dual Approach to the Traffic Assignment
Problem,” Transportation Research, Vol. 18B, pp. 235-245.
Fukushima, M., 1992. “Equivalent Differentiable Optimization Problems
and Descent Methods for Asymmetric Variational Inequalities,” Math. Pro-
gramming, Vol. 53, pp. 99-110.
Fulkerson, D. R., 1961. “An Out-of-Kilter Method for Minimal Cost Flow
Problems,” SIAM J. Appl. Math., Vol. 9, pp. 18-27.
Fulkerson, D. R., and Dantzig, G. B., 1955. “Computation of Maximum
Flow in Networks,” Naval Res. Log. Quart., Vol. 2, pp. 277-283.
Gafni, E. M., 1979. “Convergence of a Routing Algorithm,” M.S. Thesis,
Dept. of Electrical Engineering, Univ. of Illinois, Urbana, Ill.
Gafni, E. M., and Bertsekas, D. P., 1984. “Two-Metric Projection Methods
for Constrained Optimization,” SIAM J. on Control and Optimization, Vol.
22, pp. 936-964.
Glover, F., Klingman, D., Phillips, N., and Schneider, R. F., 1985. “New
Polynomial Shortest Path Algorithms and Their Computational Attributes,”
Management Science, Vol. 31, pp. 1106-1128.
Glover, F., Klingman, D., and Stutz, J., 1973. “Extension of the Augmented
Predecessor Index Method to Generalized Network Problems,” Transportation Science, Vol. 7, pp. 377-384.
Glover, F., Klingman, D., and Stutz, J., 1974. “Augmented Threaded Index
Method for Network Optimization,” INFOR, Vol. 12, pp. 293-298.
Glover, F., and Laguna, M., 1997. Tabu Search, Kluwer, Boston.
Glover, F., Taillard, E., and de Werra, D., 1993. “A User’s Guide to Tabu
Search,” Annals of Operations Research, Vol. 41, pp. 3-28.
Goffin, J. L., 1977. “On Convergence Rates of Subgradient Optimization
Methods,” Math. Programming, Vol. 13, pp. 329-347.
Goffin, J. L., Haurie, A., and Vial, J. P., 1992. “Decomposition and Non-
differentiable Optimization with the Projective Algorithm,” Management
Science, Vol. 38, pp. 284-302.
Goffin, J. L., and Kiwiel, K. C., 1996. “Convergence of a Simple Subgradient
Level Method,” Unpublished Report, to appear in Math. Programming.
Goffin, J. L., Luo, Z.-Q., and Ye, Y., 1993. “On the Complexity of a Column
Generation Algorithm for Convex or Quasiconvex Feasibility Problems,” in
Large Scale Optimization: State of the Art, Hager, W. W., Hearn, D. W.,
and Pardalos, P. M. (eds.), Kluwer.
Goffin, J. L., Luo, Z.-Q., and Ye, Y., 1996. “Further Complexity Analysis of
a Primal-Dual Column Generation Algorithm for Convex or Quasiconvex
Feasibility Problems,” SIAM J. on Optimization, Vol. 6, pp. 638-652.
Goffin, J. L., and Vial, J. P., 1990. “Cutting Planes and Column Generation
Techniques with the Projective Algorithm,” J. Opt. Th. and Appl., Vol. 65,
pp. 409-429.
Goldberg, A. V., 1987. “Efficient Graph Algorithms for Sequential and Par-
allel Computers,” Tech. Report TR-374, Laboratory for Computer Science,
M.I.T., Cambridge, MA.
Goldberg, A. V., 1993. “An Efficient Implementation of a Scaling Minimum-
Cost Flow Algorithm,” Proc. 3rd Integer Progr. and Combinatorial Opti-
mization Conf., pp. 251-266.
Goldberg, A. V., and Tarjan, R. E., 1986. “A New Approach to the Maxi-
mum Flow Problem,” Proc. 18th ACM STOC, pp. 136-146.
Goldberg, A. V., and Tarjan, R. E., 1990. “Solving Minimum Cost Flow
Problems by Successive Approximation,” Math. of Operations Research,
Vol. 15, pp. 430-466.
Hall, M., Jr., 1956. “An Algorithm for Distinct Representatives,” Amer.
Math. Monthly, Vol. 63, pp. 716-717.
Hansen, P., 1986. “The Steepest Ascent Mildest Descent Heuristic for Com-
binatorial Optimization,” Presented at the Congress on Numerical Methods
in Combinatorial Optimization, Capri, Italy.
Hearn, D. W., and Lawphongpanich, S., 1990. “A Dual Ascent Algorithm
for Traffic Assignment Problems,” Transportation Research, Vol. 24B, pp.
423-430.
Hearn, D. W., Lawphongpanich, S., and Nguyen, S., 1984. “Convex Programming Formulation of the Asymmetric Traffic Assignment Problem,”
Transportation Research, Vol. 18B, pp. 357-365.
Hearn, D. W., Lawphongpanich, S., and Ventura, J. A., 1985. “Finiteness
in Restricted Simplicial Decomposition,” Operations Research Letters, Vol.
4, pp. 125-130.
Hearn, D. W., Lawphongpanich, S., and Ventura, J. A., 1987. “Restricted
Simplicial Decomposition: Computation and Extensions,” Math. Program-
ming Studies, Vol. 31, pp. 99-118.
Held, M., and Karp, R. M., 1970. “The Traveling Salesman Problem and
Minimum Spanning Trees,” Operations Research, Vol. 18, pp. 1138-1162.
Held, M., and Karp, R. M., 1971. “The Traveling Salesman Problem and
Minimum Spanning Trees: Part II,” Math. Programming, Vol. 1, pp. 6-25.
Helgason, R. V., and Kennington, J. L., 1977. “An Efficient Procedure for
Implementing a Dual-Simplex Network Flow Algorithm,” AIIE Transac-
tions, Vol. 9, pp. 63-68.
Helgason, R. V., and Kennington, J. L., 1995. “Primal-Simplex Algorithms
for Minimum Cost Network Flows,” Handbooks in OR and MS, Ball, M.
O., Magnanti, T. L., Monma, C. L., and Nemhauser, G. L. (eds.), Vol. 7,
North-Holland, Amsterdam, pp. 85-133.
Helgason, R. V., Kennington, J. L., and Stewart, B. D., 1993. “The One-
to-One Shortest-Path Problem: An Empirical Analysis with the Two-Tree
Dijkstra Algorithm,” Computational Optimization and Applications, Vol.
1, pp. 47-75.
Hiriart-Urruty, J.-B., and Lemaréchal, C., 1993. Convex Analysis and Minimization Algorithms, Vols. I and II, Springer-Verlag, Berlin and N. Y.
Hochbaum, D. S., and Shanthikumar, J. G., 1990. “Convex Separable Optimization is not Much Harder than Linear Optimization,” J. ACM, Vol.
37, pp. 843-862.
Hoffman, A. J., 1960. “Some Recent Applications of the Theory of Lin-
ear Inequalities to Extremal Combinatorial Analysis,” Proc. Symp. Appl.
Johnson, E. L., 1972. “On Shortest Paths and Sorting,” Proc. 25th ACM
Annual Conference, pp. 510-517.
Jonker, R., and Volgenant, A., 1986. “Improving the Hungarian Assign-
ment Algorithm,” Operations Research Letters, Vol. 5, pp. 171-175.
Jonker, R., and Volgenant, A., 1987. “A Shortest Augmenting Path Al-
gorithm for Dense and Sparse Linear Assignment Problems,” Computing,
Vol. 38, pp. 325-340.
Jünger, M., Reinelt, G., and Rinaldi, G., 1995. “The Traveling Salesman Problem,” Handbooks in OR and MS, Ball, M. O., Magnanti, T.
L., Monma, C. L., and Nemhauser, G. L. (eds.), Vol. 7, North-Holland,
Amsterdam, pp. 225-330.
Karzanov, A. V., 1974. “Determining the Maximal Flow in a Network with
the Method of Preflows,” Soviet Math. Dokl., Vol. 15, pp. 434-437.
Karzanov, A. V., and McCormick, S. T., 1997. “Polynomial Methods for
Separable Convex Optimization in Unimodular Linear Spaces with Ap-
plications to Circulations and Co-circulations in Network,” SIAM J. on
Computing, Vol. 26, pp. 1245-1275.
Kelley, J. E., 1960. “The Cutting-Plane Method for Solving Convex Pro-
grams,” J. Soc. Indust. Appl. Math., Vol. 8, pp. 703-712.
Kennington, J., and Helgason, R., 1980. Algorithms for Network Program-
ming, Wiley, N. Y.
Kennington, J., and Shalaby, M., 1977. “An Effective Subgradient Pro-
cedure for Minimal Cost Multicommodity Flow Problems,” Management
Science, Vol. 23, pp. 994-1004.
Kernighan, B. W., and Lin, S., 1970. “An Efficient Heuristic Procedure for
Partitioning Graphs,” Bell System Tech. Journal, Vol. 49, pp. 291-307.
Kershenbaum, A., 1981. “A Note on Finding Shortest Path Trees,” Net-
works, Vol. 11, pp. 399-400.
Kershenbaum, A., 1993. Network Design Algorithms, McGraw-Hill, N. Y.
Kirkpatrick, S., Gelatt, C. D., and Vecchi, M. P., 1983. “Optimization by
Simulated Annealing,” Science, Vol. 220, pp. 671-680.
Kiwiel, K. C., 1997a. “Proximal Minimization Methods with Generalized
Bregman Functions,” SIAM J. on Control and Optimization, Vol. 35, pp.
1142-1168.
Kiwiel, K. C., 1997b. “Efficiency of the Analytic Center Cutting Plane
Method for Convex Minimization,” SIAM J. on Optimization, Vol. 7, pp.
336-346.
Klee, V., and Minty, G. J., 1972. “How Good is the Simplex Algorithm?,”
in Inequalities III, O. Shisha (ed.), Academic Press, N. Y., pp. 159-175.
Klein, M., 1967. “A Primal Method for Minimal Cost Flow with Appli-
cations to the Assignment and Transportation Problems,” Management
Science, Vol. 14, pp. 205-220.
Klessig, R. W., 1974. “An Algorithm for Nonlinear Multicommodity Flow
Problems,” Networks, Vol. 4, pp. 343-355.
Klincewitz, J. C., 1989. “Implementing an Exact Newton Method for Sep-
arable Convex Transportation Problems,” Networks, Vol. 19, pp. 95-105.
König, D., 1931. “Graphok es Matrixok,” Mat. Es Fiz. Lapok, Vol. 38, pp.
116-119.
Korst, J., and Aarts, E. H., 1989. Simulated Annealing and
Boltzmann Machines: A Stochastic Approach to Combinatorial Optimiza-
tion and Neural Computing, Wiley, N. Y.
Kortanek, K. O., and No, H., 1993. “A Central Cutting Plane Algorithm for
Convex Semi-Infinite Programming Problems,” SIAM J. on Optimization,
Vol. 3, pp. 901-918.
Kuhn, H. W., 1955. “The Hungarian Method for the Assignment Problem,”
Naval Research Logistics Quarterly, Vol. 2, pp. 83-97.
Kumar, V., Grama, A., Gupta, A., and Karypis, G., 1994. Introduction to
Parallel Computing, Benjamin/Cummings, Redwood City, CA.
Kushner, H. J., 1990. “Numerical Methods for Continuous Control Prob-
lems in Continuous Time,” SIAM J. on Control and Optimization, Vol. 28,
pp. 999-1048.
Kushner, H. J., and Dupuis, P. G., 1992. Numerical Methods for Stochastic
Control Problems in Continuous Time, Springer-Verlag, N. Y.
Kwan Mei-Ko, 1962. “Graphic Programming Using Odd or Even Points,”
Chinese Math., Vol. 1, pp. 273-277.
Lamar, B. W., 1993. “An Improved Branch and Bound Algorithm for Min-
imum Concave Cost Network Flow Problems,” in Network Optimization
Problems, Du, D.-Z., and Pardalos, P. M. (eds.), World Scientific Publ.,
Singapore, pp. 261-287.
Land, A. H., and Doig, A. G., 1960. “An Automatic Method for Solving
Discrete Programming Problems,” Econometrica, Vol. 28, pp. 497-520.
Larsson, T., and Patriksson, M., 1992. “Simplicial Decomposition with
Disaggregated Representation for the Traffic Assignment Problem,” Trans-
portation Science, Vol. 26, pp. 4-17.
Luo, Z.-Q., and Tseng, P., 1994. “On the Rate of Convergence of a Dis-
tributed Asynchronous Routing Algorithm,” IEEE Trans. on Automatic
Control, Vol. 39, pp. 1123-1129.
Malhotra, V. M., Kumar, M. P., and Maheshwari, S. N., 1978. “An O(|V|³)
Algorithm for Finding Maximum Flows in Networks,” Inform. Process.
Lett., Vol. 7, pp. 277-278.
Marcotte, P., 1985. “A New Algorithm for Solving Variational Inequalities
with Application to the Traffic Assignment Problem,” Math. Programming
Studies, Vol. 33, pp. 339-351.
Marcotte, P., and Dussault, J.-P., 1987. “A Note on a Globally Convergent
Newton Method for Solving Monotone Variational Inequalities,” Opera-
tions Research Letters, Vol. 6, pp. 35-42.
Marcotte, P., and Guélat, J., 1988. “Adaptation of a Modified Newton
Method for Solving the Asymmetric Traffic Equilibrium Problem,” Trans-
portation Science, Vol. 22, pp. 112-124.
Martello, S., and Toth, P., 1990. Knapsack Problems, Wiley, N. Y.
Martinet, B., 1970. “Regularisation d’Inequations Variationnelles par Ap-
proximations Successives,” Rev. Francaise Inf. Rech. Oper., Vol. 4, pp.
154-159.
McGinnis, L. F., 1983. “Implementation and Testing of a Primal-Dual Al-
gorithm for the Assignment Problem,” Operations Research, Vol. 31, pp.
277-291.
Mendelssohn, N. S., and Dulmage, A. L., 1958. “Some Generalizations of
Distinct Representatives,” Canad. J. Math., Vol. 10, pp. 230-241.
Metropolis, N., Rosenbluth, A., Rosenbluth, M., Teller, A., and Teller, E.,
1953. “Equation of State Calculations by Fast Computing Machines,” J. of
Chemical Physics, Vol. 21, pp. 1087-1092.
Meyer, R. R., 1979. “Two-Segment Separable Programming,” Management
Science, Vol. 25, pp. 385-395.
Miller, D., Pekny, J., and Thompson, G. L., 1990. “Solution of Large Dense
Transportation Problems Using a Parallel Primal Algorithm,” Operations
Research Letters, Vol. 9, pp. 319-324.
Minty, G. J., 1957. “A Comment on the Shortest Route Problem,” Opera-
tions Research, Vol. 5, p. 724.
Minty, G. J., 1960. “Monotone Networks,” Proc. Roy. Soc. London, A, Vol.
257, pp. 194-212.
Tseng, P., and Bertsekas, D. P., 1987. “Relaxation Methods for Linear
Programs,” Math. of Operations Research, Vol. 12, pp. 569-596.
Tseng, P., and Bertsekas, D. P., 1990. “Relaxation Methods for Monotropic
Programs,” Math. Programming, Vol. 46, 1990, pp. 127-151.
Tseng, P., and Bertsekas, D. P., 1993. “On the Convergence of the Expo-
nential Multiplier Method for Convex Programming,” Math. Programming,
Vol. 60, pp. 1-19.
Tseng, P., and Bertsekas, D. P., 1996. “An Epsilon-Relaxation Method
for Separable Convex Cost Generalized Network Flow Problems,” Lab. for
Information and Decision Systems Report P-2374, M.I.T., Cambridge, MA.
Tseng, P., Bertsekas, D. P., and Tsitsiklis, J. N., 1990. “Partially Asyn-
chronous Parallel Algorithms for Network Flow and Other Problems,”
SIAM J. on Control and Optimization, Vol. 28, pp. 678-710.
Tsitsiklis, J. N., 1989. “Markov Chains with Rare Transitions and Simu-
lated Annealing,” Math. of Operations Research, Vol. 14, pp. 70-90.
Tsitsiklis, J. N., 1992. “Special Cases of Traveling Salesman and Repairman
Problems with Time Windows,” Networks, Vol. 22, pp. 263-282.
Tsitsiklis, J. N., 1995. “Efficient Algorithms for Globally Optimal Trajec-
tories,” IEEE Trans. on Automatic Control, Vol. 40, pp. 1528-1538.
Tsitsiklis, J. N., and Bertsekas, D. P., 1986. “Distributed Asynchronous
Optimal Routing in Data Networks,” IEEE Trans. on Automatic Control,
Vol. 31, pp. 325-331.
Ventura, J. A., and Hearn, D. W., 1993. “Restricted Simplicial Decomposi-
tion for Convex Constrained Problems,” Math. Programming, Vol. 59, pp.
71-85.
Voß, S., 1992. “Steiner’s Problem in Graphs: Heuristic Methods,” Discrete
Applied Math., Vol. 40, pp. 45-72.
Von Randow, R., 1982. Integer Programming and Related Areas: A Clas-
sified Bibliography 1978-1981, Lecture Notes in Economics and Mathemat-
ical Systems, Vol. 197, Springer-Verlag, N. Y.
Von Randow, R., 1985. Integer Programming and Related Areas: A Clas-
sified Bibliography 1982-1984, Lecture Notes in Economics and Mathemat-
ical Systems, Vol. 243, Springer-Verlag, N. Y.
Warshall, S., 1962. “A Theorem on Boolean Matrices,” J. ACM, Vol. 9,
pp. 11-12.
Wein, J., and Zenios, S. A., 1991. “On the Massively Parallel Solution of
the Assignment Problem,” J. of Parallel and Distributed Computing, Vol.