Graph Algorithms and Network Flows
Contents

1 Introduction
  1.1 Assignment problem
  1.2 Basic graph definitions
5 Complexity analysis
  5.1 Measuring quality of an algorithm
    5.1.1 Examples
  5.2 Growth of functions
6 Graph representations
  6.1 Node-arc adjacency matrix
  6.2 Node-node adjacency matrix
  6.3 Node-arc adjacency list
  6.4 Comparison of the graph representations
    6.4.1 Storage efficiency comparison
    6.4.2 Advantages and disadvantages comparison
8 Shortest paths
  8.1 Introduction
  8.2 Properties of DAGs
    8.2.1 Topological sort and directed acyclic graphs
  8.3 Properties of shortest paths
  8.4 Alternative formulation for SP from s to t
  8.5 Shortest paths on a directed acyclic graph (DAG)
  8.6 Applications of the shortest/longest path problem on a DAG
  8.7 Dijkstra's algorithm
  8.8 Bellman-Ford algorithm for single source shortest paths
  8.9 Floyd-Warshall algorithm for all pairs shortest paths
  8.10 D.B. Johnson's algorithm for all pairs shortest paths
  8.11 Matrix multiplication algorithm for all pairs shortest paths
  8.12 Why finding shortest paths in the presence of negative cost cycles is difficult
14 Approximation algorithms
  14.1 Traveling salesperson problem (TSP)
These notes are based on "scribe" notes taken by students attending Professor Hochbaum's course
IEOR 266. The current version has been updated and edited by Professor Hochbaum in fall 2015.
The textbook used for the course, and mentioned in the notes, is Network Flows: Theory, Algorithms
and Applications by Ravindra K. Ahuja, Thomas L. Magnanti and James B. Orlin, published
by Prentice-Hall, 1993. The notes also make reference to the book Combinatorial Optimization:
Algorithms and Complexity by Christos H. Papadimitriou and Kenneth Steiglitz, published by
Prentice-Hall, 1982.
1 Introduction
We will begin the study of network flow problems with a review of the formulation of linear pro-
gramming (LP) problems. Let the number of decision variables, xj ’s, be N , and the number of
constraints be M . LP problems take the following generic form:
    min   ∑_{j=1}^{N} c_j x_j
    s.t.  ∑_{j=1}^{N} a_{i,j} x_j ≤ b_i    ∀ i ∈ {1, . . . , M}
          x_j ≥ 0                          ∀ j ∈ {1, . . . , N}        (1)
It may appear that ILP problems are simpler than LP problems, since the solution space in
ILP is countable while the solution space in LP is uncountably infinite; one obvious algorithm for
ILP is enumeration, i.e., systematically try out all feasible solutions and find one with the minimum
cost. As we will see in the following ILP example, enumeration is not an acceptable algorithm as,
even for moderate-size problems, its running time would be extremely large.
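For reference, the next paragraph refers to the standard ILP formulation of the assignment problem; a sketch consistent with the discussion below (with x_{ij} = 1 if person i is assigned task j, and c_{ij} the cost of that assignment) is:

    min   ∑_{i=1}^{n} ∑_{j=1}^{n} c_{ij} x_{ij}
    s.t.  ∑_{i=1}^{n} x_{ij} = 1    ∀ j = 1, . . . , n
          ∑_{j=1}^{n} x_{ij} = 1    ∀ i = 1, . . . , n
          x_{ij} ∈ {0, 1}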
The first set of constraints ensures that exactly one person is assigned to each task; the second
set of constraints ensures that each person is assigned exactly one task. Notice that the upper bound
constraints on the xij are unnecessary. Also notice that the set of constraints is not independent
(the sum of the first set of constraints equals the sum of the second set of constraints), meaning
that one of the 2n constraints can be eliminated.
While at first it may seem that the integrality condition limits the number of possible solutions
and could thereby make the integer problem easier than the continuous problem, the opposite is
actually true. Linear programming optimization problems have the property that there exists an
optimal solution at a so-called extreme point (a basic solution); the optimal solution in an integer
program, however, is not guaranteed to satisfy any such property and the number of possible integer
valued solutions to consider becomes prohibitively large, in general.
Consider a simple algorithm for solving the Assignment Problem: It enumerates all possible
assignments and takes the cheapest one. Notice that there are n! possible assignments. If we
consider an instance of the problem in which there are 70 people and 70 tasks, that means that
there are
70! = 11978571669969891796072783721689098736458938142546425857555362864628009582789845319680000000000000000 ≈ 2^{332}
different assignments to consider. Our simple algorithm, while correct, is not at all practical! The
existence of an algorithm does not mean that there exists a useful algorithm.
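As a quick sanity check of these numbers (not part of the original notes), one can compute 70! and its base-2 logarithm directly in Python:

    import math

    n = math.factorial(70)
    print(len(str(n)))    # 101: 70! is a 101-digit number
    print(math.log2(n))   # ~332.4, so 70! is approximately 2^332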
While linear programming belongs to the class of problems P for which “good” algorithms exist
(an algorithm is said to be good if its running time is bounded by a polynomial in the size of the
input), integer programming belongs to the class of NP-hard problems for which it is considered
highly unlikely that a “good” algorithm exists. Fortunately, as we will see later in this course,
the Assignment Problem belongs to a special class of integer programming problems known as the
Minimum Cost Network Flow Problem, for which efficient polynomial algorithms do exist.
The reason for the tractability of the assignment problem is found in the form of the constraint
matrix. The constraint matrix is totally unimodular (TUM). Observe the form of the constraint
matrix A:
    A = ( a 0-1 matrix in which the column of each variable xij contains exactly two 1's:
          one in the row of person i and one in the row of task j )        (5)
We first notice that each column contains exactly two 1’s. Notice also that the rows of matrix A
can be partitioned into two sets, say A1 and A2 , such that the two 1's of each column are in different
sets. It turns out these two conditions are sufficient for the matrix A to be TUM. Note that simply
having 0's and 1's is not sufficient for the matrix to be TUM. Consider the constraint matrix for
the vertex cover problem:
    A = ( a 0-1 matrix with one row per edge {u, v}, with 1's in the columns of u and v
          and 0's elsewhere )        (6)
Although this matrix also contains 0’s and 1’s only, it is not TUM. In fact, vertex cover is an
NP-complete problem. We will revisit the vertex cover problem in the future. We now turn our
attention to the formulation of the generic minimum cost network flow problem.
• A directed graph or digraph G is an ordered pair G := (V, A), where V is a set whose
elements are called vertices or nodes, and A is a set of ordered pairs of vertices of the form
(i, j), called arcs. In an arc (i, j), node i is called the tail of the arc and node j the head of
the arc. We sometimes abuse the notation and refer to a digraph also as a graph.
• A path (directed path) is an ordered list of vertices (v1 , . . . , vk ), so that (vi , vi+1 ) ∈ E
((vi , vi+1 ) ∈ A) for all i = 1, . . . , k − 1. The length of a path is |(v1 , . . . , vk )| = k.
• A simple path (simple cycle) is a path (cycle) where all vertices v1 , . . . , vk are distinct.
• An (undirected) graph is said to be connected if, for every pair of nodes, there is an (undi-
rected) path starting at one node and ending at the other node.
• A directed graph is said to be strongly connected if, for every (ordered) pair of nodes (i, j),
there is a directed path in the graph starting in i and ending in j.
• In a directed graph the indegree of a node is the number of incoming arcs that have that node
as a head. The outdegree of a node is the number of outgoing arcs from a node, that have
that node as a tail.
Be sure you can prove

    ∑_{v∈V} indeg(v) = |A|,        ∑_{v∈V} outdeg(v) = |A|.
• A tree can be characterized as a connected graph with no cycles. The relevant property for
this problem is that a tree with n nodes has n − 1 edges.
Definition: An undirected graph G = (V, T ) is a tree if the following three properties are
satisfied:
Property 1: |T | = |V | − 1.
Property 2: G is connected.
Property 3: G is acyclic.
(Actually, it is possible to show that any two of these properties imply the third.)
• A graph is bipartite if the vertices in the graph can be partitioned into two sets in such a way
that no edge joins two vertices in the same set.
• A matching in a graph is a set of edges such that no two edges in the set are incident to
the same vertex.
The bipartite (nonbipartite) matching problem, is stated as follows: Given a bipartite (nonbi-
partite) graph G = (V, E), find a maximum cardinality matching.
max cx
subject to Ax ≤ b
Recall that a basic solution of this problem represents an extreme point on the polytope defined
by the constraints (i.e. cannot be expressed as a convex combination of any two other feasible
solutions). Also, if there is an optimal solution for LP then there is an optimal basic solution. If
in addition A is totally unimodular we have the following result.
Lemma 2.1. If the constraint matrix A in LP is totally unimodular and the vector b is integer, then
there exists an integer optimal solution. In fact every basic or extreme point solution is integral.
Proof. Consider a basic solution xB = B^{-1} b, where B is a basis of A. Recall that B^{-1} = C/det(B),
where C is the adjugate of B, whose entries are (up to sign) determinants of submatrices of B.
Since each such determinant is a sum of terms, each of which is a product of entries of the submatrix,
all entries in C are integer. In addition, A is totally unimodular and B is a basis, so |det(B)| = 1.
Therefore, all of the elements of B^{-1} are integers, and we conclude that any basic solution
xB = B^{-1} b is integral.
Although we will require all coefficients to be integers, the lemma also holds if only the right-hand-side
vector b is integer.
This result has ramifications for combinatorial optimization problems. Integer programming
problems with totally unimodular constraint matrices can be solved simply by solving the linear
programming relaxation. This can be done in polynomial time via the Ellipsoid method, or more
practically, albeit not in polynomial time, by the simplex method. The optimal basic solution is
guaranteed to be integral.
The set of totally unimodular integer problems includes network flow problems. Each column (or,
in the transposed form, each row) of a network flow constraint matrix contains exactly one 1 and
exactly one −1, with all the remaining entries 0. A matrix A is said to have the network property
if every column of A has at most one entry of value 1, at most one entry of value −1, and all other
entries 0.
Lemma 2.2. If the constraint matrix A has at most one 1 and at most one −1 in each column
(row) then A is totally unimodular.
Proof. The proof is by induction on k, the size of the submatrix of A. For k = 1 the determinant
value is the value of the entry, which is in {−1, 0, 1} as required.
Consider a k × k submatrix of A, Bk . If Bk has a column of all zeros, then det(Bk ) = 0. If every
column of Bk has exactly one 1 and one −1, then det(Bk ) = 0, since the sum of all rows is zero.
Otherwise, there exists a column of Bk with exactly one nonzero entry, a single 1 or a single −1.
Expanding the determinant along that column, it is 1 × det(Bk−1 ) in the first case and
−1 × det(Bk−1 ) in the second case, which is −1, 0 or 1 by the inductive assumption.
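To make total unimodularity concrete, here is a brute-force check for small matrices; the function name and the example are illustrative only, and since the check enumerates all square submatrices it is exponential, so it is a sanity check of Lemma 2.2 rather than a practical test (it assumes numpy is available):

    from itertools import combinations
    import numpy as np

    def is_totally_unimodular(A, tol=1e-9):
        # Check that every square submatrix of A has determinant in {-1, 0, 1}.
        A = np.asarray(A, dtype=float)
        m, n = A.shape
        for k in range(1, min(m, n) + 1):
            for rows in combinations(range(m), k):
                for cols in combinations(range(n), k):
                    d = np.linalg.det(A[np.ix_(rows, cols)])
                    if min(abs(d), abs(d - 1), abs(d + 1)) > tol:
                        return False
        return True

    # Node-arc incidence matrix of the directed path 1 -> 2 -> 3:
    # one 1 and one -1 per column, hence TUM by Lemma 2.2.
    A = [[ 1,  0],
         [-1,  1],
         [ 0, -1]]
    print(is_totally_unimodular(A))   # True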
It turns out that all extreme points of the linear programming relaxation of the Minimum Cost
Network Flow Problem are integer valued (assuming that the supplies and upper/lower bounds are
integer valued). This matrix, that has one 1 and one −1 per column (or row) is totally unimodular
and therefore MCNF problems can be solved using LP techniques. However, we will investigate
more efficient techniques to solve MCNF problems, which have found applications in many areas. In
fact, the assignment problem is a special case of the MCNF problem, where each person is represented
by a supply node of supply 1, each job is represented by a demand node of demand 1, and there is
an arc from each person j to each job i, labeled by ci,j , denoting the cost of person j performing
job i.
Example of general MCNF problem - the chairs problem
We now look at an example of a general network flow problem, as described in handout #2. We
begin the formulation of the problem by defining the units of the various quantities in the problem:
we let a cost unit be 1 cent, and a flow unit be 1 chair. To construct the associated graph for
the problem, we first consider the wood sources and the plants, denoted by nodes WS1, WS2 and
P1, . . . , P4 respectively, in Figure 1a. Because each wood source can supply wood to each plant,
there is an arc from each source to each plant. To reflect the transportation cost of wood from
wood source to plant, the arc has a cost, along with a capacity lower bound (zero) and upper
bound (infinity). Arc (WS2, P4) in Figure 1a is labeled in this manner.
It is not clear exactly how much wood each wood source should supply, although we know that
the contract specifies that each source must supply at least 800 chairs worth of wood (or 8 tons). We
represent this specification by node splitting: we create an extra node S, with an arc connecting
it to each wood source with a capacity lower bound of 800. The cost of the arc reflects the cost of
wood at each wood source. Arc (S, WS2) is labeled in this manner. Alternatively, the cost of wood
can be lumped together with the cost of transportation to each plant. This is shown as the grey
color labels of the arcs in the figure.
Each plant has a production maximum, a production minimum and a production cost per unit.
Again, using the node splitting technique, we create an additional node for each plant node, and
label the arc that connects the original plant node and the new plant node accordingly. Arc
(P4, P4′ ) is labeled in this manner in the figure.
Each plant can distribute chairs to each city, and so there is an arc connecting each plant to
each city, as shown in Figure 1b. The capacity upper bound is the production upper bound of
the plant; the cost per unit is the transportation cost from the plant to the city. Similar to the
wood sources, we don’t know exactly how many chairs each city will demand, although we know
there is a demand upper and lower bound. We again represent this specification using the node
splitting technique; we create an extra node D and specify the demand upper bound, lower bound
and selling price of each unit of each city by labeling the arc that connects the city and node D.
[Figures 1b and 2: the chairs problem network. Node S feeds the wood sources WS1, WS2; the
wood sources connect to the plants P1, . . . , P4; each plant Pi is split into Pi and Pi′ ; the plants
connect to the cities (NY, Au, . . . , Ch), which connect to node D. In Figure 2a the arc (D, S)
with label (0, ∞, 0) closes the circulation; in Figure 2b a drainage arc (S, D) with label (0, ∞, 0)
is used instead.]
Finally, we need to relate the supply of node S to the demand of node D. There are two ways
to accomplish this. We observe that it is a closed system; therefore the number of units supplied
at node S is exactly the number of units demanded at node D. So we construct an arc from node
D to node S, with label (0, ∞, 0), as shown in Figure 2a. The total number of chairs produced will
be the number of flow units in this arc. Notice that all the nodes are now transshipment nodes —
such formulations of problems are called circulation problems.
Alternatively, we can supply node S with a large supply of M units, much larger than what
the system requires. We then create a drainage arc from S to D with label (0, ∞, 0), as shown in
Figure 2b, so the excess amount will flow directly from S to D. If the chair production operation
is a money losing business, then all M units will flow directly from S to D.
A bipartite (transportation) network is given by G = (V1 ∪ V2 , A), where u ∈ V1 and v ∈ V2 for
every arc (u, v) ∈ A.        (8)
A property of bipartite graphs is that they are 2-colorable, that is, it is possible to assign each
vertex a “color” out of a 2-element set so that no two vertices of the same color are connected.
An example transportation problem is shown in Figure 3. Let si be the supplies of the nodes in V1 ,
and dj be the demands of the nodes in V2 .

[Figure 3: a transportation problem with plants P1, P2, P3 (supplies 180, 280, 150) on one side
and warehouses W1, . . . , W5 (demands 120, 100, 160, 80, 150) on the other; each arc is labeled
(lower bound, upper bound, cost), e.g. (0, ∞, 4).]

The transportation problem can be formulated as an IP problem as follows:
    min   ∑_{i∈V1, j∈V2} c_{i,j} x_{i,j}
    s.t.  ∑_{j∈V2} x_{i,j} = s_i        ∀ i ∈ V1
          −∑_{i∈V1} x_{i,j} = −d_j      ∀ j ∈ V2
          x_{i,j} ≥ 0, x_{i,j} integer        (9)
Note that the assignment problem is a special case of the transportation problem. This follows from
the observation that we can formulate the assignment problem as a transportation problem with
si = dj = 1 and |V1 | = |V2 |.
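To illustrate the integrality result of Lemma 2.1 on this class of problems, the sketch below solves the LP relaxation of a small, made-up transportation instance with scipy (assuming scipy is available) and observes that the optimal basic solution is integral even though no integrality constraints are imposed:

    import numpy as np
    from scipy.optimize import linprog

    supplies = [3, 2]        # s_i for the nodes of V1
    demands = [2, 2, 1]      # d_j for the nodes of V2
    cost = np.array([[4, 6, 9],
                     [5, 3, 8]], dtype=float)

    n1, n2 = len(supplies), len(demands)
    c = cost.ravel()         # variable x_{ij} sits at index i*n2 + j

    A_eq, b_eq = [], []
    for i in range(n1):      # supply constraints: sum_j x_{ij} = s_i
        row = np.zeros(n1 * n2); row[i*n2:(i+1)*n2] = 1
        A_eq.append(row); b_eq.append(supplies[i])
    for j in range(n2):      # demand constraints: sum_i x_{ij} = d_j
        row = np.zeros(n1 * n2); row[j::n2] = 1
        A_eq.append(row); b_eq.append(demands[j])

    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    print(res.x.reshape(n1, n2))   # integral, as the constraint matrix is TUM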
    Max   V
    (MaxFlow)
    s.t.  ∑_{(s,i)∈A} x_{s,i} = V                          (flow out of source)
          ∑_{(j,t)∈A} x_{j,t} = V                          (flow into sink)
          ∑_{(i,k)∈A} x_{i,k} − ∑_{(k,j)∈A} x_{k,j} = 0    ∀ k ≠ s, t
          l_{i,j} ≤ x_{i,j} ≤ u_{i,j}                      ∀ (i, j) ∈ A.
The Maximum Flow problem and its dual, known as the Minimum Cut problem have a number
of applications and will be studied at length later in the course.
The shortest path problem is to find a path from a source node s to a sink node t that minimizes
the sum of the arc weights along the path. To formulate it as a network flow problem, let xi,j be 1
if the arc (i, j) is in the path and 0 otherwise. Place a supply of one unit at s and a demand of one
unit at t. The formulation for the problem is:
    Min   ∑_{(i,j)∈A} d_{i,j} x_{i,j}
    (SP)
    s.t.  ∑_{(i,k)∈A} x_{i,k} − ∑_{(j,i)∈A} x_{j,i} = 0    ∀ i ≠ s, t
          ∑_{(s,i)∈A} x_{s,i} = 1
          ∑_{(j,t)∈A} x_{j,t} = 1
          0 ≤ x_{i,j} ≤ 1.
This problem is often solved using a variety of search and labeling algorithms depending on the
structure of the graph.
Theorem 3.1. Given a directed graph G = (V, A), there exists an optimal solution x∗ij to a single-
source shortest paths problem on G with source node s ∈ V such that, letting A+ = {(i, j) ∈ A :
x∗ij > 0} denote the set of arcs carrying flow, indegA+ (i) = 1 for all i ∈ V \ {s}.
Proof. Let x∗ij be an optimal solution to a single-source shortest path problem on G with source
node s. Furthermore, among all such optimal solutions, let x∗ij be one that minimizes the sum of
indegrees with respect to the arcs in the solution set A+ ,

    ∑_{i∈V\{s}} indegA+ (i).
We will show that this optimal solution that minimizes the sum of indegrees cannot contain a node
of indegree strictly greater than 1 by contradiction. Let the level of each node v ∈ V be equal to
the length of the shortest path in A+ from s to v.
Let i0 be a node with indegree greater than 1 such that no nodes in any path from s to i0 have
indegree greater than 1. That is, i0 is a lowest level node of indegree > 1. Intuitively, i0 is the
“first” node we encounter with indegree greater than 1. Therefore there are at least two paths from
s to i0 . Since each node interior to these paths is of indegree 1, and all nodes other than s have
a demand of 1 unit, the incoming flow to each of these nodes is strictly greater than the flow out
of it. That is, the flows on the paths from s to i0 are monotone decreasing.
Consider two paths from s to i0 and work back until we get to the first node p common to both
paths, possibly s. That is, p is the common ancestor of i0 at the highest level. Both paths from p to
i0 must have the same cost in the objective function; one having greater cost than the other would
contradict optimality.
For either of the paths define the bottleneck arc as the arc with the smallest amount of flow,
and the bottleneck flow as the amount of flow on the bottleneck. Since the flows are monotone
decreasing the bottleneck arcs are both the incoming arcs to i0 .
[Figure: two paths from p to i0 , one through node n and one through node m. (a) The bottleneck
flows are x∗ni0 and x∗mi0 . (b) The value of the objective function is unchanged by moving x∗mi0 to
the other path, but indegA+ (i0 ) is reduced by 1.]
We can choose either path and move its bottleneck flow to the other path; we do this by first
subtracting the bottleneck flow from the flow on each arc in the chosen path, and then adding the
same bottleneck flow to the flow on each arc in the other path. Since the costs of both paths are
equal, this does not change the value of the objective function and the solution remains optimal.
However, the flow on the bottleneck link of the path we chose is now 0 and therefore that arc
no longer appears in A+ . This process is illustrated in Figure 1. Consequently, indegA+ (i0 ) has
now decreased by one while the indegree of all other nodes has remained unchanged, contradicting
our initial assumption that x∗ij minimizes the sum of indegrees with respect to A+ . Thus any
optimal solution that minimizes the sum of indegrees with respect to A+ cannot contain a node
with indegree strictly greater than 1, and indegA+ (i) = 1 for all i ∈ V \ {s}.
The same theorem may also be proved following linear programming arguments, by showing
that a basic solution generated by the linear program will always be acyclic in the undirected
sense.
matching is also referred to as the Edge Packing Problem. (We know that |M | ≤ ⌊|V |/2⌋.)
Maximum Weight Matching: We assign weights to the edges and look for maximum total
weight instead of maximum cardinality. (Depending on the weights, a maximum weight matching
may not be of maximum cardinality.)
Smallest Cost Perfect Matching: Given a graph, find a perfect matching such that the sum of
weights on the edges in the matching is minimized. On bipartite graphs, finding the smallest cost
perfect matching is the assignment problem.
Bipartite Matching, or Bipartite Edge-packing involves pairwise association on a bipartite graph.
In a bipartite graph, G = (V, E), the set of vertices V can be partitioned into V1 and V2 such that
all edges in E are between V1 and V2 . That is, no two vertices in V1 have an edge between them, and
likewise for V2 . A matching involves choosing a collection of arcs so that no vertex is adjacent to
more than one edge. A vertex is said to be “matched” if it is adjacent to an edge. The formulation
for the problem is
    Max   ∑_{(i,j)∈E} x_{i,j}
    (BM)
    s.t.  ∑_{j:(i,j)∈E} x_{i,j} ≤ 1    ∀ i ∈ V
          x_{i,j} ∈ {0, 1}             ∀ (i, j) ∈ E.
Clearly the maximum possible objective value is min{|V1 |, |V2 |}. If |V1 | = |V2 | it may be possible
to have a perfect matching, that is, it may be possible to match every node.
The Bipartite Matching problem can also be converted into a maximum flow problem. Add
a source node, s, and a sink node t. Make all edges directed arcs from, say, V1 to V2 and add a
directed arc of capacity 1 from s to each node in V1 and from each node in V2 to t. Place a supply
of min{|V1 |, |V2 |} at s and an equivalent demand at t. Now maximize the flow from s to t. The
existence of a flow between two vertices corresponds to a matching. Due to the capacity constraints,
no vertex in V1 will have flow to more than one vertex in V2 , and no vertex in V2 will have a flow
from more than one vertex in V1 .
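The following minimal Python sketch implements this correspondence directly as an augmenting-path search (the unit-capacity flow augmentations in disguise); the function and the small example are illustrative, not from the notes:

    def max_bipartite_matching(adj, n_left, n_right):
        # adj[u] lists the V2-neighbors of node u in V1.
        match_right = [-1] * n_right    # match_right[v] = V1-node matched to v

        def try_augment(u, seen):
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    # v is free, or its current partner can be re-matched elsewhere
                    if match_right[v] == -1 or try_augment(match_right[v], seen):
                        match_right[v] = u
                        return True
            return False

        matching = 0
        for u in range(n_left):
            if try_augment(u, [False] * n_right):
                matching += 1
        return matching

    # V1 = {0, 1, 2}, V2 = {0, 1}: the maximum matching has size 2.
    print(max_bipartite_matching({0: [0], 1: [0, 1], 2: [1]}, 3, 2))   # 2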
Alternatively, one more arc from t back to s with cost of -1 can be added and the objective
can be changed to maximizing circulation. Figure 5 depicts the circulation formulation of bipartite
matching problem.
The problem may also be formulated as an assignment problem. Figure 6 depicts this formula-
tion. An assignment problem requires that |V1 | = |V2 |. In order to meet this requirement we add
dummy nodes to the deficient side of the bipartition, as well as dummy arcs, so that every node,
dummy or original, may have its demand of 1 met. We assign costs of 0 to all original arcs and a cost of 1
to all dummy arcs. We then seek the assignment of least cost which will minimize the number of
dummy arcs and hence maximize the number of original arcs.
The General Matching, or Nonbipartite Matching problem involves maximizing the number of
matched vertices in a general graph, or, equivalently, maximizing the size of a collection of edges
such that no two edges share an endpoint. This problem cannot be converted to a flow problem,
but is still solvable in polynomial time.
Another related problem is the weighted matching problem in which each edge (i, j) ∈ E has
weight wi,j assigned to it and the objective is to find a matching of maximum total weight. The
mathematical programming formulation of this maximum weighted problem is as follows:
    Max   ∑_{(i,j)∈E} w_{i,j} x_{i,j}
    (WM)
    s.t.  ∑_{j:(i,j)∈E} x_{i,j} ≤ 1    ∀ i ∈ V
          x_{i,j} ∈ {0, 1}             ∀ (i, j) ∈ E.
[Figure 7: Classification of MCNF problems. The diagram shows the containment relations among
the problem classes discussed here: MCNF, Transportation, Bipartite Matching, Weighted Bipartite
Matching, Shortest Paths, and Assignment.]
These solutions are clearly different, meaning that solving the MCNF problem is not equivalent to
solving the shortest path problem for a graph with negative cost cycles.
We will study algorithms for detecting negative cost cycles in a graph later on in the course.
[Figure: a small example graph from s to t whose arc costs (including one arc of cost −8) create a
negative cost cycle.]
Expand the original trail by following it to the detour vertex, then follow the detour, then
continue with the original trail. If more unused edges are present, repeat this detour construction.
For this algorithm we will use the adjacency list representation of the graph; the adjacency lists
will be implemented using pointers. That is, for each node i in G, we will have a list of the neighbors
of i. In these lists each neighbor points to the next neighbor in the list, and the last neighbor points
to a special character to indicate that it is the last neighbor of i.
We will maintain a set of positive degree nodes V + . We also have an ordered list Tour, and a
tour pointer pT that tells us where to insert the edges that we will be adding to our tour. (In the
following pseudocode we will abuse notation and let n(v) refer to the adjacency list of v and also
to the first neighbor in the adjacency list of v.)
Pseudocode:
v ← 0 ("initial" node in the tour)
Tour ← (0, 0)
pT initially indicates that we should insert nodes before the second zero.
While V + ≠ ∅ do
    Remove edge (v, n(v)) from lists n(v) and n(n(v))
    dv ← dv − 1
    If dv = 0 then V + ← V + \ v
    Add n(v) to Tour, and move pT to its right
    If V + = ∅ done
    If dv > 0 then
        v ← n(v)
    else
        find u ∈ Tour ∩ V + (with du > 0)
        set pT to point to the right of u (that is, we must insert nodes right after u)
        v ← u
    end if
end while
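A runnable Python version of this detour construction (Hierholzer's algorithm) might look as follows; it uses adjacency sets and a stack instead of the pointer lists above, purely for brevity:

    def euler_tour(adj):
        # adj: dict node -> set of neighbors of a connected, even-degree graph.
        adj = {v: set(nbrs) for v, nbrs in adj.items()}   # local copy
        start = next(iter(adj))
        stack, tour = [start], []
        while stack:
            v = stack[-1]
            if adj[v]:                  # follow an unused edge out of v
                u = adj[v].pop()
                adj[u].remove(v)
                stack.append(u)
            else:                       # dead end: v closes a detour
                tour.append(stack.pop())
        return tour[::-1]

    # A 4-cycle 0-1-2-3 with a triangle 0-4-5 glued at node 0.
    g = {0: {1, 3, 4, 5}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}, 4: {0, 5}, 5: {0, 4}}
    print(euler_tour(g))   # e.g. [0, 1, 2, 3, 0, 4, 5, 0]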
Decision version: Given a graph G = (V, E) with weights on the edges, and a number M , is
there a tour traversing each node exactly once of total weight ≤ M ?
Proof. Let OPT be an optimal (minimum) vertex cover. Suppose |M | > |OPT|. Then there exists
an edge e ∈ M which doesn't have an endpoint in the vertex cover: every vertex in the vertex cover
can cover at most one edge of the matching (if it covered two edges of the matching, then M would
not be a valid matching, because a matching does not allow two edges adjacent to the same node).
This is a contradiction. Therefore, |M | ≤ |OPT|. Because |S| = 2|M |, the claim follows.
Theorem 4.5 (4-color Theorem). It is always possible to color a planar graph using 4 colors.
Proof. A computer proof exists that uses a polynomial-time algorithm (Appel and Haken (1977)).
An “elegant” proof is still unavailable.
Observe that each set of nodes of the same color is an independent set. (It follows that the
cardinality of the maximum clique of the graph provides a lower bound to the chromatic number.) A
possible graph coloring algorithm consists of finding a maximum independent set and assigning
the same color to all nodes it contains. A second independent set can then be found on the
subgraph corresponding to the remaining nodes, and so on. The algorithm, however, is both
inefficient (finding a maximum independent set is NP-hard) and not optimal. It is not optimal for
the same reason (and same type of example) that the b-matching problem is not solvable optimally
by finding a sequence of maximum cardinality matchings.
Note that a 2-colorable graph is bipartite. It is interesting to note that the opposite is also true,
i.e., every bipartite graph is 2-colorable. (Indeed, the concept of k-colorability and the property of
being k-partite are equivalent.) We can show this by building a 2-coloring of a bipartite graph through
the following breadth-first-search based algorithm:
the following breadth-first-search based algorithm:
Initialization: Set i = 1, L = {i}, c(i) = 1.
General step: Repeat until L is empty:
• Let i be the first node in L. For each previously uncolored j ∈ N (i) (a neighbor of i), assign
color c(j) = c(i) + 1 (mod 2) and add j at the end of the list L. If j is previously colored and
c(j) = c(i) + 1 (mod 2), proceed; else stop and declare that the graph is not bipartite.
• Remove i from L.
Note that if the algorithm finds any node in its neighborhood already labeled with the opposite
color, then the graph is not bipartite, otherwise it is. While the above algorithm solves the 2-color
problem in O(|E|), the d-color problem for d ≥ 3 is NP-hard even for planar graphs (see Garey and
Johnson, page 87).
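A Python sketch of this BFS-based 2-coloring test (illustrative; it also restarts from uncolored nodes, so disconnected graphs are handled):

    from collections import deque

    def two_color(adj, n):
        # adj[i]: neighbors of node i; returns the colors, or None if not bipartite.
        color = [None] * n
        for s in range(n):
            if color[s] is not None:
                continue
            color[s] = 0
            queue = deque([s])
            while queue:
                i = queue.popleft()
                for j in adj[i]:
                    if color[j] is None:
                        color[j] = (color[i] + 1) % 2    # alternate the color
                        queue.append(j)
                    elif color[j] == color[i]:
                        return None                      # odd cycle found
        return color

    print(two_color([[1, 3], [0, 2], [1, 3], [0, 2]], 4))   # [0, 1, 0, 1]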
5 Complexity analysis
5.1 Measuring quality of an algorithm
Algorithm: One approach is to enumerate the solutions, and select the best one.
Recall that for the assignment problem with 70 people and 70 tasks there are 70! ≈ 2^{332.4}
solutions. The existence of an algorithm does not imply the existence of a good algorithm!
To measure the complexity of a particular algorithm, we count the number of operations that
are performed as a function of the ‘input size’. The idea is to consider each elementary operation
(usually defined as a set of simple arithmetic operations such as {+, −, ×, /, <}) as having unit
cost, and measure the number of operations (in terms of the size of the input) required to solve a
problem. The goal is to measure the rate, ignoring constants, at which the running time grows as
the size of the input grows; it is an asymptotic analysis.
Traditionally, complexity analysis has been concerned with counting the number of operations
that must be performed in the worst case.
Definition 5.1 (Concrete Complexity of a problem). The complexity of a problem is the complexity
of the algorithm that has the lowest complexity among all algorithms that solve the problem.
5.1.1 Examples
Set Membership - Unsorted list: We can determine if a particular item is in a list of n items
by looking at each member of the list one by one. Thus the number of comparisons needed
to find a member in an unsorted list of length n is n.
Problem: given a real number x, we want to know if x ∈ S = {s1 , . . . , sn }.
Algorithm:
1. Set i ← 1.
2. Compare x to si ; stop if x = si .
3. If i < n, set i ← i + 1 and go to step 2; else stop: x is not in S.
Complexity = n comparisons in the worst case. This is also the concrete complexity of
this problem. Why?
Set Membership - Sorted list: We can determine if a particular item is in a list of n elements
via binary search. The number of comparisons needed to find a member in a sorted list of
length n is proportional to log2 n.
Problem: given a real number x, we want to know if x ∈ S.
Algorithm:
1. Select smed , where med = ⌊(first + last)/2⌋, and compare smed to x.
2. If smed = x, stop.
3. If smed < x then S = (smed+1 , . . . , slast ); else S = (sfirst , . . . , smed−1 ).
4. If first < last go to step 1; else stop: x is not in S.
Complexity: after the k-th iteration, n/2^{k−1} elements remain. We are done searching when
n/2^{k−1} ≤ 2, which implies log2 n ≤ k. Thus the total number of comparisons is at most log2 n.
Aside: This binary search algorithm can be used more generally to find the zero of a monotone
nondecreasing function.
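A Python sketch of the binary-search membership test above, assuming the list s is sorted:

    def contains(s, x):
        first, last = 0, len(s) - 1
        while first <= last:
            med = (first + last) // 2
            if s[med] == x:
                return True
            if s[med] < x:
                first = med + 1    # discard the left half
            else:
                last = med - 1     # discard the right half
        return False

    print(contains([1, 3, 5, 8, 13], 8))   # True, after about log2(5) probes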
Matrix Multiplication: The straightforward method for multiplying two n × n matrices takes
n^3 multiplications and n^2 (n − 1) additions. Algorithms with better complexity (though not
necessarily practical, see comments later in these notes) are known. Coppersmith and Winograd
(1990) came up with an algorithm with complexity Cn^{2.375477} , where C is large. Indeed,
the constant term is so large that in their paper Coppersmith and Winograd admit that their
algorithm is impractical.
Forest Harvesting: In this problem we have a forest divided into a number of cells, arranged in
an m by n grid. For each cell we have the following information: Hi - the benefit for the timber
company to harvest, Ui - the benefit for the timber company not to harvest, and Bij - the border
effect, which is the benefit received for harvesting exactly one of cells i or j. A naive way to solve
the problem is to look at every possible combination of harvesting and not harvesting and pick
the best one. This algorithm requires O(2^{mn}) operations.
Basically, we iterate through each item in the list and compare it with its neighbor. If the
number on the left is greater than the number on the right, we swap the two. Do this for all
of the numbers in the array until we reach the end. Then we repeat the process. At the end
of the first pass, the last number in the newly ordered list is in the correct location. At the
end of the second pass, the last and the penultimate numbers are in the correct positions.
And so forth. So we only need to repeat this process a maximum of n times. The complexity of
this algorithm is ∑_{k=2}^{n} (k − 1) = n(n − 1)/2 = O(n^2).
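A direct Python transcription of the procedure just described (illustrative):

    def bubble_sort(a):
        n = len(a)
        for k in range(n - 1):              # pass k fixes the k-th largest element
            for i in range(n - 1 - k):
                if a[i] > a[i + 1]:         # left neighbor larger: swap
                    a[i], a[i + 1] = a[i + 1], a[i]
        return a

    print(bubble_sort([5, 2, 9, 1, 5, 6]))   # [1, 2, 5, 5, 6, 9]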
    lim_{n→∞} f(n)/g(n) = c.

A few caveats about complexity analysis as presented. Note that it:
1. Ignores the size of the numbers. The model presented is a poor one when dealing with
very large numbers, as each operation is given unit cost, regardless of the size of the numbers
involved. But multiplying two huge numbers, for instance, may require more effort than
multiplying two small numbers.
2. Is worst case. So complexity analysis does not say much about the average case. Tradition-
ally, complexity analysis has been a pessimistic measure, concerned with worst-case behavior.
The simplex method for linear programming is known to be exponential (in the worst case),
while the ellipsoid algorithm is polynomial; but, for the ellipsoid method, the average case
behavior and the worst case behavior are essentially the same, whereas the average case be-
havior of simplex is much better than its worst case complexity, and in practice it is preferred
to the ellipsoid method.
Similarly, Quicksort, which has O(n2 ) worst case complexity, is often chosen over other sorting
algorithms with O(n log n) worst case complexity. This is because QuickSort has O(n log n)
average case running time and, because the constants (that we ignore in the O notation)
are smaller for QuickSort than for many other sorting algorithms, it is often preferred to
algorithms with “better” worst-case complexity and “equivalent” average case complexity.
3. Ignores constants. We are concerned with the asymptotic behavior of an algorithm. But,
because we ignore constants (as mentioned in QuickSort comments above), it may be that an
algorithm with better complexity only begins to perform better for instances of inordinately
large size.
Indeed, this is the case for the O(n^{2.375477}) algorithm for matrix multiplication, which is
". . . wildly impractical for any conceivable applications."[1]
4. O(n100 ) is polynomial. An algorithm that is polynomial is considered to be “good”. So
an algorithm with O(n100 ) complexity is considered good even though, for reasons already
alluded to, it may be completely impractical.
Still, complexity analysis is in general a very useful tool in both determining the intrinsic
“hardness” of a problem and measuring the quality of a particular algorithm.
    Ax ≤ b
    x ≥ 0
    y^T A ≥ c^T
    y ≥ 0
    b^T y ≤ c^T x
[1] Coppersmith, D. and Winograd, S., "Matrix Multiplication via Arithmetic Progressions." Journal
of Symbolic Computation, 1990, Vol. 9, No. 3, pp. 251–280.
Note that by the weak duality theorem b^T y ≥ c^T x, so together with the inequality above this gives
the desired equality. Every feasible solution to this problem is an optimal solution. The ellipsoid
algorithm works on this feasibility problem as follows:
1. If there is a feasible solution to the problem, then there is a basic feasible solution. Any
basic solution is contained in a sphere (ellipsoid) of radius n2^{log L} , where L is the largest
subdeterminant of the augmented constraint matrix.
2. Check if the center of the ellipsoid is feasible. If yes, we are done. Otherwise, find a constraint
that has been violated.
3. This violated constraint specifies a hyperplane through the center and a half ellipsoid that
contains all the feasible solutions.
4. Reduce the search space to the smallest ellipsoid that contains the feasible half ellipsoid.
5. Repeat the algorithm at the center of the new ellipsoid.
Obviously this algorithm is not complete. How do we know when to stop and decide that the
feasible set is empty? It can be shown that the volume of the ellipsoids scales down in each iteration
by a factor of about e^{−1/2(n+1)} , where n is the number of linearly independent constraints. Also
note that a basic feasible solution only lies on a grid of mesh length (L(L − 1))^{−1} : when we
write a component of a basic feasible solution as the quotient of two integers, the denominator
cannot exceed L (by Cramer's rule), so the component-wise difference of two distinct basic feasible
solutions can be no less than (L(L − 1))^{−1} . Given this, it can be shown that the convex hull of the
basic feasible solutions has volume of at least 2^{−(n+2)L} , while the volume of the original search
space is 2 × n^2 L^2 . Therefore the algorithm terminates in O(n^2 log L) iterations. The ellipsoid
method is the first polynomial time algorithm for linear programming problems. However, it is not
very efficient in practice.
6 Graph representations
There are different ways to store a graph in a computer, and it is important to understand how
each representation may affect or improve the complexity analysis of the related algorithms. The
differences between the representations are based on the way information is stored: one representation
may make a characteristic X readily available while a characteristic Y is operation intensive
to obtain, and an alternative representation may favor ready access to Y over X.
The (i, j)th entry of the Node-Node Adjacency Matrix is 1 if there is an edge from the ith node to
the j th node, and 0 otherwise. In the case of undirected graphs it is easy to see that the Node-Node
Adjacency Matrix is a symmetric matrix.
The network is described by the matrix N , whose (i, j)th entry is nij , where

    nij = 1 if (i, j) ∈ E, and nij = 0 otherwise.
Figure 9 depicts the node-node adjacency matrices for directed and undirected graphs.
Note: For undirected graphs, the node-node adjacency matrix is symmetric. The density of
this matrix is 2|E|/|V |^2 for an undirected graph, and |A|/|V |^2 for a directed graph. If the graph
is complete, then |A| = |V | · (|V | − 1) and the matrix N is dense.
Is this matrix a good representation for a breadth-first-search? (discussed in the next section).
To determine the nodes adjacent to a given node, the n entries of its row need to be checked. For
a connected graph with n nodes and m edges, n − 1 ≤ m ≤ n(n − 1)/2, so scanning the whole
matrix requires O(n^2) operations.
There are 2 entries for every edge or arc; this results in O(m + n) size for the adjacency list
representation. This is the most compact representation among the three presented here. For
sparse graphs it is the preferable one.
Two basic methods that are used to define the selection mechanism are Breadth-First-Search
(BFS) and Depth-First-Search (DFS).
[Figure 11 depicts an example graph on nodes 1–10 with its BFS tree rooted at node 1: nodes
10, 5, 3, 2 are at distance 1 from the root; nodes 7, 8, 4 at distance 2; and nodes 9, 6 at distance 3.]
Figure 11: Solid edges represent the BFS tree of the example graph. The dotted edges cannot be
present in a BFS tree.
Note that the BFS tree is not unique and depends on the order in which the neighbors are
visited. Not all edges from the original graph are represented in the tree. However, we know that
the original graph cannot contain any edges that would bridge two nonconsecutive levels in the
tree. By level we mean a set of nodes such that the number of edges that connect each node to the
root of the tree is the same.
Consider any graph. Form its BFS tree. Any edge belonging to the original graph and not
in the BFS can either be from level i to level i + 1; or between two nodes at the same level. For
example, consider the dotted edge in Figure 11 from node 5 to node 6. If such an edge existed in
the graph, then node 6 would be a child of node 5 and hence would appear at a distance of 2 from
the root.
Note that a non-tree edge forms an even cycle if it goes from level i to level i + 1 in the graph, and
an odd cycle if it is between two nodes at the same level.
Theorem 7.1 (BFS Property 1). The level of a node, d(v), is the distance from s to v.
Proof. First, it is clear that every element reachable from s will be explored and given a d.
By induction on d(v), Q remains ordered by d, and contains only elements with d equal to k
and k + 1. Indeed, elements with d = k are dequeued first and add elements with d = k + 1 to
Q. Only when all elements with d = k have been used do elements with d = k + 2 start being
introduced. Let “stage k” be the period where elements with d = k are dequeued.
Then, again by induction, at the beginning of stage k, all nodes at distance less than k from s
have been enqueued, and only those, and their labels are correct. Assume this for k, and let
v be a node at distance k + 1. It cannot have been enqueued yet at the beginning of stage k, by
assumption, so at that point d(v) = ∞. But there exists at least one node u at distance k that
leads to v. u is in Q, and d(u) = k by assumption. The first such u dequeued will give v its value
d(v) = d(u) + 1 = k + 1. Nodes at distance more than k + 1 have no node at distance k leading to
them and will not be enqueued.
Corollary 7.2 (BFS Property 2). For undirected graphs, there is no edge {i, j} ∈ E with
|d(i) − d(j)| > 1 (i.e., neighbors must be in adjacent levels). For directed graphs, there is no arc
(i, j) ∈ A with d(j) > d(i) + 1 (i.e., forward arcs cannot skip levels).
Naive Algorithm
00: SC(G)
01: For every node v in V do
02: R <-- DFS(G,v); //or BFS
03: If R not equal to V then STOP AND OUTPUT "FALSE"
04: OUTPUT "TRUE"
Since DFS/BFS take O(m + n) time and we have O(n) executions, the complexity of the naive
algorithm is O(mn).
Claim 7.6. Checking strong connectivity in any graph takes O(m + n) time.
Proof.
For Undirected Graphs
Only one run of DFS or BFS is needed: if from an arbitrary node v we can reach all other
nodes in V , then we can reach any node from any other node, at least through a path containing
node v.
Although we can also use BFS to detect cycles, DFS tends to find a cycle 'faster'. This is
because if the root is not part of the cycle, BFS still needs to visit all of its neighbors, while DFS
will move away from the root (and hopefully towards the cycle) faster. Also, if the cycle has length
k, then BFS needs to explore at least ⌊k/2⌋ levels.
Proof. We know that a graph is bipartite if and only if it does not contain an odd cycle. So we
can run BFS from an arbitrary node v, and if there is an edge between two nodes in the same level,
then the graph has an odd cycle, and thus it is not bipartite.
Similarly, we can determine if a graph has an odd cycle: look for an edge from a node to a child
(of a previously processed node) in the next level. There is another way of detecting an odd cycle,
by assigning two alternating colors to the nodes visited with any search algorithm: if there is an edge
from a node of a given color, then the neighbor gets the second color. If an edge is discovered
between two nodes of the same color, then there is an odd cycle.
8 Shortest paths
8.1 Introduction
Consider graph G = (V, A), with cost cij on arc (i, j). There are several different, but related,
shortest path problems:
• Single source shortest paths: Find shortest path from s to all nodes.
• All pairs shortest paths: Find the SP between every pair of nodes.
Additionally, we differentiate between shortest paths problems depending on the type of graph
we receive as input.
We will begin by considering shortest paths (or longest paths) in Directed Acyclic Graphs
(DAGs).
A topological order of the nodes is a numbering of the nodes such that (i, j) ∈ E ⇒ i < j.
It should be noted that a DAG can have O(n^2) arcs. We can construct such a graph by the
following construction: given a set of nodes, number them arbitrarily and add all arcs (i, j) where
i < j.
Definition 8.2. A (graph) property, P , is hereditary if the following is true: given a graph G
with property P , any subgraph of G also has property P .
Lemma 8.3. In DAG, G, there always exists a node with in-degree 0. (Similarly, a DAG always
has a node with out-degree 0.)
Proof. Assume by contradiction that in G there is no node with in-degree 0. We arbitrarily pick
a node i in G, and apply a Depth-First-Search by visiting the predecessor nodes of each such
node encountered. Since every node has a predecessor, when the nodes are exhausted (which must
happen since there are a finite number of nodes in the graph), we will encounter a predecessor which
has already been visited. This shows that there is a cycle in G, which contradicts our assumption
that G is acyclic.
Theorem 8.4. There exists a topological sort for a digraph G if and only if G is a DAG (i.e. G
has no directed cycles).
Proof.
[Only if part] Suppose there exists a topological sort for G. If there exists a cycle (i1 , . . . , ik , i1 ), then,
by the definition of topological sort above, i1 < i2 < ... < ik < i1 , which is impossible. Hence, if
there exists a topological sort, then G has no directed cycles.
[If part] In a DAG, the nodes can be topologically sorted by using the following procedure: assign
a node with in-degree 0 the lowest remaining label (by the previous lemma, there always exists
such a node), remove that node from the graph (recall that being a DAG is hereditary), and
repeat until all nodes are labeled. The correctness of this algorithm is established by induction on
the label number. The pseudocode for the algorithm to find a topological sort is given below.
Note that the topological sort resulting from this algorithm is not unique. Further note that
an alternative approach works also: Take a node with out-degree 0, assign it the highest remaining
label, remove it from the graph, and repeat until no nodes remain.
An example of the topological order obtained for a DAG is depicted in Figure 13. The numbers
in the boxes near the nodes represent a topological order.
Figure 13: An example of topological ordering in a DAG. Topological orders are given in boxes.
Consider the complexity analysis of the algorithm. We can have a very loose analysis and say
that each node is visited at most once, and then its list of neighbors has to be updated (a search of
O(m) in the node-arc adjacency matrix, O(n) in the node-node adjacency matrix, or the number
of neighbors in the adjacency list data structure). Then we need to find, among all nodes, one with
indegree equal to 0. This algorithm runs in O(mn) time, or, if we are more careful, it can be done
in O(n^2); the second factor of n is due to the search for an indegree-0 node. To improve on this
we maintain a "bucket", or list, of nodes of indegree 0. Each time a node's indegree is updated, we
check if it is 0, and if so add it to the bucket. Now finding a node of indegree 0 is done in O(1): just
look up the bucket in any order. Updating the indegrees is done in O(m) time total throughout the
algorithm, as each arc is looked at once. Therefore, topological sort can be implemented to run in
O(m) time.
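A Python sketch of this O(m) bucket-based topological sort (names are illustrative):

    from collections import deque

    def topological_sort(n, arcs):
        # n nodes 0..n-1; arcs: list of (i, j) pairs. Returns an order or None.
        out_nbrs = [[] for _ in range(n)]
        indeg = [0] * n
        for i, j in arcs:
            out_nbrs[i].append(j)
            indeg[j] += 1
        bucket = deque(v for v in range(n) if indeg[v] == 0)
        order = []
        while bucket:
            v = bucket.popleft()           # any indegree-0 node will do
            order.append(v)
            for j in out_nbrs[v]:
                indeg[j] -= 1
                if indeg[j] == 0:
                    bucket.append(j)
        return order if len(order) == n else None   # None: the graph has a cycle

    print(topological_sort(4, [(0, 1), (0, 2), (1, 3), (2, 3)]))   # e.g. [0, 1, 2, 3]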
Proposition 8.6. Let the vector d⃗ represent the shortest path distances from the source node. Then
1. d(j) ≤ d(i) + cij for every arc (i, j) ∈ A.
2. A directed path P from s to k is a shortest path if and only if d(j) = d(i) + cij , ∀ (i, j) ∈ P .
The property given above is useful for backtracking and identifying the tree of shortest paths.
Given the distance vector d⃗, we call an arc eligible if d(j) = d(i) + cij . We may find a (shortest)
path from the source s to any other node by performing a breadth first search of eligible arcs. The
graph of eligible arcs looks like a tree, plus some additional arcs in the case that the shortest paths
are not unique. But we can choose one shortest path for each node so that the resulting graph of
all shortest paths from s forms a tree.
Aside: When solving the shortest path problem by linear programming (MCNF formulation), a
basic solution is a tree. This is because a basic solution is an independent set of columns; and a
cycle is a dependent set of columns (therefore “a basic solution cannot contain cycles”).
Given an undirected graph with nonnegative weights, the following analog algorithm solves
the shortest paths problem:
The String Solution: Given a shortest path problem on an undirected graph with non-negative
costs, imagine physically representing the vertices by small rings (labeled 1 to n), tied to
other rings with string of length equaling the cost of each edge connecting the corresponding
rings. That is, for each arc (i, j), connect ring i to ring j with string of length cij .
Algorithm: To find the shortest path from vertex s to all other vertices, hold ring s and let
the other nodes fall under the force of gravity.
Then, the distance, d(k), that ring k falls represents the length of the shortest path from s
to k. The strings that are taut correspond to eligible arcs, so these arcs form the tree of
shortest paths.
max d(t)
d(j) ≤ d(i) + cij ∀(i, j) ∈ A
d(s) = 0
Where the last constraint is needed as an anchor, since otherwise we would have an infinite number
of solutions. To see this, observe that we can rewrite the first constraint as d(j) − d(i) ≤ cij ; thus,
given any feasible solution d, we can obtain an infinite number of feasible solutions (all with same
objective value) by adding a constant to all entries of d.
This alternative mathematical programming formulation is nothing else but the dual of the
minimum-cost-network-flow-like mathematical programming formulation of the shortest path prob-
lem.
we can trivially determine that there is no path from i to j if i > j; in other words, we know
that the only way to reach each node j is by using nodes with labels less than j. Therefore, without
loss of generality, we can assume that the source is labeled 1, since any node with a smaller label
than the source is unreachable from the source and can thereby be removed.
Given a topological sorting of the nodes, with node 1 the source, the shortest path distances
can be found by using the recurrence

    d(1) = 0,    d(j) = min_{i:(i,j)∈A} { d(i) + cij },    j = 2, . . . , n.

The validity of the recurrence is easily established by induction on j. It also follows from property
8.6. The above recursive formula is referred to as a dynamic programming equation (or Bellman
equation).
What is the complexity of this dynamic programming algorithm? It takes O(m) operations
to perform the topological sort, and O(m) operations to calculate the distances via the given
recurrence (we obviously compute the recurrence equation in increasing number of node labels).
Therefore, the distances can be calculated in O(m) time. Note that by keeping track of the i’s that
minimize the right hand side of the recurrence, we can easily backtrack to reconstruct the shortest
paths.
It is important how we represent the output to the problem. If we want a list of all shortest
paths, then just writing the output takes O(n^2) (if we list for every node its shortest path). We
can do better if we instead just output the shortest path tree (list for every node its predecessor).
This way the output length is only O(n).
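A Python sketch of this dynamic program, assuming the nodes 0, . . . , n−1 are already numbered in topological order and node 0 is the source (the names and the small example are illustrative):

    import math

    def dag_shortest_paths(n, arcs):
        # arcs: list of (i, j, cost) with i < j; returns distances and predecessors.
        in_arcs = [[] for _ in range(n)]
        for i, j, c in arcs:
            in_arcs[j].append((i, c))
        d = [math.inf] * n
        pred = [None] * n
        d[0] = 0
        for j in range(1, n):                      # increasing topological label
            for i, c in in_arcs[j]:
                if d[i] + c < d[j]:
                    d[j], pred[j] = d[i] + c, i    # remember the minimizer
        return d, pred

    d, pred = dag_shortest_paths(4, [(0, 1, 2), (0, 2, 5), (1, 2, 1), (2, 3, 2)])
    print(d)      # [0, 2, 3, 5]
    print(pred)   # [None, 0, 1, 2]: the shortest path tree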
Equipment Replacement Minimize the total cost of buying, selling, and operating the equip-
ment over a planning horizon of T periods.
[Figure: a DAG for the equipment replacement problem over periods 1, . . . , 10; an arc (i, j)
represents buying equipment at period i and replacing it at period j, with the arc cost accounting
for the purchase, operation, and resale over those periods.]
Lot Sizing Problem Meet the prescribed demand dj for each of the periods j = 1, 2, . . . , T by
producing, or by carrying inventory from previous production runs. The cost of production
includes a fixed setup cost for producing a positive amount at any period, and a variable per
unit production cost at each period. This problem can be formulated as a shortest path
problem in a DAG (AM&O page 749) if one observes that any production will be for the
demand of an integer number of consecutive periods ahead. You are asked to prove this in
your homework assignment.
This is a typical project management problem. The objective of the problem is to find the
earliest time by which the project can be completed.
This problem can be formulated as a longest path problem. The reason we formulate the
problem as a longest path problem, and not as a shortest path problem (even though we want
to find the earliest finish time) is that all paths in the network have to be traversed in order
for the project to be completed. So the longest one will be the “bottleneck” determining the
completion time of the project.
We formulate the problem as a longest path problem as follows. We create a node for each
task to be performed. We have an arc from node i to node j if activity i must precede activity
j. We set the length of arc (i, j) equal to the duration of activity i. Finally we add two special
nodes, start and finish, to represent the start and the finish activities (these activities have
duration 0). Note that in this graph we cannot have cycles, thus we can solve the problem as
a longest path problem in an acyclic network.
Now we give some notation that we will use when solving this type of problems. Each node
is labeled as a triplet (node name, time in node, earliest start time). Earliest start time of
a node is the earliest time that work can be started on that node. Hence, the objective can
also be written as the earliest start time of the finish node.
Then we can solve the problem with our dynamic programming algorithm. That is, we
traverse the graph in topological order and assign the label of each node according to our
dynamic programming equation:

    tj = max_{(i,j)∈A} { ti + cij },
where ti is the earliest start time of activity i; and cij is the duration of activity i.
To recover the longest path, we start from the finish node and work backwards, keeping track
of the preceding activity i such that tj = ti + cij . Notice that in general more than one activity
might satisfy this equation, and thus we may have several longest paths.
Alternatively, we can formulate the longest path as follows:
    min   tfinish
    s.t.  tj ≥ ti + cij    ∀(i, j) ∈ A
          tstart = 0
From the solution of the linear program above we can identify the longest path as follows.
The constraints corresponding to arcs on the longest path will be satisfied with equality.
Furthermore, for each unit increase in the length of these arcs, our objective value will also increase
by one unit. Therefore, the dual variables of these constraints will be equal to one (while
all other dual variables will be equal to zero, by complementary slackness).
The longest path (also known as the critical path) is shown in Figure 15 with thick lines. This
path is called critical because any delay in a task along this path delays the whole project.
[Figure 15: the project network; each node is labeled (activity, duration, earliest start time), e.g.
(A, 10, 0), (B, 19, 0), (C, 13, 0), . . . , (FINISH, 0, 38); the critical path is drawn with thick lines.]
Problem: Given a set of items, each with a weight and value (cost), determine the subset
of items with maximum total weight and total cost less than a given budget.
The binary knapsack problem’s mathematical programming formulation is as follows.
    max   ∑_{j=1}^{n} wj xj
    s.t.  ∑_{j=1}^{n} vj xj ≤ B
          xj ∈ {0, 1}
The binary knapsack problem is a hard problem. However, it can be solved as a longest path
problem on a DAG as follows.
Let fi (q) be the maximum weight achievable when considering the first i items and a budget
of q, that is,
    fi (q) = max   ∑_{j=1}^{i} wj xj
             s.t.  ∑_{j=1}^{i} vj xj ≤ q
                   xj ∈ {0, 1}
Note that we are interested in finding fn (B). Additionally, note that the fi (q)'s are related
by the following dynamic programming equation:

    f_{i+1}(q) = max { fi (q), fi (q − v_{i+1}) + w_{i+1} }.
The above equation can be interpreted as follows. By the principle of optimality, the maximum
weight I can obtain, when considering the first i + 1 items will be achieved by either:
1. Not including the i + 1 item in my selection. In this case my total weight will be equal
to fi (q), i.e., equal to the maximum weight achievable with a budget of q when considering
only the first i items; or
2. Including the i + 1 item in my selection. In which case my total weight will be equal to
fi (q − vi+1 ) + wi+1 , i.e. equal to the maximum weight achievable when considering the
first i items with a budget of q − vi+1 plus the weight of this i + 1 object.
We also have the boundary conditions:
f1(q) = 0 for 0 ≤ q < v1, and f1(q) = w1 for v1 ≤ q ≤ B.
The graph associated with the knapsack problem with 4 items, w = (6, 8, 4, 5), v = (2, 3, 4, 4),
and B = 12 is given in Figure 16. (The figure was adapted from: Trick M., A dynamic
programming approach for consistency and propagation of knapsack constraints.)
Note that in general a knapsack graph will have O(nB) nodes and O(nB) edges (each node
has at most two incoming arcs); it follows that we can solve the binary knapsack problem in
time O(nB).
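A minimal Python sketch of this O(nB) dynamic program (illustrative, not from the notes), using the example data w = (6, 8, 4, 5), v = (2, 3, 4, 4), B = 12:

def knapsack(w, v, B):
    n = len(w)
    # f[q] holds f_i(q) for the current i; start with f_1 per the boundary conditions
    f = [0 if q < v[0] else w[0] for q in range(B + 1)]
    for i in range(1, n):
        g = f[:]                                 # g will hold f_{i+1}
        for q in range(v[i], B + 1):
            # f_{i+1}(q) = max{ f_i(q), f_i(q - v_{i+1}) + w_{i+1} }
            g[q] = max(f[q], f[q - v[i]] + w[i])
        f = g
    return f[B]

assert knapsack([6, 8, 4, 5], [2, 3, 4, 4], 12) == 19   # take items 1, 2 and 4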
Remark: The above running time is not polynomial in the length of the input. This is
true because the number B is encoded using only log B bits, so the size of the input for
the knapsack problem is O(n log B + n log W) (where W is the maximum wj). Finally, since
B = 2^{log B}, a running time of O(nB) is really exponential in the size of the input.
At each iteration, a node with the least distance label is marked as permanent, and the distances to its
successors are updated. This is continued until no temporary nodes are left in the graph.
We give the pseudocode for Dijkstra's algorithm below, where P is the set of nodes with
permanent labels and T the set of temporarily labeled nodes. Let s = 1.
begin
N+(i) := {j | (i, j) ∈ A};
P := {1}; T := V \ {1};
d(1) := 0 and pred(1) := 0;
d(j) := c1j and pred(j) := 1 for all j ∈ N+(1),
and d(j) := ∞ otherwise;
while P ̸= V do
begin
(Node selection, also called F IN DM IN )
let i ∈ T be a node for which d(i) = min{d(j) : j ∈ T };
P := P ∪ {i}; T := T \ {i};
(Distance update)
for each j ∈ N + (i) do
if d(j) > d(i) + cij then
d(j) := d(i) + cij and pred(j) := i;
end
end
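For concreteness, here is a minimal Python sketch of Dijkstra's algorithm using a binary heap (the heap-based variant discussed in the complexity analysis below); the adjacency-list input format {i: [(j, cij), ...]} is an assumption of the example:

import heapq

def dijkstra(graph, s):
    # graph: {i: [(j, cij), ...]}; returns distance labels and predecessors
    d, pred = {s: 0}, {s: None}
    heap = [(0, s)]                            # temporary labels in a binary heap
    P = set()                                  # permanently labeled nodes
    while heap:
        di, i = heapq.heappop(heap)            # node selection (FINDMIN)
        if i in P:
            continue                           # stale entry; i is already permanent
        P.add(i)
        for j, cij in graph.get(i, []):        # distance update over N+(i)
            if j not in P and di + cij < d.get(j, float("inf")):
                d[j] = di + cij
                pred[j] = i
                heapq.heappush(heap, (d[j], j))
    return d, pred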
Theorem 8.7 (The correctness of Dijkstra’s algorithm). Once a node joins P its label is the
shortest path label.
Proof. At each iteration the nodes are partitioned into subsets P and T . We will prove by induction
on the size of P that the label for each node i ∈ P is the shortest distance from node 1.
Base: When |P | = 1, the only node in P is s. It is correctly labeled with a distance 0.
Inductive step: Assume for |P | = k and prove for |P | = k + 1.
Suppose that node i with d(i) = min{d(j) : j ∈ T } was added to P, but its label d(i) is not the
shortest path label. Let d∗(i) be the true shortest path label of i. Obviously d∗(i) < d(i), because
d(i) is assumed not to be the shortest path label. We now prove a claim:
Claim 8.8. If d∗ (i) < d(i) then the true shortest path to i contains nodes of T .
Proof. Suppose not, and let v be the immediate predecessor of i in the shortest path from 1 to i. Then
necessarily v ∈ P. By the induction hypothesis d(v) is the true shortest path label of v, d(v) = d∗(v).
When v was added to P, the update step ensures that the label of i is at most d∗(v) + cvi. Hence
d(i) ≤ d∗(v) + cvi = d∗(i), a contradiction.
Therefore, by the claim, there must be some nodes of T along the shortest path 1 to i. Let j be
the first node of T on the shortest path from 1 to i. Consider the two sections of the path: from
1 to j and from j to i. Let q ∈ P be the immediate predecessor of j. Then by the update step for
q and the induction hypothesis, d(j) ≤ d(q) + cqj = d∗ (q) + cqj . The subpath from j to i is of cost
C(j, i) ≥ 0. By combining these observations, we know that the shortest path from 1 to i via q and
j has length d∗ (i) = d(q) + cqj + C(j, i) ≥ d(j) + C(j, i) ≥ d(j). However, i was selected to have the
minimum label d(i) in T , so d(j) ≥ d(i). Therefore, d∗ (i) ≥ d(i). Contradiction with d∗ (i) < d(i).
We next prove two invariants that are preserved during the execution of Dijkstra’s algorithm.
Proposition 8.9 (Dijkstra’s invariant). The label for each j ∈ T is the shortest distance from s
such that all nodes of the path are in P (i.e. it would be the shortest path in a graph from which
we eliminate all arcs with both endpoints in T ).
Proposition 8.10. The labels of nodes joining P can only increase in successive iterations.
Complexity Analysis
There are two major operations in Dijkstra's algorithm:
Find minimum, which has O(n²) complexity over all iterations; and
Update labels, which has O(m) complexity over all iterations.
Therefore, the complexity of Dijkstra's algorithm is O(n² + m).
There are several implementations that improve upon this running time.
One improvement uses binary heaps: the temporary labels are kept in a heap, so each label
update and each find-minimum operation requires O(log n) work. Over the O(m) updates and
O(n) find-minimum operations, the complexity of the algorithm is O(m + m log n) = O(m log n).
Another improvement is the radix heap, which has complexity O(m + n√(log C)) where
C = max_{(i,j)∈A} cij. Note that this complexity is not strongly polynomial.
Currently, the best strongly polynomial implementation of Dijkstra's algorithm uses Fibonacci
heaps and has complexity O(m + n log n). Since the Fibonacci heap data structure is difficult to
program, this implementation is not used in practice.
It is important to stress that Dijkstra's algorithm does not work correctly in the presence of
negative edge weights. In this algorithm, once a node is included in the permanent set it is never
checked again; with negative cost arcs that is not valid, as the distance label of a node in the
permanent set might still be reduced after its inclusion. See Figure 17 for an example on which
Dijkstra's algorithm does not work in the presence of a negative edge weight: using the algorithm
we get d(3) = 3, but the actual value of d(3) is 2.
Figure 17: A network on which Dijkstra's algorithm fails: arcs (1, 2) of cost 4, (1, 3) of cost 3, and (2, 3) of cost −2.
Lemma 8.11. If there are no negative cost cycles in the network G = (N, A), then there exists a
simple shortest path from s to any node i (a path that uses at most n − 1 arcs).
Proof. Suppose that G contains no negative cycles. Observe that at most n − 1 arcs are required
to construct a path from s to any node i. Now, consider a path, P , from s to i which traverses a
cycle.
P = s → i1 → i2 → . . . → (ij → ik → . . . → ij ) → iℓ → . . . → i.
Since G has no negative length cycles, the length of P is no less than the length of P̄ where
P̄ = s → i1 → i2 → . . . → ij → ik → . . . → iℓ → . . . → i.
Thus, we can remove all cycles from P and still retain a shortest path from s to i. Since the
final path is acyclic, it must have no more than n − 1 arcs.
Theorem 8.12 (Invariant of the algorithm). After pulse k, all shortest paths from s of length k
(in terms of number of arcs) or less have been identified.
Note that the shortest paths identified in the theorem are not necessarily simple. They may
use negative cost cycles.
Theorem 8.13. If there exists a negative cost cycle, then there exists a node j such that dn (j) <
dn−1 (j) where dk (i) is the label at iteration k.
Proof. Suppose not; then for all nodes, dn(j) = dn−1(j), since the labels never increase and by
assumption do not strictly decrease.
Let (i1 − i2 − ... − ik − i1) be a negative cost cycle, and denote i_{k+1} = i1. For all nodes in the cycle
dn(ij) = dn−1(ij).
From the optimality conditions, dn(i_{r+1}) ≤ dn−1(ir) + C_{ir,i_{r+1}} for r = 1, ..., k (for r = k we
get dn(i1) ≤ dn−1(ik) + C_{ik,i1}).
Summing up these inequalities we get ∑_{j=1}^{k} dn(ij) ≤ ∑_{j=1}^{k} dn−1(ij) + ∑_{j=1}^{k} C_{ij,i_{j+1}}.
Since dn(ij) = dn−1(ij), the above inequality implies that 0 ≤ ∑_{j=1}^{k} C_{ij,i_{j+1}}. But this
contradicts the assumption that this cycle is of negative length, ∑_{j=1}^{k} C_{ij,i_{j+1}} < 0. That is,
we get an inequality which says that 0 is at most a negative number, which is a contradiction.
Corollary 8.14. The Bellman-Ford algorithm identifies a negative cost cycle if one exists.
By running n pulses of the Bellman-Ford algorithm, we either detect a negative cost cycle (if
one exists) or find a shortest path from s to all other nodes i.
Complexity Analysis
Complexity = O(mn) since each pulse is O(m), and we have n pulses.
Note that it suffices to apply the pulse operations only for arcs (i, j) where the label of i changed in
the previous pulse. Thus, if no label changed from the previous iteration, then we are done, even
if the iteration number is less than n.
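A minimal Python sketch of the pulse algorithm with this early termination rule and the negative cycle test of Theorem 8.13 (illustrative; the arc-list input format is an assumption):

def bellman_ford(n, arcs, s):
    # arcs: list of (i, j, cij) triples over nodes 0..n-1
    inf = float("inf")
    d = {v: inf for v in range(n)}
    d[s] = 0
    for _ in range(n):                         # pulses; the n-th one detects cycles
        changed = False
        for i, j, cij in arcs:
            if d[i] + cij < d[j]:
                d[j] = d[i] + cij
                changed = True
        if not changed:
            return d                           # no label changed: shortest paths found
    # a label still decreased at pulse n: by Theorem 8.13 a negative cost cycle exists
    raise ValueError("negative cost cycle detected")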
Complexity Analysis
At each iteration we consider another node as the intermediate node → O(n) iterations.
In each iteration we compare n² triangles (one for each of the n² pairs) → O(n²) work per iteration.
Thus the total complexity is O(n³).
Figure 18: Triangle operation: Is the (dotted) path using node j as intermediate node shorter than
the (solid) path without using node j?
At the initial stage, dij = cij if there exists an arc between nodes i and j, and dij = ∞ otherwise; eij = 0.
First iteration: d24 ← min(d24, d21 + d14) = 3, and e24 = 1; update the distance labels.
Second iteration: d41 ← min(d41, d42 + d21) = −2, d43 ← min(d43, d42 + d23) = −3,
d44 ← min(d44, d42 + d24) = −1, and e41 = e43 = e44 = 2. No other label changed.
Note that we found a negative cost cycle, since the diagonal element d44 = −1 < 0 of the matrix D.
A negative dii means there exists a negative cost cycle, since it simply says that there exists a path
of negative length from node i to itself.
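A minimal Python sketch of the triangle operation, including the matrix e of intermediate nodes and the negative-diagonal test (illustrative; the matrix is assumed 0-indexed with dii = 0):

def floyd_warshall(d):
    # d: n x n matrix, d[i][i] = 0, d[i][j] = cij or float("inf"); modified in place
    n = len(d)
    e = [[0] * n for _ in range(n)]            # last intermediate node used (1-indexed)
    for k in range(n):                         # triangle operation through node k+1
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
                    e[i][j] = k + 1
        if any(d[i][i] < 0 for i in range(n)):
            return d, e, True                  # negative diagonal: negative cost cycle
    return d, e, False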
See the handout titled All pairs shortest paths - section 1.2.
This algorithm takes advantage of the idea that if all arc costs were nonnegative, we can find
all pairs shortest paths by solving O(n) single source shortest paths problems using Dijkstra’s al-
gorithm. The algorithm converts a network with negative-cost arcs to an equivalent network with
nonnegative-cost arcs. After this transformation, we can use Dijkstra’s algorithm to solve n SSSP
(and get our APSP).
1. Add a new node s to the graph, with an arc (s, i) of cost 0 to every node i.
2. Use the Bellman-Ford algorithm to find the shortest paths from s to every other node. If a
negative cost cycle is detected by the algorithm, then terminate at this step. Otherwise, let
d(i) be the shortest distance from s to node i.
3. Remove s from the graph, and convert the costs in the original graph to nonnegative costs
by setting c′ij = cij + d(i) − d(j).
4. Apply Dijkstra’s algorithm n times to the graph with weights c′ .
Claim 8.16. In step 1, adding node s to the graph does not create or remove negative cost cycles.
Proof. Since s only has outgoing arcs, then it cannot be part of any cycle; similarly the added arcs
cannot be part of any cycle. Therefore we have not created any more cycles in G. Finally, if G
contained a negative cycle, this cycle remains unchanged after adding s to G.
Claim 8.17. The new costs c′ij are nonnegative.
Proof. Since the d(i) are shortest path labels, they must satisfy cij + d(i) ≥ d(j) (Proposition 2 in
Lecture 9). This implies that c′ij = cij + d(i) − d(j) ≥ 0.
Let dPc (s, t) be the distance of a path P from s to t with the costs c.
Claim 8.18. For every pair of nodes s and t, and a path P , dPc (s, t) = dPc′ (s, t) − d(s) + d(t). That
is, minimizing dc (s, t) is equivalent to minimizing dc′ (s, t).
Proof. dPc′(s, t) = (cs,i1 + d(s) − d(i1)) + (ci1,i2 + d(i1) − d(i2)) + · · · + (cik,t + d(ik) − d(t)) = dPc(s, t) + d(s) − d(t).
Note that the d(i) terms telescope, except for the first and last. Thus the lengths of all paths from
s to t, when using costs c′ instead of costs c, change by the same constant amount d(s) − d(t).
Hence minP dPc′(s, t) = minP dPc(s, t) + constant, and minimizing dc′(s, t) is equivalent to
minimizing dc(s, t).
Complexity Analysis
Step 1: add s and n arcs → O(n).
Step 2: Bellman-Ford → O(mn).
Step 3: updating c′ij for m arcs → O(m).
Step 4: Dijkstra's algorithm n times → O(n(m + n log n)).
Total complexity: O(n(m + n log n)).
So far this has been the best complexity established for the all-pairs shortest paths problem (on
general graphs).
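A minimal Python sketch of the four steps, reusing the bellman_ford and dijkstra sketches given earlier (an assumption; any routines with those interfaces would do):

def johnson(n, arcs):
    s = n                                       # step 1: new node s, zero-cost arcs to all
    d = bellman_ford(n + 1, arcs + [(s, v, 0) for v in range(n)], s)   # step 2
    graph = {}
    for i, j, cij in arcs:                      # step 3: c'ij = cij + d(i) - d(j) >= 0
        graph.setdefault(i, []).append((j, cij + d[i] - d[j]))
    dist = {}
    for u in range(n):                          # step 4: n runs of Dijkstra
        du, _ = dijkstra(graph, u)
        # undo the reweighting: dc(u, v) = dc'(u, v) - d(u) + d(v)
        dist[u] = {v: duv - d[u] + d[v] for v, duv in du.items()}
    return dist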
Consider a matrix D with dij = cij if there exists an arc from i to j, and dij = ∞ otherwise.
Let dii = 0 for all i. Then we define the "dot" operation as
D^{(2)}_{ij} = (D ⊙ D)_{ij} = min{d_{i1} + d_{1j}, . . . , d_{in} + d_{nj}}.
D^{(2)}_{ij} represents the length of a shortest path with 2 or fewer edges from i to j. Similarly,
D^{(3)}_{ij} = (D ⊙ D^{(2)})_{ij} = shortest path with at most 3 edges from i to j
...
D^{(n)}_{ij} = (D ⊙ D^{(n−1)})_{ij} = shortest path with at most n edges from i to j.
Complexity Analysis
One matrix "multiplication" (dot operation) → O(n³).
n multiplications lead to a total complexity of O(n⁴).
By doubling we can improve the complexity to O(n³ log₂ n): instead of computing D^{(n)}
one power at a time, we evaluate it from the previously computed matrices whose powers are
powers of two, D^{(2)}, D^{(4)}, D^{(8)}, . . ., using O(log₂ n) multiplications.
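A minimal Python sketch of the dot operation and the doubling scheme (illustrative; D is assumed to be a list of lists with dii = 0 and float("inf") for missing arcs):

def min_plus(A, B):
    n = len(A)
    return [[min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apsp_by_doubling(D):
    # square repeatedly: D^(2), D^(4), ... until the power reaches n - 1
    n, power = len(D), 1
    M = [row[:] for row in D]
    while power < n - 1:
        M = min_plus(M, M)
        power *= 2
    return M      # M[i][j] = length of a shortest path with at most n - 1 edges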
Figure 19: A network on nodes 1, 2 and 3 containing a negative cost cycle (arc costs include 1, 1 and −10).
8.12 Why finding shortest paths in the presence of negative cost cycles is difficult
Suppose we were to apply a minimum cost flow formulation to solve the problem of finding a
shortest path from node s to node t in a graph which contains negative cost cycles. We could give
a capacity of 1 to the arcs in order to guarantee a bounded solution. This is not enough, however,
to give us a correct solution. The problem is really in ensuring that we get a simple path as a
solution.
Consider the network in Figure 19. Observe first that the shortest simple path from node 1 to
node 2 is the arc (1, 2). If we formulate the problem of finding a shortest path from 1 to 2 as a
MCNF problem, we get the following linear program:
The Maximum Flow problem is defined on a directed network G = (V, A) with capacities uij
on the arcs, and no costs. In addition two nodes are specified, a source node, s, and sink node, t.
The objective is to find the maximum flow possible between the source and sink while satisfying
the arc capacities. (We assume for now that all lower bounds are 0.)
Definitions:
• A feasible flow f is a flow that satisfies the flow-balance and capacity constraints.
• A cut is a partition of the nodes (S, T), such that S ⊆ V, s ∈ S, T = V \ S, t ∈ T.
• Cut capacity: U(S, T) = ∑_{i∈S} ∑_{j∈T} uij. (Important: note that the arcs in the cut are
only those that go from S to T.)
Let |f | be the total amount of flow out of the source (equivalently into the sink). It is easy to
observe that any feasible flow satisfies
|f | ≤ U (S, T ) (13)
for any (S, T ) cut. This is true since all flow goes from s to t, and since s ∈ S and t ∈ T (by
definition of a cut), then all the flow must go through the arcs in the (S, T ) cut. Inequality 13 is a
special case of the weak duality theorem of linear programming.
The following theorem, which we will establish algorithmically, can be viewed as a special case of
the strong duality theorem in linear programming.
Theorem 9.1 (Max-Flow Min-Cut). The value of a maximum flow is equal to the capacity of a
cut with minimum capacity among all cuts. That is,
max {|f| : f a feasible flow} = min {U(S, T) : (S, T) an s-t cut}.
Next we introduce the terminology needed to prove the Max-flow/Min-cut duality and to give
us an algorithm to find the max flow of any given graph.
Residual graph: The residual graph, Gf = (V, Af ), with respect to a flow f , has the following
arcs:
• forward arcs: (i, j) ∈ Af if (i, j) ∈ A and fij < uij . The residual capacity is ufij = uij − fij .
The residual capacity on the forward arc tells us how much we can increase flow on the original
arc (i, j) ∈ A.
• reverse arcs: (j, i) ∈ Af if (i, j) ∈ A and fij > 0. The residual capacity is ufji = fij . The
residual capacity on the reverse arc tells us how much we can decrease flow on the original
arc (i, j) ∈ A.
Intuitively, residual graph tells us how much additional flow can be sent through the original
graph with respect to the given flow. This brings us to the notion of an augmenting path.
In the presence of lower bounds, ℓij ≤ fij ≤ uij , (i, j) ∈ A, the definition of forward arc remains,
whereas for reverse arc, (j, i) ∈ Af if (i, j) ∈ A and fij > ℓij . The residual capacity is ufji = fij −ℓij .
Max xts
subject to ∑_i xki − ∑_j xjk = 0, k ∈ V
0 ≤ xij ≤ uij ∀(i, j) ∈ A.
For dual variables we let {zij } be the nonnegative dual variables associated with the capacity
upper bounds constraints, and {λi } the variables associated with the flow balance constraints. (I
also multiply the flow balance constraints by −1, so they represent Inflowi −Outflowi = 0.)
Min ∑_{(i,j)∈A} uij zij
subject to zij − λi + λj ≥ 0 ∀(i, j) ∈ A
λs − λt ≥ 1
zij ≥ 0 ∀(i, j) ∈ A.
The dual problem has an infinite number of solutions: if (λ∗, z∗) is an optimal solution, then
so is (λ∗ + δ, z∗) for any constant δ. To avoid that we set λt = 0 (or to any other arbitrary value).
Observe now that with this assignment there is an optimal solution with λs = 1 and a partition of
the nodes into two sets: S = {i ∈ V |λi = 1} and S̄ = {i ∈ V |λi = 0}.
The complementary slackness conditions state that the primal and dual optimal solutions x∗, λ∗, z∗
satisfy
x∗ij · [z∗ij − λ∗i + λ∗j] = 0
[uij − x∗ij] · z∗ij = 0.
In an optimal solution z∗ij − λ∗i + λ∗j = 0 except for the arcs in (S̄, S), so the first set of
complementary slackness conditions provides very little information on the primal variables {xij}:
namely, that the flows are 0 on the arcs of (S̄, S). As for the second set, z∗ij = 0 on all arcs other
than the arcs in the cut (S, S̄). So we can conclude that the cut arcs are saturated, but derive no
further information on the flow on other arcs.
The only method known to date for solving the minimum cut problem requires finding a max-
imum flow first, and then recovering the cut partition by finding the set of nodes reachable from
the source in the residual graph (or reachable from the sink in the reverse residual graph). That
set is the source set of the cut, and the recovery can be done in linear time in the number of arcs,
O(m).
On the other hand, if we are given a minimum cut, there is no efficient way of recovering the
flow values on each arc other than essentially solving from scratch. The only information given
by the minimum cut, is the value of the maximum flow and the fact that the arcs on the cut are
saturated. Beyond that, the flows have to be calculated with the same complexity as would be
required without the knowledge of the minimum cut.
This asymmetry implies that it may be easier to solve the minimum cut problem than to solve
the maximum flow problem. Still, no direct minimum cut algorithm has ever been discovered:
every known s, t minimum cut algorithm first computes a maximum flow.
9.3 Applications
9.3.1 Hall’s theorem
Hall's theorem gives a necessary and sufficient condition for the existence of a perfect matching
in a bipartite graph G = (U ∪ V, E) with |U| = |V|: a perfect matching exists if and only if every
X ⊆ U satisfies |N(X)| ≥ |X|, where N(X) is the set of neighbors of the nodes of X. Hall
discovered this theorem independently of the max flow min cut theorem; however, the max flow
min cut theorem can be used as a quick alternative proof of Hall's theorem.
Proof.
(⇒) If there is a perfect matching, then for every X ⊆ U the set of its neighbors N (X) contains at
least all the nodes matched with X. Thus, |N (X)| ≥ |X|.
(⇐) We construct a flow graph by adding: a source s, a sink t, unit-capacity arcs from s to all
nodes in U , and unit-capacity arcs from all nodes in V to t. Direct all arcs in the original graph G
from U to V , and set the capacity of these arcs to ∞ (See Figure 20).
Figure 20: The flow graph for Hall's theorem: unit-capacity arcs from s to every node of U and from every node of V to t, and arcs of infinite capacity from U to V.
Assume that |N (X)| ≥ |X|, ∀ X ⊆ U . We need to show that there exists a perfect matching.
Consider a finite cut (S, T ) in the flow graph defined. Let X = U ∩ S. We note that N (X) ⊆ S ∩ V ,
else the cut is not finite.
Since |N(X)| ≥ |X|, the capacity of any finite (S, T) cut satisfies U(S, T) = |U \ X| + |S ∩ V| ≥ |U \ X| + |N(X)| ≥ |U \ X| + |X| = |U|. On the other hand, the cut ({s}, U ∪ V ∪ {t}) has capacity |U|. Hence the minimum cut, and thus the maximum flow, has value |U|, and an integral maximum flow saturates every arc out of s and induces a perfect matching.
The following definition will be useful in our later discussion. Given a directed graph G = (V, A),
a closed set is a set C ⊆ V such that u ∈ C and (u, v) ∈ A ⇒ v ∈ C. That is, a closed set includes
all of the successors of every node in C. Note that both ∅ and V are closed sets.
Now we are ready to show how we can formulate the selection problem as a minimum cut
problem. We first create a bipartite graph with sets on one side, items on the other. We add arcs
from each set to all of its items. (See figure 21.) Note that a selection in this graph is represented
by a collection of set nodes and all of its successors. Indeed there is a one-to-one correspondence
between selections and closed sets in this graph.
Next we transform this graph into a maxflow-mincut graph. We set the capacity of all arcs to
infinity. We add a source s, a sink t, arcs from s to each set node i with capacity bi, and arcs from
each item node j to t with capacity cj. (See Figure 22.)
We make the following observations. There is a one-to-one correspondence between finite cuts
and selections. Indeed the source set of any finite (S, T ) cut is a selection. (If Sj ∈ S then it must
be true that all of its items are also in S—otherwise the cut could not be finite.) Now we are ready
to state our main result.
Theorem 9.4. The source set of a minimum cut (S,T) is an optimal selection.
Proof.
min_S U(S, T) = min ( ∑_{i∈T} bi + ∑_{j∈S} cj )
= min ( ∑_{i=1}^{m} bi − ∑_{i∈S} bi + ∑_{j∈S} cj )
= B + min ( − ∑_{i∈S} bi + ∑_{j∈S} cj )
= B − max ( ∑_{i∈S} bi − ∑_{j∈S} cj ),
where B = ∑_{i=1}^{m} bi. Since B is a constant, the source set S of a minimum cut maximizes the
selection value ∑_{i∈S} bi − ∑_{j∈S} cj.
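As an illustration, the selection problem can be solved with any max-flow/min-cut code; the sketch below uses the networkx library (an assumption, the notes do not prescribe an implementation), with benefits bi for sets, costs cj for items, and items_of[i] the items of set i. In networkx, an arc without a capacity attribute is treated as having infinite capacity.

import networkx as nx

def best_selection(benefits, costs, items_of):
    # benefits[i] = b_i for set node i; costs[j] = c_j for item node j;
    # items_of[i] lists the items that set i requires (hypothetical input format)
    G = nx.DiGraph()
    for i, b in benefits.items():
        G.add_edge("s", ("set", i), capacity=b)
        for j in items_of[i]:
            G.add_edge(("set", i), ("item", j))   # no capacity attribute = infinite
    for j, c in costs.items():
        G.add_edge(("item", j), "t", capacity=c)
    cut_value, (S, T) = nx.minimum_cut(G, "s", "t")
    selected_sets = [v[1] for v in S if isinstance(v, tuple) and v[0] == "set"]
    # by Theorem 9.4, the optimal net benefit is B minus the min cut capacity
    return selected_sets, sum(benefits.values()) - cut_value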
The vertex cover problem on bipartite graphs G = (V1 ∪ V2, E) can be solved in polynomial
time. This follows since the constraint matrix of the vertex cover problem on bipartite graphs is
totally unimodular: the columns can be partitioned into two parts, corresponding to V1 and V2
respectively, and each row contains exactly one 1 in each part.
To see this, let
xi = 1 if i ∈ V1 is in the cover, and xi = 0 otherwise;
yj = −1 if j ∈ V2 is in the cover, and yj = 0 otherwise.
The integer programming formulation for the vertex cover problem on bipartite graph can then
be transformed to the following:
min ∑_{j∈V1} wj xj + ∑_{j∈V2} (−wj) yj    (VC_B)
s.t. xi − yj ≥ 1 ∀ (i, j) ∈ E
xi ∈ {0, 1} ∀ i ∈ V1
yj ∈ {−1, 0} ∀ j ∈ V2
The following s, t-Graph (Figure 23) can be constructed to solve the vertex cover problem on
bipartite graph.
1. Given a bipartite graph G = (V1 ∪ V2 , E), set the capacity of all edges in A equal to ∞.
2. Add a source s, a sink t, set As of arcs from s to all nodes i ∈ V1 (with capacity us,i = wi ), and
set At of arcs from all nodes j ∈ V2 to t (with capacity uj,t = wj ).
Figure 23: An illustration of the s, t-Graph for the vertex cover problem on bipartite graph and an
example of a finite s − t cut.
Solution. Make a grid-graph G = (V, E). For every square of forest make a node v. Assign
each node v a weight wv equal to the benefit obtained from cutting the corresponding
square. Connect two nodes if their corresponding squares are adjacent. The resulting graph is
bipartite. We can color it in a chess-board fashion in two colors, such that no two adjacent nodes
are of the same color. Now, the above Forest Clearing problem is equivalent to finding a max-weight
independent set in the grid-graph G. In a previous section it was shown that Maximum Independent
Set Problem is equivalent to Minimum Vertex Cover Problem. Therefore, we can solve our problem
by solving weighted vertex cover problem on G by finding Min-Cut in the corresponding network.
This problem is solved in polynomial time.
Version 2. Suppose deer can see if the diagonal squares are cut. Then the new constraint is
the following: no two squares that are adjacent vertically, horizontally, or diagonally can be cut.
We can build a grid-graph in a similar fashion as in Version 1. However, it will not be bipartite anymore,
since it will have odd cycles. So, the above solution would not be applicable in this case. Moreover,
the problem of finding Max Weight Independent Set on such a grid-graph with diagonals is proven
to be NP-complete.
Another variation of Forest Clearing Problem is when the forest itself is not of a rectangular
shape. Also, it might have “holes” in it, such as lakes. In this case the problem is also NP-complete.
Because there are no odd cycles in the above grid graph of version 1, it is bipartite and the
aforementioned approach works. Nevertheless, notice that this method breaks down for graphs
where cells are neighbors also if they are adjacent diagonally. An example of such a graph is given
below.
Such graphs are no longer bipartite. In this case, the problem has indeed been proven NP-complete
in the context of producing memory chips (next application).
Theorem 9.5 (Flow Decomposition Theorem). Any feasible flow f can be decomposed into no
more than m primitive elements.
Proof. The proof is by construction; that is, we will give an algorithm to find the primitive elements.
Let Af = {(i, j) ∈ E : fij > 0}, and G′ = (V, Af ).
Pseudocode
While there is an arc in Af adjacent to s (i.e. while there is positive flow from s to t), do:
Begin DFS in G′ from s, and stop when we either reach t along a simple path P or
find a simple cycle Γ. Let ξ denote the set of arcs of the primitive element found.
Set δ = min_{(i,j)∈ξ} {fij}.
Record either (P, δ) or (Γ, δ) as a primitive element.
Update the flow in G by setting fij ← fij − δ ∀(i, j) ∈ ξ.
Update Af. Note that the number of arcs in Af is reduced by at least one.
If no arc in Af leaves s, then Af must consist of cycles. While Af ≠ ∅ do:
Pick any arc (i, j) ∈ Af and start DFS from i.
When we come back to i (which is guaranteed by flow conservation), we have found
another primitive element, a cycle Γ.
Set δ = min_{(i,j)∈Γ} {fij}.
Record (Γ, δ) as a primitive element.
Update the flow in G by setting fij ← fij − δ ∀(i, j) ∈ Γ.
Update Af. Note that the number of arcs in Af is reduced by at least one.
Note that the above algorithm is correct since every time we update the flow, the new flow must
still be a feasible flow. Since each time we find a primitive element we remove at least one arc
from Af, we can find at most m primitive elements.
An interesting question is the complexity of the algorithm that generates the primitive elements,
and how it compares to maximum flow. We now show that the overall complexity is O(mn).
Assume that we have for each node the adjacency list of its outgoing arcs. Follow a path from s
in a DFS manner, visiting one arc at a time. If a node is repeated (visited twice) then we have found
a cycle. We trace it back, find its bottleneck capacity, and record the primitive element. The
bottleneck arcs are removed, and the affected adjacency lists are updated. This entire operation is
done in O(n), as we visit at most n nodes.
If a node is not repeated, then after at most n steps we reach the sink t. The path followed
is then a primitive path with flow equal to its bottleneck capacity. We record it and update the
flows and adjacency lists in O(n) time.
Since there are O(m) primitive elements, all are found in O(mn). Notice that this is
(pretty much) faster than any known maximum flow algorithm.
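A minimal Python sketch of this decomposition procedure (illustrative; the flow is assumed feasible and given as a dict of positive arc flows):

def decompose(flow, s, t):
    # flow: {(i, j): fij > 0}, assumed feasible; returns [(node list, delta), ...]
    flow = dict(flow)
    out = {}
    for (i, j) in flow:
        out.setdefault(i, set()).add(j)

    def walk(start, target):
        # follow positive-flow arcs until target is reached or a node repeats
        path, seen = [start], {start}
        while path[-1] != target:
            nxt = next(iter(out[path[-1]]))     # feasibility: never a dead end
            if nxt in seen:                     # found a simple cycle
                return path[path.index(nxt):] + [nxt]
            path.append(nxt)
            seen.add(nxt)
        return path                             # a simple path from start to target

    elements = []
    def peel(nodes):
        arcs = list(zip(nodes, nodes[1:]))
        delta = min(flow[a] for a in arcs)      # bottleneck flow of the element
        elements.append((nodes, delta))
        for (i, j) in arcs:
            flow[(i, j)] -= delta
            if flow[(i, j)] == 0:               # at least one arc leaves A_f
                del flow[(i, j)]
                out[i].discard(j)

    while out.get(s):                 # peel paths from s (or cycles met on the way)
        peel(walk(s, t))
    while flow:                       # whatever remains consists of cycles
        peel(walk(next(iter(flow))[0], None))
    return elements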
9.5 Algorithms
First we introduce some terminology that will be useful in the presentation of the algorithms.
Recall the definition of the residual graph Gf = (V, Af), with its forward and reverse arcs and their residual capacities (including the case of lower bounds), as given earlier.
Augmenting path: An augmenting path is a path from s to t in the residual graph. The capacity
of an augmenting path is the minimum residual capacity of the arcs on the path – the bottleneck
capacity.
If we can find an augmenting path with capacity δ in the residual graph, we can increase the flow
in the original graph by adding δ units of flow on the arcs in the original graph which correspond
to forward arcs in the augmenting path and subtracting δ units of flow on the arcs in the original
graph which correspond to reverse arcs in the augmenting path.
Note that this operation does not violate the capacity constraints, since δ is the smallest residual
capacity on the augmenting path: we can always add δ units of flow on the arcs of the original
graph corresponding to forward arcs, and subtract δ units of flow on the arcs corresponding to
reverse arcs, without violating any capacity constraint.
The flow balance constraints are not violated either, since for every node on the augmenting
path in the original graph we either increment the flow by δ on one of the incoming arcs and
increment the flow by δ on one of the outgoing arcs (this is the case when the incremental flow in
the residual graph is along forward arcs) or we increment the flow by δ on one of the incoming arcs
and decrease the flow by δ on some other incoming arc (this is the case when the incremental flow
in the residual graph comes into the node through the forward arc and leaves the node through the
reverse arc) or we decrease the flow on one of the incoming arcs and one of the outgoing arcs in the
original graph (which corresponds to sending flow along the reverse arcs in the residual graph).
Pseudocode:
f : flow;
Gf : the residual graph with respect to the flow f;
Pst : a path from s to t in Gf;
u^f_ij : residual capacity of arc (i, j).
Initialize f = 0
While ∃ Pst in Gf do
find δ = min_{(i,j)∈Pst} u^f_ij
augment f by δ along Pst and update Gf
When no Pst exists, stop: f is a maximum flow.
Detailed description:
1. Start with a feasible flow (usually fij = 0 ∀(i, j) ∈ A ).
2. Construct the residual graph Gf with respect to the flow.
3. Search for an augmenting path by doing breadth-first search from s (we consider nodes to be
adjacent if there is a positive-capacity arc between them in the residual graph) and checking
whether the set of s-reachable nodes (call it S) contains t.
If S contains t then there is an augmenting path (since we get from s to t by going through a
series of adjacent nodes), and we can then increment the flow along the augmenting path by
the value of the smallest arc capacity of all the arcs on the augmenting path.
We then update the residual graph (by setting the capacities of the forward arcs on the
augmenting path to the difference between the current capacities of the forward arcs on the
augmenting path and the value of the flow on the augmenting path and setting the capacities
of the reverse arcs on the augmenting path to the sum of the current capacities and the value
of the flow on the augmenting path) and go back to the beginning of step 3.
If S does not contain t then the flow is maximum.
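A minimal Python sketch of this augmenting path scheme with BFS (illustrative; integer capacities in a dict cap[(i, j)] are an assumption of the example):

from collections import deque

def max_flow(cap, s, t):
    resid = dict(cap)                           # forward residual capacities
    for (i, j) in cap:
        resid.setdefault((j, i), 0)             # reverse arcs start at 0
    adj = {}
    for (i, j) in resid:
        adj.setdefault(i, []).append(j)
    value = 0
    while True:
        pred, queue = {s: None}, deque([s])     # BFS in the residual graph
        while queue and t not in pred:
            i = queue.popleft()
            for j in adj.get(i, []):
                if j not in pred and resid[(i, j)] > 0:
                    pred[j] = i
                    queue.append(j)
        if t not in pred:
            return value                        # no augmenting path: f is maximum
        path, v = [], t                         # recover the path and its bottleneck
        while pred[v] is not None:
            path.append((pred[v], v))
            v = pred[v]
        delta = min(resid[a] for a in path)
        for (i, j) in path:
            resid[(i, j)] -= delta              # push delta forward ...
            resid[(j, i)] += delta              # ... and open the reverse arc
        value += delta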
To establish correctness of the augmenting path algorithm, we prove the following theorem, which
is actually stronger than the max-flow min-cut theorem.
Reminder: fij is the flow from i to j, and |f| = ∑_{(s,i)∈A} fsi = ∑_{(i,t)∈A} fit = ∑_{u∈S,v∈T} fuv
for any cut (S, T).
Theorem 9.6 (Augmenting Path Theorem). (generalization of the max-flow min-cut theorem)
The following conditions are equivalent:
1. f is a maximum flow.
2. There is no augmenting path for f .
3. |f | = U (S, T ) for some cut (S, T ).
Proof.
(1 ⇒ 2) If ∃ an augmenting path p, then we can strictly increase the flow along p; this contradicts
the maximality of f.
(2 ⇒ 3) Let Gf be the residual graph w.r.t. f . Let S be a set of nodes reachable in Gf from s.
Let T = V \ S. Since s ∈ S and t ∈ T then (S, T ) is a cut. For v ∈ S, w ∈ T, we have the following
implications:
(v, w) ∉ Gf ⇒ fvw = uvw and fwv = 0
⇒ |f| = ∑_{v∈S,w∈T} fvw − ∑_{w∈T,v∈S} fwv = ∑_{v∈S,w∈T} uvw = U(S, T).
(3 ⇒ 1) Since |f | ≤ U (S, T ) for any (S, T ) cut, then |f | = U (S, T ) =⇒ f is a maximum flow.
Note that the equivalence of conditions 1 and 3 gives the max-flow min-cut theorem.
Given the max flow (with all the flow values), a min cut can be found by looking at the residual
network. The set S consists of s and all the nodes that s can reach in the final residual network,
and the set S̄ consists of all the other nodes. Since s cannot reach any of the nodes in S̄, any arc
going from a node in S to a node in S̄ in the original network must be saturated, which implies
this is a minimum cut. This cut is known as the minimal source set minimum cut. Another way
to find a minimum cut is to let S̄ consist of t and all the nodes that can reach t in the final residual
network. This gives a maximal source set minimum cut, which is different from the minimal source
set minimum cut when the minimum cut is not unique.
While finding a minimum cut given a maximum flow can be done in linear time, O(m), we have
yet to discover an efficient way of finding a maximum flow given a list of the edges in a minimum cut,
other than simply solving the maximum flow problem from scratch. Also, we have yet to discover
a way of finding a minimum cut without first finding a maximum flow. Since the minimum cut
problem asks for less information than the maximum flow problem, it seems as if we should be
able to solve the former problem more efficiently than the latter. More on this in Section 9.2.
In a previous lecture we already presented the Ford-Fulkerson algorithm and proved its correctness.
In this section we analyze its complexity. For completeness we give a sketch of the algorithm.
Ford-Fulkerson Algorithm
Step 0: f = 0
Step 1: Construct Gf (Residual graph with respect to flow f)
Step 2: Find an augmenting path from s to t in Gf
If no such path exists, stop: f is a maximum flow.
Otherwise, let the path capacity be δ;
augment f by δ along this path and go to Step 1.
Theorem 9.7. If all capacities are integer and bounded by a finite number U , then the augmenting
path algorithm finds a maximum flow in time O(mnU ), where U = max(v,w)∈A uvw .
Proof. Since the capacities are integer, the value of the flow goes up at each iteration by at least
one unit.
Since the capacity of the cut ({s}, V \ {s}) is at most nU, the value of the maximum flow is at
most nU.
From the two above observations it follows that there are at most O(nU) iterations. Since each
iteration takes O(m) time (find a path and augment the flow), the total complexity is O(nmU).
The above result also applies for rational capacities, as we can scale to convert the capacities
to integer values.
Figure 24: Graph leading to long running time for the Ford-Fulkerson algorithm: the four arcs between s, t and nodes 1, 2 have capacity 2000 each, while the arc between nodes 1 and 2 has capacity 1.
3. Finally, for irrational capacities, this algorithm may converge to the wrong value (see
Papadimitriou and Steiglitz, pp. 126-128).
By the flow decomposition theorem, the difference between a maximum flow of value v∗ and the
current flow of value v decomposes into k ≤ m augmenting paths from s to t with sum of flows
equal to v∗ − v: ∑_{i=1}^{k} δi = v∗ − v
⇒ ∃ i such that δi ≥ (v∗ − v)/k ≥ (v∗ − v)/m
⇒ the remaining flow (after augmenting along a maximum capacity path) is ≤ ((m − 1)/m)(v∗ − v).
Repeat the above for q iterations: we stop when ((m − 1)/m)^q v∗ ≤ 1, since flows must be
integers. Thus it suffices to have q ≥ log v∗ / log(m/(m − 1)).
Now log(m/(m − 1)) ≈ 1/m, so q = O(m log v∗) = O(m log(nU)). Since a maximum capacity
augmenting path can be found in O(m + n log n) time (with a variant of Dijkstra's algorithm),
the overall complexity is O(m(m + n log n) log(nU)).
Why is the largest capacity augmenting path not necessarily a primitive path? A primitive
path never travels backwards along an arc, yet an augmenting path may contain backward arcs.
Thus, knowing the flow in advance is a significant advantage (not surprisingly).
From the above equation it is clear that we can find a feasible flow to the current iteration by
doubling the max-flow from the previous iteration.
Finally, note that the residual flow in the current iteration can’t be more than m. This is
true since in the previous iteration we had a max-flow and a corresponding min-cut. The residual
capacity at each arc of the min-cut from the previous scaling iteration can grow at most by one
unit as these arcs were saturated in that previous iteration. Therefore the max-flow in the residual
graph of the current iteration can be at most m units.
The preceding discussion is summarized in the following algorithm.
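A minimal Python sketch of the scaling scheme, assuming integer capacities and no antiparallel arc pairs (an illustrative reconstruction of the preceding discussion, not a definitive implementation):

from collections import deque

def scaling_max_flow(cap, s, t):
    # residual capacities, revealed one capacity bit per scaling phase
    resid = {a: 0 for a in cap}
    for (i, j) in cap:
        resid.setdefault((j, i), 0)
    adj = {}
    for (i, j) in resid:
        adj.setdefault(i, []).append(j)

    def augment_until_blocked():
        total = 0
        while True:
            pred, q = {s: None}, deque([s])          # BFS for an augmenting path
            while q and t not in pred:
                i = q.popleft()
                for j in adj.get(i, []):
                    if j not in pred and resid[(i, j)] > 0:
                        pred[j] = i
                        q.append(j)
            if t not in pred:
                return total
            path, v = [], t
            while pred[v] is not None:
                path.append((pred[v], v))
                v = pred[v]
            delta = min(resid[a] for a in path)
            for (i, j) in path:
                resid[(i, j)] -= delta
                resid[(j, i)] += delta
            total += delta

    value = 0
    for k in range(max(cap.values()).bit_length() - 1, -1, -1):
        value *= 2                            # the flow of the previous phase doubles
        for a in resid:
            resid[a] *= 2                     # so do the residual capacities
        for a, u in cap.items():
            resid[a] += (u >> k) & 1          # reveal the k-th capacity bit
        value += augment_until_blocked()      # at most m augmentations (Theorem 9.8)
    return value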
Theorem 9.8. The capacity scaling algorithm runs in O(m2 log2 U ) time.
Proof. Since for the arcs on the min cut of the previous scaling iteration the residual capacities
were increased by at most one each, the residual max flow at each iteration is bounded by the
number of arcs in the cut, which is ≤ m. So the number of augmentations at each iteration is O(m).
The complexity of finding an augmenting path is O(m).
The total number of iterations is O(log2 U ).
Thus the total complexity is O(m2 log2 U ).
Notice that this algorithm is polynomial, but still not strongly polynomial. Also it cannot
handle real capacities.
We illustrate the capacity scaling algorithm in Figure 25.
Proof. The algorithm finishes when s and t are disconnected in residual graph and so there is no
augmenting path. Thus by the augmenting path theorem the flow found by Dinic’s algorithm is a
maximum flow.
Complexity Analysis
We first argue that, after finding the blocking flow at the k-layered network, the shortest path in
the (new) residual graph is of length ≥ k + 1 (i.e. the level of t must strictly increase from one
stage to the next one). This is true since all paths of length k intersect at least one saturated arc.
Therefore, all the paths from s to t in the next layered network must have at least k + 1 arcs (all
paths with length ≤ k are already blocked). Thus, Dinic's algorithm has at most n stages (one
stage for each layered network).
To find the blocking flow in stage k we do the following work: 1) each DFS and update takes
O(k) time (note that by the definition of AN(f) we cannot have cycles), and 2) we find the blocking
flow in at most m updates (since each time we remove at least one arc from the layered network).
We conclude that Dinic's algorithm has complexity O(∑_{k=1}^{n} mk) = O(mn²).
Now we will show that we can improve this complexity by finding more efficiently the blocking flow.
In order to show this, we need to define one more concept.
Throughput: The throughput of a node is the minimum of the sum of the incoming arcs' residual
capacities and the sum of the outgoing arcs' residual capacities, that is,
thru(v) ≡ min{ ∑_{(i,v)∈Af} u^f_{i,v} , ∑_{(v,j)∈Af} u^f_{v,j} }.
The throughput of a node is the largest amount of flow that could possibly be routed through the
node.
The idea upon which the improved procedure to find a blocking flow is based is the following. We
find the node v with the minimum throughput thru(v). We know that at this node we can “pull”
thru(v) units of flow from s, and “push” thru(v) units of flow to t. After performing this pulling
and pushing, we can remove at least one node from the layered network.
Push δ from v to t:
Q ← {v}; excess(v) ← δ
While Q ≠ ∅
i ← first node in Q (remove i from Q)
For each arc (i, j), and while excess(i) > 0
Δ ← min{u^f_ij, excess(i)}
excess(i) ← excess(i) − Δ
excess(j) ← excess(j) + Δ
thru(j) ← thru(j) − Δ
fij ← fij + Δ
if j ∉ Q add j to end of Q
End For
End While
The “Pull” procedure has a similar implementation.
Theorem 9.10. Dinic’s algorithm solves the max-flow problem for a network G = (V, A) in O(n3 )
arithmetic operations.
At each stage we need to create the layered network with BFS. This takes O(m) complexity.
The operations at each stage are either a saturating push along an arc or an unsaturating
push along an arc. So we can write the number of operations as the sum of saturating and
unsaturating pull and push steps, N = Ns + Nu. We first note that once an arc is saturated, it is
deleted from the graph; therefore Ns = O(m). However we may have many unsaturating steps for
the same arc.
The key observation is that we have at most n executions of the push and pull procedures
(along a path) at each stage. This is so because each such execution results in the deletion of the
node v with the lowest throughput where this execution started. Furthermore, in each execution
of the push and pull procedures, we have at most n unsaturated pushes – one for each node along
the path. Thus, we have at each stage at most O(n2 ) unsaturated pushes and N = Ns + Nu =
O(m) + O(n2 ) = O(n2 ). Therefore, the complexity per stage is O(m + n2 ) = O(n2 ).
We conclude that the total complexity is thus O(n3 ).
Definition 9.11. A unit capacity network is an s − t network, where all arc capacities are equal
to 1.
Some problems that can be solved by finding a maximum flow in unit capacity networks are
the following.
• Maximum Bipartite matching,
• Maximum number of edge disjoint Paths.
Lemma 9.12. In a unit capacity network with distance from s to t greater than or equal to ℓ, the
maximum flow value |f̂| satisfies √|f̂| ≤ 2|V|/ℓ.
Proof. Construct a layered network with Vi the set of nodes in layer i (see Figure 26), and let
Si = V0 ∪ V1 ∪ · · · ∪ Vi for 1 ≤ i ≤ ℓ − 1.
Figure 26: The layered network; the cuts (Si, S̄i) separate consecutive layers.
Then (Si, S̄i) is an s-t cut. Using the fact that the maximum flow value is at most the capacity
of any cut,
|f̂| ≤ |Vi||Vi+1| ⇒ either |Vi| ≥ √|f̂| or |Vi+1| ≥ √|f̂|.
The first inequality comes from the fact that the maximum possible capacity of the cut is equal
to |Vi||Vi+1|: each arc has a capacity of one, and the maximum number of arcs from Vi to Vi+1
is |Vi||Vi+1| (all combinations of a node of Vi with a node of Vi+1).
So at least ℓ/2 of the layers have at least √|f̂| nodes each (consider the pairs V0V1, V2V3, . . .).
√|f̂| · (ℓ/2) ≤ ∑_{i=1}^{ℓ} |Vi| ≤ |V| ⇒ √|f̂| ≤ 2|V|/ℓ.
Claim 9.13. Dinic’s Algorithm, applied to the unit capacity network requires at most O(n2/3 )
stages.
Proof. Notice that in this case it is impossible to have non-saturating arc processing, because of
the unit capacities, so the total computation per stage is no more than O(m). If the max flow is
≤ 2n^{2/3} we are done, since each stage increments the flow by at least one unit.
If the max flow is > 2n^{2/3}, run the algorithm for 2n^{2/3} stages; the distance from s to t
increases by at least 1 at each stage. Let g be the max flow in the residual network after these
2n^{2/3} stages. The shortest path from s to t in this residual network satisfies ℓ ≥ 2n^{2/3}.
Applying Lemma 9.12 to this residual network,
√|g| ≤ 2|V| / (2n^{2/3}) = n^{1/3} (for |V| = n) ⇒ |g| ≤ n^{2/3}.
It follows that no more than n2/3 additional stages are required. Total number of stages ≤
3n2/3 .
Complexity Analysis
Any arc processed is saturated (since this is a unit capacity network). Hence O(m) work per stage.
There are O(n2/3 ) stages. This yields an overall complexity O(mn2/3 ).
Remark: A. Goldberg and S. Rao [GR98] devised a new algorithm for maximum flow in general
networks that uses ideas of the unit capacity case as a subroutine. The overall running time of their
algorithm is O(min{n^{2/3}, m^{1/2}} m log(n²/m) log U) for U the largest capacity in the network.
Lemma 9.14. In a simple network with distance from s to t greater than or equal to ℓ, the max
flow f̂ satisfies |f̂| ≤ |V|/(ℓ − 1).
Proof. Consider layers V0, V1, . . . and the cut (Si, S̄i), where Si = V0 ∪ · · · ∪ Vi. Since for
i = 1, . . . , ℓ − 1 the throughput of each node in layer Vi is one,
∑_{i=1}^{ℓ−1} |Vi| ≤ |V| ⇒ min_{i=1,...,ℓ−1} |Vi| ≤ |V|/(ℓ − 1).
Claim 9.15. Applying Dinic's Algorithm to a simple network, the number of stages required is
O(√n).
Proof. If the max flow is ≤ √n we are done. Otherwise, run Dinic's algorithm for √n stages; then
ℓ ≥ √n + 1 in the residual network, since each stage increments the flow by at least one unit and
increases the distance label from the source to the sink by at least one. In the residual network,
the max flow g satisfies |g| ≤ |V|/((√n + 1) − 1) = √n (for |V| = n). So we need at most √n
additional stages, and thus the total number of stages is O(√n).
Complexity Analysis
Recall that we are considering simple networks. All push/pull operations saturate a node, since
each node v has only one incoming arc or one outgoing arc of capacity 1; the processing thus makes
the throughput of node v equal to 0.
Note that even though we only have to check O(n) nodes, we still have to update the layered
network, which requires O(m) work. Hence the work per stage is O(m).
Thus, the complexity of the algorithm is O(m√n).
References
[Din1] E. A. Dinic. Algorithm for Solution of a Problem of Maximum Flows in Networks with
Power Estimation. Soviet Math. Dokl., 11, 1277-1280, 1970.
[FF1] L. R. Ford, Jr. and D. R. Fulkerson. Maximal Flow through a Network. Canad. J.
Math., 8, 399-404, 1956.
[GR98] A. V. Goldberg and S. Rao. Beyond the flow decomposition barrier. Journal of the
ACM 45, 783–797, 1998.
[GT1] A. V. Goldberg and R. E. Tarjan. A new approach to the maximum flow problem. J.
of ACM, 35, 921-940, 1988.
• There is never an augmenting path in the residual network. The preflow is possibly infeasible,
but super-optimal.
Definitions
Preflow: A preflow f⃗ in a network is one that satisfies the capacity constraints, and in which the
inflow into every node other than s is greater than or equal to its outflow (i.e. every node other
than s may carry a nonnegative excess).
Distance Labels: Each node v is assigned a label. The labeling is such that it satisfies the
following inequality
d(i) ≤ d(j) + 1 ∀(i, j) ∈ Af .
Admissible arc: An arc (u, v) is called admissible if (u, v) ∈ Af and d(u) > d(v). Notice that
together with the validity of the distance labels this implies that d(u) = d(v) + 1.
Lemma 10.1. The admissible arcs form a directed acyclic graph (DAG).
Proof: Suppose for contradiction that there is a cycle (i1, i2, ..., ik, i1) of admissible arcs. From the
definition of an admissible arc, d(i1) > d(i2) > . . . > d(ik) > d(i1), from which the contradiction
follows.
Lemma 10.2. The distance label of node i, d(i), is a lower bound on the shortest path distance in
Af from i to t, if such path exists.
Proof: Let the shortest path from i to t consist of k arcs: (i, i1 , ..., ik−1 , t). From the validity of
the distance labels,
d(i) ≤ d(i1 ) + 1 ≤ d(i2 ) + 2 ≤ . . . ≤ d(t) + k = k.
Corollary 10.3. If the distance label of a node i, satisfies d(i) ≥ n, then there is no path in Af
from i to t. That is, t is not reachable from i.
Now we prove the following lemmas about this algorithm, which will establish its correctness.
Lemma 10.4. Relabeling preserves the validity of the distance labels.
Proof: Relabeling is applied to a node i with d(i) ≤ d(j) ∀(i, j) ∈ Af. The new label is
d′(i) = min_{(i,j)∈Af} {d(j) + 1}. This implies
d′(i) ≤ d(j) + 1 ∀(i, j) ∈ Af,
so the labels remain valid.
Lemma 10.5 (Super optimality). For preflow f⃗ and distance labeling d⃗ there is no augmenting
path from s to t in the residual graph Gf .
Lemma 10.6 (Optimality). When there is no active node in the residual graph (∀v, ev = 0) preflow
is a maximum flow.
Proof: Since there is no node with positive excess, the preflow is a feasible flow. Also from the
previous lemma there is no augmenting path, hence the flow is optimal.
Lemma 10.7. For any active node i, there exists a path in Gf from i to s.
Proof: Let Si be the set of nodes reachable in Gf from i. If s ∉ Si, then all nodes in Si have
non-negative excess. (Note that s is the only node in the graph that is allowed to have a negative
excess, i.e. a deficit.) The in-flow into Si must be 0 (else we could reach more nodes than Si), so
0 = inflow(Si)
≥ inflow(Si) − outflow(Si)
= ∑_{j∈Si} [inflow(j) − outflow(j)]
= ∑_{j∈Si\{i}} [inflow(j) − outflow(j)] + ef(i)
> 0,
where the last inequality holds since node i is active and hence has strictly positive excess. This
is a contradiction, so s ∈ Si, and there exists a path in Gf from i to s.
Lemma 10.8. When all active nodes satisfy d(i) ≥ n, then the set of nodes S = {i|d(i) ≥ n} is a
source set of a min-cut.
Proof: Firstly, there is no residual (augmenting) path to t from any node in S (Corollary 10.3).
Let S + = {i | d(i) ≥ n, e(i) > 0}
From Lemma 10.7, each active node can send back excess to s. Hence, the preflow can be converted
to (a feasible) flow and the saturated cut (S, S̄) is a min-cut. Therefore the resulting feasible flow
is maximum flow. (Although the conversion of preflow to a feasible flow can be accomplished by
the algorithm, it is more efficient to send flow from excess nodes back to source by using the flow
decomposition algorithm. That is, terminate the push-relabel algorithm when there is no active
node of label < n.)
Note that when we terminate the push-relabel algorithm when there is no active node of label
< n, any node of label ≥ n must be active. This is because when it was last relabeled it was active,
and it cannot have gotten rid of its excess as of yet.
Notice also that the source set S is minimal source set of a min cut. The reason is that the
same paths that are used to send the excess flows back to source, can then be used in the residual
graph to reach those nodes of S.
Lemma 10.9. ∀v, d(v) ≤ 2n − 1
Proof: Consider an active node v. From Lemma 10.7, s is reachable from v in Gf. Consider a
simple path P = (i0, i1, . . . , ik) from v to s, where i0 = v and ik = s. Now,
d(s) = n (fixed)
d(i0) ≤ d(i1) + 1 ≤ d(i2) + 2 ≤ . . . ≤ d(ik) + k = n + k.
Since any simple path has no more than n nodes, we have k ≤ n − 1, which implies that
d(i0) ≤ 2n − 1.
Figure: Pushes are executed from a node of higher label (height) to a node of lower label; three successive pushes are shown against the height, i.e. the label, of the nodes.
2. Relabeling:
Each relabel increases Φ by the increase in the label.
Maximum increase of label per node = 2n − 1
Total increase in Φ by relabeling = O(n2 ). This term is dominated by O(n2 m).
Since total increase = O(n2 m), the number of non-saturating pushes is at most O(n2 m)
Overall Complexity
Cost of relabels = O(nm)
Cost of saturating pushes = O(nm)
Cost of non-saturating pushes = O(n2 m)
Therefore total complexity is O(n2 m).
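A minimal Python sketch of the push-relabel algorithm (illustrative; FIFO selection of active nodes is one concrete choice, and the sketch runs the algorithm all the way to a feasible maximum flow rather than stopping at the min-cut stage):

from collections import deque

def push_relabel(n, cap, s, t):
    # nodes 0..n-1 with integer capacities cap[(i, j)] (an assumed input format)
    resid = dict(cap)
    for (i, j) in cap:
        resid.setdefault((j, i), 0)
    adj = {}
    for (i, j) in resid:
        adj.setdefault(i, []).append(j)
    d = {v: 0 for v in range(n)}
    d[s] = n                                    # d(s) = n is fixed
    excess = {v: 0 for v in range(n)}
    active = deque()
    for j in adj.get(s, []):                    # preprocess: saturate arcs out of s
        delta = resid.get((s, j), 0)
        if delta > 0:
            resid[(s, j)] -= delta
            resid[(j, s)] += delta
            excess[j] += delta
            active.append(j)
    while active:
        u = active.popleft()
        if u in (s, t) or excess[u] == 0:
            continue
        for v in adj[u]:                        # push along admissible arcs
            if excess[u] == 0:
                break
            if resid[(u, v)] > 0 and d[u] == d[v] + 1:
                delta = min(excess[u], resid[(u, v)])
                resid[(u, v)] -= delta
                resid[(v, u)] += delta
                excess[u] -= delta
                excess[v] += delta
                if v not in (s, t):
                    active.append(v)
        if excess[u] > 0:                       # no admissible arc left: relabel
            d[u] = 1 + min(d[v] for v in adj[u] if resid[(u, v)] > 0)
            active.append(u)
    return excess[t]                            # the value of the maximum flow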
• Highest label preflow: picks an active node to process that has the highest distance label; it
runs in O(n²√m) time.
• Dynamic tree implementation: running time O(mn log(n²/m)).
preprocess()
let L be a list of all nodes, and let v be the first node of L
while there exists an active node do
if v is active then push/relabel v
if v gets relabeled, place v in front of L
else replace v by the node after v in L
end
Proof. It suffices to show that there are O(n3 ) non-saturating pushes for the wave algorithm. Define
a phase to be the period between two consecutive relabels.
There are ≤ n(2n − 1) relabels, so there are O(n²) phases, and we need to show that for each node
there is at most one non-saturating push per phase (Lemma 10.16).
Claim 10.15. The list L is in topological order for the admissible arcs of Af .
Proof. By induction on the iteration index. After preprocess, there are no admissible arcs and so
any order is topological. Given a list in topological order, process node u. 'Push' can only eliminate
admissible arcs (with a saturating push) and 'relabel' can create admissible arcs, but none into u
(Lemma 10.10), so moving u to the front of L maintains topological order.
Lemma 10.16. At most one non-saturating push per node per phase.
Proof. Let u be a node from which a non-saturating push is executed. Then u becomes inactive.
Also all the nodes preceding u in L are inactive. In order for u to become active, one node preceding
it has to become active. But every node can only make nodes lower than itself active. Hence the
only way for u to become active is for some node to be relabeled.
11.1 Initialization
The pseudoflow algorithm starts with a pseudoflow and an associated current forest, called in
[Hoc08], a normalized tree. Each non-root in a branch (tree) of the forest has a parent node, and
the unique arc from the parent node is called current arc. The root has no parent node and no
current arc.
The generic initialization is the simple initialization: source-adjacent and sink-adjacent arcs are
saturated while all other arcs have zero flow.
If a node v is both source-adjacent and sink-adjacent, then at least one of the arcs (s, v) or (v, t)
can be pre-processed out of the graph by sending a flow of min{csv , cvt } along the path s → v → t.
This flow eliminates at least one of the arcs (s, v) and (v, t) in the residual graph. We henceforth
assume w.l.o.g. that no node is both source-adjacent and sink-adjacent.
The simple initialization creates a set of source-adjacent nodes with excess, and a set of sink-
adjacent nodes with deficit. All other arcs have zero flow, and the set of current arcs is selected to
be empty. Thus, each node is a singleton component for which it serves as the root, even if it is
balanced (with 0-deficit).
A second type of initialization is obtained by saturating all arcs in the graph. The process of
saturating all arcs could create nodes with excesses or deficits. Again, the set of current arcs is
empty, and each node is a singleton component for which it serves as the root. We refer to this as
the saturate-all initialization scheme.
Figure 29: (a) Components before merger (b) Before pushing flow along admissible path from ri
to rj (c) New components generated when arc (u, v) leaves the current forest due to insufficient
residual capacity.
/*
Min-cut stage of the generic labeling pseudoflow algorithm. All nodes in label-n compo-
nents form the nodes in the source set of the min-cut.
*/
procedure GenericPseudoflow (Vst , Ast , c):
begin
SimpleInit (As , At , c);
while ∃ a good active component T do
Find a lowest labeled node u ∈ T ;
if ∃ admissible arc (u, v) do
Merger (root(u), · · · , u, v, · · · , root(v));
else do
ℓ(u) ← ℓ(u) + 1;
end
/*
Saturates source- and sink-adjacent arcs.
*/
procedure SimpleInit(As , At , c):
begin
f, e ← 0;
for each v ∈ V do
ℓ(v) ← 0;
current(v) ← ∅;
for each (s, i) ∈ As do
e(i) ← e(i) + csi ; ℓ(i) ← 1;
for each (i, t) ∈ At do
e(i) ← e(i) − cit ; ℓ(i) ← 0;
end
/*
Pushes flow along an admissible path and preserves invariants.
*/
procedure Merger(v1 , · · · , vk ):
begin
for each j = 1 to k − 1 do
if e(vj ) > 0 do
δ ← min{c(vj , vj+1 ), e(vj )};
e(vj ) ← e(vj ) − δ;
e(vj+1 ) ← e(vj+1 ) + δ;
if e(vj ) > 0 do
current(vj ) ← ∅;
else do
current(vj ) ← vj+1 ;
end
In the generic labeling algorithm, a node was relabeled if no admissible arc was found from the
node. In the monotone implementation, a node u is relabeled only if no admissible arc is found and
for all current arcs (u, v) in the component, ℓ(v) = ℓ(u) + 1. This feature, along with the merger
process, inductively preserves the monotonicity property. The pseudocode for the Min-cut Stage
of the monotone implementation of the pseudoflow algorithm is given in Figure 33.
/*
Min-cut stage of the monotone implementation of pseudoflow algorithm. All nodes in
label-n components form the nodes in the source set of the min-cut.
*/
procedure MonotonePseudoflow (Vst , Ast , c):
begin
SimpleInit (As , At , c);
while ∃ a good active component T with root r do
u ← r;
while u ̸= ∅ do
if ∃ admissible arc (u, v) do
Merger (root(u), · · · , u, v, · · · , root(v));
u ← ∅;
else do
if ∃w ∈ T : (current(w) = u) ∧ (ℓ(w) = ℓ(u)) do
u ← w;
else do
ℓ(u) ← ℓ(u) + 1;
u ← current(u);
end
The monotone implementation simply delays relabeling of a node until a later point in the
algorithm, which does not affect the correctness of the labeling pseudoflow algorithm.
References
[CH09] B.G. Chandran and D.S. Hochbaum. A computational study of the pseudoflow and
push-relabel algorithms for the maximum flow problem. Operations Research, 57(2):358-376, 2009.
[Hoc08] D.S. Hochbaum. The pseudoflow algorithm: A new algorithm for the maximum-flow
problem. Operations Research, 56(4):992-1009, 2008.
[HO12] D.S. Hochbaum and J.B. Orlin. Simplifications and speedups of the pseudoflow
algorithm. Networks, 2012. (to appear).
12.1 IP formulation
The decision variables for the IP formulation of MST are:
xe = 1 if edge e ∈ ET, and xe = 0 otherwise.
The constraints of the IP formulation need to enforce that the edges in ET form a tree. Recall
from early lectures that a tree satisfies the following three conditions: it has n − 1 edges, it is
connected, and it is acyclic. Also recall that any two of these three conditions imply the third.
An IP formulation of MST is:
min ∑_{e∈E} we xe (15a)
s.t. ∑_{e∈E} xe = n − 1 (15b)
∑_{e∈(S,S)} xe ≤ |S| − 1 ∀ S ⊆ V (15c)
xe ∈ {0, 1} ∀ e ∈ E (15d)
where (S, S) denotes the set of edges that have both endpoints in S. Inequality (15c) for S enforces
the property that if a subset of edges in (S, S) is connected (induces a connected subgraph), then
this subset is acyclic. If (S, S) is disconnected and contains a cycle, then there is another set S ′ ⊂ S
that violates the respective constraint for S ′ . Therefore the edges in ET can’t form cycles.
We have the following observations.
1. The constraint matrix of problem (15) does not have a network flow structure, and is not totally unimodular. However, Jack Edmonds proved that the LP relaxation has integral extreme points.
2. The formulation contains an exponential number of constraints.
3. Even though the LP relaxation has integral extreme points, this does not imply that we can
solve the MST problem in polynomial time. This is because of the exponential size of the
formulation. Nevertheless, we can use the ellipsoid method with a separation algorithm to
solve problem (15):
Relax the set of constraints given in (15c), and solve the remaining problem. Given the solu-
tion to such relaxed problem, find one of the relaxed constraints that is violated (this process
is called separation) and add it. Resolve the problem including the additional constraints,
and repeat until no constraint is violated.
It is still not clear that the above algorithm solves MST in polynomial time. However, in light of the equivalence between separation and optimization (one of the central theorems in optimization theory), and since we can separate the inequalities (15c) in polynomial time, it follows that we can solve problem (15) in polynomial time. This can be done using the ellipsoid algorithm for linear programming.
Note that, like the set of constraints (15c), the number of these inequalities is exponential.
Theorem 12.1. Let T ∗ be a spanning tree in G = (V, E). Consider edge [i, j] in T ∗ . Suppose removing [i, j] from T ∗ disconnects the tree into S and S̄. T ∗ is an MST iff for all such [i, j], for all [u, v], u ∈ S, v ∈ S̄, wuv ≥ wij .
Proof. Suppose T ∗ is an MST and wij > wkl for some k ∈ S, l ∈ S̄. Let T ′ = T ∗ \ {[i, j]} ∪ {[k, l]}. Then T ′ is a spanning tree of lower weight than T ∗ , a contradiction.
Now suppose T ∗ is a spanning tree that satisfies the cut optimality condition, and suppose there exists some MST T ′ ̸= T ∗ , with [k, l] ∈ T ′ but not in T ∗ .
We will construct T ′′ from T ∗ with one less edge different from T ′ . Add [k, l] to T ∗ , creating a cycle. Let [i, j] be the arc with the largest weight in the cycle; removing it creates a new spanning tree. By the cut optimality condition wij ≤ wkl , and by the optimality of T ′ wij ≥ wkl , so the new tree has the same weight as the old tree. Repeating this process, we find that T ∗ has the same cost as T ′ , so T ∗ is an MST.
Theorem 12.2. T ∗ is an MST iff for all [i, j] not in T ∗ , cij ≥ ckl for all [k, l] in the unique path from i to j in T ∗ .
Proof. Let T ∗ be an MST. Suppose ∃[i, j] not in T ∗ that has a strictly lower cost than an edge in
the path from i to j in T ∗ . Then add [i, j] to T ∗ , forming a unique cycle, and remove one such
edge in the path. The new spanning tree is of strictly lower cost, which is a contradiction.
Figure 35: Path optimality condition
Let T ∗ be a spanning tree that satisfies the path optimality condition. Remove edge [i, j] ∈ T ∗ , disconnecting T ∗ into S and S̄ such that i ∈ S, j ∈ S̄.
For every edge [k, l] not in T ∗ with k ∈ S, l ∈ S̄, the path from k to l in T ∗ must have used [i, j]. But cij ≤ ckl by path optimality, which is the same statement as cut optimality, so the two conditions are equivalent.
Proof. Suppose no MST containing F contains edge [i, j], and suppose you are given T ∗ not containing [i, j]. Add [i, j] to ET ∗ ; this will create a cycle. Remove the other edge from U1 to U1c that belongs to the created cycle. Since the weight of this removed edge must be greater than or equal to that of edge [i, j], this is a contradiction.
T ← T ∪ {[i, j]}; P ← P ∪ {j}
end
The while loop is executed n − 1 times, and finding an edge of minimum weight takes O(m) work, so this naive implementation is an O(nm) algorithm. However, note the similarity between this algorithm and Dijkstra's algorithm; in particular, we can apply the same tricks to improve its running time.
We can make this a faster algorithm by maintaining, in a binary heap, a list of sorted edges.
Each edge is added to the heap when one of its end points enters P , while the other is outside
P . It is then removed from the heap when both its end points are in P . Adding an element to
a heap takes O(log m) work, and deleting an element also takes O(log m) work. Thus, total work
done in maintaining the heap is O(m log m), which is O(m log n) since m is O(n2 ). Thus, total
work done in this binary heap implementation is O(m log n).
With Fibonacci heaps, this can be improved to O(m + n log n).
For dense graphs, there is a faster, O(n2 ) algorithm. This algorithm maintains an array Q[i]
for the shortest distance from P to each node i ∈ P̄ , outside of P .
begin
P ← {1}; Q[i] ← w1i for all i ̸= 1
while P ̸= V do
Pick j ∈ P̄ with Q[j] minimum
P ← P ∪ {j}
for each i ∈ P̄ do Q[i] ← min{Q[i], wji }
end
Figure 36: An example of applying Prim's algorithm

Figure 37: An example of Kruskal's algorithm (iterations 1 through 7)
Checking whether the endpoints of an edge belong to the same component can be done with DFS on (V, ET ); this takes O(n). Therefore this naive implementation has a running time of O(nm).
We will now show that Kruskal's algorithm has a running time of O(m + n log n) (plus the time to sort the edges) without special data structures:
Each component is maintained as a linked list with a "first" element label for each node in the component, serving as a representative of the component. The size of the component is also stored with the representative "first" of the component.
1. Checking for edge [i, j] whether i and j belong to the same component is done by comparing their components' "first" labels. This is done in O(1).
2. If for edge [i, j], i and j belong to different components, then the edge is added and the two components are merged. In that case the labels of the smaller component's nodes are all updated to assume the label of the representative of the larger component. The size of the component, stored as a second label with "first", is modified by adding to it the size of the smaller component. The complexity is proportional to the size of the smaller component merged.
For each edge [i, j] inspected, checking whether i and j belong to the same component or not
takes O(m) operations throughout the algorithm. To evaluate the complexity of merger, note that
the “first” label of a node i is updated when i is in the smaller component. After the merger the size
of the component that contains i at least doubles. Hence the label update operation on node i can
take place at most O(log n) times. Therefore the total complexity of the labels updates throughout
the algorithm is O(m log n).
Hence, the total complexity of Kruskal’s algorithm is O(m log n + m + n log n). The complexity
of the initial sorting of the edges, O(m log n), dominates the run time of the algorithm.
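A Python sketch of Kruskal's algorithm using exactly the component-label scheme described above: every node stores the label of its component's representative, and each merge relabels the smaller component. Identifier names are ours.

def kruskal(n, edges):
    # edges: list of (weight, u, v) tuples on nodes 0..n-1.
    label = list(range(n))                  # label[v] = representative ("first") of v's component
    members = {v: [v] for v in range(n)}    # component node lists, stored at the representative
    tree = []
    for w, u, v in sorted(edges):           # the O(m log n) sort dominates the run time
        ru, rv = label[u], label[v]
        if ru == rv:                        # same component: the edge would close a cycle
            continue
        if len(members[ru]) < len(members[rv]):
            ru, rv = rv, ru                 # always relabel the smaller component
        for node in members[rv]:
            label[node] = ru                # each node is relabeled at most O(log n) times
        members[ru].extend(members.pop(rv))
        tree.append((u, v, w))
    return tree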
TSP OPT = { Find a tour (a cycle that visits each node exactly once) of total minimum
distance. }
TSP EVAL = { What is the total distance of the tour with total minimum distance in G =
(V, E)? }
TSP DEC = { Is there a tour in G = (V, E) with total distance ≤ M ? }
Given an algorithm to solve TSP DEC, we can solve TSP EVAL as follows.
1. Find an upper bound and a lower bound for the TSP optimal objective value. Let Cmin = min(i,j)∈E cij and Cmax = max(i,j)∈E cij . Since a tour contains exactly n edges, an upper bound (respectively, a lower bound) on the optimal objective value is n · Cmax (respectively, n · Cmin ).
2. Find the optimal objective value by binary search in the range [n · Cmin , n · Cmax ]. This binary search is done by calling the algorithm that solves TSP DEC O(log2 (n(Cmax − Cmin ))) times, which is a polynomial number of times (polynomial in the length of the input).
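A Python sketch of the binary search in step 2. The oracle tsp_dec is an assumption: it returns True iff some tour has total distance at most M. Distances are assumed integral, so the search terminates at the exact optimum.

def tsp_eval(tsp_dec, n, c_min, c_max):
    # The optimal tour length lies in [n*c_min, n*c_max]; halve the interval per call.
    lo, hi = n * c_min, n * c_max
    while lo < hi:
        mid = (lo + hi) // 2
        if tsp_dec(mid):
            hi = mid          # a tour of value <= mid exists, so the optimum is <= mid
        else:
            lo = mid + 1      # no such tour, so the optimum exceeds mid
    return lo                 # found after O(log(n(c_max - c_min))) oracle calls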
An important problem for which the distinction between the decision problem and giving a solution to the decision problem – the search problem – is significant is primality. It was an open question (until Aug 2002) whether or not there exists a polynomial-time algorithm for testing whether an integer is a prime number. However, for the corresponding search problem, finding all factors of an integer, no similarly efficient algorithm is known.
Another example: A graph is called k-connected if one has to remove at least k vertices in order to make it disconnected. A theorem says that there exists a partition of the graph into k connected components of sizes n1 , n2 , . . . , nk , with n1 + n2 + · · · + nk = n (the number of vertices), such that each component contains one of the vertices v1 , v2 , . . . , vk . The optimization problem is finding such a partition minimizing ∑i |ni − n̄|, for n̄ = n/k. No polynomial-time algorithm is known for k ≥ 4. However, the corresponding recognition problem is trivial, as the answer is always Yes once the graph has been verified (in polynomial time) to be k-connected. So is the evaluation problem: the answer is always the sum of the averages rounded up with a correction term, or the sum of the averages rounded down, with a correction term for the residue.
For the rest of the discussion, we shall refer to the decision version of a problem, unless stated
otherwise.
To illustrate, consider again the decision version of TSP. That is, we want to know if there exists a tour with total distance ≤ M . If the answer to our problem is "yes", the prover can provide us with such a tour, and we can verify in polynomial time that 1) it is a valid tour and 2) its total distance is ≤ M . However, if the answer to our problem is "no", then (as far as we know) the only way to verify this would be to check every possible tour (certainly not a poly-time computation).
13.3 co-NP
Suppose the answer to the recognition problem is No. How would one certify this?
Definition 13.2. A decision problem is said to be in co-NP if for all “No” instances of it there
exists a polynomial-length “certificate” that can be used to verify in polynomial time that the answer
is indeed No.
Figure 39: NP, co-NP, and P (the figure shows PRIME and COMPOSITE in NP ∩ co-NP)
13.5.1 Reducibility
There are two definitions of reducibility, Karp and Turing; either can be used to define NP-hardness. The following definition is in the Turing (Cook) style of reducibility.
Definition 13.4. A problem P1 is said to reduce in polynomial time to problem P2 (written as "P1 ∝ P2 ") if there exists a polynomial-time algorithm A1 for P1 that makes calls to a subroutine solving P2 , where each call to the subroutine solving P2 is counted as a single operation.
We then say that P2 is at least as hard as P1 . (Karp reductions, in contrast, allow only one call to the subroutine, whose answer is returned unchanged.)
Important: In all of this course when we talk about reductions, we will always be referring to
polynomial time reductions.
Theorem 13.5. If P1 ∝ P2 and P2 ∈ P, then P1 ∈ P.
Proof. Let the algorithm A1 be the algorithm defined by the P1 ∝ P2 reduction. Let A1 run in
time O(p1 (|I1 |)) (again, counting each of the calls to the algorithm for P2 as one operation), where
p1 () is some polynomial and |I1 | is the size of an instance of P1 .
Let algorithm A2 be the poly-time algorithm for problem P2 , and assume this algorithm runs
in O(p2 (|I2 |)) time.
The proof relies on the following two observations:
1. The algorithm A1 can call at most O(p1 (|I1 |)) times the algorithm A2 . This is true since each
call counts as one operation, and we know that A1 performs O(p1 (|I1 |)) operations.
2. Each time the algorithm A1 calls the algorithm A2 , it gives it an instance of P2 of size at most O(p1 (|I1 |)). This is true since each bit of the created P2 instance either is a bit of the instance I1 , or took at least one operation to create (and recall that A1 performs O(p1 (|I1 |)) operations).
We conclude that the resulting algorithm for solving P1 (now counting all operations) performs at most O(p1 (|I1 |) + p1 (|I1 |) · p2 (p1 (|I1 |))) operations. Since the multiplication and composition of polynomials is still a polynomial, this is a polynomial-time algorithm.
13.5.2 NP-Completeness
A problem Q is said to be NP-hard if B ∝ Q ∀ B ∈ NP, that is, if all problems in NP are polynomially reducible to Q. A problem Q is NP-complete if, in addition, Q ∈ NP.
It follows from Theorem 13.5 that if any NP-complete problem were to have a polynomial algorithm, then all problems in NP would. Also, it follows from Corollary 13.6 that if we prove that some NP-complete problem has no polynomial-time algorithm, then no NP-complete problem has a polynomial algorithm.
Note that when a decision problem is NP-complete, its optimization version is NP-hard.
Conjecture: P ̸= NP. So, we do not expect any polynomial algorithm to exist for an NP-complete
problem.
SAT = {Given a boolean function in conjunctive normal form (CNF)3 , does it have a satisfying assignment of variables?}
Proof.
SAT is in NP: Given the boolean formula and the assignment to the variables, we can substitute
the values of the variables in the formula, and verify in linear time that the assignment indeed
satisfies the boolean formula.
3 In boolean logic, a formula is in conjunctive normal form if it is a conjunction of clauses (i.e., clauses are "linked by and"), and each clause is a disjunction of literals (i.e., literals are "linked by or"; a literal is a variable or the negation of a variable).
B ∝ SAT ∀B ∈ NP: The idea behind this part of the proof is the following: Let X be an instance
of a problem Q, where Q ∈ NP. Then, there exists a SAT instance F (X, Q), whose size is
polynomially bounded in the size of X, such that F (X, Q) is satisfiable if and only if X is a
“yes” instance of Q.
The proof, in very vague terms, goes as follows. Consider the process of verifying the cor-
rectness of a “yes” instance, and consider a Turing machine that formalizes this verification
process. Now, one can construct a SAT instance (polynomial in the size of the input) mim-
icking the actions of the Turing machine, such that the SAT expression is satisfiable if and
only if verification is successful.
Karp4 pointed out the practical importance of Cook’s theorem, via the concept of reducibility.
To show that a problem Q is NP-complete, one need only demonstrate two things:
1. Q ∈ NP, and
2. Q is NP-Hard. However, now to show this we only need to show that Q′ ∝ Q, for some
NP-hard or NP-complete problem Q′ . (A lot easier than showing that B ∝ Q ∀B ∈ NP,
right?)
Starting with SAT, Karp produced a series of problems that he showed to be NP-complete.5
To illustrate how we prove NP-Completeness using reductions, and that, apparently very dif-
ferent problems can be reduced to each other, we will prove that the Independent Set problem is
NP-Complete.
Independent Set (IS) = {Given a graph G = (V, E), does it have a subset of nodes S ⊆ V of
size |S| ≥ k such that: for every pair of nodes in S there is no edge in E between them?}
Theorem 13.8. Independent Set is NP-complete.
Proof.
IS is in NP: Given the graph G = (V, E) and S, we can check in polynomial time that: 1) |S| ≥ k,
and 2) there is no edge between any two nodes in S.
IS is NP-Hard: We prove this by showing that SAT ∝ IS.
i) Transform input for IS from input for SAT in polynomial time
Suppose you are given an instance of SAT, with k clauses. Form a graph that has a
component corresponding to each clause. For a given clause, the corresponding compo-
nent is a complete graph with one vertex for each variable in the clause. Now, connect
4 Reference: R. M. Karp. Reducibility Among Combinatorial Problems, pages 85–103. Complexity of Computer Computations. Plenum Press, New York, 1972. R. E. Miller and J. W. Thatcher (eds.).
5 For a good discussion of the theory of NP-completeness, as well as an extensive summary of many known NP-complete problems, see: M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of NP-completeness. Freeman, San Francisco, 1979.
two nodes in different components if and only if they correspond to a variable and its
negation. This is clearly a polynomial time reduction.
ii) Transform output for SAT from output for IS in polynomial time
If the graph has an independent set of size ≥ k, then the SAT formula is satisfiable.
iii) Prove the correctness of the above reduction
To prove the correctness we need to show that the resulting graph has an independent
set of size at least k if and only if the SAT instance is satisfiable.
(→) [G has an independent set of size at least k → the SAT instance is satisfiable]
We create a satisfiable assignment by making true the literals corresponding to
the nodes in the independent set; and we make false all other literals. That this
assignment satisfies the SAT formula follows from the following observations:
1. The assignment is valid, since the independent set cannot contain both a node corresponding to a variable and a node corresponding to the negation of this variable. (These nodes have an edge between them.)
2. Since 1) the independent set cannot contain more than one node from each of the components of G, and 2) the size of the independent set is ≥ k (it is actually equal to k), it follows that the independent set must contain exactly one node from each component. Thus every clause is satisfied.
(←) [the SAT instance is satisfiable → G has an independent set of size at least k]
We create an independent set of size equal to k by including in S one node from each
clause. From each clause we include in S exactly one of the nodes that corresponds
to a literal that makes this clause true. That we created an independent set of size
k follows from the following observations:
1. The constructed set S is of size k since we took exactly one node “from each
clause”, and the SAT formula has k clauses.
2. The set S is an independent set since 1) we have exactly one node from each
component (thus we can’t have intra-component edges between any two nodes
in S); and 2) the satisfying assignment for SAT is a valid assignment, so it
cannot be that both a variable and its negation are true, (thus we can’t have
inter-component edges between any two nodes in S).
An example of the reduction is illustrated in Figure 40 for the 3-SAT expression (x1 ∨ x2 ∨ x3 ) ∧ (x̄1 ∨ x̄2 ∨ x4 ) ∧ (x1 ∨ x̄3 ∨ x4 ) ∧ (x̄2 ∨ x3 ∨ x̄4 ).
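The construction in part i) is mechanical, as the following Python sketch shows. It assumes the CNF formula is given in the usual signed-integer format (literal −3 means x̄3); names are ours.

def sat_to_independent_set(clauses):
    # Returns (nodes, edges, k) for the question "is there an independent set of size >= k?".
    nodes = [(c, lit) for c, clause in enumerate(clauses) for lit in clause]
    edges = set()
    for p in range(len(nodes)):
        for q in range(p + 1, len(nodes)):
            (cp, lp), (cq, lq) = nodes[p], nodes[q]
            if cp == cq:                      # same clause: the component is a complete graph
                edges.add((nodes[p], nodes[q]))
            elif lp == -lq:                   # a variable and its negation, across components
                edges.add((nodes[p], nodes[q]))
    return nodes, edges, len(clauses)

# The 3-SAT expression of Figure 40:
nodes, edges, k = sat_to_independent_set([[1, 2, 3], [-1, -2, 4], [1, -3, 4], [-2, 3, -4]])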
0-1 Knapsack is NP-complete.
Proof. First of all, we know that it is in NP, as a list of items acts as the certificate and we can
verify it in polynomial time.
Now, we show that Partition ∝ 0-1 Knapsack. Consider an instance of Partition. Construct an instance of 0-1 Knapsack with wi = ci = bi for i = 1, 2, . . . , n, and W = K = (1/2) ∑i=1..n bi . Then, a solution exists to the Partition instance if and only if one exists to the constructed 0-1 Knapsack instance.
Exercise: Try to find a reduction in the opposite direction. That is, assume that partition is
NP-complete and prove that 0-1 Knapsack is NP-complete.
Figure 40: The graph of the reduction for the 3-SAT expression above: one complete component per clause, with additional edges joining each variable to its negation across components.
k-center is NP-complete.
Proof. The k-center problem is in NP, as shown in Section 13.2. The dominating set problem is known to be NP-complete and reduces to k-center; therefore k-center is NP-complete (see Section 13.5.1).
max  w1 x1 + · · · + wn xn
s.t. v1 x1 + · · · + vn xn ≤ B
     xj ∈ {0, 1},  j = 1, . . . , n
The problem can be solved by a longest path procedure on a DAG, where for each j ∈ {1, . . . , n} and b ∈ {0, 1, . . . , B} we have a node. This algorithm can alternatively be viewed as dynamic programming, with fj (b) the value of max w1 x1 + · · · + wj xj such that v1 x1 + · · · + vj xj ≤ b. This is computed by the recursion fj (b) = max{fj−1 (b), fj−1 (b − vj ) + wj }, with the base case:
f1 (b) = 0 if b ≤ v1 − 1, and f1 (b) = w1 if b ≥ v1 .
The complexity of this algorithm is O(nB). So if the input is represented in unary, then the input length is a polynomial function of n and B, and this run time is considered polynomial. The knapsack problem, as well as any other NP-complete problem that has a polynomial-time algorithm for unary input, is called weakly NP-complete.
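A Python sketch of the O(nB) dynamic program, in the usual one-dimensional form where the table row for item j overwrites the row for item j − 1; names are ours.

def knapsack(volumes, values, B):
    # max total value subject to total volume <= B, each item used at most once.
    f = [0] * (B + 1)                     # f[b] = best value achievable with volume budget b
    for v, w in zip(volumes, values):
        for b in range(B, v - 1, -1):     # scan downward so each item is taken 0 or 1 times
            f[b] = max(f[b], f[b - v] + w)
    return f[B]

print(knapsack([3, 4, 2], [30, 50, 15], 6))   # 65: take the items of volumes 4 and 2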
14 Approximation algorithms
Approximation algorithms have developed in response to the impossibility of solving a great variety
of important optimization problems. Too frequently, when attempting to get a solution for a
problem, one is confronted with the fact that the problem is NP-hard.
If the optimal solution is unattainable, then it is reasonable to sacrifice optimality and settle for a 'good' feasible solution that can be computed efficiently. Of course we would like to sacrifice as little optimality as possible, while gaining as much as possible in efficiency.
For a minimization problem, a δ-approximation algorithm A is a polynomial-time algorithm that delivers, for every instance I, a feasible solution of value
A(I) ≤ δ OPT(I).
Naturally, δ > 1 and the closer it is to 1, the better. Similarly, for maximization problems a δ-approximation algorithm delivers for every instance I a solution that is at least δ times the optimum.
An alternative, and equivalent, definition of approximation algorithms is the following. For a minimization problem, a δ-approximation algorithm is a polynomial algorithm such that for every instance I it delivers a value Alg(I) with Opt(I) ≤ Alg(I) ≤ δ · Opt(I), where Opt(I) is the optimal solution value, i.e.,
supI { Alg(I)/Opt(I) } ≤ δ.
An algorithm that is a δ-approximation guarantees relative error at most δ − 1:
(Alg(I) − Opt(I))/Opt(I) ≤ δ − 1.
For a maximization problem, Opt(I) ≥ Alg(I) ≥ δ · Opt(I), 0 ≤ δ ≤ 1.
MST-Algorithm:
Step 1: Find a minimum spanning tree (MST) in the graph.
Step 2: “Double” the edges of the spanning tree (to get a graph in which each node has even degree).
Construct an Eulerian tour.
Step 3: Construct a valid tour that visits each node exactly once using “short-cuts”.
Since the ∆-inequality is satisfied, the shortcut tour is no longer than the Eulerian tour, the
length of which is no more than two times the length of the minimum spanning tree (denoted by
|M ST |). Also |M ST | ≤ |T SP | for |T SP | denoting the length of the optimal tour, because a tour
is a spanning tree with one additional edge:
⇒ |M ST | ≤ |T SP | ≤ 2|M ST | .
Now,
|MST-Algorithm(I)| ≤ 2|M ST | ≤ 2|T SP | = 2OP T (I)
⇒ MST-Algorithm is a 2-approximation algorithm.
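A Python sketch of the MST-Algorithm under the stated assumption that dist is a symmetric matrix satisfying the ∆-inequality. The shortcut step is realized as a preorder walk of the MST, which is exactly the shortcut version of the Eulerian tour on the doubled tree.

def tsp_two_approx(dist):
    n = len(dist)
    # Step 1: Prim's algorithm for an MST rooted at node 0.
    in_tree, parent = [False] * n, [0] * n
    best = [float('inf')] * n
    best[0] = 0
    children = {v: [] for v in range(n)}
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        if u != 0:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v], parent[v] = dist[u][v], u
    # Steps 2-3: a preorder walk visits every node once; by the triangle
    # inequality it is no longer than the Eulerian tour of the doubled MST.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour + [0]                       # close the tour back at the start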
A more subtle scheme for approximating TSP with ∆-inequality is due to Christofides (1976). That algorithm uses the fact that some nodes in the spanning tree are of even degree, so doubling the edges adjacent to these nodes is superfluous and wasteful. Instead, the algorithm of Christofides adds only the bare minimum of additional edges necessary to produce an Eulerian graph, thus delivering a better, 3/2-approximation.
Step 1: Find a minimum spanning tree (MST) in the graph. Let the set of odd degree vertices in MST be
V odd .
Step 2: Find M ∗ , the minimum weight perfect matching induced on the set of nodes V odd . Add the edges of M ∗ to the edges of MST to create a graph in which each node has even degree. Construct an Eulerian tour.
Step 3: Construct a valid tour that visits each node exactly once using “short-cuts”.
We know that the weight of the edges in MST forms a lower bound on the optimum. We now show that the weight of the edges in M ∗ forms a lower bound on (1/2)|T SP |. To see that, observe that the optimal tour can be shortcut and restricted to the set of nodes V odd without an increase in cost. Now V odd contains an even number of vertices. An even-length tour on V odd can be partitioned into two perfect matchings M1 and M2 , by taking the alternate set of edges in one and the remaining alternate set in the other. We thus have
|M1 |, |M2 | ≥ |M ∗ |,
so |T SP | ≥ |M1 | + |M2 | ≥ 2|M ∗ |, i.e., |M ∗ | ≤ (1/2)|T SP |. The resulting tour therefore has length at most |M ST | + |M ∗ | ≤ (3/2)|T SP |.
[Figure: cross-section of an open-pit mine divided into blocks; block i has ore value vi , excavation cost ci , and weight wi = vi − ci .]
To design the optimal pit – one that maximizes profit – the entire area is divided into blocks, and the value of the ore in each block is estimated using geological information obtained from drill cores. Each block has a weight associated with it, representing the value of the ore in it minus the cost involved in removing the block. While trying to maximize the total weight of the blocks to be extracted, there are also contour constraints that have to be observed. These constraints specify the slope requirements of the pit and precedence constraints that prevent blocks from being mined before others on top of them. Subject to these constraints, the objective is to mine the most profitable set of blocks.
The problem can be represented on a directed graph G = (V, A). Each block i corresponds
to a node with weight wi representing the net value of the individual block. The net value wi is
computed as the assessed value vi of the ore in that block, from which the cost ci of excavating that
block alone is deducted. There is a directed arc from node i to node j if block i cannot be excavated
before block j which is on a layer right above block i. This precedence relationship is determined
by the engineering slope requirements. Suppose block i cannot be excavated before block j, and
block j cannot be excavated before block k. By transitivity this implies that block i cannot be
excavated before block k. We choose in this presentation not to include the arc from i to k in the graph; the existence of a directed path from i to k implies the precedence relation. Including only arcs between immediate predecessors reduces the total number of arcs in the graph. Thus to
decide which blocks to excavate in order to maximize profit is equivalent to finding a maximum
weighted set of nodes in the graph such that all successors of all nodes are included in the set.
Notice that the problem is trivial if wi ≤ 0 ∀ i ∈ V (in which case no block would be excavated) or if wi ≥ 0 ∀ i ∈ V (in which case all the blocks would be excavated).
Let xi = 1 if block i is selected, and xi = 0 otherwise.
Then the open pit mining problem can be formulated as follows:

max ∑i∈V wi xi
s.t. xi ≤ xj ∀ (i, j) ∈ A
     0 ≤ xi ≤ 1 ∀ i ∈ V
Notice that each inequality constraint contains exactly one 1 and one −1 in the coefficient matrix.
The constraint matrix is totally unimodular. Therefore, we do not need the integrality constraints.
The following website http://riot.ieor.berkeley.edu/riot/Applications/OPM/OPMInteractive.html
offers an interface for defining, solving, and visualizing the open pit mining problem.
The open-pit mining problem is a special case of the maximum closure problem. The next
subsection discusses the maximum closure problem in detail.
Consider a directed graph G = (V, A) where every node i ∈ V has a corresponding weight wi .
The maximum closure problem is to find a closed set V ′ ⊆ V with maximum total weight. That is,
the maximum closure problem is:
Instance: A directed graph G = (V, A), and node weights (positive or negative) wi for all i ∈ V .
Optimization Problem: Find a closed subset of nodes V ′ ⊆ V such that ∑i∈V ′ wi is maximum.
We can formulate the maximum closure problem as an integer linear program (ILP) as follows:

max ∑i∈V wi xi
s.t. xi ≤ xj ∀ (i, j) ∈ A
     xi ∈ {0, 1} ∀ i ∈ V
where xi is a binary variable that takes the value 1 if node i is in the maximum closure, and 0
otherwise. The first set of constraints imposes the requirement that for every node i included in
the set, its successor is also in the set. Observe that since every row has exactly one 1 and one
-1, the constraint matrix is totally unimodular (TUM). Therefore, its linear relaxation formulation
results in integer solutions. Specifically, this structure also indicates that the problem is the dual
of a flow problem.
Johnson [Joh68] seems to be the first researcher who demonstrated the connection between the maximum closure problem and the selection problem (i.e., maximum closure on bipartite graphs), and showed that the selection problem is solvable by a max-flow algorithm. Picard [Pic76] demonstrated that a minimum cut algorithm on a related graph solves the maximum closure problem.
Let V + ≡ {i ∈ V | wi > 0}, and V − ≡ {i ∈ V | wi ≤ 0}. We construct an s, t-graph Gst as follows.
Given the graph G = (V, A), we set the capacity of all arcs in A equal to ∞. We add a source s, a sink t, a set As of arcs from s to all nodes i ∈ V + (with capacity us,i = wi ), and a set At of arcs from all nodes j ∈ V − to t (with capacity uj,t = |wj | = −wj ). The graph Gst = (V ∪ {s, t}, A ∪ As ∪ At ) is a closure graph (a closure graph is a graph with a source, a sink, and with all finite-capacity arcs adjacent only to either the source or the sink). This construction is illustrated in Figure 43.
Figure 43: The closure graph Gst : nodes with wi > 0 are connected from s with capacity wi , nodes with wj ≤ 0 are connected to t with capacity −wj , and the arcs of A have capacity ∞.
Claim 15.2. If (s ∪ S, t ∪ T ) is a finite s − t cut in Gst , then S is a closed set in G.
Proof. Assume by contradiction that S is not closed. This means that there must be an arc (i, j) ∈ A such that i ∈ S and j ∈ T . This arc must be on the cut (S, T ), and by construction ui,j = ∞, which is a contradiction to the cut being finite.
Proof.
C(s ∪ S, t ∪ T ) = ∑(s,i)∈As , i∈T us,i + ∑(j,t)∈At , j∈S uj,t
               = ∑i∈T ∩V + wi + ∑j∈S∩V − (−wj )
               = ∑i∈V + wi − ∑i∈S∩V + wi − ∑j∈S∩V − wj
               = W + − ∑i∈S wi ,
where W + = ∑i∈V + wi , which is a constant. This implies that minimizing C(s ∪ S, t ∪ T ) is equivalent to minimizing W + − ∑i∈S wi , which is in turn equivalent to maxS⊆V ∑i∈S wi .
Therefore, any source set S that minimizes the cut capacity also maximizes the sum of the weights
of the nodes in S. Since by Claim 15.2 any source set of an s − t cut in Gs,t is closed, we conclude
that S is a maximum closed set on G.
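The construction translates directly into code. The following Python sketch assumes the networkx library (any max-flow code would do); networkx treats arcs without a capacity attribute as having infinite capacity, which matches the arcs of A.

import networkx as nx

def max_closure(arcs, weight):
    # arcs: list of (i, j) meaning "if i is in the closure then so is j".
    # weight: dict node -> wi. Returns a maximum weight closed set.
    G = nx.DiGraph()
    G.add_nodes_from(['s', 't'])
    G.add_edges_from(arcs)                  # no capacity attribute => capacity ∞
    for i, w in weight.items():
        if w > 0:
            G.add_edge('s', i, capacity=w)  # source arcs carry the positive weights
        elif w < 0:
            G.add_edge(i, 't', capacity=-w) # sink arcs carry |negative weights|
    cut_value, (S, T) = nx.minimum_cut(G, 's', 't')
    return S - {'s'}                        # source set of the min cut = max closure

For the open-pit mining application, arcs would be the immediate-predecessor pairs and weight the net block values wi.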
Variants/Special Cases
• In the minimum closure problem we seek a closed set with minimum total weight. This can be solved by negating the weights of the nodes in G to obtain G− , constructing G−st just as before, and solving for the maximum closure. Under this construction, the source set of a minimum s − t cut on G−st is a minimum closed set on G. See Figure 44 for a numerical example.
Figure 44: Converting a minimum closure problem to a maximum closure problem (numerical example; panels show the modified max-flow graph and the source of the min cut, which is the min closure set on G). Under this transformation, the source set of a minimum cut is also a minimum closed set on the original graph.
max ∑i∈V wi li + ∑i∈V ∑p=li +1..ui wi xi (p)                          (16a)
s.t. xj (p) ≤ xi (q(p))   ∀ (i, j) ∈ E, for p = lj + 1, . . . , uj    (16b)
     xi (p) ≤ xi (p−1)    ∀ i ∈ V, for p = li + 1, . . . , ui         (16c)
     xi (p) ∈ {0, 1}      ∀ i ∈ V, for p = li + 1, . . . , ui ,       (16d)
where q(p) ≡ ⌈(cij + bij p)/aij ⌉.
Inequality (16c) guarantees the restriction that xi (p) = 1 =⇒ xi (p−1) = 1 for all p = li + 1, . . . , ui . Inequality (16b) follows from the monotone inequalities in the original problem. In particular, for any monotone inequality aij xi − bij xj ≥ cij with aij , bij ≥ 0, we have aij xi − bij xj ≥ cij ⇐⇒ xi ≥ (cij + bij xj )/aij , which for integer xi implies xi ≥ ⌈(cij + bij xj )/aij ⌉. Equivalently, if xj ≥ p, we must have xi ≥ q(p) = ⌈(cij + bij p)/aij ⌉. In terms of the newly defined binary variables, this is further equivalent to xj (p) = 1 =⇒ xi (q(p)) = 1, i.e., xj (p) ≤ xi (q(p)) .
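For a small illustration (numbers ours): take the monotone inequality 3xi − 2xj ≥ 1, so aij = 3, bij = 2, cij = 1, and q(p) = ⌈(1 + 2p)/3⌉. Then xj ≥ 1 forces xi ≥ q(1) = 1, and xj ≥ 4 forces xi ≥ q(4) = 3; the corresponding constraints in (16b) are xj (1) ≤ xi (1) and xj (4) ≤ xi (3) .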
Now it is apparent that Monotone IP2 is the maximum closure problem of an s, t-graph Gst defined as follows. First, still define V + ≡ {i ∈ V | wi > 0} and V − ≡ {j ∈ V | wj ≤ 0} as before; the sets of nodes and arcs are constructed below.
Set of nodes: Add a source s, a sink t, and nodes xi (li ) , xi (li +1) , . . . , xi (ui ) for each i ∈ V .
Set of arcs:
1) For any i ∈ V + , connect s to xi (p) , p = li + 1, . . . , ui , by an arc with capacity wi .
2) For any i ∈ V − , connect xi (p) , p = li + 1, . . . , ui , to t by an arc with capacity |wi | = −wi .
3) For any i ∈ V , connect xi (p) to xi (p−1) , p = li + 1, . . . , ui , by an arc with capacity ∞, and connect s to xi (li ) with an arc with capacity ∞.
4) For any (i, j) ∈ E, connect xj (p) to xi (q(p)) by an arc with capacity ∞, for all p = lj + 1, . . . , uj .
(Note that whenever q(p) > ui we must have xj (p) = 0. Therefore, we can either remove the node xj (p) by redefining a tighter upper bound for xj , or simply fix xj (p) to zero by introducing an arc from xj (p) to t with capacity ∞.)
This construction is illustrated in Figure 45.
Figure 45: Illustration of Gst for Monotone IP2 and example of a finite s − t cut
Remarks:
• It should be noted that the maximum closure problem is defined on an s, t-graph with 2 + ∑i∈V (ui − li ) nodes. The size of the graph is thus not a polynomial function of the length of the input. Therefore, the original Monotone IP2 is weakly NP-hard and can be solved by a pseudo-polynomial time algorithm based on the construction of a maximum closure problem.
• For the min version of Monotone IP2, we can construct the s, t-graph in the same way and define the closure with respect to the sink set instead.
This problem is more general than Monotone IP2, as we don't impose any restrictions on the signs of aij and bij . The problem is clearly NP-hard, since vertex cover is a special case. We now give a 2-approximation algorithm for it.
We first "monotonize" IP2 by replacing each variable xi in the objective by xi = (xi+ − xi− )/2, where li ≤ xi+ ≤ ui and −ui ≤ xi− ≤ −li .
Each non-monotone inequality aij xi + bij xj ≤ cij is replaced by the following two inequalities:
aij xi+ − bij xj− ≤ cij
−aij xi− + bij xj+ ≤ cij
And each monotone inequality a′ij xi − b′ij xj ≤ c′ij is replaced by the two inequalities:
a′ij xi+ − b′ij xj+ ≤ c′ij
−a′ij xi− + b′ij xj− ≤ c′ij
(IP2')  min (1/2) ( ∑i∈V wi xi+ + ∑i∈V (−wi ) xi− )
s.t.  the two monotonized inequalities above   ∀ (i, j) ∈ E
      li ≤ xi+ ≤ ui , integer   ∀ i ∈ V
      −ui ≤ xi− ≤ −li , integer   ∀ i ∈ V
Let's examine the relationship between IP2 and IP2'. Given any feasible solution {xi }i∈V for IP2, we can construct a corresponding feasible solution for IP2' with the same objective value by letting xi+ = xi , xi− = −xi , i ∈ V . On the other hand, given any feasible solution {xi+ , xi− }i∈V for IP2', {xi = (xi+ − xi− )/2}i∈V satisfies the inequality constraints in IP2 (to see this, simply sum the two corresponding inequality constraints in IP2' and divide the resulting inequality by 2), but may not be integral. Therefore, IP2' provides a lower bound for IP2.
Since IP2’ is a monotone IP2, we can solve it in integers (we can reduce it to maximum closure,
which in turn reduces to the minimum s,t-cut problem). Its solution is a lower bound on IP2.
In general it is not trivial to round the solution obtained and get an integer solution for IP2.
However we can prove that there always exists a way to round the variables and get our desired
2-approximation.
15.5 2-approximations for integer programs with two variables per inequality
We have now established that for any integer program on two variables per inequality it is possible
to generate a superoptimal half integral solution in polynomial time. The following was shown in
[HNMT93, Hoc97].
There is a correspondence between the feasible solutions S of the original system and the feasible solutions S (2) of the monotone system resulting from the transformation above: If x ∈ S, x+ = x, and x− = −x, then (x+ , x− ) ∈ S (2) . So, for every feasible solution in S, there exists a feasible solution in S (2) . Conversely, if (x+ , x− ) ∈ S (2) , then x = (1/2)(x+ − x− ) ∈ S. Hence, for every feasible solution in S (2) , there is a feasible solution in S.
Let SI = {x ∈ S | x integer}, and let
SI (2) = { (1/2)(x+ − x− ) | (x+ , x− ) ∈ S (2) and x+ , x− integer }.
If x ∈ SI , then x ∈ SI (2) . Thus, SI ⊆ SI (2) ⊆ S.
In fact, the set of solutions SI (2) is even smaller than the set of feasible solutions that are integer multiples of 1/2. To see that, let
S (1/2) = {x | Ax ≤ c and x ∈ (1/2) · Z n }.
The claim is that SI (2) ⊂ S (1/2) , and S (1/2) may contain points not in SI (2) . The following example illustrates such a case:
5x + 2y ≤ 6
0 ≤ x, y ≤ 1.
Obviously, (x = 1, y = 1/2) is a feasible solution in S (1/2) . But there is no corresponding integer solution in SI (2) , as x+ = −x− = 1 implies that y + = y − = 0. It follows that the bound derived from optimizing over SI (2) is tighter than a bound derived from optimizing over S (1/2) . Not only is this latter optimization weaker, but it is also in general NP-hard, as stated in the following lemma (proved in [HNMT93]).
Lemma 15.5. Minimizing over a system of inequalities with two variables per inequality, for x ∈ (1/2) · Z n , is NP-hard.
ℓi = min{mi+ , −mi− }   if zi ≤ min{mi+ , −mi− },
ℓi = zi                 if min{mi+ , −mi− } ≤ zi ≤ max{mi+ , −mi− },        (17)
ℓi = max{mi+ , −mi− }   if zi ≥ max{mi+ , −mi− }.
Lemma 15.6. The vector ℓ is a feasible solution of the given integer program.
Proof. Let axi + bxj ≥ c be an inequality, where a and b are nonnegative. We check all possible cases. If ℓi is equal to zi or min{mi+ , −mi− }, and ℓj is equal to zj or min{mj+ , −mj− }, then the inequality clearly holds. If ℓi ≥ −mi− , then
aℓi + bℓj ≥ −a mi− + b mj+ ≥ c.
Otherwise,
aℓi + bℓj ≥ a mi+ − b mj− ≥ c.
The last case is when ℓi = max{mi+ , −mi− } and ℓj = max{mj+ , −mj− }. In this case,
aℓi + bℓj ≥ a mi+ − b mj− ≥ c.
We showed that the vector ℓ is a feasible solution. We now argue that it also approximates the optimum.
Theorem 15.7. The rounded vector ℓ is a 2-approximate solution: the value of the objective function at the vector m∗ is at least a half of the value of the objective function of the best integer solution, and hence the objective value at ℓ is at most twice the optimum.
Proof. By construction, ℓ ≤ 2m∗ . From the previous subsection we know that the vector m∗ provides a lower bound on the value of the objective function for any integral solution. Hence the theorem follows.
The complexity of the algorithm is dominated by the complexity of the procedure in [HN94] for
optimizing over a monotone system. The running time is O(mnU 2 log(U n2 /m)).
min ∑i∈V1 wi xi − ∑j∈V2 wj yj
s.t. xi − yj ≥ 1 ∀ (i, j) ∈ E
     xi ∈ {0, 1} ∀ i ∈ V1
     yj ∈ {−1, 0} ∀ j ∈ V2
Evidently, this formulation is a Monotone IP2. Furthermore, since both xi and yj can take only two values, the size of the resulting s, t-graph is polynomial. In fact, the following s, t-graph (Figure 46) can be constructed to solve the vertex cover problem on a bipartite graph:
1. Given a bipartite graph G = (V1 ∪ V2 , E), set the capacity of all arcs in E (directed from V1 to V2 ) equal to ∞.
2. Add a source s, a sink t, a set As of arcs from s to all nodes i ∈ V1 (with capacity us,i = wi ), and a set At of arcs from all nodes j ∈ V2 to t (with capacity uj,t = wj ).
It is easy to see that if (s ∪ S, t ∪ T ) is a finite s − t cut on Gs,t , then (V1 ∩ T ) ∪ (V2 ∩ S) is a vertex cover. Furthermore, if (s ∪ S, t ∪ T ) is an optimal solution to the minimum s − t cut problem on Gs,t , then (V1 ∩ T ) ∪ (V2 ∩ S) is a vertex cover of minimum weight, where
C(s ∪ S, t ∪ T ) = ∑i∈V1 ∩T wi + ∑j∈V2 ∩S wj = ∑j∈V C ∗ wj .
Figure 46: An illustration of the s, t-graph for the vertex cover problem on a bipartite graph and an example of a finite s − t cut
Moreover, the LP-relaxation of the vertex cover problem on general graphs can be solved by
solving the vertex cover problem in a related bipartite graph. Specifically, as suggested by Edmonds
and Pulleyblank and noted in [NT75], the LP-relaxation can be solved by finding an optimal cover
C in a bipartite graph Gb = (Vb1 ∪ Vb2 , Eb ) having vertices aj ∈ Vb1 and bj ∈ Vb2 of weight wj for
each vertex j ∈ V , and two edges (ai , bj ), (aj , bi ) for each edge (i, j) ∈ E. Given the optimal cover
C on Gb , the optimal solution to the LP-relaxation of our original problem is given by:
xj = 1 if aj ∈ C and bj ∈ C;  xj = 1/2 if exactly one of aj , bj is in C;  xj = 0 if aj ̸∈ C and bj ̸∈ C.
In turn, the problem of solving the vertex cover problem in Gb (or, on any bipartite graph)
can be reduced to the minimum cut problem as we showed previously. For this purpose we create
a (directed) st-graph Gst = (Vst , Ast ) as follows: (1) Vst = Vb ∪ {s} ∪ {t}, (2) Ast contains an
infinite-capacity arc (ai , bj ) for each edge (ai , bj ) ∈ Eb , (3) Ast contains an arc (s, ai ) of capacity
wi for each node i ∈ Vb1 , and (4) Ast contains an arc (bi , t) of capacity wi for each node i ∈ Vb2 .
Given a minimum (S, T ) cut we obtain the optimal vertex cover as follows: let ai ∈ C if ai ∈ T
and let bj ∈ C if bj ∈ S.
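A Python sketch of this procedure, again assuming the networkx library: build the bipartite graph Gb, solve the minimum cut, read the cover off the cut as above, and return the half-integral LP solution. Names are ours.

import networkx as nx

def vertex_cover_lp(n, edges, w):
    # Undirected graph on nodes 0..n-1 with weights w[i]; returns the
    # half-integral optimal solution of the vertex cover LP relaxation.
    G = nx.DiGraph()
    for i, j in edges:                        # edges (ai, bj) and (aj, bi), capacity ∞
        G.add_edge(('a', i), ('b', j))
        G.add_edge(('a', j), ('b', i))
    for i in range(n):
        G.add_edge('s', ('a', i), capacity=w[i])
        G.add_edge(('b', i), 't', capacity=w[i])
    _, (S, T) = nx.minimum_cut(G, 's', 't')
    # ai is in the cover iff ('a', i) is on the sink side; bi iff ('b', i) is on the source side.
    return {i: ((('a', i) in T) + (('b', i) in S)) / 2 for i in range(n)}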
We now present an alternative method of showing that the LP-relaxation of the vertex cover
problem can be reduced to a minimum cut problem, based on our discussion on integer programming
with two variables per inequality (IP2).
As we did for the non-monotone integer program with two variables per inequality, we can "monotonize" (VC) by replacing each binary variable xi in the objective by xi = (xi+ − xi− )/2, where xi+ ∈ {0, 1} and xi− ∈ {−1, 0}, and each inequality xi + xj ≥ 1 by the following two inequalities:
xi+ − xj− ≥ 1
−xi− + xj+ ≥ 1
The relationship between (VC) and (VC ′ ) can be easily seen. For any feasible solution {xi }i∈V for (VC), a corresponding feasible solution for (VC ′ ) with the same objective value can be constructed by letting xi+ = xi , xi− = −xi , ∀ i ∈ V . On the other hand, for any feasible solution {xi+ , xi− }i∈V for (VC ′ ), xi = (xi+ − xi− )/2 satisfies inequality (VC.1) (to see this, simply sum inequalities (VC.1) and (VC.2) and divide the resulting inequality by 2), but may not be integral. Therefore, given an optimal solution {xi+ , xi− }i∈V for (VC ′ ), the solution {xi = (xi+ − xi− )/2} is feasible, half-integral (i.e., xi ∈ {0, 1/2, 1}) and super-optimal (i.e., its value provides a lower bound) for (VC).
Comparing (VC ′ ) with (VC)B , we can see that, up to the coefficient 1/2 in the objective, (VC ′ ) can be treated as a vertex cover problem on a bipartite graph G(Vb1 ∪ Vb2 , Eb ), where Vb1 = {aj : j ∈ V }, Vb2 = {bj : j ∈ V }, and Eb = {(ai , bj ), (aj , bi ) : (i, j) ∈ E}.
Therefore, we obtain the following relationship:
V C +− ≤ 2 V C ∗ ,
where V C ∗ is the optimal value of the original vertex cover problem on a general graph G(V, E), and V C +− is the optimal value of the vertex cover problem defined on the bipartite graph G(Vb1 ∪ Vb2 , Eb ).
The (minimum) s-excess problem is formulated as:
min ∑i∈V wi xi + ∑(i,j)∈A eij zij
s.t. zij ≥ xi − xj ∀ (i, j) ∈ A;  xi , zij ∈ {0, 1}.
[BP89] and [Hoc08] show that this problem is equivalent to solving the minimum s − t cut
problem on a modified graph, Gst , defined as follows. We add nodes s and t to the graph, with
an arc from s to every negative weight node i (with capacity usi = −wi ), and an arc from every
positive weight node j to t (with capacity ujt = wj ).
Lemma 15.8. S ∗ is a set of minimum s-excess capacity in the original graph G if and only if S ∗
is the source set of a minimum cut in Gst .
Proof. As before, let V + ≡ {i ∈ V |wi > 0}, and let V − ≡ {j ∈ V |wj < 0}. Let (s ∪ S, t ∪ T ) define
an s − t cut on Gst . Then the capacity of this cut is given by
C(s ∪ S, t ∪ T ) = ∑(s,i)∈As , i∈T us,i + ∑(j,t)∈At , j∈S uj,t + ∑i∈S, j∈T eij
               = ∑i∈T ∩V − (−wi ) + ∑j∈S∩V + wj + ∑i∈S, j∈T eij
               = ∑i∈V − (−wi ) − ∑i∈S∩V − (−wi ) + ∑j∈S∩V + wj + ∑i∈S, j∈T eij
               = −W − + ∑j∈S wj + ∑i∈S, j∈T eij ,
where W − = ∑i∈V − wi is the sum of all negative weights in G, which is a constant. Therefore, minimizing C(s ∪ S, t ∪ T ) is equivalent to minimizing ∑j∈S wj + ∑i∈S, j∈T eij , and we conclude that the source set of a minimum s − t cut on Gst is also a minimum s-excess set of G.
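The same recipe as for maximum closure yields code for this problem; the only change is that the arcs of A keep their finite capacities eij. Again networkx is an assumption, and this is a sketch.

import networkx as nx

def min_s_excess_set(arc_costs, w):
    # arc_costs: dict (i, j) -> eij >= 0; w: dict node -> wi of either sign.
    # Returns S minimizing sum_{j in S} wj + sum_{i in S, j not in S} eij.
    G = nx.DiGraph()
    G.add_nodes_from(['s', 't'])
    for (i, j), e in arc_costs.items():
        G.add_edge(i, j, capacity=e)          # internal arcs keep their finite capacity
    for i, wi in w.items():
        if wi < 0:
            G.add_edge('s', i, capacity=-wi)
        elif wi > 0:
            G.add_edge(i, 't', capacity=wi)
    _, (S, T) = nx.minimum_cut(G, 's', 't')
    return S - {'s'}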
References
[BP89] D. M. Greig, B. T. Porteous, and A. H. Seheult. Exact maximum a posteriori estimation for binary images. Journal of the Royal Statistical Society, Series B, 51(2):271–279, 1989.
[HN94] D. S. Hochbaum and J. Naor. Simple and fast algorithms for linear and integer programs
with two variables per inequality. SIAM Journal on Computing, 23(6):1179–1192, 1994.
[HNMT93] D. S. Hochbaum, N. Megiddo, J. Naor, and A. Tamir. Tight bounds and 2-approximation algorithms for integer programs with two variables per inequality. Mathematical Programming, 62:69–83, 1993.
[Hoc97] D. S. Hochbaum. Approximating covering and packing problems: set cover, vertex cover, independent set and related problems. Chapter 3 in Approximation Algorithms for NP-hard Problems, pages 94–143. PWS, Boston, 1997.
[Hoc08] D. S. Hochbaum. The pseudoflow algorithm: A new algorithm for the maximum-flow problem. Operations Research, 56(4):992–1009, 2008.
[Joh68] T. B. Johnson. Optimum Open Pit Mine Production Technology. PhD thesis, Opera-
tions Research Center, University of California, Berkeley, 1968.
[Lag85] J. C. Lagarias. The computational complexity of simultaneous diophantine approxima-
tion problems. SIAM Journal on Computing, 14:196–209, 1985.
[Pic76] J. C. Picard. Maximal closure of a graph and applications to combinatorial problems.
Management Science, 22:1268–1272, 1976.
The left-hand side is simply the sum of weights of the nodes in D. This necessary condition is therefore identical to the condition stated in (18).
Verification of condition (18). A restatement of condition (18) is that for each subset of nodes D ⊂ V ,
∑j∈D bj − C(D, D̄) ≤ 0.
The term on the left is called the s-excess of the set D. In the section on the s-excess problem above it is shown that the maximum s-excess set in a graph can be found by solving a minimum cut problem. The condition is thus satisfied if and only if the maximum s-excess in the graph is of value 0.
This condition is an extension of Hall’s theorem for the existence of perfect matching in bipartite
graphs.
(See also pages 194-196 of text)
17 Planar Graphs
Planar graph : A graph is planar if it can be drawn in the plane so that no edges intersect. Such a drawing is called a planar embedding.
Face : Faces are the regions defined by a planar embedding; two points in the same face are reachable from each other along a continuous curve that does not intersect any edges.
Critical graph : G is called critical if the chromatic number of any proper subgraph of G is smaller than that of G.
[Figure: a planar embedding of a graph on vertices v1 , . . . , v5 , with its faces labeled.]
[Figure: a planar graph (primal, nodes 1, 2, 3, s, t) and its dual (nodes 1∗ , 2∗ , 3∗ , 4∗ , s∗ , t∗ ), one dual node per face of the primal.]
The original planar graph is called the primal and the newly constructed graph is called the dual. The dual of a planar graph is still planar.
The following are some interesting points about the primal and the dual:
• 3. Call the two nodes in the two faces created by the new edge e, s∗ and t∗ .
Proof.
The capacity constraints are satisfied: since d(i∗ ) is the shortest path distance from s∗ to i∗ in the dual graph, d(i∗ ) satisfies the shortest path optimality condition d(i∗ ) ≤ d(j ∗ ) + ci∗ j ∗ , and the cost of a path in the dual equals the capacity of the corresponding cut in the primal.
[Figure: a shortest path in the dual graph; its edges cross exactly the primal edges of the corresponding cut.]
For the edges on the shortest path in the dual, identify the min cut; the flow on the cut edges is saturated, and therefore flow value = cut capacity = max flow.
Again we can use Dijkstra's algorithm to find the shortest path, which also gives the max flow. The running time is O(n log n), as we justified for the min cut algorithm.
Minimum 2-Cut A minimum 2-cut in a directed network is the minimum weight arc-connectivity of the network, i.e., the smallest-weight collection of arcs that separates one non-empty component of the network from the other. That is, for each pair of nodes (u, v), find the minimum weight u − v cut, and select the minimum of these minimum cuts over all pairs (u, v).
An Algorithm for Finding a Minimum 2-Cut Find, for all pairs v, u ∈ V , the (v, u) min cut in the network using O(n2 ) applications of max-flow. This can, however, be done with O(n) applications of min-cut: since node 1 lies on one side of the minimum 2-cut, it suffices to pick the pairs (1, j) (and (j, 1) in the directed case) and solve the min-cut problem for each. From among all these pairs, take the minimum cut. This requires considering O(n) different pairs of nodes.
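A Python sketch of this procedure, assuming networkx for the min-cut subroutine: fix one node and try both orientations against every other node.

import networkx as nx

def minimum_2_cut(G):
    # G: nx.DiGraph with a 'capacity' attribute on every arc.
    nodes = list(G.nodes)
    anchor, best = nodes[0], (float('inf'), None)
    for j in nodes[1:]:
        for s, t in ((anchor, j), (j, anchor)):   # the anchor lies on one side of the optimum
            value, partition = nx.minimum_cut(G, s, t)
            if value < best[0]:
                best = (value, partition)
    return best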
Matula’s Algorithm
Proof. First we demonstrate correctness. If α(G) = δ then we are done, as we never reduce the value of the cut. Assume then that α(G) < δ. We need to demonstrate that at some iteration S ⊆ S ∗ and k ∈ S̄ ∗ . This will suffice, since for that iteration α[S, k] = α∗ .
Initially S = {p} ⊆ S ∗ ; when the algorithm terminates, S ̸⊆ S ∗ , since otherwise nonneighbor(S) ̸= ∅ (see lemma). Therefore there must be an iteration i which is the last one in which S ⊆ S ∗ . Consider that iteration and the vertex k ∈ nonneighbor(S) selected. Since S ∪ {k} ̸⊆ S ∗ , we get k ∈ S̄ ∗ . Thus, α[S, k] = α∗ .
Consider now the complexity of the algorithm, with augmentations along shortest augmenting paths (as in Dinic's algorithm). There are two types of augmenting paths:
- type 1: its last internal node (before k) is in neighbor(S);
- type 2: its last internal node (before k) is in nonneighbor(S).
A type-1 path has length 2, so the total work for type-1 paths is O(n2 ).
There are at most O(n) type-2 path augmentations throughout the algorithm: consider a path s, . . . , l, k, and let l be the node just before k. While k is the sink, l can be used just once; as soon as we are done with k, l becomes a neighbor. Hence l can be the next-to-last node of only one type-2 path. The complexity of finding an augmenting path is O(m), so the total work for type-2 paths is O(mn). The total complexity of the algorithm is therefore O(mn).
Minimize z = ∑(i,j)∈A cij xij
Subject to:  ∑(i,j)∈A xij − ∑(k,i)∈A xki = bi ,  i ∈ V
             0 ≤ xij ≤ uij ,  (i, j) ∈ A
It is easily seen that the constraint matrix is not of full rank. In fact, the matrix is of rank n-1.
This can be remedied by adding an artificial variable corresponding to an arbitrary node. Another
way is just to throw away a constraint, say the one corresponding to node 1. For the uncapacitated
version, where the capacities uij are infinite, the dual is:
Maximize z = ∑j∈V bj πj
Subject to: πi − πj ≤ cij ,  (i, j) ∈ A.
We denote by cπij the reduced cost cij − πi + πj , which is the slack in the corresponding dual constraint. As proved earlier, the sum of the reduced costs along a cycle is equal to the sum of the original costs.
The simplex method assumes that an initial "basic" feasible flow is given, obtained by solving a phase I problem or another network flow problem. By a basis we mean n − 1 variables xij whose columns in the constraint matrix are linearly independent. The non-basic variables are always set to zero (or, in the case of finite upper bounds, nonbasic variables may be set to their respective upper bounds). We will prove in Section 19.3 that a basic solution corresponds to a spanning tree. The concept of the simplex algorithm will be illustrated in the handout, but first we need to define some important terminology used in the algorithm.
Given a flow vector x, a basic arc is called free iff 0 < xij < uij . An arc (i, j) is called restricted
iff xij = 0 or xij = uij . From the concept of basic solution and its corresponding spanning tree we
can recognize a characteristic of an optimal solution. An optimal solution is said to be cycle-free
iff in every cycle at least one arc assumes its lower or upper bound.
Proposition 19.1. If there exists a finite optimal solution, then there exists an optimal solution
which is cycle-free.
Proof: By contradiction. Suppose we have an optimal solution which is not cycle-free and has the minimum number of free arcs. Then we can identify a cycle such that we can send flow in either the clockwise or the counter-clockwise direction. By linearity, in at least one direction the objective value will not increase, as the sum of the costs of the arcs along this direction is nonpositive. From our finiteness assumption, at least one arc achieves its lower or upper bound after sending a certain amount of flow along this direction. This way we get another optimal solution having fewer free arcs, a contradiction.
An optimal flow x∗ can be characterized by any one of the following three (equivalent) sets of conditions, established in the theorems below:
• x∗ is feasible, and there is no negative cost cycle in G(x∗ ).
• x∗ is feasible, and cπij ≥ 0 for each arc (i, j) in G(x∗ ).
• x∗ is feasible, and for each arc (i, j): cπij > 0 ⇒ x∗ij = 0; cπij = 0 ⇒ 0 ≤ x∗ij ≤ uij ; cπij < 0 ⇒ x∗ij = uij .
Theorem 19.2 (Optimality condition 1). A feasible flow x∗ is optimal iff the residual network
G(x∗ ) contains no negative cost cycle.
Proof: ⇐ direction.
Let x0 be an optimal flow and x0 ̸= x∗ . (x0 − x∗ ) is a feasible circulation flow vector. A circulation
is decomposable into at most m primitive cycles. The sum of the costs of flows on these cycles
is c(x0 − x∗ ) ≥ 0 since all cycles in G(x∗ ) have non-negative costs. But that means cx0 ≥ cx∗ .
Therefore x∗ is a min cost flow.
⇒ direction.
Suppose not, then there is a negative cost cycle in G(x∗ ). But then we can augment flow along
that cycle and get a new feasible flow of lower cost, which is a contradiction.
Theorem 19.3 (Optimality condition 2). A feasible flow x∗ is optimal iff there exist node potential
π such that cπij ≥ 0 for each arc (i,j) in G(x∗ ).
Proof: ⇐ direction.
If cπij ≥ 0 for each arc (i, j) in G(x∗ ), then in particular for any cycle C in G(x∗ ), ∑(i,j)∈C cπij = ∑(i,j)∈C cij ≥ 0. Thus every cycle's cost is nonnegative. From optimality condition 1 it follows that x∗ is optimal.
⇒ direction.
x∗ is optimal, thus (by optimality condition 1) there is no negative cost cycle in G(x∗ ). Therefore shortest path distances are well defined. Let d(i) be the shortest path distance from node 1 (say) to node i, and let the node potentials be πi = −d(i). From the validity of the distance labels it follows that d(j) ≤ d(i) + cij for every arc (i, j) ∈ G(x∗ ), and hence cπij = cij + d(i) − d(j) ≥ 0.
Theorem 19.4 (Optimality condition 3). A feasible flow x∗ is optimal iff there exist node potentials π such that, for each arc (i, j) ∈ A: cπij > 0 ⇒ x∗ij = 0; cπij = 0 ⇒ 0 ≤ x∗ij ≤ uij ; cπij < 0 ⇒ x∗ij = uij .
Proof: ⇐ direction.
If these conditions are satisfied then, in particular, for all residual arcs (i, j) ∈ G(x∗ ), cπij ≥ 0. Hence optimality condition 2 is satisfied.
⇒ direction.
From optimality condition 2, cπij ≥ 0 for all (i, j) in G(x∗ ). Suppose that for some (i, j), cπij > 0 but x∗ij > 0. But then (j, i) ∈ G(x∗ ) and thus cπji = −cπij < 0, which is a contradiction (to optimality condition 2). Therefore the first condition on the list is satisfied. The second is satisfied due to feasibility. Suppose now that cπij < 0. Then arc (i, j) is not residual, as for all residual arcs the reduced costs are nonnegative. So the arc must be saturated, and x∗ij = uij as stated.
The complexity of the cycle canceling algorithm is finite: every iteration reduces the cost of the solution by at least one unit. For maximum arc cost equal to C and maximum capacity value equal to U , the number of iterations cannot exceed 2mU C. The complexity is thus O(m2 nU C), which is not polynomial: it is exponential in the length of the numbers in the input.
In order to make this complexity polynomial we can apply a scaling algorithm. The scaling can be applied to either capacities, or costs, or both.
[Figure: the max-flow problem as MCNF: all arcs of the original graph have cost 0 and finite capacity, and an added arc (t, s) has cost −1 and infinite capacity.]
If we solve this MCNF problem by the cycle canceling algorithm, we observe that the only negative cost arc is (t, s), and therefore each negative cost cycle is a path from s to t augmented with this arc. When we look for a negative cost cycle of minimum mean, that is, minimizing the ratio of the cost to the number of arcs, this translates in the context of maximum flow to −1 divided by the length of the s, t path in terms of the number of arcs. The minimum mean thus corresponds to a path with the smallest number of arcs – a shortest augmenting path. You can easily verify the other analogies in the following table.
Notice that finding a most negative cost cycle in a network is NP-hard, as it is at least as hard as finding a Hamiltonian cycle if one exists (set all costs to −1 and all capacities to 1; then finding the most negative cost cycle is equivalent to finding a longest cycle). But finding a minimum mean cost cycle can be done in polynomial time. The mean cost of a cycle is its cost divided by the number of arcs it contains.