Proof:
(The fact that f(S − s, V) = 0 comes from flow conservation: f(u, V) = 0 for every u other than s and t,
and since S − s consists only of such vertices, the sum of their flows is zero as well.)
Corollary: The value of any flow is bounded from above by the capacity of any cut (i.e., maximum flow ≤
minimum cut).
Proof: Every unit of flow from s to t must cross the cut, and you cannot push more flow across a cut than its capacity.
The correctness of the Ford-Fulkerson method is based on the following theorem, called the Max-Flow, Min-Cut
Theorem. It basically states that in any flow network the minimum capacity cut acts like a bottleneck to limit
the maximum amount of flow. The Ford-Fulkerson algorithm terminates when it finds this bottleneck, and hence it
finds the minimum cut and the maximum flow.
Analysis of the Ford-Fulkerson method: The problem with the Ford-Fulkerson algorithm is that, depending on how
it picks augmenting paths, it may spend an inordinate amount of time arriving at the final maximum flow. Consider
the following example (from page 596 in CLR). If the algorithm were smart enough to send flow along
the edges of weight 1,000,000, the algorithm would terminate in two augmenting steps. However, if the algorithm
were to try to augment using the middle edge, it will continuously improve the flow by only a single unit.
2,000,000 augmenting steps would be needed before we get the final flow. In general, Ford-Fulkerson can take time
Θ((n + e)|f*|), where f* is the maximum flow.
An Improvement: We have shown that if the augmenting path was chosen in a bad way the algorithm could run for a
very long time before converging on the final flow. It seems (from the example we showed) that a more logical
way to push flow is to select the augmenting path which holds the maximum amount of flow. Computing this
path is equivalent to determining the path of maximum capacity from s to t in the residual network. (This is
exactly the same as the beer transport problem given on the last exam.) It is not known how fast this method
works in the worst case, but there is another simple strategy that is guaranteed to give good bounds (in terms of
n and e).
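As an aside (not from the text), the maximum-capacity augmenting path can be computed by a Dijkstra-like search that always expands the vertex reachable with the largest bottleneck capacity. The sketch below assumes the residual network is stored as a dictionary of dictionaries cap[u][v]; the representation and names are illustrative assumptions.

import heapq

def widest_path_bottleneck(cap, s, t):
    # Return the bottleneck (minimum residual capacity) of a maximum-capacity
    # s-t path in the residual network cap[u][v], or 0 if t is unreachable.
    best = {s: float('inf')}            # best bottleneck found so far into each vertex
    heap = [(-best[s], s)]              # max-heap simulated by negating keys
    while heap:
        b, u = heapq.heappop(heap)
        b = -b
        if u == t:
            return b
        if b < best.get(u, 0):
            continue                    # stale heap entry
        for v, c in cap.get(u, {}).items():
            nb = min(b, c)              # bottleneck if the path is extended to v
            if c > 0 and nb > best.get(v, 0):
                best[v] = nb
                heapq.heappush(heap, (-nb, v))
    return 0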
Observation: If the edge (u, v) is an edge on the minimum length augmenting path from s to t in G_f, then
δ_f(s, v) = δ_f(s, u) + 1.
Proof: This is a simple property of shortest paths. Since there is an edge from u to v, δ_f(s, v) ≤ δ_f(s, u) + 1,
and if δ_f(s, v) < δ_f(s, u) + 1 then u would not be on the shortest path from s to v, and hence (u, v) is not
on any shortest path.
Lemma: For each vertex u ∈ V − {s, t}, let δ_f(s, u) be the distance function from s to u in the residual network
G_f. Then as we perform augmentations by the Edmonds-Karp algorithm, the value of δ_f(s, u) increases
monotonically with each flow augmentation.
Proof: (Messy, but not too complicated. See the text.)
Theorem: The Edmonds-Karp algorithm makes at most O(n · e) augmentations.
Proof: An edge in the augmenting path is critical if the residual capacity of the path equals the residual capacity
of this edge. In other words, after augmentation the critical edge becomes saturated, and disappears from
the residual graph.
How many times can an edge become critical before the algorithm terminates? Observe that when the
edge (u, v) is critical it lies on the shortest augmenting path, implying that δ_f(s, v) = δ_f(s, u) + 1. After
this it disappears from the residual graph. In order to reappear, it must be that we reduce flow on this edge,
i.e. we push flow along the reverse edge (v, u). For this to be the case we have (at some later flow f′)
δ_f′(s, u) = δ_f′(s, v) + 1. Thus we have:

δ_f′(s, u) = δ_f′(s, v) + 1
           ≥ δ_f(s, v) + 1        (since distances increase with time)
           = (δ_f(s, u) + 1) + 1
           = δ_f(s, u) + 2.
Thus, between two successive times that an edge becomes critical, the distance of its tail vertex from the source
increases by at least two. This can happen at most n/2 times, since no vertex can be at distance more than n from
the source. Thus, each edge can become critical at most O(n) times; there are O(e) edges, hence after O(ne)
augmentations the algorithm must terminate.
In summary, the Edmonds-Karp algorithm makes at most O(ne) augmentations and runs in O(ne²) time.
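For concreteness, here is a minimal sketch of the Edmonds-Karp method (Ford-Fulkerson with BFS used to find a shortest augmenting path). The adjacency-dictionary representation and the function name are assumptions made for illustration, not part of the text.

from collections import deque

def edmonds_karp(cap, s, t):
    # cap[u][v] holds residual capacities and is modified in place;
    # reverse edges are created as needed with initial capacity 0.
    total_flow = 0
    while True:
        # BFS in the residual network for a shortest (fewest edges) augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap.get(u, {}).items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total_flow                    # no augmenting path: the flow is maximum
        # Residual capacity (bottleneck) of the path found.
        bottleneck = float('inf')
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        # Augment: decrease forward residual capacities, increase reverse ones.
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap.setdefault(v, {})
            cap[v][u] = cap[v].get(u, 0) + bottleneck
            v = u
        total_flow += bottleneck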
Maximum Matching: One of the important aspects of network flow is that it is a very general problem, and many
other problems can be solved by reducing them to it. (An example is problem 3 in the homework.) We will give
another example here.
Consider the following problem: you are running a dating service and there are a set of men L and a set of
women R. Using a questionnaire you establish which men are compatible with which women. Your task is
to pair up as many compatible pairs of men and women as possible, subject to the constraint that each man is
paired with at most one woman, and vice versa. (It may be that some men are not paired with any woman.)
This problem is modelled by giving an undirected graph whose vertex set is V = L ∪ R and whose edge set
consists of pairs (u, v), u ∈ L, v ∈ R, such that u and v are compatible. The problem is to find a matching, that is,
a subset of edges such that no vertex is an endpoint of more than one chosen edge, with as many edges as possible
(a maximum matching).
Reduction to Network Flow: We claim that if you have an algorithm for solving the network flow problem, then you
can use this algorithm to solve the maximum bipartite matching problem. (Note that this idea does not work for
general undirected graphs.)
Construct a flow network G′ = (V′, E′) as follows. Let s and t be two new vertices and let V′ = V ∪ {s, t}.
Direct each original edge from L to R and give it capacity 1; add an edge of capacity 1 from s to each vertex of L
and from each vertex of R to t. Compute a maximum flow in G′; the matching consists of the edges from L to R
that carry one unit of flow.
We claim that this matching is maximum because for every matching there is a corresponding flow of equal
value, and for every (integer) flow there is a matching of equal value. Thus by maximizing one we maximize
the other.
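A sketch of this reduction in code (the helper name and graph encoding below are illustrative assumptions): each original edge is directed from L to R with capacity 1, s is connected to every vertex of L, and every vertex of R is connected to t.

def matching_network(L, R, compatible):
    # Build the capacities of G' for maximum bipartite matching.
    # L, R: lists of vertices; compatible: iterable of pairs (u, v), u in L, v in R.
    # ('s' and 't' are assumed not to clash with existing vertex names.)
    cap = {u: {} for u in list(L) + list(R)}
    cap['s'], cap['t'] = {}, {}
    for u in L:
        cap['s'][u] = 1                  # source feeds each vertex of L
    for (u, v) in compatible:
        cap[u][v] = 1                    # one unit per compatible pair
    for v in R:
        cap[v]['t'] = 1                  # each vertex of R feeds the sink
    return cap, 's', 't'

# After computing an integer maximum flow from s to t (e.g. with the edmonds_karp
# sketch above), the matching consists of the pairs (u, v), u in L, v in R,
# whose edge carries one unit of flow.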
Hamiltonian Cycle: Today we consider a collection of problems related to finding paths in graphs and digraphs.
Recall that given a graph (or digraph) a Hamiltonian cycle is a simple cycle that visits every vertex in the graph
(exactly once). A Hamiltonian path is a simple path that visits every vertex in the graph (exactly once). The
Hamiltonian cycle (HC) and Hamiltonian path (HP) problems ask whether a given graph (or digraph) has such
a cycle or path, respectively. There are four variations of these problems depending on whether the graph is
directed or undirected, and depending on whether you want a path or a cycle, but all of these problems are
NP-complete.
An important related problem is the traveling salesman problem (TSP). Given a complete graph (or digraph)
with integer edge weights, determine the cycle of minimum weight that visits all the vertices. Since the graph
is complete, such a cycle will always exist. The decision problem formulation is, given a complete weighted
graph G and an integer X, does there exist a Hamiltonian cycle of total weight at most X? Today we will prove
that Hamiltonian Cycle is NP-complete. We will leave TSP as an easy exercise. (It is done in Section 36.5.5 in
CLR.)
The proof is not hard, but involves a careful inspection of the gadget. It is probably easiest to see this on your
own, by starting with one, two, or three input paths, and attempting to get through the gadget without skipping any
vertex and without visiting any vertex twice. To see whether you really understand the gadget, answer the
question of why there are 6 groups of triples. Would some other number work?
DHP is NP-complete: This gadget is an essential part of our proof that the directed Hamiltonian path problem is
NP-complete.
Let us consider the similar elements between the two problems. In 3SAT we are selecting a truth assignment
for the variables of the formula. In DHP, we are deciding which edges will be a part of the path. In 3SAT there
must be at least one true literal for each clause. In DHP, each vertex must be visited exactly once.
We are given a boolean formula F in 3-CNF form (three literals per clause). We will convert this formula into
a digraph. Let x_1, x_2, …, x_m denote the variables appearing in F. We will construct one DHP-gadget for each
clause in the formula. The inputs and outputs of each gadget correspond to the literals appearing in this clause.
Thus, the clause (x_2 ∨ x_5 ∨ x_8) would generate a clause gadget with inputs labeled x_2, x_5, and x_8, and the
same outputs.
The general structure of the digraph will consist of a series of vertices, one for each variable. Each of these vertices
will have two outgoing paths, one taken if x_i is set to true and one if x_i is set to false. Each of these paths will
then pass through some number of DHP-gadgets. The true path for x_i will pass through all the clause gadgets
for clauses in which x_i appears, and the false path will pass through all the gadgets for clauses in which the
complemented literal x̄_i appears. (The order in which the path passes through the gadgets is unimportant.) When
the paths for x_i have passed through their last gadgets, they are joined to the next variable vertex, x_{i+1}. This
is illustrated in the following figure. (The figure only shows a portion of the construction; there will be paths
coming into these same gadgets from other variables as well.)
(Figure: the true and false paths for variable x_i, each threading through the inputs (i_1, i_2, i_3) and outputs (o_1, o_2, o_3) of its clause gadgets and rejoining at the next variable vertex x_{i+1}.)
Note that for each variable, the Hamiltonian path must either use the true path or the false path, but it cannot use
both. If we choose the true path for x_i to be in the Hamiltonian path, then we will have at least one path passing
through each of the gadgets whose corresponding clause contains x_i, and if we choose the false path, then we
will have at least one path passing through each gadget whose clause contains x̄_i.
For example, consider the following boolean formula in 3-CNF:
(x_1 ∨ x_2 ∨ x_3) ∧ (x_1 ∨ x_2 ∨ x_3) ∧ (x_2 ∨ x_1 ∨ x_3) ∧ (x_1 ∨ x_3 ∨ x_2).
The construction yields the digraph shown in the following figure.
The Reduction: Let us give a more formal description of the reduction. Recall that we are given a boolean formula F
in 3-CNF. We create a digraph G as follows. For each variable x_i appearing in F, we create a variable vertex,
named x_i. We also create a vertex named x_e (the ending vertex). For each clause c, we create a DHP-gadget
whose inputs and outputs are labeled with the three literals of c. (The order is unimportant, as long as each input
and its corresponding output are labeled the same.)
We join these vertices with the gadgets as follows. For each variable x_i, consider all the clauses c_1, c_2, …, c_k in
which x_i appears as a literal (uncomplemented). Join x_i by an edge to the input labeled with x_i in the gadget for
c_1, and in general join the output of gadget c_j labeled x_i with the input of gadget c_{j+1} with this same label.
Finally, join the output of the last gadget c_k to the next variable vertex x_{i+1}. (If this is the last variable, then
join it to x_e instead.) The resulting chain of edges is called the true path for variable x_i. Form a second chain
in exactly the same way, but this time joining the gadgets for the clauses in which the complemented literal x̄_i
appears. This is called the false path for x_i. The resulting digraph is the output of the reduction. Observe that
the entire construction
can be performed in polynomial time, by simply inspecting the formula, creating the appropriate vertices, and
adding the appropriate edges to the digraph. The following lemma establishes the correctness of this reduction.
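Before turning to the lemma, the construction can be sketched in code, treating each DHP-gadget as a black box with one input and one output vertex per literal (the gadget's internal vertices and edges are omitted). The function name and the literal encoding below are illustrative assumptions.

def sat_to_dhp_edges(clauses, m):
    # clauses: list of clauses, each a list of literals; a literal is (i, True) for x_i
    # and (i, False) for its complement.  m: number of variables x_1..x_m.
    # Returns the chain edges of the reduction; vertices are named strings, and each
    # gadget contributes black-box vertices "in(c,lit)" and "out(c,lit)".
    edges = set()
    var = lambda i: "x%d" % i if i <= m else "xe"     # variable vertices, then the end vertex
    for i in range(1, m + 1):
        for sign in (True, False):                    # the true path, then the false path
            lit = (i, sign)
            prev = var(i)                             # each chain starts at the variable vertex
            for c, clause in enumerate(clauses):
                if lit in clause:                     # thread through this clause's gadget
                    edges.add((prev, "in(%d,%s)" % (c, lit)))
                    prev = "out(%d,%s)" % (c, lit)
            edges.add((prev, var(i + 1)))             # rejoin at the next variable vertex (or x_e)
    return edges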
Lemma: The boolean formula F is satisfiable if and only if the digraph G produced by the above reduction has
a Hamiltonian path.
Fig. 81: Correctness of the 3SAT to DHP reduction. The upper figure shows the Hamiltonian path resulting from the
satisfying assignment x_1 = 1, x_2 = 1, x_3 = 0, and the lower figure shows the non-Hamiltonian path resulting from
the nonsatisfying assignment x_1 = 0, x_2 = 1, x_3 = 0 (a nonsatisfying assignment misses some gadgets).
Proof: We need to prove both the “only if” and the “if”.
⇒: Suppose that F has a satisfying assignment. We claim that G has a Hamiltonian path. This path will start at
the variable vertex x_1, then will travel along either the true path or the false path for x_1, depending on whether
it is 1 or 0, respectively, in the assignment, and then it will continue with x_2, then x_3, and so on, until
reaching x_e. Such a path will visit each variable vertex exactly once.
Because this is a satisfying assignment, we know that for each clause, either 1, 2, or 3 of its literals
will be true. This means that for each clause, either 1, 2, or 3 paths will attempt to travel through the
corresponding gadget. However, we have argued in the above claim that in this case it is possible to visit
every vertex in the gadget exactly once. Thus every vertex in the graph is visited exactly once, implying
that G has a Hamiltonian path.
⇐: Suppose that G has a Hamiltonian path. We assert that the form of the path must be essentially the same as
the one described in the previous part of this proof. In particular, the path must visit the variable vertices
in increasing order from x_1 to x_e, because of the way in which these vertices are joined together.
Also observe that for each variable vertex, the path will proceed along either the true path or the false path.
If it proceeds along the true path, set the corresponding variable to 1 and otherwise set it to 0. We will
show that the resulting assignment is a satisfying assignment for F .
Any Hamiltonian path must visit all the vertices in every gadget. By the above claim about DHP-gadgets,
if a path visits all the vertices of a gadget and enters along an input edge, then it must exit along the corresponding output
edge. Therefore, once the Hamiltonian path starts along the true or false path for some variable, it must
remain on edges with the same label. That is, if the path starts along the true path for x_i, it must travel
through all the gadgets with the label x_i until arriving at the variable vertex for x_{i+1}. If it starts along the
false path, then it must travel through all gadgets with the label x̄_i.
Since all the gadgets are visited and the paths must remain true to their initial assignments, it follows that
for each corresponding clause, at least one (and possibly two or three) of the literals must be true. Therefore,
this is a satisfying assignment.
Polynomial Approximation Schemes: Last time we saw that for some NP-complete problems, it is possible to approximate
the problem to within a fixed constant ratio bound. For example, the approximation algorithm produces
an answer that is within a factor of 2 of the optimal solution. However, in practice, people would like to
control the precision of the approximation. This is done by specifying a parameter ε > 0 as part of the input
to the approximation algorithm, and requiring that the algorithm produce an answer that is within a relative
error of ε of the optimal solution. It is understood that as ε tends to 0, the running time of the algorithm will
increase. Such an algorithm is called a polynomial approximation scheme.
For example, the running time of the algorithm might be O(2^(1/ε) n²). It is easy to see that in such cases the user
pays a big penalty in running time as a function of ε. (For example, to produce a 1% error, the "constant" factor
would be 2^100, which would be around 4 quadrillion centuries on your 100 MHz Pentium.) A fully polynomial
approximation scheme is one in which the running time is polynomial in both n and 1/ε. For example, a
running time of O((n/ε)²) would satisfy this condition. In such cases, reasonably accurate approximations are
computationally feasible.
Unfortunately, there are very few NP-complete problems with fully polynomial approximation schemes. In fact,
recently there has been strong evidence that many NP-complete problems do not have polynomial approximation
schemes (fully or otherwise). Today we will study one that does.
Subset Sum: Recall that in the subset sum problem we are given a set S of positive integers {x_1, x_2, …, x_n} and a
target value t, and we are asked whether there exists a subset S′ ⊆ S that sums exactly to t. The optimization
problem is to determine the subset whose sum is as large as possible but not larger than t.
This problem is basic to many packing problems, and is indirectly related to processor scheduling problems that
arise in operating systems as well. Suppose we are also given 0 < ε < 1. Let z* ≤ t denote the optimum sum.
The approximation problem is to return a value z ≤ t such that

z ≥ z*(1 − ε).

If we think of this as a knapsack problem, we want our knapsack to be within a factor of (1 − ε) of being as full
as possible. So, if ε = 0.1, then the knapsack should be at least 90% as full as the best possible.
What do we mean by polynomial time here? Recall that the running time should be polynomial in the size of
the input. Obviously n is part of the input size. But t and the numbers x_i could also be huge binary
numbers. Normally we just assume that a binary number can fit into a word of our computer, and do not count
its length. In this case, to be on the safe side, we will count it. Clearly t requires O(log t) digits to be stored in
the input. We will take the input size to be n + log t.
Intuitively it is not hard to believe that it should be possible to determine whether we can fill the knapsack to
within 90% of optimal. After all, we are used to solving similar sorts of packing problems all the time in real
life. But the mental heuristics that we apply to these problems are not necessarily easy to convert into efficient
algorithms. Our intuition tells us that we can afford to be a little "sloppy" in keeping track of exactly how full the
knapsack is at any point. The value of ε tells us just how sloppy we can be. Our approximation will do something
similar. First we consider an exponential time algorithm, and then convert it into an approximation algorithm.
Exponential Time Algorithm: This algorithm is a variation of the dynamic programming solution we gave for the
knapsack problem. Recall that there we used a 2-dimensional array to keep track of whether we could fill a
knapsack of a given capacity with the first i objects. We will do something similar here. As before, we will
concentrate on the question of which sums are possible, but determining the subsets that give these sums will
not be hard.
Let L_i denote a list of integers that contains the sums of all 2^i subsets of {x_1, x_2, …, x_i} (including the empty
set whose sum is 0). For example, for the set {1, 4, 6} the corresponding list of sums contains ⟨0, 1, 4, 5(= 1 + 4),
6, 7(= 1 + 6), 10(= 4 + 6), 11(= 1 + 4 + 6)⟩. Note that L_i can be formed from L_{i−1} by taking the union of
L_{i−1} with the list obtained by adding x_i to every element of L_{i−1}:
L_0 = ⟨0⟩
L_1 = ⟨0⟩ ∪ ⟨0 + 1⟩ = ⟨0, 1⟩
L_2 = ⟨0, 1⟩ ∪ ⟨0 + 4, 1 + 4⟩ = ⟨0, 1, 4, 5⟩
L_3 = ⟨0, 1, 4, 5⟩ ∪ ⟨0 + 6, 1 + 6, 4 + 6, 5 + 6⟩ = ⟨0, 1, 4, 5, 6, 7, 10, 11⟩.
The last list would have the elements 10 and 11 removed, and the final answer would be 7. The algorithm runs
in Ω(2^n) time in the worst case, because this is the number of sums that are generated if there are no duplicates,
and no items are removed.
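A direct sketch of this exact (exponential time) version, with the lists kept sorted and duplicate-free; the function name and the final filtering against t are illustrative assumptions.

def exact_subset_sums(x, t):
    # L holds every achievable subset sum of the items seen so far (no trimming),
    # so its length can grow to 2^n in the worst case.
    L = [0]                                            # the empty subset sums to 0
    for xi in x:
        L = sorted(set(L) | {s + xi for s in L})       # L_i = L_{i-1} merged with L_{i-1} + x_i
        L = [s for s in L if s <= t]                   # discard sums that exceed the target
    return max(L)                                      # best achievable sum not exceeding t

# exact_subset_sums([1, 4, 6], 7) returns 7, matching the example above.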
Approximation Algorithm: To convert this into an approximation algorithm, we will "trim" the lists to
decrease their sizes. The idea is that if the list L contains two numbers that are very close to one another, e.g.
91,048 and 91,050, then we should not need to keep both of these numbers in the list. One of them is good
enough for future approximations. This will reduce the size of the lists that the algorithm needs to maintain.
But, how much trimming can we allow and still keep our approximation bound? Furthermore, will we be able
to reduce the list sizes from exponential to polynomial?
The answer to both these questions is yes, provided you apply a proper way of trimming the lists. We will trim
elements whose values are sufficiently close to each other. But we should define "close" in a manner that is relative
to the sizes of the numbers involved. The trimming must also depend on ε. We select δ = ε/n. (Why? We will
see later that this is the value that makes everything work out in the end.) Note that 0 < δ < 1. Assume that the
elements of L are sorted. We walk through the list. Let z denote the last untrimmed element in L, and let y ≥ z
be the next element to be considered. If

(y − z)/y ≤ δ

then we trim y from the list. Equivalently, this means that the final trimmed list cannot contain two distinct values
y and z such that

(1 − δ)y ≤ z ≤ y.

We can think of z as representing y in the list.
For example, given δ = 0.1 and given the list

L = ⟨10, 11, 12, 15, 20, 21, 22, 23, 24, 29⟩,

the trimmed list is

L′ = ⟨10, 12, 15, 20, 23, 29⟩.
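As a sketch, the trimming pass just described can be written as follows (assuming the input list is sorted; the function name is an assumption).

def trim(L, delta):
    # Walk a sorted list of sums, dropping y whenever the last kept element z
    # already satisfies (y - z)/y <= delta, i.e. z can represent y.
    trimmed = [L[0]]
    z = L[0]
    for y in L[1:]:
        if y - z > delta * y:          # z is not close enough to represent y
            trimmed.append(y)
            z = y
    return trimmed

# trim([10, 11, 12, 15, 20, 21, 22, 23, 24, 29], 0.1)
# returns [10, 12, 15, 20, 23, 29], the trimmed list of the example above.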
Another way to visualize trimming is to break the interval [1, t] into a set of buckets of exponentially
increasing size. Let d = 1/(1 − δ); note that d > 1. Consider the intervals [1, d], [d, d²], [d², d³], …, [d^(k−1), d^k],
where d^k ≥ t. If z ≤ y are in the same interval [d^(i−1), d^i], then

(y − z)/y ≤ (d^i − d^(i−1))/d^i = 1 − 1/d = δ.
Thus, we cannot have more than one item within each bucket. We can think of trimming as a way of enforcing
the condition that items in our lists are not relatively too close to one another, by enforcing the condition that no
bucket has more than one item.
Fig. 82: Trimming Lists for Approximate Subset Sum.
Claim: The number of distinct items in a trimmed list is O((n log t)/ε), which is polynomial in the input size and
1/ε.
Proof: We know that each pair of consecutive elements in a trimmed list differs by a ratio of at least d =
1/(1 − δ) > 1. Let k denote the number of elements in the trimmed list, ignoring the element of value 0.
Thus, the smallest nonzero value and the maximum value in the trimmed list differ by a ratio of at least
d^(k−1). Since the smallest (nonzero) element is at least as large as 1, and the largest is no larger than t, it
follows that d^(k−1) ≤ t/1 = t. Taking the natural log of both sides we have (k − 1) ln d ≤ ln t. Using the
facts that δ = ε/n and the log identity ln(1 + x) ≤ x, we have

k − 1 ≤ ln t / ln d = ln t / (− ln(1 − δ))
      ≤ ln t / δ = (n ln t)/ε

and hence k = O((n log t)/ε).

Observe that the input size is at least as large as n (since there are n numbers) and at least as large as log t
(since it takes log t digits to write down t in the input). Thus, this function is polynomial in the input size
and 1/ε.
The approximation algorithm operates as before, but in addition it calls the procedure Trim, which implements the
trimming rule described above.
For example, consider the set S = {104, 102, 201, 101} and t = 308 and ε = 0.20. We have δ = ε/4 = 0.05.
The pseudocode is given below, followed by a summary of the algorithm's execution on this input.
Approx_SS(x[1..n], t, eps) {
delta = eps/n; // approx factor
L = <0>; // empty sum = 0
for i = 1 to n do {
L = MergeLists(L, L+x[i]); // add in next item
L = Trim(L, delta); // trim away "near" duplicates
remove from L all elements greater than t;
}
return largest element in L;
}
init: L_0 = ⟨0⟩
add 104: merge: ⟨0, 104⟩; trim: ⟨0, 104⟩; remove > t: L_1 = ⟨0, 104⟩
add 102: merge: ⟨0, 102, 104, 206⟩; trim: ⟨0, 102, 206⟩; remove > t: L_2 = ⟨0, 102, 206⟩
add 201: merge: ⟨0, 102, 201, 206, 303, 407⟩; trim: ⟨0, 102, 201, 303, 407⟩; remove > t: L_3 = ⟨0, 102, 201, 303⟩
add 101: merge: ⟨0, 101, 102, 201, 203, 302, 303, 404⟩; trim: ⟨0, 101, 201, 302, 404⟩; remove > t: L_4 = ⟨0, 101, 201, 302⟩
The final output is 302. The optimum is 307 = 104 + 102 + 101 . So our actual relative error in this case is
within 2%.
The running time of the procedure is O(n|L|), which is O(n² ln t/ε) by the earlier claim.
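Putting the pieces together, here is a runnable sketch of the whole procedure in the same spirit as the pseudocode above (the function name is an assumption); it reproduces the trace and the output 302 for the example.

def approx_subset_sum(x, t, eps):
    # Returns z <= t with z >= (1 - eps) * z*, where z* is the optimum sum.
    delta = eps / len(x)                               # per-stage trimming parameter
    L = [0]
    for xi in x:
        merged = sorted(set(L) | {s + xi for s in L})  # MergeLists(L, L + x_i)
        L = []
        for y in merged:                               # Trim(merged, delta)
            if not L or y - L[-1] > delta * y:         # keep y only if nothing close represents it
                L.append(y)
        L = [s for s in L if s <= t]                   # remove elements greater than t
    return max(L)

# approx_subset_sum([104, 102, 201, 101], 308, 0.20) returns 302.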
Our proof will make use of an important inequality from real analysis.
Lemma: For n > 0 and any real number a,

(1 + a) ≤ (1 + a/n)^n ≤ e^a.
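As a quick justification (not spelled out in the text), the lower bound follows from Bernoulli's inequality and the upper bound from the identity ln(1 + x) ≤ x:

(1 + a/n)^n ≥ 1 + n(a/n) = 1 + a        (Bernoulli's inequality, valid for a/n ≥ −1)
n ln(1 + a/n) ≤ n(a/n) = a,   hence   (1 + a/n)^n ≤ e^a.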
Recall that our intuition was that we would allow a relative error of ε/n at each stage of the algorithm. Since the
algorithm has n stages, the total relative error should be (obviously?) n(ε/n) = ε. The catch is that these
are relative, not absolute, errors. These errors do not accumulate additively, but rather by multiplication. So we
need to be more careful.
Let L_i^* denote the i-th list in the exponential time (optimal) solution and let L_i denote the i-th list in the
approximate algorithm. We claim that for each y ∈ L_i^* there exists a representative item z ∈ L_i whose relative
error from y satisfies

(1 − ε/n)^i y ≤ z ≤ y.

The proof of the claim is by induction on i. Initially L_0 = L_0^* = ⟨0⟩, and so there is no error. Suppose by
induction that the above equation holds for each item in L_{i−1}^*. Consider an element y ∈ L_{i−1}^*. We know
that y will generate two elements in L_i^*: y and y + x_i. We want to argue that there will be a representative that
is "close" to each of these items.
By our induction hypothesis, there is a representative element z in L_{i−1} such that

(1 − ε/n)^(i−1) y ≤ z ≤ y.

When we apply our algorithm, we will form two new items to add (initially) to L_i: z and z + x_i. Observe that
by adding x_i to the inequality above and a little simplification we get

(1 − ε/n)^(i−1) (y + x_i) ≤ z + x_i ≤ y + x_i.
(Figure: the elements y and y + x_i of L_i^*, the items z and z + x_i generated by the algorithm, and their
representatives z′ and z″ in the trimmed list L_i.)
The items z and z + x_i might not appear in L_i because they may be trimmed. Let z′ and z″ be their respective
representatives. Thus, z′ and z″ are elements of L_i, and we have

(1 − ε/n) z ≤ z′ ≤ z
(1 − ε/n)(z + x_i) ≤ z″ ≤ z + x_i.

Combining these with the inequalities above gives

(1 − ε/n)^i y ≤ z′ ≤ y    and    (1 − ε/n)^i (y + x_i) ≤ z″ ≤ y + x_i.

Since z′ and z″ are in L_i, this is the desired result. This ends the proof of the claim.
Using our claim, and the fact that Y* (the optimum answer) is the largest element of L_n^* and Y (the approximate
answer) is the largest element of L_n, we have

(1 − ε/n)^n Y* ≤ Y ≤ Y*.

This is not quite what we wanted. We wanted to show that (1 − ε)Y* ≤ Y. To complete the proof, we observe
from the lemma above (setting a = −ε) that

(1 − ε) ≤ (1 − ε/n)^n,

and hence (1 − ε)Y* ≤ (1 − ε/n)^n Y* ≤ Y.
This completes the analysis of the approximation algorithm.