Shimon Even Graph Algorithms Computer Software Engineering Series
ALGORITHMS
SHIMON EVEN
CALINGAERT
Assemblers, Compilers, and Program Translation
EVEN
Graph Algorithms
SHIMON EVEN
Technion, Israel Institute of Technology
All rights reserved. No part of this work may be reproduced, transmitted, or stored in
any form or by any means, without the prior written consent of the Publisher.
Graph theory has long become recognized as one of the more useful mathe-
matical subjects for the computer science student to master. The approach
which is natural in computer science is the algorithmic one; our interest is
not so much in existence proofs or enumeration techniques, as it is in find-
ing efficient algorithms for solving relevant problems, or alternatively
showing evidence that no such algorithm exists. Although algorithmic
graph theory was started by Euler, if not earlier, its development in the last
ten years has been dramatic and revolutionary. Much of the material of
Chapters 3, 5, 6, 8, 9 and 10 is less than ten years old.
This book is meant to be a textbook of an upper level undergraduate, or
graduate course. It is the result of my experience in teaching such a course
numerous times, since 1967, at Harvard, the Weizmann Institute of
Science, Tel Aviv University, University of California at Berkeley and the
Technion. There is more than enough material for a one semester course,
and I am sure that most teachers will have to omit parts of the book from
their course. If the course is for undergraduates, Chapters 1 to 5 provide
enough material, and even then the teacher may choose to omit a few sections, such as 2.6, 2.7, 3.3 and 3.4. Chapter 7 consists of classical nonalgorithmic studies of planar graphs, which are necessary in order to understand the tests of planarity described in Chapter 8; it may be assigned as a preparatory reading assignment. The mathematical background needed for understanding Chapters 1 to 8 is some knowledge of set theory, combinatorics and algebra, which the computer science student usually masters during his freshman year through a course on discrete mathematics and a course on linear algebra. However, the student will also need to know a little about data structures and programming techniques, or he may not appreciate the algorithmic side and may miss the complexity considerations. It is
my experience that after two courses in programming the students have the
necessary knowledge. However, in order to follow Chapters 9 and 10, addi-
tional background is necessary, namely, in theory of computation. Specifi-
cally, the student should know about Turing machines and Church's
thesis.
S.E.
PREFACE................................................ v
1. PATHS IN GRAPHS
1.1 Introduction to graph theory .......................... 1
1.2 Computer representation of graphs ..................... 3
1.3 Euler graphs ..................................... 5
1.4 De Bruijn sequences ................................. 8
1.5 Shortest-path algorithms ............................. 11
Problems ........................................... 18
References ......................................... 20
2. TREES
2.1 Tree definitions ..................................... 22
2.2 Minimum spanning tree .............................. 24
2.3 Cayley's theorem .................................... 26
2.4 Directed tree definitions .............................. 30
2.5 The infinity lemma .................................. 32
2.6 The number of spanning trees ......................... 34
2.7 Optimum branchings and directed spanning trees ......... 40
2.8 Directed trees and Euler circuits ....................... 46
Problems....................................... 49
References ......................................... 51
3. DEPTH-FIRST SEARCH
3.1 DFS of undirected graphs ............................ 53
3.2 Algorithm for nonseparable components ................ 57
3.3 DFS on digraphs .................................... 63
3.4 Algorithm for strongly-connected components ............ 64
Problems....................................... 66
References ......................................... 68
4. ORDERED TREES
4.1 Uniquely decipherable codes .......................... 69
4.2 Positional trees and Huffman's optimization problem ...... 74
4.3 Application of the Huffman tree to sort-by-merge
techniques..................................... 80
4.4 Catalan numbers .................................... 82
Problems....................................... 87
References ......................................... 88
7. PLANAR GRAPHS
7.1 Bridges and Kuratowski's theorem ..................... 148
7.2 Equivalence ........................................ 160
7.3 Euler's theorem ..................................... 161
7.4 Duality ........................................ 162
Problems ........................................... 168
References ......................................... 170
PATHS IN GRAPHS
1.1 INTRODUCTION TO GRAPH THEORY
Lemma 1.1: The number of vertices of odd degree in a finite graph is even.
Proof: Let |V| and |E| be the number of vertices and edges, respectively.
Then,

Σ_{i=1}^{|V|} d(v_i) = 2|E|,
since each edge contributes two to the left hand side; one to the degree of
each of its two endpoints, if they are different, and two to the degree of its
endpoint if it is a self-loop. It follows that the number of odd degrees must be
even.
Q.E.D.
Figure 1.1
The notation u -e- v means that the edge e has u and v as endpoints. In this
case we also say that e connects vertices u and v, and that u and v are adjacent.

A path is a sequence of edges e1, e2, ..., el such that every two consecutive edges ei and ei+1 share an endpoint; moreover, if ei is not the first edge and is not a self-loop, then the endpoint it shares with ei+1 is different from the endpoint it shares with ei-1.

Figure 1.2
We do not like to call the sequence e1, e2, e3 a path, and it is not, since the
only vertex, b, which is shared by e1 and e2 is also the only vertex shared by e2
and e3. But we have no objection to calling e1, e4, e3 a path. Also, the sequence e1, e2, e2, e3 is a path, since e1 and e2 share b, e2 and e2 share d, and e2
and e3 share b. It is convenient to describe a path as follows: v0 -e1- v1 -e2- ... -el- vl.
Here the path is e1, e2, ..., el and the endpoints shared are transparent;
v0 is called the start vertex and vl is called the end vertex. The length of the
path is l.
A circuit is a path whose start and end vertices are the same.
A path is called simple if no vertex appears on it more than once. A
circuit is called simple if no vertex, other than the start-end vertex, appears more than once, and the start-end vertex does not appear elsewhere
in the circuit; however, u -e- v -e- u, which traverses the same edge twice, is not considered a simple circuit.
If for every two vertices u and v there exists a path whose start vertex is u
and whose end vertex is v then the graph is called connected.
A digraph (or directed graph) is defined similarly to a graph except that
the pair of endpoints of an edge is now ordered; the first endpoint is called
the start-vertex of the edge and the second (which may be the same) is called
its end-vertex. The edge u -e→ v is said to be directed from u to v. Edges
with the same start-vertex and the same end-vertex are called parallel, and if
u ≠ v, u -e1→ v and v -e2→ u, then e1 and e2 are antiparallel. An edge u -e→ u is called
a self-loop.
The outdegree, d_out(v), of a vertex v is the number of edges which have v as
their start-vertex; the indegree, d_in(v), is defined similarly. Clearly, for every
digraph

Σ_{i=1}^{|V|} d_in(v_i) = Σ_{i=1}^{|V|} d_out(v_i).
A directed path is a sequence of edges e1, e2, ... such that the end vertex
of ei-1 is the start vertex of ei. A directed path is a directed circuit if the start
vertex of the path is the same as its end vertex. The notion of a directed path
or circuit being simple is defined similarly to that in the undirected case. A
digraph is said to be strongly connected if for every vertex u and every vertex
v there is a directed path from u to v; namely, its start-vertex is u and its end-
vertex is v.
1.2 COMPUTER REPRESENTATION OF GRAPHS

In this section two of the most common methods of graph representation are briefly described.
Graphs and digraphs which have no parallel edges are called simple. In
the case of simple graphs, the specification of the two endpoints is sufficient to
specify the edge; in the case of digraphs, the specification of the start-vertex and
end-vertex is sufficient. Thus, we can represent a graph or digraph of n vertices by an n × n matrix C, where C_ij = 1 if there is an edge connecting
vertex v_i to v_j, and C_ij = 0 if not. Clearly, in the case of graphs C_ij = 1 implies C_ji = 1; or in other words, C is symmetric. But in the case of digraphs, any n × n matrix of zeros and ones is possible. This matrix is
called the adjacency matrix.
Given the adjacency matrix of a graph, one can compute d(v_i) by counting
the number of ones in the i-th row, except that a one on the main diagonal
contributes two to the count. For a digraph, the number of ones in the i-th row
is equal to d_out(v_i) and the number of ones in the i-th column is equal to d_in(v_i).
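The two observations above (symmetry for undirected graphs, and a diagonal one counting twice toward the degree) can be sketched as follows. The function names and the 0-based vertex numbering are illustrative assumptions of this sketch, not the book's notation.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build the n x n 0/1 matrix C with C[i][j] = 1 iff there is an edge
    from vertex i to vertex j (vertices numbered 0..n-1 here)."""
    C = [[0] * n for _ in range(n)]
    for u, v in edges:
        C[u][v] = 1
        if not directed:
            C[v][u] = 1          # for undirected graphs C is symmetric
    return C

def degree(C, i):
    """d(v_i) from the matrix: count the ones in row i, with a one on the
    main diagonal (a self-loop) contributing two."""
    return sum(C[i]) + C[i][i]

C = adjacency_matrix(4, [(0, 1), (1, 2), (2, 0), (3, 3)])
```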
The adjacency matrix is not an efficient representation of the graph in case
the graph is sparse; namely, the number of edges is significantly smaller than
n². In these cases the following representation, which also allows parallel
edges, is preferred.
For each of the vertices, the edges incident to it are listed. This incidence
list may simply be an array or may be a linked list. We may need a table
which tells us the location of the list for each vertex and a table which tells us
for each edge its two endpoints (or start-vertex and end-vertex, in case of a
digraph).
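A minimal sketch of incidence lists in Python (hypothetical names; edge indices into the edge table play the role of edge names; self-loops are listed once here for simplicity):

```python
from collections import defaultdict

def incidence_lists(edges, directed=False):
    """inc[v] lists the indices (into the edge table `edges`) of the edges
    incident to v; for a digraph, only the edges starting at v."""
    inc = defaultdict(list)
    for idx, (u, v) in enumerate(edges):
        inc[u].append(idx)
        if not directed and u != v:
            inc[v].append(idx)
    return inc

# The edge table itself tells us each edge's two endpoints.
edges = [(0, 1), (1, 2), (2, 0)]
inc = incidence_lists(edges)
```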
We can now trace a path starting from a vertex by taking the first edge on
its incidence list, looking up its other endpoint in the edge table, finding the incidence list of this new vertex, and so on. This saves the time of scanning the row of
the matrix, looking for a one. However, the saving is real only if n is large and
the graph is sparse, for instead of using one bit per edge, we now use edge
names and the auxiliary pointers necessary in our data structure. Clearly, the
space required is O(|E| + |V|), i.e., bounded by a constant times |E| + |V|.
Here we assume that the basic word length of our computer is large
enough to encode all edges and vertices. If this assumption is false then the
space required is O((|E| + |V|) log(|E| + |V|)).*
In practice, most graphs are sparse; namely, the ratio (|E| + |V|)/|V|²
tends to zero as the size of the graph increases. Therefore, we shall prefer
the use of incidence lists to that of adjacency matrices.
*The base of the log is unimportant (clearly greater than one), since this estimate is
only up to a constant multiplier.
The reader can find more about data structures and their uses in graph
theoretic algorithms in references 1 and 2.
1.3 EULER GRAPHS

An Euler path of a finite graph G(V, E) is a path such that every edge of G appears on it exactly once. A graph which has an Euler path is called an Euler graph.

Theorem 1.1: A finite connected graph is an Euler graph if and only if the number of its vertices of odd degree is either zero or two; if it is zero, every Euler path of the graph is a circuit, and if it is two, every Euler path starts at one of the odd-degree vertices and ends at the other.
Proof: It is clear that if a graph has an Euler path which is not a circuit, then
the start vertex and the end vertex of the path are of odd degree, while all the
other vertices are of even degree. Also, if a graph has an Euler circuit, then all
vertices are of even degree.
Figure 1.3
Assume now that G is a finite graph with exactly two vertices of odd
degree, a and b. We shall now describe an algorithm for finding an Euler
path from a to b. Starting from a we choose any edge adjacent to it (an edge
of which a is an endpoint) and trace it (go to its other endpoint). Upon enter-
ing a vertex we search for an unused incident edge. If the vertex is neither a
nor b, each time we pass through it we use up two of its incident edges. The
degree of the vertex is even. Thus, the number of unused incident edges after
leaving it is even. (Here again, a self-loop is counted twice.) Therefore, upon
entering it there is at least one unused incident edge to leave by. Also, by a
similar argument, whenever we reenter a we have an unused edge to leave by.
It follows that the only place this process can stop is in b. So far we have
found a path which starts in a, ends in b, and the number of unused edges in-
cident to any vertex is even. Since the graph is connected, there must be at
least one unused edge which is incident to one of the vertices on the existing
path from a to b. Starting a trail from this vertex on unused edges, the only
vertex in which this process can end (because no continuation can be found)
is the vertex in which it started. Thus, we have found a circuit of edges which
were not used before, and in which each edge is used at most once: it starts
and ends in a vertex visited in the previous path. It is easy to change our path
from a to b to include this detour. We continue to add such detours to our
path as long as not all the edges are in it.
The case of all vertices of even degrees is similar. The only difference is
that we start the initial tour at any vertex, and this tour must stop at the same
vertex. This initial circuit is amended as before, until all edges are included.
Q.E.D.
In the case of digraphs, a directed Euler path is a directed path in which
every edge appears exactly once. A directed Euler circuit is defined similarly.
Also, a digraph is called Euler if it has a directed Euler path (or circuit).
The underlying (undirected) graph of a digraph is the graph resulting from
the digraph if the direction of the edges is ignored. Thus, the underlying
graph of the digraph shown in Figure 1.4(a) is shown in Figure 1.4(b).
Theorem 1.2: A finite digraph is an Euler digraph if and only if its underlying graph is connected and one of the following two conditions holds:
1. There is one vertex a such that d_out(a) = d_in(a) + 1 and another vertex b
such that d_out(b) + 1 = d_in(b), while for all other vertices v, d_out(v) = d_in(v).
2. For all vertices v, d_out(v) = d_in(v).
If 1 holds then every directed Euler path starts in a and ends in b. If 2 holds
then every directed Euler path is a directed Euler circuit.
Figure 1.4
The proof of the theorem is along the same lines as the proof of Theorem 1.1,
and will not be repeated here.
Let us now make a few comments about the complexity of the algorithm
for finding an Euler path, as described in the proof of Theorem 1.1. Our purpose is to show that the time complexity of the algorithm is O(|E|); namely,
there exists a constant K such that the time it takes to find an Euler path is
bounded by K·|E|.
In the implementation, we use the following data structures: each vertex v has a pointer N(v) to the next candidate edge on its incidence list; P is a doubly linked list of the edges of the path found so far; E(v) points to an occurrence in P of an edge whose traversal starts at v; L is a list of the visited vertices; each edge has a "used" mark and each vertex a "visited" mark. The subroutine is:
TRACE(d, P):
(1) v ← d.
(2) If v is "unvisited", put it in L and mark it "visited".
(3) If N(v) is "used" but is not last on v's incidence list, then have N(v) point
to the next edge and repeat (3).
(4) If N(v) is "used" and it is the last edge on v's incidence list, then stop.
(5) e ← N(v).
(6) Add e to the end of P.
(7) If E(v) is "undefined" then E(v) is made to point to the occurrence of e
in P.
(8) Mark e "used".
(9) Use the edge table to find the other endpoint u of e.
(10) v ← u and go to (2).
The main algorithm is:

(1) d ← a.
(2) TRACE(d, P). [Comment: The subroutine finds a path from a to b.]
(3) If L is empty, stop.
(4) Let u be in L. Remove u from L.
(5) Start a new doubly linked list of edges, P′, which is initially empty.
[Comment: P′ is to contain the detour from u.]
(6) TRACE(u, P′).
(7) Incorporate P′ into P at E(u). [Comment: This joins the path and the
detour into one, possibly longer, path. (The detour may be empty.) Since
the edge E(u) starts from u, the detour is incorporated in a correct
place.]
(8) Go to (3).
It is not hard to see that both the time and space complexity of this
algorithm are O(|E|).
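The following Python sketch (hypothetical names; self-loops omitted and connectedness assumed) follows the same O(|E|) idea: an explicit stack stands in for the doubly linked list P and the detour splicing, and per-vertex next-edge pointers play the role of N(v).

```python
def euler_path(n, edges):
    """Find an Euler path (or circuit) of a connected undirected graph with
    0 or 2 odd-degree vertices, as a list of vertices. O(|E|)."""
    inc = [[] for _ in range(n)]           # incidence lists of edge indices
    for idx, (u, v) in enumerate(edges):
        inc[u].append(idx)
        inc[v].append(idx)
    odd = [v for v in range(n) if len(inc[v]) % 2 == 1]
    start = odd[0] if odd else next(v for v in range(n) if inc[v])
    nxt = [0] * n                          # the pointer N(v) of the text
    used = [False] * len(edges)
    stack, path = [start], []
    while stack:
        v = stack[-1]
        while nxt[v] < len(inc[v]) and used[inc[v][nxt[v]]]:
            nxt[v] += 1                    # skip edges already traced
        if nxt[v] == len(inc[v]):
            path.append(stack.pop())       # dead end: back up, emit vertex
        else:
            e = inc[v][nxt[v]]
            used[e] = True
            u, w = edges[e]
            stack.append(w if v == u else u)
    return path[::-1]
```

The backing-up step is where the detours get spliced into the main path, so no separate "incorporate P′ into P" pass is needed in this variant.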
1.4 DE BRUIJN SEQUENCES

Binary de Bruijn sequences are of great importance in coding theory and are
implemented by shift registers. (See Golomb's book [3] on the subject.) The
interested reader can find more information on de Bruijn sequences in
references 4 and 5. The only problem we shall discuss here is the existence of
de Bruijn sequences for every σ ≥ 2 and every n.
Let us describe a digraph G_{σ,n}(V, E) which has the following structure: its vertices are the σ^{n-1} words of length n - 1 over the alphabet {0, 1, ..., σ - 1}, and for every word b1b2...bn of length n there is an edge from the vertex b1b2...b_{n-1} to the vertex b2b3...bn.
The graphs G_{2,3}, G_{2,4} and G_{3,2} are shown in Figures 1.5, 1.6 and 1.7,
respectively.
The implied de Bruijn sequence, 00011101, follows by reading the first letter
of each word in the circuit. Thus, the question of the existence of de Bruijn sequences is equivalent to that of the existence of directed Euler circuits in the
corresponding de Bruijn diagram.
Theorem 1.3: For all positive integers σ and n, G_{σ,n} has a directed Euler
circuit.
Proof: We wish to use Theorem 1.2 to prove our theorem. First we have to
show that the underlying undirected graph is connected. In fact, we shall
show that G_{σ,n} is strongly connected. Let b1b2...b_{n-1} and c1c2...c_{n-1} be
any two vertices; the directed path b1b2...b_{n-1}c1, b2b3...b_{n-1}c1c2, ...,
b_{n-1}c1c2...c_{n-1} leads from the first to the second. Next, we have to show
that d_out(v) = d_in(v) for each vertex v. The vertex b1b2...b_{n-1} is entered by
Figure 1.5
Figure 1.6
edges cb1b2...b_{n-1}, where c can be chosen in σ ways, and is the start vertex
of edges b1b2...b_{n-1}c, where again c can be chosen in σ ways.
Q.E.D.
Corollary 1.1: For all positive integers σ and n there exists a de Bruijn
sequence.
Figure 1.7
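A Python sketch of the construction behind Theorem 1.3 and Corollary 1.1 (hypothetical names): trace a directed Euler circuit of G_{σ,n} and read off the first letter of each edge word.

```python
from itertools import product

def de_bruijn(sigma, n):
    """Generate a de Bruijn sequence over {0, ..., sigma-1} by tracing a
    directed Euler circuit of G_{sigma,n}; vertices are (n-1)-letter words."""
    succ = {}                              # vertex -> unused outgoing letters
    for w in product(range(sigma), repeat=n - 1):
        succ[w] = list(range(sigma))
    start = (0,) * (n - 1)
    stack, circuit = [start], []
    while stack:
        v = stack[-1]
        if succ[v]:
            c = succ[v].pop()
            stack.append(v[1:] + (c,))     # edge b1..b_{n-1} -> b2..b_{n-1}c
        else:
            circuit.append(stack.pop())
    circuit.reverse()
    # first letter of each edge word = first letter of its start vertex
    return [v[0] for v in circuit[:-1]]

seq = de_bruijn(2, 3)   # a binary de Bruijn sequence of length 2^3 = 8
```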
1.5 SHORTEST-PATH ALGORITHMS

Clearly, this section will deal with only a few of all the possible problems. An attempt is made to describe the most important techniques.
First let us consider the case of a finite graph G in which two vertices s and
t are specified. Our task is to find a path from s to t, if one exists, which
uses the least number of edges. Clearly, this is the case of the finite, undirected graph, with the length of every edge equal to 1, and where all we
want is one path from a given vertex to another. In fact, the digraph case is
just as easy and can be solved similarly.
The algorithm to be used here was suggested by Moore [6] and by now is
widely used. It is well known as the Breadth-First Search (BFS) technique.
At first no vertices of the graph are considered labeled. The algorithm is as follows:

(1) λ(s) ← 0.
(2) i ← 0.
(3) Scan every vertex labeled i, and assign the label i + 1 to every unlabeled adjacent vertex.
(4) If no vertex was labeled in (3), stop; otherwise, i ← i + 1.
(5) If t is labeled, stop.
(6) Go to (3).
Clearly we can remove step 5 from the algorithm, and the algorithm is still
valid for finite graphs. However, step 5 saves the work which would be
wasted after t is labeled, and it permits the use of the algorithm on infinite
graphs whose vertices are of finite degree and in which there is a (finite) path
between s and t.
Let the distance between u and v be the least number of edges in a path
connecting u and v, if such a path exists, and ∞ if none exists.

Theorem 1.4: The BFS algorithm computes the distance of each vertex from
s, if t is not closer.
Proof: Let us denote the label of a vertex v, assigned by the BFS algorithm,
by X(v).
It is clear that each edge is traced at most twice in this algorithm; once
from each of its endpoints. That is, for each i the vertices labeled i are
scanned for their incident edges in step 3. Thus, if we use the incidence-list
data structure, the algorithm is of time complexity O(|E|).
The directed case is even simpler, because each edge is traced at most once.
A path from s to t can be traced by moving now from t to s, as described in
the proof of Theorem 1.4. If we record for each vertex the name of the edge
used for labeling it, the tracing is even easier.
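A sketch of BFS with labels and back-pointers in Python (hypothetical names; a queue replaces the explicit sweep over the vertices labeled i, which visits the vertices in the same order):

```python
from collections import deque

def bfs_labels(adj, s, t=None):
    """Label every reachable vertex with its distance from s (the labels
    of the text); stop early once t is labeled. adj maps vertex -> neighbors."""
    label = {s: 0}
    parent = {s: None}          # vertex used for labeling, for path tracing
    q = deque([s])
    while q:
        u = q.popleft()
        if u == t:
            break
        for v in adj[u]:
            if v not in label:
                label[v] = label[u] + 1
                parent[v] = u
                q.append(v)
    return label, parent

def trace_path(parent, t):
    """Walk back from t to s along the stored labeling edges."""
    path = []
    while t is not None:
        path.append(t)
        t = parent[t]
    return path[::-1]
```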
Let us now consider the case of a finite digraph whose edges are assigned
non-negative lengths; thus, each edge e is assigned a length l(e) ≥ 0.
Also, there are two vertices s and t and we want to find a shortest directed
path from s to t, where the length of a path is the sum of the lengths of its
edges.
The following algorithm is due to Dijkstra [7]:
1. λ(s) ← 0 and for all v ≠ s, λ(v) ← ∞.
2. T ← V.
3. Let u be a vertex in T for which λ(u) is minimum.
4. If u = t, stop.
5. For every edge u -e→ v, if v ∈ T and λ(v) > λ(u) + l(e), then λ(v) ← λ(u) + l(e).
6. T ← T − {u} and go to step 3.
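A direct transcription of the six steps in Python (hypothetical names; T is scanned linearly for the minimum, as in the text, giving the O(|V|²) behavior discussed later):

```python
INF = float('inf')

def dijkstra(V, edges, s, t=None):
    """Dijkstra's algorithm: V is a list of vertices, edges a list of
    (u, v, l) triples with l >= 0. Returns the label of every vertex."""
    out = {v: [] for v in V}
    for u, v, l in edges:
        out[u].append((v, l))
    lam = {v: INF for v in V}       # the labels lambda(v)
    lam[s] = 0
    T = set(V)
    while T:
        u = min(T, key=lambda v: lam[v])    # step 3: minimum label in T
        if u == t:                          # step 4
            break
        for v, l in out[u]:                 # step 5: relax edges out of u
            if v in T and lam[v] > lam[u] + l:
                lam[v] = lam[u] + l
        T.remove(u)                         # step 6
    return lam
```

With a heap instead of the linear scan the same scheme runs in O(|E| log |V|), but the version above matches the text's analysis.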
Let us denote the distance of vertex v from s by δ(v). We want to show that
upon termination δ(t) = λ(t); that is, if λ(t) is finite then it is equal to δ(t),
and if λ(t) is infinite then there is no path from s to t in the digraph.
Lemma 1.2: In Dijkstra's algorithm, if λ(v) is finite then there is a path from
s to v whose length is λ(v).
Proof: Let u be the vertex which gave v its present label λ(v); namely, λ(u) +
l(e) = λ(v), where u -e→ v. After this assignment took place, u did not change
its label, since in the following step (step 6) u was removed from the set T (of
temporarily assigned vertices) and its label remained fixed from there on.
Next, find the vertex which gave u its final label λ(u); by repeating this
backward search, we trace a path from s to v whose length is exactly λ(v).
The backward search finds each time a vertex which has left T earlier, and
therefore no vertex on this path can be repeated; it can only terminate in s
which has been assigned its label in step 1.
Q.E.D.
Lemma 1.3: In Dijkstra's algorithm, when a vertex u is chosen (in Step 3), its
label λ(u) satisfies λ(u) = δ(u).

Proof: By induction on the order in which vertices leave the set T. The first
one to leave is s, and indeed λ(s) = δ(s) = 0.
Assume now that the statement holds for all vertices which left T before u.
If λ(u) = ∞, let u′ be the first vertex whose label λ(u′) is infinite when it is
chosen. Clearly, for every v in T at this point, λ(v) = ∞, and for all vertices
v′ ∈ V − T, λ(v′) is finite. Therefore, there is no edge with a start-vertex in
V − T and an end-vertex in T. It follows that there is no path from s (which is in
V − T) to u (which is in T).
If λ(u) is finite, then by Lemma 1.2, λ(u) is the length of some path from s
to u. Thus, λ(u) ≥ δ(u). We have to show that λ(u) > δ(u) is impossible. Let
a shortest path from s to u be s = v0 -e1→ v1 -e2→ ... -ek→ vk = u. Thus, for
every i = 0, 1, ..., k,

δ(v_i) = Σ_{j=1}^{i} l(e_j).

Let v_i be the right-most vertex on this path to leave T before u. By the inductive hypothesis, λ(v_i) = δ(v_i).
If v_{i+1} ≠ u, then λ(v_{i+1}) ≤ λ(v_i) + l(e_{i+1}) after v_i has left T. Since labels
can only decrease if they change at all, when u is chosen λ(v_{i+1}) still satisfies
this inequality. We have:

λ(v_{i+1}) ≤ λ(v_i) + l(e_{i+1}) = δ(v_i) + l(e_{i+1}) = δ(v_{i+1}) ≤ δ(u),

and if δ(u) < λ(u), then λ(v_{i+1}) < λ(u) and u should not have been chosen. In case v_{i+1} = u, the
same argument shows directly that λ(u) ≤ δ(u).
Q.E.D.
Dijkstra's algorithm may fail if some of the edge lengths are negative. For this case we use the Ford algorithm:

1. λ(s) ← 0 and for all v ≠ s, λ(v) ← ∞.
2. As long as there is an edge u -e→ v such that λ(v) > λ(u) + l(e), replace λ(v) by λ(u) + l(e).
Lemma 1.4: In the Ford algorithm, if λ(v) is finite then there is a directed
path from s to v whose length is λ(v).
The lemma above is even true if there are negative length directed circuits.
But if there are no such circuits, the path traced in the proof cannot return to
a vertex visited earlier. For if it does, then by going around the directed cir-
cuit, a vertex improved its own label; this implies that the sum of the edge
lengths of the circuit is negative. Therefore we have:
Lemma 1.5: In the Ford algorithm, if the digraph has no directed circuits of
negative length and if λ(v) is finite, then there is a simple directed path from s
to v whose length is λ(v).
Since each value λ(v) corresponds to at least one simple path from s to v,
and since the number of simple directed paths in a finite digraph is finite, the
number of values possible for λ(v) is finite. Thus, the Ford algorithm must
terminate.
Lemma 1.6: For a digraph with no negative directed circuit, upon termination of the Ford algorithm, λ(v) = δ(v) for every vertex v.
Proof: Assume λ(v) > δ(v) for some vertex v, and let a shortest path from s to v be s = v0 -e1→ v1 -e2→ ... -ek→ vk = v; thus, for every i,

δ(v_i) = Σ_{j=1}^{i} l(e_j).

Let v_i be the first vertex on this path for which λ(v_i) > δ(v_i). Since λ(v_{i-1}) =
δ(v_{i-1}), the edge v_{i-1} -e_i→ v_i can be used to lower λ(v_i) to λ(v_{i-1}) + l(e_i)
(which is equal to δ(v_i)). Thus, the algorithm should not have terminated.
Q.E.D.
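A sketch of the Ford relaxation scheme in Python (hypothetical names; one detail is an addition of this sketch, not of the text: a pass counter that aborts when a negative directed circuit keeps improving labels forever):

```python
INF = float('inf')

def ford(V, edges, s):
    """Ford's scheme: repeatedly replace lambda(v) by lambda(u) + l(e)
    while some edge u -> v violates the inequality. edges: (u, v, l) triples."""
    lam = {v: INF for v in V}
    lam[s] = 0
    changed = True
    rounds = 0
    while changed:
        changed = False
        for u, v, l in edges:
            if lam[u] + l < lam[v]:
                lam[v] = lam[u] + l
                changed = True
        rounds += 1
        if rounds > len(V):   # still improving after |V| passes: negative circuit
            raise ValueError("negative directed circuit reachable from s")
    return lam
```

Without negative circuits, by Lemma 1.5 every label corresponds to a simple path, so at most |V| - 1 improving passes occur and the loop terminates.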
Let δ^k(i, j) be the length of a shortest path from i to j among all paths which
may pass through vertices 1, 2, ..., k but do not pass through vertices k + 1,
k + 2, ..., n.
1. k ← 1.
2. For every 1 ≤ i, j ≤ n compute

δ^k(i, j) ← min{δ^{k-1}(i, j), δ^{k-1}(i, k) + δ^{k-1}(k, j)}.

3. If k = n, stop; otherwise, increment k and go to step 2.
The algorithm clearly yields the right answer; namely, δ^n(i, j) is the
distance from i to j. The answer is only meaningful if there are no negative
circuits in G. The existence of negative circuits is easily detected by δ^k(i, i) <
0. Each application of step 2 requires n² operations, and step 2 is repeated n
times. Thus, the algorithm is of complexity O(n³).
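A sketch of the n³ computation in Python (hypothetical names; δ^k is computed in place, and a negative diagonal entry triggers the negative-circuit detection described above):

```python
INF = float('inf')

def floyd(n, edges):
    """All-pairs shortest distances for vertices 0..n-1; edges are (i, j, l)
    triples. Raises if a negative directed circuit exists. O(n^3)."""
    d = [[INF] * n for _ in range(n)]
    for i in range(n):
        d[i][i] = 0
    for i, j, l in edges:
        d[i][j] = min(d[i][j], l)          # keep the shortest parallel edge
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
        if any(d[i][i] < 0 for i in range(n)):
            raise ValueError("negative directed circuit")
    return d
```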
For the case of finite graphs with non-negative edge lengths, both the
repeated Dijkstra algorithm and Floyd's algorithm take O(|V|³). Additional information on the shortest-path problem can be found in Problems 1.9 and 1.10 and
references 11 and 12.
PROBLEMS
1.1 Prove that if a connected (undirected) finite graph has exactly 2k ver-
tices of odd degree then the set of edges can be partitioned into k paths
such that every edge is used exactly once. Is the condition of connec-
tivity necessary or can it be replaced by a weaker condition?
A Hamilton path (circuit) is a simple path (circuit) on which every
vertex of the graph appears exactly once.
1.2 Prove that the following graph has no Hamilton path or circuit.
(b) Prove that if for every two vertices u and v, d(u) + d(v) ≥ n, where
n = |V|, then the algorithm will never fail to produce a Hamilton
circuit.
(c) Deduce Dirac's theorem [13]: If for every vertex v, d(v) ≥ n/2,
then G has a Hamilton circuit.
1.6 Describe an algorithm for finding the number of shortest paths from s
to t after the BFS algorithm has been performed.
1.7 Repeat the above, after the Dijkstra algorithm has been performed.
Assume l(e) > 0 for every edge e. Why is this assumption necessary?
1.8 Prove that a connected undirected graph G is orientable (by giving
each edge some direction) into a strongly connected digraph if and only
if each edge of G is in some simple circuit in G. (A path u -e- v -e- u is not
considered a simple circuit.)
1.9 The transitive closure of a digraph G(V, E) is a digraph G′(V, E′) such
that there is an edge u → v in G′ if and only if there is a (non-empty)
directed path from u to v in G.
Show that Dantzig's algorithm is valid. How are negative circuits de-
tected? What is the time complexity of this algorithm?
REFERENCES
TREES
Proof: We shall prove that (a) ⇒ (b) ⇒ (c) ⇒ (d) ⇒ (a).
disjoint subpaths between the branching off vertex and v form a simple
circuit in G.
(c) ⇒ (d): We assume the existence of a unique simple path between
every pair of vertices of G. This implies that G is connected. Assume now
that we delete an edge e from G. Since G has no self-loops, e is not a self-
loop. Let a and b be e's endpoints. If there is now (after the deletion of e)
a path between a and b, then G has more than one simple path between a
and b, contradicting (c). Thus, the deletion of e disconnects a and b.
(d) ⇒ (a): We assume that G is connected and that no edge can be
deleted without interrupting the connectivity. If G contains a simple cir-
cuit, any edge on this circuit can be deleted without interrupting the con-
nectivity. Thus, G is circuit-free.
Q.E.D.
There are two more common ways to define a finite tree. These are given
in the following theorem.
Theorem 2.2: Let G(V, E) be a finite graph and n = I VI. The following
three conditions are equivalent:
(a) G is a tree.
(b) G is circuit-free and has n -1 edges.
(c) G is connected and has n -1 edges.
circuit is closed. Thus, our path remains simple. Since the graph is finite,
this extension must terminate on both sides of e, yielding two vertices of
degree 1.
Now, the proof that G is connected proceeds by induction on the number
of vertices, n. The statement is obviously true for n = 2. Assume that it is
true for n = m - 1, and let G be a circuit-free graph with m vertices and
m - 1 edges. Eliminate from G a vertex v, of degree 1, and its incident
edge. The resulting graph is still circuit-free and has m - 1 vertices and
m − 2 edges; thus, by the inductive hypothesis, it is connected. Therefore, G is connected too.
(c) ⇒ (a): Assume that G is connected and has n − 1 edges. If G contains circuits, we can eliminate edges (without eliminating vertices) and
maintain the connectivity. When this process terminates, the resulting
graph is a tree, and, by (a) ⇒ (b), has n − 1 edges. Thus, no edge can be
eliminated and G is circuit-free.
Q.E.D.
Corollary 2.1: A finite tree, with more than one vertex, has at least two
leaves.
There are many known algorithms for the minimum spanning tree
problem, but they all hinge on the following theorem:
First observe that each time we reach Step (5), T is the edge set of a
spanning tree of the subgraph induced by U. This is easily proved by induction on the number of times we reach Step (5). We start with U = {1} and
T = ∅, which is clearly a spanning tree of the subgraph induced by {1}.
After the first application of Steps (2), (3) and (4), we have two vertices in
U and an edge in T which connects them. Each time we apply Steps (2),
(3) and (4) we add an edge from a vertex of the previous U to a new vertex.
Thus the new T is connected too. Also, the number of edges is one less
than the number of vertices. Thus, by Theorem 2.2 (part (c)), T is a spanning tree.
Now, let us proceed by induction to prove that if the old T is a sub-
graph of some minimum spanning tree of G then so is the new one. The
proof is similar to that of Theorem 2.3. Let T0 be a minimum spanning
tree of G which contains T as a subgraph, and assume e is the next edge
chosen in Step (2) to connect a vertex of U with a vertex of V − U. If e is not
in T0, add it to T0 to form T0 + e. It contains a circuit in which there is
one more edge, e′, connecting a vertex of U with a vertex of V − U. By
Step (2), l(e) ≤ l(e′), and if we delete e′ from T0 + e, we get a minimum
spanning tree which contains both T, as a subgraph, and e, proving that
the new T is a subgraph of some minimum spanning tree. Thus, in the end,
T is a minimum spanning tree of G.
The complexity of the algorithm is O(|V|²); Step (2) requires at most
|V| − 1 comparisons and is repeated |V| − 1 times, yielding O(|V|²).
Step (6) requires one comparison for each edge; thus, the total time spent
on it is O(|E|).
It is possible to improve the algorithm and the interested reader is ad-
vised to read the Cheriton and Tarjan paper [2]. We do not pursue this
here because an understanding of advanced data structures is necessary.
The faster algorithms do not use any graph theory beyond the level of this
section.
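The O(|V|²) scheme analyzed above (grow U from a single vertex, each time adding a shortest edge between U and V − U, while Step (6) maintains the cheapest link to every outside vertex) can be sketched as follows. The 0-based numbering, the matrix-of-lengths input and the function name are assumptions of this sketch.

```python
INF = float('inf')

def prim_mst(n, length):
    """Minimum spanning tree of a connected graph on vertices 0..n-1;
    length[u][v] is l(e) for the edge u-v, or INF if there is none.
    Returns the n-1 tree edges; O(|V|^2)."""
    in_U = [False] * n
    in_U[0] = True
    best = [length[0][v] for v in range(n)]   # cheapest link from U to v
    link = [0] * n                            # the U-endpoint of that link
    tree = []
    for _ in range(n - 1):
        # Step (2): shortest edge between U and V - U (linear scan)
        v = min((w for w in range(n) if not in_U[w]), key=lambda w: best[w])
        tree.append((link[v], v))             # Steps (3), (4): add edge, grow U
        in_U[v] = True
        for w in range(n):                    # Step (6): update cheapest links
            if not in_U[w] and length[v][w] < best[w]:
                best[w] = length[v][w]
                link[w] = v
    return tree
```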
The analogous problem for digraphs, namely, that of finding a subset
of the edges E' whose total length is minimum among those for which
(V, E') is a strongly connected subgraph, is much harder. In fact, even the
case where 1(e) = 1 for all edges is hard. This will be discussed in Chap-
ter 10.
interesting problem, of the number of trees one can define on a given set of
vertices, V = {1, 2, . . ., n}.
For n = 3, there are 3 possible trees, as shown in Figure 2.1. Clearly,
for n = 2 there is only one tree. The reader can verify, by exhausting all
the cases, that for n = 4 the number of trees is 16. The following theorem
is due to Cayley [3]:
Theorem 2.4: The number of spanning trees for n distinct vertices is n^{n-2}.
(1) i - 1.
(2) Among all leaves of the current tree let j be the least one (i.e., its name
is the least integer). Eliminate j and its incident edge e from the tree.
The ith letter of the word is the other endpoint of e.
(3) If i = n - 2, stop.
(4) Increment i and go to step 2.
For example, assume that n = 6 and the tree is as shown in Figure 2.2.
On the first turn of Step (2), j = 2 and the other endpoint of its incident
edge is 4. Thus, 4 is the first letter of the word. The new tree is as shown in
Figure 2.3. On the second turn, j = 3 and the second letter is 1. On the
third, j = 1 and the third letter is 6. On the fourth, j = 5 and the fourth
letter is 4. Now i = 4 and the algorithm halts. The resulting word is 4164
(and the current tree consists of one edge connecting 4 and 6).
Figure 2.1
Figure 2.2
Figure 2.3
By Corollary 2.1, Step (2) can always be performed, and therefore for
every tree a word of length n - 2 is produced. It remains to be shown that
no word is produced by two different trees and that every word is generated
from some tree. We shall achieve both ends by showing that the mapping
has an inverse; i.e., for every word there is a unique tree which produces it.
Let w = a1 a2 ... a_(n-2) be a word over V. If T is a tree for which the
algorithm produces w, then the degree of vertex k, d(k), in T is equal to
the number of times k appears in w, plus 1. This follows from the observation
that when each, but the last, of the edges incident to k is deleted, k is
written as a letter of w; the last edge may never be deleted, if k is one of
the two vertices remaining in the tree, or, if it is deleted, k is now the
removed leaf, and the adjacent vertex, not k, is the written letter. Thus, if
w is produced by the algorithm for some tree, then the degrees of the
vertices in the tree must be as stated.
For example, if w = 4164 then d(1) = 2, d(2) = 1, d(3) = 1, d(4) = 3,
d(5) = 1 and d(6) = 2 in a tree which produced w.
Given this data, apply the following algorithm:
(1) i ← 1.
(2) Let j be the least vertex for which d(j) = 1. Construct an edge j - a_i,
and let d(j) ← 0 and d(a_i) ← d(a_i) − 1.
(3) If i = n − 2, construct an edge between the two vertices whose degree
is 1 and stop.
(4) Increment i and go to Step (2).
It is easy to see that this algorithm picks the same vertex j as the original
algorithm, and constructs a tree (the proof is by induction). Also, each step
is forced, and therefore the tree which produces w is unique.
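In the same spirit, here is a Python sketch of the inverse algorithm; the function and variable names are illustrative only.

```python
def word_to_tree(word, n):
    """Reconstruct the unique tree on vertices 1..n producing `word`
    (of length n-2), using the degree bookkeeping described above."""
    d = {v: 1 for v in range(1, n + 1)}
    for a in word:
        d[a] += 1                           # degree = occurrences + 1
    edges = []
    for a in word:
        j = min(v for v in d if d[v] == 1)  # least vertex of degree 1
        edges.append((j, a))                # the edge j - a_i of Step (2)
        del d[j]                            # d(j) <- 0
        d[a] -= 1
    u, v = [x for x in d if d[x] == 1]      # Step (3): the final edge
    edges.append((u, v))
    return edges

print(sorted(tuple(sorted(e)) for e in word_to_tree([4, 1, 6, 4], 6)))
# [(1, 3), (1, 6), (2, 4), (4, 5), (4, 6)]
```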
A similar problem, stated and solved by Lempel and Welch [6], is that
of finding the number of ways m labeled (distinct) edges can be joined by
unlabeled endpoints to form a tree. Their proof is along the lines of Prüfer's
proof of Cayley's theorem and is therefore constructive, in the sense that
one can use the inverse transformation to generate all the trees after the
words are generated. However, a much simpler proof was pointed out to
me by A. Pnueli and is the subject of Problem 2.5.
Figure 2.4
Figure 2.5
Figure 2.6
Figure 2.7
Proof: We prove that (a) ⇒ (b) ⇒ (c) ⇒ (d) ⇒ (e) ⇒ (a).
(a) ⇒ (b): We assume that G has a root, say r, and that its underlying
undirected graph G' is a tree. Thus, by Theorem 2.1, part (c), there is a
unique simple path from r to every vertex in G'; also, G' is circuit-free.
Thus, a directed path from r to a vertex v in G must be simple and
unique.
(b) ⇒ (c): Here we assume that G has a root, say r, and a unique
directed path from it to every vertex v. First, let us show that d_in(r) = 0.
Assume there is an edge u -e→ r. There is a directed path from r to u, and it
can be continued, via e, back to r. Thus, in addition to the empty path
from r to itself (containing no edges), there is one more, in contradiction
to the assumption of path uniqueness. Now, we have to show that if
v ≠ r then d_in(v) = 1. Clearly, d_in(v) > 0, for v must be reachable from r.
If d_in(v) > 1, then there are at least two edges, say v1 -e1→ v and v2 -e2→ v.
Since there is a directed path P1 from r to v1, and a directed path P2 from
r to v2, by adding e1 to P1 and e2 to P2 we get two different paths from
r to v. (This proof is valid even if v1 = v2.)
(c) ⇒ (d): This proof is trivial, for the deletion of any edge u -e→ v will
make v unreachable from r.
(d) ⇒ (e): We assume that G has a root, say r, and that the deletion of any
edge destroys this condition. First, d_in(r) = 0, for any edge entering r
could be deleted without destroying the condition that r is a root. For
every other vertex v, d_in(v) > 0, for it is reachable from r. If d_in(v) > 1, let
v1 -e1→ v and v2 -e2→ v be two edges entering v. Let P be a simple directed path
from r to v. It cannot use both e1 and e2. The one which is not used in P
can be deleted without destroying the fact that r is a root. Thus, d_in(v) = 1.
(e) ⇒ (a): We assume that the underlying undirected graph of G, G', is
connected, d_in(r) = 0 and for v ≠ r, d_in(v) = 1. First let us prove that r is
a root. Let P' be a simple path connecting r and v in G'. This must correspond
to a directed path P from r to v in G, for if any of the edges pointed
in the wrong direction it would either imply that d_in(r) > 0 or that for
some u, d_in(u) > 1. Finally, G' must be circuit-free, for a simple circuit
in G' must correspond to a simple directed circuit in G (again using
d_in(r) = 0 and d_in(v) = 1 for v ≠ r), and at least one of its vertices, u,
must have d_in(u) > 1, since the vertices of the circuit are reachable from r.
Q.E.D.
In case of finite digraphs one more useful definition of a directed tree is
possible:
Theorem 2.6: A finite digraph G is a directed tree if and only if its underlying
undirected graph, G', is circuit-free, one of its vertices, r, satisfies
d_in(r) = 0, and for all other vertices v, d_in(v) = 1.
Proof: The "only if" part follows directly from the definition of a directed
tree and Theorem 2.5, part (c).
To prove the "if" part we first observe that the number of edges is
n − 1. Thus, by Theorem 2.2, (b) ⇒ (c), G' is connected. Thus, by
Theorem 2.5, (e) ⇒ (a), G is a directed tree.
Q.E.D.
Let us say that a digraph is arbitrated (Berge [7] calls it quasi strongly
connected) if for every two vertices v1 and v2 there is a vertex v, called an
arbiter of v1 and v2, such that there are directed paths from v to v1 and
from v to v2. There are infinite digraphs which are arbitrated but do not
have a root. For example, see the digraph of Figure 2.8. However, for
finite digraphs the following theorem holds:
Figure 2.8
Before we present the proof let us point out the necessity of the finiteness
of the out-degrees of the vertices. For if we allow a single vertex to be of
infinite out-degree, the conclusion does not follow. Consider the digraph of
Figure 2.9. The root is connected to vertices v11, v12, v13, ..., where v1k is
the second vertex on a directed path of length k. It is clear that the tree is
infinite, and yet it has no infinite path. Furthermore, the replacement of
the condition of finite degrees by the condition that for every k the tree has
a path of length k does not work either, as the same example shows.
Figure 2.9
Figure 2.10
Figure 2.11
of families for which the upper-right quadrant is tileable, while the whole
plane is not.
Consider the following directed tree T: The root r is connected to vertices,
each representing one of the tile families, i.e., a square 1 × 1 tiled with
the tile of that family. For every k, each one of the legitimate ways of tiling
a (2k + 1) × (2k + 1) square is represented by a vertex in T; its father is
the vertex which represents the tiling of a (2k − 1) × (2k − 1) square,
identical to the center part of the square represented by the son.
Now, if the upper-right quadrant is tileable, then T has infinitely many
vertices. Since the number of families is finite, the out-degree of each vertex
is finite (although it may not be bounded). By Theorem 2.8, there is an
infinite directed path in T. Such a path describes a way to tile the whole
plane.
             d_in(i)   if i = j,
D(i, j) =
             −k        if i ≠ j, where k is the number of edges in G from i to j.
Lemma 2.1: A finite digraph G(V, E), with no self-loops is a directed tree
with root r if and only if its in-degree matrix D has the following two
properties:
(1) D(i, i) = 0 if i = r, and
    D(i, i) = 1 if i ≠ r.
(2) The minor, resulting from erasing the rth row and column from D and
computing the determinant, is 1.
D'(1, 1) = 0,
D'(i, i) = 1  for i = 2, 3, ..., n,
D'(i, j) = 0  if i > j.
Thus, the minor, resulting from the erasure of the first row and the first
column from D' and computing the determinant, is 1.
Now assume that D satisfies properties (1) and (2). By property (1) and
Theorem 2.6, if G is not a directed tree then its underlying undirected
graph contains a simple circuit. The vertex r cannot be one of the vertices
of the circuit, for this would imply that either d_in(r) > 0 or that for some other
vertex v, d_in(v) > 1, contrary to property (1). The circuit must be of the
form

    i1 - i2 - ... - il - i1,

where l is the length of the circuit, and no vertex appears on it twice. Also,
there may be other edges out of i1, i2, ..., il, but none can enter them. Thus,
each of the columns of D corresponding to one of these vertices has exactly
one +1 (on the main diagonal of D) and one −1, and all the other entries
are 0. Also, each of the rows of this submatrix is either all zeros, or there
is one +1 and one −1. The sum of these columns is therefore a zero
column, and thus the minor is 0. This contradicts property (2).
Q.E.D.
As a side result of our proof, we have the additional property that the
minor of a graph whose in-degree matrix satisfies property (1) is 0 if the
graph is not a directed tree with root r.
Theorem 2.9: The number of directed spanning trees with root r of a di-
graph with no self-loops is given by the minor of its in-degree matrix which
results from the erasure of the rth row and column.
The proof of this theorem follows immediately from Lemma 2.1, the
comment following it, and the linearity of the determinant function with
respect to its columns. Let us demonstrate this by the following example.
Consider the graph shown in Figure 2.12. Its in-degree matrix D is as fol-
lows:
     [  2  -1  -1 ]
D =  [ -1   1  -2 ]
     [ -1   0   3 ]
Figure 2.12
    |  2  -1 |
    | -1   3 |  =  5.

    |  2   0  -1 |
    | -1   1  -2 |
    | -1   0   3 |
We have retained the second row of D except its second entry, which
must be made equal to 1 (in this case its value did not change). All other
entries in the second column are changed into zero. Next, we decompose
each column, except the second, into columns which consist of a single +1
and a single −1, as follows:
    |  2  0  -1 |     |  1  0  -1 |     |  1  0  -1 |
    | -1  1  -2 |  =  | -1  1  -2 |  +  |  0  1  -2 |
    | -1  0   3 |     |  0  0   3 |     | -1  0   3 |
       |  1  0  -1 |     |  1  0   0 |     |  1  0   0 |
    =  | -1  1   0 |  +  | -1  1  -1 |  +  | -1  1  -1 |
       |  0  0   1 |     |  0  0   1 |     |  0  0   1 |

       |  1  0  -1 |     |  1  0   0 |     |  1  0   0 |
    +  |  0  1   0 |  +  |  0  1  -1 |  +  |  0  1  -1 |
       | -1  0   1 |     | -1  0   1 |     | -1  0   1 |
These six determinants correspond to the following selections of sets of
edges, respectively: {e3, e2}, {e3, e4}, {e3, e5}, {e6, e2}, {e6, e4}, {e6, e5}.
After erasing the second row and column, this corresponds to

    |  2  -1 |     | 1  -1 |     | 1  0 |     | 1  0 |     |  1  -1 |     |  1  0 |     |  1  0 |
    | -1   3 |  =  | 0   1 |  +  | 0  1 |  +  | 0  1 |  +  | -1   1 |  +  | -1  1 |  +  | -1  1 |

              =  1 + 1 + 1 + 0 + 1 + 1  =  5.

The fourth determinant is 0; indeed, {e6, e2} contains a directed circuit
and is therefore not a directed spanning tree.
Q.E.D.
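The determinant-versus-edge-selection correspondence in this example can be checked mechanically. In the sketch below the edge list is read off the matrix D (which arc carries which label e1, ..., e6 is an assumption), and a brute-force count of the directed spanning trees with root 2 is compared with the erased-row-and-column minor.

```python
from itertools import product

def det(m):
    """Integer determinant by cofactor expansion (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))

# A multigraph consistent with D above: arcs 2->1, 3->1, 1->2, 1->3 and
# the arc 2->3 taken twice (parallel edges).
edges = [(2, 1), (3, 1), (1, 2), (1, 3), (2, 3), (2, 3)]
n, root = 3, 2

D = [[0] * n for _ in range(n)]          # in-degree matrix
for u, v in edges:
    D[v - 1][v - 1] += 1
    D[u - 1][v - 1] -= 1

minor = [[D[i][j] for j in range(n) if j != root - 1]
         for i in range(n) if i != root - 1]

def reachable(chosen):
    """True iff every vertex is reachable from the root via `chosen`."""
    seen, stack = {root}, [root]
    while stack:
        w = stack.pop()
        for a, b in chosen:
            if a == w and b not in seen:
                seen.add(b)
                stack.append(b)
    return len(seen) == n

# One entering arc per non-root vertex; keep the circuit-free selections.
entering = [[e for e in edges if e[1] == v] for v in range(1, n + 1) if v != root]
count = sum(reachable(c) for c in product(*entering))
print(det(minor), count)  # 5 5
```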
D(i, j) = d(i) if i = j, and D(i, j) = −k if i ≠ j, where k is the number of
edges between i and j in G. This matrix is called the degree matrix of G.
Hence, we have the following theorem:
    [ n-1   -1  ...   -1 ]
    [  -1  n-1  ...   -1 ]
    [          ...       ]
    [  -1   -1  ...  n-1 ]
After erasing one row and the corresponding column, the matrix looks
the same, except that it is now (n - 1) X (n - 1). We can now add to any
column (or row) a linear combination of the others, without changing its
determinant. First subtract the first column from every other. We get:
    [ n-1   -n   -n  ...   -n ]
    [  -1    n    0  ...    0 ]
    [  -1    0    n  ...    0 ]
    [            ...          ]
    [  -1    0    0  ...    n ]

Now add all the other rows to the first row:

    [  1    0    0  ...   0 ]
    [ -1    n    0  ...   0 ]
    [ -1    0    n  ...   0 ]
    [           ...         ]
    [ -1    0    0  ...   n ]

Expanding along the first row, the determinant of this (n − 1) × (n − 1)
matrix is n^(n-2), in agreement with Cayley's theorem.
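The row-and-column manipulation above can be verified numerically. The following sketch evaluates the erased minor of the degree matrix of the complete graph exactly, over rationals, and compares it with n^(n-2).

```python
from fractions import Fraction

def det(m):
    """Determinant by Gaussian elimination over exact rationals."""
    m = [[Fraction(x) for x in row] for row in m]
    size, sign, pivots = len(m), 1, Fraction(1)
    for c in range(size):
        p = next((r for r in range(c, size) if m[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            m[c], m[p] = m[p], m[c]
            sign = -sign
        pivots *= m[c][c]
        for r in range(c + 1, size):
            f = m[r][c] / m[c][c]
            for col in range(c, size):
                m[r][col] -= f * m[c][col]
    return sign * pivots

for n in range(2, 8):
    # Degree matrix of K_n with one row and column erased:
    # n-1 on the diagonal, -1 elsewhere, size (n-1) x (n-1).
    minor = [[(n - 1) if i == j else -1 for j in range(n - 1)]
             for i in range(n - 1)]
    assert det(minor) == n ** (n - 2)
print("verified for n = 2, ..., 7")
```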
Let H be a set of critical edges, where for each vertex one entering critical
edge is chosen, if any exist. The graph (V, H) is called critical.
is the circuit C, where the p_i's are directed paths common to C and B.
Since e1 is not eligible, by Lemma 2.4 there is a directed path in B, from
v1 to u1. This path leaves p1 somewhere and enters v_k, and continues via
p_k to u1; it cannot enter p_k after v_k, or B would have two edges entering
the same vertex. Similarly, e_j is not eligible, and therefore there is a
directed path in B, from v_j to u_j, which leaves p_j somewhere and enters
p_(j-1) at v_(j-1). We now have a directed circuit in B: It goes from v1, via part
of p1, to a path leading to v_k, via part of p_k, to a path leading to v_(k-1),
etc., until it returns to v1. The situation is described pictorially in Figure
2.13. Since B is circuit-free, this is a contradiction.
Q.E.D.
Proof: Let B(V, E') be a maximum branching which, among all maxi-
mum branchings, contains a maximum number of edges of (V, H). Con-
sider e ∈ H − E'. If u -e→ v is eligible then
Figure 2.13
         c(e)                       if u -e→ v in G and v ∈ V̄,
c̄(e) =
         c(e) − c(ẽ) + c(e_i0)      if u -e→ v in G and v is on C_i,

where ẽ denotes the edge of C_i which enters v. The reason for this definition
of c̄(e), in case v is on C_i, is that the least cost we must drop is c(e_i0). If we
do pick e as part of the branching, we gain c(e), but we lose c(ẽ) − c(e_i0),
since now ẽ must be dropped instead of e_i0.
Let ℬ be the set of all branchings B(V, E') of G for which |C_i − E'|
= 1 for every circuit C_i of the critical graph (V, H), and if there is no
edge u → v in B with u ∉ V_i and v ∈ V_i then C_i − E' = {e_i0}. By
Theorem 2.11, ℬ contains some maximum branchings. Also, let ℬ̄ be the
set of all branchings of Ḡ.
where c(B) and c̄(B̄) are the total costs of B and B̄, respectively, and c(C_i)
is the total cost of the edges in C_i.
Proof: First assume that B(V, E') ∈ ℬ, and let us show that B̄(V̄, Ē')
defined by Ē' = E' ∩ Ē is a branching of Ḡ. In B there can be at most
one edge entering V_i from outside; this follows from the fact that all
edges of C_i, but one, are in B and therefore only one vertex can have one
edge enter it from V − V_i. Thus, in B̄ there can be at most one edge
entering a_i. If v ∈ V̄ ∩ V, then in B, and therefore in B̄, there can be only one
edge entering v. It remains to be shown that B̄ is circuit-free. A directed
circuit in B̄ would indicate a directed circuit in B; whenever the circuit
in B̄ goes through a vertex a_i, in B we go around C_i and exit from the
from the definition of a critical graph and remove all edges entering
r in G.
is one such circuit. Thus, if i ≠ j then e_i ≠ e_j, and m = |E|; but vertices
may repeat. Consider now a subgraph H(V, E') defined in the
following way: Let e_j1 be any one of the edges entering v1. Let |V| = n.
For every p = 2, 3, ..., n let e_jp be the first edge on C to enter v_p after the
appearance of e_j1. Now E' = {e_j2, e_j3, ..., e_jn}.
For example, consider the graph of Figure 2.14.
The sequence of edges e1, e2, ..., e6 designates an Euler circuit C. If
we choose e_j1 = e6 (the only choice in this case), then e_j2 = e1, e_j3 = e2
and e_j4 = e4. The resulting subgraph H(V, E') is shown in Figure 2.15,
and is easily observed to be a directed spanning tree of G. The following
lemma states this fact in general.
Figure 2.14
Figure 2.15
Proof: The definition of H implies that d_in(v1) = 0 while d_in(v_p) = 1 for
p = 2, 3, ..., n. By Theorem 2.6 it remains to be shown that the underlying
undirected graph of H is circuit-free. It suffices to show that H has
no directed circuits; but an edge u -e→ v in H means that u is discovered
before v, if we go along C starting from the reference edge e_j1. Thus, no
directed circuits exist in H.
Q.E.D.
In the construction of an Euler circuit from H and e_j1, there are places
of choice. If d_in(v_p) > 1, there are (d_in(v_p) − 1)! different orders for
picking the incoming edges (with e_jp picked last). Also, it is clear that
different orders will yield different Euler circuits. Thus, the number of distinct
Euler circuits to be constructed from a given H and e_j1 is

     n
     ∏ (d_in(v_p) − 1)!
    p=1

Furthermore, different choices of H (but with the same root v1 and the
same e_j1) yield different Euler circuits, because a different e_jp, for some
2 ≤ p ≤ n, will yield a different first entry to v_p after e_j1 in the resulting
Euler circuit.
Finally, Lemma 2.6 guarantees that every Euler circuit will be generated
for some H and some choice of ordering the backtracking edges, because
the construction of Euler circuits from a directed tree is the reversal of
the procedure of deriving a directed tree from a circuit.
We have thus proved the following theorem:
         n
    Δ · ∏ (d_in(v_p) − 1)!
        p=1

where Δ denotes the number of directed spanning trees of G with root v1.
Clearly the result cannot depend on the choice of the root. This proves
that if d_out(v) = d_in(v) for every v ∈ V then the number of directed spanning
trees is the same for every choice of root.
For example, the in-degree matrix of the digraph of Figure 2.14 is
     [  1  -1   0   0 ]
D =  [  0   2  -1  -1 ]
     [ -1  -1   2   0 ]
     [  0   0  -1   1 ]
By Theorem 2.9,
    |  2  -1  -1 |
    | -1   2   0 |  =  2.
    |  0  -1   1 |
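The counts in this example can be confirmed by brute force. In the sketch below the arcs of the digraph of Figure 2.14 are read off its in-degree matrix D (an arc i → j wherever D(i, j) = −1; this reading is an assumption), and the number of Euler circuits is compared with the number of directed spanning trees with root v1 times the product of the (d_in(v) − 1)! terms.

```python
from math import factorial
from itertools import product

edges = [(1, 2), (2, 3), (2, 4), (3, 1), (3, 2), (4, 3)]  # arcs read from D
n = 4

def euler_count(at, remaining):
    """Ways to complete an Euler circuit from vertex `at`."""
    if not remaining:
        return 1
    return sum(euler_count(v, remaining[:i] + remaining[i + 1:])
               for i, (u, v) in enumerate(remaining) if u == at)

# Every Euler circuit passes the unique arc out of vertex 1 exactly once,
# so fixing (1, 2) as the first arc counts distinct circuits.
circuits = euler_count(2, [e for e in edges if e != (1, 2)])

def reachable(chosen):
    seen, stack = {1}, [1]
    while stack:
        w = stack.pop()
        for a, b in chosen:
            if a == w and b not in seen:
                seen.add(b)
                stack.append(b)
    return len(seen) == n

# Directed spanning trees with root 1: one entering arc per vertex != 1.
entering = [[e for e in edges if e[1] == v] for v in range(2, n + 1)]
trees = sum(reachable(c) for c in product(*entering))

orders = 1
for v in range(1, n + 1):
    orders *= factorial(sum(1 for e in edges if e[1] == v) - 1)
print(circuits, trees * orders)  # 2 2
```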
PROBLEMS
2.1 Let T1(V, E1) and T2(V, E2) be two spanning trees of G(V, E). Prove
that for every α ∈ E1 − E2 there exists a β ∈ E2 − E1 such that each
of the sets
the condition

     n
     Σ d(v_i) = 2n − 2
    i=1

then there exists a tree with v1, v2, ..., vn as vertices such that the d's
specify the vertices' degrees in the tree. How many different trees are
there (if edges are unlabeled)?
2.4 Compute the number of trees that can be built on n given labeled
vertices, with unlabeled edges, in such a way that one specified vertex
is of degree k.
2.5 What is the number of trees that one can build with n labeled vertices
and m = n − 1 labeled edges? Prove that the number of trees
that can be built with m labeled edges (and no labels on the vertices)
is (m + 1)^(m−2).
2.6 Prove the following: If G is an infinite undirected connected graph
whose vertices are of finite degrees, then every vertex of G is the start
vertex of some simple infinite path.
2.7 Show that if rotation or flipping of tiles is allowed, the question of
tiling the plane becomes trivial.
2.8 Describe a method for computing the number of in-going directed
trees of a given digraph with a designated root. (Here a root is a
vertex r such that from every vertex there is a directed path to r.)
Explain why the method is valid. (Hint: Define an out-degree matrix.)
2.9 How many directed trees with root 000 are there in G_{2,4}?
2.10 Show that Δ_{σ,n}, the number of directed spanning trees of G_{σ,n}
with a given root, satisfies Δ_{σ,n} = σ^(σ^(n−1) − n).
2.12 Show that the number of de Bruijn sequences for a given σ and n is

    (σ!)^(σ^(n−1)) / σ^n.
is done, for every two vertices of distance l apart in the tree, one can
find the (minimum) path connecting them in time O(l). Describe
briefly the preparatory algorithm, and the algorithm for finding the
path.
2.14 Prove that the complement of a tree (contains the same vertices and
an edge between two vertices iff no such edge exists in the tree) is
either connected, or consists of one isolated vertex while all the others
form a clique (a graph in which every pair of vertices is connected
by an edge).
2.15 Let G(V, E) be an undirected finite graph, where each edge e has a
given length l(e) > 0. Let λ(v) be the length of a shortest path from
s to v.
(a) Explain why each vertex v ≠ s has an incident edge u -e- v such
that λ(v) = λ(u) + l(e).
(b) Show that if such an edge is chosen for each vertex v ≠ s (as in
(a)) then the set of edges forms a spanning tree of G.
(c) Is this spanning tree always of minimum total weight? Justify
your answer.
REFERENCES
9. Wang, H., "Proving Theorems by Pattern Recognition, II," Bell System Tech.
J., Vol. 40, 1961, pp. 1-41.
10. Tutte, W. T., "The Dissection of Equilateral Triangles into Equilateral Tri-
angles," Proc. Cambridge Phil. Soc., Vol. 44, 1948, pp. 463-482.
11. Harary, F., Graph Theory, Addison-Wesley, 1969, Chap. 16.
12. Chu, Y. J., and T. H. Liu, "On the Shortest Arborescence of a Directed
Graph," Sci. Sinica, 14, 1965, pp. 1396-1400.
13. Edmonds, J., "Optimum Branchings," J. of Res. of the Nat. Bureau of Stan-
dards, 71B, 1967, pp. 233-240.
14. Bock, F., "An Algorithm to Construct a Minimum Directed Spanning Tree in
a Directed Network," Developments in Operations Research, Gordon and
Breach, 1971, pp. 29-44.
15. Karp, R. M., "A Simple Derivation of Edmonds' Algorithm for Optimum
Branchings," Networks, 1, 1971, pp. 265-272.
16. Tarjan, R. E., "Finding Optimum Branchings," Networks, 7, 1977, pp. 25-35.
17. Knuth, D. E., "Oriented Subtrees of an Arc Digraph," J. Comb. Th., Vol. 3,
1967, pp. 309-314.
Chapter 3
DEPTH-FIRST SEARCH
3.1 DFS OF UNDIRECTED GRAPHS
Tremaux's Algorithm:
(1) v ← s.
(2) If there are no unmarked passages in v, go to (4).
(3) Choose an unmarked passage, mark it E and traverse the edge to its
other endpoint u. If u has any marked passages (i.e., it is not a new
vertex) mark the passage, through which u has just been entered, by E,
traverse the edge back to v, and go to Step (2). If u has no marked
passages (i.e., it is a new vertex), mark the passage through which u has
been entered by F, let v ← u and go to Step (2).
(4) If there is no passage marked F, halt. (We are back in s and the scanning
of the graph is complete.)
(5) Use the passage marked F, traverse the edge to its other endpoint u, let
v ← u and go to Step (2).
Let us demonstrate the algorithm on the graph shown in Figure 3.1. The
initial value of v, the place "where we are" or the center of activity, is s. All
passages are unlabelled. We choose one, mark it E and traverse the edge. Its
other endpoint is a (u = a). None of its passages are marked, therefore we
mark the passage through which a has been entered by F, the new center of
activity is a (v = a), and we return to Step (2). Since a has two unmarked
passages, assume we choose the one leading to b. The passage is marked E
and the one at b is marked F since b is new, etc. The complete excursion is
shown in Figure 3.1 by the dashed line.
Figure 3.1
Proof: Let us state the proposition differently: For every vertex all its inci-
dent edges have been traversed in both directions.
First, consider the start vertex s. Since the algorithm has terminated, all its
incident edges have been traversed from s outward. Thus, s has been left d(s)
times, and since we end up in s, it has also been entered d(s) times. However,
by Lemma 3.1 no edge is traversed more than once in the same direction.
Therefore, all the edges incident to s have been traversed once in each direc-
tion.
Assume now that S is the set of vertices for which the statement, that each
of their incident edges has been traversed once in each direction, holds.
Assume V ≠ S. By the connectivity of the graph there must be edges connecting
vertices of S with V − S. All these edges have been traversed once in
each direction. Let v -e→ u be the first edge to be traversed from v ∈ S to u ∈ V
− S. Clearly, u's passage, corresponding to e, is marked F. Since this
passage has been entered, all other passages must have been marked E.
Thus, each of u's incident edges has been traversed outward. The search has
not started in u and has not ended in u. Therefore, u has been entered d(u)
times, and each of its incident edges has been traversed inward. A contradic-
tion, since u belongs in S.
Q.E.D.
(1) Mark all the edges "unused". For every v ∈ V, k(v) ← 0. Also, let i ← 0
and v ← s.
(2) i ← i + 1, k(v) ← i.
(3) If v has no unused incident edges, go to Step (5).
(4) Choose an unused incident edge v -e- u. Mark e "used". If k(u) ≠ 0,
go to Step (3). Otherwise (k(u) = 0), f(u) ← v, v ← u and go to Step (2).
(5) If k(v) = 1, halt.
(6) v ← f(v) and go to Step (3).
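A recursive Python sketch of this numbering; the recursion stack plays the role of f(v) and Step (6). The adjacency lists are an assumed graph, chosen so that a search from c discovers d, e, f, g, b, a in the order used in the example that follows; they are not taken from Figure 3.2 itself.

```python
def dfs_number(graph, s):
    """DFS from s: assign each vertex v its number k(v) and its father
    f(v) in the DFS tree, as in Steps (1)-(6) above."""
    k, f, counter = {}, {}, [0]

    def visit(v):
        counter[0] += 1
        k[v] = counter[0]
        for u in graph[v]:      # scan the incident edges of v
            if u not in k:      # k(u) = 0: a new vertex
                f[u] = v
                visit(u)

    visit(s)
    return k, f

# Hypothetical adjacency lists of a small connected graph:
g = {'c': ['d', 'b'], 'd': ['c', 'e', 'b'], 'e': ['d', 'f'],
     'f': ['e', 'g'], 'g': ['f', 'd'], 'b': ['d', 'c', 'a'], 'a': ['b']}
k, f = dfs_number(g, 'c')
print(k)  # {'c': 1, 'd': 2, 'e': 3, 'f': 4, 'g': 5, 'b': 6, 'a': 7}
```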
Since this algorithm is just a simple variation of the previous one, our
proof that the whole (connected) graph will be scanned, each edge once in
each direction, still applies. Here, in Step (4), if k(u) ≠ 0 then u is not a new
vertex and we "return" to v and continue from there. Also, moving our
center of activity from v to f(v) (Step (6)) corresponds to traversing the edge
v - f(v), in this direction. Thus, the whole algorithm is of time complexity
O(|E|), namely, linear in the size of the graph.
After applying the DFS to a finite and connected G(V, E), let us consider
the set of edges E' consisting of all the edges f(v) - v through which new
vertices have been discovered. Also, direct each such edge from f(v) to v.
Lemma 3.3: The digraph (V, E') defined above is a directed tree with root s.
Proof: Without loss of generality, assume that k(a) < k(b). In the DFS
algorithm, the center of activity (v in the algorithm) moves only along the
edges of the tree (V, E'). If b is not a descendant of a, then since a is
discovered before b, the center of activity must first move from a to some
ancestor of a before it moves to b. However, we backtrack from a (v ←
f(a)) only when all a's incident edges are used, which means that e is used
and therefore b is already discovered, a contradiction.
Q.E.D.
Let us call all the edges of (V, E') tree edges and all the other edges back
edges. The justification for this name is in Lemma 3.4: all the non-tree edges
connect a vertex back to one of its ancestors.
Consider, as an example, the graph shown in Figure 3.2. Assume we start
the DFS in c (s = c) and discover d, e, f, g, b, a in this order. The resulting
vertex numbers, tree edges and back edges are shown in Figure 3.3, where
the tree edges are shown by solid lines and are directed from low to high, and
the back edges are shown by dashed lines and are directed from high to low.
In both cases the direction of the edge indicates the direction in which the
edge has been scanned first. For tree edges this is the defined direction, and
for back edges we can prove it as follows: Assume u -e- v is a back edge and
u is an ancestor of v. The edge e could not have been scanned first from u, for
if v had been undiscovered at that time then e would have been a tree edge,
and if v had already been discovered (after u) then the center of activity could
have been in u only if we had backtracked from v, and this means that e had
already been scanned from v.
Figure 3.2
Figure 3.3
Let V' ⊆ V. The induced subgraph G'(V', E') is called a nonseparable
component if G' is nonseparable and if for every larger V'', V' ⊂ V'' ⊆ V,
the induced subgraph G''(V'', E'') is separable. For example, in the graph
shown in Figure 3.2, the subsets {a, b}, {b, c, d} and {d, e, f, g} induce the
nonseparable components of the graph.
If a graph G(V, E) contains no separating vertex then clearly the whole G
is a nonseparable component. However, if v is a separating vertex then V −
{v} can be partitioned into V1, V2, ..., Vk such that V1 ∪ V2 ∪ ... ∪ Vk =
V − {v} and if i ≠ j then Vi ∩ Vj = ∅; two vertices a and b are in the same
Vi if and only if there is a path connecting them which does not include v.
Thus, no nonseparable component can contain vertices from more than one
Vi. We can next consider each of the subgraphs induced by Vi U {v} and
continue to partition it into smaller parts if it is separable. Eventually, we
end up with nonseparable parts. This shows that no two nonseparable com-
ponents can share more than one vertex because each such vertex is a
separating vertex. Also, every simple circuit of length greater than one must
lie entirely in one nonseparable component.
Now, let us discuss how DFS can help to detect separating vertices.
Let the lowpoint of v, L(v), be the least number k(u) of a vertex u which
can be reached from v via a, possibly empty, directed path consisting of tree
edges followed by at most one back edge. Clearly L(v) ≤ k(v), for we can use
the empty path from v to itself. Also, if a non-empty path is used then its last
edge is a back edge, for a directed path of tree edges leads to vertices higher
than v. For example, in the graph of Figure 3.2 with the DFS as shown in
Figure 3.3 the lowpoints are as follows: L(a) = 7, L(b) = L(c) = L(d) = 1
and L(e) = L(f) = L(g) = 2.
Lemma 3.5: Let G be a graph whose vertices have been numbered by DFS.
If u → v is a tree edge, k(u) > 1 and L(v) ≥ k(u), then u is a separating
vertex of G.
Proof: Let S be the set of vertices on the path from the root r (k(r) = 1) to u,
including r but not including u, and let T be the set of vertices on the subtree
rooted at v, including v (that is, all the descendants of v, including v itself).
By Lemma 3.4 there cannot be any edge connecting a vertex of T with any
vertex of V − (S ∪ {u} ∪ T). Also, if there is any edge connecting a vertex t
∈ T with a vertex s ∈ S then the edge t - s is a back edge and clearly k(s) <
k(u). Now, L(v) ≤ k(s), since one can take the tree edges from v to t followed
by t - s. Thus, L(v) < k(u), contradicting the hypothesis. Thus, u is a
separating vertex of G.
Lemma 3.6: Let G(V, E) be a graph whose vertices have been numbered by
DFS. If u is a separating vertex and k(u) > 1 then there exists a tree edge
u → v such that L(v) ≥ k(u).
Lemma 3.7: Let G(V, E) be a graph whose vertices have been numbered by
DFS, starting with r (k(r) = 1). The vertex r is a separating vertex if and only
if there are at least two tree edges out of r.
Proof: Assume that r is a separating vertex. Let V1, V2, ..., Vm be a partition
of V − {r} such that m ≥ 2 and if i ≠ j then all paths from a vertex of
Vi to a vertex of Vj pass through r. Therefore, no path in the tree which starts
with r → v, v ∈ Vi, can lead to a vertex of Vj where j ≠ i. Thus, there are at
least two tree edges out of r.
Now, assume r → v1 and r → v2 are two tree edges out of r. Let T be the
set of vertices in the subtree rooted at v1. By Lemma 3.4, there are no edges
connecting vertices of T with vertices of V − (T ∪ {r}). Thus, r separates T
from the rest of the graph, which is not empty since it includes at least the
vertex v2.
Q.E.D.
Ē = {s_i - C_j | s_i is a vertex of C_j in G}.
If v is not a leaf of the DFS tree, then L(v) is the least element in the follow-
ing set:
When we backtrack from v, we have already backtracked from all its sons
earlier, and therefore already know their lowpoints. Thus, all we need to add
is that when we backtrack from u to v, we assign L(v) ← Min{L(v), L(u)}.
Let us assume that |V| > 1 and s is the vertex in which we start the search.
The algorithm is now as follows:
(1) Mark all the edges "unused". Empty the stack S. For every v ∈ V let
k(v) ← 0. Let i ← 0 and v ← s.
(2) i ← i + 1, k(v) ← i, L(v) ← i and put v on S.
(3) If v has no unused incident edges go to Step (5).
(4) Choose an unused incident edge v -e- u. Mark e "used". If k(u) ≠ 0,
let L(v) ← Min{L(v), k(u)} and go to Step (3). Otherwise (k(u) = 0) let
f(u) ← v, v ← u and go to Step (2).
(5) If k(f(v)) = 1, go to Step (9).
(6) (f(v) ≠ s). If L(v) < k(f(v)), then L(f(v)) ← Min{L(f(v)), L(v)} and go
to Step (8).
(7) (L(v) ≥ k(f(v))) f(v) is a separating vertex. All the vertices on S down to
and including v are now removed from S; this set, with f(v), forms a
nonseparable component.
(8) v ← f(v) and go to Step (3).
(9) All vertices on S down to and including v are now removed from S; they
form, with s, a nonseparable component.
(10) If s has no unused incident edges then halt.
(11) Vertex s is a separating vertex. Let v ← s and go to Step (4).
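A recursive Python sketch of Steps (1)-(11). The vertex stack and the lowpoint updates are exactly those of the algorithm; the adjacency lists are an assumed stand-in for the graph of Figure 3.2, chosen to reproduce the components {a, b}, {b, c, d} and {d, e, f, g} mentioned earlier. The sketch assumes a connected simple graph.

```python
def nonseparable_components(graph, s):
    """DFS computing k(v) and the lowpoint L(v); when a tree edge v - u
    satisfies L(u) >= k(v), the vertices popped off the stack down to and
    including u, together with v, form a nonseparable component."""
    k, L, stack, comps, counter = {}, {}, [], [], [0]

    def visit(v, father):
        counter[0] += 1
        k[v] = L[v] = counter[0]
        stack.append(v)
        for u in graph[v]:
            if u not in k:                     # tree edge v - u
                visit(u, v)
                if L[u] >= k[v]:               # Steps (7)/(9): component found
                    comp = [stack.pop()]
                    while comp[-1] != u:
                        comp.append(stack.pop())
                    comps.append(comp + [v])
                else:                          # Step (6)
                    L[v] = min(L[v], L[u])
            elif u != father:                  # back edge: Step (4)
                L[v] = min(L[v], k[u])

    visit(s, None)
    return comps

# Adjacency lists assumed consistent with the example of Figures 3.2-3.3:
g = {'c': ['d', 'b'], 'd': ['c', 'e', 'b'], 'e': ['d', 'f'],
     'f': ['e', 'g'], 'g': ['f', 'd'], 'b': ['d', 'c', 'a'], 'a': ['b']}
components = nonseparable_components(g, 'c')
print([sorted(comp) for comp in components])
# [['d', 'e', 'f', 'g'], ['a', 'b'], ['b', 'c', 'd']]
```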
(1) Mark all the edges "unused". For every v ∈ V let k(v) ← 0 and f(v) be
"undefined". Also, let i ← 0 and v ← s. (s is the vertex we choose to start
the search from.)
(2) i ← i + 1, k(v) ← i.
(3) If there are no unused edges out of v then go to Step (5).
(4) Choose an unused edge v -e→ u. Mark e "used". If k(u) ≠ 0, go to Step
(3). Otherwise (k(u) = 0), f(u) ← v, v ← u and go to Step (2).
(5) If f(v) is defined then v ← f(v) and go to Step (3).
(6) (f(v) is undefined). If there is a vertex u for which k(u) = 0 then let v ←
u and go to Step (2).
(7) (All the vertices have been scanned.) Halt.
The structure which results from the DFS of a digraph is not as simple as it
is in the case of undirected graphs; instead of two types of edges (tree edges
and back edges) there are four: tree edges, forward edges, back edges, and
cross edges.
Assume DFS was performed on a digraph G(V, E), and let the set of
tree edges be E'. The digraph (V, E') is a branching (Section 2.7) or, as it is
sometimes called, a forest, since it is a union of disjoint directed trees. The
only remaining parallel to the structure of DFS for undirected graphs (as in
Lemma 3.4) is the fact that if x → y and k(y) > k(x) then y is a descendant
of x.
The digraph Ḡ must be free of directed circuits; for if it had a directed circuit,
all the strongly-connected components on it would have been one component.
Thus, there must be at least one sink, C_k, i.e., d_out(C_k) = 0, in Ḡ.
Let r be the first vertex of C_k visited in the DFS of G; r may have been
reached via a tree edge q -e→ r, it may be s, or it may have been picked in Step
(6). In the last two cases it is a root of one of the trees of the forest. Now, all
the vertices of C_k are reachable from r. Thus, no retreat from r is attempted
until all the vertices of C_k are discovered; they all get numbers greater than
k(r), and since there are no edges out of C_k in Ḡ, no vertex outside C_k is
visited from the time that r is discovered until we retreat from it. Thus, if we
store on a stack the vertices in the order of their discovery, then upon the
retreat from r, all the vertices on the stack, down to and including r, are the
elements of C_k. The only problem is how to tell, when we retreat from a
vertex, whether it has been the first one of a sink component.
For this purpose let us again define the lowpoint of v, L(v), to be the least number, k(u), of a vertex u which can be reached from v via a, possibly empty, directed path consisting of tree edges followed by at most one back edge or cross edge, provided u belongs to the same strongly-connected component. This seems like a circular situation: in order to compute the lowpoint we need to identify the components, and in order to find the components we need the lowpoint. However, Tarjan [2] found a way out. In the following algorithm, whose validity we shall discuss later, we use a stack S, on which we store the names of the vertices in the order in which they are discovered. Also, in an array, we record for each vertex whether it is on S or not, so that the question of whether a vertex is on S can be answered in constant time. The algorithm is as follows:
(1) Mark all the edges "unused". For every v ∈ V let k(v) ← 0 and f(v) be "undefined". Empty S. Let i ← 0 and v ← s.
(2) i ← i + 1, k(v) ← i, L(v) ← i and put v on S.
(3) If there are no unused edges out of v, go to Step (7).
(4) Choose an unused edge e = v → u. Mark e "used". If k(u) = 0 then f(u) ← v, v ← u and go to Step (2).
(5) If k(u) > k(v) (e is a forward edge), go to Step (3). Otherwise (k(u) < k(v)), if u is not on S (u and v do not belong to the same component), go to Step (3).
(6) (k(u) < k(v) and both vertices are in the same component.) Let L(v) ← min{L(v), k(u)} and go to Step (3).
(7) If L(v) = k(v) then delete all the vertices from S down to and including v; these vertices form a component.
(8) If f(v) is defined then L(f(v)) ← min{L(f(v)), L(v)}, v ← f(v) and go to Step (3).
(9) (f(v) is undefined.) If there is a vertex u for which k(u) = 0 then let v ← u and go to Step (2).
(10) (All vertices have been scanned.) Halt.
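Steps (1)-(10) can be sketched in a present-day programming language as follows (a Python illustration, written recursively for clarity; the stack S, the on-S record, and the lowpoint updates correspond directly to the steps above, but this is not the book's own code):

```python
# Tarjan's strongly-connected components algorithm, as described above.
def tarjan_scc(vertices, adj):
    k, L, f = {}, {}, {}
    S, on_S = [], set()
    comps = []
    counter = [0]

    def visit(v):
        counter[0] += 1
        k[v] = L[v] = counter[0]           # Step (2)
        S.append(v)
        on_S.add(v)
        for u in adj[v]:                   # Steps (3)-(6)
            if u not in k:                 # tree edge: k(u) = 0
                f[u] = v
                visit(u)
                L[v] = min(L[v], L[u])     # Step (8), on retraction from u
            elif u in on_S:                # same component: update lowpoint
                L[v] = min(L[v], k[u])     # Step (6)
        if L[v] == k[v]:                   # Step (7): v heads a component
            comp = []
            while True:
                u = S.pop()
                on_S.discard(u)
                comp.append(u)
                if u == v:
                    break
            comps.append(comp)

    for s in vertices:
        if s not in k:                     # Step (9)
            visit(s)
    return comps
```

Components are emitted in the order their heads are retracted from, so sink components come out first, exactly as Lemma 3.8 below requires.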
Lemma 3.8: Let r be the first vertex for which, in Step (7), L(r) = k(r).
When this occurs, all the vertices on S, down to and including r, form a
strongly-connected (sink) component of G.
Proof: All the vertices in S, on top of r, have been discovered after r, and since no backtracking from r has been attempted yet, these vertices are descendants of r; i.e., they are reachable from r via tree edges.
Next, we want to show that if v is a descendant of r then r is reachable from v. We have already backtracked from v, but since r is the first vertex for which equality occurs in Step (7), L(v) < k(v). Thus, a vertex u with k(u) = L(v) is reachable from v. Also, k(u) ≥ k(r), since by Step (8) L(r) ≤ L(v) (k(u) = L(v) ≥ L(r) = k(r)), and therefore u must be a descendant of r. If u ≠ r then we can repeat the argument again to find a lower numbered descendant of r which is reachable from v; since the numbers keep decreasing, eventually r itself is reached, so r is reachable from v.
Q.E.D.
If r is the first vertex for which, in Step (7), L(r) = k(r), then by Lemma 3.8 and its proof, a component C has been discovered and all its elements are descendants of r. Up to now at most one edge from a vertex outside C into a vertex in C may have been used, namely f(r) → r. Thus, so far, no vertex of C has been used to change the lowpoint value of a vertex outside C. At this point all vertices of C are removed from S, and therefore none of the edges entering C can change a lowpoint value anymore. Effectively this is equivalent to the removal of C and all its incident edges from G. Thus, when equality in Step (7) occurs again, Lemma 3.8 applies again. This proves the validity of the algorithm.
PROBLEMS
3.1 Tarry's algorithm [3] is like Trémaux's, with the following change. Replace Step (3) by:
(3) Choose an unmarked passage, mark it E and traverse the edge to its other endpoint u. If u has no marked passages (i.e., it is a new vertex), mark the passage through which u has been entered by F. Let v ← u and go to Step (2).
Prove that Tarry's algorithm terminates after all the edges of G have
been traversed, once in each direction. (Observe that Lemmas 3.1 and
3.2 remain valid for Tarry's algorithm, with virtually the same proofs.)
3.2 Consider the set of edges which upon the termination of Tarry's algo-
rithm (see Problem 3.1) have one endpoint marked E and the other F;
also assume these edges are now directed from E to F.
(a) Prove that this set of edges is a directed spanning tree of G with root
s. (See Lemma 3.3.).
(b) Does a statement like that of Lemma 3.4 hold in this case? Prove or
disprove.
3.3 Fraenkel [5, 6] showed that the number of edge traversals can sometimes be reduced if the use of a two-way counter is allowed. The algorithm is a variant of Tarry's algorithm (see Problem 3.1). Each time a new vertex is entered the counter is incremented; when it is realized that all incident edges of a vertex have been traversed in at least one direction, the counter is decremented. If the counter reaches the start value, the search is stopped. One can return to s via the F-marked passages.
Write an algorithm or a flow chart which realizes this idea. (Hint: an additional mark, which temporarily marks the passages used to re-enter a vertex, is used.) Prove that the algorithm works. Show that for some graphs the algorithm will traverse each edge exactly once, that for others the savings depend on the choice of passages, and that there are graphs for which the algorithm cannot save any traversals.
3.4 Assume G is drawn in the plane in such a way that no two edges intersect. Show how Trémaux's algorithm can be modified in such a way that the whole scanning path never crosses itself.
3.5 In an undirected graph G a set of vertices C is called a clique if every
two vertices of C are connected by an edge. Prove that in the spanning
(directed) tree resulting from a DFS, all the vertices of a clique appear
on one directed path. Do they necessarily appear consecutively on the
path? Justify your answer.
3.6 Prove that if C is a directed circuit of a digraph to which a DFS algo-
rithm was applied then the vertex v, for which k(v) is minimum among
the vertices of C, is a root of a subtree in the resulting forest, and all
the vertices of C are in this subtree.
3.7 An edge e of a connected undirected graph G is called a bridge if its deletion disconnects G. Describe a variation of the DFS algorithm which, instead of detecting separating vertices, detects bridges.
3.8 (This problem was suggested by Silvio Micali) Let G be a connected graph.
(a) Prove that a vertex u ≠ s is a separating vertex of G if and only if,
REFERENCES
[1] Hopcroft, J., and Tarjan, R., "Algorithm 447: Efficient Algorithms for Graph
Manipulation", Comm. ACM, Vol. 16, 1973, pp. 372-378.
[2] Tarjan, R., "Depth-First Search and Linear Graph Algorithms", SIAM J. Comput., Vol. 1, 1972, pp. 146-160.
[3] Lucas, E., Récréations Mathématiques, Paris, 1882.
[4] Tarry, G., "Le Problème des Labyrinthes", Nouvelles Ann. de Math., Vol. 14,
1895, page 187.
[5] Fraenkel, A. S., "Economic Traversal of Labyrinths", Math. Mag., Vol. 43,
1970, pp. 125-130.
[6] Fraenkel, A. S., "Economic Traversal of Labyrinths (Correction)," Math. Mag.,
Vol. 44, No. 1, January 1971.
Chapter 4
4. ORDERED TREES
4.1 UNIQUELY DECIPHERABLE CODES
(1) ci, 1 ≤ i ≤ m, and cj′, 1 ≤ j ≤ n, are code-words and c1 ≠ c1′;
(2) t is a suffix of cn′;
(3) c1c2 ⋯ cm t = c1′c2′ ⋯ cn′.
The algorithm generates all the tails. If a code-word is a tail, the algorithm terminates with a negative answer.
Clearly, in Step (1), the words declared to be tails are indeed tails. In Step (2), since t is already known to be a tail, there exist code-words c1, c2, …, cm and c1′, c2′, …, cn′ such that c1c2 ⋯ cm t = c1′c2′ ⋯ cn′. Now, if ts = c then c1c2 ⋯ cm c = c1′c2′ ⋯ cn′ s, and therefore s is a tail; and if cs = t then c1c2 ⋯ cm cs = c1′c2′ ⋯ cn′, and s is a tail.
Next, if the algorithm halts in (3), we want to show that all the tails have been produced. Once this is established, it is easy to see that the conclusion that C is UD follows: each tail has been checked, in Step (2.1), for equality with a code-word, and no such equality has been found; by Lemma 4.1, the code C is UD.
For every tail t let m(t) = c1c2 ⋯ cm be a shortest message such that c1c2 ⋯ cm t = c1′c2′ ⋯ cn′, and t is a suffix of cn′. We prove by induction on the length of m(t) that t is produced. If m(t) is of length 1 then t is produced by Step (1.2), since m = n = 1.
Now assume that all tails p for which m(p) is shorter than m(t) have been produced. Since t is a suffix of cn′, we have pt = cn′. Therefore, c1c2 ⋯ cm = c1′c2′ ⋯ c(n−1)′ p.
If p = cm then cm t = cn′ and t is produced in Step (1).
If p is a suffix of cm then, by definition, p is a tail. Also, m(p) is shorter than m(t). By the inductive hypothesis p has been produced. In Step (2.2), when applied to the tail p and the code-word cn′, by pt = cn′, the tail t is produced.
If cm is a suffix of p, then cm t is a suffix of cn′, and therefore cm t is a tail. m(cm t) = c1c2 ⋯ cm−1, which is shorter than m(t). By the inductive hypothesis cm t has been produced. In Step (2.2), when applied to the tail cm t and the code-word cm, the tail t is produced.
This proves that the algorithm halts with the right answer.
Let the code consist of n words and let l be the maximum length of a code-word. Step (1) takes at most O(n²·l) elementary operations. The number of tails is at most O(n·l). Thus, Step (2) takes at most O(n²·l²) elementary operations. Therefore, the whole algorithm is of time complexity O(n²·l²). Other algorithms of the same complexity can be found in References 3 and 4; these tests are extendible to test for additional properties [5, 6, 7].
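The test above (generate tails until no new ones appear; answer "not UD" as soon as a tail equals a code-word) can be sketched in a present-day programming language as follows. The set operations here stand in for the bookkeeping of Steps (1)-(3); this is an illustration, not the book's algorithm verbatim:

```python
# Sardinas-Patterson style test for unique decipherability.
def is_uniquely_decipherable(code):
    code = set(code)
    tails = set()
    # Step (1): if one code-word is a proper prefix of another,
    # the remaining suffix is a tail.
    for c1 in code:
        for c2 in code:
            if c1 != c2 and c2.startswith(c1):
                tails.add(c2[len(c1):])
    # Step (2): grow the set of tails until no new ones appear.
    while True:
        new = set()
        for t in tails:
            if t in code:                  # Step (2.1): tail equals a code-word
                return False
            for c in code:
                if c.startswith(t):        # t s = c  =>  s is a tail
                    new.add(c[len(t):])
                elif t.startswith(c):      # c s = t  =>  s is a tail
                    new.add(t[len(c):])
        new.discard("")
        if new <= tails:
            return True                    # Step (3): no new tails, C is UD
        tails |= new
```

Since there are at most O(n·l) distinct tails, the loop terminates, matching the complexity bound stated above.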
Σ(i=1..n) a^(−li) ≤ 1. (4.1)

(Σ(i=1..n) a^(−li))^e = Σ(i1=1..n) ⋯ Σ(ie=1..n) a^(−(li1 + li2 + ⋯ + lie)).
There is a unique term, on the right hand side, for each of the n^e messages of e code-words. Let us denote by N(e, j) the number of messages of e code-words whose length is j. It follows that

(Σ(i=1..n) a^(−li))^e = Σ(j=e..e·l) N(e, j)·a^(−j).

Since the code is UD, no two distinct messages are equal as words, and therefore N(e, j) ≤ a^j. Thus,

Σ(j=e..e·l) N(e, j)·a^(−j) ≤ Σ(j=e..e·l) a^j·a^(−j) = e·l − e + 1.
Σ(i=1..n) a^(−li) ≤ 1, (4.2)
then there exists a prefix code C = {c1, c2, …, cn}, over the alphabet of a letters, such that li = l(ci).
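The construction promised by Theorem 4.2 can be sketched for the binary case (a = 2): assign code-words in order of nondecreasing length, taking at each length the next unused value. This counter-based scheme is one standard realization of the proof, not the book's literal text:

```python
# Build a binary prefix code with the given code-word lengths,
# assuming the characteristic sum condition (4.2) holds.
def prefix_code_from_lengths(lengths):
    assert sum(2.0 ** -l for l in lengths) <= 1.0, "(4.2) must hold"
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    words = [None] * len(lengths)
    value, prev = 0, 0
    for i in order:
        l = lengths[i]
        value <<= (l - prev)          # extend the counter to the new length
        words[i] = format(value, "0{}b".format(l))
        value += 1                    # next unused value at this length
        prev = l
    return words
```

Because (4.2) holds, the counter never overflows the l-bit range, and no assigned word is a prefix of a later one.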
Proof: Let λ1 < λ2 < ⋯ < λr be integers such that each li is equal to one of the λj's and each λj is equal to at least one of the li's. Let kj be the number of li's which are equal to λj. We have to show that there exists a prefix code C such that the number of code-words of length λj is kj.
Clearly, (4.2) implies that

Σ(j=1..r) kj·a^(−λj) ≤ 1. (4.3)

Σ(j=1..r+1) kj·a^(−λj) ≤ 1.

Σ(j=1..r+1) kj·a^(λ(r+1) − λj) ≤ a^(λ(r+1)),

which is equivalent to
Figure 4.1
Ī = Σ(i=1..n) pi·li. (4.5)

We want to find a code for which Ī is minimum, in order to minimize the expected length of the message.
Since the code must be UD, by Theorem 4.1 the vector of code-word lengths must satisfy the characteristic sum condition. This implies, by Theorem 4.2, that a prefix code with the same vector of code-word lengths exists. Therefore, in seeking an optimum code, for which Ī is minimum, we may restrict our search to prefix codes. In fact, all we have to do is find a vector of code-word lengths for which Ī is minimum, among the vectors which satisfy the characteristic sum condition.
First, let us assume that p1 ≥ p2 ≥ ⋯ ≥ pn. This is easily achieved by sorting the probabilities. We shall first demonstrate Huffman's construction for the binary case (a = 2). Assume the probabilities are 0.6, 0.2, 0.05, 0.05, 0.03, 0.03, 0.03, 0.01. We write this list as our top row (see Fig. 4.2). We add the last (and therefore least) two numbers, and insert the sum in a proper place to maintain the non-increasing order. We repeat this operation until we get a vector with only two probabilities. Now, we assign each of them a word-length 1 and start working our way back up: each probability of the previous step which is not one of the last two is assigned the length it has in the present step, and each of the last two probabilities of the previous step is assigned a length larger by one than the length assigned to their sum in the present step.
Once the vector of code-word lengths is found, a prefix code can be assigned to it by the technique of the proof of Theorem 4.2. (An efficient implementation is discussed in Problem 4.6.) Alternatively, the back-up procedure can produce a prefix code directly. Instead of assigning the last two probabilities with lengths, we assign the two words of length one: 0 and 1. As we back up from a present step, in which each probability is already assigned a word, to the previous step, the rule is as follows: all but the last two probabilities of the previous step are assigned the same words as in the present step. The last two probabilities are assigned c0 and c1, where c is the word assigned to their sum in the present step.
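The back-up rule can be sketched directly for the binary case: whenever two least probabilities are merged, their final words are c0 and c1, where c is the word of their sum, so it suffices to prepend the distinguishing bit as we back up. A Python illustration (a heap replaces the sorted row; not the book's code):

```python
import heapq

# Binary Huffman construction, producing the prefix code directly.
def huffman_code(probs):
    # heap items: (probability, tie-breaker, list of original indices)
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    words = [""] * len(probs)
    if len(heap) == 1:
        return ["0"]
    while len(heap) > 1:
        p1, _, g1 = heapq.heappop(heap)    # the two least probabilities
        p2, t, g2 = heapq.heappop(heap)
        for i in g1:                       # back-up rule: prepend the bit
            words[i] = "0" + words[i]
        for i in g2:
            words[i] = "1" + words[i]
        heapq.heappush(heap, (p1 + p2, t, g1 + g2))
    return words
```

On the book's example 0.6, 0.2, 0.05, 0.05, 0.03, 0.03, 0.03, 0.01 this yields code-word lengths 1, 2, 4, 4, 5, 5, 5, 5, whose characteristic sum is exactly 1.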
In the general case, when a ≥ 2, we add in each step the last d probabilities of the present vector of probabilities; if n is the number of probabilities of this vector then d is given by:

d = 2 + (n − 2) mod (a − 1). (4.6)

After the first step, the length of the vector, n′, satisfies n′ ≡ 1 mod (a − 1), and it will remain equal to one, mod (a − 1), from there on. The reason for this rule is that we should end up with exactly a probabilities, each to be assigned length 1. Now, a ≡ 1 mod (a − 1), and since in each ordinary step the number of probabilities is reduced by a − 1, we want n ≡ 1 mod (a − 1). In case this condition is not satisfied by the given n, we correct it in the first step, as is done by our rule. Our next goal is to prove that this indeed leads to an optimum assignment of a vector of code-word lengths.
Lemma 4.2: If C = {c1, c2, …, cn} is an optimum prefix code for the probabilities p1, p2, …, pn then pi > pj implies that l(ci) ≤ l(cj).
Proof: Assume l(ci) > l(cj). Make the following switch: assign ci to probability pj, and cj to pi; all other assignments remain unchanged. Let Ī′ denote the average code-word length of the new assignment, while Ī denotes the previous one. By (4.5) we have
Lemma 4.3: There exists an optimum prefix code for the probabilities p1 ≥ p2 ≥ ⋯ ≥ pn such that the positional tree which represents it has the following properties:
(1) All the internal vertices of the tree, except possibly one internal vertex v, have exactly a sons.
(2) Vertex v has 1 < p ≤ a sons, where n ≡ p mod (a − 1).
(3) Vertex v of (1) is on the lowest level which contains internal vertices, and its sons are assigned to p(n−p+1), p(n−p+2), …, pn.
ternal vertices on the same level. Also, when this process ends, v is the only lacking internal vertex (proving (1)), and its number of remaining sons must be greater than one, or else its son could be removed and its probability attached to v. This proves that the number of sons of v, p, satisfies 1 < p ≤ a.
If v's p sons are removed, the new tree has n′ = n − p + 1 leaves and is full (i.e., every internal vertex has exactly a sons). In such a tree the number of leaves, n′, satisfies n′ ≡ 1 mod (a − 1). This is easily proved by induction on the number of internal vertices. Thus, n − p + 1 ≡ 1 mod (a − 1), and therefore n ≡ p mod (a − 1), proving (2).
We have already shown that v is on the lowest level of T which contains internal vertices and that the number of its sons is p. By Lemma 4.2, we know that the least p probabilities are assigned to leaves of the lowest level of T. If they are not sons of v, we can exchange sons of v with sons of other internal vertices on this level, to bring all the least probabilities to v, without changing the average length.
Q.E.D.
vertex v, which has no other sons. By Lemma 4.3, Oa(p1, p2, …, pn) contains at least one optimum tree. Thus, we may restrict our search for an optimum tree to Oa(p1, p2, …, pn).
the prefix code represented by a tree T of Oa(p1, p2, …, pn) and the average code word-length Ī′ of the prefix code represented by the tree T′, which corresponds to T, satisfy

Ī = Ī′ + p′, (4.7)

where p′ is the sum of the d least probabilities, which are assigned to the sons of v. For,

Ī = Σ(i=1..n−d) pi·li + (ln − 1)·p′ + p′ = Ī′ + p′.

Q.E.D.
Assume we have n items, and there is an order defined between them. For ease of presentation, let us assume that the items are the integers 1, 2, …, n and the order is "less than". Assume that we want to organize the numbers in nondecreasing order, where initially they are put in L lists, A1, A2, …, AL. Each Ai is assumed to be ordered already. Our method of building larger lists from smaller ones is as follows. Let B1, B2, …, Bm be any m existing lists. We read the first, and therefore least, number in each of the lists, take the least number among them away from its list and put it as the first number of the merged list. The list from which we took the number is now shorter by one. We repeat this operation on the same m lists until they merge into one. Clearly, some of the lists become empty before others, but since this depends on the structure of the lists, we only know that the general step, of finding the least number among m numbers (or fewer) and transferring it to the new list, is repeated b1 + b2 + ⋯ + bm times, where bi is the number of numbers in Bi.
The number m is dictated by our equipment or decided upon in some other
way. However, we shall assume that its value is fixed and predetermined. In
fact, in most cases m = 2.
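The general merge step above can be sketched with a heap holding the current first number of each list, so that each transfer costs O(log m); a Python illustration (not from the book):

```python
import heapq

# Merge m sorted lists into one, one transfer per step.
def merge_lists(lists):
    # heap items: (first number, list index, position within the list)
    heap = [(lst[0], i, 0) for i, lst in enumerate(lists) if lst]
    heapq.heapify(heap)
    merged = []
    while heap:
        x, i, j = heapq.heappop(heap)      # least first number among the lists
        merged.append(x)                   # transfer it to the merged list
        if j + 1 < len(lists[i]):
            heapq.heappush(heap, (lists[i][j + 1], i, j + 1))
    return merged
```

The number of iterations of the loop is exactly b1 + b2 + ⋯ + bm, the transfer count used in the analysis above.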
The whole procedure can be described by a positional tree. Consider the example shown in Fig. 4.3, where m = 2. First we merge the list ⟨3⟩ with ⟨1, 4⟩. Next we merge ⟨2, 5⟩ with ⟨1, 3, 4⟩. The original lists, A1, A2, …, AL, correspond to the leaves of the tree. The number of transfers can be
Figure 4.3
Burge [15] observed that the attempt to find a positional m-ary tree which minimizes (4.8) is similar to the minimum average word-length problem solved by Huffman. The fact that the Huffman construction is in terms of probabilities does not matter, since the fact that p1 + p2 + ⋯ + pL = 1 is never used in the construction or in its validity proof. Let us demonstrate the implied procedure by the following example.
Assume L = 12 and m = 4; the bi's are given in nonincreasing order: 9, 8, 8, 7, 6, 6, 6, 5, 5, 4, 3, 3. Since L ≡ 0 (mod 3), according to (4.6) d = 3. Thus, in the first step we merge the last three lists to form a list of length 10, which is now put in the first place (see Fig. 4.4). From there on, we merge each time the four lists of least length. The whole merge procedure is described in the tree shown in Fig. 4.5.
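The example can be checked by computing the total number of transfers: the first step merges d of the shortest lists, with d as in (4.6), and every later step merges the m shortest. A Python sketch (an illustration, not the book's procedure verbatim):

```python
import heapq

# Total transfer count of the Huffman-style merge pattern.
def optimal_merge_cost(lengths, m):
    heap = list(lengths)
    heapq.heapify(heap)
    d = 2 + (len(lengths) - 2) % (m - 1)   # first-step rule, as in (4.6)
    cost = 0
    while len(heap) > 1:
        s = sum(heapq.heappop(heap) for _ in range(min(d, len(heap))))
        cost += s                          # each merge transfers s numbers
        heapq.heappush(heap, s)
        d = m                              # afterwards, merge m at a time
    return cost
```

For the example above the merges produce 10, 22, 29 and 70, for a total of 131 transfers, matching Figs. 4.4 and 4.5.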
9 8 8 7 6 6 6 5 5 4 3 3
1 2 2 2 2 2 2 2 2 2 2 2
10 9 8 8 7 6 6 6 5 5
1 1 2 2 2 2 2 2 2 2
22 10 9 8 8 7 6
1 1 1 2 2 2 2
29 22 10 9
1 1 1 1
Figure 4.4
82 Ordered Trees
Figure 4.5
Lemma 4.5: A sequence of (left and right) parentheses is well formed if and
only if it contains an even number of parentheses, half of which are left and
the other half are right, and as we read the sequence from left to right, the
number of right parentheses never exceeds the number of left parentheses.
Proof: First let us prove the "only if" part. Since the construction of every well-formed sequence starts with no parentheses (the empty sequence), and each time we add on parentheses (Step 3) there is one left and one right, it is clear that there are n left parentheses and n right parentheses. Now, assume that for every well-formed sequence of m left and m right parentheses, where m < n, it is true that as we read it from left to right the number of right parentheses never exceeds the number of left parentheses. If the last step in
left, we get a sequence of n left and n right parentheses which is not well
formed. This transformation is the inverse of the one of the previous
paragraph. Thus, the one-to-one correspondence is established.
The number of sequences of n − 1 left and n + 1 right parentheses is

(2n choose n−1),

for we can choose the places for the left parentheses, and the remaining places will have right parentheses. Thus, the number of well-formed sequences of n pairs of parentheses is

(2n choose n) − (2n choose n−1) = (1/(n+1))·(2n choose n).
Figure 4.6
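Lemma 4.5 and the count above can be checked by brute force for small n; a Python sketch (an illustration only):

```python
from itertools import combinations

# Lemma 4.5: scan left to right, rejecting as soon as right
# parentheses outnumber left ones.
def well_formed(seq):
    depth = 0
    for ch in seq:
        depth += 1 if ch == "(" else -1
        if depth < 0:                      # rights exceed lefts
            return False
    return depth == 0                      # equal numbers of each

# Count the well-formed sequences of n pairs by enumerating all
# placements of the n left parentheses among 2n positions.
def count_well_formed(n):
    total = 0
    for lefts in combinations(range(2 * n), n):
        seq = [")"] * (2 * n)
        for i in lefts:
            seq[i] = "("
        total += well_formed(seq)
    return total
```

For n = 1, 2, 3, 4 this gives 1, 2, 5, 14, the Catalan numbers (1/(n+1))·(2n choose n).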
Figure 4.7
Figure 4.8
taching two leaves to each leaf of T′, and one leaf (son) to each vertex which in T′ has only one son. Thus, the number of full binary trees of n vertices is equal to the number of positional binary trees of (n − 1)/2 vertices. By (4.9) this number is

(2/(n + 1))·((n − 1) choose ((n − 1)/2)).
PROBLEMS
4.7 A code is called exhaustive if every word over the alphabet is the begin-
ning of some message over the code. Prove the following:
(a) If a code is prefix and its characteristic sum is 1 then the code is ex-
haustive.
(b) If a code is UD and exhaustive then it is prefix and its characteristic
sum is 1.
4.8 Construct the ordered forest, the positional binary tree and the permuta-
tion through a stack which corresponds to the following well-formed se-
quence of 10 pairs of parentheses:
(O()((())((()()()) .
4.9 A direct method for computing the number of positional binary trees of n vertices through the use of a generating function goes as follows: Let bn be the number of trees of n vertices. Define b0 = 1 and define the function

B(x) = b0 + b1x + b2x² + ⋯

to prove that

bn = (1/(n+1))·(2n choose n).
REFERENCES
[1] Sardinas, A. A., and Patterson, G. W., "A Necessary and Sufficient Condition
for the Unique Decomposition of Coded Messages", IRE Convention Record,
Part 8, 1953, pp. 104-108.
[2] Gallager, R. G., Information Theory and Reliable Communication, John Wiley, 1968. Problem 3.4, page 512.
[3] Levenshtein, V. I., "Certain Properties of Code Systems", Dokl. Akad. Nauk,
SSSR, Vol. 140, No. 6, Oct. 1961, pp. 1274-1277. English translation: Soviet
Physics, "Doklady", Vol. 6, April 1962, pp. 858-860.
[41 Even, S., "Test for Unique Decipherability", IEEE Trans. on Infor. Th., Vol.
IT-9, No. 2, April 1963, pp. 109-112.
[5] Levenshtein, V. I., "Self-Adaptive Automata for Coding Messages", DokI.
Akad. Nauk, SSSR, Vol. 140, Dec. 1961, pp. 1320-1323. English translation:
Soviet Physics, "Doklady", Vol. 6, June 1962, pp. 1042-1045.
[6] Markov, Al. A., "On Alphabet Coding", Dokl. Akad. Nauk, SSSR, Vol. 139,
July 1961, pp. 560-561. English translation: Soviet Physics, "Doklady", Vol. 6,
Jan. 1962, pp. 553-554.
[7] Even, S., "Test for Synchronizability of Finite Automata and Variable Length
Codes", IEEE Trans. on Infor. Th., Vol. IT-10, No. 3, July 1964, pp. 185-189.
[8] McMillan, B., "Two Inequalities Implied by Unique Decipherability", IRE
Trans. on Infor. Th., Vol. IT-2, 1956, pp. 115-116.
[9] Karush, J., "A Simple Proof of an Inequality of McMillan", IRE Trans. On In-
for. Th., Vol. IT-7, 1961, page 118.
[10] Kraft, L. G., "A Device for Quantizing, Grouping and Coding Amplitude
Modulated Pulses", M.S. Thesis, Dept. of E.E., M.I.T.
[11] Huffman, D. A., "A Method for the Construction of Minimum Redundancy
Codes", Proc. IRE, Vol. 40, No. 10, 1952, pp. 1098-1101.
[12] Perl, Y., Garey, M. R., and Even, S., "Efficient Generation of Optimal Prefix Code: Equiprobable Words Using Unequal Cost Letters", J. ACM, Vol. 22, No. 2, April 1975, pp. 202-214.
[13] Itai, A., "Optimal Alphabetic Trees", SIAM J. Comput., Vol. 5, No. 1, March
1976, pp. 9-18.
[14] Knuth, D. E., The Art of Computer Programming, Vol. 3: Sorting and Searching, Addison-Wesley, 1973.
[15] Burge, W. H., "Sorting, Trees, and Measures of Order", Infor. and Control,
Vol. 1, 1958, pp. 181-197.
Chapter 5
5. MAXIMUM FLOW IN A NETWORK
*The exclusion of self-loops and parallel edges is not essential. It will shortly become evident that no generality is lost; the flow in a self-loop gains nothing, and a set of parallel edges can be replaced by one edge whose capacity is the sum of their capacities. This condition ensures that |E| ≤ |V|·(|V| − 1).
**The choice of s or t is completely arbitrary. There is no requirement that s is a graphical source, i.e., has no incoming edges, or that t is a graphical sink, i.e., has no outgoing edges. The edges entering s or leaving t are actually redundant and have no effect on our problem, but we allow them since the choice of s and t may vary, while we leave the other data unchanged.
Namely, F is the net sum of flow into the sink. Our problem is to find an f for which the total flow is maximum.
Let S be a subset of vertices such that s ∈ S and t ∉ S. Let S̄ be the complement of S, i.e., S̄ = V − S. Let (S; S̄) be the set of edges of G whose start-vertex is in S and whose end-vertex is in S̄. The set (S̄; S) is defined similarly. The set of edges connecting vertices of S with S̄ (in both directions) is called the cut defined by S.
By definition, the total flow F is measured at the sink. Our purpose is to
show that F can be measured at any cut.
Proof: Let us sum up equation (5.2) with all the equations (5.1) for v ∈ S̄ − {t}. The resulting equation has F on the left hand side. In order to see what happens on the right hand side, consider an edge e = x → y. If both x and y belong to S then f(e) does not appear on the r.h.s. at all, in agreement with (5.3). If both x and y belong to S̄ then f(e) appears on the r.h.s. once positively, in the equation for y, and once negatively, in the equation for x. Thus, in the summation it is canceled out, again in agreement with (5.3). If x ∈ S and y ∈ S̄ then f(e) appears on the r.h.s. of the equation for y, positively, and in no other equation we use; indeed e ∈ (S; S̄), and again we have agreement with (5.3). Finally, if x ∈ S̄ and y ∈ S, f(e) appears negatively on the r.h.s. of the equation for x, and again this agrees with (5.3) since e ∈ (S̄; S).
Q.E.D.
Lemma 5.2: For every flow function f, with total flow F, and every S,

F ≤ c(S). (5.5)

Proof: By Lemma 5.1,

F = Σ(e∈(S;S̄)) f(e) − Σ(e∈(S̄;S)) f(e) ≤ Σ(e∈(S;S̄)) c(e) = c(S).
Q.E.D.
Corollary 5.1: If F and S satisfy (5.5) with equality then F is maximum and the cut defined by S is of minimum capacity.
Ford and Fulkerson [1] suggested the use of augmenting paths to change a given flow function in order to increase the total flow. An augmenting path is a simple path from s to t, which is not necessarily directed, but which can be used to advance flow from s to t. If on this path e points in the direction from s to t, then in order to be able to push flow through it, f(e) must be less than c(e). If e points in the opposite direction, then in order to be able to push additional flow from s to t through it, we must be able to cancel some of its flow. Therefore, f(e) > 0 must hold.
In an attempt to find an augmenting path for a given flow, a labeling procedure is used. We label s. Then, every vertex v for which we can find an augmenting path from s to v is labeled. If t is labeled then an augmenting path has been found. This path is used to increase the total flow, and the procedure is repeated.
A forward labeling of vertex v by the edge u -e→ v is applicable if u is labeled, v is not labeled, and f(e) < c(e).
The label that v gets is 'e'. If e is used for forward labeling we define Δ(e) = c(e) − f(e).
A backward labeling of vertex v by the edge v -e→ u is applicable if u is labeled, v is not labeled, and f(e) > 0.
The label that v gets is 'e'. In this case we define Δ(e) = f(e).
The Ford and Fulkerson algorithm is as follows:
(Note that if the initial flow on the edges entering s is zero, it will never change. This is also true for the edges leaving t.)
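The labeling procedure can be sketched in a present-day programming language as follows. The sketch propagates labels in BFS order (one proper strategy, as Edmonds and Karp suggest), with the forward/backward rules and Δ exactly as defined above; the function name and data layout are illustrative assumptions, not the book's text:

```python
from collections import deque

# Ford-Fulkerson labeling: repeat label propagation and augmentation
# until t cannot be labeled; then the total flow is maximum.
def max_flow(edges, c, s, t):
    f = {e: 0 for e in edges}
    while True:
        label = {s: None}                  # label[v] = (edge, direction)
        q = deque([s])
        while q and t not in label:
            v = q.popleft()
            for e in edges:
                u, w = e
                if u == v and w not in label and f[e] < c[e]:
                    label[w] = (e, "forward")    # Delta(e) = c(e) - f(e)
                    q.append(w)
                elif w == v and u not in label and f[e] > 0:
                    label[u] = (e, "backward")   # Delta(e) = f(e)
                    q.append(u)
        if t not in label:                 # no augmenting path exists
            total = sum(f[e] for e in edges if e[1] == t) \
                  - sum(f[e] for e in edges if e[0] == t)
            return f, total
        # backtrack the augmenting path from t and compute Delta
        path, v = [], t
        while label[v] is not None:
            e, d = label[v]
            path.append((e, d))
            v = e[0] if d == "forward" else e[1]
        delta = min(c[e] - f[e] if d == "forward" else f[e] for e, d in path)
        for e, d in path:                  # push Delta along the path
            f[e] += delta if d == "forward" else -delta
```

The vertices labeled but not scanned sit in the queue; when t is not labeled, the labeled set S yields the minimum cut of Corollary 5.1.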
As an example, consider the network shown in Fig. 5.1. Next to each edge e we write c(e), f(e), in this order. We assume a zero initial flow everywhere. A first wave of label propagation might be as follows: s is labeled; e2 is used to label c; e6 is used to label d; e4 is used to label a; e3 is used to label b; and finally, e7 is used to label t. The path is s -e2→ c -e6→ d -e4→ a -e3→ b -e7→ t. Δ = 4, and the new flow is shown in Fig. 5.2.
The next augmenting path may be

s -e1→ a -e3→ b -e5→ c -e6→ d -e8→ t.

Now, Δ = 3 and the flow is as in Fig. 5.3.
The next augmenting path may be s -e1→ a -e3→ b -e7→ t. Now, Δ = 3 and the new flow is as in Fig. 5.4. Now, the labeling can proceed as follows: s is labeled; e1 is used to label a; e3 is used to label b; (so far we have not used backward labeling, but this next step is forced) e4 is used to label d, backward; e8 is used to label t. The path we backtrack is s -e1→ a ←e4- d -e8→ t. Now, Δ(e1) = 9, Δ(e4) = 4 and Δ(e8) = 7. Thus, Δ = 4. The new flow is shown in Fig. 5.5.
Figure 5.1
Figure 5.2
Figure 5.3
Figure 5.4
The next wave of label propagation is as follows:
Figure 5.5
algorithm: If the initial flow is integral, for example zero everywhere, and if all the capacities are integers, then the algorithm never introduces fractions. The algorithm adds and subtracts, but it never divides. Also, if t is labeled, the augmenting path is used to increase the total flow by at least one unit. Since there is an upper bound on the total flow (the capacity of any cut), the process must terminate.
Ford and Fulkerson showed that their algorithm may fail if the capacities are allowed to be irrational numbers. Their counterexample (Reference 1, p. 21) displays an infinite sequence of flow augmentations. The flow converges (in infinitely many steps) to a value which is one fourth of the maximum total flow. We shall not present their example here; it is fairly complex and, as the reader will shortly discover, it is not as important any more.
One could argue that for all practical purposes we may assume that the algorithm is sure to halt. This follows from the fact that our computations are usually through a fixed radix (decimal, binary, and so on) number representation with a bound on the number of digits used; in other words, all figures are multiples of a fixed quantum, and the termination proof works here as it does for integers. However, a simple example shows the weakness of this argument. Consider the network shown in Fig. 5.6. Assume that M is a very large integer. If the algorithm starts with f(e) = 0 for all e, and alternately uses s → a → b → t and s → b → a → t as augmenting paths, it will take 2M augmentations before F = 2M is achieved.
Figure 5.6
Edmonds and Karp [2] were the first to overcome this problem. They showed that if one uses breadth-first search (BFS) in the labeling algorithm and always uses a shortest augmenting path, the algorithm will terminate in O(|V|·|E|²) steps, regardless of the capacities. (Here, of course, we assume that our computer can handle, in one step, any real number.) In the next section we shall present the more advanced work of Dinic [3]; his algorithm has time complexity O(|V|²·|E|). Karzanov [4] and Cherkassky [5] have reduced it to O(|V|³) and O(|V|²·|E|^(1/2)), respectively. These algorithms are fairly complex and will not be described. A recent algorithm of Malhotra, Pramodh Kumar and Maheshwari [6] has the same time complexity as Karzanov's and is much simpler; it will be described in the next section.
The existence of these algorithms assures that, if one proceeds according to a proper strategy in the labeling procedure, the algorithm is guaranteed to halt. When it does, the total flow is maximum, and the cut indicated is minimum, thus proving the max-flow min-cut theorem:
Theorem 5.1: Every network has a maximum total flow which is equal to the
capacity of a cut for which the capacity is minimum.
As in the Ford and Fulkerson algorithm, the Dinic algorithm starts with some legal flow function and improves it. When no improvement is possible the algorithm halts, and the total flow is maximum.
If presently an edge u -e→ v has flow f(e) then we say that e is useful from u to v if f(e) < c(e), and useful from v to u if f(e) > 0. The layers V0, V1, … of the layered network are constructed as follows:
(1) V0 ← {s}, i ← 0.
(2) Construct T ← {v | v ∉ Vj for all j ≤ i, and there is a useful edge from a vertex of Vi to v}.
(3) If T is empty, the present total flow F is maximum; halt.
(4) If T contains t then l ← i + 1, Vl ← {t} and halt.
(5) Let Vi+1 ← T, increment i and return to Step (2).
For every 1 c i c 1, let Ei be the set of edges useful from a vertex of Vi-l
to a vertex of Vi. The sets Vj are called layers.
The construction of the layered network investigates each edge at most
twice; once in each direction. Thus, the time complexity of this algorithm is
O(|E|).
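In outline, this construction is a BFS over the useful directions of the edges; the sketch below (in Python, with function and variable names of our choosing, not the book's) represents each edge by its endpoints together with capacity and flow arrays:

```python
from collections import deque

def layered_network(n, edges, cap, flow, s, t):
    """Build the layers of Dinic's layered network by BFS.

    edges: list of (u, v) pairs; cap[i], flow[i] give c(e) and f(e) for edge i.
    Edge i is useful from u to v if f < c (forward) and useful from v to u
    if f > 0 (backward).  Returns level[] with level[v] = layer index of v,
    or None if t is unreachable (the present total flow is maximum).
    """
    # Each edge is examined at most twice, once per direction: O(|E|).
    adj = [[] for _ in range(n)]
    for i, (u, v) in enumerate(edges):
        if flow[i] < cap[i]:
            adj[u].append(v)       # useful forward
        if flow[i] > 0:
            adj[v].append(u)       # useful backward
    level = [-1] * n
    level[s] = 0
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if level[v] == -1:
                level[v] = level[u] + 1
                q.append(v)
    return level if level[t] != -1 else None
```

Returning None corresponds to termination in Step (3); a vertex left at level −1 is in no layer.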
98 Maximum Flow in a Network
Lemma 5.3: If the construction of the layered network terminates in Step (3)
then the present total flow, F, is indeed maximum.
Proof: The proof here is very similar to the one in the Ford and Fulkerson
algorithm: Let S be the union of V_0, V_1, ..., V_i. Every edge u → v in (S; S̄) is
saturated, i.e. f(e) = c(e), or else e is useful from u to v and T is not empty.
Also, every edge v → u with v ∈ S̄ and u ∈ S has f(e) = 0, or again e is useful
from u to v, etc. Thus, by Lemma 5.1, F = c(S), and by Lemma 5.2 no flow
can exceed c(S); F is therefore maximum.
It is easy to see that the new flow f′ satisfies both C1 (due to the choice of
c̃) and C2 (because it is the superposition of two flows which satisfy C2).
Clearly F′ = F + F̃ > F.
Let us call the part of the algorithm which starts with a flow f, finds its layered
network, finds a maximal flow f̃ in it and improves the flow in the original
network, a phase.
Figure 5.7
Lemma 5.4: If the (k + 1)st phase is not the last then l_{k+1} > l_k.
Proof: Consider an augmenting path s = v_0 —e_1→ v_1 —e_2→ ... —e_{l_{k+1}}→
v_{l_{k+1}} = t used in the (k + 1)st phase.
First, let us assume that all the vertices of the path appear in the kth
layered network. Let V_j be the jth layer of the kth layered network. We claim
that if v_a ∈ V_b then a ≥ b. This is proved by induction on a. For a = 0 (v_0 =
s) the claim is obviously true. Now, assume v_{a+1} ∈ V_c. If c ≤ b + 1 the induc-
tive step is trivial. But if c > b + 1 then the edge e_{a+1} has not been used in
the kth phase since it is not even in the kth layered network, in which only
edges between adjacent layers appear. If e_{a+1} has not been used and is useful
from v_a to v_{a+1} in the beginning of phase k + 1, then it was useful from v_a to
v_{a+1} in the beginning of phase k. Thus, v_{a+1} cannot belong to V_c (by the algo-
rithm). Now, in particular, t = v_{l_{k+1}}, and t ∈ V_{l_k}. Therefore, l_{k+1} ≥ l_k. Also,
equality cannot hold, because in this case the whole path is in the kth layered
network, and if all its edges are still useful in the beginning of phase k + 1
then the f̃ of phase k was not maximal.
If not all the vertices of the path appear in the kth layered network then let
v_a —e_{a+1}→ v_{a+1} be the first edge such that for some b, v_a ∈ V_b but v_{a+1} is not in
the kth layered network. Thus, e_{a+1} was not used in phase k. Since it is useful
in the beginning of phase k + 1, it was also useful in the beginning of phase
k. The only possible reason for v_{a+1} not to belong to V_{b+1} is that b + 1 = l_k.
By the argument of the previous paragraph a ≥ b. Thus a + 1 ≥ l_k, and
therefore l_{k+1} > l_k.
Q.E.D.
For every vertex v, let e_{v1}, e_{v2}, ..., e_{vp_v} be the edges entering v and let
e′_{v1}, e′_{v2}, ..., e′_{vq_v} be the edges emanating from v.
(1) For every v ∈ V, IP(v) ← Σ_{i=1}^{p_v} c̃(e_{vi}), OP(v) ← Σ_{i=1}^{q_v} c̃(e′_{vi}), m_v ← 1, m′_v ← 1.
(2) For every e in Ñ, f̃(e) ← 0.
(3) Perform the following operations:
(3.1) P(s) ← OP(s), P(t) ← IP(t).
(3.2) For v ∈ V − {s, t}, P(v) ← Min{IP(v), OP(v)}.
(3.3) Find a vertex v for which P(v) is minimum.
(4) If P(v) = 0, perform the following operations:
(4.1) If v = s or v = t, halt; the present f̃ is maximal.
(4.2) For every m_v ≤ m ≤ p_v, where u —e_{vm}→ v, if u ∈ V then OP(u) ←
OP(u) − (c̃(e_{vm}) − f̃(e_{vm})).
(4.3) For every m′_v ≤ m′ ≤ q_v, where v —e′_{vm′}→ u, if u ∈ V then IP(u)
← IP(u) − (c̃(e′_{vm′}) − f̃(e′_{vm′})).
(4.4) V ← V − {v} and go to (3).
(5) Find i for which v ∈ V_i. Let j ← i, k ← i. Also OF(v) ← P(v), IF(v) ←
P(v) and for every u ∈ V − {v} let OF(u) ← 0 and IF(u) ← 0.
(6) If k = l (the pushing from v to t is complete) go to (9).
(7) Assume V_k = {v_1, v_2, ..., v_{n_k}}.
(7.1) r ← 1.
(7.2) u ← v_r.
(7.3) If OF(u) = 0 then go to (7.5).
(7.4) (OF(u) > 0. Starting with e′_{um′_u}, we push the excess supply to-
wards t.)
(7.4.1) e ← e′_{um′_u} and assume u —e→ w.
First, we compute for every vertex v its in-potential IP(v), which is a local
upper bound on the flow which can enter v. Similarly, the out-potential
OP(v), is computed. For every vertex v other than s or t, the potential, P(v), is the
minimum of IP(v) and OP(v). For s, P(s) = OP(s) and for t, P(t) = IP(t).
Next, we find v for which P(v) is minimum.
The main idea is that we can easily find a flow of P(v) units which goes
from s to t via v. We use the edges emanating from v, one by one, saturating them as
long as the excess supply lasts, and pushing through them flow to the next
layer. For each of the vertices on higher layers we repeat the same process,
until all the P(v) units reach t. This is done in Steps (6) to (8). We can
never get stuck with too much excess supply in a vertex, since v is of mini-
mum potential. We then do the same while pulling the excess demand,
P(v), into v, from the previous layer, and then into it from the layer
preceding it, etc. This is done in Steps (9) to (11). When this is over, we
return to (3) to choose a vertex v for which P(v) is now minimum (Step (3)),
and repeat the pushing and pulling for it.
Clearly, when edges incident to a vertex are used, their in-potential and
out-potential must be updated. Also, variables IF(v) and OF(v) are used to
record the excess demand that should be flowed into v, and the excess
supply that should be flowed out of v, respectively. If P(v) = 0, none of v's
incident edges can be used anymore for flowing additional flow from s to t.
Thus, the in and out potentials of the adjacent vertices are updated accord-
ingly; this is done in Step (4). If P(s) or P(t) is zero, the flow is maximal,
and the algorithm halts (see (4.1)).
Every edge can be saturated once only (in (7.4.4) or (10.4.4)). The num-
ber of all other uses of edges (in (7.4.3) or (10.4.3)) can be bounded as
follows:
For every v, when P(v) is minimum and we push and pull for v, for every
u ≠ v we use at most one outgoing edge without saturating it (in (7.4.3))
or one incoming edge (in (10.4.3)). Thus, the number of edge-uses is
bounded by |E| + |V|^2 = O(|V|^2). Thus, the complexity of the MPM algo-
rithm for finding maximal flow in a layered network is O(|V|^2) and if we
use it, the maximum flow problem is solved in O(|V|^3) time, since the
number of phases is bounded by |V|.
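To make the phase structure concrete, here is a compact sketch of the whole algorithm in Python. It finds each phase's maximal flow by repeated DFS in the layered network rather than by the MPM procedure, so it realizes the O(|V|^2 · |E|) Dinic bound, not the O(|V|^3) MPM bound; the class and method names are ours:

```python
from collections import deque

class Dinic:
    """Phase-based maximum flow.  Each phase builds the layered network by
    BFS and then finds a maximal (blocking) flow in it -- here by repeated
    DFS rather than by MPM, giving O(|V|^2 |E|) in total."""

    def __init__(self, n):
        self.n = n
        self.adj = [[] for _ in range(n)]  # entries: [to, residual_cap, rev_index]

    def add_edge(self, u, v, c):
        self.adj[u].append([v, c, len(self.adj[v])])
        self.adj[v].append([u, 0, len(self.adj[u]) - 1])  # backward (useful) direction

    def _bfs(self, s, t):
        """Construct the layers; False means t is unreachable, i.e. the
        present total flow is maximum (Lemma 5.3)."""
        self.level = [-1] * self.n
        self.level[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v, c, _ in self.adj[u]:
                if c > 0 and self.level[v] == -1:
                    self.level[v] = self.level[u] + 1
                    q.append(v)
        return self.level[t] != -1

    def _dfs(self, u, t, f):
        """Push up to f units from u towards t along strictly increasing layers."""
        if u == t:
            return f
        while self.it[u] < len(self.adj[u]):
            e = self.adj[u][self.it[u]]
            v, c, rev = e
            if c > 0 and self.level[v] == self.level[u] + 1:
                d = self._dfs(v, t, min(f, c))
                if d > 0:
                    e[1] -= d
                    self.adj[v][rev][1] += d
                    return d
            self.it[u] += 1
        return 0

    def max_flow(self, s, t):
        F = 0
        while self._bfs(s, t):          # one phase per layered network
            self.it = [0] * self.n      # per-vertex edge pointer for this phase
            f = self._dfs(s, t, float('inf'))
            while f > 0:
                F += f
                f = self._dfs(s, t, float('inf'))
        return F
```

By Lemma 5.4 the length of the layered network grows strictly from phase to phase, so the outer loop runs at most |V| times.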
In the previous sections we have assumed that the flow in the edges is
bounded from above but the lower bound on all the edges is zero. The
significance of this assumption is that the assignment of f(e) = 0, for every
edge e, defines a legal flow, and the algorithm for improving the flow can be
started without any difficulty.
In this section, in addition to the upper bound, c(e), on the flow through e,
we assume that the flow is also bounded from below by b(e). Thus, f must
satisfy
        b(e) ≤ f(e) ≤ c(e).                                    (5.6)

Figure 5.8
The following method for testing whether a given network has a legal flow
function is due to Ford and Fulkerson [1]. In case of a positive answer, a flow
function is found.
The original network with graph G(V, E) and bounds b(e) and c(e) is
modified as follows:
(1) V̄ = {s̄, t̄} ∪ V.
s̄ and t̄ are new vertices, called the auxiliary source and sink, respec-
tively.
Networks with Upper and Lower Bounds 105
(2) For every v ∈ V construct an edge v → t̄ with an upper bound
        c̄(e) = Σ_{e ∈ β(v)} b(e),
where β(v) is the set of edges which emanate from v in G. The lower
bound is zero.
(3) For every v ∈ V construct an edge s̄ → v with an upper bound
        c̄(e) = Σ_{e ∈ α(v)} b(e),
where α(v) is the set of edges which enter v in G. The lower bound is zero.
(4) The edges of E remain in the new graph but the bounds change: The
lower bounds are all zero and the upper bound c̄(e) of e ∈ E is defined by
        c̄(e) = c(e) − b(e).
(5) Construct new edges s —e→ t and t —e′→ s with very high upper bounds c̄(e)
and c̄(e′) (= ∞) and zero lower bounds.
The resulting auxiliary network has a source s̄ and a sink t̄; s and t are re-
garded now as regular vertices which have to conform to the conservation
rule, i.e. condition C2.
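The five construction steps can be sketched as follows; `auxiliary_network` is a hypothetical helper name of ours, vertices are numbered 0..n−1, and `float('inf')` plays the role of the "very high" bounds of part (5):

```python
def auxiliary_network(n, edges, b, c, s, t):
    """Ford-Fulkerson auxiliary network for a network with lower bounds
    b(e) and upper bounds c(e).

    edges: list of (u, v) pairs; b, c: parallel lists of bounds.
    Returns the auxiliary source s_bar, sink t_bar, and a list of
    (u, v, capacity) triples.  A maximum flow from s_bar that saturates
    every edge emanating from s_bar certifies a legal flow (Theorem 5.2).
    """
    s_bar, t_bar = n, n + 1
    INF = float('inf')
    out_lb = [0] * n     # sum of lower bounds on edges emanating from v
    in_lb = [0] * n      # sum of lower bounds on edges entering v
    aux = []
    for (u, v), be, ce in zip(edges, b, c):
        out_lb[u] += be
        in_lb[v] += be
        aux.append((u, v, ce - be))      # part (4): bounds shifted to [0, c - b]
    for v in range(n):
        if out_lb[v]:
            aux.append((v, t_bar, out_lb[v]))   # part (2)
        if in_lb[v]:
            aux.append((s_bar, v, in_lb[v]))    # part (3)
    aux.append((s, t, INF))              # part (5): s and t become ordinary vertices
    aux.append((t, s, INF))
    return s_bar, t_bar, aux
```

Given a saturating maximum flow f̄ in this network, (5.7) below, f(e) = f̄(e) + b(e), recovers a legal flow in the original network.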
Let us demonstrate this construction on the graph shown in Fig. 5.9(a).
The auxiliary network is shown in Fig. 5.9(b). The upper bounds c̄(e) are
shown next to the edges to which they apply.
Now we can use the Ford and Fulkerson or the Dinic (with or without the
MPM improvement) algorithms to find a maximum flow in the auxiliary net-
work.
Theorem 5.2: The original network has a legal flow if and only if the max-
imum flow of the auxiliary network saturates all the edges which emanate
from s̄.
Clearly, if all the edges which emanate from s̄ are saturated, then so are all
the edges which enter t̄. This follows from the fact that each b(e), of the origi-
nal graph, contributes its value to the capacity of one edge emanating from s̄
and to the capacity of one edge entering t̄. Thus, the sum of capacities of
edges emanating from s̄ is equal to the sum of capacities of edges entering t̄.
Figure 5.9
For every e ∈ E define
        f(e) = f̄(e) + b(e).                                    (5.7)
Since
        0 ≤ f̄(e) ≤ c̄(e) = c(e) − b(e),
we have
        b(e) ≤ f(e) ≤ c(e),
satisfying (5.6).
Now let v ∈ V − {s, t}; α(v) is the set of edges which enter v in the original
network and β(v) is the set of edges which emanate from v in it. Let s̄ —σ→ v
and v —τ→ t̄ be the edges of the auxiliary network, as constructed in parts (3)
and (2). Clearly,
        Σ_{e ∈ α(v)} f̄(e) + f̄(σ) = Σ_{e ∈ β(v)} f̄(e) + f̄(τ).    (5.8)
By the assumption,
        f̄(σ) = c̄(σ) = Σ_{e ∈ α(v)} b(e)
and
        f̄(τ) = c̄(τ) = Σ_{e ∈ β(v)} b(e).
Thus,
        Σ_{e ∈ α(v)} f(e) = Σ_{e ∈ β(v)} f(e).                    (5.9)
This proves that C2 is satisfied too, and f is a legal flow function of the
original network.
The steps of this proof are reversible, with minor modifications. If f is a
legal flow function of the original network, we can define f̄ for the auxiliary
network by (5.7). Since f satisfies (5.6), by subtracting b(e), we get that f̄(e)
satisfies C1 in e ∈ E. Now, f satisfies (5.9) for every v ∈ V − {s, t}. Let f̄(σ)
= c̄(σ) and f̄(τ) = c̄(τ). Now (5.8) is satisfied and therefore condition C2
holds, while all the edges which emanate from s̄ are saturated. Finally, since
the net flow which emanates from s is equal to the net flow which enters t, we
can make both of them satisfy C2 by flowing this amount through the edges
of part (5) of the construction.
Q.E.D.
Let us demonstrate the technique for establishing whether the network has
a legal flow, and finding one in the case the answer is positive, on our exam-
ple (Fig. 5.9). First, we apply the Dinic algorithm on the auxiliary network
and end up with the flow, as in Fig. 5.10(a). The maximum flow saturates
all the edges which emanate from s̄, and we conclude that the original net-
work has a legal flow. We use (5.7) to define a legal flow in the original net-
work; this is shown in Fig. 5.10(b) (next to each edge e we write b(e), c(e),
f(e), in this order).
Once a legal flow has been found, we turn to the question of optimizing it.
First, let us consider the question of maximizing the total flow.
One can use the Ford and Fulkerson algorithm except that the backward
labeling must be redefined as follows:
If u is labeled and there is an edge v —e→ u with f(e) > b(e), then the label
that v gets is 'e'. In this case we define Δ(e) = f(e) − b(e).
We start the algorithm with the known legal flow. With this exception, the
algorithm is exactly as described in Section 5.1. The proof that when the
algorithm terminates the flow is maximum is similar too. We need to
redefine the capacity of a cut determined by S as follows:
        c(S) = Σ_{e ∈ (S; S̄)} c(e) − Σ_{e ∈ (S̄; S)} b(e).
It is easy to prove that the statement analogous to Lemma 5.2 still holds; for
every flow f with total flow F and every S,
        F ≤ c(S).                                              (5.10)
Now, the set of labeled vertices S, when the algorithm terminates, satisfies
(5.10) by equality. Thus, the flow is maximum and the indicated cut is
minimum.
The Dinic algorithm can be used too. The only change needed is in
the definition of a useful edge, part (ii): v —e→ u and f(e) > b(e), instead of
Figure 5.10
f(e) > 0. Also, in the definition of c̃(e), part (ii): If u ∈ V_{i−1}, v ∈ V_i and
v —e→ u then c̃(e) = f(e) − b(e).
Let us demonstrate the maximizing of the flow on our example, by
the Dinic algorithm. The layered network of the first phase for the net-
work, with legal flow, of Fig. 5.10(b) is shown in Fig. 5.11(a). The pair
c̃(e), f̃(e) is shown next to each edge. The new flow of the original network is
shown in Fig. 5.11(b). The layered network of the second phase is shown in
Fig. 5.11(c). The set S = {s, y} indicates a minimum cut, and the flow is
maximum.
In certain applications, what we want is a minimum flow, i.e. a legal flow
functionf for which the total flow F is minimum. Clearly, a minimum flow
from s to t is a maximum flow from t to s. Thus, our techniques solve this
problem too, by simply exchanging the roles of s and t. By the max-flow min-
cut theorem, the max-flow from t to s, F(t, s), is equal to a min-cut from t to
s. Therefore, there exists a T ⊂ V, t ∈ T, s ∉ T, such that
Clearly, every S yields a lower bound, c(S), on the flow F(s, t) and the
min-flow is equal to the max-cut.
Figure 5.11
PROBLEMS
5.1 Find a maximum flow in the network shown below. The number next to
each edge is its capacity.
5.2 In the following network, x_1, x_2, x_3 are all sources (of the same com-
    modity). The supply available at x_1 is 5, at x_2 is 10, and at x_3 is 5. The
    vertices y_1, y_2, y_3 are all sinks. The demand required at y_1 is 5, at y_2 is 10,
    and at y_3 is 5. Find out whether all the requirements can be met simul-
    taneously. (Hint: One way of solving this type of problem is to introduce
    an auxiliary source s and a sink t, connect s to each x_i through an edge of ca-
    pacity equal to x_i's supply; connect each y_i to t through an edge of capa-
    city equal to y_i's demand; find a maximum flow in the resulting net-
    work and observe if all the demands are met.)
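The reduction suggested in the hint can be sketched as follows (function and variable names are ours; edges are (u, v, capacity) triples):

```python
def single_source_sink(n, edges, sources, sinks):
    """Reduce a multi-source/multi-sink flow problem to a single-source one
    by attaching a super-source s and super-sink t, as in the hint above.

    sources/sinks: dicts mapping vertex -> supply/demand.
    All demands can be met simultaneously iff a maximum flow in the
    returned network saturates every new edge entering t.
    """
    s, t = n, n + 1
    new_edges = list(edges)
    for x, supply in sources.items():
        new_edges.append((s, x, supply))   # capacity = x's supply
    for y, demand in sinks.items():
        new_edges.append((y, t, demand))   # capacity = y's demand
    return s, t, new_edges
```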
5.4 (a) Describe an alternative labeling procedure, like that of Ford and
Fulkerson, for maximizing the flow, except that the labeling starts
at t, and if it reaches s an augmenting path is found.
(b) Demonstrate your algorithm on the following network.
(c) Describe a method of locating an edge which has the property that
increasing its capacity increases the maximum flow in the graph.
(Hint: One way of doing this is to use both source-to-sink and sink-
to-source labelings.) Demonstrate your method on the graph of (b).
(d) Does an edge like this always exist? Prove your claim.
5.5 Prove that in a network with a nonnegative lower bound b(e) for every
edge e but no upper bound (c(e) = o), there exists a legal flow if and
only if for every edge e either e is in a directed circuit or e is in a directed
path from s to t or from t to s.
5.6 Find a minimum flow from s to t for the network of Problem 5.1, where
    all the numbers next to the edges are now assumed to be lower bounds,
    and there are no upper bounds (c(e) = ∞).
5.7 The two networks shown have both lower and upper bounds on the flow
through the edges. Which of the two networks has no legal flow? Find
both a maximum flow and minimum flow if a legal flow exists. If no
legal flow exists display a set of vertices which neither includes the
source, nor the sink, and is either required to "produce" flow or to "ab-
sorb" it.
5.8 Prove that a network with lower and upper bounds on the flow in the
edges has no legal flow if and only if there exists a set of vertices which
neither includes the source, nor the sink, and is required to "produce"
flow or to "absorb" it.
REFERENCES
[1] Ford, L. R., Jr., and Fulkerson, D. R., Flows in Networks, Princeton University
Press, 1962.
[2] Edmonds, J., and Karp, R. M., "Theoretical Improvements in Algorithmic Ef-
ficiency for Network Flow Problems," J.ACM, Vol. 19, 1972, pp. 248-264.
[3] Dinic, E. A., "Algorithm for Solution of a Problem of Maximum Flow in a Net-
work with Power Estimation," Soviet Math. Dokl., Vol. 11, 1970, pp. 1277-1280.
[4] Karzanov, A. V., "Determining the Maximal Flow in a Network by the Method of
Preflows," Soviet Math. Dokl., Vol. 15, 1974, pp. 434-437.
[5] Cherkassky, B., "Efficient Algorithms for the Maximum Flow Problem", Akad.
Nauk USSR, CEMI, Mathematical Methods for the Solution of Economical
Problems, Vol. 7, 1977, pp. 117-126.
[6] Malhotra, V. M., Pramodh Kumar, M., and Maheshwari, S. N., "An O(|V|^3)
Algorithm for Finding Maximum Flows in Networks," Computer Science Pro-
gram, Indian Institute of Technology, Kanpur 208016, India, 1978.
Chapter 6
APPLICATIONS OF NETWORK
FLOW TECHNIQUES
6.1 ZERO-ONE NETWORK FLOW
Thus, |Ē| = |E|. Clearly, the useful edges of the layered network which is
constructed for G with the present flow, with their direction of usefulness, are
all edges of Ḡ.
However,
        F = Σ_{e ∈ (S; S̄)_G} f(e) − Σ_{e ∈ (S̄; S)_G} f(e).
Thus,
Lemma 6.2: The length of the layered network for the 0-1 network defined
by G(V, E) (with a given s and t) and zero flow everywhere is at most
|E|/M.
Proof: We remind the reader that V_i is the set of vertices of the ith layer of
the layered network, and E_i is the set of edges from V_{i−1} to V_i. Since f(e) = 0
for every e ∈ E, the useful directions are all forward. Thus, every E_i is equal to
(S; S̄)_G where S = V_0 ∪ V_1 ∪ ... ∪ V_{i−1}. Thus, by Lemma 5.1,
        M ≤ |E_i|.                                             (6.1)
Since the sets E_i are disjoint, l·M ≤ Σ_i |E_i| ≤ |E|, and therefore
        l ≤ |E|/M.                                             (6.2)
        F < M − |E|^{1/2}.
This layered network is identical with the one constructed for Ḡ with zero
flow everywhere. Thus, by Lemma 6.1,
        M̄ = M − F > |E|^{1/2}.
Lemma 6.3: Let G(V, E) define a 0-1 network of type 1 with maximum total
flow M from s to t. The length l of the first layered network, when the flow is
zero everywhere, is at most 2|V|/M^{1/2} + 1.
Proof: Let V_i be the set of vertices of the ith layer. Since there are no parallel
edges, the set of edges, E_{i+1}, from V_i to V_{i+1} in the layered network satisfies
|E_{i+1}| ≤ |V_i| · |V_{i+1}| for every i = 0, 1, ..., l − 1. Since each |E_i| is the
capacity of a cut, we get that
        M ≤ |V_i| · |V_{i+1}|.
Thus, either |V_i| ≥ M^{1/2} or |V_{i+1}| ≥ M^{1/2}; i.e., at least every other
layer contains M^{1/2} vertices or more. Clearly,
        ⌈l/2⌉ · M^{1/2} ≤ |V|.
Thus,
        l ≤ 2|V|/M^{1/2} + 1.
Theorem 6.2: For 0-1 networks of type 1, Dinic's algorithm has time com-
plexity O(|V|^{2/3} · |E|).
Proof: If M ≤ |V|^{2/3}, the result follows immediately. Let F be the total flow
when the layered network, for the phase during which the total flow reaches
the value M − |V|^{2/3}, is constructed. This layered network is identical with
the first layered network for Ḡ with zero flow everywhere. Ḡ may not be of
type 1 since it may have parallel edges, but it can have at most two parallel
edges from one vertex to another; if e_1 and e_2 are antiparallel in G, f(e_1) = 0
and f(e_2) = 1, then in Ḡ there are two parallel edges: e_1 and e_2′. A result
similar to Lemma 6.3 yields that l = O(|V|/M̄^{1/2}); since M̄ = M − F >
|V|^{2/3}, l = O(|V|^{2/3}).
Thus, the number of phases up to this point is O(|V|^{2/3}). Since the number of
phases from here to completion is at most |V|^{2/3}, the total number of phases
is O(|V|^{2/3}).
Q.E.D.
In certain applications, the networks which arise satisfy the condition that
for each vertex other than s or t, either there is only one edge emanating from
it or only one edge entering it. Such 0-1 networks are called type 2.
Lemma 6.4: Let the 0-1 network defined by G(V, E) be of type 2, with max-
imum total flow M from s to t. The length l of the first layered network, when
the flow is zero everywhere, is at most (|V| − 2)/M + 1.
Lemma 6.5: If the 0-1 network defined by G is of type 2 and if the present
flow function is f, then the corresponding Ḡ defines also a type 2 0-1 net-
work.
Theorem 6.3: For a 0-1 network of type 2, Dinic's algorithm is of time com-
plexity O(|V|^{1/2} · |E|).
Proof: If M ≤ |V|^{1/2}, then the number of phases is bounded by |V|^{1/2}, and
the result follows. Otherwise, consider the phase during which the total flow
reaches the value M − |V|^{1/2}. The layered network for this phase is identical with
Vertex Connectivity of Graphs 121
the first for Ḡ, with zero flow everywhere. Also, by Lemma 6.5, Ḡ is of type 2.
Thus, by Lemma 6.4, the length l̄ of the layered network is at most (|V| −
2)/M̄ + 1. Now, M̄ ≥ M − F > M − (M − |V|^{1/2}) = |V|^{1/2}.
Thus,
        l̄ ≤ (|V| − 2)/|V|^{1/2} + 1 = O(|V|^{1/2}).
Therefore, the number of phases up to this one is at most O(|V|^{1/2}). Since the
number of phases to completion is at most |V|^{1/2} more, the total number of
phases is at most O(|V|^{1/2}).
Q.E.D.
This is one of the variations of Menger's theorem [2]. It is not only reminis-
cent of the max-flow min-cut theorem, but can be proved by it. Dantzig and
Fulkerson [3] pointed out how this can be done, and we shall follow their ap-
proach.
two edges u″ → v′ and v″ → u′ in Ḡ. Define now a network, with digraph
Ḡ, source a″, sink b′, unit capacities for all the edges of the e_v type (let us
call them internal edges), and infinite capacity for all the edges of the e′ and
e″ type (called external edges). For example, in Fig. 6.1(b) the network for
G, as shown in Fig. 6.1(a), is demonstrated.
We now claim that p(a, b) is equal to the total maximum flow F (from a″
to b′) in the corresponding network. First, assume we have p(a, b) vertex
Figure 6.1
disjoint paths from a to b in G. Each such path, a — v_1 — v_2 — ... — v_{l−1} — b,
indicates a directed path in Ḡ:
        a″ → v_1′ → v_1″ → v_2′ → v_2″ → ... → v_{l−1}′ → v_{l−1}″ → b′.
These directed paths are vertex disjoint, and each can be used to flow one
unit from a″ to b′. Thus,
        F ≥ p(a, b).
        F ≤ p(a, b).
By the max-flow min-cut theorem, there exists an S such that F = c(S). Since
F is finite and the external edges have infinite capacity,
the set (S; S̄) consists of internal edges only. Now, every directed path from
a″ to b′ in Ḡ uses at least one edge of (S; S̄). Thus, every path from a to b in
G uses at least one vertex v such that e_v ∈ (S; S̄). Therefore, the set R =
{v | v ∈ V and e_v ∈ (S; S̄)} is an (a, b) vertex separator. Clearly |R| = c(S).
Thus, we have an (a, b) vertex separator whose cardinality is F, proving that
N(a, b) ≤ F = p(a, b).
Finally, it is easy to see that N(a, b) ≥ p(a, b), since every path from a to b
uses at least one vertex of the separator, and no two paths can use the same
one.
Q.E.D.
The algorithm suggested in the proof, for finding N(a, b), when the Dinic
algorithm is used to solve the network problem, is of time complexity
O(|V|^{1/2} · |E|). This results from the following considerations. The number
of vertices in Ḡ is 2|V|; the number of edges is |V| + 2|E|. Assuming |E|
≥ |V|, we have |V̄| = O(|V|) and |Ē| = O(|E|). Since we can assign unit
capacity to all the edges without changing the maximum total flow, the net-
work is of type 2. By Theorem 6.3, the algorithm is of time complexity
O(|V|^{1/2} · |E|). We can even find a minimum (a, b) vertex separator as
follows: Once the flow is maximum, change the capacity of the external edges
back to ∞ and apply the construction of the layered network. The set of ver-
tices which appear also in this layered network, S, defines a minimum cut
which consists of internal edges only. Let R be the vertices of G which corre-
spond to the internal edges in (S; S̄). R is a minimum (a, b) vertex separator
in G. This additional work is of time complexity O(|E|).
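The vertex-splitting construction can be sketched as follows, encoding v′ as 2v and v″ as 2v + 1 (the helper name and encoding are ours, not the book's):

```python
def vertex_connectivity_network(n, undirected_edges, a, b):
    """Dantzig-Fulkerson reduction for N(a, b): split every vertex v into
    v' (= 2v) and v'' (= 2v + 1), joined by an internal edge of capacity 1;
    each graph edge u -- v becomes external edges u'' -> v' and v'' -> u'
    of infinite capacity.  A maximum flow from a'' to b' in the returned
    network equals p(a, b) = N(a, b) (Theorem 6.4)."""
    INF = float('inf')
    net = []
    for v in range(n):
        net.append((2 * v, 2 * v + 1, 1))        # internal edge e_v
    for u, v in undirected_edges:
        net.append((2 * u + 1, 2 * v, INF))      # external edge u'' -> v'
        net.append((2 * v + 1, 2 * u, INF))      # external edge v'' -> u'
    return net, 2 * a + 1, 2 * b                 # source a'', sink b'
```

Feeding the returned triples into any unit-capacity maximum-flow routine yields N(a, b); the unit internal edges are what force the paths to be vertex disjoint.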
        c = Min_{a ≁ b} N(a, b),
where a ≁ b denotes pairs of vertices not connected by an edge.
Lemma 6.6: Min_{a, b} p(a, b) = Min_{a ≁ b} p(a, b).
Namely, the smallest value of p(a, b) occurs also for some two vertices a and
b which are not connected by an edge.
Proof: If G is completely connected then for every two vertices a and b, p(a,
b) =I VI - 1 and the theorem holds. If G is not completely connected then,
by definition
        c = Min_{a ≁ b} N(a, b).
By Theorem 6.4, Min_{a ≁ b} N(a, b) = Min_{a ≁ b} p(a, b). Now, by Lemma 6.6,
Min_{a ≁ b} p(a, b) = Min_{a, b} p(a, b).
Q.E.D.
        c = Min_{a, b} p(a, b),
Proof: The vertex (or edge) connectivity of a graph cannot exceed the degree
of any vertex. Thus,
        c ≤ Min_{v} d(v).
Also,
        Σ_{v ∈ V} d(v) = 2·|E|.
        c ≤ γ ≤ |V| − 2.                                       (6.3)
From there on γ can only decrease, but (6.3) still holds. Thus, for some k ≤
|V| − 1, the procedure will terminate. When it does, k ≥ γ + 1 ≥ c + 1.
By definition, c is equal to the cardinality of a minimum vertex separator R
of G. Thus, at least one of the vertices v_1, v_2, ..., v_k is not in R, say v_i. R
separates the remaining vertices into at least two sets, such that each path,
from a vertex of one set to a vertex of another, passes through at least one
vertex of R. Thus, there exists a vertex v such that N(v_i, v) ≤ |R| = c, and
therefore γ ≤ c.
Q.E.D.
(1) G is nonseparable.
(2) For every two vertices x and y there exists a simple circuit which goes
through both.
(3) For every two edges el and e2 there exists a simple circuit which goes
through both.
(4) For every two vertices x and y and an edge e there exists a simple path
from x to y which goes through e.
(5) For every three vertices x, y and z there exists a simple path from x to
z which goes through y.
(6) For every three vertices x, y and z there exists a simple path from x to z
which avoids y.
four new edges: u_1 — x — v_1 and u_2 — y — v_2. Clearly, none of the old vertices
become separation vertices by this change. Also, x cannot be a separation
vertex, or else u_1 or v_1 is a separation vertex in G′. (Here |V| > 2 is used.)
Thus, G′ is nonseparable. Hence, by the equivalence of (1) and (2), G′
satisfies (2). Therefore, there exists a simple circuit in G′ which goes through
x and y. This circuit indicates a circuit through e_1 and e_2 in G.
(3) → (1): Let x and y be any two vertices. Since G has no isolated vertices,
there is an edge e_1 incident to x and an edge e_2 incident to y. (If e_1 = e_2, choose
any other edge to replace e_2; the replacement need not even be incident to y;
the replacement exists since there is at least one other vertex, and it is not
*Many authors use the term biconnected to mean nonseparable. I prefer to call a
graph biconnected if c = 2.
**Namely, for every v E V, d(v) > 0. G has been assumed to have no self-loops.
Proof: Assume not. Then p(s, u) < k. By Theorem 6.4, there exists an (s, u)
vertex separator S, in Ḡ, such that |S| < k. Let R be the set of vertices such
that all paths, in Ḡ, from s to v ∈ R pass through at least one vertex of S.
Clearly, v_i ∉ R, since v_i is connected by an edge to s. However, since l ≥ k >
|S|, there exists some 1 ≤ i ≤ l such that v_i ∉ S. All paths from v_i to u go
through vertices of S. Thus, p(v_i, u) ≤ |S| < k, contradicting the assump-
tion.
Q.E.D.
Let V = {v_1, v_2, ..., v_n}. Let j be the least integer such that for some
i < j, p(v_i, v_j) < k in G.
Lemma 6.9: Let j be as defined above and Ḡ be the auxiliary graph for L =
{v_1, v_2, ..., v_{j−1}}. In Ḡ, p(s, v_j) < k.
(1) For every i and j such that 1 ≤ i < j ≤ k, check whether p(v_i, v_j) ≥ k. If
    for some i and j this test fails then halt; G's connectivity is less than k.
(2) For every k + 1 ≤ j ≤ n form Ḡ (with L = {v_1, v_2, ..., v_{j−1}}) and
    check whether in Ḡ, p(s, v_j) ≥ k. If for some j this test fails then
    halt; G's connectivity is less than k.
(3) Halt; the connectivity of G is at least k.
notice that this technique is of no help in our problem. The reason for that is
that even if G is undirected, the network we get for vertex connectivity testing
is directed.
        c = Min_{a ≠ b} N(a, b).
The lemma analogous to Lemma 6.6 still holds, and the proof goes along
the same lines. Also, the theorem analogous to Theorem 6.5 holds, and the
complexity it yields is the same. If G has no parallel edges a statement like
Lemma 6.7 holds and the procedure and the proof of its validity (Theorem
6.6) extend to the directed case, except that for each v, we compute both
N(v_1, v) and N(v, v_1).
The algorithm for testing k connectivity extends also to the directed case
and again all we need to change is that whenever p(a, b) was computed, we
now have to compute both p(a, b) and p(b, a).
Let us now consider the case of edge connectivity, both in graphs and
digraphs.
Let G( V, E) be an undirected graph. A set of edges, T, is called an (a, b)
edge separator if every path from a to b passes through at least one edge of T.
Let M(a, b) be the least cardinality of an (a, b) edge separator. Let p(a, b)
be now the maximum number of edge disjoint paths which connect a
with b.
Theorem 6.8: M(a, b) = p(a, b).
Connectivity of Digraphs and Edge Connectivity 131
The proof is similar to that of Theorem 6.4, only simpler. There is no need
to split vertices. Thus, in Ḡ, V̄ = V. We still represent each edge u — v of
G by two edges, e′: u → v and e″: v → u, in Ḡ. There is no loss of generality in
assuming that the flow function in Ḡ satisfies the condition that either f(e′) = 0 or
f(e″) = 0; for if f(e′) = f(e″) = 1 then replacing both by 0 does not change
the total flow. The rest of the proof raises no difficulties.
The edge connectivity, c, of a graph G is defined by c = Min_{a, b} M(a, b).
By Theorem 6.8 and its proof, we can find c by the network flow technique.
The networks we get are of type 1. Both Theorem 6.1 and Theorem 6.2 ap-
ply. Thus, each network flow problem is solvable by Dinic's algorithm with
complexity O(Min{|E|^{3/2}, |V|^{2/3} · |E|}).
Let T be a minimum edge separator in G; i.e. |T| = c. Let v be any vertex
of G. For every vertex v′ on the other side of T, M(v, v′) = c. Thus, in
order to determine c, we can use
        c = Min_{v′ ∈ V − {v}} M(v, v′).
Lemma 6.10: Let v_1, v_2, ..., v_n be a circular ordering (i.e. v_{n+1} = v_1) of the
vertices of a digraph G. The edge connectivity, c, of G satisfies
        c = Min_{1 ≤ i ≤ n} M(v_i, v_{i+1}).
Proof: Let T be a minimum edge separator in G. That means that there are
two vertices a and b such that T is an (a, b) edge separator. Define
Both in the case of graphs and digraphs we can test for k edge connectivity,
easily, in time complexity O(k · |V| · |E|). Instead of running each network
flow problem to completion, we terminate it when the total flow reaches k.
Each augmenting path takes O(|E|) time and there are |V| flow problems.
As we can see, testing for k edge connectivity is much easier than for k vertex
connectivity. The reason is that vertices cannot participate in the separating
set, which consists of edges.
We can also use this approach to determine the edge connectivity, c, in
time O(c · |V| · |E|). We run all the |V| network flow problems in parallel,
one augmenting path for each network in turn. When no augmenting path
exists in any of the |V| problems, we terminate. The cost increase is only in
space, since we need to store all |V| problems simultaneously. One can use
binary search on c to avoid this increase in space requirements, but in this
case the time complexity is O(c · |V| · |E| · log c).
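For undirected graphs, the early-termination scheme can be sketched as follows; `flow_at_least_k` is a hypothetical helper of ours that stops after k augmentations, and the outer test fixes one vertex and checks all others, as justified above:

```python
from collections import deque

def flow_at_least_k(n, arcs, s, t, k):
    """Test whether M(s, t) >= k by at most k augmenting-path searches.
    In a 0-1 network each augmentation adds one unit, so this is O(k |E|).
    arcs: directed (u, v) pairs, each of capacity 1."""
    cap = {}
    for u, v in arcs:
        cap[(u, v)] = cap.get((u, v), 0) + 1
        cap.setdefault((v, u), 0)          # residual (backward) capacity
    adj = [[] for _ in range(n)]
    for (u, v) in cap:
        adj[u].append(v)
    for _ in range(k):
        prev = [-1] * n                    # BFS for an augmenting path
        prev[s] = s
        q = deque([s])
        while q and prev[t] == -1:
            u = q.popleft()
            for v in adj[u]:
                if prev[v] == -1 and cap[(u, v)] > 0:
                    prev[v] = u
                    q.append(v)
        if prev[t] == -1:
            return False                   # maximum flow is below k
        v = t
        while v != s:                      # push one unit along the path
            u = prev[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
    return True

def edge_connectivity_at_least_k(n, undirected_edges, k):
    """k edge connectivity test for an undirected graph: it suffices to
    check M(v_1, v) >= k for a fixed v_1 and every other vertex v."""
    arcs = ([(u, v) for u, v in undirected_edges]
            + [(v, u) for u, v in undirected_edges])
    return all(flow_at_least_k(n, arcs, 0, v, k) for v in range(1, n))
```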
Proof: The theorem trivially holds for k = 1. We prove the theorem by in-
duction on k. Let us denote by δ_G(S) the number of edges in (S; S̄) in G. If H
Connectivity of Digraphs and Edge Connectivity 133
(1) a ∈ S,
(2) S ∪ V′ ≠ V,
(3) δ_{G−F}(S) = k − 1.
Let us show that if no such S exists then one can add any edge e ∈ (V′; V̄′)
to F. Clearly, F + e satisfies (i). Now, if (ii) does not hold then there exists
an S such that S ≠ V, a ∈ S and δ_{G−(F+e)}(S) < k − 1. It follows that δ_{G−F}(S)
< k. Now, by (ii), δ_{G−F}(S) ≥ k − 1. Thus, δ_{G−F}(S) = k − 1, and S satisfies
condition (3). Let u and v be vertices such that u —e→ v. Since δ_{G−(F+e)}(S) <
k − 1 and δ_{G−F}(S) = k − 1, v ∉ S. Also, v ∉ V′. Thus S ∪ V′ ≠ V, satisfy-
ing condition (2). Therefore, S satisfies all three conditions; a contradiction.
Now, let A be a maximal* set of vertices which satisfies (1), (2) and (3).
Since the edges of F all enter vertices of V'
By condition (3)
Assume e ∈ (S; S̄). It is not hard to prove that for every two subsets of V, S
and A,
        δ_{G−F}(S ∪ A) + δ_{G−F}(S ∩ A) ≤ δ_{G−F}(S) + δ_{G−F}(A),
The proof provides an algorithm for finding k edge disjoint directed trees
rooted at a. We look for a tree F such that Min_{v ∈ V − {a}} M(a, v) ≥ k − 1 in G −
F, by adding to F one edge at a time. For each candidate edge e we have to
check whether Min_{v ∈ V − {a}} M(a, v) ≥ k − 1 in G − (F + e). This can be done
by solving |V| − 1 network flow problems, each of complexity O(k · |E|).
Thus, the test for each candidate edge is O(k · |V| · |E|). No edge need be
considered more than once in the construction of F, yielding the time com-
plexity O(k · |V| · |E|^2). Since we repeat the construction k times, the whole
algorithm is of time complexity O(k^2 · |V| · |E|^2).
The following theorem was conjectured by Y. Shiloach and proved by
Even, Garey and Tarjan [11].
Corollary 6.1: If the edge connectivity of a digraph is at least 2 then for every
two vertices u and v there exists a directed circuit which goes through u and v
in which no edge appears more than once.
It is interesting to note that no such easy result exists in the case of vertex
connectivity and a simple directed circuit through two given vertices. In
Reference [11], a digraph with vertex connectivity 5 is shown such that for
two of its vertices there is no simple directed circuit which passes through
both. The author does not know whether any vertex connectivity will
guarantee the existence of a simple directed circuit through any two vertices.
lem. We shall present here its solution via network flow and show that its
complexity is O(|V|^1/2 · |E|). This result was first achieved by Hopcroft and
Karp [14].
Let us construct a network N(G). Its digraph Ḡ(V̄, Ē) is defined as follows:

V̄ = {s, t} ∪ V,
Ē = {s→x | x ∈ X} ∪ {y→t | y ∈ Y} ∪ {x→y | x—y in G}.
proof of Theorem 6.12. Actually, since there is only one edge entering x, with
unit capacity, the flow in x → y is bounded by 1.) The source is s and the sink
is t. For example consider the bipartite graph G shown in Fig. 6.2(a). Its cor-
responding network is shown in Fig. 6.2(b).
Figure 6.2
Maximum Matching in Bipartite Graphs 137
The proof indicates how the network flow solution can yield a maximum
matching. For our example, a maximum flow, found by Dinic's algorithm is
shown in Fig. 6.3(a) and its corresponding matching is shown in Fig. 6.3(b).
The algorithm of the proof is O(|V|^1/2 · |E|), by Theorem 6.3, since the
network is, clearly, of type 2.
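The matching itself can be extracted by the standard augmenting-path specialization of this flow computation. A Python sketch (the input convention, a dict from each vertex of X to its list of neighbours in Y, is an assumption; each successful call of the inner routine corresponds to one flow-augmenting path from s to t in N(G)):

```python
def max_bipartite_matching(graph):
    """Maximum matching in a bipartite graph, given as a dict mapping each
    vertex of X to the list of its neighbours in Y.  Equivalent to pushing
    unit flows through the network N(G) of the text, one augmenting path
    per vertex of X."""
    match = {}                      # y -> x currently matched to y

    def try_augment(x, seen):
        for y in graph[x]:
            if y in seen:
                continue
            seen.add(y)
            # y is free, or its current partner can be re-matched elsewhere
            if y not in match or try_augment(match[y], seen):
                match[y] = x
                return True
        return False

    size = 0
    for x in graph:
        if try_augment(x, set()):
            size += 1
    return size, match
```

This simple formulation runs in O(|V|·|E|); the O(|V|^1/2·|E|) bound of Hopcroft and Karp needs the phased (Dinic-style) search described in the text.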
Next, let us show that one can also use the max-flow min-cut theorem to
prove a theorem of Hall [15]. For every A ⊆ X, let Γ(A) denote the set of ver-
tices in Y adjacent to at least one vertex of A.
Figure 6.3
138 Applications of Network Flow Techniques
(i) There is a vertex s, called the start vertex, and a vertex t (≠ s), called the
termination vertex.
(ii) G has no directed circuits.
(iii) Every vertex v ∈ V − {s, t} is on some directed path from s to t.
Our first problem deals with the question of how soon the whole pro-
ject can be completed; i.e., what is the shortest time, from the moment the pro-
cesses represented by β(s) are started, until all the processes represented by
α(t) are completed. We assume that the resources for running the processes
are unlimited. For this problem to be well defined let us assume that each e E
E has an assigned length l(e), which specifies the time it takes to execute the
process represented by e. The minimum completion time can be found by the
following algorithm:
(1) Assign s the label 0 (λ(s) ← 0). All other vertices are "unlabeled".
(2) Find a vertex, v, such that v is unlabeled and all edges of α(v) emanate
from labeled vertices. Assign λ(v) ← Max {λ(u) + l(e) | e is an edge u → v}.
In Step (2), the existence of a vertex v, such that all the edges of α(v)
emanate from labeled vertices, is guaranteed by Conditions (ii) and (iii): If no
unlabeled vertex satisfies the condition then for every unlabeled vertex, v,
there is an incoming edge which emanates from another unlabeled vertex. By
repeatedly tracing back these edges, one finds a directed circuit. Thus, if no
such vertex is found then we conclude that either (ii) or (iii) does not hold.
It is easy to prove, by induction on the order of labeling, that λ(v) is the
minimum time in which all processes, represented by the edges of α(v), can
be completed.
The time complexity of the algorithm can be kept down to O(|E|) as
follows: For each vertex, v, we keep count of its incoming edges from
unlabeled vertices; this count is initially set to d_in(v); each time a vertex, u,
gets labeled we use the list β(u) to decrease the count for all v such that u
→ v, accordingly; once the count of a vertex v reaches 0, it enters a queue of
vertices to be labeled.
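The O(|E|) labeling scheme just described can be sketched as follows; the input format (a list of (u, v, length) triples) and the function name are assumptions:

```python
from collections import deque

def earliest_completion(vertices, edges, s):
    """Label each vertex v with lam(v), the earliest time at which all the
    processes entering v can be completed.  `edges` is a list of
    (u, v, length) triples of a PERT digraph.  Runs in O(|E|): a vertex
    enters the queue once its count of unlabeled predecessors reaches 0."""
    succ = {v: [] for v in vertices}      # beta(u): the edges leaving u
    count = {v: 0 for v in vertices}      # unlabeled incoming edges, d_in(v)
    for u, v, l in edges:
        succ[u].append((v, l))
        count[v] += 1
    lam = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v, l in succ[u]:
            lam[v] = max(lam.get(v, 0), lam[u] + l)
            count[v] -= 1
            if count[v] == 0:
                q.append(v)
    return lam
```

Recording, for each v, which edge attains the maximum would allow tracing a critical path back from t, as described below.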
Once the algorithm terminates, by going back from t to s, via the edge
which determined the label of the vertex, we can trace a longest path from s
to t. Such a path is called critical.* Clearly, there may be more than one criti-
cal path. If one wants to shorten the completion time, λ(t), then on each
critical path at least one edge length must be shortened.
*The whole process is sometimes called the Critical Path Method (CPM).
Figure 6.4
Two Problems on Pert Digraphs 141
Figure 6.5
and all the others a short length. Since no directed path leads from one edge
of S to another, they all will be operative simultaneously. This implies that
the number of processors required is at least |(T; T̄)|.
However, the flow can be decomposed into F directed paths from s to t,
where F is the minimum total flow, such that every edge is on at least one
such path (since f(e) ≥ 1 for every e ∈ E). This is demonstrated for our ex-
ample in Fig. 6.5(b). We can, now, assign to each processor all the edges of
one such path. Each such processor executes the processes, represented by
the edges of the path in the order in which they appear on the path. If one
process is assigned to more than one processor, then one of them executes
while the others are idle. It follows that whenever a process which cor-
responds to u → v is executable (since all the processes which correspond
to α(u) have been executed), the processor to which this process is assigned is
available for its execution. Thus, F processors are sufficient for our purpose.
Since F = |(T; T̄)|, by the min-flow max-cut theorem, the number of
processors thus assigned is minimum.
The complexity of this procedure is as follows. We can find a legal initial
flow in time O(|V|·|E|), by tracing for each edge a directed path from s to t
via this edge, and flowing one unit through it. This path is found by starting
from the edge, and going forward and backward from it until s and t are
reached. Next, we solve a maximum flow problem, from t to s, by the algo-
rithm of Dinic, using MPM, in time O(|V|³). Thus, the whole procedure is
of complexity O(|V|³), since |E| ≤ |V|².
PROBLEMS
(a) Describe an algorithm for achieving this goal, and make it as effi-
cient as possible. (Hint. Form a network as follows:
(b) Prove that for every k there exists a graph G such that c(G) ≥ k
and G has no Hamilton path. (See Problem 1.2.)
6.5 Let M be a matching of a bipartite graph. Prove that there exists a
maximum matching M' such that every vertex which is matched in M
is matched also in M'.
6.6 Let G(V, E) be a finite acyclic digraph with exactly one vertex s for
which d_in(s) = 0 and exactly one vertex t for which d_out(t) = 0. We
say that the edge a → b is greater than the edge c → d if and only if
there is a directed path, in G, from b to c. A set of edges is called a
slice if no edge in it is greater than another, and it is maximal; no other
set of edges with this property contains it. Prove that the following
three conditions on a set of edges, P, are equivalent:
(1) P is a slice.
(2) P is an (s, t) edge separator in which no edge is greater than any
other.
(3) P = (S; S̄) for some {s} ⊆ S ⊆ V − {t} such that (S̄; S) = ∅.
1 ≤ i ≤ m, ei ∈ Si.
6.8 Let π1 and π2 be two partitions of a set of m elements, each containing
exactly r disjoint subsets. We want to find a set of r elements such that
each of the subsets of π1 and π2 is represented.
(a) Describe an efficient algorithm to determine whether there is such
a set of r representatives.
(b) State a necessary and sufficient condition for the existence of such a
set, similar to Theorem 6.12.
6.9 Let G(V, E) be a completely connected digraph (see Problem 1.3);
it is called classifiable if V can be partitioned into two nonempty
classes, A and B, such that all the edges connecting between them are
directed from A to B. Let V = {V,, V2, . . ., v.} where the vertices
satisfy
d..t(vi) < d..t(V2) < . .- < dout(vj.
Prove that G is classifiable if and only if there exists a k < n such that
k
E8d..,(vi) =(2).
6.10 Let S be a set of people such that |S| ≥ 4. We assume that acquain-
tance is a mutual relationship. Prove that if in every subset of 4 people
there is one who knows all the others then there is someone in S who
knows everybody.
6.11 In the acyclic digraph, shown below, there are both AND vertices (de-
(a) For every three vertices x, y, z there exists a path, in which no edge
appears more than once, from x to z via y.
(b) For every three vertices x, y, z there exists a path, in which no edge
appears more than once, from x to z which avoids y.
REFERENCES
[1] Even, S., and Tarjan, R. E., "Network Flow and Testing Graph Connectivity,"
SIAM J. on Computing, Vol. 4, 1975, pp. 507-518.
[2] Menger, K., "Zur Allgemeinen Kurventheorie", Fund. Math., Vol. 10, 1927,
pp. 96-115.
[3] Dantzig, G. B., and Fulkerson, D. R., "On the Max-Flow Min-Cut Theorem
of Networks", Linear Inequalities and Related Systems, Annals of Math. Study
38, Princeton University Press, 1956, pp. 215-221.
References 147
[4] Hopcroft, J., and Tarjan, R. E., "Dividing a Graph into Triconnected Com-
ponents", SIAM J. on Computing, Vol. 2, 1973, pp. 135-158.
[5] Kleitman, D. J., "Methods for Investigating Connectivity of Large Graphs",
IEEE Trans. on Circuit Theory, CT-16, 1969, pp. 232-233.
[6] Even, S., "An Algorithm for Determining whether the Connectivity of a Graph
is at Least k", SIAM J. on Computing, Vol. 4, 1975, pp. 393-396.
[7] Gomory, R. E., and Hu, T. C., "Multi-Terminal Network Flows", J. of SIAM,
Vol. 9, 1961, pp. 551-570.
[8] Schnorr, C. P., "Multiterminal Network Flow and Connectivity in Unsym-
metrical Networks", Dept. of Appl. Math, University of Frankfurt, Oct. 1977.
[9] Edmonds, J., "Edge-Disjoint Branchings", in Combinatorial Algorithms,
Courant Inst. Sci. Symp. 9, R. Rustin, Ed., Algorithmics Press Inc., 1973, pp.
91-96.
[10] Lovász, L., "On Two Minimax Theorems in Graph Theory", to appear in J. of
Combinatorial Th.
[11] Even, S., Garey, M. R. and Tarjan, R. E., "A Note on Connectivity and Circuits
in Directed Graphs", unpublished manuscript (1977).
[12] Edmonds, J., "Paths, Trees, and Flowers", Canadian J. of Math., Vol. 17,
1965, pp. 449-467.
[13] Even, S. and Kariv, O., "An O(n^2.5) Algorithm for Maximum Matching in
General Graphs", 16-th Annual Symp. on Foundations of Computer Science,
IEEE, 1975, pp. 100-112.
[14] Hopcroft, J., and Karp, R. M., "An n^5/2 Algorithm for Maximum Matching in
Bipartite Graphs", SIAM J. on Computing, Vol. 2, 1973, pp. 225-231.
[15] Hall, P., "On Representation of Subsets", J. London Math. Soc. Vol. 10, 1935,
pp. 26-30.
[16] Dilworth, R. P., "A Decomposition Theorem for Partially Ordered Sets," Ann.
Math., Vol. 51, 1950, pp. 161-166.
Chapter 7
PLANAR GRAPHS
7.1 BRIDGES AND KURATOWSKI'S THEOREM
Consider a graph drawn in the plane in such a way that each vertex is
represented by a point; each edge is represented by a continuous line con-
necting the two points which represent its end vertices and no two lines,
which represent edges, share any points, except in their ends. Such a drawing
is called a plane graph. If a graph G has a representation in the plane which
is a plane graph then it is said to be planar.
In this chapter we shall discuss some of the classical work concerning
planar graphs. The question of efficiently testing whether a given finite graph
is planar will be discussed in the next chapter.
Let S be a set of vertices of a nonseparable graph G(V, E). Consider the
partition of the set V-S into classes, such that two vertices are in the same
class if and only if there is a path connecting them which does not use any
vertex of S. Each such class K defines a component as follows: The compo-
nent is a subgraph H(V', E') where V' D K. In addition, V' includes all the
vertices of S which are connected by an edge to a vertex of K, in G. E' con-
tains all edges of G which have at least one end vertex in K. An edge e = u—v,
where both u and v are in S, defines a singular component ({u, v}, {e}).
Clearly, two components share no edges, and the only vertices they can share
are elements of S. The vertices of a component which are elements of S are
called its attachments.
In our study we usually use a set S which is the set of vertices of a simple
circuit C. In this case we call the components bridges; the edges of C are not
considered bridges.
For example, consider the plane graph shown in Fig. 7.1. Let C be the out-
side boundary: a – b – c – ··· – g – a. In this case the bridges are:
({e, g}, {8}), ({h, i, j, a, e, g}, {9, 10, 11, 12, 13, 14} ), ({a, e}, {15}) and
({k, b, c, d}, {16, 17, 18} ). The first and third bridges are singular.
Figure 7.1
disjoint paths in B: P1(v, a1), P2(v, a2) and P3(v, a3). (P(a, b) denotes a
path connecting a to b.)
Proof: Let a1 — v1, a2 — v2, a3 — v3 be edges of B. If any of the vi's (i = 1,
2, 3) is an attachment then the corresponding edge is a singular bridge and is
not part of B. Thus, vi ∈ K, where K is the class which defines B. Hence,
there is a simple path P'(v1, v2) which uses vertices of K only; if v1 = v2, this
path is empty. Also, there is a simple path P"(v3, v1) which uses vertices of K
only. Let v be the first vertex of P"(v3, v1) which is also on P'. Now, let P1(v,
a1) be the part of P' which leads from v to v1, concatenated with v1 — a1;
P2(v, a2) be the part of P' which leads from v to v2, concatenated with v2 —
a2; P3(v, a3) be the part of P" which leads from v to v3, concatenated with v3
— a3. It is easy to see that these paths are disjoint.
Q.E.D.
Let C be a simple circuit of a nonseparable graph G, and B1, B2, . . ., Bk
be the bridges with respect to C. We say that Bi and Bj interlace if at least
one of the following conditions holds:
(i) There are two attachments of Bi, a and b, and two attachments of Bj, c
and d, such that all four are distinct and they appear on C in the order a,
c, b, d.
(ii) There are three attachments common to Bi and Bj.
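Both conditions can be checked mechanically from the attachment sets. A brute-force Python sketch (names and input conventions are assumptions); condition (i) amounts to a pair of attachments of Bi and a pair of attachments of Bj alternating around C:

```python
from itertools import combinations

def interlace(order, att_i, att_j):
    """Decide whether two bridges interlace with respect to a circuit C.
    `order` lists the vertices of C in cyclic order; att_i and att_j are
    the attachment sets of the two bridges."""
    if len(set(att_i) & set(att_j)) >= 3:
        return True                       # condition (ii)
    n = len(order)
    pos = {v: k for k, v in enumerate(order)}
    for a, b in combinations(att_i, 2):
        for c, d in combinations(att_j, 2):
            if len({a, b, c, d}) < 4:
                continue
            # condition (i): the pairs {a, b} and {c, d} alternate on C,
            # i.e. exactly one of c, d lies on the arc from a to b
            qb = (pos[b] - pos[a]) % n
            qc = (pos[c] - pos[a]) % n
            qd = (pos[d] - pos[a]) % n
            if (qc < qb) != (qd < qb):
                return True
    return False
```

A quadratic scan over attachment pairs suffices for illustration; the planarity algorithms of Chapter 8 avoid ever building this relation explicitly.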
150 Planar Graphs
For each bridge Bi, consider the subgraph C + Bi. If any of these graphs
is not planar then clearly G is not planar. Now, assume all these subgraphs
are planar. In every plane realization of G, C outlines a contour which
divides the plane into two disjoint parts: its inside and outside. Each bridge
must lie entirely in one of these parts. Clearly, if two bridges interlace they
cannot be on the same side of C. Thus, in every plane realization of G the set
of bridges is partitioned into two sets: those which are drawn inside C and
those which are drawn outside. No two bridges in the same set interlace.
(2) The set of bridges can be partitioned into two subsets, such that no two
bridges in the same subset interlace.
Theorem 7.1: A graph G is 2-colorable if and only if it has no odd length cir-
cuits.
Proof: It is easy to see that if a graph has an odd length circuit then it is not
2-colorable. In order to prove the converse, we may assume that G is con-
nected; for if each component is 2-colorable then the whole graph is
2-colorable.
Let v be any vertex. Let us perform BFS (Section 1.5) starting from v.
There cannot be an edge u - w in G, if u and w belong to the same layer; i.e.
are the same distance from v. For if such an edge exists then we can display
an odd length circuit as follows: Let P1(v, u) be a shortest path from v to u.
P2(v, w) is defined similarly and is of the same length. Let x be the last vertex
which is common to P1 and P2. The part of P1 from x to u, and the part of P2
from x to w are of equal length, and together with u — w they form an odd
length simple circuit.
Now, we can color all the vertices of even distance from v with one color
and all the vertices of odd distance from v with a second color.
Q.E.D.
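The proof is constructive. A BFS sketch in Python (the adjacency-dict format is an assumption; only the component of the start vertex is examined, which suffices for a connected graph):

```python
from collections import deque

def two_color(adj):
    """Try to 2-color a connected graph via BFS, as in the proof: vertices
    at even distance from the start get color 0, odd distance color 1.
    Returns the coloring, or None if some edge joins two vertices of the
    same layer (which yields an odd circuit).  `adj` maps each vertex to
    the list of its neighbours."""
    start = next(iter(adj))
    color = {start: 0}
    q = deque([start])
    while q:
        u = q.popleft()
        for w in adj[u]:
            if w not in color:
                color[w] = 1 - color[u]
                q.append(w)
            elif color[w] == color[u]:
                return None           # u, w in the same layer: odd circuit
    return color
```

A triangle is rejected, while any path or even circuit receives a proper 2-coloring.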
Let us now introduce the graphs of Kuratowski [2]: K3,3 and K5. They are
shown in Fig. 7.2(a) and (b) respectively.
K5 is a completely connected graph of 5 vertices, or a clique of 5 vertices.
K3,3 is a completely connected bipartite graph with 3 vertices on each side.
Figure 7.2
Bridges and Kuratowski's Theorem 153
Figure 7.3
each point of the plane vertically up to the surface of a sphere whose center is
in the plane and whose intersecting circle with the plane encircles G. Next, place
the sphere on a plane which is tangent to it, in such a way that a point in the
face whose contour is W is the "north pole", i.e. furthest from the plane.
Project each point P (other than the "north pole") of the sphere to the plane
by a straight line which starts from the "north pole" and goes through P. The
graph is now drawn in the plane and W is the external window.
Two graphs are said to be homeomorphic if both can be obtained from the
same graph by the insertion of new vertices of degree 2, in edges; i.e. an edge
is replaced by a path whose intermediate vertices are all new.* Clearly, if two
graphs are homeomorphic then either both are planar or both are not. We
are now ready to state Kuratowski's Theorem [1].
*Two graphs G1(V1, E1) and G2(V2, E2) are said to be isomorphic if there are
1-1 correspondences f: V1 → V2 and g: E1 → E2 such that an edge e connects u
and v in G1 if and only if g(e) connects f(u) and f(v) in G2. Clearly G1 is planar
if and only if G2 is. Thus, we are not interested in the particular names of the
vertices or edges, and distinguish between graphs only up to isomorphism.
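Homeomorphism can be tested by reversing the insertions: repeatedly suppress vertices of degree 2 and compare the reduced graphs up to isomorphism. A rough Python sketch (the function name and edge-list format are assumptions; a vertex carrying a self-loop is left alone):

```python
from collections import defaultdict

def suppress_degree_two(edge_list):
    """Repeatedly replace a vertex of degree 2 (not on a self-loop) and
    its two incident edges by one edge joining its neighbours.  Two
    homeomorphic graphs reduce to isomorphic multigraphs.  `edge_list`
    is a list of (u, v) pairs; parallel edges may appear in the result."""
    edges = [tuple(e) for e in edge_list]
    while True:
        deg = defaultdict(int)
        for u, v in edges:
            deg[u] += 1
            deg[v] += 1
        target = next((x for x in deg
                       if deg[x] == 2 and all(u != v for u, v in edges
                                              if x in (u, v))), None)
        if target is None:
            return edges
        incident = [e for e in edges if target in e]
        (a, b), (c, d) = incident
        n1 = a if b == target else b      # the neighbour on the first edge
        n2 = c if d == target else d      # the neighbour on the second edge
        edges = [e for e in edges if target not in e] + [(n1, n2)]
```

For example, a path reduces to a single edge, and a circuit of any length reduces to a single self-loop, reflecting that all circuits are homeomorphic.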
Figure 7.4
Case 1.2: B* has no attachment on C(b1, b0). Thus, B* has one attachment
b2' on C(a0, b1]; i.e., it may be b1 but not a0. Also, B* has an attachment
b2" on C[b0, a1). By Lemma 7.1, there exists a vertex v and three vertex dis-
joint paths in B*: P1(v, a2), P2(v, b2') and P3(v, b2"). The situation is
shown in Fig. 7.7. If we erase from the subgraph of G, shown in Fig. 7.7, the
path C[b1, b0] and all its intermediate vertices, the resulting subgraph is
Figure 7.5
homeomorphic to K3,3: Vertices a2, b2' and b2" play the role of the upper
vertices, and a0, a1 and v, the lower vertices.
Case 2: B* has no attachments other than a0, b0, a1, b1. In this case all four
must be attachments; for if a0 or b0 are not, then B* and e1 do not interlace;
if a1 or b1 are not, then B* does not prevent the drawing of e0.
Case 2.1: There is a vertex v, in B*, from which there are four disjoint paths
in B*: P1(v, a0), P2(v, b0), P3(v, a1) and P4(v, b1). This case is shown in
Fig. 7.8, and the shown subgraph is clearly homeomorphic to K5.
Figure 7.6
Figure 7.7
Figure 7.8
Figure 7.9
7.2 EQUIVALENCE
Proof: If C is a window of G with more than one bridge, then all C's bridges
are external. Therefore, no two bridges interlace. As in the first paragraph of
the proof of Lemma 7.2, there exists a bridge B whose attachments can be
ordered a1, a2, . . ., ar and no attachments of any other bridge appear on
C(a1, ar). It is easy to see that {a1, ar} is a separating pair; it separates the
vertices of B and C(a1, ar) from the set of vertices of all other bridges and
C(ar, a1), where neither set can be empty since G has no parallel edges.
Q.E.D.
Euler's Theorem 161
Figure 7.10
|V| + f − |E| = 2. (7.1)
Proof: By induction on |E|. If |E| = 0 then G consists of one vertex and
there is one face, and (7.1) holds. Assume the theorem holds for all graphs
with |E| = m. Let G(V, E) be a connected plane graph with m + 1 edges. If
G contains a circuit then we can remove one of its edges. The resulting plane
graph is connected and has m edges, and therefore, by the inductive
hypothesis, satisfies (7.1). Adding back the edge increases the number of
faces by one and the number of edges by one, and thus (7.1) is maintained. If
G contains no circuits, then it is a tree. By Lemma 2.1 it has at least two
leaves. Removing a leaf and its incident edge yields a connected graph with
one less edge and one less vertex which satisfies (7.1). Therefore, G satisfies
(7.1) too.
Q.E.D.
The theorem implies that all connected plane graphs with I VI vertices and
IEl edges have the same number of faces. There are many conclusions one
can draw from the theorem. Some of them are the following:
Proof: Since there are no parallel edges, every window consists of at least
three edges. Each edge appears on the windows of two faces, or twice on the
window of one face. Thus, 3·f ≤ 2·|E|. By (7.1), f = |E| − |V| + 2.
Thus, 3(|E| − |V| + 2) ≤ 2·|E|, which yields |E| ≤ 3·|V| − 6, and (7.2)
follows.
Q.E.D.
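Inequality (7.2) yields a cheap necessary test for planarity, used again in Chapter 8. A one-line Python sketch (the function name is an assumption):

```python
def passes_edge_count(n_vertices, n_edges):
    """Necessary condition (7.2): a planar graph with no parallel edges,
    no self-loops and at least 3 vertices has |E| <= 3|V| - 6."""
    return n_edges <= 3 * n_vertices - 6
```

K5, with 5 vertices and 10 edges, fails the test outright; K3,3, with 6 vertices and 9 edges, passes it and yet is nonplanar, so the condition is necessary but not sufficient.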
Corollary 7.2: Every connected plane graph with no parallel edges and no
self-loops has at least one vertex whose degree is 5 or less.
Proof: Assume the contrary; i.e. the degree of every vertex is at least 6.
Thus, 2·|E| ≥ 6·|V|; note that each edge is counted in each of its two end
vertices. This contradicts (7.2).
Q.E.D.
7.4 DUALITY
A set of edges K is called a cutset if the removal of K from G interrupts its
connectivity, but no proper subset of K does it. It is easy
to see that a cutset separates G into two connected components: Consider
first the removal of K - {e}, where e E K. G remains connected. Now
remove e. Clearly G breaks into two components.
The graph G2(V2, E2) is said to be the dual of a connected graph G1(V1,
E1) if there is a 1-1 correspondence f: E1 → E2, such that a set of edges S
forms a simple circuit in G1 if and only if f(S) (the corresponding set of edges
in G2) forms a cutset in G2. Consider the graph G1 shown in Fig. 7.11(a). G2,
shown in Fig. 7.11(b), is a dual of G1, but so is G3, shown in Fig. 7.11(c), as
the reader can verify by considering all (six) simple circuits of G1 and all
cutsets of G2, or G3.
Figure 7.11
tion: Delete the edge e and merge x with y. The new contracted graph, G',
has one less edge and one less vertex, if x ≠ y. Clearly, if G is connected so is
G'. A graph G' is a contraction of G if by repeated contractions we can con-
struct G' from G.
(i) A deletion of an edge e of the present graph, which is not in G1', and
whose deletion does not interrupt the connectivity of the present graph.
(ii) A deletion of a leaf of the present graph, which is not a vertex of G1', to-
gether with its incident edge.
We want to show that each of the resulting graphs, starting with G, and
ending with G1', has a dual.
Let G be one of these graphs, except the last, and let Gd be its dual. First con-
sider a deletion of type (i), of an edge e. Contract f(e) in Gd, to get Gdc. If C
is a simple circuit in G − e, then clearly it cannot use e, and therefore it is a
circuit in G too. The set of edges of C is denoted by S. Thus, f(S) is a cutset of
Gd, and it does not include f(e). Thus, the end vertices of f(e) are in the same
component of Gd with respect to f(S). It follows that f(S) is a cutset of Gdc
too. If K is a cutset of Gdc then it is a cutset of Gd too. Thus f⁻¹(K) forms a
simple circuit C' in G. However, f(e) is not in K, and therefore e is not in C'.
Hence, C' is a simple circuit of G − e.
Next, consider a deletion of type (ii) of a leaf v and its incident edge e.
Clearly, e plays no role in any circuit. Thus, f(e) cannot be a part of a cutset in
Gd. Hence, f(e) is a self-loop. The deletion of v and e from G, and the con-
traction of f(e) in Gd (which, effectively, only deletes f(e) from Gd), does not
change the sets of simple circuits in G, and cutsets in Gd, and the cor-
respondence is maintained.
Q.E.D.
Lemma 7.7: Let G be a connected graph and el, e2 be two of its edges,
neither of which is a self loop. If for every cutset either both edges are in it or
both are not, then el and e2 are parallel edges.
Proof: If e1 and e2 are not parallel, then there exists a spanning tree which
includes both. (Such a tree can be found by contracting both edges and
Duality 165
Lemma 7.8: Let G be a connected graph with a dual Gd and let f be the
1-1 correspondence of their edges. If u — x1 — x2 — ··· — x(l−1) — v is a sim-
ple path or circuit in G, with edges e1, e2, . . ., el, such that x1, x2, . . ., x(l−1)
are all of degree two, then f(e1), f(e2), . . ., f(el) are parallel edges in Gd.
Proof: Every circuit of G which contains one ei, 1 ≤ i ≤ l, contains all the
rest. Thus, in Gd, if one edge, f(ei), is in a cutset then all the rest are. By
Lemma 7.7, they are parallel edges.
Q.E.D.
twice. In this case there will be two lines from pi hitting e from both direc-
tions. These two lines can be made to end in the same point on e, thus
creating a closed curve which goes through pi and crosses e. If e is not a
separating edge then it appears on two windows, say Wi and Wj. In this case
we can make the line from pi to e meet e at the same point as does the line
from pj to e, to form a line connecting pi with pj which crosses e. None of the
lines thus formed crosses another, and we have one per edge of G.
Now define G2(V2, E2) as follows: The set of lines connecting the pi's is a
representation of E2. The 1-1 correspondence f: E1 → E2 is defined as
follows: f(e) is the edge of G2 which crosses e. Clearly, the drawing is a plane
graph which is a realization of a graph G2(V2, E2). It remains to show that
there is a 1-1 correspondence of the simple circuits of G1 to the cutsets of G2.
The construction described above is demonstrated in Fig. 7.12, where G1 is
shown in solid lines, and G2 is shown in dashed lines.
Let C be a simple circuit of G1. Clearly, in the plane realization, C describes
a simple closed curve in the plane. There must be at least one vertex of G2
inside this circuit, since at least one edge of G2 crosses the circuit, and it
crosses the circuit exactly once. The same argument applies to the outside
too. This implies that f(S), where S is the set of edges of C, forms a separating
set of G2. Let us postpone the proof of the minimality of f(S) for a little while.
Now, let K be a cutset of G2. Let T and T̄ be the sets of vertices of the two
Figure 7.12
There are many facts about duality which we have not discussed. Among
them are the following:
(1) If Gd is a dual of G then G is a dual of Gd.
(2) A 3-connected planar graph has a unique dual.
PROBLEMS
7.3 Use Kuratowski's theorem to prove that the Petersen graph, shown
below, is nonplanar.
REFERENCES
[1] König, D., Theorie der endlichen und unendlichen Graphen. Leipzig, 1936.
Reprinted Chelsea, 1950.
[2] Kuratowski, K., "Sur le Probleme des Courbes Gauches en Topologie", Fund.
Math., Vol. 15, 1930, pp. 217-283.
[3] Parsons, T. D., "On Planar Graphs", Am. Math. Monthly, Vol. 78, No. 2, 1971,
pp. 176-178.
[4] Harary, F., Graph Theory, Addison Wesley, 1969.
[5] Ore, O., The Four Color Problem, Academic Press, 1967.
[6] Wilson, R. J., Intr. to Graph Theory, Longman, 1972.
Chapter 8
There are two known planarity testing algorithms which have been shown
to be realizable in a way which achieves linear time, O(|V|). The idea in
both is to follow the decisions to be made during the planar construction of
the graph, piece by piece, as to the relative location of the various pieces.
The construction is not carried out explicitly because there are difficulties,
such as crowding of elements into a relatively small portion of the area allo-
cated, that, as yet, we do not know how to avoid. Also, an explicit drawing of
the graph is not necessary, as we shall see, to decide whether such a drawing
is possible. We shall imagine that such a realization is being carried out,
but will only decide where the various pieces are laid, relative to each other,
and not their exact shape. Such decisions may change later in order to
make room for later additions of pieces. In both cases it was shown that
the algorithm terminates within O(|V|) steps, and if it fails to find a "reali-
zation" then none exists.
The first algorithm starts by finding a simple circuit and adding to it
one simple path at a time. Each such new path connects two old vertices
via new edges and vertices. (Whole pieces are sometimes flipped over, around
some line). Thus, we call it the path addition algorithm. The basic ideas
were suggested by various authors, such as Auslander and Parter [1] and
Goldstein [2], but the algorithm in its present form, both from the graph
theoretic point of view, and complexity point of view, is the contribution of
Hopcroft and Tarjan [3]. They were the first to show that planarity testing
can be done in linear time.
The second algorithm adds in each step one vertex. Previously drawn
edges incident to this vertex are connected to it, and new edges incident to
it are drawn and their other endpoints are left unconnected. (Here too,
sometimes whole pieces have to be flipped around or permuted). The algo-
rithm is due to Lempel, Even and Cederbaum [4]. It consists of two parts.
The first part was shown to be linearly realizable by Even and Tarjan [5];
172 Testing Graph Planarity
the second part was shown to be linearly realizable by Lueker and Booth [6].
We call this algorithm the vertex addition algorithm.
Each of these algorithms can be divided into its graph theoretic part and
its data structures and their manipulation. The algorithms are fairly com-
plex and a complete description and proof would require a much more
elaborate exposition. Thus, since this is a book on graphs and not on pro-
gramming, I have chosen to describe in full the graph theoretic aspects of
both algorithms, and only briefly describe the details of the data manipula-
tion techniques. An attempt is made to convince the reader that the algo-
rithms work, but in order to see the details which make them linear he will
have to refer to the papers mentioned above.
Throughout this chapter, for reasons explained in Chapter 7, we shall
assume that G(V, E) is a finite undirected graph with no parallel edges
and no self loops. Also, we shall assume that G is nonseparable. The first
thing that we can do is check whether |E| ≤ 3·|V| − 6. By Corollary 7.1,
if this condition does not hold then G is nonplanar. Thus, we can restrict
our algorithms to the cases where |E| = O(|V|).
From now on, we refer to the vertices by their k(v) number; i.e. we change
the name of v to k(v).
Let A (v) be the adjacency list of v; i.e. the list of edges incident from v.
We remind the reader that after the DFS each of the edges is directed; the
tree edges are directed from low to high and the back edges are directed
from high to low. Thus, each edge appears once in the adjacency lists.
Now, we want to reorder the adjacency lists, but first, we must define an
order on the edges. Let the value φ(e) of an edge u → v be defined as
follows:
Lemma 8.1: The paths finding algorithm finds first a circuit C which con-
sists of a path from 1 (the root) to some vertex v, via tree edges, and a
back edge from v to 1.
Proof: Let 1 → u be the first edge of the tree to be traced (in the first appli-
cation of Step (3)). We assume that G is nonseparable and |V| > 2. Thus,
by Lemma 3.7, this edge is the only tree edge out of 1, and u = 2. Also, 2
has some descendants, other than itself. Clearly, 2 → 3 is a tree edge. By
Lemma 3.5, L1(3) < 2, i.e. L1(3) = 1. Thus L1(2) = 1. The reordering
of the adjacency lists assures that the first path to be chosen out of 1 will
lead back to 1 as claimed.
Q.E.D.
Lemma 8.2: Each generated path P is simple and it contains exactly two
vertices in common with previously generated paths; they are the first
vertex f, and the last vertex l.
Proof: The edge scanning during the paths finding algorithm is in a DFS
manner, in accord with the structure of the tree (but not necessarily in the
same scanning order of vertices). Thus, a path starts from some (old) vertex
f, goes along tree edges, via intermediate vertices which are all new, and
ends with a back edge which leads to l. Since back edges always lead to
ancestors, l is old. Also, by the reordering of the adjacency lists and the
assumption that G is nonseparable, l must be lower than f. Thus, the path
is simple.
Q.E.D.
Path Addition Algorithm of Hopcroft and Tarjan 175
Lemma 8.3: Let f and l be the first and last vertices of a generated path P
and f → v be its first edge.
(i) If v ≠ l then L1(v) = l.
(ii) l is the lowest vertex reachable from Sf via a back edge which has not
been used in any path yet.
So far, we have not made any use of L2(v). However, the following
lemma relies on it.
Proof: The Lemma follows from the fact that the paths finding algorithm
is a DFS. First C is found. We then backtrack from a vertex vi only if all
its descendants have been scanned. No internal part of B can be scanned
before we backtrack into vi. There must be a tree edge vi → u, where u
belongs to B, for the following reasons. If all the edges of B, incident to vi,
are back edges, they all must come from descendants or go to ancestors of
vi (see Lemma 3.4). An edge from vi to one of its ancestors (which must be
on C) is a singular bridge and is not part of B. An edge from a descendant
w of vi to vi implies that w cannot be in B, for it has been scanned already,
and we have observed that no internal part of B can be scanned before we
backtrack into vi. If any other edge vk → x of B is also a tree edge then,
by the definition of a bridge, there is a path connecting u and x which is
vertex disjoint from C. Along this path there is at least one edge which
contradicts Lemma 3.4.
Q.E.D.
Proof: By Lemma 8.6, there is only one edge through which B is entered.
Since eventually the whole graph is scanned, and no edge is scanned twice
in the same direction, the corollary follows.
Q.E.D.
Assume C and the bridges explored from vertices higher than vi have
already been explored and drawn in the plane. The following lemma pro-
vides a test for whether the next generated path could be drawn inside; the
answer is negative even if the path itself can be placed inside, but it is
already clear that the whole bridge to which it belongs cannot be placed
there.
then there cannot be any inside edge incident to these vertices, since bridges
to be explored from vertices lower than vi have not been scanned yet. Thus,
P can be drawn inside if it is placed sufficiently close to C.
Now, assume there is a back edge w → vk, drawn inside, for which
j < k < i. Let P′ be the directed path from vp to vk whose last edge is the
back edge w → vk. Clearly vp is on C and p ≥ i; P′ is not necessarily gen-
erated in one piece by the path finding algorithm, if it is not the first path
to be generated in the bridge B′ to which it belongs.
Case 1: p > i. The bridges B and B' interlace by part (i) of the definition
of interlacement. Thus, B cannot be drawn inside.
Case 2.1: q < j. Since vi and vj are attachments of B, vk and vq are attach-
ments of B′, and q < j < k < i, the two bridges interlace. Thus, B cannot
be drawn inside.
Case 2.2: q = j. P″ cannot consist of a single edge, for in this case it is a
singular bridge and vk is not one of its attachments. Also, L2(x1) ≤ vk.
Thus, L2(x1) < vi. By Lemma 8.5, u1 ≠ vj and L2(u1) < vi. This implies
that B and B′ interlace by either part (i) or part (ii) of the definition of
interlacement, and B cannot be drawn inside.
Q.E.D.
The algorithm assumes that the first path of the new bridge B is drawn
inside C. Now, we use the results of Corollary 7.1, Theorem 7.1 and the
discussion which follows it, to decide whether the part of the graph ex-
plored so far is planar, assuming that C + B is planar. By Lemma 8.7, we
find which previous bridges interlace with B. The part of the graph explored
so far is planar if and only if the set of its bridges can be partitioned into
two sets such that no two bridges in the same set interlace. If the answer is
negative, the algorithm halts declaring the graph nonplanar. If the answer
is positive, we still have to check whether C + B is planar.
Let the first path of B be P: vi — u1 — u2 — ··· — vj. We now have a
circuit C′ consisting of C[vj, vi] and P. The rest of C is an outside path
P′, with respect to C′, and it consists of C[vi, 1] and C[1, vj]. The graph
B + C[vj, vi] may have bridges with respect to C′, but none of them has all
its attachments on C[vj, vi], for such a bridge is also a bridge of G with
respect to C, and is not a part of B.
178 Testing Graph Planarity
Proof: We prove the lemma by induction on the order in which the bridges
are explored. At first, both Π1 and Π2 are empty and the lemma is vacuously
true. The lemma trivially holds after one bridge is explored.
Assume the lemma is true up to the exploration of the first path P of
the new bridge B, where P: vi — ··· — vj. If there is no vertex vk on Π1 or
Π2 such that vj < vk < vi then clearly the attachments of B (in C(1, vi))
form a new block (assuming C + B is planar) and the lemma holds. How-
ever, if there are vertices of Π1 or Π2 in between vj and vi then, by Lemma
8.7, the bridges of which they are attachments all interlace with B. Thus, the
old blocks which these attachments belong to must now be merged into
one new block with the attachments of B (in C(1, vi)). Now, let vl be the
lowest vertex of the new block and vh be the highest. Clearly, vl was the
lowest vertex of some old block whose highest vertex was vh′, and vh′ > vj.
Thus, by the inductive hypothesis, no attachment of another block could
be in between vl and vh′, and therefore cannot be in this region after the
merger. Also, all the attachments in between vj and vh are in the new block
since they are attachments of bridges which interlace with B. Thus, all the
entries of Π1 or Π2 which are in between vl and vh belong to the new block.
Q.E.D.
of the lowest of these blocks need remain, except when one of its entries is
0. In this case the lowest nonzero entry on the same side, of the pairs above
it, if any such entry exists, takes its place.
When we enter a recursive step, a special "end of stack" marker E is
placed on top of Π2, and the three stacks are used as in the main algorithm.
If the recursive step ends successfully, we first attempt to switch sections
for each of the blocks with a nonempty section on Π2, above the topmost
E. If we fail to expose E, then C + B is nonplanar and we halt. Otherwise,
all the blocks created during the recursion are joined to the one which
includes vj (the end vertex of the first path of B). The exposed E, on top
of Π2, is removed and we continue with the previous level of recursion.
When we backtrack into a vertex vi, all occurrences of vi are removed from
the top of Π1 and Π2, together with pairs of pointers of Π3 which point to
removed entries on both Π1 and Π2. (Technically, instead of pointing to an
occurrence of vi, we point to 0, and pairs (0, 0) are removed.)
Theorem 8.1: The complexity of the path addition algorithm is O(|V|).
(1) g(s) = 1,
(2) g(t) = |V| (= n),
(3) for every v ∈ V − {s, t} there are adjacent vertices u and w such that
g(u) < g(v) < g(w).
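The three conditions can be checked mechanically. The following Python sketch is ours, not part of the text; the function name and the adjacency-list representation are illustrative assumptions:

```python
def is_st_numbering(adj, g, s, t):
    """Check conditions (1)-(3) of the definition.
    adj: dict mapping each vertex to a list of its neighbours;
    g:   dict mapping each vertex to its claimed number."""
    n = len(adj)
    if sorted(g.values()) != list(range(1, n + 1)):
        return False              # g must number the vertices 1, 2, ..., n
    if g[s] != 1 or g[t] != n:
        return False              # conditions (1) and (2)
    for v in adj:
        if v in (s, t):
            continue
        # condition (3): a lower-numbered and a higher-numbered neighbour
        if not (min(g[u] for u in adj[v]) < g[v] < max(g[u] for u in adj[v])):
            return False
    return True
```

For example, on a 4-cycle s — a — t — b — s, the numbering s = 1, a = 2, b = 3, t = 4 passes all three conditions.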
Computing an st-Numbering 181
Lempel, Even and Cederbaum showed that for every nonseparable graph
and every edge s — t, there exists an st-numbering. The algorithm described
here, following the work of Even and Tarjan [5], achieves this goal in linear
time.
The algorithm starts with a DFS whose first vertex is t and whose first edge is
t → s (i.e., k(t) = 1 and k(s) = 2). This DFS computes for each vertex
v its DFS number, k(v), its father, f(v), and its lowpoint L(v), and distinguishes
tree edges from back edges. This information is used in the paths finding
algorithm to be described next, which is different from the one used in the
path addition algorithm.
Initially, s, t and the edge connecting them are marked "old" and all the
other edges and vertices are marked "new". The path finding algorithm
starts from a given vertex v and finds a path from it. This path may be
directed from v or into v.
(1) If there is a "new" back edge e: v → w (in this case k(w) < k(v)) then do
the following:
Mark e "old".
The path is v → w.
Halt.
(2) If there is a "new" tree edge e: v → w (in this case k(w) > k(v)) then do
the following:
Trace a path whose first edge is e and which from there follows the path
which defined L(w), i.e., it goes up the tree and ends with a back
edge into a vertex u such that k(u) = L(w). All vertices and edges on
the path are marked "old". Halt.
(3) If there is a "new" back edge e: w → v (in this case k(w) > k(v)) then do
the following:
Start the path with e (going backwards on it) and continue backwards
via tree edges until you encounter an "old" vertex. All vertices and
edges on the path are marked "old". Halt.
(4) (All edges incident to v are "old"). The path produced is empty. Halt.
Lemma 8.9: If the path finding algorithm is always applied from an "old"
vertex v ≠ t then all the ancestors of an "old" vertex are "old" too.
Proof: The only case which requires a discussion is when case (2) of the
path finding algorithm is applied. Since G is nonseparable, by Lemma 3.5,
L(w) < k(v). Thus, the path ends "below" v, in one of its ancestors. By
Lemma 8.9, this ancestor is "old".
Q.E.D.
(1) i ← 1.
(2) Let v be the top vertex on S. Remove v from S. If v = t then g(t) ← i
and halt.
(3) (v ≠ t) Apply the path finding algorithm to v. If the path is empty
then g(v) ← i, i ← i + 1 and go to Step (2).
(4) (The path is not empty) Let the path be v — u1 — u2 — ··· — ul — w.
Put ul, ul−1, …, u2, u1, v on S in this order (v comes out on top) and
go to Step (2).
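The steps above, together with the preprocessing DFS and the path finding algorithm, can be sketched in Python. This rendering is ours, not the book's: edges are rescanned naively rather than via sorted incidence lists, so it is illustrative and not the linear-time implementation, and all names are our own:

```python
def st_number(adj, s, t):
    """Even-Tarjan st-numbering (illustrative sketch).
    adj: vertex -> list of neighbours; s - t must be an edge of a
    nonseparable graph."""
    k, low, parent, via, tree = {}, {}, {}, {}, set()

    def dfs(v):                            # DFS: numbers k, fathers, lowpoints
        k[v] = low[v] = len(k) + 1
        for w in adj[v]:
            if w not in k:
                parent[w] = v; tree.add((v, w)); dfs(w)
                if low[w] < low[v]:
                    low[v] = low[w]; via[v] = ('tree', w)   # L(v) via child w
            elif w != parent.get(v) and k[w] < low[v]:
                low[v] = k[w]; via[v] = ('back', w)         # L(v) via back edge

    k[t] = low[t] = 1                      # first vertex t, first edge t -> s
    parent[s] = t; tree.add((t, s)); dfs(s)

    edge = lambda u, v: frozenset((u, v))
    used, old = {edge(s, t)}, {s, t}       # "old" edges and vertices

    def find_path(v):
        for w in adj[v]:                   # case (1): new back edge v -> w
            if edge(v, w) not in used and (v, w) not in tree \
                    and (w, v) not in tree and k[w] < k[v]:
                used.add(edge(v, w)); return [v, w]
        for w in adj[v]:                   # case (2): new tree edge v -> w
            if edge(v, w) not in used and (v, w) in tree:
                path, u = [v], w
                while True:                # follow the path which defined L(w)
                    used.add(edge(path[-1], u)); path.append(u); old.add(u)
                    kind, nxt = via[u]
                    if kind == 'back':
                        used.add(edge(u, nxt)); path.append(nxt); return path
                    u = nxt
        for w in adj[v]:                   # case (3): new back edge w -> v
            if edge(v, w) not in used and (v, w) not in tree \
                    and (w, v) not in tree and k[w] > k[v]:
                used.add(edge(v, w)); path, u = [v], w
                while u not in old:        # climb via fathers to an old vertex
                    path.append(u); old.add(u)
                    used.add(edge(u, parent[u])); u = parent[u]
                path.append(u); return path
        return []                          # case (4): all edges of v are old

    S, g, i = [t, s], {}, 1                # steps (1)-(4) of the text
    while S:
        v = S.pop()
        if v == t:
            g[t] = i; break
        p = find_path(v)
        if not p:
            g[v] = i; i += 1               # empty path: v gets the next number
        else:
            S.extend(reversed(p[1:-1])); S.append(v)   # internals, then v on top
    return g
```

On K4 with t numbered first and s = 2, the sketch produces a valid st-numbering, which can be confirmed by checking conditions (1)-(3) directly.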
Theorem 8.2: The algorithm above computes for every nonseparable graph
G(V, E) an st-numbering.
(i) No vertex ever appears in two or more places on S at the same time.
(ii) Once a vertex v is placed on S, nothing under v receives a number
until v does.
(iii) A vertex is permanently removed from S only after all its incident edges
become "old".
*Do not confuse with the source and sink of a network. The source of a network
is not necessarily a (graphical) source, etc.
Figure 8.1
Vertex Addition Algorithm of Lempel, Even and Cederbaum 185
drawn higher. Such a realization is called a bush form. A bush form of B3,
of our example, is shown in Fig. 8.2.
In fact, Lemma 8.10 implies that if G is planar then there exists a bush
form of Bk such that all the virtual vertices labeled k + 1 appear next
to each other on the horizontal line.
The algorithm proceeds by successively "drawing" B1, B2, …, Bn−1 and
G. If in the realization of Bk all the virtual vertices labeled k + 1 are next
to each other, then it is easy to draw Bk+1: One joins all the virtual vertices
labeled k + 1 into one vertex and "pulls" it down from the horizontal line.
Now all the edges of G which emanate from k + 1 are added, and their
other endpoints are labeled properly and placed in an arbitrary order on
the horizontal line, in the space evacuated by the former virtual vertices
labeled k + 1.
However, a difficulty arises. Indeed, the discussion up to now guarantees
that if G is planar then there exists a sequence of bush forms, such that
each one is "grown" from the previous one. But since we do not have a
plane realization of G, we may put the virtual vertices, out of k + 1, in a
"wrong" order. It is necessary to show that this does not matter; namely,
by simple transformations it will be possible later to correct the "mistake".
Figure 8.2
Note that here we ignore the direction of the edges, and the lemma is
actually concerned with the undirected underlying graph of Bk.
Proof: The st-numbering implies that for every vertex u there exists a path
from 1 to u such that all the vertices on the path are less than u. Thus, if
u < v then there is a path from 1 to u which does not pass through v.
Therefore, 1 and u are in the same component.
Q.E.D.
Proof: Since Bk is a bush form, all the y's are on the outside face of H.
Assume there are two bush forms Bk1 and Bk2 in which the realizations of
H are H1 and H2, respectively. If the y's do not appear in the same order
on the outside windows of H1 and H2 then there are two y's, yi and yj, which
are next to each other in H1 but not in H2 (see Fig. 8.4). Therefore, in H2,
there are two other y's, yk and yl, which interpose between yi and yj on the
two paths between them on the outside window of H2. However, from H1
we see that there are two paths, P1[yi, yj] and P2[yk, yl], which are completely
disjoint. These two paths cannot exist simultaneously in H2. A contradiction.
Q.E.D.
Figure 8.4
Theorem 8.3: If Bk1 and Bk2 are bush forms of the same Bk then there exists
a sequence of permutations and flippings which transforms Bk1 into Bk3,
such that in Bk2 and Bk3 the virtual vertices appear in the same order.
flippings to have its virtual vertices in the same order as its counterpart
in A2.
Q.E.D.
PROBLEMS
8.1 Demonstrate the path addition algorithm on the Petersen graph (see
Problem 7.3). Show the data for all the steps: The DFS for numbering
the vertices, defining the tree and computing L1 and L2. The Φ function
on the edges. The sorting of the adjacency lists. Use the path finding
algorithm in the new DFS, to decompose the graph into C and a se-
quence of paths. Use Π1, Π2, Π3 and end-of-stack markers to carry out
all the recursive steps up to the planarity decision.
8.2 Repeat the path addition planarity test, as in Problem 8.1, for the
graph given below.
8.3 Demonstrate the vertex addition planarity test on the Petersen graph.
Show the steps for the DFS, the st-numbering and the sequence of
bush forms.
8.4 Repeat the vertex addition planarity test for the graph of Problem 8.2.
8.5 Show that if a graph is nonplanar then a subgraph homeomorphic to
one of Kuratowski's graphs can be found in O(|V|²). (Hints:
Only O(|V|) edges need to be considered. Delete edges if their deletion
does not make the graph planar. What is left?)
REFERENCES
[1] Auslander, L., and Parter, S. V., "On Imbedding Graphs in the Plane," J.
Math. and Mech., Vol. 10, No. 3, May 1961, pp. 517-523.
[2] Goldstein, A. J., "An Efficient and Constructive Algorithm for Testing Whether
a Graph Can be Embedded in a Plane," Graph and Combinatorics Conf., Con-
tract No. NONR 1858-(21), Office of Naval Research Logistics Proj., Dept. of
Math., Princeton Univ., May 16-18, 1963, 2 pp.
[3] Hopcroft, J., and Tarjan, R., "Efficient Planarity Testing," JACM, Vol. 21,
No. 4, Oct. 1974, pp. 549-568.
[4] Lempel, A., Even, S., and Cederbaum, I., "An Algorithm for Planarity Testing
of Graphs," Theory of Graphs, International Symposium, Rome, July, 1966, P.
Rosenstiehl, Ed., Gordon and Breach, N.Y., 1967, pp. 215-232.
[5] Even, S., and Tarjan, R. E., "Computing an st-numbering," Th. Comp. Sci.,
Vol. 2, 1976, pp. 339-344.
[6] Booth, K. S., and Lueker, G. S., "Testing for the Consecutive Ones Property,
Interval Graphs, and Graph Planarity Using PQ-tree Algorithms," J. of Comp.
and Sys. Sciences, Vol. 13, 1976, pp. 335-379.
Chapter 9
THE THEORY OF
NP-COMPLETENESS
9.1 Introduction
For many years many researchers have been trying to find efficient algo-
rithms for solving various combinatorial problems, with only partial success.
In the previous chapters several of the achievements were described, but
many problems arising in areas such as computer science, operations re-
search, electrical engineering, number theory and other branches of discrete
mathematics have defied solution in spite of the massive attempt to solve
them. Some of these problems are: The simplification of Boolean functions,
scheduling problems, the traveling salesman problem, certain flow prob-
lems, covering problems, placement of components problems, minimum
feedback problems, prime factorization of integers, minimum coloration of
graphs, winning strategies for combinatorial games.
In this chapter we shall introduce a class of problems which includes
hundreds of problems that have been attacked individually, and for which no
efficient algorithm has been found. Furthermore, we
shall show that a solution of any one member of this class will imply a
solution for all. This is no direct proof that members of this class are hard
to solve, but it provides strong circumstantial evidence that such a solution
is unlikely to exist.
Since all the problems we consider are solvable, in the sense that there
is an algorithm for their solution (in finite time), we need a criterion for
deciding whether an algorithm is efficient. To this end, let us discuss what
we mean by the input length. For every problem (for example, deciding
whether a given integer is a prime) we seek an algorithm such that for every
instance of this problem (say, 5127) the algorithm will answer the question
("is it a prime"?) correctly. The length of the data describing the instance
is called the input length (in our illustration it is 4 decimal digits). This
length depends on the format we have chosen to represent the data; i.e. we
can use decimal notation for integers, or binary notation (13 bits for our
illustration) or any other well defined notation; for graphs, we can use
adjacency matrices, or incidence lists, etc.
Following Edmonds [1], we say that an algorithm is efficient if there
exists a polynomial p(n) such that an instance whose input length is n takes
at most p(n) elementary computational steps to solve. That is, we accept
an algorithm as efficient only if it is of polynomial (time) complexity. This
is clearly a very crude criterion, since it says nothing about the degree or
coefficients of the polynomial. However, we accept it as a first approxima-
tion for the complexity of a problem for the following reasons:
(1) All the problems which are considered efficiently solved, in the litera-
ture, have known polynomial algorithms for their solutions. For ex-
ample, all the algorithms of the previous chapters of this book are
polynomial.
(2) None of the hard problems, mentioned above, is known to have a poly-
nomial algorithm.
(3) To see why polynomial is better than, say, exponential, consider the
following situation. Assume we have two algorithms for a solution of a
certain problem. Algorithm A is of complexity n^2 and algorithm B is of
complexity 2^n. Assume we have a bound of, say, one hour computation
time, and we consider an instance manageable with respect to a certain
algorithm if it can be solved by this algorithm within one hour. Let n0
be the largest instance (n0 is its input length) which can be solved by
algorithm B, using a given computer C. Thus, the number of steps C
can perform in one hour is 2^n0. Now, if we buy a new computer C′, say
10 times faster, the largest instance n we can handle satisfies
2^n = 10 · 2^n0,
or
n = n0 + log2 10.
This means that instead of, say, factoring integers of n0 binary digits
we can now factor integers of about n0 + 4 digits. This is not a very
dramatic improvement.
However, if n0 is the largest instance we could handle, by C, using
Algorithm A, then now we can handle, by C′, instances of length up to
n, where
n^2 = 10 · n0^2,
194 The Theory of NP-Completeness
and
n = √10 · n0 ≈ 3.16 n0.
This means that we would be able to factor integers with more than 3
times the number of digits as we could before. This is much more
appealing. (Unfortunately, we do not know of an n^2 algorithm for factoring integers.)
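The arithmetic of this comparison can be replayed directly. The helper below is ours, purely to illustrate the two growth laws discussed above:

```python
import math

def managed_size_exponential(n0, speedup):
    # solve 2**n = speedup * 2**n0  =>  n = n0 + log2(speedup)
    return n0 + math.log2(speedup)

def managed_size_quadratic(n0, speedup):
    # solve n**2 = speedup * n0**2  =>  n = sqrt(speedup) * n0
    return math.sqrt(speedup) * n0

# with a computer 10 times faster:
# the exponential algorithm grows n0 = 100 to about 103.3 (a few more digits),
# the quadratic algorithm grows n0 = 100 to about 316.2 (more than 3 times n0).
```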
Having a problem in mind, we have to decide on the format of its input.
Only after this is done, the question of whether there is a polynomial algorithm
is mathematically defined. We do not care exactly which format one
chooses for our problem, up to certain limits. We consider two formats
equally acceptable if each can be translated to the other in polynomial time.
If in one format the input length of an instance is n and q(n) is a poly-
nomial bound on the time to translate it (by the translation algorithm) to
the second format, then clearly the input length in the second format of the
same instance is bounded by q(n). Now, if we have a polynomial time algo-
rithm to solve the problem in the second format, where the complexity is
bounded by p(n), then there exists a polynomial time algorithm for the first
format too: First we apply the translation algorithm and to its output we
apply the polynomial algorithm for the second format. The combined algo-
rithm is of complexity
q(n) + p(q(n)),
where
S is a finite set whose elements are called states; s0, sY, sN are three
states called the initial state, the 'yes' state and the 'no' state, re-
spectively.
Γ is a finite set whose elements are called tape symbols; Σ is a proper
subset of Γ whose elements are called input symbols; b is an element
of Γ − Σ, called the blank symbol.
f is a function (S − {sY, sN}) × Γ → S × Γ × {1, −1}, called the
transition function.
A Turing machine has an infinite tape, divided into cells …, c(−2), c(−1),
c(0), c(1), c(2), …. The machine has a read-write head. At each unit of time
t, the head is located at one of the tape cells; the fact that at time t the
head is in c(i) is recorded by assigning the head location function h the
value i (h(t) = i). When the computation starts, t = 0. The head is initially
in c(1); i.e., h(0) = 1.
Each cell contains one tape symbol at a time. The tape symbol of c(i) at
time t is denoted by γ(i, t). The input data consists of input symbols x1, x2,
…, xn and is initially written on the tape in cells c(1), c(2), …, c(n), re-
spectively; i.e., γ(1, 0) = x1, γ(2, 0) = x2, …, γ(n, 0) = xn. For all other
cells c(i), γ(i, 0) = b.
The state at time t is denoted by s(t). Assume s(t) ∈ S − {sY, sN}, and
f(s(t), γ(h(t), t)) = (p, q, d). In this case, the next state s(t + 1) = p, the
symbol in c(h(t)) becomes q (γ(h(t), t + 1) = q) and the head
moves one cell to the right (left) if d = 1 (−1); i.e., h(t + 1) = h(t) + d.
All other cells c(i), i ≠ h(t), retain their symbol; i.e., γ(i, t + 1) = γ(i, t).
If s(t) ∈ {sY, sN} the machine halts.
We assume that M has the property that for every input data it eventually
halts.
A Turing machine M is said to solve a decision problem P if for every
instance of this problem, described according to the conventional format
by x1, x2, …, xn (n is the input length), the answer is 'yes' if and only if M,
when applied to x1, x2, …, xn as its input data, halts in state sY.
A Turing machine seems to be a very slow device for computation.
Typically, a lot of time is "wasted" on moving the head from one location
to another, step by step. This is in contrast to the random access machine
model which we usually use for our computations. Yet, it can be shown
that if an algorithm solves a problem in a random access machine in poly-
nomial time then a corresponding Turing machine will solve this problem
in polynomial time too. (The new polynomial may be of higher degree, but
this does not concern us.)
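The definitions above translate almost verbatim into a simulator. The sketch below is ours (the text defines no programs); the example machine, which accepts exactly the inputs containing an even number of '1' symbols, is likewise an illustration:

```python
def run_tm(f, tape_input, s0='s0', sY='sY', sN='sN', blank='b'):
    """Simulate the machine of the text: f maps (state, symbol) to
    (next state, written symbol, d) with d in {+1, -1}.  The tape is a
    dict, hence unbounded in both directions; the head starts at c(1)."""
    tape = {i + 1: x for i, x in enumerate(tape_input)}   # gamma(i, 0) = xi
    h, s = 1, s0                                          # h(0) = 1, s(0) = s0
    while s not in (sY, sN):                              # halt on sY or sN
        sym = tape.get(h, blank)
        s, tape[h], d = f[(s, sym)]
        h += d                                            # h(t+1) = h(t) + d
    return s == sY

# an example machine: accept iff the input holds an even number of '1's
f = {('s0', '1'): ('s1', '1', 1),   # an odd number of '1's seen so far
     ('s1', '1'): ('s0', '1', 1),   # an even number again
     ('s0', 'b'): ('sY', 'b', 1),   # end of input, even count: accept
     ('s1', 'b'): ('sN', 'b', 1)}   # end of input, odd count: reject
```

Note that this machine always halts, as required of the machines considered in the text.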
The NP Class of Decision Problems 197
None of these inclusions is known to be strict, but equality has not been
demonstrated either.
In certain studies the differences between these different definitions may
be crucial, and the reader is advised to be aware of the differences. How-
ever, these differences are irrelevant in what follows in this book. For
definiteness, we shall stick to the Karp approach.
Our aim is to prove that several combinatorial decision problems, which
are of wide interest, are NPC. Typically, it is easy to show that they belong
to NP, and as noted before, once we know one NPC problem D, a demon-
Proof: First, it is easy to see that SAT ∈ NP. For if the set of clauses is
satisfiable, all we need to do is to use a satisfying truth assignment as the
guess and verify that indeed this assignment is consistent and satisfying. A
Turing machine which performs this task can be built; this is a cumber-
some but straightforward task.
The more demanding part of the proof is to show that every NP problem
is polynomially reducible to SAT. Although this proof is long, its idea is
ingeniously simple.
By definition, for every decision problem D in NP, there exists a poly-
nomially bounded nondeterministic Turing machine M which solves it. We
display a polynomial reduction such that when M, its bounding polynomial
*In fact Cook used the tautology problem of disjunctive normal forms. We use
SAT, which by De Morgan's law is the complement of the tautology problem, since
the latter is not known to be in NP.
NP-Complete Problems and Cook's Theorem 201
p(n) and the input I (for D) are given, it constructs an instance f(I) of
SAT, such that f(I) is satisfiable if and only if M "accepts" I; i.e., there
exists a guess with which M will verify that the answer to I, with respect
to D, is 'yes', with computation time at most p(n).
The idea is that f(I) simulates the operation of M on the instance I.
There will be eight sets of clauses. The satisfiability of each set Si assures
that a certain condition holds. The conditions are:
(1) Initially, I is specified by the contents of cells c(1), c(2), …, c(n); the
cells c(0) and c(n + 1), c(n + 2), …, c(p(n)) all contain blanks.
(2) The initial state is s0 and the head is in c(1) (h(0) = 1).
(3) For every 0 ≤ t ≤ p(n), the machine is in exactly one state.
(4) For every unit of time t, 0 ≤ t ≤ p(n), each cell c(i), −p(n) + 1 ≤ i ≤
p(n) + 1, contains exactly one tape symbol.
(5) For every 0 ≤ t ≤ p(n), the head is on exactly one cell c(i), −p(n)
+ 1 ≤ i ≤ p(n) + 1.
(6) The contents of each cell can change from time t to time t + 1 only if
at time t the head is on this cell.
(7) If s(t) ∈ S − {sY, sN} then the next state (s(t + 1)), the next tape symbol
in the cell under the head (γ(h(t), t + 1)) and the next location of the
head (h(t + 1)) are according to f. If s(t) ∈ {sY, sN} then s(t + 1) = s(t).
(We assume that the state does not change after M halts.)
(8) At time p(n) the state is sY.
To simplify our notation let S = {s0, s1, s2, …, sq}, where s1 is the 'yes'
state (sY) and s2 is the 'no' state (sN). We shall use the following conventions.
Index i is used for cells; −p(n) + 1 ≤ i ≤ p(n) + 1. Index t is used for
time; 0 ≤ t ≤ p(n). Index j is used for tape symbols; 0 ≤ j ≤ g, where
Γ = {γ0, γ1, …, γg} and γ0 is the blank symbol (b). Index k is used for
states; 0 ≤ k ≤ q. The set of variables of the satisfiability problem is:
Since the number of tape symbols is fixed (for the given M, which does
not change with I) the number of G variables is O((p(n))²). The number of
H variables is also O((p(n))²) and the number of S variables is O(p(n)). The
interpretation is as follows:
Assume that I = x1x2x3 … xn. The set S1 is given by:
It is easy to see that condition (1) holds if and only if all the clauses of
S1 are satisfied. The fact that the values of G(i, 0, j) are not specified for
i < 0 corresponds to the fact that one may use any guess one wants.
For every i and (relevant) t there are g clauses which together require that
either the head is on cell c(i) at time t or the symbol γ(i, t) is the same as
γ(i, t + 1); for if at time t + 1 it is γj and at time t it is not, then no literal
in the clause for i, t, j is 'true'.
S7 = ∪i ∪0≤t<p(n) ∪j,k {{¬S(t, k), ¬H(i, t), ¬G(i, t, j), S(t + 1, k′)},
{¬S(t, k), ¬H(i, t), ¬G(i, t, j), G(i, t + 1, j′)},
{¬S(t, k), ¬H(i, t), ¬G(i, t, j), H(i + d, t + 1)}}
∪ ∪0≤t<p(n) ∪k=1,2 {{¬S(t, k), S(t + 1, k)}},
where f(sk, γj) = (sk′, γj′, d).
The clauses of the last line imply that if s(t) = s1 or s2 (the 'yes' and 'no'
states) then s(t + 1) = s(t). The other three lines imply that if s(t) = sk,
h(t) = c(i) and γ(c(i), t) = γj then s(t + 1), γ(c(i), t + 1) and h(t + 1)
are according to the transition function f of the Turing machine M.
S8 = {{S(p(n), 1)}}.
It follows that the set of clauses ∪ Si is satisfiable if and only if there
is a guess for which M accepts the input I, by reaching by t = p(n) the
'yes' state. Thus, every NP decision problem is polynomially reducible to
SAT.
Q.E.D.
Proof: The proof that 3SAT ∈ NP is the same as for SAT. Next we show
that SAT ∝ 3SAT.
{a1, a2, x1}, {x̄1, a3, x2}, …, {x̄i, ai+2, xi+1}, …, {x̄l−3, al−1, al}.
If a truth assignment for a1, a2, …, al contains at least one 'true' literal
then the long clause {a1, a2, …, al} is satisfied. Furthermore, the x vari-
ables can be assigned to satisfy all the clauses of length three which are not
satisfied by one of the a's: If a1 or a2 is T, assign x1 = x2 = ··· = xl−3
= F. If al−1 or al is T, assign x1 = x2 = ··· = xl−3 = T. If some ak = T,
2 < k < l − 1, assign x1 = x2 = ··· = xk−2 = T and xk−1 = xk = ··· =
xl−3 = F. At least one of these cases applies. However, if a truth assign-
ment for a1, a2, …, al makes them all 'false' then the long clause is not
satisfied and no choice of values for the x variables will satisfy all the l − 2
short clauses.
Note that this replacement of long clauses by short ones can be done in
time which is bounded polynomially with the length of the input which
describes the original set of clauses. The number of clauses in the trans-
formed set is bounded by the number of literals, counting their repetitions,
in the original set, and each of the clauses in the transformed set contains
at most three literals.
Q.E.D.
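The replacement of long clauses by three-literal clauses is easily mechanized. In the Python sketch below (ours, not the book's), literals are strings, '-' denotes complementation, and the fresh variable names x1, x2, … are assumed not to clash with the original variables:

```python
from itertools import count

def to_3sat(clauses):
    """Replace every clause of length l > 3 by l - 2 three-literal
    clauses, as in the proof; shorter clauses are kept as they are."""
    fresh = count(1)                  # source of new variable names
    out = []
    for c in clauses:
        if len(c) <= 3:
            out.append(list(c)); continue
        xs = ['x%d' % next(fresh) for _ in range(len(c) - 3)]
        out.append([c[0], c[1], xs[0]])             # {a1, a2, x1}
        for i in range(len(c) - 4):                 # {~xi, a(i+2), x(i+1)}
            out.append(['-' + xs[i], c[i + 2], xs[i + 1]])
        out.append(['-' + xs[-1], c[-2], c[-1]])    # {~x(l-3), a(l-1), al}
    return out
```

A clause of length 5 thus becomes 3 short clauses, in line with the l − 2 count noted above.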
Although this book deals mainly with graph theoretic problems, we shall
discuss in this section three nongraphic combinatorial problems, and prove
their NP-Completeness. These problems are important and well known,
and their established NP-Completeness will help in the analysis of some
graphic problems in the next chapter.
The first problem is the three dimensional matching (3DM). In Section
6.4 we discussed the problem of maximum matching in bipartite graphs,
and saw that it can be solved in polynomial time; this problem is also called
the two dimensional matching problem (2DM). The 3DM is defined as
follows:
Input: Let W, X, and Y be three sets, all of the same cardinality p, and
M ⊆ W × X × Y.
Three Combinatorial Problems which are NPC 205
Proof: Clearly 3DM is in NP. Let us show that 3SAT ∝ 3DM, and since
3SAT is already known to be NPC, this will prove that 3DM is NPC. The
proof follows Garey and Johnson [7].*
Let x1, x2, …, xn be the variables which appear in the set C of clauses
C1, C2, …, Cm, which is the input I to the 3SAT problem.
We construct W, X, Y and M of f(I), the input to 3DM, in the following
general manner. There will be a set of triples, AC, representing the truth
assignment consistency of the variables. Here consistency means that in
all its appearances in C a variable gets the same truth value, and all the
appearances of its complement get the complementary truth value. There
will be a set of triples, SC, representing the satisfiability condition; its
task is to ensure that every clause will be satisfied. Finally, there will be a
set of triples, GC (garbage collection), which will enable the completion of the
matching if the previous conditions have been met.
The sets W, X and Y are defined as follows:
that is, for each variable xi and each clause Cj there is an element xij ∈ W
representing the appearance of xi in Cj; similarly x̄ij for x̄i in Cj. [This is,
clearly, a great waste, since xi and x̄i are unlikely to appear in the same
clause, and only three variables can appear in each clause. However, we
are not concerned here with efficient reductions; as long as the reduction is
polynomial it is acceptable, and the simpler construction is preferred.]
The sets X and Y are described piecewise,
*The use of 3SAT instead of SAT is unimportant. The same reduction proves also
SAT ∝ 3DM directly.
X = A1 ∪ S ∪ G,
Y = A2 ∪ S ∪ G,
where A1, A2, S and G are pairwise disjoint; the Ai play a role in the assignment
consistency, S in the satisfiability and G in the garbage collection.
A1 = {aij | 1 ≤ i ≤ n, 1 ≤ j ≤ m},
A2 = {bij | 1 ≤ i ≤ n, 1 ≤ j ≤ m},
S = {sj | 1 ≤ j ≤ m},
G = {gk | 1 ≤ k ≤ m(n − 1)}.
AC = ∪(i=1..n) ACi,
where
ACi = {(xij, aij, bij) | 1 ≤ j ≤ m} ∪
{(x̄ij, ai,j+1, bij) | 1 ≤ j ≤ m − 1} ∪ {(x̄im, ai1, bim)}.
The structure of ACi is described in Figure 9.1, where the triples are
circumscribed and the i index is dropped. Since each aij and bij participates
in only two triples, for every i, M′ must contain either all triples of the type
(xij, aij, bij) or all of the other type, but no mixture is possible. This repre-
sents the fact that all appearances of xi are 'false' (corresponding to covering
the aij's and bij's with the xij's) or all are 'true'; and if xi = T, then x̄i = F,
etc.
SC = ∪(j=1..m) SCj,
where
SCj = {(xij, sj, sj) | xi ∈ Cj} ∪ {(x̄ij, sj, sj) | x̄i ∈ Cj}.
Since we are using 3SAT, each SCj contains three triples. In order to
cover sj, as a component in the second and third dimensions, one, and
only one, of the triples must be in M′. Clearly this can be the case only
if xij (x̄ij) has not been used in M′ ∩ AC; namely, if xi gets a 'true'
('false') assignment.
Three Combinatorial Problems which are NPC 207
Figure 9.1
Proof: Obviously, 3XC ∈ NP. We show now that 3DM ∝ 3XC. Given the
sets W, X, Y and M ⊆ W × X × Y which specify the instance of 3DM, let
us assume that W, X and Y are pairwise disjoint. If they are not, then by
using new names we can easily change them (and M, accordingly) to satisfy
this assumption.
Define now
C = {{w, x, y} | (w, x, y) ∈ M}
and
U = W ∪ X ∪ Y.
Clearly, C contains an exact cover of U if and only if there is a complete
matching M' ⊆ M.
Q.E.D.
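The reduction above is almost the identity map: each ordered triple simply becomes a 3-set. A minimal sketch (helper names are ad hoc, and the exact-cover checker is brute force, for tiny instances only):

```python
from itertools import combinations

def dm_to_xc(W, X, Y, M):
    # 3DM -> 3XC reduction from the text: triples become 3-sets
    # (assumes W, X, Y are pairwise disjoint).
    U = set(W) | set(X) | set(Y)
    C = [frozenset(t) for t in M]
    return U, C

def has_exact_cover(U, C):
    # Brute force: an exact cover uses exactly |U|/3 pairwise-disjoint 3-sets.
    k = len(U) // 3
    return any(set().union(*sub) == U and sum(len(s) for s in sub) == len(U)
               for sub in combinations(C, k))

W, X, Y = {"w1", "w2"}, {"x1", "x2"}, {"y1", "y2"}
M = [("w1", "x1", "y1"), ("w2", "x2", "y2"), ("w2", "x1", "y2")]
U, C = dm_to_xc(W, X, Y, M)
print(has_exact_cover(U, C))  # -> True (take the first two triples)
```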
The exact cover (XC) problem is similar to 3XC, except that the sets
are not necessarily 3-sets. Clearly 3XC ∝ XC, and therefore XC is NPC too.
The set covering (SC) problem is defined as follows:
Input: A finite set U, a collection C of subsets of U and a positive integer k.
Question: Is there a subcollection C' ⊆ C such that
∪_{S∈C'} S = U
and
|C'| ≤ k?
We can also define the set cover problem by 3-sets (3SC), and the same
proof shows that it is NPC.
Usually, when we encounter a covering problem, it is stated as an opti-
mization problem: Given a collection C of subsets of U, find the smallest
subcollection C' which covers U. Clearly, if we could solve this optimiza-
tion problem then we could also solve SC in polynomial time. It follows
that if SC is hard to solve, as is suggested by its being NPC, then so is the
optimization problem.
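The connection in the last paragraph can be made concrete: a decision oracle for SC answers the optimization question with at most |C| calls, by asking the question for k = 1, 2, . . . and reporting the first 'yes'. A sketch, with a brute-force stand-in for the oracle (only the number of oracle calls matters for the argument; the names are illustrative):

```python
from itertools import combinations

def sc_decision(U, C, k):
    # Stand-in for an SC decision oracle (brute force, exponential; a real
    # oracle would be consulted as a black box).
    return any(set().union(*sub) == set(U)
               for r in range(1, k + 1) for sub in combinations(C, r))

def min_cover_size(U, C):
    # Solves the optimization problem with at most |C| oracle calls.
    for k in range(1, len(C) + 1):
        if sc_decision(U, C, k):
            return k
    return None  # U cannot be covered at all

U = {1, 2, 3, 4, 5}
C = [{1, 2, 3}, {4, 5}, {1, 4}, {2, 5}]
print(min_cover_size(U, C))  # -> 2 ({1, 2, 3} together with {4, 5})
```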
PROBLEMS
ai = Σ_{j∈Si} (m + 1)^j  for i = 1, 2, . . ., m,
n = m,
References 211
(b) Show that if b is expressed in the unary notation then 0-1 KNAP is
solvable in polynomial time. (Hint: Prepare a binary vector (x0, x1, . . .,
xb) in which initially x0 = 1 and x1 = x2 = · · · = xb = 0. For each
i = 1, 2, . . ., n and j = b − 1, b − 2, . . ., 1, 0, if xj = 1, j + ai ≤ b
and x(j+ai) = 0, set x(j+ai) ← 1. At the end, the answer to 0-1 KNAP is
'yes' if and only if xb = 1.)
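The hint's vector scan is the familiar subset-sum dynamic program, polynomial in n and b (and hence in the input length when b is unary). A compact sketch, assuming the ai and b are positive integers:

```python
def knap01_unary(a, b):
    # Decision version of 0-1 KNAP with b in unary: the bit-vector
    # dynamic program sketched in the hint, O(n*b) time.
    x = [False] * (b + 1)
    x[0] = True
    for ai in a:
        # scan j downward so each item is used at most once
        for j in range(b - ai, -1, -1):
            if x[j]:
                x[j + ai] = True
    return x[b]

print(knap01_unary([3, 5, 7], 12))  # -> True  (5 + 7)
print(knap01_unary([3, 5, 7], 11))  # -> False
```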
9.5 The partition problem (PART) is defined as follows:
Input: Positive integers p1, p2, . . ., pm.
Question: Is there a subset J ⊆ {1, 2, . . ., m} such that
Σ_{i∈J} pi = Σ_{i∉J} pi ?
m = n + 2,
pi = ai  for i = 1, 2, . . ., n,
p(n+1) = Σ_{i=1}^{n} ai + b,
Chapter 10
NP-COMPLETE GRAPH PROBLEMS
There are many known NPC graph problems, and we cannot possibly
describe them all in one chapter. The interested reader can refer to the
book of Garey and Johnson [1] for the most complete list of NPC problems.
In this chapter some of the most interesting NPC graph problems are
discussed.
The graphs in this section are all finite, undirected, have no parallel
edges and no self-loops. These assumptions are natural when we deal with
any of the problems of this section.
A clique of a graph G(V, E) is a subset of vertices, C, such that if u, v ∈
C then u − v in G.* An independent set of G is a subset of vertices, S,
such that if u, v ∈ S then u ≁ v in G. (Here u − v means that there is an
edge connecting u and v, and u ≁ v means that there is no such edge.) A
vertex cover (of the edges) is a subset, C, of vertices such that if u − v then
{u, v} ∩ C ≠ ∅.
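These three notions are tightly linked: C is a clique of G if and only if C is an independent set of the complement graph, and S is an independent set if and only if V − S is a vertex cover. A brute-force check of both equivalences on a small graph (an illustration, not from the text; helper names are ad hoc):

```python
from itertools import combinations

def is_clique(E, C):
    return all(frozenset((u, v)) in E for u, v in combinations(C, 2))

def is_independent(E, S):
    return all(frozenset((u, v)) not in E for u, v in combinations(S, 2))

def is_vertex_cover(E, C):
    return all(u in C or v in C for e in E for (u, v) in [tuple(e)])

V = {1, 2, 3, 4}
E = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (3, 4)]}
# edges of the complement graph
Ebar = {frozenset((u, v)) for u, v in combinations(sorted(V), 2)} - E

ok = True
for r in range(len(V) + 1):
    for S in map(set, combinations(V, r)):
        ok &= is_independent(E, S) == is_clique(Ebar, S)
        ok &= is_independent(E, S) == is_vertex_cover(E, V - S)
print(ok)  # -> True
```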
The maximum clique problem (CLIQUE) is defined as follows:
Clique, Independent Set and Vertex Cover 213
The graph G(V, E) has a clique of size k if and only if there is a complete
matching M' ⊆ M.
Q.E.D.
Here, a0, a1, . . ., ak are new symbols, and for every vertex v of G, there
are 2·d(v) vertices in G'.
Hamilton Paths and Circuits 215
The last two parts, in the definition of E', describe a directed path from
(v, 1, 1) to (v, 2, d(v)) which passes through all the other 2·d(v) − 2 vertices
in G' associated with v. This path will be called v's track. For every
edge u − v in G there is a linkage between u's track and v's track as specified
in the third part in the definition of E'; if e is the i-th edge incident to
u (e = e(u, i)) and the j-th edge incident to v (e = e(v, j)) then the
connections it implies are as shown in Fig. 10.1. For every 0 ≤ i ≤ k and
every v ∈ V, by the first part of the definition of E', ai is connected by an
edge to (v, 1, 1). Similarly, by the second part, for every 0 ≤ i ≤ k and v
∈ V, (v, 2, d(v)) is connected by an edge to ai. Thus, the ai vertices serve as
passages from one track to another track.
The reason for the construction of the linkage as shown in Fig. 10.1 is
that if the Hamilton path enters from A it must exit from C, or else either
(v, 1, j) or (u, 2, i) cannot be included in the path. The path can enter
from A and go through all four vertices and exit via C; it can enter from B,
go through all four vertices and exit via D; it can enter both from A and
B and exit via C and D, respectively. Thus, if the path enters (u, 1, 1)
(from some ai), it will go through all the 2·d(u) vertices on u's track and
exit from (u, 2, d(u)) (to some aj). It may cover pairs of vertices (v, 1, j),
(v, 2, j), in addition, if u − v in G.
Let s = a0 and t = ak. This completes the definition of f(I). We claim
that G has a k-vertex cover if and only if there is a Hamilton path from a0
to ak in G'.
Assume C = {v1, v2, . . ., vk} is a vertex cover of G.* One can construct
a Hamilton path in G', from a0 to ak, as follows. Start with an edge from
a0 to (v1, 1, 1), down the v1 track to (v1, 2, d(v1)), from there to a1, to (v2,
1, 1), down the v2 track to (v2, 2, d(v2)), etc. Finally, from (vk, 2, d(vk)) to
ak. Now, for every edge u − v in G, if one vertex, say u, belongs to C but v
∉ C, the vertices (v, 1, j) and (v, 2, j), where e = e(v, j), are included by
making a detour in the u track. Assume e = e(u, i); then instead of going
directly from (u, 1, i) to (u, 2, i), insert the detour (u, 1, i) − (v, 1, j) −
(v, 2, j) − (u, 2, i). Since C is a vertex cover, we can include in this way all
the vertices on the unused tracks.
If P is a Hamilton path from a0 to ak in G', then we can construct a k-
vertex cover, S, of G, as follows: v ∈ S if and only if v's track is used in P.
The number of vertices in S is exactly k. Consider now an edge u − v in G.
If both the v track and the u track are used in P then clearly e is (doubly)
covered. If not, the only way to have (u, 1, i), (u, 2, i), (v, 1, j) and (v, 2, j)
in P is either to use the u track or the v track. Thus, e is covered by either
u or v. Q.E.D.
Figure 10.1
Proof: Let us show that DHP ∝ HP. (This proof is due to R. E. Tarjan.)
Let the input I of DHP consist of the digraph G(V, E) and the two vertices
s and t. The graph G'(V', E'), of f(I), is defined as follows:
V' = {(v, 0), (v, 1), (v, 2) | v ∈ V}
E' = {(v, 0) − (v, 1), (v, 1) − (v, 2) | v ∈ V} ∪ {(u, 2) − (v, 0) | u → v in
G}.
Coloring of Graphs 217
The (undirected) Hamilton circuit problem (HC) is also NPC, and this is
again proved by showing that HP ∝ HC. Again the reduction is simply by
adding a new vertex a and two new edges t − a and a − s.
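The HP ∝ HC reduction is short enough to state and test mechanically. The sketch below adds the new vertex a and verifies, by brute force on a small graph, that a Hamilton path from s to t exists exactly when the augmented graph has a Hamilton circuit (function names are ad hoc; the searches are exponential, for tiny graphs only):

```python
from itertools import permutations

def has_hp(V, E, s, t):
    # Brute force: is there a Hamilton path from s to t?
    return any(p[0] == s and p[-1] == t and
               all(frozenset(p[i:i+2]) in E for i in range(len(p) - 1))
               for p in permutations(V))

def has_hc(V, E):
    # Brute force: is there a Hamilton circuit?
    vs = sorted(V)
    return any(all(frozenset((c[i], c[(i+1) % len(c)])) in E
                   for i in range(len(c)))
               for c in ((vs[0],) + p for p in permutations(vs[1:])))

def hp_to_hc(V, E, s, t):
    # The reduction in the text: add a new vertex 'a' joined to s and t.
    a = max(V) + 1  # a fresh vertex (vertices assumed to be integers here)
    return V | {a}, E | {frozenset((t, a)), frozenset((a, s))}

V = {1, 2, 3, 4}
E = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4), (1, 3)]}
s, t = 1, 4
V2, E2 = hp_to_hc(V, E, s, t)
print(has_hp(V, E, s, t), has_hc(V2, E2))  # -> True True
```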
The traveling salesman problem is really not one problem. Generally, a
graph or a digraph is given, with lengths assigned to the edges. The
problem is to find a minimum circuit, or a path from a vertex s to a vertex t,
such that every vertex is on it. Vertices may, or may not, be allowed to
appear more than once on the circuit or path.
For definiteness, let us assume that G(V, E) is an undirected graph,
each e ∈ E is assigned a length l(e), and we are required to find a simple
circuit which passes through all the vertices and whose sum of edge lengths
is minimum. Clearly, if we could solve this traveling salesman problem in
polynomial time we could also solve HC: simply assign length 1 to all the
edges and solve the traveling salesman problem. This observation remains
valid even if in the traveling salesman problem the circuit is not required to
be simple; a minimum circuit is of length |V| if and only if it is
Hamiltonian. Similarly, the directed versions are related. We conclude that
the traveling salesman problems are hard to solve if P ≠ NP.
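The observation that a traveling salesman solver decides HC can be sketched directly. Here a brute-force tour search stands in for the (hypothetical) polynomial-time TSP solver; only the length-1 weighting and the final comparison with |V| come from the text:

```python
from itertools import permutations

def tsp_min_circuit(V, E, l):
    # Stand-in for a TSP solver: cheapest simple circuit through all
    # vertices using edges of the graph (brute force).
    vs = sorted(V)
    best = None
    for p in permutations(vs[1:]):
        c = (vs[0],) + p
        edges = [frozenset((c[i], c[(i + 1) % len(c)])) for i in range(len(c))]
        if all(e in E for e in edges):
            cost = sum(l[e] for e in edges)
            best = cost if best is None else min(best, cost)
    return best

def has_hc_via_tsp(V, E):
    # The observation in the text: with every edge of length 1, the graph
    # is Hamiltonian iff the optimal tour has length exactly |V|.
    l = {e: 1 for e in E}
    return tsp_min_circuit(V, E, l) == len(V)

V = {1, 2, 3, 4}
E = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4), (4, 1)]}
print(has_hc_via_tsp(V, E))  # -> True
print(has_hc_via_tsp(V, E - {frozenset((4, 1))}))  # -> False
```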
One of the classical problems in graph theory is that of coloring the vertices
of a graph in such a way that no two adjacent vertices (i.e. connected
by an edge) are assigned the same color. The minimum number of colors
necessary to color G is called the chromatic number, γ(G), of G.
218 NP-Complete Graph Problems
In this section, we shall show that this problem is NPC. The problem remains
NPC even if all we ask is whether γ(G) ≤ 3. Furthermore, even if
we restrict the question to planar graphs, the problem remains NPC. Even
if we restrict the problem to a class of planar graphs with well behaved
planar realizations, the problem of whether γ(G) ≤ 3 is still NPC. One such
definition of a well behaved realization is that all edges are straight lines, no
angle is less than 10° and the edge lengths are between two given
bounds.
First we consider the 3-coloration problem (3C), which is defined as
follows:
Input: A graph G(V, E).
Question: Can one assign each vertex a color, so that only three colors are
used and no two adjacent vertices are assigned the same color? (In short: Is
γ(G) ≤ 3?)
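For intuition, a claimed 3-coloring is trivially verifiable, while, as far as is known, finding one requires search. A brute-force sketch (illustrative only; exponential in |V|, as one expects for an NPC problem):

```python
from itertools import product

def is_3_colorable(V, E):
    # Brute-force test of gamma(G) <= 3: try every assignment of the
    # colors {0, 1, 2} and check all edges.
    vs = sorted(V)
    return any(all(col[u] != col[v] for e in E for (u, v) in [tuple(e)])
               for cols in product(range(3), repeat=len(vs))
               for col in [dict(zip(vs, cols))])

# K3 is 3-colorable, K4 is not.
K = lambda n: ({i for i in range(n)},
               {frozenset((i, j)) for i in range(n) for j in range(i + 1, n)})
print(is_3_colorable(*K(3)), is_3_colorable(*K(4)))  # -> True False
```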
Proof: We show that 3SAT ∝ 3C. (The proof of this theorem, and the
next, follows the works of Stockmeyer [2] and Garey, Johnson and
Stockmeyer [3].) Let the set of literals, of the input I to 3SAT, be {x1, x2,
. . ., xn, x̄1, x̄2, . . ., x̄n} and the clauses be C1, C2, . . ., Cm.
The graph G(V, E), which is the input f(I) to 3C, is defined as follows:
The structure of the last two parts in the definition of E, for each j, is
shown in Figure 10.2.
The significance of this structure is as follows: Assume each of the vertices
t1j, t2j and t3j is colored 0 or 1 (we assume the three colors are 0, 1
and 2), and ignore for the moment vertex b. We claim that w6j can be
colored 1 or 2 if and only if not all three vertices t1j, t2j and t3j are colored
0. First, it is easy to see that if t1j, t2j and t3j are colored 0 then w4j must also
be colored 0, and therefore w6j must be colored 0. But, as the reader may
check for himself, if at least one of t1j, t2j, t3j is colored 1 then w6j can be
colored 1.
Figure 10.2
The structure of the first two parts of E's definition is shown in Fig.
10.3. Clearly, if a is colored 2 then all the literal-vertices must be colored 0
and 1; one of these colors is used for xi and the other for x̄i. Assume I is
satisfiable by some truth value assignment to the literals. To see that f(I) is
3-colorable, assign a the color 2. Assign the literal-vertex the color 1 if the
literal is 'true' and 0 if it is 'false'. Now, since no triple (t1j, t2j, t3j) is assigned
all zeros, we can color w1j, w2j, . . ., w6j in such a way that w6j is colored 1, for
all j = 1, 2, . . ., m. Thus, b is colorable 0, and the 3-coloration of G is
complete. Conversely, if G is 3-colorable, call a's color 2 and b's color 0.
Clearly, all the literal-vertices are colored 0 and 1, and w6j cannot be
colored 0. Thus, for each triple (t1j, t2j, t3j), not all three are colored 0.
Now, if we assign a literal 'true' if and only if its corresponding vertex is
colored 1, the assignment satisfies all clauses.
Q.E.D.
Figure 10.3
Figure 10.4
(We remind the reader that a plane graph is a drawing of a graph in the
plane, so that no two edges share a point except, possibly, a mutual
endpoint.)
Proof: We show that 3C ∝ 3CP. Let G(V, E) be the input I to the 3C problem,
where V = {v1, v2, . . ., vn}. We construct f(I) in two steps. First,
construct a general layout, as demonstrated in Fig. 10.5 for the case of n
= 5. This general layout depends only on n, and not on E. (The idea here
is a variation of Johnson's construction, based on a simple and well known
sorting network, from which the last two layers are omitted. For a description
of the sorting network see, for example, references [4] or [5].)
Figure 10.5
layout has n - 1 main layers of vertices and n main columns of vertices;
these vertices are denoted us;, where C[c i s n - 1 and 1 c j c n. If i +
j is even then a copy of D is constructed with vertices uj,, Ui(;+±), U(i+l)j and
U(i+l)j+i) playing the role of D's four points. If i < n - 1 and even then ui,
and u(i+t)1 are connected via two new vertices as shown in Fig. 10.5 in the
case of U2, and U31. If i < n - 1 and i + n is even then uin is connected,
similarly, to u(y+1)"; see uis and U25, and also U35 and U45 in Fig. 10.5. This
completes the description of the layout.
Assume all vertices* are colored using only three colors. Clearly, u11,
u22, . . ., u(n−1)(n−1) are all colored identically. If i > 1 and i is odd then u1i,
u2(i+1), . . ., u(n−i+1)n, u(n−i+2)n, u(n−i+3)(n−1), . . ., u(n−1)(n−i+3) are all colored
identically, and if the two tracks do not intersect they are adjacent at the
(n − 1)-st level.
Now, we turn to the second part of the reduction. If vi − vj in G, add an
edge ukl − uk(l+1) to the layout, where ukl is on the i-th track (or the j-th
track) and uk(l+1) is on the j-th track (or the i-th track). Such a k and l can
be found since every two tracks are adjacent somewhere.
Now, if G is 3-colorable, color all the vertices of the i-th track with the
color of vi. Clearly, the remaining vertices of f(I) can also be legally
colored; the vertices in the D structures and the connecting pairs are easily
colored, and the edges which correspond to edges of E connect between
tracks of different color. Conversely, if f(I) is 3-colorable, each track is
uniformly colored, and we can assign vi, in G, the same color. No two
adjacent vertices in G get the same color, because there is an edge connect-
ing the two corresponding tracks in f(I), and since f(I) is legally colored,
the colors of these two tracks are different.
Q.E.D.
As the reader can easily observe, the plane graph f(I) constructed in
this proof is "well behaved". This justifies the claim, made earlier in the
section, that even when the 3-colorability problem is restricted to such
graphs, the problem remains NPC.
Input: A digraph G(V, E) and a positive integer k.
Question: Is there a subset of vertices, V', such that |V'| ≤ k and the digraph
resulting from G by eliminating all the vertices in V' and their
incident edges is acyclic?
Proof: We show that VC ∝ FVS. Let the input I to the vertex cover problem
consist of the graph G(V, E) and the integer k. The input f(I) to FVS
consists of a digraph H(V, F) and the same integer k, where F is defined as
follows:
F = {a → b, b → a | a − b in E}.
Since each edge a − b of G corresponds to a directed circuit a → b → a of
H, clearly, a feedback vertex set S must contain either a or b (or both).
Thus the set S is a vertex cover of all the edges of G. Also, if S is a vertex
cover of G, the elimination of S and all edges incident to its elements from
H leaves no edges, and therefore no directed circuits. Thus, G has a vertex
cover with k or fewer vertices if and only if H has a feedback vertex set of k
or fewer vertices.
Q.E.D.
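The VC ∝ FVS reduction is short enough to state and test mechanically. A sketch with brute-force checkers (illustrative names; exponential search, for tiny inputs only):

```python
from itertools import combinations

def vc_to_fvs(V, E):
    # The reduction in the text: each undirected edge a-b becomes the
    # directed 2-circuit a->b->a.
    F = {(a, b) for e in E for (a, b) in [tuple(e)]} \
        | {(b, a) for e in E for (a, b) in [tuple(e)]}
    return V, F

def is_acyclic(V, F):
    # A digraph is acyclic iff a topological order exists (Kahn's method).
    indeg = {v: 0 for v in V}
    for (_, v) in F:
        indeg[v] += 1
    stack, seen = [v for v in V if indeg[v] == 0], 0
    while stack:
        u = stack.pop()
        seen += 1
        for (a, b) in F:
            if a == u:
                indeg[b] -= 1
                if indeg[b] == 0:
                    stack.append(b)
    return seen == len(V)

def has_fvs(V, F, k):
    return any(is_acyclic(V - set(S), {(a, b) for (a, b) in F
                                       if a not in S and b not in S})
               for r in range(k + 1) for S in combinations(V, r))

def has_vc(V, E, k):
    return any(all(u in S or v in S for e in E for (u, v) in [tuple(e)])
               for r in range(k + 1) for S in map(set, combinations(V, r)))

V = {1, 2, 3, 4}
E = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4), (4, 1)]}
H, F = vc_to_fvs(V, E)
print(has_vc(V, E, 2), has_fvs(H, F, 2))  # -> True True
print(has_vc(V, E, 1), has_fvs(H, F, 1))  # -> False False
```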
Input: A digraph G(V, E) and a positive integer k.
Question: Is there a subset of edges, E', such that |E'| ≤ k and G'(V, E
− E') is acyclic?
Proof: We show that FVS ∝ FES. Let the input I to FVS consist of a digraph
G(V, E) and a positive integer k. The input f(I) to FES consists of a
digraph H(W, F) and the same integer k, where H is defined as follows:
W = {(v, 1), (v, 2) | v ∈ V},
F = {(v, 1) → (v, 2) | v ∈ V} ∪ {(u, 2) → (v, 1) | u → v in G}.
Let us call the edges of H of the type (v, 1) → (v, 2) internal, and those
of type (u, 2) → (v, 1) external. All the external edges incident to (v, 1)
are incoming, and there is exactly one internal edge incident to (v, 1), and
it is outgoing. It follows that if a feedback edge set contains an external
edge (u, 2) → (v, 1), then it can be replaced by the internal edge (v, 1) →
(v, 2), since all the directed circuits which go through (u, 2) → (v, 1) go
also through (v, 1) → (v, 2). Thus, if there is a feedback edge set F' in H
which satisfies |F'| ≤ k, we can assume that it consists entirely of internal
edges, and the set of vertices, V', in G which correspond to these internal
edges is a feedback vertex set of G. Also, if V' is a feedback vertex set of
G then the set of internal edges in H which correspond to vertices of V' is
a feedback edge set of H.
Q.E.D.
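The vertex-splitting construction and the internal/external argument can be checked on a small digraph. A sketch (brute force, illustrative only; vertices of H are pairs, as in the text):

```python
from itertools import combinations

def fvs_to_fes(V, E):
    # The splitting construction from the text: vertex v becomes the
    # internal edge (v,1)->(v,2); each u->v becomes the external edge
    # (u,2)->(v,1).
    W = {(v, 1) for v in V} | {(v, 2) for v in V}
    F = {((v, 1), (v, 2)) for v in V} | {((u, 2), (v, 1)) for (u, v) in E}
    return W, F

def is_acyclic(V, F):
    # Kahn's topological-order test.
    indeg = {v: 0 for v in V}
    for (_, v) in F:
        indeg[v] += 1
    stack, seen = [v for v in V if indeg[v] == 0], 0
    while stack:
        u = stack.pop()
        seen += 1
        for (a, b) in F:
            if a == u:
                indeg[b] -= 1
                if indeg[b] == 0:
                    stack.append(b)
    return seen == len(V)

def has_fvs(V, E, k):
    return any(is_acyclic(set(V) - set(S),
                          {(a, b) for (a, b) in E if a not in S and b not in S})
               for r in range(k + 1) for S in combinations(V, r))

def has_fes(V, E, k):
    return any(is_acyclic(V, set(E) - set(D))
               for r in range(k + 1) for D in combinations(E, r))

V = {1, 2, 3}
E = {(1, 2), (2, 1), (2, 3), (3, 1)}   # circuits 1->2->1 and 1->2->3->1
W, F = fvs_to_fes(V, E)
for k in range(3):
    assert has_fvs(V, E, k) == has_fes(W, F, k)
print(has_fvs(V, E, 1))  # -> True (vertex 2 meets every circuit)
```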
Steiner Tree 225
Input: A connected graph G(V, E), a subset of vertices X (⊆ V), a length
function l(e) > 0 defined on the edges and a positive integer k.
Question: Is there a subtree (W, F) of G such that X ⊆ W and Σ_{e∈F} l(e) ≤ k?
Proof: (Following Karp [7].) We show that 3XC ∝ ST. Let the input I to
3XC consist of a universal set U = {u1, u2, . . ., ut} and a collection C =
{S1, S2, . . ., Sn} such that Si ⊆ U and |Si| = 3. The input f(I) to ST is
defined as follows:
V = {v0} ∪ C ∪ U,
E = {ei = v0 − Si | 1 ≤ i ≤ n} ∪ {fij = Si − uj | uj ∈ Si},
X = {v0} ∪ U,
k = 4t/3,
and all edge lengths are 1. If J ⊆ {1, 2, . . ., n} defines an exact cover of U,
then a Steiner tree of total length 4t/3 is given by
W = {v0} ∪ {Si | i ∈ J} ∪ U,
F = {ei | i ∈ J} ∪ {fij | i ∈ J and uj ∈ Si}.
Now, assume T(W, F) is a Steiner tree of f(I). First, observe that we can
assume that in T each vertex uj is a leaf; for if d(uj) > 1 we can reduce its
degree, without increasing the degree of any other ul, and without
changing T's total length, as follows. If in T, uj is connected by edges to Si1
and Si2, delete the edge fi1j (connecting uj with Si1). One of Si1 and Si2 is now
disconnected from v0; add the edge which connects it directly to v0, to restore
the connectivity.
We can now prove that if J = {i | Si ∈ W} then it defines an exact cover
of U. Clearly, each Si ∈ W can have at most 3 edges in T which lead to elements
of U, and since U ⊆ W, if any Si ∈ W has less than 3 such edges,
|J| > t/3. Also, for every Si ∈ W, ei ∈ F. Thus, if |J| > t/3 then |F| >
4t/3, contradicting the requirement that |F| ≤ 4t/3. Therefore, each Si ∈
W has exactly 3 edges to elements of U, no uj is connected by an edge to
more than one Si, and we conclude that J defines an exact cover of U.
Q.E.D.
Note that we have proved that even if all edge lengths are 1, ST remains
NPC. For NP-completeness results of the Steiner problem on a rectilinear
grid and other related problems, see Garey, Graham and Johnson [8].
Proof: Let us show that 3SAT ∝ MAXC. The reduction is done in two
steps. In the first step we shall assign weights to the edges, and will reduce
3SAT to the resulting weighted version of the maximum cut problem.
Maximum Cut 227
Now,
w(v0 − ξ') = Σ_{i=1}^{m} |Ci ∩ {ξ'}|  if ξ' ∈ L,
w(ξ' − ξ'') = Σ_{i=1}^{m} |Ci ∩ {ξ'}| · |Ci ∩ {ξ''}|  if ξ', ξ'' ∈ L and ξ' ≠ ξ'',
w(xj − x̄j) = 10m + 1,
and the weight of each of the edges incident to vi, i > 0, is 1. The sum of
the weights of all these three classes of edges is exactly 10m.
We now claim that the answer to the instance of the 3SAT problem is the
same as to the question: Is there a set S' ⊆ V' such that
Σ_{e∈(S'; S̄')} w(e) ≥ k' ?
First, assume the answer to the 3SAT problem is affirmative, and let T be
the set of literals which have a 'true' value in a consistent assignment which
satisfies all the clauses. Let T ⊆ S' and L − T ⊆ S̄'. Clearly, for each 1 ≤
j ≤ n, xj − x̄j belongs to the cut, and contributes 10m + 1 to its
weight. Altogether we already have (10m + 1) · n contributed to the
total weight of the cut. Now, put v0 ∈ S̄'. It is convenient to interpret the
rest of the edges and their weights as follows: Each Ai is a clique, and each
edge appears as many times as it belongs to such cliques, which is exactly
equal to its defined weight. In each of these m cliques there is at least one
literal-vertex which is in S', v0 is in S̄' and vi can be put in S' or S̄' in
such a way that 2 of Ai's vertices will be on one side and the remaining 3
on the other side. Thus, the clique contributes 6 to the cut, and the m
cliques contribute together 6m to the weight of the cut.
The argument above shows also that the total weight of a cut cannot exceed
k', and if the answer to the weighted cut problem is affirmative then
all edges of the type xj − x̄j must be in the cut and each of the m cliques
must contribute 6, which is the maximum any such clique can contribute.
Therefore, in each clique 2 of the vertices are on one side of the cut, and
the remaining 3 are on the other side. Now, let us call the side v0 is in the
'false' side, and the other side the 'true' side. It follows that at most 2
literal-vertices of each clique can be on the 'false' side. Thus, the defined
assignment is consistent and satisfies all clauses.
Now, in the second step we want to reduce the weighted cut problem to
an unweighted one. Actually, this has already been done above, when each
edge of weight w is replaced by w parallel edges of weight 1. Since all weights
are polynomially bounded by the input length to the 3SAT problem, this
increase in the number of edges is polynomially bounded. However, we
wish to show that the reduction can be done even if the graph is required
to be simple, i.e., to have no parallel edges.
Let us replace each edge u − v of G', of weight w(e), by the construct
shown in Fig. 10.6, where a1, b1, a2, b2, . . ., aw(e), bw(e) are new vertices. We
claim that the new graph, G, has a cut of size k = 2[(10m + 1) · n + 10m] + k'
if and only if G' has a weighted cut of size k'. This can be shown as fol-
Figure 10.6
lows. In each path u − ai − bi − v, if u and v are on the same side of the
cut, at most 2 edges can be in the cut. But if u and v are on different sides
of the cut, by putting bi on u's side and ai on v's side, the path contributes
3. Thus, each edge e of G' can give rise to at least 2·w(e) edges in a cut
of G even if its two endpoints are on the same side. Altogether, we get this
way, for all the edges of G', a contribution of 2 · Σ_{e∈E'} w(e) =
2[(10m + 1) · n + 10m] to a cut in G.
If the weight of the cut (S'; S̄') in G' is equal to k', let S' ⊆ S and S̄' ⊆ S̄. If u
and v are on the same side of the cut, we can assign ai and bi so that u − ai
− bi − v contributes 2; if not, it can contribute 3. Thus, we can complete
the definition of S and S̄ in such a way that the value of the cut (S; S̄) in G
is 2[(10m + 1) · n + 10m] + k' = k.
Each edge e's construct contributes at most 3·w(e) to the cut, and without loss of
generality (or again, by changing S), we can assume that this is exactly the
construct's contribution. Thus,
Σ_{e∈(S'; S̄')} w(e) = k'.
Thus, G' has a cut of weight k' if and only if G has a cut with k edges.
Q.E.D.
Σ_{u−v in G} |p(u) − p(v)|.
Our goal is to show that the problem of finding an arrangement for which
the length of G is minimum, or rather the corresponding decision problem,
MINLA, is NPC. As an intermediate result, we shall prove that MAXLA,
defined below, is NPC.
Definition of MAXLA:
Input: A graph G(V, E) and a positive integer k.
Question: Is there a 1-1 onto function p: V → {1, 2, . . ., |V|} such that
Σ_{u−v in G} |p(u) − p(v)| ≥ k?
Linear Arrangement 231
Proof: Let us show that MAXC ∝ MAXLA. (Here again, we follow [9]; a
similar proof appears in [3].)
Let the input I to MAXC consist of a graph G'(V', E') and a positive
integer k'. We define the input f(I) to MAXLA as follows:
V = V' ∪ {x1, x2, . . ., xn³},
E = E',
k = k' · n³,
where n = |V'| and x1, x2, . . ., xn³ are new isolated vertices. If we interpose
the isolated vertices in between the two sides, S' and S̄', of a cut of G' of
weight k' or more, then the number of edges which span over the n³
interposed vertices is |(S'; S̄')|, and each has a length which exceeds n³. Thus,
Σ_{u−v in G} |p(u) − p(v)| > |(S'; S̄')| · n³ ≥ k' · n³ = k,
and the answer to f(I), with respect to MAXLA, is affirmative.
Conversely, assume p is an arrangement for which
Σ_{u−v in G} |p(u) − p(v)| ≥ k.
One can modify p into an arrangement p', in which the isolated vertices
x1, x2, . . ., xn³ occupy consecutive places, without decreasing the total
length; thus,
Σ_{u−v in G} |p'(u) − p'(v)| ≥ Σ_{u−v in G} |p(u) − p(v)| ≥ k' · n³.
The total length of G, in placement p', can be divided into two parts:
(1) n³ · |(S; S̄)|, where S is the set of vertices of V' which p' places before
the isolated vertices, and (S; S̄) is the cut in G' defined by S. This is the
length caused by the interposition of x1, x2, . . ., xn³ in between S and S̄.
(2) The total length of the edges if x1, x2, . . ., xn³ were dropped and the
gap between S and S̄ was closed. This length is clearly less than n³.
Thus,
n³ · |(S; S̄)| + n³ > k' · n³,
or
|(S; S̄)| ≥ k'.
Therefore, the answer to I, with respect to MAXC is affirmative too.
Q.E.D.
Multicommodity Integral Flow 233
Σ_{u−v in G} |p(u) − p(v)| ≤ k?
V' = V,
E' = {u − v | u ≠ v and u ≁ v in G},
k' = n(n² − 1)/6 − k,
Σ_{u−v in G'} |p(u) − p(v)| ≤ k'?
(2) For each commodity i ∈ {1, 2} and every vertex v ∈ V − {si, ti},
Σ_{e∈α(v)} fi(e) = Σ_{e∈β(v)} fi(e).
(We remind the reader that α(v) (β(v)) is the set of edges which enter v
(emanate from v) in G.)
The total flows, F1 and F2, of the flow functions f1 and f2 are defined by
Fi = Σ_{e∈α(ti)} fi(e) − Σ_{e∈β(ti)} fi(e).
Question: Are there integral flow functions f1 and f2 for N, for which Fi ≥
Ri?
We shall show that D2CIF is NPC, even if all the edge capacities are 1;
this is called the simple D2CIF.
Proof: Let us show that SAT ∝ Simple D2CIF. The input I to SAT consists
of clauses C1, C2, . . ., Cm, each a subset of the set of literals L =
{x1, x2, . . ., xn, x̄1, x̄2, . . ., x̄n}. The structure of f(I), the input to Simple
D2CIF, is as follows. For each variable xi we construct a lobe, as shown in
Fig. 10.7. Here pi is the number of occurrences of xi in the clauses, and qi
is the number of occurrences of x̄i. The lobes are connected in series: vi is
connected by an edge to vi+1, s1 is connected to v1 and vn to t1. s2 is connected
by edges to all the vertices vji and v̄ji where j is odd. In addition,
there are vertices C1, C2, . . ., Cm and an edge from each to t2. For the j-th
occurrence of xi (x̄i), there is an edge from v2ji (v̄2ji) to the vertex Ck of
the clause in which it occurs. The requirements are R1 = 1 and R2 = m.
The first commodity must flow from s1 to t1 through the lobes; the vertices
s2, C1, C2, . . ., Cm and t2 cannot be used in this flow, since there is no edge
from the lobes to s2, and there is no edge to return from C1, C2, . . ., Cm
and t2 to the lobes or to t1. Thus, the one unit of the first commodity must
use, in each lobe, either the upper or the lower path, but not both.
If the second commodity meets the requirement then F2 = R2 = m, and
all the edges which enter t2 are saturated. In this case, there is exactly one
unit of flow of the second commodity entering each Ck. If this unit of flow
comes from the upper track of the i-th lobe, through the edge v2ji − Ck,
then clearly it uses also the edge v(2j−1)i − v2ji, and the unit of the first
commodity must use the lower track in this lobe.
Thus, if the answer to f(I), with respect to D2CIF, is positive, then we
can use the flows f1 and f2 to define a satisfying assignment of the literals
Figure 10.7
as follows: If the first commodity goes through the lower track of the i-th
lobe, assign xi = T, and if through the upper, xi = F. In this case, the
answer to I, with respect to SAT, is also positive.
Conversely, assume there is a satisfying assignment of the variables. If xi
= T, let the first commodity use the lower track in the i-th lobe; if xi = F,
use the upper track. Now, let ξ be a 'true' literal in Ck. If ξ = xi then the
upper track is free of the first commodity and we can use it to flow one
unit of the second commodity from s2 to Ck; if ξ = x̄i, use the lower track.
Finally, use the m edges entering t2 to flow into it the m available units of
the second commodity.
Q.E.D.
(Note that c(u −e− v) = c(e) ≥ 0.) Condition (1) on the edges is changed
accordingly. Condition (2), that for each v ∈ V − {si, ti} the total flow of the
i-th commodity entering v is equal to the total flow of the i-th commodity
emanating from v, is now in the following form:
Σ_{u−e−v ∈ E} fi(u −e→ v) = 0,
and
Fi = Σ_{u−e−ti ∈ E} fi(u −e→ ti).
Question: Are there integral flow functions f1 and f2 for N such that Fi ≥
Ri?
Figure 10.8
Thus, these changes can only expand the data describing the problem
linearly.
We proceed to construct the undirected network N from the new directed
network N', as follows: Each edge u → v of G' is replaced by the construct
shown in Fig. 10.8. (u and v may be any two vertices in V', including
sources and sinks.) The vertices of the construct which are unlabeled in
Fig. 10.8 are new, and do not appear elsewhere in the graph.
The new requirements are Ri' = Ri + |E'|.
It remains to be shown that the requirements in N' can be met if and
only if the new requirements can be met in the new undirected network
N.
First assume that the requirements can be met in N'. Initially, flow one
unit of each commodity through each of the constructs, as shown in Fig.
10.9. This yields F1 = F2 = |E'|. Next, if u → v is used in N' to flow one
unit of the first commodity, then we change the flows in the corresponding
construct as shown in Fig. 10.10. The case of the second commodity
flowing through e in N' is handled similarly.
Figure 10.9
Figure 10.10
We can now use the following flows through u → v in N': If in e's construct
in N we use pattern (1), then f1(e) = f2(e) = 0. If it is pattern (2),
then f1(e) = 1 and f2(e) = 0, etc. Clearly, this defines legal flows in N'
which meet its requirements.
Q.E.D.
PROBLEMS
10.1 Show that IND ∝ CLIQUE without using Cook's theorem (Theorem
9.1) or any of its consequences.
10.2 Show that the problem of finding a minimum independent set which
is also a vertex cover is solvable in polynomial time.
10.3 A set S ⊆ V, in G(V, E), is called a dominating set if for every v ∈
V − S there exists an edge u − v in G such that u ∈ S. Formulate
the minimum dominating set problem as a decision problem and
prove that it is NPC. (Hint: One way is to show a reduction from
VC. For every edge u − v add a new path u − x − v in parallel.)
10.4 Formulate the problem of finding a minimum independent set which
is also dominating, as a decision problem and prove its NP-complete-
ness. (Hint: Use 10.3. Duplicate vertices. The set of duplicates is in-
dependent. Add to each duplicate a path of length 2.)
10.5 Formulate the traveling salesman problem for undirected graphs,
where a circuit is required but it may pass through vertices more than
once, as a decision problem, and prove that it is NPC.
10.6 Show that there is a polynomial algorithm for finding a circuit C,
as in 10.5, whose length l(C) satisfies l(C) ≤ 2·l(T), where l(T) is
the length of a minimum spanning tree T of the graph. Prove also
that every such circuit C satisfies l(C) > l(T), if all edge lengths are
positive.
10.7 Appel and Haken [6] proved that every plane graph is 4-colorable.
(This is the famous four color theorem.) Thus, 4-colorability of plane
graphs is polynomial (why?). Prove that for every k ≥ 3, k-colorability
of graphs in general is NPC.
10.8 The following is called the partition of a graph into cliques problem:
Input: A graph G(V, E) and a positive integer k.
Problems 241
*The subgraph of G(V, E) induced by S ⊆ V is the graph (S, E'), where E' is
the subset of E which includes all the edges whose endpoints are in S.
s = x1,
t = xn,
k = n² − k'.)
Σ_{j=1}^{n} aj = b
10.20 The following network flow problem is called the integral flow with
bundles problem (IFWB). As in the maximum flow problems, there
is a digraph G(V, E), a vertex s, assigned as the source and a ver-
tex t, assigned as the sink. The bundles are subsets of edges, B1,
B2, . . ., Bk. There are bundle capacities c1, c2, . . ., ck, and the flow
f must satisfy the condition that for each Bi, Σ_{e∈Bi} f(e) ≤ ci. In addition,
for every vertex v ∈ V − {s, t}, the incoming flow must equal
the outgoing flow. The question is whether there is a flow which
meets the requirement R. Prove that IFWB is NPC. (Hint: May be
proved by IND ∝ IFWB [12].)
REFERENCES
[1] Garey, M. R., and Johnson, D. S., Computers and Intractability: A Guide to
the Theory of NP-Completeness, W. H. Freeman, 1979.
[2] Stockmeyer, L. J., "Planar 3-Colorability is NP-Complete", SIGACT News,
Vol. 5, #3, 1973, pp. 19-25.
[3] Garey, M. R., Johnson, D. S., and Stockmeyer, L. J., "Some Simplified NP-
Complete Graph Problems", Theor. Comput. Sci., Vol. 1, 1976, pp. 237-267.
[4] Kautz, W. H., Levitt, K. N., and Waksman, A., "Cellular Interconnection
Arrays", IEEE Trans. Computers, Vol. C-17, 1968, pp. 443-451.
[5] Even, S., Algorithmic Combinatorics, Macmillan, 1973. (See Section 1.4.)
244 NP-Complete Graph Problems
[6] Appel, K., and Haken, W., "Every Planar Map is Four Colorable", Bull.
Amer. Math. Soc., Vol. 82, 1976, pp. 711-712.
[7] Karp, R. M., "Reducibility among Combinatorial Problems", in R. E. Miller
and J. W. Thatcher (eds.), Complexity of Computer Computations, Plenum
Press, 1972, pp. 85-104.
[81 Garey, M. R., Graham, R. L., and Johnson, D. S., "The Complexity of Com-
puting Steiner Minimal Trees", SIAM J. AppL. Math., Vol. 32, 1977, pp.
835-859.
[9] Even, S., and Shiloach, Y., "NP-Completeness of Several Arrangement Prob-
lems", Technical Report #43, Dept. of Comp. Sci., Technion, Haifa, Israel,
Jan. 1975.
[10] Even, S., Itai, A., and Shamir, A., "On the Complexity of Timetable and
Multicommodity Flow Problems", SIAM J. Comput., Vol. 5, #4, Dec. 1976,
pp. 691-703.
[11] Itai, A., "Two Commodity Flow" J. ACM, Vol. 25, #4, Oct. 1978, pp.
596-611.
112] Sahni, S., "Computationally Related Problems", SIAMJ. on Comput., Vol. 3,
1974, pp. 262-279.
INDEX
adjacency, 2
  matrix, 4
  list, 173
Aho, A. V., 20, 199, 211
alphabet, 69
Appel, K., 240, 244
arbiter, 31
arbitrated (digraph), 31
attachments, 148
augmenting path, 92
Auslander, L., 171, 191
back edge, 57
backward labeling, 92, 108
Berge, C., 20, 31, 51
BFS, 12, 96
blank symbol, 196
block, 198
Bock, F., 52
Booth, K. S., 172, 190, 191
branching, 40
breadth-first search, 12
bridge (component), 148
bridge (edge), 67
Burge, W. H., 81, 89
bush form, 185
capacity, 90
  of a cut, 91, 108
Catalan numbers, 82
Cayley, A., 26, 27, 51
Cederbaum, I., 171, 181, 191
characteristic sum, 71
  condition, 71
Cheriton, D., 26, 51
Cherkassky, B., 97, 115
chromatic number, 217
Chu, Y. J., 52
Church's thesis, 195
circuit, 3
  directed, 3
  directed Euler, 6
  Euler, 5
  simple, 3
circuit-free, 22
clause, 200
clique, 51, 67, 212
CLIQUE (problem), 212
code, 69
  exhaustive, 88
  uniquely decipherable (UD), 69
  word, 69
complement, 200
complete graph, 39
component, 148
  singular, 148
connectivity
  edge, 131
  of digraphs, 130
  vertex, 124
consistency condition, 200
contraction, 164
Cook, S. A., 199, 200, 211
cut, 91
  maximum cut problem (MAXC), 226
cutset, 162
Dantzig, G. B., 20, 21, 121, 146
de Bruijn, 8
  diagrams, 9, 50
  sequence, 8, 50
dead-end, 100
degree, 1
  matrix, 39
depth-first search, 53
DFS, 53, 56
DHC, 216
DHP, 214
digraph
  arbitrated, 31
  connectivity, 130
  Euler, 6
  PERT, 138
Dijkstra, E. W., 13, 18, 19, 20
Dilworth, R. P., 143, 147
Dinic, E. A., 97, 115
Dirac, G. A., 19, 21
directed Hamilton circuit problem (DHC), 216

passage, 53
path, 2
  augmenting, 92
  critical, 139
  directed, 3
  directed Euler, 6
  Euler, 5
  length, 3
  shortest, 11
  simple, 3
path addition algorithm, 171
Patterson, G. W., 69, 88
Perl, Y., 89
PERT digraph, 138
Petersen graph, 169
phase, 99
planar, 148
plane graph, 148
  equivalence, 160
Pnueli, A., 29
polynomial reduction, 198
potential, 103
Pramodh Kumar, M., 97, 115
prefix, 69
  code, 72
Prim, R. C., 25, 51
Prüfer, H., 27, 51
quasi strongly connected, 31
reachability, 30
root, 30
Sahni, S., 244
Sardinas, A. A., 69, 88
SAT, 200
  3SAT, 203
satisfiability condition, 200
SC, 208
Schnorr, C. P., 131, 147
SDR, 144
second lowpoint, 172
self-loop, 1, 3
set covering (SC), 208
  3SC, 209
set packing, 210
Shamir, A., 233, 244
shift register state diagrams, 9
Shiloach, Y., 134, 226, 244
shortest path, 11
singular component, 148
sink, 90
  graphical, 183
sort-by-merge, 80
source, 90
  graphical, 183
spanning tree, 24
  directed, 34
  minimum, 24
  number of, 34
ST, 225
stack, 86
state, 196
  initial, 196
  'no', 196
  'yes', 196
start-vertex, 3
  in a PERT digraph, 138
Steiner tree problem (ST), 225
st-numbering, 180
Stockmeyer, L. J., 218, 226, 243
strongly connected, 3
  components, 64
subgraph, 24
  induced, 25
suffix, 69
superstructure, 60, 64
tail, 69
tape-symbols, 196
Tarjan, R. E., 26, 45, 51, 52, 53, 56, 57, 65, 68, 116, 128, 134, 146, 147, 171, 172, 181, 191, 216
Tarry, G., 53, 66, 68
termination vertex (PERT), 138
Tescari, A., 68
3-coloration of planar graphs problem (3CP), 221
3-coloration problem (3C), 218
three dimensional matching (3DM), 204
  pairwise consistent, 209
tiling problem, 33
total flow, 90
transition function, 196
transitive closure, 19
traveling salesman problem, 217
tree, 22
  definitions, 22
  directed, 30
  full, 78
  ordered, 69, 84
  positional, 74
  spanning, 24
tree edge, 57
Trémaux, 53
Turing machine, 195
Tutte, W. T., 35, 52
UD, 69
Ullman, J. D., 20, 199, 211
underlying graph, 6
uniquely decipherable codes, 69
U2CIF, 236
vertex
  lowpoint, 59
  separation, 57
  separator, 121, 130
virtual edges, 184
virtual vertices, 182