Introduction To Graph Theory
Each edge e ∈ E can thus be described by an unordered pair e = {u, v} consisting of the two end vertices of e.
Graph Theory 2
Each directed edge or arc a ∈ A is an ordered pair a = (u, v) of two vertices u and v. That is,
u and v are not interchangeable end vertices of a. Vertex v is called the head, while vertex u
is called the tail of arc a. We also say that arc a leaves vertex u and enters vertex v.
[Figure: an undirected graph on six vertices with 7 edges (left) and a directed graph on six vertices with 6 arcs (right).]
If it is clear from the context whether the graph under consideration is undirected or directed, we typically omit the adjectives “directed” or “undirected”. If not stated otherwise, we denote the number of vertices of a graph by n, and the number of (directed) edges of a graph by m.
For example, each of the two graphs in the illustration above has n = 6 vertices. The left
graph has 7 edges, while the right graph has 6 arcs.
Definition 1.3 (Adjacent, incident, loops). If u and v are the two end vertices of undirected edge e = {u, v} [or of directed edge a = (u, v)], we say that v is adjacent or neighbouring to u. Vertices u and v are incident to the undirected edge e = {u, v} [or the directed edge a = (u, v), resp.]. An edge with identical end vertices is called a loop.
Example 1.4. For example, the vertex with label 2 in the graph on the left in the figure below is adjacent to the two vertices with labels 1 and 3. The vertex with label 2 in the graph on the right is adjacent to the three vertices with labels 1, 3, and 4. The vertex with label 1 in the graph on the right is the end vertex of a loop.
[Figure: an undirected graph (left) and a directed graph (right), each on six vertices; vertex 1 in the right graph carries a loop.]
Definition 1.5 (Degree). The degree of a vertex u, deg(u) for short, is the number of edges
incident to u. Loops are counted twice. For directed graphs, we distinguish between the indegree
and the outdegree of u, deg− (u) and deg+ (u) for short. The indegree is the number of edges
entering u, while the outdegree is the number of edges leaving u.
For example, the vertex with label 1 in the right graph in the figure above has degree 4, while
the vertex with label 4 has degree 3. In the directed graph on the left, the vertex with label 2
has indegree deg−(2) = 2 and outdegree deg+(2) = 1.
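The degree counts from Definition 1.5 can be sketched in a few lines of code. The edge-list encoding below is an illustrative assumption, not taken from the figures; note how a loop contributes two to the degree of its vertex.

```python
# Sketch: computing degrees from an edge list (loops counted twice),
# and in-/outdegrees from an arc list. The example lists are
# illustrative, not the graphs from the figures.

def degrees(vertices, edges):
    deg = {v: 0 for v in vertices}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1  # a loop {u, u} is thereby counted twice, as required
    return deg

def in_out_degrees(vertices, arcs):
    indeg = {v: 0 for v in vertices}
    outdeg = {v: 0 for v in vertices}
    for u, v in arcs:  # arc (u, v) leaves u and enters v
        outdeg[u] += 1
        indeg[v] += 1
    return indeg, outdeg

edges = [(1, 2), (2, 3), (1, 1)]  # undirected, with a loop at vertex 1
deg = degrees([1, 2, 3], edges)
print(deg)  # {1: 3, 2: 2, 3: 1}
```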
Example 1.8. Consider, for example, the left and the right graph in the illustration below.
The subgraph in blue of the right graph is induced, while the subgraph in red of the left graph
is not induced.
[Figure: two graphs on six vertices; the blue subgraph of the right graph is induced, while the red subgraph of the left graph is not.]
We call a graph complete if each pair of vertices is linked by an edge. A directed graph G = (V, A) is called symmetric if (u, v) ∈ A implies (v, u) ∈ A. Note that for each undirected graph there exists an associated symmetric directed graph, which can be constructed by replacing each undirected edge {v, w} by two anti-parallel arcs (v, w) and (w, v). Consider, for example, the complete graph on three vertices on the left in the illustration below, and its associated symmetric directed graph on the right.
[Figure: the complete graph on three vertices (left) and its associated symmetric directed graph (right).]
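The edge-replacement construction described above is mechanical enough to sketch directly; the tuple encoding of edges is an assumption for illustration.

```python
# Sketch: building the associated symmetric directed graph of an
# undirected graph by replacing each edge {u, v} with the two
# anti-parallel arcs (u, v) and (v, u).

def to_symmetric_digraph(edges):
    arcs = []
    for u, v in edges:
        arcs.append((u, v))
        arcs.append((v, u))
    return arcs

# The complete graph on three vertices from the illustration:
k3_edges = [(1, 2), (2, 3), (1, 3)]
print(to_symmetric_digraph(k3_edges))
# [(1, 2), (2, 1), (2, 3), (3, 2), (1, 3), (3, 1)]
```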
Definition 1.9 (Bipartite graphs). An undirected graph G = (V, E) is bipartite if the vertex
set can be partitioned into two sets V = L ∪ R with L ∩ R = ∅ such that every edge has exactly
one end vertex in L, and one end vertex in R.
As mentioned above, you can find applications of graphs almost everywhere. Here, we mention
just a few.
Information networks. The World Wide Web can be modeled as a directed graph whose vertices correspond to Web pages, with an arc from u to v if page u has a hyperlink to page v. The structure of these hyperlinks can be used by algorithms to infer the most important pages on the Web. Such techniques are crucial for search engines.
Social networks. Social networks model pairwise relationships between groups of people that interact, for example employees in a company or high school students. Here, the vertices correspond to the people, and the edges represent pairwise relationships. Undirected edges correspond to symmetric relationships (like friendship), while directed edges are used to model asymmetric relationships (e.g. hierarchies inside a company).
Definition 1.10 (Paths, cycles, and circuits). A path, also called a walk, in a graph is a sequence v1, e1, v2, e2, ..., vk of vertices vi and edges ei such that ei has end vertices vi and vi+1 for i = 1, ..., k − 1. A u → v-path is a path with first vertex u and last vertex v. If u = v, the path is called a cycle. A path is called simple if all its vertices (except the start and end vertex of a cycle) are distinct from each other. Simple cycles are called circuits. If the graph has no parallel edges, it suffices to represent a path or cycle as an ordered sequence of either its vertices or its edges. In a directed path or directed cycle v1, e1, v2, e2, ..., vk, the edges ei = (vi, vi+1) are directed edges with tail vi and head vi+1 for i = 1, ..., k − 1.
[Figure: an undirected path and a directed path on vertices 1, 2, 3, ..., k.]
Definition 1.12 (Connectivity in directed graphs). In a directed graph G = (V, A), vertex v is reachable from vertex w if there exists a directed path from w to v. A directed graph G = (V, A) is strongly connected if each vertex is reachable from every other vertex. A strongly connected component is a strongly connected subgraph which is maximal in the sense that no vertex of the supergraph can be added such that the larger subgraph is still strongly connected.
Example 1.13. Consider, for example, the undirected [directed] graph on the left [right] in the illustration below. The green subgraph forms a [strongly] connected component, but the red subgraph does not.
[Figure: an undirected graph (left) and a directed graph (right) on seven vertices; the green subgraph forms a (strongly) connected component, the red subgraph does not.]
Definition 1.14 (Trees and forests). An undirected graph which is cycle-free is called a forest. A connected forest is called a tree.
Example 1.15. The graph in the illustration below is a forest on 12 vertices that decomposes
into three trees as its connected components.
[Figure: a forest on 12 vertices consisting of three trees.]
We observe that any tree on n vertices has n − 1 edges. In fact, an even stronger statement is
true:
Lemma 1.16. Let G be an undirected graph on n vertices. Then any two of the following
statements imply the third:
(i) G is connected.
1.3 Exercises
Exercise 2. A cycle which contains each vertex exactly once is called a Hamiltonian cycle.
Show that a bipartite graph with an odd number of vertices does not contain a Hamiltonian
cycle.
Exercise 3. Let G be an undirected graph on n vertices that is connected and does not contain a cycle. Prove that G has n − 1 edges.
Exercise 5. Let G be an undirected graph on n vertices that does not contain a cycle and has n − 1 edges. Prove that G is connected.
2 Breadth-First Search
We now turn our attention to a basic algorithmic question: given a graph G = (V, E) and two designated vertices s and t, how can we efficiently decide whether there is a path from s to t in G? This decision problem is called s-t connectivity.
A relatively simple algorithm for s-t connectivity is breadth-first search (BFS), in which the graph is explored outward from s in all possible directions, adding vertices one “layer” at a time. Thus, we start at vertex s and add all vertices that are adjacent to s to a first, initially empty layer. In each iteration, as long as there exist further vertices which can be reached from s, the next layer is constructed by adding all vertices which have not been visited before and which are adjacent to a vertex in the previous layer. Note that the algorithm can be applied to both directed and undirected graphs.
BFS has multiple applications, such as detecting connected components, topological sorting,
critical path analysis, and for testing bipartiteness of a graph. Some of those applications will
be discussed in the exercises. A neighbour of a set S ⊆ V is a vertex v ∈ V \ S for which there
exists an edge (directed or undirected) whose other end vertex lies in S.
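The layer-by-layer exploration described above can be sketched as follows; the adjacency-list encoding is an assumption for illustration, and the same code works for directed graphs if the lists store out-neighbours.

```python
# Sketch of BFS layer construction: layer L0 = [s]; each next layer
# collects the not-yet-visited neighbours of the previous layer.

def bfs_layers(adj, s):
    layers = [[s]]
    visited = {s}
    while True:
        next_layer = []
        for u in layers[-1]:
            for v in adj[u]:
                if v not in visited:
                    visited.add(v)
                    next_layer.append(v)
        if not next_layer:      # no further vertices reachable from s
            return layers
        layers.append(next_layer)

# A hypothetical small graph, given by adjacency lists:
adj = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
print(bfs_layers(adj, 1))  # [[1], [2, 3], [4], [5]]
```

Deciding s-t connectivity then amounts to checking whether t appears in any layer.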
Example 2.1. Consider the graph on the left in the illustration below. Applying BFS with
start vertex s = 1 assigns the vertices to layers as shown in the graph on the right.
[Figure: the input graph on six vertices (left) and the BFS layers L0, L1, L2, L3 computed from start vertex s = 1 (right).]
Note that a minor modification of BFS allows us to compute, for every vertex v which is reachable from start vertex s, an s → v-path using the smallest number of edges. In order to compute such paths, it suffices to introduce an additional pointer p(v) which points to the vertex in the previous layer to which v is adjacent. This way, BFS constructs a tree with root s, which is usually called the reachability tree or BFS tree.
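The pointer modification above can be sketched as follows; a queue replaces the explicit layers, and the parent map plays the role of p(v). The example graph is the same hypothetical one as before.

```python
from collections import deque

# Sketch: BFS keeping, for each discovered vertex v, a pointer
# parent[v] to the vertex from which v was discovered. Following
# these pointers from v back to s yields an s -> v path with the
# smallest number of edges (a path in the BFS tree).

def bfs_tree(adj, s):
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent

def path_to(parent, v):
    path = []
    while v is not None:
        path.append(v)
        v = parent[v]
    return path[::-1]  # reverse so that s comes first

adj = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
parent = bfs_tree(adj, 1)
print(path_to(parent, 5))  # [1, 2, 4, 5]
```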
2.1 Exercises
Exercise 6. Consider the chessboard shown below and note that some squares are shaded. We wish to determine a knight's tour, if one exists, that starts in the upper left corner and, after visiting a minimum number of squares, ends in the bottom right corner. The tour cannot visit any shaded square. Formulate this problem as a reachability problem on an appropriately defined graph.
Exercise 7. Two men have an 8-gallon jug full of wine and two empty jugs with capacities of 5 and 3 gallons. They want to divide the wine into two equal parts. Suppose that, when pouring wine from one jug to another, in order to know how much they have transferred, the men must always empty out the first jug or fill the second one (or both). Formulate this problem as a reachability problem in an appropriately defined graph.
Exercise 8. Consider the following problem. A farmer is standing on one side of a river. He wants to cross the river with his livestock consisting of a wolf, a goat and a cabbage. There is a boat available, but this boat can only transport two of the four items. Of course, the farmer is always one of them. Furthermore, the farmer cannot leave the wolf and the goat alone (without the farmer), as the wolf will eat the goat. For the same reason the goat and the cabbage cannot be left alone. Formulate this problem as a reachability problem on a graph, and solve it.
Assumptions: We assume that all vertices in G are reachable from s, and that G is simple
in the sense that G does not contain any parallel arcs or loops.
Note that these assumptions are not restrictive. In a preprocessing step, we could simply
run a breadth-first search algorithm and restrict to the subgraph induced by the vertices
reachable from s. Moreover, we can ignore all loops and, in case of parallel arcs, ignore all but
the cheapest one among the parallel arcs.
Shortest paths in undirected graphs. Note that although the shortest-paths problem is defined on a directed graph, we can handle the case of an undirected graph as well by simply replacing each undirected edge e = {u, v} by two antiparallel directed arcs a = (u, v) and a′ = (v, u), both of cost c(e).
In 1959, Edsger Dijkstra proposed a very simple greedy algorithm to solve the single-source
shortest-paths problem. We begin by describing an algorithm that just determines the length
of the shortest path from s to every other vertex v in the graph. The shortest-path length or
distance from s to v will be abbreviated by d(v). Once we have understood how the algorithm
determines the shortest-path distances d(v) for all vertices v ∈ V , we will see that it is then
easy to construct the associated shortest s → v paths as well. The algorithm, described below
in pseudocode, maintains a set S of vertices u for which a shortest-path distance d(u) from s
is already computed; this set S is called the “explored” part of the graph. This set grows in
each iteration of the algorithm by one vertex until all vertices are explored (remember that we
assume that all vertices are reachable). Initially, S = {s} and d(s) = 0. Now, for every vertex v ∈ V \ S which is a neighbour of some vertex in S, we determine the shortest path from s to v among those s → v paths consisting only of vertices in S, except for the final vertex v. That is, we compute the quantity

d′(v) = min{ d(u) + c(u, v) | u ∈ S, (u, v) ∈ A }

and call d′(v) the preliminary distance at this stage of the algorithm. We then pick a vertex v ∈ V \ S for which the preliminary distance d′(v) is minimal, add v to S, and define d(v) to be the value d′(v). Let us denote by N(S) the set of neighbours of S, i.e.,

N(S) = { v ∈ V \ S | (u, v) ∈ A for some u ∈ S }.
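The selection step just described can be sketched directly; the vertex/arc/cost encoding is an assumed illustrative representation. This is the straightforward variant that rescans all arcs in every iteration; a priority-queue implementation is discussed later in the running-time section.

```python
# Sketch of Dijkstra's selection step: in each iteration, compute the
# preliminary distance d'(v) for every neighbour v of S (the keys of
# d play the role of S) and move the minimizer into S.

def dijkstra_simple(vertices, arcs, cost, s):
    d = {s: 0}                              # S = set of keys of d
    while len(d) < len(vertices):
        best_v, best_d = None, float("inf")
        for (u, v) in arcs:
            if u in d and v not in d:       # v is a neighbour of S
                cand = d[u] + cost[(u, v)]  # candidate for d'(v)
                if cand < best_d:
                    best_v, best_d = v, cand
        if best_v is None:
            break                           # remaining vertices unreachable
        d[best_v] = best_d
    return d

arcs = [(1, 2), (1, 3), (2, 4), (3, 4)]
cost = {(1, 2): 2, (1, 3): 5, (2, 4): 6, (3, 4): 1}
print(dijkstra_simple([1, 2, 3, 4], arcs, cost, 1))
# {1: 0, 2: 2, 3: 5, 4: 6}
```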
Example 3.1. Consider the figure below which illustrates the stage of the algorithm at the beginning of the 6th iteration of the while-loop. Six vertices are already explored and belong to the current set S (the vertices inside the blue ellipse), and the preliminary distances of the two neighbours of S (the two vertices x and v in magenta) are already computed: d′(x) = 12 and d′(v) = 11. Thus, the algorithm will select v, add it to S, and assign the true distance d(v) = d′(v).
[Figure: the explored set S contains s, y, w, u (among others), with d(y) = 7, d(w) = 8, d(u) = 9; the neighbours x and v of S have preliminary distances d′(x) = 12 and d′(v) = 11.]
Construction of the associated shortest paths. Note that, in order to compute the preliminary distance for each neighbour v ∈ N(S), it suffices to check locally for a vertex u ∈ S which minimizes d(u) + c(u, v). Thus, in order to produce the s → v paths corresponding to the distances found by Dijkstra's algorithm, it suffices to simply record in each iteration of the while-loop, when a new vertex v is added to S, the arc (u, v) on which the value min{d(u) + c(u, v) | u ∈ S} is achieved. This way, the shortest path Pv from s to v of length l(Pv) = d(v) is implicitly represented by these arcs: if (u, v) is the arc we have stored for v, then Pv is just (recursively) the path Pu followed by the arc (u, v). That is, we simply construct Pv by starting at v, then follow the arc (u, v) we have stored for v in reverse direction; then follow the arc we have stored for u in reverse direction; and so on until we reach s. The path constructed this way must reach s, since our backward walk visits vertices that were added to S earlier and earlier. By our assumption that the graph does not contain any parallel arcs, it suffices to store the vertex u as predecessor of vertex v instead of storing the arc (u, v) for v.
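The predecessor-recording idea can be sketched as follows; the instance is the same hypothetical one as in the earlier sketch, and the predecessor map pred plays the role of the stored arcs.

```python
# Sketch: whenever a vertex v is added to S we record its predecessor
# u, the tail of the arc on which min{d(u) + c(u, v)} is achieved;
# walking the stored predecessors backwards from v yields the
# shortest s -> v path.

def dijkstra_with_paths(vertices, arcs, cost, s):
    d, pred = {s: 0}, {s: None}
    while len(d) < len(vertices):
        best = None
        for (u, v) in arcs:
            if u in d and v not in d:
                cand = d[u] + cost[(u, v)]
                if best is None or cand < best[0]:
                    best = (cand, v, u)
        if best is None:
            break                    # remaining vertices unreachable
        cand, v, u = best
        d[v], pred[v] = cand, u
    return d, pred

def shortest_path(pred, v):
    path = []
    while v is not None:             # walk backwards until s (pred None)
        path.append(v)
        v = pred[v]
    return path[::-1]

arcs = [(1, 2), (1, 3), (2, 4), (3, 4)]
cost = {(1, 2): 2, (1, 3): 5, (2, 4): 6, (3, 4): 1}
d, pred = dijkstra_with_paths([1, 2, 3, 4], arcs, cost, 1)
print(shortest_path(pred, 4))  # [1, 3, 4]
```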
We still need to convince ourselves that Dijkstra's algorithm is doing the right thing. In particular, we need to show that when Dijkstra's algorithm adds a new vertex v to S and assigns d(v) = d′(v), then d(v) is indeed the true shortest-path distance from s to v. Note that this fact follows as an immediate consequence once we have proved that Dijkstra's algorithm maintains as loop-invariant that for each u ∈ S, the value d(u) is the length of a shortest s → u-path in G.
Theorem 3.2. Dijkstra’s algorithm maintains the following loop-invariant: throughout the
entire algorithm, for each u ∈ S, the value d(u) is the length of a shortest s → u-path in G.
Proof (by contradiction). Initially, when S = {s}, the length of the shortest s → s-path is d(s) = 0. If the statement of the theorem were false, there would be some input instance consisting of a graph G = (V, A) and arc costs c : A → R+ such that the algorithm, in some iteration with current set S, assigns a value d(v) to a vertex v although there exists some s → v-path of length smaller than d(v). Consider the first iteration where this happens. Note that Dijkstra's algorithm assigns to v a value d(v) = d(u) + c(u, v) corresponding to the length d(u) of a shortest s → u-path plus the length c(u, v) of arc (u, v).
Now, let P be an s → v-path of length l(P) < d(v), and consider the first arc (x, y) on P with x ∈ S and y ∉ S. Let us denote by P′ the subpath of P going from s to x. For an illustration, see the figure below. Then we derive the contradiction

l(P) ≥ l(P′) + c(x, y) ≥ d(x) + c(x, y) ≥ d′(y) ≥ d′(v) = d(v) > l(P).
In the above chain of inequalities, the first inequality follows from the fact that the costs of the arcs on the (possibly empty) subpath of P going from y to v are non-negative. The second inequality follows from the fact that we consider the first iteration where the statement of the theorem is violated, implying l(P′) = d(x) since x was added to S at an earlier iteration. The third inequality follows from the definition of the preliminary distance d′(y), and the last inequality follows from our selection rule, according to which we pick v as the neighbour of smallest preliminary distance.
[Figure: the path P from s to v; (x, y) is the first arc of P leaving S, and P′ is the subpath of P from s to x.]
Graphs with arcs of negative cost. In some applications we need to compute shortest paths in graphs that might contain arcs of negative cost or length. Dijkstra's algorithm, however, requires all arc costs to be non-negative; if arcs with negative lengths appear, Dijkstra's algorithm might in fact fail. As an exercise, we ask you to construct an example instance where some of the arcs have negative costs and where Dijkstra's algorithm fails. For the computation of shortest paths in graphs with general arc costs, you can use the Bellman-Ford algorithm, which either returns the shortest-path distance from a designated start vertex s to all vertices reachable from s, or returns a cycle on which the sum of arc costs is negative. Such a negative cycle serves as a certificate proving that it is impossible to determine the shortest s → v-paths to all vertices: if a negative cycle exists, then at least for some of the vertices you can always find cheaper and cheaper paths by traversing the negative cycle again and again. The Bellman-Ford algorithm will be discussed in the course “Design and Analysis of Algorithms”, as one application of the Dynamic Programming technique.
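The Bellman-Ford algorithm itself is not specified in these notes; the following is only a minimal sketch of its standard relaxation scheme, under the assumed encoding used in the earlier sketches.

```python
# Minimal sketch of the Bellman-Ford relaxation scheme: relax every
# arc n - 1 times; if some arc can still be relaxed afterwards, a
# negative cycle is reachable from s (here signalled by None).

def bellman_ford(n_vertices, arcs, cost, s):
    INF = float("inf")
    d = {v: INF for v in range(n_vertices)}
    d[s] = 0
    for _ in range(n_vertices - 1):
        for (u, v) in arcs:
            if d[u] + cost[(u, v)] < d[v]:
                d[v] = d[u] + cost[(u, v)]
    for (u, v) in arcs:              # extra pass: negative-cycle check
        if d[u] + cost[(u, v)] < d[v]:
            return None              # negative cycle reachable from s
    return d

arcs = [(0, 1), (1, 2), (0, 2)]
cost = {(0, 1): 4, (1, 2): -2, (0, 2): 3}
print(bellman_ford(3, arcs, cost, 0))  # {0: 0, 1: 4, 2: 2}
```

Note that the arc (1, 2) of negative cost is handled correctly, whereas Dijkstra's algorithm gives no such guarantee.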
Running time and implementation. Let us briefly discuss the running time of Dijkstra's algorithm. There are n − 1 iterations of the while-loop for a graph on n vertices, as each iteration adds one vertex to S. For selecting the right vertex to be added to S, the algorithm needs to compute the preliminary distances d′(v) for all neighbours of the current set S. The most efficient way is to use as data structure a priority queue which maintains the values d′(v) as keys for all v ∈ V \ S (initially, d′(v) = ∞ for all v ∈ V \ S). For selecting v of minimal preliminary distance d′(v), we simply call extract_min. Afterwards, we update the keys d′(w) of the vertices w remaining in the priority queue according to the following rule: if there is no arc (v, w), the value d′(w) need not be changed; otherwise, w is assigned the value min{d′(w), d(v) + c(v, w)}. Such a change of values can only occur once per arc, namely when the tail of the arc is added to S. Using the heap-based priority queue as discussed in Chapter 8, each priority queue operation, and in particular the two operations extract_min and decrease_key, runs in O(log n) time.
Lemma 3.3. Using a priority queue, Dijkstra’s algorithm can be implemented on a graph with
n vertices and m arcs to run in O(m log n) time.
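A priority-queue implementation can be sketched with Python's heapq. Note that heapq offers no decrease_key operation; a common workaround, assumed here, is to push a fresh (distance, vertex) entry instead of updating a key and to skip stale entries when they are extracted, which preserves the O(m log n) bound.

```python
import heapq

# Sketch of the priority-queue implementation of Dijkstra's
# algorithm, with lazy deletion in place of decrease_key.

def dijkstra_heap(adj, s):
    # adj[u] is a list of (v, cost) pairs for the arcs leaving u
    d = {}
    heap = [(0, s)]
    while heap:
        dist, u = heapq.heappop(heap)   # extract_min
        if u in d:
            continue                    # stale entry, skip it
        d[u] = dist
        for v, c in adj.get(u, []):
            if v not in d:
                heapq.heappush(heap, (dist + c, v))
    return d

adj = {1: [(2, 2), (3, 5)], 2: [(4, 6)], 3: [(4, 1)]}
print(dijkstra_heap(adj, 1))  # {1: 0, 2: 2, 3: 5, 4: 6}
```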
3.2 Exercises
Exercise 9. A construction company’s work schedule on a certain site requires the following
number of skilled personnel, called steel erectors, in the months of March through August:
Personnel work at the site on a monthly basis. Suppose that three steel erectors are on the
site in February and three steel erectors must be on site in September. The problem is to
determine how many workers to have on site in each month in order to minimize cost, subject
to the following conditions:
• Transfer costs. Adding a worker to this site costs $100 per worker and redeploying a
worker to another site costs $160.
• Transfer rules. The company can transfer no more than three workers at the start of
any month, and under a union agreement, it can redeploy no more than one-third of the
current workers in any trade from a site at the end of any month.
• Shortage and overtime. The company incurs a cost of $200 per worker per month for
having a surplus of steel erectors on site and a cost of $200 per worker per month for
having a shortage of workers at the site (which must be made up in overtime). Overtime
cannot exceed 25% of the regular work time.
Exercise 10. Four imprudent walkers are caught in a storm at night. To reach the hut, they have to cross a canyon over a fragile rope bridge which can bear the weight of at most two persons. In addition, crossing the bridge requires carrying a torch to avoid stepping into a hole. Unfortunately, the walkers have a single torch, and the canyon is too wide to throw the torch across it. Due to dizziness and tiredness, the four walkers can cross the bridge in 1, 2, 5 and 10 minutes, respectively. When two walkers cross the bridge, they both need the torch and thus cross the bridge at the slower of the two speeds. With the help of a graph, find the minimum time for the walkers to cross the bridge.
Exercise 11. Let cij ≥ 0 denote the capacity of an arc in a given network. Define the
capacity of a directed path P as the minimum arc capacity in P . The maximum capacity path
problem is to determine a maximum capacity path from a specified source s to every other
node in the network. Modify Dijkstra’s algorithm so that it solves the maximum capacity path
problem.
Exercise 12. A farmer wishes to transport a truckload of eggs from one city to another
through a given road network. The truck will incur a certain amount of breakage on each road
segment; let wij denote the fraction of eggs broken if the truck traverses the road segment (i, j).
How should the truck be routed to minimize the total breakage? Formulate this problem as a
shortest path problem.
Problem 4.1 (Minimum Spanning Tree Problem (MST)).
Input: An undirected connected graph G = (V, E) with non-negative edge costs ce for all e ∈ E.
Question: Find an edge subset T ⊆ E of minimum cost c(T) = Σe∈T ce such that the subgraph G[T] = (V, T) is connected.
Remark 4.2. There is always an optimal solution T ⊆ E such that G[T] = (V, T) is a tree. Such an optimal solution T of the MST problem is called an MST.
Example 4.3. Consider the graph illustrated below. The subgraph induced by the red edges forms an MST of cost 13.
[Figure: an undirected graph on six vertices with edge costs; the red edges form an MST of cost 13.]
Historical context. During the electrification of south-west Moravia, the Czech mathematician Otakar Borůvka (1899–1995) was asked how to efficiently connect all clients to an electricity network. Borůvka modeled the problem graph-theoretically as an MST problem and proposed an algorithm to solve it.
Recall from the previous chapter that an undirected (sub-)graph is a tree if it is cycle-free and
connected. Moreover, remember that any tree on n vertices has exactly n − 1 edges.
Theorem 4.4. For an undirected graph G = (V, T), the following four statements (1)–(4) are equivalent.¹
1. G = (V, T ) is a tree.
Proof. Exercise.
The brute-force approach to find an MST by enumerating all spanning trees of a graph and searching for the one of minimal cost is not practical, except for very small instances. This follows as a consequence of Cayley's formula.
Theorem 4.5 (Caley’s Fromula). A complete undirected graph on n vertices contains nn−2
distinct spanning trees.
Proof. The proof is omitted. The interested reader is referred to the book “Proofs from the Book” by Aigner and Ziegler, in which you can find four different beautiful proofs of Cayley's formula.
¹ Recall that two statements A and B are equivalent if A implies B, and B implies A.
Example 4.6. Consider, for example, the complete undirected graph on three vertices (cf. the figure below). This graph contains 3^(3−2) = 3 distinct spanning trees. How many spanning trees does a complete graph on 10 or 30 vertices contain?
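Cayley's formula makes the counting question a one-liner; thanks to Python's arbitrary-precision integers even the 30-vertex count is exact.

```python
# Cayley's formula: the complete graph on n vertices has n**(n - 2)
# distinct spanning trees.

def spanning_tree_count(n):
    return n ** (n - 2)

print(spanning_tree_count(3))   # 3
print(spanning_tree_count(10))  # 100000000  (= 10^8)
print(spanning_tree_count(30))  # 30^28, a 42-digit number
```

The rapid growth explains why brute-force enumeration of spanning trees is hopeless beyond tiny instances.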
As with previous problems such as the interval scheduling or the interval partitioning problem,
we can easily come up with natural greedy algorithms for the MST problem. Luckily, for the
MST problem, several different greedy algorithms work correctly. In this chapter, we discuss
two natural greedy algorithms: Kruskal’s algorithm (Kruskal 1956), and Prim’s algorithm
(Prim 1957). We will see that the augmentation property of trees turns out to be crucial for
proving the correctness of both greedy algorithms.
Kruskal’s algorithm. Kruskal’s algorithm starts without any edges and builds a spanning
tree by successively inserting edges from E in order of increasing cost. An edge is inserted
whenever it does not create a cycle with the edges already selected.
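Kruskal's rule can be sketched with a simple union-find structure; the cycle test "do the end vertices lie in the same component?" replaces an explicit cycle search. The edge encoding is an illustrative assumption.

```python
# Sketch of Kruskal's algorithm: scan the edges in order of
# increasing cost and keep an edge exactly when its end vertices
# lie in different components (so no cycle is closed).

def kruskal(vertices, edges):
    # edges: list of (cost, u, v) triples
    parent = {v: v for v in vertices}

    def find(v):                  # component representative, path-halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for c, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:              # different components: no cycle closed
            parent[ru] = rv       # union the two components
            tree.append((c, u, v))
    return tree

edges = [(1, 1, 2), (2, 2, 3), (3, 1, 3), (4, 3, 4)]
mst = kruskal([1, 2, 3, 4], edges)
print(mst)                        # [(1, 1, 2), (2, 2, 3), (4, 3, 4)]
print(sum(c for c, _, _ in mst))  # total cost 7
```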
Prim's algorithm. Another simple greedy algorithm can be designed in analogy with Dijkstra's shortest-path algorithm. The algorithm, known as Prim's algorithm, starts with an arbitrarily selected root vertex s and grows a tree iteratively outward from s. In each iteration,
the next vertex is attached as cheaply as possible to the current partial tree. More precisely,
the algorithm maintains as loop invariant a vertex set S ⊆ V together with a spanning tree
T on the subgraph induced by S. Initially, S = {s}. In each iteration, until S = V , the
algorithm grows the current set S by adding the vertex that minimizes the “attachment cost” min{ce | e ∈ δ(S)}, where

δ(S) = {e = {u, v} ∈ E | u ∈ S, v ∉ S}

denotes the set of all edges with exactly one end vertex in S. In each iteration, Prim's algorithm adds the edge attaining min{ce | e ∈ δ(S)} to the current partial tree T, and adds its other end vertex to S.
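The growth step above can be sketched as follows; as in the Dijkstra sketch, a heap with lazy deletion stands in for decrease_key, and the adjacency-list encoding is an assumption for illustration.

```python
import heapq

# Sketch of Prim's algorithm: grow S from the root s, always
# attaching the cheapest edge in the cut delta(S).

def prim(adj, s):
    # adj[u]: list of (v, cost) pairs; undirected, both directions stored
    in_tree = {s}
    tree = []
    heap = [(c, s, v) for v, c in adj[s]]   # edges of delta({s})
    heapq.heapify(heap)
    while heap:
        c, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue                        # edge no longer in delta(S)
        in_tree.add(v)
        tree.append((u, v, c))
        for w, cw in adj[v]:
            if w not in in_tree:
                heapq.heappush(heap, (cw, v, w))
    return tree

adj = {
    1: [(2, 1), (3, 3)],
    2: [(1, 1), (3, 2), (4, 4)],
    3: [(1, 3), (2, 2), (4, 4)],
    4: [(2, 4), (3, 4)],
}
tree = prim(adj, 1)
print(tree)  # [(1, 2, 1), (2, 3, 2), (2, 4, 4)]
```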
Remark: Robert Clay Prim (born 1921, USA) was not the first to describe this greedy algorithm for the MST problem. In fact, the algorithm had already been developed in 1930 by the Czech mathematician Vojtěch Jarník. Furthermore, Dijkstra described the algorithm in 1959 in his article “A Note on Two Problems in Connexion with Graphs”; the second problem was the shortest-path problem. For this reason, Prim's algorithm is also known under the names DJP algorithm, Prim-Jarník algorithm, Jarník's algorithm, and Prim-Dijkstra algorithm.
For the analysis of Prim's and Kruskal's algorithms in the subsequent section, the following observation will be useful.
Observation 4.8. A cycle C and a cut δ(S) always intersect in an even number of edges (cf.
the illustration below).
[Figure: a cycle C crossing the cut δ(S) an even number of times.]
Note that both algorithms start with an initially empty edge set T = ∅, and grow the partial
solution T by iteratively inserting one edge at a time. Therefore, to analyze the algorithm, it
would be useful to have in hand some local property which tells us when it is safe to include
an edge in the current solution.
Definition 4.9. Let G = (V, E) be an undirected graph. We call a subset F of edge set E
MST-extendable if there exists a MST containing F .
Proof. Consider some MST T containing all edges in F . Note that such an MST T must exist,
since F is assumed to be MST-extendable. Let e be a cheapest edge in δ(U ), where U is any
connected component of subgraph G[F ] = (V, F ). Clearly, if e ∈ T , the statement of the lemma
holds.
Otherwise, by Observation 4.7, there exists some cycle C in T + e. Since |C ∩ δ(U)| is even by Observation 4.8, there must exist, besides e, some other edge f ∈ C ∩ δ(U). Now, since e was a cheapest edge in δ(U), we have ce ≤ cf.
By the Augmentation property (cf. Observation 4.7) we know that T′ = T − f + e is again a spanning tree. Thus, since T is an MST, it follows that cf = ce, implying that T′ = T − f + e is an MST containing F + e.
Example 4.11. Consider the illustration below and assume that the set F of red edges is MST-extendable. Note that the subgraph G[F] = (V, F) decomposes into 7 connected components. One of these components consists of the set U of blue vertices. Note that δ(U) contains 4 edges. The cut property above tells us that when we add the cheapest edge e in δ(U) to F, then F + e remains MST-extendable.
With the Cut Property at hand, we can now easily prove the optimality of both Kruskal's and Prim's algorithms. The point is that both algorithms only include an edge when it is justified by the Cut Property.
Proof. Note that it suffices to prove the following invariant: throughout the entire algorithm,
the set T of selected edges is MST-extendable. We prove this loop invariant via induction.
Clearly, the initial set T = ∅ is MST-extendable. For proving the induction step, observe
that in each of the subsequent iterations, Kruskal’s algorithm extends the current set T by a
cheapest edge among those that do not close a cycle with T . However, those edges which do
not close a cycle with T are exactly the edges in the cuts induced by the connected components
of G[T ] = (V, T ). Thus, by the Cut Property, T + e is MST-extendable.
Proof. Note that it suffices to prove the following loop invariant: throughout the entire algo-
rithm, the set T of edges selected so far is MST-extendable. Again, we prove the loop invariant
via induction. Clearly, the initial set T = ∅ is MST-extendable. To prove the induction step,
observe that in each of the subsequent iterations, Prim’s algorithm extends the current set T
by a cheapest edge in δ(V [T ]), i.e., the cut induced by the end vertices of T . Thus, by our Cut
Property, T + e is MST-extendable.
Running time. Let us briefly comment on the running time of Kruskal's and Prim's algorithms. With the right data structure, namely a min-priority queue for Prim's algorithm and a union-find data structure for Kruskal's algorithm, both algorithms can be implemented to run in time O(m log n), where m is the number of edges and n is the number of vertices in G.
4.4 Exercises
Exercise 13. The travelling salesman problem asks the following question: “Given a graph G = (V, E), where V is a list of cities, E contains an edge for each pair of cities, and ci,j denotes the distance between cities i and j, what is the shortest possible route that visits each city exactly once and returns to the origin city?” In practice, edge lengths often satisfy the triangle inequality: for each combination of three cities i, j, k ∈ V, the direct route from city i to j is never longer than the detour via city k. More formally,

ci,j ≤ ci,k + ck,j for all i, j, k ∈ V.

One can use the minimum spanning tree problem to find a solution that is at most twice the length of any optimal solution.
a) Show that the total length of any optimal tour for the travelling salesman problem is at
least the sum of the edge lengths in a minimum spanning tree.
b) Show that you can use the minimum spanning tree to create a tour with at most twice the
length of any optimal tour.
Exercise 14. Think of the network on the next page as a highway map, and the number
recorded next to each arc as the maximum elevation encountered in traversing the arc. A
traveller plans to drive from node 1 to node 12 on this highway. This traveller dislikes high
altitudes and so would like to find a path connecting node 1 to 12 that minimizes the maximum
altitude. Find the best path for this traveller using the minimum spanning tree algorithm.
[Figure: a highway network; the number next to each arc gives the maximum elevation encountered when traversing it.]
Exercise 15. Consider a graph G = (V, E) where each edge (i, j) ∈ E is assigned a weight cij. Suppose you want to determine a spanning tree T that minimizes the function

√( Σ(i,j)∈T (cij)² ).
Exercise 16. A minimum bottleneck spanning tree is a spanning tree in which the most
expensive edge is as cheap as possible.
a) Explain why each minimum spanning tree is also a minimum bottleneck spanning tree.
b) Is the converse also true? That is: is each minimum bottleneck spanning tree also a minimum spanning tree? Either prove this statement or provide a counterexample.