Lecture6 IO BLG336E 2022
Analysis of Algorithms II
Lecture 6: The Minimum Spanning Tree
Prim’s Algorithm, Kruskal’s Algorithm
RECAP OF PREVIOUS LECTURES
• Stable Matching
  – Gale-Shapley Algorithm

Week 1, 12-Feb: Introduction. Some representative problems
• Greedy algorithms for the Minimum Spanning Tree.
• Agenda:
1. What is a Minimum Spanning Tree?
2. Short break to introduce some graph theory tools
3. Prim’s algorithm
4. Kruskal’s algorithm
Outline for Today
Prim’s algorithm: a simple and efficient algorithm for finding minimum spanning trees.
4.5 Minimum Spanning Tree

[Figure: a weighted graph G = (V, E) and a spanning tree T ⊆ E of total cost Σ_{e∈T} c_e = 50.]

Prim’s algorithm. Start with some root node s and greedily grow a tree T from s outward. At each step, add the cheapest edge e to T that has exactly one endpoint in T.
Minimum Spanning Tree
Say we have an undirected weighted graph.

[Figure: example weighted graph on vertices A–I, with edge weights between 1 and 14.]

A tree is a connected graph with no cycles!
A spanning tree of G is a tree that includes all of G’s vertices; a minimum spanning tree (MST) is a spanning tree of smallest total edge weight.

Figure 2: Fully parsimonious minimal spanning tree of 933 SNPs for 282 isolates of Y. pestis colored by location.
Morelli et al., Nature Genetics, 2010
How to find an MST?
• Today we’ll see two greedy algorithms.
• In order to prove that these greedy algorithms work, we’ll need to show that each greedy edge we pick is safe: picking it never rules out ending up with an MST.

[Figure: the example graph again.]
Brief aside
for a discussion of cuts in graphs!
Cuts in graphs
• A cut is a partition of the vertices into two parts.
• An edge crosses the cut if it has one endpoint in each part.
• A cut respects a set of edges S if no edge in S crosses the cut.
• An edge crossing the cut is light if it has the lowest cost of any edge crossing the cut.

[Figure: several cuts drawn on the example graph, followed by a schematic cut with vertices u, v, x, y, a, b and crossing edges {u,v} and {x,y}.]
Proof of Lemma
• Assume that we have:
  • a cut that respects S
  • S is part of some MST T.
• Say that {u,v} is light: lowest cost crossing the cut.
• If {u,v} is already in T, we are done. Otherwise, add {u,v} to T.

Claim: Adding any additional edge to a spanning tree will create a cycle.

• The resulting cycle contains {u,v}, so it must cross the cut again at some other edge {x,y}; since {u,v} is light, cost({x,y}) ≥ cost({u,v}).

[Figure: the cut, the light edge {u,v}, and the other crossing edge {x,y}.]

Proof of Lemma ctd.
• Consider swapping {u,v} for {x,y} in T.
• Call the resulting tree T’.
• T’ is still a spanning tree, and its cost is at most the cost of T.
• So T’ is an MST containing S and {u,v}.
• This is what we wanted.
Lemma
• Let S be a set of edges, and consider a cut that respects S.
• Suppose there is an MST containing S.
• Let {u,v} be a light edge.
• Then there is an MST containing S ∪ {{u,v}}.

[Figure: the light edge crossing a cut in the example graph.]
Back to MSTs
• How do we find one?
• Today we’ll see two greedy algorithms.
• The strategy:
  • Make a series of choices, adding edges to the tree.
  • Show that each edge we add is safe to add:
    • we do not rule out the possibility of success
    • we will choose light edges crossing cuts and use the Lemma.
  • Keep going until we have an MST.
Idea 1
Start growing a tree: greedily add the shortest edge we can that grows the tree.

[Figure: running this idea step by step on the example graph, adding one edge at a time until every vertex is in the tree.]
We’ve discovered Prim’s algorithm!
• slowPrim( G = (V,E), starting vertex s ):
  • Let (s,u) be the lightest edge coming out of s.
  • MST = { (s,u) }
  • verticesVisited = { s, u }
  • while |verticesVisited| < |V|:   // n-1 iterations of this while loop
    • find the lightest edge {x,v} in E so that:   // time at most m to go through all the edges and find the lightest
      • x is in verticesVisited
      • v is not in verticesVisited
    • add {x,v} to MST
    • add v to verticesVisited
  • return MST
Naively, the running time is O(nm): for each of the n-1 iterations of the while loop, go through all the edges.
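The pseudocode above can be transcribed directly into runnable Python. The adjacency-list representation here (a dict mapping each vertex to (neighbor, weight) pairs) is an assumed convention, not something the slide fixes.

```python
# A direct (slow) transcription of the slowPrim pseudocode: O(nm) time.
# graph: dict mapping each vertex to a list of (neighbor, weight) pairs.
def slow_prim(graph, s):
    # Lightest edge coming out of s.
    u, w = min(graph[s], key=lambda nw: nw[1])
    mst = [(s, u, w)]
    visited = {s, u}
    while len(visited) < len(graph):
        best = None
        # Scan all edges with exactly one endpoint in the tree.
        for x in visited:
            for v, weight in graph[x]:
                if v not in visited and (best is None or weight < best[2]):
                    best = (x, v, weight)
        x, v, weight = best
        mst.append((x, v, weight))
        visited.add(v)
    return mst
```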
Two questions
1. Does it work?
  • That is, does it actually return an MST?
2. How do we actually implement it?
Does it work?
• We need to show that our greedy choices don’t
rule out success.
• That is, at every step:
• If there exists an MST that contains all of the edges S we
have added so far…
• …then when we make our next choice {u,v}, there is still
an MST containing S and {u,v}.
Recall the Lemma: if a cut respects S, some MST contains S, and {u,v} is a light edge for the cut, then some MST contains S ∪ {{u,v}}.

[Figure: partway through Prim’s on the example graph; S is the set of edges selected so far.]
Partway through Prim
• Assume that our choices S so far don’t rule out success
  • There is an MST extending them
• Consider the cut {visited, unvisited}
  • This cut respects S.
• The edge we add next is a light edge.
  • Least weight of any edge crossing the cut.
• By the Lemma, that edge is safe to add.
  • There is still an MST extending the new set.

[Figure: the cut between visited and unvisited vertices in the example graph; S is the set of edges selected so far, and the light edge crossing the cut is added next.]
Good news
• Our greedy choices don’t rule out success.
Formally(ish)
• Inductive hypothesis:
  • After adding the t-th edge, there exists an MST containing the edges added so far.
• Base case:
  • After adding the 0-th edge, there exists an MST containing the edges added so far. YEP.
• Inductive step:
  • If the inductive hypothesis holds for t (aka, the choices so far are safe), then it holds for t+1 (aka, the next edge we add is safe).
  • That’s what we just showed.
• Conclusion:
  • After adding the (n-1)-st edge, there exists an MST containing the edges added so far.
  • At this point we have a spanning tree contained in an MST, so our tree is that MST.
Two questions
1. Does it work?
  • That is, does it actually return an MST?
  • Yes!
2. How do we actually implement it?
How do we actually implement this?
• Each vertex keeps:
  • the distance from itself to the growing spanning tree, if you can get there in one edge.
  • how to get there.
• Choose the closest vertex, add it.
• Update the stored info.

[Figure: in the example graph, D says “I’m 7 away, C is the closest”; a vertex with no edge into the tree says “I can’t get to the tree in one edge” until the tree grows toward it.]
Efficient implementation
Every vertex x has a key and a parent, and is in one of three states: can’t reach x yet, x is “active”, or can reach x.
• k[x] is the distance of x from the growing tree.
• p[b] = a means that a was the vertex that k[b] comes from.
Until all the vertices are reached:
• Activate the unreached vertex u with the smallest key.
• for each of u’s neighbors v:
  • k[v] = min( k[v], weight(u,v) )
  • if k[v] was updated, p[v] = u
• Mark u as reached, and add (p[u],u) to the MST.

[Figure: the algorithm running on the example graph, starting from k[A] = 0 with every other key ∞; keys shrink as vertices are activated, until all of A–I are reached.]
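The key/parent loop above can be sketched with a binary heap as the priority queue. Python’s heapq has no decrease-key operation, so this sketch pushes a fresh entry whenever a key shrinks and skips stale entries when popping; the O(m log n) bound still holds.

```python
import heapq

# A sketch of the key/parent implementation of Prim's algorithm.
# graph: dict mapping each vertex to a list of (neighbor, weight) pairs.
def prim(graph, s):
    key = {v: float('inf') for v in graph}   # k[v]
    parent = {v: None for v in graph}        # p[v]
    key[s] = 0
    reached = set()
    heap = [(0, s)]
    mst = []
    while heap:
        k, u = heapq.heappop(heap)           # activate smallest-key vertex
        if u in reached:
            continue                         # stale heap entry, skip it
        reached.add(u)
        if parent[u] is not None:
            mst.append((parent[u], u, k))    # add (p[u], u) to the MST
        for v, w in graph[u]:
            if v not in reached and w < key[v]:
                key[v] = w                   # k[v] = min(k[v], weight(u,v))
                parent[v] = u
                heapq.heappush(heap, (w, v))
    return mst
```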
This should look pretty familiar
• Very similar to Dijkstra’s algorithm!
• Differences:
  1. Keep track of p[v] in order to return a tree at the end.
    • But Dijkstra’s can do that too, so that’s not a big difference.
  2. The key k[v] is the weight of the cheapest single edge from v into the tree, not (as in Dijkstra’s) the length of a whole path back to s.
What have we learned?
• Prim’s algorithm greedily grows a tree
  • smells a lot like Dijkstra’s algorithm
• It finds a Minimum Spanning Tree!
  • in time O(m log(n)) if we implement it with a red-black tree.
  • in amortized time O(m + n log(n)) with a Fibonacci heap.
• To prove it worked, we followed the same recipe for greedy algorithms we saw last time.
  • Show that, at every step, we don’t rule out success.
That’s not the only greedy algorithm for MST!
That’s not the only greedy algorithm
What if we just always take the cheapest edge, whether or not it’s connected to what we have so far, as long as that won’t cause a cycle?

[Figure: running this idea on the example graph, taking edges in increasing weight order and skipping any edge that would create a cycle.]
We’ve discovered Kruskal’s algorithm!
• slowKruskal(G = (V,E)):
  • Sort the edges in E by non-decreasing weight.
  • MST = {}
  • for e in E (in sorted order):   // m iterations through this loop
    • if adding e to MST won’t cause a cycle:   // how do we check this?
      • add e to MST
  • return MST
2. How do we actually implement this? Let’s do this one first: the pseudocode above says “slowKruskal”…
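One naive answer to “how do we check this?” is a depth-first search over the MST edges chosen so far: adding edge {u,v} creates a cycle exactly when v is already reachable from u. Each check can cost O(n) on its own, which is part of what makes this version of Kruskal slow; the representation (edges as vertex pairs) is an assumption.

```python
from collections import defaultdict

# Would adding edge (u, v) to the forest of mst_edges create a cycle?
# Equivalently: is v already reachable from u via chosen edges?
def creates_cycle(mst_edges, u, v):
    adj = defaultdict(list)
    for a, b in mst_edges:
        adj[a].append(b)
        adj[b].append(a)
    stack, seen = [u], {u}
    while stack:                      # iterative depth-first search from u
        x = stack.pop()
        if x == v:
            return True
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return False
```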
At each step of Kruskal’s, we are maintaining a forest: a collection of disjoint trees.

[Figure: the example graph partway through Kruskal’s, with several disjoint trees growing and merging.]

We never add an edge within a tree, since that would create a cycle.
Keep the trees in a special data structure.
“treehouse”?
Union-find data structure
also called disjoint-set data structure
• Used for storing collections of disjoint sets.
• Supports:
  • makeSet(u): create a set {u}
  • find(u): return the set that u is in
  • union(u,v): merge the set that u is in with the set that v is in.
Example: makeSet(x), makeSet(y), makeSet(z) creates three singleton sets; union(x,y) merges x’s and y’s sets into {x,y}; find(x) then returns that merged set.
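The three operations can be sketched in Python. This minimal version uses path compression and union by rank, one standard implementation choice; find here returns a canonical root vertex rather than the set itself.

```python
# A minimal union-find (disjoint-set) sketch supporting the three
# operations above, with path compression and union by rank.
class UnionFind:
    def __init__(self):
        self.parent = {}
        self.rank = {}

    def make_set(self, u):
        self.parent[u] = u            # u starts as its own root
        self.rank[u] = 0

    def find(self, u):
        # Walk to the root, halving the path as we go (path compression).
        while self.parent[u] != u:
            self.parent[u] = self.parent[self.parent[u]]
            u = self.parent[u]
        return u

    def union(self, u, v):
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return                    # already in the same set
        if self.rank[ru] < self.rank[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru          # attach the shallower tree under the deeper
        if self.rank[ru] == self.rank[rv]:
            self.rank[ru] += 1
```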
Kruskal pseudo-code
• kruskal(G = (V,E)):
  • Sort E by weight in non-decreasing order
  • MST = {}   // initialize an empty tree
  • for v in V:
    • makeSet(v)   // put each vertex in its own tree in the forest
  • for (u,v) in E:   // go through the edges in sorted order
    • if find(u) != find(v):   // if u and v are not in the same tree
      • add (u,v) to MST
      • union(u,v)   // merge u’s tree with v’s tree
  • return MST
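The kruskal pseudocode above, made runnable as a self-contained sketch; for brevity the union-find is inlined as a parent dictionary with path compression (no union by rank), which is enough for correctness.

```python
# A self-contained sketch of the kruskal pseudocode.
# edges: list of (weight, u, v) tuples.
def kruskal(vertices, edges):
    parent = {v: v for v in vertices}    # makeSet(v) for every vertex

    def find(u):                         # root of u's tree, with path compression
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    mst = []
    for w, u, v in sorted(edges):        # non-decreasing weight order
        ru, rv = find(u), find(v)
        if ru != rv:                     # u and v are in different trees
            mst.append((u, v, w))
            parent[ru] = rv              # union(u, v)
    return mst
```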
Once more…
To start, every vertex is in its own tree. Then start merging: take the edges in increasing weight order, merging the two trees at the endpoints of each edge we add. Stop when we have one big tree!

[Figure: Kruskal’s running on the example graph, merging trees edge by edge until a single spanning tree remains.]
Running time
• Sorting the edges takes O(m log(n)).
  • In practice, if the weights are small integers we can use radixSort and take time O(m).
• For the rest (in practice, each of makeSet, find, and union runs in constant time*):
  • n calls to makeSet: put each vertex in its own set.
  • 2m calls to find: for each edge, find its endpoints.
  • At most n-1 calls to union: we never add more than n-1 edges to the tree, so we never call union more than n-1 times.
• Total running time:
  • Worst-case O(m log(n)), just like Prim with a red-black tree.
  • Closer to O(m) if you can do radixSort.
*Technically, they run in amortized time O(𝛼(n)), where 𝛼(n) is the inverse Ackermann function; 𝛼(n) ≤ 4 provided that n is smaller than the number of atoms in the universe.
Two questions
1. Does it work?
  • That is, does it actually return an MST?
Now that we understand this “tree-merging” view, let’s do this one.
Does it work?
• We need to show that our greedy choices don’t
rule out success.
• That is, at every step:
• There exists an MST that contains all of the edges we
have added so far.
Recall the Lemma: if a cut respects S, some MST contains S, and {u,v} is a light edge for the cut, then some MST contains S ∪ {{u,v}}.

[Figure: partway through Kruskal’s on the example graph; S is the set of edges selected so far.]
Partway through Kruskal
• Assume that our choices S so far don’t rule out success.
  • There is an MST extending them.
• The next edge we add will merge two trees, T1 and T2.
• Consider the cut {T1, V − T1}.
  • This cut respects S.
  • Our new edge is light for the cut: every lighter edge has already been considered, and each was either added to S or skipped because its endpoints were already in the same tree, so no lighter edge crosses the cut.
• By the Lemma, the next edge is safe to add.

[Figure: the cut {T1, V − T1} in the example graph; S is the set of edges selected so far, and the next edge crosses the cut.]
Good news
• Our greedy choices don’t rule out success.
Two questions
1. Does it work?
  • That is, does it actually return an MST?
  • Yes!
What have we learned?
• Kruskal’s algorithm greedily grows a forest.
• It finds a Minimum Spanning Tree in time O(m log(n))
  • if we implement it with a union-find data structure.
  • If the edge weights are reasonably-sized integers and we ignore the inverse Ackermann function, it is basically O(m) in practice.
Compare and contrast
• Prim:
  • Grows a tree.
  • Might be a better idea on dense graphs, or if you can’t sort edge weights.
• Kruskal:
  • Grows a forest.

[Figure: the example graph.]

• Karger-Klein-Tarjan 1995:
  • O(m) time randomized algorithm
• Chazelle 2000:
  • O(m ⋅ 𝛼(n)) time deterministic algorithm
• Pettie-Ramachandran 2002:
  • O(N*(n,m)) time deterministic algorithm, where N*(n,m) is the optimal number of comparisons you need to solve the problem, whatever that is…
NEXT LECTURE
• Divide and Conquer

Week 11, 29/30-Apr: Midterm
Week 12, 6-May: Network Flow II
Week 13, 13-May: NP and computational intractability-I
Week 14, 20-May: NP and computational intractability-II