
BLG 336E

Analysis of Algorithms II
Lecture 6:
The Minimum Spanning Tree
Prim’s Algorithm, Kruskal’s Algorithm

RECAP OF PREVIOUS LECTURES
• Stable Matching
  – Gale-Shapley Algorithm
• Big-O Notation
  – Asymptotically Tight Bounds
  – Big Theta and Omega
  – A Survey of runtimes
• Graphs
  – Breadth-First Search
  – Depth-First Search
  – Testing Bipartiteness
  – Topological Ordering
• Greedy Algorithms
  – Interval Scheduling
  – Interval Partitioning
  – Shortest Paths in a Graph (Dijkstra)

Course schedule:

Week  Date       Topic
1     12-Feb     Introduction. Some representative problems
2     19-Feb     Stable Matching
3     26-Feb     Basics of algorithm analysis
4     4-Mar      Graphs (Project 1 announced)
5     11-Mar     Greedy algorithms-I
6     18-Mar     Greedy algorithms-II
7     25-Mar     Divide and conquer (Project 2 announced)
8     1-Apr      Dynamic Programming I
9     15-Apr     Dynamic Programming II
10    22-Apr     Network Flow-I (Project 3 announced)
11    29/30-Apr  Midterm
12    6-May      Network Flow II
13    13-May     NP and computational intractability-I
14    20-May     NP and computational intractability-II
• Greedy algorithms for Minimum Spanning Tree.

• Agenda:
1. What is a Minimum Spanning Tree?
2. Short break to introduce some graph theory tools
3. Prim’s algorithm
4. Kruskal’s algorithm

Outline for Today

Minimum Spanning Trees
– What's the cheapest way to connect a graph?

Prim's Algorithm
– A simple and efficient algorithm for finding minimum spanning trees.
4.5 Minimum Spanning Tree
Minimum Spanning Tree

Minimum spanning tree. Given a connected graph G = (V, E) with real-valued edge weights c_e, an MST is a subset of the edges T ⊆ E such that T is a spanning tree whose sum of edge weights Σ_{e∈T} c_e is minimized.

[Figure: an example graph G = (V, E) and a minimum spanning tree T with Σ_{e∈T} c_e = 50.]

Cayley's Theorem. There are n^(n-2) spanning trees of K_n.

⇒ can't solve by brute force
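A quick sense of scale for Cayley's bound (our arithmetic, not from the slides): K_10 already has 10^(10-2) = 10^8 = 100 million spanning trees, and K_100 has 100^98 ≈ 10^196, so enumerating all spanning trees is hopeless even for modest dense graphs.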
Greedy Algorithms

Kruskal's algorithm. Start with T = . Consider edges in ascending


order of cost. Insert edge e in T unless doing so would create a cycle.

Prim's algorithm. Start with some root node s and greedily grow a tree
T from s outward. At each step, add the cheapest edge e to T that has
exactly one endpoint in T.

Reverse-Delete algorithm. Start with T = E. Consider edges in


descending order of cost. Delete edge e from T unless doing so would
disconnect T.

Remark. All three algorithms produce an MST.

7
Minimum Spanning Tree
Say we have an undirected weighted graph

[Figure: the example weighted graph on vertices A-I.]

A tree is a connected graph with no cycles!

A spanning tree is a tree that connects all of the vertices.
Minimum Spanning Tree
The cost of a spanning tree is the sum of the weights on the edges.

[Figure: a spanning tree of the example graph. It has cost 67.]

A spanning tree is a tree that connects all of the vertices.
Minimum Spanning Tree

[Figure: another spanning tree of the example graph. It has cost 37.]

A spanning tree is a tree that connects all of the vertices.
Minimum Spanning Tree
A minimum spanning tree is a spanning tree of minimal cost.

[Figure: the example weighted graph on vertices A-I.]

A spanning tree is a tree that connects all of the vertices.
Minimum Spanning Tree

[Figure: a minimum spanning tree of the example graph. It has cost 37.]

A minimum spanning tree is a spanning tree of minimal cost.
Why MSTs?
• Network design
  • connecting cities with roads/electricity/telephone/…
• Cluster analysis
  • e.g., genetic distance
• Image processing
  • e.g., image segmentation
• Useful primitive
  • for other graph algorithms

[Figure 2: Fully parsimonious minimal spanning tree of 933 SNPs for 282 isolates of Y. pestis, colored by location. Morelli et al., Nature Genetics, 2010.]
How to find an MST?
• Today we’ll see two greedy algorithms.
• In order to prove that these greedy algorithms work, we’ll
need to show something like:

Suppose that our choices so far haven't ruled out success.
Then the next greedy choice that we make also won't rule out success.

• Here, "success" means finding an MST.
Let’s brainstorm some greedy algorithms!

[Figure: the example weighted graph on vertices A-I.]
Brief aside
for a discussion of cuts in graphs!

Cuts in graphs
• A cut is a partition of the vertices into two parts:

[Figure: the example graph, showing the cut “{A,B,D,E} and {C,I,H,G,F}”.]
Cuts in graphs
• One or both of the two parts might be disconnected.

[Figure: the example graph, showing the cut “{B,C,E,G,H} and {A,D,I,F}”.]
Let S be a set of edges in G
• We say a cut respects S if no edges in S cross the cut.
• An edge crossing a cut is called light if it has the smallest weight of any edge crossing the cut.

[Figure: the example graph with a cut; S is the set of thick orange edges, and the lightest edge crossing the cut is marked “This edge is light”.]
Lemma
• Let S be a set of edges, and consider a cut that respects S.
• Suppose there is an MST containing S.
• Let {u,v} be a light edge.
• Then there is an MST containing S ∪ {{u,v}}.

It’s “safe” to add this edge! Aka: if we haven’t ruled out the possibility of success so far, then adding a light edge still won’t rule it out.

[Figure: the example graph; S is the set of thick orange edges, and the light edge crossing the cut is highlighted.]
Proof of Lemma
• Assume that we have:
  • a cut that respects S
  • S is part of some MST T.
• Say that {u,v} is light.
  • lowest cost crossing the cut

[Figure: schematic of the cut, with u and v on opposite sides and another crossing edge {x,y}.]
Claim: Adding any additional edge to a spanning tree will create a cycle.
Proof: Both endpoints are already in the tree and connected to each other.

Proof of Lemma, continued
• But say {u,v} is not in T.
• So adding {u,v} to T will make a cycle.
• A cycle crosses any cut an even number of times, so there is at least one other edge in this cycle crossing the cut.
  • Call it {x,y}.

[Figure: the cycle created by adding {u,v}, with {x,y} the other edge crossing the cut.]
Proof of Lemma, continued
• Consider swapping {u,v} for {x,y} in T.
• Call the resulting tree T’.

• Claim: T’ is still an MST.
  • It is still a tree: we deleted {x,y}, which broke the cycle.
  • It has cost at most that of T, because {u,v} was light.
  • T had minimal cost, so T’ does too.
• So T’ is an MST containing S and {u,v}.
• This is what we wanted.
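In symbols (a one-line check of the exchange step, writing w(·) for edge weight):

    cost(T’) = cost(T) + w(u,v) - w(x,y) ≤ cost(T),

since {u,v} is light and {x,y} also crosses the cut, so w(u,v) ≤ w(x,y). A spanning tree of cost at most the minimum is itself minimum.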
Lemma (recap)
• Let S be a set of edges, and consider a cut that respects S. Suppose there is an MST containing S, and let {u,v} be a light edge. Then there is an MST containing S ∪ {{u,v}}.

End aside
Back to MSTs!

Back to MSTs
• How do we find one?
• Today we’ll see two greedy algorithms.

• The strategy:
• Make a series of choices, adding edges to the tree.
• Show that each edge we add is safe to add:
• we do not rule out the possibility of success
• we will choose light edges crossing cuts and use the Lemma.
• Keep going until we have an MST.

Idea 1
Start growing a tree, greedily add the shortest edge we can to grow the tree.

[Figure: a run on the example graph, adding one shortest available edge per step until the tree spans all the vertices.]
We’ve discovered Prim’s algorithm!

slowPrim( G = (V,E), starting vertex s ):
    Let (s,u) be the lightest edge coming out of s.
    MST = { (s,u) }
    verticesVisited = { s, u }
    while |verticesVisited| < |V|:               // n-1 iterations of this while loop
        find the lightest edge {x,v} in E so that:   // time at most m to go through all the edges and find the lightest
            x is in verticesVisited
            v is not in verticesVisited
        add {x,v} to MST
        add v to verticesVisited
    return MST

Naively, the running time is O(nm):
• For each of n-1 iterations of the while loop:
  • Go through all the edges.
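A direct Python transcription of slowPrim, as a sketch: the adjacency format (a dict mapping each vertex to a dict of neighbor: weight) and the names are illustrative choices, not the lecture’s reference code.

    def slow_prim(graph, s):
        """Naive Prim, O(nm). graph[u][v] = weight of undirected edge {u,v}."""
        visited = {s}
        mst = []
        while len(visited) < len(graph):
            # Scan every edge and keep the lightest one leaving the visited set.
            best = None
            for x in visited:
                for v, w in graph[x].items():
                    if v not in visited and (best is None or w < best[2]):
                        best = (x, v, w)
            x, v, w = best
            mst.append((x, v))
            visited.add(v)
        return mst

For example, slow_prim({'a': {'b': 1, 'c': 3}, 'b': {'a': 1, 'c': 2}, 'c': {'a': 3, 'b': 2}}, 'a') returns [('a', 'b'), ('b', 'c')]. The pseudocode seeds the tree with the lightest edge out of s; here that first edge simply falls out of the first loop iteration.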
Two questions
1. Does it work?
   • That is, does it actually return an MST?

2. How do we actually implement this?
   • The pseudocode above says “slowPrim”…
Does it work?
• We need to show that our greedy choices don’t
rule out success.
• That is, at every step:
• If there exists an MST that contains all of the edges S we
have added so far…
• …then when we make our next choice {u,v}, there is still
an MST containing S and {u,v}.

• Now it is time to use our lemma!

Lemma (recap)
• Let S be a set of edges, and consider a cut that respects S. Suppose there is an MST containing S, and let {u,v} be a light edge. Then there is an MST containing S ∪ {{u,v}}.
Partway through Prim
• Assume that our choices S so far don’t rule out success
• There is an MST extending them

How can we use our lemma to show that our


next choice also does not rule out success?

[Figure: the example graph; S is the set of edges selected so far.]
Partway through Prim
• Assume that our choices S so far don’t rule out success
  • There is an MST extending them
• Consider the cut {visited, unvisited}
  • This cut respects S.
• The edge we add next is a light edge.
  • Least weight of any edge crossing the cut.
• By the Lemma, that edge is safe to add.
  • There is still an MST extending the new set.

[Figure: the example graph; S is the set of edges selected so far, and the next edge to add is the lightest edge crossing the {visited, unvisited} cut.]
Good news
• Our greedy choices don’t rule out success.

• This is enough (along with an argument by


induction) to guarantee correctness of Prim’s
algorithm.

Formally(ish)
• Inductive hypothesis:
  • After adding the t’th edge, there exists an MST containing the edges added so far.
• Base case:
  • After adding the 0’th edge, there exists an MST containing the edges added so far. YEP.
• Inductive step:
  • If the inductive hypothesis holds for t (aka, the choices so far are safe), then it holds for t+1 (aka, the next edge we add is safe).
  • That’s what we just showed.
• Conclusion:
  • After adding the (n-1)’st edge, there exists an MST containing the edges added so far.
  • At this point we have a spanning tree, so it had better be minimal.
Two questions
1. Does it work?
   • That is, does it actually return an MST?
   • Yes!

2. How do we actually implement this?
   • The pseudocode above says “slowPrim”…
How do we actually implement this?
• Each vertex keeps:
  • the distance from itself to the growing spanning tree, if you can get there in one edge.
  • how to get there.
• Choose the closest vertex, add it.
• Update stored info.

[Figure: the example graph; per-vertex annotations such as “I’m 7 away. C is the closest.” and “I can’t get to the tree in one edge” track each vertex’s distance to the growing tree.]
Efficient implementation
Each vertex is in one of three states: can’t be reached yet, “active” (reachable in one edge), or reached. Every vertex has a key and a parent:
• k[x] is the distance of x from the growing tree.
• p[b] = a means that a was the vertex that k[b] comes from.

Until all the vertices are reached:
• Activate the unreached vertex u with the smallest key.
• for each of u’s neighbors v:
  • k[v] = min( k[v], weight(u,v) )
  • if k[v] updated, p[v] = u
• Mark u as reached, and add (p[u],u) to MST.

[Figure: a step-by-step run on the example graph, one activation per slide. Keys start at ∞ everywhere except the start vertex A, which has key 0; e.g. after activating A, k[B] = 4 and k[H] = 8, and keys keep shrinking as the tree grows.]
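A Python sketch of this key/parent scheme, using heapq as the priority queue. heapq has no decrease-key, so this version pushes a fresh (key, vertex) entry on every update and skips stale entries when popped (lazy deletion); the O(m log n) bound survives. Same illustrative graph format as the slowPrim sketch above.

    import heapq

    def prim(graph, s):
        """Prim in O(m log n). graph[u][v] = weight of undirected edge {u,v}."""
        key = {v: float('inf') for v in graph}   # k[v]: lightest known edge from v to the tree
        parent = {v: None for v in graph}        # p[v]
        key[s] = 0
        reached = set()
        heap = [(0, s)]
        mst = []
        while heap:
            k, u = heapq.heappop(heap)
            if u in reached:
                continue                          # stale entry: u was already reached
            reached.add(u)
            if parent[u] is not None:
                mst.append((parent[u], u))
            for v, w in graph[u].items():
                if v not in reached and w < key[v]:
                    key[v] = w                    # the "decrease-key": push a fresh entry
                    parent[v] = u
                    heapq.heappush(heap, (w, v))
        return mst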
This should look pretty familiar
• Very similar to Dijkstra’s algorithm!
• Differences:
  1. Keep track of p[v] in order to return a tree at the end.
     • But Dijkstra’s can do that too; that’s not a big difference.
  2. Instead of d[v], which we update by
     • d[v] = min( d[v], d[u] + w(u,v) )
     we keep k[v], which we update by
     • k[v] = min( k[v], w(u,v) )
• To see the difference, consider a triangle with edges S-U of weight 2, U-T of weight 2, and S-T of weight 3. Starting from S, Dijkstra reaches T by the direct edge S-T (distance 3, versus 2 + 2 = 4 through U), while Prim builds the tree {S-U, U-T}, since each of those edges is lighter than S-T.
One thing that is similar:
Running time
• Exactly the same as Dijkstra:
  • O(m log(n)) using a red-black tree as a priority queue.
  • O(m + n log(n)) amortized time if we use a Fibonacci heap.
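Where that bound comes from (standard accounting, stated as a sanity check): there are n extract-min operations, and each edge is examined at most twice, causing at most one decrease-key per examination, so O(m) priority-queue updates. At O(log n) per operation, the total is O((n + m) log n) = O(m log n) on a connected graph, since m ≥ n - 1.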
Two questions
1. Does it work?
   • That is, does it actually return an MST?
   • Yes!

2. How do we actually implement this?
   • The pseudocode above says “slowPrim”…
   • Implement it basically the same way we’d implement Dijkstra!
What have we learned?
• Prim’s algorithm greedily grows a tree
  • smells a lot like Dijkstra’s algorithm
• It finds a Minimum Spanning Tree!
  • in time O(m log(n)) if we implement it with a red-black tree.
  • in amortized time O(m + n log(n)) with a Fibonacci heap.
• To prove it worked, we followed the same recipe for greedy algorithms we saw last time.
  • Show that, at every step, we don’t rule out success.
That’s not the only greedy
algorithm for MST!

That’s not the only greedy algorithm
What if we just always take the cheapest edge, whether or not it’s connected to what we have so far, skipping any edge that would cause a cycle?

[Figure: a run on the example graph, repeatedly taking the cheapest remaining edge; each added edge is checked with the note “That won’t cause a cycle”.]
We’ve discovered Kruskal’s algorithm!

slowKruskal(G = (V,E)):
    Sort the edges in E by non-decreasing weight.
    MST = {}
    for e in E (in sorted order):                 // m iterations through this loop
        if adding e to MST won’t cause a cycle:   // how do we check this?
            add e to MST.
    return MST

How would you figure out if adding e would make a cycle in this algorithm?
Naively, the running time is ???:
• For each of m iterations of the for loop:
  • Check if adding e would cause a cycle…
Two questions
1. Does it work?
   • That is, does it actually return an MST?

2. How do we actually implement this? (Let’s do this one first.)
   • The pseudocode above says “slowKruskal”…
At each step of Kruskal’s, we are maintaining a forest.
(A forest is a collection of disjoint trees.)
When we add an edge, we merge two trees:

[Figure: the example graph partitioned into disjoint trees that merge, one edge at a time, as Kruskal’s algorithm runs.]
We never add an edge within a tree since that would create a cycle.
Keep the trees in a special data structure

“treehouse”?

Union-find data structure
also called disjoint-set data structure
• Used for storing collections of sets
• Supports:
  • makeSet(u): create a set {u}
  • find(u): return the set that u is in
  • union(u,v): merge the set that u is in with the set that v is in.

[Figure: makeSet(x), makeSet(y), makeSet(z) create three singleton sets; union(x,y) merges x’s and y’s sets; find(x) then returns the merged set.]
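A standard union-find sketch in Python, with path compression and union by rank; this is the variant behind the near-constant amortized α(n) bound quoted on the running-time slide below. The class name is our choice.

    class UnionFind:
        """Disjoint sets with path compression (halving) and union by rank."""
        def __init__(self, elements):
            self.parent = {u: u for u in elements}
            self.rank = {u: 0 for u in elements}

        def find(self, u):
            # Walk to the root, shortcutting parent pointers along the way.
            while self.parent[u] != u:
                self.parent[u] = self.parent[self.parent[u]]
                u = self.parent[u]
            return u

        def union(self, u, v):
            ru, rv = self.find(u), self.find(v)
            if ru == rv:
                return
            if self.rank[ru] < self.rank[rv]:
                ru, rv = rv, ru
            self.parent[rv] = ru        # hang the shallower tree under the deeper one
            if self.rank[ru] == self.rank[rv]:
                self.rank[ru] += 1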
Kruskal pseudo-code
kruskal(G = (V,E)):
    Sort E by weight in non-decreasing order
    MST = {}                     // initialize an empty tree
    for v in V:
        makeSet(v)               // put each vertex in its own tree in the forest
    for (u,v) in E:              // go through the edges in sorted order
        if find(u) != find(v):   // if u and v are not in the same tree
            add (u,v) to MST
            union(u,v)           // merge u’s tree with v’s tree
    return MST
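The same pseudo-code in runnable Python, using the UnionFind sketch above; edges are given as (weight, u, v) triples. A sketch with illustrative names, not the course’s reference implementation.

    def kruskal(vertices, edges):
        """Kruskal's algorithm. edges: iterable of (weight, u, v) triples."""
        uf = UnionFind(vertices)
        mst = []
        for w, u, v in sorted(edges):       # non-decreasing weight
            if uf.find(u) != uf.find(v):    # u and v are in different trees
                mst.append((u, v))
                uf.union(u, v)              # merge u's tree with v's tree
        return mst

On the triangle from earlier, kruskal(['a', 'b', 'c'], [(1, 'a', 'b'), (2, 'b', 'c'), (3, 'a', 'c')]) returns [('a', 'b'), ('b', 'c')]: the weight-3 edge is skipped because both endpoints are already in the same tree.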
Once more…
To start, every vertex is in its own tree. Then start merging, and stop when we have one big tree!

[Figure: a full run of Kruskal’s algorithm on the example graph, merging trees edge by edge until one spanning tree remains.]
Running time
• Sorting the edges takes O(m log(n)).
  • In practice, if the weights are small integers we can use radixSort and take time O(m).
• For the rest:
  • n calls to makeSet: put each vertex in its own set.
  • 2m calls to find: for each edge, find its endpoints.
  • n calls to union: we will never add more than n-1 edges to the tree, so we will never call union more than n-1 times.
  • In practice, each of makeSet, find, and union runs in constant time.*
• Total running time:
  • Worst-case O(m log(n)), just like Prim with a red-black tree.
  • Closer to O(m) if you can do radixSort.

*Technically, they run in amortized time O(α(n)), where α(n) is the inverse Ackermann function; α(n) ≤ 4 provided that n is smaller than the number of atoms in the universe.
Two questions
1. Does it work?
   • That is, does it actually return an MST?
   • Now that we understand this “tree-merging” view, let’s do this one.

2. How do we actually implement this?
   • The pseudocode above says “slowKruskal”…
   • Worst-case running time O(m log(n)) using a union-find data structure.
Does it work?
• We need to show that our greedy choices don’t
rule out success.
• That is, at every step:
• There exists an MST that contains all of the edges we
have added so far.

• Now it is time to use our lemma. Again!
Lemma (recap)
• Let S be a set of edges, and consider a cut that respects S. Suppose there is an MST containing S, and let {u,v} be a light edge. Then there is an MST containing S ∪ {{u,v}}.


Partway through Kruskal
• Assume that our choices S so far don’t rule out success.
  • There is an MST extending them.
• The next edge we add will merge two trees, T1 and T2.
• Consider the cut {T1, V - T1}.
  • This cut respects S.
  • Our new edge is light for the cut: any cheaper edge crossing it would not have created a cycle, so Kruskal would already have added it.
• By the Lemma, that edge is safe to add.
  • There is still an MST extending the new set.

[Figure: the example graph; S is the set of edges selected so far, and the next edge crosses the cut {T1, V - T1}.]
Good news
• Our greedy choices don’t rule out success.

• This is enough (along with an argument by


induction) to guarantee correctness of Kruskal’s
algorithm.

Two questions
1. Does it work?
   • That is, does it actually return an MST?
   • Yes!

2. How do we actually implement this?
   • The pseudocode above says “slowKruskal”…
   • Using a union-find data structure!
What have we learned?
• Kruskal’s algorithm greedily grows a forest.
• It finds a Minimum Spanning Tree in time O(m log(n))
  • if we implement it with a union-find data structure;
  • if the edge weights are reasonably-sized integers and we ignore the inverse Ackermann function, basically O(m) in practice.

• To prove it worked, we followed the same recipe for greedy algorithms we saw last time.
  • Show that, at every step, we don’t rule out success.
Compare and contrast
• Prim:
  • Grows a tree.
  • Time O(m log(n)) with a red-black tree.
  • Time O(m + n log(n)) with a Fibonacci heap.
  • Prim might be a better idea on dense graphs if you can’t radix-sort the edge weights.
• Kruskal:
  • Grows a forest.
  • Time O(m log(n)) with a union-find data structure.
  • If you can do radixSort on the edge weights, morally O(m).
  • Kruskal might be a better idea on sparse graphs if you can radix-sort the edge weights.
Both Prim and Kruskal
• Greedy algorithms for MST.
• Similar reasoning:
• Optimal substructure: subgraphs generated by cuts.
• The way to make safe choices is to choose light edges
crossing the cut.
[Figure: the example graph; S is the set of thick orange edges, and the light edge crossing the cut is marked “This edge is light”.]
Can we do better?
State-of-the-art MST on connected undirected graphs

• Karger-Klein-Tarjan 1995:
  • O(m) time randomized algorithm
• Chazelle 2000:
  • O(m ⋅ α(n)) time deterministic algorithm
• Pettie-Ramachandran 2002:
  • O(N*(n,m)) time deterministic algorithm, where N*(n,m) is the optimal number of comparisons you need to solve the problem, whatever that is…

What is this number? Do we need that silly α(n)? Open questions!
Recap
• Two algorithms for Minimum Spanning Tree
• Prim’s algorithm
• Kruskal’s algorithm

• Both are (more) examples of greedy algorithms!


• Make a series of choices.
• Show that at each step, your choice does not rule out
success.
• At the end of the day, you haven’t ruled out success, so
you must be successful.

NEXT LECTURE
• Divide and Conquer
  • Mergesort
  • Counting Inversions
  • Closest pair of points
  • Karatsuba Multiplication

(Course schedule as in the recap table at the start of this lecture; next up is week 7, 25-Mar: Divide and conquer, Project 2 announced.)
