DSU and MST
● unite(2, 7)
Disjoint Set Union: naive approach
● Let’s store objects of each set in a rooted tree with the representative in the
root
● Let p[v] be the parent of object v
● Here p[1] = 2 and p[7] = None
● find(a):
    if p[a] == None: return a
    else: return find(p[a])
● unite(a, b):
    a = find(a); b = find(b)
    if a != b: p[a] = b
find(6)
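The naive scheme above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the dictionary `p`, the object labels 1..7, and the loop-based `find` are assumptions made for the example.

```python
# Naive DSU: p[v] is the parent of v, or None when v is a representative.
# The labels 1..7 are an assumption chosen for this example.
p = {v: None for v in range(1, 8)}  # seven singleton sets

def find(a):
    """Walk parent links up to the representative of a's set."""
    while p[a] is not None:
        a = p[a]
    return a

def unite(a, b):
    """Attach the representative of a's set under the representative of b's set."""
    a, b = find(a), find(b)
    if a != b:  # uniting a set with itself would create a cycle
        p[a] = b

unite(1, 2)
unite(2, 7)
print(find(1) == find(7))  # True: 1, 2 and 7 now share a representative
print(find(1) == find(3))  # False: 3 is still alone
```

Without any heuristic, a long chain of unite calls can produce a tree of depth n - 1, so a single find may take linear time.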
Disjoint Set Union: path compression
● find(a):
    if p[a] != None:
        p[a] = find(p[a])
        return p[a]
    return a
find(4)
Disjoint Set Union: path compression
● Let p[a] = a if a is a representative for convenience
● find(a):
if p[a] != a: p[a] = find(p[a])
return p[a]
find(4)
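A possible Python version of path compression is sketched below. It uses an iterative two-pass form instead of the recursive one on the slide, only to stay clear of Python's recursion limit on long chains; the helper `make_sets` and the explicit parent list argument are assumptions of this example.

```python
def make_sets(n):
    """Return a parent list where every object 0..n-1 is its own representative."""
    return list(range(n))

def find(p, a):
    """Return the representative of a, re-pointing every visited object at it."""
    root = a
    while p[root] != root:      # first pass: locate the representative
        root = p[root]
    while p[a] != root:         # second pass: compress the path
        p[a], a = root, p[a]
    return root

# A chain 4 -> 3 -> 2 -> 1 -> 0, with 0 as the representative.
p = make_sets(5)
for v in range(1, 5):
    p[v] = v - 1

print(find(p, 4))  # 0
print(p)           # [0, 0, 0, 0, 0]: every object on the path now points at the root
```

After the call, a second find on any object of the chain follows a single parent link.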
Disjoint Set Union: rank heuristic
● Suppose we don’t do path compression
● Let rank r[q] be the maximum depth of the tree whose representative is q
● In union(a, b), for representatives a and b, we have two options: either make p[a] = b, or p[b] = a
● Choose the option that minimizes the resulting rank: p[a] = b if r[a] < r[b]
union(2, 7)
Disjoint Set Union: rank heuristic
● Ranks only increase when we unite trees of equal ranks
● union(a, b), where a and b are representatives:
    if r[a] > r[b]: swap(a, b)
    p[a] = b
    if r[a] == r[b]: r[b]++
union(2, 7)
Disjoint Set Union: heuristics
● With path compression only, amortized complexity is O(logn) per query
● With rank heuristic only, complexity is O(logn) per query
● With both heuristics, amortized complexity is O(α(n)) per query, where α is the inverse Ackermann function
● With both heuristics, rank is not necessarily equal to the maximum depth, but
the depth is not greater than rank
find(a):
    if p[a] != a:
        p[a] = find(p[a])
    return p[a]

union(a, b):
    a = find(a)
    b = find(b)
    if r[a] > r[b]:
        swap(a, b)
    p[a] = b
    if r[a] == r[b]:
        r[b]++
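Both heuristics together might look like this in Python. The class name `DSU`, the 0-based object numbering, and the boolean return value of `union` are assumptions of this sketch. It also adds an early return when both objects are already in the same set, which the pseudocode above does not have: without it, the rank of the root would be incremented needlessly.

```python
class DSU:
    """Disjoint set union with path compression and the rank heuristic."""

    def __init__(self, n):
        self.p = list(range(n))  # p[a] == a marks a representative
        self.r = [0] * n         # rank: upper bound on the tree depth

    def find(self, a):
        if self.p[a] != a:
            self.p[a] = self.find(self.p[a])  # path compression
        return self.p[a]

    def union(self, a, b):
        """Merge the sets of a and b; return False if they were already merged."""
        a, b = self.find(a), self.find(b)
        if a == b:
            return False
        if self.r[a] > self.r[b]:
            a, b = b, a          # make a the root of smaller rank
        self.p[a] = b
        if self.r[a] == self.r[b]:
            self.r[b] += 1
        return True

dsu = DSU(6)
dsu.union(0, 1)
dsu.union(2, 3)
dsu.union(1, 3)
print(dsu.find(0) == dsu.find(2))  # True
print(dsu.find(4) == dsu.find(5))  # False
```

The recursion in `find` is safe here because the rank heuristic keeps the depth at most log2(n).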
Disjoint Set Union: rank heuristic complexity
● If r[a] == k, then the tree with representative a has at least 2^k objects
● Proof by induction:
○ Base k = 0: a set with one object
○ Inductive step: we can only get rank k after merging two trees of
rank k - 1, each of them has at least 2^(k-1) objects, so the resulting tree
has at least 2^k objects
● The rank can’t be greater than log_2(n), so find works in O(log n)
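The invariant from the proof can be checked empirically. The sketch below runs random unions with the rank heuristic only (no path compression) and asserts after every merge that the tree has at least 2^rank objects. The function name `check_rank_bound`, the extra `size` array, and the random workload are assumptions of this example.

```python
import random

def check_rank_bound(n, seed=0):
    """Run random unions by rank and verify size >= 2**rank after each merge."""
    rng = random.Random(seed)
    p = list(range(n))
    r = [0] * n
    size = [1] * n  # number of objects in the tree of each representative

    def find(a):
        while p[a] != a:  # deliberately no path compression
            a = p[a]
        return a

    for _ in range(4 * n):
        a, b = find(rng.randrange(n)), find(rng.randrange(n))
        if a == b:
            continue
        if r[a] > r[b]:
            a, b = b, a
        p[a] = b
        size[b] += size[a]
        if r[a] == r[b]:
            r[b] += 1
        assert size[b] >= 2 ** r[b]  # the statement proved by induction
    return max(r)

max_rank = check_rank_bound(1000)
print(max_rank <= 9)  # True: 2**10 > 1000, so no rank can reach 10
```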
Disjoint Set Union: path compression complexity
● Split the edges into three groups:
○ Root edges point to a representative
○ Heavy edges u->v (v is not a representative) such that size(u) >= size(v) / 2
○ All the other edges are light
● Count passes for each edge group separately
● At most one root edge in each find call
● At most log(n) light edges in each find call: going up a light edge more than doubles the subtree size
● After we pass a heavy edge u->v, path compression detaches u from v, so the size
of the v subtree at least halves, and it never increases afterwards
● The total number of passes through heavy edges is not greater than nlogn
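Adding up the three groups gives the overall bound. In the summary below, q denotes the number of find calls and T the total number of edge passes; both symbols are introduced here for illustration and do not appear on the slide.

```latex
\begin{aligned}
T &\le \underbrace{q}_{\text{root edges}}
     + \underbrace{q \log_2 n}_{\text{light edges}}
     + \underbrace{n \log_2 n}_{\text{heavy edges}} \\
  &= O\big((n + q)\log n\big)
\end{aligned}
```

So once q is at least n, the amortized cost is O(log n) per find.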
Disjoint Set Union: problem 1
Disjoint Set Union: problem 2
Minimum Spanning Tree
● Statement:
○ Let G be a weighted undirected
connected graph
○ A spanning tree is a set of n-1
edges of G connecting all vertices
○ A minimum spanning tree is a
spanning tree with minimum
possible total weight of edges
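To make the definition concrete, the sketch below finds the MST weight of a tiny graph by brute force: it tries every set of n - 1 edges, keeps those that connect all vertices, and takes the minimum total weight. This is exponential and only illustrates the statement. The function names, the `(u, v, weight)` edge format, and the sample graph are assumptions of this example.

```python
from itertools import combinations

def is_spanning_tree(n, chosen):
    """n - 1 edges form a spanning tree iff they connect all n vertices."""
    adj = {v: [] for v in range(n)}
    for u, v, _ in chosen:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {0}, [0]
    while stack:  # depth-first search from vertex 0
        for nxt in adj[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == n

def brute_force_mst_weight(n, edges):
    """Minimum total weight over all spanning trees of a connected graph."""
    return min(
        sum(w for _, _, w in chosen)
        for chosen in combinations(edges, n - 1)
        if is_spanning_tree(n, chosen)
    )

# Hypothetical sample graph on 4 vertices, edges given as (u, v, weight).
edges = [(0, 1, 1), (1, 2, 2), (0, 2, 3), (2, 3, 4), (1, 3, 5)]
print(brute_force_mst_weight(4, edges))  # 7, using edges 0-1, 1-2 and 2-3
```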
Minimum Spanning Tree: generic approach
● The algorithms we will discuss follow the same greedy scheme:
○ Maintain F, a set of edges that is a subset of some MST. Initially F is
empty
○ Find an edge e such that F with e added is still a subset of some MST
○ Such an edge is called a safe edge; add it to F and repeat
○ When there is no safe edge, F is an MST.
● How to find a safe edge?
Minimum Spanning Tree: safe edge lemma
● Let F be a subset of some MST.
Consider S, a connected component of
the graph formed by the edges of F.
● Let e be a minimum-weight edge
connecting S with the other
vertices.
● Then e is a safe edge.
Safe edge lemma proof: cut-and-paste
● Let M be an MST containing F. If e is in M,
then it is safe; otherwise adding e to M
creates a cycle.
● Let e’ be another edge of this cycle that
connects S with the other vertices
● Since e leaves S, the cycle has to return
to S, so e’ exists; e’ is not in F because
S is a connected component of F
● Replace e’ with e: M is still a spanning tree, and
its weight did not increase because w(e) <= w(e’)
● It means e is safe.
Prim’s Algorithm
● Start with empty F and S having one vertex.
● On each step add a safe edge connecting S
with other vertices to F
● Add the new endpoint of this edge to S
● To find the safe edge, maintain d[], where
d[v] is the smallest weight of an edge
connecting vertex v with a vertex in S.
● On each step choose v not in S with the
smallest d[v].
● Recompute d[] for new candidates for safe
edges
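The steps above can be sketched in Python with a linear scan for the minimum of d[]. This is one possible version under stated assumptions: the graph is given as a symmetric n x n weight matrix with `math.inf` for missing edges, vertex 0 is the start, and the names `prim_dense` and `parent` are made up for the example.

```python
import math

def prim_dense(n, w):
    """Prim's algorithm on a weight matrix w; returns (total weight, edges of F)."""
    in_s = [False] * n
    d = [math.inf] * n       # d[v]: lightest edge from v into S
    parent = [None] * n      # the endpoint in S that realizes d[v]
    d[0] = 0                 # vertex 0 is an arbitrary starting vertex
    total, f = 0, []
    for _ in range(n):
        # Choose v not in S with the smallest d[v].
        v = min((u for u in range(n) if not in_s[u]), key=lambda u: d[u])
        if d[v] == math.inf:
            raise ValueError("graph is not connected")
        in_s[v] = True
        total += d[v]
        if parent[v] is not None:
            f.append((parent[v], v))
        # Recompute d[] for the neighbours of the newly added vertex.
        for u in range(n):
            if not in_s[u] and w[v][u] < d[u]:
                d[u] = w[v][u]
                parent[u] = v
    return total, f

INF = math.inf
w = [
    [INF, 1, 3, INF],
    [1, INF, 2, 5],
    [3, 2, INF, 4],
    [INF, 5, 4, INF],
]
print(prim_dense(4, w))  # (7, [(0, 1), (1, 2), (2, 3)])
```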
Prim’s Algorithm
● Similar to Dijkstra’s algorithm, but with a different
distance function
● O(n^2 + m): find the minimum in linear time
● O(m log(n)): use a heap to find the minimum of
d[v]
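A heap-based variant could look like the sketch below. It uses Python's `heapq` with lazy deletion: outdated heap entries are skipped when popped instead of being updated in place, so the heap may hold up to O(m) entries, which still gives O(m log(n)). The adjacency-list format, the start vertex 0, and the name `prim_heap` are assumptions of this example.

```python
import heapq

def prim_heap(n, adj):
    """adj[v] is a list of (weight, neighbour) pairs. Returns the MST weight."""
    in_s = [False] * n
    heap = [(0, 0)]  # entries are (d[v], v); vertex 0 is an arbitrary start
    total, added = 0, 0
    while heap and added < n:
        d, v = heapq.heappop(heap)
        if in_s[v]:
            continue  # stale entry: v joined S through a lighter edge
        in_s[v] = True
        total += d
        added += 1
        for weight, u in adj[v]:
            if not in_s[u]:
                heapq.heappush(heap, (weight, u))
    if added < n:
        raise ValueError("graph is not connected")
    return total

# Hypothetical sample graph, edges given as (u, v, weight).
edges = [(0, 1, 1), (1, 2, 2), (0, 2, 3), (2, 3, 4), (1, 3, 5)]
adj = [[] for _ in range(4)]
for u, v, weight in edges:
    adj[u].append((weight, v))
    adj[v].append((weight, u))

print(prim_heap(4, adj))  # 7
```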
Kruskal’s Algorithm
● Maintain F as a subset of some MST
● At each step select the lightest edge
connecting two different components in F
● This edge is safe according to the safe edge lemma
● Add this edge to F
● To find the lightest edge, loop through the
edges sorted by weight:
○ If the edge connects two different
components, add it and merge
components
○ Otherwise skip it
Kruskal’s algorithm
● To efficiently check if two vertices belong to the same component, use DSU
● initialize DSU
sort edges by weight
for e in edges:
    if find(e.v) != find(e.u):
        add e to MST
        union(e.v, e.u)
● O(mα(n) + mlog(m)) with DSU.
● For dense graphs (m close to n^2), Prim in O(n^2 + m) is faster. For sparse graphs, Kruskal in O(m log(m)) is faster than the O(n^2 + m) version of Prim
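Kruskal's algorithm with an inlined DSU might be written as follows. This is a sketch under assumptions: edges are `(u, v, weight)` tuples over vertices 0..n-1, the function name `kruskal` and its return format are chosen for the example, and the input graph is taken to be connected.

```python
def kruskal(n, edges):
    """Return (total weight, chosen edges) of an MST of a connected graph."""
    p = list(range(n))  # DSU parents
    r = [0] * n         # DSU ranks

    def find(a):
        if p[a] != a:
            p[a] = find(p[a])  # path compression
        return p[a]

    total, mst = 0, []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        a, b = find(u), find(v)
        if a == b:
            continue            # same component: the edge would close a cycle
        if r[a] > r[b]:
            a, b = b, a         # rank heuristic
        p[a] = b
        if r[a] == r[b]:
            r[b] += 1
        total += w
        mst.append((u, v, w))
    return total, mst

# Hypothetical sample graph, edges given as (u, v, weight).
edges = [(0, 1, 1), (1, 2, 2), (0, 2, 3), (2, 3, 4), (1, 3, 5)]
print(kruskal(4, edges))  # (7, [(0, 1, 1), (1, 2, 2), (2, 3, 4)])
```

The sort dominates the running time; the DSU operations add only O(mα(n)) on top of it.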