DSU and MST

Nikolay Kalinin | Harbour.Space, 31.03.2025


Disjoint Set Union
● N objects, each of which belongs to exactly one set.
● Two types of queries:
○ Given two objects a and b, merge the set containing a with the set containing b;
○ Given two objects a and b, check if a and b are in the same set.
● Let each set have a selected object called a representative.
● Switch to the following queries:
○ find: given an object a, find the representative rep(a) of its set.
○ union: given two representatives a and b, change the representative of the set containing a to b.
Disjoint Set Union: naive approach
● Store the objects of each set in a rooted tree with the representative at the root
● Let p[v] be the parent of object v
● In the example, p[1] = 2 and p[7] = None, so 7 is a representative
● find(a):
    if p[a] == None: return a
    else: return find(p[a])
● unite(a, b): p[a] = b
● Example: unite(2, 7) sets p[2] = 7, so 7 becomes the representative of the merged set
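● O(n) per find in the worst case, since a tree can degenerate into a chain of n objects; each merge or same-set query needs two find calls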
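The naive approach can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: the class name NaiveDSU and the 0-based object numbering are assumptions, while p[], find and unite follow the slide.

```python
class NaiveDSU:
    """Naive disjoint set union: each set is a rooted tree stored in p[]."""

    def __init__(self, n):
        # p[v] is the parent of v, or None if v is a representative (root).
        self.p = [None] * n

    def find(self, a):
        # Walk up the parent pointers until the root is reached.
        while self.p[a] is not None:
            a = self.p[a]
        return a

    def unite(self, a, b):
        # a and b must be representatives of two different sets.
        self.p[a] = b


dsu = NaiveDSU(8)
dsu.unite(1, 2)  # p[1] = 2
dsu.unite(2, 7)  # p[2] = 7, so 7 now represents {1, 2, 7}
print(dsu.find(1))                 # 7
print(dsu.find(1) == dsu.find(3))  # False
```

Always hanging one root under the other like this is what allows a chain of length n to form, which is the O(n) worst case.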


Disjoint Set Union: path compression
● During a find(a) call, set p[v] = rep(a) for every vertex v on the path from a to the root
● The depth of these vertices and of everything in their subtrees decreases

[Figure: example of find(6)]
Disjoint Set Union: path compression
● find(a):
    if p[a] != None:
        p[a] = find(p[a])
        return p[a]
    return a

[Figure: example of find(4)]
Disjoint Set Union: path compression
● For convenience, let p[a] = a if a is a representative
● find(a):
    if p[a] != a: p[a] = find(p[a])
    return p[a]

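The recursive find above can hit Python's recursion limit on a long chain (1000 frames by default in CPython). Below is an iterative two-pass sketch of the same path compression; passing the parent list p as an argument is an assumption made to keep the example self-contained.

```python
def find(p, a):
    """Return the representative of a and compress the path to it.

    p is the parent list with p[r] == r for every representative r.
    """
    # First pass: locate the root.
    root = a
    while p[root] != root:
        root = p[root]
    # Second pass: point every vertex on the path directly at the root.
    while p[a] != root:
        p[a], a = root, p[a]
    return root


p = [0, 0, 1, 2, 3]  # a chain 4 -> 3 -> 2 -> 1 -> 0
print(find(p, 4))    # 0
print(p)             # [0, 0, 0, 0, 0]
```

After the call every vertex of the old chain is a direct child of the root, so later find calls on them take one step.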
Disjoint Set Union: rank heuristic
● Suppose we don’t do path compression
● Let the rank r[q] be the maximum depth of the tree whose representative is q
● In union(a, b) we have two options: either make p[a] = b, or p[b] = a
● Choose the option that minimizes the resulting rank: hang the lower-rank tree under the higher-rank one, i.e. p[a] = b if r[a] < r[b] (either choice works when the ranks are equal)

[Figure: example of union(2, 7)]
Disjoint Set Union: rank heuristic
● Ranks only increase when we unite trees of equal ranks
● union(a, b):
    if r[a] > r[b]: swap(a, b)
    p[a] = b
    if r[a] == r[b]: r[b]++

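A small Python sketch of the rank heuristic on its own, without path compression. The function signatures (passing p and r explicitly) and the guard for a == b are assumptions added for illustration.

```python
def find(p, a):
    # No path compression here: just walk up to the root.
    while p[a] != a:
        a = p[a]
    return a


def union(p, r, a, b):
    """Merge the trees of representatives a and b by rank."""
    if a == b:
        return  # already the same set (guard not shown on the slide)
    if r[a] > r[b]:
        a, b = b, a  # make a the root of the lower-rank tree
    p[a] = b
    if r[a] == r[b]:
        r[b] += 1    # equal ranks: the merged tree is one level deeper


n = 4
p = list(range(n))  # every object starts as its own representative
r = [0] * n
union(p, r, 0, 1)                      # equal ranks: p[0] = 1, r[1] = 1
union(p, r, 2, 3)                      # equal ranks: p[2] = 3, r[3] = 1
union(p, r, find(p, 0), find(p, 2))    # union(1, 3): p[1] = 3, r[3] = 2
print(p, r)  # [1, 3, 3, 3] [0, 1, 0, 2]
```

Here object 0 ends up at depth 2, which equals the rank of its representative 3.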
Disjoint Set Union: heuristics
● With path compression only, amortized complexity is O(log n) per query
● With rank heuristic only, complexity is O(log n) per query
● With both heuristics, amortized complexity is O(α(n)) per query, where α is the inverse Ackermann function
● With both heuristics, the rank is not necessarily equal to the maximum depth, but the depth is never greater than the rank
find(a):
    if p[a] != a:
        p[a] = find(p[a])
    return p[a]

union(a, b):
    a = find(a)
    b = find(b)
    if a == b:
        return
    if r[a] > r[b]:
        swap(a, b)
    p[a] = b
    if r[a] == r[b]:
        r[b]++
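Putting both heuristics together, a Python sketch might look like this. The class name DSU, the boolean return value of union and the helper same are assumptions for illustration; the logic of find and union follows the pseudocode above.

```python
class DSU:
    """Disjoint set union with path compression and the rank heuristic."""

    def __init__(self, n):
        self.p = list(range(n))  # p[a] == a means a is a representative
        self.r = [0] * n         # rank: upper bound on the tree depth

    def find(self, a):
        # Recursion depth is bounded by the rank, which is at most log2(n).
        if self.p[a] != a:
            self.p[a] = self.find(self.p[a])
        return self.p[a]

    def union(self, a, b):
        """Merge the sets of a and b; return False if already merged."""
        a, b = self.find(a), self.find(b)
        if a == b:
            return False
        if self.r[a] > self.r[b]:
            a, b = b, a
        self.p[a] = b
        if self.r[a] == self.r[b]:
            self.r[b] += 1
        return True

    def same(self, a, b):
        # The original "are a and b in the same set?" query.
        return self.find(a) == self.find(b)


dsu = DSU(5)
dsu.union(0, 1)
dsu.union(3, 4)
print(dsu.same(0, 1))   # True
print(dsu.same(1, 3))   # False
print(dsu.union(1, 4))  # True
print(dsu.same(0, 3))   # True
```

Returning a boolean from union is a convenience: it tells the caller whether two different sets were actually merged.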
Disjoint Set Union: rank heuristic complexity
● If r[a] == k, then the tree with representative a has at least 2^k objects
● Proof by induction:
○ Base k = 0: a set with one object
○ Inductive step: we can only get rank k after merging two trees of rank k - 1; each of them has at least 2^(k-1) objects, so the resulting tree has at least 2^k objects
● The rank can’t be greater than log_2(n), so find works in O(log n)
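The bound can also be checked empirically. The sketch below runs random unions with the rank heuristic only and verifies size >= 2^rank after every merge; the function name check_rank_bound, the size[] array and the seeded random generator are assumptions introduced for this experiment.

```python
import random


def check_rank_bound(n, unions, seed=0):
    """Run random unions by rank; return True if size >= 2**rank always held."""
    rng = random.Random(seed)
    p = list(range(n))
    r = [0] * n
    size = [1] * n  # size[q]: number of objects in the tree rooted at q

    def find(a):
        while p[a] != a:
            a = p[a]
        return a

    for _ in range(unions):
        a, b = find(rng.randrange(n)), find(rng.randrange(n))
        if a == b:
            continue
        if r[a] > r[b]:
            a, b = b, a
        p[a] = b
        size[b] += size[a]
        if r[a] == r[b]:
            r[b] += 1
        if size[b] < 2 ** r[b]:
            return False  # would contradict the lemma
    return True


print(check_rank_bound(1000, 5000))  # True
```

This is a sanity check rather than a proof; the induction above is what guarantees the result for every sequence of unions.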
Disjoint Set Union: path compression complexity
● Split the edges into three groups:
○ Root edges point to a representative
○ Heavy edges u->v, where v is the parent of u, are the non-root edges with size(u) >= size(v) / 2
○ All the other edges are light
● Count passes for each edge group separately
● At most one root edge in each find call
● At most log(n) light edges in each find call, since the subtree size more than doubles along each light edge
● After we pass a heavy edge u->v, path compression reattaches u to the root, so the size of the v subtree shrinks by at least a factor of two; v is not a root, so its size never increases again
● Hence each vertex v loses a heavy child at most log(n) times, and the total number of passes through heavy edges is at most n log(n)
Disjoint Set Union: problem 1
Disjoint Set Union: problem 2
Minimum Spanning Tree
● Statement:
○ Let G be a weighted undirected connected graph with n vertices
○ A spanning tree is a set of n-1 edges of G connecting all vertices
○ A minimum spanning tree is a spanning tree with the minimum possible total edge weight
Minimum Spanning Tree: generic approach
● The algorithms we will discuss all follow the same greedy scheme:
○ Maintain F, a set of edges that is a subset of some MST. Initially F is empty.
○ Find an edge e such that F with e added is still a subset of some MST.
○ Such an edge is called a safe edge.
○ When there is no safe edge, F is an MST.
● How to find a safe edge?
Minimum Spanning Tree: safe edge lemma
● Let F be a subset of some MST. Consider S, a connected component of the graph formed by the vertices of G and the edges of F.
● Let e be the minimum-weight edge connecting S with vertices outside S.
● Then e is a safe edge.
Safe edge lemma proof: cut-and-paste
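● Let M be an MST containing F. If e is in M, then it is safe; otherwise adding e to M creates a cycle.
● Walk along the cycle starting from the endpoint of e inside S, and let e’ be the first edge of the cycle that is not in F.
● Since e connects two different connected components of F, e’ exists, and it also connects S with a vertex outside S.
● Replace e’ with e: M is still a spanning tree, and its weight did not increase because e is the lightest edge leaving S.
● It means e is safe.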
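The weight comparison in the exchange step can be written out explicitly. The names M' and w(·) are notation introduced here for illustration and do not appear on the slide.

```latex
M' = (M \setminus \{e'\}) \cup \{e\}, \qquad
w(M') = w(M) - w(e') + w(e) \le w(M)
\quad \text{because } w(e) \le w(e').
```

Since M is an MST, w(M') >= w(M) as well, so w(M') = w(M). As e' is not in F, the tree M' is an MST that contains F together with e.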
Prim’s Algorithm
● Start with empty F and S consisting of a single vertex.
● On each step, add to F a safe edge connecting S with the other vertices.
● Add the new endpoint of this edge to S.
● To find the safe edge, maintain d[], where d[v] is the smallest weight of an edge connecting vertex v with a vertex in S.
● On each step, choose v not in S with the smallest d[v].
● Recompute d[] for the neighbours of v, which are the new candidates for safe edges.
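The steps above can be sketched in Python with a linear scan for the minimum. This is an illustrative sketch under several assumptions: the graph is given as an adjacency matrix w with math.inf for missing edges, the start vertex is 0, and the names prim_dense, in_s and parent are hypothetical.

```python
import math


def prim_dense(w):
    """Prim's algorithm on an n x n symmetric weight matrix.

    w[u][v] is the edge weight, or math.inf if there is no edge.
    Returns (total weight, list of MST edges as (u, v, weight)).
    """
    n = len(w)
    in_s = [False] * n
    d = [math.inf] * n  # d[v]: lightest edge from v into S
    parent = [-1] * n   # the endpoint in S of that edge
    d[0] = 0            # S starts from vertex 0
    total = 0
    edges = []
    for _ in range(n):
        # Choose v not in S with the smallest d[v] (linear scan).
        v = min((u for u in range(n) if not in_s[u]), key=lambda u: d[u])
        if d[v] == math.inf:
            raise ValueError("graph is not connected")
        in_s[v] = True
        total += d[v]
        if parent[v] != -1:
            edges.append((parent[v], v, d[v]))
        # Recompute d[] for vertices still outside S.
        for u in range(n):
            if not in_s[u] and w[v][u] < d[u]:
                d[u] = w[v][u]
                parent[u] = v
    return total, edges


INF = math.inf
w = [
    [INF, 1, 3, INF],
    [1, INF, 2, 5],
    [3, 2, INF, 4],
    [INF, 5, 4, INF],
]
print(prim_dense(w))  # (7, [(0, 1, 1), (1, 2, 2), (2, 3, 4)])
```

Each of the n steps scans all vertices twice, which gives quadratic running time regardless of the number of edges.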
Prim’s Algorithm
● Similar to Dijkstra’s algorithm, but with a different distance function
● O(n^2 + m): find the minimum of d[v] in linear time
● O(m log(n)): use a heap to find the minimum of d[v]
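A sketch of the heap-based variant using Python's heapq module. It uses lazy deletion (stale heap entries are skipped when popped) instead of a decrease-key operation, which is a swapped-in simplification; the heap then holds up to m entries, and O(m log m) is the same as O(m log n). The names prim_heap and adj are assumptions.

```python
import heapq


def prim_heap(n, adj):
    """Prim's algorithm with a binary heap.

    adj[v] is a list of (weight, u) pairs for the edges incident to v.
    Returns the total weight of an MST of a connected graph.
    """
    in_s = [False] * n
    heap = [(0, 0)]  # entries are (candidate d[v], v); start from vertex 0
    total = 0
    added = 0
    while heap and added < n:
        dv, v = heapq.heappop(heap)
        if in_s[v]:
            continue  # stale entry: v was already added with a lighter edge
        in_s[v] = True
        total += dv
        added += 1
        for wt, u in adj[v]:
            if not in_s[u]:
                heapq.heappush(heap, (wt, u))
    if added < n:
        raise ValueError("graph is not connected")
    return total


edge_list = [(0, 1, 1), (1, 2, 2), (0, 2, 3), (2, 3, 4), (1, 3, 5)]
adj = [[] for _ in range(4)]
for u, v, wt in edge_list:
    adj[u].append((wt, v))
    adj[v].append((wt, u))
print(prim_heap(4, adj))  # 7
```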
Kruskal’s Algorithm
● Maintain F as a subset of some MST
● At each step select the lightest edge connecting two different components of F
● This edge is safe by the safe edge lemma
● Add this edge to F
● To find the lightest edge, loop through the edges sorted by weight:
○ If the edge connects two different components, add it and merge the components
○ Otherwise skip it
Kruskal’s algorithm
● To efficiently check whether two vertices belong to the same component, use DSU
● initialize DSU
    sort edges by weight
    for e in edges:
        if find(e.v) != find(e.u):
            add e to MST
            union(e.v, e.u)
● O(m α(n) + m log(m)) with DSU; the sorting step dominates
● For dense graphs, Prim’s O(n^2 + m) version is faster. For sparse graphs, Kruskal is faster
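The pseudocode above can be turned into a self-contained Python sketch. The function name kruskal and the (weight, u, v) edge format are assumptions; the DSU is inlined with an iterative path compression and the rank heuristic.

```python
def kruskal(n, edges):
    """Kruskal's algorithm.

    edges is a list of (weight, u, v) tuples with vertices 0..n-1.
    Returns (total weight, chosen edges as (u, v, weight)).
    For a disconnected graph this yields a minimum spanning forest.
    """
    p = list(range(n))
    r = [0] * n

    def find(a):
        # Two-pass path compression.
        root = a
        while p[root] != root:
            root = p[root]
        while p[a] != root:
            p[a], a = root, p[a]
        return root

    total = 0
    mst = []
    for wt, u, v in sorted(edges):  # sort edges by weight
        a, b = find(u), find(v)
        if a == b:
            continue                # same component: skip the edge
        if r[a] > r[b]:
            a, b = b, a
        p[a] = b                    # merge the two components
        if r[a] == r[b]:
            r[b] += 1
        total += wt
        mst.append((u, v, wt))
    return total, mst


edges = [(3, 0, 2), (1, 0, 1), (5, 1, 3), (2, 1, 2), (4, 2, 3)]
print(kruskal(4, edges))  # (7, [(0, 1, 1), (1, 2, 2), (2, 3, 4)])
```

The edge (0, 2) of weight 3 is skipped because 0 and 2 are already connected through vertex 1 when it is considered.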
