Disjoint Sets

Uploaded by

dpkmds
Copyright
© All Rights Reserved
Disjoint Sets Data Structure

Disjoint Sets
• Some applications require maintaining a collection of disjoint sets.

• A disjoint-set structure S is a collection of sets $S_1, \ldots, S_n$ such that $S_i \cap S_j = \emptyset$ for all $i \neq j$.

• Each set has a representative, which is a member of the set (usually the minimum, if the elements are comparable).
Disjoint Set Operations
• Make-Set(x) – Creates a new set whose only element is x (which is therefore the representative of the set). O(1) time.

• Union(x,y) – Replaces $S_x$ and $S_y$ by $S_x \cup S_y$; one of the elements of $S_x \cup S_y$ becomes the representative of the new set. O(log n) time.

• Find(x) – Returns the representative of the set containing x. O(log n) time.
Analyzing Operations

• We usually analyze a sequence of m operations, n of which are Make_Set operations; m is the total number of Make_Set, Find, and Union operations.

• Each Union operation decreases the number of sets in the data structure by one, so there cannot be more than n-1 Union operations.
Applications

• Equivalence Relations (e.g., Connected Components)

• Minimum Spanning Trees

Connected Components
• Given a graph G, we first preprocess G to maintain a set of connected components:
CONNECTED_COMPONENTS(G)

• Later, a series of queries can be executed to check whether two vertices are part of the same connected component:
SAME_COMPONENT(u,v)
Connected Components
CONNECTED_COMPONENTS(G)
    for each vertex v in V[G]
        do MAKE_SET(v)
    for each edge (u,v) in E[G]
        do if FIND_SET(u) != FIND_SET(v)
            then UNION(u,v)
Connected Components
SAME_COMPONENT(u,v)
    return FIND_SET(u) == FIND_SET(v)

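Example

The graph has vertices a, b, c, d, e, f, g, h, i, j. CONNECTED_COMPONENTS processes the edges in the order below; each row lists the disjoint sets after that edge has been processed.

Edge processed | Collection of disjoint sets
initial sets   | {a} {b} {c} {d} {e} {f} {g} {h} {i} {j}
(b,d)          | {a} {b,d} {c} {e} {f} {g} {h} {i} {j}
(e,g)          | {a} {b,d} {c} {e,g} {f} {h} {i} {j}
(a,c)          | {a,c} {b,d} {e,g} {f} {h} {i} {j}
(h,i)          | {a,c} {b,d} {e,g} {f} {h,i} {j}
(a,b)          | {a,b,c,d} {e,g} {f} {h,i} {j}
(e,f)          | {a,b,c,d} {e,f,g} {h,i} {j}
(b,c)          | {a,b,c,d} {e,f,g} {h,i} {j}   (no change: b and c are already in the same set)

Result: four connected components, {a,b,c,d}, {e,f,g}, {h,i}, and {j}.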
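The two procedures and the example above can be sketched in Python. This is a minimal illustration, not the slides' implementation: the `parent` dictionary and the unbalanced `union` are assumptions chosen to keep the sketch short.

```python
def make_set(parent, x):
    """MAKE_SET: x becomes the only member (and representative) of a new set."""
    parent[x] = x

def find_set(parent, x):
    """FIND_SET: follow parent links until reaching the representative."""
    while parent[x] != x:
        x = parent[x]
    return x

def union(parent, x, y):
    """UNION: merge the two sets (no balancing in this sketch)."""
    parent[find_set(parent, y)] = find_set(parent, x)

def connected_components(vertices, edges):
    parent = {}
    for v in vertices:
        make_set(parent, v)
    for u, v in edges:
        if find_set(parent, u) != find_set(parent, v):
            union(parent, u, v)
    return parent

def same_component(parent, u, v):
    return find_set(parent, u) == find_set(parent, v)

# The example graph, with edges in the order processed on the slides.
vertices = "abcdefghij"
edges = [("b", "d"), ("e", "g"), ("a", "c"), ("h", "i"),
         ("a", "b"), ("e", "f"), ("b", "c")]
parent = connected_components(vertices, edges)

# Group the vertices by representative to list the components.
groups = {}
for v in vertices:
    groups.setdefault(find_set(parent, v), []).append(v)
components = sorted(groups.values())
print(components)
# [['a', 'b', 'c', 'd'], ['e', 'f', 'g'], ['h', 'i'], ['j']]
```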
Connected Components

• During the execution of CONNECTED-COMPONENTS on an undirected graph G = (V, E) with k connected components, how many times is FIND-SET called? How many times is UNION called? Express your answers in terms of |V|, |E|, and k.
Solution
• FIND-SET is called 2|E| times. It is called twice on line 4 of CONNECTED-COMPONENTS (the test FIND_SET(u) != FIND_SET(v)), which is executed once for each edge in E[G].

• UNION is called |V| - k times. Lines 1 and 2 create |V| disjoint sets. Each UNION operation decreases the number of disjoint sets by one. At the end there are k disjoint sets, so UNION is called |V| - k times.
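These counts can be checked with a small instrumented sketch in Python. The function `count_calls` and its counters are hypothetical names introduced here; the sketch links roots directly when it performs a union, so that only the FIND-SET calls made by CONNECTED-COMPONENTS itself are counted.

```python
def count_calls(vertices, edges):
    """Run the CONNECTED-COMPONENTS loop and count FIND-SET and UNION calls."""
    parent = {v: v for v in vertices}      # lines 1-2: one MAKE_SET per vertex
    calls = {"find_set": 0, "union": 0}

    def find_set(x):
        calls["find_set"] += 1
        while parent[x] != x:
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find_set(u), find_set(v)  # the two calls on line 4
        if ru != rv:
            calls["union"] += 1
            parent[rv] = ru                # link roots directly: no extra FIND-SET
    return calls

# Example graph from the earlier slides: |V| = 10, |E| = 7, k = 4.
edges = [("b", "d"), ("e", "g"), ("a", "c"), ("h", "i"),
         ("a", "b"), ("e", "f"), ("b", "c")]
calls = count_calls("abcdefghij", edges)
print(calls)  # {'find_set': 14, 'union': 6}
```

Here 14 = 2|E| and 6 = |V| - k, as the solution predicts.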
Linked List implementation
• We maintain a set of linked lists; each list corresponds to a single set.

• Every element of a set points to the first element of its list, which is the representative.

• A pointer to the tail is maintained, so elements are inserted at the end of the list.

a -> b -> c -> d
Union with linked lists

a -> b -> c -> d
+
e -> f -> g
=
e -> f -> g -> a -> b -> c -> d
Analysis
• Using linked lists, MAKE_SET and FIND_SET take constant time. UNION, however, must update the representative pointer of every element of at least one of the two sets, and is therefore linear in the worst case.

• A series of m operations could take $\Theta(m^2)$ time.


Analysis
• Let n be the number of MAKE_SET operations, with $n = \lceil m/2 \rceil + 1$, and let $q = m - n + 1 = \lfloor m/2 \rfloor$. A series of n MAKE_SET operations followed by q-1 UNION operations, where the i-th UNION updates i elements, takes $\Theta(m^2)$ time, since

$$n + 1 + 2 + 3 + \cdots + (q-1) = n + \sum_{i=1}^{q-1} i = \Theta(n + q^2)$$

• Both q and n are $\Theta(m)$, so in total we get $\Theta(m^2)$, which is an amortized cost of $\Theta(m)$ for each operation.
Improvement – Weighted Union
• Always append the shorter list to the longer list. A series of operations will now cost only $O(m + n \log n)$.

• MAKE_SET and FIND_SET are constant time, and there are m operations.

• For UNION, an element's representative changes only when its list is the shorter one, so the merged list is at least twice as long as the list the element was in. An element therefore changes its representative no more than log(n) times, which gives n log n for all UNION operations.
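The effect of weighted union can be illustrated with a short Python sketch that counts representative-pointer updates. The class `ListSets`, its `weighted` flag, and the helper `run` are hypothetical names; Python lists stand in for the linked lists, so the append is not O(1) as it would be with a real tail pointer.

```python
class ListSets:
    """Linked-list style disjoint sets with an update counter."""

    def __init__(self, weighted=True):
        self.rep = {}       # element -> representative (its "head pointer")
        self.members = {}   # representative -> elements of its list, in order
        self.weighted = weighted
        self.updates = 0    # representative pointers rewritten so far

    def make_set(self, x):
        self.rep[x] = x
        self.members[x] = [x]

    def find_set(self, x):
        return self.rep[x]  # constant time

    def union(self, x, y):
        """Append the list containing y to the list containing x.
        With weighted=True the shorter list is always the one appended."""
        rx, ry = self.rep[x], self.rep[y]
        if rx == ry:
            return
        if self.weighted and len(self.members[rx]) < len(self.members[ry]):
            rx, ry = ry, rx
        for e in self.members[ry]:
            self.rep[e] = rx
            self.updates += 1
        self.members[rx].extend(self.members.pop(ry))

def run(n, weighted):
    """n MAKE_SETs, then n-1 UNIONs that append the growing list to a singleton."""
    s = ListSets(weighted)
    for i in range(n):
        s.make_set(i)
    for i in range(1, n):
        s.union(i, i - 1)
    return s.updates

print(run(8, weighted=False))  # 28 = 1 + 2 + ... + 7
print(run(8, weighted=True))   # 7: each union moves a single element
```

On this adversarial sequence the unweighted version performs a quadratic number of updates, while the weighted version performs one update per union.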
Disjoint-Set Forests
• Maintain a collection of trees; each element points to its parent. The root of each tree is the representative of the set.

• We use two strategies for improving running time:
– Union by Rank
– Path Compression

[Figure: example tree with root c, children a, b, and f, and d one level below]
Make Set
• MAKE_SET(x)
    p[x] = x
    rank[x] = 0
Find Set
• FIND_SET(d)
    if d != p[d]
        p[d] = FIND_SET(p[d])
    return p[d]
Union
• UNION(x,y)
    LINK(FIND_SET(x), FIND_SET(y))

• LINK(x,y)
    if rank[x] > rank[y]
        then p[y] = x
    else
        p[x] = y
        if rank[x] = rank[y]
            then rank[y]++

[Figure: two trees, one rooted at c and one rooted at w, before and after UNION(x,y)]
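The Make Set, Find Set, and Union slides can be put together in a short Python sketch. The class name `DisjointSetForest` and the dictionary storage are assumptions; the guard for equal roots in `link` is also an addition, since without it linking a root to itself would raise its rank.

```python
class DisjointSetForest:
    """Disjoint-set forest with union by rank and path compression."""

    def __init__(self):
        self.p = {}      # parent pointers
        self.rank = {}   # upper bound on the height of each node

    def make_set(self, x):
        self.p[x] = x
        self.rank[x] = 0

    def find_set(self, x):
        if self.p[x] != x:
            # Path compression: point x directly at the root.
            self.p[x] = self.find_set(self.p[x])
        return self.p[x]

    def link(self, x, y):
        """Link two roots; the root of smaller rank goes under the other."""
        if x == y:       # guard added in this sketch: already the same tree
            return
        if self.rank[x] > self.rank[y]:
            self.p[y] = x
        else:
            self.p[x] = y
            if self.rank[x] == self.rank[y]:
                self.rank[y] += 1

    def union(self, x, y):
        self.link(self.find_set(x), self.find_set(y))

f = DisjointSetForest()
for v in "abcd":
    f.make_set(v)
f.union("a", "b")   # b becomes the root of {a, b}, rank 1
f.union("c", "d")   # d becomes the root of {c, d}, rank 1
f.union("a", "c")   # equal ranks: b goes under d, and rank[d] becomes 2
root = f.find_set("a")
print(root, f.p["a"])  # d d  (a now points directly at the root)
```

The recursive `find_set` mirrors the slide; for very deep trees an iterative version would avoid Python's recursion limit.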
Analysis
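• In Union we attach the tree of smaller rank to the tree of larger rank, which results in logarithmic depth.

• Path compression can cause a very deep tree to become very shallow.

• Combining both ideas gives us (without proof) a running time of $O(m\,\alpha(m, n))$ for a sequence of m operations, where $\alpha$ is the very slowly growing inverse of Ackermann's function.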
Exercise
• Describe a data structure that supports the following operations:
– find(x) – returns the representative of x
– union(x,y) – unifies the groups of x and y
– min(x) – returns the minimal element in the group of x
Solution
• We modify the disjoint-set data structure so that each group representative keeps a reference to the minimal element of its group.

• The find operation does not change: O(log n).

• The union operation is similar to the original union operation, and the minimal element of the merged group is the smaller of the minimal elements of the two groups.

• min(x) returns the minimal element stored at find(x).
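A minimal Python sketch of this solution follows. The class name `MinDisjointSet` and the field `least` are hypothetical, and the min operation is named `minimum` here to avoid confusion with Python's built-in `min`. Find is written without path compression to match the O(log n) bound quoted above.

```python
class MinDisjointSet:
    """Union by rank, with the group minimum stored at each representative."""

    def __init__(self, elements):
        self.parent = {x: x for x in elements}
        self.rank = {x: 0 for x in elements}
        self.least = {x: x for x in elements}  # meaningful only at roots

    def find(self, x):
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.rank[rx] > self.rank[ry]:
            rx, ry = ry, rx                    # ry is the root of larger or equal rank
        self.parent[rx] = ry
        if self.rank[rx] == self.rank[ry]:
            self.rank[ry] += 1
        # The new minimum is the smaller of the two stored minima.
        self.least[ry] = min(self.least[rx], self.least[ry])

    def minimum(self, x):
        return self.least[self.find(x)]

s = MinDisjointSet(range(1, 8))
s.union(5, 7)
s.union(3, 5)
print(s.minimum(7))  # 3
s.union(1, 3)
print(s.minimum(5))  # 1
```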
Example
• Executing find(5)

[Figure: forest with two trees, rooted at 4 and 6]

Element  1  2  3  4  5  6  ..  N
Parent   4  7  4  4  7  6  2   5
min      1  6
Example
• Executing union(4,6)

[Figure: the tree rooted at 6 is attached to the tree rooted at 4]

Element  1  2  3  4  5  6  ..  N
Parent   4  7  4  4  7  4  2   5
min      1  1
Exercise
• Describe a data structure that supports the following operations:
– find(x) – returns the representative of x
– union(x,y) – unifies the groups of x and y
– deUnion() – undoes the last union operation
Solution
• We modify the disjoint-set data structure by adding a stack that keeps the pairs of representatives merged by the most recent union operations.

• The find operation stays the same, but we cannot use path compression, since we don’t want to modify the structure after union operations.
Solution
• The union operation is the regular operation plus an additional push of (x,y) onto the stack.

• The deUnion operation is as follows:
– (x,y) ← s.pop()
– parent(x) ← x
– parent(y) ← y
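The stack-based solution can be sketched in Python as follows. The class name `UndoableDisjointSet` is hypothetical. Two details are additions not stated on the slides: the stack entry also records whether the union raised a rank, so that deUnion can restore it, and a union of two elements already in the same group pushes a placeholder so that every union has a matching deUnion.

```python
class UndoableDisjointSet:
    """Union by rank without path compression, plus a stack for undo."""

    def __init__(self, elements):
        self.parent = {x: x for x in elements}
        self.rank = {x: 0 for x in elements}
        self.stack = []                   # one entry per union call

    def find(self, x):
        while self.parent[x] != x:        # no path compression
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            self.stack.append(None)       # nothing merged (assumption of this sketch)
            return
        if self.rank[rx] > self.rank[ry]:
            rx, ry = ry, rx               # rx goes under ry
        bumped = self.rank[rx] == self.rank[ry]
        self.parent[rx] = ry
        if bumped:
            self.rank[ry] += 1
        self.stack.append((rx, ry, bumped))

    def de_union(self):
        entry = self.stack.pop()
        if entry is None:
            return
        rx, ry, bumped = entry
        self.parent[rx] = rx              # parent(x) <- x
        self.parent[ry] = ry              # parent(y) <- y (ry is already a root)
        if bumped:
            self.rank[ry] -= 1            # restore the rank (addition to the slides)

s = UndoableDisjointSet("abcd")
s.union("a", "b")
s.union("c", "d")
s.union("a", "c")
print(s.find("a") == s.find("d"))  # True
s.de_union()                       # undo union(a, c)
print(s.find("a") == s.find("c"))  # False
print(s.find("a") == s.find("b"))  # True
```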
Example
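• Example of why we cannot use path compression.

Element  1  2  3  4  5  6  7  8  9  10
parent   4  7  7  4  8  1  5  8  1  4

– Union(8,4)
– Find(2)
– Find(6)
– DeUnion()

Path compression during Find(2) or Find(6) would re-point the elements on the search path directly at the new root. For elements of the set whose representative was linked under the other, DeUnion() could then no longer restore the old sets just by resetting the parents of the two merged representatives.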
