UNIT - 1: Disjoint SETS: Equivalence Relations

The document discusses the disjoint set data structure and algorithms for solving the dynamic equivalence problem. It covers quick-find, quick-union, and two optimizations: weighted quick-union and path compression. Weighted quick-union improves quick-union by linking the smaller tree to the larger tree during union to limit depth growth. Path compression further optimizes find by making all nodes on the search path point directly to the root. Together these approaches guarantee an efficient algorithm: any sequence of M union-find operations on N objects makes O(N + M log* N) array accesses.

Uploaded by

Guna Shekar

UNIT -1 : Disjoint SETS

The disjoint set ADT is an efficient data structure for solving the equivalence problem.
It has wide applications: Kruskal's minimum spanning tree algorithm, least
common ancestor, compiling equivalence statements in Fortran, Matlab's
bwlabel() function in image processing, and so on.

An implementation of quick-find, quick-union, and smart union with
path compression in C can be seen here, along with an application that
solves a problem in C++.

Equivalence relations

In order to better describe the dynamic equivalence problem, we first need to
talk about the concept of an equivalence relation. A relation R is defined on
a set S if for every pair of elements (a, b), a, b ∈ S, a R b is either true or
false. If a R b is true, then we say that a is related to b. An equivalence
relation is a relation R that satisfies three properties:

1. (Reflexive) a R a, for all a ∈ S.

2. (Symmetric) a R b iff b R a.

3. (Transitive) a R b and b R c implies that a R c.

Usually, we use ∼ to denote an equivalence relation. Let's consider
several examples:

1. The ≤ relationship is NOT an equivalence relationship.

Although it is reflexive (i.e., a ≤ a) and transitive (i.e., a ≤ b and
b ≤ c implies a ≤ c), it is not symmetric, since a ≤ b does not
imply b ≤ a.

2. Electrical connectivity, where all connections are by metal
wires, is an equivalence relation. The relation is clearly
reflexive, as any component is connected to itself. If a is
electrically connected to b, then b must be electrically
connected to a, so the relation is symmetric. Finally, if a is
connected to b and b is connected to c, then a is connected to
c, so the relation is transitive.

3. Two cities are related if they are in the same country. This is an
equivalence relation.
4. Suppose town a is related to b if it is possible to travel from a
to b by taking roads. This relation is an equivalence relation if
all the roads are two-way.
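
The three properties can be checked mechanically on a small finite set. The following is a minimal sketch, not part of the original notes: the helper names and the "same parity" relation are assumptions chosen for illustration.

```python
from itertools import product

def is_reflexive(elements, related):
    # a R a must hold for every element
    return all(related(a, a) for a in elements)

def is_symmetric(elements, related):
    # whenever a R b holds, b R a must hold too
    return all(related(b, a)
               for a, b in product(elements, repeat=2) if related(a, b))

def is_transitive(elements, related):
    # whenever a R b and b R c hold, a R c must hold too
    return all(related(a, c)
               for a, b, c in product(elements, repeat=3)
               if related(a, b) and related(b, c))

def is_equivalence(elements, related):
    return (is_reflexive(elements, related)
            and is_symmetric(elements, related)
            and is_transitive(elements, related))

def leq(a, b):
    return a <= b

def same_parity(a, b):
    # hypothetical example relation: both even or both odd
    return a % 2 == b % 2

S = range(6)
print(is_equivalence(S, leq))          # fails the symmetry check
print(is_equivalence(S, same_parity))  # passes all three checks
```

The brute-force checks are cubic in the size of the set, so this is only meant for small examples like the ones above.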

We need to define another term, equivalence class, in order to talk about the
dynamic equivalence problem. Suppose we are given a set S of elements with an
equivalence relation defined over it (e.g., for a set {a1, a2, a3}, we
have a1 ∼ a2). The equivalence class of an element a ∈ S is the subset of S
that contains all the elements that are related to a. Notice that the
equivalence classes form a partition of S: every member of S appears in
exactly one equivalence class.
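
To make the partition concrete, here is a small sketch that groups a finite set into equivalence classes. The function name and the "same remainder modulo 3" relation are assumptions for illustration, and the code relies on the relation really being an equivalence relation, so that comparing against one representative per class is enough.

```python
def equivalence_classes(elements, related):
    classes = []
    for x in elements:
        for cls in classes:
            # By symmetry and transitivity, one representative decides membership.
            if related(x, cls[0]):
                cls.append(x)
                break
        else:
            # x is related to no existing class, so it starts a new one.
            classes.append([x])
    return classes

classes = equivalence_classes(range(7), lambda a, b: a % 3 == b % 3)
print(classes)  # [[0, 3, 6], [1, 4], [2, 5]]
```

Every element of the input appears in exactly one of the returned lists, which is the partition property stated above.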

The dynamic equivalence problem

The dynamic equivalence problem is essentially about supporting two
operations on a set of elements over which an equivalence relation is
defined:

• find, which returns the name of the set (i.e., the
equivalence class) containing a given element.
• union, which merges the two equivalence classes
containing a and b into a new equivalence class. From a set
point of view, the result of union is to create a new set
S_k = S_i ∪ S_j, destroying the originals and preserving the
disjointness of all the sets.

We can model the problem as follows: the input is initially a
collection of N sets, each with one element. This initial representation
means that all relations (except reflexive relations) are false. Each set has a
different element, so that S_i ∩ S_j = ∅ for i ≠ j; this makes the sets
disjoint. In addition, since we only care about which set each element
belongs to, not about the elements' values, we can assume that all the
elements have been numbered sequentially from 1 to N. Thus, we have
S_i = {i} for i = 1 through N. Finally, we don't care what value the find
operation returns, as long as find(a) = find(b) exactly when a and b are in
the same set.

Now, let's take a look at an example. Suppose we have a set of 10
elements, {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, and we perform the following union
operations: 1 − 2, 3 − 4, 5 − 6, 7 − 8, 7 − 9, 2 − 8, 0 − 5, 1 − 9. Then we
have three connected components (i.e., maximal sets of objects that are
mutually connected): {0, 5, 6}, {3, 4}, {1, 2, 7, 8, 9}. find(5) should return
the same value as find(6).

Quick-find
The first approach to solve the problem is called quick-find, which
ensures that the find instruction can be executed in constant worst-case
time. For the find operation to be fast, we could maintain, in an array, the
name of the equivalence class for each element. Then find is just a
simple O(1) lookup:

In the above example, find(0) gives 0; find(1) gives 1; find(5) gives 0.
Thus, we know that 0 ∼ 5, 0 ≁ 1, and 1 ≁ 5. For the union(a,b) operation,
suppose that a is in equivalence class i and b is in equivalence class j.
Then we scan down the array, changing all i's to j.

In the above example, when we do union(6,1), we need to change all entries
in the equivalence class of 6 (i.e., 0, 5, 6) into 1's. As you can see, the
number of array accesses for a union operation is O(N). Thus, a sequence of
N − 1 unions (the maximum, since after that everything is in one set) would
take O(N^2) time.
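
The description above can be sketched in a few lines of Python. This is an illustrative sketch rather than the implementation the notes link to; the class name and the example driver are assumptions. The names that find returns depend on the order in which classes are overwritten, so they need not match the values quoted in the text; only equality of names matters.

```python
class QuickFind:
    def __init__(self, n):
        # id[i] is the name of the equivalence class containing i
        self.id = list(range(n))

    def find(self, p):
        # constant-time lookup
        return self.id[p]

    def union(self, p, q):
        pid, qid = self.id[p], self.id[q]
        if pid == qid:
            return
        # scan the whole array, renaming class pid to qid: O(N)
        for i in range(len(self.id)):
            if self.id[i] == pid:
                self.id[i] = qid

qf = QuickFind(10)
for p, q in [(1, 2), (3, 4), (5, 6), (7, 8), (7, 9), (2, 8), (0, 5), (1, 9)]:
    qf.union(p, q)

print(qf.find(5) == qf.find(6))  # True: 5 and 6 are in the same set
print(len(set(qf.id)))           # 3 distinct class names remain
```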

Quick-union
The second approach to solve the problem, called quick-union, aims to make the
union instruction run in constant worst-case time. One thing to note is that
find and union cannot both be done in constant worst-case time. Recall that
the problem doesn't require a find operation to return any specific name, as
long as find on elements in the same connected component returns the same
value. Thus, we can use a tree to represent each component, because all
elements in a tree share the same root, and the root can be used to name the
set. The structure looks like below:
Since only the name of the parent is required, we can assume that this tree
is stored implicitly in an array: each entry id[i] in the array represents the
parent of element i. If i is the root, then id[i] = i. A find(X) on element X is
performed by returning the root of the tree containing X. The time to
perform this operation depends on the depth of the tree that represents
the set containing X, which is O(N) in the worst case because of the
possibility of creating a tree of depth N − 1. union(p,q) can be done by
changing the parent of the root of the tree containing p to the root of the
tree containing q:

The step in union(p,q) that changes the root value is O(1). However, we
first need to find the roots of p and q, which takes O(N) in the
worst case. Thus, the union operation as a whole takes O(N).
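
A minimal sketch of quick-union follows, using the id[] parent array described above. The class name, the depth helper, and the driver that builds a worst-case chain are assumptions added for illustration.

```python
class QuickUnion:
    def __init__(self, n):
        # id[i] is the parent of i; a root is its own parent
        self.id = list(range(n))

    def find(self, x):
        # follow parent links up to the root
        while self.id[x] != x:
            x = self.id[x]
        return x

    def union(self, p, q):
        root_p, root_q = self.find(p), self.find(q)
        if root_p != root_q:
            # hang the tree containing p under the root of q's tree
            self.id[root_p] = root_q

    def depth(self, x):
        # number of links between x and its root (illustration only)
        d = 0
        while self.id[x] != x:
            x = self.id[x]
            d += 1
        return d

# union(i, i + 1) in increasing order builds a single chain 0 -> 1 -> ... -> 9
qu = QuickUnion(10)
for i in range(9):
    qu.union(i, i + 1)

print(qu.find(0))   # 9
print(qu.depth(0))  # 9, i.e. N - 1
```

The chain shows why find, and therefore union, can cost O(N) with this scheme.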

Improvements
There are two major improvements we can make to quick-union: smart union
works on the union operation and path compression works on the find operation.
Their goal is to keep the tree of each set shallow, which reduces the time we
spend on find.

Smart union (weighted quick-union)

Smart union is a modification to quick-union that avoids tall trees. We keep
track of the size (i.e., number of objects) of each tree and always link the
root of the smaller tree to the root of the larger tree, breaking ties by any
method. This approach is called union-by-size. In plain quick-union, we may
make the larger tree a subtree of the smaller tree, which increases the depth
of the new tree and therefore the find cost. The following picture
demonstrates this point:

Another approach is called union-by-height, which tracks the height,
instead of the size, of each tree and performs union by making the shallower
tree a subtree of the deeper tree. The height of a tree increases only
when two equally deep trees are joined (and then the height goes up by
one). Thus, union-by-height is a trivial modification of union-by-size.

To find the running time of find and union, we need to bound the depth of
any node X, which in this case is at most log N. The proof is simple: the
depth of X increases only when its tree is linked under a tree that is at
least as large, so the size of the tree containing X at least doubles (the
extreme case being the join of two equal-size trees). Since a tree has at
most N nodes, the size of the tree containing X can double at most log N
times. Thus, the depth of any node is at most log N. With this claim, the
running time for find is O(log N), and the running time for union is
O(log N) as well.
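
Union-by-size can be sketched as below. The class and field names are assumptions; the driver merges equal-size trees in rounds, which is the case that makes trees as deep as the log N bound allows.

```python
class WeightedQuickUnion:
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n  # size[r] is meaningful only when r is a root

    def find(self, x):
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, p, q):
        root_p, root_q = self.find(p), self.find(q)
        if root_p == root_q:
            return
        # make root_p the root of the larger tree (ties keep root_p)
        if self.size[root_p] < self.size[root_q]:
            root_p, root_q = root_q, root_p
        self.parent[root_q] = root_p
        self.size[root_p] += self.size[root_q]

    def depth(self, x):
        # number of links between x and its root (illustration only)
        d = 0
        while self.parent[x] != x:
            x = self.parent[x]
            d += 1
        return d

# Merge equal-size trees in rounds: sizes 1+1, 2+2, 4+4, 8+8.
n = 16
wqu = WeightedQuickUnion(n)
step = 1
while step < n:
    for i in range(0, n, 2 * step):
        wqu.union(i, i + step)
    step *= 2

max_depth = max(wqu.depth(i) for i in range(n))
print(max_depth)  # 4, which equals log2(16)
```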

Path compression
Path compression is performed during a find operation and is independent of the
strategy used to perform union. The effect of path compression is that every node
on the path from X to the root has its parent changed to the root. For example,
suppose we call find(9) for the following tree representation of our disjoint set:

Then the following picture shows the end state of our tree after calling
find(9). As you can see, on the path from 9 to 0 (the root), we have 9, 6, 3, 1.
All of them are directly connected to the root after the call is done:

This strategy may look familiar to you: we do the path compression in the
hope that fast future accesses to these nodes (i.e., 9, 6, 3, 1) will pay off
for the work we do now. This idea is exactly the same as splaying in a
splay tree.
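
A two-pass find with path compression can be sketched as follows. Since the original pictures are not available, the parent array below is an assumption: it contains only the path 9 → 6 → 3 → 1 → 0 named in the text, and every other element is left as its own root.

```python
def find(parent, x):
    # First pass: locate the root.
    root = x
    while parent[root] != root:
        root = parent[root]
    # Second pass: point every node on the path directly at the root.
    while parent[x] != root:
        next_x = parent[x]
        parent[x] = root
        x = next_x
    return root

parent = list(range(10))
parent[9], parent[6], parent[3], parent[1] = 6, 3, 1, 0  # assumed path only

print(find(parent, 9))  # 0
print([parent[i] for i in (9, 6, 3, 1)])  # [0, 0, 0, 0]
```

After the call, a second find(9) reaches the root in a single step.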

When unions are done arbitrarily, path compression is a good idea, because
there is an abundance of deep nodes and these are brought near the root
by path compression. Path compression is perfectly compatible with
union-by-size, and thus both routines can be implemented at the same time. In
fact, the combination of path compression and a smart union rule
guarantees a very efficient algorithm in all cases. Path compression is not
entirely compatible with union-by-height, because path compression can
change the heights of the trees. We don't want to recompute all the heights,
so in this case the heights stored for each tree become estimated heights
(i.e., ranks); in theory, union-by-rank is as efficient as union-by-size.
If we analyze smart union with path compression, any sequence of M
union-find operations on N objects makes O(N + M log* N) array accesses.
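
Putting the two ideas together, a union-by-rank structure with path compression might look like the sketch below. The class name, the connected helper, and the boolean return value of union are assumptions for illustration; the driver replays the ten-element example from earlier in the notes.

```python
class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n  # estimated height; never recomputed after compression

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # path compression: relink every node on the path to the root
        while self.parent[x] != root:
            next_x = self.parent[x]
            self.parent[x] = root
            x = next_x
        return root

    def union(self, p, q):
        root_p, root_q = self.find(p), self.find(q)
        if root_p == root_q:
            return False  # already in the same set
        # make root_p the root with the larger rank (ties keep root_p)
        if self.rank[root_p] < self.rank[root_q]:
            root_p, root_q = root_q, root_p
        self.parent[root_q] = root_p
        if self.rank[root_p] == self.rank[root_q]:
            self.rank[root_p] += 1  # rank grows only when equal ranks meet
        return True

    def connected(self, p, q):
        return self.find(p) == self.find(q)

ds = DisjointSet(10)
pairs = [(1, 2), (3, 4), (5, 6), (7, 8), (7, 9), (2, 8), (0, 5), (1, 9)]
merges = sum(ds.union(p, q) for p, q in pairs)
components = 10 - merges

print(components)          # 3
print(ds.connected(5, 6))  # True
print(ds.connected(0, 1))  # False
```

Counting successful unions gives the number of components directly, because each one reduces the number of sets by exactly one.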

The following table summarizes the running time for M union-find
operations on a set of N objects (don't forget we need to spend O(N) to
initialize the disjoint sets):

The running time of each operation for each algorithm is as follows:

Remarks
Essentially, the union-find structure addresses the "dynamic connectivity
problem":

Given a set of N objects, support two operations: 1. Connect two
objects. 2. Is there a path connecting the two objects?

For example, given two points in a maze, we may ask "Is there a path
connecting p and q?" Objects can be:
• Pixels in a digital photo.
• Computers in a network.
• Friends in a social network.
• Transistors in a computer chip.
• Elements in a mathematical set.
• Variable names in a Fortran program.
• Metallic sites in a composite system.

Other union-find applications:

• Percolation.
• Games (Go, Hex).
• Dynamic connectivity.
• Least common ancestor.
• Equivalence of finite state automata.
• Hoshen-Kopelman algorithm in physics.
• Hindley-Milner polymorphic type inference.
• Kruskal's minimum spanning tree algorithm.
• Compiling equivalence statements in Fortran.
• Morphological attribute openings and closings.
• Matlab's bwlabel() function in image processing.

Links to resources

Here are some of the resources I found helpful while preparing this
article:

1. M. A. Weiss, Data Structures and Algorithm Analysis in C (2nd ed.).
Menlo Park, Calif: Addison-Wesley, 1997, ch. 8.

2. R. Sedgewick and K. Wayne, Algorithms (4th ed.).
Upper Saddle River, NJ: Addison-Wesley, 2011, ch. 1, sec. 5.

1. log* N counts the number of times you have to take the log of N
to get one. This is also called the iterated log function. For
example, log* 65536 = 4 because log log log log 65536 = 1
(with logs taken base 2).
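
The iterated logarithm is easy to compute directly. The sketch below assumes base-2 logarithms and counts applications until the value is at most 1; the function name is hypothetical.

```python
import math

def log_star(n):
    # number of times log2 must be applied before the value drops to 1 or below
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count

print(log_star(65536))  # 4: 65536 -> 16 -> 4 -> 2 -> 1
print(log_star(16))     # 3: 16 -> 4 -> 2 -> 1
```

The function grows so slowly that, for any input size that fits in memory, log* N is a small constant in practice.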
