Operations On Dynamic Sets

The document discusses operations on dynamic sets using hash functions, collision resolution techniques, and the implementation of symbol tables. It covers the complexities of searching, inserting, and deleting in hash tables, as well as the use of disjoint sets for connected components and the efficiency of Kruskal's and Prim's algorithms for finding minimum spanning trees. Additionally, it addresses the time complexities associated with these operations and algorithms, emphasizing the importance of heuristics and data structures in optimizing performance.

Uploaded by

Turjo Sarker

Operations on Dynamic sets

Hash functions map items from a very large set (the keys) to a smaller set (the storage locations)
Each item has a key (an integer)
The hash function is expected to spread the keys as if at random
A collision occurs when the mapping for a given key returns an occupied slot
Collisions can be resolved through chaining: items that hash to the same slot share a list
Alternatively, rehashing (open addressing) probes further slots till an unoccupied location is found
Symbol tables in assemblers and compilers are
implemented as hash tables
Operations on hash table
Insert, delete and search
Problem size depends on
Number of slots in the hash table, m
Number of items stored in the hash table, n
Load factor $\alpha = n/m$
Chaining needs traversal along the chain if the slot is occupied => $\alpha > 1$ is possible
Open addressing keeps every item in the table itself and needs to free the slot for insertion upon deletion of an item => $\alpha \le 1$
Search time complexity - Chaining
Search time can be bounded for both success and failure
In case of failure:
Time to hash the key = $O(1)$
Time to find that the key is absent = $O(\alpha)$, the expected chain length
Total time = $O(1 + \alpha)$
In case of success:
Let the item be the i-th one to be inserted
Load factor at that time was $(i-1)/m$
Expected chain traversal needed is of length $(i-1)/m$
Expected no of elements searched, averaged over the n currently stored items
$= \frac{1}{n} \sum_{i=1}^{n} \left(1 + \frac{i-1}{m}\right) = 1 + \frac{1}{nm} \cdot \frac{n(n-1)}{2} = 1 + \frac{\alpha}{2} - \frac{1}{2m} = O(1 + \alpha)$
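The chaining operations analysed above can be sketched as follows. This is a minimal illustration, not the slides' own implementation: the class name `ChainedHashTable`, the default of 7 slots and the use of the division method for hashing are all assumptions made for the example.

```python
class ChainedHashTable:
    """Hash table with collision resolution by chaining (illustrative sketch)."""

    def __init__(self, m=7):
        self.m = m                               # number of slots
        self.n = 0                               # number of stored items
        self.slots = [[] for _ in range(m)]      # one chain (list) per slot

    def _h(self, key):
        # Division method, assumed here for simplicity.
        return key % self.m

    def load_factor(self):
        # alpha = n / m; it may exceed 1 because chains can grow.
        return self.n / self.m

    def insert(self, key, value):
        chain = self.slots[self._h(key)]
        for idx, (k, _) in enumerate(chain):
            if k == key:                         # key already present: overwrite
                chain[idx] = (key, value)
                return
        chain.append((key, value))
        self.n += 1

    def search(self, key):
        # Traverses one chain, whose expected length is alpha.
        for k, v in self.slots[self._h(key)]:
            if k == key:
                return v
        return None

    def delete(self, key):
        chain = self.slots[self._h(key)]
        for idx, (k, _) in enumerate(chain):
            if k == key:
                del chain[idx]
                self.n -= 1
                return True
        return False
```

Storing 12 keys in a table of 5 slots gives a load factor of 2.4, which is only possible because each slot holds a chain rather than a single item.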
Types of hash functions
Division method: $h(k) = k \bmod m$, m is prime
Multiplication method: $h(k) = \lfloor m\,(kA \bmod 1) \rfloor$, $A \approx (\sqrt{5}-1)/2 \approx 0.618$
Collision probability depends on randomness of h
Linear probing: $h(k,i) = (h'(k) + i) \bmod m$
may cause primary clustering
Quadratic probing: $h(k,i) = (h'(k) + c_1 i + c_2 i^2) \bmod m$
may cause secondary clustering
Double hashing: $h(k,i) = (h_1(k) + i\,h_2(k)) \bmod m$
Uniform hashing: unoccupied slots are equally likely to be selected => $h(k,i) = (h_1(k) + i\,h_2(k) + i^2 h_3(k) + \cdots) \bmod m$
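The hash functions and probe sequences listed above can be tried out with a short sketch. The function names, the quadratic constants c1 = 1 and c2 = 3, and the secondary hash h2(k) = 1 + (k mod (m-1)) are assumptions chosen for illustration; the slides do not fix them.

```python
import math

A = (math.sqrt(5) - 1) / 2          # about 0.618, constant of the multiplication method


def h_division(k, m):
    """Division method: k mod m (m is ideally prime)."""
    return k % m


def h_multiplication(k, m):
    """Multiplication method: floor(m * (k*A mod 1))."""
    return math.floor(m * ((k * A) % 1.0))


def linear_probe(k, i, m):
    """i-th slot examined under linear probing."""
    return (h_division(k, m) + i) % m


def quadratic_probe(k, i, m, c1=1, c2=3):
    """i-th slot under quadratic probing; c1 and c2 are assumed constants."""
    return (h_division(k, m) + c1 * i + c2 * i * i) % m


def double_hash_probe(k, i, m):
    """i-th slot under double hashing; h2 is an assumed secondary hash."""
    h1 = k % m
    h2 = 1 + (k % (m - 1))          # never 0, so the sequence always advances
    return (h1 + i * h2) % m


def probe_sequence(probe, k, m):
    """All m probes for key k under the given probing scheme."""
    return [probe(k, i, m) for i in range(m)]
```

With a prime m, `probe_sequence(double_hash_probe, 10, 7)` visits every slot exactly once, whereas linear probing from the same key simply walks the table in order, which is what produces primary clustering.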
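Unsuccessful Search-Open addressing
$p_i$ = prob{exactly i probes access occupied slots} for i = 0,1,2,..,n. $p_i = 0$ for i > n
$q_i$ = prob{at least i probes access occupied slots}; assuming uniform distribution, this will be
$q_i = \frac{n}{m} \cdot \frac{n-1}{m-1} \cdots \frac{n-i+1}{m-i+1} \le \left(\frac{n}{m}\right)^i = \alpha^i$
$\sum_{i=0}^{\infty} i \Pr\{X = i\} = \sum_{i=0}^{\infty} i \left(\Pr\{X \ge i\} - \Pr\{X \ge i+1\}\right) = \sum_{i=1}^{\infty} \Pr\{X \ge i\} = \sum_{i=1}^{\infty} q_i$
Expected no of probes for unsuccessful search
$= 1 + \sum_{i=0}^{\infty} i\,p_i = 1 + \sum_{i=1}^{\infty} q_i \le \sum_{i=0}^{\infty} \alpha^i = \frac{1}{1-\alpha}$
Expected no of probes = 2 for $\alpha = 0.5$ and = 10 for $\alpha = 0.9$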
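The bound on unsuccessful search can be checked numerically. The sketch below evaluates the exact expectation 1 + (sum of q_i) under the uniform hashing assumption and compares it with 1/(1 - alpha); the function names are illustrative, and n < m is assumed so that at least one slot is free.

```python
def q(i, n, m):
    """Probability that at least i probes access occupied slots
    (n items, m slots, uniform hashing)."""
    prob = 1.0
    for j in range(i):
        prob *= (n - j) / (m - j)
    return prob


def expected_unsuccessful_probes(n, m):
    """Exact expectation 1 + sum of q_i for i = 1..n (q_i = 0 beyond n)."""
    return 1.0 + sum(q(i, n, m) for i in range(1, n + 1))


def unsuccessful_bound(alpha):
    """Upper bound 1 / (1 - alpha) on the expected number of probes."""
    return 1.0 / (1.0 - alpha)


if __name__ == "__main__":
    m = 1000
    for n in (500, 900):
        alpha = n / m
        print(alpha, expected_unsuccessful_probes(n, m), unsuccessful_bound(alpha))
```

For a table of 1000 slots the exact values stay slightly below the bounds of 2 and 10 quoted above.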
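Successful Search-Open addressing
When an element was inserted as the (i+1)-th element, the load factor was $\alpha_i = i/m$
Expected no of probes to find it is at most $1/(1 - i/m) = m/(m-i)$, the expected no of probes of the unsuccessful search made at insertion
Expected no of probes for successful search is the average over all such i when n slots are occupied
$\frac{1}{n} \sum_{i=0}^{n-1} \frac{1}{1 - i/m} = \frac{m}{n} \sum_{i=0}^{n-1} \frac{1}{m-i} = \frac{1}{\alpha}\,(H_m - H_{m-n})$
$H_i = \sum_{j=1}^{i} 1/j$ is bounded by $\ln i \le H_i \le \ln i + 1$
$\frac{1}{\alpha}\,(H_m - H_{m-n}) \le \frac{1}{\alpha}\,(1 + \ln m - \ln(m-n)) = \frac{1}{\alpha} \ln \frac{1}{1-\alpha} + \frac{1}{\alpha}$
This is about 3.67 probes for $\alpha = 0.9$
No of probes not increasing rapidly as table fills up.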
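As a rough numerical check of the successful-search analysis, the sketch below evaluates the harmonic-number expression (1/alpha)(H_m - H_(m-n)) and the logarithmic bound derived from it. The function names are illustrative, and 0 < n < m is assumed.

```python
import math


def harmonic(i):
    """i-th harmonic number H_i = 1 + 1/2 + ... + 1/i (H_0 = 0)."""
    return sum(1.0 / j for j in range(1, i + 1))


def expected_successful_probes(n, m):
    """(1/alpha) * (H_m - H_{m-n}) for n items in m slots."""
    alpha = n / m
    return (harmonic(m) - harmonic(m - n)) / alpha


def successful_bound(alpha):
    """Upper bound (1/alpha) * ln(1/(1-alpha)) + 1/alpha."""
    return (1.0 / alpha) * math.log(1.0 / (1.0 - alpha)) + 1.0 / alpha


if __name__ == "__main__":
    m = 1000
    for n in (500, 900):
        print(n / m, expected_successful_probes(n, m), successful_bound(n / m))
```

Comparing the two columns for alpha = 0.5 and alpha = 0.9 shows how slowly the cost of a successful search grows as the table fills.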
Data structures for disjoint sets
S = {S1, S2, .., Sk} is a collection of disjoint dynamic sets, each identified by a representative element
MAKE_SET(x) creates a new set whose only member is pointed to by x; since the sets are disjoint, x cannot already belong to any other set.
UNION(x,y) unites the two dynamic sets Sx and Sy containing x and y into a new set U, with a new representative.
FIND_SET(x) returns a pointer to the representative of the set containing x.
Connected Components
CONNECTED_COMPONENTS(G)
for each vertex v ∈ V[G]
do MAKE_SET(v)
for each edge (u,v) ∈ E[G]
do if FIND_SET(u) ≠ FIND_SET(v)
then UNION(u,v)
End

SAME_COMPONENT(u,v)
if FIND_SET(u) == FIND_SET(v)
then return TRUE
else return FALSE
End
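The connected-components procedure can be sketched in Python as below. The dictionary-based set representation (`rep` for representatives, `members` for the member lists) is an assumption made to keep the example short; it uses a naive UNION without any heuristic.

```python
def connected_components(vertices, edges):
    """Return (same_component, members) for an undirected graph.

    rep[v] is the representative of v's set; members[r] lists the set
    whose representative is r. Both structures are illustrative.
    """
    rep = {v: v for v in vertices}            # MAKE_SET for every vertex
    members = {v: [v] for v in vertices}

    def find_set(x):
        return rep[x]

    def union(x, y):
        rx, ry = find_set(x), find_set(y)
        for z in members[ry]:                 # relabel every member of y's set
            rep[z] = rx
        members[rx].extend(members.pop(ry))

    for u, v in edges:
        if find_set(u) != find_set(v):
            union(u, v)

    def same_component(u, v):
        return find_set(u) == find_set(v)

    return same_component, members
```

After the edges have been processed, each remaining entry of `members` is one connected component, and `same_component(u, v)` answers queries with two FIND_SET calls.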
Parameters for analyzing running times of operations
The no of MAKE_SET operations is n
The total no of MAKE_SET, UNION and FIND_SET operations is m
Each UNION operation reduces the no of sets by one, so that after n-1 such operations, only one set remains.
Therefore, the no of UNION operations cannot exceed n-1, and m ≥ n since the MAKE_SET operations are included in m.
In the linked list representation without heuristics, the remaining q = m-n operations can require O(q^2) pointer updates in all, with the i-th UNION operation requiring O(i).
Total time is then O(m^2), i.e. average O(m) per operation
A heuristic that updates the representative pointers of the smaller list to that of the larger list can reduce this complexity.
Linked list representation
With the smaller-into-larger heuristic, an element x is always in the smaller set whenever its representative pointer is updated.
Hence after the first update the resulting set has at least 2 members, after the next at least 4, and so on.
After $\lceil \lg k \rceil$ updates, the resulting set has at least k members.
Hence the pointer of an object is updated at most $\lceil \lg n \rceil$ times. For n objects, this is O(n lg n).
Since there are O(m) MAKE_SET and FIND_SET operations taking O(1) time each, total time for the entire sequence of operations is O(m + n lg n)
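The effect of the smaller-into-larger heuristic can be observed by counting pointer updates. In the sketch below, Python lists stand in for the linked lists and the `updates` counter is added purely for illustration; the class name is hypothetical.

```python
class ListDisjointSets:
    """Disjoint sets with one member list per set and the
    smaller-into-larger union heuristic (illustrative sketch)."""

    def __init__(self):
        self.rep = {}        # element -> representative
        self.members = {}    # representative -> list of members
        self.updates = 0     # number of representative-pointer updates

    def make_set(self, x):
        self.rep[x] = x
        self.members[x] = [x]

    def find_set(self, x):
        return self.rep[x]   # O(1): follow the representative pointer

    def union(self, x, y):
        rx, ry = self.rep[x], self.rep[y]
        if rx == ry:
            return
        # Always relabel the smaller set.
        if len(self.members[rx]) < len(self.members[ry]):
            rx, ry = ry, rx
        for z in self.members[ry]:
            self.rep[z] = rx
            self.updates += 1
        self.members[rx].extend(self.members.pop(ry))
```

Because an element is relabelled only when it sits in the smaller set, the total number of updates for n elements never exceeds n lg n, however the unions are ordered.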
Disjoint set forest
MAKE-SET creates a tree with just one node
FIND-SET chases parent pointers up to the root
UNION causes the root of one tree to point to the root of the other
Heuristics: union by rank, path compression
Rank approximates the logarithm of the subtree size; it is also an upper bound on the height of the node
The root with smaller rank is made to point to the one with larger rank
Path compression makes each node on the find path point directly to the root; ranks are not affected by this
Disjoint Set Forest Algorithms
MAKE-SET(x)
  p[x] = x
  rank[x] = 0

FIND-SET(x)
  if x ≠ p[x]
    p[x] = FIND-SET(p[x])
  return p[x]

UNION(x,y)
  LINK(FIND-SET(x), FIND-SET(y))

LINK(x,y)
  if rank[x] > rank[y]
    p[y] = x
  else
    p[x] = y
    if rank[x] == rank[y]
      rank[y]++
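A direct Python transcription of these four procedures might look as follows. The dictionary storage and the guard for linking a root to itself are assumptions added for the example; the slide's LINK presumes two distinct roots.

```python
class DisjointSetForest:
    """Disjoint set forest with union by rank and path compression."""

    def __init__(self):
        self.p = {}       # parent pointers
        self.rank = {}    # upper bound on the height of each node

    def make_set(self, x):
        self.p[x] = x
        self.rank[x] = 0

    def find_set(self, x):
        if self.p[x] != x:
            # Path compression: point x directly at the root.
            self.p[x] = self.find_set(self.p[x])
        return self.p[x]

    def link(self, x, y):
        if x == y:        # guard added here; not part of the slide's LINK
            return
        if self.rank[x] > self.rank[y]:
            self.p[y] = x
        else:
            self.p[x] = y
            if self.rank[x] == self.rank[y]:
                self.rank[y] += 1

    def union(self, x, y):
        self.link(self.find_set(x), self.find_set(y))
```

The recursive `find_set` mirrors the pseudocode; for very deep trees an iterative two-pass version would avoid Python's recursion limit.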
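Properties of rank
P1- $\mathrm{rank}[x] \le \mathrm{rank}[p[x]]$, with strict inequality when $x \ne p[x]$; hence, the subtree rooted at p[x] is larger.
P2- Let size(x) be the no of nodes in the tree rooted at x; then $\mathrm{size}(x) \ge 2^{\mathrm{rank}[x]}$
P3- For any integer $r \ge 0$, at most $n/2^r$ nodes of rank r exist.
P4- Every node has a rank at most $\lfloor \lg n \rfloor$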
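Property on size of tree rooted at x: $\mathrm{size}(x) \ge 2^{\mathrm{rank}[x]}$
This can be proved by induction on the no of LINK operations.
Before the first LINK on x, this is TRUE since rank[x] = 0 and size(x) = 1.
Let rank, size denote the values before a LINK and rank', size' the values after it.
In operation LINK(x,y) let rank[x] < rank[y].
Node y is the root of the tree formed through LINK and we have
$\mathrm{size}'(y) = \mathrm{size}(x) + \mathrm{size}(y) \ge 2^{\mathrm{rank}[x]} + 2^{\mathrm{rank}[y]} \ge 2^{\mathrm{rank}[y]} = 2^{\mathrm{rank}'[y]}$ [no rank changes]
If rank[x] = rank[y], no rank changes other than that of y, and
$\mathrm{size}'(y) \ge 2^{\mathrm{rank}[x]} + 2^{\mathrm{rank}[y]} = 2^{\mathrm{rank}[y]+1} = 2^{\mathrm{rank}'[y]}$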
Counting nodes within a rank r
When rank r is assigned to a node x, attach a label x to all nodes of the tree rooted at x.
At least $2^r$ nodes are labeled each time. When the root of x's tree changes, the rank of the new root is at least r+1. Hence no new node is labeled x after this.
Each node is therefore labeled at most once. There being n nodes in all, at most n nodes are labeled, with at least $2^r$ labels assigned for each node of rank r.
If there were more than $n/2^r$ nodes of rank r, then more than $(n/2^r) \cdot 2^r$, i.e. more than n, nodes would be labeled by a node of rank r, which is a contradiction.
Maximum rank possible for a node
Let r > lg n; then there are at most $n/2^r < 1$ nodes of rank r.
Since the no of nodes is an integer, there is no node of rank r.
Every node has a rank at most $\lfloor \lg n \rfloor$
Dividing ranks into rank groups
Rank 0, 1 rank group 1; Rank 2 to 2^2-1 (3) rank group 2
Rank 4 to 2^2^2-1 (15) rank group 3
Rank 16 to 2^2^2^2-1 (65535) rank group 4
Rank F(g) to $2^{F(g)} - 1$ rank group g
lg*(n) is the no of times lg has to be taken till n reduces to 1 or less.
This puts rank r into group lg*(r), with r at most $\lfloor \lg n \rfloor$
Highest group no will be lg*(lg n) = lg*(n) - 1.
Then the j-th group has ranks {F(j-1)+1, ..., F(j)}
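The iterated logarithm and the tower function F are easy to compute directly. The sketch below assumes F(0) = 1 and F(j) = 2^F(j-1), and takes lg*(n) as the number of times lg must be applied until the value drops to 1 or below; the function names are illustrative.

```python
import math


def lg_star(n):
    """Number of times lg must be applied to n before the result is <= 1."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count


def tower(j):
    """F(j), assuming F(0) = 1 and F(j) = 2 ** F(j-1)."""
    value = 1
    for _ in range(j):
        value = 2 ** value
    return value


def rank_group(r):
    """Group of rank r under the lg*(r) grouping (ranks 0 and 1 fall in group 0)."""
    return lg_star(r)


if __name__ == "__main__":
    for r in (0, 1, 2, 3, 4, 5, 16, 17, 65536):
        print(r, rank_group(r))
```

Since F grows as a tower of exponents, lg*(n) is at most 5 for every n up to 2^65536, which is why the amortized O(lg* n) bound is effectively constant in practice.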
Time complexity for transitions
Two cost types: within group, and transition to a higher rank group.
For transition cost, there can be at most lg*(n) + 1 transitions in all for each FIND-SET operation.
Once a node has its parent in a different group, it can no longer come back to the previous group, because ranks strictly increase along a find path and path compression only gives a node a parent of higher rank.
For m FIND-SET operations, the total cost of transitions is thus at most m(lg*(n) + 1)
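Cost within group
The no of nodes N(g) with ranks in group g is bounded by
$N(g) \le \sum_{r=F(g-1)+1}^{F(g)} \frac{n}{2^r} = \frac{n}{2^{F(g-1)+1}} \sum_{r=0}^{F(g)-(F(g-1)+1)} \frac{1}{2^r}$
The sum running from r = 0 to F(g) - (F(g-1)+1) can be changed to an infinite series sum for large g.
So, $N(g) < \frac{n}{2^{F(g-1)+1}} \cdot 2 = \frac{n}{2^{F(g-1)}} = \frac{n}{F(g)}$

Hence considering all rank groups, and denoting by P(n) the total cost within groups, P(n) can be obtained by multiplying no-of-nodes by no-of-ranks in each group and summing from g = 0 to lg*(n) - 1:
$P(n) \le \sum_{g=0}^{\lg^* n - 1} \frac{n}{F(g)}\,(F(g) - F(g-1) - 1) \le \sum_{g=0}^{\lg^* n - 1} \frac{n}{F(g)} \cdot F(g) = n \lg^* n$
Thus P(n) ≤ n lg*(n), in addition to the transition cost T(n) = m (lg*(n) + 1)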
Total time complexity
Total cost of the FIND-SET operations is therefore
O(m (lg*(n) + 1) + n lg*(n)) = O(m lg*(n)), since n ≤ m
There are O(n) MAKE-SET and LINK or UNION operations with O(1) time each.
Total time complexity stays at O(m lg*(n))
Time per operation is therefore O(lg*(n)) amortized complexity
MST Lemma
Let G = (V, E) be a weighted connected graph
U is a strict subset of V, the nodes in G
T is a promising subset of the edges in E, i.e. T can be extended to an MST, such that no edge in T leaves U
e is a least cost edge that leaves U
Then the set of edges T' = T ∪ {e} is promising.
MST - Kruskal's Algorithm
J.B. Kruskal. On the shortest spanning subtree of a graph and the traveling salesman problem. Proceedings of the American Mathematical Society, Volume 7, pp. 48-50, 1956.
Complexity is O(e log e) where e is the number of
edges. Can be made even more efficient by a
proper choice of data structures.
At the end of the algorithm, we will be left with a
single component that comprises all the vertices
and this component will be an MST for G.
MST - Kruskal
Let G = (V, E) be the given graph, with |V| = n
{
Start with a graph T = (V, ∅) consisting of only the vertices of G and no edges;
/* This can be viewed as n connected components, each vertex being one connected component */
Arrange E in the order of increasing costs; /* GREEDY */
while (T has fewer than n - 1 edges and E has unexamined edges)
{
Select the next smallest cost edge;
if (the edge connects two different connected components) /* DISJOINT SETS */
add the edge to T;
}
}
Kruskal matroids and disjoint sets
MST-Kruskal (G,w)
A = ∅
FOR each vertex v in V[G]
DO MAKE-SET(v)
Sort the edges of E in order of non-decreasing w
FOR each edge (u,v) in E in sorted order DO
IF FIND-SET(u) ≠ FIND-SET(v) THEN
A = A ∪ {(u,v)}
UNION (u,v)
RETURN A
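MST-Kruskal can be sketched in Python with an inlined disjoint set forest. The edge format `(weight, u, v)` and the function name are assumptions of this example, not something the slides prescribe.

```python
def kruskal(vertices, edges):
    """Return the list of MST edges (u, v, w) of a connected graph.

    edges is an iterable of (weight, u, v) tuples; this format is an
    assumption of the sketch.
    """
    parent = {v: v for v in vertices}     # MAKE-SET for every vertex
    rank = {v: 0 for v in vertices}

    def find_set(x):
        if parent[x] != x:
            parent[x] = find_set(parent[x])   # path compression
        return parent[x]

    def union(x, y):
        rx, ry = find_set(x), find_set(y)
        if rx == ry:
            return
        if rank[rx] > rank[ry]:               # union by rank
            parent[ry] = rx
        else:
            parent[rx] = ry
            if rank[rx] == rank[ry]:
                rank[ry] += 1

    mst = []
    # Examine the edges in order of non-decreasing weight.
    for w, u, v in sorted(edges, key=lambda e: e[0]):
        if find_set(u) != find_set(v):        # edge joins two components
            mst.append((u, v, w))
            union(u, v)
    return mst
```

On a disconnected graph the same loop returns a minimum spanning forest, since edges inside one component are simply skipped.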
Theorem: Kruskal algorithm finds MST
Proof: Let G = (V, E) be a weighted, connected graph. Let T be the
edge set that is grown in Kruskal's algorithm. The proof is by
mathematical induction on the number of edges in T.
We show that if T is promising at any stage of the algorithm,
then it is still promising when a new edge is added to it in
Kruskal's algorithm
When the algorithm terminates, it will happen that T gives a
solution to the problem and hence an MST.
Basis: T = ∅ is promising since a weighted connected graph always
has at least one MST.
Induction Step: Let T be promising just before adding a new
edge e = (u, v). The edges T divide the nodes of G into one or more
connected components. u and v will be in different components. Let
U be the set of nodes in the component that includes u.
Theorem: Kruskal algorithm finds MST
Note that
U is a strict subset of V
T is a promising set of edges such that no edge in T leaves
U (since an edge in T either has both ends in U or has neither
end in U)
e is a least cost edge that leaves U (since Kruskal's
algorithm, being greedy, would have chosen e only after
examining edges shorter than e)
The above three conditions are precisely those of the MST
Lemma and hence we can conclude that T ∪ {e} is also
promising. When the algorithm stops, T gives not merely a
spanning tree but a minimum spanning tree since it is
promising.
Kruskal - illustration
Running Time of Kruskal's Algorithm
Sorting the edges takes O(e log e).
The total time for performing all the merge
and find operations depends on the method used:
O(e log e) without path compression
O(e lg* e) with path compression
MST - Prim's Algorithm
R.C. Prim. Shortest connection networks and some generalizations. Bell System Technical Journal, Volume 36, pp. 1389-1401, 1957.
{ T = ∅;
U = { 1 };
while (U ≠ V)
{
let (u, v) be the lowest cost edge
such that u ∈ U and v ∈ V - U;
T = T ∪ {(u, v)}
U = U ∪ {v}
}
}

For each vertex in V - U, the array lowcost holds the cost of its cheapest edge into U and the array closest holds the endpoint of that edge in U.
At each step, we can scan lowcost to find the vertex in V - U that is closest to U.
Then we update lowcost and closest taking into account the new addition to U.
Complexity: $O(n^2)$
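The array-based version with lowcost and closest can be sketched as follows. The cost-matrix input (with `math.inf` marking a missing edge), the 0-based vertex numbering and the start at vertex 0 instead of vertex 1 are assumptions of this example.

```python
import math


def prim(cost):
    """Return the MST edges (u, v, w) of a connected graph.

    cost is an n x n symmetric matrix with math.inf where there is no
    edge; vertices are numbered 0..n-1 and U starts as {0}.
    """
    n = len(cost)
    in_u = [False] * n
    in_u[0] = True
    lowcost = list(cost[0])     # cheapest known edge from U to each vertex
    closest = [0] * n           # endpoint in U of that cheapest edge
    tree = []
    for _ in range(n - 1):
        # Scan lowcost for the vertex in V - U that is closest to U.
        v = min((j for j in range(n) if not in_u[j]), key=lambda j: lowcost[j])
        if lowcost[v] == math.inf:
            raise ValueError("graph is not connected")
        tree.append((closest[v], v, lowcost[v]))
        in_u[v] = True
        # Update lowcost and closest to account for the new member of U.
        for j in range(n):
            if not in_u[j] and cost[v][j] < lowcost[j]:
                lowcost[j] = cost[v][j]
                closest[j] = v
    return tree
```

Each of the n - 1 iterations performs one O(n) scan and one O(n) update, which gives the quadratic running time stated above.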
Proof of Correctness-Prim's Algorithm
Let G = (V,E) be a weighted, connected graph. Let T be the
edge set that is grown in Prim's algorithm. The proof is by
mathematical induction on the number of edges in T and
using the MST Lemma.
Basis: The empty set is promising since a connected,
weighted graph always has at least one MST.
Induction Step: Assume that T is promising just before the
algorithm adds a new edge e = (u,v). Let U be the set of nodes
grown in Prim's algorithm. Then all three conditions in the
MST Lemma are satisfied and therefore T ∪ {e} is also
promising.
When the algorithm stops, U includes all vertices of the graph
and hence T is a spanning tree. Since T is also promising, it will
be an MST.
Prim's Algorithm - illustration
Difference between Prim and Kruskal

Prim's algorithm: Initialize with a node
Kruskal's algorithm: Initialize with an edge

Prim's algorithm: Graph has to be connected
Kruskal's algorithm: Works on a disconnected graph

Prim's algorithm: Always keep a connected component, look at all edges from the current component to other vertices and find the smallest among them - then add the neighbouring vertex to the component, increasing size by 1.
Kruskal's algorithm: Do not keep one connected component but a forest. At each stage, look at the globally smallest edge that does not create a cycle in the current forest. Such an edge has to necessarily merge two trees in the current forest into one.

Prim's algorithm: In N-1 steps, every vertex would be merged to the current one if we have a connected graph.
Kruskal's algorithm: Since you start with N single-vertex trees, in N-1 steps, they would all merge into one if the graph was connected.

Prim's algorithm: Next edge shall be the cheapest edge in the current vertex.
Kruskal's algorithm: Choose the cheapest edge, but it may not be in the current vertex.

Prim's algorithm: Found to run faster in dense graphs with more number of edges than vertices
Kruskal's algorithm: Found to run faster in sparse graphs.
