Multi-Threaded Cycle Detection in Undirected Graph
IIT2013134
IIT2013180
There are two commonly used approaches for detecting a cycle in an undirected graph:
Union-Find Approach
Depth-First Search Approach
Union-Find Approach
In the union-find approach, edges are scanned one by one and all connected nodes are added to
the same set. If an edge between two nodes that already belong to the same set is encountered,
it confirms a cycle.
Initially, every node is in its own set and all nodes have an equal rank of 1. Next, edges are
scanned one at a time. If neither node has been added to a bigger set so far, we arbitrarily
make one of the two nodes the parent of the other, to denote that they now belong to the same
set. For this, a vector called parent is maintained, along with a height vector.
A node that becomes a root for the first time has a height (or rank) of 2. Whenever a node is
added to the set rooted at root r, the rank of r increases by 1. When an edge between two
nodes belonging to sets rooted at different nodes is encountered, the root with the greater rank
becomes the parent of the other root. In this way, the two sets are merged. In the find-set
operation, we refresh the root recorded for the nodes belonging to a set. The pseudocode is
given below.
for each edge E = V1-V2 in graph do
    // find subsets to which V1 and V2 belong
    p1 = V1.find-Set ()
    p2 = V2.find-Set ()
    // if V1 and V2 belong to same subset, there's a cycle
    if p1 = p2
        print (cycle)
    else
        make-Union (V1, V2)
    end-if
end-loop
SPACE COMPLEXITY: O (V)
TIME COMPLEXITY: O (V)
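The union-find procedure above could be sketched in Rust roughly as follows. This is a minimal illustration, not the report's code: the names `DisjointSet`, `find_set`, `make_union` and `has_cycle_union_find` are hypothetical, the graph is assumed to arrive as an edge list, and the sketch uses the textbook union-by-rank rule (rank starts at 0 and grows only when two equal-rank roots merge) instead of the rank counting described above.

```rust
/// Disjoint-set forest (hypothetical helper type, not from the report).
struct DisjointSet {
    parent: Vec<usize>,
    rank: Vec<u32>,
}

impl DisjointSet {
    /// Every node starts as the root of its own one-element set.
    fn new(n: usize) -> Self {
        DisjointSet { parent: (0..n).collect(), rank: vec![0; n] }
    }

    /// Returns the root of the set containing `v` and re-points every
    /// node on the walked path directly at that root.
    fn find_set(&mut self, v: usize) -> usize {
        let mut root = v;
        while self.parent[root] != root {
            root = self.parent[root];
        }
        let mut cur = v;
        while self.parent[cur] != root {
            let next = self.parent[cur];
            self.parent[cur] = root;
            cur = next;
        }
        root
    }

    /// Merges the sets rooted at `a` and `b`; both arguments must be roots.
    /// The root with the greater rank becomes the parent of the other.
    fn make_union(&mut self, a: usize, b: usize) {
        if self.rank[a] < self.rank[b] {
            self.parent[a] = b;
        } else {
            self.parent[b] = a;
            if self.rank[a] == self.rank[b] {
                self.rank[a] += 1;
            }
        }
    }
}

/// Scans the edges one at a time; an edge whose endpoints already share
/// a root closes a cycle.
fn has_cycle_union_find(n: usize, edges: &[(usize, usize)]) -> bool {
    let mut sets = DisjointSet::new(n);
    for &(v1, v2) in edges {
        let p1 = sets.find_set(v1);
        let p2 = sets.find_set(v2);
        if p1 == p2 {
            return true;
        }
        sets.make_union(p1, p2);
    }
    false
}
```

Passing the two roots to `make_union`, instead of the edge endpoints, avoids repeating the lookups that the loop has already done.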
IPCO-630 Assignment
O (V)
A call to this DFS method performs cycle detection on one connected component
only. If the graph has multiple disjoint sets of connected nodes, then cycle detection has
to be performed on all of them. Hence the DFS method is invoked from the main method
iteratively, once for each unvisited node. This ensures that we check for the presence of a
cycle in all connected components of the undirected graph.
for every node n
    if visited[n] = false
        DFS (n, -1)
    end-if
end-loop
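A serial version of this driver loop, together with the recursive DFS it calls, might look like the sketch below. It is illustrative only: the function names `dfs_serial` and `has_cycle_dfs` are hypothetical, the graph is assumed to be a symmetric adjacency matrix stored as `Vec<Vec<bool>>`, and `Option<usize>` stands in for the `-1` used above to mean "no parent".

```rust
/// Visits `node` and everything reachable from it; returns true as soon
/// as an already-visited neighbour other than the parent is seen.
fn dfs_serial(
    adj: &[Vec<bool>],
    visited: &mut [bool],
    node: usize,
    parent: Option<usize>,
) -> bool {
    visited[node] = true;
    for v in 0..adj.len() {
        // Skip non-neighbours and the edge we arrived by.
        if !adj[node][v] || Some(v) == parent {
            continue;
        }
        if visited[v] || dfs_serial(adj, visited, v, Some(node)) {
            return true;
        }
    }
    false
}

/// Starts a DFS from every node that no earlier DFS has reached, so
/// that each connected component is checked.
fn has_cycle_dfs(adj: &[Vec<bool>]) -> bool {
    let mut visited = vec![false; adj.len()];
    for n in 0..adj.len() {
        if !visited[n] && dfs_serial(adj, &mut visited, n, None) {
            return true;
        }
    }
    false
}
```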
The figure above shows a scenario in which a graph resembling a binary tree is undergoing
cycle detection. Suppose that cycle detection was invoked from the main method at the root node (blue).
The root node invoked cycle detection on its two adjacent nodes in threads T1 and T2 respectively. The
yellow node was marked visited by thread T1 and the pink node by thread T2. The white node, which is
adjacent to the yellow node, is still unvisited. If thread T1 now invokes DFS on the unvisited white
node (by spawning a new thread), then we are good to go. However, the main thread returned to
the main method after invoking the recursive calls to DFS on the yellow and pink nodes in threads T1
and T2 respectively. If it now makes a call to DFS on the white node inside the for loop before
thread T1 can, then there will be a problem. This situation is depicted in the following figure.
At this point, when thread T1 checks the adjacent nodes (excluding the parent blue node marked
as 1), it will see that node 2 (coloured in blue) is already visited. It would therefore conclude
that a cycle has been found when there is none.
The above argument makes it clear that we cannot invoke DFS from the for loop of the main thread in
a new thread. Next we look at another way to parallelize it. The recursive calls made to DFS span
different subtrees. This suggests that these separate tasks can be taken up in different
threads, as shown below.
DFS (node, parent)
    visited [node] = true
    for all vertices v adjacent to node
        if v not equal to parent
            if visited[v] = true
                print (cycle)
            else
                new::thread DFS (v, node)
            end-if
        end-if
    end-loop
This is achieved by calling the join method of each thread spawned inside the loop. With this
modification, the main thread invokes DFS on a node and waits until that DFS invocation is
over. This gives us correctness along with parallelism.
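One way to express the threaded DFS with joining in Rust is sketched below. It rests on several assumptions that are not taken from the report: `std::thread::scope` is used so that every thread spawned inside a call is joined automatically before that call returns (playing the role of the explicit join calls), the visited flags are atomics that are claimed with `swap` before recursing, and the shared state lives in a hypothetical `Shared` struct instead of global variables. The names `dfs_parallel` and `has_cycle_parallel` are likewise illustrative.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering::SeqCst};
use std::thread;

/// Upper limit on the number of threads spawned (10, as in the report).
const NTHREADS: usize = 10;

/// State shared by all threads (hypothetical grouping of the globals).
struct Shared<'a> {
    adj: &'a [Vec<bool>],
    visited: Vec<AtomicBool>,
    cycle_found: AtomicBool,
    cnt: AtomicUsize, // number of threads spawned so far
}

/// `node` must already be marked visited by the caller.
fn dfs_parallel(s: &Shared<'_>, node: usize, parent: Option<usize>) {
    // The scope joins every thread spawned in it before returning.
    thread::scope(|scope| {
        for v in 0..s.adj.len() {
            if !s.adj[node][v] || Some(v) == parent {
                continue;
            }
            if s.cycle_found.load(SeqCst) {
                return; // another thread already found a cycle
            }
            // Claim v atomically; if it was already claimed, a second
            // route to v exists, which means there is a cycle.
            if s.visited[v].swap(true, SeqCst) {
                s.cycle_found.store(true, SeqCst);
                return;
            }
            if s.cnt.fetch_add(1, SeqCst) < NTHREADS {
                scope.spawn(move || dfs_parallel(s, v, Some(node)));
            } else {
                // Thread budget used up: undo the count, recurse inline.
                s.cnt.fetch_sub(1, SeqCst);
                dfs_parallel(s, v, Some(node));
            }
        }
    });
}

fn has_cycle_parallel(adj: &[Vec<bool>]) -> bool {
    let s = Shared {
        adj,
        visited: (0..adj.len()).map(|_| AtomicBool::new(false)).collect(),
        cycle_found: AtomicBool::new(false),
        cnt: AtomicUsize::new(0),
    };
    for root in 0..adj.len() {
        // The main thread never spawns here; it waits for each DFS.
        if !s.visited[root].swap(true, SeqCst) {
            dfs_parallel(&s, root, None);
        }
        if s.cycle_found.load(SeqCst) {
            return true;
        }
    }
    false
}
```

Because the driver loop only resumes after `dfs_parallel` has returned, and the scope has by then joined all of its threads, the main thread cannot start a DFS on a node that a worker thread is about to reach.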
Rust Implementation
In our Rust implementation, we use an adjacency matrix to store graphs. A Boolean vector is used
to hold the visited information of nodes. Overall, we have the following global variables:
Identifier         Type      Usage
NTHREADS           i32       constant value of 10 denoting maximum no of threads to use
cnt                i32       keep count of no of threads spawned so far
n                  i32       actual number of nodes in graph
adj [500] [500]    i32       adjacency matrix to store the graph
cycle_found        boolean   flag variable to indicate cycle has been found
visited [500]      boolean   vector to store visited status of nodes
Speedup Analysis
To test our code against its serial counterpart, we used the following inputs.
A graph having only one big cycle comprising all nodes and no other edges except those
making up the cycle
A graph having a structure of a complete binary tree
A graph having random edges with count equal to 1/4th of the maximum possible number
of edges
A graph having random edges with count equal to 1/2 the maximum no of edges
A graph having random edges with count equal to 3/4th the maximum no of edges
We deliberately skipped the fully-connected graph because cycle detection in such a graph would
involve only 3 iterations at the maximum, and comparing results of serial and parallel codes for such
a low number of iterations wouldn't be appropriate.
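Inputs of the kinds listed above could be generated along the following lines. This is only a sketch under stated assumptions: the report does not say how its input files were produced, so the helper names (`cycle_graph`, `binary_tree`, `random_graph`), the `Vec<Vec<bool>>` adjacency-matrix type, and the small xorshift generator used in place of a random-number library are all hypothetical.

```rust
type Graph = Vec<Vec<bool>>;

/// Adds the undirected edge a-b to the adjacency matrix.
fn add_edge(g: &mut Graph, a: usize, b: usize) {
    g[a][b] = true;
    g[b][a] = true;
}

/// One big cycle through all n nodes and no other edges (n >= 3).
fn cycle_graph(n: usize) -> Graph {
    let mut g = vec![vec![false; n]; n];
    for i in 0..n {
        add_edge(&mut g, i, (i + 1) % n);
    }
    g
}

/// Complete binary tree in heap order: node i hangs under (i - 1) / 2.
fn binary_tree(n: usize) -> Graph {
    let mut g = vec![vec![false; n]; n];
    for child in 1..n {
        add_edge(&mut g, child, (child - 1) / 2);
    }
    g
}

/// Random graph with `fraction` of the n(n-1)/2 possible edges, chosen
/// by a partial Fisher-Yates shuffle of all node pairs.
fn random_graph(n: usize, fraction: f64, seed: u64) -> Graph {
    let mut pairs: Vec<(usize, usize)> = Vec::new();
    for a in 0..n {
        for b in (a + 1)..n {
            pairs.push((a, b));
        }
    }
    // xorshift64 pseudo-random generator; its state must be non-zero.
    let mut state = seed | 1;
    let mut next = move || {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        state
    };
    let wanted = (fraction * pairs.len() as f64).round() as usize;
    let wanted = wanted.min(pairs.len());
    let mut g = vec![vec![false; n]; n];
    for i in 0..wanted {
        // Pick a not-yet-used pair (slight modulo bias is ignored here).
        let j = i + (next() % (pairs.len() - i) as u64) as usize;
        pairs.swap(i, j);
        add_edge(&mut g, pairs[i].0, pairs[i].1);
    }
    g
}
```

For the sparse, medium and dense inputs, `fraction` would be 0.25, 0.5 and 0.75 respectively.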
The graphs had 350 nodes each. The following table lists the average speedup ratio of the code for
different limits (T) on the maximum number of threads.
Input Type       Filename                                    T=2    T=5    T=10
Cycle            inp_undirected_350x350_cycle.txt            0.96   0.82   0.77
Tree             inp_undirected_350x350_tree.txt             1.34   1.43   1.58
Random Sparse    inp_undirected_350x350_random_sparse.txt    1.25   1.38   1.51
Random Medium    inp_undirected_350x350_random_medium.txt    1.19   1.31   1.33
Random Dense     inp_undirected_350x350_random_dense.txt     0.87   0.89   0.54
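The report does not spell out how the ratios in the table were computed. Assuming the usual definition, serial running time divided by parallel running time with each time averaged over several runs, a measurement helper could be sketched as below; `average_time` and `speedup` are hypothetical names.

```rust
use std::time::{Duration, Instant};

/// Runs `f` `runs` times (runs must be at least 1) and returns the
/// mean wall-clock time of one run.
fn average_time<F: FnMut()>(mut f: F, runs: u32) -> Duration {
    let start = Instant::now();
    for _ in 0..runs {
        f();
    }
    start.elapsed() / runs
}

/// Speedup ratio: above 1 means the parallel code was faster,
/// below 1 means it was slower than the serial code.
fn speedup(serial: Duration, parallel: Duration) -> f64 {
    serial.as_secs_f64() / parallel.as_secs_f64()
}
```

Under this definition, the values below 1 in the Cycle and Random Dense rows mean that the threaded code ran slower than the serial code on those inputs.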
Library Integration
In the library, the method that performs cycle detection in an undirected graph goes by the name
cycle_detection and takes two arguments: matrix134180 of type &mut [[i32; 350]; 350] and
n1 of type i32. The matrix stores the adjacency matrix of the graph, while the integer stores the
number of nodes in the graph.
The above method invokes depth-first search through the method called dfs_134180, which takes the
current node and its parent as arguments. This DFS method spawns new threads, and the limit on
the maximum number of threads spawned is set by the global variable called NTHREADS180. Following
is the list of global variables used: