Gpu Working - K Shortest Path Analysis

This document describes a study that implemented Yen's algorithm for finding the k-shortest paths in a graph on a GPU using CUDA. Yen's algorithm finds simple k-shortest paths (without vertex repetition) in polynomial time. The authors developed a parallel GPU version of the algorithm using Nvidia's CUDA programming model. Their implementation achieved a 6x speedup compared to the serial CPU version of the algorithm. The paper presents the GPU and CUDA architecture, describes how the graph is represented, discusses data structures for storing multiple paths, and analyzes the performance of the parallel algorithm on large graphs.


Available online at www.sciencedirect.com

ScienceDirect
Procedia Computer Science 48 (2015) 5 – 13

International Conference on Intelligent Computing, Communication & Convergence (ICCC-2015)
Conference Organized by Interscience Institute of Management and Technology, Bhubaneswar, Odisha, India

Implementation of K-shortest path algorithm in GPU using CUDA

Avadhesh Pratap Singh (a), Dhirendra Pratap Singh (b)
(a) M. Tech Scholar, MANIT Bhopal, MP, India
(b) Assistant Professor, MANIT Bhopal, MP, India

Abstract

The k-shortest path problem is a generalization of the shortest path problem. K-shortest paths are used in various
fields, such as sequence alignment in molecular bioinformatics, robot motion planning and path finding in gene
networks, where the speed of path calculation plays a vital role. Parallel implementation is one of the best ways to
meet the requirements of these applications. A GPU-based parallel algorithm is developed to find the k shortest paths
in a large directed graph with positive edge weights. Repetition of vertices is not allowed in the calculated shortest
paths. The implemented algorithm calculates the k shortest paths between a pair of vertices of a graph with n nodes
and m edges. This approach is based on Yen’s algorithm for finding k shortest loopless paths. We implemented our
algorithms on Nvidia’s GPU using Compute Unified Device Architecture (CUDA). This paper presents a comparative
analysis of CPU-based and GPU-based implementations of Yen’s algorithm. Our approach achieves a 6 times speedup in
comparison to the serial algorithm.
© 2015 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/by-nc-nd/4.0/).
Peer-review under responsibility of scientific committee of International Conference on Computer, Communication
and Convergence (ICCC 2015)

Keywords: Compute Unified Device Architecture (CUDA), Graphical Processing Unit (GPU), Shortest path Algorithm, Parallel Algorithm.

1877-0509 © 2015 The Authors. Published by Elsevier B.V.
doi:10.1016/j.procs.2015.04.103

1. Introduction

Let G = (V, E) be a directed graph, where V is a set of n nodes and E is a set of m arcs. m > n is assumed
throughout the paper to avoid trivial complications. In the graph G each edge is associated with a positive weight
w. The problem of finding the shortest path from p to q is a classical and much-studied problem in graph algorithms,
and various methods have been implemented for it, both sequential and parallel. The most popular algorithm for this
problem is Dijkstra’s algorithm [1]. Parallel implementations of Dijkstra’s algorithm have been given by different
researchers [2-9]. The k-shortest path problem is a generalization of the shortest path problem that calculates k
shortest paths in increasing order of weight. The k-shortest path problem is broadly divided into two categories. The
first type allows paths with repeated vertices. The second type does not allow repetition of nodes in a path. We are
concerned with positive edge weights, for which the shortest path is always without vertex repetition. See figure 1
for an example illustrating the difference between the k-shortest path problem with and without repetition of
vertices. In figure 1 the three shortest simple paths (without repetition of vertices) have lengths 6 (s,a,b,t),
20 (s,c,t) and 21 (s,d,t) respectively, while the paths that may have repetitions of vertices have lengths
6 (s,a,b,t), 8 (s,a,b,a,b,t) and 10 (s,a,b,a,b,a,b,t).

Fig. 1. The difference between paths with and without repetition of vertices
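The distinction illustrated by figure 1 can be reproduced with a small sketch. The edge weights below are assumptions, chosen only so that the path lengths agree with those quoted in the text (the figure itself is not reproduced here), and the function name k_shortest is hypothetical. The sketch is a plain best-first enumeration, not the algorithm studied in this paper: it returns paths in nondecreasing order of length and, when simple is set, refuses to revisit a vertex.

```python
import heapq

# Assumed edge weights: picked to match the lengths 6, 8, 10 and 6, 20, 21
# quoted for figure 1; the real figure may use different individual weights.
GRAPH = {
    "s": [("a", 2), ("c", 10), ("d", 10)],
    "a": [("b", 1)],
    "b": [("a", 1), ("t", 3)],
    "c": [("t", 10)],
    "d": [("t", 11)],
    "t": [],
}

def k_shortest(graph, source, target, k, simple):
    """Best-first enumeration of the k cheapest source-target paths."""
    heap = [(0, [source])]          # (cost so far, node sequence)
    found = []
    while heap and len(found) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:          # complete path: record it, do not extend
            found.append((cost, path))
            continue
        for nxt, w in graph[node]:
            if simple and nxt in path:
                continue            # simple paths may not repeat a vertex
            heapq.heappush(heap, (cost + w, path + [nxt]))
    return found

walks = k_shortest(GRAPH, "s", "t", 3, simple=False)
simple_paths = k_shortest(GRAPH, "s", "t", 3, simple=True)
print([c for c, _ in walks])         # [6, 8, 10]
print([c for c, _ in simple_paths])  # [6, 20, 21]
```

This enumeration can take exponential time on larger graphs, which is why polynomial-time methods such as Yen's algorithm are of interest.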

The k-shortest path problem in which repetition of vertices is allowed seems to be easier. There are various
algorithms [10,11,12,13] for finding this type of path. The latest improvement for k-shortest paths with repetition
of vertices is due to Eppstein [10], whose algorithm achieves the optimal time of O(m + n log n + k). The algorithm
computes an implicit representation of the paths, from which each path can be calculated in O(n) additional time.
There are various algorithms [15,16,17,18] for the k-shortest path problem, but they do not have a stable time
complexity. Finding k shortest paths without repetition of vertices, called the simple k-shortest paths problem, has
proved to be more challenging. The problem was initially examined by Hoffman and Pavley [14], but all early attempts
[15,16,17,18] to solve it lead to exponential time complexity. The first algorithm to find k shortest paths without
vertex repetition in polynomial time is Yen’s algorithm [19,20] (generalized by Lawler [21]). The time complexity of
Yen’s algorithm using a new data structure is O(kn(m + n log n)). In the case of undirected graphs, Katoh et al. [22]
improve Yen’s algorithm to O(k(m + n log n)) time. For simple k-shortest paths in weighted directed graphs, the
worst-case time complexity of Yen’s algorithm is still unbeaten.
There are various implementations of Yen’s algorithm [23,24] with new data structures, but Yen’s worst-case
complexity is still unbeaten for k-shortest path calculation in directed graphs. Kumar and Ghosh [25] designed a CREW
PRAM algorithm for the all-pairs version of the k-shortest path problem. Their algorithm is based on the transitive
closure algorithm for computing all-pairs shortest paths. Ruppert [26] developed a CREW PRAM algorithm for the
k-shortest path problem to a given destination node from every node of an edge-weighted directed graph. Ruppert’s
algorithm is based on Eppstein’s algorithm. Guerriero et al. [27] developed a parallel asynchronous algorithm to
calculate k shortest paths from a single source to all other nodes of a directed graph. They implemented their
algorithm on a nonuniform memory access multiprocessor. Their algorithm is based on the parallelism in the sequential
label-correcting method. The algorithms of Ruppert and Guerriero calculate k shortest paths that may have repetition
of vertices.
The k-shortest path problem is applied in various real-time problems, which are discussed and listed by Eppstein
[10]. Real-time applications such as robot motion planning, network routing, optimization problems such as
length-limited Huffman coding, sequence alignment in molecular bioinformatics, multiple object tracking and multiple
path finding in gene networks [28] need fast calculation of k shortest paths, and a parallel algorithm can provide
the required speedup. Implementing the k-shortest path algorithm in parallel on a GPU using CUDA improves the
processing time for large graphs.

In this paper we propose parallel algorithms to find simple k-shortest paths and present their comparative analysis
with the serial implementation. Both these algorithms are based on Yen’s algorithm.
This paper is organized in seven sections. Section 2 describes GPU and CUDA programming. The graph representation
method is explained in section 3. Section 4 describes the data structure used to represent multiple paths. The serial
algorithm for obtaining k shortest paths and its parallel implementation are discussed in section 5. Section 6
presents the results and performance analysis of the parallel algorithm on various large graphs. Finally, section 7
presents the conclusion.

2. GPU and CUDA overview

The architecture of a GPU best fits the data-parallel approach. GPUs are most suited to algorithms that have high
arithmetic intensity and regular data access patterns [29]. NVIDIA, a leading GPU developer, introduced Compute
Unified Device Architecture (CUDA) to simplify GPU programming by means of a high-level application programming
interface.
GPUs are basically collections of multiple streaming processors. In stream processing, a single instruction is
executed on a stream of data, one element per thread. A CUDA program is composed of two parts: host (CPU) code that
issues kernel calls, and device (GPU) code that actually implements the kernel [30]. The host code is serial code
that runs on the CPU, and the device code runs on the GPU in each thread to maximize GPU thread utilization. From the
programmer’s view, the CUDA programming model is a collection of threads running in parallel.

Fig. 2. CUDA Programming Architecture

An Nvidia GPU has multiple multiprocessors, each of which contains numerous processing elements called cores. Each
multiprocessor can access data from the memory hierarchy provided by the GPU. An Nvidia GPU provides different memory
levels: fast private register memory, global memory, shared memory, constant memory and texture memory. Register
memory is private to each thread. Global memory, constant memory and texture memory are accessible to all threads in
a grid [30]. Shared memory is local to the threads of a block. Constant memory and texture memory are read-only
memories residing in the DRAM of the GPU device.
On the CUDA platform a set of instructions (a kernel) is executed by each thread of the GPU device. Threads are
divided into groups called blocks. A block is a collection of threads that run together on one multiprocessor. A grid
is a collection of multiple blocks, which are distributed across the multiprocessors. The fixed-size group of threads
that a multiprocessor executes together is called a warp. Each thread, block and grid has a unique ID assigned by
CUDA. The unique ID is used to select the data on which a particular thread’s instructions are executed. Figure 2
shows the CUDA programming model. In GPU programming only one common data set is stored in global memory, and it is
the same for all the running blocks under a grid. If we want to work on different data simultaneously, then multiple
GPUs are needed.
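The role of the thread and block IDs can be illustrated with a serial emulation. The sketch below is plain Python rather than CUDA, and the names launch and double_kernel are hypothetical. It only shows how a one-dimensional global index is commonly derived from the block ID, the block size and the thread ID (in CUDA C this is usually written blockIdx.x * blockDim.x + threadIdx.x), and why a bounds check is needed when the number of launched threads exceeds the number of data elements.

```python
def launch(kernel, num_blocks, threads_per_block, *args):
    """Serially emulate a 1-D kernel launch: one call per (block, thread) pair."""
    for block_id in range(num_blocks):
        for thread_id in range(threads_per_block):
            kernel(block_id, threads_per_block, thread_id, *args)

def double_kernel(block_id, block_dim, thread_id, data, out):
    # Global index of this emulated thread within the whole grid.
    i = block_id * block_dim + thread_id
    if i < len(data):               # the last block may have spare threads
        out[i] = 2 * data[i]

data = [3, 1, 4, 1, 5]
out = [0] * len(data)
launch(double_kernel, 2, 4, data, out)   # 8 emulated threads cover 5 elements
print(out)                               # [6, 2, 8, 2, 10]
```

On a real device the two loops in launch do not exist: all (block, thread) pairs are scheduled by the hardware, which is where the parallelism comes from.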

3. Graph Representation

The way a graph is represented plays an important role in the running time of the algorithm. Conventionally, there
are two ways to represent a graph: adjacency matrix and adjacency list. An adjacency matrix wastes a lot of memory in
the case of a sparse graph. The adjacency list is the best way to represent a sparse graph. On a GPU device, CUDA
accesses memory as arrays, so the differing sizes of the edge lists make an adjacency list difficult to use. Harish
et al. [5] and Dhirendra et al. [9] give a modified adjacency list representation. Following Dhirendra et al. [9], we
use three arrays Va, Ea and Ew. The vertices of the graph G(V,E,W) are represented as an array Va. Va stores, for
each vertex, the starting index of its own adjacency list in Ea. Ea is an array of size equal to the number of edges
and stores the vertex numbers to which each vertex is connected: the entries of Ea in the index range from the ith
value of Va up to the (i+1)th value of Va are the vertices connected to the ith vertex, for all i in Va. Ew stores
the weight of the edge corresponding to each entry of Ea.

Fig. 3. Graph Representation

According to Dhirendra et al. [9], at preprocessing time the weights in Ew are stored in sorted order within each
range from the ith value to the (i+1)th value of Va, for all i in Va. This sorted preprocessing is useful in the
calculation of shortest paths. Figure 3 shows the representation of the graph in CUDA. With this representation we
can easily calculate the out-degree of each vertex: the out-degree of node i is equal to the difference between the
(i+1)th and the ith entries of Va. We have added one more array Es of size |E| to store the starting node of each
edge. This graph representation is easily accessible on the GPU and helps to reduce the access time of the running
algorithm. All of these arrays are initialized during preprocessing of the graph.
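The array layout described in this section can be sketched as follows. This is an illustrative Python version of the preprocessing step, not the authors' code: the function names are hypothetical, vertices are assumed to be numbered from 0, ascending order is assumed for the sorted weights, and Va is given one extra trailing entry (an assumption) so that the (i+1)th value also exists for the last vertex.

```python
def build_arrays(num_nodes, edges):
    """Build Va, Ea, Ew, Es from a list of (start, end, weight) edges."""
    # Group edges by start vertex; within a group, sort by weight (assumed ascending).
    ordered = sorted(edges, key=lambda e: (e[0], e[2]))
    Va = [0] * (num_nodes + 1)      # extra last entry closes the final range
    for start, _, _ in ordered:
        Va[start + 1] += 1          # count out-edges per vertex
    for i in range(num_nodes):
        Va[i + 1] += Va[i]          # prefix sum turns counts into start indices
    Ea = [end for _, end, _ in ordered]
    Ew = [w for _, _, w in ordered]
    Es = [start for start, _, _ in ordered]
    return Va, Ea, Ew, Es

def out_degree(Va, i):
    return Va[i + 1] - Va[i]

edges = [(0, 1, 4), (0, 2, 1), (1, 2, 2), (2, 0, 7)]
Va, Ea, Ew, Es = build_arrays(3, edges)
print(Va)                 # [0, 2, 3, 4]
print(Ea)                 # [2, 1, 2, 0]
print(Ew)                 # [1, 4, 2, 7]
print(Es)                 # [0, 0, 1, 2]
print(out_degree(Va, 0))  # 2
```

Because Es and Ea have one entry per edge, a kernel that launches one thread per edge can read both endpoints of its edge directly from the thread index.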

4. Data structures to represent the Path

Let Z be an array that stores the information related to each calculated shortest path. In this paper we use two
arrays, one for the result path set (Zresult) and another one for the rest path set (Zrest). The size of Zresult is
equal to K (the number of shortest paths). If fewer than K paths have been calculated, then some fields of the array
are nil.
Each array element stores the following information about the path P that it represents.
• Pointer to an array Path that stores the shortest path between two nodes. Each element of the Path array stores
two pieces of information: the node and its weight from the source.
• Index value of the parent path from which it is generated (if it is the first path or does not depend on
previously calculated paths, then the value of this field is negative).
• Diversion node index of the parent path from which the new path is calculated (if the source node is the diversion
node, then the value of this field is negative).
• Length of the corresponding path.

This data structure minimizes the number of calls to the shortest path algorithm in comparison to Yen’s algorithm.
The diversion node ensures that there are no duplicate shortest paths in Zrest, because new paths are calculated only
after removing edges in the graph from the diversion node index onwards. The data structure also helps to track the
parent path from which a new path is generated, so there is no need to compare each path for common edges.
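The four fields listed above can be pictured as a small record type. The following Python sketch is only one possible layout: the names PathRecord and make_record are hypothetical, integer weights are assumed, and a Python list stands in for the pointer to the Path array.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class PathRecord:
    path: List[Tuple[int, int]]   # (node, cumulative weight from the source)
    parent: int                   # index of the parent path, negative if none
    diversion: int                # diversion node index in the parent path
    length: int                   # total weight of the path

def make_record(nodes, edge_weights, parent=-1, diversion=-1):
    """Build a record from a node sequence and the weights of its consecutive edges."""
    total = 0
    path = [(nodes[0], 0)]
    for node, w in zip(nodes[1:], edge_weights):
        total += w
        path.append((node, total))
    return PathRecord(path, parent, diversion, total)

K = 3
Zresult: List[Optional[PathRecord]] = [None] * K   # unused slots stay nil
Zresult[0] = make_record([0, 1, 2, 3], [2, 1, 3])
# A second path that leaves the first one at position 1 of its node sequence.
Zresult[1] = make_record([0, 1, 4, 3], [2, 5, 6], parent=0, diversion=1)
print(Zresult[0].path)     # [(0, 0), (1, 2), (2, 3), (3, 6)]
print(Zresult[1].length)   # 13
```

Storing the cumulative weight with each node means the weight of the shared prefix up to the diversion node can be read off directly instead of being recomputed.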

5. Algorithms and their parallel implementation

5.1. Modified Yen’s algorithm using Dijkstra’s algorithm

Dijkstra’s algorithm is basically used for single source shortest path calculation [1]. There are various parallel
implementations [2,4,5,6,7,8,9] of Dijkstra’s algorithm on GPU using CUDA. With some modification, Dijkstra’s
algorithm can be used for calculating the shortest path between two vertices. Yen’s algorithm internally uses
shortest path calculation, for which we use the parallel Dijkstra’s algorithm [9]. Yen’s algorithm is basically used
for calculating k shortest simple paths in a directed weighted graph. Yen’s algorithm can also be used for undirected
graphs, but it then calculates many shortest paths unnecessarily, which makes it more time consuming. Katoh et al.
[22] give an efficient algorithm for undirected graphs.

Algorithm 1: Modified Yen’s algorithm (Graph G (V, E, W), Source node s, Destination node t, K_no)
Create two lists: Zresult for the result path set and Zrest for the rest path set
Begin
[1] Calculate the first shortest path using Dijkstra’s algorithm.
[2] Add the first shortest path to Zresult.
[3] for k = 2 to K_no
[4]   for each edge (vi, vi+1) of the (k-1)th path (s, v1, v2, v3, ..., vn) of Zresult
[5]     Remove from the graph the edge (vi, vi+1) of every path in Zresult that has the same sub-path from s to vi
[6]     Apply the shortest path algorithm from vi to t
[7]     Combine the path from s to vi with the new path from vi to t
[8]     Store the calculated path in Zrest
[9]   End of for
[10]  Take the minimum path from Zrest and store it in Zresult.
[11] End of for
End

Yen’s algorithm first calculates the shortest path between the given pair of vertices and stores it in Zresult.
Then, for each next shortest path, it takes the previously calculated path (s, v1, v2, v3, ..., vn) from Zresult and,
for each edge (vi, vi+1), checks all paths of the result path set; if some path shares the same sub-path from s to
vi, the edge following vi on that path is removed from the graph. It then calculates the shortest path between vi and
t. After the shortest path calculation, the root path (s to vi) and the spur path (vi to t) are combined and stored
in Zrest. After calculating all paths corresponding to the previously calculated path, the minimum path is taken from
the rest path list and stored in Zresult. This whole process is repeated until the size of Zresult is equal to k. In
the modified Yen’s algorithm we use Dijkstra’s algorithm for the internal shortest path calculation between a given
pair of vertices. The modified algorithm uses a new data structure so that the graph can easily be modified in order
to calculate a new shortest path. The data structure also helps to identify the common edges between the paths of
Zresult by means of the parent path field, and it reduces memory use. The modified Yen’s algorithm is defined in
Algorithm 1.
Various modifications and improvements of the serial Yen’s algorithm have been made previously [21,23,24], both in
the data structure and in the calculation of new paths. We also use a new data structure to store the path
information.
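For orientation, the root path and spur path mechanism can be written as a compact serial sketch. This is the textbook form of Yen's algorithm in Python, not the modified version of Algorithm 1: it compares root paths explicitly instead of using the parent path and diversion node fields, it additionally excludes the root path vertices from the spur search to keep paths loopless, and it assumes a graph without parallel edges. All function names and the example edge weights are assumptions made for illustration.

```python
import heapq

def dijkstra(graph, source, target, banned_edges, banned_nodes):
    """Cheapest source-target path as (cost, nodes), or None if unreachable."""
    dist, prev, done = {source: 0}, {}, set()
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:                       # rebuild the path backwards
            path = [u]
            while path[-1] != source:
                path.append(prev[path[-1]])
            return d, path[::-1]
        for v, w in graph[u]:
            if v in banned_nodes or (u, v) in banned_edges:
                continue
            if d + w < dist.get(v, float("inf")):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    return None

def edge_weight(graph, u, v):
    return next(w for x, w in graph[u] if x == v)

def yen(graph, source, target, k):
    first = dijkstra(graph, source, target, set(), set())
    if first is None:
        return []
    result, rest = [first], []                # roles of Zresult and Zrest
    while len(result) < k:
        _, last = result[-1]
        for i in range(len(last) - 1):
            root = last[:i + 1]               # root path s .. vi
            # Remove the edge leaving vi on every accepted path with the same root.
            banned_edges = {(p[i], p[i + 1]) for _, p in result
                            if len(p) > i + 1 and p[:i + 1] == root}
            found = dijkstra(graph, last[i], target, banned_edges, set(root[:-1]))
            if found is None:
                continue
            spur_cost, spur = found
            root_cost = sum(edge_weight(graph, a, b) for a, b in zip(root, root[1:]))
            cand = (root_cost + spur_cost, root[:-1] + spur)
            if cand not in rest:
                heapq.heappush(rest, cand)
        if not rest:
            break                             # no more paths exist
        result.append(heapq.heappop(rest))    # minimum of the rest set
    return result

# Assumed weights that reproduce the simple path lengths 6, 20, 21 quoted for figure 1.
GRAPH = {"s": [("a", 2), ("c", 10), ("d", 10)], "a": [("b", 1)],
         "b": [("a", 1), ("t", 3)], "c": [("t", 10)], "d": [("t", 11)], "t": []}
paths = yen(GRAPH, "s", "t", 3)
print(paths)  # [(6, ['s', 'a', 'b', 't']), (20, ['s', 'c', 't']), (21, ['s', 'd', 't'])]
```

The membership test on the rest set is the kind of path comparison that the parent path and diversion node fields of the previous section are intended to avoid.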

5.2. Implementation in CUDA

The implementation of Yen’s algorithm in CUDA uses the parallel Dijkstra’s algorithm. The simple SSSP Dijkstra’s
algorithm [9] calculates the path weight from the source to each node. In this paper we also construct the shortest
path tree from source to destination by making some minor modifications to [9]. Algorithm 2 defines the modified
version of the parallel Dijkstra’s algorithm, which also calculates the responsible edge for each node weight of the
path.

Algorithm 2: Dijkastra_SSSP (Node_weight, Mask, res_node, Edge_start, Edge, Weight, S_node, D_node)
Begin
[1] while (thre < infinite and Mask[D_node] != 1) do
[2]   thre = infinite
[3]   CAL_THRES(Node, Node_weight, Edge, Weight, Mask, thre, infinite) for all nodes of the graph in parallel
[4]   REL_NODE(Node, Node_weight, Edge, Weight, Mask, thre) for all nodes of the graph in parallel
[5] End while
[6] FIND_RESPONSIBLE(Edge_start, res_node, Edge, Weight, Node_weight) for all edges of the graph in parallel
End

The process of the Dijkastra_SSSP algorithm is the same as in [9], but we have modified the loop condition and added
a new step to calculate the responsible edge. The loop condition in step 1 now also verifies whether the destination
node has been relaxed. If the destination node has been relaxed, the loop stops and execution moves to step 6. This
algorithm has four kernel calls. The INITIATE, CAL_THRES and REL_NODE kernels have the same functionality as
INITIALIZATION, THRESHOLD and RELAX in [9], respectively. The fourth kernel is FIND_RESPONSIBLE, which is defined in
Algorithm 3.

Algorithm 3: FIND_RESPONSIBLE (Edge_start, res_node, edge, weight, Node_weight)

Begin
[1] id = getThreadID
[2] if (Node_weight[Edge_start[id]] + weight[id] == Node_weight[edge[id]])
[3]   then res_node[edge[id]] = Edge_start[id]
[4] End if
End

In the FIND_RESPONSIBLE kernel, |E| threads are launched, one per edge, to find the responsible edge for each node
weight. The responsible edge of each node i of the path is found by checking all incoming edges of i: if the weight
of an incoming edge plus the node weight of its start node is equal to the node weight of i, then that edge is
responsible for the weight of node i.
In the implementation of Yen’s algorithm we use Dijkastra_SSSP for the calculation of the shortest path. Steps 1 and
6 of Yen’s algorithm use the parallel shortest path calculation between two nodes. Algorithm 4 defines the parallel
implementation of Yen’s algorithm. Edge_start, Edge and Weight are initialized during preprocessing of the graph.
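The per-edge test of Algorithm 3 can be emulated serially as shown below. In this Python sketch the loop over edge indices plays the role of the |E| threads, the guard against unreached nodes and the helper rebuild_path are additions made for illustration (assumptions, not part of Algorithm 3), and the node weights are taken as already computed by the shortest path step.

```python
INF = float("inf")

def find_responsible(edge_start, edge, weight, node_weight, res_node):
    """Emulate FIND_RESPONSIBLE: each loop iteration is the work of one thread."""
    for tid in range(len(edge)):
        u, v = edge_start[tid], edge[tid]
        # Guard (assumption): with a floating-point infinity, INF + w == INF would
        # otherwise match edges between unreached nodes.
        if node_weight[u] != INF and node_weight[u] + weight[tid] == node_weight[v]:
            res_node[v] = u          # edge (u, v) is responsible for v's weight

def rebuild_path(res_node, source, dest):
    """Follow responsible edges backwards from dest to source."""
    path = [dest]
    while path[-1] != source:
        path.append(res_node[path[-1]])
    return path[::-1]

# Edges 0->1 (1), 0->2 (5), 1->2 (2), 2->3 (1), stored per edge as in section 3.
edge_start = [0, 0, 1, 2]
edge = [1, 2, 2, 3]
weight = [1, 5, 2, 1]
node_weight = [0, 1, 3, 4]           # shortest distances from node 0
res_node = [-1] * 4
find_responsible(edge_start, edge, weight, node_weight, res_node)
print(res_node)                      # [-1, 0, 1, 2]
print(rebuild_path(res_node, 0, 3))  # [0, 1, 2, 3]
```

When several incoming edges satisfy the test for the same node, any of them yields a valid shortest path, so concurrent writes to the same res_node entry do not affect correctness, only which of the equally short paths is reported.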

Algorithm 4: Yen_parallel (Graph G (V, E, W), S_node, D_node, K)

Create an array Node_weight of size |V|, a Boolean array Mask of size |V|, an array res_node of size |V|, a variable
infinite with a very large number assigned to it and a variable thre to store the threshold value. Create two arrays
Zresult and Zrest. Zresult has size K and is used to store the resulting k shortest paths.

Begin
[1] INITIATE(Node_weight, Mask, S_node) for all nodes of the graph in parallel
[2] thre = 0
[3] Dijkastra_SSSP (Node_weight, Mask, res_node, Edge_start, Edge, Weight, S_node, D_node)
[4] Add the first shortest path to Zresult at the first index and initialize its diversion node and parent node values to 0 and -1.
[5] for k = 2 to K
[6]   for each edge from m = diversion node to D_node of the (k-1)th path of Zresult
[7]     n = Zresult[k-1].
[8]     while n = parent path is non-negative and m == Zresult[diversion node].
[9]       remove edge (vm, vm+1) from graph.
[10]      n = Zresult[n].
[11]    End of while
[12]    Dijkastra_SSSP (Node_weight, Mask, res_node, Edge_start, Edge, Weight, m, D_node)
[13]    Store the calculated path in Zrest.
[14]  End of for
[15]  find the minimum path weight value in Zrest, store it in path_w and store its index in min.
[16]  if (path_w < infinite)
[17]    Zresult[k] = Zrest[min] and set Zrest to NULL.
[18]  else
[19]    exit (there is no more path)
[20] End of for
End

This parallel approach is limited by the GPU model, in which only one common data set can be held in global memory
for the kernels running under a single grid. Because of this limitation of the GPU we can calculate only a single
shortest path at a time.

6. Performance Analysis

The performance analysis of the modified Yen’s algorithm is done on various graphs available in the Stanford graph
library [31]. Web graphs, computer network graphs, citation graphs and road network graphs are available in this
library. These graphs are verified and tested on various parameters. We have used directed graphs with randomly
assigned edge weights. The weights are assigned at preprocessing time. We have used graphs with 10k to 65k nodes and
20k to 1.5 million edges. These graphs are considered sparse because their degree is very low.

6.1. Experimental setup

To evaluate the performance we used the single setup described in Table 1.

Table 1. Experimental Setup
Specification         Version / Detail
CUDA Version          5.0
Nvidia GPU            Tesla C2075
Compute Capability    2.1
Cores                 448
Multiprocessor        14
CPU Processor         2 x CPU Intel HEX(6), 2.8 GHz
RAM                   24GB
GPU Memory            4GB
OS                    Windows 7
Visual Studio         Professional 2010

6.2. Results

In this section we show the results of the parallel implementation of Yen’s algorithm. The parallel implementation
is compared with the serial implementation of Yen’s algorithm with the new data structure developed by Ernesto et al.
[24]. The time to calculate k paths between two specified nodes depends on the number of edges in the (k-1)th path.
This number also affects the number of paths stored in the rest path set. We implemented the shortest path
calculation in parallel, which reduces the time of each shortest path calculation. The selection of the shortest path
from the rest path list is also implemented in parallel. In a dense graph more parallelization is achieved because we
can calculate shortest paths quickly.
The results shown in figure 4 have the number of nodes in the graph on the x-axis and the time in seconds on the
y-axis. We have analyzed the algorithm for different values of K.

Fig. 4. Yen's algorithm timing graph(serial vs. parallel) (a) k=100 (b) k=200 (c) k=300

Figures 4(a), 4(b) and 4(c) show the results of the serial Yen’s algorithm (serial_yen) and the parallel
implementation of Yen’s algorithm (parallel_yen) on the specified setup. The results show the comparative analysis
for a graph with 62k nodes and 1.4 million edges. For k=100, parallel_yen shows a 6.5 times speedup on this graph.
The average degree of the directed graphs used is 3 to 5. As we increase the value of K, the time is almost similar
for all values of K. The graph with 22k nodes takes less time because its average degree is 6 to 7. So we can say
that if the density of the graph increases, then the time to calculate K paths is reduced.

7. Conclusion

In this paper we have designed and implemented Yen’s algorithm in an efficient way using the parallel Dijkstra’s
algorithm. The k-shortest path algorithm is implemented on a GPU for the first time. We have used a new data
structure that is well suited to the GPU and also reduces the running time of the algorithm. By using the new data
structure, the number of internal shortest path calls is reduced, and we can easily identify how the graph is
temporarily modified to get a new shortest path from the previous one. Finally we obtain a shortest path tree with
the corresponding path weights. We have tested our algorithm on different graphs with various values of K. We obtain
a 6x speedup in comparison to the serial implementation of Yen’s algorithm.
For further performance gain, this algorithm can be implemented on multiple GPUs at a time, so that multiple
modified graphs can be assigned to different GPUs and multiple shortest paths can be found simultaneously. Using
multiple GPUs, more parallelization can be gained. We can also use the GPU memory hierarchy to reduce memory access
time.

References

1. Dijkstra E. A note on two problems in connection with graphs. Numerical Mathematics. 1959;1:395–412.
2. Papaefthymiou M, Rodrigue J. Implementing parallel shortest paths algorithms. DIMACS Series in Discrete Mathematics and Theoretical
Computer Science, 1994; pp. 59-68.
3. Fetterer A, Shekhar S. A performance analysis of hierarchical shortest path algorithms, Ninth IEEE International Conference on Tools with
Artificial Intelligence, IEEE, 1997.
4. Crobak JR, Berry JW, Madduri K, Bader DA. Advanced shortest paths algorithms on a massively-multithreaded architecture, Parallel
and Distributed Processing Symposium, IEEE, 2007.
5. Harish P, Narayanan PJ. Accelerating large graph algorithms on the GPU using CUDA, in High Performance Computing – HiPC 2007,
Aluru S, Parashar M. et al. (Eds.), Springer Berlin Heidelberg 2007; pp. 197-208.
6. Tang Y, Zhang Y, Chen H. A parallel shortest path algorithm based on graph-partitioning and iterative correcting, 10th IEEE International
Conference on High Performance Computing and Communications, IEEE, 2008.
7. Martín PJ, Torres R, Gavilanes A. CUDA Solutions for the SSSP Problem, in Computational Science – ICCS 2009, G. Allen et al. (Eds.),
Springer-Verlag Berlin, Heidelberg, 2009; pp. 904–913.
8. Kumar S, Misra A, Tomar RS. A Modified Parallel Approach to Single Source Shortest Path Problem for Massively Dense Graphs Using
CUDA, Int. Conf. on Computer & Comm. Tech. (ICCCT), IEEE, 2011.
9. Singh DP, Khare N. Parallel Implementation of the Single Source Shortest Path Algorithm on CPU–GPU Based Hybrid System,
International Journal of Computer Science and Information Security, September 2013; Vol. 11, No. 9.
10. Eppstein D. Finding the k shortest paths. SIAM Journal on Computing 1998;28:652–673.
11. Fox BL. k-th shortest paths and applications to the probabilistic networks. In ORSA/TIMS National Mtg, Bull. Operations Research Soc. of
America, 1975; volume 23, page B263.
12. Martins EQV. An algorithm for ranking paths that may contain cycles. European J. Operational Research, 1984;18:123-130.
13. Azevedo JA. An algorithm for the ranking of shortest paths. European J. Operational Research, 1993;69:97-106.
14. Hoffman R, Pavley RR. A method for the solution of the nth best path problem. Journal of the Association for Computing Machinery
1959;6:506–515.
15. Clarke S, Krikorian A, Rausan J. Computing the N Best Loopless Paths in a Network, J. of SIAM, December 1963; Vol. 11, No. 4, pp. 1096-1102.
16. Bock F, Kantner H, Haynes J. An Algorithm (The r-th Best Path Algorithm) for Finding and Ranking Paths Through a Network, Research
Report, Armour Research Foundation, Chicago, Illinois, November 15, 1957.
17. Pollack M. The kth Best Route Through a Network, Opns. Res., 1961; Vol. 9, No. 4, pp. 578.
18. Sakarovitch M. The k Shortest Routes and the k Shortest Chains in a Graph, Opns. Res. Center, University of California, Berkeley, Report
ORC-32, October 1966.
19. Yen JY. Finding the K shortest loopless paths in a network, Mgmt Sci 17, 1971; 712–716.
20. Yen JY. Another algorithm for finding the K shortest loopless network paths, 41st Mtg Oper Res Soc of Am, Bull Oper Res Soc of Am 20,
1972; B/185.
21. Lawler EL. A procedure for computing the k best solutions to discrete optimization problems and its application to the shortest path
problem. Management Science, 1972;18:401–405.
22. Katoh N, Ibaraki T, Mine H. An efficient algorithm for k shortest simple paths. Networks, 1982;12:411–427.
23. Hadjiconstantinou E, Christofides N. An efficient implementation of an algorithm for finding k shortest simple paths. Networks,
1999;34(2):88–101.
24. Ernesto QVM, Marta MBP. A new Implementation of Yen’s ranking loopless paths algorithm, 2002.
25. Kumar N, Ghosh RK. Parallel algorithm for finding first k shortest paths. Computer Science and Informatics: Journal of the Computer
Society of India, 1994;24(3):21–28.
26. Ruppert E. Finding the k shortest path in parallel, Algorithmica, 2000;28:242-54.
27. Guerriero E, Musmanno R. Parallel Asynchronous Algorithms for the K Shortest Paths Problem, Journal of Optimization Theory and
Applications, January 2000; vol 104, No. 1, pp 91-108.
28. Shih YK, Parthasarathy S. A single source k-shortest paths algorithm to infer regulatory pathways in a gene network, ISMB 2012; Vol.
28, pages i49–i58.
29. Nickolls J, Buck I, Garland M, Skadron K. Scalable Parallel Programming with CUDA, ACM Queue, 2008; vol. 6, no. 2, pp. 40-53.
30. NVIDIA Corporation. CUDA C programming guide 2013; http://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf.
31. Jure L. Stanford Large Network Dataset Collection, Stanford University, http://snap.stanford.edu/data/.


