Dynamic Shortest Paths Using JavaScript on GPUs
Anurag Ingole, Rupesh Nasre
Abstract—Information on the internet is growing rapidly and its processing
needs high-speed infrastructure, both in hardware and software. JavaScript is
now an integral ingredient of web applications, which perform tasks ranging
from error checking in online forms to processing Google Maps. Due to their
interactive nature, performance of JavaScript applications is critical,
especially while handling huge volumes of evolving data. Therefore,
parallelization of JavaScript code has been pursued in the recent past. In
this work, we target GPU parallelization of dynamic graph algorithms. We
present an implementation that achieves effective parallelization of dynamic
single source shortest path computation. We compare the incremental,
decremental and fully dynamic versions against their static counterpart and
show that, up to about 10% of updates, dynamic processing on GPUs is
beneficial.

Index Terms—GPU, SSSP, JavaScript, WebCL, dynamic, Node.js

I. INTRODUCTION

JavaScript is a dynamic programming language. It is most commonly used as part
of web browsers, whose implementations allow client-side scripts to interact
with the user, control the browser, communicate asynchronously, and alter the
document content that is displayed [1]. It is also used in server-side network
programming with runtime environments such as Node.js [2], in game
development, and in the creation of desktop and mobile applications. With the
rise of single-page web applications and JavaScript-heavy sites, it is
increasingly being used as a compile target for source-to-source compilers
from both dynamic and static languages. JavaScript is predominantly
sequential, and web applications until recently have been unable to utilize
hardware parallelism. Considering the parallelism supported by current
hardware, the web experience can be taken to the next level if JavaScript is
made to run in parallel.

JavaScript is used in several web products that operate on an underlying
graph, such as Google Maps and Facebook. Google Maps processes a network of
junctions and cities to find the shortest paths from one point to another via
a given mode of travel. Facebook creates a social network of friend
connections, and allows retrieving attributes of various graph vertices (such
as the update stream of a friend). In real life, these graphs are dynamic in
nature, that is, vertices and edges keep getting added to or removed from the
underlying graph. We target such dynamic graphs in our work. As a preliminary
study, we investigate the effect of computing dynamic shortest paths in
JavaScript in parallel on a GPU for large real-world graphs. Thus, instead of
recomputing the shortest paths for the modified graph, the goal is to perform
only a small amount of parallel processing to get the modified shortest paths.

In this work, we implement a work-efficient fully dynamic, that is,
incremental and decremental, single source shortest path (SSSP) algorithm in
JavaScript. The incremental algorithm follows the same methodology as that of
the static algorithm, while the decremental processing requires extra care.
This paper presents the ideas implemented on SSSP and BFS, but these ideas are
general enough to be extended to other graph analytics algorithms such as
graph coloring, finding connected components, and computing Page Rank.

II. RELATED WORK

A few implementations for parallelizing JavaScript on multi-core systems and
GPUs exist, such as WebCL, ParallelJS and RiverTrail.

A. JavaScript Parallelization

Parallel.js [3] is a tiny library for multi-core processing in JavaScript. It
was created to take full advantage of the ever-maturing web-workers API. It
uses the web-worker support provided by the native browser to run scripts in
parallel. Web workers help in writing multi-threaded JavaScript code. Hence
different bits of JavaScript code may be running at a particular instant of
time. The level of parallelism achieved by this implementation is limited,
since the scheduling of JavaScript threads on a multi-core processor depends
on the OS and there is a limit on the number of threads for multi-core
processors.
ParallelJS [4] is used for flexible mapping of JavaScript onto heterogeneous
systems that have both CPUs and GPUs. The framework includes a front-end
compiler, a construct library and a runtime system. JavaScript programs
written with high-level constructs are compiled to GPU binary code and
scheduled to GPUs by the runtime. The program can be executed either on the
CPU using the native JavaScript compiler or translated to PTX and executed on
the GPU.

RiverTrail [5] is a JavaScript library and a Firefox add-on that together
provide support for data-parallel programming in JavaScript, targeting
multi-core CPUs and GPUs via OpenCL. The central component of RiverTrail is
the ParallelArray type, which models ordered collections of scalar values.
ParallelArray objects support primitives such as map, reduce, scan, and
combine, which are amenable to parallelism.

The WebCL [6] project exposes OpenCL into JavaScript, allowing parallel
computation on modern GPUs, multi-core CPUs and many-core accelerators. WebCL
supports all the functionality provided by OpenCL. As OpenCL targets a wider
variety of parallel architectures than Nvidia CUDA, we base our implementation
on WebCL.

B. Parallel SSSP

There are many implementations of parallel static graph algorithms on a
variety of architectures, including distributed-memory supercomputers [7],
shared-memory supercomputers [8], and multicore machines [9]. Harish and
Narayanan [10] describe CUDA implementations of graph algorithms such as BFS
and single-source shortest paths computation.

There exists a relatively large body of work on speeding up processing of
evolving graphs [11], [12], [13], [14], [15]. While Chronos [11] introduces a
novel memory layout for evolving graphs to improve cache locality during
serial or parallel graph processing, much of the other work restricts the type
of queries or is designed for a specific algorithm. For instance, Ren et
al. [12] and Kan et al. [13] consider queries that depend upon the graph
structure alone, while Desikan and Srivastava [14] exploit specific properties
of the Page Rank algorithm. None of these works amortize processing costs as
we do.

III. DYNAMIC SSSP COMPUTATION

We work with directed weighted graphs with positive edge weights. We first
explain how dynamic graphs are stored in memory, and then discuss incremental,
decremental and fully dynamic SSSP.

A. Dynamic Graphs and their Representation

Dynamic graphs undergo a series of modifications such as insertion and
deletion of edges and vertices, as well as modifications to vertex and edge
attributes. Since insertion of a vertex may be simulated by adding an edge to
a disconnected vertex, we consider insertion and deletion of edges alone.
Similarly, an edge weight modification can be simulated as an edge deletion
followed by an edge insertion. Thus, we do not reduce the generality of
application.

The graphs are represented in compressed sparse row (CSR) format, where
entries in the edge array are pointed to by the vertices in the vertex array.
An additional weight array of the same size as the edge array is maintained to
store the weights of the corresponding edges. The set of newly inserted edges
is maintained in a new CSR array. For each edge to be deleted from the graph,
its weight in the weight array is increased to MAX. An example graph, its CSR
representation and the dynamic CSR representation are shown in Figure 1.

Figure 1. Graph with dynamic CSR edge representation

The worklist-based parallel SSSP algorithm for static graphs discovers new
minimum paths by propagating shortest paths through the graph. A vertex is
added to the worklist if and only if its distance has reduced in the current
iteration. Continuing this processing until the worklist becomes empty ensures
that the shortest distance is computed for each vertex in the graph. The
information computed by the static version is used as the base distances by
the dynamic version.

B. Incremental SSSP

Incremental processing updates distances when new edges are added. A useful
property of incremental SSSP is that the information is always propagated in
the forward direction (away from the updated edge). Thus,
whenever a new edge u → v is added, the distance of the vertex u never
changes, the distance of v may change, and the distance of any vertex that is
unreachable from v does not change. Node v is added to the worklist if its
current distance (computed from the previous static computation) is larger
than the sum of u's distance and the weight on the new edge u → v. When v is
processed, its neighbors may get added to the worklist, and so on. All the
nodes in the worklist are processed in the next iteration to propagate the
newly found shortest paths. In the OpenCL kernel for incremental SSSP, all the
worklist vertices are processed in parallel. The kernel is launched repeatedly
until no new shortest path can be discovered, i.e., until the worklist becomes
empty.

Example. Consider the graph in Figure 2(a) where two new edges are added:
0 → 1 and 4 → 3. The processing starts with both 0 and 4 in the worklist. The
newly added edge 0 → 1 with weight 2 reduces the current distance of vertex 1,
and 1 is added to the worklist. The distance is now propagated to all the
children of 1, which changes the distance of 2 (from 4 to 3). In the next
iteration, the distances of neighbors 3 and 4 are updated (from 5 to 4 and
from 7 to 6 respectively). At this step, no more nodes get added to the
worklist, and the processing stops. The example also shows that, depending
upon where in the graph the new edge is added, the amount of processing may
differ.

Figure 2. (a) Incremental update (b) Decremental update (c) Incremental +
decremental update

C. Decremental SSSP

Decremental processing for SSSP is non-trivial because deleting a
shortest-path edge requires finding the next shortest path. Further, this
needs to be done potentially for all the vertices reachable from the deleted
edge. This processing becomes complicated because the next shortest path may
lie anywhere in the graph; it need not be restricted to the reachable set of
vertices.

For each edge u → v that needs to be deleted for decremental processing, we
raise its weight to MAXINT. Also, if edge u → v was part of the previous
shortest path, then vertex v is pushed to the worklist with a flag marking it
as a decremental node. If the edge was not part of the former shortest path,
then its weight is set to MAXINT but the vertex is not added to the worklist.
Each vertex in the worklist looks for its next smallest predecessor by going
through all its incoming edges to find the new shortest distance. Therefore,
decremental SSSP requires reverse edges to be maintained. The newly found
shortest distance will never be smaller than the previous distance. Once v's
distance is computed, v propagates the information to all of its successors
which are part of the shortest path. All the children of v which were part of
the shortest paths are pushed into the worklist with the decremental flag. The
distance is iteratively propagated to all the levels.

Example. Consider the graph in Figure 2(b) where edge 2 → 3 is deleted. Since
it is part of the original shortest path, we push vertex 3 to the worklist
with the decremental flag. Vertex 3 goes through all its incoming edges to
find the next shortest path. The distance of vertex 3 is changed to 10. Now
node 3 goes through all its outgoing edges and finds that node 1 had its
shortest path via 3, so it pushes node 1 to the worklist with the decremental
flag. In the next iteration, node 1's distance is updated to 12 from the
earlier 7. Since none of the outgoing edges of node 1 was part of a shortest
path, no vertex is pushed to the worklist, and the algorithm terminates.

D. Fully Dynamic SSSP

Fully dynamic processing involves both the incremental and the decremental
modes simultaneously. At first, for each edge u → v to be added to the graph,
the parent u is pushed into the worklist. Also, for each edge p → q that needs
to be deleted from the graph, vertex q is pushed to the worklist with the
decremental flag if p → q was part of the shortest path. Each vertex q in the
worklist with the decremental flag goes through its incoming edges to compute
the new shortest path. But since the incoming edges in the fully dynamic
setting include incremental edges, the new distance may be smaller than the
previous one. If the distance decreases, then the vertex q is inserted in the
worklist without the decremental flag. If the distance increases, then,
according to the decremental approach, all the successors of q which were part
of the shortest paths are pushed into the worklist with the decremental flag.
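The fully dynamic processing described above can be illustrated with a small
sequential sketch. It is not the WebCL kernel from this work: all names
(makeGraph, relax, staticSSSP, applyUpdates) and the adjacency-list layout are
assumptions made for illustration, and the loop over the worklist only stands
in for one parallel kernel launch. The sketch also swaps in a conservative
variant of the decremental step: rather than walking the affected vertices
flag by flag, it first resets every vertex whose recorded shortest path ran
through a deleted edge, then lets each reset vertex pull a distance over its
incoming (reverse) edges, and finally relaxes forward as in the incremental
case.

```javascript
const INF = Number.MAX_SAFE_INTEGER; // plays the role of MAX / MAXINT in the text

// Forward (out) and reverse (inc) adjacency lists share edge objects, so
// raising e.w to INF "deletes" the edge in both views. Hypothetical layout.
function addEdge(g, u, v, w) {
  const e = { u, v, w };
  g.out[u].push(e);
  g.inc[v].push(e);
}

function makeGraph(n, edgeList) {
  const g = { n, out: [], inc: [] };
  for (let i = 0; i < n; i++) { g.out.push([]); g.inc.push([]); }
  for (const [u, v, w] of edgeList) addEdge(g, u, v, w);
  return g;
}

// Worklist relaxation; each pass over `work` corresponds to one kernel launch.
function relax(g, dist, parent, work) {
  while (work.length > 0) {
    const next = [];
    for (const u of work) {
      if (dist[u] === INF) continue;
      for (const e of g.out[u]) {
        if (e.w !== INF && dist[u] + e.w < dist[e.v]) {
          dist[e.v] = dist[u] + e.w;
          parent[e.v] = u;
          next.push(e.v); // distance reduced, so the vertex joins the worklist
        }
      }
    }
    work = next;
  }
}

function staticSSSP(g, src) {
  const dist = new Array(g.n).fill(INF);
  const parent = new Array(g.n).fill(-1);
  dist[src] = 0;
  relax(g, dist, parent, [src]);
  return { dist, parent };
}

// inserts: [u, v, w] triples; deletes: [u, v] pairs. Updates dist/parent in place.
function applyUpdates(g, dist, parent, inserts, deletes) {
  const stack = [];
  for (const [u, v] of deletes) {
    for (const e of g.out[u]) if (e.v === v) e.w = INF;
    if (parent[v] === u) stack.push(v); // deleted edge was on a shortest path
  }
  // Reset v and every vertex whose recorded shortest path went through v.
  const affected = [];
  while (stack.length > 0) {
    const x = stack.pop();
    if (dist[x] === INF) continue; // already reset
    dist[x] = INF;
    parent[x] = -1;
    affected.push(x);
    for (const e of g.out[x]) if (parent[e.v] === x) stack.push(e.v);
  }
  for (const [u, v, w] of inserts) addEdge(g, u, v, w);
  // Each reset vertex looks for its best predecessor over incoming edges.
  const work = [];
  for (const x of affected) {
    for (const e of g.inc[x]) {
      if (e.w === INF || dist[e.u] === INF) continue;
      if (dist[e.u] + e.w < dist[x]) { dist[x] = dist[e.u] + e.w; parent[x] = e.u; }
    }
    if (dist[x] !== INF) work.push(x);
  }
  // A new edge u -> v seeds v only if it shortens v's current distance.
  for (const [u, v, w] of inserts) {
    if (dist[u] !== INF && dist[u] + w < dist[v]) {
      dist[v] = dist[u] + w; parent[v] = u; work.push(v);
    }
  }
  relax(g, dist, parent, work);
}

// Usage on a made-up 5-vertex graph (not the graph of Figure 2).
const g = makeGraph(5, [[0, 1, 5], [0, 2, 1], [2, 1, 2], [2, 3, 4], [1, 3, 1], [3, 4, 1]]);
const base = staticSSSP(g, 0);
console.log(base.dist); // [0, 3, 1, 4, 5]
applyUpdates(g, base.dist, base.parent, [[0, 3, 2]], [[2, 1]]);
console.log(base.dist); // [0, 5, 1, 2, 3]
```

Resetting the affected subtree up front keeps the sketch easy to check,
because every remaining finite distance is the length of a real path in the
updated graph; the price is that it may touch more vertices than the flagged
propagation described in the text.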
Example. Consider the graph in Figure 2(c) where edge 0 → 1 is newly inserted
and two edges 2 → 1 and 2 → 3 are deleted. Since both the edges 2 → 1 and
2 → 3 were part of the original shortest paths, both the vertices 1 and 3 are
added to the worklist with the decremental flag. Each of the worklist elements
is processed in parallel, hence both the vertices compute their next nearest
predecessor. Vertex 1 finds 0 as the new predecessor because of the newly
added edge 0 → 1. Also, the new distance of vertex 1 is smaller than its
previous distance, hence node 1 gets added to the worklist without the
decremental flag. Vertex 3 chooses vertex 4 as the new predecessor and,
because of the increase in its distance, vertex 3 loops through its outgoing
edges to find any successor which is part of the shortest path. Since no
children are part of the shortest path from vertex 3, it does not add any
vertex to the worklist. Since the incremental edge 0 → 1 is already processed
in the decremental phase, no new distances are computed by the incremental
algorithm.

Table I
INPUT GRAPHS

Graph      #Vertices      #Edges
Flickr       395,980   8,545,307
Rmat20     1,048,576   8,259,994
Rmat5        100,000   1,000,000
P2P           10,876      39,994
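The dynamic CSR storage of Section III-A, on which the examples above rely,
can be sketched in a few lines. This is a minimal illustration under stated
assumptions, not the layout used in the actual implementation: the function
and field names (buildCSR, makeDynamic, insertEdges, deleteEdge,
forEachOutEdge, rowStart, dest, weight) are hypothetical, MAX is taken to be
the largest 32-bit signed integer, and inserted edges are assumed to arrive as
a single batch that is stored in a second CSR structure.

```javascript
const MAX = 0x7fffffff; // assumption: stands in for the text's MAX weight

// Build the vertex array (rowStart), edge array (dest) and weight array
// from a list of [u, v, w] triples. Typed arrays mirror flat GPU buffers.
function buildCSR(n, edges) {
  const rowStart = new Int32Array(n + 1);
  for (const [u] of edges) rowStart[u + 1]++;             // out-degree counts
  for (let i = 0; i < n; i++) rowStart[i + 1] += rowStart[i]; // prefix sum
  const dest = new Int32Array(edges.length);
  const weight = new Int32Array(edges.length);
  const cursor = rowStart.slice(0, n); // next free slot for each vertex
  for (const [u, v, w] of edges) {
    const k = cursor[u]++;
    dest[k] = v;
    weight[k] = w;
  }
  return { rowStart, dest, weight };
}

// Dynamic graph = base CSR plus a second CSR holding only inserted edges.
function makeDynamic(n, edges) {
  return { n, base: buildCSR(n, edges), added: buildCSR(n, []) };
}

// Store one batch of newly inserted edges in the second CSR.
function insertEdges(g, newEdges) {
  g.added = buildCSR(g.n, newEdges);
}

// Deletion keeps the edge slot and raises its weight to MAX.
function deleteEdge(g, u, v) {
  for (const csr of [g.base, g.added]) {
    for (let k = csr.rowStart[u]; k < csr.rowStart[u + 1]; k++) {
      if (csr.dest[k] === v) csr.weight[k] = MAX;
    }
  }
}

// Visit the live outgoing edges of u across both CSR structures.
function forEachOutEdge(g, u, fn) {
  for (const csr of [g.base, g.added]) {
    for (let k = csr.rowStart[u]; k < csr.rowStart[u + 1]; k++) {
      if (csr.weight[k] !== MAX) fn(csr.dest[k], csr.weight[k]);
    }
  }
}

// Usage on a made-up 4-vertex graph.
const csrGraph = makeDynamic(4, [[0, 1, 2], [0, 2, 4], [1, 2, 1], [2, 3, 3]]);
insertEdges(csrGraph, [[0, 3, 9]]);
deleteEdge(csrGraph, 0, 2);
const fromZero = [];
forEachOutEdge(csrGraph, 0, (v, w) => fromZero.push([v, w]));
console.log(fromZero); // [[1, 2], [3, 9]]
```

Keeping deleted edges in place means the base arrays never have to be
compacted or reallocated between updates; traversals simply skip slots whose
weight is MAX.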