Reach for A∗:

Efficient Point-to-Point Shortest Path Algorithms


Andrew V. Goldberg∗ Haim Kaplan† Renato F. Werneck‡

∗ Microsoft Research, 1065 La Avenida, Mountain View, CA 94062. E-mail: [email protected]; URL: http://www.research.microsoft.com/~goldberg/.
† School of Mathematical Sciences, Tel Aviv University, Israel. Part of this work was done while the author was visiting Microsoft Research. E-mail: [email protected].
‡ Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08544. Supported by the Aladdin Project, NSF Grant no. CCR-9626862. Part of this work was done while the author was visiting Microsoft Research. E-mail: [email protected].

Abstract

We study the point-to-point shortest path problem in a setting where preprocessing is allowed. We improve the reach-based approach of Gutman [17] in several ways. In particular, we introduce a bidirectional version of the algorithm that uses implicit lower bounds and we add shortcut arcs to reduce vertex reaches. Our modifications greatly improve both preprocessing and query times. The resulting algorithm is as fast as the best previous method, due to Sanders and Schultes [28]. However, our algorithm is simpler and combines in a natural way with A∗ search, which yields significantly better query times.

1 Introduction

We study the following point-to-point shortest path problem (P2P): given a directed graph G = (V, A) with nonnegative arc lengths and two vertices, the source s and the destination t, find a shortest path from s to t. We are interested in exact shortest paths only. We allow preprocessing, but limit the size of the precomputed data to a (moderate) constant times the input graph size. Preprocessing time is limited by practical considerations. For example, in our motivating application, driving directions on large road networks, quadratic-time algorithms are impractical.

Finding shortest paths is a fundamental problem. The single-source problem with nonnegative arc lengths has been studied most extensively [1, 3, 4, 5, 9, 10, 11, 12, 15, 20, 25, 33, 37]. For this problem, near-optimal algorithms are known both in theory, with near-linear time bounds, and in practice, where running times are within a small constant factor of the breadth-first search time.

The P2P problem with no preprocessing has been addressed, for example, in [19, 27, 31, 38]. While no nontrivial theoretical results are known for the general P2P problem, there has been work on the special case of undirected planar graphs with slightly super-linear preprocessing space. The best bound in this context appears in [8]. Algorithms for approximate shortest paths that use preprocessing have been studied; see e.g. [2, 21, 34]. Previous work on exact algorithms with preprocessing includes those using geometric information [24, 36], hierarchical decomposition [28, 29, 30], the notion of reach [17], and A∗ search combined with landmark distances [13, 16].

In this paper we focus on road networks. However, our algorithms do not use any domain-specific information, such as geographical coordinates, and therefore can be applied to any network. Their efficiency, however, needs to be verified experimentally for each particular application. In addition to road networks, we briefly discuss their performance on grid graphs.

We now discuss the most relevant recent developments in preprocessing-based algorithms for road networks. Such methods have two components: a preprocessing algorithm that computes auxiliary data and a query algorithm that computes an answer for a given s-t pair.

Gutman [17] defines the notion of vertex reach. Informally, the reach of a vertex is a number that is big if the vertex is in the middle of a long shortest path and small otherwise. Gutman shows how to prune an s-t search based on (upper bounds on) vertex reaches and (lower bounds on) vertex distances from s and to t. He uses Euclidean distances for lower bounds, and observes that the idea of reach can be combined with Euclidean-based A∗ search to improve efficiency.

Goldberg and Harrelson [13] (see also [16]) have shown that the performance of A∗ search (without reaches) can be significantly improved if landmark-based lower bounds are used instead of Euclidean bounds. This leads to the ALT (A∗ search, landmarks, and triangle inequality) algorithm for the problem. In [13], it was noted that the ALT method could be combined with reach pruning in a natural way. Not only would the improved lower bounds direct the search better, but they would also make reach pruning more effective.

Sanders and Schultes [28] (see also [29]) have recently introduced an interesting algorithm based on highway hierarchy; we call it the HH algorithm. They describe it for undirected graphs, and briefly discuss how to extend it to directed graphs. However, at the time our experiments were completed and our technical report [14] was published, there was no implementation of the directed version of the highway hierarchy algorithm. Assuming that the directed version of HH is not much slower than the undirected version, HH is the most practical of the previously published P2P algorithms for road networks. It has fast queries, relatively small memory overhead, and reasonable preprocessing complexity. Since the directed case is more general, if an algorithm for directed graphs performs well compared to HH then it follows that this algorithm performs well compared to the current state of the art. We compare our new algorithms to HH in Section 8.3.

The notions of reach and highway hierarchies have different motivations: the former is aimed at pruning the shortest path search, while the latter takes advantage of inherent road hierarchy to restrict the search to a smaller subgraph. However, as we shall see, the two approaches are related. Vertices pruned by reach have low reach values and as a result belong to a low level of the highway hierarchy.

In this paper we study the reach method and its relationship to the HH algorithm. We develop a shortest path algorithm based on improved reach pruning that is competitive with HH. Then we combine it with ALT to make queries even faster.

The first contribution of our work is the introduction of several variants of the reach algorithm, including bidirectional variants that do not need explicit lower bounds. We also introduce the idea of adding shortcut arcs to reduce vertex reaches. A small number of shortcuts (less than n, the number of vertices) drastically speeds up the preprocessing and the query of the reach-based method. The performance of the algorithm that implements these improvements (which we call RE) is similar to that of HH. We then show that the techniques behind RE and ALT can be combined in a natural way, leading to a new algorithm, REAL. On road networks, the time it takes for REAL to answer a query and the number of vertices it scans are much lower than those for RE and HH.

Furthermore, we suggest an interpretation of HH in terms of reach, which explains the similarities between the preprocessing algorithms of Gutman, HH, and RE. It also shows why HH cannot be combined with ALT as naturally as RE can.

In short, our results lead to a better understanding of several recent P2P algorithms, leading to simplification and improvement of the underlying techniques. This, in turn, leads to practical algorithms. For the graph of the road network of North America (which has almost 30 million vertices), finding the fastest route between two random points takes less than 4 milliseconds on a standard workstation, while scanning fewer than 2,000 vertices on average. Local queries are even faster.

Due to the page limit, we omit some details, proofs, and experimental results. A full version of the paper is available as a technical report [14].

2 Preliminaries

The input to the preprocessing stage of a P2P algorithm is a directed graph G = (V, A) with n vertices and m arcs, and nonnegative lengths ℓ(a) for every arc a. The query stage also has as inputs a source s and a sink t. The goal is to find a shortest path from s to t. We denote by dist(v, w) the shortest-path distance from vertex v to vertex w with respect to ℓ. In general, dist(v, w) ≠ dist(w, v).

The labeling method for the shortest path problem [22, 23] finds shortest paths from the source to all vertices in the graph. The method works as follows (see e.g. [32]). It maintains for every vertex v its distance label d(v), parent p(v), and status S(v) ∈ {unreached, labeled, scanned}. Initially d(v) = ∞, p(v) = nil, and S(v) = unreached for every vertex v. The method starts by setting d(s) = 0 and S(s) = labeled. While there are labeled vertices, the method picks a labeled vertex v, relaxes all arcs out of v, and sets S(v) = scanned. To relax an arc (v, w), one checks if d(w) > d(v) + ℓ(v, w) and, if true, sets d(w) = d(v) + ℓ(v, w), p(w) = v, and S(w) = labeled.

If the length function is nonnegative, the labeling method terminates with correct shortest path distances and a shortest path tree. Its efficiency depends on the rule to choose a vertex to scan next. We say that d(v) is exact if it is equal to the distance from s to v. Dijkstra [5] (and independently Dantzig [3]) observed that if ℓ is nonnegative and v is a labeled vertex with the smallest distance label, then d(v) is exact and each vertex is scanned once. We refer to the labeling method with the minimum label selection rule as Dijkstra's algorithm. If ℓ is nonnegative then Dijkstra's algorithm scans vertices in nondecreasing order of distance from s and scans each vertex at most once.

For the P2P case, note that when the algorithm is about to scan the sink t, we know that d(t) is exact and
the s-t path defined by the parent pointers is a shortest path. We can terminate the algorithm at this point. Intuitively, Dijkstra's algorithm searches a ball with s in the center and t on the boundary.

One can also run Dijkstra's algorithm on the reverse graph (the graph with every arc reversed) from the sink. The reverse of the t-s path found is a shortest s-t path in the original graph.

The bidirectional algorithm [3, 7, 26] alternates between running the forward and reverse versions of Dijkstra's algorithm, each maintaining its own set of distance labels. We denote by df(v) the distance label of a vertex v maintained by the forward version of Dijkstra's algorithm, and by dr(v) the distance label of a vertex v maintained by the reverse version. (We will still use d(v) when the direction would not matter or is clear from the context.) During initialization, the forward search scans s and the reverse search scans t. The algorithm also maintains the length of the shortest path seen so far, µ, and the corresponding path. Initially, µ = ∞. When an arc (v, w) is relaxed by the forward search and w has already been scanned by the reverse search, we know the shortest s-v and w-t paths have lengths df(v) and dr(w), respectively. If µ > df(v) + ℓ(v, w) + dr(w), we have found a path shorter than those seen before, so we update µ and its path accordingly. We perform similar updates during the reverse search. The algorithm terminates when the search in one direction selects a vertex already scanned in the other. A better criterion (see [16]) is to stop the algorithm when the sum of the minimum labels of labeled vertices for the forward and reverse searches is at least µ, the length of the shortest path seen so far. Intuitively, the bidirectional algorithm searches two touching balls centered at s and t.

Alternating between scanning a vertex by the forward search and scanning a vertex by the reverse search balances the number of scanned vertices between these searches. One can, however, coordinate the progress of the two searches in any other way and, as long as we stop according to one of the rules mentioned above, correctness is preserved. Balancing the work of the forward and reverse searches is a strategy guaranteed to be within a factor of two of the optimal strategy, which is the one that splits the work between the searches to minimize the total number of scanned vertices. Also note that remembering µ is necessary, since there is no guarantee that the shortest path will go through the vertex at which the algorithm stops.

3 Reach-Based Pruning

The following definition of reach is due to Gutman [17]. Given a path P from s to t and a vertex v on P, the reach of v with respect to P is the minimum of the length of the prefix of P (the subpath from s to v) and the length of the suffix of P (the subpath from v to t). The reach of v, r(v), is the maximum, over all shortest paths P that contain v, of the reach of v with respect to P. (For now, assume that the shortest path between any two vertices is unique; Section 5 discusses this issue in more detail.)

Let r̄(v) be an upper bound on r(v), and let $\underline{\mathrm{dist}}(v, w)$ be a lower bound on dist(v, w). The following fact allows the use of reaches to prune Dijkstra's search:

    Suppose r̄(v) < $\underline{\mathrm{dist}}(s, v)$ and r̄(v) < $\underline{\mathrm{dist}}(v, t)$. Then v is not on the shortest path from s to t, and therefore Dijkstra's algorithm does not need to label or scan v.

Note that this also holds for the bidirectional algorithm.

To compute reaches, it suffices to look at all shortest paths in the graph and apply the definition of reach to each vertex on each path. A more efficient algorithm is as follows. Initialize r(v) = 0 for all vertices v. For each vertex x, grow a complete shortest path tree Tx rooted at x. For every vertex v, determine its reach rx(v) within the tree, given by the minimum between its depth (the distance from the root) and its height (the distance to its farthest descendant). If rx(v) > r(v), update r(v). This algorithm runs in Õ(nm) time, which is still impractical for large graphs. On the largest one we tested, which has around 30 million vertices, this computation would take years on existing workstations.

Note that, if one runs this algorithm from only a few roots, one will obtain valid lower bounds for reaches. Unfortunately, the query algorithm needs good upper bounds to work correctly. Upper bounding algorithms are considerably more complex, as Section 5 will show.

4 Queries Using Upper Bounds on Reaches

In this section, we describe how to make the bidirectional Dijkstra's algorithm more efficient assuming we have upper bounds on the reaches of every vertex. As described in Section 3, to prune the search based on the reach of some vertex v, we need a lower bound on the distance from the source to v and a lower bound on the distance from v to the sink. We show how we can use lower bounds implicit in the search itself to do the pruning, thus obtaining a new algorithm.

During the bidirectional Dijkstra's algorithm, consider the search in the forward direction, and let γ be the smallest distance label of a labeled vertex in the reverse direction (i.e., the topmost label in the reverse heap). If a vertex v has not been scanned in the reverse direction, then γ is a lower bound on the distance from v to the destination t. (The same idea applies to
Figure 1: Bidirectional bound algorithm. Assume v is about to be scanned in the forward direction, has not yet been scanned in the reverse direction, and that the smallest distance label of a vertex not yet scanned in the reverse direction is γ. Then v can be pruned if r̄(v) < df(v) and r̄(v) < γ.
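
The pruning rule in the caption of Figure 1 can be illustrated with a small sketch. This is not the authors' implementation: the function names (`dijkstra`, `exact_reaches`, `reverse_graph`, `top`, `reach_query`) are hypothetical, the two searches simply alternate, the heaps use lazy deletion, and the reach bounds r̄ are obtained by brute force over all shortest paths, which is only feasible on tiny graphs. Graphs are assumed to be dictionaries that map every vertex to a list of (head, length) pairs.

```python
import heapq
from itertools import product

INF = float("inf")

def dijkstra(adj, src):
    """Plain Dijkstra; returns exact distances from src."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue                      # stale heap entry
        for w, length in adj[v]:
            if d + length < dist.get(w, INF):
                dist[w] = d + length
                heapq.heappush(heap, (d + length, w))
    return dist

def exact_reaches(adj):
    """Brute-force reaches over all shortest paths (assumption: tiny graphs)."""
    d = {v: dijkstra(adj, v) for v in adj}
    reach = {v: 0 for v in adj}
    for a, v, b in product(adj, repeat=3):
        dav, dvb, dab = d[a].get(v, INF), d[v].get(b, INF), d[a].get(b, INF)
        if dab < INF and dav + dvb == dab:      # v lies on a shortest a-b path
            reach[v] = max(reach[v], min(dav, dvb))
    return reach

def reverse_graph(adj):
    """Graph with every arc reversed."""
    radj = {v: [] for v in adj}
    for v, arcs in adj.items():
        for w, length in arcs:
            radj[w].append((v, length))
    return radj

def top(heap, dist, scanned):
    """Drop stale entries; return the smallest live label (INF if none)."""
    while heap and (heap[0][1] in scanned or heap[0][0] > dist[heap[0][1]]):
        heapq.heappop(heap)
    return heap[0][0] if heap else INF

def reach_query(adj, radj, reach, s, t):
    """Bidirectional Dijkstra with the bidirectional bound pruning rule."""
    graph = (adj, radj)                   # index 0: forward, 1: reverse
    dist = ({s: 0}, {t: 0})
    heap = ([(0, s)], [(0, t)])
    scanned = (set(), set())
    mu = 0 if s == t else INF             # length of the best s-t path seen
    side = 1
    while True:
        tops = [top(heap[i], dist[i], scanned[i]) for i in (0, 1)]
        if tops[0] + tops[1] >= mu:       # standard stopping criterion
            return mu
        side = 1 - side                   # alternate between the searches
        d, v = heapq.heappop(heap[side])
        gamma = tops[1 - side]            # implicit bound from the other search
        if v not in scanned[1 - side] and reach[v] < d and reach[v] < gamma:
            continue                      # pruned: v is not scanned
        scanned[side].add(v)
        for w, length in graph[side][v]:
            nd = d + length
            if nd < dist[side].get(w, INF):
                dist[side][w] = nd
                heapq.heappush(heap[side], (nd, w))
            if w in dist[1 - side]:       # candidate path through arc (v, w)
                mu = min(mu, nd + dist[1 - side][w])
```

Calling `reach_query(adj, reverse_graph(adj), exact_reaches(adj), s, t)` should return the same value as `dijkstra(adj, s)[t]`; the sketch updates µ whenever the head of a relaxed arc carries a label in the opposite search, which is a slightly more permissive variant of the scanned-vertex test described in Section 2.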

the reverse search: we use the topmost label in the forward heap as a lower bound on the distance from s for unscanned vertices in the reverse direction.) When we are about to scan v we know that df(v) is the distance from the source to v. So we can prune the search at v if all the following conditions hold: (1) v has not been scanned in the reverse direction, (2) r̄(v) < df(v), and (3) r̄(v) < γ. When using these bounds, the stopping criterion is the same as for the standard bidirectional algorithm (without pruning). We call the resulting procedure the bidirectional bound algorithm. See Figure 1.

An alternative is to use the distance label of the vertex itself for pruning. Assume we are about to scan a vertex v in the forward direction (the procedure in the reverse direction is similar). If r̄(v) < df(v), we prune the vertex. Note that if the distance from v to t is at most r̄(v), the vertex will still be scanned in the reverse direction, given the appropriate stopping condition. More precisely, we stop the search in a given direction when either there are no labeled vertices or the minimum distance label of labeled vertices for the corresponding search is at least half the length of the shortest path seen so far. We call this the self-bounding algorithm.

The reason why the self-bounding algorithm can safely ignore the lower bound to the destination is that it leaves to the other search to visit vertices that are closer to it. Note, however, that when scanning an arc (v, w), even if we end up pruning w, we must check if w has been scanned in the opposite direction and, if so, check whether the candidate path using (v, w) is the shortest path seen so far.

The following natural algorithm falls into both of the above categories. The algorithm balances the radius of the forward and reverse search regions by picking the labeled vertex with minimum distance label considering both search directions. Note that the distance label of this vertex is also a lower bound on the distance to the target, as the search in the opposite direction has not selected the vertex yet. We refer to this algorithm as distance-balanced. Note that one could also use explicit lower bounds in combination with the implicit bounds.

We call our implementation of the bidirectional Dijkstra's algorithm with reach-based pruning RE. The query is distance-balanced and uses two optimizations: early pruning and arc sorting. The former avoids labeling unscanned vertices if reach and distance bounds justify this. The latter uses adjacency lists sorted in decreasing order by the reach of the head vertex, which allows some vertices to be early-pruned without explicitly looking at them. The resulting code is simple, with just a few tests added to the implementation of the bidirectional Dijkstra's algorithm.

5 Preprocessing

In this section we present an algorithm for efficiently computing upper bounds on vertex reaches. Our algorithm combines three main ideas, two introduced in [17], and the third implicit in [28].

The first idea is the use of partial trees. Instead of running a full shortest path computation from each vertex, which is expensive, we stop these computations early and use the resulting partial shortest path trees, which contain all shortest paths with length less than a certain threshold. These trees allow us to divide vertices into two sets, those with small reaches and those with large reaches. We obtain upper bounds on the reaches of the former vertices. The second idea is to delete these low-reach vertices from the graph, replacing them by penalties used in the rest of the computation. Then we recursively bound reaches of the remaining vertices. The third idea is to introduce shortcut arcs to reduce the reach of some vertices. This speeds up both the preprocessing (since the graph will shrink faster) and the queries (since more vertices will be pruned).
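
For reference, the full computation that partial trees are meant to avoid (one complete shortest path tree per vertex, with the reach of each vertex in a tree taken as the minimum of its depth and its height, as in Section 3) might look like the sketch below. It assumes unique shortest paths and a graph given as a dictionary mapping every vertex to a list of (head, length) pairs; the names `shortest_path_tree` and `tree_reaches` are hypothetical.

```python
import heapq

def shortest_path_tree(adj, root):
    """Dijkstra from root; returns distances, parents, and the scan order."""
    dist = {root: 0}
    parent = {root: None}
    order, done = [], set()
    heap = [(0, root)]
    while heap:
        d, v = heapq.heappop(heap)
        if v in done:
            continue                      # stale heap entry
        done.add(v)
        order.append(v)
        for w, length in adj[v]:
            if d + length < dist.get(w, float("inf")):
                dist[w] = d + length
                parent[w] = v
                heapq.heappush(heap, (d + length, w))
    return dist, parent, order

def tree_reaches(adj):
    """Exact reaches from one complete shortest path tree per vertex."""
    reach = {v: 0 for v in adj}
    for x in adj:
        dist, parent, order = shortest_path_tree(adj, x)
        height = {v: 0 for v in order}    # distance to farthest descendant
        for v in reversed(order):         # children are scanned after parents
            p = parent[v]
            if p is not None:
                height[p] = max(height[p], height[v] + dist[v] - dist[p])
            # depth of v in the tree is dist[v]
            reach[v] = max(reach[v], min(dist[v], height[v]))
    return reach
```

Each call to `shortest_path_tree` scans every reachable vertex, which is what makes this approach too slow on large inputs and motivates stopping the trees early.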
The preprocessing algorithm works in two phases: during the main phase, partial trees are grown and shortcuts are added; this is followed by the refinement phase, when high-reach vertices are re-evaluated in order to improve their reach bounds.

The main phase uses two subroutines: one adds shortcuts to the graph (shortcut step), and the other runs the partial-trees algorithm and eliminates low-reach vertices (partial-trees step). The main phase starts by applying the shortcut step. Then it proceeds in iterations, each associated with a threshold ε_i (which increases with i, the iteration number). Each iteration applies a partial-trees step followed by the shortcut step. By the end of the i-th iteration, the algorithm eliminates every vertex which it can prove has reach less than ε_i. If there are still vertices left in the graph, we set ε_{i+1} = α ε_i (for some α > 1) and proceed to the next iteration.

Approximate reach algorithms, including ours, need the notion of a canonical path, which is a shortest path with additional properties. In particular, between every pair (s, t) there is a unique canonical path. We implement canonical paths as follows. For each arc a, we generate a length perturbation ℓ′(a). When computing the length of a path, we separately sum lengths and perturbations along the path, and use the perturbation to break ties in path lengths.

Next we briefly discuss the major components of the algorithm. Due to space limitations, we discuss a variant based on vertex reaches. We indeed use vertex reaches for pruning the query, but our best preprocessing algorithm uses arc reaches instead to gain efficiency (see [14] for details). The main ideas behind our arc-based preprocessing are the same as for the vertex-based version that we describe.

5.1 Growing Partial Trees. To gain intuition on the construction and use of partial trees, we consider a graph such that all shortest paths are unique (and therefore canonical) and a parameter ε. We outline an algorithm that partitions vertices into two groups, those with high reach (ε or more) and those with low reach (less than ε). For each vertex x in the graph, the algorithm runs Dijkstra's shortest path algorithm from x with an early termination condition. Let T be the current tentative shortest path tree maintained by the algorithm, and let T′ be the subtree of T induced by the scanned vertices. Note that any path in T′ is a shortest path. The tree construction stops when for every leaf y of T′, one of two conditions holds: (1) y is a leaf of T or (2) the length of the x′-y path in T′ is at least 2ε, where x′ is the vertex adjacent to x on the x-y path in T′.

Let Tx, the partial tree of x, denote T′ at the time the tree construction stops. The algorithm marks all vertices that have reach at least ε with respect to a path in Tx as high-reach vertices.

It is clear that the algorithm will never mark a vertex whose reach is less than ε, since its reach restricted to the partial trees cannot be greater than its actual reach. Therefore, to prove the correctness of the algorithm, it is enough to show that every vertex v with high reach is marked at the end. Consider a minimal canonical path P such that the reach of v with respect to P is high (at least ε). Let x and y be the first and the last vertices of P, respectively. Consider Tx. By uniqueness of shortest paths, either P is a path in Tx, or P contains a subpath of Tx that starts at x and ends at a leaf, z, of Tx. In the former case v is marked. For the latter case, note that z cannot be a leaf of T as z has been scanned and the shortest path P continues past z. The distance from x to v is at least ε and the distance from x′, the successor of x on P, to v is less than ε (otherwise P would not be minimal). By the algorithm, the distance from x′ to z is at least 2ε and therefore the distance from v to z is at least ε. Thus in this case v is also marked.

Note that long arcs pose an efficiency problem for this approach. For example, if x has an arc with length 100ε adjacent to it, the depth of Tx is at least 102ε. Building Tx will be expensive. All partial-tree-based preprocessing algorithms, including ours, deal with this problem by building smaller trees in such cases and potentially classifying some low-reach vertices as having high reach. This results in weaker upper bounds on reaches and potentially slower query times, but correctness is preserved.

Our algorithm builds the smaller trees as follows. Consider a partial shortest path tree Tx rooted at a vertex x, and let v ≠ x be a vertex in this tree. Let f(v) be the vertex adjacent to x on the shortest path from x to v. The inner circle of Tx is the set containing the root x and all vertices v ∈ Tx such that d(v) − ℓ(x, f(v)) ≤ ε. We call vertices in the inner circle inner vertices; all other vertices in Tx are outer vertices. The distance from an outer vertex w to the inner circle is defined in the obvious way, as the length of the path (in Tx) between the closest (to w) inner vertex and w itself. The partial tree stops growing when all labeled vertices are outer vertices and have distance to the inner circle greater than ε.

Our preprocessing runs the partial-trees algorithm in iterations, multiplying the value of ε by a constant α each time it starts a new iteration. Iteration i applies the partial-trees algorithm to a graph Gi = (Vi, Ai). This is the graph induced by all arcs that have not been eliminated yet (considering not only the original arcs,
but also shortcuts added in previous iterations). All vertices in Vi have reach estimates above ε_{i−1} (for i > 1). To compute valid upper bounds for them, the partial-trees algorithm must take into account the vertices that have been deleted. It does so by using the concept of penalties, which implicitly increase the depths and heights of vertices in the partial trees. This ensures the algorithm will compute correct upper bounds.

Next we introduce arc reaches, which are similar to vertex reaches but carry more information and lead to faster preprocessing. They are useful for defining the penalties as well.

5.2 Arc Reaches. Let (v, w) be an arc on the shortest path P between s and t. The reach of this arc with respect to P is the minimum of the length of the prefix of P (the distance between s and w) and the length of the suffix of P (the distance between v and t). Note that the arc belongs to both the prefix and the suffix (a definition that excluded the arc from both would be equivalent). The arc reach of (v, w) with respect to the entire graph, denoted by r(v, w), is the maximum reach of this arc with respect to all shortest paths P containing it.

During the partial-trees algorithm, we actually try to bound arc reaches instead of vertex reaches; the procedure is essentially the same as described before, and arc reaches are more powerful (the reach of an arc may be much smaller than the reaches of its endpoints). Once all arc reaches are bounded, they are converted into vertex reaches: a valid upper bound on the reach of a vertex can be obtained from upper bounds on the reaches of all incident arcs.

Penalties are computed as follows. The in-penalty of a vertex v ∈ Vi is defined as

$$\text{in-penalty}(v) = \max_{(u,v) \in A^+ :\, (u,v) \notin A_i} \{\bar{r}(u, v)\},$$

if v has at least one eliminated incoming arc, and zero otherwise. In this expression, A+ is the set of original arcs augmented by the shortcuts added up to iteration i. The out-penalty of v is defined similarly, considering outgoing arcs instead of incoming arcs:

$$\text{out-penalty}(v) = \max_{(v,w) \in A^+ :\, (v,w) \notin A_i} \{\bar{r}(v, w)\}.$$

If there is no outgoing arc, the out-penalty is zero.

The partial-trees algorithm works as described above, but increases the lengths of path suffixes and prefixes by out- and in-penalties, respectively, for the purpose of reach computation.

5.3 Shortcut Step. We call a vertex v bypassable if it has exactly two neighbors (u and w) and one of the following conditions holds: (1) v has exactly one incoming arc, (u, v), and one outgoing arc, (v, w); or (2) v has exactly two outgoing arcs, (v, u) and (v, w), and exactly two incoming arcs, (u, v) and (w, v). In the first case, we say v is a candidate for a one-way bypass; in the second, v is a candidate for a two-way bypass. Shortcuts are used to go around bypassable vertices.

A line is a path in the graph containing at least three vertices such that all vertices, except the first and the last, are bypassable. Every bypassable vertex belongs to exactly one line, which can either be one-way or two-way. Once a line is identified, we may bypass it. The simplest approach would be to do it in a single step: if its first vertex is u and the last one is w, we can simply add a shortcut (u, w) (and (w, u), in case it is a two-way line). The length and the perturbation associated with the shortcut are the sums of the corresponding values of the arcs it bypasses. We break the tie thus created by making the shortcut preferred (i.e., implicitly shorter). If v is a bypassed vertex, any shortest path that passes through u and w will no longer contain v. This potentially reduces the reach of v. If the line has more than two arcs, we actually add "sub-lines" as well: we recursively process the left half, then the right half, and finally bypass the entire line. This reduces reaches even further, as the example in Figure 2 shows.

Once a vertex is bypassed, we immediately delete it from the graph to speed up the reach computation. As long as the appropriate penalties are assigned to its neighbors, the computation will still find valid upper bounds on all reaches.

One issue with the addition of shortcuts is that they may be very long, which can hurt the performance of the partial-trees algorithm in future iterations. To avoid this, we limit the length of shortcuts that may be added in iteration i to at most ε_{i+1}/2.

5.4 The Refinement Phase. The fact that penalties are used to help compute valid upper bounds tends to make the upper bounds less tight (in absolute terms) as the algorithm progresses, since penalties become higher. Therefore, additive errors tend to be larger for vertices that remain in the graph after several iterations. Since they have high reach, they are visited by more queries than other vertices. If we could make these reaches more precise, the query would be able to prune more vertices. This is the goal of the refinement phase of our algorithm: it recomputes the reach estimates of the δ vertices with highest (upper bounds on) reaches found during the main step, where δ is a user-defined parameter (we used δ = ⌈10√n⌉).

Let Vδ be this set of high-reach vertices of G. To
40

22 18

t 20 t 10 t 12 t 7 t 11 t 18 t
s u x v y w t

Figure 2: In this graph, (s, u), (u, x), (x, v), (v, y), (y, w), and (w, t) are the original edges (for simplicity, the graph is
undirected). Without shortcuts, their reaches are r(s) = 0, r(u) = 20, r(x) = 30, r(v) = 36, r(y) = 29, r(w) = 18, and
r(t) = 0. If we add just shortcut (u, w), the reaches of three vertices are reduced: r(x) = 19, r(v) = 12, and r(y) = 19. If
we also add shortcuts (u, v) and (v, w), the reaches of x and y are reduced even further, to r(x) = r(y) = 0.
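
The reach values quoted in the caption of Figure 2 can be checked with a brute-force computation. The following is a minimal sketch, not the preprocessing algorithm of Section 5: it assumes the graph is a simple path, that every shortcut has the same length as the subpath it bypasses, and that ties are broken in favor of the shortcut. The function name `path_reaches` and the index-pair encoding of shortcuts are hypothetical.

```python
from itertools import accumulate

def path_reaches(names, lengths, shortcuts=()):
    """Brute-force reaches on a path graph, optionally with shortcuts.

    shortcuts holds (i, j) index pairs with i < j. Assumption: a shortcut is
    as long as the subpath it bypasses and wins ties against that subpath.
    """
    pos = [0] + list(accumulate(lengths))  # distance from names[0] along the path
    n = len(names)
    reach = {}
    for k in range(n):
        best = 0
        for a in range(k + 1):        # shortest path starts at or before k
            for b in range(k, n):     # and ends at or after k
                # k drops off the a-b shortest path if a shortcut inside
                # [a, b] jumps over it.
                bypassed = any(a <= i < k < j <= b for i, j in shortcuts)
                if not bypassed:
                    prefix = pos[k] - pos[a]
                    suffix = pos[b] - pos[k]
                    best = max(best, min(prefix, suffix))
        reach[names[k]] = best
    return reach

names = ["s", "u", "x", "v", "y", "w", "t"]
lengths = [20, 10, 12, 7, 11, 18]

no_shortcuts = path_reaches(names, lengths)
one_shortcut = path_reaches(names, lengths, shortcuts=[(1, 5)])  # (u, w)
three_shortcuts = path_reaches(names, lengths,
                               shortcuts=[(1, 5), (1, 3), (3, 5)])
print(no_shortcuts)
print(one_shortcut)
print(three_shortcuts)
```

The cubic loop is only meant for tiny examples such as this one; it makes the definition explicit rather than efficient.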

recompute the reaches, we first determine the subgraph G_δ = (V_δ, A_δ) induced by V_δ. This graph contains not only original arcs, but also the shortcuts between vertices in V_δ added during the main phase. We then run an exact vertex reach computation on G_δ by growing a complete shortest path tree from each vertex in V_δ. Because these shortest path trees include vertices in G_δ only, we still have to use penalties to account for the remaining vertices.

5.5 Additional Parameters. The choice of ε_1 and α is a tradeoff between preprocessing efficiency and the quality of reaches and shortcuts. To choose ε_1, we first pick k = min{500, ⌊⌈√n⌉/3⌋} vertices at random. For each vertex, we compute the radius of a partial shortest path tree with exactly ⌊n/k⌋ scanned vertices. (This radius is the distance label of the last scanned vertex.) Then we set ε_1 to be twice the minimum of all k radii.

We use α = 3.0 until we reach an iteration in the main phase where the number of vertices is smaller than δ, then we reduce it to 1.5. This change allows the algorithm to add more shortcuts in the final iterations. The refinement step ensures that the reach bounds of the last δ vertices are still good.

6 Reach and the ALT Algorithm

6.1 A∗ Search and the ALT Algorithm. A potential function is a function from the vertices of a graph G to reals. Given a potential function π, the reduced cost of an arc is defined as ℓ_π(v, w) = ℓ(v, w) − π(v) + π(w). Suppose we replace the original distance function ℓ by ℓ_π. Then for any two vertices x and y, the length of every x-y path (including the shortest) changes by the same amount, π(y) − π(x). Thus the problem of finding shortest paths in G is equivalent to the problem of finding shortest paths in the transformed graph.

Now suppose we are interested in finding the shortest path from s to t. Let π_f be a (perhaps domain-specific) potential function such that π_f(v) gives an estimate on the distance from v to t. In the context of this paper, A∗ search [6, 18] is an algorithm that works like Dijkstra’s algorithm, except that at each step it selects a labeled vertex v with the smallest key, defined as k_f(v) = d_f(v) + π_f(v), to scan next. It is easy to see that A∗ search is equivalent to Dijkstra’s algorithm on the graph with length function ℓ_{π_f}. If π_f is such that ℓ_{π_f} is nonnegative for all arcs (i.e., if π_f is feasible), the algorithm will find the correct shortest paths. We refer to the class of A∗ search algorithms that use a feasible function π_f with π_f(t) = 0 as lower-bounding algorithms. As shown in [16], better estimates lead to fewer vertices being scanned. In particular, a lower-bounding algorithm with a nonnegative potential function visits no more vertices than Dijkstra’s algorithm, which uses the zero potential function.

We combine A∗ search and bidirectional search as follows. Let π_f be the potential function used in the forward search and let π_r be the one used in the reverse search. Since the latter works in the reverse graph, each original arc (v, w) appears as (w, v), and its reduced cost w.r.t. π_r is ℓ_{π_r}(w, v) = ℓ(v, w) − π_r(w) + π_r(v), where ℓ(v, w) is in the original graph. We say that π_f and π_r are consistent if, for all arcs (v, w), ℓ_{π_f}(v, w) in the original graph is equal to ℓ_{π_r}(w, v) in the reverse graph. This is equivalent to π_f + π_r = const.

If π_f and π_r are not consistent, the forward and reverse searches use different length functions. When the searches meet, we have no guarantee that the shortest path has been found. Assume π_f and π_r give lower bounds to the sink and from the source, respectively. We use the average function suggested by Ikeda et al. [19], defined as p_f(v) = (π_f(v) − π_r(v))/2 for the forward computation and as p_r(v) = (π_r(v) − π_f(v))/2 = −p_f(v) for the reverse one. Although p_f and p_r usually do not give lower bounds as good as the original ones, they are feasible and consistent.

The alt algorithm [13, 16] is based on A∗ and uses landmarks and triangle inequality to compute feasible lower bounds. We select a small subset of vertices as landmarks and, for each vertex in the graph, precompute distances to and from every landmark.
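
The landmark precomputation and the resulting triangle-inequality lower bound can be sketched as follows. This is a minimal illustration, not the alt implementation of [16]: it assumes a strongly connected directed graph stored as a dictionary of adjacency lists, takes the landmarks as given (landmark selection is not shown), and the names `dijkstra`, `preprocess`, and `lower_bound` are hypothetical.

```python
import heapq

def dijkstra(adj, source):
    """Distances from source in a graph given as {v: [(w, length), ...]}."""
    dist = {v: float("inf") for v in adj}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale heap entry
        for w, length in adj[v]:
            nd = d + length
            if nd < dist[w]:
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return dist

def reverse_graph(adj):
    rev = {v: [] for v in adj}
    for v, arcs in adj.items():
        for w, length in arcs:
            rev[w].append((v, length))
    return rev

def preprocess(adj, landmarks):
    """For each landmark L, distances to L and from L for every vertex."""
    rev = reverse_graph(adj)
    dist_from = {L: dijkstra(adj, L) for L in landmarks}
    dist_to = {L: dijkstra(rev, L) for L in landmarks}  # search in reverse graph
    return dist_to, dist_from

def lower_bound(v, w, dist_to, dist_from):
    """Best landmark lower bound on dist(v, w).

    Assumes all landmark distances are finite (strongly connected graph).
    """
    best = 0
    for L in dist_to:
        best = max(best,
                   dist_to[L][v] - dist_to[L][w],      # dist(v, L) - dist(w, L)
                   dist_from[L][w] - dist_from[L][v])  # dist(L, w) - dist(L, v)
    return best

# Toy graph: the path a - b - c - d with arcs in both directions.
adj = {
    "a": [("b", 2)],
    "b": [("a", 2), ("c", 3)],
    "c": [("b", 3), ("d", 4)],
    "d": [("c", 4)],
}
to_a, from_a = preprocess(adj, ["a"])
to_c, from_c = preprocess(adj, ["c"])
print(lower_bound("b", "d", to_a, from_a))  # landmark a lies "before" b
print(lower_bound("b", "d", to_c, from_c))  # landmark c lies between b and d
```

In this toy graph the landmark a gives the exact distance from b to d, while the landmark c, which lies between the two vertices, gives a much weaker bound. For a query to t, a forward potential can be obtained as π_f(v) = lower_bound(v, t, ...).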
Consider a landmark L: if d(·) is the distance to L, then, by the triangle inequality, d(v) − d(w) ≤ dist(v, w); if d(·) is the distance from L, d(w) − d(v) ≤ dist(v, w). To get the tightest lower bound, one can take the maximum of these bounds, over all landmarks. Intuitively, the best lower bound on dist(v, w) is given by a landmark that appears “before” v or “after” w. We use the version of the alt algorithm used in [16], which balances the work of the forward search and the reverse search.

6.2 Reach and A∗ search. Reach-based pruning can be easily combined with A∗ search. Gutman [17] noticed this in the context of unidirectional search. The general approach is to run A∗ search and prune vertices (or arcs) based on reach conditions. When A∗ is about to scan a vertex v, we can extract the length of the shortest path from the source to v from the key of v (recall that k_f(v) = d_f(v) + π_f(v)). Furthermore, π_f(v) is a lower bound on the distance from v to the destination. If the reach of v is smaller than both d_f(v) and π_f(v), we prune the search at v.

The reason why reach-based pruning works is that, although A∗ search uses transformed lengths, the shortest paths remain invariant. This applies to bidirectional search as well. In this case, we use d_f(v) and π_f(v) to prune in the forward direction, and d_r(v) and π_r(v) to prune in the reverse direction. Pruning by reach does not affect the stopping condition of the algorithm. We still use the usual condition for A∗ search, which is similar to that of the standard bidirectional Dijkstra, but with respect to reduced costs [16]. We call our implementation of the bidirectional A∗ search algorithm with landmarks and reach-based pruning real. As in alt, we used a version of real that balances the work of the forward search and the reverse search. Our implementation of real uses variants of early pruning and arc sorting, modified for the context of A∗ search.

Note that we cannot use implicit bounds with A∗ search. The implicit bound based on the radius of the ball searched in the opposite direction does not apply because the ball is in the transformed space. The self-bounding algorithm cannot be combined with A∗ search in a useful way, because it assumes that the two searches will process balls of radius equal to half of the s-t distance. This defeats the purpose of A∗ search, which aims at processing a smaller set.

The main gain in the performance of A∗ search comes from the fact that it directs the two searches towards their goals, reducing the search space. Reach-based pruning sparsifies search regions, and this sparsification is effective for regions searched by both Dijkstra’s algorithm and A∗ search.

Note that real has two preprocessing algorithms: the one used by re (which computes shortcuts and reaches) and the one used by alt (which chooses landmarks and computes distances from all vertices to them). These two procedures are independent of each other: since shortcuts do not change distances, landmarks can be generated regardless of what shortcuts are added. Furthermore, the query is still independent of the preprocessing algorithm: the query only takes as input the graph with shortcuts, the reach values, and the distances to and from landmarks. The actual algorithms used to obtain this data can be changed at will.

7 Other Reach Definitions and Related Work

7.1 Gutman’s Algorithm. In [17], Gutman computes shortest routes with respect to travel times. However, his algorithm, which is unidirectional, uses Euclidean bounds on travel distances, not times. This requires a more general definition of reach, which involves, in addition to the metric induced by graph distances (native metric), another metric M, which can be different. To define reach, one considers native shortest paths, but takes subpath lengths and computes reach values for M-distances. It is easy to see how these reaches can be used for pruning. Note that Gutman’s algorithm can benefit from shortcuts, although he does not use them. All our algorithms have natural distance bounds for the native metric, so we use it as M.

Other major differences between re and Gutman’s algorithm are as follows. First, re is bidirectional, and bidirectional shortest path algorithms tend to scan fewer vertices than unidirectional ones. Second, re uses implicit lower bounds and thus does not need the vertex coordinates required by Gutman’s algorithm. Finally, the re preprocessing creates shortcuts, which Gutman’s algorithm does not. There are some other differences in the preprocessing algorithm, but their effect on performance is less significant. In particular, we do not grow partial trees from eliminated vertices, which requires a slightly different interpretation of penalties.

A variant of Gutman’s algorithm uses A∗ search with Euclidean lower bounds. In addition to the differences mentioned in the previous paragraph, real differs in using tighter landmark-based lower bounds.

7.2 Cardinality Reach and Highway Hierarchies. We now discuss the relationship between our reach-based algorithm (re) and the hh algorithm of Sanders and Schultes. Since hh is described for undirected graphs, we restrict the discussion to this case.

We introduce a variant of reach that we call c-reach (cardinality reach). Given a vertex v on a shortest path P, grow equal-cardinality balls centered at its endpoints until v belongs to one of the balls. Let c_P(v) be the
Table 1: Road Networks

name  description    vertices     arcs         latitude (N)   longitude (W)
NA    North America  29 883 886   70 297 895   [−∞, +∞]       [−∞, +∞]
E     Eastern USA    4 256 990    10 088 732   [24.0; 50.0]   [−∞; 79.0]
NW    Northwest USA  1 649 045    3 778 225    [42.0; 50.0]   [116.0; 126.0]
COL   Colorado       585 950      1 396 345    [37.0; 41.0]   [102.0; 109.0]
BAY   Bay Area       330 024      793 681      [37.0; 39.0]   [121; 123]

cardinality of each of the balls at this point. The c-reach of v, c(v), is the maximum, over all shortest paths P, of c_P(v). Note that if we replace cardinality with radius, we get the definition of reach. To use c-reach for pruning the search, we need the following values. For a vertex v and a nonnegative integer i, let ρ(v, i) be the radius of the smallest ball centered at v that contains i vertices. Consider a search for the shortest path from s to t and a vertex v. We do not need to scan v if ρ(s, c(v)) < dist(s, v) and ρ(t, c(v)) < dist(v, t). Implementation of this pruning method would require maintaining n − 1 values of ρ for every vertex.

The main idea behind hh preprocessing is to use the partial-trees algorithm for c-reaches instead of reaches. Given a threshold h, the algorithm identifies vertices that have c-reach below h (local vertices). Consider a bidirectional search. During the search from s, once the search radius advances past ρ(s, h), one can prune local vertices in this search. One can do similar pruning for the reverse search. This idea is applied recursively to the graph with low c-reach vertices deleted. This gives a hierarchy of vertices, in which each vertex needs to store a ρ-value for each level of the hierarchy it is present at. The preprocessing phase of hh also shortcuts lines and uses other heuristics to reduce the graph size at each iteration.

An important property of the hh query algorithm, which makes it similar to the self-bounding algorithm discussed in Section 4, is that the search in a given direction never goes to a lower level of the hierarchy. Our self-bounding algorithm can be seen as having a “continuous hierarchy” of reaches: once a search leaves a reach level, it never comes back to it. Like the self-bounding algorithm, hh cannot be combined with A∗ search in a natural way.

8 Experimental Results

8.1 Experimental Setup. We implemented our algorithms in C++ and compiled them with Microsoft Visual C++ 7.0. All tests were performed on a 2.4 GHz AMD Opteron with 16 GB of RAM running Microsoft Windows Server 2003.

We use a standard cache-efficient graph representation. All arcs are stored in a single array, with each arc represented by its head and its length.¹ The array is sorted by arc tail, so all outgoing arcs from a vertex appear consecutively. An array of vertices maps the identifier of a vertex to the position (in the list of arcs) of the first element of its adjacency list. All query algorithms use standard four-way heaps.

We conduct most of our tests on road networks. We test our algorithm on the five graphs described in Table 1. The first graph in the table, North America (NA), was extracted from Mappoint.NET data and represents Canada, the United States (including Alaska), and the main roads of Mexico. The other four instances are representative subgraphs of NA (for tests on more subgraphs, see [14]). All graphs are directed and biconnected. We ran tests with two length functions: travel times and travel distances.

For a comparison with hh, we use the graph of the United States built by Sanders and Schultes [28] based on Tiger-Line data [35]. Because our implementations of alt and real assume the graph to be connected (to simplify implementation), we only take the largest connected component of this graph, which contains more than 98.6% of the vertices. The graph is undirected, and we replace each edge {v, w} by arcs (v, w) and (w, v). Our version of the graph (which we call USA) has 23 947 347 vertices and 57 708 624 arcs.

We also performed experiments with grid graphs. Vertices of an x×y grid graph correspond to points on a two-dimensional grid with coordinates i, j for 0 ≤ i < x and 0 ≤ j < y. Each vertex has arcs to its left, right, up, and down neighbors, if present. Arc lengths are integers chosen uniformly at random from [1, 1024]. We use square grids (i.e., x = y).

Unless otherwise noted, in each experiment we run the algorithms with a fixed set of parameters. For alt we use the same parameters as in [16]: for each graph we generated one set of 16 maxcover landmarks, and each s-t search uses dynamic selection to pick between two and

¹ The length is stored as a 16-bit integer on the original graphs and as a 32-bit integer for the graphs with shortcuts. The head is always a 32-bit integer.
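
The single-array graph representation and the grid graphs of Section 8.1 can be illustrated together. The sketch below is written in Python for brevity (the implementation described in this paper is in C++) and makes several assumptions that are not in the text: the vertex array carries one extra sentinel entry so that the arcs of v occupy positions first[v] to first[v + 1] − 1, vertex (i, j) receives the identifier i·y + j, and the names `build_forward_star`, `out_arcs`, and `grid_arcs` are hypothetical.

```python
import random
from array import array

def build_forward_star(n, arcs):
    """Pack (tail, head, length) triples into arrays sorted by tail."""
    arcs = sorted(arcs, key=lambda a: a[0])   # stable sort by arc tail
    first = [0] * (n + 1)                     # first[n] is a sentinel (assumption)
    for tail, _, _ in arcs:
        first[tail + 1] += 1
    for v in range(n):
        first[v + 1] += first[v]              # prefix sums give start positions
    heads = array("I", (h for _, h, _ in arcs))      # unsigned int heads
    lengths = array("H", (l for _, _, l in arcs))    # unsigned short lengths
    return first, heads, lengths

def out_arcs(v, first, heads, lengths):
    """Outgoing arcs of v as (head, length) pairs; they are stored consecutively."""
    return [(heads[k], lengths[k]) for k in range(first[v], first[v + 1])]

def grid_arcs(x, y, rng):
    """Arcs of an x-by-y grid with integer lengths uniform in [1, 1024]."""
    arcs = []
    for i in range(x):
        for j in range(y):
            v = i * y + j
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < x and 0 <= nj < y:
                    arcs.append((v, ni * y + nj, rng.randint(1, 1024)))
    return arcs

rng = random.Random(1)
first, heads, lengths = build_forward_star(4, grid_arcs(2, 2, rng))
print(first)
print(out_arcs(0, first, heads, lengths))
```

Keeping heads and lengths in flat typed arrays mirrors the compact layout described in the footnote; the exact widths of the `array` type codes depend on the platform.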
six of those. The same set of landmarks was also used by real. Upper bounds on reaches were generated with the algorithm described in Section 5. The reaches thus obtained (along with the corresponding shortcuts) were used by both re and real.

8.2 Road Networks. Tables 2 and 3 present the results obtained by our algorithms when applied to the Mappoint.NET graphs with the travel-time and travel-distance metrics, respectively. In these experiments, we used 1000 random s-t pairs for each graph. We give results for preprocessing, average-case performance, and worst-case performance. For queries, we give both absolute numbers and the speedup with respect to an implementation of the bidirectional Dijkstra’s algorithm (to which we refer as b).

For queries, the running time generally increases with graph size. While the complexity of alt grows roughly linearly with the graph size, re and real scale better. For small graphs, alt is competitive with re, but for large graphs the latter is more than 20 times faster. real is 3 to 4 times faster than re.

In terms of preprocessing, we note that computing landmarks is significantly faster than finding good upper bounds on reaches. However, landmark data (with a reasonable number of landmarks) takes up more space than reach data; compare the space usage of re and alt. In fact, the reaches themselves are a minor part (less than 20%) of the total space required by re. The rest of the space is used up by the graph with shortcuts (typically, the number of arcs increases by 35% to 55%) and by the shortcut translation map, used to convert each shortcut into its constituent arcs. The time for actually performing this conversion after each query is not taken into account in our experiments, since not all applications require it.

Next we compare the results for the two metrics. With the travel distance metric, the superiority of highways over local roads becomes much less pronounced than with travel times. As a result, re becomes twice as slow for queries, and preprocessing takes 2.5 times longer on NA. On the other hand, alt queries slow down only by about 20%, and preprocessing slows down even less. Changes in the performance of real fall in between, which implies that its speedup with respect to re becomes higher (on NA, real visits less than one tenth as many vertices as re on average). All algorithms require a similar amount of space for travel times and for travel distances. While not quite as good as with travel times, the performance for travel distances is still excellent: real can find a shortest path on NA in less than 6 milliseconds on average.

While s and t are usually far apart on random s-t pairs, queries for driving directions tend to be more local. We used an idea from [28] to generate queries with different degrees of locality for NA. See Figure 3. When s and t are close together, alt visits fewer vertices than re. However, since the asymptotic performance of alt is worse, re quickly surpasses it as s and t get farther apart. real is the best algorithm in every case. Comparing plots for travel time and distance metrics, we note that alt is less affected by the metric change than the other algorithms.

8.3 Comparison to Highway Hierarchies. As already mentioned, hh is the most practical of the previous P2P algorithms. Recall that hh works for undirected graphs only, while our algorithms work on directed graphs (which are more general). To compare our algorithms with hh, we use the (undirected) USA graph. Data for hh on USA, which we take from [28] and from a personal communication from Dominik Schultes, is available for the travel time metric only.

We compare both operation counts and running times. Since for all algorithms queries are based on the bidirectional Dijkstra’s algorithm, comparing the number of vertices scanned is informative. For the running times, note that the hh experiments were conducted on a somewhat different machine. It was slightly slower than ours: an AMD Opteron running at 2.2 GHz (ours is an AMD Opteron running at 2.4 GHz) using the Linux operating system (ours uses Windows). Furthermore, implementation styles may be different. This introduces an extra error margin in the running time comparison. (To emphasize this, we use ≈ when stating running times for hh in Table 4.) However, comparing running times gives a good sanity check, and is necessary for preprocessing algorithms, which differ more than the query algorithms.

While in all our experiments we give the maximum number of vertices visited during the 1 000 queries we tested, Sanders and Schultes also obtain an upper bound on the worst-case number by running the search from each vertex in the graph to an unreachable dummy vertex and doubling the maximum number of vertices scanned. We did the same for re. Note that this approach does not work for the landmark-based algorithms, as preprocessing would determine that no landmark is reachable to or from the dummy vertex. For both metrics, the upper bound is about a factor of 1.5 higher than the lower bound given by the maximum over 1 000 trials, suggesting that the latter is a reasonable approximation of the worst-case behavior.

Data presented in Table 4 for the travel time metric suggests that re and hh have similar performance and memory requirements. real queries are faster, but
Table 2: Algorithm performance on road networks with travel times as arc lengths: total preprocessing time, total space on disk required by the preprocessed data (in megabytes), average number of vertices scanned per query (over 1000 random queries), maximum number of vertices scanned (over the same queries), and average running times. Query data is shown both in absolute values and as a speedup (spd) with respect to the bidirectional Dijkstra algorithm.

graph  method  prep. time  disk space  avg scans          max scans            avg time
               (min)       (MB)        count     spd      count       spd      ms       spd
BAY    alt     0.7         26          4 052     29       54 818      5        3.39     16
       re      3.2         19          1 590     74       3 438       85       1.17     48
       real    3.9         40          290       404      1 691       172      0.45     123
COL    alt     1.6         47          7 373     26       85 246      6        5.84     15
       re      5.2         36          2 181     88       5 074       103      1.80     49
       real    6.9         73          306       624      1 612       324      0.59     149
NW     alt     3.9         132         14 178    36       144 082     8        12.52    21
       re      17.5        100         2 804     184      5 877       203      2.39     112
       real    21.4        204         367       1 408    1 513       789      0.73     365
E      alt     15.2        342         35 044    42       487 194     8        44.47    18
       re      84.7        255         6 925     212      13 857      277      7.06     116
       real    99.9        523         795       1 843    4 543       844      1.61     510
NA     alt     95.3        2 398       250 381   41       3 584 377   8        393.41   19
       re      678.8       1 844       14 684    698      24 618      1 104    17.38    439
       real    774.2       3 726       1 595     6 430    7 450       3 647    3.67     2 080

Table 3: Algorithm performance on road networks with travel distances as arc lengths: total preprocessing time, total space on disk required by the preprocessed data (in megabytes), average number of vertices scanned per query (over 1000 random queries), maximum number of vertices scanned (over the same queries), and average running times. Query data is shown both in absolute values and as a speedup (spd) with respect to the bidirectional Dijkstra algorithm.

graph  method  prep. time  disk space  avg scans          max scans            avg time
               (min)       (MB)        count     spd      count       spd      ms       spd
BAY    alt     0.8         27          3 383     35       42 192      7        3.25     18
       re      4.6         19          2 761     43       6 313       45       2.05     28
       real    5.4         41          335       356      2 717       105      0.45     128
COL    alt     1.8         48          7 793     24       126 755     4        6.34     14
       re      9.7         36          3 792     50       10 067      50       3.16     28
       real    11.5        75          406       469      2 805       178      0.72     123
NW     alt     4.2         136         20 662    26       426 069     3        21.61    12
       re      21.3        101         4 217     125      10 630      121      3.81     71
       real    25.4        208         478       1 103    3 058       419      0.89     302
E      alt     14.6        353         43 737    35       582 663     7        61.98    15
       re      158.9       258         14 025    108      28 144      141      13.28    69
       real    173.4       537         1 142     1 323    7 097       560      2.27     404
NA     alt     97.2        2 511       292 777   36       3 588 684   8        476.86   17
       re      1 623.0     1 866       30 962    336      56 794      485      34.92    231
       real    1 720.2     3 860       2 653     3 922    17 527      1 570    5.97     1 351

[Figure 3 plots: two panels, each showing scanned nodes (log scale, from 10^1 to 10^6) against bucket (from 10 to 24) for ALT, RE, and REAL.]

Figure 3: Average number of scanned vertices for local queries on NA with travel times (left) and distances (right). The
horizontal axis refers to buckets with 1 000 pairs each. Each pair s-t in bucket i is such that s is chosen at random and t is
the j-th farthest vertex from s, where j is selected uniformly at random from the range (2^{i−1}, 2^i]. The vertical axis is in
log scale.
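
The bucketed query generation described in the caption of Figure 3 can be sketched as follows. This is an illustration under a stated interpretation, not the generator used in our experiments: it assumes that the "j-th farthest vertex from s" is the vertex with rank j in the order in which Dijkstra's algorithm scans vertices from s (with s itself at rank 0), and that vertex identifiers are integers. The names `scan_order` and `local_pair` are hypothetical.

```python
import heapq
import random

def scan_order(adj, s):
    """Vertices in the order Dijkstra's algorithm scans them from s."""
    dist = {s: 0}
    done = set()
    order = []
    heap = [(0, s)]
    while heap:
        d, v = heapq.heappop(heap)
        if v in done:
            continue  # stale heap entry
        done.add(v)
        order.append(v)
        for w, length in adj[v]:
            nd = d + length
            if w not in dist or nd < dist[w]:
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return order

def local_pair(adj, i, rng):
    """Draw one s-t pair for bucket i: the rank of t is uniform in (2^(i-1), 2^i]."""
    s = rng.choice(sorted(adj))
    j = rng.randint(2 ** (i - 1) + 1, 2 ** i)   # both endpoints inclusive
    order = scan_order(adj, s)
    if j >= len(order):
        raise ValueError("bucket index too large for this graph")
    return s, order[j]

# Toy graph: a directed cycle on 8 vertices with unit lengths, so the vertex
# of rank j from s is (s + j) mod 8.
adj = {v: [((v + 1) % 8, 1)] for v in range(8)}
s, t = local_pair(adj, 2, random.Random(0))
print(s, t)
```

Small bucket indices produce local queries and large ones approach random pairs, which is what the horizontal axis of Figure 3 varies.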

Table 4: Results for the undirected USA graph (same measures as in Table 2). For hh, averages are taken over 10 000
random queries (but the maximum is still taken over 1 000). For hh and re we also give an upper bound on the maximum
number of scans (ub). Data for hh with travel distances is not available.

metric     method  prep. time  disk space  avg scans          max scans                     avg time
                   (min)       (MB)        count     spd      count       spd      ub       ms        spd
times      alt     92.7        1 984       177 028   44       2 587 562   8        —        322.78    21
           re      365.9       1 476       3 851     2 000    8 722       2 330    13 364   4.50      1 475
           real    458.5       3 038       891       8 646    3 667       5 541    —        1.84      3 601
           hh      ≈ 258.0     1 457       3 912     1 969    5 955       3 412    8 678    ≈ 7.04    ≈ 937
distances  alt     99.9        1 959       256 507   33       2 674 150   8        —        392.84    15
           re      981.5       1 503       22 377    376      44 130      500      68 672   25.59     236
           real    1 081.4     3 040       2 119     3 973    11 163      1 977    —        4.89      1 235

it needs more memory. alt queries are substantially slower, but preprocessing is faster.

Lacking data for hh, we cannot compare it to our algorithms for the travel distance metric. Performance of our algorithms on USA with this metric is similar to that on NA with the same metric. This suggests that directed graphs are not much harder than undirected ones for our algorithms. In contrast, with the travel time metric, the performance of re (and, to a lesser extent, real) is much better on USA than on NA. This suggests that the hierarchy on the USA graph with travel times is more evident than on NA, probably because USA has a small number of road categories.

8.4 Grids. Although road networks are our motivating application, we also tested our algorithms on grid graphs. As with road networks, for each graph we generated 1 000 pairs of vertices, each selected uniformly at random. These graphs have no natural hierarchy of shortest paths, which results in a large fraction of the vertices having high reach. For these tests, we used the same parameter settings as for road networks. It is unclear how much one can increase performance by tuning parameter values. As preprocessing for grids is fairly expensive, we limited the maximum grid size to about half a million vertices. The results are shown in Table 5.

As expected, re does not get nearly as much speedup on grids as it does on road networks (see Tables 2 and 3). However, there is some speedup, and it does grow (albeit slowly) with grid size. alt is significantly faster than re: in fact, its speedup on grids is comparable to that on road networks. However, the speedup does not appear to change much with grid size, and it is likely that for very large grids re would be faster.

An interesting observation is that real remains the
Table 5: Algorithm performance on grid graphs with random arc lengths. For each graph and each method, the table
shows the total time spent in preprocessing, the total size of the data stored on disk after preprocessing, the average
number of vertices scanned (over 1 000 random queries), the maximum number of vertices scanned (over the same queries),
and the average running time. For the last three measures, we show both the actual value and the speedup (spd) with
respect to b.

vertices  method  prep. time  disk space  avg scans          max scans            avg time
                  (min)       (MB)        count     spd      count       spd      msec     spd
65 536    alt     0.2         6.2         686       29.6     8 766       5.5      0.52     17.6
          re      12.3        5.2         5 514     3.7      10 036      4.8      3.09     2.9
          real    12.5        9.6         363       55.9     2 630       18.4     0.34     26.4
131 044   alt     0.6         12.4        1 307     32.6     14 400      7.2      1.42     13.9
          re      44.7        10.4        9 369     4.6      16 247      6.4      5.94     3.3
          real    45.3        19.3        551       77.4     3 174       32.6     0.77     25.8
262 144   alt     0.9         25.1        2 382     35.9     27 399      7.3      2.81     16.1
          re      131.4       20.7        14 449    5.9      24 248      8.3      9.75     4.6
          real    132.3       38.8        791       108.0    5 020       39.9     1.22     37.1
524 176   alt     1.9         50.2        4 416     38.8     40 568      9.9      5.25     17.5
          re      232.1       41.4        23 201    7.4      39 433      10.2     17.47    5.3
          real    234.1       77.7        1 172     146.3    7 702       52.3     1.61     57.2

best algorithm in this test, and its speedup grows with grid size. For our largest grid, queries for real improve on alt by about a factor of four for all performance measures that we considered. The space penalty of real with respect to alt is a factor of about 1.5. real is over 50 times better than b. This shows that the combination of reaches and landmarks is more robust than either alt or re individually.

The most important downside of the reach-based approach on grids is its large preprocessing time. An interesting question is whether this can be improved. This would require a more elaborate procedure for adding shortcuts to a graph (instead of just waiting for lines to appear during the preprocessing algorithm). Such an improvement may lead to a better preprocessing algorithm for road networks as well.

8.5 Additional Experiments. We ran our preprocessing algorithm on BAY with and without shortcut generation. The results are shown in Table 6. Without shortcuts, queries visited almost 10 times as many vertices, and preprocessing was more than 15 times slower; for larger graphs, the relative performance is even worse. Without shortcuts, preprocessing NA is impractical. The table also compares approximate and exact reach computations. Again, preprocessing for exact reaches is extremely expensive, and of course shortcuts do not make it any faster (note that the shortcuts in this case are the ones added by the approximate algorithm). Fortunately, our upper bounding heuristic seems to do a good enough job: on BAY, exact reaches improved queries by less than 25%.

We also experimented with the number of landmarks real uses on NA. With as few as four landmarks, real is already twice as fast as re on average (while visiting less than one third of the vertices). In general, more landmarks give better results, but with more than 16 landmarks the additional speedup does not seem to be worth the extra amount of space required.

9 Conclusion and Future Work

The reach-based shortest path approach leads to simple query algorithms with efficient implementations. Adding shortcuts greatly improves the performance of these algorithms on road networks. We have shown that the algorithm re, based on these ideas, is competitive with the best previous method. Moreover, it combines naturally with A∗ search. The resulting algorithm, real, improves query times even more: an average query in North America takes less than 4 milliseconds.

However, we believe there is still room for improvement. In particular, we could make the algorithm more cache-efficient by reordering the vertices so that those with high reach appear close to each other. There are few of those, and they are much more likely to be visited during any particular search than low-reach vertices.

The number of vertices visited could also be reduced. With shortcuts added, a shortest path on NA with travel times has on average less than 100 vertices,
Table 6: Results for re with different reach values on BAY, both with and without shortcuts.

metric     shortcuts  reaches   prep. time  avg scans   max scans   query time
                                (min)                               (ms)
times      no         approx.   52.8        13 369      28 420      6.44
                      exact     966.1       11 194      24 358      6.05
           yes        approx.   3.2         1 590       3 438       1.17
                      exact     980.7       1 383       3 056       0.97
distances  no         approx.   82.5        17 448      37 171      9.47
                      exact     956.9       13 986      30 788      7.61
           yes        approx.   4.6         2 761       6 313       2.05
                      exact     1 078.9     2 208       5 159       1.55

but an average real search scans more than 1500 vertices. Simply adding more landmarks would require too much space, however. To overcome this, one could store landmark distances only for a fraction (e.g., 20%) of the vertices, those with reach greater than some threshold R. The query algorithm would first search balls of radius R around s and t without using landmarks, then would start using landmarks from that point on. Another potential improvement would be to pick a set of landmarks specific to real (in our current implementation, real uses the same landmarks as alt).
Also, one could reduce the space required to store r values by picking a constant γ, rounding r’s up to the nearest integer power of γ, and storing the logarithms to the base γ of the r’s.
Our query algorithm is independent of the preprocessing algorithm, allowing us to state natural subproblems for the latter. What is a good number of shortcuts to add? Where to add them? How to do it efficiently?
Another natural problem, originally raised by Gutman [17], is that of efficient reach computation. Can one compute reaches in less than Θ(nm) time? What about provably good upper bounds on reaches? Our results add another dimension to this direction of research by allowing shortcuts to be added to improve performance.
Another interesting direction of research is to identify a wider class of graphs for which these techniques work well, and to make the algorithms more robust over that class.

Acknowledgments
We would like to thank Peter Sanders and Dominik Schultes for their help with the USA graph data.

References
[1] B. V. Cherkassky, A. V. Goldberg, and T. Radzik. Shortest Paths Algorithms: Theory and Experimental Evaluation. Math. Prog., 73:129–174, 1996.
[2] L. J. Cowen and C. G. Wagner. Compact Roundtrip Routing in Directed Networks. In Proc. Symp. on Principles of Distributed Computation, pages 51–59, 2000.
[3] G. B. Dantzig. Linear Programming and Extensions. Princeton Univ. Press, Princeton, NJ, 1962.
[4] E. V. Denardo and B. L. Fox. Shortest-Route Methods: 1. Reaching, Pruning, and Buckets. Oper. Res., 27:161–186, 1979.
[5] E. W. Dijkstra. A Note on Two Problems in Connexion with Graphs. Numer. Math., 1:269–271, 1959.
[6] J. Doran. An Approach to Automatic Problem-Solving. Machine Intelligence, 1:105–127, 1967.
[7] D. Dreyfus. An Appraisal of Some Shortest Path Algorithms. Technical Report RM-5433, Rand Corporation, Santa Monica, CA, 1967.
[8] J. Fakcharoenphol and S. Rao. Planar graphs, negative weight edges, shortest paths, and near linear time. In Proc. 42nd IEEE Annual Symposium on Foundations of Computer Science, pages 232–241, 2001.
[9] M. L. Fredman and R. E. Tarjan. Fibonacci Heaps and Their Uses in Improved Network Optimization Algorithms. J. Assoc. Comput. Mach., 34:596–615, 1987.
[10] G. Gallo and S. Pallottino. Shortest Paths Algorithms. Annals of Oper. Res., 13:3–79, 1988.
[11] A. V. Goldberg. A Simple Shortest Path Algorithm with Linear Average Time. In Proc. 9th ESA, Lecture Notes in Computer Science LNCS 2161, pages 230–241. Springer-Verlag, 2001.
[12] A. V. Goldberg. Shortest Path Algorithms: Engineering Aspects. In Proc. ESAAC ’01, Lecture Notes in Computer Science. Springer-Verlag, 2001.
[13] A. V. Goldberg and C. Harrelson. Computing the Shortest Path: A∗ Search Meets Graph Theory. In Proc. 16th ACM-SIAM Symposium on Discrete Algorithms, pages 156–165, 2005.
[14] A. V. Goldberg, H. Kaplan, and R. F. Werneck. Reach for A∗: Efficient Point-to-Point Shortest Path Algorithms. Technical Report MSR-TR-2005-132, Microsoft Research, 2005.
[15] A. V. Goldberg and C. Silverstein. Implementations of Dijkstra’s Algorithm Based on Multi-Level Buckets. In P. M. Pardalos, D. W. Hearn, and W. W. Hages, editors, Lecture Notes in Economics and Mathematical Systems 450 (Refereed Proceedings), pages 292–327. Springer Verlag, 1997.
[16] A. V. Goldberg and R. F. Werneck. Computing Point-to-Point Shortest Paths from External Memory. In Proc. 7th International Workshop on Algorithm Engineering and Experiments, pages 26–40. SIAM, 2005.
[17] R. Gutman. Reach-based Routing: A New Approach to Shortest Path Algorithms Optimized for Road Networks. In Proc. 6th International Workshop on Algorithm Engineering and Experiments, pages 100–111. SIAM, 2004.
[18] P. E. Hart, N. J. Nilsson, and B. Raphael. A Formal Basis for the Heuristic Determination of Minimum Cost Paths. IEEE Transactions on System Science and Cybernetics, SSC-4(2), 1968.
[19] T. Ikeda, M.-Y. Hsu, H. Imai, S. Nishimura, H. Shimoura, T. Hashimoto, K. Tenmoku, and K. Mitoh. A Fast Algorithm for Finding Better Routes by AI Search Techniques. In Proc. Vehicle Navigation and Information Systems Conference. IEEE, 1994.
[20] R. Jacob, M. Marathe, and K. Nagel. A Computational Study of Routing Algorithms for Realistic Transportation Networks. Oper. Res., 10:476–499, 1962.
[21] P. Klein. Preprocessing an Undirected Planar Network to Enable Fast Approximate Distance Queries. In Proc. 13th ACM-SIAM Symposium on Discrete Algorithms, pages 820–827, 2002.
[22] J. L. R. Ford. Network Flow Theory. Technical Report P-932, The Rand Corporation, 1956.
[23] J. L. R. Ford and D. R. Fulkerson. Flows in Networks. Princeton Univ. Press, Princeton, NJ, 1962.
[24] U. Lauther. An Extremely Fast, Exact Algorithm for Finding Shortest Paths in Static Networks with Geographical Background. In IfGIprints 22, Institut fuer Geoinformatik, Universitaet Muenster (ISBN 3-936616-22-1), pages 219–230, 2004.
[25] U. Meyer. Single-Source Shortest Paths on Arbitrary Directed Graphs in Linear Average Time. In Proc. 12th ACM-SIAM Symposium on Discrete Algorithms, pages 797–806, 2001.
[26] T. A. J. Nicholson. Finding the Shortest Route Between Two Points in a Network. Computer J., 9:275–280, 1966.
[27] I. Pohl. Bi-directional Search. In Machine Intelligence, volume 6, pages 124–140. Edinburgh Univ. Press, Edinburgh, 1971.
[28] P. Sanders and D. Schultes. Highway Hierarchies Hasten Exact Shortest Path Queries. In Proc. 13th Annual European Symposium Algorithms, volume 3669 of LNCS, pages 568–579. Springer, 2005.
[29] D. Schultes. Fast and Exact Shortest Path Queries Using Highway Hierarchies. Master’s thesis, Department of Computer Science, Universität des Saarlandes, Germany, 2005.
[30] F. Schulz, D. Wagner, and K. Weihe. Using Multi-Level Graphs for Timetable Information. In Proc. 4th International Workshop on Algorithm Engineering and Experiments, pages 43–59. LNCS, Springer, 2002.
[31] R. Sedgewick and J. Vitter. Shortest Paths in Euclidean Graphs. Algorithmica, 1:31–48, 1986.
[32] R. E. Tarjan. Data Structures and Network Algorithms. Society for Industrial and Applied Mathematics, Philadelphia, PA, 1983.
[33] M. Thorup. Undirected Single-Source Shortest Paths with Positive Integer Weights in Linear Time. J. Assoc. Comput. Mach., 46:362–394, 1999.
[34] M. Thorup. Compact Oracles for Reachability and Approximate Distances in Planar Digraphs. In Proc. 42nd IEEE Annual Symposium on Foundations of Computer Science, pages 242–251, 2001.
[35] D. US Census Bureau, Washington. UA Census 2000 TIGER/Line files. http://www.census.gov/geo/www/tiger/tugerua/ua.tgr2k.html, 2002.
[36] D. Wagner and T. Willhalm. Geometric Speed-Up Techniques for Finding Shortest Paths in Large Sparse Graphs. In European Symposium on Algorithms, 2003.
[37] F. B. Zhan and C. E. Noon. Shortest Path Algorithms: An Evaluation using Real Road Networks. Transp. Sci., 32:65–73, 1998.
[38] F. B. Zhan and C. E. Noon. A Comparison Between Label-Setting and Label-Correcting Algorithms for Computing One-to-One Shortest Paths. Journal of Geographic Information and Decision Analysis, 4, 2000.
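
The concluding remarks propose compressing stored reach values by rounding each one up to an integer power of a constant γ and keeping only the exponent. The following is a minimal sketch of that idea, not the authors' implementation. The function names, the default γ = 1.1, and the convention that code 0 is reserved for a reach of exactly 0 are all assumptions made for illustration. Because values are only ever rounded up, the decoded number remains a valid upper bound on the reach, so pruning can become weaker but should not become incorrect.

```python
import math


def encode_reach(r, gamma=1.1):
    """Return a small integer code for an upper bound on reach r.

    Hypothetical convention: code 0 means reach 0, and code k + 1
    means the bound gamma**k. Reaches in (0, 1] are rounded up to 1.
    """
    if r < 0 or gamma <= 1:
        raise ValueError("need r >= 0 and gamma > 1")
    if r == 0:
        return 0
    # Initial guess for the smallest k with gamma**k >= r.
    k = max(0, math.ceil(math.log(r, gamma)))
    # Correct the guess in case floating-point error made it off by one.
    while gamma ** k < r:
        k += 1
    while k > 0 and gamma ** (k - 1) >= r:
        k -= 1
    return k + 1


def decode_reach(code, gamma=1.1):
    """Recover the rounded-up reach bound from its code."""
    return 0 if code == 0 else gamma ** (code - 1)


if __name__ == "__main__":
    for r in [0, 1, 7, 1500, 10**6]:
        code = encode_reach(r)
        print(r, code, decode_reach(code))
```

A larger γ yields smaller codes but looser bounds, since the decoded value can exceed the true reach by a factor of up to γ. The right trade-off would have to be determined experimentally.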
