Csom Phdthesis
[...] $= (V', E')$ [...] $V' \subseteq V$ and $E' \subseteq E$. An
induced subgraph is a subset of the vertices of a graph together with all edges whose endpoints
are both in this subset.
Definition 3. Two nodes $u, v \in V$ of a graph $G = (V, E)$ are called adjacent if there is an edge
between $u$ and $v$, that is, $\{u, v\} \in E$. For a graph $G = (V, E)$, the set of neighbors of a vertex $v$,
denoted by $\Gamma_G(v)$, is defined as the set of nodes adjacent to $v$, that is,
$\Gamma_G(v) := \{u : \{u, v\} \in E\}$.
For a set of nodes $U \subseteq V$, let $\Gamma_G(U) := \bigcup_{u \in U} \Gamma_G(u)$.
Definition 4. For a graph $G = (V, E)$, the degree of a vertex $v$, denoted by $\deg_G(v)$, is defined
as the number of its neighbors, that is, $\deg_G(v) := |\Gamma_G(v)|$.
If the graph $G$ is clear from the context, we omit subscripts. For example, we write $\Gamma(v)$ and
$\deg(v)$ for the set of neighbors and the degree of a node $v$, respectively.
A graph is called $r$-regular if all vertices have degree $r$.
The sum of all node degrees equals twice the number of edges, since every edge is counted once at
each of its two endpoints:
\[ \sum_{v \in V} \deg(v) = 2\,|E|. \]
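The neighbor sets and degrees of Definitions 3 and 4 can be computed directly from an edge list. The following is a minimal sketch, assuming a simple undirected graph (no self-loops, no parallel edges); the names `neighbor_sets` and `degrees` and the example edge list are illustrative and not taken from the thesis.

```python
from collections import defaultdict


def neighbor_sets(edges):
    """Map every vertex v to its neighbor set Gamma(v) (simple undirected graph)."""
    gamma = defaultdict(set)
    for u, v in edges:
        gamma[u].add(v)
        gamma[v].add(u)
    return dict(gamma)


def degrees(gamma):
    """deg(v) := |Gamma(v)| for every vertex v."""
    return {v: len(nbrs) for v, nbrs in gamma.items()}


# Hypothetical example: a triangle 1-2-3 with a pendant vertex 4.
edges = [(1, 2), (2, 3), (3, 1), (3, 4)]
gamma = neighbor_sets(edges)
deg = degrees(gamma)

# The degree sum counts every edge twice.
assert sum(deg.values()) == 2 * len(edges)
```

Storing neighbors as sets makes the adjacency test `u in gamma[v]` a constant-time operation in expectation.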
CHAPTER 2. PRELIMINARIES
In this thesis, if not stated otherwise, we consider undirected graphs (as in Definition 1). In
some networks, relationships between entities are inherently directed, for example one-way streets
in road networks or hyperlinks in the World Wide Web. Directed graphs can be used to model these
networks.
Definition 5. A directed graph (digraph) $D$ is a pair $D = (V, A)$ consisting of a set of nodes $V$
and a set of edges (also called arcs) $A \subseteq V \times V$.
For digraphs, we may distinguish between in-neighbors and out-neighbors.
Definition 6. For a digraph $D = (V, E)$, the set of in-neighbors of a vertex $v$ is defined as
$\Gamma^-_D(v) := \{u : (u, v) \in E\}$, and its set of out-neighbors is defined as
$\Gamma^+_D(v) := \{u : (v, u) \in E\}$.
We define the neighbors of a vertex $v$ as the union of the set of in-neighbors and the set of
out-neighbors, $\Gamma_D(v) := \Gamma^-_D(v) \cup \Gamma^+_D(v)$. Its in-degree is
$\deg^-_D(v) := |\Gamma^-_D(v)|$ and its out-degree is $\deg^+_D(v) := |\Gamma^+_D(v)|$.
Note that $\deg_D(v) \le \deg^-_D(v) + \deg^+_D(v)$ and equality does not necessarily hold, since a
node may be both an in-neighbor and an out-neighbor of $v$. We may
again omit the subscript if $D$ is clear from the context.
Weighted graphs also capture relationships of different cost, length, and strength. In what
follows, we only consider edge-weighted graphs (as opposed to node-weighted graphs). We may
still restrict the edge weights to 1, which yields an unweighted graph.
Definition 7. An edge-weighted graph (digraph) is a graph (digraph) associated with a weight
function $w : E \to \mathbb{R}$.
An edge weight can be interpreted as representing a value in the real world such as distance,
time, cost, penalty, or loss. In the following, if not stated otherwise, we shall only consider weight
functions with positive range, that is, $w : E \to \mathbb{R}^+$. This explicitly excludes edges with weight
0. This is no restriction, since we may just contract (definition below) edges with weight 0, which
yields a single vertex instead.
Definition 8. A path in $G$ from a node $u_0$ to a node $u_h$ is a sequence of (undirected or directed)
edges $((u_0, u_1), (u_1, u_2), \ldots, (u_{h-1}, u_h))$. We also interpret such a path as a node sequence
$(u_0, u_1, \ldots, u_h)$, as a node set $\{u_0, u_1, \ldots, u_h\}$, or as a subgraph, when this simplifies the
notation. The length of a path $P$ is the sum of its edge weights $\ell(P) := \sum_{i=0}^{h-1} w(u_i, u_{i+1})$. The
hop-length of a path $P$ is the number of edges $h$ on $P$.
Note that for any path of an unweighted graph, the hop-length and the path length are equal.
Definition 9. A subpath $P'$ of a path $P = (u_0, u_1, \ldots, u_h)$ is a path constructed from a
subsequence of nodes $P' = (u_i, u_{i+1}, \ldots, u_j)$, $0 \le i < j \le h$. A simple path is a path without repeated
vertices. Two paths are called vertex-disjoint if they do not have any vertices in common except
for, possibly, the endpoints.
Distances in graphs are computed based on the shortest path metric.¹
¹ This connection to metrics is one reason to restrict the range of the edge-weight function $w(\cdot)$ to $\mathbb{R}^+$ and the
graphs to undirected. This thesis does not heavily rely on the general concepts of a metric space, but since metrics
are inherently designed to measure distance, we briefly outline the basic definition. A metric space is a set for whose
elements a distance (called a metric) is defined. This distance metric is supposed to satisfy three conditions:
1. $d(x, y) = 0$ if and only if $x = y$,
2.1. GRAPHS
Definition 10. Let $\mathcal{P}_G(u, v)$ denote the set of paths from $u$ to $v$ in $G$. The distance $d_G(u, v)$
between two nodes $u, v$ is the length of a shortest path from $u$ to $v$; that is,
\[ d_G(u, v) = \min_{P \in \mathcal{P}_G(u, v)} \ell(P). \]
If $\mathcal{P}_G(u, v) = \emptyset$ then $d_G(u, v) := \infty$. The distance $d_G(u, V')$ between a node $u$ and a subset of
the nodes $V' \subseteq V$ is defined as $d_G(u, V') := \min_{v \in V'} d_G(u, v)$. The distance between two subsets of
the nodes $U', V' \subseteq V$ is defined as $\min_{u \in U'} d_G(u, V')$.
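For positive edge weights, the distances of Definition 10 can be computed with Dijkstra's algorithm. The sketch below is one standard way to do so and is not specific to this thesis; it assumes the graph is given as a dictionary `adj` mapping each node to a list of `(neighbor, weight)` pairs, and the function names are illustrative.

```python
import heapq
import math


def dijkstra(adj, source):
    """Distances d_G(source, v) for all v reachable from source (positive weights)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # outdated heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def distance(adj, u, v):
    """d_G(u, v); infinite if there is no path from u to v."""
    return dijkstra(adj, u).get(v, math.inf)


def distance_to_set(adj, u, targets):
    """d_G(u, V') := min over v in V' of d_G(u, v)."""
    dist = dijkstra(adj, u)
    return min((dist.get(v, math.inf) for v in targets), default=math.inf)


# Hypothetical example: a weighted triangle a-b-c and an isolated node d.
adj = {
    "a": [("b", 1.0), ("c", 4.0)],
    "b": [("a", 1.0), ("c", 2.0)],
    "c": [("a", 4.0), ("b", 2.0)],
    "d": [],
}
```

In this example the shortest path from `a` to `c` goes through `b`, and the unreachable node `d` is at infinite distance, matching the convention for an empty path set.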
If unique shortest paths are needed, one may perturb the edge weights by adding random
infinitesimal weights² [MVV87, EHP04].
Definition 11. An undirected graph $G = (V, E)$ is connected if $d_G(u, v)$ is finite for all $u, v \in V$.
A directed graph $D = (V, \ldots)$ [...] at least one of $d_D(u, v)$ and
$d_D(v, u)$ is finite. A directed graph $D = (V, \ldots)$ [...] both
$d_D(u, v)$ and $d_D(v, u)$ are finite.
The (strongly) connected components of a graph can be extracted efficiently [CLRS01,
Section 22.5].
Definition 12. A cycle is a path where both endpoints coincide. A cycle is thus a node sequence
$(u_0, u_1, \ldots, u_h)$ for which $(u_i, u_{i+1}) \in E$ for all $i \in \{0, 1, \ldots, h-1\}$ and $u_0 = u_h$. The length of
a cycle is defined as the number of edges $h \ge 3$.
A cycle of length 3 is also called a triangle.
Definition 13. In a graph $G = (V, E)$, the open ball with radius $r$ around $v \in V$ is defined by
\[ B^{r}_G(v) := \{u \in V : d_G(v, u) < r\}. \]
Accordingly, the closed ball with radius $r$ is defined by $B^{r+}_G(v) := \{u \in V : d_G(v, u) \le r\}$.
The open (closed) ball relative to a subset of the nodes $U \subseteq V$ is defined as the open (closed)
ball with radius $d(v, U)$.
Definition 14. Define the multiplicative stretch of a path $P$ from $s$ to $t \ne s$ relative to the distance
from $s$ to $t$ as the ratio $\ell(P)/d_G(s, t)$ and define the additive stretch as the difference
$\ell(P) - d_G(s, t)$.
The stretch of a path is also called distortion.
¹ (continued)
2. symmetry: $d(x, y) = d(y, x)$, and
3. the triangle inequality: $d(x, z) \le d(x, y) + d(y, z)$.
These conditions also imply non-negativity $d(x, y) \ge 0$. Let $\ell_p^{\dim}$ denote the Euclidean space of dimension $\dim$,
denoted by $\mathbb{R}^{\dim}$, equipped with the $\ell_p$ norm. For $1 \le p < \infty$, the $\ell_p$ norm on a $\dim$-dimensional space is defined as
$\|x\|_p := \sqrt[p]{\sum_{i=1}^{\dim} |x_i|^p}$; set $\|x\|_\infty := \max_i |x_i|$.
² The isolation lemma [MVV87] states that, for a finite set of distinct weights $w(e_1), w(e_2), \ldots, w(e_{|E|})$, any
collection of subsets of weights has a unique minimum with probability at least 1/2. Edge weights can be made unique by
adding infinitesimal weights. Define $w'(e) := w(e) + \varepsilon(e)$, where $\varepsilon(e)$ for each edge $e$ is chosen independently at
random from $[n^2] = \{1, 2, \ldots, n^2\}$.
2.1.1 Graph Properties
Definition 15. The diameter $\operatorname{diam}(G)$ of a graph $G = (V, E)$ is the maximum distance between
two vertices:
\[ \operatorname{diam}(G) := \max_{u, v \in V} d_G(u, v). \]
We define the empty graph to have infinite diameter and the graph with one vertex to have zero
diameter. All other graphs have a diameter in $\mathbb{R}^+ \cup \{\infty\}$. We usually abbreviate $\Delta = \operatorname{diam}(G)$
when $G$ is clear from the context.
Definition 16. The radius of a graph $G = (V, E)$ is the least $r$ such that there is a vertex $v$ whose
closed ball $B^{r+}_G(v)$ covers all vertices.
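For unweighted graphs, diameter and radius can be obtained by running a breadth-first search from every vertex. The following sketch assumes an undirected, unweighted graph stored as a dictionary `adj` from each vertex to its neighbors; the helper names are illustrative, and returning infinity as the radius of the empty graph is a convention chosen here, not one stated in the text.

```python
from collections import deque
import math


def bfs_hops(adj, source):
    """Hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def eccentricity(adj, v):
    """Largest distance from v to any vertex; infinite if some vertex is unreachable."""
    dist = bfs_hops(adj, v)
    if len(dist) < len(adj):
        return math.inf
    return max(dist.values())


def diameter(adj):
    """Maximum distance between two vertices; infinite for the empty graph."""
    if not adj:
        return math.inf
    return max(eccentricity(adj, v) for v in adj)


def radius(adj):
    """Least r such that some closed ball of radius r covers all vertices."""
    return min((eccentricity(adj, v) for v in adj), default=math.inf)


# Hypothetical example: the path 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
```

On the path with four vertices, the two end vertices are at distance 3, while either inner vertex reaches everything within 2 hops. This all-pairs approach takes time proportional to the number of vertices times the number of edges, so it is only meant for small instances.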
Definition 17 (Girth). The girth of a graph $G = (V, E)$, denoted by $g(G)$, is the length of its
shortest cycle.
A connected undirected graph without cycles is called a tree. A tree with $n$ nodes has exactly
$n - 1$ edges. An undirected graph without cycles (but not necessarily connected) is called a forest.
Since there are no cycles in trees and forests, these graphs have infinite girth. A subgraph that is a
tree on all nodes is called a spanning tree.
The tree-width of a graph was introduced by Halin [Hal76], but it went unnoticed until it
was rediscovered by Robertson and Seymour [RS86] and, independently, by Arnborg and
Proskurowski [AP89]. The tree-width of a graph is defined as follows.
Definition 18. Let $G$ be a graph, $T$ a tree and let $\mathcal{V} = \{V_t \subseteq V(G) \mid t \in V(T)\}$ be a family of
vertex sets of $G$ indexed by the vertices $t$ of $T$. The pair $(T, \mathcal{V})$ is called a tree-decomposition of
$G$ if it satisfies the following three conditions:
• $V(G) = \bigcup_{t \in T} V_t$,
• for every edge $e \in G$ there exists a $t \in T$ such that both ends of $e$ lie in $V_t$,
• if $t, t', t'' \in V(T)$ and $t'$ lies on the path from $t$ to $t''$ in $T$, then $V_t \cap V_{t''} \subseteq V_{t'}$.
The width of $(T, \mathcal{V})$ is the number $\max\{|V_t| - 1 \mid t \in T\}$ and the tree-width $\operatorname{tw}(G)$ of $G$ is the
minimum width of any tree-decomposition of $G$.
Definition 19 (Doubling Dimension). The doubling dimension (also: Assouad dimension [Ass83])
of a graph is the minimum $\dim$ such that any ball of radius $r$ can be covered by at most $2^{\dim}$ balls
of radius $r/2$.
A metric with diameter $\Delta$ and doubling dimension $\dim$ has at most $\Delta^{O(\dim)}$ points.
Definition 20 (Edge Contraction). In an undirected graph $G = (V, E)$, the contraction of an edge
$e = \{u, v\}$ with endpoints $u$ and $v$ is the replacement of $u$ and $v$ by a single vertex $u'$ such that
the edges incident to the new vertex $u'$ [...]
[...] $(V, E_i)$ such that $E = \bigcup_{i=1}^{[\ldots]} E_i$.
For each of the aforementioned properties X, there is a class of graphs called bounded-X graphs,
in which the corresponding property is bounded by a constant, that is, O(1). For example,
bounded-degree graphs are graphs in which the degree of each vertex is bounded by a constant.
Another example is the class of graphs with bounded tree-width (Definition 18). The
tree-width is a good measure of the algorithmic tractability of graphs. It is known that a number of hard
problems on graphs can be solved efficiently when the given graph has small tree-width [AP89].
A graph has tree-width at most 1 if and only if it is a forest, and families of graphs with tree-width at most
2 include outer-planar graphs and series-parallel graphs.
A ball graph is an intersection graph of balls in $\mathbb{R}^{\dim}$. It consists of $n$ balls with centers $v_i$ and
radii $r_i$. Two centers $v_i, v_j$ are connected in the intersection graph iff their balls intersect in $\mathbb{R}^{\dim}$.
A disk graph is a ball graph with $\dim = 2$. In unit-disk and unit-ball graphs, all radii are equal.
The class of disk graphs contains the class of planar graphs [Koe36].
2.1.3 Synthetic Graph Models
Ideally, an algorithm would work well for all instances. However, more often than not, one can
construct an adversarial graph (also termed worst-case graph) for which certain algorithms show a
very bad performance. Almost ideally, an algorithm would work well for all practical instances, or
at least for a typical (average) instance. Even though many datasets are made public these days,³
the number of available real-world networks is still rather limited. Furthermore, the algorithm
designer may not know in advance which graph the user will work with. From a theoretical
perspective, creating an algorithm for one particular graph instance is trivial: since code size is
not measured and evaluated, all solutions can be encoded in advance. Instead, we often evaluate
algorithms on certain restricted classes of graphs (see Section 2.1.2). Many real-world networks,
however, do not fall into any of these classes.
A large branch of research investigates models of the real world. Since we wish to capture the
essential features of multiple networks, the models have some degree of freedom, which is often
modeled by randomness. In some random graphs [ER60, Gil59, Bol01] for example each possible
edge is in the graph with probability $p$. In general, these models may help to understand
characteristics of certain real-world networks but also to evaluate and test [ASS09] the performance of
algorithms.⁴ In the following we briefly review models for the networks discussed in this thesis: road
networks and complex networks.
Road Networks
Often planar graphs are used to model road networks.
Eppstein and Goodrich [EG08] and Eppstein et al. [EGS09] explicitly state that road networks
are non-planar. Instead, they use a model called multiscale-dispersed graphs, formalized in terms
of disk graphs (which contain planar graphs [Koe36]). They prove that these networks have small
separators (see also Theorem 4 in a subsequent section), which can be found efficiently.
Abraham et al. [AFGW10] introduce the notion of highway dimension, which means that for
every radius $r > 0$, there is a small set of vertices $S_r$, which all shortest paths of length greater
than $r$ pass through.
³ There are even online platforms to trade datasets, for example infochimps.org
⁴ Another approach to evaluate the average-case performance of algorithms and to generate test instances is to collect
many real-world graphs and perturb edge weights at random [Iri92].
Network      | Degree distribution                                         | Example                                                         | Model
Single-scale | Gaussian or exponential                                     |                                                                 | Erdős-Rényi [Gil59, ER60]
Scale-free   | Power law                                                   | Metabolic networks, food webs, Web graphs, and numerous others | Pref. attachment [BA99]; fixed (exp.) degree sequence [BBK72, ACL00]
Broad-scale  | Power-law distrib. with sharp cut-off (decay of the tail)   | Movie actor                                                     |
Table 2.2: Complex networks of different scales [ASBS00]
Complex Networks
Complex networks at first appear not to have any particular structure; this is why they are
called complex. For the moment, let us focus on the degree distribution. Amaral et al. [ASBS00]
distinguish three classes of networks based on their degree distribution: single-scale, scale-free,
and broad-scale networks (see Table 2.2).
Single-Scale Networks. Erdős-Rényi random graphs [ER60, Gil59] have been studied
intensively for more than 50 years. There are two common models for undirected graphs with $n$ nodes.
In the $G_{n,p}$ model, each of the $\binom{n}{2}$ edges is in the graph independently at random with probability
$p$. In the $G_{n,M}$ model, all graphs with $n$ nodes and $M$ edges have the same probability. Many
properties of Erdős-Rényi graphs are well understood. For example, the $G_{n,p}$ random graph with
edge probability $p$ proportional to $n^{-1/d}$ (where $d$ denotes an integer) has diameter at most $d + 1$
with high probability [Bol01]. For more results we refer to [Bol01]. Erdős-Rényi graphs serve
as suitable probability distributions for the average-case analysis of many algorithms. However,
graphs with power-law degree distribution are very unlikely in the Erdős-Rényi random graph
distribution. Since many real-world networks do have power-law degree distributions, researchers
also consider other random graph models.
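Both Erdős-Rényi models are straightforward to sample. The sketch below is a direct, quadratic-time illustration and makes no attempt at efficiency for sparse graphs; the function names `sample_gnp` and `sample_gnm` are chosen here for illustration, and the random generator is passed in explicitly so that runs are reproducible.

```python
import itertools
import random


def sample_gnp(n, p, rng):
    """G(n, p): each of the n-choose-2 possible edges is present independently with probability p."""
    return [
        (u, v)
        for u, v in itertools.combinations(range(n), 2)
        if rng.random() < p
    ]


def sample_gnm(n, m, rng):
    """G(n, M): a uniformly random graph with n nodes and exactly m edges."""
    possible = list(itertools.combinations(range(n), 2))
    return rng.sample(possible, m)


rng = random.Random(2024)  # fixed seed, an arbitrary choice
g1 = sample_gnp(100, 0.05, rng)
g2 = sample_gnm(100, 250, rng)
```

With `p = 0.05` and `n = 100`, the expected number of edges in `g1` is $0.05 \cdot \binom{100}{2} = 247.5$, so the two samples have comparable density.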
Scale-Free Networks. The node degree sequence of scale-free graphs obeys a power law [Mit03,
New05, CSN07]. Power-law distributions are referred to as scale-free distributions, since they look
the same on any scale. Mathematically speaking, a power-law degree distribution is defined as
follows: the probability that a node has degree $x$ is proportional to $x^{-\tau}$ [...]
\[ \sum_{x=1}^{\infty} x \cdot \Pr[\deg(v) = x] \approx \int_{1}^{\infty} C x^{-\tau+1} \, dx = \frac{C}{\tau - 2} \tag{2.1} \]
For constant values of C, the expected number of edges is linear, which makes scale-free networks
sparse. The power-law degree sequence is just one important feature of many real-world complex
networks. Another characteristic is that distances are very short. This characteristic is called the
small world effect.
Two broad classes of network models [BS05, Mit03, CF06, TGJ+02] are distinguished based
on the method the graphs are generated with. In pure random graphs, the number of nodes and
the parameters are set at the beginning and then all the edges are generated. These models are
satisfactory to analyze complex networks but they do not explain the reasons for the scale-free
nature of complex networks. In random evolving graphs, the graph is generated by a random
process that adds node by node to the graph and connects the new node at random to the existing
graph. This process can be stopped at any time. For a graph generated by either model, let $n$
denote the number of nodes. The details for the different models vary greatly. Commonalities
other than the power-law degree sequence are that, usually, the diameter is proportional to $\lg n$
and the average distance is proportional to $\lg \lg n$. The goal is to find a model that is both realistic
and easy to work with.
The configuration model [BBK72, RN04b] works as follows: we specify a degree sequence
$\vec{d} := (d_1, \ldots, d_n)$. The edges are generated such that all graphs $G = (V, E)$ with
$\forall v_i \in V : \deg(v_i) = d_i$ have the same probability. Whereas in the Erdős-Rényi random graph model all edges
were independent, the edges in the configuration model are dependent. Once an edge between two
vertices $v_i, v_j$ has been assigned, the potential of both $v_i$ and $v_j$ to acquire more edges decreases by
1. Note that, for a degree sequence $\vec{d}$ to be realizable as a graph, there are some conditions on $\vec{d}$,
such as that $\sum_{i=1}^{n} d_i$ must be even, and others [EG60, Hak62].
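Whether a degree sequence is realizable by a simple graph can be tested directly. The sketch below uses the classical Erdős-Gallai characterization, which is assumed here to be among the conditions the citations above refer to: after sorting the degrees in non-increasing order, the sum must be even and, for every $k$, $\sum_{i \le k} d_i \le k(k-1) + \sum_{i > k} \min(d_i, k)$. The function name is illustrative.

```python
def is_graphical(degree_sequence):
    """Erdős-Gallai test: can the sequence be realized by a simple undirected graph?"""
    d = sorted(degree_sequence, reverse=True)
    n = len(d)
    # Necessary conditions: non-negative degrees and an even degree sum.
    if any(x < 0 for x in d) or sum(d) % 2 != 0:
        return False
    for k in range(1, n + 1):
        lhs = sum(d[:k])
        rhs = k * (k - 1) + sum(min(x, k) for x in d[k:])
        if lhs > rhs:
            return False
    return True
```

For example, `[3, 3, 3, 3]` is realized by the complete graph on four vertices, whereas `[3, 3, 1, 1]` has an even sum but still fails the condition for $k = 2$. The direct evaluation above takes quadratic time, which is sufficient for illustration.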
In the fixed expected degree random graph model [ACL00, CL02, NR06], edges are
independent. We again specify a sequence $\vec{w} := (w_1, \ldots, w_n)$. For this model, $w_i$ is interpreted as the
expected degree of $v_i$. Each edge $\{v_i, v_j\}$ is in the graph independently at random with probability
$\frac{w_i w_j}{\sum_k w_k}$. Note that it is required to restrict $\vec{w}$ such that $\forall i, j : w_i w_j \le \sum_k w_k$. In Chapter 5, we use
an adapted version of this model to analyze distance oracles for random power-law graphs.
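The fixed expected degree model can be sampled edge by edge from the weight sequence. The following is a minimal sketch under the assumptions that self-loops are omitted (so the expected degree of $v_i$ is slightly below $w_i$) and that the restriction on $\vec{w}$ is enforced through the largest weight; the function names are illustrative.

```python
import itertools
import random


def edge_probability(w, i, j):
    """Probability w_i * w_j / sum_k w_k of the edge {v_i, v_j}."""
    return w[i] * w[j] / sum(w)


def sample_expected_degree_graph(w, rng):
    """Sample a graph in which edge {i, j} appears independently with probability w_i w_j / sum(w)."""
    total = sum(w)
    # max(w)^2 <= total implies w_i * w_j <= total for all i, j.
    if w and max(w) ** 2 > total:
        raise ValueError("weights must satisfy w_i * w_j <= sum_k w_k")
    return [
        (i, j)
        for i, j in itertools.combinations(range(len(w)), 2)
        if rng.random() < w[i] * w[j] / total
    ]


weights = [2.0, 2.0, 2.0, 2.0]  # hypothetical expected degrees
graph = sample_expected_degree_graph(weights, random.Random(7))
```

With all weights equal to 2 on four nodes, every edge has probability $4/8 = 1/2$. In contrast to the configuration model, the realized degrees match the prescribed values only in expectation.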
In the re-wired lattice model [BMST97, NW99, Kle00], each vertex is connected to all of
its neighbors within constant distance by an undirected edge. In addition, a number of shortcuts
(long-range links) are added between randomly chosen pairs of nodes. In a variant, instead of
adding edges, some of the connections to neighbors are removed and re-wired to random nodes.
In affiliation networks [LS09], we start with a bipartite graph. The nodeset is divided into
actor nodes and affiliation nodes. Each node representing an actor is connected to certain nodes
representing affiliations such as companies, orchestras, and sports clubs. Then, the bipartite graph
is unfolded into a social network, which consists of actor nodes only; edges are generated such
that two actors connected by a path of length 2 in the affiliation graph get connected in the social
network.
In the preferential attachment model [BA99, DMS00], the network is growing in time in such
a way that new vertices are more likely to be connected to vertices that already have a high degree.
A new vertex connects to a node with degree $d_i$ with probability $\frac{d_i}{\sum_k d_k}$. This model offers a
convincing explanation for the emergence of scale-free networks. The copy model [KRR+00] is
in some sense a variant of the preferential attachment model, where a new node, upon generation,
copies a fraction of the links of a random node.
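The attachment rule can be implemented without computing any probabilities explicitly. The sketch below is a simplified variant chosen for illustration, in which the process starts from a single edge and every new vertex adds exactly one edge, so the result is a tree; the cited models allow more general choices. The trick is to keep a list in which every vertex appears once per incident edge, so that a uniform draw from this list selects vertex $i$ with probability $d_i / \sum_k d_k$.

```python
import random


def preferential_attachment_tree(n, rng):
    """Grow a tree on n >= 2 vertices; each new vertex attaches to an old one chosen proportionally to degree."""
    if n < 2:
        raise ValueError("need at least two vertices")
    edges = [(0, 1)]
    endpoints = [0, 1]  # vertex i occurs deg(i) times in this list
    for new in range(2, n):
        target = rng.choice(endpoints)  # picked with probability d_i / sum_k d_k
        edges.append((new, target))
        endpoints.extend((new, target))
    return edges


tree = preferential_attachment_tree(1000, random.Random(1))
```

Since the process can be stopped after any step, a prefix of `tree` is itself a sample of the model for a smaller number of vertices.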
2.2 Graph Algorithms
Graph algorithms is a research field at the intersection between graph theory and computer science.
We are interested in (efficiently) computing certain properties of graphs. An algorithm, given a
graph $G = (V, E)$ and optional inputs such as subsets of the nodeset or edgeset or constants,
decides or computes certain properties of $G$. Decision problems are those questions for which
the answer is "yes" or "no". Optimization problems are the questions for which the solution is a
subgraph, potentially ordered, minimizing or maximizing an objective function.
The efficiency of algorithms is of integral interest.
"For practical purposes computational details are vital. However, my purpose is only
to show as attractively as I can that there is an efficient algorithm. According to the
dictionary, 'efficient' means 'adequate in operation or performance.' This is roughly
the meaning I want."
Jack Edmonds [Edm65]
Suppose that we want to evaluate an algorithm for a problem. Objective evaluation criteria
include the quality of the result (correctness, exactness, approximation quality), the computing
time (also: time complexity), measured in terms of the input size, and the memory consumption
(also: space complexity), again measured relative to the input size. For graph algorithms, the
input consists of a graph G = (V, E) and optional parameters. Unless stated otherwise, n := |V |
denotes the number of nodes and m := |E| denotes the number of edges. An important aspect
in the evaluation of an algorithm is its scalability. For theoretical work, scalability means the
asymptotic behavior of an algorithm in terms of $n$ and $m$. Let $f_A(n, m)$ denote the least upper
bound on the cost of applying algorithm $A$ to graphs with $n$ nodes and $m$ edges. A constant
number of additional instructions every now and then does not influence the running time too
much. It is often also convenient not to analyze these constant overheads in great detail. This is
captured in the Bachmann-Landau [Bac94, Lan09] O-notation [CLRS01, Chapter 3]. We say that
the running time of an algorithm is (in) $O(g(n, m))$, meaning that the actual running time as a
function $f_A(n, m)$ increases, or grows, at most proportionally to $g(n, m)$, ignoring the exact value
$f_A(n, m)$. The precise definitions of the O-notation are listed in Table 2.3.
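Notation                    | Definition
$f(n) \in O(g(n))$          | $\exists n_0, c_1, c_2$ such that $\forall n > n_0 : c_1 g(n) + c_2 \ge f(n)$
$f(n) \in o(g(n))$          | $\forall c_1 > 0 \; \exists n_0, c_2$ such that $\forall n > n_0 : c_1 g(n) + c_2 > f(n)$
$f(n) \in \Omega(g(n))$     | $g(n) \in O(f(n))$
$f(n) \in \omega(g(n))$     | $g(n) \in o(f(n))$
$f(n) \in \Theta(g(n))$     | $f(n) \in O(g(n)) \wedge f(n) \in \Omega(g(n))$
$f(n) \in \tilde{O}(g(n))$  | $\exists c$ such that $f(n) \in O(g(n) \lg^{c} n)$
Table 2.3: O-notation for the asymptotic behavior of functions $f, g$.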
Computational problems are classified according to their difficulty, which is defined by the
existence of an algorithm running in a certain time. The class of decision problems for which
there exists an algorithm that outputs the correct answer in time $O(\mathrm{poly}(n))$ is called P. The
class of decision problems for which there exists an algorithm that, given some evidence, verifies
the correct answer in time $O(\mathrm{poly}(n))$ is called NP. The complexity classes P and NP are only
mentioned in some parts of the chapter on related work, but profound knowledge on the P vs. NP
problem is not essential to understand this thesis. For more on computational complexity theory,
we refer to [GJ90].
2.2.1 Computational Models
Time and space complexities are measured differently depending on the machine model [vEB90].
The traditional model of computation consists of a Turing machine [Tur37], which is a state
machine operating on an infinite tape divided into cells. In the word RAM model [CR73] with integral
word length $w \ge 1$, the contents of all memory cells are integers in the range $\{0, \ldots, 2^w - 1\}$ and
operations such as addition, subtraction, bit shifts, and bit-wise boolean operations are assumed
to be executable in constant time (analogous to programming languages such as C). This model
often allows for fast algorithms (faster compared to addition/comparison models by a logarithmic
factor) if for example the edge weights are integers and the largest integer weight $W$ satisfies
$W \le 2^w - 1$. This model is often used for upper bounds on the time complexity of a specific
algorithm.
For lower bounds on the time complexity of any algorithm, the related cell-probe model is
very common.
Definition 25 (Cell-probe model [Yao81, Mil99]). In the cell-probe model, a memory cell has
$w$ bits (also called word length) and the space of a data structure is measured as the number of
cells it occupies, denoted by $S$. The query time is measured by the worst-case number of cells $t$
that a query reads.
Both for the word RAM and the cell-probe model, the most typical values for the word length
are $w = \lg n$ or $w = \mathrm{polylog}(n) = \mathrm{poly}(\lg n)$, but larger (or smaller) values may be interesting
as well.
For problems involving huge data sets, often I/O is the bottleneck of computations. The data
does not fit into main memory; instead it is read block by block from disk. To analyze external
memory algorithms [AV88, VS94], often a cell-probe-like model is used for upper bounds as well.
Operations may read a block of size $B$ into main memory of size $M$.
2.2.2 Approximation Algorithms
For certain optimization problems, the optimal solution with respect to an objective function is
hard to compute. Often the computation of a close-to-optimal solution can be done much faster.
An approximate solution may still be acceptable if the quality of the solution is sufficiently good.
The quality is measured as follows. Let OPT denote the value of an optimal solution (as deemed
by the objective function) and let ALG denote the value of the solution the algorithm returned.
We say that the algorithm has approximation quality $(\alpha, \beta)$ if the inequalities (2.2) hold for all
allowed inputs. The approximation quality is thus a worst-case measure.
\[ \mathrm{OPT} \le \mathrm{ALG} \le \alpha \cdot \mathrm{OPT} + \beta \tag{2.2} \]
(Note that this definition is tailored to minimization problems and in particular to the shortest path
problem.)
In the case of distances, this approximation quality is also called stretch or distortion. Let
$\hat{d}(u, v)$ denote the result of the approximation algorithm when asked for the distance between $u$
and $v$. For an algorithm computing $(\alpha, \beta)$-approximate distances, for all $u, v \in V$, the result must
satisfy
\[ d_G(u, v) \le \hat{d}(u, v) \le \alpha \cdot d_G(u, v) + \beta. \]
(The combination of multiplicative and additive stretch only makes sense for multiple node
pairs. For a single path we consider either its multiplicative or its additive stretch (Definition 14).)
2.3 Common Techniques
This section consists of a non-exhaustive list of techniques that are commonly used to solve prob-
lems related to shortest paths.
2.3.1 Spanners and Emulators
For most graph algorithms, the performance depends on the number of nodes and edges of the
input graph. The running time can potentially be reduced by altering the graph, in particular by
adding or deleting edges. After altering the graph, we wish that the answer to the question we ask
concerning the graph (in our case the distances between nodes) does not change by much. When
edges are deleted only, we obtain a subgraph, which, if it preserves distances to a certain extent, is
called a spanner [PS89, ADD+93, Coh98, DHZ00, Kor01, BCE03, EP04, TZ06, Pet07, Elk08a,
Elk08b, BS08].
Definition 26 ((Graph) Spanner). An $(\alpha, \beta)$-spanner of a graph $G = (V, E)$ is a subgraph
$G' = (V, E')$ that approximately preserves distances such that for all pairs of nodes $(u, v) \in V \times V$,
\[ d_G(u, v) \le d_{G'}(u, v) \le \alpha \cdot d_G(u, v) + \beta. \]
We say that this spanner has stretch (or distortion) $(\alpha, \beta)$.
Spanners are useful in various applications such as constructing routing tables, where the edges
of a subgraph are used to route messages, and computing approximate shortest paths.
The more edges we delete, the smaller the input size for the next algorithm and the faster the
running time. The number of edges we can delete before some distances change substantially
often depends on the girth of the graph. Recall that the girth of a graph is the length of its shortest
cycle (Definition 17). Intuitively, if there are short cycles, we may delete an edge of a cycle, since
for a shortest path using this edge, there is an alternative, reasonably short path using the cycle. If
there are no short cycles, there is no alternative short path; the redundancy is low and the deletion
of an edge may cause a large distortion.
Let $m_g(n)$ denote the maximum number of edges in a graph with $n$ vertices and girth at least $g$.
Theorem 1 (Althöfer et al. [ADD+93]). For any integer $\alpha \ge 3$, every graph $G = (V, E)$ on
$|V| = n$ vertices has a spanner with stretch $(\alpha, 0)$ and at most $m_{\alpha+2}(n)$ edges.
Their construction uses a greedy algorithm (similar to Kruskal's algorithm to construct a
minimum spanning tree [CLRS01, p. 568]). The upper bound is actually tight. The corresponding
lower bound is not very difficult: in a graph with girth $g = \alpha + 2$, removing any edge increases
the distance between its endpoints from 1 to at least $\alpha + 1$. The only multiplicative $(\alpha, 0)$-spanner
is the graph itself.
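The greedy construction can be sketched in a few lines. The version below is an illustration of the general greedy idea and not a transcription of the construction in [ADD+93]: edges are scanned by non-decreasing weight, and an edge is kept only if the spanner built so far does not already connect its endpoints within $\alpha$ times the edge weight. The names and the vertex numbering `0, ..., n-1` are assumptions of this sketch.

```python
import heapq
import math


def bounded_distance(adj, s, t, limit):
    """Distance from s to t in adj, or infinity if it exceeds limit."""
    dist = {s: 0.0}
    heap = [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == t:
            return d
        if d > dist.get(u, math.inf):
            continue  # outdated heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd <= limit and nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.inf


def greedy_spanner(n, edges, alpha):
    """Return the edges of a spanner with multiplicative stretch alpha."""
    adj = {v: [] for v in range(n)}
    kept = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        # Keep the edge only if the current spanner has no short detour.
        if bounded_distance(adj, u, v, alpha * w) > alpha * w:
            adj[u].append((v, w))
            adj[v].append((u, w))
            kept.append((u, v, w))
    return kept


# Hypothetical example: a unit-weight triangle.
triangle = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)]
```

On the triangle with stretch 2, the third edge is dropped because its endpoints are already connected by a path of length 2; with stretch 1, all three edges must be kept. Every discarded edge has a detour of length at most $\alpha$ times its weight, which is what bounds the stretch of the result.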
Since no edges can be removed from graphs with large girth without significantly altering
distances, graphs with many edges (dense graphs) and large girth are important worst-case instances
for spanner and distance oracle constructions. In extremal combinatorics, determining $m_g(n)$ is a
research field of its own [EJ08, Big98, Hoo02]. For example, for $g = 4$ the question is the
following: how many edges can be added to the empty graph on $n$ nodes without closing a triangle?
Intuitively, the more edges that were already added, the harder it gets to add another one. For $g = 4$,
the complete bipartite graph is asymptotically optimal. For general $g$, the construction of the graph
is much more involved; for some $g$ the value of $m_g(n)$ is not even known.⁵ Erdős' girth
conjecture [Erd64, ES63] predicts that, for an integer $k \ge 1$, $m_{2k+1}(n) = m_{2k+2}(n) = \Omega(n^{1+1/k})$. The
corresponding upper bound is known to be tight [AHL02] and the conjectured lower bound is a
theorem for certain values of $k$ (1, 2, 3, and 5); for an overview, see Table 2.4.
Girth    | $|E|$                                                  | Reference
4        | $\Theta(n^{2})$                                        | complete bipartite graphs
6        | $\Theta(n^{3/2})$                                      | [Rei58, ERS66, Bro66, Wen91]
8        | $\Theta(n^{4/3})$                                      | [Tit59, Ben66, Wen91]
10       | $O(n^{5/4})$, $\Omega(n^{6/5})$                        | [Tit59, Ben66, LU93]
12       | $\Theta(n^{6/5})$                                      | [Tit59, Ben66, Wen91, LU93]
14       | $O(n^{7/6})$, $\Omega(n^{9/8})$                        | [LUW95, LUW96]
16       | $O(n^{8/7})$, $\Omega(n^{10/9})$                       | [WU93, LUW95]
$4r + 2$ | $O(n^{(2r+1)/(2r)})$, $\Omega(n^{1 + 1/(3r-1)})$       | [LUW95, LUW96]
$4r$     | $O(n^{2r/(2r-1)})$, $\Omega(n^{1 + 1/(3r-3)})$         | [LUW95, LUW96]
Table 2.4: Results on Erdős' girth conjecture, overview from [TZ05, Table II]. Maximum size of
the edge set $E$ for a graph $G = (V, E)$ with $|V| = n$ nodes and given girth (length of a shortest
cycle, Def. 17).
For multiplicative spanners, the tradeoff between space and stretch is well understood. Not
so for spanners with additive stretch. Aingworth et al. [ACIM99] found a $(1, 2)$-spanner with
$O(n^{3/2})$ edges and Baswana et al. [BKMP05] found a $(1, 6)$-spanner with $O(n^{4/3})$ edges.
Woodruff [Woo06] gives a strong lower bound for additive graph spanners independent of Erdős'
girth conjecture. He proves that for an integer $k = o\left(\frac{\lg n}{\lg \lg n}\right)$, there are graphs for which any
$(1, 2k - 1)$-spanner has $\Omega\left(n^{1+1/k}/k\right)$ edges.
Recently [TZ06, Pet07], spanners with non-constant additive stretch $\beta(\cdot)$ have come under
investigation. For these spanners, $\beta$ is required to be sublinear in $d(u, v)$. We refer to the overview by
Pettie [Pet07, Fig. 2]. For a result on spanners for directed graphs, see [RTZ08].
Spanners are subgraphs. If we just care about distances and not about the actual paths, the
subgraph requirement may be too restrictive. Emulators are graphs restricted to the same nodeset
but not to the same edgeset.
⁵ For directed graphs, the problem seems to be even more involved [CH78, CS83].
Definition 27 (Emulator [DHZ00]). An edge-weighted graph $F = (V, E')$ $(\alpha, \beta)$-emulates a
graph $G = (V, E)$ if for every $u, v \in V$
\[ d_G(u, v) \le d_F(u, v) \le \alpha \cdot d_G(u, v) + \beta. \]
$F$ is called an $(\alpha, \beta)$-emulator.
Consequently, we say that such an emulator has stretch $(\alpha, \beta)$. Note that, for both spanners
and emulators, distances may only increase. Dor et al. [DHZ00] give a $(1, 4)$-emulator with
$O(n^{4/3})$ edges. Thorup and Zwick [TZ06], for an arbitrary integer $k \ge 2$, construct emulators
with $O(k n^{1 + 1/(2^k - 1)})$ edges (in expectation), such that for pairs with distance $\delta$, the distance in
the emulator is at most $\delta + O(k \delta^{1 - 1/(k-1)})$. The lower bounds by Woodruff [Woo06] can be
extended to emulators as well.
2.3.2 Distance Labelings and Metric Embeddings
The objective of distance labelings is to assign each node of a graph a label such that the distance
(or an approximation thereof) between two nodes can be computed based on the corresponding
labels only [Pel00]. Such labelings are used in the real world to a certain extent. For example,
postal addresses include countries, cities, and street names, using which we can get an estimate of
how close two addresses are.⁶ The idea is formalized in the following definition.
Definition 28 (Distance Labeling [Pel00, GPPR04]). An $(\alpha, \beta)$-approximate distance labeling
scheme for a graph $G = (V, E)$ is an assignment of labels to nodes $L : V \to \{0, 1\}^*$ such that
the estimated distance $\hat{d}(u, v)$ computed by the scheme from the labels $L(u)$ and $L(v)$ satisfies
\[ d_G(u, v) \le \hat{d}(L(u), L(v)) \le \alpha \cdot d_G(u, v) + \beta. \]
Distance labeling schemes with short labels are derivable for highly regular graph classes, such
as rings, meshes, and hypercubes. An interesting question is whether more general graph classes
can also be labeled in this fashion. For general graphs, it is known that any distance labeling
scheme must label some graphs with $n$ vertices with labels of size $\Omega(n)$ [GP03b]. Even for planar
graphs, some nodes must have labels of size $\Omega(n^{1/3})$ [GPPR04].
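To illustrate why highly regular classes admit short labels, here is a minimal sketch for the hypercube. It assumes that the nodes of the $k$-dimensional hypercube are numbered $0, \ldots, 2^k - 1$ such that adjacent nodes differ in exactly one bit; the function names `label` and `estimate` are hypothetical and chosen for this example only. Each label has $k = \lg n$ bits, and the distance is recovered exactly (stretch $(1, 0)$) as the Hamming distance of the two labels.

```python
def label(node: int, k: int) -> str:
    """Return the k-bit label of a hypercube node (assumed numbering: bit strings)."""
    return format(node, "0{}b".format(k))


def estimate(label_u: str, label_v: str) -> int:
    """Compute the distance from the two labels only: the Hamming distance."""
    return sum(a != b for a, b in zip(label_u, label_v))
```

The query never looks at the graph itself, which is the defining property of a labeling scheme.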
If we restrict the function $\hat{d}(L(u), L(v))$ to $\ell_p$ norms, we obtain embeddings [IM04, Lin02].
The idea is to map a metric space (here: the shortest path metric of a graph) into a simpler one,
in such a way that the distances between points do not change too much. More formally, an
embedding of a (weighted) graph $G = (V, E, w)$ with distance function $d$ into a target metric
space $(V', d')$ (where $d'$ [...]. An
embedding with good distortion yields good approximation algorithms [Ind01].
The Johnson-Lindenstrauss Lemma can be used to reduce the number of dimensions of a
metric space without introducing a large error. It states that, for any $\ell_2^{\mathrm{DIM}}$, a random linear mapping
into $\ell_2^{\mathrm{dim}}$ preserves distances up to a factor of $1 \pm \epsilon$ with probability at least
$1 - e^{-\epsilon^2 \mathrm{dim}}$. More precisely:
6
Latitude and longitude coordinates would of course be more precise.
Lemma 2 (Johnson and Lindenstrauss [JL84]). For any $0 < \epsilon < 1$ and any integer $n$, let dim be
a positive integer such that
$\mathrm{dim} \ge 4 (\epsilon^2/2 - \epsilon^3/3)^{-1} \ln n$.
Then for any set $V$ of $n$ points in $\mathbb{R}^{\mathrm{DIM}}$ there is a map $f : \mathbb{R}^{\mathrm{DIM}} \to \mathbb{R}^{\mathrm{dim}}$ such that for all $u, v \in V$,
$(1 - \epsilon) \|u - v\|^2 \le \|f(u) - f(v)\|^2 \le (1 + \epsilon) \|u - v\|^2$.
The map $f$ can be found in polynomial time.
We can thus project any $n$-dimensional Euclidean space to an $O(\lg n / \epsilon^2)$-dimensional space
such that the distance between any two points changes by a factor of at most $1 \pm \epsilon$. Such projections are
almost best possible: at least $\Omega(\lg n / (\epsilon^2 \lg(1/\epsilon)))$ dimensions are needed [Alo03, Theorem 9.3].
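A random projection of this kind can be sketched in a few lines. The following is a minimal illustration, not the construction of [JL84]: it assumes a matrix with independent Gaussian entries scaled by $1/\sqrt{\mathrm{dim}}$, which is one common way to realize such a map with high probability. The names `jl_dimension` and `random_projection` are hypothetical.

```python
import math
import random


def jl_dimension(n: int, eps: float) -> int:
    """Smallest integer dim with dim >= 4 (eps^2/2 - eps^3/3)^(-1) ln n (Lemma 2)."""
    return math.ceil(4.0 * math.log(n) / (eps ** 2 / 2.0 - eps ** 3 / 3.0))


def random_projection(points, dim, seed=0):
    """Map each point to `dim` coordinates using a random Gaussian matrix.

    Assumption of this sketch: entries are N(0, 1) scaled by 1/sqrt(dim), so
    squared distances are preserved in expectation.
    """
    rng = random.Random(seed)
    big_dim = len(points[0])
    scale = 1.0 / math.sqrt(dim)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(big_dim)]
              for _ in range(dim)]
    return [[sum(row[j] * p[j] for j in range(big_dim)) for row in matrix]
            for p in points]
```

Note that the target dimension depends only on the number of points and on $\epsilon$, not on the original dimension.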
We would like to generalize this result to all metric spaces, in particular to graphs and the
shortest path metric.
Theorem 3 (Bourgain [Bou85]). For every $n$-point metric space there exists an embedding into
Euclidean space ($\ell_2$) with distortion $(O(\lg n), 0)$.
Also, this bound is tight [LLR95]. For the special case of planar graphs, the multiplicative
distortion is $O(\sqrt{\lg n})$ [...].
Theorem 4 (Lipton and Tarjan's Planar Separator Theorem [LT80]). The $n$ vertices of a planar
graph can be partitioned into three sets $A$, $B$, $S$ such that
- no edge connects a vertex in $A$ with a vertex in $B$,
- $A$ and $B$ each contain at most $n/2$ vertices, and
- $S$ contains at most $O(\sqrt{n})$ vertices.
Results for planar graphs often extend to other classes of graphs. The separator theorem has
been generalized to bounded-genus graphs by Gilbert et al. [GHT84] and to minor-free graphs by
Alon et al. [AST90].
Road networks, although non-planar (the networks may have many bridges and tunnels), often
also have structural properties that are similar to the ones for planar graphs. Most importantly,
they appear to have small separators as well [EG08].
Edge Orientability
The edge set of a simple undirected planar graph $G = (V, E)$ can be oriented (denoted by
$\vec{G} = (V, \vec{E})$) such that the out-degree of every vertex $v$ satisfies $\deg^+_{\vec{G}}(v) = O(1)$. The upper bound
on the degree can be chosen to be 3 [CE91]. Note that the nodes' in-degrees $\deg^-_{\vec{G}}(v)$ remain
unbounded.
Using this orientation, adjacency queries can be answered in constant time by inspecting both
nodes. This technique can be seen as a labeling [Bre66, KNR92]: each node gets a label of
size O(lg n) bits such that the adjacency of two nodes can be computed by looking at the two
corresponding labels only. The result on edge orientability extends to minor-free graphs [GL07].
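The orientation idea can be sketched as follows. This is a simplified illustration and not the algorithm of [CE91]: it assumes a greedy peeling order (repeatedly remove a vertex of minimum remaining degree), which for planar graphs yields out-degree at most 5 rather than 3, because every planar graph has a vertex of degree at most 5. The function names are hypothetical, and the minimum is found by a linear scan for brevity.

```python
def orient_by_peeling(n, edges):
    """Orient every edge away from the endpoint that is peeled off first.

    Returns a dict mapping each vertex to the set of its out-neighbors.
    """
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    out = {v: set() for v in range(n)}
    remaining = set(range(n))
    while remaining:
        # Vertex of minimum remaining degree (at most 5 in a planar graph).
        v = min(remaining, key=lambda x: len(adj[x]))
        out[v] = set(adj[v])  # v owns all edges to its remaining neighbors
        for u in adj[v]:
            adj[u].discard(v)
        adj[v] = set()
        remaining.remove(v)
    return out


def adjacent(out, u, v):
    """Adjacency test that inspects only the out-lists of the two nodes."""
    return v in out[u] or u in out[v]
```

Since each out-list has constant size, storing it with the node gives a label of $O(\lg n)$ bits, and `adjacent` inspects only the two labels.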
2.3.4 Well-Separated Pairs
The well-separated pair decomposition by Callahan and Kosaraju [CK95] is important for algorithms
operating on point sets in $\mathbb{R}^{\mathrm{dim}}$. The corresponding definition for graphs is as follows. For
a graph $G = (V, E)$ and a set $A \subseteq V$, let $G(A)$ denote the subgraph induced by $A$.
Definition 29 (WSPD [CK95]). For a graph $G = (V, E)$, two sets of nodes $A, B \subseteq V$ are
$\epsilon$-separated if $\max\{\mathrm{diam}(G(A)), \mathrm{diam}(G(B))\} \le \epsilon \cdot d_G(A, B)$. For a parameter $\epsilon > 0$, a
well-separated pair decomposition of a graph $G$ is a set of $s$ pairs $W = \{\{A_1, B_1\}, \ldots, \{A_s, B_s\}\}$ such
that for all pairs $u, v \in V$
$(u, v) \in \bigcup_{i=1}^{s} (A_i \times B_i) \cup (B_i \times A_i)$,
and that for all $i \in [s]$
- $A_i, B_i \subseteq V$,
- $A_i \cap B_i = \emptyset$, and
- $A_i$ and $B_i$ are $\epsilon$-separated.
In other words, for any pair of nodes $u, v \in V$, there is exactly one pair $\{A_i, B_i\} \in W$ such
that $u \in A_i$ and $v \in B_i$ (or vice versa).
2.4 Shortest Paths
Shortest paths should be of inherent interest to any lazy creature living on a sphere
like the surface of planet Earth.
Mikkel Thorup [Tho04a]
Even in very primitive (even animal) societies, finding short paths and searching (for
instance, for food) is essential.
Alexander Schrijver [Sch05a, p. 1]
The distance between two nodes $s, t \in V$ of a graph $G = (V, E)$ is defined by the length
of a shortest path (Definitions 8 and 10). The objective is to find the shortest possible path that
connects the source and the target.
In this thesis, we restrict ourselves to unconstrained, (approximate) shortest paths in static,
discrete graphs with positive edge weights. For geometric shortest paths we refer to [Mit97, Che96]
(for motion planning see [Sha97b]). For paths on surfaces and meshes we refer to [MMP87,
ADG+06, VS09]. For the dynamic version of the problem we refer to [DI08, Ita08]. For
time-dependent weights, see [CH66, OR90, KS93a]. For SSSP algorithms on graphs with negative edge
weights, we refer to [GT89, Gol95, FR06].
We classify [DP84] the shortest path algorithms according to the problem type (single source,
all pairs, one pair), the graph class, and the techniques. We also refer to Schrijver's book [Sch03,
Chapter 7], Pettie's thesis [Pet03], and the surveys by Zwick [Zwi01] and Sen [Sen09]. The review
in this section is made with best efforts; however, the list of methods, algorithms, and techniques
is far from complete.
2.4.1 Single Source Shortest Path (SSSP) Algorithms
For a brief overview, we refer to the Encyclopedia of Algorithms [Pet08b]. For a comprehensive
historical overview, we refer to Schrijver [Sch03, Section 7.5b]. The following outline is partially
based on Ahuja, Magnanti, and Orlin's book [AMO93]. Historical information is from [Dre69,
GM77, DP84, Sch05a].
The Single Source Shortest Path (SSSP) problem asks for a shortest path from one node $s \in V$
(called the source) to all other nodes in $V \setminus \{s\}$. In particular, after running a single source shortest
path algorithm, we know the distance from the source to all other nodes. After termination, each
node $u \in V$ is labeled with $d(s, u)$. At the beginning $d(s, s) = 0$ and all other labels are set
to $\infty$. SSSP algorithms iterate and assign tentative distance labels $\hat{d}(s, u)$ (upper bounds on the
true distance, $d(s, u) \le \hat{d}(s, u)$) at each step. We distinguish label-setting, label-correcting, and
other algorithms. In label-setting algorithms, each iteration produces one optimal label. This
property does not hold for label-correcting algorithms, for which all labels are optimal after the
final iteration only. Label-setting algorithms are restricted to non-negative lengths, but they often
have a better worst-case time complexity than label-correcting algorithms. Also, if we are
interested in one particular target node $t$, it is possible to stop the algorithm as soon as the distance
label for $t$ has been produced.
Label-Setting Algorithms
The most famous label-setting algorithm is Dijkstra's algorithm [Dij59], see also [Sch03,
Section 7.2] or any algorithms textbook [CLRS01, p. 595] of your choice. The original
implementation runs in time $O(n^2)$. According to an article on the history of combinatorial
optimization [Sch05a], the same or a similar algorithm has been proposed independently by Leyzorek et
al. [LGJ+57], Dantzig [Dan60], and Whiting and Hillier [WH60]; but it is known as Dijkstra's
algorithm. The algorithm starts a search at the source. At each step, among the nodes with tentative
labels, the algorithm selects the node $u$ with the shortest tentative distance $\hat{d}(s, u)$ and deems the
label of $u$ as permanent. Then, the algorithm updates the labels of all the neighbors of $u$ if
necessary. That is, if a neighbor $v \in \Gamma(u)$ has a tentative distance $\hat{d}(s, v)$ larger than $d(s, u) + w(u, v)$,
its label is decreased. The algorithm does not need to backtrack: once a label is finalized, it is
correct and it can never decrease. The computational bottleneck of the algorithm is the node selection:
we need to efficiently find the node $u$ with the shortest tentative distance.
For this node selection step, sorting may be used [Joh72]. Dial [Dia69] proposes to maintain
sorted distances by using buckets. Let the weight function $w : E \to \mathbb{N}^+$ be restricted to integers
and let $W$ denote the largest integer weight. Dial's implementation maintains buckets, which could
require a large amount of memory; the running time is $O(m + nW)$. Further improvements were
made by Wagner [Wag76], Dial et al. [DGKK79], and Denardo and Fox [DF79].
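The bucket idea can be sketched as follows. This is a minimal illustration under stated assumptions, not Dial's original implementation: it allocates one bucket per possible distance value ($nW + 1$ buckets, which makes the memory issue visible) instead of reusing a circular array of buckets, and the function name `dial_sssp` and the arc-list input format are hypothetical.

```python
def dial_sssp(n, arcs, source, max_weight):
    """Bucket-based label-setting SSSP.

    arcs: list of (u, v, w) with integer weights 1 <= w <= max_weight.
    Returns a list of distances; unreachable nodes keep float('inf').
    """
    adj = [[] for _ in range(n)]
    for u, v, w in arcs:
        adj[u].append((v, w))
    inf = float("inf")
    dist = [inf] * n
    dist[source] = 0
    # Any finite distance is below n * max_weight, so this many buckets suffice.
    buckets = [[] for _ in range(n * max_weight + 1)]
    buckets[0].append(source)
    done = [False] * n
    for d, bucket in enumerate(buckets):  # scan buckets in increasing distance
        for u in bucket:
            if done[u] or dist[u] != d:
                continue  # outdated entry: u was reached with a smaller label
            done[u] = True
            for v, w in adj[u]:
                if d + w < dist[v]:
                    dist[v] = d + w
                    buckets[d + w].append(v)  # always a later bucket (w >= 1)
    return dist
```

Scanning the buckets costs $O(nW)$ and every arc is relaxed once, which matches the $O(m + nW)$ bound stated above.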
Another approach to efficient node selection is to use a priority queue, which is a data structure
we use to manage the nodes [Mur67a]. This data structure supports efficient operations to insert
an element, to retrieve the minimum element, to delete the minimum element, and to decrease
the value of an element in the queue. These are exactly the operations necessary for Dijkstra's
algorithm. Each update of a label involves a queue operation. Using a binary heap as priority
queue, insertions, deletions, and decrease operations can be done in time $O(\lg n)$, which yields
time $O(m \lg n)$ for Dijkstra's algorithm [Wil64]. For very dense graphs with $m = \omega(n^2 / \lg n)$ edges
this running time is actually slower than the $O(n^2)$ running time of the original implementation,
but for sparse graphs the running time decreases significantly. Johnson [Joh77] uses a priority
queue with fixed depth to get running time $O(m)$ for dense graphs ($m = \Omega(n^{1+\epsilon})$ for some
$\epsilon > 0$).
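A binary-heap variant of Dijkstra's algorithm can be sketched as follows. One assumption to note: Python's `heapq` module offers no decrease operation, so this sketch pushes a new entry whenever a label decreases and skips outdated entries when they are popped (often called lazy deletion). The heap may then hold up to $m$ entries, which still gives $O(m \lg n)$ time. The function name and the adjacency-dict input format are hypothetical.

```python
import heapq


def dijkstra(adj, source):
    """Label-setting SSSP with a binary heap.

    adj: dict mapping a node to a list of (neighbor, weight) pairs with
    non-negative weights. Returns a dict of distances to reachable nodes.
    """
    dist = {source: 0}
    settled = set()
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue  # outdated heap entry, label of u is already permanent
        settled.add(u)
        for v, w in adj.get(u, []):
            candidate = d + w
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate  # decrease the tentative label of v
                heapq.heappush(heap, (candidate, v))
    return dist
```

To answer a single point-to-point query, the loop can stop as soon as the target is added to `settled`.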
The faster the operations of the priority queue, the better the overall running time. Using a
$d$-heap, insertions and decrease operations take time $O(\lg_d n)$ and deletions take time $O(d \lg_d n)$,
which yields total running time $O(m \lg_d n + n d \lg_d n)$. The optimal value for the parameter $d$ is
$d = \max\{2, \lceil m/n \rceil\}$, which yields total running time $O(m \lg_d n)$. The Fibonacci heap [FT87]
supports deletions in time $O(\lg n)$ and all the other operations in amortized constant time $O(1)$,
which yields the currently best time bound of $O(m + n \lg n)$. Alternatively, one may use
relaxed heaps [DGST88] or rank-pairing heaps [HST09], which are data structures with the same
performance guarantees.
If we restrict the range of the edge weight function $w(\cdot)$ to integers (or floats), better bounds
are possible by using special priority queues designed for the word RAM model (as defined in
Section 2.2.1). For an overview of the running times, see Table 2.5.
Component Hierarchy Algorithms
Sorting and priority queues are strongly related [Tho07]. For sorting, there is an
information-theoretic time lower bound of $\Omega(n \lg n)$. To circumvent this bound, sorting must be avoided. An
SSSP algorithm is not actually required to sort the distances to all nodes. Indeed, the sorting
bottleneck can be avoided. Thorup [Tho99, Tho00a] gives an $O(m)$ algorithm for undirected
graphs with integer or floating point weights. He first constructs a component hierarchy (in linear
time) and then revisits the nodes. Although the algorithm is theoretically best-possible, it appears
to be hard to implement and not very efficient in practice [AI00].
Hagerup [Hag00b] generalizes the idea of the component hierarchy to directed graphs, for
which his algorithm (using an analogue of minimum spanning trees for directed graphs) runs in
time $O(m \lg \lg W)$. Actually, constructing the component hierarchy itself takes this time; an SSSP
search using the hierarchy requires time $O(m + n \lg \lg n)$.
Pettie and Ramachandran [PR02] generalize the component hierarchy to real weights. For
real weights and undirected graphs, if the ratio of any two edge weights is polynomial in $n$, their
algorithm runs in time $O(m + n \lg \lg n)$. Furthermore, the algorithm appears to be efficient in
practice as well [PRS02]. The algorithm's idea is to enforce a certain degree of balance in the
component hierarchy and, when computing the SSSP, to use a specialized priority queue that
takes advantage of this balance [Pet08b]. Unfortunately, it has been found that a hierarchy-based
algorithm can not improve upon Dijkstra's algorithm to run in $o(n \lg n)$ for general weights [PR02,
Pet04]. See also [Pet03, Section 3.6].

Time                                     Reference
$O(m \lg n)$                             [Wil64]
$O(m + n \lg n)$                         [FT87, DGST88]
$O(m \lg \lg W)$                         [Joh82, vEBKZ77]
$O(m + n \sqrt{\lg W})$                  [AMOT90]
$O(m \sqrt{\lg n})$                      [FW93]
$O(m + n \lg n / \lg \lg n)$             [FW94]
$O(m \lg \lg n)$                         [Tho00b]
$O(m + n \lg^{1/2+\epsilon} n)$          [Tho00b]
$O(m + n \sqrt{\lg n \lg \lg n})$        [Ram96b]
$O(m + n \lg^{1/3+\epsilon} n)$          [Ram97]
$O(m + n \lg \lg n)$                     [Tho04b]
$O(m + n \sqrt{\lg \lg n})$              [HT02]
Table 2.5: Running times for different implementations of Dijkstra's algorithm. $W$ denotes the
largest integer weight. The table is in large parts excerpted from [Tho99, p. 364]. The algorithms
in the first two rows work for both the comparison/addition model and the word RAM model. The
analysis of the algorithms shown in row 3 and below only works in the word RAM model.
Note that the component hierarchy approach already captures a certain notion of having pre-
processing and query stages [AI00].
Label-Correcting Algorithms
In label-correcting algorithms, all distance labels are temporary and they are guaranteed to be
exact and optimal in the end only.
7
Label-correcting algorithms can solve more general problems
(they can, for example, find a cycle with negative length) and they are more flexible but they
are usually slower. Some version of the algorithm often solves the All-Pairs Shortest Path (APSP)
problem. It is actually unknown whether computing the result of a point-to-point shortest path
query is easier than computing APSP on arbitrarily weighted graphs.
The generic label-correcting algorithm runs in time $O(\min\{n^2 m W, m 2^n\})$ [For56, Moo59,
FF58, GKP85, GKPS85]. There is an efficient FIFO implementation, which runs in time $O(mn)$
[Bel58] (the well-known Bellman-Ford algorithm [Bel67, For56]; for an outline, see also [CLRS01,
Chapter 24.1]) and a deque implementation running in time $O(\min\{m n W, m 2^n\})$ but being
very efficient in practice [Pap74, Pap80, GP86, Ber93].
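The FIFO variant can be sketched as follows. This is a minimal illustration, not a reference implementation from the cited works: nodes whose label changed wait in a first-in-first-out queue until they are scanned again. As an assumption of this sketch, a node that is scanned more than $n$ times is taken as evidence of a negative cycle reachable from the source. The function name and the arc-list input format are hypothetical.

```python
from collections import deque


def fifo_label_correcting(n, arcs, source):
    """Label-correcting SSSP with a FIFO queue; arc weights may be negative.

    arcs: list of (u, v, w). Returns a list of distances; raises ValueError
    if a negative cycle is reachable from the source.
    """
    adj = [[] for _ in range(n)]
    for u, v, w in arcs:
        adj[u].append((v, w))
    inf = float("inf")
    dist = [inf] * n
    dist[source] = 0
    queue = deque([source])
    in_queue = [False] * n
    in_queue[source] = True
    scans = [0] * n  # how often each node has been taken from the queue
    while queue:
        u = queue.popleft()
        in_queue[u] = False
        scans[u] += 1
        if scans[u] > n:
            raise ValueError("negative cycle reachable from the source")
        for v, w in adj[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w  # correct the label of v
                if not in_queue[v]:
                    queue.append(v)
                    in_queue[v] = True
    return dist
```

In contrast to the label-setting sketches, no label is known to be final before the queue runs empty.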
The Simplex algorithm can also be used to compute shortest paths [Orl83, Akg88, GHK90,
AO92, GJ99, SNGM09].
7 This convergence behavior makes the correctness of an algorithm more difficult to prove.
Restricted Graph Classes, Average-Case Analysis, Expected Running Times, and Practical
Considerations
Due to the importance of the shortest path problem, researchers have been interested in parallel
algorithms [Coh00, PK85, KS93b, KS96, BTZ98, TZ00, CMMS98, SS99, HTB01, MS03, SPZ06],
I/O-efficient algorithms [MZ08, MZ03, BFMZ04, JZ05, MZ06, ALZ07, MO09, Mey09],
algorithms for special graph classes such as planar graphs [Fre87, HKRS97], sparse graphs [GOY76,
Wag76], or Euclidean graphs [SV86], and in the expected performance (average-case analysis) of
algorithms [NMM78, Nos85, Mey03, Gol08, Hag06]. Also, comparisons [Dre69, ZN00] and
experimental evaluation for practical purposes [Hit68, BH69, GW73, Pap74, Gol76, II84, IHI+94,
CGR96, ZN98, CGS99, JMN99, Shi00, Gol01, PRS02, DI04] have been an important part of
investigations.
Point-to-point shortest path problems have been considered early on [Min57, Dan60, Kle64,
Smo75]. In practice, point-to-point shortest paths can be computed faster when bidirectional
search [Nic66, Boo67, Cha67, Mur67b, Poh71, SdC77, dC83] is used. For road networks, if
in addition to the graph the node coordinates are known, A* heuristics [Gel63, Sam63, KHI+86,
Dor67, HNR68, Gel77] based on geometry help to guide the search towards the target [SV86].
Approaches to the problem of computing shortest paths using an additional preprocessed data
structure are reviewed in Chapter 3.
2.4.2 All Pairs Shortest Path (APSP) Algorithms
In the following, we consider the static version of the APSP problem, which is for example sur-
veyed by Schrijver [Sch03] and Pettie [Pet08a]. For the dynamic version of the problem, we refer
to the Encyclopedia of Algorithms [DI08, Ita08]. We do not list parallel algorithms.
Given a graph $G = (V, E)$, the All Pairs Shortest Paths (APSP) problem is to compute a
shortest path, respectively the distance, between all pairs of nodes $s, t \in V$. Since there are
$\binom{n}{2} = \Theta(n^2)$ pairs of nodes for which the distance needs to be output, the time complexity
must be at least $\Omega(n^2)$. Strong lower bounds are hard to prove [YAR77, GYY80]. Kerr [Ker70]
and Nakamori [Nak72] give lower bounds for algorithms with restricted operations. Karger et
al. [KKP93] show that any algorithm based on path comparison must take time $\Omega(mn)$. For
general algorithms, the gap between the lower and the upper bound is huge.
Given an SSSP algorithm with running time $S(n, m, W)$, the straightforward approach to
solve APSP is to run the SSSP algorithm for each node. This approach yields a running time of
$O(n \cdot S(n, m, W))$. For non-negative weights, we may use Dijkstra's algorithm, which yields time
$O(mn + n^2 \lg n)$. The algorithm of Johnson [Joh77] matches this bound also for negative weights
(by using Dijkstra's algorithm [Dij59] and the Bellman-Ford algorithm [Bel67, For56]). For very
sparse graphs with $m = O(n)$ edges, this is already best possible.
In practice, for sparse graphs, an initial graph decomposition step may potentially decrease the
total computation time [LS67, FLM67, KY65, Mil66, Hu68, HT69, Yen71, GKN74, LR82] (see
Figure 2.1). For some graphs we can save a few unnecessary operations by considering essential
edges only [KKP93, McG95]. Also, there are algorithms that are efficient on average [Spi73,
TM80, Blo83, FG85, MT87, MP97, MC09]. In both cases, the worst-case complexity does not
change.
The question is: can we do better for dense graphs (with algorithms not based on path
comparison)? Label-correcting algorithms may help to improve the running time. The famous
Floyd-Warshall algorithm [Flo62, War62] is based on dynamic programming and it runs in time $O(n^3)$
(see also [CLRS01, Section 25.2]). This is optimal if only triple operations on paths and edge
costs are allowed [IN72]. Yen [Yen72, Yen73] and Moffat and Takaoka [MT84] further
investigate the constants hidden in the $O$-notation. Fredman improves the time complexity to
$O(n^3 (\lg \lg n / \lg n)^{1/3})$ [Fre76]. Based on Fredman's algorithm, Takaoka [Tak92] improves the
running time to $O(n^3 \sqrt{\lg \lg n / \lg n})$. Feder and Motwani [FM95] give an $O(mn \lg(n^2/m) / \lg n)$ time
algorithm for weighted graphs and an $O(n^3 / \lg n)$ time algorithm for unweighted graphs. Takaoka
gives an algorithm running in time $O(n^3 \lg \lg n / \lg n)$ [Tak05]. For directed graphs with real edge
lengths, Zwick [Zwi06] gives an $O(n^3 \sqrt{\lg \lg n} / \lg n)$ time algorithm. The algorithm of Chan [Cha07]
runs in time $O(n^3 \lg \lg n / \lg^2 n)$. Blelloch et al. [BVW08] save a $\lg$-factor compared to Feder and
Motwani [FM95]. They give the currently fastest combinatorial algorithm, which requires time
$O(mn \lg(n^2/m) / \lg^2 n)$. For an overview, see Table 2.6.
Figure 2.1: Illustration of the network decomposition technique as originally depicted by Hu and
Torres [HT69, Figure 4].
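For comparison with the bounds listed in this section, the cubic dynamic program behind the Floyd-Warshall algorithm can be sketched as follows. This is a minimal illustration assuming that the graph is given as an arc list and contains no negative cycle; the function name is hypothetical.

```python
def floyd_warshall(n, arcs):
    """All-pairs distances by dynamic programming in O(n^3) time.

    arcs: list of (u, v, w); weights may be negative, but the graph is
    assumed to contain no negative cycle. Returns an n x n distance matrix.
    """
    inf = float("inf")
    dist = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    for u, v, w in arcs:
        dist[u][v] = min(dist[u][v], w)
    # After round k, dist[i][j] is the length of a shortest i-j path whose
    # intermediate nodes all lie in {0, ..., k}.
    for k in range(n):
        for i in range(n):
            d_ik = dist[i][k]
            if d_ik == inf:
                continue
            for j in range(n):
                if d_ik + dist[k][j] < dist[i][j]:
                    dist[i][j] = d_ik + dist[k][j]  # one triple operation
    return dist
```

The innermost update is the triple operation referred to above; the algorithm performs $n^3$ of them.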
In forty years of research, only a logarithmic improvement in the running time has been
achieved. A combinatorial algorithm running in truly sub-cubic time would be a major
breakthrough. Better theoretical bounds exist through an algebraic [Car71] approach: the APSP
problem can be solved using matrix multiplication (APSP for directed graphs is at least as hard as
boolean matrix multiplication). It is also possible to apply algorithms for fast matrix
multiplication [Str69]. We refer to the survey [Tak08] and briefly state the result. Although very important
in theory, the algorithms are unfortunately impractical to implement. Let $M(n)$ denote the time
it takes to multiply two $n \times n$ matrices. The fastest known algorithm is due to Coppersmith
and Winograd [CW90] with M(n) = O(n [...]
Time                                     Reference
$n^3 (\lg \lg n / \lg n)^{1/3}$          [Fre76]
$n^3 (\lg \lg n / \lg n)^{1/2}$          [Tak92]
$mn \lg(n^2/m) / \lg n$                  [FM95]
$n^3 (\lg \lg n / \lg n)^{5/7}$          [Han04]
$n^3 \lg \lg n / \lg n$                  [Tak05]
$n^3 \sqrt{\lg \lg n} / \lg n$           [Zwi06]
$n^3 / \lg n$                            [Han08a]
$n^3 (\lg \lg n / \lg n)^{5/4}$          [Han08b]
$n^3 \lg \lg n / \lg^2 n$                [Cha07]
$mn \lg(n^2/m) / \lg^2 n$                [BVW08]
Table 2.6: Combinatorial algorithms for the All Pairs Shortest Path problem.
All Pairs Approximate Shortest Path (APASP) Algorithms
For some applications, the computation of exact distances may be too expensive. Approximation
algorithms can achieve better running times. We also refer to the survey by Sen [Sen09,
Section 5].
For undirected, unweighted graphs, Aingworth et al. [ACIM99] solve APASP with stretch
$(1, 2)$ in time $O(n^{2.5} \sqrt{\lg n})$ and Dor et al. [DHZ00] solve APASP with stretch $(1, O(\lg n))$ in time
$\tilde{O}(n^2)$. Cohen and Zwick [CZ01] solve APASP with stretch $(2, 0)$ in time $\tilde{O}(n^{3/2} m^{1/2})$, with
stretch $(7/3, 0)$ in time $\tilde{O}(n^{7/3})$, and with stretch $(3, 0)$ in time $\tilde{O}(n^2)$. Baswana and Kavitha
[BK06], Berman and Kasiviswanathan [BK07], and Baswana et al. [BGS09] give an $\tilde{O}(n^2)$
time algorithm for all pairs $(2, W)$-approximate shortest paths, where $W$ denotes the largest
edge weight. Based on matrix multiplication, Zwick [Zwi02] obtains running time $O(n^{\omega} / \epsilon)$ for
$(1 + \epsilon, 0)$-approximate distances. Roditty and Shapira [RS08] give a smooth transition between
Zwick's APASP result and the exact APSP algorithms based on matrix multiplication. Their
distances have sublinear additive distortion.
2.4.3 Many Pairs Shortest Path (MPSP) Algorithms
We may want to compute the shortest paths between pairs from a restricted subset of vertices
$U \subseteq V$. For this, an APSP algorithm might be computing too much. Let $s := |U|$. For $s = \omega(1)$
pairs, the algorithm of Pettie and Ramachandran [PR02] is currently the fastest method to compute
exact shortest paths.
Restricted graph classes. The algorithm by Cabello [Cab06] computes $s$-many distances in
planar graphs in time $O(s^{2/3} n^{2/3} \lg n + n^{4/3} \lg^{1/3} n)$. The algorithm makes use of an efficient
distance oracle for planar graphs (see Section 3.1.3).
Many Pairs Approximate Shortest Path (MPASP) Algorithms
For certain applications, approximate distances between a subset of the nodes may be enough.
For an overview of results, see [Elk05, Table I]. Similar to APASP, many algorithms use spanners
(subgraphs that approximate distances, for details see Section 2.3) to reduce the problem size.
Elkin [Elk05, p. 284] notes that the existence of an algorithm for constructing an $(\alpha, \beta)$-spanner
with $O(n^{1+\zeta})$ edges ($\zeta > 0$) in time $O(T)$ implies the existence of an algorithm for the MPASP
problem with $s$ sources with running time $O(T + s \cdot n^{1+\zeta})$.
Also, any $(\alpha, \beta)$-approximate distance oracle (to be defined in the next chapter) with
preprocessing time $O(T)$ and query time $O(Q)$ can trivially solve MPASP in time $O(T + s \cdot Q)$.
It is interesting to investigate whether it helps when we know the $s$ pairs beforehand (offline
scenario).
Aingworth et al. [ACIM99] compute $(1, 2)$-approximate distances in time $O(n^{1.5} \sqrt{s \lg n} +
n^2 \lg^2 n)$. Dor et al. [DHZ00] compute $(1, 1/\epsilon)$-approximate shortest paths in time $\tilde{O}(n^{2+\epsilon})$.
Cohen [Coh94] computes $(1 + \epsilon, \mathrm{polylog}(n))$-approximate shortest paths in time $O(m n^{\rho} +
s n^{1+\zeta})$. Elkin [Elk05] computes $(1 + \epsilon, O(1))$-approximate shortest paths in the same time.
Roditty et al. [RTZ05] construct a distance oracle for a subset of the nodes $U \subseteq V$. Based on this,
they can compute $(2k - 1, 0)$-approximate distances in time $\tilde{O}(m s^{1/k} + ks)$.
2.4.4 Shortest Path and Distance Queries
Processing shortest path and distance queries in graphs can be seen as a generalization of the
APSP problem and the MPSP problem, for which we do not know the pairs beforehand (online
scenario). The efficient processing of these queries and the corresponding data structures are the
main focus of this thesis. The problem is defined in Chapter 3, wherein we also review related
work.
Die Aufgabe [...] besteht darin, ein Verfahren anzugeben, wie
man sich aus einem Labyrinthe herausfindet.
(The task is to give a method how to find your way out of a maze.)
Ludwig Christian Wiener (1826–1896) [Wie73]
3
Review of Short Path Query Processing
The eminent importance of the shortest path problem has stimulated a large body of research, both
of theoretical and practical nature. Some important results for the general problem (without being
exhaustive) are mentioned in Section 2.4. In this chapter, we review results on the point-to-point
(approximate) shortest path query problem. In this overview, the query problem is restricted to
discrete, static graphs with positive edge weights. Also, the only criterion for the optimality of a
path is its length (for example, there are no turn penalties [Cal61, KP69]). If paths are required to
satisfy constraints other than distance (for example bounded leg paths [DP08]), the corresponding
results are not covered in this review.
For distance and path queries in computational geometry, we refer to [Mit97]. For the
specific problem in computational geometry concerning robot path planning, we refer to [Sha97b,
KLMR98, Lat91]. For questions regarding query processing in dynamic graphs, see [Ram96a,
pp. 30–100] and [Ita08, Ber09, BHS07, RZ04, MN67], for time-dependent weights, see [OR90],
and for negative weights, see [YZ05].
The query scenario is different from the classical scenario described in the previous
sections (2.2 and 2.4). We are first presented with a usually large graph $G = (V, E)$. A so-called
preprocessing algorithm may compute certain information or a data structure to prepare
for the next phase. After this preprocessing algorithm has been executed, users will ask queries,
which should be answered efficiently. In computational geometry, this and similar scenarios are
sometimes called repetitive-mode (as opposed to single-shot) scenarios [PS85, p. 37].
A lazy strategy would be not to precompute a data structure but to use a classical SSSP algo-
rithm to answer queries. The query would then take roughly linear time. An eager strategy would
be to precompute the result for all possible queries using an APSP algorithm.
1
Both strategies
have their advantages and disadvantages: for the first strategy, no preprocessing is necessary but
the query processing is very slow; for the second strategy, the query execution is extremely fast
but the preprocessing step is very expensive and the space consumption is prohibitively large for
many graphs. In the path query scenario, we mediate between these two extremes: we analyze
the tradeoff between space, preprocessing time, and query time. If the query algorithm is allowed
to return an approximate shortest path, the worst-case stretch is also an important factor of the
tradeoff.
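The two extremes can be made concrete with a small sketch. The class names `LazyOracle` and `EagerOracle` are hypothetical and not taken from the literature; both assume a graph given as an adjacency dict with positive edge weights and use a heap-based Dijkstra search as the SSSP subroutine.

```python
import heapq


def sssp(adj, source):
    """Dijkstra's algorithm; adj maps a node to a list of (neighbor, weight)."""
    dist = {source: 0}
    settled = set()
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue
        settled.add(u)
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


class LazyOracle:
    """No preprocessing, no extra space; every query runs a full SSSP search."""

    def __init__(self, adj):
        self.adj = adj

    def query(self, u, v):
        return sssp(self.adj, u).get(v, float("inf"))


class EagerOracle:
    """Precomputes all pairs (quadratic space); a query is a table lookup."""

    def __init__(self, adj):
        nodes = set(adj) | {v for nbrs in adj.values() for v, _ in nbrs}
        self.table = {u: sssp(adj, u) for u in nodes}

    def query(self, u, v):
        return self.table[u].get(v, float("inf"))
```

Both classes answer the same queries; they differ only in where the work is done, which is the tradeoff studied in the remainder of this chapter.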
In the following, we review related work on shortest path and distance query processing for
graphs. Theoretical results are presented in Section 3.1 and practical results are summarized in
Section 3.2. The review in this chapter is made with best efforts; however, the list of methods and
algorithms is not exhaustive.
1
We assume no knowledge about the query distribution. In practice, we might actually know that some queries are
more frequent than others.
3.1 Theoretical Distance Oracles
Thorup and Zwick [TZ05] coined the term distance oracle, which is a data structure that, after
preprocessing a graph $G = (V, E)$, allows for efficient (approximate) distance and shortest path
queries.
Definition 30. An ($(\alpha, \beta)$-approximate) distance oracle for a class of graphs $\mathcal{G}$ consists of a data
structure $S$ and a query algorithm with the following characteristics:
- The preprocessing time is the worst-case time required to construct the data structure for
  any $G \in \mathcal{G}$.
- The space complexity refers to the worst-case size of the data structure for any $G \in \mathcal{G}$.
After preprocessing $G = (V, E)$, the data structure $S$ supports (approximate) distance queries for
all pairs of vertices $u, v \in V$, returning a value $\hat{d}_S(u, v)$. The query algorithm and its result are
characterized as follows.
- The query time is the worst-case time required to compute $\hat{d}_S(u, v)$ among all $G = (V, E) \in \mathcal{G}$
  and $u, v \in V$.
- A distance oracle $S$ is said to have stretch $(\alpha, \beta)$ if for all $G = (V, E) \in \mathcal{G}$ and $u, v \in V$
  its query algorithm satisfies
  $d_G(u, v) \le \hat{d}_S(u, v) \le \alpha \cdot d_G(u, v) + \beta$.
  The stretch is also called distortion.
In addition to the worst-case measures, the average-case behavior of the time and space
complexities, and, especially, the stretch, may also be of interest. For some distance oracles, only the
average stretch is guaranteed. For other distance oracles, the stretch condition is satisfied except
for $\epsilon n^2$ pairs of nodes. This fraction $\epsilon \in [0, 1)$ is called slack. If not explicitly stated otherwise,
stretch means the worst-case stretch as in Definition 30.
We summarize the known results for general graphs (lower bounds in Section 3.1.1 and upper
bounds in Section 3.1.2), and we give an overview of the distance oracles for restricted classes of
graphs (Section 3.1.3). Some common techniques are explained in Section 2.3.
3.1.1 Lower Bounds
For directed graphs, distance oracles are closely related to reachability oracles. We do not review
upper bounds on algorithms for reachability oracles. Any lower bound on the space complexity of
reachability oracles directly implies a lower bound on any distance oracle with finite stretch. The
first part of the definition for reachability oracles is equivalent to the first part of the definition for
distance oracles (Definition 30) except that reachability oracles always consider directed graphs.
Definition 31. A reachability oracle for a class of directed graphs $\mathcal{G}$ consists of a data structure $S$
and a query algorithm with the following characteristics:
- The preprocessing time is the worst-case time required to construct the data structure for
  any $G \in \mathcal{G}$.
- The space complexity refers to the worst-case size of the data structure for any $G \in \mathcal{G}$.
After preprocessing $G = (V, E)$, the data structure $S$ supports reachability queries for all pairs of
vertices $u, v \in V$, returning a boolean value $\mathrm{reach}_S(u, v) \in \{\top, \bot\}$. The query algorithm and its
result are characterized as follows.
- The query time is the worst-case time required to compute $\mathrm{reach}_S(u, v)$ among all $G =
  (V, E) \in \mathcal{G}$ and $u, v \in V$.
- The query algorithm returns $\mathrm{reach}_S(u, v) = \bot$ if and only if $d_G(u, v) = \infty$, and
  $\mathrm{reach}_S(u, v) = \top$ otherwise.
For general directed graphs, any distance oracle with finite worst-case stretch requires space
$\Omega(n^2)$ bits. Consider a complete bipartite digraph $D = (V_1 \cup V_2, A)$, where all arcs are directed
from $V_1$ to $V_2$. Any scheme that encodes reachability [AF90] information for $D$ and all its $2^{\Theta(n^2)}$
subgraphs requires quadratic space for some subgraphs.
For sparse graphs with O(n) edges, the following tradeoff between space complexity and
query time has been proven. For details, see Section 4.2.1.
Theorem 5 (Pătrașcu [Pat08a, Theorem 2]). A reachability oracle using space $S$ in the cell-probe
model with $w$-bit cells requires query time $t = \Omega(\lg n / \lg(Sw/n))$.
Corollary 6. A distance oracle for directed graphs with finite stretch using space $S$ in the
cell-probe model with $w$-bit cells requires query time $t = \Omega(\lg n / \lg(Sw/n))$.
For dense graphs, the information-theoretic argument extends to undirected graphs as follows.
For undirected graphs, Thorup and Zwick [TZ05] prove a lower bound on the size of distance
oracles for a certain stretch. The worst-case instances are dense graphs with large girth. Recall that
the girth of a graph is the length of its shortest cycle and recall that $m_g(n)$ denotes the maximum
number of edges in a graph with $n$ vertices and girth at least $g$ (for details, see Section 2.3.1).
Theorem 7 (Girth-based lower bound [TZ05, Proposition 5.1]). For any integer k 0 (called
stretch parameter), any distance oracle for graphs with n nodes and multiplicative stretch less
than < 2k + 1 needs space at least (m
2k+1
(n)) bits.
Erdős' girth conjecture [Erd64, ES63] predicts that $m_{2k+1}(n) = m_{2k+2}(n) = \Omega(n^{1+1/k})$. For the values of $k$ for which the conjecture has been proven (including 1, 2, 3, and 5, see Table 2.4), this yields a space lower bound of $\Omega(n^{1+1/k})$ bits.
Thorup and Zwick's lower bound proof [TZ05, Proposition 5.1] roughly works as follows: Recall the argument of the lower bound for graph spanners (Definition 26): in a graph with girth $g = \alpha + 2$, removing any edge increases the distance between its endpoints from 1 to at least $\alpha + 1$. Let $G$ be a graph with $n$ nodes, $m_g(n)$ edges, and girth $g$. All $2^{m_g(n)}$ subgraphs $G'$ of the
graph G also have large girth g(G
and G
r
_
1+2
edges, using which they
compute paths of stretch (4 + 1, 0), where is another parameter. They also prove two lower
bounds. For probe factor 0 (no edges can be read from the original graph), the index requires space at least $\frac{n \lg n}{2}$ bits. For probe factor $\frac{n}{10}$ and index size $\frac{n \lg n}{10}$ bits, no stretch less than (5, 0) is possible.
For an overview of distance oracles for general graphs, see Table 3.1.
Suppose that we need a distance oracle with quasi-linear space consumption. The oracle of
Thorup and Zwick [TZ05] achieves this for k = lg n with O(lg n) multiplicative stretch and
O(lg n) query time. The oracle of Mendel and Naor [MN06] improves the query time to O(1).
It would be very useful to reduce the stretch to O(1) instead. The space lower bound proves that
such a reduction is impossible for dense graphs. It is however open whether such oracles exist for
sparse graphs.
To conclude this section, note that any APSP or APASP algorithm constructs a complete distance table, which may serve as a distance oracle. The space requirement is $O(n^2)$ and the query time is $O(1)$. Distances in the table are optimal if computed by an APSP algorithm. For APASP algorithms, the tradeoff between stretch and space is not optimal with respect to the lower bound of
Section 3.1.1. APASP algorithms may improve upon APSP algorithms in terms of the preprocess-
43
CHAPTER 3. REVIEW OF SHORT PATH QUERY PROCESSING
Preprocessing           Space               Query        Stretch             Reference
$O(mn)$                 $O(n^2)$            $O(1)$       $(1, 0)$            APSP
$O(1)$                  $O(m)$              $O(m)$       $(1, 0)$            BFS
$O(kmn^{1/k})$          $O(kn^{1+1/k})$     $O(k)$       $(2k - 1, 0)$       [TZ05, RTZ05]
$O(mn^{1/k})$           $O(n^{1+1/k})$      $O(1)$       $(O(k), 0)$         [MN06, MS08b]
$O(n^2)$                $O(n^{3/2})$        $(\lg n)$    $(3, 0)$            [BK06]
$O(n^2)$                $O(kn^{1+1/k})$     $O(k)$       $(2k - 1, 0)$       [BK06], $k \geq 3$
$O(n^2)$                $O(kn^{1+1/k})$     $O(k)$       $(2k - 1, 0)$       [BS06]
$O(m + n^{23/12})$      $O(n^{3/2})$        $O(1)$       $(3, 10)$           [BGSU08]

                        $\Omega(n^{1+1/k})$              $< (2k + 1, 0)$     [TZ05]
                        $n^{1+(1/t)}$       $t$          $< (1, 2)$          Lemma 11
                        $n^{1+(1/t)}$       $t$          $< (\alpha, 0)$     Theorem 8

Table 3.1: Time and space complexities of distance oracles for general, undirected, unweighted graphs (some upper bounds extend to weighted graphs). The upper part of the table lists upper bounds; the lower part lists lower bounds (some restrictions on $k$ and $\alpha$ apply). For the result by Baswana et al. [BGSU08], the stretch is taken from [Sen09, Table 1]. Approximate distance oracles are included only if the space requirement is at most $o(n^2)$.
ing time. Recall that the distance oracle of Thorup and Zwick [TZ05] is restricted to stretch fac-
tors of uneven integers. If multiplicative stretch less than 3 is desired, the preprocessing algorithm
of their oracle can only function as an APSP algorithm. The APASP algorithms by Cohen and Zwick [CZ01] yield a (2, 1)-approximate distance oracle with preprocessing time $\tilde{O}(n^{3/2} m^{1/2})$ and a $(7/3, 0)$-approximate distance oracle with preprocessing time $\tilde{O}(n^{7/3})$. The algorithms by Baswana and Kavitha [BK06] improve this to a (2, 0)-approximate distance oracle with preprocessing time $\tilde{O}(m n^{1/2} + n^2)$ and a $(7/3, 0)$-approximate distance oracle with preprocessing time $\tilde{O}(m^{2/3} n + n^2)$.
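A minimal sketch of a complete distance table used as an exact oracle, assuming an unweighted, undirected graph with vertices 0 to n-1, so that one breadth-first search per vertex plays the role of the APSP algorithm. The function names are hypothetical.

```python
from collections import deque
import math


def build_distance_table(n, edges):
    """Run one BFS per vertex; table[u][v] is the exact hop distance (inf if disconnected)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    table = []
    for s in range(n):
        dist = [math.inf] * n
        dist[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[v] == math.inf:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        table.append(dist)
    return table


def query(table, u, v):
    """Constant-time lookup in the quadratic-size table."""
    return table[u][v]
```

The table has n^2 entries and each lookup is a single array access, which matches the space and query bounds of the APSP row in Table 3.1.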
If the input is restricted to special classes of graphs, the girth-based lower bound may not
necessarily apply. Distance oracles with better stretch/space tradeoffs exist.
3.1.3 Restricted Graph Classes
Given the prohibitive girth lower bound (Theorem 7), the natural question arises whether we can
construct a better distance oracle for certain classes of graphs. Graphs that actually appear in
practical settings are of particular interest. Indeed, there are better constructions for restricted
graph classes, in particular for several sparse graphs.
Planar Graphs (and Graphs with Bounded Genus)
Due to the importance of planar graphs, short path queries for planar graphs have been studied
extensively. A complete review of the corresponding results would deserve a chapter itself. We
give a brief overview; a summary can be found in Table 3.2.
For exact shortest path queries, the currently best result in terms of the tradeoff between space and query time is by Fakcharoenphol and Rao [FR06, FR08]. The data structure requires space $O(n \lg^3 n)$ (improved to $O(n \lg^2 n)$ for non-negative weights by Klein [Kle05]; another improvement by a $\lg \lg$ factor may be possible [MWN09]) and processes queries in time $O(\sqrt{n} \lg^2 n)$. Note that their result holds for graphs with negative weights as well.
Some distance oracles by Cabello [Cab06] (and others) have better query times. The time and space complexities of his distance oracles depend on a parameter $r$ that can be specified by the application. He proves that for any $r \in [n^{4/3} \lg^{1/3} n, n^2]$ there is an exact distance oracle with preprocessing time and space $O(r)$ and query time $O\left(\frac{n}{\sqrt{r}} \lg^{3/2} n\right)$. This oracle improves upon the complexities of the oracles by Arikati et al. [ACC+96, p. 517] and Djidjev [Dji96]. They prove that, for any $r \in [n^{3/2}, n^2]$, there is a distance oracle with preprocessing time and size $O(r)$ that supports queries in time $O(n^2/r)$. Djidjev [Dji96] obtains two more results. For any
r [n, n
3/2
], the preprocessing step takes time O(n
r
lg n
_
. The size remains O(r); the preprocessing
time remains $O(n\sqrt{r})$. For $r = n^{3/2}$, we obtain query time $O(n^{1/4} \lg n)$ and space $O(n^{3/2})$. Cabello's tradeoff [Cab06] yields better preprocessing times for the same amount of space.
The hammock decomposition [FJ88, Fre91, Fre95] of a planar graph is used to construct distance oracles optimized for certain classes of planar graphs. Let $q = q(G) \in \{1, 2, \ldots, n - 1\}$ denote the minimum number of outerplanar subgraphs of a planar graph $G$ (proportional to the minimum number of faces covering all vertices of $G$; the minimum is taken over all embeddings of $G$ in the plane). The number of hammocks is a topological measure imposing a natural hierarchy on the class of planar graphs. Djidjev et al. [DPZ95] give a distance oracle with preprocessing time and space $O(n + q \lg q)$ and distance query time $O(\lg n + q)$. Recall that, for some planar graphs, $q$ may be $\Theta(n)$. If only distance queries are to be answered (no path queries), space $O(n)$ suffices. Djidjev et al. [DPZ00, DPZ91] also give a dynamic distance oracle for outerplanar graphs with $O(n)$ preprocessing and space and $O(\lg n)$ query time. They generalize this distance oracle
to planar graphs (and graphs with genus at most O(n
r)
space, O(n+q
2
/
r+qr
3/4
) preprocessing time, and O(
n). Their
tradeoff matches the tradeoff for the results by Arikati et al. and Djidjev with $r = n^{3/2}$ [ACC+96, Dji96].
Hutchinson et al. [HMZ03] consider shortest path queries in planar graphs in the parallel disk model. This model is used for external memory algorithms [AV88, VS94]. Their data structure uses $O(n^{3/2}/B)$ blocks of external memory and allows for a shortest path query to be answered in $O\left(\frac{\sqrt{n} + L}{DB}\right)$ I/O operations, where $B$ is the block size, $L$ is the number of vertices on the reported path, and $D$ is the number of parallel disks. Their result essentially also matches the tradeoffs in [ACC+96, Dji96].
Restricted Queries. Kowalik and Kurowski [KK03] generalize adjacency queries in unweighted planar graphs to queries for distances bounded by a constant $h$. The preprocessing time and space complexities are $O(n)$ and the query time is $O(1)$ (an improvement over the $O(\lg n)$ query time in Eppstein [Epp99, Theorem 12]). To obtain the space bound, they prove that any constant power of a planar graph still has constant thickness (Definition 24). Then, answering a distance query translates to combining the results of a constant number of adjacency queries. Note that for unweighted planar graphs, any distance oracle with stretch $(1, O(1))$ immediately yields a $(1 + \varepsilon, 0)$-approximate distance oracle by combining it with the result by Kowalik and Kurowski.
Distances in undirected, unweighted grids are directly determined by grid coordinates. For a weighted, directed $(p \times q)$-grid, Schmidt [Sch98] gives an $O(pq \lg p)$-time algorithm to build a distance oracle that supports distance queries in time $O(\lg p)$ for paths starting in the left-most column. In a straightforward way, this yields an $O(pq \lg pq)$-time algorithm to build a distance oracle that supports distance queries in time $O(\lg p + \lg q)$ starting at any node on the outer face. For a $(\sqrt{n} \times \sqrt{n})$-grid, the preprocessing and query times are $O(n \lg n)$ and $O(\lg n)$, respectively.
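For the undirected, unweighted case mentioned at the start of this paragraph, a sketch under the assumption that the grid is 4-connected (each cell is adjacent to its horizontal and vertical neighbors only) and that cells are addressed as (row, column) pairs: the distance is then the Manhattan distance and no preprocessing is needed. The function name is hypothetical.

```python
def grid_distance(a, b):
    """Hop distance between cells a = (row, col) and b = (row, col) of a 4-connected grid."""
    (r1, c1), (r2, c2) = a, b
    return abs(r1 - r2) + abs(c1 - c2)
```

The weighted, directed grids handled by Schmidt's oracle do not admit such a closed form, which is why preprocessing is required there.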
This result can be generalized (with rather different techniques). For an undirected graph with genus $g$, Cabello and Chambers [CC07] provide an $O(g^2 n \lg n)$-time algorithm to represent the shortest path tree from all the vertices on one specified face. Any query distance from a vertex on this face can be obtained in time $O(\lg n)$.
Approximations. The tradeoff between space and query time of all exact distance oracles is such that for space $O(n^{2-\varepsilon})$ the query time is at best $O(n^{\varepsilon/2})$ [Cab06]. Constant query time with space $o(n^2)$ has not been achieved yet. To obtain fast query times of $O(1)$ or $\tilde{O}(1)$ with lower space requirements, approximate distance oracles are considered. Recall that, for general graphs, any distance oracle with multiplicative stretch $\alpha < 3$ requires space $\Omega(n^2)$. In the special case of planar graphs, the corresponding tradeoff is much better.
Thorup [Tho04a] presents efficient $(1 + \varepsilon)$-approximate distance oracles for planar digraphs. The main ingredient of Thorup's construction [Tho04a] is a special separator consisting of a set of
shortest paths (instead of a general set of O(
lg n
_
and query time O
_
lg n
_
.
For an overview, see Table 3.2. Recall that denotes the diameter (largest nite distance) of
a graph.
Preprocessing Space Query Reference
O(nlg
2
n) O(nlg
2
n) O(
nlg
2
n) 1 [FR06, Kle05]
O(n
3/2
) O(n
3/2
) O(
n) 1 [ACC
+
96, Dji96]
O(n
4/3
)
O(n
4/3
) O(n
1/3
lg
4/3
n) 1 [Cab06, r : n
4/3
]
O(n
3/2
) O(n
3/2
) O(n
1/4
lg
3/2
n) 1 [Cab06, r : n
3/2
]
O(n
7/4
) O(n
3/2
) O(n
1/4
lg n) 1 [Dji96, r : n
3/2
]
O(n
7/4
) O(n
7/4
) O(n
1/8
lg
3/2
n) 1 [Cab06, r : n
7/4
]
O
_
n
2
lg
3
n
_
O
_
n
lg n
_
O(1/) 1 + [Tho04a, T3.19]
O
_
n
lg n
_
O(1/) 1 + [Kle02a]
O
_
n
lg
2
nlg
_
O(
n
) 1 + [Tho04a, P3.14]
O(
n
+ nlg
2
nlg ) O(
n
) 1 + [Kle05, Sec. 7]
O(poly(n)) O
_
n
lg n
_
O
_
lg n
_
1 + [AG06]
Table 3.2: Time and space complexities of distance oracles for undirected planar graphs (some
results extend to planar digraphs and/or minor-free graphs). denotes the multiplicative stretch;
denotes the diameter.
Approximate shortest path query processing for planar graphs has been investigated before Thorup's and Klein's seminal results. Frederickson and Janardan [FJ89, FJ90] give stretch-(3, 0) approximate routing schemes for planar graphs. Klein and Subramanian [KS98] give a data structure that also works in the dynamic case. The stretch is $(1 + \varepsilon, 0)$; query and update times are
O(
1
n
2/3
lg
2
nlg ).
Road Networks: Graphs with Bounded Highway Dimension
An important characteristic of road networks appears to be their highway dimension [AFGW10]. A graph is said to have small highway dimension if for any $r > 0$, there is a not-too-large set of vertices $S_r$, which all shortest paths of length at least $r$ pass through. $S_r$ is not too large if all balls of radius at most $O(r)$ contain only few vertices of $S_r$. Based on this notion, many efficient practical methods (to be discussed in Section 3.2.3) such as contraction hierarchies [GSSD08, BDS+08], highway hierarchies [SS05, SS06, NBB+08], transit-node routing [BFM+07, Sch08b], and SHARC [BD08] have provable performance guarantees. The highway dimension of real road networks has not been investigated yet.
Road networks also share some properties with planar graphs such as small separators [EG08].
The techniques of Thorup [Tho04a] may potentially apply too (experimental results indicate
so [MZ07]).
Bounded Tree-Width
For digraphs with tree-width $w$ (Definition 18), Chaudhuri and Zaroliagis [CZ00] give an algorithm, parameterized by an integer $q \in [1, a(n)]$, to compute a distance oracle with query time $O(w^3 q)$ in preprocessing time $O(w^3 n \lg n)$ for $q = 1$, preprocessing time $O(w^3 n \lg^* n)$ for $q = 2$, and preprocessing time $O(w^3 n)$ for $q = a(n)$. The tradeoff between preprocessing and query time arises from semigroup computations over trees [AS87, Cha87, Sei06]. $a(n)$ denotes the inverse
Ackermann function and lg
nlg n/
3
), a data structure of size O(nlg n/
4
) that
supports $(1 + \varepsilon, 0)$-approximate distance queries in time $O(1)$. Fürer and Kasiviswanathan [FK07] give efficient algorithms for disk and ball graphs with general radii. Let $R := \frac{\max r_i}{\min r_i}$ denote the ratio between the largest and the smallest radius in the graph. They construct sparse $(1 + \varepsilon, 0)$-
spanners (Denition 26), which have a separator of size S(n, R, ) = O(n
11/dim
dim+1/2
+
2dim+1
lg R) that can be found in time $O(n \lg n)$. After preprocessing of time $O(n \, S(n, R, \varepsilon))$, their distance oracle can answer queries in time $O(S(n, R, \varepsilon))$. Recall that the class of disk graphs contains the class of planar graphs [Koe36]. The oracle of Fürer and Kasiviswanathan is more general than the oracles for planar graphs but not competitive on planar instances.
Let $\pi$ denote a permutation of $[n] = \{1, 2, \ldots, n\}$. The permutation graph $G(\pi)$ is defined as the graph with vertex set $[n]$ and edge set $\{\{i, j\} : (i - j)(\pi^{-1}(i) - \pi^{-1}(j)) < 0\}$. Sprague [Spr07] gives an algorithm that preprocesses a permutation graph in time $O(n)$ such that distance queries can be answered in time $O(1)$.
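To make the edge-set definition concrete, the following sketch builds the permutation graph by brute force over all pairs. It assumes the permutation is given as a list `pi` with `pi[k-1]` equal to the image of `k`; the function name is hypothetical, and the quadratic loop is for illustration only and unrelated to Sprague's linear-time preprocessing.

```python
def permutation_graph_edges(pi):
    """Return the edge set {{i, j} : (i - j) * (inv(i) - inv(j)) < 0} of G(pi)."""
    n = len(pi)
    # inv[value] is the position at which `value` appears, i.e. the inverse permutation
    inv = {value: position for position, value in enumerate(pi, start=1)}
    return {
        frozenset((i, j))
        for i in range(1, n + 1)
        for j in range(i + 1, n + 1)
        if (i - j) * (inv[i] - inv[j]) < 0
    }
```

Two values are adjacent exactly when the permutation lists them in reversed order, so the identity permutation yields a graph without edges.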
Graphs with Bounded Doubling Dimension
The doubling dimension of a metric space is the minimum dim such that any ball of radius $r$ can be covered by at most $2^{\mathrm{dim}}$ balls of radius $r/2$ (Definition 19).
Talwar [Tal04] provides distance labels of length O(
dim
), using which there is a (1 +
, 0)approximate distance oracle with space O(n
dim
) and query time O(
dim
). Abraham
et al. [ABN08] provide, for a parameter (0, 1], an (O(lg
1+
n), 0)approximate distance or-
acle with space complexity O(ndim/) and query time O(dim/). The stretch is worse but the
dependencies on the doubling dimension dim are better.
The beacon-based embedding by Kleinberg et al. [KSW09] works as follows. A set of landmarks is selected (for example at random); a constant number of landmarks suffices. Each node stores its distance to all the landmarks. Distances are approximated by triangulation. Distance estimates are unbounded for a fraction of the node pairs. This fraction is referred to as slack. For all the remaining node pairs, the triangulation yields a $(1 + \delta, 0)$-approximate distance oracle with linear space complexity and constant query time.
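The triangulation step can be sketched as follows, assuming a connected, undirected, unweighted graph given as an adjacency dictionary and a beacon set chosen by the caller. The helper names are hypothetical. Each beacon contributes a lower bound |d(u, b) - d(v, b)| and an upper bound d(u, b) + d(v, b) on d(u, v), both consequences of the triangle inequality.

```python
from collections import deque


def hop_distances(adj, source):
    """Breadth-first search distances from `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def build_beacon_labels(adj, beacons):
    """One distance table per beacon; each node's label is its column across the tables."""
    return [hop_distances(adj, b) for b in beacons]


def triangulate(labels, u, v):
    """Return (lower, upper) bounds on d(u, v) obtained from the beacon distances."""
    lower = max(abs(d[u] - d[v]) for d in labels)
    upper = min(d[u] + d[v] for d in labels)
    return lower, upper
```

On a path with vertices 0 to 4, the single beacon 0 brackets d(1, 3) = 2 between 2 and 4, and adding beacon 2 makes both bounds tight. Pairs for which the bounds stay far apart correspond to the slack described above.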
Graphs with bounded doubling dimension have also been studied in the setting of compact
routing [AM05, AGGM06, KRX07] and spanners [CG06]. Some researchers suggest to apply
algorithms that work well for graphs with bounded doubling dimension to Internet-like graphs.
However, the graph representing the Internet does not appear to have bounded ball growth or
bounded doubling dimension. Both measures can be large [FLV08].
Power-law Graphs and Complex Networks
For power-law graphs, compact routing schemes have been studied. Any compact routing scheme
may serve as an approximate shortest path oracle. While the time to retrieve the approximate
distance may not be competitive, we can retrieve each edge of the path by simulating the decision
of a router in a centralized way such that path queries are efficient. The results for distance oracles
and compact routing schemes are often strongly related. For a routing scheme, all the information
needs to be distributed. This constraint renders the problem intuitively harder than constructing a
distance oracle, where information is centralized and the query algorithm is not restricted to local
information. We do not attempt to cover results on compact routing in this review. We instead refer
to [ANLP90, AP92, Gav01, GP03b, TZ01, AGM
+
08, CW04] and the references therein. Compact
routing schemes for power-law graphs with direct influence on the best results for distance oracles
are discussed in the following.
Krioukov et al. [KFY04] evaluate the compact routing scheme by Thorup and Zwick [TZ01] for Internet-like inter-domain topologies and random power-law graphs. They also analyze the stretch distribution for this routing scheme when run on Erdős-Rényi $G_{n,p}$ random graphs [ER60]. Enachescu et al. [EWG08] also analyze the compact routing scheme by Thorup and Zwick [TZ01] for Erdős-Rényi $G_{n,p}$ random graphs [ER60]. They prove that stretch (2, 0) can be achieved with space $\tilde{O}(n^{7/4})$ by selecting $\tilde{O}(n^{3/4})$ landmarks. They also claim (without proof in the proceedings
version) that stretch (, 0) can be achieved with space
O
_
n
1+
2
+1
+
_
. Recall that the Erdős-Rényi random graphs do not have a power-law degree sequence (Section 2.1.3).
The compact routing scheme by Brady and Cowen [BC06] is evaluated experimentally only.
More on their scheme can be found in Section 3.2.4 on practical results.
Geometric Graphs
A geometric graph $G = (V, E)$ has vertices corresponding to points in $\mathbb{R}^{\mathrm{dim}}$ and edge weights
from a Euclidean metric; G is said to be a (, )spanner for V , if for any two points p and q in
V the shortest path metric in G (, 0)approximates the Euclidean distance in R
dim
. Since some distances may be larger than the corresponding Euclidean distance, we cannot simply apply the Johnson-Lindenstrauss lemma [JL84] (Lemma 2) to obtain an efficient distance oracle for $G$. For geometric graphs with sparse spanners, Gudmundsson et al. [GLNS08] give a $(1 + \varepsilon, 0)$-approximate distance oracle with preprocessing time $O(m \lg n)$, space $O(n \lg n)$, and query time $O(1)$.
Andersson et al. [AGL07] extend the result to geometric graphs with dense clusters using
the well-separated pair decomposition by Callahan and Kosaraju [CK95] (Definition 29) and
well-separated clusters by Krznaric and Levcopoulos [KL95]. If G contains N disjoint (, 0)
spanners that are interconnected with $M$ edges, there is an algorithm that constructs a $(1 + \varepsilon, 0)$-approximate distance oracle in time $O((m + M^2) \lg n)$ with space $O(M^2 + n \lg n)$ and constant query time. Their algorithm chooses a representative point for each cluster, based on which distances are computed. As a potential application they give the following example: in the European railway network, each country has its own network (a spanner) and the railway networks of the countries are then sparsely interconnected.
Sankaranarayanan and Samet [SS09, SSA09] adapt the well-separated pair decomposition to spatial networks of dim dimensions. They obtain $(1 + \varepsilon, 0)$-approximate distance oracles using space $O(n/\varepsilon^{\mathrm{dim}})$ and query time $O(\lg n)$. With a hash table, the query time can be reduced to $O(1)$ while the space consumption increases to $O(n \lg n/\varepsilon^{\mathrm{dim}})$. They also evaluate their scheme experimentally on road networks.
3.2 Practical
Efficient practical methods to process shortest path queries are often devised by following a feedback loop that consists of design, analysis, implementation, and experimentation. The approach using this feedback loop is also called algorithm engineering [San09, Figure 1]. Since experimentation is an integral part of the feedback loop, the choice of the datasets may highly influence the outcome of the algorithm engineering process. If possible, experiments are run with input graphs that are actually used in practice.
Practical instances for shortest path problems are often sparse. The number of edges is roughly linear in the number of nodes. Besides sparsity, practical networks often have other important properties. A large fraction of the efforts in the field of practical shortest path query processing has been devoted to transportation networks, in particular to road networks. Road networks, for example, share many properties with planar graphs; in particular, road networks also have small separators. In practice, however, approaches directly based on separators are often not the most efficient ones.
Experimental evaluation [Hit68, BH69, Dre69, GW73, Pap74, Gol76] has always been an important part of shortest path research. For the first practical methods devised by the algorithm engineering approach, the feedback loop was rather short. Researchers found that the representation of a graph in memory affects the performance of the algorithm. For sparse graphs, representing the graph by an adjacency list is quite efficient. The list can be sorted by starting nodes (such a representation is sometimes termed forward star form). It may be efficient to also sort the edges of a node by their length [DGKK79]. Such a sorting step preprocesses the graph in order to obtain faster query times. It may also be efficient to reorder the vertices such that proximity in the
3.2. PRACTICAL
graph is reflected in proximity in memory as well. Such a reordering may have great impact on the running time of the query algorithm due to caching effects [GKW07].
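A minimal sketch of the forward star form described above, assuming vertices 0 to n-1 and arcs given as (tail, head, length) triples. All arcs are stored in flat arrays sorted by tail and, within one tail, by length; `first[v]` marks where the arcs leaving `v` begin. The function names are hypothetical.

```python
def build_forward_star(n, arcs):
    """Return (first, heads, lengths): arcs leaving v occupy positions first[v]..first[v+1]-1."""
    first = [0] * (n + 1)
    for tail, _, _ in arcs:
        first[tail + 1] += 1          # count the out-degree of each tail
    for v in range(n):
        first[v + 1] += first[v]      # prefix sums turn counts into start offsets
    heads = [0] * len(arcs)
    lengths = [0] * len(arcs)
    next_free = first[:-1]            # slice copy: next unused slot per vertex
    for tail, head, length in sorted(arcs, key=lambda a: (a[0], a[2])):
        pos = next_free[tail]
        heads[pos] = head
        lengths[pos] = length
        next_free[tail] += 1
    return first, heads, lengths


def out_arcs(forward_star, v):
    """List the (head, length) pairs of the arcs leaving v, shortest first."""
    first, heads, lengths = forward_star
    return list(zip(heads[first[v]:first[v + 1]], lengths[first[v]:first[v + 1]]))
```

Because the arcs of one vertex are contiguous, scanning them touches consecutive memory, which is the kind of locality the reordering argument in this paragraph relies on.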
Reordering nodes and edges was just the beginning. If additional structure is computed, or if the network is changed structurally, the investment during the preprocessing phase is higher but so is the payoff at query time. Network decomposition [LS67, FLM67, KY65, Mil66, Hu68, HT69, Yen71, GKN74, LR82] was used to speed up APSP algorithms on sparse networks. Other than the articles on the network decomposition technique, the thesis of Smolleck [Smo75, SC81] and the article by van Vliet [Vli78] appear to be among the first reports on the shortest path query problem with considerable preprocessing.²
Smolleck models the network by an electric circuit, wherein each edge is mapped to an impedance. According to [DP84], Smolleck achieves a speedup of 30 compared to Dijkstra's algorithm (on a graph with 2,047 nodes and 2,547 edges); the paths are on average 1.9% longer than the optimal path; preprocessing apparently takes 1,000 times as long as a query.
Van Vliet compares the running times of Dijkstra's [Dij59], D'Esopo's [PW60, Pap74], and Moore's [Moo59] algorithms on road networks with up to 5,337 nodes and 14,930 edges. Based on his observations, he introduces heuristics termed "spider web techniques" [Vli78, Section 6]. He contracts nodes such that groups of 2 or more links from the original network are combined into single links representing minimum distance paths between their end nodes. For an illustration, see Figure 3.1. He attributes the idea to Hu [Hu69], who termed it "distance equivalent networks";³ he also relates it to triple operations [Flo62, Mur65, Hu68]. Such a triple operation compares an edge length with the lengths of paths with two edges using an intermediate pivot node. The method is mainly used in APSP algorithms. Van Vliet's combination of APSP and SSSP techniques into a query method illustrates the tradeoff that shortest path query methods address. Van Vliet's contraction techniques decrease the CPU time for multiple queries by approximately 25%.
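The triple operation mentioned above can be sketched as follows. Applying it for every pivot in turn gives the classical Floyd-Warshall APSP scheme; the sketch assumes a dense matrix `dist` in which `dist[i][j]` holds the length of edge (i, j), `math.inf` if there is no such edge, and 0 on the diagonal. It illustrates the operation only and is not van Vliet's contraction procedure.

```python
import math


def triple_operations(dist):
    """Apply the triple operation for every pivot; dist is updated in place and returned."""
    n = len(dist)
    for pivot in range(n):
        for i in range(n):
            for j in range(n):
                # Compare the current i-j length with the two-edge path via the pivot.
                via_pivot = dist[i][pivot] + dist[pivot][j]
                if via_pivot < dist[i][j]:
                    dist[i][j] = via_pivot
    return dist
```

For example, with edges 0-1 and 1-2 of length 1 and a direct edge 0-2 of length 5, the pivot 1 replaces the direct length by 2.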
For recent methods, two preprocessing strategies are distinguished. Hierarchical approaches
compute an additional graph structure to speed up shortest path queries. Approaches based on
graph annotation attach additional information to each vertex, based on which, at query time, the
search tree can be pruned.
3.2.1 Hierarchical Approaches
Hierarchical methods to compute shortest paths in graphs have been proposed by many researchers [KK77, AJ94, SWN92, SFG97, IOAI91, CF94, JHR96, TF97, FS97, CRS98, CTB01, HJR95, CL07, CZ07, AY00, JP02, AY01, HSW08, SS06, BFM+07, Sch08b, KKRS08, GSSD08, Hol08, BDS+08]. An auxiliary graph is constructed hierarchically. A shortest path query is then answered by searching only a small part of the auxiliary graph, often using Dijkstra's algorithm. This approach works very well for intrinsically hierarchical graphs.
If, for each level, the size of the graph is reduced by a constant factor, the hierarchy contains $O(\lg n)$ levels. In practice, it may be beneficial to stop the recursive process when only $O(\sqrt{n})$ nodes are left. For these remaining nodes, a distance table can be computed and stored. This yields more efficient query algorithms at a comparably low preprocessing cost.
² An earlier approach by Bazaraa and Langley [BL74] was to preprocess a graph in order to eliminate negative weights such that Dijkstra's algorithm can be applied.
³ There may be a connection to the minimum-route transformations by Akers [Ake60] and William S. Jewell (no reference). These network changes are based on Wye-Delta (Y-Δ) transformations of electrical networks. However, the transformations appear to be restricted to planar networks and to two or three terminals. Hu and Torres [HT69, p. 390] attribute smaller flow-equivalent networks to Akers [Ake60].
Figure 3.1: The contractions for nodes of degrees 2, 3, and 4 (termed "spider web transformations") by van Vliet [Vli78, Figure 5].
Concrete approaches exploiting hierarchy are reviewed in the forthcoming section on road
networks (Section 3.2.3).
3.2.2 Graph Annotation Approaches
Algorithms exploiting the annotation approach are sometimes also termed goal-directed search
algorithms. Additional information is attached to all/some vertices or edges of the graph. Based
on this information, the search algorithm decides which part of the graph not to search.
A* Search
A* [Gel63, Sam63, KHI+86, HNR68, Dor67, Gel77, Kor85] is a popular search technique in Artificial Intelligence. The idea is to direct the search towards the goal. In the priority queue implementation of Dijkstra's algorithm, at each iteration, the node with the shortest distance to the source is taken from the queue. In the original A* algorithm, instead of ordering nodes by their distance from the source, nodes in the queue are ordered by their distance from the source plus a potential (see for example [Del09, Algorithm 2, p. 22]). By adding a potential to the priority of each node, the order in which nodes are removed from the priority queue is altered. A good potential function increases the priority of nodes that lie on a shortest path to the target (usually by decreasing the priority of the other nodes). In road networks for example, if the coordinates of the target are known, the Euclidean distance provides a good lower bound on the graph distance and thus a good potential function [SV86]. Using the Euclidean distance as a potential function for A* has been exploited and applied successfully. In general, however, the coordinates may not be known. A metric embedding or a drawing [WW05] can provide coordinates for a potential function.
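A minimal sketch of A* with the Euclidean distance as potential. It assumes that every vertex has planar coordinates in `coords`, that `adj[u]` lists `(neighbor, length)` pairs, and that every edge is at least as long as the straight-line distance between its endpoints, so that the potential is a consistent lower bound and each vertex needs to be settled only once. The function name and data layout are hypothetical.

```python
import heapq
import math


def a_star(adj, coords, source, target):
    """Return the shortest-path length from source to target (inf if unreachable)."""
    def potential(v):
        (x1, y1), (x2, y2) = coords[v], coords[target]
        return math.hypot(x1 - x2, y1 - y2)

    dist = {source: 0.0}
    heap = [(potential(source), source)]   # priority = distance from source + potential
    settled = set()
    while heap:
        _, u = heapq.heappop(heap)
        if u in settled:
            continue                        # stale queue entry
        if u == target:
            return dist[u]
        settled.add(u)
        for v, length in adj[u]:
            candidate = dist[u] + length
            if candidate < dist.get(v, math.inf):
                dist[v] = candidate
                heapq.heappush(heap, (candidate + potential(v), v))
    return math.inf
```

With the potential set to zero everywhere, the same loop degenerates to Dijkstra's algorithm; the potential only changes the order in which vertices leave the queue.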
Goldberg and Harrelson [GH05] (see also [GW05, GKW06, GKW07]) propose to use a set of landmarks $S \subseteq V$ and the triangle inequality (their method is sometimes called ALT (A*-Landmarks-Triangle inequality) for this reason). Their potential function is a beacon-based triangulation [KSW09]. Analogous to the distance oracle of Thorup and Zwick [TZ05], all nodes $v \in V$ know the distance to all landmarks $L \in S$. For two nodes $u, v \in V$ and a landmark $L \in S$, the triangle inequality yields that $d(u, v) \geq d(u, L) - d(v, L)$. Taking the maximum difference over all $L \in S$ yields the best estimate, which is used as a potential in the A* search. The quality of the lower bound highly depends on the landmark selection. Since in the preprocessing phase the distances to all landmarks need to be computed and stored, the preprocessing time and the space consumption highly depend on the number of landmarks. An important question is how to select few but good landmarks. Random selection is a straightforward approach but it may not guarantee good coverage, meaning that some nodes are far from all landmarks. Several heuristics have been proposed to improve coverage [GH05, GW05], or to choose important nodes [PBCG09]. The theory on beacon-based triangulations by Kleinberg et al. [KSW09] may help to explain for which graphs ALT works well and how many landmarks to select. For graphs with bounded doubling dimension, triangulations with respect to a constant number of landmarks yield $(1 + \delta, 0)$-approximate distances for a $(1 - \varepsilon)$-fraction of the node pairs (Kleinberg et al. also prove that this slack is necessary). While A* with landmarks [GH05] works for general graphs, it is thus expected to perform best on graphs with low doubling dimension. Poudel [Pou08] proposes a similar algorithm with a potential function based on an approximate distance oracle. With increasing quality of the distance approximations, fewer nodes are visited at query time.
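One possible coverage heuristic is farthest-first selection: repeatedly add the vertex that is farthest from the landmarks chosen so far. The sketch below is a simplified variant under stated assumptions (connected, undirected, unweighted graph as an adjacency dictionary; the start vertex itself serves as the first landmark; at most as many landmarks as vertices). It also shows how the resulting landmark distances give the ALT potential for a fixed target. All names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, sources):
    """Hop distance of every vertex to its nearest vertex in `sources`."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def farthest_landmarks(adj, k, start):
    """Greedy farthest-first selection of k landmarks, beginning with `start`."""
    landmarks = [start]
    while len(landmarks) < k:
        dist = bfs_distances(adj, landmarks)
        landmarks.append(max(dist, key=dist.get))
    return landmarks


def alt_potential(adj, landmarks, target):
    """Return a function v -> max over landmarks L of |d(v, L) - d(target, L)|."""
    tables = [bfs_distances(adj, [L]) for L in landmarks]
    return lambda v: max(abs(d[v] - d[target]) for d in tables)
```

On a path graph, a landmark at one end gives exact lower bounds for all pairs, whereas a landmark in the middle gives a bound of 0 for pairs that are symmetric around it; this illustrates why the selection matters.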
A* is easy to implement and it yields decent speedups. In external memory setups, there appear to be better practical methods [EM01]. Better speedups can be obtained when combining A* with the bidirectional version of Dijkstra's algorithm. This is, however, not a straightforward combination. The potential functions for the forward and the backward search need to be consistent such that the shortest path is found when both searches meet. A good approach for consistent potentials is to take the average of the forward and backward potential functions [IHI+94].
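The averaging idea can be written out as follows. The notation is an assumption made for this sketch: $\pi_t(v)$ is a lower bound on $d(v, t)$ used by the forward search, $\pi_s(v)$ is a lower bound on $d(s, v)$ used by the backward search, and $\ell(u, v)$ is the length of arc $(u, v)$.

```latex
p_f(v) = \tfrac{1}{2}\bigl(\pi_t(v) - \pi_s(v)\bigr),
\qquad
p_r(v) = \tfrac{1}{2}\bigl(\pi_s(v) - \pi_t(v)\bigr) = -p_f(v),
\\[4pt]
\ell_f(u, v) = \ell(u, v) - p_f(u) + p_f(v),
\qquad
\ell_r(v, u) = \ell(u, v) - p_r(v) + p_r(u) = \ell_f(u, v).
```

Since $p_f + p_r = 0$, both searches see the same reduced length on every arc, which is the consistency requirement stated above.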
Querying using precomputed cluster distances [MSM09] is a somewhat similar approach. The
network is partitioned into clusters and distances between all pairs of clusters are precomputed.
These cluster distances yield upper and lower bounds for distances, based on which the search is
directed towards the goal.
Reach
Reach-Based Routing [Gut04] is another modification of Dijkstra's algorithm. Each vertex is assigned a so-called reach value that determines whether a particular vertex will be considered during Dijkstra's algorithm. To have a high reach value, a vertex must lie on a shortest path that extends a long distance in both directions from the vertex (similar to Highway Hierarchies [SS05, SS06, NBB+08], to be discussed in the next section). A vertex is excluded from consideration if its reach value is small, that is, if it does not contribute to any path long enough to be of use for the current query. Gutman [Gut04] reports fast query times with a speedup factor of 10 compared to Dijkstra's algorithm.
Arc Flags
In the preprocessing phase, the Arc Flag method [Lau04, KMS05, MSS+06] partitions the graph into clusters and then, for each cluster, marks all edges on which a shortest path towards a node in the cluster starts. At query time, edges that are not marked with the target cluster are ignored. A related approach uses geometric containers [WW03, WWZ05]. On its own, the preprocessing step of the Arc Flag approach is rather expensive. However, when applied within a hierarchy [MSS+06] or when combined with other techniques, it can be very efficient [BD08, BDS+08].
2-Hop Covers and Reachability Queries
For general directed graphs, the answers to shortest path and reachability queries (Definition 31) are harder to compute than for undirected graphs [AF90]. The scheme by Cohen et al. [CHKZ03] lies at the boundary between theory and practice. Cohen et al. focus on algorithms for directed graphs that occur in practice. They introduce a new technique, which they call 2-hop covers: such a cover is a set of shortest paths $P$ such that for every pair of vertices $(u, v) \in V \times V$ there is a shortest path between $u$ and $v$ that is the concatenation of two paths in $P$. Based on this cover, they assign labels to vertices. The sizes of the labels thus depend directly on the size
of the cover. It is known that graph classes with separators of size O(n
g), 0) distortion [LMN02]. This means that, for constant r and girth $O(\lg_r n)$, an embedding-based
distance oracle with space $\tilde{O}(n)$ has stretch $(\Omega(\sqrt{\lg n}), 0)$.
4.2. PRELIMINARIES
$i - 1$ rounds and the output is from {1, 2}, indicating which player will communicate in the $i$-th
round. The input of $C_i$ is the input string of this selected player as well as the communication
pattern of the first $i - 1$ rounds. The output of $C_i$ is the bit that this player will communicate in
the $i$-th round. Finally,
1
and
2
are 0/1valued functions that the players apply at the end of the
protocol to their inputs as well as the communication pattern in the t rounds in order to compute
the output. These two outputs must be (x, y). The communication complexity of is
C() = min
protocols CP
max
x,y
{Number of bits exchanged by CP on x, y} .
A trivial protocol is to communicate the entire input of one player, compute the function value at the
other player, and send back the result. This yields a communication complexity of at most $n + 1$ bits.
The objective is to communicate (significantly) less.
Lopsided Set Disjointness
In communication complexity [Yao79, KN96, AB07], SETDISJOINTNESS is the problem of two
separated agents deciding whether two sets are disjoint. In the asymmetric version of the problem,
called LOPSIDEDSETDISJOINTNESS (LSD), Alice and Bob receive sets $S_{Alice}$ and $S_{Bob}$, respectively.
Their goal is to determine whether $S_{Alice} \cap S_{Bob} = \emptyset$ using a communication protocol
(Figure 4.1). More precisely (as in Definition 32), the function is 1 if $S_{Alice} \cap S_{Bob} = \emptyset$ and
0 otherwise.
Figure 4.1: The LOPSIDEDSETDISJOINTNESS communication problem. (Alice is given $S_{Alice} \subseteq U$, Bob is
given $S_{Bob} \subseteq U$, and they communicate to decide whether $S_{Alice} \cap S_{Bob} = \emptyset$.)
LOPSIDEDSETDISJOINTNESS has two parameters, N and B, known to both Alice
and Bob. The universe has size NB and both Alice and Bob are given a subset of this universe;
$S_{Alice} \subseteq [NB]$ and $S_{Bob} \subseteq [NB]$. Alice's set has size $|S_{Alice}| = N$. (Alice is given one of
$\binom{NB}{N}$ different sets.) B thus denotes the ratio between the size of the universe NB and N. The size
of Bob's set is not fixed; it may be linear in NB. Two trivial protocols are the following: (1) Alice
communicates her set $S_{Alice}$ with a message of length $\lg \binom{NB}{N} = O(N \lg B)$ bits; Bob can then
compute $S_{Alice} \cap S_{Bob}$ and reply with 1 bit. Alternatively, (2) Bob communicates his set $S_{Bob}$
with a message of length $\lg |S_{Bob}| + \lg \binom{NB}{|S_{Bob}|}$ (encoding $|S_{Bob}|$ and $S_{Bob}$); Alice computes
$S_{Alice} \cap S_{Bob}$ and sends back 1 bit. Either Alice or Bob communicates his/her complete set. The
question is how much communication Alice and Bob need to decide whether $S_{Alice} \cap S_{Bob} = \emptyset$. A
trivial randomized protocol is the following: Alice tosses a coin; based on the result, she decides
whether $S_{Alice} \cap S_{Bob} = \emptyset$ and sends 1 bit to Bob. Let this protocol be denoted by CP. The probability
CHAPTER 4. LOWER BOUNDS FOR SPARSE GRAPHS
that the result is correct, that is, the probability that $CP(S_{Alice}, S_{Bob})$ equals the value of the
function on $(S_{Alice}, S_{Bob})$, is $1/2$. Alice could also send a random sample, based on which Bob must
decide whether $S_{Alice} \cap S_{Bob} = \emptyset$. A randomized protocol CP is said to have two-sided error if
there is a non-zero probability that Alice and Bob err when they output either 0 or 1, and it is said
to have one-sided error if the probability that Alice and Bob err is zero for at least one of the
possible outcomes 0 or 1. For example, CP has one-sided error if, whenever the function value on
$(S_{Alice}, S_{Bob})$ is 0, we have $\Pr[CP(S_{Alice}, S_{Bob}) = 0] = 1$ [MR95]. A protocol CP is said to have
bounded error if there are two constants greater than $1/2$ that bound the probability of a correct
output from below: one for inputs on which the function value is 0 and one for inputs on which the
function value is 1.
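To make the cost of trivial protocol (1) concrete, the following sketch computes the length of Alice's message and simulates the exchange. It is only an illustration under stated assumptions: the function names are hypothetical, the universe is taken to be {1, ..., NB}, and the standard estimate $\binom{NB}{N} \le (eB)^N$ is used to compare against $N \lg(eB)$.

```python
import math

def alice_message_bits(N, B):
    """Bits needed to name one of the C(NB, N) possible sets Alice may hold."""
    return math.ceil(math.log2(math.comb(N * B, N)))

def trivial_protocol(s_alice, s_bob, N, B):
    """Protocol (1): Alice sends her whole set, Bob replies with one bit.

    Returns (answer, total_bits), where answer is 1 if the sets are disjoint.
    """
    assert len(s_alice) == N
    assert all(1 <= x <= N * B for x in s_alice | s_bob)
    answer = 1 if not (s_alice & s_bob) else 0
    return answer, alice_message_bits(N, B) + 1

# Example with N = 4 and B = 8 (universe of size 32).
answer, bits = trivial_protocol({1, 2, 3, 4}, {5, 6}, 4, 8)
```

For N = 4 and B = 8, Alice's message takes 16 bits, which is below the estimate $N \lg(eB) \approx 17.8$.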
The communication complexity to solve LSD with a one-sided error is known to be bounded
from below (Miltersen et al. [MNSW98]). Note that a lower bound on communication protocols
with one-sided error is easier than a bound on protocols with two-sided error, since one-sided error
protocols are required to have a better performance than two-sided error protocols.
Lemma 9 (Miltersen et al. [MNSW98]). There exists some constant C > 0 such that in a one-sided
error protocol for LSD, either Alice sends at least $CN \lg B$ bits or Bob sends at least $NB^{C}$ bits.
Andoni et al. [AIP06] extend the bound to include protocols with two-sided error as well;
Pătraşcu [Pat08a, Pat09, Pat08c] proves the following (alternatively, one could use Pătraşcu's
approach with [Pat08b, Theorem 5.15] and [Pat08b, Chapter 5.4.3]).
Lemma 10 (Pătraşcu [Pat09, Theorem 1.4]). There exists some constant C > 0 such that in a
bounded-error protocol for LSD, either Alice sends at least $CN \lg B$ bits or Bob sends at least $NB^{C}$ bits.
This roughly means that, for Alice and Bob to know whether $S_{Alice} \cap S_{Bob} = \emptyset$ with probability
bounded away from $1/2$, either Alice or Bob must send almost their complete set. (For the
communication complexity of symmetric set disjointness, see [KS92, Raz92, BYJKS04, HW07].)
Data Structure Lower Bounds based on LSD
One key part in Pătraşcu's results [Pat08a] is the reduction from LOPSIDEDSETDISJOINTNESS
(LSD) to reachability oracles. Recall (Definition 31) that the reachability query problem is, given
a (sparse) directed graph G = (V, E), to construct a data structure using less than $n^2$ space such
that reachability queries (deciding whether there is a directed path from u to v) can be answered
efficiently. Reachability oracles for undirected graphs are trivial: we compute the connected
components and store a component number for each node. Storing reachability information for directed
graphs appears to be hard [AF90] and so is the reachability query problem.
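The trivial oracle for undirected graphs mentioned above can be sketched in a few lines. This is an illustrative sketch only; the union-find implementation and the function names `build_component_labels` and `reachable` are assumptions made for this example.

```python
def build_component_labels(n, edges):
    """Return a list mapping each node 0..n-1 to a component representative."""
    parent = list(range(n))

    def find(x):
        # Path halving keeps the trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
    return [find(x) for x in range(n)]

def reachable(component, u, v):
    """Constant-time reachability query on an undirected graph."""
    return component[u] == component[v]

component = build_component_labels(5, [(0, 1), (1, 2), (3, 4)])
```

The oracle stores one number per node, so its space is linear in n, and each query compares two stored numbers.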
Recall Pătraşcu's theorem on reachability oracles (Theorem 5), which we restate here.
Theorem (Pătraşcu [Pat08a, Theorem 2]). A reachability oracle using space S in the cell-probe
model with w-bit cells requires query time $t = \Omega\left(\lg n / \lg \frac{Sw}{n}\right)$.
Pătraşcu's proof is a reduction from a variant of LSD to the problem of reachability queries in
a butterfly graph and its subgraphs.
Definition 33. A butterfly graph BUTTERFLY(, r) is a directed graph F
,r
= (V, A) specified by
two parameters: the depth and the degree r.
F
,r
has + 1 layers V
0
, V
1
, . . . V
with r
vertices each.
The nodeset is [ + 1] [r]
.
Every vertex v V \ V
has out-degree r.
Every vertex v V \ V
0
has in-degree r.
Arcs a A
i=0
V
i
V
i+1
only connect nodes in adjacent levels V
i
, V
i+1
.
Two nodes v V
i
and v
V
i+1
are adjacent if they differ only at coordinate i. Node
(i, c
1
, c
2
, . . . c
i
, . . . c
i
, . . . c
) for c
i
[r].
Note that paths between any s V
0
and t V
.
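The definition above can be illustrated with a small construction. The sketch below is an assumption-laden illustration, not the thesis's formal definition: it uses the hypothetical names `butterfly`, `depth`, and `count_paths`, represents a vertex as a pair (layer, coordinate tuple), and assumes that arcs from layer i to layer i + 1 may change only the coordinate with 0-based index i.

```python
from itertools import product

def butterfly(depth, r):
    """Arcs of a butterfly graph with depth + 1 layers of r**depth vertices each.

    A vertex is (layer, coords) with coords in {0, ..., r-1}^depth. An arc leads
    from (i, c) to (i + 1, c') if c and c' agree everywhere except possibly at
    coordinate i (0-based indexing is an assumption of this sketch).
    """
    arcs = {}
    for i in range(depth):
        for c in product(range(r), repeat=depth):
            arcs[(i, c)] = [(i + 1, c[:i] + (x,) + c[i + 1:]) for x in range(r)]
    return arcs

def count_paths(arcs, s, t):
    """Number of directed paths from s to t (the graph is layered and acyclic)."""
    if s == t:
        return 1
    return sum(count_paths(arcs, nxt, t) for nxt in arcs.get(s, []))

arcs = butterfly(2, 3)
```

In this construction every vertex outside the last layer has out-degree r, every vertex outside the first layer has in-degree r, and each first-layer vertex reaches each last-layer vertex by exactly one path.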
Let (e
1
, e
2
, . . . e
= (V, E \
f(S
Bob
)). Alice can find out whether at least one of her edges $e_i$ is in Bob's set $f(S_{Bob})$ by asking
one reachability query reachable(u, v). Alice does so by sending a message in a communication
protocol (technically speaking, she encodes the position of the word in the data structure she
wants to read as in the cell-probe model). By asking all queries in parallel [PT06, Pat08a], Alice
learns whether at least one of her edges is in Bob's set $f(S_{Bob})$. Alice thus also knows
whether $f(S_{Alice})$ and $f(S_{Bob})$ intersect and hence whether $S_{Alice}$ and $S_{Bob}$ intersect. It turns out
that, if the data structure is small, the communication complexity of this protocol is very low, which
contradicts the communication lower bound of LSD.
Figure 4.2: A drawing of an undirected butterfly graph with degree 2 spanning 3 layers.
Recall that the lower bound for reachability oracles directly implies a lower bound for distance
oracles on directed graphs (Section 3.1.1, Definition 31). In a straightforward way, it also implies
a lower bound for distance oracles on undirected graphs. The following direct reduction from
reachability oracles for subgraphs of the butterfly graph to distance oracles with stretch less than
(1, 2) yields the same space lower bound for the latter. The lemma is a minor result of this chapter;
it serves as a warm-up and as an illustration of Pătraşcu's reduction technique.
Lemma 11. In the cell-probe model with w-bit cells, a distance oracle with additive stretch less
than 2 using space S requires query time $t = \Omega\left(\lg n / \lg \frac{Sw}{n}\right)$.
2 We abuse notation by simplifying $f(E') := \bigcup_{e \in E'} \{f(e)\}$.
Figure 4.3: A drawing of an undirected butterfly graph with degree 3 spanning 3 layers.
Proof. We give a straightforward reduction from reachability oracles for the directed butterfly
graph F in Pătraşcu's reduction [Pat08a, Reduction 13] (see also [Pat08b, Reduction 7.11]) to a
distance oracle for the same graph, interpreted as undirected, say F
must
be at least + 2 since the buttery graph F
r1
r
.
It is known that each $|\lambda_i|$ is a real number in the range [0, 1]. It is also known that $\lambda_0 = 1$ and
that $\lambda_1 \geq 0$. Therefore, $\lambda(G) := \max\{\lambda_1, |\lambda_{n-1}|\}$.
At one point in the proof, we rely on the expansion property of a graph. We use the follow-
ing theorem. Based on the expansion (G), Alon et al. [AFWZ95] prove a lower bound on the
probability that a random walk of length stays inside a set of a certain density.
Theorem 13 (Alon et al. [AFWZ95, Theorem 4.2]). For a graph G = (V, E) with := (G),
the probability that a random walk of steps from a uniformly random starting vertex stays inside
U V , where |U| n 6n, is at least ( 2)
.
The theorem implies the following corollary by setting = 0.9.
Corollary 14. For a graph G = (V, E) with (G) 0.1, the probability that a random walk of
steps from a uniformly random starting vertex stays inside U V , where |U|
9
10
|V |, is at least
1
2
.
Deep knowledge of expansion is not necessary to understand our proof in this chapter. For
more information on expander graphs, we refer to [Alo86, AFWZ95, HLW06].
Graph Construction
Our proof relies on the existence of graphs with large girth and large P
,
(G); Ramanujan graphs
3
(construction by Lubotzky et al. [LPS88] using the Cayley graph of a projective general linear
group) are what we use.
4
For the connection to dense graphs with large girth, see also [EJ08,
Construction IV].
Lubotzky et al. [LPS88, pp. 262-263] prove the following (see footnote 5). Recall that the Legendre
symbol for two unequal primes $p \neq q$ is defined as follows:
$$(p|q) = \begin{cases} +1 & \text{if there is an integer } x \text{ such that } p \equiv x^2 \pmod{q} \\ -1 & \text{otherwise} \end{cases}$$
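The case distinction in the theorem below depends on this symbol, which is easy to evaluate. The following sketch computes it in two ways: directly from the definition and via Euler's criterion (a standard fact that is not stated in the text). The function names are hypothetical.

```python
def legendre_by_definition(p, q):
    """(p|q) for distinct odd primes: +1 if p is a square modulo q, else -1."""
    return 1 if any((x * x - p) % q == 0 for x in range(1, q)) else -1

def legendre(p, q):
    """(p|q) via Euler's criterion: p^((q-1)/2) is 1 mod q iff p is a square mod q."""
    return 1 if pow(p, (q - 1) // 2, q) == 1 else -1

primes_1_mod_4 = [5, 13, 17, 29]
```

For example, 13 is a square modulo 17 (since $8^2 = 64 \equiv 13$), while 5 is not a square modulo 13.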
3 Ramanujan graphs are named after the Indian mathematician Srinivasa Iyengar Ramanujan (1887-1920).
4 In our proof, we apply Theorem 13 by Alon et al. with the condition that $\lambda(G) \leq 0.1$. The construction by
Lubotzky et al. is not the only one that guarantees regular expanders with large girth. The construction by
Morgenstern [Mor94] may potentially work as well. It has been shown that random regular graphs also have large
expansion [Fri91, FKS89]. Random regular graphs with large girth may thus potentially work as well.
5 Note that, in their paper, Lubotzky et al. do not normalize $\lambda$. The statement in Theorem 15 uses normalized
eigenvalues as defined in Definition 35.
Theorem 15 (Lubotzky et al. [LPS88, pp. 262-263]). Let p and q be unequal primes congruent
to 1 mod 4. There exists a (p + 1)-regular graph $X^{p,q}$ with the following properties:
Case $(p|q) = +1$
1. $X^{p,q}$ has $\frac{q(q^2 - 1)}{2}$ nodes.
2. $g(X^{p,q}) \geq 2 \lg_p q$
3. $\mathrm{diam}(X^{p,q}) \leq 2 \lg_p n + 2 \lg_p 2 + 1$
4. $\lambda(X^{p,q}) \leq \frac{2\sqrt{p}}{p+1}$
Case $(p|q) = -1$
1. $X^{p,q}$ has $q(q^2 - 1)$ nodes.
2. $g(X^{p,q}) \geq 4 \lg_p q - \lg_p 4$
3. $\mathrm{diam}(X^{p,q}) \leq 2 \lg_p n + 2 \lg_p 2 + 1$
4. $\lambda(X^{p,q}) \leq \frac{2\sqrt{p}}{p+1}$
In the following we prove the existence of Ramanujan graphs for infinitely many and various
choices of large r and n. The theorem is mostly implied by the result of Lubotzky et al. [LPS88]
stated in Theorem 15.
Lemma 16. For every sufficiently large $n_0, r_0$ with $n_0 > 8r_0^3$, there exists a graph G = (V, E)
with the following properties:
1. $|V| = n$ with $\frac{n_0}{2} \leq n \leq 9n_0$
2. G is r-regular, where $r_0 \leq r \leq 2r_0$
3. The girth of G is at least $g(G) \geq \frac{1}{2} \lg_r n$
4. $\lambda(G) \leq \frac{2\sqrt{r-1}}{r}$
Proof. The graph claimed to exist is a Ramanujan graph $X^{p,q}$ as in Theorem 15. The construction
by Lubotzky et al. [LPS88, pp. 262-263] requires unequal primes $p \neq q$, both congruent to 1
mod 4. The graph has at least $n = \frac{q(q^2 - 1)}{2}$ nodes and it is $r = (p + 1)$-regular.
In the following, we prove the existence of primes p, q such that r and n lie in the ranges
$r \in [r_0, 2r_0]$ and $n \in [\frac{n_0}{2}, 9n_0]$.
The Bertrand-Chebyshev theorem (see footnote 6) states that for every x > 1 there is always at least
one prime p such that x < p < 2x. This generalizes to certain arithmetic progressions [Bre32, Erd35,
Bre64, Mor93]. Let $z \in \mathbb{N}_0$. Breusch [Bre32, p. 505] states
[...] daß für $x \geq 7$ zwischen x und 2x stets Primzahlen einer jeden der vier Progressionen
3z + 1, 3z + 2, 4z + 1, 4z + 3 liegen.
6 Named after Joseph Louis François Bertrand (1822-1900), who conjectured the existence of primes between x
and 2x in 1845, and Pafnuty Lvovich Chebyshev (1821-1894), who proved the conjecture in 1850. Ramanujan gave a
simpler proof in 1919. It's a small world!
that for every $x \geq 7$ there is a prime of the form p = 4z + 1 in the interval between x and 2x.
Note that the modular congruence is satisfied, $p \equiv 1 \pmod{4}$.
Due to Breusch [Bre32, p. 505], there exists a prime $p \in [r_0 - 1, 2r_0 - 2]$ of the form p = 4z + 1.
Let p denote this prime.
The Lubotzky et al. construction of the Ramanujan graph requires a different prime $q \neq p$ of
the form q = 4z + 1. Again, due to Breusch [Bre32, p. 505], there exists such a prime $q \in [x, 2x]$
if the intervals for p and q do not overlap. Let $x := \lceil n_0^{1/3} \rceil + 1 \in [n_0^{1/3} + 1, n_0^{1/3} + 2)$. The imposed
condition $n_0 > 8r_0^3$ ensures that the two intervals do not overlap. Thus, there is a different prime
in the interval [x, 2x] for every integer $x \geq 2r_0 - 1$. Let q denote this prime. The Ramanujan
graph has either $q(q^2 - 1) = q^3 - q$ or $\frac{q(q^2 - 1)}{2} = \frac{q^3 - q}{2}$ nodes. For the number of nodes n, we
have that $n \in \left[\frac{x^3 - x}{2}, 8x^3 - 2x\right]$. With $x = \lceil n_0^{1/3} \rceil + 1 \in [n_0^{1/3} + 1, n_0^{1/3} + 2)$, we derive
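$$n \in \left[ \frac{n_0 + 3n_0^{2/3} + 3n_0^{1/3} + 1 - n_0^{1/3} - 2}{2},\; 8\left(n_0 + 6n_0^{2/3} + 12n_0^{1/3} + 8\right) - 2\left(n_0^{1/3} + 1\right) \right]$$
$$n \in \left[ \frac{n_0 + 3n_0^{2/3} + 2n_0^{1/3} - 1}{2},\; 8n_0 + 48n_0^{2/3} + 94n_0^{1/3} + 62 \right]$$
$$n \in \left[ \frac{n_0}{2},\; 9n_0 \right]$$
for $n_0$ sufficiently large (such that $n_0 \geq 48n_0^{2/3} + 94n_0^{1/3} + 62$).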
The graph is (p + 1)-regular. Let r := p + 1.
The girth is at least $2 \lg_p q$ (since $4 \lg_p q - \lg_p 4 \geq 2 \lg_p q$ for $q \geq 2$). In terms of n and r, this
yields
$$g(X^{p,q}) \geq 2 \lg_p q = \frac{2}{3} \lg_p q^3 = \frac{2}{3} \cdot \frac{\ln q^3}{\ln p} = \frac{2 \ln(p+1)}{3 \ln p} \lg_{p+1} q^3 \geq \frac{1}{2} \lg_r n.$$
The expansion is $\lambda(X^{p,q}) \leq \frac{2\sqrt{p}}{p+1}$ due to Theorem 15. This concludes the proof.
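The choice of primes in this proof can be replayed numerically. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, primality is tested by trial division, and the first suitable prime in each interval is taken (the proof only needs that one exists).

```python
import math

def is_prime(m):
    """Trial-division primality test, adequate for small illustrative inputs."""
    return m >= 2 and all(m % d for d in range(2, math.isqrt(m) + 1))

def prime_1_mod_4_in(lo, hi, exclude=None):
    """First prime of the form 4z + 1 in [lo, hi] other than `exclude`, or None."""
    for m in range(lo, hi + 1):
        if m % 4 == 1 and m != exclude and is_prime(m):
            return m
    return None

def ceil_cube_root(n):
    """Smallest integer c with c**3 >= n, computed with exact integer checks."""
    c = round(n ** (1 / 3))
    while c ** 3 < n:
        c += 1
    while c > 0 and (c - 1) ** 3 >= n:
        c -= 1
    return c

def choose_primes(r0, n0):
    """Pick p in [r0 - 1, 2 r0 - 2] and q in [x, 2x] as in the proof of Lemma 16."""
    assert n0 > 8 * r0 ** 3
    p = prime_1_mod_4_in(r0 - 1, 2 * r0 - 2)
    x = ceil_cube_root(n0) + 1
    q = prime_1_mod_4_in(x, 2 * x, exclude=p)
    return p, q

p, q = choose_primes(8, 10 ** 6)
```

With $r_0 = 8$ and $n_0 = 10^6$ this picks p = 13 and q = 101, so r = 14 and the graph has either 515100 or 1030200 nodes; both values lie in $[n_0/2, 9n_0]$.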
4.2.3 Counting Permutations
In the following, we prove the existence of a not-too-large set of permutations with certain
properties. The lemma is a tailored restatement of Pătraşcu [Pat08a, Lemma 11], proven using the
probabilistic method [AS00, Erd63].
Lemma 17. Let S
|. Let B A
of size at least
|B| b. There exists a set S
of || =: >
a
b
lna permutations {
1
,
2
, . . . ,
} such that
for any set A A() there exists a permutation
i
with
aA
{
i
( a)} B.
Proof. We denote
i
(A) :=
aA
{
i
( a)}. Let denote a randomly-chosen permutation : []
[]. Fix A A
. The
probability that none of the permutations
i
maps A to a set contained in B is bounded by
Pr[i [] :
i
(A) B] =
_
1
b
a
_
< e
b/a
.
We need that for any A A() there is at least one permutation out of the permutations that
maps A to set contained in B. We apply the union bound to obtain the following.
q := Pr [A A()i [] :
i
(A) B] a
_
1
b
a
_
< ae
b/a
.
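The union bound can be checked numerically. In the sketch below, `a` and `b` are the quantities from Lemma 17, while the name `k` for the number of random permutations is an assumption of this example (the symbol used in the lemma did not survive in this copy). The sketch takes the smallest integer k exceeding $(a/b) \ln a$ and evaluates the bound $a (1 - b/a)^k$.

```python
import math

def permutations_needed(a, b):
    """Smallest integer k with k > (a / b) * ln(a)."""
    return math.floor((a / b) * math.log(a)) + 1

def failure_bound(a, b, k):
    """Union bound a * (1 - b/a)**k on the probability that some set is missed."""
    return a * (1 - b / a) ** k

k = permutations_needed(1000, 50)
bound = failure_bound(1000, 50, k)
```

For a = 1000 and b = 50 this gives k = 139, and the bound evaluates to a value below 1, so a suitable set of permutations exists with positive probability.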
We need 1 q > 0 to apply the probabilistic method [AS00, Erd63] to guarantee that there exists
a set S
:=
_
X
_
. Let B be a set
of subsets of Y of size at least |B| b such that each subset has elements. There exists a set
of
=
b
_
e
ln
_
e
_
bijections f
1
, . . . , f
to a set A
Y. Since
f is a bijection, we have that
|{A A
: f(a)}| = |A
| =: a.
By Lemma 17, there exists a set of permutations {
1
,
2
, . . .
(as
above) there exists a
i
that maps A
to an element of B.
We have
7
that
a =
_
_
<
_
e
.
The statement of the lemma is immediate.
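7 We use the following well-known inequality for binomial coefficients:
$$\binom{n}{k} = \frac{n!}{k!(n-k)!} = \frac{n (n-1) \cdots (n-k+1)}{k!} \leq \frac{n^k}{k!} \leq \left(\frac{en}{k}\right)^k,$$
where the last step is due to the Taylor series $e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}$, which yields $e^x \geq \frac{x^k}{k!}$ for all $x \geq 0$ and all k,
in particular $e^k \geq \frac{k^k}{k!}$ and thus $k! \geq \left(\frac{k}{e}\right)^k$.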
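The chain of inequalities in this footnote is easy to verify for small values. The following sketch is a numeric sanity check, not a proof; the function name `binomial_chain` is hypothetical.

```python
import math

def binomial_chain(n, k):
    """Return (C(n, k), n^k / k!, (e n / k)^k) for 1 <= k <= n."""
    return (
        math.comb(n, k),
        n ** k / math.factorial(k),
        (math.e * n / k) ** k,
    )

# Check the chain C(n, k) <= n^k / k! <= (e n / k)^k on a small range.
chain_holds = all(
    binomial_chain(n, k)[0] <= binomial_chain(n, k)[1] <= binomial_chain(n, k)[2]
    for n in range(1, 30)
    for k in range(1, n + 1)
)
```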
4.3. REDUCTION FROM LOPSIDED SET DISJOINTNESS
4.3 Reduction from Lopsided Set Disjointness
In this section, we show a lower bound on the space complexity of any approximate distance oracle
on any base-graph G, based on essentially two parameters of G: the girth of G, and the path-count
of G, which is the cardinality of the set P
,
(G).
We prove that a graph G with a large path-count and large girth is a hard base-graph for
approximate distance oracles.
Lemma 19. Let G = (V, E) be a graph, such that an (, 0)approximate distance oracle exists
for G and all its subgraphs, using query time t and space S. Let C denote the constant from the
LSD communication complexity lower bound in Lemma 10. Let , be two positive integers, such
that <
g(G)
+1
and |E| (2tw/)
1/C
. Then,
S
e
_
|P
,
(G)|
1/
e(|E|/)
1C
_
/t
_
1
e|E|
_
1/t
.
The proof proceeds as follows. We start by giving some intuition in Section 4.3.1. We prove
that if there exists a good data structure then there is a good (short) protocol for LSD (Sec-
tion 4.3.2). Then, we use the LSD lower bound to show that there cannot be a good data structure
(Section 4.3.3).
4.3.1 Intuition
Was wirklich zählt, ist Intuition.
(The only really valuable thing is intuition.)
Albert Einstein (18791975)
We give a rough sketch of the proof of the main theorem, highlighting some technical details.
Statements are not necessarily formally correct in this section (for details, see proof of Lemma 20).
The proof is a reduction from the communication problem LSD, in which Alice is given a set
$S_{Alice} \subseteq [N \cdot B]$ of cardinality N, Bob is given a set $S_{Bob} \subseteq [N \cdot B]$, and they must decide whether
$S_{Alice} \cap S_{Bob} = \emptyset$. For an illustration of the reduction, see Figure 4.4. There are communication
lower bounds for LSD (see Section 4.2.1). We prove that a distance oracle with good parameters
implies a good protocol for LSD, and derive our lower bound from the contrapositive of this claim.
A standard way to perform such reductions is to translate one query to the data structure into
t rounds of a communication protocol. In total, Alice sends Bob $t \lg S$ bits and Bob replies
with $tw$ bits. However, lower bounds in communication complexity are usually asymptotic. The
standard reduction puts the constant multiplicative factor in the exponent of S; therefore, using
the standard reduction, we can only prove lower bounds on the space complexity of the form
$S \geq x^{\Omega(1)}$, where x is some expression that depends on the problem parameters. Since any
distance oracle must use space at least linear in n and since there is a trivial distance oracle that
requires space on the order of $n^2$ (a complete distance table), this reduction cannot provide a
meaningful lower bound.
Pătraşcu and Thorup [PT06, Pat08a] found a way to prove lower bounds on the space complexity
S of data structures that hold up to a polylogarithmic multiplicative factor. Alice holds many
independent queries to the same database (their number is large, $O\left(\frac{n}{\mathrm{polylog}(n)}\right)$). The queries
performed in parallel are transformed into a communication protocol. We use the same approach.
Figure 4.4: Illustration of a reduction from a distance oracle to a communication protocol for
LOPSIDEDSETDISJOINTNESS. The protocol decides whether $S_{Alice} \cap S_{Bob} = \emptyset$. (The diagram shows
Alice, who is given $S_{Alice}$, choosing a bijection $f_i \in F$ and encoding its index i; Bob, who is given
$S_{Bob}$, computing the oracle for $(V, E \setminus f_i(S_{Bob}))$; and t rounds in which Alice sends cell addresses
and Bob replies with cell contents.) G and F are known to both Alice and Bob. Details are in the
proof of Lemma 20.
Our first objective is to prove that a good distance oracle implies a good protocol for LSD. We
identify the universe of the LSD problem with the edge set of a graph. Let the universe size of the
LSD problem be $N \cdot B = |E|$. For the moment, f denotes an arbitrary bijection $f : E \to [NB]$
between the edge set E and the elements of the universe [NB].
In the reduction, Alice plays the role of the querier, and Bob plays the role of the data structure.
Bob transforms his set $S_{Bob}$ into a subgraph of G, $G' = (V, E') = (V, E \setminus f(S_{Bob}))$. In other
words, Bob constructs a subgraph G
is large, g(G
from the case where at least one edge is missing. Now, if we perform queries of this
form, we can check paths. Therefore, using queries we can check whether edges are all
in the graph, or whether at least one of these edges is missing. By setting NB = |E| and
N = , this is an instance of the LSD problem. By doing a standard transformation from a data
structure to a communication protocol, we connect the parameters of the data structure to those of
a protocol for LSD: Alice sends roughly tlg(S/) bits to Bob, and Bob sends roughly tw bits
to Alice.
However, the instance described above is a very specific instance of the LSD problem, where
Alice's set is restricted to a set that corresponds to paths in G. For a general instance of LSD,
Alice's input may not necessarily map to a collection of vertex-disjoint paths of the required length.
There is a technique of Pătraşcu [Pat08a] to perform a reduction from LSD even when only a
non-negligible fraction of Alice's inputs map to a set of vertex-disjoint paths: in a preliminary round
of communication, Alice and Bob choose the bijection f from some not-too-large set of bijections
(see Lemma 18). In order to obtain such a set of bijections, we prove that there is a large set of
sets of vertex-disjoint paths in G (we refer to this as the path-count, as in Definition 34). If the
path-count is sufficiently large, then we obtain a strong lower bound.
For details, we refer to the proof of Lemma 20 in the next section.
4.3.2 Reduction from a data structure to a communication protocol
We prove that the distance oracle data structure can be transformed into a protocol for LSD. For
an illustration, see Figure 4.4.
Lemma 20. Let G = (V, E) be a graph, such that an (, 0)approximate distance oracle exists
for G and all its subgraphs, using query time t and space S in the cell-probe model with word size
w. Let , be two positive integers, such that <
g(G)
+1
. Then, there exists a protocol for LSD with
parameters N = and B = |E|/N, where Alice sends
tlg(eS/) +N lg(eB) + lg(eBN) lg |P
,
(G)| bits,
and Bob sends tw bits.
Proof. We begin by defining a bijection between the universe [NB] and E. For now, any bijection
will do; additional restrictions will be imposed later. Denote the bijection by $f : [NB] \to E$.
Alice and Bob both know G and f.
In the LSD problem, Alice receives a set $S_{Alice} \subseteq [NB]$ of cardinality $|S_{Alice}| = N$, and Bob
receives a set $S_{Bob} \subseteq [NB]$. We now derive a protocol for LSD based on the existence of the data
structure. Bob uses his set $S_{Bob}$ to construct a set of edges, $E' = E \setminus f(S_{Bob})$. That is, an edge
e E is in E
= (V, E
) creating the distance oracle data structure, and from now on, Bob plays
the role of the data structure. Since G
.
Alice constructs the set P = f(S
Alice
). For now, assume that P P
,
(G). We call this
assumption the perfect bijection scenario; we shall remove this assumption later. Under this
assumption, P can be written as the union of vertex-disjoint paths, each of length . Let
(u
1
, v
1
), . . . , (u
, v
, since
there are no cycles shorter than g(G), and since an approximate distance oracle never returns an
underestimate of the distance, but always an overestimate or the correct value. After querying all
distances (u
1
, v
1
), . . . , (u
, v
, v
). Alice then considers which cells should be probed in the rst round of each
of the queries, and sends the set of probed cells to Bob. This set can be communicated using
lg
_
S
_
bits (it is crucial not to send the queries one by one, which would require lg S bits [PT06,
Pat08a]). Bob replies with the contents of these cells, using w bits. Next, Alice sends the set of
cells to be probed in the second round, using another lg
_
S
_
bits, and Bob replies, using another
w bits. This procedure is repeated for t rounds in total. Overall, Alice sends
t lg
_
S
_
t lg
_
eS
= tlg
_
eS
_
bits,
and Bob sends tw bits.
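The saving from sending one set of probed cells per round, rather than one address per query, can be checked numerically. In the sketch below, `S` is the number of cells and `k` is the number of parallel queries; the name `k` and all function names are assumptions of this example. It compares the exact cost $\lg \binom{S}{k}$ with the estimate $k \lg(eS/k)$ and with the naive cost $k \lg S$.

```python
import math

def bits_for_probe_set(S, k):
    """Bits needed to name a set of k cells out of S cells."""
    return math.ceil(math.log2(math.comb(S, k)))

def bits_estimate(S, k):
    """Standard estimate k * lg(e S / k), an upper bound on lg C(S, k)."""
    return k * math.log2(math.e * S / k)

def bits_one_by_one(S, k):
    """Naive cost of sending k cell addresses separately."""
    return k * math.ceil(math.log2(S))

S, k = 2 ** 20, 1024
```

For $S = 2^{20}$ and k = 1024, the naive encoding costs 20480 bits per round, whereas the set encoding stays below $k \lg(eS/k) \approx 11717$ bits.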
We eliminate the perfect bijection assumption by including an additional round of communica-
tion at the beginning of the protocol. In this round, Alice chooses a particular bijection f to reach
the perfect bijection scenario. Instead of having only one bijection f : [NB] E, Alice and Bob
share knowledge of bijections f
1
, f
2
, . . . , f
_
bits
in the perfect bijection scenario. This yields a total of at most
lg(eBN) +N lg(eB) lg |P
,
(G)| + tlg
_
eS
_
bits.
Bob sends tw bits. We ignore the nal 1 bit that Alice sends to inform Bob of the result. For an
illustration, see Figure 4.4.
4.3.3 Communication complexity implies space complexity
Conditioned on the existence of a space-efficient distance oracle, there is a communication
protocol for LSD with low communication complexity (Lemma 20). In the following, we prove that
the lower bound on the communication problem LSD (Lemmas 9 and 10) yields a lower bound on
the space complexity of distance oracles.
Recall the statement of Lemma 19: Let G = (V, E) be a graph, such that an (, 0)
approximate distance oracle exists for G and all its subgraphs, using query time t and space
S. Let C denote the constant from the LSD communication complexity lower bound in Lemma 10.
Let , be two positive integers, such that <
g(G)
+1
and |E| (2tw/)
1/C
. Then,
S
e
_
|P
,
(G)|
1/
e(|E|/)
1C
_
/t
_
1
e|E|
_
1/t
.
Proof of Lemma 19. If a protocol computes LSD with parameters N and B, then either Alice
sends at least CN lg B bits or Bob must send at least NB
C
bits (Lemma 10).
Bob communicates tw bits. By the condition of Lemma 19, B (2tw/)
1/C
. This implies
NB
C
2tw/ = 2tw.
Bob, by sending tw bits, uses strictly less than NB
C
bits. The lower bound on the communi-
cation complexity of LSD implies that Alice must communicate at least CN lg B bits. Using the
protocol of Lemma 20, we have that
lg(eBN) +N lg(eB) lg |P
,
(G)| + tlg
_
eS
_
CN lg B.
Starting from this inequality, we derive a bound on S. Recall that N = and recall that the
edgeset is identied with the universe of LSD (|E| = BN). We get
tlg
_
eS
_
CN lg B lg(eBN) N lg(eB) + lg |P
,
(G)|
tlg
_
eS
_
lg |P
,
(G)| + C lg B lg(eB) lg(eBN)
lg
_
eS
_
1
t
lg |P
,
(G)| +
C
t
lg B
t
lg(eB)
1
t
lg(eBN)
eS
|P
,
(G)|
1/t
(B)
C/t
(eB)
/t
(eBN)
1/t
S
e
|P
,
(G)|
1/t
(eB
1C
)
/t
(eBN)
1/t
S
_
|P
,
(G)|
1/
eB
1C
_
/t
e(e|E|)
1/t
,
which yields the statement of the theorem.
4.3.4 Counting Paths
We prove a lower bound on the size of the set of sets of disjoint paths P
,
(G).
Lemma 21. Let G = (V, E) be an rregular graph and , be two positive integers, such that
the following three conditions hold:
1. (G) 0.1,
2. |V | 20, and
3. < g(G).
Then
|P
,
(G)|
_
|V |
_
r
8
_
.
Proof. Let N = .
Let us rst choose one path. There are |V | vertices to start a path. Since the graph is rregular,
and since < g(G), we have r choices for the rst step and r 1 choices for each subsequent
step. This yields
|V | r (r 1)
1
4.1
possibilities to choose one path. Since the graph is undirected and since we want to reduce LSD
to distance queries (as opposed to actual path queries), an s-t path is equal to a t-s path. We
divide (4.1) by 2 to account for this.
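The count of single paths can be verified by brute force on a small graph. The sketch below uses the Petersen graph (3-regular, 10 vertices, girth 5) as an illustrative example that does not appear in the text, and the parameter name `length` stands for the path length. For every length below the girth, the number of undirected simple paths should equal $|V| \cdot r \cdot (r-1)^{length-1} / 2$.

```python
def petersen():
    """Adjacency sets of the Petersen graph on vertices 0..9."""
    adj = {v: set() for v in range(10)}

    def add(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for i in range(5):
        add(i, (i + 1) % 5)          # outer 5-cycle
        add(i, i + 5)                # spokes
        add(i + 5, (i + 2) % 5 + 5)  # inner pentagram
    return adj

def count_simple_paths(adj, length):
    """Number of undirected simple paths with `length` edges (each counted once)."""
    def extend(path):
        if len(path) == length + 1:
            return 1
        return sum(extend(path + [w]) for w in adj[path[-1]] if w not in path)

    directed = sum(extend([v]) for v in adj)
    return directed // 2  # an s-t path and the t-s path are the same path

def path_formula(n, r, length):
    """|V| * r * (r - 1)^(length - 1) / 2, valid when length is below the girth."""
    return n * r * (r - 1) ** (length - 1) // 2

adj = petersen()
```

For lengths 1 to 4 the brute-force count matches the formula (15, 30, 60, and 120 paths). At length 5, which equals the girth, the formula overcounts because some non-backtracking walks close a 5-cycle.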
Let us now choose vertex-disjoint paths of length one by one. Recall that, due to
Corollary 14 (implied by Theorem 13 of Alon et al. [AFWZ95]), the probability that a random
walk of steps from a uniformly random starting vertex stays inside U V , where |U|
9
10
|V |,
is at least
1
2
|V |
2
_
r
4
_
(simple) paths of length . According to the corollary, each path has probability at least
1
2
to avoid
A. Therefore, the number of different paths of length in G that do not use any vertices of A is at
least
|V |
_
r
8
_
.
We apply this argument times to generate all sets with paths of length . After each
application, we add the vertices of the path to A. We divide by ! to account for the fact that the
order in which the paths were chosen does not matter. We obtain
|P
,
(G)|
_
|V |
_
r
8
_
!
|V |
!
_
r
8
_
_
|V |
_
r
8
_
.
This concludes the proof of Lemma 21.
4.3.5 Assembly
We now combine Lemma 21 with Lemma 19 to prove a lower bound for any expander graph based
on its expansion, degree, and girth. After this, we use the Ramanujan graph from Lemma 16 to
derive the main result of this chapter. Both proofs consist of just a sequence of calculations to glue
together all the conditions.
Lemma 22. Let C denote the constant from the LSD communication complexity lower bound in
Lemma 10. Let G = (V, E) be an rregular expander graph on |V | = n vertices, n sufciently
large, with expansion (G) 0.1 and girth g = g(G). In the cell-probe model with word-length
at most w = n
o(1)
, for an integer 1, any (, 0)approximate distance oracle with query time
t that works for G and its subgraphs, requires space at least
S
n
lg n
r
(
g
t
)
given that
g = g(G) 2 and
r
_
4tw
g
_
1/C
.
Proof of Lemma 22. Let =
g
2
lg n. Let =
|V |
20
. This yields
n
20 lg n
. Recall that
N = .
The conditions of Lemma 21 are satised:
(G) 0.1 (by the condition of Lemma 22)
|V | 20 (by the denition of and )
< g(G) = g since g 2 and =
g
2
_
r
8
_
.
The conditions of Lemma 19 are satised.
<
g(G)
+1
|E| = |V |
r
2
= 10r 10
_
4tw
g
_
1/C
(2tw/)
1/C
From Lemma 19, we know that (under the above conditions)
S
_
|P
,
(G)|
1/
e(|E|/)
1C
_
/t
/((e|E|)
1/t
e).
Since we are interested in the asymptotic behavior of S, we may ignore constant factors.
We claim that (e|E|)
1/t
e = (1). Since |E| 1, we have that (e|E|)
1/t
e = (1). Since
|E| V
2
= 400
2
2
400
4
,
(e|E|)
1/t
e O(1)
(e|E|)
1/t
O(1)
lg(e400)
t
O(1)
lg
t
O(1).
We derive (using |V | = 20 and |E| = |V |
r
2
)
S
_
|P
,
(G)|
1/
e(|E|/)
1C
_
/t
n
20 lg n
_
_
_
_
_
|V |
_
r
8
_
_
1/
e(|E|/)
1C
_
_
_
/t
n
20 lg n
_
_
_
|V |
_
1/
r
8e(|E|/)
1C
_
_
/t
n
20 lg n
_
_
_
20
_
1/
r
8e(10r)
1C
_
_
/t
n
20 lg n
_
(20)
1/
r
8e(10r)
1C
_
/t
n
20 lg n
_
r
C
8e10
1C
_
/t
(20)
1/t
.
By =
g
2
, we get the statement in the theorem for sufciently large n.
Based on Lemma 22, we now use the Ramanujan graph from Lemma 16 to derive the lower
bound stated in the main theorem of this chapter (Theorem 8).
Proof of Theorem 8. Let C denote the constant from the LSD communication complexity lower
bound in Lemma 10.
Let n
0
= (n). Let r
0
:=
_
8tw
lg n
0
_
2/C
. We assume that n
0
is sufciently large such that
r
0
max{400,
_
4
C
_
4/C
}.
To use Lemma 16, we need that n
0
> 8r
3
0
. Since t, O(polylog(n)) and w = n
o(1)
, we
have that
8
_
8tw
lg n
0
_
6/C
< 8 (8tw)
6/C
< n
0
,
thus there exists a graph G = (V, E) with the following properties:
1. |V | = n
with
n
0
2
n
9n
0
2. G is rregular, where r
0
r 2r
0
3. The girth of G is at least g(G)
1
2
lg
r
n
4. (G)
2
r1
r
Since n
0
= (n), we have that n
for the
remainder of the proof.
To apply Lemma 22, three conditions must be veried.
(G)
2
r1
r
0.1 holds for r 400.
g = g(G) 2
We have that r
0
=
_
8tw
lg n
0
_
2/C
and r [r
0
, 2r
0
]. Also, both t lg n and lg n.
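$$g(G) \ge \frac{1}{2}\lg_r n = \frac{1}{2}\,\frac{\lg n}{\lg r} \ge \frac{1}{2}\,\frac{\lg n}{\lg(2r_0)} = \frac{1}{2}\,\frac{\lg n}{\lg\left(2\left(\frac{8t\alpha w}{\lg n_0}\right)^{2/C}\right)} = \frac{C}{4}\,\frac{\lg n}{\lg\left(2^{C/2}\,\frac{8t\alpha w}{\lg n_0}\right)} \ge \frac{C}{4}\,\frac{\lg n}{\lg\left(2^{C/2}\,\frac{8w\lg^2 n}{\lg n_0}\right)} = \Omega\left(\frac{\lg n}{\lg(w\lg n)}\right)$$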
Since $\alpha = o\left(\frac{\lg n}{\lg(w \lg n)}\right)$, the condition holds. Note that the necessary condition on $\alpha$ is
C
4
lg n
lg
_
2
C/2
8wlg
2
n
lg n
0
_ 2
C
8
lg n
lg(2
C/2
8) + lg w + 2 lg lg n lg lg
n
2
C
8
lg n
C
2
+ lg 8 + lg w + lg lg
n
2
1.
The lower bound thus extends to constant w and, in particular, to the bit-probe model [MP69].
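$\left(\frac{4t\alpha w}{g}\right)^{1/C} \le r$:
We need two inequalities on $r$. Since $r \ge r_0 \ge \left(\frac{4}{C}\right)^{4/C}$ (for sufficiently large $n_0$), we have that $(\lg r)^{1/C} \le \sqrt{r}$. Since $r \ge r_0 = \left(\frac{8t\alpha w}{\lg n_0}\right)^{2/C} \ge \left(\frac{8t\alpha w}{\lg n}\right)^{2/C}$,
$$\left(\frac{4t\alpha w}{g}\right)^{1/C} \le \left(\frac{8t\alpha w \lg r}{\lg n}\right)^{1/C} = \left(\frac{8t\alpha w}{\lg n}\right)^{1/C} (\lg r)^{1/C} \le \left(\frac{8t\alpha w}{\lg n}\right)^{1/C} \sqrt{r} \le r.$$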
We now apply Lemma 22. Since $g = g(G) \ge \frac{1}{2}\lg_r n$, we have that
$$r^g \ge r^{\frac{1}{2}\lg_r n} = n^{1/2},$$
therefore,
$$S \ge \frac{n}{\lg n}\; n^{\Omega\left(\frac{1}{t\alpha}\right)}.$$
This concludes the proof.
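The two numeric facts used when verifying the conditions of Lemma 22 can be checked with a short Python sketch. It evaluates the Ramanujan bound $2\sqrt{r-1}/r$ and the girth bound $\frac{1}{2}\lg_r n$; the function names are illustrative and do not appear in the thesis.

```python
import math


def ramanujan_lambda_bound(r: int) -> float:
    """Bound 2*sqrt(r-1)/r on lambda(G) for an r-regular Ramanujan graph."""
    return 2.0 * math.sqrt(r - 1) / r


def girth_lower_bound(n: int, r: int) -> float:
    """The quantity (1/2) * log_r(n) that lower-bounds the girth in the proof."""
    return 0.5 * math.log(n) / math.log(r)


if __name__ == "__main__":
    # The bound drops below 0.1 once r reaches 400.
    print(ramanujan_lambda_bound(400))
    # r raised to the girth bound equals sqrt(n), as used in the last step.
    print(400 ** girth_lower_bound(10**6, 400), math.sqrt(10**6))
```

Since $2\sqrt{r-1}/r$ is decreasing for $r \ge 2$, checking the value at $r = 400$ suffices for the whole range $r \ge 400$.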
4.4 Conclusion and Open Problems
Theorem 8 implies that a distance oracle with query time $t$ and stretch $(\alpha, 0)$ requires space $n^{1+\Omega(1/(t\alpha))}$. Since our proof holds even for sparse graphs with $m = \tilde{O}(n)$ edges, the space requirement is strictly larger than the original graph size. For sparse graphs, our space lower bound is an improvement over the lower bound by Thorup and Zwick [TZ05], which states that at least space $\Omega(m)$ is required. We prove that $O(m)$ is not enough.
Our lower bound also indicates that the tradeoff of the distance oracle of Thorup and Zwick can potentially be improved for sparse graphs. Their tradeoff is that for multiplicative stretch $(\alpha, 0)$ and query time $O(\alpha)$, the space is roughly $n^{1+O(1/\alpha)}$. Our lower bound only proves space requirement $n^{1+\Omega(1/\alpha^2)}$. There is a gap between the upper and the lower bound. Mendel and Naor [MN06] improve the query time to $O(1)$ while maintaining the same amount of space asymptotically. Their tradeoff is tight up to constant factors in the exponent with respect to our lower bound.
Two technical questions remain open.
Linear Number of Edges. The worst-case graphs in our proof have degree $\tilde{O}(1)$. It would be interesting to generalize the proof to constant-degree graphs.
Graphs without Large Girth. In our proof, we require that a graph has large girth. It may be
possible to remove this requirement. In the reduction, we perform distance queries. The corre-
sponding paths must be vertex-disjoint and non-bypassable, meaning that any alternative path is
long. We ensure that paths do not have a short alternative by using graphs with large girth. It may
however be feasible to use graphs without large girth to prove a lower bound.
Our lower bound applies to general sparse graphs. However, it does not apply to specific graph classes such as those with many short cycles; efficient distance oracles may still be possible for specific graph classes.
Unversehens hängt alles
ineinander [...]
(everything is connected)
Max Frisch [Fri64, p. 116]
5
Distance Oracles for Power-law Graphs
5.1 Introduction
Although complex networks are very common in practice (Section 1.1.2), there is no distance
oracle with provable guarantees better than those of the general distance oracle of Thorup and
Zwick [TZ05]. For stretch parameter $k = 2$, the distance oracle of Thorup and Zwick has the following worst-case performance: the size is $O(n^{3/2})$ and the stretch is $(3, 0)$. Fortunately, the theoretical worst-case stretch bounds of Thorup and Zwick's distance oracle [TZ05] (and also of their routing scheme [TZ01]) are not observed in practice [KFY04] (see footnote 1), even though they are tight.
In this chapter, we make an attempt to bridge the gap between theory and practice. We provide the first theoretical analysis that directly links the power-law exponent of a random power-law graph to the bound on distance oracle sizes.
We adapt the distance oracle of Thorup and Zwick [TZ05] to optimize it for unweighted,
undirected power-law graphs. The scheme by Thorup and Zwick is based on a set of landmarks
selected uniformly at random. Instead of sampling landmarks at random, we select the nodes with
highest degrees as landmarks.
The use of nodes with high degrees is a heuristic that has been proposed by many researchers.
The high-degree heuristic is also very common in practice. For power-law graphs it particularly
makes sense to leverage the power of high-degree nodes. These nodes are also called hubs. These
hubs appear in most large complex networks [Bar03, p. 63].
Connectors are an extremely important component of our social network. They create
trends and fashions, make important deals, spread fads, or help launch a restaurant.
They are the thread of society, smoothly bringing together different races, levels of
education, and pedigrees. [...] Connectors, nodes with an anomalously large number
of links, are present in very diverse complex systems, ranging from the economy to
the cell. They are a fundamental property of most networks [...]. [Bar03, p. 56]
Indeed, with links to an unusually large number of nodes, hubs create short paths
between any two nodes in the system [Bar03, p. 64]
Intuitively, using these hubs to approximate distances in power-law graphs is a good heuristic.
The main result of this chapter is a theoretical proof that may explain why this heuristic performs
well in practice.
1
Krioukov et al. [KFY04, Section IV.B] report routing tables with 52 entries for random power-law graphs [ACL00] with 10,000 nodes. The bound by Thorup and Zwick [TZ01] is $O(\sqrt{n \lg n})$; for $n = 10{,}000$, $\sqrt{n \lg n}$ is 365.
5.1.1 Overview of the Result
We give an informal statement of our result. The precise statement is deferred to Theorem 30.
For the nodes of a power-law graph, the probability that a node has degree $x$ is proportional to $x^{-\tau}$ for some $\tau$, which is called the power-law exponent. For most practical scenarios, the power-law exponent lies in the interval $2 < \tau < 3$. These inequalities are assumed to hold in the following.
The complexity analysis of our distance oracle is based on the random power-law graph model with expected degree sequence proposed by Aiello, Chung and Lu [ACL00, CL02, Lu02, CL06] with some minor simplifications.
Let $\gamma = \frac{\tau-2}{2\tau-3} + \varepsilon$ and $\varepsilon > 0$. For sufficiently large $n$, we prove that for a random power-law graph (sampled from the modified Chung-Lu model, see Definition 37) with $n$ nodes, with probability at least $1 - 1/n$, our distance oracle of size $O(n^{1+\gamma} \lg n)$ can be constructed in time $O(n^{1+\gamma} \lg n)$. With probability 1, the distance oracle has stretch $(3, 0)$. The space requirement of $O(n^{1+\gamma} \lg n)$ (for a plot of $\gamma$ with respect to $\tau$, see Figure 5.1) improves upon the general distance oracle of size $O(n^{3/2})$ by Thorup and Zwick [TZ05].
Our bounds on the space complexity of the distance oracle of Thorup and Zwick [TZ05] extend to the labeled compact routing scheme (see footnote 2) by Thorup and Zwick [TZ01].
Figure 5.1: The figure shows a plot of $f(\tau) = \frac{\tau-2}{2\tau-3}$ for $\tau \in (2, 3)$. For values of $\tau$ close to 2, for example for $\tau = 2.1$, which is the exponent that fits the power-law distribution well to the degree distribution of the actual Internet inter-domain graph [FFF99, KFY04], our bound is $O(n^{13/12+\varepsilon})$, which indicates that the adapted distance oracle (and the adapted routing scheme) could be very effective on Internet-like graphs.
2
In this thesis, we focus on distance oracles; we do not define compact routing schemes. We give some notes on the routing scheme without detailed explanation or proof. The routing scheme is a fixed-port scheme, meaning that it works for any permutation of port number assignments on any node. The routing scheme requires a stretch-5 handshaking (see [TZ01, Section 4]), and uses addresses and message headers of size $O(\lg n \lg \lg n)$, with probability at least $1 - o(1)$. Addresses and headers are based on an efficient path encoding scheme using $O(\lg n \lg \lg n)$ bits per node. The encoding scheme relies on specific distance properties of power-law graphs. For details, see [CSTW09b].
5.1.2 Related Work
Due to their frequent occurrence in practice, various aspects of power-law graphs and complex networks have been studied.
There is some evidence that, despite their unique features, power-law graphs are actually not easy instances for algorithms. Although power-law graphs are sparse, optimization problems remain hard: problems such as COLORING or CLIQUE are NP-hard for power-law graphs as well [FPP08].
Power-law graphs have a dense core that consists of nodes with high degrees. Core prop-
erties have been investigated for several power-law graph models. Having a small core whose
removal substantially changes connectivity, would allow for a scheme that constructs shortest
paths through this core based on a separator theorem, as for planar graphs [Tho04a] and for
minor-free graphs [AG06]. However, the proportion of nodes that have to be removed to sub-
stantially change the connectivity of a power-law graph is linear with respect to the size of the
graph [CNSW00, NSW01, NWS02, BR03, FFV05, NR08]. Therefore, a separator-based strategy
is not suitable for power-law graphs; different techniques are necessary.
Also, most powerful techniques that work well for graphs with bounded doubling dimension
cannot be used. Although sometimes claimed, the Internet does not appear to have bounded ball
growth or bounded doubling dimension; both measures can be large [FLV08].
Practical routing schemes (and distance oracles) for power-law graphs have been proposed [BC06, RMJ07, PBCG09, CY09, GSVGM98, XWP+09] (Section 3.2.4). However, there are no theoretical results on the space requirements of routing schemes for power-law graphs.
An approach related to routing is due to Kleinberg [Kle00]. He formally proves that, in the re-wired lattice model [BMST97, NW99, Kle00] (Section 2.1.3), greedy routing is a good routing scheme. Greedy routing intuitively means that edges on the path to the target are chosen one after another such that the estimated distance to the target is minimized. Kleinberg proves that paths have length $O(\lg^2 n)$. For this greedy approach to work, it is however crucial that nodes know their coordinates within the lattice (see footnote 3). Unfortunately, we cannot transform the greedy routing scheme into a distance oracle, since there is no bound on the stretch for greedy routing; even for two nodes connected by a short path of constant length, the greedy route may have length $O(\lg^2 n)$. The stretch is thus $(O(\lg^2 n), 0)$.
Complex networks usually have diameter $O(\lg n)$. For a graph with known diameter, an approximate distance oracle whose multiplicative stretch equals the diameter needs only constant space: it stores the diameter. If the oracle is required to output actual paths, linear space suffices (the preprocessing algorithm creates a shortest-path tree rooted at an arbitrary vertex; the query algorithm outputs paths on this tree).
The objective is to devise a distance oracle (or routing scheme) with constant stretch and good theoretical bounds on the space requirements.
5.2 Preliminaries
5.2.1 Distance Oracle of Thorup and Zwick
The construction algorithm of the distance oracle of Thorup and Zwick [TZ05] is based on the following ideas. Thorup and Zwick use random sampling to select a subset $S \subseteq V$ containing $O(\sqrt{n})$ vertices. From each vertex of the set $S$, the algorithm computes and stores the distance to all the vertices in the graph. From all other vertices $v \in V \setminus S$, the algorithm computes the open ball around $v$ until it touches the nearest vertex from $S$. For each $v$, the ball has expected size $O(\sqrt{n})$. The balance between the sample size $|S|$ and the expected ball size is optimal, which yields the desired space complexity of $O(n^{3/2})$.
3 Without coordinates, paths may have length $O(n^c)$ [DEH07a, DEH07b].
[...] $\sum_j w_j$, the edge between $v_i$ and $v_{i'}$ is present in the random graph with probability
$$\Pr[\{v_i, v_{i'}\} \in E] = w_i w_{i'} \rho, \quad \text{where } \rho = \frac{1}{\sum_j w_j}.$$
In the original FDRG model it is assumed that $\forall i, i' : w_i w_{i'} < \sum_j w_j$. We adapt the original model by deterministically inserting edges if $w_i w_{i'} > \sum_j w_j$. Without modification, the original assumption would rule out the values for $\tau$ considered in this thesis.
Definition 37. For a constant $\tau \in (2, 3)$, the random power-law graph distribution $\mathrm{RPLG}(n, \tau)$ is defined as follows. Let the sequence of generating parameters $\vec{w} = \{w_1, w_2, \ldots, w_n\}$ obey a power law:
$$w_j = \left(\frac{n}{j}\right)^{1/(\tau-1)} \quad \text{for } j \in \{1, 2, \ldots, n\}.$$
The edge between $v_i$ and $v_{i'}$ is present in the random graph with probability
$$\Pr[\{v_i, v_{i'}\} \in E] = \min\{w_i w_{i'} \rho, 1\}, \quad \text{where } \rho = \frac{1}{\sum_j w_j}.$$
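The sampling process of Definition 37 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the construction used in the thesis: nodes are indexed from 0 (so list position `j - 1` holds $w_j$), self-loops are excluded, every pair is tested explicitly (quadratic time), and the names `generating_parameters` and `sample_rplg` are hypothetical.

```python
import random


def generating_parameters(n: int, tau: float) -> list[float]:
    """Power-law generating parameters w_j = (n / j)^(1 / (tau - 1)), j = 1..n."""
    return [(n / j) ** (1.0 / (tau - 1.0)) for j in range(1, n + 1)]


def sample_rplg(n: int, tau: float, seed: int = 0):
    """Sample one graph: each pair is an edge with probability min(w_i * w_k * rho, 1)."""
    rng = random.Random(seed)
    w = generating_parameters(n, tau)
    rho = 1.0 / sum(w)
    adj = [set() for _ in range(n)]
    for i in range(n):
        for k in range(i + 1, n):  # assumption: no self-loops
            p = min(w[i] * w[k] * rho, 1.0)
            # rng.random() lies in [0, 1), so p == 1 always inserts the edge
            # (a "deterministic edge" in the terminology of this section).
            if rng.random() < p:
                adj[i].add(k)
                adj[k].add(i)
    return w, adj


if __name__ == "__main__":
    w, adj = sample_rplg(200, 2.5, seed=1)
    print(max(len(nb) for nb in adj), sum(len(nb) for nb in adj) // 2)
```

For larger $n$, an implementation would avoid the quadratic loop; the sketch favors directness over speed.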
Note that, in both models, there is a one-to-one correspondence between a node $v_j$ and its generating parameter $w_j$. In the FDRG model, the value $w_j$ corresponds to the expected degree of vertex $v_j$, and Chung and Lu refer to $\vec{w}$ as the expected degree sequence. In the $\mathrm{RPLG}(n, \tau)$ adaptation, the graph is sampled according to the generating parameter values $w_j$. Let $D_j$ be the random variable denoting the degree of node $v_j$. In the $\mathrm{RPLG}(n, \tau)$ model, the expected degree $E[D_j]$ of node $v_j$ is less than or equal to the generating parameter $w_j$. We refer to the edges between two nodes $v_i, v_{i'}$ with $w_i w_{i'} \ge \sum_j w_j$ as deterministic edges; we refer to the remaining edges as random edges.
An important reason to work with this model is that the edges are independent. This independence makes several graph properties easier to analyze. We also (implicitly) rely on a property called assortativity. Assortativity is the tendency of nodes with high degree to attach to other highly connected nodes. This tendency is especially high in social networks. The opposite tendency, termed dissortativity, is more common in technological and biological networks: highly connected nodes tend to be connected with low-degree nodes. Li et al. [LADW05, Definition 4.1] formalize assortativity as follows. They define the $s(G)$ value of a graph as $s(G) := \sum_{\{v_i, v_{i'}\} \in E} \deg(v_i) \deg(v_{i'})$. Graphs sampled from the FDRG model tend to have a high $s(G)$ value, since high-degree nodes are attached to other highly connected nodes. Li et al. state that $s(G)$ measures to what extent a graph has a hub-like core.
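The $s(G)$ value is straightforward to compute from an adjacency structure. The following Python sketch assumes an undirected graph given as a dictionary from integer node labels to neighbor sets; the function name `s_value` is illustrative.

```python
def s_value(adj: dict[int, set[int]]) -> int:
    """Sum of deg(u) * deg(v) over all undirected edges {u, v}."""
    deg = {v: len(neighbors) for v, neighbors in adj.items()}
    # The condition u < v counts every undirected edge exactly once.
    return sum(deg[u] * deg[v] for u in adj for v in adj[u] if u < v)


if __name__ == "__main__":
    star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
    path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
    print(s_value(star), s_value(path))
```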
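The core of a graph consists of nodes having large degrees. Let $\gamma = \frac{\tau-2}{2\tau-3} + \varepsilon$ for some $\varepsilon > 0$ and $\gamma' = \frac{1-\gamma}{\tau-1}$.
Definition 38. For a power-law degree sequence $\vec{w}$ and a graph $G$ with $n$ nodes, the core with degree threshold $n^{\gamma'}$ is defined as
$$\mathrm{core}_{\gamma'}(\vec{w}) := \left\{ v_j : w_j > n^{\gamma'} \right\},$$
$$\mathrm{core}_{\gamma'}(G) := \left\{ v_j : \deg_G(v_j) > n^{\gamma'}/4 \right\},$$
where $\deg_G(v_j)$ is the degree of $v_j$ in $G$ (the subscript $G$ is omitted when the graph is clear from the context).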
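The sets $\mathrm{core}_{\gamma'}(\vec{w})$ and $\mathrm{core}_{\gamma'}(G)$ are not necessarily equal. Even if the degree bound in $\mathrm{core}_{\gamma'}(G)$ were set to $n^{\gamma'}$ instead of $n^{\gamma'}/4$, the two cores would not be equal. In Section 5.4.1, we prove that $\mathrm{core}_{\gamma'}(\vec{w}) \subseteq \mathrm{core}_{\gamma'}(G)$ [...]
$$B_{\mathrm{core}}(u) := \left\{ v \in V(G) : d(u, v) < \min_{v' \in \mathrm{core}_{\gamma'}(G)} d(u, v') \right\}.$$
Note that it is important to use the open ball and not the closed ball.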
The volume of a set of nodes is an integral notion in the proof. Let $G$ be a random graph sampled from $\mathrm{RPLG}(n, \tau)$. For a set of nodes $S$, define its volume $\mathrm{Vol}(S)$ as the sum of the generating parameters $w_j$ of its nodes, that is, $\mathrm{Vol}(S) := \sum_{v_j \in S} w_j$. We simplify notation by $\mathrm{Vol}(G) := \mathrm{Vol}(V)$. Note that $\mathrm{Vol}(G) = 1/\rho$ (Definition 37). Let $\mathrm{vol}_G(S)$ denote the sum of the nodes' degrees in the actual graph $G$, $\mathrm{vol}_G(S) := \sum_{v_j \in S} \deg_G(v_j)$.
For our proof, the most important property of the FDRG model is captured in the following lemma, which is applied for the core and individual balls. There is an edge between two nodes $v_i, v_{i'}$ with probability proportional to $w_i w_{i'}$. The statement is extended to sets of nodes $S, T \subseteq V(G)$ in the following. The lemma holds for both $\mathrm{FDRG}(\vec{w})$ and $\mathrm{RPLG}(n, \tau)$.
Lemma 23 ([Lu02, Lemma 3.3, proof in Lemma 9]). For any two disjoint subsets $S$ and $T$ with $\mathrm{Vol}(S)\,\mathrm{Vol}(T) > c\,\mathrm{Vol}(G)$, we have
$$\Pr[d(S, T) > 1] = \prod_{v_i \in S,\, v_{i'} \in T} \max\{0, (1 - w_i w_{i'}/\mathrm{Vol}(G))\} \le e^{-c}.$$
The following lemma proves that $\mathrm{Vol}(G)$ is linear in $n$.
Lemma 24. Let $G$ be a random graph sampled from $\mathrm{RPLG}(n, \tau)$. The volume $\mathrm{Vol}(G)$ satisfies
$$n < \mathrm{Vol}(G) \le \frac{\tau-1}{\tau-2}\, n.$$
Proof. Lower bound: it holds that $\sum_j w_j > n$, since $w_j > 1$ for all $j < n$ and $w_n = 1$.
Upper bound: it holds that
$$\mathrm{Vol}(G) = \sum_{j=1}^{n} w_j < w_1 + \int_1^n \left(\frac{n}{x}\right)^{1/(\tau-1)} dx \le \frac{\tau-1}{\tau-2}\, n.$$
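The bounds of Lemma 24 can be checked numerically for concrete parameters. The Python sketch below sums the generating parameters directly; the function name `volume` is illustrative, and the check is a sanity test for a few values, not a proof.

```python
def volume(n: int, tau: float) -> float:
    """Vol(G) = sum of w_j = (n / j)^(1 / (tau - 1)) for j = 1..n."""
    return sum((n / j) ** (1.0 / (tau - 1.0)) for j in range(1, n + 1))


if __name__ == "__main__":
    for tau in (2.1, 2.5, 2.9):
        for n in (10, 1000, 100000):
            upper = (tau - 1.0) / (tau - 2.0) * n
            print(tau, n, n < volume(n, tau) <= upper)
```

The upper bound is loose for small $\tau - 2$, where the factor $\frac{\tau-1}{\tau-2}$ becomes large.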
In the remainder of the preliminaries section, we prove certain concentration properties of the adapted random power-law graph model. Since for the $\mathrm{RPLG}(n, \tau)$ model the edge probability is capped, several properties of graphs sampled according to the FDRG model do not hold for graphs sampled according to the $\mathrm{RPLG}(n, \tau)$ model.
In the following, we show concentration results for the actual degree of a vertex and for the volume of a set of vertices under the adapted $\mathrm{RPLG}(n, \tau)$ model. We also restate the corresponding results in the original FDRG model.
Lemma 25 ([CL06, Lemma 5.6], generalized from [McD98, Theorem 2.7]). For a random graph sampled from $\mathrm{FDRG}(\vec{w})$, the random variable $D_j$ measuring the degree of vertex $v_j$ is concentrated around its expectation $w_j$ as follows:
$$\Pr[D_j > w_j - c\sqrt{w_j}] \ge 1 - e^{-c^2/2} \tag{5.1}$$
$$\Pr[D_j < w_j + c\sqrt{w_j}] \ge 1 - e^{-\frac{c^2}{2(1 + c/(3\sqrt{w_j}))}} \tag{5.2}$$
Lemma 26 ([CL06, Lemma 5.9]). For a random graph sampled from $\mathrm{FDRG}(\vec{w})$, for a subset of vertices $S$ and for all $0 < c \le \sqrt{\mathrm{Vol}(S)}$,
$$\Pr\left[\,|\mathrm{vol}(S) - \mathrm{Vol}(S)| < c\sqrt{\mathrm{Vol}(S)}\,\right] \ge 1 - 2e^{-c^2/6}.$$
Weaker concentration bounds hold for graphs sampled from $\mathrm{RPLG}(n, \tau)$.
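Lemma 27. Let $n \ge 4^{\frac{\tau-1}{(\tau-2)^2}}$. For a random graph sampled from $\mathrm{RPLG}(n, \tau)$, if $w_j \ge 32 \ln n$ for vertex $v_j$, the degree $D_j$ satisfies the following:
$$\Pr[w_j/4 \le D_j \le 3w_j] > 1 - 2/n^4.$$
Proof. Recall that $\rho = 1/\mathrm{Vol}(G) < 1/n$ (by Lemma 24).
For $1 \le j \le n$, let $h(j) \in \{1, 2, \ldots, n\}$ denote the smallest integer such that $w_{h(j)} w_j \rho \le 1$.
Consider $h(1)$. Since $w_1 \rho \left(\frac{n}{n^{3-\tau}}\right)^{1/(\tau-1)} \le 1$, we have that
$$h(1) \le n^{3-\tau}.$$
Therefore, for all $1 \le j \le n$,
$$h(j) \le h(1) \le n^{3-\tau}.$$
We split the degree $D_j$ into two parts: the contribution by edges to nodes $v_{j'}$ with $j' < h(j)$ and the contribution stemming from edges to nodes $v_{j'}$ with $j' \ge h(j)$ [...]
$$\sum_{i=h(j)}^{n} w_i \ge \int_{n^{3-\tau}+1}^{n} (n/x)^{1/(\tau-1)}\, dx \ge \frac{\tau-1}{\tau-2}\left(n - n^{1/(\tau-1)}\, 2^{\frac{\tau-2}{\tau-1}}\, n^{\frac{\tau-2}{\tau-1}(3-\tau)}\right) \quad (\text{since } n^{3-\tau} \ge 1)$$
$$\ge \frac{\tau-1}{2(\tau-2)}\, n \quad \left(\text{since } n \ge 4^{\frac{\tau-1}{(\tau-2)^2}}\right).$$
Recall that $\rho = 1/\sum_{j=1}^{n} w_j \ge \frac{\tau-2}{n(\tau-1)}$ by Lemma 24.
Let $D'_j$ denote the random variable counting the number of edges from $v_j$ to $v_{j'}$ with $j' \ge h(j)$ in a random graph. Thus,
$$E[D'_j] = \mu = w_j \rho \sum_{i=h(j)}^{n} w_i \ge w_j/2 \ge 16 \ln n.$$
Also, $\mu \le w_j$. Since there are no deterministic edges in this case, the random variable $D'_j$ can be bounded using Lemma 25:
$$\Pr[D'_j > \mu/2] \ge 1 - e^{-\mu/4} \ge 1 - 1/n^4,$$
$$\Pr[D'_j < 2\mu] \ge 1 - e^{-3\mu/8} \ge 1 - 1/n^4.$$
For $h(j) = 1$, the statement of the lemma follows directly.
If $h(j) > 1$, we have $D_j \le D'_j + h(j) - 1$. Notice that $w_j \rho\, (n/w_j)^{1/(\tau-1)} \le 1$, which implies that $h(j) \le \lceil w_j \rceil \le w_j + 1$. Therefore,
$$\Pr[w_j/4 \le \mu/2 \le D_j \le 3w_j] \ge 1 - 2/n^4.$$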
Lemma 28. Let $G$ be a random graph sampled from $\mathrm{RPLG}(n, \tau)$. For a subset of vertices $S$ satisfying $\mathrm{Vol}(S) \ge 192 \ln n$, it holds with probability at least $1 - 2/n^3$ that $\mathrm{Vol}(S)/8 \le \mathrm{vol}(S) \le 4\mathrm{Vol}(S)$.
Proof. We split $S$ into two parts $S_1 := \{v_j \in S : w_j < 32 \ln n\}$ and $S_2 := S \setminus S_1$.
By Lemma 27,
$$\Pr[\mathrm{Vol}(S_2)/4 \le \mathrm{vol}(S_2) \le 3\mathrm{Vol}(S_2)] \ge 1 - 2|S_2|/n^4.$$
For each vertex $v_j \in S_1$, $w_j < 32 \ln n$. Since no deterministic edges are attached to $S_1$, we can apply Lemma 26 to $S_1$.
Therefore, if $\mathrm{Vol}(S_1) \ge 96 \ln n$, by Lemma 26 (with $c = \sqrt{\mathrm{Vol}(S_1)}/2$),
$$\Pr[\mathrm{Vol}(S_1)/2 \le \mathrm{vol}(S_1) \le 3\mathrm{Vol}(S_1)/2] \ge 1 - 2/n^4.$$
Therefore, the statement holds with probability at least $1 - 2(|S_2| + 1)/n^4 \ge 1 - 2/n^3$.
If $\mathrm{Vol}(S_1) < 96 \ln n$, we have $\mathrm{Vol}(S_2) \ge \mathrm{Vol}(S)/2 \ge 96 \ln n$.
However, since
$$\Pr\left[\mathrm{vol}(S_1) < \frac{3}{2} \cdot 96 \ln n \le \frac{3}{4}\mathrm{Vol}(S)\right] \ge 1 - 2/n^4,$$
we can still apply Lemma 26 to bound $\mathrm{vol}(S_1)$ from above.
In this case, since
$$\Pr[\mathrm{Vol}(S)/8 \le \mathrm{Vol}(S_2)/4 \le \mathrm{vol}(S_2) \le 3\mathrm{Vol}(S_2)] \ge 1 - 2|S_2|/n^4,$$
the statement also holds with probability at least $1 - 2/n^3$.
In Lemma 24, we prove that $\mathrm{Vol}(G) = \Theta(n)$. Under the adapted $\mathrm{RPLG}(n, \tau)$ model, compared to the FDRG model, the edge probability may only decrease. We immediately have that $\mathrm{vol}(G) = O(n)$ with high probability. With the concentration results of this section, we obtain the following corollary.
Corollary 29. The number of edges of a random graph sampled from $\mathrm{RPLG}(n, \tau)$ is at most $\frac{\mathrm{vol}(G)}{2} \le \frac{4(\tau-1)}{\tau-2}\, n$ with probability at least $1 - n^{-2}$.
5.3 The Adapted Distance Oracle
We propose a modification of the distance oracle of Thorup and Zwick [TZ05, Fig. 5] for stretch parameter $k = 2$, which guarantees stretch $(3, 0)$. The main idea of the scheme by Thorup and Zwick for $k = 2$ is the following (see Figure 5.2): in the preprocessing step, given a graph $G = (V, E)$:
1. Each node $v \in V$ is chosen as a landmark independently at random with probability $n^{-1/2}$. The expected number of landmarks is $\sqrt{n}$.
2. For each node $u \in V$, find its nearest landmark $L(u)$ and compute the distances from $u$ to all landmarks.
3. To guarantee optimal stretch for short distance queries, for every node u V a local ball
B
G
(u) = {u
V (G) : d(u, u
2( 1)
2
lnn.
5.3
The complexity results of this chapter do not have any other implicit dependencies on .
The following is the precise version of the main theorem of this chapter.
Figure 5.2: The distance oracle of Thorup and Zwick [TZ05, Fig. 5] for stretch parameter k =
2, which guarantees stretch (3, 0). An illustration of the preprocessing algorithm, from left to
right: (1) random sampling of landmarks, (2) SSSP computation for one landmark (node B), (3)
knowledge of one node after all SSSP computations (from all landmarks), and (4) ball for the
rightmost node in the bottom line
Theorem 30. Let $\gamma = \frac{\tau-2}{2\tau-3} + \varepsilon$ be a constant. Assume Equation (5.3) is satisfied. For random power-law graphs from $\mathrm{RPLG}(n, \tau)$ (Definition 37), there exists a $(3, 0)$-approximate distance oracle with the following properties. The preprocessing algorithm runs in expected time $O(n^{1+\gamma} \lg n)$ and creates a distance oracle of expected size $O(n^{1+\gamma})$. These bounds also hold with probability at least $1 - 1/n$. After preprocessing, approximate distance queries can be answered in $O(1)$ time with stretch at most $(3, 0)$.
Since power-law graphs do not have large girth, the lower bound of Chapter 4 does not apply to power-law graphs. However, scale-free networks and expander graphs (which are the worst-case instances in Chapter 4) also share certain important properties [MPS06, Theorem 1, Corollary 4, and p. 247]. It is thus not clear whether space $O(n^{1+\gamma})$ is reasonably good or whether space $\tilde{O}(n)$ may be sufficient.
Details for the preprocessing step are listed in Algorithm 3. Analogous to the oracle of Thorup and Zwick [TZ05], for efficient query times, preprocessed information is stored in a hash table [FKS84] for each node.
The query algorithm is the same as in [TZ05] for $k = 2$, see Algorithm 4.
Lemma 31. Algorithm 4 runs in time $O(1)$ and achieves stretch $(3, 0)$.
The following proof applies the same stretch and time bounds as [TZ05].
Proof. Time: At each node, all the information is stored in a hash table [FKS84] with constant access time. The number of hash table reads necessary is constant.
Algorithm 3 Preprocess($G = (V, E)$, $\gamma'$)
  compute core $\leftarrow \{v \in V : \deg(v) > n^{\gamma'}/4\}$
  for each $v \in$ core do
    run breadth-first search from $v$ in $G$
    for each node $u \ne v$, store $d(u, v)$ and let FirstNode$_u(v)$ be the penultimate node on the shortest path; update $L(u)$ if $v$ is the nearest landmark
  end for
  for each $u \in V$ do
    compute and store $B_{\mathrm{core}}(u)$ (including distances)
    for each $v \in B_{\mathrm{core}}(u)$, let FirstNode$_u(v)$ be the first node on the shortest path to $v$
  end for
Algorithm 4 Distance($s$, $t$)
  if $s \in B_S(t)$ or $t \in B_S(s)$ then
    return local distance $d(s, t)$ from the information at $s$ or $t$
  else
    return $d(s, L(t)) + d(L(t), t)$
  end if
Figure 5.3: Illustration of the proof of worst-case stretch (3, 0) using the triangle inequality.
Stretch $(3, 0)$ is guaranteed by the following observation [Cow01] (for an illustration, see Figure 5.3). For a node $u \in V$, its ball is defined as follows.
$$B_{\mathrm{core}}(u) := \left\{ v \in V(G) : d(u, v) < \min_{v' \in \mathrm{core}(G)} d(u, v') \right\}$$
If neither $s \in B_{\mathrm{core}}(t)$ nor $t \in B_{\mathrm{core}}(s)$, then both
$$d(s, t) \ge d(s, \mathrm{core}) = d(s, L(s)) \quad \text{and}$$
$$d(s, t) \ge d(t, \mathrm{core}) = d(t, L(t)).$$
Using the second inequality, the triangle inequality
$$d(s, L(t)) \le d(s, t) + d(t, L(t)),$$
and $d(t, L(t)) = d(L(t), t)$ (since $G$ is undirected), we have
$$d(s, L(t)) + d(L(t), t) \le d(s, t) + d(t, L(t)) + d(L(t), t) \le 3\,d(s, t).$$
In practice, the return value
$$\min\{d(s, L(t)) + d(L(t), t),\; d(s, L(s)) + d(L(s), t)\}$$
(or even $\min_{L \in \mathrm{core}} \{d(s, L) + d(L, t)\}$) may yield better approximations (this triangulation is related to A* [GH05] and beacon-based embeddings [KSW09], see Section 3.2.2).
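The preprocessing and query steps can be illustrated with a compact Python sketch. It is a simplified version under stated assumptions: the graph is connected, unweighted and undirected; it is given as a dictionary from nodes to neighbor sets; the degree threshold is passed in directly instead of being derived from $n^{\gamma'}/4$; only distances are stored (no FirstNode pointers); and plain dictionaries stand in for the hash tables of [FKS84]. The names `bfs`, `preprocess` and `query` are illustrative.

```python
from collections import deque


def bfs(adj, src, limit=float("inf")):
    """Distances from src to all nodes at distance strictly less than limit."""
    dist = {src: 0} if limit > 0 else {}
    queue = deque(dist)
    while queue:
        u = queue.popleft()
        if dist[u] + 1 >= limit:
            continue  # neighbors of u would fall outside the open ball
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def preprocess(adj, degree_threshold):
    """High-degree nodes become landmarks; every node stores its open ball."""
    core = [v for v in adj if len(adj[v]) > degree_threshold]
    if not core:
        raise ValueError("no node exceeds the degree threshold")
    from_landmark = {c: bfs(adj, c) for c in core}
    nearest, ball = {}, {}
    for u in adj:
        nearest[u] = min(core, key=lambda c: from_landmark[c][u])
        radius = from_landmark[nearest[u]][u]
        ball[u] = bfs(adj, u, limit=radius)  # open ball: distance < radius
    return from_landmark, nearest, ball


def query(oracle, s, t):
    """Exact distance inside a ball, otherwise the detour via L(t)."""
    from_landmark, nearest, ball = oracle
    if t in ball[s]:
        return ball[s][t]
    if s in ball[t]:
        return ball[t][s]
    landmark = nearest[t]
    return from_landmark[landmark][s] + from_landmark[landmark][t]
```

On a small example with one hub, every estimate can be compared against the exact breadth-first-search distance to confirm that it lies between $d(s, t)$ and $3\,d(s, t)$.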
5.4 Time and Space Complexities
The objective of this section is to prove the following lemma.
Lemma 32. Let $\gamma = \frac{\tau-2}{2\tau-3} + \varepsilon$ be a constant. Assume Equation (5.3) is satisfied. For random power-law graphs $\mathrm{RPLG}(n, \tau)$, Algorithm 3 runs in expected time $O(n^{1+\gamma} \lg n)$ and creates a distance oracle of expected size $O(n^{1+\gamma})$. These bounds also hold with probability at least $1 - 1/n$.
The main result of this chapter, Theorem 30, is immediate from Lemmas 31 and 32.
5.4.1 Core Size
We prove that the size of the core is (n
( w) :=
_
v
i
: w
i
> n
_
,
core
(G) :=
_
v
i
: deg
G
(v
i
) > n
/4
_
.
To compute the size of core
and obtain k.
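$$w_k = \left(\frac{n}{k}\right)^{\frac{1}{\tau-1}} > n^{\gamma'} \iff k^{-\frac{1}{\tau-1}} > n^{\gamma' - \frac{1}{\tau-1}} \iff k < n^{(1-\tau)\left(\gamma' - \frac{1}{\tau-1}\right)} = n^{\gamma'(1-\tau)+1}$$
As $\gamma' = \frac{1-\gamma}{\tau-1}$, we have
$$|\mathrm{core}_{\gamma'}(\vec{w})| = n^{\gamma'(1-\tau)+1} - 1 = n^{\gamma} - 1.$$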
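The count derived above can be cross-checked by brute force. The following Python sketch counts the indices $k$ with $w_k > n^{\gamma'}$ and compares the result with the integer version of the closed form, $\lceil n^{\gamma} \rceil - 1$ (the rounding is an assumption added here, since the derivation treats $n^{\gamma}$ as a real number). The function names are illustrative.

```python
import math


def gamma_prime(gamma: float, tau: float) -> float:
    """gamma' = (1 - gamma) / (tau - 1)."""
    return (1.0 - gamma) / (tau - 1.0)


def core_size_by_counting(n: int, tau: float, gamma: float) -> int:
    """Number of indices k in 1..n with w_k = (n / k)^(1 / (tau - 1)) > n^gamma'."""
    threshold = n ** gamma_prime(gamma, tau)
    return sum(1 for k in range(1, n + 1)
               if (n / k) ** (1.0 / (tau - 1.0)) > threshold)


def core_size_closed_form(n: int, gamma: float) -> int:
    """Number of positive integers k with k < n^gamma."""
    return math.ceil(n ** gamma) - 1


if __name__ == "__main__":
    print(core_size_by_counting(10000, 2.5, 0.3), core_size_closed_form(10000, 0.3))
```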
Even if the same degree threshold n
( w) and core
( w) =
_
v
i
: w
i
> n
_
v
i
: deg(v
i
) > n
/4
_
= core
(G).
Proof. Let v
i
be a vertex in core
( w).
By Lemma 27, D
i
n
( w)
_
v
i
: deg(v
i
) > n
/4
_
is at least 1 1/n
2
.
Lemma 34. Let $G$ be a random graph sampled from $\mathrm{RPLG}(n, \tau)$. With probability at least $1 - 1/n^2$,
$$|\mathrm{core}_{\gamma'}(G)| = \Theta(n^{\gamma}).$$
Proof. Lower bound: Since core
. By Lemma 27, D
i
3w
i
< n
(G)|
144n
core
(G)
d(u, v
)
_
.
Lemma 35. Let $b = \gamma'(\tau - 2) + \frac{(2\tau-3)\varepsilon}{\tau-1}$ be a constant. Assume Equation (5.3) is satisfied. For a random graph $G$ sampled from $\mathrm{RPLG}(n, \tau)$, with probability at least $1 - 3/n^2$, it holds that for all $u \in V(G)$,
$$|B_{\mathrm{core}}(u)| = |\{u' \in V(G) : d(u, u') < d(u, \mathrm{core}_{\gamma'}(\vec{w}))\}| = O(n^b),$$
$$|E(B_{\mathrm{core}}(u))| = O(n^b \lg n),$$
where $E(B_{\mathrm{core}}(u))$ is the set of internal edges among vertices in $B_{\mathrm{core}}(u)$.
Since for $\mathrm{RPLG}(n, \tau)$ the edges are independent, in our analysis, the existence of every edge in the random graph $G$ is only determined when it is needed, and before that it is treated as a probability distribution as defined in our random graph model. We call the determination of the existence of an edge according to its probability distribution revealing the edge. For a given vertex $u \in V(G)$, we define a sequence of balls $(B_0 = \{u\}, B_1, B_2, \ldots)$ as follows:
Let $V' = V \setminus \mathrm{core}_{\gamma'}(\vec{w})$.
Now define $B_0 = \{u\}$ and $B_i = \{v : d_G(u, v) \le i\}$.
We also define the circles $C_i = B_i \setminus B_{i-1}$ for $i \ge 0$ with $B_{-1} = \emptyset$. Let $E_i$ denote the random variable counting the number of edges between $C_i$ and $C_i \cup C_{i+1}$.
We first give a concentration result on $E_i$.
Lemma 36. For circle $C_i$, the following holds with probability at least $1 - 2/n^3$:
If $\mathrm{Vol}(C_i) < 192 \ln n$, then $E_i \le 4 \cdot 192 \ln n$, and
if $\mathrm{Vol}(C_i) \ge 192 \ln n$, then $E_i \le 4\mathrm{Vol}(C_i)$.
Proof. For our analysis, we assume that the edges of the random graph are revealed in consecutive steps as follows: in step $i$ with $i \ge 0$, edges from $C_i$ to $V' \setminus B_{i-1}$ are revealed and circle $C_{i+1}$ is formed. In other words, when discovering $C_i$, the edges between $C_i$ and $V'' = V' \setminus B_{i-1}$ have not been revealed yet.
In particular, $E_i$ measures the number of edges between $C_i$ and $V''$.
Let $\mathrm{vol}(C_i)$ denote the random variable measuring the number of edges adjacent to $C_i$ for the original model FDRG. $\mathrm{vol}_G(C_i)$ is stochastically dominated by $\mathrm{vol}(C_i)$. Hence, the statement of the lemma follows directly, since it applies to $\mathrm{vol}(C_i)$ by Lemma 28.
Since there are at most $n$ circles, Lemma 36 holds for all circles with probability at least $1 - 2/n^2$.
The above arguments are combined to prove Lemma 35.
The above arguments are combined to prove Lemma 35.
Proof of Lemma 35. Let k be the smallest integer such that Vol (B
k
) n
b
. We have the condi-
tions
Vol (B
k
) n
b
,
Vol (core
( w)) |core
( w)|n
= n
+
, and
Vol (G)
1
2
n (Lemma 24).
From Equation (5.3),
n
b
(2)
> 2
1
2
lnn.
Since the edges between B
k
and core
( w) core
i=0
E
i
_
with probability at least 1 2/n
2
.
By Lemma 36, with probability at least $1 - 1/n^2$, for all $i$,
$$E_i \le 4 \cdot 192 \ln n + 4\mathrm{Vol}(C_i).$$
Since $k \le n^b$, with probability at least $1 - 3/n^2$,
$$|E(B_{\mathrm{core}}(u))| = O\left(\sum_{i=0}^{k-1} E_i\right) = O(4 \cdot 192\, n^b \ln n + 4\mathrm{Vol}(B_{k-1})) = O(n^b \lg n).$$
This concludes the proof.
5.4.3 Assembly
The core core
(G) of
$G$ with time complexity $O(m + n \lg n)$. It runs a complete breadth-first search for each node of the core in time $O(m)$. Due to the condition of Lemma 32, Equation (5.3) is satisfied. Let $B_{\mathrm{core}}(u)$ denote the ball computed in our algorithm for vertex $u$. Let $T(B_{\mathrm{core}}(u))$ denote the time to compute $B_{\mathrm{core}}(u)$. Therefore, the time complexity TC and space complexity SC of our algorithm are at most
$$TC(G) = O\left(m \cdot |\mathrm{core}_{\gamma'}(G)| + \sum_{u \in V(G)} T(B_{\mathrm{core}}(u))\right), \tag{5.4}$$
$$SC(G) = O\left(n \cdot |\mathrm{core}_{\gamma'}(G)| + \sum_{u \in V(G)} |B_{\mathrm{core}}(u)|\right). \tag{5.5}$$
Let $b = \gamma'(\tau - 2) + \frac{(2\tau-3)\varepsilon}{\tau-1}$. By Lemma 35, SC is at most $O(n^{1+b})$ with probability at least $1 - 3/n^2$. The time to compute $B_{\mathrm{core}}(u)$ is linear in the number of internal edges in $B_{\mathrm{core}}(u)$, since the graph is unweighted and the distance from $u$ to the core has been determined before computing $B_{\mathrm{core}}(u)$. By Lemma 35, $TC = O(n^{1+b})$ with probability at least $1 - 3/n^2$.
We now know that with probability at least $1 - 5/n^2$, all of the following conditions are true:
1. $m = \Theta(n)$ (Corollary 29);
2. $|\mathrm{core}_{\gamma'}(G)| = \Theta(n^{\gamma})$ (Lemma 34);
3. $|B_{\mathrm{core}}(u)| = O(n^b)$ for all vertices $u$ (Lemma 35);
4. $T(B_{\mathrm{core}}(u)) = O(n^b \lg n)$ for all vertices $u$ (Lemma 35).
Therefore, from Equations 5.4 and 5.5, we know that with probability at least $1 - 5/n^2$, the space complexity of our algorithm is $O(n^{1+\gamma} + n^{1+b})$ and the time complexity is $O(n^{1+\gamma} + n^{1+b} \lg n)$.
Finally, we fix the parameters to obtain a balanced scheme. In a balanced scheme, the core size and the expected ball sizes are asymptotically equivalent, that is, $b = \gamma$. Together with $b = \gamma'(\tau - 2) + \frac{(2\tau-3)\varepsilon}{\tau-1}$ and $\gamma' = \frac{1-\gamma}{\tau-1}$, we have
$$\gamma = \frac{\tau-2}{2\tau-3} + \varepsilon.$$
Therefore, assuming that Equation (5.3) is satisfied, the space requirement per node is $O(n^{\gamma} \lg n)$ bits and the total preprocessing time is bounded by $O(n^{1+\gamma} \lg n)$, which holds with probability at least $1 - 1/n$.
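The balancing step can be verified with exact rational arithmetic. The Python sketch below plugs $\gamma = \frac{\tau-2}{2\tau-3} + \varepsilon$ into the expression for $b$ and confirms $b = \gamma$; it also reproduces the exponent $13/12$ quoted for $\tau = 2.1$ in the caption of Figure 5.1. The function names are illustrative.

```python
from fractions import Fraction


def balanced_gamma(tau, eps=Fraction(0)):
    """gamma = (tau - 2) / (2 * tau - 3) + eps."""
    return (tau - 2) / (2 * tau - 3) + eps


def ball_exponent(tau, gamma, eps):
    """b = gamma' * (tau - 2) + (2 * tau - 3) * eps / (tau - 1), gamma' = (1 - gamma) / (tau - 1)."""
    gamma_prime = (1 - gamma) / (tau - 1)
    return gamma_prime * (tau - 2) + (2 * tau - 3) * eps / (tau - 1)


if __name__ == "__main__":
    tau, eps = Fraction(21, 10), Fraction(1, 100)
    gamma = balanced_gamma(tau, eps)
    print(gamma, ball_exponent(tau, gamma, eps))  # both exponents coincide
    print(1 + balanced_gamma(tau))                # 13/12 for tau = 2.1
```

Using `Fraction` avoids floating-point rounding, so the equality $b = \gamma$ is checked exactly for the chosen parameters.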
5.5 Conclusion and Open Problems
Theorem 30 implies that distances and shortest paths in random power-law graphs can be approximated efficiently. For power-law exponents close to 2, the expected space consumption is close to linear. The extension of the algorithms for distance oracles to compact routing schemes indicates that the routing scheme by Thorup and Zwick [TZ01] may be very efficient on Internet-like network topologies.
Edge-weighted Graphs
The algorithm and the proof currently only apply to unweighted graphs. It seems difficult to extend our distance oracle to graphs with worst-case weights. An adversary could assign all edges within the core and within the fringe (all nodes outside the core) infinitesimal values, and edges between the core and the fringe large values. With these weight values, all balls of nodes in the fringe span the whole fringe. Since the fringe size is linear in the number of nodes, the distance oracle would have $O(n^2)$ space. It may be interesting to investigate the case where weights are random.
General Stretch Parameter k
Currently, the adaptation of the distance oracle of Thorup and Zwick [TZ05] works for the stretch parameter $k = 2$ only. An extension to general $k$ seems feasible but with our proof technique, the space requirements for constant $k$ would remain $O(n^{1+\delta})$ for some $\delta > 0$.
Stretch
An important open question targets the stretch of the distance oracle. If a distance oracle is used as a component to distinguish between close and far entities in a complex network, the stretch is very important. Since the diameter of random power-law graphs is $O(\lg n)$, the stretch should be as small as possible to guarantee meaningful estimates. While stretch (3, 0) is best possible for general graphs and distance oracles with $O(n^{3/2})$ space, we prove (Theorem 30) that it is possible to significantly reduce the space for power-law graphs. For Erdős-Rényi random graphs, stretch (2, 0) has been achieved with space consumption of $\tilde{O}(n^{7/4})$ [EWG08] (see Section 3.1.3). Is it possible to reduce both space and stretch?
Different Models
As mentioned in Section 2.1.3, there are many different models for complex networks. Our analysis only works for the adapted random graph model by Aiello, Chung, and Lu [ACL00]. Other models such as the configuration model [BBK72] and the preferential attachment model [BA99] may also have efficient distance oracles.
The best theory is inspired by practice and the best practice is inspired by theory.
Donald Knuth [Knu89]

6 Approximating Shortest Paths Using Voronoi Duals
6.1 Introduction
The main result of this chapter is an approximation method to answer shortest path queries in general, undirected graphs with positive edge weights, based on random sampling and graph Voronoi duals [Meh88, Erw00]. In preprocessing, each node is selected as a Voronoi site independently at random with probability $p$, and the Voronoi dual is computed for the selected sites (Section 6.3). This preprocessing step is very efficient; it takes time proportional to computing one single source shortest path tree (Section 6.4). For $p < 1$, the resulting dual graph is expected to be smaller than the original graph. At query time, the search for the shortest path from source $s$ to target $t$ can potentially be done faster in the Voronoi dual. We let the shortest path in the Voronoi dual guide the search for an approximate shortest path in the original graph. We prove that the expected approximation ratio is at most logarithmic in the number of nodes on the actual shortest path, and that this bound is tight (Section 6.5). Our experimental results show that, in practice, the approximation is much better than the stated theoretical bound and that the preprocessing overhead is indeed extremely low (Section 6.6).
Many practical shortest path query methods are tailored for road networks (Section 3.2.3). There has been considerable recent progress: for the road networks of Europe or the USA, using a high-performance computer, a speedup of several orders of magnitude compared to Dijkstra's algorithm can be achieved with a preprocessing time in the tens of minutes [DSSW09]. Unfortunately, theoretical bounds on both query time and preprocessing time are often difficult to obtain. However, even though road networks constitute the most common and popular application of shortest path query algorithms to date, other challenging applications exist. Computer networks, social networks, protein interaction networks, and the web graph exhibit different degree and structural properties, and may contain hundreds of millions or even billions of nodes. In specific cases, a user might be willing to trade preprocessing time against exactness due to the vast size of the data or due to restricted processing power (Section 1.2.2). These scenarios may require the use of a fast approximation method.
Related methods. Kambara and Ueshima [KU08] independently propose a method (without analysis) that appears to be closely related to the method we present in this chapter. Fang et al. [FGG+05] use graph Voronoi diagrams for routing in sensor networks. Yu et al. [YWD08] use Voronoi paths to bridge communication gaps in sparse sensor networks. Chan and Efrat [CE01] solve the cheapest path problem for flight connections in $\mathbb{R}^2$. Their method runs Dijkstra's algorithm on the Delaunay triangulation with respect to a superquadratic cost function $\mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}^+$.
6.2 Preliminaries
6.2.1 Graph Voronoi Diagram
The classical Voronoi diagram is a distance-based decomposition of a metric space relative to a discrete set, the Voronoi sites [Dir50, Vor07].¹ For a survey on this fundamental structure, we refer to [Aur91]. Among many applications, the Voronoi diagram is often used to solve facility location problems [Sha75, ACS99, AGK+04, GKP05, Svi08, Sug09]. The Voronoi diagram and the Delaunay triangulation of $n$ points in the plane can be computed in expected time $n \cdot 2^{O(\sqrt{\lg \lg n})}$ [CP07], which is even faster than $O(n \lg n)$.
Mehlhorn [Meh88] and Erwig [Erw00] proposed an analogous decomposition, the Graph Voronoi Diagram, for undirected and directed graphs respectively. Since the Voronoi diagram for the Euclidean space is used for various applications, its graph counterpart, the graph Voronoi diagram, may be used for these applications if the underlying metric is the shortest path metric of a graph [OSF+08].
Real-world distances or travelling times can be approximated more appropriately using models based on weighted graphs. In general, non-planar networks such as social networks, computer networks, protein interaction networks, and the web graph cannot be embedded into a low-dimensional Euclidean space without significant distortion.
Definition 39 (Graph Voronoi Diagram [Meh88, Erw00]). In a graph $G = (V, E, w)$, the Voronoi diagram for a set of nodes $K = \{v_1, \dots, v_k\} \subseteq V$ is a disjoint partition $\mathrm{Vor}_{(G,K)} := \{V_1, \dots, V_k\}$ of $V$ such that for each node $u \in V_i$, $d(u, v_i) \le d(u, v_j)$ for all $j \in \{1, \dots, k\}$.
The $V_i$ are called Voronoi regions. The graph Voronoi diagram is not necessarily unique, as a node $u$ may have the same distance to more than one Voronoi node. Let $\mathrm{vor}(u)$ denote the index $i$ of the Voronoi region $V_i$ containing $u$; that is, $\mathrm{vor}(u) = i \Leftrightarrow u \in V_i$.
Analogously to the Delaunay triangulation dual for classical Voronoi diagrams of point sets, we define the Voronoi dual for graphs.
Definition 40. Let $G = (V, E, w)$ be an edge-weighted graph and $\mathrm{Vor}_{(G,K)}$ its Voronoi diagram. The Voronoi dual is the graph $G^* = (K, E^*, w^*)$ with edge set
$$E^* := \{(v_i, v_j) : v_i, v_j \in K \text{ and } \exists u \in V_i\ \exists w \in V_j : (u, w) \in E\},$$
and edge weights
$$w^*(v_i, v_j) := \min_{u \in V_i,\ w \in V_j,\ (u,w) \in E} \{ d(v_i, u) + w(u, w) + d(w, v_j) \}.$$
By contracting edges on the shortest paths connecting Voronoi nodes, one can see that $G^*$ is a minor of $G$ (see for example Wolff [Wol08, Lemma 4]; minors are defined in Definition 21).
Figure 6.1 illustrates a Voronoi diagram and a graph Voronoi diagram. Although the classical Voronoi dual of a non-degenerate set of points in the plane is always a triangulation, the graph Voronoi dual is not necessarily a triangulation, even for planar graphs. For example, a graph Voronoi dual may have nodes whose removal would disconnect the graph.
¹ More folklore in the style of Erdős numbers: according to the Mathematics Genealogy Project, available online at genealogy.math.ndsu.nodak.edu, there is a tree path in the advisor graph from Dirichlet to my mentor and collaborator on the Voronoi method, Michael E. Houle: Gustav Peter Lejeune Dirichlet → Rudolf Otto Sigismund Lipschitz → Christian Felix Klein → Carl Louis Ferdinand Lindemann → Arnold Johannes Wilhelm Sommerfeld → Ernst Adolph Guillemin → Samuel Jefferson Mason → Robert Wellington Donaldson → Godfried Theodore Patrick Toussaint → Michael Edward Houle.
Figure 6.1: The Voronoi diagram and the Delaunay triangulation of the plane for a set of Voronoi sites {A, B, ..., G}, and the graph Voronoi diagram and its dual for a set of Voronoi nodes {A, B, ..., G} in an unweighted graph (note that the graph Voronoi dual is not necessarily a triangulation).
Erwig [Erw00, Theorem 2] showed that the graph Voronoi diagram can be constructed with a single Dijkstra search in time $O(m + n \lg n)$. A heap is used to store the shortest path distances from nodes to their closest Voronoi node. The heap is initialized to store the Voronoi nodes themselves. Thereafter, as long as there are nodes in the queue, the minimum is extracted from the heap and processed (or settled) by assigning to it a Voronoi region, storing the distance to its Voronoi node, and adding to or updating its neighbors in the queue. We slightly modify this construction of the Voronoi diagram [Erw00, Section 3.1] to compute the Voronoi dual, that is, to also compute $E^*$ and $w^*$ [...] $(v_{\mathrm{vor}(u)}, v_{\mathrm{vor}(u')})$ with weight
$$w_{G^*}(v_{\mathrm{vor}(u)}, v_{\mathrm{vor}(u')}) = d_G(v_{\mathrm{vor}(u)}, u) + w_G(u, u') + d_G(u', v_{\mathrm{vor}(u')})$$
is added, or its length is decreased if there already is an edge in $G^*$ [...] $= (v_{\mathrm{vor}(u_0)}, v_{\mathrm{vor}(u_1)}, \dots, v_{\mathrm{vor}(u_h)})$.
Note that the Voronoi path $P^*$ [...]
Algorithm 5
$V' := V \cup \{v_0\}$, $E' := E$
$i := 1$
for $u \in K$ do
    $v_i := u$
    $\mathrm{vor}(v_i) := i$
    $E' := E' \cup \{\{v_0, u\}\}$
    $i := i + 1$
end for
HEAP.put($v_0$)
while not HEAP.empty do
    $u_{cur}$ := HEAP.extractMin
    for each neighbor $u$ of $u_{cur}$ do
        if $\mathrm{vor}(u)$ = undefined then
            $\mathrm{vor}(u) := \mathrm{vor}(u_{cur})$
            HEAP.insert($u$, $d(v_0, u_{cur}) + w(u_{cur}, u)$)
        else if $d(v_0, u_{cur}) + w(u_{cur}, u) < d(v_0, u)$ then
            $\mathrm{vor}(u) := \mathrm{vor}(u_{cur})$
            HEAP.decreaseKey($u$, $d(v_0, u_{cur}) + w(u_{cur}, u)$)
        else if not HEAP.contains($u$) and $\mathrm{vor}(u) \ne \mathrm{vor}(u_{cur})$ then
            if $(v_{\mathrm{vor}(u_{cur})}, v_{\mathrm{vor}(u)}) \notin E^*$ then
                $E^* := E^* \cup \{(v_{\mathrm{vor}(u_{cur})}, v_{\mathrm{vor}(u)})\}$
                $w^*(v_{\mathrm{vor}(u_{cur})}, v_{\mathrm{vor}(u)}) := \infty$
            end if
            if $w^*(v_{\mathrm{vor}(u_{cur})}, v_{\mathrm{vor}(u)}) > d(v_{\mathrm{vor}(u_{cur})}, u_{cur}) + w(u_{cur}, u) + d(u, v_{\mathrm{vor}(u)})$ then
                $w^*(v_{\mathrm{vor}(u_{cur})}, v_{\mathrm{vor}(u)}) := d(v_{\mathrm{vor}(u_{cur})}, u_{cur}) + w(u_{cur}, u) + d(u, v_{\mathrm{vor}(u)})$
            end if
        end if
    end for
end while
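The construction above can be sketched in a few lines of Python. This is a minimal sketch under our own assumptions, not the thesis implementation: the graph is a dictionary `adj` mapping each node to a list of `(neighbor, weight)` pairs with positive weights, nodes are mutually comparable (for example integers), and a binary heap with lazy deletion stands in for the decrease-key operation. The function name `voronoi_dual` is hypothetical. A dual edge is recorded only when both endpoints are settled, so both distances to the Voronoi nodes are final.

```python
import heapq
from math import inf

def voronoi_dual(adj, sites):
    """Return (vor, dist, dual) for an undirected graph with positive weights.

    vor[u]  : Voronoi node whose region contains u
    dist[u] : distance from u to vor[u]
    dual    : dict frozenset({a, b}) -> weight of the dual edge between sites a, b
    """
    dist = {u: inf for u in adj}
    vor = {u: None for u in adj}
    heap = []
    for s in sites:                      # all sites start at distance 0
        dist[s], vor[s] = 0.0, s
        heap.append((0.0, s))
    heapq.heapify(heap)
    settled, dual = set(), {}
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue                     # stale entry (lazy deletion)
        settled.add(u)
        for v, w in adj[u]:
            if v in settled:
                if vor[v] != vor[u]:     # edge crosses a region boundary
                    key = frozenset((vor[u], vor[v]))
                    dual[key] = min(dual.get(key, inf), dist[u] + w + dist[v])
            elif d + w < dist[v]:        # relax: v tentatively joins u's region
                dist[v], vor[v] = d + w, vor[u]
                heapq.heappush(heap, (d + w, v))
    return vor, dist, dual

# Path 0 - 1 - 2 - 3 - 4 with weights 1, 1, 2, 1 and Voronoi nodes 0 and 4.
adj = {0: [(1, 1.0)], 1: [(0, 1.0), (2, 1.0)], 2: [(1, 1.0), (3, 2.0)],
       3: [(2, 2.0), (4, 1.0)], 4: [(3, 1.0)]}
vor, dist, dual = voronoi_dual(adj, [0, 4])
print(vor, dual)
```

Every edge whose endpoints lie in different regions is examined when the later of its two endpoints is settled, which mirrors the argument used in the proof of Lemma 38.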
Proof. Suppose that there is no such path $P^*$ in $G^*$. As $u_i, u_{i+1}$ are consecutive nodes on the path $P$, we know that $(u_i, u_{i+1}) \in E$. This contradicts the definition of the Voronoi dual (Def. 40), since $(u_i, u_{i+1}) \in E$ and $v_{\mathrm{vor}(u_i)} \ne v_{\mathrm{vor}(u_{i+1})}$ together imply that $(v_{\mathrm{vor}(u_i)}, v_{\mathrm{vor}(u_{i+1})}) \in E^*$.
[...] $P^*$,
$$\mathrm{Sleeve}_{(G,G^*)}(P^*) := G\Bigl[\,\bigcup_{v_i \in P^*} V_i\Bigr].$$
The Voronoi sleeve is related to a subgraph sometimes termed corridor.
With the definitions at hand we can now state the approximation method.
6.3 The Voronoi Method
This section describes the preprocessing and query algorithms of the Voronoi method. Both algorithms are conceptually very simple and thus easy to implement.
In preprocessing, each node is selected as a Voronoi site independently at random with probability $p$, and the Voronoi dual is computed for the selected sites (Algorithm 6). For the sake of exposition, we treat the computation of the Voronoi dual as a black box, denoted by ComputeVoronoiDual.
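Algorithm 6 Preprocessing
Input: graph $G = (V, E, w)$, sampling rate $p \in [0, 1]$.
Output: Voronoi dual $G^* = (K, E^*, w^*)$.
Select each node of $V$ independently at random with probability $p$; let $K$ be the set of selected nodes.
$G^*$ := ComputeVoronoiDual($G$, $K$)
Return $G^*$.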
Lemma 38. For a graph $G = (V, E)$ with $n := |V|$ and $m := |E|$, Algorithm 6 takes time proportional to that of Dijkstra's single source shortest path algorithm.
Proof. Erwig's variant of Dijkstra's algorithm computes the graph Voronoi diagram in a worst-case time proportional to Dijkstra's algorithm [Erw00, Theorem 2]. The only modification of Algorithm 5 compared to Erwig's variant is the following: for each node, at the time it is settled, all its neighbors are inspected. Therefore, each edge is additionally considered two times in total. This yields the same asymptotic running time.
The preprocessing time complexity is proportional to the cost of computing one single source shortest path tree. Details are discussed in Section 6.4.
At query time, given a graph $G$ and its Voronoi dual $G^*$ [...] and $M^*$ [...] $E[N^*] = p \cdot n$. The expected query time without refinement (computing the shortest path in the Voronoi sleeve) is at most $O(N^* \lg N^* + M^*)$. The time for the refinement step depends on the size of the Voronoi sleeve. The analysis will show that the refinement step is not necessary for the approximation ratio to hold for long distance queries; however, it makes a practical difference for the quality of paths. For $p = O(n^{-2/3})$, $E[N^*] = O(n^{1/3})$, and thus we can afford to compute all-pairs shortest path distances in the Voronoi dual $G^*$ [...]
Algorithm 7 Query
Input: graph $G$, Voronoi dual $G^*$, source $s$, target $t$.
Output: an approximate shortest path $P$ from $s$ to $t$.
1: Find Voronoi source $v_{\mathrm{vor}(s)}$ from $s$ and Voronoi target $v_{\mathrm{vor}(t)}$ from $t$. If thereby a shortest path $SP_G(s, t)$ has been found, return it.
2: Compute a shortest path from $v_{\mathrm{vor}(s)}$ to $v_{\mathrm{vor}(t)}$ in the Voronoi dual $G^*$: $SP_{G^*}(v_{\mathrm{vor}(s)}, v_{\mathrm{vor}(t)})$.
3: Compute the Voronoi sleeve $S := \mathrm{Sleeve}(SP_{G^*}(v_{\mathrm{vor}(s)}, v_{\mathrm{vor}(t)}))$.
4: Compute a shortest path from $s$ to $t$ in the Voronoi sleeve, $SP_S(s, t)$.
5: Return $P = SP_S(s, t)$.
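To make the interplay of preprocessing and query concrete, the following Python sketch strings the steps together. It is an illustration under our own assumptions rather than the implementation evaluated in this chapter: the graph is connected, given as a dictionary `adj` of `(neighbor, weight)` lists with comparable node labels, and at least one Voronoi node exists. Step 1 is simplified to a table lookup `vor[s]`, so the early exit for nearby targets is omitted. All function names (`dijkstra`, `sample_sites`, `preprocess`, `walk_back`, `query`) are hypothetical.

```python
import heapq
import random
from math import inf

def dijkstra(adj, sources, allowed=None):
    """Multi-source Dijkstra returning distance, parent and origin (closest source).
    If `allowed` is given, the search never leaves that node set."""
    dist = {s: 0.0 for s in sources}
    parent = {s: None for s in sources}
    origin = {s: s for s in sources}
    heap = [(0.0, s) for s in sources]
    heapq.heapify(heap)
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue                                  # stale heap entry
        done.add(u)
        for v, w in adj[u]:
            if allowed is not None and v not in allowed:
                continue
            if d + w < dist.get(v, inf):
                dist[v], parent[v], origin[v] = d + w, u, origin[u]
                heapq.heappush(heap, (d + w, v))
    return dist, parent, origin

def sample_sites(adj, p, rng=random):
    """Sampling step of Algorithm 6: keep each node with probability p."""
    return [u for u in adj if rng.random() < p]

def preprocess(adj, sites):
    """Voronoi regions and dual adjacency (site -> {site: weight})."""
    dist, _, vor = dijkstra(adj, sites)
    dual = {s: {} for s in sites}
    for u in adj:
        for v, w in adj[u]:
            if vor[u] != vor[v]:                      # boundary edge
                cand = dist[u] + w + dist[v]
                if cand < dual[vor[u]].get(vor[v], inf):
                    dual[vor[u]][vor[v]] = dual[vor[v]][vor[u]] = cand
    return vor, dual

def walk_back(parent, t):
    """Follow parent pointers from t back to the search root."""
    path = []
    while t is not None:
        path.append(t)
        t = parent[t]
    return path[::-1]

def query(adj, vor, dual, s, t):
    """Steps 2 to 5 of Algorithm 7; returns (length, path)."""
    dual_adj = {a: list(nbrs.items()) for a, nbrs in dual.items()}
    _, dual_parent, _ = dijkstra(dual_adj, [vor[s]])          # step 2
    regions = set(walk_back(dual_parent, vor[t]))
    sleeve = {u for u in adj if vor[u] in regions}            # step 3
    dist, parent, _ = dijkstra(adj, [s], allowed=sleeve)      # step 4
    return dist[t], walk_back(parent, t)

adj = {0: [(1, 1.0), (3, 5.0)], 1: [(0, 1.0), (2, 1.0)],
       2: [(1, 1.0), (3, 1.0)], 3: [(2, 1.0), (0, 5.0)]}
vor, dual = preprocess(adj, sample_sites(adj, 0.5) or [0])    # fall back to one site
print(query(adj, vor, dual, 0, 3))
```

With every node selected as a Voronoi node the dual coincides with the graph itself and the query is exact; with a single Voronoi node the sleeve is the whole graph, so the query degenerates to an ordinary Dijkstra search.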
Figure 6.2: Illustration of the query algorithm of the Voronoi method. Left to right, top to bottom: (1) the original shortest path, (2) shortest path in the (weighted) dual, (3) sleeve, and (4) shortest path in the sleeve.
6.4 Computational Complexity
In this section we study the cost of computing a Voronoi dual. Recall that in Erwig's algorithm [Erw00, Section 3.1] the graph Voronoi diagram is constructed with a single Dijkstra search. A heap is used to store the shortest path distances from nodes to their closest Voronoi node. Conceptually, a dummy node with a zero-weighted edge to each of the Voronoi nodes is added, the dummy node is inserted into the heap, and the Dijkstra single source shortest path search is executed. The running times of different implementations of Dijkstra's algorithm depend on the priority queue employed (see Table 2.5). Using Fibonacci heaps [FT87], Dijkstra's algorithm takes time $O(m + n \lg n)$.
Erwig also claims a time lower bound of $\Omega(\max(n, (n - k) \lg k))$ [Erw00, Theorem 1]. The lower bound simplifies to $\Omega(n \lg n)$ when the number of Voronoi nodes is assumed to be $k = n^C$ for a fixed choice of $C \in (0, 1)$. Assuming that all edges must be inspected at construction time, this lower bound would be tight. The bound is information theoretic: for a connected graph, each node $w \in V \setminus K$ is in exactly one of the $k$ regions $V_i$. Encoding one instance out of these $k^{n-k}$ possibilities requires $\lg k^{n-k} = (n - k) \lg k$ bits.
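The simplification to $\Omega(n \lg n)$ can be spelled out as a short worked step. The estimate $n - n^C \ge n/2$ for sufficiently large $n$ is our own elementary observation, added only to make the asymptotics explicit:

```latex
\lg k^{\,n-k} = (n-k)\lg k
\quad\text{and, for } k = n^{C},\ C \in (0,1):\quad
(n - n^{C})\,\lg n^{C} \;=\; C\,(n - n^{C})\,\lg n \;\ge\; \tfrac{C}{2}\, n \lg n
\;=\; \Omega(n \lg n),
```

where the inequality holds as soon as $n^{C} \le n/2$.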
For some graphs with special properties, Erwig's lower bound may not apply. Eppstein and Goodrich [EG08] presented a linear-time algorithm to compute the Voronoi diagram for road networks satisfying certain geometric properties. Also, the lower bound may not hold under different models of computation, such as the word RAM model. This model assumes that basic operations such as adding two words require a single time step, and that the time complexity is the number of word operations executed. The space complexity is the number of words of storage required, assuming that any identifier (such as a node label) or value (such as a distance) can be contained in a single word. Under the word RAM model, the implementation of Dijkstra's algorithm by Thorup [Tho04b] requires only $O(m + n \lg \lg n)$ time.
Corollary 39. The graph Voronoi dual can be computed in time $O(m + n \lg \lg n)$ in the word RAM model.
Note that the time upper bound under the word RAM model does not contradict Erwig's information-theoretic lower bound [Erw00, Theorem 1] of $\Omega(n \lg n)$ bits.
Computing a graph Voronoi dual does not actually require the use of Dijkstra's algorithm: any single source shortest path algorithm (including parallel and distributed algorithms) can be used to compute a graph Voronoi dual as follows. Instead of an adapted Dijkstra search, we may also
1. augment $G$ by introducing a dummy node $v_0$ connected to each of the Voronoi nodes with an edge of length zero,
2. run any single source shortest path algorithm in the augmented graph $G'$ with $v_0$ as its source, and
3. explore the search tree rooted at $v_0$ by following shortest path edges only.
This last step simulates a Dijkstra search by following the single source shortest path tree without using any expensive decrease-key operations (these operations have to be avoided to reduce the worst-case running time [Tho00b, Tho07]); a First-In-First-Out queue with constant time for the enqueue and dequeue operations is sufficient. For a pseudo-code description, see Algorithm 8. Although the construction is mainly of theoretical interest, it may be useful for example for parallel or distributed algorithms and for software that must rely on certain libraries.
Note that, if a single source shortest path algorithm $A$ works for a special class of graphs $\mathcal{G}$, the augmented graph $G'$ [...]
Algorithm 8
$G' := (V', E')$ with $V' = V \cup \{v_0\}$ and $E' = E \cup \{(v_0, v) : v \in K\}$ with $w'(v_0, v) = \epsilon$ (one would set $\epsilon = 0$ if possible; if only positive edge weights are allowed, other values work as well)
$D := \mathrm{SSSP}(G', v_0)$, where $D$ is the distance vector storing the distance from $v_0$ to each node $u \in V$
for $i := 1$ to $k = |K|$ do
    $\mathrm{vor}(v_i) := i$
    FIFO.enqueue($v_i$)
end for
while not FIFO.empty do
    $u_{cur}$ := FIFO.dequeue
    for each neighbor $u$ of $u_{cur}$ do
        if $D(u) = D(u_{cur}) + w(u, u_{cur})$ and $\mathrm{vor}(u)$ = undef then
            $\mathrm{vor}(u) := \mathrm{vor}(u_{cur})$
            FIFO.enqueue($u$)
        else if $\mathrm{vor}(u) \ne$ undef and $\mathrm{vor}(u) \ne \mathrm{vor}(u_{cur})$ then
            if $(v_{\mathrm{vor}(u_{cur})}, v_{\mathrm{vor}(u)}) \notin E^*$ then
                $E^* := E^* \cup \{(v_{\mathrm{vor}(u_{cur})}, v_{\mathrm{vor}(u)})\}$
                $w^*(v_{\mathrm{vor}(u_{cur})}, v_{\mathrm{vor}(u)}) := \infty$
            end if
            if $w^*(v_{\mathrm{vor}(u_{cur})}, v_{\mathrm{vor}(u)}) > D(u_{cur}) + w(u_{cur}, u) + D(u)$ then
                $w^*(v_{\mathrm{vor}(u_{cur})}, v_{\mathrm{vor}(u)}) := D(u_{cur}) + w(u_{cur}, u) + D(u)$
            end if
        end if
    end for
end while
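The black-box character of this construction can be illustrated with a short Python sketch. It is a minimal sketch under our own assumptions: the routine `sssp` is a plain heap-based Dijkstra used only as a stand-in for whichever exact single source shortest path algorithm is available, the dummy node is a fresh Python object connected by zero-weight edges, and the name `voronoi_dual_via_sssp` is hypothetical. The exact equality test on distances is safe here because the stand-in computes each distance as a sum along one parent edge; an SSSP routine with different rounding behaviour would need a tolerance.

```python
import heapq
from collections import deque
from itertools import count
from math import inf

def sssp(adj, source):
    """Stand-in exact SSSP (Dijkstra); any other exact routine could be used."""
    dist = {u: inf for u in adj}
    dist[source] = 0.0
    tie = count()                              # tie-breaker, nodes need not be comparable
    heap = [(0.0, next(tie), source)]
    while heap:
        d, _, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                           # stale entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, next(tie), v))
    return dist

def voronoi_dual_via_sssp(adj, sites):
    """Voronoi regions and dual edges from one SSSP run plus a FIFO exploration."""
    dummy = object()                           # dummy node v0
    aug = dict(adj)
    aug[dummy] = [(s, 0.0) for s in sites]     # zero-weight edges to all sites
    D = sssp(aug, dummy)
    vor = {s: s for s in sites}
    fifo = deque(sites)
    dual = {}
    while fifo:
        cur = fifo.popleft()
        for u, w in adj[cur]:
            if u not in vor:
                if D[u] == D[cur] + w:         # shortest path edge: u joins cur's region
                    vor[u] = vor[cur]
                    fifo.append(u)
            elif vor[u] != vor[cur]:           # boundary edge: update dual weight
                key = frozenset((vor[cur], vor[u]))
                dual[key] = min(dual.get(key, inf), D[cur] + w + D[u])
    return vor, dual

# Path 0 - 1 - 2 - 3 - 4 with weights 1, 1, 2, 1 and Voronoi nodes 0 and 4.
adj = {0: [(1, 1.0)], 1: [(0, 1.0), (2, 1.0)], 2: [(1, 1.0), (3, 2.0)],
       3: [(2, 2.0), (4, 1.0)], 4: [(3, 1.0)]}
print(voronoi_dual_via_sssp(adj, [0, 4]))
```

After the SSSP call, each node is enqueued once and each edge is scanned once per endpoint, which is the $O(n + m)$ overhead stated in Theorem 40.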
Theorem 40. Using any single source shortest path algorithm for general graphs with running time $S(n, m, W)$, Algorithm 8 computes a graph Voronoi dual in time $O(n + m + S(n, m, W))$.
Proof. After running the SSSP algorithm in time $S(n, m, W)$, Algorithm 8 visits every node exactly once and every edge exactly twice (once for each end point).
For undirected graphs with integer or floating point weights, we may use the $O(m)$-time SSSP algorithm of Thorup [Tho99, Tho00a].
Corollary 41. For undirected graphs with integer or floating point weights, the graph Voronoi dual can be computed in time $O(m + n)$ in the word RAM model.
For real weights and undirected graphs, we may use the $O(m + n \lg \lg n)$-time algorithm of Pettie and Ramachandran [PR02].
Corollary 42. For undirected graphs with real weights, the graph Voronoi dual can be computed in time $O(m + n \lg \lg n)$.
For road networks, we may use the linear-time algorithm of Eppstein and Goodrich [EG08].
Practical considerations. Note that storing each Voronoi node twice, once as a graph node and once as a dual node, causes unnecessary additional space consumption. However, when both the original graph and the dual graph are stored in the same structure, searching the dual could result in a substantial number of cache misses, since Voronoi nodes are roughly $1/p$ positions apart. Adapting the memory organization by reordering the nodes such that the memory locations used for Voronoi nodes are close together may potentially increase the cache efficiency [GKW07].
6.5 Stretch Analysis
In this section, we prove that the expected path length approximation ratio (stretch) is logarithmic in the number of edges of an exact shortest path. The bound on the stretch is the main theoretical result of this chapter.
In this section, to simplify notation, we only consider the multiplicative stretch $\alpha$. We write stretch $\alpha$ instead of stretch $(\alpha, 0)$ as originally defined in Definition 30.
Theorem 43. For shortest paths having $h$ edges, Algorithm 7, given a graph and its Voronoi dual with sampling rate $p$ (constructed by Algorithm 6), has expected worst-case stretch $O(\lg_{1/(1-p)} h)$.
The path $SP_S(s, t)$ found by the algorithm is an approximation, since it is possible that no actual shortest path $SP_G(s, t)$ lies entirely within the Voronoi sleeve $S$. We explain how this is possible, and give an upper bound on the expected length $\ell(SP_S(s, t))$. For this purpose, we prove relationships between the lengths of simple paths $P$ and their corresponding Voronoi paths $P^*$ [...] $w^*(v_{\mathrm{vor}(u_k)}, v_{\mathrm{vor}(u_{k+1})})$ of an edge between two Voronoi nodes on the path $P^*$ [...]
$$w^*(v_{\mathrm{vor}(u_k)}, v_{\mathrm{vor}(u_{k+1})}) \le d(v_{\mathrm{vor}(u_k)}, u_k) + w(u_k, u_{k+1}) + d(u_{k+1}, v_{\mathrm{vor}(u_{k+1})}).$$
From the Voronoi condition, we observe that $d(u_k, v_{\mathrm{vor}(u_k)}) \le d(u_k, v_{\mathrm{vor}(u_j)})$ for all $j$. Due to the assumption that $s$ and $t$ are also Voronoi nodes, this also holds for source and target. That is,
$$d(u_k, v_{\mathrm{vor}(u_k)}) \le d(s, u_k), \qquad d(u_k, v_{\mathrm{vor}(u_k)}) \le d(u_k, t), \qquad d(u_k, v_{\mathrm{vor}(u_k)}) = d(v_{\mathrm{vor}(u_k)}, u_k).$$
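This yields:
$$\begin{aligned}
\ell(P^*) &\le w^*(s, v_{\mathrm{vor}(u_1)}) + \sum_{k=1}^{h-2} \Bigl( d(v_{\mathrm{vor}(u_k)}, u_k) + w(u_k, u_{k+1}) + d(u_{k+1}, v_{\mathrm{vor}(u_{k+1})}) \Bigr) + w^*(v_{\mathrm{vor}(u_{h-1})}, t) \\
&\le w(s, u_1) + d(u_1, v_{\mathrm{vor}(u_1)}) + \sum_{k=1}^{h-2} \Bigl( d(v_{\mathrm{vor}(u_k)}, u_k) + d(u_{k+1}, v_{\mathrm{vor}(u_{k+1})}) \Bigr) + \sum_{k=1}^{h-2} w(u_k, u_{k+1}) + d(v_{\mathrm{vor}(u_{h-1})}, u_{h-1}) + w(u_{h-1}, t) \\
&\le d(s, t) + \sum_{k=1}^{h-1} \Bigl( d(s, u_k) + d(u_k, t) \Bigr) = h \cdot \ell(P).
\end{aligned}$$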
There exist constructions for which the bound can be shown to be tight. For example, for any choice of $a > \epsilon > 0$, the edge weights of $G$ may be chosen such that $d(u_k, v_{\mathrm{vor}(u_k)}) = a - \epsilon$, $w(u_k, u_{k+1}) = \epsilon$, and $w(s, u_1) = w(u_{h-1}, t) = a$. Path $P$ has length $2a + (h - 2)\epsilon$, and the Voronoi path $P^*$ has length $2a + (h - 2)\epsilon + 2(h - 1)(a - \epsilon)$. The worst case is attained for very small $\epsilon$. As $\epsilon \to 0$, the ratio $\ell(P^*)/\ell(P) \to h$.
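The limit can be checked numerically. The snippet below evaluates the two path lengths of the tight construction for decreasing values of $\epsilon$; the function name `stretch_ratio` and the chosen parameter values are ours and serve only as an illustration.

```python
def stretch_ratio(h, a, eps):
    """Ratio l(P*) / l(P) for the tight instance with h edges (requires a > eps > 0)."""
    path_len = 2 * a + (h - 2) * eps                    # length of P
    voronoi_len = path_len + 2 * (h - 1) * (a - eps)    # length of the Voronoi path P*
    return voronoi_len / path_len

for eps in (1e-1, 1e-3, 1e-6):
    print(eps, stretch_ratio(h=10, a=1.0, eps=eps))
```

For every admissible $\epsilon$ the ratio stays strictly below $h$, and it approaches $h$ as $\epsilon$ shrinks.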
If in addition to the endpoints there are Voronoi nodes on the shortest path, the maximum stretch is guaranteed to be smaller than the number of edges on the shortest path. In the following lemma, we prove that the maximum stretch is proportional to the largest gap between Voronoi nodes on the path. The proof is a simple composition of Lemma 44, and is supported by the illustration in Figure 6.4.
Figure 6.3: $s$, $t$, and $v_i$ are Voronoi nodes (edge labels in the figure: $a$, $b$, $c$, $a + b$, $a + c$, $b + c$, $a + b + 2c$; legend: path/edge in $G$, path/edge in $G^*$, Voronoi region boundary). The shortest path from $s$ to $t$ leads through $u$, which is in $v_i$'s Voronoi region (if $c < a$ and $c < b$), and paths in the Voronoi dual pass through $v_i$. If [...] $< a + b + 2c$, the shortest path in the Voronoi dual $SP_{G^*}$ takes the left-hand route, and the Voronoi sleeve $S$ does not contain $u$.
Figure 6.4: The shortest path between two Voronoi nodes $s$ and $t$ with $h - 1$ intermediate nodes $u_1, \dots, u_{h-1}$. The distance between two Voronoi nodes that are adjacent in the Voronoi dual is at most $w^*(v_{\mathrm{vor}(u_k)}, v_{\mathrm{vor}(u_{k+1})}) \le d(v_{\mathrm{vor}(u_k)}, u_k) + w(u_k, u_{k+1}) + d(u_{k+1}, v_{\mathrm{vor}(u_{k+1})})$.
Lemma 45. Let $P = (v_i, u_1, \dots, u_{h-1}, v_j)$ be a simple path of length $\ell(P)$ between two Voronoi nodes $v_i = u_0$ and $v_j = u_h$. Let $\bar{h}$ denote the largest gap of $P$. The corresponding Voronoi path $P^*$ [...] $\bigcup_{k} P_k = P$, $\sum_{k} h_k = h$, $\forall k : h_k \le \bar{h}$). Composition of Lemma 44 leads to the following bound on the path length:
$$\ell(P^*) \le \sum_{k} h_k\, \ell(P_k) \le \sum_{k} \Bigl( \max_{j} h_j \Bigr) \ell(P_k) \le \bar{h} \cdot \ell(P).$$
Tightness can be shown with the same example as in the proof of Lemma 44.
Lemma 47 gives an upper bound on the expected size of the largest gap. We use the following lemma by Szpankowski and Rego [SR90] concerning the maximum of geometric random variables.
Lemma 46 (Szpankowski and Rego [SR90, eq. (2.6) and (2.12)]). Let $X_i$, $i = 1, 2, \dots, n$ be a set of i.i.d. random variables distributed according to the geometric distribution with parameter $p$. That is, for every $i = 1, 2, \dots, n$ and $k \in \mathbb{N}^+$,
$$\Pr[X_i = k] = (1 - p)^{k-1} p, \qquad E[X_i] = p^{-1}, \qquad E[X_i^2] = (2 - p)\, p^{-2}.$$
Let $M_n = \max\{X_1, X_2, \dots, X_n\}$. The expected value of $M_n$ is
$$E[M_n] = -\sum_{k=1}^{n} (-1)^k \binom{n}{k} \frac{1}{1 - (1 - p)^k} = \lg_{1/(1-p)} n + O(1).$$
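The two expressions for $E[M_n]$ can be compared numerically. The sketch below evaluates the alternating sum exactly with rational arithmetic (feasible for small $n$) and, as an independent check that is our own addition, the equivalent tail-sum form $E[M_n] = \sum_{j \ge 0} \bigl(1 - (1 - (1-p)^j)^n\bigr)$, which is numerically stable for large $n$. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb, log

def expected_max_alternating(n, p):
    """Exact E[M_n] via the alternating sum of Lemma 46; pass p as a Fraction."""
    q = 1 - p
    return -sum((-1) ** k * comb(n, k) / (1 - q ** k) for k in range(1, n + 1))

def expected_max_tail(n, p, tol=1e-12):
    """E[M_n] as the sum over j >= 0 of Pr[M_n > j], truncated once terms drop below tol."""
    q = 1.0 - float(p)
    total, j = 0.0, 0
    while True:
        term = 1.0 - (1.0 - q ** j) ** n    # Pr[M_n > j]
        if term < tol:
            return total
        total += term
        j += 1

p = Fraction(1, 4)
for n in (1, 10, 100, 10_000):
    approx = log(n) / log(1 / (1 - float(p)))    # lg_{1/(1-p)} n
    print(n, expected_max_tail(n, p), approx)
print(expected_max_alternating(10, p), expected_max_tail(10, p))
```

The difference between the exact expectation and $\lg_{1/(1-p)} n$ stays bounded as $n$ grows, in line with the $O(1)$ term of the lemma.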
Lemma 47. In a path of length $h - 1$, where each node has been selected as a Voronoi node independently at random with probability $p$, the longest sequence of non-Voronoi nodes is of expected length at most $O(\lg_{1/(1-p)} h)$.
Proof. The path can be seen as a sequence of coin tosses, for which we want to bound the expected length of the longest sequence of tails. This problem is known as the Longest Success-Run [EMK97, Ch. 8.5]. We wish to bound the expectation of the maximum of $N$ independent geometric random variables with probability $p$ and sum $h - 1 - N$ ($N$ itself being a random variable).
To derive a bound on the expectation, we observe that by dropping the sum condition, and by taking the maximum over $h \ge N$ random variables, the maximum value obtained can only increase.
By Lemma 46, the expectation of the maximum of $h$ geometric random variables with probability $p$ is known to be at most $O(\lg_{1/(1-p)} h)$.
We now combine Lemmas 44, 45, and 47 to prove Theorem 43.
Proof of Theorem 43. Consider first the case where $s$ and $t$ are both Voronoi nodes.
Let $\bar{h}$ denote the largest gap of some shortest path $SP_G(s, t)$. Lemma 45 implies that the corresponding Voronoi path $(SP_G(s, t))^*$ [...] $\ell(SP_G(s, t))$. The path $SP_{G^*}(s, t)$ in the Voronoi dual corresponds to a path $P$ [...]
$$[\dots] = \ell(SP_{G^*}(s, t)) \le \ell((SP_G(s, t))^*) \le \bar{h} \cdot \ell(SP_G(s, t)).$$
Recall that nodes are independently selected as Voronoi nodes with sampling rate $p$. For a shortest path with $h$ edges, the expected largest gap $\bar{h}$ is at most $O(\lg_{1/(1-p)} h)$ by Lemma 47.
For the case where either $s$ or $t$ (or both) are not Voronoi nodes, if the path returned by Algorithm 7 has been found in Step 1, it is optimal, and the result holds trivially. For the remainder of the proof we assume that the shortest path has not been found in Step 1. In this case, the path returned is at most as long as the shortest path $P_{\mathrm{vor}}$ in $G$ from $s$ to $t$ having $SP_{\mathrm{Sleeve}(SP_{G^*}(v_{\mathrm{vor}(s)}, v_{\mathrm{vor}(t)}))}(v_{\mathrm{vor}(s)}, v_{\mathrm{vor}(t)})$ as a subpath. In the following, we derive an upper bound on $\ell(P_{\mathrm{vor}})$ with respect to the number of edges on the shortest path between $s$ and $t$, denoted by $h'$. We have that
$$\ell(P_{\mathrm{vor}}) \le d(s, v_{\mathrm{vor}(s)}) + d^*(v_{\mathrm{vor}(s)}, v_{\mathrm{vor}(t)}) + d(v_{\mathrm{vor}(t)}, t).$$
Since the shortest path from $s$ to $t$ has not already been found directly in Step 1, it must be true that both $d(s, v_{\mathrm{vor}(s)}) \le d(s, t)$ and $d(t, v_{\mathrm{vor}(t)}) \le d(s, t)$. It remains to bound the distance between $v_{\mathrm{vor}(s)}$ and $v_{\mathrm{vor}(t)}$ in the dual graph.
Observe that augmenting the graph $G$ with one edge $(u, v_{\mathrm{vor}(u)})$ of weight $d(u, v_{\mathrm{vor}(u)})$ for each non-Voronoi node $u \in V \setminus K$ affects neither the Voronoi diagram nor the Voronoi dual, since the nodes on the shortest path from $v_{\mathrm{vor}(u)}$ to $u$ cannot be interfered with by another Voronoi node. In the augmented primal graph, by the triangle inequality (for details, see Lemma 31), we have that
$$d(v_{\mathrm{vor}(s)}, v_{\mathrm{vor}(t)}) \le d(v_{\mathrm{vor}(s)}, s) + d(s, t) + d(t, v_{\mathrm{vor}(t)}) \le 3\, d(s, t)$$
using a path with at most $1 + h'$ [...] $(v_{\mathrm{vor}(s)}, v_{\mathrm{vor}(t)})$ is also bounded by $O(\lg h'$ [...]
6.5 Preprocessing time versus speedup (with respect to Dijkstra's algorithm) tradeoff for the European road network. Plot on a doubly-logarithmic scale, y-axis reversed. Circles stand for variants of the Voronoi method and for the related exact methods listed in Table 6.1. Transit-Node Routing (TNR) has the best speedup and the slowest preprocessing. Contraction Hierarchies (CHHNR) and Highway Hierarchies (HH) achieve a very good tradeoff between preprocessing and query times. A* has short preprocessing times but rather low speedup. The Voronoi method (Chapter 6) has the fastest preprocessing times with competitive speedups at the cost of exactness.
6.6 Approximate path length versus actual shortest path length for VORROOT with sleeve steps omitted on the European road network, distance metric. The theoretical worst-case logarithmic dependency on the number of edges cannot be observed in the experimental results.
6.7 Approximate path length versus actual shortest path length for VORROOT using the sleeve on the European road network, distance metric. Refinement using the sleeve substantially improves the stretch in practice (also compare with Figure 6.6), although the theoretical performance is not affected.
Bibliography
[AB02] Reka Albert and Albert-L aszl o Barab asi. Statistical mechanics of complex net-
works. Reviews of Modern Physics, 74:4797, 2002. 4
[AB07] Sanjeev Arora and Boaz Barak. Computational complexity: A modern approach,
2007. Web draft dated 2007-01-08 22:02. 60, 61
[ABN08] Ittai Abraham, Yair Bartal, and Ofer Neiman. Embedding metric spaces in their
intrinsic dimension. In Proceedings of the Nineteenth Annual ACM-SIAM Sympo-
sium on Discrete Algorithms, SODA 2008, San Francisco, California, USA, Jan-
uary 20-22, 2008, pages 363372, 2008. 49, 121
[AC07] Noga Alon and Michael Capalbo. Finding disjoint paths in expanders determinis-
tically and online. In FOCS 07: Proceedings of the 48th Annual IEEE Symposium
on Foundations of Computer Science, pages 518524, 2007. 65
[ACC
+
96] Srinivasa Rao Arikati, Danny Z. Chen, L. Paul Chew, Gautam Das, Michiel H. M.
Smid, and Christos D. Zaroliagis. Planar spanners and approximate shortest path
queries among obstacles in the plane. In Algorithms - ESA 96, Fourth Annual
European Symposium, Barcelona, Spain, September 25-27, 1996, Proceedings,
pages 514528, 1996. 45, 47, 121
[ACIM99] Donald Aingworth, Chandra Chekuri, Piotr Indyk, and Rajeev Motwani. Fast
estimation of diameter and shortest paths (without matrix multiplication). SIAM
Journal on Computing, 28(4):11671181, 1999. Announced at SODA 1996. 28,
37, 38
[ACKM09] Dimitris Achlioptas, Aaron Clauset, David Kempe, and Cristopher Moore. On the
bias of traceroute sampling: Or, power-law degree distributions in regular graphs.
Journal of the ACM, 56(4), 2009. 5
[ACL00] William Aiello, Fan Rong King Chung, and Linyuan Lu. A random graph model
for massive graphs. In Proceedings of the 32nd Annual ACMSymposiumon Theory
of Computing, pages 171180, 2000. 23, 24, 56, 81, 82, 84, 96
[ACS99] Karen Aardal, Fabian A. Chudak, and David B. Shmoys. A 3-approximation al-
gorithm for the k-level uncapacitated facility location problem. Information Pro-
cessing Letters, 72:161167, 1999. 98
[ADD
+
93] Ingo Alth ofer, Gautam Das, David P. Dobkin, Deborah Joseph, and Jos e Soares.
On sparse spanners of weighted graphs. Discrete & Computational Geometry,
9:81100, 1993. 27
[ADG
+
06] Lyudmil Aleksandrov, Hristo Djidjev, Hua Guo, Anil Maheshwari, Doron Nuss-
baum, and J org-R udiger Sack. Approximate shortest path queries on weighted
polyhedral surfaces. In Mathematical Foundations of Computer Science 2006,
31st International Symposium, MFCS 2006, Star a Lesn a, Slovakia, August 28-
September 1, 2006, Proceedings, pages 98109, 2006. 11, 32
[AF90] Mikl os Ajtai and Ronald Fagin. Reachability is harder for directed than for undi-
rected nite graphs. The Journal of Symbolic Logic, 55(1):113150, 1990. An-
nounced at FOCS 1988. 41, 54, 62
[AFGW10] Ittai Abraham, Amos Fiat, Andrew V. Goldberg, and Renato Fonseca F. Werneck.
Highway dimension, shortest paths, and provably efcient algorithms. In ACM-
127
Bibliography
SIAMSymposiumon Discrete Algorithms (SODA10) January 17-19, 2010, Austin,
Texas, 2010. 22, 47, 55, 121
[AFWZ95] Noga Alon, Uriel Feige, Avi Wigderson, and David Zuckerman. Derandomized
graph products. Computational Complexity, 5:60–75, 1995. 65, 74
[AG06] Ittai Abraham and Cyril Gavoille. Object location using path separators. In
Proceedings of the Twenty-Fifth Annual ACM Symposium on Principles of Distributed
Computing, PODC 2006, Denver, CO, USA, July 23-26, 2006, pages 188–197,
2006. Details in LaBRI Research Report RR-1394-06. 46, 47, 83, 121
[AGGM06] Ittai Abraham, Cyril Gavoille, Andrew V. Goldberg, and Dahlia Malkhi. Routing
in networks with low doubling dimension. In 26th IEEE International Conference
on Distributed Computing Systems (ICDCS 2006), 4-7 July 2006, Lisboa, Portugal,
page 75, 2006. 49
[AGK+04] Vijay Arya, Naveen Garg, Rohit Khandekar, Adam Meyerson, Kamesh Munagala,
and Vinayaka Pandit. Local search heuristics for k-median and facility location
problems. SIAM Journal on Computing, 33(3):544–562, 2004. Announced at
STOC 2001. 98
[AGL07] Mattias Andersson, Joachim Gudmundsson, and Christos Levcopoulos. Approximate
distance oracles for graphs with dense clusters. Computational Geometry,
37(3):142–154, 2007. Announced at ISAAC 2004. 50, 121
[AGM97] Noga Alon, Zvi Galil, and Oded Margalit. On the exponent of the all pairs shortest
path problem. Journal of Computer and System Sciences, 54(2):255–262, 1997.
Announced at FOCS 1991. 36
[AGM+08] Ittai Abraham, Cyril Gavoille, Dahlia Malkhi, Noam Nisan, and Mikkel Thorup.
Compact name-independent routing with minimum stretch. ACM Transactions on
Algorithms, 4(3), 2008. 49
[AGMN92] Noga Alon, Zvi Galil, Oded Margalit, and Moni Naor. Witnesses for boolean
matrix multiplication and for shortest paths. In 33rd Annual Symposium on
Foundations of Computer Science, 24-27 October 1992, Pittsburgh, Pennsylvania, USA,
pages 417–426, 1992. 36
[AHL02] Noga Alon, Shlomo Hoory, and Nathan Linial. The Moore bound for irregular
graphs. Graphs and Combinatorics, 18(1):53–57, 2002. 28
[AI97] Cavit Aydin and Doug Ierardi. Partitioning algorithms for transportation graphs
and their applications to routing. In Proceedings of the 9th Canadian Conference
on Computational Geometry (CCCG), 1997. 54
[AI00] Yasuhito Asano and Hiroshi Imai. Practical efficiency of the linear-time algorithm
for the single source shortest path problem. Journal of the Operations Research
Society of Japan, 43(4):431–447, 2000. 33, 34
[AIP06] Alexandr Andoni, Piotr Indyk, and Mihai Patrascu. On the optimality of the
dimensionality reduction method. In 47th Annual IEEE Symposium on Foundations
of Computer Science (FOCS 2006), 21-24 October 2006, Berkeley, California,
USA, Proceedings, pages 449–458, 2006. 60, 62
[AJ89] Rakesh Agrawal and H. V. Jagadish. Materialization and incremental update of
path information. In Proceedings of the Fifth International Conference on Data
Engineering, February 6-10, 1989, Los Angeles, California, USA, pages 374–383,
1989. 10
[AJ94] Rakesh Agrawal and H. V. Jagadish. Algorithms for searching massive graphs.
IEEE Transactions on Knowledge and Data Engineering, 6(2):225–238, 1994. 51
[AJB99] Réka Albert, Hawoong Jeong, and Albert-László Barabási. Diameter of the
world-wide web. Nature, 401:130, 1999. 5
[Ajt88] Miklós Ajtai. A lower bound for finding predecessors in Yao's cell probe model.
Combinatorica, 8(3):235–247, 1988. 60
[Ake60] Sheldon B. Akers. The use of wye-delta transformations in network simplification.
Operations Research, 8(3):311–323, 1960. Announced at Rand Symposium on
Mathematical Programming 1959. 51
[Akg88] Mustafa Akgül. Shortest paths and the simplex method. 14th International
Conference on Mathematical Programming, Tokyo, 1988. 34
[Alo86] Noga Alon. Eigenvalues and expanders. Combinatorica, 6(2):83–96, 1986. 65
[Alo03] Noga Alon. Problems and results in extremal combinatorics - I. Discrete
Mathematics, 273(1-3):31–53, 2003. 30
[ALZ07] Luca Allulli, Peter Lichodzijewski, and Norbert Zeh. A faster cache-oblivious
shortest-path algorithm for undirected graphs with bounded edge lengths. In
Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms,
SODA 2007, New Orleans, Louisiana, USA, January 7-9, 2007, pages 910–919,
2007. 35
[AM05] Ittai Abraham and Dahlia Malkhi. Name independent routing for growth bounded
networks. In SPAA 2005: Proceedings of the 17th Annual ACM Symposium on
Parallel Algorithms, July 18-20, 2005, Las Vegas, Nevada, USA, pages 49–55,
2005. 49
[AMO93] Ravindra K. Ahuja, Thomas L. Magnanti, and James B. Orlin. Network Flows.
Prentice Hall, 1993. 32
[AMOT90] Ravindra K. Ahuja, Kurt Mehlhorn, James B. Orlin, and Robert Endre Tarjan.
Faster algorithms for the shortest path problem. Journal of the ACM, 37(2):213–223,
1990. 34
[ANLP90] Baruch Awerbuch, Amotz Bar Noy, Nathan Linial, and David Peleg. Improved
routing strategies with succinct tables. Journal of Algorithms, 11(3):307–341,
1990. 49
[AO92] Ravindra K. Ahuja and James B. Orlin. The scaling network simplex algorithm.
Operations Research, 40(S1):5–13, 1992. 34
[AOPS02] Ravindra K. Ahuja, James B. Orlin, Stefano Pallottino, and Maria Grazia Scutellà.
Minimum time and minimum cost-path problems in street networks with periodic
traffic lights. Transportation Science, 36(3):326–336, 2002. 54
[AP89] Stefan Arnborg and Andrzej Proskurowski. Linear time algorithms for NP-hard
problems restricted to partial k-trees. Discrete Applied Mathematics, 23(1):11–24,
1989. 20, 22
[AP92] Baruch Awerbuch and David Peleg. Routing with polynomial communication-space
trade-off. SIAM Journal on Discrete Mathematics, 5(2):151–162, 1992. 49
[APF+06] Balázs Adamcsek, Gergely Palla, Illés J. Farkas, Imre Derényi, and Tamás Vicsek.
CFinder: locating cliques and overlapping modules in biological networks.
Bioinformatics, 22(8):1021–1023, 2006. 13
[Ari00] Masanori Arita. Metabolic reconstruction using shortest paths. Simulation
Practice and Theory, 8(1-2):109–125, 2000. 11
[AS87] Noga Alon and Baruch Schieber. Optimal preprocessing for answering on-line
product queries. Technical Report 71/87, Tel Aviv University, 1987. 47
[AS00] Noga Alon and Joel Spencer. The Probabilistic Method (second edition). John
Wiley Interscience Series in Discrete Mathematics and Optimization, 2000. 67, 68
[ASBS00] Luís A. Nunes Amaral, Antonio Scala, Marc Barthelemy, and H. Eugene Stanley.
Classes of small-world networks. Proceedings of the National Academy of
Sciences USA, 97(21):11149–11152, 2000. 4, 23, 123
[Ass83] Patrice Assouad. Plongements Lipschitziens dans R^n. Bulletin de la Société
Mathématique de France, 111:429–448, 1983. 20
[ASS09] Dennis J. Adams-Smith and Douglas R. Shier. Generating random test networks
for shortest path algorithms. Operations Research and Cyber-Infrastructure,
47:295–308, 2009. 22
[AST90] Noga Alon, Paul D. Seymour, and Robin Thomas. A separator theorem for nonplanar
graphs. Journal of the American Mathematical Society, 3(4):801–808, 1990.
Announced at STOC 1990. 31, 46, 121
[AST94] Noga Alon, Paul D. Seymour, and Robin Thomas. Planar separators. SIAM Journal
on Discrete Mathematics, 7(2):184–193, 1994. 30
[Aur91] Franz Aurenhammer. Voronoi diagrams - a survey of a fundamental geometric
data structure. ACM Computing Surveys, 23(3):345–405, 1991. 98
[AV88] Alok Aggarwal and Jeffrey S. Vitter. The input/output complexity of sorting and
related problems. Communications of the ACM, 31(9):1116–1127, 1988. 26, 43,
45
[AY00] Muhammad Abaidullah Anwar and Takaichi Yoshida. OORF: An object-oriented
route finder. In SAC '00: Proceedings of the 2000 ACM Symposium on Applied
Computing, pages 301–306, 2000. 51, 54
[AY01] Muhammad Abaidullah Anwar and Takaichi Yoshida. Integrating OO road network
database, cases and knowledge for route finding. In SAC '01: Proceedings
of the 2001 ACM Symposium on Applied Computing, pages 215–219, 2001. 51, 54
[BA99] Albert-László Barabási and Réka Albert. Emergence of scaling in random networks.
Science, 286(5439):509–512, 1999. 4, 5, 23, 24, 96
[Bac94] Paul Bachmann. Die Analytische Zahlentheorie. Zahlentheorie. pt. 2. 1894. 25
[BAJ00] Albert-László Barabási, Réka Albert, and Hawoong Jeong. Scale-free characteristics
of random networks: the topology of the world-wide web. Physica A:
Statistical Mechanics and its Applications, 281:69–77, 2000. 4, 5
[Bar03] Albert-László Barabási. Linked: How Everything Is Connected to Everything Else
and What It Means. Plume, 2003. 81
[Bat08] Michael Batty. The size, scale, and shape of cities. Science, 319(5864):769–771,
2008. 10
[Bav48] Alex Bavelas. A mathematical model of group structure. Human Organizations,
7:16–30, 1948. 12
[Bav50] Alex Bavelas. Communication patterns in task-oriented groups. The Journal of
the Acoustical Society of America, 22:725, 1950. 12
[BBCR03] Béla Bollobás, Christian Borgs, Jennifer T. Chayes, and Oliver Riordan. Directed
scale-free graphs. In Symposium on Discrete Algorithms (SODA), pages 132–139,
2003. 5
[BBJ+02] Christopher L. Barrett, Keith R. Bisset, Riko Jacob, Goran Konjevod, and
Madhav V. Marathe. Classical and contemporary shortest path problems in road
networks: Implementation and experimental analysis of the TRANSIMS router. In
Algorithms - ESA 2002, 10th Annual European Symposium, Rome, Italy,
September 17-21, 2002, Proceedings, pages 126–138, 2002. 10
[BBK72] András Békéssy, P. Bekessy, and János Komlós. Asymptotic enumeration of regular
matrices. Studia Scientiarum Mathematicarum Hungarica, 7:343–353, 1972.
23, 24, 96
[BC58] Frederick Bock and Scott Cameron. Allocation of network traffic demand by instant
determination of optimum paths. Operations Research, 6:633–634, 1958.
Announced at the 13th National (6th Annual) Meeting of the Operations Research
Society of America, 1958. 8
[BC06] Arthur Brady and Lenore Cowen. Compact routing on power law graphs with
additive stretch. In Proceedings of the Ninth Workshop on Algorithm Engineering
and Experiments, pages 119–128, 2006. 49, 56, 83
[BCE03] Béla Bollobás, Don Coppersmith, and Michael L. Elkin. Sparse distance preservers
and additive spanners. In Symposium on Discrete Algorithms (SODA),
2003. 27
[BD86] George E. P. Box and Norman R. Draper. Empirical model-building and response
surface. John Wiley & Sons, Inc., 1986. 84
[BD08] Reinhard Bauer and Daniel Delling. SHARC: Fast and robust unidirectional routing.
In Proceedings of the 10th Workshop on Algorithm Engineering and Experiments
(ALENEX'08), pages 13–26, 2008. 47, 54, 55, 56, 110, 111, 117, 123
[BDDW09] Reinhard Bauer, Gianlorenzo D'Angelo, Daniel Delling, and Dorothea Wagner.
The shortcut problem - complexity and approximation. In SOFSEM 2009: Theory
and Practice of Computer Science, 35th Conference on Current Trends in Theory
and Practice of Computer Science, Špindlerův Mlýn, Czech Republic, January 24-30,
2009. Proceedings, pages 105–116, 2009. 55
[BDS+08] Reinhard Bauer, Daniel Delling, Peter Sanders, Dennis Schieferdecker, Dominik
Schultes, and Dorothea Wagner. Combining hierarchical and goal-directed speed-up
techniques for Dijkstra's algorithm. In Experimental Algorithms, 7th International
Workshop (WEA'08), Provincetown, MA, USA, May 30-June 1, 2008,
Proceedings, pages 303–318, 2008. 47, 51, 54, 55, 56, 110, 111, 112, 117, 123
[BDW09] Reinhard Bauer, Daniel Delling, and Dorothea Wagner. Experimental study of
speed-up techniques for timetable information systems. Networks, 2009. 56
[Bea64] Murray A. Beauchamp. An improved index of centrality. Behavioral Science,
10:161–163, 1964. 12
[Bel58] Richard Ernest Bellman. On a routing problem. Quarterly of Applied Mathematics,
16:87–90, 1958. 8, 34
[Bel67] Richard Ernest Bellman. Dynamic Programming. Princeton University Press,
1967. 34, 35, 37
[Ben66] C. T. Benson. Minimal regular graphs of girth eight and twelve. Canadian Journal
of Mathematics, 18:1091–1094, 1966. 28
[Ber93] Dimitri P. Bertsekas. A simple and fast label correcting algorithm for shortest
paths. Networks, 23(7):703–709, 1993. 34
[Ber09] Aaron Bernstein. Fully dynamic (2 + ε) approximate all-pairs shortest paths with
O(log log n) query and close to linear update time. In 50th Annual IEEE Symposium
on Foundations of Computer Science, FOCS 2009, pages 693–702, 2009.
39
[Bes74] Julian Besag. Spatial interaction and the statistical analysis of lattice systems.
Journal of the Royal Statistical Society. Series B (Methodological), 36(2):192–236,
1974. 11
[BFM+07] Holger Bast, Stefan Funke, Domagoj Matijevic, Peter Sanders, and Dominik
Schultes. In transit to constant time shortest-path queries in road networks.
In Proceedings of the Workshop on Algorithm Engineering and Experiments
(ALENEX'07), New Orleans, Louisiana, USA, January 6, 2007, 2007. 47, 51,
54, 55, 110, 114
[BFMZ04] Gerth Stølting Brodal, Rolf Fagerberg, Ulrich Meyer, and Norbert Zeh.
Cache-oblivious data structures and algorithms for undirected breadth-first search and
shortest paths. In Algorithm Theory - SWAT 2004, 9th Scandinavian Workshop
on Algorithm Theory, Humlebaek, Denmark, July 8-10, 2004, Proceedings, pages
480–492, 2004. 35
[BFSS07] Holger Bast, Stefan Funke, Peter Sanders, and Dominik Schultes. Fast routing in
road networks with transit nodes. Science, 316(5824):566, 2007. 55
[BG07] Zachary K. Baker and Maya Gokhale. On the acceleration of shortest path
calculations in transportation networks. In International Symposium on
Field-Programmable Custom Computing Machines, pages 23–32, 2007. 10, 54
[BGS09] Surender Baswana, Vishrut Goyal, and Sandeep Sen. All-pairs nearly 2-approximate
shortest paths in O(n^2) time. Theoretical Computer Science,
410(1):84–93, 2009. Announced at STACS 2005. 37
[BGSU08] Surender Baswana, Akshay Gaur, Sandeep Sen, and Jayant Upadhyay. Distance
oracles for unweighted graphs: Breaking the quadratic barrier with constant additive
error. In Automata, Languages and Programming, 35th International Colloquium,
ICALP 2008, Reykjavik, Iceland, July 7-11, 2008, Proceedings, Part I:
Track A: Algorithms, Automata, Complexity, and Games, pages 609–621, 2008. 43,
44, 123
[BH69] M. H. Bourgoin and E. M. J. Heurgon. Study and comparison of algorithms of
the shortest path through planned experiments. In Project Planning by Network
Analysis, pages 106–118, 1969. 35, 50
[BHS07] Surender Baswana, Ramesh Hariharan, and Sandeep Sen. Improved decremental
algorithms for maintaining transitive closure and all-pairs shortest paths. Journal
of Algorithms, 62(2):74–92, 2007. 39
[Big98] Norman Biggs. Constructions for cubic graphs with large girth. The Electronic
Journal of Combinatorics, 5, 1998. 28
[BK06] Surender Baswana and Telikepalli Kavitha. Faster algorithms for approximate
distance oracles and all-pairs small stretch paths. In 47th Annual IEEE Symposium
on Foundations of Computer Science (FOCS 2006), 21-24 October 2006, Berkeley,
California, USA, pages 591–602, 2006. 37, 43, 44
[BK07] Piotr Berman and Shiva Prasad Kasiviswanathan. Faster approximation of distances
in graphs. In Algorithms and Data Structures, 10th International Workshop,
WADS 2007, Halifax, Canada, August 15-17, 2007, Proceedings, pages 541–552,
2007. 37
[BKM+00] Andrei Z. Broder, Ravi Kumar, Farzin Maghoul, Prabhakar Raghavan, Sridhar
Rajagopalan, Raymie Stata, Andrew Tomkins, and Janet L. Wiener. Graph structure
in the web. Computer Networks, 33(1-6):309–320, 2000. 5
[BKMM07] David A. Bader, Shiva Kintali, Kamesh Madduri, and Milena Mihail. Approximating
betweenness centrality. In Algorithms and Models for the Web-Graph, 5th
International Workshop, WAW 2007, San Diego, CA, USA, December 11-12, 2007,
Proceedings, pages 124–137, 2007. 12
[BKMP05] Surender Baswana, Telikepalli Kavitha, Kurt Mehlhorn, and Seth Pettie. New
constructions of (α, β)-spanners and purely additive spanners. In SODA '05:
Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms,
pages 672–681, 2005. 28
[BKS01] Ingmar Bitter, Arie E. Kaufman, and Mie Sato. Penalized-distance volumetric
skeleton algorithm. IEEE Transactions on Visualization and Computer Graphics,
7(3):195–206, 2001. 11
[BL74] Mokhtar S. Bazaraa and R. W. Langley. A dual shortest path algorithm. SIAM
Journal on Applied Mathematics, 26(3):496–501, 1974. 51
[BLM+06] Stefano Boccaletti, Vito Latora, Yamir Moreno, Mario Chavez, and Dong-Uk
Hwang. Complex networks: Structure and dynamics. Physics Reports, 424:175–308,
2006. 4, 13
[BLMN05] Yair Bartal, Nathan Linial, Manor Mendel, and Assaf Naor. On metric Ramsey
type phenomena. Annals of Mathematics (2), 162(2):643–709, 2005. Announced
at STOC 2003. 43
[Blo83] Peter A. Bloniarz. A shortest-path algorithm with expected time O(n^2 log n log
lg lg n) time.
In Proceedings of the 39th Annual ACM Symposium on Theory of Computing, San
Diego, California, USA, June 11-13, 2007, pages 31–39, 2007. 98
[CR73] Stephen A. Cook and Robert A. Reckhow. Time bounded random access machines.
Journal of Computer and System Sciences, 7(4):354–375, 1973. Announced at
STOC 1972. 26
[CRS98] Yu-Li Chou, H. Edwin Romeijn, and Robert L. Smith. Approximating shortest
paths in large-scale networks with an application to intelligent transportation systems.
INFORMS Journal on Computing, 10(2):163–179, 1998. 51, 54
[CS83] Vasek Chvátal and Endre Szemerédi. Short cycles in directed graphs. Journal of
Combinatorial Theory, Series B, 35(3):323–327, 1983. 28
[CSN07] Aaron Clauset, Cosma R. Shalizi, and Mark E. J. Newman. Power-law distributions
in empirical data. SIAM Reviews, June 2007. 4, 23
[CSTW09a] Wei Chen, Christian Sommer, Shang-Hua Teng, and Yajun Wang. Compact routing
in power-law graphs. In 23rd International Symposium on Distributed Computing
(DISC), pages 379–391, 2009. 16
[CSTW09b] Wei Chen, Christian Sommer, Shang-Hua Teng, and Yajun Wang. A compact
routing scheme and approximate distance oracle for power-law graphs. Technical
Report MSR-TR-2009-84, Microsoft Research, July 2009. 16, 82
[CTB01] Adrijana Car, George Taylor, and Chris Brunsdon. An analysis of the performance
of a hierarchical wayfinding computational model using synthetic graphs. Computers,
Environment and Urban Systems, 25(1):69–88, 2001. 51, 54
[CV03a] Elizabeth Costenbader and Thomas W. Valente. The stability of centrality measures
when networks are sampled. Social Networks, 25(4):283–307, 2003. 12
[CV03b] Bruno Courcelle and Rémi Vanicat. Query efficient implementation of graphs of
bounded clique-width. Discrete Applied Mathematics, 131(1):129–150, 2003. 48
[CW90] Don Coppersmith and Shmuel Winograd. Matrix multiplication via arithmetic
progressions. Journal of Symbolic Computation, 9(3):251–280, 1990. Announced
at STOC 1987. 36
[CW04] Lenore J. Cowen and Christopher G. Wagner. Compact roundtrip routing in directed
networks. Journal of Algorithms, 50(1):79–95, 2004. Announced at PODC
2000. 49
[CWM94] Gordon Cameron, Brian J. N. Wylie, and David McArthur. PARAMICS: moving
vehicles on the connection machine. Technical report, Edinburgh Parallel Computing
Center, 1994. ISSN 1063-9535, IEEE. 10
[CX00] Danny Z. Chen and Jinhui Xu. Shortest path queries in planar graphs. In Proceedings
of the ACM Symposium on Theory of Computing (STOC), pages 469–478,
2000. 45
[CY09] Jiefeng Cheng and Jeffrey Xu Yu. On-line exact shortest distance query processing.
In EDBT 2009, 12th International Conference on Extending Database Technology,
Saint Petersburg, Russia, March 24-26, 2009, Proceedings, pages 481–492, 2009.
54, 57, 83, 113
[CYL+06] Jiefeng Cheng, Jeffrey Xu Yu, Xuemin Lin, Haixun Wang, and Philip S. Yu. Fast
computation of reachability labeling for large graphs. In Advances in Database
Technology - EDBT 2006, 10th International Conference on Extending Database
Technology, Munich, Germany, March 26-31, 2006, Proceedings, pages 961–979,
2006. 54
[CYL+08] Jiefeng Cheng, Jeffrey Xu Yu, Xuemin Lin, Haixun Wang, and Philip S. Yu. Fast
computing reachability labelings for large graphs with high compression rate. In
EDBT 2008, 11th International Conference on Extending Database Technology,
Nantes, France, March 25-29, 2008, Proceedings, pages 193–204, 2008. 54
[CYT06] Jiefeng Cheng, Jeffrey Xu Yu, and Nan Tang. Fast reachability query processing.
In Database Systems for Advanced Applications, 11th International Conference,
DASFAA 2006, Singapore, April 12-15, 2006, Proceedings, pages 674–688, 2006.
54
[CZ00] Shiva Chaudhuri and Christos D. Zaroliagis. Shortest paths in digraphs of small
treewidth. part I: Sequential algorithms. Algorithmica, 27(3):212–226, 2000.
Announced at ICALP 1995. 47, 121
[CZ01] Edith Cohen and Uri Zwick. All-pairs small-stretch paths. Journal of Algorithms,
38(2):335–353, 2001. Announced at SODA 1997. 37, 44
[CZ07] Edward P. F. Chan and Jie Zhang. A fast unified optimal route query evaluation
algorithm. In CIKM '07: Proceedings of the sixteenth ACM conference on Conference
on information and knowledge management, pages 371–380, 2007. 51
[DABC08] Engin Demir, Cevdet Aykanat, and B. Barla Cambazoglu. Clustering spatial networks
for aggregate query processing: A hypergraph approach. Inf. Syst., 33(1):1–17,
2008. 54, 55
[Dan57] George Bernard Dantzig. Discrete-variable extremum problems. Operations
Research, 5(2):266–277, 1957. 8, 9, 125
[Dan60] George Bernard Dantzig. On the shortest route through a network. Management
Science, 6(2):187–190, 1960. 8, 32, 35
[DBCP97] Mikael Degermark, Andrej Brodnik, Svante Carlsson, and Stephen Pink. Small
forwarding tables for fast routing lookups. In SIGCOMM, pages 3–14, 1997. 15
[dC83] Dennis de Champeaux. Bidirectional heuristic search again. Journal of the ACM,
30(1):22–32, 1983. 10, 35
[DCKM04] Frank Dabek, Russ Cox, Frans Kaashoek, and Robert Morris. Vivaldi: a decentralized
network coordinate system. In SIGCOMM '04: Proceedings of the 2004
conference on Applications, technologies, architectures, and protocols for computer
communications, pages 15–26, 2004. 15
[DEH07a] Philippe Duchon, Nicole Eggemann, and Nicolas Hanusse. Non-searchability of
random power-law graphs. In Principles of Distributed Systems, 11th International
Conference, OPODIS 2007, Guadeloupe, French West Indies, December 17-20,
2007. Proceedings, pages 274–285, 2007. 83
[DEH07b] Philippe Duchon, Nicole Eggemann, and Nicolas Hanusse. Non-searchability of
random scale-free graphs. In Proceedings of the Twenty-Sixth Annual ACM Symposium
on Principles of Distributed Computing, PODC 2007, Portland, Oregon,
USA, August 12-15, 2007, pages 380–381, 2007. 83
[Del09] Daniel Delling. Engineering and Augmenting Route Planning Algorithms. PhD
thesis, Universität Karlsruhe, 2009. 52, 54, 55, 56
[DF79] Eric V. Denardo and Bennett L. Fox. Shortest route methods: Reaching, pruning
and buckets. Operations Research, 27:161–186, 1979. 33
[DFJ54] George Bernard Dantzig, Delbert Ray Fulkerson, and Selmer M. Johnson. Solution
of a large-scale traveling-salesman problem. Journal of the Operations Research
Society of America, 2(4):393–410, 1954. 8
[DFS90] David P. Dobkin, Steven J. Friedman, and Kenneth J. Supowit. Delaunay graphs
are almost as good as complete graphs. Discrete & Computational Geometry,
5:399–407, 1990. 118
[DGJ08] Camil Demetrescu, Andrew V. Goldberg, and David S. Johnson. Implementation
challenge for shortest paths. In Encyclopedia of Algorithms. 2008. 15, 54
[DGKK79] Robert B. Dial, Fred Glover, David Karney, and Darwin Klingman. A computational
analysis of alternative algorithms and labeling techniques for finding shortest
path trees. Networks, 9:215–248, 1979. 33, 50
[DGST88] James R. Driscoll, Harold N. Gabow, Ruth Shrairman, and Robert E. Tarjan. Relaxed
heaps: an alternative to Fibonacci heaps with applications to parallel computation.
Communications of the ACM, 31(11):1343–1354, 1988. 33, 34
[DHZ00] Dorit Dor, Shay Halperin, and Uri Zwick. All-pairs almost shortest paths. SIAM
Journal on Computing, 29(5):1740–1759, 2000. Announced at FOCS 1996. 27,
29, 37, 38
[DI04] Camil Demetrescu and Giuseppe F. Italiano. Engineering shortest path algorithms.
In Experimental and Efficient Algorithms, Third International Workshop, WEA
2004, Angra dos Reis, Brazil, May 25-28, 2004, Proceedings, pages 191–198,
2004. 35
[DI08] Camil Demetrescu and Giuseppe F. Italiano. Decremental all-pairs shortest paths.
In Encyclopedia of Algorithms. 2008. 32, 35
[Dia69] Robert B. Dial. Algorithm 360: shortest-path forest with topological ordering [h].
Communications of the ACM, 12(11):632–633, 1969. 33
[Dij59] Edsger Wybe Dijkstra. A note on two problems in connexion with graphs.
Numerische Mathematik, 1:269–271, 1959. 8, 32, 35, 51
[Dir50] Gustav Lejeune Dirichlet. Über die Reduktion der positiven quadratischen Formen
mit drei unbestimmten ganzen Zahlen. Journal für die Reine und Angewandte
Mathematik, 40:209–227, 1850. 98
[Dji96] Hristo Djidjev. Efficient algorithms for shortest path problems on planar digraphs.
In Graph-Theoretic Concepts in Computer Science, 22nd International Workshop,
WG '96, Cadenabbia (Como), Italy, June 12-14, 1996, Proceedings, pages 151–165,
1996. 45, 47, 121
[Dji06] Hristo Djidjev. A scalable multilevel algorithm for graph clustering and community
structure detection. In Algorithms and Models for the Web-Graph, Fourth
International Workshop, WAW 2006, Banff, Canada, November 30 - December 1,
2006. Revised Papers, pages 117–128, 2006. 13
[Djo73] Dragomir Z. Djokovic. Distance-preserving subgraphs of hypercubes. Journal of
Combinatorial Theory, Series B, 14(3):263–267, 1973. 30
[DJW07] Jörg Derungs, Riko Jacob, and Peter Widmayer. Approximate shortest paths
guided by a small index. In Algorithms and Data Structures, 10th International
Workshop, WADS 2007, Halifax, Canada, August 15-17, 2007, Proceedings, pages
553–564, 2007. Journal version to appear in Algorithmica. 43
[DLL+06] Debora Donato, Luigi Laura, Stefano Leonardi, Ulrich Meyer, Stefano Millozzi,
and Jop F. Sibeyn. Algorithms and experiments for the webgraph. Journal of
Graph Algorithms and Applications, 10(2):219–236, 2006. Announced at ESA
2003. 5
[DLO03] Erik D. Demaine and Alejandro López-Ortiz. A linear lower bound on index size
for text retrieval. Journal of Algorithms, 48(1):2–15, 2003. Announced at SODA
2001. 43
[DMS00] Sergey N. Dorogovtsev, José Fernando Ferreira Mendes, and Alexander N.
Samukhin. Structure of growing networks with preferential linking. Physical
Review Letters, 85(21):4633–4636, Nov 2000. 24
[Dor67] Jim E. Doran. An approach to automatic problem-solving. Machine Intelligence,
1:105–124, 1967. 35, 52
[DP84] Narsingh Deo and Chi-Yin Pang. Shortest path algorithms: Taxonomy and annotation.
Networks, 14:257–323, 1984. 32, 51
[DP08] Ran Duan and Seth Pettie. Bounded-leg distance and reachability oracles. In SODA
'08: Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete
algorithms, pages 436–445, 2008. 39
[DPW09] Daniel Delling, Thomas Pajor, and Dorothea Wagner. Accelerating multi-modal
route planning by access-nodes. In Algorithms - ESA 2009, 17th Annual European
Symposium, Copenhagen, Denmark, September 7-9, 2009. Proceedings,
pages 587–598, 2009. 2
[DPWZ09] Daniel Delling, Thomas Pajor, Dorothea Wagner, and Christos D. Zaroliagis.
Efficient route planning in flight networks. In ATMOS 2009 - 9th Workshop on
Algorithmic Approaches for Transportation Modeling, Optimization, and Systems, IT
University of Copenhagen, Denmark, September 10, 2009, 2009. 2
[DPZ91] Hristo Djidjev, Grammati E. Pantziou, and Christos D. Zaroliagis. Computing
shortest paths and distances in planar graphs. In Automata, Languages and
Programming, 18th International Colloquium, ICALP'91, Madrid, Spain, July 8-12,
1991, Proceedings, pages 327–338, 1991. 45
[DPZ95] Hristo Djidjev, Grammati E. Pantziou, and Christos D. Zaroliagis. On-line and
dynamic algorithms for shortest path problems. In STACS, pages 193–204, 1995.
45, 121
[DPZ00] Hristo Djidjev, Grammati E. Pantziou, and Christos D. Zaroliagis. Improved
algorithms for dynamic shortest paths. Algorithmica, 28(4):367–389, 2000. 45, 121
[DR94] Shaul Dar and Raghu Ramakrishnan. A performance study of transitive closure
algorithms. ACM SIGMOD Record, 23(2):454–465, 1994. 57
[Dre69] Stuart E. Dreyfus. An appraisal of some shortest-path algorithms. Operations
Research, 17(3):395–412, 1969. 32, 35, 50
[dSP65] Derek J. de Solla Price. Networks of scientific papers. Science, 149(3683):510–515,
1965. 4, 5
[dSP76] Derek J. de Solla Price. A general theory of bibliometric and other cumulative
advantage processes. Journal of the American Society for Information Science,
27(5):292–306, 1976. 5
[dSPK78] Ithiel de Sola Pool and Manfred Kochen. Contacts and influence. Social Networks,
1:5–51, 1978. 6
[DSSW06] Daniel Delling, Peter Sanders, Dominik Schultes, and Dorothea Wagner. Highway
hierarchies star. In 9th DIMACS Implementation Challenge, 2006. 110, 114
[DSSW09] Daniel Delling, Peter Sanders, Dominik Schultes, and Dorothea Wagner. Engineering
route planning algorithms. In Algorithmics of Large and Complex Networks -
Design, Analysis, and Simulation [DFG priority program 1126], pages 117–139,
2009. 54, 97
[DW07] Daniel Delling and Dorothea Wagner. Landmark-based routing in dynamic graphs.
In Experimental Algorithms, 6th International Workshop (WEA'07), Rome, Italy,
June 6-8, 2007, Proceedings, pages 52–65, 2007. 111, 117
[Eco09] The Economist. Rational consumer: The road ahead. The Economist (Technology
Quarterly), 392(8647):17, 2009. 122
[Edm65] Jack Edmonds. Paths, trees and flowers. Canadian Journal of Mathematics,
17:449–467, 1965. 25
[EG60] Paul Erdős and Tibor Gallai. Grafok előírt fokú pontokkal. Matematikai Lapok,
11:264–274, 1960. English title: Graphs with points of prescribed degrees. 24
[EG08] David Eppstein and Michael T. Goodrich. Studying (non-planar) road networks
through an algorithmic lens. In 16th ACM SIGSPATIAL International Symposium
on Advances in Geographic Information Systems, ACM-GIS 2008, November 5-7,
2008, Irvine, California, USA, Proceedings, page 16, 2008. 22, 31, 47, 103, 104
[Ege31] Jenő Egerváry. Matrixok kombinatorikus tulajdonságairól. Középiskolai
Matematikai és Fizikai Lapok, 38:16–27, 1931. Translation by Harold W. Kuhn in
Logistics Papers, issue 11, 1955. 8
[EGK+04] Stephen Eubank, Hasan Guclu, V. S. Anil Kumar, Madhav V. Marathe, Aravind
Srinivasan, Zoltan Toroczkai, and Nan Wang. Modelling disease outbreaks in
realistic urban social networks. Nature, 429(6988):180–184, 2004. 7, 10
[EGS09] David Eppstein, Michael T. Goodrich, and Darren Strash. Linear-time algorithms
for geometric graphs with sublinearly many crossings. In Proceedings of the
Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2009, New
York, NY, USA, January 4-6, 2009, pages 150–159, 2009. 22
[EHP04] Jeff Erickson and Sariel Har-Peled. Optimally cutting a surface into a disk.
Discrete & Computational Geometry, 31(1):37–59, 2004. Announced at SOCG 2002.
19
[EJ73] Jack Edmonds and Ellis L. Johnson. Matching, euler tours and the Chinese postman.
Mathematical Programming, 5(1):88–124, 1973. 8
[EJ08] Geoffrey Exoo and Robert Jajcay. Dynamic cage survey. The Electronic Journal
of Combinatorics, 15, 2008. 28, 65
[EL82] R. J. Elliott and Michael Lesk. Route finding in street maps by computers and
people. In AAAI, pages 258–261, 1982. 54
[Elk05] Michael L. Elkin. Computing almost shortest paths. ACM Transactions on
Algorithms, 1(2):283–323, 2005. Announced at PODC 2001. 38
[Elk08a] Michael L. Elkin. Sparse graph spanners. In Encyclopedia of Algorithms. 2008.
27
[Elk08b] Michael L. Elkin. Synchronizers, spanners. In Encyclopedia of Algorithms. 2008.
27
[Elm77] Salah E. Elmaghraby. Activity Networks: Project Planning and Control by Network
Models. Wiley, New York, 1977. 8
[EM01] Stefan Edelkamp and Ulrich Meyer. Theory and practice of time-space trade-offs
in memory limited search. In KI 2001: Advances in Artificial Intelligence,
Joint German/Austrian Conference on AI, Vienna, Austria, September 19-21, 2001,
Proceedings, pages 169–184, 2001. 53
[EMK97] Paul Embrechts, Thomas Mikosch, and Claudia Klüppelberg. Modelling extremal
events: for insurance and finance. Springer-Verlag, London, UK, 1997. 108
[EP04] Michael L. Elkin and David Peleg. (1+epsilon, beta)-spanner constructions for
general graphs. SIAM Journal on Computing, 33(3):608–631, 2004. 27
[Epp99] David Eppstein. Subgraph isomorphism in planar graphs and related problems.
Journal of Graph Algorithms and Applications, 3(3), 1999. Announced at SODA
1995. 45
[ER60] Paul Erd os and Alfr ed A. R enyi. On the evolution of random graphs. Magyar
Tudom anyos Akad emia Matematikai Kutat o Int ezet enek K ozlem enyei, 5:1761,
1960. 22, 23, 49
[ER97] Joost Engelfriet and Grzegorz Rozenberg. Node replacement graph grammars. In
Handbook of Graph Grammars and Computing by Graph Transformations, Vol-
ume 1: Foundations, pages 194, 1997. 48
[Erd35] Paul Erd os.
Uber die Primzahlen gewisser arithmetischer Reihen. Mathematische
Zeitschrift, 39(1):473491, 1935. 66
[Erd63] Paul Erd os. On a Combinatorial Problem, I. Nordisk Matematisk Tidsskrift, 11:5
10, 1963. 67, 68
[Erd64] Paul Erdős. Extremal problems in graph theory. Theory Graphs Appl., Proc. Symp.
Smolenice, pages 29–36, 1964. 28, 41, 59
[ERS66] Paul Erdős, Alfréd A. Rényi, and Vera Turan Sos. On a problem of graph theory.
Studia Scientiarum Mathematicarum Hungarica, 1:215–235, 1966. 28
[Erw00] Martin Erwig. The graph Voronoi diagram with applications. Networks,
36(3):156–163, 2000. 97, 98, 99, 101, 102, 103, 118
[ES63] Paul Erdős and Horst Sachs. Reguläre Graphen gegebener Taillenweite mit
minimaler Knotenzahl. Wissenschaftliche Zeitschrift der Martin-Luther-Universität
Halle-Wittenberg, Mathematisch-Naturwissenschaftliche Reihe, pages 251–258,
1963. 28, 41, 59
[EW03] Nils Eissfeldt and Peter Wagner. Effect of anticipatory driving in traffic flow
model. The European Physical Journal B, 33:121–129, 2003. 10
[EW04] David Eppstein and Joseph Wang. Fast approximation of centrality. Journal of
Graph Algorithms and Applications, 8:39–45, 2004. Announced at SODA 2001.
12
[EWG08] Mihaela Enachescu, Mei Wang, and Ashish Goel. Reducing maximum stretch in
compact routing. In INFOCOM, pages 336–340, 2008. 49, 96
[FF58] Lester Randolph Ford and Delbert Ray Fulkerson. Constructing maximal dynamic
flows from static flows. Op. Res., 6(3):419–433, 1958. 8, 34
[FFF99] Michalis Faloutsos, Petros Faloutsos, and Christos Faloutsos. On power-law
relationships of the Internet topology. In SIGCOMM: Proceedings of the conference
on applications, technologies, architectures, and protocols for computer
communication, pages 251–262, 1999. 5, 82, 125
[FFV05] Abraham D. Flaxman, Alan M. Frieze, and Juan Vera. Adversarial deletion in a
scale free random graph process. In Proceedings of the 16th annual ACM-SIAM
symposium on Discrete algorithms, pages 287–292, 2005. 83
[FFV07] Abraham D. Flaxman, Alan M. Frieze, and Juan Vera. A geometric preferential
attachment model of networks ii. In Algorithms and Models for the Web-Graph,
5th International Workshop, WAW 2007, San Diego, CA, USA, December 11-12,
2007, Proceedings, pages 41–55, 2007. 4, 5
[FG85] Alan M. Frieze and Geoffrey R. Grimmett. The shortest-path problem for graphs
with random arc-lengths. Discrete Applied Mathematics, 10(1):57–77, 1985. 35
[FGG+05] Qing Fang, Jie Gao, Leonidas J. Guibas, V. de Silva, and Li Zhang. GLIDER:
gradient landmark-based distributed routing for sensor networks. In INFOCOM
2005. 24th Annual Joint Conference of the IEEE Computer and Communications
Societies, 13-17 March 2005, Miami, FL, USA, pages 339–350, 2005. 97
[FH04] Pedro F. Felzenszwalb and Daniel P. Huttenlocher. Efficient graph-based image
segmentation. International Journal of Computer Vision, 59(2):167–181, 2004.
11
[FJ88] Greg N. Frederickson and Ravi Janardan. Designing networks with compact
routing tables. Algorithmica, 3:171–190, 1988. 45
[FJ89] Greg N. Frederickson and Ravi Janardan. Efficient message routing in planar
networks. SIAM Journal on Computing, 18(4):843–857, 1989. 47
[FJ90] Greg N. Frederickson and Ravi Janardan. Space-efficient message routing in
c-decomposable networks. SIAM Journal on Computing, 19(1):164–181, 1990. 47
[FK07] Martin Fürer and Shiva Prasad Kasiviswanathan. Spanners for geometric
intersection graphs. In Algorithms and Data Structures, 10th International Workshop,
WADS 2007, Halifax, Canada, August 15-17, 2007, Proceedings, pages 312–324,
2007. 48, 121
[FKS84] Michael L. Fredman, János Komlós, and Endre Szemerédi. Storing a sparse table
with O(1) worst case access time. Journal of the ACM, 31(3):538–544, 1984.
Announced at FOCS 1982. 42, 89
[FKS89] Joel Friedman, Jeff Kahn, and Endre Szemerédi. On the second eigenvalue in
random regular graphs. In Proceedings of the Twenty-First Annual ACM
Symposium on Theory of Computing, 15-17 May 1989, Seattle, Washington, USA, pages
587–598, 1989. 65
[Fla63] C. Flament. Applications of Graph Theory to Group Structure. Prentice-Hall,
Englewood Cliffs, 1963. 12
[FLM67] B. A. Farbey, A. H. Land, and J. D. Murchland. The cascade algorithm for finding
all shortest distances in a directed graph. Management Science, 14(1):19–28, 1967.
35, 51
[Flo62] Robert W. Floyd. Algorithm 97: Shortest path. Communications of the ACM,
5(6):345, 1962. 8, 35, 37, 51
[FLV08] Pierre Fraigniaud, Emmanuelle Lebhar, and Laurent Viennot. The inframetric
model for the internet. In INFOCOM 2008. 27th IEEE International Conference
on Computer Communications, Joint Conference of the IEEE Computer and
Communications Societies, 13-18 April 2008, Phoenix, AZ, USA, pages 1085–1093,
2008. 49, 83
[FM95] Tomás Feder and Rajeev Motwani. Clique partitions, graph compression and
speeding-up algorithms. Journal of Computer and System Sciences,
51(2):261–272, 1995. Announced at STOC 1991. 36, 37
[For56] Lester R. Ford. Network flow theory. Report P-923, The Rand Corporation, 1956.
8, 34, 35, 37
[FPP08] Alessandro Ferrante, Gopal Pandurangan, and Kihong Park. On the hardness of
optimization in power-law graphs. Theoretical Computer Science,
393(1-3):220–230, 2008. 83
[FR06] Jittat Fakcharoenphol and Satish Rao. Planar graphs, negative weight edges,
shortest paths, and near linear time. Journal of Computer and System Sciences,
72(5):868–889, 2006. Announced at FOCS 2001. 32, 44, 47, 121
[FR08] Jittat Fakcharoenphol and Satish Rao. Shortest paths in planar graphs with negative
weight edges. In Encyclopedia of Algorithms. 2008. 44
[Fra08] Andrew U. Frank. Shortest path in a combined public transportation network. KI,
22(3):14–18, 2008. 2
[Fre76] Michael L. Fredman. New bounds on the complexity of the shortest path problem.
SIAM Journal on Computing, 5(1):83–89, 1976. 36, 37
[Fre77] Linton C. Freeman. A set of measures of centrality based on betweenness.
Sociometry, 40(1):35–41, 1977. 12
[Fre87] Greg N. Frederickson. Fast algorithms for shortest paths in planar graphs, with
applications. SIAM Journal on Computing, 16(6):1004–1022, 1987. 35
[Fre91] Greg N. Frederickson. Planar graph decomposition and all pairs shortest paths.
Journal of the ACM, 38(1):162–204, 1991. 21, 45
[Fre95] Greg N. Frederickson. Using cellular graph embeddings in solving all pairs
shortest paths problems. Journal of Algorithms, 19(1):45–85, 1995. Announced at
FOCS 1989. 21, 45
[Fri64] Max Frisch. Mein Name sei Gantenbein. Suhrkamp, 1964. 81
[Fri76] Alan M. Frieze. Shortest path algorithms for knapsack type problems.
Mathematical Programming, 11(1):150–157, 1976. 8
[Fri91] Joel Friedman. On the second eigenvalue and random walks in random d-regular
graphs. Combinatorica, 11(4):331–362, 1991. 65
[FS97] Andrew Fetterer and Shashi Shekhar. A performance analysis of hierarchical
shortest path algorithms. In ICTAI, pages 84–93, 1997. 51, 54
[FT87] Michael L. Fredman and Robert Endre Tarjan. Fibonacci heaps and their uses in
improved network optimization algorithms. Journal of the ACM, 34(3):596–615,
1987. Announced at FOCS 1984. 33, 34, 103
[FTW87] Martin A. Fischler, Jay M. Tenenbaum, and Helen C. Wolf. Detection of roads and
linear structures in low-resolution aerial imagery using a multisource knowledge
integration technique. In Readings in computer vision: issues, problems,
principles, and paradigms, pages 741–752, 1987. 11
[FUS+98] Alexandre X. Falcao, Jayaram K. Udupa, Supun Samarasekera, Shoba Sharma,
Bruce Elliot Hirsch, and Roberto de A. Lotufo. User-steered image segmentation
paradigms: Live wire and live lane. Graphical Models and Image Processing,
60(4):233–260, 1998. 8, 10
[FVC07] Alan M. Frieze, Juan Vera, and Soumen Chakrabarti. The influence of search
engines on preferential attachment. Internet Mathematics, 3(3), 2007. 5
[FW93] Michael L. Fredman and Dan E. Willard. Surpassing the information theoretic
bound with fusion trees. Journal of Computer and System Sciences,
47(3):424–436, 1993. 34
[FW94] Michael L. Fredman and Dan E. Willard. Trans-dichotomous algorithms for mini-
mum spanning trees and shortest paths. Journal of Computer and System Sciences,
48(3):533–551, 1994. Announced at FOCS 1990. 34
[GA06] Christopher M. Gold and Paul Angel. Voronoi hierarchies. In Geographic
Information Science, 4th International Conference, GIScience 2006, Münster, Germany,
September 20-23, 2006, Proceedings, pages 99–111, 2006. 118
[Gal58] Tibor Gallai. Maximum-minimum Sätze über Graphen. Acta Mathematica
Academiae Scientiarum Hungaricae, 9:395–434, 1958. 8
[Gav01] Cyril Gavoille. Routing in distributed networks: overview and open problems.
SIGACT News, 32(1):36–52, 2001. 49
[GBB+03] Loic Giot, Joel S. Bader, C. Brouwer, Amitabha Chaudhuri, Bing Kuang, Y. Li,
YL. Hao, CE. Ooi, Brian Godwin, E. Vitols, G. Vijayadamodar, Philippe Pochart,
H. Machineni, M. Welsh, Y. Kong, B. Zerhusen, Robert J. Malcolm, Z. Varrone,
A. Collis, M. Minto, S. Burgess, L. McDaniel, E. Stimpson, F. Spriggs,
J. Williams, Kathryin Neurath, N. Ioime, M. Agee, E. Voss, K. Furtak, R. Renzulli,
N. Aanensen, S. Carrolla, E. Bickelhaupt, Y. Lazovatsky, A. DaSilva, J. Zhong,
Clement A. Stanyon, Russell L. Finley Jr, Kevin P. White, Michael S. Braverman,
Thomas P. Jarvie, S. Gold, M. Leach, J. Knight, Richard A. Shimkets, Michael P.
McKenna, John Chant, and Jonathan M. Rothberg. A protein interaction map of
Drosophila melanogaster. Science, 302(5651):1727–1736, 2003. 6
[GBL08] Amit Goyal, Francesco Bonchi, and Laks V.S. Lakshmanan. Discovering leaders
from community actions. In CIKM '08: Proceeding of the 17th ACM conference
on Information and knowledge management, pages 499–508, 2008. 7
[GCE+09] Emanuele Galli, Leticia Cuellar, Stephan Eidenbenz, Mary Ewers, Sue
Mniszewski, and Christof Teuscher. Activitysim: large-scale agent-based activ-
ity generation for infrastructure simulation. In Proceedings of the 2009 Spring
Simulation Multiconference, SpringSim 2009, San Diego, California, USA, March
22-27, 2009, 2009. 7
[Gel63] Herbert L. Gelernter. Realization of a geometry theorem proving machine. Com-
puters and Thought, 1963. 35, 52
[Gel77] David Gelperin. On the optimality of A*. Artificial Intelligence, 8(1):69–76, 1977.
35, 52
[GGM06] Jesus Gomez-Gardenes and Yamir Moreno. From scale-free to Erdos-Renyi net-
works. Physical Review E (Statistical, Nonlinear, and Soft Matter Physics),
73(5):056124, 2006. 4
[GH05] Andrew V. Goldberg and Chris Harrelson. Computing the shortest path: A* search
meets graph theory. In Proceedings of the Sixteenth Annual ACM-SIAM
Symposium on Discrete Algorithms (SODA'05), Vancouver, British Columbia, Canada,
January 23-25, 2005, pages 156–165, 2005. 53, 91, 111, 114
[GHK90] Donald Goldfarb, Jianxiu Hao, and Sheng-Roan Kai. Efficient shortest path
simplex algorithms. Operations Research, 38(4):624–628, 1990. 34
[GHT84] John R. Gilbert, Joan P. Hutchinson, and Robert Endre Tarjan. A separator theorem
for graphs of bounded genus. Journal of Algorithms, 5(3):391–407, 1984. 31
[Gil59] Edgar N. Gilbert. Random graphs. The Annals of Mathematical Statistics,
30:1141–1144, 1959. 22, 23
[GJ90] Michael R. Garey and David S. Johnson. Computers and Intractability; A Guide
to the Theory of NP-Completeness. W. H. Freeman & Co., 1990. 26
[GJ99] Donald Goldfarb and Zhiying Jin. An O(nm)-time network simplex algorithm for
the shortest path problem. Operations Research, 47(3):445–448, 1999. 34
[GKK+01] Cyril Gavoille, Michal Katz, Nir A. Katz, Christophe Paul, and David Peleg.
Approximate distance labeling schemes. In Algorithms - ESA 2001, 9th Annual
European Symposium, Aarhus, Denmark, August 28-31, 2001, Proceedings, pages
476–487, 2001. 48
[GKN74] Fred Glover, D. Klingman, and A. Napier. A note on finding all shortest paths.
Transportation Science, 8:3–12, 1974. 35, 51
[GKP85] Fred Glover, Darwin D. Klingman, and Nancy V. Phillips. A new polynomially
bounded shortest path algorithm. Operations Research, 33(1):65–73, 1985. 34
[GKP05] Naveen Garg, Rohit Khandekar, and Vinayaka Pandit. Improved approximation
for universal facility location. In Proceedings of the Sixteenth Annual ACM-SIAM
Symposium on Discrete Algorithms, (SODA'05), Vancouver, British Columbia,
Canada, January 23-25, 2005, pages 959–960, 2005. 98
[GKPS85] Fred Glover, Darwin D. Klingman, Nancy V. Phillips, and Robert F. Schneider.
New polynomial shortest path algorithms and their computational attributes.
Management Science, 31(9):1106–1128, 1985. 34
[GKR04] Sandeep Gupta, Swastik Kopparty, and Chinya Ravishankar. Roads, codes, and
spatiotemporal queries. In PODS '04: Proceedings of the twenty-third ACM
SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages
115–124, 2004. 45, 55, 121
[GKW06] Andrew Goldberg, Haim Kaplan, and Renato Fonseca F. Werneck. Reach for A*:
Efficient point-to-point shortest path algorithms. In Proceedings of the Eighth
Workshop on Algorithm Engineering and Experiments (ALENEX'06), pages
129–143. SIAM, 2006. 53
[GKW07] Andrew V. Goldberg, Haim Kaplan, and Renato Fonseca F. Werneck. Better land-
marks within reach. In Experimental Algorithms, 6th International Workshop,
WEA 2007, Rome, Italy, June 6-8, 2007, Proceedings, pages 38–51, 2007. 51, 53,
105
[GL07] Cyril Gavoille and Arnaud Labourel. Shorter implicit representation for planar
graphs and bounded treewidth graphs. In Algorithms - ESA 2007, 15th Annual
European Symposium, Eilat, Israel, October 8-10, 2007, Proceedings, pages
582–593, 2007. 31
[GLNS08] Joachim Gudmundsson, Christos Levcopoulos, Giri Narasimhan, and Michiel
H. M. Smid. Approximate distance oracles for geometric spanners. ACM Trans-
actions on Algorithms, 4(1), 2008. 50, 121
[GM77] Bruce Golden and Thomas L. Magnanti. Deterministic network optimization: A
bibliography. Networks, 7:149–183, 1977. 32
[GM93] Zvi Galil and Oded Margalit. Witnesses for boolean matrix multiplication and for
transitive closure. Journal of Complexity, 9(2):201–221, 1993. 36
[GM97] Zvi Galil and Oded Margalit. All pairs shortest distances for graphs with small
integer length edges. Information and Computation, 134(2):103–139, 1997. 36
[GMMO00] Sudipto Guha, Nina Mishra, Rajeev Motwani, and Liadan O'Callaghan. Clustering
data streams. In Proceedings of the IEEE Symposium on Foundations of Computer
Science (FOCS), pages 359–366, 2000. 13
[GN64] Allan L. Gutjahr and George L. Nemhauser. An algorithm for the line balancing
problem. Management Science, 11(2):308–315, 1964. 8
[GN67] A. J. Goldman and G. L. Nemhauser. A transport improvement problem
transformable to a best-path problem. Transportation Science, 1(4):295–307, 1967.
8
[GN02] M. Girvan and Mark E. J. Newman. Community structure in social and biological
networks. Proceedings of the National Academy of Sciences, 99(12):7821–7826,
2002. 13
[Gol76] Bruce Golden. Shortest-path algorithms: A comparison. Operations Research,
24(6):1164–1168, 1976. 35, 50
[Gol95] Andrew V. Goldberg. Scaling algorithms for the shortest paths problem. SIAM
Journal on Computing, 24(3):494–504, 1995. Announced at SODA 1993. 32
[Gol01] Andrew V. Goldberg. Shortest path algorithms: Engineering aspects. In
Algorithms and Computation, 12th International Symposium, ISAAC 2001,
Christchurch, New Zealand, December 19-21, 2001, Proceedings, pages 502–513,
2001. 35
[Gol07] Andrew V. Goldberg. Point-to-point shortest path algorithms with preprocessing.
In SOFSEM 2007: Theory and Practice of Computer Science, 33rd Conference on
Current Trends in Theory and Practice of Computer Science, Harrachov, Czech
Republic, January 20-26, 2007, Proceedings, pages 88–102, 2007. 54
[Gol08] Andrew V. Goldberg. A practical shortest path algorithm with linear expected
time. SIAM Journal on Computing, 37(5):1637–1655, 2008. 35
[GOY76] Satoshi Goto, Tatsuo Ohtsuki, and Takeshi Yoshimura. Sparse matrix tech-
niques for the shortest path problem. IEEE Transactions on Circuits and Systems,
23(12):752–758, Dec 1976. 35
[GP72] Ronald L. Graham and Henry O. Pollak. On embedding graphs in squashed cubes,
volume 303, pages 99–110. 1972. 30
[GP86] Giorgio Gallo and Stefano Pallottino. Shortest path methods: A unifying approach.
Mathematical Programming Studies, 26:38–64, 1986. 34
[GP03a] Cyril Gavoille and Christophe Paul. Optimal distance labeling for interval and
circular-arc graphs. In Algorithms - ESA 2003, 11th Annual European Symposium,
Budapest, Hungary, September 16-19, 2003, Proceedings, pages 254–265, 2003.
48
[GP03b] Cyril Gavoille and David Peleg. Compact and localized distributed data structures.
Distributed Computing, 16(2-3):111–120, 2003. 29, 49
[GPPR04] Cyril Gavoille, David Peleg, Stéphane Pérennes, and Ran Raz. Distance labeling
in graphs. Journal of Algorithms, 53(1):85–112, 2004. Announced at SODA 2000.
29
[Gra06] Leo Grady. Computing exact discrete minimal surfaces: Extending and solving the
shortest path problem in 3d with application to segmentation. In IEEE Computer
Society Conference on Computer Vision and Pattern Recognition, pages 69–78,
2006. 11
[Gra08] Leo Grady. Minimal surfaces extend shortest path segmentation methods to 3d.
IEEE Transactions on Pattern Analysis and Machine Intelligence, PP(99):1, De-
cember 2008. 11
[GSDF03] Lise Getoor, Ted E. Senator, Pedro Domingos, and Christos Faloutsos, editors.
SIGKDD Proceedings, 2003. 112
[GSSD08] Robert Geisberger, Peter Sanders, Dominik Schultes, and Daniel Delling. Con-
traction hierarchies: Faster and simpler hierarchical routing in road networks. In
Experimental Algorithms, 7th International Workshop (WEA'08), Provincetown,
MA, USA, May 30-June 1, 2008, Proceedings, pages 319–333, 2008. 47, 51, 54,
55, 56, 111, 112, 114, 117, 123
[GSVGM98] Roy Goldman, Narayanan Shivakumar, Suresh Venkatasubramanian, and Hector
Garcia-Molina. Proximity search in databases. In VLDB'98, Proceedings of 24th
International Conference on Very Large Data Bases, August 24-27, 1998, New
York City, New York, USA, pages 26–37, 1998. 57, 83
[GT89] Harold N. Gabow and Robert Endre Tarjan. Faster scaling algorithms for network
problems. SIAM Journal on Computing, 18(5):1013–1036, 1989. 32
[Gua93] John Guare. Six degrees of separation, 1993. Movie. 6
[Gut04] Ron Gutman. Reach-based routing: A new approach to shortest path algorithms
optimized for road networks. In Proceedings 6th Workshop on Algorithm
Engineering and Experiments (ALENEX), pages 100–111. SIAM, 2004. 53, 56
[GW73] David E. Gilsinn and Christoph Witzgall. A performance comparison of label-
ing algorithms for calculating shortest path trees. Technical Note 772, National
Institute of Standards and Technology, 1973. 35, 50
[GW05] Andrew V. Goldberg and Renato Fonseca F. Werneck. Computing point-to-point
shortest paths from external memory. In Proceedings of the Seventh Workshop
on Algorithm Engineering and Experiments and the Second Workshop on Ana-
lytic Algorithmics and Combinatorics, ALENEX /ANALCO 2005, Vancouver, BC,
Canada, 22 January 2005, pages 26–40, 2005. 53, 54, 111
[GYY80] Ronald L. Graham, Andrew Chi-Chih Yao, and F. Frances Yao. Information
bounds are weak in the shortest distance problem. Journal of the ACM,
27(3):428–444, 1980. 35
[GZ05] Jie Gao and Li Zhang. Well-separated pair decomposition for the unit-disk graph
metric and its applications. SIAM Journal on Computing, 35(1):151–169, 2005.
48, 121
[GZC+09] Ido Guy, Naama Zwerdling, David Carmel, Inbal Ronen, Erel Uziel, Sivan Yogev,
and Shila Ofek-Koifman. Personalized recommendation of social software items
based on social relations. In Proceedings of the 2009 ACM Conference on
Recommender Systems, RecSys 2009, New York, NY, USA, October 23-25, 2009, pages
53–60, 2009. 13
[Hag00a] Torben Hagerup. Dynamic algorithms for graphs of bounded treewidth.
Algorithmica, 27(3):292–315, 2000. Announced at ICALP 1997. 48
[Hag00b] Torben Hagerup. Improved shortest paths on the word RAM. In Automata, Lan-
guages and Programming, 27th International Colloquium, ICALP 2000, Geneva,
Switzerland, July 9-15, 2000, Proceedings, pages 61–72, 2000. 33
[Hag06] Torben Hagerup. Simpler computation of single-source shortest paths in linear
average time. Theory of Computing Systems, 39(1):113–120, 2006. 35
[Hak62] Seifollah Louis Hakimi. On realizability of a set of integers as degrees of the
vertices of a linear graph. I. Journal of the Society for Industrial and Applied
Mathematics, 10(3):496–506, 1962. 24
[Hal76] Rudolf Halin. S-functions for graphs. Journal of Geometry, 8(1-2):171–186, 1976.
20
[Han87] Eric N. Hanson. A performance analysis of view materialization strategies.
SIGMOD Rec., 16(3):440–453, 1987. 10
[Han04] Yijie Han. Improved algorithm for all pairs shortest paths. Information Processing
Letters, 91(5):245–250, 2004. 37
[Han08a] Yijie Han. A note of an $O(n^3 / \log n)$ time algorithm for all pairs shortest paths.
Information Processing Letters, 105(3):114–116, 2008. 37
[Han08b] Yijie Han. An $O(n^3 (\log\log n / \log n)^{5/4})$ time algorithm for all pairs shortest
path. Algorithmica, 51(4):428–434, 2008. Announced at ESA 2006. 37
[Hay06] Brian Hayes. Connecting the dots. Computing Science, 94(5):400, 2006. 4
[Hel53] Isidor Heller. On the problem of shortest path between points, I. Bulletin of the
American Mathematical Society, 59, 1953. A study of the five-city Traveling
Salesman Problem polytope. 8
[HHSW09] Shinichi Honiden, Michael E. Houle, Christian Sommer, and Martin Wolff. Ap-
proximate shortest path queries in graphs using Voronoi duals. In Sixth annual In-
ternational Symposium on Voronoi Diagrams in Science and Engineering (ISVD),
pages 53–62, 2009. Invited to special issue of Transactions on Computational
Science. 16
[Hit68] Lewis E. Hitchner. A comparative investigation of the computational efficiency of
shortest path algorithms. Technical Report ORC 68-17, University of California at
Berkeley, 1968. 35, 50
[HJR95] Yun-Wu Huang, Ning Jing, and Elke A. Rundensteiner. Hierarchical path views: A
model based on fragmentation and transportation road types. In ACM-GIS, pages
93, 1995. 51, 54
[HJR96a] Yun-Wu Huang, Ning Jing, and Elke A. Rundensteiner. Effective graph clustering
for path queries in digital map databases. In CIKM 96, Proceedings of the Fifth
International Conference on Information and Knowledge Management, November
12 - 16, 1996, Rockville, Maryland, USA, pages 215–222, 1996. 54
[HJR96b] Yun-Wu Huang, Ning Jing, and Elke A. Rundensteiner. Path queries for trans-
portation networks: Dynamic reordering and sliding window paging techniques.
In GIS '96, Proceedings of the fourth ACM workshop on Advances
in Geographic Information Systems, November 15-16, 1996, Rockville, Maryland,
USA, pages 9–16, 1996. 54
[HJR97a] Yun-Wu Huang, Ning Jing, and Elke A. Rundensteiner. A hierarchical path view
model for path finding in intelligent transportation systems. GeoInformatica,
1(2):125–159, 1997. 54
[HJR97b] Yun-Wu Huang, Ning Jing, and Elke A. Rundensteiner. Integrated query process-
ing strategies for spatial path queries. In Proceedings of the Thirteenth Interna-
tional Conference on Data Engineering, April 7-11, 1997 Birmingham U.K, pages
477–486, 1997. 54
[HJR00] Yun-Wu Huang, Ning Jing, and Elke A. Rundensteiner. Optimizing path query
performance: Graph clustering strategies, 2000. 54
[HKRS97] Monika Rauch Henzinger, Philip Nathan Klein, Satish Rao, and Sairam Subrama-
nian. Faster shortest-path algorithms for planar graphs. Journal of Computer and
System Sciences, 55(1):3–23, 1997. Announced at STOC 1994. 35, 103
[HLL06] Haibo Hu, Dik Lun Lee, and Victor C. S. Lee. Distance indexing on road networks.
In VLDB '06: Proceedings of the 32nd international conference on Very large data
bases, pages 894–905, 2006. 54
[HLW98] Abolhassan Halati, Henry Lieu, and Susan Walker. CORSIM-corridor traffic
simulation model. In Proceedings at the traffic congestion and traffic safety conference,
pages 570–576, 1998. 10
[HLW06] Shlomoh Hoory, Nati Linial, and Avi Wigderson. Expander graphs and their
applications. Bulletin of the American Mathematical Society, 43(4):439–561, 2006.
2, 65
[HMZ03] David A. Hutchinson, Anil Maheshwari, and Norbert Zeh. An external memory
data structure for shortest path queries. Discrete Applied Mathematics,
126(1):55–82, 2003. 45
[HNR68] Peter E. Hart, Nils J. Nilsson, and Bertram R. Raphael. A formal basis for the
heuristic determination of minimum cost paths in graphs. IEEE Transactions of
Systems Science and Cybernetics, SSC-4(2):100–107, 1968. 35, 52
[Hol04] Johan Holmgren. Efficient updating shortest path calculations for traffic
assignment. Technical Report LITH-MAI-EX-2004-13, Division of Optimization, De-
partment of Mathematics, Linkoping Institute of Technology, Linkoping, October
2004. 10
[Hol08] Martin Holzer. Engineering Planar-Separator and Shortest-Path Algorithms. PhD
thesis, Universität Karlsruhe, 2008. 51, 54
[Hoo02] Shlomo Hoory. On graphs of high girth. PhD thesis, Hebrew University, 2002. 28
[HP58] Walter Hoffman and Richard Pavley. Applications of digital computers to
problems in the study of vehicular traffic. In IRE-ACM-AIEE '58 (Western):
Proceedings of the May 6-8, 1958, western joint computer conference: contrasts in
computers, pages 159–161, 1958. 10, 54
[HPCD96] John Joseph Helmsen, Elbridge Gerry Puckett, Phillip Colella, and M. Dorr. Two
new methods for simulating photolithography development in 3D. In Proceed-
ings of the SPIE The International Society for Optical Engineering Optical
Microlithography IX, pages 253–261, 1996. 11
[HSP92] Jianying Hu, William J. Sakoda, and Theodosios Pavlidis. Interactive road finding
for aerial images. In IEEE Workshop on Applications of Computer Vision, pages
56–63, 1992. 11
[HST09] Bernhard Haeupler, Siddhartha Sen, and Robert Endre Tarjan. Rank-pairing heaps.
In Algorithms - ESA 2009, 17th Annual European Symposium, Copenhagen,
Denmark, September 7-9, 2009. Proceedings, pages 659–670, 2009. 33
[HSW08] Martin Holzer, Frank Schulz, and Dorothea Wagner. Engineering multilevel over-
lay graphs for shortest-path queries. ACM Journal of Experimental Algorithmics,
13, 2008. 51, 54
[HSWW05] Martin Holzer, Frank Schulz, Dorothea Wagner, and Thomas Willhalm. Combin-
ing speed-up techniques for shortest-path computations. ACM Journal of Experi-
mental Algorithmics, 10, 2005. 54, 55
[HT69] Te C. Hu and William T. Torres. Shortcut in the decomposition algorithm for short-
est paths in a network. IBM Journal of Research and Development, 13(4):387–390,
1969. 35, 36, 51, 125
[HT02] Yijie Han and Mikkel Thorup. Integer sorting in O(n