
Approximate Shortest Path and Distance Queries in Networks


Christian Sommer
Born on 8 March 1983 in Zurich, Switzerland
Citizen of Winterthur (ZH) and Sumiswald (BE), Switzerland
Master of Science ETH Zurich (2006)
Submitted in partial fulfillment of the requirements
for the degree of Doctor of Philosophy
January 2010
Department of Computer Science
Graduate School of Information Science and Technology
The University of Tokyo
Abstract
Computing shortest paths in graphs is one of the most fundamental and well-studied problems in
combinatorial optimization. Numerous real-world applications have stimulated research investi-
gations for more than 50 years. Finding routes in road and public transportation networks is a
classical application motivating the study of the shortest path problem. Shortest paths are also
sought by routing schemes for computer networks: the transmission time of messages is less when
they are sent through a short sequence of routers. The problem is also relevant for social networks:
one may more likely obtain a favor from a stranger by establishing contact through personal con-
nections.
This thesis investigates the problem of efficiently computing exact and approximate shortest
paths in graphs, with the main focus being on shortest path query processing. Strategies for com-
puting answers to shortest path queries may involve the use of pre-computed data structures (often
called distance oracles) in order to improve the query time. Designing a shortest path query processing
method raises questions such as: How can these data structures be computed efficiently?
What amount of storage is necessary? How much improvement of the query time is possible?
How good is the approximation quality of the query result? What are the tradeoffs between pre-
computation time, storage, query time, and approximation quality?
For distance oracles applicable to general graphs, the quantitative tradeoff between the storage
requirement and the approximation quality is known up to constant factors. For distance oracles
that take advantage of the properties of certain classes of graphs, however, the tradeoff is less
well understood: for some classes of sparse graphs such as planar graphs, there are data structures
that enable query algorithms to efficiently compute distance estimates of much higher precision
than what the tradeoff for general graphs would predict. The first main contribution of this thesis
is a proof that such data structures cannot exist for all sparse graphs. We prove a space lower
bound implying that distance oracles with good precision and very low query costs require large
amounts of space. A second contribution consists of space- and time-efficient data structures for
a large family of complex networks. We prove that exploiting well-connected nodes yields efficient
distance oracles for scale-free graphs. A third contribution is a practical method to compute
approximate shortest paths. By means of random sampling and graph Voronoi duals, our method
successfully accommodates both highly structured graphs stemming from transportation networks
and less structured graphs stemming from complex networks such as social networks.
Acknowledgements
I thank my advisor and patron Shinichi Honiden for the wonderful Japanese environment he pro-
vided, in which I learnt the essentials of Japanese student life and culture, for the freedom he gave
me, and for his very generous financial support, which allowed me to travel to many conferences
all around the world, and which also allowed me to work with high-quality equipment.
I thank my co-advisor and mentor Michael E. Houle for the many fruitful and interesting
discussions about algorithms and everything else, for announcing the shortest path project, for his
advice, guidance, and intelligent questions, for his efforts as a collaborator, for teaching me how to
improve my writing, and, honestly, for him patiently insisting on important things I did not want
to hear and believe.
I am very honored to have Hiroshi Imai as chair and David M. Avis, Takeo Igarashi, Kunihiko
Sadakane, and Tetsuo Shibuya as members of the committee for my PhD thesis. Many thanks for
investing their time in studying my work and for examining this thesis.
I profited and learned enormously from the co-authors I worked with on the results of this
thesis. I am indebted to Wei Chen, who first told me about the connection between path queries
and compact routing, Shang-Hua Teng, for empowering and inspiring me, Elad Verbin, for his
fascinating questions and his great intuition (and the invitation to Tsinghua), Yajun Wang, for
insisting and working hard on the subtle differences between random graph models, Martin Wolff,
for his precious help on bootstrapping my thesis and for his impressive skills as a non-native editor,
and Wei Yu, for working on all the tedious and nasty calculations.
I also profited from collaborating with Cyrille Artho, Hristo Djidjev, Stephan Eidenbenz,
Daisuke Fukuchi, Nicolas W. Hengartner, Pierre-Loïc Garoche, Shiva Kasiviswanathan, Ken-ichi
Kawarabayashi, Yusuke Kobayashi, Martin Mevissen, Johan Nyström, Yoshio Okamoto, David
Roberts, Sunil Thulasidasan, and Takeaki Uno. I got precious advice from Ittai Abraham, Erik D.
Demaine, Jittat Fakcharoenphol, Cyril Gavoille, Stephan Langermann, Xiang-Yang Li, Mikkel
Thorup, and Uri Zwick.
This thesis has improved substantially by the valuable comments of those who read parts of
preliminary versions of it. I would like to thank Michael E. Houle, Cyrille Artho, Jacopo Grazzini,
and Johan Nyström for their helpful suggestions and proofreading. The remaining errors and
omissions are entirely the author's responsibility.
I owe a big thanks to countless individuals who I was fortunate to meet and to spend time
with in one way or the other (o-sewa ni nari mashita!) in the Honiden laboratories: Miki Nakagawa
(and Family Nishida in Takatsuki), Kyoko Oda, Akiko Shimazu, Shuko Yamada, Ai Tobimatsu,
Mizuki Inoue, Kenji Taguchi, Yasuyuki Tahara, Nobukazu Yoshioka, Fuyuki Ishikawa,
Kenji Tei, Rihoko Inoue, Nik Nailah Binti Abdullah, Rey Abe, Hikari Aikawa, Yukino Baba,
Valentina Baljak, Takuo Doi, Katsushige Hino, Satoshi Kataoka, Yojiro Kawamata, Kazutaka
Matsuzaki, Hirotaka Moriguchi, Mohammad Reza Motallebi, Hiroyuki Nakagawa, Yoshiyuki
Nakamura, Hikotoshi Nakazato, Eric Platon, José Ghislain Quenum, Yuichi Sei, Ryota Seike,
Shunichiro Suenaga, Ryuichi Takahashi, Ryu Tatsumi, Susumu Toriumi, Eric Tschetter, Kayoko
Yamamoto, Adrian Klein, Maxim Makatchev, Daniele Quercia, and Martin Rehak; at NII: Shigeko
Tokuda, Michael Nett, Weihuan Shu, Nizar Grira, Sebastien Louis, Sébastien Duval, Takeshi
Ozawa, Yuzuru Sawato and all the other nice and friendly guards; at Microsoft Research Asia
and in Beijing: Kun Chen, Yuki Arase, Yasuyuki Matsushita, Tommy, Yuan Zhou, Jialin Zhang,
Lolan Song, Peter & Ursi Zürcher, Gabriel Schweizer, and Jaimie Hwang; at the Los Alamos National
Laboratories and in Los Alamos: Ulrike Campbell & Glenn, Family Eidenbenz, Douglas D.
Kautz, Jacopo Grazzini, Nicolas Jegou, Leticia Cuellar, Vishwanath Venkatesan, Guanhua Yan,
Keren Tan, Lukas Kroc, Leonid Gurvits, Nandakishore Santhi, and Carrie Manore; on my trips
to the US and around the world: Alicia Aponte, Jittat Fakcharoenphol, Daniele Quercia, Thomas,
Kati & Lars Spirig, and Andrew Yao; a special thanks to my running mates from Tokyo's International
Running Club Namban Rengo for many enjoyable hours of training, racing, and travelling
(and partying): thanks Guillaume Bouvet, Gary & Mami Chandler, Kazuo Chiba, Chad Clark, Ed-
ward Clease, Joachim Dirks, Jon Holmes, Jay Johannesen, Chika Kanai, Gordon Kanki Knight,
Daniel Kershner, Stephen Lacey, Christiane Lange, Brett Larner, Jason Lawrence, Omar Minami,
Teruyuki Minegishi, Keren Miers, Satohi Numasawa, Paddy O'Connor, Rie Onodera, Bob Poulson,
Gareth Pughe, Fabrizio Raponi, Gerard Robb, David Motozo Rubenstein, Philip Ryan, Daniel
Seite, Yuka Shigihara, Mika Tokairin, Martin Verdier, Jürgen Wittstock, Kiyonari Yoshida, and
others; from the Swiss club in Japan: Jörg Aschwanden, Daniel & Monika Hagemeier, Felix
Mösner, Hans Prisi, Christoph Saxer, Uwe Sievers, Martin Wenk, and Hermann Werner; Japan
friends (tea ceremony & BBQ) Mitsuhiko & Yuko Kusuyama, Chikako, Tomoaki, & Noriaki
Sawada, Machiko & Yoshinobu Nagura, Junichi Andy Kimura, Hitomi Sakamoto, Tomoko
Sawada, Toshiro Joe Suzuki, Akemi Okado, Masae Ono, John & Yuki Mettraux; more friends
in Japan: Ryudo Tsukizaki, Koichi Matsumoto, Filiz Gencer, Jumi Klaus, Cedric S. Rutishauser,
Jerry Ray, my extremely patient and lovely Japanese teacher Chieko Okamoto, Katsuro Ishimasa
of Idaten, and Ken Yamagata of Iroha for making the best eel sushi in town.
It was great to have friends visiting me in Japan; thanks to Samuel Burri, Adrian Doswald,
Martin Halter, Roger Herzog, Julia Imhof, Tanja Isker, Moritz Isler, Hannes Schneebeli, Georg
Troxler, and Stefan Wolf.
Last but definitely not least, I wish to thank my wonderful family. None of this would ever
have been possible without the love and the fantastic support of Hansruedi, Hermine, and Stefan.
They taught me that shortest paths are not always optimal and desirable, and that "der Weg ist das
Ziel". And I wish to thank my Japanese family in Miyazaki, Kayashima, and Tottori, especially
my beloved Rie, for everything.
Christian Sommer, December 2009
Table of Contents
1 Introduction
1.1 Networks and Graphs
1.1.1 Transportation Networks
1.1.2 Complex Networks
1.2 Shortest Paths
1.2.1 Classical Results
1.2.2 Point-to-Point Shortest Path Queries
1.3 Contribution
1.4 Outline
2 Preliminaries
2.1 Graphs
2.1.1 Graph Properties
2.1.2 Graph Classes
2.1.3 Synthetic Graph Models
2.2 Graph Algorithms
2.2.1 Computational Models
2.2.2 Approximation Algorithms
2.3 Common Techniques
2.3.1 Spanners and Emulators
2.3.2 Distance Labelings and Metric Embeddings
2.3.3 Planar Graph Techniques
2.3.4 Well-Separated Pairs
2.4 Shortest Paths
2.4.1 Single Source Shortest Path (SSSP) Algorithms
2.4.2 All Pairs Shortest Path (APSP) Algorithms
2.4.3 Many Pairs Shortest Path (MPSP) Algorithms
2.4.4 Shortest Path and Distance Queries
3 Review of Short Path Query Processing
3.1 Theoretical Distance Oracles
3.1.1 Lower Bounds
3.1.2 General Graphs
3.1.3 Restricted Graph Classes
3.2 Practical
3.2.1 Hierarchical Approaches
3.2.2 Graph Annotation Approaches
3.2.3 Road Networks
3.2.4 Complex Networks
3.3 Summary
4 Lower Bounds for Sparse Graphs
4.1 Context
4.2 Preliminaries
4.2.1 Communication Complexity
4.2.2 Regular Graphs with Large Girth
4.2.3 Counting Permutations
4.3 Reduction from Lopsided Set Disjointness
4.3.1 Intuition
4.3.2 Reduction from a data structure to a communication protocol
4.3.3 Communication complexity implies space complexity
4.3.4 Counting Paths
4.3.5 Assembly
4.4 Conclusion and Open Problems
5 Distance Oracles for Power-law Graphs
5.1 Introduction
5.1.1 Overview of the Result
5.1.2 Related Work
5.2 Preliminaries
5.2.1 Distance Oracle of Thorup and Zwick
5.2.2 Properties of Random Power-law Graphs
5.3 The Adapted Distance Oracle
5.4 Time and Space Complexities
5.4.1 Core Size
5.4.2 Ball Sizes
5.4.3 Assembly
5.5 Conclusion and Open Problems
6 Approximating Shortest Paths Using Voronoi Duals
6.1 Introduction
6.2 Preliminaries
6.2.1 Graph Voronoi Diagram
6.3 The Voronoi Method
6.4 Computational Complexity
6.5 Stretch Analysis
6.6 Experiments
6.6.1 Algorithms
6.6.2 Data Sets
6.6.3 Experimental Setting
6.6.4 Results
6.7 Conclusion and Open Problems
7 Conclusion
MILLE VIAE DUCUNT HOMINEM
PER SAECULA ROMAM.
(All roads lead to Rome.)
(Es führen viele Wege nach Rom.)
Alanus de Insulis
in the Liber Parabolarum
12th century AD
1 Introduction
Imagine you wanted to travel from central Tokyo to visit the place where I grew up, which is the
village of Ottenbach in Switzerland. What is the fastest way to get there?
Imagine you wanted to get in touch with Nelson Mandela using a sequence of personal intro-
ductions through friends and friends of friends. What is the shortest such sequence?
Imagine you wanted to access a webpage on the Internet. Which routers should be used such
that the necessary information is downloaded to your computer fastest?
These questions have something in common, in that their (optimal) solution is the shortest
path between two points of a network: a transportation network, a social network, and a router
network. All roads may lead to Rome, but we wish to arrive as soon as possible. The aim of this
thesis is to provide means to efficiently compute shortest paths in networks.
1.1 Networks and Graphs
Figure 1.1: The Königsberg bridges as depicted by Leonhard Euler in his article Solutio Problematis
ad Geometriam Situs Pertinentis on page 129 in volume 8 of Commentarii Academiae
Scientiarum Petropolitanae in 1741.
You may have heard of the seven Königsberg bridges and the question as to whether one can,
in one walk, cross each bridge exactly once. Leonhard Euler resolved this question in 1735 by
proving that there is no such walk. His proof works as follows. Consider the island denoted by
the letter A on the illustration in Figure 1.1. Euler made the following important observation:
although the island contains many buildings, streets, and paths, the relevant information is the
island's connections (the bridges) to the other parts of the city. This observation led him to create
an abstract discrete structure, later termed a graph. He identified each landmass with a node
and each bridge with an edge connecting the two corresponding nodes in the graph. A walk in
Königsberg corresponds to a walk in the graph; crossing a bridge is represented by traversing an
edge. Euler then noted that the number of edges adjacent to a node is essential. Except for the
endpoints of the walk, all intermediate nodes must have an even number of adjacent edges, since
any walk must leave the node exactly once for every time entering it. Since all of the four nodes
have an odd number of edges, there cannot be any walk that traverses each edge exactly once.
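Euler's parity argument can be checked mechanically. The following is a minimal sketch in Python, not part of the original text: the edge list is the standard textbook reading of Figure 1.1 (A is the island; the labels B, C, and D for the remaining landmasses and the helper names are assumptions made for illustration). It counts the edges at each node and tests the degree condition for a connected multigraph.

```python
from collections import Counter

# The seven bridges as edges between four landmasses (assumed labeling:
# A is the island, B and C are the two river banks, D is the eastern part).
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"),
    ("B", "D"),
    ("C", "D"),
]

def degrees(edges):
    """Count how many edge endpoints touch each node (parallel edges count separately)."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree

def satisfies_degree_condition(edges):
    """Degree condition for a walk using every edge exactly once.

    Only the two endpoints of the walk may have an odd number of edges,
    so a connected multigraph needs zero or two odd-degree nodes.
    Connectivity is assumed here and not checked.
    """
    odd_nodes = [node for node, d in degrees(edges).items() if d % 2 == 1]
    return len(odd_nodes) in (0, 2)
```

For the Königsberg edge list, all four nodes have odd degree, so `satisfies_degree_condition(BRIDGES)` evaluates to `False`, in line with Euler's conclusion.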
Modeling the bridges by the edges of a graph helped Euler to solve the problem of the
Königsberg bridges. Ever since, graphs have been used as an abstraction of structures in the
real world:
Graphs are, of course, one of the prime objects of study in Discrete Mathemat-
ics. However, graphs are among the most ubiquitous models of both natural and
human-made structures. In the natural and social sciences they model relations among
species, societies, companies, etc. In computer science, they represent networks
of communication, data organization, computational devices as well as the flow of
computation, and more. In mathematics, Cayley graphs are useful in Group Theory.
Graphs carry a natural metric and are therefore useful in Geometry, and though they
are just one-dimensional complexes, they are useful in certain parts of Topology,
e.g. Knot Theory. In statistical physics, graphs can represent local connections be-
tween interacting parts of a system, as well as the dynamics of a physical process on
such systems. [HLW06]
In a graph representing a road network, intersections and streets can be modeled by nodes
and edges, respectively. Two nodes have an edge in between if there is a street connecting the
two corresponding intersections. For a computer network, routers and the connecting network
cables are mapped to nodes and edges, respectively. In a social network, the connections are not
physical. Individuals can be modeled by nodes; two nodes are connected by an edge whenever
the corresponding individuals are friends. In other social networks, an edge may also indicate a
private or professional relationship other than friendship.
Different streets of a road network have different lengths. This is modeled by assigning each
edge a number, called its edge weight. This cost can reflect real-world values such as distance,
travel time, transmission time, and latency. In a graph modeling a social network, the edge weight
can also reflect the quality of a friendship, although this is arguably difficult to capture appropriately
by a single number [XNR10].
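To make the weighted-graph model concrete, the sketch below stores a small road network as a dictionary of weighted adjacency lists and computes shortest distances with Dijkstra's classical algorithm. This is an illustration only: the toy network `ROADS`, its travel times, and the function name are hypothetical, and the algorithm is used here as a well-known baseline rather than as a method proposed in this passage.

```python
import heapq

def dijkstra(graph, source):
    """Return shortest distances from source to every reachable node.

    graph maps each node to a dict {neighbor: edge_weight}; all weights
    are assumed to be non-negative.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path to u was already found
        for v, weight in graph.get(u, {}).items():
            candidate = d + weight
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist

# Hypothetical road network: nodes are intersections, weights are travel
# times in minutes. Each street is listed in both directions.
ROADS = {
    "a": {"b": 4, "c": 1},
    "b": {"a": 4, "c": 2, "d": 5},
    "c": {"a": 1, "b": 2, "d": 8},
    "d": {"b": 5, "c": 8},
}
```

In this toy network the direct street from `a` to `b` takes 4 minutes, but the detour via `c` takes only 3, which is why edge weights, and not just the number of edges, determine the shortest path.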
In the remainder of this introduction, we first review various example structures, for which a
graph serves as a suitable model. We then consider one central problem that can be solved using
graphs: we investigate applications of the shortest path problem (Section 1.2), in particular con-
temporary applications of the point-to-point shortest path problem (Section 1.2.2). We conclude
the chapter by stating the contribution of this thesis (Section 1.3).
1.1.1 Transportation Networks
Transportation networks are an integral part of the infrastructure in many countries. For this
reason, the study of transportation networks is an important field of research. We give three
examples of transportation networks: road, railway, and airline networks. A more realistic model
of transportation would ideally integrate networks of all three types [Fra08, DPW09, DPWZ09].
Road Networks
As mentioned before, intersections and streets are represented by graph nodes and edges, respec-
tively. Also, edges may be assigned weights such as expected travel times or distances. All the
information included in the graph model can essentially be found by consulting a road map. The
reverse is not true. We cannot draw the original road map by considering the graph only, since
some information about the original drawing (or embedding) such as coordinates is not included.
The original map is just one possible drawing of the graph. A general graph is completely inde-
pendent from its embedding.
A graph that can be drawn on a plane such that no two edges cross (except for the endpoints),
is called planar. Since road networks often contain many bridges and tunnels, the corresponding
graphs are in general not planar. However, road networks and planar graphs share some important
properties, which render road networks tractable for many optimization problems. Navigation,
for example, is not as difficult as it could be on a general graph, since road networks have some
geometric and geographical orientation. Another characteristic of road networks is that, at every
intersection, the number of streets (and thus the number of choices when navigating) is quite low.
Still, planning efficient routes through a road network is very challenging since these networks
may be huge (millions of nodes) and dynamic (travel times depend on various factors such as the
current traffic situation and road maintenance).
Railway Networks
For railway networks, a straightforward model, analogous to road networks, would map stations
and tracks to the nodes and edges of a graph, respectively. Taking a train is represented by travers-
ing an edge. Again, edges can be weighted with distances or travel times. This model arguably
does not represent a real public transportation system very well. In road networks, one can conceivably
use a particular street at any time (except for small delays due to traffic lights). In railway
networks, however, trains are bound to a timetable. This may cause significant waiting time at a
station, which needs to be captured by a realistic model. Indeed, modeling railway networks with
timetable information is more involved than modeling road networks [MHSWZ04]. In the
time-expanded model, for example, nodes represent both a location and a specific time simultaneously.
One station then corresponds to several nodes in the graph. Edges have different types; traversing
an edge means either taking a train, waiting at a station, or walking to a different track within a
station. Edge weights can be travel times (possibly walking or waiting times) or ticket prices. In
general, the graphs derived from railway networks and timetables are considerably more complex
than the graphs representing road networks.
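The time-expanded model described above can be sketched in a few lines. The code below is a simplified illustration under stated assumptions: the timetable format (origin, departure, destination, arrival, with times in minutes), the function name, and the sample data are hypothetical, and only train edges and waiting edges are created. Walking edges between tracks and ticket prices are left out.

```python
from collections import defaultdict

def time_expanded_graph(timetable):
    """Build a time-expanded graph from a list of train connections.

    timetable: iterable of (origin, departure, destination, arrival).
    Returns a dict mapping a node (station, time) to a list of
    (target_node, weight, kind) triples, where kind is "train" or "wait".
    """
    events = defaultdict(set)   # station -> all times at which something happens
    graph = defaultdict(list)
    for origin, departure, destination, arrival in timetable:
        events[origin].add(departure)
        events[destination].add(arrival)
        # Taking a train: the weight is the travel time.
        graph[(origin, departure)].append(
            ((destination, arrival), arrival - departure, "train")
        )
    for station, times in events.items():
        ordered = sorted(times)
        # Waiting at a station: connect consecutive events in time order.
        for earlier, later in zip(ordered, ordered[1:]):
            graph[(station, earlier)].append(
                ((station, later), later - earlier, "wait")
            )
    return dict(graph)

# Hypothetical timetable with three connections between stations X, Y, and Z.
TIMETABLE = [("X", 0, "Y", 30), ("Y", 40, "Z", 70), ("X", 20, "Y", 50)]
```

Station `Y` appears as three separate nodes, `("Y", 30)`, `("Y", 40)`, and `("Y", 50)`, which illustrates why one station corresponds to several nodes and why these graphs grow much larger than a plain station-and-track graph.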
Airline Networks
Air traffic networks can also be represented by graphs. Airports can be modeled by the nodes
of a graph. Two nodes are connected by an edge if there is an aircraft that can take off from and
land at the corresponding airports and cover the distance between them. In this graph, almost all nodes are
connected. While in road networks, each intersection has up to half a dozen connections, in
air traffic networks, airports can have hundreds of connections. Consequently, the corresponding
graphs are significantly denser. For the moment, let us restrict the edges of the graph to
the routes that commercial airlines offer. Even when considering the graph with this restricted
edge set, some nodes have hundreds of adjacent edges, due to the fact that airlines often have a
small number of hubs (usually big airports) where all their routes connect. The airline network
is a so-called small-world network [WS98]: one can connect any two airports in the network by
following a sequence of only one to five connections [ASBS00]. From a graph-theoretic point of
view, airline networks are structurally very different from road and railway networks. Networks
without apparent structure are often called complex networks.
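The small-world property can be made tangible by counting connections with a breadth-first search. The following sketch assumes an unweighted graph given as adjacency lists; the hub-and-spoke route map `ROUTES` is a hypothetical toy example, not real airline data.

```python
from collections import deque

def hop_distances(graph, source):
    """Return the minimum number of connections from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph.get(u, ()):
            if v not in dist:  # first visit is via a fewest-hops route
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Hypothetical route map: two hubs connected to each other, each serving
# two small airports.
ROUTES = {
    "HUB1": ["HUB2", "p", "q"],
    "HUB2": ["HUB1", "r", "s"],
    "p": ["HUB1"],
    "q": ["HUB1"],
    "r": ["HUB2"],
    "s": ["HUB2"],
}
```

Starting from the small airport `p`, every other airport in this toy map is reachable with at most three connections, because all routes pass through the hubs.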
1.1.2 Complex Networks
Various complex systems are highly interconnected: phenomena that were assumed to be local
only are sometimes unexpectedly shown to influence the other end of the system. Researchers from
fields such as physics [BA99, New00, WS98], mathematics [BRST01, BR04, FFV07, Lov09],
computer science [KL06, WM03], biology [MV06, LT09], and social science [LPA+09] analyze
these systems to explain how (and sometimes why) everything is connected [Str01, AB02, New03,
GGM06, BLM+06]. The objectives of network scientists are (1) to make connections in real-world
systems explicit (to connect the dots [Hay06]), (2) to analyze and understand the network
structures formed by these connections, and, possibly, (3) to exploit the structural properties of
these systems. The challenges are manifold. Extracting the connections of complex systems
can already be very difficult since these systems tend to be inherently distributed and very large.
Once the connections have been made explicit, that is, once a graph (in this context also termed
complex network) has been created, the next challenge awaits. Due to the often massive size
of the graph (millions of nodes and edges are not uncommon), analyzing its structure requires
sophisticated methods, techniques, and tools. Researchers often use methods to decompose the
graph into different clusters and communities of smaller size, which are easier to understand than
the whole graph. Another approach is to focus on important nodes within the graph. Important
nodes are conjectured to be those that are well connected with most of the other nodes, or those
that enable important connections between other nodes.
Although these systems and the corresponding complex networks appear to lack structure,
they still have some important commonalities: they show a large variation in the number of
links connected to each node; these networks often have a few nodes that are very well con-
nected while a majority of the nodes have only few interactions; the number of interactions ap-
pears to follow a power law [Mit03, New05, CSN07]. These networks are called scale-free net-
works [dSP65, BA99, BAJ00, New00, WS98]. Also, many complex networks are small worlds,
which means that, although many nodes only have a few direct interactions, they are still con-
nected to all the other nodes through very short chains of interactions.
In the following, we discuss some specic complex networks in more detail.
Citations among scientific articles
The graph modeling citations among scientific articles is a directed graph, which means that its
edges are oriented. The node corresponding to article A has a directed edge to the node corresponding
to article B if and only if article A cites article B. The citation of a research article by
another one is an explicit relationship indicating influence and dependence. An article that gets
cited by many other articles is likely to have had some influence on many researchers. For this
reason, the impact of an article is often measured by the number of citations an article received.
Researchers and managers evaluate the success of a research article based on its impact. There are
also potential applications other than evaluating and ranking articles and researchers. For exam-
ple, the citation graph may be of use in a system that automatically suggests reviewers for articles
submitted to a journal.
The network of citations among scientific articles was in fact one of the first complex networks
ever analyzed [dSP65]. It has been found that, over time, articles that are cited by many
others acquire even more citations [dSP76]. This cumulative advantage (later termed preferential
attachment [BA99]) appears to be very common in complex networks.
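The cumulative-advantage effect can be simulated with a very small growth process. The sketch below is a simplified variant in the spirit of preferential attachment, not the exact model of [BA99]: each new node attaches a single edge to an existing node chosen with probability proportional to its current degree. The function names, the one-edge-per-node rule, and the seeding are assumptions made for illustration.

```python
import random
from collections import Counter

def preferential_attachment(n, seed=0):
    """Grow a graph to n >= 2 nodes and return its list of edges.

    Node 1 is first linked to node 0. Every later node picks its target
    uniformly from the list of edge endpoints, so a node with degree k is
    k times as likely to be chosen as a node with degree 1.
    """
    rng = random.Random(seed)
    edges = [(1, 0)]
    endpoints = [1, 0]  # every node appears once per incident edge
    for new_node in range(2, n):
        target = rng.choice(endpoints)
        edges.append((new_node, target))
        endpoints.extend((new_node, target))
    return edges

def degree_histogram(edges):
    """Map each degree value to the number of nodes having that degree."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())
```

Running `degree_histogram(preferential_attachment(10000))` typically shows many nodes of degree one and a few nodes of very high degree, which is the kind of skewed distribution the text refers to. The exact counts depend on the random seed.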
Web Graph
The web consists of billions of pages that are connected by hyperlinks. For the web structure, a
directed graph is a suitable model [KKR+99, BKM+00, DLL+06]. Web pages and hyperlinks are
mapped to nodes and edges, respectively. The resulting structure is also called web graph. Similar
to the citation between two scientific papers, the hyperlink between two web pages indicates some
dependence and influence. A scientific article is considered important if it is cited by many others;
analogously, a webpage is considered important if it is linked to by many others. Modern search
engines often make use of this network structure when ranking search results [PBMW99, FVC07].
An accurate snapshot of the web graph is hard to obtain: the number of web pages is massive,
the pages reside on different servers all around the world, and many pages change constantly.
Due to these reasons, researchers analyze subsets and samples of the web. Based on such small
samples, it is conjectured that the web graph is scale-free [BAJ00], which means that the number
of incoming links per web page obeys a power law. It is also conjectured that most pages that are
inter-linked, are connected by rather short chains of links [AJB99]. Based on the characteristics
of sample web graphs, researchers also build mathematical models [KRR+00, BBCR03, FFV07,
CKL+09], using which the future structure of the web graph can be predicted.
Weblogs (blogs) form a very popular part of the web [Shi03]. Since good content on the
web is proportionally rare and since most blog writers (termed bloggers) also read other blogs,
interesting ideas, rumors, and stories propagate very quickly within the blogosphere. Some blogs
are read by millions of readers and the corresponding writers have the power to influence their
audience to a certain extent. It is thus no surprise that companies try to make use of this network
of influence when launching new products or services. The analysis of the blogosphere is indeed
a vibrant research topic [LMF+07, LNMG09].
Router Connections
Since the Internet has become an integral network handling a significant percentage of business
transactions both among companies and between companies and individuals, any technical failure
might have drastic consequences in the real world. The Internet is nowadays considered
critical infrastructure almost at the level of transportation and power networks.
Although the Internet is a physical network consisting of routers and cables (modeled respectively
by nodes and edges of a graph), there is no accurate map available. Researchers try
to create such a map by sending messages to random computers while monitoring (using the
tool traceroute) which routers the messages went through [FFF99, VPSV02, Coo03]. These
measurements suggest that the Internet (or some of its subgraphs) also has properties of power-law
graphs. However, it is not clear whether these properties were obtained due to measurement bias
in the map creation process [ACKM09].
The speed of the Internet highly depends on the efficiency of the Internet protocols, in particular
the routing protocols. The performance of a protocol highly depends on the underlying
network structure. Since speed matters and due to the importance of the Internet as critical infrastructure,
the network of router connections is a very important network to understand and analyze.
Networks in Biology
Traditionally, proteins are identified based on their actions as building blocks, catalysts, or signaling
molecules. The network view identifies a protein by its physical interactions with other proteins.
This yields its contextual role in a protein-protein interaction network [UGC+00, CHC+00,
ICO+01, RSR+01, GBB+03, LAB+04]. Proteins and interactions are represented by nodes and
edges of a graph, respectively.
Recently, many networks have been extracted from biological data [MV06]. Examples other
than protein interaction networks include metabolic networks, which encode biochemical reactions
between metabolic substrates [RSM+02], and transcriptional regulatory networks, which describe
the regulatory interactions between different genes [IFB+02, SLSP03].
Network science may help to understand the human body and its internal actions, for example
by determining the role, function, and essentiality of proteins or genes [SL03, JOB03] by
analyzing shared interaction partners in the protein-protein interaction network. This knowledge
may then help to find treatments for diseases such as cancer by identifying drug targets [VLL00].
Another potential application in medicine lies in the intersection with neuroscience. Abnormal
interaction patterns in the brain could help to diagnose neurological disorders [SG05]. These
applications indicate that the potential of an interplay between biology and network science is
enormous.
Social Networks
Around 1929, the Hungarian writer Frigyes Karinthy [Kar29] formulated the following challenge:
find a person to whom you cannot be connected by a friendship chain of at most five people. In
other words, Karinthy conjectured that everybody knows everybody else through a short chain of
personal connections. Milgram [Mil67], in his famous small-world experiment [Mil67, TM69],
tried to verify this conjecture with an empirical study. Participants in the Milgram study were
supposed to deliver a letter addressed to a specific person, not by mailing it directly, but by forwarding
it to somebody they know on a first-name basis, somebody who is more likely to know the
addressee. Milgram was interested in the number of forwarding steps, which corresponds directly
to the length of a friendship chain. Although most letters never arrived, many letters reached the
addressee after only a very small number of forwarding steps.¹ It's a small world! Milgram's
result is a stronger statement than Karinthy's conjecture. For two individuals, the friendship
chains connecting the two are not only short, it is also possible to find such a chain.
The small-world phenomenon has been fascinating people around the world for many years.
For example, the popular Hollywood movie Six Degrees of Separation [Gua93] is devoted entirely
to small-world phenomena. As another example, the members of some communities measure
their interaction distance to distinguished individuals. Actors are interested in their co-starring
distance to Kevin Bacon, mathematicians care about their collaboration distance to Paul
Erdős,² and players of the popular game Go measure their distance to Honinbo Shusaku.³
¹ Some researchers doubt the significance of Milgram's results [Kle02b, Kle02c].
² My own Erdős number is currently 3. Sommer–Houle–Avis–Erdős and Sommer–Teng–F. Yao–Erdős are two
disjoint shortest paths containing 3 publications each.
³ See http://senseis.xmp.net/?ShusakuNumber. Erdős apparently has a Shusaku number of at
most 6.
In social networks [dSPK78], we consider individuals as nodes and relationships as edges
of a graph. If the relationship is friendship, social networks are far more than a scientific playground:
online social networking platforms such as Facebook, MySpace, mixi, and others have
conquered the web and their corresponding websites attract millions of users. Privacy aside, these
social networking websites generate massive data sets that are of huge interest for advertising
companies and marketeers. Some of this data is made available to scientists as well (usually after
anonymization). Companies also analyze their internal communication patterns to improve productivity
and to identify leader personalities [GBL08]. Social networks can also be extracted from
phone call [OSH+07] and instant messaging [LH08] data. Other data on social relationships may
be harder to collect. If an edge of the graph indicates sexual interaction (not necessarily a subset
of the friendship edges), not everybody may be willing to reveal all connections. In a sample
of 2,810 Swedes, the number of sexual interactions per person showed the structure of a power
law [LEA+01]. Although the exact connections were not retrieved, the power-law distribution in
the number of connections is already consequential. Epidemics tend to arise and propagate very
fast in power-law networks [WS98, PSV01]. It is also known that the speed of disease propagation
can potentially be reduced by prevention campaigns that strategically target those individuals
with a large number of partners [LEA+01]. The previous example indicates that knowledge and
understanding about the structure of complex networks may have an impact on the real world.
Network of networks and techno-social systems
The spread of a general infectious disease depends on interactions of different types. One network
is not enough; several networks have to be considered simultaneously to find effective containment
strategies in urban social networks [New02, MPSV02, EGK+04, MPN+05, GCE+09].
Techno-social systems consist of a technological component, often physical infrastructure such
as transportation systems, and a human component, which could be any form of communication,
potentially influenced by a social network. Understanding the interplay of both components is
necessary to predict, and eventually control and influence, techno-social systems.
It now seems possible to imagine the creation of computational forecasting infrastructures
that will help us design better energy-distribution systems, plan for traffic-free
cities, anticipate the demands of Internet connectivity, or manage the deployment of
resources during health emergencies. [Ves09]
1.2 Shortest Paths
Once we have performed the abstraction by making connections explicit and modeling them by a
graph, we can work with this model and compute properties of the graph without considering the
exact details of the underlying network. Interesting properties could be graph distances between
nodes. Tobler [Tob70] invoked
... the first law of geography: everything is related to everything else, but near things
are more related than distant things.
This law ought to be generalized to a law holding for many networks. An edge of a graph essentially
indicates a relationship between the two corresponding entities: between two interacting
proteins, between two friends, between two intersections in a road network, or between two routers
connected by a cable. Suppose that this relationship is somewhat transitive,⁴ which means that, for example,
if Paul is friends with Linton and Stanley, then Linton and Stanley are probably closer to each
other than if they were not friends with Paul. Or, to consider another example, if protein A interacts
with and influences protein B, and B influences C, then A has an indirect influence on C.
The shorter the chain of interactions, the stronger the influence. This is the first law of networks.
In graphs, the distance between two nodes s and t (source and target, respectively) is defined as
follows. If s and t are connected by an edge, their distance is 1. If they are not directly connected,
the distance is defined by the length of a shortest path between s and t, which is a sequence of
adjacent edges. In a weighted graph, the length of a path is defined by the sum of the weights of
the edges on the path. Consequently, shortest paths are defined with respect to these weights. Note
that, in weighted graphs, even if two nodes are connected by an edge, depending on its weight, the
edge is not necessarily part of any shortest path.
⁴ In some graphs, two nodes are connected by an edge if there is some conflict between the corresponding entities.
For example, two neighboring wireless devices cannot send at the same time due to interference. These conflict graphs
often occur in scheduling problems, which can be solved for example by coloring the graph. In this thesis, however,
we restrict ourselves to graphs of the type encountered in the previous examples, assuming that an edge represents a
somewhat positive relationship and not a conflict.
To each of the three questions at the beginning of this chapter the answer is a shortest path in
the corresponding network: the fastest way to get to Ottenbach, the shortest sequence of friends
to get in touch with Nelson Mandela, and the fastest route to a webserver. Many other questions
also have shortest paths in graphs as answers. Whenever we must traverse a network or
send someone or something through a network between two points in a fast, cheap, or reliable
way, it is likely that solving a shortest path problem will provide the optimal solution. Other
applications of shortest path computations include (but are by no means limited to) network optimization
[GN67, BMW89], packet routing [SS80], image segmentation [CK96, MB98, FUS+98],
computer-assisted surgery [MDMCM01], computer games [Sto99], DNA analysis [Sch98], injection
molding [Joh97], postman problems [EJ73], operator scheduling [BOR80], production
planning [Kle63, GN64, Whi69, Elm77, LN94], re-allocation of resources [Wri75], approximation
of piecewise linear functions [II86, p. 33], and VLSI physical design [CKL97]. Furthermore,
the countless optimization problems that can be solved by deterministic dynamic programming
[CLRS01, Chapter 15] (Knapsack, for example [Fri76]) can also be solved by a shortest
path algorithm (on a very large graph) by identifying the stages of the dynamic program with
the nodes of an acyclic, directed network. The shortest path problem also has a deep connection
to the minimum cost flow problem, which is an abstraction for various shipping and distribution
problems, the minimum weight perfect matching problem, and the minimum mean-cycle problem. More
sophisticated algorithms for combinatorial optimization problems such as network flows often
need a shortest path subroutine.
1.2.1 Classical Results
Methods to find a shortest path were discovered and analyzed already in the late 50s and early
60s by Bellman [Bel58], Bock and Cameron [BC58], Caldwell [Cal61], Dantzig [Dan57, Dan60],
Dijkstra [Dij59], Floyd [Flo62], Ford [For56], Fulkerson [FF58], Hu [Hu67], Klee [Kle64], Leyzorek,
Gray, Johnson, Ladew, Meaker, Petry, and Seitz [LGJ+57], Minty [Min57, Min58], Moore
[Moo59], Mori and Nishimura [MN67], Parikh [Par60], Rapaport and Abramson [RA59], Shimbel
[Shi53, Shi55], Warshall [War62], Whiting and Hillier [WH60], and probably others.⁵ Despite the
exponential growth of computing research, many of their algorithms are still in use today in one
form or the other, most prominently Dijkstra's algorithm, which literally every computer science
student learns as an undergraduate. Dijkstra's algorithm solves the Single Source Shortest Path
(SSSP) problem: the algorithm starts at one point (the source) and explores the whole graph in
all directions until the distances to all the other nodes are known. For an illustration of the shortest
path tree in a transportation network, see Figure 1.2. If the user is interested in the distance
to one target node only, it is possible to stop the search as soon as this specific target has been
found. Some of the other classical algorithms solve the All Pairs Shortest Path (APSP) problem:
the distances and the shortest paths between all pairs of nodes are computed.
⁵ Due to the large amount of literature, providing an exhaustive list appears to be difficult. The first survey article
reviewing shortest path algorithms was apparently written and published in 1960 [PW60] and complemented in the
same year with four additional solutions [PRB60]. Neither article mentions Dijkstra's algorithm.
Robacker [Rob56] and Gallai [Gal58] (and potentially Egerváry [Ege31]) studied the shortest path problem without
giving an algorithm. Research on the strongly related (but harder) Traveling Salesman Problem started earlier, see for
example [Hel53, Kuh55, DFJ54].
Figure 1.2: The shortest path tree starting at the node representing the city of Los Angeles. Original
by George B. Dantzig [Dan57]. Hence the optimal path is from Los Angeles to Salt Lake
City, then to Chicago, and finally to Boston.
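The single-source search with early termination described in this section can be sketched as follows. This is a minimal illustration, not the formulation of any of the cited papers: it assumes non-negative edge weights and an adjacency-list dictionary, and the names dijkstra and toy_graph are hypothetical.

```python
import heapq

def dijkstra(graph, source, target=None):
    """Return final distances from source for all settled nodes.

    graph maps each node to a list of (neighbor, weight) pairs with
    non-negative weights (an assumption of this sketch). If target is
    given, the search stops as soon as the target is settled, so only
    a part of the graph may have been explored.
    """
    dist = {source: 0}          # tentative distances
    settled = set()             # nodes whose distance is final
    queue = [(0, source)]       # priority queue of (distance, node)
    while queue:
        d, u = heapq.heappop(queue)
        if u in settled:
            continue            # stale queue entry, already settled
        settled.add(u)
        if u == target:
            break               # early termination for point-to-point use
        for v, w in graph.get(u, []):
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                heapq.heappush(queue, (nd, v))
    return {u: dist[u] for u in settled}

# Hypothetical toy input: the direct edge a-c (weight 5) is not part of
# any shortest path, because the detour via b has total weight 2.
toy_graph = {
    "a": [("b", 1), ("c", 5)],
    "b": [("a", 1), ("c", 1)],
    "c": [("a", 5), ("b", 1)],
}
```

Calling dijkstra(toy_graph, "a") settles every node, whereas dijkstra(toy_graph, "a", "b") stops once b is settled and never assigns a final distance to c.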
The importance of the problem and the numerous applications have stimulated research efforts
for more than fifty years.⁶ A large fraction of these efforts targets the shortest path problem
in transportation networks, since route planning is arguably one of the most important applications
of shortest path algorithms. Intelligent navigation systems know the current location using
GPS and guide cars along the shortest route to minimize travel time, distance, or fuel and energy
consumption. Some of these systems also react to changing traffic conditions and, in the future,
cars may communicate with each other and negotiate routes to regulate the overall traffic flow and
manage congestion. The end users are often interested in trip planning, which means that they
want to know the distance and the shortest, cheapest, or most reliable path between two specific
points.
⁶ A search for "shortest path" on Google Scholar (scholar.google.com), a search engine for scholarly literature,
returns more than 150,000 results. Hu [Hu71] states: "This is an area in which people keep writing papers."
1.2.2 Point-to-Point Shortest Path Queries
We ask for the distance between two points of a network. Recall that Dijkstra's algorithm solves
the Single Source Shortest Path (SSSP) problem by exploring the whole graph starting from the
source until the distance to all other nodes is known. Dijkstra's algorithm can be stopped prematurely,
as soon as the target has been found. Another improvement to decrease the running
time is to start a search from both the source and the target and stop when the two searches
meet [Nic66, Mur67b, Poh71, SdC77, dC83]. Still, Dijkstra's algorithm may explore the whole
graph. A complete exploration is too expensive. The objective is to develop a method that is much
faster than Dijkstra's algorithm. In return, the method is allowed to pre-compute a data structure
that, later, assists shortest path computations, called queries. We thus assume that the graph is
known some time before the first query is asked.
Shortest path query processing resembles a typical data base problem: create an index by
materializing information to speed up certain queries [Han87, SFG97]. One strategy would be
to precompute the result for every possible query. This is expensive in terms of time and space;
storing all the results may be prohibitive since the number of possible queries is quadratic in the
number of nodes. No precomputation, the other extreme, is too slow at query time. We aim for
a mid-point, a good compromise, between no precomputation and complete processing at the
time of query and complete precomputation and a simple look-up at the time of query [AJ89].
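The two extremes of this tradeoff can be made concrete with a small sketch. It assumes an unweighted graph given as adjacency lists; the class names NoPrecomputation and FullPrecomputation and the toy input path_graph are hypothetical and serve only as illustration.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

class NoPrecomputation:
    """Stores only the graph; every query runs a fresh search."""
    def __init__(self, adj):
        self.adj = adj

    def query(self, s, t):
        return bfs_distances(self.adj, s).get(t)

class FullPrecomputation:
    """Stores one distance per ordered pair; a query is a table look-up."""
    def __init__(self, adj):
        self.table = {s: bfs_distances(adj, s) for s in adj}

    def query(self, s, t):
        return self.table[s].get(t)

    def stored_entries(self):
        # Quadratic in the number of nodes for a connected graph.
        return sum(len(row) for row in self.table.values())

# Hypothetical toy input: the path 0 - 1 - 2 - 3.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
```

Both objects answer the same queries; they differ only in where the work is done. On the four-node path the full table already holds 16 entries, which illustrates why storing all results becomes prohibitive for large graphs.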
Shortest path query processing is an integral part of many applications, in particular Geographic
Information Systems (GIS) and intelligent transportation systems [JHR96]. A challenge
for traffic information systems or public transportation systems is to process a vast number of customer
queries on-line while keeping the space requirements as small as possible [Zar08]. Transportation
planning systems and GIS form arguably the most established application scenarios for
shortest path algorithms. The following sections list, without being exhaustive, other motivational
examples of contemporary applications, where computing short paths is a key component
and where a method that allows for efficient shortest path queries may potentially speed up the
total computation significantly.
Traffic Simulations
Two objectives of traffic simulations [HP58, CWM94, SBA+95, HLW98, PW99, EW03, RN04a,
TE09, TKE+09] are to forecast future traffic patterns and to predict the consequences of certain
changes to the road network. While sometimes the consequences are easily predictable and intuitively
clear, there are paradoxical situations where closing a road actually improves the overall
traffic situation [Bra68]. Simulations can help to identify these counterproductive roads. Simulation
results may also help in urban planning [SG67, SL67, RTMS05, Bat08] to reason about the
economic and social impact of building new roads. With realistic estimates of population mobility
and parameterized models for simulating the progress and transmission of a disease, simulations
may improve predictions in public health and epidemiology [EGK+04, MPN+05, Ves09].
Network-based simulators assume that cars drive along shortest paths. Within the simulator,
at each virtual time step, entities are routed one step further from the source to the destination
along a shortest path. The simulation requires the computation of a large number of shortest
paths [ZKM97, BBJ+02, Hol04, RN04a, BG07, TE09, TKE+09]. Therefore, various simulators
can potentially reduce the total running time by exploiting a fast query processing method. Since
in the real world drivers rarely use the exact shortest path, even a method returning approximate
shortest paths may be of help.
Image Segmentation
Image segmentation [CK96, MB98, FUS+98] is an integral part of image processing applications
such as accident disposal, medical image analysis, and photo editing. The image segmentation
problem is to group together neighboring pixels whose properties are coherent. The grouping
process often relies on shortest path computations. Object surfaces are in some sense continuous.
Continuous shortest paths are computed with the fast marching method [Tsi95, Set96, HPCD96],
which can be seen as a continuous version of Dijkstra's algorithm. However, since the discrete
lattice [Bes74] is the standard reconstruction of image data, discrete algorithms are also used in
many applications. Graph-based algorithms⁷ consider the image as a graph in which each pixel is
a node, connected by an edge to each of the 4 (or 8 or more) neighboring pixels. An important
part is to assign appropriate weights to the edges, based on the image potential of the two pixels
and their Euclidean distance [Bor84, TM92].
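The pixel-graph construction can be sketched as follows. The weight formula used here (step length times the mean potential of the two pixels) is only an assumption chosen for illustration, not the weighting of [Bor84, TM92]; the function name grid_graph and its parameters are likewise hypothetical.

```python
import math

def grid_graph(potential, connectivity=4):
    """Build a weighted pixel graph from a 2D list of image potentials.

    Each pixel (row, col) becomes a node with an edge to each of its 4
    (or 8) neighbors. Illustrative weight: Euclidean step length times
    the mean potential of the two pixels (an assumption of this sketch).
    Returns a dict mapping each node to a list of (neighbor, weight).
    """
    rows, cols = len(potential), len(potential[0])
    offsets = [(0, 1), (1, 0), (0, -1), (-1, 0)]
    if connectivity == 8:
        offsets += [(1, 1), (1, -1), (-1, 1), (-1, -1)]
    graph = {}
    for r in range(rows):
        for c in range(cols):
            edges = []
            for dr, dc in offsets:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    step = math.hypot(dr, dc)  # 1 or sqrt(2)
                    mean_potential = (potential[r][c] + potential[nr][nc]) / 2
                    edges.append(((nr, nc), step * mean_potential))
            graph[(r, c)] = edges
    return graph
```

The result uses plain adjacency lists of (neighbor, weight) pairs, so any discrete shortest path routine that accepts this representation can be run on it directly.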
The graph-based intelligent scissors algorithm [MB98] heavily relies on shortest paths. The
user provides two endpoints; the algorithm computes the result of the corresponding point-to-
point shortest path query and the resulting path is interpreted as a part of the object boundary.
In other words, given two endpoints of a contour, the algorithm determines the maximum likelihood
contour connecting them. The edge weights are probabilities and, by taking the negative
logarithm, the algorithm can just sum up the weights on the path to retrieve the optimal contour.⁸
For boundary computations in 3 dimensions, the intelligent scissors algorithm must be
adapted [KH05, Gra06, Gra08]. A boundary can be found by combining many shortest paths until
a closed surface is obtained [KH05].
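The negative-logarithm step used for the two-dimensional contour can be written out as a short derivation. The notation is introduced here for illustration only: P denotes a path between the two endpoints and p_e in (0, 1] the probability attached to edge e, with the likelihood of a path assumed to be the product of its edge probabilities. The operator \operatorname* requires the amsmath package.

```latex
\[
  \Pr[P] = \prod_{e \in P} p_e
  \quad\Longrightarrow\quad
  -\log \Pr[P] = \sum_{e \in P} \bigl(-\log p_e\bigr),
\]
\[
  \operatorname*{arg\,max}_{P} \prod_{e \in P} p_e
  \;=\;
  \operatorname*{arg\,min}_{P} \sum_{e \in P} w(e),
  \qquad w(e) := -\log p_e \ge 0 .
\]
```

Because the logarithm is monotone, the most likely contour is exactly a shortest path with respect to the weights w, and since these weights are non-negative, algorithms that require non-negative edge weights remain applicable.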
In computer vision [Gra08, PC06], shortest path algorithms on weighted graphs have found
numerous applications other than segmentation such as centerline finding [BKS01, BSB+00], radiation
therapy [CLW08], mesh morphing [LDSS99, ADG+06], video summarization [PMT03],
and finding roads and trails on satellite images [FTW87, HSP92, MZ93, SMR97, MB08].
Drug Target Identification
Shortest paths in interaction graphs are important in systems biology. Signaling paths for example
are routes along which one molecule can affect another one [KvK09, Ari00]. The average path
length already reveals valuable information about a cell or a body.
In a sense, the average path length in a network is an indicator of how readily information
can be transmitted through it. Thus, the small world property observed
in biological networks suggests that such networks are efficient in the transfer of biological
information: only a small number of intermediate reactions are necessary
for any one protein/gene/metabolite to influence the characteristics or behavior of another.
[MV06, Section 3, p. 8]
⁷ We note that image segmentation based on minimum spanning trees in these graphs is also a suitable approach
[Zah71, FH04].
⁸ This likelihood transformation can be applied to Markov chains [MT93] in general [Met07, Chapter 6]. We may
interpret a Markov chain with state set $S$, described by its (stochastic) transition matrix $P = (p_{ij})_{i,j \in S}$, as a directed
graph $G = (V, E)$. The edge weights can be set with a likelihood approach, where the edge $(i, j) \in E$ has weight
$w_{LL}(i, j) := -\lg p_{ij}$ and the path length is the negative log-likelihood. The shortest path between two states with
respect to the weights $w_{LL}(i, j)$ is the path with the highest probability (reaction pathway) and the result of a point-to-point
distance query yields a lower bound on the transition probability between the corresponding states. For rather
long paths, the likelihood approach may produce misleading results. The Free Energy Approach tries to overcome this
problem. We assume that the stationary distribution is unique, $\pi = (\pi_i)_{i \in S}$. Now, the discrete free energy of a state
is $F_i := -\lg \pi_i > 0$. The edge weights are constructed such that the shortest path between two states overcomes
the lowest discrete free energy barriers. This yields weights $w_{FE}(i, j) := |F_j - F_i|$. Assigning likelihood or free
energy edge weights may be a suitable model for social networks as well. The probability of favor serves as a possible
interpretation. Here, the weights of the edges adjacent to one node need not necessarily sum up to 1.
The average path length is related to a graph's Wiener index⁹ [Wie47, Rou86], which could potentially
be approximated using random sampling and (approximate) point-to-point shortest path
computations.
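A minimal sketch of such a sampling estimator is given below. It assumes a connected, unweighted graph and answers each sampled query exactly by breadth-first search; an approximate point-to-point query method could be substituted for bfs_distance. All function names and the toy inputs are hypothetical.

```python
import random
from collections import deque

def bfs_distance(adj, s, t):
    """Hop distance between s and t (None if t is unreachable)."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return dist[u]
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None

def wiener_index_exact(adj):
    """Sum of distances over all unordered node pairs (connected graph)."""
    nodes = list(adj)
    total = 0
    for i, s in enumerate(nodes):
        for t in nodes[i + 1:]:
            total += bfs_distance(adj, s, t)
    return total

def wiener_index_estimate(adj, samples, rng):
    """Estimate the Wiener index from uniformly sampled node pairs.

    The mean sampled distance is scaled by the number of unordered
    pairs, n * (n - 1) / 2. Assumes the graph is connected.
    """
    nodes = list(adj)
    n = len(nodes)
    total = 0
    for _ in range(samples):
        s, t = rng.sample(nodes, 2)   # two distinct nodes
        total += bfs_distance(adj, s, t)
    return total / samples * n * (n - 1) / 2

# Hypothetical toy inputs: a path on four nodes and the complete graph K4.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
k4 = {u: [v for v in range(4) if v != u] for u in range(4)}
```

The estimator needs one point-to-point query per sample, so its cost is governed by the query time rather than by an all-pairs computation; dividing the estimate by the number of pairs yields the average path length.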
The next potential application leverages information on paths other than their average length.
We wish to know which proteins, genes, and metabolites are more important and powerful than
others, meaning that they have a strong influence on others. These may be suitable targets for
medication [VLL00]. Various graph-theoretic centrality measures correlate to some extent with the
importance of a node. The degree centrality for example counts the number of neighbors of a
node. Intuitively, the more interaction partners a protein has, the higher its potential to influence
others [Fla63]. More complex measures distinguish purely local effects (such as the number of
interaction partners) from global organizational effects [WS03]. For example, if a protein is on
many reaction pathways or signaling paths, it has some potential for control [Fre77]. It can stop
or at least slow down the reaction by breaking the chain of interactions, due to the fact that it lies
between two indirectly interacting proteins. The number of shortest paths that a node is part
of is called its betweenness centrality [Bav48, Shi53, Sha54, Fre77]. Betweenness accounts for
direct and indirect influences of proteins at distant network sites and hence it allows one to relate
local network structure to global network topology. Closeness centrality [Bav50, Bea64, Sab66]
measures how far away a node is from all the other nodes. Degree centrality counts the mere
number of interaction partners.
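The three measures can be sketched for a small unweighted, undirected, connected graph as follows. Two choices here are assumptions of this sketch rather than definitions taken from the text: closeness is normalized as (n - 1) divided by the sum of distances, and betweenness counts, for every pair of other nodes, the fraction of their shortest paths that pass through the node, accumulated with the well-known dependency scheme due to Brandes. All function names are hypothetical.

```python
from collections import deque

def degree_centrality(adj):
    """Number of neighbors of each node."""
    return {v: len(adj[v]) for v in adj}

def closeness_centrality(adj):
    """(n - 1) / (sum of hop distances to all other nodes); connected graph."""
    n = len(adj)
    result = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        result[s] = (n - 1) / sum(dist.values())
    return result

def betweenness_centrality(adj):
    """Fractional shortest-path betweenness, one BFS per source node."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # nodes in order of discovery
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}    # dependency of s on each node
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # In an undirected graph every pair is counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}

# Hypothetical toy input: the path a - b - c.
path3 = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
```

On the three-node path, the middle node b has the highest value under all three measures, which matches the intuition that it controls the only connection between a and c.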
An important application of network centrality in pharmaceutical research could be drug target
identification. One of these potential target genes is p53 [VLL00]. The loss of p53 function is very
damaging: p53 is among the genes most likely to be mutated in cancers. In fact, p53 function loss
occurs in nearly all human cancers. It turns out that p53 corresponds to a highly connected node
in the interaction graph.
Centrality can also be used to predict protein essentiality [JOB03, JBIH05]. Interesting proteins are
those with high betweenness-centrality, yet low local connectivity. Their low connectivity would
imply that they are unimportant, but their high betweenness suggests that these proteins may have
a global impact, acting as important links between modules. The removal of a protein can have
different phenotypic effects including lethal or non-lethal effects and a slow-down of growth.
There is a positive correlation between lethality and connectivity [JMBO01]. The most highly
connected proteins in the cell are the most important ones for its survival. The more essential a
gene (or its associated protein) is to a pathogen or to a cancerous cell, the more attractive it is as
a drug target [MV06, Section 4]. So far, however, there is not much evidence that for two nodes
with the same number of interactions the node with higher betweenness centrality is significantly
more important [JBIH05]. Currently, degree centrality serves as a good indicator. Nevertheless:
In most of the studies, [...] the centrality score of a node was found to be indicative of
its likelihood to be essential. In particular, this appears to be true for degree centrality,
betweenness centrality, and eigenvector centrality measures. [MV06]
The computation of centrality indices has been studied [Bra01] and since exact sequential methods
are rather slow, there are efforts to parallelize the computations [MEJ+09] and to find approximate
solutions [EW04, BKMM07]. Since the biological networks are often sampled and have some
errors, centrality measures are not exact anyway [CV03a]. Depending on the application, approximations
may be sufficient. Centrality algorithms rely on the computation of functional pathways
and they could thus benefit from fast (approximate) path computations [RMJ06].
⁹ In chemistry, the Wiener index of a molecule is the sum of all shortest path lengths between non-hydrogen atoms
in the graph defined by the structure of the molecule. In general, the Wiener index of a graph is the sum of all shortest
path lengths among all nodes.
Community Detection
Complex networks are often huge and thus difficult to analyze. One way to obtain an understanding
of complex networks is to decompose the network at hand into related components,
communities, and clusters. Several algorithms for clustering and community detection have been
proposed [GN02, NG04, New04, CNM04, APF+06, Dji06]. The algorithm by Girvan and Newman
[GN02, NG04] has been applied successfully to a variety of networks, including several
social and collaboration networks, metabolic networks, and gene networks. Their algorithm iteratively
removes edges with high betweenness centrality. If a network contains highly-connected
communities that are only loosely connected by a few edges between clusters, then all shortest
paths between different communities must go along one of these few edges, which will therefore
have high edge betweenness. Removing these edges separates the communities from each other.
The method works very well for small graphs but it does not scale due to its high computational
demand [BLM+06, Section 7.1.3]. Again, computing the betweenness centrality is the bottleneck.
If there are only few edges between different clusters, any approximate shortest path query
method would arguably also detect these edges. Approximating betweenness centrality or just
finding edges with high betweenness centrality can potentially be sped up using fast point-to-point
shortest path queries¹⁰ [RMJ07].
Social Search
The sheer size of the web (currently, the web consists of billions of pages) renders the search
for relevant information very challenging. Search engines are expected to find the needle in
the haystack. The search interface is supposed to be kept simple and the average user is not
entering much information; the search engine must find relevant information without knowing
what the user actually wants (sometimes he may not even know it himself or he may not be able
to express it appropriately). Imagine that a search engine would know basically everything about
the user (this scenario may actually already be reality). The query term combined with what a
user's friends were looking for and which results they liked could enable the search engine to
make a well-educated guess on what the user would want to see. Almost like a recommender
system [GZC+09], the engine can rank the documents matching the search query term just for one
user, based on his interests: a personalized search [JW03]. It would aggregate the ratings for the
pages retrieved and then assign a higher ranking to those pages that people similar to that user, for
example his friends, liked best. To do so, efficiently retrieving proximity information in the social
graph is essential.
Involving context and social connections in ranking is a hot topic for search engines [VFD+07,
YBLS08, SR08, UCDG08, SCK+08, PBCG09] and computing social distances may soon be
an important primitive, both for search engines handling keyword queries and for online stores
recommending items for purchase.
¹⁰ Distance queries are inherent in graph clustering: a constant-factor k-clustering can be computed with t queries
to the distance oracle if and only if a graph k-partition can be computed with t queries to the adjacency matrix of
G [GMMO00].
Social Networking
Shortest paths in social networks seem to be of interest for end users. On the website
oracleofbacon.org, for example, users can enter two names of actors and the database server uses a
breadth-first search (BFS) to find the shortest path between pairs of actors.¹¹ Such a webpage also
exists for mathematicians.¹² In professional networking sites such as LINKEDIN or XING, users
can add their business contacts in order to get in touch with potential clients or employers through
a short chain of personal introductions. The corresponding webservers compute point-to-point
shortest paths in an online setting.
Analogous to the Erdős number project there is a social network based on collaboration on scientific
articles in various fields. Two scientists are considered connected if they have co-authored
one or more scientific papers together [New01]. One may assign weights to these connections
based on how many papers two authors share. Such weightings potentially capture the strength of
a relationship more appropriately. Shortest paths [New01, Sec. A] between scientists may help to
establish new professional contacts by following a sequence of personal referrals. Such systems
were built years before online social networks became popular [KSS97a, KSS97b, Sha97a].
Scientific collaboration networks are also used to enhance communication at conferences. Some
researchers built a system called DEAIEXPLORER [KIK+06] that visualizes how two scientists
standing in front of a screen may know each other (a common affiliation such as that they wrote
a paper together, they both gave a talk at the same conference, they have a common co-author,
or they cite each other's papers), which is supposed to help them communicate. The DEAIEXPLORER
system computes relationships and connections between persons up to distance 4.
All these systems could potentially benefit from fast shortest path query processing.
Message Routing
Forwarding a message from a sender to a receiver through a network is called routing. Many
routing algorithms are variants, in one form or another, of shortest path algorithms that route
packets over a path of minimal cost [SS80]. The cost of an edge may reflect transmission capacity,
latency, congestion, error rates, and other features. On the chosen path, routers must know where
to forward a packet to. These decisions are made based on the information in the packet and the
routing table at the router. It is essential that this information is kept small (compact) while paths
remain short. Research in compact routing addresses this tradeoff. Compact routing focuses on
distances and ignores other influencing factors such as the quality of service provided (see [Hui00,
Chapter 11: Policy Routing]). It is assumed that these factors are abstracted out and aggregated
in the edge weight function. Large routing tables are difficult to cope with [Hui00, Section 9.1.2:
Routing Table Explosion, pp. 202–203].
As more and more networks get connected, the memory required for storing the
routing tables grows. This memory requirement varies a lot with the routing protocol and
with the router's architecture. In fact, the problem may appear in multiple ways. The
phrase "routing table explosion" is merely a catchall term for all the problems posed
by the manipulation of very large routing tables.
The tradeoff between routing table size and route quality is basically the distributed version
of shortest path query processing. Each router gets some part of the index, based on which
it has to make best-effort decisions. The routing tables should be small and routing decisions
quick [DBCP97] while paths remain short.
11
Written by Patrick Reynolds on http://www.oracleofbacon.org/how.php (as of December 2009).
12
See http://www.ams.org/mathscinet/collaborationDistance.html.
Distance approximations for networks may be of interest for end users as well. If a sender
can choose among different destinations, for example before downloading a large file from one of
several replicated servers holding the same data, it is beneficial to predict the round trip time for
each of the servers prior to actually communicating. This proximity estimate helps in choosing the
optimal server and connection [NZ02, DCKM04].
Conclusion
The numerous applications of point-to-point shortest path query processing make the following
claim easy to believe. The organizers of the DIMACS implementation challenge on shortest paths,
Demetrescu, Goldberg, and Johnson [DGJ08], state that
... shortest path problems are among the most fundamental combinatorial optimiza-
tion problems with many applications, both direct and as subroutines in other com-
binatorial optimization algorithms. Algorithms for these problems have been studied
since the 1950s and still remain an active area of research.
This thesis investigates the tradeoffs between pre-computation time, storage, query time, and
approximation quality both from a theoretical and a practical point of view.
1.3 Contribution
Theory and Practice. Both of these English words come from the Greek language,
and their root meanings are instructive. The Greek θεωρία means seeing or viewing,
while πρᾶξις means doing, performing. [...] Theory and practice are not mutually
exclusive; they are intimately connected. They live together and support each other.
Donald Knuth [Knu89]
This thesis aims to contribute to both the theoretical and the practical side of the shortest
path and distance query problems. The main contributions are summarized in the following. The
precise statements are deferred to subsequent chapters.
Theoretical
Space lower bound
The first main contribution is a theoretical analysis of the space requirements for data structures
that assist shortest path queries. Given is a graph with n nodes. A preprocessing algorithm
computes a data structure of size S. If a query algorithm computes approximate shortest paths with
multiplicative stretch at most α, it must access the data structure at a certain number t of
locations. This three-way tradeoff between size S, stretch α, and query time t is analyzed
in Chapter 4.
This result was achieved in joint work with Elad Verbin and Wei Yu. An extended abstract was
published in the proceedings of the 50th Annual Symposium on Foundations of Computer Science
(FOCS) [SVY09].
Time and space upper bounds for power-law graphs
One class of graphs that appears to be very common in real-world networks is the class of
power-law graphs: the node degrees obey a power law, which roughly means that there are many nodes
with few neighbors and only a few nodes with many neighbors. One property of these graphs
is that shortest paths often pass through nodes with many neighbors. This property allows for
efficient data structures. Routing through nodes with large degrees is a natural and very common
heuristic. Empirical evidence indicates that it is also a powerful heuristic in practice. We make
an attempt to bridge the gap between theory and practice by a rigorous efficiency proof of this
heuristic for certain random power-law graphs. Details are in Chapter 5.
This result was achieved in joint work with Wei Chen, Shang-Hua Teng, and Yajun Wang.
An extended abstract was published in the proceedings of the 23rd International Symposium
on Distributed Computing (DISC) [CSTW09a]. The full version is available as a technical
report [CSTW09b].
Practical
A third contribution of this thesis is an efficient practical method (with theoretical guarantees)
to compute approximate shortest paths in undirected graphs. The preprocessing step consists of
computing the analogue of a Voronoi dual for graphs and the query step consists of searching a
shortest path in the dual and refining it in the original graph (primal). Compared to many existing
practical methods, the method described in Chapter 6 computes approximate shortest paths but it
also works for graphs other than road networks (such as complex networks).
This result was achieved in joint work with Shinichi Honiden, Michael E. Houle, and Martin
Wolff. An extended abstract was published in the proceedings of the 6th Annual International
Symposium on Voronoi Diagrams in Science and Engineering (ISVD) [HHSW09]. An outdated
version is available as a technical report [SHWH08].
1.4 Outline
This thesis is organized as follows. Chapter 2 contains preliminaries such as various definitions
from graph theory and a review of shortest path algorithms for the single source and the all pairs
shortest path problems. Related work considering the shortest path query problem is reviewed
in Chapter 3. Chapters 4, 5, and 6 contain the main contributions of this thesis, as outlined in
Section 1.3.
I never commit to memory anything that
can easily be looked up in a book.
Albert Einstein (1879–1955)
2
Preliminaries
This chapter consists of (1) a list of necessary definitions from graph theory and algorithmics, and
(2) a review of work related to the shortest path problem in graphs. A detailed review of shortest
path query processing is postponed to Chapter 3.
We use the following convenient notation throughout the thesis. For $n \in \mathbb{N}^+$, we define
$[n] := \{1, 2, \ldots, n\}$. Unless stated otherwise, lg denotes the logarithm with base 2. poly(x) means
a polynomial in x of unspecified constant degree. We write modular congruences by $p \equiv_q r$ and
$p = r \bmod q$.
2.1 Graphs
A graph is a collection of entities (nodes) linked by some relationship (edges).
Definition 1. A graph G is a pair G = (V, E) consisting of a set of nodes V and a set of edges
$E \subseteq \binom{V}{2}$.
Nodes are also referred to as vertices.
Definition 2. A graph $G' = (V', E')$ is a subgraph of G = (V, E), if $V' \subseteq V$ and $E' \subseteq E$. An
induced subgraph is a subset of the vertices of a graph together with any edges whose endpoints
are both in this subset.
Definition 3. Two nodes $u, v \in V$ of a graph G = (V, E) are called adjacent if there is an edge
between u and v, that is, $\{u, v\} \in E$. For a graph G = (V, E), the set of neighbors of a vertex v,
denoted by $\Gamma_G(v)$, is defined as the set of nodes adjacent to v, that is, $\Gamma_G(v) := \{u : \{u, v\} \in E\}$.
For a set of nodes $U \subseteq V$, let $\Gamma_G(U) := \bigcup_{u \in U} \Gamma_G(u)$.
Definition 4. For a graph G = (V, E), the degree of a vertex v, denoted by $\deg_G(v)$, is defined
as the number of its neighbors, that is, $\deg_G(v) := |\Gamma_G(v)|$.
If the graph G is clear from the context, we omit subscripts. For example, we write the set of
neighbors and the degree of a node by $\Gamma(v)$ and $\deg(v)$, respectively.
A graph is called r-regular if all vertices have degree r.
The sum of all node degrees divided by two equals the number of edges:
$\sum_{v \in V} \deg(v) = 2 \cdot |E|$.
CHAPTER 2. PRELIMINARIES
In this thesis, if not stated otherwise, we consider undirected graphs (as in Definition 1). In
some networks, relationships between entities are inherently directed, for example one-way streets
in road networks or hyperlinks in the World Wide Web. Directed graphs can be used to model these
networks.
Definition 5. A directed graph (digraph) D is a pair D = (V, A) consisting of a set of nodes V
and a set of edges (also called arcs) $A \subseteq V \times V$.
For digraphs, we may distinguish between in-neighbors and out-neighbors.
Definition 6. For a digraph D = (V, E), the set of in-neighbors of a vertex v is defined as
$\Gamma^-_D(v) := \{u : (u, v) \in E\}$, and its set of out-neighbors is defined as $\Gamma^+_D(v) := \{u : (v, u) \in E\}$.
We define the neighbors of a vertex v as the union of the set of in-neighbors and the set of
out-neighbors, $\Gamma_D(v) := \Gamma^-_D(v) \cup \Gamma^+_D(v)$. Its in-degree is $\deg^-_D(v) := |\Gamma^-_D(v)|$ and its out-degree is
$\deg^+_D(v) := |\Gamma^+_D(v)|$.
Note that $\deg_D(v) \le \deg^-_D(v) + \deg^+_D(v)$ and equality does not necessarily hold. We may
again omit the subscript if D is clear from the context.
Weighted graphs also capture relationships of different cost, length, and strength. In what
follows, we only consider edge-weighted graphs (as opposed to node-weighted graphs). We may
still restrict the edge weights to 1, which yields an unweighted graph.
Definition 7. An edge-weighted graph (digraph) is a graph (digraph) associated with a weight
function $w : E \to \mathbb{R}$.
An edge weight can be interpreted as representing a value in the real world such as distance,
time, cost, penalty, or loss. In the following, if not stated otherwise, we shall only consider weight
functions with positive range, that is, $w : E \to \mathbb{R}^+$. This explicitly excludes edges with weight
0. This is no restriction, since we may just contract (definition below) edges with weight 0, which
yields a single vertex instead.
Definition 8. A path in G from a node $u_0$ to a node $u_h$ is a sequence of (undirected or directed)
edges $((u_0, u_1), (u_1, u_2), \ldots, (u_{h-1}, u_h))$. We also interpret such a path as a node sequence
$(u_0, u_1, \ldots, u_h)$, as a node set $\{u_0, u_1, \ldots, u_h\}$, or as a subgraph, when this simplifies the
notation. The length of a path P is the sum of its edge weights $\ell(P) := \sum_{i=0}^{h-1} w(u_i, u_{i+1})$. The
hop-length of a path P is the number of edges h on P.
Note that for any path of an unweighted graph, the hop-length and the path length are equal.
Definition 9. A subpath $P'$ of a path $P = (u_0, u_1, \ldots, u_h)$ is a path constructed from a
subsequence of nodes $P' = (u_i, u_{i+1}, \ldots, u_j)$, $0 \le i < j \le h$. A simple path is a path without repeated
vertices. Two paths are called vertex-disjoint if they do not have any vertices in common except
for, possibly, the endpoints.
Distances in graphs are computed based on the shortest path metric.¹
1
This connection to metrics is one reason to restrict the range of the edge-weight function $w(\cdot)$ to $\mathbb{R}^+$ and the
graphs to undirected. This thesis does not heavily rely on the general concepts of a metric space but since metrics
are inherently designed to measure distance, we briefly outline the basic definition. A metric space is a set for whose
elements a distance (called a metric) is defined. This distance metric is supposed to satisfy three conditions:
1. d(x, y) = 0 if and only if x = y,
Definition 10. Let $\mathcal{P}_G(u, v)$ denote the set of paths from u to v in G. The distance $d_G(u, v)$
between two nodes u, v is the length of a shortest path from u to v; that is,
$d_G(u, v) = \min_{P \in \mathcal{P}_G(u, v)} \ell(P)$.
If $\mathcal{P}_G(u, v) = \emptyset$ then $d_G(u, v) := \infty$. The distance $d_G(u, V')$ between a node u and a subset of
the nodes $V' \subseteq V$ is defined as $d_G(u, V') := \min_{v \in V'} d_G(u, v)$. The distance between two subsets of
the nodes $U', V' \subseteq V$ is defined as $\min_{u \in U'} d_G(u, V')$.
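For positive edge weights, the distance between two nodes can be computed with Dijkstra's algorithm. The following Python sketch is only an illustration of the definition of distance, using a standard textbook algorithm; the names dijkstra_distance and undirected, the adjacency-list layout, and the toy graph are assumptions made for this example.

```python
import heapq
import math


def dijkstra_distance(adj, source, target):
    """Return the distance from source to target, or math.inf if no path exists.

    adj maps each node to a list of (neighbor, weight) pairs; all weights
    are assumed to be positive.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, math.inf):
            continue  # outdated heap entry, a shorter path to u was found
        for v, w in adj.get(u, ()):
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.inf  # empty set of paths: the distance is infinite


def undirected(edges):
    """Build a symmetric adjacency list from (u, v, weight) triples."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    return adj


# Hypothetical example: the direct edge s-t is longer than the detour via a.
graph = undirected([("s", "a", 1.0), ("a", "t", 1.0), ("s", "t", 3.0), ("x", "y", 1.0)])
```

Returning math.inf for unreachable targets mirrors the convention that the distance is infinite when the set of paths is empty.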
If unique shortest paths are needed, one may perturb the edge weights by adding random
infinitesimal weights² [MVV87, EHP04].
Definition 11. An undirected graph G = (V, E) is connected if $d_G(u, v)$ is finite for all $u, v \in V$.
A directed graph $D = (V', A)$ is connected if for all $u, v \in V'$ at least one of $d_D(u, v)$ and
$d_D(v, u)$ is finite. A directed graph $D = (V', A)$ is strongly connected if for all $u, v \in V'$ both
$d_D(u, v)$ and $d_D(v, u)$ are finite.
The (strongly) connected components of a graph can be extracted efficiently [CLRS01, Section 22.5].
Definition 12. A cycle is a path where both endpoints coincide. A cycle is thus a node sequence
$(u_0, u_1, \ldots, u_h)$ for which $(u_i, u_{i+1}) \in E$ for all $i \in \{0, 1, \ldots, h-1\}$ and $u_0 = u_h$. The length of
a cycle is defined as the number of edges $h \ge 3$.
A cycle of length 3 is also called triangle.
Definition 13. In a graph G = (V, E), the open ball with radius r around $v \in V$ is defined by
$B^{r}_G(v) := \{u \in V : d_G(v, u) < r\}$.
Accordingly, the closed ball with radius r is defined by $B^{r+}_G(v) := \{u \in V : d_G(v, u) \le r\}$.
The open (closed) ball relative to a subset of the nodes $U \subseteq V$ is defined as the open (closed)
ball with radius d(v, U).
Definition 14. Define the multiplicative stretch of a path P from s to $t \ne s$ relative to the distance
from s to t as the ratio $\ell(P)/d_G(s, t)$ and define the additive stretch as the difference $\ell(P) - d_G(s, t)$.
The stretch of a path is also called distortion.
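As a small worked example of both notions of stretch, the following Python sketch evaluates a path against a known distance. The helper names path_length and stretch, the frozenset-keyed weight table, and the numbers are illustrative assumptions; the true distance is passed in rather than computed.

```python
def path_length(path, weight):
    """Sum of the edge weights along a node sequence (u_0, ..., u_h)."""
    return sum(weight[frozenset((u, v))] for u, v in zip(path, path[1:]))


def stretch(path, weight, distance):
    """Return (multiplicative stretch, additive stretch) of the given path.

    distance is the true shortest-path distance between the endpoints and
    must be positive, which holds whenever the endpoints differ.
    """
    length = path_length(path, weight)
    return length / distance, length - distance


# Hypothetical undirected weights: s-a-t has length 2, the edge s-t has length 3.
weight = {
    frozenset(("s", "a")): 1.0,
    frozenset(("a", "t")): 1.0,
    frozenset(("s", "t")): 3.0,
}
```

With these weights the distance from s to t is 2, so the direct edge has multiplicative stretch 1.5 and additive stretch 1, while the path through a has stretch 1 and 0, respectively.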
2. symmetry: d(x, y) = d(y, x), and
3. the triangle inequality: $d(x, z) \le d(x, y) + d(y, z)$.
These conditions also imply non-negativity $d(x, y) \ge 0$. Let $\ell_p^{\mathrm{dim}}$ denote the Euclidean space of dimension dim,
denoted by $\mathbb{R}^{\mathrm{dim}}$, equipped with the $\ell_p$ norm. For $1 \le p < \infty$, the $\ell_p$ norm on a dim-dimensional space is defined as
$\|x\|_p := \sqrt[p]{\sum_{i=1}^{\mathrm{dim}} |x_i|^p}$, set $\|x\|_\infty := \max_i |x_i|$.
2
The isolation lemma [MVV87] states that, for a finite set of distinct weights $w(e_1), w(e_2), \ldots, w(e_{|E|})$, any
collection of subsets of weights has a unique minimum with probability at least 1/2. Edge weights can be made unique by
adding infinitesimal weights. Define $w'(e) := w(e) + \varepsilon(e)$, where $\varepsilon(e)$ for each edge e is chosen independently at
random from $[n^2] = \{1, 2, \ldots, n^2\}$.
2.1.1 Graph Properties
Definition 15. The diameter diam(G) of a graph G = (V, E) is the maximum distance between
two vertices:
$\mathrm{diam}(G) := \max_{u, v \in V} d_G(u, v)$.
We define the empty graph to have infinite diameter and the graph with one vertex to have zero
diameter. All other graphs have a diameter in $\mathbb{R}^+ \cup \{\infty\}$. We usually abbreviate $\Delta = \mathrm{diam}(G)$
when G is clear from the context.
Definition 16. The radius of a graph G = (V, E) is the least r such that there is a vertex v whose
closed ball $B^{r+}_G(v)$ covers all vertices.
Definition 17 (Girth). The girth of a graph G = (V, E), denoted by g(G), is the length of its
shortest cycle.
A connected undirected graph without cycles is called a tree. A tree with n nodes has exactly
n − 1 edges. An undirected graph without cycles (but not necessarily connected) is called a forest.
Since there are no cycles in trees and forests, these graphs have infinite girth. A subgraph that is a
tree on all nodes is called a spanning tree.
The tree-width of a graph was introduced by Halin [Hal76], but it went unnoticed until it
was rediscovered by Robertson and Seymour [RS86] and, independently, by Arnborg and
Proskurowski [AP89]. The tree-width of a graph is defined as follows.
Definition 18. Let G be a graph, T a tree and let $\mathcal{V} = \{V_t \subseteq V(G) \mid t \in V(T)\}$ be a family of
vertex sets of G indexed by the vertices t of T. The pair $(T, \mathcal{V})$ is called a tree-decomposition of
G if it satisfies the following three conditions:
1. $V(G) = \bigcup_{t \in T} V_t$,
2. for every edge $e \in G$ there exists a $t \in T$ such that both ends of e lie in $V_t$, and
3. if $t, t', t'' \in V(T)$ and $t'$ lies on the path of T between t and $t''$, then $V_t \cap V_{t''} \subseteq V_{t'}$.
The width of $(T, \mathcal{V})$ is the number $\max\{|V_t| - 1 \mid t \in T\}$ and the tree-width tw(G) of G is the
minimum width of any tree-decomposition of G.
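The three conditions of a tree-decomposition can be checked mechanically. The Python sketch below is an illustration only: the names is_tree_decomposition and width and the input layout are assumptions, the code trusts that tree_edges really forms a tree on the bag indices, and the third condition is tested in its equivalent form (for every vertex of G, the tree nodes whose sets contain it are connected in T).

```python
def is_tree_decomposition(vertices, edges, tree_edges, bags):
    """Check the three conditions for vertex sets (bags) indexed by tree nodes.

    bags maps each tree node to a set of graph vertices. That tree_edges
    describes a tree on the keys of bags is assumed, not verified.
    """
    # Condition 1: the union of all bags is exactly the vertex set of G.
    if set().union(*bags.values()) != set(vertices):
        return False
    # Condition 2: both ends of every edge of G lie in a common bag.
    for u, v in edges:
        if not any(u in bag and v in bag for bag in bags.values()):
            return False
    # Condition 3: the tree nodes holding a vertex x form a connected subtree.
    for x in vertices:
        holders = {t for t, bag in bags.items() if x in bag}
        start = next(iter(holders))
        seen, stack = {start}, [start]
        while stack:
            t = stack.pop()
            for a, b in tree_edges:
                for here, there in ((a, b), (b, a)):
                    if here == t and there in holders and there not in seen:
                        seen.add(there)
                        stack.append(there)
        if seen != holders:
            return False
    return True


def width(bags):
    """Width of a decomposition: largest bag size minus one."""
    return max(len(bag) for bag in bags.values()) - 1


# Hypothetical example: the cycle 1-2-3-4-1 split into two triangles.
cycle_vertices = [1, 2, 3, 4]
cycle_edges = [(1, 2), (2, 3), (3, 4), (4, 1)]
good_bags = {"a": {1, 2, 3}, "b": {1, 3, 4}}
good_tree = [("a", "b")]
```

The decomposition in the example has width 2, which matches the fact that cycles are series-parallel graphs.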
Definition 19 (Doubling Dimension). The doubling dimension (also: Assouad dimension [Ass83])
of a graph is the minimum dim such that any ball of radius r can be covered by at most $2^{\mathrm{dim}}$ balls
of radius r/2.
A metric with diameter $\Delta$ and doubling dimension dim has at most $\Delta^{O(\mathrm{dim})}$ points.
Definition 20 (Edge Contraction). In an undirected graph G = (V, E), the contraction of an edge
e = {u, v} with endpoints u and v is the replacement of u and v by a single vertex $u'$ such that
the edges incident to the new vertex $u'$ are the edges other than e that were incident with u or v.
Denition 21 (Graph Minor [RS83]). A graph H is a minor of a graph G if a copy of H can be
obtained from G via repeated edge deletion and/or edge contraction.
Graph Class     | Excluded Minor
trees           | $K_3$
series-parallel | $K_4$
outerplanar     | $K_4$ and $K_{2,3}$
planar          | $K_5$ and $K_{3,3}$
Table 2.1: Examples of minor-free graph classes.
2.1.2 Graph Classes
A bipartite graph is a graph whose vertices can be decomposed into two disjoint sets such that no two
vertices within the same set are adjacent. All forests are bipartite.
Denition 22. A graph G is planar if it can be drawn in the plane such that the edges are repre-
sented by line intervals and do not intersect in their interiors.
There is a famous statement by Leonhard Euler, stating that planar graphs with $n \ge 3$ nodes
have at most $3n - 6$ edges. If a graph only has few edges ($m \le n \cdot \mathrm{poly}(\lg n)$), it is called
sparse. Often, algorithms run faster on sparse graphs. However, sparsity alone does not necessarily
imply that a graph is an easy instance for an algorithm. Besides sparsity, planar graphs have
other special structural properties that allow for efficient algorithms: planar graphs can be cut
into different pieces without cutting too many edges (the planar separator theorem captures this
property, see Theorem 4 in a subsequent section).
Outerplanar graphs are the easiest planar graphs.
Denition 23. An outerplanar graph is a planar graph that can be embedded in the plane such
that all nodes lie on one face.
A planar graph can be efficiently decomposed into outerplanar subgraphs (called hammocks)
[Fre91, Fre95, KPSZ96]. The number of hammocks required induces a hierarchy on the class of
planar graphs.
Often, graphs modeling practical networks are almost planar. The planarity property has
been extended in several ways.
Planar graphs require the existence of an embedding in the plane without any two edges cross-
ing. This can be generalized to orientable surfaces of larger genus (for example genus 0 is a plane
and genus 1 is a torus). If the genus is restricted to a constant, these graphs are called bounded-
genus graphs.
Graphs characterized by forbidden minors [RS83] are another special class of graphs. A graph
belongs to a minor-closed family if and only if it does not have a minor from a certain specied
list. Some examples are given in Table 2.1. All graphs in these classes characterized by a finite set
of forbidden minors are sparse.
Definition 24. The thickness of a graph G = (V, E) is the minimum number $\theta$ of planar subgraphs
$G_1 = (V, E_1), G_2 = (V, E_2), \ldots, G_\theta = (V, E_\theta)$ such that $E = \bigcup_{i=1}^{\theta} E_i$.
There is literally a class of graphs called bounded X graphs for any of the aforementioned
properties X, where the corresponding property is bounded by a constant O(1). For example:
bounded-degree graphs are graphs in which the degree of each vertex is bounded by a constant.
Another example class is the class of graphs with bounded tree-width (Definition 18). The
tree-width is a good measure of the algorithmic tractability of graphs. It is known that a number of hard
problems on graphs can be solved efficiently when the given graph has small tree-width [AP89].
A graph has tree-width at most 1 if and only if it is a forest, and families of graphs with tree-width at most
2 include outer-planar graphs and series-parallel graphs.
A ball graph is an intersection graph of balls in $\mathbb{R}^{\mathrm{dim}}$. It consists of n balls with centers $v_i$ and
radii $r_i$. Two centers $v_i, v_j$ are connected in the intersection graph iff their balls intersect in $\mathbb{R}^{\mathrm{dim}}$.
A disk graph is a ball graph with dim = 2. In unit-disk and unit-ball graphs, all radii are equal.
The class of disk graphs contains the class of planar graphs [Koe36].
2.1.3 Synthetic Graph Models
Ideally, an algorithm would work well for all instances. However, more often than not, one can
construct an adversarial graph (also termed worst-case graph) for which certain algorithms show a
very bad performance. Almost ideally, an algorithm would work well for all practical instances, or
at least for a typical (average) instance. Even though many datasets are made public these days,³
the number of available real-world networks is still rather limited. Furthermore, the algorithm
designer may not know in advance which graph the user will work with. From a theoretical
perspective, creating an algorithm for one particular graph instance is trivial: since code size is
not measured and evaluated, all solutions can be encoded in advance. Instead, we often evaluate
algorithms on certain restricted classes of graphs (see Section 2.1.2). Many real-world networks,
however, do not fall into any of these classes.
A large branch of research investigates models of the real world. Since we wish to capture the
essential features of multiple networks, the models have some degree of freedom, which is often
modeled by randomness. In some random graphs [ER60, Gil59, Bol01], for example, each possible
edge is in the graph with probability p. In general, these models may help to understand
characteristics of certain real-world networks but also to evaluate and test [ASS09] the performance of
algorithms.⁴ In the following we briefly review models for the networks discussed in this thesis: road
networks and complex networks.
Road Networks
Often planar graphs are used to model road networks.
Eppstein and Goodrich [EG08] and Eppstein et al. [EGS09] explicitly state that road networks
are non-planar. Instead, they use a model called multiscale-dispersed graphs, formalized in terms
of disk graphs (which contain planar graphs [Koe36]). They prove that these networks have small
separators (see also Theorem 4 in a subsequent section), which can be found efficiently.
Abraham et al. [AFGW10] introduce the notion of highway dimension, which means that for
every radius r > 0, there is a small set of vertices $S_r$, which all shortest paths of length greater
than r pass through.
3
There are even online platforms to trade datasets, for example infochimps.org
4
Another approach to evaluate the average-case performance of algorithms and to generate test instances is to collect
many real-world graphs and perturb edge weights at random [Iri92].
Network      | Degree distribution                                       | Example                                                        | Model
Single-scale | Gaussian or exponential                                   |                                                                | Erdős–Rényi [Gil59, ER60]
Scale-free   | Power law                                                 | Metabolic networks, food webs, Web graphs, and numerous others | Pref. attachment [BA99]; fixed (exp.) degree sequence [BBK72, ACL00]
Broad-scale  | Power-law distrib. with sharp cut-off (decay of the tail) | Movie actor                                                    |
Table 2.2: Complex networks of different scales [ASBS00]
Complex Networks
Complex networks at first appear not to have any particular structure; this is why they are
called complex. For the moment, let us focus on the degree distribution. Amaral et al. [ASBS00]
distinguish three classes of networks based on their degree distribution: single-scale, scale-free,
and broad-scale networks (see Table 2.2).
Single-Scale Networks. Erdős–Rényi random graphs [ER60, Gil59] have been studied intensively
for more than 50 years. There are two common models for undirected graphs with n nodes.
In the $G_{n,p}$ model, each of the $\binom{n}{2}$ edges is in the graph independently at random with probability
p. In the $G_{n,M}$ model, all graphs with n nodes and M edges have the same probability. Many
properties of Erdős–Rényi graphs are well understood. For example, the $G_{n,p}$ random graph with
edge probability p proportional to $n^{-1/d}$ (where d denotes an integer) has diameter at most d + 1
with high probability [Bol01]. For more results we refer to [Bol01]. Erdős–Rényi graphs serve
as suitable probability distributions for the average-case analysis of many algorithms. However,
graphs with power-law degree distribution are very unlikely in the Erdős–Rényi random graph
distribution. Since many real-world networks do have power-law degree distributions, researchers
also consider other random graph models.
Scale-Free Networks. The node degree sequence of scale-free graphs obeys a power law [Mit03,
New05, CSN07]. Power-law distributions are referred to as scale-free distributions, since they look
the same on any scale. Mathematically speaking, a power-law degree distribution is defined as
follows: the probability that a node has degree x is proportional to $x^{-\tau}$ for some $\tau$, which is called
the power-law exponent. For most practical scenarios, the power-law exponent lies in the interval
$2 < \tau < 3$. These inequalities are assumed to hold in the following. Formally, a degree sequence
obeys a power law if $\Pr[\deg(v) = x] = C \cdot x^{-\tau}$ for some constant C. The expected degree can
be computed as follows.
$\mathrm{E}[\deg(v)] = \sum_{x=1}^{n-1} x \cdot \Pr[\deg(v) = x] \approx \int_{1}^{\infty} C x^{-\tau+1} \, dx = \frac{C}{\tau - 2}$   (2.1)
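The integral in the expected-degree computation can be evaluated step by step. The following derivation is a sketch that writes $\tau$ for the power-law exponent (the symbol is an assumption of this illustration) and uses only that the exponent is larger than 2, so that $x^{2-\tau}$ vanishes as x grows.

```latex
\int_{1}^{\infty} C x^{-\tau+1} \, dx
  = C \left[ \frac{x^{2-\tau}}{2-\tau} \right]_{1}^{\infty}
  = C \left( 0 - \frac{1}{2-\tau} \right)
  = \frac{C}{\tau-2},
  \qquad \tau > 2 .
```

For an exponent of 2.5, for instance, the expected degree is approximately 2C, independent of n.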
For constant values of C, the expected number of edges is linear, which makes scale-free networks
sparse. The power-law degree sequence is just one important feature of many real-world complex
networks. Another characteristic is that distances are very short. This characteristic is called the
small world effect.
Two broad classes of network models [BS05, Mit03, CF06, TGJ+02] are distinguished based
on the method the graphs are generated with. In pure random graphs, the number of nodes and
the parameters are set at the beginning and then all the edges are generated. These models are
satisfactory to analyze complex networks but they do not explain the reasons for the scale-free
nature of complex networks. In random evolving graphs, the graph is generated by a random
process that adds node by node to the graph and connects the new node at random to the existing
graph. This process can be stopped at any time. For a graph generated by either model, let n
denote the number of nodes. The details for the different models vary greatly. Commonalities
other than the power-law degree sequence are that, usually, the diameter is proportional to lg n
and the average distance is proportional to lg lg n. The goal is to find a model that is both realistic
and easy to work with.
The configuration model [BBK72, RN04b] works as follows: we specify a degree sequence
$\vec{d} := (d_1, \ldots, d_n)$. The edges are generated such that all graphs G = (V, E) with $\forall v_i \in V : \deg(v_i) = d_i$
have the same probability. Where in the Erdős–Rényi random graph model all edges
were independent, the edges in the configuration model are dependent. Once an edge between two
vertices $v_i, v_j$ has been assigned, the potential of both $v_i$ and $v_j$ to acquire more edges decreases by
1. Note that, for a degree sequence $\vec{d}$ to be realizable as a graph, there are some conditions on $\vec{d}$
such as $\sum_{i=1}^{n} d_i$ must be even and others [EG60, Hak62].
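Whether a degree sequence is realizable by a simple undirected graph can be tested with the Erdős–Gallai condition, which we assume is the kind of condition the citation [EG60] refers to. The Python sketch below is an illustration; the function name is_graphical is an assumption of this example.

```python
def is_graphical(degrees):
    """Erdős–Gallai test for realizability by a simple undirected graph.

    A non-increasing sequence d_1 >= ... >= d_n of non-negative integers is
    realizable iff its sum is even and, for every k,
    d_1 + ... + d_k <= k (k - 1) + sum over i > k of min(d_i, k).
    """
    d = sorted(degrees, reverse=True)
    if any(x < 0 for x in d) or sum(d) % 2 != 0:
        return False
    n = len(d)
    for k in range(1, n + 1):
        left = sum(d[:k])
        right = k * (k - 1) + sum(min(x, k) for x in d[k:])
        if left > right:
            return False
    return True
```

For example, the sequence (3, 3, 3, 3) is realized by the complete graph on four nodes, whereas (3, 1) has an even sum but still fails the condition.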
In the fixed expected degree random graph model [ACL00, CL02, NR06], edges are independent.
We again specify a sequence $w := (w_1, \ldots, w_n)$. For this model, $w_i$ is interpreted as the
expected degree of $v_i$. Each edge $\{v_i, v_j\}$ is in the graph independently at random with probability
$\frac{w_i w_j}{\sum_k w_k}$. Note that it is required to restrict w such that $\forall i, j : w_i w_j \le \sum_k w_k$. In Chapter 5, we use
an adapted version of this model to analyze distance oracles for random power-law graphs.
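The fixed expected degree model can be sampled directly from its description. The following Python sketch is a simple quadratic-time illustration, not the adapted model used later in the thesis; the function name expected_degree_graph and the decision to consider only pairs of distinct nodes (no self-loops) are assumptions of this example.

```python
import random


def expected_degree_graph(w, rng=None):
    """Sample an edge set on nodes 0..n-1 from a sequence of expected degrees.

    Each pair {i, j} with i < j becomes an edge independently with
    probability w[i] * w[j] / sum(w). Self-loops are not generated.
    """
    if rng is None:
        rng = random.Random()
    n = len(w)
    total = sum(w)
    if total == 0:
        return set()  # all expected degrees are zero: no edges
    # The model requires every probability to be at most 1.
    if any(w[i] * w[j] > total for i in range(n) for j in range(i + 1, n)):
        raise ValueError("weights violate w_i * w_j <= sum_k w_k")
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < w[i] * w[j] / total:
                edges.add((i, j))
    return edges
```

Passing a seeded random.Random instance makes a sample reproducible, which is convenient when generated graphs are used as test instances.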
In the re-wired lattice model [BMST97, NW99, Kle00], each vertex is connected to all of
its neighbors within constant distance by an undirected edge. In addition, a number of shortcuts
(long-range links) are added between randomly chosen pairs of nodes. In a variant, instead of
adding edges, some of the connections to neighbors are removed and re-wired to random nodes.
In affiliation networks [LS09], we start with a bipartite graph. The nodeset is divided into
actor nodes and affiliation nodes. Each node representing an actor is connected to certain nodes
representing affiliations such as companies, orchestras, and sports clubs. Then, the bipartite graph
is unfolded into a social network, which consists of actor nodes only; edges are generated such
that two actors connected by a path of length 2 in the affiliation graph get connected in the social
network.
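The unfolding step can be illustrated with a few lines of Python. This sketch only shows the projection from the bipartite graph onto the actors; the function name unfold, the dictionary layout, and the sample data are assumptions made for the example.

```python
from itertools import combinations


def unfold(affiliations):
    """Project a bipartite actor-affiliation graph onto the actors.

    affiliations maps each actor to the set of affiliations it belongs to.
    Two actors become adjacent iff they share at least one affiliation,
    that is, iff they are joined by a path of length 2 in the bipartite graph.
    """
    members = {}
    for actor, groups in affiliations.items():
        for group in groups:
            members.setdefault(group, set()).add(actor)
    edges = set()
    for group_members in members.values():
        # Sorting gives each undirected edge one canonical orientation.
        for a, b in combinations(sorted(group_members), 2):
            edges.add((a, b))
    return edges


# Hypothetical data; the names are placeholders.
sample_affiliations = {
    "ann": {"orchestra", "club"},
    "bob": {"orchestra"},
    "cat": {"club"},
    "dan": {"company"},
}
```

In the sample data, ann is linked to bob through the orchestra and to cat through the club, while dan remains isolated.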
In the preferential attachment model [BA99, DMS00], the network is growing in time in such
a way that new vertices are more likely to be connected to vertices that already have a high degree.
A new vertex connects to a node with degree $d_i$ with probability $\frac{d_i}{\sum_k d_k}$. This model offers a
convincing explanation for the emergence of scale-free networks. The copy model [KRR+00] is
in some sense a variant of the preferential attachment model, where a new node, upon generation,
copies a fraction of the links of a random node.
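The attachment rule can be sketched in Python as follows. This is a simplified illustration under stated assumptions: every new vertex adds exactly one edge, the process starts from a single edge between nodes 0 and 1, and the function name preferential_attachment is hypothetical. The models in the cited papers are more general.

```python
import random


def preferential_attachment(n, rng=None):
    """Grow a tree on nodes 0..n-1 by degree-proportional attachment.

    Node t >= 2 attaches to an existing node i chosen with probability
    d_i / sum_k d_k, where d_i is the current degree of node i.
    """
    if n < 2:
        raise ValueError("need at least two nodes")
    if rng is None:
        rng = random.Random()
    edges = [(0, 1)]
    # Node i occurs in this list once per incident edge, so a uniform
    # choice from the list is a degree-proportional choice of a node.
    endpoints = [0, 1]
    for t in range(2, n):
        target = rng.choice(endpoints)
        edges.append((target, t))
        endpoints.extend((target, t))
    return edges
```

Keeping the list of edge endpoints avoids recomputing the degree distribution in every step, at the cost of storing two entries per edge.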
2.2 Graph Algorithms
Graph algorithms is a research field at the intersection between graph theory and computer science.
We are interested in (efficiently) computing certain properties of graphs. An algorithm, given a
graph G = (V, E) and optional inputs such as subsets of the nodeset or edgeset or constants,
decides or computes certain properties of G. Decision problems are those questions for which
the answer is "yes" or "no". Optimization problems are the questions for which the solution is a
subgraph, potentially ordered, minimizing or maximizing an objective function.
The efficiency of algorithms is of integral interest.
For practical purposes computational details are vital. However, my purpose is only
to show as attractively as I can that there is an efficient algorithm. According to the
dictionary, "efficient" means "adequate in operation or performance." This is roughly
the meaning I want.
Jack Edmonds [Edm65]
Suppose that we want to evaluate an algorithm for a problem. Objective evaluation criteria
include the quality of the result (correctness, exactness, approximation quality), the computing
time (also: time complexity), measured in terms of the input size, and the memory consumption
(also: space complexity), again measured relative to the input size. For graph algorithms, the
input consists of a graph G = (V, E) and optional parameters. Unless stated otherwise, n := |V|
denotes the number of nodes and m := |E| denotes the number of edges. An important aspect
in the evaluation of an algorithm is its scalability. For theoretical work, scalability means the
asymptotic behavior of an algorithm in terms of n and m. Let $f_A(n, m)$ denote the least upper
bound on the cost of applying algorithm A to graphs with n nodes and m edges. It is claimed
that a constant number of instructions every now and then does not influence the running time too
much. It is often also convenient not to analyze these constant overheads in great detail. This is
captured in the Bachmann-Landau [Bac94, Lan09] O-notation [CLRS01, Chapter 3]. We say that
the running time of an algorithm is (in) O(g(n, m)), meaning that the actual running time as a
function $f_A(n, m)$ increases, or grows, at most proportionally to g(n, m), ignoring the exact value
$f_A(n, m)$. The precise definitions of the O-notation are listed in Table 2.3.
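Notation                        Definition
$f(n) \in O(g(n))$              $\exists n_0, c_1, c_2$ such that $\forall n > n_0$: $c_1 g(n) + c_2 \ge f(n)$
$f(n) \in o(g(n))$              $\forall c_1 > 0, c_2\ \exists n_0$ such that $\forall n > n_0$: $c_1 g(n) + c_2 > f(n)$
$f(n) \in \Omega(g(n))$         $g(n) \in O(f(n))$
$f(n) \in \omega(g(n))$         $g(n) \in o(f(n))$
$f(n) \in \Theta(g(n))$         $f(n) \in O(g(n)) \wedge f(n) \in \Omega(g(n))$
$f(n) \in \tilde{O}(g(n))$      $\exists c$ such that $f(n) \in O(g(n) \lg^c n)$
Table 2.3: $O$-notation for the asymptotic behavior of functions $f$, $g$.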
Computational problems are classified according to their difficulty, which is defined by the
existence of an algorithm running in a certain time. The class of decision problems for which
there exists an algorithm that outputs the correct answer in time $O(\mathrm{poly}(n))$ is called P. The
class of decision problems for which there exists an algorithm that, given some evidence, verifies
the correct answer in time $O(\mathrm{poly}(n))$ is called NP. The complexity classes P and NP are only
mentioned in some parts of the chapter on related work, but profound knowledge of the P vs. NP
problem is not essential to understand this thesis. For more on computational complexity theory,
we refer to [GJ90].
2.2.1 Computational Models
Time and space complexities are measured differently depending on the machine model [vEB90].
The traditional model of computation consists of a Turing machine [Tur37], which is a state
machine operating on an infinite tape divided into cells. In the word RAM model [CR73] with integral
word length $w \ge 1$, the contents of all memory cells are integers in the range $\{0, \dots, 2^w - 1\}$ and
operations such as addition, subtraction, bit shifts, and bit-wise boolean operations are assumed
to be executable in constant time (analogous to programming languages such as C). This model
often allows for fast algorithms (faster compared to addition/comparison models by a logarithmic
factor) if for example the edge weights are integers and the largest integer weight $W$ satisfies
$W \le 2^w - 1$. This model is often used for upper bounds on the time complexity of a specific
algorithm.
For lower bounds on the time complexity of any algorithm, the related cell-probe model is
very common.
Definition 25 (Cell-probe model [Yao81, Mil99]). In the cell-probe model, a memory cell has
w bits (also called word length) and the space of a data structure is measured as the number of
cells it occupies, denoted by S. The query time is measured by the worst-case number of cells t
that a query reads.
Both for the word RAM and the cell-probe model, the most typical values for the word length
are w = lg n or w = polylog(n) = poly(lg n), but larger (or smaller) values may be interesting
as well.
For problems involving huge data sets, often I/O is the bottleneck of computations. The data
does not fit into main memory; instead it is read block by block from disk. To analyze external
memory algorithms [AV88, VS94], often a cell-probe-like model is used for upper bounds as well.
Operations may read a block of size $B$ into main memory of size $M$.
2.2.2 Approximation Algorithms
For certain optimization problems, the optimal solution with respect to an objective function is
hard to compute. Often the computation of a close-to-optimal solution can be done much faster.
An approximate solution may still be acceptable if the quality of the solution is sufficiently good.
The quality is measured as follows. Let OPT denote the value of an optimal solution (as deemed
by the objective function) and let ALG denote the value of the solution the algorithm returned.
We say that the algorithm has approximation quality $(\alpha, \beta)$ if the inequalities (2.2) hold for all
allowed inputs. The approximation quality is thus a worst-case measure.
$$\mathrm{OPT} \le \mathrm{ALG} \le \alpha \cdot \mathrm{OPT} + \beta \qquad (2.2)$$
(Note that this definition is tailored to minimization problems and in particular to the shortest path
problem.)
In the case of distances, this approximation quality is also called stretch or distortion. Let
$\tilde{d}(u, v)$ denote the result of the approximation algorithm when asked for the distance between $u$
and $v$. For an algorithm computing $(\alpha, \beta)$-approximate distances, for all $u, v \in V$, the result must
satisfy
$$d_G(u, v) \le \tilde{d}(u, v) \le \alpha \cdot d_G(u, v) + \beta.$$
(The combination of multiplicative and additive stretch only makes sense for multiple node
pairs. For a single path we consider either its multiplicative or its additive stretch (Definition 14).)
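The stretch condition can be checked mechanically on small instances. The following is a minimal sketch, assuming an unweighted graph given as an adjacency dictionary; the names `bfs_distances`, `has_stretch`, and the toy estimator `round_up_even` are hypothetical and only serve as illustration.

```python
from collections import deque


def bfs_distances(adj, source):
    """Exact hop distances from source in an unweighted graph (adjacency dict)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def has_stretch(adj, estimate, alpha, beta):
    """Check d_G(u,v) <= estimate(u,v) <= alpha * d_G(u,v) + beta for all connected pairs."""
    for u in adj:
        for v, d in bfs_distances(adj, u).items():
            if not (d <= estimate(u, v) <= alpha * d + beta):
                return False
    return True


# Path graph 0 - 1 - 2 - 3 and a toy estimator that rounds distances up to even numbers.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}


def round_up_even(u, v):
    d = abs(u - v)
    return d + d % 2
```

On this toy instance the estimator has stretch (1, 1) and also stretch (2, 0), but not stretch (1, 0), since a distance of 1 is reported as 2.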
2.3 Common Techniques
This section consists of a non-exhaustive list of techniques that are commonly used to solve prob-
lems related to shortest paths.
2.3.1 Spanners and Emulators
For most graph algorithms, the performance depends on the number of nodes and edges of the
input graph. The running time can potentially be reduced by altering the graph, in particular by
adding or deleting edges. After altering the graph, we wish that the answer to the question we ask
concerning the graph (in our case the distances between nodes) does not change by much. When
edges are deleted only, we obtain a subgraph, which, if it preserves distances to a certain extent, is
called a spanner [PS89, ADD+93, Coh98, DHZ00, Kor01, BCE03, EP04, TZ06, Pet07, Elk08a,
Elk08b, BS08].
Definition 26 ((Graph) Spanner). An $(\alpha, \beta)$-spanner of a graph $G = (V, E)$ is a subgraph
$G' = (V, E')$ that approximately preserves distances such that for all pairs of nodes $(u, v) \in V \times V$,
$$d_G(u, v) \le d_{G'}(u, v) \le \alpha \cdot d_G(u, v) + \beta.$$
We say that this spanner has stretch (or distortion) $(\alpha, \beta)$.
Spanners are useful in various applications such as constructing routing tables, where the edges
of a subgraph are used to route messages, and computing approximate shortest paths.
The more edges we delete, the smaller the input size for the next algorithm and the faster its
running time. The number of edges we can delete before some distances change substantially
often depends on the girth of the graph. Recall that the girth of a graph is the length of its shortest
cycle (Definition 17). Intuitively, if there are short cycles, we may delete an edge of a cycle, since
for a shortest path using this edge, there is an alternative, reasonably short path using the rest of
the cycle. If there are no short cycles, there is no alternative short path; the redundancy is low and
the deletion of an edge may cause a large distortion.
Let $m_g(n)$ denote the maximum number of edges in a graph with $n$ vertices and girth at least $g$.
Theorem 1 (Althöfer et al. [ADD+93]). For any integer $\alpha \ge 3$, every graph $G = (V, E)$ on
$|V| = n$ vertices has a spanner with stretch $(\alpha, 0)$ and at most $m_{\alpha+2}(n)$ edges.
Their construction uses a greedy algorithm (similar to Kruskal's algorithm to construct a
minimum spanning tree [CLRS01, p. 568]). The upper bound is actually tight. The corresponding
lower bound is not very difficult: in an unweighted graph with girth $g = \alpha + 2$, removing any edge
increases the distance between its endpoints from 1 to at least $\alpha + 1$. The only multiplicative
$(\alpha, 0)$-spanner is the graph itself.
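The greedy idea can be sketched as follows: scan the edges by non-decreasing weight and keep an edge only if the spanner built so far does not already connect its endpoints within $\alpha$ times its weight. This is a simple illustration under the assumption of positive edge weights, not the authors' optimized construction; the function names and the edge-list format are hypothetical.

```python
import heapq


def bounded_distance(adj, source, target, limit):
    """Dijkstra from source that ignores paths longer than limit; inf if none reaches target."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue  # outdated heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd <= limit and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")


def greedy_spanner(nodes, edges, alpha):
    """Return the edges of an (alpha, 0)-spanner; edges are (u, v, weight) triples."""
    adj = {v: [] for v in nodes}
    kept = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        # Keep the edge only if the current spanner has no u-v path of length <= alpha * w.
        if bounded_distance(adj, u, v, alpha * w) > alpha * w:
            adj[u].append((v, w))
            adj[v].append((u, w))
            kept.append((u, v, w))
    return kept


# A 5-cycle has girth 5 = alpha + 2 for alpha = 3, so no edge can be dropped.
cycle5 = [(i, (i + 1) % 5, 1) for i in range(5)]
```

For the 5-cycle, `greedy_spanner(range(5), cycle5, 3)` keeps all five edges, matching the lower-bound argument above, whereas stretch 4 allows one edge to be dropped.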
Since no edges can be removed from graphs with large girth without significantly altering
distances, graphs with many edges (dense graphs) and large girth are important worst-case instances
for spanner and distance oracle constructions. In extremal combinatorics, determining $m_g(n)$ is a
research field of its own [EJ08, Big98, Hoo02]. For example, for $g = 4$ the question is the
following: how many edges can be added to the empty graph on $n$ nodes without closing a triangle?
Intuitively, the more edges that were already added, the harder it gets to add another one. For $g = 4$,
the complete bipartite graph is asymptotically optimal. For general $g$, the construction of the graph
is much more involved; for some $g$ the value of $m_g(n)$ is not even known.⁵ Erdős' girth
conjecture [Erd64, ES63] predicts that, for an integer $k \ge 1$, $m_{2k+1}(n) = m_{2k+2}(n) = \Omega(n^{1+1/k})$. The
corresponding upper bound is known to be tight [AHL02] and the conjectured lower bound is a
theorem for certain values of $k$ (1, 2, 3, and 5); for an overview, see Table 2.4.
Girth     $|E|$                                                 Reference
4         $\Theta(n^2)$                                         complete bipartite graphs
6         $\Theta(n^{3/2})$                                     [Rei58, ERS66, Bro66, Wen91]
8         $\Theta(n^{4/3})$                                     [Tit59, Ben66, Wen91]
10        $O(n^{5/4})$, $\Omega(n^{6/5})$                       [Tit59, Ben66, LU93]
12        $\Theta(n^{6/5})$                                     [Tit59, Ben66, Wen91, LU93]
14        $O(n^{7/6})$, $\Omega(n^{9/8})$                       [LUW95, LUW96]
16        $O(n^{8/7})$, $\Omega(n^{10/9})$                      [WU93, LUW95]
$4r+2$    $O(n^{(2r+1)/(2r)})$, $\Omega(n^{1+1/(3r-1)})$        [LUW95, LUW96]
$4r$      $O(n^{2r/(2r-1)})$, $\Omega(n^{1+1/(3r-3)})$          [LUW95, LUW96]
Table 2.4: Results on Erdős' girth conjecture, overview from [TZ05, Table II]. Maximum size of
the edge set $E$ for a graph $G = (V, E)$ with $|V| = n$ nodes and given girth (length of a shortest
cycle, Def. 17).
For multiplicative spanners, the tradeoff between space and stretch is well understood. Not
so for spanners with additive stretch. Aingworth et al. [ACIM99] found a $(1, 2)$-spanner with
$O(n^{3/2})$ edges and Baswana et al. [BKMP05] found a $(1, 6)$-spanner with $O(n^{4/3})$ edges.
Woodruff [Woo06] gives a strong lower bound for additive graph spanners independent of Erdős'
girth conjecture. He proves that for an integer $k = o(\lg n / \lg\lg n)$, there are graphs for which any
$(1, 2k - 1)$-spanner has $\Omega(n^{1+1/k}/k)$ edges.
Recently [TZ06, Pet07], spanners with non-constant additive stretch $\beta$ have come under
investigation. For these spanners, $\beta$ is required to be sublinear in $d(u, v)$. We refer to the overview by
Pettie [Pet07, Fig. 2]. For a result on spanners for directed graphs, see [RTZ08].
Spanners are subgraphs. If we just care about distances and not about the actual paths, the
subgraph requirement may be too restrictive. Emulators are graphs restricted to the same nodeset
but not to the same edgeset.
⁵ For directed graphs, the problem seems to be even more involved [CH78, CS83].
Definition 27 (Emulator [DHZ00]). An edge-weighted graph $F = (V, E')$ $(\alpha, \beta)$-emulates a
graph $G = (V, E)$ if for every $u, v \in V$
$$d_G(u, v) \le d_F(u, v) \le \alpha \cdot d_G(u, v) + \beta.$$
$F$ is called an $(\alpha, \beta)$-emulator.
Consequently, we say that such an emulator has stretch $(\alpha, \beta)$. Note that, for both spanners
and emulators, distances may only increase. Dor et al. [DHZ00] give a $(1, 4)$-emulator with
$\tilde{O}(n^{4/3})$ edges. Thorup and Zwick [TZ06], for an arbitrary integer $k \ge 2$, construct emulators
with $O(k n^{1+1/(2^k - 1)})$ edges (in expectation), such that for pairs with distance $\delta$, the distance in
the emulator is at most $\delta + O(k \delta^{1 - 1/(k-1)})$. The lower bounds by Woodruff [Woo06] can be
extended to emulators as well.
2.3.2 Distance Labelings and Metric Embeddings
The objective of distance labelings is to assign each node of a graph a label such that the distance
(or an approximation thereof) between two nodes can be computed based on the corresponding
labels only [Pel00]. Such labelings are used in the real world to a certain extent. For example,
postal addresses include countries, cities, and street names, using which we can get an estimate of
how close two addresses are.⁶ The idea is formalized in the following definition.
Definition 28 (Distance Labeling [Pel00, GPPR04]). An $(\alpha, \beta)$-approximate distance labeling
scheme for a graph $G = (V, E)$ is an assignment of labels to nodes $L : V \to \{0, 1\}^*$ such that
the estimated distance $\tilde{d}(u, v)$ computed by the scheme from the labels $L(u)$ and $L(v)$ satisfies
$$d_G(u, v) \le \tilde{d}(L(u), L(v)) \le \alpha \cdot d_G(u, v) + \beta.$$
Distance labeling schemes with short labels are derivable for highly regular graph classes, such
as rings, meshes, and hypercubes. An interesting question is whether more general graph classes
can also be labeled in this fashion. For general graphs, it is known that any distance labeling
scheme must label some graphs with $n$ vertices with labels of size $\Omega(n)$ [GP03b]. Even for planar
graphs, some nodes must have labels of size $\Omega(n^{1/3})$ [GPPR04].
If we restrict the function $\tilde{d}(L(u), L(v))$ to $\ell_p$ norms, we obtain embeddings [IM04, Lin02].
The idea is to map a metric space (here: the shortest path metric of a graph) into a simpler one,
in such a way that the distances between points do not change too much. More formally, an
embedding of a (weighted) graph $G = (V, E, w)$ with distance function $d$ into a target metric
space $(V', d')$ (where $d'$ denotes the distance function of this space) is a map $\phi : V \to V'$. An
embedding with good distortion yields good approximation algorithms [Ind01].
⁶ Latitude and longitude coordinates would of course be more precise.
The Johnson-Lindenstrauss Lemma can be used to reduce the number of dimensions of a
metric space without introducing a large error. It states that, for any $\ell_2^{\mathrm{DIM}}$, a random linear mapping
into $\ell_2^{\mathrm{dim}}$ preserves distances up to a factor of $1 \pm \varepsilon$ with probability at least
$1 - e^{-\varepsilon^2 \mathrm{dim}}$. More precisely:
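Lemma 2 (Johnson and Lindenstrauss [JL84]). For any $0 < \varepsilon < 1$ and any integer $n$, let dim be
a positive integer such that
$$\mathrm{dim} \ge 4 (\varepsilon^2/2 - \varepsilon^3/3)^{-1} \ln n.$$
Then for any set $V$ of $n$ points in $\mathbb{R}^{\mathrm{DIM}}$ there is a map $f : \mathbb{R}^{\mathrm{DIM}} \to \mathbb{R}^{\mathrm{dim}}$ such that for all $u, v \in V$,
$$(1 - \varepsilon) \|u - v\|^2 \le \|f(u) - f(v)\|^2 \le (1 + \varepsilon) \|u - v\|^2.$$
The map $f$ can be found in polynomial time.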
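To get a feeling for the numbers, the dimension bound of the lemma can be evaluated directly, and a candidate map can be drawn at random. The sketch below assumes one standard construction that is not spelled out in the text above: a matrix of independent Gaussian entries scaled by $1/\sqrt{\mathrm{dim}}$. It does not verify the distortion guarantee; a full procedure would test the drawn map on all pairs and redraw on failure. The function names are hypothetical.

```python
import math
import random


def jl_dimension(eps, n):
    """Smallest integer dim with dim >= 4 * (eps^2/2 - eps^3/3)^(-1) * ln n."""
    return math.ceil(4.0 * math.log(n) / (eps ** 2 / 2.0 - eps ** 3 / 3.0))


def random_projection(points, dim, seed=0):
    """Apply one shared random Gaussian matrix (scaled by 1/sqrt(dim)) to every point."""
    rng = random.Random(seed)
    big_dim = len(points[0])
    matrix = [[rng.gauss(0.0, 1.0) / math.sqrt(dim) for _ in range(big_dim)]
              for _ in range(dim)]
    return [[sum(row[j] * p[j] for j in range(big_dim)) for row in matrix]
            for p in points]
```

For example, `jl_dimension(0.5, 1000)` evaluates to 332, independently of the original dimension DIM.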
We can thus project any set of $n$ points in Euclidean space to an $O(\lg n / \varepsilon^2)$-dimensional space
such that the distance between any two points changes by at most a factor of $1 \pm \varepsilon$. Such projections
are almost best possible: at least $\Omega(\lg n / (\varepsilon^2 \lg(1/\varepsilon)))$ dimensions are needed [Alo03, Theorem 9.3].
We would like to generalize this result to all metric spaces, in particular to graphs and the
shortest path metric.
Theorem 3 (Bourgain [Bou85]). For every $n$-point metric space there exists an embedding into
Euclidean space ($\ell_2$) with distortion $(O(\lg n), 0)$.
Also, this bound is tight [LLR95]. For the special case of planar graphs, the multiplicative
distortion is $\Theta(\sqrt{\lg n})$ [Rao99, NR03]. The optimal distortion of an embedding into a
constant-dimensional Euclidean space is hard to compute [MS08a].
Embedding a graph into a hypercube (wherein the $\ell_1$ norm is computed as the Hamming
distance between coordinates) is another technique [Djo73]. The Squashed Cube Conjecture by
Graham and Pollak [GP72], proven by Winkler [Win83], implies an exact distance labeling scheme
with labels of size $n \lg_2 3$. Note that computing the Hamming distance takes time proportional to
the label size.
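The simplest instance of such a labeling is the hypercube itself: each node is labeled with its coordinate string and the exact distance is the Hamming distance of the two labels. The following sketch (with hypothetical helper names) builds a small hypercube and allows the labels to be compared against breadth-first search distances.

```python
from collections import deque


def hamming(a, b):
    """Number of positions in which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def hypercube(dim):
    """Adjacency dict of the dim-dimensional hypercube; node names double as labels."""
    flip = {"0": "1", "1": "0"}
    nodes = [format(i, "0{}b".format(dim)) for i in range(2 ** dim)]
    return {v: [v[:i] + flip[v[i]] + v[i + 1:] for i in range(dim)] for v in nodes}


def bfs_distances(adj, source):
    """Exact hop distances from source, used here only to check the labels."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist
```

Evaluating `hamming` touches every position of both labels, which illustrates the remark that the query time is proportional to the label size.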
2.3.3 Planar Graph Techniques
Separators
A powerful approach when designing an algorithm is to use divide & conquer: we split the prob-
lem at hand into a bunch of smaller problems, solve these, and then combine their solutions.
Ideally the sub-problems are completely independent such that the combination of their solutions
is straightforward. In general, the sub-problems are not independent. It is then crucial to cut the
problem into pieces with as few interdependencies as possible. For some problems, such a cutting
is indeed possible. Given an instance of the restricted class of planar graphs with n nodes, it is
known [LT80, AST94] that we can split the graph into two parts of roughly half the original size
separated by a rather small third set of nodes (with size $O(\sqrt{n})$).
Theorem 4 (Lipton and Tarjan's Planar Separator Theorem [LT80]). The $n$ vertices of a planar
graph can be partitioned into three sets $A$, $B$, $S$ such that
• no edge connects a vertex in $A$ with a vertex in $B$,
• $A$ and $B$ each contain at most $n/2$ vertices, and
• $S$ contains at most $O(\sqrt{n})$ vertices.
Results for planar graphs often extend to other classes of graphs. The separator theorem has
been generalized to bounded-genus graphs by Gilbert et al. [GHT84] and to minor-free graphs by
Alon et al. [AST90].
Road networks, although non-planar (the networks may have many bridges and tunnels), often
also have structural properties that are similar to the ones for planar graphs. Most importantly,
they appear to have small separators as well [EG08].
Edge Orientability
The edge set of a simple undirected planar graph $G = (V, E)$ can be oriented (denoted by
$\vec{G} = (V, \vec{E})$) such that the out-degree of every vertex $v$ satisfies $\deg^+_{\vec{G}}(v) = O(1)$. The upper
bound on the degree can be chosen to be 3 [CE91]. Note that the nodes' in-degrees $\deg^-_{\vec{G}}(v)$ remain
unbounded.
Using this orientation, adjacency queries can be answered in constant time by inspecting both
nodes. This technique can be seen as a labeling [Bre66, KNR92]: each node gets a label of
size O(lg n) bits such that the adjacency of two nodes can be computed by looking at the two
corresponding labels only. The result on edge orientability extends to minor-free graphs [GL07].
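A simple way to obtain a bounded out-degree orientation is sketched below. It is not the construction of [CE91] with bound 3; instead it assumes the elementary fact that every planar graph has a vertex of degree at most 5, repeatedly removes a minimum-degree vertex, and orients its remaining edges away from it, which gives out-degree at most 5 on planar inputs. The function names are hypothetical and the quadratic-time vertex selection is chosen for brevity only.

```python
def orient_low_outdegree(adj):
    """Orient each edge away from the endpoint that is removed first in a
    minimum-degree elimination order; returns a dict of out-neighbor lists."""
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    out = {v: [] for v in adj}
    while remaining:
        v = min(remaining, key=lambda x: len(remaining[x]))
        for u in remaining[v]:
            out[v].append(u)
            remaining[u].discard(v)
        del remaining[v]
    return out


def adjacent(out, u, v):
    """Adjacency query that inspects only the short out-lists of the two nodes."""
    return v in out[u] or u in out[v]


# Star with center 0: the center has in-degree 4, but every out-list has length <= 1.
star = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
```

Each out-list plays the role of a node label: with constant out-degree and $O(\lg n)$ bits per node identifier, a label has $O(\lg n)$ bits.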
2.3.4 Well-Separated Pairs
The well-separated pair decomposition by Callahan and Kosaraju [CK95] is important for
algorithms operating on point sets in $\mathbb{R}^{\mathrm{dim}}$. The corresponding definition for graphs is as follows. For
a graph $G = (V, E)$ and a set $A \subseteq V$, let $G(A)$ denote the subgraph induced by $A$.
Definition 29 (WSPD [CK95]). For a graph $G = (V, E)$, two sets of nodes $A, B \subseteq V$ are
$\varepsilon$-separated if $\max\{\mathrm{diam}(G(A)), \mathrm{diam}(G(B))\} \le \varepsilon \cdot d_G(A, B)$. For a parameter $\varepsilon > 0$, a
well-separated pair decomposition of a graph $G$ is a set of $s$ pairs $W = \{\{A_1, B_1\}, \dots, \{A_s, B_s\}\}$ such
that for all pairs $u, v \in V$
$$(u, v) \in \bigcup_{i=1}^{s} (A_i \times B_i) \cup (B_i \times A_i),$$
and that for all $i \in [s]$
• $A_i, B_i \subseteq V$,
• $A_i \cap B_i = \emptyset$, and
• $A_i$ and $B_i$ are $\varepsilon$-separated.
In other words, for any pair of nodes $u, v \in V$, there is exactly one pair $\{A_i, B_i\} \in W$ such
that $u \in A_i$ and $v \in B_i$.
2.4 Shortest Paths
Shortest paths should be of inherent interest to any lazy creature living on a sphere
like the surface of planet Earth.
Mikkel Thorup [Tho04a]
Even in very primitive (even animal) societies, finding short paths and searching (for
instance, for food) is essential.
Alexander Schrijver [Sch05a, p. 1]
The distance between two nodes $s, t \in V$ of a graph $G = (V, E)$ is defined by the length
of a shortest path (Definitions 8 and 10). The objective is to find the shortest possible path that
connects the source and the target.
In this thesis, we restrict ourselves to unconstrained, (approximate) shortest paths in static,
discrete graphs with positive edge weights. For geometric shortest paths we refer to [Mit97, Che96]
(for motion planning see [Sha97b]). For paths on surfaces and meshes we refer to [MMP87,
ADG+06, VS09]. For the dynamic version of the problem we refer to [DI08, Ita08]. For
time-dependent weights, see [CH66, OR90, KS93a]. For SSSP algorithms on graphs with negative edge
weights, we refer to [GT89, Gol95, FR06].
We classify [DP84] the shortest path algorithms according to the problem type (single source,
all pairs, one pair), the graph class, and the techniques. We also refer to Schrijver's book [Sch03,
Chapter 7], Pettie's thesis [Pet03], and the surveys by Zwick [Zwi01] and Sen [Sen09]. The review
in this section is made with best efforts; however, the list of methods, algorithms, and techniques
is far from complete.
2.4.1 Single Source Shortest Path (SSSP) Algorithms
For a brief overview, we refer to the Encyclopedia of Algorithms [Pet08b]. For a comprehensive
historical overview, we refer to Schrijver [Sch03, Section 7.5b]. The following outline is partially
based on Ahuja, Magnanti, and Orlins book [AMO93]. Historical information is from [Dre69,
GM77, DP84, Sch05a].
The Single Source Shortest Path (SSSP) problem asks for a shortest path from one node $s \in V$
(called the source) to all other nodes in $V \setminus \{s\}$. In particular, after running a single source shortest
path algorithm, we know the distance from the source to all other nodes. After termination, each
node $u \in V$ is labeled with $d(s, u)$. At the beginning $d(s, s) = 0$ and all other labels are set
to $\infty$. SSSP algorithms iterate and assign tentative distance labels $\tilde{d}(s, u)$ (upper bounds on the
true distance, $d(s, u) \le \tilde{d}(s, u)$) at each step. We distinguish label-setting, label-correcting, and
other algorithms. In label-setting algorithms, each iteration produces one optimal label. This
property does not hold for label-correcting algorithms, for which all labels are optimal after the
final iteration only. Label-setting algorithms are restricted to non-negative lengths, but they often
have a better worst-case time complexity than label-correcting algorithms. Also, if we are
interested in one particular target node $t$, it is possible to stop the algorithm as soon as the distance
label for $t$ has been produced.
Label-Setting Algorithms
The most famous label-setting algorithm is Dijkstra's algorithm [Dij59], see also [Sch03,
Section 7.2] or any algorithms textbook [CLRS01, p. 595] of your choice. The original
implementation runs in time $O(n^2)$. According to an article on the history of combinatorial
optimization [Sch05a], the same or a similar algorithm has been proposed independently by Leyzorek et
al. [LGJ+57], Dantzig [Dan60], and Whiting and Hillier [WH60]; but it is known as Dijkstra's
algorithm. The algorithm starts a search at the source. At each step, among the nodes with tentative
labels, the algorithm selects the node $u$ with the shortest tentative distance $\tilde{d}(s, u)$ and deems the
label of $u$ as permanent. Then, the algorithm updates the labels of all the neighbors of $u$ if
necessary. That is, if a neighbor $v \in \Gamma(u)$ has a tentative distance $\tilde{d}(s, v)$ larger than $d(s, u) + w(u, v)$,
its label is decreased. The algorithm does not need to backtrack: once a label is finalized, it is
correct and it can never decrease. The computational bottleneck of the algorithm is the node selection:
we need to efficiently find the node $u$ with the shortest tentative distance.
For this node selection step, sorting may be used [Joh72]. Dial [Dia69] proposes to maintain
sorted distances by using buckets. Let the weight function $w : E \to \mathbb{N}^+$ be restricted to integers
and let $W$ denote the largest integer weight. Dial's implementation maintains buckets, which could
require a large amount of memory; the running time is $O(m + nW)$. Further improvements were
made by Wagner [Wag76], Dial et al. [DGKK79], and Denardo and Fox [DF79].
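The bucket idea can be sketched as follows, assuming positive integer weights of at most `max_weight`, so that every finite distance is at most `max_weight * (n - 1)`. This naive variant allocates one bucket per possible distance value, which makes the memory concern explicit; it is an illustration with hypothetical names rather than Dial's original implementation.

```python
def dial_sssp(adj, source, max_weight):
    """Dijkstra with one bucket per tentative distance value.
    adj maps each node to a list of (neighbor, weight) with 1 <= weight <= max_weight."""
    n = len(adj)
    inf = float("inf")
    dist = {v: inf for v in adj}
    dist[source] = 0
    buckets = [[] for _ in range(max_weight * (n - 1) + 1)]
    buckets[0].append(source)
    for d, bucket in enumerate(buckets):
        for u in bucket:
            if dist[u] != d:
                continue  # stale entry: u received a smaller label later on
            for v, w in adj[u]:
                if d + w < dist[v]:
                    dist[v] = d + w
                    buckets[d + w].append(v)  # weights >= 1, so always a later bucket
    return dist


example = {"s": [("a", 2), ("b", 5)], "a": [("b", 1)], "b": []}
```

Scanning the buckets in increasing order replaces the selection of the minimum tentative label; the scan over all buckets accounts for the $nW$ term in the running time.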
Another approach to efficient node selection is to use a priority queue, which is a data structure
we use to manage the nodes [Mur67a]. This data structure supports efficient operations to insert
an element, to retrieve the minimum element, to delete the minimum element, and to decrease
the value of an element in the queue. These are exactly the operations necessary for Dijkstra's
algorithm. Each update of a label involves a queue operation. Using a binary heap as priority
queue, insertions, deletions, and decrease operations can be done in time $O(\lg n)$, which yields
time $O(m \lg n)$ for Dijkstra's algorithm [Wil64]. For very dense graphs with $m = \omega(n^2 / \lg n)$ edges
this running time is actually slower than the $O(n^2)$ running time of the original implementation,
but for sparse graphs the running time decreases significantly. Johnson [Joh77] uses a priority
queue with fixed depth to get running time $O(m)$ for dense graphs ($m = \Omega(n^{1+\varepsilon})$ for some
$\varepsilon > 0$).
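A minimal sketch of Dijkstra's algorithm with a binary heap is given below. Python's `heapq` module offers no decrease operation, so this variant (an assumption of the sketch, not the scheme described above) re-inserts a node whenever its label decreases and skips outdated heap entries; the heap then holds at most $m$ entries and the running time remains $O(m \lg n)$.

```python
import heapq


def dijkstra(adj, source):
    """Label-setting SSSP for non-negative weights.
    adj maps each node to a list of (neighbor, weight); returns labels of reachable nodes."""
    dist = {source: 0}
    done = set()  # nodes whose label is permanent
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # outdated entry of an already finalized node
        done.add(u)
        for v, w in adj[u]:
            if v not in done and d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


example = {"s": [("a", 2), ("b", 5)], "a": [("b", 1)], "b": [], "c": []}
```

For a single target $t$, the loop may stop as soon as $t$ is added to `done`, since its label is final at that point.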
The faster the operations of the priority queue, the better the overall running time. Using a
$d$-heap, insertions and decrease operations take time $O(\lg_d n)$ and deletions take time $O(d \lg_d n)$,
which yields total running time $O(m \lg_d n + n d \lg_d n)$. The optimal value for the parameter $d$ is
$d = \max\{2, \lceil m/n \rceil\}$, which yields total running time $O(m \lg_d n)$. The Fibonacci heap [FT87]
supports deletions in time $O(\lg n)$ and all the other operations in amortized constant time $O(1)$,
which yields the currently best time bound of $O(m + n \lg n)$. Alternatively, one may use
relaxed heaps [DGST88] or rank-pairing heaps [HST09], which are data structures with the same
performance guarantees.
If we restrict the range of the edge weight function $w(\cdot)$ to integers (or floats), better bounds
are possible by using special priority queues designed for the word RAM model (as defined in
Section 2.2.1). For an overview of the running times, see Table 2.5.
Component Hierarchy Algorithms
Sorting and priority queues are strongly related [Tho07]. For comparison-based sorting, there is
an information-theoretic time lower bound of $\Omega(n \lg n)$. To circumvent this bound, sorting must be
avoided. An SSSP algorithm is not actually required to sort the distances to all nodes. Indeed, the
sorting bottleneck can be avoided. Thorup [Tho99, Tho00a] gives an $O(m)$ algorithm for undirected
graphs with integer or floating point weights. He first constructs a component hierarchy (in linear
time) and then revisits the nodes. Although the algorithm is theoretically best-possible, it appears
to be hard to implement and not very efficient in practice [AI00].
Hagerup [Hag00b] generalizes the idea of the component hierarchy to directed graphs, for
which his algorithm (using an analogue of minimum spanning trees for directed graphs) runs in
time $O(m \lg\lg W)$. Actually, constructing the component hierarchy itself takes this time; an SSSP
search using the hierarchy requires time $O(m + n \lg\lg n)$.
Pettie and Ramachandran [PR02] generalize the component hierarchy to real weights. For
real weights and undirected graphs, if the ratio of any two edge weights is polynomial in $n$, their
algorithm runs in time $O(m + n \lg\lg n)$. Furthermore, the algorithm appears to be efficient in
practice as well [PRS02]. The algorithm's idea is to enforce a certain degree of balance in the
component hierarchy and, when computing the SSSP, to use a specialized priority queue that
takes advantage of this balance [Pet08b]. Unfortunately, it has been found that a hierarchy-based
algorithm cannot improve upon Dijkstra's algorithm to run in time $o(n \lg n)$ for general weights [PR02,
Pet04]. See also [Pet03, Section 3.6].
Note that the component hierarchy approach already captures a certain notion of having
preprocessing and query stages [AI00].
Time                                      Reference
$O(m \lg n)$                              [Wil64]
$O(m + n \lg n)$                          [FT87, DGST88]
$O(m \lg\lg W)$                           [Joh82, vEBKZ77]
$O(m + n \sqrt{\lg W})$                   [AMOT90]
$O(m \sqrt{\lg n})$                       [FW93]
$O(m + n \lg n / \lg\lg n)$               [FW94]
$O(m \lg\lg n)$                           [Tho00b]
$O(m + n \lg^{1/2+\varepsilon} n)$        [Tho00b]
$O(m + n \sqrt{\lg n \lg\lg n})$          [Ram96b]
$O(m + n \lg^{1/3+\varepsilon} n)$        [Ram97]
$O(m + n \lg\lg n)$                       [Tho04b]
$O(m + n \sqrt{\lg\lg n})$                [HT02]
Table 2.5: Running times for different implementations of Dijkstra's algorithm. $W$ denotes the
largest integer weight. The table is in large parts excerpted from [Tho99, p. 364]. The algorithms
in the first two rows work for both the comparison/addition model and the word RAM model. The
analysis of the algorithms shown in row 3 and below only works in the word RAM model.
Label-Correcting Algorithms
In label-correcting algorithms, all distance labels are temporary and they are guaranteed to be
exact and optimal in the end only.⁷ Label-correcting algorithms can solve more general problems
(they can, for example, find a cycle with negative length) and they are more flexible but they
are usually slower. Some version of the algorithm often solves the All-Pairs Shortest Path (APSP)
problem. It is actually unknown whether computing the result of a point-to-point shortest path
query is easier than computing APSP on arbitrarily weighted graphs.
The generic label-correcting algorithm runs in time $O(\min\{n^2 m W, m 2^n\})$ [For56, Moo59,
FF58, GKP85, GKPS85]. There is an efficient FIFO implementation, which runs in time $O(mn)$
[Bel58] (the well-known Bellman-Ford algorithm [Bel67, For56]; for an outline, see also [CLRS01,
Chapter 24.1]), and a dequeue implementation running in time $O(\min\{mnW, m 2^n\})$ but being
very efficient in practice [Pap74, Pap80, GP86, Ber93].
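The FIFO variant can be sketched as follows: nodes whose label has just decreased wait in a queue and are scanned in first-in, first-out order. The sketch assumes a directed graph with possibly negative weights; the negative-cycle test (a node entering the queue $n$ times) and all names are choices made for this illustration.

```python
from collections import deque


def bellman_ford_fifo(adj, source):
    """Label-correcting SSSP with a FIFO queue.
    Returns the distance labels, or None if a negative cycle is reachable from source."""
    n = len(adj)
    inf = float("inf")
    dist = {v: inf for v in adj}
    dist[source] = 0
    enqueued = {v: 0 for v in adj}  # how often each node has entered the queue
    enqueued[source] = 1
    queue = deque([source])
    in_queue = {source}
    while queue:
        u = queue.popleft()
        in_queue.discard(u)
        for v, w in adj[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w  # labels stay tentative until the queue is empty
                if v not in in_queue:
                    enqueued[v] += 1
                    if enqueued[v] >= n:
                        return None  # without a negative cycle, n - 1 entries suffice
                    queue.append(v)
                    in_queue.add(v)
    return dist


example = {"s": [("a", 4), ("b", 1)], "b": [("a", -2)], "a": []}
negative_cycle = {"s": [("a", 1)], "a": [("s", -2)]}
```

In the first example the label of `a` is first set to 4 and later corrected to -1, which shows why no label is permanent before termination.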
The Simplex algorithm can also be used to compute shortest paths [Orl83, Akg88, GHK90,
AO92, GJ99, SNGM09].
⁷ This convergence behavior makes the correctness of an algorithm more difficult to prove.
Restricted Graph Classes, Average-Case Analysis, Expected Running Times, and Practical Considerations
Due to the importance of the shortest path problem, researchers have been interested in parallel
algorithms [Coh00, PK85, KS93b, KS96, BTZ98, TZ00, CMMS98, SS99, HTB01, MS03, SPZ06],
I/O-efficient algorithms [MZ08, MZ03, BFMZ04, JZ05, MZ06, ALZ07, MO09, Mey09],
algorithms for special graph classes such as planar graphs [Fre87, HKRS97], sparse graphs [GOY76,
Wag76], or Euclidean graphs [SV86], and in the expected performance (average-case analysis) of
algorithms [NMM78, Nos85, Mey03, Gol08, Hag06]. Also, comparisons [Dre69, ZN00] and
experimental evaluation for practical purposes [Hit68, BH69, GW73, Pap74, Gol76, II84, IHI+94,
CGR96, ZN98, CGS99, JMN99, Shi00, Gol01, PRS02, DI04] have been an important part of
investigations.
Point-to-point shortest path problems have been considered early on [Min57, Dan60, Kle64,
Smo75]. In practice, point-to-point shortest paths can be computed faster when bidirectional
search [Nic66, Boo67, Cha67, Mur67b, Poh71, SdC77, dC83] is used. For road networks, if
in addition to the graph the node coordinates are known, A* heuristics [Gel63, Sam63, KHI+86,
Dor67, HNR68, Gel77] based on geometry help to guide the search towards the target [SV86].
Approaches to the problem of computing shortest paths using an additional preprocessed data
structure are reviewed in Chapter 3.
2.4.2 All Pairs Shortest Path (APSP) Algorithms
In the following, we consider the static version of the APSP problem, which is for example sur-
veyed by Schrijver [Sch03] and Pettie [Pet08a]. For the dynamic version of the problem, we refer
to the Encyclopedia of Algorithms [DI08, Ita08]. We do not list parallel algorithms.
Given a graph $G = (V, E)$, the All Pairs Shortest Paths (APSP) problem is to compute a
shortest path, respectively the distance, between all pairs of nodes $s, t \in V$. Since there are
$\binom{n}{2} = \Theta(n^2)$ pairs of nodes for which the distance needs to be output, the time complexity
must be at least $\Omega(n^2)$. Strong lower bounds are hard to prove [YAR77, GYY80]. Kerr [Ker70]
and Nakamori [Nak72] give lower bounds for algorithms with restricted operations. Karger et
al. [KKP93] show that any algorithm based on path comparison must take time $\Omega(mn)$. For
general algorithms, the gap between the lower and the upper bound is huge.
Given an SSSP algorithm with running time $S(n, m, W)$, the straightforward approach to
solve APSP is to run the SSSP algorithm for each node. This approach yields a running time of
$O(n \cdot S(n, m, W))$. For non-negative weights, we may use Dijkstra's algorithm, which yields time
$O(mn + n^2 \lg n)$. The algorithm of Johnson [Joh77] matches this bound also for negative weights
(by using Dijkstra's algorithm [Dij59] and the Bellman-Ford algorithm [Bel67, For56]). For very
sparse graphs with $m = O(n)$ edges, this is already best possible.
In practice, for sparse graphs, an initial graph decomposition step may potentially decrease the
total computation time [LS67, FLM67, KY65, Mil66, Hu68, HT69, Yen71, GKN74, LR82] (see
Figure 2.1). For some graphs we can save a few unnecessary operations by considering essential
edges only [KKP93, McG95]. Also, there are algorithms that are efficient on average [Spi73,
TM80, Blo83, FG85, MT87, MP97, MC09]. In both cases, the worst-case complexity does not
change.
Figure 2.1: Illustration of the network decomposition technique as originally depicted by Hu and
Torres [HT69, Figure 4].
The question is: can we do better for dense graphs (with algorithms not based on path comparison)?
Label-correcting algorithms may help to improve the running time. The famous Floyd-Warshall
algorithm [Flo62, War62] is based on dynamic programming and it runs in time $O(n^3)$
(see also [CLRS01, Section 25.2]). This is optimal if only triple operations on paths and edge
costs are allowed [IN72]. Yen [Yen72, Yen73] and Moffat and Takaoka [MT84] further investigate
the constants hidden in the O-notation. Fredman improves the time complexity to
$O\left(n^3 \left(\frac{\lg\lg n}{\lg n}\right)^{1/3}\right)$ [Fre76]. Based on Fredman's algorithm, Takaoka [Tak92] improves the running
time to $O\left(n^3 \left(\frac{\lg\lg n}{\lg n}\right)^{1/2}\right)$. Feder and Motwani [FM95] give an $O\left(mn \frac{\lg(n^2/m)}{\lg n}\right)$ time algorithm
for weighted graphs and an $O\left(n^3 \frac{1}{\lg n}\right)$ time algorithm for unweighted graphs. Takaoka
gives an algorithm running in time $O\left(n^3 \frac{\lg\lg n}{\lg n}\right)$ [Tak05]. For directed graphs with real edge
lengths, Zwick [Zwi06] gives an $O\left(n^3 \frac{\sqrt{\lg\lg n}}{\lg n}\right)$ time algorithm. The algorithm of Chan [Cha07]
runs in time $O\left(n^3 \frac{\lg\lg n}{\lg^2 n}\right)$. Blelloch et al. [BVW08] save a lg-factor compared to Feder and
Motwani [FM95]. They give the currently fastest combinatorial algorithm, which requires time
proportional to $O\left(mn \frac{\lg(n^2/m)}{\lg^2 n}\right)$. For an overview, see Table 2.6.
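The cubic dynamic program of Floyd and Warshall mentioned above can be sketched in a few lines. The matrix representation (a list of lists with `inf` for missing arcs) and the example weights are assumptions made for this illustration.

```python
from math import inf


def floyd_warshall(weights):
    """All-pairs distances from an n x n weight matrix (inf marks a missing arc)."""
    n = len(weights)
    dist = [row[:] for row in weights]
    for k in range(n):          # allow node k as an intermediate node
        for i in range(n):
            dik = dist[i][k]
            if dik == inf:
                continue        # no path i -> k, nothing to relax
            for j in range(n):  # the "triple operation" on (i, k, j)
                if dik + dist[k][j] < dist[i][j]:
                    dist[i][j] = dik + dist[k][j]
    return dist


# Hypothetical directed example on four nodes.
W = [
    [0, 3, inf, 7],
    [8, 0, 2, inf],
    [5, inf, 0, 1],
    [2, inf, inf, 0],
]
D = floyd_warshall(W)
```

The three nested loops make the $O(n^3)$ bound directly visible; the improvements cited above shave only logarithmic factors off this count.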
In forty years of research, only a logarithmic improvement in the running time has been
achieved. A combinatorial algorithm running in truly sub-cubic time would be a major breakthrough.
Better theoretical bounds exist through an algebraic [Car71] approach: the APSP problem
can be solved using matrix multiplication (APSP for directed graphs is at least as hard as
boolean matrix multiplication). It is also possible to apply algorithms for fast matrix multiplication
[Str69]. We refer to the survey [Tak08] and briefly state the result. Although very important
in theory, the algorithms are unfortunately impractical to implement. Let $M(n)$ denote the time
it takes to multiply two $n \times n$ matrices. The fastest known algorithm is due to Coppersmith
and Winograd [CW90] with $M(n) = O(n^{\omega})$, where $\omega < 2.376$. The APSP problem for undirected
graphs with small weights can be solved in time $\tilde{O}(M(n))$ [Yuv76, Rom80, GM93, GM97,
AGMN92, AGM97, Sei95, Zwi98, SZ99, Zwi02].
Time $O(\cdot)$                                   Reference
$mn$                                              [Bel67, For56]
$n^3$                                             [Flo62, War62]
$n^3 \cdot (\lg\lg n / \lg n)^{1/3}$              [Fre76]
$n^3 \cdot (\lg\lg n / \lg n)^{1/2}$              [Tak92]
$mn \cdot \lg(n^2/m) / \lg n$                     [FM95]
$n^3 \cdot (\lg\lg n / \lg n)^{5/7}$              [Han04]
$n^3 \cdot \lg\lg n / \lg n$                      [Tak05]
$n^3 \cdot \sqrt{\lg\lg n} / \lg n$               [Zwi06]
$n^3 \cdot 1 / \lg n$                             [Han08a]
$n^3 \cdot (\lg\lg n / \lg n)^{5/4}$              [Han08b]
$n^3 \cdot \lg\lg n / \lg^2 n$                    [Cha07]
$mn \cdot \lg(n^2/m) / \lg^2 n$                   [BVW08]
Table 2.6: Combinatorial algorithms for the All Pairs Shortest Path problem.
All Pairs Approximate Shortest Path (APASP) Algorithms
For some applications, the computation of exact distances may be too expensive. Approximation
algorithms can run asymptotically faster. We also refer to the survey by Sen [Sen09,
Section 5].
For undirected, unweighted graphs, Aingworth et al. [ACIM99] solve APASP with stretch
$(1, 2)$ in time $O(n^{2.5} \lg n)$ and Dor et al. [DHZ00] solve APASP with stretch $(1, O(\lg n))$ in time
$\tilde{O}(n^2)$. Cohen and Zwick [CZ01] solve APASP with stretch $(2, 0)$ in time $\tilde{O}(n^{3/2} m^{1/2})$, with
stretch $(7/3, 0)$ in time $\tilde{O}(n^{7/3})$, and with stretch $(3, 0)$ in time $\tilde{O}(n^2)$. Baswana and Kavitha
[BK06], Berman and Kasiviswanathan [BK07], and Baswana et al. [BGS09] give an $\tilde{O}(n^2)$
time algorithm for all pairs $(2, W)$-approximate shortest paths, where $W$ denotes the largest
edge weight. Based on matrix multiplication, Zwick [Zwi02] obtains running time $O(n^{\omega}/\epsilon)$ for
$(1 + \epsilon, 0)$-approximate distances. Roditty and Shapira [RS08] give a smooth transition between
Zwick's APASP result and the exact APSP algorithms based on matrix multiplication. Their distances
have sublinear additive distortion.
2.4.3 Many Pairs Shortest Path (MPSP) Algorithms
We may want to compute the shortest paths between pairs from a restricted subset of vertices
$U \subseteq V$. For this, an APSP algorithm might be computing too much. Let $s := |U|$. For s = (1)
pairs, the algorithm of Pettie and Ramachandran [PR02] is currently the fastest method to compute
exact shortest paths.
Restricted graph classes. The algorithm by Cabello [Cab06] computes $s$-many distances in
planar graphs in time $O(s^{2/3} n^{2/3} \lg n + n^{4/3} \lg^{1/3} n)$. The algorithm makes use of an efficient
distance oracle for planar graphs (see Section 3.1.3).
Many Pairs Approximate Shortest Path (MPASP) Algorithms
For certain applications, approximate distances between a subset of the nodes may be enough.
For an overview of results, see [Elk05, Table I]. Similar to APASP, many algorithms use spanners
(subgraphs that approximate distances, for details see Section 2.3) to reduce the problem size.
Elkin [Elk05, p. 284] notes that the existence of an algorithm for constructing an $(\alpha, \beta)$-spanner
with $O(n^{1+\delta})$ edges ($\delta > 0$) in time $O(T)$ implies the existence of an algorithm for the MPASP
problem with $s$ sources with running time $O(T + s \cdot n^{1+\delta})$.
Also, any $(\alpha, \beta)$-approximate distance oracle (to be defined in the next chapter) with preprocessing
time $O(T)$ and query time $O(Q)$ can trivially solve MPASP in time $O(T + s \cdot Q)$.
It is interesting to investigate whether it helps when we know the $s$ pairs beforehand (offline
scenario).
Aingworth et al. [ACIM99] compute $(1, 2)$-approximate distances in time $O(n^{1.5} \sqrt{s \lg n} + n^2 \lg^2 n)$.
Dor et al. [DHZ00] compute $(1, 1/\epsilon)$-approximate shortest paths in time $\tilde{O}(n^{2+\epsilon})$.
Cohen [Coh94] computes $(1 + \epsilon, \mathrm{polylog}(n))$-approximate shortest paths in time
$O(mn^{\rho} + sn^{1+\zeta})$. Elkin [Elk05] computes $(1 + \epsilon, O(1))$-approximate shortest paths in the same time.
Roditty et al. [RTZ05] construct a distance oracle for a subset of the nodes $U \subseteq V$. Based on this,
they can compute $(2k - 1, 0)$-approximate distances in time $\tilde{O}(ms^{1/k} + ks)$.
2.4.4 Shortest Path and Distance Queries
Processing shortest path and distance queries in graphs can be seen as a generalization of the
APSP problem and the MPSP problem, for which we do not know the pairs beforehand (online
scenario). The efficient processing of these queries and the corresponding data structures are the
main focus of this thesis. The problem is defined in Chapter 3, wherein we also review related
work.
Die Aufgabe [...] besteht darin, ein Verfahren anzugeben, wie
man sich aus einem Labyrinthe herausfindet.
(The task is to give a method for finding your way out of a maze.)
Ludwig Christian Wiener (1826-1896) [Wie73]
3 Review of Short Path Query Processing
The eminent importance of the shortest path problem has stimulated a large body of research, both
of theoretical and practical nature. Some important results for the general problem (without being
exhaustive) are mentioned in Section 2.4. In this chapter, we review results on the point-to-point
(approximate) shortest path query problem. In this overview, the query problem is restricted to
discrete, static graphs with positive edge weights. Also, the only criterion for the optimality of a
path is its length (for example, there are no turn penalties [Cal61, KP69]). If paths are required to
satisfy constraints other than distance (for example bounded leg paths [DP08]), the corresponding
results are not covered in this review.
For distance and path queries in computational geometry, we refer to [Mit97]. For the specific
problem in computational geometry concerning robot path planning, we refer to [Sha97b,
KLMR98, Lat91]. For questions regarding query processing in dynamic graphs, see [Ram96a,
pp. 30-100] and [Ita08, Ber09, BHS07, RZ04, MN67]; for time-dependent weights, see [OR90];
and for negative weights, see [YZ05].
The query scenario is different from the classical scenario described in the previous sections
(2.2 and 2.4). We are first presented with a usually large graph $G = (V, E)$. A so-called
preprocessing algorithm may compute certain information or a data structure to prepare
for the next phase. After this preprocessing algorithm has been executed, users will ask queries,
which should be answered efficiently. In computational geometry, this and similar scenarios are
sometimes called repetitive-mode (as opposed to single-shot) scenarios [PS85, p. 37].
A lazy strategy would be not to precompute a data structure but to use a classical SSSP algorithm
to answer queries. The query would then take roughly linear time. An eager strategy would
be to precompute the result for all possible queries using an APSP algorithm.^1 Both strategies
have their advantages and disadvantages: for the first strategy, no preprocessing is necessary but
the query processing is very slow; for the second strategy, the query execution is extremely fast
but the preprocessing step is very expensive and the space consumption is prohibitively large for
many graphs. In the path query scenario, we mediate between these two extremes: we analyze
the tradeoff between space, preprocessing time, and query time. If the query algorithm is allowed
to return an approximate shortest path, the worst-case stretch is also an important factor of the
tradeoff.
In the following, we review related work on shortest path and distance query processing for
graphs. Theoretical results are presented in Section 3.1 and practical results are summarized in
Section 3.2. The review in this chapter is made with best efforts; however, the list of methods and
algorithms is not exhaustive.
^1 We assume no knowledge about the query distribution. In practice, we might actually know that some queries are
more frequent than others.
3.1 Theoretical Distance Oracles
Thorup and Zwick [TZ05] coined the term distance oracle, which is a data structure that, after
preprocessing a graph $G = (V, E)$, allows for efficient (approximate) distance and shortest path
queries.
Definition 30. An ($(\alpha, \beta)$-approximate) distance oracle for a class of graphs $\mathcal{G}$ consists of a data
structure $S$ and a query algorithm with the following characteristics:
- The preprocessing time is the worst-case time required to construct the data structure for
  any $G \in \mathcal{G}$.
- The space complexity refers to the worst-case size of the data structure for any $G \in \mathcal{G}$.
After preprocessing $G = (V, E)$, the data structure $S$ supports (approximate) distance queries for
all pairs of vertices $u, v \in V$, returning a value $\tilde{d}_S(u, v)$. The query algorithm and its result are
characterized as follows.
- The query time is the worst-case time required to compute $\tilde{d}_S(u, v)$ among all
  $G = (V, E) \in \mathcal{G}$ and $u, v \in V$.
- A distance oracle $S$ is said to have stretch $(\alpha, \beta)$ if for all $G = (V, E) \in \mathcal{G}$ and $u, v \in V$
  its query algorithm satisfies
  $d_G(u, v) \leq \tilde{d}_S(u, v) \leq \alpha \cdot d_G(u, v) + \beta$.
  The stretch is also called distortion.
In addition to the worst-case measures, the average-case behavior of the time and space complexities,
and, especially, the stretch, may also be of interest. For some distance oracles, only the
average stretch is guaranteed. For other distance oracles, the stretch condition is satisfied except
for $\epsilon n^2$ pairs of nodes. This fraction $\epsilon \in [0, 1)$ is called slack. If not explicitly stated otherwise,
stretch means the worst-case stretch as in Definition 30.
We summarize the known results for general graphs (lower bounds in Section 3.1.1 and upper
bounds in Section 3.1.2), and we give an overview of the distance oracles for restricted classes of
graphs (Section 3.1.3). Some common techniques are explained in Section 2.3.
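The stretch condition of Definition 30 can be checked mechanically on small instances. The sketch below is an illustration under stated assumptions: the helper names, the unweighted path graph, and the single-landmark "oracle" are hypothetical and chosen only because their stretch is easy to verify by hand.

```python
from collections import deque


def bfs_distances(adj, source):
    """Exact hop distances d_G(source, .) in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def has_stretch(adj, estimate, alpha, beta):
    """Check d_G(u,v) <= estimate(u,v) <= alpha * d_G(u,v) + beta for all reachable pairs."""
    for u in adj:
        for v, d in bfs_distances(adj, u).items():
            if not d <= estimate(u, v) <= alpha * d + beta:
                return False
    return True


# Hypothetical oracle on the path 0-1-2-3-4: route every query via landmark 2.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
to_landmark = bfs_distances(path, 2)


def via_landmark(u, v):
    return 0 if u == v else to_landmark[u] + to_landmark[v]
```

On this path, the pair (0, 1) has exact distance 1 but estimate 3, so the toy oracle has stretch $(3, 0)$ and also $(1, 2)$, but not $(2, 0)$.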
3.1.1 Lower Bounds
For directed graphs, distance oracles are closely related to reachability oracles. We do not review
upper bounds on algorithms for reachability oracles. Any lower bound on the space complexity of
reachability oracles directly implies a lower bound for any distance oracle with finite stretch. The
first part of the definition for reachability oracles is equivalent to the first part of the definition for
distance oracles (Definition 30) except that reachability oracles always consider directed graphs.
Definition 31. A reachability oracle for a class of directed graphs $\mathcal{G}$ consists of a data structure $S$
and a query algorithm with the following characteristics:
- The preprocessing time is the worst-case time required to construct the data structure for
  any $G \in \mathcal{G}$.
- The space complexity refers to the worst-case size of the data structure for any $G \in \mathcal{G}$.
After preprocessing $G = (V, E)$, the data structure $S$ supports reachability queries for all pairs of
vertices $u, v \in V$, returning a boolean value $\mathrm{reach}_S(u, v) \in \{\mathrm{true}, \mathrm{false}\}$. The query algorithm and its
result are characterized as follows.
- The query time is the worst-case time required to compute $\mathrm{reach}_S(u, v)$ among all
  $G = (V, E) \in \mathcal{G}$ and $u, v \in V$.
- The query algorithm returns $\mathrm{reach}_S(u, v) = \mathrm{false}$ if and only if $d_G(u, v) = \infty$,
  and $\mathrm{reach}_S(u, v) = \mathrm{true}$ otherwise.
For general directed graphs, any distance oracle with finite worst-case stretch requires space
$\Omega(n^2)$ bits. Consider a complete bipartite digraph $D = (V_1 \cup V_2, A)$, where all arcs are directed
from $V_1$ to $V_2$. Any scheme that encodes reachability [AF90] information for $D$ and all its $2^{\Theta(n^2)}$
subgraphs requires quadratic space for some subgraphs.
For sparse graphs with $O(n)$ edges, the following tradeoff between space complexity and
query time has been proven. For details, see Section 4.2.1.
Theorem 5 (Pătraşcu [Pat08a, Theorem 2]). A reachability oracle using space $S$ in the cell-probe
model with $w$-bit cells requires query time $t = \Omega\left(\lg n / \lg \frac{Sw}{n}\right)$.
Corollary 6. A distance oracle for directed graphs with finite stretch using space $S$ in the
cell-probe model with $w$-bit cells requires query time $t = \Omega\left(\lg n / \lg \frac{Sw}{n}\right)$.
For dense graphs, the information-theoretic argument extends to undirected graphs as follows.
For undirected graphs, Thorup and Zwick [TZ05] prove a lower bound on the size of distance
oracles for a certain stretch. The worst-case instances are dense graphs with large girth. Recall that
the girth of a graph is the length of its shortest cycle and recall that $m_g(n)$ denotes the maximum
number of edges in a graph with $n$ vertices and girth at least $g$ (for details, see Section 2.3.1).
Theorem 7 (Girth-based lower bound [TZ05, Proposition 5.1]). For any integer $k > 0$ (called
stretch parameter), any distance oracle for graphs with $n$ nodes and multiplicative stretch
$\alpha < 2k + 1$ needs space at least $\Omega(m_{2k+1}(n))$ bits.
Erdős' girth conjecture [Erd64, ES63] predicts that $m_{2k+1}(n) = m_{2k+2}(n) = \Omega(n^{1+1/k})$. For
the values of $k$ for which the conjecture has been proven (including 1, 2, 3, and 5, see Table 2.4),
this yields a space lower bound of $\Omega(n^{1+1/k})$ bits.
Thorup and Zwick's lower bound proof [TZ05, Proposition 5.1] roughly works as follows:
Recall the argument of the lower bound for graph spanners (Definition 26): in a graph with girth
$g = \alpha + 2$, removing any edge increases the distance between its endpoints from 1 to at least
$\alpha + 1$. Let $G$ be a graph with $n$ nodes, $m_g(n)$ edges, and girth $g$. All $2^{m_g(n)}$ subgraphs $G'$ of the
graph $G$ also have large girth $g(G') \geq g(G)$. The distance oracle must distinguish between any
two different subgraphs $G'$ and $G''$, and it cannot omit any edges since there is no alternative short
path. Thus, for a distance oracle with stretch less than $g$, at least $\lg 2^{m_g(n)} = \Omega(m_g(n))$ bits of
space are necessary.
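The key step of the argument, that deleting an edge from a graph of girth $g$ pushes its endpoints from distance 1 to distance at least $g - 1$, can be observed on the simplest graph of girth $g$, a single cycle. The following sketch is only an illustration; the helper names and the choice $g = 6$ are assumptions.

```python
from collections import deque


def bfs_distance(adj, source, target):
    """Hop distance between two nodes of an unweighted graph (inf if disconnected)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return float("inf")


def cycle(g):
    """Cycle on g nodes; its girth is exactly g."""
    return {i: {(i - 1) % g, (i + 1) % g} for i in range(g)}


def without_edge(adj, u, v):
    """Copy of the graph with the undirected edge {u, v} removed."""
    pruned = {x: set(nbrs) for x, nbrs in adj.items()}
    pruned[u].discard(v)
    pruned[v].discard(u)
    return pruned


g = 6
full = cycle(g)
pruned = without_edge(full, 0, 1)
```

Here the distance between nodes 0 and 1 grows from 1 to $g - 1 = 5$, so an oracle whose stretch is too small to confuse these two values must encode whether the edge is present.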
Note that the tradeoff is between space and stretch only. The lower bound does not refer to the
query time in any way; it even holds if the query algorithm is allowed to access the complete data
structure. The lower bound essentially states that we cannot compress certain dense graphs with
large girth to less than their original size, or, alternatively, that the size of the data structure must
be at least $\Omega(m)$.
3.1.2 General Graphs
In this section, if not stated otherwise, we consider distance oracles for general undirected graphs.
An overview is listed as Table 3.1. For a recent survey, we refer to Sen [Sen09, Section 4].
Thorup and Zwick, in their seminal work [TZ05], provide both the lower and the matching
upper bound: the space complexities of their distance oracles are tight with respect to the space
lower bound presented in the previous section (for those values of the stretch parameter $k$ for
which the girth conjecture has been proven). For any integer stretch parameter $k \geq 1$, their
randomized algorithm (deterministic version by Roditty et al. [RTZ05]) can preprocess a graph
with $n$ nodes and $m$ edges in time $\tilde{O}(mn^{1/k})$ to construct a distance oracle of size $O(kn^{1+1/k})$
with query time $O(k)$ and stretch $2k - 1$. Note that for $k = 1$ this yields an exact distance oracle
with preprocessing time $O(mn)$, which equals the running time of many APSP algorithms.
Let us look at their method in more detail. For the sake of exposition, let $k = 2$. The preprocessing
step is outlined as Algorithm 1 (original in [TZ05, Fig. 5]) and the query step is outlined
as Algorithm 2 (original in [TZ05, Fig. 2]). For efficient query times, preprocessed information is
stored in a hash table [FKS84] for each node. If the actual approximate shortest path is needed,
each edge of the path can be generated in constant time $O(1)$. For a graph $G = (V, E)$ and a node
$u \in V$, the open ball with respect to a set $S \subseteq V$ is defined by $B_S(u) := \{v : d(u, v) < d(u, S)\}$
(Definition 13). Let $L(u)$ denote the node in $S$ that is nearest to $u$. We also call this node the
landmark of $u$.
Algorithm 1 Preprocess ($G = (V, E)$)
  $S = \emptyset$
  for each $v \in V$ do
    with probability $n^{-1/2}$, add $v$ to $S$
  end for
  for each $v \in S$ do
    run SSSP search from $v$ in $G$
    for each node $u \neq v$, store $d(u, v)$ and let $\mathrm{FirstNode}_u(v)$ be the penultimate node on the
    shortest path; update $L(u)$ if $v$ is nearest landmark
  end for
  for each $u \in V$ do
    compute and store $B_S(u)$ (including distances)
    for each $v \in B_S(u)$ let $\mathrm{FirstNode}_u(v)$ be the first node on the shortest path to $v$.
  end for
Algorithm 2 Distance ($s, t$)
  if $s \in B_S(t)$ or $t \in B_S(s)$ then
    return local distance $d(s, t)$ from the information at $s$ or $t$.
  else
    return $d(s, L(t)) + d(L(t), t)$
  end if
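Algorithms 1 and 2 can be turned into a short runnable sketch for $k = 2$. This is an illustration under stated assumptions, not the implementation of [TZ05]: it stores distances only (no FirstNode pointers), assumes a connected undirected graph with positive weights, forces at least one landmark when the random sample is empty, and uses hypothetical names throughout.

```python
import heapq
import random
from math import inf


def dijkstra(adj, source, radius=inf):
    """Distances from source to every node at distance < radius."""
    dist = {source: 0}
    done = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d >= radius:
            break  # every remaining node lies outside the open ball
        if u in done:
            continue
        done[u] = d
        for v, w in adj[u]:
            if d + w < dist.get(v, inf):
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return done


def preprocess(adj, seed=0):
    rng = random.Random(seed)
    p = len(adj) ** -0.5                      # sampling probability n^(-1/2)
    landmarks = [v for v in adj if rng.random() < p]
    if not landmarks:                         # assumption: keep at least one landmark
        landmarks = [next(iter(adj))]
    from_landmark = {l: dijkstra(adj, l) for l in landmarks}
    nearest = {u: min(landmarks, key=lambda l: from_landmark[l][u]) for u in adj}
    # Open ball B_S(u): all nodes strictly closer to u than its landmark L(u).
    balls = {u: dijkstra(adj, u, radius=from_landmark[nearest[u]][u]) for u in adj}
    return from_landmark, nearest, balls


def distance(oracle, s, t):
    from_landmark, nearest, balls = oracle
    if t in balls[s]:
        return balls[s][t]                    # exact local distance stored at s
    if s in balls[t]:
        return balls[t][s]                    # exact local distance stored at t
    via = from_landmark[nearest[t]]
    return via[s] + via[t]                    # d(s, L(t)) + d(L(t), t)


def ring(n):
    """Hypothetical test graph: unit-weight cycle with one chord of weight 2."""
    adj = {i: [((i + 1) % n, 1), ((i - 1) % n, 1)] for i in range(n)}
    adj[0].append((n // 2, 2))
    adj[n // 2].append((0, 2))
    return adj


graph = ring(16)
oracle = preprocess(graph)
estimate = distance(oracle, 3, 11)
```

If $s \notin B_S(t)$, then $d(t, L(t)) \leq d(s, t)$, so the detour via $L(t)$ costs at most $d(s, t) + 2\,d(t, L(t)) \leq 3\,d(s, t)$, which is the stretch $2k - 1 = 3$.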
As mentioned before, the relationship between data structure size and stretch is almost optimal
with respect to their lower bound in Theorem 7. For those values of the stretch parameter $k$ for
which the girth conjecture by Erdős has been proven (see Table 2.4), $\Omega(n^{1+1/k})$ bits of space are
necessary for any distance oracle with multiplicative stretch less than $2k + 1$. Thorup and Zwick
achieve space $O(kn^{1+1/k})$ words for multiplicative stretch $2k - 1$.
The time complexities (preprocessing and query) are not tight. Both have been improved upon
independently.
For weighted graphs, Baswana and Kavitha [BK06] (by using a new APASP scheme) reduce
the preprocessing time down to $O(n^2 \lg n)$. For $k > 2$ they maintain query time $O(k)$, but for
$k = 2$ the query time is $\Theta(\lg n)$. For unweighted graphs, Baswana and Sen [BS06, Theorem 1.2]
(by using a $(2, 1)$-spanner) reduce the preprocessing time down to expected quadratic time. For
unweighted graphs, with an additional additive term (stretch $(3, 8)$), Baswana et al. [BGSU08]
achieve sub-quadratic preprocessing time $O(\min\{m + n^{23/12}, mn^{1/2}\})$, space $O(n^{3/2})$, and query
time $O(1)$. In a recent survey, Sen claims the stretch to be $(3, 10)$ instead [Sen09, Table 1]. For
general $k \geq 3$, Baswana et al. obtain stretch $(2k - 1, 2k - 2)$, space $O(kn^{1+1/k})$, and preprocessing
time $O\left(m + kn^{\frac{3}{2} + \frac{4k-3}{k(4k-6)}}\right)$.
Mendel and Naor [MN06], based on Ramsey partitions [LS93, CKR04, BLMN05], reduce the
query time down to constant $O(1)$, but they sacrifice a constant factor in the stretch. For stretch
$(O(k), 0)$, their oracle needs space $O(n^{1+1/k})$. For their data structure, the best-known construction
time of $O(mn^{1/k} \lg^2 n)$ is due to Mendel and Schwob [MS08b].
For general graphs, Bourgain's theorem (Theorem 3) yields an $(O(\lg n), 0)$-approximate distance
oracle with query time $\tilde{O}(1)$ and space complexity $\tilde{O}(n)$.
Derungs et al. [DJW07] consider the setting where the query algorithm may access both the
data structure (termed index graph) and the original graph (whose representation must remain
unchanged). The access to the original graph is limited. In practice, a graph may be stored in
external memory but a small index can be stored in internal memory (the definition differs from
the definition for typical external memory algorithms [AV88, VS94]; it is related to the indexing
model [Yao90, DLO03]). Their tradeoff is between the size of the data structure (which may be
accessed for free at query time), the number of edges read from the original graph at query time,
and the stretch. The index graph is supposed to have sublinear size $o(n)$. The probe factor $r$ is
defined as the number of edges read from the original graph divided by the number of edges on
the result path. Derungs et al. provide an index graph with
_
nlg n

r
_
1+2
edges, using which they
compute paths of stretch (4 + 1, 0), where is another parameter. They also prove two lower
bounds. For probe factor 0 (no edges can be read from the original graph), the index requires
space at least $\frac{n \lg n}{2}$ bits. For probe factor $\frac{n}{10}$ and index size $\frac{n \lg n}{10}$ bits, no stretch less than $(5, 0)$ is
possible.
For an overview of distance oracles for general graphs, see Table 3.1.
Suppose that we need a distance oracle with quasi-linear space consumption. The oracle of
Thorup and Zwick [TZ05] achieves this for k = lg n with O(lg n) multiplicative stretch and
O(lg n) query time. The oracle of Mendel and Naor [MN06] improves the query time to O(1).
It would be very useful to reduce the stretch to O(1) instead. The space lower bound proves that
such a reduction is impossible for dense graphs. It is however open whether such oracles exist for
sparse graphs.
Preprocessing              Space                  Query              Stretch            Reference
$O(mn)$                    $O(n^2)$               $O(1)$             $(1, 0)$           APSP
$O(1)$                     $O(m)$                 $O(m)$             $(1, 0)$           BFS
$\tilde{O}(kmn^{1/k})$     $O(kn^{1+1/k})$        $O(k)$             $(2k - 1, 0)$      [TZ05, RTZ05]
$\tilde{O}(mn^{1/k})$      $O(n^{1+1/k})$         $O(1)$             $(O(k), 0)$        [MN06, MS08b]
$\tilde{O}(n^2)$           $O(n^{3/2})$           $\Theta(\lg n)$    $(3, 0)$           [BK06]
$\tilde{O}(n^2)$           $O(kn^{1+1/k})$        $O(k)$             $(2k - 1, 0)$      [BK06], $k \geq 3$
$O(n^2)$                   $O(kn^{1+1/k})$        $O(k)$             $(2k - 1, 0)$      [BS06]
$O(m + n^{23/12})$         $O(n^{3/2})$           $O(1)$             $(3, 10)$          [BGSU08]
                           $\Omega(n^{1+1/k})$                       $< (2k + 1, 0)$    [TZ05]
                           $n^{1+\Omega(1/t)}$    $t$                $< (1, 2)$         Lemma 11
                           $n^{1+\Omega(1/t)}$    $t$                $< (\alpha, 0)$    Theorem 8
Table 3.1: Time and space complexities of distance oracles for general, undirected, unweighted
graphs (some upper bounds extend to weighted graphs). The upper part of the table lists upper
bounds; the lower part lists lower bounds (some restrictions on $k$ and $\alpha$ apply). For the result
by Baswana et al. [BGSU08], the stretch is taken from [Sen09, Table 1]. Approximate distance
oracles are included only if the space requirement is at most $o(n^2)$.
To conclude this section, note that any APSP or APASP algorithm constructs a complete distance
table, which may serve as a distance oracle. The space requirement is $O(n^2)$ and the query
time is $O(1)$. Distances in the table are optimal if computed by an APSP algorithm. For APASP
algorithms, the tradeoff between stretch and space is not optimal with respect to the lower bound of
Section 3.1.1. APASP algorithms may improve upon APSP algorithms in terms of the preprocessing
time. Recall that the distance oracle of Thorup and Zwick [TZ05] is restricted to stretch factors
that are odd integers. If multiplicative stretch less than 3 is desired, the preprocessing algorithm
of their oracle can only function as an APSP algorithm. The APASP algorithms by Cohen and
Zwick [CZ01] yield a $(2, 1)$-approximate distance oracle with preprocessing time $\tilde{O}(n^{3/2} m^{1/2})$
and a $(7/3, 0)$-approximate distance oracle with preprocessing time $\tilde{O}(n^{7/3})$. The algorithms by
Baswana and Kavitha [BK06] improve this to a $(2, 0)$-approximate distance oracle with preprocessing
time $\tilde{O}(mn^{1/2} + n^2)$ and a $(7/3, 0)$-approximate distance oracle with preprocessing time
$\tilde{O}(m^{2/3} n + n^2)$.
If the input is restricted to special classes of graphs, the girth-based lower bound may not
necessarily apply. Distance oracles with better stretch/space tradeoffs exist.
3.1.3 Restricted Graph Classes
Given the prohibitive girth lower bound (Theorem 7), the natural question arises whether we can
construct a better distance oracle for certain classes of graphs. Graphs that actually appear in
practical settings are of particular interest. Indeed, there are better constructions for restricted
graph classes, in particular for several sparse graphs.
Planar Graphs (and Graphs with Bounded Genus)
Due to the importance of planar graphs, short path queries for planar graphs have been studied
extensively. A complete review of the corresponding results would deserve a chapter itself. We
give a brief overview; a summary can be found in Table 3.2.
For exact shortest path queries, the currently best result in terms of the tradeoff between
space and query time is by Fakcharoenphol and Rao [FR06, FR08]. The data structure requires
space $O(n \lg^3 n)$ (improvement for non-negative weights to $O(n \lg^2 n)$ by Klein [Kle05]; another
improvement by a $\lg\lg$ factor may be possible [MWN09]) and processes queries in time
$O(\sqrt{n} \lg^2 n)$. Note that their result holds for graphs with negative weights as well.
Some distance oracles by Cabello [Cab06] (and others) have better query times. The time and
space complexities of his distance oracles depend on a parameter $r$ that can be specified by the
application. He proves that for any $r \in [n^{4/3} \lg^{1/3} n, n^2]$ there is an exact distance oracle with
preprocessing time and space $O(r)$ with query time $O\left(\frac{n}{\sqrt{r}} \lg^{3/2} n\right)$. This oracle yields an improvement
over the complexities of the oracles by Arikati et al. [ACC+96, p. 517] and Djidjev [Dji96].
They prove that, for any $r \in [n^{3/2}, n^2]$, there is a distance oracle with preprocessing time and size
$O(r)$ that supports queries in time $O(n^2/r)$. Djidjev [Dji96] obtains two more results. For any
$r \in [n, n^{3/2}]$, the preprocessing step takes time $O(n\sqrt{r})$; the other features remain the same. For
any $r \in [n^{4/3}, n^{3/2}]$, the query time is $O\left(\frac{n}{\sqrt{r}} \lg n\right)$. The size remains $O(r)$; the preprocessing
time remains $O(n\sqrt{r})$. For $r = n^{3/2}$, we obtain query time $O(n^{1/4} \lg n)$ and space $O(n^{3/2})$.
Cabello's tradeoff [Cab06] yields better preprocessing times for the same amount of space.
The hammock decomposition [FJ88, Fre91, Fre95] of a planar graph is used to construct distance
oracles optimized for certain classes of planar graphs. Let $q = q(G) \in \{1, 2, \ldots, n - 1\}$
denote the minimum number of outerplanar subgraphs of a planar graph $G$ (proportional to the
minimum number of faces covering all vertices of $G$; the minimum is taken over all embeddings
of $G$ in the plane). The number of hammocks is a topological measure imposing a natural hierarchy
on the class of planar graphs. Djidjev et al. [DPZ95] give a distance oracle with preprocessing
time and space $O(n + q \lg q)$ and distance query time $O(\lg n + q)$. Recall that, for some planar
graphs, $q$ may be $\Theta(n)$. If only distance queries are to be answered (no path queries), space $O(n)$
suffices. Djidjev et al. [DPZ00, DPZ91] also give a dynamic distance oracle for outerplanar graphs
with $O(n)$ preprocessing and space and $O(\lg n)$ query time. They generalize this distance oracle
to planar graphs (and graphs with genus at most O(n

)) with q hammocks; the preprocessing and


space complexities are $O(n + q^{3/2})$ and the distance query time is $O(\lg n + \sqrt{q})$. They obtain
a fast linear-space distance oracle for planar graphs with at most $q = O(n^{2/3})$ hammocks. Chen
and Xu [CX00] present a data structure that, for a parameter $1 \leq r \leq q$, uses $O(n + q^2/r + q\sqrt{r})$
space, $O(n + q^2/\sqrt{r} + q r^{3/4})$ preprocessing time, and $O(\sqrt{r} \lg r + \alpha(n))$ query time, where $\alpha(n)$
denotes the inverse of the Ackermann function, an extremely slowly growing function.
Gupta et al. [GKR04] address spatiotemporal queries on objects moving along the edges of
a certain class of planar graphs. Spatiotemporal queries (including distance queries) can be processed
in time $O(\sqrt{n})$ on a data structure of size $O(n\sqrt{n})$ with preprocessing time $O(n\sqrt{n})$. Their
tradeoff matches the tradeoff for the results by Arikati et al. and Djidjev with $r = n^{3/2}$ [ACC+96,
Dji96].
Hutchinson et al. [HMZ03] consider shortest path queries in planar graphs in the parallel disk
model. This model is used for external memory algorithms [AV88, VS94]. Their data structure
uses $O(n^{3/2}/B)$ blocks of external memory and allows for a shortest path query to be answered in
$O\left(\frac{\sqrt{n} + L}{DB}\right)$ I/O operations, where $B$ is the block size, $L$ is the number of vertices on the reported
path, and $D$ is the number of parallel disks. Their result essentially also matches the tradeoffs
in [ACC+96, Dji96].
Restricted Queries. Kowalik and Kurowski [KK03] generalize adjacency queries in unweighted
planar graphs to queries for distances bounded by a constant $h$. The preprocessing time and space
complexities are $O(n)$ and the query time is $O(1)$ (which is an improvement over the $O(\lg n)$
query time in Eppstein [Epp99, Theorem 12]). To obtain the space bound, they prove that any
constant power of a planar graph still has constant thickness (Definition 24). Then, answering a
distance query translates to combining the results of a constant number of adjacency queries. Note
that for unweighted planar graphs, any distance oracle with stretch $(1, O(1))$ immediately yields a
$(1 + \epsilon, 0)$-approximate distance oracle by combining it with the result by Kowalik and Kurowski.
Distances in undirected, unweighted grids are directly determined by grid coordinates. For a
weighted, directed $(p \times q)$-grid, Schmidt [Sch98] gives an $O(pq \lg p)$-time algorithm to build a
distance oracle that supports distance queries in time $O(\lg p)$ for paths starting in the left-most
column. In a straightforward way, this yields an $O(pq \lg pq)$-time algorithm to build a distance
oracle that supports distance queries in time $O(\lg p + \lg q)$ starting at any node on the outer face.
For a $(\sqrt{n} \times \sqrt{n})$-grid, the preprocessing and query times are $O(n \lg n)$ and $O(\lg n)$, respectively.
This result can be generalized (with rather different techniques). For an undirected graph with
genus $g$, Cabello and Chambers [CC07] provide an $O(g^2 n \lg n)$-time algorithm to represent the
shortest path tree from all the vertices on one specified face. Any query distance from a vertex on
this face can be obtained in time $O(\lg n)$.
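The remark that distances in undirected, unweighted grids follow directly from the coordinates can be made concrete: the distance is the sum of the row and column offsets. The sketch below compares this closed form with a breadth-first search; the function names and the $4 \times 6$ grid are assumptions chosen for illustration.

```python
from collections import deque


def grid_distance(a, b):
    """Distance between two cells of an unweighted grid, read off the coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def bfs_grid(p, q, source):
    """Reference distances in a p x q grid graph, computed by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < p and 0 <= nc < q and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist


reference = bfs_grid(4, 6, (0, 0))
```

No data structure is needed in this case; the query is answered in constant time from the two coordinate pairs alone.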
Approximations. The tradeoff between space and query time of all exact distance oracles is
such that for space $O(n^{2-\epsilon})$ the query time is at best $O(n^{\epsilon/2})$ [Cab06]. Constant query time with
space $o(n^2)$ has not been achieved yet. To obtain fast query times of $O(1)$ or $\tilde{O}(1)$ with lower
space requirements, approximate distance oracles are considered. Recall that, for general graphs,
any distance oracle with multiplicative stretch $\alpha < 3$ requires space $\Omega(n^2)$. In the special
case of planar graphs, the corresponding tradeoff is much better.
Thorup [Tho04a] presents efficient $(1 + \epsilon)$-approximate distance oracles for planar digraphs.
The main ingredient of Thorup's construction [Tho04a] is a special separator consisting of a set of
shortest paths (instead of a general set of $O(\sqrt{n})$ nodes as in the Lipton-Tarjan [LT80] separator
theorem (Theorem 4)). Each node computes the distance to certain nodes of these separator paths.
His method achieves the fastest (and only) provable query times for planar digraphs. His oracle
has also been implemented and tested for large road networks [MZ07]. The results indicate that,
for these road networks, it is not competitive with the specialized methods to be discussed in
Section 3.2.3. For undirected graphs, Thorup provides two oracles.
1. The first distance oracle has query time $O(1/\epsilon)$, preprocessing time $O(n \epsilon^{-2} \lg^3 n)$, and
space requirement $O(n \epsilon^{-1} \lg n)$ [Tho04a, Theorem 3.19].
Klein [Kle02a] independently obtains the same oracle for space and query time by improving
Thorup's second oracle.
2. For the slower query time of $O(\lg n (\lg\lg \Delta + \epsilon^{-1}))$, space $O(n \epsilon^{-1} \lg n \lg \Delta)$ and preprocessing
time $O(n \epsilon^{-1} \lg^2 n \lg \Delta)$ are sufficient [Tho04a, Proposition 3.14].
Klein [Kle05, Section 7] improves the preprocessing time to $O(n(1/\epsilon + \lg n) \lg n \lg \Delta)$.
The original separator theorem [LT80] can be generalized to minor-free graphs [AST90]. An
extension is also possible for Thorup's shortest path separator theorem [Tho04a]. Abraham and
Gavoille [AG06] generalize it to minor-free graphs. Based on these shortest path separators, they
construct distance oracles for graphs without fixed minors (including planar graphs). The graph at
hand is recursively cut into pieces of almost equal size, separated by at most $O(1)$ shortest paths.
In polynomial time, they obtain a $(1 + \epsilon, 0)$-approximate distance oracle with space $O\left(\frac{n}{\epsilon} \lg n\right)$
and query time $O\left(\frac{\lg n}{\epsilon}\right)$.
For an overview, see Table 3.2. Recall that $\Delta$ denotes the diameter (largest finite distance) of
a graph.
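Preprocessing                               Space                                 Query                                           $\alpha$        Reference
$O(n \lg^2 n)$                              $O(n \lg^2 n)$                        $O(\sqrt{n} \lg^2 n)$                           1               [FR06, Kle05]
$O(n^{3/2})$                                $O(n^{3/2})$                          $O(\sqrt{n})$                                   1               [ACC+96, Dji96]
$\tilde{O}(n^{4/3})$                        $\tilde{O}(n^{4/3})$                  $O(n^{1/3} \lg^{4/3} n)$                        1               [Cab06, $r$: $n^{4/3}$]
$O(n^{3/2})$                                $O(n^{3/2})$                          $O(n^{1/4} \lg^{3/2} n)$                        1               [Cab06, $r$: $n^{3/2}$]
$O(n^{7/4})$                                $O(n^{3/2})$                          $O(n^{1/4} \lg n)$                              1               [Dji96, $r$: $n^{3/2}$]
$O(n^{7/4})$                                $O(n^{7/4})$                          $O(n^{1/8} \lg^{3/2} n)$                        1               [Cab06, $r$: $n^{7/4}$]
$O(n \epsilon^{-2} \lg^3 n)$                $O(n \epsilon^{-1} \lg n)$            $O(1/\epsilon)$                                 $1 + \epsilon$  [Tho04a, T3.19]
                                            $O(n \epsilon^{-1} \lg n)$            $O(1/\epsilon)$                                 $1 + \epsilon$  [Kle02a]
$O(n \epsilon^{-1} \lg^2 n \lg \Delta)$     $O(n \epsilon^{-1} \lg n \lg \Delta)$ $O(\lg n \lg\lg \Delta + \epsilon^{-1} \lg n)$  $1 + \epsilon$  [Tho04a, P3.14]
$O(n \epsilon^{-1} + n \lg^2 n \lg \Delta)$ $O(n \epsilon^{-1} \lg n \lg \Delta)$ $O(\lg n \lg\lg \Delta + \epsilon^{-1} \lg n)$  $1 + \epsilon$  [Kle05, Sec. 7]
$O(\mathrm{poly}(n))$                       $O(n \epsilon^{-1} \lg n)$            $O(\epsilon^{-1} \lg n)$                        $1 + \epsilon$  [AG06]
Table 3.2: Time and space complexities of distance oracles for undirected planar graphs (some
results extend to planar digraphs and/or minor-free graphs). $\alpha$ denotes the multiplicative stretch;
$\Delta$ denotes the diameter.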
Approximate shortest path query processing for planar graphs has been investigated before
Thorup's and Klein's seminal results. Frederickson and Janardan [FJ89, FJ90] give
$(3, 0)$-approximate routing schemes for planar graphs. Klein and Subramanian [KS98] give a data
structure that also works in the dynamic case. The stretch is $(1+\varepsilon, 0)$; query and update
times are $O(\varepsilon^{-1} n^{2/3} \lg^2 n \lg \Delta)$.
Road Networks: Graphs with Bounded Highway Dimension
An important characteristic of road networks appears to be their highway dimension [AFGW10]. A
graph is said to have small highway dimension if, for any $r > 0$, there is a not-too-large set of
vertices $S_r$ that all shortest paths of length at least $r$ pass through. $S_r$ is not too large if all
balls of radius at most $O(r)$ contain only few vertices of $S_r$. Based on this notion, many efficient
practical methods (to be discussed in Section 3.2.3) such as contraction hierarchies [GSSD08,
BDS+08], highway hierarchies [SS05, SS06, NBB+08], transit-node routing [BFM+07, Sch08b],
and SHARC [BD08] have provable performance guarantees. The highway dimension of real road
networks has not been investigated yet.
Road networks also share some properties with planar graphs such as small separators [EG08].
The techniques of Thorup [Tho04a] may potentially apply too (experimental results indicate
so [MZ07]).
Bounded Tree-Width
For digraphs with tree-width $w$ (Definition 18), Chaudhuri and Zaroliagis [CZ00] give an
algorithm, parameterized by an integer $q \in [1, a(n)]$, to compute a distance oracle with query time
$O(w^3 q)$ in preprocessing time $O(w^3 n \lg n)$ for $q = 1$, preprocessing time $O(w^3 n \lg^* n)$ for
$q = 2$, and preprocessing time $O(w^3 n)$ for $q = a(n)$. The tradeoff between preprocessing and query
time arises from semigroup computations over trees [AS87, Cha87, Sei06]. $a(n)$ denotes the inverse
Ackermann function and $\lg^* n$ denotes the iterated logarithm, which is the number of times
the logarithm function has to be applied recursively until the result is at most 1.
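To illustrate how slowly the iterated logarithm grows, the definition can be sketched in a few
lines of Python. This is only an illustration: the function name `iterated_log2` is made up here,
and the base-2 logarithm is assumed.

```python
import math


def iterated_log2(n: float) -> int:
    """Return lg* n: how often log2 must be applied until the value is at most 1."""
    count = 0
    while n > 1:
        n = math.log2(n)  # one application of the logarithm
        count += 1
    return count
```

For every input that fits into a machine word the value is at most 5, which is why a factor of
$\lg^* n$ in the preprocessing time is close to a constant in practice.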
CHAPTER 3. REVIEW OF SHORT PATH QUERY PROCESSING
Hagerup [Hag00a] obtains the same results (and extends them to dynamic oracles) using monadic
second-order logic. Courcelle and Vanicat [CV03b] obtain a related result for graphs with bounded
clique-width [ER97, CO00]. Their distance oracle can be computed in time $O(n \lg n)$ and (exact)
distance queries can be answered in time $O(\lg n)$.
k-Chordal Graphs
A graph is $k$-chordal if there is no cycle with more than $k$ edges and no chord. A chord is an edge
between two nodes of the cycle that are not neighbors in the cycle. Chordal graphs are 3-chordal.
Gavoille et al. [GKK+01] give a $(1, \lfloor k/2 \rfloor)$-approximate distance labeling scheme with
labels of size $O(\lg n \lg \Delta)$ for $k$-chordal graphs. They also prove that these label sizes are
optimal. Their result implies the existence of a $(1, 1)$-approximate distance oracle for chordal
graphs with size $O(n \lg^2 n)$.
Intersection Graphs and Permutation Graphs
Chen et al. [CLSS98] consider interval and circular-arc graphs. A graph $G = (V, E)$ is an interval
graph if there exists a set $S$ of intervals on the real line (called model) such that there is a bijection
between the vertices $v \in V$ and the intervals $I_v \in S$ in such a way that an edge $(u, v) \in E$ if
and only if $I_u \cap I_v \neq \emptyset$. Given a model, the input intervals get sorted by their endpoints
(this sorting step may take time $O(n \lg n)$); after this, preprocessing takes time $O(n)$. The distance
oracle has size $O(n)$ and supports distance queries in time $O(1)$. The definition of circular-arc
graphs is the same as that of interval graphs, with the exception that the set of intervals on the
real line is replaced by a set of circular arcs on a unit circle. Sprague and Takaoka [ST99] give
a simpler method with the same performance guarantees. Gavoille and Paul [GP03a] obtain the
same performance using distance labelings.
Recall that a ball graph is an intersection graph of balls in $\mathbb{R}^{\mathrm{dim}}$. It consists of $n$
balls with centers $v_i$ and radii $r_i$. Two centers $v_i, v_j$ are connected in the intersection graph iff
their balls intersect in $\mathbb{R}^{\mathrm{dim}}$. A disk graph is a ball graph with $\mathrm{dim} = 2$. In
unit-disk and unit-ball graphs, all radii are equal. Gao and Zhang [GZ05, Corollary 5.2] consider
unit-disk (and unit-ball) graphs. Based on the well-separated pair decomposition by Callahan and
Kosaraju [CK95] (Definition 29), they obtain, in preprocessing time $O(n \sqrt{n \lg n} / \varepsilon^3)$, a
data structure of size $O(n \lg n / \varepsilon^4)$ that supports $(1+\varepsilon, 0)$-approximate distance
queries in time $O(1)$. Fürer and Kasiviswanathan [FK07] give efficient algorithms for disk and ball
graphs with general radii. Let $R := \max_i r_i / \min_i r_i$ denote the ratio between the largest and the
smallest radius in the graph. They construct sparse $(1+\varepsilon, 0)$-spanners (Definition 26), which
have a separator of size
$S(n, R, \varepsilon) = O(n^{1-1/\mathrm{dim}} \varepsilon^{-\mathrm{dim}+1/2} + \varepsilon^{-2\mathrm{dim}+1} \lg R)$
that can be found in time $O(n \lg n)$. After preprocessing of time $\tilde{O}(n \cdot S(n, R, \varepsilon))$,
their distance oracle can answer queries in time $O(S(n, R, \varepsilon))$. Recall that the class of disk
graphs contains the class of planar graphs [Koe36]. The oracle of Fürer and Kasiviswanathan is more
general than the oracles for planar graphs but not competitive on planar instances.
Let $\pi$ denote a permutation of $[n] = \{1, 2, \ldots, n\}$. The permutation graph $G(\pi)$ is defined
as the graph with vertex set $[n]$ and edge set $\{\{i, j\} : (i - j)(\pi^{-1}(i) - \pi^{-1}(j)) < 0\}$.
Sprague [Spr07] gives an algorithm that preprocesses a permutation graph in time $O(n)$ such
that distance queries can be answered in time $O(1)$.
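The edge condition in the definition can be checked directly. The following brute-force sketch
takes quadratic time and only illustrates the definition; it is not Sprague's linear-time
preprocessing. The function name and the list encoding of the permutation are assumptions made
for this example.

```python
from itertools import combinations


def permutation_graph_edges(pi):
    """Edges of G(pi); pi is a list with pi[k - 1] being the image of k, for k = 1..n."""
    n = len(pi)
    # inv[i] is the position of value i, that is, the inverse permutation applied to i.
    inv = {value: pos for pos, value in enumerate(pi, start=1)}
    return {
        (i, j)
        for i, j in combinations(range(1, n + 1), 2)
        if (i - j) * (inv[i] - inv[j]) < 0  # i and j appear in reversed order
    }
```

For the identity permutation the graph has no edges, and for the reversed permutation it is
complete, since every pair is inverted.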
Graphs with Bounded Doubling Dimension
The doubling dimension of a metric space is the minimum $\mathrm{dim}$ such that any ball of radius $r$
can be covered by at most $2^{\mathrm{dim}}$ balls of radius $r/2$ (Definition 19).
Talwar [Tal04] provides distance labels of length $O(\varepsilon^{-\mathrm{dim}})$, using which there is a
$(1+\varepsilon, 0)$-approximate distance oracle with space $O(n \varepsilon^{-\mathrm{dim}})$ and query time
$O(\varepsilon^{-\mathrm{dim}})$. Abraham et al. [ABN08] provide, for a parameter $\theta \in (0, 1]$, an
$(O(\lg^{1+\theta} n), 0)$-approximate distance oracle with space complexity $O(n \, \mathrm{dim}/\theta)$ and
query time $O(\mathrm{dim}/\theta)$. The stretch is worse but the dependencies on the doubling dimension
$\mathrm{dim}$ are better.
The beacon-based embedding by Kleinberg et al. [KSW09] works as follows. A set of landmarks
is selected (for example at random); a constant number of landmarks suffices. Each node
stores its distance to all the landmarks. Distances are approximated by triangulation. Distance
estimates are unbounded for a fraction of the node pairs. This fraction is referred to as slack.
For all the remaining node pairs, the triangulation yields a $(1+\varepsilon, 0)$-approximate distance
oracle with linear space complexity and constant query time.
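The triangulation step can be sketched as follows. This is a minimal illustration under
assumptions: the graph is undirected, connected, and given as an adjacency dictionary, and all
function names are chosen for this example. It shows only how lower and upper bounds follow from
the triangle inequality; it does not reproduce the landmark selection or the slack analysis of
[KSW09].

```python
import heapq


def dijkstra(adj, source):
    """Distances from source; adj maps a node to a list of (neighbor, length) pairs."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # outdated queue entry
        for v, w in adj[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def triangulate(adj, landmarks):
    """Store, for every landmark, the distances to all nodes."""
    return {landmark: dijkstra(adj, landmark) for landmark in landmarks}


def distance_bounds(table, u, v):
    """Lower and upper bound on d(u, v) from the triangle inequality."""
    lower = max(abs(d[u] - d[v]) for d in table.values())
    upper = min(d[u] + d[v] for d in table.values())
    return lower, upper
```

On a path a, b, c, d with unit lengths, the single landmark a gives the bounds (2, 4) for the
pair (b, d); adding the landmark c closes the gap to the exact distance 2.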
Graphs with bounded doubling dimension have also been studied in the setting of compact
routing [AM05, AGGM06, KRX07] and spanners [CG06]. Some researchers suggest applying
algorithms that work well for graphs with bounded doubling dimension to Internet-like graphs.
However, the graph representing the Internet does not appear to have bounded ball growth or
bounded doubling dimension. Both measures can be large [FLV08].
Power-law Graphs and Complex Networks
For power-law graphs, compact routing schemes have been studied. Any compact routing scheme
may serve as an approximate shortest path oracle. While the time to retrieve the approximate
distance may not be competitive, we can retrieve each edge of the path by simulating the decision
of a router in a centralized way such that path queries are efficient. The results for distance oracles
and compact routing schemes are often strongly related. For a routing scheme, all the information
needs to be distributed. This constraint renders the problem intuitively harder than constructing a
distance oracle, where information is centralized and the query algorithm is not restricted to local
information. We do not attempt to cover results on compact routing in this review. We instead refer
to [ANLP90, AP92, Gav01, GP03b, TZ01, AGM+08, CW04] and the references therein. Compact
routing schemes for power-law graphs with direct influence on the best results for distance oracles
are discussed in the following.
Krioukov et al. [KFY04] evaluate the compact routing scheme by Thorup and Zwick [TZ01]
for Internet-like inter-domain topologies and random power-law graphs. They also analyze the
stretch distribution for this routing scheme when run on Erdős-Rényi $G_{n,p}$ random graphs [ER60].
Enachescu et al. [EWG08] also analyze the compact routing scheme by Thorup and Zwick [TZ01]
for Erdős-Rényi $G_{n,p}$ random graphs [ER60]. They prove that stretch $(2, 0)$ can be achieved with
space $\tilde{O}(n^{7/4})$ by selecting $\tilde{O}(n^{3/4})$ landmarks. They also claim (without proof in the proceedings
version) that stretch (, 0) can be achieved with space

O
_
n
1+
2
+1
+
_
. Recall that the Erdős-Rényi random graphs do not have a power-law degree sequence (Section 2.1.3).
The compact routing scheme by Brady and Cowen [BC06] is evaluated experimentally only.
More on their scheme can be found in Section 3.2.4 on practical results.
Geometric Graphs
A geometric graph $G = (V, E)$ has vertices corresponding to points in $\mathbb{R}^{\mathrm{dim}}$ and edge
weights from a Euclidean metric; $G$ is said to be an $(\alpha, \beta)$-spanner for $V$ if, for any two
points $p$ and $q$ in $V$, the shortest path metric in $G$ $(\alpha, 0)$-approximates the Euclidean distance
in $\mathbb{R}^{\mathrm{dim}}$. Since some distances may be larger than the corresponding Euclidean distance,
we cannot simply apply Johnson-Lindenstrauss [JL84] (Lemma 2) to obtain an efficient distance oracle
for $G$. For geometric graphs with sparse spanners, Gudmundsson et al. [GLNS08] give a
$(1+\varepsilon, 0)$-approximate distance oracle with preprocessing time $O(m \lg n)$, space $O(n \lg n)$,
and query time $O(1)$.
Andersson et al. [AGL07] extend the result to geometric graphs with dense clusters using
the well-separated pair decomposition by Callahan and Kosaraju [CK95] (Definition 29) and
well-separated clusters by Krznaric and Levcopoulos [KL95]. If $G$ contains $N$ disjoint
$(\alpha, 0)$-spanners that are inter-connected with $M$ edges, there is an algorithm that constructs a
$(1+\varepsilon, 0)$-approximate distance oracle in time $O((m + M^2) \lg n)$ with space
$O(M^2 + n \lg n)$ and constant query time. Their algorithm chooses a representative point for each
cluster, based on which distances are computed. As a potential application they give the following
example: in the European railway network, each country has its own network (a spanner) and the
railway networks of the countries are then sparsely interconnected.
Sankaranarayanan and Samet [SS09, SSA09] adapt the well-separated pair decomposition to
spatial networks of $\mathrm{dim}$ dimensions. They obtain $(1+\varepsilon, 0)$-approximate distance oracles
using space $O(n/\varepsilon^{\mathrm{dim}})$ and query time $O(\lg n)$. With a hash table, the query time can be
reduced to $O(1)$ while the space consumption increases to $O(n \lg n/\varepsilon^{\mathrm{dim}})$. They also
evaluate their scheme experimentally on road networks.
3.2 Practical
Efficient practical methods to process shortest path queries are often devised by following a
feedback loop that consists of design, analysis, implementation, and experimentation. The approach
using this feedback loop is also called algorithm engineering [San09, Figure 1]. Since
experimentation is an integral part of the feedback loop, the choice of the datasets may highly
influence the outcome of the algorithm engineering process. If possible, experiments are run with
input graphs that are actually used in practice.
Practical instances for shortest path problems are often sparse. The number of edges is roughly
linear in the number of nodes. Besides sparsity, practical networks often have other important
properties. A large fraction of the efforts in the field of practical shortest path query processing
has been devoted to transportation networks, in particular to road networks. Road networks, for
example, share many properties with planar graphs; in particular, road networks also have small
separators. In practice, however, approaches directly based on separators are often not the most
efficient ones.
Experimental evaluation [Hit68, BH69, Dre69, GW73, Pap74, Gol76] has always been an
important part of shortest path research. For the first practical methods devised by the algorithm
engineering approach, the feedback loop was rather short. Researchers found that the representation
of a graph in memory affects the performance of the algorithm. For sparse graphs, representing
the graph by an adjacency list is quite efficient. The list can be sorted by starting nodes (such a
representation is sometimes termed forward star form). It may be efficient to also sort the edges
of a node by their length [DGKK79]. Such a sorting step preprocesses the graph in order to
obtain faster query times. It may also be efficient to reorder the vertices such that proximity in the
graph is reflected in proximity in memory as well. Such a reordering may have great impact on
the running time of the query algorithm due to caching effects [GKW07].
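A forward star representation can be sketched as follows. The layout below (an offset array
plus parallel arrays for heads and lengths) is one common way to realize it and is an
assumption of this example, as are the function names; nodes are assumed to be numbered
0 to n - 1.

```python
def build_forward_star(n, edges):
    """Build offset, head and length arrays from a list of (tail, head, length) triples."""
    # Sort by start node; edges of the same node are additionally sorted by length.
    ordered = sorted(edges, key=lambda e: (e[0], e[2]))
    first = [0] * (n + 1)
    for tail, _, _ in ordered:
        first[tail + 1] += 1  # count the out-degree of each node
    for u in range(n):
        first[u + 1] += first[u]  # prefix sums: edges of u occupy first[u]..first[u + 1] - 1
    heads = [head for _, head, _ in ordered]
    lengths = [length for _, _, length in ordered]
    return first, heads, lengths


def out_edges(first, heads, lengths, u):
    """Outgoing edges of u as (head, length) pairs, shortest edge first."""
    lo, hi = first[u], first[u + 1]
    return list(zip(heads[lo:hi], lengths[lo:hi]))
```

All edges of a node are stored contiguously, so scanning the neighborhood of a node touches one
consecutive block of memory.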
Reordering nodes and edges was just the beginning. If additional structure is computed, or if
the network is changed structurally, the investment during the preprocessing phase is higher but so
is the payoff at query time. Network decomposition [LS67, FLM67, KY65, Mil66, Hu68, HT69,
Yen71, GKN74, LR82] was used to speed up APSP algorithms on sparse networks. Other than the
articles on the network decomposition technique, the thesis of Smolleck [Smo75, SC81] and the
article by van Vliet [Vli78] appear to be among the first reports on the shortest path query problem
with considerable preprocessing.²
Smolleck models the network by an electric circuit, wherein each edge is mapped to an
impedance. According to [DP84], Smolleck achieves a speedup factor of 30 compared to Dijkstra's
algorithm (on a graph with 2,047 nodes and 2,547 edges); the paths are on average 1.9% longer
than the optimal path; preprocessing apparently takes 1,000 times as long as a query.
Van Vliet compares the running times of Dijkstra's [Dij59], D'Esopo's [PW60, Pap74], and
Moore's [Moo59] algorithms on road networks with up to 5,337 nodes and 14,930 edges. Based on
his observations, he introduces heuristics termed "spider web" techniques [Vli78, Section 6]. He
contracts nodes such that groups of 2 or more links from the original network are combined into
single links representing minimum distance paths between their end nodes. For an illustration, see
Figure 3.1. He attributes the idea to Hu [Hu69], who termed it "distance equivalent networks";³
he also relates it to triple operations [Flo62, Mur65, Hu68]. Such a triple operation compares
an edge length with the lengths of paths with two edges using an intermediate pivot node. The
method is mainly used in APSP algorithms. Van Vliet's combination of APSP and SSSP techniques
into a query method illustrates the tradeoff that shortest path query methods address. Van Vliet's
contraction techniques decrease the CPU time for multiple queries by approximately 25%.
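The idea of contracting a node while keeping all remaining distances intact can be sketched with
one triple operation per pair of neighbors. This is a generic illustration, not van Vliet's exact
procedure: the dictionary-of-dictionaries graph format and the function name are assumptions, and
the graph is assumed to be undirected and without self-loops.

```python
from itertools import combinations


def contract_node(graph, v):
    """Remove v from graph (node -> {neighbor: length}) and preserve all other distances."""
    neighbors = graph.pop(v)
    for u in neighbors:
        del graph[u][v]
    for u, w in combinations(neighbors, 2):
        via_v = neighbors[u] + neighbors[w]  # length of the two-edge path u, v, w
        # Triple operation with pivot v: keep the shorter of the direct edge and the detour.
        if via_v < graph[u].get(w, float("inf")):
            graph[u][w] = via_v
            graph[w][u] = via_v
```

A node of degree 2 is replaced by at most one new edge, whereas a node of degree 4 may create up
to six, which is why the degree of a node matters for whether contracting it pays off.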
For recent methods, two preprocessing strategies are distinguished. Hierarchical approaches
compute an additional graph structure to speed up shortest path queries. Approaches based on
graph annotation attach additional information to each vertex, based on which, at query time, the
search tree can be pruned.
3.2.1 Hierarchical Approaches
Hierarchical methods to compute shortest paths in graphs have been proposed by many researchers
[KK77, AJ94, SWN92, SFG97, IOAI91, CF94, JHR96, TF97, FS97, CRS98, CTB01, HJR95,
CL07, CZ07, AY00, JP02, AY01, HSW08, SS06, BFM+07, Sch08b, KKRS08, GSSD08, Hol08,
BDS+08]. An auxiliary graph is constructed hierarchically. A shortest path query is then
answered by searching only a small part of the auxiliary graph, often using Dijkstra's algorithm.
This approach works very well for intrinsically hierarchical graphs.
If, for each level, the size of the graph is reduced by a constant factor, the hierarchy contains
$O(\lg n)$ levels. In practice, it may be beneficial to stop the recursive process when only
$O(\sqrt{n})$ nodes are left. For these remaining nodes, a distance table can be computed and stored.
This yields more efficient query algorithms at a comparably low preprocessing cost.
² An earlier approach by Bazaraa and Langley [BL74] was to preprocess a graph in order to eliminate
negative weights such that Dijkstra's algorithm can be applied.
³ There may be a connection to the minimum-route transformations by Akers [Ake60] and William S.
Jewell (no reference). These network changes are based on Wye-Delta (Y-$\Delta$) transformations of
electrical networks. However, the transformations appear to be restricted to planar networks and to
two or three terminals. Hu and Torres [HT69, p. 390] attribute smaller flow equivalent networks to
Akers [Ake60].
Figure 3.1: The contractions for nodes of degrees 2, 3, and 4 (termed "spider web" transformations)
by van Vliet [Vli78, Figure 5].
Concrete approaches exploiting hierarchy are reviewed in the forthcoming section on road
networks (Section 3.2.3).
3.2.2 Graph Annotation Approaches
Algorithms exploiting the annotation approach are sometimes also termed goal-directed search
algorithms. Additional information is attached to all/some vertices or edges of the graph. Based
on this information, the search algorithm decides which part of the graph not to search.
A* Search
A* [Gel63, Sam63, KHI+86, HNR68, Dor67, Gel77, Kor85] is a popular search technique in
Artificial Intelligence. The idea is to direct the search towards the goal. In the priority queue
implementation of Dijkstra's algorithm, at each iteration, the node with the shortest distance to the
source is taken from the queue. In the original A* algorithm, instead of ordering nodes by their
distance from the source, nodes in the queue are ordered by their distance from the source plus
a potential (see for example [Del09, Algorithm 2, p. 22]). By adding a potential to the priority
of each node, the order in which nodes are removed from the priority queue is altered. A good
potential function increases the priority of nodes that lie on a shortest path to the target (usually
by decreasing the priority of the other nodes). In road networks for example, if the coordinates of
the target are known, the Euclidean distance provides a good lower bound on the graph distance
and thus a good potential function [SV86]. Using the Euclidean distance as a potential function
for A* has been exploited and applied successfully. In general, however, the coordinates may not
be known. A metric embedding or a drawing [WW05] can provide coordinates for a potential
function.
Goldberg and Harrelson [GH05] (see also [GW05, GKW06, GKW07]) propose to use a set
of landmarks $S \subseteq V$ and the triangle inequality (their method is sometimes called ALT
(A*-Landmarks-Triangle inequality) for this reason). Their potential function is a beacon-based
triangulation [KSW09]. Analogous to the distance oracle of Thorup and Zwick [TZ05], all nodes
$v \in V$ know the distance to all landmarks $L \in S$. For two nodes $u, v \in V$ and a landmark
$L \in S$, the triangle inequality yields that $d(u, v) \geq d(u, L) - d(v, L)$. Taking the maximum
difference over all $L \in S$ yields the best estimate, which is used as a potential in the A* search.
The quality of the lower bound highly depends on the landmark selection. Since in the preprocessing
phase the distances to all landmarks need to be computed and stored, the preprocessing time
and the space consumption highly depend on the number of landmarks. An important question
is how to select few but good landmarks. Random selection is a straightforward approach but
it may not guarantee good coverage, meaning that some nodes are far from all landmarks. Several
heuristics have been proposed to improve coverage [GH05, GW05], or to choose important
nodes [PBCG09]. The theory on beacon-based triangulations by Kleinberg et al. [KSW09] may
help to explain for which graphs ALT works well and how many landmarks to select. For graphs
with bounded doubling dimension, triangulations with respect to a constant number of landmarks
yield $(1+\varepsilon, 0)$-approximate distances for a $(1-\varepsilon)$-fraction of the node pairs
(Kleinberg et al. also prove that this slack is necessary). While A* with landmarks [GH05] works
for general graphs, it is thus expected to perform best on graphs with low doubling dimension.
Poudel [Pou08] proposes a similar algorithm with a potential function based on an approximate
distance oracle. With increasing quality of the distance approximations, fewer nodes are visited at
query time.
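The landmark potential and its use in A* can be sketched as follows. This is a minimal
unidirectional illustration, not the implementation of [GH05]: it assumes an undirected,
connected graph given as an adjacency dictionary, and the names `alt_query` and `tables` are
chosen for this example. Besides the distance, the function reports how many nodes were settled.

```python
import heapq


def dijkstra(adj, source):
    """Distances from source; adj maps a node to a list of (neighbor, length) pairs."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def alt_query(adj, tables, s, t):
    """A* from s to t; tables holds one distance dictionary per landmark."""

    def potential(v):
        # Triangle inequality: |d(L, v) - d(L, t)| never exceeds d(v, t).
        return max(abs(d[v] - d[t]) for d in tables)

    dist = {s: 0}
    heap = [(potential(s), s)]
    settled = set()
    while heap:
        _, u = heapq.heappop(heap)
        if u in settled:
            continue
        settled.add(u)
        if u == t:
            return dist[t], len(settled)
        for v, w in adj[u]:
            if dist[u] + w < dist.get(v, float("inf")):
                dist[v] = dist[u] + w
                heapq.heappush(heap, (dist[v] + potential(v), v))
    return float("inf"), len(settled)
```

On a path with nodes 0 to 6 and the landmark 0, a query from node 3 to node 6 settles only the
four nodes 3, 4, 5, 6, because the potential rules out the nodes on the far side of the source.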
A* is easy to implement and it yields decent speedups. In external memory setups, there
appear to be better practical methods [EM01]. Better speedups can be obtained when combining
A* with the bidirectional version of Dijkstra's algorithm. This is however not a straightforward
combination. The potential functions for the forward and the backward search need to be consistent
such that the shortest path is found when both searches meet. A good approach for consistent
potentials is to take the average of the forward and backward potential function [IHI+94].
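One common way to write the averaging construction is given below. The symbols are introduced
here for illustration and are not necessarily the notation of [IHI+94]: $\pi_f(v)$ is assumed to
be a lower bound on the distance from $v$ to the target, and $\pi_r(v)$ a lower bound on the
distance from the source to $v$.

```latex
p_f(v) = \frac{\pi_f(v) - \pi_r(v)}{2}, \qquad
p_r(v) = \frac{\pi_r(v) - \pi_f(v)}{2} = -p_f(v).
```

Since $p_f(v) + p_r(v) = 0$ for every node $v$, an edge $(u, v)$ of length $\ell(u, v)$ has the
same reduced length $\ell(u, v) - p_f(u) + p_f(v)$ in the forward search and in the backward
search, which is what consistency requires.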
Querying using precomputed cluster distances [MSM09] is a somewhat similar approach. The
network is partitioned into clusters and distances between all pairs of clusters are precomputed.
These cluster distances yield upper and lower bounds for distances, based on which the search is
directed towards the goal.
Reach
Reach-Based Routing [Gut04] is another modification of Dijkstra's algorithm. Each vertex is
assigned a so-called reach value that determines whether a particular vertex will be considered
during Dijkstra's algorithm. To have a high reach value, a vertex must lie on a shortest path that
extends a long distance in both directions from the vertex (similar to Highway Hierarchies [SS05,
SS06, NBB+08], to be discussed in the next section). A vertex is excluded from consideration if
its reach value is small, that is, if it does not contribute to any path long enough to be of use for
the current query. Gutman [Gut04] reports fast query times with a speedup factor of 10 compared
to Dijkstra's algorithm.
Arc Flags
In the preprocessing phase, the Arc Flag method [Lau04, KMS05, MSS+06] partitions the graph
into clusters and then, for each cluster, marks all edges where shortest paths towards nodes in the
cluster start. At query time, edges that are not marked with the target cluster are ignored. A related
approach uses geometric containers [WW03, WWZ05]. On its own, the preprocessing step of the
Arc Flag approach is rather expensive. However, when applied within a hierarchy [MSS+06] or
when combined with other techniques, it can be very efficient [BD08, BDS+08].
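The marking and pruning steps can be sketched as follows. This is a deliberately naive
illustration under assumptions: preprocessing runs one backward search per node (the expensive
step mentioned above), edge lengths are integers so that equality tests are exact, every node is a
key of the adjacency dictionary, there are no parallel edges, and all names are made up for this
example.

```python
import heapq


def dijkstra(adj, source, allowed=None):
    """Distances from source; edges (u, v) with allowed(u, v) false are skipped."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if allowed is not None and not allowed(u, v):
                continue
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def compute_arc_flags(adj, cluster_of):
    """Map each edge (u, v) to the clusters that contain a target reached optimally via it."""
    reverse = {u: [] for u in adj}
    for u in adj:
        for v, w in adj[u]:
            reverse[v].append((u, w))
    flags = {}
    for t in adj:
        to_t = dijkstra(reverse, t)  # to_t[u] is the distance from u to t
        for u in adj:
            for v, w in adj[u]:
                if u in to_t and v in to_t and to_t[u] == w + to_t[v]:
                    flags.setdefault((u, v), set()).add(cluster_of[t])
    return flags


def arc_flag_query(adj, flags, cluster_of, s, t):
    """Dijkstra from s that ignores edges not flagged for the cluster of t."""
    target = cluster_of[t]
    dist = dijkstra(adj, s, lambda u, v: target in flags.get((u, v), ()))
    return dist.get(t, float("inf"))
```

An edge that lies on no shortest path at all never receives a flag and is skipped by every
query; an edge leading into a dead end is flagged only for the cluster of that dead end.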
2-Hop Covers and Reachability Queries
For general directed graphs, the answers to shortest path and reachability queries (Definition 31)
are harder to compute than for undirected graphs [AF90]. The scheme by Cohen et al. [CHKZ03]
actually lies at the boundary between theory and practice. Cohen et al. focus on algorithms for
directed graphs that occur in practice. They introduce a new technique, which they call 2-hop
covers: such a cover is a set of shortest paths $P$ such that for every pair of vertices
$(u, v) \in V \times V$ there is a shortest path between $u$ and $v$ that is the concatenation of two
paths in $P$. Based on this cover, they assign labels to vertices. The sizes of the labels thus depend
directly on the size of the cover. It is known that graph classes with separators of size $O(n^{\mu})$
have 2-hop labels of size at most $O(n^{1+\mu} \lg n)$ [Coh96] (optimal labelings may be smaller).
Unfortunately, finding a 2-hop cover of minimal size is an NP-hard problem; the optimal cover can
be approximated up to a logarithmic factor [CHKZ03, Theorem 4.2]. Fast practical computation of
the labels has been proposed [CYL+06, CYT06, CYL+08, CY09]. Still, depending on the optimal cover
size, the data structure may have size $\tilde{O}(n \sqrt{m})$ and queries may take $\tilde{O}(\sqrt{m})$
in the worst case.
Wang et al. [WHY+06] propose a different reachability labeling for (very) sparse graphs. They
consider a graph as having two components: a spanning tree plus a set of $a$ additional non-tree
edges. For query time $O(1)$ (reachability queries), their approach requires preprocessing time
$O(n + m + a^3)$ and space $O(n + a^2)$. Note that $a = |E| - n + 1$, which may be $\Omega(n^2)$. Trißl
and Leser [TL07] create a graph index called GRIPP. Preprocessing time and space complexities
are $O(m + n)$; the index supports efficient reachability queries in time $O(m - n)$. They run
a depth-first search from several root nodes such that each node obtains pairs of pre- and
post-ordering rankings. A tree can then be queried for reachability by looking at the ranking only.
Their approach outperforms the labelings by Cohen et al. [CHKZ03] and Wang et al. [WHY+06]
in practice (experiments are run for metabolic networks). Extensions to shortest path queries are
planned. Jin et al. [JXRW08, JXRF09] propose PathTree (for sparse graphs) and 3-hop (for dense
graphs) to further improve preprocessing and query times.
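How a query is answered from 2-hop labels can be sketched as follows. The sketch assumes that
the labels have already been computed; the dictionary layout and the function name are
assumptions of this example, and the hard part, finding a small cover, is not shown.

```python
def two_hop_distance(label_out, label_in, u, v):
    """Distance from u to v as the best concatenation of two labeled paths.

    label_out[u] maps a hub h to d(u, h); label_in[v] maps a hub h to d(h, v).
    """
    common = label_out[u].keys() & label_in[v].keys()
    return min(
        (label_out[u][h] + label_in[v][h] for h in common),
        default=float("inf"),  # no common hub: v is not reachable from u
    )
```

A reachability query is the special case that only asks whether the two labels share a hub, so
the query time is governed by the sizes of the two labels involved.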
3.2.3 Road Networks
Route planning for transportation networks (road networks in particular) has been studied
intensively [HP58, But68, EL82, IOAI91, SWN92, SKC93, CF94, LCL+94, ZZ94, Liu95, HJR95,
HJR96a, HJR96b, JHR96, AI97, FS97, HJR97a, HJR97b, SL97, SFG97, PS98, JHR98, CRS98,
HJR00, AY00, AY01, CTB01, ZC01, JP02, AOPS02, BSWW04, KMS05, HSWW05, GW05,
WWZ05, SS06, HLL06, MSS+06, BFM+07, BG07, KKK+07, CL07, Hol08, KKR08, KKK+08,
KKRS08, HSW08, GSSD08, Sch08a, Sch08b, BDS+08, DABC08, KH08, SS09, Del09]. The
9th DIMACS Implementation Challenge [DGJ08], which took place in 2006, stimulated a lot of
research. For recent results, we refer to the survey on route planning [DSSW09], the survey on
A*-based point-to-point shortest path queries [Gol07], the overview on engineering large network
applications [Zar08], and the theses by Schultes [Sch08a] and Delling [Del09]. Route planning
is also strongly related to efficient path query processing on spatial networks [PZMT03, GKR04,
DABC08, SSA08, SSA09, SS09].
Approaches based on graph annotation can be applied as described in Section 3.2.2. Recent
approaches exploiting hierarchy are specifically tailored for transportation networks. For an
overview of recent hierarchical approaches see also [Del09, pp. 4-6] and the references therein.
We give a brief overview in the following.
Highway Hierarchies (HH) [SS05, SS06, NBB+08] are based on the observation that a certain
class of edges (the "highway edges") tend to have greater representation among the portion
of the shortest paths that are not in the vicinity of either the source or target. A recursive
computation of these edges, paired with a contraction step, leads to a hierarchy of graphs that enables
an impressive speedup at query time. Highway hierarchies were first proposed by Sanders and
Schultes for undirected graphs [SS05], and later extended to directed graphs [SS06]. Nannicini et
al. [NBB+08] give a different approach for directed highway hierarchies. Their main focus is on
time-dependent weights though. Highway-Node Routing (HNR) [SS07b] is a variant of HH that
supports fast updates by additionally constructing overlay graphs.
The contraction step is an integral ingredient of the HH speedup technique. Nodes with low
degree can be contracted, since their removal does not cause many additional edges (an observation
related to van Vliet's "spider web" [Vli78] and Hu's "distance equivalent networks" [Hu69]). This
observation can be generalized [GSSD08]: For each node, the number of potential shortcut edges
is computed. If for a node under consideration the number of shortcuts is smaller than the number
of shortcuts one would expect based on the node degree, the node is contracted. The method called
Contraction Hierarchies [GSSD08, BDS+08] uses intelligent heuristics to contract nodes in the
right order.⁴ This order yields a hierarchy, with which the query algorithm can efficiently find
shortest paths. Contraction-based techniques perform very well in practice. The preprocessing
step is particularly efficient.
Transit-Node Routing (TNR) [BFSS07, BFM+07, Sch08b] is based on the following
observation: When driving somewhere far away, drivers usually leave their current location via one of
only a few access routes to a relatively small set of transit nodes. These transit nodes are then
interconnected by a sparse network relevant for long-distance travel. The TNR method precomputes
all shortest paths to transit nodes and all shortest paths among transit nodes. The preprocessing is
expensive but the query time is extremely fast.
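The table lookup behind such a query can be sketched as follows. This is a minimal illustration
under assumptions: the names `access` and `table` are hypothetical, the network is treated as
undirected, and source and target are assumed to be far enough apart that some shortest path
passes through transit nodes. Nearby pairs need a separate local search, which is not shown.

```python
def transit_node_distance(access, table, s, t):
    """Combine precomputed distances: s to access node, between transit nodes, and to t.

    access[v] maps each transit node used by v to its distance from v;
    table[a][b] is the precomputed distance between transit nodes a and b.
    """
    return min(
        d_s + table[a][b] + d_t
        for a, d_s in access[s].items()
        for b, d_t in access[t].items()
    )
```

The number of lookups is the product of the numbers of access nodes of the two endpoints, which
is small when only a few access routes exist per location.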
Combining graph annotation and hierarchical approaches often yields powerful methods. Several
combinations have been investigated and evaluated empirically [HSWW05, BDS+08]. Two
particularly strong combinations are CHASE [BDS+08], which combines contraction hierarchies
and arc flags, and SHARC [BD08], which combines shortcuts and arc flags. Shortcut edges are
additional edges that maintain the original distance but decrease the number of hops (the number
of edges on the shortest path, Definition 8). The problem of finding the best $k$ shortcuts has been
formalized and it is apparently hard (see [BDDW09, Theorems 1 and 2]). Nevertheless, SHARC
works very well in practice.
For an overview of the preprocessing time vs. query time tradeoff for some of these methods
see Figure 6.5; for experimental results, see Tables 6.4 and 6.1 (all in Chapter 6).
These hierarchy-based methods perform very well in practice; however, complexity results are
mostly experimental only.⁵ Recently, Abraham et al. [AFGW10] showed that
if a graph has low highway dimension, algorithms based on reach [Gut04], contraction
hierarchies [GSSD08, BDS+08], highway hierarchies [SS05, SS06, NBB+08], transit nodes [Sch08b],
and SHARC [BD08] have provable efficiency guarantees.
⁴ Van Vliet [Vli78] contracts nodes up to degree 4; his contractions of nodes with higher degree did
not yield any speedup but a slowdown. The contraction order appears to be very important.
⁵ Exactness and correctness are proven.
Some of the methods that were tailored for road networks also perform well for timetable
information systems [Sch05b, MHSWZ04, PSWZ07, Jac08, BDW09] (for details we refer to the
thesis by Schulz [Sch05b]) and for grids and unit-disk graphs [Del09, Tables 4.21 & 4.22]. For
higher-dimensional graphs, however, the preprocessing time and the space consumption increase
significantly [BDS+08, Tables 10 and 11].⁶
3.2.4 Complex Networks
Although very efficient when applied to road networks, the hierarchical techniques outlined in the
previous section seem to have problems handling graphs with higher node degrees. For complex
networks, other techniques and heuristics have been suggested and evaluated.
Rattigan et al. [RMJ06] approximate distances in graphs using a structure index. Their
algorithm grows zones using random exploration starting from random seeds. When combining their
zones with landmarks, the algorithm computes the distances between each node and zone in time
$O(ms)$ for $s$ zones. They obtain approximations of the closeness and betweenness centrality for
nodes. The structure indices also help to find a graph clustering [RMJ07].
Potamias et al. [PBCG09] use landmark-based A*. Their strategies for landmark selection
outperform random landmark selection strategies. Their experimental evaluation includes social
networks. The networks they consider do not seem to have low doubling dimension. To the best of
our knowledge, the theoretical performance of beacon-based triangulations [KSW09] for complex
networks has not been investigated yet. Metric embeddings of power-law graphs seem to require
higher dimensional spaces.
Brady and Cowen [BC06] propose a compact routing scheme for unweighted power-law
graphs. Since the scheme is based on tree routing, it can be extended to efficient distance and
shortest path queries. Their distance oracle has size $O(nE \lg^2 n)$ and it returns $(1, D)$-approximate
distances in time $\tilde{O}(1)$ for a parameter $D$ and a value $E$ that depends on $D$ and the graph. There
are no theoretical bounds on $D$ and $E$. The scheme roughly works as follows: we grow a shortest
path tree from the node with the highest degree. All nodes up to distance $D/2$ form the core
with diameter $D$. The remaining nodes form the fringe. The fringe is claimed to be almost a
forest. $E$ denotes the number of edges we must remove such that the fringe actually becomes a
forest. For each of these edges, an additional routing tree is produced. The scheme routes through
the node with the highest degree with additive stretch at most $D$ or optimally using one of the
fringe-trees. Intuitively, small values of $D$ imply large values of $E$. Experiments using random
power-law graphs [ACL00, CL02] indicate that both $D$ and $E$ can be chosen to be small simultaneously.
$(1, O(1))$-approximate distance oracles with space $\tilde{O}(n)$ may be possible for power-law
graphs, although there is no proof.
Das Sarma et al. [SGNP10] provide a practical implementation of Bourgain's embedding (Theorem 3),
and they propose an extension of the distance oracle by Thorup and Zwick [TZ05]. In
their extension, they omit ball computations. While the asymptotic performance is not affected,
their algorithms both for preprocessing and query are simpler and potentially faster in practice
than the corresponding original algorithms. The stretch bounds, however, only hold with high
probability.
Footnote 6: Bauer et al. [BDS+08, p. 22 of TR] observe that most speed-up techniques have problems
when the average node degree becomes too large.
Xiao et al. [XWP+09] compress graphs by exploiting symmetries. Instead of treating vertices
as a single unit, they work on orbits of automorphism groups. Shortest path queries are answered
using compact BFS-trees, which are based on these orbits. Symmetries in complex networks seem
to be very common. Experiments show that their method may be very efficient. Scalability cannot
be assessed appropriately without having the code. The largest graph they consider (the Internet
Autonomous System network) has 22,442 nodes and 45,550 edges. For this graph, their preprocessing
algorithm runs in roughly 347 seconds [XWP+09, Table 3]. No speedup for queries is
reported. The third-largest graph under consideration (a social network of Erdős collaborators) has
6,927 nodes and 11,850 edges. For this graph, preprocessing takes roughly 29 seconds [XWP+09,
Table 3]. They report a speedup for shortest path queries of 57.72 [XWP+09, Table 4]. Based on
these two graphs, the running time of the preprocessing algorithm appears to be roughly quadratic
in the number of nodes. Note, however, that the running time of their algorithm mainly depends
on the symmetries, which may be the main cause for the difference in the running times (also for
the two graphs considered here).
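The "roughly quadratic" estimate above can be checked with a two-point fit. The following is only a sketch: it assumes a pure power law (time proportional to nodes raised to some exponent) and uses only the two data points quoted from [XWP+09, Table 3]; the function and variable names are illustrative.

```python
import math

# (number of nodes, preprocessing time in seconds) as quoted above from [XWP+09, Table 3].
erdos_collaborators = (6_927, 29.0)
internet_as = (22_442, 347.0)


def scaling_exponent(small, large):
    """Exponent c for which time = const * nodes**c passes through both points."""
    (n_small, t_small), (n_large, t_large) = small, large
    return math.log(t_large / t_small) / math.log(n_large / n_small)


exponent = scaling_exponent(erdos_collaborators, internet_as)
print(f"estimated exponent: {exponent:.2f}")
```

With only two graphs the fit carries little statistical weight, which matches the caveat that the symmetry structure, rather than the size, may dominate the running time.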
Goldman et al. [GSVGM98] consider relationships among objects in large databases. Their
method processes keyword searches over databases in interactive query sessions. Distances be-
tween objects are computed based on a compact index, which consists of local neighborhoods and
distances to hub vertices (separators). Hubs are chosen as high-degree nodes. They evaluate the
performance using the Internet Movie Data Base (IMDB), which arguably has a power-law de-
gree sequence. In database systems, distance estimation is also related to computing the transitive
closure of relations [DR94, UY90, Jag90, HWYY05, TL07] and thus related to reachability query
processing.
Cheng and Yu [CY09] use 2-hop labels [CHKZ03] to efficiently compute exact distances. For
the DBLP graph with 52,682 nodes and 59,395 edges, preprocessing takes 20 seconds [CY09,
Table 1]. At query time, their method outperforms Dijkstra's algorithm by two orders of magnitude
[CY09, Figure 17 and Section 7.4].
The GRIPP index by Trißl and Leser [TL07] efficiently answers reachability queries for scale-free
networks (evaluation on metabolic networks). Distance queries are planned as an extension.
3.3 Summary
Shortest path query processing in graphs has been studied extensively both in theory and in prac-
tice. Practical investigations focus mainly on the important class of transportation networks, for
which substantial speedups with respect to classical SSSP algorithms can be achieved. For trans-
portation networks, the focus of practical research efforts appears to be shifting to dynamic sce-
narios. For complex networks, methods have been proposed only recently; their efficiency and
optimality are still under investigation.
Theoretical research on distance oracles for general graphs is centered around improving pre-
processing and query times (due to restrictive space lower bounds). For restricted graph classes
such as sparse graphs, planar graphs, and power-law graphs, various important questions remain
to be solved. Also, distance oracles for directed graphs of restricted classes are mostly unknown
territory.
Ningen banji saiou ga uma
(often what at first appears to be
bad turns out to be good)
Japanese proverb

4 Lower Bounds for Sparse Graphs
In this chapter, we investigate the tradeoff between the space complexity, the query time, and
the stretch of approximate distance oracles. The main result is a lower bound on the minimum
space consumption of distance oracles with query time $t$ and stretch $(\alpha, 0)$. This space lower
bound holds even for sparse ($\mathrm{polylog}(n)$-degree) graphs. The bound is proven using techniques
based on Pătraşcu's [Pat08a] recent lower bounds on communication protocols for
the LOPSIDEDSETDISJOINTNESS (LSD) problem and the space lower bounds for data structures
obtained by reductions to LSD.
Thorup and Zwick [TZ05] prove that, for some integer values of the stretch parameter $k \ge 1$,
any distance oracle with multiplicative stretch less than $2k+1$ needs space at least $\Omega(n^{1+1/k})$ bits
(Theorem 7, connected to Erdős' girth conjecture [Erd64, ES63]). Their proof holds even if infinite
query time is allowed. For sparse graphs, the best bound it proves is that the size of the data
structure is at least proportional to the number of edges in the graph. An exact distance oracle with
space complexity $O(m)$ and query time $O(1)$ would still be possible. For sparse graphs, such a
distance oracle would be very useful. Unfortunately, our bound implies that such an oracle does
not exist. Even if a small stretch is allowed, linear space and constant query time are impossible.
4.1 Context
We prove a lower bound for approximate distance oracles in the cell-probe model (Definition 25).
The main result of this chapter is a three-way tradeoff between space, stretch and query time.
Theorem 8. There exists an integer $n^{*}$ such that for all $n \ge n^{*}$, the following holds. Let
$S = S(n)$, $t = t(n) \le \lg n$, $w = w(n)$, and $\alpha = \alpha(n) = o\left(\frac{\lg n}{\lg(wn)}\right)$ be integers such that there
exists an $(\alpha, 0)$-approximate distance oracle with query time $t$ in the cell-probe model with word-length
$w$ for any graph with $n$ vertices and maximum degree at least $\mathrm{poly}\left(\frac{tw}{\lg n}\right)$. Then, the space
complexity of this distance oracle is at least
$$S \ge n^{1+\Omega\left(\frac{1}{\alpha t}\right)} / \lg n.$$
In the lower bound by Thorup and Zwick (Theorem 7), dense graphs with large girth (and
their subgraphs) are the worst-case instances. In our proof, $r$-regular graphs with large girth (and
their subgraphs) are the worst-case instances. For graphs with high regularity, devising a distance
oracle should intuitively be easy. For example, the distance between two nodes in the hypercube
is equal to the Hamming distance between the corresponding node labels. Short labelings may be
possible for various regular graphs.
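The hypercube example can be verified directly. The sketch below assumes nodes are labeled by the integers 0 to 2^dim - 1 (read as bit strings) and compares breadth-first-search distances with the Hamming distance of the labels; all function names are illustrative.

```python
from collections import deque


def hypercube_neighbors(node, dim):
    """Neighbors of a node in the dim-dimensional hypercube: flip exactly one bit."""
    return [node ^ (1 << bit) for bit in range(dim)]


def bfs_distances(source, dim):
    """Exact hop distances from source to every hypercube node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in hypercube_neighbors(u, dim):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def hamming_distance(u, v):
    """Number of bit positions in which the two labels differ."""
    return bin(u ^ v).count("1")


def labels_give_exact_distances(dim):
    """Check that the label-only answer agrees with BFS for every pair of nodes."""
    for s in range(1 << dim):
        dist = bfs_distances(s, dim)
        if any(dist[v] != hamming_distance(s, v) for v in range(1 << dim)):
            return False
    return True
```

Here the "distance oracle" stores nothing beyond the dim-bit labels, which is the kind of short labeling the paragraph alludes to.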
For an $r$-regular graph $G = (V, E)$ with diameter $\Delta = \mathrm{diam}(G) = O(\lg_r n)$ and for
$\alpha \ge \Delta/g(G)$, $(\alpha, 0)$-approximate distance oracles can be devised in a straightforward way. For each
node $u \in V$, compute the ball with radius $\Delta/\alpha$, denoted by $B_G^{\Delta/\alpha}(u)$ (which contains
$O(r^{\Delta/\alpha}) \le n^{O(1/\alpha)}$ nodes). For a distance query $d(u, v)$, if $v \in B_G^{\Delta/\alpha}(u)$, then return the exact distance;
otherwise return $\Delta$. This yields total space (and preprocessing time) $n^{1+O(1/\alpha)}$ for an $(\alpha, 0)$-approximate
distance oracle with constant query time.
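The ball-based oracle described above can be sketched in a few lines. This is a minimal illustration, not the construction used in the proofs: it assumes a connected, unweighted graph given as an adjacency dictionary, an integer stretch alpha, and it rounds the ball radius down to floor(diameter / alpha); the names `BallOracle`, `bfs_ball` and `cycle_graph` are hypothetical.

```python
from collections import deque


def bfs_ball(adj, source, radius):
    """Exact distances from source to all nodes within the given radius."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue  # do not explore beyond the ball boundary
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter(adj):
    """Largest eccentricity; assumes the graph is connected."""
    return max(max(bfs_ball(adj, u, len(adj)).values()) for u in adj)


class BallOracle:
    """Stores one ball per node; answers are exact inside the ball, else the diameter."""

    def __init__(self, adj, alpha):
        self.diam = diameter(adj)
        self.radius = self.diam // alpha  # assumption: radius rounded down
        self.balls = {u: bfs_ball(adj, u, self.radius) for u in adj}

    def query(self, u, v):
        # Outside the ball, d(u, v) > diam / alpha, so returning diam has stretch <= alpha.
        return self.balls[u].get(v, self.diam)


def cycle_graph(n):
    """A 2-regular example graph on nodes 0..n-1."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
```

For example, `BallOracle(cycle_graph(12), alpha=3)` stores balls of radius 2 and answers every query within a factor of 3 of the true distance. The space is the sum of the ball sizes, which is where the bound of the form n^{1+O(1/alpha)} comes from for regular graphs with logarithmic diameter.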
We prove that this amount of space is necessary (up to constant factors in the exponent) for
distance oracles with constant query time that can preprocess $r$-regular graphs and all their subgraphs
(footnote 1). It is impossible to prove a lower bound that holds for a particular graph $G$. This is due
to the fact that the algorithm can hard-code $G$ and its distance table in order to answer queries in
constant time (without accessing the data structure). In this proof, we will refer to a worst-case
instance for a distance oracle as a base-graph $G$, meaning that the distance oracle must accept
at least the class $\mathcal{G}$ consisting of $G$ and all its subgraphs. The space lower bound by Thorup and
Zwick (Theorem 7) is also proven with respect to a base-graph and all its subgraphs.
4.2 Preliminaries
In the cell-probe model [Yao81, Mil99] (Definition 25), a cell has $w$ bits and the space of a data
structure is measured as the number of cells it occupies, denoted by $S$. The query time is measured
by the worst-case number of cells that a query reads. All computations based on cells that have
been read are free. The most typical values of the cell size (also called word length) are $w = \lg n$
or $w = \mathrm{polylog}(n)$, but larger (or smaller) values may be interesting as well.
A class of graphs $G = (V, E)$ is considered sparse if $|E| = \tilde{O}(|V|)$. We may sometimes deem
a graph to be sparse if $|E| \le n \cdot \mathrm{poly}(w, \lg n)$.
4.2.1 Communication Complexity
Our proof uses a reduction from a distance oracle to a communication protocol. This proof tech-
nique [Ajt88, Mil94, KW90, MNSW98, AIP06, Pat08a, Pat08b] has been used for reductions
from various data structures (the cell-probe interactions with the data structure, to be precise) to
communication protocols.
Communication complexity [Yao79, AB07] is the problem of two separated parties, Alice and
Bob, holding inputs $x \in \{0, 1\}^n$ and $y \in \{0, 1\}^n$, respectively, computing the result $\phi(x, y)$
of a function $\phi : \{0, 1\}^n \times \{0, 1\}^n \to \{0, 1\}$. Before Alice and Bob receive their inputs, they
agree on a communication protocol using which they will compute $\phi(x, y)$. It is assumed that both
Alice and Bob have infinite computing resources. The communication complexity of the protocol
is the number of bits sent in total. The communication complexity of computing $\phi$ is equal to the
communication complexity of the best-possible protocol.
For our proof, an intuitive understanding of a communication protocol as a sequence of messages
between Alice and Bob should suffice. A more rigorous definition is the following.
Definition 32 ((Symmetric) Communication Protocol [AB07]). A (symmetric) $t$-round communication
protocol for a function $\phi : \{0, 1\}^n \times \{0, 1\}^n \to \{0, 1\}$ is a sequence of function pairs
$(S_1, C_1), (S_2, C_2), \ldots, (S_t, C_t), (\phi_1, \phi_2)$. The input of $S_i$ is the communication pattern of the first
$i - 1$ rounds and the output is from $\{1, 2\}$, indicating which player will communicate in the $i$th
round. The input of $C_i$ is the input string of this selected player as well as the communication
pattern of the first $i - 1$ rounds. The output of $C_i$ is the bit that this player will communicate in
the $i$th round. Finally, $\phi_1$ and $\phi_2$ are 0/1-valued functions that the players apply at the end of the
protocol to their inputs as well as the communication pattern in the $t$ rounds in order to compute
the output. These two outputs must be $\phi(x, y)$. The communication complexity of $\phi$ is
$$C(\phi) = \min_{\text{protocols } CP} \; \max_{x, y} \; \{\text{Number of bits exchanged by } CP \text{ on } x, y\}.$$
A trivial protocol is to communicate the entire input of one player, compute $\phi(x, y)$ at the
other player, and send back the result. This yields $C(\phi) \le n + 1$. The objective is to communicate
(significantly) less.
Footnote 1: As a comparison: it is known that a metric embedding into $\ell_2$ of the shortest path metric
on $r$-regular graphs with girth $g$ requires $(\Omega(\sqrt{g}), 0)$ distortion [LMN02]. This means that, for
constant $r$ and girth $O(\lg_r n)$, an embedding-based distance oracle with space $\tilde{O}(n)$ has stretch
$(\Omega(\sqrt{\lg n}), 0)$.
Lopsided Set Disjointness
In communication complexity [Yao79, KN96, AB07], SETDISJOINTNESS is the problem of two
separated agents deciding whether two sets are disjoint. In the asymmetric version of the problem,
called LOPSIDEDSETDISJOINTNESS (LSD), Alice and Bob receive sets $S_{Alice}$ and $S_{Bob}$, respectively.
Their goal is to determine whether $S_{Alice} \cap S_{Bob} = \emptyset$ using a communication protocol
(Figure 4.1). More precisely (as in Definition 32), the function $\phi$ is 1 if $S_{Alice} \cap S_{Bob} = \emptyset$ and
0 otherwise.
Figure 4.1: The LOPSIDEDSETDISJOINTNESS communication problem (Alice is given $S_{Alice} \subseteq U$,
Bob is given $S_{Bob} \subseteq U$; they must decide whether $S_{Alice} \cap S_{Bob} = \emptyset$).
LOPSIDEDSETDISJOINTNESS has two parameters, $N$ and $B$, known to both Alice
and Bob. The universe has size $NB$ and both Alice and Bob are given a subset of this universe;
$S_{Alice} \subseteq [NB]$ and $S_{Bob} \subseteq [NB]$. Alice's set has size $|S_{Alice}| = N$. (Alice is given one of $\binom{NB}{N}$
different sets.) $B$ thus denotes the ratio between the size of the universe $NB$ and $N$. The size
of Bob's set is not fixed; it may be $\Theta(NB)$. Two trivial protocols are the following: (1) Alice
communicates her set $S_{Alice}$ with a message of length $\lg \binom{NB}{N} = O(N \lg B)$ bits; Bob can then
compute $S_{Alice} \cap S_{Bob}$ and reply with 1 bit. Alternatively, (2) Bob communicates his set $S_{Bob}$
with a message of length $\lg |S_{Bob}| + \lg \binom{NB}{|S_{Bob}|}$ (encoding $|S_{Bob}|$ and $S_{Bob}$); Alice computes
$S_{Alice} \cap S_{Bob}$ and sends back 1 bit. Either Alice or Bob communicates his/her complete set. The
question is how much communication Alice and Bob need to decide whether $S_{Alice} \cap S_{Bob} = \emptyset$. A
trivial randomized protocol is the following: Alice tosses a coin; based on the result, she decides
whether $S_{Alice} \cap S_{Bob} = \emptyset$ and sends 1 bit to Bob. Let this protocol be denoted by $CP$. The probability
that the result is correct is $\Pr[CP(S_{Alice}, S_{Bob}) = \phi(S_{Alice}, S_{Bob})] = 1/2$. Alice could also send a
random sample, based on which Bob must decide whether $S_{Alice} \cap S_{Bob} = \emptyset$. A randomized protocol
$CP'$ is said to have two-sided error if there is a non-zero probability that Alice and Bob err
when they output either 0 or 1, and it is said to have one-sided error if the probability that Alice
and Bob err is zero for at least one of the possible outcomes 0 or 1. For example, $CP'$ has one-sided
error if $\phi(S_{Alice}, S_{Bob}) = 0 \Rightarrow \Pr[CP'(S_{Alice}, S_{Bob}) = 0] = 1$ [MR95]. A protocol $CP'$ is
said to have bounded error if there are two constants $\varepsilon_0, \varepsilon_1 > 1/2$ such that
$\phi(S_{Alice}, S_{Bob}) = 0 \Rightarrow \Pr[CP'(S_{Alice}, S_{Bob}) = 0] \ge \varepsilon_0$ and
$\phi(S_{Alice}, S_{Bob}) = 1 \Rightarrow \Pr[CP'(S_{Alice}, S_{Bob}) = 1] \ge \varepsilon_1$.
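The message lengths of the two trivial deterministic protocols can be computed exactly for small parameters. The sketch below is illustrative only: it assumes Python 3.8+ for `math.comb`, rounds every field up to whole bits, and encodes the size of Bob's set with a fixed-width field of lg(NB + 1) bits (an assumption, slightly different from the lg |S_Bob| term in the text); the function names are hypothetical.

```python
import math


def alice_message_bits(n_param, b_param):
    """Bits needed to name one of the C(NB, N) possible sets Alice may hold."""
    return math.ceil(math.log2(math.comb(n_param * b_param, n_param)))


def bob_message_bits(n_param, b_param, size_bob):
    """Bits for Bob to send the size of his set followed by the set itself."""
    universe = n_param * b_param
    size_field = math.ceil(math.log2(universe + 1))  # assumed fixed-width size field
    set_field = math.ceil(math.log2(math.comb(universe, size_bob)))
    return size_field + set_field


def alice_upper_bound_bits(n_param, b_param):
    """The O(N lg B) bound via C(NB, N) <= (e * B)**N."""
    return n_param * math.log2(math.e * b_param)
```

For instance, with N = 64 and B = 16, Alice's message is far shorter than a message from Bob describing a set of 512 elements, which is the asymmetry that gives the problem its name.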
The communication complexity to solve LSD with a one-sided error is known to be bounded
from below (Miltersen et al. [MNSW98]). Note that a lower bound on communication protocols
with one-sided error is easier than a bound on protocols with two-sided error, since one-sided error
protocols are required to have a better performance than two-sided error protocols.
Lemma 9 (Miltersen et al. [MNSW98]). There exists some constant $C > 0$ such that in a one-sided
error protocol for LSD, either Alice sends at least $CN \lg B$ bits or Bob sends at least $NB^{C}$ bits.
Andoni et al. [AIP06] extend the bound to include protocols with two-sided error as well;
Pătraşcu [Pat08a, Pat09, Pat08c] proves the following (alternatively, one could use Pătraşcu's
approach with [Pat08b, Theorem 5.15] and [Pat08b, Chapter 5.4.3]).
Lemma 10 (Pătraşcu [Pat09, Theorem 1.4]). There exists some constant $C > 0$ such that in a
bounded-error protocol for LSD, either Alice sends at least $CN \lg B$ bits or Bob sends at least $NB^{C}$ bits.
This roughly means that, for Alice and Bob to know whether $S_{Alice} \cap S_{Bob} = \emptyset$ with probability
bounded away from $1/2$, either Alice or Bob must send almost their complete set. (For the
communication complexity of symmetric set disjointness, see [KS92, Raz92, BYJKS04, HW07].)
Data Structure Lower Bounds based on LSD
One key part in Pătraşcu's results [Pat08a] is the reduction from LOPSIDEDSETDISJOINTNESS
(LSD) to reachability oracles. Recall (Definition 31) that the reachability query problem is, given
a (sparse) directed graph $G = (V, E)$, to construct a data structure using less than $n^2$ space such
that reachability queries (deciding whether there is a directed path from $u$ to $v$) can be answered
efficiently. Reachability oracles for undirected graphs are trivial: we compute the connected components
and store a component number for each node. Storing reachability information for directed
graphs appears to be hard [AF90] and so is the reachability query problem.
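The trivial oracle for undirected graphs mentioned above fits in a few lines. This is a minimal sketch, assuming nodes are numbered 0 to n-1 and edges are given as pairs; the function names are illustrative.

```python
def component_labels(num_nodes, edges):
    """Assign each node the number of its connected component (iterative DFS)."""
    adj = [[] for _ in range(num_nodes)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    label = [-1] * num_nodes
    current = 0
    for start in range(num_nodes):
        if label[start] != -1:
            continue  # already reached from an earlier start node
        label[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if label[v] == -1:
                    label[v] = current
                    stack.append(v)
        current += 1
    return label


def reachable(label, u, v):
    """Constant-time query: two nodes are mutually reachable iff their labels agree."""
    return label[u] == label[v]
```

The data structure uses one number per node and answers queries with two lookups. No comparably simple labeling is known for directed graphs, which is what the lower bound below addresses.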
Recall Pătraşcu's theorem on reachability oracles (Theorem 5), which we restate here.
Theorem (Pătraşcu [Pat08a, Theorem 2]). A reachability oracle using space $S$ in the cell-probe
model with $w$-bit cells requires query time $t = \Omega\left(\lg n / \lg \frac{Sw}{n}\right)$.
Pătraşcu's proof is a reduction from a variant of LSD to the problem of reachability queries in
a butterfly graph and its subgraphs.
Definition 33. A butterfly graph BUTTERFLY$(\ell, r)$ is a directed graph $F_{\ell,r} = (V, A)$ specified by
two parameters: the depth $\ell$ and the degree $r$.
- $F_{\ell,r}$ has $\ell + 1$ layers $V_0, V_1, \ldots, V_\ell$ with $r^{\ell}$ vertices each.
- The node set is $[\ell + 1] \times [r]^{\ell}$.
- Every vertex $v \in V \setminus V_\ell$ has out-degree $r$.
- Every vertex $v \in V \setminus V_0$ has in-degree $r$.
- Arcs $a \in A \subseteq \bigcup_{i=0}^{\ell-1} V_i \times V_{i+1}$ only connect nodes in adjacent levels $V_i, V_{i+1}$.
- Two nodes $v \in V_i$ and $v' \in V_{i+1}$ are adjacent if they differ only at coordinate $i$. Node
  $(i, c_1, c_2, \ldots, c_i, \ldots, c_\ell)$ is connected to all nodes $(i + 1, c_1, c_2, \ldots, c'_i, \ldots, c_\ell)$ for $c'_i \in [r]$.
Note that paths between any $s \in V_0$ and $t \in V_\ell$ are unique. For an illustration of undirected
butterfly graphs see Figure 4.2 for 3 layers with degree 2 and Figure 4.3 for 3 layers with degree 3.
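The structure of Definition 33, and in particular the uniqueness of paths between the first and the last layer, can be checked on small instances. The sketch below makes one indexing assumption that is not taken from the text: coordinates are 0-indexed, and an arc leaving layer i may change exactly coordinate i, for i from 0 to depth - 1. Function names are illustrative.

```python
from itertools import product


def butterfly_arcs(depth, degree):
    """Out-neighbors of every node (layer, coords) in layers 0..depth-1.

    Assumption: an arc from layer i to layer i + 1 may change only the
    0-indexed coordinate i; all other coordinates stay fixed.
    """
    arcs = {}
    for layer in range(depth):
        for coords in product(range(degree), repeat=depth):
            arcs[(layer, coords)] = [
                (layer + 1, coords[:layer] + (c,) + coords[layer + 1:])
                for c in range(degree)
            ]
    return arcs


def count_paths(arcs, source, target):
    """Number of directed paths from source to target in the layered DAG."""
    if source == target:
        return 1
    return sum(count_paths(arcs, nxt, target) for nxt in arcs.get(source, []))
```

For depth 3 and degree 2 this yields 8 nodes per layer with out-degree 2 in every layer except the last, and exactly one directed path between each first-layer node and each last-layer node: the path is forced to set coordinate i to its target value when leaving layer i.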
The reduction from LSD is based on the following idea: the universe of LSD is mapped to the
edge set of the butterfly graph $G = (V, E)$ using a bijection $f : [NB] \to E$. The input sets of
Alice and Bob can be mapped to subsets of the edges $f(S_{Alice}), f(S_{Bob}) \subseteq E$, respectively (footnote 2). Let
us assume that the set of Alice consists of a set of paths of length $\ell$ between nodes in $V_0$ and $V_\ell$.
Let $(e_1, e_2, \ldots, e_\ell)$ denote such a path between $u \in V_0$ and $v \in V_\ell$. Bob simulates the reachability
data structure for the subgraph of the butterfly graph with his edges removed: $G' = (V, E \setminus f(S_{Bob}))$.
Alice can find out whether at least one of her edges $e_i$ is in Bob's set $f(S_{Bob})$ by asking
one reachability query reachable$(u, v)$. Alice does so by sending a message in a communication
protocol (technically speaking, she encodes the position of the word in the data structure she
wants to read as in the cell-probe model). By asking all queries in parallel [PT06, Pat08a], Alice
learns whether at least one of her edges is in Bob's set $f(S_{Bob})$; Alice thus also knows
whether $f(S_{Alice}) \cap f(S_{Bob}) = \emptyset \Leftrightarrow S_{Alice} \cap S_{Bob} = \emptyset$. It turns out that, if the data structure is small,
the communication complexity of this protocol is very low, which contradicts the communication
lower bound of LSD.
Figure 4.2: A drawing of an undirected butterfly graph with degree 2 spanning 3 layers.
Recall that the lower bound for reachability oracles directly implies a lower bound for distance
oracles on directed graphs (Section 3.1.1, Definition 31). In a straightforward way, it also implies
a lower bound for distance oracles on undirected graphs. The following direct reduction from
reachability oracles for subgraphs of the butterfly graph to distance oracles with less than stretch
(1, 2) yields the same space lower bound for the latter. The lemma is a minor result of this chapter;
it serves as a warm-up and as an illustration of Pătraşcu's reduction technique.
Lemma 11. In the cell-probe model with $w$-bit cells, a distance oracle with additive stretch less
than 2 using space $S$ requires query time $t = \Omega\left(\lg n / \lg \frac{Sw}{n}\right)$.
Proof. We give a straightforward reduction from reachability oracles for the directed butterfly
graph $F$ in Pătraşcu's reduction [Pat08a, Reduction 13] (see also [Pat08b, Reduction 7.11]) to a
distance oracle for the same graph, interpreted as undirected, say $F'$. If node $v$ is reachable from
$u$ in $F$, the distance in $F'$ is equal to $\ell$; if $v$ is not reachable from $u$ in $F$, the distance in $F'$ must
be at least $\ell + 2$ since the butterfly graph $F'$ and its subgraphs are bipartite. Therefore, if there is
an algorithm that approximates distances with additive stretch less than 2 using $t$ probes in a data
structure of size $S$, then there is an algorithm using the same $t$ probes in the same data structure
of size $S$ that answers reachability queries by distinguishing distances $\ell$ and distances at least
$\ell + 2$.
Figure 4.3: A drawing of an undirected butterfly graph with degree 3 spanning 3 layers.
Footnote 2: We abuse notation by simplifying $f(E') := \bigcup_{e \in E'} \{f(e)\}$.
Corollary 12. In the cell-probe model with $w$-bit cells, a distance oracle with additive stretch
less than 2 and query time $t$ requires space $S$ at least
$$S \ge n^{1+\Omega(1/t)} / w.$$
Note that, for $w = 1$, Corollary 12 yields a lower bound in the bit-probe model [MP69].
The remainder of this chapter is devoted to amplifying the construction to obtain a lower
bound for multiplicative stretch $\alpha$. Instead of the butterfly graph, the worst-case instance is a
regular graph with large girth.
4.2.2 Regular Graphs with Large Girth
In our space lower bound, we need graphs that contain many disjoint shortest paths of length $\ell$.
The following is the definition of the set of sets of vertex-disjoint paths. It is used extensively in
the proof.
Definition 34. For a graph $G = (V, E)$ and two positive integers $\ell, k$, let $P_{\ell,k}(G)$ denote the set
of sets of edges defined as follows. The members of $P_{\ell,k}(G)$ are all possible sets $P \subseteq E$, where
$P$ can be expressed as the union of $k$ vertex-disjoint paths in $G$, each of length exactly $\ell$.
The proof also requires that these paths $(v_0, v_1, \ldots, v_\ell)$ are shortest paths chosen such that
there is no short alternative path connecting $v_0$ and $v_\ell$. Paths do have this property if the girth of
the graph is sufficiently large.
Graph Requirements
Our bounds require reductions from sparse graphs with large $P_{\ell,k}(G)$. To ensure that $P_{\ell,k}(G)$ is
sufficiently large, we require certain properties of the graph. Based on good expansion $\lambda(G)$ of a
graph $G$, one can show that there are many disjoint paths in $G$ [AC07]. Based on this, we prove
that $P_{\ell,k}(G)$ is large (Lemma 21).
Definition 35 ((normalized) second-largest eigenvalue [Alo86, AFWZ95]). Let $G = (V, E)$ be an
$r$-regular graph with $n$ vertices. Let $A$ be an adjacency matrix of $G$. Let $B := A/r$, that is,
$b_{i,j} = 1/r$ if $(v_i, v_j) \in E$ and $b_{i,j} = 0$ otherwise. Let $\lambda_0 \ge \lambda_1 \ge \ldots \ge \lambda_{n-1}$ denote the eigenvalues
of $B$. Let $\lambda(G)$ denote the second-largest absolute value of an eigenvalue among $|\lambda_0|, |\lambda_1|, \ldots, |\lambda_{n-1}|$.
$\lambda(G)$ is called the (normalized) second-largest eigenvalue of $G$. $\lambda(G)$ is also called the expansion
of $G$. $G$ is called Ramanujan if $\lambda(G) \le \frac{2\sqrt{r-1}}{r}$.
It is known that each $|\lambda_i|$ is a real number in the range $[0, 1]$. It is also known that $\lambda_0 = 1$ and
that $\lambda_1 \ge 0$. Therefore, $\lambda(G) := \max\{\lambda_1, |\lambda_{n-1}|\}$.
At one point in the proof, we rely on the expansion property of a graph. We use the following
theorem. Based on the expansion $\lambda(G)$, Alon et al. [AFWZ95] prove a lower bound on the
probability that a random walk of length $\ell$ stays inside a set of a certain density.
Theorem 13 (Alon et al. [AFWZ95, Theorem 4.2]). For a graph $G = (V, E)$ with $\lambda := \lambda(G)$,
the probability that a random walk of $\ell$ steps from a uniformly random starting vertex stays inside
$U \subseteq V$, where $|U| \ge \mu n \ge 6\lambda n$, is at least $(\mu - 2\lambda)^{\ell}$.
The theorem implies the following corollary by setting $\mu = 0.9$.
Corollary 14. For a graph $G = (V, E)$ with $\lambda(G) \le 0.1$, the probability that a random walk of
$\ell$ steps from a uniformly random starting vertex stays inside $U \subseteq V$, where $|U| \ge \frac{9}{10}|V|$, is at least
$\frac{1}{2^{\ell}}$.
Deep knowledge of expansion is not necessary to understand our proof in this chapter. For
more information on expander graphs, we refer to [Alo86, AFWZ95, HLW06].
Graph Construction
Our proof relies on the existence of graphs with large girth and large $P_{\ell,k}(G)$; Ramanujan graphs
(footnote 3) (construction by Lubotzky et al. [LPS88] using the Cayley graph of a projective general linear
group) are what we use (footnote 4). For the connection to dense graphs with large girth, see also [EJ08,
Construction IV].
Lubotzky et al. [LPS88, pp. 262–263] prove the following (footnote 5). Recall that the Legendre symbol
for two unequal primes $p \ne q$ is defined as follows:
$$(p|q) = \begin{cases} +1 & \text{if there is an integer } x \text{ such that } p \equiv_q x^2 \\ -1 & \text{otherwise} \end{cases}$$
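Which case of the construction applies to a given pair of primes can be decided by evaluating the Legendre symbol. The sketch below computes it in two ways: directly from the definition above, and via Euler's criterion (a standard fact not stated in the text). It assumes $p$ and $q$ are distinct odd primes; the function names are illustrative.

```python
def legendre_by_definition(p, q):
    """+1 if p is congruent to a square modulo q, -1 otherwise (brute force)."""
    squares = {(x * x) % q for x in range(1, q)}
    return 1 if p % q in squares else -1


def legendre_symbol(p, q):
    """Same value via Euler's criterion: p**((q - 1) / 2) mod q is 1 for squares."""
    return 1 if pow(p, (q - 1) // 2, q) == 1 else -1
```

For the primes 5, 13, 17 and 29 (all congruent to 1 mod 4), `legendre_symbol(13, 17)` is +1 and `legendre_symbol(5, 13)` is -1, so both cases of the construction occur already for small parameters.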
Footnote 3: Ramanujan graphs are named after the Indian mathematician Srinivasa Iyengar Ramanujan (1887–1920).
Footnote 4: In our proof, we apply Theorem 13 by Alon et al. with the condition that $\lambda(G) \le 0.1$. The
construction by Lubotzky et al. is not the only one that guarantees regular expanders with large girth.
The construction by Morgenstern [Mor94] may potentially work as well. It has been shown that random
regular graphs also have large expansion [Fri91, FKS89]. Random regular graphs with large girth may
thus potentially work as well.
Footnote 5: Note that, in their paper, Lubotzky et al. do not normalize $\lambda$. The statement in Theorem 15
uses normalized eigenvalues as defined in Definition 35.
Theorem 15 (Lubotzky et al. [LPS88, pp. 262–263]). Let $p$ and $q$ be unequal primes congruent
to 1 mod 4. There exists a $(p + 1)$-regular graph $X^{p,q}$ with the following properties:
Case $(p|q) = +1$
1. $X^{p,q}$ has $\frac{q(q^2 - 1)}{2}$ nodes.
2. $g(X^{p,q}) \ge 2 \lg_p q$
3. $\mathrm{diam}(X^{p,q}) \le 2 \lg_p n + 2 \lg_p 2 + 1$
4. $\lambda(X^{p,q}) \le \frac{2\sqrt{p}}{p+1}$
Case $(p|q) = -1$
1. $X^{p,q}$ has $q(q^2 - 1)$ nodes.
2. $g(X^{p,q}) \ge 4 \lg_p q - \lg_p 4$
3. $\mathrm{diam}(X^{p,q}) \le 2 \lg_p n + 2 \lg_p 2 + 1$
4. $\lambda(X^{p,q}) \le \frac{2\sqrt{p}}{p+1}$
In the following we prove the existence of Ramanujan graphs for infinitely many and various
choices of large $r$ and $n$. The lemma is mostly implied by the result of Lubotzky et al. [LPS88]
stated in Theorem 15.
Lemma 16. For every sufficiently large $n_0, r_0$ with $n_0 > 8r_0^3$, there exists a graph $G = (V, E)$
with the following properties:
1. $|V| = n$ with $\frac{n_0}{2} \le n \le 9n_0$
2. $G$ is $r$-regular, where $r_0 \le r \le 2r_0$
3. The girth of $G$ is at least $g(G) \ge \frac{1}{2} \lg_r n$
4. $\lambda(G) \le \frac{2\sqrt{r-1}}{r}$
Proof. The graph claimed to exist is a Ramanujan graph $X^{p,q}$ as in Theorem 15. The construction
by Lubotzky et al. [LPS88, pp. 262–263] requires unequal primes $p \ne q$, both congruent to 1
mod 4. The graph has at least $n = \frac{q(q^2 - 1)}{2}$ nodes and it is $r = (p + 1)$-regular.
In the following, we prove the existence of primes $p, q$ such that $r$ and $n$ lie in the ranges
$r \in [r_0, 2r_0]$ and $n \in [\frac{n_0}{2}, 9n_0]$.
The Bertrand-Chebyshev theorem (footnote 6) states that for every $x > 1$ there is always at least one
prime $p$ such that $x < p < 2x$. This generalizes to certain arithmetic progressions [Bre32, Erd35,
Bre64, Mor93]. Let $z \in \mathbb{N}_0$. Breusch [Bre32, p. 505] states
[...] daß für $x \ge 7$ zwischen $x$ und $2x$ stets Primzahlen einer jeden der vier Progressionen
$3z + 1$, $3z + 2$, $4z + 1$, $4z + 3$ liegen.
that for every $x \ge 7$ there is a prime of the form $p = 4z + 1$ in the interval between $x$ and $2x$.
Note that the modular congruence is satisfied, $p \equiv_4 1$.
Due to Breusch [Bre32, p. 505], there exists a prime $p \in [r_0 - 1, 2r_0 - 2]$ of the form $p = 4z + 1$.
Let $p$ denote this prime.
The Lubotzky et al. construction of the Ramanujan graph requires a different prime $q \ne p$ of
the form $q = 4z + 1$. Again, due to Breusch [Bre32, p. 505], there exists such a prime $q \in [x, 2x]$
if the intervals for $p$ and $q$ do not overlap. Let $x := \lceil n_0^{1/3} \rceil + 1 \in [n_0^{1/3} + 1, n_0^{1/3} + 2)$. The imposed
condition $n_0 > 8r_0^3$ ensures that the two intervals do not overlap. Thus, there is a different prime
in the interval $[x, 2x]$ for every integer $x \ge 2r_0 - 1$. Let $q$ denote this prime. The Ramanujan
graph has either $q(q^2 - 1) = q^3 - q$ or $\frac{q(q^2 - 1)}{2} = \frac{q^3 - q}{2}$ nodes. For the number of nodes $n$, we
have that $n \in \left[\frac{x^3 - x}{2}, 8x^3 - 2x\right]$. With $x = \lceil n_0^{1/3} \rceil + 1 \in [n_0^{1/3} + 1, n_0^{1/3} + 2)$, we derive
$$n \in \left[ \frac{n_0 + 3n_0^{2/3} + 3n_0^{1/3} + 1 - n_0^{1/3} - 2}{2}, \; 8\left(n_0 + 6n_0^{2/3} + 12n_0^{1/3} + 8\right) - 2\left(n_0^{1/3} + 1\right) \right]$$
$$n \in \left[ \frac{n_0 + 3n_0^{2/3} + 2n_0^{1/3} - 1}{2}, \; 8n_0 + 48n_0^{2/3} + 94n_0^{1/3} + 62 \right]$$
$$n \in \left[ \frac{n_0}{2}, \; 9n_0 \right]$$
for $n_0$ sufficiently large (such that $n_0 \ge 48n_0^{2/3} + 94n_0^{1/3} + 62$).
Footnote 6: Named after Joseph Louis François Bertrand (1822–1900), who conjectured the existence of
primes between $x$ and $2x$ in 1845, and Pafnuty Lvovich Chebyshev (1821–1894), who proved the
conjecture in 1850. Ramanujan gave a simpler proof in 1919. It's a small world!
The graph is $(p + 1)$-regular. Let $r := p + 1$.
The girth is at least $2 \lg_p q$ (since $4 \lg_p q - \lg_p 4 \ge 2 \lg_p q$ for $q \ge 2$). In terms of $n$ and $r$, this
yields
$$g(X^{p,q}) \ge 2 \lg_p q = \frac{2}{3} \lg_p q^3 = \frac{2}{3} \cdot \frac{\ln q^3}{\ln p} = \frac{2 \ln(p + 1)}{3 \ln p} \lg_{p+1} q^3 \ge \frac{1}{2} \lg_r n.$$
The expansion is $\lambda(X^{p,q}) \le \frac{2\sqrt{p}}{p+1}$ due to Theorem 15. This concludes the proof.
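4.2.3 Counting Permutations
In the following, we prove the existence of a not-too-large set of permutations with certain
properties. The lemma is a tailored restatement of Pătraşcu [Pat08a, Lemma 11], proven using the
probabilistic method [AS00, Erd63].
Lemma 17. Let $S_\nu$ denote the set of permutations of $[\nu] = \{1, 2, \ldots, \nu\}$. For $\delta \in [0, 1]$, let $A_\delta$
denote the set of all subsets of $[\nu]$ with $\delta\nu$ elements. Let $a := |A_\delta|$. Let $B \subseteq A_\delta$ of size at least
$|B| \ge b$. There exists a set $\Pi \subseteq S_\nu$ of $|\Pi| =: \kappa > \frac{a}{b} \ln a$ permutations $\{\pi_1, \pi_2, \ldots, \pi_\kappa\}$ such that
for any set $A \in A_\delta$ there exists a permutation $\pi_i \in \Pi$ with $\bigcup_{\bar{a} \in A} \{\pi_i(\bar{a})\} \in B$.
Proof. We denote $\pi_i(A) := \bigcup_{\bar{a} \in A} \{\pi_i(\bar{a})\}$. Let $\pi$ denote a randomly-chosen permutation
$\pi : [\nu] \to [\nu]$. Fix $A \in A_\delta$. Since $\pi(A)$ is a uniformly random member of $A_\delta$, the probability
that $\pi(A) \in B$ is at least
$$\Pr[\pi(A) \in B] = \frac{|B|}{a} \ge \frac{b}{a}.$$
Consider $\kappa$ permutations $\pi_i : [\nu] \to [\nu]$ selected independently at random, $\pi_1, \pi_2, \ldots, \pi_\kappa$. The
probability that none of the permutations $\pi_i$ maps $A$ to a set contained in $B$ is bounded by
$$\Pr[\forall i \in [\kappa] : \pi_i(A) \notin B] \le \left(1 - \frac{b}{a}\right)^{\kappa} < e^{-\kappa b/a}.$$
We need that for any $A \in A_\delta$ there is at least one permutation out of the $\kappa$ permutations that
maps $A$ to a set contained in $B$. We apply the union bound to obtain the following.
$$q := \Pr[\exists A \in A_\delta \; \forall i \in [\kappa] : \pi_i(A) \notin B] \le a \left(1 - \frac{b}{a}\right)^{\kappa} < a e^{-\kappa b/a}.$$
We need $1 - q > 0$ to apply the probabilistic method [AS00, Erd63] to guarantee that there exists
a set $\Pi \subseteq S_\nu$ of $\kappa$ permutations such that for each $A \in A_\delta$ there exists a permutation $\pi_i \in \Pi$
with $\pi_i(A) \in B$. We obtain
$$1 - a e^{-\kappa b/a} > 0 \;\Leftrightarrow\; \frac{1}{a} > e^{-\kappa b/a} \;\Leftrightarrow\; -\ln a > -\kappa \frac{b}{a} \;\Leftrightarrow\; \kappa > \frac{a}{b} \ln a.$$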
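The probabilistic argument can be replayed on a toy instance. The sketch below is an illustration under stated assumptions, not part of the proof: the ground set is 0-indexed, the number of permutations is taken as the smallest integer exceeding (a/b) ln a, the family B is assumed non-empty, and the random draw is simply repeated until a covering set is found (one exists by the lemma). The function name and parameters are hypothetical.

```python
import itertools
import math
import random


def covering_permutations(nu, subset_size, family_b, rng):
    """Return kappa permutations of range(nu) such that every subset of the
    given size is mapped into family_b by at least one of them."""
    subsets = [frozenset(s) for s in itertools.combinations(range(nu), subset_size)]
    a, b = len(subsets), len(family_b)
    kappa = math.floor(a / b * math.log(a)) + 1  # smallest integer > (a/b) ln a
    while True:
        # Each permutation is a list p, read as the map x -> p[x].
        perms = [rng.sample(range(nu), nu) for _ in range(kappa)]
        if all(
            any(frozenset(p[x] for x in s) in family_b for p in perms)
            for s in subsets
        ):
            return perms
```

With 6 elements, subsets of size 3 (so a = 20) and a family B of 5 subsets, this asks for 12 permutations; a few random attempts typically suffice, in line with the union bound being strictly below 1.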
The following lemma is a direct consequence of Lemma 17.
Lemma 18 (Corollary of Lemma 17). Let X, Y denote two sets of size := |X| = |Y|. For
[0, 1], let A

denote the set of all subsets of X with elements, A

:=
_
X

_
. Let B be a set
of subsets of Y of size at least |B| b such that each subset has elements. There exists a set
of
=

b
_
e

ln
_
e

_
bijections f
1
, . . . , f

such that for any A A() there exists a bijection f


i
with

aA
{f
i
( a)} B.
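Proof. We select an arbitrary bijection $f : X \to Y$. $f$ maps each $A \in \mathcal{A}_\gamma$ to a set $A' \subseteq Y$. Since
$f$ is a bijection, we have that
$$|\{f(A) : A \in \mathcal{A}_\gamma\}| = |\mathcal{A}_\gamma| =: a.$$
By Lemma 17, there exists a set of permutations $\{\pi_1, \pi_2, \ldots, \pi_\varphi\}$ of $Y$ such that for each $A'$ (as
above) there exists a $\pi_i$ that maps $A'$ to an element of $\mathcal{B}$. The compositions $f_i := \pi_i \circ f$ are the
desired bijections.
We have (see footnote 7) that
$$a = \binom{\nu}{\gamma\nu} < \left(\frac{e}{\gamma}\right)^{\gamma\nu}.$$
The statement of the lemma is immediate.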
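7 We use the following well-known inequality for binomial coefficients:
$$\binom{n}{k} = \frac{n!}{k!(n-k)!} = \frac{n (n-1) \cdots (n-k+1)}{k!} \le \frac{n^k}{k!} \le \left(\frac{en}{k}\right)^k,$$
where the last step is due to the Taylor series of $e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}$, which yields
$\forall x \ge 0, k : e^x \ge \frac{x^k}{k!}$, in particular $e^k \ge \frac{k^k}{k!}$ and thus $k! \ge \left(\frac{k}{e}\right)^k$.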
4.3 Reduction from Lopsided Set Disjointness
In this section, we show a lower bound on the space complexity of any approximate distance oracle
on any base-graph $G$, based on essentially two parameters of $G$: the girth of $G$, and the path-count
of $G$, which is the cardinality of the set $\mathcal{P}_{\ell,\kappa}(G)$.
We prove that a graph $G$ with a large path-count and large girth is a hard base-graph for
approximate distance oracles.
Lemma 19. Let $G = (V, E)$ be a graph, such that an $(\alpha, 0)$-approximate distance oracle exists
for $G$ and all its subgraphs, using query time $t$ and space $S$. Let $C$ denote the constant from the
LSD communication complexity lower bound in Lemma 10. Let $\ell, \kappa$ be two positive integers, such
that $\ell < \frac{g(G)}{\alpha+1}$ and $|E| \ge \kappa\ell \, (2tw/\ell)^{1/C}$. Then,
$$S \ge \frac{\kappa}{e} \left( \frac{|\mathcal{P}_{\ell,\kappa}(G)|^{1/(\kappa\ell)}}{e \, (|E|/(\kappa\ell))^{1-C}} \right)^{\ell/t} \left( \frac{1}{e|E|} \right)^{1/(\kappa t)}.$$
The proof proceeds as follows. We start by giving some intuition in Section 4.3.1. We prove
that if there exists a good data structure then there is a good (short) protocol for LSD (Sec-
tion 4.3.2). Then, we use the LSD lower bound to show that there cannot be a good data structure
(Section 4.3.3).
4.3.1 Intuition
Was wirklich zählt, ist Intuition.
(The only really valuable thing is intuition.)
Albert Einstein (1879–1955)
We give a rough sketch of the proof of the main theorem, highlighting some technical details.
Statements are not necessarily formally correct in this section (for details, see proof of Lemma 20).
The proof is a reduction from the communication problem LSD, in which Alice is given a set
$S_{Alice} \subseteq [N \cdot B]$ of cardinality $N$, Bob is given a set $S_{Bob} \subseteq [N \cdot B]$, and they must decide whether
$S_{Alice} \cap S_{Bob} = \emptyset$. For an illustration of the reduction, see Figure 4.4. There are communication
lower bounds for LSD (see Section 4.2.1). We prove that a distance oracle with good parameters
implies a good protocol for LSD, and derive our lower bound from the contrapositive of this claim.
A standard way to perform such reductions is to translate one query to the data structure into
$t$ rounds of a communication protocol. In total, Alice sends Bob $t \lg S$ bits and Bob replies
with $tw$ bits. However, lower bounds in communication complexity are usually asymptotic. The
standard reduction puts the constant multiplicative factor in the exponent of $S$; therefore, using
the standard reduction, we can only prove lower bounds on the space complexity of the form
$S \ge x^{\Omega(1)}$, where $x$ is some expression that depends on the problem parameters. Since any
distance oracle must use space at least $n$ and since there is a trivial distance oracle that requires space
$\Theta(n^2)$ (a complete distance table), this reduction cannot provide a meaningful lower bound.
Pătraşcu and Thorup [PT06, Pat08a] found a way to prove lower bounds on the space complexity
$S$ of data structures that hold up to a polylogarithmic multiplicative factor. Alice holds $\kappa$
independent queries to the same database, where $\kappa$ is large ($\kappa = O(n / \mathrm{polylog}(n))$). The $\kappa$ queries
performed in parallel are transformed into a communication protocol. We use the same approach.
[Figure 4.4 (protocol diagram): Alice is given $S_{Alice}$ and Bob is given $S_{Bob}$; the graph $G = (V, E)$ and the
set of bijections $F = \{f_1, f_2, \ldots\}$ are shared. Alice chooses $f_i \in F$ such that $f_i(S_{Alice}) \in \mathcal{P}_{\ell,\kappa}(G)$,
which yields $\kappa$ paths of length $\ell$, and encodes $i$ ($\lg |F|$ bits). Bob computes an oracle of size $S$ for
$(V, E \setminus f_i(S_{Bob}))$. For $t$ rounds, Alice sends $\lg \binom{S}{\kappa}$ bits and Bob replies with $\kappa w$ bits. If
$\tilde{d}(u_j, v_j) \le \alpha\ell$ for all $j \in [\kappa]$, then $S_{Alice} \cap S_{Bob} = \emptyset$; otherwise $S_{Alice} \cap S_{Bob} \ne \emptyset$.]
Figure 4.4: Illustration of a reduction from a distance oracle to a communication protocol for
LOPSIDEDSETDISJOINTNESS. The protocol decides whether $S_{Alice} \cap S_{Bob} = \emptyset$.
$G$ and $F$ are known to both Alice and Bob. Details are in the proof of Lemma 20.
Our first objective is to prove that a good distance oracle implies a good protocol for LSD. We
identify the universe of the LSD problem with the edge set of a graph. Let the universe size of the
LSD problem be $N \cdot B = |E|$. For the moment, $f$ denotes an arbitrary bijection $f : E \to [NB]$
between the edge set $E$ and the elements of the universe $[NB]$.
In the reduction, Alice plays the role of the querier, and Bob plays the role of the data structure.
Bob transforms his set $S_{Bob}$ into a subgraph of $G$, $G' = (V, E') = (V, E \setminus f(S_{Bob}))$. In other
words, Bob constructs a subgraph $G'$ of $G$ where the missing edges are the ones that correspond
to his input set $S_{Bob}$. Bob then builds a distance oracle for the graph $G'$. Alice constructs a set of
queries based on her input $S_{Alice}$, in a way that we shall specify next.
Let the girth $g = g(G)$ of the graph $G = (V, E)$ be large. Now consider one query to
the data structure, which asks for $\tilde{d}_{G'}(u, v)$ where $u$ and $v$ are close to each other in $G$, that is,
$d_G(u, v) = \ell < \frac{g}{\alpha+1}$. Let $p_{u,v}$ be the path of length $\ell$ between $u$ and $v$ in $G$. We ask for the
distance in $G'$. A query to the distance oracle returns the answer $\tilde{d}_{G'}(u, v)$, which is an $(\alpha, 0)$-
approximation for $d_{G'}(u, v)$. We know that the girth of $G'$ is large, $g(G') \ge g$, and that $u$ and $v$
are close in $G$ (at distance $\ell < \frac{g}{\alpha+1}$). The result of the distance query is at most $\alpha\ell$ if and only if
all edges on the path $p_{u,v}$ are in $E'$. Otherwise, the result of the distance query is a number strictly
larger than $\alpha\ell$.
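The threshold test in this paragraph can be illustrated on a toy graph. The following is a minimal sketch under stated assumptions, not the construction used in the thesis: the base graph is a single cycle (so its girth equals its number of vertices), the oracle is simulated by breadth-first search, and the worst admissible answer of a multiplicative approximation (alpha times the true distance) is used to show that the threshold still separates the two cases. All function names are hypothetical.

```python
from collections import deque


def bfs_distance(adj, source, target):
    """Exact hop distance between source and target, or infinity if disconnected."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return float("inf")


def cycle_without(n, removed):
    """Adjacency sets of the n-cycle with the edges in `removed` deleted.

    Edges are written as (v, (v + 1) % n).
    """
    adj = {v: set() for v in range(n)}
    for v in range(n):
        edge = (v, (v + 1) % n)
        if edge not in removed:
            adj[edge[0]].add(edge[1])
            adj[edge[1]].add(edge[0])
    return adj


def path_looks_intact(adj, u, v, ell, alpha, overestimate=True):
    """Threshold test: is the (simulated) oracle answer at most alpha * ell?

    With overestimate=True the simulated oracle returns alpha * d, the largest
    value a multiplicative alpha-approximation may report.
    """
    d = bfs_distance(adj, u, v)
    answer = alpha * d if overestimate else d
    return answer <= alpha * ell


# Toy parameters: girth 20, stretch 3, path length 4, so (alpha + 1) * ell = 16 < 20.
full = cycle_without(20, set())
damaged = cycle_without(20, {(1, 2)})
print(path_looks_intact(full, 0, 4, ell=4, alpha=3))
print(path_looks_intact(damaged, 0, 4, ell=4, alpha=3, overestimate=False))
```

With all edges present, even the worst admissible answer stays at the threshold; with one path edge removed, even the exact distance (the length of the detour around the cycle) exceeds it.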
Thus, using one query to the distance oracle, we can distinguish the case that all edges of $p_{u,v}$
are in $E'$ from the case where at least one edge is missing. Now, if we perform $\kappa$ queries of this
form, we can check $\kappa$ paths. Therefore, using $\kappa$ queries we can check whether $\kappa\ell$ edges are all
in the graph, or whether at least one of these edges is missing. By setting $NB = |E|$ and
$N = \kappa\ell$, this is an instance of the LSD problem. By doing a standard transformation from a data
structure to a communication protocol, we connect the parameters of the data structure to those of
a protocol for LSD: Alice sends roughly $\kappa t \lg(S/\kappa)$ bits to Bob, and Bob sends roughly $\kappa t w$ bits
to Alice.
However, the instance described above is a very specific instance of the LSD problem, where
Alice's set is restricted to a set that corresponds to $\kappa$ paths in $G$. For a general instance of LSD,
Alice's input may not necessarily map to a collection of $\kappa$ vertex-disjoint paths of length $\ell$ each.
There is a technique of Pătraşcu [Pat08a] to perform a reduction from LSD even when only a
non-negligible fraction of Alice's inputs map to a set of vertex-disjoint paths: in a preliminary round of
communication, Alice and Bob choose the bijection $f$ from some not-too-large set of bijections
(see Lemma 18). In order to obtain such a set of bijections, we prove that there is a large set of
sets of vertex-disjoint paths in $G$ (we refer to this as the path-count, as in Definition 34). If the
path-count is sufficiently large, then we obtain a strong lower bound.
For details, we refer to the proof of Lemma 20 in the next section.
4.3.2 Reduction from a data structure to a communication protocol
We prove that the distance oracle data structure can be transformed into a protocol for LSD. For
an illustration, see Figure 4.4.
Lemma 20. Let $G = (V, E)$ be a graph, such that an $(\alpha, 0)$-approximate distance oracle exists
for $G$ and all its subgraphs, using query time $t$ and space $S$ in the cell-probe model with word size
$w$. Let $\ell, \kappa$ be two positive integers, such that $\ell < \frac{g(G)}{\alpha+1}$. Then, there exists a protocol for LSD with
parameters $N = \kappa\ell$ and $B = |E|/N$, where Alice sends
$$\kappa t \lg(eS/\kappa) + N \lg(eB) + \lg(eBN) - \lg |\mathcal{P}_{\ell,\kappa}(G)| \text{ bits,}$$
and Bob sends $\kappa t w$ bits.
Proof. We begin by defining a bijection between the universe $[NB]$ and $E$. For now, any bijection
will do; additional restrictions will be imposed later. Denote the bijection by $f : [NB] \to E$.
Alice and Bob both know $G$ and $f$.
In the LSD problem, Alice receives a set $S_{Alice} \subseteq [NB]$ of cardinality $|S_{Alice}| = N$, and Bob
receives a set $S_{Bob} \subseteq [NB]$. We now derive a protocol for LSD based on the existence of the data
structure. Bob uses his set $S_{Bob}$ to construct a set of edges, $E' = E \setminus f(S_{Bob})$. That is, an edge
$e \in E$ is in $E'$ if and only if its corresponding element is not in Bob's set $S_{Bob}$. Bob preprocesses
the graph $G' = (V, E')$ creating the distance oracle data structure, and from now on, Bob plays
the role of the data structure. Since $G'$ is a subgraph of $G$, by the condition of Lemma 19, this data
structure requires space $S$, query time $t$, and it is an $(\alpha, 0)$-approximate distance oracle for $G'$.
Alice constructs the set $P = f(S_{Alice})$. For now, assume that $P \in \mathcal{P}_{\ell,\kappa}(G)$. We call this
assumption the perfect bijection scenario; we shall remove this assumption later. Under this
assumption, $P$ can be written as the union of $\kappa$ vertex-disjoint paths, each of length $\ell$. Let
$(u_1, v_1), \ldots, (u_\kappa, v_\kappa)$ be the endpoints of these paths.
For every pair $(u_i, v_i)$, we know that $d_G(u_i, v_i) = \ell$ (it cannot be smaller since the graph has
large girth and thus there is no cycle of length $< 2\ell$ in $G$). Denote the path of length $\ell$ between $u_i$
and $v_i$ by $p_i$. We see that, on one hand, an $(\alpha, 0)$-approximate distance query on the pair $(u_i, v_i)$
returns $\tilde{d}(u_i, v_i) \le \alpha\ell$ if all of the edges of the path $p_i$ are in $E'$. On the other hand, the result
of the distance query is $\tilde{d}(u_i, v_i) \ge g(G) - \ell > \alpha\ell$ if at least one of the edges of $p_i$ is not in $E'$, since
there are no cycles shorter than $g(G)$, and since an approximate distance oracle never returns an
underestimate of the distance, but always an overestimate or the correct value. After querying all
distances $(u_1, v_1), \ldots, (u_\kappa, v_\kappa)$, if all of the queries return distances at most $\tilde{d}(u_i, v_i) \le \alpha\ell$, then
Alice and Bob conclude that $S_{Alice} \cap S_{Bob} = \emptyset$, otherwise they conclude that $S_{Alice} \cap S_{Bob} \ne \emptyset$.
Alice and Bob, in order to compute the answer to LSD, simulate $\kappa$ queries to the data structure
by communication [MNSW98, Pat08a].
Bob computes the data structure itself, based on $E'$. Alice computes the set $P$ and the pairs
$(u_1, v_1), \ldots, (u_\kappa, v_\kappa)$. Alice then considers which cells should be probed in the first round of each
of the $\kappa$ queries, and sends the set of probed cells to Bob. This set can be communicated using
$\lg \binom{S}{\kappa}$ bits (it is crucial not to send the queries one by one, which would require $\kappa \lg S$ bits [PT06,
Pat08a]). Bob replies with the contents of these cells, using $\kappa w$ bits. Next, Alice sends the set of
cells to be probed in the second round, using another $\lg \binom{S}{\kappa}$ bits, and Bob replies, using another
$\kappa w$ bits. This procedure is repeated for $t$ rounds in total. Overall, Alice sends
$$t \lg \binom{S}{\kappa} \le t \lg \left(\frac{eS}{\kappa}\right)^{\kappa} = \kappa t \lg \left(\frac{eS}{\kappa}\right) \text{ bits,}$$
and Bob sends $\kappa t w$ bits.
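The saving from sending the probed cells as one set rather than one address at a time can be made concrete with a small calculation. The following is a sketch only; the function name and the example values for the space and the number of parallel queries are assumptions chosen for illustration, and cell addresses are assumed distinct.

```python
import math


def alice_bits_per_round(space, queries):
    """Bits Alice needs in one round to name `queries` cells out of `space` cells.

    Returns a triple:
      batched    : lg C(space, queries), naming the set of cells as a whole
      upper      : queries * lg(e * space / queries), the bound used in the text
      one_by_one : queries * lg(space), naming each cell separately
    """
    batched = math.log2(math.comb(space, queries))
    upper = queries * math.log2(math.e * space / queries)
    one_by_one = queries * math.log2(space)
    return batched, upper, one_by_one


# Hypothetical example: one million cells, one thousand parallel queries.
batched, upper, one_by_one = alice_bits_per_round(10**6, 1000)
print(round(batched), round(upper), round(one_by_one))
```

The gap between the last two values is what moves the dependence on the space from lg S per query to lg(S divided by the number of queries) per query, which is the reason the batched encoding matters for the lower bound.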
We eliminate the perfect bijection assumption by including an additional round of communication
at the beginning of the protocol. In this round, Alice chooses a particular bijection $f$ to reach
the perfect bijection scenario. Instead of having only one bijection $f : [NB] \to E$, Alice and Bob
share knowledge of $\varphi$ bijections $f_1, f_2, \ldots, f_\varphi$, all between $[NB]$ and $E$. This set of bijections $F$
must have the property that for any set $S_{Alice} \subseteq [NB]$ of cardinality $N$, there exists an $i$ such that
choosing $f = f_i$ reaches the perfect bijection scenario, that is, $\exists i \in [\varphi] : f_i(S_{Alice}) \in \mathcal{P}_{\ell,\kappa}(G)$. If
there is such a set of bijections, then Alice and Bob can reach the perfect bijection scenario by
having Alice send $\lg \varphi$ bits (the index of the bijection they use) and then continue as before. By
Lemma 18, there is such a set of size
$$\varphi = N \ln(eB) \frac{(eB)^N}{|\mathcal{P}_{\ell,\kappa}(G)|}.$$
Therefore, there is an LSD protocol where Alice first sends
$$\lg N + \lg \ln(eB) + N \lg(eB) - \lg |\mathcal{P}_{\ell,\kappa}(G)| \text{ bits}$$
$$\le \lg(eBN) + N \lg(eB) - \lg |\mathcal{P}_{\ell,\kappa}(G)| \text{ bits}$$
in order to reach the perfect bijection scenario, and then she sends another
$$\kappa t \lg \left(\frac{eS}{\kappa}\right) \text{ bits}$$
in the perfect bijection scenario. This yields a total of at most
$$\lg(eBN) + N \lg(eB) - \lg |\mathcal{P}_{\ell,\kappa}(G)| + \kappa t \lg \left(\frac{eS}{\kappa}\right) \text{ bits.}$$
Bob sends $\kappa t w$ bits. We ignore the final 1 bit that Alice sends to inform Bob of the result. For an
illustration, see Figure 4.4.
4.3.3 Communication complexity implies space complexity
Conditioned on the existence of a space-efficient distance oracle, there is a communication
protocol for LSD with low communication complexity (Lemma 20). In the following, we prove that
the lower bound on the communication problem LSD (Lemmas 9 and 10) yields a lower bound on
the space complexity of distance oracles.
Recall the statement of Lemma 19: Let $G = (V, E)$ be a graph, such that an $(\alpha, 0)$-
approximate distance oracle exists for $G$ and all its subgraphs, using query time $t$ and space
$S$. Let $C$ denote the constant from the LSD communication complexity lower bound in Lemma 10.
Let $\ell, \kappa$ be two positive integers, such that $\ell < \frac{g(G)}{\alpha+1}$ and $|E| \ge \kappa\ell \, (2tw/\ell)^{1/C}$. Then,
$$S \ge \frac{\kappa}{e} \left( \frac{|\mathcal{P}_{\ell,\kappa}(G)|^{1/(\kappa\ell)}}{e \, (|E|/(\kappa\ell))^{1-C}} \right)^{\ell/t} \left( \frac{1}{e|E|} \right)^{1/(\kappa t)}.$$
Proof of Lemma 19. If a protocol computes LSD with parameters $N$ and $B$, then either Alice
sends at least $CN \lg B$ bits or Bob must send at least $NB^C$ bits (Lemma 10).
Bob communicates $\kappa t w$ bits. By the condition of Lemma 19, $B \ge (2tw/\ell)^{1/C}$. This implies
$$NB^C \ge \kappa\ell \cdot 2tw/\ell = 2\kappa t w.$$
Bob, by sending $\kappa t w$ bits, uses strictly less than $NB^C$ bits. The lower bound on the communication
complexity of LSD implies that Alice must communicate at least $CN \lg B$ bits. Using the
protocol of Lemma 20, we have that
$$\lg(eBN) + N \lg(eB) - \lg |\mathcal{P}_{\ell,\kappa}(G)| + \kappa t \lg \left(\frac{eS}{\kappa}\right) \ge CN \lg B.$$
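Starting from this inequality, we derive a bound on $S$. Recall that $N = \kappa\ell$ and recall that the
edge set is identified with the universe of LSD ($|E| = BN$). We get
$$\kappa t \lg \left(\frac{eS}{\kappa}\right) \ge CN \lg B - \lg(eBN) - N \lg(eB) + \lg |\mathcal{P}_{\ell,\kappa}(G)|$$
$$\kappa t \lg \left(\frac{eS}{\kappa}\right) \ge \lg |\mathcal{P}_{\ell,\kappa}(G)| + C\kappa\ell \lg B - \kappa\ell \lg(eB) - \lg(eBN)$$
$$\lg \left(\frac{eS}{\kappa}\right) \ge \frac{1}{\kappa t} \lg |\mathcal{P}_{\ell,\kappa}(G)| + \frac{C\ell}{t} \lg B - \frac{\ell}{t} \lg(eB) - \frac{1}{\kappa t} \lg(eBN)$$
$$\frac{eS}{\kappa} \ge |\mathcal{P}_{\ell,\kappa}(G)|^{1/(\kappa t)} \, B^{C\ell/t} \, (eB)^{-\ell/t} \, (eBN)^{-1/(\kappa t)}$$
$$S \ge \frac{\kappa}{e} \cdot \frac{|\mathcal{P}_{\ell,\kappa}(G)|^{1/(\kappa t)}}{(eB^{1-C})^{\ell/t} \, (eBN)^{1/(\kappa t)}}$$
$$S \ge \kappa \left( \frac{|\mathcal{P}_{\ell,\kappa}(G)|^{1/(\kappa\ell)}}{eB^{1-C}} \right)^{\ell/t} \frac{1}{e \, (e|E|)^{1/(\kappa t)}},$$
which yields the statement of the lemma.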
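As a sanity check on the algebra, one can plug the closed-form bound back into the communication inequality and confirm that it is met with equality. The following is a minimal numerical sketch; the function names and the toy parameter values are assumptions for illustration and do not correspond to any graph in the thesis.

```python
import math


def space_bound(path_count, num_edges, ell, kappa, t, c):
    """Closed-form lower bound on the space S from the derivation above."""
    n_lsd = kappa * ell            # N, the size of Alice's set
    b_lsd = num_edges / n_lsd      # B, so that |E| = B * N
    inner = path_count ** (1.0 / (kappa * ell)) / (math.e * b_lsd ** (1.0 - c))
    tail = math.e * (math.e * num_edges) ** (1.0 / (kappa * t))
    return kappa * inner ** (ell / t) / tail


def alice_bits(space, path_count, num_edges, ell, kappa, t):
    """Number of bits Alice sends in the protocol of Lemma 20."""
    n_lsd = kappa * ell
    b_lsd = num_edges / n_lsd
    return (math.log2(math.e * b_lsd * n_lsd)
            + n_lsd * math.log2(math.e * b_lsd)
            - math.log2(path_count)
            + kappa * t * math.log2(math.e * space / kappa))


# Hypothetical toy parameters.
params = dict(path_count=1e12, num_edges=600, ell=3, kappa=5, t=2)
c = 0.5
s_min = space_bound(c=c, **params)
required = c * (params["kappa"] * params["ell"]) * math.log2(600 / 15)
print(s_min, alice_bits(s_min, **params), required)
```

At the bound, Alice's communication equals C N lg B; any smaller space would let her send fewer bits than the LSD lower bound allows.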
4.3.4 Counting Paths
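We prove a lower bound on the size of the set of sets of disjoint paths $\mathcal{P}_{\ell,\kappa}(G)$.
Lemma 21. Let $G = (V, E)$ be an $r$-regular graph and $\ell, \kappa$ be two positive integers, such that
the following three conditions hold:
1. $\lambda(G) \le 0.1$,
2. $|V| \ge 20\kappa\ell$, and
3. $\ell < g(G)$.
Then
$$|\mathcal{P}_{\ell,\kappa}(G)| \ge \left(\frac{|V|}{\kappa}\right)^{\kappa} \left(\frac{r}{8}\right)^{\kappa\ell}.$$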
Proof. Let $N = \kappa\ell$.
Let us first choose one path. There are $|V|$ vertices to start a path. Since the graph is $r$-regular,
and since $\ell < g(G)$, we have $r$ choices for the first step and $r - 1$ choices for each subsequent
step. This yields
$$|V| \cdot r \cdot (r - 1)^{\ell - 1} \qquad (4.1)$$
possibilities to choose one path. Since the graph is undirected and since we want to reduce LSD
to distance queries (as opposed to actual path queries), an s-t path is equal to a t-s path. We
divide (4.1) by 2 to account for this.
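The count in (4.1), halved for undirected paths, can be compared against brute force on a small regular graph. The sketch below uses the Petersen graph (3-regular, girth 5) purely as a convenient test case; it does not satisfy the other conditions of the lemma, and the function names are assumptions for illustration.

```python
def petersen_graph():
    """Adjacency sets of the Petersen graph on vertices 0..9."""
    adj = {v: set() for v in range(10)}

    def add_edge(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for i in range(5):
        add_edge(i, (i + 1) % 5)          # outer 5-cycle
        add_edge(5 + i, 5 + (i + 2) % 5)  # inner pentagram
        add_edge(i, 5 + i)                # spokes
    return adj


def count_simple_paths(adj, length):
    """Number of undirected simple paths with exactly `length` edges (brute force)."""
    def extend(path, remaining):
        if remaining == 0:
            return 1
        return sum(extend(path + [w], remaining - 1)
                   for w in adj[path[-1]] if w not in path)

    ordered = sum(extend([v], length) for v in adj)
    return ordered // 2  # each path was counted once from each endpoint


def path_count_formula(num_vertices, degree, length):
    """|V| * r * (r - 1)^(length - 1) / 2, i.e. (4.1) divided by 2."""
    return num_vertices * degree * (degree - 1) ** (length - 1) // 2


adj = petersen_graph()
for length in range(1, 6):
    print(length, count_simple_paths(adj, length), path_count_formula(10, 3, length))
```

For lengths below the girth the two counts agree; at length 5 (equal to the girth) some non-backtracking walks close a cycle, so the brute-force count falls below the formula.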
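Let us now choose $\kappa$ vertex-disjoint paths of length $\ell$ one by one. Recall that, due to
Corollary 14 (implied by Theorem 13 of Alon et al. [AFWZ95]), the probability that a random
walk of $\ell$ steps from a uniformly random starting vertex stays inside $U \subseteq V$, where $|U| \ge \frac{9}{10}|V|$,
is at least $\frac{1}{2^{\ell}}$. Let $A$ denote the set of vertices on the paths chosen so far, and let $U := V \setminus A$.
Since $|V| \ge 20\kappa\ell$ and $|A| \le \kappa(\ell + 1) \le 2\kappa\ell$, we have that $|U| \ge \frac{9}{10}|V|$.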
According to (4.1) (divided by 2) there are at least
|V |
2
r (r 1)
1

|V |
2

_
r
4
_

(simple) paths of length . According to the corollary, each path has probability at least
1
2

to avoid
A. Therefore, the number of different paths of length in G that do not use any vertices of A is at
least
|V |
_
r
8
_

.
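We apply this argument $\kappa$ times to generate all sets with $\kappa$ paths of length $\ell$. After each
application, we add the vertices of the path to $A$. We divide by $\kappa!$ to account for the fact that the
order in which the paths were chosen does not matter. We obtain
$$|\mathcal{P}_{\ell,\kappa}(G)| \ge \frac{\left(|V| \left(\frac{r}{8}\right)^{\ell}\right)^{\kappa}}{\kappa!} = \frac{|V|^{\kappa}}{\kappa!} \left(\frac{r}{8}\right)^{\kappa\ell} \ge \left(\frac{|V|}{\kappa}\right)^{\kappa} \left(\frac{r}{8}\right)^{\kappa\ell}.$$
This concludes the proof of Lemma 21.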
4.3.5 Assembly
We now combine Lemma 21 with Lemma 19 to prove a lower bound for any expander graph based
on its expansion, degree, and girth. After this, we use the Ramanujan graph from Lemma 16 to
derive the main result of this chapter. Both proofs consist of just a sequence of calculations to glue
together all the conditions.
Lemma 22. Let $C$ denote the constant from the LSD communication complexity lower bound in
Lemma 10. Let $G = (V, E)$ be an $r$-regular expander graph on $|V| = n$ vertices, $n$ sufficiently
large, with expansion $\lambda(G) \le 0.1$ and girth $g = g(G)$. In the cell-probe model with word-length
at most $w = n^{o(1)}$, for an integer $\alpha > 1$, any $(\alpha, 0)$-approximate distance oracle with query time
$t$ that works for $G$ and its subgraphs, requires space at least
$$S \ge \frac{n}{\lg n} \, r^{\Omega\left(\frac{g}{\alpha t}\right)}$$
given that
$$g = g(G) \ge 2\alpha \quad \text{and} \quad r \ge \left(\frac{4\alpha t w}{g}\right)^{1/C}.$$
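Proof of Lemma 22. Let $\ell = \frac{g}{2\alpha} \le \lg n$. Let $\kappa = \frac{|V|}{20\ell}$. This yields $\kappa \ge \frac{n}{20 \lg n}$. Recall that
$N = \kappa\ell$.
The conditions of Lemma 21 are satisfied:
$\lambda(G) \le 0.1$ (by the condition of Lemma 22)
$|V| \ge 20\kappa\ell$ (by the definition of $\kappa$ and $\ell$)
$\ell < g(G) = g$ since $g \ge 2\alpha$ and $\ell = \frac{g}{2\alpha}$
From Lemma 21, we know that (under the above conditions)
$$|\mathcal{P}_{\ell,\kappa}(G)| \ge \left(\frac{|V|}{\kappa}\right)^{\kappa} \left(\frac{r}{8}\right)^{\kappa\ell}.$$
The conditions of Lemma 19 are satisfied.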
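$\ell < \frac{g(G)}{\alpha+1}$
$|E| = |V| \frac{r}{2} = 10\kappa\ell r \ge 10\kappa\ell \left(\frac{4\alpha t w}{g}\right)^{1/C} \ge \kappa\ell \, (2tw/\ell)^{1/C}$
From Lemma 19, we know that (under the above conditions)
$$S \ge \kappa \left( \frac{|\mathcal{P}_{\ell,\kappa}(G)|^{1/(\kappa\ell)}}{e \, (|E|/(\kappa\ell))^{1-C}} \right)^{\ell/t} \Big/ \left((e|E|)^{1/(\kappa t)} e\right).$$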
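Since we are interested in the asymptotic behavior of $S$, we may ignore constant factors.
We claim that $(e|E|)^{1/(\kappa t)} e = \Theta(1)$. Since $|E| \ge 1$, we have that $(e|E|)^{1/(\kappa t)} e = \Omega(1)$. Since
$|E| \le |V|^2 = 400\kappa^2\ell^2 \le 400\kappa^4$,
$$(e|E|)^{1/(\kappa t)} e \le O(1) \ \Longleftarrow\ (e|E|)^{1/(\kappa t)} \le O(1) \ \Longleftarrow\ \frac{\lg(e \cdot 400\kappa^4)}{\kappa t} \le O(1) \ \Longleftarrow\ \frac{\lg \kappa}{\kappa t} \le O(1).$$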
We derive (using |V | = 20 and |E| = |V |
r
2
)
S
_
|P
,
(G)|
1/
e(|E|/)
1C
_
/t

n
20 lg n

_
_
_
_
_
|V |

_
r
8
_

_
1/
e(|E|/)
1C
_
_
_
/t

n
20 lg n

_
_
_
|V |

_
1/
r
8e(|E|/)
1C
_
_
/t

n
20 lg n

_
_
_
20

_
1/
r
8e(10r)
1C
_
_
/t

n
20 lg n

_
(20)
1/
r
8e(10r)
1C
_
/t

n
20 lg n

_
r
C
8e10
1C
_
/t
(20)
1/t
.
By =
g
2
, we get the statement in the theorem for sufciently large n.
Based on Lemma 22, we now use the Ramanujan graph from Lemma 16 to derive the lower
bound stated in the main theorem of this chapter (Theorem 8).
Proof of Theorem 8. Let C denote the constant from the LSD communication complexity lower
bound in Lemma 10.
Let $n_0 = \Theta(n)$. Let $r_0 := \left(\frac{8\alpha t w}{\lg n_0}\right)^{2/C}$. We assume that $n_0$ is sufficiently large such that
$r_0 \ge \max\left\{400, \left(\frac{4}{C}\right)^{4/C}\right\}$.
To use Lemma 16, we need that $n_0 > 8 r_0^3$. Since $t, \alpha \in O(\mathrm{polylog}(n))$ and $w = n^{o(1)}$, we
have that
$$8 \left(\frac{8\alpha t w}{\lg n_0}\right)^{6/C} < 8 \, (8\alpha t w)^{6/C} < n_0,$$
thus there exists a graph $G = (V, E)$ with the following properties:
1. $|V| = n'$ with $\frac{n_0}{2} \le n' \le 9 n_0$
2. $G$ is $r$-regular, where $r_0 \le r \le 2 r_0$
3. The girth of $G$ is at least $g(G) \ge \frac{1}{2} \lg_r n'$
4. $\lambda(G) \le \frac{2\sqrt{r-1}}{r}$
Since $n_0 = \Theta(n)$, we have that $n' = \Theta(n)$. To simplify notation, we use $n$ instead of $n'$ for the
remainder of the proof.
To apply Lemma 22, three conditions must be verified.
(G)
2

r1
r
0.1 holds for r 400.
g = g(G) 2
We have that r
0
=
_
8tw
lg n
0
_
2/C
and r [r
0
, 2r
0
]. Also, both t lg n and lg n.
g(G)
1
2
lg
r
n
=
1
2
lg n
lg r

1
2
lg n
lg (2r
0
)
=
1
2
lg n
lg
_
2
_
8tw
lg n
0
_
2/C
_
=
C
4
lg n
lg
_
2
C/2
8tw
lg n
0
_

C
4
lg n
lg
_
2
C/2
8wlg
2
n
lg n
0
_
=
_
lg n
lg(wlg n)
_
Since = o
_
lg n
lg(wn)
_
, the condition holds. Note that the necessary condition on is
C
4
lg n
lg
_
2
C/2
8wlg
2
n
lg n
0
_ 2
C
8
lg n
lg(2
C/2
8) + lg w + 2 lg lg n lg lg
n
2

C
8
lg n
C
2
+ lg 8 + lg w + lg lg
n
2
1.
The lower bound thus extends to constant w and, in particular, to the bit-probe model [MP69].

_
4tw
g
_
1/C
r
We need two inequalities on r. Since r r
0

_
4
C
_
4/C
(for sufficiently large n
0
), we have
that (lg r)
1/C

r. Since r r
0
=
_
8tw
lg n
0
_
2/C

_
8tw
lg n
_
2/C
,
_
4tw
g
_
1/C

_
8twlg r
lg n
_
1/C

_
8tw
lg n
_
1/C
(lg r)
1/C

_
8tw
lg n
_
1/C

r r.
We now apply Lemma 22. Since $g = g(G) \ge \frac{1}{2} \lg_r n$, we have that
$$r^{g} \ge r^{\frac{1}{2} \lg_r n} = n^{1/2},$$
therefore,
$$S \ge \frac{n}{\lg n} \, n^{\Omega\left(\frac{1}{\alpha t}\right)}.$$
This concludes the proof.
4.4 Conclusion and Open Problems
Theorem 8 implies that a distance oracle with query time $t$ and stretch $(\alpha, 0)$ requires space
$n^{1+\Omega(1/(\alpha t))}$. Since our proof holds even for sparse graphs with $m = \tilde{O}(n)$ edges, the space
requirement is strictly larger than the original graph size. For sparse graphs, our space lower bound
is an improvement over the lower bound by Thorup and Zwick [TZ05], which states that at least
space $\Omega(m)$ is required. We prove that $O(m)$ is not enough.
Our lower bound also indicates that the tradeoff of the distance oracle of Thorup and Zwick
can potentially be improved for sparse graphs. Their tradeoff is that for multiplicative stretch
$(\alpha, 0)$ and query time $O(\alpha)$, the space is roughly $n^{1+O(1/\alpha)}$. Our lower bound only proves space
requirement $n^{1+\Omega(1/\alpha^2)}$. There is a gap between the upper and the lower bound. Mendel and
Naor [MN06] improve the query time to $O(1)$ while maintaining the same amount of space
asymptotically. Their tradeoff is tight up to constant factors in the exponent with respect to our lower
bound.
Two technical questions remain open.
Linear Number of Edges. The worst-case graphs in our proof have degree $\tilde{O}(1)$. It would be
interesting to generalize the proof to constant-degree graphs.
Graphs without Large Girth. In our proof, we require that a graph has large girth. It may be
possible to remove this requirement. In the reduction, we perform distance queries. The corre-
sponding paths must be vertex-disjoint and non-bypassable, meaning that any alternative path is
long. We ensure that paths do not have a short alternative by using graphs with large girth. It may
however be feasible to use graphs without large girth to prove a lower bound.
Our lower bound applies to general sparse graphs. It however does not apply to specific graph
classes such as those with many short cycles; efficient distance oracles may still be possible for
specific graph classes.
Unversehens hängt alles
ineinander [...]
(everything is connected)
Max Frisch [Fri64, p. 116]
5
Distance Oracles for Power-law Graphs
5.1 Introduction
Although complex networks are very common in practice (Section 1.1.2), there is no distance
oracle with provable guarantees better than those of the general distance oracle of Thorup and
Zwick [TZ05]. For stretch parameter k = 2, the distance oracle of Thorup and Zwick has the
following worst-case performance: the size is $O(n^{3/2})$ and the stretch is (3, 0). Fortunately, the
theoretical worst-case stretch bounds of Thorup and Zwick's distance oracle [TZ05] (and, also, of
their routing scheme [TZ01]) are not observed in practice (see footnote 1) [KFY04], even though they are tight.
In this chapter, we make an attempt to bridge the gap between theory and practice. We provide
the first theoretical analysis that directly links the power-law exponent of a random power-law
graph to the bound on distance oracle sizes.
We adapt the distance oracle of Thorup and Zwick [TZ05] to optimize it for unweighted,
undirected power-law graphs. The scheme by Thorup and Zwick is based on a set of landmarks
selected uniformly at random. Instead of sampling landmarks at random, we select the nodes with
highest degrees as landmarks.
The use of nodes with high degrees is a heuristic that has been proposed by many researchers.
The high-degree heuristic is also very common in practice. For power-law graphs it particularly
makes sense to leverage the power of high-degree nodes. These nodes are also called hubs. These
hubs appear in most large complex networks [Bar03, p. 63].
Connectors are an extremely important component of our social network. They create
trends and fashions, make important deals, spread fads, or help launch a restaurant.
They are the thread of society, smoothly bringing together different races, levels of
education, and pedigrees. [...] Connectors – nodes with an anomalously large number
of links – are present in very diverse complex systems, ranging from the economy to
the cell. They are a fundamental property of most networks [...]. [Bar03, p. 56]
Indeed, with links to an unusually large number of nodes, hubs create short paths
between any two nodes in the system [Bar03, p. 64]
Intuitively, using these hubs to approximate distances in power-law graphs is a good heuristic.
The main result of this chapter is a theoretical proof that may explain why this heuristic performs
well in practice.
1 Krioukov et al. [KFY04, Section IV.B] report routing tables with 52 entries for random power-law
graphs [ACL00] with 10,000 nodes. The bound by Thorup and Zwick [TZ01] is $O(\sqrt{n \lg n})$ entries. For $n = 10000$,
$\sqrt{n \lg n}$ is 365.
5.1.1 Overview of the Result
We give an informal statement of our result. The precise statement is deferred to Theorem 30.
For the nodes of a power-law graph, the probability that a node has degree $x$ is proportional
to $x^{-\tau}$ for some $\tau$, which is called the power-law exponent. For most practical scenarios, the
power-law exponent lies in the interval $2 < \tau < 3$. These inequalities are assumed to hold in the
following.
The complexity analysis of our distance oracle is based on the random power-law graph model
with expected degree sequence proposed by Aiello, Chung and Lu [ACL00, CL02, Lu02, CL06]
with some minor simplifications.
Let $\gamma = \frac{\tau - 2}{2\tau - 3} + \varepsilon$ and $\varepsilon > 0$. For sufficiently large $n$, we prove that for a random power-
law graph (sampled from the modified Chung-Lu model, see Definition 37) with $n$ nodes, with
probability at least $1 - 1/n$, our distance oracle of size $O(n^{1+\gamma} \lg n)$ can be constructed in time
$O(n^{1+\gamma} \lg n)$. With probability 1, the distance oracle has stretch (3, 0). The space requirement of
$O(n^{1+\gamma} \lg n)$ (for a plot of $\gamma$ with respect to $\tau$, see Figure 5.1) improves upon the general distance
oracle of size $O(n^{3/2})$ by Thorup and Zwick [TZ05].
Our bounds on the space complexity of the distance oracle of Thorup and Zwick [TZ05] extend
to the labeled compact routing scheme (see footnote 2) by Thorup and Zwick [TZ01].
Figure 5.1: The figure shows a plot of $f(\tau) = \frac{\tau - 2}{2\tau - 3}$ for $\tau \in (2, 3)$. For values of $\tau$ close to 2, for
example for $\tau = 2.1$, which is the exponent that fits the power-law distribution well to the degree
distribution of the actual Internet inter-domain graph [FFF99, KFY04], our bound is $O(n^{13/12+\varepsilon})$,
which indicates that the adapted distance oracle (and the adapted routing scheme) could be very
effective on Internet-like graphs.
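The exponent quoted in the caption can be reproduced with exact rational arithmetic. This is a small illustrative sketch; the function name `oracle_exponent` is an assumption, and the epsilon term of the bound is left out.

```python
from fractions import Fraction


def oracle_exponent(tau):
    """Return f(tau) = (tau - 2) / (2 * tau - 3) as an exact fraction.

    `tau` should be given as a string or integer ratio (for example "2.1")
    so that the conversion to a fraction is exact.
    """
    tau = Fraction(tau)
    if not 2 < tau < 3:
        raise ValueError("the power-law exponent is assumed to lie in (2, 3)")
    return (tau - 2) / (2 * tau - 3)


print(1 + oracle_exponent("2.1"))   # exponent of n in the space bound, without epsilon
print(1 + oracle_exponent("2.5"))
```

Since f is increasing on (2, 3) and tends to 1/3 at the right end of the interval, the exponent 1 + f stays below the 3/2 of the general oracle throughout the range.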
2 In this thesis, we focus on distance oracles; we do not define compact routing schemes. Some notes on the routing
scheme, without detailed explanation and proof: the routing scheme is a fixed-port scheme, meaning that it works
for any permutation of port number assignments on any node. The routing scheme requires a stretch-5 handshaking
(see [TZ01, Section 4]), and uses addresses and message headers of size $O(\lg n \lg\lg n)$, with probability at least
$1 - o(1)$. Addresses and headers are based on an efficient path encoding scheme using $O(\lg n \lg\lg n)$ bits per node.
The encoding scheme relies on specific distance properties of power-law graphs. For details, see [CSTW09b].
5.1.2 Related Work
Due to their large occurrence in practice, various aspects of power-law graphs and complex net-
works have been studied.
There is some evidence that, despite their unique features, power-law graphs are actually not
easy instances for algorithms. Although power-law graphs are sparse, optimization problems
remain hard: problems such as COLORING or CLIQUE are NP-hard for power-law graphs as
well [FPP08].
Power-law graphs have a dense core that consists of nodes with high degrees. Core prop-
erties have been investigated for several power-law graph models. Having a small core whose
removal substantially changes connectivity, would allow for a scheme that constructs shortest
paths through this core based on a separator theorem, as for planar graphs [Tho04a] and for
minor-free graphs [AG06]. However, the proportion of nodes that have to be removed to sub-
stantially change the connectivity of a power-law graph is linear with respect to the size of the
graph [CNSW00, NSW01, NWS02, BR03, FFV05, NR08]. Therefore, a separator-based strategy
is not suitable for power-law graphs; different techniques are necessary.
Also, most powerful techniques that work well for graphs with bounded doubling dimension
cannot be used. Although sometimes claimed, the Internet does not appear to have bounded ball
growth or bounded doubling dimension; both measures can be large [FLV08].
Practical routing schemes (and distance oracles) for power-law graphs have been proposed
[BC06, RMJ07, PBCG09, CY09, GSVGM98, XWP+09] (Section 3.2.4). However, there are no
theoretical results on the space requirements of routing schemes for power-law graphs.
An approach related to routing is due to Kleinberg [Kle00]. He formally proves that, in the
re-wired lattice model [BMST97, NW99, Kle00] (Section 2.1.3), greedy routing is a good routing
scheme. Greedy routing intuitively means that edges on the path to the target are chosen one after
another such that the estimated distance to the target is minimized. Kleinberg proves that paths
have length $O(\lg^2 n)$. For this greedy approach to work, it is however crucial that nodes know their
coordinates within the lattice (see footnote 3). Unfortunately, we cannot transform the greedy routing scheme into
a distance oracle, since there is no bound on the stretch for greedy routing; even for two nodes
connected by a short path of constant length, the greedy route may have length $O(\lg^2 n)$. The
stretch is thus $(O(\lg^2 n), 0)$.
Complex networks usually have diameter $O(\lg n)$. For a graph with diameter $\Delta$, a $(\Delta, 0)$-
approximate distance oracle with constant space is trivial (by storing the diameter). If the oracle is
required to output actual paths, linear space suffices (the preprocessing algorithm creates a shortest
path tree rooted at an arbitrary vertex; the query algorithm outputs paths on this tree).
The objective is to devise a distance oracle (or routing scheme) with constant stretch and good
theoretical bounds on the space requirements.
5.2 Preliminaries
5.2.1 Distance Oracle of Thorup and Zwick
The construction algorithm of the distance oracle of Thorup and Zwick [TZ05] is based on the
following ideas. Thorup and Zwick use random sampling to select a subset $S \subseteq V$ containing
$O(\sqrt{n})$ vertices. From each vertex of the set $S$, the algorithm computes and stores the distance
to all the vertices in the graph. From all other vertices $v \in V \setminus S$, the algorithm computes the
3 Without coordinates, paths may have length $O(n^c)$ [DEH07a, DEH07b].
open ball around $v$ until it touches the nearest vertex from $S$. For each $v$, the ball has expected
size $O(\sqrt{n})$. The balance between the sample size $|S|$ and the expected ball size is optimal,
which yields the desired space complexity of $O(n\sqrt{n})$. For details, see Algorithms 1 and 2 in
Section 3.1.2.
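The idea summarized above can be sketched for unweighted, undirected, connected graphs as follows. This is an illustrative simplification and not Algorithms 1 and 2 of Section 3.1.2: the sample size, the tie-breaking, and all function names are assumptions, and each ball is extracted from a full breadth-first search for brevity (a real implementation would stop the search once the nearest landmark is reached).

```python
import random
from collections import deque


def bfs(adj, source):
    """Hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def build_oracle(adj, seed=0):
    """Sample about sqrt(n) landmarks; store their distances and each vertex's open ball."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    landmarks = rng.sample(nodes, max(1, round(len(nodes) ** 0.5)))
    from_landmark = {s: bfs(adj, s) for s in landmarks}
    nearest, ball = {}, {}
    for v in nodes:
        dist_v = bfs(adj, v)  # full search for brevity only
        nearest[v] = min(landmarks, key=lambda s: dist_v[s])
        radius = dist_v[nearest[v]]
        # open ball: vertices strictly closer to v than its nearest landmark
        ball[v] = {u: d for u, d in dist_v.items() if d < radius}
    return from_landmark, nearest, ball


def query(oracle, u, v):
    """Distance estimate with multiplicative stretch at most 3."""
    from_landmark, nearest, ball = oracle
    if v in ball[u]:
        return ball[u][v]          # exact distance inside the ball
    p = nearest[u]
    return from_landmark[p][u] + from_landmark[p][v]
```

If v lies outside the ball of u, the nearest landmark p of u is at most as far from u as v is, so the detour through p has length at most 3 d(u, v) by the triangle inequality.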
5.2.2 Properties of Random Power-law Graphs
Essentially, all models are wrong, but some are useful.
George P. E. Box [BD86, p. 424]
We adapt the random graph model for fixed expected degree sequence as defined by Aiello,
Chung, and Lu [ACL00, CL02, Lu02, CL06] using the definition from [CL02, Section 2]. We
refer to the original random graph distribution using the expression Fixed Degree Random Graph
(FDRG).
Definition 36 (Fixed Degree Random Graph [CL02, Section 2]). In a random graph with a given
expected degree sequence $w = \{w_1, w_2, \ldots, w_n\}$ such that $\forall i : w_i^2 < \sum_j w_j$, the edge between
$v_i$ and $v_{i'}$ is present in the random graph with probability
$$\Pr[\{v_i, v_{i'}\} \in E] = w_i w_{i'} \rho, \quad \text{where } \rho = \frac{1}{\sum_j w_j}.$$
In the original FDRG model it is assumed that $\forall i, i' : w_i w_{i'} < \sum_j w_j$. We adapt the original
model by deterministically inserting edges if $w_i w_{i'} > \sum_j w_j$. Without modification, the original
assumption would rule out the values for $\tau$ considered in this thesis.
Definition 37. For a constant $\tau \in (2, 3)$, the random power-law graph distribution RPLG$(n, \tau)$
is defined as follows. Let the sequence of generating parameters $\vec{w} = \{w_1, w_2, \ldots, w_n\}$ obey a
power law:
$$w_j = \left(\frac{n}{j}\right)^{1/(\tau-1)} \quad \text{for } j \in \{1, 2, \ldots, n\}.$$
The edge between $v_i$ and $v_{i'}$ is present in the random graph with probability
$$\Pr[\{v_i, v_{i'}\} \in E] = \min\{w_i w_{i'} \rho, 1\}, \quad \text{where } \rho = \frac{1}{\sum_j w_j}.$$
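Definition 37 can be turned into a small sampler. The following is a minimal sketch, not the implementation used in the thesis: the function names `rplg_weights` and `sample_rplg`, the 0-based vertex numbering, and the adjacency-set representation are assumptions made for illustration, and the quadratic loop over all vertex pairs is only practical for small $n$.

```python
import random


def rplg_weights(n, tau):
    """Generating parameters w_j = (n / j)^(1 / (tau - 1)) for j = 1..n."""
    return [(n / j) ** (1.0 / (tau - 1.0)) for j in range(1, n + 1)]


def sample_rplg(n, tau, seed=None):
    """Sample an undirected graph from RPLG(n, tau).

    Returns (w, adj), where vertex j - 1 (0-based) carries parameter w_j
    and adj is a list of neighbor sets. Naive O(n^2) sampler: one
    independent coin flip per vertex pair.
    """
    rng = random.Random(seed)
    w = rplg_weights(n, tau)
    rho = 1.0 / sum(w)
    adj = [set() for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            # Edge probability is capped at 1; capped pairs yield
            # the deterministic edges of the adapted model.
            p = min(w[a] * w[b] * rho, 1.0)
            if p >= 1.0 or rng.random() < p:
                adj[a].add(b)
                adj[b].add(a)
    return w, adj
```

For example, `sample_rplg(200, 2.5, seed=1)` returns the 200 generating parameters together with one sampled graph; the two vertices with the largest parameters are always adjacent in that setting because their capped edge probability equals 1.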
Note that, in both models, there is a one-to-one correspondence between a node $v_j$ and its
generating parameter $w_j$. In the FDRG model, the value $w_j$ corresponds to the expected degree
of vertex $v_j$, and Chung and Lu refer to $\vec{w}$ as the expected degree sequence. In the RPLG$(n, \tau)$
adaptation, the graph is sampled according to the generating parameter values $w_j$. Let $D_j$ be the
random variable denoting the degree of node $v_j$. In the RPLG$(n, \tau)$ model, the expected degree
$E[D_j]$ of node $v_j$ is less than or equal to the generating parameter $w_j$. We refer to the edges
between two nodes $v_i, v_{i'}$ with $w_i w_{i'} \ge \sum_j w_j$ as deterministic edges; we refer to the remaining
edges as random edges.
An important reason to work with this model is that the edges are independent. This
independence makes several graph properties easier to analyze. We also (implicitly) rely on a
property called assortativity. Assortativity is the tendency of nodes with high degree to attach
to other highly connected nodes. This tendency is especially high in social networks. The
opposite tendency, termed disassortativity, is more common in technological and biological
networks, where highly connected nodes tend to be connected with low-degree nodes. Li et al. [LADW05,
Definition 4.1] formalize assortativity as follows. They define the $s(G)$ value of a graph as
$s(G) := \sum_{\{v_i, v_{i'}\} \in E} \deg(v_i) \deg(v_{i'})$. Graphs sampled from the FDRG model tend to have a
high $s(G)$ value, since high-degree nodes are attached to other highly connected nodes. Li et
al. state that $s(G)$ measures to what extent a graph has a hub-like core.
The core of a graph consists of nodes having large degrees. Let $\gamma = \frac{\tau-2}{2\tau-3} + \varepsilon$ for some $\varepsilon > 0$
and $\gamma' = \frac{1-\gamma}{\tau-1}$.
Definition 38. For a power-law degree sequence $\vec{w}$ and a graph $G$ with $n$ nodes, the core with
degree threshold $n^{\gamma'}$, $\gamma' \in (0, 1)$, is defined as follows.
$$\mathrm{core}_{\gamma'}(\vec{w}) := \{v_j : w_j > n^{\gamma'}\},$$
$$\mathrm{core}_{\gamma'}(G) := \{v_j : \deg_G(v_j) > n^{\gamma'}/4\},$$
where $\deg_G(v_j)$ is the degree of $v_j$ in $G$ (the subscript $G$ is omitted when the graph is clear from
the context).
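The $\mathrm{core}_{\gamma'}(\vec{w})$ as defined here is the $n^{\gamma'}$-Core in [Lu02, Chapter 4, Definition 2]. Note that
$\mathrm{core}_{\gamma'}(\vec{w})$ and $\mathrm{core}_{\gamma'}(G)$ are not necessarily equivalent. Even if the degree bound in $\mathrm{core}_{\gamma'}(G)$
was set to $n^{\gamma'}$ instead of $n^{\gamma'}/4$, the two cores would not be equal. In Section 5.4.1, we prove that
$\mathrm{core}_{\gamma'}(\vec{w}) \subseteq \mathrm{core}_{\gamma'}(G)$ with high probability.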
For each vertex $u$ of a graph $G$, define its ball relative to the core (which is the open metric
ball as in Definition 13) as
$$B_{\mathrm{core}}(u) := \left\{ v \in V(G) : d(u, v) < \min_{v' \in \mathrm{core}_{\gamma'}(G)} d(u, v') \right\}.$$
Note that it is important to use the open ball and not the closed ball.
The volume of a set of nodes is an integral notion in the proof. Let $G$ be a random graph
sampled from RPLG$(n, \tau)$. For a set of nodes $S$, define its volume $\mathrm{Vol}(S)$ as the sum of the
generating parameters $w_j$ of its nodes, that is, $\mathrm{Vol}(S) := \sum_{v_j \in S} w_j$. We simplify notation by $\mathrm{Vol}(G) := \mathrm{Vol}(V)$. Note that
$\mathrm{Vol}(G) = 1/\rho$ (Definition 37). Let $\mathrm{vol}_G(S)$ denote the sum of the nodes' degrees in the actual
graph $G$, $\mathrm{vol}_G(S) := \sum_{v_j \in S} \deg_G(v_j)$.
For our proof, the most important property of the FDRG model is captured in the following
lemma, which is applied for the core and individual balls. There is an edge between two nodes
$v_i, v_{i'}$ with probability proportional to $w_i w_{i'}$. The statement is extended to sets of nodes $S, T \subseteq V(G)$
in the following. The lemma holds for both FDRG$(\vec{w})$ and RPLG$(n, \tau)$.
Lemma 23 ([Lu02, Lemma 3.3, proof in Lemma 9]). For any two disjoint subsets $S$ and $T$ with
$\mathrm{Vol}(S)\,\mathrm{Vol}(T) > c\,\mathrm{Vol}(G)$, we have
$$\Pr[d(S, T) > 1] = \prod_{v_i \in S,\, v_{i'} \in T} \max\{0, (1 - w_i w_{i'}/\mathrm{Vol}(G))\} \le e^{-c}.$$
The following lemma proves that $\mathrm{Vol}(G)$ is linear in $n$.
Lemma 24. Let $G$ be a random graph sampled from RPLG$(n, \tau)$. The volume $\mathrm{Vol}(G)$ satisfies
$$n < \mathrm{Vol}(G) \le \frac{\tau-1}{\tau-2}\, n.$$
Proof. Lower bound: it holds that $\sum_j w_j > n$, since $w_j > 1$ for all $j < n$ and $w_n = 1$.
Upper bound: it holds that
$$\mathrm{Vol}(G) = \sum_{j=1}^{n} w_j < w_1 + \int_1^n \left(\frac{n}{x}\right)^{1/(\tau-1)} dx \le \frac{\tau-1}{\tau-2}\, n.$$
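The two bounds of Lemma 24 are easy to check numerically. The sketch below is illustrative only; the helper name `volume_bounds` is hypothetical, and the check covers just a handful of parameter choices rather than replacing the proof.

```python
def volume_bounds(n, tau):
    """Return (lower, volume, upper) for the RPLG(n, tau) parameters.

    lower  = n                              (strict lower bound)
    volume = sum_j (n / j)^(1 / (tau - 1))  (Vol(G))
    upper  = (tau - 1) / (tau - 2) * n      (upper bound)
    """
    volume = sum((n / j) ** (1.0 / (tau - 1.0)) for j in range(1, n + 1))
    upper = (tau - 1.0) / (tau - 2.0) * n
    return n, volume, upper


if __name__ == "__main__":
    for tau in (2.1, 2.5, 2.9):
        for n in (10, 1000, 20000):
            lower, volume, upper = volume_bounds(n, tau)
            print(tau, n, lower < volume <= upper)
```

The upper bound becomes loose as $\tau$ approaches 2, which is where the factor $\frac{\tau-1}{\tau-2}$ blows up.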
In the remainder of the preliminaries section, we prove certain concentration properties of the
adapted random power-law graph model. Since for the RPLG$(n, \tau)$ model the edge probability
is capped, several properties of graphs sampled according to the FDRG model do not hold for
graphs sampled according to the RPLG$(n, \tau)$ model.
In the following, we show concentration results for the actual degree of a vertex and for the
volume of a set of vertices under the adapted RPLG$(n, \tau)$ model. We also restate the corresponding
results in the original FDRG model.
Lemma 25 ([CL06, Lemma 5.6], generalized from [McD98, Theorem 2.7]). For a random graph
sampled from FDRG$(\vec{w})$, the random variable $D_j$ measuring the degree of vertex $v_j$ is
concentrated around its expectation $w_j$ as follows:
$$\Pr[D_j > w_j - c\sqrt{w_j}] \ge 1 - e^{-c^2/2} \qquad (5.1)$$
$$\Pr[D_j < w_j + c\sqrt{w_j}] \ge 1 - e^{-\frac{c^2}{2(1 + c/(3\sqrt{w_j}))}} \qquad (5.2)$$
Lemma 26 ([CL06, Lemma 5.9]). For a random graph sampled from FDRG$(\vec{w})$, for a subset of
vertices $S$ and for all $0 < c \le \sqrt{\mathrm{Vol}(S)}$,
$$\Pr\left[\,|\mathrm{vol}(S) - \mathrm{Vol}(S)| < c\sqrt{\mathrm{Vol}(S)}\,\right] \ge 1 - 2e^{-c^2/6}.$$
Weaker concentration bounds hold for graphs sampled from RPLG$(n, \tau)$.
Lemma 27. Let $n \ge 4^{\frac{\tau-1}{(\tau-2)^2}}$. For a random graph sampled from RPLG$(n, \tau)$, if $w_j \ge 32 \ln n$
for vertex $v_j$, the degree $D_j$ satisfies the following:
$$\Pr[w_j/4 \le D_j \le 3w_j] > 1 - 2/n^4.$$
Proof. Recall that $\rho = 1/\mathrm{Vol}(G) < 1/n$ (by Lemma 24).
For $1 \le j \le n$, let $h(j) \in \{1, 2, \ldots, n\}$ denote the smallest integer such that $w_{h(j)} w_j \rho \le 1$.
Consider $h(1)$. Since $w_1 \left(\frac{n}{n^{3-\tau}}\right)^{1/(\tau-1)} \rho \le 1$, we have that
$$h(1) \le n^{3-\tau}.$$
Therefore, for all $1 \le j \le n$,
$$h(j) \le h(1) \le n^{3-\tau}.$$
We split the degree $D_j$ into two parts: the contribution by edges to nodes $v_{j'}$ with $j' < h(j)$
and the contribution stemming from edges to nodes $v_{j'}$ with $j' \ge h(j)$. When $h(j) \ge 1$, there
are at least $h(j) - 1$ edges to nodes $v_{j'}$ with $j' < h(j)$. Now consider the edges between $v_j$ and
$v_{j'}$ for $j' \ge h(j)$. Since the sequence $\vec{w}$ is monotonically decreasing,
$$\sum_{i=h(j)}^{n} w_i \ge \int_{n^{3-\tau}+1}^{n} (n/x)^{1/(\tau-1)}\, dx$$
$$\ge \frac{\tau-1}{\tau-2} \left( n - n^{1/(\tau-1)}\, 2^{\frac{\tau-2}{\tau-1}}\, n^{\frac{\tau-2}{\tau-1}(3-\tau)} \right) \quad (\text{since } n^{3-\tau} \ge 1)$$
$$\ge \frac{\tau-1}{2(\tau-2)}\, n \quad (\text{since } n \ge 4^{\frac{\tau-1}{(\tau-2)^2}}).$$
Recall that $\rho = 1/\sum_{j=1}^{n} w_j \ge \frac{\tau-2}{n(\tau-1)}$ by Lemma 24.
Let $D'_j$ denote the random variable counting the number of edges from $v_j$ to $v_{j'}$ with $j' \ge h(j)$
in a random graph. Thus,
$$E[D'_j] = \mu = w_j \rho \sum_{i=h(j)}^{n} w_i \ge w_j/2 \ge 16 \ln n.$$
Also, $\mu \le w_j$. Since there are no deterministic edges in this case, the random variable $D'_j$ can be
bounded using Lemma 25:
$$\Pr[D'_j > \mu/2] \ge 1 - e^{-\mu/4} \ge 1 - 1/n^4,$$
$$\Pr[D'_j < 2\mu] \ge 1 - e^{-3\mu/8} \ge 1 - 1/n^4.$$
For $h(j) = 1$, the statement of the lemma follows directly.
If $h(j) > 1$, we have $D_j \le D'_j + h(j) - 1$. Notice that $w_j (n/w_j)^{1/(\tau-1)} \rho \le 1$, which implies
that $h(j) \le \lceil w_j \rceil \le w_j + 1$. Therefore,
$$\Pr[w_j/4 \le \mu/2 \le D_j \le 3w_j] \ge 1 - 2/n^4.$$
Lemma 28. Let $G$ be a random graph sampled from RPLG$(n, \tau)$. For a subset of vertices $S$
satisfying $\mathrm{Vol}(S) \ge 192 \ln n$, it holds with probability at least $1 - 2/n^3$ that
$\mathrm{Vol}(S)/8 \le \mathrm{vol}(S) \le 4\,\mathrm{Vol}(S)$.
Proof. We split $S$ into two parts $S_1 := \{v_j \in S : w_j < 32 \ln n\}$ and $S_2 := S \setminus S_1$.
By Lemma 27,
$$\Pr[\mathrm{Vol}(S_2)/4 \le \mathrm{vol}(S_2) \le 3\,\mathrm{Vol}(S_2)] \ge 1 - 2|S_2|/n^4.$$
For each vertex $v_j \in S_1$, $w_j < 32 \ln n$. Since no deterministic edges are attached to $S_1$, we
can apply Lemma 26 to $S_1$.
Therefore, if $\mathrm{Vol}(S_1) \ge 96 \ln n$, by Lemma 26 (with $c = \sqrt{\mathrm{Vol}(S_1)}/2$),
$$\Pr[\mathrm{Vol}(S_1)/2 \le \mathrm{vol}(S_1) \le 3\,\mathrm{Vol}(S_1)/2] \ge 1 - 2/n^4.$$
Therefore, the statement holds with probability at least $1 - 2(|S_2| + 1)/n^4 \ge 1 - 2/n^3$.
If $\mathrm{Vol}(S_1) < 96 \ln n$, we have $\mathrm{Vol}(S_2) \ge \mathrm{Vol}(S)/2 \ge 96 \ln n$.
In this case, we can still apply Lemma 26 to bound $\mathrm{vol}(S_1)$ from above:
$$\Pr\left[\mathrm{vol}(S_1) < \tfrac{3}{2} \cdot 96 \ln n \le \tfrac{3}{4}\,\mathrm{Vol}(S)\right] \ge 1 - 2/n^4.$$
Since
$$\Pr[\mathrm{Vol}(S)/8 \le \mathrm{Vol}(S_2)/4 \le \mathrm{vol}(S_2) \le 3\,\mathrm{Vol}(S_2)] \ge 1 - 2|S_2|/n^4,$$
the statement also holds with probability at least $1 - 2/n^3$.
In Lemma 24, we prove that $\mathrm{Vol}(G) = \Theta(n)$. Under the adapted RPLG$(n, \tau)$ model,
compared to the FDRG model, the edge probability may only decrease. We immediately have that
$\mathrm{vol}(G) = O(n)$ with high probability. With the concentration results of this section, we obtain
the following corollary.
Corollary 29. The number of edges of a random graph sampled from RPLG$(n, \tau)$ is at most
$\frac{\mathrm{vol}(G)}{2} \le \frac{4(\tau-1)}{\tau-2}\, n$ with probability at least $1 - n^{-2}$.
5.3 The Adapted Distance Oracle
We propose a modification of the distance oracle of Thorup and Zwick [TZ05, Fig. 5] for stretch
parameter $k = 2$, which guarantees stretch (3, 0). The main idea of the scheme by Thorup and
Zwick for $k = 2$ is the following (see Figure 5.2). In the preprocessing step, given a graph
$G = (V, E)$:
1. Each node $v \in V$ is chosen as a landmark independently at random with probability $n^{-1/2}$.
The expected number of landmarks is $\sqrt{n}$.
2. For each node $u \in V$, find its nearest landmark $L(u)$ and compute the distances from $u$ to
all landmarks.
3. To guarantee optimal stretch for short distance queries, for every node $u \in V$ a local ball
$B_G(u) = \{u' \in V(G) : d(u, u') < d(u, L(u))\}$ is computed, including all nodes with
distance strictly less than the distance to the nearest landmark.
The result of the distance query $d(s, t)$ is exact if $s \in B(t)$ or $t \in B(s)$ and otherwise
stretch (3, 0) is guaranteed [Cow01]. Since the set of landmarks consists of a random sample,
the expected ball size is $O(\sqrt{n})$, which is equal to the number of landmarks. This is the optimal
balance for general graphs.
For power-law graphs a better balance is possible. Using high-degree nodes as landmarks is a
natural heuristic. We can select fewer landmarks and obtain smaller sized balls than [TZ05, Fig. 5]
at the same time.
It is required that $n = |V(G)|$ is sufficiently large, specifically, that
$$n^{\frac{(2\tau-3)\varepsilon}{\tau-1}} \ge \frac{2(\tau-1)}{\tau-2} \ln n. \qquad (5.3)$$
The complexity results of this chapter do not have any other implicit dependencies on .
The following is the precise version of the main theorem of this chapter.
88
5.3. THE ADAPTED DISTANCE ORACLE
D
A
E
B
C
D
A
E
B
C
D
A
E
B
C
D
A
E
B
C
Figure 5.2: The distance oracle of Thorup and Zwick [TZ05, Fig. 5] for stretch parameter k =
2, which guarantees stretch (3, 0). An illustration of the preprocessing algorithm, from left to
right: (1) random sampling of landmarks, (2) SSSP computation for one landmark (node B), (3)
knowledge of one node after all SSSP computations (from all landmarks), and (4) ball for the
rightmost node in the bottom line
Theorem 30. Let $\gamma = \frac{\tau-2}{2\tau-3} + \varepsilon$ be a constant. Assume Equation (5.3) is satisfied. For
random power-law graphs from RPLG$(n, \tau)$ (Definition 37), there exists a (3, 0)-approximate
distance oracle with the following properties. The preprocessing algorithm runs in expected time
$O(n^{1+\gamma} \lg n)$ and creates a distance oracle of expected size $O(n^{1+\gamma})$. These bounds also hold
with probability at least $1 - 1/n$. After preprocessing, approximate distance queries can be
answered in $O(1)$ time with stretch at most (3, 0).
Since power-law graphs do not have large girth, the lower bound of Chapter 4 does not apply to
power-law graphs. However, scale-free networks and expander graphs (which are the worst-case
instances in Chapter 4) also share certain important properties [MPS06, Theorem 1, Corollary 4,
and p. 247]. It is thus not clear whether space $O(n^{1+\gamma})$ is reasonably good or whether space
$\tilde{O}(n)$ may be sufficient.
Details for the preprocessing step are listed in Algorithm 3. Analogous to the oracle of Thorup
and Zwick [TZ05], for efficient query times, preprocessed information is stored in a hash
table [FKS84] for each node.
The query algorithm is the same as in [TZ05] for $k = 2$, see Algorithm 4.
Lemma 31. Algorithm 4 runs in time $O(1)$ and achieves stretch (3, 0).
The following proof applies the same stretch and time bounds as [TZ05].
Proof. Time: At each node, all the information is stored in a hash table [FKS84] with constant
access time. The number of hash table reads necessary is constant.
Algorithm 3 Preprocess($G = (V, E)$, $\gamma'$)
  compute core $\leftarrow \{v \in V : \deg(v) > n^{\gamma'}/4\}$
  for each $v \in$ core do
    run breadth-first search from $v$ in $G$
    for each node $u \ne v$, store $d(u, v)$ and let FirstNode$_u(v)$ be the penultimate node on the
    shortest path; update $L(u)$ if $v$ is nearest landmark
  end for
  for each $u \in V$ do
    compute and store $B_{\mathrm{core}}(u)$ (including distances)
    for each $v \in B_{\mathrm{core}}(u)$ let FirstNode$_u(v)$ be the first node on the shortest path to $v$
  end for
Algorithm 4 Distance($s$, $t$)
  if $s \in B_{\mathrm{core}}(t)$ or $t \in B_{\mathrm{core}}(s)$ then
    return local distance $d(s, t)$ from the information at $s$ or $t$
  else
    return $d(s, L(t)) + d(L(t), t)$
  end if
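Algorithms 3 and 4 can be sketched in a few lines for unweighted graphs. The code below is a simplified illustration under stated assumptions, not the thesis implementation: the names `bfs`, `preprocess`, and `distance` are hypothetical, the degree threshold is passed in directly instead of being derived from $n^{\gamma'}/4$, the graph is assumed to be connected with a non-empty core, plain dictionaries stand in for the hash tables of [FKS84], and the FirstNode pointers needed for path reporting are omitted.

```python
from collections import deque


def bfs(adj, source):
    """Hop distances from source in an unweighted graph (node -> distance)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def preprocess(adj, threshold):
    """Landmarks are all nodes of degree > threshold (the core).

    Returns (to_core, nearest, balls):
      to_core[u][v] = d(u, v) for every core node v,
      nearest[u]    = L(u), a nearest core node of u,
      balls[u][x]   = d(u, x) for every x strictly closer to u than L(u).
    """
    n = len(adj)
    core = [v for v in range(n) if len(adj[v]) > threshold]
    if not core:
        raise ValueError("empty core: lower the degree threshold")
    to_core = [dict() for _ in range(n)]
    for v in core:
        for u, d in bfs(adj, v).items():
            to_core[u][v] = d
    nearest, balls = [], []
    for u in range(n):
        landmark = min(to_core[u], key=to_core[u].get)
        radius = to_core[u][landmark]
        # Open ball: truncated BFS that keeps only distances < radius.
        ball = {u: 0} if radius > 0 else {}
        queue = deque(ball)
        while queue:
            x = queue.popleft()
            if ball[x] + 1 >= radius:
                continue
            for y in adj[x]:
                if y not in ball:
                    ball[y] = ball[x] + 1
                    queue.append(y)
        nearest.append(landmark)
        balls.append(ball)
    return to_core, nearest, balls


def distance(oracle, s, t):
    """Exact inside a ball, otherwise the detour via L(t)."""
    to_core, nearest, balls = oracle
    if t in balls[s]:
        return balls[s][t]
    if s in balls[t]:
        return balls[t][s]
    landmark = nearest[t]
    return to_core[s][landmark] + to_core[t][landmark]
```

Storing distances to all core nodes per vertex mirrors the $n \cdot |\mathrm{core}|$ term of the space bound, and the per-vertex balls mirror the second term.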
Figure 5.3: Illustration of the proof of worst-case stretch (3, 0) using the triangle inequality.
Stretch (3, 0) is guaranteed by the following observation [Cow01] (for an illustration, see
Figure 5.3). For a node $u \in V$, its ball is defined as follows.
$$B_{\mathrm{core}}(u) := \left\{ v \in V(G) : d(u, v) < \min_{v' \in \mathrm{core}_{\gamma'}(G)} d(u, v') \right\}$$
If neither $s \in B_{\mathrm{core}}(t)$ nor $t \in B_{\mathrm{core}}(s)$, then both
$$d(s, t) \ge d(s, \mathrm{core}) = d(s, L(s)) \quad \text{and}$$
$$d(s, t) \ge d(t, \mathrm{core}) = d(t, L(t)).$$
Using the second inequality, the triangle inequality
$$d(s, L(t)) \le d(s, t) + d(t, L(t)),$$
and $d(t, L(t)) = d(L(t), t)$ (since $G$ is undirected), we have
$$d(s, L(t)) + d(L(t), t) \le d(s, t) + d(t, L(t)) + d(L(t), t) \le 3\,d(s, t).$$
In practice, the return value
$$\min\{d(s, L(t)) + d(L(t), t),\; d(s, L(s)) + d(L(s), t)\}$$
(or even $\min_{L \in \mathrm{core}} \{d(s, L) + d(L, t)\}$) may yield better approximations (this triangulation is related
to A* [GH05] and beacon-based embeddings [KSW09], see Section 3.2.2).
5.4 Time and Space Complexities
The objective of this section is to prove the following lemma.
Lemma 32. Let $\gamma = \frac{\tau-2}{2\tau-3} + \varepsilon$ be a constant. Assume Equation (5.3) is satisfied. For random
power-law graphs RPLG$(n, \tau)$, Algorithm 3 runs in expected time $O(n^{1+\gamma} \lg n)$ and creates a distance
oracle of expected size $O(n^{1+\gamma})$. These bounds also hold with probability at least $1 - 1/n$.
The main result of this chapter, Theorem 30, is immediate from Lemmas 31 and 32.
5.4.1 Core Size
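We prove that the size of the core is $\Theta(n^{\gamma})$ in expectation and with high probability. We also prove
that it contains the nodes with high degree with high probability.
The core is defined by
$$\mathrm{core}_{\gamma'}(\vec{w}) := \{v_i : w_i > n^{\gamma'}\},$$
$$\mathrm{core}_{\gamma'}(G) := \{v_i : \deg_G(v_i) > n^{\gamma'}/4\}.$$
To compute the size of $\mathrm{core}_{\gamma'}(\vec{w})$, we solve the inequality $w_k > n^{\gamma'}$ and obtain $k$.
$$w_k = \left(\frac{n}{k}\right)^{\frac{1}{\tau-1}} > n^{\gamma'} \iff k^{-\frac{1}{\tau-1}} > n^{\gamma' - \frac{1}{\tau-1}} \iff k < n^{-(\tau-1)\left(\gamma' - \frac{1}{\tau-1}\right)} = n^{-\gamma'(\tau-1)+1}$$
As $\gamma' = \frac{1-\gamma}{\tau-1}$, we have
$$|\mathrm{core}_{\gamma'}(\vec{w})| = n^{-\gamma'(\tau-1)+1} - 1 = n^{\gamma} - 1.$$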
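The closed form for the core size can be compared against a direct count. The following sketch is illustrative; the function name `core_size` is hypothetical, and the count is taken over the generating parameters only (it says nothing about the degrees in a sampled graph).

```python
def core_size(n, tau, eps):
    """Count the vertices with w_j > n^gamma' and return (count, n^gamma).

    gamma  = (tau - 2) / (2 * tau - 3) + eps
    gamma' = (1 - gamma) / (tau - 1)
    """
    gamma = (tau - 2.0) / (2.0 * tau - 3.0) + eps
    gamma_prime = (1.0 - gamma) / (tau - 1.0)
    threshold = n ** gamma_prime
    count = sum(
        1
        for j in range(1, n + 1)
        if (n / j) ** (1.0 / (tau - 1.0)) > threshold
    )
    return count, n ** gamma
```

Since $k$ ranges over integers, the count equals the number of positive integers strictly below $n^{\gamma}$, so it lies between $n^{\gamma} - 1$ and $n^{\gamma}$.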
Even if the same degree threshold $n^{\gamma'}$ is used for $\mathrm{core}_{\gamma'}(\vec{w})$ and $\mathrm{core}_{\gamma'}(G)$, the two sets of
nodes may differ. For a slightly smaller degree threshold $n^{\gamma'}/4$ (as in Definition 38), the core of
the actual graph contains $\mathrm{core}_{\gamma'}(\vec{w})$ with high probability. The proof of the following lemma
essentially consists of applying Lemma 27.
Lemma 33. Let $G$ be a random graph sampled from RPLG$(n, \tau)$. With probability at least
$1 - 1/n^2$ it holds that
$$\mathrm{core}_{\gamma'}(\vec{w}) = \{v_i : w_i > n^{\gamma'}\} \subseteq \{v_i : \deg(v_i) > n^{\gamma'}/4\} = \mathrm{core}_{\gamma'}(G).$$
Proof. Let $v_i$ be a vertex in $\mathrm{core}_{\gamma'}(\vec{w})$.
By Lemma 27, $D_i \ge w_i/4 > n^{\gamma'}/4$ with probability at least $1 - 2/n^4$. This holds for all $j \le i$.
Therefore, by applying the union bound, the probability that the core of the actual graph
contains the nodes with high potential, $\mathrm{core}_{\gamma'}(\vec{w}) \subseteq \{v_i : \deg(v_i) > n^{\gamma'}/4\}$, is at least $1 - 1/n^2$.
Lemma 34. Let $G$ be a random graph sampled from RPLG$(n, \tau)$. With probability at least
$1 - 1/n^2$,
$$|\mathrm{core}_{\gamma'}(G)| = \Theta(n^{\gamma}).$$
Proof. Lower bound: Since $\mathrm{core}_{\gamma'}(G)$ contains $\mathrm{core}_{\gamma'}(\vec{w})$ with probability at least $1 - 1/n^2$,
its size is at least $n^{\gamma} - 1$ with at least the same probability.
Upper bound: Let $i = 144 n^{\gamma}$. By Lemma 27, $D_i \le 3w_i < n^{\gamma'}/4$ with probability at least
$1 - 2/n^4$. This holds for all $j \in (i, n]$. By union bound, $\mathrm{core}_{\gamma'}(G)$ does not contain any
vertex $v_j$ for $i \le j \le n$, with probability at least $1 - 1/n^2$, which implies $|\mathrm{core}_{\gamma'}(G)| \le 144 n^{\gamma}$
with probability at least $1 - 1/n^2$.
5.4.2 Ball Sizes
We prove that the expected ball size is small. This section contains the main technical idea, which
is the application of Lemma 23.
Let $G$ be a random graph sampled from RPLG$(n, \tau)$. Recall that a ball is defined by
$$B_{\mathrm{core}}(u) = \left\{ v \in V(G) : d(u, v) < \min_{v' \in \mathrm{core}_{\gamma'}(G)} d(u, v') \right\}.$$
Lemma 35. Let $b = \gamma'(\tau - 2) + \frac{(2\tau-3)\varepsilon}{\tau-1}$ be a constant. Assume Equation (5.3) is satisfied. For a
random graph $G$ sampled from RPLG$(n, \tau)$, with probability at least $1 - 3/n^2$, it holds that for
all $u \in V(G)$,
$$|B_{\mathrm{core}}(u)| = |\{u' \in V(G) : d(u, u') < d(u, \mathrm{core}_{\gamma'}(\vec{w}))\}| = O(n^b),$$
$$|E(B_{\mathrm{core}}(u))| = O(n^b \lg n),$$
where $E(B_{\mathrm{core}}(u))$ is the set of internal edges among vertices in $B_{\mathrm{core}}(u)$.
Since for RPLG$(n, \tau)$ the edges are independent, in our analysis, the existence of every edge
in random graph $G$ is only determined when it is needed, and before that it is treated as a
probability distribution as defined in our random graph model. We call the determination of the existence of
an edge according to its probability distribution revealing the edge. For a given vertex $u \in V(G)$,
we define a sequence of balls $(B_0 = \{u\}, B_1, B_2, \ldots)$ as follows:
Let $V' = V \setminus \mathrm{core}_{\gamma'}(\vec{w})$.
Now define $B_0 = \{u\}$ and $B_i = \{v : d_G(u, v) \le i\}$.
We also define the circles $C_i = B_i \setminus B_{i-1}$ for $i \ge 0$ with $B_{-1} = \emptyset$. Let $E_i$ denote the
random variable counting the number of edges between $C_i$ and $C_i \cup C_{i+1}$.
We first give a concentration result on $E_i$.
Lemma 36. For circle $C_i$, the following holds with probability at least $1 - 2/n^3$:
If $\mathrm{Vol}(C_i) < 192 \ln n$, then $E_i \le 4 \cdot 192 \ln n$, and
if $\mathrm{Vol}(C_i) \ge 192 \ln n$, then $E_i \le 4\,\mathrm{Vol}(C_i)$.
Proof. For our analysis, we assume that the edges of the random graph are revealed in consecutive
steps as follows: in step $i$ with $i \ge 0$, edges from $C_i$ to $V' \setminus B_{i-1}$ are revealed and circle $C_{i+1}$ is
formed. In other words, when discovering $C_i$, the edges between $C_i$ and $V'' = V' \setminus B_{i-1}$ have
not been revealed yet.
In particular, $E_i$ measures the number of edges between $C_i$ and $V''$ under the condition that
we know all edges adjacent to $B_{i-1}$. We can define another random graph $G'$ on the vertex set
$V''$, such that the edge between two vertices in $V''$ is sampled with the same probability as in
RPLG$(n, \tau)$. Now, $E_i$ and $\mathrm{vol}_{G'}(C_i)$ have the same distribution, where $\mathrm{vol}_{G'}(C_i)$ denotes the
number of edges adjacent to $C_i$ in $G'$.
Let $\mathrm{vol}(C_i)$ denote the random variable measuring the number of edges adjacent to $C_i$ for the
original model FDRG. $\mathrm{vol}_{G'}(C_i)$ is stochastically dominated by $\mathrm{vol}(C_i)$. Hence, the statement of
the lemma follows directly, since it applies to $\mathrm{vol}(C_i)$ by Lemma 28.
Since there are at most $n$ circles, Lemma 36 holds for all circles with probability at least
$1 - 2/n^2$.
The above arguments are combined to prove Lemma 35.
Proof of Lemma 35. Let $k$ be the smallest integer such that $\mathrm{Vol}(B_k) \ge n^b$. We have the
conditions
$$\mathrm{Vol}(B_k) \ge n^b,$$
$$\mathrm{Vol}(\mathrm{core}_{\gamma'}(\vec{w})) \ge |\mathrm{core}_{\gamma'}(\vec{w})|\, n^{\gamma'} = n^{\gamma + \gamma'}, \text{ and}$$
$$\mathrm{Vol}(G) \le \frac{\tau-1}{\tau-2}\, n \quad \text{(Lemma 24)}.$$
From Equation (5.3),
$$n^{b - \gamma'(\tau-2)} > 2\, \frac{\tau-1}{\tau-2} \ln n.$$
Since the edges between $B_k$ and $\mathrm{core}_{\gamma'}(\vec{w})$ have not been revealed yet, Lemma 23 can be
applied. Due to Lemma 23, there is an edge between $B_k$ and $\mathrm{core}_{\gamma'}(\vec{w})$ with probability at least
$1 - 1/n^2$. By Lemma 33, $\mathrm{core}_{\gamma'}(\vec{w}) \subseteq \mathrm{core}_{\gamma'}(G)$ with probability at least $1 - 1/n^2$. Hence
$B_{\mathrm{core}}(u) \subseteq B_k$ with probability at least $1 - 2/n^2$.
In the following, we bound the size of $B_k$. Lemma 36 holds for all circles with high probability.
In our case,
$$\mathrm{Vol}(C_{k-1}) \le \mathrm{Vol}(B_{k-1}) < n^b.$$
By Lemma 36, $|C_k| \le E_{k-1} \le 4n^b$ with probability at least $1 - 1/n^2$. Then,
$$|B_k| = |B_{k-1}| + |C_k| \le \mathrm{Vol}(B_{k-1}) + |C_k| \le 5n^b.$$
Since $B_{\mathrm{core}}(u) \subseteq B_k$ with probability at least $1 - 2/n^2$, we have
$$|E(B_{\mathrm{core}}(u))| = O(\mathrm{vol}(B_{k-1}(u))) = O\left( \sum_{i=0}^{k-1} E_i \right)$$
with probability at least $1 - 2/n^2$.
By Lemma 36, with probability at least $1 - 1/n^2$, for all $i$,
$$E_i \le 4 \cdot 192 \ln n + 4\,\mathrm{Vol}(C_i).$$
Since $k \le n^b$, with probability at least $1 - 3/n^2$,
$$|E(B_{\mathrm{core}}(u))| = O\left( \sum_{i=0}^{k-1} E_i \right) = O(4 \cdot 192\, n^b \ln n + 4\,\mathrm{Vol}(B_{k-1})) = O(n^b \lg n).$$
This concludes the proof.
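The step in the proof above that invokes Lemma 23 can be spelled out as follows. This is a worked sketch under the same simplification the proof makes (treating $|\mathrm{core}_{\gamma'}(\vec{w})|$ as $n^{\gamma}$); the choice $c = 2 \ln n$ is inferred from the stated probability $1 - 1/n^2$.

```latex
\begin{align*}
\frac{\mathrm{Vol}(B_k)\,\mathrm{Vol}(\mathrm{core}_{\gamma'}(\vec{w}))}{\mathrm{Vol}(G)}
  &\ge \frac{n^{b} \cdot n^{\gamma+\gamma'}}{\frac{\tau-1}{\tau-2}\, n}
   = \frac{\tau-2}{\tau-1}\, n^{\,b+\gamma+\gamma'-1} \\
  &= \frac{\tau-2}{\tau-1}\, n^{\,b-\gamma'(\tau-2)}
   && \text{since } \gamma - 1 = -\gamma'(\tau-1) \\
  &> 2 \ln n
   && \text{by Equation (5.3).}
\end{align*}
% Lemma 23 with S = B_k, T = core_{\gamma'}(\vec{w}) and c = 2 \ln n then gives
\[
\Pr\bigl[d(B_k, \mathrm{core}_{\gamma'}(\vec{w})) > 1\bigr] \le e^{-2 \ln n} = n^{-2}.
\]
```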
5.4.3 Assembly
The core $\mathrm{core}_{\gamma'}(G)$ has size $\Theta(n^{\gamma})$ with probability at least $1 - 1/n^2$ (Lemma 34) and all balls
$B_{\mathrm{core}}(u)$ have size $O(n^{\gamma})$ with probability at least $1 - 3/n^2$ (Lemma 35). Therefore, we have the
following result.
Proof of Lemma 32. Our algorithm is deterministic. The expected time (space) complexity is the
average running time (space) of our algorithm over all graphs from the random graph distribution
RPLG$(n, \tau)$.
Given a graph $G$ with $n$ nodes and $m$ edges, our algorithm computes the core $\mathrm{core}_{\gamma'}(G)$ of
$G$ with time complexity $O(m + n \lg n)$. It runs a complete breadth-first search for each node
of the core in time $O(m)$. Due to the condition of Lemma 32, Equation (5.3) is satisfied. Let
$B_{\mathrm{core}}(u)$ denote the ball computed in our algorithm for vertex $u$. Let $T(B_{\mathrm{core}}(u))$ denote the
time to compute $B_{\mathrm{core}}(u)$. Therefore, the time complexity TC and space complexity SC of our
algorithm are at most
$$TC(G) = O\left( m \cdot |\mathrm{core}_{\gamma'}(G)| + \sum_{u \in V(G)} T(B_{\mathrm{core}}(u)) \right), \qquad (5.4)$$
$$SC(G) = O\left( n \cdot |\mathrm{core}_{\gamma'}(G)| + \sum_{u \in V(G)} |B_{\mathrm{core}}(u)| \right). \qquad (5.5)$$
Let $b = \gamma'(\tau - 2) + \frac{(2\tau-3)\varepsilon}{\tau-1}$. By Lemma 35, SC is at most $O(n^{1+b})$ with probability at least
$1 - 3/n^2$. The time to compute $B_{\mathrm{core}}(u)$ is linear in the number of internal edges in $B_{\mathrm{core}}(u)$, since
the graph is unweighted and the distance from $u$ to the core has been determined before computing
$B_{\mathrm{core}}(u)$. By Lemma 35, $TC = O(n^{1+b})$ with probability at least $1 - 3/n^2$.
We now know that with probability at least $1 - 5/n^2$, all of the following conditions are true:
1. $m = \Theta(n)$ (Corollary 29);
2. $|\mathrm{core}_{\gamma'}(G)| = \Theta(n^{\gamma})$ (Lemma 34);
3. $|B_{\mathrm{core}}(u)| = O(n^b)$ for all vertices $u$ (Lemma 35);
4. $T(B_{\mathrm{core}}(u)) = O(n^b \lg n)$ for all vertices $u$ (Lemma 35).
Therefore, from Equations (5.4) and (5.5), we know that with probability at least $1 - 5/n^2$, the
space complexity of our algorithm is $O(n^{1+\gamma} + n^{1+b})$ and the time complexity is
$O(n^{1+\gamma} + n^{1+b} \lg n)$.
Finally, we fix the parameters to obtain a balanced scheme. In a balanced scheme, the core
size and the expected ball sizes are asymptotically equivalent, that is, $b = \gamma$. Together with
$b = \gamma'(\tau - 2) + \frac{(2\tau-3)\varepsilon}{\tau-1}$ and $\gamma' = \frac{1-\gamma}{\tau-1}$, we have
$$\gamma = \frac{\tau-2}{2\tau-3} + \varepsilon.$$
Therefore, assuming that Equation (5.3) is satisfied, the space requirement per node is $O(n^{\gamma} \lg n)$
bits and the total preprocessing time is bounded by $O(n^{1+\gamma} \lg n)$, which holds with probability at
least $1 - 1/n$.
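The balancing step can be verified by substituting $\gamma'$ into $b$ and solving $b = \gamma$ for $\gamma$; the short derivation below uses only the definitions stated above.

```latex
\begin{align*}
\gamma = b
  &= \frac{(1-\gamma)(\tau-2)}{\tau-1} + \frac{(2\tau-3)\,\varepsilon}{\tau-1} \\
\gamma(\tau-1)
  &= (\tau-2) - \gamma(\tau-2) + (2\tau-3)\,\varepsilon \\
\gamma(2\tau-3)
  &= (\tau-2) + (2\tau-3)\,\varepsilon \\
\gamma
  &= \frac{\tau-2}{2\tau-3} + \varepsilon .
\end{align*}
```

For instance, $\tau = 2.5$ gives $\gamma = 0.25 + \varepsilon$; as $\tau \to 2$ the exponent tends to $\varepsilon$, and as $\tau \to 3$ it tends to $1/3 + \varepsilon$.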
5.5 Conclusion and Open Problems
Theorem 30 implies that distances and shortest paths in random power-law graphs can be
approximated efficiently. For power-law exponents close to 2, the expected space consumption is close
to linear. The extension of the algorithms for distance oracles to compact routing schemes
indicates that the routing scheme by Thorup and Zwick [TZ01] may be very efficient on Internet-like
network topologies.
Edge-weighted Graphs
The algorithm and the proof currently only apply to unweighted graphs. It seems difficult to extend
our distance oracle to graphs with worst-case weights. An adversary could assign all edges within
the core and within the fringe (all nodes outside the core) infinitesimal values, and edges between
the core and the fringe large values. With these weight values, all balls of nodes in the fringe
span the whole fringe. Since the fringe size is linear in the number of nodes, the distance oracle
would have $O(n^2)$ space. It may be interesting to investigate the case where weights are random.
General Stretch Parameter k
Currently, the adaptation of the distance oracle of Thorup and Zwick [TZ05] works for the stretch
parameter $k = 2$ only. An extension to general $k$ seems feasible but with our proof technique, the
space requirements for constant $k$ would remain $O(n^{1+\varepsilon})$ for some $\varepsilon > 0$.
Stretch
An important open question targets the stretch of the distance oracle. If a distance oracle is used
as a component to distinguish between close and far entities in a complex network, the stretch is
very important. Since the diameter of random power-law graphs is $O(\lg n)$, the stretch should be
as small as possible to guarantee meaningful estimates. While stretch (3, 0) is best possible for
general graphs and distance oracles with $O(n^{3/2})$ space, we prove (Theorem 30) that it is possible
to significantly reduce the space for power-law graphs. For Erdős–Rényi random graphs, stretch
(2, 0) has been achieved with space consumption of $\tilde{O}(n^{7/4})$ [EWG08] (see Section 3.1.3). Is it
possible to reduce both space and stretch?
Different Models
As mentioned in Section 2.1.3, there are many different models for complex networks. Our
analysis only works for the adapted random graph model by Aiello, Chung, and Lu [ACL00]. Other
models such as the configuration model [BBK72] and the preferential attachment model [BA99]
may also have efficient distance oracles.
6 Approximating Shortest Paths Using Voronoi Duals
The best theory is inspired by practice and the best practice is inspired by theory.
Donald Knuth [Knu89]
6.1 Introduction
The main result of this chapter is an approximation method to answer shortest path queries in
general, undirected graphs with positive edge weights, based on random sampling and graph Voronoi
duals [Meh88, Erw00]. In preprocessing, each node is selected as a Voronoi site independently at
random with probability $p$, and the Voronoi dual is computed for the selected sites (Section 6.3).
This preprocessing step is very efficient; it takes time proportional to computing one single source
shortest path tree (Section 6.4). For $p < 1$, the resulting dual graph is expected to be smaller
than the original graph. At query time, search for the shortest path from source $s$ to target $t$ can
potentially be done faster in the Voronoi dual. We let the shortest path in the Voronoi dual guide
the search for an approximate shortest path in the original graph. We prove that the expected
approximation ratio is at most logarithmic in the number of nodes on the actual shortest path, and
that this bound is tight (Section 6.5). Our experimental results show that, in practice, the
approximation is much better than the stated theoretical bound and that the preprocessing overhead is
indeed extremely low (Section 6.6).
Many practical shortest path query methods are tailored for road networks (Section 3.2.3).
There has been considerable recent progress: for the road networks of Europe or the USA, using
a high-performance computer, a speedup of several orders of magnitude compared to Dijkstra's
algorithm can be achieved with a preprocessing time in the tens of minutes [DSSW09].
Unfortunately, theoretical bounds on both query time and preprocessing time are often difficult to
obtain. However, even though road networks constitute the most common and popular application
of shortest path query algorithms to date, other challenging applications exist. Computer
networks, social networks, protein interaction networks, and the web graph exhibit different degree
and structural properties, and may contain hundreds of millions or even billions of nodes. In
specific cases, a user might be willing to trade preprocessing time against exactness due to the vast
size of the data or due to restricted processing power (Section 1.2.2). These scenarios may require
the use of a fast approximation method.
Related methods. Kambara and Ueshima [KU08] independently propose a method (without
analysis) that appears to be closely related to the method we present in this chapter. Fang et
al. [FGG+05] use graph Voronoi diagrams for routing in sensor networks. Yu et al. [YWD08] use
Voronoi paths to bridge communication gaps in sparse sensor networks. Chan and Efrat [CE01]
solve the cheapest path problem for flight connections in $\mathbb{R}^2$. Their method runs Dijkstra's
algorithm on the Delaunay triangulation with respect to a superquadratic cost function $\mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}^+$.
CHAPTER 6. APPROXIMATING SHORTEST PATHS USING VORONOI DUALS
6.2 Preliminaries
6.2.1 Graph Voronoi Diagram
The classical Voronoi diagram is a distance-based decomposition of a metric space relative to a
discrete set, the Voronoi sites [Dir50, Vor07].¹ For a survey on this fundamental structure, we refer
to [Aur91]. Among many applications, the Voronoi diagram is often used to solve facility location
problems [Sha75, ACS99, AGK+04, GKP05, Svi08, Sug09]. The Voronoi diagram and the
Delaunay triangulation of $n$ points in the plane can be computed in expected time $n \cdot 2^{O(\sqrt{\lg \lg n})}$ [CP07],
which is even faster than $O(n \lg n)$.
Mehlhorn [Meh88] and Erwig [Erw00] proposed an analogous decomposition, the Graph
Voronoi Diagram, for undirected and directed graphs respectively. Since the Voronoi diagram for
the Euclidean space is used for various applications, its graph counterpart, the graph Voronoi
diagram, may be used for these applications if the underlying metric is the shortest path metric of a
graph [OSF+08].
Real-world distances or travelling times can be approximated more appropriately using
models based on weighted graphs. In general, non-planar networks such as social networks,
computer networks, protein interaction networks, and the web graph cannot be embedded into a
low-dimensional Euclidean space without significant distortion.
Definition 39 (Graph Voronoi Diagram [Meh88, Erw00]). In a graph $G = (V, E, w)$, the Voronoi
diagram for a set of nodes $K = \{v_1, \ldots, v_k\} \subseteq V$ is a disjoint partition $\mathrm{Vor}_{(G,K)} := \{V_1, \ldots, V_k\}$
of $V$ such that for each node $u \in V_i$, $d(u, v_i) \le d(u, v_j)$ for all $j \in \{1, \ldots, k\}$.
The $V_i$ are called Voronoi regions. The graph Voronoi diagram is not necessarily unique, as a
node $u$ may have the same distance to more than one Voronoi node. Let $\mathrm{vor}(u)$ denote the index $i$
of the Voronoi region $V_i$ containing $u$; that is, $\mathrm{vor}(u) = i \iff u \in V_i$.
Analogously to the Delaunay triangulation dual for classical Voronoi diagrams of point sets,
we define the Voronoi dual for graphs.
Definition 40. Let $G = (V, E, w)$ be an edge-weighted graph and $\mathrm{Vor}_{(G,K)}$ its Voronoi diagram.
The Voronoi dual is the graph $G^* = (K, E^*, w^*)$ with edge set $E^* := \{(v_i, v_j) : v_i, v_j \in K$
and $\exists u \in V_i\ \exists w \in V_j : (u, w) \in E\}$, and edge weights
$$w^*(v_i, v_j) := \min_{u \in V_i,\, w \in V_j,\, (u,w) \in E} \{d(v_i, u) + w(u, w) + d(w, v_j)\}.$$
By contracting edges on the shortest paths connecting Voronoi nodes, one can see that $G^*$ is a
minor of $G$ (see for example Wolff [Wol08, Lemma 4]; minors are defined in Definition 21).
Figure 6.1 illustrates a Voronoi diagram and a graph Voronoi diagram. Although the classical
Voronoi dual of a non-degenerate set of points in the plane is always a triangulation, the graph
Voronoi dual is not necessarily a triangulation, even for planar graphs. For example, a graph
Voronoi dual may have nodes whose removal would disconnect the graph.
1 More folklore in the style of Erdős numbers: according to the Mathematics Genealogy Project, available online
at genealogy.math.ndsu.nodak.edu, there is a tree path in the advisor graph from Dirichlet to my mentor
and collaborator on the Voronoi method, Michael E. Houle: Gustav Peter Lejeune Dirichlet → Rudolf Otto Sigismund
Lipschitz → Christian Felix Klein → Carl Louis Ferdinand Lindemann → Arnold Johannes Wilhelm Sommerfeld → Ernst
Adolph Guillemin → Samuel Jefferson Mason → Robert Wellington Donaldson → Godfried Theodore Patrick Toussaint →
Michael Edward Houle.
Figure 6.1: The Voronoi diagram and the Delaunay triangulation of the plane for a set of
Voronoi sites {A, B, . . . G} and the graph Voronoi diagram and its dual for a set of Voronoi nodes
{A, B, . . . G} in an unweighted graph (note that the graph Voronoi dual is not necessarily a
triangulation).
Erwig [Erw00, Theorem 2] showed that the graph Voronoi diagram can be constructed with
a single Dijkstra search in time $O(m + n \lg n)$. A heap is used to store the shortest path
distances from nodes to their closest Voronoi node. The heap is initialized to store the Voronoi nodes
themselves. Thereafter, as long as there are nodes in the queue, the minimum is extracted from
the heap and processed (or "settled") by assigning to it a Voronoi region, storing the distance
to its Voronoi node, and adding to or updating its neighbors in the queue. We slightly modify
this construction of the Voronoi diagram [Erw00, Section 3.1] to compute the Voronoi dual,
that is, to also compute $E^*$ and $w^*$. Whenever a node $u$ is settled in the Dijkstra search, for
all its settled neighbors $u'$ of different Voronoi regions, the edge
$(v_{\mathrm{vor}(u)}, v_{\mathrm{vor}(u')})$ with weight
$$w_{G^*}(v_{\mathrm{vor}(u)}, v_{\mathrm{vor}(u')}) = d_G(v_{\mathrm{vor}(u)}, u) + w_G(u, u') + d_G(u', v_{\mathrm{vor}(u')})$$
is added, or its length is decreased if there already is an edge in $G^*$ representing a longer
path in $G$. This modification of Erwig's algorithm is shown as Algorithm 5.
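The modified search described above can be sketched in Python. This is a minimal illustration, not the implementation used in the thesis: the function name `compute_voronoi_dual`, the adjacency-dictionary input format, and the use of lazy deletion in place of a decrease-key operation are all assumptions made for the example. Node labels are assumed to be mutually comparable (for instance, all integers or all strings).

```python
import heapq
from math import inf


def compute_voronoi_dual(adj, sites):
    """Multi-source Dijkstra that also collects the Voronoi dual.

    adj   -- undirected graph as {node: {neighbor: weight}} (assumed format)
    sites -- iterable of Voronoi nodes K
    Returns (vor, dist, dual): vor[u] is the Voronoi node of u, dist[u] is
    d(vor[u], u), and dual maps a sorted pair of Voronoi nodes to w*.
    """
    dist = {u: inf for u in adj}
    vor = {u: None for u in adj}
    heap = []
    for s in sites:
        # Seeding every site at distance 0 has the same effect as a dummy
        # node v_0 joined to all sites by zero-weight edges.
        dist[s] = 0
        vor[s] = s
        heapq.heappush(heap, (0, s))

    settled = set()
    dual = {}
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue  # stale heap entry (lazy deletion instead of decreaseKey)
        settled.add(u)
        for x, w in adj[u].items():
            if x in settled:
                if vor[x] != vor[u]:
                    # Edge (u, x) crosses a region boundary: add the dual
                    # edge or decrease its weight.
                    key = tuple(sorted((vor[u], vor[x])))
                    cand = dist[u] + w + dist[x]
                    if cand < dual.get(key, inf):
                        dual[key] = cand
            elif d + w < dist[x]:
                dist[x] = d + w
                vor[x] = vor[u]
                heapq.heappush(heap, (d + w, x))
    return vor, dist, dual
```

Each boundary-crossing edge is examined when the later of its two endpoints is settled, at which point both distances to the respective Voronoi nodes are final, so the stored dual weight is the minimum required by Definition 40.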
In the analysis to follow (in Section 6.5) we move back and forth between a graph and its dual.
For this we need the following definitions.
Definition 41. Given a path $P = (u_0, u_1, \ldots, u_h)$, the Voronoi path of $P$ is the sequence of
vertices $P^* = (v_{\mathrm{vor}(u_0)}, v_{\mathrm{vor}(u_1)}, \ldots, v_{\mathrm{vor}(u_h)})$.
Note that the Voronoi path $P^*$ may not necessarily be simple, as multiple consecutive occurrences
of nodes $v_{\mathrm{vor}(u_i)}$ are possible in $P^*$. They are treated as a single occurrence, and such paths
are deemed to be equivalent.
Lemma 37. For any path $P = (u_0, \ldots, u_h)$ in an undirected graph $G = (V, E, w)$, the
corresponding Voronoi path $P^*$ exists and is unique.
Algorithm 5 ComputeVoronoiDual($G = (V, E)$, $K \subseteq V$)
$V' := V \cup \{v_0\}$, $E' := E$
$i := 1$
for $u \in K$ do
    $v_i := u$
    $\mathrm{vor}(v_i) := i$
    $E' := E' \cup \{\{v_0, u\}\}$
    $i := i + 1$
end for
HEAP.put($v_0$)
while $\neg$ HEAP.empty do
    $u_{cur}$ := HEAP.extractMin
    for $u \in \Gamma(u_{cur})$ do
        if $\mathrm{vor}(u)$ = undefined then
            $\mathrm{vor}(u) := \mathrm{vor}(u_{cur})$
            HEAP.insert($u$, $d(v_0, u_{cur}) + w(u_{cur}, u)$)
        else if $d(v_0, u_{cur}) + w(u_{cur}, u) < d(v_0, u)$ then
            $\mathrm{vor}(u) := \mathrm{vor}(u_{cur})$
            HEAP.decreaseKey($u$, $d(v_0, u_{cur}) + w(u_{cur}, u)$)
        else if $\neg$ HEAP.contains($u$) and $\mathrm{vor}(u) \neq \mathrm{vor}(u_{cur})$ then
            if $(v_{\mathrm{vor}(u_{cur})}, v_{\mathrm{vor}(u)}) \notin E^*$ then
                $E^* := E^* \cup \{(v_{\mathrm{vor}(u_{cur})}, v_{\mathrm{vor}(u)})\}$
                $w^*(v_{\mathrm{vor}(u_{cur})}, v_{\mathrm{vor}(u)}) := \infty$
            end if
            if $w^*(v_{\mathrm{vor}(u_{cur})}, v_{\mathrm{vor}(u)}) > d(v_{\mathrm{vor}(u_{cur})}, u_{cur}) + w(u_{cur}, u) + d(u, v_{\mathrm{vor}(u)})$ then
                $w^*(v_{\mathrm{vor}(u_{cur})}, v_{\mathrm{vor}(u)}) := d(v_{\mathrm{vor}(u_{cur})}, u_{cur}) + w(u_{cur}, u) + d(u, v_{\mathrm{vor}(u)})$
            end if
        end if
    end for
end while
Proof. Suppose that there is no such path $P^*$ in $G^*$. This implies that there exist pairs of nodes
$u_i, u_{i+1}$ on the path $P$ for which $v_{\mathrm{vor}(u_i)} \neq v_{\mathrm{vor}(u_{i+1})}$ and
$(v_{\mathrm{vor}(u_i)}, v_{\mathrm{vor}(u_{i+1})}) \notin E^*$. As $u_i, u_{i+1}$
are consecutive nodes on the path $P$, we know that $(u_i, u_{i+1}) \in E$. This contradicts the definition
of the Voronoi dual (Def. 40), since $(u_i, u_{i+1}) \in E$ and $v_{\mathrm{vor}(u_i)} \neq v_{\mathrm{vor}(u_{i+1})}$ together imply that
$(v_{\mathrm{vor}(u_i)}, v_{\mathrm{vor}(u_{i+1})}) \in E^*$. $P^*$ is unique since each node $u_i$ on the path belongs to exactly one
Voronoi region, corresponding to exactly one Voronoi node $v_{\mathrm{vor}(u_i)}$.
Definition 42. For a path $P^*$ in the Voronoi dual $G^*$ of a graph $G$, the Voronoi sleeve is the
subgraph of $G$ induced by the nodes in the union of all Voronoi regions $V_i$ whose Voronoi
node $v_i$ lies on $P^*$,
$$\mathrm{Sleeve}_{(G,G^*)}(P^*) := G\Big[\bigcup_{v_i \in P^*} V_i\Big].$$
The Voronoi sleeve is related to a subgraph sometimes termed corridor.
With the definitions at hand we can now state the approximation method.
6.3 The Voronoi Method
This section describes the preprocessing and query algorithms of the Voronoi method. Both
algorithms are conceptually very simple and thus easy to implement.
In preprocessing, each node is selected as a Voronoi site independently at random with
probability $p$, and the Voronoi dual is computed for the selected sites (Algorithm 6). For the sake of
exposition, we treat the computation of the Voronoi dual as a black box, denoted by
ComputeVoronoiDual.
Algorithm 6 Preprocessing
Input: graph $G = (V, E, w)$, sampling rate $p \in [0, 1]$.
Output: Voronoi dual $G^*$ with Voronoi nodes selected independently at random with probability $p$.
1: Random sampling: Generate the set of Voronoi nodes $K$ by selecting each node of $V$
   independently at random: $\forall v \in V$, $\Pr[v \in K] = p$.
2: Compute a Voronoi dual $G^* = (K, E^*, w^*)$ using the modified version of Erwig's
   algorithm [Erw00, Section 3.1] as shown in Algorithm 5:
   $G^* := \mathrm{ComputeVoronoiDual}(G, K)$
3: Return $G^*$.
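Step 1 of Algorithm 6 amounts to one independent coin toss per node. A minimal Python sketch follows; the function name `sample_voronoi_nodes` and the `seed` parameter are hypothetical and only serve the illustration.

```python
import random


def sample_voronoi_nodes(nodes, p, seed=None):
    """Select each node independently with probability p (Pr[v in K] = p)."""
    rng = random.Random(seed)  # private generator, so runs are reproducible
    # rng.random() is uniform on [0, 1), so the test succeeds with probability p.
    return [v for v in nodes if rng.random() < p]
```

For small graphs or small `p` the sample may be empty, in which case no Voronoi dual exists; a practical implementation would presumably resample or force at least one site, which is a design choice not discussed in the text.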
Lemma 38. For a graph $G = (V, E)$ with $n := |V|$ and $m := |E|$, Algorithm 6 takes time
proportional to that of Dijkstra's single source shortest path algorithm.
Proof. Erwig's variant of Dijkstra's algorithm computes the graph Voronoi diagram in a worst-case
time proportional to Dijkstra's algorithm [Erw00, Theorem 2]. The only modification of
Algorithm 5 compared to Erwig's variant is the following: for each node, at the time it is settled,
all its neighbors are inspected. Therefore, each edge is additionally considered two times in total.
This yields the same asymptotic running time.
The preprocessing time complexity is proportional to the cost of computing one single source
shortest path tree. Details are discussed in Section 6.4.
At query time, given a graph $G$ and its Voronoi dual $G^*$, we answer (approximate) shortest path
queries between source $s$ and target $t$ by first searching for a shortest path
$\mathrm{SP}_{G^*}(v_{\mathrm{vor}(s)}, v_{\mathrm{vor}(t)})$ in
the smaller Voronoi dual $G^*$. This path determines the sleeve
$S = \mathrm{Sleeve}(\mathrm{SP}_{G^*}(v_{\mathrm{vor}(s)}, v_{\mathrm{vor}(t)}))$,
whose shortest path $\mathrm{SP}_S(s, t)$ approximates the shortest path $\mathrm{SP}_G(s, t)$ in $G$. The shortest path in
the Voronoi dual guides the Dijkstra search in the original graph. For a pseudo-code description,
see Algorithm 7; for an illustration, see Figure 6.2.
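The query procedure can be sketched in Python as follows. This is an illustrative sketch under several assumptions: the region assignment `vor` and the dual adjacency `dual_adj` are taken as precomputed inputs, the early exit of Step 1 is omitted, and the sleeve is collected by a linear scan over all nodes (an implementation concerned with query time would store region membership lists instead). All function and parameter names are hypothetical.

```python
import heapq
from math import inf


def dijkstra(adj, source, target, allowed=None):
    """Shortest path from source to target, optionally restricted to `allowed`."""
    dist = {source: 0}
    parent = {source: None}
    heap = [(0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for x, w in adj[u].items():
            if allowed is not None and x not in allowed:
                continue  # stay inside the sleeve
            if d + w < dist.get(x, inf):
                dist[x] = d + w
                parent[x] = u
                heapq.heappush(heap, (d + w, x))
    if target not in done:
        return inf, []
    path, node = [], target
    while node is not None:
        path.append(node)
        node = parent[node]
    return dist[target], path[::-1]


def voronoi_query(adj, vor, dual_adj, s, t):
    """Approximate shortest s-t path via the Voronoi sleeve."""
    # Step 2: shortest path between the two Voronoi nodes in the dual.
    _, dual_path = dijkstra(dual_adj, vor[s], vor[t])
    # Step 3: the sleeve is the union of the regions along that dual path.
    on_path = set(dual_path)
    sleeve = {u for u in adj if vor[u] in on_path}
    # Step 4: exact search restricted to the sleeve.
    return dijkstra(adj, s, t, allowed=sleeve)
```

On a small instance in the spirit of Figure 6.3, where the dual prefers a direct route, the sleeve excludes the node on the true shortest path and the query returns a slightly longer path, which is exactly the approximation effect analyzed in Section 6.5.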
The running time of Algorithm 7 depends on $G$ and $p$. Let $N^*$ and $M^*$ denote the random
variables measuring the number of nodes and edges of the Voronoi dual. Clearly $E[N^*] = p \cdot n$.
The expected query time without the refinement step (computing the shortest path in the Voronoi
sleeve) is at most $O(N^* \lg N^* + M^*)$. The time for the refinement step depends on the size of the Voronoi
sleeve. The analysis will show that the refinement step is not necessary for the approximation ratio
to hold for long distance queries; however, it makes a practical difference for the quality of paths.
For $p = O(n^{-2/3})$, $E[N^*] = O(n^{1/3})$, and thus we can afford to compute all-pairs shortest path
distances in the Voronoi dual $G^*$ in overall linear expected time. This allows for constant-time
approximate distance queries.
Algorithm 7 Query
Input: Graph $G$, Voronoi dual $G^*$, Source $s$, Target $t$.
Output: an approximate shortest path $P$ from $s$ to $t$.
1: Find Voronoi source $v_{\mathrm{vor}(s)}$ from $s$ and Voronoi target $v_{\mathrm{vor}(t)}$ from $t$. If thereby a shortest path
   $\mathrm{SP}_G(s, t)$ has been found, return it.
2: Compute a shortest path from $v_{\mathrm{vor}(s)}$ to $v_{\mathrm{vor}(t)}$ in the Voronoi dual $G^*$:
   $\mathrm{SP}_{G^*}(v_{\mathrm{vor}(s)}, v_{\mathrm{vor}(t)})$.
3: Compute the Voronoi sleeve
   $S := \mathrm{Sleeve}(\mathrm{SP}_{G^*}(v_{\mathrm{vor}(s)}, v_{\mathrm{vor}(t)}))$.
4: Compute a shortest path from $s$ to $t$ in the Voronoi sleeve, $\mathrm{SP}_S(s, t)$.
5: Return $P = \mathrm{SP}_S(s, t)$.
Figure 6.2: Illustration of the query algorithm of the Voronoi method. Left to right, top to bottom:
(1) the original shortest path, (2) shortest path in the (weighted) dual, (3) sleeve, and (4) shortest
path in the sleeve.
6.4 Computational Complexity
In this section we study the cost of computing a Voronoi dual. Recall that in Erwig's algorithm
[Erw00, Section 3.1] the graph Voronoi diagram is constructed with a single Dijkstra search.
A heap is used to store the shortest path distances from nodes to their closest Voronoi node.
Conceptually, a dummy node with a zero-weighted edge to each of the Voronoi nodes is added, the
dummy node is inserted into the heap, and the Dijkstra single source shortest path search is
executed. The running times of different implementations of Dijkstra's algorithm depend on the
priority queue employed (see Table 2.5). Using Fibonacci heaps [FT87], Dijkstra's algorithm
takes time $O(m + n \lg n)$.
Erwig also claims a time lower bound of $\Omega(\max(n, (n - k) \lg k))$ [Erw00, Theorem 1]. The
lower bound simplifies to $\Omega(n \lg n)$ when the number of Voronoi nodes is assumed to be $k = n^C$
for a fixed choice of $C \in (0, 1)$. Assuming that all edges must be inspected at construction time,
this lower bound would be tight. The bound is information theoretic: for a connected graph, each
node $w \in V \setminus K$ is in exactly one of the $k$ regions $V_i$. Encoding one instance out of these $k^{n-k}$
possibilities requires $\lg k^{n-k} = (n - k) \lg k$ bits.
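The simplification to $\Omega(n \lg n)$ for $k = n^C$ can be checked by direct substitution. The short derivation below is only a sketch; the explicit threshold on $n$ is an illustrative choice and does not appear in the original argument.

```latex
\begin{align*}
\lg k^{\,n-k} &= (n-k)\,\lg k = (n - n^{C})\, C \lg n
  && \text{substituting } k = n^{C} \\
&\ge \tfrac{C}{2}\, n \lg n
  && \text{whenever } n^{C} \le n/2,\ \text{i.e. } n \ge 2^{1/(1-C)} \\
&= \Omega(n \lg n)
  && \text{for fixed } C \in (0,1).
\end{align*}
```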
For some graphs with special properties, Erwig's lower bound may not apply. Eppstein and
Goodrich [EG08] presented a linear-time algorithm to compute the Voronoi diagram for road
networks satisfying certain geometric properties. Also, the lower bound may not hold under different
models of computation, such as the word RAM model. This model assumes that basic operations
such as adding two words require a single time step, and that the time complexity is the number
of word operations executed. The space complexity is the number of words of storage required,
assuming that any identifier (such as a node label) or value (such as a distance) can be contained
in a single word. Under the word RAM model, the implementation of Dijkstra's algorithm by
Thorup [Tho04b] requires only $O(m + n \lg \lg n)$ time.
Corollary 39. The graph Voronoi dual can be computed in time $O(m + n \lg \lg n)$ in the word
RAM model.
Note that the time upper bound under the word RAM model does not contradict Erwig's
information-theoretic lower bound [Erw00, Theorem 1] of $\Omega(n \lg n)$ bits.
Computing a graph Voronoi dual does not actually require the use of Dijkstra's algorithm:
any single source shortest path algorithm (including parallel and distributed algorithms) can be
used to compute a graph Voronoi dual as follows. Instead of an adapted Dijkstra search, we may
also
1. augment $G$ by introducing a dummy node $v_0$ connected to each of the Voronoi nodes with
an edge of length zero,
2. run any single source shortest path algorithm in the augmented graph $G'$ with $v_0$ as its
source, and
3. explore the search tree rooted at $v_0$ by following shortest path edges only.
This last step simulates a Dijkstra search by following the single source shortest path tree without
using any expensive decrease-key operations (these operations have to be avoided to reduce the
worst-case running time [Tho00b, Tho07]); a First-In-First-Out queue with constant time for the
enqueue and dequeue operations is sufficient. For a pseudo-code description, see Algorithm 8.
Although the construction is mainly of theoretical interest, it may be useful for example for parallel
or distributed algorithms and for software that must rely on certain libraries.
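The three steps above can be sketched in Python. In this sketch the single source shortest path computation is treated as a black box whose result is passed in as the distance vector `D`; the function name, the input format, and the exact-equality test on distances (which presumes integer weights or otherwise exact arithmetic) are assumptions made for the illustration.

```python
from collections import deque
from math import inf


def voronoi_dual_from_sssp(adj, sites, D):
    """Build regions and the dual from a precomputed SSSP distance vector.

    adj   -- undirected graph as {node: {neighbor: weight}} (assumed format)
    sites -- Voronoi nodes K
    D     -- D[u] = distance from the dummy node v_0 to u, that is, the
             distance from u to its nearest Voronoi node
    """
    sites = list(sites)
    vor = {s: s for s in sites}
    fifo = deque(sites)  # plain FIFO queue, no decrease-key needed
    dual = {}
    while fifo:
        u = fifo.popleft()
        for x, w in adj[u].items():
            if x not in vor:
                # Follow shortest-path-tree edges only (exact comparison).
                if D[x] == D[u] + w:
                    vor[x] = vor[u]
                    fifo.append(x)
            elif vor[x] != vor[u]:
                # Both endpoints are assigned and lie in different regions.
                key = tuple(sorted((vor[u], vor[x])))
                cand = D[u] + w + D[x]
                if cand < dual.get(key, inf):
                    dual[key] = cand
    return vor, dual
```

Every node is enqueued once and every edge is inspected from both endpoints, so the work after the SSSP call is linear in the size of the graph.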
Note that, if a single source shortest path algorithm $A$ works for a special class of graphs $\mathcal{G}$, the
augmented graph $G'$ may not necessarily be in $\mathcal{G}$, and thus algorithm $A$ cannot be used in general.
For example, for planar graphs, the $O(n)$-time algorithm of Henzinger et al. [HKRS97] cannot
be applied directly to compute the Voronoi diagram since planarity may be violated by adding a
dummy node. In the particular case of the algorithm of Henzinger et al., however, the analysis of
the running time depends on separators, which seem to admit the introduction of a dummy node.
Algorithm 8 ComputeVoronoiDual($G = (V, E)$, $K \subseteq V$)
Let $G' := (V', E')$ with $V' = V \cup \{v_0\}$ and $E' = E \cup \{(v_0, v) : v \in K\}$ with $w'(v_0, v) = \epsilon$
    (one would set $\epsilon = 0$ if possible; if only positive edge weights are allowed, other values work as well)
$D := \mathrm{SSSP}(G', v_0)$, where $D$ is the distance vector storing the distance from $v_0$ to each node $u \in V'$
for $i := 1$ to $k = |K|$ do
    $\mathrm{vor}(v_i) := i$
    FIFO.enqueue($v_i$)
end for
while $\neg$ FIFO.empty do
    $u_{cur}$ := FIFO.dequeue
    for $u \in \Gamma(u_{cur})$ do
        if $D(u) = D(u_{cur}) + w(u, u_{cur})$ and $\mathrm{vor}(u)$ = undef then
            $\mathrm{vor}(u) := \mathrm{vor}(u_{cur})$
            FIFO.enqueue($u$)
        else if $\mathrm{vor}(u) \neq$ undef and $\mathrm{vor}(u) \neq \mathrm{vor}(u_{cur})$ then
            if $(v_{\mathrm{vor}(u_{cur})}, v_{\mathrm{vor}(u)}) \notin E^*$ then
                $E^* := E^* \cup \{(v_{\mathrm{vor}(u_{cur})}, v_{\mathrm{vor}(u)})\}$
                $w^*(v_{\mathrm{vor}(u_{cur})}, v_{\mathrm{vor}(u)}) := \infty$
            end if
            if $w^*(v_{\mathrm{vor}(u_{cur})}, v_{\mathrm{vor}(u)}) > D(v_0, u_{cur}) + w(u_{cur}, u) + D(u, v_0)$ then
                $w^*(v_{\mathrm{vor}(u_{cur})}, v_{\mathrm{vor}(u)}) := D(v_0, u_{cur}) + w(u_{cur}, u) + D(u, v_0)$
            end if
        end if
    end for
end while
Theorem 40. Using any single source shortest path algorithm for general graphs with running
time $S(n, m, W)$, Algorithm 8 computes a graph Voronoi dual in time $O(n + m + S(n, m, W))$.
Proof. After running the SSSP algorithm in time $S(n, m, W)$, Algorithm 8 visits every node
exactly once and every edge exactly twice (once for each end point).
For undirected graphs with integer or floating point weights, we may use the $O(m)$-time SSSP
algorithm of Thorup [Tho99, Tho00a].
Corollary 41. For undirected graphs with integer or floating point weights, the graph Voronoi
dual can be computed in time $O(m + n)$ in the word RAM model.
For real weights and undirected graphs, we may use the $O(m + n \lg \lg n)$-time algorithm of
Pettie and Ramachandran [PR02].
Corollary 42. For undirected graphs with real weights, the graph Voronoi dual can be computed
in time $O(m + n \lg \lg n)$.
For road networks, we may use the linear-time algorithm of Eppstein and Goodrich [EG08].
Practical considerations. Note that storing each Voronoi node twice, once as a graph node and
once as a dual node, causes unnecessary additional space consumption. However, when both the
original graph and the dual graph are stored in the same structure, searching the dual could result
in a substantial number of cache misses, since Voronoi nodes are roughly $1/p$ positions apart.
Adapting the memory organization by reordering the nodes such that the memory locations used
for Voronoi nodes are close together may potentially increase the cache efficiency [GKW07].
6.5 Stretch Analysis
In this section, we prove that the expected path length approximation ratio (stretch) is logarithmic
in the number of edges of an exact shortest path. The bound on the stretch is the main theoretical
result of this chapter.
To simplify notation, we only consider the multiplicative stretch $\alpha$ in this section. We write
stretch $\alpha$ instead of stretch $(\alpha, 0)$ as originally defined in Definition 30.
Theorem 43. For shortest paths having $h$ edges, Algorithm 7, given a graph and its Voronoi dual
with sampling rate $p$ (constructed by Algorithm 6), has expected worst-case stretch $O(\lg_{1/(1-p)} h)$.
The path $\mathrm{SP}_S(s, t)$ found by the algorithm is an approximation, since it is possible that no
actual shortest path $\mathrm{SP}_G(s, t)$ lies entirely within the Voronoi sleeve $S$. We explain how this is
possible, and give an upper bound on the expected length $\ell(\mathrm{SP}_S(s, t))$. For this purpose, we
prove relationships between the lengths of simple paths $P$ and their corresponding Voronoi paths
$P^*$. The stretch of a path $P^*$ depends on the number and distribution of Voronoi nodes on the path
$P$. In particular, the stretch depends linearly on the largest interval between two Voronoi nodes on
the path.
Definition 43. For a path $P = (u_0, u_1, \ldots, u_h)$ in a graph $G = (V, E, w)$, and a set of Voronoi
nodes $K \subseteq V$, two Voronoi nodes $v_i, v_j$ on $P$ are called consecutive if the subpath between $v_i$
and $v_j$ does not contain another Voronoi node. The gap $g$ between two consecutive Voronoi nodes
on the path is defined as the number of edges of this subpath. The largest gap of a path is the
maximum over all gaps between two consecutive Voronoi nodes on the path.
To simplify the analysis, we initially assume that $s$ and $t$ are Voronoi nodes. Later, we will
relax this restriction.
We wish to prove that the stretch is at most the size of the largest gap $\hat{h}$ between two Voronoi
nodes on the path $\mathrm{SP}_G(s, t)$. For the analysis we fix a shortest path $\mathrm{SP}_G(s, t)$ from $s$ to $t$, say
$(s, u_1, u_2, \ldots, u_{h-1}, t)$. If the corresponding Voronoi path $(\mathrm{SP}_G(s, t))^*$ is a shortest path from $s$
to $t$ in the Voronoi dual, then the Voronoi sleeve $S$ also contains $\mathrm{SP}_G(s, t)$. Figure 6.3 gives an
example for which $(\mathrm{SP}_G(s, t))^*$ is not a shortest path in the dual.
In Lemma 44, for any simple path $P$, we give a worst-case bound on the length of the
corresponding Voronoi path. $P^*$ can have maximal stretch if there is no Voronoi node among the
intermediate nodes and the corresponding Voronoi nodes have maximal distance (while still
satisfying the Voronoi condition).
Lemma 44. Given a simple path $P = (s, u_1, \ldots, u_{h-1}, t)$ between two Voronoi nodes $s = u_0$
and $t = u_h$ with $h$ edges and length $\ell(P)$, the corresponding Voronoi path $P^*$ in the Voronoi dual
$G^*$ has at most length $\ell(P^*) \le h \cdot \ell(P)$. This upper bound is tight.
Proof. The path contains $h - 1$ intermediate nodes and $h$ edges and therefore passes through at
most $h + 1$ different Voronoi regions. Out of these, at most $h - 1$ regions are interfering regions,
meaning that the original shortest path does not lead through the corresponding Voronoi nodes but
the shortest Voronoi path does. The path length $\ell(P)$ in the original graph is the sum of the edge
weights, $\ell(P) := d(s, t) = \sum_{k=0}^{h-1} w(u_k, u_{k+1})$. The length
$d^*(v_{\mathrm{vor}(u_k)}, v_{\mathrm{vor}(u_{k+1})})$ of an edge
between two Voronoi nodes on the path $P^*$ can be bounded as follows (see Figure 6.4):
$$d^*(v_{\mathrm{vor}(u_k)}, v_{\mathrm{vor}(u_{k+1})}) \le d(v_{\mathrm{vor}(u_k)}, u_k) + w(u_k, u_{k+1}) + d(u_{k+1}, v_{\mathrm{vor}(u_{k+1})}).$$
From the Voronoi condition, we observe that $d(u_k, v_{\mathrm{vor}(u_k)}) \le d(u_k, v_{\mathrm{vor}(u_j)})$ for all $j$. Due to
the assumption that $s$ and $t$ are also Voronoi nodes, this also holds for source and target. That is,
$$d(v_{\mathrm{vor}(u_k)}, u_k) = d(u_k, v_{\mathrm{vor}(u_k)}) \le d(s, u_k) \quad\text{and}\quad d(u_k, v_{\mathrm{vor}(u_k)}) \le d(u_k, t).$$
This yields:
$$\begin{aligned}
d^*(s, t) \le \ell(P^*) &\le d^*(s, v_{\mathrm{vor}(u_1)})
  + \sum_{k=1}^{h-2} \Big( d(v_{\mathrm{vor}(u_k)}, u_k) + w(u_k, u_{k+1}) + d(u_{k+1}, v_{\mathrm{vor}(u_{k+1})}) \Big)
  + d^*(v_{\mathrm{vor}(u_{h-1})}, t) \\
&\le w(s, u_1) + d(u_1, v_{\mathrm{vor}(u_1)})
  + \sum_{k=1}^{h-2} \Big( d(v_{\mathrm{vor}(u_k)}, u_k) + d(u_{k+1}, v_{\mathrm{vor}(u_{k+1})}) \Big)
  + \sum_{k=1}^{h-2} w(u_k, u_{k+1}) \\
&\qquad + d(v_{\mathrm{vor}(u_{h-1})}, u_{h-1}) + w(u_{h-1}, t) \\
&\le d(s, t) + \sum_{k=1}^{h-1} \Big( d(s, u_k) + d(u_k, t) \Big) = h \cdot \ell(P).
\end{aligned}$$
The last inequality uses that the edge weights sum to $d(s, t)$ and that each term
$d(u_k, v_{\mathrm{vor}(u_k)})$ occurs exactly twice, so that
$2\, d(u_k, v_{\mathrm{vor}(u_k)}) \le d(s, u_k) + d(u_k, t)$ can be applied for every $k$.
There exist constructions for which the bound can be shown to be tight. For example, for any
choice of $a > \epsilon > 0$, the edge weights of $G$ may be chosen such that
$d(u_k, v_{\mathrm{vor}(u_k)}) = a - \epsilon$, $w(u_k, u_{k+1}) = \epsilon$, and
$w(s, u_1) = w(u_{h-1}, t) = a$. Path $P$ has length $2a + (h - 2)\epsilon$, and the
Voronoi path $P^*$ has length $2a + (h - 2)\epsilon + 2(h - 1)(a - \epsilon)$. The worst case is attained for very
small $\epsilon$. As $\epsilon \to 0$, the ratio $\ell(P^*)/\ell(P) \to h$.
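The tightness construction can be checked numerically. The following Python sketch evaluates the two path lengths stated in the proof for concrete parameter values; the function name `tightness_ratio` is hypothetical.

```python
def tightness_ratio(h, a, eps):
    """Ratio l(P*) / l(P) for the tight instance with parameters a > eps > 0."""
    path_len = 2 * a + (h - 2) * eps                   # length of P
    voronoi_len = path_len + 2 * (h - 1) * (a - eps)   # length of P*
    return voronoi_len / path_len
```

For fixed `h` and `a`, the ratio stays below `h` and approaches it as `eps` shrinks, in line with the limit stated in the proof.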
If in addition to the endpoints there are Voronoi nodes on the shortest path, the maximum
stretch is guaranteed to be smaller than the number of edges on the shortest path. In the following
lemma, we prove that the maximum stretch is proportional to the largest gap between Voronoi
nodes on the path. The proof is a simple composition of Lemma 44, and is supported by the
illustration in Figure 6.4.
Figure 6.3: $s$, $t$, and $v_i$ are Voronoi nodes. The shortest path from $s$ to $t$ leads through $u$, which
is in $v_i$'s Voronoi region (if $c < a$ and $c < b$), and paths in the Voronoi dual pass through $v_i$.
If the left-hand route is shorter than $a + b + 2c$, the shortest path in the Voronoi dual
$\mathrm{SP}_{G^*}$ takes the left-hand route, and the Voronoi sleeve $S$ does not contain $u$.
Figure 6.4: The shortest path between two Voronoi nodes $s$ and $t$ with $h - 1$ intermediate nodes
$u_1, \ldots, u_{h-1}$. The distance between two Voronoi nodes that are adjacent in the Voronoi dual is at
most $w^*(v_{\mathrm{vor}(u_k)}, v_{\mathrm{vor}(u_{k+1})}) \le d(v_{\mathrm{vor}(u_k)}, u_k) + w(u_k, u_{k+1}) + d(u_{k+1}, v_{\mathrm{vor}(u_{k+1})})$.
Lemma 45. Let $P = (v_i, u_1, \ldots, u_{h-1}, v_j)$ be a simple path of length $\ell(P)$ between two Voronoi
nodes $v_i = u_0$ and $v_j = u_h$. Let $\hat{h}$ denote the largest gap of $P$. The corresponding Voronoi path
$P^*$ in the Voronoi dual $G^*$ has at most length $\ell(P^*) \le \hat{h} \cdot \ell(P)$. This upper bound is tight.
Proof. Suppose there are $2 + \gamma$ Voronoi nodes $u_k = v_{\mathrm{vor}(u_k)}$ on the path. The remaining $h - 1 - \gamma$
nodes are non-Voronoi nodes. We cut the path $P$ into subpaths $P_k$ between Voronoi nodes. Let
$h_k$ denote the number of edges between two consecutive Voronoi nodes, which is the number
of edges of $P_k$. The Voronoi path is composed of $1 + \gamma$ segments $P_k$ between Voronoi nodes
($\bigcup_{k=0}^{\gamma} P_k = P$, $\sum_{k=0}^{\gamma} h_k = h$, $\forall k : h_k \le \hat{h}$). Composition of Lemma 44 leads to the
following bound on the path length:
$$\ell(P^*) \le \sum_{k=0}^{\gamma} h_k \cdot \ell(P_k) \le \sum_{k=0}^{\gamma} \max_{\kappa \in \{0, \ldots, \gamma\}} h_\kappa \cdot \ell(P_k) \le \hat{h} \cdot \ell(P).$$
Tightness can be shown with the same example as in the proof of Lemma 44.
Lemma 47 gives an upper bound on the expected size of the largest gap. We use the following
lemma by Szpankowski and Rego [SR90] concerning the maximum of geometric random
variables.
Lemma 46 (Szpankowski and Rego [SR90, eq. (2.6) and (2.12)]). Let $X_i$, $i = 1, 2, \ldots, n$ be a
set of i.i.d. random variables distributed according to the geometric distribution with parameter
$p$. That is, for every $i = 1, 2, \ldots, n$ and $k \in \mathbb{N}^+$,
$$\Pr[X_i = k] = (1 - p)^{k-1} p, \qquad E[X_i] = p^{-1}, \qquad E[X_i^2] = (2 - p)\, p^{-2}.$$
Let $M_n = \max\{X_1, X_2, \ldots, X_n\}$. The expected value of $M_n$ is
$$E[M_n] = -\sum_{k=1}^{n} (-1)^k \binom{n}{k} \frac{1}{1 - (1 - p)^k} = \lg_{1/(1-p)} n + O(1).$$
Lemma 47. In a path of length $h - 1$, where each node has been selected as a Voronoi node
independently at random with probability $p$, the longest sequence of non-Voronoi nodes is of expected
length at most $O(\lg_{1/(1-p)} h)$.
Proof. The path can be seen as a sequence of coin tosses, for which we want to bound the expected
length of the longest sequence of tails. This problem is known as the Longest Success-Run
[EMK97, Ch. 8.5]. We wish to bound the expectation of the maximum of $N$ independent
geometric random variables with probability $p$ and sum $h - 1 - N$ ($N$ itself being a random
variable).
To derive a bound on the expectation, we observe that by dropping the sum condition, and
by taking the maximum over $h \ge N$ random variables, the maximum value obtained can only
increase.
By Lemma 46, the expectation of the maximum of $h$ geometric random variables with
probability $p$ is known to be at most $O(\lg_{1/(1-p)} h)$.
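The logarithmic growth used in this proof can be illustrated numerically. The Python sketch below evaluates the expected maximum of $n$ i.i.d. geometric variables through the tail-sum identity $E[M_n] = \sum_{j \ge 0} \Pr[M_n > j]$ rather than through the alternating sum of Lemma 46, since the tail sum is better behaved in floating-point arithmetic. The function name and the truncation tolerance `tol` are assumptions of this sketch.

```python
def expected_max_geometric(n, p, tol=1e-12):
    """E[max of n i.i.d. geometric(p) variables on {1, 2, ...}], 0 < p < 1.

    Uses Pr[M_n > j] = 1 - (1 - q^j)^n with q = 1 - p and stops once the
    (monotonically decreasing) tail terms drop below tol.
    """
    q = 1.0 - p
    total = 0.0
    j = 0
    while True:
        term = 1.0 - (1.0 - q ** j) ** n
        if term < tol:
            break
        total += term
        j += 1
    return total
```

For `n = 1` the function reduces to the mean `1/p`, and for larger `n` the value differs from $\lg_{1/(1-p)} n$ only by a bounded additive term, as Lemma 46 states.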
We now combine Lemmas 44, 45, and 47 to prove Theorem 43.
Proof of Theorem 43. Consider first the case where $s$ and $t$ are both Voronoi nodes.
Let $\hat{h}$ denote the largest gap of some shortest path $\mathrm{SP}_G(s, t)$. Lemma 45 implies that the
corresponding Voronoi path $(\mathrm{SP}_G(s, t))^*$ has length at most $\hat{h} \cdot \ell(\mathrm{SP}_G(s, t))$. Trivially, the
shortest path in the Voronoi dual is of length no more than that of the Voronoi path; that is,
$\ell((\mathrm{SP}_G(s, t))^*) \ge \ell(\mathrm{SP}_{G^*}(s, t))$. The path $\mathrm{SP}_{G^*}(s, t)$ in the Voronoi dual corresponds to a path
$P'$ of the same length in the Voronoi sleeve $\mathrm{Sleeve}(\mathrm{SP}_{G^*}(s, t))$. Therefore,
$$\ell(\mathrm{SP}_S(s, t)) \le \ell(P') = \ell(\mathrm{SP}_{G^*}(s, t)) \le \ell((\mathrm{SP}_G(s, t))^*) \le \hat{h} \cdot \ell(\mathrm{SP}_G(s, t)).$$
Recall that nodes are independently selected as Voronoi nodes with sampling rate $p$. For a shortest
path with $h$ edges, the expected largest gap $\hat{h}$ is at most $O(\lg_{1/(1-p)} h)$ by Lemma 47.
For the case where either $s$ or $t$ (or both) are not Voronoi nodes, if the path returned by
Algorithm 7 has been found in Step 1, it is optimal, and the result holds trivially. For the
remainder of the proof we assume that the shortest path has not been found in Step 1. In this
case, the path returned is at most as long as the shortest path $P_{\mathrm{vor}}$ in $G$ from $s$ to $t$ having
$\mathrm{SP}_{\mathrm{Sleeve}(\mathrm{SP}_{G^*}(v_{\mathrm{vor}(s)}, v_{\mathrm{vor}(t)}))}(v_{\mathrm{vor}(s)}, v_{\mathrm{vor}(t)})$ as a subpath. In the following, we derive an upper
bound on $\ell(P_{\mathrm{vor}})$ with respect to the number of edges on the shortest path between $s$ and $t$, denoted
by $h'$. We have that
$$\ell(P_{\mathrm{vor}}) \le d(s, v_{\mathrm{vor}(s)}) + d^*(v_{\mathrm{vor}(s)}, v_{\mathrm{vor}(t)}) + d(v_{\mathrm{vor}(t)}, t).$$
Since the shortest path from $s$ to $t$ has not already been found directly in Step 1, it must be true that
both $d(s, v_{\mathrm{vor}(s)}) \le d(s, t)$ and $d(t, v_{\mathrm{vor}(t)}) \le d(s, t)$. It remains to bound the distance between
$v_{\mathrm{vor}(s)}$ and $v_{\mathrm{vor}(t)}$ in the dual graph.
Observe that augmenting the graph $G$ with one edge $(u, v_{\mathrm{vor}(u)})$ of weight $d(u, v_{\mathrm{vor}(u)})$ for
each non-Voronoi node $u \in V \setminus K$ affects neither the Voronoi diagram nor the Voronoi dual, since
the nodes on the shortest path from $v_{\mathrm{vor}(u)}$ to $u$ cannot be interfered with by another Voronoi node.
In the augmented primal graph, by the triangle inequality (for details, see Lemma 31), we have
that
$$d(v_{\mathrm{vor}(s)}, v_{\mathrm{vor}(t)}) \le d(v_{\mathrm{vor}(s)}, s) + d(s, t) + d(t, v_{\mathrm{vor}(t)}) \le 3\, d(s, t)$$
using a path with at most $1 + h' + 1$ edges. Therefore, the expected distance $d^*(v_{\mathrm{vor}(s)}, v_{\mathrm{vor}(t)})$ is
also bounded by $O(\lg h') \cdot 3\, d(s, t)$. The bound for $P_{\mathrm{vor}}$ follows directly.
This concludes the proof of Theorem 43.
6.6 Experiments
In the following, we provide an experimental evaluation for our implementation of the Voronoi
shortest path approximation method. The preprocessing and query times are compared with those
of Dijkstra's algorithm and with those of related but exact methods.
6.6.1 Algorithms
Benchmarking
As the methods in our study were developed and compiled on different computers and
architectures, a direct comparison with reported query times would not be meaningful. We measure the
performance of the methods against the bidirectional version of Dijkstra's algorithm, in terms of
the ratio of the number of nodes settled by Dijkstra's algorithm over the number of nodes settled
by the Voronoi method. This ratio, which we will refer to as the speedup of the method, can be
used to evaluate the performance of Steps 1, 2, and 4 of Algorithm 7. In addition, we count the
number of marked regions to account for Step 3.
The use of the Voronoi sleeve in Steps 3 and 4 of Algorithm 7 leads to practical improvements
in accuracy; however, the example in Figure 6.3 shows that for general graphs the worst-case
stretch does not improve. For all the experiments, we evaluate the method once using the
refinement step and once with these Voronoi sleeve steps omitted. For the second type of queries, the
reported distance is the sum of the distances from the query source to the Voronoi source, from the
Voronoi source to the Voronoi target, and from the Voronoi target to the query target, as computed
in Steps 1 and 2 of Algorithm 7.
Voronoi method
Our method using the Voronoi dual can be parameterized using the sampling probability $p$, the
value of which determines the tradeoff between approximation quality and speedup. For the
evaluation, we consider three values of the sampling probability, $p = 1/2$, $p = n^{-1/2}$, and $p = n^{-2/3}$,
that produce Voronoi node sets of expected sizes $n/2$, $\sqrt{n}$, and $\sqrt[3]{n}$ respectively. The variants
are referred to as VORHALF, VORROOT, and VORCUBERT.
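As a quick sanity check of these expected sizes, the following Python sketch evaluates $p \cdot n$ for the three sampling rates. The function name is hypothetical, and the value of `n` in the usage note is an arbitrary example rather than one of the data sets.

```python
def expected_site_counts(n):
    """Expected number of Voronoi nodes, p * n, for the three variants."""
    return {
        "VORHALF": n * 0.5,                   # p = 1/2
        "VORROOT": n * n ** -0.5,             # p = n^(-1/2)
        "VORCUBERT": n * n ** (-2.0 / 3.0),   # p = n^(-2/3)
    }
```

For a graph with one million nodes this gives roughly 500,000, 1,000, and 100 Voronoi nodes respectively, which shows how quickly the dual shrinks as the sampling rate decreases.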
Other methods
Sanders and Schultes [SS07a, Table 1] provide a detailed overview of methods for accelerated
point-to-point shortest path queries in road networks. Bauer et al. [BDS+08, p. 13] list another set
of methods and compare their performance on several transportation networks. We select some of
the fastest methods for comparison with our algorithm. Unless stated otherwise, we will use the
naming conventions of [SS07a, BDS+08] to refer to these methods. We give a brief description of
the methods considered. More background information can be found in Section 3.2.3.
• Highway Hierarchies (HH) [SS06] are based on the observation that a certain class of edges
  (the highway edges) tend to have greater representation among the portion of the shortest
  paths that are not in the vicinity of either the source or target. A recursive computation
  of these edges, paired with a contraction step, leads to a hierarchy of graphs that enables
  an impressive speedup at query time. HH+dist denotes a variant of HH where all higher
  levels with at most $O(\sqrt{n})$ nodes are replaced by a single distance table. HH+dist+A* is
  HH combined with A* search and implemented with distance tables [DSSW06]. Highway
  Node Routing (HNR) [SS07b] is another variant of the Highway Hierarchies strategy.
• In the same spirit as HH, Transit Node Routing (TNR) [BFM+07] identifies a set of nodes
  (called transit nodes) that often occur on shortest paths. A table storing the distances
  between all pairs of these nodes allows any shortest path distance to be computed with a
  small number of table look-ups. Two variants are listed: TNR-eco with economical space
  consumption, and TNR-gen with generous space consumption.
• The Arc-Flag method [Lau04] computes a partition of the graph and then, for each
  component and for each shortest path ending in that component, it labels the first edge. A variant of
  this method, SHARC [BD08], incorporates techniques developed for Highway Hierarchies.
• Contraction Hierarchies (CHHNR) [GSSD08] is an extension of highway hierarchies in
  which the graph is further simplified using contraction operations. Many variants have been
  proposed; we consider only the variant with the fastest preprocessing time, CHHNREDS1235,
  and the variant with the best speedup, CHHNREVSQWL. The CHASE method [BDS+08]
  integrates the Contraction Hierarchies and Arc-Flag methods.
• A method based on A* search by Goldberg and Harrelson [GH05], which we will refer
  to as simply A*, is one of the first methods with reasonable preprocessing time and good
  speedup.
• ALT-m16 [DW07] is a variant of ALT [GW05], which in turn is a combination of A*,
  Landmarks, and speedup techniques based on the triangle inequality. CALT-m16 and
  CALT-a64 [BDS+08] are two variants of a method that combines ALT and Contraction
  Hierarchies.
6.6.2 Data Sets
For the sake of comparison, we consider transportation networks that were used by Sanders and
Schultes [SS07a] and Bauer et al. [BDS+08, BD08] in their evaluations. In addition, to
demonstrate that our method is effective for more general graphs, we run experiments with a social
network, a citation graph, a router network, and protein interaction networks as data sets. The
node degrees of these graphs appear to follow a power-law distribution [Mit03].
Road networks
The road network of Western Europe has been made available for scientific use by the company
PTV AG. It covers 14 countries and, with its massive size of 18,010,173 nodes and 42,560,279
directed edges, it serves as an important benchmark for shortest path queries. In order to apply
the Voronoi method, we convert the graph into an undirected form. There are two different edge
weightings, one representing geographical distances and the other representing driving time. We
conduct experiments for both.
Public transportation
We also conduct experiments for three European public transportation networks: (1) long railway
connections in Europe, with 1,586,862 nodes and 2,402,352 directed edges, (2) the bus network of
the Rhein-Main-Verkehrsverbund RMV, with 2,278,066 nodes and 3,417,084 directed edges, and
(3) the bus network of the Verkehrsverbund Berlin Brandenburg VBB, with 2,600,818 nodes and
3,901,212 directed edges. The graphs considered by [BDS+08, BD08] differ slightly from those
used for experimentation with the Voronoi method.
The numbers of nodes and edges of the RMV and VBB input graphs are nearly identical;
however, the long railway graph used in our experimentation has 33% more nodes and edges than
in [BDS+08, BD08]. Again, for the Voronoi experimentation, the graphs were converted into an
undirected form.
Social networks
We extracted the DBLP computer science bibliography [Ley02] co-author graph from an official
XML version downloaded on 24 August 2008. In the graph, two authors are connected by an
edge if they have at least one joint publication. This yielded an undirected graph, from which we
selected the largest connected component. The final graph is unweighted and consists of 511,163
nodes and 1,871,070 edges.
Router topology
CAIDA maintains data on the router-level topology of a portion of the Internet [Coo03]. After
cleaning we obtained an undirected, unweighted graph with 190,914 nodes and 607,610 edges.
Citation graph
The citations for 27,400 publications in the high energy physics research literature were used as
a data set in the KDD Cup 2003 competition [GSDF03]. From these citations, we constructed an
undirected, unweighted graph with 352,542 edges.
Protein interactions
The Database of Interacting Proteins [SMS+02] catalogs experimentally determined interactions
among proteins. We extracted the largest connected component, consisting of 19,928 nodes and
82,406 edges. BioGRID is a general repository for interaction data sets [SBR+06] from which we
extracted the largest connected component, consisting of 4,039 nodes and 43,854 edges.
6.6.3 Experimental Setting
In this section we describe the experimental setting for the Voronoi method. The implementation
is written in C++ and executed on one core of a 2x2.66 GHz Dual-Core Intel Xeon Desktop with
6 GB 800 MHz DDR2 FB-DIMM running Mac OS X 10.5.6.
Every graph was preprocessed 1,000 times using different random seeds (250 times for the
European road networks). For these runs we report the mean value and standard deviation of
the execution time in seconds. After preprocessing, we performed 100 shortest path queries for
random (s, t) pairs. For these queries, we provide the mean values and standard deviations of
the speedup relative to the bidirectional version of Dijkstra's algorithm, and of the multiplicative
stretch relative to a shortest path.
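The reported statistics can be illustrated with a small sketch. It is not the thesis implementation (which is written in C++); the function names and the three sample measurements are hypothetical, and the use of the sample (rather than population) standard deviation is an assumption, since the text does not say which one was used.

```python
import statistics

def multiplicative_stretch(approx_length, exact_length):
    """Ratio of approximate to exact path length (1.0 means optimal)."""
    if exact_length <= 0:
        raise ValueError("exact length must be positive (s and t must differ)")
    return approx_length / exact_length

def summarize(samples):
    """Return (mean, sample standard deviation) of a list of measurements."""
    mean = statistics.mean(samples)
    # Assumption: sample standard deviation; undefined for a single sample.
    std = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return mean, std

# Hypothetical (approximate, exact) path lengths for three (s, t) queries.
lengths = [(12.0, 10.0), (7.0, 7.0), (33.0, 30.0)]
stretches = [multiplicative_stretch(a, e) for a, e in lengths]
mean_stretch, std_stretch = summarize(stretches)
```

The same `summarize` helper would apply to per-query speedups (baseline query time divided by the query time of the method under test) and to preprocessing times over repeated seeds.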
6.6.4 Results
Computers are useless. They can only give you answers.
Pablo Picasso (1881–1973)
Running times, speedups, and approximation qualities for the Voronoi method are listed in
Table 6.2 for transportation data sets and in Table 6.3 for data sets concerning complex networks.
The performances of the other methods are listed in Tables 6.1 and 6.4 as originally summarized
in [SS07a, GSSD08, BDS+08].
Preprocessing
For the Voronoi method, as Lemma 38 predicts, the preprocessing cost is extremely low for all
three values of p. For the non-planar graphs, the greatest preprocessing times were observed for
the largest value, p = 1/2. This likely reflects the logarithmic cost of the heap operations associated
with the computation of Voronoi regions. At the start of the Dijkstra search, the heap is initialized
with all neighbors of the graph Voronoi nodes. When p is large, the initial heap size is a large
proportion of the total number of nodes, and the cost of the heap operations becomes significant.
On the other hand, when p and the average node degree are both small, the heap evolves smoothly
with its size remaining small.
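The role of the heap can be seen in a minimal Python sketch of the region computation as a multi-source Dijkstra search. This is an illustration under stated assumptions, not the C++ implementation evaluated here: the names `graph_voronoi` and `voronoi_dual` are hypothetical, the heap is seeded with the Voronoi nodes themselves at distance 0 (their neighbors enter the heap as soon as these are popped), every node is assumed to be reachable from some Voronoi node, and the dual edge weight is taken to be the shortest connection through a single edge crossing two regions, in line with the bound in the caption of Figure 6.4.

```python
import heapq

def graph_voronoi(adj, voronoi_nodes):
    """Multi-source Dijkstra: assign each node to its closest Voronoi node.

    adj maps node -> list of (neighbor, weight); undirected, weights >= 0.
    Nodes must be mutually comparable (used to break ties in the heap).
    Returns (owner, dist): the closest Voronoi node and the distance to it.
    """
    dist = {v: 0.0 for v in voronoi_nodes}
    owner = {v: v for v in voronoi_nodes}
    heap = [(0.0, v, v) for v in voronoi_nodes]  # (distance, node, owner)
    heapq.heapify(heap)
    settled = set()
    while heap:
        d, u, site = heapq.heappop(heap)
        if u in settled:
            continue  # stale heap entry
        settled.add(u)
        for v, w in adj[u]:
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                owner[v] = site
                heapq.heappush(heap, (nd, v, site))
    return owner, dist

def voronoi_dual(adj, owner, dist):
    """Dual graph: one weighted edge per pair of adjacent Voronoi regions."""
    dual = {}
    for u in adj:
        for v, w in adj[u]:
            a, b = owner[u], owner[v]
            if a == b:
                continue
            key = (a, b) if a < b else (b, a)
            length = dist[u] + w + dist[v]  # site(u) -> u -> v -> site(v)
            if length < dual.get(key, float("inf")):
                dual[key] = length
    return dual

# Path 0 - 1 - 2 - 3 - 4 with edge weights 1, 1, 2, 1; Voronoi nodes 0 and 4.
adj = {0: [(1, 1)], 1: [(0, 1), (2, 1)], 2: [(1, 1), (3, 2)],
       3: [(2, 2), (4, 1)], 4: [(3, 1)]}
owner, dist = graph_voronoi(adj, [0, 4])
dual = voronoi_dual(adj, owner, dist)
```

With many Voronoi nodes (large p), the initial heap already holds a large fraction of the graph, which is the effect described above.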
Speedup
For road networks VORHALF achieves moderate speedups of approximately 2, which likely reflects
the fact that the expected number of nodes of the Voronoi dual is half that of the original
graph. For the power-law graphs, probability p = 1/2 does not lead to a significant speedup. One
reason for this might be that the Voronoi dual for each of these graphs is rather dense and, as a
consequence, the Dijkstra search in the dual explores many nodes until it can find the destination.
For the smaller probabilities, larger speedups can be observed, but the performance gain is significantly
smaller² than the speedups obtained for almost planar networks. There, the speedup seems
proportional to 1/p. As expected, if for small values of p the sleeve is used to refine the path, the
speedup decreases drastically due to the large size of this subgraph.
[Figure 6.5 plot. x-axis: preprocessing time; y-axis: speedup. Plotted methods: VorHalf, VorRoot, VorCubeRt, TNR, CHHNR, A*, HH, HNR.]
Figure 6.5: Preprocessing time versus speedup (with respect to Dijkstra's algorithm) tradeoff for
the European road network. Plot on a doubly-logarithmic scale, y-axis reversed. Circles stand
for variants of the Voronoi method and for the related exact methods listed in Table 6.1. Transit-
Node Routing (TNR) has the best speedup and the slowest preprocessing. Contraction Hierarchies
(CHHNR) and Highway Hierarchies (HH) achieve a very good tradeoff between preprocessing
and query times. A* has short preprocessing times but rather low speedup. The Voronoi method
(Chapter 6) has the fastest preprocessing times with competitive speedups at the cost of exactness.
² Cheng and Yu [CY09] use 2-hop labels [CHKZ03] to efficiently compute exact distances. They obtain a better
speedup at the expense of a significantly longer preprocessing. For a 10 times smaller DBLP graph with 52,682 nodes
and 59,395 edges, their preprocessing step takes 20 seconds [CY09, Table 1]. At query time it outperforms Dijkstra's
algorithm by two orders of magnitude [CY09, Figure 17 and Section 7.4].
Stretch
The Voronoi method achieved stretch values that were surprisingly consistent among different data
sets, with most values under 2 and very close to optimal for the road networks. Figures 6.6 and 6.7
show the approximate path length versus the shortest path length, with and without the sleeve
refinement steps, respectively. The theoretical worst-case logarithmic dependency on the number
of edges cannot be observed in the experimental results. Refinement using the sleeve substantially
improves the stretch in practice, although the theoretical performance is not affected.
PTV European road network, driving time
method                   preprocessing [s]   speedup
CHHNREDS1235 [GSSD08]    602                 8,505
A* [GH05]                780                 28
HH [SS06]                780                 4,002
HH+dist [SS06]           900                 8,320
HH+dist+A* [DSSW06]      1,320               11,496
HNR [SS07b]              1,440               4,079
CHHNREVSQWL [GSSD08]     1,914               10,874
TNR-eco [BFM+07]         2,760               471,881
TNR-gen [BFM+07]         9,840               1,129,143
Table 6.1: Experimental results of related exact shortest path query methods for road networks.
This table is excerpted from Sanders and Schultes [SS07a, Table 1] except for CHHNR values,
which are from [GSSD08, Table 1]. Preprocessing times are converted from minutes to seconds
to ease comparison with the Voronoi method. Machines used (except for A*): 2.0 or 2.6 GHz
processor, 8 or 16 GB RAM, C++ implementation.
method     preprocessing [s]  without sleeve                              with sleeve
                              speedup                    stretch          speedup          stretch
PTV European road network, driving time, 18,010,173 nodes, 42,560,279 edges
VORHALF    31.7686 ± 4.4436   2.6061 ± 0.0734            1.0394 ± 0.0131  2.5878 ± 0.0750  1.0111 ± 0.0062
VORROOT    40.5296 ± 3.6423   3,518.0645 ± 725.2776      1.6613 ± 0.2078  4.9991 ± 4.9017  1.1291 ± 0.0783
VORCUBERT  31.3372 ± 2.8181   39,918.4988 ± 14,207.5395  1.5544 ± 0.4292  1.5863 ± 1.1123  1.0405 ± 0.0597
PTV European road network, geographical distance, 18,010,173 nodes, 42,560,279 edges
VORHALF    29.8365 ± 4.3576   2.6266 ± 0.0558            1.0307 ± 0.0095  2.5800 ± 0.0627  1.0139 ± 0.0057
VORROOT    34.2785 ± 3.0609   3,672.4070 ± 511.1418      1.1821 ± 0.0960  5.9212 ± 7.9921  1.0390 ± 0.0249
VORCUBERT  22.5531 ± 2.0284   42,266.6442 ± 13,530.5983  1.2882 ± 0.5384  1.6383 ± 1.4232  1.0141 ± 0.0291
Public transportation, long distance railway, 1,586,862 nodes, 2,402,352 edges
VORHALF    2.0499 ± 0.1998    1.9511 ± 0.1231            1.0180 ± 0.0227  1.8972 ± 0.1367  1.0080 ± 0.0143
VORROOT    1.9086 ± 0.0946    363.8390 ± 153.4644        1.3813 ± 0.2848  2.8527 ± 3.3113  1.0829 ± 0.0971
VORCUBERT  1.7633 ± 0.0860    2,116.0373 ± 1,251.1773    1.5167 ± 0.6610  1.2599 ± 0.5990  1.0247 ± 0.0658
Public transportation, RMV, 2,278,066 nodes, 3,417,084 edges
VORHALF    3.7714 ± 0.4064    1.9892 ± 0.1813            1.0290 ± 0.0255  1.9315 ± 0.1766  1.0104 ± 0.0131
VORROOT    3.7455 ± 0.2158    789.2912 ± 328.2714        1.2972 ± 0.2591  3.1802 ± 5.4237  1.0644 ± 0.0864
VORCUBERT  3.4120 ± 0.1633    5,973.7950 ± 3,748.1389    1.3522 ± 0.6003  1.3089 ± 0.9703  1.0204 ± 0.0583
Public transportation, VBB, 2,600,818 nodes, 3,901,212 edges
VORHALF    4.1409 ± 0.4180    1.9881 ± 0.6476            1.0335 ± 0.0248  1.9313 ± 0.5172  1.0075 ± 0.0097
VORROOT    4.0242 ± 0.2914    866.8917 ± 405.4821        1.4042 ± 0.2516  3.7864 ± 7.6010  1.0834 ± 0.1000
VORCUBERT  3.7145 ± 0.2333    7,373.2971 ± 4,742.2783    1.4375 ± 3.3690  1.3427 ± 1.2759  1.0244 ± 0.0660
Table 6.2: Experimental results for the Voronoi method on transportation networks.
method     preprocessing [s]  without sleeve                        with sleeve
                              speedup              stretch          speedup            stretch
DBLP co-authorship, 511,163 nodes, 1,871,070 edges
VORHALF    0.9145 ± 0.0431    1.3576 ± 1.4690      1.2093 ± 0.1805  1.3447 ± 1.4364    1.1419 ± 0.1468
VORROOT    0.8376 ± 0.0430    37.7082 ± 53.2992    1.9323 ± 0.3591  11.4432 ± 14.8387  1.3954 ± 0.2850
VORCUBERT  0.6041 ± 0.0312    143.8757 ± 208.7946  2.0033 ± 0.3630  9.9616 ± 12.5412   1.2881 ± 0.2406
CAIDA router topology, 190,914 nodes, 607,610 edges
VORHALF    0.3050 ± 0.0154    1.3164 ± 1.1720      1.1810 ± 0.1703  1.2972 ± 1.1074    1.1283 ± 0.1359
VORROOT    0.1793 ± 0.0092    42.4832 ± 54.6527    1.7845 ± 0.3533  7.8865 ± 8.8062    1.2345 ± 0.2175
VORCUBERT  0.1562 ± 0.0081    135.5521 ± 188.9479  1.8314 ± 0.3755  6.0451 ± 7.1000    1.1621 ± 0.1837
High energy physics citations, 27,400 nodes, 352,542 edges
VORHALF    0.1764 ± 0.0100    1.6620 ± 1.2240      1.3179 ± 0.2909  1.6452 ± 1.1544    1.2107 ± 0.2323
VORROOT    0.0611 ± 0.0043    40.1114 ± 21.9262    1.9918 ± 0.4695  11.5248 ± 7.9582   1.3390 ± 0.3286
VORCUBERT  0.0461 ± 0.0032    101.9210 ± 58.6233   2.0330 ± 0.4852  9.0423 ± 7.5795    1.2325 ± 0.2750
Database of Interacting Proteins, 19,928 nodes, 82,406 edges
VORHALF    0.0117 ± 0.0007    2.2044 ± 1.0637      1.1887 ± 0.2188  2.1248 ± 1.0093    1.1183 ± 0.1778
VORROOT    0.0108 ± 0.0007    57.7343 ± 45.7341    1.8214 ± 0.4084  9.1154 ± 6.0720    1.3216 ± 0.3030
VORCUBERT  0.0096 ± 0.0006    134.4816 ± 106.4737  1.9277 ± 0.4444  6.2541 ± 3.8117    1.2644 ± 0.2703
BioGRID, 4,039 nodes, 43,854 edges
VORHALF    0.0035 ± 0.0002    1.5086 ± 0.8003      1.2581 ± 0.2718  1.3722 ± 0.6858    1.1334 ± 0.1973
VORROOT    0.0025 ± 0.0001    10.7295 ± 7.9563     1.8676 ± 0.5737  3.0394 ± 1.9172    1.2753 ± 0.3354
VORCUBERT  0.0024 ± 0.0001    18.6805 ± 14.7570    1.9412 ± 0.6250  2.7906 ± 1.7177    1.2308 ± 0.3137
Table 6.3: Experimental results for the Voronoi method on complex networks.
long distance rail RMV VBB
|V| 1,192,736 2,277,812 2,599,953
|E| 1,789,088 3,416,552 3,899,807
preprocessing [s] speedup preprocessing [s] speedup preprocessing [s] speedup
CALT-a64 [BDS+08] 87 291.84 191 267.11 123 459.30
CALT-m16 [BDS+08] 158 182.71 377 159.62 174 281.23
ALT-m16 [DW07] 291 20.30 556 18.91 604 23.04
CHHNR [GSSD08] 286 1,620.62 2,584 2,077.69 1,636 3,124.59
CHASE [BDS+08] 536 2,660.93 2,863 4,649.26 2,008 10,398.64
SHARC [BD08] 12,540 81.04 36,120 118.10
Table 6.4: Experimental results of related exact shortest path query methods for public transportation
networks. This table is excerpted from Bauer et al. [BDS+08, p. 13]. SHARC is evaluated in
[BD08, p. 10]. The speedup is computed according to the number of settled nodes. Machines used:
2.0 or 2.6 GHz processor, 8 or 16 GB RAM, C++ implementation.
6.7 Conclusion and Open Problems
We have presented a simple and general method based on Voronoi duals to efficiently support
shortest path queries in undirected graphs with very low preprocessing overheads and competitive
query times, at the cost of exactness. The method was shown to be effective on a variety of graph
types while remaining a reasonable alternative to existing exact methods specifically designed for
transportation networks. The results of our experiments also demonstrate that the approximation
ratio in practice is significantly better than the tight theoretical worst-case bound (Theorem 43).
The maximal distortion of paths in the graph Voronoi dual depends on the distance between nodes
in the original graph, unlike Delaunay triangulations of the Euclidean plane, which have constant
distortion [DFS90, KG92].
Although the Voronoi method is intended mainly as a practical one (and thus kept as simple
as possible to facilitate efficient implementation), let us conclude with a few remarks concerning
its theoretical performance. If logarithmic stretch is allowed, the distance
oracle of Thorup and Zwick [TZ05] achieves quasi-linear space $O(n^{1+1/\lg n} \lg n) = O(n \lg n)$
and query time O(lg n). With the same worst-case stretch, the query time of the Voronoi method
is much worse. However, note that the tradeoff between stretch and query time is fundamentally
different from Thorup and Zwick's result: in the Voronoi method the long query time helps to
compute a better approximation with a better stretch. The quantitative tradeoff between query time
and stretch is probably far from optimal, but intuitively this is the right relationship: more time
should in general yield better results. Note that, in theory, the Voronoi method is also outperformed
by a distance oracle based on Bourgain's embedding [Bou85] (Theorem 3) and the distance oracle
of Mendel and Naor [MN06]. In practice, however, the Voronoi method provides much better
stretch than the worst-case bound predicts.
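The quasi-linear space bound quoted above can be checked in one line. The derivation below assumes the standard parameterization of the Thorup-Zwick oracle (space $O(k\,n^{1+1/k})$, stretch $2k-1$, query time $O(k)$) with the choice $k = \lg n$; the parameter $k$ is not named in this section and is introduced here only for illustration.

```latex
% Requires amsmath. With k = \lg n (logarithm to base 2):
\[
  n^{1/\lg n} = \bigl(2^{\lg n}\bigr)^{1/\lg n} = 2
  \quad\Longrightarrow\quad
  k \, n^{1+1/k}\Big|_{k=\lg n}
  = n \cdot n^{1/\lg n} \cdot \lg n
  = 2\,n \lg n
  = O(n \lg n).
\]
```

Under the same assumption, the stretch $2k - 1 = 2\lg n - 1$ is logarithmic and the query time $O(k) = O(\lg n)$, which matches the figures stated in the paragraph above.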
It remains open whether the Voronoi method presented in this chapter can be extended
to handle directed graphs. The nature of the Voronoi dual within a directed graph is inherently
different from the dual within an undirected graph. The need for path connectivity suggests the
construction of two Voronoi diagrams, one where reachability paths are oriented outward from
Voronoi nodes and another where reachability paths are oriented inward. As the respective Voronoi
regions may not coincide [Erw00], it is not straightforward to define a single dual structure whose
shortest path lengths approximate those of the original graph.
A natural extension is the computation of a hierarchical structure of Voronoi duals, where the
Voronoi nodes are chosen through recursive sampling. At a given level of the hierarchy, shortest
path queries within the Voronoi dual would be resolved by a recursive call one level higher in the
structure. Note that the union of all regions of different levels does not necessarily form a laminar
family [GA06]. The Voronoi sleeves may expand locally compared to the sleeve of a higher level.
A node that was not considered on a higher level of the hierarchy may be part of the sleeve in a
lower level.
At query time, the practical stretch can be improved; instead of considering the sleeve defined
by nodes on the Voronoi path only, one can also allow the query algorithm to search in neighboring
cells. Furthermore, instead of searching the path between one Voronoi source and one Voronoi
sink, one can compute paths between all Voronoi nodes close to the source and all Voronoi nodes
close to the sink. For both heuristics, the theoretical worst-case performance does not seem to
improve. However, these practical heuristics give useful parameters that may be tuned depending
on the needs of the application.
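The first heuristic can be sketched as a Dijkstra search restricted to a tunable set of Voronoi cells. The Python code below is an illustration only: the names `sleeve_nodes`, `restricted_dijkstra`, and the parameter `extra_cells` are hypothetical, and the four-node example is constructed in the spirit of Figure 6.3 (with a = b = 2, c = 1, and a direct s-t edge of length 5) rather than taken from the experiments.

```python
import heapq

def sleeve_nodes(owner, dual_path, extra_cells=()):
    """Nodes whose Voronoi cell lies on the dual path or in extra_cells.

    owner maps each node to its Voronoi node; dual_path lists the Voronoi
    nodes on the path found in the dual; extra_cells is the tunable part
    (for instance, cells neighboring the dual path).
    """
    cells = set(dual_path) | set(extra_cells)
    return {v for v, site in owner.items() if site in cells}

def restricted_dijkstra(adj, source, target, allowed):
    """Shortest source-target distance using only nodes in `allowed`."""
    if source not in allowed or target not in allowed:
        return float("inf")
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if v not in allowed:
                continue
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")

# Voronoi nodes s, t, v; node u lies in v's cell and on the true shortest path.
adj = {"s": [("u", 2), ("t", 5)], "u": [("s", 2), ("t", 2), ("v", 1)],
       "t": [("u", 2), ("s", 5)], "v": [("u", 1)]}
owner = {"s": "s", "t": "t", "v": "v", "u": "v"}
dual_path = ["s", "t"]  # the dual prefers the direct edge (5 < 3 + 3)

narrow = restricted_dijkstra(adj, "s", "t", sleeve_nodes(owner, dual_path))
wide = restricted_dijkstra(adj, "s", "t",
                           sleeve_nodes(owner, dual_path, extra_cells=["v"]))
```

In this example the plain sleeve yields a path of length 5, while admitting the neighboring cell of v recovers the exact distance 4, at the price of searching a larger subgraph.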
Figure 6.6: Approximate path length versus actual shortest path length for VORROOT with sleeve
steps omitted on the European road network, distance metric. The theoretical worst-case logarithmic
dependency on the number of edges cannot be observed in the experimental results.
Figure 6.7: Approximate path length versus actual shortest path length for VORROOT using the
sleeve on the European road network, distance metric. Refinement using the sleeve substantially
improves the stretch in practice (also compare with Figure 6.6), although the theoretical performance
is not affected.
MODUS VIVENDI
7 Conclusion
This thesis investigates shortest path query processing in networks both from a theoretical and a
practical point of view.
Theory. We advance the theory of distance oracles in two ways.
Thorup and Zwick [TZ05] left open the important question whether, for graphs with n nodes
and m edges, the O(m) space requirements of SSSP and the O(1) query time of APSP could be
combined to obtain a distance oracle with space O(m) and query time O(1). They prove that
graphs with large girth cannot be compressed below $\Omega(m)$. One result of this thesis is to prove
that there are graphs for which O(m) space is not sufficient to answer distance queries in constant
time. The result is actually stronger: for some graphs, O(m) space is not sufficient even if the
query algorithm is allowed to return approximate distances with a small distortion. To obtain an
efficient distance oracle for general sparse graphs, an additional data structure of considerable size
is necessary. Our proof implies that a distance oracle with query time t and stretch $(\alpha, 0)$ requires
space roughly $n^{1+\Omega(1/(\alpha t))}$.
Thorup and Zwick prove that, for many integer values of k, $\Omega(n^{1+1/k})$ space is necessary if
distances must be approximated by a factor less than 2k + 1. For special classes of sparse graphs,
this lower bound can be circumvented [FR06, Cab06, ACC+96, Dji96, DPZ95, DPZ00, GKR04,
Tho04a, Kle02a, Kle05, AG06, AFGW10, CZ00, CLSS98, GZ05, FK07, Spr07, Tal04, ABN08,
GLNS08, AGL07, SS09, SSA09], usually by using the inherent structure of graphs in the class.
One common approach is to use separators [LT80, AST90]. Complex networks, in particular
power-law graphs, lack structure and separators, so a different approach is necessary. Experimental
evidence [KFY04], however, shows that the general scheme by Thorup and Zwick with stretch
(3, 0) and space requirement $O(n^{3/2})$ works well for power-law graphs, meaning that the space
requirements are significantly lower than the tight theoretical bound would
predict. We make an attempt to bridge the gap between theory and practice by proving why the
space requirements of distance oracles for power-law graphs are indeed smaller. In networking,
there is a common and powerful heuristic to use nodes with large degrees for routing purposes.
We give a theoretical explanation why this heuristic is good for power-law graphs. Our analysis
establishes a direct connection between the exponent of the power-law degree distribution and the
space requirements of distance oracles.
Many theoretical questions remain open. Are there non-trivial distance oracles with additive
error? What is the optimal tradeoff for dynamic distance oracles? Are linear space, constant stretch,
and logarithmic query time achievable for sparse graphs? Is quadratic preprocessing time possible
(without increasing the query time)? Is there a linear-space distance oracle for planar graphs? Can
we obtain quasi-linear space and stretch less than (3, 0) for power-law graphs? Is there an efficient
distance oracle for certain sparse directed graphs?
The major open questions for the general shortest path problem may be more difficult. Is there
a combinatorial algorithm that computes APSP in truly sub-cubic time? For general weights, can
we compute SSSP faster than APSP? Can the SSSP problem be solved in linear time without using
specific properties of the word RAM model?
Practice. A third result of this thesis is an efficient and very simple practical method to answer
approximate shortest path queries. The Voronoi method works very well on practical network
instances. Compared to related practical methods, the preprocessing overheads are lower and the
query times remain competitive at the cost of exactness. The method is effective on a variety of
graph types. It is also a reasonable alternative to existing exact methods specifically designed for
transportation networks. For example, the Voronoi method has also been applied as a subroutine
in large-scale traffic simulations.
Practical challenges are manifold. What if a network changes? How can we take traffic [Ker04]
into account? In current navigation systems, for example, you can pick the fastest route, the shortest,
the one that avoids motorways, or a route that passes through or avoids a particular point.
Future devices will learn about a driver's preferences and adjust accordingly [Eco09]. For this,
efficient dynamic methods are necessary. How can we efficiently adapt the data structure?
Dynamic distance oracles are also important for social networking sites. The social graph is
changing constantly. At the same time, users and advertisers are interested in information on the
social graph.
Euler invented the graph model roughly 250 years ago, the single-source shortest path algo-
rithm of Dijkstra dates back 50 years, and Thorup and Zwick discovered the optimal distance
oracle slightly less than 10 years ago. The last a-bit-more-than-2 years of my research life have
been devoted to the investigation of shortest path query processing in networks. Some problems
have been solved, but the quest must go on.
PS: to finally answer the first question of the introduction: take the express train to Narita
airport and board a direct flight to Zurich, from where a train ride via Zurich main station brings
you to Affoltern, which is two bus stops away from Ottenbach.
List of Tables
2.1 Examples of minor-free graph classes.
2.2 Complex networks of different scales [ASBS00].
2.3 O-notation for the asymptotic behavior of functions f, g.
2.4 Results on Erdős' girth conjecture, overview from [TZ05, Table II]. Maximum
size of the edge set E for a graph G = (V, E) with |V| = n nodes and given girth
(length of a shortest cycle, Def. 17).
2.5 Running times for different implementations of Dijkstra's algorithm. W denotes
the largest integer weight. The table is in large parts excerpted from [Tho99,
p. 364]. The algorithms in the first two rows work for both the comparison/addition
model and the word RAM model. The analysis of the algorithms shown in row 3
and below only works in the word RAM model.
2.6 Combinatorial algorithms for the All Pairs Shortest Path problem.
3.1 Time and space complexities of distance oracles for general, undirected, unweighted
graphs (some upper bounds extend to weighted graphs). The upper part of the ta-
ble lists upper bounds; the lower part lists lower bounds (some restrictions on k
and apply). For the result by Baswana et al. [BGSU08], the stretch is taken
from [Sen09, Table 1]. Approximate distance oracles are included only if the
space requirement is at most o(n
2
). . . . . . . . . . . . . . . . . . . . . . . . . 44
3.2 Time and space complexities of distance oracles for undirected planar graphs
(some results extend to planar digraphs and/or minor-free graphs). denotes the
multiplicative stretch; denotes the diameter. . . . . . . . . . . . . . . . . . . 47
6.1 Experimental results of related exact shortest path query methods for road networks.
This table is excerpted from Sanders and Schultes [SS07a, Table 1] except
for CHHNR values, which are from [GSSD08, Table 1]. Preprocessing times are
converted from minutes to seconds to ease comparison with the Voronoi method.
Machines used (except for A*): 2.0 or 2.6 GHz processor, 8 or 16 GB RAM, C++
implementation.
6.2 Experimental results for the Voronoi method on transportation networks.
6.3 Experimental results for the Voronoi method on complex networks.
6.4 Experimental results of related exact shortest path query methods for public transportation
networks. This table is excerpted from Bauer et al. [BDS+08, p. 13].
SHARC is evaluated in [BD08, p. 10]. The speedup is computed according to the
number of settled nodes. Machines used: 2.0 or 2.6 GHz processor, 8 or 16 GB
RAM, C++ implementation.
List of Figures
1.1 The Königsberg bridges as depicted by Leonhard Euler in his article Solutio
Problematis ad Geometriam Situs Pertinentis on page 129 in volume 8 of Commentarii
Academiae Scientiarum Petropolitanae in 1741.
1.2 The shortest path tree starting at the node representing the city of Los Angeles.
Original by George B. Dantzig [Dan57]. Hence the optimal path is from Los
Angeles to Salt Lake City, then to Chicago, and finally to Boston.
2.1 Illustration of the network decomposition technique as originally depicted by Hu
and Torres [HT69, Figure 4].
3.1 The contractions for nodes of degrees 2, 3, and 4 (termed spider web transformations)
by van Vliet [Vli78, Figure 5].
4.1 The LOPSIDEDSETDISJOINTNESS communication problem.
4.2 A drawing of an undirected butterfly graph with degree 2 spanning 3 layers.
4.3 A drawing of an undirected butterfly graph with degree 3 spanning 3 layers.
4.4 Illustration of a reduction from a distance oracle to a communication protocol
for LOPSIDEDSETDISJOINTNESS. The protocol decides whether
$S_{\mathrm{Alice}} \cap S_{\mathrm{Bob}} \stackrel{?}{=} \emptyset$. G and F are known to both Alice and Bob. Details are
in the proof of Lemma 20.
5.1 The figure shows a plot of $f(\tau) = \frac{\tau - 2}{2\tau - 3}$ for $\tau \in (2, 3)$. For values of $\tau$ close to 2,
for example for $\tau = 2.1$, which is the exponent that fits the power-law distribution
well to the degree distribution of the actual Internet inter-domain graph [FFF99,
KFY04], our bound is $O(n^{13/12+\epsilon})$, which indicates that the adapted distance
oracle (and the adapted routing scheme) could be very effective on Internet-like
graphs.
5.2 The distance oracle of Thorup and Zwick [TZ05, Fig. 5] for stretch parameter k =
2, which guarantees stretch (3, 0). An illustration of the preprocessing algorithm,
from left to right: (1) random sampling of landmarks, (2) SSSP computation for
one landmark (node B), (3) knowledge of one node after all SSSP computations
(from all landmarks), and (4) ball for the rightmost node in the bottom line.
5.3 Illustration of the proof of worst-case stretch (3, 0) using the triangle inequality.
6.1 The Voronoi diagram and the Delaunay triangulation of the plane for a set of Voronoi
sites {A, B, . . . G} and the graph Voronoi diagram and its dual for a set of
Voronoi nodes {A, B, . . . G} in an unweighted graph (note that the graph Voronoi
dual is not necessarily a triangulation).
6.2 Illustration of the query algorithm of the Voronoi method. Left to right, top to
bottom: (1) the original shortest path, (2) shortest path in the (weighted) dual, (3)
sleeve, and (4) shortest path in the sleeve.
6.3 s, t, and $v_i$ are Voronoi nodes. The shortest path from s to t leads through u, which
is in $v_i$'s Voronoi region (if c < a and c < b), and paths in the Voronoi dual pass
through $v_i$. If the left-hand route is shorter than a + b + 2c, the shortest path in the
Voronoi dual takes the left-hand route, and the Voronoi sleeve S does not contain u.
6.4 The shortest path between two Voronoi nodes s and t with $h-1$ intermediate nodes
$u_1, \ldots, u_{h-1}$. The distance between two Voronoi nodes that are adjacent in the
Voronoi dual is at most $d(v_{\mathrm{vor}(u_k)}, u_k) + w(u_k, u_{k+1}) + d(u_{k+1}, v_{\mathrm{vor}(u_{k+1})})$.
6.5 Preprocessing time versus speedup (with respect to Dijkstra's algorithm) tradeoff
for the European road network. Plot on a doubly-logarithmic scale, y-axis reversed.
Circles stand for variants of the Voronoi method and for the related exact
methods listed in Table 6.1. Transit-Node Routing (TNR) has the best speedup
and the slowest preprocessing. Contraction Hierarchies (CHHNR) and Highway
Hierarchies (HH) achieve a very good tradeoff between preprocessing and query
times. A* has short preprocessing times but rather low speedup. The Voronoi
method (Chapter 6) has the fastest preprocessing times with competitive speedups
at the cost of exactness.
6.6 Approximate path length versus actual shortest path length for VORROOT with
sleeve steps omitted on the European road network, distance metric. The theoretical
worst-case logarithmic dependency on the number of edges cannot be observed
in the experimental results.
6.7 Approximate path length versus actual shortest path length for VORROOT using
the sleeve on the European road network, distance metric. Refinement using
the sleeve substantially improves the stretch in practice (also compare with Figure 6.6),
although the theoretical performance is not affected.
Bibliography
[AB02] Reka Albert and Albert-László Barabási. Statistical mechanics of complex networks.
Reviews of Modern Physics, 74:47–97, 2002.
[AB07] Sanjeev Arora and Boaz Barak. Computational complexity: A modern approach,
2007. Web draft dated 2007-01-08 22:02.
[ABN08] Ittai Abraham, Yair Bartal, and Ofer Neiman. Embedding metric spaces in their
intrinsic dimension. In Proceedings of the Nineteenth Annual ACM-SIAM Symposium
on Discrete Algorithms, SODA 2008, San Francisco, California, USA, January
20-22, 2008, pages 363–372, 2008.
[AC07] Noga Alon and Michael Capalbo. Finding disjoint paths in expanders deterministically
and online. In FOCS '07: Proceedings of the 48th Annual IEEE Symposium
on Foundations of Computer Science, pages 518–524, 2007.
[ACC+96] Srinivasa Rao Arikati, Danny Z. Chen, L. Paul Chew, Gautam Das, Michiel H. M.
Smid, and Christos D. Zaroliagis. Planar spanners and approximate shortest path
queries among obstacles in the plane. In Algorithms - ESA '96, Fourth Annual
European Symposium, Barcelona, Spain, September 25-27, 1996, Proceedings,
pages 514–528, 1996.
[ACIM99] Donald Aingworth, Chandra Chekuri, Piotr Indyk, and Rajeev Motwani. Fast
estimation of diameter and shortest paths (without matrix multiplication). SIAM
Journal on Computing, 28(4):1167–1181, 1999. Announced at SODA 1996.
[ACKM09] Dimitris Achlioptas, Aaron Clauset, David Kempe, and Cristopher Moore. On the
bias of traceroute sampling: Or, power-law degree distributions in regular graphs.
Journal of the ACM, 56(4), 2009.
[ACL00] William Aiello, Fan Rong King Chung, and Linyuan Lu. A random graph model
for massive graphs. In Proceedings of the 32nd Annual ACM Symposium on Theory
of Computing, pages 171–180, 2000.
[ACS99] Karen Aardal, Fabian A. Chudak, and David B. Shmoys. A 3-approximation algorithm
for the k-level uncapacitated facility location problem. Information Processing
Letters, 72:161–167, 1999.
[ADD+93] Ingo Althöfer, Gautam Das, David P. Dobkin, Deborah Joseph, and José Soares.
On sparse spanners of weighted graphs. Discrete & Computational Geometry,
9:81–100, 1993.
[ADG+06] Lyudmil Aleksandrov, Hristo Djidjev, Hua Guo, Anil Maheshwari, Doron Nussbaum,
and Jörg-Rüdiger Sack. Approximate shortest path queries on weighted
polyhedral surfaces. In Mathematical Foundations of Computer Science 2006,
31st International Symposium, MFCS 2006, Stará Lesná, Slovakia, August 28-
September 1, 2006, Proceedings, pages 98–109, 2006.
[AF90] Miklós Ajtai and Ronald Fagin. Reachability is harder for directed than for undirected
finite graphs. The Journal of Symbolic Logic, 55(1):113–150, 1990. Announced
at FOCS 1988.
[AFGW10] Ittai Abraham, Amos Fiat, Andrew V. Goldberg, and Renato Fonseca F. Werneck.
Highway dimension, shortest paths, and provably efficient algorithms. In ACM-SIAM
Symposium on Discrete Algorithms (SODA'10) January 17-19, 2010, Austin,
Texas, 2010.
[AFWZ95] Noga Alon, Uriel Feige, Avi Wigderson, and David Zuckerman. Derandomized
graph products. Computational Complexity, 5:60-75, 1995. 65, 74
[AG06] Ittai Abraham and Cyril Gavoille. Object location using path separators. In
Proceedings of the Twenty-Fifth Annual ACM Symposium on Principles of Distributed
Computing, PODC 2006, Denver, CO, USA, July 23-26, 2006, pages 188-197,
2006. Details in LaBRI Research Report RR-1394-06. 46, 47, 83, 121
[AGGM06] Ittai Abraham, Cyril Gavoille, Andrew V. Goldberg, and Dahlia Malkhi. Routing
in networks with low doubling dimension. In 26th IEEE International Conference
on Distributed Computing Systems (ICDCS 2006), 4-7 July 2006, Lisboa, Portugal,
page 75, 2006. 49
[AGK+04] Vijay Arya, Naveen Garg, Rohit Khandekar, Adam Meyerson, Kamesh Munagala,
and Vinayaka Pandit. Local search heuristics for k-median and facility location
problems. SIAM Journal on Computing, 33(3):544-562, 2004. Announced at
STOC 2001. 98
[AGL07] Mattias Andersson, Joachim Gudmundsson, and Christos Levcopoulos. Approximate
distance oracles for graphs with dense clusters. Computational Geometry,
37(3):142-154, 2007. Announced at ISAAC 2004. 50, 121
[AGM97] Noga Alon, Zvi Galil, and Oded Margalit. On the exponent of the all pairs shortest
path problem. Journal of Computer and System Sciences, 54(2):255-262, 1997.
Announced at FOCS 1991. 36
[AGM+08] Ittai Abraham, Cyril Gavoille, Dahlia Malkhi, Noam Nisan, and Mikkel Thorup.
Compact name-independent routing with minimum stretch. ACM Transactions on
Algorithms, 4(3), 2008. 49
[AGMN92] Noga Alon, Zvi Galil, Oded Margalit, and Moni Naor. Witnesses for boolean
matrix multiplication and for shortest paths. In 33rd Annual Symposium on
Foundations of Computer Science, 24-27 October 1992, Pittsburgh, Pennsylvania, USA,
pages 417-426, 1992. 36
[AHL02] Noga Alon, Shlomo Hoory, and Nathan Linial. The Moore bound for irregular
graphs. Graphs and Combinatorics, 18(1):53-57, 2002. 28
[AI97] Cavit Aydin and Doug Ierardi. Partitioning algorithms for transportation graphs
and their applications to routing. In Proceedings of the 9th Canadian Conference
on Computational Geometry (CCCG), 1997. 54
[AI00] Yasuhito Asano and Hiroshi Imai. Practical efficiency of the linear-time algorithm
for the single source shortest path problem. Journal of the Operations Research
Society of Japan, 43(4):431-447, 2000. 33, 34
[AIP06] Alexandr Andoni, Piotr Indyk, and Mihai Patrascu. On the optimality of the
dimensionality reduction method. In 47th Annual IEEE Symposium on Foundations
of Computer Science (FOCS 2006), 21-24 October 2006, Berkeley, California,
USA, Proceedings, pages 449-458, 2006. 60, 62
[AJ89] Rakesh Agrawal and H. V. Jagadish. Materialization and incremental update of
path information. In Proceedings of the Fifth International Conference on Data
Engineering, February 6-10, 1989, Los Angeles, California, USA, pages 374-383,
1989. 10
[AJ94] Rakesh Agrawal and H. V. Jagadish. Algorithms for searching massive graphs.
IEEE Transactions on Knowledge and Data Engineering, 6(2):225-238, 1994. 51
[AJB99] Reka Albert, Hawoong Jeong, and Albert-László Barabási. Diameter of the
world-wide web. Nature, 401:130, 1999. 5
[Ajt88] Miklós Ajtai. A lower bound for finding predecessors in Yao's cell probe model.
Combinatorica, 8(3):235-247, 1988. 60
[Ake60] Sheldon B. Akers. The use of wye-delta transformations in network simplification.
Operations Research, 8(3):311-323, 1960. Announced at Rand Symposium on
Mathematical Programming 1959. 51
[Akg88] Mustafa Akgül. Shortest paths and the simplex method. 14th International
Conference on Mathematical Programming, Tokyo, 1988. 34
[Alo86] Noga Alon. Eigenvalues and expanders. Combinatorica, 6(2):83-96, 1986. 65
[Alo03] Noga Alon. Problems and results in extremal combinatorics - I. Discrete
Mathematics, 273(1-3):31-53, 2003. 30
[ALZ07] Luca Allulli, Peter Lichodzijewski, and Norbert Zeh. A faster cache-oblivious
shortest-path algorithm for undirected graphs with bounded edge lengths. In
Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms,
SODA 2007, New Orleans, Louisiana, USA, January 7-9, 2007, pages 910-919,
2007. 35
[AM05] Ittai Abraham and Dahlia Malkhi. Name independent routing for growth bounded
networks. In SPAA 2005: Proceedings of the 17th Annual ACM Symposium on
Parallel Algorithms, July 18-20, 2005, Las Vegas, Nevada, USA, pages 49-55,
2005. 49
[AMO93] Ravindra K. Ahuja, Thomas L. Magnanti, and James B. Orlin. Network Flows.
Prentice Hall, 1993. 32
[AMOT90] Ravindra K. Ahuja, Kurt Mehlhorn, James B. Orlin, and Robert Endre Tarjan.
Faster algorithms for the shortest path problem. Journal of the ACM,
37(2):213-223, 1990. 34
[ANLP90] Baruch Awerbuch, Amotz Bar Noy, Nathan Linial, and David Peleg. Improved
routing strategies with succinct tables. Journal of Algorithms, 11(3):307-341,
1990. 49
[AO92] Ravindra K. Ahuja and James B. Orlin. The scaling network simplex algorithm.
Operations Research, 40(S1):5-13, 1992. 34
[AOPS02] Ravindra K. Ahuja, James B. Orlin, Stefano Pallottino, and Maria Grazia Scutellà.
Minimum time and minimum cost-path problems in street networks with periodic
traffic lights. Transportation Science, 36(3):326-336, 2002. 54
[AP89] Stefan Arnborg and Andrzej Proskurowski. Linear time algorithms for NP-hard
problems restricted to partial k-trees. Discrete Applied Mathematics, 23(1):11-24,
1989. 20, 22
[AP92] Baruch Awerbuch and David Peleg. Routing with polynomial communication-space
trade-off. SIAM Journal on Discrete Mathematics, 5(2):151-162, 1992. 49
[APF+06] Balázs Adamcsek, Gergely Palla, Illés J. Farkas, Imre Derényi, and Tamás
Vicsek. CFinder: locating cliques and overlapping modules in biological networks.
Bioinformatics, 22(8):1021-1023, 2006. 13
[Ari00] Masanori Arita. Metabolic reconstruction using shortest paths. Simulation
Practice and Theory, 8(1-2):109-125, 2000. 11
[AS87] Noga Alon and Baruch Schieber. Optimal preprocessing for answering on-line
product queries. Technical Report 71/87, Tel Aviv University, 1987. 47
[AS00] Noga Alon and Joel Spencer. The Probabilistic Method (second edition). John
Wiley Interscience Series in Discrete Mathematics and Optimization, 2000. 67, 68
[ASBS00] Luís A. Nunes Amaral, Antonio Scala, Marc Barthelemy, and H. Eugene Stanley.
Classes of small-world networks. Proceedings of the National Academy of
Sciences USA, 97(21):11149-11152, 2000. 4, 23, 123
[Ass83] Patrice Assouad. Plongements Lipschitziens dans R^n. Bulletin de la Société
Mathématique de France, 111:429-448, 1983. 20
[ASS09] Dennis J. Adams-Smith and Douglas R. Shier. Generating random test networks
for shortest path algorithms. Operations Research and Cyber-Infrastructure,
47:295-308, 2009. 22
[AST90] Noga Alon, Paul D. Seymour, and Robin Thomas. A separator theorem for
nonplanar graphs. Journal of the American Mathematical Society, 3(4):801-808, 1990.
Announced at STOC 1990. 31, 46, 121
[AST94] Noga Alon, Paul D. Seymour, and Robin Thomas. Planar separators. SIAM Journal
on Discrete Mathematics, 7(2):184-193, 1994. 30
[Aur91] Franz Aurenhammer. Voronoi diagrams - a survey of a fundamental geometric
data structure. ACM Computing Surveys, 23(3):345-405, 1991. 98
[AV88] Alok Aggarwal and Jeffrey S. Vitter. The input/output complexity of sorting and
related problems. Communications of the ACM, 31(9):1116-1127, 1988. 26, 43,
45
[AY00] Muhammad Abaidullah Anwar and Takaichi Yoshida. OORF: An object-oriented
route finder. In SAC '00: Proceedings of the 2000 ACM Symposium on Applied
Computing, pages 301-306, 2000. 51, 54
[AY01] Muhammad Abaidullah Anwar and Takaichi Yoshida. Integrating OO road network
database, cases and knowledge for route finding. In SAC '01: Proceedings
of the 2001 ACM Symposium on Applied Computing, pages 215-219, 2001. 51, 54
[BA99] Albert-László Barabási and Reka Albert. Emergence of scaling in random
networks. Science, 286(5439):509-512, 1999. 4, 5, 23, 24, 96
[Bac94] Paul Bachmann. Die Analytische Zahlentheorie. Zahlentheorie. pt. 2. 1894. 25
[BAJ00] Albert-László Barabási, Reka Albert, and Hawoong Jeong. Scale-free
characteristics of random networks: the topology of the world-wide web. Physica A:
Statistical Mechanics and its Applications, 281:69-77, 2000. 4, 5
[Bar03] Albert-László Barabási. Linked: How Everything Is Connected to Everything Else
and What It Means. Plume, 2003. 81
[Bat08] Michael Batty. The size, scale, and shape of cities. Science, 319(5864):769-771,
2008. 10
[Bav48] Alex Bavelas. A mathematical model of group structure. Human Organizations,
7:16-30, 1948. 12
[Bav50] Alex Bavelas. Communication patterns in task-oriented groups. The Journal of
the Acoustical Society of America, 22:725, 1950. 12
[BBCR03] Béla Bollobás, Christian Borgs, Jennifer T. Chayes, and Oliver Riordan. Directed
scale-free graphs. In Symposium on Discrete Algorithms (SODA), pages 132-139,
2003. 5
[BBJ+02] Christopher L. Barrett, Keith R. Bisset, Riko Jacob, Goran Konjevod, and
Madhav V. Marathe. Classical and contemporary shortest path problems in road
networks: Implementation and experimental analysis of the TRANSIMS router. In
Algorithms - ESA 2002, 10th Annual European Symposium, Rome, Italy,
September 17-21, 2002, Proceedings, pages 126-138, 2002. 10
[BBK72] András Békéssy, P. Bekessy, and János Komlós. Asymptotic enumeration of
regular matrices. Studia Scientiarum Mathematicarum Hungarica, 7:343-353, 1972.
23, 24, 96
[BC58] Frederick Bock and Scott Cameron. Allocation of network traffic demand by
instant determination of optimum paths. Operations Research, 6:633-634, 1958.
Announced at the 13th National (6th Annual) Meeting of the Operations Research
Society of America, 1958. 8
[BC06] Arthur Brady and Lenore Cowen. Compact routing on power law graphs with
additive stretch. In Proceedings of the Ninth Workshop on Algorithm Engineering
and Experiments, pages 119-128, 2006. 49, 56, 83
[BCE03] Béla Bollobás, Don Coppersmith, and Michael L. Elkin. Sparse distance
preservers and additive spanners. In Symposium on Discrete Algorithms (SODA),
2003. 27
[BD86] George E. P. Box and Norman R. Draper. Empirical model-building and response
surface. John Wiley & Sons, Inc., 1986. 84
[BD08] Reinhard Bauer and Daniel Delling. SHARC: Fast and robust unidirectional
routing. In Proceedings of the 10th Workshop on Algorithm Engineering and
Experiments (ALENEX'08), pages 13-26, 2008. 47, 54, 55, 56, 110, 111, 117, 123
[BDDW09] Reinhard Bauer, Gianlorenzo D'Angelo, Daniel Delling, and Dorothea Wagner.
The shortcut problem - complexity and approximation. In SOFSEM 2009: Theory
and Practice of Computer Science, 35th Conference on Current Trends in Theory
and Practice of Computer Science, Špindlerův Mlýn, Czech Republic,
January 24-30, 2009. Proceedings, pages 105-116, 2009. 55
[BDS+08] Reinhard Bauer, Daniel Delling, Peter Sanders, Dennis Schieferdecker, Dominik
Schultes, and Dorothea Wagner. Combining hierarchical and goal-directed speed-up
techniques for Dijkstra's algorithm. In Experimental Algorithms, 7th International
Workshop (WEA'08), Provincetown, MA, USA, May 30-June 1, 2008,
Proceedings, pages 303-318, 2008. 47, 51, 54, 55, 56, 110, 111, 112, 117, 123
[BDW09] Reinhard Bauer, Daniel Delling, and Dorothea Wagner. Experimental study of
speed-up techniques for timetable information systems. Networks, 2009. 56
[Bea64] Murray A. Beauchamp. An improved index of centrality. Behavioral Science,
10:161-163, 1964. 12
[Bel58] Richard Ernest Bellman. On a routing problem. Quarterly of Applied Mathematics,
16:87-90, 1958. 8, 34
[Bel67] Richard Ernest Bellman. Dynamic Programming. Princeton University Press,
1967. 34, 35, 37
[Ben66] C. T. Benson. Minimal regular graphs of girth eight and twelve. Canadian Journal
of Mathematics, 18:1091-1094, 1966. 28
[Ber93] Dimitri P. Bertsekas. A simple and fast label correcting algorithm for shortest
paths. Networks, 23(7):703-709, 1993. 34
[Ber09] Aaron Bernstein. Fully dynamic (2 + ε) approximate all-pairs shortest paths with
O(log log n) query and close to linear update time. In 50th Annual IEEE
Symposium on Foundations of Computer Science, FOCS 2009, pages 693-702, 2009.
39
[Bes74] Julian Besag. Spatial interaction and the statistical analysis of lattice systems.
Journal of the Royal Statistical Society. Series B (Methodological),
36(2):192-236, 1974. 11
[BFM+07] Holger Bast, Stefan Funke, Domagoj Matijevic, Peter Sanders, and Dominik
Schultes. In transit to constant time shortest-path queries in road networks.
In Proceedings of the Workshop on Algorithm Engineering and Experiments
(ALENEX'07), New Orleans, Louisiana, USA, January 6, 2007, 2007. 47, 51,
54, 55, 110, 114
[BFMZ04] Gerth Stølting Brodal, Rolf Fagerberg, Ulrich Meyer, and Norbert Zeh.
Cache-oblivious data structures and algorithms for undirected breadth-first search and
shortest paths. In Algorithm Theory - SWAT 2004, 9th Scandinavian Workshop
on Algorithm Theory, Humlebaek, Denmark, July 8-10, 2004, Proceedings, pages
480-492, 2004. 35
[BFSS07] Holger Bast, Stefan Funke, Peter Sanders, and Dominik Schultes. Fast routing in
road networks with transit nodes. Science, 316(5824):566, 2007. 55
[BG07] Zachary K. Baker and Maya Gokhale. On the acceleration of shortest path
calculations in transportation networks. In International Symposium on
Field-Programmable Custom Computing Machines, pages 23-32, 2007. 10, 54
[BGS09] Surender Baswana, Vishrut Goyal, and Sandeep Sen. All-pairs nearly
2-approximate shortest paths in Õ(n^2) time. Theoretical Computer Science,
410(1):84-93, 2009. Announced at STACS 2005. 37
[BGSU08] Surender Baswana, Akshay Gaur, Sandeep Sen, and Jayant Upadhyay. Distance
oracles for unweighted graphs: Breaking the quadratic barrier with constant
additive error. In Automata, Languages and Programming, 35th International
Colloquium, ICALP 2008, Reykjavik, Iceland, July 7-11, 2008, Proceedings, Part I:
Track A: Algorithms, Automata, Complexity, and Games, pages 609-621, 2008. 43,
44, 123
[BH69] M. H. Bourgoin and E. M. J. Heurgon. Study and comparison of algorithms of
the shortest path through planned experiments. In Project Planning by Network
Analysis, pages 106-118, 1969. 35, 50
[BHS07] Surender Baswana, Ramesh Hariharan, and Sandeep Sen. Improved decremental
algorithms for maintaining transitive closure and all-pairs shortest paths. Journal
of Algorithms, 62(2):74-92, 2007. 39
[Big98] Norman Biggs. Constructions for cubic graphs with large girth. The Electronic
Journal of Combinatorics, 5, 1998. 28
[BK06] Surender Baswana and Telikepalli Kavitha. Faster algorithms for approximate
distance oracles and all-pairs small stretch paths. In 47th Annual IEEE Symposium
on Foundations of Computer Science (FOCS 2006), 21-24 October 2006, Berkeley,
California, USA, pages 591-602, 2006. 37, 43, 44
[BK07] Piotr Berman and Shiva Prasad Kasiviswanathan. Faster approximation of
distances in graphs. In Algorithms and Data Structures, 10th International Workshop,
WADS 2007, Halifax, Canada, August 15-17, 2007, Proceedings, pages 541-552,
2007. 37
[BKM+00] Andrei Z. Broder, Ravi Kumar, Farzin Maghoul, Prabhakar Raghavan, Sridhar
Rajagopalan, Raymie Stata, Andrew Tomkins, and Janet L. Wiener. Graph structure
in the web. Computer Networks, 33(1-6):309-320, 2000. 5
[BKMM07] David A. Bader, Shiva Kintali, Kamesh Madduri, and Milena Mihail.
Approximating betweenness centrality. In Algorithms and Models for the Web-Graph, 5th
International Workshop, WAW 2007, San Diego, CA, USA, December 11-12, 2007,
Proceedings, pages 124-137, 2007. 12
[BKMP05] Surender Baswana, Telikepalli Kavitha, Kurt Mehlhorn, and Seth Pettie. New
constructions of (α, β)-spanners and purely additive spanners. In SODA '05:
Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms,
pages 672-681, 2005. 28
[BKS01] Ingmar Bitter, Arie E. Kaufman, and Mie Sato. Penalized-distance volumetric
skeleton algorithm. IEEE Transactions on Visualization and Computer Graphics,
7(3):195-206, 2001. 11
[BL74] Mokhtar S. Bazaraa and R. W. Langley. A dual shortest path algorithm. SIAM
Journal on Applied Mathematics, 26(3):496-501, 1974. 51
[BLM+06] Stefano Boccaletti, Vito Latora, Yamir Moreno, Mario Chavez, and Dong-Uk
Hwang. Complex networks: Structure and dynamics. Physics Reports,
424:175-308, 2006. 4, 13
[BLMN05] Yair Bartal, Nathan Linial, Manor Mendel, and Assaf Naor. On metric Ramsey
type phenomena. Annals of Mathematics (2), 162(2):643-709, 2005. Announced
at STOC 2003. 43
[Blo83] Peter A. Bloniarz. A shortest-path algorithm with expected time
O(n^2 log n log* n). SIAM Journal on Computing, 12(3):588-600, 1983.
Announced at STOC 1980. 35
[BMST97] Frank Ball, Denis Mollison, and Gianpaolo Scalia-Tomba. Epidemics with two
levels of mixing. Ann. Appl. Probab., 7(1):46-89, 1997. 24, 83
[BMW89] Anantaram Balakrishnan, Thomas L. Magnanti, and Richard T. Wong. A
dual-ascent procedure for large-scale uncapacitated network design. Operations
Research, 37(5):716-740, 1989. 8
[Bol01] Béla Bollobás. Random Graphs. Cambridge University Press, 2001. 22, 23
[Boo67] John Boothroyd. Algorithms: Author's note on algorithms 22, 23, 24; algorithm
22: Shortest path between start node and end node of a network; algorithm 23:
Shortest path between start node and all other nodes of a network; algorithm 24:
The list of nodes on the shortest path from start node to end node of a network.
The Computer Journal, 10(3):306-308, 1967. In the Algorithms Supplement. 35
[BOR80] John J. Bartholdi, James B. Orlin, and H. Donald Ratliff. Cyclic scheduling via
integer programs with circular ones. Operations Research, 28(5):1074-1085, 1980.
8
[Bor84] Gunilla Borgefors. Distance transformations in arbitrary dimensions. Computer
Vision, Graphics, and Image Processing, 27(3):321-345, September 1984. 11
[Bou85] Jean Bourgain. On Lipschitz embedding of finite metric spaces in Hilbert space.
Israel Journal of Mathematics, 52(1-2):46-52, 1985. 30, 118
[BR03] Béla Bollobás and Oliver Riordan. Robustness and vulnerability of scale-free
random graphs. Internet Mathematics, 1, 2003. 83
[BR04] Béla Bollobás and Oliver Riordan. The diameter of a scale-free random graph.
Combinatorica, 24(1):5-34, 2004. 4
[Bra68] Dietrich Braess. Über ein Paradoxon aus der Verkehrsplanung.
Unternehmensforschung, 12:258-268, 1968. 10
[Bra01] Ulrik Brandes. A faster algorithm for betweenness centrality. Journal of
Mathematical Sociology, 25:163-177, 2001. 12
[Bre32] Robert Breusch. Zur Verallgemeinerung des Bertrandschen Postulates, daß
zwischen x und 2x stets Primzahlen liegen. Mathematische Zeitschrift, 34(1):505-526,
1932. 66, 67
[Bre64] Robert Breusch. An asymptotic formula for primes of the form 4n + 1. The
Michigan Mathematical Journal, 11:311-315, 1964. 66
[Bre66] Melvin A. Breuer. Coding the vertexes of a graph. IEEE Transactions on
Information Theory, 12(2):148-153, 1966. 31
[Bro66] William G. Brown. On graphs that do not contain a Thomsen graph. Canadian
Mathematical Bulletin, 9:281-285, 1966. 28
[BRST01] Béla Bollobás, Oliver Riordan, Joel Spencer, and Gábor E. Tusnády. The
degree sequence of a scale-free random graph process. Random Struct. Algorithms,
18(3):279-290, 2001. 4
[BS05] Nadine Baumann and Sebastian Stiller. Network Analysis, chapter 13. Network
Models, pages 341-372. Springer, 2005. 24
[BS06] Surender Baswana and Sandeep Sen. Approximate distance oracles for
unweighted graphs in expected O(n^2) time. ACM Transactions on Algorithms,
2(4):557-577, 2006. Announced at SODA 2004. 43, 44
[BS08] Surender Baswana and Sandeep Sen. Algorithms for spanners in weighted graphs.
In Encyclopedia of Algorithms. 2008. 27
[BSB+00] Ingmar Bitter, Mie Sato, Michael Bender, Kevin T. McDonnell, Arie Kaufman,
and Ming Wan. CEASAR: a smooth, accurate and robust centerline extraction
algorithm. In Visualization 2000. Proceedings, pages 45-52, Oct. 2000. 11
[BSWW04] Ulrik Brandes, Frank Schulz, Dorothea Wagner, and Thomas Willhalm.
Generating node coordinates for shortest-path computations in transportation networks.
ACM Journal of Experimental Algorithmics, 9, 2004. 54
[BTZ98] Gerth Stølting Brodal, Jesper Larsson Träff, and Christos D. Zaroliagis. A parallel
priority queue with constant time operations. Journal of Parallel and Distributed
Computing, 49(1):4-21, 1998. Announced at IPPS 1997. 35
[But68] Larry F. Butas. A directionally oriented shortest path algorithm. Transportation
Research, 2(3):253-268, 1968. 54
[BVW08] Guy E. Blelloch, Virginia Vassilevska, and Ryan Williams. A new combinatorial
approach for sparse graph problems. In Automata, Languages and Programming,
35th International Colloquium, ICALP 2008, Reykjavik, Iceland, July 7-11, 2008,
Proceedings, Part I: Track A: Algorithms, Automata, Complexity, and Games, pages
108-120, 2008. 36, 37
[BYJKS04] Ziv Bar-Yossef, T. S. Jayram, Ravi Kumar, and D. Sivakumar. An information
statistics approach to data stream and communication complexity. Journal of
Computer and System Sciences, 68(4):702-732, 2004. Special Issue on FOCS 2002.
62
[Cab06] Sergio Cabello. Many distances in planar graphs. In Proceedings of the
Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2006,
Miami, Florida, USA, January 22-26, 2006, pages 1213-1220, 2006. A preprint of
the journal version is available in the University of Ljubljana preprint series, Vol.
47 (2009), 1089. 37, 45, 46, 47, 121
[Cal61] Tom Caldwell. On finding minimum routes in a network with turn penalties.
Communications of the ACM, 4(2):107-108, 1961. 8, 39
[Car71] Bernard A. Carré. An algebra for network routing problems. IMA Journal of
Applied Mathematics, 7(3):273-294, 1971. 36
[CC07] Sergio Cabello and Erin W. Chambers. Multiple source shortest paths in a genus g
graph. In Symposium on Discrete Algorithms (SODA), pages 89-97, 2007. 46
[CE91] Marek Chrobak and David Eppstein. Planar orientations with low out-degree and
compaction of adjacency matrices. Theoretical Computer Science, 86(2):243-266,
1991. 31
[CE01] Timothy M. Chan and Alon Efrat. Fly cheaply: On the minimum fuel consumption
problem. Journal of Algorithms, 41(2):330-337, 2001. Announced at SOCG 1998
by Alon Efrat and Sariel Har-Peled. 97
[CF94] Adrijana Car and Andrew U. Frank. Modelling a hierarchy of space applied to
large road networks. In IGIS '94: Geographic Information Systems, International
Workshop on Advanced Information Systems, Monte Verita, Ascona, Switzerland,
February 28 - March 4, 1994, Proceedings, pages 15-24, 1994. 51, 54
[CF06] Deepayan Chakrabarti and Christos Faloutsos. Graph mining: Laws, generators,
and algorithms. ACM Computing Surveys, 38(1):2, 2006. 24
[CG06] T-H. Hubert Chan and Anupam Gupta. Small hop-diameter sparse spanners for
doubling metrics. In SODA '06: Proceedings of the seventeenth annual ACM-SIAM
symposium on Discrete algorithm, pages 70-78, 2006. 49
[CGR96] Boris V. Cherkassky, Andrew V. Goldberg, and Tomasz Radzik. Shortest paths
algorithms: Theory and experimental evaluation. Mathematical Programming,
73:129-174, 1996. 35
[CGS99] Boris V. Cherkassky, Andrew V. Goldberg, and Craig Silverstein. Buckets, heaps,
lists, and monotone priority queues. SIAM Journal on Computing,
28(4):1326-1346, 1999. 35
[CH66] K. L. Cooke and E. Halsey. The shortest route through a network with
time-dependent internodal transit times. Journal of Mathematical Analysis and
Applications, 14:492-498, 1966. 32
[CH78] Louis Caccetta and Roland Häggkvist. On minimal digraphs with given girth. In
Congressus Numerantium XXI - Proceedings of the Ninth Southeastern Conference
on Combinatorics, Graph Theory, and Computing, pages 181-187, 1978. 28
[Cha67] Bruce A. Chartres. Letter concerning Nicholson's paper. The Computer Journal,
10(1):118-119, 1967. In Discussion and Correspondence. 35
[Cha87] Bernard Chazelle. Computing on a free tree via complexity-preserving mappings.
Algorithmica, 2:337-361, 1987. Announced at FOCS 1984. 47
[Cha07] Timothy M. Chan. More algorithms for all-pairs shortest paths in weighted graphs.
In Proceedings of the 39th Annual ACM Symposium on Theory of Computing
(STOC'07), pages 590-598, 2007. 36, 37
[CHC+00] Maria C. Costanzo, Jennifer D. Hogan, Michael E. Cusick, Brian P. Davis, Ann M.
Fancher, Peter E. Hodges, Pinar Kondu, Carey Lengieza, Jodi E. Lew-Smith, Carol
Lingner, Kevin J. Roberg-Perez, Michael Tillberg, Joan E. Brooks, and James I.
Garrels. The yeast proteome database (ypd) and caenorhabditis elegans proteome
database (wormpd): comprehensive resources for the organization and comparison
of model organism protein information. Nucleic Acids Research, 28(1):73-76,
2000. 6
[Che96] Danny Z. Chen. Developing algorithms and software for geometric path planning
problems. ACM Computing Surveys, page 18, 1996. 32
[CHKZ03] Edith Cohen, Eran Halperin, Haim Kaplan, and Uri Zwick. Reachability and
distance queries via 2-hop labels. SIAM Journal on Computing, 32(5):1338-1355,
2003. Announced at SODA 2002. 54, 57, 113
[CK95] Paul B. Callahan and S. Rao Kosaraju. A decomposition of multidimensional point
sets with applications to k-nearest-neighbors and n-body potential fields. Journal
of the ACM, 42(1):67-90, 1995. 31, 48, 50
[CK96] Laurent D. Cohen and Ron Kimmel. Global minimum for active contour models:
A minimal path approach. In IEEE Computer Society Conference on Computer
Vision and Pattern Recognition, pages 666-673, 1996. 8, 10
[CKL97] Jason Cong, Andrew B. Kahng, and Kwok-Shing Leung. Efficient heuristics
for the minimum shortest path Steiner arborescence problem with applications to
VLSI physical design. In International Symposium on Physical Design, pages
88-95, 1997. 8
[CKL+09] Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, Michael Mitzenmacher,
Alessandro Panconesi, and Prabhakar Raghavan. Models for the compressible web. In
Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS),
pages 331-340, 2009. 5
[CKR04] Gruia Calinescu, Howard J. Karloff, and Yuval Rabani. Approximation algorithms
for the 0-extension problem. SIAM Journal on Computing, 34(2):358-372, 2004.
Announced at SODA 2001. 43
[CL02] Fan Rong King Chung and Linyuan Lu. The average distances in random graphs
with given expected degrees. Internet Mathematics, 99:15879-15882, 2002. 24,
56, 82, 84
[CL06] Fan Rong King Chung and Linyuan Lu. Complex Graphs and Networks. American
Mathematical Society, 2006. 82, 84, 86
[CL07] Edward P. Chan and Heechul Lim. Optimization and evaluation of shortest path
queries. The VLDB Journal, 16(3):343-369, 2007. 51, 54
[CLRS01] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein.
Introduction to Algorithms. The MIT Press, 2nd edition, 2001. 8, 19, 25, 27, 32,
34, 35
[CLSS98] Danny Z. Chen, D. T. Lee, R. Sridhar, and Chandra N. Sekharan. Solving the
all-pair shortest path query problem on interval and circular-arc graphs. Networks,
31(4):249-258, 1998. 48, 121
[CLW08] Danny Z. Chen, Shuang Luan, and Chao Wang. Coupled path planning, region
optimization, and applications in intensity-modulated radiation therapy. In
Algorithms - ESA 2008, 16th Annual European Symposium, Karlsruhe, Germany,
September 15-17, 2008. Proceedings, pages 271-283, 2008. 11
[CMMS98] Andreas Crauser, Kurt Mehlhorn, Ulrich Meyer, and Peter Sanders. A
parallelization of Dijkstra's shortest path algorithm. In Mathematical Foundations of
Computer Science 1998, 23rd International Symposium, MFCS'98, Brno, Czech
Republic, August 24-28, 1998, Proceedings, pages 722-731, 1998. 35
[CNM04] Aaron Clauset, Mark E. J. Newman, and Cristopher Moore. Finding community
structure in very large networks. Physical Review E (Statistical, Nonlinear, and
Soft Matter Physics), 70(6):066111+, Dec 2004. 13
[CNSW00] Duncan S. Callaway, Mark E. J. Newman, Steven H. Strogatz, and Duncan J.
Watts. Network robustness and fragility: Percolation on random graphs.
Physical Review Letters, 85:5468-5471, 2000. 83
[CO00] Bruno Courcelle and Stephan Olariu. Upper bounds to the clique width of graphs.
Discrete Applied Mathematics, 101(1-3):77-114, 2000. 48
[Coh94] Edith Cohen. Polylog-time and linear-work approximation scheme for undirected
shortest paths. In Proceedings of the ACM Symposium on Theory of Computing
(STOC), pages 16-26, 1994. 38
[Coh96] Edith Cohen. Efficient parallel shortest-paths in digraphs with a separator
decomposition. Journal of Algorithms, 21(2):331-357, 1996. Announced at SPAA 1993.
54
[Coh98] Edith Cohen. Fast algorithms for constructing t-spanners and paths with stretch t.
SIAM Journal on Computing, 28(1):210-236, 1998. Announced at FOCS 1993.
27
[Coh00] Edith Cohen. Polylog-time and near-linear work approximation scheme for
undirected shortest paths. Journal of the ACM, 47(1):132-166, 2000. 35
[Coo03] Cooperative Association for Internet Data Analysis. Router-level topology
measurements. Online at caida.org/tools/measurement/skitter/router_topology/, file:
itdk0304_rlinks_undirected.gz, 2003. 5, 112
[Cow01] Lenore Cowen. Compact routing with minimum stretch. Journal of Algorithms,
38(1):170-183, 2001. 88, 90
[CP07] Timothy M. Chan and Mihai Patrascu. Voronoi diagrams in n·2^{O(√(lg lg n))} time.
In Proceedings of the 39th Annual ACM Symposium on Theory of Computing, San
Diego, California, USA, June 11-13, 2007, pages 31-39, 2007. 98
[CR73] Stephen A. Cook and Robert A. Reckhow. Time bounded random access machines.
Journal of Computer and System Sciences, 7(4):354-375, 1973. Announced at
STOC 1972. 26
[CRS98] Yu-Li Chou, H. Edwin Romeijn, and Robert L. Smith. Approximating shortest
paths in large-scale networks with an application to intelligent transportation
systems. INFORMS Journal on Computing, 10(2):163-179, 1998. 51, 54
[CS83] Vasek Chvátal and Endre Szemerédi. Short cycles in directed graphs. Journal of
Combinatorial Theory, Series B, 35(3):323-327, 1983. 28
[CSN07] Aaron Clauset, Cosma R. Shalizi, and Mark E. J. Newman. Power-law
distributions in empirical data. SIAM Reviews, June 2007. 4, 23
[CSTW09a] Wei Chen, Christian Sommer, Shang-Hua Teng, and Yajun Wang. Compact
routing in power-law graphs. In 23rd International Symposium on Distributed
Computing (DISC), pages 379-391, 2009. 16
[CSTW09b] Wei Chen, Christian Sommer, Shang-Hua Teng, and Yajun Wang. A compact
routing scheme and approximate distance oracle for power-law graphs. Technical
Report MSR-TR-2009-84, Microsoft Research, July 2009. 16, 82
[CTB01] Adrijana Car, George Taylor, and Chris Brunsdon. An analysis of the performance
of a hierarchical wayfinding computational model using synthetic graphs.
Computers, Environment and Urban Systems, 25(1):69-88, 2001. 51, 54
[CV03a] Elizabeth Costenbader and Thomas W. Valente. The stability of centrality
measures when networks are sampled. Social Networks, 25(4):283-307, 2003. 12
[CV03b] Bruno Courcelle and Rémi Vanicat. Query efficient implementation of graphs of
bounded clique-width. Discrete Applied Mathematics, 131(1):129-150, 2003. 48
[CW90] Don Coppersmith and Shmuel Winograd. Matrix multiplication via arithmetic
progressions. Journal of Symbolic Computation, 9(3):251-280, 1990. Announced
at STOC 1987. 36
[CW04] Lenore J. Cowen and Christopher G. Wagner. Compact roundtrip routing in
directed networks. Journal of Algorithms, 50(1):79-95, 2004. Announced at PODC
2000. 49
[CWM94] Gordon Cameron, Brian J. N. Wylie, and David McArthur. PARAMICS: moving
vehicles on the connection machine. Technical report, Edinburgh Parallel
Computing Center, 1994. ISSN 1063-9535, IEEE. 10
[CX00] Danny Z. Chen and Jinhui Xu. Shortest path queries in planar graphs. In
Proceedings of the ACM Symposium on Theory of Computing (STOC), pages 469-478,
2000. 45
[CY09] Jiefeng Cheng and Jeffrey Xu Yu. On-line exact shortest distance query processing.
In EDBT 2009, 12th International Conference on Extending Database Technology,
Saint Petersburg, Russia, March 24-26, 2009, Proceedings, pages 481-492, 2009.
54, 57, 83, 113
[CYL+06] Jiefeng Cheng, Jeffrey Xu Yu, Xuemin Lin, Haixun Wang, and Philip S. Yu. Fast
computation of reachability labeling for large graphs. In Advances in Database
Technology - EDBT 2006, 10th International Conference on Extending Database
Technology, Munich, Germany, March 26-31, 2006, Proceedings, pages 961–979,
2006. 54
[CYL+08] Jiefeng Cheng, Jeffrey Xu Yu, Xuemin Lin, Haixun Wang, and Philip S. Yu. Fast
computing reachability labelings for large graphs with high compression rate. In
EDBT 2008, 11th International Conference on Extending Database Technology,
Nantes, France, March 25-29, 2008, Proceedings, pages 193–204, 2008. 54
[CYT06] Jiefeng Cheng, Jeffrey Xu Yu, and Nan Tang. Fast reachability query processing.
In Database Systems for Advanced Applications, 11th International Conference,
DASFAA 2006, Singapore, April 12-15, 2006, Proceedings, pages 674–688, 2006.
54
[CZ00] Shiva Chaudhuri and Christos D. Zaroliagis. Shortest paths in digraphs of small
treewidth. Part I: Sequential algorithms. Algorithmica, 27(3):212–226, 2000. Announced
at ICALP 1995. 47, 121
[CZ01] Edith Cohen and Uri Zwick. All-pairs small-stretch paths. Journal of Algorithms,
38(2):335–353, 2001. Announced at SODA 1997. 37, 44
[CZ07] Edward P. F. Chan and Jie Zhang. A fast unified optimal route query evaluation
algorithm. In CIKM '07: Proceedings of the sixteenth ACM conference on Conference
on information and knowledge management, pages 371–380, 2007. 51
[DABC08] Engin Demir, Cevdet Aykanat, and B. Barla Cambazoglu. Clustering spatial networks
for aggregate query processing: A hypergraph approach. Inf. Syst., 33(1):1–17,
2008. 54, 55
[Dan57] George Bernard Dantzig. Discrete-variable extremum problems. Operations Research,
5(2):266–277, 1957. 8, 9, 125
[Dan60] George Bernard Dantzig. On the shortest route through a network. Management
Science, 6(2):187–190, 1960. 8, 32, 35
[DBCP97] Mikael Degermark, Andrej Brodnik, Svante Carlsson, and Stephen Pink. Small
forwarding tables for fast routing lookups. In SIGCOMM, pages 3–14, 1997. 15
[dC83] Dennis de Champeaux. Bidirectional heuristic search again. Journal of the ACM,
30(1):22–32, 1983. 10, 35
[DCKM04] Frank Dabek, Russ Cox, Frans Kaashoek, and Robert Morris. Vivaldi: a decentralized
network coordinate system. In SIGCOMM '04: Proceedings of the 2004
conference on Applications, technologies, architectures, and protocols for computer
communications, pages 15–26, 2004. 15
[DEH07a] Philippe Duchon, Nicole Eggemann, and Nicolas Hanusse. Non-searchability of
random power-law graphs. In Principles of Distributed Systems, 11th International
Conference, OPODIS 2007, Guadeloupe, French West Indies, December 17-20,
2007. Proceedings, pages 274–285, 2007. 83
[DEH07b] Philippe Duchon, Nicole Eggemann, and Nicolas Hanusse. Non-searchability of
random scale-free graphs. In Proceedings of the Twenty-Sixth Annual ACM Symposium
on Principles of Distributed Computing, PODC 2007, Portland, Oregon,
USA, August 12-15, 2007, pages 380–381, 2007. 83
[Del09] Daniel Delling. Engineering and Augmenting Route Planning Algorithms. PhD
thesis, Universität Karlsruhe, 2009. 52, 54, 55, 56
[DF79] Eric V. Denardo and Bennett L. Fox. Shortest route methods: Reaching, pruning
and buckets. Operations Research, 27:161–186, 1979. 33
[DFJ54] George Bernard Dantzig, Delbert Ray Fulkerson, and Selmer M. Johnson. Solution
of a large-scale traveling-salesman problem. Journal of the Operations Research
Society of America, 2(4):393–410, 1954. 8
[DFS90] David P. Dobkin, Steven J. Friedman, and Kenneth J. Supowit. Delaunay graphs
are almost as good as complete graphs. Discrete & Computational Geometry,
5:399–407, 1990. 118
[DGJ08] Camil Demetrescu, Andrew V. Goldberg, and David S. Johnson. Implementation
challenge for shortest paths. In Encyclopedia of Algorithms. 2008. 15, 54
[DGKK79] Robert B. Dial, Fred Glover, David Karney, and Darwin Klingman. A computational
analysis of alternative algorithms and labeling techniques for finding shortest
path trees. Networks, 9:215–248, 1979. 33, 50
[DGST88] James R. Driscoll, Harold N. Gabow, Ruth Shrairman, and Robert E. Tarjan. Relaxed
heaps: an alternative to Fibonacci heaps with applications to parallel computation.
Communications of the ACM, 31(11):1343–1354, 1988. 33, 34
[DHZ00] Dorit Dor, Shay Halperin, and Uri Zwick. All-pairs almost shortest paths. SIAM
Journal on Computing, 29(5):1740–1759, 2000. Announced at FOCS 1996. 27,
29, 37, 38
[DI04] Camil Demetrescu and Giuseppe F. Italiano. Engineering shortest path algorithms.
In Experimental and Efficient Algorithms, Third International Workshop, WEA
2004, Angra dos Reis, Brazil, May 25-28, 2004, Proceedings, pages 191–198,
2004. 35
[DI08] Camil Demetrescu and Giuseppe F. Italiano. Decremental all-pairs shortest paths.
In Encyclopedia of Algorithms. 2008. 32, 35
[Dia69] Robert B. Dial. Algorithm 360: shortest-path forest with topological ordering [h].
Communications of the ACM, 12(11):632–633, 1969. 33
[Dij59] Edsger Wybe Dijkstra. A note on two problems in connexion with graphs. Numerische
Mathematik, 1:269–271, 1959. 8, 32, 35, 51
[Dir50] Gustav Lejeune Dirichlet. Über die Reduktion der positiven quadratischen Formen
mit drei unbestimmten ganzen Zahlen. Journal für die Reine und Angewandte
Mathematik, 40:209–227, 1850. 98
[Dji96] Hristo Djidjev. Efficient algorithms for shortest path problems on planar digraphs.
In Graph-Theoretic Concepts in Computer Science, 22nd International Workshop,
WG '96, Cadenabbia (Como), Italy, June 12-14, 1996, Proceedings, pages 151–165,
1996. 45, 47, 121
[Dji06] Hristo Djidjev. A scalable multilevel algorithm for graph clustering and community
structure detection. In Algorithms and Models for the Web-Graph, Fourth
International Workshop, WAW 2006, Banff, Canada, November 30 - December 1,
2006. Revised Papers, pages 117–128, 2006. 13
[Djo73] Dragomir Z. Djokovic. Distance-preserving subgraphs of hypercubes. Journal of
Combinatorial Theory, Series B, 14(3):263–267, 1973. 30
[DJW07] Jörg Derungs, Riko Jacob, and Peter Widmayer. Approximate shortest paths
guided by a small index. In Algorithms and Data Structures, 10th International
Workshop, WADS 2007, Halifax, Canada, August 15-17, 2007, Proceedings, pages
553–564, 2007. Journal version to appear in Algorithmica. 43
[DLL+06] Debora Donato, Luigi Laura, Stefano Leonardi, Ulrich Meyer, Stefano Millozzi,
and Jop F. Sibeyn. Algorithms and experiments for the webgraph. Journal of
Graph Algorithms and Applications, 10(2):219–236, 2006. Announced at ESA
2003. 5
[DLO03] Erik D. Demaine and Alejandro López-Ortiz. A linear lower bound on index size
for text retrieval. Journal of Algorithms, 48(1):2–15, 2003. Announced at SODA
2001. 43
[DMS00] Sergey N. Dorogovtsev, José Fernando Ferreira Mendes, and Alexander N.
Samukhin. Structure of growing networks with preferential linking. Physical
Review Letters, 85(21):4633–4636, Nov 2000. 24
[Dor67] Jim E. Doran. An approach to automatic problem-solving. Machine Intelligence,
1:105–124, 1967. 35, 52
[DP84] Narsingh Deo and Chi-Yin Pang. Shortest path algorithms: Taxonomy and annotation.
Networks, 14:257–323, 1984. 32, 51
[DP08] Ran Duan and Seth Pettie. Bounded-leg distance and reachability oracles. In SODA
'08: Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete
algorithms, pages 436–445, 2008. 39
[DPW09] Daniel Delling, Thomas Pajor, and Dorothea Wagner. Accelerating multi-modal
route planning by access-nodes. In Algorithms - ESA 2009, 17th Annual European
Symposium, Copenhagen, Denmark, September 7-9, 2009. Proceedings,
pages 587–598, 2009. 2
[DPWZ09] Daniel Delling, Thomas Pajor, Dorothea Wagner, and Christos D. Zaroliagis. Efficient
route planning in flight networks. In ATMOS 2009 - 9th Workshop on Algorithmic
Approaches for Transportation Modeling, Optimization, and Systems, IT
University of Copenhagen, Denmark, September 10, 2009, 2009. 2
[DPZ91] Hristo Djidjev, Grammati E. Pantziou, and Christos D. Zaroliagis. Computing
shortest paths and distances in planar graphs. In Automata, Languages and Programming,
18th International Colloquium, ICALP91, Madrid, Spain, July 8-12,
1991, Proceedings, pages 327–338, 1991. 45
[DPZ95] Hristo Djidjev, Grammati E. Pantziou, and Christos D. Zaroliagis. On-line and
dynamic algorithms for shortest path problems. In STACS, pages 193–204, 1995.
45, 121
[DPZ00] Hristo Djidjev, Grammati E. Pantziou, and Christos D. Zaroliagis. Improved algorithms
for dynamic shortest paths. Algorithmica, 28(4):367–389, 2000. 45, 121
[DR94] Shaul Dar and Raghu Ramakrishnan. A performance study of transitive closure
algorithms. ACM SIGMOD Record, 23(2):454–465, 1994. 57
[Dre69] Stuart E. Dreyfus. An appraisal of some shortest-path algorithms. Operations
Research, 17(3):395–412, 1969. 32, 35, 50
[dSP65] Derek J. de Solla Price. Networks of scientific papers. Science, 149(3683):510–515,
1965. 4, 5
[dSP76] Derek J. de Solla Price. A general theory of bibliometric and other cumulative
advantage processes. Journal of the American Society for Information Science,
27(5):292–306, 1976. 5
[dSPK78] Ithiel de Sola Pool and Manfred Kochen. Contacts and influence. Social Networks,
1:5–51, 1978. 6
[DSSW06] Daniel Delling, Peter Sanders, Dominik Schultes, and Dorothea Wagner. Highway
hierarchies star. In 9th DIMACS Implementation Challenge, 2006. 110, 114
[DSSW09] Daniel Delling, Peter Sanders, Dominik Schultes, and Dorothea Wagner. Engineering
route planning algorithms. In Algorithmics of Large and Complex Networks -
Design, Analysis, and Simulation [DFG priority program 1126], pages 117–139,
2009. 54, 97
[DW07] Daniel Delling and Dorothea Wagner. Landmark-based routing in dynamic graphs.
In Experimental Algorithms, 6th International Workshop (WEA07), Rome, Italy,
June 6-8, 2007, Proceedings, pages 52–65, 2007. 111, 117
[Eco09] The Economist. Rational consumer: The road ahead. The Economist (Technology
Quarterly), 392(8647):17, 2009. 122
[Edm65] Jack Edmonds. Paths, trees and flowers. Canadian Journal of Mathematics,
17:449–467, 1965. 25
[EG60] Paul Erdős and Tibor Gallai. Grafok előirt fokú pontokkal. Matematikai Lapok,
11:264–274, 1960. English title: Graphs with points of prescribed degrees. 24
[EG08] David Eppstein and Michael T. Goodrich. Studying (non-planar) road networks
through an algorithmic lens. In 16th ACM SIGSPATIAL International Symposium
on Advances in Geographic Information Systems, ACM-GIS 2008, November 5-7,
2008, Irvine, California, USA, Proceedings, page 16, 2008. 22, 31, 47, 103, 104
[Ege31] Jenő Egerváry. Matrixok kombinatorikus tulajdonságairól. Középiskolai Matematikai
és Fizikai Lapok, 38:16–27, 1931. Translation by Harold W. Kuhn in Logistics
Papers, issue 11, 1955. 8
[EGK+04] Stephen Eubank, Hasan Guclu, V. S. Anil Kumar, Madhav V. Marathe, Aravind
Srinivasan, Zoltan Toroczkai, and Nan Wang. Modelling disease outbreaks in
realistic urban social networks. Nature, 429(6988):180–184, 2004. 7, 10
[EGS09] David Eppstein, Michael T. Goodrich, and Darren Strash. Linear-time algorithms
for geometric graphs with sublinearly many crossings. In Proceedings of the Twentieth
Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2009, New
York, NY, USA, January 4-6, 2009, pages 150–159, 2009. 22
[EHP04] Jeff Erickson and Sariel Har-Peled. Optimally cutting a surface into a disk. Discrete
& Computational Geometry, 31(1):37–59, 2004. Announced at SOCG 2002.
19
[EJ73] Jack Edmonds and Ellis L. Johnson. Matching, euler tours and the Chinese postman.
Mathematical Programming, 5(1):88–124, 1973. 8
[EJ08] Geoffrey Exoo and Robert Jajcay. Dynamic cage survey. The Electronic Journal
of Combinatorics, 15, 2008. 28, 65
[EL82] R. J. Elliott and Michael Lesk. Route finding in street maps by computers and
people. In AAAI, pages 258–261, 1982. 54
[Elk05] Michael L. Elkin. Computing almost shortest paths. ACM Transactions on Algorithms,
1(2):283–323, 2005. Announced at PODC 2001. 38
[Elk08a] Michael L. Elkin. Sparse graph spanners. In Encyclopedia of Algorithms. 2008.
27
[Elk08b] Michael L. Elkin. Synchronizers, spanners. In Encyclopedia of Algorithms. 2008.
27
[Elm77] Salah E. Elmaghraby. Activity Networks: Project Planning and Control by Network
Models. Wiley, New York, 1977. 8
[EM01] Stefan Edelkamp and Ulrich Meyer. Theory and practice of time-space trade-offs
in memory limited search. In KI 2001: Advances in Artificial Intelligence,
Joint German/Austrian Conference on AI, Vienna, Austria, September 19-21, 2001,
Proceedings, pages 169–184, 2001. 53
[EMK97] Paul Embrechts, Thomas Mikosch, and Claudia Klüppelberg. Modelling extremal
events: for insurance and finance. Springer-Verlag, London, UK, 1997. 108
[EP04] Michael L. Elkin and David Peleg. (1+epsilon, beta)-spanner constructions for
general graphs. SIAM Journal on Computing, 33(3):608–631, 2004. 27
[Epp99] David Eppstein. Subgraph isomorphism in planar graphs and related problems.
Journal of Graph Algorithms and Applications, 3(3), 1999. Announced at SODA
1995. 45
[ER60] Paul Erdős and Alfréd A. Rényi. On the evolution of random graphs. Magyar
Tudományos Akadémia Matematikai Kutató Intézetének Közleményei, 5:17–61,
1960. 22, 23, 49
[ER97] Joost Engelfriet and Grzegorz Rozenberg. Node replacement graph grammars. In
Handbook of Graph Grammars and Computing by Graph Transformations, Volume
1: Foundations, pages 1–94, 1997. 48
[Erd35] Paul Erdős. Über die Primzahlen gewisser arithmetischer Reihen. Mathematische
Zeitschrift, 39(1):473–491, 1935. 66
[Erd63] Paul Erdős. On a Combinatorial Problem, I. Nordisk Matematisk Tidsskrift, 11:5–10,
1963. 67, 68
[Erd64] Paul Erdős. Extremal problems in graph theory. Theory Graphs Appl., Proc. Symp.
Smolenice, pages 29–36, 1964. 28, 41, 59
[ERS66] Paul Erdős, Alfréd A. Rényi, and Vera Turan Sos. On a problem of graph theory.
Studia Scientiarum Mathematicarum Hungarica, 1:215–235, 1966. 28
[Erw00] Martin Erwig. The graph Voronoi diagram with applications. Networks,
36(3):156–163, 2000. 97, 98, 99, 101, 102, 103, 118
[ES63] Paul Erdős and Horst Sachs. Reguläre Graphen gegebener Taillenweite mit minimaler
Knotenzahl. Wissenschaftliche Zeitschrift der Martin-Luther-Universität
Halle-Wittenberg, Mathematisch-Naturwissenschaftliche Reihe, pages 251–258,
1963. 28, 41, 59
[EW03] Nils Eissfeldt and Peter Wagner. Effect of anticipatory driving in traffic flow
model. The European Physical Journal B, 33:121–129, 2003. 10
[EW04] David Eppstein and Joseph Wang. Fast approximation of centrality. Journal of
Graph Algorithms and Applications, 8:39–45, 2004. Announced at SODA 2001.
12
[EWG08] Mihaela Enachescu, Mei Wang, and Ashish Goel. Reducing maximum stretch in
compact routing. In INFOCOM, pages 336–340, 2008. 49, 96
[FF58] Lester Randolph Ford and Delbert Ray Fulkerson. Constructing maximal dynamic
flows from static flows. Op. Res., 6(3):419–433, 1958. 8, 34
[FFF99] Michalis Faloutsos, Petros Faloutsos, and Christos Faloutsos. On power-law relationships
of the Internet topology. In SIGCOMM: Proceedings of the conference
on applications, technologies, architectures, and protocols for computer communication,
pages 251–262, 1999. 5, 82, 125
[FFV05] Abraham D. Flaxman, Alan M. Frieze, and Juan Vera. Adversarial deletion in a
scale free random graph process. In Proceedings of the 16th annual ACM-SIAM
symposium on Discrete algorithms, pages 287–292, 2005. 83
[FFV07] Abraham D. Flaxman, Alan M. Frieze, and Juan Vera. A geometric preferential
attachment model of networks ii. In Algorithms and Models for the Web-Graph,
5th International Workshop, WAW 2007, San Diego, CA, USA, December 11-12,
2007, Proceedings, pages 41–55, 2007. 4, 5
[FG85] Alan M. Frieze and Geoffrey R. Grimmett. The shortest-path problem for graphs
with random arc-lengths. Discrete Applied Mathematics, 10(1):57–77, 1985. 35
[FGG+05] Qing Fang, Jie Gao, Leonidas J. Guibas, V. de Silva, and Li Zhang. GLIDER:
gradient landmark-based distributed routing for sensor networks. In INFOCOM
2005. 24th Annual Joint Conference of the IEEE Computer and Communications
Societies, 13-17 March 2005, Miami, FL, USA, pages 339–350, 2005. 97
[FH04] Pedro F. Felzenszwalb and Daniel P. Huttenlocher. Efficient graph-based image
segmentation. International Journal of Computer Vision, 59(2):167–181, 2004.
11
[FJ88] Greg N. Frederickson and Ravi Janardan. Designing networks with compact routing
tables. Algorithmica, 3:171–190, 1988. 45
[FJ89] Greg N. Frederickson and Ravi Janardan. Efficient message routing in planar networks.
SIAM Journal on Computing, 18(4):843–857, 1989. 47
[FJ90] Greg N. Frederickson and Ravi Janardan. Space-efficient message routing in
c-decomposable networks. SIAM Journal on Computing, 19(1):164–181, 1990. 47
[FK07] Martin Fürer and Shiva Prasad Kasiviswanathan. Spanners for geometric intersection
graphs. In Algorithms and Data Structures, 10th International Workshop,
WADS 2007, Halifax, Canada, August 15-17, 2007, Proceedings, pages 312–324,
2007. 48, 121
[FKS84] Michael L. Fredman, János Komlós, and Endre Szemerédi. Storing a sparse table
with O(1) worst case access time. Journal of the ACM, 31(3):538–544, 1984.
Announced at FOCS 1982. 42, 89
[FKS89] Joel Friedman, Jeff Kahn, and Endre Szemerédi. On the second eigenvalue in
random regular graphs. In Proceedings of the Twenty-First Annual ACM Symposium
on Theory of Computing, 15-17 May 1989, Seattle, Washington, USA, pages
587–598, 1989. 65
[Fla63] C. Flament. Applications of Graph Theory to Group Structure. Prentice-Hall,
Englewood Cliffs, 1963. 12
[FLM67] B. A. Farbey, A. H. Land, and J. D. Murchland. The cascade algorithm for finding
all shortest distances in a directed graph. Management Science, 14(1):19–28, 1967.
35, 51
[Flo62] Robert W. Floyd. Algorithm 97: Shortest path. Communications of the ACM,
5(6):345, 1962. 8, 35, 37, 51
[FLV08] Pierre Fraigniaud, Emmanuelle Lebhar, and Laurent Viennot. The inframetric
model for the internet. In INFOCOM 2008. 27th IEEE International Conference
on Computer Communications, Joint Conference of the IEEE Computer and Communications
Societies, 13-18 April 2008, Phoenix, AZ, USA, pages 1085–1093,
2008. 49, 83
[FM95] Tomás Feder and Rajeev Motwani. Clique partitions, graph compression and
speeding-up algorithms. Journal of Computer and System Sciences, 51(2):261–272,
1995. Announced at STOC 1991. 36, 37
[For56] Lester R. Ford. Network flow theory. Report P-923, The Rand Corporation, 1956.
8, 34, 35, 37
[FPP08] Alessandro Ferrante, Gopal Pandurangan, and Kihong Park. On the hardness of
optimization in power-law graphs. Theoretical Computer Science, 393(1-3):220–230,
2008. 83
[FR06] Jittat Fakcharoenphol and Satish Rao. Planar graphs, negative weight edges,
shortest paths, and near linear time. Journal of Computer and System Sciences,
72(5):868–889, 2006. Announced at FOCS 2001. 32, 44, 47, 121
[FR08] Jittat Fakcharoenphol and Satish Rao. Shortest paths in planar graphs with negative
weight edges. In Encyclopedia of Algorithms. 2008. 44
[Fra08] Andrew U. Frank. Shortest path in a combined public transportation network. KI,
22(3):14–18, 2008. 2
[Fre76] Michael L. Fredman. New bounds on the complexity of the shortest path problem.
SIAM Journal on Computing, 5(1):83–89, 1976. 36, 37
[Fre77] Linton C. Freeman. A set of measures of centrality based on betweenness. Sociometry,
40(1):35–41, 1977. 12
[Fre87] Greg N. Frederickson. Fast algorithms for shortest paths in planar graphs, with
applications. SIAM Journal on Computing, 16(6):1004–1022, 1987. 35
[Fre91] Greg N. Frederickson. Planar graph decomposition and all pairs shortest paths.
Journal of the ACM, 38(1):162–204, 1991. 21, 45
[Fre95] Greg N. Frederickson. Using cellular graph embeddings in solving all pairs shortest
paths problems. Journal of Algorithms, 19(1):45–85, 1995. Announced at
FOCS 1989. 21, 45
[Fri64] Max Frisch. Mein Name sei Gantenbein. Suhrkamp, 1964. 81
[Fri76] Alan M. Frieze. Shortest path algorithms for knapsack type problems. Mathematical
Programming, 11(1):150–157, 1976. 8
[Fri91] Joel Friedman. On the second eigenvalue and random walks in random d-regular
graphs. Combinatorica, 11(4):331–362, 1991. 65
[FS97] Andrew Fetterer and Shashi Shekhar. A performance analysis of hierarchical shortest
path algorithms. In ICTAI, pages 84–93, 1997. 51, 54
[FT87] Michael L. Fredman and Robert Endre Tarjan. Fibonacci heaps and their uses in
improved network optimization algorithms. Journal of the ACM, 34(3):596–615,
1987. Announced at FOCS 1984. 33, 34, 103
[FTW87] Martin A. Fischler, Jay M. Tenenbaum, and Helen C. Wolf. Detection of roads and
linear structures in low-resolution aerial imagery using a multisource knowledge
integration technique. In Readings in computer vision: issues, problems, principles,
and paradigms, pages 741–752, 1987. 11
[FUS+98] Alexandre X. Falcao, Jayaram K. Udupa, Supun Samarasekera, Shoba Sharma,
Bruce Elliot Hirsch, and Roberto de A. Lotufo. User-steered image segmentation
paradigms: Live wire and live lane. Graphical Models and Image Processing,
60(4):233–260, 1998. 8, 10
[FVC07] Alan M. Frieze, Juan Vera, and Soumen Chakrabarti. The influence of search
engines on preferential attachment. Internet Mathematics, 3(3), 2007. 5
[FW93] Michael L. Fredman and Dan E. Willard. Surpassing the information theoretic
bound with fusion trees. Journal of Computer and System Sciences, 47(3):424–436,
1993. 34
[FW94] Michael L. Fredman and Dan E. Willard. Trans-dichotomous algorithms for minimum
spanning trees and shortest paths. Journal of Computer and System Sciences,
48(3):533–551, 1994. Announced at FOCS 1990. 34
[GA06] Christopher M. Gold and Paul Angel. Voronoi hierarchies. In Geographic Information
Science, 4th International Conference, GIScience 2006, Münster, Germany,
September 20-23, 2006, Proceedings, pages 99–111, 2006. 118
[Gal58] Tibor Gallai. Maximum-minimum Sätze über Graphen. Acta Mathematica
Academiae Scientiarum Hungaricae, 9:395–434, 1958. 8
[Gav01] Cyril Gavoille. Routing in distributed networks: overview and open problems.
SIGACT News, 32(1):36–52, 2001. 49
[GBB+03] Loic Giot, Joel S. Bader, C. Brouwer, Amitabha Chaudhuri, Bing Kuang, Y. Li,
YL. Hao, CE. Ooi, Brian Godwin, E. Vitols, G. Vijayadamodar, Philippe Pochart,
H. Machineni, M. Welsh, Y. Kong, B. Zerhusen, Robert J. Malcolm, Z. Varrone,
A. Collis, M. Minto, S. Burgess, L. McDaniel, E. Stimpson, F. Spriggs,
J. Williams, Kathryin Neurath, N. Ioime, M. Agee, E. Voss, K. Furtak, R. Renzulli,
N. Aanensen, S. Carrolla, E. Bickelhaupt, Y. Lazovatsky, A. DaSilva, J. Zhong,
Clement A. Stanyon, Russell L. Finley Jr, Kevin P. White, Michael S. Braverman,
Thomas P. Jarvie, S. Gold, M. Leach, J. Knight, Richard A. Shimkets, Michael P.
McKenna, John Chant, and Jonathan M. Rothberg. A protein interaction map of
Drosophila melanogaster. Science, 302(5651):1727–1736, 2003. 6
[GBL08] Amit Goyal, Francesco Bonchi, and Laks V.S. Lakshmanan. Discovering leaders
from community actions. In CIKM '08: Proceeding of the 17th ACM conference
on Information and knowledge management, pages 499–508, 2008. 7
[GCE+09] Emanuele Galli, Leticia Cuellar, Stephan Eidenbenz, Mary Ewers, Sue
Mniszewski, and Christof Teuscher. Activitysim: large-scale agent-based activity
generation for infrastructure simulation. In Proceedings of the 2009 Spring
Simulation Multiconference, SpringSim 2009, San Diego, California, USA, March
22-27, 2009, 2009. 7
[Gel63] Herbert L. Gelernter. Realization of a geometry theorem proving machine. Computers
and Thought, 1963. 35, 52
[Gel77] David Gelperin. On the optimality of A*. Artificial Intelligence, 8(1):69–76, 1977.
35, 52
[GGM06] Jesus Gomez-Gardenes and Yamir Moreno. From scale-free to Erdos-Renyi networks.
Physical Review E (Statistical, Nonlinear, and Soft Matter Physics),
73(5):056124, 2006. 4
[GH05] Andrew V. Goldberg and Chris Harrelson. Computing the shortest path: A* search
meets graph theory. In Proceedings of the Sixteenth Annual ACM-SIAM Symposium
on Discrete Algorithms (SODA05), Vancouver, British Columbia, Canada,
January 23-25, 2005, pages 156–165, 2005. 53, 91, 111, 114
[GHK90] Donald Goldfarb, Jianxiu Hao, and Sheng-Roan Kai. Efficient shortest path simplex
algorithms. Operations Research, 38(4):624–628, 1990. 34
[GHT84] John R. Gilbert, Joan P. Hutchinson, and Robert Endre Tarjan. A separator theorem
for graphs of bounded genus. Journal of Algorithms, 5(3):391–407, 1984. 31
[Gil59] Edgar N. Gilbert. Random graphs. The Annals of Mathematical Statistics,
30:1141–1144, 1959. 22, 23
[GJ90] Michael R. Garey and David S. Johnson. Computers and Intractability; A Guide
to the Theory of NP-Completeness. W. H. Freeman & Co., 1990. 26
[GJ99] Donald Goldfarb and Zhiying Jin. An O(nm)-time network simplex algorithm for
the shortest path problem. Operations Research, 47(3):445–448, 1999. 34
[GKK+01] Cyril Gavoille, Michal Katz, Nir A. Katz, Christophe Paul, and David Peleg. Approximate
distance labeling schemes. In Algorithms - ESA 2001, 9th Annual European
Symposium, Aarhus, Denmark, August 28-31, 2001, Proceedings, pages
476–487, 2001. 48
[GKN74] Fred Glover, D. Klingman, and A. Napier. A note on finding all shortest paths.
Transportation Science, 8:3–12, 1974. 35, 51
[GKP85] Fred Glover, Darwin D. Klingman, and Nancy V. Phillips. A new polynomially
bounded shortest path algorithm. Operations Research, 33(1):65–73, 1985. 34
[GKP05] Naveen Garg, Rohit Khandekar, and Vinayaka Pandit. Improved approximation
for universal facility location. In Proceedings of the Sixteenth Annual ACM-SIAM
Symposium on Discrete Algorithms, (SODA05), Vancouver, British Columbia,
Canada, January 23-25, 2005, pages 959–960, 2005. 98
[GKPS85] Fred Glover, Darwin D. Klingman, Nancy V. Phillips, and Robert F. Schneider.
New polynomial shortest path algorithms and their computational attributes. Management
Science, 31(9):1106–1128, 1985. 34
[GKR04] Sandeep Gupta, Swastik Kopparty, and Chinya Ravishankar. Roads, codes, and
spatiotemporal queries. In PODS '04: Proceedings of the twenty-third ACM
SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages
115–124, 2004. 45, 55, 121
[GKW06] Andrew Goldberg, Haim Kaplan, and Renato Fonseca F. Werneck. Reach for A*:
Efficient point-to-point shortest path algorithms. In Proceedings of the Eighth
Workshop on Algorithm Engineering and Experiments (ALENEX06), pages 129–143.
SIAM, 2006. 53
[GKW07] Andrew V. Goldberg, Haim Kaplan, and Renato Fonseca F. Werneck. Better landmarks
within reach. In Experimental Algorithms, 6th International Workshop,
WEA 2007, Rome, Italy, June 6-8, 2007, Proceedings, pages 38–51, 2007. 51, 53,
105
[GL07] Cyril Gavoille and Arnaud Labourel. Shorter implicit representation for planar
graphs and bounded treewidth graphs. In Algorithms - ESA 2007, 15th Annual
European Symposium, Eilat, Israel, October 8-10, 2007, Proceedings, pages 582–593,
2007. 31
[GLNS08] Joachim Gudmundsson, Christos Levcopoulos, Giri Narasimhan, and Michiel
H. M. Smid. Approximate distance oracles for geometric spanners. ACM Transactions
on Algorithms, 4(1), 2008. 50, 121
[GM77] Bruce Golden and Thomas L. Magnanti. Deterministic network optimization: A
bibliography. Networks, 7:149–183, 1977. 32
[GM93] Zvi Galil and Oded Margalit. Witnesses for boolean matrix multiplication and for
transitive closure. Journal of Complexity, 9(2):201–221, 1993. 36
[GM97] Zvi Galil and Oded Margalit. All pairs shortest distances for graphs with small
integer length edges. Information and Computation, 134(2):103–139, 1997. 36
[GMMO00] Sudipto Guha, Nina Mishra, Rajeev Motwani, and Liadan O'Callaghan. Clustering
data streams. In Proceedings of the IEEE Symposium on Foundations of Computer
Science (FOCS), pages 359–366, 2000. 13
[GN64] Allan L. Gutjahr and George L. Nemhauser. An algorithm for the line balancing
problem. Management Science, 11(2):308–315, 1964. 8
[GN67] A. J. Goldman and G. L. Nemhauser. A transport improvement problem transformable
to a best-path problem. Transportation Science, 1(4):295–307, 1967.
8
[GN02] M. Girvan and Mark E. J. Newman. Community structure in social and biological
networks. Proceedings of the National Academy of Sciences, 99(12):7821–7826,
2002. 13
[Gol76] Bruce Golden. Shortest-path algorithms: A comparison. Operations Research,
24(6):1164–1168, 1976. 35, 50
[Gol95] Andrew V. Goldberg. Scaling algorithms for the shortest paths problem. SIAM
Journal on Computing, 24(3):494–504, 1995. Announced at SODA 1993. 32
[Gol01] Andrew V. Goldberg. Shortest path algorithms: Engineering aspects. In
Algorithms and Computation, 12th International Symposium, ISAAC 2001,
Christchurch, New Zealand, December 19-21, 2001, Proceedings, pages 502–513,
2001. 35
[Gol07] Andrew V. Goldberg. Point-to-point shortest path algorithms with preprocessing.
In SOFSEM 2007: Theory and Practice of Computer Science, 33rd Conference on
Current Trends in Theory and Practice of Computer Science, Harrachov, Czech
Republic, January 20-26, 2007, Proceedings, pages 88–102, 2007. 54
[Gol08] Andrew V. Goldberg. A practical shortest path algorithm with linear expected
time. SIAM Journal on Computing, 37(5):1637–1655, 2008. 35
[GOY76] Satoshi Goto, Tatsuo Ohtsuki, and Takeshi Yoshimura. Sparse matrix techniques
for the shortest path problem. IEEE Transactions on Circuits and Systems,
23(12):752–758, Dec 1976. 35
[GP72] Ronald L. Graham and Henry O. Pollak. On embedding graphs in squashed cubes,
volume 303, pages 99–110. 1972. 30
[GP86] Giorgio Gallo and Stefano Pallottino. Shortest path methods: A unifying approach.
Mathematical Programming Studies, 26:38–64, 1986. 34
[GP03a] Cyril Gavoille and Christophe Paul. Optimal distance labeling for interval and
circular-arc graphs. In Algorithms - ESA 2003, 11th Annual European Symposium,
Budapest, Hungary, September 16-19, 2003, Proceedings, pages 254–265, 2003.
48
[GP03b] Cyril Gavoille and David Peleg. Compact and localized distributed data structures.
Distributed Computing, 16(2-3):111–120, 2003. 29, 49
[GPPR04] Cyril Gavoille, David Peleg, Stéphane Pérennes, and Ran Raz. Distance labeling
in graphs. Journal of Algorithms, 53(1):85–112, 2004. Announced at SODA 2000.
29
[Gra06] Leo Grady. Computing exact discrete minimal surfaces: Extending and solving the
shortest path problem in 3d with application to segmentation. In IEEE Computer
Society Conference on Computer Vision and Pattern Recognition, pages 69–78,
2006. 11
[Gra08] Leo Grady. Minimal surfaces extend shortest path segmentation methods to 3d.
IEEE Transactions on Pattern Analysis and Machine Intelligence, PP(99):1, December
2008. 11
[GSDF03] Lise Getoor, Ted E. Senator, Pedro Domingos, and Christos Faloutsos, editors.
SIGKDD Proceedings, 2003. 112
[GSSD08] Robert Geisberger, Peter Sanders, Dominik Schultes, and Daniel Delling. Contraction
hierarchies: Faster and simpler hierarchical routing in road networks. In
Experimental Algorithms, 7th International Workshop (WEA08), Provincetown,
MA, USA, May 30-June 1, 2008, Proceedings, pages 319–333, 2008. 47, 51, 54,
55, 56, 111, 112, 114, 117, 123
[GSVGM98] Roy Goldman, Narayanan Shivakumar, Suresh Venkatasubramanian, and Hector
Garcia-Molina. Proximity search in databases. In VLDB98, Proceedings of 24rd
International Conference on Very Large Data Bases, August 24-27, 1998, New
York City, New York, USA, pages 26–37, 1998. 57, 83
[GT89] Harold N. Gabow and Robert Endre Tarjan. Faster scaling algorithms for network
problems. SIAM Journal on Computing, 18(5):1013–1036, 1989. 32
[Gua93] John Guare. Six degrees of separation, 1993. Movie. 6
[Gut04] Ron Gutman. Reach-based routing: A new approach to shortest path algorithms
optimized for road networks. In Proceedings 6th Workshop on Algorithm Engineering
and Experiments (ALENEX), pages 100–111. SIAM, 2004. 53, 56
[GW73] David E. Gilsinn and Christoph Witzgall. A performance comparison of labeling
algorithms for calculating shortest path trees. Technical Note 772, National
Institute of Standards and Technology, 1973. 35, 50
[GW05] Andrew V. Goldberg and Renato Fonseca F. Werneck. Computing point-to-point
shortest paths from external memory. In Proceedings of the Seventh Workshop
on Algorithm Engineering and Experiments and the Second Workshop on Analytic
Algorithmics and Combinatorics, ALENEX/ANALCO 2005, Vancouver, BC,
Canada, 22 January 2005, pages 26–40, 2005. 53, 54, 111
[GYY80] Ronald L. Graham, Andrew Chi-Chih Yao, and F. Frances Yao. Information
bounds are weak in the shortest distance problem. Journal of the ACM, 27(3):428–444,
1980. 35
[GZ05] Jie Gao and Li Zhang. Well-separated pair decomposition for the unit-disk graph
metric and its applications. SIAM Journal on Computing, 35(1):151–169, 2005.
48, 121
[GZC+09] Ido Guy, Naama Zwerdling, David Carmel, Inbal Ronen, Erel Uziel, Sivan Yogev,
and Shila Ofek-Koifman. Personalized recommendation of social software items
based on social relations. In Proceedings of the 2009 ACM Conference on
Recommender Systems, RecSys 2009, New York, NY, USA, October 23-25, 2009, pages
53–60, 2009. 13
[Hag00a] Torben Hagerup. Dynamic algorithms for graphs of bounded treewidth.
Algorithmica, 27(3):292–315, 2000. Announced at ICALP 1997. 48
[Hag00b] Torben Hagerup. Improved shortest paths on the word RAM. In Automata,
Languages and Programming, 27th International Colloquium, ICALP 2000, Geneva,
Switzerland, July 9-15, 2000, Proceedings, pages 61–72, 2000. 33
[Hag06] Torben Hagerup. Simpler computation of single-source shortest paths in linear
average time. Theory of Computing Systems, 39(1):113–120, 2006. 35
[Hak62] Seifollah Louis Hakimi. On realizability of a set of integers as degrees of the
vertices of a linear graph. I. Journal of the Society for Industrial and Applied
Mathematics, 10(3):496–506, 1962. 24
[Hal76] Rudolf Halin. S-functions for graphs. Journal of Geometry, 8(1-2):171–186, 1976.
20
[Han87] Eric N. Hanson. A performance analysis of view materialization strategies.
SIGMOD Rec., 16(3):440–453, 1987. 10
[Han04] Yijie Han. Improved algorithm for all pairs shortest paths. Information Processing
Letters, 91(5):245–250, 2004. 37
[Han08a] Yijie Han. A note of an $O(n^3/\log n)$ time algorithm for all pairs shortest paths.
Information Processing Letters, 105(3):114–116, 2008. 37
[Han08b] Yijie Han. An $O(n^3 (\log\log n/\log n)^{5/4})$ time algorithm for all pairs shortest
path. Algorithmica, 51(4):428–434, 2008. Announced at ESA 2006. 37
[Hay06] Brian Hayes. Connecting the dots. Computing Science, 94(5):400, 2006. 4
[Hel53] Isidor Heller. On the problem of shortest path between points, I. Bulletin of the
American Mathematical Society, 59, 1953. A study of the five-city Traveling
Salesman Problem polytope. 8
[HHSW09] Shinichi Honiden, Michael E. Houle, Christian Sommer, and Martin Wolff.
Approximate shortest path queries in graphs using Voronoi duals. In Sixth annual
International Symposium on Voronoi Diagrams in Science and Engineering (ISVD),
pages 53–62, 2009. Invited to special issue of Transactions on Computational
Science. 16
[Hit68] Lewis E. Hitchner. A comparative investigation of the computational efficiency of
shortest path algorithms. Technical Report ORC 68-17, University of California at
Berkeley, 1968. 35, 50
[HJR95] Yun-Wu Huang, Ning Jing, and Elke A. Rundensteiner. Hierarchical path views: A
model based on fragmentation and transportation road types. In ACM-GIS, pages
93, 1995. 51, 54
[HJR96a] Yun-Wu Huang, Ning Jing, and Elke A. Rundensteiner. Effective graph clustering
for path queries in digital map databases. In CIKM '96, Proceedings of the Fifth
International Conference on Information and Knowledge Management, November
12 - 16, 1996, Rockville, Maryland, USA, pages 215–222, 1996. 54
[HJR96b] Yun-Wu Huang, Ning Jing, and Elke A. Rundensteiner. Path queries for
transportation networks: Dynamic reordering and sliding window paging techniques.
In GIS '96, Proceedings of the fourth ACM workshop on Advances
in Geographic Information Systems, November 15-16, 1996, Rockville, Maryland,
USA, pages 9–16, 1996. 54
[HJR97a] Yun-Wu Huang, Ning Jing, and Elke A. Rundensteiner. A hierarchical path view
model for path finding in intelligent transportation systems. GeoInformatica,
1(2):125–159, 1997. 54
[HJR97b] Yun-Wu Huang, Ning Jing, and Elke A. Rundensteiner. Integrated query
processing strategies for spatial path queries. In Proceedings of the Thirteenth
International Conference on Data Engineering, April 7-11, 1997 Birmingham U.K, pages
477–486, 1997. 54
[HJR00] Yun-Wu Huang, Ning Jing, and Elke A. Rundensteiner. Optimizing path query
performance: Graph clustering strategies, 2000. 54
[HKRS97] Monika Rauch Henzinger, Philip Nathan Klein, Satish Rao, and Sairam
Subramanian. Faster shortest-path algorithms for planar graphs. Journal of Computer and
System Sciences, 55(1):3–23, 1997. Announced at STOC 1994. 35, 103
[HLL06] Haibo Hu, Dik Lun Lee, and Victor C. S. Lee. Distance indexing on road networks.
In VLDB '06: Proceedings of the 32nd international conference on Very large data
bases, pages 894–905, 2006. 54
[HLW98] Abolhassan Halati, Henry Lieu, and Susan Walker. CORSIM-corridor traffic
simulation model. In Proceedings at the traffic congestion and traffic safety conference,
pages 570–576, 1998. 10
[HLW06] Shlomoh Hoory, Nati Linial, and Avi Wigderson. Expander graphs and their
applications. Bulletin of the American Mathematical Society, 43(4):439–561, 2006.
2, 65
[HMZ03] David A. Hutchinson, Anil Maheshwari, and Norbert Zeh. An external memory
data structure for shortest path queries. Discrete Applied Mathematics, 126(1):55–82,
2003. 45
[HNR68] Peter E. Hart, Nils J. Nilsson, and Bertram R. Raphael. A formal basis for the
heuristic determination of minimum cost paths in graphs. IEEE Transactions on
Systems Science and Cybernetics, SSC-4(2):100–107, 1968. 35, 52
[Hol04] Johan Holmgren. Efficient updating shortest path calculations for traffic
assignment. Technical Report LITH-MAI-EX-2004-13, Division of Optimization,
Department of Mathematics, Linköping Institute of Technology, Linköping, October
2004. 10
[Hol08] Martin Holzer. Engineering Planar-Separator and Shortest-Path Algorithms. PhD
thesis, Universität Karlsruhe, 2008. 51, 54
[Hoo02] Shlomo Hoory. On graphs of high girth. PhD thesis, Hebrew University, 2002. 28
[HP58] Walter Hoffman and Richard Pavley. Applications of digital computers to
problems in the study of vehicular traffic. In IRE-ACM-AIEE '58 (Western):
Proceedings of the May 6-8, 1958, western joint computer conference: contrasts in
computers, pages 159–161, 1958. 10, 54
[HPCD96] John Joseph Helmsen, Elbridge Gerry Puckett, Phillip Colella, and M. Dorr. Two
new methods for simulating photolithography development in 3D. In Proceedings
of the SPIE The International Society for Optical Engineering Optical
Microlithography IX, pages 253–261, 1996. 11
[HSP92] Jianying Hu, William J. Sakoda, and Theodosios Pavlidis. Interactive road finding
for aerial images. In IEEE Workshop on Applications of Computer Vision, pages
56–63, 1992. 11
[HST09] Bernhard Haeupler, Siddhartha Sen, and Robert Endre Tarjan. Rank-pairing heaps.
In Algorithms - ESA 2009, 17th Annual European Symposium, Copenhagen,
Denmark, September 7-9, 2009. Proceedings, pages 659–670, 2009. 33
[HSW08] Martin Holzer, Frank Schulz, and Dorothea Wagner. Engineering multilevel
overlay graphs for shortest-path queries. ACM Journal of Experimental Algorithmics,
13, 2008. 51, 54
[HSWW05] Martin Holzer, Frank Schulz, Dorothea Wagner, and Thomas Willhalm.
Combining speed-up techniques for shortest-path computations. ACM Journal of
Experimental Algorithmics, 10, 2005. 54, 55
[HT69] Te C. Hu and William T. Torres. Shortcut in the decomposition algorithm for
shortest paths in a network. IBM Journal of Research and Development, 13(4):387–390,
1969. 35, 36, 51, 125
[HT02] Yijie Han and Mikkel Thorup. Integer sorting in $O(n\sqrt{\log\log n})$ expected time
and linear space. In 43rd Symposium on Foundations of Computer Science (FOCS
2002), 16-19 November 2002, Vancouver, BC, Canada, Proceedings, pages
135–144, 2002. 34
[HTB01] Michelle R. Hribar, Valerie E. Taylor, and David E. Boyce. Implementing
parallel shortest path for parallel transportation applications. Parallel Computing,
27(12):1537–1568, 2001. 35
[Hu67] Te C. Hu. Revised matrix algorithms for shortest paths. SIAM Journal on Applied
Mathematics, 15(1):207–218, 1967. 8
[Hu68] Te C. Hu. A decomposition algorithm for shortest paths in a network. Operations
Research, 16(1):91–102, 1968. 35, 51
[Hu69] Te C. Hu. Integer Programming and Network Flows. Addison Wesley, 1969. 51,
55
[Hu71] Te C. Hu. Some problems in discrete optimization. Mathematical Programming,
1:102–112, 1971. 9
[Hui00] Christian Huitema. Routing in the Internet (2nd ed.). Prentice Hall PTR, Upper
Saddle River, NJ, USA, 2000. 14
[HW07] Johan Håstad and Avi Wigderson. The randomized communication complexity of
set disjointness. Theory of Computing, 3(1):211–219, 2007. 62
[HWYY05] Hao He, Haixun Wang, Jun Yang, and Philip S. Yu. Compact reachability labeling
for graph-structured data. In CIKM '05: Proceedings of the 14th ACM
international conference on Information and knowledge management, pages 594–601,
2005. 57
[ICO+01] Takashi Ito, Tomoko Chiba, Ritsuko Ozawa, Mikio Yoshida, Masahira Hattori, and
Yoshiyuki Sakaki. A comprehensive two-hybrid analysis to explore the yeast
protein interactome. Proceedings of the National Academy of Sciences, 98(8):4569–4574,
2001. 6
[IFB+02] Jan Ihmels, Gilgi Friedlander, Sven Bergmann, Ofer Sarig, Yaniv Ziv, and Naama
Barkai. Revealing modular organization in the yeast transcriptional network.
Nature Genetics, 31(4):370–377, 2002. 6
[IHI+94] Takahiro Ikeda, Min-Yao Hsu, Hiroshi Imai, Shigeki Nishimura, Hiroshi
Shimoura, Takeo Hashimoto, Kenji Tenmoku, and Kunihiko Mitoh. A fast algorithm
for finding better routes by AI search techniques. In Vehicle Navigation and
Information Systems Conference, pages 291–296, 1994. 35, 53
[II84] Hiroshi Imai and Masao Iri. Practical efficiencies of existing shortest-path
algorithms and a new bucket algorithm. Journal of the Operations Research Society of
Japan, 27(1):43–58, 1984. 35
[II86] Hiroshi Imai and Masao Iri. Computational-geometric methods for polygonal
approximations of a curve. Computer Vision, Graphics, and Image Processing,
36(1):31–41, 1986. 8
[IM04] Piotr Indyk and Jiří Matoušek. CRC Handbook of Discrete and Computational
Geometry, 2nd edition, chapter Low-Distortion Embeddings of Finite Metric Spaces,
pages 177–196. 2004. 29
[IN72] Masao Iri and Mario Nakamori. Path-sets, operator semigroups and shortest path
algorithms on a network. Research Association of Applied Geometry (RAAG)
memoirs of the unifying study of basic problems in engineering and physical
sciences by means of geometry 185, 3rd series, 1972. 36
[Ind01] Piotr Indyk. Algorithmic applications of low-distortion geometric embeddings.
In Proceedings of the IEEE Symposium on Foundations of Computer Science
(FOCS), pages 10–33, 2001. 29
[IOAI91] Kunihiro Ishikawa, Michima Ogawa, Shigetoshi Azuma, and Tooru Ito. Map
navigation software of the electro-multivision of the '91 Toyota Soarer. In Vehicle
Navigation and Information Systems Conference, 1991, volume 2, pages 463–473,
Oct. 1991. 51, 54
[Iri92] Masao Iri. How to generate realistic sample problems for network optimization.
In ISAAC '92: Proceedings of the Third International Symposium on Algorithms
and Computation, pages 342–350, London, UK, 1992. Springer-Verlag. 22
[Ita08] Giuseppe F. Italiano. Fully dynamic all pairs shortest paths. In Encyclopedia of
Algorithms. 2008. 32, 35, 39
[Jac08] Riko Jacob. Shortest paths approaches for timetable information. In Encyclopedia
of Algorithms. 2008. 56
[Jag90] H. V. Jagadish. A compression technique to materialize transitive closure. ACM
Transactions on Database Systems, 15(4):558–598, 1990. 57
[JBIH05] Maliackal Poulo Joy, Amy Brock, Donald E. Ingber, and Sui Huang.
High-betweenness proteins in the yeast protein interaction network. Journal of
Biomedicine and Biotechnology, 2005(2):96–103, 2005. 12
[JHR96] Ning Jing, Yun-Wu Huang, and Elke A. Rundensteiner. Hierarchical optimization
of optimal path finding for transportation applications. In CIKM '96, Proceedings
of the Fifth International Conference on Information and Knowledge Management,
November 12 - 16, 1996, Rockville, Maryland, USA, pages 261–268, 1996. 10, 51,
54
[JHR98] Ning Jing, Yun-Wu Huang, and Elke A. Rundensteiner. Hierarchical encoded path
views for path query processing: An optimal model and its performance
evaluation. IEEE Transactions on Knowledge and Data Engineering, 10(3):409–432,
1998. 54
[JL84] William B. Johnson and Joram Lindenstrauss. Extensions of Lipschitz maps into
a Hilbert space. Contemporary Mathematics, 26:189–206, 1984. 30, 50
[JMBO01] Hawoong Jeong, S. P. Mason, Albert-László Barabási, and Zoltan N. Oltvai.
Lethality and centrality in protein networks. Nature, 411:41–42, 2001. 12
[JMN99] Riko Jacob, Madhav V. Marathe, and Kai Nagel. A computational study of routing
algorithms for realistic transportation networks. ACM Journal of Experimental
Algorithmics, 4:6, 1999. 35
[JOB03] Hawoong Jeong, Zoltan N. Oltvai, and Albert-László Barabási. Prediction of
protein essentiality based on genomic data. Complexus, 1(1):19–28, 2003. 6, 12
[Joh72] Ellis L. Johnson. On shortest paths and sorting. In ACM'72: Proceedings of the
ACM annual conference, pages 510–517, 1972. 33
[Joh77] Donald B. Johnson. Efficient algorithms for shortest paths in sparse networks.
Journal of the ACM, 24(1):1–13, 1977. 33, 35
[Joh82] Donald B. Johnson. A priority queue in which initialization and queue operations
take O(log log D) time. Mathematical Systems Theory, 15(4):295–309, 1982. 34
[Joh97] Peter Johansson. On a weighted distance model for injection moulding. Master's
thesis, Linköping University, 1997. 8
[JP02] Sungwon Jung and Sakti Pramanik. An efficient path computation model for
hierarchically structured topographical road maps. IEEE Transactions on Knowledge
and Data Engineering, 14(5):1029–1046, 2002. 51, 54
[JW03] Glen Jeh and Jennifer Widom. Scaling personalized web search. In WWW '03:
Proceedings of the 12th international conference on World Wide Web, pages
271–279, 2003. 13
[JXRF09] Ruoming Jin, Yang Xiang, Ning Ruan, and David Fuhry. 3-HOP: a
high-compression indexing scheme for reachability query. In SIGMOD '09:
Proceedings of the 35th SIGMOD international conference on Management of data, pages
813–826, 2009. 54
[JXRW08] Ruoming Jin, Yang Xiang, Ning Ruan, and Haixun Wang. Efficiently answering
reachability queries on very large directed graphs. In SIGMOD '08:
Proceedings of the 2008 ACM SIGMOD international conference on Management of data,
pages 595–608, 2008. 54
[JZ05] Hema Jampala and Norbert Zeh. Cache-oblivious planar shortest paths. In
Automata, Languages and Programming, 32nd International Colloquium, ICALP
2005, pages 563–575, 2005. 35
[Kar29] Frigyes Karinthy. Láncszemek. 1929. 6
[Ker70] Leslie Robert Kerr. The effect of algebraic structure on the computational
complexity of matrix multiplication. PhD thesis, Cornell University, Ithaca, NY, USA,
1970. 35
[Ker04] Boris S. Kerner. The Physics of Traffic. Springer, 2004. 122
[KFY04] Dmitri V. Krioukov, Kevin R. Fall, and Xiaowei Yang. Compact routing on
internet-like graphs. In INFOCOM, 2004. 49, 81, 82, 121, 125
[KG92] John Mark Keil and Carl Andrew Gutwin. Classes of graphs which approximate
the complete Euclidean graph. Discrete & Computational Geometry, 7:13–28,
1992. 118
[KH05] Sebastian König and Juergen Hesser. 3D live-wires on pre-segmented volume data.
Medical Imaging 2005: Image Processing, 5747(1):1674–1681, 2005. 11
[KH08] Hitoshi Kanoh and Kenta Hara. Hybrid genetic algorithm for dynamic
multi-objective route planning with predicted traffic in a real-world road network. In
Genetic and Evolutionary Computation Conference, GECCO 2008, Proceedings,
Atlanta, GA, USA, July 12-16, 2008, pages 657–664, 2008. 54
[KHI+86] Ru-Mei Kung, Eric N. Hanson, Yannis E. Ioannidis, Timos K. Sellis, Leonard D.
Shapiro, and Michael Stonebraker. Heuristic search in database systems. In
Proceedings from the first international workshop on Expert database systems, pages
537–548, Redwood City, CA, USA, 1986. Benjamin-Cummings Publishing Co.,
Inc. 35, 52
[KIK+06] Shinichi Konomi, Sozo Inoue, Takashi Kobayashi, Masashi Tsuchida, and Masaru
Kitsuregawa. Supporting colocated interactions using RFID and social network
displays. IEEE Pervasive Computing, 5(3):48–56, 2006. 14
[KK77] Leonard Kleinrock and Farouk Kamoun. Hierarchical routing for large networks;
performance evaluation and optimization. Computer Networks, 1:155–174, 1977.
51
[KK03] Lukasz Kowalik and Maciej Kurowski. Short path queries in planar graphs in
constant time. In Proceedings of the 35th Annual ACM Symposium on Theory of
Computing, June 9-11, 2003, San Diego, CA, USA, pages 143–148, 2003. 45
[KKK+07] Hans-Peter Kriegel, Peer Kröger, Peter Kunath, Matthias Renz, and Tim Schmidt.
Proximity queries in large traffic networks. In 15th ACM International
Symposium on Geographic Information Systems, ACM-GIS 2007, November 7-9, 2007,
Seattle, Washington, USA, Proceedings, page 21, 2007. 54
[KKK+08] Hans-Peter Kriegel, Peer Kröger, Peter Kunath, Matthias Renz, and Tim Schmidt.
Efficient query processing in large traffic networks. In Proceedings of the 24th
International Conference on Data Engineering, ICDE 2008, April 7-12, 2008,
Cancún, México, pages 1451–1453, 2008. 54
[KKP93] David R. Karger, Daphne Koller, and Steven J. Phillips. Finding the hidden
path: Time bounds for all-pairs shortest paths. SIAM Journal on Computing,
22(6):1199–1217, 1993. Announced at FOCS 1991. 35
[KKR+99] Jon Michael Kleinberg, Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan,
and Andrew Tomkins. The web as a graph: Measurements, models, and methods.
In COCOON, pages 1–17, 1999. 5
[KKR08] Hans-Peter Kriegel, Peer Kröger, and Matthias Renz. Continuous proximity
monitoring in road networks. In 16th ACM SIGSPATIAL International Symposium
on Advances in Geographic Information Systems, ACM-GIS 2008, November 5-7,
2008, Irvine, California, USA, Proceedings, page 12, 2008. 54
[KKRS08] Hans-Peter Kriegel, Peer Kröger, Matthias Renz, and Tim Schmidt. Hierarchical
graph embedding for efficient query processing in very large traffic networks. In
Scientific and Statistical Database Management, 20th International Conference,
SSDBM 2008, Hong Kong, China, July 9-11, 2008, Proceedings, pages 150–167,
2008. 51, 54
[KL95] Drago Krznaric and Christos Levcopoulos. Computing hierarchies of clusters from
the euclidean minimum spanning tree in linear time. In Foundations of Software
Technology and Theoretical Computer Science, 15th Conference, Bangalore,
India, December 18-20, 1995, Proceedings, pages 443–455, 1995. 50
[KL06] Ravi Kumar and Matthieu Latapy. Preface. Theoretical Computer Science: special
issue on Complex Networks, 355(1):1–5, 2006. 4
[Kle63] Morton Klein. On assembly line balancing. Operations Research, 11(2):274–281,
1963. 8
[Kle64] Victor LaRue Klee. A string algorithm for shortest path in directed networks.
Operations Research, 12(3):428–432, 1964. 8, 35
[Kle00] Jon Michael Kleinberg. Navigation in a small world. Nature, 406(6798):845,
2000. 24, 83
[Kle02a] Philip Nathan Klein. Preprocessing an undirected planar network to enable fast
approximate distance queries. In Symposium on Discrete Algorithms (SODA), pages
820–827, 2002. 46, 47, 121
[Kle02b] Judith S. Kleinfeld. Six degrees of separation: Urban myth? Psychology Today,
35(2):74, 2002. 6
[Kle02c] Judith S. Kleinfeld. The small world problem. Society, 39(2):61–66, 2002. 6
[Kle05] Philip Nathan Klein. Multiple-source shortest paths in planar graphs. In
Symposium on Discrete Algorithms (SODA), pages 146–155, 2005. 44, 46, 47, 121
[KLMR98] Lydia E. Kavraki, Jean-Claude Latombe, Rajeev Motwani, and Prabhakar
Raghavan. Randomized query processing in robot path planning. Journal of Computer
and System Sciences, 57(1):50–66, 1998. 39
[KMS05] Ekkehard Köhler, Rolf H. Möhring, and Heiko Schilling. Acceleration of shortest
path and constrained shortest path computation. In Experimental and Efficient
Algorithms, 4th International Workshop, WEA 2005, Santorini Island, Greece, May
10-13, 2005, Proceedings, pages 126–138, 2005. 54
[KN96] Eyal Kushilevitz and Noam Nisan. Communication Complexity. Cambridge
University Press, 1996. 61
[KNR92] Sampath Kannan, Moni Naor, and Steven Rudich. Implicit representation of
graphs. SIAM Journal on Discrete Mathematics, 5(4):596–603, 1992. Announced
at STOC 1988. 31
[Knu89] Donald Ervin Knuth. Theory and practice. Available online as P138, August 1989.
15, 97
[Koe36] Paul Koebe. Kontaktprobleme der konformen Abbildung. Berichte über
die Verhandlungen der Sächsischen Akademie der Wissenschaften zu Leipzig,
Mathematisch-Physikalische Klasse, 88:141–164, 1936. Contains the circle
packing theorem. 22, 48
[Kor85] Richard E. Korf. Depth-first iterative-deepening: An optimal admissible tree
search. Artificial Intelligence, 27(1):97–109, 1985. 52
[Kor01] Guy Kortsarz. On the hardness of approximating spanners. Algorithmica,
30(3):432–450, 2001. 27
[KP69] Ronald F. Kirby and Renfrey B. Potts. The minimum route problem for
networks with turn penalties and prohibitions. Transportation Research, 3(3):397–408,
1969. 39
[KPSZ96] Dimitris J. Kavvadias, Grammati E. Pantziou, Paul G. Spirakis, and Christos D.
Zaroliagis. Hammock-on-ears decomposition: A technique for the efficient
parallel solution of shortest paths and other problems. Theoretical Computer Science,
168(1):121–154, 1996. 21
[KRR+00] Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, D. Sivakumar, Andrew
Tomkins, and Eli Upfal. Random graph models for the web graph. In Proceedings
of the IEEE Symposium on Foundations of Computer Science (FOCS), pages
57–65, 2000. 5, 24
[KRX07] Goran Konjevod, Andréa W. Richa, and Donglin Xia. Optimal scale-free compact
routing schemes in networks of low doubling dimension. In Proceedings of the
Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2007,
New Orleans, Louisiana, USA, January 7-9, 2007, pages 939–948, 2007. 49
[KS92] Bala Kalyanasundaram and Georg Schnitger. The probabilistic communication
complexity of set intersection. SIAM Journal on Discrete Mathematics, 5(4):545–557,
1992. Announced at Structure in Complexity Theory 1987. 62
[KS93a] David E. Kaufman and Robert L. Smith. Fastest paths in time-dependent networks
for intelligent vehicle-highway systems application. Journal of Intelligent
Transportation Systems, 1(1):1–11, 1993. 32
[KS93b] Philip Nathan Klein and Sairam Subramanian. A linear-processor polylog-time
algorithm for shortest paths in planar graphs. In 34th Annual Symposium on
Foundations of Computer Science, 3-5 November 1993, Palo Alto, California, USA,
pages 259–270, 1993. 35
[KS96] Vijay Kumar and Eric J. Schwabe. Improved algorithms and data structures for
solving graph problems in external memory. In Eighth IEEE Symposium on
Parallel and Distributed Processing, pages 169–176, Oct 1996. 35
[KS98] Philip Nathan Klein and Sairam Subramanian. A fully dynamic approximation
scheme for shortest paths in planar graphs. Algorithmica, 22(3):235–249, 1998.
47
[KSS97a] Henry A. Kautz, Bart Selman, and Mehul A. Shah. The hidden web. AI Magazine,
18(2):27–36, 1997. 14
[KSS97b] Henry A. Kautz, Bart Selman, and Mehul A. Shah. Referral Web: Combining
social networks and collaborative filtering. Communications of the ACM, 40(3):63–65,
1997. 14
[KSW09] Jon Michael Kleinberg, Aleksandrs Slivkins, and Tom Wexler. Triangulation and
embedding using small sets of beacons. Journal of the ACM, 56(6), 2009.
Announced at FOCS 2004. 49, 53, 56, 91
[KU08] Tomoya Kambara and Shinichi Ueshima. Proposal of graph-partitioning tree using
network Voronoi diagram and its application to shortest path problems. Journal of
the Database Society of Japan, 7(1):193–198, 2008. 97
[Kuh55] Harold William Kuhn. On certain convex polyhedra. Bulletin of the American
Mathematical Society, 61:557–558, 1955. 8
[KvK09] Steffen Klamt and Axel von Kamp. Computing paths and cycles in biological
interaction graphs. BMC Bioinformatics, 10(1):181, 2009. 11
[KW90] Mauricio Karchmer and Avi Wigderson. Monotone circuits for connectivity
require super-logarithmic depth. SIAM Journal on Discrete Mathematics, 3(2):255–265,
1990. 60
[KY65] M. Kitamura and M. Yamazaki. On the connection of the two shortest route
systems. In Proceedings of the 8th Japanese Road Conference, pages 66–68, 1965.
In Japanese. 35, 51
[LAB+04] Siming Li, Christopher M. Armstrong, Nicolas Bertin, Hui Ge, Stuart Milstein,
Mike Boxem, Pierre-Olivier Vidalain, Jing-Dong J. Han, Alban Chesneau, Tong
Hao, Debra S. Goldberg, Ning Li, Monica Martinez, Jean-Francois Rual, Philippe
Lamesch, Lai Xu, Muneesh Tewari, Sharyl L. Wong, Lan V. Zhang, Gabriel F.
Berriz, Laurent Jacotot, Philippe Vaglio, Jerome Reboul, Tomoko
Hirozane-Kishikawa, Qianru Li, Harrison W. Gabel, Ahmed Elewa, Bridget Baumgartner,
Debra J. Rose, Haiyuan Yu, Stephanie Bosak, Reynaldo Sequerra, Andrew Fraser,
Susan E. Mango, William M. Saxton, Susan Strome, Sander van den Heuvel,
Fabio Piano, Jean Vandenhaute, Claude Sardet, Mark Gerstein, Lynn
Doucette-Stamm, Kristin C. Gunsalus, J. Wade Harper, Michael E. Cusick, Frederick P.
Roth, David E. Hill, and Marc Vidal. A map of the interactome network of the
metazoan C. elegans. Science, 303(5657):540–543, 2004. 6
[LADW05] Lun Li, David Alderson, John C. Doyle, and Walter Willinger. Towards a theory of
scale-free graphs: Definition, properties, and implications. Internet Mathematics,
2(4):431–523, 2005. 85
[Lan09] Edmund Landau. Handbuch der Lehre von der Verteilung der Primzahlen. 1909.
25
[Lat91] Jean-Claude Latombe. Robot Motion Planning. 1991. 39
[Lau04] Ulrich Lauther. An extremely fast, exact algorithm for finding shortest paths in
static networks with geographical background. In Geoinformation und Mobilität –
von der Forschung zur praktischen Anwendung, volume 22, pages 219–230, 2004.
54, 110
[LCL+94] Bing Liu, Siew-Hwee Choo, Shee-Ling Lok, Sing-Meng Leong, Soo-Chee
Lee, Foong-Ping Poon, and Hwee-Har Tan. Integrating case-based reasoning,
knowledge-based approach and Dijkstra algorithm for route finding. In
Proceedings of the Tenth Conference on Artificial Intelligence for Applications, pages
149–155, Mar 1994. 54
[LDSS99] Aaron W. F. Lee, David Dobkin, Wim Sweldens, and Peter Schröder.
Multiresolution mesh morphing. In SIGGRAPH '99: Proceedings of the 26th annual
conference on Computer graphics and interactive techniques, pages 343–350. ACM
Press/Addison-Wesley Publishing Co., 1999. 11
[LEA+01] Fredrik Liljeros, Christofer R. Edling, Luis A. Nunes Amaral, H. Eugene Stanley,
and Yvonne Aberg. The web of human sexual contacts. Nature, 411:907–908,
2001. 7
[Ley02] Michael Ley. The DBLP computer science bibliography: Evolution, research
issues, perspectives. In String Processing and Information Retrieval, 9th
International Symposium (SPIRE'02), Lisbon, Portugal, September 11-13, 2002,
Proceedings, pages 1–10, 2002. 111
[LGJ+57] M. Leyzorek, R. S. Gray, A. A. Johnson, W. C. Ladew, S. R. Meaker Jr, R. M.
Petry, and R. N. Seitz. Investigation of model techniques – first annual report –
a study of model techniques for communication systems. Case Institute of
Technology, Cleveland, Ohio, 1957. 8, 32
[LH08] Jure Leskovec and Eric Horvitz. Planetary-scale views on a large
instant-messaging network. In Proceedings of the 17th International Conference on World
Wide Web, WWW 2008, Beijing, China, April 21-25, 2008, pages 915–924, 2008.
7
[Lin02] Nathan Linial. Finite metric spaces – combinatorics, geometry and algorithms. In
Proceedings of the International Congress of Mathematicians III, pages 573–586,
2002. 29
[Liu95] Bing Liu. Using knowledge to isolate search in route finding. In IJCAI'95:
Proceedings of the 14th international joint conference on Artificial intelligence, pages
119–124, 1995. 54
[LLR95] Nathan Linial, Eran London, and Yuri Rabinovich. The geometry of graphs and
some of its algorithmic applications. Combinatorica, 15(2):215–245, 1995.
Announced at FOCS 1994. 30
[LMF+07] Jure Leskovec, Mary McGlohon, Christos Faloutsos, Natalie S. Glance, and
Matthew Hurst. Patterns of cascading behavior in large blog graphs. In
Proceedings of the Seventh SIAM International Conference on Data Mining, April 26-28,
2007, Minneapolis, Minnesota, USA, 2007. 5
[LMN02] Nathan Linial, Avner Magen, and Assaf Naor. Girth and euclidean distortion. In
Proceedings of the ACM Symposium on Theory of Computing (STOC), pages
705–711, 2002. 60
[LN94] Eugene V. Levner and A. S. Nemirovsky. A network flow algorithm for
just-in-time project scheduling. European Journal of Operational Research, 79(2):167–175,
December 1994. 8
[LNMG09] Yan Liu, Alexandru Niculescu-Mizil, and Wojciech Gryc. Topic-link LDA: joint
models of topic and author community. In ICML '09: Proceedings of the 26th
Annual International Conference on Machine Learning, pages 665–672, 2009. 5
[Lov09] László Lovász. Very large graphs. CoRR, math.CO/0902.0132, 2009. 4
[LPA+09] David Lazer, Alex Pentland, Lada Adamic, Sinan Aral, Albert-László Barabási,
Devon Brewer, Nicholas Christakis, Noshir Contractor, James Fowler, Myron
Gutmann, Tony Jebara, Gary King, Michael Macy, Deb Roy, and Marshall Van
Alstyne. Computational social science. Science, 323:721–723, 2009. 4
[LPS88] Alexander Lubotzky, R. Phillips, and Peter Sarnak. Ramanujan graphs.
Combinatorica, 8(3):261–277, 1988. 65, 66
[LR82] Zachary F. Lansdowne and David W. Robinson. Geographic decomposition of
the shortest path problem, with an application to the traffic assignment problem.
Management Science, 28(12):1380–1390, 1982. 35, 51
[LS67] A. H. Land and S. W. Stairs. The extension of the cascade algorithm to large
graphs. Management Science, 14(1):29–33, 1967. 35, 51
[LS93] Nathan Linial and Michael E. Saks. Low diameter graph decompositions.
Combinatorica, 13(4):441–454, 1993. Announced at SODA 1991. 43
[LS09] Silvio Lattanzi and D. Sivakumar. Affiliation networks. In Proceedings of the 41st
Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD,
USA, May 31 - June 2, 2009, pages 427–434, 2009. 24
[LT80] Richard J. Lipton and Robert Endre Tarjan. Applications of a planar separator
theorem. SIAM Journal on Computing, 9(3):615–627, 1980. Announced at FOCS
1977. 30, 46, 121
[LT09] Ulf Leser and Silke Trißl. Graph management in the life sciences. In Encyclopedia
of Database Systems, pages 1266–1271. 2009. 4
[LU93] Felix Lazebnik and Vasiliy A. Ustimenko. New examples of graphs without small
cycles and of large size. European Journal of Combinatorics, 14(5):445–460,
1993. 28
[Lu02] Linyuan Lu. Probabilistic methods in massive graphs and Internet computing.
PhD thesis, University of California San Diego, 2002. 82, 84, 85
[LUW95] Felix Lazebnik, Vasiliy A. Ustimenko, and Andrew J. Woldar. A new series
of dense graphs of high girth. Bulletin of the American Mathematical Society,
32(1):73–79, 1995. 28
[LUW96] Felix Lazebnik, Vasiliy A. Ustimenko, and Andrew J. Woldar. A characterization
of the components of the graphs D(k, q). Discrete Mathematics, 157(1-3):271–283,
1996. 28
[MB98] Eric N. Mortensen and William A. Barrett. Interactive segmentation with
intelligent scissors. Graphical Models and Image Processing, 60(5):349–384, 1998. 8,
10, 11
[MB08] Scott Morris and Kobus Barnard. Finding trails. In 2008 IEEE Computer
Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 24-26
June 2008, Anchorage, Alaska, USA, 2008. 11
[MC09] Julian John McAuley and Tibério S. Caetano. An expected-case sub-cubic solution
to the all-pairs shortest path problem in R. CoRR, abs/0912.0975, 2009. 35
[McD98] Colin McDiarmid. Probabilistic Methods for Algorithmic Discrete Mathematics,
volume 16 of Algorithms and Combinatorics, chapter Concentration, pages 1–46.
Springer, 1998. 86
[McG95] Catherine C. McGeoch. All-pairs shortest paths and the essential subgraph. Algo-
rithmica, 13(5):426441, 1995. 35
[MDMCM01] Fabien Mourgues, Frederic Devernay, Grégoire Malandain, and Ève Coste-Manière.
3d+t modeling of coronary artery tree from standard non simultaneous
angiographic sequences. In Proceedings of the 4th International Conference
on Medical Image Computing and Computer-Assisted Intervention, pages 1320–1322,
2001. 8
[Meh88] Kurt Mehlhorn. A faster approximation algorithm for the Steiner problem in
graphs. Information Processing Letters, 27(3):125–128, 1988. 97, 98
[MEJ+09] Kamesh Madduri, David Ediger, Karl Jiang, David A. Bader, and Daniel G.
Chavarría-Miranda. A faster parallel algorithm and efficient multithreaded
implementations for evaluating betweenness centrality on massive datasets. In 23rd
IEEE International Symposium on Parallel and Distributed Processing, IPDPS
2009, Rome, Italy, May 23-29, 2009, pages 1–8, 2009. 12
[Met07] Philipp Metzner. Transition Path Theory for Markov Processes. PhD thesis, Freie
Universität Berlin, 2007. 11
[Mey03] Ulrich Meyer. Average-case complexity of single-source shortest-paths
algorithms: lower and upper bounds. Journal of Algorithms, 48(1):91–134, 2003.
Announced at SODA 2001. 35
[Mey09] Ulrich Meyer. Via detours to I/O-efficient shortest paths. In Efficient Algorithms,
Essays Dedicated to Kurt Mehlhorn on the Occasion of His 60th Birthday, pages
219–232, 2009. 35
[MHSWZ04] Matthias Müller-Hannemann, Frank Schulz, Dorothea Wagner, and Christos D.
Zaroliagis. Timetable information: Models and algorithms. In Algorithmic Meth-
ods for Railway Optimization, International Dagstuhl Workshop, Dagstuhl Castle,
Germany, June 20-25, 2004, 4th International Workshop, ATMOS 2004, Bergen,
Norway, September 16-17, 2004, Revised Selected Papers, pages 67–90, 2004. 3,
56
[Mil66] G. Mills. A decomposition algorithm for the shortest route problem. Operations
Research, 14:279–286, 1966. 35, 51
[Mil67] Stanley Milgram. The small world problem. Psychology Today, 1:61–67, 1967. 6
[Mil94] Peter Bro Miltersen. Lower bounds for union-split-find related problems on
random access machines. In Proceedings of the ACM Symposium on Theory of
Computing (STOC), pages 625–634, 1994. 60
[Mil99] Peter Bro Miltersen. Cell probe complexity - a survey. In Advances in Data
Structures Workshop, Proc. of 19th Conference on the Foundations of Software
Technology and Theoretical Computer Science (FSTTCS), 1999. 26, 60
[Min57] George James Minty. A comment on the shortest-route problem. Operations Re-
search, 5(5):724, 1957. 8, 35
[Min58] George James Minty. Operations Research, 6:882–883, 1958. 8
[Mit97] Joseph S. B. Mitchell. chapter Shortest Paths and Networks, pages 445–466. CRC
Press, 1997. 32, 39
[Mit03] Michael Mitzenmacher. A brief history of generative models for power law and
lognormal distributions. Internet Mathematics, 1(2), 2003. 4, 23, 24, 111
[MMP87] Joseph S. B. Mitchell, David M. Mount, and Christos H. Papadimitriou. The
discrete geodesic problem. SIAM Journal on Computing, 16(4):647–668, 1987. 32
[MN67] M. Mori and T. Nishimura. Solution of the routing problem through a network
by a matrix method with auxiliary nodes. Transportation Research, 1(2):165–180,
1967. Announced in Memoirs of the Faculty of Engineering, Osaka City
University, pp. 149–162, 1963. 8, 39
[MN06] Manor Mendel and Assaf Naor. Ramsey partitions and proximity data structures.
In 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS
2006), 21-24 October 2006, Berkeley, California, USA, Proceedings, pages 109–118,
2006. 43, 44, 79, 118
[MNSW98] Peter Bro Miltersen, Noam Nisan, Shmuel Safra, and Avi Wigderson. On data
structures and asymmetric communication complexity. Journal of Computer and
System Sciences, 57(1):37–49, 1998. Announced at STOC 1995. 60, 62, 72
[MO09] Ulrich Meyer and Vitaly Osipov. Design and implementation of a practical
I/O-efficient shortest paths algorithm. In Proceedings of the Workshop on Algorithm
Engineering and Experiments, ALENEX 2009, New York, New York, USA, January
3, 2009, pages 85–96, 2009. 35
[Moo59] Edward F. Moore. The shortest path through a maze. In Proceedings of the
International Symposium on the Theory of Switching, and Annals of the Computation
Laboratory of Harvard University, pages 285–292. Harvard University Press,
1959. Announced at the International Symposium on the Theory of Switching
1957. 8, 34, 51
[Mor93] Pieter Moree. Bertrand's postulate for primes in arithmetical progressions.
Computers & Mathematics with Applications, 26(5):35–43, 1993. 66
[Mor94] Moshe Morgenstern. Existence and explicit constructions of q + 1 regular Ra-
manujan graphs for every prime power q. Journal of Combinatorial Theory, Series
B, 62(1):44–62, 1994. 65
[MP69] Marvin Minsky and Seymour Papert. Perceptrons. MIT Press, Cambridge, 1969.
64, 78
[MP97] Kurt Mehlhorn and Volker Priebe. On the all-pairs shortest-path algorithm of
Moffat and Takaoka. Random Struct. Algorithms, 10(1-2):205–220, 1997. Announced
at ESA 1995. 35
[MPN+05] Lauren Ancel Meyers, Babak Pourbohloul, Mark E. J. Newman, Danuta M.
Skowronski, and Robert C. Brunham. Network theory and SARS: predicting
outbreak diversity. Journal of Theoretical Biology, 232(1):71–81, 2005. 7, 10
[MPS06] Milena Mihail, Christos H. Papadimitriou, and Amin Saberi. On certain connectiv-
ity properties of the internet topology. Journal of Computer and System Sciences,
72(2):239–251, 2006. Announced at FOCS 2003. 89
[MPSV02] Yamir Moreno, Romualdo Pastor-Satorras, and Alessandro Vespignani. Epidemic
outbreaks in complex heterogeneous networks. The European Physical Journal B
- Condensed Matter and Complex Systems, 26(4), 2002. 7
[MR95] Rajeev Motwani and Prabhakar Raghavan. Randomized algorithms. Cambridge
University Press, 1995. 62
[MS03] Ulrich Meyer and Peter Sanders. Δ-stepping: a parallelizable shortest path
algorithm. Journal of Algorithms, 49(1):114–152, 2003. Announced at ESA 1998.
35
[MS08a] Jiri Matousek and Anastasios Sidiropoulos. Inapproximability for metric
embeddings into $\mathbb{R}^d$. In 49th IEEE Symposium on Foundations of Computer Science
(FOCS 2008), 2008. 30
[MS08b] Manor Mendel and Chaya Schwob. C-K-R partitions of sparse graphs. CoRR,
abs/0809.1902, 2008. 43, 44
[MSM09] Jens Maue, Peter Sanders, and Domagoj Matijevic. Goal-directed shortest-path
queries using precomputed cluster distances. ACM Journal of Experimental
Algorithmics, 14:3.2–3.27, 2009. 53
[MSS+06] Rolf H. Möhring, Heiko Schilling, Birk Schütz, Dorothea Wagner, and Thomas
Willhalm. Partitioning graphs to speedup Dijkstra's algorithm. ACM Journal of
Experimental Algorithmics, 11, 2006. 54
[MT84] Alistair Moffat and Tadao Takaoka. A priority queue for the all pairs shortest path
problem. Information Processing Letters, 18(4):189–193, 1984. 36
[MT87] Alistair Moffat and Tadao Takaoka. An all pairs shortest path algorithm with
expected time $O(n^2 \log n)$. SIAM Journal on Computing, 16(6):1023–1031, 1987.
Announced at FOCS 1985. 35
[MT93] Sean P. Meyn and Richard L. Tweedie. Markov Chains and Stochastic Stability.
Springer Verlag, 1993. 11
[Mur65] John David Murchland. A new method for finding all elementary paths in a
complete directed graph. Technical Report LBS-TNT-22, London Business School,
Transport Network Theory Unit, 1965. 51
[Mur67a] John David Murchland. The effect of increasing or decreasing the length of a
single arc on all shortest distances in a graph. Technical Report LBS-TNT-26,
London Business School, Transport Network Theory Unit, 1967. 33
[Mur67b] John David Murchland. The once-through method of finding all shortest
distances in a graph from a single origin. Technical Report LBS-TNT-56, London
Business School, Transport Network Theory Unit, 1967. 10, 35
[MV06] Oliver Mason and Mark Verwoerd. Graph theory and networks in biology. arXiv:q-
bio, q-bio/0604006, 2006. 4, 6, 11, 12
[MVV87] Ketan Mulmuley, Umesh V. Vazirani, and Vijay V. Vazirani. Matching is as easy
as matrix inversion. Combinatorica, 7(1):105–113, 1987. Announced at STOC
1987. 19
[MWN09] Shay Mozes and Christian Wulff-Nilsen. Shortest paths in planar graphs with real
lengths in $O(n \log^2 n / \log \log n)$ time. CoRR, abs/0911.4963, 2009. 44
[MZ93] Nicolas Merlet and Josiane Zerubia. A curvature-dependent energy function for
detecting lines in satellite images. In The 8th Scandinavian Conference on Image
Analysis, University of Tromso, Norway, pages 25–28, 1993. 11
[MZ03] Ulrich Meyer and Norbert Zeh. I/O-efficient undirected shortest paths. In
Algorithms - ESA 2003, 11th Annual European Symposium, Budapest, Hungary,
September 16-19, 2003, Proceedings, pages 434–445, 2003. 35
[MZ06] Ulrich Meyer and Norbert Zeh. I/O-efficient undirected shortest paths with
unbounded edge lengths. In Algorithms - ESA 2006, 14th Annual European
Symposium, Zurich, Switzerland, September 11-13, 2006, Proceedings, pages 540–551,
2006. 35
[MZ07] Laurent Flindt Muller and Martin Zachariasen. Fast and compact oracles for ap-
proximate distances in planar graphs. In Algorithms - ESA 2007, 15th Annual
European Symposium, Eilat, Israel, October 8-10, 2007, Proceedings, pages 657–668,
2007. 46, 47
[MZ08] Anil Maheshwari and Norbert Zeh. I/O-efficient planar separators. SIAM Journal
on Computing, 38(3):767–801, 2008. Announced at SODA 2002. 35
[Nak72] Mario Nakamori. A note on the optimality of some all-shortest path algorithms.
Journal of the Operations Research Society of Japan, 15:201–204, 1972. 35
[NBB+08] Giacomo Nannicini, Philippe Baptiste, Gilles Barbier, Daniel Krob, and Leo
Liberti. Fast paths in large-scale dynamic road networks. Computational Optimization
and Applications, 2008. 47, 53, 55, 56
[New00] Mark E. J. Newman. Models of the small world. Journal of Statistical Physics,
pages 819–841, 2000. 4
[New01] Mark E. J. Newman. Scientific collaboration networks. II. shortest paths, weighted
networks, and centrality. Physical Review E (Statistical, Nonlinear, and Soft Mat-
ter Physics), 64, 2001. 14
[New02] Mark E. J. Newman. Spread of epidemic disease on networks. Physical Review E,
66(1):016128, Jul 2002. 7
[New03] Mark E. J. Newman. The structure and function of complex networks. SIAM
Review, 45(2):167–256, 2003. 4
[New04] Mark E. J. Newman. Analysis of weighted networks. Physical Review E (Statisti-
cal, Nonlinear, and Soft Matter Physics), 70(5):056131, Nov 2004. 13
[New05] Mark E. J. Newman. Power laws, Pareto distributions and Zipf's law.
Contemporary Physics, 46:323–351, 2005. 4, 23
[NG04] Mark E. J. Newman and M. Girvan. Finding and evaluating community structure
in networks. Physical Review E (Statistical, Nonlinear, and Soft Matter Physics),
69(026113), 2004. 13
[Nic66] T. Alastair J. Nicholson. Finding the shortest route between two points in a
network. The Computer Journal, 9(3):275–280, November 1966. 10, 35
[NMM78] Kohei Noshita, Etsuo Masuda, and Hajime Machida. On the expected behaviors of
the Dijkstra's shortest path algorithm for complete graphs. Information Processing
Letters, 7(5):237–243, 1978. 35
[Nos85] Kohei Noshita. A theorem on the expected complexity of Dijkstra's shortest path
algorithm. Journal of Algorithms, 6(3):400–408, 1985. 35
[NR03] Ilan Newman and Yuri Rabinovich. A lower bound on the distortion of embed-
ding planar metrics into Euclidean space. Discrete and Computational Geometry,
29:77–81, 2003. 30
[NR06] Ilkka Norros and Hannu Reittu. On a conditionally Poissonian graph process.
Advances in Applied Probability, 38(1):59–75, 2006. 24
[NR08] Ilkka Norros and Hannu Reittu. On the robustness of power-law random graphs in
the finite mean, infinite variance region. arXiv:0801.1079, 2008. 83
[NSW01] Mark E. J. Newman, Steven H. Strogatz, and Duncan J. Watts. Random graphs
with arbitrary degree distributions and their applications. Physical Review E (Statistical,
Nonlinear, and Soft Matter Physics), 64(2):026118 1–17, Jul 2001. 83
[NW99] Mark E. J. Newman and D. J. Watts. Scaling and percolation in the small-
world network model. Physical Review E (Statistical, Nonlinear, and Soft Matter
Physics), 60(6):7332–7342, Dec 1999. 24, 83
[NWS02] Mark E. J. Newman, Duncan J. Watts, and Steven H. Strogatz. Random graph
models of social networks. Proceedings of the National Academy of Sciences,
99:2566–2572, 2002. 83
[NZ02] T. S. Eugene Ng and Hui Zhang. Predicting internet network distance with
coordinates-based approaches. In INFOCOM, 2002. 15
[OR90] Ariel Orda and Raphael Rom. Shortest-path and minimum delay algorithms in
networks with time-dependent edge-length. Journal of the ACM, 37(3):607–625,
1990. 32, 39
[Orl83] James B. Orlin. On the simplex algorithm for networks and generalized networks.
Working papers 1467-83., Massachusetts Institute of Technology (MIT), Sloan
School of Management, 1983. 34
[OSF+08] Atsuyuki Okabe, Toshiaki Satoh, Takehiro Furuta, Atsuo Suzuki, and Kyoko
Okano. Generalized network Voronoi diagrams: Concepts, computational methods,
and applications. International Journal of Geographical Information Science,
22:965–994, 2008. 98
[OSH+07] Jukka-Pekka Onnela, Jari Saramäki, Jörkki Hyvönen, Gábor Szabó, M. Argollo
de Menezes, Kimmo Kaski, Albert-László Barabási, and János Kertész. Analysis
of a large-scale weighted network of one-to-one human communication. New
Journal of Physics, 9(6):179, 2007. 7
[Pap74] Uwe Pape. Implementation and efficiency of Moore-algorithms for the shortest
route problem. Mathematical Programming, 7(1):212–222, 1974. 34, 35, 50, 51
[Pap80] Uwe Pape. Algorithm 562: Shortest path lengths [h]. ACM Transactions on
Mathematical Software, 6(3):450–455, 1980. 34
[Par60] Arvind Chandulai Parikh. Some Theorems and Algorithms for Finding Optimal
Paths Over Graphs with Engineering Applications. PhD thesis, Purdue University,
1960. 8
[Pat08a] Mihai Patrascu. (Data) STRUCTURES. In 49th Annual IEEE Symposium on
Foundations of Computer Science, FOCS 2008, October 25-28, 2008, Philadelphia, PA,
USA, pages 434–443, 2008. 41, 59, 60, 62, 63, 64, 67, 69, 71, 72
[Pat08b] Mihai Patrascu. Lower Bound Techniques for Data Structures. PhD thesis, Mas-
sachusetts Institute of Technology, 2008. 60, 62, 64
[Pat08c] Mihai Patrascu. Randomized lower bounds for lopsided set disjointness.
http://people.csail.mit.edu/mip/papers/structures/lsd.pdf, 2008. 62
[Pat09] Mihai Patrascu. Unifying the Landscape of Cell-Probe Lower Bounds. Submitted,
2009. 62
[PBCG09] Michalis Potamias, Francesco Bonchi, Carlos Castillo, and Aristides Gionis. Fast
shortest path distance estimation in large networks. In CIKM '09: Proceeding
of the 18th ACM conference on Information and knowledge management, pages
867–876, 2009. 13, 53, 56, 83
[PBMW99] Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. The Page-
Rank citation ranking: Bringing order to the web. Technical report, Stanford Info-
Lab, 1999. 5
[PC06] Gabriel Peyré and Laurent D. Cohen. Landmark-based geodesic computation for
heuristically driven path planning. In Computer Vision and Pattern Recognition,
2006 IEEE Computer Society Conference on, volume 2, pages 2229–2236, 2006.
11
[Pel00] David Peleg. Proximity-preserving labeling schemes. Journal of Graph Theory,
33:167–176, 2000. See also: Proceedings of the 25th Workshop on
Graph-Theoretic Concepts in Computer Science (WG), 1999. 29
[Pet03] Seth Pettie. On the Shortest Path and Minimum Spanning Tree Problems. PhD
thesis, The University of Texas at Austin, 2003. 32, 34
[Pet04] Seth Pettie. A new approach to all-pairs shortest paths on real-weighted graphs.
Theoretical Computer Science, 312(1):47–74, 2004. 34
[Pet07] Seth Pettie. Low distortion spanners. In Automata, Languages and Programming,
34th International Colloquium, ICALP 2007, Wroclaw, Poland, July 9-13, 2007,
Proceedings, pages 78–89, 2007. 27, 28
[Pet08a] Seth Pettie. All pairs shortest paths in sparse graphs. In Encyclopedia of Algo-
rithms. 2008. 35
[Pet08b] Seth Pettie. Single-source shortest paths. In Encyclopedia of Algorithms. 2008.
32, 34
[PK85] Richard C. Paige and Clyde P. Kruskal. Parallel algorithms for shortest path
problems. In ICPP, pages 14–20, 1985. 35
[PMT03] Sarah V. Porter, Majid Mirmehdi, and Barry T. Thomas. A shortest path repre-
sentation for video summarisation. In 12th International Conference on Image
Analysis and Processing (ICIAP 2003), 17-19 September 2003, Mantova, Italy,
pages 460–465, 2003. 11
[Poh71] Ira Sheldon Pohl. Bi-directional search. Machine Intelligence, 6:127–140, 1971.
10, 35
[Pou08] Pawan Poudel. Computing point-to-point shortest path using an approximate
distance oracle. Master's thesis, Miami University, 2008. 53
[PR02] Seth Pettie and Vijaya Ramachandran. Computing shortest paths with comparisons
and additions. In SODA '02: Proceedings of the thirteenth annual ACM-SIAM
symposium on Discrete algorithms, pages 267–276, 2002. 33, 34, 37, 104
[PRB60] Robert M. Peart, Paul H. Randolph, and T. E. Bartlett. The shortest-route problem
(letters to the editor). Operations Research, 8(6):866–868, 1960. 8
[PRS02] Seth Pettie, Vijaya Ramachandran, and Srinath Sridhar. Experimental evaluation
of a new shortest path algorithm. In Algorithm Engineering and Experiments, 4th
International Workshop, ALENEX 2002, San Francisco, CA, USA, January 4-5,
2002, Revised Papers, pages 126–142, 2002. 34, 35
[PS85] Franco P. Preparata and Michael Ian Shamos. Computational geometry: an intro-
duction. 1985. 39
[PS89] David Peleg and Alejandro A. Schäffer. Graph spanners. Journal of Graph Theory,
13(1):99–116, 1989. 27
[PS98] Stefano Pallottino and Maria Grazia Scutellà. Equilibrium and Advanced
Transportation Modelling, chapter Shortest Path Algorithms in Transportation Models:
Classical and Innovative Aspects, pages 245–281. 1998. 54
[PSV01] Romualdo Pastor-Satorras and Alessandro Vespignani. Epidemic spreading in
scale-free networks. Physical Review Letters, 86(14):3200–3203, Apr 2001. 7
[PSWZ07] Evangelia Pyrga, Frank Schulz, Dorothea Wagner, and Christos D. Zaroliagis.
Efficient models for timetable information in public transportation systems. ACM
Journal of Experimental Algorithmics, 12, 2007. 56
[PT06] Mihai Patrascu and Mikkel Thorup. Time-space trade-offs for predecessor search.
In Proceedings of the 38th Annual ACM Symposium on Theory of Computing,
Seattle, WA, USA, May 21-23, 2006, pages 232–240, 2006. 63, 69, 72
[PW60] Maurice Pollack and Walter Wiebenson. Solutions of the shortest-route problem
- a review. Operations Research, 8(2):224–230, 1960. 8, 51
[PW99] Panos D. Prevedouros and Yuhao Wang. Simulation of large freeway and arterial
network with CORSIM, INTEGRATION, and WATSim. Transportation Research
Record: Journal of the Transportation Research Board, 1678:197–207, 1999. 10
[PZMT03] Dimitris Papadias, Jun Zhang, Nikos Mamoulis, and Yufei Tao. Query processing
in spatial network databases. In VLDB, pages 802–813, 2003. 55
[RA59] Harold Rapaport and Paul Abramson. An analog computer for finding an optimum
route through a communication network. IRE Transactions on Communications
Systems, 7(1):37–42, May 1959. 8
[Ram96a] Ganesan Ramalingam. Bounded Incremental Computation. 1996. 39
[Ram96b] Rajeev Raman. Priority queues: Small, monotone and trans-dichotomous. In
Algorithms - ESA '96, Fourth Annual European Symposium, Barcelona, Spain,
September 25-27, 1996, Proceedings, pages 121–137, 1996. 34
[Ram97] Rajeev Raman. Recent results on the single-source shortest paths problem.
SIGACT News, 28:81–87, 1997. 34
[Rao99] Satish Rao. Small distortion and volume preserving embeddings for planar and
Euclidean metrics. In Symposium on Computational Geometry, pages 300–306,
1999. 30
[Raz92] Alexander A. Razborov. On the distributional complexity of disjointness.
Theoretical Computer Science, 106(2):385–390, 1992. 62
[Rei58] I. Reiman. Über ein Problem von K. Zarankiewicz. Acta Mathematica Academiae
Scientiarum Hungaricae, 9:269–279, 1958. 28
[RMJ06] Matthew J. Rattigan, Marc Maier, and David Jensen. Using structure indices for
efficient approximation of network properties. In Proceedings of the Twelfth ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining,
Philadelphia, PA, USA, August 20-23, 2006, pages 357–366, 2006. 13, 56
[RMJ07] Matthew J. Rattigan, Marc Maier, and David Jensen. Graph clustering with net-
work structure indices. In Machine Learning, Proceedings of the Twenty-Fourth
International Conference (ICML 2007), Corvallis, Oregon, USA, June 20-24, 2007,
pages 783–790, 2007. 13, 56, 83
[RN04a] Bryan Raney and Kai Nagel. Iterative route planning for large-scale modular
transportation simulations. Future Generation Computer Systems, 20(7):1101–1118,
2004. 10
[RN04b] Hannu Reittu and Ilkka Norros. On the power-law random graph model of
massive data networks. Performance Evaluation, 55(1-2):3–23, 2004. Announced at
Internet performance symposium (IPS 2002). 24
[Rob56] John T. Robacker. Min-max theorems on shortest chains and disjoint cuts of a
network. Research Memorandum RM-1660, The Rand Corporation, 1956. 8
[Rom80] Francesco Romani. Shortest-path problem is not harder than matrix multiplication.
Information Processing Letters, 11(3):134–136, 1980. 36
[Rou86] Dennis H. Rouvray. Predicting chemistry from topology. Scientific American,
265:40–47, 1986. 12
[RS83] Neil Robertson and Paul D. Seymour. Graph minors. I. excluding a forest. Journal
of Combinatorial Theory, Series B, 35(1):39–61, 1983. 20, 21
[RS86] Neil Robertson and Paul D. Seymour. Graph minors. II. Algorithmic aspects of
tree-width. Journal of Algorithms, 7:309–322, 1986. 20
[RS08] Liam Roditty and Asaf Shapira. All-pairs shortest paths with a sublinear additive
error. In Automata, Languages and Programming, 35th International Colloquium,
ICALP 2008, Reykjavik, Iceland, July 7-11, 2008, Proceedings, Part I: Track A:
Algorithms, Automata, Complexity, and Games, pages 622–633, 2008. 37
[RSM+02] Erzsébet Ravasz, Anna Lisa Somera, Dale A. Mongru, Zoltán N. Oltvai, and
Albert-László Barabási. Hierarchical organization of modularity in metabolic
networks. Science, 297(5586):1551–1555, 2002. 6
[RSR+01] Jean-Christophe Rain, Luc Selig, Hilde De Reuse, Veronique Battaglia, Celine
Reverdy, Stephane Simon, Gerlinde Lenzen, Fabien Petel, Jerome Wojcik, Vincent
Schächter, Y. Chemama, Agnes Labigne, and Pierre Legrain. The protein-protein
interaction map of Helicobacter pylori. Nature, 409(6817):211–215, 2001. 6
[RTMS05] Martin Rosvall, Ala Trusina, Peter Minnhagen, and Kim Sneppen. Networks and
cities: An information perspective. Physical Review Letters, 94(2):028701, Jan
2005. 10
[RTZ05] Liam Roditty, Mikkel Thorup, and Uri Zwick. Deterministic constructions of ap-
proximate distance oracles and spanners. In Automata, Languages and Program-
ming, 32nd International Colloquium, ICALP 2005, Lisbon, Portugal, July 11-15,
2005, Proceedings, pages 261–272, 2005. 38, 42, 44
[RTZ08] Liam Roditty, Mikkel Thorup, and Uri Zwick. Roundtrip spanners and roundtrip
routing in directed graphs. ACM Transactions on Algorithms, 4(3):1–17, 2008. 28
[RZ04] Liam Roditty and Uri Zwick. Dynamic approximate all-pairs shortest paths in
undirected graphs. In Proc. of Symp. on Foundations of Computer Science, Rome,
Oct. 2004, pages 499–508, 2004. 39
[Sab66] Gert Sabidussi. The centrality index of a graph. Psychometrika, 31(4):581–603,
1966. 12
[Sam63] A. L. Samuel. Some studies in machine learning using the game of checkers.
Computers and Thought, 1963. Originally in IBM Journal 3, 211-229 (1959). 35,
52
[San09] Peter Sanders. Algorithm engineering - an attempt at a definition. In Efficient
Algorithms, Essays Dedicated to Kurt Mehlhorn on the Occasion of His 60th Birthday,
pages 321–340, 2009. 50
[SBA+95] LaRon Smith, Richard Beckman, Doug Anson, Kai Nagel, and Michael E
Williams. TRANSIMS: Transportation analysis and simulation system. In Fifth
National Conference on Transportation Planning Methods Applications-Volume
II: A Compendium of Papers Based on a Conference Held in Seattle, Washington
in April 1995, 1995. 10
[SBR+06] Chris Stark, Bobby-Joe Breitkreutz, Teresa Reguly, Lorrie Boucher, Ashton
Breitkreutz, and Mike Tyers. Biogrid: a general repository for interaction datasets.
Nucleic Acids Research, 34(1):535–539, 2006. 112
[SC81] Howard A. Smolleck and Mo-Shing Chen. A new approach to near-optimal path
assignment through electric-circuit modeling. Networks, 11:335–349, 1981. 51
[Sch98] Jeanette P. Schmidt. All highest scoring paths in weighted grid graphs and their
application to finding all approximate repeats in strings. SIAM Journal on
Computing, 27(4):972–992, 1998. Announced at ISTCS 1995. 8, 46
[Sch03] Alexander Schrijver. Combinatorial Optimization - Polyhedra and Efficiency.
Springer-Verlag, Berlin, 2003. 32, 35
[Sch05a] Alexander Schrijver. Handbook of Discrete Optimization, chapter On the history
of combinatorial optimization (till 1960), pages 1–68. 2005. 32
[Sch05b] Frank Schulz. Timetable Information and Shortest Paths. PhD thesis, Universität
Karlsruhe, 2005. 56
[Sch08a] Dominik Schultes. Route Planning in Road Networks. PhD thesis, Universität
Karlsruhe, 2008. 54, 55
[Sch08b] Dominik Schultes. Routing in road networks with transit nodes. In Encyclopedia
of Algorithms. 2008. 47, 51, 54, 55, 56
[SCK+08] Ralf Schenkel, Tom Crecelius, Mouna Kacimi, Thomas Neumann, Josiane
Xavier Parreira, Marc Spaniol, and Gerhard Weikum. Social wisdom for search
and recommendation, June 2008. 13
[SdC77] Lenie Sint and Dennis de Champeaux. An improved bidirectional heuristic search
algorithm. Journal of the ACM, 24(2):177–191, 1977. Announced at IJCAI 1975.
10, 35
[Sei95] Raimund Seidel. On the all-pairs-shortest-path problem in unweighted undirected
graphs. Journal of Computer and System Sciences, 51(3):400–403, 1995.
Announced at STOC 1992. 36
[Sei06] Raimund Seidel. Top-down analysis of path compression: Deriving the inverse-
ackermann bound naturally (and easily). In Algorithm Theory - SWAT 2006, 10th
Scandinavian Workshop on Algorithm Theory, Riga, Latvia, July 6-8, 2006,
Proceedings, page 1, 2006. 47
[Sen09] Sandeep Sen. Approximating shortest paths in graphs. In WALCOM: Algorithms
and Computation, Third International Workshop, WALCOM 2009, Kolkata, India,
February 18-20, 2009. Proceedings, pages 32–43, 2009. 32, 37, 42, 43, 44, 123
[Set96] James Albert Sethian. A fast marching level set method for monotonically
advancing fronts. Proceedings of the National Academy of Sciences, 93(4):1591–1595,
1996. 11
[SFG97] Shashi Shekhar, Andrew Fetterer, and Brajesh Goyal. Materialization trade-offs in
hierarchical shortest path algorithms. In SSD '97: Proceedings of the 5th International
Symposium on Advances in Spatial Databases, pages 94–111, London, UK,
1997. Springer-Verlag. 10, 51, 54
[SG67] Ralph E. Schofer and Franklin F. Goodyear. Electronic computer applications in
urban transportation planning. In Proceedings of the 1967 22nd national
conference, pages 247–253, 1967. 10
[SG05] Alfons Schnitzler and Joachim Gross. Normal and pathological oscillatory
communication in the brain. Nature Reviews Neuroscience, 6:285–296, 2005. 6
[SGNP10] Atish Das Sarma, Sreenivas Gollapudi, Marc Najork, and Rina Panigrahy. A
sketch-based distance oracle for web-scale graphs. In International Conference
on Web Search and Data Mining (WSDM), 2010. to appear. 56
[Sha54] Marvin E. Shaw. Group structure and the behavior of individuals in small groups.
Journal of Psychology: Interdisciplinary and Applied, 38:139–149, 1954. 12
[Sha75] Michael Ian Shamos. Geometric complexity. In Conference Record of Seventh
Annual ACM Symposium on Theory of Computation (STOC '75), 5-7 May 1975,
Albuquerque, New Mexico, USA, pages 224–233, 1975. 98
[Sha97a] Mehul A. Shah. ReferralWeb: A resource location system guided by personal
relations. Master's thesis, Massachusetts Institute of Technology, 1997. 14
[Sha97b] Micha Sharir. Handbook of Discrete and Computational Geometry, chapter
Algorithmic Motion Planning, pages 733–754. CRC Press, 1997. 32, 39
[Shi53] Alfonso Shimbel. Structural parameters of communication networks. Bulletin of
Mathematical Biophysics, 15:501–507, 1953. 8, 12
[Shi55] Alfonso Shimbel. Structure in communication nets. In Proceedings of the
Symposium on Information Networks (New York, 1954), pages 199–203, 1955. 8
[Shi00] Tetsuo Shibuya. Computing the n × m shortest paths efficiently. ACM Journal of
Experimental Algorithmics, 5:9, 2000. Announced at ALENEX 1999. TR: IBM
TRL Report RT5133 1997. 35
[Shi03] Clay Shirky. Power laws, weblogs, and inequality. Networks, Economics, and
Culture mailing list, 2003. 5
[SHWH08] Christian Sommer, Michael E. Houle, Martin Wolff, and Shinichi Honiden. Ap-
proximate shortest path queries in graphs using Voronoi duals. Technical Report
NII-2008-007E, National Institute of Informatics, August 2008. 16
[SKC93] Shashi Shekhar, Ashim Kohli, and Mark Coyle. Path computation algorithms for
advanced traveller information system (ATIS). In Proceedings of the Ninth In-
ternational Conference on Data Engineering, April 19-23, 1993, Vienna, Austria,
pages 31–39, 1993. 54
[SL67] Ralph E. Schofer and Bernard M. Levin. The urban transportation planning
process. Socio-Economic Planning Sciences, 1:185–197, 1967. 10
[SL97] Shashi Shekhar and Duen-Ren Liu. CCAM: A connectivity-clustered access
method for networks and network computations. IEEE Transactions on
Knowledge and Data Engineering, 9(1):102–119, 1997. 54
[SL03] Manoj Pratim Samanta and Shuguang Liang. Predicting protein functions from
redundancies in large-scale protein interaction networks. Proceedings of the
National Academy of Sciences, 100(22):12579–12583, 2003. 6
[SLSP03] Thomas Seidenbecher, T. Rao Laxmi, Oliver Stork, and Hans-Christian Pape.
Amygdalar and hippocampal theta rhythm synchronization during fear memory
retrieval. Science, 301(5634):846–850, 2003. 6
[Smo75] Howard A. Smolleck. Application of Fast Sparse-Matrix Techniques and an En-
ergy Estimation Model for Large Transportation Networks. PhD thesis, University
of Texas at Arlington, 1975. 35, 51
[SMR97] Carsten Steger, Helmut Mayer, and Bernd Radig. The role of grouping for road
extraction. In Automatic Extraction of Man-Made Objects from Aerial and Space
Images (II), pages 245–256, 1997. 11
[SMS+02] Lukasz Salwinski, Christopher S. Miller, Adam J. Smith, Frank K. Pettit, James U.
Bowie, and David Eisenberg. DIP, the database of interacting proteins: a research
tool for studying cellular networks of protein interactions. Nucleic Acids Research,
30(1):303–305, 2002. 112
[SNGM09] Antonio Sedeño-Noda and Carlos González-Martín. New efficient shortest path
simplex algorithm: pseudo permanent labels instead of permanent labels.
Computational Optimization and Applications, 43(3):437–448, 2009. 34
[Spi73] Philip M. Spira. A new algorithm for finding all shortest paths in a graph of positive
arcs in average time $O(n^2 \log^2 n)$. SIAM Journal on Computing, 2(1):28–32, 1973.
35
[Spr07] Alan P. Sprague. O(1) query time algorithm for all pairs shortest distances on
permutation graphs. Discrete Applied Mathematics, 155(3):365–373, 2007. 48,
121
[SPZ06] Gabriele Di Stefano, Alberto Petricola, and Christos D. Zaroliagis. On the im-
plementation of parallel shortest path algorithms on a supercomputer. In Parallel
and Distributed Processing and Applications, 4th International Symposium, ISPA
2006, Sorrento, Italy, December 4-6, 2006, Proceedings, pages 406–417, 2006. 35
[SR90] Wojciech Szpankowski and Vernon Rego. Yet another application of a binomial
recurrence. Order statistics. Computing, 43(4):401–410, 1990. 108
[SR08] Parag Singla and Matthew Richardson. Yes, there is a correlation: - from social
networks to personal behavior on the web. In WWW '08: Proceeding of the 17th
international conference on World Wide Web, pages 655–664, 2008. 13
[SS80] Mischa Schwartz and Thomas E. Stern. Routing techniques used in computer
communication networks. IEEE Transactions on Communications, 28(4):539–552,
Apr 1980. 8, 14
[SS99] Hanmao Shi and Thomas H. Spencer. Time-work tradeoffs of the single-source
shortest paths problem. Journal of Algorithms, 30(1):19–32, 1999. 35
[SS05] Peter Sanders and Dominik Schultes. Highway hierarchies hasten exact shortest
path queries. In Algorithms - ESA 2005, 13th Annual European Symposium, Palma
de Mallorca, Spain, October 3-6, 2005, Proceedings, pages 568–579, 2005. 47,
53, 55, 56
[SS06] Peter Sanders and Dominik Schultes. Engineering highway hierarchies. In
Algorithms - ESA 2006, 14th Annual European Symposium, Zurich, Switzerland,
September 11-13, 2006, Proceedings, pages 804–816, 2006. 47, 51, 53, 54, 55,
56, 110, 114
[SS07a] Peter Sanders and Dominik Schultes. Engineering fast route planning algorithms.
In Experimental Algorithms, 6th International Workshop (WEA'07), Rome, Italy,
June 6-8, 2007, Proceedings, pages 23–36, 2007. 110, 111, 112, 114, 123
[SS07b] Dominik Schultes and Peter Sanders. Dynamic highway-node routing. In
Experimental Algorithms, 6th International Workshop (WEA'07), Rome, Italy, June 6-8,
2007, Proceedings, pages 66–79, 2007. 55, 110, 114
[SS09] Jagan Sankaranarayanan and Hanan Samet. Distance oracles for spatial networks.
In Proceedings of the 25th International Conference on Data Engineering, ICDE
2009, March 29 2009 - April 2 2009, Shanghai, China, pages 652–663, 2009. 50,
54, 55, 121
[SSA08] Hanan Samet, Jagan Sankaranarayanan, and Houman Alborzi. Scalable network
distance browsing in spatial databases. In Proceedings of the ACM SIGMOD
International Conference on Management of Data, SIGMOD 2008, Vancouver, BC,
Canada, June 10-12, 2008, pages 43–54, 2008. 55
[SSA09] Jagan Sankaranarayanan, Hanan Samet, and Houman Alborzi. Path oracles for
spatial networks. Proceedings of the VLDB Endowment, 2(1):1210–1221, 2009.
50, 55, 121
[ST99] Alan P. Sprague and Tadao Takaoka. O(1) query time algorithm for all pairs
shortest distances on interval graphs. International Journal of Foundations of Computer
Science, 10(4):465–472, 1999. 48
[Sto99] Bryan Stout. Smart move: Intelligent path-finding. Online at
gamasutra.com/view/feature/3317/smart_move_intelligent_.php, 1999. 8
[Str69] Volker Strassen. Gaussian elimination is not optimal. Numerische Mathematik,
13:354–356, 1969. 36
[Str01] Steven H. Strogatz. Exploring complex networks. Nature, 410:268–276, 2001. 4
[Sug09] Kokichi Sugihara. Voronoi diagrams in facility location. In Encyclopedia of
Optimization, Second Edition, pages 4040–4045. 2009. 98
[SV86] Robert Sedgewick and Jeffrey Scott Vitter. Shortest paths in Euclidean graphs.
Algorithmica, 1(1):31–48, 1986. Announced at FOCS 1984. 35, 52
[Svi08] Zoya Svitkina. Lower-bounded facility location. In Proceedings of the
Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA'08), San
Francisco, California, USA, January 20-22, 2008, pages 1154–1163, 2008. 98
[SVY09] Christian Sommer, Elad Verbin, and Wei Yu. Distance oracles for sparse graphs.
In 50th Annual IEEE Symposium on Foundations of Computer Science (FOCS),
pages 703–712, 2009. 15
[SWN92] Jacob Shapiro, Jerry Waxman, and Danny Nir. Level graphs and approximate
shortest path algorithms. Networks, 22:691–717, 1992. 51, 54
[SZ99] Avi Shoshan and Uri Zwick. All pairs shortest paths in undirected graphs with
integer weights. In Proceedings of the IEEE Symposium on Foundations of Computer
Science (FOCS), pages 605–615, 1999. 36
[Tak92] Tadao Takaoka. A new upper bound on the complexity of the all pairs shortest
path problem. Information Processing Letters, 43(4):195–199, 1992. Announced
at WG 1991. 36, 37
[Tak05] Tadao Takaoka. An $O(n^3 \log \log n / \log n)$ time algorithm for the all-pairs shortest
path problem. Information Processing Letters, 96(5):155–161, 2005. 36, 37
[Tak08] Tadao Takaoka. All pairs shortest paths via matrix multiplication. In Encyclopedia
of Algorithms. 2008. 36
[Tal04] Kunal Talwar. Bypassing the embedding: algorithms for low dimensional metrics.
In STOC '04: Proceedings of the thirty-sixth annual ACM symposium on Theory
of computing, pages 281–290, 2004. 49, 121
[TE09] Sunil Thulasidasan and Stephan Eidenbenz. Accelerating traffic microsimulations:
A parallel discrete-event queue-based approach for speed and scale. In The Winter
Simulation Conference, 2009. 10
[TF97] Sabine Timpf and Andrew U. Frank. Using hierarchical spatial data structures for
hierarchical spatial reasoning. In Spatial Information Theory: A Theoretical Basis
for GIS, International Conference COSIT '97, Laurel Highlands, Pennsylvania,
USA, October 15-18, 1997, Proceedings, pages 69–83, 1997. 51
[TGJ+02] Hongsuda Tangmunarunkit, Ramesh Govindan, Sugih Jamin, Scott Shenker, and
Walter Willinger. Network topology generators: degree-based vs. structural. In
Proceedings of the ACM SIGCOMM 2002 Conference on Applications, Technologies,
Architectures, and Protocols for Computer Communication, August 19-23,
2002, Pittsburgh, PA, USA, pages 147–159, 2002. 24
[Tho99] Mikkel Thorup. Undirected single-source shortest paths with positive integer
weights in linear time. Journal of the ACM, 46(3):362–394, 1999. Announced
at FOCS 1997. 33, 34, 104, 123
[Tho00a] Mikkel Thorup. Floats, integers, and single source shortest paths. Journal of
Algorithms, 35(2):189–201, 2000. Announced at STACS 1998. 33, 104
[Tho00b] Mikkel Thorup. On RAM priority queues. SIAM Journal on Computing, 30(1):86–109,
2000. Announced at SODA 1996. 34, 103
[Tho04a] Mikkel Thorup. Compact oracles for reachability and approximate distances in
planar digraphs. Journal of the ACM, 51(6):993–1024, 2004. Announced at FOCS
2001. 31, 46, 47, 83, 121
[Tho04b] Mikkel Thorup. Integer priority queues with decrease key in constant time and the
single source shortest paths problem. Journal of Computer and System Sciences,
69(3):330–353, 2004. Announced at STOC 2003. 34, 103
[Tho07] Mikkel Thorup. Equivalence between priority queues and sorting. Journal of the
ACM, 54(6), 2007. Announced at FOCS 2002. 33, 103
[Tit59] Jacques Tits. Sur la trialité et certains groupes qui s'en déduisent. Publications
Mathématiques de l'Institut des Hautes Études Scientifiques, 2(1), 1959. 28
[TKE+09] Sunil Thulasidasan, Shiva Kasiviswanathan, Stephan Eidenbenz, Emanuele Galli,
Susan Mniszewski, and Philip Romero. Designing systems for large-scale,
discrete-event simulations: Experiences with the FastTrans parallel microsimulator.
In HiPC - International Conference on High Performance Computing, 2009.
10
[TL07] Silke Trißl and Ulf Leser. Fast and practical indexing and querying of very large
graphs. In SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international
conference on Management of data, pages 845–856, 2007. 54, 57
[TM69] Jeffrey Travers and Stanley Milgram. An experimental study of the small world
problem. Sociometry, 32:425–443, 1969. 6
[TM80] Tadao Takaoka and Alistair Moffat. An $O(n^2 \log \log \log n)$ expected time
algorithm for the all shortest distance problem. In Mathematical Foundations of
Computer Science 1980 (MFCS'80), Proceedings of the 9th Symposium, Rydzyna,
Poland, September 1-5, 1980, pages 643–655, 1980. 35
[TM92] Edouard Thiel and Annick Montanvert. Chamfer masks: discrete distance
functions, geometrical properties and optimization. In 11th IAPR International
Conference on Pattern Recognition, 1992. Vol.III. Conference C: Image, Speech and
Signal Analysis, Proceedings., pages 244–247, Aug-3 Sep 1992. 11
[Tob70] Waldo R. Tobler. A computer movie simulating urban growth in the Detroit region.
Economic Geography, 46:234–240, 1970. 7
[Tsi95] John N. Tsitsiklis. Efficient algorithms for globally optimal trajectories. IEEE
Transactions on Automatic Control, 40(9):1528–1538, Sep 1995. 11
[Tur37] Alan Mathison Turing. On computable numbers, with an application to
the Entscheidungsproblem. Proceedings of the London Mathematical Society,
42(1):230–265, 1937. 26
[TZ00] Jesper Larsson Träff and Christos D. Zaroliagis. A simple parallel algorithm for
the single-source shortest path problem on planar digraphs. Journal of Parallel
and Distributed Computing, 60(9):1103–1124, 2000. Announced at IRREGULAR
1996. 35
[TZ01] Mikkel Thorup and Uri Zwick. Compact routing schemes. In ACM Symposium on
Parallelism in Algorithms and Architectures, pages 1–10, 2001. 49, 81, 82, 95
[TZ05] Mikkel Thorup and Uri Zwick. Approximate distance oracles. Journal of the ACM,
52(1):1–24, 2005. Announced at STOC 2001. 28, 40, 41, 42, 43, 44, 53, 56, 59,
78, 81, 82, 83, 88, 89, 95, 118, 121, 123, 125
[TZ06] Mikkel Thorup and Uri Zwick. Spanners and emulators with sublinear distance
errors. In Proceedings of the Seventeenth Annual ACM-SIAM Symposium on
Discrete Algorithms, SODA 2006, Miami, Florida, USA, January 22-26, 2006, pages
802–809, 2006. 27, 28, 29
[UCDG08] Antti Ukkonen, Carlos Castillo, Debora Donato, and Aristides Gionis. Searching
the wikipedia with contextual information. In CIKM '08: Proceeding of the 17th
ACM conference on Information and knowledge management, pages 1351–1352,
2008. 13
[UGC+00] Peter Uetz, Loic Giot, Gerard Cagney, Traci A. Mansfield, Richard S. Judson,
James R. Knight, Daniel Lockshon, Vaibhav Narayan, Maithreyan Srinivasan,
Pascale Pochart, Alia Qureshi-Emili, Ying Li, Brian Godwin, Diana Conover,
Theodore Kalbfleisch, Govindan Vijayadamodar, Meijia Yang, Mark Johnston,
Stanley Fields, and Jonathan M. Rothberg. A comprehensive analysis of
protein–protein interactions in Saccharomyces cerevisiae. Nature, 403:623–627, 2000. 6
[UY90] Jeffrey D. Ullman and Mihalis Yannakakis. The input/output complexity of
transitive closure. ACM SIGMOD Record, 19(2):44–53, 1990. 57
[vEB90] Peter van Emde Boas. Handbook of theoretical computer science (vol. A):
algorithms and complexity, chapter Machine models and simulations, pages 1–66.
1990. 26
[vEBKZ77] Peter van Emde Boas, Rob Kaas, and E. Zijlstra. Design and implementation of
an efficient priority queue. Mathematical Systems Theory, 10:99–127, 1977. 34
[Ves09] Alessandro Vespignani. Predicting the behavior of techno-social systems. Science,
325:425–428, 2009. 7, 10
[VFD+07] Monique V. Vieira, Bruno M. Fonseca, Rodrigo Damazio, Paulo B. Golgher, Davi
de Castro Reis, and Berthier Ribeiro-Neto. Efficient search ranking in social
networks. In CIKM '07: Proceedings of the sixteenth ACM conference on Conference
on information and knowledge management, pages 563–572, 2007. 13
[Vli78] Dirck Van Vliet. Improved shortest path algorithms for transport networks.
Transportation Research, 12(1):7–20, 1978. 51, 52, 55, 125
[VLL00] Bert Vogelstein, David Lane, and Arnold J. Levine. Surfing the p53 network.
Nature, 408(6810):307–310, 2000. 6, 12
[Vor07] Georgy Voronoi. Nouvelles applications des paramètres continus à la théorie des
formes quadratiques. Journal für die Reine und Angewandte Mathematik, 133:97–178,
1907. 98
[VPSV02] Alexei Vázquez, Romualdo Pastor-Satorras, and Alessandro Vespignani.
Large-scale topological and dynamical properties of the internet. Physical Review E
(Statistical, Nonlinear, and Soft Matter Physics), 65(6):066130, Jun 2002. 5
[VS94] Jeffrey Scott Vitter and Elizabeth A. M. Shriver. Algorithms for parallel memory
I: Two-level memories. Algorithmica, 12(2/3):110–147, 1994. 26, 43, 45
[VS09] Vishal Verma and Jack Snoeyink. Reducing the memory required to find a geodesic
shortest path on a large mesh. In 17th ACM SIGSPATIAL International Symposium
on Advances in Geographic Information Systems, ACM-GIS 2009, November 4-6,
2009, Seattle, Washington, USA, Proceedings, pages 227–235, 2009. 32
[Wag76] Robert A. Wagner. A shortest path algorithm for edge-sparse graphs. Journal of
the ACM, 23(1):50–57, 1976. 33, 35
[War62] Stephen Warshall. A theorem on boolean matrices. Journal of the ACM, 9(1):11–12,
1962. 8, 35, 37
[Wen91] Rephael Wenger. Extremal graphs with no C4's, C6's, or C10's. Journal of
Combinatorial Theory, Series B, 52(1):113–116, 1991. 28
[WH60] P. D. Whiting and J. A. Hillier. A method for finding the shortest route through a
road network. Journal of the Operational Research Society, 11(1/2):37–40, 1960.
8, 32
[Whi69] Leon S. White. Shortest route models for the allocation of inspection effort on a
production line. Management Science, 15(5):249–259, 1969. 8
[WHY+06] Haixun Wang, Hao He, Jun Yang, Philip S. Yu, and Jeffrey Xu Yu. Dual labeling:
Answering graph reachability queries in constant time. In ICDE '06: Proceedings
of the 22nd International Conference on Data Engineering, page 75, 2006. 54
[Wie73] Christian Wiener. Ueber eine Aufgabe aus der Geometria situs. Mathematische
Annalen, 6(1):29–30, 1873. Dated December 1871. 39
[Wie47] Harry Wiener. Structural determination of paraffin boiling points. Journal of the
American Chemical Society, 69(1):17–20, 1947. 12
[Wil64] J. W. J. Williams. Algorithm 232: Heapsort. Communications of the ACM, 7:347–348,
1964. 33, 34
[Win83] Peter M. Winkler. Proof of the squashed cube conjecture. Combinatorica,
3(1):135–139, 1983. 30
[WM03] Takashi Washio and Hiroshi Motoda. State of the art of graph-based data mining.
ACM SIGKDD Explorations Newsletter, 5(1):59–68, 2003. 4
[Wol08] Martin Joachim Wolff. Finding important edges for routing in networks. Master's
thesis, Universität Karlsruhe, 2008. 98
[Woo06] David P. Woodruff. Lower bounds for additive spanners, emulators, and more.
In 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS
2006), 21-24 October 2006, Berkeley, California, USA, Proceedings, pages 389–398,
2006. 28, 29
[Wri75] J. W. Wright. Reallocation of housing by use of network analysis. Operational
Research Quarterly, 26(2):253–258, 1975. 8
[WS98] Duncan J. Watts and Steven H. Strogatz. Collective dynamics of small-world
networks. Nature, pages 440–442, 1998. 4, 7
[WS03] Stefan Wuchty and Peter F. Stadler. Centers of complex networks. Journal of
Theoretical Biology, 223(1):45–53, 2003. 12
[WU93] Andrew J. Woldar and Vasiliy A. Ustimenko. An application of group theory to
extremal graph theory. In Group Theory: Proceedings of the Biennial Ohio State-
Denison Conference, pages 293–298, 1993. 28
[WW03] Dorothea Wagner and Thomas Willhalm. Geometric speed-up techniques for
finding shortest paths in large sparse graphs. In Algorithms - ESA 2003, 11th Annual
European Symposium, Budapest, Hungary, September 16-19, 2003, Proceedings,
pages 776–787, 2003. 54
[WW05] Dorothea Wagner and Thomas Willhalm. Drawing graphs to speed up
shortest-path computations. In Proceedings of the Seventh Workshop on Algorithm
Engineering and Experiments and the Second Workshop on Analytic Algorithmics and
Combinatorics, ALENEX /ANALCO 2005, Vancouver, BC, Canada, 22 January
2005, pages 17–25, 2005. 53
[WWZ05] Dorothea Wagner, Thomas Willhalm, and Christos D. Zaroliagis. Geometric
containers for efficient shortest-path computation. ACM Journal of Experimental
Algorithmics, 10, 2005. 54
[XNR10] Rongjing Xiang, Jennifer Neville, and Monica Rogati. Modeling relationship
strength in online social networks. In Proceedings of the 19th International
Conference on World Wide Web, WWW 2010, 2010. to appear. 2
[XWP+09] Yanghua Xiao, Wentao Wu, Jian Pei, Wei Wang, and Zhenying He. Efficiently
indexing shortest paths by exploiting symmetry in graphs. In EDBT '09:
Proceedings of the 12th International Conference on Extending Database Technology,
pages 493–504, 2009. 57, 83
[Yao79] Andrew Chi-Chih Yao. Some complexity questions related to distributive
computing (preliminary report). In STOC '79: Proceedings of the eleventh annual ACM
symposium on Theory of computing, pages 209–213, 1979. 60, 61
[Yao81] Andrew Chi-Chih Yao. Should tables be sorted? Journal of the ACM, 28(3):615–628,
1981. 26, 60
[Yao90] Andrew Chi-Chih Yao. Coherent functions and program checkers (extended
abstract). In Proceedings of the Twenty Second Annual ACM Symposium on Theory
of Computing, 14-16 May 1990, Baltimore, Maryland, USA, pages 84–94, 1990.
43
[YAR77] Andrew Chi-Chih Yao, David Avis, and Ronald L. Rivest. An $\Omega(n^2 \log n)$ lower
bound to the shortest paths problem. In Conference Record of the Ninth Annual
ACM Symposium on Theory of Computing, 2-4 May 1977, Boulder, Colorado,
USA, pages 11–17, 1977. 35
[YBLS08] Sihem Amer Yahia, Michael Benedikt, Laks V. S. Lakshmanan, and Julia
Stoyanovich. Efficient network aware search in collaborative tagging sites.
Proceedings of the VLDB Endowment, 1(1):710–721, 2008. 13
[Yen71] Jin Y. Yen. On Hu's decomposition algorithm for shortest paths in a network.
Operations Research, 19(4):983–985, 1971. 35, 51
[Yen72] Jin Y. Yen. Finding the lengths of all shortest paths in n-node nonnegative-distance
complete networks using $\frac{1}{2} n^3$ additions and $n^3$ comparisons. Journal of the ACM,
19(3):423–424, 1972. 36
[Yen73] Jin Y. Yen. Reply to Williams and White's note. Journal of the ACM, 20(3):390,
1973. 36
[Yuv76] Gideon Yuval. An algorithm for finding all shortest paths using $n^{2.81}$
infinite-precision multiplications. Information Processing Letters, 4(6):155–156, 1976.
36
[YWD08] Ruiyun Yu, Xingwei Wang, and Sajal K. Das. A Voronoi diagram approach for
mobile element scheduling in sparse sensor networks. In FGCN '08: Proceedings
of the 2008 Second International Conference on Future Generation Communication
and Networking, pages 62–67, 2008. 97
[YZ05] Raphael Yuster and Uri Zwick. Answering distance queries in directed graphs
using fast matrix multiplication. In FOCS '05: Proceedings of the 46th Annual
IEEE Symposium on Foundations of Computer Science, pages 389–396, 2005. 39
[Zah71] Charles T. Zahn. Graph-theoretical methods for detecting and describing Gestalt
clusters. IEEE Transactions on Computers, 20(1):68–86, 1971. 11
[Zar08] Christos Zaroliagis. Engineering algorithms for large network applications. In
Encyclopedia of Algorithms. 2008. 10, 55
[ZC01] J. Leon Zhao and Hsing Kenneth Cheng. Graph indexing for spatial data traversal
in road map databases. Computers & OR, 28(3):223–241, 2001. 54
[ZKM97] Athanasios K. Ziliaskopoulos, Dimitri Kotzinos, and Hani S. Mahmassani.
Design and implementation of parallel time-dependent least time path algorithms for
intelligent transportation systems applications. Transportation Research Part C:
Emerging Technologies, 5(2):95–107, 1997. 10
[ZN98] F. Benjamin Zhan and Charles E. Noon. Shortest path algorithms: An evaluation
using real road networks. Transportation Science, 32(1):65–73, 1998. 35
[ZN00] F. Benjamin Zhan and Charles E. Noon. A comparison between label-setting and
label-correcting algorithms for computing one-to-one shortest paths. Journal of
Geographic Information and Decision Analysis, 4(2):1–11, 2000. 35
[Zwi98] Uri Zwick. All pairs shortest paths in weighted directed graphs - exact and
almost exact algorithms. In Proceedings of the IEEE Symposium on Foundations of
Computer Science (FOCS), pages 310–319, 1998. 36
[Zwi01] Uri Zwick. Exact and approximate distances in graphs - a survey. In Algorithms
- ESA 2001, 9th Annual European Symposium, Aarhus, Denmark, August 28-31,
2001, Proceedings, pages 33–48, 2001. 32
[Zwi02] Uri Zwick. All pairs shortest paths using bridging sets and rectangular matrix
multiplication. Journal of the ACM, 49(3):289–317, 2002. 36, 37
[Zwi06] Uri Zwick. A slightly improved sub-cubic algorithm for the all pairs shortest paths
problem with real edge lengths. Algorithmica, 46(2):181–192, 2006. Announced
at ISAAC 2004. 36, 37
[ZZ94] J. Leon Zhao and Ahmed Zaki. Spatial data traversal in road map databases: A
graph indexing approach. In Proceedings of the Third International Conference on
Information and Knowledge Management (CIKM'94), Gaithersburg, Maryland,
November 29 - December 2, 1994, pages 355–362, 1994. 54