Orion: Shortest Path Estimation For Large Social Graphs
Xiaohan Zhao, Alessandra Sala, Christo Wilson, Haitao Zheng and Ben Y. Zhao
Department of Computer Science, UC Santa Barbara, USA
{xiaohanzhao, alessandra, bowlin, htzheng, ravenben}@cs.ucsb.edu
key source of error in network coordinate systems, graph coordinates could potentially be even more accurate.

We make three key contributions in this paper. First, we propose the use of graph coordinate systems to simplify node distance computation on large graphs. While similar in fundamental methodology to network coordinates, several critical differences force a ground-up redesign of graph coordinate systems. For example, while network coordinates can be easily tuned using fast latency measurements (e.g. via Internet ping), measuring actual distances between graph nodes can be very expensive. We describe Orion, a prototype graph coordinate system, and explore critical decisions in its design. Second, we perform extensive validation of Orion's node distance estimates using several real social graphs. Finally, we explore the utility of graph coordinate systems in graph analysis and social applications, and show that Orion produces effective results on large graphs for applications such as node separation metrics, centrality computation, and ranked social search.

Roadmap. We begin in Section 2 by defining our goals and assumptions, and describing key differences from prior work on network coordinate systems. We then describe the Orion graph coordinate system and explain key design decisions in Section 3. Next, we present accuracy measurements of Orion in Section 4, and show the effectiveness of Orion in computing graph metrics and graph applications in Section 5. Finally, we discuss future directions and conclude in Section 6.

2 Virtual Coordinates and Large Graphs

The goal of our work is to find a compact representation of distances between nodes in a graph, such that we can quickly and easily compute estimates of shortest path distances between any two nodes. We are inspired by the significant volume of prior work on the topic of network coordinate systems, much of which mapped distances between Internet hosts to distances in a Euclidean space. In this section, we briefly summarize prior work in network coordinates, and use it as context to identify key differences and challenges in the design of graph coordinate systems. Finally, we briefly discuss related projects as context for our work.

[...]tion. Applications that benefit from these systems include content distribution networks [24], multicast systems [3], distributed file systems [26] and file-sharing networks [1, 5].

The majority of network coordinate systems work by mapping an Internet host to a specific position in a Euclidean space based on round-trip measurements to other hosts. Depending on the protocol, a node's coordinates can be continually refined as additional measurement results are added to the system. Once a pair of nodes has converged to their positions in the coordinate space, their distance in the Internet (usually a round-trip-time or RTT value) can be predicted by computing the Euclidean distance between their coordinate values.

Based on the way coordinates are computed for new nodes, NC systems can be generally categorized into "landmark-based" and "decentralized" systems. Landmark-based systems such as GNP [20] first compute coordinates for an initial set of well-known landmark nodes using pair-wise measurements, where errors between virtual and measured distances are minimized using a non-linear optimization algorithm such as Simplex Downhill [19]. The NC then uses these nodes as fixed points to calibrate coordinate values for the rest of the network. Landmark-based systems [17, 20, 21, 22, 29] have fast convergence properties, since all nodes rely on the same fixed nodes for their coordinate calculations. However, the accuracy of these systems can suffer if the choice of landmark nodes is suboptimal, i.e. they do not sufficiently cover the network.

In contrast, decentralized NCs such as PIC [7] and Vivaldi [8] allow incoming nodes to orient themselves in the coordinate space using any nodes already positioned in the space. While these systems avoid dependence on well-known landmarks, new nodes can force already calibrated nodes to adjust their coordinates, potentially increasing convergence time and propagating errors. For further details on NC systems, we refer the reader to a recent survey [9].

Successes and Limitations. NC systems have been shown to be highly effective at improving performance of large distributed systems [12, 1]. However, more recent work has questioned the validity of using Euclidean spaces to approximate Internet latencies, which have been shown to violate the Triangle Inequality [13, 33].
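As a rough illustration of the two ideas in this background (predicting a distance from coordinates, and triangle inequality violations in measured latencies), the following Python sketch may help. The function names, the toy RTT matrix, and the coordinates are hypothetical and are not taken from any of the cited systems; the point is only that distances derived from a Euclidean embedding can never violate the triangle inequality, whereas measured latencies can.

```python
import math
from itertools import combinations


def euclidean_distance(p, q):
    """Predicted distance between two hosts, given their coordinates."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def count_tivs(dist, tol=1e-9):
    """Count triangle inequality violations in a symmetric distance matrix.

    A violation is an unordered pair (b, c) and a third node a such that
    dist[b][c] > dist[b][a] + dist[a][c].
    """
    n = len(dist)
    violations = 0
    for b, c in combinations(range(n), 2):
        for a in range(n):
            if a in (b, c):
                continue
            if dist[b][c] > dist[b][a] + dist[a][c] + tol:
                violations += 1
    return violations


# Hypothetical measured RTTs (ms): the direct path 0 -> 2 is slower than
# the detour through host 1, which is a triangle inequality violation.
rtt = [
    [0, 10, 50],
    [10, 0, 10],
    [50, 10, 0],
]

# Hypothetical coordinates; distances predicted from them are Euclidean,
# so they cannot contain a violation.
coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
predicted = [[euclidean_distance(p, q) for q in coords] for p in coords]
```

Here `count_tivs(rtt)` finds the single violating triangle, while `count_tivs(predicted)` finds none, which is why an embedding cannot reproduce such latencies exactly.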
tems, but must instead carefully reevaluate them in the context of graph distances.

Triangle Inequality. First, we note that while the presence of triangle inequality violations (TIV) is often identified as a barrier to accuracy in network coordinate systems, shortest path computation on graphs is guaranteed to be TIV free. This is inherent in the definition of the shortest path metric. The proof is straightforward by contradiction. Assume a triangle inequality violation for three nodes a, b, c, i.e. d(a, b) + d(a, c) < d(b, c), where d(a, b) represents the shortest path distance between nodes a and b. This scenario is impossible, because one can construct a "shorter" shortest path between b and c that is the concatenation of the shortest paths between (b, a) and (a, c). At minimum, the sum of lengths of two shortest paths in the triangle is equal to the length of the third. This property means a graph coordinate system does not have to support TIVs by resorting to complex algorithms such as matrix factorization [17].

Cost of Measurements. The second and most critical difference between these two problems is the cost of obtaining ground truth distance values between two nodes. In Internet latency estimation, a running system can perform a latency measurement with minimal cost via Internet Ping. In contrast, measuring the shortest path between graph nodes is expensive, and can take at worst time O(n+m). In addition, computing the distance from a to b using BFS effectively computes the shortest path from a to all other nodes in the graph. With these factors in mind, we must carefully consider how graph coordinates obtain real node distances for node calibration. We must minimize the number of overall BFS operations, while reusing the results from each BFS operation as much as possible.

Error Sensitivity. Finally, graph coordinate systems face an additional challenge of higher error sensitivity. While latency between Internet nodes can vary from sub-milliseconds to hundreds of milliseconds, node distances on small-world graphs tend to have much smaller variance. For example, diameters of recently measured Facebook graphs are less than 20 [32]. Additionally, all node distance values are integers. This means node distance values across different paths in a graph are significantly more clustered across a small number of possible values, and any estimation errors can be rounded up. Thus, a graph coordinate system must provide reasonably high accuracy in order to be useful in graph applications.

2.3 Related work

Shortest Path Methods. Shortest path computations are extremely costly on large graphs. Rattigan et al. propose to compute node positions in a graph by exploiting a coordinate-like approach, called network structure index (NSI) [25]. Compared to Orion, NSI is more expensive in both time and space complexity. The space complexity of NSI is O(nkD), where k is the number of zones and D is the number of dimensions, which is k times higher than Orion's. On the other hand, NSI's time complexity, O(mkD), is proportional to the number of edges m while Orion takes only O(nkD) time, where n is the number of nodes. This also represents a significant decrease in time complexity, since m is several orders of magnitude larger than n in online social graphs. Furthermore, unlike our work, annotation distances computed by NSI are not the number of hops between node pairs.

Recent work by Potamias et al. [23] proposes a landmark scheme for approximating shortest path distances. The approach is similar in spirit, but stores for each node its distance to every landmark. In contrast, Orion is more compact. It stores for each node a coordinate address of e.g. 10 values, independent of the number of landmarks used. In addition, our work considers the broader problem of embedding large graphs into known coordinate spaces, and evaluates our work using a broad array of applications.

Social Networks. A significant amount of research effort has been invested to understand OSNs such as MySpace, Orkut [2], Flickr, LiveJournal [18], Facebook [32], and Twitter [10]. Social networks are characterized by graph properties like power-law degree distribution, small-world clustering, and scale-free behavior [16]. A necessary precondition for quantifying some of these characteristics is calculating node separation metrics (i.e. radius, diameter and average path length) that are based on all-pairs shortest paths. Some social applications also leverage shortest path computations, such as distance-based community detection [11]. Unfortunately, computing all-pairs shortest paths on today's social graphs is infeasible, since they often have millions of nodes and hundreds of millions of edges. Existing studies sidestep this issue by using sampling techniques to estimate the graph's true values [18, 32]. In contrast, our solution computes shortest paths between node pairs in 0.2 microseconds, making it a scalable solution for computing all-pairs shortest paths on massive social graphs.

3 Designing Orion

In this section, we present the Orion graph coordinate system and explain our design decisions in detail. Similar to network coordinate systems, graph coordinate systems work in two phases. First, nodes in the graph are iteratively added to the coordinate space, the position of each node being calibrated by ground truth
node-distance measurements. This "calibration phase" is where a graph coordinate system incurs its one-time computational overhead. Once all nodes in the graph have been added, the resulting system can be integrated with graph applications to answer node distance queries with estimates.

Figure 1: Mapping graph nodes into Euclidean coordinate space. For most node pairs, the Euclidean distance exactly matches the hop-count separating them in the original graph.

Since the per-query computation cost is O(1), the focus of our design is to ensure the calibration phase is computationally efficient, and the results are as accurate as possible. More specifically, our goals are three-fold:
• Scalability. The computational cost of the calibration phase must scale linearly with the number of nodes, i.e. O(n).
• Accuracy. While individual node distance predictions might incur reasonable errors, predictions should approximate ground truth at the large scale.
• Fast convergence. Impact of individual node calibrations should be localized, i.e. should not trigger significant new adjustments to their neighbors.

Based on these goals, we now describe the Orion design and explain key decisions.

3.1 A Landmark-based Approach

Figure 1 illustrates how Orion maps nodes in a graph to positions in a D-dimension Euclidean coordinate space. The goal is to accurately translate pairwise hop-count distances in the graph into Euclidean distances in the coordinate space. To do this, Orion uses a landmark approach, where the positions of all nodes are calibrated with their relative distances to a fixed number (k) of chosen landmark nodes. Landmark nodes are initially chosen from the entire graph based on their position and degree of connectivity.

Why Landmarks? We use a landmark-based scheme in Orion for two main reasons. First and foremost, we wish to minimize the number of shortest path computations needed to establish ground truth on the actual [...] a graph, since each computation can, in the worst case, require a full traversal of the graph. Using a landmark approach, we limit the total number of Breadth-First-Search operations to k, the number of landmarks. Each BFS computes the shortest path distance from a landmark to all other nodes. Computing BFS for all landmarks essentially precomputes all values needed to calibrate all nodes in the graph. In contrast, a decentralized approach such as the physical springs model used by Vivaldi [8] requires shortest path computations between random node pairs, thus drastically increasing the number of BFS operations.

The second advantage of a landmark-based scheme is that the positions of incoming nodes depend only on the landmark nodes. This bounds the number of operations required to compute a node's position, guaranteeing fast convergence. In contrast, in decentralized models adding a new node will often force its nearby neighbors to make adjustments on their position, a process that can propagate adjustments iteratively throughout the entire space.

Finally, we note that the challenges that make Landmark systems undesirable in Internet systems do not apply in our context. In network coordinate systems, landmarks are physical machines that must remain available at all times, and processing load from other applications (e.g. web traffic) can affect the accuracy of latency measurements to other machines in the network [21]. Compromised landmarks can also significantly impact the entire system [9]. Those issues do not exist for graph coordinates, where nodes are just graph vertices and all computation can be performed on a centralized server.

3.2 Scalable Landmark Coordinates

Intuitively, the number of landmarks used to calibrate a graph should have a direct impact on the accuracy of the Euclidean mapping. Similar correlation between landmarks and accuracy has been observed in the context of network coordinate systems [20]. The highly connected and complex nature of social graphs leads us to believe that an accurate graph coordinate system requires a significant number of landmarks. The challenge is to find a way to accurately and quickly compute the coordinates for a large number of landmarks.

Traditional network coordinates determine a node's D-dimension coordinates by minimizing the sum of squares of prediction errors using the Simplex Downhill algorithm [19], a nonlinear optimization algorithm. The algorithm runs in $O(k^2 \cdot D)$ time to compute coordinates of k landmarks.

Since running Simplex Downhill on our desired number of landmarks (up to 100 in our study) is computationally expensive, we propose a new approach, where we separate our landmarks into two groups, a small initial group of 16 landmarks, and a larger secondary group composed of the remaining landmarks.

We leverage the Simplex Downhill algorithm to compute the coordinates for the initial ($k_I = 16$) landmarks, thus its asymptotic complexity is $O(k_I^2 \cdot D)$. The secondary group of landmarks calibrate their positions using the initial $k_I$ landmarks as anchors, contributing to a computational complexity of only $O(k_I \cdot D)$ each. Thus, the total time required to compute landmark coordinates is $O(k_I^2 \cdot D) + (k - k_I) \times O(k_I \cdot D)$, where k is the total number of landmarks.

Furthermore, we describe two ways to compute the coordinates of the secondary group of landmarks, while maintaining the same computational complexity. In the global approach, we compute the coordinates of each node in the secondary group relying only on the initial group as anchors. In the incremental landmarks approach, nodes in the secondary group are added one by one. Once a node receives its coordinate values, it becomes an anchor for all remaining nodes. To compute its coordinates, any remaining node in the secondary group can choose any $k_I$ nodes from all embedded nodes to be its landmarks.

3.3 Landmark Selection

Finally, we consider the problem of choosing landmark nodes to produce the most accurate graph to Euclidean coordinate mapping. Prior work by Potamias et al. considered the problem of choosing landmarks, and concluded experimentally that choosing nodes with high centrality performed significantly better than random choice [23]. Given the complexity of computing node centrality, we consider two groups of alternative landmark selection strategies as possible approximations of centrality-based selection: Random and High-degree.
• Random. This is the basic landmark selection strategy. Landmarks are chosen uniformly at random from all nodes in the graph.
• High-degree. Prior measurements on social networks [18, 32] show that social graphs exhibit a power-law-like degree distribution. Intuitively, high degree nodes reside at the core of social graphs, effectively approximating central nodes. This strategy chooses nodes with the highest degree.
• Landmark separation. Closely positioned landmarks are less effective at "covering" the graph as anchors. Therefore, we add variants to the two basic strategies, where we select the landmarks one by one, ignore any potential landmarks that are too close in the graph to existing landmarks, and continue selecting landmarks until the desired number has been met.

We consider these strategies as approximations of the high-centrality strategy, and evaluate their effectiveness empirically in Section 4.

Summary. Orion works as a landmark-based scheme, where an initial core of 16 landmarks is first fixed in the space using Simplex Downhill optimization. A secondary group of landmarks position themselves based on the original landmarks. Finally, all remaining graph nodes calibrate their positions based on node distances obtained from computing BFS from all landmarks.

4 Experimental Results

In this section we analyze the accuracy of Orion's node distance estimates. We study the impact on accuracy by key factors: Landmark selection strategy, cardinality of the Landmark set, and dimensionality of node coordinates. We preface our core discussion with an overview of the experimental environment and evaluation metrics.

4.1 Experimental Setup

We evaluate Orion accuracy using four anonymized datasets (Egypt, India, Los Angeles and Norway) gathered from Facebook regional networks [32]. These graphs were chosen because they are large, but not too large to make graph analysis intractable. Their statistical properties are consistent with other OSN datasets [2, 30]. Table 1 reports their basic properties.

Network       Nodes   Edges    Avg. Path Len.
Norway        293K    5,589K   4.2
Egypt         246K    1,618K   5.0
Los Angeles   275K    2,115K   5.1
India         363K    1,556K   6.1
Table 1: Properties of Social Graphs

All experiments were run on 2.4 GHz, dual core Xeon servers with 32GB of RAM. All machines ran Fedora Core, kernel version 2.6.x.

Evaluation Metrics. We use two key metrics to evaluate Orion accuracy. The first is Relative Error. This metric is widely used in the study of Network Coordinate Systems, although it must be modified slightly in order to evaluate graph coordinate systems. Let a and b be two nodes in the graph. Let $d^m_{a,b}$ be the measured distance between a and b on the real graph using the BFS algorithm, and let $d^P_{a,b}$ be the estimated distance computed using a and b's coordinates from Orion. In our context, the relative error is:

$$R_e = \frac{|d^m_{a,b} - d^P_{a,b}|}{d^m_{a,b}} \qquad (1)$$
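Equation (1) can be computed directly from a measured hop count and a coordinate-based estimate. The short Python sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the averaging helper assumes that an average relative error is the plain mean of the per-pair values over the evaluated node pairs.

```python
def relative_error(measured, predicted):
    """Relative error of Equation (1): |d_m - d_P| / d_m.

    `measured` is the BFS hop count between two distinct nodes (so it is
    at least 1); `predicted` is the coordinate-based estimate.
    """
    if measured <= 0:
        raise ValueError("measured distance must be positive")
    return abs(measured - predicted) / measured


def average_relative_error(pairs):
    """Mean relative error over (measured, predicted) pairs.

    Assumption: the average is an unweighted mean over node pairs.
    """
    errors = [relative_error(m, p) for m, p in pairs]
    return sum(errors) / len(errors)
```

For example, a pair that is 4 hops apart but estimated at 5 has a relative error of 0.25. Because hop counts on small-world graphs are small integers, a one-hop misestimate already produces a large relative error, which is the error sensitivity discussed in Section 2.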
Figure 2: ARE of nodes' distances with different combination of landmark selection and computation strategies in India graph. (x-axis: minimum distance between landmarks, with values Original, 2-hop, 3-hop, 4-hop; y-axis: average relative error; legend entries: Random Strategy (Global), High-degree Strategy (Global).)

Figure 3: CDF of relative error on nodes distances on India. (x-axis: relative error; y-axis: CDF; legend entries: 100 landmarks, 30 landmarks.)
[Figure: average relative error; legend entries: Norway Avg. Relative Error, Egypt Avg. Relative Error]

To demonstrate Orion's utility and accuracy in an operational setting, we integrate Orion into several graph analysis and social applications that make extensive use of shortest path computations. Under normal conditions, these graph metrics and applications can be computationally intractable for large graphs. We show that we can use Orion to scalably obtain answers that reasonably approximate answers obtained from deterministic methods. Specifically, we look at three common operations: computing node separation metrics such as graph radius, diameter and average path length, locating central nodes in a graph, and ranked social search.

Metric   Method   India   Egypt   L. A.   Norway
Radius   Orion    11.7    9.5     10.8    8.1

Accuracy Results. For a scalable side-by-side comparison, we randomly sample 1000 nodes from each of the graphs, and compute graph radius, diameter and average path length based on BFS from those nodes to all other nodes in the graph. We compare those results to those generated using node distance estimations from Orion, and show the results in Table 3. We find that Orion performs very well in predicting these metrics. For graph radius and diameter, it always provides a result that is less than 1 hop from the BFS answer. In the case of average path length, Orion is even more accurate, and provides results that never deviate more than 0.3 from BFS.
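The sampled BFS baseline described above can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: the adjacency-list representation and function names are hypothetical, the graph is assumed to be connected with at least two nodes, and radius and diameter are taken to be the minimum and maximum eccentricity over the sampled nodes (the standard definitions, which the text does not spell out).

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop counts from `source` to every reachable node (one BFS, O(n + m))."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def separation_metrics(adj, sample):
    """Estimate radius, diameter and average path length from sampled sources.

    Assumes a connected graph with at least two nodes.
    """
    eccentricities = []
    total = 0
    count = 0
    for s in sample:
        dist = bfs_distances(adj, s)
        others = [d for node, d in dist.items() if node != s]
        eccentricities.append(max(others))  # farthest node from s
        total += sum(others)
        count += len(others)
    return {
        "radius": min(eccentricities),
        "diameter": max(eccentricities),
        "avg_path_length": total / count,
    }


# Toy path graph 0 - 1 - 2 - 3 - 4, sampling every node.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
metrics = separation_metrics(path_graph, sample=range(5))
```

A coordinate-based version would keep the same aggregation but replace each `bfs_distances` call with Euclidean distances computed from node coordinates, which is where the per-query O(1) cost pays off.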
Figure 5: Accuracy of top k high centrality nodes. (x-axis: top # of 1000 nodes, with values 50, 100, 200; y-axis: accuracy; legend entries: India, Egypt, Los Angeles, Norway.)

Figure 6: Accuracy of top k ranked nodes. (x-axis: top # of 100 responses, with values 5, 10, 20, 50; y-axis: accuracy; legend entries: Norway, Egypt, Los Angeles, India.)

5.2 Computing Node Centrality

Information dissemination is an active research area of social networks. Viral spread [31], influence campaigns [4, 14], and breaking-news coverage [10] are all examples of information dissemination problems on social graphs. A critical, but computationally expensive, metric necessary for these applications is node centrality. We leverage Orion coordinates to compute node centrality in order to compare its speed and accuracy with centrality calculations performed using traditional shortest-path algorithms.

Centrality is defined as the average shortest path length from a node a to every other node in the graph. The smaller the average path length for a node is, the higher its centrality is. Using Orion, a node can quickly estimate its centrality by computing its average Euclidean distance to all other nodes in the graph.

We estimate the precision of computing node centrality via Orion by comparing its results to actual results computed using BFS. Computationally, node centrality also requires all pairs of shortest paths computation, and our time estimates from node separation metrics also apply here (152 hours for our LA graph).

Accuracy Results. To keep computation time manageable, we again sample 1000 random nodes from each graph, and compute node centrality values for each node using both Orion and BFS. We sort nodes based on their average shortest path length to every other node in the network, in increasing order. Then we select the top k nodes from each resulting group, and count the number of top k central nodes (according to BFS) that also appeared in Orion's results. We repeat this for 5 sets of 1000 random nodes and average the result.

Figure 5 shows the percentage of top-k nodes that are correctly found by Orion, for different values of k: 50, 100, and 200. The overlap between Orion and BFS' results increases with k. As with results in Section 4.3, centrality results for India are more accurate because it has the longest average path lengths of our sample graphs. The results are generally good across the board, with Orion giving correct estimates more than 50% of the time, when selecting top 50 highest centrality nodes out of 1000.

5.3 Ranked Social Search

Online social networks often need to rank their query results by proximity in the social graph to the query owner. For example, searches for specific names on Facebook and LinkedIn will only return the top results that are closest in social distance to the user. Social distance is used to rank query results because users generally care about people close to their social circles.

We implement a ranked social search application. In each graph, we randomly select 100 nodes to represent the total set of results for each query. We run the simulation 5000 times, each time with a randomly chosen node as the point of origin for the query.

Accuracy Results. We sort the randomly selected 100 nodes in increasing order of distance from the query origin and choose the top k nodes. Then we count the amount of overlap in the two sets of top k nodes computed by Orion and the BFS-based approach. We define the accuracy of the ranked social search in Orion as the ratio of the number of overlapping nodes to the total number of all considered nodes. Figure 6 plots the accuracy values over different values of k, averaged across the 5000 runs. Again, Orion's social search produces fairly good results, with more than 60% overlap when choosing the top 20 responses.

6 Conclusions and Future Directions

Shortest path computation is one of the most critical and computationally intensive primitives for both graph analysis and social networking applications. We propose graph coordinate systems, a new approach to dramatically reduce the complexity of shortest paths computation by mapping the entire graph into a multi-dimensional Euclidean coordinate space. We describe the design of Orion, an efficient graph coordinate prototype. Mapping a graph of n nodes takes time $O(k_I \cdot D \cdot n)$ (roughly 2-3 hours for a 275K node graph), after which each node distance estimation takes less than 0.2 microseconds. Our experiments show Orion can provide accurate results both for graph metrics such as graph radius and node centrality, as well as graph-based applications such as ranked social search.

Future Directions. We believe graph coordinate systems are a promising new research direction for scalable graph analysis. While our work here is preliminary, we see three immediate areas for future work. First, we would like to explore the efficacy of mapping graphs to non-Euclidean coordinate systems such as spherical and hypercube. Second, we will examine the impact of graph coordinates on weighted graphs, e.g. geographical graphs or temporal distance metrics for social graphs [28]. Finally, Orion is designed for static graphs. Adding new nodes to the graph after the initial mapping can change shortest path values for portions of the graph and force a re-mapping of the graph. We will investigate mechanisms and heuristics to allow run-time modifications to graphs already mapped to the coordinate space.

Acknowledgments

This material is based in part upon work supported by the National Science Foundation under grants IIS-847925, CNS-0916307, and CAREER CNS-0546216.

[10] Kwak, H., Lee, C., Park, H., and Moon, S. What is twitter, a social network or a news media? In Proc. of WWW (2010).
[11] Lancichinetti, A., and Fortunato, S. Community detection algorithms: A comparative analysis. Phys. Rev. E 80, 5 (Nov 2009).
[12] Ledlie, J., Gardner, P., and Seltzer, M. I. Network coordinates in the wild. In Proc. of NSDI (April 2007).
[13] Lee, S., Zhang, Z., Sahu, S., and Saha, D. On suitability of euclidean embedding of internet hosts. In Proc. of SIGMETRICS (June 2006).
[14] Leskovec, J., et al. Cost-effective outbreak detection in networks. In Proc. of KDD (2007).
[15] Leskovec, J., and Horvitz, E. Planetary-scale views on a large instant-messaging network. In Proc. of WWW (2008).
[16] Li, L., et al. Towards a theory of scale-free graphs: Definition, properties, and implications. Internet Math 2, 4 (2005), 431–523.
[17] Mao, Y., Saul, L., and Smith, J. M. Ides: An internet distance estimation service for large networks. IEEE JSAC 24, 12 (Dec. 2006), 2273–2284.
[18] Mislove, A., et al. Measurement and analysis of online social networks. In Proc. of IMC (Oct 2007).
[19] Nelder, J. A., and Mead, R. A simplex method for function minimization. The Computer Journal 7, 4 (Jan. 1965), 308–313.
[20] Ng, T. S. E., and Zhang, H. Predicting internet network distance with coordinates-based approaches. In Proc. of INFOCOM (New York, NY, June 2002).
[21] Ng, T. S. E., and Zhang, H. A network positioning system for the internet. In Proc. of USENIX ATC (June 2004).
[22] Pias, M., et al. Lighthouses for scalable distributed location. In Proc. of IPTPS (Feb. 2003).
[23] Potamias, M., et al. Fast shortest path distance estimation in large networks. In Proc. of CIKM (Hong Kong, Nov. 2009).
[24] Ratnasamy, S., Handley, M., Karp, R., and Schenker, S. Topologically-aware overlay construction and server selection. In Proc. of INFOCOM (2002), IEEE.