The second axiom simply says that a vertex that does not dominate any other
vertex has no relational power and hence gets the value zero:
Axiom 2: Dummy vertex property. For every G ∈ Gₙ and i ∈ V satisfying N_G⁺(i) = ∅ it holds that f_G(i) = 0.
In the third axiom the authors formalize the fact that if two vertices have the
same dominance structure, i.e. the same number of dominated vertices and the
same number of dominating vertices, then they should get the same dominance-
value:
Axiom 3: Symmetry. For every G ∈ Gₙ and i, j ∈ V satisfying d_G⁺(i) = d_G⁺(j) and d_G⁻(i) = d_G⁻(j) it holds that f_G(i) = f_G(j).
Finally, the fourth axiom addresses the case of putting together directed
graphs. It says that if several directed graphs are combined in such a way that
a vertex is dominated in at most one directed graph (i.e. if the result of the
combination may be viewed as an independent partition), then the total dominance value of a vertex should simply be the sum of its dominance values in the
directed graphs.
Axiom 4: Additivity over independent partitions. For every G ∈ Gₙ and every independent partition {G₁, . . . , G_K} of G it holds that

f_G = Σ_{k=1}^{K} f_{G_k}.
σ_G(i) = d_G⁺(i)   ∀ i ∈ V, G ∈ Gₙ
Instead of taking the number of dominated vertices as the total value that is
distributed over the vertices according to their dominance, the total number of
relations is now taken as a basis for normalization:
Axiom 1b: Score normalization. For every G ∈ Gₙ it holds that

Σ_{i∈V} f_G(i) = |E|.
Above, we presented a set of axioms that describe a certain measure that has
some aspects of feedback centralities but also links to the preceding section via
its strong relation to the score measure. We now turn to feedback centralities in the narrower sense.
and set W D_ω⁻¹ to be the normalized weight matrix, and D_α = diag(α(i)). Then the ranking problem ⟨V, α, W⟩ is defined for the vertex set V of a discipline, the associated vertex weights α and the corresponding citation matrix W, and considers the ranking (a centrality vector) c_PHV ≥ 0 that is normalized with respect to the l₁-norm: ‖c_PHV‖₁ = 1.
The authors consider two special classes of ranking problems:
1. ranking problems with all vertex weights equal, α(i) = α(j) ∀ i, j ∈ V
(isoarticle problems) and
2. ranking problems with all reference intensities equal, ω(·, i)/α(i) = ω(·, j)/α(j) ∀ i, j ∈ V (homogeneous problems).
To relate small and large problems, the reduced ranking problem R_k for a ranking problem R = ⟨V, α, W⟩ with respect to a given vertex k is defined as R_k = ⟨V \ {k}, (α(i))_{i∈V\{k}}, (ω_k(i, j))_{(i,j)∈V\{k}×V\{k}}⟩, with

ω_k(i, j) = ω(i, j) + ω(k, j) · ω(i, k) / Σ_{l∈V\{k}} ω(l, k)   ∀ i, j ∈ V \ {k}.
Φ(⟨V, α, WΓ⟩) = Φ(⟨V, α, W⟩)

for all ranking problems ⟨V, α, W⟩ and every matrix Γ = diag(γⱼ)_{j∈V} with γⱼ > 0 ∀ j ∈ V.
Axiom 2: (weak) homogeneity
Φ_i(R) / Φ_j(R) = ω(i, j) / ω(j, i).      (5.7)

Φ_i(R) / Φ_j(R) = Φ_i(R_k) / Φ_j(R_k)   ∀ i, j ∈ V \ {k}.      (5.8)
Palacios-Huerta and Volij show that the ranking method assigning the Pinski-Narin centrality cPN, given as the unique solution of

D_α⁻¹ W D_W⁻¹ D_α c = c,

is characterized by the axioms above.
Assume that a network is modified slightly, for example due to the addition of a new link or the inclusion of a new page in the case of the Web graph. In this situation the 'stability' of the results is of interest: does the modification invalidate the computed centralities completely?
In the following subsection we will discuss the topic of stability for distance
based centralities, i.e., eccentricity and closeness, introduce the concept of stable,
quasi-stable and unstable graphs and give some conditions for the existence of
stable, quasi-stable and unstable graphs.
A second subsection will cover Web centralities and present results for the
numerical stability and rank stability of the centralities discussed in Section 3.9.3.
Kishi [357] calls a graph for which the second case
(Equation 5.10) occurs an unstable graph with respect to c. Figures 5.2 and 5.3
in Section 5.4.1 show unstable graphs with respect to the eccentricity and the
closeness centrality. The first case (Equation 5.9) can be further classified into two subcases.
A graph G is called a stable graph if the first case (Equation 5.11) occurs,
otherwise G is called a quasi-stable graph. The definition of stable graphs with respect to c supports Sabidussi's claim [500] that an edge added at a central vertex u ∈ Sc(G) should strengthen its position.
In Figure 5.4 an example of a quasi-stable graph with respect to the closeness centrality is shown. For each vertex the status value s(u) = Σ_{v∈V} d(u, v) is indicated. Adding the edge (u, v) leads to a graph with a new central vertex v.
In [357] a more generalized form of closeness centrality is presented by Kishi:
The centrality value cGenC (u) of a vertex u ∈ V is
cGenC(u) = 1 / Σ_{k=1}^{∞} a_k n_k(u)      (5.13)
Fig. 5.4. A quasi-stable graph with respect to the closeness centrality. The values
indicate the total distances s(u). After inserting the edge (u, v) the new median is
vertex v
where nk (u) is the number of vertices whose distance from u is k and each ak is
a real constant. With ak = k it is easy to see that
1 / Σ_{k=1}^{∞} a_k n_k(u)  =  1 / Σ_{v∈V} d(u, v)  =  cC(u).
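As a small illustration (an addition, not from the original text), the generalized centrality of Equation 5.13 can be evaluated from the distance distribution of a connected graph; with a(k) = k it coincides with the closeness value:

import networkx as nx
from collections import Counter

def generalized_closeness(G, u, a):
    # n_k(u): number of vertices at distance exactly k from u
    dist = nx.single_source_shortest_path_length(G, u)
    n_k = Counter(d for v, d in dist.items() if v != u)
    return 1.0 / sum(a(k) * cnt for k, cnt in n_k.items())

# With a(k) = k this equals the closeness value 1 / sum_v d(u, v).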
Kishi and Takeuchi [358] have shown under which conditions there exists
a stable, quasi-stable, and unstable graph for generalized centrality functions
cGenC of the form in Equation 5.13:
Theorem 5.5.1. For any generalized vertex centrality cGenC of the form in Equation 5.13 the following holds:
1. if a2 < a3 , then there exists a quasi-stable graph, and
2. if a3 < a4 , then there exists an unstable graph.
Theorem 5.5.2. Any connected undirected graph G is stable if and only if the
generalized vertex centrality cGenC given in Equation 5.13 satisfies a2 = a3 .
Moreover, G is not unstable if and only if cGenC satisfies a3 = a4 .
Sabidussi has shown in [500] that undirected trees are stable graphs with respect to the closeness centrality.
Theorem 5.5.3. If an undirected graph G forms a tree, then G is stable with
respect to the closeness centrality.
Numerical Stability. Langville and Meyer [378] remark that it is not reason-
able to consider the linear system formulation of, e.g., the PageRank approach
and the associated condition number3 , since it may be that the solution vector of
the linear system changes considerably while the normalized solution vector stays almost the same. Hence we consider the stability of the eigenvector problem, which is the basis for the different Web centralities mentioned
in Section 3.9.3.
Ng et al. [449] give a nice example showing that an eigenvector may vary con-
siderably even if the underlying network changes only slightly. They considered
a set of Web pages where 100 of them are linked to algore.com and the other
103 pages link to georgewbush.com. The first two eigenvectors (or, in more de-
tail, the projection onto their nonzero components) are drawn in Figure 5.5(a).
How the scene changes if five new Web pages linking to both algore.com and
georgewbush.com enter the collection is then depicted in Figure 5.5(b).
Fig. 5.5. A small example showing instability resulting from perturbations of the
graph. The projection of the eigenvector is shown and the perturbation is visible as a
strong shift of the eigenvector
Regarding the Hubs & Authorities approach, Ng et al. [449] give a second example, cf. Figures 5.6(a) and 5.6(b). In the Hubs & Authorities algorithm the eigenvector corresponding to the largest eigenvalue of S = AᵀA is computed. The solid lines in the figures represent the contours of the quadratic form xᵀSᵢx for two matrices S₁, S₂ as well as the contours of the slightly (but equally) perturbed matrices. In both figures the associated eigenvectors are depicted. The difference (strong shift in the eigenvectors in the first case, almost no change in the eigenvectors in the
³ cond(A) = ‖A‖ · ‖A⁻¹‖ (for A regular)
second case) between the two figures stems from the fact that S₁ has an eigengap⁴ δ₁ ≈ 0 whereas S₂ has eigengap δ₂ = 2. Hence, if the eigengap is almost zero, the algorithm may be very sensitive to small changes in the matrix, whereas if the eigengap is large the sensitivity is small.
Fig. 5.6. A simple example showing the instability resulting from different eigengaps.
The position of the eigenvectors changes dramatically in the case of a small eigengap
(a)
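The phenomenon is easy to reproduce numerically. The following sketch is only an illustration of the eigengap effect (it is not the example of Ng et al.): two symmetric matrices, one with a tiny and one with a large eigengap, receive the same small symmetric perturbation:

import numpy as np

def principal_eigvec(S):
    # eigenvector for the largest eigenvalue of a symmetric matrix
    vals, vecs = np.linalg.eigh(S)
    return vecs[:, -1]

rng = np.random.default_rng(0)
E = rng.standard_normal((2, 2))
E = 0.05 * (E + E.T)              # small symmetric perturbation

S1 = np.diag([1.0, 1.0 + 1e-6])   # eigengap close to zero
S2 = np.diag([1.0, 3.0])          # eigengap 2

for S in (S1, S2):
    v, v_pert = principal_eigvec(S), principal_eigvec(S + E)
    print(abs(v @ v_pert))        # close to 1 means the eigenvector barely moved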
If we consider the PageRank algorithm, the first fact to note is that for a Markov chain with transition matrix P the sensitivity of the principal eigenvector is determined by the difference between the second eigenvalue and 1. As shown by Haveliwala and Kamvar [290], the second eigenvalue of the PageRank matrix with P having at least two irreducible closed subsets satisfies λ₂ = d. This is true even in the case that in Formula 3.43 the vector 1ₙ is substituted by any stochastic vector v, the so-called personalization vector; cf. Section 5.2 for more information about the personalization vector.

⁴ Difference between the first and the second largest eigenvalue.
Therefore a damping factor of d = 0.85 (this is the value proposed by the founders of Google) yields in general much more stable results than d = 0.99; such stability is desirable if the results for the original Web graph and for its perturbed version should be as similar as possible.
Ng et al. [449] proved
Theorem 5.5.6. Let U ⊆ V be the set of pages at which the outlinks are changed, let c_PR be the old PageRank score and c_PR^U be the new PageRank score corresponding to the perturbed situation. Then

‖c_PR − c_PR^U‖₁ ≤ (2/(1 − d)) · Σ_{i∈U} c_PR(i).
Bianchini, Gori and Scarselli [61] were able to strengthen this bound. They
showed
Theorem 5.5.7. Under the same conditions as given in Theorem 5.5.6 it holds that

‖c_PR − c_PR^U‖₁ ≤ (2d/(1 − d)) · Σ_{i∈U} c_PR(i).
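The bound can be checked experimentally; the following sketch uses networkx, and the graph, the perturbed page set U and the new outlinks are arbitrary choices for illustration:

import networkx as nx

d = 0.85
G = nx.gnp_random_graph(50, 0.1, seed=1, directed=True)
c = nx.pagerank(G, alpha=d)

U = [0, 1]                        # pages whose outlinks are changed
H = G.copy()
for u in U:
    H.remove_edges_from(list(H.out_edges(u)))
    H.add_edge(u, (u + 5) % 50)   # arbitrary new outlinks
c_new = nx.pagerank(H, alpha=d)

lhs = sum(abs(c[v] - c_new[v]) for v in G)
rhs = 2 * d / (1 - d) * sum(c[u] for u in U)
print(lhs <= rhs)                 # holds up to the solver's tolerance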
Rank Stability. When considering Web centralities, the results are in general
returned as a list of Web pages matching the search-query. The scores attained
by the Web pages are in most cases not displayed, and hence the question arises whether numerical stability also implies stability with respect to the rank in the list (called rank-stability). Lempel and Moran [388] investigated the three
main representatives of Web centrality approaches with respect to rank-stability.
To show that numeric stability does not necessarily imply rank-stability they
used the graph G = (V, E) depicted in Figure 5.7. Note that in the graph any
undirected edge [u, v] represents two directed edges (u, v) and (v, u). From G two
different graphs Ga = (V, E ∪ {(y, ha )}) and Gb = (V, E ∪ {(y, hb )}) are derived
(they are not displayed). It is clear that the PageRank vector c_PR^a corresponding to G_a satisfies

0 < c_PR^a(x_a) = c_PR^a(y) = c_PR^a(x_b),

and therefore c_PR^a(h_a) > c_PR^a(h_b).
Analogously, in G_b we have c_PR^b(h_b) > c_PR^b(h_a). Furthermore,

c_PR^a(a_i) > c_PR^a(b_i)   and   c_PR^b(a_i) < c_PR^b(b_i)   ∀ i.

Fig. 5.7. The graph G used for the explanation of the rank stability effect of PageRank. Note that for G_a a directed edge from y to h_a is added and in the case of G_b an edge from y to h_b.
6 Local Density
a 1/k-fraction of the other members of the group, then the tie distance within the group is at most k. Results comparable to that can be proven for connectivity as well. Here, however, the dependence on density is not as strong as in the case of distances (see Chapter 7).
In this chapter, we survey computational approaches and solutions for dis-
covering locally dense groups. A graph-theoretical group property is local if it
is definable over subgraphs induced by the groups only. Locality does not cor-
respond to the above-mentioned separation characteristic of cohesiveness, since
it neglects the network outside the group. In fact, most notions that have been
defined to cover cohesiveness have a maximality condition. That is, they require
for a group to be cohesive with respect to some property Π, in addition to
fulfilling Π, that it is not contained in any larger group of the network that sat-
isfies Π as well. Maximality is non-local. We present the notions on the basis of
their underlying graph-theoretical properties and without the additional max-
imality requirements. Instead, maximality appears in connection with several
computational problems derived from these notions. This is not a conceptual
loss. Actually, it emphasizes that locality reflects an important hidden aspect of
cohesive groups: being invariant under network changes outside the group. Inte-
rior robustness and stability is an inherent quality of groups. Non-local density
notions and the corresponding algorithmic problems and solutions are presented
in Chapter 8. A short list of frequently used non-local notions is also discussed
in Section 6.4.
The prototype of a cohesive group is the clique. Since its introduction into
sociology in 1949 [401], numerous efforts in combinatorial optimization and al-
gorithms have been dedicated to solving computational problems for cliques.
Therefore, the treatment of algorithms and hardness results for clique problems
deserves a large part of this chapter. We present some landmark results in detail
in Section 6.1. All other notions that we discuss are relaxations of the clique
concept. We make a distinction between structural and statistical relaxations. A
characteristic of structural densities is that all members of a group have to satisfy
the same requirement for group membership. These notions (plexes, cores) ad-
mit strong statements about the structure within the group. Structurally dense
groups are discussed in Section 6.2. In contrast, statistical densities average over
members of a group. That is, the property that defines group membership needs
to be satisfied only on average (or in expectation) over all group members. In general, statistically dense groups reveal only few insights into the group structure. How-
ever, statistical densities can be applied under information uncertainty. They are
discussed in Section 6.3.
All algorithms are presented for the case of unweighted, undirected simple
graphs exclusively. Mostly, they can be readily translated for directed or weighted
graphs. In some exceptional cases where new ideas are needed, we mention these
explicitly.
Computational hardness. The question arises whether we can improve the exhaustive search algorithm significantly with respect to the amount of time.
Unfortunately, this will probably not be the case. Computationally, finding a
maximum clique is an inherently hard problem. We consider the corresponding
decision problem:
Problem: Clique
Input: Graph G, Parameter k ∈ ℕ
Question: Does there exist a clique of size at least k within G?
Let ω(G) denote the size of a maximum clique of a graph G. Note that if we have an algorithm that decides Clique in time T(n), then we are able to compute ω(G) in time O(T(n) · log n) using binary search. Conversely, any T(n) algorithm for computing ω(G) gives a T(n) algorithm for deciding Clique.
Thus, if we had a polynomial algorithm for Clique, we would have a polyno-
mial algorithm for maximum-clique sizes, and vice versa. However, Clique was
among the first problems for which N P-completeness was established [345].
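The binary search argument amounts to a few lines of code. In the sketch below, clique_oracle is a hypothetical decision procedure for Clique and G is assumed to be a networkx graph:

def max_clique_size(G, clique_oracle):
    # omega(G) via O(log n) calls to a decision oracle for Clique
    lo, hi = 1, G.number_of_nodes()
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if clique_oracle(G, mid):   # is there a clique of size >= mid?
            lo = mid
        else:
            hi = mid - 1
    return lo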
Theorem 6.1.4. Clique is N P-complete.
Proof. Note that testing whether some guessed set is a clique is possible in
polynomial time. This shows the containment in N P. In order to prove the
N P-hardness, we describe a polynomial-time transformation of Satisfiability
into Clique. Suppose we are given a Boolean formula H in conjunctive normal form consisting of k clauses C1, . . . , Ck. For H we construct a k-partite graph
GH where vertices are the literals of H labeled by their clause, and where edges
connect literals that are not negations of each other. More precisely, define GH =
(VH , EH ) to be the following graph:
" + #
VH =def (L, i) + i ∈ {1, . . . , k} and L is a literal in clause Ci
" + #
EH =def {(L, i), (L$ , j)} + i '= j and L '= ¬L$
Clearly, the graph GH can be computed in time polynomial in the size of the
formula H. We show that H is satisfiable if and only if the graph GH contains
a clique of size k.
Suppose that H is satisfiable. Then there exists a truth assignment to vari-
ables x1 , . . . , xn such that in each clause at least one literal is true. Let L1 , . . . , Lk
be such literals. Then, of course, it must hold that Li ≠ ¬Lj for i ≠ j. We thus obtain that the set {(L1, 1), . . . , (Lk, k)} is a clique of size k in GH.
Suppose now that U ⊆ VH is a clique of size k in graph GH . Since GH is
k-partite, U contains exactly one vertex from each part of VH . By definition of
set VH, we have that for all vertices (L, i) and (L′, j) of U, L ≠ ¬L′ whenever i ≠ j. Hence, we can assign truth values to variables in such a way that all
literals contained in U are satisfied. This gives a satisfying truth assignment to
formula H. □
So, unless P = NP, there is no algorithm with a running time polynomial in n for solving Clique with arbitrary clique size or for computing the maximum clique. Moreover, even if we have a guarantee that there is a clique of size k in graph G, we are not able to find it in polynomial time.
Corollary 6.1.5. Unless P = N P, there is no algorithm running in polynomial
time to find a clique of size k in a graph which is guaranteed to have a clique of
size k.
Proof. Suppose we have an algorithm A that runs in polynomial time on each
input (G, k) and outputs a clique of size k, if it exists, and behaves in an arbitrary
way in the other cases. A can be easily modified into an algorithm A′ that decides Clique in polynomial time. On input (G, k), run algorithm A; if A produces no output, then reject the instance. If A outputs a set U, then test whether U is a clique. If so, accept, otherwise reject. This procedure certainly runs in polynomial time. □
Note that the hardness of finding the hidden clique does not depend on the size
of the clique. Even very large hidden cliques (of size (1− ε)n for ε > 0) cannot be
found unless P = N P (see, e.g., [308, 37]). The situation becomes slightly better
if we consider randomly chosen graphs, i.e., graphs where each edge appears
³ It might be easier to think of independent sets rather than cliques. An independent
set in a graph G is a set U of vertices such that G[U ] has no edges. A clique in graph
G corresponds to an independent set in graph G, where in G exactly those vertices
are adjacent that are not adjacent in G. Independent sets are a little bit easier to
handle, since we do not have to reason about edges that are not in the graph. In
fact, many algorithms in the literature are described for independent sets.
The intuitive algorithm in the theorem captures the essence of a series of fast exponential algorithms for the maximum clique problem. It started with an O*(1.286ⁿ) algorithm [543] that follows essentially the ideas of the algorithm above. This algorithm has been subsequently improved to O*(1.2599ⁿ) [545], by using a smart and tedious case analysis of the neighborhood around a low-degree vertex. The running time of the algorithm has been further improved to O*(1.2346ⁿ) [330], and, using combinatorial arguments on connected regular graphs, to O*(1.2108ⁿ) [495]. Unfortunately, the latter algorithm needs exponential space. This drawback can be avoided: there is a polynomial-space algorithm with a slightly weaker O*(1.2227ⁿ) time complexity [54]. A non-trivial lower bound on the base of the exponential is still unknown (even under some complexity-theoretic assumptions).
The approximation ratio stated in the theorem is the best known. The following celebrated result [287] indicates that there is in fact not much room for improving over that ratio.
Theorem 6.1.8. Unless NP = ZPP,⁴ there exists no polynomial-time algorithm whose output for a graph G with n vertices is a clique of size within a factor of n^(1−ε) of ω(G), for any ε > 0.
The complexity-theoretic assumption used in the theorem is almost as strong as P ≠ NP. The inapproximability result has been strengthened to subconstant values of ε, first to ε = O(1/√(log log n)) [177] and further to ε = O(1/(log n)^γ) [353] for some γ > 0. These results are based on much stronger complexity assumptions – essentially, that no NP-complete problem can be solved by randomized algorithms with quasi-polynomial running time, i.e., in time 2^((log n)^O(1)). Note that the ratio n/(log n)² corresponds, in terms of ε, to ε = Ω(log log n / log n). The gap between the lower bound and the upper bound for approximability is thus pretty small.

⁴ ZPP is the class of all problems that can be solved with randomized algorithms running in expected polynomial time while making no errors. Such algorithms are also known as (polynomial-time) Las Vegas algorithms.
Also many heuristic techniques for finding maximum cliques have been pro-
posed. They often show reasonable behavior, but of course, they cannot improve
over the theoretical inapproximability ratio. An extensive discussion of heuristics
for finding maximum cliques can be found in [70].
In the random graph model, we know that, with high probability, ω(G)
is either (2 + o(1)) log n rounded up or rounded down, for a random graph of
size n (see, e.g., [24]). There are several polynomial-time algorithms producing
cliques of size (1+o(1)) log n, i.e., they achieve an approximation ratio of roughly
two [263]. However, it is conjectured that there is no polynomial-time algorithm
outputting a clique of size at least (1 + ε) log n for any ε > 0 [328, 347].
Theorem 6.1.10. For every k ≥ 3 there exists an algorithm for finding a clique of size k in a graph with n vertices that runs in time O(n^β(k)), where β(k) = α(⌈k/3⌉, ⌈(k−1)/3⌉, ⌊k/3⌋) and multiplying an n^r × n^s matrix with an n^s × n^t matrix can be done in time O(n^α(r,s,t)).
Proof. Let k₁ denote ⌈k/3⌉, let k₂ denote ⌈(k−1)/3⌉, and let k₃ denote the value ⌊k/3⌋. Note that k = k₁ + k₂ + k₃. Let G be any graph with n vertices and m edges. We first construct a tripartite auxiliary graph G̃ as follows: the vertex set Ṽ is divided into three sets Ṽ₁, Ṽ₂, and Ṽ₃ where Ṽᵢ consists of all cliques of size kᵢ in G. Define two vertices U ∈ Ṽᵢ and U′ ∈ Ṽⱼ to be adjacent in G̃ if and only if i ≠ j and U ∪ U′ is a clique of size kᵢ + kⱼ in G. The algorithm now tests the auxiliary graph G̃ for triangles. If there is such a triangle {U₁, U₂, U₃}, then the construction of G̃ implies that U₁ ∪ U₂ ∪ U₃ is a clique of size k in G. Testing the graph G̃ for triangles can be done by matrix multiplication as described in Theorem 6.1.9. However, we now have to multiply an n^{k₁} × n^{k₂} adjacency matrix, representing edges between Ṽ₁ and Ṽ₂, with an n^{k₂} × n^{k₃} adjacency matrix, representing edges between Ṽ₂ and Ṽ₃. This step can be done in time O(n^β(k)). Computing the three matrices needs in the worst case O(n^max{k₁+k₂, k₁+k₃, k₂+k₃}) = O(n^⌈2k/3⌉) time, which is asymptotically dominated by the time for the fast rectangular matrix multiplication [318]. □
We give an impression of the algorithmic gain of using matrix multiplication
(see, e.g., [260]).
Clique size    Exhaustive search    Matrix multiplication
3              O(n³)                O(n^2.376)
4              O(n⁴)                O(n^3.376)
5              O(n⁵)                O(n^4.220)
6              O(n⁶)                O(n^4.751)
7              O(n⁷)                O(n^5.751)
8              O(n⁸)                O(n^6.595)
The theorem has a nice application to the membership counting problem for
cliques of fixed size. The following result is due to [260].
Theorem 6.1.11. For every k ≥ 3, there exists an algorithm that counts the number of cliques of size k to which each vertex of a graph on n vertices belongs, in time O(n^β(k)) where β(k) is the same function as in Theorem 6.1.10.
Proof. The theorem is based on the observation that for the case k = 3 (see
Theorem 6.1.9), it is not only easy to check whether two vertices vi and vj
belong to some triangle in G, but also to compute in how many triangles they
lie: if the edge {vi , vj } exists in G, then the number is just the entry bij in
the square of the adjacency matrix A(G). In general, we apply this observation
to the auxiliary graph G̃. For any vertex v ∈ V , let Ck (v) denote the number
of different cliques of size k in G in which v is contained. Similarly, let C̃3 (U )
denote the number of triangles to which node U of G̃ belongs. Notice that U is a
clique in G of size smaller than k. Clearly, cliques of G of size k may have many
Clearly, using Theorem 6.1.10, the left-hand side of this equation can be computed in time O(n^β(k)) (first, compute the matrices and second, search the entries for all U containing v). We now easily calculate Ck(v) from Equation 6.1. □
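For the special case k = 3, the observation used in the proof reduces to plain matrix arithmetic. The following sketch (an illustration only, not the general algorithm of the theorem) counts for every vertex the triangles it lies in:

import numpy as np
import networkx as nx

def triangles_per_vertex(G):
    A = nx.to_numpy_array(G, dtype=int)
    A3 = A @ A @ A
    nodes = list(G.nodes())
    # (A^2)[i, j] counts the common neighbors of i and j; (A^3)[v, v]/2
    # counts the triangles containing v
    return {v: int(A3[i, i] // 2) for i, v in enumerate(nodes)}

# Sanity check: the result agrees with nx.triangles(G).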
A recent study of the corresponding decremental problem [260], i.e., the scenario where, starting from a given graph, vertices and edges can be removed, has shown that we can save roughly a factor of n^0.8 in time compared to computing the number of size-k cliques to which the vertices belong each time from scratch. For example, the problem of counting triangles in a decremental setting now takes O(n^1.575) time.
is polynomial total time. That is, the algorithm outputs all C possible configu-
rations in time bounded by a polynomial in C and the input size n. Exhaustive
search is not polynomial total time. In contrast, one of the classical algorithms
[473] first runs O(n²C) steps with no output and then outputs all C maximal
cliques all at once. However, an algorithm for the enumeration of all maximum
cliques that runs in polynomial total time does not exist, unless P = N P [382].
We next review enumerative algorithms for maximal cliques with polynomial
total time having some further desirable properties.
– LeftChild(U, i): If U ⊆ N(v_{i+1}) (the first case above), then the left child is U ∪ {v_{i+1}}. If U ⊈ N(v_{i+1}) (one part of the second case above), then the left child is U. Checking which case has to be applied needs O(n + m) time.
– RightChild(U, i): If U ⊆ N(v_{i+1}), then there is no right child defined. If U ⊈ N(v_{i+1}), then the right child of U is (U − N(v_{i+1})) ∪ {v_{i+1}} if it is a maximal clique and U = Parent((U − N(v_{i+1})) ∪ {v_{i+1}}, i + 1); otherwise the right child is not defined. Note that we only need O(n + m) processing time.
The longest path between any two leaves in the tree is 2n − 2 passing through
2n − 1 nodes. For each node we need O(n + m) time. Since any subtree of our
tree has a leaf at level n, this shows that the delay between outputs is O(n³).
Note that while processing a node, the algorithm only needs to store the set U, the level i, and a label indicating whether it is the left or the right child. Hence, the amount of space is O(n + m). □
Theorem 6.1.13. Deciding for any given graph G and any maximal clique U
of G, whether there is a maximal clique U′ lexicographically larger than U, is
N P-complete.
The theorem is proven by a polynomial transformation from Satisfiability
[334]. It has some immediate consequences, e.g., it rules out polynomial-delay
algorithms with respect to inverse lexicographic order.
Corollary 6.1.14. 1. Unless P = N P, there is no algorithm that generates
for any given graph G and any maximal clique U in G the lexicographically
next maximal clique in polynomial time.
2. Unless P = N P, there is no algorithm that generates for any given graph
all maximal cliques in inverse lexicographic order with polynomial delay.
It might thus seem surprising that there exist algorithms generating all maximal cliques in lexicographic order with polynomial delay. The idea of such an algorithm
is simply that while producing the current output, we invest additional time
in producing lexicographically larger maximal cliques. We store these cliques
in a priority queue Q. Thus, Q contains a potentially exponential number of
cliques and requires potentially exponential space. The following algorithm has
been proposed in [334] and uses in a clever way the tree structure employed in
Theorem 6.1.12.
For the time bound, the costly operations are the extraction of the lexicographically smallest maximal clique from Q (which needs O(n log C)), the n computations of maximal cliques containing a given set (which takes O(n + m) for each set), and attempting to insert a maximal clique into Q (at cost O(n log C) per clique). Since C ≤ 3^⌊n/3⌋, the total delay is O(n³) in the worst case. □
Counting complexity. We conclude this section with some remarks on the com-
plexity of counting the number of maximal cliques. An obvious way to count
maximal cliques is to enumerate them with some of the above-mentioned al-
gorithms and increment a counter each time a clique is output. This, however,
would take exponential time. The question is whether it is possible to compute
the number more directly and in time polynomial in the graph size. To study
such issues the class #P has been introduced [559], which can be considered
as the class of all functions counting the number of solutions of instances of
N P-problems. It can be shown that counting the number of maximal cliques
is #P-complete (with respect to an appropriate reducibility notion) [560]. An
immediate consequence is that if there is a polynomial-time algorithm for com-
puting the number of maximal cliques, then Clique is in P, and thus, P = N P.
Note that in the case of planar, bipartite or bounded-degree graphs there are
polynomial-time algorithms for counting maximal cliques [557].
6.2.1 Plexes
We generalize the clique concept by allowing members in a group to miss some
ties with other group members, but only up to a certain number N ≥ 1. This
leads to the notion of an N -plex [514, 511].
Definition 6.2.1. Let G = (V, E) be any undirected graph and let N ∈
{1, . . . , n − 1} be a natural number. A subset U ⊆ V is said to be an N -plex
if and only if δ(G[U ]) ≥ |U | − N .
Clearly, a clique is simply a 1-plex, and an N -plex is also an (N + 1)-plex. We
say that a subset U ⊆ V is a maximal N -plex if and only if U is an N -plex
and it is not strictly contained in any larger N -plex of G. A subset U ⊆ V is a
maximum N -plex if and only if U has a maximum number of vertices among all
N -plexes of G.
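Checking the defining condition of an N-plex is straightforward; the following sketch, written against networkx, also tests maximality naively by trying all single-vertex extensions (illustration only):

import networkx as nx

def is_n_plex(G, U, N):
    # every member of G[U] must be adjacent to at least |U| - N members
    H = G.subgraph(U)
    return all(H.degree(v) >= len(U) - N for v in U)

def is_maximal_n_plex(G, U, N):
    if not is_n_plex(G, U, N):
        return False
    return all(not is_n_plex(G, set(U) | {v}, N) for v in set(G) - set(U))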
It is easily seen that any subgraph of an N -plex is also an N -plex, that is,
N -plexes are closed under exclusion. Moreover, we have the following relation
between the size of an N -plex and its diameter [514, 189, 431].
Problem: N -Plex
Input: Graph G, Parameter k ∈ ℕ
Question: Does there exist an N -plex of size at least k within G?
V′ =def V × {0, 1, . . . , N − 1}
E′ =def { {(u, 0), (v, 0)} | {u, v} ∈ E }
       ∪ { {(u, i), (v, j)} | u, v ∈ V and i, j > 0 }
       ∪ { {(u, 0), (v, i)} | u, v ∈ V with u ≠ v and i > 0 }
6.2.2 Cores
A concept dual to plexes is that of a core. Here, we do not ask how many edges
are missing in the subgraph for being complete, but we simply fix a threshold
in terms of a minimal degree for each member of the subgroup. One of the
most important things to learn about cores is that there exist polynomial-time
algorithms for finding maximum cores. Cores have been introduced in [513].
Definition 6.2.4. Let G = (V, E) be any undirected graph. A subset U ⊆ V is
said to be an N -core if and only if δ(G[U ]) ≥ N .
The parameter N of an N -core is the order of the N -core. A subset U ⊆ V is a
maximal N -core if and only if U is an N -core and it is not strictly contained in
any larger N -core of G. A subset U ⊆ V is a maximum N -core if and only if U
has maximum number of vertices among all N -cores of G. Maximum cores are
also known as main cores.
Any (N + 1)-core is an N -core and any N -core is an (n − N )-plex. Moreover,
if U and U′ are N-cores, then U ∪ U′ is an N-core as well. That means maximal
N -cores are unique. However, N -cores are not closed under exclusion and are in
general not nested. As an example, a cycle is certainly a 2-core but any proper
subgraph has at least one vertex with degree less than two. N -cores need not
be connected. The following proposition relates maximal connected N -cores to
each other.
Proposition 6.2.5. Let G = (V, E) be any undirected graph and let N > 0
be any natural number. Let U and U′ be maximal connected N-cores in G with U ≠ U′. Then there exists no edge between U and U′ in G.
Proof. Assume there is an edge {u, v} with u ∈ U and v ∈ U′. It follows that U ∪ U′ is an N-core containing both U and U′. Furthermore, it is connected, since U and U′ are connected. This contradicts the maximality of U and U′. □
Some immediate consequences of the proposition are the following: the unique
maximum N -core of a graph is the union of all its maximal connected N -cores,
the maximum 2-core of a connected graph is connected (notice that the internal
vertices of a path have degree two), and a graph is a forest if and only if it
possesses no 2-cores. The next result is an important algorithmic property of
N -cores, that was exhibited in [46].
Proposition 6.2.6. Let G = (V, E) be any undirected graph and let N > 0 be
any natural number. If we recursively remove all vertices with degree strictly less
than N , and all edges incident with them, then the remaining set U of vertices
is the maximum N -core.
Proof. Clearly, U is an N -core. We have to show that it is maximum. Assume
to the contrary, the N -core U obtained is not maximum. Then there exists a
non-empty set T ⊆ V such that U ∪ T is the maximum N -core, but vertices of
T have been removed. Let t be the first vertex of T that has been removed. At
that time, the degree of t must have been strictly less than N . However, as t has
at least N neighbors in U ∪ T and all other vertices have still been in the graph
when t was removed, we have a contradiction. □
The procedure described in the proposition suggests an algorithm for computing
N -cores. We extend the procedure for obtaining auxiliary values which provide
us with complete information on the core decomposition of a network. Define
the core number of a vertex v ∈ V to be the highest order N of a maximum N-core that vertex v belongs to.
A method, according to [47], for computing all core numbers is shown in Algo-
rithm 10. The algorithm is correct due to the following reasons: any graph G is
certainly a δ(G)-core, and each neighbor of vertex v having lower degree than v
decrements the potential core number of v. A straightforward implementation of
the algorithm yields a worst-case time bound of O(mn log n) – the most costly
operations being sorting vertices with respect to their degree. A more clever
implementation guarantees linear time [47].
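The idea can be sketched as a simple peeling procedure (this is not the linear-time bucket implementation of [47], only the underlying principle):

import networkx as nx

def core_numbers(G):
    H = G.copy()
    core = {}
    current = 0
    while H.number_of_nodes() > 0:
        v = min(H.nodes(), key=H.degree)        # vertex of minimum degree
        current = max(current, H.degree(v))     # its degree bounds the core order
        core[v] = current
        H.remove_node(v)
    return core

# networkx provides the same values as nx.core_number(G).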
J where entry J[i] is the minimum index j such that for all r ≥ j, vertex
V [r] has degree at least i. We can now replace the ‘resort’-line in Algorithm
10 by the following instructions:
an N-plex approaches one. A little bit more exactly, for all N > 0 and for all 0 ≤ η ≤ 1, an N-plex of size at least (N − η)/(1 − η) is an η-dense subgraph. But evidently, not every (1 − (N − 1)/(n − 1))-dense subgraph (when allowing non-constant densities) is an N-plex.
Walks. The density averages over edges in subgraphs. An edge is a walk of length one. A generalization of density can involve walks of larger length. To make this more precise, we introduce some notations. Let G = (V, E) be any undirected graph with n vertices. Let ℓ ∈ ℕ be any walk length. For a vertex v ∈ V, we define its degree of order ℓ in G as the number of walks of length ℓ that start in v. Let d_G^ℓ(v) denote v's degree of order ℓ in G. We set d_G^0(v) = 1 for all v ∈ V. Clearly, d_G^1(v) is the degree of v in G. The number of walks of length ℓ in a graph G is denoted by W_ℓ(G). We have the following relation between the degrees of higher order and the number of walks in a graph.
Proposition 6.3.3. Let G = (V, E) be any undirected graph. For all ℓ ∈ ℕ and for all r ∈ {0, . . . , ℓ}, W_ℓ(G) = Σ_{v∈V} d_G^r(v) · d_G^{ℓ−r}(v).

The density of order ℓ of a graph G with n vertices is then defined as

ρ_ℓ(G) =def W_ℓ(G) / (n(n − 1)^ℓ).
Note that ρ₁(G) = ρ(G), as in W₁(G) each edge counts twice. We easily conclude the following proposition.
Proposition 6.3.4. It holds that ρ_ℓ(G) ≤ ρ_{ℓ−1}(G) for all graphs G and all natural numbers ℓ ≥ 2.
Proof. Let G = (V, E) be any undirected graph with n vertices. By Proposition 6.3.3, W_ℓ(G) = Σ_{v∈V} d_G^1(v) · d_G^{ℓ−1}(v) ≤ (n − 1) · Σ_{v∈V} d_G^{ℓ−1}(v) = (n − 1) · W_{ℓ−1}(G). Now, the inequality follows easily. □
For a graph G = (V, E) we can define a subset U ⊆ V to be an η-dense subgraph of order ℓ if and only if ρ_ℓ(G[U]) ≥ η. From the proposition above, any η-dense subgraph of order ℓ is an η-dense subgraph of order ℓ − 1 as well. The η-dense subgraphs of order ℓ ≥ 2 inherit the property of being nested from the η-dense subgraphs. If we fix a density and consider dense subgraphs of increasing order, then we can observe that they become more and more similar to cliques. A formal argument goes as follows. Define the density of infinite order of a graph G as

ρ_∞(G) =def lim_{ℓ→∞} ρ_ℓ(G).
The density of infinite order induces a discrete density function due to the fol-
lowing zero-one law [307].
Theorem 6.3.5. Let G = (V, E) be any undirected graph.
1. It holds that ρ_∞(G) is either zero or one.
2. V is a clique if and only if ρ_∞(G) = 1.
The theorem says that the only subgroup that is η-dense for some η > 0 and for all orders is a clique. In a sense, the order of a density function allows a scaling of how important compactness of groups is in relation to density.
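Since W_ℓ(G) is the sum of all entries of A^ℓ, the densities of higher order are directly computable from powers of the adjacency matrix; a small sketch:

import numpy as np
import networkx as nx

def walk_density(G, ell):
    # rho_ell(G) = W_ell(G) / (n (n-1)^ell)
    A = nx.to_numpy_array(G)
    n = A.shape[0]
    ones = np.ones(n)
    W_ell = ones @ np.linalg.matrix_power(A, ell) @ ones
    return W_ell / (n * (n - 1) ** ell)

# walk_density(G, 1) is the ordinary density; for a clique it stays 1 for every ell.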
Average degree. One can easily translate the density of a graph with n vertices into its average degree (as we did in the proof of Proposition 6.3.2): d̄(G) = ρ(G) · (n − 1). Technically, density and average degree are interchangeable (with appropriate modifications). We thus can define dense subgraphs alternatively in terms of average degrees. Let N > 0 be any rational number. An N-dense subgraph of a graph G = (V, E) is any subset U ⊆ V such that d̄(G[U]) ≥ N. Clearly, an η-dense subgraph (with respect to percentage densities) of size k is an η(k − 1)-dense subgraph (with respect to average degrees), and an N-dense subgraph (with respect to average degrees) of size k is an N/(k − 1)-dense subgraph (with respect to percentage densities). Any N-core is an N-dense subgraph. N-dense subgraphs are neither closed under exclusion nor nested. This is easily seen by considering N-regular graphs (for N ∈ ℕ): removing some vertices decreases the average degree strictly below N. However, average degrees allow a more fine-grained analysis of network structure. Since a number of edges quadratic in the number of vertices is required for a graph to be denser than some given percentage threshold, small graphs are favored. Average degrees avoid this pitfall.
Extremal graphs. Based upon Turán’s theorem (see Theorem 6.1.2), a whole new
area in graph theory has emerged which has been called extremal graph theory
(see, e.g., [66]). It studies questions like the following: how many edges may a
graph have such that some of a given set of subgraphs are not contained in the
graph? Clearly, if we have more edges in the graph, then all these subgraphs
must be contained in it. This has been applied to dense subgraphs as well. The
following classical theorem due to Dirac [156] is a direct strengthening of Turán’s
theorem.
Theorem 6.3.6 (Dirac, 1963). Let G = (V, E) be any undirected graph. If
² This problem can be solved in polynomial time using flow techniques [477, 248, 239]; our proof is from [248].
Theorem 6.3.7. There is an algorithm for solving Densest Subgraph on graphs with n vertices and m edges in time O(mn (log n) log(n²/m)).
V′ =def V ∪ {s, t}
E′ =def {(v, w) | {v, w} ∈ E} ∪ {(s, v) | v ∈ V} ∪ {(v, t) | v ∈ V}

= m|V| + |S₊|(γ − d̄(G[S₊]))      (6.2)
It is clear from this equation that γ is our guess on the maximum average degree
of G. We need to know how we can detect whether γ is too big or too small. We
prove the following claim.
Claim. Let S and T be sets that realize the minimum capacity cut, with respect
to γ. Then we have the following:
1. If S₊ ≠ ∅, then γ ≤ γ*(G).
2. If S₊ = ∅, then γ ≥ γ*(G).
It is easily seen that the smallest possible distance between two different points in the set is 1/(n(n − 1)). A binary search procedure for finding a maximum average
degree subgraph is given as Algorithm 11.
It can be shown that the maximum average degree for G is the maximum of
the optimal solutions for LP_γ over all γ. Each linear program can be solved in polynomial time. Since there are O(n²) many ratios for |S|/|T| and thus for γ,
we can now compute the maximum average degree for G (and a corresponding
subgraph as well) in polynomial time by binary search.
not able to deduce easily information on the existence of subgraphs with certain
average degrees and certain sizes, from a solution of Densest Subgraph. We
discuss this problem independently. For an undirected graph G = (V, E) and a parameter k ∈ ℕ, let γ*(G, k) denote the maximum value of the average degrees of all induced subgraphs of G having k vertices, i.e.,

γ*(G, k) =def max{ d̄(G[U]) | U ⊆ V and |U| = k }.
The following optimization problem has been introduced in [201]:
Theorem 6.3.8. Let G be any graph with n vertices and let k ∈ ℕ be an even natural number with k ≤ n. Let A(G, k) denote the average degree of the subgraph of G induced by the vertex set that is the output of Algorithm 12. We have

γ*(G, k) ≤ (2n/k) · A(G, k).
Proof. For subsets U, U′ ⊆ V, let E(U, U′) denote the set of edges consisting of one vertex of U and one vertex of U′. Let m_U denote the cardinality of the edge set E(G[U]). Let d_H denote the average degree of the k/2 vertices of G with highest degree with respect to G. We certainly have d_H ≥ γ*(G, k). We obtain

|E(H, V − H)| = d_H · |H| − 2m_H ≥ (d_H · k)/2 − 2m_H ≥ 0.
Trivially, V is an LS set. Also the singleton sets {v} are LS sets in G for each
v ∈ V . LS sets have some nice structural properties. For instance, they do
not non-trivially overlap [399, 381], i.e., if U₁ and U₂ are LS sets such that U₁ ∩ U₂ ≠ ∅, then either U₁ ⊆ U₂ or U₂ ⊆ U₁. Moreover, LS sets are rather
dense: the minimum degree of a non-trivial LS set is at least half of the number
of outgoing edges [512]. Note that the structural strength of LS sets depends
heavily on the universal requirement that all proper subsets share more ties
with the network outside than the set U does (see [512] for a discussion of this
point). Some relaxations of LS sets can be found in [86].
Lambda sets. A notion closely related to LS sets is that of a lambda set. Let
G = (V, E) be any undirected graph. For vertices u, v ∈ V, let λ(u, v) denote the maximum number of edge-disjoint paths between u and v in G, i.e., λ(u, v) measures
the edge connectivity of u and v in G. A subset U ⊆ V is said to be a lambda
set if and only if
min_{u,v∈U} λ(u, v)  >  max_{u∈U, v∈V−U} λ(u, v).
In a lambda set, the members have more edge-disjoint paths connecting them to
each other than to non-members. Each LS set is a lambda set [512, 86]. Lambda
sets do not directly measure the density of a subset. However, they have some
importance as they allow a polynomial-time algorithm for computing them [86].
The algorithm essentially consists of two parts, namely computing the edge-
connectivity matrix for the vertex set V (which can be done by flow algorithms
in time O(n⁴) [258]) and, based on this matrix, grouping vertices together in a
level-wise manner, i.e., vertices u and v belong to the same lambda set (at level
N ) if and only if λ(u, v) ≥ N . The algorithm can also be easily extended to
compute LS sets.
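Since λ(u, v) is the local edge connectivity, the defining inequality of a lambda set can be tested with any max-flow routine. A brute-force sketch using networkx, suitable only for small graphs and a proper non-trivial subset U (this is not the level-wise algorithm described above):

import networkx as nx
from itertools import combinations

def is_lambda_set(G, U):
    U = set(U)
    inside = min(nx.edge_connectivity(G, u, v) for u, v in combinations(U, 2))
    outside = max(nx.edge_connectivity(G, u, v) for u in U for v in set(G) - U)
    return inside > outside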
Normal sets. In [285], a normality predicate for network subgroups has been
defined in a statistical way over random walks on graphs. One of the most
important reasons for considering random walks is that typically the resulting
algorithms are simple, fast, and general. A random walk is a stochastic process
by which we go over a graph by selecting the next vertex to visit at random
among all neighbors of the current vertex. We can use random walks to capture
a notion of cohesiveness quality of a subgroup. The intuition is that a group is
the more cohesive the higher the probability is that a random walk originating at
some group member does not leave the group. Let G = (V, E) be any undirected
graph. For d ∈ ℕ and α ∈ ℝ₊, a subset U ⊆ V is said to be (d, α)-normal if and
only if for all vertices u, v ∈ U such that dG (u, v) ≤ d, the probability that a
random walk starting at u will reach v before visiting any vertex w ∈ V −U , is at
least α. Though this notion is rather intuitive, it is not known how to compute normal sets or how to decompose a network into normal sets. Instead, some heuristic algorithms, running in linear time (at least on graphs with bounded degree), have been developed that produce decompositions in the spirit of normality [285].
7 Connectivity
This chapter is mainly concerned with the strength of connections between ver-
tices with respect to the number of vertex- or edge-disjoint paths. As we shall
see, this is equivalent to the question of how many nodes or edges must be re-
moved from a graph to destroy all paths between two (arbitrary or specified)
vertices. For basic definitions of connectivity see Section 2.2.4.
We present algorithms which
– check k-vertex (k-edge) connectivity,
– compute the vertex (edge) connectivity, and
– compute the maximal k-connected components
of a given graph.
After a few definitions we present some important theorems which summarize
fundamental properties of connectivity and which provide a basis for understand-
ing the algorithms in the subsequent sections.
We denote the vertex-connectivity of a graph G by κ(G) and the edge-
connectivity by λ(G); compare Section 2.2.4. Furthermore, we define the local
(vertex-)connectivity κG (s, t) for two distinct vertices s and t as the minimum
number of vertices which must be removed to destroy all paths from s to t. In
the case that an edge from s to t exists we set κG (s, t) = n − 1 since κG cannot
exceed n − 2 in the other case1 . Accordingly, we define λG (s, t) to be the least
number of edges to be removed such that no path from s to t remains. Note,
that for undirected graphs κG (s, t) = κG (t, s) and λG (s, t) = λG (t, s), whereas
for directed graphs these functions are, in general, not symmetric.
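Both local connectivities can be computed with standard max-flow routines; the following sketch uses networkx and adopts the convention κ_G(s, t) = n − 1 for adjacent s and t stated above:

import networkx as nx

def kappa(G, s, t):
    # local vertex connectivity
    if G.has_edge(s, t):
        return G.number_of_nodes() - 1
    return nx.node_connectivity(G, s, t)

def lam(G, s, t):
    # local edge connectivity: minimum number of edges whose removal
    # destroys all paths from s to t
    return nx.edge_connectivity(G, s, t)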
Some of the terms we use in this chapter occur under different names in
the literature. In what follows, we mainly use (alternatives in parentheses): cut-
vertex (articulation point, separation vertex), cut-edge (isthmus, bridge), com-
ponent (connected component), biconnected component (non-separable compo-
nent, block). A cut-vertex is a vertex which increases the number of connected
components when it is removed from the graph; the term cut-edge is defined sim-
ilarly. A biconnected component is a maximal 2-connected subgraph; see Chap-
ter 2. A block of a graph G is a maximal connected subgraph of G containing
no cut-vertex, that is, the set of all blocks of a graph consists of its isolated
¹ If s and t are connected by an edge, it is not possible to disconnect s from t by removing only vertices.
(a) A graph. We consider the connectivity between the vertices 1 and 11.
vertices, its cut-edges, and its maximal biconnected subgraphs. Hence, with our
definition, a block is (slightly) different from a biconnected component.
The block-graph B(G) of a graph G consists of one vertex for each block
of G. Two vertices of the block-graph are adjacent if and only if the correspond-
ing blocks share a common vertex (that is, a cut-vertex). The cutpoint-graph
C(G) of G consists of one vertex for each cut-vertex of G, where vertices are
adjacent if and only if the corresponding cut-vertices reside in the same block
of G. For the block- and the cutpoint-graph of G the equalities B(B(G)) = C(G)
and B(C(G)) = C(B(G)) hold [275]. The block-cutpoint-graph of a graph G is
the bipartite graph which consists of the set of cut-vertices of G and a set of ver-
tices which represent the blocks of G. A cut-vertex is adjacent to a block-vertex
whenever the cut-vertex belongs to the corresponding block. The block-cutpoint-
graph of a connected graph is a tree [283]. The maximal k-vertex-connected (k-
edge-connected) subgraphs are called k-vertex-components (k-edge-components).
A k-edge-component which does not contain any (k + 1)-components is called a
cluster [410, 470, 411, 412].
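These notions are directly available in graph libraries. The following small example (the graph is an arbitrary choice) lists cut-vertices and the maximal biconnected subgraphs with networkx and assembles the block-cutpoint graph, which for this connected graph is a tree:

import networkx as nx

G = nx.Graph([(1, 2), (2, 3), (3, 1), (3, 4), (4, 5)])

cut_vertices = set(nx.articulation_points(G))   # here: {3, 4}
blocks = list(nx.biconnected_components(G))     # {1,2,3}, {3,4}, {4,5}

bc = nx.Graph()                                  # block-cutpoint graph
for i, B in enumerate(blocks):
    for v in B & cut_vertices:
        bc.add_edge(('block', i), ('cut', v))
print(nx.is_tree(bc))                            # True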
Proof. The incident edges of a vertex having minimum degree δ(G) form an edge
separator. Thus we conclude λ(G) ≤ δ(G).
The vertex-connectivity of any graph on n vertices can be bounded from
above by the connectivity of the complete graph κ(Kn ) = n − 1.
Let G = (V, E) be a graph with at least 2 vertices and consider a minimal
edge separator that separates a vertex set S from all other vertices S̄ = V \ S. In
the case that all edges between S and S̄ are present in G we get λ(G) = |S|·|S̄| ≥
|V| − 1. Otherwise there exist vertices x ∈ S, y ∈ S̄ such that {x, y} ∉ E, and
the set of all neighbors of x in S̄ as well as all vertices from S \ {x} that have
neighbors in S̄ form a vertex separator; the size of that separator is at most the
number of edges from S to S̄, and it separates (at least) x and y. □
The following is the graph-theoretic equivalent of a theorem that was pub-
lished by Karl Menger in his work on the general curve theory [419].
Theorem 7.1.2 (Menger, 1927). If P and Q are subsets of vertices of an
undirected graph, then the maximum number of vertex-disjoint paths connecting
vertices from P and Q is equal to the minimum cardinality of any set of vertices
intersecting every path from a vertex in P to a vertex in Q.
This theorem is also known as the n-chain or n-arc theorem, and it yields as a
consequence one of the most fundamental statements of graph theory:
Corollary 7.1.3 (Menger’s Theorem). Let s, t be two vertices of an undi-
rected graph G = (V, E). If s and t are not adjacent, the maximum number of
vertex-disjoint s-t-paths is equal to the minimum cardinality of an s-t-vertex-
separator.
The analog for the case of edge-cuts is stated in the next theorem.
Theorem 7.1.4. The maximum number of edge-disjoint s-t-paths is equal to
the minimum cardinality of an s-t-edge-separator.
This theorem is most often called the edge version of Menger’s Theorem although
it was first explicitly stated three decades after Menger's paper in publications
due to Ford and Fulkerson [218], Dantzig and Fulkerson [141], as well as Elias,
Feinstein, and Shannon [175].
A closely related result is the Max-Flow Min-Cut Theorem by Ford and
Fulkerson (see Theorem 2.2.1, [218]). The edge variant of Menger’s Theorem can
be seen as a restricted version where all edge capacities have a unit value.
The following global version of Menger’s Theorem was published by Hassler
Whitney [581] and is sometimes referred to as ‘Whitney’s Theorem’.
Theorem 7.1.5 (Whitney, 1932). Let G = (V, E) be a non-trivial graph and
k a positive integer. G is k-(vertex-)connected if and only if all pairs of distinct
vertices can be connected by k vertex-disjoint paths.
The difficulty in deriving this theorem is that Menger’s Theorem requires the
nodes to be non-adjacent. Since this precondition is not present in the edge ver-
sion of Menger’s Theorem, the following follows immediately from Theorem 7.1.4:
case, assume that G̃ has at least 2 vertices. For each subgraph Gi that contains
a certain edge e ∈ C of the min-cut, the cut also contains a cut for Gi (otherwise
the two vertices would be connected in Gi \ C and G̃ \ C which would contradict
the assumption that C is a minimum cut). We conclude that there is a Gi such
that λ(G̃) = |C| ≥ λ(G_i), which directly implies λ(G̃) ≥ min_{1≤i≤t} λ(G_i) and thereby proves the theorem. □
Although we can see from Theorem 7.1.1 that k-vertex/edge-connectivity
implies a minimum degree of at least k, the converse is not true. But in the case
of a large minimum degree, there must be a highly connected subgraph.
Theorem 7.1.12 (Mader, 1972). Every graph of average degree at least 4k
has a k-connected subgraph.
For a proof see [404].
Several observations regarding the connectivity of directed graphs have been
made. One of them considers directed spanning trees rooted at a node r, so
called r-branchings:
Theorem 7.1.13 (Edmonds’ Branching Theorem [171]). In a directed
multigraph G = (V, E) containing a vertex r, the maximum number of pairwise
edge-disjoint r-branchings is equal to κG (r), where κG (r) denotes the minimum,
taken over all vertex sets S ⊂ V that contain r, of the number of edges leaving S.
The following theorem due to Lovász [396] states an interrelation of the
maximum number of directed edge-disjoint paths and the in- and out-degree of
a vertex.
Theorem 7.1.14 (Lovász, 1973). Let v ∈ V be a vertex of a graph G =
(V, E). If λG (v, w) ≤ λG (w, v) for all vertices w ∈ V , then d+ (v) ≤ d− (v).
As an immediate consequence, this theorem provided a proof for Kotzig’s con-
jecture:
Theorem 7.1.15 (Kotzig’s Theorem). For a directed graph G, λG (v, w)
equals λG (w, v) for all v, w ∈ V if and only if the graph is pseudo-symmetric,
i.e. the in-degree equals the out-degree for all vertices: d+ (v) = d− (v).
Let R_{f_max} be the residual network of N and f_max, where f_max is a maximum s-t-flow in N. As a consequence of Theorem 2.2.1 on page 11, the maximum flow saturates all minimum s-t-cuts, and therefore a set S ⊆ V \ {t} is a minimum s-t-cut if and only if s ∈ S and no edges leave S in R_{f_max}.
where the components that result from removing the minimum weight edge of
the s-t-path represent a minimum cut between s and t. This tree is called the
Gomory-Hu cut tree.
Gusfield [265] demonstrated how to do the same computation without node
contractions and without the overhead for avoiding the so-called crossing cuts.
See also [272, 344, 253].
If one is only interested in any edge cutset of minimum weight in an undi-
rected weighted graph (without a specified vertex pair to be disconnected), this
can be done using the algorithm of Stoer and Wagner, see Section 7.7.1.
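Both constructions are implemented in networkx; the sketch below (with arbitrarily chosen edge weights) computes a global minimum cut with the Stoer-Wagner algorithm and reads an s-t minimum cut value off the Gomory-Hu tree:

import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([(1, 2, 3), (2, 3, 1), (3, 4, 3), (4, 1, 1), (2, 4, 2)])

cut_value, (S, T) = nx.stoer_wagner(G)          # global minimum edge cut
print(cut_value, S, T)

T_gh = nx.gomory_hu_tree(G, capacity='weight')  # encodes all s-t minimum cuts
path = nx.shortest_path(T_gh, 1, 3)             # the unique tree path from 1 to 3
print(min(T_gh[u][v]['weight'] for u, v in zip(path, path[1:])))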
Definition 7.4.3. A pair ⟨S₁, S₂⟩ is called a crossing cut if S₁, S₂ are two minimum cuts and neither S₁ ∩ S₂, S₁ \ S₂, S₂ \ S₁ nor S̄₁ ∩ S̄₂ is empty.
Lemma 7.4.4. Let ⟨S₁, S₂⟩ be a crossing cut and let A = S₁ ∩ S₂, B = S₁ \ S₂, C = S₂ \ S₁ and D = S̄₁ ∩ S̄₂. Then
a. A, B, C and D are minimum cuts,
b. w(A, D) = w(B, C) = 0,
c. w(A, B) = w(B, D) = w(D, C) = w(C, A) = λ/2.
Proof. Since we know that S₁ and S₂ are minimum cuts, we can conclude

w(S₁, S̄₁) = w(A, C) + w(A, D) + w(B, C) + w(B, D) = λ
w(S₂, S̄₂) = w(A, B) + w(A, D) + w(B, C) + w(C, D) = λ

and since there is no cut with weight smaller than λ, we know that

w(A, Ā) = w(A, B) + w(A, C) + w(A, D) ≥ λ
w(B, B̄) = w(A, B) + w(B, C) + w(B, D) ≥ λ
w(C, C̄) = w(A, C) + w(B, C) + w(C, D) ≥ λ
w(D, D̄) = w(A, D) + w(B, D) + w(C, D) ≥ λ

Summing up twice the middle and the right side of the first two equalities we obtain

2·w(A, B) + 2·w(A, C) + 4·w(A, D) + 4·w(B, C) + 2·w(B, D) + 2·w(C, D) = 4·λ

and summing up both sides of the four inequalities we have

2·w(A, B) + 2·w(A, C) + 2·w(A, D) + 2·w(B, C) + 2·w(B, D) + 2·w(C, D) ≥ 4·λ

Therefore w(A, D) = w(B, C) = 0. In other words, there are no diagonal edges in Figure 7.3.

For a better intuition, let us assume that the length of the four inner line segments in the figure separating A, B, C and D is proportional to the sum of the weights of all edges crossing the corresponding line segment. Thus the total length l of both horizontal or both vertical lines, respectively, is proportional to the weight λ.

Assume the four line segments had different lengths, in other words, the two lines separating S₁ from S̄₁ and S₂ from S̄₂, respectively, did not cross each other exactly in the midpoint of the square. Then the total length of the separating line segments of one vertex set Δ ∈ {A, B, C, D} would be shorter than l, and thus w(Δ, Δ̄) < λ, a contradiction.

As a consequence, w(A, B) = w(B, D) = w(D, C) = w(C, A) = λ/2 and w(A, Ā) = w(B, B̄) = w(C, C̄) = w(D, D̄) = λ. □
Fig. 7.3. Crossing cut ⟨S₁, S₂⟩ with S₁ := A ∪ B and S₂ := A ∪ C
A crossing cut in G = (V, E) partitions the vertex set V into exactly four
parts. A more general definition is the following, where the vertex set can be
divided in three or more parts.
Definition 7.4.5. A circular partition is a partition of V into k ≥ 3 disjoint
sets V1 , V2 , . . . , Vk such that
(
λ/2 : |i − j| = 1 mod k
a. w (Vi , Vj ) =
0 : otherwise
b. If S is a minimum cut, then
1. S or S̄ is a proper subset of some Vi or
2. the circular partition is a refinement of the partition defined by the min-
imum cut S. In other words, the minimum cut is the union of some of
the sets of the circular partition.
Let V1 , V2 , . . . , Vk) be the* disjoint sets of a circular partition, then for all
1 ≤ a ≤ b < k, S := ∪bi=a Vi is a minimum cut. Of course, the complement of
S containing Vk is a minimum cut, too. Let us define these minimum cuts as
circular partition cuts. Especially each Vi , 1 ≤ i ≤ k, is a minimum cut (property
a. of the last definition).
Consider a minimum cut S such that neither S nor its complement is con-
tained in a set of the circular partition. Since S is connected (Observation 7.2.2),
S or its complement are equal to ∪bi=a Vi for some 1 ≤ a < b < k.
Moreover, for all sets Vi of a circular partition, there exists no minimum cut
S such that )Vi , S* is a crossing cut (property b. of the last definition).
Definition 7.4.6. Two different circular partitions P := {U1 , . . . , Uk } and Q :=
{V1 , . . . , Vl } are compatible if there is a unique r and s, 1 ≤ r, s ≤ k, such that
for all i '= r : Ui ⊆ Vs and for all j '= s : Vj ⊆ Ur .
Lemma 7.4.7 ([216]). All different circular partitions are pairwise compatible.
Proof. Consider two circular partitions P and Q in a graph G = (V, E). All sets
of the partitions are minimum cuts. Assume a set S ∈ P is equal to the union of
more than one and less than all sets of Q. Exactly two sets A, B ∈ Q contained
in S are connected by at least an edge to the vertices V \ S. Obtain T from S
by replacing A ⊂ S by an element of Q connected to B and not contained in S.
Then )S, T * is a crossing cut, contradiction.
152 F. Kammer and H. Täubig
ar 1
+ bs 1
+
ak bl
ar bs
a1 b1
ar-1 bs-1
Fig. 7.4. Example graph G = ({a1 . . . ar , b1 . . . bs } , E) shows two compatible partitions
P, Q defined as follows:
S1 ∩ S2 ∩ S3 = ∅.
Proof. Assume that the lemma is not true. As shown in Figure 7.5, let
) *
a = w S3 \ (S1 ∪ S2 ) , S1 ∩ S2 ∩ S3
b = w ((S2 ∩ S3 ) \ S1 , S2 \ (S1 ∪ S3 ))
c = w (S1 ∩ S2 ∩ S3 , (S1 ∩ S2 ) \ S3 )
d = w ((S1 ∩ S3 ) \ S2 , S1 \ (S2 ∪ S3 ))
On one hand S1 ∩ S2 is a minimum cut (Lemma 7.4.4.a.) so that c ≥ λ2
(Lemma 7.4.1). On the other hand c + b = c + d = λ2 (Lemma 7.4.4.c.). Therefore
b = d = 0 and (S1 ∩ S3 ) \ S2 = (S2 ∩ S3 ) \ S1 = ∅.
If we apply Lemma 7.4.4.b. to S1 and S2 , then S1 ∩ S2 ∩ S3 and S3 \ (S1 ∪ S2 )
are not connected. Contradiction. 4
3
7 Connectivity 153
S1 c S2
d b
S3
a
λ λ
w(A ∪ B, C ∪ D) = (1), w(B, C) = (2),
2 2
) * ) * λ
w (A, B) + w B, S1 ∪ S2 = w B, A ∪ S1 ∪ S2 = (3) and
2
) * ) * ) * λ
w A, S1 ∪ S2 + w B, S1 ∪ S2 = w A ∪ B, S1 ∪ S2 = (4).
2
All) equalities* follow from Lemma 7.4.4.c.. Moreover w (A, T \ S2 ) = 0,
w D, S1 ∪ S2 = 0 (7.4.4.b.) and B, C are minimum cuts. Since (1), (2) and
: S1 : S2 :T
A D A B C D
BC
(a) (b)
=
F{S1 ,...,Sk } = F{Sα11,...,S
,...,αk
k}
\ {∅} .
α1 ,...,αk ∈{0,1}k
Lemma 7.4.10. Let )S1 , S2 * be a crossing cut and A ∈ F{S1 ,S2 } . Choose B ∈
F{S1 ,S2 } such that w (A, B) = λ2 . For all crossing cuts )B, T *:
λ ) * λ
w (A, B ∩ T ) = or w A, B ∩ T̄ =
2 2
Proof. W.l.o.g. A = S1 ∩ S2 (if not, interchange S1 and S¯1 or S2 and S¯2 ),
B = S1 \ S2 (if not, interchange S1 and S2 ). Let C = S2 \ S1 and D = S¯1 ∩ S¯2 .
Then (∗) : w(B, C) = 0 (Lemma 7.4.4.b.). Consider the following four cases:
T ⊂ (A ∪ B) (Figure 7.7(a)) : w (A, B ∩ T ) = λ
2 (Lemma 7.4.9)
w (A \ T, A ∩ T ) + w (A \ T, B ∩ T ) + w (B \ T, A ∩ T ) + w (B \ T, B ∩ T )
= w ((A \ T ) ∪ (B \ T ) , (A ∩ T ) ∪ (B ∩ T ))
λ
= w (S1 \ T, S1 ∩ T ) = .
2
Together with w(B \ T, B ∩ T ) ≥ λ2 (Lemma 7.4.1), we can conclude
– w(A \ T, A ∩ T ) = 0 and therefore A ∩ T = ∅ or A \ T = ∅,
– w(A \ T, B ∩ T ) = 0 (1) and
– w(A ∩ T, B \ T ) = 0 (2).
(1)
Note that w(A, B) = λ2 . If A ∩ T = ∅, w(A, B ∩ T ) = 0 and w(A, B \ T ) = λ2 .
(2)
Otherwise A \ T = ∅, w(A, B \ T ) = 0 and w(A, B ∩ T ) = λ2 .
7 Connectivity 155
: S1 : S2 :T
A B A B A B
C D C D C D
(a) (b) (c)
Different crossing cuts interact in a very specific way, as shown in the next
theorem.
Theorem 7.4.12 ([63, 153]). In a graph G = (V, E), for each partition P of
V into 4 disjoint sets due to a crossing cut in G, there exists a circular partition
in G that is a refinement of P .
Proof. Given crossing cut )S1 , S2 *, choose the set
" #
Λ := S1 ∩ S2 , S1 \ S2 , S2 \ S1 , S1 ∪ S2
as a starting point.
As long as there is a crossing cut )S, T * for some T '∈ Λ and S ∈ Λ, add T
to Λ. This process terminates since we can only add each set T ∈ P(V ) into Λ
once. All sets in Λ are minimum cuts. Definition 7.4.5.b. is satisfied for Λ.
The disjoint minimum cuts F (Λ) give us a partitioning of the graph. All sets
in F (Λ) can be built by crossing cuts of minimum cuts in Λ. Therefore, each set in
F (Λ) has exactly two neighbors, i.e., for each set X ∈ F(Λ), there exist exactly
two different sets Y, Z ∈ F(Λ) such that w(X, Y ) = w(X, Z) = λ2 (Corollary
7.4.11). For all other sets Z ∈ F(Λ), w(X, Z) = 0. Since G is a connected graph,
all sets in F (Λ) can be ordered, so that Definition 7.4.5.a. holds. Observe that
Definition 7.4.5.b. is still true, since splitting the sets in Λ into smaller sets still
allows a reconstruction of the sets in Λ. 4
3
&) *'
Lemma 7.4.13 ([63, 153]). A graph G = (V, E) has O |V2 | many mini-
&) *'
mum cuts and this bound is tight. This means that a graph can have Ω |V2 |
many minimum cuts.
Proof. The upper bound is a consequence of the last theorem. Given a graph
G = (V, E), the following recursive function Z describes the number of minimum
cuts in G:
%k ) * A circular partition
(Z (|Vi |)) + k2
i=1 V , . . . , Vk exists in G
1
It is easy to see that this function achieves the maximum in &)the*'case where
a circular partition W1 , . . . , W|V | exist. Therefore Z (|V |) = O |V2 | .
)) **
The lower bound is achieved by a simple cycle of n vertices. There are Ω n2
pairs of edges. Each pair of edges defines another two minimum cuts S and S̄.
These two sets are separated by simply removing the pair of edges. 4
3
7 Connectivity 157
restriction that only the vertices of Vi are mapped to this tree. The root of
T(Vi ,E) corresponds exactly to the set Vi . Thus we can merge node Ni of the
circle and the root of T(Vi ,E) for all 1 ≤ i ≤ k. This circle connected with all
the trees is the cactus CG for G. The number of nodes is equal to the sum of
all nodes in the trees T(Vi ,E) with 1 ≤ i ≤ k. Therefore, the number of nodes of
the cactus is bounded by 2 |V | − 1 and again, there is a 1 − 1 correspondence
between minimum cuts in G and the separation of CG into two parts.
Now consider a graph G = (V, E) with the circular partitions P1 , . . . , Pz .
Take all circular partitions as a set of sets. Construct a cactus CG representing
the circular partition cuts of G in the following way.
The vertices of each set F ∈ FP1 ∪...∪Pz are mapped to one node and two
nodes are connected, if for their corresponding sets F1 and F2 , w (F1 , F2 ) > 0.
Then each circular partition creates one circle in CG . Since all circular partitions
are pairwise compatible, the circles are connected by edges that are not part of
any circle. The cactus CG is now a tree-like graph (Figure 7.8).
After representing the remaining minimum cuts that are not part of a circular
partition, we get the cactus TC for G. As before, the number of nodes of the
cactus is bounded by 2 |V | − 1.
P1 P2 P5
P6
P4
P3
Fig. 7.8. A cactus representing the circular partition cuts of 6 circular partitions
Better algorithms for the more restricted version of unit capacity networks
exist.
160 F. Kammer and H. Täubig
a a’ a"
s t s’ s" t’ t"
b b’ b"
Fig. 7.9. Construction of the directed graph Ḡ that is derived from the undirected
input graph G to compute the local vertex-connectivity κG (s, t)
187] presented a method for computing κG (s, t) that is based on the following
construction: For the given graph G = (V, E) having n vertices and m edges we
derive a directed graph Ḡ = (V̄ , Ē) with |V̄ | = 2n and |Ē| = 2m+ n by replacing
each vertex v ∈ V with two vertices v $ , v $$ ∈ V̄ connected by an (internal) edge
ev = (v $ , v $$ ) ∈ Ē. Every edge e = (u, v) ∈ E is replaced by two (external) edges
e$ = (u$$ , v $ ), e$$ = (v $$ , u$ ) ∈ Ē, see Figure 7.9.
κ(s, t) is now computed as the maximum flow in Ḡ from source s$$ to the
target t$ with unit capacities for all edges2 . For a proof of correctness see [187].
For each pair v $ , v $$ ∈ V̄ representing a vertex v ∈ V the internal edge (v $ , v $$ )
is the only edge that emanates from v $ and the only edge entering v $$ , thus
the network Ḡ is of type 2. According to Lemma 7.6.2 the computation√of the
maximum flow resp. the local vertex-connectivity has time complexity O( nm).
A trivial algorithm for computing κ(G) could determine the minimum for the
local connectivity of all pairs of vertices. Since κG (s, t) = n − 1 for all pairs (s, t)
that are directly connected by an edge, this algorithm would make n(n−1) 2 −m
calls to the flow-based subroutine. We will see that we can do much better.
If we consider a minimum vertex separator S ⊂ V that separates a ‘left’
vertex subset L ⊂ V from a ‘right’ subset R ⊂ V , we could compute κ(G) by
fixing one vertex s in either subset L or R and computing the local connectivities
κG (s, t) for all vertices t ∈ V \ {s} one of which must lie on the other side of the
vertex cut. The problem is: how to select a vertex s such that s does not belong
to every minimum vertex separator? Since κ(G) ≤ δ(G) (see Theorem 7.1.1), we
could try δ(G) + 1 vertices for s, one of which must not be part of all √ minimum
vertex cuts. This would result in an algorithm of complexity O((δ+1)·n· nm)) =
O(δn3/2 m)
Even and Tarjan [188] proposed Algorithm 13 that stops computing the local
connectivities if the size of the current minimum cut falls below the number of
examined vertices.
The resulting algorithm examines not more than κ + 1 vertices in the loop
for variable i. Each vertex has at least δ(G) neighbors, thus at most O((n −
δ − 1)(κ + 1)) calls to the maximum flow subroutine are carried out. Since
κ(G) ≤ 2m/n (see Theorem 7.1.8), the minimum capacity is found√not later
than in call 2m/n + 1. As a result, the overall time complexity is O( nm2 ).
2
Firstly, Even used c(ev ) = 1, c(e" ) = c(e"" ) = ∞ which leads to the same results.
162 F. Kammer and H. Täubig
return κmin
Esfahanian and Hakimi [183] further improved the algorithm by the following
observation:
Lemma 7.6.3. If a vertex v belongs to all minimum vertex-separators then there
are for each minimum vertex-cut S two vertices l ∈ LS and r ∈ RS that are
adjacent to v.
Proof. Assume v takes part in all minimum vertex-cuts of G. Consider the par-
tition of the vertex set V induced by a minimum vertex-cut S with a component
L (the ‘left’ side) of the remaining graph and the respective ‘right’ side R. Each
side must contain at least one of v’s neighbors, because otherwise v would not
be necessary to break the graph into parts. Actually each side having more than
one vertex must contain 2 neighbors since otherwise replacing v by the only
neighbor would be a minimum cut without v, in contrast to the assumption. 3 4
These considerations suggest Algorithm 14. The first loop makes n − δ − 1
calls to the MaxFlow procedure, the second requires κ(2δ − κ − 3)/2 calls. The
overall complexity is thus n − δ − 1 + κ(2δ − κ − 3)/2 calls of the maximum flow
algorithm.
Proof. Let the elements of L be denoted by {l1 , l2 , . . . , lk } and denote the induced
edges by E[L] = E(G[L]).
k
$
δ(G) · k ≤ dG (li )
i=1
≤ 2 · |E[L]| + |S|
k(k − 1)
≤2· + |S|
2
< k(k − 1) + δ(G)
From δ(G) · (k − 1) < k(k − 1) we conclude |L| = k > 1 and |L| = k > δ(G) (as
well as |R| > δ(G)). 4
3
Corollary 7.6.5. If λ(G) < δ(G) then each component of G − S contains a
vertex that is not incident to any of the edges in S.
Lemma 7.6.6. Assume again that λ(G) < δ(G). If T is a spanning tree of G
then all components of G − S contain at least one vertex that is not a leaf of T
(i.e. the non-leaf vertices of T form a λ-covering).
Proof. Assume the converse, that is all vertices in L are leaves of T . Thus no
edge of T has both ends in L, i.e. |L| = |S|. Lemma 7.6.4 immediately implies
that λ(G) = |S| = |L| > δ(G), a contradiction to the assumption. 4
3
Lemma 7.6.6 suggests an algorithm that first computes a spanning tree of the
given graph, then selects an arbitrary inner vertex v of the tree and computes
the local connectivity λ(v, w) to each other non-leaf vertex w. The minimum of
these values together with δ(G) yields exactly the edge connectivity λ(G). This
algorithm would profit from a larger number of leaves in T but, unfortunately,
finding a spanning tree with maximum number of leaves is N P-hard.Esfahanian
chosen to be the smaller of both sets, leaves and non-leaves, the algorithm re-
quires at most n/2 calls to the computation of a local connectivity, which yields
an overall complexity of O(λmn).
This could be improved by Matula [413], who made use of the following
lemma.
Lemma 7.6.7. In case λ(G) < δ(G), each dominating set of G is also a λ-
covering of G.
Similar to the case of the spanning tree, the edge-connectivity can now be com-
puted by choosing a dominating set D of G, selecting an arbitrary vertex u ∈ D,
and calculating the local edge-connectivities between u and all other vertices in
D. The minimum of all values together with the minimum degree δ(G) gives the
result. While finding a dominating set of minimum cardinality is N P-hard in
general, the connectivity algorithm can be shown to run in time O(nm) if the
dominating set is chosen according to Algorithm 17.
The proof shows, by induction on the active vertices, that for each active
vertex v the adjacency to the vertices added before (Av ) does not exceed the
weight of the cut of Av ∪ {v} induced by C (denoted by Cv ). Thus it is to prove
that
w(Av , v) ≤ w(Cv )
For the base case, the inequality is satisfied since both values are equal for
the first active vertex. Assuming now that the proposition is true for all active
vertices up to active vertex v, the value for the next active vertex u can be
written as
w(Au , u) = w(Av , u) + w(Au \ Av , u)
≤ w(Av , v) + w(Au \ Av , u) (w(Av , u) ≤ w(Av , v))
≤ w(Cv ) + w(Au \ Av , u) (by induction assumption)
≤ w(Cu )
The last line follows because all edges between Au \ Av and u contribute their
weight to w(Cu ) but not to w(Cv ).
Since t is separated by C from its immediate predecessor s, it is always an
active vertex; thus the conclusion w(At , t) ≤ w(Ct ) completes the proof. 4
3
Theorem 7.7.2. A cut-of-the-phase having minimum weight among all cuts-of-
the-phase is a minimum capacity cut of the original graph.
Proof. For the case where the graph consists of only 2 vertices, the proof is
trivial. Now assume |V | > 2. The following two cases can be distinguished:
1. Either the graph has a minimum capacity cut that is also a minimum s-t-cut
(where s and t are the vertices added last in the first phase), then, according
to Lemma 7.7.1, we conclude that this cut is a minimum capacity cut of the
original graph.
2. Otherwise the graph has a minimum cut where s and t are on the same side.
Therefore the minimum capacity cut is not affected by merging the vertices
s and t.
Thus, by induction on the number of vertices, the minimum capacity cut of the
graph is the cut-of-the-phase having minimum weight. 4
3
1 1 1 1 1 1 1 1
A B C a e f A B C a e t
2 1 2 3 2 1 2 3 2 1 2 2 1 2
3 3
3 1 3 1 3 3
D E F b c s D E b c
4 2 3 4 2 3 4 2 1 F 4 2 1
s
G 1 H d 1 t G 1 H d 1
1 1
A B a s A 1 a 1
1 1
2 1 2 2 1 2 2 1 B 2 1
C
3 1 3 1 3 3 C 3 3
D E F b c t D E b c t
F
4 2 H 4 2 4 2 4 2
1 H
G 1
d G 1 s 1
1 1
A B a
4 BCDE
C A 2 a 2 A
1 1
FGH
2 E F 2 t s BCE
3
5
3
5
2 2 s
G FGH 4
s t
D 4 H b 4 D 7
t 7
Fig. 7.10. Example for the Stoer/Wagner algorithm. Upper case letters are vertex
names, lower case letters show the order of addition to the set S. The minimum cut
{ABDEG} | {CF H} has capacity 3 and is found in Part 7.10(f) (third phase)
graphs. For weighted graphs they proposed a Monte Carlo algorithm that has
error probability 1/2 and expected running time O(nm log(n2 /m)).
A B G C A B
D C H B F D C
H E H E
G F A E G F
Fig. 7.11. Computation of biconnected components in undirected graphs.
Left: the undirected input graph. Middle: dfs tree with forward (straight) and back-
ward (dashed) edges. Right: the blocks and articulation nodes of the graph.
The edges e = (v, w) that are inspected during the DFS traversal are divided
into the following categories:
1. All edges that lead to unlabeled vertices are called tree edges (they belong
to the trees of the DFS forest).
2. The edges that point to a vertex w that was already labeled in a former step
fall into the following classes:
a) If num[w] > num[v] we call e a forward edge.
b) Otherwise, if w is an ancestor of v in the same DFS tree we call e a
backward edge.
c) Otherwise e is called a cross edge (because it points from one subtree to
another).
1 7 1 7
2 6 8 10 2 6 8 10
3 4 5 9 11 3 4 5 9 11
Fig. 7.12. DFS forest for computing strongly connected components in directed
graphs: tree, forward, backward, and cross edges
Proof. → Assume conversely that the condition holds but u is the root of v’s
strong component with u '= v. There must exist a directed path from v
to u. The first edge of this path that points to a vertex w that is not a
descendant of v in the DFS tree is a back or a cross edge. This implies
lowlink[v] ≤ num[w] < num[v], since the highest numbered common ancestor
of v and w is also in this strong component.
← If v is the root of some strong component in the actual spanning forest,
we may conclude that lowlink[v] = num[v]. Assuming the opposite (i.e.
lowlink[v] < num[v]), some proper ancestor of v would belong to the same
strong component. Thus v would not be the root of the SCC.
This concludes the proof. 4
3
If we put all discovered vertices on a stack during the DFS traversal (similar
to the stack of edges in the computation of the biconnected components) the
lemma allows us to ‘cut out’ the strongly connected components of the graph.
It is apparent that the above algorithms share their similarity due to the
fact that they are based on the detection of cycles in the graph. If arbitrary
instead of simple cycles (for biconnected components) are considered, this ap-
proach yields a similar third algorithm that computes the bridge- (or 2-edge-)
connected components (published by Tarjan [544]).
7.8.3 Triconnectivity
First results on graph triconnectivity were provided by Mac Lane [403] and
Tutte [555, 556]. In the sixties, Hopcroft and Tarjan published a linear time
algorithm for dividing a graph into its triconnected components that was based
on depth-first search [309, 310, 312]. Miller and Ramachandran [422] pro-
vided another algorithm based on a method for finding open ear decompo-
sitions together with an efficient parallel implementation. It turned out that
the early Hopcroft/Tarjan algorithm was incorrect, which was then modified by
Gutwenger and Mutzel [267]. They modified the faulty parts to yield a correct
linear time implementation of SPQR-trees. We now briefly review their algo-
rithm.
Definition 7.8.3. Let G = (V, E) be a biconnected (multi-) graph. Two vertices
a, b ∈ V are called a separation pair of G if the induced subgraph on the vertices
V \ {a, b} is not connected.
The pair (a, b) partitions the edges of G into equivalence classes E1 , . . . , Ek
(separation classes), s.t. two edges belong to the same class exactly if both lie
on some path p that contains neither a nor b as an inner vertex, i.e. if it contains
a or b it is an end vertex of p. The pair (a, b) is a separation pair if there are
at least two separation classes, except for the following special cases: there are
exactly two separation classes, and one of them consists of a single edge, or if
there are exactly three separation classes that all consist of a single edge. The
graph G is triconnected if it contains no separation pair.
7 Connectivity 173
(P) Parallel Case: If the split pair {s, t} has more than two split components
G1..k , the root of T is a P-node with a skeleton consisting of k parallel s-t-
edges e1..k with e1 = e.
(S) Series Case: If the split pair {s, t} has exactly two split components, one of
them is e; the other is denoted by G$ . If G$ has cut-vertices c1..k−1 (k ≥ 2)
that partition G into blocks G1..k (ordered from s to t), the root of T is an
S-node, whose skeleton is the cycle consisting of the edges e0..k , where e0 = e
and ei = (ci−1 , ci ) with i = 1..k, c0 = s and ck = t.
(R) Rigid Case: In all other cases let {s1 , t1 }, .., {sk , tk } be the maximal split
pairs of G with respect to {s, t}. Further let Gi for i = 1, .., k denote the
union of all split components of {si , ti } except the one containing e. The
root of T is an R-node, where the skeleton is created from G by replacing
each subgraph Gi with the edge ei = (si , ti ).
For the non-trivial cases, the children µ1..k of the node are the roots of the SPQR-
trees of Gi ∪ ei with respect to ei . The vertices incident with each edge ei are
the poles of the node µi , the virtual edge of node µi is the edge ei of the node’s
skeleton. The SPQR-tree T is completed by adding a Q-node as the parent of the
node, and thus the new root (that represents the reference edge e).
Each edge in G corresponds with a Q-node of T , and each edge ei in the skeleton
of a node corresponds with its child µi . T can be rooted at an arbitrary Q-node,
which results in an SPQR-tree with respect to its corresponding edge.
Theorem 7.8.8. Let G be a biconnected multigraph with SPQR-tree T .
1. The skeleton graphs of T are the triconnected components of G. P-nodes cor-
respond to bonds, S-nodes to polygons, and R-nodes to triconnected simple
graphs.
2. There is an edge between two nodes µ, ν ∈ T if and only if the two corre-
sponding triconnected components share a common virtual edge.
3. The size of T , including all skeleton graphs, is linear in the size of G.
For a sketch of the proof, see [267].
We consider now the computation of SPQR-trees for a biconnected multi-
graph G (without self-loops) and a reference edge er . We assume a labeling of
the vertices by unique indices from 1 to |V |. As a preprocessing step, all edges
are reordered (using bucket sort), first according to the incident vertex with the
lower index, and then according to the incident vertex with higher index, such
that multiple edges between the same pair of vertices are arranged successively.
In a second step, all such bundles of multiple edges are replaced by a new vir-
tual edge. In this way a set of multiple bonds C1 , .., Ck is created together with
a simple graph G$ .
In the second step, the split components Ck+1 , .., Cm of G$ are computed
using a dfs-based algorithm. In this context, we need the following definition:
Definition 7.8.9. A palm tree P is a directed multigraph that consists of a set
of tree arcs v → w and a set of fronds v 6→ w, such that the tree arcs form
7 Connectivity 175
a directed spanning tree of P (that is the root has no incoming edges, all other
vertices have exactly one parent), and if v 6→ w is a frond, then there is a directed
path from w to v.
Suppose now, P is a palm tree for the underlying simple biconnected graph
G$ = (V, E $ ) (with vertices labeled 1, .., |V |). The computation of the separation
pairs relies on the definition of the following variables:
& '
∗
lowpt1(v) = min {v} ∪ {w|v →6→ w}
& & ''
∗
lowpt2(v) = min {v} ∪ {w|v →6→ w} \ {lowpt1(v)}
These are the two vertices with minimum label, that are reachable from v by
traversing an arbitrary number (including zero) of tree arcs followed by exactly
one frond of P (or v itself, if no such option exists).
Let Adj(v) denote the ordered adjacency list of vertex v, and let D(v) be the
set of descendants of v (that is the set of vertices that are reachable via zero or
more directed tree arcs). Hopcroft and Tarjan [310] showed a simple method for
computing an acceptable adjacency structure, that is, an order of the adjacency
lists, which meets the following conditions:
1. The root of P is the vertex labeled with 1.
2. If w1 , .., wn are the children of vertex v in P according to the ordering in
Adj(v), then wi = v + |D(wi+1 ∪ .. ∪ D(wn )| + 1,
3. The edges in Adj(v) are in ascending order according to lowpt1(w) for tree
edges v → w, and w for fronds v 6→ w, respectively.
Let w1 , .., wn be the children of v with lowpt1(wi )) = u ordered according
to Adj(v), and let i0 be the index such that lowpt2(wi ) < v for 1 ≤ i ≤ i0
and lowpt2(wj ) ≥ v for i0 < j ≤ n. Every frond v 6→ w ∈ E $ resides between
v → wi0 and v → wi0 +1 in Adj(v).
An adequate rearrangement of the adjacency structure can be done in linear
time if a bucket sort with 3|V | + 2 buckets is applied to the following sorting
function (confer [310, 267]), that maps the edges to numbers from 1 to 3|V | + 2:
3lowpt1(w)
if e = v → w and lowpt2(w) < v
φ(e) = 3w + 1 if e = v 6→ w
3lowpt1(w) + 2 if e = v → w and lowpt2(w) ≥ v
If we perform a depth-first search on G$ according to the ordering of the edges
in the adjacency list, then this partitions G$ into a set of paths, each consisting
of zero or more tree arcs followed by a frond, and each path ending at the vertex
with lowest possible label. We say that a vertex un is a first descendant of u0 if
there is a directed path u0 → · · · → un and each edge ui → ui+1 is the first in
Adj(ui ).
Lemma 7.8.10. Let P be a palm tree of a biconnected graph G = (V, E) that
satisfies the above conditions. Two vertices a, b ∈ V with a < b form a separation
pair {a, b} if and only if one of the following conditions is true:
176 F. Kammer and H. Täubig
Type-1 Case There are distinct vertices r, s ∈ V \ {a, b} such that b → r is a tree
edge, lowpt1(r) = a, lowpt2(r) ≥ b, and s is not a descendant of r.
∗
Type-2 Case There is a vertex r ∈ V \ b such that a → r → b, b is a first
descendant of r (i.e., a, r, b lie on a generated path), a '= 1, every frond
x 6→ y with r ≤ x < b satisfies a ≤ y, and every frond x 6→ y with a < y < b
∗
and b → w → x has lowpt1(w) ≥ a.
Multiple Edge Case (a, b) is a multiple edge of G and G contains at least four
edges.
For a proof, see [310].
We omit the rather technical details for finding the split components
Ck+1 , .., Cm . The main loop of the algorithm computes the triconnected compo-
nents from the split components C1 , .., Cm by merging two bonds or two polygons
that share a common virtual edge (as long as they exist). The resulting time com-
plexity is O(|V | + |E|). For a detailed description of the algorithm we refer the
interested reader to the original papers [309, 310, 312, 267].
Average connectivity. Only recently, Beineke, Oellermann, and Pippert [56] con-
sidered the concept of average connectivity. This measure is defined as the av-
erage, over all pairs of vertices a, b ∈ V , of the maximum number of vertex-
disjoint paths between a and b, that is, the average local vertex-connectivity.
While the conventional notion of connectivity is rather a description of a worst
case scenario, the average connectivity might be a better description of the global
properties of a graph, with applications in network vulnerability and reliability.
Sharp bounds for this measure in terms of the average degree were shown by
7 Connectivity 177
Dankelmann and Oellermann [138]. Later on, Henning and Oellermann consid-
ered the average connectivity of directed graphs and provided sharp bounds for
orientations of graphs [294].
Other measures. There are some further definitions that might be of interest.
Matula [410] defines a cohesiveness function for each element of a graph (ver-
tices and edges) to be the maximum edge-connectivity of any subgraph con-
taining that element. Akiyama et al. [13] define the connectivity contribution or
cohesiveness of a vertex v in a graph G as the difference κ(G) − κ(G − v).
Connectivity problems that aim at dividing the graph into more than two
components by removing vertices or edges are considered in conjunction with
the following terms: A shredder of an undirected graph is a set of vertices
whose removal results in at least three components, see for example [121]. The 2-
connectivity of a graph is the minimum number of vertices that must be deleted
to produce a graph with at least 2 components or with fewer than 2 vertices,
see [456, 455]. A similar definition exists for the deletion of edges, namely the
i-th order edge connectivity, confer [254, 255].
Acknowledgments. The authors thank the anonymous reviewer, the editors, and
Frank Schilder for critical assessment of this chapter and valuable suggestions.
We thank Professor Ortrud Oellermann for her support.
8 Clustering
Marco Gaertler
U. Brandes and T. Erlebach (Eds.): Network Analysis, LNCS 3418, pp. 178–215, 2005.
c Springer-Verlag Berlin Heidelberg 2005
(
8 Clustering 179
belong to these groups are examined closer. The recursive structure of topics
and subtopics suggests a repeated application of this technique. Using the clus-
tering information on the data set, one can design methods that explore and
navigate within the data with a minimum of human interaction. Therefore, it is
a fundamental aspect of automatic information processing.
Preliminaries
Let G = (V, E) be a directed graph. A clustering C = {C1 , . . . , Ck } of G is a
partition of the node set V into non-empty subsets Ci . The set E(Ci , Cj ) is the
set of all edges that have their origin in Ci and their destination in Cj ; E(Ci ) is
!k
a short-hand for E(Ci , Ci ). Then E(C) := i=1 E(Ci ) is the set of intra-cluster
edges and E(C) := E \ E(C) the set of inter-cluster edges. The number of intra-
cluster edges is denoted by m (C) and the number of inter-cluster edges by m (C).
In the following, we often identify a cluster Ci with the induced subgraph of G,
i.e., the graph G[Ci ] := (Ci , E(Ci )). A clustering is called trivial if either k = 1
(1-clustering) or k = n (singletons). A clustering with k = 2 is also called a cut
(see also Section 2.2.3).
The set of all possible clusterings is denoted by A (G). The set A (G) is par-
tially ordered with respect to inclusion. Given two clusterings C1 := {C1 , . . . , Ck }
and C2 := {C1$ , . . . , C'$ }, Equation (8.1) shows the definition of the partial order-
ing.
C1 ≤ C2 : ⇐⇒ ∀ 1 ≤ i ≤ k : ∃ j ∈ {1, . . . , 2} : Ci ⊆ Cj$ (8.1)
Clustering C1 is called a refinement of C2 , and C2 is called a coarsening of C1 .
A chain of clusterings, i.e., a subset of clusterings such that every pair is com-
parable, is also called a hierarchy. The hierarchy is called total if both trivial
clusterings are contained. A hierarchy that contains exactly one clustering of k
clusters for every k ∈ {1, . . . , n} is called complete. It is easy to see that such a
hierarchy has n clusterings and that no two of these clusterings have the same
number of clusters.
Besides viewing a clustering as a partition, it can also be seen as an equiv-
alence relation ∼C on V × V , where u ∼C v if u and v belong to the same
cluster in C. Note that the edge set E is also a relation over V × V , and it is
an equivalence relation if and only if the graph consists of the union of disjoint
cliques.
The power set of a set X is the set of all possible subsets, and is denoted
by P(X), see also Section 2.4. A cut function S : P(V ) → P(V ) maps a set of
nodes to a subset of itself, i.e.,
∀ V $ ⊆ V : S(V $ ) ⊆ V $ . (8.2)
Cut functions formalize the idea of cutting a node-induced subgraph into two
parts. For a given node subset V $ of V the cut function S defines a cut
by (S(V $ ), V $ \ S(V $ )). In order to exclude trivial functions, we require a cut
function to assign a non-empty proper subset whenever possible. Proper cut
functions in addition fulfill the condition (8.3).
180 M. Gaertler
Graph model of this chapter. In this chapter, graph usually means simple and
directed graphs with edge weights and without loops.
Content Organization
The following is organized into three parts. The first one introduces measure-
ments for the quality of clusterings. They will provide a formal method to define
‘good’ clusterings. This will be important due to the informal foundation of
clustering. More precisely, these structural indices rate partitions with respect
to different interpretations of natural groups. Also, they provide the means to
compare different clusterings with respect to their quality. In the second part,
generic concepts and algorithms that calculate clusterings are presented. The fo-
cus is directed on the fundamental ideas, and not on concrete parameter choices.
Finally, the last section discusses potential enhancements. These extensions are
limited to alternative models for clusterings, some practical aspects, and Klein-
berg’s proposal of an axiomatic system for clustering.
edge. We will also use the following short-cut for summing up the weight of an
edge subset: $
ω(E $ ) := ω(e) for E $ ⊆ E .
e∈E !
f (C) + g (C)
index (C) := (8.4)
max{f (C $ )
+ g (C $ ) : C $ ∈ A (G)}
are evaluated with random instances, or the data of the input network is not re-
liable, i.e., different data collections result into different networks. The clustering
methods could be applied to all networks, and the clustering with the best score
would be chosen. However, if the index uses characteristics of the input graph,
like the number of edges, maximum weight, etc., then this comparison can only
be done when the underlying graph is the same for all clusterings. Therefore,
indices that depend on the input graph are not appropriate for all applications,
such as benchmarks. Although this dependency will seldom occur, one has to
consider these facts when designing new indices.
8.1.1 Coverage
The coverage γ (C) measures the weight of intra-cluster edges, compared to the
weight of all edges. Thus f (C) = ω(E(C)) and g ≡ 0. The maximum value is
achieved for C = {V }. Equation (8.5) shows the complete formula.
%
ω(E(C)) e∈E(C) ω(e)
γ (C) := = % (8.5)
ω(E) e∈E ω(e)
Coverage measures only the accumulated density within the clusters. Therefore,
an individual cluster can be sparse or the number of inter-cluster edges can be
large. This is illustrated in Figure 8.1. Coverage is also the probability of ran-
cluster 1 cluster 2
cluster 1 cluster 2
Fig. 8.1. A situation where coverage splits an intuitive cluster. The thickness of an
edge corresponds to its weight. If normal edges have weight one and bold edges weight
100, then the intuitive clustering has γ = 159/209 ≈ 0.76 while the optimal value for
coverage is 413/418 ≈ 0.99
8.1.2 Conductance
In contrast to coverage, which measures only the accumulated edge weight within
clusters, one can consider further structural properties like connectivity. Intu-
itively, a cluster should be well connected, i.e., many edges need to be removed
to bisect it. Two clusters should also have a small degree of connectivity between
each other. In the ideal case, they are already disconnected. Cuts are a useful
184 M. Gaertler
method to measure connectivity (see also Chapter 7). The standard minimum
cut has certain disadvantages (Proposition 8.1.2), therefore an alternative cut
measure will be considered: conductance. It compares the weight of the cut with
the edge weight in either of the two induced subgraphs. Informally speaking, the
conductance is a measure for bottlenecks. A cut is a bottleneck, if it separates
two parts of roughly the same size with relatively few edges.
Definition 8.1.3. Let C $ = (C1$ , C2$ ) be a cut, i.e., (C2$ = V \ C1$ ) then the
conductance-weight a(C1$ ) of a cut side and the conductance ϕ(C $ ) are defined
in Equations (8.7) and (8.8).
$
a(C1$ ) := ω((u, v)) (8.7)
(u,v)∈E(C1! ,V )
1, if C1$ ∈ {∅, V }
0, ' {∅, V }, ω(E(C)) = 0
if C1$ ∈
ϕ(C $ ) := (8.8)
ω(E(C))
, otherwise
min(a(C1$ ), a(C2$ ))
The conductance of the graph G is defined by
Note that the case differentiation in Equation (8.8) is only necessary in order
to prevent divisions by zero. Before presenting further general information about
conductance, graphs with maximum conductance are characterized.
Lemma 8.1.4. Let G = (V, E, ω) be an undirected and positively weighted
graph. Then G has maximum conductance, i.e., ϕ(G) = 1 if and only if G
is connected and has at most three nodes, or is a star.
Proof. Before the equivalence is shown, two short observations are stated:
1. All disconnected graphs have conductance 0 because there is a non-trivial
cut that has zero weight and the second condition of the Formula (8.8) holds.
2. For a non-trivial cut C $ = (C1$ , V \ C1$ ) the conductance-weight a(C1$ ) can be
rewritten as
$
a(C1$ ) = ω(e) = ω(E(C1$ )) + ω(E(C))
e∈E(C1! ,V )
ω(E(C)) ω(E(C))
& ' = & ' .
min a(C1$ ), a(V \ C1$ ) ω(E(C)) + min ω(E(C1$ )), ω(E(V \ C1$ ))
(8.10)
8 Clustering 185
‘⇐=’: If G has one node, then the first condition of Formula (8.8) holds and
thus ϕ(G) = 1.
If G has two or three nodes or is a star, then every non-trivial cut C $ =
(C1$ , V \ C1$ ) isolates an independent set, i.e., E(C1$ ) = ∅. This is achieved
by setting C1$ to the smaller cut set if G has at most three nodes and to
the cut set that does not contain the center node if G is a star. There-
fore ω(E(C1$ )) = 0 and Equation (8.10) implies ϕ(C $ ) = 1. Because all
non-trivial cuts have conductance 1, the graph G has conductance 1 as
well.
‘=⇒’: If G has conductance one, then G is connected (observation 1) and for
every non-trivial cut C $ = (C1$ , V \C1$ ) at least one edge set E(C1$ ) or E(V \
C1$ ) has 0 weight (observation 2). Because ω has only positive weight, at
least one of these sets has to be empty.
It is obvious that connected graphs with at most three nodes fulfill these
requirements, therefore assume that G has at least four nodes. The graph
has a diameter of at most two because otherwise there is a path of length
three with four pairwise distinct nodes v1 , . . . , v4 , where ei := {vi , vi+1 } ∈
E for 1 ≤ i ≤ 3. Then the non-trivial cut C $ = ({v1 , v2 }, V \ {v1 , v2 }) can-
not have conductance 1, because first the inequality ω(E(C $ )) ≥ ω(e2 ) ≥ 0
implies the third condition of Formula (8.8) and second both cut sides are
non-empty (e1 ∈ E({v1 , v2 }) and e3 ∈ E(V \ {v1 , v2 })). By the same ar-
gument, G cannot contain a simple cycle of length four or greater. It
also cannot have a simple cycle of length three. Assume G has such a
cycle v1 , v2 , v3 . Then there is another node v4 that is not contained in
the cycle but in the neighborhood of at least one vi . Without loss of
generality i = 1. Thus, the non-trivial cut ({v1 , v4 }, V \ {v1 , v4 }) is a
counterexample. Thus G cannot contain any cycle and is therefore a tree.
The only trees with at least four nodes, and a diameter of at most two,
are stars.
4
3
It is N P-hard to calculate the conductance of a graph [39]. Fortunately,
√ it can
be approximated with a guarantee of O(log n) [565] and O( log n) [36]. For
some special graph classes, these algorithms have constant approximation fac-
tors. Several of the involved ideas are found in the theory of Markov chains and
random walks. There, conductance models the probability that a random walk
gets ‘stuck’ inside a non-empty part. It is also used to estimate bounds on the
rate of convergence. This notion of ‘getting stuck’ is an alternative description of
bottlenecks. One of these approximation ideas is related to spectral properties.
Lemma 8.1.5 indicates the use of eigenvalues as bounds.
Lemma 8.1.5 ([521, Lemma 2.6]). For an ergodic1 reversible Markov
chain with underlying graph G, the second (largest) eigenvalue λ2 of the transi-
tion matrix satisfies:
1
Aperiodic and every state can be reached an arbitrary number of times from all
initial states.
186 M. Gaertler
λ2 ≥ 1 − 2 · ϕ(G) . (8.11)
A proof of that can be found in [521, p. 53]. Conductance is also related to
isoperimetric problems as well as expanders, which are both related to similar
spectral properties themselves. Section 14.4 states some of these characteristics.
For unweighted graphs the conductance of the complete graph is often a
useful boundary. It is possible to calculate its exact conductance value. Propo-
sition 8.1.6 states the result. Although the formula is different for even and odd
number of nodes, it shows that the conductance of complete graphs is asymp-
totically 1/2.
Proposition 8.1.6. Let n be an integer, then equation (8.12) holds.
@
1
· n , if n is even
ϕ(Kn ) = 21 n−11 (8.12)
2 + n−1 , if n is odd
|C| · (n − |C|)
ϕ(Kn ) = min . (8.13)
C⊂V,1≤|C|<n min(|C|(|C| − 1), (n − |C|)(n − |C| − 1))
8.1.3 Performance
The next index combines two non-trivial functions for the density measure f and
the sparsity measure g. It simply counts certain node pairs. According to the
general intuition of intra-cluster density versus inter-cluster sparsity, we define
for a given clustering a ‘correct’ classified pair of nodes as two nodes either
belonging to the same cluster and connected by an edge, or belonging to different
clusters and not connected by an edge. The resulting index is called performance.
Its density function f counts the number of edges within all clusters while its
sparsity function g counts the number of nonexistent edges between clusters, i.e.,
188 M. Gaertler
cluster 1 cluster 2
cluster 5
cluster 4
(b) non-trivial clustering with best (c) non-trivial clustering with best
intra-cluster conductance intra-cluster conductance
Fig. 8.2. A situation where intra-cluster conductance splits intuitive clusters. The
intuitive clustering has α = 3/4, while the other two clusterings have α = 1. The split
in Figure 8.2(b) is only a refinement of the intuitive clustering, while Figure 8.2(c) shows
a clusterings with same intra-cluster conductance value that is skew to the intuitive
clustering
k
$
f (C) := |E(Ci )| and
i=1
$ (8.19)
g (C) := [(u, v) '∈ E] · [u ∈ Ci , v ∈ Cj , i '= j] .
u,v∈V
The definition is given in Iverson Notation, first described in [322], and adapted
by Knuth in [365]. The term inside the parentheses can be any logical statement.
If the statement is true the term evaluates to 1, if it is false the term is 0. The
maximum of f +g has n·(n−1) as upper bound because there are n(n−1) different
node pairs. Please recall that loops are not present and each pair contributes with
either zero or one. Calculating the maximum of f + g is N P-hard (see [516]),
therefore this bound is used instead of the real maximum. By using some duality
aspects, such as the number of intra-cluster edges and the number of inter-cluster
edges sum up to the whole number of edges, the formula of performance can be
simplified as shown in Equation (8.21).
8 Clustering 189
cluster 1 cluster 1
cluster 4
Fig. 8.3. A situation where two very similar clusterings have very different inter-
cluster conductance values. The intuitive clustering has δ = 0, while the other has the
optimum value of 8/9
& %k '
m (C) + n(n − 1) − i=1 |Ci |(|Ci | − 1) − m (C)
perf (C) =
n(n − 1)
%k
n(n − 1) − m + 2m (C) − i=1 |Ci |(|Ci | − 1)
= (8.20)
n(n − 1)
m(C) %k
m(1 − 2 m ) + i=1 |Ci |(|Ci | − 1)
=1− . (8.21)
n(n − 1)
Note that the derivation from Equation (8.20) to (8.21) applies the equality m =
m (C) + m (C), and that m (C) /m is just the coverage γ (C) in the unweighted
case. Similarly to the other indices, performance has some disadvantages. Its
main drawback is the handling of very sparse graphs. Graphs of this type do
not contain subgraphs of arbitrary size and density. For such instances the gap
between the number of feasible edges (with respect to the structure) and the
maximum number of edges (regardless of the structure) is also huge. For example,
a planar graph cannot contain any complete graph with five or more nodes, and
the maximum number of edges such that the graph is planar is linear in the
number of nodes, while in general it is quadratic. In conclusion, clusterings with
good performance tend to have many small clusters. Such an example is given
in Figure 8.4.
An alternative motivation for performance is given in the following. Therefore
recall that the edge set induces a relation on V × V by u ∼ v if (u, v) ∈ E, and
clusterings are just another notion for equivalence relations. The problem of
finding a clustering can be formalized in this context as finding a transformation
of the edge-induced relation into an equivalence relation with small cost. In
190 M. Gaertler
cluster 1 cluster 2
cluster 5
cluster 4 cluster 3
cluster 2
Fig. 8.4. A situation where the clustering with optimal performance is a refinement
(Figure 8.4(b)) of an intuitive clustering and is skew (Figure 8.4(c)) to another intuitive
clustering
other words, add or delete edges such that the relation induced by the new edge
set is an equivalence relation. As cost function, one simply counts the number
of additional and deleted edges. Instead of minimizing the number of changes,
one can consider the dual version: find a clustering such that the clustering-
induced relation and the edge-set relation have the greatest ‘intersection’. This
is just the maximizing f + g. Thus performance is related to the ‘distance’ of the
edge-set relation to the closed clustering. Because finding this maximum is N P-
hard, solving this variant of clustering is also N P-hard. Although the problem is
hard, it possesses a very simple integer linear program (ILP). ILPs are also N P-
hard, however there exist many techniques that lead to usable heuristics or
approximations for the problems. The ILP is given by n2 decision variables Xuv ∈
{0, 1} with u, v ∈ V , and the following three groups of constraints:
8 Clustering 191
reflexivity ∀ u : Xuu = 1
symmetry ∀ u, v : Xuv = Xvu
Xuv + Xvw − 2 · Xuw ≤ 1
transitivity ∀ u, v, w : Xuw + Xuv − 2 · Xvw ≤ 1
Xvw + Xuw − 2 · Xuv ≤ 1
The idea is that the X variables represent equivalence relations, i.e., two
nodes u, v ∈ V are equivalent if Xuv = 1 and the objective function counts
the number of not ‘correct’ classified node pairs.
There exist miscellaneous variations of performance that use more complex
models for classification. However, many modifications highly depend on their
applicational background. Instead of presenting them, some variations to include
edge weights are given. As pointed out in Section 8.1, indices serve two different
tasks. In order to preserve the comparability aspect, we assume that all the
considered edge weights have a meaningful maximum M . It is not sufficient to
replace M with the maximum occurring edge weight because this value depends
on the input graph. Also, choosing an extremely large value of M is not suitable
because it disrupts the range aspects of the index. An example of such weightings
with a meaningful maximum are probabilities where M = 1. The weighting
represents the probability that an edge can be observed in a random draw.
Using the same counting scheme of performance, one has to solve the problem of
assigning a real value for node pairs that are not connected. This problem will
be overcome with the help of the meaningful maximum M .
The first variation is straightforward and leads to the measure functions given
in Equation (8.22):
k
$
f (C) := ω (E(Ci )) and
i=1
$ (8.22)
g (C) := M · [(u, v) '∈ E] · [u ∈ Ci , v ∈ Cj , i '= j] .
u,v∈V
Please note the similarity to the unweighted definition in Formula (8.19). How-
ever, the weight of the inter-cluster edges is neglected. This can be integrated
by modifying g:
+ + & '
+ +
g $ (C) := g (C) +M · +E(C)+ − ω E(C) . (8.23)
A BC D
=:gw (C)
192 M. Gaertler
The additional term gw (C) corresponds to the difference of weight that would
be counted if no inter-cluster edges were present and the weight that is assigned
to the actual inter-cluster edges. In both cases the maximum is bounded by M ·
n(n − 1), and the combined formula would be:
where θ is a scaling parameter that rates the importance of the weight of the
intra-cluster edges (with respect to the weight of the inter-cluster edges). The
different symbols for density f˜ and sparsity g̃ functions are used to clarify that
these functions perform inversely to the standard functions f and g with respect
to their range: small values indicate better structural behavior instead of large
values. Both can be combined to a standard index via
f˜ (C) + g̃ (C)
perfm (C) = 1 − . (8.26)
n(n − 1)M
Note that the versions are the same for ϑ = θ = 1. In general, this is not true for
other choices of ϑ and θ. Both families have their advantages and disadvantages.
The first version (Equation (8.24)) should be used, if the clusters are expected
to be heavy, while the other version (Equation (8.26)) handles clusters with
inhomogeneous weights better.
information, the first problem is to estimate the similarity of nodes that are not
connected with an edge. This is often solved using shortest path techniques that
respect all information, like weight or direction, or only partial information, or
none. The most common measures resulting are: diameter, edge weight variance,
and average distance within the clusters. In contrast to the previous indices,
these measures do not primarily focus on the intra-cluster density versus inter-
cluster sparsity paradigm. Most of them even ignore the inter-cluster structure
completely. Another difference is that these indices usually rate each cluster in-
dividually, regardless of its position within the graph. The resulting distribution
is then rated with respect to the average or the worst case. Thus, a density mea-
sure2 π can be transformed into an index by applying π on all cluster-induced
subgraphs and rating the resulting distribution of values, e.g., via minimum,
maximum, average, or mean:
worst case: min{π(G[C1 ]), . . . , π(G[Ck ])}
i
1$
average case: π(G[Ci ])
k i
best case: max{π(G[C1 ]), . . . , π(G[Ck ])}
i
One advantage is that one can now distinguish between good, ‘random’, and
bad clusterings depending on the sign and magnitude of the index. However, the
expected number of edges that fall at random in a subgraph may be a too global
view. The local density may be very inhomogeneous and thus an average global
view can be very inaccurate. Figure 8.5 sketches such an example. The graph
consists of two groups, i.e., a cycle (on the left side) and a clique (on the right
side), which are connected by a path. In this example there is no subgraph with
average density, in fact the two groups have a significantly different density. One
2
greater values imply larger density
194 M. Gaertler
way to restrict the global view of average density is to fix the degree of each
node and consider expected number of edges that fall at random in a subgraph.
The modified formula is given in Equation (8.28).
G% H2
$k
ω(E(C i )) e∈E(C ,V ) ω(e)
.
− i
(8.28)
i=1
ω(E) ω(E)
In general, these measures that compare the partition with an average case of the
graph seem to introduce a new perspective that has not yet been fully explored.
8.1.5 Summary
The presented indices serve as quality measurements for clusterings. They pro-
vide a formal way to capture the intuitive notation of natural decompositions.
Several of these indices can be expressed in a simple framework that models
the paradigm of intra-cluster density versus inter-cluster sparsity. It has been
introduced in Section 8.1. Table 8.1 summarizes these indices, the general form
is given in Equation (8.29).
f (C) + g (C)
index (C) := (8.29)
N
A commonality of these measures is that the associated optimization problem,
i.e., finding the clustering with best score, usually is N P-hard. If the optimal
structure is known in advance, as is the case for coverage or conductance, the
restriction to interesting or practically relevant clusterings leads to N P-hard
problems.
k
performance 1 − perfm (C) M |Ci |(|Ci | − 1) − θ · ω(E(Ci ) ω E(C) M · n(n − 1)
i=1
8 Clustering
195
196 M. Gaertler
Greedy Concepts. Most greedy methods fit into the following framework:
start with a trivial and feasible solution and use update operations to lower
its costs recursively until no further optimization is possible. This scheme for a
greedy approach is shown in Algorithm 19, where c (L) denotes the cost of solu-
tion L and Ng (L) is the set of all solutions that can be obtained via an update
operation starting with solution L. This iterative scheme can also be expressed
for clusterings via hierarchies. A hierarchy represents an iterative refinement (or
8 Clustering 197
reduce reduce
−−−−→ −−−−→
↓ solve
expand expand
←−−−− ←−−−−
Fig. 8.6. Example of successive reductions. The ‘shapes’ of the (sub)instances indicate
the knowledge of clustering, i.e., smooth shapes point to fuzzy information while the
rectangles indicate exact knowledge. The first row shows the application of modifica-
tions, while the second indicates the expansion phases
coarsening) process. Greedy methods that use either merge or split operations
as updates define a hierarchy in a natural way. The restriction to one of these
operations guarantees the comparability of clusterings, and thus leads to a hier-
archy. These two concepts will be formalized shortly, before that some facts of
hierarchies are briefly mentioned.
Hierarchies provide an additional degree of freedom over clusterings: the
number of clusters is not fixed. Thus, they represent the group structure in-
dependently of its granularity. However, this feature usually increases the space
requirement, although a hierarchy can be implicitly represented by a tree, also
called a dendrogram, that represents the merge operations. Algorithms tend to
construct it explicitly. Therefore, their space consumption is often quadratic or
larger. For a few special cases these costs can be reduced with the help of data
structures [178].
198 M. Gaertler
Although the definition is very formal, the basic idea is to perform a cheapest
merge operation. The cost of such an operation can be evaluated using two
different view points. A local version charges only merge itself, which depends
only on the two involved clusters. The opposite view is a global version that
considers the impact of the merge operation. These two concepts imply also
the used cost functions, i.e., a global cost function has the set of clusterings as
domain while the local cost function uses a pair of node subsets as arguments.
An example of linkage is given in Figure 8.7. The process of linkage can be
reversed, and, instead of merging two clusters, one cluster is split into two parts.
This dual process is called Splitting (Diversion). The formal description is given
in Definition 8.2.2.
Definition 8.2.2 (Splitting). Let a graph G = (V, E, ω), an initial cluster-
ing C1 , and one of the following function sets be given:
global: a cost function cglobal : A (G) → + 0
semi-global: a cost function cglobal : A (G) → +0 and a proper cut function Slocal :
P(V ) → P(V )
semi-local: a cost function clocal : P(V ) × P(V ) → + 0 and a proper cut func-
tion Slocal : P(V ) → P(V )
local: a cost function clocal : P(V ) × P(V ) → + 0
The Splitting process splits one cluster in the current clustering Ci := {C1 , . . .
Ck } into two parts. The process ends when no further splitting is possible. The
cluster that is going to be split is chosen in the following way:
8 Clustering 199
cluster 4,5
cluster 6 cluster 6
global: let P be the set of all possible clusterings resulting from Ci by splitting
one cluster into two non-empty parts, i.e.,
K L
P := {C1 , . . . , Ck } \ {Cµ } ∪ {Cµ$ , Cµ \ Cµ$ } | ∅ '= Cµ$ ! Cµ ,
local: let µ be an index and Cν be a proper subset of cluster Cµ such that clocal
has one global minimum in the pair (Cν , Cµ \ Cν ), then the new cluster-
ing Ci+1 is defined by splitting cluster Cµ according to Slocal , i.e.,
Ci+1 := {C1 , . . . , Ck } \ {Cµ } ∪ {Cµ , Cµ \ Cν } .
Similar to the Linkage process, the definition is rather technical but the ba-
sic idea is to perform the cheapest split operation. In contrast to the Linkage
method, the cost model has an additional degree of freedom because clusters can
be cut in several ways. Again, there is a global and a local version that charge
the impact of the split and the split itself, respectively. Both correspond to the
views in the Linkage process. However, rating every possible non-trivial cut of
the clusters is very time consuming and usually requires sophisticated knowledge
of the involved cost functions. One way to reduce the set of possible splittings is
to introduce an additional cut function Slocal . It serves as an ‘oracle’ to produce
useful candidates for splitting operations. The semi-global and semi-local ver-
sions have the same principles as the global and the local version; however, their
candidate set is dramatically reduced. Therefore, they are often quite efficiently
computable, and no sophisticated knowledge about the cost function is required.
However, the choice of the cut function has usually a large impact on the quality.
Both, the Linkage and the Splitting process, are considered to be greedy
for several reasons. One is the construction of the successive clusterings, i.e., an
update operation always chooses the cheapest clustering. These processes can produce total
or complete hierarchies quite easily. Total hierarchies can be achieved by simply
adding the trivial clusterings to the resulting hierarchy. They are comparable
to all other clusterings, therefore preserving the hierarchy property. Recall that
complete hierarchies are hierarchies such that a clustering with k clusters is
included for every integer k ∈ [1, n]. Both processes lead to a complete hierarchy
when initialized with the trivial clusterings, i.e., singletons for Linkage and the
1-clustering for Splitting. Note that in the case of the Splitting process, it is
essential that the cut functions are proper. Therefore, it is guaranteed that every
cluster will be split until each cluster contains only one node. Although the cost
can be measured with respect to the result or the operation itself, clairvoyance
or projection of information into the future, i.e., accepting a momentarily higher
cost for a later benefit, is never possible.
Because of their simple structure, especially in the local versions, both con-
cepts are frequently used and are also the foundation of clustering algorithms
in general. The general local versions can be very efficiently implemented. For
the Linkage process, a matrix containing all cluster pairs and their merging cost
is stored. When an update operation takes place, only the cost of merging the
new resulting cluster with another is recalculated. For certain cost functions this
scheme can even be implemented with less than quadratic space and runtime
consumption [178]. In the case of the Splitting process, only the cut information
needs to be stored for each cluster. Whenever a cluster gets split, one has to re-
compute the cut information only for the two new parts. This is not true for any
global version in general. However, a few very restricted cost and cut functions
can be handled efficiently also in the global versions.
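To make this bookkeeping concrete, the following sketch (Python; the local cost function `cost` and the representation of clusters as frozensets are assumptions made for illustration, not part of the original formulation) maintains the pairwise merge costs and, after each merge, recomputes only the entries involving the newly formed cluster.

```python
def linkage(clusters, cost):
    """Greedy local Linkage: repeatedly merge the cheapest pair of clusters.

    clusters: list of frozensets of nodes (the initial clustering)
    cost:     function (frozenset, frozenset) -> float, the local merge cost
    Returns the sequence of clusterings produced along the way.
    """
    clusters = [frozenset(c) for c in clusters]
    # 'matrix' of pairwise merge costs, stored as a dict on cluster pairs
    costs = {(a, b): cost(a, b) for i, a in enumerate(clusters)
             for b in clusters[i + 1:]}
    hierarchy = [list(clusters)]
    while len(clusters) > 1:
        a, b = min(costs, key=costs.get)          # cheapest merge operation
        merged = a | b
        clusters = [c for c in clusters if c not in (a, b)] + [merged]
        # only costs involving the new cluster have to be recomputed
        costs = {(c, d): v for (c, d), v in costs.items()
                 if c not in (a, b) and d not in (a, b)}
        for c in clusters[:-1]:
            costs[(c, merged)] = cost(c, merged)
        hierarchy.append(list(clusters))
    return hierarchy
```

For Single Linkage, for example, cost(a, b) would simply be the smallest distance between a node of a and a node of b.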
Owing to their contingent iterative nature, shifting approaches are rarely used
on their own. There can be sequences of shifting operations where the initial and
final clustering are the same, so-called loops. Also, bounds on the runtime are
more difficult to establish than for greedy approaches. Nonetheless, they are a
common postprocessing step for local improvements. An example of shifting is
given in Figure 8.8.
Fig. 8.9. Example of generating a graph and its clustering using distributions.
the input graph, and the estimation approach would try to estimate the number
of cluster centers as well as the assignment of each node to a cluster center. In
the EM case, the resulting clustering should have the largest expectation to be
the original hidden clustering, i.e., the same number of cluster points and the
correct node cluster-point assignment.
Evolutionary approaches such as genetic algorithms (GA), evolution strat-
egies (ES) and evolutionary programming (EP) iteratively modify a population
of solution candidates by applying certain operations. ‘Crossover’ and ‘mutation’
are the most common ones. The first creates a new candidate by recombining
two existing ones, whilst the second modifies one candidate. To each candidate
a fitness value is associated, usually the optimization function evaluated on the
candidate. After a number of basic operations, a new population is generated
based on the existing one, where candidates are selected according to their fitness
value. A common problem is to guarantee the feasibility of modified solutions.
Usually this is accomplished by the model specification. In the context of cluster-
ing, the model can either use partitions or equivalence relations. As presented in
Section 8.1.3, clusterings can be modeled as 0-1 vectors with certain constraints.
Search-based approaches use a given (implicit) topology of the candidate
space and perform a random walk starting at an arbitrary candidate. Similar to
evolutionary approaches, the neighborhood of a candidate can be defined by the
result of simple operations like the mutations. The neighborhood of a clustering
usually is the set of clusterings that result from node shifting, cluster merging,
or cluster splitting. The selection of a neighborhood is also based on some fitness
value, usually the optimization function evaluated on the candidate. The search
usually stops after a certain number of iterations, after finding a local optimum,
or a combination of both.
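As a minimal sketch of such a search-based approach (Python; the quality function `fitness`, the move set restricted to node shifts, and a fixed number of iterations as stopping rule are simplifying assumptions made here), one can walk through the candidate space as follows.

```python
import random

def local_search(clustering, fitness, iterations=1000, seed=0):
    """Randomized local search over clusterings.

    clustering: dict node -> integer cluster id (the starting candidate)
    fitness:    function clustering -> float, larger is better
    The neighborhood of a candidate consists of all clusterings obtained by
    shifting a single node into another (possibly new) cluster.
    """
    rng = random.Random(seed)
    current = dict(clustering)
    current_val = fitness(current)
    best, best_val = dict(current), current_val
    nodes = list(current)
    for _ in range(iterations):
        v = rng.choice(nodes)
        labels = set(current.values())
        target = rng.choice(sorted(labels | {max(labels) + 1}))  # may open a new cluster
        candidate = dict(current)
        candidate[v] = target
        value = fitness(candidate)
        if value >= current_val:              # accept non-worsening moves only
            current, current_val = candidate, value
            if value > best_val:
                best, best_val = dict(candidate), value
    return best, best_val
```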
8.2.2 Algorithms
Clustering methods have been developed in many different fields. They were
usually highly specialized, either for specific tasks or for certain conditions. The
reduction of algorithms to their fundamental ideas, and constructing a framework
on top, started not that long ago. Thus, this part can only give a short synopsis
about commonly used methods.
When dealing with similarities instead of distances one has to define a mean-
ingful path ‘length’. A simple way is to ignore it totally and define the cost
function as

c_local(Ci, Cj) := min { M − ω(e) : e ∈ E(Ci, Cj) } ,     (8.31)

where M is the maximum edge weight in the graph. Alternatively, one can define
the similarity of a path P : v1, . . . , vℓ by

ω(P) := ( Σ_{i=1}^{ℓ−1} 1/ω(vi, vi+1) )⁻¹ .     (8.32)
Although this definition is compatible with the triangle inequality, the meaning
of the original range can be lost along with other properties. Similar to the cost
definition in Equation (8.31), the distance value (in Equation (8.30)) would be
replaced by (n − 1)M − ω(P). These ‘inversions’ are necessary to be compatible
with the range meaning of cost functions. Another definition that is often used
in the context of probabilities is

ω(P) := Π_{i=1}^{ℓ−1} ω(vi, vi+1) .     (8.33)
If ω(vi , vi+1 ) is the probability that the edge (vi , vi+1 ) is present and these
probabilities are independent of each other, then ω(P ) is the probability that
the whole path exists.
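For illustration, both alternative path weights are direct to compute; the sketch below (Python; the path is given as a list of nodes and ω as a dictionary on ordered node pairs, both assumptions for this example) evaluates Equations (8.32) and (8.33).

```python
def harmonic_path_weight(path, omega):
    """Equation (8.32): inverse of the sum of inverse edge similarities."""
    return 1.0 / sum(1.0 / omega[(u, v)] for u, v in zip(path, path[1:]))

def product_path_weight(path, omega):
    """Equation (8.33): product of the edge weights, e.g. the probability that
    the whole path exists for independent edge probabilities."""
    result = 1.0
    for u, v in zip(path, path[1:]):
        result *= omega[(u, v)]
    return result
```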
Lemma 8.2.4. The dendrogram of Single Linkage is defined by a Minimum
Spanning Tree.
Only a sketch of the proof will be given. A complete proof can be found
in [324]. The idea is the following: consider the algorithm of Kruskal where
edges are inserted in non-decreasing order, and only those that do not create
a cycle. From the clustering perspective of Single Linkage, an edge that would
create a cycle connects two nodes belonging to the same cluster, thus that edge
cannot be an inter-cluster edge, and thus would have never been selected.
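The argument can also be turned into a short program: a Kruskal-style sweep with a union-find structure produces exactly the merge events of the Single Linkage dendrogram. The following sketch (Python; edges are given as (distance, u, v) triples, an assumption for this illustration) is meant to illustrate the idea, not to replace the proof in [324].

```python
def single_linkage_dendrogram(nodes, edges):
    """Single Linkage via Kruskal: process edges by non-decreasing distance;
    every edge that joins two different clusters is a merge of the dendrogram."""
    parent = {v: v for v in nodes}

    def find(v):                      # union-find with path halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    merges = []                       # (distance, root of cluster 1, root of cluster 2)
    for w, u, v in sorted(edges, key=lambda e: e[0]):
        ru, rv = find(u), find(v)
        if ru != rv:                  # edge between two clusters -> merge them
            merges.append((w, ru, rv))
            parent[ru] = rv
        # an edge inside a cluster closes a cycle and is ignored,
        # exactly as in Kruskal's algorithm
    return merges
```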
The Linkage framework is often applied in the context of sparse networks
and networks where the expected number of inter-cluster edges is rather low.
This is based on the observation that many Linkage versions tend to produce
chains of clusters. In the case where either few total edges or few inter-cluster
edges are present, these effects occur less often.
Conductance Cuts (Equation (8.37)), and Bisectors (Equation (8.38)). Table 8.2
contains all the requisite formulae.
Ratio Cuts, balanced cuts, and Bisectors (and their generalization, the k–
Sectors) are usually applied when the uniformity of cluster size is an important
constraint. Most of these measures are NP-hard to compute. Therefore, approx-
imation algorithms or heuristics are used as replacement. Note that balanced
cuts and Conductance Cuts are based on the same fundamental ideas: rating
the size/weight of the cut in relation to the size/weight of the smaller induced
cut side. Both are related to node and edge expanders as well as isoperimetric
problems. These problems focus on the intuitive notion of bottlenecks and their
formalizations (see Section 8.1.2 for more information about bottlenecks). Some
spectral aspects are also covered in Section 14.4, and [125] provides further in-
sight. Besides these problems, the two cut measures have more in common. There
are algorithms ([565]) that can be used to simultaneously approximate both cuts.
However, the resulting approximation factor differs.
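To make the shared idea of these cut measures concrete, the sketch below (Python; unweighted adjacency sets, and normalizing the cut size by the volume of the smaller side is the variant assumed here, not necessarily the exact formula of Table 8.2) evaluates a single cut.

```python
def conductance(adj, S):
    """Conductance-style rating of the cut (S, V \\ S) in an unweighted graph.

    adj: dict node -> set of neighbors
    S:   set of nodes on one side of the cut
    Rates the cut size relative to the volume (sum of degrees) of the smaller side,
    which is one common formalization of the 'bottleneck' intuition.
    """
    S = set(S)
    cut = sum(1 for u in S for v in adj[u] if v not in S)
    vol_S = sum(len(adj[u]) for u in S)
    vol_rest = sum(len(adj[u]) for u in adj if u not in S)
    denom = min(vol_S, vol_rest)
    return cut / denom if denom > 0 else float('inf')
```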
Splitting is often applied to dense networks or networks where the expected
number of intra-cluster edges is extremely high. An example for dense graphs
are networks that model gene expressions [286]. A common observation is that
Splitting methods tend to produce small and very dense clusters.
Fig. 8.10. Example for the removal of bridge elements. Removed elements are drawn
differently: edges are dotted and nodes are reduced to their outline
derived from shortest path or flow computations. Also centralities can be utilized
for the identification [445].
Multi-Level Approaches are generalizations of the Linkage framework, where
groups of nodes are collapsed into a single element until the instance becomes
solvable. Afterwards, the solution has to be transformed into a solution for the
original input graph. During these steps the previously formed groups need not
be preserved, i.e., a group can be split and each part can be assigned to individual
clusters. In contrast to the original Linkage framework, here it is possible to tear
an already formed cluster apart. Multi-level approaches are more often used in
the context of equi-partitioning, where k groups of roughly the same size should
be found that have very few edges connecting them. In this scenario, they have
been successfully applied in combination with shiftings. Figure 8.11 shows an
example.
Modularity – as presented at the beginning of Section 8.2.1, clustering algo-
rithms have a very general structure: they mainly consist of ‘invertible’ transfor-
mations. Therefore, a very simple way to generate new clustering algorithms is
the re-combination of these transformations, with modifications of the sequence
where appropriate. A reduction does not need to reduce the size of an instance;
on the contrary, it can also increase it by adding new data. There are two different
types of data that can be added. The first is information that is already present
in the graph structure, but only implicitly, for example an embedding such
that the distances in the embedding are correlated with the edge weights. Spectral
embeddings are quite common, i.e. nodes are positioned according to the entries
of an eigenvector (to an associated matrix of the graph). More details about
spectral properties of a graph can be found in Chapter 14. Such a step is usually
placed at the beginning, or near the end, of the transformation sequence. The
second kind is information that supports the current view of the data. Similar to
the identification of bridge elements, one can identify cohesive groups. Bridges
and these cohesive parts are dual to each other. Thus, while bridges would be
removed, cohesive groups would be extended to cliques. These steps can occur
during the whole transformation sequence.
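A spectral embedding of the kind just mentioned can be sketched in a few lines (Python with NumPy; using the Laplacian L = D − A and the eigenvectors of its smallest non-trivial eigenvalues is one common choice, assumed here for illustration).

```python
import numpy as np

def spectral_embedding(adj_matrix, dims=2):
    """Position nodes by entries of Laplacian eigenvectors.

    adj_matrix: symmetric (n x n) numpy array of edge weights
    Returns an (n x dims) array; row i is the position of node i, given by its
    entries in the eigenvectors of the smallest non-trivial eigenvalues.
    """
    A = np.asarray(adj_matrix, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    eigenvalues, eigenvectors = np.linalg.eigh(L)   # eigenvalues in ascending order
    # skip the first eigenvector (eigenvalue 0 for a connected graph)
    return eigenvectors[:, 1:1 + dims]
```

The resulting low-dimensional positions can then be clustered geometrically or used to support one of the transformations described above.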
λ ≤ δ. (14.16)
Keep in mind that when summing over all edges {i, j} ∈ E, the terms must not
depend on which end of the edge actually is i and which is j. Observe that this
is always the case here, e.g., in the preceding calculation because (gi ± gj )2 =
(gj ± gi )2 for all i, j ∈ V .
Using the eigenvalue property of y we get

λ yi = d(i) yi − Σ_{j: {i,j}∈E} yj   ∀ i ∈ V.     (14.18)

By (14.17),

… = Σ_{{i,j}∈E} (gi + gj)² + Σ_{{i,j}∈E(W,V\W)} yj yi .
Set α := Σ_{{i,j}∈E(W,V\W)} yi yj . Because the left and right hand sides of (14.19)
are not negative, using (14.20) we derive

λ(2∆ − λ) ( Σ_{i∈W} yi² )²
  ≥ Σ_{{i,j}∈E} (gi + gj)² · Σ_{{i,j}∈E} (gi − gj)²
    + α ( Σ_{{i,j}∈E} (gi − gj)² − Σ_{{i,j}∈E} (gi + gj)² ) − α²     (14.21)
  = Σ_{{i,j}∈E} (gi + gj)² · Σ_{{i,j}∈E} (gi − gj)² − α ( 4 Σ_{{i,j}∈E(W)} yi yj + α ) .
We would like to drop the ‘α’ term completely in this equation. To this end,
observe that by the definition of W we have α ≤ 0. Furthermore, using again
the eigenvalue property of λ (see also (14.18)), we have
4 Σ_{{i,j}∈E(W)} yi yj + α
  = 2 Σ_{{i,j}∈E(W)} yi yj + 2 Σ_{{i,j}∈E(W)} yi yj + Σ_{{i,j}∈E(W,V\W)} yi yj
  = 2 Σ_{{i,j}∈E(W)} yi yj + Σ_{i∈W} Σ_{j: {i,j}∈E} yi yj
  = 2 Σ_{{i,j}∈E(W)} yi yj + Σ_{i∈W} (d(i) − λ) yi²     (both terms are non-negative)
  ≥ 0,
… = ⟨v, w⟩²
  ≤ ‖v‖² ‖w‖²     (14.23)
  = Σ_{{i,j}∈E} (gi + gj)² · Σ_{{i,j}∈E} (gi − gj)²
  ≤ λ(2∆ − λ) ( Σ_{i∈W} yi² )² .     (by (14.22))
We will now bound this from below. Let 0 = t0 < t1 < . . . < tN be all the
different values of the components of g. Define Vk := {i ∈ V ; gi ≥ tk} for k ∈
{0, . . . , N} and for convenience VN+1 := ∅. Then, for k ∈ {1, . . . , N + 1} we have
Vk ⊆ W and therefore |Vk| ≤ |W|, hence |Vk| = min{|Vk|, |V \ Vk|}. It also holds
that VN ⊆ VN−1 ⊆ . . . ⊆ V1 = W ⊆ V0 = V and that |Vk| − |Vk+1| is the number
of entries in g equal to tk for all k ∈ {0, . . . , N}.
We will later show that we can express the sum Σ_{{i,j}∈E} |gi² − gj²| in a con-
venient way:
Σ_{{i,j}∈E} |gi² − gj²| = Σ_{k=1}^{N} Σ_{{i,j}∈E, gi<gj=tk} (gj² − gi²)
  = Σ_{k=1}^{N} Σ_{{i,j}∈E(Vk, V\Vk)} (tk² − t_{k−1}²)     (see below)
  = Σ_{k=1}^{N} |E(Vk, V\Vk)| (tk² − t_{k−1}²)
  ≥ i(G) Σ_{k=1}^{N} |Vk| (tk² − t_{k−1}²)
  = i(G) Σ_{k=0}^{N} tk² (|Vk| − |Vk+1|),
since VN+1 = ∅ and t0 = 0. Now we can conclude that Σ_{{i,j}∈E} |gi² − gj²| ≥
i(G) Σ_{i∈V} gi² = i(G) Σ_{i∈W} yi². This together with (14.23) yields the claim of
the theorem.
It now only remains to prove the validity of the transformation of the sum,
i.e.,

Σ_{k=1}^{N} Σ_{{i,j}∈E, gi<gj=tk} (gj² − gi²) = Σ_{k=1}^{N} Σ_{{i,j}∈E(Vk, V\Vk)} (tk² − t_{k−1}²).     (14.24)
This will be done by induction on N . The case N = 1 is clear. So let N > 1 and
assume that (14.24) has already been proven for instances with N − 1 instead of
N , i.e., instances where we have a vector g̃ on a graph G̃ = (Ṽ , Ẽ) assuming only
N different values 0 = t̃0 < . . . < t̃N−1 on its components and where subsets
ṼN −1 ⊆ ṼN −2 ⊆ . . . Ṽ1 = W̃ ⊆ Ṽ0 = Ṽ are defined accordingly.
We will make use of this for the following instance. Define G̃ := G − VN (the
vertices and edges of G̃ are Ṽ and Ẽ, respectively) and let g̃ be the restriction
of g on Ṽ . We then have t̃k = tk for all k ∈ {0, . . . , N − 1}. If we then define the
sets Ṽk accordingly, we also have Ṽk = Vk \ VN for all k ∈ {0, . . . , N − 1}. Note
that VN ⊆ Vk for all k ∈ {0, . . . , N − 1}, so the sets Ṽk differ from the sets Vk
exactly by the vertices in VN .
By induction, we have
Σ_{k=1}^{N} Σ_{{i,j}∈E, gi<gj=tk} (gj² − gi²)
  = Σ_{k=1}^{N−1} Σ_{{i,j}∈E, gi<gj=tk} (gj² − gi²) + Σ_{{i,j}∈E, gi<gj=tN} (gj² − gi²)
  = Σ_{k=1}^{N−1} Σ_{{i,j}∈Ẽ, g̃i<g̃j=tk} (g̃j² − g̃i²) + Σ_{{i,j}∈E, gi<gj=tN} (gj² − gi²)     (14.25)
  = Σ_{k=1}^{N−1} Σ_{{i,j}∈E_G̃(Ṽk, Ṽ\Ṽk)} (t̃k² − t̃_{k−1}²) + Σ_{{i,j}∈E, gi<gj=tN} (gj² − gi²)     (by induction)
  = Σ_{k=1}^{N−1} Σ_{{i,j}∈E_G̃(Ṽk, Ṽ\Ṽk)} (tk² − t_{k−1}²) + Σ_{{i,j}∈E, gi<gj=tN} (gj² − gi²),

where the first sum of the last line will be denoted by (∗).
Observe that the cuts EG̃ (Ṽk , Ṽ \ Ṽk ) only consist of edges in Ẽ. If we switch
to cuts in G, we have to subtract some edges afterwards, namely those with one
end in VN . This way, we get for the sum (∗) the following:
Σ_{k=1}^{N−1} Σ_{{i,j}∈E_G̃(Ṽk, Ṽ\Ṽk)} (tk² − t_{k−1}²)
  = Σ_{k=1}^{N−1} Σ_{{i,j}∈E(Vk, V\Vk)} (tk² − t_{k−1}²) − Σ_{k=1}^{N−1} Σ_{{i,j}∈E(Vk, V\Vk), j∈VN} (tk² − t_{k−1}²),     (14.26)

where the last sum will be denoted by (+).
We will now take a closer look at the ‘corrective’ term (+). To this end, for each i ∈ V
let k(i) be the smallest index such that i ∈ V \ Vk(i) . We then have gi = tk(i)−1
for all i ∈ V and
Σ_{k=1}^{N−1} Σ_{{i,j}∈E(Vk, V\Vk), j∈VN} (tk² − t_{k−1}²)
  = Σ_{{i,j}∈E(VN, V\VN), j∈VN} Σ_{k∈{1,...,N−1}, i∈V\Vk} (tk² − t_{k−1}²)
  = Σ_{{i,j}∈E(VN, V\VN), j∈VN} Σ_{k=k(i)}^{N−1} (tk² − t_{k−1}²)     (14.27)
  = Σ_{{i,j}∈E(VN, V\VN), j∈VN} (t_{N−1}² − t_{k(i)−1}²)     (telescope)
  = Σ_{{i,j}∈E(VN, V\VN), j∈VN} (t_{N−1}² − gi²).
This will later allow us to combine the last sum from (14.25) and that from
(14.27).
Putting everything together, we see:
Σ_{k=1}^{N} Σ_{{i,j}∈E, gi<gj=tk} (gj² − gi²)
  = Σ_{k=1}^{N−1} Σ_{{i,j}∈E_G̃(Ṽk, Ṽ\Ṽk)} (tk² − t_{k−1}²) + Σ_{{i,j}∈E, gi<gj=tN} (gj² − gi²)     (by (14.25); the first sum is (∗))
  = Σ_{k=1}^{N−1} Σ_{{i,j}∈E(Vk, V\Vk)} (tk² − t_{k−1}²) − Σ_{k=1}^{N−1} Σ_{{i,j}∈E(Vk, V\Vk), j∈VN} (tk² − t_{k−1}²)
    + Σ_{{i,j}∈E, gi<gj=tN} (gj² − gi²)     (by (14.26); the subtracted sum is (+))
  = Σ_{k=1}^{N−1} Σ_{{i,j}∈E(Vk, V\Vk)} (tk² − t_{k−1}²) − Σ_{{i,j}∈E(VN, V\VN), j∈VN} (t_{N−1}² − gi²)
    + Σ_{{i,j}∈E, gi<gj=tN} (gj² − gi²)     (by (14.27))
  = Σ_{k=1}^{N−1} Σ_{{i,j}∈E(Vk, V\Vk)} (tk² − t_{k−1}²) + Σ_{{i,j}∈E(VN, V\VN), j∈VN} (tN² − gi² − t_{N−1}² + gi²)     (by (14.28), since gj² = tN² in the last sum)
  = Σ_{k=1}^{N−1} Σ_{{i,j}∈E(Vk, V\Vk)} (tk² − t_{k−1}²) + Σ_{{i,j}∈E(VN, V\VN), j∈VN} (tN² − t_{N−1}²)
  = Σ_{k=1}^{N} Σ_{{i,j}∈E(Vk, V\Vk)} (tk² − t_{k−1}²).

This is (14.24). ⊓⊔
So, for graphs with ε = 0 the chromatic number is lower bounded in terms of
R(G) as
χ(G) ≥ 2R(G) + 2.
For small |ε|, this bound obviously still holds in an approximate sense.
Spectra. Let us start with G(n, p). We generated 100 graphs randomly accord-
ing to G(2000, 1/2) and computed their spectra. A first observation is that the
largest eigenvalue is significantly set off from the rest of the spectrum in all of
these experiments. Because this offset will be explicitly regarded later, we for
now exclude the largest eigenvalue from our considerations. So, when talking of
‘spectrum’, for the rest of this section, we mean ‘spectrum without the largest
eigenvalue’.
To get a visualization of all the spectra, we computed a histogram of the
eigenvalues of all 100 random graphs. This technique will be used for the other
random graph models as well. The quantity approximated by those histograms
is also known as the spectral density. Although we do not introduce this notion
formally [198], we will in the following occasionally use that term when referring
to the way eigenvalues are distributed.
All histograms, after being normalized to comprise an area of 1, were scaled
in the following way. Let λ1, . . . , λN be the eigenvalues. (In our case N = 100 ·
(2000 − 1), for we have 100 graphs with 2000 eigenvalues each, and we exclude
the largest of them for each graph.) Define λ̄ as the mean of all these values,

λ̄ = (1/N) Σ_{i=1}^{N} λi .
The x-axis of our plots is scaled by 1/σ and the y-axis is scaled by σ, where σ
denotes the standard deviation of these eigenvalues. This makes it easy to compare
spectra of graphs of different sizes.
Figure 14.10 shows the scaled histogram. The semicircle form actually is no
surprise. It follows from a classical result from random matrix theory known as
the semicircle law. It originally is due to Wigner [584, 585] and has later been
refined by a number of researchers [34, 233, 337].
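Such an experiment is easy to reproduce; the sketch below (Python with NumPy; smaller parameters than in the text, and the scaling by λ̄ and the standard deviation σ as described above) samples G(n, p) graphs, drops the largest eigenvalue of each, and bins the remaining eigenvalues.

```python
import numpy as np

def gnp_spectrum_histogram(n=500, p=0.5, samples=10, bins=60, seed=0):
    """Histogram of the adjacency spectra of G(n, p) samples,
    excluding the largest eigenvalue of each graph."""
    rng = np.random.default_rng(seed)
    eigenvalues = []
    for _ in range(samples):
        coins = rng.random((n, n)) < p          # independent edge coin flips
        A = np.triu(coins, k=1)                 # keep the upper triangle, no loops
        A = (A | A.T).astype(float)             # symmetrize
        spectrum = np.linalg.eigvalsh(A)        # sorted ascending
        eigenvalues.extend(spectrum[:-1])       # drop the largest eigenvalue
    lam = np.array(eigenvalues)
    lam = (lam - lam.mean()) / lam.std()        # scale as described in the text
    density, edges = np.histogram(lam, bins=bins, density=True)
    return density, edges                       # compare against the semicircle
```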
Fig. 14.10. A histogram of the union of the spectra of 100 random graphs from
G(2000, 1/2)
chosen is proportional to its degree; at the very first step, when all vertices have
degree 0, each of them is equally likely to be chosen.
This way of generating preferential attachment graphs is slightly different
from what usually is considered in that multiple edges are allowed. However,
the resulting histograms of the spectra look very similar to those of [198], where
multiple edges do not occur. See Figure 14.11 for our results. For comparison,
Figure 14.12 shows plots for our preferential attachment graphs and for G(n, p)
graphs in one figure.
Fig. 14.11. A smoothed histogram of the union of the spectra of 100 random graphs
with preferential attachment on 2000 vertices and with m = 10
² There might be a number (less than m = 10) of isolated vertices.
Fig. 14.12. Histograms of spectra of 100 random graphs with preferential attachment
(n = 2000, m = 10) and from G(2000, 1/2) each. The solid line marks an ideal semicircle
³ The expected degree is in fact p(n − 1). To see this, fix a vertex i ∈ V. This vertex
has n − 1 potential neighbors. Let X1, . . . , Xn−1 be 0/1 random variables, where Xj
indicates whether or not an edge is drawn from i to j. If we put X := Σ_{j=1}^{n−1} Xj, then
E[X] is the expected degree of i. The claim follows from linearity of expectation.
Fig. 14.13. Histograms of spectra of sparse random graphs. For n = 1000, 2000 and
4000 each we created 10 random graphs from G(n, p) with p set to satisfy pn = 5. The
solid line marks an ideal semicircle
In [49], the reasons for the peaks are investigated. It is argued that they
originate from small connected components and, for pn > 1, also from small
trees grafted on the giant component.
Fig. 14.14. The cumulative distribution of eigenvalues for sparse random graphs. For
n = 2000 and pn = 1.2, 2.00 and 5.00 each we created 10 random graphs from G(n, p)
feature eigenvectors with high inverse participation ratio. The small-world graph
shows a very asymmetric structure.
We were not able to produce similarly distinguishing plots of the spectral
density for such small graphs.
Fig. 14.15. Inverse participation ratios of three random graphs. The models are:
G(n, p) (Gnp 100 0.1 0001 ipr), preferential attachment (BA 100 10 0001 ipr), small-
world (sw 100 0.01 10 0001 ipr). All graphs have an expected degree of 10
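The inverse participation ratio itself is a one-line computation; the following sketch (Python with NumPy; the convention IPR(x) = Σᵢ xᵢ⁴ for a normalized eigenvector x is assumed here) computes it for every eigenvector of an adjacency matrix.

```python
import numpy as np

def inverse_participation_ratios(adj_matrix):
    """Return (eigenvalues, IPRs) of a symmetric adjacency matrix.

    For a normalized eigenvector x, IPR(x) = sum_i x_i**4; values close to 1/n
    indicate a delocalized eigenvector, values close to 1 a localized one.
    """
    A = np.asarray(adj_matrix, dtype=float)
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    vectors = eigenvectors / np.linalg.norm(eigenvectors, axis=0)  # normalize columns
    iprs = np.sum(vectors ** 4, axis=0)
    return eigenvalues, iprs
```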
max_{i∈{1,...,n}} wi² < Σ_{i=1}^{n} wi ,
Furthermore, we denote the largest expected degree by m and the average ex-
pected degree by d.
We can prove results on the largest eigenvalue of the adjacency spectrum of
a random graph from G(w), which hold almost surely⁴ and under certain condi-
tions on d̃ and m. We can also make such statements on the k largest eigenvalues,
provided that d̃, m, and the k largest expected degrees behave appropriately.
An interesting application of these results concerns random power law graphs:
we can choose the sequence w suitably, so that the expected number of vertices
of degree k is proportional to k^{−β}, for some given β. Under consideration were
values of β > 2.
Theorem 14.5.1 ([127, 128]). Let G be a random power law graph with ex-
ponent β and adjacency spectrum λ1 , . . . , λn .
⁴ I.e., with probability tending to 1 as n tends to ∞.
1. For β ≥ 3 and
   m > d² log³ n,     (14.31)
   we have almost surely λn = (1 + o(1)) √m.
2. For 2.5 < β < 3 and
   m > d^{(β−2)/(β−2.5)} log^{3/(β−2.5)} n,     (14.32)
   we have almost surely λn = (1 + o(1)) √m.
3. For 2 < β < 2.5 and
   m > log^{3/(2.5−β)} n,
   we have almost surely λn = (1 + o(1)) d̃.
4. For 2.5 < β and k < n (d / (m log n))^{β−1}, almost surely the k largest eigenvalues
   of G have a power law distribution with exponent 2β − 1, provided m is large
   enough (satisfying (14.31) and (14.32)).
We remark that the second order average degree d̃ can actually be computed
in these cases. For details see [127, 128].
Numerical Methods
The diagonalization strategy for small dense matrices is comprehensively treated
in [482, 535]. For a discussion of QR-like algorithms including parallelizable
versions see [571]. More on the Lanczos method can be found in [467, 591]. An
Arnoldi code for real asymmetric matrices is discussed in [509].
such that the resulting network is disconnected and has property P ?”. These
are worst case statistics because the deletion of an arbitrary set of vertices or
edges of the same size may not cause the same effect. So we implicitly assume
that the vertex or edge failures are not random but targeted for maximum effect.
15.1.2 Cohesiveness
The notion of cohesiveness was introduced by Akiyama et al. in [13] and defines
for each vertex of the network to what extent it contributes to the connectivity.
Definition 15.1.1. Let κ(G) be the vertex-connectivity of G (see the definition
in Section 2.2.4). Let G − v be the network obtained from G by removing vertex
v. For any vertex v of G, the cohesiveness c(v) is defined as c(v) := κ(G) − κ(G − v).
Vertex 7 in Figure 15.1(a) has a cohesiveness of -2, because the network has
vertex-connectivity 1 if vertex 7 is present and vertex connectivity 3 if we delete
it. On the other hand, vertex 6 in Figure 15.1(b) has cohesiveness 1 because if
we remove it from the network, the vertex-connectivity drops from 3 to 2.
It follows from the definition that the cohesiveness of a vertex cannot be
greater than 1. Intuitively, a vertex with negative cohesiveness is an outlier of
the network while a vertex with cohesiveness 1 is central. It can be shown that
a network can have at most one vertex with negative cohesiveness and that
the neighborhood of this negative vertex contains the only set of vertices of
Fig. 15.1. Example graphs for the cohesiveness of a vertex. Vertex 7 in Figure 15.1(a)
has cohesiveness -2 and vertex 6 in Figure 15.1(b) cohesiveness 1
size κ(G) whose removal disconnects the network. Consider as an example the
network shown in Figure 15.1(a), where vertex 7 is the only vertex with negative
cohesiveness. The only neighbor of vertex 7 is vertex 1 and this is the only vertex
whose deletion splits the network.
Even though a network can have at most one negative vertex, we can compute
a set of loosely connected vertices by removing the negative vertex and then
looking for the next negative vertex. This algorithm could be used to find loosely
connected vertices in a network because a negative vertex is at the periphery of
the graph. A drawback of this approach is that this algorithm may stop after
a few vertices even for big networks because there are no more vertices with
negative cohesiveness.
The cohesiveness of a vertex can be computed using standard connectivity
algorithms (see Chapter 7). To compute the cohesiveness of every vertex, the
connectivity algorithm has to be called n times where n is the number of vertices
in the network.
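Following this description, cohesiveness can be computed on top of any vertex-connectivity routine; the sketch below uses NetworkX's node_connectivity as that black box (the use of this library is an assumption of the sketch, not part of the text).

```python
import networkx as nx

def cohesiveness(G):
    """Cohesiveness c(v) = kappa(G) - kappa(G - v) for every vertex v.

    Calls a vertex-connectivity routine n + 1 times in total.
    """
    kappa = nx.node_connectivity(G)
    result = {}
    for v in G.nodes():
        H = G.copy()
        H.remove_node(v)
        result[v] = kappa - nx.node_connectivity(H)
    return result
```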
Fig. 15.2. Example network for the minimum m-degree
15.1.4 Toughness
The toughness of a network was introduced by Chvátal [129]. It measures the
number of internally connected components that the graph can be broken into
by the failure of a certain number of vertices.
Formally, the toughness t(G) is the minimum of |S|/c(G − S) over all vertex subsets S
whose removal disconnects G, where c(G − S) denotes the number of connected
components of the resulting graph.
It is NP-hard to decide for a general graph if it has toughness at least t [48].
If the network is a tree, the toughness is 1/∆(G), where ∆(G) is the maximum
degree of any vertex. The toughness of the complete bipartite network Km,n
with m ≤ n and n ≥ 2 is m/n.
The toughness of a cycle is one, and it follows that the toughness of a Hamil-
tonian graph is at least one. In [129], Chvátal also showed a connection between
the independence number of a network and the toughness. The independence
number β0 is the size of the largest subset S of the vertices with the property
that there is no edge in the network connecting two vertices in S. The toughness
of G is lower-bounded by κ(G)/β0(G) and upper-bounded by (n − β0(G))/β0(G).
size of the smallest subset of vertices we have to delete to split the network into
components of at most k vertices each. Classical connectivity is a special case of
conditional connectivity where P = ∅.
If we define a sequence S = (P1 , . . . , Pk ) of properties according to our ap-
plication such that Pi+1 implies Pi for 1 ≤ i ≤ k − 1, we obtain a vector of
conditional connectivity
(κ(G : P1 ), . . . , κ(G : Pk )) .
If the properties are defined to model increasing degradation of the network with
respect to the application, this vector gives upper bounds for the usefulness of
the system with respect to the number of failed vertices.
A similar measure is general connectivity, also introduced by Harary [277].
If G is a network with property P and Y is a subset of the vertices (edges) of
G, then κ(G, Y : P) is the cardinality of the smallest set X ⊆ Y of vertices (edges)
in G whose removal results in a network G′ that does not have property P. Conditional
connectivity is a special case of general connectivity.
The main drawback of these statistics is that there is no efficient algorithm
known that computes them for a general graph.
15.2.1 Persistence
The persistence of a network is the minimum number of vertices that have to be
deleted in order to increase the diameter (the longest distance between a pair of
vertices in the network). Again, an analogous notion is defined for the deletion
of edges (edge persistence). Persistence was introduced by Boesch, Harary and
Kabell in [64] where they also present the following properties of the persistence
of a network:
– The persistence of a network with diameter 2 ≤ d ≤ 4 is equal to the minimum
over all pairs of non-adjacent vertices i and j of the maximum number of
vertex-disjoint i, j-paths of length no more than d.
– The edge-persistence of a network with diameter d ∈ {2, 3} is the minimum
over all pairs of vertices i, j of the maximum number of edge-disjoint i, j-paths
of length no more than d.
There are many theoretic results on persistence that mainly establish con-
nections between connectivity and persistence, see for example [74, 475]. The
persistence vector is an extension of the persistence concept. The i-th compo-
nent of P (G) = (p1 , . . . , pn ) is the worst-case diameter of G if i vertices are
removed. This is the same concept as the vertex-deleted diameter sequence we
introduce in Section 15.2.2.
The main drawback of persistence is that there is no efficient algorithm known
to compute it.
Table 15.2. The vertex- and edge-deletion sequences for the network of Figure 15.4
A = (1, 2)   B = (3, 3)   D = (3, 4)   T = (4, 4)
– B = (1, . . . , 1)
– D = (1, . . . , 1)
– T = (2, . . . , 2)
Krishnamoorthy, Thulasiraman and Swamy show that the largest increase
in the distance between any pair of vertices caused by the deletion of i vertices
or edges can always be found among the neighbors of the deleted objects. This
speeds up the computation of the sequences significantly and also simplifies the
definitions of A and B. These sequences can also be defined as follows (note that
N (Vi ) is the set of vertices adjacent to vertices in the set Vi and N (Ei ) is the
set of vertices incident to edges in Ei ):
The vertex- and edge-deletion sequences are a worst case measure for the
increase in distance caused by the failure of vertices or edges and they do not
make any statements about the state of the graph after disconnection occurred.
So these measures are only suited for applications where distance is crucial and
disconnection makes the whole network unusable. Even with the improvement
mentioned above, computing the sequences is still only possible for graphs with
low connectivity.
The statistics in this section make statements about the average number of
vertices or edges that have to fail in order for the network to acquire a certain
property, or they average local properties in order to capture global aspects
of the network.
Figure 15.5 shows a graph with mean connectivity 3/4. This can be seen as
follows: For every edge-sequence where the edge (2, 3) does not come last, we have
ξ(s) = 3. For all other sequences, we have ξ(s) = 4. Since there are six sequences
where edge (2, 3) is last and 24 sequences in total, the mean connectivity of the
graph is 3/4.
Note that M(G) is not the same as the mean number of edges we have to
delete to disconnect G. If we look at all sequences of deleting edges and compute
the mean index where the graph becomes disconnected, we obtain the value 7/4
for the graph in Figure 15.5.
Fig. 15.5. A graph with mean connectivity 3/4
λ(G) − 1 ≤ M(G) ≤ m − n + 1
frag1 = max_i |Si| / Σ_{i=1}^{k} |Si| ,

where S1, . . . , Sk denote the connected components of the network.
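A small sketch makes the fragmentation statistics concrete (Python; connected components via breadth-first search; taking frag2 as the average size of the non-largest components follows the description of the experiments below).

```python
from collections import deque

def fragmentation(adj):
    """Return (frag1, frag2) for a graph given as dict node -> set of neighbors.

    frag1: relative size of the largest connected component,
    frag2: average size of the remaining ('disconnected') components.
    """
    if not adj:
        return 0.0, 0.0
    unseen = set(adj)
    sizes = []
    while unseen:
        queue, size = deque([unseen.pop()]), 1
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in unseen:
                    unseen.remove(v)
                    queue.append(v)
                    size += 1
        sizes.append(size)
    sizes.sort(reverse=True)
    frag1 = sizes[0] / sum(sizes)
    rest = sizes[1:]
    frag2 = sum(rest) / len(rest) if rest else 0.0
    return frag1, frag2
```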
¹ In [17], the authors use the term interconnectedness, which corresponds to the clas-
sical average distance. In their experiments, however, they measure the average
connected distance. The classical average distance becomes ∞ as soon as the graph
becomes disconnected.
Figure 15.6 shows the effect of vertex failures and attacks on the average con-
nected distance d¯ for randomly generated networks whose degree distributions
follow a Poisson distribution and a power-law distribution, respectively. The
Poisson networks suffer equally from random and targeted failures. Every vertex
plays more or less the same role, and deleting one of them affects the average
connected distance, on average, only slightly if at all. The scale-free network, in
contrast, is very robust to failures in terms of average connected distance. The
probability that a high-degree vertex is deleted is quite small and since those
vertices are responsible for the short average distance in scale-free networks,
the distances almost do not increase at all when deleting vertices randomly. If,
however, those vertices are the aim of an attack, the average connected distance
increases quickly. Simulations on small fragments of the Internet router graph
and the WWW graph show a similar behavior as the random scale-free network,
see [17].
Fig. 15.6. Changes in average connected distance d̄ of randomly generated networks
(|V| = 10,000, |E| = 20,000) with Poisson (P) and scale-free (SF) degree distribution
after randomly removing f |V| vertices (source: [17])
The increase in average connected distance alone does not say much about
the connectivity status of the network in terms of fragmentation. It is possible
to create networks with small average connected distance that consist of many
disconnected components (imagine a large number of disconnected triangles:
their average connected distance is 1). Therefore, Albert et al. also measure the
fragmentation process under failure and attack.
Figure 15.7 shows the results of the experimental study on fragmentation.
The Poisson network shows a threshold-like behavior for f > fc ≈ 0.28 when
frag1 , the relative size of the largest component, becomes almost zero. Together
with the behavior of frag2 , the average size of the disconnected components, that
reaches a peak of 2 at this point, this indicates the breakdown scenario as shown
also in Figure 15.8: Removing few vertices disconnects only single vertices. The
components become larger as f reaches the percolation threshold fc . After that,
the system falls apart. As in Figure 15.6, the results are the same for random
and targeted failures in networks with Poisson degree distribution.
The process looks different for scale-free networks (again, the data for the
router and WWW graphs look similar as for the randomly generated scale-
free networks). For random deletion of vertices no percolation threshold can be
observed: the system shows a behavior known as graceful degradation. In case of
attacks, we see the same breakdown scenario as for the Poisson network, with
an earlier percolation threshold fc ≈ 0.18.
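The failure and attack experiments can be re-run on any graph in a few lines; the sketch below (Python; degree-based attack versus uniformly random failure) removes a fraction f of the vertices and reports frag1, the relative size of the largest remaining component. It mimics the spirit of the study, not its original code.

```python
import random
from collections import deque

def attack_or_failure(adj, f, attack=True, seed=0):
    """Delete a fraction f of the vertices and return frag1 of what remains.

    attack=True deletes highest-degree vertices first ('attack'),
    attack=False deletes uniformly random vertices ('failure').
    adj: dict node -> set of neighbors (a copy is modified internally).
    """
    rng = random.Random(seed)
    adj = {u: set(vs) for u, vs in adj.items()}
    k = int(f * len(adj))
    if attack:
        victims = sorted(adj, key=lambda u: len(adj[u]), reverse=True)[:k]
    else:
        victims = rng.sample(list(adj), k)
    for u in victims:
        for v in adj.pop(u):
            if v in adj:
                adj[v].discard(u)
    if not adj:
        return 0.0
    # size of the largest connected component via breadth-first search
    unseen, largest = set(adj), 0
    while unseen:
        queue, size = deque([unseen.pop()]), 1
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in unseen:
                    unseen.remove(v)
                    queue.append(v)
                    size += 1
        largest = max(largest, size)
    return largest / len(adj)
```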
Fig. 15.7. Changes in fragmentation frag = (frag1 , frag2 ) of random networks (Poisson
degree distribution: P, scale-free degree distribution: SF) after randomly removing f |V |
vertices. The inset in the upper right corner shows the scenario for the full range of
deletions in scale-free networks (source: [17])
Fig. 15.8. Breakdown scenarios of networks with Poisson degree and scale-free distri-
bution (source: [17])
In summary the experimental study shows that scale-free networks are tol-
erant against random failures but highly sensitive to targeted attacks. Since the
Internet is believed to have a scale-free structure, the findings confirm the vul-
nerability of this network which is often paraphrased as the ‘Achilles heel of the
Internet’.
Broder et al. study the structure of the web more thoroughly and come to
the conclusion that the web has a ‘bow tie structure’ as depicted in Figure 4.1
on page 77 in Chapter 3 [102]. Their experimental results on the web graph W
reveal that the world wide web is robust against attacks. Deleting all vertices
{v ∈ V (W ) | d− (v) ≥ 5} does not decrease the size of the largest component
dramatically, it still contains approximately 30% of the vertices. This apparent
contradiction to the results of Albert et al. can be explained by the fact that this
is still below the percolation threshold and is thus just another way to look at
the same data: while ‘deleting all vertices with high degree’ sounds drastic, this
is still a set of small cardinality.
A number of application-oriented papers use the average connected distance
and fragmentation as the measures of choice in order to show the robustness
properties of the corresponding network. For example, Jeong et al. study the
protein interaction network of the yeast proteome (S. cerevisiae) and show that
it is robust against random mutations of proteins but susceptible to the destruc-
tion of the highest degree proteins [327]. Using average connected distance and
fragmentation to study epidemic propagation networks leads to the advice to
take care of the hubs first, when it comes to deciding a vaccination strategy (see,
e.g., [469]).
Holme et al. [305] study slightly more complex attacks on networks. Besides
attacks on vertices they also consider deleting edges and choose betweenness
centrality as an alternative selection criterion for deletion. In addition, they
investigate to what extent recalculating the selection criteria after each deletion alters
the results. They show empirically that attacks based on recalculated values are
more effective.
On the theoretical side Cohen et al. [130] and, independently, Callaway et
al. [108] study the fragmentation process on scale-free networks analytically.
While the first team of authors uses percolation theory, Callaway and his col-
leagues obtain more general results for arbitrary degree distributions using gener-
ating functions (see Section 13.2.2 in Chapter 13). The theoretical analyses con-
firm the results of the empirical studies and yield the same percolation thresholds
as shown in the figures above.
Fig. 15.9. Balanced-cut resilience for an example graph. Balanced cut shown for each
vertex for (a) 1-neighborhoods, (b) 2-neighborhoods, and (c) 3-neighborhoods
Computing a minimum balanced cut is NP-hard [240] and thus the draw-
back of this statistic is certainly its computational complexity, which makes it
impractical for large networks. There are, however, a number of heuristics that
yield reasonably good values so that the balanced-cut resilience can at least be
estimated. Karypis and Kumar [348], for instance, propose a multilevel parti-
tioning heuristic that runs in time O(m), where m is the number of edges in the
network.
The effective diameter diam_eff(r) of a network is the smallest h such that the
number of pairs within an h-neighborhood is at least r times the total number of
reachable pairs; see also Chapter 11. In the case that this distribution P(h) of pairs
over neighborhood sizes follows the power law P(h) = (n + 2m)h^H, the value H is
also referred to as the hop-plot exponent.
The authors perform experiments on the network of approximately 285,000
routers in the Internet to investigate to what extent and under which circumstances
the effective diameter of the router network changes. The experiments consist
of deleting either edges or vertices of the network and recomputing the effective
diameter diameff after each deletion, using a value of 0.9 for the parameter r.
Since an exact calculation of this statistic would take days, they exploit the
approximate neighborhood function described in Section 11.2.6 of Chapter 11.
Using these estimated values leads to a speed-up factor of 400.
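An exact, if slow, computation of the effective diameter follows directly from the definition: run a BFS from every vertex, count pairs per distance, and return the smallest h covering a fraction r of all reachable pairs. The sketch below (Python; exact BFS instead of the approximate neighborhood function used in the study) illustrates this.

```python
from collections import deque

def effective_diameter(adj, r=0.9):
    """Smallest h such that at least r * (number of reachable ordered pairs)
    lie within distance h of each other. Exact, via one BFS per vertex."""
    pairs_at = {}                       # distance -> number of ordered pairs
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    pairs_at[dist[v]] = pairs_at.get(dist[v], 0) + 1
                    queue.append(v)
    total = sum(pairs_at.values())      # all reachable ordered pairs (s != t)
    covered, h = 0, 0
    for h in sorted(pairs_at):
        covered += pairs_at[h]
        if covered >= r * total:
            return h
    return h
```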
Figures 15.10 and 15.11 show the effect of link and router failures on the
Internet graph. Confirming previous studies, the plots show that the Internet
is very robust against random failures but highly sensitive to failure of high
degree vertices. Also, deleting vertices with low effective eccentricity first rapidly
decreases the connectivity.
Fig. 15.10. Effect of edge deletions (link failures) on the network of 285,000 routers
(source: [462]). The set E′ denotes the deleted edges
Fig. 15.11. Effect of vertex deletions (router failures) on the network of 285,000 routers
(source: [462]). The set V′ denotes the deleted vertices
Fig. 15.13. The probability P (G, i) for members of F. The number of vertices in the
network, n, determines the height of the curve. Their vertex degree, k, determines the
offset on the abscissa
looked at two worst case distance statistics, namely the persistence and incre-
mental distance sequences. The second concept is more general than the first,
but for neither of them is a polynomial time algorithm known.
The main drawback of all the worst case statistics is that they make no state-
ments about the results of random edge- or vertex-failures. Therefore, we looked
at average robustness statistics in Section 15.3. The two statistics in this section
for which no polynomial algorithm is known (mean connectivity and balanced-
cut resilience) make statements about the network when edges fail while the
two other statistics (average distance/fragmentation and effective diameter) only
characterize the current state of a network. Hence, they are useful to measure
robustness properties of a network only if they are repeatedly evaluated after
successive edge deletions—either in an experiment or analytically.
In Section 15.4, we presented two statistics that give the probability that
the network under consideration is still connected after the random failure of
edges or vertices. The reliability polynomial gives the probability that the graph
is connected given a failure probability for the edges while the probabilistic
resilience for a network and a number i is the probability that the network
disconnects after exactly i failures. There is no polynomial time algorithm known
to compute either of these two statistics for general graphs.
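Although no polynomial-time algorithm is known for either statistic, the reliability polynomial can at least be estimated by simulation; a minimal Monte Carlo sketch (Python; independent edge failures with probability q, connectivity checked by breadth-first search) follows.

```python
import random
from collections import deque

def estimate_reliability(nodes, edges, q, trials=10000, seed=0):
    """Estimate the probability that the graph stays connected when every
    edge fails independently with probability q."""
    rng = random.Random(seed)
    nodes = list(nodes)
    connected = 0
    for _ in range(trials):
        adj = {v: [] for v in nodes}
        for u, v in edges:
            if rng.random() > q:            # the edge survives with probability 1 - q
                adj[u].append(v)
                adj[v].append(u)
        seen, queue = {nodes[0]}, deque([nodes[0]])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        connected += (len(seen) == len(nodes))
    return connected / trials
```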
The ideal statistics for describing the robustness of a complex network de-
pend on the application and the type of the failures that are expected. If a
network ceases to be useful after it is disconnected, statistics that describe the
connectivity of the graph are best suited. If distances between vertices must be
small, diameter-based statistics are preferable.
For random failures, the average and probabilistic statistics are the most
promising while the effects of deliberate attacks are best captured by worst case
statistics. So the ideal measure for deliberate attacks seems to be generalized
connectivity but this has the drawback that it is hard to compute. A probabilistic
version of generalized connectivity would be ideal for random failures.
In practice, an experimental approach to robustness seems to be most use-
ful. The simultaneous observation of changes in average connected distance and
fragmentation is suitable in many cases. One of the central results regarding
robustness is certainly that scale-free networks are on the one hand tolerant
against random failure but on the other hand exposed to intentional attacks.
Robustness is already a very complex topic but there are still many features of
real-world networks that we have not touched in this chapter. Examples include
the bandwidth of edges or the importance of vertices in an application as well
as routing protocols and delay on edges.
Another interesting area is networks where the failures of elements are not
independent of each other. In power networks for example, the failure of a power
line puts more stress on other lines and thus makes their failure more likely,
which might cause a domino effect.
At the moment, there are no deterministic polynomial algorithms that can
answer meaningful questions about the robustness of complex real-world net-
15 Robustness and Resilience 437
works. If there are no major theoretic breakthroughs the most useful tools in
this field will be simulations and heuristics.
Acknowledgments. The authors thank the editors, the co-authors of this book,
and the anonymous referee for valuable comments.
Bibliography
1. Serge Abiteboul, Mihai Preda, and Gregory Cobena. Adaptive on-line page im-
portance computation. In Proceedings of the 12th International World Wide Web
Conference (WWW12), pages 280–290, Budapest, Hungary, 2003.
2. Forman S. Acton. Numerical Methods that Work. Mathematical Association of
America, 1990.
3. Alan Agresti. Categorical Data Analysis. Wiley, 2nd edition, 2002.
4. Alfred V. Aho, John E. Hopcroft, and Jeffrey D. Ullman. The Design and Anal-
ysis of Computer Algorithms. Addison-Wesley, 1974.
5. Alfred V. Aho, John E. Hopcroft, and Jeffrey D. Ullman. Data Structures and
Algorithms. Addison-Wesley, 1983.
6. Ravindra K. Ahuja, Thomas L. Magnanti, and James B. Orlin. Network Flows:
Theory, Algorithms, and Applications. Prentice Hall, 1993.
7. Ravindra K. Ahuja and James B. Orlin. A fast and simple algorithm for the max-
imum flow problem. Operations Research, 37(5):748–759, September/October
1989.
8. Ravindra K. Ahuja and James B. Orlin. Distance-based augmenting path algo-
rithms for the maximum flow and parametric maximum flow problems. Naval
Research Logistics Quarterly, 38:413–430, 1991.
9. William Aiello, Fan R. K. Chung, and Linyuan Lu. A random graph model for
massive graphs. In Proceedings of the 32nd Annual ACM Symposium on the
Theory of Computing (STOC’00), pages 171–180, May 2000.
10. Martin Aigner. Combinatorial Theory. Springer-Verlag, 1999.
11. Martin Aigner and Eberhard Triesch. Realizability and uniqueness in graphs.
Discrete Mathematics, 136:3–20, 1994.
12. Donald Aingworth, Chandra Chekuri, and Rajeev Motwani. Fast estimation
of diameter and shortest paths (without matrix multiplication). In Proceedings
of the 7th Annual ACM–SIAM Symposium on Discrete Algorithms (SODA’96),
1996.
13. Jin Akiyama, Francis T. Boesch, Hiroshi Era, Frank Harary, and Ralph Tindell.
The cohesiveness of a point of a graph. Networks, 11(1):65–68, 1981.
14. Richard D. Alba. A graph theoretic definition of a sociometric clique. Journal
of Mathematical Sociology, 3:113–126, 1973.
15. Réka Albert and Albert-László Barabási. Statistical mechanics of complex net-
works. Reviews of Modern Physics, 74(1):47–97, 2002.
16. Réka Albert, Hawoong Jeong, and Albert-László Barabási. Diameter of the world
wide web. Nature, 401:130–131, September 1999.
17. Réka Albert, Hawoong Jeong, and Albert-László Barabási. Error and attack
tolerance of complex networks. Nature, 406:378–382, July 2000.
18. Mark S. Aldenderfer and Roger K. Blashfield. Cluster Analysis. Sage, 1984.
19. Noga Alon. Eigenvalues and expanders. Combinatorica, 6(2):83–96, 1986.
20. Noga Alon. Generating pseudo-random permutations and maximum flow algo-
rithms. Information Processing Letters, 35(4):201–204, 1990.
21. Noga Alon, Fan R. K. Chung, and Ronald L. Graham. Routing permutations on
graphs via matchings. SIAM Journal on Discrete Mathematics, 7:513–530, 1994.
22. Noga Alon, Michael Krivelevich, and Benny Sudakov. Finding a large hidden
clique in a random graph. Random Structures and Algorithms, 13(3–4):457–466,
1998.
23. Noga Alon and Vitali D. Milman. λ1 , isoperimetric inequalities for graphs, and
superconcentrators. Journal of Combinatorial Theory Series B, 38:73–88, 1985.
24. Noga Alon and Joel Spencer. The Probabilistic Method. Wiley, 1992.
25. Noga Alon, Joel Spencer, and Paul Erdős. The Probabilistic Method. Wiley, 1992.
26. Noga Alon, Raphael Yuster, and Uri Zwick. Finding and counting given length
cycles. Algorithmica, 17(3):209–223, 1997.
27. Charles J. Alpert and Andrew B. Kahng. Recent directions in netlist partitioning:
A survey. Integration: The VLSI Journal, 19(1-2):1–81, 1995.
28. Ashok T. Amin and S. Louis Hakimi. Graphs with given connectivity and inde-
pendence number or networks with given measures of vulnerability and surviv-
ability. IEEE Transactions on Circuit Theory, 20(1):2–10, 1973.
29. Carolyn J. Anderson, Stanley Wasserman, and Bradley Crouch. A p∗ primer:
Logit models for social networks. Social Networks, 21(1):37–66, January 1999.
30. Carolyn J. Anderson, Stanley Wasserman, and Katherine Faust. Building
stochastic blockmodels. Social Networks, 14:137–161, 1992.
31. James G. Anderson and Stephen J. Jay. The diffusion of medical technology:
Social network analysis and policy research. The Sociological Quarterly, 26:49–
64, 1985.
32. Jacob M. Anthonisse. The rush in a directed graph. Technical Report BN 9/71,
Stichting Mathematisch Centrum, 2e Boerhaavestraat 49 Amsterdam, October
1971.
33. Arvind Arasu, Jasmine Novak, Andrew S. Tomkins, and John Tomlin. PageRank
computation and the structure of the web: experiments and algorithms. short
version appeared in Proceedings of the 11th International World Wide Web Con-
ference, Poster Track, November 2001.
34. Ludwig Arnold. On the asymptotic distribution of the eigenvalues of random
matrices. Journal of Mathematical Analysis and Applications, 20:262–268, 1967.
35. Sanjeev Arora, David R. Karger, and Marek Karpinski. Polynomial time approx-
imation schemes for dense instances of NP-hard problems. Journal of Computer
and System Sciences, 58(1):193–210, 1999.
36. Sanjeev Arora, Satish Rao, and Umesh Vazirani. Expander flows, geometric
embeddings and graph partitioning. In Proceedings of the 36th Annual ACM
Symposium on the Theory of Computing (STOC’04), pages 222–231. ACM Press,
2004.
37. Yuichi Asahiro, Refael Hassin, and Kazuo Iwama. Complexity of finding dense
subgraphs. Discrete Applied Mathematics, 121(1–3):15–26, 2002.
38. Yuichi Asahiro, Kazuo Iwama, Hisao Tamaki, and Takeshi Tokuyama. Greedily
finding a dense subgraph. Journal of Algorithms, 34(2):203–221, 2000.
39. Giorgio Ausiello, Pierluigi Crescenzi, Giorgio Gambosi, Viggo Kann, and Al-
berto Marchetti-Spaccamela. Complexity and Approximation - Combinatorial
Optimization Problems and Their Approximability Properties. Springer-Verlag,
2nd edition, 2002.
40. Albert-László Barabási and Réka Albert. Emergence of scaling in random net-
works. Science, 286:509–512, 1999.
41. Alain Barrat and Martin Weigt. On the properties of small-world network models.
The European Physical Journal B, 13:547–560, 2000.
42. Vladimir Batagelj. Notes on blockmodeling. Social Networks, 19(2):143–155,
April 1997.
43. Vladimir Batagelj and Ulrik Brandes. Efficient generation of large random net-
works. Physical Review E, 2005. To appear.
44. Vladimir Batagelj and Anuška Ferligoj. Clustering relational data. In Wolf-
gang Gaul, Otto Opitz, and Martin Schader, editors, Data Analysis, pages 3–15.
Springer-Verlag, 2000.
45. Vladimir Batagelj, Anuška Ferligoj, and Patrick Doreian. Generalized block-
modeling. Informatica: An International Journal of Computing and Informatics,
23:501–506, 1999.
46. Vladimir Batagelj and Andrej Mrvar. Pajek – A program for large network
analysis. Connections, 21(2):47–57, 1998.
47. Vladimir Batagelj and Matjaž Zaveršnik. An O(m) algorithm for cores decom-
position of networks. Technical Report 798, IMFM Ljubljana, Ljubljana, 2002.
48. Douglas Bauer, S. Louis Hakimi, and Edward F. Schmeichel. Recognizing tough
graphs is NP-hard. Discrete Applied Mathematics, 28:191–195, 1990.
49. Michel Bauer and Olivier Golinelli. Random incidence matrices: moments of
the spectral density. Journal of Statistical Physics, 103:301–307, 2001. arXiv
cond-mat/0007127.
50. Alex Bavelas. A mathematical model for group structure. Human Organizations,
7:16–30, 1948.
51. Alex Bavelas. Communication patterns in task oriented groups. Journal of the
Acoustical Society of America, 22:271–282, 1950.
52. Murray A. Beauchamp. An improved index of centrality. Behavioral Science,
10:161–163, 1965.
53. M. Becker, W. Degenhardt, Jürgen Doenhardt, Stefan Hertel, G. Kaninke, W. Ke-
ber, Kurt Mehlhorn, Stefan Näher, Hans Rohnert, and Thomas Winter. A prob-
abilistic algorithm for vertex connectivity of graphs. Information Processing
Letters, 15(3):135–136, October 1982.
54. Richard Beigel. Finding maximum independent sets in sparse and general graphs.
In Proceedings of the 10th Annual ACM–SIAM Symposium on Discrete Algo-
rithms (SODA’99), pages 856–857. IEEE Computer Society Press, 1999.
55. Lowell W. Beineke and Frank Harary. The connectivity function of a graph.
Mathematika, 14:197–202, 1967.
56. Lowell W. Beineke, Ortrud R. Oellermann, and Raymond E. Pippert. The average
connectivity of a graph. Discrete Mathematics, 252(1):31–45, May 2002.
57. Claude Berge. Graphs. North-Holland, 3rd edition, 1991.
58. Noam Berger, Béla Bollobás, Christian Borgs, Jennifer Chayes, and Oliver M.
Riordan. Degree distribution of the FKP network model. In Proceedings of
the 30th International Colloquium on Automata, Languages, and Programming
(ICALP’03), pages 725–738, 2003.
59. Julian E. Besag. Spatial interaction and the statistical analysis of lattice systems
(with discussion). Journal of the Royal Statistical Society, Series B, 36:196–236,
1974.
60. Sergej Bezrukov, Robert Elsässer, Burkhard Monien, Robert Preis, and Jean-
Pierre Tillich. New spectral lower bounds on the bisection width of graphs.
Theoretical Computer Science, 320:155–174, 2004.
61. Monica Bianchini, Marco Gori, and Franco Scarselli. Inside PageRank. ACM
Transactions on Internet Technology, 2004. in press.
62. Robert E. Bixby. The minimum number of edges and vertices in a graph with
edge connectivity n and m n-bonds. Bulletin of the American Mathematical
Society, 80(4):700–704, 1974.
63. Robert E. Bixby. The minimum number of edges and vertices in a graph with
edge connectivity n and m n-bonds. Networks, 5:253–298, 1981.
64. Francis T. Boesch, Frank Harary, and Jerald A. Kabell. Graphs as models of
communication network vulnerability: Connectivity and persistence. Networks,
11:57–63, 1981.
65. Francis T. Boesch and R. Emerson Thomas. On graphs of invulnerable commu-
nication nets. IEEE Transactions on Circuit Theory, CT-17, 1970.
66. Béla Bollobás. Extremal graph theory. Academic Press, 1978.
67. Béla Bollobás. Modern Graph Theory, volume 184 of Graduate Texts in Mathe-
matics. Springer-Verlag, 1998.
68. Béla Bollobás and Oliver M. Riordan. Mathematical results on scale-free random
graphs. In Stefan Bornholdt and Heinz Georg Schuster, editors, Handbook of
Graphs and Networks: From the Genome to the Internet, pages 1–34. Wiley-
VCH, 2002.
69. Béla Bollobás, Oliver M. Riordan, Joel Spencer, and Gábor Tusnády. The de-
gree sequence of a scale-free random graph process. Random Structures and
Algorithms, 18:279–290, 2001.
70. Immanuel M. Bomze, Marco Budinich, Panos M. Pardalos, and Marcello Pelillo.
The maximum clique problem. In Ding-Zhu Du and Panos M. Pardalos, edi-
tors, Handbook of Combinatorial Optimization (Supplement Volume A), volume 4,
pages 1–74. Kluwer Academic Publishers Group, 1999.
71. Phillip Bonacich. Factoring and weighting approaches to status scores and clique
identification. Journal of Mathematical Sociology, 2:113–120, 1972.
72. Phillip Bonacich. Power and centrality: A family of measures. American Journal
of Sociology, 92(5):1170–1182, 1987.
73. Phillip Bonacich. What is a homomorphism? In Linton Clarke Freeman, Dou-
glas R. White, and A. Kimbal Romney, editors, Research Methods in Social Net-
work Analysis, chapter 8, pages 255–293. George Mason University Press, 1989.
74. J. Bond and Claudine Peyrat. Diameter vulnerability in networks. In Yousef
Alavi, Gary Chartrand, Linda Lesniak, Don R. Lick, and Curtiss E. Wall, editors,
Graph Theory with Applications to Algorithms and Computer Science, pages 123–
149. Wiley, 1985.
75. Robert R. Boorstyn and Howard Frank. Large scale network topological opti-
mization. IEEE Transaction on Communications, Com-25:29–37, 1977.
76. Kellogg S. Booth. Problems polynomially equivalent to graph isomorphism. Tech-
nical report, CS-77-04, University of Ljubljana, 1979.
77. Kellogg S. Booth and George S. Lueker. Linear algorithms to recognize interval
graphs and test for the consecutive ones property. Proceedings of the 7th Annual
ACM Symposium on the Theory of Computing (STOC’75), pages 255–265, 1975.
78. Ravi B. Boppana. Eigenvalues and graph bisection: an average case analysis. In
Proceedings of the 28th Annual IEEE Symposium on Foundations of Computer
Science (FOCS’87), pages 280–285, October 1987.
79. Ravi B. Boppana and Magnús M. Halldórsson. Approximating maximum inde-
pendent sets by excluding subgraphs. BIT, 32(2):180–196, 1992.
80. Ravi B. Boppana, Johan Håstad, and Stathis Zachos. Does co-NP have short
interactive proofs? Information Processing Letters, 25:127–132, 1987.
81. Stephen P. Borgatti. Centrality and AIDS. Connections, 18(1):112–115, 1995.
82. Stephen P. Borgatti and Martin G. Everett. The class of all regular equivalences:
Algebraic structure and computation. Social Networks, 11(1):65–88, 1989.
83. Stephen P. Borgatti and Martin G. Everett. Two algorithms for computing
regular equivalence. Social Networks, 15(4):361–376, 1993.
84. Stephen P. Borgatti and Martin G. Everett. Models of core/periphery structures.
Social Networks, 21(4):375–395, 1999.
85. Stephen P. Borgatti and Martin G. Everett. A graph-theoretic perspective on
centrality. Unpublished manuscript, 2004.
86. Stephen P. Borgatti, Martin G. Everett, and Paul R. Shirey. LS sets, lambda
sets and other cohesive subsets. Social Networks, 12(4):337–357, 1990.
87. Allan Borodin, Gareth O. Roberts, Jeffrey S. Rosenthal, and Panayiotis Tsaparas.
Finding authorities and hubs from link structures on the world wide web. In
Proceedings of the 10th International World Wide Web Conference (WWW10),
pages 415–429, Hong Kong, 2001.
88. Rodrigo A. Botafogo, Ehud Rivlin, and Ben Shneiderman. Structural analysis
of hypertexts: Identifying hierarchies and useful metrics. ACM Transactions on
Information Systems, 10(2):142–180, 1992.
89. John P. Boyd. Social Semigroups. George Mason University Press, 1991.
90. John P. Boyd and Martin G. Everett. Relations, residuals, regular interiors, and
relative regular equivalence. Social Networks, 21(2):147–165, April 1999.
91. Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge Uni-
versity Press, 2004.
92. Ulrik Brandes. A faster algorithm for betweenness centrality. Journal of Mathe-
matical Sociology, 25(2):163–177, 2001.
93. Ulrik Brandes and Sabine Cornelsen. Visual ranking of link structures. Journal
of Graph Algorithms and Applications, 7(2):181–201, 2003.
94. Ulrik Brandes and Daniel Fleischer. Centrality measures based on current flow.
In Proceedings of the 22nd International Symposium on Theoretical Aspects of
Computer Science (STACS’05), volume 3404 of Lecture Notes in Computer Sci-
ence, 2005. To appear.
95. Ulrik Brandes, Marco Gaertler, and Dorothea Wagner. Experiments on graph
clustering algorithms. In Proceedings of the 11th Annual European Symposium on
Algorithms (ESA’03), volume 2832 of Lecture Notes in Computer Science, pages
568–579, September 2003.
96. Ulrik Brandes, Patrick Kenis, and Dorothea Wagner. Communicating centrality
in policy network drawings. IEEE Transactions on Visualization and Computer
Graphics, 9(2):241–253, 2003.
97. Ulrik Brandes and Jürgen Lerner. Structural similarity in graphs. In Pro-
ceedings of the 15th International Symposium on Algorithms and Computation
(ISAAC’04), volume 3341 of Lecture Notes in Computer Science, pages 184–195,
2004.
98. Ronald L. Breiger. Toward an operational theory of community elite structures.
Quality and Quantity, 13:21–57, 1979.
99. Ronald L. Breiger, Scott A. Boorman, and Phipps Arabie. An algorithm for clus-
tering relational data with applications to social network analysis and comparison
with multidimensional scaling. Journal of Mathematical Psychology, 12:328–383,
1975.
100. Ronald L. Breiger and James G. Ennis. Personae and social roles: The network
structure of personality types in small groups. The Sociological Quarterly, 42:262–
270, 1979.
101. Sergey Brin and Lawrence Page. The anatomy of a large-scale hypertextual Web
search engine. Computer Networks and ISDN Systems, 30(1–7):107–117, 1998.
102. Andrei Broder, Ravi Kumar, Farzin Maghoul, Prabhakar Raghavan, Sridhar Ra-
jagopalan, Raymie Stata, Andrew S. Tomkins, and Janet Wiener. Graph struc-
ture in the Web. Computer Networks: The International Journal of Computer
and Telecommunications Networking, 33(1–6):309–320, 2000.
103. Coen Bron and Joep A. G. M. Kerbosch. Algorithm 457: Finding all cliques of
an undirected graph. Communications of the ACM, 16(9):575–577, 1973.
104. Tian Bu and Don Towsley. On distinguishing between Internet power law topol-
ogy generators. In Proceedings of Infocom’02, 2002.
105. Mihai Bădoiu. Approximation algorithm for embedding metrics into a two-
dimensional space. In Proceedings of the 14th Annual ACM–SIAM Symposium
on Discrete Algorithms (SODA’03), pages 434–443, 2003.
106. Horst Bunke and Kim Shearer. A graph distance metric based on the maximal
common subgraph. Pattern Recognition Letters, 19:255–259, 1998.
107. Ronald S. Burt. Positions in networks. Social Forces, 55:93–122, 1976.
108. Duncan S. Callaway, Mark E. J. Newman, Steven H. Strogatz, and Duncan J.
Watts. Network robustness and fragility: Percolation on random graphs. Physical
Review Letters, 25(85):5468–5471, December 2000.
109. Kenneth L. Calvert, Matthew B. Doar, and Ellen W. Zegura. Modeling Internet
topology. IEEE Communications Magazine, 35:160–163, June 1997.
110. A. Cardon and Maxime Crochemore. Partitioning a graph in O(|A| log_2 |V|).
Theoretical Computer Science, 19:85–98, 1982.
111. Tami Carpenter, George Karakostas, and David Shallcross. Practical Issues and
Algorithms for Analyzing Terrorist Networks. Invited talk at WMC 2002, 2002.
112. Peter J. Carrington, Greg H. Heil, and Stephen D. Berkowitz. A goodness-of-fit
index for blockmodels. Social Networks, 2:219–234, 1980.
113. Moses Charikar. Greedy approximation algorithms for finding dense components
in a graph. In Proceedings of the 3rd International Workshop on Approximation
Algorithms for Combinatorial Optimization (APPROX’00), volume 1931 of Lec-
ture Notes in Computer Science, pages 84–95. Springer-Verlag, 2000.
114. Gary Chartrand. A graph-theoretic approach to a communications problem.
SIAM Journal on Applied Mathematics, 14(5):778–781, July 1966.
115. Gary Chartrand, Gary L. Johns, Songlin Tian, and Steven J. Winters. Directed
distance on digraphs: Centers and medians. Journal of Graph Theory, 17(4):509–
521, 1993.
116. Gary Chartrand, Grzegorz Kubicki, and Michelle Schultz. Graph similarity and
distance in graphs. Aequationes Mathematicae, 55(1-2):129–145, 1998.
117. Qian Chen, Hyunseok Chang, Ramesh Govindan, Sugih Jamin, Scott Shenker,
and Walter Willinger. The origin of power laws in internet topologies revisited.
In Proceedings of Infocom’02, 2002.
118. Joseph Cheriyan and Torben Hagerup. A randomized maximum-flow algorithm.
SIAM Journal on Computing, 24(2):203–226, 1995.
119. Joseph Cheriyan, Torben Hagerup, and Kurt Mehlhorn. An o(n^3)-time
maximum-flow algorithm. SIAM Journal on Computing, 25(6):1144–1170, De-
cember 1996.
120. Joseph Cheriyan and John H. Reif. Directed s-t numberings, rubber bands, and
testing digraph k-vertex connectivity. In Proceedings of the 3rd Annual ACM–
SIAM Symposium on Discrete Algorithms (SODA’92), pages 335–344, January
1992.
121. Joseph Cheriyan and Ramakrishna Thurimella. Fast algorithms for k-shredders
and k-node connectivity augmentation. Journal of Algorithms, 33:15–50, 1999.
122. Boris V. Cherkassky. An algorithm for constructing a maximal flow through a
network requiring O(n^2 √p) operations. Mathematical Methods for Solving Eco-
nomic Problems, 7:117–126, 1977. (In Russian).
123. Boris V. Cherkassky. A fast algorithm for constructing a maximum flow through a
network. In Selected Topics in Discrete Mathematics: Proceedings of the Moscow
Discrete Mathematics Seminar, 1972-1990, volume 158 of American Mathemat-
ical Society Translations – Series 2, pages 23–30. AMS, 1994.
124. Steve Chien, Cynthia Dwork, Ravi Kumar, and D. Sivakumar. Towards exploit-
ing link evolution. In Workshop on Algorithms and Models for the Web Graph,
November 2002.
125. Fan R. K. Chung. Spectral Graph Theory. CBMS Regional Conference Series in
Mathematics. American Mathematical Society, 1997.
126. Fan R. K. Chung, Vance Faber, and Thomas A. Manteuffel. An upper bound
on the diameter of a graph from eigenvalues associated with its Laplacian. SIAM
Journal on Discrete Mathematics, 7(3):443–457, 1994.
127. Fan R. K. Chung, Linyuan Lu, and Van Vu. Eigenvalues of random power law
graphs. Annals of Combinatorics, 7:21–33, 2003.
128. Fan R. K. Chung, Linyuan Lu, and Van Vu. The spectra of random graphs with
given expected degrees. Proceedings of the National Academy of Science of the
United States of America, 100(11):6313–6318, May 2003.
129. Vašek Chvátal. Tough graphs and hamiltonian circuits. Discrete Mathematics,
5, 1973.
130. Reuven Cohen, Keren Erez, Daniel ben Avraham, and Shlomo Havlin. Resilience
of the Internet to random breakdowns. Physical Review Letters, 21(85):4626–4628,
November 2000.
131. Colin Cooper and Alan M. Frieze. A general model of web graphs. Random
Structures and Algorithms, 22:311–335, 2003.
132. Don Coppersmith and Shmuel Winograd. Matrix multiplication via arithmetic
progressions. Journal of Symbolic Computation, 9(3):251–280, 1990.
133. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein.
Introduction to Algorithms. MIT Press, 2nd edition, 2001.
134. Trevor F. Cox and Michael A. A. Cox. Multidimensional Scaling. Monographs
on Statistics and Applied Probability. Chapman & Hall/CRC, 2nd edition, 2001.
135. Dragoš M. Cvetković, Michael Doob, and Horst Sachs. Spectra of Graphs. Johann
Ambrosius Barth Verlag, 1995.
136. Dragoš M. Cvetković, Peter Rowlinson, and Slobodan Simić. Eigenspaces of
Graphs. Cambridge University Press, 1997.
137. Andrzej Czygrinow. Maximum dispersion problem in dense graphs. Operations
Research Letters, 27(5):223–227, 2000.
138. Peter Dankelmann and Ortrud R. Oellermann. Bounds on the average connec-
tivity of a graph. Discrete Applied Mathematics, 129:305–318, August 2003.
139. George B. Dantzig. Application of the simplex method to a transportation prob-
lem. In Tjalling C. Koopmans, editor, Activity Analysis of Production and Al-
location, volume 13 of Cowles Commission for Research in Economics, pages
359–373. Wiley, 1951.
140. George B. Dantzig. Maximization of a linear function of variables subject to linear
inequalities. In Tjalling C. Koopmans, editor, Activity Analysis of Production and
Allocation, volume 13 of Cowles Commission for Research in Economics, pages
339–347. Wiley, 1951.
141. George B. Dantzig and Delbert R. Fulkerson. On the max-flow min-cut theorem
of networks. In Linear Inequalities and Related Systems, volume 38 of Annals of
Mathematics Studies, pages 215–221. Princeton University Press, 1956.
142. Camil Demetrescu and Giuseppe F. Italiano. A new approach to dynamic all
pairs shortest paths. In Proceedings of the 35th Annual ACM Symposium on the
Theory of Computing (STOC’03), pages 159–166, June 2003.
143. Giuseppe Di Battista and Roberto Tamassia. Incremental planarity testing. In
Proceedings of the 30th Annual IEEE Symposium on Foundations of Computer
Science (FOCS’89), pages 436–441, October/November 1989.
144. Giuseppe Di Battista and Roberto Tamassia. On-line maintenance of tricon-
nected components with SPQR-trees. Algorithmica, 15:302–318, 1996.
145. Reinhard Diestel. Graph Theory. Graduate Texts in Mathematics. Springer-
Verlag, 2nd edition, 2000.
146. Edsger W. Dijkstra. A note on two problems in connection with graphs. Nu-
merische Mathematik, 1:269–271, 1959.
147. Stephen Dill, Ravi Kumar, Kevin S. McCurley, Sridhar Rajagopalan, D. Sivaku-
mar, and Andrew S. Tomkins. Self-similarity in the web. ACM Transactions on
Internet Technology, 2(3):205–223, August 2002.
148. Chris H. Q. Ding, Xiaofeng He, Parry Husbands, Hongyuan Zha, and Horst D.
Simon. PageRank, HITS and a unified framework for link analysis. LBNL Tech
Report 49372, NERSC Division, Lawrence Berkeley National Laboratory, Uni-
versity of California, Berkeley, CA, USA, November 2001. updated Sept. 2002
(LBNL-50007), presented in the poster session of the Third SIAM International
Conference on Data Mining, San Francisco, CA, USA, 2003.
149. Chris H. Q. Ding, Hongyuan Zha, Xiaofeng He, Parry Husbands, and Horst D.
Simon. Link analysis: Hubs and authorities on the world wide web. SIAM Review,
46(2), 2004. To appear; published electronically May 3, 2004.
150. Yefim Dinitz. Algorithm for solution of a problem of maximum flow in a network
with power estimation. Soviet Mathematics-Doklady, 11(5):1277–1280, 1970.
151. Yefim Dinitz. Bitwise residual decreasing method and transportation type prob-
lems. In A. A. Fridman, editor, Studies in Discrete Mathematics, pages 46–57.
Nauka, 1973. (In Russian).
152. Yefim Dinitz. Finding shortest paths in a network. In Y. Popkov and B. Shmu-
lyian, editors, Transportation Modeling Systems, pages 36–44. Institute for Sys-
tem Studies, Moscow, 1978.
153. Yefim Dinitz, Alexander V. Karzanov, and M. V. Lomonosov. On the structure
of the system of minimum edge cuts in a graph. In A. A. Fridman, editor, Studies
in Discrete Optimization, pages 290–306. Nauka, 1976.
154. Yefim Dinitz and Ronit Nossenson. Incremental maintenance of the 5-edge-
connectivity classes of a graph. In Proceedings of the 7th Scandinavian Workshop
on Algorithm Theory (SWAT’00), volume 1851 of Lecture Notes in Computer Science,
pages 272–285. Springer-Verlag, July 2000.
155. Yefim Dinitz and Jeffery Westbrook. Maintaining the classes of 4-edge-
connectivity in a graph on-line. Algorithmica, 20(3):242–276, March 1998.
156. Gabriel A. Dirac. Extensions of Turán’s theorem on graphs. Acta Mathematica
Academiae Scientiarum Hungaricae, 14:417–422, 1963.
157. Matthew B. Doar. A better model for generating test networks. In IEEE GLOBE-
COM’96, 1996.
158. Wolfgang Domschke and Andreas Drexl. Location and Layout Planning: An
International Bibliography. Springer-Verlag, Berlin, 1985.
159. William E. Donath and Alan J. Hoffman. Lower bounds for the partitioning of
graphs. IBM Journal of Research and Development, 17(5):420–425, 1973.
160. Patrick Doreian. Using multiple network analytic tools for a single social network.
Social Networks, 10:287–312, 1988.
161. Patrick Doreian and Louis H. Albert. Partitioning political actor networks: Some
quantitative tools for analyzing qualitative networks. Journal of Quantitative
Anthropology, 1:279–291, 1989.
162. Patrick Doreian, Vladimir Batagelj, and Anuška Ferligoj. Symmetric-acyclic de-
compositions of networks. Journal of Classification, 17(1):3–28, 2000.
163. Patrick Doreian, Vladimir Batagelj, and Anuška Ferligoj. Generalized blockmod-
eling of two-mode network data. Social Networks, 26(1):29–53, 2004.
164. Sergey N. Dorogovtsev and Jose Ferreira F. Mendes. Evolution of networks.
Advances in Physics, 51(4):1079–1187, June 2002.
165. Sergey N. Dorogovtsev and Jose Ferreira F. Mendes. Evolution of Networks.
Oxford University Press, 2003.
166. Sergey N. Dorogovtsev, Jose Ferreira F. Mendes, and Alexander N. Samukhin.
Structure of growing networks: Exact solution of the Barabási-Albert’s model.
http://xxx.sissa.it/ps/cond-mat/0004434, April 2000.
190. Martin G. Everett and Stephen P. Borgatti. Role colouring a graph. Mathematical
Social Sciences, 21:183–188, 1991.
191. Martin G. Everett and Stephen P. Borgatti. Regular equivalence: General theory.
Journal of Mathematical Sociology, 18(1):29–52, 1994.
192. Martin G. Everett and Stephen P. Borgatti. Analyzing clique overlap. Connec-
tions, 21(1):49–61, 1998.
193. Martin G. Everett and Stephen P. Borgatti. Peripheries of cohesive subsets.
Social Networks, 21(4):397–407, 1999.
194. Martin G. Everett and Stephen P. Borgatti. Extending centrality. In Peter J.
Carrington, John Scott, and Stanley Wasserman, editors, Models and Methods in
Social Network Analysis. Cambridge University Press, 2005. To appear.
195. Martin G. Everett, Philip Sinclair, and Peter Dankelmann. Some centrality re-
sults new and old. Submitted, 2004.
196. Alex Fabrikant, Elias Koutsoupias, and Christos H. Papadimitriou. Heuristi-
cally optimized trade-offs: A new paradigm for power laws in the Internet. In
Proceedings of the 29th International Colloquium on Automata, Languages, and
Programming (ICALP’02), volume 2380 of Lecture Notes in Computer Science,
2002.
197. Michalis Faloutsos, Petros Faloutsos, and Christos Faloutsos. On power-law re-
lationships of the Internet topology. In Proceedings of SIGCOMM’99, 1999.
198. Illés Farkas, Imre Derényi, Albert-László Barabási, and Tamás Vicsek. Spectra
of “real-world” graphs: Beyond the semicircle law. Physical Review E, 64, August
2001.
199. Katherine Faust. Comparison of methods for positional analysis: Structural and
general equivalences. Social Networks, 10:313–341, 1988.
200. Katherine Faust and John Skvoretz. Logit models for affiliation networks. Soci-
ological Methodology, 29(1):253–280, 1999.
201. Uriel Feige, Guy Kortsarz, and David Peleg. The dense k-subgraph problem.
Algorithmica, 29(3):410–421, 2001.
202. Uriel Feige and Robert Krauthgamer. Finding and certifying a large hidden
clique in a semirandom graph. Random Structures and Algorithms, 16(2):195–
208, 2000.
203. Uriel Feige and Robert Krauthgamer. A polylogarithmic approximation of the
minimum bisection. SIAM Journal on Computing, 31(4):1090–1118, 2002.
204. Uriel Feige and Michael A. Seltser. On the densest k-subgraph problem. Technical
Report CS97-16, Department of Applied Mathematics and Computer Science,
The Weizmann Institute of Science, Rehovot, Israel, 1997.
205. Trevor Fenner, Mark Levene, and George Loizou. A stochastic evolu-
tionary model exhibiting power-law behaviour with an exponential cutoff.
http://xxx.sissa.it/ps/cond-mat/0209463, June 2004.
206. Anuška Ferligoj, Patrick Doreian, and Vladimir Batagelj. Optimizational ap-
proach to blockmodeling. Journal of Computing and Information Technology,
4:63–90, 1996.
207. Jean-Claude Fernandez. An implementation of an efficient algorithm for bisim-
ulation equivalence. Science of Computer Programming, 13(1):219–236, 1989.
208. William L. Ferrar. Finite Matrices. Oxford University Press, London, 1951.
209. Jiří Fiala and Daniël Paulusma. The computational complexity of the role assign-
ment problem. In Proceedings of the 30th International Colloquium on Automata,
Languages, and Programming (ICALP’03), pages 817–828. Springer-Verlag, 2003.
210. Miroslav Fiedler. Algebraic connectivity of graphs. Czechoslovak Mathematical
Journal, 23(98):289–305, 1973.
211. Miroslav Fiedler. A property of eigenvectors of nonnegative symmetric matrices
and its application to graph theory. Czechoslovak Mathematical Journal, 1:619–
633, 1975.
212. Stephen E. Fienberg and Stanley Wasserman. Categorical data analysis of a single
sociometric relation. In Samuel Leinhardt, editor, Sociological Methodology, pages
156–192. Jossey Bass, 1981.
213. Stephen E. Fienberg and Stanley Wasserman. Comment on an exponential family
of probability distributions. Journal of the American Statistical Association,
76(373):54–57, March 1981.
214. Philippe Flajolet, Kostas P. Hatzis, Sotiris Nikoletseas, and Paul Spirakis. On
the robustness of interconnections in random graphs: A symbolic approach. The-
oretical Computer Science, 287(2):515–534, September 2002.
215. Philippe Flajolet and G. Nigel Martin. Probabilistic counting algorithms for
data base applications. Journal of Computer and System Sciences, 31(2):182–
209, 1985.
216. Lisa Fleischer. Building chain and cactus representations of all minimum cuts
from Hao-Orlin in the same asymptotic run time. Journal of Algorithms,
33(1):51–72, October 1999.
217. Robert W. Floyd. Algorithm 97: Shortest path. Communications of the ACM,
5(6):345, 1962.
218. Lester R. Ford, Jr. and Delbert R. Fulkerson. Maximal flow through a network.
Canadian Journal of Mathematics, 8:399–404, 1956.
219. Lester R. Ford, Jr. and Delbert R. Fulkerson. A simple algorithm for finding
maximal network flows and an application to the Hitchcock problem. Canadian
Journal of Mathematics, 9:210–218, 1957.
220. Lester R. Ford, Jr. and Delbert R. Fulkerson. Flows in Networks. Princeton
University Press, 1962.
221. Scott Fortin. The graph isomorphism problem. Technical Report 96-20, Univer-
sity of Alberta, Edmonton, Canada, 1996.
222. Ove Frank and David Strauss. Markov graphs. Journal of the American Statistical
Association, 81:832–842, 1986.
223. Greg N. Frederickson. Ambivalent data structures for dynamic 2-edge-
connectivity and k smallest spanning trees. In Proceedings of the 32nd Annual
IEEE Symposium on Foundations of Computer Science (FOCS’91), pages 632–
641, October 1991.
224. Michael L. Fredman. New bounds on the complexity of the shortest path problem.
SIAM Journal on Computing, 5:49–60, 1975.
225. Michael L. Fredman and Dan E. Willard. Trans-dichotomous algorithms for
minimum spanning trees and shortest paths. Journal of Computer and System
Sciences, 48(3):533–551, 1994.
226. Linton Clarke Freeman. A set of measures of centrality based upon betweenness.
Sociometry, 40:35–41, 1977.
227. Linton Clarke Freeman. Centrality in social networks: Conceptual clarification I.
Social Networks, 1:215–239, 1979.
228. Linton Clarke Freeman. The Development of Social Network Analysis: A Study
in the Sociology of Science. Booksurge Publishing, 2004.
229. Linton Clarke Freeman, Stephen P. Borgatti, and Douglas R. White. Centrality in
valued graphs: A measure of betweenness based on network flow. Social Networks,
13(2):141–154, 1991.
230. Noah E. Friedkin. Structural cohesion and equivalence explanations of social
homogeneity. Sociological Methods and Research, 12:235–261, 1984.
231. Delbert R. Fulkerson and George B. Dantzig. Computation of maximal flows in
networks. Naval Research Logistics Quarterly, 2:277–283, 1955.
232. Delbert R. Fulkerson and G. C. Harding. On edge-disjoint branchings. Networks,
6(2):97–104, 1976.
233. Zoltán Füredi and János Komlós. The eigenvalues of random symmetric matrices.
Combinatorica, 1(3):233–241, 1981.
234. Harold N. Gabow. Scaling algorithms for network problems. Journal of Computer
and System Sciences, 31(2):148–168, 1985.
235. Harold N. Gabow. Path-based depth-first search for strong and biconnected
components. Information Processing Letters, 74:107–114, 2000.
236. Zvi Galil. An O(V^{5/3} E^{2/3}) algorithm for the maximal flow problem. Acta Infor-
matica, 14:221–242, 1980.
237. Zvi Galil and Giuseppe F. Italiano. Fully dynamic algorithms for edge connectiv-
ity problems. In Proceedings of the 23rd Annual ACM Symposium on the Theory
of Computing (STOC’91), pages 317–327, May 1991.
238. Zvi Galil and Amnon Naamad. An O(EV log^2 V) algorithm for the maximal
flow problem. Journal of Computer and System Sciences, 21(2):203–217, October
1980.
239. Giorgio Gallo, Michail D. Grigoriadis, and Robert E. Tarjan. A fast paramet-
ric maximum flow algorithm and applications. SIAM Journal on Computing,
18(1):30–55, 1989.
240. Michael R. Garey and David S. Johnson. Computers and Intractability. A Guide
to the Theory of NP-Completeness. W. H. Freeman and Company, 1979.
241. Michael R. Garey, David S. Johnson, and Larry J. Stockmeyer. Some simplified
NP-complete graph problems. Theoretical Computer Science, 1:237–267, 1976.
242. Christian Gawron. An iterative algorithm to determine the dynamic user equilib-
rium in a traffic simulation model. International Journal of Modern Physics C,
9(3):393–408, 1998.
243. Andrew Gelman, John B. Carlin, Hal S. Stern, and Donald B. Rubin. Bayesian
Data Analysis. Chapman & Hall Texts in Statistical Science. Chapman &
Hall/CRC, 2nd edition, June 1995.
244. Edgar N. Gilbert. Random graphs. The Annals of Mathematical Statistics,
30(4):1141–1144, 1959.
245. Walter R. Gilks, Sylvia Richardson, and David J. Spiegelhalter. Markov Chain
Monte Carlo in Practice. Interdisciplinary Statistics. Chapman & Hall/CRC,
1996.
246. Chris Godsil. Tools from linear algebra. Research report, University of Waterloo,
1989.
247. Chris Godsil and Gordon Royle. Algebraic Graph Theory. Graduate Texts in
Mathematics. Springer-Verlag, 2001.
248. Andrew V. Goldberg. Finding a maximum density subgraph. Technical Re-
port UCB/CSD 84/171, Department of Electrical Engineering and Computer
Science, University of California, Berkeley, CA, 1984.
249. Andrew V. Goldberg. A new max-flow algorithm. Technical Memo
MIT/LCS/TM-291, MIT Laboratory for Computer Science, November 1985.
250. Andrew V. Goldberg and Satish Rao. Beyond the flow decomposition barrier.
Journal of the ACM, 45(5):783–797, 1998.
251. Andrew V. Goldberg and Satish Rao. Flows in undirected unit capacity networks.
SIAM Journal on Discrete Mathematics, 12(1):1–5, 1999.
252. Andrew V. Goldberg and Robert E. Tarjan. A new approach to the maximum-
flow problem. Journal of the ACM, 35(4):921–940, 1988.
253. Andrew V. Goldberg and Kostas Tsioutsiouliklis. Cut tree algorithms: An ex-
perimental study. Journal of Algorithms, 38(1):51–83, 2001.
254. Donald L. Goldsmith. On the second order edge connectivity of a graph. Con-
gressus Numerantium, 29:479–484, 1980.
255. Donald L. Goldsmith. On the n-th order edge-connectivity of a graph. Congressus
Numerantium, 32:375–381, 1981.
256. Gene H. Golub and Charles F. Van Loan. Matrix Computations. Johns Hopkins
University Press, 3rd edition, 1996.
257. Ralph E. Gomory and T.C. Hu. Multi-terminal network flows. Journal of SIAM,
9(4):551–570, December 1961.
258. Ralph E. Gomory and T.C. Hu. Synthesis of a communication network. Journal
of SIAM, 12(2):348–369, 1964.
259. Ramesh Govindan and Anoop Reddy. An analysis of Internet inter-domain topol-
ogy and route stability. In Proceedings of Infocom’97, 1997.
260. Fabrizio Grandoni and Giuseppe F. Italiano. Decremental clique problem. In
Proceedings of the 30th International Workshop on Graph-Theoretical Concepts
in Computer Science (WG’04), Lecture Notes in Computer Science. Springer-
Verlag, 2004. To appear.
261. George Grätzer. General Lattice Theory. Birkhäuser Verlag, 1998.
262. Jerrold R. Griggs, Miklós Simonovits, and George Rubin Thomas. Extremal
graphs with bounded densities of small subgraphs. Journal of Graph Theory,
29(3):185–207, 1998.
263. Geoffrey Grimmett and Colin J. H. McDiarmid. On colouring random graphs.
Mathematical Proceedings of the Cambridge Philosophical Society, 77:313–324,
1975.
264. Dan Gusfield. Connectivity and edge-disjoint spanning trees. Information Pro-
cessing Letters, 16(2):87–89, 1983.
265. Dan Gusfield. Very simple methods for all pairs network flow analysis. SIAM
Journal on Computing, 19(1):143–155, 1990.
266. Ronald J. Gutman. Reach-based routing: A new approach to shortest path al-
gorithms optimized for road networks. In Proceedings of the 6th Workshop on
Algorithm Engineering and Experiments (ALENEX’04), Lecture Notes in Com-
puter Science, pages 100–111. SIAM, 2004.
267. Carsten Gutwenger and Petra Mutzel. A linear time implementation of SPQR-
trees. In Proceedings of the 8th International Symposium on Graph Drawing
(GD’00), volume 1984 of Lecture Notes in Computer Science, pages 70–90, Jan-
uary 2001.
268. Willem H. Haemers. Eigenvalue methods. In Alexander Schrijver, editor, Packing
and Covering in Combinatorics, pages 15–38. Mathematisch Centrum, 1979.
269. Per Hage and Frank Harary. Structural models in anthropology. Cambridge
University Press, 1st edition, 1983.
270. S. Louis Hakimi. On the realizability of a set of integers as degrees of the vertices
of a linear graph. SIAM Journal on Applied Mathematics, 10:496–506, 1962.
271. S. Louis Hakimi. Optimum location of switching centers and the absolute centers
and medians of a graph. Operations Research, 12:450–459, 1964.
272. Jianxiu Hao and James B. Orlin. A faster algorithm for finding the minimum
cut in a graph. In Proceedings of the 3rd Annual ACM–SIAM Symposium on
Discrete Algorithms (SODA’92), pages 165–174, January 1992.
273. Frank Harary. Status and contrastatus. Sociometry, 22:23–43, 1959.
274. Frank Harary. The maximum connectivity of a graph. Proceedings of the National
Academy of Science of the United States of America, 48(7):1142–1146, July 1962.
275. Frank Harary. A characterization of block-graphs. Canadian Mathematical Bul-
letin, 6(1):1–6, January 1963.
276. Frank Harary. Conditional connectivity. Networks, 13:347–357, 1983.
277. Frank Harary. General connectivity. In Khee Meng Koh and Hian-Poh Yap,
editors, Proceedings of the 1st Southeast Asian Graph Theory Colloquium, volume
1073 of Lecture Notes in Mathematics, pages 83–92. Springer-Verlag, 1984.
278. Frank Harary and Per Hage. Eccentricity and centrality in networks. Social
Networks, 17:57–63, 1995.
279. Frank Harary and Yukihiro Kodama. On the genus of an n-connected graph.
Fundamenta Mathematicae, 54:7–13, 1964.
280. Frank Harary and Helene J. Kommel. Matrix measures for transitivity and bal-
ance. Journal of Mathematical Sociology, 6:199–210, 1979.
281. Frank Harary and Robert Z. Norman. The dissimilarity characteristic of Husimi
trees. Annals of Mathematics, 58(2):134–141, 1953.
282. Frank Harary and Herbert H. Paper. Toward a general calculus of phonemic
distribution. Language: Journal of the Linguistic Society of America, 33:143–
169, 1957.
283. Frank Harary and Geert Prins. The block-cutpoint-tree of a graph. Publicationes
Mathematicae Debrecen, 13:103–107, 1966.
284. Frank Harary and Ian C. Ross. A procedure for clique detection using the group
matrix. Sociometry, 20:205–215, 1957.
285. David Harel and Yehuda Koren. On clustering using random walks. In Proceed-
ings of the 21st Conference on Foundations of Software Technology and Theoret-
ical Computer Science (FSTTCS’01), volume 2245 of Lecture Notes in Computer
Science, pages 18–41. Springer-Verlag, 2001.
286. Erez Hartuv and Ron Shamir. A clustering algorithm based on graph connectiv-
ity. Information Processing Letters, 76(4-6):175–181, 2000.
287. Johan Håstad. Clique is hard to approximate within n^{1−ε}. Acta Mathematica,
182:105–142, 1999.
288. Václav Havel. A remark on the existence of finite graphs (in Czech). Časopis
Pěst. Mat., 80:477–480, 1955.
289. Taher H. Haveliwala. Topic-sensitive PageRank: A context-sensitive ranking algo-
rithm for web search. IEEE Transactions on Knowledge and Data Engineering,
15(4):784–796, 2003.
290. Taher H. Haveliwala and Sepandar D. Kamvar. The second eigenvalue of the
Google matrix. Technical report, Stanford University, March 2003.
291. Taher H. Haveliwala, Sepandar D. Kamvar, and Glen Jeh. An analytical com-
parison of approaches to personalized PageRank. Technical report, Stanford
University, June 2003.
292. Taher H. Haveliwala, Sepandar D. Kamvar, Dan Klein, Christopher D. Manning,
and Gene H. Golub. Computing PageRank using power extrapolation. Technical
report, Stanford University, July 2003.
293. George R. T. Hendry. On graphs with a prescribed median. I. Journal of Graph
Theory, 9:477–487, 1985.
294. Michael A. Henning and Ortrud R. Oellermann. The average connectivity of a
digraph. Discrete Applied Mathematics, 140:143–153, May 2004.
295. Monika R. Henzinger and Michael L. Fredman. Lower bounds for fully dynamic
connectivity problems in graphs. Algorithmica, 22(3):351–362, 1998.
296. Monika R. Henzinger and Valerie King. Fully dynamic 2-edge connectivity al-
gorithm in polylogarithmic time per operation. SRC Technical Note 1997-004a,
Digital Equipment Corporation, Systems Research Center, Palo Alto, California,
June 1997.
297. Monika R. Henzinger and Johannes A. La Poutré. Certificates and fast algo-
rithms for biconnectivity in fully-dynamic graphs. SRC Technical Note 1997-021,
Digital Equipment Corporation, Systems Research Center, Palo Alto, California,
September 1997.
298. Monika R. Henzinger, Satish Rao, and Harold N. Gabow. Computing vertex
connectivity: New bounds from old techniques. In Proceedings of the 37th Annual
IEEE Symposium on Foundations of Computer Science (FOCS’96), pages 462–
471, October 1996.
299. Wassily Hoeffding. Probability inequalities for sums of bounded random vari-
ables. Journal of the American Statistical Association, 58(301):713–721, 1963.
300. Karen S. Holbert. A note on graphs with distant center and median. In V. R.
Kulli, editor, Recent Studies in Graph Theory, pages 155–158, Gulbarga, India,
1989. Vishwa International Publications.
301. Paul W. Holland, Kathryn B. Laskey, and Samuel Leinhardt. Stochastic block-
models: First steps. Social Networks, 5:109–137, 1983.
302. Paul W. Holland and Samuel Leinhardt. An exponential family of probability
distributions for directed graphs. Journal of the American Statistical Association,
76(373):33–50, March 1981.
303. Jacob Holm, Kristian de Lichtenberg, and Mikkel Thorup. Poly-logarithmic de-
terministic fully-dynamic algorithms for connectivity, minimum spanning tree,
2-edge, and biconnectivity. Journal of the ACM, 48(4):723–760, 2001.
304. Petter Holme. Congestion and centrality in traffic flow on complex networks.
Advances in Complex Systems, 6(2):163–176, 2003.
305. Petter Holme, Beom Jun Kim, Chang No Yoon, and Seung Kee Han. Attack
vulnerability of complex networks. Physical Review E, 65(056109), 2002.
306. Klaus Holzapfel. Density-based clustering in large-scale networks. PhD thesis,
Technische Universität München, 2004.
307. Klaus Holzapfel, Sven Kosub, Moritz G. Maaß, Alexander Offtermatt-Souza, and
Hanjo Täubig. A zero-one law for densities of higher order. Manuscript, 2004.
308. Klaus Holzapfel, Sven Kosub, Moritz G. Maaß, and Hanjo Täubig. The com-
plexity of detecting fixed-density clusters. In Proceedings of the 5th Italian Con-
ference on Algorithms and Complexity (CIAC’03), volume 2653 of Lecture Notes
in Computer Science, pages 201–212. Springer-Verlag, 2003.
309. John E. Hopcroft and Robert E. Tarjan. Finding the triconnected components
of a graph. Technical Report TR 72-140, CS Dept., Cornell University, Ithaca,
N.Y., August 1972.
310. John E. Hopcroft and Robert E. Tarjan. Dividing a graph into triconnected
components. SIAM Journal on Computing, 2(3):135–158, September 1973.
311. John E. Hopcroft and Robert E. Tarjan. Efficient algorithms for graph manipu-
lation. Communications of the ACM, 16(6):372–378, June 1973.
312. John E. Hopcroft and Robert E. Tarjan. Dividing a graph into triconnected
components. Technical Report TR 74-197, CS Dept., Cornell University, Ithaca,
N.Y., February 1974.
313. John E. Hopcroft and Robert E. Tarjan. Efficient planarity testing. Journal of
the ACM, 21(4):549–568, October 1974.
314. John E. Hopcroft and J.K. Wong. A linear time algorithm for isomorphism of
planar graphs. In Proceedings of the 6th Annual ACM Symposium on the Theory
of Computing (STOC’74), pages 172–184, 1974.
315. Radu Horaud and Thomas Skordas. Stereo correspondence through feature
grouping and maximal cliques. IEEE Transactions on Pattern Analysis and Ma-
chine Intelligence, 11(11):1168–1180, 1989.
316. Wen-Lian Hsu. O(MN) algorithms for the recognition and isomorphism problems
on circular-arc graphs. SIAM Journal on Computing, 24:411–439, 1995.
317. T.C. Hu. Optimum communication spanning trees. SIAM Journal on Computing,
3:188–195, 1974.
318. Xiaohan Huang and Victor Y. Pan. Fast rectangular matrix multiplication and
applications. Journal of Complexity, 14(2):257–299, 1998.
319. Charles H. Hubbell. An input-output approach to clique identification. Sociome-
try, 28:377–399, 1965.
320. Piotr Indyk and Jiří Matoušek. Low-distortion embeddings of finite metric spaces.
In Jacob E. Goodman and Joseph O’Rourke, editors, Handbook of Discrete and
Computational Geometry. Chapman & Hall/CRC, April 2004.
321. Alon Itai and Michael Rodeh. Finding a minimum circuit in a graph. SIAM
Journal on Computing, 7(4):413–423, 1978.
344. David R. Karger and Clifford Stein. An Õ(n^2) algorithm for minimum cuts. In
Proceedings of the 25th Annual ACM Symposium on the Theory of Computing
(STOC’93), pages 757–765, May 1993.
345. Richard M. Karp. Reducibility among combinatorial problems. In Raymond E.
Miller and James W. Thatcher, editors, Complexity of Computer Computations,
pages 85–103. Plenum Press, 1972.
346. Richard M. Karp. On the computational complexity of combinatorial problems.
Networks, 5:45–68, 1975.
347. Richard M. Karp. Probabilistic analysis of some combinatorial search problems.
In Joseph F. Traub, editor, Algorithms and Complexity: New Directions and Re-
cent Results, pages 1–19. Academic Press, 1976.
348. George Karypis and Vipin Kumar. A fast and high quality multilevel scheme for
partitioning irregular graphs. SIAM Journal on Scientific Computing, 20(1):359–
392, 1998.
349. Alexander V. Karzanov. On finding maximum flows in networks with spe-
cial structure and some applications. In Matematicheskie Voprosy Upravleniya
Proizvodstvom, volume 5, pages 66–70. Moscow State University Press, 1973. (In
Russian).
350. Alexander V. Karzanov. Determining the maximal flow in a network by the
method of preflows. Soviet Mathematics-Doklady, 15(2):434–437, 1974.
351. Alexander V. Karzanov and Eugeniy A. Timofeev. Efficient algorithm for finding
all minimal edge cuts of a nonoriented graph. Cybernetics, 22(2):156–162, 1986.
352. Leo Katz. A new status index derived from sociometric analysis. Psychometrika,
18(1):39–43, 1953.
353. Subhash Khot. Improved approximation algorithms for max clique, chromatic
number and approximate graph coloring. In Proceedings of the 42nd Annual
IEEE Symposium on Foundations of Computer Science (FOCS’01), pages 600–
609. IEEE Computer Society Press, 2001.
354. Subhash Khot. Ruling out PTAS for graph min-bisection, densest subgraph
and bipartite clique. In Proceedings of the 45th Annual IEEE Symposium on
Foundations of Computer Science (FOCS’04), pages 136–145. IEEE Computer
Society Press, 2004.
355. K. H. Kim and F. W. Roush. Group relationships and homomorphisms of Boolean
matrix semigroups. Journal of Mathematical Psychology, 28:448–452, 1984.
356. Valerie King, Satish Rao, and Robert E. Tarjan. A faster deterministic maximum
flow algorithm. In Proceedings of the 3rd Annual ACM–SIAM Symposium on
Discrete Algorithms (SODA’92), pages 157–164, January 1992.
357. G. Kishi. On centrality functions of a graph. In N. Saito and T. Nishizeki,
editors, Proceedings of the 17th Symposium of Research Institute of Electrical
Communication on Graph Theory and Algorithms, volume 108 of Lecture Notes
in Computer Science, pages 45–52, Sendai, Japan, October 1980. Springer.
358. G. Kishi and M. Takeuchi. On centrality functions of a non-directed graph.
In Proceedings of the 6th Colloquium on Microwave Communication, Budapest,
1978.
359. Jon M. Kleinberg. Authoritative sources in a hyperlinked environment. Journal
of the ACM, 46(5):604–632, 1999.
360. Jon M. Kleinberg. The small-world phenomenon: An algorithmic perspective. In
Proceedings of the 32nd Annual ACM Symposium on the Theory of Computing
(STOC’00), May 2000.
361. Jon M. Kleinberg. An impossibility theorem for clustering. In Proceedings of the
15th Conference on Neural Information Processing Systems, Advances in Neural
Information Processing Systems, 2002.
362. Daniel J. Kleitman. Methods for investigating connectivity of large graphs. IEEE
Transactions on Circuit Theory, 16(2):232–233, May 1969.
363. Ton Kloks, Dieter Kratsch, and Haiko Müller. Finding and counting small in-
duced subgraphs efficiently. Information Processing Letters, 74(3–4):115–121,
2000.
364. David Knoke and David L. Rogers. A blockmodel analysis of interorganizational
networks. Sociology and Social Research, 64:28–52, 1979.
365. Donald E. Knuth. Two notes on notation. American Mathematical Monthly,
99:403–422, 1992.
366. Dénes Kőnig. Graphen und Matrizen. Mat. Fiz. Lapok, 38:116–119, 1931.
367. Konstantin Avrachenkov and Nelly Litvak. Decomposition of the Google PageR-
ank and optimal linking strategy. Technical Report 5101, INRIA, Sophia An-
tipolis, France, January 2004.
368. Tamás Kővári, Vera T. Sós, and Pál Turán. On a problem of Zarankiewicz.
Colloquium Mathematicum, 3:50–57, 1954.
369. Paul L. Krapivsky, Sidney Redner, and Francois Leyvraz. Connectivity of growing
random networks. http://xxx.sissa.it/ps/cond-mat/0005139, September 2000.
370. Jan Kratochvíl. Perfect Codes in General Graphs. Academia Praha, 1991.
371. V. Krishnamoorthy, K. Thulasiraman, and M. N. S. Swamy. Incremental distance
and diameter sequences of a graph: New measures of network performance. IEEE
Transactions on Computers, 39(2):230–237, February 1990.
372. Joseph B. Kruskal. Multidimensional scaling by optimizing goodness of fit to a
nonparametric hypothesis. Psychometrika, 29:1–27, March 1964.
373. Joseph B. Kruskal. Nonmetric multidimensional scaling: A numerical method.
Psychometrika, 29:115–129, June 1964.
374. Luděk Kučera. Expected complexity of graph partitioning problems. Discrete
Applied Mathematics, 57(2–3):193–212, 1995.
375. Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, D. Sivakumar, An-
drew S. Tomkins, and Eli Upfal. Stochastic models for the web graph. In Proceed-
ings of the 41st Annual IEEE Symposium on Foundations of Computer Science
(FOCS’00), 2000.
376. Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, and Andrew S.
Tomkins. Trawling the web for emerging cyber-communities. Computer Net-
works: The International Journal of Computer and Telecommunications Network-
ing, 31(11–16):1481–1493, 1999.
377. Johannes A. La Poutré, Jan van Leeuwen, and Mark H. Overmars. Maintenance
of 2- and 3-connected components of graphs, Part I: 2- and 3-edge-connected com-
ponents. Technical Report RUU-CS-90-26, Dept. of Computer Science, Utrecht
University, July 1990.
378. Amy N. Langville and Carl D. Meyer. Deeper inside PageRank. Technical report,
Department of Mathematics, North Carolina State University, Raleigh, NC, USA,
March 2004. accepted by Internet Mathematics.
379. Amy N. Langville and Carl D. Meyer. A survey of eigenvector methods of web
information retrieval. Technical report, Department of Mathematics, North Car-
olina State University, Raleigh, NC, USA, January 2004. accepted by The SIAM
Review.
380. Luigi Laura, Stefano Leonardi, Stefano Millozzi, and Ulrich Meyer. Algorithms
and experiments for the webgraph. In Proceedings of the 11th Annual European
Symposium on Algorithms (ESA’03), volume 2832 of Lecture Notes in Computer
Science, 2003.
381. Eugene L. Lawler. Cutsets and partitions of hypergraphs. Networks, 3:275–285,
1973.
382. Eugene L. Lawler, Jan Karel Lenstra, and Alexander H. G. Rinnooy Kan. Gen-
erating all maximal independent sets: NP-hardness and polynomial-time algo-
rithms. SIAM Journal on Computing, 9(3):558–565, 1980.
383. Chris Pan-Chi Lee, Gene H. Golub, and Stefanos A. Zenios. A fast two-stage
algorithm for computing PageRank. Technical Report SCCM-03-15, Stanford
University, 2003.
384. Erich L. Lehmann. Testing Statistical Hypotheses. Springer Texts in Statistics.
Springer-Verlag, 2nd edition, 1997.
385. Erich L. Lehmann and George Casella. Theory of Point Estimation. Springer
Texts in Statistics. Springer-Verlag, 2nd edition, 1998.
386. L. Ya. Leifman. An efficient algorithm for partitioning an oriented graph into
bicomponents. Cybernetics, 2(5):15–18, 1966.
387. Ronny Lempel and Shlomo Moran. The stochastic approach for link-structure
analysis (SALSA) and the TKC effect. Computer Networks: The International
Journal of Computer and Telecommunications Networking, 33:387–401, 2000. vol-
ume coincides with the Proceedings of the 9th international World Wide Web
conference on Computer networks.
388. Ronny Lempel and Shlomo Moran. Rank-stability and rank-similarity of link-
based web ranking algorithms in authority-connected graphs. Information Re-
trieval, special issue on Advances in Mathematics/Formal Methods in Informa-
tion Retrieval, 2004. in press.
389. Thomas Lengauer. Combinatorial Algorithms for Integrated Circuit Layout. Wi-
ley, 1990.
390. Linda Lesniak. Results on the edge-connectivity of graphs. Discrete Mathematics,
8:351–354, 1974.
391. Robert Levinson. Pattern associativity and the retrieval of semantic networks.
Computers & Mathematics with Applications, 23(2):573–600, 1992.
392. Nathan Linial, László Lovász, and Avi Wigderson. A physical interpretation of
graph connectivity and its algorithmic applications. In Proceedings of the 27th
Annual IEEE Symposium on Foundations of Computer Science (FOCS’86), pages
39–48, October 1986.
393. Nathan Linial, László Lovász, and Avi Wigderson. Rubber bands, convex em-
beddings and graph connectivity. Combinatorica, 8(1):91–102, 1988.
394. François Lorrain and Harrison C. White. Structural equivalence of individuals in
social networks. Journal of Mathematical Sociology, 1:49–80, 1971.
395. Emmanuel Loukakis and Konstantinos-Klaudius Tsouros. A depth first search
algorithm to generate the family of maximal independent sets of a graph lexico-
graphically. Computing, 27:249–266, 1981.
396. László Lovász. Connectivity in digraphs. Journal of Combinatorial Theory Se-
ries B, 15(2):174–177, August 1973.
397. László Lovász. On the Shannon capacity of a graph. IEEE Transactions on
Information Theory, 25:1–7, 1979.
398. Anna Lubiw. Some NP-complete problems similar to graph isomorphism. SIAM
Journal on Computing, 10:11–24, 1981.
399. Fabrizio Luccio and Mariagiovanna Sami. On the decomposition of networks in
minimally interconnected subnetworks. IEEE Transactions on Circuit Theory,
CT-16:184–188, 1969.
400. R. Duncan Luce. Connectivity and generalized cliques in sociometric group struc-
ture. Psychometrika, 15:169–190, 1950.
401. R. Duncan Luce and Albert Perry. A method of matrix analysis of group struc-
ture. Psychometrika, 14:95–116, 1949.
402. Eugene M. Luks. Isomorphism of graphs of bounded valence can be tested in
polynomial time. Journal of Computer and System Sciences, 25:42–65, 1982.
403. Saunders Mac Lane. A structural characterization of planar combinatorial
graphs. Duke Mathematical Journal, 3:460–472, 1937.
404. Wolfgang Mader. Ecken vom Grad n in minimalen n-fach zusammenhängenden
Graphen. Archiv der Mathematik, 23:219–224, 1972.
405. Damien Magoni and Jean Jacques Pansiot. Analysis of the autonomous system
network topology. Computer Communication Review, 31(3):26–37, July 2001.
406. Vishv M. Malhotra, M. Pramodh Kumar, and S. N. Maheshwari. An O(|V |3 ) al-
gorithm for finding maximum flows in networks. Information Processing Letters,
7(6):277–278, October 1978.
407. Yishay Mansour and Baruch Schieber. Finding the edge connectivity of directed
graphs. Journal of Algorithms, 10(1):76–85, March 1989.
408. Maarten Marx and Michael Masuch. Regular equivalence and dynamic logic.
Social Networks, 25:51–65, 2003.
409. Rudi Mathon. A note on the graph isomorphism counting problem. Information
Processing Letters, 8(3):131–132, 1979.
410. David W. Matula. The cohesive strength of graphs. In The Many Facets of
Graph Theory, Proc., volume 110 of Lecture Notes in Mathematics, pages 215–
221. Springer-Verlag, 1969.
411. David W. Matula. k-components, clusters, and slicings in graphs. SIAM Journal
on Applied Mathematics, 22(3):459–480, May 1972.
412. David W. Matula. Graph theoretic techniques for cluster analysis algorithms.
In J. Van Ryzin, editor, Classification and clustering, pages 95–129. Academic
Press, 1977.
413. David W. Matula. Determining edge connectivity in O(nm). In Proceedings of the
28th Annual IEEE Symposium on Foundations of Computer Science (FOCS’87),
pages 249–251, October 1987.
414. James J. McGregor. Backtrack search algorithms and the maximal common
subgraph problem. Software - Practice and Experience, 12(1):23–24, 1982.
415. Brendan D. McKay. Practical graph isomorphism. Congressus Numerantium,
30:45–87, 1981.
416. Brendan D. McKay and Nicholas C. Wormald. Uniform generation of random
regular graphs of moderate degree. Journal of Algorithms, 11:52–67, 1990.
417. Alberto Medina, Anukool Lakhina, Ibrahim Matta, and John Byers. BRITE: An
approach to universal topology generation. In Proceedings of the International
Symposium on Modeling, Analysis and Simulation of Computer and Telecommu-
nication Systems (MASCOTS’01), 2001.
418. Alberto Medina, Ibrahim Matta, and John Byers. On the origin of power laws
in Internet topologies. Computer Communication Review, 30(2), April 2000.
419. Karl Menger. Zur allgemeinen Kurventheorie. Fundamenta Mathematicae, 10:96–
115, 1927.
420. Milena Mihail, Christos Gkantsidis, Amin Saberi, and Ellen W. Zegura. On
the semantics of internet topologies. Technical Report GIT-CC-02-07, Georgia
Institute of Technology, 2002.
421. Stanley Milgram. The small world problem. Psychology Today, 1:61, 1967.
422. Gary L. Miller and Vijaya Ramachandran. A new graph triconnectivity algorithm
and its parallelization. Combinatorica, 12(1):53–76, 1992.
423. Ron Milo, Shai Shen-Orr, Shalev Itzkovitz, Nadav Kashtan, Dmitri Chklovskii,
and Uri Alon. Network motifs: Simple building blocks of complex networks.
Science, 298:824–827, October 2002.
424. J. Clyde Mitchell. Algorithms and network analysis: A test of some analyti-
cal procedures on Kapferer’s tailor shop material. In Linton Clarke Freeman,
Douglas R. White, and A. Kimball Romney, editors, Research Methods in Social
Network Analysis, pages 365–391. George Mason University Press, 1989.
425. Bojan Mohar. Isoperimetric numbers of graphs. Journal of Combinatorial Theory
Series B, 47(3):274–291, 1989.
426. Bojan Mohar. Eigenvalues, diameter and mean distance in graphs. Graphs and
Combinatorics, 7:53–64, 1991.
427. Bojan Mohar. The Laplacian spectrum of graphs. In Yousef Alavi, Gary Char-
trand, Ortrud R. Oellermann, and Allen J. Schwenk, editors, Graph Theory,
Combinatorics, and Applications, pages 871–898. Wiley, 1991.
428. Bojan Mohar and Svatopluk Poljak. Eigenvalues in combinatorial optimization.
In Richard A. Brualdi, Shmuel Friedland, and Victor Klee, editors, Combinato-
rial and Graph-Theoretical Problems in Linear Algebra, pages 107–151. Springer-
Verlag, 1993.
429. Robert J. Mokken. Cliques, clubs, and clans. Quality and Quantity, 13:161–173,
1979.
430. Burkhard Möller. Zentralitäten in Graphen. Diplomarbeit, Fachbereich Infor-
matik und Informationswissenschaft, Universität Konstanz, July 2002.
431. John W. Moon. On the diameter of a graph. Michigan Mathematical Journal,
12(3):349–351, 1965.
432. John W. Moon and L. Moser. On cliques in graphs. Israel Journal of Mathemat-
ics, 3:23–28, 1965.
433. Robert L. Moxley and Nancy F. Moxley. Determining Point-Centrality in Un-
contrived Social Networks. Sociometry, 37:122–130, 1974.
434. N. C. Mullins, L. L. Hargens, P. K. Hecht, and Edward L. Kick. The group struc-
ture of cocitation clusters: A comparative study. American Sociological Review,
42:552–562, 1977.
435. Ian Munro. Efficient determination of the transitive closure of a directed graph.
Information Processing Letters, 1(2):56–58, 1971.
436. Siegfried F. Nadel. The Theory of Social Structure. Cohen & West LTD, 1957.
437. Kai Nagel. Traffic networks. In Stefan Bornholdt and Heinz Georg Schuster,
editors, Handbook of Graphs and Networks: From the Genome to the Internet.
Wiley-VCH, 2002.
438. Walid Najjar and Jean-Luc Gaudiot. Network resilience: A measure of network
fault tolerance. IEEE Transactions on Computers, 39(2):174–181, February 1990.
439. George L. Nemhauser and Laurence A. Wolsey. Integer and Combinatorial Opti-
mization. Wiley, 1988.
440. Jaroslav Nešetřil and Svatopluk Poljak. On the complexity of the subgraph
problem. Commentationes Mathematicae Universitatis Carolinae, 26(2):415–419,
1985.
441. Mark E. J. Newman. Assortative mixing in networks. Physical Review Letters,
89(208701), 2002.
442. Mark E. J. Newman. Fast algorithm for detecting community structure in net-
works. arXiv cond-mat/0309508, September 2003.
443. Mark E. J. Newman. A measure of betweenness centrality based on random
walks. arXiv cond-mat/0309045, 2003.
444. Mark E. J. Newman and Michelle Girvan. Mixing patterns and community
structure in networks. In Romualdo Pastor-Satorras, Miguel Rubi, and Albert
Diaz-Guilera, editors, Statistical Mechanics of Complex Networks, volume 625 of
Lecture Notes in Physics, pages 66–87. Springer-Verlag, 2003.
445. Mark E. J. Newman and Michelle Girvan. Finding and evaluating community
structure in networks. Physical Review E, 69(2):026113, 2004.
446. Mark E. J. Newman and Juyong Park. Why social networks are different from
other types of networks. Physical Review E, 68(036122), 2003.
447. Mark E. J. Newman, Steven H. Strogatz, and Duncan J. Watts. Random graph
models of social networks. Proceedings of the National Academy of Science of the
United States of America, 99:2566–2572, 2002.
448. Mark E. J. Newman, Duncan J. Watts, and Steven H. Strogatz. Random graphs
with arbitrary degree distributions and their applications. Physical Review E,
64:026118, 2001.
449. Andrew Y. Ng, Alice X. Zheng, and Michael I. Jordan. Link analysis, eigenvectors
and stability. In Proceedings of the 17th International Joint Conference on Artificial
Intelligence (IJCAI'01), pages 903–910, Seattle, Washington, 2001.
450. Victor Nicholson, Chun Che Tsai, Marc A. Johnson, and Mary Naim. A subgraph
isomorphism theorem for molecular graphs. In Proceedings of The International
Conference on Graph Theory and Topology in Chemistry, pages 226–230, 1987.
451. U. J. Nieminen. On the Centrality in a Directed Graph. Social Science Research,
2:371–378, 1973.
452. National Laboratory for Applied Network Research. Routing data, 1999.
453. Krzysztof Nowicki and Tom A.B. Snijders. Estimation and prediction for stochas-
tic blockstructures. Journal of the American Statistical Association, 96:1077–
1087, 2001.
454. Esko Nuutila and Eljas Soisalon-Soininen. On finding the strongly connected
components in a directed graph. Information Processing Letters, 49(1):9–14,
January 1994.
455. Ortrud R. Oellermann. A note on the l-connectivity function of a graph. Con-
gressus Numerantium, 60:181–188, December 1987.
456. Ortrud R. Oellermann. On the l-connectivity of a graph. Graphs and Combina-
torics, 3:285–291, 1987.
457. Maria G.R. Ortiz, Jose R.C. Hoyos, and Maria G.R. Lopez. The social networks of
academic performance in a student context of poverty in Mexico. Social Networks,
26(2):175–188, 2004.
458. Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. The PageR-
ank citation ranking: Bringing order to the web. Manuscript, 1999.
459. Robert Paige and Robert E. Tarjan. Three partition refinement algorithms.
SIAM Journal on Computing, 16(6):973–983, 1987.
460. Ignacio Palacios-Huerta and Oscar Volij. The measurement of intellectual influ-
ence. Econometrica, 2004. accepted for publication.
461. Christopher Palmer, Phillip Gibbons, and Christos Faloutsos. Fast approximation
of the “neighbourhood” function for massive graphs. Technical Report CMU-CS-
01-122, Carnegie Mellon University, 2001.
462. Christopher Palmer, Georgos Siganos, Michalis Faloutsos, Christos Faloutsos, and
Phillip Gibbons. The connectivity and fault-tolerance of the Internet topology.
In Workshop on Network-Related Data Management (NRDM 2001), 2001.
463. Gopal Pandurangan, Prabhakar Raghavan, and Eli Upfal. Using PageRank to
characterize Web structure. In Proceedings of the 8th Annual International Con-
ference on Computing Combinatorics (COCOON’02), volume 2387 of Lecture
Notes in Computer Science, pages 330–339, 2002.
464. Apostolos Papadopoulos and Yannis Manolopoulos. Structure-based similarity
search with graph histograms. In DEXA Workshop, pages 174–178, 1999.
465. Britta Papendiek and Peter Recht. On maximal entries in the principal eigen-
vector of graphs. Linear Algebra and its Applications, 310:129–138, 2000.
466. Panos M. Pardalos and Jue Xue. The maximum clique problem. Journal of
Global Optimization, 4:301–328, 1994.
467. Beresford N. Parlett. The Symmetric Eigenvalue Problem. SIAM, 1998.
468. Romualdo Pastor-Satorras, Alexei Vázquez, and Alessandro Vespignani. Dy-
namical and correlation properties of the Internet. Physical Review Letters,
87(258701), 2001.
469. Romualdo Pastor-Satorras and Alessandro Vespignani. Epidemics and immu-
nization in scale-free networks. In Stefan Bornholdt and Heinz Georg Schuster,
editors, Handbook of Graphs and Networks: From the Genome to the Internet.
Wiley-VCH, 2002.
470. Keith Paton. An algorithm for the blocks and cutnodes of a graph. Communi-
cations of the ACM, 14(7):468–475, July 1971.
471. Philippa Pattison. Algebraic Models for Social Networks. Cambridge University
Press, 1993.
472. Philippa Pattison and Stanley Wasserman. Logit models and logistic regressions
for social networks: II. Multivariate relations. British Journal of Mathematical
and Statistical Psychology, 52:169–193, 1999.
473. Marvin C. Paull and Stephen H. Unger. Minimizing the number of states in
incompletely specified sequential switching functions. IRE Transactions on Elec-
tronic Computers, EC-8:356–367, 1959.
474. Aleksandar Pekeč and Fred S. Roberts. The role assignment model nearly fits
most social networks. Mathematical Social Sciences, 41:275–293, 2001.
475. Claudine Peyrat. Diameter vulnerability of graphs. Discrete Applied Mathemat-
ics, 9, 1984.
476. Steven Phillips and Jeffery Westbrook. On-line load balancing and network flow.
Algorithmica, 21(3):245–261, 1998.
477. Jean-Claude Picard and Maurice Queyranne. A network flow solution to some
nonlinear 0-1 programming problems, with an application to graph theory. Net-
works, 12:141–159, 1982.
478. Jean-Claude Picard and H. D. Ratliff. Minimum cuts and related problems.
Networks, 5(4):357–370, 1975.
479. Gabriel Pinski and Francis Narin. Citation influence for journal aggregates of
scientific publications: theory, with application to the literature of physics. In-
formation Processing & Management, 12:297–312, 1976.
480. André Pönitz and Peter Tittmann. Computing network reliability in graphs
of restricted pathwidth. http://www.peter.htwm.de/publications/Reliability.ps,
2001.
481. R. Poulin, M.-C. Boily, and B.R. Mâsse. Dynamical systems to define centrality
in social networks. Social Networks, 22:187–220, 2000.
482. William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flan-
nery. Numerical Recipes in C. Cambridge University Press, 1992.
483. C.H. Proctor and C. P. Loomis. Analysis of sociometric data. In Marie Ja-
hoda, Morton Deutsch, and Stuart W. Cook, editors, Research Methods in Social
Relations, pages 561–586. Dryden Press, 1951.
484. Paul W. Purdom, Jr. A transitive closure algorithm. Computer Sciences Tech-
nical Report #33, University of Wisconsin, July 1968.
485. Paul W. Purdom, Jr. A transitive closure algorithm. BIT, 10:76–94, 1970.
486. Pavlin Radoslavov, Hongsuda Tangmunarunkit, Haobo Yu, Ramesh Govindan,
Scott Shenker, and Deborah Estrin. On characterizing network topologies and
analyzing their impact on protocol design. Technical Report 00-731, Computer
Science Department, University of Southern California, February 2000.
487. Rajeev Raman. Recent results on the single-source shortest paths problem. ACM
SIGACT News, 28(2):81–87, 1997.
488. John W. Raymond, Eleanor J. Gardiner, and Peter Willett. RASCAL: Calculation
of graph similarity using maximum common edge subgraphs. The Computer
Journal, 45(6):631–644, 2002.
489. Ronald C. Read and Derek G. Corneil. The graph isomorphism disease. Journal
of Graph Theory, 1:339–363, 1977.
490. John H. Reif. A topological approach to dynamic graph connectivity. Information
Processing Letters, 25(1):65–70, 1987.
491. Franz Rendl and Henry Wolkowicz. A projection technique for partitioning the
nodes of a graph. Annals of Operations Research, 58:155–180, 1995.
492. John A. Rice. Mathematical Statistics and Data Analysis. Duxbury Press, 2nd
edition, 1995.
493. Fred S. Roberts and Li Sheng. NP-completeness for 2-role assignability. Tech-
nical Report 8, Rutgers Center for Operations Research, 1997.
494. Garry Robins, Philippa Pattison, and Stanley Wasserman. Logit models and
logistic regressions for social networks: III. Valued relations. Psychometrika,
64:371–394, 1999.
495. John Michael Robson. Algorithms for maximum independent sets. Journal of
Algorithms, 7(3):425–440, 1986.
496. Liam Roditty and Uri Zwick. On dynamic shortest paths problems. In Proceed-
ings of the 12th Annual European Symposium on Algorithms (ESA’04), volume
3221 of Lecture Notes in Computer Science, pages 580–591, 2004.
497. Arnon S. Rosenthal. Computing Reliability of Complex Systems. PhD thesis,
University of California, 1974.
498. Sheldon M. Ross. Introduction to Probability Models. Academic Press, 8th edition,
2003.
499. Britta Ruhnau. Eigenvector-centrality – a node-centrality? Social Networks,
22:357–365, 2000.
500. Gert Sabidussi. The centrality index of a graph. Psychometrika, 31:581–603,
1966.
501. Lee Douglas Sailer. Structural equivalence: Meaning and definition, computation
and application. Social Networks, 1:73–90, 1978.
502. Thomas Schank and Dorothea Wagner. Approximating clustering-coefficient and
transitivity. Technical Report 2004-9, Universität Karlsruhe, Fakultät für Infor-
matik, 2004.
503. Claus P. Schnorr. Bottlenecks and edge connectivity in unsymmetrical networks.
SIAM Journal on Computing, 8(2):265–274, May 1979.
504. Uwe Schöning. Graph isomorphism is in the low hierarchy. Journal of Computer
and System Sciences, 37:312–323, 1988.
505. Alexander Schrijver. Theory of Linear and Integer Programming. Wiley, 1986.
506. Alexander Schrijver. Paths and flows—a historical survey. CWI Quarterly,
6(3):169–183, September 1993.
507. Alexander Schrijver. Combinatorial Optimization: Polyhedra and Efficiency.
Springer-Verlag, 2003.
508. Joseph E. Schwartz. An examination of CONCOR and related methods for
blocking sociometric data. In D. R. Heise, editor, Sociological Methodology 1977,
pages 255–282. Jossey Bass, 1977.
509. Jennifer A. Scott. An Arnoldi code for computing selected eigenvalues of
sparse real unsymmetric matrices. ACM Transactions on Mathematical Software,
21:423–475, 1995.
510. John R. Seeley. The net of reciprocal influence. Canadian Journal of Psychology,
III(4):234–240, 1949.
511. Stephen B. Seidman. Clique-like structures in directed networks. Journal of
Social and Biological Structures, 3:43–54, 1980.
512. Stephen B. Seidman. Internal cohesion of LS sets in graphs. Social Networks,
5(2):97–107, 1983.
513. Stephen B. Seidman. Network structure and minimum degree. Social Networks,
5:269–287, 1983.
514. Stephen B. Seidman and Brian L. Foster. A graph-theoretic generalization of the
clique concept. Journal of Mathematical Sociology, 6:139–154, 1978.
515. Stephen B. Seidman and Brian L. Foster. A note on the potential for gen-
uine cross-fertilization between anthropology and mathematics. Social Networks,
1(2):65–72, 1978.
516. Ron Shamir, Roded Sharan, and Dekel Tsur. Cluster graph modification prob-
lems. In Graph-Theoretic Concepts in Computer Science, 28th International
Workshop, WG 2002, volume 2573 of Lecture Notes in Computer Science, pages
379–390. Springer-Verlag, 2002.
517. Micha Sharir. A strong-connectivity algorithm and its applications in data flow
analysis. Computers & Mathematics with Applications, 7(1):67–72, 1981.
518. Yossi Shiloach. An O(n · I · log² I) maximum-flow algorithm. Technical Report
STAN-CS-78-702, Computer Science Department, Stanford University, December
1978.
519. Alfonso Shimbel. Structural parameters of communication networks. Bulletin of
Mathematical Biophysics, 15:501–507, 1953.
520. F. M. Sim and M. R. Schwartz. Does CONCOR find positions? Unpublished
manuscript, 1979.
521. Alistair Sinclair. Algorithms for Random Generation and Counting: A Markov
Chain Approach. Birkhäuser Verlag, 1993.
522. Brajendra K. Singh and Neelima Gupte. Congestion and decongestion in a
communication network. arXiv cond-mat/0404353, 2004.
523. Mohit Singh and Amitabha Tripathi. Order of a graph with given vertex and edge
connectivity and minimum degree. Electronic Notes in Discrete Mathematics, 15,
2003.
524. Peter J. Slater. Maximin facility location. Journal of Research of the National
Bureau of Standards, 79B:107–115, 1975.
525. Daniel D. Sleator and Robert E. Tarjan. A data structure for dynamic trees.
Journal of Computer and System Sciences, 26(3):362–391, June 1983.
526. Giora Slutzki and Oscar Volij. Scoring of web pages and tournaments – axioma-
tizations. Technical report, Iowa State University, Ames, USA, February 2003.
527. Christian Smart and Peter J. Slater. Center, median and centroid subgraphs.
Networks, 34:303–311, 1999.
528. Peter H. A. Sneath and Robert R. Sokal. Numerical Taxonomy: The Principles
and Practice of Numerical Classification. W. H. Freeman and Company, 1973.
529. Tom A.B. Snijders. Markov chain Monte Carlo estimation of exponential random
graph models. Journal of Social Structure, 3(2), April 2002.
530. Tom A.B. Snijders and Krzysztof Nowicki. Estimation and prediction of stochas-
tic blockmodels for graphs with latent block structure. Journal of Classification,
14:75–100, 1997.
531. Anand Srivastav and Katja Wolf. Finding dense subgraphs with semidefinite
programming. In Proceedings of the 1st International Workshop on Approxima-
tion Algorithms for Combinatorial Optimization (APPROX’98), volume 1444
of Lecture Notes in Computer Science, pages 181–191. Springer-Verlag, 1998.
532. Angelika Steger and Nicholas C. Wormald. Generating random regular graphs
quickly. Combinatorics, Probability and Computing, 8:377–396, 1999.
533. Karen A. Stephenson and Marvin Zelen. Rethinking centrality: Methods and
examples. Social Networks, 11:1–37, 1989.
534. Volker Stix. Finding all maximal cliques in dynamic graphs. Computational
Optimization and Applications, 27(2):173–186, 2004.
535. Josef Stoer and Roland Bulirsch. Introduction to Numerical Analysis. Springer-
Verlag, 1993.
536. Mechthild Stoer and Frank Wagner. A simple min-cut algorithm. Journal of the
ACM, 44(4):585–591, 1997.
537. Sun Microsystems. Sun Performance Library User’s Guide.
538. Melvin Tainiter. Statistical theory of connectivity I: Basic definitions and prop-
erties. Discrete Mathematics, 13(4):391–398, 1975.
539. Melvin Tainiter. A new deterministic network reliability measure. Networks,
6(3):191–204, 1976.
540. Hongsuda Tangmunarunkit, Ramesh Govindan, Sugih Jamin, Scott Shenker, and
Walter Willinger. Network topologies, power laws, and hierarchy. Technical
Report 01-746, Computer Science Department, University of Southern California,
2001.
541. Hongsuda Tangmunarunkit, Ramesh Govindan, Sugih Jamin, Scott Shenker, and
Walter Willinger. Network topologies, power laws, and hierarchy. ACM SIG-
COMM Computer Communication Review, 32(1):76, 2002.
542. Robert E. Tarjan. Depth-first search and linear graph algorithms. SIAM Journal
on Computing, 1(2):146–160, June 1972.
543. Robert E. Tarjan. Finding a maximum clique. Technical Report 72-123, Depart-
ment of Computer Science, Cornell University, Ithaca, NY, 1972.
544. Robert E. Tarjan. A note on finding the bridges of a graph. Information Pro-
cessing Letters, 2(6):160–161, 1974.
545. Robert E. Tarjan and Anthony E. Trojanowski. Finding a maximum independent
set. SIAM Journal on Computing, 6(3):537–546, 1977.
546. Mikkel Thorup. On RAM priority queues. In Proceedings of the 7th Annual
ACM–SIAM Symposium on Discrete Algorithms (SODA’96), pages 59–67, 1996.
547. Mikkel Thorup. Undirected single source shortest paths with positive integer
weights in linear time. Journal of the ACM, 46(3):362–394, 1999.
548. Mikkel Thorup. On RAM priority queues. SIAM Journal on Computing, 30(1):86–
109, 2000.
549. Mikkel Thorup. Fully dynamic all-pairs shortest paths: Faster and allowing neg-
ative cycles. In Proceedings of the 9th Scandinavian Workshop on Algorithm
Theory (SWAT’04), volume 3111 of Lecture Notes in Computer Science, pages
384–396. Springer-Verlag, 2004.
550. Gottfried Tinhofer. On the generation of random graphs with given properties
and known distribution. Appl. Comput. Sci. Ber. Prakt. Inf., 13:265–296, 1979.
551. Po Tong and Eugene L. Lawler. A faster algorithm for finding edge-disjoint
branchings. Information Processing Letters, 17(2):73–76, August 1983.
552. Miroslaw Truszczyński. Centers and centroids of unicyclic graphs. Mathematica
Slovaca, 35:223–228, 1985.
553. Shuji Tsukiyama, Mikio Ide, Hiromu Ariyoshi, and Isao Shirakawa. A new al-
gorithm for generating all the maximal independent sets. SIAM Journal on
Computing, 6(3):505–517, 1977.
554. Pál Turán. On an extremal problem in graph theory. Matematikai és Fizikai
Lapok, 48:436–452, 1941.
555. William T. Tutte. A theory of 3-connected graphs. Indagationes Mathematicae,
23:441–455, 1961.
556. William T. Tutte. Connectivity in graphs. Number 15 in Mathematical Exposi-
tions. University of Toronto Press, 1966.
557. Salil P. Vadhan. The complexity of counting in sparse, regular, and planar graphs.
SIAM Journal on Computing, 31(2):398–427, 2001.
558. Thomas W. Valente and Robert K. Foreman. Integration and radiality: measuring
the extent of an individual’s connectedness and reachability in a network. Social
Networks, 20:89–105, 1998.
559. Leslie G. Valiant. The complexity of computing the permanent. Theoretical
Computer Science, 8:189–201, 1979.
560. Leslie G. Valiant. The complexity of enumeration and reliability problems. SIAM
Journal on Computing, 8(3):410–421, 1979.
561. Edwin R. van Dam and Willem H. Haemers. Which graphs are determined by
their spectrum? Linear Algebra and its Applications, 373:241–272, 2003.
562. René van den Brink and Robert P. Gilles. An axiomatic social power index for
hierarchically structured populations of economic agents. In Robert P. Gilles and
Pieter H.M. Ruys, editors, Imperfections and Behaviour in Economic Organiza-
tions, pages 279–318. Kluwer Academic Publishers Group, 1994.
563. René van den Brink and Robert P. Gilles. Measuring domination in directed
networks. Social Networks, 22(2):141–157, May 2000.
564. Stijn M. van Dongen. Graph Clustering by Flow Simulation. PhD thesis, Uni-
versity of Utrecht, 2000.
565. Santosh Vempala, Ravi Kannan, and Adrian Vetta. On clusterings - good, bad
and spectral. In Proceedings of the 41st Annual IEEE Symposium on Foundations
of Computer Science (FOCS’00), pages 367–378, 2000.
566. The Stanford WebBase Project. http://www-diglib.stanford.edu/testbed/doc2/WebBase/.
567. Yuchung J. Wang and George Y. Wong. Stochastic blockmodels for directed
graphs. Journal of the American Statistical Association, 82:8–19, 1987.
568. Stephen Warshall. A theorem on Boolean matrices. Journal of the ACM, 9(1):11–
12, 1962.
569. Stanley Wasserman and Katherine Faust. Social Network Analysis: Methods and
Applications. Cambridge University Press, 1994.
570. Stanley Wasserman and Philippa Pattison. Logit models and logistic regressions
for social networks: I. An introduction to Markov graphs and p∗ . Psychometrika,
60:401–426, 1996.
571. David S. Watkins. QR-like algorithms for eigenvalue problems. Journal of Com-
putational and Applied Mathematics, 123:67–83, 2000.
572. Alison Watts. A dynamic model of network formation. Games and Economic
Behavior, 34:331–341, 2001.
573. Duncan J. Watts and Steven H. Strogatz. Collective dynamics of “small-world”
networks. Nature, 393:440–442, 1998.
574. Bernard M. Waxman. Routing of multipoint connections. IEEE Journal on
Selected Areas in Communications, 6(9):1617–1622, 1988.
575. Alfred Weber. Über den Standort der Industrien. J. C. B. Mohr, Tübingen, 1909.
576. Douglas B. West. Introduction to Graph Theory. Prentice Hall, 2nd edition, 2001.
577. Jeffery Westbrook and Robert E. Tarjan. Maintaining bridge-connected and
biconnected components on-line. Algorithmica, 7:433–464, 1992.
578. Douglas R. White and Stephen P. Borgatti. Betweenness centrality measures
for directed graphs. Social Networks, 16:335–346, 1994.
579. Douglas R. White and Karl P. Reitz. Graph and semigroup homomorphisms on
networks of relations. Social Networks, 5:193–234, 1983.
580. Scott White and Padhraic Smyth. Algorithms for estimating relative importance
in networks. In Proceedings of the 9th ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining (KDD’03), 2003.
581. Hassler Whitney. Congruent graphs and the connectivity of graphs. American
Journal of Mathematics, 54:150–168, 1932.
582. R. W. Whitty. Vertex-disjoint paths and edge-disjoint branchings in directed
graphs. Journal of Graph Theory, 11(3):349–358, 1987.
583. Harry Wiener. Structural determination of paraffin boiling points. Journal of
the American Chemical Society, 69:17–20, 1947.
584. Eugene P. Wigner. Characteristic vectors of bordered matrices with infinite
dimensions. Annals of Mathematics, 62:548–564, 1955.
585. Eugene P. Wigner. On the distribution of the roots of certain symmetric matrices.
Annals of Mathematics, 67:325–327, 1958.
586. Herbert S. Wilf. generatingfunctionology. Academic Press, 1994.
587. James H. Wilkinson. The Algebraic Eigenvalue Problem. Clarendon Press, 1965.
588. Thomas Williams and Colin Kelley. Gnuplot documentation.
589. Gerhard Winkler. Image Analysis, Random Fields, and Markov Chain Monte
Carlo Methods. Springer-Verlag, 2nd edition, 2003.
590. Gerhard J. Woeginger. Exact algorithms for NP-hard problems: A survey. In
Proceedings of the 5th International Workshop on Combinatorial Optimization
(Aussois’2001), volume 2570 of Lecture Notes in Computer Science, pages 185–
207. Springer-Verlag, 2003.
591. Kesheng Wu and Horst Simon. Thick-restart Lanczos method for large symmet-
ric eigenvalue problems. SIAM Journal on Matrix Analysis and Applications,
22(2):602–616, 2000.
592. Stefan Wuchty and Peter F. Stadler. Centers of complex networks. Journal of
Theoretical Biology, 223:45–53, 2003.
593. Norman Zadeh. Theoretical efficiency of the Edmonds-Karp algorithm for com-
puting maximal flows. Journal of the ACM, 19(1):184–192, 1972.
594. Bohdan Zelinka. Medians and peripherians of trees. Archivum Mathematicum
(Brno), 4:87–95, 1968.
595. Uri Zwick. All pairs shortest paths using bridging sets and rectangular matrix
multiplication. Electronic Colloquium on Computational Complexity (ECCC),
60(7), 2000.