The distribution of shortest path lengths on trees of a given size

in subcritical Erdős-Rényi networks

Barak Budnick, Ofer Biham and Eytan Katzav

arXiv:2310.01591v2 [cond-mat.stat-mech] 31 Oct 2023

Racah Institute of Physics, The Hebrew University, Jerusalem 9190401, Israel

Abstract
In the subcritical regime Erdős-Rényi (ER) networks consist of finite tree components, which are
non-extensive in the network size. The distribution of shortest path lengths (DSPL) of subcritical
ER networks was recently calculated using a topological expansion [E. Katzav, O. Biham and
A.K. Hartmann, Phys. Rev. E 98, 012301 (2018)]. The DSPL, which accounts for the distance
ℓ between any pair of nodes that reside on the same finite tree component, was found to follow
a geometric distribution of the form $P(L = \ell \mid L < \infty) = (1 - c)c^{\ell-1}$, where 0 < c < 1 is the
mean degree of the network. This result includes the contributions of trees of all possible sizes and
topologies. Here we calculate the distribution of shortest path lengths P (L = ℓ|S = s) between
random pairs of nodes that reside on the same tree component of a given size s. It is found that
$P(L = \ell \mid S = s) = \frac{\ell+1}{s^{\ell}} \frac{(s-2)!}{(s-\ell-1)!}$. Surprisingly, this distribution does not depend on the mean degree
c of the network from which the tree components were extracted. This is due to the fact that the
ensemble of tree components of a given size s in subcritical ER networks is sampled uniformly from
the set of labeled trees of size s and thus does not depend on c. The moments of the DSPL are also
calculated. It is found that the mean distance between random pairs of nodes on tree components
of size s satisfies $E[L \mid S = s] \sim \sqrt{s}$, unlike small-world networks in which the mean distance scales
logarithmically with s.

I. INTRODUCTION

Random networks provide a useful framework for the analysis of a large variety of systems
that consist of interacting objects [1–4]. One can distinguish between two major types of
random networks: supercritical networks and subcritical networks. Supercritical networks
form a giant component that encompasses a macroscopic fraction of all the nodes. The
giant component may provide a useful description of networks in which the connectivity
is essential, such as the world-wide-web, social networks, and infrastructure networks. The
giant component is a small-world network, namely the mean distance between pairs of nodes
on the giant component scales logarithmically with its size. It includes a large number of
cycles with a broad spectrum of cycle lengths [5–7]. These cycles provide redundancy in
the connectivity between pairs of nodes via multiple paths. The redundancy helps to maintain
the integrity of the giant component upon deletion of nodes or edges due to failures
or attacks. The combination of the small-world property and the redundancy gives rise to
highly efficient channels of transport and communication and to the robustness of the network.
In contrast, subcritical networks consist of finite tree components that do not scale
with the overall network size. In a tree topology each pair of nodes is connected by a single
path. Therefore, in subcritical networks the shortest path between any pair of nodes that
reside on the same tree component is, in fact, the only path between them. As a result, in
subcritical networks each node of degree k ≥ 2 is an articulation point, namely its deletion
would break the tree component on which it resides into at least two disconnected parts
[8, 9]. Moreover, each edge is a bredge (bridge edge), namely its deletion would break the
tree component on which it resides into two disconnected parts [10]. The subcritical tree
components may describe the fragmented structure of secure compartmentalized networks,
such as the communication networks of commercial enterprises, government agencies and
illicit organizations [11]. The structure of such networks may be determined by the trade-off
between efficiency and security. When security considerations outweigh efficiency considerations,
the number of communication lines may need to be reduced to a minimum, which
is achieved in the case of tree structures. Other examples of fragmented networks include
networks that suffered multiple failures, large scale attacks or epidemics, in which the remaining
functional or uninfected nodes form small, isolated components [12, 13]. In spite of
their importance, the structural and statistical properties of subcritical networks have not
attracted nearly as much attention as those of supercritical networks.
Random networks of the Erdős-Rényi (ER) type [14–16] are the simplest class of random
networks and are used as a benchmark for the study of structure and dynamics in complex
networks [17]. The ER network ensemble is a maximum entropy ensemble, under the condition
that the mean degree ⟨K⟩ = c is fixed. It is a special case of a broader class of random
uncorrelated networks, referred to as configuration model networks [18–21]. In an ER network
of N nodes, each pair of nodes is independently connected with probability p, such
that the mean degree is c = (N − 1)p. It was recently shown that the ER graph structure
is an asymptotic structure for networks that contract due to node deletion processes, which
may result from failures, attacks or epidemics [22, 23].
The degree distribution of ER networks follows a Poisson distribution of the form

$$P(K = k) = \frac{e^{-c} c^{k}}{k!}. \tag{1}$$
ER networks exhibit a percolation transition at c = 1 such that for c > 1 (supercritical
regime) there is a giant component [24], while for 0 < c < 1 (subcritical regime) the network
consists of small, isolated tree components [17, 25]. In the special case of c = 0 the network
consists of N isolated nodes and the degree distribution degenerates into $P(K = k) = \delta_{k,0}$.
In Fig. 1 we present the structure of a single instance of a subcritical ER network of size
N = 100 with mean degree c = 0.9. It consists of 33 isolated nodes, 9 dimers, two chains of
three nodes, two chains of four nodes and trees of 5, 6, 10 and 14 nodes.
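
As an illustration of the fragmented structure described above, the following minimal sketch samples one ER network and lists its component sizes. It is not taken from the paper: the function name `er_component_sizes`, the union-find bookkeeping and the `seed` argument are assumptions made for this example, and the quadratic loop over node pairs is only suitable for small N.

```python
import random
from collections import Counter

def er_component_sizes(n, c, seed=None):
    """Sample G(n, p) with p = c / (n - 1) and return the component sizes, largest first."""
    rng = random.Random(seed)
    p = c / (n - 1)
    parent = list(range(n))  # union-find forest over the nodes

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps the forest shallow
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:  # each pair of nodes is connected independently
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj

    sizes = Counter(find(i) for i in range(n)).values()
    return sorted(sizes, reverse=True)
```

Calling `Counter(er_component_sizes(100, 0.9, seed=1))` gives a histogram of component sizes for one instance. The counts vary from instance to instance, so they should not be expected to reproduce the particular instance of Fig. 1.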
In the asymptotic limit, ER networks exhibit duality with respect to the percolation
threshold [17]. In a supercritical ER network of N nodes the fraction of nodes that belong
to the giant component is denoted by 0 < g ≤ 1, while the fraction of nodes that belong
to the finite components is 1 − g. Thus, the subcritical network that consists of the finite
components is of size N(1 − g) . This network is in itself an ER network whose mean degree
is c′ = c(1 − g), where c′ < 1.
The distribution of tree sizes in subcritical ER networks with mean degree 0 < c < 1 is
given by [17, 26, 27]

$$P(S = s) = \frac{2 s^{s-2} c^{s-1} e^{-cs}}{(2 - c)\, s!}. \tag{2}$$
In the special case of c = 0 this distribution degenerates into $P(S = s) = \delta_{s,1}$.


FIG. 1. The structure of a single instance of a subcritical ER network of N = 100 nodes with
mean degree c = 0.9. It consists of 33 isolated nodes, 9 dimers, two chains of three nodes, two
chains of four nodes and trees of 5, 6, 10 and 14 nodes.

The mean tree size is given by [27]

$$\langle S \rangle = \frac{2}{2 - c}. \tag{3}$$
The expected number of trees in a network instance consisting of N nodes is thus given by

$$N_T = \frac{N}{\langle S \rangle} = N \left( 1 - \frac{c}{2} \right). \tag{4}$$
The variance of P (S = s) is given by [27]

$$\mathrm{Var}(S) = \frac{2c}{(1 - c)(2 - c)^2}. \tag{5}$$
Note that Var(S) diverges as $c \to 1^-$, which implies that near the percolation transition
some of the trees are very large.
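
The normalization of Eq. (2) and the moments in Eqs. (3) and (5) can be checked numerically. The sketch below is an illustration only: the value c = 0.5 and the truncation point `s_max` are assumptions chosen for this example (the tail of the sum decays exponentially for c < 1, so the truncation error is negligible here).

```python
import math

def tree_size_pmf(s, c):
    """P(S = s) of Eq. (2), evaluated in log space to avoid overflow of s**(s-2) and s!."""
    log_p = (math.log(2.0) + (s - 2) * math.log(s) + (s - 1) * math.log(c)
             - c * s - math.log(2.0 - c) - math.lgamma(s + 1))
    return math.exp(log_p)

c = 0.5        # mean degree (assumed for the example)
s_max = 2000   # truncation of the infinite sums (assumed for the example)

pmf = [tree_size_pmf(s, c) for s in range(1, s_max + 1)]
total = sum(pmf)                                                     # should be close to 1
mean = sum(s * p for s, p in enumerate(pmf, start=1))                # compare with Eq. (3)
var = sum(s * s * p for s, p in enumerate(pmf, start=1)) - mean**2   # compare with Eq. (5)

mean_eq3 = 2.0 / (2.0 - c)
var_eq5 = 2.0 * c / ((1.0 - c) * (2.0 - c) ** 2)
```

Closer to the percolation threshold the tail of P(S = s) decays more slowly, so `s_max` would have to be increased accordingly.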
Trees of a given size s may exhibit different structures, where the number of distinct
structures increases with s. An important distinction in this context is between labeled
trees, in which nodes are distinguishable and carry labels, and unlabeled trees in which the
nodes are indistinguishable. The number $T_s$ of distinct labeled tree configurations of size s
is given by the Cayley formula [28]

$$T_s = s^{s-2}. \tag{6}$$

Each one of these labeled tree configurations can be encoded by a unique sequence, referred
to as the Prüfer sequence [29]. The Prüfer sequence of a labeled tree of s nodes is a string
of s − 2 integers, taking values in the range of 1, 2, . . . , s. The Prüfer code provides a very
powerful tool for the random sampling of labeled trees of a given size.
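
A minimal sketch of how the Prüfer code can be used for uniform sampling is given below. It uses the standard decoding procedure (repeatedly attach the smallest-labeled leaf to the next entry of the sequence); the function names `prufer_to_edges` and `random_labeled_tree` are chosen for this example and do not appear in the paper.

```python
import heapq
import random

def prufer_to_edges(seq):
    """Decode a Prüfer sequence (length s - 2, values in 1..s) into the s - 1 edges of its tree."""
    s = len(seq) + 2
    degree = [1] * (s + 1)  # index 0 unused; degree(v) = 1 + number of appearances of v
    for v in seq:
        degree[v] += 1
    leaves = [v for v in range(1, s + 1) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in seq:
        leaf = heapq.heappop(leaves)  # smallest-labeled current leaf
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:            # v has become a leaf
            heapq.heappush(leaves, v)
    # exactly two nodes remain and are joined by the last edge
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges

def random_labeled_tree(s, rng=random):
    """Sample uniformly among the s**(s-2) labeled trees of size s >= 2."""
    return prufer_to_edges([rng.randint(1, s) for _ in range(s - 2)])
```

Since the map between sequences and labeled trees is one-to-one, drawing each of the s − 2 entries independently and uniformly from 1, 2, . . . , s yields a uniformly random labeled tree.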
When the labels are removed, the number of distinct configurations is reduced since
each unlabeled configuration corresponds to several labeled configurations. In the case of
unlabeled trees, the number of non-isomorphic tree topologies, n(s), which can be assembled
from s nodes quickly increases as a function of s. For example, the values of n(s) for
s = 1, 2, . . . , 13 are 1, 1, 1, 2, 3, 6, 11, 23, 47, 106, 235, 551 and 1301, respectively [30]. An
efficient algorithm for generating all the tree topologies that can be assembled from s nodes,
is presented in Refs. [31, 32]. A list of all possible tree topologies up to s = 13 is presented
in Ref. [30].
In Fig. 2 we present the tree topologies that consist of s nodes for s = 1, 2, . . . , 7. For
s ≤ 3 the linear chain topology is the only possible topology while for s ≥ 4 more complex
topologies appear and their number quickly increases. The number of labeled configurations
associated with each one of the tree topologies is also shown. Note that the total number
of labeled trees that consist of s nodes adds up to $s^{s-2}$, which is consistent with the Cayley
formula (6).
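
For small trees, the number of labeled configurations per topology can be reproduced by direct enumeration. The sketch below relies on the fact that a node appearing k times in the Prüfer sequence has degree k + 1, and groups all sequences by the resulting sorted degree sequence. This grouping is an assumption that identifies the topology only for s ≤ 5; for larger s, distinct topologies may share a degree sequence. The function name is chosen for this example.

```python
from collections import Counter
from itertools import product

def degree_sequence_counts(s):
    """Count labeled trees of size s by sorted degree sequence, via their Prüfer sequences."""
    counts = Counter()
    for seq in product(range(1, s + 1), repeat=s - 2):
        appearances = Counter(seq)
        degrees = tuple(sorted((1 + appearances[v] for v in range(1, s + 1)),
                               reverse=True))
        counts[degrees] += 1
    return counts

counts = degree_sequence_counts(5)
```

For s = 5 the three degree sequences correspond to the linear chain, the chain with one side branch, and the star, with 60, 60 and 5 labeled configurations, respectively, adding up to 5^3 = 125.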
While the local structure of a network is well characterized by the degree distribution,
the distribution of shortest path lengths (DSPL), denoted by P (L = ℓ), provides a useful
characterization of its large scale structure. When two nodes, i and j, reside on the same
connected component, the distance, $\ell_{ij}$, between them is given by the length of the shortest
path that connects them. When nodes i and j reside on different network components, there
is no path connecting them and the distance between them is $\ell_{ij} = \infty$. The probability that
two randomly selected nodes reside on the same component, and thus are at a finite distance
from each other, is denoted by P (L < ∞) = 1 − P (L = ∞). The conditional DSPL between
pairs of nodes that reside on the same component is denoted by P (L = ℓ|L < ∞), where
ℓ = 1, 2, . . . , N − 1. The conditional DSPL satisfies

$$P(L = \ell \mid L < \infty) = \frac{P(L = \ell)}{P(L < \infty)}. \tag{7}$$
Note that P (L = ℓ|L < ∞) is well defined only for c > 0. This is due to the fact that
P (L < ∞) = 0 for c = 0. Thus, the analysis presented below is focused on 0 < c < 1.

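[Figure 2: tree diagrams omitted. Number of labeled configurations shown for each topology: s = 1: 1; s = 2: 1; s = 3: 3; s = 4: 12, 4; s = 5: 60, 60, 5; s = 6: 360, 90, 120, 6, 360, 360; s = 7: 2520, 2520, 5040, 840, 630, 2520, 210, 840, 1260, 420, 7.]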

FIG. 2. The tree topologies that consist of s nodes for s = 1, 2, . . . , 7. For s ≤ 3 the linear
chain topology is the only topology while for s ≥ 4 more complex topologies appear and their
number quickly increases. The number of labeled configurations associated with each one of the
tree topologies is also shown. Note that the total number of labeled trees that consist of s nodes
adds up to $s^{s-2}$, which is consistent with the Cayley formula (6).

The DSPL provides a natural platform for the study of dynamical processes on networks,
such as diffusive processes, epidemic spreading, critical phenomena, synchronization,
information propagation and communication. For supercritical networks the DSPL was calculated
using various theoretical approaches, which include recursion equations, generating
functions, master equations and branching processes [7, 12, 13, 21, 24, 33–45]. In the special
case of random regular graphs with c ≥ 3 the giant component encompasses the whole
network. In this case there is a closed-form analytical expression for P (L = ℓ) [34, 40, 44],
which follows a discrete Gompertz distribution [46].
It was shown that the mean distance $E[L \mid L < \infty] = \sum_{\ell=1}^{\infty} \ell P(L = \ell \mid L < \infty)$ scales
like E[L|L < ∞] ∼ ln N/ ln c, in agreement with rigorous results, showing that supercritical
random networks are small-world networks [47–50]. It was also shown that the variance of the
DSPL of supercritical random networks does not scale with N, and satisfies Var(L) ∼ O(1)
[40]. The statistical properties of distances in scale-free networks, which typically consist of a
single connected component, were studied in Refs. [36, 37, 51]. Using an analytical argument
it was shown that scale free networks with degree distributions of the form $P(k) \sim k^{-\gamma}$
are ultrasmall, namely they exhibit a mean distance which scales like E[L] ∼ ln ln N for
2 < γ < 3. For γ = 3 it was shown that the mean distance scales like E[L] ∼ ln N/ ln ln N,
while for γ > 3 it coincides with the common scaling of small world networks, namely
E[L] ∼ ln N.
The DSPL of subcritical ER networks was recently studied using a topological expansion
[27]. This analysis employs the fact that in the subcritical regime, in the large-network limit,
the network consists of finite tree components with no cycles [17, 25]. It was found that for
0 < c < 1 the DSPL between pairs of nodes that reside on the same tree component is given
by [27]

$$P(L = \ell \mid L < \infty) = (1 - c) c^{\ell-1}, \tag{8}$$

and that the probability that two random nodes reside on the same tree component is [27]

$$P(L < \infty) = \frac{c}{(1 - c) N}. \tag{9}$$

The corresponding tail distribution is given by

$$P(L > \ell \mid L < \infty) = c^{\ell}. \tag{10}$$

The mean distance between pairs of nodes that reside on the same tree component is

$$E[L \mid L < \infty] = \frac{1}{1 - c}, \tag{11}$$

while the variance of the DSPL is given by

$$\mathrm{Var}(L \mid L < \infty) = \frac{c}{(1 - c)^2}. \tag{12}$$
While subcritical ER networks consist of finite tree components, in supercritical ER
networks there is a coexistence between the giant component and the finite tree components.
As a result, the DSPL of supercritical ER networks combines the contributions of the giant
and finite components. Using the duality relations discussed above, the DSPL of the finite
components of a supercritical ER network can be obtained from the analysis of its dual
subcritical network [24, 27].

In this paper we calculate the DSPL of finite tree components of size s, denoted by P (L =
ℓ|S = s), in subcritical ER networks. This is done by expressing the overall distribution
P (L = ℓ) as a linear combination of the corresponding conditional distributions P (L =
ℓ|S = s), using the known distribution of tree sizes. Using an inverse transformation we
extract the conditional distribution P (L = ℓ|S = s). Surprisingly, this distribution does not
depend on the mean degree c of the network from which the tree components were extracted.
This is due to the fact that the ensemble of tree components of a given size s in subcritical
ER networks is sampled uniformly from the set of labeled trees of size s and thus does not
depend on c. This insight is corroborated by a direct combinatorial argument. We also
calculate the DSPL over all tree components up to size s, denoted by P (L = ℓ|S ≤ s) and
examine its convergence towards the DSPL of the whole network, P (L = ℓ|L < ∞), as s is
increased. The moments of the DSPL are also calculated. It is found that the mean distance
between random pairs of nodes on tree components of size s satisfies $E[L \mid S = s] \sim \sqrt{s}$, unlike
small-world networks in which the mean distance scales logarithmically with s.
The paper is organized as follows. In Sec. II we consider the conditional DSPL on finite
tree components. The moments of the DSPL are calculated in Sec. III. The results are
discussed in Sec. IV and summarized in Sec. V.

II. THE DISTRIBUTION OF SHORTEST PATH LENGTHS

Using the law of total probability the DSPL of subcritical ER networks, given by Eq.
(8), can be expressed in the form

$$P(L = \ell \mid L < \infty) = \sum_{s=2}^{\infty} P(L = \ell \mid S = s) \widehat{P}(S = s), \tag{13}$$

where P (L = ℓ|S = s) is the DSPL on tree components that consist of s nodes and $\widehat{P}(S = s)$
is the distribution of tree sizes on which a pair of random nodes resides (given that they reside
on the same tree component). In the analysis below we extract a closed-form expression
for P (L = ℓ|S = s) by inverting the infinite system of linear equations, given by Eq. (13).
Unlike commonly used methods for the calculation of such distributions, which are based on
combinatorial considerations, this approach is purely algebraic. It is essentially a top-down
approach, in which the conditional distribution P (L = ℓ|S = s) is obtained from the overall
distribution P (L = ℓ|L < ∞) via the distribution of tree sizes P (S = s). This approach is
advantageous over the complementary bottom-up approach, which would require a detailed
knowledge of all the tree configurations of size s, their weights and the DSPL over each and
every one of them.
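The distribution $\widehat{P}(S = s)$ is given by

$$\widehat{P}(S = s) = \frac{\binom{s}{2}}{\left\langle \binom{S}{2} \right\rangle} P(S = s), \tag{14}$$

where

$$\left\langle \binom{S}{2} \right\rangle = \sum_{s=2}^{\infty} \binom{s}{2} P(S = s) \tag{15}$$

is the mean number of pairs of nodes in a randomly selected tree component, and P (S = s)
is given by Eq. (2). This is due to the fact that the number of pairs of nodes on a tree
component of size s is given by the binomial coefficient $\binom{s}{2}$. The evaluation of $\left\langle \binom{S}{2} \right\rangle$ is
presented in Appendix A. It yields

$$\left\langle \binom{S}{2} \right\rangle = \frac{c}{(1 - c)(2 - c)}, \tag{16}$$

where 0 < c < 1. Inserting P (S = s) from Eq. (2) and $\left\langle \binom{S}{2} \right\rangle$ from Eq. (16) into Eq. (14),
we obtain

$$\widehat{P}(S = s) = (1 - c) \frac{s^{s-2} c^{s-2} e^{-cs}}{(s - 2)!}. \tag{17}$$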
Inserting $\widehat{P}(S = s)$ from Eq. (17) and P (L = ℓ|L < ∞) from Eq. (8) into Eq. (13), we
obtain

$$\sum_{s=2}^{\infty} \frac{s^{s-2} c^{s-1} e^{-cs}}{(s - 2)!} P(L = \ell \mid S = s) = c^{\ell}. \tag{18}$$

This equation can be re-written in the form

$$\sum_{s=2}^{\infty} \frac{s^{s-2} (c e^{-c})^{s}}{(s - 2)!} P(L = \ell \mid S = s) = c^{\ell+1}. \tag{19}$$

The distribution P (L = ℓ|S = s) is obtained by inverting Eq. (19). In the inversion process
we assume that P (L = ℓ|S = s) does not depend on the mean degree c. The results
presented below show that such a solution indeed exists and is justified by a combinatorial
argument. The resulting expression for P (L = ℓ|S = s) is verified by computer simulations.
Defining

$$x = c e^{-c} \tag{20}$$

enables us to express the left hand side of Eq. (19) as a power series in x. For the analysis
below, it will be useful to also express the right hand side in terms of x rather than c. To
this end, we invert Eq. (20) and obtain

$$c = -W(-x), \tag{21}$$

where W (x) is the Lambert W function [52]. Eq. (19) can now be written in the form

$$\sum_{s=2}^{\infty} \frac{s^{s-2}}{(s - 2)!} P(L = \ell \mid S = s)\, x^{s} = [-W(-x)]^{\ell+1}. \tag{22}$$

From equation (3.2.2) in Ref. [53], which results from the Lagrange inversion formula, we
obtain the identity

$$[W(x)]^{r} = (-r) \sum_{s=r}^{\infty} \frac{(-s)^{s-r-1}}{(s - r)!} x^{s}. \tag{23}$$

Using Eq. (23) we now express the right hand side of Eq. (22) as a power series in x.
Comparing the coefficients of $x^s$ on both sides of Eq. (22), we obtain the DSPL of tree
components that consist of s nodes in subcritical ER networks with 0 < c < 1. It is given
by

$$P(L = \ell \mid S = s) = \frac{\ell + 1}{s^{\ell}} \frac{(s - 2)!}{(s - \ell - 1)!}, \tag{24}$$
where s ≥ 2 and 1 ≤ ℓ ≤ s − 1. This is the central result of the paper. Clearly, this
distribution does not depend on the mean degree c of the subcritical network from which
the trees of size s were extracted.
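
The central result can be checked against the decomposition of Eq. (13). The sketch below evaluates Eq. (24) exactly with rational arithmetic, verifies that it is normalized, and mixes it with the pair-weighted size distribution of Eq. (17) to recover the geometric form of Eq. (8). The function names and the truncation point `s_max` are assumptions of this example; the truncation is adequate for c = 0.5 but would need to be raised as c approaches 1.

```python
import math
from fractions import Fraction

def dspl_given_size(ell, s):
    """P(L = ell | S = s) of Eq. (24) as an exact fraction (zero outside 1 <= ell <= s - 1)."""
    if s < 2 or not 1 <= ell <= s - 1:
        return Fraction(0)
    return Fraction((ell + 1) * math.factorial(s - 2),
                    s ** ell * math.factorial(s - ell - 1))

def pair_size_pmf(s, c):
    """Pair-weighted size distribution of Eq. (17), evaluated in log space."""
    log_p = (math.log(1.0 - c) + (s - 2) * math.log(s) + (s - 2) * math.log(c)
             - c * s - math.lgamma(s - 1))
    return math.exp(log_p)

def mixed_dspl(ell, c, s_max=400):
    """Right hand side of Eq. (13), truncated at s_max (an assumption of this sketch)."""
    return sum(pair_size_pmf(s, c) * float(dspl_given_size(ell, s))
               for s in range(ell + 1, s_max + 1))

def geometric_dspl(ell, c):
    """Eq. (8): the DSPL of the whole subcritical network."""
    return (1.0 - c) * c ** (ell - 1)
```

Comparing `mixed_dspl(ell, 0.5)` with `geometric_dspl(ell, 0.5)` for the first few values of ℓ gives agreement up to the truncation and rounding errors.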
Unlike the DSPL of the whole network, which is a monotonically decreasing geometric
distribution, P (L = ℓ|S = s) exhibits a peak. The location of the peak is referred to as the
mode of the distribution and is denoted by $\ell_{\rm mode}$. Since P (L = ℓ|S = s) exhibits a single
peak, $\ell_{\rm mode}$ is the lowest integer for which P (L = ℓ + 1|S = s) < P (L = ℓ|S = s). Using Eq.
(24), this inequality can be expressed in the form

$$\frac{\ell + 2}{\ell + 1} \cdot \frac{s - \ell - 1}{s} < 1. \tag{25}$$
The solution of this inequality (assuming positive ℓ) is

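$$\ell > \frac{\sqrt{4s + 1} - 3}{2}. \tag{26}$$

The mode $\ell_{\rm mode}$ is the lowest integer that satisfies Eq. (26) or turns it into an equality, namely

$$\ell_{\rm mode} = \left\lceil \frac{\sqrt{4s + 1} - 3}{2} \right\rceil, \tag{27}$$

where ⌈x⌉ is the ceiling function, namely the lowest integer that is not smaller than x. When the
right hand side of Eq. (26) is a positive integer, P (L = ℓ|S = s) takes the same value at $\ell_{\rm mode}$
and $\ell_{\rm mode} + 1$. In the limit of large trees, the mode scales like $\ell_{\rm mode} \sim \sqrt{s}$.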
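
The location of the peak can be cross-checked by a direct search over Eq. (24). In the sketch below, which is an illustration rather than part of the original analysis, the ceiling in Eq. (27) is evaluated with integer arithmetic only, using the equivalence between ℓ ≥ (√(4s + 1) − 3)/2 and (2ℓ + 3)² ≥ 4s + 1, so that no floating-point rounding enters. The function names are chosen for this example, and the comparison is restricted to s ≥ 3, for which the formula yields a value in the allowed range ℓ ≥ 1.

```python
import math
from fractions import Fraction

def dspl_given_size(ell, s):
    """P(L = ell | S = s) of Eq. (24) as an exact fraction."""
    return Fraction((ell + 1) * math.factorial(s - 2),
                    s ** ell * math.factorial(s - ell - 1))

def mode_from_formula(s):
    """Eq. (27): lowest integer ell with (2*ell + 3)**2 >= 4*s + 1."""
    ell = 0
    while (2 * ell + 3) ** 2 < 4 * s + 1:
        ell += 1
    return ell

def mode_by_search(s):
    """First maximizer of Eq. (24) over 1 <= ell <= s - 1 (ties resolve to the lower value)."""
    return max(range(1, s), key=lambda ell: dspl_given_size(ell, s))
```

For example, for s = 6 the values at ℓ = 1 and ℓ = 2 are both equal to 1/3, and both functions return the lower of the two.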
It turns out that the DSPL given by Eq. (24) coincides with the DSPL of the ensemble
obtained by uniformly random sampling over all the labeled tree configurations of size s
[54, 55]. The DSPL over all the labeled tree configurations of size s can be obtained from
direct combinatorial considerations. To this end we pick a random pair of nodes i and j on
a tree of size s. We count the number of possible configurations of labeled trees of size s,
in which the distance between a given pair of nodes i and j is ℓ. The fact that the distance
between i and j is ℓ implies that there is a single path of length ℓ between them. This path
consists of ℓ − 1 intermediate nodes. The number of ways to select these ℓ − 1 nodes from
the s − 2 nodes (not including i and j), where the order is important, is given by

$$\frac{(s - 2)!}{(s - \ell - 1)!} = \binom{s - 2}{\ell - 1} (\ell - 1)!. \tag{28}$$
The path joining i and j, which consists of ℓ + 1 nodes (including i and j), can be considered
as the backbone of the tree. Each node on the backbone may be the root of a tree branch
such that each one of the remaining s − ℓ − 1 nodes belongs to one of these tree branches.
This enables us to use the generalized Cayley formula [28, 56, 57], which provides the number
of labeled tree configurations that consist of ℓ + 1 non-empty disjoint tree components (also
known as forests) with a total of s nodes, namely

$$T_{s,\ell+1} = (\ell + 1) s^{s-\ell-2}. \tag{29}$$

Note that the Cayley formula of Eq. (6) is a special case of the generalized Cayley formula (29),
namely $T_s = T_{s,1}$. The probability P (L = ℓ|S = s) is obtained by dividing the number of
possible configurations of labeled trees of size s, in which the distance between a given pair
of nodes i and j is ℓ by the total number $T_s$ of configurations of labeled trees of size s. It
yields

$$P(L = \ell \mid S = s) = \frac{T_{s,\ell+1} \binom{s-2}{\ell-1} (\ell - 1)!}{T_s}, \tag{30}$$
which is equivalent to Eq. (24). This equivalence suggests that the ensemble of trees of a
given size s in subcritical ER networks is equivalent to a uniformly random sampling among
all the $T_s$ labeled tree configurations of size s. This is consistent with the fact that the DSPL
given by Eqs. (24) and (30) does not depend on the mean degree c of the network from
which these trees were extracted. The equivalence between the two ensembles can be justified
using the following argument. Given a finite connected component consisting of s nodes in
a subcritical ER network, it almost surely exhibits a tree topology containing s − 1
edges [17]. For a set of s nodes, the probability that these nodes will form a connected tree
component of a given labeled configuration, which is isolated from the rest of the network,
is given by

$$p^{s-1} (1 - p)^{\binom{s}{2} - (s-1)} (1 - p)^{s(N-s)}, \tag{31}$$

where the first term accounts for the s − 1 edges of the tree, the second term accounts for
the probability that there are no additional edges between the nodes in the tree component,
and the third term accounts for the probability that the tree is isolated from the rest of the
network. In an ER network, in which the connectivity between different pairs of nodes is
independent, this probability is the same for all possible configurations of labeled trees of
size s.
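
The combinatorial argument can be verified by brute force for small trees. The sketch below enumerates all s^(s−2) labeled trees of size s through their Prüfer sequences, measures the distance between one fixed pair of nodes (by symmetry, any pair gives the same statistics), and compares the resulting exact distribution with Eq. (24). The function names and the 0-based node labels are assumptions of this example, and the enumeration is feasible only for small s.

```python
import heapq
import math
from collections import Counter, deque
from fractions import Fraction
from itertools import product

def prufer_to_adjacency(seq, s):
    """Decode a Prüfer sequence over labels 0..s-1 into adjacency lists."""
    degree = [1] * s
    for v in seq:
        degree[v] += 1
    leaves = [v for v in range(s) if degree[v] == 1]
    heapq.heapify(leaves)
    adj = [[] for _ in range(s)]
    for v in seq:
        leaf = heapq.heappop(leaves)  # attach the smallest-labeled leaf to v
        adj[leaf].append(v)
        adj[v].append(leaf)
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    a, b = heapq.heappop(leaves), heapq.heappop(leaves)
    adj[a].append(b)
    adj[b].append(a)
    return adj

def distance(adj, source, target):
    """Breadth-first search distance between two nodes of a connected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    raise ValueError("nodes are not connected")

def enumerated_dspl(s):
    """Exact distribution of the distance between nodes 0 and 1 over all labeled trees of size s."""
    counts = Counter(distance(prufer_to_adjacency(seq, s), 0, 1)
                     for seq in product(range(s), repeat=s - 2))
    total = s ** (s - 2)
    return {ell: Fraction(n, total) for ell, n in sorted(counts.items())}

def dspl_given_size(ell, s):
    """P(L = ell | S = s) of Eq. (24) as an exact fraction."""
    return Fraction((ell + 1) * math.factorial(s - 2),
                    s ** ell * math.factorial(s - ell - 1))
```

Since both sides are exact fractions, `enumerated_dspl(s)` can be compared with Eq. (24) for equality rather than within a tolerance.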
Summing up the right hand side of Eq. (24) from ℓ + 1 to s − 1, we obtain the tail
distribution, which is given by

$$P(L > \ell \mid S = s) = \frac{(s - 2)!}{s^{s-2}} \frac{s^{s-\ell-2}}{(s - \ell - 2)!}, \tag{32}$$

where ℓ = 0, 1, 2, . . . , s − 2. It is a monotonically decreasing function that satisfies P (L >
0|S = s) = 1 and $P(L > s - 2 \mid S = s) = (s - 2)!/s^{s-2}$.
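
The closed form of the tail distribution can be confirmed by summing Eq. (24) explicitly. The following short sketch, with function names chosen for this example, uses exact rational arithmetic so that the two sides can be compared for equality.

```python
import math
from fractions import Fraction

def dspl_given_size(ell, s):
    """P(L = ell | S = s) of Eq. (24) as an exact fraction."""
    return Fraction((ell + 1) * math.factorial(s - 2),
                    s ** ell * math.factorial(s - ell - 1))

def tail_given_size(ell, s):
    """P(L > ell | S = s) of Eq. (32), for 0 <= ell <= s - 2."""
    return Fraction(math.factorial(s - 2) * s ** (s - ell - 2),
                    s ** (s - 2) * math.factorial(s - ell - 2))

def tail_by_summation(ell, s):
    """Sum of Eq. (24) over all distances larger than ell."""
    return sum((dspl_given_size(k, s) for k in range(ell + 1, s)), Fraction(0))
```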


FIG. 3. (Color online) Analytical results (solid lines) for the DSPL on trees of size s, denoted by
P (L = ℓ|S = s), for s = 10, 20, 30 and 40 (left to right), obtained from Eq. (24). The analytical
results are in very good agreement with the results obtained from computer simulations carried
out for networks of size $N = 10^4$, c = 0.5 (×) and c = 0.8 (◦), which coincide with each other.
These results confirm the validity of Eq. (24) as well as the fact that the ensemble of finite trees
of a given size s extracted from subcritical ER networks of mean degree c does not depend on c.
Note that the simulation results for c = 0.5 are shown only for s = 10, 20 and 30, because trees of
size s = 40 are extremely rare in this case.

In Fig. 3 we present analytical results (solid lines) for the DSPL on trees of size s, denoted
by P (L = ℓ|S = s), for s = 10, 20, 30 and 40, obtained from Eq. (24). The analytical results
are in very good agreement with the results obtained from computer simulations carried out
for c = 0.5 (×) and c = 0.8 (◦), which coincide with each other. These results confirm the
validity of Eq. (24) as well as the fact that the ensemble of finite trees of a given size s
extracted from subcritical ER networks of mean degree c does not depend on c.
In the simulations we generated subcritical ER networks of size $N = 10^4$ with mean
degree c = 0.5 and c = 0.8. From these networks we picked tree components of the desired
sizes, such as s = 10, 20, 30 and 40. The expected number of trees of size s in a network
instance of size N is given by

$$N_T(s) = N_T P(S = s). \tag{33}$$

Inserting $N_T$ from Eq. (4) and P (S = s) from Eq. (2) into Eq. (33), we obtain

$$N_T(s) = N \frac{s^{s-2} c^{s-1} e^{-cs}}{s!}. \tag{34}$$
This result can be used in order to estimate the number of network instances which is
required in order to obtain the desired number of trees of size s that are needed for the
statistical analysis. The distribution P (S = s) is a quickly decreasing function of s. Thus,
trees of size s become less abundant as s is increased. As a result, one needs a large number
of network instances in order to obtain sufficient data for statistical analysis of large tree
components. The results presented in Fig. 3 are based on 1,500 instances of subcritical ER
networks of size $N = 10^4$ for each value of c. For c = 0.8 these network instances yield
12,454 trees of size 10, 1,823 trees of size 20, 500 trees of size 30 and 183 trees of size 40.
For c = 0.5 these network instances yield 3,617 trees of size 10, 91 trees of size 20, 8 trees
of size 30 and no trees of size 40. Therefore, in Fig. 3 the analytical results for s = 40 are
compared only to the simulation results for c = 0.8 (◦).
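
A simulation of this kind can be sketched as follows. This is not the authors' code: the network is sampled by geometric skipping over node pairs (a standard technique for sparse G(N, p) graphs, swapped in here for speed), components are found by breadth-first search, and only components with exactly s − 1 edges are kept as trees. All function names and the `seed` argument are assumptions of this example.

```python
import math
import random
from collections import Counter, deque

def sparse_er_graph(n, c, rng):
    """Adjacency lists of G(n, p) with p = c / (n - 1), sampled by geometric skipping."""
    p = c / (n - 1)
    adj = [[] for _ in range(n)]
    log_q = math.log(1.0 - p)
    v, w = 1, -1
    while v < n:
        w += 1 + int(math.log(1.0 - rng.random()) / log_q)  # jump to the next present edge
        while w >= v and v < n:
            w -= v
            v += 1
        if v < n:
            adj[v].append(w)
            adj[w].append(v)
    return adj

def components(adj):
    """Yield each connected component as a list of nodes."""
    seen = [False] * len(adj)
    for start in range(len(adj)):
        if seen[start]:
            continue
        seen[start] = True
        comp, queue = [start], deque([start])
        while queue:
            u = queue.popleft()
            for x in adj[u]:
                if not seen[x]:
                    seen[x] = True
                    comp.append(x)
                    queue.append(x)
        yield comp

def pair_distance_counts(adj, comp):
    """Histogram of distances over all unordered node pairs of one component."""
    counts = Counter()
    for source in comp:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for x in adj[u]:
                if x not in dist:
                    dist[x] = dist[u] + 1
                    queue.append(x)
        counts.update(d for node, d in dist.items() if node > source)  # count each pair once
    return counts

def simulated_dspl(s, c, n, instances, seed=0):
    """Empirical P(L = ell | S = s), pooled over tree components of size s."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(instances):
        adj = sparse_er_graph(n, c, rng)
        for comp in components(adj):
            n_edges = sum(len(adj[u]) for u in comp) // 2
            if len(comp) == s and n_edges == s - 1:  # keep tree components only
                counts.update(pair_distance_counts(adj, comp))
    total = sum(counts.values())
    return {ell: k / total for ell, k in sorted(counts.items())} if total else {}
```

For example, `simulated_dspl(5, 0.8, 2000, 20)` pools the trees of size 5 from 20 small network instances. The estimate at ℓ = 1 equals 2/s for any collection of trees, because a tree of size s always has s − 1 edges among its s(s − 1)/2 pairs, while the other values fluctuate around Eq. (24).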
Another interesting distribution is the DSPL between pairs of nodes that reside on all
tree components of size s′ ≤ s. It can be obtained from

$$P(L = \ell \mid S \le s) = \frac{\sum_{s'=2}^{s} \widehat{P}(S = s') P(L = \ell \mid S = s')}{\sum_{s'=2}^{s} \widehat{P}(S = s')}. \tag{35}$$

Taking the limit of large s, P (L = ℓ|S ≤ s) converges towards P (L = ℓ|L < ∞), as in Eq.
(13). To explore this convergence it is convenient to replace the sums $\sum_{s'=2}^{s}$ in Eq. (35) by
the difference $\sum_{s'=2}^{\infty} - \sum_{s'=s+1}^{\infty}$. Carrying out the first summations in the numerator and in
the denominator, we obtain

$$P(L = \ell \mid S \le s) = \frac{(1 - c) c^{\ell-1} - \sum_{s'=s+1}^{\infty} \widehat{P}(S = s') P(L = \ell \mid S = s')}{1 - \sum_{s'=s+1}^{\infty} \widehat{P}(S = s')}. \tag{36}$$

In Fig. 4 we present analytical results (solid lines) for the distribution P (L = ℓ|S ≤ s)
of shortest path lengths on all tree components of size smaller or equal to s, in subcritical
ER networks with mean degree c = 0.8. The analytical results obtained from Eq. (36),
are presented for tree sizes of s = 10, 20, 30 and 40 (top to bottom on the left hand side).
The analytical results are in very good agreement with the results obtained from computer
simulations carried out for c = 0.8 (◦). As s is increased, the distribution P (L = ℓ|S ≤ s)
converges towards the overall DSPL P (L = ℓ|L < ∞) of the subcritical ER network (dashed
line).

FIG. 4. (Color online) Analytical results (solid lines) for P (L = ℓ|S ≤ s) on tree components
of size smaller or equal to s in subcritical ER network with mean degree c = 0.8, for s = 10,
20, 30 and 40 (top to bottom on the left hand side), obtained from Eq. (36). As s is increased,
P (L = ℓ|S ≤ s) converges towards the overall DSPL, P (L = ℓ|L < ∞), of the subcritical ER
network (dashed line). The analytical results are in very good agreement with the results obtained
from computer simulations (◦).
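
The convergence of the cumulative distribution can be explored numerically by evaluating the finite sums of Eq. (35) directly. The sketch below works in floating point through logarithms; the function names are chosen for this example and are not taken from the paper.

```python
import math

def dspl_given_size(ell, s):
    """Eq. (24) in floating point, computed through logarithms of the factorials."""
    if not 1 <= ell <= s - 1:
        return 0.0
    return math.exp(math.log(ell + 1) - ell * math.log(s)
                    + math.lgamma(s - 1) - math.lgamma(s - ell))

def pair_size_weight(s, c):
    """Pair-weighted size distribution of Eq. (17), evaluated in log space."""
    return math.exp(math.log(1.0 - c) + (s - 2) * math.log(s) + (s - 2) * math.log(c)
                    - c * s - math.lgamma(s - 1))

def cumulative_dspl(ell, s, c):
    """P(L = ell | S <= s) of Eq. (35)."""
    sizes = range(2, s + 1)
    weights = [pair_size_weight(k, c) for k in sizes]
    numerator = sum(w * dspl_given_size(ell, k) for k, w in zip(sizes, weights))
    return numerator / sum(weights)
```

Evaluating `cumulative_dspl(ell, s, c)` for increasing s shows the approach towards the geometric distribution (1 − c)c^(ℓ−1). The approach becomes slower as c gets closer to 1, since larger trees then carry more of the weight.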

III. THE MEAN AND VARIANCE OF THE DSPL

In order to calculate the moments of the DSPL, we define the moment generating function

$$M(x) = \sum_{\ell=1}^{s-1} e^{x \ell} P(L = \ell \mid S = s). \tag{37}$$

Inserting the probability P (L = ℓ|S = s) from Eq. (24) into Eq. (37) and carrying out the
summation, we obtain

$$M(x) = \frac{s}{s - 1} \left[ e^{-2x} - \frac{1}{s} + \frac{e^{-x}}{s^{s}} \left( 1 - e^{-x} \right) e^{s(x + e^{-x})} \Gamma\left( s + 1, s e^{-x} \right) \right], \tag{38}$$

where Γ(a, z) is the incomplete Gamma function [52]. The nth moment of P (L = ℓ|S = s)
is obtained by differentiating M(x), with respect to x, n times, namely

$$E[L^{n} \mid S = s] = \left. \frac{\partial^{n} M}{\partial x^{n}} \right|_{x=0}. \tag{39}$$
Inserting n = 1 in Eq. (39), we obtain the mean distance between random pairs of nodes
that reside on a tree component of size s. It is given by

$$E[L \mid S = s] = \frac{s \left[ e^{s} s^{-s} \Gamma(s + 1, s) - 2 \right]}{s - 1}. \tag{40}$$

Inserting n = 2 in Eq. (39), we obtain the second moment, which is given by

$$E[L^{2} \mid S = s] = \frac{s \left[ 4 + 2s - 3 e^{s} s^{-s} \Gamma(s + 1, s) \right]}{s - 1}. \tag{41}$$

The variance of P (L = ℓ|S = s) is given by

$$\mathrm{Var}(L \mid S = s) = E[L^{2} \mid S = s] - \left( E[L \mid S = s] \right)^{2}, \tag{42}$$

where $E[L^2 \mid S = s]$ is given by Eq. (41) and E[L|S = s] is given by Eq. (40).
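
The closed-form moments can be compared with direct summation over Eq. (24). For integer s the incomplete Gamma function reduces to a finite sum, Γ(s + 1, s) = s! e^(−s) Σ_{k=0}^{s} s^k/k!, so the combination e^s s^(−s) Γ(s + 1, s) is a rational number and the comparison can be made exactly. The sketch below uses this identity; the function names are chosen for this example.

```python
import math
from fractions import Fraction

def dspl_given_size(ell, s):
    """P(L = ell | S = s) of Eq. (24) as an exact fraction."""
    return Fraction((ell + 1) * math.factorial(s - 2),
                    s ** ell * math.factorial(s - ell - 1))

def scaled_gamma(s):
    """e^s s^(-s) Gamma(s + 1, s) for integer s, as an exact fraction."""
    partial_sum = sum(Fraction(s ** k, math.factorial(k)) for k in range(s + 1))
    return Fraction(math.factorial(s), s ** s) * partial_sum

def mean_closed_form(s):
    """E[L | S = s] of Eq. (40)."""
    return s * (scaled_gamma(s) - 2) / (s - 1)

def second_moment_closed_form(s):
    """E[L^2 | S = s] of Eq. (41)."""
    return s * (4 + 2 * s - 3 * scaled_gamma(s)) / (s - 1)

def moment_by_summation(s, n):
    """n-th moment of Eq. (24), summed directly over 1 <= ell <= s - 1."""
    return sum(ell ** n * dspl_given_size(ell, s) for ell in range(1, s))
```

For instance, for s = 3 both routes give a mean distance of 4/3 and a second moment of 2.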
For sufficiently large values of s one can obtain simplified asymptotic expressions for the
moments of the DSPL. To achieve this we use the double-asymptotic expansion of Γ(s, s),
given by equation 8.11.12 in Ref. [52], namely

$$\Gamma(s, s) = s^{s-1} e^{-s} \left[ \sqrt{\frac{\pi}{2}} \sqrt{s} - \frac{1}{3} + O\left( \frac{1}{\sqrt{s}} \right) \right]. \tag{43}$$

To evaluate the moments, we need a closed form expression for Γ(s + 1, s). Using equation
8.8.2 in Ref. [52], we obtain

$$\Gamma(s + 1, s) = s \Gamma(s, s) + s^{s} e^{-s}, \tag{44}$$

where Γ(s, s) is given by Eq. (43). Equipped with these expressions, we can now obtain
asymptotic expansions for the moments in the limit of large s. More specifically, the mean
distance on a random tree of size s is given by

$$E[L \mid S = s] = \sqrt{\frac{\pi}{2}} \sqrt{s} - \frac{4}{3} + O\left( \frac{1}{\sqrt{s}} \right). \tag{45}$$
It is found that the mean distance between random pairs of nodes that reside on a tree
component of size s scales like the square root of s. Comparing the right hand sides of Eqs. (27)
and (45), which show the mode $\ell_{\rm mode}$ and the mean distance E[L|S = s], respectively, it is
found that while both of them scale like $\sqrt{s}$ the pre-factor of the mean distance is larger than
the pre-factor of the mode. This implies that the distribution P (L = ℓ|S = s) is positively
skewed. Interestingly, the scaling of the mean distance, implied by Eq. (45), resembles the
scaling of distances on two dimensional lattices. It is in contrast with small world random
networks in which the mean distance scales like ln s. This means that the tree components
in subcritical ER networks are not small world networks.

FIG. 5. (Color online) Analytical results (solid line) for the mean distance E[L|S = s] between
pairs of nodes that reside on the same tree component of size s, in a subcritical ER network, as
a function of s. The analytical results are in very good agreement with the results obtained from
computer simulations for subcritical ER networks of size $N = 10^4$ and c = 0.5 (×) and c = 0.8 (◦),
which coincide with each other. Note that the simulation results for c = 0.5 are shown only up to
s = 30, because in this case larger trees are rare.

In Fig. 5 we present analytical results (solid line) for the mean distance E[L|S = s]
between pairs of nodes that reside on the same tree component of size s, in a subcritical
ER network, as a function of s. The analytical results are in very good agreement with
the results obtained from computer simulations for subcritical ER networks of size $N = 10^4$
with c = 0.5 (×) and c = 0.8 (◦), which coincide with each other. Note that the simulation
results for c = 0.5 are shown only up to s = 30, because in this case larger trees are rare.

The second moment of the DSPL can be expressed by

$$E[L^2|S=s] = 2s - 3\sqrt{\frac{\pi}{2}}\,\sqrt{s} + 2 + O\!\left(\frac{1}{\sqrt{s}}\right). \tag{46}$$
Combining the results presented above for the first and second moments, we obtain an
asymptotic expression for the variance. It is given by

$$\mathrm{Var}(L|S=s) = \frac{4-\pi}{2}\,s - \sqrt{\frac{\pi}{18}}\,\sqrt{s} + O(1). \tag{47}$$
Thus, the standard deviation of the DSPL on trees of size s scales like √s, namely it scales
like the mean distance E[L|S = s]. Interestingly, the same qualitative relation is found
in the DSPL of the whole subcritical ER network. This implies that P (L = ℓ|S = s) is a
relatively broad distribution, in contrast with the typical results for the DSPL of supercritical
configuration model networks [39, 40, 44].
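The step from Eqs. (45) and (46) to Eq. (47) can be spelled out as follows. This is a short worked calculation added for illustration; it uses only the leading terms of the two moments and keeps track of terms down to order √s.

```latex
\begin{align*}
\left(E[L|S=s]\right)^2 &= \frac{\pi}{2}\,s - \frac{8}{3}\sqrt{\frac{\pi}{2}}\,\sqrt{s} + O(1), \\
\mathrm{Var}(L|S=s) &= E[L^2|S=s] - \left(E[L|S=s]\right)^2 \\
&= \left(2 - \frac{\pi}{2}\right)s - \left(3 - \frac{8}{3}\right)\sqrt{\frac{\pi}{2}}\,\sqrt{s} + O(1) \\
&= \frac{4-\pi}{2}\,s - \sqrt{\frac{\pi}{18}}\,\sqrt{s} + O(1).
\end{align*}
```

The last line uses (1/3)√(π/2) = √(π/18).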
In Fig. 6 we present analytical results (solid line) for the variance Var(L|S = s) of
the distribution of shortest path lengths between pairs of nodes that reside on the same
tree component of size s, in a subcritical ER network, as a function of s. The analytical
results are in very good agreement with the results obtained from computer simulations for
subcritical ER networks of size N = 10^4 and mean degree c = 0.5 (×) and c = 0.8 (◦), which
coincide with each other. Note that the simulation results for c = 0.5 are shown only up to
s = 30, because in this case larger trees are rare.
The cumulative mean distance between pairs of nodes that reside on a tree of size smaller
than or equal to s is given by

$$E[L|S \le s] = \frac{\sum_{s'=2}^{s} \widehat{P}(S=s')\,E[L|S=s']}{\sum_{s'=2}^{s} \widehat{P}(S=s')}. \tag{48}$$
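The weighted average in Eq. (48) can be evaluated directly for moderate values of s. The sketch below rests on an assumption: Eq. (17) is not reproduced in this section, so the weight of trees of size s' is taken to be proportional to the number of node pairs, s'(s' − 1)/2, times the tree size distribution of Eq. (A2). Only the ratio in Eq. (48) is computed, so the normalization of the weights drops out. All function names are hypothetical.

```python
import math


def mean_distance_given_size(s):
    """E[L|S=s] from the closed-form P(L=l|S=s), summed term by term."""
    total = 0.0
    q = 1.0 / s  # (s-2)! / ((s-l-1)! * s**l) at l = 1
    for l in range(1, s):
        total += l * (l + 1) * q
        q *= (s - l - 1) / s
    return total


def log_pair_weight(s, c):
    """Log of the unnormalized weight s(s-1)/2 * s**(s-2) * c**(s-1) * exp(-c*s) / s!.

    Assumption: the pair-weighted tree size distribution is proportional to this.
    """
    return (math.log(s * (s - 1) / 2.0) + (s - 2) * math.log(s)
            + (s - 1) * math.log(c) - c * s - math.lgamma(s + 1))


def cumulative_mean_distance(c, s_max):
    """E[L|S<=s_max], evaluated as the ratio of sums in Eq. (48)."""
    numerator = 0.0
    denominator = 0.0
    for s in range(2, s_max + 1):
        w = math.exp(log_pair_weight(s, c))
        numerator += w * mean_distance_given_size(s)
        denominator += w
    return numerator / denominator


if __name__ == "__main__":
    for s_max in (10, 20, 40, 80):
        print(s_max, cumulative_mean_distance(0.5, s_max))
```

Under this assumption, the value for large s_max approaches 1/(1 − c), the mean of the geometric distribution stated in the abstract.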

To evaluate the right hand side of Eq. (48), it is convenient to express the numerator and
the denominator as differences between two infinite sums, namely

$$E[L|S \le s] = \frac{\sum_{s'=2}^{\infty} \widehat{P}(S=s')\,E[L|S=s'] - \sum_{s'=s+1}^{\infty} \widehat{P}(S=s')\,E[L|S=s']}{\sum_{s'=2}^{\infty} \widehat{P}(S=s') - \sum_{s'=s+1}^{\infty} \widehat{P}(S=s')}. \tag{49}$$

The first term in the numerator amounts to E[L|L < ∞], which is given by Eq. (11), while
the first term in the denominator is equal to 1 (due to the normalization of P̂(S = s)). Eq.
(49) can thus be simplified to

$$E[L|S \le s] = \frac{1}{1-c}\,\frac{1 - (1-c)\sum_{s'=s+1}^{\infty} \widehat{P}(S=s')\,E[L|S=s']}{1 - \sum_{s'=s+1}^{\infty} \widehat{P}(S=s')}. \tag{50}$$

FIG. 6. (Color online) The variance Var(L|S = s) of the DSPL between pairs of nodes that
reside on the same tree component of size s, in a subcritical ER network, as a function of s. The
analytical results are in very good agreement with the results obtained from computer simulations
for subcritical ER networks of size N = 10^4 and mean degree c = 0.5 (×) and c = 0.8 (◦), which
coincide with each other. Note that the simulation results for c = 0.5 are shown only up to s = 30,
because in this case larger trees are rare.

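Inserting P̂(S = s) from Eq. (17) and E[L|S = s] from Eq. (45), which is accurate for
sufficiently large s, into Eq. (50) and carrying out the summations, we obtain

$$E[L|S \le s] \simeq \frac{1}{1-c}\,\frac{1 - \frac{(1-c)^2}{\sqrt{2\pi}\,c^2}\left(ce^{1-c}\right)^{s+1}\left[\sqrt{\frac{\pi}{2}}\,\frac{1}{1-ce^{1-c}} - \frac{4}{3}\,\Phi\!\left(ce^{1-c},\frac{1}{2},s+1\right)\right]}{1 - \frac{1-c}{\sqrt{2\pi}\,c^2}\left(ce^{1-c}\right)^{s+1}\left[\Phi\!\left(ce^{1-c},\frac{1}{2},s+1\right) - \Phi\!\left(ce^{1-c},\frac{3}{2},s+1\right)\right]}, \tag{51}$$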

where

$$\Phi(z,s,a) = \sum_{n=0}^{\infty} \frac{z^n}{(a+n)^s} \tag{52}$$
is the Lerch Phi transcendent [52]. Eq. (51) is expected to be valid for large values of s.
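For 0 ≤ z < 1 the series in Eq. (52) converges geometrically, so Eq. (51) can be evaluated with a plain truncated sum. The following sketch is illustrative only. The function names, the stopping rule and the tolerance are choices made here, not part of the paper, and the prefactors follow the reading of Eq. (51) in which the denominator of both prefactors is √(2π) c².

```python
import math


def lerch_phi(z, s, a, tol=1e-15, max_terms=1_000_000):
    """Truncated series for the Lerch transcendent of Eq. (52), for 0 <= z < 1, a > 0."""
    total = 0.0
    power = 1.0  # holds z**n
    for n in range(max_terms):
        term = power / (a + n) ** s
        total += term
        if term < tol * total:
            break
        power *= z
    return total


def cumulative_mean_eq51(c, s):
    """Large-s approximation of E[L|S<=s] following Eq. (51), for 0 < c < 1."""
    z = c * math.exp(1.0 - c)
    pref = (1.0 - c) / (math.sqrt(2.0 * math.pi) * c ** 2) * z ** (s + 1)
    phi_half = lerch_phi(z, 0.5, s + 1)
    phi_three_halves = lerch_phi(z, 1.5, s + 1)
    numerator = 1.0 - (1.0 - c) * pref * (
        math.sqrt(math.pi / 2.0) / (1.0 - z) - 4.0 / 3.0 * phi_half)
    denominator = 1.0 - pref * (phi_half - phi_three_halves)
    return numerator / ((1.0 - c) * denominator)


if __name__ == "__main__":
    for s in (10, 20, 40, 80):
        print(s, cumulative_mean_eq51(0.5, s))
```

Note that the second argument of Φ is named s in Eq. (52), while elsewhere s denotes the tree size. In the code the two roles are kept apart by the argument order of `lerch_phi`.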
In Fig. 7 we present analytical results (solid lines) for the mean distance E[L|S ≤ s]
between pairs of nodes that reside on the same tree component, for all tree components of
size smaller than or equal to s, in subcritical networks, as a function of the mean degree c.
The results are presented for s = 10, 20, 40 and 80 (from bottom to top). The analytical results,
obtained from Eq. (51), are in very good agreement with the results obtained from computer
simulations (◦). As s is increased, the mean distance E[L|S ≤ s] converges towards the mean
distance over the whole network, E[L|L < ∞] (dashed line), given by Eq. (11).

FIG. 7. (Color online) Analytical results (solid lines), obtained from Eq. (51), for the mean
distance E[L|S ≤ s] between all pairs of nodes that reside on the same tree component, for all
tree components of size smaller than or equal to s, in subcritical networks, as a function of the
mean degree c. The results are presented for s = 10, 20, 40 and 80 (from bottom to top). The
analytical results are in very good agreement with the results obtained from computer simulations
(◦). As s is increased, the mean distance E[L|S ≤ s] converges towards the mean distance over the
whole network, E[L|L < ∞] (dashed line), given by Eq. (11).

IV. DISCUSSION

The ensemble of trees that appear in subcritical ER networks belongs to the class of
equilibrium trees [4]. These are trees that are formed by equilibrium processes. Their
statistical properties can be analyzed using methods of equilibrium statistical mechanics.
In this paper we calculated the DSPL of trees of a given size s in subcritical ER networks.
It was found that P (L = ℓ|S = s) is independent of the mean degree c of the subcritical
network from which these trees were extracted. It was also found that the mean distance
on the ensemble of trees of size s scales like E[L|S = s] ∼ √s. This scaling implies that the
Hausdorff dimension of the trees is D_H = 2, in agreement with earlier results obtained for
other equilibrium trees [4]. It is in contrast with the scaling obtained in supercritical ER
networks and other configuration model networks. In these networks the mean distance E[L]
scales logarithmically with the network size N and they are thus referred to as small-world
networks.
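Since the trees of size s are sampled uniformly from the set of labeled trees of size s (see Sec. V), the closed-form DSPL can be checked by brute force for small s. The sketch below enumerates all s^(s−2) labeled trees through their Prüfer sequences [29] and counts pair distances with a breadth-first search. It is an independent illustration, not the derivation used in the paper, and all function names are hypothetical.

```python
from collections import deque
from itertools import product
from math import factorial


def prufer_to_edges(seq):
    """Decode a Pruefer sequence over labels 0..n-1 into the edges of a labeled tree."""
    n = len(seq) + 2
    degree = [1] * n
    for v in seq:
        degree[v] += 1
    edges = []
    for v in seq:
        leaf = min(u for u in range(n) if degree[u] == 1)
        edges.append((leaf, v))
        degree[leaf] -= 1
        degree[v] -= 1
    u, w = (x for x in range(n) if degree[x] == 1)
    edges.append((u, w))
    return edges


def distance_counts(n, edges):
    """Count unordered node pairs at each distance, using a BFS from every node."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    counts = [0] * n
    for src in range(n):
        dist = [-1] * n
        dist[src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for dst in range(src + 1, n):
            counts[dist[dst]] += 1
    return counts


def dspl_by_enumeration(s):
    """P(L=l|S=s), l = 1..s-1, averaged over all s**(s-2) labeled trees."""
    totals = [0] * s
    for seq in product(range(s), repeat=s - 2):
        for l, k in enumerate(distance_counts(s, prufer_to_edges(seq))):
            totals[l] += k
    n_pairs = sum(totals)
    return [totals[l] / n_pairs for l in range(1, s)]


def dspl_closed_form(s):
    """The closed-form expression quoted in Sec. V."""
    return [(l + 1) / s ** l * factorial(s - 2) / factorial(s - l - 1)
            for l in range(1, s)]
```

The enumeration grows as s^(s−2), so this check is only practical for small trees. The mean degree c does not enter anywhere, in line with the c-independence of P (L = ℓ|S = s).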
Another important ensemble of trees consists of random recursive trees, which belong to
the class of nonequilibrium trees. These trees grow via a kinetic process of node addition.
The simplest model of random tree growth is the random attachment model. In this model,
starting from a small seed network, at each time step a new node is added and is connected
to one of the existing nodes uniformly at random. For simplicity we consider the case in
which the seed network consists of a single node. Interestingly, the ensembles of equilibrium
and nonequilibrium trees of size s include the same set of tree configurations. However,
their statistical properties are different due to the different weights assigned to each one
of the possible configurations. In growing trees the order in which the nodes are added is
important. In particular, nodes that appeared early in the growth process are likely to gain
more links than nodes that appeared at later stages [4].
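The random attachment model described above is simple to simulate. The sketch below grows a tree from a single seed node and measures the mean distance over all node pairs with a breadth-first search. The function names, the seed value and the sample sizes are illustrative choices, not taken from the paper.

```python
import random
from collections import deque


def random_attachment_tree(s, rng):
    """Grow a tree from a single seed node; node t attaches to a uniformly chosen earlier node."""
    parent = [None] * s
    for t in range(1, s):
        parent[t] = rng.randrange(t)
    return parent


def mean_distance(parent):
    """Mean shortest path length over all unordered node pairs of the tree."""
    s = len(parent)
    adj = [[] for _ in range(s)]
    for child in range(1, s):
        adj[child].append(parent[child])
        adj[parent[child]].append(child)
    total = 0
    for src in range(s):
        dist = [-1] * s
        dist[src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist[src + 1:])  # count each unordered pair once
    return total / (s * (s - 1) / 2)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen arbitrarily
    for s in (50, 200, 800):
        sample = [mean_distance(random_attachment_tree(s, rng)) for _ in range(20)]
        print(s, sum(sample) / len(sample))
```

In such runs the mean distance grows slowly with s, as expected for logarithmic scaling, while at these modest sizes sub-leading corrections to the leading behavior are still visible.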
The DSPL of the ensemble of random attachment trees of size s was found to follow a
Poisson distribution whose mean is given by E[L|S = s] = 2 ln s [42]. This implies that the
random attachment trees belong to the class of small-world networks, in which the mean
distance scales logarithmically with the network size. These trees tend to form compact
structures dominated by the nodes that appeared early in the growth process. This is in
sharp contrast to the results obtained for the subcritical ER trees in which the mean distance
scales like √s.
The methodology presented in this paper can be applied to the calculation of the
distribution P (L = ℓ|S = s) in configuration model networks with various degree distributions
P (K = k), such as the exponential distribution and the power-law distribution. To this
end, one needs to obtain the distribution P (S = s) of tree sizes in the subcritical
configuration model network under study and the DSPL of the whole network, P (L = ℓ|L < ∞),
and to insert them into Eq. (13). The distribution P (S = s) can be calculated using the
generating function approach presented in Ref. [26]. The inversion of Eq. (13) to extract
P (L = ℓ|S = s) is probably possible in those cases in which P (L = ℓ|S = s) is independent
of the mean degree c. The validity of this condition will need to be tested on a case-by-case
basis.

Apart from the DSPL, there are other metric properties that characterize the large-scale
structure of finite trees in subcritical configuration model networks. These include the
distributions of eccentricities and diameters of trees of size s. The eccentricity is a property
of a single node i and it is equal to the largest distance between the given node i and any
other node in the tree. The diameter is a property of the whole tree and it is equal to
the largest distance between any pair of nodes in the tree. The distribution of the largest
diameter among all the trees in a subcritical ER network was recently studied [58, 59]. It
was found that this distribution follows a Gumbel distribution [60], which is one of the three
distributions encountered in extreme-value theory.
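The definitions of eccentricity and diameter given above translate directly into a breadth-first search. The following sketch assumes a connected graph stored as adjacency lists; the function names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, src):
    """Shortest path lengths from src to every node of a connected graph."""
    dist = [-1] * len(adj)
    dist[src] = 0
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] < 0:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def eccentricities(adj):
    """Eccentricity of each node: its largest distance to any other node."""
    return [max(bfs_distances(adj, v)) for v in range(len(adj))]


def diameter(adj):
    """Largest distance between any pair of nodes, i.e. the largest eccentricity."""
    return max(eccentricities(adj))
```

For example, a path of four nodes, `[[1], [0, 2], [1, 3], [2]]`, has eccentricities 3, 2, 2, 3 and diameter 3.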

The resistance distance between two nodes in a network is a measure of how difficult
it is for electricity (or some other form of flow) to pass between these two nodes. In an
unweighted network, the resistance distance is defined as the resistance between the two
nodes, where the resistance of each edge is equal to 1 Ohm. The resistance distance can be
thought of as a generalization of the concept of distance to networks, where the "distance"
between two nodes is determined by the flow resistance between them rather than their
physical separation. A more formal definition is given in [61, 62], where it is also shown
that it is a proper metric, satisfying for example the triangle inequality. In general, the
resistance distance between two nodes will be smaller if there are more paths between the
two nodes with lower resistance, and larger if there are fewer paths or if the paths have higher
resistance. Random networks of resistors have been studied, mainly in two dimensions [63],
and the resistance distance was recently calculated for supercritical ER networks [64, 65].
Interestingly, on tree graphs the shortest path between a pair of nodes i and j is in fact the
only path between them. As a result, the resistance distance between i and j is equal to the
shortest path length between them. This means that the results presented in this paper also
provide the distribution of resistance distances in ER networks in the subcritical regime.
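The equality between resistance distance and shortest path length on trees can be illustrated with a small calculation. The sketch below uses a standard circuit-theory procedure, which is an assumption of this example and not a method taken from the paper: node j is grounded, a unit current is injected at node i, and Kirchhoff's equations are solved for the node potentials using exact rational arithmetic. The function name is hypothetical, and the graph is assumed to be connected.

```python
from fractions import Fraction


def resistance_distance(n, edges, i, j):
    """Effective resistance between nodes i and j when every edge is a 1 Ohm resistor."""
    if i == j:
        return Fraction(0)
    # Build the graph Laplacian with exact rational entries.
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1
    # Ground node j (potential 0) and inject a unit current at node i.
    keep = [k for k in range(n) if k != j]
    a = [[lap[r][c] for c in keep] + [Fraction(1 if r == i else 0)] for r in keep]
    m = len(keep)
    # Gauss-Jordan elimination on the augmented matrix.
    for col in range(m):
        pivot = next(r for r in range(col, m) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        pv = a[col][col]
        a[col] = [x / pv for x in a[col]]
        for r in range(m):
            if r != col and a[r][col] != 0:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    # With a unit current, the potential at node i equals the effective resistance.
    return a[keep.index(i)][m]
```

On the path 0-1-2-3 the resistance distance between the end nodes is 3, equal to the shortest path length. On a triangle, where a second path exists, the resistance distance between adjacent nodes drops to 2/3.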

V. SUMMARY

We calculated the distribution of shortest path lengths P (L = ℓ|S = s) between random
pairs of nodes that reside on finite tree components of a given size s in subcritical ER
networks. It was found that

$$P(L=\ell|S=s) = \frac{\ell+1}{s^{\ell}}\,\frac{(s-2)!}{(s-\ell-1)!}.$$

Surprisingly, this probability
does not depend on the mean degree c of the network from which these tree components
were extracted. This is due to the fact that the ensemble of tree components of a given size
s in ER networks is sampled uniformly from the set of labeled trees of size s. The moments
of the DSPL were also calculated. It was found that the mean distance between random
pairs of nodes on tree components of size s satisfies E[L|S = s] ∼ √s, unlike small-world
networks in which the mean distance scales logarithmically with s.
This work was supported by the Israel Science Foundation grant no. 1682/18.

Appendix A: The generating function of P (S = s)

The generating function of P (S = s) is given by

$$H(u) = \sum_{s=1}^{\infty} u^s P(S=s). \tag{A1}$$

Inserting P (S = s) from Eq. (2) into Eq. (A1), we obtain

$$H(u) = \frac{2}{2-c} \sum_{s=1}^{\infty} \frac{s^{s-2} c^{s-1} e^{-cs}}{s!}\,u^s. \tag{A2}$$

Rearranging terms on the right hand side of Eq. (A2), we obtain

$$H(u) = -\frac{2}{c(2-c)} \sum_{s=1}^{\infty} \frac{1}{s}\,\frac{(-s)^{s-1}}{s!} \left(-uce^{-c}\right)^s. \tag{A3}$$

Replacing the term 1/s on the right hand side of Eq. (A3) by the integral expression

$$\frac{1}{s} = \int_0^{\infty} e^{-s\tau}\,d\tau, \tag{A4}$$

yields

$$H(u) = -\frac{2}{c(2-c)} \sum_{s=1}^{\infty} \int_0^{\infty} e^{-s\tau}\,d\tau\,\frac{(-s)^{s-1}}{s!} \left(-uce^{-c}\right)^s. \tag{A5}$$

Exchanging the order of the sum and the integral on the right hand side of Eq. (A5), we
obtain

$$H(u) = -\frac{2}{c(2-c)} \int_0^{\infty} d\tau \sum_{s=1}^{\infty} \frac{(-s)^{s-1}}{s!} \left(-uce^{-c}e^{-\tau}\right)^s. \tag{A6}$$

Using the series expansion of the Lambert W function, which is given by

$$W(x) = \sum_{s=1}^{\infty} \frac{(-s)^{s-1} x^s}{s!}, \tag{A7}$$

we obtain

$$H(u) = -\frac{2}{c(2-c)} \int_0^{\infty} d\tau\,W\!\left(-uce^{-c}e^{-\tau}\right). \tag{A8}$$

Changing the integration variable from τ to x = −uce−c e−τ , we obtain

$$H(u) = \frac{2}{c(2-c)} \int_{-uce^{-c}}^{0} W(x)\,\frac{dx}{x}. \tag{A9}$$

Changing the integration variable again, from x to y = W (x), which from the definition of
the Lambert function implies that x = yey , we obtain

$$H(u) = \frac{2}{c(2-c)} \int_{W(-uce^{-c})}^{0} y\left(1+\frac{1}{y}\right) dy. \tag{A10}$$

Carrying out the integration on the right hand side of Eq. (A10), we obtain

$$H(u) = -\frac{1}{c(2-c)} \left\{ \left[W\!\left(-uce^{-c}\right)\right]^2 + 2W\!\left(-uce^{-c}\right) \right\}. \tag{A11}$$

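The closed form in Eq. (A11) can be compared numerically with the defining sum in Eq. (A1). The sketch below is an illustration under stated assumptions: the principal branch of the Lambert W function is computed with a plain Newton iteration (an implementation choice made here), the sum is truncated at a hypothetical cutoff `s_max`, and all function names are invented for this example.

```python
import math


def lambert_w0(x, tol=1e-15, max_iter=200):
    """Principal branch W(x) for -1/e < x <= 0, via Newton iteration on w*exp(w) = x."""
    w = 0.0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w


def tree_size_pmf(s, c):
    """P(S=s) = 2/(2-c) * s**(s-2) * c**(s-1) * exp(-c*s) / s!, as in Eq. (A2)."""
    log_p = (math.log(2.0 / (2.0 - c)) + (s - 2) * math.log(s)
             + (s - 1) * math.log(c) - c * s - math.lgamma(s + 1))
    return math.exp(log_p)


def h_series(u, c, s_max=2000):
    """Truncated version of the defining sum, Eq. (A1)."""
    return sum(u ** s * tree_size_pmf(s, c) for s in range(1, s_max + 1))


def h_closed(u, c):
    """Closed form of Eq. (A11)."""
    w = lambert_w0(-u * c * math.exp(-c))
    return -(w * w + 2.0 * w) / (c * (2.0 - c))


if __name__ == "__main__":
    for u in (0.3, 0.7, 1.0):
        print(u, h_series(u, 0.5), h_closed(u, 0.5))
```

For 0 < c < 1 one has W(−ce^{−c}) = −c, so that Eq. (A11) gives H(1) = 1, as required by normalization.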
The moments of P (S = s) can be obtained by taking suitable derivatives of H(u). In
particular, the mean tree size is

$$\langle S \rangle = \left.\frac{dH(u)}{du}\right|_{u=1} = \frac{2}{2-c}, \tag{A12}$$

and the second factorial moment is given by

$$\langle S(S-1) \rangle = \left.\frac{d^2H(u)}{du^2}\right|_{u=1} = \frac{2c}{(1-c)(2-c)}. \tag{A13}$$

Using these results, it is found that the second moment of P (S = s) is given by

$$\langle S^2 \rangle = \frac{2}{(1-c)(2-c)}, \tag{A14}$$

and the variance is given by

$$\mathrm{Var}(S) = \frac{2c}{(1-c)(2-c)^2}. \tag{A15}$$

It is also found that

$$\left\langle \binom{S}{2} \right\rangle = \frac{c}{(1-c)(2-c)}. \tag{A16}$$
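For completeness, the steps leading from Eqs. (A12) and (A13) to Eqs. (A14)-(A16) can be written out explicitly. This is a short worked calculation added for illustration; it uses only the two moments obtained above.

```latex
\begin{align*}
\langle S^2 \rangle &= \langle S(S-1) \rangle + \langle S \rangle
  = \frac{2c}{(1-c)(2-c)} + \frac{2}{2-c} = \frac{2}{(1-c)(2-c)}, \\
\mathrm{Var}(S) &= \langle S^2 \rangle - \langle S \rangle^2
  = \frac{2}{(1-c)(2-c)} - \frac{4}{(2-c)^2} = \frac{2c}{(1-c)(2-c)^2}, \\
\left\langle \binom{S}{2} \right\rangle &= \frac{1}{2}\,\langle S(S-1) \rangle
  = \frac{c}{(1-c)(2-c)}.
\end{align*}
```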

[1] S. Havlin and R. Cohen, Complex networks: structure, robustness and function (Cambridge
University Press, New York, 2010).
[2] E. Estrada, The structure of complex networks: theory and applications (Oxford University
Press, Oxford, 2011).
[3] M.E.J. Newman, Networks, Second Edition (Oxford University Press, Oxford, 2018).
[4] S.N. Dorogovtsev and J.F.F. Mendes, The Nature of Complex Networks (Oxford University
Press, Oxford, 2022).
[5] E. Marinari and R. Monasson, Circuits in random graphs: from local trees to global loops, J.
Stat. Mech., P09004 (2004).
[6] E. Marinari and G. Semerjian, On the number of circuits in random graphs, J. Stat. Mech.,
P06019 (2006).
[7] H. Bonneau, A. Hassid, O. Biham, R. Kühn and E. Katzav, Distribution of shortest cycle
lengths in random networks, Phys. Rev. E 96, 062307 (2017).
[8] L. Tian, A. Bashan, D.-N. Shi, and Y.-Y. Liu, Articulation points in complex networks, Nature
Communications 8, 14223 (2017).
[9] I. Tishby, O. Biham, R. Kühn and E. Katzav, Statistical analysis of articulation points in
configuration model networks, Phys. Rev. E 98, 062301 (2018).
[10] H. Bonneau, O. Biham, R. Kühn and E. Katzav, Statistical analysis of edges and bredges in
configuration model networks, Phys. Rev. E 102, 012314 (2020).

[11] P.A.C. Duijn, V. Kashirin and P.M.A. Sloot, The relative ineffectiveness of criminal
network disruption, Sci. Rep. 4, 4238 (2014).
[12] J. Shao, S.V. Buldyrev, R. Cohen, M. Kitsak, S. Havlin and H.E. Stanley, Fractal boundaries
of complex networks, Europhys. Lett. 84, 48004 (2008).
[13] J. Shao, S.V. Buldyrev, L.A. Braunstein, S. Havlin and H.E. Stanley, Structure of shells in
complex networks, Phys. Rev. E 80, 036105 (2009).
[14] P. Erdős and A. Rényi, On random graphs I, Publicationes Mathematicae (Debrecen) 6, 290
(1959).
[15] P. Erdős and A. Rényi, On the evolution of random graphs, Publ. Math. Inst. Hung. Acad.
Sci. 5, 17 (1960).
[16] P. Erdős and A. Rényi, On the evolution of random graphs II, Bull. Inst. Int. Stat. 38, 343
(1961).
[17] B. Bollobás, Random Graphs, Second Edition (Academic Press, London, 2001).
[18] B. Bollobás, A probabilistic proof of an asymptotic formula for the number of labeled regular
graphs, European Journal of Combinatorics 1, 311 (1980).
[19] M. Molloy and B. Reed, A critical point for random graphs with a given degree sequence,
Rand. Struct. Algo. 6, 161 (1995).
[20] M. Molloy and B. Reed, The size of the giant component of a random graph with a given
degree sequence, Combin. Probab. Comput. 7, 295 (1998).
[21] M.E.J. Newman, S.H. Strogatz and D.J. Watts, Random graphs with arbitrary degree distri-
butions and their applications, Phys. Rev. E 64, 026118 (2001).
[22] I. Tishby, O. Biham and E. Katzav, Convergence towards an Erdős-Rényi graph structure in
network contraction processes, Phys. Rev. E 100, 032314 (2019).
[23] I. Tishby, O. Biham and E. Katzav, Analysis of the convergence of the degree distribution of
contracting random networks towards a Poisson distribution using the relative entropy, Phys.
Rev. E 101, 062308 (2020).
[24] I. Tishby, O. Biham, E. Katzav and R. Kühn, Revealing the micro-structure of the giant
component in random graph ensembles, Phys. Rev. E 97, 042318 (2018).
[25] R. Durrett, Random Graph Dynamics (Cambridge University Press, Cambridge, 2007).
[26] M.E.J. Newman, Component sizes in networks with arbitrary degree distributions, Phys. Rev.
E 76, 045101 (2007).

[27] E. Katzav, O. Biham and A.K. Hartmann, Distribution of shortest path lengths in subcritical
Erdős-Rényi networks, Phys. Rev. E 98, 012301 (2018).
[28] A. Cayley, A theorem on trees, Quart. J. Pure Appl. Math. 23, 376 (1889).
[29] H. Prüfer, Neuer Beweis eines Satzes über Permutationen, Arch. Math. Phys. 27, 742 (1918).
[30] P. Steinbach, Field guide to simple graphs, Volume 3: The book of trees (Design Lab,
Albuquerque, 1990).
[31] T. Beyer and S.M. Hedetniemi, Constant Time Generation of Rooted Trees, SIAM J. Comput.
9, 706 (1980).
[32] R.A. Wright, B. Richmond, A. Odlyzko and B.D. McKay, Constant Time Generation of Free
Trees, SIAM J. Comput. 15, 540 (1986).
[33] S.N. Dorogovtsev, J.F.F. Mendes and A.N. Samukhin, Metric structure of random networks,
Nuclear Physics B 653, 307 (2003).
[34] R. van der Hofstad, G. Hooghiemstra and P. van Mieghem, Distances in random graphs with
finite variance degrees, Rand. Struct. Algo. 27, 76 (2005).
[35] V.D. Blondel, J.-L. Guillaume, J.M. Hendrickx and R.M. Jungers, Distance distribution in
random graphs and application to network exploration, Phys. Rev. E 76, 066101 (2007).
[36] R. van der Hofstad, G. Hooghiemstra and D. Znamenski, Distances in random graphs with
finite mean and infinite variance degrees, Elect. J. Prob. 12, 703 (2007).
[37] R. van der Hofstad and G. Hooghiemstra, Universality for distances in power-law random
graphs, J. Math. Phys. 49, 125209 (2008).
[38] H. van der Esker, R. van der Hofstad and G. Hooghiemstra, Universality for the distance in
finite variance random graphs, J. Stat. Phys. 133, 169 (2008).
[39] E. Katzav, M. Nitzan, D. ben-Avraham, P.L. Krapivsky, R. Kühn, N. Ross and O. Biham,
Analytical results for the distribution of shortest path lengths in random networks, EPL 111,
26006 (2015).
[40] M. Nitzan, E. Katzav, R. Kühn and O. Biham, Distance distribution in configuration-model
networks, Phys. Rev. E 93, 062309 (2016).
[41] S. Melnik and J.P. Gleeson, Simple and accurate analytical calculation of shortest path lengths,
arXiv:1604.05521 (2016).
[42] C. Steinbock, O. Biham and E. Katzav, Distribution of shortest path lengths in a class of
node duplication network models, Phys. Rev. E 96, 032301 (2017).

[43] C. Steinbock, O. Biham and E. Katzav, Analytical results for the distribution of shortest path
lengths in directed random networks that grow by node duplication, Eur. Phys. J. B 92, 130
(2019).
[44] I. Tishby, O. Biham, R. Kühn and E. Katzav, The mean and variance of the distribution of
shortest path lengths of random regular graphs, J. Phys. A: Math. Theor. 55, 265005 (2022).
[45] A.D. Jackson and S.P. Patil, Phases of Small Worlds: A Mean Field Formulation, J. Stat.
Phys. 189, 40 (2022).
[46] B. Gompertz, On the nature of the function expressive of the law of human mortality and on
a new mode of determining the value of life contingencies, Phil. Trans. R. Soc. A 115, 513
(1825).
[47] F. Chung and L. Lu, The average distances in random graphs with given expected degrees,
Proc. Natl. Acad. Sci. USA 99, 15879 (2002).
[48] F. Chung and L. Lu, The average distance in a random graph with given expected degrees,
Internet Math. 1, 91 (2004).
[49] A. Fronczak, P. Fronczak and J.A. Holyst, Average path length in random networks, Phys.
Rev. E 70, 056110 (2004).
[50] B. Bollobás, S. Janson and O. Riordan, The phase transition in inhomogeneous random graphs,
Rand. Struct. Algo. 31, 3 (2007).
[51] R. Cohen and S. Havlin, Scale-free networks are ultrasmall, Phys. Rev. Lett. 90, 058701 (2003).
[52] F.W. Olver, D.W. Lozier, R.F. Boisvert and C.W. Clark, NIST Handbook of Mathematical
Functions (Cambridge University Press, Cambridge, 2010).
[53] I.M. Gessel, Lagrange inversion, Journal of Combinatorial Theory, Series A 144, 212 (2016).
[54] J.W. Moon, Counting Labeled Trees (Canadian Mathematical Congress, Ottawa, 1970).
[55] A. Meir and J.W. Moon, The distance between points in random trees, Journal of Combina-
torial Theory 8, 99 (1970).
[56] L. Takacs, On Cayley’s formula for counting forests, Journal of Combinatorial Theory A 53,
321 (1990).
[57] P.W. Shor, A new proof of Cayley’s formula for counting labeled trees, Journal of Combina-
torial Theory, Series A 71, 154 (1995).
[58] T. Luczak, Random trees and random graphs, Rand. Struct. Alg. 13, 485 (1998).
[59] A.K. Hartmann and M. Mézard, Distribution of diameters for Erdős-Rényi random graphs,
Phys. Rev. E 97, 032128 (2018).
[60] E.J. Gumbel, Les valeurs extrêmes des distributions statistiques, Annales de l’Institut Henri
Poincaré 5, 115 (1935).
[61] M.M. Deza and E. Deza, Encyclopedia of Distances, Fourth Edition (Springer, Berlin, 2016).
[62] R.B. Bapat, Graphs and Matrices, Second Edition (Springer, London, 2014).
[63] B. Derrida and J. Vannimenus, A transfer-matrix approach to random resistor networks, J.
Phys. A 15, L557 (1982).
[64] P. Akara-pipattana, T. Chotibut and O. Evnin, Resistance distance distribution in large sparse
random graphs, J. Stat. Mech. 033404 (2022).
[65] P. Akara-pipattana and O. Evnin, Random matrices with row constraints and eigenvalue
distributions of graph Laplacians, J. Phys. A 56, 295001 (2023).
