Spatial Patterns in Urban Systems
ABSTRACT
Understanding the morphology of an urban system is an important step toward unveiling the dynamical processes of its growth and development. At the foundation of every urban system, the transportation system is undeniably a crucial component in powering the life of the entire urban system. In this work, we study the spatial patterns of 73 cities across the globe by analysing the distribution of public transport points within the cities. The analysis reveals that the spatial distributions of points can be classified into four groups with distinct features, indicating whether the points are clustered, dispersed or regularly distributed (at a single or at multiple length scales). From visual inspection, we observe that the cities with regularly distributed patterns do not have an apparent centre, in contrast to the other two types, in which a star-node structure, i.e. monocentric, can be clearly observed. Furthermore, the results provide evidence for the existence of two different types of urban system: well-planned and organically grown. We also study the spatial distribution of another important urban entity, the amenities, and find that it possesses universal properties regardless of the city's spatial pattern type. This result has one important implication: at the small scale of a locality, the urban dynamics cannot be controlled, even though regulation can be done at the large scale of the entire urban system. The relation between the distribution of amenities within the city and its spatial pattern is also discussed.
When ρ is small, the number of clusters η(ρ) is large because most of the points are not connected and they form their own clusters. As ρ increases, η(ρ) decreases monotonically because small clusters merge. In fact, it is a step function because the pairwise distances between points are discrete in value. On the other hand, [...]
[...] a few clusters of points located at a farther distance beyond those in the current largest cluster that is being traced. As a result, this would provide us with information on the length scales of the distribution of points within the set. It can be easily seen that if there are many peaks, the points are distributed in clusters that are separated by different distances, whereas the existence of few peaks implies a uniform distribution of points that are (approximately) equidistant from one another. In either case, there exists a characteristic distance in the spatial distribution of points. This characteristic distance should tell us the length scale above which the points are (largely) connected and below which they are disconnected.
Since all peaks in the derivative ξ′max(ρ) contribute to the growth of the cluster ξmax(ρ), a measure of the characteristic distance ρξ⋆ must take into account the effects of all of them. However, a high peak indicates a more significant increase in cluster size (a major merger) than a lower one. Hence, the average of all values ρ†ξ,i at which a peak i occurs, weighted by the height ξ′max(ρ†ξ,i) of the peaks, i.e. the derivative dξmax(ρ)/dρ evaluated at ρ = ρ†ξ,i, is an appropriate measure of this characteristic distance, i.e.
$$\rho_\xi^\star = \frac{\sum_i \xi'_{\max}(\rho_{\xi,i}^\dagger)\,\rho_{\xi,i}^\dagger}{\sum_i \xi'_{\max}(\rho_{\xi,i}^\dagger)}. \tag{1}$$
Similarly, we have for the cluster area
$$\rho_A^\star = \frac{\sum_i A'_{\max}(\rho_{A,i}^\dagger)\,\rho_{A,i}^\dagger}{\sum_i A'_{\max}(\rho_{A,i}^\dagger)}. \tag{2}$$
Spatial patterns in urban systems
The characteristic distances introduced above tell us where the transitions of cluster size and area take place, but they do not tell us how the size and area of the cluster change from small to large values, i.e. how the cluster grows. This, however, can be easily characterised by further exploiting the analysis of peaks in ξ′max(ρ) (and A′max(ρ)). If the cluster grows rapidly through the transition, there are very few peaks in ξ′max(ρ), all of which are sharp and localised. On the other hand, the peaks are scattered over a wide range of ρ should the cluster grow gradually. The standard deviation of the locations ρ†ξ,i of the peaks, or the spread of transition σξ for cluster size, is a good measure of such scattering. However, a low peak that is distant from a group of localised high peaks should not significantly enlarge the spread. Therefore, the standard deviation of ρ†ξ,i needs to be weighted by the height ξ′max(ρ†ξ,i) of the peaks, i.e.
$$\sigma_\xi = \sqrt{\frac{\sum_i \xi'_{\max}(\rho_{\xi,i}^\dagger)\left(\rho_{\xi,i}^\dagger - \rho_\xi^\star\right)^2}{\sum_i \xi'_{\max}(\rho_{\xi,i}^\dagger)}}. \tag{3}$$
Similarly, we have the spread of transition for cluster area
$$\sigma_A = \sqrt{\frac{\sum_i A'_{\max}(\rho_{A,i}^\dagger)\left(\rho_{A,i}^\dagger - \rho_A^\star\right)^2}{\sum_i A'_{\max}(\rho_{A,i}^\dagger)}}. \tag{4}$$
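[Figure 1: schematic diagram in the (σξ, σA) plane, with σξ on the horizontal axis and σA on the vertical axis. Along the diagonal σξ = σA, small σξ ≈ σA corresponds to points regularly distributed at a single length scale, and medium-to-large σξ ≈ σA to points regularly distributed at multiple length scales. The region σξ ≪ σA corresponds to a dispersed pattern (more dispersed farther from the diagonal) and the region σξ ≫ σA to a clustered pattern (more clustered farther from the diagonal).]
Figure 1. Interpretation of different patterns of spatial point distributions given different values of the pair (σξ, σA).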
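As a rough illustration of Eqs. (1)–(4), the height-weighted mean and standard deviation of the peak locations could be computed as in the sketch below. The function names and the example peak values are hypothetical and are not taken from the paper.

```python
import math

def characteristic_distance(peak_locations, peak_heights):
    """Height-weighted mean of the peak locations, as in Eqs. (1) and (2)."""
    total = sum(peak_heights)
    return sum(h * r for r, h in zip(peak_locations, peak_heights)) / total

def spread_of_transition(peak_locations, peak_heights):
    """Height-weighted standard deviation of the peak locations, as in Eqs. (3) and (4)."""
    total = sum(peak_heights)
    centre = characteristic_distance(peak_locations, peak_heights)
    variance = sum(
        h * (r - centre) ** 2 for r, h in zip(peak_locations, peak_heights)
    ) / total
    return math.sqrt(variance)

# Hypothetical peaks: two high peaks close together and one low, distant peak.
locations = [200.0, 220.0, 900.0]  # values of rho in metres
heights = [0.04, 0.04, 0.002]      # peak heights of the derivative
rho_star = characteristic_distance(locations, heights)
sigma = spread_of_transition(locations, heights)
```

Because of the weighting, the low peak at 900 m shifts `rho_star` and `sigma` far less than it would in an unweighted mean and standard deviation.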
[...] which most of the points are (approximately) equally spaced from each other, e.g. grid points. The boroughs of Bronx, Brooklyn and Manhattan of New York City are typical examples of this kind of distribution (see Fig. 3(a)). The spatial pattern of the transport points in these cities appears very regular. In fact, inspecting their street patterns, one can easily tell the pattern of parallel roads in one direction cutting those in the other, dividing the land into well organised polygons with almost perfect square and rectangular shapes. Apparently, this feature must be a result of well-designed and top-down planning before actually building the infrastructure in the city.13
Multiple-scale regular pattern
The transport points in a city can also be distributed in a regular manner but at different length scales. For example, the entire set of points can be divided into several subsets and, within each subset, the points are (quasi-)equally distant from each other. At a larger length scale, i.e. as ρ increases further, these subsets of points are again (quasi-)equally distant from each other, i.e. a hierarchical structure. The buffer radius ρ can thus be thought to play the rôle of a zooming parameter. In this multi-scale regular pattern, the profiles of the largest cluster size ξmax(ρ) and area Amax(ρ) experience a significant jump every time ρ changes its zooming level. At the lowest level are individual transport points. When ρ zooms out to the second level, the points that are closest to each other start to form their respective clusters. Moving to the next level, the nearby clusters start joining to form larger clusters, but there will be many of these "larger clusters", i.e. the largest cluster is of comparable size or area to several other clusters. The most important feature of this spatial pattern is that the jumps in the profile of ξmax(ρ) correspond well to those in Amax(ρ), even though the locations of the jumps are spread apart. That leads to the (approximate) equality of the spreads of transition σξ and σA despite their not being small. A good example of this type of distribution is the city of Epsom in Auckland, New Zealand (see Fig. 3(b)).
It is also interesting to note that, within a city itself, different parts can possess distinct spatial patterns of the transport points. For example, even though New York City is known to be a well-planned city with grid-like street patterns, not all of its five boroughs share that feature. Only Manhattan, Bronx and Brooklyn have small σξ and σA, while the spreads are larger for the other two boroughs, Queens and Staten Island. This fact complements the result reported earlier that Queens exhibits a spatial pattern distinct from the other boroughs.16 St Louis in Missouri, USA, is another interesting example. The two halves of the city on the two banks of the Mississippi river appear to have different spatial patterns, as they possess different values of the pair σξ and σA.
Clustered pattern
There are cases in which the jumps in the profile of the largest cluster size ξmax(ρ) do not correspond to those in the area Amax(ρ) and vice versa. In such cases, the spatial distribution of the transport points deviates from regular patterns. We first consider the scenarios in which σξ ≫ σA. For such distributions, the points are clustered and tend to minimise the coverage area. When σξ ≫ σA, there are jumps in the size
[Figure 2: scatter plot of the cities in the (σξ, σA) plane, with σA (m) on the vertical axis (ticks from 100 to 700). Each city is marked by a short code such as BNX, QNS, TUR, EPS or MCT.]
Figure 2. Types of spatial distribution of transport points in cities across the globe. The three reference lines are σA = σξ ,
σA = σξ + 50 and σA = σξ − 50.
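One possible way to turn the interpretation of Fig. 1 and the reference lines of Fig. 2 into a rule is sketched below. Treating the two outer reference lines (σA = σξ ± 50) as class boundaries is an assumption, and the `small` cut-off separating single-scale from multiple-scale regular patterns is a purely hypothetical value; neither is stated in the text.

```python
def classify_pattern(sigma_xi, sigma_a, band=50.0, small=150.0):
    """Classify a point pattern from its two spreads of transition (in metres).

    band  -- half-width of the diagonal band; ASSUMED to match the reference
             lines sigma_A = sigma_xi +/- 50 drawn in Fig. 2
    small -- HYPOTHETICAL cut-off below which both spreads count as "small"
    """
    if sigma_a > sigma_xi + band:
        return "dispersed"              # sigma_xi << sigma_A
    if sigma_a < sigma_xi - band:
        return "clustered"              # sigma_xi >> sigma_A
    if max(sigma_xi, sigma_a) <= small:
        return "single-scale regular"   # small sigma_xi ~ sigma_A
    return "multiple-scale regular"     # medium-to-large sigma_xi ~ sigma_A
```

For instance, `classify_pattern(100.0, 300.0)` falls above the upper reference line and is labelled as dispersed under these assumptions.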
of the largest cluster that do not give rise to a jump in its area. This happens when the points of an acquired cluster are compact, contributing very little increase in the area of the largest cluster. If the acquired cluster is not compact, i.e. its points span a larger area, there might be a significant increase in the area of the largest cluster and, hence, a peak would be
[Figure 3(a): Brooklyn, New York, USA. Left panel: locations of the transport points, y (m) against x (m). Upper right panel: normalised size ξmax(ρ) and its derivative ξ′max(ρ) against ρ (m), for ρ from 0 to 2000 m. Lower right panel: normalised area Amax(ρ) and its derivative A′max(ρ) against ρ (m), for ρ from 0 to 2000 m.]
(a) Brooklyn, New York, USA. The spreads of transition for the size σξ and area σA are both small. This is an example of single-scale regularly distributed
pattern, i.e. grid.
[Figure 3(b): Left panel: locations of the transport points, y (m) against x (m). Upper right panel: normalised size ξmax(ρ) and its derivative ξ′max(ρ) against ρ (m), for ρ from 0 to 2000 m. Lower right panel: normalised area Amax(ρ) and its derivative A′max(ρ) against ρ (m), for ρ from 0 to 2000 m.]
(b) Epsom, Auckland, New Zealand. The spreads of transition for the size σξ and area σA are not small but stay comparable to one another. This is an example
of multi-scale regularly distributed pattern.
Figure 3. Typical cities of single and multiple-scale regular spatial patterns. In each subfigure, the left panel shows the location of transport points within the city, the upper right panel the profile of largest cluster size ξmax(ρ) together with its first derivative ξ′max(ρ) and the lower right panel the profile of largest cluster area Amax(ρ) together with its first derivative A′max(ρ).
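The size profiles ξmax(ρ) shown in the figure follow from the clustering rule given in the Method section (two points belong to the same cluster when their distance does not exceed ρ). A minimal sketch of that rule is given below, using a union-find structure as a stand-in for the authors' cluster-finding heuristic; the function names are hypothetical. The cluster area is left out because it needs a polygon-union routine (the paper mentions shapely for that step).

```python
import math

def largest_cluster_size(points, rho):
    """Normalised size of the largest cluster for buffer radius rho.

    Two points are linked when their Euclidean distance is <= rho; clusters
    are the connected components of the resulting graph (union-find).
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= rho:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    sizes = {}
    for i in range(n):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / n

def size_profile(points, rho_min=10.0, rho_max=2000.0, step=5.0):
    """Return the sampled radii and the corresponding values of xi_max(rho)."""
    count = int(round((rho_max - rho_min) / step)) + 1
    rhos = [rho_min + k * step for k in range(count)]
    return rhos, [largest_cluster_size(points, r) for r in rhos]

# Hypothetical points (metres): three on a 100 m spacing and one outlier.
example_points = [(0.0, 0.0), (100.0, 0.0), (200.0, 0.0), (1000.0, 0.0)]
example_rhos, example_profile = size_profile(example_points, 50.0, 150.0, 50.0)
```

The pairwise loop is quadratic in the number of points, which is acceptable for a sketch; a spatial index would be the natural choice for large datasets.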
reflected by its contribution to σA. However, the size measure is not affected, as it only tells the number of points that are included in the cluster but not their relative location with respect to each other.
The distribution of transport points in the city of Turin in Piedmont, Italy (see Fig. 4(a)), is a good example of this type. The points appear clustered and compactly distributed but not regular or grid-like.
Dispersed pattern
On the other hand, we have the scenarios of σξ ≪ σA, in which the points are dispersed and tend to maximise the coverage area. When σξ ≪ σA, there are jumps in the area of the largest cluster that do not give rise to a jump in its size. This happens when the points of an acquired cluster are dispersed (but still within the buffer radius so that they belong to the same cluster). This way, the increase in the area of the largest cluster is more significant than that in its size, resulting in σξ ≪ σA. A good example of this type is the distribution of transport points in Manchester in Greater Manchester, England (see Fig. 4(b)). The points appear in a dispersed pattern along long roads around the city.
If the feature of single-scale regular spatial pattern (when both σξ and σA are small) is a result of well-designed and top-down planning in an urban system, the other spatial patterns (either σξ or σA is not small) can be interpreted as a consequence of developing an urban system under local constraints. In the former case, the urban system appears to be
[Figure 4(a): Turin, Piedmont, Italy. Left panel: locations of the transport points, y (m) against x (m). Upper right panel: normalised size ξmax(ρ) and its derivative ξ′max(ρ) against ρ (m), for ρ from 0 to 2000 m. Lower right panel: normalised area Amax(ρ) and its derivative A′max(ρ) against ρ (m), for ρ from 0 to 2000 m.]
(a) Turin, Piedmont, Italy. The spread of transition in peaks for the size is more than that for the area of the largest cluster, σξ > σA . This is an example of
clustered pattern.
[Figure 4(b): Left panel: locations of the transport points, y (m) against x (m). Upper right panel: normalised size ξmax(ρ) and its derivative ξ′max(ρ) against ρ (m), for ρ from 0 to 2000 m. Lower right panel: normalised area Amax(ρ) and its derivative A′max(ρ) against ρ (m), for ρ from 0 to 2000 m.]
(b) Manchester, Greater Manchester, England. The spread of transition in the peaks for the size is less than that for the area of the largest cluster, σξ < σA .
This is an example of dispersed pattern.
Figure 4. Typical cities of clustered and dispersed spatial patterns. In each subfigure, the left panel shows the location of transport points within the city, the upper right panel the profile of largest cluster size ξmax(ρ) together with its first derivative ξ′max(ρ) and the lower right panel the profile of largest cluster area Amax(ρ) together with its first derivative A′max(ρ).
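The derivative profiles and their peaks, as plotted in the right panels, could be obtained along the lines of the sketch below. The peak condition is the strict local-maximum test of Eq. (14) in the Method section. The forward finite difference and the choice to report heights after subtracting the offset are assumptions of this sketch, and the function names are hypothetical.

```python
def derivative(values, step):
    """Forward finite difference of a profile sampled at a constant step."""
    return [(b - a) / step for a, b in zip(values, values[1:])]

def find_peaks(rhos, deriv, offset=0.0):
    """Return (locations, heights) of the peaks of a sampled derivative.

    Sample i is a peak when deriv[i] is strictly larger than both of its
    neighbours. The profile is then lowered by `offset`, and only peaks that
    remain positive are kept, which filters out small noisy peaks.
    """
    locations, heights = [], []
    for i in range(1, len(deriv) - 1):
        if deriv[i] > deriv[i - 1] and deriv[i] > deriv[i + 1]:
            height = deriv[i] - offset
            if height > 0:
                locations.append(rhos[i])
                heights.append(height)
    return locations, heights

# Hypothetical derivative samples: one strong peak and one weak, noisy peak.
sample_rhos = [10.0, 15.0, 20.0, 25.0, 30.0]
sample_deriv = [0.0, 0.02, 0.0, 0.001, 0.0]
all_peaks = find_peaks(sample_rhos, sample_deriv)
filtered_peaks = find_peaks(sample_rhos, sample_deriv, offset=0.005)
```

With the offset applied, only the strong peak at 15 m survives, which is the intended effect of the noise filter.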
of organised type, while in the latter it can be said to be of organic type, as its spatial features develop in an ad hoc manner as the city grows. The spatial patterns in urban systems revealed by the analysis in this work could imply two different types of process that the cities undergo through their course of development.
Visually inspecting the spatial distribution of transport points within the cities, it appears that cities with regularly distributed pattern, either single- or multiple-scale, do not have an apparent centre. That means there is no spatial preference in the distribution of the points, i.e. no part is more special than the others. This is in contrast to the other two types of cities, in which a star-node structure can be clearly observed. The node represents the centre of the city, at which there is a higher density of transport points than in the other areas, and from which the roads diverge radially to the outer part of the city. This observation could be explained by the growth process of different types of urban system. When a city grows organically, it starts from a central business district and gradually expands to encompass the nearby area to accommodate more people wanting to participate in the business activities at the centre. On the other hand, when a city is planned in advance, the planners seem not to concentrate the infrastructure in one confined area but to stretch it across the entire city.
Universal features in urban systems
Area of system of transport points
Despite the difference in the spatial pattern of the distribution of transport points, all the cities considered in this study appear to possess a common relation between the density of
points and the characteristic distance of area ρA⋆. To explore this, we first define the characteristic area of a set of points, Ξ(ρA⋆), which is the union area of circles of radius ρA⋆ centred at all points in the set. The characteristic distance of area ρA⋆ represents the typical value of ρ at which the area of the entire system experiences the transition. A further increase in ρ beyond that value does not contribute as much increase in the area of the largest cluster, and the cluster would thus entail unnecessary area. The value of Ξ(ρA⋆) is therefore expected to be a good proxy for the essential area covered by the set of points given their spatial distribution. In the case of a transport system, we call this area the serving area of all the transport points.
Having defined the area, it is easy to calculate the density of points per unit area, which is simply the ratio between the number of points and the area they cover, N/Ξ(ρA⋆). From empirical analysis, we find the data fit very well to the relation
$$\frac{N}{\Xi(\rho_A^\star)} \propto (\rho_A^\star)^{-\tau} \tag{5}$$
with τ ≈ 1.29. Figure 5 shows the empirical relation between the two quantities. The relation in Eq. (5) is well obeyed by all 73 cities. It is a remarkable relation given the scattered relation between ρA⋆ and N or Ξ(ρA⋆). For reference, the artificially generated data, including both random and regular patterns (see "Discussion" for details), are also included in the plot but not in the fitting itself. As can be observed, those points generally stay below the points for the 73 cities.
It should be further noted that the relation in Eq. (5) or Fig. 5 is not a trivial one. To see this, we consider the two scenarios of ρ in extreme limits, ρ ≫ 1 and ρ ≪ 1, and its relation with N/Ξ(ρ). For the small extreme value, ρ ≪ 1, all clusters include only a single point, and there are, hence, N clusters. We, therefore, have
$$\Xi(\rho) = N\pi\rho^2 \tag{6}$$
which yields the relation
$$\frac{N}{\Xi(\rho)} = \frac{1}{\pi\rho^2} \propto \rho^{-2}. \tag{7}$$
At the other extreme value, ρ ≫ 1, there is only one single cluster that encompasses all points concentrating at the centre of the union area. We, therefore, have
$$\Xi(\rho) \approx \pi\rho^2, \tag{8}$$
which leads to
$$\frac{N}{\Xi(\rho)} \approx \frac{N}{\pi\rho^2}, \tag{9}$$
which in turn displays scaling behaviour like in Eq. (5) if and only if the number of points N scales with ρ.
The whole argument about the extreme values of ρ is to illustrate that the scaling relation in Eq. (5) with exponent τ ≈ 1.29 is not a relation that can be achieved with any value of ρ. The relation can only hold at some value of the buffer radius like ρA⋆, given the structure in the distribution of N points. Because of this feature, we consider ρA⋆ the characteristic distance of a set of spatially distributed points.
Amenity distribution
Besides the transport system, which is represented by a network of transport points, the morphology of an urban system can also be understood from another angle by examining the distribution of amenities within it. It turns out that despite possessing different types of distribution of transport points, the cities appear to share a common universal distribution of amenities. The analysis of locations of amenities in all the cities reveals that the (Euclidean) distance Ωk of an amenity k to its nearest transport point follows a robust exponential distribution. That means the probability of finding an amenity with distance Ω to its nearest transport point decays exponentially with Ω, i.e. its probability density function is given by
$$P(\Omega) = \lambda e^{-\lambda\Omega}, \tag{10}$$
which renders its mean and standard deviation (not variance) equal
$$\langle\Omega\rangle = \sigma(\Omega) = \sqrt{\langle\Omega^2\rangle - \langle\Omega\rangle^2} = \frac{1}{\lambda}. \tag{11}$$
In Fig. 6, the mean ⟨Ω⟩ and standard deviation σ(Ω) of the shortest amenity-transport point distance for different cities are shown to stay close to the diagonal line σ(Ω) = ⟨Ω⟩. The exponential distribution of the distance Ω is strongly supported by further verifying that higher moments of the distribution P(Ω) fit well to
$$\langle\Omega^n\rangle = \frac{n!}{\lambda^n}, \tag{12}$$
up to fourth order, n = 4.
It has to be emphasized that the distribution of distance from amenities to their nearest transport points follows an exponential rather than a Poissonian one. That means the mean of such distance is (approximately) equal to its standard deviation rather than its variance, which would hold for a Poisson distribution. The robust distribution of amenities across all city types has one important implication: the local growth process in urban systems appears to be independent of human intervention and of the larger scale of the entire system. That means planners can plan the large-scale growth process like transportation, but the small-scale growth process like local business still takes place on its own. It remains a significant question why the distribution is exponential and not of any other form. In fact, exponential decay in spatial urban patterns has long been reported in the literature.21 Using this feature as a fact, a model has been constructed to successfully capture the morphology of urban systems.22
[Figure 5: log-log scatter plot of the density N/Ξ(ρA⋆) (km−2, from 10^0 to 10^2) against ρA⋆ (m, from 200 to 1700). Legend: data from 73 cities; fitting line with slope −1.29; randomly generated data; regularly generated data.]
Figure 5. Empirical relation between density of points per unit area and the characteristic distance ρA⋆ in log-log scale. The fitting line has slope −1.29 and was obtained by linear regression with R² ≈ 0.8. The artificially generated data, both random and regular, were not included in the fitting.
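A fit of this kind can be sketched as an ordinary least-squares regression on the logarithms of both quantities. The code below is only an illustration: the function name is hypothetical, the data are synthetic values constructed to follow the power law exactly, and the paper does not specify the regression routine it used.

```python
import math

def fit_power_law(x, y):
    """Least-squares fit of log10(y) = intercept + slope * log10(x).

    Returns (slope, intercept, r_squared). For y proportional to
    x ** (-tau), the fitted slope equals -tau.
    """
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mean_x, mean_y = sum(lx) / n, sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    syy = sum((b - mean_y) ** 2 for b in ly)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

# SYNTHETIC data (not the 73 cities): densities following an exact power law.
rho_star = [200.0, 400.0, 800.0, 1600.0]
density = [5.0e4 * r ** -1.29 for r in rho_star]
slope, intercept, r_squared = fit_power_law(rho_star, density)
```

On real data the points scatter around the line, so `r_squared` would be below 1, as with the value of about 0.8 quoted in the caption.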
Relation between transport point and amenity distributions
There appears to be a relation between the density of transport points within a city and the distribution of its amenities. Figure 7 depicts this relation by plotting the average distance ⟨Ω⟩ against the density N/Ξ(ρA⋆). It can be observed that only the lower-left triangle of the plot is occupied, leaving no points in the upper-right corner of the plot. That means there are no cities with a high density of transport points and, at the same time, a large (average) distance between their amenities and the nearest transport points. This can be easily understood from the fact that, in cities with a high density of transport points, the road network is very dense and the transport points have to stay within a short distance of each other. As a result, the amenities must necessarily be built very close to the public transport points. It turns out that the cities with very high density of transport points are those with a single-scale regular pattern of distribution, i.e. the points are regularly distributed at (approximately) equal distances from each other like grid points, such as San Francisco or the boroughs of New York City.
At the other end, the cities with low density of public transport points can exhibit a wide spectrum of average amenity-transport point distance ⟨Ω⟩. These cities can either have large or small ⟨Ω⟩. A large value of ⟨Ω⟩ (and hence, a large standard deviation σ(Ω), too) implies a city with a sparse distribution of amenities, in which they are distant from the nearest transport point, like Dallas or San Antonio in Texas, USA. On the other hand, a small value of ⟨Ω⟩ suggests that the amenities are built close to public transport points, implying the existence of sub-centres or several small towns or districts within the city, such as Turin in Piedmont, Italy.
Discussion
The present work analyses the features of the spatial distribution patterns of important entities in an urban system, the public transport points and the amenities. The former are part of the backbone of any urban system, the street network, which plays an essential rôle in enabling flow or exchange of various processes in the city. The advantage of these transport points is that they can be well defined and easily collected and, at the same time, provide other information like the residential distribution within the city. The latter, on the other hand, can gauge the size of the population as well as the level of activities in the city. The results unveil different types of city with distinct spatial patterns. The cities are shown to be either of organised type, in which the entities are well spaced as if they are built top-down, or of organic type, in which the entities are spaced with multiple length
[Figure 6: scatter plot of σ(Ω) (m) against ⟨Ω⟩ (m), with both axes running from 100 to 450 m.]
Figure 6. Standard deviation σ(Ω) vs. mean ⟨Ω⟩ of distance from amenities to nearest transport points within each city. The reference line is σ(Ω) = ⟨Ω⟩.
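The comparison behind this figure, together with the moment test of Eq. (12), could be sketched as follows. The function names are hypothetical, and the distances are synthetic: they are generated deterministically as quantiles of an exponential distribution with an assumed mean of 250 m, not taken from any city.

```python
import math

def raw_moment(samples, order):
    """Empirical raw moment <Omega^n> of the samples."""
    return sum(s ** order for s in samples) / len(samples)

def mean_and_std(samples):
    """Return (<Omega>, sigma(Omega)) as defined in Eq. (11)."""
    mean = raw_moment(samples, 1)
    variance = raw_moment(samples, 2) - mean ** 2
    return mean, math.sqrt(max(variance, 0.0))

def exponential_moment_ratios(samples, max_order=4):
    """Ratio of each empirical moment to n!/lambda^n, with 1/lambda = <Omega>.

    For exponentially distributed samples every ratio should be close to 1.
    """
    mean = raw_moment(samples, 1)
    return [
        raw_moment(samples, n) / (math.factorial(n) * mean ** n)
        for n in range(1, max_order + 1)
    ]

# SYNTHETIC distances in metres: exponential quantiles with mean 250 m.
count = 20000
distances = [-250.0 * math.log(1.0 - (k + 0.5) / count) for k in range(count)]
mean_distance, std_distance = mean_and_std(distances)
ratios = exponential_moment_ratios(distances)
```

For a city whose amenity distances are exponential, the pair (`mean_distance`, `std_distance`) would lie near the reference line σ(Ω) = ⟨Ω⟩ and all entries of `ratios` would stay near 1.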
scales as if they grow spontaneously. In either case, the typical distance among the transport points can be described by a characteristic distance. Despite the different types of the cities' spatial patterns, the density of the transport points exhibits universal scaling behaviour with this characteristic distance. On the other hand, the distance from amenities to their nearest transport points also follows a robust exponential distribution for all cities studied. Furthermore, there is an apparent relation between the distributions of transport points and amenities within the cities. These facts signify some universal mechanisms underlying the growth and development that all cities have to undergo.
In an attempt to understand the processes that generate the spatial distribution patterns of the transport points, we artificially generate some distributions of points on a two-dimensional surface. In the first distribution, the points are generated at random locations within a domain with uniform probability. In the second distribution, starting from a regular grid of points in a square lattice, the points are randomly displaced by a small amount not more than a quarter of the lattice spacing. It turns out that both the randomly and regularly generated data produce simple behaviours through our analysis. In particular, the peaks in size and area generally coincide with each other and stay localised (the random sets tend to produce more peaks while the regular ones have only one peak as expected), and hence, both σξ and σA are small, indicating a regular pattern of distribution. At this point, we would like to link the analysis with the idea of measure of complexity of a symbolic sequence.23–25 The idea states that both regular and random (in the sense of a random number generator) sequences possess very low measure of complexity, as their structures or patterns are simple and easy to present in terms of the so-called ε-machine.23 Along that line, it could be argued that the patterns observed in the distributions of transport points from the real data of 73 cities around the world are more complex than those in the artificially generated data. The generated data, which are meant to be either regular or completely random, possess only simple structure, as we have argued above, with small spreads σξ and σA. The real world data might well contain mixtures between regular and random patterns that could result in both the clustered and dispersed patterns that we have reported in "Spatial patterns in urban systems".
[Figure 7: scatter plot of the average distance ⟨Ω⟩ (m, from 50 to 450) against the density N/Ξ(ρA⋆) (km−2, from 0 to 45), with each city marked by its short code.]
Figure 7. Relation between density of transport points N/Ξ(ρA⋆) and average amenity-transport point distance ⟨Ω⟩. Refer to Fig. 2 for codename of the cities.
Method
General ideas
For the analysis, we propose a procedure to characterise the spatial pattern of a set of points. The procedure involves identifying clusters of points whose pairwise distance does not exceed the value of a parameter, and quantifying the growth of the clusters as the parameter value increases.
Consider a domain D which can be thought of as a city or a town. In this domain, there are N points i distributed, each of which represents a transport point located at coordinates (xi, yi). We introduce a parameter called the buffer radius ρ to construct the clusters. Any point j, whose distance
$$d_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2} \tag{13}$$
from point i is less than or equal to ρ, belongs to the same cluster as i. We denote by η(ρ) the number of clusters given the buffer radius ρ. For each cluster α, we define the cluster size ξα(ρ) and the cluster area Aα(ρ). The cluster size is defined as the number of points in the cluster and the cluster area as the union of area of circles of uniform radius ρ centred at the points in the cluster. To make different domains comparable, we normalise the cluster size ξα(ρ) by the number of points N in the domain, and the cluster area Aα(ρ) by the union of area Ξ(ρ) of circles of radius ρ centred at all points in the domain.
The identification of the clusters can be done by using a simple heuristic cluster finding algorithm that starts with a random point in the set and gradually identifies the other points in the same cluster. Alternatively, one can employ the method of DBSCAN,26 setting the noise parameter to be zero. The two methods are equivalent and yield the same results.
Analysis
For any cluster-related quantity, we attach the subscript ξ to associate it with cluster size and A for cluster area. For simplicity of all discussions, unless stated explicitly, descriptions for cluster size ξ also hold for cluster area A.
To quantify the spatial pattern of the set of points, we vary the buffer radius parameter ρ. As ρ increases, points that are farther apart can belong to the same cluster. As a result, the clusters can merge to increase their size. Tracing the behaviour of the largest cluster ξmax(ρ) can tell us how the points are distributed within the set. For example, the profile of the first derivative ξ′max(ρ) = dξmax(ρ)/dρ (and A′max(ρ) = dAmax(ρ)/dρ) can indicate at which distance ρ the points are (largely) connected in a single cluster. Because ξmax(ρ) increases monotonically with ρ, we introduce the so-called characteristic distance ρξ⋆ at which ξmax(ρ) exhibits the most significant increase. In some cases, the profile ξmax(ρ) shows a sharp narrow increase around a value of ρ, while in other cases, several small increases are observed, spreading over a wide range of ρ. To account for that, it is also meaningful to introduce a quantity σξ, called spread of transition, to measure the overall width of the increases in the profile of ξmax.
To recap, ρξ⋆ is the value of ρ above which there is a significant transition in the largest cluster size ξmax(ρ) (see peak analysis in "Peak analysis"); σξ measures the width of the transition. The respective quantities for cluster
area are $\rho_A^\star$ and $\sigma_A$. We then take the union area at this buffer radius $\rho_A^\star$ over all points in the system as the effective area $\Xi(\rho_A^\star)$ of the point set.

We refrain from using the average of the distribution of cluster size or area because these measures are vulnerable to errors in the data, i.e. outliers. For example, when the buffer radius $\rho$ is large enough that all but a few of the points in the dataset belong to a single large ("giant") cluster, there are still two (or more) clusters, and the average cluster size would be half (or less) of what it is supposed to be.

Peak analysis
As the buffer radius $\rho$ increases, the size of the largest cluster $\xi_{\max}(\rho)$ also increases. The profile of $\xi_{\max}(\rho)$ exhibits either a sharp or a gradual increase. The former produces a single dominant peak in the profile of $\xi'_{\max}(\rho)$, while the latter produces a set of peaks scattered over a wide range of $\rho$. This scattering of peaks can be quantified by the standard deviation of their locations, weighted by the strength (height) of the peaks. A small standard deviation implies a sharp increase in $\xi_{\max}(\rho)$ and, conversely, a large standard deviation signifies a gradual increase.

In our analysis, we consider the profile of $\xi'_{\max}(\rho)$ (and $A'_{\max}(\rho)$) at every value of the buffer radius $\rho$ from $\rho_{\min} = \rho_1 = 10$ m to $\rho_{\max} = \rho_M = 2000$ m in steps of $\delta\rho = \rho_{i+1} - \rho_i = 5$ m, $\forall i$. Since the values of the buffer radius are discrete, a point $(\rho_i, \xi'_{\max}(\rho_i))$ is a peak if and only if

$$\begin{cases} \xi'_{\max}(\rho_i) > \xi'_{\max}(\rho_{i-1}) \\ \xi'_{\max}(\rho_i) > \xi'_{\max}(\rho_{i+1}) \end{cases} \tag{14}$$

This discreteness also produces many small noisy peaks. We filter them by offsetting the entire profile of $\xi'_{\max}(\rho)$ by a sufficiently small amount and considering only the peaks that remain positive. The value of $\rho_\xi^\star$ is then the mean of $\rho$ over all peaks, weighted by the peak height $\xi'_{\max}(\rho)$ (see Eq. (3)). The spread $\sigma_\xi$ is the standard deviation of $\rho$ over all peaks, again weighted by the peak height $\xi'_{\max}(\rho)$ (see Eq. (4)).

[...] confined within areas on Earth's surface spanning less than 100 km in both dimensions, we find the approximation method below sufficient, with errors of less than 0.5% [29].

We convert the spherical coordinates $\boldsymbol{\varphi}_i = (\phi_i, \theta_i)$ to Cartesian coordinates $\mathbf{r}_i = (x_i, y_i)$ for every point $i$ in the dataset by first setting the origin of the plot. The origin $O$ is the centroid of all points,

$$\boldsymbol{\varphi}_O = \langle \boldsymbol{\varphi}_i \rangle = \frac{1}{N} \sum_{i=1}^{N} \boldsymbol{\varphi}_i, \tag{15}$$

for which $\mathbf{r}_O = (0, 0)$. The Cartesian coordinates of a point $i$ are then determined from great-circle distances relative to the origin $O$. In particular, the $x$-coordinate of $i$ is its (signed) great-circle distance from the point that has the same longitude $\phi_O$ as $O$ and the same latitude $\theta_i$ as $i$ itself. The $y$-coordinate of $i$, on the other hand, is its (signed) great-circle distance from the point that has the same latitude $\theta_O$ as $O$ and the same longitude $\phi_i$ as $i$ itself. The great-circle distance is calculated using the "haversine" formula and, hence, the Cartesian coordinates of point $i$ are given by

$$x_i = 2R \tan^{-1} \frac{\cos\theta_i \, \sin\dfrac{\phi_i - \phi_O}{2}}{\sqrt{1 - \cos^2\theta_i \, \sin^2\dfrac{\phi_i - \phi_O}{2}}}, \tag{16}$$

$$y_i = R(\theta_i - \theta_O), \tag{17}$$

with $R = 6{,}371{,}000$ m being the Earth's radius.
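The coordinate conversion of Eqs. (15)–(17) can be illustrated with a minimal sketch. It is not the authors' code: the function name `to_cartesian`, the input layout (a list of `(longitude, latitude)` pairs in degrees) and the constant name are assumptions made for illustration. The centroid is taken as the plain arithmetic mean of the angles, as in Eq. (15), which is only reasonable for small regions that do not cross the antimeridian.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # R in Eqs. (16)-(17)


def to_cartesian(points_deg):
    """Map (longitude, latitude) pairs in degrees to (x, y) in metres.

    Hypothetical helper: the origin is the centroid of the input points,
    Eq. (15); x follows Eq. (16) and y follows Eq. (17).
    """
    pts = [(math.radians(lon), math.radians(lat)) for lon, lat in points_deg]
    n = len(pts)
    lon0 = sum(lon for lon, _ in pts) / n  # centroid longitude
    lat0 = sum(lat for _, lat in pts) / n  # centroid latitude

    xy = []
    for lon, lat in pts:
        # Signed haversine term: cos(theta_i) * sin((phi_i - phi_O) / 2)
        s = math.cos(lat) * math.sin((lon - lon0) / 2.0)
        # atan2(s, sqrt(1 - s^2)) equals atan(s / sqrt(1 - s^2)) here,
        # because the second argument is never negative.
        x = 2.0 * EARTH_RADIUS_M * math.atan2(s, math.sqrt(1.0 - s * s))
        y = EARTH_RADIUS_M * (lat - lat0)
        xy.append((x, y))
    return xy


# Three made-up points roughly 11 km apart.
sample = [(103.80, 1.30), (103.90, 1.30), (103.85, 1.40)]
sample_xy = to_cartesian(sample)
```

Because the sign of $x_i$ is carried by the sine term, points west of the centroid receive negative $x$ and points east of it positive $x$, with no separate case handling.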
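The peak criterion of Eq. (14) and the weighted statistics described under Peak analysis can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: Eqs. (3) and (4) are not reproduced in this section, so the sketch assumes the usual weighted mean and weighted (population) standard deviation, and it assumes that the peak heights remaining after the offset serve as the weights. The names `weighted_peak_stats` and `offset` are hypothetical.

```python
import math


def weighted_peak_stats(rho, profile, offset=0.0):
    """Return (weighted mean, weighted std) of the peak locations.

    rho     : buffer radii rho_1..rho_M, in increasing order
    profile : values of the profile at each rho (same length as rho)
    offset  : amount subtracted from the profile to suppress noisy peaks
    """
    shifted = [value - offset for value in profile]
    peaks = [
        (rho[i], shifted[i])
        for i in range(1, len(shifted) - 1)
        # Eq. (14): strictly larger than both neighbours ...
        if shifted[i] > shifted[i - 1] and shifted[i] > shifted[i + 1]
        # ... and still positive after the offset.
        and shifted[i] > 0.0
    ]
    if not peaks:
        raise ValueError("no peaks remain above the offset")

    total = sum(height for _, height in peaks)
    mean = sum(r * height for r, height in peaks) / total
    variance = sum(height * (r - mean) ** 2 for r, height in peaks) / total
    return mean, math.sqrt(variance)


# Made-up profile with two clear peaks and one small noisy peak at rho = 25.
radii = [10, 15, 20, 25, 30, 35, 40]
values = [0.0, 1.0, 0.0, 0.05, 0.0, 3.0, 0.0]
rho_star, sigma = weighted_peak_stats(radii, values, offset=0.1)
```

With the offset of 0.1, the small bump at $\rho = 25$ falls below zero and is discarded, so only the peaks at 15 and 35 contribute to the mean and the spread.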
Implementation
The cluster analysis is implemented in Python, using the shapely library to calculate the union area. We also use the DBSCAN library for DBSCAN analysis to compare with our method.

Data

References
1. Batty, M. Cities and Complexity (The MIT Press, Cambridge, 2005).
2. Bettencourt, L. M. A., Lobo, J., Helbing, D., Kühnert, C. & West, G. B. Growth, innovation, scaling, and the pace of life in cities. P. Natl. Acad. Sci. 104, 7301–7306 (2007).
12. Strano, E., Nicosia, V., Latora, V., Porta, S. & Barthélemy, M. Elementary processes governing the evolution of road networks. Nat. Sci. Rep. 2, 296 (2012).
13. Barthélemy, M., Bordin, P., Berestycki, H. & Gribaudi, M. Self-organization versus top-down planning in the evolution of a city. Nat. Sci. Rep. 3, 2153 (2013).
14. Gudmundsson, A. & Mohajeri, N. Entropy and order in urban street networks. Nat. Sci. Rep. 3 (2013).
15. Strano, E. et al. Urban street networks, a comparative analysis of ten European cities. Environ. Plann. B 40, 1071–1086 (2013).
16. Louf, R. & Barthelemy, M. A typology of street patterns. J. Roy. Soc. Interface 11 (2014).
17. Porta, S., Crucitti, P. & Latora, V. The network analysis of urban streets: A dual approach. Physica A 369, 853–866 (2006).
18. Stauffer, D. & Aharony, A. Introduction to percolation theory (Taylor & Francis, London, 1994).
19. Domb, C., Green, M. S. & Lebowitz, J. (eds.) Phase transitions and critical phenomena, vol. 1–20 (Academic Press, 1972–2001).
20. Stanley, H. E. Scaling, universality, and renormalization: Three pillars of modern critical phenomena. Rev. Mod. Phys. 71, S358–S366 (1999).
21. Clark, C. Urban population densities. J. Roy. Stat. Soc. A Gen. 114, 490–496 (1951).
22. Makse, H. A., Andrade, J. S., Batty, M., Havlin, S. & Stanley, H. E. Modeling urban growth patterns with correlated percolation. Phys. Rev. E 58, 7054 (1998).
23. Crutchfield, J. P. & Young, K. Inferring statistical complexity. Phys. Rev. Lett. 63, 105–108 (1989).
24. Crutchfield, J. P. Between order and chaos. Nat. Phys. 8, 17–24 (2012).
25. Huynh, H. N., Pradana, A. & Chew, L. Y. The complexity of sequences generated by the arc-fractal system. PLoS ONE 10, e0117365 (2015).
26. Ester, M., Kriegel, H.-P., Sander, J. & Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, KDD-96, 226–231 (1996).
27. URL https://www.transitlink.com.sg/.
28. URL https://mapzen.com/data/metro-extracts/.
29. URL http://www.movable-type.co.uk/scripts/latlon

Acknowledgements
We thank Muhamad Azfar Bin Ramli for his help in collecting Singapore bus stop data.

Author contributions statement
HNH, CM and LYC conceived and designed the study. HNH devised the method of analysis and analysed the data. EM collected the bus stop data of all cities. HNH and EFL collected the remaining data. HNH wrote the manuscript. All authors reviewed the manuscript.

Additional information
Competing financial interests: The authors declare no competing financial interests.