Environment and Planning B: Urban Analytics and City Science

Methodological Foundation of a Numerical Taxonomy of Urban Form

Journal: Environment and Planning B: Urban Analytics and City Science

Manuscript ID EPB-2021-0170.R1

Manuscript Type: Manuscript


Keywords: urban morphology, classification, urban morphometrics, numerical taxonomy

https://mc04.manuscriptcentral.com/epb
Abstract

Cities are complex products of human culture, characterised by a startling diversity of visible traits. Their form is constantly evolving, reflecting changing human needs and local contingencies, manifested in space by many urban patterns.

Urban Morphology laid the foundation for understanding many such patterns, largely relying on qualitative research methods to extract distinct spatial identities of urban areas. However, the manual, labour-intensive and subjective nature of such approaches represents an impediment to the development of a scalable, replicable and data-driven urban form characterisation. Recently, with advances in Geographic Data Science and the growing availability of digital mapping products, researchers in this field have developed an interest in quantitative urban morphology, or urban morphometrics, with the potential to overcome such limitations. And yet, our current capacity to systematically capture the heterogeneity of spatial patterns remains limited in terms of spatial parameters included in the analysis and hardly scalable due to the highly labour-intensive nature of the task. In this paper, we present a method for numerical taxonomy of urban form derived from biological systematics, which allows the rigorous detection and classification of urban types. Initially, we produce a rich numerical characterisation of urban space from minimal data input, minimising limitations due to inconsistent data quality and availability. The inputs are the street network, building footprints and morphological tessellation, a spatial unit derivative of Voronoi tessellation obtained from building footprints. From these, we derive homogeneous urban tissue types (or “taxa”) and, by determining overall morphological similarity between them, generate a hierarchical classification (“phenetic taxonomy”) of urban form. After framing and presenting the method, we test it on two cities, Prague and Amsterdam, and discuss potential applications and further developments. The proposed classification method represents a step towards the development of an extensive, scalable numerical taxonomy of urban form and opens the way to more rigorous comparative morphological studies and explorations into the relationship between urban space and phenomena as diverse as environmental performance, health and place attractiveness.

Keywords: urban morphometrics, classification, numerical taxonomy, urban morphology, cluster analysis

Introduction

Cities’ visual diversity is astounding. Indeed, when comparing their spatial form, marked differences can be clearly observed at all scales. And yet, despite these variations, their heterogeneous fabrics share geometric characteristics, which make it possible to compare them to one another through the analysis of their constituent elements and to recognise patchworks of distinct urban tissues, which frame our daily urban lives with impacts on key economic (Ahlfeldt & Pietrostefani, 2019), environmental (Andersson & Colding, 2014; Banister et al., 1997; Ewing & Rong, 2008) and social (Romice et al., 2017) dynamics within each city.

The endeavour of capturing these multifaceted spatial patterns has been the object of investigation across multiple disciplines. Notably, building on research in geography (Conzen, 1960) and architecture (Muratori, 1959), the discipline of urban morphology has devoted over 60 years to exploring recurrent patterns within urban forms in cities all over the world, aiming at their definition, classification and characterisation (Kropf, 1993; Kropf, 2014; Oliveira, 2016).

Further research has focused on the classification of morphological elements into “types”. This includes the series of works by Steadman (Steadman, Bruhns and Holtier, 2000; Steadman, Evans and Batty, 2009) on the classification of buildings based on a handful of empirically measured geometrical parameters, as well as the work by Marshall (2005) on the classification of street pattern types.

And whilst these contributions are heterogeneous in terms of object of interest (i.e. building, street, urban tissue), method (i.e. qualitative vs quantitative) and aim of the classification (i.e. energy performance, historical origin, design paradigm), they mark important attempts at classifying, through geometrical analysis, the variations of the individual elements (buildings; Steadman et al., 2000; Steadman et al., 2009) or aggregations of individual elements (street patterns; Marshall, 2005) that make up the spatial form of cities. As such, they mark steps towards a more rigorous study of relationships between different urban configurations.

Yet, our current capacity to systematically and comprehensively capture the heterogeneity of spatial patterns remains limited. Most existing research in urban morphology relies on highly supervised, expert-driven and labour-intensive qualitative methods, both in the data preparation process and in the design of the analysis. As a result, most existing works are hardly scalable due to the considerable amount of manual work required to prepare the input data and tend to focus on the analysis of relatively few spatial parameters.

Recently, however, advances in geographic data science, combined with the growing availability of geospatial data, have triggered a data-driven stream of urban morphology studies named “urban morphometrics” (e.g. Gil et al., 2012; Dibble et al., 2019; Araldi & Fusco, 2019; Bobkova, 2019). Within this line of research, the paper aims to address the need for a more systematic, scalable and efficient method for the detection and classification of morphological patterns. To this end, after presenting a brief literature review on urban form classification and specifying the requirements for a rigorous classification method, we:

● present an original quantitative methodology for the systematic unsupervised classification of urban form patterns and ground it on the theory of phenetics and numerical taxonomy in biological systematics;

● apply the proposed methodology to two exploratory case studies, as proofs of concept aimed at providing an illustration of the method and some of its potential theoretical impacts and technical shortcomings.

More specifically, we will first frame the proposed approach to urban form classification within numerical taxonomy, which seeks to describe and classify species and taxa based on morphological similarity (Sneath & Sokal, 1973). To build this methodological parallel between the (abiotic) system of urban form and biology, we 1) re-frame the constituent elements of urban form as the building blocks of the method (structural elements, taxonomic unit of analysis and morphometric characters), 2) describe how to identify structurally homogeneous urban form types (or “taxa”) and 3) measure their hierarchical relationship based on phenetic similarity, delivering a systematic numerical taxonomy of urban form. Finally, we test the proposed method on two major European cities characterised by various types of urban fabric originating from different historical stages: Prague, CZ and Amsterdam, NL.

We conclude by discussing the validation findings, highlighting the potential theoretical impact of the proposed method and examining its methodological limitations.

Existing models of urban form classification

The primary aim of classification is to reduce the complexity of the world around us. Many urban form classification methods exist at building (Steadman et al., 2000; Steadman et al., 2009; Schirmer & Axhausen, 2015), street (Marshall, 2005), neighbourhood (Soman et al., 2020) and city (Louf & Barthelemy, 2014) scales, varying conceptually and analytically in terms of focus scale (e.g. global, Angel et al., 2012, vs local, Guyot et al., 2021), analytical approach (e.g. quantitative vs qualitative) and aim of the classification. Structurally, the simplest forms involve flat classifications, where the relationship between types is unknown. These are either binary, like organized vs. unorganized neighbourhoods (Dogrusoz & Aksoy, 2007), or multi-class, as Caruso et al.’s (2017) 4-class clustering based on inter-building distance, Song and Knapp’s (2007) 6-class neighbourhood typology based on factor analysis and K-means of 21 spatial descriptors, or the “multiscale typology” by Schirmer & Axhausen (2015) identifying four flat classes based on centrality and accessibility. More complex classifications involve hierarchical methods (taxonomies), which organise classes based on the strengths of their mutual relationships, like Serra et al.’s (2018) hierarchical taxonomy of neighbourhoods built according to 12 morphological characters of street network, blocks and buildings, and the work by Dibble et al. (2019), who hierarchically classify portions of urban area enclosed by main streets, or “sanctuary areas”. More granular approaches include the work by Araldi & Fusco (2017, 2019), who classify street segments using 21 morphometric characters derived from street networks, building footprints and a digital terrain model, and research by SMOG at Chalmers University (Berghauser Pont et al., 2019a; Berghauser Pont et al., 2019b; Bobkova et al., 2019) that classifies morphological elements of plots, streets and buildings through a handful of morphometric characters.

Other approaches employ morphometric assessment to predict pre-defined typologies of buildings, streets or larger areas (Marshall, 2005; Hartmann et al., 2016; Neidhart and Sester, 2004; Steiniger et al., 2008; Wurm et al., 2016). These validate the use of morphometrics in the classification of urban form, even though the typology itself is pre-defined rather than derived empirically. Related to this are Urban Structural Type classifications reviewed by Lehner & Blaschke (2019), and detection of Local Climate Zones (Stewart & Oke, 2012; Taubenböck et al., 2020).

Whilst the list does not aim to be exhaustive of all contributions, it nevertheless provides an overview of the state of the art in urban form classification research. Specifically, it highlights how each of these methods shows shortcomings in scalability (the ability to analyse large areas while retaining the detail), transferability (the ability to apply to different contexts), robustness (the ability to remain unaffected by small imprecision of the input data or measurement), extensiveness (i.e. the bias induced by a small number of variables) or interpretative flexibility (i.e. missing relations between classes). This leaves a methodological gap in the morphometric classification of the built environment, hindering the development of a universal taxonomy of urban form.

Method: Building a taxonomy of urban form

The problem of classification of urban patterns based on geometrical resemblance is not dissimilar, conceptually speaking, to the work of early biologists seeking to classify biotic species and taxa based on morphological similarity. This was indeed the primary aim of numerical taxonomy (and generally phenetics), established in biology in the second half of the 20th century (Sneath & Sokal, 1973).

Whilst DNA sequencing and phylogenetics have now largely replaced morphometrics in modern biological taxonomy, we can take advantage of the latter for the study of urban form. Much as the study of organismal phenotypes and the statistical description of biological forms were instrumental to the separation of individuals (and species) into recognisable, homogeneous groups (Raup, 1966), extending numerical taxonomy to the study of urban form offers an operationally viable and reliable conceptual and methodological framework for a systematic classification of homogeneous urban form types.

And yet, whilst this possibility has always fascinated urban scholars as an analogy (Philip and Steadman, 1979), a rigorous methodological parallel between numerical taxonomy and urban form classification is still a matter of pioneering research.

Among the first authors to explicitly apply numerical taxonomy to urban form were Dibble et al. (2019) who, notwithstanding operational limitations, measured a large number of geometrical parameters of fundamental morphological elements (buildings, streets, plots etc.) to test the applicability of the approach in urban morphology. However, their method requires predefined boundaries of urban types, is extremely data demanding and cannot be applied without manual measurement. Despite that, it paved the conceptual way for further research, including that presented in this paper.

Morphometrics and numerical taxonomy in urban form

The first step for an inclusive numerical taxonomy of urban form is the definition of the morphometric building blocks of the method, namely: 1) structural elements, or the urban form counterpart of the individual and its body in biology (Sneath & Sokal, 1973); 2) the operational taxonomic unit (OTU), or else the unit forming the lowest ranking taxa, which in biology is individuals or populations depending on taxonomic level; and 3) morphometric characters, that is the measurable traits of each structural element (the “wing’s length” or “beak’s dimension” in biology).

Structural elements: Morphometric elements

Urban morphologists generally agree on three fundamental elements of urban form: buildings, plots and streets (Kropf, 2017; Moudon, 1997). To make our method scalable, it is imperative that, when these are translated into operational and measurable morphometric elements, i.e. vector features in GIS data layers, they maintain their intrinsic meaning with minimal data input, hence maximising data accessibility and consistency.

From a morphometric standpoint, this is relatively straightforward for streets and buildings due to their conceptual simplicity: buildings can be represented as building footprint polygons (with the attribute of building height) at Level of Detail 1 (Biljecki et al., 2016), whilst streets can be represented as network centrelines, cleared of transport planning-related structures. The same is more complicated for the plot, particularly at large scale, due to its highly polysemic nature (Kropf, 2018) and ambiguous structuring role in contemporary urban fabrics (Levy, 1999).

To avoid the plot’s inconsistencies, we use instead morphological tessellation, a polygon-based derivative of Voronoi tessellation obtained from building footprints, proposed by Fleischmann et al. (2020) after Hamaina et al. (2012) and Usui & Asami (2013), and the morphological cell, its smallest spatial unit, which delineates the portion of land around each building that is closer to it than to any other, but no further than 100 m. As such, the morphological tessellation captures the topological relations between individual cells and the influence that each building exerts on the surrounding space (Hamaina et al., 2012), regardless of historical origin, thanks to its contiguity throughout the whole analysis space (figures 1a and 2). Furthermore, being generated solely from building footprints, it does not increase data reliance. However, for the same reason, it cannot represent unbuilt areas and empty plots and does not serve as a substitute for the plot in general terms, as it does not have the same structural role. Morphological tessellation is a purely analytical element.

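
To make the definition of the morphological cell concrete, the following minimal Python sketch approximates it on a set of sample points: each point is assigned to its nearest building, unless that building is further away than the 100 m limit. It is an illustration under simplifying assumptions, not the procedure used in the paper: buildings are reduced to single points rather than footprint polygons, and the names `assign_cells`, `buildings`, `grid` and `limit` are hypothetical.

```python
import math


def assign_cells(buildings, grid_points, limit=100.0):
    """Assign each sample point to the id of its nearest building.

    buildings: dict mapping a building id to an (x, y) point in metres
               (a simplification: real footprints are polygons).
    grid_points: iterable of (x, y) sample points covering the study area.
    limit: maximum distance in metres; points further away stay unassigned.
    """
    cells = {}
    for point in grid_points:
        nearest_id, nearest_dist = None, math.inf
        for building_id, location in buildings.items():
            dist = math.dist(point, location)
            if dist < nearest_dist:
                nearest_id, nearest_dist = building_id, dist
        # Land beyond the distance limit belongs to no cell.
        cells[point] = nearest_id if nearest_dist <= limit else None
    return cells


# Two hypothetical buildings 60 m apart, sampled along a line every 10 m.
buildings = {"a": (0.0, 0.0), "b": (60.0, 0.0)}
grid = [(x, 0.0) for x in range(-150, 211, 10)]
cells = assign_cells(buildings, grid)
```

The set of points sharing one building id approximates that building's cell; points labelled `None` correspond to land further than the limit from any building, which the tessellation leaves uncovered.
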
Figure 1: a) Fundamental morphometric elements: building footprint, tessellation cell (derived from building footprints) and street (segment and node from centrelines). b) Diagram illustrating the workflow of the proposed method. Generated elements (tessellation, blocks) are derived from the input data (buildings, streets). All elements are used to measure primary morphometric characters. Each of them is then represented as 4 contextual characters that are used as an input of the cluster analysis. Finally, the resulting classes are organised in a taxonomy.

Taxonomic unit: Urban form type

In biology, the operational taxonomic unit (OTU) is intuitive (the individual living organism). The same is, however, not true for urban form. In urban morphology, the OTU can be associated with the concept of “morphological regions” (Oliveira & Yaygin, 2020), “urban tissues” (Caniggia & Maffei, 2001; Kropf, 1996) or “urban structural types” (Lehner & Blaschke, 2019; Osmond, 2010), or else “a distinct area of a settlement in all three dimensions, characterized by a unique combination of streets, blocks/plot series, plots, buildings, structures and materials and usually the result of a distinct process of formation at a particular time or period” (Kropf, 2017, p. 89).

From a morphometric standpoint, adopting the concept of “urban tissue” as the OTU of the proposed method has two main advantages. First, being grounded on the notion of homogeneity, its definition can be configured as a typical problem of cluster analysis: homogeneous urban tissues are hence derived from the analysis of recurrent similarities and differences in the morphometric characters of their constituent urban elements. Furthermore, as the size and geometry of each urban tissue are determined by internal homogeneity rather than pre-defined boundaries, the Modifiable Areal Unit Problem bias is minimised (Openshaw, 1984).

With the elements defined, the proposed method can be split into five consecutive steps, illustrated in figure 1b: 1) generation of morphological elements, 2) measurement of primary morphometric characters, 3) measurement of contextual characters, 4) cluster analysis and 5) taxonomy. The remaining steps are outlined in the following sections.

Morphometric characters: primary and contextual characters

The definition of measurable morphometric characters is key for cluster analysis and captures the cross-scale structural complexity of different urban tissues. To this end, building on an earlier literature review <masked for review>, we use six categories of morphometric characters: dimension, shape, spatial distribution, intensity, connectivity and diversity.

These characters allow us to numerically describe morphometric elements (street segments, building footprints and tessellation cells) within any urban fabric, by capturing the relationships between them and their immediate surroundings. They are measured at three topological scales: small (the element itself), medium (the element and its immediate neighbours) and large (the element and its neighbours within k-th order of contiguity). Spatial contiguity can either be kept constrained by enclosing streets (the equivalent of an urban block) or left unconstrained (see the Supplementary Material 1 for further details).

The morphometric characters considered are of two types: primary and contextual. Primary characters measure geometric and configurational properties of morphometric elements (buildings, streets and cells) and their topological relationships (at all scales). By abundantly representing all six morphometric categories (dimension, shape, spatial distribution, intensity, connectivity and diversity), this set is highly extensive. Accordingly, starting from the broad set of unique variables identified by <masked for review>, we shortlist 74 characters (table S1 in the Supplementary Material), following rules by Sneath & Sokal (1973) to minimise potential collinearity and limit redundancy of information, while retaining the universality of the method.

Primary characters describe morphometric elements and their immediate neighbourhood rather than their spatial patterns. As such, when employed for cluster analysis they may result in spatially discontinuous classes. Moreover, urban tissues are defined by their internal homogeneity, but this can be, and often is, a homogeneity of heterogeneity. In other words, a tissue may be defined by the combination of small and large buildings or of various shapes, and we need to capture these characteristics. Thus we derive a set of spatially lagged contextual characters describing the tendency of each primary character in its context. The term “context” is here defined as the topological aggregation of morphological cells within three topological steps from each given cell Ci, an empirically determined value large enough to capture a cohesive pattern over a relatively wide spatial extent but small enough to generate sharp boundaries between different patterns (Figure 2). The notion of “tendency” is in turn quantified through four values:

1. Interquartile mean (IQM), the most representative value cleaned of the effect of potential outliers.

2. Interquartile range (IQR), a local measure of statistical dispersion describing the range of values cleaned of outliers:

$$IQR_{ch} = Q3_{ch} - Q1_{ch},$$

where $Q3_{ch}$ and $Q1_{ch}$ are the third and first quartiles of the primary character.

3. Interdecile Theil index (IDT), describing the local (in)equality of the distribution of values:

$$IDT_{ch} = \sum_{i=1}^{n} \left( \frac{ch_i}{\sum_{j=1}^{n} ch_j} \ln \left[ N \frac{ch_i}{\sum_{j=1}^{n} ch_j} \right] \right),$$

where $ch$ is the primary character.

4. Simpson’s diversity index (SDI), capturing the local presence of classes of values compared to the global structure of the distribution:

$$SDI_{ch} = \frac{\sum_{i=1}^{R} n_i (n_i - 1)}{N (N - 1)},$$

where $R$ is richness, expressed as number of bins, $n_i$ is the number of features within the i-th bin and $N$ is the total number of features.

Of these, the first captures the local central tendency and the latter three the distribution of values within third order of contiguity from each cell.

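
The four contextual values can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the masked toolkit's implementation, and it rests on several assumptions that the text does not fix: quantiles are computed with linear interpolation, truncation keeps values inside the closed quantile interval, the `N` of the Theil formula is taken as the number of retained values, the character is strictly positive, and the bins of the Simpson index are supplied by the caller (here assumed to come from the global distribution). All function names are hypothetical.

```python
import math
from statistics import mean, quantiles


def truncate(values, low, high):
    """Keep only the values inside the closed interval [low, high]."""
    return [v for v in values if low <= v <= high]


def interquartile_mean(values):
    """Mean of the values lying between the first and third quartile."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return mean(truncate(values, q1, q3))


def interquartile_range(values):
    """Difference between the third and the first quartile."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def interdecile_theil(values):
    """Theil index of the values between the first and ninth decile.

    Assumes strictly positive values (the logarithm is undefined otherwise).
    """
    deciles = quantiles(values, n=10, method="inclusive")
    kept = truncate(values, deciles[0], deciles[-1])
    total = sum(kept)
    n = len(kept)
    return sum((v / total) * math.log(n * v / total) for v in kept)


def simpson_diversity(values, bin_edges):
    """Simpson index of the values counted into bins.

    bin_edges: ascending edges covering all values; the last bin is closed
    on the right. Requires at least two values.
    """
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if bin_edges[i] <= v < bin_edges[i + 1] or (is_last and v == bin_edges[-1]):
                counts[i] += 1
                break
    n_total = sum(counts)
    return sum(c * (c - 1) for c in counts) / (n_total * (n_total - 1))
```

Applied to the values of one primary character within the context of a cell, these four functions would yield the four contextual characters of that cell for that primary character.
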
51
52
53
54
55
56
57
58
59
60 https://mc04.manuscriptcentral.com/epb
Environment and Planning B: Urban Analytics and City Science Page 18 of 136

1818
1
2
3
4
5
6
7
8
Each primary character is used as an input for each of the four contextual values. The full set of morphometric characters hence includes 74 primary plus 296 contextual characters (74 × 4), totalling 370 characters. These are computed using the bespoke open-source Python toolkit <masked for review>, ensuring the full replicability and reproducibility of the method.

Figure 2: Adaptive topological aggregation of the morphological tessellation in Prague, where “context” is defined as all cells within third order of contiguity: a) compact perimeter blocks, b) single family housing.
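
The notion of context shown in Figure 2 amounts to a breadth-first search over the contiguity graph of tessellation cells. The sketch below is illustrative only: it assumes that contiguity is already available as an adjacency mapping (here a hypothetical dictionary named `adjacency`), that the context includes the starting cell itself, and the function name `context` is not taken from any library.

```python
from collections import deque


def context(adjacency, start, max_steps=3):
    """Return all cells reachable from `start` within `max_steps` topological steps."""
    steps = {start: 0}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if steps[cell] == max_steps:
            continue  # do not expand beyond the chosen order of contiguity
        for neighbour in adjacency[cell]:
            if neighbour not in steps:
                steps[neighbour] = steps[cell] + 1
                queue.append(neighbour)
    return set(steps)


# A toy chain of six contiguous cells: 0 - 1 - 2 - 3 - 4 - 5
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
neighbourhood = context(adjacency, 0)
```

Because the search counts topological steps rather than metric distance, the spatial extent of the context adapts to the grain of the fabric: it is small where cells are small and large where they are large.
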
Detection of morphological taxa

Only the values of contextual characters are input to the cluster analysis that identifies urban form types. Identifying OTUs as clusters of fundamental entities closely mirrors a mixture problem in biology, which identifies populations within samples and classifies at population level (Sneath & Sokal, 1973). Since contextual characters are spatially lagged, they are significantly spatially autocorrelated by design, which removes the need for computationally expensive spatially constrained models (Duque et al., 2012). We mitigate potential over-smoothing of the boundaries by basing contextual characters on truncated values (with the exception of SDI), which eliminates the effect of outliers and defines boundaries more precisely.

15 The most suited clustering algorithm is Gaussian Mixture Model (GMM), a probabilistic
16
17 derivative of k-means (Reynolds, 2009) tested in a similar context by Jochem et al. (2020).).
18
19
Unlike the k-means itself, GMM does not rely only on squared Euclidean distances and is
Fo
20
21
22 more sensitive to clusters of different sizes. GMM assumes that a Gaussian distribution
rR
23
24 represents each dimension of each cluster. Hence the cluster itself is defined by a mixture of
25
26 Gaussians and tested in a similar context by Jochem et al. . The output of GMM are cluster
ev

27
28
29 labels assigned to individual tessellation cells.(2020).
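
The clustering step can be sketched with scikit-learn's `GaussianMixture`. The synthetic array `X` stands in for the table of contextual characters (rows are tessellation cells, columns are characters), and the number of components and covariance type are assumptions chosen for the toy data, not the settings used in the paper.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic stand-in for contextual characters: two groups of cells
# with different means and different spreads (i.e. clusters of unequal size)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(200, 4)),
    rng.normal(loc=5.0, scale=1.5, size=(300, 4)),
])

gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
labels = gmm.fit_predict(X)            # one hard cluster label per cell
probabilities = gmm.predict_proba(X)   # soft membership of each cell in each cluster
```

The hard labels correspond to the cluster labels assigned to cells in the text; the membership probabilities are an additional by-product of the probabilistic model.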

The ideal outcome of cluster detection would equate clusters to distinct taxa of urban tissues. However, because the definition of urban tissue (Kropf, 2017) does not specify the threshold beyond which two similar parts of a city belong to the same tissue, it is difficult to equate clusters to taxa. We resolve this by estimating the number of clusters, which the GMM clustering method requires as an input, from the goodness of fit of the model, measured using the Bayesian Information Criterion (BIC) (Schwarz, 1978) and read from the “elbow” of the curve.
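
A minimal sketch of how such a BIC curve could be produced, assuming scikit-learn's `GaussianMixture` and its `bic` method. The helper `bic_curve`, the candidate range and the synthetic data are hypothetical; the resulting curve would then be inspected for its minimum or elbow.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def bic_curve(X, candidates, random_state=0):
    """Fit one GMM per candidate number of components and collect its BIC."""
    scores = {}
    for k in candidates:
        model = GaussianMixture(n_components=k, random_state=random_state)
        model.fit(X)
        scores[k] = model.bic(X)  # lower BIC = better balance of fit and complexity
    return scores

rng = np.random.default_rng(1)
# Synthetic data with three well-separated groups
X = np.vstack([rng.normal(c, 0.3, size=(150, 2)) for c in (0, 4, 8)])

scores = bic_curve(X, range(1, 7))
best_k = min(scores, key=scores.get)  # candidate with the lowest BIC
```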

The foundation of taxonomy

To classify urban form types, we use Ward’s minimum variance hierarchical clustering, chosen for its long lineage in academic use (Singleton & Longley, 2009) and its previous application in urban morphology (Dibble et al., 2019; Serra et al., 2018). Here, each urban form type is represented by its centroid (the mean of each character across cells with the same label). Ward’s algorithm agglomeratively links observations (OTUs) so as to minimise the increase in total within-cluster variance (Ward Jr, 1963). The classification is represented through a dendrogram capturing the cophenetic relationship between observations (i.e., morphometric similarity), forming the foundation of our taxonomy.
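
The step from cluster labels to a taxonomy can be sketched as follows, assuming SciPy's `linkage` with `method="ward"`. The toy array `X`, the labels and the variable names are hypothetical; in practice `X` would hold the contextual characters and `labels` the GMM output.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Hypothetical data: 6 cells described by 2 characters, with cluster labels 0-2
X = np.array([
    [1.0, 0.5], [1.2, 0.7],   # cells labelled 0
    [5.0, 3.0], [5.2, 3.2],   # cells labelled 1
    [9.0, 3.1], [9.4, 2.9],   # cells labelled 2
])
labels = np.array([0, 0, 1, 1, 2, 2])

# One centroid per urban form type: the mean of each character across its cells
types = np.unique(labels)
centroids = np.vstack([X[labels == t].mean(axis=0) for t in types])

# Ward's minimum variance linkage on the centroids (the OTUs of the taxonomy)
Z = linkage(centroids, method="ward")
# Each row of Z records one merge: the two joined nodes, their distance, and the size of the new node
```

The linkage matrix `Z` is the object from which a dendrogram can be drawn, for example with `scipy.cluster.hierarchy.dendrogram`.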

Validation theory

For validation, we study our taxonomy in relation to other urban dynamics with which some form of relationship is expected. In urban morphology, substantial theory and qualitative evidence suggest that different urban patterns emerge in areas of different historical origins, or else belonging to different “morphological periods” (Whitehand et al., 2014). This notion has also been observed quantitatively in the urban fabric (Boeing, 2020; Dibble et al., 2019; Porta et al., 2014; <masked>) as well as in land use patterns (Castro et al., 2019) of cities and is inherently embedded in our OTU (section 3.1.2).
We validate our classification against three datasets: 1) historical origins; 2) predominant land-use patterns; and 3) the qualitative classification of urban form adopted in official planning documents. In each case we use the same method: a cross-tabulation analysed using the chi-squared statistic and the related Cramér’s V (Agresti, 2018). The model is considered valid if a significant relationship is found between the proposed classification and each of these three datasets, and if similar performance is shown across the different case studies.
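
The validation statistic can be sketched as below, assuming SciPy's `chi2_contingency` and the standard definition of Cramér's V, V = sqrt(χ² / (N · min(r − 1, c − 1))), for a contingency table with r rows and c columns. The helper name `cramers_v` and the example table are hypothetical.

```python
import numpy as np
from scipy.stats import chi2_contingency

def cramers_v(table):
    """Chi-squared test of independence plus Cramér's V for a contingency table."""
    table = np.asarray(table)
    chi2, p_value, dof, _expected = chi2_contingency(table, correction=False)
    n = table.sum()
    k = min(table.shape) - 1  # min(rows - 1, columns - 1)
    v = np.sqrt(chi2 / (n * k))
    return chi2, p_value, dof, v

# Hypothetical cross-tabulation: rows = clusters, columns = validation classes
table = [[90, 10, 0],
         [10, 80, 10],
         [0, 10, 90]]

chi2, p_value, dof, v = cramers_v(table)
```

V ranges from 0 (no association) to 1 (each cluster maps onto exactly one validation class), which makes it comparable across tables of different sizes.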

Case study

We test the proposed method in two historical European cities: Prague, CZ and Amsterdam, NL. Prague’s analysis area is defined by its administrative boundary, which extends beyond its continuous built-up area, to minimise the “edge effect” of the street network (Gil, 2016). Amsterdam’s analysis area is defined by its contiguous urban fabric, which extends beyond the city’s administrative boundary. Comparing the two confirms the effectiveness of our taxonomy in different historically layered and heterogeneous contexts. The morphological data (buildings, streets) for the Prague case study were obtained from the city’s open data portal (https://www.geoportalpraha.cz/en), while the validation layers were provided by the Prague Institute of Planning and Development. The morphological data for Amsterdam were obtained from the 3D BAG repository (Dukai, 2020) and the Basisregistratie Grootschalige Topografie (http://data.nlextract.nl/).

Results: Taxonomy of Prague and Amsterdam

We measure all 74 primary characters for each morphological cell in both Prague and Amsterdam, and subsequently generate 296 contextual characters (four per primary character) as input to the cluster analysis.

Cluster analysis in Prague

Based on the BIC results (figure S5 in the Supplementary Material), GMM clustering identifies 10 clusters (figure 3a). On first visual inspection, clusters appear well defined and able to reflect homogeneous forms, their contiguity resulting from the patterned nature of contextual characters.
Figure 3: Spatial distribution of detected clusters in central Prague (a) and central Amsterdam (b), accompanied by dendrograms representing the results of Ward’s hierarchical clustering of urban form types in Prague (c) and Amsterdam (d). The y-axis shows the cophenetic distance between individual clusters, i.e., their morphometric dissimilarity. The full extent of the case studies is shown in figures S7 and S8 in the Supplementary Material.
Starting from the historical core of Prague (top left), we first identify the medieval urban form (7), then the compact perimeter blocks of the Vinohrady neighbourhood (6), and finally the fringe areas (3). Towards the south and east, we note low-rise tissues (1) and modernist developments (4).

Drawing purely from visual observation and personal knowledge of the city of Prague, the identified clusters appear to capture meaningful urban form types.

Cluster analysis in Amsterdam

In Amsterdam, the Bayesian Information Criterion (BIC) shows a different curve: no culmination point appears to indicate the optimum number of types, suggesting that the model overfits the data. This requires a different way of identifying the optimal number, based on the curve’s gradient, which in this case indicates 10 clusters, similarly to Prague.
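
One possible way to read an optimum from the gradient of a BIC curve that never reaches a minimum is sketched below. The rule used here (stop once an additional step improves BIC by less than a fixed fraction of the largest improvement), the function name, the tolerance and the BIC values are all assumptions for illustration; the text does not specify the exact criterion applied.

```python
import numpy as np

def elbow_from_gradient(ks, bic, tolerance=0.05):
    """Return the first k after which BIC improves by less than
    `tolerance` times the largest single-step improvement."""
    ks = np.asarray(ks)
    bic = np.asarray(bic, dtype=float)
    drops = -np.diff(bic)                # improvement gained by each further step
    threshold = tolerance * drops.max()
    flat = np.where(drops < threshold)[0]
    # If the curve never flattens, fall back to the largest candidate
    return int(ks[flat[0]]) if flat.size else int(ks[-1])

# Made-up, monotonically decreasing BIC values (not taken from the paper)
ks = [2, 4, 6, 8, 10, 12, 14]
bic = [1000, 700, 520, 430, 420, 412, 406]

n_clusters = elbow_from_gradient(ks, bic)
```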

As in Prague, the geography of clusters shows seemingly meaningful results (figure 3b). For example, cluster 7 captures the city’s historical core. Cluster 1 reflects well-known shifts in planning paradigms with the rise of the New Amsterdam School (Panerai et al., 2004), forming the early 20th century south expansion. Once again, under preliminary observation, the identified clusters capture meaningful historical spatial patterns.

Numerical taxonomy

The centroid values of each cluster, obtained as the mean value of each contextual character, are used as taxonomic characters in Ward’s hierarchical clustering. The resulting relationship between centroids represents the relationship between clusters (figure 3c). The dendrogram’s horizontal axis represents the detected clusters, while the vertical axis represents their cophenetic distance (i.e., morphological dissimilarity): the lower the link connecting two clusters, the higher their similarity. Values annotated under each connecting link represent cophenetic distance, while numbers in brackets give the number of connected clusters.
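
The cophenetic distance read from the dendrogram can also be obtained numerically. The sketch below assumes SciPy's `cophenet` and uses four hypothetical centroids placed on a line so that the merge heights are easy to follow.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import squareform

# Hypothetical centroids of four urban form types
centroids = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [12.0, 0.0]])

Z = linkage(centroids, method="ward")

# Cophenetic distance between two types = height of the link that first joins them
coph = squareform(cophenet(Z))
```

Here `coph[0, 1]` is the height at which types 0 and 1 merge, while `coph[0, 2]` is the height of the top-level bifurcation separating the two pairs.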

Prague’s dendrogram contains 10 clusters and illustrates the uniqueness of the spatial pattern of the medieval city (7), which forms the first bifurcation and an independent branch. A similar situation holds for the cluster covering industrial areas (0), which is dissimilar to the other clusters. Further down the dendrogram, we can see branches with regular perimeter blocks (6) and their fringe areas as well as contemporary office parks (3), unorganised development of the modern era (4, 2), and a branch featuring residential areas of low density (9, 1, 5, 8).

The dendrogram of Amsterdam’s urban form (figure 3d) shows similar characteristics, with a significant bifurcation into two branches and lower bifurcations distinguishing nested levels of spatial variation.

In the classification maps of Prague and Amsterdam shown in figure 3, urban form types are colour-coded to highlight distinctions at the level of individual clusters. However, we can instead colour-code according to the clusters’ similarity. The dendrogram shows several major bifurcations at different levels of cophenetic distance, indicating distinct higher-order groups of clusters. By colouring each cluster in the map according to the branch it belongs to in the dendrogram, and using different hues to distinguish between lower-level clusters in each branch, we distinguish hierarchies based on cophenetic distance (figure S11 in the Supplementary Material).

We can further combine the two cities’ clusters in one shared dendrogram (figure 4c). Urban form types from both pools appear regularly distributed in the lowest orders of the tree, showing a similar spatial structure emerging in both cases. Remarkably, the major bifurcation setting apart industrial urban forms holds neatly in the combined taxonomy.

A lower-order bifurcation within the main branch distinguishes between dense/compact urban form and the rest. Further lower-level subdivisions are also visible. Compared to the individual ones, the combined tree shows some differences in branching: a few clusters are reshuffled and the branches themselves are slightly reorganised. This is likely to happen as more and more cities are analysed, until the unified taxonomy reaches a “plateau” when enough cases are included, ultimately producing a ‘general taxonomy of urban form’.
Figure 4: Spatial distribution of different branches of the combined dendrogram in central Prague (a) and central Amsterdam (b), accompanied by the dendrogram representing the results of Ward’s hierarchical clustering of urban form types from a combined pool of Prague and Amsterdam (c). The y-axis shows the cophenetic distance between individual clusters, i.e., their morphometric dissimilarity. Branches are interpretatively coloured; the colours are then used on the maps illustrating the spatial distribution of these branches. The full extent of the case studies is shown in figures S9 and S10 in the Supplementary Material. Flows between combined and individual dendrograms are shown in figure S14 in the Supplementary Material.

The geography of the combined taxonomy of Prague and Amsterdam (figures 4a, 4b) allows cross-comparing urban form patterns by similarity (represented by similar colours). The same approach can be extended across a multitude of cities and regions.

Validation

We validate the output of numerical taxonomy against three datasets: 1) historical origins; 2) land-use patterns; and 3) qualitative classifications. All of these are assessed using the contingency table-based chi-squared statistic and Cramér’s V.

In Prague, proprietary data on historical origin provided by the Prague Institute of Planning and Development classify urban areas into 7 periods: 1840, 1880, 1920, 1950, 1970, 1990 and 2012. Data from the same source capture 123 categories of land use at the individual building/plot level, of which only 15 contain more than 1,000 buildings. We redefined the prevailing land use within 3 topological steps on the morphological tessellation: only 5 categories (Multi-family housing, Single-family housing, Villas, Industry small, Industry large) contain more than 1% of the dataset. We use these five and denote the rest as Other.
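
The idea of a prevailing land use within 3 topological steps can be sketched with a breadth-first search over a contiguity graph. The adjacency dictionary, the land-use values and the function names are hypothetical, and including the focal cell in its own context is an assumption.

```python
from collections import Counter, deque

def cells_within(adjacency, start, order=3):
    """Breadth-first search over the contiguity graph, up to `order` steps from `start`."""
    steps = {start: 0}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if steps[cell] == order:
            continue  # do not expand beyond the requested order of contiguity
        for neighbour in adjacency[cell]:
            if neighbour not in steps:
                steps[neighbour] = steps[cell] + 1
                queue.append(neighbour)
    return set(steps)

def prevailing_land_use(adjacency, land_use, start, order=3):
    """Most frequent land use among the cells within `order` steps (focal cell included)."""
    counts = Counter(land_use[c] for c in cells_within(adjacency, start, order))
    return counts.most_common(1)[0][0]

# Hypothetical chain of six tessellation cells: 0 - 1 - 2 - 3 - 4 - 5
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
land_use = {
    0: "Villas",
    1: "Single-family housing",
    2: "Single-family housing",
    3: "Single-family housing",
    4: "Industry small",
    5: "Industry small",
}

context = cells_within(adjacency, 0)
dominant = prevailing_land_use(adjacency, land_use, 0)
```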

Qualitative classification is drawn from a municipal typology of neighbourhoods developed by the city of Prague for planning purposes (Institut plánování a rozvoje hlavního města Prahy, 2018). Each neighbourhood has specified boundaries based on its morphology and other aspects, from historical origin to social perception, and is qualitatively classified according to 10 types. We exclude 3 types: hybrid and heterogeneous, which are non-morphological, and linear, which captures railway structures only.

Differently from Prague, the Amsterdam dataset of historical origin (Dukai, 2020) indicates each building’s year of construction, starting with 1800, rather than the first settlement of the area/plot. To ensure data compatibility with the method and avoid issues with pre-1800 periods, origin dates are binned into 11 groups following the classification of Spaan and Waag Society (2015).

The resulting chi-squared and Cramér’s V values are reported in table S7. Contingency tables are available as tables S3 – S6. All tests indicate moderate to high association between the identified clusters and the 3 sets of validation data, supporting the model’s validity.

Case study   Data                         Degrees of freedom   N        χ²          p-value   Cramér’s V
Prague       Historical origin            104                  140315   106700.51   < .001    0.358
Prague       Land use                     95                   140315   176165.83   < .001    0.501
Prague       Qualitative classification   114                  119355   325595.20   < .001    0.674
Amsterdam    Historical origin            290                  252385   312903.31   < .001    0.353

Table 1: Reported chi-squared and Cramér’s V results for each tested dataset. All results indicate a significant relationship as per the chi-squared statistic and moderate to high association as per Cramér’s V.

Historical origin shows moderate association in both Prague (V=0.331) and Amsterdam (V=0.311). Because of the nature of the data, where the period of first development is not the only driver of form and some tissues (e.g. single-family housing) populate multiple historical periods, a moderate association is expected.

Land use (V=0.468) and municipal qualitative classification (V=0.674), tested only in Prague, indicate moderate and high association with clusters, respectively. Again, since land use is only a partial driver of urban form, moderate association supports the proposed method’s high potential to capture urban reality. Furthermore, the relationship between morphometric types and qualitative ones sourced from the local authority is the highest among the validation data. This seems encouraging, since both classifications aim to capture a similar conceptualisation of the built environment.

Discussion

The proposed morphometric method hierarchically classifies urban form types according to the similarity of their morphological traits. It is numerical, unsupervised, rich in information and scalable in spatial extent. It identifies homogeneous clusters of urban form as distinct urban form types and, within each, contiguous urban tissues, reflecting the fact that in a typical city we observe many urban tissues belonging to the same type. The method is parsimonious in terms of input data, requiring only building footprints (and heights) and street networks to generate three morphometric elements (building units, street network, morphological tessellation) and to compute the 370 morphometric characters (74 primary and 296 contextual). Such a wealth of fine-grained information allows us to extensively characterise each building in the study area and its adjacency, and to derive distinct urban form types hierarchically organised according to similarity.

The method allows urban form analysis both in detail and at large scale, hence overcoming a methodological gap; it is data-driven and does not rely on (but confirms) experts’ judgement, other than for the interpretation of the BIC score. It is structurally hierarchical, which ensures depth along the similarity structure of urban form types and flexibility of use, according to the desired resolution of classification. Furthermore, it is extensive, encompassing a broad range of morphometric descriptors of major urban form components and their context, without biased pre-selection; and it is granular, since morphometric characters refer to each individual building.

Finally, it is scalable and reproducible, in that it is designed to suit large-scale coverage, such as cities and combinations of cities, and its source code is openly available.

Information generated with the proposed method supports applications at three different levels. First, the set of morphometric characters can be input to studies of the relationship between urban form and socio-economic aspects of urban life, e.g. via regression analysis. This includes investigations into the link between urban form and the energetic/bioclimatic performance of cities, population health, gentrification and place attractiveness. Second, flat clustering with morphometric profiles can provide aggregated information on patterns without dealing with individual characters. This makes it possible to capture the overall morphological “identity” of an urban tissue rather than focusing on one element at a time. Third, the taxonomy brings hierarchy into classification and, as such, can adapt its resolution to fit the question asked. In this sense, while the clusters themselves may be well suited to fine-grained spatial analyses, horizontally cutting the dendrogram at a desired height groups clusters into fewer, more generalised spatial aggregations, which might be better suited to analyses at coarser resolution.
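
Cutting the dendrogram at a chosen height can be sketched with SciPy's `fcluster` using `criterion="distance"`. The centroids and the two cut heights are hypothetical values chosen so that the effect of the cut is easy to see.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical centroids of four urban form types
centroids = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [12.0, 0.0]])
Z = linkage(centroids, method="ward")

# A low cut keeps most types apart; a higher cut merges similar types into branches
fine = fcluster(Z, t=1.5, criterion="distance")
coarse = fcluster(Z, t=5.0, criterion="distance")
```

With the low cut only the two closest types share a group, whereas the higher cut leaves two groups of two types each, mirroring the move from fine-grained to coarser resolution described above.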
Whilst parsimonious in terms of input data, our method still relies on their availability and consistency. The building footprints layer is often of sub-optimal quality: adjacent buildings may be represented as unified polygons, misleading the method in dense areas. Building-level information on height may not be available, reducing the depth of information, with potentially negative effects on the quality of the resulting clusters. Consistency of data across geographies may also be an issue, particularly for large spatial extents, which may require data generated independently by multiple sources.

The case of Amsterdam also pointed to potential overfitting of the model, since the BIC did not culminate and an assessment of its gradient was required to determine the number of components. However, we can mitigate this overfitting in the identification of initial types through the subsequent hierarchical classification, which merges similar clusters together.

Conclusions

The paper presents an original data-driven approach for the systematic unsupervised classification and characterisation of urban form patterns, grounded on numerical taxonomy in biological systematics. More specifically, it measures a selection of 74 primary characters from input data (buildings, streets) and derived elements (tessellation and blocks), each of which is represented through 4 contextual characters (interquartile mean, interquartile range, interdecile Theil index, Simpson’s diversity index). These are then used as an input to the cluster analysis, resulting in a hierarchical taxonomy. Finally, the proposed approach is validated through two exploratory case studies illustrating how the resulting clustering shows a significant relationship with validation data reflecting other urban spatial dynamics.

Urban morphometrics and the proposed classification method represent a step towards the development of a taxonomy of urban form and open the way to scalable urban morphology. By overcoming existing limitations in the systematic detection and characterisation of morphological patterns, the proposed approach enables the large-scale classification and characterisation of urban form patterns, potentially resulting, if applied to a substantial pool of cities, in a universal taxonomy of urban form.
At the same time, the proposed approach provides valuable tools for more rigorous comparative studies, which are fundamental to highlighting similarities and differences in the urban forms of different settlements in different contexts, and to exploring the relationship between urban space and phenomena as diverse as environmental performance, health, place attractiveness and more.

References

● Agresti A (2018) An Introduction to Categorical Data Analysis. John Wiley & Sons.

● Angel S, Blei AM, Civco DL and Parent J (2012) Atlas of Urban Expansion. Cambridge, MA: Lincoln Institute of Land Policy.

● Araldi A and Fusco G (2017) Decomposing and Recomposing Urban Fabric: The City from the Pedestrian Point of View. In: Computational Science and Its Applications – ICCSA 2017. Cham: Springer International Publishing, pp. 365–376. DOI: 10.1007/978-3-319-62401-3_27.

● Araldi A and Fusco G (2019) From the street to the metropolitan region: Pedestrian perspective in urban fabric analysis. Environment and Planning B: Urban Analytics and City Science 46(7): 1243–1263. DOI: 10.1177/2399808319832612.

● Berghauser Pont M, Stavroulaki G and Marcus L (2019a) Development of urban types based on network centrality, built density and their impact on pedestrian movement. Environment and Planning B: Urban Analytics and City Science 46(8): 1549–1564. DOI: 10/gghf42.

● Berghauser Pont M, Stavroulaki G, Bobkova E, et al. (2019b) The spatial distribution and frequency of street, plot and building types across five European cities. Environment and Planning B: Urban Analytics and City Science 46(7): 1226–1242. DOI: 10/gf8x8j.

● Biljecki F, Ledoux H and Stoter J (2016) An improved LOD specification for 3D building models. Computers, Environment and Urban Systems 59: 25–37. DOI: 10/f83fz4.

● Bobkova E, Berghauser Pont M and Marcus L (2019) Towards analytical typologies of plot systems: Quantitative profile of five European cities. Environment and Planning B: Urban Analytics and City Science: 239980831988090. DOI: 10/ggbgsm.

● Boeing G (2020) Off the grid… and back again? The recent evolution of American street network planning and design. Journal of the American Planning Association. Taylor & Francis: 1–15. DOI: 10/ghf423.

● Caniggia G and Maffei GL (2001) Architectural Composition and Building Typology: Interpreting Basic Building. Firenze: Alinea Editrice.

● Caruso G, Hilal M and Thomas I (2017) Measuring urban forms from inter-building distances: Combining MST graphs with a Local Index of Spatial Association. Landscape and Urban Planning 163: 80–89.

● Castro KB de, Roig HL, Neumann MRB, et al. (2019) New perspectives in land use mapping based on urban morphology: A case study of the Federal District, Brazil. Land Use Policy 87: 104032. DOI: 10.1016/j.landusepol.2019.104032.

● Conzen M (1960) Alnwick, Northumberland: A Study in Town-Plan Analysis. London: George Philip & Son. Available at: http://www.jstor.org/stable/pdf/621094.pdf.

● Dibble J, Prelorendjos A, Romice O, et al. (2019) On the origin of spaces: Morphometric foundations of urban form evolution. Environment and Planning B: Urban Analytics and City Science 46(4): 707–730. DOI: 10.1177/2399808317725075.

● Dogrusoz E and Aksoy S (2007) Modeling urban structures using graph-based spatial patterns. In: 1 January 2007, pp. 4826–4829. IEEE. DOI: 10.1109/IGARSS.2007.4423941.

● Dukai B (2020) 3D Registration of Buildings and Addresses (BAG) / 3D Basisregistratie Adressen en Gebouwen (BAG). 4TU.ResearchData. DOI: 10.4121/uuid:f1f9759d-024a-492a-b821-07014dd6131c.

● Duque JC, Anselin L and Rey SJ (2012) The max-p-regions problem. Journal of Regional Science 52(3). Wiley Online Library: 397–419. DOI: 10/cf9h6h.

● Feliciotti A (2018) Resilience and Urban Design: A Systems Approach to the Study of Resilience in Urban Form. University of Strathclyde, Glasgow.

● Fleischmann M, Feliciotti A, Romice O, et al. (2020) Morphological tessellation as a way of partitioning space: Improving consistency in urban morphology at the plot scale. Computers, Environment and Urban Systems 80: 101441. DOI: 10.1016/j.compenvurbsys.2019.101441.

● Gil J, Beirão JN, Montenegro N and Duarte JP (2012) On the discovery of urban typologies: data mining the many dimensions of urban form. Urban Morphology 16(1): 27–40.

● Gil J (2016) Street network analysis ‘edge effects’: Examining the sensitivity of centrality measures to boundary conditions. Environment and Planning B: Planning and Design. DOI: 10.1177/0265813516650678.

● Guyot M, Araldi A, Fusco G and Thomas I (2021) The urban form of Brussels from the street perspective: The role of vegetation in the definition of the urban fabric. Landscape and Urban Planning 205: 103947. DOI: 10/ghf96c.
49
50
51
52
53
54
55
56
57
58
59
60 https://mc04.manuscriptcentral.com/epb
● Hamaina R, Leduc T and Moreau G (2012) Towards Urban Fabrics Characterization Based on Buildings Footprints. In: Bridging the Geographic Information Sciences. Berlin, Heidelberg: Springer, pp. 327–346. DOI: 10.1007/978-3-642-29063-3_18.

● Hartmann A, Meinel G, Hecht R, et al. (2016) A Workflow for Automatic Quantification of Structure and Dynamic of the German Building Stock Using Official Spatial Data. ISPRS International Journal of Geo-Information 5(8): 142. DOI: 10/f872vh.

● Institut plánování a rozvoje hlavního města Prahy (2018) Územní Plán Hlavního Města Prahy Metropolitní Plán Návrh k Projednání Dle § 50 Stavebního Zákona. Praha: IPR Praha.

● Jochem WC, Leasure DR, Pannell O, et al. (2020) Classifying settlement types from multi-scale spatial patterns of building footprints. Environment and Planning B: Urban Analytics and City Science: 239980832092120. DOI: 10/ggtsbn.

● Kropf K (1993) The definition of built form in urban morphology. University of Birmingham.
● Kropf K (1996) Urban tissue and the character of towns. URBAN DESIGN International 1(3): 247–263. DOI: 10.1057/udi.1996.32.

● Kropf K (2014) Ambiguity in the definition of built form. Urban Morphology 18(1): 41–57.

● Kropf K (2017) The Handbook of Urban Morphology. Chichester: John Wiley & Sons. Available at: http://cds.cern.ch/record/2316422.

● Kropf K (2018) Plots, property and behaviour. Urban Morphology 22(1): 5–14.

● Lehner A and Blaschke T (2019) A Generic Classification Scheme for Urban Structure Types. Remote Sensing 11(2): 173. DOI: 10.3390/rs11020173.

● Levy A (1999) Urban morphology and the problem of the modern urban fabric: some questions for research. Urban Morphology 3: 79–85.

● Louf R and Barthelemy M (2014) A typology of street patterns. Journal of the Royal Society Interface 11. DOI: 10.1098/rsif.2014.0924.

● Mehaffy MW, Porta S, Rofè Y, et al. (2010) Urban nuclei and the geometry of streets: The ‘emergent neighborhoods’ model. URBAN DESIGN International 15(1): 22–46. DOI: 10.1057/udi.2009.26.
● Moudon AV (1997) Urban morphology as an emerging interdisciplinary field. Urban Morphology 1(1): 3–10.

● Muratori S (1959) Studi per una operante storia urbana di Venezia. Palladio. Rivista di storia dell’architettura 1959: 1–113.

● Neidhart H and Sester M (2004) Identifying building types and building clusters using 3-D laser scanning and GIS-data. Int Arch Photogramm Remote Sens Spatial Inf Sci 35: 715–720.

● Oliveira V (2016) Urban Morphology: An Introduction to the Study of the Physical Form of Cities. Cham: Springer International Publishing.

● Oliveira V and Yaygin MA (2020) The concept of the morphological region: developments and prospects. Urban Morphology 24(1): 18.

● Openshaw S (1984) The Modifiable Areal Unit Problem.

● Osmond P (2010) The urban structural unit: Towards a descriptive framework to support urban analysis and planning. Urban Morphology 14(1): 5–20.
● Porta S, Romice O, Maxwell JA, et al. (2014) Alterations in scale: Patterns of change in main street networks across time and space. Urban Studies 51(16): 3383–3400. DOI: 10.1177/0042098013519833.

● Reynolds DA (2009) Gaussian mixture models. Encyclopedia of biometrics 741. Berlin, Springer. DOI: 10/cqtzqm.

● Romice O, Feliciotti A and Porta S (2017) The road to masterplanning for change and the design of resilient places. Architectural Research in Finland 1(1).

● Schirmer PM and Axhausen KW (2015) A multiscale classification of urban morphology. Journal of Transport and Land Use 9(1): 101–130. DOI: 10.5198/jtlu.2015.667.

● Schwarz G (1978) Estimating the dimension of a model. The Annals of Statistics 6(2). Institute of Mathematical Statistics: 461–464.

● Serra M, Psarra S and O’Brien J (2018) Social and Physical Characterization of Urban Contexts: Techniques and Methods for Quantification, Classification and Purposive Sampling. Urban Planning 3(1): 58–74. DOI: 10.17645/up.v3i1.1269.

● Singleton AD and Longley PA (2009) Geodemographics, visualisation, and social networks in applied geography. Applied Geography 29(3): 289–298. DOI: 10/dg6t8r.
● Sneath PHA and Sokal RR (1973) Numerical Taxonomy. San Francisco: Freeman.

● Soman S, Beukes A, Nederhood C, Marchio N and Bettencourt L (2020) Worldwide detection of informal settlements via topological analysis of crowdsourced digital maps. ISPRS International Journal of Geo-Information 9(11): 685. DOI: 10/ghpwqm.

● Song Y and Knaap G-J (2007) Quantitative Classification of Neighbourhoods: The Neighbourhoods of New Single-family Homes in the Portland Metropolitan Area. Journal of Urban Design 12(1): 1–24. DOI: 10.1080/13574800601072640.

● Spaan B and Waag Society (2015) All buildings in Netherlands shaded by a year of construction. Available at: https://code.waag.org/buildings/.

● Steadman P (1979) The Evolution of Designs: Biological Analogy in Architecture and the Applied Arts.

● Steiniger S, Lange T, Burghardt D, et al. (2008) An Approach for the Classification of Urban Building Structures Based on Discriminant Analysis Techniques. Transactions in GIS 12(1): 31–59. DOI: 10.1111/j.1467-9671.2008.01085.x.
● Stewart ID and Oke TR (2012) Local Climate Zones for Urban Temperature Studies. Bulletin of the American Meteorological Society 93(12): 1879–1900. DOI: 10.1175/BAMS-D-11-00019.1.

● Taubenböck H, Debray H, Qiu C, et al. (2020) Seven city types representing morphologic configurations of cities across the globe. Cities 105: 102814. DOI: 10/gg2jv4.

● Thienel I (2013) Städtewachstum Im Industrialisierungsprozess Des 19. Jahrhunderts: Das Berliner Beispiel. Walter de Gruyter.

● Usui H and Asami Y (2013) Estimation of Mean Lot Depth and Its Accuracy. Journal of the City Planning Institute of Japan 48(3): 357–362.

● Vanderhaegen S and Canters F (2017) Mapping urban form and function at city block level using spatial metrics. Landscape and Urban Planning 167: 399–409. DOI: 10.1016/j.landurbplan.2017.05.023.

● Ward Jr JH (1963) Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association 58(301). Taylor & Francis Group: 236–244. DOI: 10/fz95kg.
● Whitehand J, Gu K, Conzen MP, et al. (2014) The typological process and the morphological period: a cross-cultural assessment. Environment and Planning B: Planning and Design 41(3). SAGE Publications Sage UK: London, England: 512–533. DOI: 10/f546ck.

● Wurm M, Schmitt A and Taubenböck H (2016) Building Types’ Classification Using Shape-Based Features and Linear Discriminant Functions. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 9(5): 1901–1912. DOI: 10.1109/JSTARS.2015.2465131.
Methodological Foundation of a Numerical Taxonomy of Urban Form

Abstract

Cities are complex products of human culture, characterised by a startling diversity of visible traits. Their form is constantly evolving, reflecting changing human needs and local contingencies, manifested in space by many urban patterns.

Urban Morphology laid the foundation for understanding many such patterns, largely relying on qualitative research methods to extract distinct spatial identities of urban areas. However, the manual, labour-intensive and subjective nature of such approaches represents an impediment to the development of a scalable, replicable and data-driven urban form characterisation. Recently, advances in Geographic Data Science and the availability of digital mapping products have opened the opportunity to overcome such limitations. And yet, our current capacity to systematically capture the heterogeneity of spatial patterns remains limited in terms of the spatial parameters included in the analysis, and is hardly scalable due to the highly labour-intensive nature of the task. In this paper, we present a method for numerical taxonomy of urban form derived from biological systematics, which allows the rigorous detection and classification of urban types. Initially, we produce a rich numerical characterisation of urban space from minimal data input, minimising limitations due to inconsistent data quality and availability. The inputs are the street network, building footprints, and morphological tessellation, a spatial unit derived from Voronoi tessellation and obtained from building footprints. We then derive homogeneous urban tissue types and, by determining overall morphological similarity between them, generate a hierarchical classification of urban form. After framing and presenting the method, we test it on two cities, Prague and Amsterdam, and discuss potential applications and further developments. The proposed classification method represents a step towards the development of an extensive, scalable numerical taxonomy of urban form and opens the way to more rigorous comparative morphological studies and explorations into the relationship between urban space and phenomena as diverse as environmental performance, health and place attractiveness.

Keywords: urban morphometrics, classification, numerical taxonomy, urban morphology
Introduction

Cities’ visual diversity is astounding. Indeed, when comparing their spatial form, marked differences can be clearly observed at all scales. And yet, despite these variations, their heterogeneous fabrics share geometric characteristics, which make it possible to compare them to one another through the analysis of their constituent elements and to recognise patchworks of distinct urban tissues within each city.

The endeavour of capturing these multifaceted spatial patterns has been the object of investigation across multiple disciplines. Notably, building on research in geography (Conzen, 1960) and architecture (Muratori, 1959), the discipline of urban morphology has devoted over 60 years to exploring recurrent patterns within urban forms in cities all over the world, aiming at their definition, classification and characterisation (Kropf 1993, 2014; Oliveira 2016).

Further research has focused on the classification of morphological elements into “types”. This includes the series of works by Steadman (Steadman, Bruhns and Holtier, 2000; Steadman, Evans and Batty, 2009) on the classification of buildings based on a handful of empirically measured geometrical parameters, as well as the work by Marshall (2005) on the classification of street pattern types.
And whilst these contributions are heterogeneous in terms of object of interest (i.e. building, street, urban tissue), method (i.e. qualitative vs quantitative) and aim of the classification (i.e. energy performance, historical origin, design paradigm), they mark important attempts at classifying, through geometrical analysis, the variations of the individual elements (buildings; Steadman et al. 2000, Steadman et al. 2009) or aggregations of individual elements (street patterns; Marshall, 2005) that make up the spatial form of cities. As such they mark steps towards a more rigorous study of relationships between different urban configurations.

Yet, our current capacity to systematically capture the heterogeneity of spatial patterns remains limited. Most existing research in urban morphology relies on highly supervised, expert-driven and labour-intensive qualitative methods, both in the data preparation process and in the design of the analysis. As a result, most existing works are hardly scalable, due to the considerable amount of manual work required to prepare the input data, and tend to focus on the analysis of relatively few spatial parameters.
Recently, however, advances in geographic data science, combined with the growing availability of geospatial data, have triggered a data-driven stream of urban morphology studies named “urban morphometrics” (e.g. Gil et al. 2012, Dibble et al. 2019, Araldi & Fusco 2019, Bobkova 2019). Within this line of research, the paper aims to address the need for a more systematic, scalable and efficient method for the detection and classification of morphological patterns. To this end, after presenting a brief literature review on urban form classification and specifying the requirements for a rigorous classification method, we

● present an original quantitative methodology for the systematic unsupervised classification of urban form patterns and ground it in the theory of phenetics and numerical taxonomy in biological systematics;

● apply the proposed methodology to two exploratory case studies, as proofs of concept aimed at providing an illustration of the method and some of its potential theoretical impacts and technical shortcomings.
More specifically, we first frame the proposed approach to urban form classification within numerical taxonomy, which seeks to describe and classify species and taxa based on morphological similarity (Sneath & Sokal, 1973). To build this methodological parallel between the (a-biotic) system of urban form and biology, we 1) re-frame the constituent elements of urban form as the building blocks of the method, 2) describe how to identify structurally homogeneous urban form types (or “taxa”) and 3) measure their hierarchical relationship based on phenetic similarity, delivering a systematic numerical taxonomy of urban form. Finally, we test the proposed method on two major European cities characterised by various types of urban fabric originating from different historical stages: Prague, CZ and Amsterdam, NL.

We conclude by discussing validation findings, highlighting the potential theoretical impact of the proposed method and discussing methodological limitations.
Existing models of urban form classification

The primary aim of classification is to reduce the complexity of the world around us. Many urban form classification methods exist at the building (Steadman et al., 2000; Steadman et al., 2009; Schirmer & Axhausen, 2015), street (Marshall, 2005), neighbourhood (Soman et al., 2020) and city (Louf & Barthelemy, 2014) scales, varying conceptually and analytically in terms of focus scale, e.g. global (Angel et al. 2012) vs local (Guyot et al. 2021); analytical approach, e.g. quantitative vs. qualitative; and aim of the classification. Structurally, the simplest forms involve flat classifications, where the relationship between types is unknown. These are either binary, like organized vs. unorganized neighbourhoods (Dogrusoz & Aksoy, 2007), or multi-class, such as Caruso et al.’s (2017) 4-class clustering based on inter-building distance, Song and Knaap’s (2007) 6-class neighbourhood typology based on factor analysis and K-means of 21 spatial descriptors, or the “multiscale typology” by Schirmer & Axhausen (2015) identifying four flat classes based on centrality and accessibility. More complex classifications involve hierarchical methods (taxonomies), which organise classes based on their mutual relationships, like Serra et al.’s (2018) hierarchical taxonomy of neighbourhoods built according to 12 morphological characters of street network, blocks and buildings, and the work by Dibble et al. (2019), who hierarchically classify portions of urban area enclosed by main streets. More granular approaches include the work by Araldi & Fusco (2019), who classify street segments using 21 morphometric characters derived from street networks, building footprints and a digital terrain model, and research by SMOG at Chalmers University (Berghauser Pont et al., 2019a; Berghauser Pont et al., 2019b; Bobkova et al., 2019) that classifies morphological elements of plots, streets and buildings through a handful of morphometric characters.
Other approaches employ morphometric assessment to predict pre-defined typologies of buildings, streets or larger areas (Marshall, 2005; Hartmann et al., 2016; Neidhart and Sester, 2004; Steiniger et al., 2008; Wurm et al., 2016). These validate the use of morphometrics in the classification of urban form, even though the typology itself is defined differently. Related to this are Urban Structural Type classifications reviewed by Lehner & Blaschke (2019), and the detection of Local Climate Zones (Stewart & Oke, 2012; Taubenböck et al., 2020).
Whilst the list does not aim to be exhaustive of all contributions, it nevertheless provides an overview of the state of the art in urban form classification research. Specifically, it highlights how each of these methods shows shortcomings in scalability (the ability to analyse large areas while retaining detail), transferability (the ability to apply to different contexts), robustness (the ability to remain unaffected by small imprecisions in the input data or measurement), extensiveness (i.e. the bias induced by a small number of variables), or interpretative flexibility (i.e. missing relations between classes). This leaves a methodological gap in the morphometric classification of the built environment, hindering the development of a universal taxonomy of urban form.
Method: Building a taxonomy of urban form

The problem of classifying urban patterns based on geometrical resemblance is not dissimilar, conceptually speaking, to the work of early biologists seeking to classify biotic species and taxa based on morphological similarity. This was indeed the primary aim of numerical taxonomy (and phenetics more generally), established in biology in the second half of the 20th century (Sneath & Sokal, 1973).
Whilst DNA sequencing and phylogenetics have now largely replaced morphometrics in modern biological taxonomy, we can take advantage of the latter for the study of urban form. Just as the study of organismal phenotypes and the statistical description of biological forms were instrumental to the separation of individuals (and species) into recognisable, homogeneous groups (Raup, 1966), extending numerical taxonomy to the study of urban form offers an operationally viable and reliable conceptual and methodological framework for a systematic classification of homogeneous urban form types.
And yet, whilst this possibility has always fascinated urban scholars in an analogic sense (Steadman, 1979), a rigorous methodological parallel between numerical taxonomy and urban form classification is a matter of pioneering research.
One of the first studies to explicitly apply numerical taxonomy to urban form was Dibble et al. (2019) who, notwithstanding operational limitations, measured a large number of geometrical parameters of fundamental morphological elements (buildings, streets, plots etc.) to test the applicability of the approach in urban morphology. However, their method requires predefined boundaries of urban types, is extremely data demanding and cannot be carried out without manual measurement. Despite that, it paved the conceptual way for further research, including that presented in this paper.
Morphometrics and numerical taxonomy in urban form

The first step towards a numerical taxonomy of urban form is the definition of the building blocks of the method, namely: 1) structural elements, the urban form counterpart of the individual and its body in biology (Sneath & Sokal, 1973); 2) the operational taxonomic unit (OTU), the unit forming the lowest-ranking taxa, which in biology is individuals or populations depending on taxonomic level; and 3) morphometric characters, that is, the measurable traits of each structural element (the “wing’s length” or “beak’s dimension” in biology).
Structural elements

Urban morphologists generally agree on three fundamental elements: buildings, plots and streets (Kropf, 2017; Moudon, 1997). To make our method scalable it is imperative that, when these are translated into operational and measurable morphometric elements, i.e. vector features in GIS data, they retain their meaning with minimal data input, hence maximising data accessibility and consistency.
From a morphometric standpoint, this is relatively straightforward for streets and buildings due to their conceptual simplicity: buildings can be represented as building footprint polygons (with the attribute of building height) at Level of Detail 1 (Biljecki et al., 2016), whilst streets can be represented as network centrelines, cleared of transport planning-related structures. The same is more complicated for the plot, particularly at large scale, due to its highly polysemic nature (Kropf, 2018) and ambiguous structuring role in contemporary urban fabrics (Levy, 1999).
To avoid the plot’s inconsistencies, we use morphological tessellation, a polygon-based derivative of Voronoi tessellation obtained from building footprints, proposed by Fleischmann et al. (2020) after Hamaina et al. (2012) and Usui & Asami (2013), and the morphological cell, its smallest spatial unit, which delineates the portion of land around each building that is closer to it than to any other building but no further than 100 m. As such, the morphological tessellation captures the topological relations between individual cells and the influence that each building exerts on the surrounding space (Hamaina et al., 2012), regardless of historical origin, thanks to its contiguity throughout the analysis space (figures 1a and 2). Furthermore, being generated solely from building footprints, it does not increase data reliance. However, for the same reason, it cannot represent unbuilt areas and empty plots, and it does not serve as a substitute for the plot in general terms, as it does not have the same structural role. Morphological tessellation is a purely analytical element.
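The definition of the morphological cell can be illustrated with a minimal sketch. It is a simplification made under stated assumptions: instead of constructing Voronoi polygons from discretised footprints, it allocates sample locations to the nearest building, provided that building lies within the 100 m limit. The function name `allocate_to_cells`, the footprints and the sample locations are all hypothetical and serve only as an illustration.

```python
from math import hypot


def allocate_to_cells(buildings, samples, limit=100.0):
    """Assign each sample location to the nearest building within `limit`.

    buildings: dict mapping a building id to a list of (x, y) boundary points
    samples:   list of (x, y) locations covering the study area
    Returns a dict mapping each sample to a building id, or to None when no
    building lies within `limit` (the location belongs to no cell).
    """
    allocation = {}
    for sx, sy in samples:
        best_id, best_dist = None, limit
        for bid, points in buildings.items():
            # Distance from the sample to the closest boundary point
            dist = min(hypot(sx - px, sy - py) for px, py in points)
            if dist <= best_dist:
                best_id, best_dist = bid, dist
        allocation[(sx, sy)] = best_id
    return allocation


# Two hypothetical 10 m x 10 m footprints, given by their corners (metres)
buildings = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(60, 0), (70, 0), (70, 10), (60, 10)],
}
samples = [(20, 5), (50, 5), (200, 5)]
cells = allocate_to_cells(buildings, samples)
```

In this toy setting the first two locations fall into the cells of buildings A and B respectively, while the third lies more than 100 m from any footprint and is left unallocated, mirroring the distance limit in the definition above.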

Figure 1: a) Fundamental morphometric elements: building footprint, tessellation cell (derived from building footprints) and street (segment and node from centrelines). b) Diagram illustrating the workflow of the proposed method. From input data (buildings, streets), generated elements (tessellation, blocks) are derived. All elements are used to measure primary morphometric characters. Each of them is then represented as 4 contextual characters that are used as an input of the cluster analysis. Finally, resulting classes are organised in a taxonomy.

Taxonomic unit

In biology the operational taxonomic unit (OTU) is intuitive (the individual organism). The same is, however, not true for urban form. In urban morphology, the OTU can be associated with the concept of “morphological regions” (Oliveira & Yaygin, 2020), “urban tissues” (Caniggia & Maffei, 2001; Kropf, 1996) or “urban structural types” (Lehner & Blaschke, 2019; Osmond, 2010), that is, “a distinct area of a settlement in all three dimensions, characterized by a unique combination of streets, blocks/plot series, plots, buildings, structures and materials and usually the result of a distinct process of formation at a particular time or period” (Kropf 2017, p.89).
From a morphometric standpoint, adopting the concept of “urban tissue” as the OTU has two main advantages. First, being grounded in the notion of homogeneity, its definition can be framed as a typical problem of cluster analysis: homogeneous urban tissues are derived from the analysis of recurrent similarities and differences in the morphometric characters of their constituent urban elements. Second, as the size and geometry of each urban tissue are determined by internal homogeneity rather than pre-defined boundaries, the Modifiable Areal Unit Problem is minimised (Openshaw, 1984).
With the elements defined, the proposed method can be split into five consecutive steps, illustrated in figure 1b: 1) generation of morphological elements, 2) measurement of primary morphometric characters, 3) measurement of contextual characters, 4) cluster analysis, 5) taxonomy. The remaining steps are outlined in the following sections.
Morphometric characters

The definition of measurable morphometric characters is key for cluster analysis, as these characters capture the cross-scale structural complexity of different urban tissues. To this end, building on an earlier literature review <masked for review>, we use six categories of morphometric characters: dimension, shape, spatial distribution, intensity, connectivity and diversity.
15 These characters allow to numerically describe morphometric elements (street segments,
16
17 building footprints and tessellation cells) within any urban fabric, by capturing the
18
19
relationships between them and their immediate surroundings. They are measured at three
Fo
20
21
22 topological scales: small (element itself), medium (element and its immediate neighbours) and
rR
23
24 large – the element and its neighbours within k-th order of contiguity. Spatial contiguity can
25
26 either be kept constrained by enclosing streets (the equivalent of an urban block) or left
ev

27
28
29 unconstrained (see the Supplementary Material 1 for further details).
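
The notion of neighbours within k-th order of contiguity can be illustrated with a breadth-first search over a cell adjacency graph. The sketch below is not the toolkit's implementation: the function name `cells_within_order`, the dictionary-based adjacency and the six-cell chain are assumptions made purely for illustration.

```python
from collections import deque


def cells_within_order(adjacency, start, k=3):
    """Return ids of all cells reachable from `start` in at most k topological steps.

    `adjacency` is a hypothetical mapping: cell id -> iterable of contiguous cell ids.
    """
    steps = {start: 0}          # topological distance of each visited cell
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if steps[cell] == k:    # do not expand beyond the k-th order
            continue
        for neighbour in adjacency[cell]:
            if neighbour not in steps:
                steps[neighbour] = steps[cell] + 1
                queue.append(neighbour)
    return set(steps)


# Hypothetical chain of six contiguous cells: 0-1-2-3-4-5
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
context = cells_within_order(adjacency, start=0, k=3)
```

A constrained variant (contiguity limited by enclosing streets) would, under the same assumptions, simply drop adjacency links that cross a street before running the search.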

The morphometric characters considered are of two types: primary and contextual. Primary characters measure geometric and configurational properties of morphometric elements (buildings, streets and cells) and their relationships (at all scales). The set is extensive and abundantly represents all six morphometric categories. Starting from the broad set of unique variables identified by <masked for review>, we shortlist 74 characters (table S1 in the Supplementary Material), following rules by Sneath & Sokal (1973) to minimise potential collinearity and limit redundancy of information, while retaining the universality of the method.

Primary characters describe morphometric elements and their immediate neighbourhood rather than their spatial patterns. As such, when employed for cluster analysis they may result in spatially discontinuous classes. Urban tissues are defined by their internal homogeneity, but this can be, and often is, a homogeneity of heterogeneity. In other words, a tissue may be defined by a combination of small and large buildings or of various shapes, and we need to capture these characteristics. We therefore derive a set of spatially lagged contextual characters describing the tendency of each primary character in its context. The term “context” is here defined as the topological aggregation of morphological cells within three topological steps from each given cell $C_i$, an empirically determined value large enough to capture a cohesive pattern over a relatively wide spatial extent but small enough to generate sharp boundaries between different patterns (Figure 2). The notion of “tendency” is in turn quantified through four values:

1. Interquartile mean (IQM), the most representative value cleaned of the effect of potential outliers.

2. Interquartile range (IQR), a local measure of statistical dispersion that describes the range of values cleaned of outliers:

$$IQR_{ch} = Q3_{ch} - Q1_{ch},$$

where $Q3_{ch}$ and $Q1_{ch}$ are the third and first quartiles of the primary character.

3. Interdecile Theil index (IDT), which describes the local (in)equality of the distribution of values:

$$IDT_{ch} = \sum_{i=1}^{n} \left( \frac{ch_i}{\sum_{i=1}^{n} ch_i} \ln \left[ N \frac{ch_i}{\sum_{i=1}^{n} ch_i} \right] \right),$$

where $ch$ is the primary character.

4. Simpson’s diversity index (SDI), which captures the local presence of classes of values compared to the global structure of the distribution:

$$SDI_{ch} = \frac{\sum_{i=1}^{R} n_i (n_i - 1)}{N (N - 1)},$$

where $R$ is richness, expressed as the number of bins, $n_i$ is the number of features within the i-th bin and $N$ is the total number of features.

Of these, the first captures the local central tendency and the latter three the distribution of values within the third order of contiguity from each cell.
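
The four contextual values can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the toolkit's code: the function names are hypothetical, quartiles and deciles use the "inclusive" quantile method, the interquartile mean is simplified to the plain mean of values lying between Q1 and Q3, the Theil index assumes strictly positive values and takes $N$ as the number of values retained between the first and ninth deciles, and the Simpson index takes its bin edges as an argument (in the method they would reflect the global distribution).

```python
import bisect
import math
import statistics


def _between(values, low, high):
    """Keep only values inside the closed interval [low, high]."""
    return [v for v in values if low <= v <= high]


def interquartile_mean(values):
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return statistics.fmean(_between(values, q1, q3))


def interquartile_range(values):
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def interdecile_theil(values):
    """Theil index of the values between the first and ninth deciles (values > 0)."""
    deciles = statistics.quantiles(values, n=10, method="inclusive")
    kept = _between(values, deciles[0], deciles[-1])
    total, n = sum(kept), len(kept)
    return sum((v / total) * math.log(n * v / total) for v in kept)


def simpson_diversity(values, bin_edges):
    """Simpson index over half-open bins [edge_i, edge_i+1); needs >= 2 binned values."""
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        idx = bisect.bisect_right(bin_edges, v) - 1
        if 0 <= idx < len(counts):   # values outside the edges are ignored
            counts[idx] += 1
    total = sum(counts)
    return sum(c * (c - 1) for c in counts) / (total * (total - 1))


# Hypothetical values of one primary character within the context of one cell
example = [1, 2, 3, 4, 5, 6, 7, 8, 9]
iqm = interquartile_mean(example)
iqr = interquartile_range(example)
idt = interdecile_theil(example)
sdi = simpson_diversity([1, 1, 2, 2, 9, 9], bin_edges=[0, 5, 10])
```

Applied to the context of every cell and every primary character, functions of this kind would yield the four contextual characters per primary character described above.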

Each primary character is used as an input for each contextual option. The full set of morphometric characters hence includes 74 primary plus 296 contextual characters (74 × 4), totalling 370 characters. These are computed using the bespoke open-source Python toolkit <masked for review>, ensuring the full replicability and reproducibility of the method.

Figure 2: Morphological tessellation’s adaptive topological aggregation; “context” is defined as all cells within third order of contiguity in Prague: a) compact perimeter blocks, b) single family housing.

Detection of morphological taxa

Only the values of contextual characters are input to the cluster analysis that identifies urban form types. Identifying OTUs as clusters of fundamental entities closely mirrors a mixture problem in biology, which identifies populations within samples and classifies at population level (Sneath & Sokal, 1973). Since contextual characters are spatially lagged, they are spatially autocorrelated by design, which avoids the need for computationally expensive spatially constrained models (Duque et al., 2012). We mitigate potential over-smoothing of the boundaries by basing contextual characters on truncated values (with the exception of SDI), which eliminates the effect of outliers and defines boundaries more precisely.

The most suitable clustering algorithm is the Gaussian Mixture Model (GMM), a probabilistic derivative of k-means (Reynolds, 2009) tested in a similar context by Jochem et al. (2020). Unlike k-means itself, GMM does not rely only on squared Euclidean distances and is more sensitive to clusters of different sizes. GMM assumes that a Gaussian distribution represents each dimension of each cluster; the cluster itself is hence defined by a mixture of Gaussians. The output of GMM is a set of cluster labels assigned to individual tessellation cells.

The ideal outcome of cluster detection would equate clusters to distinct taxa of urban tissues. Because the definition of urban tissue (Kropf, 2017) does not specify the threshold beyond which two similar parts of a city belong to the same tissue, it is difficult to equate clusters to taxa. We resolve this by estimating the number of clusters, which the GMM clustering method requires as an input, from the goodness of fit of the model. The fit is measured using the Bayesian Information Criterion (BIC) (Schwarz, 1978), and the number of clusters is selected at the “elbow” of the BIC curve.
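
To make the selection step concrete, the sketch below computes BIC in its standard form, $BIC = p \ln(n) - 2 \ln(\hat{L})$, and locates an elbow with a simple chord heuristic (the point of the curve farthest from the straight line joining its two ends). The chord heuristic is an assumption introduced here for illustration; the method itself relies on an interpretation of the curve. The function names and the BIC values are hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_samples):
    """Bayesian Information Criterion; lower values indicate a better trade-off."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def elbow(ks, scores):
    """Return the k whose (k, score) point deviates most from the end-to-end chord."""
    x0, y0, x1, y1 = ks[0], scores[0], ks[-1], scores[-1]

    def deviation(x, y):
        # Proportional to the distance from the chord; the constant
        # normalising factor is omitted because only the argmax matters.
        return abs((y1 - y0) * (x - x0) - (x1 - x0) * (y - y0))

    best = max(range(len(ks)), key=lambda i: deviation(ks[i], scores[i]))
    return ks[best]


# Hypothetical BIC scores for models fitted with 2 to 14 components
ks = [2, 4, 6, 8, 10, 12, 14]
scores = [1000.0, 800.0, 600.0, 400.0, 200.0, 190.0, 185.0]
chosen_k = elbow(ks, scores)
```

In practice the log-likelihood and the number of free parameters would come from each fitted mixture model; an automatic heuristic of this kind can support, but need not replace, a visual reading of the curve.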

The foundation of taxonomy

To classify urban form types, we use Ward's minimum variance hierarchical clustering, previously applied in urban morphology (Dibble et al., 2019; Serra et al., 2018). Here, each urban form type is represented by its centroid (the mean of each character across cells with the same label); Ward's algorithm links observations so as to minimise the increase in total within-cluster variance (Ward Jr, 1963). The classification is represented through a dendrogram capturing the cophenetic relationship between observations (i.e., morphometric similarity), forming the foundation of our taxonomy.
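
The two operations described above, computing a centroid per cluster label and agglomerating the centroids with Ward's criterion, can be sketched as follows. This is a naive standard-library illustration, not the implementation used in the study: the function names and the toy data are hypothetical, and the merge cost reported is the raw increase in within-cluster sum of squares, which statistical libraries may rescale.

```python
def centroids_by_label(rows, labels):
    """Mean of each character across all rows (cells) sharing the same label."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return {
        label: [sum(column) / len(members) for column in zip(*members)]
        for label, members in groups.items()
    }


def ward_linkage(points):
    """Naive Ward agglomeration; returns merges as (id_a, id_b, cost, size).

    Leaves are numbered 0..n-1 and the i-th merge creates node n + i.
    """
    clusters = {i: (list(p), 1) for i, p in enumerate(points)}
    next_id = len(points)
    merges = []
    while len(clusters) > 1:
        ids = sorted(clusters)
        best = None
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                (ca, na), (cb, nb) = clusters[a], clusters[b]
                squared = sum((x - y) ** 2 for x, y in zip(ca, cb))
                cost = na * nb / (na + nb) * squared  # increase in within-cluster SS
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        cost, a, b = best
        (ca, na), (cb, nb) = clusters.pop(a), clusters.pop(b)
        merged = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        clusters[next_id] = (merged, na + nb)
        merges.append((a, b, cost, na + nb))
        next_id += 1
    return merges


# Hypothetical contextual characters of three cells and their cluster labels
centroids = centroids_by_label([[1, 2], [3, 4], [10, 10]], labels=[0, 0, 1])

# Hypothetical centroids of four urban form types, forming two obvious pairs
merges = ward_linkage([[0, 0], [0, 1], [10, 10], [10, 11]])
```

The sequence of merges and their costs is the information a dendrogram draws: types joined early and at low cost are the most similar.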

Validation theory

For validation, we study our taxonomy in relation to other urban dynamics with which some form of relation is expected. In urban morphology, theory and qualitative evidence suggest that different urban patterns emerge in areas of different historical origins, or else belonging to different “morphological periods” (Whitehand et al., 2014). This notion has also been observed quantitatively in the urban fabric (Boeing, 2020; Dibble et al., 2019; Porta et al., 2014, <masked>) as well as in land use patterns (Castro et al., 2019) of cities, and is inherently embedded in our OTU.

We validate our classification against three datasets: 1) historical origins; 2) predominant land-use patterns; and 3) qualitative classification of urban form adopted in official planning documents. We use the same method for all three: cross-tabulation, followed by statistical analysis using the chi-squared statistic and the related Cramér’s V (Agresti, 2018). The model is considered valid if a significant relationship is found between the proposed classification and the three additional datasets, and if similar performance is shown across different case studies.
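
For reference, the chi-squared statistic of a contingency table and the derived Cramér's V, $V = \sqrt{\chi^2 / (n \, (\min(r, c) - 1))}$, can be computed as in the sketch below. The function names and the two toy tables are hypothetical; the code assumes that no row or column of the table sums to zero.

```python
import math


def chi_squared(table):
    """Pearson chi-squared statistic of a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic, n


def cramers_v(table):
    """Cramér's V: 0 means no association, 1 means perfect association."""
    statistic, n = chi_squared(table)
    smaller_dim = min(len(table), len(table[0]))
    return math.sqrt(statistic / (n * (smaller_dim - 1)))


# Hypothetical cross-tabulations of cluster label (rows) against a validation class (columns)
perfect = cramers_v([[10, 0], [0, 10]])
independent = cramers_v([[5, 5], [5, 5]])
```

In the validation described above, rows would correspond to detected clusters and columns to the categories of the validation layer.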

Case study

We test the proposed method in two historical European cities: Prague, CZ and Amsterdam, NL. Prague’s analysis area is defined by its administrative boundary, which extends beyond its continuous built-up area to minimise the “edge effect” of the street network (Gil, 2016). Amsterdam’s analysis area is defined by its contiguous urban fabric, extending beyond the city’s administrative boundary. The morphological data (buildings, streets) for the Prague case study were obtained from the city’s open data portal (https://www.geoportalpraha.cz/en), while the validation layers were provided by the Prague Institute of Planning and Development. The morphological data for Amsterdam were obtained from the 3D BAG repository (Dukai, 2020) and Basisregistratie Grootschalige Topografie (http://data.nlextract.nl/).

Results: Taxonomy of Prague and Amsterdam

We measure all 74 primary characters in both Prague and Amsterdam, each associated with a morphological cell, and subsequently generate 296 contextual characters as input to cluster analysis.

Cluster analysis in Prague

Based on BIC results (figure S5 in the Supplementary Material), GMM clustering identifies 10 clusters (figure 3a). On visual inspection, clusters appear well defined and able to reflect homogeneous forms, their contiguity resulting from the patterned nature of contextual characters.

Figure 3: Spatial distribution of detected clusters in central Prague (a) and central Amsterdam (b) accompanied by dendrograms representing the results of Ward’s hierarchical clustering of urban form types in Prague (c) and Amsterdam (d). The y-axis shows the cophenetic distance between individual clusters, i.e., their morphometric dissimilarity. The full extent of case studies is shown in figures S7 and S8 in the Supplementary Material.

Starting from the historical core of Prague (top left), we first identify the medieval urban form (7), then the compact perimeter blocks of the Vinohrady neighbourhood (6), and the fringe areas (3). Towards the south and east, we note low-rise tissues (8, 1) and modernist developments (4).

Drawing purely on visual observation and personal knowledge of the city of Prague, the identified clusters appear to capture meaningful urban form types.

Cluster analysis in Amsterdam

In Amsterdam, BIC indicates an optimal number of 10 clusters, as in Prague.

As in Prague, the geography of clusters shows seemingly meaningful results (figure 3b). For example, cluster 7 captures the city’s historical core up to the Singelgracht canal. Cluster 1 reflects well-known shifts in planning paradigms with the rise of the New Amsterdam School (Panerai et al., 2004), forming the early 20th century southern expansion. Once again, under preliminary observation, the identified clusters capture meaningful spatial patterns.

Numerical taxonomy

The centroid values of each cluster, obtained as the mean value of each contextual character, are used as taxonomic characters in Ward’s hierarchical clustering. The resulting relationship between centroids represents the relationship between clusters (figure 3c). The dendrogram’s horizontal axis represents the detected clusters, while the vertical axis represents their cophenetic distance (i.e., morphological dissimilarity): the lower the link connecting two clusters, the higher their similarity.

Prague’s dendrogram contains 10 clusters and illustrates the uniqueness of the spatial pattern of the medieval city (7), which forms the first bifurcation and an independent branch. Similarly, the cluster covering industrial areas (0) is dissimilar to the other clusters. Further along the dendrogram, we can see branches with regular perimeter blocks (6) and their fringe areas (3), unorganised development of the modern era (4, 2), and a branch featuring residential areas of low density (9, 1, 5, 8).

The dendrogram of Amsterdam's urban form (figure 3d) shows similar characteristics, with bifurcations distinguishing nested levels of spatial variation.

In the classification maps shown in figure 3, types are colour-coded to highlight distinctions at the level of individual clusters. However, we can instead colour-code according to the similarity between clusters. The dendrogram shows several major bifurcations at different levels of cophenetic distance, indicating distinct higher-order groups of clusters. By colouring each cluster in the map according to the branch it belongs to in the dendrogram, and using different hues to distinguish between lower-level clusters in each branch, we distinguish hierarchies based on cophenetic distance.

We can further combine the clusters of the two cities in one shared dendrogram (figure 4c). Urban form types from both pools appear regularly distributed in the lowest orders of the tree, showing a similar spatial structure emerging in both cases. Remarkably, the major bifurcation in the combined taxonomy sets apart industrial urban forms.

A lower-order bifurcation within the main branch distinguishes between dense/compact urban form and the rest. Further lower-level subdivisions are also visible. Compared to the individual trees, the combined tree shows some differences in branching: a few clusters are reshuffled and the branches themselves are slightly reorganised. This is likely to happen as more and more cities are analysed, until the unified taxonomy reaches a “plateau” when enough cases are included, ultimately producing a ‘general taxonomy of urban form’.

Figure 4: Spatial distribution of different branches of the combined dendrogram in central Prague (a) and central Amsterdam (b) accompanied by the dendrogram representing the results of Ward’s hierarchical clustering of urban form types from a combined pool of Prague and Amsterdam (c). The y-axis shows cophenetic distance between individual clusters, i.e. their morphometric dissimilarity. Branches are interpretatively coloured; the colours are then used on maps illustrating the spatial distribution of these branches. The full extent of case studies is shown in figures S9 and S10 in the Supplementary Material.

The geography of the combined taxonomy of Prague and Amsterdam (figure 4a, 4b) allows cross-comparing urban form patterns by similarity (represented by similar colours). The same approach can be extended across a multitude of cities and regions.

Validation

We validate the output of numerical taxonomy against three datasets: 1) historical origins; 2) land-use patterns; and 3) qualitative classifications. All these are assessed by the contingency table-based chi-squared statistic and Cramér's V.

In Prague, data on historical origin classify urban areas into 7 periods: 1840, 1880, 1920, 1950, 1970, 1990 and 2012. Land use is recorded in 123 categories at individual building/plot level, of which only 15 contain more than 1,000 buildings. We redefined land use as the prevailing land use within 3 topological steps of morphological tessellation: only 5 categories (Multi-family housing, Single-family housing, Villas, Industry small, Industry large) contain more than 1% of the dataset. We use these five and denote the rest as Other.

Qualitative classification is drawn from a municipal typology of neighbourhoods developed by the city for planning purposes. Each neighbourhood has specified boundaries based on its morphology and other aspects, from historical origin to social perception, and is qualitatively classified according to 10 types. We exclude 3 types: hybrid and heterogeneous, which are non-morphological, and linear, which captures railway structures only.

Unlike in Prague, the Amsterdam dataset of historical origin (Dukai, 2020) indicates each building’s year of construction, starting with 1800, rather than the first settlement of the area/plot. To ensure data compatibility with the method and avoid issues with pre-1800 periods, origin dates are binned into 11 groups following Spaan and Waag Society (2015).

The resulting chi-squared and Cramér's V values are reported in table S7. Contingency tables are available as tables S3 to S6. All tests indicate moderate to high association between the identified clusters and the 3 sets of validation data, supporting the model’s validity.

Historical origin shows moderate association in both Prague (V=0.331) and Amsterdam (V=0.311). Because the period of first development is not the only driver of form, and some tissues (e.g. single-family housing) populate multiple historical periods, a moderate association is expected. Land use (V=0.468) and municipal qualitative classification (V=0.674), tested only in Prague, indicate moderate and high association with clusters, respectively. Again, since land use is only a partial driver of urban form, a moderate association supports the proposed method’s potential to capture urban reality. The relationship between morphometric types and the qualitative ones sourced from the local authority is the highest among the validation data. This seems encouraging, since both classifications aim to capture a similar conceptualisation of the built environment.

Discussion

The proposed method hierarchically classifies urban form types according to the similarity of their morphological traits. It is numerical, unsupervised, rich in information and scalable in spatial extent. It identifies clusters of urban form as distinct urban form types and, within each, contiguous urban tissues, reflecting that in a typical city we observe multiple tissues belonging to the same type. The method is parsimonious in terms of input data, requiring only building footprints (and height) and street networks to generate three morphometric elements (building units, street network, morphological tessellation) and to compute the 370 morphometric characters. Such a wealth of fine-grained information allows an extensive characterisation of each building in the study area and of its adjacency, and the derivation of distinct urban form types hierarchically organised according to similarity.

The method allows urban form analysis both in detail and at large scale, hence overcoming a methodological gap; it is fully data-driven and does not rely on (but confirms) experts’ judgement other than for the interpretation of the BIC score. It is structurally hierarchical, which ensures depth along the similarity structure of urban form types and flexibility of use, according to the desired resolution of classification. Furthermore, it is extensive, encompassing a broad range of morphometric descriptors of the major urban form components and their context; and it is granular, since morphometric characters refer to each individual building.

Finally, it is scalable and reproducible, in that it is designed to suit large-scale coverage, such as cities and combinations of cities, and its source code is available as open source.

Information generated with the proposed method supports applications at three different levels. First, the set of morphometric characters can be input to studies of the relationship between urban form and socio-economic aspects of urban life, e.g. via regression analysis. This includes investigations into the link between urban form and the energetic/bioclimatic performance of cities, population health, gentrification and place attractiveness. Second, flat clustering with morphometric profiles can provide aggregated information on patterns without dealing with individual characters. This makes it possible to capture the overall morphological “identity” of an urban tissue rather than focusing on one element at a time. Third, the taxonomy brings hierarchy into classification and, as such, can adapt its resolution to fit the question asked. In this sense, while the clusters themselves may be well suited for fine-grained spatial analyses, horizontally cutting the dendrogram at a desired height makes it possible to group clusters into fewer, more generalised spatial aggregations, which might be better suited for analyses at coarser resolution.
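
Cutting a dendrogram at a given height amounts to applying only those merges whose height does not exceed the threshold and reading off the resulting groups. The sketch below shows this with a small union-find structure; the function name, the merge list and its heights are hypothetical, leaves are assumed to be numbered 0 to n-1, the i-th merge is assumed to create node n + i, and heights are assumed to be non-decreasing.

```python
def cut_tree(n_leaves, merges, height):
    """Return a group index for each leaf after cutting the tree at `height`.

    `merges` is a list of (id_a, id_b, merge_height) in agglomeration order.
    """
    parent = list(range(n_leaves + len(merges)))

    def find(node):
        # Follow parent links to the current root, compressing the path
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for step, (a, b, merge_height) in enumerate(merges):
        if merge_height <= height:       # merges above the cut are ignored
            new_node = n_leaves + step
            parent[find(a)] = new_node
            parent[find(b)] = new_node

    groups = {}
    return [groups.setdefault(find(leaf), len(groups)) for leaf in range(n_leaves)]


# Hypothetical dendrogram of four urban form types
merges = [(0, 1, 0.5), (2, 3, 0.5), (4, 5, 200.0)]
fine = cut_tree(4, merges, height=0.1)      # every type on its own
coarse = cut_tree(4, merges, height=1.0)    # two higher-order groups
single = cut_tree(4, merges, height=500.0)  # everything merged
```

Lowering the cut recovers the individual clusters, while raising it yields progressively coarser groupings, which is the flexibility of resolution described above.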

Whilst parsimonious in terms of input data, our method still relies on their availability and consistency. The building footprints layer is often of sub-optimal quality: adjacent buildings may be represented as unified polygons, misleading the method in dense areas. Building-level information on height may not be available, reducing the depth of information with potentially negative effects on the quality of the resulting clusters. Consistency of data across geographies may also be an issue, particularly for large spatial extents, which may require data generated independently by multiple sources.

Conclusions

The paper presents an original data-driven approach for the systematic unsupervised classification and characterisation of urban form patterns. The approach is grounded on numerical taxonomy in biological systematics and clusters urban tissues based on phenetic similarity, delivering a systematic numerical taxonomy of urban form. More specifically, it measures a selection of 74 primary characters from input data (buildings, streets) and derived elements (tessellation and blocks), each of which is represented through 4 contextual characters (Interquartile mean, Interquartile range, Interdecile Theil index, Simpson’s diversity index). These are then used as an input to the cluster analysis, resulting in a hierarchical taxonomy. Finally, the proposed approach is validated through two exploratory case studies illustrating how the resulting clustering shows a significant relationship with validation data reflecting other urban spatial dynamics.

Urban morphometrics and the proposed classification method represent a step towards the development of a taxonomy of urban form and open the way to a scalable urban morphology. By overcoming existing limitations in the systematic detection and characterisation of morphological patterns, the proposed approach enables the large-scale classification and characterisation of urban form patterns, potentially resulting, if applied to a substantial pool of cities, in a universal taxonomy of urban form.

At the same time, the proposed approach also provides valuable tools for more rigorous comparative studies, which are fundamental to highlight similarities and differences in the urban forms of different settlements in different contexts, and to explore the relationship between urban space and phenomena as diverse as environmental performance, health and place attractiveness.


References

● Agresti A (2018) An Introduction to Categorical Data Analysis. John Wiley & Sons.

● Angel S, Blei AM, Civco DL and Parent J (2012) Atlas of urban expansion. Cambridge, MA: Lincoln Institute of Land Policy.

● Araldi A and Fusco G (2019) From the street to the metropolitan region: Pedestrian perspective in urban fabric analysis. Environment and Planning B: Urban Analytics and City Science 46(7): 1243–1263. DOI: 10.1177/2399808319832612.

● Berghauser Pont M, Stavroulaki G and Marcus L (2019a) Development of urban types based on network centrality, built density and their impact on pedestrian movement. Environment and Planning B: Urban Analytics and City Science 46(8): 1549–1564. DOI: 10/gghf42.

● Berghauser Pont M, Stavroulaki G, Bobkova E, et al. (2019b) The spatial distribution and frequency of street, plot and building types across five European cities. Environment and Planning B: Urban Analytics and City Science 46(7): 1226–1242. DOI: 10/gf8x8j.

● Biljecki F, Ledoux H and Stoter J (2016) An improved LOD specification for 3D building models. Computers, Environment and Urban Systems 59: 25–37. DOI: 10/f83fz4.

● Bobkova E, Berghauser Pont M and Marcus L (2019) Towards analytical typologies of plot systems: Quantitative profile of five European cities. Environment and Planning B: Urban Analytics and City Science: 239980831988090. DOI: 10/ggbgsm.

● Boeing G (2020) Off the grid… and back again? The recent evolution of American street network planning and design. Journal of the American Planning Association. Taylor & Francis: 1–15. DOI: 10/ghf423.

● Caniggia G and Maffei GL (2001) Architectural Composition and Building Typology: Interpreting Basic Building. Firenze: Alinea Editrice.

● Caruso G, Hilal M and Thomas I (2017) Measuring urban forms from inter-building distances: Combining MST graphs with a Local Index of Spatial Association. Landscape and Urban Planning 163: 80–89.

● Castro KB de, Roig HL, Neumann MRB, et al. (2019) New perspectives in land use mapping based on urban morphology: A case study of the Federal District, Brazil. Land Use Policy 87: 104032. DOI: 10.1016/j.landusepol.2019.104032.

● Conzen M (1960) Alnwick, Northumberland: A Study in Town-Plan Analysis. London: George Philip & Son. Available at: http://www.jstor.org/stable/pdf/621094.pdf.

● Dibble J, Prelorendjos A, Romice O, et al. (2019) On the origin of spaces: Morphometric foundations of urban form evolution. Environment and Planning B: Urban Analytics and City Science 46(4): 707–730. DOI: 10.1177/2399808317725075.

● Dogrusoz E and Aksoy S (2007) Modeling urban structures using graph-based spatial patterns. In: 1 January 2007, pp. 4826–4829. IEEE. DOI: 10.1109/IGARSS.2007.4423941.

● Dukai B (2020) 3D Registration of Buildings and Addresses (BAG) / 3D Basisregistratie Adressen en Gebouwen (BAG). 4TU.ResearchData. DOI: https://doi.org/10.4121/uuid:f1f9759d-024a-492a-b821-07014dd6131c.

● Duque JC, Anselin L and Rey SJ (2012) The max-p-regions problem. Journal of Regional Science 52(3). Wiley Online Library: 397–419. DOI: 10/cf9h6h.

● Fleischmann M, Feliciotti A, Romice O, et al. (2020) Morphological tessellation as a way of partitioning space: Improving consistency in urban morphology at the plot scale. Computers, Environment and Urban Systems 80: 101441. DOI: 10.1016/j.compenvurbsys.2019.101441.
● Gil J, Beirão JN, Montenegro N and Duarte JP (2012) On the discovery of urban typologies: data mining the many dimensions of urban form. Urban Morphology 16(1): 27–40.
● Gil J (2016) Street network analysis ‘edge effects’: Examining the sensitivity of centrality measures to boundary conditions. Environment and Planning B: Planning and Design. DOI: 10.1177/0265813516650678.
● Guyot M, Araldi A, Fusco G and Thomas I (2021) The urban form of Brussels from the street perspective: The role of vegetation in the definition of the urban fabric. Landscape and Urban Planning 205: 103947. DOI: https://doi.org/10/ghf96c.
● Hamaina R, Leduc T and Moreau G (2012) Towards Urban Fabrics Characterization Based on Buildings Footprints. In: Bridging the Geographic Information Sciences. Berlin, Heidelberg: Springer, pp. 327–346. DOI: 10.1007/978-3-642-29063-3_18.
● Hartmann A, Meinel G, Hecht R, et al. (2016) A Workflow for Automatic Quantification of Structure and Dynamic of the German Building Stock Using Official Spatial Data. ISPRS International Journal of Geo-Information 5(8): 142. DOI: 10/f872vh.
● Jochem WC, Leasure DR, Pannell O, et al. (2020) Classifying settlement types from multi-scale spatial patterns of building footprints. Environment and Planning B: Urban Analytics and City Science: 239980832092120. DOI: 10/ggtsbn.
● Kropf K (1993) The definition of built form in urban morphology. University of Birmingham.
● Kropf K (1996) Urban tissue and the character of towns. URBAN DESIGN International 1(3): 247–263. DOI: 10.1057/udi.1996.32.
● Kropf K (2014) Ambiguity in the definition of built form. Urban Morphology 18(1): 41–57.
● Kropf K (2017) The Handbook of Urban Morphology. Chichester: John Wiley & Sons. Available at: http://cds.cern.ch/record/2316422.
● Kropf K (2018) Plots, property and behaviour. Urban Morphology 22(1): 5–14.
● Lehner A and Blaschke T (2019) A Generic Classification Scheme for Urban Structure Types. Remote Sensing 11(2): 173. DOI: 10.3390/rs11020173.
● Levy A (1999) Urban morphology and the problem of the modern urban fabric: some questions for research. Urban Morphology 3: 79–85.
● Louf R and Barthelemy M (2014) A typology of street patterns. Journal of the Royal Society Interface 11. DOI: http://dx.doi.org/10.1098/rsif.2014.0924.
● Moudon AV (1997) Urban morphology as an emerging interdisciplinary field. Urban Morphology 1(1): 3–10.
● Muratori S (1959) Studi per una operante storia urbana di Venezia. Palladio. Rivista di storia dell’architettura 1959: 1–113.
● Neidhart H and Sester M (2004) Identifying building types and building clusters using 3-D laser scanning and GIS-data. Int Arch Photogramm Remote Sens Spatial Inf Sci 35: 715–720.
● Oliveira V (2016) Urban Morphology: An Introduction to the Study of the Physical Form of Cities. Cham: Springer International Publishing.
● Oliveira V and Yaygin MA (2020) The concept of the morphological region: developments and prospects. Urban Morphology 24(1): 18.
● Openshaw S (1984) The Modifiable Areal Unit Problem.
● Osmond P (2010) The urban structural unit: Towards a descriptive framework to support urban analysis and planning. Urban Morphology 14(1): 5–20.
● Porta S, Romice O, Maxwell JA, et al. (2014) Alterations in scale: Patterns of change in main street networks across time and space. Urban Studies 51(16): 3383–3400. DOI: 10.1177/0042098013519833.
● Reynolds DA (2009) Gaussian mixture models. Encyclopedia of biometrics 741. Berlin, Springer. DOI: 10/cqtzqm.
● Schirmer PM and Axhausen KW (2015) A multiscale classification of urban morphology. Journal of Transport and Land Use 9(1): 101–130. DOI: 10.5198/jtlu.2015.667.
● Schwarz G and others (1978) Estimating the dimension of a model. The annals of statistics 6(2). Institute of Mathematical Statistics: 461–464.
● Serra M, Psarra S and O’Brien J (2018) Social and Physical Characterization of Urban Contexts: Techniques and Methods for Quantification, Classification and Purposive Sampling. Urban Planning 3(1): 58–74. DOI: 10.17645/up.v3i1.1269.
● Sneath PHA and Sokal RR (1973) Numerical Taxonomy. San Francisco: Freeman.
● Soman S, Beukes A, Nederhood C, Marchio N and Bettencourt L (2020) Worldwide detection of informal settlements via topological analysis of crowdsourced digital maps. ISPRS International Journal of Geo-Information 9(11): 685. DOI: https://doi.org/10/ghpwqm.
● Song Y and Knaap G-J (2007) Quantitative Classification of Neighbourhoods: The Neighbourhoods of New Single-family Homes in the Portland Metropolitan Area. Journal of Urban Design 12(1): 1–24. DOI: 10.1080/13574800601072640.
● Spaan B and Waag Society (2015) All buildings in Netherlands shaded by a year of construction. Available at: https://code.waag.org/buildings/.
● Steadman P (1979) The Evolution of Designs: Biological Analogy in Architecture and the Applied Arts.
● Steiniger S, Lange T, Burghardt D, et al. (2008) An Approach for the Classification of Urban Building Structures Based on Discriminant Analysis Techniques. Transactions in GIS 12(1): 31–59. DOI: 10.1111/j.1467-9671.2008.01085.x.
● Stewart ID and Oke TR (2012) Local Climate Zones for Urban Temperature Studies. Bulletin of the American Meteorological Society 93(12): 1879–1900. DOI: 10.1175/BAMS-D-11-00019.1.
● Taubenböck H, Debray H, Qiu C, et al. (2020) Seven city types representing morphologic configurations of cities across the globe. Cities 105: 102814. DOI: 10/gg2jv4.
● Usui H and Asami Y (2013) Estimation of Mean Lot Depth and Its Accuracy. Journal of the City Planning Institute of Japan 48(3): 357–362.
● Ward Jr JH (1963) Hierarchical grouping to optimize an objective function. Journal of the American statistical association 58(301). Taylor & Francis Group: 236–244. DOI: 10/fz95kg.
● Whitehand J, Gu K, Conzen MP, et al. (2014) The typological process and the morphological period: a cross-cultural assessment. Environment and Planning B: Planning and Design 41(3). SAGE Publications Sage UK: London, England: 512–533. DOI: 10/f546ck.
● Wurm M, Schmitt A and Taubenbock H (2016) Building Types’ Classification Using Shape-Based Features and Linear Discriminant Functions. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 9(5): 1901–1912. DOI: 10.1109/JSTARS.2015.2465131.

Summary of changes

We are very grateful for all the comments received from all referees, as they helped us address shortcomings and substantially improve the quality of our research paper. The majority of comments have been implemented in the revised version of our manuscript. Below we provide a detailed response to each comment.

The major methodological change made to the manuscript concerns the initial cluster analysis and required a revision of the second part of the results section. Following the recommendations of two reviewers, we have revisited the method based on BIC and settled on the conservative “elbow”-based interpretation of the curve, leading to a change in the number of types. The revised version of the manuscript uses 10 clusters in Prague and 10 in Amsterdam. The remaining steps were therefore updated based on the new results. The remaining changes did not affect the method and results and are detailed in the individual responses below.

Responses to individual comments are embedded in the original comments made by referees for easier cross-referencing.

--------------
Referee: 1
--------------

The manuscript presents a computational model for the unsupervised automated identification and classification of urban fabric based on three key elements, building footprint, plot and street network. The model is tested on the analysis of two major cities that include various types of urban fabric originating from different historical stages. The study contributes to the field of morphometrics thus augmenting the research published recently in EPB. The manuscript develops a clear argumentation on the method which is supported by high quality figures and supplemental materials.

The literature review, however, does not fully cover the contributions to the field of urban morphology and classification of architectural and urban form. It is recommended that the author/s take a position and reflect on the research I list below, thus explaining how the method presented here differs from the previous work. This is especially important since much of this work has been published in EP(B).

1) One of the fundamental elements of urban form constituting the proposed model is the building, with its footprint and other formal/volumetric measures. How does this classification relate to the work by Steadman and others on the classification of built form (Steadman, Bruhns and Holtier, 2000) and (Steadman, Evans and Batty, 2009)?

2a) The proposed classification is based on the biological and numerical taxonomy by Sneath and Sokal. How does it relate to the biological morphospace (Raup, 1966), which is also referred to within (Sneath and Sokal, 1973)?

2b) And most importantly, to its translation as architectural morphospace (Steadman and Mitchell, 2010) and later as the area structure approach to morphological representation and analysis (Marshall, 2014)

Answer: A section was added in the initial part of the manuscript which better frames pre-existing research on urban form classification methods, including the contributions suggested by the reviewer.
We have also mentioned the work of Raup and highlighted some parallels between, on the one hand, the study of organismal phenotypes, the statistical description of biological forms and the separation of individuals (and species) into recognisable, homogeneous groups and, on the other, the urban form classification work proposed in the paper. Indeed, as in early studies of the morphology of biological entities, basic models were forced, for practical reasons, to perform obvious simplifications of the real situation. This has also been true for urban morphology studies, which generally focus on a relatively small number of variables/parameters.
Finally, we did mention the work of Steadman and Mitchell and of Marshall on the encoding and classification of, respectively, buildings and streets in our literature review, as two important contributions aimed at more rigorously systematising the relationship between different geometrical configurations of built form.
However, we did not further address the concept of architectural morphospace and the method proposed by Marshall, as these focus primarily on the classification of individual elements – buildings (Steadman) – or aggregations of individual elements – street patterns (Marshall) – through geometrical analysis. In contrast to these, the distinctive objective of the proposed method is the classification of urban form or patterns. This means that it focuses on recurrences in the relationships between the individual elements making them up rather than on the individual characteristics of each element as such. Furthermore, the proposed method is concerned with physical patterns, that is, patterns that can be extracted by observing entities “that are represented on maps as lines and areas”, leaving aside activity patterns or social patterns.

--

3) It is unclear how topological aggregation of morphological cells within 3 steps relates or is affected by the street network topology. This can be clarified by explaining how the proposed model compares against the ABCD classification of street patterns (Marshall 2005), whose topological classification is also presented in quantitative and numerical format.

Answer: The topological aggregation of morphological cells is not affected by street network topology; it is affected solely by the geometry and topology of the morphological tessellation cells which, in turn, depend on the shape and position (absolute and relative) of buildings. For the same reason, it would not be appropriate to explain the proposed model in relation to Marshall’s typology.
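
As an illustration of what "within 3 topological steps" can mean operationally, the following is a minimal sketch rather than the manuscript's implementation. It assumes that cell contiguity is already available as a plain adjacency dictionary (hypothetical data below) and that the focal cell is counted as part of its own context.

```python
from collections import deque


def cells_within_steps(adjacency, start, max_steps=3):
    """Return the IDs of all cells reachable from `start` in at most `max_steps` hops.

    `adjacency` maps each cell ID to the IDs of the cells it touches.
    The focal cell itself is included (an assumption of this sketch).
    """
    steps = {start: 0}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if steps[cell] == max_steps:
            continue  # do not expand beyond the step limit
        for neighbour in adjacency[cell]:
            if neighbour not in steps:
                steps[neighbour] = steps[cell] + 1
                queue.append(neighbour)
    return set(steps)


# Hypothetical chain of six tessellation cells: 0-1-2-3-4-5
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
context_of_0 = cells_within_steps(adjacency, 0)
context_of_2 = cells_within_steps(adjacency, 2)
```

Only cell-to-cell contiguity enters the search, which is why the street network plays no role in this aggregation step.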


--

4) I recommend the author/s also clarify the first steps of the classification in cases of undeveloped land caught within the urbanized territory, unbuilt plots without any built footprints, or unbuildable land such as bodies of water. Maybe Figure 1 can include a few unbuilt plots. The maps of Amsterdam and Prague do rightly result in void areas over canals, rivers and agricultural fringes (shown blank) but it is unclear whether they are distinct categories in the classification. E.g. would the map of Detroit, USA result in white patches for both the empty lots within the city and the lake?

Answer: The method is ultimately linked to building footprints and their tessellation cells, meaning that undeveloped land is not represented in the model as an individual entity, although it is partly captured in the geometry and extent of the tessellation cell (larger when surrounded by unbuilt land, smaller when surrounded by dense built-up fabric). When there is an unbuilt plot in between built ones, its area will be shared between the tessellation cells of neighbouring buildings, up to a certain limit. We have added a sentence clarifying this in the section Structural elements.
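
The behaviour described above can be illustrated with a deliberately simplified raster sketch. It is not the tessellation algorithm used in the manuscript: buildings are reduced to single points instead of footprint polygons, space is a coarse grid, and the `limit` value is a hypothetical parameter. Each grid cell is assigned to its nearest building if that building lies within the limit, and is left void otherwise.

```python
import math


def assign_cells(buildings, width, height, limit):
    """Map each grid cell (x, y) to the ID of its nearest building, or None.

    A cell is left unassigned (None) when no building lies within `limit`,
    mimicking void areas such as water or open fields. Ties go to the
    building listed first.
    """
    owner = {}
    for x in range(width):
        for y in range(height):
            best_id, best_dist = None, math.inf
            for building_id, (bx, by) in buildings.items():
                dist = math.hypot(x - bx, y - by)
                if dist <= limit and dist < best_dist:
                    best_id, best_dist = building_id, dist
            owner[(x, y)] = best_id
    return owner


# Two hypothetical buildings with an unbuilt gap between them and open land to the east
buildings = {"a": (2, 2), "b": (8, 2)}
owner = assign_cells(buildings, width=20, height=5, limit=4)
```

In this toy setting the gap between the two buildings is split between them, while land further than the limit from any building stays unassigned, which corresponds to the blank areas on the maps.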

--

Referee: 2
--------------

The manuscript proposes a methodology grounded in biological systematics to produce numerical taxonomies of urban form based on a collection of 74 characteristics of streets, buildings and tessellation from the field of quantitative urban morphology, or as the authors call it morphometrics, aggregated on different spatial contexts to define tissue types. The methodology is applied to the cases of Prague and Amsterdam for demonstration and validation, and opens the door for more systematic development of urban taxonomies.
While the process of quantitative unsupervised classification of urban form at different scales is today well established, the proposed methodology offers some innovations. The last step of examining the output of such processes by analysing morphological similarities based on hierarchical classification offers new ways of analysing and comparing urban form types, in addition to the interpretative descriptive approach more commonly used. The second step, of extracting urban tissue types based on contextual characteristics, is also interesting as it combines street, building and cell metrics to define homogeneous areas at different scales. The first step of measuring characteristics of urban form units is more conventional, it has potentially the advantage of being scalable (this has not been tested), but also involves a large number of variables and one might question the parsimony of the approach.

Answer: The parsimony of the approach is twofold. First, it tries to minimise the data input, as it depends on building data and street networks only, unlike alternative methods requiring largely unavailable plots (Berghauser Pont et al.) or topography (Araldi and Fusco). Second, it implements the whole method as an open-source algorithm. Therefore, it does not matter so much how many variables we use, as they tend to be computationally efficient and the inclusion of a large number of them brings conceptual benefits (detailed below) at low cost.

--

While the manuscript is well organised, written and illustrated, it is not always clear, becoming hard to gain an overall understanding of methodology, its contributions and limitations. Important aspects of the work are not made sufficiently explicit, or are stated matter of factly without sufficient explanation and reflection. I will elaborate on these main aspects for revision first, addressing additional minor points later.

PROBLEM and CONTRIBUTION
The manuscript does not clearly state in the introduction the problem, aim, and main contributions to the field of urban morphometrics. The statement in lines 33-38 p. 3 only says what and how, but does not answer: Why is this proposed methodology necessary? What problem is it trying to solve?

Answer: The proposed methodology is an attempt to overcome an important existing limitation in the capacity to systematically capture morphological patterns in a way that is efficient, scalable and extensive. The existence of this gap is described in the introduction. We have added a sentence to better link the gap to the aim of the paper.

--

The authors acknowledge extensive previous work in section 2, and they list briefly some shortcomings at the end (page 5) but these are not common to all studies nor entirely unresolved. And "shortcomings" have eventually a reason to be, and might only be seen as such from the perspective of numerical taxonomy.

Answer: The shortcomings identified relate directly to the aim of achieving a systematic characterization and classification of urban form patterns that is scalable and requires little manual supervision. In this sense, when we discuss shortcomings of the existing literature, we necessarily do so from the perspective of the investigated topic. This does not imply in any way that the studies mentioned are less valuable to the understanding of urban form dynamics; it simply means that they do not currently allow the aim set in the paper to be achieved.

--

Why do we need this approach in the field of morphometric classification of the built environment?

Answer: We believe this question is answered in the response above. The proposed methodology is an attempt to overcome an important existing limitation in the capacity to systematically capture morphological patterns in a way that is efficient, scalable, and extensive. This limitation is a major impediment to large-scale endeavours to classify and characterise urban form patterns and reduces the possibility of comparative studies, which are fundamental to relating urban form to phenomena as diverse as economic performance, health, attractiveness and so on. We have added a sentence in the conclusions to reinforce this point.

--

GAPS
The gaps are taken up in the Discussion (page 27), but only stating very factually that they are addressed by the methodology without further reflection.

How can one claim parsimony of data if typical data does not fulfil the requirements? (see comments below in limitations)

Answer: In our experience, the building data typically used in morphological studies come from an authoritative source, e.g. national or municipal mapping agencies, not from OpenStreetMap (OSM), due to its incompleteness and variable quality. Authoritative data tend to fulfil the requirements. The only exception may be missing building height, which we discuss in more detail below. Street networks of OSM origin can be used as an input in most cases. Compared to the other relevant methods, the proposed one is still less data demanding in most cases (e.g. Araldi and Fusco (2019) require topography on top of the data we use, and Berghauser Pont et al. (2019) require plots).

--

Where is evidence that the 74 morphometric characteristics leading to 370 characters are comprehensive? It is extensive, but one could argue that characters should include aspects such as roughness, skyview factor, vegetation, etc.

Answer: Thank you for this observation. We have realised that the usage of the term “comprehensive” in the paper was not appropriate. Full comprehensiveness would require the inclusion of other data inputs and may be nearly impossible to reach. We have changed the language used in the paper, stating that the set is “extensive” rather than “comprehensive”.

--

By including all "possible" variables one avoids pre-selection. But one might argue that this selection is not biased but rather meaningful and context relevant, and allows interpretation of the outcome, certainly hard from 370 variables.

Answer: That is certainly a valid argument. However, such a selection is at the same time context-relevant and context-dependent. This means that a selection made on the basis of, e.g., Prague would not necessarily be as appropriate for Tokyo or Medellin. With the addition of more case studies to the pool, the qualitative definition of a small context-relevant set of variables becomes an increasingly complex task. It should be noted that the set of variables needs to be the same for all cases to allow for the creation of a taxonomy and the comparability of clustering results.

--

Being fully data driven avoids interpretation. But to what extent can we trust a single BIC score that also must be interpreted, possibly by a non-expert?

Answer: The analysis of the optimal number of clusters using BIC does not rely on a single BIC score per option. Each tested option was initialised 5 times to ensure the stability of the score and its interpretability. The non-expert interpretation of BIC is a valid point, but one that is common to most clustering methods.
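
To make the reasoning about repeated initialisations concrete, here is a minimal sketch under stated assumptions. The BIC formula is the standard one (Schwarz, 1978); the scores below are invented for illustration, and the "farthest from the chord" rule is only one possible way to formalise an elbow reading, not necessarily the interpretation applied in the manuscript.

```python
import math
from statistics import mean


def bic(log_likelihood, n_parameters, n_samples):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_parameters * math.log(n_samples) - 2 * log_likelihood


def elbow(ks, scores):
    """Return the k whose (k, score) point lies farthest from the straight
    line joining the first and last points of the curve."""
    x0, y0, x1, y1 = ks[0], scores[0], ks[-1], scores[-1]
    length = math.hypot(x1 - x0, y1 - y0)

    def distance(point):
        x, y = point
        return abs((y1 - y0) * x - (x1 - x0) * y + x1 * y0 - y1 * x0) / length

    return max(zip(ks, scores), key=distance)[0]


# Hypothetical BIC scores: five initialisations for each tested number of clusters
runs = {
    2: [1000, 1010, 990, 1005, 995],
    4: [700, 705, 695, 702, 698],
    6: [500, 503, 497, 501, 499],
    8: [450, 452, 448, 451, 449],
    10: [430, 431, 429, 432, 428],
    12: [420, 421, 419, 422, 418],
}
ks = sorted(runs)
mean_scores = [mean(runs[k]) for k in ks]
suggested_k = elbow(ks, mean_scores)
```

Averaging over initialisations reduces the influence of any single unlucky start before the curve is interpreted; the spread across the five runs can also be reported alongside the mean.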

--

The methodology depends on additional external data (year of construction, socioeconomics, land use, etc.) and expert opinion for the resulting taxonomy to gain names/meaning - i.e. validation. What is the advantage of moving this step to the end, rather than an input from the start?

Answer: The additional external data was used only to validate the results of the taxonomy and as such is not part of the clustering method, which is based only on geometrical and configurational properties of urban form elements. With the information on year of construction etc., we have sought to show that, in the case studies analysed, the taxonomy was meaningful from an interpretative point of view.

--

How can the taxonomy be further systematised? It ends with dozens of types for the two cities combined, that also requires interpretation (where to cut) and expert knowledge to interpret or use.

Answer: Further systematisation of the taxonomy and detailed interpretation of the hierarchical structure emerging from the result are left for further research. The scope of this paper is to present and test a method that can be used to build such a taxonomy. We have included the branch-based interpretation shown in Figure 4 to help build an intuition about the result, but we feel that further exploration deserves more space than we have in this paper. Ideally, such work would include more case studies, resulting in a richer and possibly more stable taxonomy, and mirror the interpretative methods used in biology.

--

How is the method scalable, if the final taxonomy expands to hundreds or thousands of cities?

Answer: The whole method is encapsulated in a set of self-contained Python scripts that can be easily deployed to a cloud computing environment and programmatically applied to a large set of input case studies. When aiming to build a taxonomy of hundreds or thousands of cities, we are in the realm of Big Data (simply due to the sheer number of building footprints, no matter the method used), which requires scalable infrastructure that can run the code efficiently. With the ongoing developments in the Python ecosystem, the underlying computational libraries are becoming significantly more powerful, making the scalability task easier. However, it needs to be pointed out that while a small number of case studies is feasible to run on a single desktop (as was the case in this research), we would not recommend trying to classify hundreds of cities on one.

--

LIMITATIONS
The last two paragraphs of the discussion (pages 28-29) suggest a limitations subsection. It's important to expand it with further reflections. For example:

If building footprints are not well defined (typical in OSM data), it has an impact on both the buildings and the tessellation elements. What is the impact on building and tessellation characters and how can one address this problem?

Answer: The degree of impact depends on the situation. If the data source is close to the optimum, with a small percentage of buildings being suboptimally defined, the method should be robust enough to minimise the impact thanks to the design of contextual characters, which include all cells within 3 topological steps and eliminate outlier values caused by such suboptimality. When building geometry does not represent individual footprints but whole blocks (as often happens in OSM), some characters need to be interpreted differently and some will be invariable (e.g. shared walls no longer exist). However, according to our internal tests, as long as the level of detail remains consistent (all buildings are drawn equally suboptimally), the method tends to result in a meaningful taxonomy. If the data contain a varied level of detail and missing buildings, then the method will likely struggle to detect meaningful tissues, but in that case we would recommend using remote sensing rather than any vector-based method.

--

Since the context is largely defined by topological relations of this tessellation, it has also an impact on most variables. Are there alternative approaches for defining context that obviate this problem? For example buffer and network distance approaches instead of topological neighbourhoods. What is the advantage of the chosen approach?

Answer: Prior to using topological relations as the aggregation method, we tested other options and published the results in conference proceedings. As we are not able to share those due to anonymity, a short summary follows. We tested aggregation using topology on the morphological tessellation (MT), euclidean distance, metric reach along the street network, and K-Nearest Neighbours (KNN), on 2 scales equal to 200 m and 400 m euclidean distance. The analysis assessed the distributions of the number of neighbours captured by each method and the distributions of the total area covered by them. The test was looking for 1) stability of the number of neighbours and 2) adaptability of the covered area to the varied scale of different urban tissues. In this comparison, MT is the second most stable in terms of the number of neighbours, just after KNN with a fixed number, and the most flexible in terms of the covered area. Therefore, we have concluded that using topology as an aggregation method is the best option available. If the tessellation mesh is assumed to be significantly negatively affected by the input data, any of the other tested methods can be used as a replacement.
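
The two criteria used in that comparison can be illustrated with a toy example. This is a sketch on invented point data, not the conference analysis itself: it contrasts only a fixed euclidean distance band with KNN, since reproducing the topological option would require an actual tessellation. The function names and the choice of the coefficient of variation as a stability measure are assumptions of this sketch.

```python
import math
from statistics import mean, pstdev


def distance_band_counts(points, radius):
    """Number of other points within `radius` of each point."""
    return [
        sum(1 for j, q in enumerate(points) if j != i and math.dist(p, q) <= radius)
        for i, p in enumerate(points)
    ]


def knn_radii(points, k):
    """Distance from each point to its k-th nearest neighbour,
    i.e. the spatial extent a k-nearest-neighbour context has to cover."""
    return [
        sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)[k - 1]
        for i, p in enumerate(points)
    ]


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (0 means perfectly stable)."""
    return pstdev(values) / mean(values)


# Hypothetical fine-grained tissue (10 m spacing) and coarse-grained tissue (100 m spacing)
dense = [(x * 10, y * 10) for x in range(3) for y in range(3)]
sparse = [(1000 + x * 100, 1000 + y * 100) for x in range(3) for y in range(3)]
points = dense + sparse

band_counts = distance_band_counts(points, radius=50)
band_cv = coefficient_of_variation(band_counts)
radii = knn_radii(points, k=3)
```

With a fixed 50 m band, the dense tissue yields eight neighbours per building and the sparse tissue none, so the neighbour count is unstable. KNN keeps the count fixed at three while its extent grows tenfold in the sparse tissue, which is the kind of adaptability the comparison was looking for.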

--

If building heights are missing, how many variables are affected? What is the impact on the results?

Answer: Missing building height affects 13.5% of characters (10 out of 74 primary and an equal ratio of contextual). We have not quantified the effect on the resulting clustering and taxonomy, but our tests showed that most clusters tend to remain stable if we exclude height-based characters, although some post-war developments may be merged together when their footprints are similar and the main difference is between 2 and 8 storeys.

--

What is the robustness of depending on 370 variables? Is there a robust subset, or are all really needed?

Answer: We would be able to define a robust subset for Prague and Amsterdam, but the problem with such an approach is its dependency on the context. We can use factor analysis or a similar method on the measured data, but the subset that is able to detect clusters in historical European cities is likely different from the subset for North American, Asian, or African cities. Each case study can have its own robust subset, but then clustering solutions from different places would not be comparable. Let’s assume that we define such a subset on a sprawling North American city: courtyard area would surely be eliminated from the set, as it is mostly invariable there. However, in the Mediterranean context, it may well be one of the important variables. Since the aim of the method is to be applied across various case studies to build a universal taxonomy of urban form, we cannot define a subset based on two cases only but would have to do so based on potentially hundreds, to ensure that the subset is able to capture the specificity of each place.

--

METHODOLOGY
In my view, limitations can be addressed by solutions to parts of the methodology. A clearer articulation of the different stages of the methodology is needed for its better understanding and wider adoption, which leads me to the final major point.
The subsections in Method are based on the numerical taxonomy elements, but it is not immediately obvious how it relates operationally with the many steps of the methodology. It's important to make all the stages and sub-steps clear, in what are their inputs and outputs, and how they relate.

At the start of the Method section the authors should include a diagram of the various stages and steps.

Answer: Thank you for the suggestion. We have included a diagram in figure 1b.
54 --
55
56
It was hard, even going between methods, results and supplementary material, to piece together the puzzle. Generally, the method section requires more frequent references to the supplementary material to be understood; it is not clear on its own. In particular:

The description of urban characters on page 10 also invites a table summarizing all the categories mentioned and their hierarchy (not a list of characters, that's in supplementary material).

Answer: Thank you for your comment; we fully appreciate that understanding all details of the methodology can be demanding. Unfortunately, due to the limit on the length of the paper, we are not able to add further tables or diagrams within the allocated space. Currently, the same information is provided, albeit split across different tables in the supplementary material.

--

It is not clear how you progress from the 74 characters to define each specific context.

Answer: We have slightly reformulated the relevant section of the method and included this step in the diagram, hoping that it is clearer now. Each of the primary characters is used as an input variable for each of the four contextual options, measured within three topological steps around each tessellation cell.
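
The bookkeeping behind these figures can be checked with a short sketch. It only reproduces the counts quoted in this letter (74 primary characters, four contextual options, 10 height-based characters); the character and option names are hypothetical placeholders, not the names used in the paper.

```python
from itertools import product

N_PRIMARY = 74        # primary characters (count taken from the text)
N_HEIGHT_BASED = 10   # primary characters that need building height (from the text)

# Hypothetical placeholder names; the real characters and contextual options
# are listed in the paper and its supplementary material.
primary = [f"primary_{i:02d}" for i in range(N_PRIMARY)]
options = ["option_a", "option_b", "option_c", "option_d"]

# Every primary character is summarised by every contextual option.
contextual = [f"{p}__{o}" for p, o in product(primary, options)]

n_contextual = len(contextual)        # 74 * 4
n_total = N_PRIMARY + n_contextual    # primary plus contextual
height_share = round(100 * N_HEIGHT_BASED / N_PRIMARY, 1)
```

Under these counts the sketch yields 296 contextual characters, 370 variables in total, and a 13.5% share of height-dependent characters, matching the numbers mentioned by the reviewers and in the answers above.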

--

The description of contextual characters would benefit from the insertion of a simpler version of figure S4.

Answer: As much as we would like to include additional figures in the main text, as said before, we are unfortunately limited by the Journal’s word count, leaving us no choice but to keep the figure in the Supplementary material. We would have to remove another part of the manuscript, which would negatively affect clarity elsewhere.

--

- The detection of morphological taxa focuses on the algorithm, missing an explanation of the output and how that links to the next step.

Answer: We have added a description of the output of the step (cluster labels assigned to individual tessellation cells) and linked it more explicitly to the next step (each urban form type is represented by its centroid, i.e. the mean of each character across cells with the same label).
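
As an illustration of that link, the centroid of each type can be computed from the cluster labels as sketched below. This is a minimal sketch on hypothetical toy data; the function name and input layout are assumptions, not the code from the accompanying notebooks.

```python
from collections import defaultdict

def cluster_centroids(rows, labels):
    """Mean of each character across the cells sharing a cluster label."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    return {
        label: [sum(column) / len(members) for column in zip(*members)]
        for label, members in grouped.items()
    }

# Hypothetical toy input: four tessellation cells, two characters each.
rows = [[1.0, 10.0], [3.0, 14.0], [8.0, 2.0], [10.0, 4.0]]
labels = [0, 0, 1, 1]
centroids = cluster_centroids(rows, labels)
```

Each centroid is a vector with one value per character, so the types can be compared with each other in the subsequent taxonomy step.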

--

In doing so, the authors will also break the method into stages that can eventually be more easily reproduced or adapted to different needs by others, not necessarily adopting the whole "package" in case there are limitations in data input or application that need to be addressed. Such flexibility makes the model more robust.

Answer: The method is effectively broken into five steps: 1) generation of morphological elements, 2) measurement of primary morphometric characters, 3) measurement of contextual characters, 4) cluster analysis, 5) taxonomy. Each of these can be adapted to individual needs if required. We have made this division explicit in the paper.

--

MINOR POINTS
The abstract should highlight the contributions of the methodology, not simply state what is done.

Answer: We have added a sentence to clarify the contribution in the abstract.

--

The emergence of quantitative morphometrics in the field of Urban Morphology is less recent than suggested in page 3 line 27, dating back to a publication in its journal: Gil J, Beirão JN, Montenegro N, Duarte JP (2012) On the discovery of urban typologies: data mining the many dimensions of urban form. Urban Morphology 16(1): 27–40.

Answer: We have now included a reference to Gil et al. (2012), as it was correctly pointed out that it is considered one of the first works within the subfield of urban morphometrics.

--

What tools, software and data sets were used in the case study and should be used for the methods?

Answer: The whole method is written as reproducible Python code in Jupyter notebooks that can be executed within a Docker container. The main component, the measurement of primary and contextual characters, has been turned into an independent Python package and released as open source under the MIT license. Due to the potential anonymity breach, we cannot link the repository or documentation here.

The morphological data (buildings, streets) for the Prague case study were obtained from the city’s open data portal (https://www.geoportalpraha.cz/en), while the validation layers were provided by the Prague Institute of Planning and Development. The morphological data for Amsterdam were obtained from the 3D BAG repository (Dukai, 2020) and Basisregistratie Grootschalige Topografie, BGT (http://data.nlextract.nl/). The same information is also reported in the main body of the paper.

--

There is no clear explanation for the choice of 20 and 30 types for Prague and Amsterdam respectively. The BIC plots are not clearly indicating this.

Answer: Thank you for pointing out the methodological issue hidden in the selection of the optimal number of GMM components. We have revised the BIC-based method and settled on the conservative “elbow”-based interpretation of the curve, leading to a change in the number of types. The revised version of the manuscript uses 10 clusters in Prague and 10 in Amsterdam.

--

What might be the impact of many variables not having enough variance across contexts in a given case? Would they affect the Gaussian clustering results?

Answer: The Gaussian Mixture Model is a distance-based clustering technique (similarly to K-Means), where each variable is a single dimension in an n-dimensional space. When a variable does not have enough variance, the distance between observations along that dimension is the same (or almost the same). As such, it does not have any significant impact on the formation of clusters. If you imagine a clustering based on 2 variables, where one is highly heterogeneous and the other is invariable, the result would be the same as using the former one only.
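
The geometric intuition behind this answer can be sketched with plain Euclidean distances. The observations below are hypothetical, and the sketch illustrates only the distance argument; it is not a fitted Gaussian Mixture Model, which additionally estimates component covariances.

```python
import math

# Hypothetical toy observations: (heterogeneous variable, invariable variable).
observations = [(1.0, 5.0), (4.0, 5.0), (9.0, 5.0)]

def distance_2d(a, b):
    """Euclidean distance using both variables."""
    return math.dist(a, b)

def distance_1d(a, b):
    """Distance using the heterogeneous variable only."""
    return abs(a[0] - b[0])

# Compare the two distances for every pair of observations.
pairs = [(a, b) for i, a in enumerate(observations) for b in observations[i + 1:]]
same = all(math.isclose(distance_2d(a, b), distance_1d(a, b)) for a, b in pairs)
```

Because the second variable contributes zero to every pairwise difference, `same` is true: the pairwise distances, and therefore any grouping derived from them, are unchanged when the invariable variable is dropped.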

--

Table 1 reports the validation results of the tests. It would be useful for the reader to indicate how to interpret those scores, and mention the scores in the text. What is high, medium or low?

Answer: We have added indications of low, moderate, and high association to the table caption: V < .3 indicates low, .3 to .5 moderate, and V > .5 high association. However, due to space limitations, we had to move the table to the Supplementary material.
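
For readers who want to apply the same verbal scale programmatically, the thresholds can be written as a small helper. The function name is hypothetical, and treating the boundary values .3 and .5 as "moderate" is an assumption, since the caption wording leaves the boundaries open.

```python
def interpret_association(v):
    """Map an association score V to the verbal scale from the table caption."""
    if v < 0.3:
        return "low"
    if v <= 0.5:  # assumption: both boundary values count as moderate
        return "moderate"
    return "high"

# Hypothetical scores, not values from Table 1.
examples = {v: interpret_association(v) for v in (0.12, 0.3, 0.45, 0.5, 0.71)}
```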

--

The Discussion ventures into examples of applications (p.38, line 31), but each of the three statements must be expanded with one or two sentences to explain what the authors actually mean. This is not obvious.

Answer: Each of the three statements raised has been clarified with one additional explanatory sentence.

--

The conclusion must be revised to offer a short summary of the manuscript, highlights of contributions, and some brief outlook.

Answer: A short summary of the manuscript was added to the conclusion section and we have briefly expanded on the contributions of the method.

--

The headings should be numbered to more easily navigate the text and position ourselves in each section.

Answer: The Journal style uses unnumbered headings; we only follow its requirements here.

--

- Inconsistent referencing form for co-authors, sometimes no co-authors (Bobkova 2019, page 3), sometimes 3 authors included (Berghauser Pont et al. 2019, page 5), although most are consistent.

Answer: The referencing form has been fixed to remain consistent across the paper.

--

UNIVERSALITY

One final question for reflection and comment. To what extent do the authors envisage this being a method for a universal taxonomy of urban form? The fact of adopting an extensive list of characters seems to point in that direction. Maybe something to reflect in the conclusion?

Answer: The method is indeed aimed to be as universal as feasible, which affected method design decisions like the selection of characters. We have added a reflection on the potential for the development of a universal taxonomy of urban form to the conclusion.

--

--------------
Referee: 3
--------------

This paper represents the latest, and perhaps to date the most accomplished, attempt to translate biology's numerical classification methods (i.e. numerical taxonomy) to urban morphology – a research trend with precedents which the authors thoroughly identify. It is a well-structured and carefully thought piece of research, which moreover stems from previous, peer-reviewed work from the authors. In its current form, I believe the paper is already almost publishable, for it is clear and very well written. However, I also believe that it has an unresolved tension regarding its objectives.
In my view, the scope of this paper is to present a theoretical and analytical framework, which is then applied to two case-studies, serving only as proofs of concept for the proposed methodology. If this is indeed the case, then some technical shortcomings that the paper has (see below) are of lesser importance. But if the case studies are supposed to be more than illustrations of the method, then those shortcomings become relevant and need to be addressed. I think that this tension might be solved just by clearly bounding the scope and objectives of the paper (i.e. to propose a theoretical and methodological/analytical framework, illustrated by two experimental applications), and by addressing in a more frontal and thorough way, the theoretical impacts of some of its methodological options (see below). This review follows this premise, stressing the unaddressed theoretical impacts which should be made explicit and clarified. The technical shortcomings identified in the application exercises are enumerated last, and their importance depends upon the relevance given to those exercises in the end.

MAJOR CONCERNS
The first methodological option with huge (largely unaddressed) theoretical impacts, is the substitution of one of the fundamental elements of urban form, namely plots, by an artificial entity, namely Voronoi cells (derived from building footprints’ centroids). The difficulty in gathering data on plots (which is widely acknowledged) is invoked to explain such substitution. However, one must not be misguided by the superficial resemblance between such cells and actual plots. The two things are not, by any means, the same. The plot is a real, concrete “piece of land, bound by legally defined borders that constitutes a basic unit of land control and use” (Bobkova 2019). Beyond being legal entities, plots also have their own geometric properties, and the relationship between such properties and those of buildings and streets (e.g. how much and which part of the plot the building occupies), are critical to accurately describe the nature of urban tissues. The proposed method does compute a number of geometric attributes of the Voronoi cells but those are not, of course, attributes of plots. Perhaps most importantly, plots are not inert entities, being subject to morphogenetic processes (e.g. subdivision, amalgamation). Therefore, they are as structural and formative as street networks are, and cannot be ‘substituted’ by some proxy, at least one as crude and simplistic as Voronoi cells. The fact that the Voronoi tessellation is instrumental to their method (more on that below), must not deter the authors from fully recognizing these differences, the critical information on plots that their method necessarily overlooks and the theoretical impacts of such methodological options, which should be fully explored. In particular, the authors should make an effort to dispel any doubts created by superficial resemblances between their cells and actual plot systems, lest their method be misread or misused by others, thinking that the two things are interchangeable.

Answer: This is a very valid comment and we thank the reviewer for raising it. We are aware that the tessellation is not a complete substitute for a plot but merely an analytical entity allowing us to describe certain aspects of urban patterns. We have amended the language used in the discussion of the plot and tessellation and added a sentence clarifying the position of tessellation and its lack of the structural role the plot has. We hope that such a clarification avoids the method being misread. As the aim of the paper is different, the detailed discussion of the relation between tessellation cells and plots is seen as out of scope.

--

2. Another important aspect concerns the use of topological neighbourhoods in order to calculate spatially lagged morphometric measures, which the authors call ‘contextual characters’. It is here that the Voronoi tessellation becomes instrumental, because it serves to define the topological relations of contiguity between cells and, hence, the topological neighbourhoods of k-th order around each cell. Using these neighbourhoods, the authors derive ‘contextual’ versions of the original morphometric measures describing each individual element, namely indexes of variety for the distribution of values of each measure within each topological neighbourhood. In itself, this technique is simple and elegant, effectively capturing the zonal variability of values. As the authors note, such technique is already a kind of spatial autocorrelation measure, because ‘contextual measures’ are indeed measuring the extent to which a given object is (or not) surrounded by similar objects.
What I find less clear, is the fact the values that are actually input into the classification algorithms, are not the individual characteristics of the urban form elements themselves, but rather their ‘contextual’ counterparts (explicitly leaving out the original measures). In other words, what is being clustered are not the original morphometric variables, but rather values describing the spatial autocorrelation of those variables within a given topological neighbourhood. There seems to be a certain circularity here. But my main concern is the reason why the authors choose to do this unorthodox analytical step. Indeed, on line 15 p.11, the authors say that “primary characters describe [...] elements and their immediate neighbourhood rather than their spatial patterns. As such, when employed for cluster analysis they may result in spatially discontinuous classes. To avoid this, we derive [...] a set of spatially lagged contextual characters [...]”.
There seems to be a confusion here. Shouldn’t the spatial continuity of morphometric values emerge naturally, from the fact that similar elements are spatially clustered (or autocorrelated) in the first place? And, should we “avoid” obtaining spatially discontinuous classes (through some technical trick), or should we naturally obtain such continuity through well-targeted measures? Isn’t the very nature of morphological regions or tissues, exactly the fact that they are made up of similar elements? I might be wrong, but it seems to me that the authors are feeding the clustering algorithm with the answer that they seek; or, perhaps, they are just giving the algorithm a clever and legitimate initial push. At any rate, these issues must be fully disclosed and discussed. The text in p.1 of Supplementary Material is quite important and a good place to start doing so. In fact, I think that that section ought to be part of the main text, because it holds quite relevant information.

Answer: We fully understand where the feeling that this step is somewhat unnecessary comes from, and while it may seem that we’re feeding the algorithm what we want, the situation is slightly different. “Shouldn’t the spatial continuity of morphometric values emerge naturally, from the fact that similar elements are spatially clustered (or autocorrelated) in the first place? … Isn’t the very nature of morphological regions or tissues, exactly the fact that they are made up of similar elements?” - Yes and no. There are urban patterns, like large-scale suburban developments, where each building is the same or almost the same as the neighbouring one. In such a homogeneous case, we would be able to use primary characters to derive spatially contiguous clusters. However, most urban tissues are not like that. They are formed by elements of various sizes, shapes, and configurations, and the way these are combined makes the pattern. In that case, we have to capture what makes the pattern, not its individual components. We need to understand that the place is typified by high heterogeneity of building areas and shapes (as is the case in Prague’s medieval core); therefore the information that needs to be passed to the clustering algorithm should be the one reflecting the pattern. Contextual characters are specifically designed to do so, to reflect the tendency of each measurable aspect within the context around each building.
“should we naturally obtain such continuity through well-targeted measures?” - That could be one approach, but if you want to capture shapes of individual elements, your well-targeted measure would end up being very similar to a contextual character based on the primary one measuring shape. Therefore, we would circle back to the same spot, because we would still want to understand the distribution of values within the context to capture the varied heterogeneity of urban tissues.
“I might be wrong, but it seems to me that the authors are feeding the clustering algorithm with the answer that they seek; or, perhaps, they are just giving the algorithm a clever and legitimate initial push” - We firmly believe that contextual characters are an initial push. They are not the answer; individual characters do not tell the same story as cluster analysis. They are what allows us to use a scalable clustering method that is not spatially constrained.

We have expanded the section Morphometric characters: primary and contextual characters, trying to explain the reasoning behind contextual characters in a clearer way.

--

TECHNICAL SHORTCOMINGS
1. The authors use a very large number of morphometric variables (296), studied and described in previous papers. They also provide a list of those variables and their formal definitions in the Supplementary Materials. However, some kind of scheme or diagram disclosing the organization (or inner structure, i.e. element, scale, morphometric class) of this large set of variables would be helpful in the main text, so that the reader may quickly apprehend them.

Answer: As much as we would like to include additional figures in the main text, we are unfortunately limited by the Journal’s word count. We would have to remove another part of the manuscript, which would negatively affect clarity elsewhere.

--

1.b Moreover, one must ask if these variables are all uncorrelated and not redundant. The authors mention that they are not, but given their large number this seems very unlikely. This otherwise simple and parsimonious method, seems to adopt a brute force strategy regarding the variables, which would gain with some redundancy or dimensionality-reduction analysis.

Answer: We would be able to define a robust subset for Prague and Amsterdam, but the problem with such an approach is its dependency on the context. We can use factor analysis or a similar method on the measured data, but the subset that is able to detect clusters in historical European cities is likely different from the subset for North American, Asian, or African cities. Each case study can have its own robust subset, but then clustering solutions from different places would not be comparable. Let’s assume that we define such a subset on a sprawling North American city: courtyard area would surely be eliminated from the set, as it is mostly invariable there. However, in the Mediterranean context, it is likely to be one of the important variables. Since the aim of the method is to be applied across various case studies to build a universal taxonomy of urban form, we cannot define a subset based on two cases only; we would have to do so based on potentially hundreds of cases to ensure that the subset is able to capture the specificity of each place. The same applies to correlation and redundancy. The fact that some variables are correlated in one context does not necessarily imply that they would be correlated in another.

--

2. GMM is appropriately chosen as main clustering algorithm. However, GMM is not deterministic (it may well produce different results in different runs), which stands against the authors’ claims of methodological reproducibility. One way to mitigate this is to run GMM several times and inspect the stability of the results (standard deviation, etc). I would recommend this procedure in further experimental exercises.

Answer: We use an approach similar to that used in K-Means clustering (also not deterministic), that is, repeated initialisation and automatic selection of the best result across initialisations. In this study, we use 100 initialisations, leading to results with a high degree of stability. Furthermore, for the direct reproducibility of the analysis, we specify the random state, meaning that the GMM applied to the same input data will always produce the same labels. These details are included in the Jupyter notebooks accompanying the manuscript.
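
The principle of repeated initialisation with a fixed random state can be illustrated with a deliberately small stand-in. The sketch below uses a toy one-dimensional K-Means written in plain Python rather than a GMM, with hypothetical data and function names; it is not the code from the notebooks. In scikit-learn, the analogous controls on `GaussianMixture` are the `n_init` and `random_state` parameters, although which library the notebooks rely on is an assumption here.

```python
import random

def kmeans_1d(values, k, rng, n_iter=50):
    """One k-means run from a random start; returns (inertia, labels)."""
    centres = rng.sample(values, k)
    labels = [0] * len(values)
    for _ in range(n_iter):
        # Assign every value to its nearest centre.
        labels = [min(range(k), key=lambda j: (v - centres[j]) ** 2) for v in values]
        # Move every centre to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centres[j] = sum(members) / len(members)
    inertia = sum((v - centres[lab]) ** 2 for v, lab in zip(values, labels))
    return inertia, labels

def best_of(values, k, n_init=100, random_state=0):
    """Run n_init seeded initialisations and keep the lowest-inertia result."""
    rng = random.Random(random_state)
    runs = [kmeans_1d(values, k, rng) for _ in range(n_init)]
    return min(runs, key=lambda run: run[0])

# Hypothetical toy data with two obvious groups.
values = [1.0, 1.2, 0.9, 8.0, 8.3, 7.9]
first = best_of(values, k=2, random_state=42)
second = best_of(values, k=2, random_state=42)
```

With the same data and the same random state, `first` and `second` are identical, which is the sense in which a seeded, multi-start procedure is reproducible even though a single unseeded run is not.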

--

3. There is no analysis of clusterability of the data. In order for the experimental exercises to be valid, the authors should perform a previous assessment of the cluster tendency of their data, such as the Hopkins statistic. All clustering algorithms provide clustering solutions. However, these are only meaningful if clusters are present, in the first place.

Answer: We believe that the presented results and their external validation clearly indicate that clusters are meaningful and that input data are clusterable. However, following the recommendation, we have computed the Hopkins statistic (Prague: 0.028, Amsterdam: 0.011), which indicates high clusterability of the input data.
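
For orientation, one common formulation of the Hopkins statistic is sketched below. It is an assumption that this matches the variant used for the reported values: the convention here is the one in which values close to 0 indicate clustered data, which is consistent with how the reported numbers are interpreted, and distances are not raised to the power of the dimension as some formulations do. The data and function names are hypothetical.

```python
import math
import random

def nearest_distance(point, others):
    """Distance from a point to the closest of the other points."""
    return min(math.dist(point, other) for other in others)

def hopkins(points, m, random_state=0):
    """Hopkins statistic in the convention where values near 0 mean clustered."""
    rng = random.Random(random_state)
    dims = range(len(points[0]))
    lows = [min(p[d] for p in points) for d in dims]
    highs = [max(p[d] for p in points) for d in dims]
    # w: distances from m sampled real points to their nearest other real point.
    sampled = rng.sample(range(len(points)), m)
    w = [nearest_distance(points[i], points[:i] + points[i + 1:]) for i in sampled]
    # u: distances from m uniformly placed locations to their nearest real point.
    u = [
        nearest_distance(tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs)), points)
        for _ in range(m)
    ]
    return sum(w) / (sum(u) + sum(w))

# Hypothetical toy data: two tight groups of 25 points each.
rng = random.Random(1)
clustered = [
    (cx + rng.uniform(-0.1, 0.1), cy + rng.uniform(-0.1, 0.1))
    for cx, cy in [(0.0, 0.0), (10.0, 10.0)]
    for _ in range(25)
]
h = hopkins(clustered, m=10)
```

For tightly grouped data such as this toy set, real points sit much closer to each other than random locations sit to real points, so the statistic falls close to 0.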

--

4. The lack of a clear minimum in the BIC values is concerning, not only regarding overfitting but also regarding the previous point (insignificant cluster structure). At any rate, I think that the authors’ choice of selecting the number of clusters as that with the minimum BIC is clearly misguided, in this case. Instead, they should look at the curve of BIC gradients and choose the number of clusters where the curve suddenly loses slope (from that point on, further clusters become more and more irrelevant). Now, in Prague this seems to happen around 10-15 clusters and in Amsterdam (as depicted in Figure S8) around 10. This is completely at odds with the 20 clusters for Prague and 30 for Amsterdam, chosen by the authors. I believe that these choices are all but parsimonious and hardly defensible based on the BIC value alone. The authors should inspect solutions with fewer clusters, potentially more informative, generalizable and less overfitting than the ones chosen.

Answer: Thank you for pointing out the methodological issue hidden in the selection of the optimal number of GMM components. We have revised the BIC-based method and settled on the conservative “elbow”-based interpretation of the curve, leading to a change in the number of types. The revised version of the manuscript uses 10 clusters in Prague and 10 in Amsterdam.
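
One possible way to formalise an "elbow" reading of a BIC curve is sketched below. The rule used (the point where the slope flattens most abruptly) is a common heuristic and only an assumption about how the curve was interpreted; the BIC values are synthetic and are not taken from the Prague or Amsterdam results.

```python
def elbow(ks, bic):
    """Return the number of components where the BIC curve loses slope most sharply."""
    # Slope of the curve between consecutive candidate numbers of components.
    slopes = [(bic[i + 1] - bic[i]) / (ks[i + 1] - ks[i]) for i in range(len(ks) - 1)]
    # Change in slope at each interior point; a large positive value marks a bend.
    bends = [slopes[i + 1] - slopes[i] for i in range(len(slopes) - 1)]
    sharpest = max(range(len(bends)), key=bends.__getitem__)
    return ks[sharpest + 1]

# Synthetic illustrative values: steep improvement up to 10, marginal gains after.
ks = [5, 10, 15, 20, 25, 30]
bic = [1000.0, 700.0, 680.0, 665.0, 655.0, 650.0]
chosen = elbow(ks, bic)
```

On this synthetic curve the BIC keeps decreasing up to 30 components, so a minimum-based rule would pick 30, whereas the elbow rule picks 10, where additional components stop paying off.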

--

Methodological Foundation of a Numerical Taxonomy of Urban Form: Supplementary material

Supplementary Material 1: Relational analytical framework
This research proposes and applies a relational framework of urban form for urban morphometrics.

The relational analytical framework (RF) of urban form is based on two concepts: topology and inclusiveness. The framework acknowledges that there are identifiable relations between all elements of urban form and their aggregations. As such, it accommodates all analytical aggregations into a singular framework, linking all potential measurable characters to the smallest element. Furthermore, it employs topological relations in the way it generates location-based aggregations of fundamental elements.

Unlike existing frameworks in the literature, RF is analytical, not conceptual or structural. It does not try to propose a new theory of urban form; it is purely morphometric in nature.

Within this research, RF is operationalised based on morphological tessellation.

The key principles of the tessellation-based relational framework are as follows.

1. Urban form is represented as building footprints, street networks and footprint-based morphological tessellation.
2. There is an identifiable relationship between buildings and street networks, buildings and street nodes, and buildings and tessellation cells.
3. Morphometric characters are measured on scales defined by topological relations between elements.
   - Element itself
   - Element and its immediate neighbours
   - Element and its neighbours within n topological steps, either in a constrained or an unconstrained way.
4. Therefore, we can define subsets of RF as measurable entities of urban form based on fundamental elements and topological scales.
5. Subsets are overlapping, reusing each element within all relevant relations.
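
The topological scales in principle 3 can be sketched as a breadth-first traversal of a contiguity graph. The sketch below is illustrative only: the adjacency dictionary, the cell names and the `barriers` argument (pairs of cells whose shared boundary is crossed by a street) are hypothetical, and this is not the implementation from the accompanying package.

```python
from collections import deque

def within_steps(adjacency, start, n, barriers=frozenset()):
    """Cells reachable from start within n topological steps, with their step counts.

    Pairs listed in barriers are never traversed, which gives the constrained
    variant (for example, not crossing the street network).
    """
    steps = {start: 0}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if steps[cell] == n:
            continue  # do not expand beyond the requested topological distance
        for neighbour in adjacency[cell]:
            if neighbour in steps or frozenset((cell, neighbour)) in barriers:
                continue
            steps[neighbour] = steps[cell] + 1
            queue.append(neighbour)
    del steps[start]
    return steps

# Hypothetical contiguity between five cells arranged in a row.
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
unconstrained = within_steps(adjacency, "a", 3)
constrained = within_steps(adjacency, "a", 3, barriers={frozenset(("c", "d"))})
```

In this toy row of cells, the unconstrained neighbourhood of `a` within three steps contains `b`, `c` and `d`, while the constrained one stops at `c` because a street separates `c` from `d`.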

Since the relation between all elements is preserved throughout the process of their combination, we can always link values measured on one subset to another. For example, due to the fixed relation between building and street node, we can attach a node's degree value to a building as an element. The constrained topological relation can identify traditional area-based aggregations like the block (as a combination of all tessellation cells whose topological relation does not cross a street). As such, they allow us to combine both area-based and location-based aggregations while minimising the modifiable areal unit problem (MAUP) for each of them.

Subsets of elements

Subsets are a combination of topological scales and fundamental elements. The overlap of morphometric characters derived from subsets, where each subset represents a different structural unit, gives an overall characteristic of each building-cell duality, which can later be used for further analysis.

We can divide subsets into three topological scales: Small (or Single), Medium and Large.

Note that topological distance can be defined within each layer (relations between buildings, relations between cells, relations between edges or nodes), but not as a combination of layers. The relation between a building, its cell, its segment and its node is fixed and seen as a singular feature. That is why morphometric characters like the covered area ratio of the cell are classified as Small scale characters.

Small/Single (S)
Small scale captures fundamental elements themselves (topological distance is 0, i.e. the element itself). In the case of building and tessellation cell, it captures the individual character of each cell. In the case of street segment and node, it captures the value for the segment or node, which is then applied to each cell attached to it.

We have four subsets within the small scale:

- building
- tessellation cell
- street segment
- street node

Figure S1: Diagrams illustrating the subsets on the small/single scale.

Medium (M)
The medium scale reflects topological distance 1. It captures the individual character of each element derived from its relation to adjacent elements.

- adjacent buildings
- neighbouring cells
- neighbouring segments
- linked nodes

Figure S2: Diagrams illustrating the subsets on the medium scale.

Large (L)
The large scale captures topological distances 2 to n. In the case of cells, it captures the individual character of each cell derived from its relation to cells within a set topological distance. In the case of joined buildings and blocks, the resulting measurable values are shared among all elements within such a structural unit. A block here is based on morphological tessellation and is defined as the contiguous portion of land comprised of cells which are normally bounded by streets or open space.

- joined buildings
- neighbouring cells of larger topological distance
- block (the maximum number of topological steps from an element without the need to cross the street network)
- neighbouring segments of larger topological distance
- linked nodes of larger topological distance

Figure S3: Diagrams illustrating the subsets on the large scale.

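The topological scales described above can be illustrated with a breadth-first search over an adjacency structure. The following is a minimal sketch under stated assumptions, not the implementation used in this work: the function name `neighbours_within`, the `adjacency` dictionary and the toy chain of five cells are all hypothetical.

```python
from collections import deque


def neighbours_within(adjacency, start, max_steps):
    """Return {element: topological distance} for all elements that can be
    reached from `start` in at most `max_steps` steps (start excluded)."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if distances[current] == max_steps:
            continue  # do not expand beyond the requested distance
        for neighbour in adjacency.get(current, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[current] + 1
                queue.append(neighbour)
    del distances[start]
    return distances


# Hypothetical chain of tessellation cells: a - b - c - d - e
adjacency = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}

medium = neighbours_within(adjacency, "a", 1)  # medium scale: distance 1
large = neighbours_within(adjacency, "a", 3)   # large scale: distances up to 3
```

The same traversal applies to buildings, cells, segments and nodes, as long as adjacency between elements of the same kind is known.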
The resulting combination of all subsets is overlapping, following, in principle, Alexander's (1966) schema of an overlapping semi-lattice.

Figure S4: Diagrams illustrating the overlapping nature of the relational framework. The left diagram overlays all subsets on top of each other, capturing the importance of each element for the description of urban form around the indicated building. The darker the colour, the more times an element is used within the various subsets. The diagram on the right shows all subsets aligned on top of each other, conveying similar information while showing each subset directly.

Supplementary Material 2 Primary morphometric characters

Based on the principles described in Sneath and Sokal (1973), the following morphometric characters compose the final set of primary characters. For implementation details, please refer to the original referenced work and to the documentation and code of <masked for blind review>, which contains a Python-based implementation of each character.

| index | element | level (scale) | context | category |
|---|---|---|---|---|
| area | building | S | building | dimension |
| height | building | S | building | dimension |
| volume | building | S | building | dimension |
| perimeter | building | S | building | dimension |
| courtyard area | building | S | building | dimension |
| form factor | building | S | building | shape |
| volume to façade ratio | building | S | building | shape |
| circular compactness | building | S | building | shape |
| corners | building | S | building | shape |
| squareness | building | S | building | shape |
| equivalent rectangular index | building | S | building | shape |
| elongation | building | S | building | shape |
| centroid - corner distance deviation | building | S | building | shape |
| centroid - corner mean distance | building | S | building | shape |
| solar orientation | building | S | building | distribution |
| street alignment | building | S | building | distribution |
| cell alignment | building | S | building | distribution |
| longest axis length | tessellation cell | S | tessellation cell | dimension |
| area | tessellation cell | S | tessellation cell | dimension |
| circular compactness | tessellation cell | S | tessellation cell | shape |
| equivalent rectangular index | tessellation cell | S | tessellation cell | shape |
| solar orientation | tessellation cell | S | tessellation cell | distribution |
| street alignment | tessellation cell | S | tessellation cell | distribution |
| coverage area ratio | tessellation cell | S | tessellation cell | intensity |
| floor area ratio | tessellation cell | S | tessellation cell | intensity |
| length | street segment | S | street segment | dimension |
| width | street profile | S | street segment | dimension |
| height | street profile | S | street segment | dimension |
| height to width ratio | street profile | S | street segment | shape |
| openness | street profile | S | street segment | distribution |
| width deviation | street profile | S | street segment | diversity |
| height deviation | street profile | S | street segment | diversity |
| linearity | street segment | S | street segment | shape |
| area covered | street segment | S | street segment | dimension |
| buildings per meter | street segment | S | street segment | intensity |
| area covered | street node | S | street node | dimension |
| shared walls ratio | adjacent buildings | M | adjacent buildings | distribution |
| alignment | neighbouring buildings | M | neighbouring cells (queen) | distribution |
| mean distance | neighbouring buildings | M | neighbouring cells (queen) | distribution |
| weighted neighbours | tessellation cell | M | neighbouring cells (queen) | distribution |
| area covered | neighbouring cells | M | neighbouring cells (queen) | dimension |
| reached cells | neighbouring segments | M | neighbouring segments | intensity |
| reached area | neighbouring segments | M | neighbouring segments | dimension |
| degree | street node | M | neighbouring nodes | distribution |
| mean distance to neighbouring nodes | street node | M | neighbouring nodes | dimension |
| reached cells | neighbouring nodes | M | neighbouring nodes | intensity |
| reached area | neighbouring nodes | M | neighbouring nodes | dimension |
| number of courtyards | adjacent buildings | L | joined buildings | intensity |
| perimeter wall length | adjacent buildings | L | joined buildings | dimension |
| mean inter-building distance | neighbouring buildings | L | cell queen neighbours 3 | distribution |
| building adjacency | neighbouring buildings | L | cell queen neighbours 3 | distribution |
| gross floor area ratio | neighbouring tessellation cells | L | cell queen neighbours 3 | intensity |
| weighted reached blocks | neighbouring tessellation cells | L | cell queen neighbours 3 | intensity |
| area | block | L | block | dimension |
| perimeter | block | L | block | dimension |
| circular compactness | block | L | block | shape |
| equivalent rectangular index | block | L | block | shape |
| compactness-weighted axis | block | L | block | shape |
| solar orientation | block | L | block | distribution |
| weighted neighbours | block | L | block | distribution |
| weighted cells | block | L | block | intensity |
| local meshedness | street network | L | nodes 5 steps | connectivity |
| mean segment length | street network | L | segment 3 steps | dimension |
| cul-de-sac length | street network | L | nodes 3 steps | dimension |
| reached cells | street network | L | segment 3 steps | dimension |
| node density | street network | L | nodes 5 steps | intensity |
| reached cells | street network | L | nodes 3 steps | dimension |
| reached area | street network | L | nodes 3 steps | dimension |
| proportion of cul-de-sacs | street network | L | nodes 5 steps | connectivity |
| proportion of 3-way intersections | street network | L | nodes 5 steps | connectivity |
| proportion of 4-way intersections | street network | L | nodes 5 steps | connectivity |
| weighted node density | street network | L | | intensity |
| local closeness centrality | street network | L | nodes 5 steps | connectivity |
| square clustering | street network | L | nodes within network | connectivity |

Table S1: Table of primary morphometric characters. For detailed explanation, formulas and references, see the details below. Nomenclature follows the Index of Element model proposed by <masked for blind review>. Scale refers to the topological scale from which a character is derived, while context describes the actual set of elements used.

1. Area of a building is denoted as

$$a_{blg}$$

and defined as the area covered by a building footprint in m².

2. Height of a building is denoted as

$$h_{blg}$$

and defined as building height in m, measured optimally as weighted mean height (in the case of buildings with multiple parts of different height). It is a required input value not measured within the morphometric assessment itself.

3. Volume of a building is denoted as

$$v_{blg} = a_{blg} \times h_{blg}$$

and defined as the building footprint area multiplied by its height, in m³.

4. Perimeter of a building is denoted as

$$p_{blg}$$

and defined as the sum of lengths of the building exterior walls in m.

5. Courtyard area of a building is denoted as

$$a_{blg_c}$$

and defined as the sum of areas of interior holes in footprint polygons in m².

6. Form factor of a building is denoted as

$$FoF_{blg} = \frac{a_{blg}}{v_{blg}^{2/3}}.$$

It captures a three-dimensional unitless shape characteristic of a building envelope unbiased by the building size (Bourdic et al., 2012).

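As an illustration of characters 3 and 6, the short sketch below computes volume and form factor for a hypothetical building and then for the same shape with all linear dimensions doubled. The function names and the input values are assumptions made for this example only.

```python
def building_volume(area, height):
    """Volume as footprint area (m2) multiplied by height (m)."""
    return area * height


def form_factor(area, volume):
    """Form factor: footprint area divided by volume to the power 2/3."""
    return area / volume ** (2 / 3)


# Hypothetical building: 100 m2 footprint, 10 m high
volume_small = building_volume(100.0, 10.0)
fof_small = form_factor(100.0, volume_small)

# Same shape with every linear dimension doubled: 400 m2 footprint, 20 m high
volume_large = building_volume(400.0, 20.0)
fof_large = form_factor(400.0, volume_large)
```

Both buildings yield the same form factor (up to floating-point error), which is what is meant by the character being unbiased by building size.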
7. Volume to façade ratio of a building is denoted as

$$VFR_{blg} = \frac{v_{blg}}{p_{blg} \times h_{blg}}.$$

It captures the aspect of the three-dimensional shape of a building envelope able to distinguish building types, as shown by Schirmer and Axhausen (2015). It can be seen as a proxy of volumetric compactness.

8. Circular compactness of a building is denoted as

$$CCo_{blg} = \frac{a_{blg}}{a_{blg_C}}$$

where $a_{blg_C}$ is the area of the minimal enclosing circle. It captures the relation of the building footprint shape to its minimal enclosing circle, illustrating the similarity of shape and circle (Dibble et al., 2019).

9. Corners of a building is denoted as

$$Cor_{blg} = \sum_{i=1}^{n} c_{blg}$$

where $c_{blg}$ is defined as a vertex of the building exterior shape with an angle between adjacent line segments ≤ 170 degrees. It uses only the external shape; courtyards are not included. The character is adapted from Steiniger et al. (2008) to exclude non-corner-like vertices.

10. Squareness of a building is denoted as

$$Squ_{blg} = \frac{\sum_{i=1}^{n} D_{c_{blg_i}}}{n}$$

where $D$ is the deviation of the angle of corner $c_{blg_i}$ from 90 degrees and $n$ is the number of corners.

11. Equivalent rectangular index of a building is denoted as

$$ERI_{blg} = \sqrt{\frac{a_{blg}}{a_{blg_B}}} * \frac{p_{blg_B}}{p_{blg}}$$

where $a_{blg_B}$ is the area of the minimal rotated bounding rectangle (MBR) of a building footprint and $p_{blg_B}$ is the perimeter of the MBR. It is a measure of shape complexity identified by Basaraner and Cetinkaya (2017) as the shape character with the best performance.

12. Elongation of a building is denoted as

$$Elo_{blg} = \frac{l_{blg_B}}{w_{blg_B}}$$

where $l_{blg_B}$ is the length of the MBR and $w_{blg_B}$ is the width of the MBR. It captures the ratio of the shorter to the longer dimension of the MBR to indirectly capture the deviation of the shape from a square (Schirmer and Axhausen, 2015).

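The corner-based characters (9 and 10) can be sketched as follows. This is an illustrative reading of the definitions, not the referenced implementation: the helper names are hypothetical, the ring is assumed to be a simple exterior ring given without a repeated closing vertex, and the angle at a vertex is taken as the unsigned angle between the two segments meeting there.

```python
import math


def vertex_angles(ring):
    """Unsigned angle in degrees (0-180) between the two segments meeting
    at each vertex of an exterior ring (closing vertex not repeated)."""
    angles = []
    n = len(ring)
    for i in range(n):
        px, py = ring[i - 1]
        cx, cy = ring[i]
        nx, ny = ring[(i + 1) % n]
        a = (px - cx, py - cy)
        b = (nx - cx, ny - cy)
        cos = (a[0] * b[0] + a[1] * b[1]) / (math.hypot(*a) * math.hypot(*b))
        angles.append(math.degrees(math.acos(max(-1.0, min(1.0, cos)))))
    return angles


def corners_and_squareness(ring, threshold=170.0):
    """Count corners (angle <= threshold) and average their deviation from 90."""
    corner_angles = [a for a in vertex_angles(ring) if a <= threshold]
    n = len(corner_angles)
    squareness = sum(abs(a - 90.0) for a in corner_angles) / n
    return n, squareness


# Hypothetical 2 x 2 square with one redundant vertex on its bottom edge
ring = [(0, 0), (1, 0), (2, 0), (2, 2), (0, 2)]
n_corners, squareness = corners_and_squareness(ring)
```

The redundant vertex at (1, 0) has an angle of 180 degrees and is therefore not counted, which is the purpose of the 170 degree threshold.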
13. Centroid - corner distance deviation of a building is denoted as

$$CCD_{blg} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(ccd_i - \overline{ccd}\right)^2}$$

where $ccd_i$ is the distance between the centroid and corner $i$ and $\overline{ccd}$ is the mean of all distances. It captures the variety of shape. A vertex is considered a corner when its angle is < 170º, to reflect the potential circularity of the object and the topological imprecision of the building polygon.

14. Centroid - corner mean distance of a building is denoted as

$$CCM_{blg} = \frac{1}{n} \left(\sum_{i=1}^{n} ccd_i\right)$$

where $ccd_i$ is the distance between the centroid and corner $i$. It is a character measuring a dimension of the object dependent on its shape (Schirmer and Axhausen, 2015).

15. Solar orientation of a building is denoted as

$$Ori_{blg} = |o_{blg_B} - 45|$$

where $o_{blg_B}$ is the orientation of the longest axis of the bounding rectangle in a range 0 - 45. It captures the deviation of orientation from cardinal directions. There are multiple ways of capturing the orientation of a polygon. As reported by Yan et al. (2007), Duchêne et al. (2003) assessed five different options (longest edge, weighted bisector, wall average, statistical weighting, bounding rectangle) and concluded that the bounding rectangle is the most appropriate. Deviation from cardinal directions is used to avoid sudden changes between square-like objects.

16. Street alignment of a building is denoted as

$$SAl_{blg} = |Ori_{blg} - Ori_{edg}|$$

where $Ori_{blg}$ is the solar orientation of the building and $Ori_{edg}$ is the solar orientation of the street edge. It reflects the relationship between the building and its street, whether it is facing the street directly or indirectly (Schirmer and Axhausen, 2015).

17. Cell alignment of a building is denoted as

$$CAl_{blg} = |Ori_{blg} - Ori_{cell}|$$

where $Ori_{cell}$ is the solar orientation of the tessellation cell. It reflects the relationship between a building and its cell.

18. Longest axis length of a tessellation cell is denoted as

$$LAL_{cell} = d_{cell_C}$$

where $d_{cell_C}$ is the diameter of the minimal circumscribed circle around the tessellation cell polygon. The axis itself does not have to be fully within the polygon. It could be seen as a proxy of plot depth for tessellation-based analysis.

54
55 19. Area of a tessellation cell is denoted as
56
57
58
59
60 https://mc04.manuscriptcentral.com/epb
Environment and Planning B: Urban Analytics and City Science Page 116 of 136

12
1
2
3 𝑎𝑐𝑒𝑙𝑙
4
5 and defined as an area covered by a tessellation cell footprint in m2.
6
7 20. Circular compactness of a tessellation cell is denoted as
8 𝑎𝑐𝑒𝑙𝑙
9 𝐶𝐶𝑜𝑐𝑒𝑙𝑙 =
10 𝑎𝑐𝑒𝑙𝑙𝐶
11
12 where 𝑎𝑐𝑒𝑙𝑙𝐶 is an area of minimal enclosing circle. It captures the relation of tessellation cell
13 footprint shape to its minimal enclosing circle, illustrating the similarity of shape and circle.
14
15 21. Equivalent rectangular index of a tessellation cell is denoted as
16 𝑎𝑐𝑒𝑙𝑙 𝑝𝑐𝑒𝑙𝑙𝐵
17 𝐸𝑅𝐼𝑐𝑒𝑙𝑙 = ∗
18 𝑎𝑐𝑒𝑙𝑙𝐵 𝑝𝑐𝑒𝑙𝑙
19
Fo
20
21 where 𝑎𝑐𝑒𝑙𝑙𝐵 is an area of the minimal rotated bounding rectangle of a tessellation cell (MBR)
22
footprint and 𝑝𝑐𝑒𝑙𝑙𝐵 its perimeter of MBR.
rR
23
24
25 22. Solar orientation of a tessellation cell is denoted as
26 𝑂𝑟𝑖𝑐𝑒𝑙𝑙 = |𝑜𝑐𝑒𝑙𝑙𝐵 ― 45|
ev

27
28 where 𝑜𝑐𝑒𝑙𝑙𝐵 is an orientation of the longest axis of bounding rectangle in a range 0 - 45. It
29 captures the deviation of orientation from cardinal directions.
iew

30
31 23. Street alignment of a building is denoted as
32 𝑆𝐴𝑙𝑐𝑒𝑙𝑙 = |𝑂𝑟𝑖𝑐𝑒𝑙𝑙 ― 𝑂𝑟𝑖𝑒𝑑𝑔|
33
34 where 𝑂𝑟𝑖𝑐𝑒𝑙𝑙 is a solar orientation of tessellation cell and 𝑂𝑟𝑖𝑒𝑑𝑔 is a solar orientation of the
On

35
36
street edge. It reflects the relationship between tessellation cell and its street, whether it is facing
37 the street directly or indirectly.
38
24. Coverage area ratio of a tessellation cell is denoted as
ly

39
40 𝑎𝑏𝑙𝑔
41 𝐶𝐴𝑅𝑐𝑒𝑙𝑙 =
𝑎𝑐𝑒𝑙𝑙
42
43 where 𝑎𝑏𝑙𝑔 is an area of a building and 𝑎𝑐𝑒𝑙𝑙 is an area of related tessellation cell (Schirmer and
44
45
Axhausen, 2015). Coverage area ratio (CAR) is one of the commonly used characters capturing
46 intensity of development. However, the definitions vary based on the spatial unit.
47
48 25. Floor area ratio of a tessellation cell is denoted as
49 𝑓𝑎𝑏𝑙𝑔
50 𝐹𝐴𝑅𝑐𝑒𝑙𝑙 =
𝑎𝑐𝑒𝑙𝑙
51
52 where 𝑓𝑎𝑏𝑙𝑔 is a floor area of a building and 𝑎𝑐𝑒𝑙𝑙 is an area of related tessellation cell. Floor area
53
54
could be computed based on the number of levels or using an approximation based on building
55 height.
56
57
58
59
60 https://mc04.manuscriptcentral.com/epb
Page 117 of 136 Environment and Planning B: Urban Analytics and City Science

13
1
2
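The two intensity characters (24 and 25) reduce to simple ratios once the floor area is known. The sketch below approximates floor area from building height; the storey height of 3 m, the rule of at least one floor, the function names and the input values are all assumptions made for illustration, not values taken from this work.

```python
def coverage_area_ratio(building_area, cell_area):
    """CAR: building footprint area divided by tessellation cell area."""
    return building_area / cell_area


def floor_area_from_height(building_area, height, storey_height=3.0):
    """Approximate floor area as footprint area times the number of whole
    storeys that fit into the building height (assumed: at least one)."""
    floors = max(1, int(height // storey_height))
    return building_area * floors


def floor_area_ratio(floor_area, cell_area):
    """FAR: building floor area divided by tessellation cell area."""
    return floor_area / cell_area


# Hypothetical building of 120 m2 and 9.5 m on a 400 m2 tessellation cell
car = coverage_area_ratio(120.0, 400.0)
floor_area = floor_area_from_height(120.0, 9.5)
far = floor_area_ratio(floor_area, 400.0)
```

When the number of levels is available in the input data, it can replace the height-based approximation directly.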
26. Length of a street segment is denoted as

$$l_{edg}$$

and defined as the length of a LineString geometry in metres.

27. Width of a street profile is denoted as

$$w_{sp} = \frac{1}{n} \left(\sum_{i=1}^{n} w_i\right)$$

where $w_i$ is the width of street section $i$. The algorithm generates street sections every 3 metres along the street segment and measures the mean value. In the case of an open-ended street, 50 metres is used as a perception-based proximity limit (Araldi and Fusco, 2019).

28. Height of a street profile is denoted as

$$h_{sp} = \frac{1}{n} \left(\sum_{i=1}^{n} h_i\right)$$

where $h_i$ is the mean height of street section $i$. The algorithm generates street sections every 3 metres along the street segment and measures the mean value (Araldi and Fusco, 2019).

29. Height to width ratio of a street profile is denoted as

$$HWR_{sp} = \frac{1}{n} \left(\sum_{i=1}^{n} \frac{h_i}{w_i}\right)$$

where $h_i$ is the mean height of street section $i$ and $w_i$ is the width of street section $i$. The algorithm generates street sections every 3 metres along the street segment and measures the mean value (Araldi and Fusco, 2019).

30. Openness of a street profile is denoted as

$$Ope_{sp} = 1 - \frac{\sum hit}{2 \sum sec}$$

where $\sum hit$ is the number of section lines (left and right sides counted separately) intersecting buildings and $\sum sec$ is the total number of street sections. The algorithm generates street sections every 3 metres along the street segment.

31. Width deviation of a street profile is denoted as

$$wDev_{sp} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(w_i - w_{sp}\right)^2}$$

where $w_i$ is the width of street section $i$ and $w_{sp}$ is the mean width. The algorithm generates street sections every 3 metres along the street segment.

32. Height deviation of a street profile is denoted as

$$hDev_{sp} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(h_i - h_{sp}\right)^2}$$

where $h_i$ is the height of street section $i$ and $h_{sp}$ is the mean height. The algorithm generates street sections every 3 metres along the street segment.

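Once the street sections have been generated, the street profile characters (27 to 32) are simple aggregations of per-section measurements. The sketch below starts from already measured sections and assumes that the deviations are population standard deviations; the function name, the dictionary keys and the input values are hypothetical, and the generation of sections every 3 metres is not reproduced here.

```python
from statistics import fmean, pstdev


def street_profile(widths, heights, hits):
    """Aggregate per-section measurements of one street segment.

    widths, heights: one value per street section.
    hits: number of section half-lines (left and right counted
          separately) that intersect a building.
    """
    n_sections = len(widths)
    ratios = [h / w for h, w in zip(heights, widths)]
    return {
        "width": fmean(widths),
        "height": fmean(heights),
        "hw_ratio": fmean(ratios),
        "openness": 1 - hits / (2 * n_sections),
        "width_dev": pstdev(widths),
        "height_dev": pstdev(heights),
    }


# Hypothetical segment with four sections; 6 of the 8 half-lines hit a building
profile = street_profile(
    widths=[10.0, 20.0, 30.0, 20.0],
    heights=[5.0, 10.0, 15.0, 10.0],
    hits=6,
)
```

In this example a quarter of the section half-lines reach no building, so the openness is 0.25.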
33. Linearity of a street segment is denoted as

$$Lin_{edg} = \frac{l_{eucl}}{l_{edg}}$$

where $l_{eucl}$ is the Euclidean distance between the endpoints of a street segment and $l_{edg}$ is the street segment length. It captures the deviation of a segment shape from a straight line. It is adapted from Araldi and Fusco (2019).

34. Area covered by a street segment is denoted as

$$a_{edg} = \sum_{i=1}^{n} a_{cell_i}$$

where $a_{cell_i}$ is the area of tessellation cell $i$ belonging to the street segment. It captures the area which is likely served by each segment.

35. Buildings per meter of a street segment is denoted as

$$BpM_{edg} = \frac{\sum blg}{l_{edg}}$$

where $\sum blg$ is the number of buildings belonging to a street segment and $l_{edg}$ is the length of the street segment. It reflects the granularity of development along each segment.

36. Area covered by a street node is denoted as

$$a_{node} = \sum_{i=1}^{n} a_{cell_i}$$

where $a_{cell_i}$ is the area of tessellation cell $i$ belonging to the street node. It captures the area which is likely served by each node.

37. Shared walls ratio of adjacent buildings is denoted as

$$SWR_{blg} = \frac{p_{blg_{shared}}}{p_{blg}}$$

where $p_{blg_{shared}}$ is the length of the perimeter shared with adjacent buildings and $p_{blg}$ is the perimeter of a building. It captures the amount of wall space facing the open space (Hamaina et al., 2012).

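Linearity (character 33) can be illustrated with a street segment given as a list of coordinates. This is a minimal sketch; the function name and the two example segments are hypothetical.

```python
import math


def linearity(coords):
    """Euclidean distance between the endpoints of a polyline divided by
    the length of the polyline itself."""
    length = sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))
    return math.dist(coords[0], coords[-1]) / length


# A straight segment and a segment with a right-angle bend
straight = linearity([(0, 0), (5, 0), (10, 0)])
bent = linearity([(0, 0), (3, 0), (3, 4)])
```

A straight segment yields 1, while the bent segment (length 7, endpoint distance 5) yields 5/7; the more a segment winds, the closer the value gets to 0.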
38. Alignment of neighbouring buildings is denoted as

$$Ali_{blg} = \frac{1}{n} \sum_{i=1}^{n} |Ori_{blg} - Ori_{blg_i}|$$

where $Ori_{blg}$ is the solar orientation of a building and $Ori_{blg_i}$ is the solar orientation of building $i$ on a neighbouring tessellation cell. It calculates the mean deviation of the solar orientation of buildings on adjacent cells from that of a building. It is adapted from Hijazi et al. (2016).

39. Mean distance to neighbouring buildings is denoted as

$$NDi_{blg} = \frac{1}{n} \sum_{i=1}^{n} d_{blg,blg_i}$$

where $d_{blg,blg_i}$ is the distance between a building and building $i$ on a neighbouring tessellation cell. It is adapted from Hijazi et al. (2016). It captures the average proximity to other buildings.

40. Weighted neighbours of a tessellation cell is denoted as

$$WNe_{cell} = \frac{\sum cell_n}{p_{cell}}$$

where $\sum cell_n$ is the number of cell neighbours and $p_{cell}$ is the perimeter of a cell. It reflects the granularity of morphological tessellation.

41. Area covered by neighbouring cells is denoted as

$$a_{cell_n} = \sum_{i=1}^{n} a_{cell_i}$$

where $a_{cell_i}$ is the area of tessellation cell $i$ within topological distance 1. It captures the scale of morphological tessellation.

42. Reached cells by neighbouring segments is denoted as

$$RC_{edg_n} = \sum_{i=1}^{n} cells_{edg_i}$$

where $cells_{edg_i}$ is the number of tessellation cells on segment $i$ within topological distance 1. It captures accessible granularity.

43. Reached area by neighbouring segments is denoted as

$$a_{edg_n} = \sum_{i=1}^{n} a_{edg_i}$$

where $a_{edg_i}$ is the area covered by street segment $i$ within topological distance 1. It captures an accessible area.

44. Degree of a street node is denoted as

$$deg_{node_i} = \sum_{j} edg_{ij}$$

where $edg_{ij}$ is an edge of a street network between node $i$ and node $j$. It reflects the basic degree centrality.

45. Mean distance to neighbouring nodes from a street node is denoted as

$$MDi_{node} = \frac{1}{n} \sum_{i=1}^{n} d_{node,node_i}$$

where $d_{node,node_i}$ is the distance between a node and node $i$ within topological distance 1. It captures the average proximity to other nodes.

46. Reached cells by neighbouring nodes is denoted as

$$RC_{node_n} = \sum_{i=1}^{n} cells_{node_i}$$

where $cells_{node_i}$ is the number of tessellation cells on node $i$ within topological distance 1. It captures accessible granularity.

47. Reached area by neighbouring nodes is denoted as

$$a_{node_n} = \sum_{i=1}^{n} a_{node_i}$$

where $a_{node_i}$ is the area covered by street node $i$ within topological distance 1. It captures an accessible area.

48. Number of courtyards of adjacent buildings is denoted as

$$NCo_{blg_{adj}}$$

where $NCo_{blg_{adj}}$ is the number of interior rings of a polygon composed of the footprints of adjacent buildings (Schirmer and Axhausen, 2015).

49. Perimeter wall length of adjacent buildings is denoted as

$$p_{blg_{adj}}$$

where $p_{blg_{adj}}$ is the length of the exterior ring of a polygon composed of the footprints of adjacent buildings.

50. Mean inter-building distance between neighbouring buildings is denoted as

$$IBD_{blg} = \frac{1}{n} \sum_{i=1}^{n} d_{blg,blg_i}$$

where $d_{blg,blg_i}$ is the distance between a building and building $i$ on a tessellation cell within topological distance 3. It is adapted from Caruso et al. (2017). It captures the average proximity between buildings.

51. Building adjacency of neighbouring buildings is denoted as

$$BuA_{blg} = \frac{\sum blg_{adj}}{\sum blg}$$

where $\sum blg_{adj}$ is the number of joined built-up structures within topological distance 3 and $\sum blg$ is the number of buildings within topological distance 3. It is adapted from Vanderhaegen and Canters (2017).

52. Gross floor area ratio of neighbouring tessellation cells is denoted as

$$GFAR_{cell} = \frac{\sum_{i=1}^{n} FAR_{cell_i}}{\sum_{i=1}^{n} a_{cell_i}}$$

where $FAR_{cell_i}$ is the floor area ratio of tessellation cell $i$ and $a_{cell_i}$ is the area of tessellation cell $i$ within topological distance 3. Based on Dibble et al. (2019).

53. Weighted reached blocks of neighbouring tessellation cells is denoted as

$$WRB_{cell} = \frac{\sum blk}{\sum_{i=1}^{n} a_{cell_i}}$$

where $\sum blk$ is the number of blocks within topological distance 3 and $a_{cell_i}$ is the area of tessellation cell $i$ within topological distance 3.

54. Area of a block is denoted as

$$a_{blk}$$

and defined as the area covered by a block footprint in m².

55. Perimeter of a block is denoted as

$$p_{blk}$$

and defined as the length of the block polygon exterior in m.

56. Circular compactness of a block is denoted as

$$CCo_{blk} = \frac{a_{blk}}{a_{blk_C}}$$

where $a_{blk_C}$ is the area of the minimal enclosing circle. It captures the relation of the block footprint shape to its minimal enclosing circle, illustrating the similarity of shape and circle.

57. Equivalent rectangular index of a block is denoted as

$$ERI_{blk} = \sqrt{\frac{a_{blk}}{a_{blk_B}}} * \frac{p_{blk_B}}{p_{blk}}$$

where $a_{blk_B}$ is the area of the minimal rotated bounding rectangle (MBR) of a block footprint and $p_{blk_B}$ is the perimeter of the MBR.

58. Compactness-weighted axis of a block is denoted as

$$CWA_{blk} = d_{blk_C} \times \left(\frac{4}{\pi} - \frac{16\,(a_{blk})}{p_{blk}^{2}}\right)$$

where $d_{blk_C}$ is the diameter of the minimal circumscribed circle around the block polygon, $a_{blk}$ is the area of a block and $p_{blk}$ is the perimeter of a block. It is a proxy of permeability of an area (Feliciotti, 2018).

59. Solar orientation of a block is denoted as

$$Ori_{blk} = |o_{blk_B} - 45|$$

where $o_{blk_B}$ is the orientation of the longest axis of the bounding rectangle in a range 0 - 45. It captures the deviation of orientation from cardinal directions.

60. Weighted neighbours of a block is denoted as

$$wN_{blk} = \frac{\sum blk_n}{p_{blk}}$$

where $\sum blk_n$ is the number of block neighbours and $p_{blk}$ is the perimeter of a block. It reflects the granularity of a mesh of blocks.

61. Weighted cells of a block is denoted as

$$wC_{blk} = \frac{\sum cell}{a_{blk}}$$

where $\sum cell$ is the number of cells composing a block and $a_{blk}$ is the area of a block. It captures the granularity of each block.

62. Local meshedness of a street network is denoted as

$$Mes_{node} = \frac{e - v + 1}{2v - 5}$$

where $e$ is the number of edges in a subgraph and $v$ is the number of nodes in a subgraph (Feliciotti, 2018). A subgraph is defined as a network within topological distance 5 around a node.

63. Mean segment length of a street network is denoted as

$$MSL_{edg} = \frac{1}{n} \sum_{i=1}^{n} l_{edg_i}$$

where $l_{edg_i}$ is the length of street segment $i$ within topological distance 3 around a segment.

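Local meshedness (character 62) can be sketched on a toy network as follows. The sketch assumes a simple undirected graph (no parallel edges or self-loops) stored as an adjacency dictionary; the function names and the 3 x 3 grid are hypothetical and serve only as an illustration.

```python
from collections import deque


def nodes_within(adjacency, centre, radius):
    """Set of nodes within `radius` topological steps of `centre`."""
    distance = {centre: 0}
    queue = deque([centre])
    while queue:
        node = queue.popleft()
        if distance[node] == radius:
            continue
        for neighbour in adjacency[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return set(distance)


def local_meshedness(adjacency, centre, radius=5):
    """(e - v + 1) / (2v - 5) on the subgraph around `centre`."""
    nodes = nodes_within(adjacency, centre, radius)
    v = len(nodes)
    # every undirected edge is listed from both of its ends, hence // 2
    e = sum(1 for a in nodes for b in adjacency[a] if b in nodes) // 2
    return (e - v + 1) / (2 * v - 5)


# Hypothetical 3 x 3 grid of intersections: 9 nodes, 12 edges
steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
grid = {
    (i, j): [(i + di, j + dj) for di, dj in steps
             if 0 <= i + di < 3 and 0 <= j + dj < 3]
    for i in range(3) for j in range(3)
}
mesh_grid = local_meshedness(grid, (1, 1))

# A tree-like network (a path of four nodes) contains no cycles
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
mesh_path = local_meshedness(path, 0)
```

The grid gives (12 - 9 + 1) / (2 × 9 - 5) = 4/13, while the tree-like path gives 0, since it contains no closed loops.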
64. Cul-de-sac length of a street network is denoted as

$$CDL_{node} = \sum_{i=1}^{n} l_{edg_i}, \text{ if } edg_i \text{ is cul-de-sac}$$

where $l_{edg_i}$ is the length of street segment $i$ within topological distance 3 around a node.

65. Reached cells by street network segments is denoted as

$$RC_{edg} = \sum_{i=1}^{n} cells_{edg_i}$$

where $cells_{edg_i}$ is the number of tessellation cells on segment $i$ within topological distance 3. It captures accessible granularity.

66. Node density of a street network is denoted as

$$D_{node} = \frac{\sum node}{\sum_{i=1}^{n} l_{edg_i}}$$

where $\sum node$ is the number of nodes within a subgraph and $l_{edg_i}$ is the length of segment $i$ within a subgraph. A subgraph is defined as a network within topological distance 5 around a node.

67. Reached cells by street network nodes is denoted as

$$RC_{node_{net}} = \sum_{i=1}^{n} cells_{node_i}$$

where $cells_{node_i}$ is the number of tessellation cells on node $i$ within topological distance 3. It captures accessible granularity.

68. Reached area by street network nodes is denoted as

$$a_{node_{net}} = \sum_{i=1}^{n} a_{node_i}$$

where $a_{node_i}$ is the area covered by street node $i$ within topological distance 3. It captures an accessible area.

69. Proportion of cul-de-sacs within a street network is denoted as

$$pCD_{node} = \frac{\sum_{i=1}^{n} node_i, \text{ if } deg_{node_i} = 1}{\sum_{i=1}^{n} node_i}$$

where $node_i$ is a node within topological distance 5 around a node. Adapted from Boeing (2017).

70. Proportion of 3-way intersections within a street network is denoted as

$$p3W_{node} = \frac{\sum_{i=1}^{n} node_i, \text{ if } deg_{node_i} = 3}{\sum_{i=1}^{n} node_i}$$

where $node_i$ is a node within topological distance 5 around a node. Adapted from Boeing (2017).

71. Proportion of 4-way intersections within a street network is denoted as

$$p4W_{node} = \frac{\sum_{i=1}^{n} node_i, \text{ if } deg_{node_i} = 4}{\sum_{i=1}^{n} node_i}$$

where $node_i$ is a node within topological distance 5 around a node. Adapted from Boeing (2017).

72. Weighted node density of a street network is denoted as

$$wD_{node} = \frac{\sum_{i=1}^{n} deg_{node_i} - 1}{\sum_{i=1}^{n} l_{edg_i}}$$

where $deg_{node_i}$ is the degree of node $i$ within a subgraph and $l_{edg_i}$ is the length of segment $i$ within a subgraph. A subgraph is defined as a network within topological distance 5 around a node.

73. Local closeness centrality of a street network is denoted as

$$lCC_{node} = \frac{n - 1}{\sum_{v=1}^{n-1} d(v,u)}$$

where $d(v,u)$ is the shortest-path distance between $v$ and $u$, and $n$ is the number of nodes within a subgraph. A subgraph is defined as a network within topological distance 5 around a node.

74. Square clustering of a street network is denoted as

$$sCl_{node} = \frac{\sum_{u=1}^{k_v} \sum_{w=u+1}^{k_v} q_v(u,w)}{\sum_{u=1}^{k_v} \sum_{w=u+1}^{k_v} \left[a_v(u,w) + q_v(u,w)\right]}$$

where $q_v(u,w)$ is the number of common neighbours of $u$ and $w$ other than $v$ (i.e. squares), and $a_v(u,w) = (k_u - (1 + q_v(u,w) + \theta_{uv}))(k_w - (1 + q_v(u,w) + \theta_{uw}))$, where $\theta_{uw} = 1$ if $u$ and $w$ are connected and 0 otherwise (Lind et al., 2005).

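The three proportion characters (69 to 71) share one pattern: collect the nodes within a given topological distance and count those of a particular degree. The sketch below is one possible reading, assuming a simple undirected graph stored as an adjacency dictionary and node degree measured in the full network rather than in the subgraph; the function names and the toy network are hypothetical.

```python
from collections import Counter, deque


def nodes_within(adjacency, centre, radius):
    """Set of nodes within `radius` topological steps of `centre`."""
    distance = {centre: 0}
    queue = deque([centre])
    while queue:
        node = queue.popleft()
        if distance[node] == radius:
            continue
        for neighbour in adjacency[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return set(distance)


def degree_proportions(adjacency, centre, radius=5):
    """Share of cul-de-sacs (degree 1), 3-way and 4-way intersections
    among the nodes around `centre`."""
    nodes = nodes_within(adjacency, centre, radius)
    counts = Counter(len(adjacency[node]) for node in nodes)
    total = len(nodes)
    return {
        "cul_de_sacs": counts[1] / total,
        "three_way": counts[3] / total,
        "four_way": counts[4] / total,
    }


# Hypothetical network: a 4-way crossing "c" whose eastern arm ends in a
# 3-way junction "e"; all remaining nodes are dead ends
network = {
    "c": ["n", "e", "s", "w"],
    "n": ["c"],
    "s": ["c"],
    "w": ["c"],
    "e": ["c", "e1", "e2"],
    "e1": ["e"],
    "e2": ["e"],
}
proportions = degree_proportions(network, "c")
```

Of the seven nodes in this network, five are dead ends, one is a 3-way and one is a 4-way intersection.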
The table below contains each character and its classification to scale following <masked for blind review>, and the key used in additional figures across the supplementary materials.

| index | element | grain | extent | id |
|---|---|---|---|---|
| area | building | S | S | sdbAre |
| height | building | S | S | sdbHei |
| volume | building | S | S | sdbVol |
| perimeter | building | S | S | sdbPer |
| courtyard area | building | S | S | sdbCoA |
| form factor | building | S | S | ssbFoF |
| volume to façade ratio | building | S | S | ssbVFR |
| circular compactness | building | S | S | ssbCCo |
| corners | building | S | S | ssbCor |
| squareness | building | S | S | ssbSqu |
| equivalent rectangular index | building | S | S | ssbERI |
| elongation | building | S | S | ssbElo |
| centroid - corner distance deviation | building | S | S | ssbCCD |
| centroid - corner mean distance | building | S | S | ssbCCM |
| solar orientation | building | S | S | stbOri |
| street alignment | building | S | S | stbSAl |
| cell alignment | building | S | S | stbCeA |
| longest axis length | tessellation cell | S | S | sdcLAL |
| area | tessellation cell | S | S | sdcAre |
| circular compactness | tessellation cell | S | S | sscCCo |
| equivalent rectangular index | tessellation cell | S | S | sscERI |
| solar orientation | tessellation cell | S | S | stcOri |
| street alignment | tessellation cell | S | S | stcSAl |
| coverage area ratio | tessellation cell | S | S | sicCAR |
| floor area ratio | tessellation cell | S | S | sicFAR |
| length | street segment | S | S | sdsLen |
| width | street profile | S | S | sdsSPW |
| height | street profile | S | S | sdsSPH |
| height to width ratio | street profile | S | S | sdsSPR |
| openness | street profile | S | S | sdsSPO |
| width deviation | street profile | S | S | sdsSWD |
| height deviation | street profile | S | S | sdsSHD |
| linearity | street segment | S | S | sssLin |
| area covered | street segment | S | S | sdsAre |
| buildings per meter | street segment | S | S | sisBpM |
| area covered | street node | S | S | sddAre |
| shared walls ratio | adjacent buildings | S | S | mtbSWR |
| alignment | neighbouring buildings | S | S | mtbAli |
| mean distance | neighbouring buildings | S | S | mtbNDi |
| weighted neighbours | tessellation cell | S | S | mtcWNe |
| area covered | neighbouring cells | S | S | mdcAre |
| reached cells | neighbouring segments | S | S | misRea |
| reached area | neighbouring segments | S | S | mdsAre |
| degree | street node | S | S | mtdDeg |
| mean distance to neighbouring nodes | street node | S | S | mtdMDi |
| reached cells | neighbouring nodes | S | S | midRea |
| reached area | neighbouring nodes | S | S | midAre |
| number of courtyards | adjacent buildings | S | S | libNCo |
| perimeter wall length | adjacent buildings | S | S | ldbPWL |
| mean inter-building distance | neighbouring buildings | S | S | ltbIBD |
| building adjacency | neighbouring buildings | S | S | ltcBuA |
| gross floor area ratio | neighbouring tessellation cells | S | S | licGDe |
| weighted reached blocks | neighbouring tessellation cells | S | S | ltcWRB |
| area | block | S | S | ldkAre |
| perimeter | block | S | S | ldkPer |
| circular compactness | block | S | S | lskCCo |
| equivalent rectangular index | block | S | S | lskERI |
| compactness-weighted axis | block | S | S | lskCWA |
| solar orientation | block | S | S | ltkOri |
| weighted neighbours | block | S | S | ltkWNB |
| weighted cells | block | S | S | likWBB |
| local meshedness | street network | S | M | lcdMes |
| mean segment length | street network | S | S | ldsMSL |
| cul-de-sac length | street network | S | S | ldsCDL |
| reached cells | street network | S | S | ldsRea |
| node density | street network | S | M | lddNDe |
| reached cells | street network | S | S | lddRea |
| reached area | street network | S | S | lddARe |
| proportion of cul-de-sacs | street network | S | M | linPDE |
| proportion of 3-way intersections | street network | S | M | linP3W |
| proportion of 4-way intersections | street network | S | M | linP4W |
| weighted node density | street network | S | M | linWID |
| local closeness centrality | street network | S | M | lcnClo |
| square clustering | street network | S | L | xcnSCl |

Table S2: Additional classification of primary morphometric characters.

Supplementary Material 3: Bayesian Information Criterion

Figure S5: Bayesian Information Criterion score for the variable number of components in the Prague case study. The shaded area reflects the .95 confidence interval.

Figure S6: Bayesian Information Criterion score for the variable number of components in the Amsterdam case study. The shaded area reflects the .95 confidence interval; the red line marks the first significant minimum.
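
As a rough illustration of how curves such as those in Figures S5 and S6 can be assembled, the sketch below computes the Bayesian Information Criterion from a model's log-likelihood and summarises repeated fits by their mean and an approximate 95% interval. The function names, the toy scores and the normal-approximation interval are assumptions made for illustration only; this section does not state how the interval or the "first significant minimum" was derived.

```python
from math import log, sqrt
from statistics import mean, stdev


def bic(log_likelihood, n_params, n_samples):
    """Bayesian Information Criterion; lower values indicate a better trade-off."""
    return n_params * log(n_samples) - 2.0 * log_likelihood


def mean_with_ci95(scores):
    """Mean of the scores and a normal-approximation 95% half-width (assumed rule)."""
    centre = mean(scores)
    half_width = 1.96 * stdev(scores) / sqrt(len(scores))
    return centre, half_width


# Hypothetical BIC scores from three repeated fits per number of components.
runs = {
    2: [1050.0, 1048.0, 1052.0],
    3: [990.0, 992.0, 988.0],
    4: [1001.0, 999.0, 1000.0],
}

# Summarise each candidate and pick the one with the lowest mean score.
summary = {k: mean_with_ci95(scores) for k, scores in runs.items()}
best_k = min(summary, key=lambda k: summary[k][0])
print(best_k, summary[best_k])
```

Picking the global minimum of the mean score is the simplest possible selection rule; a rule based on the first minimum that is distinguishable from its neighbours would need the interval widths as well.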

Supplementary Material 4: Full extent of presented maps illustrating spatial distribution of results of cluster analysis.

Figure S7: Spatial distribution of 10 detected clusters in Prague.

Figure S8: Spatial distribution of 10 detected clusters in Amsterdam.

Figure S9: Spatial distribution of different branches of the combined dendrogram in Prague.

Figure S10: Spatial distribution of different branches of the combined dendrogram in Amsterdam.

Supplementary Material 5: Contingency tables

cluster | 1840 | 1880 | 1920 | 1950 | 1970 | 1990 | 2012
0 | 349 | 85 | 263 | 1219 | 1506 | 1442 | 565
1 | 1138 | 513 | 3588 | 17095 | 4499 | 1265 | 1453
2 | 1407 | 621 | 1655 | 4537 | 3108 | 2530 | 1357
3 | 1392 | 1719 | 2658 | 2895 | 678 | 223 | 213
4 | 145 | 54 | 156 | 888 | 1993 | 6414 | 532
5 | 3442 | 568 | 1487 | 7677 | 3975 | 2084 | 3459
6 | 1413 | 2778 | 4109 | 2005 | 150 | 4 | 8
7 | 3177 | 110 | 73 | 49 | 0 | 0 | 1
8 | 2834 | 981 | 2661 | 9645 | 4259 | 2629 | 829
9 | 69 | 63 | 151 | 3764 | 1147 | 1573 | 1244

Table S3: Contingency table showing the counts of features per historical origin within individual clusters in the Prague case study.

cluster | Multi-family housing | Single-family housing | Villas | Industry small | Industry large | other
0 | 112 | 617 | 3 | 322 | 1138 | 3497
1 | 437 | 27953 | 1164 | 3 | 0 | 33
2 | 3706 | 7238 | 203 | 972 | 789 | 2830
3 | 8472 | 577 | 136 | 93 | 26 | 626
4 | 9553 | 748 | 0 | 0 | 0 | 17
5 | 75 | 21590 | 147 | 50 | 22 | 1156
6 | 10070 | 231 | 153 | 0 | 0 | 34
7 | 2374 | 6 | 0 | 0 | 0 | 1057
8 | 4296 | 18110 | 1080 | 117 | 60 | 340
9 | 868 | 7015 | 79 | 0 | 0 | 120

Table S4: Contingency table showing the counts of features per predominant land use within individual clusters in the Prague case study.

cluster | organic | perimeter block | village | garden city | modernism | production | services
0 | 0 | 17 | 377 | 213 | 39 | 3216 | 352
1 | 0 | 3 | 11384 | 16150 | 100 | 1 | 0
2 | 8 | 453 | 2937 | 2859 | 1394 | 2383 | 1085
3 | 192 | 6516 | 100 | 725 | 248 | 234 | 197
4 | 0 | 54 | 192 | 324 | 8782 | 17 | 49
5 | 0 | 0 | 13298 | 7824 | 40 | 33 | 61
6 | 604 | 8522 | 8 | 575 | 6 | 0 | 0
7 | 3281 | 49 | 0 | 0 | 0 | 3 | 78
8 | 0 | 263 | 6614 | 9900 | 2189 | 98 | 78
9 | 0 | 0 | 880 | 3176 | 1112 | 0 | 62

Table S5: Contingency table showing the counts of features per expert typology class within individual clusters in the Prague case study.

cluster | 1800 | 1850 | 1900 | 1930 | 1945 | 1960 | 1975 | 1985 | 1995 | 2005 | 2020
0 | 2 | 6 | 25 | 653 | 757 | 5541 | 11488 | 10448 | 10153 | 3362 | 3327
1 | 314 | 0 | 5201 | 17479 | 5118 | 325 | 60 | 395 | 743 | 241 | 110
2 | 65 | 42 | 360 | 1794 | 914 | 1409 | 1949 | 1258 | 1280 | 1597 | 1230
3 | 59 | 27 | 303 | 2133 | 1072 | 1244 | 2189 | 1512 | 1906 | 1990 | 1452
4 | 2 | 0 | 62 | 32 | 27 | 81 | 267 | 288 | 420 | 477 | 361
5 | 927 | 24 | 2000 | 5825 | 2824 | 6583 | 3236 | 2564 | 3854 | 3662 | 3393
6 | 111 | 45 | 713 | 5116 | 2366 | 4643 | 8811 | 4463 | 5696 | 4171 | 3089
7 | 7153 | 98 | 1531 | 1828 | 692 | 145 | 213 | 362 | 722 | 386 | 125
8 | 31 | 24 | 371 | 7976 | 6716 | 11113 | 5369 | 1948 | 7652 | 2948 | 3739
9 | 127 | 25 | 359 | 658 | 322 | 1153 | 2453 | 1478 | 2082 | 2122 | 1698

Table S6: Contingency table showing the counts of features per historical origin within individual clusters in the Amsterdam case study.

Case study | Data | Degrees of Freedom | N | χ² | p-value | Cramér’s V
Prague | Historical origin | 54 | 140315 | 91599 | < .001 | 0.331
Prague | Land use | 45 | 140315 | 153672 | < .001 | 0.468
Prague | Qualitative classification | 54 | 119413 | 325351 | < .001 | 0.674
Amsterdam | Historical origin | 90 | 252385 | 218457 | < .001 | 0.311

Table S7: Reported Chi-square and Cramér’s V results for each tested dataset. All results indicate a significant relationship as per the Chi-square statistic and moderate to high association as per Cramér’s V. V < .3 indicates low, .3 - .5 moderate, and > .5 high association.
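
To make the link between the contingency tables and Table S7 concrete, the following minimal sketch computes the Pearson Chi-square statistic, its degrees of freedom and Cramér's V from a table of counts, using the Table S4 counts as input. The function name is hypothetical, and the sketch assumes the standard uncorrected form V = sqrt(χ² / (N · min(r − 1, c − 1))); whether the authors used this form or a bias-corrected variant is not stated in this section.

```python
from math import sqrt


def chi_square_and_cramers_v(table):
    """Return (chi2, dof, n, v) for a rectangular table of counts.

    Assumes every row and every column has a non-zero total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    n_rows, n_cols = len(table), len(table[0])
    dof = (n_rows - 1) * (n_cols - 1)
    v = sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))
    return chi2, dof, n, v


# Counts from Table S4 (clusters 0-9 by predominant land use, Prague).
table_s4 = [
    [112, 617, 3, 322, 1138, 3497],
    [437, 27953, 1164, 3, 0, 33],
    [3706, 7238, 203, 972, 789, 2830],
    [8472, 577, 136, 93, 26, 626],
    [9553, 748, 0, 0, 0, 17],
    [75, 21590, 147, 50, 22, 1156],
    [10070, 231, 153, 0, 0, 34],
    [2374, 6, 0, 0, 0, 1057],
    [4296, 18110, 1080, 117, 60, 340],
    [868, 7015, 79, 0, 0, 120],
]

chi2, dof, n, v = chi_square_and_cramers_v(table_s4)
print(f"chi2={chi2:.0f}, dof={dof}, N={n}, V={v:.3f}")
```

The printed values can be compared against the Prague land use row of Table S7. The p-value is omitted because it requires the Chi-square distribution function, which is not part of the Python standard library.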

Data and Code

The reproducible Python code is available in the form of Jupyter notebooks at <anonymised>.

The work is accompanied by an open-source Python package (available at <anonymised>).

The morphological data (buildings, streets) for the Prague case study were obtained from the city's open data portal (https://www.geoportalpraha.cz/en), while the validation layers were provided by the Prague Institute of Planning and Development. The morphological data for Amsterdam were obtained from the 3D BAG repository (Dukai, 2020) and Basisregistratie Grootschalige Topografie, BGT (http://data.nlextract.nl/).

- Dukai, B. (2020) ‘3D Registration of Buildings and Addresses (BAG) / 3D Basisregistratie Adressen en Gebouwen (BAG)’, 4TU.ResearchData. doi: 10.4121/uuid:f1f9759d-024a-492a-b821-07014dd6131c.