Data Report
ON
Bachelor of Technology
In
Computer Science Engineering
(BIKANER TECHNICAL UNIVERSITY BIKANER)
SESSION 2022-2023
CERTIFICATE
This is to certify that the seminar report entitled “Evolution of Urban Patterns: Urban Morphology as an
Open Reproducible Data Science” has been submitted by Ms. Sanjana Agrawal in partial fulfillment
of the requirements for the degree of B.Tech in Computer Science Engineering for the academic session
2022-23. She has undergone the requisite work as prescribed by Bikaner Technical University, Bikaner
(Rajasthan).
Place:- Ajmer
Date:- 30-01-2023
ACKNOWLEDGEMENT
I take this opportunity to express my gratitude to all those who have supported me, directly or indirectly,
during the completion of this seminar.
I am grateful to the seminar in-charge, Mrs. Payal Awwal, whose guidance made the seminar work
successful.
I sincerely thank our principal, Dr. J.K. Deegwal, for his valuable support.
I extend my thanks to Mrs. Meeta Sharma, Head of the Department, for her constant support and motivation.
Sanjana Agrawal
B.Tech VII Semester
ABSTRACT
The recent growth of geographic data science (GDS) fuelled by increasingly available open data and
open source tools has influenced urban sciences across a multitude of fields. Yet there is limited
application in urban morphology—a science of urban form. Although quantitative approaches to
morphological research are finding momentum, existing tools for such analyses have limited scope
and are predominantly implemented as plug-ins for standalone geographic information system
software. This inherently restricts transparency and reproducibility of research. Simultaneously, the
Python ecosystem for GDS is maturing to the point of fully supporting highly specialized
morphological analysis. In this paper, we use the open source Python ecosystem in a workflow to
illustrate its capabilities in a case study assessing the evolution of urban patterns over six historical
periods on a sample of 42 locations. Results show a trajectory of change in the scale and structure of
urban form from pre-industrial development to contemporary neighborhoods, with a peak of highest
deviation during the post-World War II era of modernism, confirming previous findings. The wholly
reproducible method is encapsulated in computational notebooks, illustrating how modern GDS can
be applied to urban morphology research to promote open, collaborative, and transparent science,
independent of proprietary or otherwise limited software.
Urban data science and morphology
Across all fields of science, the expanding availability of openly accessible data and constant development
of new tools, technologies, and platforms is making possible the generation of new theories and testing
of new analytical methods and processes. This is expanding our ability to interrogate and understand the
world around us, delivering new evidence-based knowledge to guide action. Furthermore, it supports a
shift towards quantitative geography, allowing its evolution in the direction of the fourth paradigm of
science as well as its closer integration with critical geography. The former, based on the notion that
science can be data driven to the point where knowledge and theory can be retrieved from data only (Hey
et al. 2009; Gahegan 2020), represents one direction of quantitative science. The latter goes in the
opposite direction, as repeatedly suggested by Kwan and Schwanen (2009) and later by Derudder and van
Meeteren (2019) in their call for a “common language” and for taking on board the critical insights
stemming from quantitative approaches.
This trend manifests in the rapid growth of quantitative geography and geographic data science (GDS),
fuelled by the development of new computational tools and availability of (big) open data. In this field, the
quick emergence and maturation of a new generation of spatial data software ecosystems in both Python
and R (Rey 2019; Bivand 2020), represented by GeoPandas (Jordahl et al. 2021), PySAL (Rey and Anselin
2007; Rey 2019), xarray (Hoyer and Hamman 2017), or sf (Pebesma 2018), has enabled highly
specialized research applications—including the analysis and modelling of urban spatial systems.
Concerning cities, these new tools are finding fertile ground in the study of functional aspects of
urban life, where we now rely on abundant data on, for example, urban populations coming from census
records or social networks, or environmental performance, thanks to arrays of various sensors both in
orbit (Drusch et al. 2012) and on the ground (Mydlarz et al. 2017). This trend is extremely promising
and is already bringing fundamental new knowledge on the performance and behavior of lived
environments, with concrete applications in policy and planning (Kandt and Batty 2021). Conversely, it is
leaving exposed the complementary study of physical aspects of cities—the spatial environment
constituting the setting of all lived urban experience.
Too often, data on buildings, streets, and open space are used only to provide a hollow
cartographic “backdrop” to a wide array of alternative urban dynamics, with little or no further data-driven
insight. This is despite the fact that the built environment is not a simple passive component on which
social and economic processes happen to occur: It is an active layer constantly influencing and affecting the
quality and modes of our existence within cities and that can be directly or indirectly manipulated through
design and planning to deliver long-term effects.
In recent years, GDS focusing on urban form has taken initial steps (Boeing 2018; Oliveira and
Medeiros 2016; Araldi and Fusco 2019; Dibble et al. 2019; Jochem et al. 2020), and yet this line of
research is still in its infancy. Although most existing tools in quantitative geography are ill-suited for
urban form analysis, tools specifically focused on the analysis of urban form have a relatively narrow scope.
The majority are geared towards the analysis of street networks (Hillier 1996; Porta, Latora and Strano
2010), leaving aside key considerations of the geometry and composition of urban fabric. Capturing the
complexity of urban form requires more than the characterization of street networks and a handful of
other measurable characters.
Aware of this limitation, some researchers have attempted to develop new computational frameworks.
These, however, present a range of limitations. Very often, they are built ad hoc and are rarely
generalizable beyond the single case. The implemented code is infrequently made publicly available, and
in many cases, researchers fail to properly document every design decision in their publications,
drastically reducing reproducibility and unnecessarily complicating the process of tentatively rebuilding
workflows multiple times. Collectively, this constitutes a substantial obstacle to a truly shared and
evidence-based knowledge of urban form.
In this sense, developing stronger foundations and bespoke tools for a data-driven science of urban form
is key to reducing the existing gap and allowing a more comprehensive and actionable knowledge of the
physical structure of the urban environment.
In this paper, we provide an overview of available tools for morphological analysis to better understand
the severity of these issues and the potential to overcome them. We then argue that the field needs a
shift from the dominant traditional geographic information system (GIS) environments based on a graphical
user interface (GUI; e.g., QGIS or ArcMap) towards reproducible, open, code-based workflows.
This argument is further supported by an overview of the Python ecosystem and its ability to support research
applications, particularly in the area of urban morphology, a field of study concerned with the analysis of
urban form. The suggested approach is then illustrated in a case study analysing alterations in the structure
and scale of urban patterns depending on their period of origin. We close with a discussion of future
developments in quantitative urban morphology that keep up with open science and reproducibility.
Morphometric tools in the ecosystem: A state of the art
Available tools:
Although the pool of advanced computing techniques for geospatial analysis is rapidly expanding, the
current offer of tools geared towards the analysis of urban form is rather limited in scope and inconsistent
in representativeness.
Currently, most researchers interested in urban form analysis rely on traditional “point-and-click” GIS
software packages, such as ArcGIS or QGIS. Although more intuitive to use, these have three primary
disadvantages. First, some require access to proprietary software, which comes with inherent barriers to
accessibility, related either to affordability or to platform compatibility (e.g., ArcGIS is available only for
Microsoft Windows). Second, even when free to use and multi-platform (e.g., QGIS), they are restricted
by their underlying one-size-fits-all architecture. Although in some cases these can be partially
customized through user-developed single-purpose plug-ins, they tend to constrain users within
predetermined software capabilities. Consequently, scientific methods become a direct function of the
limitations imposed by the software, rather than of the underlying theory or the specific questions at
hand (Harris et al. 2017; Poorthuis and Zook 2020).
Lastly, as pointed out by Boeing (2020b), toolkits relying on point-and-click interfaces are inefficient
in the era of big data. Due to the limited scope for automation of tasks, not only is workflow efficiency
reduced, but the reproducibility of the underlying research is also compromised, because it largely
depends on the (often undocumented) sequence of decisions made while manually operating the software.
This situation is particularly relevant in urban morphology, a field of study spreading from geography
to architecture, focusing on the analysis of urban form and processes of its formation and transformation
(Moudon 1997; Oliveira 2016; Kropf 2017). Unlike other strands of urban space research, urban
morphology is concerned with a range of intermediate spatial scales (neighborhoods, blocks, streets,
squares, plots, and buildings), and its study is meant to “identify the repeating patterns in the structure,
formation and transformation of the built environment to help comprehend how the elements work
together (…) to meet human needs and accommodate human culture” (Kropf 2014, p. 41). In this sense,
urban morphology not only has contributed to the conceptualization of the spatial fabric of cities as a
complex adaptive system and developed highly specialized methods for the study of its organizational
structure but also has provided considerable insight on those intermediate spatial scales of central interest
for urban designers, providing a valuable evidence base to contemporary urban design theory and practice.
Although traditionally geared towards qualitative approaches and “low-tech” methods, researchers in
this field have recently shown an increasing interest in quantitative approaches based on the use of GIS
and related tools and the integration of digital cartographic products. Reviewing the most frequently used
tools for urban form analysis (Table 1) highlights two trends. First, as noted earlier, most
available tools are plug-ins associated with software packages such as ArcMap or QGIS.
Second, urban morphology is oriented around three fundamental spatial elements (plots,
buildings, and streets, and their aggregations) and their spatial formation, yet data on these elements are
underutilized because the majority of current tools focus on street network analysis only. The
reason for this imbalance is the availability of data and tools at the time they were initially transferred
from other disciplines and applied in urban morphology.
The mathematical foundation for the majority of street network analysis is rooted in graph theory and
physics of complex networks (Porta, Crucitti and Latora 2006), with methods applied to social and
biological networks long before they were first applied to street formations. Furthermore, spatial
networks—such as power grids, railways, or rivers, as well as streets—were among the first types of data
to be available in a GIS environment.
Even today, street network data remain the most abundant and widespread. For example, the crowd-
mapping platform OpenStreetMap (OSM) reports an 83% network completeness worldwide (Barrington-
Leigh and Millard-Ball 2017; and that is a single source of data), whereas building footprint data within
the same platform are highly inconsistent, in terms of coverage, accuracy, and resolution (Brovelli and
Zamboni 2018).
This imbalance is further reflected in “what” is being assessed, with network connectivity metrics
being the only category of characters sufficiently covered. Following Fleischmann et al. (2020a), there are
five other categories (dimension, shape, spatial distribution, intensity, and diversity), which are severely
underrepresented by available tools. Among tools for connectivity assessment, it can be further observed
that the majority focus on aspects of street centrality.
These build either on the “Multiple Centrality Assessment” of Porta, Crucitti and Latora (2006) or on
the school of “Space Syntax” (Hillier 1996), the former being based on a primal representation of the
network of streets and the latter on a dual approach. The former was initially
implemented as a standalone tool (Gasser and Caillet 2013) and later reimplemented in different spatial
analysis plug-ins and libraries.
These include the ArcMap (and later Rhino3D) toolbox Urban Network Analysis (UNA) (Sevtsuk and
Mekonnen 2012; Sevtsuk 2018), the open-source Spatial Design Network Analysis (Cooper and Chiaradia
2020), and, more recently, OSMnx (Boeing 2017) and, in an expanded form, momepy (Fleischmann
2019). The latter was implemented by the Space Syntax group at University College London in
depthmapX (Turner 2001; depthmapX Development Team 2017) and in the Place Syntax Toolkit (PST; Ståhle
et al. 2005), both accessible as open-source software. In addition to centrality analysis, each tool can
often measure other network-based variables (e.g., redundancy index in UNA, circuity in OSMnx, and
meshedness in momepy).
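To make two of the network-based variables named above concrete, the following minimal sketch computes them in plain Python. It assumes the commonly used definitions: meshedness as (e - v + 1) / (2v - 5) for a connected planar graph with v nodes and e edges, and circuity as total edge length divided by the total straight-line distance between edge endpoints. The function names and the toy inputs are hypothetical and do not reproduce the momepy or OSMnx implementations.

```python
import math


def meshedness(num_nodes: int, num_edges: int) -> float:
    """Meshedness of a connected planar graph: (e - v + 1) / (2v - 5)."""
    if num_nodes < 3:
        raise ValueError("meshedness needs at least three nodes")
    return (num_edges - num_nodes + 1) / (2 * num_nodes - 5)


def circuity(edges, coords) -> float:
    """Total edge length divided by total straight-line endpoint distance.

    edges: iterable of (u, v, length) tuples (assumed planar coordinates)
    coords: mapping of node id -> (x, y)
    """
    network_length = sum(length for _, _, length in edges)
    straight_length = sum(math.dist(coords[u], coords[v]) for u, v, _ in edges)
    return network_length / straight_length


# A 3 x 3 grid of intersections has 9 nodes and 12 street segments.
grid_meshedness = meshedness(9, 12)  # 4 / 13

# A tree-like (dendritic) network with v nodes has v - 1 edges, so no cycles.
tree_meshedness = meshedness(5, 4)  # 0.0

# One curved segment (120 m along the street, 100 m as the crow flies)
# and one perfectly straight 100 m segment.
toy_coords = {"a": (0, 0), "b": (100, 0), "c": (100, 100)}
toy_edges = [("a", "b", 120), ("b", "c", 100)]
toy_circuity = circuity(toy_edges, toy_coords)
```

A value of circuity close to 1 indicates nearly straight streets, while meshedness ranges from 0 for a tree to 1 for a maximally connected planar graph.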
This leaves morphological elements such as buildings and plots, among others (e.g., blocks and street
edges), considerably underrepresented.1 Notable exceptions include the Metropolitan Form Analysis toolbox
(Amindarbari and Sevtsuk 2013), which captures seven metrics describing the footprint and land-use pattern
of a city and therefore operates at a metropolitan scale (of both grain and extent);
AwaP-IC (Majic and Pafka 2019), a QGIS plug-in measuring two permeability-related metrics at the
scale of blocks and buildings; foot, an R package describing gridded building footprints via a small
number of metrics (Jochem and Tatem 2021); and momepy (Fleischmann 2019), which will be discussed
separately.
It should be noted that the predominance of street-network and connectivity-based tools and the relative
absence of tools for other morphological elements do not mean that researchers in this area have
failed to measure different elements of urban form (Araldi and Fusco 2019; Dibble et al.
2019). The issue is that such studies are hardly replicable or reproducible, as they either come with no
reusable tools or simply rely on ad hoc code. Dibble et al. (2019), for example, measured 207 morphological
attributes at the level of blocks, buildings, and plots, as well as street networks, in 45 urban neighborhoods
(which the authors define as sanctuary areas).
These measurements were all determined manually from Google satellite imagery and Ordnance Survey data. An
additional example is the work by Araldi and Fusco (2019), who measured 21 characters capturing the
pedestrian point of view across the metropolitan area of Nice, France. Although this work does rely on an
algorithmic approach using ad hoc code, the code is largely undocumented in the published output, a decision
which might be due to licensing restrictions. This situation is typical in partnerships between academia
and the private sector or research carried out for the private sector alone.
One might argue that even in the absence of specific analytical tools, a traditional GIS environment
(e.g., QGIS) using a GUI is more than sufficient for most applications, particularly considering the
up-front investment (in time as well as learning effort) required to turn a method into reusable code
and the lack of any academic reward for doing so. However, this argument overlooks several limitations
affecting both research design and its applicability.
The data science tools in Python are not restricted to geospatial analysis but offer a wide range of
potential combinations, from natural language processing to powerful artificial intelligence modelling, which
can be intertwined with geographic data in whatever scenarios researchers require. The open-source code
ensures transparency of methods, as we can verify what each part of the process does, unlike in
proprietary software, where the user has to trust the (often imprecise) documentation.
Furthermore, code-based methods support reproducibility and replicability of the work by eliminating
undocumented steps and by avoiding the situation where methodological details and “rationales
underpinning analytical decisions became obfuscated” (Boeing and Arribas-Bel 2021, p. 2) when relying on
a GUI.
Within the wider GIS research community, computational notebooks2 and open-source software
packages are increasingly seen as key solutions for research in the area and as being at the forefront of
geographic open science (Boeing and Arribas-Bel 2021).
Python ecosystem for GDS
With the recent developments in both Python and R ecosystems for GDS, it is no longer the lack of
fundamental building blocks that hinders the development and release of scientific software. Focusing on
Python and vector-based analysis of urban form,3 we are witnessing a growing number of libraries and
packages being released. These are quickly maturing to provide the required degree of stability,
performance, and scalability for processing of large datasets.
Modern data science in Python is oriented around pandas (McKinney 2010), a package for tabular data
analysis and manipulation. GDS follows this model with GeoPandas (Jordahl et al. 2021), extending pandas
with support for geospatial features and operations and linking together various components of the ecosystem
in a convenient form (Fig. 1). Its core depends on libraries written in C and C++: GEOS (GEOS contributors
2021), PROJ (PROJ contributors 2021), and GDAL (GDAL/OGR contributors 2021).
The geometry operations are handled by shapely (Gillies and others 2007), a Python interface to GEOS.
Coordinate reference systems are managed by pyproj (Snow et al. 2021), which interfaces with PROJ.
Reading and writing of geospatial data are handled by fiona (Gillies and others 2011), a module
based on GDAL. The power of three performant C/C++ libraries (GEOS, PROJ, and GDAL) behind a convenient
pandas-like Python interface has made GeoPandas a core tool for vector data manipulation.
Figure 1. Schema of the software ecosystem supporting morphological research within this study.
The development of PySAL (Python Spatial Analysis Library) (Rey and Anselin 2007; Rey 2019) began even
before that of GeoPandas (June 2013). Although originally independent, its
relation to GeoPandas has become stronger over the years, and several PySAL modules now depend on
GeoPandas and vice versa.4 PySAL brings a broad range of tools for spatial analysis, from the implementation
of spatial weights matrices to advanced spatial interpolation and multiscale geographically weighted
regression models.
Parallel and largely disconnected has been the development of NetworkX (Hagberg, Schult and Swart 2008),
a general-purpose package for the manipulation and analysis of networks, not necessarily spatial. However, in
recent years, it has been incorporated into several GDS applications, making it a fundamental component
of spatial network analysis (typically of streets). These tools were crucial to the development of specialized
software for morphological analysis.
The first example is OSMnx, a library for modelling and analysing street networks obtained from OSM,
depending on the capabilities of NetworkX and interfacing with GeoPandas. Its ability to download and parse
OSM data directly from Python with a convenient interface opened new research possibilities, democratizing
access to data. However, the main analytical focus of OSMnx is still street network analysis, and as such it
focuses on connectivity, similarly to the majority of existing tools outlined above.
A recent addition to the ecosystem is momepy, a library that builds on GeoPandas, PySAL, NetworkX,
and to a degree OSMnx, to develop an open repository of tools for the morphometric assessment of built
environments. It covers connectivity as well as the other five categories of measurable characters,
which are reflected in different modules of the library. The close relationship to the rest of the ecosystem
allows complex characterization of urban form within the frameworks of modern data science, which is
reproducible as well as scalable.
The Python ecosystem centered around momepy has the potential to deliver great insights into built-up
patterns. Some initial applications include the analysis of informal settlements (Mottelson and Venerandi 2020),
the prediction of building heights (Milojevic-Dupont et al. 2020), and the classification of seashore streets in
Portugal to evaluate local climate adaptation plans (Dal Cin et al. 2020). Combined with the parsing capability of
OSMnx, we can now create open and fully reproducible workflows analysing the structure and composition
of urban form based on open data from OSM.
To prove this key point, we present below an application of such a workflow. Specifically, to illustrate the
potential of Python for urban morphometrics (i.e., the quantitative analysis of urban form), different
open-source Python tools are used to understand patterns of change in the structure and scale of urban
form over time. To ensure reproducibility of the analysis, the whole method is delivered as a series of
Jupyter notebooks executed within a containerized environment.
Case study: Alterations in structure and scale
Cities are in a state of continuous flow: They change in their economy, cultural landscape, societal norms,
political discourse, and relationship with the natural environment. This global change is shaped over time
through individual and collective action. The form of cities also changes: buildings, plots, street fronts,
blocks, and streets each change at their own pace, moulded by new construction methods, technologies,
resource availability, lifestyle preferences, and planning policy and theory.
And yet this change is not chaotic or random but reflects (and interacts with) accidents of history—key
events of a social, political, cultural, and environmental nature—not only within the same city but also
between cities that might be even considerably far apart in space.
For example, the technical innovations brought by the industrial revolution (e.g., prefabrication and
serialization) have allowed forms and speeds of building construction all over the world that were, up to that
point, unthinkable; the freedom of movement associated with the shift to a car-dominated society has triggered
the spread of low-density suburban lifestyles everywhere; the rapid population growth currently experienced in
many developing regions of the world has prompted a wave of unregulated informal urbanization characterized
by extremely high density and compactness. Research in urban morphology has already started investigating
the “laws” behind these recurrent urban form patterns through comparative studies.
Notably, Porta et al. (2014) have sought to show how different urban settlements all over the world share
what they call a significant “alteration” in the scale of their street network, which corresponded to the
macro-shift from a pre-industrial to an industrialized society. More specifically, they analyzed the intersection
patterns of urban main streets from a pool of 100 case studies from 30 countries, characterized by diverse
historic, sociocultural, and economic backgrounds and divided into nine categories of urban form: ancient,
medieval, renaissance, baroque, industrial, garden city, radiant city, new urbanism, and informal
development. Albeit limited by sample size, the study uncovered the existence of a recurrent “400-m rule” in
the intersection pattern of urban main streets in historic pre-industrial cities, a spacing that roughly doubled
in all post-industrial samples up to the present day, with the notable exception of informal settlements,
which followed the same “rule” observed in historical cases. Indeed, according to the authors, this “alteration
in scale” is on the one hand the product of concomitant socioeconomic shifts (the dominance of motorized
vehicles, the engineerization of transportation planning, and the possibilities enabled by serial production)
and, on the other, one of the prime culprits in the spread of “the unsustainable, car-dominated city of today”
(p. 3,398).
The transformation of network patterns over time was later studied by Boeing (2020a), who carried out a
similar study across the whole of the United States using OSMnx to quantify the link between the configuration
of the network and car ownership.
His big data approach confirmed the manual measurements of Porta et al. (2014), showing that US cities and
their street networks are on a journey from the predominantly gridded configurations of pre-World War II (WW2)
development, through convoluted dendritic patterns peaking in the 1980s, back to more interconnected
conventions in the 21st century. According to the author, the rise in car ownership was mirrored in the
studied networks by a steady drift away from the connectivity and density of gridded patterns, and only very
recently, with growing awareness of the need for more walkable cities, has this trend been partly reversed.
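The contrast between gridded and dendritic street patterns can be sketched with two simple connectivity indicators: the average node degree and the share of dead-end nodes. The example below is an illustration in plain Python on two synthetic networks; the indicators, function names, and toy layouts are assumptions chosen for clarity and are not the measures or data used in the cited studies.

```python
from collections import defaultdict


def degree_stats(edges):
    """Return (average node degree, share of dead-end nodes) of an undirected edge list."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    node_count = len(degree)
    average_degree = sum(degree.values()) / node_count
    dead_end_share = sum(1 for d in degree.values() if d == 1) / node_count
    return average_degree, dead_end_share


def grid_edges(size):
    """Street segments of a size x size gridded pattern."""
    edges = []
    for row in range(size):
        for col in range(size):
            if col + 1 < size:
                edges.append(((row, col), (row, col + 1)))
            if row + 1 < size:
                edges.append(((row, col), (row + 1, col)))
    return edges


def dendritic_edges(spine_length):
    """A single spine road with one cul-de-sac branching off every spine node."""
    spine = [(("spine", i), ("spine", i + 1)) for i in range(spine_length - 1)]
    branches = [(("spine", i), ("cul-de-sac", i)) for i in range(spine_length)]
    return spine + branches


# Both toy networks have 16 nodes, so the indicators are directly comparable.
grid_avg, grid_dead = degree_stats(grid_edges(4))
tree_avg, tree_dead = degree_stats(dendritic_edges(8))
```

On these toy inputs the grid has a higher average degree and no dead ends, whereas half of the nodes in the dendritic layout are dead ends, which is the kind of drift away from connectivity described above.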
In parallel, Barrington-Leigh and Millard-Ball (2020) carried out a similar analysis at a global level and
developed a Street-Network Disconnectedness Index, finding that “in contrast to the corrective trend
observed in the United States, where streets have become more connected since the late 20th century, we
find that most of the world is building ever-more disconnected ‘street-network sprawl’” (p. 1,941).
Broadly speaking, all these works share a similar conclusion: that certain spatial trends and “alterations,”
which are visible in the shape and configuration of street networks, are recurrent in cities
regardless of geographic location as a result of high-impact processes and events. This hypothesis builds on
a long strand of research in urban morphology, according to which changes in economy, technology, and
culture drive phases of development, stagnation, and redevelopment across all constitutive elements of the
urban form system (Conzen 1960; Feliciotti, Romice, and Porta 2017; Hallowell and Baran 2013).
But although this phenomenon has been studied from a quantitative perspective in relation to street networks
only, it is reasonable to theorize, in Porta et al.’s (2014) own words, that there are “similarly recursive spatial
patterns within other elements of urban form” (p. 3,384).
In this regard, the proposed study builds further on this hypothesis by extending it to other components of
urban form (namely, buildings and tessellation cells) and applies a rigorous quantitative approach to test
whether patterns (and changes) similar to those observed for street networks across historical periods are
also recognizable in the structure and scale of these other elements of urban form.
Case study analysis method
For the purpose of this analysis, we defined six well-established historical periods, to a degree replicating the
subdivision into “urban design paradigms” adopted in the study by Porta et al. (2014), although with a reduced
number of classes: “pre-industrial,” “industrial,” “garden city,” “modernist,” “neo-traditional,” and
“informal.” For each period, we sampled seven internally homogeneous areas, each defined by a 400-m
buffer around a central location (Fig. 2), giving a total of 42 cases spread all over the world and
covering different geographical and historical contexts (Fig. 3). Each of the 42 samples had to be internally
homogeneous and highly representative of the relevant historical period.
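The idea of a 400-m buffer around a central location can be sketched as a simple distance filter. The snippet below is a minimal illustration in plain Python that assumes a spherical Earth and great-circle (haversine) distances; the function names and coordinates are hypothetical, and an actual workflow would more likely buffer geometries in a projected coordinate reference system.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation (assumption)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_buffer(center, points, radius_m=400):
    """Keep only the (lat, lon) points lying within radius_m of the center."""
    return [p for p in points if haversine_m(*center, *p) <= radius_m]


# Hypothetical points on the equator: roughly 334 m and 445 m east of the center.
center = (0.0, 0.0)
candidates = [(0.0, 0.003), (0.0, 0.004)]
selected = within_buffer(center, candidates)  # only the first point is kept

# Six periods with seven sampled areas each give the 42 cases of the study.
total_cases = 6 * 7
```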
Figure 2. Examples of case studies (one per historical period) reflecting the variety of urban form
patterns within set buffers.
Figure 3. Selection of case studies and their allocation to historical periods. The tabular form is available
in the code and data repository.
Using OSMnx, we download building footprints and street networks from OSM as GeoPandas data structures,
which serve as input for momepy. To develop a more grounded understanding of the structure, we further generate
a morphological tessellation (Fig. 4), an analytical spatial unit derived from building footprints using Voronoi
tessellation (Fleischmann et al. 2020b). The tessellation reflects the smallest spatial division that at the same
time retains information on contiguity, allowing identification of topological relationships between individual
buildings. This property is used to measure various morphometric characters reflecting the spatial distribution
of building footprints.
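The two properties of tessellation described above, a cell per building and contiguity between cells, can be illustrated with a deliberately simplified raster sketch. It assumes that each building is reduced to a single seed point and that space is discretized into square pixels assigned to the nearest seed; this is not the momepy algorithm, which works from footprint polygons, and all names are hypothetical.

```python
import math


def raster_tessellation(seeds, width, height, cell=1.0):
    """Assign every pixel of a width x height extent to its nearest seed.

    seeds: mapping of building id -> (x, y) seed point (a simplification)
    Returns a row-major grid of building ids.
    """
    cols, rows = int(width / cell), int(height / cell)
    owner = []
    for r in range(rows):
        row = []
        for c in range(cols):
            centre = ((c + 0.5) * cell, (r + 0.5) * cell)
            row.append(min(seeds, key=lambda s: math.dist(centre, seeds[s])))
        owner.append(row)
    return owner


def areas_and_contiguity(owner, cell=1.0):
    """Cell area per building and the set of contiguous (edge-sharing) cell pairs."""
    areas, neighbours = {}, set()
    rows, cols = len(owner), len(owner[0])
    for r in range(rows):
        for c in range(cols):
            current = owner[r][c]
            areas[current] = areas.get(current, 0.0) + cell * cell
            # Compare with the pixel below and the pixel to the right.
            for rr, cc in ((r + 1, c), (r, c + 1)):
                if rr < rows and cc < cols and owner[rr][cc] != current:
                    neighbours.add(frozenset((current, owner[rr][cc])))
    return areas, neighbours


# Three hypothetical buildings in a row inside a 10 x 10 extent.
seeds = {"a": (1.5, 5.0), "b": (5.0, 5.0), "c": (8.5, 5.0)}
areas, neighbours = areas_and_contiguity(raster_tessellation(seeds, 10, 10))
```

In this toy layout the middle cell is contiguous with both outer cells, while the outer cells do not touch each other; such contiguity is what allows topological relationships between buildings to be identified.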
Figure 4. Illustration of the behavior of morphological tessellation on the sample set of case studies.
Capturing the change in the structure of urban form is a complex task as there are endless possibilities on
what could be analyzed. Within this work, we use a selection of 12 morphometric characters (Table 2)
spanning across all six categories identified by Fleischmann et al. (2020a) and different spatial elements,
that is, street networks, building footprints, and morphological tessellation. The set ranges from simple
characters (e.g., area of building footprint) to more complex metrics reflecting the relationship
between individual elements (e.g., adjacency of buildings) or capturing characters of the street profile
(defined by a combination of streets and buildings).
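As an illustration of the simplest kind of character, the sketch below computes the area of a building footprint with the shoelace formula and relates it to the area of its tessellation cell. The coverage ratio is included only as an example of an intensity-type character and is an assumption, not necessarily one of the 12 characters in Table 2; function names and coordinates are hypothetical.

```python
def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula.

    vertices: list of (x, y) tuples in order; the ring does not need to be closed.
    """
    twice_area = 0.0
    count = len(vertices)
    for i in range(count):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % count]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2


def coverage_ratio(footprint, tessellation_cell):
    """Share of a tessellation cell covered by its building footprint."""
    return polygon_area(footprint) / polygon_area(tessellation_cell)


# Hypothetical 10 m x 20 m building inside a 20 m x 40 m tessellation cell.
building = [(5, 10), (15, 10), (15, 30), (5, 30)]
cell = [(0, 0), (20, 0), (20, 40), (0, 40)]
footprint_area = polygon_area(building)  # 200.0
ratio = coverage_ratio(building, cell)   # 0.25
```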
We treat this selection as a sample to illustrate the workflow while producing valuable insights. This can be
expanded in a potential full-scale study. Statistical distributions of measured values within each period are
analyzed using a Kruskal–Wallis one-way analysis of variance (Kruskal and Wallis 1952) and a pairwise
Mann–Whitney U-test (Mann and Whitney 1947; following the SciPy implementation [Virtanen et al. 2020]) to
empirically test the hypothesis of change.
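Both tests are rank-based, which the following plain-Python sketch tries to make explicit. It computes only the test statistics (the Kruskal–Wallis H without tie correction and the Mann–Whitney U of the first sample) and no p-values; it is a didactic approximation and not the SciPy implementation, which the study follows. All function names and sample values are hypothetical.

```python
def average_ranks(values):
    """Rank values starting at 1; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of samples."""
    pooled = [value for group in groups for value in group]
    ranks = average_ranks(pooled)
    total = len(pooled)
    weighted, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    return 12 / (total * (total + 1)) * weighted - 3 * (total + 1)


def mann_whitney_u(first, second):
    """Mann-Whitney U statistic of the first sample."""
    ranks = average_ranks(list(first) + list(second))
    rank_sum_first = sum(ranks[:len(first)])
    return rank_sum_first - len(first) * (len(first) + 1) / 2


# Made-up character values for three hypothetical periods (not study data).
periods = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_statistic = kruskal_wallis_h(periods)
u_statistic = mann_whitney_u(periods[0], periods[1])
```

A large H suggests that at least one period differs from the others, and the pairwise U statistics then indicate which pairs of periods differ.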
The assumption of the analysis is that the results will present changes in built-up patterns similar to
what has been previously observed by Porta et al. (2014), Boeing (2020a), and Barrington-Leigh and
Millard-Ball (2020): a significant alteration in the transition from pre-industrial to post-war development,
with a slow tendency to return to pre-industrial patterns in particular contexts (i.e., informal
settlements and early 21st century developments). The extent of such a change is not yet known.
The whole method is encapsulated in three computational Jupyter notebooks covering each step from the
beginning to the end, with the only external input being the list of case studies with their points of origin. We
download and process data for a specific timestamp using OSMnx and GeoPandas, generate the tessellation and
measure morphometric characters using momepy, and generate statistical outputs and figures using SciPy
(Virtanen et al. 2020) and seaborn (Waskom and team 2020). This is wholly processed within a reproducible
containerized environment. The potential expansion of the list of case studies then depends only on their
identification and data availability.
Results
Table 3 presents summary statistics (median and interquartile range) of each morphometric character by
historical period. The tendencies shown by the data are generally in line with the previous findings by Boeing
(2020a) and Porta et al. (2014), with the highest deviations occurring during the post-WW2 era of the
modernist planning paradigm and slowly turning back afterwards. Notably, informal settlements tend to be
structurally more similar to pre-industrial and industrial development than to any other period, suggesting
that the lack of the influence of planning and especially transportation technology (i.e., personal vehicles)
has a strong tendency to generate similar walkable dependable patterns.
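A summary of the kind reported in Table 3 can be produced with a grouped aggregation in pandas. The sketch below is only an illustration: the column names (period, cell_area) and all values are hypothetical and are not taken from the study.

```python
import pandas as pd

# Hypothetical measurements; column names and values are illustrative only.
df = pd.DataFrame({
    "period": ["industrial"] * 4 + ["modernist"] * 4,
    "cell_area": [100.0, 200.0, 300.0, 400.0,
                  1000.0, 2000.0, 3000.0, 4000.0],
})


def iqr(values: pd.Series) -> float:
    """Interquartile range: 75th percentile minus 25th percentile."""
    return values.quantile(0.75) - values.quantile(0.25)


# One row per period, with the median and the interquartile range.
summary = df.groupby("period")["cell_area"].agg(median="median", iqr=iqr)
print(summary)

# Ratio of medians, the kind of comparison used when discussing scale.
ratio = summary.loc["modernist", "median"] / summary.loc["industrial", "median"]
print(f"modernist / industrial median ratio: {ratio:.1f}")
```

The median and interquartile range are used here because they are the statistics named in the text; both are less sensitive to extreme values than the mean and standard deviation.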
The apparent alteration in scale, observed by Porta et al. (2014) as a change in the distance between the main
streets, is to a large degree also present in the small-scale data. The median area of tessellation cells, the
related building areas, and the neighbor distance between buildings all show very similar tendencies (Fig. 5):
a change of scale between the industrial and garden city periods, with a major peak during
modernism. The change in scale is extensive. The median area of tessellation cells more than
triples between the industrial era and the garden city movement, and it is more than 19 times larger during
modernism than in the industrial period.
Although we can observe how neo-traditional development values are comparatively less extreme than in
the modernist period, they are still almost triple in size compared with the (pre-) industrial era, despite their
programmatic aim to develop patterns close to pre-WW2 fabric. Informal settlements are the most compact
in this comparison, at roughly 65% the size of historical development. Although the difference in mean
neighbor distance is not so radical, the tendency is the same.
Simultaneously, the only radical deviation in building size is present in modernism (more than 3.5 times
larger in comparison with alternative forms) and in informal developments (almost half in comparison with
alternative forms), indicating that the primary change is in the pattern rather than in the range; in
other words, we have simply started building our houses further away from one another.
Figure 5. Box plots of the (a) mean neighbor distance between buildings, (b) area of tessellation cells
grouped by historical period, and (c) area of building footprints grouped by historical period.
Such interpretation is further supported by the results of openness of a street profile, width of a street profile,
and covered area ratio of tessellation cells. Openness refers to the presence of buildings along the street; that
is, wider gaps between buildings will lead to a higher openness. As shown in Fig. 6a, the previously
observed tendency is directly reflected in the change of openness.
The openness of informal settlements is on par with (pre-)industrial cases, without the deviation seen in the
scale-related characters. Buildings are further apart not only along the street but also across it, leading to
wider and more saturated streetscapes, as illustrated in Fig. 6c. Lastly, the covered area ratio directly reflects
the area of tessellation cells and area of buildings measured above, again placing the informal
settlements very close to historical development.
Figure 6. Box plots of the (a) openness of a street profile, (b) covered area ratio grouped by historical
period, and (c) width of a street profile grouped by historical period.
An additional effect of the change of scale is reflected in the significant difference between the length of a
perceived contiguous perimeter wall of (pre-)industrial development and that of more recent origin.
A similar pattern is observed in building adjacency (Fig. 7). Both indicate that we have moved from building
cities composed of adjacent buildings towards solitary development.
Figure 7. Box plots of (a) length of a perimeter wall and (b) building adjacency grouped by historical
period.
The final group of characters shows that not all aspects of built form follow the same trajectory.
Meshedness (Fig. 8a), the proxy of street network connectivity, is highest in industrial cases, driven by a
rigid grid, conventional for that era. The linearity of a street segment also reflects the effect of the grid while
showing that informal settlements, often similar to historical developments, have the second most
convoluted street network (just behind modernism). The effect of informality is also reflected in the
deviation of street profile width, being the largest in this group of case studies.
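To make the two network characters concrete, the sketch below uses their commonly cited definitions, which are assumed here rather than taken from the source: meshedness of a connected planar graph as (e - v + 1) / (2v - 5), where e is the number of edges and v the number of nodes, and linearity of a segment as the straight-line distance between its endpoints divided by its actual length. The study reports mean values derived from the street network, so these whole-graph and single-segment functions are a simplification, and the function names are hypothetical.

```python
import math


def meshedness(n_nodes: int, n_edges: int) -> float:
    """Meshedness of a connected planar graph: (e - v + 1) / (2v - 5)."""
    return (n_edges - n_nodes + 1) / (2 * n_nodes - 5)


def linearity(coords) -> float:
    """Endpoint-to-endpoint distance divided by the length along the path."""
    path_length = sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))
    return math.dist(coords[0], coords[-1]) / path_length


# A 3 x 3 grid of intersections has 9 nodes and 12 street segments.
grid = meshedness(n_nodes=9, n_edges=12)
# A tree-like layout (only cul-de-sacs) on 9 nodes has 8 segments, no loops.
tree = meshedness(n_nodes=9, n_edges=8)

# A straight street versus a bent one (coordinates in meters).
straight = linearity([(0, 0), (50, 0), (100, 0)])
bent = linearity([(0, 0), (30, 40), (60, 0)])

print(f"meshedness: grid = {grid:.3f}, tree = {tree:.3f}")
print(f"linearity: straight = {straight:.2f}, bent = {bent:.2f}")
```

Under these definitions a rigid grid scores higher on both characters than a branching or winding layout, which is consistent with the ordering described in the text.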
Figure 8. Box plots of the (a) mean meshedness of a street network, (b) mean linearity of street
segments grouped by historical period, and (c) mean width deviation of a street profile grouped by historical
period.
The interpretation of median values and box plots, leading to the conclusion that there is a significant
change in built-up patterns between different periods, is further supported by the results of the
Kruskal–Wallis test, which indicate that the distributions of morphometric values obtained from samples from
different historical periods cannot be considered the same (p < .05). The subsequent analysis using the
Mann–Whitney U-test, comparing the distributions of every pair of periods for each measured character,
indicates that, with three exceptions, the values obtained from any two periods differ significantly (p < .05).
The only distributions that cannot be considered different are building adjacency of garden city and modernist
periods (p = .4), linearity of a street segment of garden city and modernist periods (p = .15), and width
deviation of a street profile of pre-industrial and modernist periods (p = .17). Complete results are reported
in the complementary Jupyter notebook.
Discussion
The alteration in patterns observed in the 42 sample cases highlights once more that profound
transformations have occurred over time in the way we construct the built environment. These changes are
manifested in every aspect of urban form, both in terms of structures and scale, mainly confirming
previous findings by Porta et al. (2014) and Boeing (2020a) regarding street networks. Although the
explanation of such (often radical) differences is left to further research, it is clear how changes
influencing the structure of a particular morphological complex (i.e., network of streets) are intertwined
with changes in alternative complexes due to the tangled interdependencies of the urban form system.
This is undoubtedly a serious pursuit, which will allow the deeper uncovering of structural tendencies in
the environments around us and ultimately advance our current understanding of different performances
of cities. Although the selection of case studies used in this example aims to be representative, a wider set
would provide more robust results. A similar situation applies to the set of morphometric characters used,
where a larger number of employed characters would result in a more comprehensive picture. However,
the primary purpose of the case is to illustrate the abilities of the Python ecosystem in the study of urban
form, and as such, it is less affected by these limitations.
From the perspective of the Python GDS ecosystem and its ability to support and deliver an analysis of
urban form, this case study illustrates its achieved maturity and reliability. Every step of the procedure is
fully contained and processed within Jupyter notebooks without ever requiring the analyst to switch
between environments. Furthermore, as the whole method is written only in Python, it reduces the burden
on researchers to learn a broad range of tools to process different steps of the analysis.
The proposed method is fully replicable, reproducible, and expandable because it only requires open
data from OSM (moreover linked to a specific point in time) and relies on an entirely automated workflow.
It is replicable because running the code within the provided Docker container (lightweight executable
environment) should always lead to the exact same results. It is reproducible as the code can be simply
applied to different case studies of choice or can be run on different data sources capturing urban form
besides OSM, potentially exploiting resources provided by open data portals at municipal, national, and
global levels. Finally, it is expandable insofar as the code can be optimized and extended to include further
metrics in the analysis, either existing or created from scratch by other researchers. On the other hand, it
also requires a basic knowledge of Python to use the workflow, which may be initially seen as limiting.
Switching to a code-based analysis may be associated with a steep learning curve. However, not
everyone needs to reach the developer level, as the data science ecosystem aims to provide a middle
ground user level. That is a bit like Lego: the researcher learns how to put pieces together and then finds
the pieces they need to build a house. Other researchers have already packaged complex scripts into
straightforward functions (as in the case of PySAL). Furthermore, truly reproducible workflows should
just run, with minimal knowledge needed, as is illustrated by the presented case study. The user only
needs to prepare an environment (either manually or via Docker image) and optionally edit the table listing
individual cases. As coding proficiency advances, it is easier to build reproducible research and share
it with the community, which results in increasing impact.
The more this mindset becomes widespread among researchers, the more highly specialized tools
addressing alternative aspects of urban form analysis will become available. In this way, the addition of
newly developed tools that contribute to the existing ecosystem becomes a standard practice. The new
methods can result in dedicated packages (where the scope of the work does not fit in any of the existing
ones) while retaining compatibility with the ecosystem, allowing direct exchange of data and consistency of
workflows. In other cases, they can become contributions to existing packages, in a fashion similar to the way
in which PySAL is being developed (e.g., implementations of work proposed by Jiang 2013; Arribas-Bel,
Garcia-López and Viladecans-Marsal 2019; Wolf, Knaap and Rey 2019). The majority of the
infrastructural work overlaps between different applications. Thus, we should not spend time
reimplementing it over and over again, as there is no requirement to constantly reinvent the wheel.
Running a morphometric assessment of the sort just presented, scaled from small pieces of urban
tissue to broad metropolitan areas, can become computationally demanding, if not overwhelming, for
traditional GIS environments. With more steps involved, the point-and-click workflow becomes
obfuscated and its processing toolkits inefficient. Although GeoPandas performs all operations as a
single-core process, Python’s ecosystem can support its parallelization and eventually out-of-core
computation for larger-than-memory data. This, alongside code optimization, allows researchers to handle
vast amounts of data and very demanding computations. Relying on Dask (Rocklin 2015; a library for scalable
Python computation) minimizes the need to learn additional frameworks (such as Apache Sedona [Yu, Zhang,
and Sarwat 2019], which depends on Apache Spark) because of its close relation to the pandas ecosystem, API,
and data manipulation logic. The work on the scalability of GeoPandas-based computation based on Dask is
already under development, both directly, as a dask-geopandas extension (Signell, Van den Bossche and
Fleischmann 2021), and indirectly, by leveraging Dask in momepy. It can be expected that the support
will evolve into a seamless implementation, which in turn allows straightforward scalability of urban
morphometrics to regional or national extents and beyond.
Conclusion
Disregarding the technical aspects, relying on open-source data and operating in an open-source
environment dramatically widen opportunities for a fully open research agenda in urban morphology and
open new possibilities in terms of compatibility and cooperation. New quantitative methods can derive
rich data, enabling explorations of the applicability of the fourth paradigm of science in urban
morphology. Even the ability to think about fully data-driven research is new to the field and would not be
possible without the inclusion of GDS. At the same time, we can strengthen links between traditional and
quantitative methods, with the former providing a theoretical component and the latter a descriptive one.
Such mixed methods can link the detail and profound insights of traditional urban morphology with the
descriptive power and scalability of urban morphometrics. Both of these options are becoming more
prominent and will require time and critical assessment to properly mature. Nevertheless, both already
enrich the portfolio of urban morphologists, making it more open and more reproducible. The open
research paradigm, based on open platforms and transparent community-led governance, has the
potential to democratize science and remove unnecessary friction caused by the lack of cooperation
between research groups while bringing additional transparency to research methods and outputs.
Notes:
1. It is to be noted that although UNA and PST can link other features to street segments (i.e., buildings
and plots), the gist of the analysis is still network analysis.
2. Boeing and Arribas-Bel (2021) describe a computational notebook as “a computer file that contains code,
output, images, and narrative text woven together. Notebooks allow users to consolidate their analytics
workflows, blending code, documentation, and results into a single reproducible and distributable
file.” (p. 1).
3. We do not address remote sensing and rasters in this article due to their limited scope of application in
urban morphology at the time of writing. However, the current underutilization of raster data is likely to
change, opening additional avenues of research.
4. GeoPandas is using PySAL’s mapclassify library in its choropleth mapping.
References
1. Fleischmann, M., A. Feliciotti, and W. Kerr. (2022). “Evolution of Urban Patterns: Urban
Morphology as an Open Reproducible Data Science.” Geographical Analysis 54, 536–558.
2. Signell, J., J. Van den Bossche, and M. Fleischmann. (2021). Dask-Geopandas. Available at:
https://github.com/geopandas/dask-geopandas/.
3. Tripathy, P., P. Rao, K. Balakrishnan, and T. Malladi. (2020). “An Open-Source Tool to Extract
Natural Continuity and Hierarchy of Urban Street Networks.” Environment and Planning B: Urban
Analytics and City Science. https://doi.org/10.1177/2399808320967680.
4. Araldi, A., and G. Fusco. (2019). “From the Street to the Metropolitan Region: Pedestrian
Perspective in Urban Fabric Analysis.” Environment and Planning B: Urban Analytics and City
Science 46(7), 1243–63. https://doi.org/10.1177/2399808319832612.
5. Arribas-Bel, D., M.-À. Garcia-López, and E. Viladecans-Marsal. (2019). “Building(s and) Cities:
Delineating Urban Areas with a Machine Learning Algorithm.” Journal of Urban Economics,
103217. https://doi.org/10.1016/j.jue.2019.103217.
6. Barrington-Leigh, C., and A. Millard-Ball. (2020). “Global Trends Toward Urban Street-Network
Sprawl.” Proceedings of the National Academy of Sciences 117(4), 1941–50.
https://doi.org/10.1073/pnas.1905232116.
7. Wolf, L. J., E. Knaap, and S. Rey. (2019). “Geosilhouettes: Geographical Measures of Cluster Fit.”
Environment and Planning B: Urban Analytics and City Science 48(3), 521–539.
https://doi.org/10.1177/2399808319875752.
8. Derudder, B., and M. van Meeteren. (2019). “Engaging with ‘Urban Science’.” Urban Geography
40(4), 555–64. https://doi.org/10.1016/j.softx.2020.100525.
9. Kandt, J., and M. Batty. (2021). “Smart Cities, Big Data and Urban Policy: Towards Urban
Analytics for the Long Run.” Cities 109, 102992. https://doi.org/10.1016/j.cities.2020.102992.
10. Majic, I., and E. Pafka. (2019). “AwaP-IC—An Open-Source GIS Tool for Measuring Walkable
Access.” Urban Science 3(2), 48. https://doi.org/10.3390/urbansci3020048.