Identifying Urban Form Typologies in Seoul With Mixture Model Based
All content following this page was uploaded by Steven Jige Quan on 20 October 2020.
a. City Energy Lab, Graduate School of Environmental Studies, Seoul National University, South
Korea
b. Environmental Planning Institute, Seoul National University, South Korea
c. Artificial Intelligence Institute, Seoul National University, South Korea
*. Corresponding author. Graduate School of Environmental Studies, Seoul National University,
Seoul, South Korea. Email: [email protected] | [email protected]
Abstract
Seoul has a very diverse urban form due to its complex urban development history. Previous studies
that identified typical urban forms in Seoul were limited by the subjectivity of their
expert knowledge-based methods. An alternative approach in this field is data-driven clustering, but the
k-means method widely used in the literature makes strong cluster assumptions that rarely hold in the urban
environment. This study addresses those issues by applying the Gaussian Mixture Model (GMM)
method to identify urban form typologies in Seoul. The GMM method estimates the statistical
probability of cluster memberships and is considered better able to capture the underlying structure
of a complex problem such as urban form clustering in Seoul. This study used a 500 m x 500 m grid to
cover all of Seoul and calculated ten urban form attributes in each cell as learning features for the GMM.
Through clustering processes implemented in R, nine urban form typologies were identified to
represent the complex urban form in Seoul. For each typology, a representative area was identified and
further analyzed. A comparison of these findings with a previous study in Seoul suggested that this
study produced better typology identification results.
Keywords
Urban morphology, Typical urban form, Unsupervised learning, Gaussian Mixture Model, Urban
form attributes, Seoul
1. Introduction
Seoul is the largest city in South Korea and one of the major megacities in East Asia. During the rapid
urbanization process, urban areas have been developed with diverse patterns in the city, forming a
highly hybrid urban form in Seoul today. These urban form patterns relate to their diverse socio-
economic, cultural, historical, and regulatory characteristics. These differences receive a lot of
attention in urban development policymaking to provide more targeted and effective policies in Seoul.
A few studies were conducted by the Seoul Development Institute (SDI) to understand different types
Preprint submitted to proceedings of ISUF 2020 Cities in the 21st Century
of urban patterns in Seoul, including the typical urban block study in global cities in 2003 (Kim and
Seoul Development Institute, 2003), and the urban form typology study in Seoul in 2009 (Seoul
Development Institute, 2009). The former identified eight typical urban blocks in Seoul based on the
expertise of the researchers, and the latter classified Seoul into seven form typologies based on the
location, elevation, and development method of each urban area, together with expert interpretation and
judgment. While these studies provided insights into the urban form structure of Seoul, they relied
heavily on the researchers’ professional experience and personal judgment and were therefore subject
to criticism for lacking scientific rigor. Improving the identification of urban pattern
types in Seoul requires a clearer research goal and a more detailed method for identifying
urban typologies.
Urban typology has been one important concept in urban morphology to understand the spatial
structure and the evolution of urban development (Hill and Evans, 1972; Yiftachel, 1989; Moudon,
1994). By representing a complex urban environment as typical forms, the typology approach allows
researchers to focus on a greatly reduced number of cases, and planners and designers can better learn
from the details of these typologies to apply to their practices.
Definitions of urban typologies differ greatly with the goal of the research. A common definition is
called typo-morphology in the field of urban morphology (Kropf, 2009). This approach is applied to
understand the built environment as an integral of space and building types and the historical process
of its formation in urban morphology studies (Kropf, 2009; Oliveira, 2016). Besides this definition in
urban morphology, in other fields of urban studies, scholars identified urban typologies in urban
history (Groth, 2004; Whitehand et al., 2007; Hasegawa, 2008; Gu et al., 2008; Kubat, 2010), urban
sociology (Vaughan et al., 2005), urban climate (Stewart and Oke, 2012; Huang and Li, 2017), energy
(Sattrup et al., 2013; Rode et al., 2014; Huang and Li, 2017; Quan et al., 2020), and transportation
(Kenworthy and Hu, 2002; Shim et al., 2006; Frank et al., 2008).
The identification of urban typologies was achieved using quite different methods in the literature.
These methods can be categorized into two general groups: expert knowledge-based, and data-driven.
In studies adopting the expert knowledge-based approach, urban form typologies were often decided
according to expert knowledge. The previous urban typology studies in Seoul were of this type. While
the typologies resulting from this approach are often intuitive to interpret, most of them are subjective
and dependent on researchers’ expertise. When the urban environment is very complex with a large
amount of relevant information available, it becomes difficult for researchers to digest the information
and provide synthesized solutions (Alexander, 1965).
Compared to the expert knowledge-based approach, the data-driven approach is advantageous because
of its objectivity, scalability, and generalizability. The quantitative methods in this approach are
mostly based on available data and are therefore less subjective. As a result, they can be applied to areas
at different scales, constrained mainly by computational power, and can also be used in
different locations. Most data-driven studies in this field adopted unsupervised learning
methods of various types, among which the k-means method was widely used. Song and Knaap
(2007) adopted the k-means method to characterize neighborhoods in Portland. Gil et al. (2012)
applied the k-means method with 25 urban attributes to classify block types in a neighborhood.
Schirmer and Axhausen (2019) also tested the k-means method to characterize urban form in a multiscale
manner. Judging by their reported results, these studies were generally able to identify distinctive urban
typologies. However, the k-means method assumes that all clusters are equally sized and all
attributes have the same variance, and it assigns each sample to exactly one cluster (a hard assignment).
In real urban areas with complex urban forms, those assumptions do not hold most of the time. These
theoretical issues make the k-means method less appropriate for identifying urban typologies and
render the results of previous studies less rigorous and less representative of complex urban forms.
Among the unsupervised learning methods, the Gaussian Mixture Model (GMM) method is
considered a strong candidate to resolve those challenges. The GMM can be seen as a soft version of
k-means and assumes Gaussian distributions for attributes in each cluster. Unlike the k-means
method, the GMM method can identify clusters that differ in their size, density, and variance using the
expectation-maximization approach. Because of those advantages, studies across different fields have
widely used the GMM method (Chen et al., 2011; Tao et al., 2016; Mohamed et al., 2016). In
comparison, such a method was rarely applied in urban typology studies. This paper aims to fill this
gap and propose a GMM approach to identify urban typologies in Seoul to understand its
morphological structure better. The identified urban typologies through unsupervised learning are also
compared with those from the previous SDI studies based on expert knowledge and judgment to
highlight the advantages and shortcomings of both methods.
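The contrast between k-means and the GMM described above can be written out explicitly. The notation below (samples $x_i$, cluster means $\mu_j$, covariance matrices $\Sigma_j$, mixing coefficients $\pi_j$, membership probabilities $\gamma_{ij}$) is standard textbook notation introduced here for illustration; it is not taken from the paper.

```latex
% k-means: hard assignment, minimising the within-cluster squared distance
J = \sum_{i=1}^{n} \min_{1 \le j \le k} \lVert x_i - \mu_j \rVert^2

% GMM: the density of each sample mixes k Gaussian components
p(x_i) = \sum_{j=1}^{k} \pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j),
\qquad \pi_j \ge 0, \quad \sum_{j=1}^{k} \pi_j = 1

% soft assignment: posterior probability that sample i belongs to component j
\gamma_{ij} = \frac{\pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)}
                   {\sum_{l=1}^{k} \pi_l \, \mathcal{N}(x_i \mid \mu_l, \Sigma_l)}
```

If every $\Sigma_j$ is restricted to the same spherical matrix $\sigma^2 I$ with equal $\pi_j$, and $\sigma^2$ shrinks toward zero, each $\gamma_{ij}$ approaches a 0/1 indicator of the nearest mean, which is the k-means assignment. Allowing $\pi_j$ and $\Sigma_j$ to differ is what lets the GMM represent clusters of different size and spread.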
2. Methodology
2.2. Data Collection
This study collected the GIS data about buildings and streets provided by the National Geographic
Information Institute (NGII) and the building construction data and plot data provided by the National
Spatial Data Infrastructure Portal (NSDIP). All the data is for the year 2018.
The choice of urban form attributes was based on studies on fundamental elements of urban
morphology (Conzen, 1960; Moudon, 1997; Caniggia and Maffei, 2001; Oliveira, 2016). A total of ten
attributes were selected to measure the three types of fundamental elements: buildings, streets, and
plots, with reference to previous studies (Frank et al., 2008; Schirmer and Axhausen, 2019; Berghauser
Pont et al., 2019). Four attributes were used to quantify buildings: building footprint area, the number of
buildings, and average building height. Three attributes were used for streets: average street length, average
street width, and the number of street intersections. The other three attributes were adopted to measure
properties related to plots: the number of plots, average plot size, and the compactness coverage ratio,
defined as total building footprint area / total plot size. All urban form attributes were calculated for
each grid cell by geoprocessing in ArcGIS 10.2. Table 1 provides the details of those attributes.
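As a rough illustration of the plot-related attributes described above, the following Python sketch computes the number of plots, the average plot size, and the coverage ratio for a single grid cell. The input format (plain lists of areas in square metres) and the function name `plot_attributes` are hypothetical; the study derived these values by geoprocessing GIS layers, not with this code.

```python
from statistics import mean


def plot_attributes(plot_areas_m2, building_footprints_m2):
    """Return plot-related attributes for one grid cell.

    Both arguments are lists of areas in square metres. This input format is
    an assumption made for the example, not the study's data structure.
    """
    total_plot = sum(plot_areas_m2)
    total_footprint = sum(building_footprints_m2)
    return {
        "n_plots": len(plot_areas_m2),
        "avg_plot_size": mean(plot_areas_m2) if plot_areas_m2 else 0.0,
        # Coverage ratio = total building footprint area / total plot size.
        "coverage_ratio": total_footprint / total_plot if total_plot else 0.0,
    }


# Toy cell with three plots and three buildings (made-up numbers).
cell = plot_attributes([400.0, 600.0, 1000.0], [200.0, 300.0, 500.0])
print(cell)
```

Cells without any plots are given zeros here so that every cell of the grid still yields a complete feature vector; how the study treated such cells is not stated.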
2.4. Method
The GMM method was adopted to cluster urban forms and identify urban typologies in this study. The
GMM assumes that the sample set is drawn from a mixture of k Gaussian distributions, each of which
is called a component. Each component has its own mean vector, covariance matrix,
and mixing coefficient. The Expectation-Maximization algorithm is applied to estimate the components
from the sample set (Rasmussen, 2000). First, in the expectation step, the probability that each sample
belongs to each component is computed from the current parameter estimates, which gives a soft
clustering of the sample set. Second, in the maximization step, the k mean vectors, covariance matrices,
and mixing coefficients are re-estimated from those membership probabilities. The two
steps repeat until the log-likelihood no longer improves. The Bayesian Information Criterion (BIC), a
penalized log-likelihood, is used to measure the performance of the converged model. The algorithm was
applied to the study area several times with different values of k, the number of targeted clusters. The final
k value and its corresponding model were chosen based on the BIC value upon convergence in
each run (Scrucca et al., 2016). The mclust 5 package for R was used to implement the Gaussian Mixture
Model in RStudio 1.2.5. The output from running the algorithm using the package was joined to the
GIS grid data for further analysis and visualization.
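The procedure above can be sketched in a few lines of Python. This is a simplified, univariate illustration and not the mclust implementation used in the study: it fits one attribute at a time with unconstrained component variances, whereas mclust handles multivariate data and several covariance structures. The function names, the quantile-based initialisation, the variance floor, and the synthetic data are all assumptions made for the example. The BIC follows the mclust sign convention (2 x log-likelihood minus the number of parameters times log n), so larger values are preferred.

```python
import math
import random


def normal_pdf(x, mu, var):
    """Density of a univariate Gaussian with mean mu and variance var."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(xs, pis, mus, variances):
    """Log-likelihood of the samples under the mixture."""
    return sum(
        math.log(sum(p * normal_pdf(x, m, v) for p, m, v in zip(pis, mus, variances)))
        for x in xs
    )


def fit_gmm_1d(xs, k, max_iter=500, tol=1e-8, min_var=1e-3):
    """Fit a k-component univariate GMM with EM.

    Returns (mixing coefficients, means, variances, log-likelihood).
    """
    n = len(xs)
    ordered = sorted(xs)
    # Illustrative start: means at evenly spaced quantiles, pooled variance.
    mus = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(xs) / n
    pooled_var = max(sum((x - grand_mean) ** 2 for x in xs) / n, min_var)
    variances = [pooled_var] * k
    pis = [1.0 / k] * k
    previous = log_likelihood(xs, pis, mus, variances)
    for _ in range(max_iter):
        # E-step: membership probability of each sample in each component.
        resp = []
        for x in xs:
            weighted = [p * normal_pdf(x, m, v) for p, m, v in zip(pis, mus, variances)]
            total = sum(weighted)
            resp.append([w / total for w in weighted])
        # M-step: re-estimate means, variances, and mixing coefficients.
        for j in range(k):
            n_j = max(sum(r[j] for r in resp), 1e-12)
            mus[j] = sum(r[j] * x for r, x in zip(resp, xs)) / n_j
            spread = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, xs)) / n_j
            variances[j] = max(spread, min_var)  # floor avoids degenerate components
            pis[j] = n_j / n
        current = log_likelihood(xs, pis, mus, variances)
        if current - previous < tol:  # stop when the log-likelihood stalls
            break
        previous = current
    return pis, mus, variances, current


def bic(loglik, k, n):
    """BIC in the mclust convention: larger is better."""
    n_params = 3 * k - 1  # k means + k variances + (k - 1) free mixing coefficients
    return 2.0 * loglik - n_params * math.log(n)


# Synthetic data: two well-separated groups (not the Seoul attributes).
random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(150)] + [random.gauss(8.0, 1.0) for _ in range(150)]

fits, scores = {}, {}
for k in range(1, 5):
    fits[k] = fit_gmm_1d(xs, k)
    scores[k] = bic(fits[k][3], k, len(xs))

best_k = max(scores, key=scores.get)
print(best_k, [round(m, 2) for m in fits[best_k][1]])
```

The loop over k mirrors the model-selection step: each candidate k is fitted to convergence, and the fitted models are then compared by BIC.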
3. Results
Concerning the attribute values, Cluster 3 has the largest number of buildings, total building footprint
area, coverage ratio, and number of plots, suggesting high-density development. In comparison,
those values in Cluster 5 and Cluster 6 are the lowest, indicating low-density development. Besides
those extreme clusters, Cluster 2 seems to have the tallest buildings on average, and Cluster 6 is
characterized by narrow streets. Figure 1 shows that most clusters seem to have distinctive distribution
patterns for the total building footprint area, average building height, and average
street width. However, for other attributes, only a few clusters can be well distinguished from others.
Table 2. Nine clusters and mean values of urban form attributes in each cluster in Seoul, resulting from
GMM clustering.
Each of the nine clusters is named after its representative typology. Cluster 1 is represented by the
Gridded Low-rise typology, which has grid-pattern streets and generally low-rise buildings. Cluster 2 is
defined as Apartment, which has slab-shaped apartment buildings. Cluster 3 is named Vernacular
Low-rise, with irregular street patterns and low-rise buildings. Cluster 4 is Mixed Low-rise, with diverse
building footprint areas. Cluster 5 is represented by the Mixed Sparsely Built typology, which is similar
to Cluster 9 but has diverse building footprint areas. Cluster 6 is named Undeveloped, with very little
development. Cluster 7 is defined as Compact Low-rise, characterized by low-rise buildings and small
plots. Cluster 8 is defined as Mixed Apartment, with a mixture of high-rise apartment complexes and
other buildings, and it generally contains big plots. Cluster 9 is typified by the Sparsely Built typology,
which has low building density.
Figure 1. Mixture component densities of urban form attributes in Seoul from GMM clustering.
[Figure panel labels: Cluster 1, Gridded Low-rise; Cluster 2, Apartment; Cluster 3, Vernacular Low-rise; Cluster 4, Mixed Low-rise; Cluster 5, Mixed Sparsely Built]
First, the typologies form apparent spatial clusters, especially the Gridded Low-rise, the Apartment, the
Vernacular Low-rise, and the Mixed Low-rise. Second, the city center is dominated by the Vernacular
Low-rise typology, representing the historical development, surrounded by Gridded Low-rise areas.
Furthermore, certain typologies are often located near topographic features. Specifically, the
Mixed Sparsely Built typology often coexists with rivers, and the Sparsely Built and Undeveloped
typologies are mostly found in mountainous areas. These observations are in line with some of the characteristics of
urban development in Seoul: rapid and large-scale urbanization, historical development, and
interactions between urban development and natural environments.
4. Discussions
The clustering results using the GMM method were further compared with the SDI 2009 research
outcome to understand the performance of the two typology classification methods. Table 3 shows the
percentage of matches between different typologies in the two studies. Also, a manual check was
conducted for randomly chosen urban areas for each typology in the two studies to validate their
classification results. The comparison and manual validation led to two main discussion points.
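A matching table of this kind can be produced by cross-tabulating the two labelings of the same grid cells. The sketch below is one plausible way to do so, not the authors' procedure: it assumes every cell carries one label from each study and normalises within each SDI typology, while the paper does not state which direction of normalisation Table 3 uses. The function name `match_percentages` and the toy labels are hypothetical.

```python
from collections import Counter, defaultdict


def match_percentages(labels_this_study, labels_sdi):
    """For each SDI typology, the percentage of its cells in each typology of this study."""
    if len(labels_this_study) != len(labels_sdi):
        raise ValueError("both label lists must describe the same grid cells")
    counts = defaultdict(Counter)
    for ours, sdi in zip(labels_this_study, labels_sdi):
        counts[sdi][ours] += 1
    table = {}
    for sdi, row in counts.items():
        total = sum(row.values())
        table[sdi] = {ours: 100.0 * c / total for ours, c in row.items()}
    return table


# Toy labels for four cells (made up; not results from either study).
gmm_labels = ["Vernacular Low-rise", "Gridded Low-rise", "Gridded Low-rise", "Sparsely Built"]
sdi_labels = [
    "Organic Vernacular Residential Area",
    "Organic Vernacular Residential Area",
    "Downtown",
    "Hill Area",
]
table = match_percentages(gmm_labels, sdi_labels)
for sdi_name, row in table.items():
    print(sdi_name, row)
```

A row that spreads its percentage over several columns indicates an SDI typology that mixes several data-driven typologies.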
The first observation is that the two studies used quite different sets of typology names, reflecting
their underlying clustering criteria. This study named the typologies based on the specific
characteristics according to the ten urban form attributes, which well separate different
urban forms. In comparison, the SDI study seemed to define typologies based on different sets of
attributes, which resulted in typologies that are not mutually exclusive. The overlap was
evident in its final typology mapping. For example, the Downtown typology defined by the SDI
study contains urban patterns that also belong to other typologies, with the exception of the Hill Area.
This is because the downtown of Seoul has experienced a long and complex development process, and
as a result, its different parts follow quite different urban patterns. Also, because of its classification
scheme, the SDI study did not separately identify areas with little development, which form an important
typology from the urban morphology point of view.
The second reflection upon the comparison between the two studies is that the SDI study, following
expert knowledge, correctly identified typical areas, but when applied at a larger scale, it did not
reach the same accuracy level according to the manual validation through random checks.
Furthermore, the expert knowledge-based approach did not perform well when the target urban area was
not clearly one typology or another but rather an urban pattern in between. This is evident in the many
areas misclassified as the Organic Vernacular Residential Area typology that actually have grid street
patterns.
These issues highlight the importance of a structured, formalized, and objective method to identify
urban form typologies to support better urban development management. The comparison and
validation suggest that the GMM method used in this study outperforms the SDI study based on
expert knowledge, and this GMM method has more potential to become a theoretically sound and
practically useful method for urban form classification.
Table 3. Matching identified urban form typologies between this study and the 2009 SDI study.
Column headings (typologies from this study): Gridded Low-rise | Apartment | Vernacular Low-rise | Mixed Low-rise | Mixed Sparsely Built | Undeveloped | Compact Low-rise | Mixed Apartment | Sparsely Built
Row headings: SDI typologies
5. Conclusions
Classifying the complex urban environment into urban form typologies has been explored in practice to
support real urban development policymaking in Seoul. The SDI studies provided many insights into
the spatial structure of Seoul. However, like many other urban form typology studies, the SDI studies
were based on expert knowledge and therefore had strong subjective elements. Another
school of urban form typology studies used data-driven clustering to achieve
objectivity, scalability, and generalizability. But most of the studies in this school adopted the k-means
method, which makes strong assumptions about the properties of the target clusters that often do not
hold in real urban environments.
This study aims to bridge this gap by applying the GMM method, a soft version of k-means that
estimates the statistical probability of cluster memberships. This method is widely used across different
fields. It models a given dataset as a certain number of clusters with attributes having Gaussian
distributions, which provides a better approximation of urban form than the k-means method. The
GMM method was applied to the city of Seoul. Ten attributes were defined to represent urban form
following previous studies for clustering. The GMM method identified nine urban form typologies
that were considered to best represent the complex urban form in Seoul: Mixed Apartment,
Undeveloped, Compact Low-rise, Gridded Low-rise, Vernacular Low-rise, Sparsely Built, Apartment,
Mixed Low-rise, and Mixed Sparsely Built. The representative urban areas and the spatial
distributions of these typologies are shown in Figure 2 and Figure 3. These typologies and their spatial
locations offer a deeper understanding of the urban form and spatial structure of Seoul which formed
over a long and complex development process.
The results from this study were further compared to those from the 2009 SDI study which was based
on expert knowledge. Through matching and manual checking, it was evident that this study
outperforms the SDI study in the rigor of the typology definitions, the correctness of
classification, and the representativeness of identified typologies. These findings suggest that the data-
driven approach, especially the GMM method, is potentially a better method to identify urban form
typologies to support more targeted urban development policymaking. From a developmental point of
view, different urban form typologies were formed over complex processes often governed by
socioeconomic, cultural, and regulatory contexts. A typology-oriented urban policy can be more
tailored and adapted to local characteristics to improve its effectiveness. At the same time, reasonably
identified urban form typologies can help the urban planners and designers to better understand the
main issues in current complex urban development with a succinct representation. These typologies
can provide prototyped examples as references in urban planning and design practices.
There are still issues to be further addressed in this study, including the choice of urban form
attributes, the assumption of Gaussian distribution of those attributes, and the potential integration of
the data-driven method and expert knowledge-based method. These topics will be examined in future
studies.
Acknowledgment
This work was supported by the Creative-Pioneering Researchers Program through Seoul National
University (SNU), the National Research Foundation of Korea (NRF) grant funded by the Korea
government (Ministry of Science and ICT) (No. 2018R1C1B5043758) and the Seoul National
University AI Institute through the Data Science Research Project 2018.
References
Alexander, C. (1965) ‘A city is not a tree’, in Larice, M., and Macdonald, E. (eds.) The urban design
reader (Routledge, London) 152-166.
Berghauser Pont, M., Stavroulaki, G., Bobkova, E., Gil, J., Marcus, L., Olsson, J. and Legeby, A.
(2019) ‘The spatial distribution and frequency of street, plot and building types across five European
cities’, Environment and Planning B: Urban Analytics and City Science 46(7), 1226-1242.
Caniggia, G. and Maffei, G. C. (2001) Architectural composition and building typology: interpreting
basic building (Alinea, Firenze).
Chen, Z., and Ellis, T. (2011) ‘Self-adaptive Gaussian mixture model for urban traffic monitoring
system’, 2011 IEEE International Conference on Computer Vision Workshops, 1769-1776.
Frank, L., Bradley, M., Kavage, S., Chapman, J. and Lawton, T. K. (2008) ‘Urban form, travel time,
and cost relationships with tour complexity and mode choice’, Transportation 35(1), 37-54.
Gil, J., Beirão, J. N., Montenegro, N. and Duarte, J. P. (2012) ‘On the discovery of urban typologies:
data mining the many dimensions of urban form’, Urban morphology 16(1), 27-40.
Gu, K., Tian, Y., Whitehand, J. W. R. and Whitehand, S. M. (2008) ‘Residential building types as an
evolutionary process: the Guangzhou area, China’, International Seminar on Urban Form.
Hasegawa, J. (2008) ‘The reconstruction of bombed cities in Japan after the Second World
War’, Urban Morphology 12(1), 11-24.
Hill, J. N. and Evans, R. K. (1972) ‘A model for classification and typology’ in Clarke, D. L.
(ed.) Models in archaeology (Routledge, London, pp. 231-273).
Huang, K. T., and Li, Y. J. (2017) ‘Impact of street canyon typology on building’s peak cooling energy
demand: A parametric analysis using orthogonal experiment’, Energy and Buildings 154, 448-464.
Kenworthy, J., and Hu, G. (2002) ‘Transport and urban form in Chinese cities: an international
comparative and policy perspective with implications for sustainable urban transport in China’, disP-
The Planning Review 38(151), 4-14.
Kim, K. and Seoul Development Institute (2003) International Urban Form Study: Development
Pattern and Density of Selected World Cities (Seoul Development Institute, Seoul).
Kropf, K. (2009) ‘Aspects of urban form’, Urban morphology 13(2), 105-120.
Kubat, A. S. (2010) ‘The study of urban form in Turkey’, Urban Morphology 14(1), 31-48.
Mohamed, K., Côme, E., Oukhellou, L., and Verleysen, M. (2016) ‘Clustering smart card data for
urban mobility analysis’ IEEE Transactions on intelligent transportation systems 18(3), 712-728.
Moudon, A. V. (1994) ‘Getting to know the built landscape: typomorphology’, Type and the Ordering
of Space.
National Geographic Information Institute (2018), the geospatial information data of building and
street, available at: https://www.ngii.go.kr/kor/content.do?sq=237 (Accessed: 25 September 2020).
National Spatial Data Infrastructure Portal (2018), the geospatial information of plot and the
information of building construction, available at: http://data.nsdi.go.kr/organization/a05016
(Accessed: 25 September 2020).
Oliveira, V. (2016) Urban morphology: an introduction to the study of the physical form of cities
(Springer, Berlin).
Perry, C. (1929) ‘The neighborhood unit’, in Larice, M., and Macdonald, E. (eds.) The urban design
reader (Routledge, London) 78-89.
Quan, S. J., Economou, A., Grasl, T. and Yang, P. P. (2020) ‘An exploration of the relationship between
density and building energy performance’, Urban Design International 25(1), 92-112.
Rasmussen, C. E. (2000) ‘The infinite Gaussian mixture model’, In Advances in neural information
processing systems 554-560.
Schirmer, P. M., and Axhausen, K. W. (2019) A multiscale clustering of the urban morphology for use
in quantitative models. In The Mathematics of Urban Morphology (Birkhäuser, Cham, pp. 355-382).
Scrucca, L., Fop, M., Murphy, T. B. and Raftery, A. E. (2016) ‘mclust 5: clustering, classification and
density estimation using Gaussian finite mixture models’, The R journal 8(1), 289.
Seoul Development Institute (2009) Urban Form Study of Seoul (Seoul Development Institute
publish, Seoul).
Shim, G. E., Rhee, S. M., Ahn, K. H. and Chung, S. B. (2006) ‘The relationship between the
characteristics of transportation energy consumption and urban form’ The Annals of Regional Science
40(2), 351-367.
Song, Y., and Knaap, G. J. (2007) ‘Quantitative classification of neighborhoods: The neighborhoods
of new single-family homes in the Portland Metropolitan Area’ Journal of Urban Design 12(1), 1-24.
Stewart, I. D., and Oke, T. R. (2012) ‘Local climate zones for urban temperature studies’ Bulletin of
the American Meteorological Society 93(12), 1879-1900.
Tao, J., Shu, N., Wang, Y., Hu, Q. and Zhang, Y. (2016) ‘A study of a Gaussian mixture model for
urban land-cover mapping based on VHR remote sensing imagery’, International Journal of Remote
Sensing 37(1), 1-13.
Vaughan, L., Clark, D. L. C., Sahbaz, O. and Haklay, M. (2005) ‘Space and exclusion: does urban
morphology play a part in social deprivation?’, Area 37(4), 402-412.
Whitehand, J. W. R., and Gu, K. (2007) ‘Extending the compass of plan analysis: a Chinese
exploration’, Urban Morphology 11(2), 91.
Yeo, H.J., and Byun, M. R. (2010) Seoul Neighborhood Spatial Pattern Study (Seoul Development
Institute, Seoul).
Yiftachel, O. (1989) ‘Towards a new typology of urban planning theories’, Environment and Planning
B: Planning and Design 16(1), 23-39.