
European Journal of Remote Sensing

ISSN: (Print) (Online) Journal homepage: https://www.tandfonline.com/loi/tejr20

Urban land-use classification using machine learning classifiers: comparative evaluation and post-classification multi-feature fusion approach

Yashon O. Ouma, Amantle Keitsile, Boipuso Nkwae, Phillimon Odirile, Ditiro Moalafhi & Jiaguo Qi

To cite this article: Yashon O. Ouma, Amantle Keitsile, Boipuso Nkwae, Phillimon Odirile, Ditiro Moalafhi & Jiaguo Qi (2023) Urban land-use classification using machine learning classifiers: comparative evaluation and post-classification multi-feature fusion approach, European Journal of Remote Sensing, 56:1, 2173659, DOI: 10.1080/22797254.2023.2173659

To link to this article: https://doi.org/10.1080/22797254.2023.2173659

© 2023 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.

Published online: 22 Feb 2023.


Full Terms & Conditions of access and use can be found at https://www.tandfonline.com/action/journalInformation?journalCode=tejr20
EUROPEAN JOURNAL OF REMOTE SENSING
2023, VOL. 56, NO. 1, 2173659
https://doi.org/10.1080/22797254.2023.2173659

Urban land-use classification using machine learning classifiers: comparative evaluation and post-classification multi-feature fusion approach

Yashon O. Ouma^a, Amantle Keitsile^a, Boipuso Nkwae^a, Phillimon Odirile^a, Ditiro Moalafhi^b and Jiaguo Qi^c

^a Department of Civil Engineering, University of Botswana, Gaborone, Botswana; ^b Faculty of Natural Resources, BUAN, Gaborone, Botswana; ^c Center for Global Change and Earth Observations, Michigan State University, East Lansing, Michigan, USA

ABSTRACT

Accurate spatial-temporal mapping of urban land-use and land-cover (LULC) provides critical information for planning and management of urban environments. While several studies have investigated the significance of machine learning classifiers for urban land-use mapping, the determination of the optimal classifiers for the extraction of specific urban LULC classes in time and space is still a challenge, especially for multitemporal and multisensor data sets. This study presents the results of urban LULC classification using decision tree-based classifiers comprising gradient tree boosting (GTB) and random forest (RF), in comparison with support vector machine (SVM) and multilayer perceptron neural networks (MLP-ANN). Using Landsat data from 1984 to 2020 at 5-year intervals for the Greater Gaborone Planning Area (GGPA) in Botswana, RF was the best classifier with an overall average accuracy of 92.8%, followed by MLP-ANN (91.2%), SVM (90.9%) and GTB (87.8%). To improve on the urban LULC mapping, the study presents a post-classification multiclass fusion of the best classifier results based on the principle of feature in-feature out (FEI-FEO) under mutual exclusivity boundary conditions. Through classifier ensemble, the FEI-FEO approach improved the overall LULC classification accuracy by more than 2%, demonstrating the advantage of post-classification fusion in urban land-use mapping.

ARTICLE HISTORY
Received 19 August 2022; Revised 21 January 2023; Accepted 24 January 2023

KEYWORDS
Urban land-use land-cover; Gradient tree boosting; Random forest; Support vector machine; Multilayer perceptron neural networks; Post-classification feature fusion

Introduction

Within the urban ecosystems, the determination and analysis of land-use and land-cover (LULC) and LULC change plays an important role in providing critical input to the decision-making process for environmental planning and ecological management (Dwivedi et al., 2005; Fan et al., 2007). For urban LULC mapping and change detection, remote sensing data provides the optimal spatial and temporal data sources. However, the extraction of urban features is often a challenging task due to the high degree of interactions and complexities within the features in terms of their spectral, spatial and textural properties (Blaschke et al., 2014). Due to these factors, the applications of traditional pixel-based classifiers in urban LULC mapping often lead to unsatisfactory results (Johnson & Xie, 2013; Myint et al., 2011).

To overcome the drawbacks in pixel-based classifications, Blaschke et al. (2014) proposed the geographic object-based image analysis (GEOBIA), focusing on the segmentation of very high-spatial resolution (VHR) image data. Using GEOBIA segmentation, pixels are grouped into similar and semantically independent image segments or objects for feature extraction and classification. For VHR image data, GEOBIA has been reported to perform better than pixel-based approaches in urban LULC mapping (Blaschke et al., 2014; Drăgut et al., 2010; Johnson & Xie, 2013; Jozdani et al., 2018). The GEOBIA segmentation approach does not, however, yield good results for medium- and low-resolution remote sensing data.

At medium spatial resolutions, therefore, methods comprising unsupervised algorithms, parametric supervised and machine learning methods have been widely used for LULC mapping (Friedl & Brodley, 1997; Halder et al., 2011; Li et al., 2016; Orieschnig et al., 2021; Waske & Braun, 2009; Wu et al., 2019). The supervised classifiers comprise maximum likelihood classifier, Mahalanobis distance, k-nearest neighbors (kNN), support vector machine (SVM), random forest (RF), decision trees (DT), spectral angle mapper (SAM), fuzzy logic, fuzzy adaptive resonance theory-supervised predictive mapping (Fuzzy-ARTMAP), radial basis function (RBF), artificial neural networks (ANN) and naive Bayes (NB) (Ma et al., 2019; Shih et al., 2019). The unsupervised classifiers include, among others, fuzzy c-means, k-means algorithm, affinity propagation clustering algorithm and ISODATA techniques (Maxwell et al., 2018).

Urban LULC mapping is data intensive, requiring both current and historical remote sensing data. To improve on the urban LULC mapping, machine learning (ML) and artificial intelligence classifiers have

CONTACT Yashon O. Ouma [email protected] Department of Civil Engineering, University of Botswana, Private Bag UB 0061 Gaborone, Botswana
This article has been corrected with minor changes. These changes do not impact the academic content of the article.
© 2023 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
been preferred (Mao et al., 2020; Lefulebe et al., 2022). In general, the application of ML algorithms for LULC mapping has attracted considerable research interest (Maxwell et al., 2018; Wang et al., 2022). This is mainly because ML algorithms do not require hypotheses on the input data distribution and tend to yield better results than the traditional parametric classifiers (Jozdani et al., 2018; L. Yu et al., 2014; Nery et al., 2016). Different ML algorithms have been used for urban LULC mapping and modeling (e.g. C. Zhang et al., 2019; Mao et al., 2020; Talukdar et al., 2020; Teluguntla et al., 2018), and have also been compared (Camargo et al., 2019; Li et al., 2016; Rogan et al., 2008). However, each ML algorithm will yield different accuracy levels for specific case studies and data. Further, in addition to the quality and quantity of the imagery, the choice of a suitable ML classifier is still a challenge as the hyperparameters of the classifiers influence the quality of the feature extractions (Lu & Weng, 2004; Nichols et al., 2019; Thanh Noi & Kappas, 2017).

To determine the suitable and most accurate ML approaches for urban LULC modeling, several studies have compared different classifiers based on their overall accuracy, and not in terms of their mathematical and functional approach (e.g. Camargo et al., 2019; Jamali, 2019; Li et al., 2016; Rogan et al., 2008). Comparing RF, SVM, naïve Bayes, and kNN machine learning classifiers for mapping urban areas in the city of Cape Town, Lefulebe et al. (2022) found all the classifiers to have an accuracy greater than 91%, with kNN being the best classifier at 96.54% accuracy with a kappa of 0.95. Kranjčić et al. (2019) classified green infrastructure in Varaždin and Osijek cities in Croatia from Sentinel-2 satellite imagery using SVM, RF, ANN, and naïve Bayes classifiers and found SVM to yield the highest accuracy of 87% with kappa coefficients of 0.89. Shi et al. (2019) used multisource satellite images for urban LULC mapping in Guangzhou by integrating an ensemble of object-based classifiers, decision trees and RF and achieved an accuracy of >85%. Ha et al. (2020) also used RF to map rural urbanization in Vietnam, obtaining an accuracy of more than 90% using Landsat data. For detailed mapping of urban land use with multi-source data in the city of Lanzhou, Zong et al. (2020) used RF and attained an overall accuracy of 83.75%. From previous research, RF, SVM and ANN have been reported to provide higher overall accuracy in urban LULC modeling as compared to the traditional classification techniques (Carranza-García et al., 2019; Gong et al., 2020; Ma et al., 2019).

In addition to their inherent mathematical functionalities, the performances of the classifiers are also influenced by the data and characteristics of the land-use features within the urban landscape. For example, simple decision tree-based classifiers like Classification and Regression Trees (CART) are not only sensitive to changes in the training data sets but also tend to overfit the model (Prasad et al., 2006). On the other hand, for kNN classifiers, the setting of the ideal value of k is difficult (Naidoo et al., 2012), and the classifier is computationally complex as its effectiveness is dependent on the a priori determination of the number of neighbours (Qian et al., 2015). Naïve Bayes classifiers perform satisfactorily with small data sets; however, the output accuracy is compromised if inputs are not independent. SVM, on the other hand, is effective in high-dimensional spaces, performs adequately in situations where a clear margin of separation exists between classes and is computationally efficient. Nevertheless, SVM requires a long training time for large data sets and is neither intuitive nor easy to understand or fine-tune (Huang et al., 2002). Though widely used due to their non-parametric modeling capability and high tolerance for noisy data, ANN classifiers are highly complex, time-consuming and computationally expensive, and require large training data to produce accurate outcomes (Y. Ouma et al., 2022).

Further, while several studies have been conducted on urban LULC mapping using ML algorithms, the performance of the models cannot be replicated from one case study to another, and most studies do not focus on classifier-class performance; rather, the emphasis is on the overall LULC classification accuracy. Secondly, most of the studies are based on single-date imagery and not on multitemporal imagery with multisensor characteristics. Previous studies have also pointed out that the performance of the ML classifiers in urban LULC classifications is affected by the limitations in the spectral and spatial resolutions of the sensors, especially at lower resolutions (Yang et al., 2017). Compared to the traditional classifiers, the non-parametric ML algorithms are considered superior as they do not rely on a priori hypotheses of the input data distribution (Nery et al., 2016). However, results from different case studies have demonstrated that the performance of a given ML classifier is not only specific to the case study, but also influenced by the setup of the machine learning algorithm itself and the quality of the training data. With focus on computationally efficient open-source solutions, this study evaluates the performance of Gradient Tree Boosting (GTB) and RF, in comparison with SVM and Multilayer Perceptron Neural Networks (MLP-ANN), as implemented within the Google Earth Engine (GEE) platform for urban LULC mapping.

RF derives the optimal classification solutions by overcoming the limitations of single decision tree classifiers through robust ensemble learning (EL), majority voting and being able to handle a higher number of model variables (Belgiu & Drăguţ, 2016). GTB is
also an ensemble classifier, which, unlike RF, ensembles weak base learners with the aim of minimizing the loss function through adding newer weak learners to the ensemble (Friedman, 2002). GTB has the advantage of high-order feature information optimization, generalization and representation without scaling. Compared to other EL algorithms, for every GTB iteration, the negative gradient loss values are used to fit the residuals of the regression tree (Y. Ouma et al., 2022). SVM determines the best boundary between different training classes by transferring features to higher dimensions, performs well on high-dimensional data using dynamic kernel functions and is adaptable to different classification tasks (Pedregosa et al., 2011). Given the dense and complex nature of the input training data for urban LULC classification, MLP is considered to overcome potential classification overfitting scenarios (Mohtadifar et al., 2022). With focus on computationally efficient open-source solutions, this study evaluates the performance of the EL (GTB and RF) and non-EL (SVM and MLP) algorithms in mapping of urban LULC classes as implemented within the GEE platform. Based on their performances for urban LULC feature mapping, the results of EL and non-EL algorithms will be evaluated for a proposed feature-in feature-out (FEI-FEO) post-classification fusion approach.

For the test study area of the Greater Gaborone Planning Area (GGPA) in Botswana, the objectives and contributions of this study are to (1) implement ensemble decision tree-based (GTB and RF) classifiers and SVM and MLP-ANN machine-learning classifiers for urban LULC mapping and change detection from Landsat data from 1984 to 2020 in 5-year intervals; (2) compare the performance of the classifiers in the extraction of urban LULC classes at different multitemporal scales and with multisensor data; and (3) evaluate the significance of the FEI-FEO post-classification fusion strategy in the extraction of urban features as optimally detected from the classifiers. This study improves on our previous study reported in Y. Ouma et al. (2022) by introducing GTB and ANN, and on the optimization of classifier hyperparameters for more accurate classification.

Materials and methods

Study area

The GGPA is the main urban area in Botswana with the highest population concentration (Figure 1). GGPA lies between latitude 20° 30′S and 24° 45′S and longitude 25° 50′E and 26° 12′E (Figure 1), with an average altitude of 1,004 m AMSL. The GGPA land area is approximately 961.73 km², while the city is approximately 169 km². The spatial expansion of the city and the larger GGPA is constrained in part by the existence of the Gaborone dam, the traditional land tenure within the larger GGPA, and the topography and semi-arid climate of the area. As depicted in Figure 1, within the commuting radius of Gaborone city, dormitory suburbs are rapidly evolving, resulting in the characteristic centripetal movement of rural–urban migrations.

Data

Multisensor Landsat series data comprising Landsat-4 (L4-MSS), Landsat-5 (L5-TM), Landsat-7 (L7-ETM+) and Landsat-8 (L8-OLI) acquired from 1984 to 2020

Figure 1. Location of the test study area: Greater Gaborone Planning Area (GGPA) (reprinted with permission from Y. Ouma et al. (2022). Copyright 2022 ISPRS archives).

Table 1. Characteristics of Landsat MSS, TM, ETM+ and Landsat OLI sensors.

| Landsat sensor | Spectral wavelength (μm) | Spectral band | Spatial resolution (m) | Temporal period (5-year) |
|---|---|---|---|---|
| Landsat MSS | 0.50–0.60 | Green | 60 | 1984 |
| | 0.60–0.70 | Red | | |
| | 0.70–0.80 | Near IR1 | | |
| | 0.80–1.11 | Near IR2 | | |
| Landsat TM | 0.45–0.52 | Blue | 30 | 1990, 1995, 2000, 2005 |
| | 0.53–0.60 | Green | | |
| | 0.63–0.69 | Red | | |
| | 0.75–0.90 | Near IR1 | | |
| | 1.55–1.75 | SWIR1 | | |
| | 2.09–2.35 | SWIR2 | | |
| Landsat ETM+ | 0.45–0.52 | Blue | 30 | 2010 |
| | 0.52–0.60 | Green | | |
| | 0.63–0.69 | Red | | |
| | 0.77–0.90 | Near IR1 | | |
| | 1.55–1.75 | SWIR1 | | |
| | 2.09–2.35 | SWIR2 | | |
| Landsat OLI | 0.45–0.51 | Blue | 30 | 2015, 2020 |
| | 0.53–0.59 | Green | | |
| | 0.64–0.67 | Red | | |
| | 0.85–0.88 | Near IR2 | | |
| | 1.57–1.65 | SWIR1 | | |
| | 2.11–2.29 | SWIR2 | | |

were downloaded from the USGS Earth Explorer. The MSS pixel size of 60 m was resampled to 30 m using data fusion as implemented in Chen et al. (2017). To minimize the seasonality effects, the data sets were acquired during the same time of year. The multitemporal and multisensor Landsat imagery (Table 1) were atmospherically corrected using the ATCOR2 tool and histogram equalization in ERDAS Imagine. The time-series bands were mosaiced, composited, resampled to 30 m spatial resolution and clipped to the study area.

Training and testing data

The urban LULC classes comprised built-up (residential, commercial, industrial and impervious surfaces), water (dam water body), vegetation cover (forest, shrubs and grass) and bare soil. Depending on the season, the grass and shrubs are partially converted to croplands. The aggregation of the built-up components was adopted to minimize the spectral mixing and spatial heterogeneity of the built-up features. It was, however, possible to spectrally distinguish between the vegetation types due to their expansive areal extents and the resulting spectral homogeneity. Figure 2 presents the LULC classes and the spectral reflectance trends for the classes in the Landsat visible, NIR and SWIR bands. For each year, the training data comprised 12,500 pixels, and 7,500 pixels were used for model accuracy testing. The training and testing data samples were collected from visual identification and interpretation of the Landsat imagery, Google Earth historical imagery and the historical LULC maps.

Methods

Decision tree-based machine learning classifiers

This section presents an overview of the functional and implementational approaches for GTB, RF, SVM and MLP-ANN classifiers. To improve on the

Figure 2. False color composite LULC classes and the spectral reflectances in L8-OLI (part of figure reprinted with permission from
Y. Ouma et al. (2022). Copyright 2022 ISPRS archives).
performance accuracy of the classifiers, the optimal model hyperparameters were determined following the steps outlined in Y. O. Ouma et al. (2023).

Gradient tree boosting (GTB)

GTB aggregates an ensemble of decision trees (Figure 3). The classifier, however, confines individual trees to a weaker prediction model, hence limiting the complexity of the decision trees. The algorithm attains its classification accuracy by the iterative combination of weak learner ensembles into a stronger ensemble of trees through stepwise minimization of the loss function based on gradient descent optimization (Friedman, 2002). Different from the other ensemble classifiers, GTB fits the residuals of the regression tree at each iteration using the negative gradient values of the loss. The inter-tree correlations are reduced by constructing new trees based on stochastically selected subsets of the training data.

The GB algorithm for training and classification of a sample feature vector is summarized in the following steps:

(I) Determine the training T and testing S sets, i.e. $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, $x_i \in X \subseteq \mathbb{R}^n$, $y_i \in \{0, +1\}$, where $x_i$ denotes the feature vector of each sample and $y_i$ denotes its class label.

(II) Establish the loss function:

$$L(y, f(x)) = [y - f(x)]^2 \tag{1}$$

where $f(x)$ is the fitting function of $y$.

(III) Initialize the model:

$$f_0(x) = \arg\min_{c} \sum_{i=1}^{N} L(y_i, c) \tag{2}$$

where $c$ is the constant that minimizes the loss function $L(y, c)$ and represents a tree with a single root node.
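Steps (I) to (III) can be illustrated with a minimal Python sketch. It assumes the squared loss of Eq. 1, for which the constant that solves Eq. 2 is simply the mean of the training labels. The function names and the six labels in `y_train` are hypothetical and serve only as an illustration; they are not taken from the study or from the GEE implementation.

```python
import numpy as np

def squared_loss(y, f):
    """Eq. 1: L(y, f(x)) = [y - f(x)]^2, evaluated element-wise."""
    return (np.asarray(y, dtype=float) - np.asarray(f, dtype=float)) ** 2

def initial_model(y):
    """Eq. 2 under the squared loss: the constant c that minimises
    sum_i L(y_i, c) is the mean of the labels."""
    return float(np.mean(y))

# Hypothetical class labels y_i in {0, 1} for six training pixels (assumption).
y_train = np.array([0, 1, 1, 0, 1, 1])
f0 = initial_model(y_train)

# Brute-force check of Eq. 2: scan candidate constants on a fine grid and
# confirm that none of them gives a lower total loss than f0.
grid = np.linspace(0.0, 1.0, 1001)
total_loss = np.array([squared_loss(y_train, c).sum() for c in grid])
best_on_grid = float(grid[np.argmin(total_loss)])
```

With other loss functions the initial constant would generally differ from the mean, so the closed form above holds only for the squared loss used here.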

Figure 3. Visualizing gradient tree decision boosting (reprinted with permission from Y. Ouma et al. (2022). Copyright 2022 ISPRS
archives).
(IV) For each model $m$ the following steps are executed:

(a) For the i-th sample ($i = 1, 2, \ldots, N$), calculate the negative gradient $r_{mi}$ of the loss function in the current model:

$$r_{mi} = -\left[\frac{\partial L(y_i, f(x_i))}{\partial f(x_i)}\right]_{f(x) = f_{m-1}(x)} \tag{3}$$

where $m$ is the model number, with $m = 1, 2, \ldots, M$; $M$ is the maximum value of $m$ and $J$ is the maximum value of $j$.

(b) Fit a regression tree to $r_{mi}$ and determine the leaf node areas $R_{m,j}$ ($j = 1, 2, \ldots, J$) for the mth tree.

(c) For each leaf node $j$ ($j = 1, 2, \ldots, J$), the output value $c_{m,j}$ is determined from Eq. 4:

$$c_{m,j} = \arg\min_{c} \sum_{x_i \in R_{m,j}} L(y_i, f_{m-1}(x_i) + c) \tag{4}$$

(d) To minimize the loss function and estimate the value of the leaf node area, a line search is used in this step.

(e) Update the tree according to Eq. 5:

$$f_m(x) = f_{m-1}(x) + \sum_{j=1}^{J} c_{m,j}\, I(x \in R_{m,j}) \tag{5}$$

where $I = 1$ if $x \in R_{m,j}$ and $I = 0$ otherwise.

(V) Determine the final tree model (Eq. 6):

$$\hat{f}(x) = f_M(x) = \sum_{m=1}^{M} \sum_{j=1}^{J} c_{m,j}\, I(x \in R_{m,j}) \tag{6}$$

From an ensemble of weaker models, GB creates new models such that each of the created models minimizes the loss function $L(y, f(x))$, to fit a more accurate model with improved overall accuracy. To minimize overfitting, a boosting threshold criterion based on either the achieved prediction accuracy or the maximum number of models to be created is adopted. The weak learning process of the GTB decision tree is complemented by improving the representation, optimization and generalization so as to capture the higher-order information, and is invariant to scaling of the sample data. Further, by weighting the combination scheme, GTB can avoid overfitting by fitting the residuals of the regression tree at each iteration using the negative gradient values of the loss.

Random forest (RF)

RF is made up of multiple combined decision trees (Figure 4) trained upon random subsets of the labeled samples and features (Breiman, 2001). A decision tree itself is a deterministic data structure for modeling decision rules. At each internal node of the decision tree, a feature is chosen to infer the class, determining the specific decision by splitting the incoming training samples to maximize the information gain. Similarly, each tree in the ensemble is constructed from a bootstrap sample that is drawn with replacement from the training set. The RF training process is similar to CART; however, to increase computational efficiency in RF, each tree only utilizes a random subset of features at each node to reduce correlation (Figure 4).

Let $u \in U \subseteq \mathbb{R}^q$ be an input feature vector, and $v \in V \subseteq \mathbb{R}$ be its corresponding target value for regression. For a given internal node $j$ and a set of samples $S_j \subseteq U \times V$, the information gain achieved by choosing the kth feature to split the samples in the regression problem is computed according to Eqs. 7–9:

$$I_j^k = H(S_j) - \frac{|S_{j,L}^k|}{|S_j|} H(S_{j,L}^k) - \frac{|S_{j,R}^k|}{|S_j|} H(S_{j,R}^k) \tag{7}$$

$$H(S) = \frac{1}{|V|} \sum_{v} (v - \bar{v})^2 \tag{8}$$

$$\bar{v} = \frac{1}{|V|} \sum_{v} v \tag{9}$$

where $L$ and $R$ denote the left and right child nodes, $S_{j,L}^k = \{(u, v) \in S_j \mid u_k < \theta_j^k\}$, $S_{j,R}^k = S_j \setminus S_{j,L}^k$, $u_k$ is the kth feature of the feature vector $u$, $\theta_j^k$ is the splitting threshold chosen to maximize the information gain $I_j^k$ for the kth feature $u_k$, and $|\cdot|$ is the cardinality of the set. $H(S)$ denotes the variance of all target values in the classification or regression problem.

In training the RF algorithm, the splitting $\theta_j^k$ is implemented recursively until either the information gain $I_j^k$ is insignificant or the number of training samples input into a given node is less than the preceding threshold $\theta_{j-1}^k$. The out-of-bag error is often adopted as the option for parameter tuning (Biau & Scornet, 2016). The advantage of RF is that it can produce stable, robust and accurate results even with minimal tuning of the hyperparameters. The algorithm is easy to parameterize, insensitive to over-fitting, deals with outliers in training data, and reports the classification error and variable significance. Further, RF is able to process multidimensional features from both continuous and categorical datasets. In implementing RF, predictions for classification are performed by obtaining and bagging the majority class vote from the individual tree class votes. The output classification result for a new sample is obtained through a majority voting of the individual tree results. For RF tuning, the
number of trees (nTree) and the number of variables per split (mTry) are optimized.

Figure 4. Illustration of the random forest classification structure.

Support vector machine (SVM)

The SVM model fits an optimal separating hyperplane or hyperplanes in a high-dimensional space, to create an optimal boundary between two classes that enables the prediction of labels from one or more feature vectors (Noble, 2006). The hyperplane(s) are orientated furthest from the closest data points from each of the classes. These closest points are the support vectors. Like the ensemble algorithms, SVM comprises a set of related learning algorithms for classification and regression. Given a labeled training data set:

$$T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}, \quad x_i \in \mathbb{R}^d \text{ and } y_i \in \{-1, +1\} \tag{10}$$

where $x_i$ is a feature vector representation, $y_i$ is the class label of training sample $i$ and $n$ is the number of elements in the training data set. The optimal hyperplane is defined by

$$w x^T + b = 0 \tag{11}$$

where $w$ is the weight vector, $x$ is the input feature vector and $b$ is the bias. $w$ and $b$, respectively, satisfy the following inequalities for all elements of the training set (Eqs. 12–13):

$$w x_i^T + b \geq +1, \quad \text{if } y_i = +1 \tag{12}$$

$$w x_i^T + b \leq -1, \quad \text{if } y_i = -1 \tag{13}$$

The aim of training the SVM model is to determine $w$ and $b$ so that the hyperplane separates the data and maximizes the margin $1/\|w\|^2$. Vectors $x_i$ for which $w x_i^T + b = \pm 1$ are termed support vectors, as depicted in Figure 5.

By solving the optimization task (Eq. 14), the linear SVM determines the optimal separating margin:

$$\text{minimize} \left( \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \varepsilon_i \right), \ \varepsilon_i \geq 0, \quad \text{subject to } y_i (w^T x_i + b) \geq 1 - \varepsilon_i, \ i = 1, 2, \ldots, n \tag{14}$$

where $C$ is the optimum cost parameter, $\varepsilon_i$ defines the positive slack variables, $w$ is a normal vector and $b$ is a scalar quantity. For $N_{SV}$ support vectors, $W$ becomes a linear combination of the training vectors (Eq. 15), and $b$ the average of all support vectors (Eq. 16):

$$W = \sum_{i=1}^{n} \alpha_i y_i x_i \tag{15}$$
$$b = \frac{1}{N_{SV}} \sum_{i=1}^{N_{SV}} (W x_i - y_i) \tag{16}$$

Figure 5. Maximum margin-minimum norm classifier in support vector machine with optimal hyperplane for linearly non-separable classes (reprinted with permission from Y. Ouma et al. (2022). Copyright 2022 ISPRS archives).

To expand the conventional linear SVM to the nonlinear case, $x_i$ is replaced through mapping into the feature space $\theta(x_i)$, such that $x_i^T x$ is expressed in the form $\theta(x_i)^T \theta(x)$ in the transformed feature space. The nonlinear discriminant function is expressed as in Eq. 17, where $K(x_i, x) = \langle \theta(x_i), \theta(x) \rangle$ defines the kernel function.

$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b \right) \tag{17}$$

For better performance of the SVM classifier in land-cover classification, Knorn et al. (2009) demonstrated that the RBF kernel function is preferred due to its accuracy and reliability (X. Yu et al., 2004), and it was adopted in the current study. The kernel is defined as

$$K(x, y) = \langle f(x), f(y) \rangle \tag{18}$$

where $K$ is the kernel function; $x$, $y$ are n-dimensional inputs; $f$ is used to map the input from the n-dimensional to the m-dimensional space; and $\langle x, y \rangle$ denotes the dot product.

The kernel functions are used to calculate the scalar product between two data points in a higher-dimensional space without explicitly calculating the transformation from the input space to the higher-dimensional space. The kernel therefore makes it easier to determine the inner product of two feature vectors in the high-dimensional space, which avoids the complexity of computing the feature vectors explicitly. For the RBF kernel $K_{RBF}(x, y) = \exp(-\gamma \|x - y\|^2)$, though the corresponding feature vector is infinite dimensional, the kernel computation is trivial.

In implementing the SVM classifier with the RBF kernel, there are two main determinants (Ballanti et al., 2016; Qian et al., 2015):

(i) Optimum cost parameter C, which determines the size of the allowed misclassification for spectrally overlapping training data to enable the possible adjustment of the training data. Larger C values penalize misclassification more heavily and may lead to model over-fitting, so C has to be tuned.
(ii) Kernel width parameter γ, which determines the degree of smoothing and the shape of the hyperplane dividing the classes (Melgani & Bruzzone, 2004). Increasing γ affects the shape of the class-dividing hyperplane, which may influence the accuracy of the classification results.

Multilayer perceptron neural network (MLP-ANN)

Artificial neural networks (ANN) have different topological structures including multilayer perceptron (MLP), adaptive neuro fuzzy inference system (ANFIS), generalized regression neural networks (GRNN), recurrent neural networks (RNN), and radial basis function network (RBFN). These ANN models can generally be categorized into feedforward neural networks (FFNN) and RNN. The most popular FFNN is the MLP-ANN trained with a backpropagation learning algorithm. FFNN have the advantage that, with a single or a few hidden layers and suitable activation functions, the model can approximate a complex and nonlinear system. The adopted MLP-ANN model for this study is represented in Figure 6.

In Figure 6, R = total number of inputs; z = hidden neurons; $\omega_{i,j}^{(1)}$ = weight of the first layer between the input j and the ith hidden neuron; $\omega_{i,j}^{(2)}$ = weight
of the second layer between the ith hidden neuron and the output neuron; $b_i^{(1)}$ = bias weight for the ith hidden neuron; and $b_1^{(2)}$ = bias weight for the output neuron.

Figure 6. Topology of a three-layer MLP-ANN.

Multi-classifier and multi-feature fusion approach

This study proposes a multi-classifier and multi-feature fusion based on the concept of a feature in-feature out (FEI-FEO) post-classification feature fusion approach, where the extracted features with the highest accuracy are combined under mutual exclusivity conditions. The rationale behind FEI-FEO is that when the scene information from different classifiers $\{C_1(X), C_2(X), \ldots, C_M(X)\}$ represents different parts of the scene at different accuracies, the features extracted with the highest accuracies can be combined to obtain the complete and more accurate global information for a given LULC scene. The proposed FEI-FEO based fusion approach relies on the complementary combination of the input features $\{F_{m1}, F_{m2}, F_{m3}, \ldots, F_{mn}\}$ to produce the new scene global features $(F_1, F_2, F_3, \ldots, F_n)$ with the highest extraction accuracies. The FEI-FEO feature fusion approach is empirically illustrated in Figure 7.

The FEI-FEO data fusion process addresses a set of features with the aim to improve, refine or obtain new features (Dasarathy, 1997). The advantage of the optimal FEI-FEO fusion approach lies in the fact that the optimal class is detected, and this can permit generalization of the classifier for class or feature detection with an even smaller number of training samples.

Accuracy assessments

For evaluation of the performance of the classifiers, a crosscheck between the classification results and test samples. The following metrics are used: (i) producer's accuracy (PA); (ii) user's accuracy (UA); (iii) overall accuracy (OA); (iv) kappa index; and (v) F1-score (Eqs. 19–22). UA, which measures the degree of precision, is the proportion of the samples that truly belong to a specific class among all those classified as that specific class, while PA, or recall, is the proportion of samples classified as a specific class among all the samples that truly belong to that class (Nevalainen et al., 2017). The OA is the ratio of the number of correctly classified samples to the total number of samples. The kappa index provides the agreement of prediction with the true class, considering the random chance of correct classification, and the F1-measure is the harmonic mean of precision and recall and was calculated to determine the performance at the classifier and class levels.

$$UA = \frac{K_{ii}}{K_{i+}}; \quad \text{and} \quad PA = \frac{K_{jj}}{K_{+j}} \tag{19}$$

$$OA = \frac{\sum_{i=1}^{n} K_{ii}}{T} \tag{20}$$

$$K = \frac{T \sum_{i=1}^{n} K_{ii} - \sum_{i=1}^{n} K_{i+} K_{+i}}{T^2 - \sum_{i=1}^{n} K_{i+} K_{+i}} \tag{21}$$

$$F1\ \text{score} = 2 \times \frac{UA \times PA}{UA + PA} \tag{22}$$

where PA = producer's accuracy; UA = user's accuracy; OA = overall accuracy; K = kappa coefficient; n = number of classes; $K_{ii}$ = number of correctly classified pixels; $K_{i+}$ = number of pixels in the ith row and $K_{+j}$ = number of pixels in the jth column; and T = number of pixels used for the accuracy
confusion matrices were generated based on evaluation.
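The metrics in Eqs. 19–22 can be computed directly from a confusion matrix. The following is a minimal sketch in Python, not the authors' implementation (the classification itself was run in Google Earth Engine). The function name `accuracy_metrics` and the two-class example matrix are hypothetical. The sketch follows the text in taking UA from row totals and PA from column totals, and it assumes that the chance-agreement term of the kappa index pairs each row total with the column total of the same class.

```python
from typing import Dict, List


def accuracy_metrics(cm: List[List[int]]) -> Dict[str, object]:
    """Compute UA, PA, OA, kappa and F1 from a square confusion matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)                              # T
    row_tot = [sum(cm[i]) for i in range(n)]                         # K_{i+}
    col_tot = [sum(cm[i][j] for i in range(n)) for j in range(n)]    # K_{+j}
    diag = [cm[i][i] for i in range(n)]                              # K_{ii}

    ua = [diag[i] / row_tot[i] if row_tot[i] else 0.0 for i in range(n)]
    pa = [diag[j] / col_tot[j] if col_tot[j] else 0.0 for j in range(n)]
    oa = sum(diag) / total

    # Assumption: chance agreement uses matching row and column marginals.
    chance = sum(row_tot[i] * col_tot[i] for i in range(n))
    denom = total ** 2 - chance
    kappa = (total * sum(diag) - chance) / denom if denom else 0.0

    f1 = [2 * u * p / (u + p) if (u + p) else 0.0 for u, p in zip(ua, pa)]
    return {"UA": ua, "PA": pa, "OA": oa, "kappa": kappa, "F1": f1}


# Hypothetical two-class confusion matrix (not data from the study).
example_cm = [[50, 5],
              [10, 35]]
metrics = accuracy_metrics(example_cm)
print(metrics)
```

For this hypothetical matrix the overall accuracy is 0.85 and the kappa index is about 0.694, which illustrates how kappa discounts the agreement expected by chance.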

Figure 7. The feature in-feature out (FEI-FEO) fusion for post-classification fusion.

The statistical significance of the differences in the classification accuracy between the classifiers was evaluated using a pairwise z-score test. The z-test was applied to the OA results for testing statistical significance at a significance level of 5%. If |z| > 1.96, the test is significant, leading to the conclusion that the obtained results differ from each other.

Results

Influence of training data size on classifier performance
In this section, the accuracy for urban LULC mapping is considered as a function of the number of training samples or batch size required for reliable labeling, training and classification. The optimization of the training samples is important, as they are time-consuming and costly to obtain and require more storage and computation power. Varying the training size, the classifier accuracy outputs were averaged, with the results presented in Figure 8.
The classification accuracy is observed to improve for all the classifiers as the number of training samples increases, and it tended to converge to an optimal classification rate when the number of training samples was above 10,500 pixels, with all the classifiers performing at >80% in overall accuracy. The satisfactory performance with a higher number of training samples is attributed to the process required to minimize the separability, noise and time invariance of the sound features. The results in Figure 8 depict the significance of training samples as a critical hyperparameter in the implementation and comparison of machine learning classifiers.

Figure 8. Variation of number of training samples and classifier overall average accuracy.

Accuracy assessment
Class producer and user accuracies
The results for the PA and UA for each class in the respective years are presented in Figure 9. For detecting built-up areas using the PA metrics, MLP-ANN had the highest average PA of 93.2%, followed by RF (90.6%), SVM (89.3%) and GTB (85%). In terms of the UA, MLP-ANN and RF had the highest values of 93% and 92.3%, respectively, while SVM and GTB had respective UAs of 91.9% and 86.1%. The water bodies were classified with a consistently higher average PA of 99% by MLP-ANN, followed by RF (98.1%), GTB (97.5%) and SVM (97.2%); the corresponding UA values were RF and MLP-ANN equally at 99.7%, SVM (99.3%) and GTB (95.3%).

Figure 9. Average yearly PA and UA metrics for LULC classes.



For bare-soil mapping, RF attained an average PA of 85.6%, which was 0.1%, 3.5% and 5.6% higher than MLP-ANN, GTB and SVM, respectively. The UA for bare soil was highest for MLP-ANN (88.1%), followed by RF (83.9%), SVM (83.7%) and GTB (81.3%). In mapping the bare-soil cover, all the classifiers achieved lower PAs compared to the built-up area and water body classes. This could be attributed to the spectral confusion of bare soil with the impervious surfaces, including buildings and roads. For the vegetation classes, all the classifiers attained the lowest average PA of 83.5%, with grass having the highest average PA of 88% and shrubs having the least PA of 80.5%. In terms of the UA, the average accuracy was 85.3% and the trend was the same, with grass (92.3%), forest (83.2%) and shrubs (80.3%). The low PA and UA in vegetation mapping are attributable to the intra-spectral confusion among the vegetation classes.
Apart from the spectrally homogeneous water body, the PA and UA measures are observed to vary with the urban LULC class, time, sensor and the classifier. This requires further statistical evaluation of the classification results to determine the suitability of the classifiers for detecting and extracting specific urban LULC classes.

Average class classification accuracy
Table 2 presents the average metrics in terms of OA, F1-score, TPR (true positive rate or sensitivity), FPR (false positive rate) and AUC (area under the ROC curve) measures for each class for the eight epochs. All the classifiers mapped water bodies with the highest OA, F1-score, TPR and AUC scores for the 35-year period. For the built-up class, MLP-ANN had the highest accuracy. The vegetation classes were mapped with the least accuracy as compared to the other classes, with MLP-ANN being the best classifier for mapping grass, and RF being the most suitable for mapping shrubs and forest. MLP-ANN is recorded to be the best classifier for detecting bare soil at 98.1%, which is 1.1% higher than the least accuracy, from SVM.
For the overall average mapping of the LULC classes, RF achieved the highest performance with an OA of 92.8%, followed by MLP-ANN (91.2%), SVM (90.9%) and GTB (87.8%). The same performance pattern was observed from the overall average F1-score, TPR and AUC, except for the FPR, where GTB tended to have lower FPR values compared to the better performing classifiers per LULC class. The results of the TPR and FPR are presented in Figure 10 in terms of the area under the ROC curve for all the classification models. On average for all the classes, the RF model had the highest area under the ROC curve of 0.981, which is 0.021, 0.029 and 0.077 higher than the MLP-ANN, SVM and GTB classifiers, respectively, in mapping the urban LULC classes. The results in Figure 10 are the average results after tuning the classifier hyperparameters to yield the best results for a given year and for the urban LULC classes.
Figure 11 presents the summary of the F1-scores for each classifier, class and year. RF had the highest

Table 2. Classifier-class average accuracy.

Class      Classifier  OA (%)  F1-score  TPR    FPR    AUC
Built-up   GTB         97.2    0.848     0.861  0.018  0.921
Built-up   RF          98.6    0.912     0.923  0.008  0.957
Built-up   MLP-ANN     99.0    0.928     0.931  0.004  0.963
Built-up   SVM         98.5    0.901     0.919  0.009  0.955
Water      GTB         99.0    0.962     0.960  0.953  0.974
Water      RF          99.6    0.989     0.970  0.997  0.997
Water      MLP-ANN     99.8    0.994     0.995  0.997  0.998
Water      SVM         99.5    0.984     0.987  0.997  0.996
Grass      GTB         97.2    0.633     0.604  0.008  0.798
Grass      RF          97.8    0.811     0.826  0.010  0.908
Grass      MLP-ANN     97.9    0.772     0.755  0.006  0.874
Grass      SVM         97.0    0.720     0.728  0.010  0.859
Shrubs     GTB         90.6    0.817     0.920  0.146  0.887
Shrubs     RF          94.7    0.867     0.939  0.076  0.931
Shrubs     MLP-ANN     92.8    0.844     0.941  0.102  0.920
Shrubs     SVM         93.9    0.857     0.921  0.079  0.921
Forest     GTB         91.7    0.740     0.668  0.020  0.824
Forest     RF          95.4    0.872     0.816  0.010  0.903
Forest     MLP-ANN     94.2    0.833     0.754  0.012  0.871
Forest     SVM         95.0    0.860     0.799  0.013  0.893
Bare-soil  GTB         97.2    0.795     0.813  0.017  0.898
Bare-soil  RF          97.9    0.826     0.839  0.012  0.914
Bare-soil  MLP-ANN     98.1    0.858     0.881  0.014  0.933
Bare-soil  SVM         97.0    0.791     0.837  0.022  0.907
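The source does not state how the AUC values in Table 2 were computed. One common way to summarise a single (FPR, TPR) operating point is the area under the two-segment ROC curve through (0, 0), (FPR, TPR) and (1, 1), which equals (1 + TPR − FPR)/2. The sketch below applies that formula to the built-up rows of Table 2; the function name `single_point_auc` is hypothetical, and the use of this formula is an assumption rather than the authors' stated method.

```python
def single_point_auc(tpr: float, fpr: float) -> float:
    """Area under the ROC polyline (0,0) -> (fpr,tpr) -> (1,1)."""
    return (1.0 + tpr - fpr) / 2.0


# Built-up rows of Table 2: classifier -> (TPR, FPR, reported AUC).
built_up = {
    "GTB": (0.861, 0.018, 0.921),
    "RF": (0.923, 0.008, 0.957),
    "MLP-ANN": (0.931, 0.004, 0.963),
    "SVM": (0.919, 0.009, 0.955),
}

for name, (tpr, fpr, reported) in built_up.items():
    computed = single_point_auc(tpr, fpr)
    print(f"{name}: computed {computed:.4f}, reported {reported:.3f}")
```

Under this assumption the computed values agree with the tabulated built-up AUCs to within 0.001.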

Figure 10. Average ROC curves for the classifiers.

average F1-score of 0.879, performing marginally higher than MLP-ANN by 0.007. SVM and GTB, respectively, recorded average F1-scores of 0.852 and 0.799. For the specific years, the performance of the classifiers varied, with MLP-ANN recording the highest F1-score of 0.994; the least F1-score of 0.633 was from GTB, which also recorded the least F1-scores for all the years. Overall, from the F1-score measures, water was the best mapped land cover for all the years. Based on the average class accuracy measures in Table 2, the best classifier for detecting a specific urban LULC class can be identified for a given case study.

Overall accuracy and kappa index
The overall accuracy results are presented in Figure 12, indicating that for all the years and classes, RF performed better than all the classifiers with an average OA of 92.8%. It is, however, observed that only MLP-ANN marginally outperformed RF, in 1984, 1990 and 2010. In terms of the OA, RF performed better than MLP-ANN by 1.6%. Close in performance to MLP-ANN is SVM, with an average OA of 90.9%, while GTB had an average OA of 87.8%, performing lower than the other classifiers by between 3 and 5%. The kappa indices in Table 3 show that RF and MLP-ANN had average kappa index values of 0.880 and 0.863, while SVM and GTB achieved kappa indices of 0.860 and 0.794, respectively. Notable is the similarity in the performance trend between the classifiers with the varying multitemporal scales and multisensor data sets.
The results for the inter-comparison of the ML classifiers using the pairwise z-score test, such

Figure 11. Classifier-class-year F1-scores.



Figure 12. Average overall accuracy performance of the classifiers and the FEI-FEO fusion.

Table 3. Average kappa coefficients for the classifiers per year.

       Kappa index
Year   GTB    RF     SVM    MLP-ANN
1984   0.761  0.864  0.862  0.891
1990   0.696  0.799  0.801  0.806
1995   0.767  0.921  0.898  0.872
2000   0.868  0.936  0.923  0.922
2005   0.768  0.900  0.855  0.846
2010   0.893  0.875  0.905  0.930
2015   0.717  0.845  0.751  0.788
2020   0.884  0.907  0.890  0.856
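The average kappa indices quoted in the text (0.880 for RF, 0.863 for MLP-ANN, 0.860 for SVM and 0.794 for GTB) can be checked by averaging the columns of Table 3 over the eight epochs. The short sketch below does this; the variable and function names are hypothetical.

```python
from statistics import mean
from typing import Dict

# Kappa indices from Table 3, keyed by year and classifier.
kappa_by_year: Dict[int, Dict[str, float]] = {
    1984: {"GTB": 0.761, "RF": 0.864, "SVM": 0.862, "MLP-ANN": 0.891},
    1990: {"GTB": 0.696, "RF": 0.799, "SVM": 0.801, "MLP-ANN": 0.806},
    1995: {"GTB": 0.767, "RF": 0.921, "SVM": 0.898, "MLP-ANN": 0.872},
    2000: {"GTB": 0.868, "RF": 0.936, "SVM": 0.923, "MLP-ANN": 0.922},
    2005: {"GTB": 0.768, "RF": 0.900, "SVM": 0.855, "MLP-ANN": 0.846},
    2010: {"GTB": 0.893, "RF": 0.875, "SVM": 0.905, "MLP-ANN": 0.930},
    2015: {"GTB": 0.717, "RF": 0.845, "SVM": 0.751, "MLP-ANN": 0.788},
    2020: {"GTB": 0.884, "RF": 0.907, "SVM": 0.890, "MLP-ANN": 0.856},
}


def mean_kappa(table: Dict[int, Dict[str, float]]) -> Dict[str, float]:
    """Average each classifier's kappa index over all years."""
    classifiers = list(next(iter(table.values())).keys())
    return {c: mean(row[c] for row in table.values()) for c in classifiers}


average_kappa = mean_kappa(kappa_by_year)
for classifier, value in average_kappa.items():
    print(f"{classifier}: {value:.4f}")
```

The column means agree with the averages reported in the text to within 0.001.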

that z-score > 1.96 is considered statistically different at the α = 0.05 level of significance, are presented in Table 4, with the average results showing that, for the case study, there was no significant difference between the classifiers in terms of the overall accuracy. The most notable difference is between GTB and the other classifiers, with |z| > 1, and the least difference is between RF and MLP-ANN, at p-value = 0.881.

LULC classification and change detection results
The urban LULC mapping and change detection results for the GGPA are presented in Figure 13, with the area coverage and percent change in area as mapped by each classifier. The spatial-temporal trends for each class are briefly discussed below.
Urban built-up: In 1984, RF and SVM estimated the built-up area to be 2.4% of the total area (~22.6 km²), while both GTB and MLP-ANN were at 2.6% (25.3 km²), as shown in Figure 13. All the classifiers showed progressive growth in urban area development in the successive years, with the highest growth recorded between 1984 and 1990, ranging from 62 to 72% for all the classifiers. The classifiers reported different statistics for the least growth period, with RF and MLP-ANN showing that between 2005 and 2010 the urban growth was 4% and 6%, respectively, while during the same period GTB and SVM showed double-digit growth at 15% and 13%, respectively (Figure 13). In the latest year, 2020, the total built-up area was mapped by the classifiers as MLP-ANN (108.14 km²), RF (113.39 km²), GTB (135.17 km²) and SVM (126.23 km²). Given that RF detected the urban areas with the highest accuracy in 2020, this implies that GTB and SVM overestimated the built-up area. Though the built-up class mixing increases the classification accuracy, it does not capture the specific elements of the urban built-up ecosystem, which comprises residential, commercial, industrial and impervious surfaces.
Water: The main water body within the study area is the Gaborone dam. For all the classifiers, the homogeneous water body was mapped with the highest accuracy, 99% on average. In terms of the changes in surface area, there are observed fluctuations in the dam area, and the results show that all the classifiers recorded a maximum increase in the surface area in the

Table 4. Average z-scores and p-values for ML model pairs.

Classifier pairs  z-score  p-value  Significant at α = 0.05
GTB-RF            −1.687   0.092    No
GTB-SVM           −1.308   0.191    No
GTB-ANN           −1.476   0.140    No
RF-SVM            0.362    0.717    No
RF-ANN            0.150    0.881    No
SVM-ANN           −0.194   0.846    No
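The p-values in Table 4 are consistent with a two-tailed test against the standard normal distribution, where p = erfc(|z|/√2). The sketch below shows that conversion, together with one possible way of obtaining a z-score from two overall accuracies. The source does not state which z-statistic was used, so the pooled two-proportion form in `two_proportion_z` is an assumption, and both function names and the example sample sizes are hypothetical.

```python
import math


def two_tailed_p(z: float) -> float:
    """Two-tailed p-value of a z-score under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def two_proportion_z(acc1: float, n1: int, acc2: float, n2: int) -> float:
    """Pooled two-proportion z-statistic for two overall accuracies.

    Assumption: the two accuracies come from independent test samples
    of sizes n1 and n2; the paper may have used a different variant.
    """
    pooled = (acc1 * n1 + acc2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (acc1 - acc2) / se


# z-scores from Table 4 mapped to their reported p-values.
table4 = {
    "GTB-RF": (-1.687, 0.092),
    "GTB-SVM": (-1.308, 0.191),
    "GTB-ANN": (-1.476, 0.140),
    "RF-SVM": (0.362, 0.717),
    "RF-ANN": (0.150, 0.881),
    "SVM-ANN": (-0.194, 0.846),
}
for pair, (z, reported_p) in table4.items():
    p = two_tailed_p(z)
    print(f"{pair}: p = {p:.3f} (reported {reported_p}), significant: {p < 0.05}")

# Hypothetical example: 90% vs 80% accuracy on 100 test pixels each.
example_z = two_proportion_z(0.90, 100, 0.80, 100)
print(f"example z = {example_z:.3f}")
```

None of the pairs reaches |z| > 1.96, which matches the "No" entries in the table.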

Figure 13. LULC year-class areas and change detection using GTB, RF, SVM and MLP-ANN.
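The percent changes reported alongside the class areas follow from the start and end areas of each class. As a worked check, the sketch below uses the 1984 and 2020 bare-soil areas quoted in the text for each classifier; the function name `percent_change` is hypothetical, and the assignment of 52.36 km² to GTB and 56.0 km² to MLP-ANN assumes the order in which the text lists them.

```python
def percent_change(area_start_km2: float, area_end_km2: float) -> float:
    """Relative change in area between two dates, in percent."""
    return 100.0 * (area_end_km2 - area_start_km2) / area_start_km2


# Bare-soil areas (km^2) in 1984 and 2020 as quoted in the text.
bare_soil_km2 = {
    "GTB": (72.35, 52.36),
    "RF": (110.82, 107.70),
    "SVM": (57.93, 137.73),
    "MLP-ANN": (72.35, 56.0),
}

bare_soil_change = {
    name: percent_change(start, end)
    for name, (start, end) in bare_soil_km2.items()
}
for name, change in bare_soil_change.items():
    print(f"{name}: {change:+.1f}%")
```

The results (about −27.6% for GTB, −2.8% for RF, +137.8% for SVM and −22.6% for MLP-ANN) are in line with the reported changes of 28%, 3%, 138% and 22%.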

periods of 1984–1990 and 2015–2020, and decreases of more than 50% in 2000–2005 and 2010–2015 (Figure 13).
Bare-soil: Most of the land in the study area is covered by bare soils for most of the year. For all the years, the classifiers presented different results in terms of the surface area occupied by bare soil/land. In 1984, GTB and MLP-ANN estimated bare soil to occupy 8% (72.35 km²) of the total area, while RF results indicated that bare soil was 12% (110.82 km²) and SVM was 6% (57.93 km²). In 2020, the results from GTB and MLP-ANN were nearly the same at between 5 and 6% (52.36 and 56.0 km², respectively), and RF and SVM estimated bare soil at 107.70 km² (11%) and 137.73 km² (14%), respectively. Between 1984 and 2020, the overall area covered by bare soil reduced by 3% (RF), 22% (MLP-ANN) and 28% (GTB), and increased by 138% (SVM).
Vegetation: Vegetation cover comprised grass, shrubs and forested areas. Similar to water, the abundance of natural vegetation cover is influenced by the climatic conditions. For the 35 years of study, all the classifiers showed that there was an increase in forest and grassland, while the areas covered by shrubs decreased. From the RF results, the shrubland reduced by roughly half, from 659.49 km² in 1984 to 300.97 km² in 2020. Over the same duration, MLP-ANN and SVM estimated that shrubs occupied nearly the same area of 578.99 km² and 572.66 km², respectively, in 1984, and in 2020 the classifiers estimated shrubland to be 335.09 km² and 246.98 km², respectively. The GTB results indicated that shrubland occupied 607.18 km² in 1984 and 331.03 km² in 2020 (Figure 13). This implies that most of the land now occupied by built-up areas is the result of conversions from shrublands.

Post-classification feature fusion of classifier results
From the results above, it is observed that a given urban LULC class can be accurately detected and extracted by a specific classifier. Thus, to improve on the overall accuracy for urban LULC mapping using the ML methods, the proposed FEI-FEO post-classification feature fusion is adopted to maximize the advantages of the different classifiers and to improve the accuracy of urban LULC mapping. The output of the ML-fusion results is compared in Figure 14 for 2015, which had the lowest multisensor performance (Figure 12). The results indicate that the proposed ensemble of the best classifier class results, as obtained from the MLP-ANN and RF classifiers, improves the overall accuracy from 87.5% to 90.1% for 2015. The results of the best classifier class FEI-FEO fusion are compared in Figure 15, with overall improvements in the multitemporal and multisensor urban LULC mapping, where the different classes are mapped accurately by different classifiers, with results in Figure 12. Table 5 presents the best class classifier for each year and the resulting post-classification fusion accuracies.

Figure 14. Multiclass FEI-FEO post classification fusion.

To further illustrate the significance of the FEI-FEO approach, Figure 15 presents a comparison of the classification results for 2020 in relation to the ground-truth reference imagery from Google Earth. It is observed that the RF and SVM results have the same shape and structural patterns in terms of mapping the urban areas, and captured the bare soils more accurately. In 2020, MLP-ANN tended to underestimate the built-up areas, while GTB classified some urban areas as bare soils. In mapping water bodies, RF and MLP-ANN were able to differentiate the land–water interfaces better than the other classifiers, which tended to map the bare soils around the water bodies as built-up areas. RF detected the shape of the

Table 5. Summary of best class classifier per year.

Best machine learning classifier and ML-fusion accuracy (%)
LULC class          1984     1990     1995     2000     2005     2010     2015     2020
Built-up            MLP-ANN  GTB      MLP-ANN  MLP-ANN  MLP-ANN  MLP-ANN  MLP-ANN  RF
Water               MLP-ANN  MLP-ANN  SVM      SVM      MLP-ANN  MLP-ANN  RF       RF
Grass               MLP-ANN  SVM      RF       MLP-ANN  MLP-ANN  RF       MLP-ANN  RF
Shrubs              RF       MLP-ANN  RF       RF       RF       MLP-ANN  RF       RF
Forest              RF       RF       RF       RF       RF       GTB      RF       RF
Bare soil           MLP-ANN  MLP-ANN  RF       SVM      GTB      SVM      RF       RF
ML-Fusion Accuracy  93.7%    93.4%    96.5%    97.1%    95.5%    96.2%    90.1%    93.4%
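The idea behind the best-class selection in Table 5 can be sketched as a per-pixel fusion of classifier label maps, in which each class is taken from the classifier that maps it best. The following Python sketch is illustrative only and is not the authors' implementation. The source says only that features are combined "under mutual exclusivity conditions", so the conflict rule (the class with the higher priority score claims a contested pixel first) and the fallback rule (unclaimed pixels keep the label of a chosen fallback classifier) are assumptions, as are the function name `fuse_best_class`, the toy label maps and the priority scores.

```python
from typing import Dict, List

LabelMap = List[List[int]]


def fuse_best_class(
    label_maps: Dict[str, LabelMap],
    best_for_class: Dict[int, str],
    class_priority: Dict[int, float],
    fallback: str,
) -> LabelMap:
    """Fuse label maps by taking each class from its best classifier."""
    base = label_maps[fallback]
    rows, cols = len(base), len(base[0])
    fused = [row[:] for row in base]            # start from the fallback map
    claimed = [[False] * cols for _ in range(rows)]

    # Higher-priority classes claim pixels first (assumed conflict rule).
    ordered = sorted(best_for_class, key=lambda c: class_priority[c], reverse=True)
    for cls in ordered:
        source = label_maps[best_for_class[cls]]
        for r in range(rows):
            for c in range(cols):
                if source[r][c] == cls and not claimed[r][c]:
                    fused[r][c] = cls
                    claimed[r][c] = True
    return fused


# Toy 2 x 2 scene; class codes: 0 = built-up, 1 = water, 2 = forest.
label_maps = {
    "RF": [[0, 1], [2, 1]],
    "MLP-ANN": [[0, 0], [2, 2]],
}
# Best classifiers as in the 1984 column of Table 5 for these three classes.
best_for_class = {0: "MLP-ANN", 1: "MLP-ANN", 2: "RF"}
# Hypothetical per-class accuracies used only to order the claims.
class_priority = {0: 0.93, 1: 0.99, 2: 0.87}

fused = fuse_best_class(label_maps, best_for_class, class_priority, fallback="RF")
print(fused)
```

In this toy case the fused map differs from both inputs: built-up pixels come from MLP-ANN, the forest pixel comes from RF, and the one pixel that no best classifier claims keeps its RF label.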

Figure 15. Image-based ground-truth comparison of classification results for different LULC classes and at different locations
within the study area for 2020 (reprinted with permission from Y. Ouma et al. (2022). Copyright 2022 ISPRS archives).

dam water body more accurately. For the vegetation cover in 2020, it is observed in Figure 15 that RF and SVM mapped the forest and bare soil areas with the same degree of compactness, while MLP-ANN and GTB tended to map the forest area as mixed with shrubs. Visually, however, the results show that the classifiers tend to have closely related results.

Discussion
Typified by increasing population and infrastructural development, urbanization is one of the anthropogenic activities that is critical to land-use change. As such, accurate urban LULC information is imperative in providing evidence towards sustainable urban area planning and management. Previous studies have revealed that the capability of classification with remote sensing data, as the most practical data source for urban LULC mapping, is dependent on the classifier, the input data and the complexity of the landscape (Klein et al., 2009). On the significance of classifiers in urban LULC mapping, Pandey et al. (2021) noted that the differences in accuracy resulting from the performance of a classifier were higher than the influence of the land-use land-cover characteristics of a given case study. For mapping complex urban landscapes, the improved performances of machine learning algorithms have resulted in an increase in their applications (Jozdani et al., 2018). This study thus evaluated the performance of two supervised ensemble classifiers (GTB and RF), pixel-based SVM and neural network MLP machine learning classifiers for urban LULC mapping.
The focus of the pixel-based SVM algorithm is on pixel independence (Johnson & Xie, 2013). SVM has been recorded to have advantages that include high classification accuracy with small training data, and it is also more robust for data with low noise levels (Pelletier et al., 2017). On the other hand, the object-based (RF and GTB) and neural network (MLP) classifiers are able to take into account the complex neighborhood spectral and spatial characteristics. Despite several studies having explored the robustness of the performances of different classifiers with remote sensing data sets, the identification of the most appropriate classifier for mapping a specific urban scene and for a specific LULC feature is still a challenging task (Pandey et al., 2021). Some of the previous studies recorded similar results to this study, as represented in Figure 12 and Table 2, where the performances of the classifiers vary for the same scene. However, for a different case study, object-based classifiers performed better than pixel-based SVM (Abdi, 2019; Conchedda et al., 2008; Thanh Noi & Kappas, 2017). Srivastava et al. (2012) and Dixon and Candade (2008) compared different machine learning techniques and different Landsat sensors and concluded that SVM and ANN performed better on Landsat ETM+, while SVM gave better results with Landsat-TM data. These same observations are noted in the current study, in which different Landsat sensors (MSS, TM, ETM+ and OLI) were classified with different accuracy for different years, as presented in Table 4. Huang et al. (2002) also noted that with Landsat TM/ETM+, ANN performed better than SVM. Pal (2005) concluded that the performances of RF and SVM were similar if all the required classifier hyperparameters were optimally set. Erbek et al. (2004) and Deng et al. (2008) further reported that the output LULC class areas also varied in the different Landsat sensor satellite data. Besides the scene LULC overall classification accuracy, similar performance patterns are observed in this study for each urban LULC class or feature.
The noted difference between RF and MLP-ANN is that the performance of RF is more stable for all the years and not significantly influenced by the spectral and radiometric differences in the Landsat sensors. The stability of the RF classifier has been reported to be based on its ability to handle category-type features and an increased number of trees, as well as the bagging and random concepts that result in its efficiency and precision (Gislason et al., 2006; Talukdar et al., 2020). Further, the superior performances of RF and MLP-ANN have also been attributed to the fact that the classifiers tend to be tolerant to noise and are significantly more robust towards both random and systematic noise in the training data sets (Breiman, 2001; Pelletier et al., 2017). Based on its extended feature set (Georganos et al., 2018), SVM results were observed to be stable with sensor and time and exhibited nearly the same performance trend as RF and MLP-ANN. GTB performance was most affected by the radiometric and spectral resolution differences of the sensors, as it recorded the least accuracy. The lower performance by GTB can be attributed to the decision trees being too sensitive to small changes in the training data sets and tending to overfit the model (Prasad et al., 2006). This implies that GTB requires continuous readjustments of the decision trees to minimize the classification errors (Awad & Khanna, 2015).
For the study area, MLP-ANN and RF are observed to be the dominant classifiers (Table 5). Despite the observed marginal differences in the classification results, the study showed that the accuracies of the classifiers were similar at the 5% level of significance, with all the classifiers performing at above 85% overall accuracy. In a similar evaluation, Hackman et al. (2017) observed that advanced machine learning classification algorithms may not always be advantageous when applied to process multispectral image data, and that the focus should be on the abilities of the classifiers to extract specific LULC classes. Despite the standalone classifiers exhibiting good results in urban LULC classification, more accurate classification techniques are continuously being sought (P. Zhang et al., 2018). Due to the differences in classifier performance for the same scene features, this study proposed the FEI-FEO post-classification feature fusion approach, which takes advantage of the classifiers’ accuracy in mapping specific urban LULC features. The FEI-FEO hybridization has the potential to discriminate and enhance the robustness and accuracy of urban LULC classification results.
The post-classification results, presented statistically in Table 5 and in Figures 12 and 14, and comparatively in Figure 15 in terms of classification accuracies, show that the proposed feature fusion approach is suitable for maximizing the performance of machine learning classifiers. The multifeature fusion ensemble takes advantage of the neighborhood and proximities of the pixels mapped in each classifier to generate more accurate and stable urban LULC results. Further, the results in Table 5 with respect to specific class detection indicate the ability of ANN to model nonlinear features and adequately handle the uncertainties that exist in spatial data. The multifeature mapping capability of the proposed FEI-FEO is considered useful in land-use/cover classification tasks in complex urban environments with high spatial heterogeneity. The utility of the FEI-FEO approach should be investigated for each case study to determine the most effective classifiers for mapping specific features.

Conclusions
Mapping of urban landscapes is a complex and challenging task due to the spectral overlaps and spatial heterogeneity of the urban features. This study was carried out, first, to implement and evaluate the accuracy of ensemble decision tree-based (GTB and RF) classifiers, and SVM and MLP-ANN machine learning classification algorithms, for mapping of urban land-use classes from multitemporal and multisensor Landsat data from 1984 to 2020 at 5-year intervals. The second goal was to maximize the potential of the ML classifiers for improving the urban LULC mapping through a post-classification feature in-feature out fusion approach. For mapping the six LULC classes over the 35 years, MLP-ANN was the preferred classifier for the urban built-up area and water body. The optimal classifiers for extracting the vegetation classes were determined as grass (MLP-ANN), shrubland (RF) and forest (RF). In terms of the combined overall accuracy for the eight-epoch years, RF average performance was highest at 92.8%, which was 1.6%, 1.9% and 5.0% higher than MLP-ANN, SVM and GTB, respectively. For improved ML classifier performances, the study experiments showed that the optimal training sample size should be determined. Based on the advantages of a given classifier in mapping a particular urban LULC class or feature, and to improve on urban LULC mapping accuracy, an ensemble of ML classifiers is recommended in the form of post-classification feature fusion of the best classifier outputs. For multisensor and multitemporal urban LULC mapping, the proposed post-classification FEI-FEO fusion approach increased the accuracy of mapping the urban LULC classes, as it combines the inherent feature detection and classification abilities of the machine learning classifiers. The study results show that the detection and mapping of urban LULC classes with a given classifier cannot be generalized and depends on the sensor spectral resolutions, and is also influenced by the temporal, atmospheric, illumination and geometric variabilities. As proposed in this study, accurately derived information on urban LULC is useful for urban land-use planners and managers for decision support on urban planning, management and growth policy development. This study recommends further evaluations of the performances of the machine learning classifiers and the FEI-FEO approach in this study in comparison with deep learning classification models for mapping complex urban environments.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This research project was funded by both the USAID Partnerships for Enhanced Engagement in Research (PEER) under the PEER program cooperative agreement number: AID-OAA-A-11-00012 and the University of Botswana Office of Research and Development (ORD).

ORCID
Yashon O. Ouma http://orcid.org/0000-0003-1163-0385
Jiaguo Qi http://orcid.org/0000-0002-8183-0297

Data availability statement
The data used in this study was obtained from the United States Geological Survey (USGS): https://earthexplorer.usgs.gov/. The rest of the data are as presented in this paper. The image data classification was carried out within the Google Earth Engine (GEE).

References
Abdi, A. M. (2019). Land cover and land use classification performance of machine learning algorithms in a boreal landscape using sentinel-2 data. GIScience & Remote Sensing, 57(1), 1–20. https://doi.org/10.1080/15481603.2019.1650447

Awad, M., & Khanna, R. (2015). Support vector machines Remote Sensing, 26(7), 1285–1287. https://doi.org/10.
for classification. In Efficient learning machines (pp. 39– 1080/01431160512331337763
66). Apress. https://doi.org/10.1007/978-1-4302-5990-9_ Erbek, F. S., Özkan, C., & Taberner, M. (2004). Comparison
3 of maximum likelihood classification method with super­
Ballanti, L., Blesius, L., Hines, E., & Kruse, B. (2016). Tree vised artificial neural network algorithms for land use
species classification using hyperspectral imagery: activities. International Journal of Remote Sensing, 25(9),
A comparison of two classifiers. Remote Sensing, 8(6), 1733–1748. https://doi.org/10.1080/
45. https://doi.org/10.3390/rs8060445 0143116031000150077
Belgiu, M., & Drăguţ, L. (2016). Random forest in remote Fan, F., Weng, Q., & Wang, Y. (2007). Land use land cover
sensing: A review of applications and future directions. change in Guangzhou, China, from 1998 to 2003, based
Isprs Journal of Photogrammetry and Remote Sensing, 114, on landsat TM/ETM+ imagery. Sensors, 7(7), 1323–1342.
24–31. https://doi.org/10.1016/j.isprsjprs.2016.01.011 https://doi.org/10.3390/s7071323
Biau, G., & Scornet, E. (2016). A random forest guided tour. TEST, 25(2), 197–227. https://doi.org/10.1007/s11749-016-0481-7
Blaschke, T., Hay, G. J., Kelly, M., Lang, S., Hofmann, P., Addink, E., Queiroz Feitosa, R., van der Meer, F., van der Werff, H., van Coillie, F., & Tiede, D. (2014). Geographic object-based image analysis – towards a new paradigm. ISPRS Journal of Photogrammetry and Remote Sensing, 87, 180–191. https://doi.org/10.1016/j.isprsjprs.2013.09.014
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32. https://doi.org/10.1023/A:1010933404324
Camargo, F. F., Sano, E. E., Almeida, C. M., Mura, J. C., & Almeida, T. (2019). A comparative assessment of machine-learning techniques for land use and land cover classification of the Brazilian tropical savanna using ALOS-2/PALSAR-2 polarimetric images. Remote Sensing, 11(13), 1600. https://doi.org/10.3390/rs11131600
Carranza-García, M., García-Gutiérrez, J., & Riquelme, J. C. (2019). A framework for evaluating land use and land cover classification using convolutional neural networks. Remote Sensing, 11(3), 274. https://doi.org/10.3390/rs11030274
Chen, B., Huang, B., & Xu, B. (2017). Multi-source remotely sensed data fusion for improving land cover classification. ISPRS Journal of Photogrammetry and Remote Sensing, 124, 27–39. https://doi.org/10.1016/j.isprsjprs.2016.12.008
Conchedda, G., Durieux, L., & Mayaux, P. (2008). An object-based method for mapping and change analysis in mangrove ecosystems. ISPRS Journal of Photogrammetry and Remote Sensing, 63(5), 578–589. https://doi.org/10.1016/j.isprsjprs.2008.04.002
Dasarathy, B. V. (1997). Sensor fusion potential exploitation-innovative architectures and illustrative applications. Proceedings of the IEEE, 85(1), 24–38. IEEE. https://doi.org/10.1109/5.554206
Deng, J. S., Wang, K., Deng, Y. H., & Qi, G. J. (2008). PCA-based land-use change detection and analysis using multitemporal and multisensor satellite data. International Journal of Remote Sensing, 29(16), 4823–4838. https://doi.org/10.1080/01431160801950162
Dixon, B., & Candade, N. (2008). Multispectral landuse classification using neural networks and support vector machines: One or the other, or both? International Journal of Remote Sensing, 29(4), 1185–1206. https://doi.org/10.1080/01431160701294661
Drăgut, L., Tiede, D., & Levick, S. R. (2010). ESP: A tool to estimate scale parameter for multiresolution image segmentation of remotely sensed data. International Journal of Geographical Information Science, 24(6), 859–871. https://doi.org/10.1080/13658810903174803
Dwivedi, R. S., Sreenivas, K., & Ramana, K. V. (2005). Land-use/land-cover change analysis in part of Ethiopia using Landsat thematic mapper data. International Journal of
Friedl, M. A., & Brodley, C. E. (1997). Decision tree classification of land cover from remotely sensed data. Remote Sensing of Environment, 61(3), 399–409. https://doi.org/10.1016/S0034-4257(97)00049-7
Friedman, J. H. (2002). Stochastic gradient boosting. Computational Statistics & Data Analysis, 38(4), 367–378. https://doi.org/10.1016/S0167-9473(01)00065-2
Georganos, S., Grippa, T., Vanhuysse, S., Lennert, M., Shimoni, M., & Wolff, E. (2018). Very high resolution object-based land use–land cover urban classification using extreme gradient boosting. IEEE Geoscience and Remote Sensing Letters, 15(4), 607–611. https://doi.org/10.1109/LGRS.2018.2803259
Gislason, P. O., Benediktsson, J. A., & Sveinsson, J. R. (2006). Random forests for land cover classification. Pattern Recognition Letters, 27(4), 294–300. https://doi.org/10.1016/j.patrec.2005.08.011
Gong, P., Chen, B., Li, X., Liu, H., Wang, J., Bai, Y., Chen, J., Chen, X., Fang, L., Feng, S., Feng, Y., Gong, Y., Gu, H., Huang, H., Huang, X., Jiao, H., Kang, Y., Lei, G., Li, A. . . . Xu, B. (2020). Mapping essential urban land use categories in China (EULUC-China): Preliminary results for 2018. Science Bulletin, 65(3), 182–187. https://doi.org/10.1016/j.scib.2019.12.007
Hackman, K. O., Gong, P., & Wang, J. (2017). New land-cover maps of Ghana for 2015 using Landsat 8 and three popular classifiers for biodiversity assessment. International Journal of Remote Sensing, 38(14), 4008–4021. https://doi.org/10.1080/01431161.2017.1312619
Halder, A., Ghosh, A., & Ghosh, S. (2011). Supervised and unsupervised landuse map generation from remotely sensed images using ant based systems. Applied Soft Computing, 11(8), 5770–5781. https://doi.org/10.1016/j.asoc.2011.02.030
Ha, T. V., Tuohy, M., Irwin, M., & Tuan, P. V. (2020). Monitoring and mapping rural urbanization and land use changes using Landsat data in the northeast subtropical region of Vietnam. The Egyptian Journal of Remote Sensing and Space Science, 23(1), 11–19. https://doi.org/10.1016/j.ejrs.2018.07.001
Huang, C., Davis, L., & Townshend, J. (2002). An assessment of support vector machines for land cover classification. International Journal of Remote Sensing, 23(4), 725–749. https://doi.org/10.1080/01431160110040323
Jamali, A. (2019). Evaluation and comparison of eight machine learning models in land use/land cover mapping using Landsat 8 OLI: A case study of the northern region of Iran. SN Applied Sciences, 1(11), 1448. https://doi.org/10.1007/s42452-019-1527-8
Johnson, B., & Xie, Z. (2013). Classifying a high resolution image of an urban area using super-object information. ISPRS Journal of Photogrammetry and Remote Sensing, 83, 40–49. https://doi.org/10.1016/j.isprsjprs.2013.05.008
Jozdani, S. E., Momeni, M., Johnson, B. A., & Sattari, M. (2018). A regression modelling approach for optimizing segmentation scale parameters to extract buildings of different sizes. International Journal of Remote Sensing, 39(3), 684–703. https://doi.org/10.1080/01431161.2017.1390273
Klein, D., Esch, T., Himmler, V., Thiel, M., & Dech, S. (2009). Assessment of urban extent and imperviousness of Cape Town using TerraSAR-X and Landsat images. In Proceedings of the 2009 IEEE International Geoscience and Remote Sensing Symposium, Cape Town, South Africa, 12–17 July 2009; Volume 3, p. III 1051.
Knorn, J., Rabe, A., Radeloff, V. C., Kuemmerle, T., Kozak, J., & Hostert, P. (2009). Land cover mapping of large areas using chain classification of neighboring Landsat satellite images. Remote Sensing of Environment, 113(5), 957–964. https://doi.org/10.1016/j.rse.2009.01.010
Kranjčić, N., Medak, D., Župan, R., & Rezo, M. (2019). Machine learning methods for classification of the green infrastructure in city areas. ISPRS International Journal of Geo-Information, 8(10), 463. https://doi.org/10.3390/ijgi8100463
Lefulebe, B. E., Van der Walt, A., & Xulu, S. (2022). Fine-scale classification of urban land use and land cover with PlanetScope imagery and machine learning strategies in the city of Cape Town, South Africa. Sustainability, 14(15), 9139. https://doi.org/10.3390/su14159139
Li, X., Chen, W., Cheng, X., & Wang, L. (2016). A comparison of machine learning algorithms for mapping of complex surface-mined and agricultural landscapes using ZiYuan-3 stereo satellite imagery. Remote Sensing, 8(6), 514. https://doi.org/10.3390/rs8060514
Lu, D., & Weng, Q. (2004). Spectral mixture analysis of the urban landscapes in Indianapolis with Landsat ETM+ imagery. Photogrammetric Engineering and Remote Sensing, 70(9), 1053–1062. https://doi.org/10.14358/PERS.70.9.1053
Ma, L., Liu, Y., Zhang, X., Ye, Y., Yin, G., & Johnson, B. A. (2019). Deep learning in remote sensing applications: A meta-analysis and review. ISPRS Journal of Photogrammetry and Remote Sensing, 152, 166–177. https://doi.org/10.1016/j.isprsjprs.2019.04.015
Mao, W., Lu, D., Hou, L., Liu, X., & Yue, W. (2020). Comparison of machine-learning methods for urban land-use mapping in Hangzhou City, China. Remote Sensing, 12(17), 2817. https://doi.org/10.3390/rs12172817
Maxwell, A. E., Warner, T. A., & Fang, F. (2018). Implementation of machine-learning classification in remote sensing: An applied review. International Journal of Remote Sensing, 39(9), 2784–2817. https://doi.org/10.1080/01431161.2018.1433343
Melgani, F., & Bruzzone, L. (2004). Classification of hyperspectral remote sensing images with support vector machines. IEEE Transactions on Geoscience and Remote Sensing, 42(8), 1778–1790. https://doi.org/10.1109/TGRS.2004.831865
Mohtadifar, M., Cheffena, M., & Pourafzal, A. (2022). Acoustic- and radio-frequency-based human activity recognition. Sensors, 22(9), 3125. https://doi.org/10.3390/s22093125
Myint, S. W., Gober, P., Brazel, A., Grossman-Clarke, S., & Weng, Q. (2011). Per-pixel vs. object-based classification of urban land cover extraction using high spatial resolution imagery. Remote Sensing of Environment, 115(5), 1145–1161. https://doi.org/10.1016/j.rse.2010.12.017
Naidoo, L., Cho, M. A., Mathieu, R., & Asner, G. (2012). Classification of savanna tree species, in the greater Kruger National Park region, by integrating hyperspectral and LiDAR data in a random forest data mining environment. ISPRS Journal of Photogrammetry and Remote Sensing, 69, 167–179. https://doi.org/10.1016/j.isprsjprs.2012.03.005
Nery, T., Sadler, R., Solis-Aulestia, M., White, B., Polyakov, M., & Chalak, M. (2016). Comparing supervised algorithms in land use and land cover classification of a Landsat time-series. In Proceedings of the International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, China, 3 November 2016.
Nevalainen, O., Honkavaara, E., Tuominen, S., Viljanen, N., Hakala, T., Yu, X., Hyyppä, J., Saari, H., Pölönen, I., Imai, N., & Tommaselli, A. (2017). Individual tree detection and classification with UAV-based photogrammetric point clouds and hyperspectral imaging. Remote Sensing, 9(3), 185. https://doi.org/10.3390/rs9030185
Nichols, J. A., Herbert Chan, H. W., & Baker, M. (2019). Machine learning: Applications of artificial intelligence to imaging and diagnosis. Biophysical Reviews, 11(1), 111–118. https://doi.org/10.1007/s12551-018-0449-9
Noble, W. S. (2006). What is a support vector machine? Nature Biotechnology, 24(12), 1565–1567. https://doi.org/10.1038/nbt1206-1565
Orieschnig, C. A., Belaud, G., Venot, J.-P., Massuel, S., & Ogilvie, A. (2021). Input imagery, classifiers, and cloud computing: Insights from multi-temporal LULC mapping in the Cambodian Mekong Delta. European Journal of Remote Sensing, 54(1), 398–416. https://doi.org/10.1080/22797254.2021.1948356
Ouma, Y. O., Moalafhi, D. B., Anderson, G., Nkwae, B., Odirile, P., Parida, B. P., & Qi, J. (2023). Dam water level prediction using vector autoregression, random forest regression and MLP-ANN models based on land-use and climate factors. Sustainability, 14(22), 14934. https://doi.org/10.3390/su142214934
Ouma, Y., Nkwae, B., Moalafhi, D., Odirile, P., Parida, B., Anderson, G., & Qi, J. (2022). Comparison of machine learning classifiers for multitemporal and multisensor mapping of urban LULC features. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLIII-B3-2022, 681–689. https://doi.org/10.5194/isprs-archives-XLIII-B3-2022-681-2022
Pal, M. (2005). Random forest classifier for remote sensing classification. International Journal of Remote Sensing, 26(1), 217–222. https://doi.org/10.1080/01431160412331269698
Pandey, P. C., Koutsias, N., Petropoulos, G. P., Srivastava, P. K., & Ben Dor, E. (2021). Land use/land cover in view of earth observation: Data sources, input dimensions, and classifiers – a review of the state of the art. Geocarto International, 36(9), 957–988. https://doi.org/10.1080/10106049.2019.1629647
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., & Vanderplas, J. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research: JMLR, 12(85), 2825–2830. https://doi.org/10.48550/arXiv.1201.0490
Pelletier, C., Valero, S., Inglada, J., Champion, N., Marais Sicr, C., & Dedieu, G. (2017). Effect of training class label noise on classification performances for land cover mapping with satellite image time series. Remote Sensing, 9(2), 173. https://doi.org/10.3390/rs9020173
Prasad, A., Iverson, L., & Liaw, A. (2006). Newer classification and regression tree techniques: Bagging and random
forests for ecological prediction. Ecosystems (New York, NY), 9(2), 181–199. https://doi.org/10.1007/s10021-005-0054-1
Qian, Y., Zhou, W., Yan, J., Li, W., & Han, L. (2015). Comparing machine learning classifiers for object-based land cover classification using very high resolution imagery. Remote Sensing, 7(1), 153–168. https://doi.org/10.3390/rs70100153
Rogan, J., Franklin, J., Stow, D., Miller, J., Woodcock, C., & Roberts, D. (2008). Mapping land-cover modifications over large areas: A comparison of machine learning algorithms. Remote Sensing of Environment, 112(5), 2272–2283. https://doi.org/10.1016/j.rse.2007.10.004
Shih, H. C., Stow, D. A., & Tsai, Y. H. (2019). Guidance on and comparison of machine learning classifiers for Landsat-based land cover and land use mapping. International Journal of Remote Sensing, 40(4), 1248–1274. https://doi.org/10.1080/01431161.2018.1524179
Shi, Y., Qi, Z., Liu, X., Niu, N., & Zhang, H. (2019). Urban land use and land cover classification using multisource remote sensing images and social media data. Remote Sensing, 11(22), 2719. https://doi.org/10.3390/rs11222719
Srivastava, P. K., Han, D., Rico-Ramirez, M. A., Bray, M., & Islam, T. (2012). Selection of classification techniques for land use/land cover change investigation. Advances in Space Research, 50(9), 1250–1265. https://doi.org/10.1016/j.asr.2012.06.032
Talukdar, S., Singha, P., Praveen, S., Mahato, B., & Rahman, A. (2020). Dynamics of ecosystem services (ESs) in response to land use land cover (LU/LC) changes in the lower Gangetic plain of India. Ecological Indicators, 112, 106121. https://doi.org/10.1016/j.ecolind.2020.106121
Teluguntla, P., Thenkabail, P. S., Oliphant, A., Xiong, J., Gumma, M. K., Congalton, R. G., & Huete, A. (2018). A 30-m Landsat-derived cropland extent product of Australia and China using random forest machine learning algorithm on Google Earth Engine cloud computing platform. ISPRS Journal of Photogrammetry and Remote Sensing, 144, 325–340. https://doi.org/10.1016/j.isprsjprs.2018.07.017
Thanh Noi, P., & Kappas, M. (2017). Comparison of random forest, k-nearest neighbor, and support vector machine classifiers for land cover classification using Sentinel-2 imagery. Sensors, 18(2), 18. https://doi.org/10.3390/s18010018
Wang, J., Bretz, M., Dewan, M. A. A., & Delavar, M. A. (2022). Machine learning in modelling land-use and land cover-change (LULCC): Current status, challenges and prospects. The Science of the Total Environment, 822, 153559. https://doi.org/10.1016/j.scitotenv.2022.153559
Waske, B., & Braun, M. (2009). Classifier ensembles for land cover mapping using multitemporal SAR imagery. ISPRS Journal of Photogrammetry and Remote Sensing, 64(5), 450–457. https://doi.org/10.1016/j.isprsjprs.2009.01.003
Wu, L., Zhu, X., Lawes, R., Dunkerley, D., & Zhang, H. (2019). Comparison of machine learning algorithms for classification of LiDAR points for characterization of canola canopy structure. International Journal of Remote Sensing, 40(15), 5973–5991. https://doi.org/10.1080/01431161.2019.1584929
Yang, C., Wu, G., Ding, K., Shi, T., Li, Q., & Wang, J. (2017). Improving land use/land cover classification by integrating pixel unmixing and decision tree methods. Remote Sensing, 9(12), 1222. https://doi.org/10.3390/rs9121222
Yu, L., Liang, L., Wang, J., Zhao, Y., Cheng, Q., Hu, L., Liu, S., Yu, L., Wang, X., Zhu, P., Li, X., Xu, Y., Li, C., Fu, W., Li, X., Li, W., Liu, C., Cong, N., Zhang, H. . . . Gong, P. (2014). Meta-discoveries from a synthesis of satellite-based land-cover mapping research. International Journal of Remote Sensing, 35(13), 4573–4588. https://doi.org/10.1080/01431161.2014.930206
Yu, X., Liong, S., & Baaovic, V. (2004). EV-SVM approach for real-time hydrologic forecasting. Journal of Hydroinformatics, 6(3), 209–223. https://doi.org/10.2166/hydro.2004.0016
Zhang, P., Ke, Y., Zhang, Z., Wang, M., Li, P., & Zhang, S. (2018). Urban land use and land cover classification using novel deep learning models based on high spatial resolution satellite imagery. Sensors, 18(11), 3717. https://doi.org/10.3390/s18113717
Zhang, C., Sargent, I., Pan, X., Li, H., Gardiner, A., Hare, J., & Atkinson, P. M. (2019). Joint deep learning for land cover and land use classification. Remote Sensing of Environment, 221, 173–187. https://doi.org/10.1016/j.rse.2018.11.014
Zong, L., He, S., Lian, J., Bie, Q., Wang, X., Dong, J., & Xie, Y. (2020). Detailed mapping of urban land use based on multi-source data: A case study of Lanzhou. Remote Sensing, 12(12), 1987. https://doi.org/10.3390/rs12121987
