Journal of Hydrology 579 (2019) 124172
Research papers
Groundwater aquifer potential modeling using an ensemble multi-adoptive boosting logistic regression technique
Hossein Mojaddadi Rizeei (a), Biswajeet Pradhan (a, b, ⁎), Maryam Adel Saharkhiz (a), Saro Lee (c, d, ⁎)

(a) Centre for Advanced Modelling and Geospatial Information Systems (CAMGIS), Faculty of Engineering and IT, University of Technology Sydney, NSW 2007, Australia
(b) Department of Energy and Mineral Resources Engineering, Sejong University, Choongmu-gwan, 209 Neungdong-ro, Gwangjin-gu, Seoul, 05006, Republic of Korea
(c) Geoscience Platform Division, Korea Institute of Geoscience and Mineral Resources (KIGAM), Gajeong-dong 30, Yuseong-gu, Daejeon 305-350, Republic of Korea
(d) Korea University of Science and Technology, 217 Gajeong-ro Yuseong-gu, Daejeon 34113, Republic of Korea
ARTICLE INFO

This manuscript was handled by G. Syme, Editor-in-Chief

Keywords: Machine learning; Groundwater aquifer potential; Multi-adaptive-boosting-logistic-regression; GIS; Optimization

ABSTRACT

Machine learning and data-driven models have achieved a favorable reputation in the field of advanced geospatial modeling, particularly for models of groundwater aquifer potential over large areas. Such models built using standalone machine learning techniques retain some uncertainty, including errors associated with the modeling process, sampling approach, and input hyper-parameters. Some of these techniques cannot be applied in data-scarce regions because high bias and variance can lead to oversimplification. Therefore, in the current study, we developed and validated a novel ensemble multi-adaptive boosting logistic regression (MABLR) model for groundwater aquifer potential mapping. This model was validated in a large area of the Gyeongsangbuk-do basin in South Korea and the results were compared to those of different types of machine learning models including multilayer perceptron (MLP), logistic regression (LR), and support vector machine (SVM) models. A forward stepwise LR technique was implemented to assess the importance of contributing morphological factors; we found 15 factors that contributed significantly: topographic wetness index (TWI), topographic roughness index (TRI), stream power index (SPI), topographic position index (TPI), multi-resolution valley bottom flatness (MVBF), slope, aspect, slope length (LS), distance from the river, distance from the fault, profile curvature, plane curvature, altitude, land use/land cover (LULC), and geology. We optimized the MABLR model using a fuzzy logic supervised (FLS) approach with 184 iterations and then validated the results using accuracy assessment metrics including the κ coefficient, root-mean-square error (RMSE), receiver operating characteristic (ROC) curve, and the precision-recall curve (PRC). Our model had superior predictive performance among the models tested, with higher overall goodness-of-fit and validation values according to the κ coefficient (0.819 and 0.781, respectively), ROC (0.917 and 0.838), and PRC (0.931 and 0.872). Our experimental results demonstrate that MABLR is more effective at reducing bias and variance error than other constituent machine learning methods.
⁎ Corresponding authors at: Centre for Advanced Modelling and Geospatial Information Systems (CAMGIS), Faculty of Engineering and IT, University of Technology Sydney, NSW 2007, Australia (B. Pradhan). Geoscience Platform Division, Korea Institute of Geoscience and Mineral Resources (KIGAM), Gajeong-dong 30, Yuseong-gu, Daejeon 305-350, Republic of Korea (S. Lee).
E-mail addresses: [email protected] (B. Pradhan), [email protected] (S. Lee).
https://doi.org/10.1016/j.jhydrol.2019.124172
Received 3 July 2019; Received in revised form 30 August 2019; Accepted 23 September 2019; Available online 23 September 2019
0022-1694/© 2019 Elsevier B.V. All rights reserved.

Fig. 1. Location of the study area.

1. Introduction

Groundwater is among the most valuable natural resources due to its vital importance in industrial, residential, and agricultural applications. As a non-renewable natural resource, groundwater quality affects the vulnerability of soil to pollution, drinking water quality, temperature modulation, environmental sensitivity, and local climate change (Manap et al., 2013). One third of the global population depends on groundwater for their daily needs (Oh et al., 2011). The development of groundwater is a key issue for the storage of fresh drinking water (Jothibasu and Anbazhagan, 2016). Therefore, it is important to investigate the behavior and characteristics of groundwater.

Groundwater transmissivity is determined by several factors including geological, physiographical, and morphological parameters, hydrological conditions, and climate variation (Kumar et al., 2015), and the availability and activity of groundwater can be affected by topography, lithology, geological structure, slope, and many other factors (Oh et al., 2011). The creation of a comprehensive model that can effectively consider all possible contributing factors for groundwater mapping is essential.

Hydrogeological lab tests, sample drilling, and geospatial models
are tools often used for mapping groundwater. Although these methods provide detailed recognition of subsurface hydrogeological structures (Helaly, 2017), they can be time-consuming and costly (Nampak et al., 2014). The development of geographic information systems (GIS), statistical techniques, machine learning models, and remote sensing data has led to advances in groundwater potential analyses (Yin et al., 2018). GIS and remote sensing technology have been used as spatial research tools in numerous environmental applications including hydrological studies and natural hazard risk assessments (Mojaddadi et al., 2017; Rizeei et al., 2018a,b; Rizeei et al., 2016, 2018c). Recently, machine learning and data mining methods have been implemented in many groundwater studies due to their ability to recognize patterns within inventory datasets and nonlinear relationships between parameters (Naghibi et al., 2018).

Numerous forms of GIS-based and machine learning models have been applied to groundwater potential mapping, including multi-criteria decision analysis (Kaliraj et al., 2014; Pradhan, 2010), frequency ratio (Guru et al., 2017; Oh et al., 2011; Rahmati et al., 2016), Dempster-Shafer theory (Rahmati and Melesse, 2016), weights-of-evidence modelling (Corsini et al., 2009; Ghorbani Nejad et al., 2017), Self-learning Random Forests (Sameen et al., 2018), logistic regression (Ozdemir, 2011; Rizeei et al., 2018a), decision tree (Chenini et al., 2010), evidential belief function (Mogaji et al., 2015), the logistic model tree (Rahmati et al., 2018), certainty factor (Razandi et al., 2015), analytical hierarchy process (Adiat et al., 2012; Yin et al., 2018), the statistical index (Falah et al., 2017), multivariate adaptive regression spline (Zabihi et al., 2016), index of entropy (Al-Abadi and Shahid, 2015), boosted regression tree (Naghibi et al., 2016), multivariate adaptive regression splines (Rahmati et al., 2019), artificial neural network model (Corsini et al., 2009), and aquifer sustainability factor (Smith et al., 2010).

In most cases, statistical and machine learning models perform well; however, if the training sample size is inadequate, these models tend to oversimplify reality. Sources of uncertainty in groundwater modeling include the modeling process, input parameters, and sampling approach (Refsgaard et al., 2007). An ensemble of the evidential belief function (Mohammady et al., 2012) and a tree-based model was proposed to create groundwater potential maps (Naghibi et al., 2019). The development of ensemble models has allowed the integration of a base-learner approach with a prime algorithm to achieve more robust models that can be applied over large study areas, where data coverage can be inconsistent (Naghibi et al., 2017). However, the application of hybrid models should be explored for different regions to determine the optimum model in terms of accuracy, robustness, overfitting, and sensitivity to scarce data (Rahmati et al., 2018).

To reduce these modeling uncertainties, we coupled a multi-adaptive boosting hybrid model (MultiAdaBoosting), based on a decision-committee technique that combines adaptive boosting (AdaBoost) with wagging, with logistic regression (LR), a robust model with strict expectations prior to training (Pradhan, 2010), to develop the ensemble MABLR model. Although MultiAdaBoosting is a powerful adaptive boosting classifier that can classify multiple classes in both basic and complex recognition problems, it can be sensitive to outliers in the dataset, which are very common in the groundwater domain, and to over-fitting (Naghibi et al., 2016). Therefore, we integrated LR with MultiAdaBoosting to overcome the model over-fitting and outlier sensitivity problems by producing a highly certain ensemble classifier with less dependency on modification of hyper-parameters or settings.

Our main goal was to evaluate the ability of MABLR to assess the morphological parameters of groundwater aquifer potential zones in South Korea and compare its results to those of support vector machine (SVM), multilayer perceptron (MLP), and standalone LR models. Specifically, we optimized the contributing factors for LR groundwater modeling; spatially modeled groundwater aquifer potential using the ensemble MABLR machine learning model and compared the results to those of other constituent machine learning models including MLP, SVM, and LR; assessed the strength of machine learning model hyper-parameters using a fuzzy logic supervised (FLS) approach; and compared the performance of MABLR and other machine learning models.

The study site was Uiseong County in South Korea, which covers an area of about 1175.2 km² (Fig. 1). The county comprises 831.0 km² (70.7%) of forest land, 214.6 km² (18.2%) of farmland, and 32.6 km² (2.7%) of rivers. The yearly average temperature is 11 °C, and it is a cold and dry region with very little precipitation due to its geographical nature as an inland basin situated between the Taebaek and Sobaek mountain ranges. It rains an average of 92 days annually. The average precipitation is 960 mm, which indicates a shortage of rainfall compared to the Korean mean precipitation of 1250 mm (http://www.usc.go.kr/eng/About_Uiseong/Introduction/Location).

We identified 169 rock aquifers within the study area and recorded specific capacity and transmissivity information for each well in the region based on field surveys. Rock aquifer data indicate that the maximum range for aquifer groundwater is about 955.41 m³/h in winter, declining to a minimum of 0.0005 m³/h in summer. Well inventory points were randomly separated into two classes of 70% (118 wells) for model training and 30% (51 wells) for model testing.

Based on the transmissivity (T) characteristics of each individual well, the well inventory was divided into two groups, productive (yield > 40 m³/h) and unproductive (yield < 40 m³/h), according to the criteria of Sameen et al. (2018). To create an effective well inventory for use in the machine learning models, productive and unproductive samples were assigned values of 1 and 0, respectively. Fig. 1 shows the locations of the groundwater wells within the study area.

We examined the effects of 12 contributing morphological factors. These factors were extracted using the ArcGIS 10.6 software in raster format at a spatial resolution of 10 m × 10 m and statistically analyzed using Waikato Environment for Knowledge Analysis (Weka) v. 3.9.2. Topographic indices were derived from a digital elevation model that was originally surveyed as a 1:5000-scale topographic map by the Korean National Geographic Information Institute.

We developed and calibrated the MABLR model to map groundwater potential in the basin through the following steps. First, significant contributing factors were selected. Then we modeled groundwater aquifer potential using the calibrated MABLR model and compared the results to those of other well-known machine learning models. Finally, we evaluated the model results using the κ coefficient, root-mean-square error (RMSE), receiver operating characteristic (ROC) curve, and the precision-recall curve (PRC) (Fig. 2).

Fig. 2. The overall flowchart of this study.

2. Materials and methods

2.1. Groundwater conditioning parameters

We performed probability analyses to examine the correlation among conditioning factors that can influence groundwater potential model results (Tehrany et al., 2013) and groundwater aquifer productivity. This requires training related parameters (Pradhan and Lee, 2010; Aghdam et al., 2016; Hong et al., 2017; Rizeei et al., 2018a), which can affect model precision.

We selected the following groundwater conditioning factors for their potential contribution to the groundwater model: topographic wetness index (TWI), topographic roughness index (TRI), stream power index (SPI), specific catchment area (SCA), topographic position index (TPI), multiresolution valley bottom flatness (MVBF), multiresolution ridge top flatness (MRTF), the convergence index (CI), Melton ruggedness number (MRN), slope, aspect, slope length (SL), distance from the river, distance from the fault, profile curvature, plane curvature, altitude, land use/land cover (LULC), soil, and geology.

To determine which contributing parameters were significantly correlated with groundwater aquifer productivity, a forward stepwise LR was applied using the Weka software. The LR assessed the degree of functional correlation among all contributing parameters and spring locations, which affect aquifer expansion (Hosmer et al., 2013; Ozdemir, 2011). Effective contributing parameters were defined as those with P < 0.05 (Rahmati et al., 2018); a total of 15 contributing parameters were identified and retained in the model: TWI, TRI, SPI, TPI, MVBF, slope, aspect, SL, distance from the river, distance from the fault, profile curvature, plane curvature, altitude, LULC, and geology (Fig. 3).

Elevation is among the most significant parameters used in groundwater analyses; groundwater aquifer potential in highly elevated areas approaches zero (Botzen et al., 2013). Surface runoff flows from highly elevated areas toward lower regions; consequently, groundwater potential is higher in low-altitude or flat terrains. Slope and aspect are topographical factors with important applications as hydrology parameters due to their effects on runoff accumulation and the velocity of excess rainfall (Rizeei et al., 2017). An increase in slope decreases the amount of time available for surface infiltration, increasing the amount of water entering drainage networks that will later be retrievable from groundwater aquifers. Aspect can also be an influential parameter, particularly in hilly areas. North- and east-facing slopes are exposed to long durations of solar radiation, so their vegetation cover is not as dense as that on west- and south-facing slopes in the Korean region. Hence, rainfall cannot penetrate the bare soil, where soil pores are more likely to be blocked by intense rainfall because of insufficient vegetation cover. As a result, the probability of groundwater is higher on west- or south-facing slopes than on east- and north-facing ones. Plan and profile curvature also contribute significantly to physical groundwater models; these parameters consist of raster data ranging from negative (concave) to positive (convex) values and must be classified before becoming model
input. Pixels with a value of zero are assigned to flat regions.

Fig. 3. Significant contributing factors to groundwater modelling.

TPI indicates the position of each cell, and is calculated as follows (De Reu et al., 2013; Guisan et al., 1999):

$$\mathrm{TPI} = \frac{E_{\mathrm{pixel}}}{E_{\mathrm{surrounding}}} \tag{1}$$

where $E_{\mathrm{pixel}}$ is the altitude of the cell and $E_{\mathrm{surrounding}}$ is the mean altitude of the neighboring pixels. High TPI values indicate upper slopes, while low TPI values indicate lower slopes, where the potential of the groundwater aquifer is high. The MVBF index links the size and flatness of valley bottoms, which was incorporated into the algorithm by reducing the slope threshold. A value of zero specifies erosional terrain with less possibility of a groundwater aquifer, while values above 1 indicate areas of deposition with a much more productive groundwater aquifer.

The MVBF index reflects the valley bottom characteristics of flatness and lowness. Flatness is measured using the inverse of the slope, and lowness is measured using ranking elevation with respect to a circular surrounding area. These two measures, both scaled from 0 to 1, are combined by multiplication and can be interpreted as fuzzy set membership functions (Gallant and Dowling, 2003; Kaufmann, 1975). LS is a combination of slope gradient (S) and slope length (L). We adopted an extensively used method for calculating LS, as follows:

$$\mathrm{LS} = \left(\frac{A_s}{22.13}\right)^{0.4} \left(\frac{\sin\beta}{0.0896}\right)^{1.3} \tag{2}$$

where $A_s$ is the accumulated flow of the unit stream power theory, which considers sediments and water, and $\beta$ is the slope in degrees.
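
As a rough illustration of Eqs. (1) and (2), the sketch below evaluates TPI and LS for a single cell in plain Python. The function names, the eight-cell neighbourhood, and the sample values are assumptions made for this example only; the study itself derived these indices as rasters in ArcGIS.

```python
import math


def tpi(e_pixel, neighbour_altitudes):
    """Eq. (1): cell altitude divided by the mean altitude of its neighbours."""
    e_surrounding = sum(neighbour_altitudes) / len(neighbour_altitudes)
    return e_pixel / e_surrounding


def ls_factor(a_s, slope_deg):
    """Eq. (2): LS from accumulated flow a_s and slope given in degrees."""
    sin_beta = math.sin(math.radians(slope_deg))
    return (a_s / 22.13) ** 0.4 * (sin_beta / 0.0896) ** 1.3


# Hypothetical cell: 110 m altitude surrounded by eight cells at 100 m.
example_tpi = tpi(110.0, [100.0] * 8)

# Hypothetical accumulated flow of 50 on a 5 degree and a 20 degree slope.
gentle_ls = ls_factor(50.0, 5.0)
steep_ls = ls_factor(50.0, 20.0)
```

With these inputs the TPI is slightly above 1 (the cell sits above its surroundings), and the steeper slope yields the larger LS value, in line with the interpretation given in the text.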

In general, a low LS value is more likely to indicate a productive groundwater aquifer.

SPI and TWI are water-related parameters calculated as follows (Gokceoglu et al., 2005):

$$\mathrm{SPI} = A_s \tan\beta \tag{3}$$

$$\mathrm{TWI} = \ln\left(\frac{A_s}{\tan\beta}\right) \tag{4}$$

where $A_s$ is the catchment area or flow accumulation (m² m⁻¹) and $\beta$ is the local slope gradient measured in degrees. SPI indicates the erosive power of water flow. TWI represents the effects of topography on runoff generation and the amount of flow accumulation at any location within the river catchment (Gokceoglu et al., 2005). The accuracy of a topographic index can be estimated with respect to grid spacing and terrain roughness by comparing the relationship between the topographic index surface and reference data.

TRI is another morphological parameter widely used in groundwater analyses; it is calculated in this study as follows:

$$\mathrm{TRI} = \mathrm{Abs}\left(\max{}^2 - \min{}^2\right) \tag{5}$$

where max and min represent the largest and smallest values of cells in nine rectangular neighborhoods of altitude values.

LULC types are also primary factors that strongly contribute to groundwater potential modeling. A detailed understanding of LULCs bears extreme significance for environmental and natural hazards (Rizeei et al., 2016). Lithology and geology are also important parameters used to detect sensitive groundwater aquifer areas. Soil type
directly affects the drainage process via characteristics such as texture, permeability degree, and structure. Lithological information regarding the permeability of rocks is also required. The study area contained rocks from 142 different lithology classes.

Variation in the factors contributing to the behavior and activity of groundwater causes ambiguity in the overlaying process. Therefore, all factors were normalized to a common scale in the feature raster before overlaying (Youssef et al., 2015; Mojaddadi et al., 2017; Fanos and Pradhan, 2019).

2.2. Model optimization

Hyper-parameters affect the quality and robustness of machine learning models and must, therefore, be selected to achieve the highest model performance (Liao et al., 2012). Once all significant hyper-parameters were selected, domain values were assigned for each individual hyper-parameter. These values indicate the range of probable values for each parameter. Because the optimal value for a single hyper-parameter should be coordinated with the other hyper-parameters, finding the most effective domain value of hyper-parameters for a model is a time-consuming procedure without optimization systems (Woo et al., 2007). In this study, we specified six classes for each domain to cover its effective range. Following domain selection, we applied an FLS technique to optimize the hyper-parameters (Zhang et al., 2010).

The FLS optimized the hyper-parameters of the MABLR, SVM, MLP, and LR models by assigning an optimal predictive value for all involved hyper-parameters within their domains to limit the degree of redundancy between them. Values of hyper-parameters that were exceptionally associated with the model and with low inter-correlations were selected on the basis of discrepancy evaluation (Tong and Murray, 2012). The FLS evaluated hyper-parameters by a search run iteratively from a random vertex that calculated the ideal value among the available domain. After all runs were assessed, the optimal hyper-parameter configuration was selected within 184 iterations according to evaluation metrics. The optimal hyper-parameters for all proposed models are summarized in Table 1.

Table 1
The optimal value for hyper-parameters of the models by the FbSP technique.

Model   Hyper-parameter            Optimal value
MABLR   Weight threshold           120
        Seed                       1
        Number of sub-committees   4
        Batch size                 100
MLP     Seed                       5
        Momentum                   0.8
        Learning rate              0.9
        Hidden layer attribute     9.5
        Hidden layer class         2
SVM     Kernel function            Poly-kernel
        Gamma in kernel            0.3
        C value                    0.25
        Penalty parameter          100
LR      Batch size                 150
        Ridge value                1.2e−9

2.3. Theory of the LR, SVM, MLP, and MABLR models

LR is a widely used multivariate statistical model that can be applied to continuous or discrete data of any distribution or raster format
(Lee and Sambath, 2006a,b). It was proposed by McFadden (1974) to measure the probability of occurrence depending on contributing parameters. LR can be used to evaluate relationships between a binary dependent variable and nominal or scalar independent variables (Shirzadi et al., 2012).

SVM was designed on the basis of statistical learning theory to minimize operational uncertainty (Yao et al., 2008). This process converts nonlinear structures into linear structures according to hyperplane creation (Tehrany et al., 2014). A separate hyperplane is created for the original space with n coordinates among points within two different categories (Marjanović et al., 2011). The hyperplane separates training datasets based on a kernel function of the SVM. Support vectors are recognized as neighboring training vertices of the ideal hyperplane. The goal of the SVM model is to recognize the ideal separating hyperplane range.

The MLP algorithm is a feed-forward artificial neural network (NN) that uses nodes linked by input signals and numeric weights to produce layers that receive, process, and display output (Harun et al., 2010). Back-propagation is applied to reduce errors accumulated via the repetitive approach. NNs have successfully been utilized in remote sensing applications. Limitations of the MLP model include high computational costs and overlearning (Mia and Dhar, 2016).

MultiAdaBoosting merges AdaBoost with wagging to produce a decision-committee model (Webb, 2000; Bui et al., 2016) that reduces both variance and bias. Although it cannot be applied to committees of < 10 members, MultiAdaBoosting exhibits greater error reduction than other related committee algorithms (Kotsiantis et al., 2007). In comparison, MABLR uses LR for classifier-based learning to generate decision committees with less error than either wagging or MultiAdaBoosting, even for a large cross-section of datasets (Webb, 2000). MABLR is more efficient than MultiAdaBoosting due to its matching parallel execution algorithms. The steps of MABLR implementation are shown in Fig. 4.

All classifiers determined by wagging are independent from all others, permitting parallel multiplication and creating uncertainty in the MultiAdaBoosting model at the sub-committee class. MABLR improves error reduction compared to other approaches, including bagging decision trees, wagging, and MultiAdaBoosting, particularly at ∊t < 10, when variance is amplified, thus reducing the frequency at which the central tendency is created and therefore reducing its ability to contribute to uncertainty.

2.4. Evaluation methods

The following evaluation metrics were applied to assess the accuracy of groundwater potential models: RMSE, κ coefficient, ROC, and PRC.

The κ coefficient measures the overall accuracy of the model among all correctly assigned samples on a diagonal basis in the error matrix allocated by the full dataset (Ridd and Liu, 1998). The κ coefficient is calculated as follows:

$$K = \frac{M \sum_{i=1}^{r} x_{ii} - \sum_{i=1}^{r} x_{i+}\, x_{+i}}{M^{2} - \sum_{i=1}^{r} x_{i+}\, x_{+i}} \tag{6}$$

where $r$ is the number of rows in the error matrix, $x_{ii}$ is the number of observations in row $i$ and column $i$, $x_{i+}$ and $x_{+i}$ are the marginal totals of row $i$ and column $i$, and $M$ is the total number of observations.

ROC curves are designed to evaluate and visualize the performance of an analytical model; they plot sensitivity, or the true positive rate (TP), associated with a decision threshold on the y-axis against 1 − specificity, or the false positive rate (FP), on the x-axis (Fawcett, 2006), thus representing the positive and negative probability, respectively, that a pixel is classified correctly. The area under the ROC curve estimates the overall accuracy of the model (Nampak et al., 2014; Pradhan, 2010). However, evaluation of the model solely by visual interpretation of ROC can be misleading; thus, the precision-recall curve (PRC) is a complementary evaluation metric that is useful for imbalanced datasets. The PRC shows the relationship between the positive predictive value (PPV), or precision, and sensitivity for all possible pixels, from which TP and FP can be calculated. The PRC graph is plotted with recall (sensitivity) on the x-axis and precision (PPV) on the y-axis. Each point on the PRC graph thus represents a selected cut-off. A perfect model will have an area under the ROC and PRC curves of 1, whereas a value approaching 0 indicates an inaccurate model.

RMSE is used to evaluate differences between the observed sample values and predicted model values. RMSE is the square root of the mean squared deviation, that is, the quadratic mean of the deviations between observed and predicted values (Hyndman and Koehler, 2006). RMSE was calculated as follows:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left(X_{\mathrm{test}} - X_{\mathrm{train}}\right)^{2}}{n}} \tag{7}$$

where $X_{\mathrm{test}}$ is the set of testing values and $X_{\mathrm{train}}$ is the set of training values at $i$.
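
The definitions in Eqs. (6) and (7) can be checked with a short, dependency-free Python sketch. The function names and the toy error matrix are assumptions for illustration; the area under the ROC curve is computed here through the equivalent pairwise-ranking formulation, which is not necessarily how the software used in the study computes it.

```python
import math


def kappa(error_matrix):
    """Eq. (6): kappa from a square error (confusion) matrix."""
    r = len(error_matrix)
    total = sum(sum(row) for row in error_matrix)            # M
    diagonal = sum(error_matrix[i][i] for i in range(r))     # sum of x_ii
    chance = sum(
        sum(error_matrix[i]) * sum(row[i] for row in error_matrix)
        for i in range(r)
    )                                                        # sum of x_i+ * x_+i
    return (total * diagonal - chance) / (total ** 2 - chance)


def rmse(observed, predicted):
    """Eq. (7): root of the mean squared deviation."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def roc_auc(labels, scores):
    """Area under the ROC curve as the share of correctly ranked (1, 0) pairs."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in positives
        for q in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy 2 x 2 error matrix (rows: observed class, columns: predicted class).
toy_kappa = kappa([[20, 5], [10, 15]])
toy_rmse = rmse([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5])
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

For the toy matrix, M = 50, the diagonal sum is 35, and the chance term is 25 × 30 + 25 × 20 = 1250, so κ = (1750 − 1250) / (2500 − 1250) = 0.4.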

Fig. 4. Steps of MABLR implementation.

3. Results and discussion

3.1. Groundwater potential mapping

Groundwater aquifer potential was modeled using four machine learning techniques: MABLR, MLP, SVM, and LR; maps based on these models are shown in Fig. 5. We focused mainly on the development of the MABLR ensemble model because this study is the first to implement it to determine groundwater probability; therefore, the optimization processes are discussed in detail. The models were assessed according to four accuracy metrics.

Groundwater potential aquifer maps were created on the basis of predicted probability ranging from 0 to 1, where 0 indicates no probable pixels and 1 indicates 100% probability of occurrence. To create thematic zoning maps, which are more easily understood by end-users and decision makers, we used the quantile technique in the GIS platform to reclassify the probability index into five classes: very high, high, moderate, low, and very low. High potential areas were located in low-elevation zones near riverbanks, and low potential areas were found in high-elevation areas with steep slopes. These findings were common among all groundwater potential aquifer maps; clearly, riparian areas and some upstream areas are expected to have the potential for groundwater yield.
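
The quantile reclassification described above can be approximated outside a GIS with the Python standard library. This is a minimal sketch, not the GIS routine itself: the function name, the use of `statistics.quantiles` with the inclusive method, and the synthetic probabilities are all assumptions made for illustration.

```python
import bisect
import statistics

LABELS = ["very low", "low", "moderate", "high", "very high"]


def quantile_classes(probabilities):
    """Assign each probability to one of five equal-count (quintile) classes."""
    # Four cut points split the sorted values into five groups.
    cuts = statistics.quantiles(probabilities, n=5, method="inclusive")
    return [LABELS[bisect.bisect_left(cuts, p)] for p in probabilities]


# Synthetic probability index for 100 pixels: 0.00, 0.01, ..., 0.99.
pixels = [i / 100 for i in range(100)]
classes = quantile_classes(pixels)
```

Because the breaks are quantiles, each class receives roughly the same number of pixels, which is why this scheme yields balanced thematic zones regardless of how the probabilities are distributed.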

Fig. 5. Groundwater aquifer potential maps calculated by a) LR, b) MLP, c) SVM, and d) MABLR models.

In particular, MABLR results indicated that locations with the highest groundwater aquifer potential were mainly situated in the western and southwestern regions of the study area (Fig. 5). By contrast, very low groundwater potential was assigned to eastern and northwestern regions of the study area. Among a total of 59 productive wells, 47 were assigned to very high and high groundwater aquifer potential zones, indicating the high precision of the MABLR model.

Fig. 6. The assigned weightage to each contributing factor by the MABLR model.

The MABLR model was used to extract the degree of contribution of each factor (Fig. 6). The most effective parameter in determining groundwater potential was SPI, with a weight of 4.844. Variation in SPI can directly increase or decrease groundwater potential. Plane curvature and MVBF were the second and third most influential factors, with weights of 4.315 and 4.240, respectively. These factors significantly affected runoff behavior and further delineated areas of groundwater concentration. Other hydrological and morphological factors including TPI, TRI, and SL also contributed greatly to groundwater potential zones, with weights of 3.076, 3.039, and 2.537, respectively. Altitude, index, TWI, and profile curvature made moderate contributions
(weights: 1.189, 0.996, 0.955, and 0.481, respectively). However, MABLR defined slope, aspect, geology, distance from fault, and LULC as the least influential factors for this study area, with weights < 0.2.

Most of the artificial intelligence-based models contain multiple hyper-parameters that must be precisely defined to achieve ideal results; their interactions must also be considered for model optimization (Dehnavi et al., 2015; Zare et al., 2013). The trial-and-error methods used by machine learning models to retrieve optimal hyper-parameter values are time-consuming and can introduce errors, particularly if the number of hyper-parameters exceeds four and the range of their domains is very wide (Mojaddadi Rizeei et al., 2019). Hence, automating this selection process is useful for decreasing computational time and increasing the accuracy of the final output by considering the full range of possible interactions among hyper-parameters.

We adopted FLS optimization, which refines the parameter configuration at each iteration until convergence; however, increasing the number of iterations does not necessarily result in an enhanced configuration. FLS determined optimal hyper-parameter values within 184 iterations. Table 1 lists the feasible and optimal hyper-parameters of the implemented models examined in this study, as determined by the FLS approach. The MABLR model was calibrated using four hyper-parameters: weight threshold, seed number, the number of sub-committees, and batch size.
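
The FLS procedure is not specified here in enough detail to reproduce, so the sketch below substitutes a plain random search to illustrate the general idea of automated selection: each hyper-parameter domain is discretized into six classes and a fixed budget of 184 evaluations is spent on sampled configurations. The function names, the domains, and the toy objective are hypothetical stand-ins; in practice the objective would be a validation metric of the trained model.

```python
import random


def six_classes(low, high):
    """Discretize a hyper-parameter domain into six evenly spaced candidates."""
    step = (high - low) / 5
    return [low + i * step for i in range(6)]


def random_search(domains, objective, iterations=184, seed=1):
    """Sample configurations at random and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(iterations):
        params = {name: rng.choice(values) for name, values in domains.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(params):
    """Placeholder score (higher is better); NOT a real model evaluation."""
    return -((params["momentum"] - 0.8) ** 2 + (params["learning_rate"] - 0.9) ** 2)


domains = {
    "momentum": six_classes(0.0, 1.0),
    "learning_rate": six_classes(0.0, 1.0),
}
best_params, best_score = random_search(domains, toy_objective)
```

A search of this kind evaluates whole configurations rather than one parameter at a time, which is what allows interactions among hyper-parameters to be taken into account.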

The well-calibrated MABLR ensemble model achieved greater bias and error reduction than MultiAdaBoosting, particularly at small committee sizes.
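
To make the committee idea concrete, the following is a simplified sketch of boosting combined with wagging-style weight resets, using a small gradient-descent logistic regression as the base learner. It is an illustration under stated assumptions, not the implementation used in the study: the function names, the learning settings, the sub-committee size rule, and the toy data are all hypothetical, and the weight reset uses exponential draws as an approximation of the random reweighting used in wagging.

```python
import math
import random


def fit_weighted_lr(X, y, w, epochs=200, rate=0.5):
    """Weighted logistic regression fitted by batch gradient descent."""
    coef = [0.0] * (len(X[0]) + 1)            # feature weights, then intercept
    total = sum(w)
    for _ in range(epochs):
        grad = [0.0] * len(coef)
        for xi, yi, wi in zip(X, y, w):
            z = coef[-1] + sum(c * v for c, v in zip(coef, xi))
            p = 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))
            err = wi * (p - yi) / total
            for j, v in enumerate(xi):
                grad[j] += err * v
            grad[-1] += err
        coef = [c - rate * g for c, g in zip(coef, grad)]
    return coef


def predict_lr(coef, xi):
    z = coef[-1] + sum(c * v for c, v in zip(coef, xi))
    return 1 if z >= 0 else 0


def wagging_weights(m, rng):
    """Random sample weights (exponential draws), rescaled to sum to m."""
    raw = [-math.log(rng.randint(1, 999) / 1000.0) for _ in range(m)]
    scale = m / sum(raw)
    return [r * scale for r in raw]


def multiboost_lr(X, y, committee_size=9, seed=1):
    rng = random.Random(seed)
    m = len(X)
    sub_size = max(1, int(math.sqrt(committee_size)))  # members per sub-committee
    w = [1.0] * m
    members = []                                       # (coefficients, vote) pairs
    for _ in range(committee_size):
        coef = fit_weighted_lr(X, y, w)
        wrong = [predict_lr(coef, xi) != yi for xi, yi in zip(X, y)]
        eps = sum(wi for wi, bad in zip(w, wrong) if bad) / sum(w)
        if eps > 0.5:                                  # worse than chance: discard, reset
            w = wagging_weights(m, rng)
            continue
        if eps == 0.0:                                 # perfect member: large vote, reset
            beta = 1e-10
            w = wagging_weights(m, rng)
        else:                                          # AdaBoost-style reweighting
            beta = eps / (1.0 - eps)
            w = [max(1e-8, wi / (2.0 * eps if bad else 2.0 * (1.0 - eps)))
                 for wi, bad in zip(w, wrong)]
        members.append((coef, math.log(1.0 / beta)))
        if len(members) % sub_size == 0:               # sub-committee done: wagging step
            w = wagging_weights(m, rng)
    return members


def predict_committee(members, xi):
    """Weighted vote of all committee members."""
    score = sum(vote if predict_lr(coef, xi) == 1 else -vote for coef, vote in members)
    return 1 if score >= 0 else 0


# Toy one-feature dataset: 0 = unproductive, 1 = productive (hypothetical values).
X = [[-2.0], [-1.5], [-1.0], [-0.5], [0.5], [1.0], [1.5], [2.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
members = multiboost_lr(X, y)
predictions = [predict_committee(members, xi) for xi in X]
```

Within a sub-committee the sample weights follow the usual boosting update, which targets bias; at each sub-committee boundary the weights are redrawn at random, which is the wagging component aimed at variance.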

3.2. Evaluation of the groundwater potential models

The models examined in this study were evaluated by RMSE, κ coefficient, ROC, and PRC, which reflect the efficiency, accuracy, and validity of the resulting groundwater potential aquifer maps. The greatest difference between ROC and PRC is that the ROC graph takes true negative results into account, whereas the PRC does not (Table 2).

The assessment was divided into two parts: goodness-of-fit (success) and validation (prediction). A training sample of well locations, which represented 70% of the total inventory, was used to assess the success of the model, and the testing sample included the remaining 30% of data not used during the modeling process. All metrics indicated a considerable correlation between the model results and observed data; however, the trained data were not used in the validation process.
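
The 70/30 partition of the 169-well inventory can be outlined as follows. This is a minimal sketch assuming a simple random shuffle; the function name, the integer well identifiers, and the seed are hypothetical, and the study does not state which tool performed the split.

```python
import random


def split_wells(well_ids, train_fraction=0.7, seed=42):
    """Randomly partition well identifiers into training and testing sets."""
    rng = random.Random(seed)
    shuffled = list(well_ids)
    rng.shuffle(shuffled)
    n_train = int(train_fraction * len(shuffled))   # 0.7 * 169 -> 118 wells
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers for the 169 inventoried wells.
train_wells, test_wells = split_wells(range(169))
```

Goodness-of-fit metrics are then computed on the training wells and validation metrics on the held-out wells, so the two sets must not overlap.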
The success rate (goodness-of-fit) and prediction rate (validation) results are shown in Table 2. The RMSE results indicated that MABLR had lower error than all other models, with values of 0.2483 and 0.3003 for goodness-of-fit and validation assessment, respectively. The predictive performance of MABLR was also greater for goodness-of-fit and validation assessment in terms of κ coefficient (0.8191 and 0.7814, respectively), ROC (0.917 and 0.838), and PRC (0.931 and 0.872) (Fig. 7). Because its integration with LR reduces bias and variance in the dataset, the ensemble MABLR reduced the negative effects of outliers and sampling patterns more than the other implemented models. It also showed the best capability to minimize overfitting, a common issue among ensemble models: owing to the optimized hyper-parameters of MultiAdaBoosting, its accuracy remained stable from goodness-of-fit to validation.

Fig. 7. The area under the ROC graph for goodness-of-fit and validation level.

Table 2
The results of goodness-of-fit and validation evaluation of all the applied models.
Model   Goodness-of-fit                              Validation
        RMSE     κ coefficient   ROC     PRC         RMSE     κ coefficient   ROC     PRC
MABLR   0.2483   0.8191          0.917   0.931       0.3003   0.7814          0.838   0.872
MLP     0.2385   0.6954          0.843   0.924       0.435    0.6697          0.823   0.851
SVM     0.3119   0.6853          0.834   0.883       0.3217   0.6801          0.813   0.791
LR      0.2937   0.5569          0.822   0.8685      0.4704   0.5401          0.745   0.8116

The MLP results for goodness-of-fit included a κ coefficient of 0.6954, ROC of 0.843, and PRC of 0.924. Slightly lower values were obtained for validation: 0.6697, 0.823, and 0.851, respectively. RMSE values indicated that the MLP model had the second highest success rate (0.2385) and the third highest validation rate (0.435) among all models.

The SVM model yielded κ coefficient, ROC, and PRC values of 0.6853 and 0.6801, 0.834 and 0.813, and 0.883 and 0.791 for
goodness-of-fit and validation assessment, respectively. SVM placed third among all accuracy assessment metrics. SVM RMSE values indicated the highest error in terms of success rate (0.3119), and the second lowest error in terms of validation rate (0.3217).

LR showed the lowest accuracy among all models, with κ coefficient, ROC, and PRC values of 0.5569, 0.822, and 0.8685, respectively, in terms of goodness-of-fit and 0.5401, 0.745, and 0.8116 in terms of validation. RMSE values indicated that LR had a slightly higher success rate (0.2937) than the SVM and the worst validation performance among all models (0.4704).

In general, all models examined in this study had an acceptable amount of uncertainty and high goodness-of-fit. The well-calibrated ensemble MABLR model exhibited the highest performance for modeling groundwater aquifer potential.

4. Conclusion

Sustainable groundwater aquifer management requires precise modeling to accurately and reliably simulate conditions in nature. Modeling groundwater aquifer potential is a delicate process involving the estimation of several morphological and hydrological parameters. Several techniques have been proposed for groundwater potential mapping; however, not all can be applied in data-scarce regions where bias and variance are high, as they tend toward oversimplification. Although MultiAdaBoosting is a powerful adaptive boosting classifier that can classify multiple classes even on complex recognition problems, it is prone to overfitting and is sensitive to outliers in the dataset, which are very common in the groundwater domain. Therefore, we proposed the ensemble MABLR, which reduces bias and variance in the dataset, and applied it to the Gyeongsangbuk-do basin of South Korea. Integrating MultiAdaBoosting with the LR function reduced sensitivity to outliers and to the training distribution, which resulted in a tangible reduction of overfitting with less dependency on the modification of hyper-parameters.

Several contributing factors were assessed using a dataset of specific capacity and transmissivity for 169 well locations. Initially, we applied a forward stepwise LR algorithm to identify 15 significantly contributing morphological factors: TWI, TRI, SPI, TPI, MRVBF, slope, aspect, SL, distance from the river, distance from fault, profile curvature, plan curvature, altitude, LULC, and geology. Then we developed a new robust ensemble method, coupling LR with the MultiAdaBoosting technique to construct the MABLR model, which showed higher performance than other well-known machine learning methods including MLP, SVM, and standalone LR. We applied FLS to successfully retrieve optimal hyper-parameter values for the implemented models. The model results showed that MABLR had the best accuracy and efficiency based on evaluation by RMSE, κ coefficient, ROC, and PRC. The most influential contributing factors were identified as SPI, plan curvature, and MRVBF. Visual interpretation of high groundwater aquifer potential areas showed that they were located in low-elevation zones near riverbanks, whereas low potential areas were located in high-elevation areas with steep slopes. Our results will be valuable for evaluating groundwater studies and successive model development to further reduce uncertainties and consider the morphological factors that influence the precision of groundwater potential modeling. The main limitation of this research was the moderate spatial resolution of the contributing factors, which reduced the quality of groundwater mapping. Thus, input data at 1-meter spatial resolution are suggested to improve the precision of the final map. Since the proposed model is capable of modeling functions with scarce input data, it is also recommended that it be tested on other probability applications, such as landslide modeling, that have smaller inventory datasets. However, the proposed model should be implemented in multiple regions to test its transferability and reliability before it can be applied to assess the vulnerability of wells.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This research was supported by the Basic Research Project of the Korea Institute of Geoscience and Mineral Resources (KIGAM) and Science and Technology Internationalization Project (NRF-2016K1A3A1A09915721) funded by the Ministry of Science and ICT. The research is supported by the Centre for Advanced Modelling and Geospatial Information Systems (CAMGIS), University of Technology Sydney under grant numbers: 323930, 321740.2232335; 321740.2232424 and 321740.2232357.

The English in this document has been checked by at least two professional editors, both native speakers of English. For a certificate, please see: http://www.textcheck.com/certificate/189N3i.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.jhydrol.2019.124172.

References

Adiat, K., Nawawi, M., Abdullah, K., 2012. Assessing the accuracy of GIS-based elementary multi criteria decision analysis as a spatial prediction tool–a case of predicting potential zones of sustainable groundwater resources. J. Hydrol. 440, 75–89.
Aghdam, I.N., Varzandeh, M.H.M., Pradhan, B., 2016. Landslide susceptibility mapping using an ensemble statistical index (Wi) and adaptive neuro-fuzzy inference system (ANFIS) model at Alborz Mountains (Iran). Environ. Earth Sci. 75 (7), 553. https://doi.org/10.1007/s12665-015-5233-6.
Al-Abadi, A.M., Shahid, S., 2015. A comparison between index of entropy and catastrophe theory methods for mapping groundwater potential in an arid region. Environ. Monit. Assess. 187 (9), 576.
Bui, D.T., Ho, T.-C., Pradhan, B., Pham, B.-T., Nhu, V.-H., Revhaug, I., 2016. GIS-based modeling of rainfall-induced landslides using data mining-based functional trees classifier with AdaBoost, Bagging, and MultiBoost ensemble frameworks. Environ. Earth Sci. 75 (14), 1101. https://doi.org/10.1007/s12665-016-5919-4.
Botzen, W., Aerts, J., Van den Bergh, J., 2013. Individual preferences for reducing flood risk to near zero through elevation. Mitig. Adapt. Strat. Gl. 18 (2), 229–244.
Chenini, I., Mammou, A.B., El May, M., 2010. Groundwater recharge zone mapping using GIS-based multi-criteria analysis: a case study in Central Tunisia (Maknassy Basin). Water Resour. Manage. 24 (5), 921–939.
Corsini, A., Cervi, F., Ronchetti, F., 2009. Weight of evidence and artificial neural networks for potential groundwater spring mapping: an application to the Mt. Modino area (Northern Apennines, Italy). Geomorphology 111 (1–2), 79–87.
De Reu, J., Bourgeois, J., Bats, M., Zwertvaegher, A., Gelorini, V., De Smedt, P., Chu, W., Antrop, M., Maeyer, P.D., Finke, P., Meivenne, M.V., Verniers, J., Crombe, P., 2013. Application of the topographic position index to heterogeneous landscapes. Geomorphology 186, 39–49.
Dehnavi, A., Aghdam, I.N., Pradhan, B., Varzandeh, M.H.M., 2015. A new hybrid model using step-wise weight assessment ratio analysis (SWARA) technique and adaptive neuro-fuzzy inference system (ANFIS) for regional landslide hazard assessment in Iran. Catena 135, 122–148. https://doi.org/10.1016/j.catena.2015.07.020.
Falah, F., Ghorbani Nejad, S., Rahmati, O., Daneshfar, M., Zeinivand, H., 2017. Applicability of generalized additive model in groundwater potential modelling and comparison its performance by bivariate statistical methods. Geocarto Int. 32 (10), 1069–1089.
Fanos, A.M., Pradhan, B., 2019. A spatial ensemble model for rockfall source identification from high resolution LiDAR data and GIS. IEEE Access. 7, 74570–74585. https://doi.org/10.1109/ACCESS.2019.2919977.
Fawcett, T., 2006. An introduction to ROC analysis. Pattern Recogn. Lett. 27 (8), 861–874.
Gallant, J.C., Dowling, T.I., 2003. A multiresolution index of valley bottom flatness for mapping depositional areas. Water Resour. Res. 39 (12).
Ghorbani Nejad, S., Falah, F., Daneshfar, M., Haghizadeh, A., Rahmati, O., 2017. Delineation of groundwater potential zones using remote sensing and GIS-based data-driven models. Geocarto Int. 32 (2), 167–187.
Gokceoglu, C., Sonmez, H., Nefeslioglu, H.A., Duman, T.Y., Can, T., 2005. The 17 March 2005 Kuzulu landslide (Sivas, Turkey) and landslide-susceptibility map of its near vicinity. Eng. Geol. 81 (1), 65–83.
Guisan, A., Weiss, S.B., Weiss, A.D., 1999. GLM versus CCA spatial modeling of plant species distribution. Plant Ecol. 143 (1), 107–122.
Guru, B., Seshan, K., Bera, S., 2017. Frequency ratio model for groundwater potential mapping and its sustainable management in cold desert, India. J. King Saud Univ. Sci. 29 (3), 333–347.
Harun, N., Dlay, S.S., Woo, W.L., 2010. Performance of keystroke biometrics authentication system using multilayer perceptron neural network (MLP NN), Communication Systems Networks and Digital Signal Processing (CSNDSP), 2010 7th International Symposium on. IEEE. pp. 711–714.
Helaly, A.S., 2017. Assessment of groundwater potentiality using geophysical techniques in Wadi Allaqi basin, Eastern Desert, Egypt-Case study. NRIAG J. Astron. Geophys. 6 (2), 408–421.
Hong, H., Liu, J., Zhu, A.X., Shahabi, H., Pham, B.T., Chen, W., Pradhan, B., Tien Bui, D., 2017. A novel hybrid integration model using support vector machines and random subspace for weather-triggered landslide susceptibility assessment in the Wuning area (China). Environ. Earth. Sci. 76, 652. https://doi.org/10.1007/s12665-017-6981-2.
Hosmer Jr, D.W., Lemeshow, S., Sturdivant, R.X., 2013. Applied Logistic Regression. John Wiley & Sons, pp. 398.
Hyndman, R.J., Koehler, A.B., 2006. Another look at measures of forecast accuracy. Int. J. Forecast. 22 (4), 679–688.
Jothibasu, A., Anbazhagan, S., 2016. Modeling groundwater probability index in Ponnaiyar River basin of South India using analytic hierarchy process. Model. Earth Syst. Environ. 2 (3), 109.
Kaliraj, S., Chandrasekar, N., Magesh, N., 2014. Identification of potential groundwater recharge zones in Vaigai upper basin, Tamil Nadu, using GIS-based analytical hierarchical process (AHP) technique. Arab. J. Geosci. 7 (4), 1385–1401.
Kaufmann, A., 1975. Introduction to the Theory of Fuzzy Subsets. Academic Pr, pp. 2.
Kotsiantis, S.B., Zaharakis, I., Pintelas, P., 2007. Supervised machine learning: a review of classification techniques. Emerg. Artificial Intell. Appl. Comput. Eng. 160, 3–24.
Kumar, P., Bansod, B.K., Debnath, S.K., Thakur, P.K., Ghanshyam, C., 2015. Index-based groundwater vulnerability mapping models using hydrogeological settings: a critical evaluation. Environ. Impact Assess. 51, 38–49.
Lee, S., Sambath, T., 2006a. Landslide susceptibility mapping in the Damrei Romel area, Cambodia using frequency ratio and logistic regression models. Environ. Geol. 50 (6), 847–855.
Lee, S., Sambath, T., 2006b. Landslide susceptibility mapping in the Damrei Romel area, Cambodia using frequency ratio and logistic regression models. Environ. Geol. 50, 847–855.
Liao, S.-H., Chu, P.-H., Hsiao, P.-Y., 2012. Data mining techniques and applications–A decade review from 2000 to 2011. Expert Syst. Appl. 39 (12), 11303–11311.
Manap, M.A., Sulaiman, W.N.A., Ramli, M.F., Pradhan, B., Surip, N., 2013. A knowledge-driven GIS modeling technique for groundwater potential mapping at the Upper Langat Basin, Malaysia. Arab. J. Geosci. 6 (5), 1621–1637.
Marjanović, M., Kovačević, M., Bajat, B., Voženílek, V., 2011. Landslide susceptibility assessment using SVM machine learning algorithm. Eng. Geol. 123 (3), 225–234.
McFadden, D., 1974. Conditional logit analysis of qualitative choice behavior. In: Zarembka, P. (Ed.), Frontiers in Econometrics. Academic Press, New York, pp. 105–142.
Mia, M., Dhar, N.R., 2016. Prediction of surface roughness in hard turning under high pressure coolant using Artificial Neural Network. Measurement 92, 464–474.
Mogaji, K., Lim, H., Abdullah, K., 2015. Regional prediction of groundwater potential mapping in a multifaceted geology terrain using GIS-based Dempster-Shafer model. Arab. J. Geosci. 8 (5), 3235–3258.
Mojaddadi, H., Pradhan, B., Nampak, H., Ahmad, N., Ghazali, A.H.B., 2017. Ensemble machine-learning-based geospatial approach for flood risk assessment using multi-sensor remote-sensing data and GIS. Geomat. Nat. Haz. Risk. 8 (2), 1080–1102. https://doi.org/10.1080/19475705.2017.1294113.
Mojaddadi Rizeei, H., Pradhan, B., Saharkhiz, M.A., 2019. Urban object extraction using Dempster Shafer feature-based image analysis from worldview-3 satellite imagery. Int. J. Remote Sens. 40 (3), 1092–1119.
Mohammady, M., Pourghasemi, H.R., Pradhan, B., 2012. Landslide susceptibility mapping at Golestan Province, Iran: a comparison between frequency ratio, Dempster-Shafer, and weights-of-evidence models. J. Asian Earth Sci. 61 (15), 221–236. https://doi.org/10.1016/j.jseaes.2012.10.005.
Naghibi, S.A., Moghaddam, D.D., Kalantar, B., Pradhan, B., Kisi, O., 2017. A comparative assessment of GIS-based data mining models and a novel ensemble model in groundwater well potential mapping. J. Hydrol. 548, 471–483. https://doi.org/10.1016/j.jhydrol.2017.03.020.
Naghibi, S.A., Pourghasemi, H.R., Dixon, B., 2016. GIS-based groundwater potential mapping using boosted regression tree, classification and regression tree, and random forest machine learning models in Iran. Environ. Monit. Assess. 188 (1), 44.
Naghibi, S.A., Dolatkordestani, M., Rezaei, A., Amouzegari, P., Heravi, M.T., Kalantar, B., Pradhan, B., 2019. Application of rotation forest with decision trees as base classifier and a novel ensemble model in spatial modeling of groundwater potential. Environ. Monit. Assess. 191 (4), 248.
Naghibi, S., Vafakhah, M., Hashemi, H., Pradhan, B., Alavi, S., 2018. Groundwater augmentation through the site selection of floodwater spreading using a data mining approach (case study: Mashhad plain, Iran). Water 10 (10), 1405.
Nampak, H., Pradhan, B., Manap, M.A., 2014. Application of GIS based data driven evidential belief function model to predict groundwater potential zonation. J. Hydrol. 513, 283–300.
Oh, H.-J., Kim, Y.-S., Choi, J.-K., Park, E., Lee, S., 2011. GIS mapping of regional probabilistic groundwater potential in the area of Pohang City, Korea. J. Hydrol. 399 (3–4), 158–172.
Ozdemir, A., 2011. GIS-based groundwater spring potential mapping in the Sultan Mountains (Konya, Turkey) using frequency ratio, weights of evidence and logistic regression methods and their comparison. J. Hydrol. 411 (3–4), 290–308.
Pradhan, B., 2010. Remote sensing and GIS-based landslide hazard analysis and cross-validation using multivariate logistic regression model on three test areas in Malaysia. Adv. Space Res. 45 (10), 1244–1256.
Pradhan, B., Lee, S., 2010. Regional landslide susceptibility analysis using back-propagation neural network model at Cameron Highland, Malaysia. Landslides 7 (1), 13–30. https://doi.org/10.1007/s10346-009-0183-2.
Rahmati, O., Melesse, A.M., 2016. Application of Dempster-Shafer theory, spatial analysis and remote sensing for groundwater potentiality and nitrate pollution analysis in the semi-arid region of Khuzestan, Iran. Sci. Total Environ. 568, 1110–1123.
Rahmati, O., Moghaddam, D.D., Moosavi, V., Kalantari, Z., Samadi, M., Lee, S., Tien Bui, D., 2019. An automated python language-based tool for creating absence samples in groundwater potential mapping. Remote Sens. 11 (11), 1375.
Rahmati, O., Naghibi, S.A., Shahabi, H., Bui, D.T., Pradhan, B., Azareh, A., Melesse, A.M., 2018. Groundwater spring potential modelling: comprising the capability and robustness of three different modeling approaches. J. Hydrol. 565, 248–261.
Rahmati, O., Pourghasemi, H.R., Melesse, A.M., 2016. Application of GIS-based data driven random forest and maximum entropy models for groundwater potential mapping: a case study at Mehran Region, Iran. Catena 137, 360–372.
Razandi, Y., Pourghasemi, H.R., Neisani, N.S., Rahmati, O., 2015. Application of analytical hierarchy process, frequency ratio, and certainty factor models for groundwater potential mapping using GIS. Earth Sci. Inform. 8 (4), 867–883.
Refsgaard, J.C., van der Sluijs, J.P., Højberg, A.L., Vanrolleghem, P.A., 2007. Uncertainty in the environmental modelling process–a framework and guidance. Environ. Model. Softw. 22 (11), 1543–1556.
Ridd, M.K., Liu, J., 1998. A comparison of four algorithms for change detection in an urban environment. Remote Sens. Environ. 63 (2), 95–100.
Rizeei, H.M., Azeez, O.S., Pradhan, B., Khamees, H.H., 2018a. Assessment of groundwater nitrate contamination hazard in a semi-arid region by using integrated parametric IPNOA and data-driven logistic regression models. Environ. Monit. Assess. 190 (11), 633.
Rizeei, H.M., Pradhan, B., Saharkhiz, M.A., 2017. Surface Runoff Estimation and Prediction Regarding LULC and Climate Dynamics Using Coupled LTM, Optimized ARIMA and Distributed-GIS-Based SCS-CN Models at Tropical Region, GCEC 2017. Springer, pp. 1103–1126.
Rizeei, H.M., Pradhan, B., Saharkhiz, M.A., 2018b. An integrated fluvial and flash pluvial model using 2D high-resolution sub-grid and particle swarm optimization-based random forest approaches in GIS. Complex Intell. Syst. 1–20.
Rizeei, H.M., Pradhan, B., Saharkhiz, M.A., 2018c. Surface runoff prediction regarding LULC and climate dynamics using coupled LTM, optimized ARIMA, and GIS-based SCS-CN models in tropical region. Arab. J. Geosci. 11 (3), 53.
Rizeei, H.M., Saharkhiz, M.A., Pradhan, B., Ahmad, N., 2016. Soil erosion prediction based on land cover dynamics at the Semenyih watershed in Malaysia using LTM and USLE models. Geocarto Int. 31 (10), 1158–1177.
Sameen, M.I., Pradhan, B., Lee, S., 2018. Self-learning random forests model for mapping groundwater yield in data-scarce areas. Nat. Resour. Res. 1–19.
Shirzadi, A., Saro, L., Joo, O.H., Chapi, K., 2012. A GIS-based logistic regression model in rock-fall susceptibility mapping along a mountainous road: salavat Abad case study, Kurdistan, Iran. Nat. Hazards. 64, 1639–1656.
Smith, A.J., Walker, G., Turner, J., 2010. Aquifer Sustainability Factor: A Review of Previous Estimates. International Association of Hydrogeologists (AIH) and the Geological Society of Australia (GSA), pp. EP104589.
Tehrany, M.S., Pradhan, B., Jebur, M.N., 2013. Spatial prediction of flood susceptible areas using rule based decision tree (DT) and a novel ensemble bivariate and multivariate statistical models in GIS. J. Hydrol. 504, 69–79.
Tehrany, M.S., Pradhan, B., Jebur, M.N., 2014. Flood susceptibility mapping using a novel ensemble weights-of-evidence and support vector machine models in GIS. J. Hydrol. 512, 332–343.
Tong, D., Murray, A.T., 2012. Spatial optimization in geography. Ann. Assoc. Am. Geogr 102 (6), 1290–1309.
Webb, G.I., 2000. Multiboosting: a technique for combining boosting and wagging. Mach. Learn. 40 (2), 159–196.
Woo, M.W., Daud, W.R.W., Tasirin, S.M., Talib, M.Z.M., 2007. Optimization of the spray drying operating parameters–A quick trial-and-error method. Dry Technol. 25 (10), 1741–1747.
Yao, X., Tham, L., Dai, F., 2008. Landslide susceptibility mapping based on support vector machine: a case study on natural slopes of Hong Kong, China. Geomorphology 101 (4), 572–582.
Yin, H., Shi, Y., Niu, H., Xie, D., Wei, J., Lefticariu, L., Xu, S., 2018. A GIS-based model of potential groundwater yield zonation for a sandstone aquifer in the Juye Coalfield, Shangdong, China. J. Hydrol. 557, 434–447.
Youssef, A.M., Al-Kathery, M., Pradhan, B., 2015. Landslide susceptibility mapping at Al-Hasher area, Jizan (Saudi Arabia) using GIS-based frequency ratio and index of entropy models. Geosci. J. 19 (1), 113–134. https://doi.org/10.1007/s12303-014-0032-8.
Zabihi, M., Pourghasemi, H.R., Pourtaghi, Z.S., Behzadfar, M., 2016. GIS-based multivariate adaptive regression spline and random forest models for groundwater potential mapping in Iran. Environ. Earth Sci. 75 (8), 665.
Zare, M., Pourghasemi, H.R., Vafakhah, M., Pradhan, B., 2013. Landslide susceptibility mapping at Vaz Watershed (Iran) using an artificial neural network model: a comparison between multilayer perceptron (MLP) and radial basic function (RBF) algorithms. Arab. J. Geosci. 6 (8), 2873–2888.
Zhang, Y., Maxwell, T., Tong, H., Dey, V., 2010. Development of a supervised software tool for automated determination of optimal segmentation parameters for ecognition. ISPRS TC VII Symposium – 100 Years ISPRS, Vienna, Austria.