
Applied Computing and Geosciences 16 (2022) 100094

Contents lists available at ScienceDirect

Applied Computing and Geosciences


journal homepage: www.sciencedirect.com/journal/applied-computing-and-geosciences

Applying machine learning methods to predict geology using soil sample geochemistry
Timothy C.C. Lui a,b,*, Daniel D. Gregory b, Marek Anderson c, Well-Shen Lee d, Sharon A. Cowling b

a Department of Geological Sciences, Stanford University, Stanford, CA, 94305, USA
b Department of Earth Sciences, University of Toronto, Toronto, ON, M5S 3B1, Canada
c Department of Electrical Engineering and Computer Science, University of Applied Sciences, Lübeck, S-H, 23562, Germany
d Harquail School of Earth Sciences, Laurentian University, Sudbury, ON, Canada, P3E 2C6

ARTICLE INFO

Keywords:
Geological mapping
Soil geochemistry
Data science
Machine learning
Sampling method
Multiple classifier system

ABSTRACT

In this study we compared various machine learning techniques that used soil geochemistry to aid in geologic mapping. We tested six different sampling methods (undersample, oversample, Synthetic Minority Oversampling Technique (SMOTE), Adaptive Synthetic Sampling (ADASYN), SMOTE and Edited Nearest Neighbor (SMOTEENN), and SMOTE and Tomek links (SMOTETomek)). SMOTE performed best, with ADASYN and SMOTETomek having slightly lower effectiveness. Nine machine learning algorithms (naïve Bayes, logistic regression, quadratic discriminant analysis, nearest neighbors, radial basis function support-vector machine, artificial neural network, random forest, AdaBoost classifier, and gradient boosting classifier) were compared, and AdaBoost classifiers and gradient boosting classifiers were found to be most effective. Finally, we experimented with multiple classifier systems (MCS), testing different combinations of algorithms and various combinatorial functions. It was found that MCS can outperform individual models, and the best MCS combined nearest neighbors, radial basis function support-vector machine, artificial neural network, random forest, AdaBoost classifier, and gradient boosting classifier, then applied a logistic regression to the probabilities output by the models. Ultimately, we created a tool that is able to adequately predict underlying geology in the study area using soil geochemistry.

1. Introduction

It is crucial to understand the underlying geology of an area for many purposes. However, sometimes underlying geology is difficult to determine when there are no or few bedrock outcrops near the surface. For these areas, drilling and/or geophysical surveys are sometimes required. Unfortunately, these can be costly and often difficult, especially in areas that are remote, heavily vegetated, or have complex topography. Soils may assimilate the chemistry of the underlying geology. Soil chemistry is mainly affected by climate, living organisms, time, topography, and, importantly for this study, underlying rock material (Weil and Brady, 2017). Specifically, the C horizon, the deepest subdivision of the soil profile and hence the least weathered, is most likely to chemically resemble the underlying bedrock (Bradshaw and Thomson, 1979; Weil and Brady, 2017). The goal of this study is to explore what machine learning techniques are most effective at predicting underlying geology based on soil sample geochemistry.

Machine learning has been applied to aid in geologic mapping, but many questions about which techniques are most effective remain unanswered. Previous projects have used remote sensing, geophysical data, and geospatial data as predictors in their models, but few have included large amounts of soil geochemical data (Carranza and Laborte, 2016; Cracknell and Reading, 2014; Harris and Grunsky, 2015; Kuhn et al., 2018; Masoumi et al., 2017). Projects that have taken soil geochemistry into account tend to have used only decision tree ensemble methods like the random forest algorithm (Anderson, 2019; Cracknell et al., 2014; Kuhn et al., 2019, 2020). This study explores the use of soil geochemical data through a wider variety of algorithms than just decision tree ensemble methods. Additionally, no other predictive geological mapping study has analyzed the effectiveness of multiple sampling methods for imbalanced classes, nor the effectiveness of multiple classifier systems (MCS).

* Corresponding author. Department of Geological Sciences, Stanford University, Stanford, CA, 94305, USA.
E-mail address: [email protected] (T.C.C. Lui).

https://doi.org/10.1016/j.acags.2022.100094
Received 4 February 2021; Received in revised form 20 July 2022; Accepted 3 August 2022
Available online 11 August 2022
2590-1974/© 2022 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

This study explores the effectiveness of various machine learning techniques at predicting the underlying geologic unit using soil sample geochemistry, specifically the use of data sampling methods and the use of MCS. Our classes were imbalanced, meaning we had a different number of samples from each geologic unit. Since some algorithms can perform better with balanced classes, we explored six sampling methods (undersample, oversample, Synthetic Minority Oversampling Technique (SMOTE), Adaptive Synthetic Sampling (ADASYN), SMOTE and Edited Nearest Neighbor (SMOTEENN), and SMOTE and Tomek links (SMOTETomek)) that can create balanced classes from imbalanced ones (Batista et al., 2004). Simple descriptions can be found in Table 1. Furthermore, many types of machine learning algorithms exist, and it has not yet been tested which algorithm performs best with large amounts of soil geochemical data. In this study we ran nine different algorithms (naïve Bayes (NB), logistic regression (LR), quadratic discriminant analysis (QDA), nearest neighbors (NN), radial basis function support-vector machine (RBFSVM), artificial neural network (ANN), random forest (RF), AdaBoost classifier (AB), and gradient boosting classifier (GB)) to see which algorithm worked best at identifying underlying geology with this type of data. Descriptions of these algorithms can be found in Table 2. We also tested whether a multiple classifier system, an algorithm that works by combining the predictions of different algorithms, is effective (Ranawana and Palade, 2006). This produced a tool that can be used to understand subsurface geology when traditional methods may not be possible but soil sample analyses are available, or as a preliminary step to more expensive ground-truthing techniques (i.e., diamond drilling).

Table 2. Machine learning algorithms tested and descriptions.

Naïve Bayes (NB): Uses Gaussian distributions and likelihoods to make predictions (Shalev-Shwartz and Ben-David, 2014).
Logistic regression (LR): Fits sigmoid functions to training data to make predictions based on probability (Shalev-Shwartz and Ben-David, 2014).
Quadratic discriminant analysis (QDA): Decision boundaries are fitted using quadratic equations and Gaussian densities (Hastie et al., 2001).
Nearest neighbors (NN): Classifies based on the nearest data in the training set (Cover and Hart, 1967).
Radial basis function support-vector machine (RBFSVM): Predicts by creating decision surfaces based on the training set. In this case, the kernel used is the radial basis function kernel (Vapnik, 1998).
Artificial neural network (ANN): Inspired by neurons in brains, uses layers of neurons and learns weights between neuron connections to make predictions (Shalev-Shwartz and Ben-David, 2014).
Random forest (RF): Combines decision tree predictors and uses majority voting to create the random forest's prediction (Breiman, 2001).
AdaBoost classifier (AB): Sequentially combines weak learners, in this case decision stumps, that take into account previous mistakes and weighs learners differently (Zhu et al., 2009).
Gradient boosting classifier (GB): Sequentially creates decision trees that learn from errors in previous trees and scales trees differently (Friedman, 2001).
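The nine algorithms in Table 2 all have off-the-shelf scikit-learn implementations, the library used later in this study. A minimal sketch of how they might be instantiated; the settings shown are library defaults kept for illustration, not the tuned values reported in the paper:

```python
# Sketch: the nine classifiers of Table 2 as scikit-learn estimators.
# Settings are illustrative defaults, not the study's tuned values.
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import (RandomForestClassifier, AdaBoostClassifier,
                              GradientBoostingClassifier)

classifiers = {
    "NB": GaussianNB(),
    "LR": LogisticRegression(max_iter=1000),
    "QDA": QuadraticDiscriminantAnalysis(),
    "NN": KNeighborsClassifier(),
    "RBFSVM": SVC(kernel="rbf", probability=True),  # RBF-kernel SVM
    "ANN": MLPClassifier(max_iter=1000),
    "RF": RandomForestClassifier(),
    "AB": AdaBoostClassifier(),   # boosts depth-1 stumps by default
    "GB": GradientBoostingClassifier(),
}
```

Note that scikit-learn's AdaBoostClassifier boosts depth-1 decision trees (stumps) by default, matching the description in Table 2.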

2. Study area

Our study location is at the Klaza deposit located in the southern tip of the Dawson Range, southcentral Yukon (Fig. 1). Basement rocks comprise Devonian and older metamorphosed volcaniclastic and sedimentary rocks of the parautochthonous Yukon-Tanana terrane (YTT). Locally at Klaza, the YTT comprises the Finlayson assemblage, Simpson Range suite, and the Snowcap assemblage. The Snowcap assemblage is the oldest assemblage in the YTT and mainly comprises calc-silicate rocks, marble, carbonaceous schist, quartzite, and psammitic schist. The Snowcap assemblage is intruded by tonalite-granodiorite arc rocks of the Devonian-Mississippian-age Finlayson assemblage (Piercey and Colpron, 2009; Piercey et al., 2006). The Simpson Range suite is predominantly comprised of biotite-hornblende granitic to granodioritic rocks of Late Devonian to Early Carboniferous age which cut the prior volcano-sedimentary assemblages (Piercey and Murphy, 2000).

The YTT accreted to the Laurentian margin between the Late Triassic to Early Jurassic during east-dipping subduction (Allan et al., 2013; Colpron et al., 2006; Nelson et al., 2006, 2013; Piercey et al., 2006). The regionally extensive Long Lake suite and Minto suite granodiorite to monzonite plutons were emplaced during this subduction event. East-dipping subduction resumed in the mid-Cretaceous, leading to the emplacement of the Whitehorse plutonic suite (112–98 Ma) and the coeval Mount Nansen andesitic volcanic package (115–110 Ma) (Klöcking et al., 2016). The regionally extensive Dawson Range batholith consists of multiphase granodiorite to monzogranite of the Whitehorse plutonic suite. Continued east-dipping subduction in the Late Cretaceous led to the emplacement of the Casino plutonic suite (78–74 Ma), which is composed of rhyolite to dacite dikes and plugs (Allan et al., 2013; Nelson et al., 2013; Sack et al., 2021). The Casino plutonic suite is less voluminous than the Whitehorse suite, although it occurs regionally throughout the Dawson Range Gold Belt in the form of porphyritic dykes and stocks.

Table 1. Sampling methods tested and descriptions.

No change: No sampling method is used. Imbalance is maintained.
Random Undersampler: Randomly and without replacement, select from the majority classes down to the size of the smallest class (Batista et al., 2004).
Random Oversampler: Randomly duplicate, with replacement, samples from minority classes up to the size of the largest class (Batista et al., 2004).
Synthetic Minority Oversampling Technique (SMOTE): SMOTE is an algorithm that oversamples by synthetically creating data points. Apply SMOTE to all minority classes up to the size of the largest class (Batista et al., 2004).
Adaptive Synthetic Sampling (ADASYN): ADASYN is a distinct algorithm that synthetically oversamples. Apply ADASYN to all minority classes up to the size of the largest class (He et al., 2008).
SMOTE and Edited Nearest Neighbor (SMOTEENN): Edited Nearest Neighbor (ENN) is a cleaning technique that removes samples based on dissimilarity to nearby samples. First apply SMOTE to all minority classes up to the size of the largest class, then use ENN to remove potentially inadequate samples (Batista et al., 2004).
SMOTE and Tomek links (SMOTETomek): Tomek links is a cleaning technique that removes samples based on distances between dissimilar samples. First apply SMOTE to all minority classes up to the size of the largest class, then use Tomek links to remove potentially inadequate samples (Batista et al., 2004).

3. Methods

3.1. Data acquisition

Soil samples were collected from 1986 to 2017 on the property. Soil was retrieved between 30 and 80 cm deep using hand-held augers. Collection was done in grids, and location was recorded using a hand-held Global Positioning System (GPS) or, prior to widespread precise GPS availability, with hip chain and compass bearing to a grid baseline. These samples were dried and then screened so only fine-grained material was used, which is considered to be more geochemically representative than coarse-grained material (Grunsky and de Caritat, 2019).


Fig. 1. Location of study area, Klaza Property, Yukon, Canada. Georeferenced to UTM zone 8N projection using the WGS-84 datum. Each dot represents a soil sample
within the study area. The inset map shows the relative position of the study area in relation to provinces in northwestern Canada.

Then, they were analyzed in laboratories using a combination of fire assay fusion, atomic absorption spectrometry, aqua regia digestion, and inductively coupled plasma-atomic emission spectrometry. It is important to note that geochemical analyses are not able to perfectly capture the complete geochemical picture. For example, different acid digestion methods may not be able to dissolve certain minerals (Balaram and Subramanyam, 2022; Grunsky and de Caritat, 2019). However, these techniques are industry standards and incomplete data will always be present, so we are also testing the effectiveness of these methods despite the imperfect picture. Ultimately our database contains 11 832 soil samples with their respective GPS location and 51 elemental abundances.

We obtained a High-Resolution Digital Elevation Model (HRDEM) from Natural Resources Canada, which provided us with elevation data and allowed creation of slope and aspect maps of the region. We additionally obtained the most recent geologic map for the study area from the Yukon Geological Survey. The mapping took 40 days and was conducted in 2019 and 2020. This map compiles various geologic maps published by geological surveys as well as by geologic companies, and additionally considers results from geochronological data (Sack et al., 2021). Using these, we spatially combined the three sources of data, creating a data file that included 11 832 soil samples, their chemical abundances for 51 elements, the elevation, slope, and aspect at those points, and the corresponding geological unit.

3.2. Data pre-processing

The data were then checked for robustness, and it was determined that some of the data was inadequate and not all of it usable. Data collected before 2010 was discarded since analytical techniques were older and detection limits were different; hence all data we used was collected between 2010 and 2017. We applied an additive log ratio transform to the data using Ti as a divisor. Some elements (B, Ce, Cs, Ge, Hf, Hg, In, Li, Nb, Rb, Re, Se, Sn, Ta, Te, Tl, W, Y, and Zr) which were rarely reported or had very low variance were also removed. Those removed due to low variance were at least 75% the exact same value and arose due to the detection limits, hence we wanted to avoid an artifact from the data collection process. Additionally, only one feature was kept when features were highly correlated, which we defined as having a Pearson correlation coefficient greater than 0.9 or less than −0.9. Ca, Co, Cr, and Fe were dropped for these reasons. When geochemical values were missing, these were imputed with the lowest concentration value for the element for the year in which it was collected.

To avoid soil samples that might be labelled with the wrong geology due to local errors in the position of geologic contacts, we created a 50 m buffer in both directions from contacts between geologic units. Soil samples within the buffer were withdrawn from the dataset. This removed some of the uncertainty related to inferring geologic unit contacts as well as related to the mixing of soils between geologic units. The distribution of soil samples per geologic unit can be found in Table 3. Geologic units PDS, MSR, and LKP only had 20, 17, and 8 soil samples respectively, so we removed them from the dataset since there was not sufficient data to effectively train and test on. In total, 6605 soil samples and 23 elemental abundances were discarded. This resulted in what we called our full dataset, comprising 5227 soil samples and 31 features (concentrations of Ag, Al, As, Au, Ba, Be, Bi, Cd, Cu, Ga, K, La, Mg, Mn, Mo, Na, Ni, P, Pb, S, Sb, Sc, Sr, Th, Ti, U, V, and Zn, elevation, slope, and aspect).

Table 3. Number of soil samples available for each geologic unit.

Whitehorse Suite (mKW): 3194
Mount Nansen Group (mKN): 933
Minto Suite (LTrEJM): 518
Casino Suite (LKC): 370
Finlayson Assemblage (DMF): 212
Snowcap Assemblage (PDS): 20
Simpson Range Suite (MSR): 17
Prospector Mountain (LKP): 8
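The pre-processing steps described above (an additive log-ratio transform with Ti as the divisor, dropping features stuck at a detection limit, and keeping only one of each highly correlated pair) can be sketched in a few lines of pandas. The element columns and values below are synthetic placeholders, not the real assay data:

```python
# Sketch of the pre-processing above: ALR transform using Ti as divisor,
# then dropping near-constant and highly correlated features.
# All data here is synthetic and illustrative.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
geochem = pd.DataFrame({
    "Cu": rng.lognormal(2, 0.5, 200),
    "Pb": rng.lognormal(1, 0.4, 200),
    "Zn": rng.lognormal(3, 0.6, 200),
    "Ti": rng.lognormal(4, 0.3, 200),
})
geochem["Cu2"] = geochem["Cu"] * 1.5   # perfectly correlated with Cu
geochem["BDL"] = 0.01                  # stuck at a detection limit

# ALR: log(element / Ti) for every element except the divisor itself
# (and the constant column, which log-ratios cannot fix).
alr = np.log(geochem.drop(columns=["Ti", "BDL"]).div(geochem["Ti"], axis=0))

# Flag features where >= 75% of values are identical (detection-limit artifact).
mode_frac = geochem.apply(lambda s: s.value_counts(normalize=True).iloc[0])
low_variance = mode_frac[mode_frac >= 0.75].index.tolist()

# Flag one of each pair with |Pearson r| > 0.9, keeping the first.
corr = alr.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
correlated = [c for c in upper.columns if (upper[c] > 0.9).any()]

print("low-variance:", low_variance)   # ['BDL']
print("correlated:", correlated)       # ['Cu2']
```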


Additionally, we created three polygons, and samples within these formed an external testing set. This allowed us to test the performance of our models in an area that was not part of the train/test split. These external testing sets are far from each other, are on the edges of soil sample clusters to minimize spatial correlation with samples in the training set, and are near geologic contacts to test model performance around contacts. They cover four out of five of the labels in this study, since the excluded label DMF is all within one spatial cluster with no nearby contacts. 298 samples were in these external testing polygons; hence the dataset we used to train and test our methods was composed of 4929 soil samples and 31 features.

Table 4. Hyperparameters tested.

NB: Variance smoothing
LR: Penalty [l1, l2], C
QDA: Tolerance, regularization parameter
NN: Number of neighbors, weights [uniform, distance], algorithm [ball tree, kd tree], leaf size
RBFSVM: C, gamma, tolerance
ANN: Hidden layer sizes, alpha, activation function [identity, logistic, tanh, relu]
RF: Max depth, minimum samples to split, number of estimators, criterion [gini, entropy]
AB: Learning rate, number of estimators, base estimators [various decision tree classifiers]
GB: Number of estimators, learning rate, minimum samples to split, minimum samples in leaf, maximum depth
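A randomized search over hyperparameter grids like those in Table 4 is available as scikit-learn's RandomizedSearchCV. A sketch for the GB row only; the ranges are illustrative, and while the study drew 100 candidates, only 10 are drawn here to keep the sketch fast:

```python
# Sketch: randomized hyperparameter search with exponentially spaced
# numeric ranges, in the spirit of Table 4's GB row. The ranges and
# the toy dataset are illustrative, not the study's actual setup.
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=200, random_state=0)

param_distributions = {
    "n_estimators": [25, 50, 100, 200],       # exponential increments
    "learning_rate": loguniform(1e-3, 1e0),   # log-spaced continuous draw
    "min_samples_split": [2, 4, 8, 16],
    "min_samples_leaf": [1, 2, 4, 8],
    "max_depth": [1, 2, 4, 8],
}
search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_distributions, n_iter=10, cv=5,
    scoring="f1_micro", random_state=0)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Sampling candidates from log-spaced lists and distributions matches the paper's note that numerical hyperparameters were tested in exponential increments.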
3.3. Machine learning methods

Analyses were done using Python 3.7.3 and packages scikit-learn 0.23 (Pedregosa et al., 2011) and imblearn 0.5 (Lemaitre et al., 2017). First, we applied a centered log-ratio transform to the geochemical data to avoid issues with spurious correlations when working with compositional data (Aitchison, 1982). In contrast to other log-ratio transforms, the centered log-ratio does not drop a feature in the process, making the transformed data easier to interpret (Grunsky et al., 2017). Before training the algorithms, the dataset was divided by applying an 80–20 train-test split. Additionally, we stratified it, which means there was an equal proportion of imbalanced class samples in the training and testing sets. The training set was scaled and normalized using a z-score, and then the normalization parameters were applied to the testing set. A 5-fold cross-validation loop was applied upon training, which further increased the testing rigor and reduced overfitting. For hyperparameter selection we performed a randomized search with 100 iterations over the hyperparameter space. Numerical hyperparameters were tested in exponential increments. Fig. 2 visualizes the data division process and Table 4 shows the hyperparameters tested.

To compare results, models were evaluated using F1 micro scores and F1 macro scores. The F1 score is a metric that evaluates the performance of a model and takes into account the number of true positives, true negatives, false positives, and false negatives predicted. It is the harmonic mean of precision and recall. An F1 score was calculated for all classes. The F1 macro score gets the F1 score for each class and then averages the F1 score across all classes, weighing each class equally regardless of how many samples are in each class (Opitz and Burst, 2019). In contrast, the F1 micro score does not weigh classes equally. A benefit of the F1 macro score is that all geologic units are of equal importance regardless of how many soil samples were collected within them, while the F1 micro score will inform about the performance of the model over the whole study area.

Since the dataset is imbalanced, as seen in Table 3, we explored what sampling method best dealt with the imbalance problem. Six sampling techniques were tested; they are described in Table 1. We reran the code with ten random seeds. After comparing the performance of the sampling methods, it appeared that SMOTE was the best technique, so we used it for the rest of the study.

Next, to test which supervised machine learning algorithm is best for this problem, we ran nine different machine learning algorithms: logistic regression (LR), quadratic discriminant analysis (QDA), nearest neighbors (NN), radial basis function support-vector machine (RBFSVM), naïve Bayes (NB), artificial neural network (ANN), random forest (RF), AdaBoost classifier (AB), and gradient boosting classifier (GB). Simple descriptions and references can be found in Table 2. We ran code that used SMOTE as its sampling method and tested the performance of the nine distinct algorithms. This code was rerun with ten random seeds.

Additionally, we tested whether a multiple classifier system could be useful. Multiple classifier systems ensemble distinct models and form predictions that combine the results of the independent models (Ranawana and Palade, 2006). We tested multiple combinatorial functions, described in Table 5. We also tested different groups of algorithms to combine, as described in Table 6.

Table 5. Combinatorial functions tested.

SUM: Adds the probabilities of the models. Prediction is the geologic unit with the highest summed probability.
A-LR: Trains a logistic regression on the probabilities of the training set.
B-LR: Applies SMOTE to the probabilities of the training set, then trains a logistic regression on them.
C-LR: Scales the probabilities of the training set with a z-score, then trains a logistic regression.
D-LR: Scales the probabilities of the training set with a z-score, then applies SMOTE, then trains a logistic regression on them.

Fig. 2. Flowchart summarizing division of data and purposes. This data division process was repeated with 10 random seeds to account for randomness.
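The contrast between the two F1 averages used for evaluation is easy to see on a toy example: micro-averaged F1 reduces to overall accuracy in the single-label multiclass case, while macro-averaged F1 lets a badly predicted rare class drag the score down. The labels below are invented:

```python
# Sketch: F1 micro vs. macro on a deliberately imbalanced toy label set.
from sklearn.metrics import f1_score

y_true = ["A"] * 8 + ["B"] * 2
y_pred = ["A"] * 10   # the rare class B is never predicted

micro = f1_score(y_true, y_pred, average="micro", zero_division=0)
macro = f1_score(y_true, y_pred, average="macro", zero_division=0)
print(micro)  # 0.8: 8 of 10 samples correct
print(macro)  # ~0.444: mean of F1(A) = 8/9 and F1(B) = 0
```

This is why both scores are reported below: micro tracks performance over the whole study area, while macro weights every geologic unit equally.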


We then tested all the combinations between combinatorial functions and groups of algorithms and reran the process with 10 random seeds.

Table 6. Groups of algorithms tested.

MCS2 (RBFSVM, GB): Top two algorithms with no repetition of decision tree ensemble methods.
MCS3 (RBFSVM, ANN, GB): Top three algorithms with no repetition of decision tree ensemble methods.
MCS5 (RBFSVM, ANN, RF, AB, GB): Top five algorithms.
MCS6 (NN, RBFSVM, ANN, RF, AB, GB): Top six algorithms.
MCS7 (QDA, NN, RBFSVM, ANN, RF, AB, GB): Top seven algorithms.
MCS8 (LR, QDA, NN, RBFSVM, ANN, RF, AB, GB): Top eight algorithms.
MCS9 (NB, LR, QDA, NN, RBFSVM, ANN, RF, AB, GB): All algorithms.

4. Results

Fig. 3 compares the performance of the different sampling methods on the 9 models tested. Additionally, Fig. 4 compares the per-seed difference between using sampling methods and not. Since the code was rerun using multiple random seeds, it is important to compare differences in performance within the same random seed so that algorithms are compared under the same training and testing sets. Looking at Fig. 3a first, for all models the Random Undersampler performed significantly worse and had on average 0.10 lower F1 micro score than not using a sampling method. Decision tree based methods, RF, AB, and GB, all tended to perform better with oversampling-style techniques. Fig. 4a shows that Random Oversampler performed 0.010 ± 0.009 better in RF when compared to not using a sampling method. SMOTE performed 0.030 ± 0.010 better in RF, 0.019 ± 0.007 in AB, and 0.006 ± 0.006 in GB. ADASYN performed 0.030 ± 0.008 better in RF and 0.021 ± 0.007 in AB. Finally, using SMOTETomek led to an increase in F1 micro score of 0.031 ± 0.010 in RF and 0.021 ± 0.007 in AB.

Fig. 3b provides an alternate perspective on model performance where all geologic units are weighted equally instead of each being weighted by abundance of analyses. RF, AB, and GB all greatly improve F1 macro score when oversampling-style techniques are used. Notably, when compared to not using a sampling method, SMOTE increases RF's F1 macro score by 0.080 ± 0.020, AB by 0.035 ± 0.012, and GB by 0.020 ± 0.012; ADASYN increases RF by 0.079 ± 0.018, AB by 0.039 ± 0.013, and GB by 0.016 ± 0.013; and SMOTETomek increases RF by 0.080 ± 0.022, AB by 0.039 ± 0.012, and GB by 0.020 ± 0.013 (Fig. 4b).

The performance of the 9 different machine learning algorithms varied significantly (Fig. 5). We can also compare each pair of algorithms and their performances within the same seed. Fig. 6 shows a heatmap comparing the average differences in performance in F1 micro score between all algorithms. The best performing models were AB and GB, with F1 micro scores of 0.946 ± 0.007 and 0.943 ± 0.007 respectively. RF, RBFSVM, and ANN were slightly lower but still comparable at 0.935 ± 0.006, 0.932 ± 0.009, and 0.922 ± 0.008 respectively. Observing Fig. 6, we can see that AB and GB perform significantly better than these algorithms. Performing worse were NN at 0.887 ± 0.008, QDA at 0.843 ± 0.011, and LR at 0.794 ± 0.011. NB performed worst, with an F1 micro score of 0.693 ± 0.009.

Fig. 7 shows a comparison of performance between the MCS we tested. Since the code was rerun using multiple random seeds, Fig. 8 compares the differences in performance within the same random seed between the various MCS architectures and GB, the best individual algorithm. Information about MCS number and architecture is found in Tables 5 and 6. We can see that MCS can outperform individual models. All MCS except for A-LRMCS2, C-LRMCS2, and SUMMCS9 were able to significantly outperform GB (Fig. 8). The highest improvement came from D-LRMCS6, with an F1 micro score of 0.956 ± 0.006, which was on average 0.013 ± 0.005 above GB. Our results suggest that the oversampling step improves the performance of the LRMCS. The two methods that did not use SMOTE, A-LRMCS and C-LRMCS, consistently perform more poorly than their SMOTE counterparts, B-LRMCS and D-LRMCS. We can also observe some general trends regarding MCS behavior. As MCS number increases, SUMMCS and LRMCS tend to improve in performance. After MCS number 6, performance decreases as MCS number increases, and SUMMCS decreases at a faster rate than LRMCS.

To help visualize the effectiveness of the predictions made by D-LRMCS6, the best performing classifier, we plotted the predictions for each soil sample on top of the geologic map (Fig. 9). Additionally, Fig. 10 shows the various predicted probabilities for each geologic unit, which helps quantify how confident the model was regarding its predictions. This figure shows random seed 15, with an F1 micro score of 0.955, which was the seed with performance closest to the average. Table 7 shows a confusion matrix for that seed's testing set. Table 8 shows a confusion matrix for the external testing sets. The F1 micro score for the external testing set was 0.507. Overall, the model predicted well, and many simple boundaries were clear. The algorithm predicted the Finlayson Assemblage and Whitehorse Suite very well, while Casino Suite had the worst performance. Casino Suite was mostly incorrectly predicted as Minto Suite and Whitehorse Suite. Nothing was misclassified as Finlayson Assemblage.

5. Discussion

5.1. Comparing sampling methods

Our results provide insights into the differences in performance between various sampling methods. Figs. 3 and 4 show that Random Undersampler is not a good technique to use when predicting underlying geology from soil sample geochemistry, and it is better to simply keep the imbalance, as with the "no change" method. For all models, performance was significantly lower than "no change", with scores far outside the error bars (on average 0.10 lower F1 micro score than "no change"). For decision tree ensembling models we see that oversampling-style techniques are useful, with SMOTE, ADASYN, and SMOTETomek working the best. These techniques help increase F1 micro score, but importantly they also considerably increase F1 macro score. The oversampling algorithms predict the less abundant units better, hence the low increase in F1 micro score but higher increase in F1 macro score. For our next highest performing models, RBFSVM and ANN, these oversampling techniques did not significantly affect the model's performance. Since three out of the five best models improve significantly due to SMOTE (RF's F1 micro score increased by 0.030 ± 0.010, AB by 0.019 ± 0.007; RF's F1 macro score increased by 0.080 ± 0.020, AB by 0.035 ± 0.012, GB by 0.020 ± 0.012), and the other two remain statistically similar (the F1 micro score difference between SMOTE and no sampling method was −0.004 ± 0.006 for RBFSVM and −0.007 ± 0.010 for ANN), SMOTE was decided to be the best sampling method for this project. ADASYN and SMOTETomek were also suitable sampling methods, but given the increased complexity and runtime of these methods and the low payoffs, SMOTE was chosen instead. Overall, we see the value sampling methods could provide based on the algorithm used and suggest that SMOTE is most effective for similar studies (with ADASYN and SMOTETomek being good alternatives as well).

5.2. Machine learning algorithms compared

Using our previously suggested sampling method, SMOTE, we compared the performance of the various algorithms in Figs. 5 and 6. RF, AB, and GB performed best, showing that ensemble decision tree methods tend to be effective algorithms for similar projects.
AB, and GB performed best showing that ensemble decision tree

5
T.C.C. Lui et al. Applied Computing and Geosciences 16 (2022) 100094

Fig. 3. Comparison of performance of different sampling methods per algorithm, (a) using F1 micro score, (b) using F1 macro score. Error bars represent one
standard deviation after rerunning using ten random seeds.

methods tend to be effective algorithms for similar projects. There is no significant difference between the performance of AB (0.946 ± 0.007) and GB (0.943 ± 0.007). Both models performed significantly better than RF (by 0.011 ± 0.005 and 0.008 ± 0.007, respectively). Hence, our results suggest that using the more advanced ensemble decision tree methods can improve performance compared to RF. The runtime of AB is only about 1.5 times longer than RF's, while GB required about 5.5 times longer, so given their similar performance AB might be preferred over GB. RBFSVM (0.932 ± 0.009) and ANN (0.922 ± 0.008) also performed well, but not as well as AB and GB. The remaining four algorithms performed significantly lower (NN 0.887 ± 0.008, QDA 0.843 ± 0.011, LR 0.794 ± 0.011, NB 0.693 ± 0.009). For comparison, a trivial classifier has been added which always predicts the most abundant class in the training set; its F1 micro score is 0.641. All models perform at least better than the trivial classifier, showing that the soil dataset is informative about the underlying rock.

Our results support the notion that support vector machines, neural networks, and random forests are strong algorithms. Other comparative studies on predictive geologic mapping suggested random forests was the best algorithm (Cracknell and Reading, 2014). Our results support the notion that random forests is a strong algorithm for these types of projects, and provide further evidence that more complex decision tree ensemble techniques like GB and AB are effective.

5.3. Performance of multiple classifier systems

Our results showed that MCS can perform better than one of our best singular algorithms, GB (Figs. 7 and 8). B-LRMCS and D-LRMCS applied SMOTE first to balance the data, while A-LRMCS and C-LRMCS did not. Given that B-LR and D-LR outperformed A-LR and C-LR, it is evident that applying a sampling method such as SMOTE at this stage increases the performance of the MCS. A-LR and B-LR did not scale probabilities before running the logistic regression, while C-LR and D-LR did. Differences within these groups are minimal and within each other's error bars, so it is unlikely that scaling at this stage will significantly affect the classifiers.
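The algorithm comparison above can be sketched with scikit-learn, the library named in the code availability statement. The data, hyperparameters, and most-frequent baseline below are illustrative defaults on synthetic data, not the study's tuned models or the Yukon soil dataset:

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in: three imbalanced "geologic units" with numeric features.
X, y = make_classification(n_samples=1000, n_features=10, n_informative=6,
                           n_classes=3, n_clusters_per_class=1,
                           weights=[0.6, 0.25, 0.15], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

models = {
    "RF": RandomForestClassifier(random_state=0),
    "AB": AdaBoostClassifier(random_state=0),
    "GB": GradientBoostingClassifier(random_state=0),
    # Trivial baseline: always predict the most abundant training class.
    "trivial": DummyClassifier(strategy="most_frequent"),
}
for name, model in models.items():
    pred = model.fit(X_tr, y_tr).predict(X_te)
    print(f"{name}: F1 micro {f1_score(y_te, pred, average='micro'):.3f}, "
          f"F1 macro {f1_score(y_te, pred, average='macro'):.3f}")
```

For single-label problems the micro-averaged F1 equals accuracy, which is why the most-frequent baseline's F1 micro score simply matches the majority-class share of the test set.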


Fig. 4. Average difference of performance between using a sampling method versus not for each seed, (a) using F1 micro score, (b) using F1 macro score. Error bars
represent one standard deviation after rerunning using ten random seeds.

Observing the MCS number (the number of individual classifiers included in the multiple classifier system), we can find trends that inform us about the performance of our models. As MCS number increased, MCS performance tended to improve until MCS number 6. After this point, larger MCS numbers decreased model performance, and SUMMCS was affected more drastically than LRMCS. This suggests that the quality of the models combined is more important than the quantity of models combined. When MCS number is low, the increase in different algorithms and perspectives has a positive impact on MCS performance, but once we start adding poorly performing models like LR and NB, these lower-quality models do not add to the predictive ability and have a negative impact on performance. SUMMCS is more susceptible to this negative effect since it treats all models equally and simply adds probabilities; this is seen in how the SUMMCS score decreases drastically compared to the LRMCS after MCS number 6. The LRMCS successfully use a logistic regression to detect which algorithms are working best and to minimize the impact of poorly performing algorithms.

In this study, D-LRMCS6 was the best performing model (0.956 ± 0.006) and outperformed one of the best individual models, GB (0.943 ± 0.007). However, the F1 micro score increase was relatively small, the run time was longer, and the increase in complexity is large. Five additional models needed to be trained to create D-LRMCS6, and training D-LRMCS6 takes about twice as long as training GB alone. Nonetheless, increasing performance when performance is already high is difficult. Therefore, whether it is better to use GB or to create the MCS models will depend on the needs of the user.

5.4. External testing set performance

Predictions of one seed of D-LRMCS6 are mapped in Figs. 9 and 10, which aid in the visualization of the performance of our techniques. In general, the classifier effectively predicted the underlying geology, with a testing set F1 micro score of 0.955. Through detailed investigation of the classifier's performance, some observations can be made. Casino Suite was the most poorly predicted unit. It tended to be predicted as Whitehorse Suite and Minto Suite, likely because the lithologies are


Fig. 5. Comparison of performance of 9 algorithms. Trivial classifier always predicts points as the most abundant geologic unit in the training set. Error bars
represent one standard deviation after rerunning using ten random seeds.

Fig. 6. Heatmap of the average F1 micro score difference between each pair of algorithms. Uncertainty ranges represent one standard deviation after rerunning using ten random seeds.


Fig. 7. Comparison of performance of various MCS. MCS number represents the number of algorithms combined which are explained in more detail in Table 6. Each
bar represents a different MCS architecture explained in Table 5. GB is plotted for comparison since it is the best performing independent algorithm. Error bars
represent one standard deviation after rerunning using ten random seeds.

Fig. 8. Average difference of performance between MCS architectures and GB for each seed. Error bars represent one standard deviation after rerunning using ten
random seeds.
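The LRMCS design compared in Figs. 7 and 8 — base models whose class probabilities feed a logistic regression that learns which models to trust — corresponds to stacking in scikit-learn terms. A minimal sketch, with an illustrative base-model set and synthetic data rather than the paper's exact configuration:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=800, n_features=10, n_informative=6,
                           n_classes=3, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Base learners emit class probabilities; the logistic-regression combiner
# learns per-model weights, damping poorly performing members (unlike a
# SUMMCS-style plain sum of probabilities, which weights every model equally).
mcs = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("gb", GradientBoostingClassifier(random_state=0)),
                ("rbfsvm", SVC(kernel="rbf", probability=True, random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000),
    stack_method="predict_proba",
)
pred = mcs.fit(X_tr, y_tr).predict(X_te)
print(f"stacked MCS F1 micro: {f1_score(y_te, pred, average='micro'):.3f}")
```

A SUMMCS-style combiner would instead correspond to soft voting (`VotingClassifier(voting="soft")`), which averages probabilities rather than learning weights.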

similar. Casino Suite is made up of rhyolite to dacite, which is very compositionally similar to Whitehorse Suite and Minto Suite, which are both hornblende granodiorites. The similar chemical compositions of the rocks could explain why the chemistry of the soil, which formed as a product of their weathering, is not effective at distinguishing between the three lithologies. The model might have performed better if it tried to predict by lithological type and these units were grouped together; however, Casino Suite is related to gold mineralization, so we wanted to see if this unit could be identified and distinguished from similar rock. In contrast, our model best classifies samples from the Finlayson Assemblage. This is likely because the geologic unit is composed of amphibolites, which are significantly geochemically different from the igneous rocks that dominate the study area. For example, samples from this geologic unit tended to have significantly higher signals in Mg, Ni, Cr, and Cu, and lower signals in Ca, P, S, and Na, when compared to the other units. These differences were easily picked up by our algorithms.

Performance within the external testing set was lower than for the internal testing set. Note, however, that some important boundaries are being detected by our models (Figs. 9 and 10). In inset A, we can see that the model predicts the green, mKN polygon and boundary well, despite there being no nearby mKN points that were part of the training set. Additionally, inset B shows a complex external testing set with many


Fig. 9. Map comparing prediction made by D-LRMCS6 and a geologic map for the Klaza Property. F1 micro score was 0.962 on the full data set, 0.955 on the internal
testing set, and 0.507 on the external testing set. We chose to plot the random seed with performance closest to average. Each dot represents the position of a soil
sample, and the color of the dot indicates what geologic unit was predicted for that soil sample by D-LRMCS6. These dots lie on top of polygons for the geologic map.
External testing set polygons are outlined in the map and shown in more detail in the insets. (For interpretation of the references to color in this figure legend, the
reader is referred to the Web version of this article.)
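The 0.2 random-selection baseline quoted for the external testing set follows because micro-averaged F1 equals accuracy in single-label classification, and a uniform guesser over the five geologic units is right about one time in five regardless of class imbalance. A quick check with synthetic labels (not the real external set):

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
# Imbalanced synthetic labels standing in for five geologic units.
y = rng.choice(5, size=20000, p=[0.05, 0.10, 0.15, 0.20, 0.50])
X = np.zeros((y.size, 1))  # features are ignored by a dummy classifier

# Uniform random guessing: expected micro F1 is 1/5 = 0.2.
guess = DummyClassifier(strategy="uniform", random_state=0).fit(X, y).predict(X)
print(round(f1_score(y, guess, average="micro"), 2))
```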

geologic units and convoluted boundaries. Our model predicts the green, mKN points fairly well, and those that it ultimately gets wrong tend to have 0.1–0.5 probability of being mKN (Fig. 10d). It is also able to predict the boundary with the red, LKC unit (Figs. 9 and 10a). As mentioned before, Casino Suite (LKC) is geochemically similar to Minto Suite (LTrEJM) and Whitehorse Suite (mKW). This could explain why the northeastern part of inset B is predicted more poorly, often misclassifying samples between these three units.

The F1 score for the external testing set was 0.507, while the internal testing set F1 micro score was 0.955. For comparison, we can consider an alternate form of the trivial classifier. Instead of always predicting the most abundant class in the training set (which would be inadequate since the external testing set is imbalanced), we can compare performance to a classifier that predicts through random selection, which would have an expected F1 score of 0.2. We can then see that our technique is drastically better than a trivial classifier. The difference between external and internal testing set performance is expected, since data from the external testing set were spatially distant from data in the training set, and it seems our models were not able to generalize well to external soil samples. Part of the reason internal testing set performance is so high is also that neighboring points might be part of the training set, and hence the model is likely to predict correctly due to spatial correlations. However, it is inherently difficult to predict and measure performance on a study area like this.

5.5. Additional considerations

A possible solution to the spatial correlation issue is to create polygons to split the data into regional chunks, then perform cross-validation where one region is used to test the algorithm while the other regions are used to train the algorithm. Similarly, one could do cross-validation based on east to west position, e.g. testing on the 10% eastmost samples, training on the 90% westmost samples, and then repeating with other groups. Although that would help with the spatial autocorrelation, our study area is too spatially complex and imbalanced for it to work properly. There would be many spatial clusters absent of some of the geologic units, and we would not be able to train and test adequately. Our methods are able to test on areas spatially distant from our training set, cover difficult areas with geologic boundaries, and include four out of five geologic units in our study area. Although the external test set did not include any Finlayson Assemblage, our best model never improperly classified anything in the external test set as Finlayson Assemblage. Despite the external test set performance ultimately being low, our results and comparisons are still meaningful. Algorithms that performed better on the internal test set tended to also have better performances on the external test set, so performance differences in the internal test set do hold value for external test set performance.

Additional sources of error include potential misclassifications in the map itself. Predicted lithologies could possibly be correct but recorded as incorrect due to slight errors where the contacts have been interpreted under cover. In addition to this, contact areas, especially on slopes, are likely to contain contributions from multiple rock types sourced from further up-slope. This will alter the chemistry of the resulting soil, making it more difficult to predict the underlying geology. We tried to mitigate these impacts by creating the 50 m buffer in both directions of geologic contacts, but in some areas this may not have been sufficient.

The study area is inherently challenging, and geologic maps for the region have been updated multiple times throughout the past 5 years. New geologic units were introduced recently, and geochronological data


Fig. 10. Maps showing prediction probabilities made by D-LRMCS6 per geologic unit. We chose to plot the random seed with performance closest to average. Each map presents prediction probabilities for a different geologic unit. Each dot represents the position of a soil sample, and the color of the dot indicates what probability D-LRMCS6 predicted for that soil sample to be the respective geologic unit. These dots lie on top of polygons for the geologic map. External testing set polygons are outlined in the map and shown in more detail in the insets. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
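The regional-chunk cross-validation proposed in Section 5.5 maps onto scikit-learn's group-aware splitters. The sketch below uses a hypothetical east-west binning as the region label (the authors did not ultimately apply this scheme); each fold then tests on one strip while training on the others:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import GroupKFold

# Synthetic stand-in data; the real workflow would use the soil features.
X, y = make_classification(n_samples=1000, n_features=10, n_informative=6,
                           n_classes=3, n_clusters_per_class=1, random_state=0)
# Hypothetical region labels: pretend samples fall into 5 east-west strips.
easting = np.random.default_rng(0).uniform(0, 100, size=len(y))
regions = np.digitize(easting, bins=[20, 40, 60, 80])

# Each fold holds out one whole strip, so no test sample shares a region
# (and hence a close spatial neighbor, under this binning) with training data.
for train_idx, test_idx in GroupKFold(n_splits=5).split(X, y, groups=regions):
    model = RandomForestClassifier(random_state=0).fit(X[train_idx], y[train_idx])
    score = f1_score(y[test_idx], model.predict(X[test_idx]), average="micro")
    print(f"held-out strip {regions[test_idx][0]}: F1 micro {score:.3f}")
```

As the authors note, with few, spatially clustered units some held-out strips may lack entire classes, which is why this scheme was judged unworkable for their study area.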

has led to remapping of some units. Were the goal to obtain good prediction performance, we could have grouped geologic units based on lithological rock types, hence reducing the misclassification of Casino Suite, Minto Suite, and Whitehorse Suite. However, we chose to predict down to the geologic unit because only specific geologic units are associated with mineralization (Casino Suite), and the goal was to determine if our models would be able to distinguish these from geochemically similar rock.

6. Concluding remarks

Through this study we analyzed the effectiveness of multiple machine learning models on predicting underlying geology using soil properties. We conclude that machine learning can be an effective tool

and our best model, D-LRMCS6, had an F1 micro score of 0.956 ± 0.006. For this type of problem, evidence suggests that SMOTE is the most effective sampling tool for imbalanced classes; however, ADASYN and SMOTETomek can also be effective. Additionally, we found that the gradient boosting classifier and AdaBoost classifier were the best performing individual algorithms. We also conclude that MCS are effective tools, and that an MCS using the combinatorial function D-LR and combining nearest neighbors, radial basis function support-vector machine, artificial neural networks, random forests, AdaBoost classifier, and gradient boosting classifier is the best technique overall. This information could be useful for better understanding underlying geology when soil samples and an adequate training set are available.

It would be useful to see a comparison of the same techniques in a different setting. This would help us understand whether SMOTE, the AdaBoost classifier, the gradient boosting classifier, and D-LRMCS6 are good techniques to use across different settings, or if their success is dependent on the qualities of this specific area of study. Additionally, more combinatorial functions can be tested, and since different base algorithms performed better with different sampling methods, it could be interesting to test the performance of a combinatorial function that uses various sampling methods based on which algorithm is used.

Table 7
Confusion matrix for testing set predictions of D-LRMCS6 seed 15.

                               Predicted Geologic Unit
                               DMF  LKC  LTrEJM  mKN  mKW
Actual Geologic Unit  DMF       37    0       0    0    5
                      LKC        0   41       4    2   14
                      LTrEJM     0    0      92    3    3
                      mKN        0    1       0  145    7
                      mKW        0    2       1    8  621

Table 8
Confusion matrix for external testing set predictions of D-LRMCS6 seed 15.

                               Predicted Geologic Unit
                               LKC  LTrEJM  mKN  mKW
Actual Geologic Unit  LKC        8      35    1   21
                      LTrEJM     0      23    0    7
                      mKN        0      56   95   15
                      mKW        0       5    7   25

Authorship statement

Timothy Lui: Conceptualization, Methodology, Software, Formal Analysis, Investigation, Writing – Original Draft. Daniel Gregory: Conceptualization, Writing – Review & Editing, Supervision, Funding Acquisition. Marek Anderson: Conceptualization, Data Curation. Well-Shen Lee: Writing – Original Draft (Section 2, Study Area), Investigation. Sharon Cowling: Conceptualization, Resources, Writing – Review & Editing, Supervision.

Computer code availability

Source codes related to this article can be found at https://github.com/timmehlui/Machine-Learning-Geological-Mapping-Soil-Geochemistry, an open-source online data repository. Code is written in Python 3.7.3 and the total program size is 154 KB. Software used includes the Spyder IDE and the scikit-learn, imblearn, numpy, pandas, pyrolite, and matplotlib packages. Code developed by Timothy C. C. Lui, Department of Earth Sciences, University of Toronto, Toronto, ON, Canada M5S 3B1.

Funding

This work was supported by the NSERC Discovery Grant to Daniel Gregory and the NSERC CGS-M to Timothy Chee Cheng Lui.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

We would like to thank Rockhaven Resources, Matt Turner, and Andrew Carne for providing the geochemical data, and Natural Resources Canada for the Digital Elevation Model. Additionally, we thank NSERC for the NSERC Discovery Grant to Daniel Gregory and the NSERC CGS-M to Timothy Chee Cheng Lui.

References

Aitchison, J., 1982. The statistical analysis of compositional data. J. Roy. Stat. Soc. B 44 (2), 139–160. https://doi.org/10.1111/j.2517-6161.1982.tb01195.x.
Allan, M.M., Mortensen, J.K., Hart, C.J.R., Bailey, L.A., Sánchez, M.G., Ciolkiewicz, W., McKenzie, G.G., Creaser, R.A., 2013. Magmatic and metallogenic framework of west-central Yukon and eastern Alaska. In: Colpron, M., Bissig, T., Rusk, B.G., Thompson, J.F.H. (Eds.), Tectonics, Metallogeny, and Discovery: the North American Cordillera and Similar Accretionary Settings. Society of Economic Geologists, pp. 111–168.
Anderson, M., 2019. Applying Machine Learning to Geochemical Mineral Exploration. Unpublished M.Sc. Thesis, Technical University of Applied Sciences Lübeck, Lübeck, Germany, p. 89.
Balaram, V., Subramanyam, K.S.V., 2022. Sample preparation for geochemical analysis: strategies and significance. Advances in Sample Preparation 1. https://doi.org/10.1016/j.sampre.2022.100010.
Batista, G.E.A.P.A., Prati, R.C., Monard, M.C., 2004. A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explorations Newsletter 6, 20–29. https://doi.org/10.1145/1007730.1007735.
Bradshaw, P.M.D., Thomson, I., 1979. The application of soil sampling to geochemical exploration in nonglaciated regions of the world. In: Hood, P.J. (Ed.), Geophysics and Geochemistry in the Search for Metallic Ores, vol. 31. Geological Survey of Canada, Economic Geology Report, pp. 327–338.
Breiman, L., 2001. Random forests. Mach. Learn. 45, 5–32. https://doi.org/10.1023/A:1010933404324.
Carranza, E.J.M., Laborte, A.G., 2016. Data-driven predictive modeling of mineral prospectivity using random forests: a case study in Catanduanes Island (Philippines). Nat. Resour. Res. 25, 35–50. https://doi.org/10.1007/s11053-015-9268-x.
Colpron, M., Nelson, J.L., Murphy, D.C., 2006. A tectonostratigraphic framework for the pericratonic terranes of the northern Cordillera. In: Colpron, M., Nelson, J.L. (Eds.), Paleozoic Evolution and Metallogeny of Pericratonic Terranes at the Ancient Pacific Margin of North America, Canadian and Alaskan Cordillera. Geological Association of Canada, Special Paper 45, pp. 1–23.
Cover, T., Hart, P., 1967. Nearest neighbor pattern classification. IEEE Trans. Inf. Theor. 13, 21–27. https://doi.org/10.1109/TIT.1967.1053964.
Cracknell, M.J., Reading, A.M., 2014. Geological mapping using remote sensing data: a comparison of five machine learning algorithms, their response to variations in the spatial distribution of training data and the use of explicit spatial information. Comput. Geosci. 63, 22–33. https://doi.org/10.1016/j.cageo.2013.10.008.
Cracknell, M.J., Reading, A.M., McNeill, A.W., 2014. Mapping geology and volcanic-hosted massive sulfide alteration in the Hellyer–Mt Charter region, Tasmania, using Random Forests™ and self-organising maps. Aust. J. Earth Sci. 61 (2), 287–304. https://doi.org/10.1080/08120099.2014.858081.
Friedman, J.H., 2001. Greedy function approximation: a gradient boosting machine. Ann. Stat. 29 (5), 1189–1232. https://doi.org/10.1214/aos/1013203451.
Grunsky, E.C., de Caritat, P., 2019. State-of-the-art analysis of geochemical data for mineral exploration. Geochemistry: Exploration, Environment, Analysis, Special Issue from Exploration 17, October 2017, Toronto, Canada. https://doi.org/10.1144/geochem2019-031.
Grunsky, E.C., de Caritat, P., Mueller, U.A., 2017. Using surface regolith geochemistry to map the major crustal blocks of the Australian continent. Gondwana Res. 46, 227–239. https://doi.org/10.1016/j.gr.2017.02.011.
Harris, J.R., Grunsky, E.C., 2015. Predictive lithological mapping of Canada's North using Random Forest classification applied to geophysical and geochemical data. Comput. Geosci. 80, 9–25. https://doi.org/10.1016/j.cageo.2015.03.013.
Hastie, T., Tibshirani, R., Friedman, J., 2001. The Elements of Statistical Learning, second ed. Springer Series in Statistics, New York, NY, 745 pp.
He, H., Bai, Y., Garcia, E.A., Li, S., 2008. ADASYN: adaptive synthetic sampling approach for imbalanced learning. In: 2008 IEEE International Joint Conference on Neural Networks, pp. 1322–1328. https://doi.org/10.1109/IJCNN.2008.4633969.
Klöcking, M., Mills, L., Mortensen, J., Roots, C., 2016. Geology of mid-Cretaceous volcanic rocks at Mount Nansen, central Yukon, and their relationship to the Dawson Range batholith. Yukon Geological Survey, Open File 2016–25, 37 pp.
Kuhn, S., Cracknell, M.J., Reading, A.M., 2018. Lithological mapping using Random Forests applied to geophysical and remote-sensing data: a demonstration study from the Eastern Goldfields of Australia. Geophysics 83 (4), B183–B193. https://doi.org/10.1190/geo2017-0590.1.
Kuhn, S., Cracknell, M.J., Reading, A.M., 2019. Lithological mapping in the Central African Copper Belt using random forests and clustering: strategies for optimised results. Ore Geol. Rev. 112, 103015. https://doi.org/10.1016/j.oregeorev.2019.103015.
Kuhn, S., Cracknell, M.J., Reading, A.M., Sykora, S., 2020. Identification of intrusive lithologies in volcanic terrains in British Columbia by machine learning using random forests: the value of using a soft classifier. Geophysics 85 (6), B235–B244. https://doi.org/10.1190/geo2019-0461.1.
Lemaitre, G., Nogueira, F., Aridas, C.K., 2017. Imbalanced-learn: a Python toolbox to tackle the curse of imbalanced datasets in machine learning. J. Mach. Learn. Res. 18 (17), 1–5. http://jmlr.org/papers/v18/16-365.html.
Masoumi, F., Eslamkish, T., Abkar, A.A., Honarmand, M., Harris, J.R., 2017. Integration of spectral, thermal, and textural features of ASTER data using Random Forests classification for lithological mapping. J. Afr. Earth Sci. 129, 445–457. https://doi.org/10.1016/j.jafrearsci.2017.01.028.
Nelson, J.L., Colpron, M., Piercey, S.J., Dusel-Bacon, C., Murphy, D.C., Roots, C.F., 2006. Paleozoic tectonic and metallogenic evolution of the pericratonic terranes in Yukon, northern British Columbia and eastern Alaska. In: Colpron, M., Nelson, J.L. (Eds.), Paleozoic Evolution and Metallogeny of Pericratonic Terranes at the Ancient Pacific Margin of North America, Canadian and Alaskan Cordillera. Geological Association of Canada, Special Paper 45, pp. 323–360.
Nelson, J.L., Colpron, M., Israel, S., 2013. The Cordillera of British Columbia, Yukon, and Alaska: tectonics and metallogeny. In: Colpron, M., Bissig, T., Rusk, B.G., Thompson, J.F.H. (Eds.), Tectonics, Metallogeny, and Discovery: the North American Cordillera and Similar Accretionary Settings. Society of Economic Geologists, pp. 53–109.
Opitz, J., Burst, S., 2019. Macro F1 and Macro F1. https://arxiv.org/abs/1911.03347.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E., 2011. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830.
Piercey, S.J., Colpron, M., 2009. Composition and provenance of the Snowcap assemblage, basement to the Yukon-Tanana terrane, northern Cordillera: implications for Cordilleran crustal growth. Geosphere 5, 439–464. https://doi.org/10.1130/GES00505.S3.
Piercey, S.J., Murphy, D.C., 2000. Stratigraphy and regional implications of unstrained Devono-Mississippian volcanic rocks in the Money Creek thrust sheet, Yukon-Tanana Terrane, southeastern Yukon. In: Emond, D.S., Weston, L.W. (Eds.), Yukon Exploration and Geology 1999. Exploration and Geological Sciences Division, Yukon Region, Indian and Northern Affairs Canada, pp. 67–78.
Piercey, S.J., Nelson, J.L., Colpron, M., Dusel-Bacon, C., Simard, R.-L., Roots, C.F., 2006. Paleozoic magmatism and crustal recycling along the ancient Pacific margin of North America, northern Cordillera. In: Colpron, M., Nelson, J.L. (Eds.), Paleozoic Evolution and Metallogeny of Pericratonic Terranes at the Ancient Pacific Margin of North America, Canadian and Alaskan Cordillera. Geological Association of Canada, Special Paper 45, pp. 281–322.
Ranawana, R., Palade, V., 2006. Multi-classifier systems: review and a roadmap for developers. Int. J. Hybrid Intell. Syst. 3, 35–61. https://doi.org/10.3233/HIS-2006-3104.
Sack, P., Eriks, N., van Loon, S., 2021. Revised geological map of Mount Nansen area (NTS 115I/3 and part of 115I/2). Yukon Geological Survey, Open File 2021-2, 2 maps, scale 1:50,000 and 1:20,000.
Shalev-Shwartz, S., Ben-David, S., 2014. Understanding Machine Learning: from Theory to Algorithms. Cambridge University Press, New York, NY, 449 pp.
Vapnik, V.N., 1998. Statistical Learning Theory. John Wiley & Sons, Inc., New York, 732 pp.
Weil, R.R., Brady, N.C., 2017. The Nature and Properties of Soils, fifteenth ed. Pearson Education Limited, Harlow, Essex, England, pp. 51–100.
Zhu, J., Zou, H., Rosset, S., Hastie, T., 2009. Multi-class AdaBoost. Statistics and Its Interface 2 (3), 349–360. https://doi.org/10.4310/SII.2009.v2.n3.a8.