Supervised Learning - A Systematic Literature Review
A PREPRINT
Salim Dridi
[email protected] - salimdridi.info
ABSTRACT
Machine Learning (ML) is a rapidly emerging field that enables a plethora of innovative approaches to solving real-world
problems. It enables machines to learn without human intervention from data and is used in a variety of applications,
from fraud detection to recommendation systems and medical imaging. Supervised learning, unsupervised learning, and
reinforcement learning are the three main categories of ML. Supervised learning involves training the model on a labeled
dataset and comprises two distinct types of learning: classification and regression. Regression is used when the output is
continuous. By contrast, classification is used when the output is categorical.
Supervised learning aims to build a model of the class labels in terms of the predictor features. The resulting
classifier is then used to assign class labels to the test data, where the values of the predictor features are known but
the value of the class label is unknown. In classification, the label identifies the class to which a training example belongs.
In regression, by contrast, the label is a real-valued response that corresponds to the example.
Numerous supervised learning approaches and algorithms have been proposed: XGBoost, Naïve Bayes, Support
Vector Machine, Decision Tree, Random Forest, Logistic Regression, and K-Nearest Neighbor to name a few. This survey
paper examines supervised learning by offering a thorough assessment of approaches and algorithms, performance metrics,
and the merits and demerits of numerous studies. This paper will point researchers in new directions and enable them to
compare the efficacy and effectiveness of supervised learning algorithms.
Keywords Supervised Learning · Literature Review · Survey · ML · Supervised Learning approaches · Classification and Regression
Table of Contents
1 Introduction
1.1 Supervised Learning
1.1.1 Classification
1.1.2 Regression
2 Literature Review
5 Conclusion
List of Figures
1 Basic Architecture of ML
2 Major Categories of ML
3 Basic Architecture of Supervised Learning
4 Types of Supervised Learning
5 Workflow of Classification in Supervised Learning
6 Types of Regression
7 Methodology Used for this Systematic Literature Review on Supervised ML
8 Supervised Learning Approaches
9 Widely Used Supervised Learning Algorithms
10 Taxonomy of Papers Based on Research Areas
List of Tables
1 Search Strings
2 Search Results
3 Number of Papers Selected After Applying Inclusion and Exclusion Criteria
4 Widely Used Performance Metrics
5 Approaches, Merits, and Demerits of Supervised Learning Studies
List of Algorithms
1 Pseudo-Code of Naïve Bayes
2 Pseudo-Code of SVM
3 Pseudo-Code of Decision Tree Algorithm
4 Pseudo-Code of K-Nearest Neighbor Algorithm
1 Introduction
ML, due to the concepts it inherits, can be regarded as a subfield of Artificial Intelligence (AI). It enables prediction;
for this purpose, its basic building blocks are algorithms. ML enables systems to learn on their own rather than
being explicitly programmed to do so, resulting in more intelligent behavior. It generates data-driven predictions by
developing models that discover patterns in historical data and utilize those patterns to generate predictions [1] [2].
The general architecture of ML consists of several steps: business understanding (understanding and knowledge of
the domain), data acquisition and understanding (gathering and understanding data), modeling (which entails feature
engineering, model training, and evaluation), and deployment (for example, deploying the model to the cloud).
Supervised Learning, Unsupervised Learning, and Reinforcement Learning are the three major categories of ML. In
Supervised Learning, the model is trained on labeled data and is then used to generate predictions on unlabeled data [3].
In unsupervised learning, a model is trained on unlabeled data and said model automatically learns from that data by
extracting features and patterns from it. In Reinforcement Learning, an agent is trained on the environment and this
enables said agent to find the optimum solution and accomplish a goal in a complex situation [4].
ML algorithms treat each instance of a dataset as a collection of features. These features may be binary, categorical,
or continuous in nature. If the instances are labeled, then this type of learning is termed supervised learning [2].
Supervised Learning involves training the model on labeled data and testing it on unlabeled data. Its fundamental
architecture begins with dataset collection; the dataset is then partitioned into testing and training data; and then, the
data is preprocessed. Extracted features are fed into an algorithm and the model is then trained to learn the features
associated with each label. Finally, the model is supplied with the test data and makes predictions on it
by providing the expected labels, as illustrated in Figure 3.
Classification and regression are the two broad types of supervised learning. Both are discussed in detail in the following
sections.
1.1.1 Classification
In classification, a model predicts unknown values (a set of outputs) based on a set of known values (a set of inputs)
[2]. When the output is categorical, the problem is referred to as a classification problem [5]. Generally, in
classification, a dataset’s instances are categorised according to specified classes [5] [6]. Classification can be applied to
both structured and unstructured datasets. Some terms used in classification are: classification model, classification
algorithm, and feature. A classification algorithm, alternatively referred to as a classifier, learns from the training
dataset and assigns each new data point to a certain class. A classification model, in turn, uses a mapping
function, inferred from the training dataset, to predict the class label for the test data. Finally,
a feature is a measurable property of the instances in the dataset that helps in building a precise predictive model.
The classification process is depicted in Figure 5. Data collection and preprocessing are the first steps in
building a classification model. Preprocessing is the process of cleansing data by eliminating noise and duplicates.
Numerous techniques are used to preprocess data, among which "brute-force" is the simplest and most common one.
The data is then split into training and test sets, for example using cross-validation. The next stage is to train the model
using class labels; in Python, scikit-learn estimators expose a method called "fit(X, y)" that learns the mapping from X (input
data) to y (labels) for the purpose of preparing the classifier. The next step is to predict the class or label of new data.
Finally, the classifier is evaluated using the test data.
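The workflow above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the pipeline of any surveyed study: the toy data, the helper names (deduplicate, train_test_split, NearestCentroid), and the choice of a nearest-centroid classifier are all hypothetical and were picked only to keep the example short.

```python
import random
from collections import defaultdict

def deduplicate(rows):
    """Preprocessing step: drop exact duplicate (features, label) rows, keeping order."""
    seen, unique = set(), []
    for features, label in rows:
        key = (tuple(features), label)
        if key not in seen:
            seen.add(key)
            unique.append((features, label))
    return unique

def train_test_split(rows, test_ratio=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction of them as test data."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_ratio))
    return rows[n_test:], rows[:n_test]

class NearestCentroid:
    """Toy classifier: predicts the class whose mean feature vector is closest."""

    def fit(self, X, y):
        grouped = defaultdict(list)
        for features, label in zip(X, y):
            grouped[label].append(features)
        # One centroid (column-wise mean) per class
        self.centroids = {
            label: [sum(col) / len(col) for col in zip(*members)]
            for label, members in grouped.items()
        }
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))
        return [min(self.centroids, key=lambda c: sq_dist(x, self.centroids[c]))
                for x in X]

# Hypothetical, well-separated toy data: (features, label); one row is a duplicate
data = [([1.0, 1.2], "a"), ([0.9, 1.0], "a"), ([1.1, 0.8], "a"), ([1.0, 1.2], "a"),
        ([5.0, 5.1], "b"), ([4.8, 5.3], "b"), ([5.2, 4.9], "b"), ([5.1, 5.0], "b")]

clean = deduplicate(data)                      # preprocess
train, test = train_test_split(clean)          # split
model = NearestCentroid().fit([f for f, _ in train], [l for _, l in train])  # train
predicted = model.predict([f for f, _ in test])                              # predict
accuracy = sum(p == l for p, (_, l) in zip(predicted, test)) / len(test)     # evaluate
```

A single hold-out split is used here for brevity; k-fold cross-validation would repeat the split, train, and evaluate steps over several folds and average the scores.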
There are two distinct classification types: binary and multi-class [5]. Binary classification is used when
the outcome has exactly two classes. For instance, in an ambiguity detection process, the model predicts whether
or not sentences are ambiguous; as a result, there are only two possible outcomes/classes, and this is referred to as
binary classification. Multi-class classification, by contrast, involves more than two classes. For example, in
predicting mental disorders, there are several possible outcomes such as depression, anxiety, schizophrenia, bipolar disorder, and
Post-traumatic Stress Disorder (PTSD). Thus, the outcome can fall into one of these five categories; this is termed
multi-class classification.
1.1.2 Regression
Regression is a supervised learning technique that permits the discovery of correlations between variables and the
prediction of continuous values based on these variables. When the output is continuous, the problem is referred to as a
regression problem [5]; examples include predicting a person’s weight, age, or salary, weather forecasting, and housing price
forecasting. In regression, X (input variables) is mapped to Y (continuous output). Classification is the process of
predicting the discrete labels of the input. Regression, on the other hand, is concerned with the prediction of continuous
values. Regression is divided into two main categories: simple linear and multiple (Figure 6). In simple linear
regression, a straight line is fitted to define the relationship between two variables (X and Y). In contrast, multiple
regression encompasses multiple input variables and is further divided into linear and non-linear.
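As a small illustration of simple linear regression, the sketch below fits a straight line by ordinary least squares. The function name fit_line and the sample points are assumptions made for this example; least squares is one common way to fit the line, and the source does not state which fitting criterion it has in mind.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical points lying exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)

# Predict a continuous value for an unseen input
prediction = slope * 5.0 + intercept
```

Multiple linear regression generalizes this to several input variables by solving for one coefficient per variable.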
The primary goal of this literature review is to discover and assess works on the stated subject. This review will assist
researchers in identifying future research areas by providing an overview of Supervised Learning Approaches and
Algorithms, the metrics used to evaluate the performance of each supervised learning model, and a comparative analysis
of the accuracy of each supervised learning model.
The following section presents the literature review of the topic, Section 3 provides the proposed methodology
for conducting this systematic literature review, Section 4 provides the results of the study and the related
discussion, and Section 5 presents the conclusion.
2 Literature Review
Over the previous decade, numerous researchers have performed surveys on supervised learning. For instance,
the authors of paper [7] wrote a survey on various supervised learning classification approaches. Their study examined
five classification methods: Naïve Bayes, Neural Network, Decision tree, Support vector machine, and K-Nearest
neighbor; and presented a taxonomy of each paper’s benefits and shortcomings. Additionally, they categorised the
papers according to their research topic, classification algorithm, and publication year. Their survey includes articles
in a variety of disciplines, including medicine, agriculture, education, business, and networking. According to their
research, the most frequently used classification algorithms are decision trees and Naïve Bayes. Their survey, however,
was limited to only five classification strategies.
The authors of study [2] classified supervised learning techniques into five categories: logic-based algorithms,
statistical learning algorithms, instance-based learning, support vector machines, and deep learning.
Additionally, they presented the general pseudocode of decision trees, rule learners, Bayesian networks, and
instance-based learners. According to their survey, neural networks and SVM outperform other algorithms when
dealing with continuous data. In comparison, logic-based algorithms perform better when the data is categorical.
Additionally, they stated that Naïve Bayes is capable of doing well with small datasets. On the other hand, SVM and
neural networks require large datasets to reach optimal accuracy. However, their research focused exclusively on
classification algorithms and did not cover regression.
Similarly, [5] provides an overview of supervised classification approaches and classifies them as: logic-based
learning algorithms, support vector machines, statistics-based algorithms, and lazy learning algorithms. This survey
defines, details, and discusses the benefits, drawbacks, and applications of each technique. At the conclusion of the
paper, the authors also conducted a comparative analysis of the accuracy of four widely used algorithms (SVM, Naïve
Bayes, Decision tree, and k-NN) using a dataset from the Census Bureau Database. Several comparison parameters such as
classification speed, learning speed, and noise tolerance were used in their analysis. According to their findings, SVM,
at 84.94%, outperformed k-NN, Naïve Bayes, and Decision Trees in terms of accuracy.
In paper [6], the authors reviewed supervised text classification techniques. Their survey covered three machine
learning approaches (NB, SVM, and k-NN) and the performance evaluation measures associated with each.
Additionally, they mentioned several weighting methods for text classification. In their results, k-NN outperformed the
other ML algorithms. According to this study, the performance of the algorithms is dependent on the dataset, i.e. each
algorithm performed differently on different datasets. Unfortunately, this study was limited to three classification models.
Study [8] compares supervised learning methods empirically. The authors use eight comparison parameters
to contrast the following supervised learning algorithms: SVM, ANN, Logistic regression, Naïve Bayes, k-NN,
Decision tree, Random Forest, Bagged trees, memory-based learning, and Boosted stumps. Those parameters were:
accuracy, precision and recall, F-score, cross entropy, ROC curve, squared error, average precision, and breakeven
point. According to their findings, calibrated boosted trees outperformed all other methods by scoring highly on all comparison
parameters. Random forest came in a close second, followed by SVM. The performance of logistic regression,
Naïve Bayes, and decision trees was, however, poor. It is worth noting that calibrating the models proved surprisingly
effective at improving performance.
While researchers have made significant attempts at conducting surveys in the domain of supervised learning,
none of the aforementioned studies is a systematic literature review (SLR), so the need for one remains.
Additionally, these surveys are restricted to a subset of classification methods, further justifying an enhanced
and organized SLR.
This SLR follows Kitchenham’s guidelines and well-defined steps. The first stage is to define the research
questions that the collected data will be analyzed to answer. The second phase entails defining the search
procedure. This study includes conference proceedings and journal articles dating back to 2011. "IEEE Xplore,"
"the Association for Computing Machinery (ACM)," and "Science Direct Elsevier" were the three databases used to
look for papers. In the third phase, we defined the search strings that were used to retrieve related works from these
databases. Then, the papers were subjected to inclusion and exclusion criteria: we included papers that were relevant to
the topic and were published no earlier than 2011, and excluded all others. The final phase is data extraction, which
involves extracting information that addresses the three research questions from the selected list of articles.
The following are the Research Questions (RQs) formulated for this study.
RQ1: What are the approaches/algorithms that are used for problem-solving in supervised learning?
RQ2: What are the widely used evaluation metrics for measuring the performance of the employed supervised
learning model?
RQ3: What are the merits and demerits of each study in supervised learning?
Once the key terms were identified, we initiated the search process. From these different terms, we formulated the
query strings.
To formulate query strings, we followed the three databases’ (ACM Digital Library, IEEE Xplore, and Science Direct
Elsevier) guidelines in using Boolean operators, synonyms, and different terms. Our initial search strings returned thousands
of papers, so we refined them and used the Advanced Search option to narrow the results.
Table 1 below shows the final search strings that were used.
To retrieve the articles, we ran the queries in August 2021. The result of the query strings is shown in Table 2. Then, for
the citations and bibliographies, we imported these papers into Mendeley Library. After eliminating duplicates, we
ended up with a total of 139 articles.
Based on our stated research questions and SLR topic, inclusion and exclusion criteria were defined as follows.
Inclusion Criteria
1. Those which were published between 1st January 2011 and 1st August 2021.
2. Those which contain a supervised learning approach.
3. Those that discuss classification and regression algorithms in the domain of supervised learning.
Exclusion Criteria
We used the following keywords to conduct our search in the aforementioned databases: "supervised machine learning,"
"supervised learning algorithm," and "supervised learning approach." Then, we set the year filter to exclude works
published prior to 2011. Following that, we assessed the papers that were relevant to our investigation. Several were
eliminated solely on the Title, while others were eliminated after reading the Abstract. Finally, as shown in Table 3, we
ended up with 57 papers for our survey.
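The screening procedure described above (duplicate removal followed by the inclusion criteria) can be sketched as a small filter. The record fields (title, published, uses_supervised_learning) and the sample entries are hypothetical; in the study itself, relevance was judged manually from titles and abstracts rather than from a precomputed flag.

```python
from datetime import date

# Publication window taken from inclusion criterion 1
WINDOW_START = date(2011, 1, 1)
WINDOW_END = date(2021, 8, 1)

def deduplicate(papers):
    """Keep the first record for each title, ignoring case and outer whitespace."""
    seen, unique = set(), []
    for paper in papers:
        key = paper["title"].strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(paper)
    return unique

def meets_inclusion_criteria(paper):
    """Criterion 1 (publication window) and criterion 2 (supervised learning approach)."""
    in_window = WINDOW_START <= paper["published"] <= WINDOW_END
    return in_window and paper["uses_supervised_learning"]

def select(papers):
    return [p for p in deduplicate(papers) if meets_inclusion_criteria(p)]

# Hypothetical candidate records, not papers from the actual search results
candidates = [
    {"title": "SVM for Heart Signals", "published": date(2014, 5, 1),
     "uses_supervised_learning": True},
    {"title": "svm for heart signals ", "published": date(2014, 5, 1),
     "uses_supervised_learning": True},    # duplicate of the first record
    {"title": "An Early Decision Tree Study", "published": date(2009, 3, 1),
     "uses_supervised_learning": True},    # outside the publication window
    {"title": "Clustering Survey", "published": date(2018, 1, 1),
     "uses_supervised_learning": False},   # not a supervised learning paper
]
selected = select(candidates)
```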
Table 3: Number of Papers Selected After Applying Inclusion and Exclusion Criteria
Figure 7 summarizes the strategy used to conduct this literature review on supervised learning. As noted previously,
three databases were used to choose the publications, from which we obtained a total of 139 articles.
We then applied the inclusion and exclusion criteria, after which 57 papers remained, all within our
ten-year time window (1 January 2011 - 1 August 2021).
The next section contains the results of the study.
In this section, the results corresponding to each formulated research question are provided.
RQ1: What are the approaches/algorithms that are used for problem-solving in supervised learning?
Many supervised learning approaches and algorithms have been proposed over the last decade. Our survey
divides them into five categories: logic-based, statistics-based, instance-based, support vector
machines, and deep learning. Figure 8 lays out the different approaches in supervised learning, whereas Figure 9
presents an overview of the algorithms.
Figure 8: Supervised Learning Approaches (labels: Statistic Based Learning, Support Vector Machines, Deep Learning, Instance Based Learning)
Figure 9: Widely Used Supervised Learning Algorithms (labels: Logistic Regression, Support Vector Machine, XGBoost, Random Forest, Decision Tree, K-Nearest Neighbour, AdaBoost, Naive Bayes)
The statistics-based approach simplifies the problem through the use of distributive statistics. The prediction task is
based on the structure of the distribution. The statistical-based approach to learning involves the Naïve Bayes Algorithm.
all of whom are independent. Numerous studies used NB for a variety of classification tasks. In [10], the authors
proposed an opinion-based book recommendation system by employing NB classification to classify and summarize
customer feedback on a book. NB performed well in recommending top-ranked books to customers. The authors of [11]
utilized NB to classify whether email was spam or legitimate. The results showed that NB performed better than other
algorithms, correctly categorizing about 95% of users’ spam emails. In study [12], deep feature weighting NB
was used to classify Chinese text. According to the authors’ findings, deep feature weighting NB outperformed simple NB.
In [13], the authors presented a system for emotion recognition based on audio signals. Aspects of audio signals such as
pitch, ZCR, and energy were classified using NB. The authors of [14] proposed a method for classifying semantic web
services using NB. Finally, in [15], the authors described a patient-centric clinical decision support system based on the
NB classifier that maintained patient anonymity. The concept offered increased diagnostic precision and minimized
diagnostic time.
The pseudocode of the Naïve Bayes algorithm [56] is:
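As a rough illustration (not a reproduction of the listing in [56]), a Naïve Bayes classifier for categorical features can be sketched as follows. The class name CategoricalNaiveBayes, the Laplace smoothing parameter alpha, and the spam-style toy data are assumptions introduced for this example.

```python
import math
from collections import Counter, defaultdict

class CategoricalNaiveBayes:
    """Naive Bayes for categorical features with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, y):
        self.class_counts = Counter(y)
        self.n_features = len(X[0])
        # counts[c][j][v]: how often feature j takes value v among rows of class c
        self.counts = defaultdict(lambda: [Counter() for _ in range(self.n_features)])
        # values[j]: set of distinct values seen for feature j
        self.values = [set() for _ in range(self.n_features)]
        for row, label in zip(X, y):
            for j, value in enumerate(row):
                self.counts[label][j][value] += 1
                self.values[j].add(value)
        return self

    def log_posterior(self, row, label):
        """log P(label) + sum_j log P(x_j | label), up to a shared constant."""
        n_c = self.class_counts[label]
        score = math.log(n_c / sum(self.class_counts.values()))
        for j, value in enumerate(row):
            num = self.counts[label][j][value] + self.alpha
            den = n_c + self.alpha * len(self.values[j])
            score += math.log(num / den)
        return score

    def predict(self, row):
        return max(self.class_counts, key=lambda c: self.log_posterior(row, c))

# Hypothetical toy data: (keyword, contains_link) -> class
X = [("offer", "yes"), ("offer", "no"), ("meeting", "no"),
     ("meeting", "no"), ("offer", "yes")]
y = ["spam", "spam", "ham", "ham", "spam"]

model = CategoricalNaiveBayes().fit(X, y)
```

The independence assumption shows up in log_posterior, where the per-feature likelihoods are simply multiplied (added in log space) for each class.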
Another approach in supervised learning is to use support vector machines (SVMs). Able to handle both discrete and continuous
instances, SVMs are widely used to detect outliers, perform classification, and perform regression. SVMs represent
instances as points in an n-dimensional feature space in which the categories or classes are separated by a defined margin. Using an SVM is an
excellent choice when working with high-dimensional data. Another advantage of SVMs is their memory efficiency.
Numerous researchers have used SVMs; for example, the authors of [16] used this algorithm to predict cardiovascular disease.
Their model determined the patient’s arterial stiffness by measuring the pulse from the fingertip. The necessary features
were retrieved from the reading’s waveform. Following that, an SVM classifier was utilized to predict arterial stiffness
as low or high. In [17], SVM were used to classify and compare the breathing patterns of patients undergoing weaning
trials. The authors of [18] employed this algorithm to classify the heart rate signal. They obtained the reading from
three distinct sets of individuals: the young, teenagers, and elders. Twenty individuals were randomly selected from
each group, and their heart rates were collected and subsequently categorized using an SVM. Finally, [19] discusses the
use of this algorithm in biochemical applications. In this study, an SVM was used to estimate the action potential of the
cell membrane.
The pseudocode of SVM [57] is:
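As a rough illustration (not a reproduction of the listing in [57]), the sketch below trains a linear soft-margin SVM by stochastic subgradient descent on the regularized hinge loss. The function names, the hyperparameters (lam, lr, epochs), and the toy data are assumptions; practical SVM solvers typically use quadratic programming or SMO and support kernels, which this sketch omits.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Minimize lam/2 * ||w||^2 + hinge loss with per-sample subgradient steps.

    Labels in y must be +1 or -1.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            margin = label * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Point is inside the margin or misclassified:
                # step along the hinge-loss subgradient plus the regularizer
                w = [wi - lr * (lam * wi - label * xi) for wi, xi in zip(w, x)]
                b += lr * label
            else:
                # Point is safely classified: only the regularizer shrinks w
                w = [wi - lr * lam * wi for wi in w]
    return w, b

def predict(w, b, x):
    """Return the side of the separating hyperplane that x falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Hypothetical linearly separable toy data
X = [(-2.0, -2.0), (-1.0, -2.0), (-2.0, -1.0), (2.0, 2.0), (1.0, 2.0), (2.0, 1.0)]
y = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(X, y)
train_predictions = [predict(w, b, x) for x in X]
```

The condition margin < 1 is what distinguishes this from a plain perceptron: points that are classified correctly but lie too close to the boundary still trigger an update, which pushes the boundary toward a wider margin.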
Algorithms based on logic solve problems by sequentially or incrementally applying logical functions. A decision tree
is an example of a logic-based learning algorithm. Decision trees are widely used classification and regression models.
A decision tree is a logical structure composed of nodes and branches. Nodes represent features, while branches represent a
value or a condition associated with a node. Classification of the samples is accomplished by sorting them starting
with the root node. Sorting is done based on the feature values. At each stage of sorting and selecting the most
relevant alternatives, the decision tree makes a determination. It is a straightforward strategy that requires little data
preprocessing and is simple to understand. It is, nevertheless, unstable and may result in a complex tree structure.
Numerous studies employed decision trees for classification and regression tasks. The authors of [20] proposed a
model for predicting the risk of heart disease based on a patient’s health details. A decision tree was used as the basis
for the model, and a set of rules was constructed to forecast the risk level. The experimental findings were promising.
In study [21], the authors estimated the soil quality using a decision tree model based on the composition of the soil.
Study [22] presents a decision tree-based model for Alzheimer’s disease prediction. Here, the authors induced the decision tree
from the sample data. At each level of the tree, information gain was used to select
the feature.
The pseudocode of the Decision Tree algorithm is:
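As a rough illustration (not the paper's own listing), the sketch below induces a decision tree by choosing the feature with the highest information gain at each node, in the style of ID3. The function names, the dictionary-based tree representation, and the toy risk data are assumptions; pruning and continuous features are left out.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting on one categorical feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(row[feature] for row in rows):
        subset = [l for row, l in zip(rows, labels) if row[feature] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

def build_tree(rows, labels, features):
    """Return a class label (leaf) or a dict describing an internal node."""
    if len(set(labels)) == 1:
        return labels[0]                              # pure node
    if not features:
        return Counter(labels).most_common(1)[0][0]   # majority vote
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    remaining = [f for f in features if f != best]
    tree = {"feature": best, "children": {}}
    for value in set(row[best] for row in rows):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        tree["children"][value] = build_tree(
            [rows[i] for i in idx], [labels[i] for i in idx], remaining)
    return tree

def classify(tree, row):
    """Walk from the root to a leaf; raises KeyError for unseen feature values."""
    while isinstance(tree, dict):
        tree = tree["children"][row[tree["feature"]]]
    return tree

# Hypothetical toy data: blood pressure fully determines the risk label
rows = [{"smoker": "yes", "bp": "high"}, {"smoker": "no", "bp": "high"},
        {"smoker": "yes", "bp": "normal"}, {"smoker": "no", "bp": "normal"}]
labels = ["high", "high", "low", "low"]

tree = build_tree(rows, labels, ["smoker", "bp"])
```

In this toy data the split on bp removes all uncertainty, so it is chosen as the root and both children are leaves.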
Instance-based or lazy learning postpones generalization until a classification request is made. Because it defers
this work to prediction time, it is referred to as "lazy learning." Its computational time during the training phase is quite low. In
contrast, it is relatively computationally intensive during the classification phase. K-Nearest Neighbor is a widely used
instance-based algorithm for classification and regression problems.
K-Nearest Neighbor (KNN) is a straightforward algorithm. It is utilized when limited information about the data’s
distribution exists. KNN classifies new data based on two things: features of said new data and training samples.
It stores available data and predicts new data labels based on the similarity measures of the nearest neighbor. It is
effective against noisy data and is suitable for training samples taken from real-world situations. However, because
the distances to the training samples are recalculated for each new data point, the computational cost is quite
high. Numerous studies have used KNN to perform classification tasks. It is more typically used for classification
tasks than for regression tasks. Study [23] classified heart disease using KNN. In paper [24], the authors suggested
a model for offline handwritten digit prediction using KNN. The model was trained and validated using the MNIST
digit image dataset. Voting was used to classify the instances. [25] proposed a new KNN model to address the issues
associated with evidential KNN (EKNN), a KNN extension. Study [26] was conducted by bioscience researchers. Here,
the authors employed KNN to classify blasts in acute leukemia blood samples. The blasts were classified as either
acute lymphocytic leukemia or acute myelogenous leukemia. With an 86% classification accuracy, the results were
encouraging. Finally, study [27] proposed a KNN-based approach for the placement of an undergraduate student in an
IT business. The classification was binary in nature, as it made use of only two classes (Yes and No).
The pseudocode of the K-Nearest Neighbor algorithm is:
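As a rough illustration (not the paper's own listing), K-Nearest Neighbor classification with Euclidean distance and majority voting can be sketched as follows. The function name knn_predict, the default k=3, and the toy points are assumptions made for this example.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # No training phase: all distance computations happen at prediction time
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters
train_X = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (6.0, 6.0), (6.0, 7.0), (7.0, 6.0)]
train_y = ["A", "A", "A", "B", "B", "B"]

label_near_a = knn_predict(train_X, train_y, (1.5, 1.5))
label_near_b = knn_predict(train_X, train_y, (6.5, 6.5))
```

Every call scans the whole training set, which is the source of the high prediction-time cost noted above.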
Using Deep learning for classification and regression tasks is another approach in supervised learning. In deep learning,
the model comprises many layers, and the model is trained in a layer-by-layer method. Deep learning models have a wide
range of applications, from voice recognition to computer vision and natural language processing. By combining several
preprocessing techniques and space search optimization, the authors of research [28] suggested a novel framework
termed Polynomial Neural Network Classifier (PNNC). In [29], the authors suggested a neural network model for stock
price prediction. They named the model the Floating Centroids Method (FCM), and it reached a high degree of accuracy
and optimal operation. In [30], the researchers proposed a neural network-based algorithm for predicting gum disease.
Here, a combination of risk factors and symptoms were fed into the model as inputs. Afterward, the hidden layer
retrieved features from the given sample and automatically lowered the data’s dimensionality. In the end, the output was
a binary classification, with 1 indicating periodontal disease and 0 indicating gingivitis.
Finally, the authors of [31] employed a multilayered perceptron to forecast heart disease. 13 clinical examples were
used to train the model and a prediction was made regarding the existence or absence of cardiac disease. The model
worked admirably, with a 98% accuracy rate.
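To make the layered structure concrete, the sketch below runs the forward pass of a tiny multilayer perceptron with one hidden layer and a single sigmoid output thresholded into a binary class. It does not reproduce any of the surveyed models: the weights are hand-picked (hypothetical) so that the network computes XOR, and training by backpropagation is omitted.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    """One fully connected layer followed by a sigmoid activation."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

# Hand-picked weights (an assumption, not learned):
# hidden unit 1 approximates OR, hidden unit 2 approximates NAND,
# and the output unit approximates AND of the two, which yields XOR
HIDDEN_W = [[20.0, 20.0], [-20.0, -20.0]]
HIDDEN_B = [-10.0, 30.0]
OUTPUT_W = [[20.0, 20.0]]
OUTPUT_B = [-30.0]

def predict(x):
    """Forward pass: input -> hidden layer -> output probability -> class 0 or 1."""
    hidden = dense(x, HIDDEN_W, HIDDEN_B)
    (probability,) = dense(hidden, OUTPUT_W, OUTPUT_B)
    return 1 if probability >= 0.5 else 0

outputs = [predict(x) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

XOR is not linearly separable, so a model without the hidden layer could not produce these outputs; the hidden units act as intermediate features extracted from the raw inputs.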
RQ2: What are the widely used evaluation metrics for measuring the performance of the employed supervised learning
model?
Researchers use several evaluation metrics to assess the performance of classification
and regression models. Table 4 presents a list of widely used evaluation metrics.
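A few of the metrics named earlier in this review (accuracy, precision, recall, F-score, and squared error) can be computed as in the sketch below. The function names and sample values are assumptions for illustration, and the selection is not meant to mirror the exact contents of Table 4.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred, positive=1):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or never present
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

def mean_squared_error(y_true, y_pred):
    """Average squared difference, used for regression models."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical labels and predictions
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
scores = classification_metrics(y_true, y_pred)

mse = mean_squared_error([2.0, 4.0, 6.0], [2.5, 4.0, 5.0])
```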
RQ3: What are the merits and demerits of each study in supervised learning?
Table 5 presents several studies, the approach each employed, and the merits and demerits associated with each
study.
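Figure 10: Taxonomy of Papers Based on Research Areas (pie chart with the categories Medical, Education, Agriculture, and Networking; slice values 60%, 20%, 10.9%, and 9.1%)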
5 Conclusion
Supervised learning is one of the main categories of ML. It involves training the model on labeled data and testing it
on unlabeled data. Additionally, it is divided into classification and regression tasks. Numerous supervised learning
algorithms have been proposed throughout the previous decade. Supervised learning is used in a wide variety of
applications, from fraud detection to information retrieval, from heart disease diagnosis to cancer detection. This
SLR followed Kitchenham’s proposed sequence of well-defined phases and is a review of the literature covering
supervised learning methodologies and algorithms. It also showcases many performance indicators for supervised
learning algorithms. Additionally, it discusses the advantages and disadvantages of many studies. This survey report
will assist researchers in determining which supervised learning approach or algorithm to utilize for tackling problems
and which area of research requires additional focus.
This survey is limited to widely used supervised learning algorithms and focuses exclusively on research articles
published in the last decade and drawn from three databases. In the future, we hope to incorporate other databases,
algorithms, and methodologies to improve guidance.
References
[1] James Cussens, "Machine Learning," IEEE Journal of Computing and Control, Vol.7, No.4, pp.164-168, 1996.
[2] Muhammad, I., & Yan, Z., "Supervised Machine Learning Approaches A Survey," ICTACT Journal on Soft
Computing, Vol.5, No.3, 2015.
[3] S. B. Kotsiantis, "Supervised Machine Learning: A Review of Classification Techniques," Informatica, Vol.31,
No.3, pp.249-268, 2007.
[4] Richard S. Sutton and Andrew G. Barto, "Reinforcement Learning: An Introduction," Cambridge, MA: MIT Press,
1998.
[5] Sen, P. C., Hajra, M., & Ghosh, M., "Supervised classification algorithms in Machine Learning: A survey and
review," Emerging technology in modelling and graphics, Springer, pp.99-111, 2020.
[6] Kadhim, A. I., "Survey on supervised Machine Learning techniques for automatic text classification," Artificial
Intelligence Review, Vol.52, No.1, pp.273-292, 2019.
[7] Narayanan, U., Unnikrishnan, A., Paul, V., & Joseph, S., "A survey on various supervised classification algorithms,"
2017 International Conference on Energy Communication, Data Analytics and Soft Computing (ICECDS), IEEE,
pp.2118-2124, August 2017.
[8] Caruana, R., & Niculescu-Mizil, A., "An empirical comparison of supervised learning algorithms," Proceedings of
the 23rd international conference on Machine Learning, pp.161-168, June 2006.
[9] B. Kitchenham, O. P. Brereton, D. Budgen, M. Turner, J. Bailey, and S. Linkman, "Systematic literature reviews in
software engineering – A systematic literature review," Inf. Softw. Technol., Vol.51, No.1, pp.7-15, 2009.
[10] Tewari, A. S., Ansari, T. S., & Barman, A. G., "Opinion based book recommendation using naive bayes classifier,"
2014 International Conference on Contemporary Computing and Informatics (IC3I), IEEE, pp.139-144, November
2014.
[11] Solanki, R. K., Verma, K., & Kumar, R., "Spam filtering using hybrid local-global Naive Bayes classifier,"
2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI), IEEE,
pp.829-833, August 2015.
[12] Jiang, Q., Wang, W., Han, X., Zhang, S., Wang, X., & Wang, C., "Deep feature weighting in Naive Bayes for
Chinese text classification," 2016 4th International Conference on Cloud Computing and Intelligence Systems
(CCIS), IEEE, pp.160-164, August 2016.
[13] Bhakre, S. K., & Bang, A., "Emotion recognition on the basis of audio signal using Naive Bayes classifier,"
2016 International conference on advances in computing, communications and informatics (ICACCI), IEEE,
pp.2363-2367, September 2016.
[14] Liu, J., Tian, Z., Liu, P., Jiang, J., & Li, Z., "An approach of semantic web service classification based on Naive
Bayes," 2016 International Conference on Services Computing (SCC), IEEE, pp.356-362, June 2016.
[15] Liu, X., Lu, R., Ma, J., Chen, L., & Qin, B., "Privacy-preserving patient-centric clinical decision support system
on naive Bayesian classification," IEEE Journal of Biomedical and Health Informatics, Vol.20, No.2, pp.655-668,
2015.
[16] Alty, S. R., Millasseau, S. C., Chowienczyk, P. J., & Jakobsson, A., "Cardiovascular disease prediction using
support vector machines," 46th Midwest Symposium on Circuits and Systems, IEEE, Vol.1, pp.376-379, December
2003.
[17] Giraldo, B. F., Garde, A., Arizmendi, C., Jané, R., Diaz, I., & Benito, S., "Support vector machine classification
applied on weaning trials patients," Encyclopedia of Healthcare Information Systems, IGI Global, pp.1277-1282,
2008.
[18] Kampouraki, A., Nikou, C., & Manis, G., "Robustness of support vector machine-based classification of heart rate
signals," 2006 International Conference of the Engineering in Medicine and Biology Society, IEEE, pp.2159-2162,
August 2006.
[19] Seijas, C., Caralli, A., & Villazana, S., "Estimation of action potential of the cellular membrane using support
vectors machines," 2006 International Conference of the Engineering in Medicine and Biology Society, IEEE,
pp.4200-4204, August 2006.
[20] Saxena, K., & Sharma, R., "Efficient heart disease prediction system using decision tree," International Conference
on Computing, Communication & Automation, IEEE, pp.72-77, May 2015.
[21] Dongming, L., Yan, L., Chao, Y., Chaoran, L., Huan, L., & Lijuan, Z., "The application of decision tree
C4.5 algorithm to soil quality grade forecasting model," 2016 First International Conference on Computer
Communication and the Internet (ICCCI), IEEE, pp.552-555, October 2016.
[22] Dana, A. D., & Alashqur, A., "Using decision tree classification to assist in the prediction of Alzheimer’s disease,"
6th International Conference on Computer Science and Information Technology (CSIT), IEEE, pp.122-126, March
2014.
[23] Udovychenko, Y., Popov, A., & Chaikovsky, I., "Ischemic heart disease recognition by k-NN classification of
current density distribution maps," 35th International Conference on Electronics and Nanotechnology (ELNANO),
IEEE, pp.402-405, April 2015.
[24] Babu, U. R., Venkateswarlu, Y., & Chintha, A. K., "Handwritten digit recognition using K-nearest neighbour
classifier," World Congress on Computing and Communication Technologies, IEEE, pp.60-65, February 2014.
[25] Jiao, L., Pan, Q., Feng, X., & Yang, F., "An evidential k-nearest neighbor classification method with weighted
attributes," Proceedings of the 16th International Conference on Information Fusion, IEEE, pp.145-150, July 2013.
[26] Supardi, N. Z., Mashor, M. Y., Harun, N. H., Bakri, F. A., & Hassan, R., "Classification of blasts in acute leukemia
blood samples using k-nearest neighbour," 8th International Colloquium on Signal Processing and its Applications,
IEEE, pp.461-465, March 2012.
[27] Giri, A., Bhagavath, M. V. V., Pruthvi, B., & Dubey, N., "A placement prediction system using k-nearest neighbors
classifier," Second International Conference on Cognitive Computing and Information Processing (CCIP), IEEE,
pp.1-4, August 2016.
[28] Huang, W., Oh, S. K., & Pedrycz, W., "Polynomial neural network classifiers based on data preprocessing and
space search optimization," Joint 8th International Conference on Soft Computing and Intelligent Systems (SCIS)
and 17th International Symposium on Advanced Intelligent Systems (ISIS), IEEE, pp.769-773, August 2016.
[29] Liu, S., Yang, B., Wang, L., Zhao, X., Zhou, J., & Guo, J., "Prediction of share price trend using FCM neural
network classifier," 3rd International Conference on Informative and Cybernetics for Computational Social Systems
(ICCSS), IEEE, pp.81-86, August 2016.
[30] Thakur, A., Guleria, P., & Bansal, N., "Symptom & risk factor based diagnosis of Gum diseases using neural
network," 6th International Conference-Cloud System and Big Data Engineering (Confluence), IEEE, pp.101-104,
January 2016.
[31] Sonawane, J. S., & Patil, D. R., "Prediction of heart disease using multilayer perceptron neural network,"
International conference on information communication and embedded systems (ICICES), IEEE, pp.1-6, February
2014.
[32] Torlay, L., Perrone-Bertolotti, M., Thomas, E., & Baciu, M., "Machine Learning–XGBoost analysis of language
networks to classify patients with epilepsy," Brain informatics, Vol.4, No.3, pp.159-169, 2017.
[33] Wang, H., Zheng, B., Yoon, S. W., & Ko, H. S., "A support vector machine-based ensemble algorithm for breast
cancer diagnosis," European Journal of Operational Research, Vol.267, No.2, pp.687-699, 2018.
[34] Bhattacharya, S., Maddikunta, P. K. R., Kaluri, R., Singh, S., Gadekallu, T. R., Alazab, M., & Tariq, U., "A novel
PCA-firefly based XGBoost classification model for intrusion detection in networks using GPU," Electronics, Vol.9,
No.2, pp.219, 2020.
[35] Huang, S., Cai, N., Pacheco, P. P., Narrandes, S., Wang, Y., & Xu, W., "Applications of support vector machine
(SVM) learning in cancer genomics," Cancer Genomics-Proteomics, Vol.15, No.1, pp.41-51, 2018.
[36] Kavin, S., Mohan, S. K., Karthick, V. I., & Sudar, K. M., "Performance comparison of support vector machine,
random forest, and extreme learning machine for intrusion detection," Special Section on Survivability Strategies
for Emerging Wireless Networks, IEEE, 2018.
[37] DongSheng Liu, Zhongrui Fan, Q. Fu, Mo Li, M. A. Faiz, Shoaib Ali, Tianxiao Li, L. Zhang, Muhammad
Imran Khan, "Random forest regression evaluation model of regional flood disaster resilience based on the whale
optimization algorithm," Journal of Cleaner Production, Vol.250, pp.119468, 2020.
[38] Safwan Mohammed, Ali Al-Ebraheem, Imre J Holb, Karam Alsafadi, Mohammad Dikkeh, Quoc Bao Pham,
Nguyen Thi Thuy Linh, Szilard Szabo, "Soil management effects on soil water erosion and runoff in central
Syria—A comparative evaluation of general linear model and random forest regression," Water, Vol.12, No.9,
pp.2529, September 2020.
[39] Liu, K., Hu, X., Zhou, H., Tong, L., Widanalage, D., & Marco, J., "Feature analyses and modelling of lithium-ion
batteries manufacturing based on random forest classification," IEEE/ASME Transactions on Mechatronics, 2021.
[40] Jackins, V., Vimal, S., Kaliappan, M., & Lee, M. Y., "AI-based smart prediction of clinical disease using random
forest classifier and Naive Bayes," The Journal of Supercomputing, Vol.77, No.5, pp.5198-5219, 2021.
[41] Abdulkareem, N. M., Abdulazeez, A. M., Zeebaree, D. Q., & Hasan, D. A., "COVID-19 World Vaccination
Progress Using Machine Learning Classification Algorithms," Qubahan Academic Journal, Vol.1, No.2, pp.100-105,
2021.
[42] Abbas, A., Abdelsamea, M. M., & Gaber, M. M., "Classification of COVID-19 in chest X-ray images using
DeTraC deep convolutional neural network," Applied Intelligence, Vol.51, No.2, pp.854-864, 2021.
[43] Abd El Kader, I., Xu, G., Shuai, Z., Saminu, S., Javaid, I., & Salim Ahmad, I., "Differential deep convolutional
neural network model for brain tumor classification," Brain Sciences, Vol.11, No.3, pp.352, 2021.
[44] Wang, Z., Huang, S., Wang, J., Sulaj, D., Hao, W., & Kuang, A., "Risk factors affecting crash injury severity for
different groups of e-bike riders: A classification tree-based logistic regression model," Journal of safety research,
Vol.76, pp.176-183, 2021.
[45] Xiao, R., Cui, X., Qiao, H., Zheng, X., & Zhang, Y., "Early diagnosis model of Alzheimer’s Disease based on
sparse logistic regression," Multimedia Tools and Applications, Vol.80, No.3, pp.3969-3980, 2021.
[46] Khurana, G., & Bawa, N. K., "Weed Detection Approach Using Feature Extraction and KNN Classification,"
Advances in Electromechanical Technologies, Springer, pp.671-679, 2021.
[47] Al Dujaili, M. J., Ebrahimi-Moghadam, A., & Fatlawi, A., "Speech emotion recognition based on SVM and
KNN classifications fusion," International Journal of Electrical and Computer Engineering, Vol.11, No.2, pp.1259,
2021.
[48] Sankaranarayanan, S., & Mookherji, S., "SVM-based traffic data classification for secured IoT-based road
signaling system," Research Anthology on AI Applications in Security, IGI Global, pp.1003-103, 2021.
[49] Nayak, R., Jiwani, S. A., & Rajitha, B., "Spam email detection using Machine Learning algorithm," Materials
Today: Proceedings, 2021.
[50] Ghiasi, M. M., Zendehboudi, S., & Mohsenipour, A. A., "Decision tree-based diagnosis of coronary artery disease:
CART model," Computer methods and programs in biomedicine, Vol.192, pp.105400, 2020.
[51] Binh Thai Pham, Manh Duc Nguyen, Trung Nguyen-Thoi, Lanh Si Ho, Mohammadreza Koopialipoor, Nguyen
Kim Quoc, Danial Jahed Armaghani, Hiep Van Le, "A novel approach for classification of soils based on laboratory
tests using Adaboost, Tree and ANN modeling," Transportation Geotechnics, Vol.27, pp.100508, 2021.
[52] Putri, H. S. K. A., Sari, C. A., & Rachmawanto, E. H., "Classification of Skin Diseases Types using Naïve
Bayes Classifier based on Local Binary Pattern Features," International Seminar on Application for Technology of
Information and Communication (iSemantic), IEEE, pp.61-66, September 2020.
[53] Aslam, M. A., Xue, C., Wang, K., Chen, Y., Zhang, A., Cai, W., ... & Cui, D., "SVM based classification and
prediction system for gastric cancer using dominant features of saliva," Nano Biomed Eng, Vol.12, No.1, pp.1-13,
2020.
[54] Zheng, J., Lin, D., Gao, Z., Wang, S., He, M., & Fan, J., "Deep learning assisted efficient AdaBoost algorithm for
breast cancer detection and early diagnosis," IEEE Access, Vol.8, pp.96946-96954, 2020.
[55] Kumar, R. G., & Kumaraswamy, Y. S., "Investigating cardiac arrhythmia in ECG using random forest classification,"
Intl. J. Comput. Appl, Vol.37, No.4, pp.31-34, 2012.
[56] Muhammad Firman Saputra, Triyanna Widiyaningtyas, Aji Wibawa, "Illiteracy Classification Using K Means-
Naïve Bayes Algorithm," International Journal on Informatics Visualization, Vol.2, No.153, 10.30630/joiv.2.3.129,
2018.
[57] Houssein, E.H., Hosney, M.E., Elhoseny, M. et al., "Hybrid Harris hawks optimization with cuckoo search for
drug design and discovery in chemoinformatics," Sci Rep, Vol. 10, pp.14439, 2020.