An Overview of Supervised Machine Learning Paradigms and Their Classifiers
International Journal of Advanced Engineering, Management and Science (IJAEMS)
Peer-Reviewed Journal
ISSN: 2454-1311 | Vol-10, Issue-3; Mar-Apr, 2024
Journal Home Page: https://ijaems.com/
DOI: https://dx.doi.org/10.22161/ijaems.103.4
Received: 30 Jan 2024; Received in revised form: 14 Feb 2024; Accepted: 22 Mar 2024; Available online: 1 Apr 2024
Abstract— Artificial Intelligence (AI) is the theory and development of computer systems capable of performing complex tasks that historically required human intelligence, such as recognizing speech, making decisions and identifying patterns. These tasks cannot be accomplished without the ability of the systems to learn. Machine learning is the ability of machines to learn from their past experiences. Just as with humans, when machines learn under supervision, it is termed supervised learning. In this work, an in-depth treatment of machine learning is expounded. Relevant literature was reviewed with the aim of presenting the different types of supervised machine learning paradigms, their categories and classifiers.
Keywords— Artificial intelligence, Machine learning, Supervised learning paradigms
The information supplied to the machine by the environment is usually imperfect, with the result that the learning element does not know in advance how to fill in missing details or ignore details that are unimportant. The machine therefore operates by guessing, and then receives feedback from the performance element. The feedback mechanism enables the machine to evaluate its hypotheses and revise them if necessary.
Two different kinds of information processing are involved in machine learning: inductive and deductive. In inductive information processing, general patterns and rules are determined from raw data and experience; it is used in similarity-based learning. In deductive processing, general rules are used to determine specific facts; it is used in the proof of a theorem, where deductions are made from known axioms to other existing axioms (Haykin, 1994).
In comparison with traditional programming, ML runs data and output on the computer to generate a program, which can then be used in traditional programming, whereas traditional programming runs data and a program on the computer to produce output (Brownlee, 2020).
Fig. 1: Typical simple model of machine learning. (a) Traditional Programming: data and a program are fed to the computer to produce output; (b) Machine Learning: data and output are fed to the computer to produce a program.
Machine Learning Classifiers
The technique for determining which class a dependent variable belongs to, based on one or more independent variables, is termed Classification. The type of machine learning algorithm that assigns a label to a data input is known as a Classifier.
Supervised Machine Learning Paradigm and their Classifiers
As the name implies, this is when a machine learns under supervision. It is the learning paradigm for acquiring the input-output relationship information of a system based on a given set of paired input-output training samples. The model is provided with a correct answer (output) for every input pattern (Samarasinghe, 2006) and as such is referred to as "learning with a teacher" (Jain, 1996); that is, the available data comprise feature vectors together with the target values. The learner (computer program) is provided with two sets of data, a training set and a test set. The training set has labelled dataset examples (a solution to each problem dataset) which the learner can use to identify unlabeled examples in the test set with the highest possible accuracy, as depicted in Fig. 2. The data is analyzed in order to tune the parameters of the model that were not in the training set so as to predict the target value for the new set of data (test data); a minimal sketch of this train-and-test workflow follows the list below.
The major tasks of supervised learning paradigms are:
i. Classification: Labeled data and classifiers are used to produce predictions about the classification of input data. The function is discrete and of a categorical type.
ii. Regression: The function is continuous. The target variable is numeric.
iii. Forecasting (Probability Estimation): The function is a probability.
iv. The supervised learning paradigm classifiers are Decision Trees, Naïve Bayes, Regression, Logistic Regression, Support Vector Machine (SVM), K-Nearest Neighbor (K-NN), Discriminant Analysis, Ensemble Methods and Neural Networks.
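As that sketch, assuming Python with scikit-learn and its bundled iris dataset (illustrative choices, not prescribed by the text): the labelled training set tunes the model, and the held-out test set measures accuracy on unseen examples.

# Hypothetical illustration of the supervised workflow described above.
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)          # feature vectors + target values
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)  # training set vs. test set

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)                # learn from labelled examples
print(model.score(X_test, y_test))         # accuracy on unseen (test) data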
Decision Trees
This is a statistical classifier used for both classification and regression problems. It incorporates nominal and numerical values and is expressed as a recursive partition of the instance space. A decision tree is a graphical representation of a well-defined decision problem (Fig. 3). It consists of nodes that are concerned with decision making and arcs which connect the nodes (decision rules). The decision tree forms a rooted (directed) tree with basically three types of nodes: the root node, the internal nodes and the terminal nodes. The root node originates the tree and is in turn called the parent node. It has no incoming edges and zero or more outgoing edges. Every other node has one incoming edge and is called a child node. A node with outgoing edges is termed an internal node; it is also referred to as the test node. It represents the features of the dataset. Each internal node has exactly one incoming edge and two or more outgoing edges, and splits the instance space into two or more sub-spaces based on a discrete function of the input attribute values (the attribute test condition) to separate records that have different characteristics. This latter process is called Splitting: the process of dividing a node into two or more nodes, where the decision branches off into variables. For numeric attributes, the range is considered as the partition criterion, and the decision tree can be geometrically interpreted as a collection of hyperplanes, each orthogonal to one of the axes. For classification problems, the entropy, Gini index and information gain (IG) are the splitting metrics used, while for regression the residual sum of squares is applied. All nodes other than the root and internal nodes are termed the leaf/terminal/decision nodes. Each leaf has exactly one incoming edge and no outgoing edges, because it represents the outcome. The leaf node is assigned to the class label describing the most appropriate target value. Instances are classified by navigating from the root down through the arcs to a leaf (Fig. 4). Pruning in a decision tree classifier is the opposite of splitting: it is the process of going through the tree and reducing it to only the most important nodes or outcomes.
Decision Tree Pseudocode (a runnable sketch follows Fig. 3):
1. Start the decision tree with a root node, P, that contains the complete dataset.
2. Using the Attribute Selection Measure (ASM), determine the best attribute in the dataset P to split it.
3. Divide P into subsets containing possible values for the best attribute.
4. Generate a tree node that contains the best attribute.
5. Make new decision trees recursively by using the subsets of the dataset P created in Step 3. Continue the process until a point is reached where the nodes cannot be further classified.
Fig. 3: Decision tree showing the root, internal and leaf nodes
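The pseudocode can be turned into a small runnable sketch. The version below assumes categorical attributes and uses entropy-based information gain as the ASM; the toy weather data and helper names are hypothetical, not from the paper.

# A minimal sketch of the pseudocode above (categorical attributes, ID3-style ASM).
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def best_attribute(rows, labels, attributes):
    # ASM: pick the attribute whose split maximises information gain (Step 2)
    def gain(a):
        split = {}
        for row, lab in zip(rows, labels):
            split.setdefault(row[a], []).append(lab)
        rem = sum(len(s) / len(labels) * entropy(s) for s in split.values())
        return entropy(labels) - rem
    return max(attributes, key=gain)

def build_tree(rows, labels, attributes):
    # Stopping rule from Step 5: a pure node (or no attributes left) becomes a leaf
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    a = best_attribute(rows, labels, attributes)              # Steps 1-2
    node = {a: {}}                                            # Step 4
    rest = [x for x in attributes if x != a]
    partitions = {}
    for row, lab in zip(rows, labels):                        # Step 3: split on a's values
        partitions.setdefault(row[a], ([], []))
        partitions[row[a]][0].append(row)
        partitions[row[a]][1].append(lab)
    for value, (sub_rows, sub_labels) in partitions.items():
        node[a][value] = build_tree(sub_rows, sub_labels, rest)  # Step 5: recurse
    return node

# Toy data (hypothetical): attribute index 0 = outlook, 1 = windy
rows = [("sunny", "no"), ("sunny", "yes"), ("rain", "no"), ("rain", "yes")]
labels = ["play", "play", "play", "stay"]
print(build_tree(rows, labels, attributes=[0, 1]))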
Naive Bayes
This is a probabilistic classifier and a generative learning algorithm based on Bayes' theorem. It is used for text classification tasks. Given the data and some prior knowledge, the theorem is based on the probability of a hypothesis. The classifier assumes that all features in the input data are conditionally independent of each other, given the class label (note: this assumption is not true for all real-world cases), thereby permitting the algorithm to make predictions quickly. The dataset is divided into two parts: the feature matrix and the response vector. The feature matrix contains all the vectors of the dataset, in which each vector consists of the values of the dependent features. The response vector contains the value of the class variable (prediction) for each row of the feature matrix. The classifier rests on the following assumptions:
i. Feature independence: The features of the data are conditionally independent of each other, given the class label.
ii. Continuous features are normally distributed: If a feature is continuous, then it is assumed to be normally distributed within each class.
iii. Discrete features have multinomial distributions: If a feature is discrete, then it is assumed to have a multinomial distribution within each class.
iv. Features are equally important: All features are assumed to contribute equally to the prediction of the class label.
v. No missing data: The data should not contain any missing values.
For the mathematical analysis from Bayes' theorem, if A and B are events and P(B) ≠ 0, the probability of event A is

$P(A|B) = \frac{P(B|A)P(A)}{P(B)}$   …(1.1)

where event B is the evidence (true), P(A) is the prior probability of A, P(B) is the marginal probability, P(A|B) is the posterior probability of A given the evidence B, and P(B|A) is the likelihood that the hypothesis will come true based on the evidence. Applying Bayes' theorem,

$P(y|X) = \frac{P(X|y)P(y)}{P(X)}$   …(1.2)

where y is the class variable and X is the dependent feature vector (of size n),

$X = (x_1, x_2, \ldots, x_n)$   …(1.3)

Putting the naïve assumption (independence among the features) into Bayes' theorem, we split the evidence into independent parts. If A and B are independent, then

$P(A, B) = P(A)P(B)$   …(1.4)

Hence,

$P(y|x_1, x_2, \ldots, x_n) = \frac{P(x_1|y)P(x_2|y)\cdots P(x_n|y)\,P(y)}{P(x_1)P(x_2)\cdots P(x_n)}$   …(1.5)

which can be expressed as

$P(y|x_1, x_2, \ldots, x_n) = \frac{P(y)\prod_{i=1}^{n} P(x_i|y)}{P(x_1)P(x_2)\cdots P(x_n)}$   …(1.6)

As the denominator remains constant for any given input, we can remove it:

$P(y|x_1, x_2, \ldots, x_n) \propto P(y)\prod_{i=1}^{n} P(x_i|y)$

In order to create the classifier model, we find the probability of the given set of inputs for all possible values of the class variable y and pick the value with maximum probability:

$y = \arg\max_{y} P(y)\prod_{i=1}^{n} P(x_i|y)$   …(1.7)
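A compact sketch of equation (1.7) for text classification follows. The toy documents, the add-one (Laplace) smoothing constant alpha and all names are illustrative assumptions, not part of the original text.

# A small sketch of equation (1.7): choose y maximising P(y) * prod_i P(x_i | y).
from collections import Counter, defaultdict

docs = [("buy cheap pills", "spam"), ("cheap pills now", "spam"),
        ("meeting at noon", "ham"), ("lunch meeting today", "ham")]

class_counts = Counter(label for _, label in docs)             # for the prior P(y)
word_counts = defaultdict(Counter)                             # for P(x_i | y)
for text, label in docs:
    word_counts[label].update(text.split())

def predict(text, alpha=1.0):
    best, best_p = None, 0.0
    for y, cy in class_counts.items():
        prior = cy / len(docs)                                 # P(y)
        total = sum(word_counts[y].values())
        vocab = len(set(w for c in word_counts.values() for w in c))
        likelihood = 1.0
        for w in text.split():                                 # naive independence
            likelihood *= (word_counts[y][w] + alpha) / (total + alpha * vocab)
        p = prior * likelihood                                 # numerator of (1.6)
        if p > best_p:
            best, best_p = y, p
    return best

print(predict("cheap pills today"))   # -> 'spam'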
Regression
The goal of this statistical classifier is to plot the best-fit line or curve through the data (Kurama, 2023). A continuous outcome (y) is predicted based on the value of the predictor variables (x). Linear regression is the most common regression model due to its ease of use (Fig. 4). It finds the linear relationship between the dependent variable (continuous) and one or more independent variables (continuous or discrete).
Steps in determining the best-fit line:
1. Consider the linear problem y = mx + c, where y is the dependent data, x is the independent data within the dataset, m is the coefficient (the contribution of the input value in determining the best-fit line) and c is the bias or intercept (the deviation added to the line equation for the predictions made).
2. Adjust the line by varying m and c.
3. Randomly determine initial values for m and c and plot the line.
4. If the line does not fit well, adjust m and c using the gradient descent algorithm or the least-squares method (see the code sketch below).

$y = mx + c$   …(1.8)

y = the dependent variable, plotted along the y-axis
x = the independent variable, plotted along the x-axis
m = slope of the line
c = the intercept (the value of y when x = 0)
Line of regression = best-fit line for a model
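The four steps can be sketched as follows, assuming a small illustrative dataset, a learning rate of 0.01 and gradient descent on the mean squared error (all assumptions for illustration, not prescriptions from the text):

# A minimal sketch of steps 1-4: fit y = m*x + c by gradient descent.
import random

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.1]          # roughly y = 2x + 1 with noise

random.seed(0)
m, c = random.random(), random.random()   # Step 3: random initial m and c
lr, n = 0.01, len(xs)

for _ in range(5000):                     # Step 4: adjust m and c iteratively
    grad_m = (2 / n) * sum((m * x + c - y) * x for x, y in zip(xs, ys))
    grad_c = (2 / n) * sum((m * x + c - y) for x, y in zip(xs, ys))
    m -= lr * grad_m                      # move against the gradient
    c -= lr * grad_c

print(round(m, 2), round(c, 2))           # ~2.0 and ~1.0: the best-fit line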
Discriminant Analysis
This classifier rests on the following assumptions:
1. Every feature, such as a variable, dimension or attribute in the dataset, has a Gaussian distribution.
2. Each feature holds the same variance, with values varying around the mean by the same amount on average.
3. Each feature is assumed to be sampled randomly.
4. Lack of multicollinearity in the independent features: as correlations between independent features increase, the power of prediction decreases.
In reducing the features from a higher-dimensional space to a lower-dimensional space, the following steps should be considered (a code sketch follows the list):
1. Compute the separability between the various classes. This is to determine the between-class variance of the different classes (the distance between the means of the different classes).
2. Compute the distance between the mean and the samples of each class (the within-class variance).
3. Determine the lower-dimensional space that maximizes the between-class variance and minimizes the within-class variance.
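A sketch of the three steps, assuming two Gaussian classes in two dimensions (synthetic data; all names are illustrative): the between-class scatter S_b captures Step 1, the within-class scatter S_w captures Step 2, and the leading eigenvector of inv(S_w) @ S_b spans the one-dimensional space sought in Step 3.

# Linear discriminant analysis reduction steps on synthetic two-class data.
import numpy as np

rng = np.random.default_rng(0)
X0 = rng.normal([0, 0], 1.0, size=(50, 2))       # class 0 samples (hypothetical)
X1 = rng.normal([3, 2], 1.0, size=(50, 2))       # class 1 samples

mean0, mean1 = X0.mean(axis=0), X1.mean(axis=0)
overall = np.vstack([X0, X1]).mean(axis=0)

# Step 1: between-class scatter from class means vs. the overall mean
S_b = sum(len(Xc) * np.outer(mc - overall, mc - overall)
          for Xc, mc in [(X0, mean0), (X1, mean1)])

# Step 2: within-class scatter of samples around their own class mean
S_w = sum((Xc - mc).T @ (Xc - mc) for Xc, mc in [(X0, mean0), (X1, mean1)])

# Step 3: direction maximising the between/within variance ratio
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_w) @ S_b)
w = eigvecs[:, np.argmax(eigvals.real)].real
print(w / np.linalg.norm(w))                      # the discriminant direction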
Ensemble Methods
This classifier encapsulates multiple learning algorithms to obtain better predictive results. It aims to mitigate the errors or biases that may exist in individual models by leveraging the collective intelligence of the ensemble (Singh, 2023). The outputs of many models are combined, thereby utilizing the strengths of these models to improve accuracy and handle uncertainties in the data in its learning system. The various ensemble techniques are Max Voting, Averaging, Weighted Average, Stacking, Blending, Bagging and Boosting.
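As a small illustration of two of the named techniques, max voting and (weighted) averaging, the sketch below uses made-up predictions from three hypothetical base models:

# Illustrative combination rules; the base-model predictions are invented.
from collections import Counter

# Max voting (classification): the modal class label wins
votes = ["cat", "dog", "cat"]                 # one prediction per base model
print(Counter(votes).most_common(1)[0][0])    # -> 'cat'

# Averaging / weighted average (regression or probabilities)
preds = [0.72, 0.80, 0.64]                    # base-model outputs for one input
weights = [0.5, 0.3, 0.2]                     # e.g. proportional to validation skill
print(sum(preds) / len(preds))                            # simple average
print(sum(w * p for w, p in zip(weights, preds)))         # weighted average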
Artificial Neural Network (ANN)
It is designed to mimic the function and structure of the human brain. An ANN is an intricate network of interconnected nodes, or neurons, that collaborate to tackle complicated tasks. The main characteristic of an ANN is its ability to learn in classification tasks. It learns by example and through experience. In high-dimensionality data, learning is needful for modeling non-linear relationships or recognizing not-well-established relationships amongst the input variables. The learning process is achieved by adjusting the weights of the interconnections according to the applied learning algorithm. The basic attributes of ANNs can be classified into architectural attributes and neuro-dynamic attributes (Kartalopoulos, 1996). The architectural attributes define the number and topology of neurons and their interconnectivity, while the neuro-dynamic attributes define the functionality of the ANN. Based on this, an ANN is also referred to as Deep Learning (DL) when it has more than three layers (the depth of the layers is considered) to handle complex non-linear tasks. The feed forward neural network comprises the single-layer network (Hopfield net architecture), the multilayer perceptron (MLP), which uses back-propagation learning (Levenberg-Marquardt), and the radial basis neural network; all are supervised learning networks.
Feed Forward Neural Networks (FFNN): This is a layered neural network in which an input layer of source nodes projects onto an output layer of neurons, but not vice versa.
a. Single-layer Feed Forward Network: This is the simplest kind of neural network; it is flat and consists of a single layer of output nodes (Fig. 6). It is also called the single perceptron. The inputs are fed directly to the outputs through a series of weights. The sum of the products of the weights and the inputs is calculated in each node, and if the value is above some threshold (typically 0), the neuron fires and takes the activated value (typically 1); otherwise it takes the deactivated value (-1). A single perceptron is only capable of learning linearly separable patterns.
Fig. 6: A Single layer Feed Forward Network
The mapping of the single-unit perceptron is expressed as:

$y = f\left(\sum_{i=1}^{n} w_i x_i + b\right)$   …(1.11)

where $w_i$ are the individual weights, $x_i$ are the inputs and $b$ is the bias.
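Equation (1.11) and the threshold firing rule can be sketched directly. The AND-gate data, the ±1 coding and the classic perceptron update rule used for training are illustrative assumptions consistent with the description above, not details taken from the paper.

# Equation (1.11) with the threshold activation described above:
# fire +1 when the weighted sum plus bias exceeds 0, else -1.
def fire(weights, bias, x):
    s = sum(w * xi for w, xi in zip(weights, x)) + bias    # sum of products + b
    return 1 if s > 0 else -1                              # activated / deactivated

# AND gate with inputs/outputs coded as -1 / +1 (linearly separable)
samples = [((-1, -1), -1), ((-1, 1), -1), ((1, -1), -1), ((1, 1), 1)]

w, b, lr = [0.0, 0.0], 0.0, 0.1
for _ in range(20):                                        # a few epochs suffice
    for x, target in samples:
        y = fire(w, b, x)
        if y != target:                                    # mistake-driven update
            w = [wi + lr * target * xi for wi, xi in zip(w, x)]
            b += lr * target

print([fire(w, b, x) for x, _ in samples])                 # -> [-1, -1, -1, 1]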
b. Multilayer Feed Forward Network (MLP): This distinguishes itself by the presence of one or more hidden layers, whose computation nodes are called hidden neurons, between the input units and the output units (Fig. 7). This aids the network in dealing with more complex non-linear problems. The MLP is structured in a feed forward topology whereby each unit gets its input from the previous one, and is trained by back propagation.
Fig. 7: Multiple Layer Perceptron
The mapping of the inputs to the outputs using an MLP neural network can be expressed as:

$y_k = f\left(\sum_{j=1}^{m} w_{kj}^{(2)}\left(\sum_{i=1}^{n} w_{ji}^{(1)} x_i + w_{j0}^{(1)}\right) + w_{k0}^{(2)}\right)$   …(1.12)

where $w_{ji}^{(1)}$ and $w_{kj}^{(2)}$ indicate the weights in the first and second layers respectively, going from input $i$ to hidden unit $j$ (hidden layer 1), $m$ is the number of hidden units, $y_k$ is the output unit, and $w_{j0}^{(1)}$ and $w_{k0}^{(2)}$ are the biases for hidden unit $j$ and output unit $k$ respectively. For simplicity, the biases have been omitted from the diagram.
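Equation (1.12) transcribes to a forward pass as below. The layer sizes, random weights and the sigmoid choice for f are assumptions for illustration; the equation as printed applies f only at the output layer, and the sketch follows it.

# A direct transcription of equation (1.12) as a forward pass.
import numpy as np

rng = np.random.default_rng(1)
n, m, k = 3, 4, 2                      # inputs, hidden units, outputs
W1 = rng.normal(size=(m, n))           # first-layer weights  w_ji^(1)
b1 = rng.normal(size=m)                # hidden biases        w_j0^(1)
W2 = rng.normal(size=(k, m))           # second-layer weights w_kj^(2)
b2 = rng.normal(size=k)                # output biases        w_k0^(2)

def f(a):                              # output activation (sigmoid here)
    return 1.0 / (1.0 + np.exp(-a))

x = np.array([0.5, -1.0, 2.0])         # one input vector
hidden = W1 @ x + b1                   # inner sums over i for each hidden j
y = f(W2 @ hidden + b2)                # outer sum over j, then f, per output k
print(y)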
c. Radial Basis Neural Network (RBNN): This is also called the Radial Basis Function (RBF) network. It is a two-layer feed forward type of network in which the input is transformed by the basis functions at the hidden layer (Fig. 8). At the output layer, linear combinations of the hidden-layer node responses are added to form the output. The name RBF comes from the fact that the basis functions in the hidden-layer nodes are radially symmetric; that is, the neurons in the hidden layer contain Gaussian transfer functions whose outputs are inversely proportional to the distance from the center of the neuron.
Fig. 8: Radial Basis Neural Network
Mathematically, it can be expressed as:

$y(x) = \sum_{i=1}^{N} w_i\,\phi\left(\lVert x - c_i \rVert\right)$   …(1.13)

where $x$ is the input vector, $N$ is the number of neurons in the hidden layer, $w_i$ are the weights of the connections from the hidden layer to the output layer, and $c_i$ are the centers of the hidden-layer neurons.
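Equation (1.13) can be sketched with Gaussian basis functions: fix the centers c_i, build the matrix of φ(‖x − c_i‖) responses, and fit the linear output-layer weights w_i by least squares. The sine-curve data, the number of centers and the basis width below are illustrative assumptions.

# A sketch of equation (1.13) with Gaussian basis functions.
import numpy as np

def phi(r, width=1.0):
    return np.exp(-(r / width) ** 2)           # Gaussian, decays with distance

xs = np.linspace(0.0, 2.0 * np.pi, 40)         # 1-D inputs (hypothetical)
ys = np.sin(xs)                                # target function to approximate

centers = np.linspace(0.0, 2.0 * np.pi, 8)     # the c_i of the hidden neurons
Phi = phi(np.abs(xs[:, None] - centers[None, :]))   # Phi[j, i] = phi(||x_j - c_i||)
w, *_ = np.linalg.lstsq(Phi, ys, rcond=None)   # linear output-layer weights

y_hat = Phi @ w                                # y(x) = sum_i w_i phi(||x - c_i||)
print(float(np.max(np.abs(y_hat - ys))))       # small approximation error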
III. CONCLUSION
As the present world revolves around AI for its benefits, machine learning has been of immense importance to the building of such intelligent systems and to improving their performance. Learning under supervision to predict the output of a system when given new inputs has proved more accurate and easier when the decision boundary is not overstrained. This overview of supervised machine learning paradigms gives a detailed insight into the various statistical and scientific classifiers used in building functions that map new data onto the expected output values in tasks that require classification, regression or both.

REFERENCES
[1] Ambrose, S.A., Bridges, M.N., Dipietro, M., Lovett, M.C. and Norman, M.K. (2010). How Learning Works: Seven Research-Based Principles for Smart Teaching, Jossey-Bass, A Wiley Imprint, San Francisco, pp. 1-301.
[2] Bansal, R., Singh, J. and Kaur, R. (2019). Machine Learning and its Applications: A Review, Journal of Applied Science and Computations, Vol. VI, Issue VI, pp. 1392-1398.
[3] Brownlee, J. (2020). Basic Concepts in Machine Learning. Retrieved from https://machinelearningmastery.com/basic-concepts-in-machine-learning/
[4] Falade, K.I. (2021). Introduction to Computational Algorithm, Numerical and Computational Research Laboratory, pp. 1-50.
[5] Ghahremani-Nahr, J., Hamed, N. and Sadeghi, M.E. (2021). Artificial Intelligence and Machine Learning for Real-World Problems (A Survey), International Journal of Innovation in Engineering, 1(3), pp. 38-47.
[6] Haykin, S. (1998). Neural Networks: A Comprehensive Foundation, Macmillan College Publishing Company, Inc., USA, pp. 1-696.
[7] Jain, A.K. (1996). Artificial Neural Networks: A Tutorial, pp. 1-14. Retrieved from www.cogsci.ucsd.edu/ajyu/Teaching/cogs202_sp12/Readings/jain_ann96.pdf
[8] Kartalopoulos, S.V. (1996). Understanding Neural Networks and Fuzzy Logic: Basic Concepts and Applications, IEEE Press, NY, pp. 1-232.
[9] Kurama, V. (2023). Regression in Machine Learning: What It Is and Examples of Different Models. Retrieved from https://builtin.com/data-science/regression-machine-learning
[10] NetApp (2023). What is Machine Learning? Retrieved from https://www.netapp.com/artificial-intelligence/what-is-machine-learning/