ML Research Paper
I. Introduction
Although the term machine learning has its origins in computer science, several vector
quantization methods were developed in telecommunications and signal processing for coding and
compression. In computer and data science, learning is accomplished based on examples (data
samples) and experience. A basic signal/data processing framework that includes pre-processing,
noise removal and segmentation is shown in Figure 1, where the signal is acquired from the sensor
and then processed, typically in a frame-by-frame or batch mode. Noise removal and feature
extraction follow next, and finally the classification stage, which provides either an estimate or a
decision, completes the process.
Figure 1: Basic signal processing framework including pre-processing, feature extraction and classification.
Typically, the feature extraction stage extracts compact, information-bearing parameters that
characterize the data. The classification stage then has to be trained by a machine learning
algorithm to recognize and classify the collection of features. The field of machine learning is vast
and its applications are expanding rapidly, especially with the emergence of fast mobile devices that
also have access to cloud computing. Compressing and extracting information from sensors and
big data have recently elevated interest in the area. Smart city projects, mobile health monitoring,
networked security, manufacturing, self-driving automobiles, surveillance, intelligent border control:
every application has its idiosyncrasies and requires customized features, adaptive learning, and data
fusion. Data compression and statistical signal and data analysis have a large role in transmitting and
interpreting data and producing meaningful analytics. Machine learning algorithms can be broadly
classified into three categories based on their properties, style of learning, and the way data are used
[13]: supervised, unsupervised, and semi-supervised algorithms. This type of classification is important
in identifying the role of the input data and the utility of the algorithms and learning models relative to
the applications.
II. SUPERVISED LEARNING
In supervised learning, “true” or “correct” labels of the input dataset are available. The
algorithm is “trained” using the labelled input dataset (training data) which means ground truth
samples are available for training. In the training process, the algorithm makes predictions
on the input data and improves its estimates using the ground truth, iterating
until it reaches a desired level of accuracy. In almost all machine learning
algorithms, we optimize a cost function or an objective function. The cost function is typically a
measure of the error between the ground truth and the algorithm estimates. By minimizing the
cost function, we train our model to produce estimates that are close to the correct values
(ground truth). Minimization of the cost function is usually achieved using the gradient descent
technique. Variants of gradient descent, such as stochastic gradient descent on a minibatch,
momentum-based gradient descent, and Nesterov accelerated gradient descent, have been used in
many machine learning training paradigms. Suppose we have 𝑚 training examples, each of them
labelled and represented as a pair (x, 𝑦), where x represents the input data and 𝑦 represents the
class label. The input data x can be 𝑛-dimensional, where each dimension corresponds to a feature
or a variable.
Supervised learning methods are used in various fields, including the identification of phytoplankton
species, the mapping of rainfall-induced landslides, and the classification of biomedical data. Machine
learning algorithms have also been integrated on embedded sensor systems for IoT applications. In the
following sub-sections, we
present supervised learning algorithms.
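Before doing so, the following is a minimal sketch of the batch gradient descent loop described above, written in Python/NumPy; the least-squares cost, the learning rate, and the toy data are our own illustrative assumptions rather than part of any specific method in this paper.

import numpy as np

def gradient_descent(X, y, lr=0.5, n_iters=5000):
    # Minimize the least-squares cost J(w) = (1/2m)||Xw - y||^2.
    # X is (m, n+1) with a leading column of ones for the bias; y is (m,).
    m = X.shape[0]
    w = np.zeros(X.shape[1])              # initial weights
    for _ in range(n_iters):
        grad = X.T @ (X @ w - y) / m      # gradient of the cost
        w -= lr * grad                    # step along the negative gradient
    return w

# Toy usage: recover y ~ 1 + 2x from noisy samples.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 100)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + 0.01 * rng.normal(size=100)
print(gradient_descent(X, y))             # approximately [1.0, 2.0]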
A. Linear Regression
Regression is a statistical technique of estimating the relationship between input and output
variables. It maps the input variables to a continuous function. A simple univariate linear
regression model is shown in Figure 2.
The training dataset consists of 𝑚 labelled training pairs (x, 𝑦) ∈ ℝ𝑛+1, where x is the independent
variable and 𝑦 is the dependent variable. The linear regression model assumes that the relationship
between the independent and dependent variables is linear and fits a straight line to the data
points. This relationship is expressed by a hypothesis function or a prediction function:

ℎ(x) = w0 + w1𝑥1 + w2𝑥2 + · · · + w𝑛𝑥𝑛 (1)

where 𝑥1, 𝑥2, . . . , 𝑥𝑛 are the features and w0, w1, . . . , w𝑛 are the weights of the model. The
approach can also be used to perform linear regression through slope filtering. Equation (1) is for a
multivariate linear regression model. The output is the linear sum of the weighted input features.
The weights are typically learned by a weighted least-squares minimization process. We can also
make use of quadratic, cubic, or higher-order polynomial terms to obtain a different hypothesis
function that fits quadratic, cubic, or polynomial curves, respectively, rather than a simple
straight line. Multivariate linear regression is used in several applications, including activity
recognition and classification, and steady-state visual evoked potential (SSVEP) recognition for BCI data.
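As a hedged illustration of the least-squares fit behind Equation (1), the sketch below (Python/NumPy; the toy data and the choice of a quadratic term are our own) solves the problem with a standard least-squares routine and shows how adding polynomial terms changes the hypothesis.

import numpy as np

# Toy data (illustrative): noisy samples of an underlying quadratic.
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 50)
y = 0.5 + 1.5 * x - 2.0 * x**2 + 0.1 * rng.normal(size=x.size)

# Linear hypothesis h(x) = w0 + w1*x: design matrix [1, x].
X_lin = np.column_stack([np.ones_like(x), x])
w_lin, *_ = np.linalg.lstsq(X_lin, y, rcond=None)

# Quadratic hypothesis h(x) = w0 + w1*x + w2*x^2: add a squared feature.
X_quad = np.column_stack([np.ones_like(x), x, x**2])
w_quad, *_ = np.linalg.lstsq(X_quad, y, rcond=None)

print("linear weights:   ", w_lin)
print("quadratic weights:", w_quad)   # approximately [0.5, 1.5, -2.0]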
B. Logistic Regression
The objective of the multivariate regression model is to determine a hypothesis function that outputs
a continuous value. Now, we present another class of supervised learning algorithms: Classification,
in which the objective is to obtain a discrete output. Logistic regression is a statistical way of
modelling a binomial outcome. As before, the input can have one or more features (or variables).
For binary logistic regression, the outcome can be 0 or 1, yielding a binary classification that separates
the positive class from the negative class. Logistic regression uses the sigmoid curve shown in Figure 3 to
output a probability value and thus perform the classification. The hypothesis function for
logistic regression is based on the sigmoid function
𝑆(𝑧) = 1 / (1 + 𝑒−𝑧) (3)
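A minimal sketch of binary logistic regression trained by gradient descent on the cross-entropy cost follows (Python/NumPy; the synthetic one-dimensional data and the learning rate are our own illustrative assumptions).

import numpy as np

def sigmoid(z):
    # S(z) = 1 / (1 + e^(-z)), as in Equation (3).
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic(X, y, lr=0.1, n_iters=5000):
    # Gradient descent on the cross-entropy cost; X is (m, n+1) with a
    # bias column, and y holds 0/1 class labels.
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        p = sigmoid(X @ w)                 # predicted probabilities
        w -= lr * X.T @ (p - y) / len(y)   # gradient of the cross-entropy
    return w

# Toy usage: class 1 tends to have larger feature values.
rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(-1, 1, 50), rng.normal(1, 1, 50)])
y = np.concatenate([np.zeros(50), np.ones(50)])
X = np.column_stack([np.ones_like(x), x])
w = train_logistic(X, y)
preds = (sigmoid(X @ w) > 0.5).astype(int)   # thresholded class decisions
print((preds == y).mean())                    # training accuracy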
D. Naïve Bayes
The Naïve Bayes classifier relies on Bayes' theorem,

𝑝(𝜔𝑐|x) = 𝑝(x|𝜔𝑐) 𝑝(𝜔𝑐) / 𝑝(x),

for each of the 𝐶 possible outcomes or 𝐶 classes. Here, 𝑝(𝜔𝑐|x) is the posterior probability that
a given feature vector x belongs to the 𝑐th class 𝜔𝑐, 𝑝(𝜔𝑐) is the prior probability of the class 𝜔𝑐
independent of the data, 𝑝(x|𝜔𝑐) is the likelihood, i.e., the probability of the predictor given the class,
and 𝑝(x) is the prior probability of the predictor, which acts as a normalizing factor. There are many variations of
the Naïve Bayes algorithm, some of which address its poor independence assumptions [54,55,56]. The Naïve Bayes
algorithm is used for text classification [57], credit scoring [58], emotion classification and
recognition [67], and the detection of epileptic seizures from EEG signals.
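As a concrete sketch, the code below implements a Gaussian Naïve Bayes classifier, a common variant in which each feature is modelled as a class-conditional Gaussian; the Gaussian likelihood and the toy data are our own illustrative assumptions.

import numpy as np

class GaussianNaiveBayes:
    # Assumes the features are conditionally independent Gaussians per class.
    def fit(self, X, y):
        self.classes = np.unique(y)
        self.mu, self.var, self.prior = {}, {}, {}
        for c in self.classes:
            Xc = X[y == c]
            self.mu[c] = Xc.mean(axis=0)          # per-feature class means
            self.var[c] = Xc.var(axis=0) + 1e-9   # variances (smoothed)
            self.prior[c] = len(Xc) / len(X)      # class prior p(w_c)
        return self

    def predict(self, X):
        # Compare log p(x|w_c) + log p(w_c); p(x) is a common normalizer.
        scores = []
        for c in self.classes:
            log_lik = -0.5 * (np.log(2 * np.pi * self.var[c])
                              + (X - self.mu[c]) ** 2 / self.var[c]).sum(axis=1)
            scores.append(log_lik + np.log(self.prior[c]))
        return self.classes[np.argmax(scores, axis=0)]

# Toy usage with two well-separated 2-D Gaussian classes.
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
model = GaussianNaiveBayes().fit(X, y)
print((model.predict(X) == y).mean())   # training accuracy, close to 1.0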
E. k-Nearest Neighbors
The k-Nearest Neighbors (k-NN) algorithm is one of the simplest supervised machine learning algorithms.
k-NN can be used for the classification of input points into discrete outcomes. A simple k-NN model is shown in
Figure 5.
Figure 4: Maximum margin intuition; hyperplane A has maximum separation.
Figure 5: A simple k-NN model for different values of k.
k-NN can also be used for regression analysis, where the outcome of a dependent variable is
predicted from the input independent variables. In Figure 5, for k = 3, the test point (star) is
classified as belonging to class B, and for k = 6, the point is classified as belonging to class A. k-NN is a
non-probabilistic and non-parametric model, and hence it is a first choice for classification studies when
there is no prior knowledge about the distribution of the data. k-NN stores all the labelled input points in
order to classify any unknown sample, and this makes it computationally expensive. The classification is
based on a similarity measure (a distance metric): any unknown sample is classified by the majority vote
of its k nearest neighbors. The complexity increases as the dimensionality increases, and hence
dimensionality reduction techniques are applied before using k-NN to avoid the effects of the curse of
dimensionality. The k-NN classifier has been used for stress detection from physiological signals and for
the detection of epileptic seizures.
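The majority-vote rule can be sketched in a few lines (Python/NumPy; the Euclidean distance and the toy points are our own choices, arranged so that the predicted class can change with k, as in Figure 5).

import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, x_query, k=3):
    # Classify x_query by majority vote among its k nearest training
    # points under the Euclidean distance.
    dists = np.linalg.norm(X_train - x_query, axis=1)   # distances to all points
    nearest = np.argsort(dists)[:k]                     # indices of the k closest
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy usage: two clusters of labelled points and one query point.
X_train = np.array([[0, 0], [0, 1], [1, 0], [3, 3], [3, 4], [4, 3]])
y_train = np.array(['A', 'A', 'A', 'B', 'B', 'B'])
query = np.array([2, 2])
print(knn_classify(X_train, y_train, query, k=3))   # vote among 3 neighbors
print(knn_classify(X_train, y_train, query, k=5))   # a larger k can flip the class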
III. UNSUPERVISED LEARNING
In the case of unsupervised algorithms, there are no explicit labels associated with the training
dataset. The objective is to draw inferences from the input data and then model the hidden or
underlying structure and distribution of the data, in order to learn more about it.
Clustering is the most common example of an unsupervised algorithm, and it is described below.
A. Clustering
Clustering deals with finding a structure or pattern in a collection of unlabeled data. For a
given dataset, a clustering algorithm groups the data into K clusters such that the
data points within each cluster are similar to each other and data points from different clusters are
dissimilar. As in the k-NN algorithm, we make use of a similarity or distance metric, and
different metrics such as the Euclidean, Mahalanobis, cosine, and Minkowski distances are used.
Although the Euclidean distance metric is used most often, it is not always a suitable metric for capturing
the quality of the clustering. The K-means algorithm is one of the simplest clustering algorithms
and is an intuitive, iterative algorithm. It clusters the data by separating them into K groups of
equal variance, minimizing the inertia or within-cluster sum-of-squares. However, the algorithm
requires the number of clusters to be specified before running. Each observation or data point is
assigned to the cluster with the nearest mean 𝝁(j), which is also referred to as the centroid of that
cluster. Thus, the K clusters can be specified by the K centroids.
The K-means clustering algorithm leads to a Voronoi tessellation of the feature space. The K-means
iterations stop (converge) when the means of the clusters no longer change. In
Figure 6, a converged K-means algorithm is shown. Clustering has several applications in many
fields. In biology, clustering has been used to determine groups of genes that have similar
functions and for the detection of brain tumors; other applications include cardiogram data
clustering, business and e-commerce analysis, information retrieval, image segmentation and
compression, the study of quantitative resolutions of nanoparticles, fault detection in solar PV
panels, and speech recognition.
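A minimal K-means sketch follows (Python/NumPy); the random initialization from data points and the toy two-blob data are our own choices, while the assignment/update steps and the convergence test on unchanged means follow the description above.

import numpy as np

def kmeans(X, K, n_iters=100, seed=0):
    # Lloyd's iterations: assign each point to its nearest centroid, then
    # recompute each centroid as the mean of its cluster; stop when the
    # means no longer change. Empty clusters are not handled in this sketch.
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), K, replace=False)]   # initial means
    for _ in range(n_iters):
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)                         # assignment step
        new_centroids = np.array([X[labels == k].mean(axis=0) for k in range(K)])
        if np.allclose(new_centroids, centroids):         # converged
            break
        centroids = new_centroids                         # update step
    return centroids, labels

# Toy usage: two well-separated blobs in 2-D.
rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
centroids, labels = kmeans(X, K=2)
print(centroids)   # approximately [0, 0] and [5, 5]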
B. Vector Quantization
In its simplest form, vector quantization organizes data into vectors and represents them by their
centroids. A K-means clustering algorithm is typically used to train the quantizer. The centroids form
the codewords, and all the codewords are stored in a codebook.
Figure 7: Uniform quantization of 2-dimensional data.
Figure 8: Vector quantization of 2-dimensional data.
Vector quantization is a lossy compression method and is used in several coding applications. As a result, the
compressed data has quantization errors that are inversely proportional to the local data density. This property
is shown in Figure 8 and compared with uniform quantization in Figure 7. The vector quantization technique is
used in various applications, including speech coding, emotion recognition, audio compression, large-scale
image classification, and image compression.
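Building directly on K-means, the sketch below (Python/NumPy; the codebook size and the toy data are our own choices) trains a codebook whose centroids serve as codewords, and encodes each vector as the index of its nearest codeword.

import numpy as np

def train_codebook(X, n_codewords, n_iters=50, seed=0):
    # Train a VQ codebook with K-means: the centroids are the codewords.
    rng = np.random.default_rng(seed)
    codebook = X[rng.choice(len(X), n_codewords, replace=False)]
    for _ in range(n_iters):
        d = np.linalg.norm(X[:, None, :] - codebook[None, :, :], axis=2)
        idx = d.argmin(axis=1)
        for k in range(n_codewords):
            if np.any(idx == k):                  # skip empty cells
                codebook[k] = X[idx == k].mean(axis=0)
    return codebook

def encode(X, codebook):
    # Lossy compression: each vector is replaced by the index of its
    # nearest codeword; decoding is a simple codebook lookup.
    d = np.linalg.norm(X[:, None, :] - codebook[None, :, :], axis=2)
    return d.argmin(axis=1)

# Toy usage: quantize 2-D data with a 4-word codebook.
rng = np.random.default_rng(5)
X = rng.normal(0, 1, (200, 2))
cb = train_codebook(X, 4)
indices = encode(X, cb)                # transmitted indices
reconstructed = cb[indices]            # decoder output, with quantization error
print(np.mean(np.linalg.norm(X - reconstructed, axis=1)))   # average distortion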
IV. ARTIFICIAL NEURAL NETWORKS AND DEEP LEARNING
In this section, a brief introduction to the field of artificial neural networks is provided, with a
focus on deep learning methodologies and their applications. Artificial neural networks are widely
used in the areas of image classification and pattern recognition; they have proved to be highly
successful, achieving superior results in various fields including signal processing, computer
vision, speech processing, and natural language processing.
Deep learning is a branch of machine learning that has gained popularity quite recently and is
capable of learning multiple levels of abstraction. Although the inception of neural networks dates
back to the 1960s, deep learning has gained popularity since 2012 because of the great advances in
GPUs and the availability of large labelled datasets. In Figure 9, a simple artificial neural network with 4
hidden layers is shown. The last layer, namely the output layer, performs classification. The term
“deep learning” refers to several layers used to learn multiple levels of representation. Each
successive layer takes the output of the previous layer and feeds the result to the next layer.
Figure 9: Artificial Neural Network with four hidden layers.
Typical challenges in artificial neural networks include the initialization of the network parameters,
overfitting, and long training times. We now have various techniques to address these
problems: batch normalization, normalization propagation, weight normalization, and layer
normalization all help in accelerating the training of deep neural networks, while dropout helps in
reducing overfitting. There are several network architectures, including the one shown in Figure 9,
which consists of dot product layers (fully connected layers). A convolutional layer processes a
volume of activations rather than a vector and produces feature maps. It also makes use of a
subsampling layer or a max-pooling layer to reduce the size of the feature maps. Figure 10
shows an example of a convolutional neural network (CNN). Networks whose output depends
on present and past inputs, namely recurrent neural networks (RNNs), have also been used in
several applications.
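To make the layer types above concrete (convolution, max-pooling, and fully connected layers), the following is a small sketch in PyTorch; the framework choice and all layer sizes are our own illustrative assumptions, not an architecture prescribed by the text.

import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    # Convolution -> ReLU -> max-pool stages produce feature maps; a fully
    # connected (dot product) layer then performs the classification.
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),   # 16 feature maps
            nn.ReLU(),
            nn.MaxPool2d(2),                              # subsampling layer
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # 32 feature maps
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 7 * 7, n_classes)  # output layer

    def forward(self, x):
        x = self.features(x)            # volumes of activations
        x = x.flatten(start_dim=1)      # flatten to one vector per example
        return self.classifier(x)       # class scores

# Toy usage on a batch of eight 28x28 single-channel images.
model = SmallCNN()
scores = model(torch.randn(8, 1, 28, 28))
print(scores.shape)                     # torch.Size([8, 10])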
CONCLUSION
This short machine learning survey paper supported the tutorial session of IISA 2017. The
paper covered supervised and unsupervised learning models. We also provided a brief
introduction to current deep learning methodologies and outlined several applications including
pattern recognition, anomaly detection, computer vision and speech processing. The paper
provides an extensive bibliography of machine learning algorithms and their applications.