
ISSN XXXX XXXX © 2017 IJESC

Research Article Volume 7 Issue No.3

Machine Learning Algorithms: Trends, Perspectives and Prospects


Priyanka P. Raut1, Namrata R. Borkar2
ME Student1, Assistant Professor2
Department of CSE
Dr. Sau. Kamlatai Gawai Institute of Engineering and Technology, Darapur, MS, India

Abstract:
Machine learning addresses the question of how to build computers that improve automatically through experience. It is one of today's most rapidly growing technical fields, lying at the intersection of computer science and statistics, and at the core of artificial intelligence and data science. Recent progress in machine learning has been driven both by the development of new learning algorithms and theory and by the ongoing explosion in the availability of online data and low-cost computation. The adoption of data-intensive machine-learning methods can be found throughout science, technology and commerce, leading to more evidence-based decision-making across many walks of life, including health care, manufacturing, education, financial modeling, policing, and marketing.

Keywords: Supervised Machine Learning, SVM, DT, Classifier.

I. INTRODUCTION

Machine learning is used to teach machines how to handle data more efficiently. Sometimes, after viewing the data, we cannot interpret the pattern or extract information from it. In that case, we apply machine learning [1]. With the abundance of datasets available, the demand for machine learning is on the rise. Many industries, from medicine to the military, apply machine learning to extract relevant information. Machine learning has progressed dramatically over the past two decades, from a laboratory curiosity to a practical technology in widespread commercial use. Within artificial intelligence (AI), machine learning has emerged as the method of choice for developing practical software for computer vision, speech recognition, natural language processing, robot control, and other applications. Many developers of AI systems now recognize that, for many applications, it can be far easier to train a system by showing it examples of desired input-output behavior than to program it manually by anticipating the desired response for all possible inputs. The effect of machine learning has also been felt broadly across computer science and across a range of industries concerned with data-intensive issues, such as consumer services, the diagnosis of faults in complex systems, and the control of logistics chains. There has been a similarly broad range of effects across the empirical sciences, from biology to cosmology to social science, as machine-learning methods have been developed to analyze high-throughput experimental data in novel ways. The purpose of machine learning is to learn from data. Many studies have been done on how to make machines learn by themselves [2][3], and many mathematicians and programmers apply several approaches to this problem.

II. TYPES OF LEARNING

A. Supervised Learning
Supervised machine learning algorithms are those which need external assistance. The input dataset is divided into a train and a test dataset. The train dataset has an output variable which needs to be predicted or classified. All algorithms learn some kind of pattern from the training dataset and apply it to the test dataset for prediction or classification [4]. The workflow of supervised machine learning algorithms is given in Fig. 1. The three most famous supervised machine learning algorithms are discussed here.

Figure.1. Workflow of supervised machine learning algorithm [4]

1) Decision Tree: Decision trees group attributes by sorting them based on their values. A decision tree is used mainly for classification. Each tree consists of nodes and branches. Each node represents an attribute in a group that is to be classified, and each branch represents a value that the node can take [4]. An example of a decision tree is given in Fig. 2.

International Journal of Engineering Science and Computing, March 2017 4884 http://ijesc.org/
The pseudo code for the decision tree is described in Fig. 3, where S, A, and y are the training set, input attribute, and target attribute, respectively.
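As a concrete illustration of the attribute-sorting idea, here is a minimal information-gain split chooser in plain Python. It is only a sketch: the entropy criterion, the `best_attribute` helper, and the toy weather attributes are illustrative assumptions, not the pseudo code of Fig. 3.

```python
from collections import Counter
import math

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_attribute(rows, labels, attributes):
    """Pick the attribute whose split yields the largest information gain."""
    base = entropy(labels)
    def gain(a):
        remainder = 0.0
        for v in set(r[a] for r in rows):
            subset = [l for r, l in zip(rows, labels) if r[a] == v]
            remainder += len(subset) / len(labels) * entropy(subset)
        return base - remainder
    return max(attributes, key=gain)

# Toy data: "outlook" perfectly predicts the label, "windy" does not.
rows = [{"outlook": "sunny", "windy": "yes"},
        {"outlook": "sunny", "windy": "no"},
        {"outlook": "rain",  "windy": "yes"},
        {"outlook": "rain",  "windy": "no"}]
labels = ["no", "no", "yes", "yes"]
print(best_attribute(rows, labels, ["outlook", "windy"]))  # -> outlook
```

A full tree builder would recurse: split on the chosen attribute, then repeat on each subset until the labels are pure.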

Figure.2. Decision tree [5]

2) Naïve Bayes: Naïve Bayes mainly targets the text classification industry. It is used mainly for clustering and classification [6]. The underlying architecture of Naïve Bayes depends on conditional probability; it creates trees based on the probability of events happening. These trees are also known as Bayesian networks. An example of such a network is given in Fig. 4, and the pseudo code in Fig. 5.

Figure. 4. An example of a Bayesian network [7]

Figure. 5. Pseudo code for Naïve Bayes [6]
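The conditional-probability machinery can be sketched as a tiny multinomial Naïve Bayes text classifier with add-one (Laplace) smoothing. The function names and the toy spam/ham corpus are illustrative assumptions, not the pseudo code of Fig. 5.

```python
from collections import Counter, defaultdict
import math

def train_nb(docs, labels):
    """Estimate class priors and per-class word counts with a shared vocabulary."""
    priors = Counter(labels)
    word_counts = defaultdict(Counter)   # class -> word -> count
    vocab = set()
    for words, y in zip(docs, labels):
        word_counts[y].update(words)
        vocab.update(words)
    return priors, word_counts, vocab

def predict_nb(words, priors, word_counts, vocab):
    """Pick the class maximizing log P(class) + sum of log P(word|class)."""
    best, best_score = None, float("-inf")
    total = sum(priors.values())
    for y, prior in priors.items():
        n_y = sum(word_counts[y].values())
        score = math.log(prior / total)
        for w in words:
            # Add-one smoothing avoids zero probabilities for unseen words.
            score += math.log((word_counts[y][w] + 1) / (n_y + len(vocab)))
        if score > best_score:
            best, best_score = y, score
    return best

docs = [["cheap", "pills"], ["cheap", "offer"], ["meeting", "agenda"], ["project", "agenda"]]
labels = ["spam", "spam", "ham", "ham"]
model = train_nb(docs, labels)
print(predict_nb(["cheap", "agenda", "pills"], *model))  # -> spam
```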

3) Support Vector Machine: Another widely used state-of-the-art machine learning technique is the Support Vector Machine (SVM). It is used mainly for classification. SVM works on the principle of margin calculation: it draws margins between the classes. The margins are drawn in such a fashion that the distance between the margin and the classes is maximum, which minimizes the classification error. An example of the working of SVM and its pseudo code are given in Fig. 6 and Fig. 7, respectively.

Figure. 3. Pseudo code for decision tree [5]

Figure. 6. Working of support vector machine [8]
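The margin-maximization principle can be sketched with the Pegasos stochastic sub-gradient method cited in [9]. This is a minimal toy version: the step-size schedule follows Pegasos, but the dataset, regularization constant, seed, and epoch count are illustrative choices.

```python
import random

def pegasos_train(X, y, lam=0.01, epochs=100, seed=0):
    """Pegasos: stochastic sub-gradient descent on the SVM hinge-loss objective."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        for _ in range(len(X)):
            t += 1
            i = rng.randrange(len(X))
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            scale = 1.0 - eta * lam
            w = [scale * wj for wj in w]        # shrink (regularization step)
            if margin < 1:                      # point inside the margin: push w toward it
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

# Two linearly separable clusters in 2D; labels are +1 / -1.
X = [[2.0, 2.0], [3.0, 2.5], [-2.0, -1.5], [-3.0, -2.0]]
y = [1, 1, -1, -1]
w = pegasos_train(X, y)
preds = [1 if sum(wj * xj for wj, xj in zip(w, xi)) >= 0 else -1 for xi in X]
print(preds)
```

On this separable toy set the learned hyperplane classifies all four training points correctly; real uses would add a bias term and a kernel or feature map.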


Figure. 7. Pseudo code for support vector machine [9]

B. Unsupervised Learning
Unsupervised learning algorithms learn a few features from the data. When new data is introduced, the algorithm uses the previously learned features to recognize the class of the data. Unsupervised learning is used mainly for clustering and feature reduction. An example of the workflow of unsupervised learning is given in Fig. 8.

Figure.10. Pseudo code for k-means clustering [12]

Figure. 8. Example of unsupervised learning [10]

The two main algorithms for clustering and dimensionality reduction are discussed below.

1) K-Means Clustering:
Clustering, or grouping, is a type of unsupervised learning technique that creates groups automatically. Items which possess similar characteristics are put in the same cluster. The algorithm is called k-means because it creates k distinct clusters. The mean of the values in a particular cluster is the center of that cluster [9]. Clustered data is represented in Fig. 9. The algorithm for k-means is given in Fig. 10.

Figure .9. K-means clustering [7]

Figure.11. Visualization of data before and after applying PCA [11]
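The cluster-center idea can be sketched with Lloyd's classic alternating procedure: assign each point to its nearest center, then recompute each center as the mean of its cluster. The deterministic first-k initialization and the toy 2D points are simplifying assumptions; practical implementations randomize or use k-means++ initialization.

```python
def kmeans(points, k, iters=20):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centers = points[:k]                      # simple deterministic init (illustrative)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # nearest center by squared Euclidean distance
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        # new center = mean of cluster; keep old center if a cluster is empty
        centers = [tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[j]
                   for j, cl in enumerate(clusters)]
    return centers, clusters

points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
centers, clusters = kmeans(points, 2)
print(sorted(len(c) for c in clusters))  # -> [2, 2]
```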

2) Principal Component Analysis:
In Principal Component Analysis (PCA), the dimension of the data is reduced to make computations faster and easier. To understand how PCA works, let us take an example of 2D data. When the data is plotted on a graph, it takes up two axes. After PCA is applied, the data becomes 1D. This is explained in Fig. 11. The pseudo code for PCA is discussed in Fig. 12.

Figure. 12. Pseudo code for pca [12]
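The 2D-to-1D reduction described above can be sketched directly: center the data, estimate the 2x2 covariance matrix, find its top eigenvector by power iteration, and project each point onto it. The helper name `pca_1d` and the sample points are illustrative, not taken from the pseudo code of Fig. 12.

```python
def pca_1d(data, iters=50):
    """Project 2D points onto their first principal component."""
    n = len(data)
    mx = sum(p[0] for p in data) / n
    my = sum(p[1] for p in data) / n
    centered = [(x - mx, y - my) for x, y in data]
    # entries of the 2x2 covariance matrix
    cxx = sum(x * x for x, _ in centered) / n
    cxy = sum(x * y for x, y in centered) / n
    cyy = sum(y * y for _, y in centered) / n
    v = (1.0, 0.0)
    for _ in range(iters):                 # power iteration -> dominant eigenvector
        w = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
        norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
        v = (w[0] / norm, w[1] / norm)
    # each 2D point becomes one coordinate along the principal direction
    return [x * v[0] + y * v[1] for x, y in centered]

data = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8)]
scores = pca_1d(data)
print(len(scores))  # -> 4 (one 1D coordinate per original 2D point)
```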

C. Semi-Supervised Learning
Semi-supervised learning is a technique which combines the power of both supervised and unsupervised learning. It can be fruitful in those areas of machine learning and data mining where unlabeled data is already present and getting labeled data is a tedious process [14]. There are many categories of semi-supervised learning [15], some of which are discussed below:

1) Generative Models: Generative models are one of the oldest semi-supervised learning methods. They assume a structure like p(x,y) = p(y)p(x|y), where p(x|y) is a mixture distribution, e.g. a Gaussian mixture model. Within the unlabeled data, the mixture components can be identifiable, and one labeled example per component is enough to confirm the mixture distribution.

2) Self-Training: In self-training, a classifier is trained with a portion of the labeled data. The classifier is then fed with unlabeled data, and the unlabeled points together with their predicted labels are added to the training set. This procedure is then repeated. Since the classifier is teaching itself, hence the name self-training.

3) Transductive SVM: The transductive support vector machine (TSVM) is an extension of SVM. In TSVM, both the labeled and the unlabeled data are considered. It is used to label the unlabeled data in such a way that the margin between the labeled and unlabeled data is maximum. Finding an exact solution with TSVM is an NP-hard problem.

D. Reinforcement Learning
Reinforcement learning is a type of learning in which decisions are made about which actions to take such that the outcome is more positive. The learner has no knowledge of which actions to take until it is given a situation. The actions taken by the learner may affect situations and their actions in the future. Reinforcement learning solely depends on two criteria: trial-and-error search and delayed outcome [16]. The general model [17] for reinforcement learning is depicted in Fig. 13.

Figure.13. The reinforcement learning model [17]

In the figure, the agent receives an input i, the current state s, a state transition r, and an input function I from the environment. Based on these inputs, the agent generates a behavior B and takes an action a which generates an outcome.

E. Multitask Learning
Multitask learning has the simple goal of helping other learners perform better. When a multitask learning algorithm is applied on a task, it remembers the procedure by which it solved the problem or reached a particular conclusion, and then uses these steps to find the solution of other, similar problems or tasks. This helping of one algorithm by another can also be termed an inductive transfer mechanism. If the learners share their experience with each other, they can learn concurrently rather than individually and can be much faster [18].

F. Ensemble Learning
When various individual learners are combined to form a single learner, that type of learning is called ensemble learning. The individual learner may be Naïve Bayes, a decision tree, a neural network, etc. Ensemble learning has been a hot topic since the 1990s. It has been observed that a collection of learners is almost always better at doing a particular job than individual learners [19]. Two popular ensemble learning techniques are given below [20]:

1) Boosting:
Boosting is a technique in ensemble learning which is used to decrease bias and variance. Boosting creates a collection of weak learners and converts them into one strong learner. A weak learner is a classifier which is barely correlated with the true classification; a strong learner, on the other hand, is a classifier which is strongly correlated with the true classification [20]. The pseudo code for AdaBoost (the most popular example of boosting) is given in Fig. 14.

Figure .14. Pseudo code for adaboost [20]

2) Bagging: Bagging, or bootstrap aggregating, is applied where the accuracy and stability of a machine learning algorithm need to be increased. It is applicable to classification and regression. Bagging also decreases variance and helps in handling overfitting [7]. The pseudo code for bagging is given in Fig. 15.

Figure.15. Pseudo code for bagging [20]
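Of the two ensemble techniques, bagging is the simpler to sketch: train each weak learner on a bootstrap sample (drawn with replacement) and combine their predictions by majority vote. The one-feature decision stump used as the weak learner and the toy 1D dataset are illustrative choices, not the pseudo code of Fig. 15.

```python
import random
from collections import Counter

def train_stump(X, y):
    """Weak learner: a threshold 'decision stump' on a single numeric feature."""
    best = None
    for t in set(X):
        for sign in (1, -1):
            preds = [sign if x > t else -sign for x in X]
            acc = sum(p == yi for p, yi in zip(preds, y))
            if best is None or acc > best[0]:
                best = (acc, t, sign)
    _, t, sign = best
    return lambda x: sign if x > t else -sign

def bagging(X, y, n_learners=11, seed=0):
    """Train each stump on a bootstrap sample; predict by majority vote."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_learners):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]   # sample with replacement
        stumps.append(train_stump([X[i] for i in idx], [y[i] for i in idx]))
    return lambda x: Counter(s(x) for s in stumps).most_common(1)[0][0]

X = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
y = [-1, -1, -1, 1, 1, 1]
model = bagging(X, y)
print(model(1.0), model(9.0))
```

The vote smooths over any individual stump trained on an unlucky bootstrap sample, which is exactly the variance reduction the text describes.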

G. Neural Network Learning
The neural network (or artificial neural network, ANN) is derived from the biological concept of neurons. A neuron is a cell-like structure in the brain. To understand neural networks, one must understand how a neuron works. A neuron has mainly four parts (see Fig. 16): dendrites, nucleus, soma, and axon.

Figure. 16. A neuron [21]

The dendrites receive electrical signals, and the soma processes them. The output of the process is carried by the axon to the dendrite terminals, where it is sent to the next neuron. The nucleus is the heart of the neuron. The interconnection of neurons is called a neural network, through which electrical impulses travel around the brain.

Figure. 17. Structure of an artificial neural network [21]

An artificial neural network behaves the same way. It works on three layers. The input layer takes input (much like dendrites). The hidden layer processes the input (like the soma and axon). Finally, the output layer sends the calculated output (like the dendrite terminals), as shown in Fig. 17 [21]. There are basically three types of artificial neural network: supervised, unsupervised, and reinforcement [22].

1) Supervised Neural Network: In the supervised neural network, the output for each input is already known. The predicted output of the neural network is compared with the actual output. Based on the error, the parameters are changed and then fed into the neural network again. Fig. 18 summarizes the process. Supervised neural networks are used in feed-forward neural networks.

Figure. 18. Supervised neural network [22]

2) Unsupervised Neural Network: Here, the neural network has no prior clue about the output for the input. The main job of the network is to categorize the data according to some similarities: the network checks the correlation between the various inputs and groups them. The schematic diagram is shown in Fig. 19.

Figure.19. Unsupervised neural network [22]

3) Reinforced Neural Network: In a reinforced neural network, the network behaves as if a human communicates with the environment. From the environment, feedback is provided to the network acknowledging whether the decision taken by the network is right or wrong. If the decision is right, the connection which points to that particular output is strengthened; the connections are weakened otherwise. The network has no previous information about the output. A reinforced neural network is represented in Fig. 20.

Figure .20. Reinforced neural network [22]

H. Instance-Based Learning
In instance-based learning, the learner learns a particular type of pattern and tries to apply the same pattern to newly fed data, hence the name instance-based. It is a type of lazy learner which waits for the test data to arrive and then acts on it together with the training data. The complexity of the learning algorithm increases with the size of the data. Given below is a well-known example of instance-based learning, the k-nearest neighbor algorithm [7].

1) K-Nearest Neighbor: In k-nearest neighbor (KNN), the training data (which is well-labeled) is fed into the learner. When the test data is introduced, the learner compares the two: the k most correlated training points are taken, and the majority class among them serves as the class of the test data [23]. The pseudo code for KNN is given in Fig. 21.
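The majority-vote rule of KNN described above can be sketched in a few lines. The toy two-cluster training set and the helper name `knn_predict` are illustrative, not the pseudo code of Fig. 21.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # sort (point, label) pairs by Euclidean distance to the query
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
         ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B")]
print(knn_predict(train, (1.1, 1.0)))  # -> A
print(knn_predict(train, (5.1, 5.0)))  # -> B
```

Note the "lazy" character the text mentions: there is no training step at all, only a deferred comparison when the query arrives.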

Figure. 21. Pseudo code for k-nearest neighbor [24]

III. DRIVERS OF MACHINE LEARNING PROGRESS

The past decade has seen rapid growth in the ability of networked and mobile computing systems to gather and transport vast amounts of data, a phenomenon often referred to as "Big Data." The scientists and engineers who collect such data have often turned to machine learning for solutions to the problem of obtaining useful insights, predictions, and decisions from such data sets. Indeed, the sheer size of the data makes it essential to develop scalable procedures that blend computational and statistical considerations, but the issue is more than the mere size of modern data sets; it is the granular, personalized nature of much of these data. Mobile devices and embedded computing permit large amounts of data to be gathered about individual humans, and machine-learning algorithms can learn from these data to customize their services to the needs and circumstances of each individual. Moreover, these personalized services can be connected, so that an overall service emerges that takes advantage of the wealth and diversity of data from many individuals while still customizing to the needs and circumstances of each. Instances of this trend toward capturing and mining large quantities of data to improve services and productivity can be found across many fields of commerce, science, and government. Historical medical records are used to discover which patients will respond best to which treatments; historical traffic data are used to improve traffic control and reduce congestion; historical crime data are used to help allocate local police to specific locations at specific times; and large experimental data sets are captured and curated to accelerate progress in biology, astronomy, neuroscience, and other data-intensive empirical sciences. We appear to be at the beginning of a decades-long trend toward increasingly data-intensive, evidence-based decision making across many aspects of science, commerce, and government. With the increasing prominence of large-scale data in all areas of human endeavor has come a wave of new demands on the underlying machine-learning algorithms. For example, huge data sets require computationally tractable algorithms, highly personal data raise the need for algorithms that minimize privacy effects, and the availability of huge quantities of unlabeled data raises the challenge of designing learning algorithms to take advantage of it. The next sections survey some of the effects of these demands on recent work in machine-learning algorithms, theory, and practice.

IV. CORE METHODS AND RECENT PROGRESS

One high-impact area of progress in supervised learning in recent years involves deep networks, which are multilayer networks of threshold units, each of which computes some simple parameterized function of its inputs [25, 26]. Deep learning systems make use of gradient-based optimization algorithms to adjust parameters throughout such a multilayered network based on errors at its output. Exploiting modern parallel computing architectures, such as graphics processing units originally developed for video gaming, it has been possible to build deep learning systems that contain billions of parameters and that can be trained on the very large collections of images, videos, and speech samples available on the Internet. Such large-scale deep learning systems have had a major effect in recent years in computer vision [27] and speech recognition [28], where they have yielded major improvements in performance over previous approaches, as shown in Fig. 22. Deep network methods are being actively pursued in a variety of additional applications, from natural language translation to collaborative filtering.

Figure.22. Automatic generation of text captions for images with deep networks

V. EMERGING TREND

The field of machine learning is sufficiently young that it is still rapidly expanding, often by inventing new formalizations of machine-learning problems driven by practical applications. An example is the development of recommendation systems, as described in Fig. 23. A recommendation system is a machine-learning system that is based on data that indicate links between a set of users (e.g., people) and a set of items (e.g., products). A link between a user and a product means that the user has indicated an interest in the product in some fashion (perhaps by purchasing that item in the past). The machine learning problem is to suggest other items to a given user that he or she may also be interested in, based on the data across all users.
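The user-item link idea can be sketched with a simple co-occurrence recommender: score each item the target user does not yet have by how many items its owners share with the target user. Production systems use matrix factorization or neighborhood models, so this is only an illustrative toy with made-up user and item names.

```python
from collections import Counter

def recommend(purchases, user, top_n=2):
    """Suggest unseen items, weighted by overlap with each other user's basket."""
    owned = purchases[user]
    scores = Counter()
    for other, items in purchases.items():
        if other == user:
            continue
        overlap = len(owned & items)       # how similar this user is to the target
        if overlap:
            for item in items - owned:     # only items the target does not own yet
                scores[item] += overlap
    return [item for item, _ in scores.most_common(top_n)]

purchases = {
    "alice": {"book", "pen"},
    "bob":   {"book", "pen", "lamp"},
    "carol": {"pen", "lamp", "mug"},
}
print(recommend(purchases, "alice"))  # -> ['lamp', 'mug']
```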

VI. OPPORTUNITIES AND CHALLENGES

Despite its practical and commercial successes, machine learning remains a young field with many underexplored research opportunities. Some of these opportunities can be seen by contrasting current machine-learning approaches to the types of learning we observe in naturally occurring systems such as humans and other animals, organizations, economies, and biological evolution.

For example, whereas most machine-learning algorithms are targeted to learn one specific function or data model from one single data source, humans clearly learn many different skills and types of knowledge from years of diverse training experience, supervised and unsupervised, in a simple-to-more-difficult sequence (e.g., learning to crawl, then walk, then run). This has led some researchers to begin exploring the question of how to construct computer lifelong or never-ending learners that operate nonstop for years, learning thousands of interrelated skills or functions within an overall architecture that allows the system to improve its ability to learn one skill based on having learned another [29, 30]. Another aspect of the analogy to natural learning systems suggests the idea of team-based, mixed-initiative learning. For example, whereas current machine learning systems typically operate in isolation to analyze the given data, people often work in teams to collect and analyze data (e.g., biologists have worked as teams to collect and analyze genomic data, bringing together diverse experiments and perspectives to make progress on this difficult problem). New machine-learning methods capable of working collaboratively with humans to jointly analyze complex data sets might bring together the abilities of machines to tease out subtle statistical regularities from massive data sets with the abilities of humans to draw on diverse background knowledge to generate plausible explanations and suggest new hypotheses. Many theoretical results in machine learning apply to all learning systems, whether they are computer algorithms, animals, organizations, or natural evolution. As the field progresses, we may see machine-learning theory and algorithms increasingly providing models for understanding learning in neural systems, organizations, and biological evolution, and see machine learning benefit from ongoing studies of these other types of learning systems.

Figure. 23. Recommendation systems

VII. CONCLUSION

This paper surveys various machine learning algorithms and the progress and trends in machine learning, with their applications. Today, each and every person uses machine learning, knowingly or unknowingly, from getting a recommended product in online shopping to updating photos on social networking sites. This paper gives an introduction to most of the popular machine learning algorithms and to emerging trends.

VIII. REFERENCES

[1]. W. Richert, L. P. Coelho, "Building Machine Learning Systems with Python", Packt Publishing Ltd., ISBN 978-1-78216-140-0.

[2]. M. Welling, "A First Encounter with Machine Learning"

[3]. M. Bowles, "Machine Learning in Python: Essential Techniques for Predictive Analytics", John Wiley & Sons Inc., ISBN 978-1-118-96174-2

[4]. S. B. Kotsiantis, "Supervised Machine Learning: A Review of Classification Techniques", Informatica 31 (2007) 249-268

[5]. L. Rokach, O. Maimon, "Top-Down Induction of Decision Trees Classifiers – A Survey", IEEE Transactions on Systems, Man, and Cybernetics

[6]. D. Lowd, P. Domingos, "Naïve Bayes Models for Probability Estimation"

[7]. Ayon Dey, "Machine Learning Algorithms: A Review", International Journal of Computer Science and Information Technologies, Vol. 7 (3), ISSN 0975-9646, 1174-1179, 2016

[8]. D. Meyer, "Support Vector Machines – The Interface to libsvm in package e1071", August 2015

[9]. S. S. Shwartz, Y. Singer, N. Srebro, "Pegasos: Primal Estimated sub-Gradient Solver for SVM", Proceedings of the 24th International Conference on Machine Learning, Corvallis, OR, 2007

[10]. http://www.simplilearn.com/what-is-machine-learning-and-why-it-matters-article

[11]. P. Harrington, "Machine Learning in Action", Manning Publications Co., Shelter Island, New York, 2012

[12]. K. Alsabati, S. Ranaka, V. Singh, "An efficient k-means clustering algorithm", Electrical Engineering and Computer Science, 1997

[13]. M. Andrecut, "Parallel GPU Implementation of Iterative PCA Algorithms", Institute of Biocomplexity and Informatics, University of Calgary, Canada, 2008

[14]. X. Zhu, A. B. Goldberg, "Introduction to Semi-Supervised Learning", Synthesis Lectures on Artificial Intelligence and Machine Learning, 2009, Vol. 3, No. 1, Pages 1-130

[15]. X. Zhu, "Semi-Supervised Learning Literature Survey", Computer Sciences, University of Wisconsin-Madison, No. 1530, 2005

[16]. R. S. Sutton, "Introduction: The Challenge of Reinforcement Learning", Machine Learning, 8, Pages 225-227, Kluwer Academic Publishers, Boston, 1992

[17]. L. P. Kaelbling, M. L. Littman, A. W. Moore, "Reinforcement Learning: A Survey", Journal of Artificial Intelligence Research, 4, Pages 237-285, 1996

[18]. R. Caruana, "Multitask Learning", Machine Learning, 28, 41-75, Kluwer Academic Publishers, 1997

[19]. D. Opitz, R. Maclin, "Popular Ensemble Methods: An Empirical Study", Journal of Artificial Intelligence Research, 11, Pages 169-198, 1999

[20]. Z. H. Zhou, "Ensemble Learning", National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China

[21]. V. Sharma, S. Rai, A. Dev, "A Comprehensive Study of Artificial Neural Networks", International Journal of Advanced Research in Computer Science and Software Engineering, ISSN 2277-128X, Volume 2, Issue 10, October 2012

[22]. S. B. Hiregoudar, K. Manjunath, K. S. Patil, "A Survey: Research Summary on Neural Networks", International Journal of Research in Engineering and Technology, ISSN 2319-1163, Volume 03, Special Issue 03, Pages 385-389, May 2014

[23]. P. Harrington, "Machine Learning in Action", Manning Publications Co., Shelter Island, New York, ISBN 9781617290183, 2012

[24]. J. M. Keller, M. R. Gray, J. A. Givens Jr., "A Fuzzy K-Nearest Neighbor Algorithm", IEEE Transactions on Systems, Man and Cybernetics, Vol. SMC-15, No. 4, August 1985.

[25]. J. Schmidhuber, Neural Networks, 61, 85-117, 2015.

[26]. Y. Bengio, "Foundations and Trends in Machine Learning", 2 (Now Publishers, Boston, 2009), pp. 1-127.

[27]. A. Krizhevsky, I. Sutskever, G. Hinton, Advances in Neural Information Processing Systems, 25, 1097-1105 (2012).

[28]. G. Hinton, IEEE Signal Processing Magazine, 29, 82-97 (2012).

[29]. T. Mitchell, Proceedings of the Twenty-Ninth Conference on Artificial Intelligence (AAAI-15), 25 to 30 January 2015, Austin, TX.

[30]. S. Thrun, L. Pratt, Learning To Learn (Kluwer Academic Press, Boston, 1998).

