How To Select A Suitable Machine Learning Algorithm
Abstract: The increasing availability of data gathered from various sources and in several contexts is forcing practitioners to find affordable ways to manage and exploit datasets. Within this context, machine learning (ML) - which can be described as a set of algorithms that analyse and process data to extract relevant features for clustering, classification or prediction - has emerged as one of the most investigated areas providing powerful tools. Indeed, in the literature it is possible to find a considerable number of articles dealing with ML algorithms and describing their real-world applications. This considerable body of work, depicting a wide variety of algorithms and widespread applications, creates extensive knowledge on the topic. At the same time, it may also generate disorientation in the selection of the right approach. Thus, the need arises for synthesis and guidelines to drive the selection of the most suitable algorithm for a specific scope. To respond to this necessity, the authors propose a ML algorithm selection tool. As a starting point, the authors analysed several ML algorithms, investigating their scope, their characteristics, and their typical fields of application, including real examples. Based on this exploration, the authors identified two decision layers: the first concerns the nature of the learning activity (supervised, unsupervised, etc.), while the second is related to the characteristics of the ML algorithms (type of response, data size and type they can manage, etc.). Starting from a pool of algorithms, the first layer enables users to narrow this pool depending on their scope. Then, the second layer guides the final selection, fitting the users' constraints, the previously mentioned algorithm features, and the data characteristics.
Keywords: machine learning; classification; selection framework; data analysis; decision making
the characteristics of each model and that s/he computes the value for each single model under consideration. In this case, the model selection could be biased by the initial pool of algorithms selected by the user, which could not include the most suitable ones.
The definition of the validation strategy for the proposed framework requires a detailed discussion, since the problem under study is non-trivial. The validation strategy should be developed considering the characteristics of both the traditional methods and the new framework, making it possible to identify the pros and cons of each one and, in turn, to understand when one performs better than the others. For this reason, the validation of the framework is postponed to another paper.
The paper is structured as follows: Section 2 introduces the concepts of ML, discussing the main characteristics of the Supervised and Unsupervised learning approaches, and presenting a set of Supervised and Unsupervised algorithms. Section 3 deals with the definition of the framework, describing the drivers composing it and how they should be applied to datasets. Section 4 concludes the paper, discussing the main benefits and limits of the framework.
2. Machine Learning

Mishra and Gupta (2017) define ML as algorithms that, through the automatic association of events and their consequences, allow making accurate predictions based on a database of past observations.
The literature distinguishes Active Learning approaches from Passive Learning approaches. The main idea behind Active Learning is that it is a learning process where the ML algorithm is allowed to select the data from which it learns and, by that, is able to perform better with less training (Liu, 2010). On the contrary, Passive Learning refers to an ensemble of approaches, namely Supervised, Unsupervised and Semi-Supervised, which use random samples from the dataset to build models that can perform prediction, classification, clustering or other tasks depending on the dataset composition (Mishra and Gupta, 2017). The difference among the three Passive Learning approaches is based on the presence of labels in the dataset.
Labels are constituted by one or more tags that contain desirable information on the data and favour their recognition. Thus, data can be classified as labelled or unlabelled. In particular, if the algorithm is trained on the basis of labelled data, the approach is identified as “Supervised Learning”; otherwise, in the case of unlabelled data, the approach is called “Unsupervised Learning”. In addition, if the training dataset contains both labelled and unlabelled data, the approach is called “Semi-Supervised Learning”. The presented framework deals only with Supervised and Unsupervised learning approaches.
2.1 Supervised Learning

In supervised learning, the idea is to exploit the information emerging from the data distribution and from the external knowledge – the labels – to create a model able to make predictions. Supervised learning approaches can be used for regression or classification purposes. In the first case, the idea is to use past data to build a regression model able to predict the future behaviour of the dataset. In the second case, the aim of the model is to classify data into specific classes and, based on that, assign new data to the correct class. In essence, the supervised learning process aims to construct a mapping function conditioned to the provided training data set (Christiano Silva and Zhao, 2016). The following list reports the algorithms considered for classification purposes (a brief illustrative code sketch follows each description):
- Logit: this algorithm is suited for binary classifications. The logistic regression algorithm calculates the class membership probability for one of the two categories in a dataset. It is best suited for data clearly separated by a single, linear boundary (Dreiseitl and Ohno-Machado, 2002; Smola and Vishwanathan, 2008);

- Multinomial Logit: in the multinomial situation, there are categorical response variables that can assume more than two outcomes. The algorithm basically estimates the class probabilities for a multi-category response, which are then used to classify new cases into one of several outcome groups (Dreiseitl and Ohno-Machado, 2002; Smola and Vishwanathan, 2008);
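As an illustration of both variants, the following minimal sketch fits a binary and a multinomial logistic regression with scikit-learn; the synthetic data and all parameter values are assumptions made here for illustration, not part of the original study.

```python
# Minimal sketch (assumed setup): binary and multinomial logistic
# regression with scikit-learn on synthetic data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Binary case: two classes separated by a roughly linear boundary.
X, y = make_classification(n_samples=500, n_features=4, n_classes=2,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
logit = LogisticRegression().fit(X_tr, y_tr)
print("binary accuracy:", logit.score(X_te, y_te))
print("class probabilities:", logit.predict_proba(X_te[:1]))

# Multinomial case: more than two outcome groups.
Xm, ym = make_classification(n_samples=500, n_features=4, n_classes=3,
                             n_informative=3, random_state=0)
mlogit = LogisticRegression(multi_class="multinomial").fit(Xm, ym)
print("multinomial probabilities:", mlogit.predict_proba(Xm[:1]))
```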
- Classification Trees: it is a non-parametric algorithm. For this reason, its performance is not affected by the presence of outliers. Even though it can handle a wide variety of input data, this algorithm is not suitable for high-dimensional datasets. It has a flow-chart structure, where each node represents a test and each leaf represents the response. It is easy to visualize, and results are easy to interpret (Singh et al., 2016);
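A minimal sketch of such a tree, assuming scikit-learn and the iris dataset; the depth limit is an illustrative choice:

```python
# Minimal sketch (assumed setup): a classification tree whose
# flow-chart structure can be printed and inspected node by node.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(data.data, data.target)

# Each internal node is a test on a feature; each leaf is a response.
print(export_text(tree, feature_names=list(data.feature_names)))
```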
- Support Vector Machines (SVM) Classification: the algorithm aims at classifying data by finding a linear decision boundary (called hyperplane) which separates the data classes, specifically the hyperplane that has the largest margin between the two classes. For non-linear situations, the algorithm considers a loss function that penalizes the points on the wrong side of the hyperplane. Sometimes this algorithm uses a kernel to transform nonlinearly separable data into higher dimensions where a linear decision boundary can be found. The SVM is suitable for binary data, but discrete data can also be used as input. High-dimensional data can be managed easily. The algorithm performance decreases in the presence of noise (Kotsiantis et al., 2007);
- k-Nearest Neighbour: this algorithm categorizes an object depending on the classes of its nearest neighbours in the dataset. Consequently, the algorithm assumes that objects that are close to each other are similar. The algorithm can be trained using different distance metrics (e.g. Euclidean, Chebyshev, etc.). It can work with binary and discrete variables, but its performance is strongly affected by the data size and by the presence of outliers and noise (Kotsiantis et al., 2007);
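As a sketch, the distance-metric choice mentioned above maps directly onto an estimator parameter in scikit-learn; the metrics and data below are assumptions for illustration:

```python
# Minimal sketch (assumed setup): k-NN classification with two of the
# distance metrics mentioned in the text (Euclidean and Chebyshev).
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=300, n_features=5, random_state=0)

for metric in ("euclidean", "chebyshev"):
    knn = KNeighborsClassifier(n_neighbors=5, metric=metric).fit(X, y)
    print(metric, "training accuracy:", knn.score(X, y))
```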
- (Multilayer) Neural Network: this algorithm consists of a set of simple, interconnected computation units called neurons, organized into layers with different roles, called input, output and hidden layers, respectively. The number of hidden layers depends upon the model complexity. The neurons are connected via weighted links, and the way the neurons are connected defines different types of Neural Networks. A Neural Network is trained iteratively to find the right weights for the links. It best fits the modelling of highly nonlinear systems, when data are available incrementally and there is a constant need to update the model. Neural Networks can deal with noise and outliers in the dataset (Singh et al., 2016).
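A minimal sketch of such a multilayer network, assuming scikit-learn's MLPClassifier and an arbitrary two-hidden-layer topology:

```python
# Minimal sketch (assumed setup): a multilayer perceptron classifier
# with two hidden layers, trained iteratively by backpropagation.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# hidden_layer_sizes fixes the topology; max_iter bounds the
# iterative weight updates.
mlp = MLPClassifier(hidden_layer_sizes=(20, 10), max_iter=1000,
                    random_state=0).fit(X, y)
print("training accuracy:", mlp.score(X, y))
```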
The following list reports the algorithms considered for regression purposes (again, an illustrative sketch follows each description):

- Regression Trees: unlike Classification Trees, Regression Trees can handle both categorical and continuous variables. This algorithm is suitable when data have many features interacting in complicated and nonlinear ways. It sub-divides the space into smaller regions, further partitions the sub-divisions, and assigns them to its nodes (leaves), where interactions are more manageable. As for Classification Trees, nodes are subdivided into leaf nodes, which contain the responses (Yildiz et al., 2017);
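A sketch of a regression tree follows; the noisy sine data are an assumption chosen to expose the piecewise, region-by-region predictions described above:

```python
# Minimal sketch (assumed setup): a regression tree partitions the
# input space into regions and predicts a constant value per region.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 6, 200)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.1, 200)

reg_tree = DecisionTreeRegressor(max_depth=4).fit(X, y)
print("predictions on a grid:",
      reg_tree.predict(np.array([[0.5], [2.0], [4.5]])).round(2))
```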
- SVM Regression: this kind of regression is very similar to the classification algorithm, but it is designed to predict a continuous response. It does not find a hyperplane to separate data; instead, it searches for a model that deviates from the measured data by no more than a small amount, keeping the model parameters as small as possible in order to minimize the sensitivity to error. It is usually used for high-dimensional data, where there is a large number of predictor variables (Yildiz et al., 2017);
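In scikit-learn terms, the tolerated "small amount" of deviation is the epsilon parameter of epsilon-insensitive SVR; the sketch below, with assumed data and epsilon value, makes that mapping explicit:

```python
# Minimal sketch (assumed setup): epsilon-SVR tolerates deviations up
# to `epsilon` around the regression function without penalty.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (200, 1))
y = 0.5 * X.ravel() + rng.normal(0, 0.1, 200)

svr = SVR(kernel="linear", epsilon=0.1).fit(X, y)
print("prediction at x=1:", svr.predict([[1.0]]))
```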
- k-Nearest Neighbour: it can be used in the case of continuous data labels. The value of the parameter k influences the prediction variance: when it is small there is a high variance in prediction, while when it is high there is a large bias. The scale of the features also influences the quality of KNN regression predictions (Hidalgo et al., 2017);
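The bias-variance trade-off governed by k can be observed directly; in this sketch the data and the k values are illustrative assumptions:

```python
# Minimal sketch (assumed setup): small k follows the noise (high
# variance), large k over-smooths (high bias).
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 6, 150)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.2, 150)

for k in (1, 5, 50):
    knn = KNeighborsRegressor(n_neighbors=k).fit(X, y)
    print(f"k={k:2d} training R^2:", round(knn.score(X, y), 3))
```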
- (Multilayer) Neural Network: this algorithm is similar to the one used for classification purposes. The main difference from the classification version of the Neural Network is that, while in the first case the output is constituted by a discrete value (the class), here the output is constituted by a continuous value. Neural Networks can deal with noise and outliers in the dataset (Singh et al., 2016).
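The only structural change with respect to the classification sketch above is an estimator that produces a continuous output; the regressor and data are again assumptions:

```python
# Minimal sketch (assumed setup): same multilayer architecture as in
# classification, but with a continuous output value.
from sklearn.datasets import make_regression
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=5.0,
                       random_state=0)
mlp = MLPRegressor(hidden_layer_sizes=(20, 10), max_iter=2000,
                   random_state=0).fit(X, y)
print("continuous prediction:", mlp.predict(X[:1]))
```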
2.2 Unsupervised Learning

In the unsupervised learning case, the main task consists in finding intrinsic data structures. The learning process, in this case, is solely guided by the data relationships, as no labels are available for the data analysis (Mitchell, 1997). The following list reports the algorithms considered for clustering purposes (an illustrative sketch follows most descriptions):

- Fuzzy C-Means (FCM): it uses fuzzy algorithms to handle and analyse data. It is useful when a data point can belong to more than one cluster. The number of clusters should be known. It is not suitable for datasets with noise or outliers but can handle large datasets (Havens et al., 2012);
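FCM is not part of scikit-learn, so the following sketch implements its two alternating update rules (fuzzy memberships, then centroids) directly in NumPy; the fuzzifier m = 2 and the synthetic data are assumptions:

```python
# Minimal sketch (assumed setup): fuzzy c-means via its two
# alternating updates; each point gets a membership in every cluster.
import numpy as np

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (100, 2)), rng.normal(3, 0.5, (100, 2))])
c, m = 2, 2.0                      # number of clusters, fuzzifier
U = rng.random((len(X), c))
U /= U.sum(axis=1, keepdims=True)  # memberships sum to 1 per point

for _ in range(50):
    W = U ** m
    centroids = (W.T @ X) / W.sum(axis=0)[:, None]    # weighted means
    d = np.linalg.norm(X[:, None, :] - centroids, axis=2) + 1e-9
    U = 1.0 / (d ** (2 / (m - 1)))                    # inverse-distance
    U /= U.sum(axis=1, keepdims=True)                 # renormalize

print("soft memberships of first point:", U[0].round(3))
```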
- Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH): it is a hierarchical algorithm that takes as input a set of N data points and a desired number of clusters k. BIRCH relies on the use of clustering feature (CF) vectors to store and summarize the information about each cluster. It organizes these vectors in a CF tree, which is a height-balanced tree data structure. It is suitable for large datasets and is robust to outliers and noise. Its disadvantages include a difficulty in finding arbitrarily shaped clusters (Pitolli et al., 2017);
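scikit-learn provides a Birch estimator built on the CF tree just described; in this sketch the threshold and the synthetic blobs are assumed values:

```python
# Minimal sketch (assumed setup): BIRCH summarizes points into a CF
# tree, then reduces the leaves to the desired number of clusters.
from sklearn.cluster import Birch
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=10_000, centers=3, random_state=0)

# threshold controls the radius of each CF subcluster.
birch = Birch(threshold=0.5, n_clusters=3).fit(X)
print("cluster labels of first points:", birch.labels_[:10])
```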
- Clustering Using Representatives (CURE): it is an improvement of the BIRCH algorithm, since it makes it possible to find clusters of arbitrary shapes. CURE is also more robust with respect to outliers and scalable to large datasets. These benefits are achieved by using several representative objects for a cluster. At each iteration, the two clusters with the closest pair of representative objects are merged. A drawback is the need for user-specified parameter values, namely the number of clusters and the shrinking factor (Guha et al., 1998);
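CURE is not available in scikit-learn; one common option is the pyclustering package, whose API is sketched below under the assumption that the package is installed and exposes the parameters shown. All values are illustrative:

```python
# Minimal sketch (assumed setup): CURE via the pyclustering package;
# number of clusters, representative points and shrinking factor are
# the user-specified parameters mentioned in the text.
from pyclustering.cluster.cure import cure
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=500, noise=0.05, random_state=0)

# 2 clusters, 5 representatives per cluster, shrinking factor 0.5
# (parameter meanings follow Guha et al., 1998).
cure_instance = cure(X.tolist(), 2, number_represent_points=5,
                     compression=0.5)
cure_instance.process()
print("cluster sizes:", [len(c) for c in cure_instance.get_clusters()])
```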
- RObust Clustering using linKs (ROCK): it assumes a similarity measure between objects and defines a 'link' between two objects whose similarity exceeds a threshold. Initially, each object is assigned to a separate cluster. Then, clusters are merged repeatedly according to their closeness. This algorithm is not able to handle large datasets properly and is not robust to outliers (Guha et al., 2000);
- k-Means: it divides data into k mutually exclusive clusters. The distance from the cluster centre defines the probability of belonging to it. It fits large datasets, but its performance decreases when outliers are present in the dataset (Pham et al., 2005);
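A minimal k-means sketch with scikit-learn; the value of k and the blob data are assumptions:

```python
# Minimal sketch (assumed setup): k-means assigns every point to
# exactly one of k clusters around computed centroids.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=1_000, centers=4, random_state=0)

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
print("centroids:", kmeans.cluster_centers_.round(2))
```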
- k-Medoids or Partitioning Around Medoids (PAM): it is similar to k-Means. The term k represents the number of medoids to be identified and the number of clusters that are required. A medoid is an object whose average dissimilarity from the other objects in the cluster is minimal, and for this reason it is used as the representative object (the centre) of the cluster. The algorithm processes and assigns n data points to k clusters with k medoids. In contrast to the k-Means algorithm, k-Medoids uses a data point in the cluster as the cluster centre, while in k-Means the centroid is calculated and may not coincide with a data point in the cluster. This makes the k-Medoids algorithm more robust in handling noise and outliers, because it minimizes a sum of pairwise dissimilarities instead of a sum of Euclidean distances (Shafiq and Torunski, 2016).
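k-Medoids is not in core scikit-learn; the sketch below assumes the companion scikit-learn-extra package, and the distance metric is an illustrative choice:

```python
# Minimal sketch (assumed setup): PAM-style k-medoids, where each
# cluster centre is an actual data point (a medoid).
from sklearn.datasets import make_blobs
from sklearn_extra.cluster import KMedoids

X, _ = make_blobs(n_samples=500, centers=3, random_state=0)

kmedoids = KMedoids(n_clusters=3, metric="manhattan", method="pam",
                    random_state=0).fit(X)
print("medoids (actual data points):", kmedoids.cluster_centers_.round(2))
```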
Based on these definitions, the framework is proposed in the following section.
3. Framework

In the selection of a suitable ML algorithm for data analysis, many aspects should be taken into account, and most of the time selecting an algorithm only on the basis of the promised accuracy or computational speed leads to unsatisfactory results (Andreopoulos et al., 2009; Kotsiantis et al., 2007; Saxena et al., 2017). Moreover, the presence in the literature of a wide range of different algorithms can create disorientation in users not acquainted with their known strengths and weaknesses. This paper proposes a selection framework that works on two different layers, each one linked to a different aspect of the analysis, to guide the user in the selection of the ML algorithms suitable for the analysis of a specific dataset. The first layer of the ML algorithm selection is based on the presence of labels in the dataset and on the scope of the analysis (i.e. the learning activity). In this way, the user is guided towards Supervised or Unsupervised Learning algorithms. Then, in the case of a Supervised approach, the user is requested to indicate whether s/he is interested in a classification or a regression analysis; in the case of Unsupervised Learning, only clustering is proposed. In the second layer, four more drivers guide the user in the identification of proper ML algorithms. The drivers have been identified after a literature review of ML application cases. The results are reported in Table 1, which contains the list of drivers, their description and the list of papers used for their identification. It is worth noticing that some papers are associated with multiple drivers, in some cases with every driver. This underpins the importance of these drivers, strengthening the fact that they should be taken into consideration during the ML algorithm selection phase.
Figure 1 depicts the two layers of the framework and the related selection drivers, while Table 2 shows the list of algorithms classified based on those drivers. As explained earlier, by following this framework the users should be able to identify one or more ML algorithms suitable for their scope (drivers Learning and Learning Activity) and the dataset characteristics (drivers Data Type, Scalability, Robustness to Outliers/Noise and Response Type).

Figure 1: Machine Learning Algorithm Selection Framework
In order to be effective, the proposed selection framework has to be applied in a precise sequence. In particular, at the beginning, the first layer requires the user to select the learning approach and, in turn, the learning activity. Then, the second selection layer requires the identification of the data characteristics to understand which of the available algorithms should be used for the analysis. As mentioned above, the user should consider the nature of the data in order to avoid algorithms unable to deal with the dataset. Thus, knowing the dataset dimension, the user is required to indicate whether it has a limited number of features, and so can be handled by most of the algorithms, or requires specific algorithms whose performance is not affected by the number of features. Moreover, the user should consider the possibility of having outliers or noise in the dataset and clarify whether their presence is a problem for the analysis. If it is, the framework removes the algorithms not able to manage the presence of outliers and/or noise. Finally, the framework requires the user to specify the type of response required as output of the analysis.
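To make this sequence concrete, the following sketch encodes the two layers as successive filters over a small algorithm catalogue; the catalogue entries and driver values are illustrative assumptions distilled from Section 2, not a reproduction of the authors' Table 2.

```python
# Minimal sketch (assumed setup): the two-layer selection expressed as
# successive filters over an illustrative algorithm catalogue.
# Driver values are simplified readings of Section 2, not Table 2.
CATALOGUE = [
    {"name": "Logit", "learning": "supervised",
     "activity": "classification", "scalable": True, "robust": True},
    {"name": "SVM Classification", "learning": "supervised",
     "activity": "classification", "scalable": True, "robust": False},
    {"name": "k-Means", "learning": "unsupervised",
     "activity": "clustering", "scalable": True, "robust": False},
    {"name": "BIRCH", "learning": "unsupervised",
     "activity": "clustering", "scalable": True, "robust": True},
]

def select(learning, activity, need_scalability, outliers_are_a_problem):
    # Layer 1: learning approach and learning activity.
    pool = [a for a in CATALOGUE
            if a["learning"] == learning and a["activity"] == activity]
    # Layer 2: dataset characteristics narrow the pool further.
    if need_scalability:
        pool = [a for a in pool if a["scalable"]]
    if outliers_are_a_problem:
        pool = [a for a in pool if a["robust"]]
    return [a["name"] for a in pool]

# Example: a large, noisy, unlabelled dataset to be clustered.
print(select("unsupervised", "clustering", True, True))  # -> ['BIRCH']
```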
4. Conclusions

This paper presented a selection framework aimed at guiding users in the selection of one or more suitable ML algorithms to use in their analyses. The proposed selection framework does not claim to be complete with respect to the available literature on the topic, or to suggest the best algorithm to the user, due to the vast complexity characterising the ML field of research and application. Moreover, this paper deals only with the framework development and not with its validation, due to the complexity of the problem and due to space limitations.

This selection framework aims at supporting researchers who are approaching the ML field, guiding them through the selection of a set of algorithms suitable for their scopes and helping them avoid the application of algorithms which are not suitable for the data they are dealing with. The selection framework covers some of the most commonly used Supervised and Unsupervised ML algorithms. The drivers presented in the selection framework constitute a solid base for the selection of proper ML algorithms, since they are easily recognizable and do not necessitate deep analysis to be identified.

The research presented in this paper is not free from limitations and possible future improvements. First, the current pool of ML algorithms lacks Semi-Supervised and Active Learning approaches. Furthermore, more classification algorithms, as well as more regression algorithms, could be considered in the framework. Examples of these algorithms are generalized linear models, Bayesian networks, linear and quadratic discriminant analysis, Gaussian processes, etc. Also, the pool of unsupervised clustering algorithms could be extended to include new algorithms such as affinity propagation, spectral clustering, Gaussian mixtures, agglomerative clustering, etc. Moreover, the level of detail used for each type of algorithm could be improved (e.g. the variants of Classification Trees and of Neural Networks, the different kernels available for the SVM, etc.). Second, the drivers list could be extended considering other characteristics of the data, such as the training time, the required parameters, etc. Third, due to the number of algorithms currently available in the framework and the possible combinations of dataset characteristics, in some cases the framework may be unable to suggest a suitable ML algorithm. Fourth, the framework currently does not take into consideration the application of data manipulation techniques that could modify the dataset structure in the pre-processing phase and, in turn, extend the pool of algorithms to be taken into account for the analysis.

Future work will encompass multiple aspects not considered in this paper, starting from the validation strategy, which should be chosen carefully to be effective, and continuing with the selection framework extension and refinement.
References

Dreiseitl, S. and Ohno-Machado, L. (2002), "Logistic regression and artificial neural network classification models: a methodology review", Journal of Biomedical Informatics, Vol. 35 No. 5–6, pp. 352–359.

European Commission. (2016), "Digitising European Industry: Reaping the full benefits of a Digital Single Market".

Guha, S., Rastogi, R. and Shim, K. (1998), "CURE: an efficient clustering algorithm for large databases", ACM SIGMOD Record, Vol. 27 No. 2, pp. 73–84.

Guha, S., Rastogi, R. and Shim, K. (2000), "Rock: a robust clustering algorithm for categorical attributes", Information Systems, Vol. 25 No. 5, pp. 345–366.

Hannan, E.J. and Quinn, B.G. (1979), "The Determination of the Order of an Autoregression", Journal of the Royal Statistical Society. Series B (Methodological), Vol. 41 No. 2, pp. 190–195.

Havens, T.C., Bezdek, J.C., Leckie, C., Hall, L.O. and Palaniswami, M. (2012), "Fuzzy c-Means algorithms for very large data", IEEE Transactions on Fuzzy Systems, Vol. 20 No. 6, pp. 1130–1146.

Hidalgo, J.I., Colmenar, J.M., Kronberger, G., Winkler, S.M., Garnica, O. and Lanchares, J. (2017), "Data Based Prediction of Blood Glucose Concentrations Using Evolutionary Methods", Journal of Medical Systems, Vol. 41 No. 9.

Kotsiantis, S.B. (2007), "Supervised machine learning: a review of classification techniques", Informatica, Vol. 31 No. 3, pp. 249–268.

Kotsiantis, S.B., Zaharakis, I. and Pintelas, P. (2007), "Supervised machine learning: A review of classification techniques", Vol. 31, pp. 249–268.

Liu, Y. (2010), "Active learning literature survey", Computer Sciences Technical Report 1648.

Microsoft. (2017), "The Machine Learning Algorithm Cheat Sheet", available at: https://docs.microsoft.com/en-us/azure/machine-learning/studio/algorithm-choice#the-machine-learning-algorithm-cheat-sheet.

Mishra, C. and Gupta, D.L. (2017), "Deep Machine Learning and Neural Networks: An Overview", IAES International Journal of Artificial Intelligence (IJ-AI), Vol. 6 No. 2, pp. 66–73.

Mitchell, T.M. (1997), Machine Learning, McGraw-Hill Science/Engineering/Math.

Pham, D.T., Dimov, S.S. and Nguyen, C.D. (2005), "Selection of K in K-means clustering", Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical Engineering Science, Vol. 219 No. 1, pp. 103–119.

Pitolli, G., Aniello, L., Laurenza, G., Querzoni, L. and Baldoni, R. (2017), "Malware family identification with BIRCH clustering", Proceedings - International Carnahan Conference on Security Technology, Vol. 2017-October.

Saxena, A., Prasad, M., Gupta, A., Bharill, N., Patel, O.P., Tiwari, A., Er, M.J., et al. (2017), "A review of clustering techniques and developments", Neurocomputing, Vol. 267, pp. 664–681.

Schwarz, G. (1978), "Estimating the Dimension of a Model", The Annals of Statistics, Vol. 6 No. 2, pp. 461–464.

Shafiq, M.O. and Torunski, E. (2016), "A Parallel K-Medoids Algorithm for Clustering based on MapReduce", 2016 15th IEEE International Conference on Machine Learning and Applications, pp. 502–507.

Singh, A., Thakur, N. and Sharma, A. (2016), "A Review of Supervised Machine Learning Algorithms", 2016 International Conference on Computing for Sustainable Global Development (INDIACom), pp. 1310–1315.

Smola, A. and Vishwanathan, S.V.N. (2008), Introduction to Machine Learning.

Teng, S.-H. (2018), "Scalable Algorithms in the Age of Big Data and Network Sciences", Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining - WSDM '18, pp. 6–7.

Yildiz, B., Bilbao, J.I. and Sproul, A.B. (2017), "A review and analysis of regression and machine learning models on commercial building electricity load forecasting", Renewable and Sustainable Energy Reviews, Vol. 73, pp. 1104–1122.