
Machine learning techniques using python for data


analysis in performance evaluation

J.V.N. Lakshmi
Acharya Institute of Management and Sciences,
MCA Department,
Bangalore, 560 058, India
Email: [email protected]

Abstract: Machine learning algorithms are used to parallelise workloads. To achieve the best performance, the required parameters of each algorithm are tuned, and jobs are implemented with machine learning techniques using various parameters. Performance is examined by executing various features and verifying time constraints depending on the assignments for a cluster. An attempt is made to obtain minimum execution time by implementing the machine learning algorithms in the Python language. Supervised and unsupervised machine learning techniques are compared in terms of performance evaluation and time efficiency. Linear regression, logistic regression and K-means clustering are used to evaluate the data analytic jobs. This implementation reveals better performance of supervised algorithms over unsupervised ones for data analysis. The paper analyses the machine learning techniques and evaluates the time taken by the various methods, irrespective of whether they are supervised or unsupervised.

Keywords: machine learning; linear regression; logistic regression; K-means; EM.

Reference to this paper should be made as follows: Lakshmi, J.V.N. (2018)


‘Machine learning techniques using python for data analysis in performance
evaluation’, Int. J. Intelligent Systems Technologies and Applications, Vol. 17,
Nos. 1/2, pp.3–18.

Biographical notes: J.V.N. Lakshmi is an Assistant Professor in the MCA Department of the Acharya Institute of Management and Sciences, Bangalore, India. She received her Master's degree in Statistics, Master of Computer Applications (MCA) and Bachelor of Science degree in Statistics and Computer Science from Osmania University, Hyderabad, India. Her research interests are in machine learning, big data, statistics and data analytics. Her papers on machine learning using big data for analytics have been published in reputed venues such as IEEE, ACM and Scopus-indexed journals. She is pursuing a PhD at SCSVMV University, Kanchipuram, India.

This paper is a revised and expanded version of a paper entitled ‘Machine


learning techniques using python for data analysis in performance evaluation’
presented at International Conference on Innovative Systems, Bangalore,
14 December, 2016.


1 Introduction

With the massive growth in data analysis, a model that parallelises workloads by incorporating machine learning algorithms is widely used. Machine learning techniques are implemented on large, complex data clusters for analysing data. Achieving high accuracy while processing data in parallel on huge, complex clusters is one of the concerns of machine learning, and a parallel programming model is required for efficient data processing in a cluster environment (Walisa and Wichan, 2013a).
Machine learning is a subfield of computer science that emerged from artificial intelligence through the study of pattern recognition and computational learning theory. Forecasting from data, studying observations, learning patterns and constructing algorithms can all be explored using machine learning. These algorithms operate by building a model from a training dataset in order to make data-driven predictions and decisions, rather than following statically programmed instructions. Machine learning draws on computational statistics and mathematical optimisation, conveying methods, theory and applications to various fields (Rich et al., 2008).
Machine learning uses hyperparameters for tuning the parameters of an algorithm to verify the best learning method (Brownlee, 2016). Multi-label classifier learning and K-fold cross-validation tasks are used for evaluating results when adapting algorithms.
When a learning process uses various parameters, a range of patterns can be applied to different jobs, each resulting in a different total execution time. The total execution time depends on the pattern assigned (Pavlo, 2009). To execute jobs efficiently, the best pattern should be recognised from among the various machine learning techniques.
A method is proposed that implements the machine learning algorithms in the Python language to minimise execution time (Haroshi et al., 2011). In this paper, an attempt is made to execute supervised and unsupervised machine learning techniques using a Python tool in order to determine the optimum efficiency (Asha and Shravanthi, 2013).
The rest of the paper is organised as follows. Section 2 introduces the background to machine learning. Section 3 describes the supervised techniques in the Python language and Section 4 the unsupervised learning algorithms. Section 5 describes the proposed model for data analytics. Section 6 discusses the performance evaluation. Section 7 discusses the result analysis and Section 8 concludes the paper.

2 Background

2.1 Machine learning


Machine learning algorithms are basically categorised into two groups: supervised and unsupervised techniques.
Supervised learning: applications in which the training data comprise examples of the input vectors along with their corresponding target vectors are known as supervised learning methods (Lakshmi, 2016).

Unsupervised learning: In other pattern-recognition problems, the training data consist of


a set of input vectors x without any corresponding target values. The goal in such
unsupervised learning problems may be to discover groups of similar examples within the
data (Manar and Stephane, 2015).
These machine learning techniques can be applied to tasks that are further grouped as below:
Classification: assigns a category to each object (e.g., OCR, text classification, speech recognition).
Regression: predicts a real value for each object (e.g., stock prices, economic variables and ratings).
Clustering: partitions data into homogeneous groups (analysis of very large datasets).
Ranking: orders objects according to some criterion (e.g., relevant web pages returned by a search engine).
Dimensionality reduction: finds a lower-dimensional manifold preserving properties of the data (computer vision).
Density estimation: learns the probability distribution according to which the data have been sampled.
A variety of learning algorithms, including linear regression and logistic regression from supervised learning, and expectation–maximisation (EM) in a Gaussian mixture model and K-means from unsupervised learning, satisfy these characteristics (Michael, 2015).

2.2 Python programming language


Python is an easy to learn, powerful programming language. It has efficient high-level
data structures and a simple but effective approach to object-oriented programming.
Python’s elegant syntax and dynamic typing, together with its interpreted nature, make it
an ideal language for scripting and rapid application development in many areas on most
platforms (Python Documentation: https://docs.python.org/2/tutorial/).
Python also excels at string processing, that is, the manipulation of strings and lists. Stuart and Harald (2007) list a few languages with good string-processing capabilities and compare them in terms of the degree to which they are still being actively developed by a community of developers and whether or not they are object oriented.

3 Supervised learning algorithms in python

Supervised learning: the class labels/target variable are known, and the learning algorithms are fitted on training sets using various training techniques such as linear regression, logistic regression, support vector machines, K-nearest neighbours, expectation–maximisation, K-means, decision trees, random forests and many more. A training set is used to learn a function h : x → y so that h(x) is a good predictor of y.

3.1 Linear regression


A parameterised linear function with weights \theta, mapping x to y, is given in equation (1):

h(x) = \sum_{i=0}^{n} \theta_i x_i = \theta^T x.    (1)

Equation (1) formalises the hypothesis as a linear function, considering x as a vector. The cost function is given in equation (2):

J(\theta) = \frac{1}{2} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2.    (2)

Gradient descent repeatedly updates the weights until this cost function converges, predicting the best fit. A snippet illustrating linear regression is shown in Figure 1.

Figure 1 Linear regression (see online version for colours)
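Since the snippet is reproduced only as an image, the following is a minimal sketch of what such a computation might look like; the dataset values and variable names below are illustrative assumptions, not the paper's original temperature data or code.

# Illustrative sketch: fit a simple linear model by least squares and report r^2.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # assumed toy data
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1, 5.9])

# Closed-form least-squares estimates: slope = cov(x, y) / var(x).
slope = np.cov(x, y, bias=True)[0, 1] / np.var(x)
intercept = y.mean() - slope * x.mean()
predictions = intercept + slope * x

# r^2 as the proportion of variance in y explained by the fitted line.
ss_res = np.sum((y - predictions) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot
print("slope=%.3f intercept=%.3f r^2=%.3f" % (slope, intercept, r_squared))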

Figure 1 illustrates the r² value calculated from the sum of squared errors and the covariance, giving the correlation between x and y. Figure 2 shows the plot representing the predictions.

Figure 2 Linear model (see online version for colours)

Figure 2 describes the linear model; the line describes the deviation of the values from the averages taken from the existing data, and it is used for making predictions.
The classification problem deals with data whose target is a binary variable that can take the values 0 and 1, where 0 represents the negative class and 1 represents the positive class. This approach of classifying discrete values can be constructed with logistic regression, an algorithm trained to predict the class for a corresponding x.

3.2 Logistic regression


The classification problem uses the binary values 0 and 1, called the negative and positive classes, where y ∈ {0, 1}. To model this, a logistic (or sigmoid) function g(z) is used:

h_\theta(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}    (3)

g(z) = \frac{1}{1 + e^{-z}}.    (4)

The sigmoid curve in Figure 3 represents the 0 and 1 classes for a given input x_i and its corresponding label y_i.

Figure 3 Sigmoid function (see online version for colours)

The logistic regression model derives from the sigmoid function and follows least-squares regression. Least-squares regression can be derived as the maximum likelihood estimator under a set of assumptions; these assumptions endow the stated classification model with a set of probabilistic assumptions, and the parameters are then fitted via maximum likelihood (Walisa and Wichan, 2013b).
In Snippet 1, the data are split into training and test sets, a cost function is assigned, and the classification is shown by a plot of the decision regions.

Snippet 1 Logistic regression using python (see online version for colours)
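The snippet itself is only available as an image; the following is a hedged sketch of how such a train/test split and classifier could be produced, assuming scikit-learn and a synthetic stand-in for the temperature data (both assumptions, not the author's original code).

# Illustrative sketch: logistic regression with a train/test split.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.RandomState(0)
X = rng.uniform(10, 40, size=(200, 1))   # assumed stand-in for temperature readings
y = (X[:, 0] > 25).astype(int)           # binary class label

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clf = LogisticRegression()
clf.fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
# predict_proba applies the sigmoid g(z) = 1 / (1 + exp(-z)) to theta^T x.
print("P(y = 1 | x = 30):", clf.predict_proba([[30.0]])[0, 1])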

Figures 3 and 4 describe the logistic regression classification for the temperature dataset, showing the data points plotted in the decision regions for the test dataset.

Figure 4 Logistic regression (see online version for colours)

3.3 Support vector machines


Support vector machines (SVMs) represent an extension to nonlinear models of the generalised portrait algorithm developed by Vladimir Vapnik (Schwarz, 1978). The SVM algorithm is based on statistical learning theory. It is computationally useful if only a small fraction of the data points are support vectors (Rich et al., 2008).
r = \frac{y_s (w^T x_s + b)}{\|w\|} = \frac{1}{\|w\|}    (5)

w^T x_i + b \le -\rho/2 \text{ if } y_i = -1 \quad\text{and}\quad w^T x_i + b \ge \rho/2 \text{ if } y_i = 1
\;\Leftrightarrow\; y_i (w^T x_i + b) \ge \rho/2.


Snippet 2 Support vector machines using python (see online version for colours)
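As the snippet appears only as an image, the sketch below illustrates how varying C changes the error penalty and the number of support vectors; scikit-learn's SVC and the synthetic data are assumptions for illustration, not the paper's actual setup.

# Illustrative sketch: effect of the penalty parameter C on a linear SVM.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

for C in (0.01, 1.0, 100.0):
    svm = SVC(kernel='linear', C=C)   # larger C = larger misclassification penalty
    svm.fit(X_train, y_train)
    print("C=%-6s train=%.2f test=%.2f support vectors=%d"
          % (C, svm.score(X_train, y_train), svm.score(X_test, y_test), len(svm.support_)))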

The variable C controls the penalty for misclassification. Large values of C correspond to large error penalties, whereas misclassification errors are penalised less when the value of C is small. The bias–variance trade-off can therefore be tuned through C; this is the concept of regularised regression shown in Snippet 2, where decreasing the value of C increases the bias and lowers the variance of the model.

3.4 K nearest neighbour


KNN is a lazy learner: a non-parametric model that is instance based. Instance-based models are characterised by memorising the training dataset, and lazy learning is a special case of instance-based learning with zero cost during the learning process.
KNN is summarised in the following steps:
• choose the value of k and a distance metric
• find the k nearest neighbours of the sample that we want to classify
• assign the class label by majority vote.
The right choice of k is crucial to finding a good balance between over- and under-fitting. A simple Euclidean distance measure is used for the real-valued samples of the temperature dataset. Snippet 3 measures the Euclidean distance on standardised data so that each feature contributes equally to the distance (Asha and Shravanthi, 2013).

Snippet 3 K nearest neighbour using python (see online version for colours)
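The following sketch shows the standardisation and the Minkowski metric parameters described here; scikit-learn and the synthetic data are assumptions used for illustration rather than the paper's diabetes or temperature files.

# Illustrative sketch: standardise features, then classify with KNN (p=2, Minkowski).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=400, n_features=6, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

# Standardise so that every feature contributes equally to the distance.
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)

# p=2 gives the Euclidean distance; p=1 would give the Manhattan distance.
knn = KNeighborsClassifier(n_neighbors=5, p=2, metric='minkowski')
knn.fit(X_train_std, y_train)
print("test accuracy:", knn.score(X_test_std, y_test))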

In this snippet, the Minkowski distance, a generalisation of the Euclidean and Manhattan distances, is used: the parameter p = 2 gives the Euclidean distance and p = 1 the Manhattan distance for the metric parameter.
By implementing the snippet described above, Figure 5 shows the balancing of the data for choosing the right value of k.

Figure 5 K nearest neighbour (see online version for colours)



This memory-based approach adapts rapidly to the classification as training data are collected. The training data in this context are the diabetes dataset, from which very few features were chosen, as shown in Figure 5.

3.5 Decision tree


A decision tree is a classifier which concentrates on interpretability. It breaks decision making down into a series of queries. Using a decision tree, the process begins at the root and proceeds by splitting the data by information gain. This is an iterative process; once the maximum depth of the tree is reached, pruning is the process that is implemented.
Snippet 4 illustrates the fit to the temperature data, which classifies based on the entropy generated while training on the data.
IG(D_p, f) = I(D_p) - \sum_{j=1}^{m} \frac{N_j}{N_p} I(D_j),    (6)

where IG is the information gain and the impurity measure I can be the Gini index, the entropy I_H or the classification error I_E, defined over all non-empty classes:

I_H(t) = -\sum_{i=1}^{c} p(i|t) \log_2 p(i|t)    (7)

I_E = 1 - \max \{ p(i|t) \}.    (8)

Figure 6 shows the axis-parallel decision boundaries of the decision tree with a maximum depth of three, using entropy as the impurity criterion; the corresponding tree is given in Figure 7.

Snippet 4 Decision regions for the decision tree using python (see online version for colours)
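A hedged reconstruction of such a snippet is given below; scikit-learn and the synthetic data are assumptions, not the author's original code. It fits a depth-three tree with the entropy criterion, as described for Figures 6 and 7.

# Illustrative sketch: decision tree of maximum depth three using entropy.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=400, n_features=5, random_state=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=2)

tree = DecisionTreeClassifier(criterion='entropy', max_depth=3, random_state=2)
tree.fit(X_train, y_train)

print("test accuracy:", tree.score(X_test, y_test))
print(export_text(tree))   # textual view of the splits chosen by information gain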

Figure 6 Decision tree representation of the dataset (see online version for colours)

Figure 7 Decision tree for a diabetes dataset giving entropy (see online version for colours)

The decision algorithm splits the data at the tree root on the feature that yields the highest information gain. In an iterative process, the split is repeated at each child node until the leaves are pure, meaning that the samples at each node belong to the same class. Because this can produce a very deep tree, the tree is pruned by setting a limit on its maximal depth.

3.6 Random forest


A random forest typically requires little hyperparameter tuning, since the ensemble model is quite robust to noise from the individual decision trees. The main parameter is the number of trees, which is chosen for better performance of the random forest classifier at the expense of an increased computational cost.
1 Draw a random bootstrap sample of size n.
2 Grow a decision tree from the bootstrap sample. At each node:
a randomly select d features without replacement and
b split the node using the feature that provides the best split according to the objective function, for instance by maximising the information gain.
3 Aggregate the predictions by each tree to assign the class label by majority vote.
In Snippet 5, a random forest of ten decision trees is built via the n_estimators parameter, with the entropy criterion as the impurity measure used to split the nodes. Although the random forest and dataset are tiny, the n_jobs parameter is used to parallelise the model training across multiple cores. In Figure 8 the random forest is implemented using the diabetes dataset.

Figure 8 Diabetes dataset plotted using random forest (see online version for colours)

Snippet 5 Random forest using python (see online version for colours)
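Since the snippet is only reproduced as an image, the following sketch shows a ten-tree forest with the entropy criterion and n_jobs for parallel training, as described above; scikit-learn and the synthetic data are assumptions for illustration.

# Illustrative sketch: random forest of ten trees trained in parallel.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=8, random_state=3)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=3)

forest = RandomForestClassifier(n_estimators=10,      # ten decision trees
                                criterion='entropy',  # impurity measure for splits
                                n_jobs=2,             # parallelise training over two cores
                                random_state=3)
forest.fit(X_train, y_train)
print("test accuracy:", forest.score(X_test, y_test))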

4 Unsupervised techniques

In the clustering problem, a training set {x(1), x(2), …, x(m)} is grouped into a few cohesive 'clusters'. This technique does not involve labels y(i); hence it is referred to as unsupervised learning.

4.1 K-means
K-means is a clustering algorithm that repeatedly carries out two steps:
• assigning each training example x(i) to the closest cluster centroid µj
• moving each centroid µj to the mean of the points assigned to it.
For each i, set

c^{(i)} := \arg\min_j \left\| x^{(i)} - \mu_j \right\|^2.    (9)

And for each j, set

\mu_j := \frac{\sum_{i=1}^{m} 1\{c^{(i)} = j\}\, x^{(i)}}{\sum_{i=1}^{m} 1\{c^{(i)} = j\}}.    (10)

The cluster centroids are initialised by choosing k training examples at random and setting the centroids equal to the values of those examples.

Snippet 6 illustrates the distortion function used with K-means, which measures the sum of squared distances between each training example and its cluster centroid.

Snippet 6 Clustering with K-means (see online version for colours)
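A minimal sketch of such a computation is shown below, assuming scikit-learn and the iris dataset mentioned in the text; the inertia_ attribute corresponds to the distortion, i.e. the sum of squared distances of each sample to its centroid.

# Illustrative sketch: K-means clustering and the distortion (inertia).
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans

X = load_iris().data

# Run K-means with several random initialisations and keep the best clustering.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

print("distortion (inertia):", kmeans.inertia_)
print("cluster centroids:\n", kmeans.cluster_centers_)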

The distortion function in equation (9), implemented here on the iris dataset, is non-convex, so coordinate descent on J is not guaranteed to converge to the global minimum; among the different clusterings found, the one with the lowest distortion is chosen. Figure 9 uses the diabetes dataset for executing the K-means clustering technique.

Figure 9 Results of K means clustering (see online version for colours)

4.2 Expectation and maximisation


In EM, you randomly initialise the model parameters, and then you alternate between (E) assigning values to the hidden variables based on the parameters and (M) computing the parameters based on the fully observed data (Chu et al., 2006).
E-step: coming up with values for the hidden variables, based on the parameters. If you work out the mathematics of choosing the best values for the class variable based on the features of a given data point in your dataset, it comes out to "for each data point, choose the centroid that it is closest to, by Euclidean distance, and assign that centroid's label".

w_j^{(i)} := p\left( z^{(i)} = j \mid x^{(i)}; \phi, \mu, \Sigma \right).    (11)

M-step: coming up with parameters, based on the full assignments. If you work out the mathematics of choosing the best parameter values based on the features of a given data point in your dataset, it comes out to "take the mean of all the data points that were labelled as c" (Schwarz, 1978).


\phi_j := \frac{1}{m} \sum_{i=1}^{m} w_j^{(i)}.    (12)
The EM algorithm is an iterative maximum likelihood estimation procedure that becomes nearly identical to estimating the parameters of the Gaussian discriminant analysis model, except that z(i) plays the role of the class labels; its use is illustrated in Snippet 7.

Snippet 7 Expectation and maximisation step using python (see online version for colours)
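The sketch below is an assumed illustration using scikit-learn's GaussianMixture on the iris data (both assumptions, not the author's original snippet); its fit() method alternates the E-step (computing the responsibilities w_j) and the M-step (updating phi, mu and Sigma).

# Illustrative sketch: EM for a Gaussian mixture model with three components.
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture

X = load_iris().data

gmm = GaussianMixture(n_components=3, covariance_type='full', max_iter=100, random_state=0)
gmm.fit(X)   # runs the EM iterations internally

print("mixing weights (phi):", gmm.weights_)
print("responsibilities for the first sample:", gmm.predict_proba(X[:1]))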

5 Proposed methodology

When analysing data, time is a vital factor which has a major impact on execution. The execution time of each node is recorded when datasets fit into distributed memory. Almost all machine learning algorithms are iterative, so the execution of memory-based jobs is crucial yet less efficient.
When the dataset does not fit into distributed memory, the data are initially stored on local disks and are used from there until the execution terminates. The pattern determines whether a given job is memory based or not. To improve the execution time, the machine learning techniques are implemented as in the following model.
The model in Figure 10 suggests the least execution time for data analytics, as it follows both the supervised and the unsupervised behaviours of machine learning techniques.

Figure 10 Model

As drafted in the proposed model, the dataset is pruned according to the algorithm being analysed, and each supervised and unsupervised technique is evaluated on the dataset. The model illustrated below has the following phases:
• read the dataset as input
• classify the data based on supervised and unsupervised learning methods
• extract the features and evaluate the techniques
• compare with the existing methods
• result analysis is done in the final phase.
Features of each algorithm are extracted for the best time efficiency using regression, classification and clustering on the dataset. The time efficiency is evaluated by applying the various ML techniques to the data, as in the sketch below. The algorithms can be improved with additional parameters for achieving better results.
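As a hedged sketch of that evaluation phase, each technique can be fitted on the same data while its wall-clock training time is recorded; the models, synthetic dataset and timing approach below are illustrative assumptions, not the paper's actual experimental setup.

# Illustrative sketch: time each technique on the same dataset.
import time
from sklearn.datasets import make_classification
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.cluster import KMeans

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

models = {
    'linear regression': LinearRegression(),
    'logistic regression': LogisticRegression(max_iter=1000),
    'k-means (k=2)': KMeans(n_clusters=2, n_init=10, random_state=0),
}

for name, model in models.items():
    start = time.time()
    if isinstance(model, KMeans):
        model.fit(X)        # unsupervised: no labels are used
    else:
        model.fit(X, y)     # supervised: trained on the labelled data
    print("%-20s %.4f s" % (name, time.time() - start))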

6 Evaluation of algorithms

This paper evaluates the supervised and unsupervised techniques of machine learning. For the supervised methods, both linear and logistic regression are fitted to the pairs of values, yielding target values on either side of the regression line. When unsupervised methods are used, the points are scattered and form clusters, as in K-means.

This evaluation reveals the impact of supervised techniques on data analytics. Our method targets multiple jobs using different parameters; since the input of the jobs is the same, the job integration technique improves the performance of the jobs. Figure 11 illustrates the average of the linear regression methods alongside the existing average.
In Figure 12, the graph shows the average taken from the time complexities calculated for each machine learning technique. The straight line is the linear average and the blue line is the average of the time, space, read and write operations of each machine learning method.

Figure 11 Average and linear average (see online version for colours)

Figure 12 Variation in learning techniques (see online version for colours)

7 Result analysis

The machine learning techniques reveal efficient use of time and space. These methods train the machine so that it adapts to the dataset. Figure 13 gives the gist of the techniques used for ML.

Figure 13 Supervised learning techniques (see online version for colours)

Figure 14 shows the combined measures of the supervised and unsupervised learning techniques, considering the time, space, read and write operations. By observation, the minimum space and time are 5.12 MB and 18.19 s respectively, both used by the random forest. However, the random forest is less efficient when the disk read and write operations are considered. KNN performs the disk reads and writes efficiently, with measures of 23.55 and 26.58, respectively. From the metrics in Figure 13 it is evident that supervised learning techniques give more accurate measures in comparison with unsupervised ones.

Figure 14 Supervised vs. unsupervised (see online version for colours)

Our method optimises the assignments by using the advantages of such jobs to develop a deadline scheduling method. That method maximises the number of jobs that can be run in the cluster while satisfying the deadlines of all jobs; jobs are scheduled using only the minimum number of nodes so that the cluster keeps free nodes for later jobs. In contrast, our method uses the entire cluster to complete jobs as early as possible.

8 Conclusion and future work

Executing data analysis jobs with various parameters is common in machine learning but time consuming. The proposed method optimises the job assignment for machine learning so as to minimise the total execution time. Our method can be extended to data analytics job execution, memory-based execution and job integration for machine learning, optimising the job assignment based on the execution. An execution model of the machine learning techniques is developed to predict the execution time of these jobs under the extended execution.

References
Asha, T. and Shravanthi (2013) ‘Building machine learning algorithms on Hadoop for big data’,
IJET UK Journal, Vol. 3, No. 2, pp.143–147.
Chu, C-T., Lin, Y-A., Yu, Y., Bradski, G.R., Ng, A.Y. and Olukotun, K. (2006) ‘Map-reduce for
machine learning on multicore’, NIPS, MIT Press, pp.281–288.
Haroshi, T., Shinji, N. and Takuyu, A. (2011) ‘Optimizing multiple machine learning jobs on map
reduce’, IEEE–ICCCTS Conference at Japan, pp.59–66.
Lakshmi, J.V.N. (2016) ‘Stochastic gradient descent using linear regression with python’, IJA-ERA,
Vol. 2, No. 8, December, pp.519–524.
Manar, A. and Stephane, P. (2015) ‘Machine learning with Python’, SIMUREX, October.
Pavlo, A. (2009) ‘A comparison of approaches to large-scale data analysis’, Proc. ACM SIGMOD,
USA, pp.100–113.
Rich, C., Karampatziakis, N. and Yessenalina, A. (2008) ‘An empirical evaluation of supervised
learning in high dimensions’, Proceedings of the 25th International Conference on Machine
Learning, ACM, New York, USA, pp.96–103.
Schwarz, G. (1978) ‘Estimating the dimension of a model’, The Annals of Statistics,
Vol. 6, No. 2, pp.461–464.
Stuart, R. and Harald, B. (2007) Beginning Python for Language Research, Vol. 2, pp.44–47.
Walisa, R. and Wichan, P. (2013a) ‘An adaptive ML on map reduce for improving performance of
large scale data analysis’, EC2 IEEE, Bangkok, Thailand, pp.234–236.
Walisa, R. and Wichan, P. (2013b) ‘An adaptive ML on map reduce for improving performance of
large scale data analysis on EC2’, IEEE 11th Conference on ICT and Knowledge Engineering
2013.
Michael, B. (2015) Machine Learning in Python: Essential Techniques for Predictive Analysis,
Print ISBN:9781118961742, Online ISBN:9781119183600, DOI:10.1002/9781119183600.
Brownlee, J. (2016) Master Machine Learning–How it Works, pp.1–5.

Website
https://pythonhosted.org/spyder/
