
MACHINE LEARNING – Copyright Reserved, Real Time Signals

Machine learning
Machine learning is a type of artificial intelligence (AI) that allows software applications to become
more accurate in predicting outcomes without being explicitly programmed.

The basic operation of machine learning is to build algorithms that can receive input data and
use statistical analysis to predict an output value.

The techniques of machine learning are similar to those of data mining and predictive modelling.

Machine Learning Use Cases in Day-to-Day Life:

It is used for fraud detection, image and speech recognition, medical diagnosis, spam
filtering, network security threat detection, prediction, classification, learning associations,
statistical arbitrage, extraction, and regression.

Email Spam Filtering:

There are a number of spam filtering approaches that email clients use. To ascertain that
these spam filters are continuously updated, they are powered by machine learning. Rule-based
spam filtering fails to track the latest tricks adopted by spammers, so learned models such as the
multilayer perceptron are used instead.

Online Fraud Detection:

Machine learning is proving its potential to make cyberspace a secure place, and tracking
monetary fraud online is one of its examples. For example, PayPal is using ML for protection
against money laundering. The company uses a set of tools that helps it compare millions of
transactions and distinguish between legitimate and illegitimate transactions taking place
between buyers and sellers.

Face Recognition:

You upload a picture of you with a friend and Facebook instantly recognizes that friend.
Facebook checks the poses and projections in the picture, notices the unique features, and then
matches them with the people in your friend list. The entire process at the backend is complicated and
takes care of the precision factor, but it seems to be a simple application of ML at the front end.

Search Engine Result Refining:

Google and other search engines use machine learning to improve the search results for
you. Every time you execute a search, the algorithms at the backend keep watch on how you
respond to the results. If you open the top results and stay on the web page for long, the search
engine assumes that the results it displayed were in accordance with the query. Similarly, if you
reach the second or third page of the search results but do not open any of the results, the search
engine estimates that the results served did not match the requirement. This way, the algorithms
working at the backend improve the search results.

Traffic Predictions:

We have all been using GPS navigation services. While we do that, our current locations
and velocities are saved at a central server for managing traffic. This data is then used to
build a map of current traffic. While this helps in predicting traffic and doing congestion analysis,
the underlying problem is that relatively few cars are equipped with GPS. Machine
learning in such scenarios helps to estimate the regions where congestion can be found on the
basis of daily experience.

Product Recommendations:

You shopped for a product online a few days back and then you keep receiving emails with
shopping suggestions. If not this, then you might have noticed that the shopping website or the app
recommends items that somehow match your taste. Certainly, this refines the
shopping experience, but did you know that it's machine learning doing the magic for you? On the
basis of your behaviour on the website/app, past purchases, items liked or added to cart, brand
preferences, etc., the product recommendations are made.

Image Recognition:

One of the most common uses of machine learning is image recognition. There are many
situations where you can classify the object as a digital image. For digital images, the
measurements describe the outputs of each pixel in the image.
In the case of a black-and-white image, the intensity of each pixel serves as one
measurement. So if a black-and-white image has N*N pixels, the total number of pixels, and hence
of measurements, is N^2.
In a coloured image, each pixel is considered as providing 3 measurements, corresponding to the
intensities of the 3 main colour components, i.e. RGB. So for an N*N coloured image there are 3N^2
measurements.
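
As a quick illustration of these counts, here is a sketch with a hypothetical 28x28 image (the size N = 28 is an assumption, not from the original):

import numpy as np

N = 28
gray = np.zeros((N, N))        # black-and-white: one intensity measurement per pixel
color = np.zeros((N, N, 3))    # colour: 3 measurements (R, G, B) per pixel
print(gray.size)               # 784  = N^2 measurements
print(color.size)              # 2352 = 3*N^2 measurements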

For face detection – The categories might be face versus no face present. There might be a
separate category for each person in a database of several individuals.

For character recognition – We can segment a piece of writing into smaller images, each
containing a single character. The categories might consist of the 26 letters of the English
alphabet, the 10 digits, and some special characters.

Speech Recognition:

Speech recognition (SR) is the translation of spoken words into text. It is also known as
"automatic speech recognition" (ASR), "computer speech recognition", or "speech to text"
(STT).

In speech recognition, a software application recognizes spoken words. The measurements
in this application might be a set of numbers that represent the speech signal. We can
segment the signal into portions that contain distinct words or phonemes. In each segment, we
can represent the speech signal by the intensities or energy in different time-frequency bands.

Although the details of signal representation are outside the scope of this program, we can
represent the signal by a set of real values.

Speech recognition applications include voice user interfaces, such as voice dialing,
call routing, and domotic appliance control. It can also be used for simple data
entry, preparation of structured documents, speech-to-text processing, and aircraft (direct voice
input).

Types of Machine Learning:

1. Supervised learning
2. Unsupervised learning

Supervised learning: learning from known, labelled data to create a model, then predicting the target
class for the given input data.

Unsupervised learning: learning from unlabelled data to differentiate the given input data.

Supervised Learning Algorithms:

All classification and regression algorithms come under supervised learning.

1. Linear regression & multiple linear regression
2. Logistic regression
3. Polynomial regression
4. Decision trees
5. Support vector machine (SVM)
6. k-Nearest Neighbours (KNN)
7. Naive Bayes
8. Random forest

Unsupervised learning algorithms:

All clustering algorithms come under unsupervised learning algorithms.

1. K-means clustering
2. Hierarchical clustering

Linear Regression

Linear regression:

Linear regression is a basic and commonly used type of predictive analysis. It models the
relationship between a dependent variable (Y) and one or more independent variables (X).

Simple linear regression:

Simple linear regression models the relationship between one dependent variable and one independent variable.

First, the regression might be used to identify the strength of the effect that the independent
variable(s) have on a dependent variable. Typical questions are what is the strength of
relationship between dose and effect, sales and marketing spending, or age and income.

Second, it can be used to forecast effects or impact of changes. That is, the regression analysis
helps us to understand how much the dependent variable changes with a change in one or more
independent variables. A typical question is, “how much additional sales income do I get for each
additional $1000 spent on marketing?”

Third, regression analysis predicts trends and future values. The regression analysis can be used
to get point estimates. A typical question is, “what will the price of gold be in 6 months?”

Line equation:

y = m*x + c + e

where m is the slope, c is the intercept on the Y axis, and e is the error. The fitted line gives the
predicted value of y for each x, and the goal is to minimize the error.
What is a cost function? The error is also called the cost function. Here the cost function is the sum of squared
errors over the training set:

cost function = (predicted value - actual value)^2

cost function = (y[i] - (m*x[i] + c))^2, summed over all training points i

What is gradient descent? Gradient descent is used to decrease the cost function at each
iteration.

We differentiate the error equation to minimize the error (dropping the intercept c for simplicity):

y = m*x
cost function (e) = (y[i] - (m*x[i]))^2
gradient = d(e)/dm = d/dm (y[i] - m*x[i])^2
         = 2*(y[i] - m*x[i]) * d/dm (y[i] - m*x[i])
         = 2*(y[i] - m*x[i]) * (-x[i])
gradient = -2*(y[i] - m*x[i]) * x[i]
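
A minimal sketch of this update rule in Python, using the small data set from the next section (the learning rate mu and the iteration count are assumed values for illustration):

import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 5, 4, 5])

m = 0.0
mu = 0.01   # learning rate (assumed)
for _ in range(100):
    grad = np.sum(-2 * (y - m * x) * x)   # d(e)/dm summed over the training set
    m = m - mu * grad                     # move m against the gradient
print(m)   # converges to 1.2, the least-squares slope for a line through the origin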

The least-squares coefficients are:

b1 = sum((x - x_bar)*(y - y_bar)) / sum((x - x_bar)^2)

b0 = y_bar - b1*x_bar

For the data x = [1, 2, 3, 4, 5], y = [2, 4, 5, 4, 5] (x_bar = 3, y_bar = 4):

x    y    (x-x_bar)^2    (y-y_bar)(x-x_bar)
1    2    4              4
2    4    1              0
3    5    0              0
4    4    1              0
5    5    4              2
          sum = 10       sum = 6

b1 = 6 / 10 = 0.6

b0 = y_bar - b1*x_bar = 4 - 0.6*3 = 2.2



R-squared (R^2) compares the deviation of the estimated values from the mean with the deviation
of the actual values from the mean.

[Figure: two plots over x = 0..5 with mean = 4: the regression line through the estimated values,
and the scatter of the actual values.]

R^2 = sum((y^ - y_bar)^2) / sum((y - y_bar)^2)

where y^ = b0 + b1*x is the estimated value.

x    y    y-y_bar    (y-y_bar)^2    y^     y^-y_bar    (y^-y_bar)^2
1    2    -2         4              2.8    -1.2        1.44
2    4    0          0              3.4    -0.6        0.36
3    5    1          1              4.0    0           0
4    4    0          0              4.6    0.6         0.36
5    5    1          1              5.2    1.2         1.44

R^2 = sum((y^ - y_bar)^2) / sum((y - y_bar)^2) = 3.6 / 6 = 0.6

If R^2 is near 1, the line fits the data well; when R^2 is far from 1 (near 0), the line does not
represent the data at all.
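
The same R^2 value can be checked with a few lines of NumPy (a sketch using the table above):

import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 5, 4, 5])
b0, b1 = 2.2, 0.6
y_hat = b0 + b1 * x

r2 = np.sum((y_hat - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)
print(r2)   # 0.6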

Code without using sklearn:

import numpy as np
import matplotlib.pyplot as plt

def estimate_coef(x, y):
    # number of observations/points
    n = np.size(x)
    # mean of x and y vector
    m_x, m_y = np.mean(x), np.mean(y)
    # calculating cross-deviation and deviation about x
    SS_xy = np.sum(y*x) - n*m_y*m_x
    SS_xx = np.sum(x*x) - n*m_x*m_x
    # calculating regression coefficients
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1*m_x
    return (b_0, b_1)

def plot_regression_line(x, y, b):
    # plotting the actual points as scatter plot
    plt.scatter(x, y, color="m", marker="o", s=30)
    # predicted response vector
    y_pred = b[0] + b[1]*x
    # plotting the regression line
    plt.plot(x, y_pred, color="g")
    # putting labels
    plt.xlabel('x')
    plt.ylabel('y')
    # function to show plot
    plt.show()

def main():
    # observations
    x = np.array([1, 2, 3, 4, 5])
    y = np.array([2, 4, 5, 4, 5])
    # estimating coefficients
    b = estimate_coef(x, y)
    print("Estimated coefficients:\nb_0 = {}\nb_1 = {}".format(b[0], b[1]))
    # plotting regression line
    plot_regression_line(x, y, b)

if __name__ == "__main__":
    main()
Multiple linear regression

Multiple linear regression:

Multiple linear regression is the most common form of linear regression analysis. As a
predictive analysis, it models the relationship between two or more independent variables (x) and one
dependent variable (y).

    [ x11, x12, x13, ..., x1n ]        [ y1 ]
    [ x21, x22, x23, ..., x2n ]        [ y2 ]
x = [ ..., ..., ..., ...      ],   y = [ .. ]
    [ xm1, xm2, xm3, ..., xmn ]        [ ym ]

y = m0*x0 + m1*x1 + m2*x2 + ... + m[n]*x[n]

Multiple linear regression mathematical calculations using gradient descent:

y = m*x + c + e
error = (y[i] - (m*x[i]))**2
gradient += -2*(y[i] - (m0*x[i]))*x[i]
new_m = m0 - mu*gradient   (here mu is the learning rate; we find the new slope by changing the m value)

Examples: house price prediction, predicting which television show will have more viewers
next week.

x=["sqft","bedrooms","water","bathrooms"] here x is independent variable

y=["price"], y is dependent variable

• Square feet is the Area of house.


• Price is the corresponding cost of that house

no sqft bedrooms water bathrooms price


1 150 3 1 3 6500
2 160 3 1 3 7000
3 200 2 0 2 9000
4 250 3 0 3 10000
5 300 2 1 2 12000
6 350 4 0 4 13500
7 400 5 1 5 14000

By using gradient descent:

import pandas as pd
import numpy as np
from matplotlib import pyplot as plt

data = pd.read_csv("house_price.csv")
data.head()
data.shape

def gradient(x, y, m0, mu):
    error = 0
    m_gradient = 0
    for i in range(len(x)):
        error += (y[i] - (m0*x[i]))**2
        m_gradient += -2*(y[i] - (m0*x[i]))*x[i]
    # update the slope in the direction opposite to the gradient
    new_m = m0 - mu*m_gradient
    return new_m, error

x = data['sqft']
y = data['price']
# centre the data around zero
x = x - np.mean(x)
y = y - np.mean(y)

m = 0
lst = []
mu = 0.01
for i in range(10):
    m, e = gradient(x, y, m, mu)
    lst.append(e)
plt.plot(lst)
print(m)
plt.scatter(x, y)
x1 = np.arange(-0.5, 0.5, 0.1)
y1 = x1*m
plt.plot(x1, y1, color='red')

By using sklearn:

import numpy as np  # linear algebra
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

data = pd.read_csv("house.csv")
data.head()
x = np.array(data[["sqft", "bedrooms", "water", "bathrooms"]])
y = np.array(data[["price"]])
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=3)
model = LinearRegression()
model.fit(x_train, y_train)
model.score(x_train, y_train)
model.score(x_test, y_test)
Logistic regression

Logistic regression:
Logistic regression is used to find the probability of event success and event failure.

Dependent variable:
The dependent variable must be binary in nature (0 or 1). Logistic regression is widely used for
classification problems.

Logistic regression is a special case of linear regression where the outcome variable is
categorical and we use the log of odds as the dependent variable. In simple words, it predicts the
probability of occurrence of an event by fitting data to a logit function.

1. Binomial: the target variable can have only 2 possible types, "0" or "1", which may represent "win"
vs "loss", "pass" vs "fail", "dead" vs "alive", etc.
2. Multinomial: the target variable can have 3 or more possible types which are not ordered (i.e. the
types have no quantitative significance), like "disease A" vs "disease B" vs "disease C".
3. Ordinal: it deals with target variables with ordered categories. For example, a test score can
be categorized as "very poor", "poor", "good", "very good". Here, each category can be given
a score like 0, 1, 2, 3.

ex:

here, x = Gender, y = Pass/Fail

Gender Pass/Fail
Female 1
Male 0
Male 1
Female 1
Female 1
Male 0
Male 0

The sigmoid (logistic) function maps any real input to the range (0, 1):

s(z) = 1 / (1 + e^(-z))

• s(z) = output between 0 and 1 (probability estimate)
• z = input to the function (your algorithm's prediction, e.g. m*x + c)
• e = base of the natural log

Decision Boundary :
Our current prediction function returns a probability score between 0 and 1. In order
to map this to a discrete class (true/false, cat/dog), we select a threshold value or tipping
point above which we will classify values into class 1 and below which we classify values
into class 2.

p ≥ 0.5 : class = 1;  p < 0.5 : class = 0

For example, if our threshold was 0.5 and our prediction function returned 0.7, we
would classify this observation as positive. If our prediction was 0.2 we would classify the
observation as negative. For logistic regression with multiple classes we could select the
class with the highest predicted probability.
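
A small sketch of the sigmoid and the 0.5 threshold in Python (the slope m and intercept c are assumed values for illustration):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

m, c = 1.5, -0.5                       # assumed model parameters
x = np.array([-2.0, 0.0, 0.5, 2.0])
p = sigmoid(m * x + c)                 # probability estimates between 0 and 1
labels = (p >= 0.5).astype(int)        # threshold at p = 0.5
print(p.round(3), labels)              # [0.029 0.378 0.562 0.924] [0 0 1 1]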

cost function = (y[i] - s(z))^2

where s(z) is the sigmoid output defined above; substituting:

cost function = (y[i] - 1/(1 + e^(-(m*x[i] + c))))^2

[Figure: the cost as a function of s(z), for y = 1 and for y = 0.]

Gradient descent: differentiate (y[i] - 1/(1 + e^(-m*x[i])))^2 with respect to m. The derivative of the
sigmoid term is

d/dm ( 1/(1 + e^(-m*x[i])) ) = x[i] * e^(-m*x[i]) / (1 + e^(-m*x[i]))^2
Multiclass logistic regression (or) softmax regression:

Softmax regression is a generalization of logistic regression to the case where we want
to handle multiple classes.

Softmax: as the name suggests, in softmax regression (SMR) we replace the sigmoid
logistic function by the so-called softmax function φ:

φ(z)_j = e^(z_j) / sum over k = 1..K of e^(z_k)

where z is the net input:

z = w·x + w0

(w is the weight vector, x is the feature vector of 1 training sample, and w0 is the bias unit.)
Now, this softmax function computes the probability that training sample x(i) belongs to
class j given the weight and net input z(i). So, we compute the probability p(y = j | x(i); w_j) for each
class label in j = 1, ..., k.

ex: z = [apple, orange, banana, mango]

apple = class 0
orange = class 1
banana = class 2
mango = class 3

First, we want to encode the class labels into a format that we can more easily work with; we apply one-hot encoding:

                 one-hot encoding (classes 0, 1, 2, 3)
apple   class 0      1 0 0 0
orange  class 1      0 1 0 0
banana  class 2      0 0 1 0
mango   class 3      0 0 0 1
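
A minimal NumPy sketch of the softmax function together with this one-hot encoding (the net-input values in z are assumed for illustration):

import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

z = np.array([2.0, 1.0, 0.5, 0.1])     # assumed net inputs for the 4 classes
print(softmax(z).round(3))             # class probabilities summing to 1

labels = np.array([0, 1, 2, 3])        # apple, orange, banana, mango
print(np.eye(4)[labels])               # the one-hot rows shown above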

Ex: Titanic data set

PassengerId  Survived  Pclass  Sex     Age  SibSp  Ticket            Fare     Embarked
1            0         3       male    22   1      A/5 21171         7.25     S
2            1         1       female  38   1      PC 17599          71.2833  C
3            1         3       female  26   0      STON/O2. 3101282  7.925    S
4            1         1       female  35   1      113803            53.1     S
5            0         3       male    35   0      373450            8.05     S
6            0         3       male         0      330877            8.4583   Q
7            0         1       male    54   0      17463             51.8625  S
8            0         3       male    2    3      349909            21.075   S
9            1         3       female  27   0      347742            11.1333  S

Titanic dataset by using sklearn:

import numpy as np  # linear algebra
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

data = pd.read_csv("titanic.csv")
data.head()
# note: categorical columns such as Sex and Embarked must be label-encoded first
x = np.array(data[["Pclass", "Sex", "Age", "Embarked"]])
y = np.array(data[["Survived"]])
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=3)
model = LogisticRegression()
model.fit(x_train, y_train)
model.score(x_train, y_train)
model.score(x_test, y_test)
Decision Tree Algorithm
Decision Tree Algorithm:

A decision tree is a type of supervised learning algorithm. It is mostly used in classification
problems.

It works for both categorical and continuous input and output variables. In this technique,
we split the data into two or more homogeneous sets.

The decision tree identifies the most significant variable, and the value of that variable, that gives the
best homogeneous sets of the population.

Types of Decision Trees:

The type of decision tree is based on the type of target variable we have. It can be of two types:
1. Categorical Variable Decision Tree: a decision tree which has a categorical target variable is
called a Categorical Variable Decision Tree.
2. Continuous Variable Decision Tree: a decision tree which has a continuous target variable is
called a Continuous Variable Decision Tree.

Entropy:
Entropy, as it relates to machine learning, is a measure of the randomness in the
information being processed. The higher the entropy, the harder it is to draw any conclusions from
that information.

Information gain: I(p, n) = -(p/(p+n)) * log2(p/(p+n)) - (n/(p+n)) * log2(n/(p+n))

Entropy(attribute) = sum over values i of ((P[i] + N[i]) / (P + N)) * I(P[i], N[i])

Gain = Entropy(class) - Entropy(attribute)

ex:

Age Competition Type Profit


Old Yes software Down
Old No Software Down
Old No Hardware Down
Mid Yes Software Down
Mid Yes Hardware Down
Mid No Hardware Up
Mid No Software Up
New Yes Software Up
New No Hardware Up
New No Software Up

profit (class): P = 5 (Up), N = 5 (Down)

Class entropy:
I(5, 5) = -(5/10) log2(5/10) - (5/10) log2(5/10) = 1

Steps for making the decision tree:

1. Information gain
2. Entropy
3. Gain

Age:

Age    P[i]    N[i]    I(P[i], N[i])
Old    0       3       0
Mid    2       2       1
New    3       0       0

Entropy(Age):

(0+3)/10 * (0) + (2+2)/10 * (1) + (3+0)/10 * (0) = 4/10 = 0.4

Gain:

1 - 0.4 = 0.6

Competition:

Competition    P[i]    N[i]    I(P[i], N[i])
Yes            1       3       0.8113
No             4       2       0.9183

Entropy(Competition):

(1+3)/10 * (0.8113) + (4+2)/10 * (0.9183) = 0.8755

Gain:

1 - 0.8755 = 0.1245

Type:

Type        P[i]    N[i]    I(P[i], N[i])
Software    3       3       1
Hardware    2       2       1

Entropy(Type):

(3+3)/10 * (1) + (2+2)/10 * (1) = 1

Gain: 1 - 1 = 0

Compare gain:

Attribute      Gain
Age            0.6
Competition    0.1245
Type           0

Here the gain for Age is the highest, so Age is the root node:

Age = Old -> Down
Age = Mid -> ? (split further)
Age = New -> Up

Age = Mid subset:

Age    Competition    Type        Profit
Mid    Yes            Software    Down
Mid    Yes            Hardware    Down
Mid    No             Software    Up
Mid    No             Hardware    Up

profit (class): p = 2, n = 2

Entropy:
-(2/4) log2(2/4) - (2/4) log2(2/4) = 1

Competition:

Competition    P[i]    N[i]    I(P[i], N[i])
Yes            0       2       0
No             2       0       0

Entropy(Competition):
(0+2)/4 * (0) + (2+0)/4 * (0) = 0

Gain: Entropy(class) - Entropy(attribute)

1 - 0 = 1

Type:

Type        P[i]    N[i]    I(P[i], N[i])
Software    1       1       1
Hardware    1       1       1

Entropy(Type):

(1+1)/4 * (1) + (1+1)/4 * (1) = 1

Gain: Entropy(class) - Entropy(attribute)

1 - 1 = 0

Compare gain:

Attribute      Gain
Competition    1
Type           0

Competition has the higher gain, so it becomes the next node.

Decision tree:

Age?
  Old -> Down
  Mid -> Competition?
           Yes -> Down
           No  -> Up
  New -> Up
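
The hand calculations above can be verified with a short Python sketch:

import numpy as np

def info(p, n):
    # I(p, n) for a node with p 'Up' and n 'Down' examples
    total, result = p + n, 0.0
    for count in (p, n):
        if count:
            result -= (count / total) * np.log2(count / total)
    return result

def gain(class_counts, attribute_counts):
    total = sum(p + n for p, n in attribute_counts)
    entropy = sum((p + n) / total * info(p, n) for p, n in attribute_counts)
    return info(*class_counts) - entropy

print(gain((5, 5), [(0, 3), (2, 2), (3, 0)]))   # Age         -> 0.6
print(gain((5, 5), [(1, 3), (4, 2)]))           # Competition -> ~0.1245
print(gain((5, 5), [(3, 3), (2, 2)]))           # Type        -> 0.0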

Advantages:
• Easy to understand: decision tree output is very easy to understand, even for people from a
non-analytical background. It does not require any statistical knowledge to read and
interpret. Its graphical representation is very intuitive, and users can easily relate it to their
hypotheses.

• Useful in data exploration: the decision tree is one of the fastest ways to identify the most
significant variables and the relations between two or more variables. With the help of decision
trees, we can create new variables / features that have better power to predict the target
variable.

• Less data cleaning required: it requires less data cleaning compared to some other
modelling techniques. It is not influenced by outliers and missing values to a fair degree.

• Data type is not a constraint: it can handle both numerical and categorical variables.

• Non-parametric method: the decision tree is considered to be a non-parametric method. This
means that decision trees have no assumptions about the space distribution and the
classifier structure.

ex: bank data set:

   age  job         marital  education  default  balance  housing  loan  contact  day  month  duration  campaign  pdays  previous  poutcome  deposit
0  59   admin.      married  secondary  no       2343     yes      no    unknown  5    may    1042      1         -1     0         unknown   yes
1  56   admin.      married  secondary  no       45       no       no    unknown  5    may    1467      1         -1     0         unknown   yes
2  41   technician  married  secondary  no       1270     yes      no    unknown  5    may    1389      1         -1     0         unknown   yes
3  55   services    married  secondary  no       2476     yes      no    unknown  5    may    579       1         -1     0         unknown   yes
4  54   admin.      married  tertiary   no       184      no       no    unknown  5    may    673       2         -1     0         unknown   yes

Program by using Decision Tree:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
data = pd.read_csv("bank.csv")
data
features = data[['age', 'job', 'marital', 'education', 'default', 'balance', 'housing',
                 'loan', 'contact', 'day', 'month', 'duration', 'campaign', 'pdays', 'previous']]
target = data["deposit"]
from sklearn.preprocessing import LabelEncoder
label = LabelEncoder()
# label-encode the object (string) columns
columns = features.dtypes.pipe(lambda s: s[s == 'object']).index
for col in columns:
    features[col] = label.fit_transform(features[col])
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
from sklearn import tree

X_train, X_test, Y_train, Y_test = train_test_split(features, target, test_size=0.4, random_state=1)
model = DecisionTreeClassifier()
model.fit(X_train, Y_train)
model.score(X_test, Y_test)

KNN (K-Nearest Neighbour) Classifier Algorithm

KNN (K-Nearest Neighbour) classifier algorithm:

The k-nearest neighbour classifier algorithm is one of the supervised learning algorithms. It is
used for both classification and regression predictive problems. However, it is more widely used in
classification problems in the industry.

The k-nearest neighbour classifier predicts the target label by finding the nearest
neighbour class. The closest class is identified using distance measures like the Euclidean
distance.

Euclidean distance: the Euclidean distance between two points in the plane or in 3-dimensional
space measures the length of the segment connecting the two points.

The distance between points p = (p1, p2) and q = (q1, q2) in the plane is

d(p, q) = sqrt((p1 - q1)^2 + (p2 - q2)^2)

In general, the distance between points p and q in an n-dimensional Euclidean space is given by

d(p, q) = sqrt(sum over i of (p[i] - q[i])^2)
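
In NumPy the Euclidean distance is a one-liner (a sketch with assumed points):

import numpy as np

p = np.array([1.0, 2.0, 3.0])
q = np.array([4.0, 6.0, 3.0])
dist = np.sqrt(np.sum((p - q) ** 2))   # equivalent to np.linalg.norm(p - q)
print(dist)   # 5.0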

K-nearest neighbor classification step by step procedure :

To demonstrate a k-nearest neighbour analysis, let's consider the task of classifying a new
object (query point) among a number of known examples belonging to two different target
classes, circle and triangle.

Now we would like to predict the target class for the new point (shown as a red circle).
Taking the k value as three, we calculate the similarity distance using similarity measures like
the Euclidean distance.

With k = 3, one of the new data point's three nearest neighbours is a circle and two are
triangles, so the new data point is assigned to the triangle class.

Let’s consider a setup with “n” training samples, where xi is the training data point. The
training data points are categorized into “c” classes. Using KNN, we want to predict class for the
new data point. So, the first step is to calculate the distance(Euclidean) between the new data
point and all the training data points.

Nearest Neighbour Algorithm:

Nearest neighbour is a special case of k-nearest neighbour, where k = 1. In this case, the new
data point's target class is assigned from its single closest neighbour.

How to choose the value of K?

Selecting the value of K in K-nearest neighbor is the most critical problem. A small value of
K means that noise will have a higher influence on the result i.e., the probability of overfitting is
very high. A large value of K makes it computationally expensive and defeats the basic idea
behind KNN (that points that are near might have similar classes ). A simple approach to select k
is k = n^(1/2).

To optimize the results, we can use Cross Validation. Using the cross-validation technique,
we can test KNN algorithm with different values of K. The model which gives good accuracy can
be considered to be an optimal choice.

It depends on individual cases; at times the best process is to run through each possible value
of k and test our results.
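
A minimal sketch of choosing k by cross-validation with scikit-learn (x and y are assumed to be a feature matrix and label vector that are already loaded):

from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

for k in range(1, 21):
    model = KNeighborsClassifier(n_neighbors=k)
    scores = cross_val_score(model, x, y, cv=5)   # 5-fold cross-validation
    print(k, scores.mean())                       # pick the k with the best mean score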

Advantages:

• The k-nearest neighbours (KNN) classifier is a very simple classifier that works well on
basic recognition problems.

Disadvantages:

• The main disadvantage of the KNN algorithm is that it does not learn anything from the training
data and simply uses the training data itself for classification.
• To predict the label of a new instance, the KNN algorithm finds the K closest neighbours
to the new instance in the training data; the predicted class label is then set to the
most common label among those K closest neighbouring points.
• Because the algorithm does not learn anything from the training data, it may not generalize
well and is not robust to noisy data. Further, changing K can change the resulting predicted
class label.

ex: breast cancer diagnosis using the k-nearest neighbour (KNN) algorithm

X=['radius_mean', 'texture_mean', 'perimeter_mean',


'area_mean', 'smoothness_mean', 'compactness_mean', 'concavity_mean',
'concave points_mean', 'symmetry_mean', 'fractal_dimension_mean',
'radius_se', 'texture_se', 'perimeter_se', 'area_se', 'smoothness_se',
'compactness_se', 'concavity_se', 'concave points_se', 'symmetry_se',
'fractal_dimension_se', 'radius_worst', 'texture_worst',
'perimeter_worst', 'area_worst', 'smoothness_worst',
'compactness_worst', 'concavity_worst', 'concave points_worst',
'symmetry_worst', 'fractal_dimension_worst', 'diagnosis1']
y= ['diagnosis']

Algorithm for breast cancer using sklearn:

import numpy as np  # linear algebra
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)
import seaborn as sns
import matplotlib.pyplot as plt  # matplotlib is used to plot the graphs
%matplotlib inline
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

data = pd.read_csv("breast-cancer.csv")
data.head()
data.shape
x = data[['radius_mean', 'perimeter_mean', 'area_mean', 'compactness_mean',
          'concave points_mean', 'radius_se', 'perimeter_se', 'area_se',
          'compactness_se', 'concave points_se', 'radius_worst', 'perimeter_worst',
          'compactness_worst', 'concave points_worst', 'texture_worst', 'area_worst']]
y = data[['diagnosis']]
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=1)
from sklearn.metrics import accuracy_score
model = KNeighborsClassifier(n_neighbors=5)
model.fit(x_train, y_train)
predict = model.predict(x_test)
accuracy = accuracy_score(y_test, predict)
print(accuracy)

Naive Bayes Algorithm

Naive Bayes Algorithm: the naive Bayes classifier is a straightforward and powerful
algorithm for the classification task. It works on Bayes' theorem of probability to predict the class
of an unknown data set.

How the naive Bayes classifier algorithm works in machine learning:

▪ Basically, we are trying to find the probability of event A, given that event B is true. Event B is
also termed the evidence.
▪ P(A) is the prior of A (the prior probability, i.e. the probability of the event before the evidence
is seen). The evidence is an attribute value of an unknown instance (here, it is event B).
▪ P(A|B) is the posterior probability of A, i.e. the probability of the event after the evidence is seen.

Bayes' theorem:

P(A|B) = P(B|A) * P(A) / P(B)

We can apply Bayes' theorem to our problem in the following way:

P(y|X) = P(X|y) * P(y) / P(X)

where y is the class variable and X is a dependent feature vector (of size n):
X = (x1, x2, x3, ..., xn)

For example, with

X = (Rainy, Hot, High, False) and y = No,

P(X|y) here means the probability of "not playing golf" given that the weather
conditions are "rainy outlook", "temperature is hot", "high humidity" and "no wind".
ex:

The dataset is divided into two parts, namely the feature matrix and the response vector.

▪ The feature matrix contains all the vectors (rows) of the dataset, in which each vector consists of
the values of the dependent features. In this dataset, the features are 'Outlook', 'Temperature',
'Humidity' and 'Windy'.

With relation to our dataset, the naive Bayes assumptions can be understood as:

▪ We assume that no pair of features is dependent. For example, the temperature being
'Hot' has nothing to do with the humidity, and the outlook being 'Rainy' has no effect on the
winds. Hence, the features are assumed to be independent.
▪ Secondly, each feature is given the same weight (or importance). For example, knowing
only temperature and humidity alone can't predict the outcome accurately. None of the
attributes is irrelevant, and each is assumed to contribute equally to the outcome.

probability of Outlook

Outlook yes no P(yes) P(no)


Rainy 2 3 2/9 3/5
Sunny 3 2 3/9 2/5
Overcast 4 0 4/9 0/5
total 9 5 100% 100%

probability of Temperature:

Temperature yes no P(yes) P(no)


Hot 2 2 2/9 2/5
Mild 4 2 4/9 2/5
Cool 3 1 3/9 1/5
total 9 5 100% 100%

probability of Humidity:

Humidity yes no P(yes) P(no)


High 3 4 3/9 4/5
Normal 6 1 6/9 1/5
Total 9 5 100% 100%

Probability of Wind yes and no:

wind yes no P(yes) P(no)


False(NO) 6 2 6/9 2/5
True(YES) 3 3 3/9 3/5
Total 9 5 100% 100%

Probability of play:

play    count    probability
Yes     9        9/14
No      5        5/14
total   14       100%
Let us test it on a new set of features (call it 'today'):

today = (Sunny, Hot, Normal, False)

P(yes|today) ∝ P(Sunny|yes) P(Hot|yes) P(Normal|yes) P(NoWind|yes) P(yes)
            = 3/9 * 2/9 * 6/9 * 6/9 * 9/14 = 0.0211

P(no|today) ∝ P(Sunny|no) P(Hot|no) P(Normal|no) P(NoWind|no) P(no)
            = 2/5 * 2/5 * 1/5 * 2/5 * 5/14 = 0.0045

Now, since P(yes|today) + P(no|today) = 1 after normalization:

P(yes|today) = 0.0211 / (0.0211 + 0.0045) = 0.0211 / 0.0256 = 0.82
P(no|today)  = 0.0045 / (0.0211 + 0.0045) = 0.0045 / 0.0256 = 0.18

P(yes|today) > P(no|today)

So the prediction is that golf would be played: 'Yes'.
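
The arithmetic of this worked example can be checked directly in Python:

p_yes = (3/9) * (2/9) * (6/9) * (6/9) * (9/14)   # P(X|yes) * P(yes)
p_no  = (2/5) * (2/5) * (1/5) * (2/5) * (5/14)   # P(X|no)  * P(no)
total = p_yes + p_no
print(round(p_yes, 4), round(p_no, 4))               # 0.0212 0.0046 (the 0.0211 / 0.0045 above are truncated)
print(round(p_yes/total, 2), round(p_no/total, 2))   # 0.82 0.18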


The method that we discussed above is applicable for discrete data. In case of continuous
data, we need to make some assumptions regarding the distribution of values of each feature. The
different naive Bayes classifiers differ mainly by the assumptions they make regarding the
distribution of P(xi | y).

Gaussian Naive Bayes classifier:

In Gaussian naive Bayes, the continuous values associated with each feature are assumed to be
distributed according to a Gaussian distribution. A Gaussian distribution is also called a normal
distribution. When plotted, it gives a bell-shaped curve which is symmetric about the mean of the
feature values.

The conditional probability is given by:

P(x_i | y) = (1 / sqrt(2 * pi * sigma_y^2)) * exp(-(x_i - mu_y)^2 / (2 * sigma_y^2))

where mu_y and sigma_y^2 are the mean and variance of feature x_i for class y.

ex: iris data set

x = [sepal.length, sepal.width, petal.length, petal.width]

y = [species]

Sepal.Length Sepal.Width Petal.Length Petal.Width Species


1 5.1 3.5 1.4 0.2 setosa
2 4.9 3 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 versicolor
5 5 3.6 1.4 0.2 versicolor
6 5.4 3.9 1.7 0.4 setosa
7 4.6 3.4 1.4 0.3 setosa
8 5 3.4 1.5 0.2 virginica
9 4.4 2.9 1.4 0.2 virginica
10 4.9 3.1 1.5 0.1 setosa

Naive bayes algorithm using sklearn:


import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
data=pd.read_csv("Iris.csv")
data.head()
# store the feature matrix (X) and response vector (Y)
X=np.array(data.iloc[:,0:4])
Y=data["Species"]
# splitting X and y into training and testing sets
from sklearn.model_selection import train_test_split
X_train,X_test,Y_train,Y_test=train_test_split(X,Y,test_size=0.2,random_state=4)
# training the model on training set
from sklearn.naive_bayes import GaussianNB
model = GaussianNB()
model.fit(X_train, Y_train)
# making predictions on the testing set
Y_pred = model.predict(X_test)
# find accuracy
from sklearn.metrics import accuracy_score
accuracy_score(Y_test, Y_pred)

Random Forest Algorithm

Random Forest Algorithm:

The random forest algorithm is a supervised classification algorithm. It is used for both
classification and regression kinds of problems. This algorithm creates a forest with a number
of decision trees. In general, the more trees in the forest, the more robust the forest looks. In
the same way, in the random forest classifier, a higher number of trees in the forest gives
higher-accuracy results.

Why the random forest algorithm? Some advantages:

• The same random forest algorithm (the random forest classifier) can be used for both
classification and regression tasks.
• The random forest classifier can handle missing values.
• When we have more trees in the forest, the random forest classifier tends to avoid
overfitting.

Random forest algorithm real-life example:

For random forest classification, each tree's prediction is counted as a vote for one
class. The label is predicted to be the class which receives the most votes.

All data
  -> tree 1 trained on a random subset
  -> tree 2 trained on a random subset
  -> tree 3 trained on a random subset
  -> tree 4 trained on a random subset

Here is how such a system is trained, for some number of trees T (a sketch follows this list):

1. At each node, for some number m, m predictor variables are selected at random from
   all the predictor variables.
2. The predictor variable that provides the best split, according to some objective
   function, is used to do a binary split on that node.
3. At the next node, choose another m variables at random from all predictor variables
   and do the same.
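
A minimal bagging-and-voting sketch of this procedure (it assumes NumPy arrays X_train, y_train, X_new with integer class labels; scikit-learn's RandomForestClassifier, used in the program below, implements this internally):

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def random_forest_predict(X_train, y_train, X_new, n_trees=10, seed=1):
    rng = np.random.default_rng(seed)
    votes = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(X_train), len(X_train))    # bootstrap sample of the data
        tree = DecisionTreeClassifier(max_features='sqrt')   # m random predictors per split
        tree.fit(X_train[idx], y_train[idx])
        votes.append(tree.predict(X_new))
    votes = np.array(votes, dtype=int)
    # each tree's prediction counts as one vote; return the majority class per sample
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)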

Analyze credit risk: a machine learning scenario

Our data is from the German Credit Data Set, which classifies people, described by a set of
attributes, as good or bad credit risks. For each bank loan application we have the information
listed below.

Decision trees: the random forest randomly creates multiple decision trees and finds the prediction
of each decision tree. Each decision tree's prediction is counted as a vote for one class. The label
is predicted to be the class which receives the most votes.

features =['Account Balance', 'Duration of Credit (month)', 'Payment Status of Previous Credit',
'Purpose', 'Credit Amount', 'Value Savings/Stocks', 'Length of current employment', 'Installment per
cent', 'Sex & Marital Status', 'Guarantors', 'Duration in Current address', 'Most valuable available
asset', 'Age (years)', 'Concurrent Credits', 'Type of apartment', 'No of Credits at this Bank',
'Occupation', 'No of dependents', 'Telephone', 'Foreign Worker']

label =['Creditability'] => Label → Creditable or Not Creditable (1 or 0)



German credit algorithm using random forest:

import numpy as np  # linear algebra
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)
import seaborn as sns
import matplotlib.pyplot as plt  # matplotlib is used to plot the graphs
%matplotlib inline
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

data = pd.read_csv("german_credit.csv")
data.head()
data.columns
data.info()
corr = data.corr()
corr.nlargest(20, 'Creditability')['Creditability']
x = np.array(data[['Account Balance', 'Payment Status of Previous Credit',
                   'Value Savings/Stocks', 'Length of current employment',
                   'Concurrent Credits', 'Age (years)', 'Sex & Marital Status',
                   'Foreign Worker', 'No of Credits at this Bank']])
y = np.array(data[['Creditability']])
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.4, random_state=1)
model = RandomForestClassifier(max_depth=8, random_state=1)
model.fit(x_train, y_train)
pre = model.predict(x_test)
model.score(x_test, y_test)

Applications of the random forest algorithm:

ex: banking, medicine, the stock market and e-commerce:

• In banking, the random forest algorithm is used to find loyal customers, meaning
customers who can take out plenty of loans and pay interest to the bank properly, and
fraud customers, meaning customers who have bad records like failure to pay back a loan
on time or other dangerous actions.

• In medicine, the random forest algorithm can be used both to identify the correct
combination of components in a medicine, and to identify diseases by analyzing the
patient's medical records.

• In the stock market, the random forest algorithm can be used to identify a stock's
behaviour and the expected loss or profit.
SUPPORT VECTOR MACHINES

Support vector machine (SVM): the SVM is a supervised learning algorithm.
It is mostly used for classification problems.
There are two types of classifiers:
1. Linear SVM classifier
2. Non-linear SVM classifier

Linear SVM: in a linear SVM, the data points are separated by an apparent gap.
It predicts a straight "hyperplane" dividing the 2 classes.
The hyperplane is drawn by maximizing the distance from the hyperplane to the nearest data point
of either class.
This hyperplane is called the "maximum margin hyperplane".

Non-linear SVM: in a non-linear SVM, the data points are plotted in a higher-dimensional space.
Here the kernel trick is used to find the maximum margin hyperplane.

What is the margin? The margin is defined as the distance between the separating hyperplane
(decision boundary) and the training samples closest to it.

Here H1 does not separate the classes. H2 does, but only with a small margin. H3 separates them
with the maximum margin.

1. Identify the right hyperplane (Scenario 1): here, we have three hyperplanes (A, B and C).
Now, identify the right hyperplane to classify star and circle.

Hyperplane "B" performs this job well.

2. Identify the right hyperplane (Scenario 2): here, we have three hyperplanes (A, B and C), and
all are segregating the classes well. Now, how can we identify the right hyperplane?

Here we find the maximum distance between the hyperplane and the nearest data points of each
class (the margin).

This margin is high for C as compared to both A and B. Hence, the right hyperplane is C.

3. Identify the right hyperplane (Scenario 3):

Here hyperplane B has a higher margin compared to A. But hyperplane B has a classification error,
and A has classified all points correctly. Therefore, the right hyperplane is A.

Since the SVM needs only the support vectors to classify any new data instance, it is quite efficient.
In other words, it uses a subset of the training points in the decision function (the support vectors),
so it is also memory efficient.

Maximum margin & decision boundary of the SVM:

The primary reason for having decision boundaries with large margins is that they tend to
have a lower generalization error, whereas models with small margins are more prone to
overfitting.

To get a better understanding of the margin maximization, take a closer look at the positive and
negative hyperplanes that are parallel to the decision boundary:

w·x_pos + b = 1      (positive hyperplane)
w·x_neg + b = -1     (negative hyperplane)

If we subtract the two linear equations from each other, we get:

w·(x_pos - x_neg) = 2

We can normalize this by the length of the vector w, which is defined as follows:

||w|| = sqrt(sum over j of w_j^2)

Then we have the following:

w·(x_pos - x_neg) / ||w|| = 2 / ||w||

The left side is the distance between the positive and negative hyperplanes, i.e. the margin we
want to maximize. In the more compact form, we maximize 2/||w|| subject to the samples being
classified correctly; in practice it is easier to minimize ||w||^2 / 2, which can be solved by
quadratic programming.

Slack variables for the not linearly separable case:

The reason for introducing the slack variable ξ is that the linear constraints need to be
relaxed for nonlinearly separable data, to allow convergence of the optimization in the presence
of misclassifications, under appropriate cost penalization.

The slack variable is simply added to the linear constraints:

w·x(i) + b >= 1 - ξ(i)    if y(i) = 1
w·x(i) + b <= -1 + ξ(i)   if y(i) = -1

So the new objective to be minimized becomes:

||w||^2 / 2 + C * (sum over i of ξ(i))

With the variable C, we can penalize misclassification.


Large values of C correspond to large error penalties, while we are less strict about
misclassification errors if we choose smaller values for C.

We can use the parameter C to control the width of the margin and therefore tune the
bias-variance trade-off.

Kernel method:
The idea of kernel methods is to deal with linearly inseparable data by creating nonlinear
combinations of the original features and projecting them onto a higher-dimensional space, via a
mapping function ϕ(), where the data becomes linearly separable.

We can transform a two-dimensional dataset onto a new three-dimensional feature space, where
the classes become separable, via the following projection:

ϕ(x1, x2) = (z1, z2, z3) = (x1, x2, x1^2 + x2^2)

The mapping allows us to separate the two classes via a linear hyperplane that becomes a
nonlinear decision boundary if we project it back (ϕ^-1) onto the original feature space.

Finding separating hyperplanes via the kernel trick:

To solve a nonlinear problem with an SVM (a code sketch follows this list):
1. We transform the training data onto a higher-dimensional feature space via a mapping
   function ϕ().
2. We train a linear SVM model to classify the data in this new feature space.
3. Then, we can use the same mapping function ϕ() to transform unseen data to classify it
   using the linear SVM model.
4. The kernel trick avoids the explicit mapping that is needed to get linear learning
   algorithms to learn a nonlinear function or decision boundary.
5. To train an SVM, in practice, all we need is to replace the dot
   product x(i)ᵀx(j) by ϕ(x(i))ᵀϕ(x(j)).
6. In order to avoid the expensive step of calculating this dot product between two points
   explicitly, we define a so-called kernel function:

   k(x(i), x(j)) = ϕ(x(i))ᵀϕ(x(j))

   One widely used example is the radial basis function (RBF) kernel,
   k(x(i), x(j)) = exp(-γ ||x(i) - x(j)||^2).
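
A short sketch of a nonlinear SVM with the RBF kernel in scikit-learn (the gamma and C values are assumed; X_train, Y_train, X_test, Y_test are the splits created in the iris example below):

from sklearn import svm

rbf_svc = svm.SVC(kernel='rbf', gamma=0.5, C=15)   # kernel trick: no explicit mapping needed
rbf_svc.fit(X_train, Y_train)
print(rbf_svc.score(X_test, Y_test))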
ex: iris data set

x = [sepal.length, sepal.width, petal.length, petal.width]

y = [species]

Sepal.Length Sepal.Width Petal.Length Petal.Width Species


1 5.1 3.5 1.4 0.2 setosa
2 4.9 3 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 versicolor
5 5 3.6 1.4 0.2 versicolor
6 5.4 3.9 1.7 0.4 setosa
7 4.6 3.4 1.4 0.3 setosa
8 5 3.4 1.5 0.2 virginica
9 4.4 2.9 1.4 0.2 virginica
10 4.9 3.1 1.5 0.1 setosa

SVM by using sklearn:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
from sklearn import svm

data = pd.read_csv("Iris.csv")
data.head()
# store the feature matrix (X) and response vector (Y)
X = np.array(data.iloc[:, 0:4])
Y = data["Species"]
# splitting X and Y into training and testing sets
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=4)
# training the model on the training set
svc = svm.SVC(kernel='linear', C=15)
svc.fit(X_train, Y_train)
predict = svc.predict(X_test)
predict
accuracy = svc.score(X_test, Y_test)
accuracy

K-MEANS CLUSTERING
K-MEANS CLUSTERING:

K-means clustering is a type of unsupervised learning, which is used when you have
unlabelled data. The goal of this algorithm is to find groups in the data, with the number of
groups represented by the variable K. The algorithm works iteratively to assign each data
point to one of K groups based on the features that are provided. Among the results are:

• the centroids of the K clusters, which can be used to label new data.

Algorithm:
• Assume we have inputs x1, x2, x3, ..., xn and a value of K.
• Step 1 - Pick K random points as cluster centers, called centroids.
• Step 2 - Assign each xi to the nearest cluster by calculating its distance to each centroid.
• Step 3 - Find the new cluster centers by taking the average of the assigned points.
• Step 4 - Repeat steps 2 and 3 until none of the cluster assignments change.

Step 1:
We randomly pick K cluster centres (centroids). Let's assume these are c1, c2, ..., ck, and we
can say that

C = {c1, c2, ..., ck}

is the set of all centroids.

Step 2:
In this step we assign each input value to its closest centre. This is done by calculating the
Euclidean (L2) distance between the point and each centroid:

assign x to arg min over ci in C of dist(ci, x)^2

where dist(.) is the Euclidean distance.

Step 3:
In this step, we find the new centroid by taking the average of all the points assigned to that
cluster:

ci = (1 / |Si|) * sum over xi in Si of xi

where Si is the set of all points assigned to the i-th cluster.



Step 4:
In this step, we repeat steps 2 and 3 until none of the cluster assignments change. That
means we repeat the algorithm until our clusters remain stable.

ex: cluster the one-dimensional points {2, 3, 4, 10, 11, 12, 20, 25, 30} (a sketch follows below).
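
A quick scikit-learn sketch of this example (three clusters are assumed, roughly {2, 3, 4}, {10, 11, 12} and {20, 25, 30}):

import numpy as np
from sklearn.cluster import KMeans

points = np.array([2, 3, 4, 10, 11, 12, 20, 25, 30]).reshape(-1, 1)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(points)
print(kmeans.labels_)            # cluster index assigned to each point
print(kmeans.cluster_centers_)   # centroids near 3, 11 and 25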

K-means is widely used for many applications:

• Image segmentation
• Clustering gene segmentation data
• News article clustering
• Clustering languages
• Species clustering
• Anomaly detection

Clustering algorithm without using sklearn:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'x': [12, 20, 28, 18, 29, 33, 24, 45, 45, 52, 51, 52, 55, 53, 55, 61, 64, 69, 72],
    'y': [39, 36, 30, 52, 54, 46, 55, 59, 63, 70, 66, 63, 58, 23, 14, 8, 19, 7, 24]
})
np.random.seed(200)
k = 3
# centroids[i] = [x, y]
centroids = {
    i + 1: [np.random.randint(0, 80), np.random.randint(0, 80)]
    for i in range(k)
}

fig = plt.figure(figsize=(5, 5))
plt.scatter(df['x'], df['y'], color='k')
colmap = {1: 'r', 2: 'g', 3: 'b'}
for i in centroids.keys():
    plt.scatter(*centroids[i], color=colmap[i])
plt.xlim(0, 80)
plt.ylim(0, 80)
plt.show()

def assignment(df, centroids):
    for i in centroids.keys():
        # sqrt((x1 - x2)^2 + (y1 - y2)^2)
        df['distance_from_{}'.format(i)] = (
            np.sqrt(
                (df['x'] - centroids[i][0]) ** 2
                + (df['y'] - centroids[i][1]) ** 2
            )
        )
    centroid_distance_cols = ['distance_from_{}'.format(i) for i in centroids.keys()]
    df['closest'] = df.loc[:, centroid_distance_cols].idxmin(axis=1)
    df['closest'] = df['closest'].map(lambda x: int(x.lstrip('distance_from_')))
    df['color'] = df['closest'].map(lambda x: colmap[x])
    return df

df = assignment(df, centroids)
print(df)

fig = plt.figure(figsize=(5, 5))
plt.scatter(df['x'], df['y'], color=df['color'], alpha=0.5, edgecolor='k')
for i in centroids.keys():
    plt.scatter(*centroids[i], color=colmap[i])
plt.xlim(0, 80)
plt.ylim(0, 80)
plt.show()

import copy
old_centroids = copy.deepcopy(centroids)

def update(centroids):
    # move each centroid to the mean of the points assigned to it
    for i in centroids.keys():
        centroids[i][0] = np.mean(df[df['closest'] == i]['x'])
        centroids[i][1] = np.mean(df[df['closest'] == i]['y'])
    return centroids

centroids = update(centroids)

fig = plt.figure(figsize=(5, 5))
ax = plt.axes()
plt.scatter(df['x'], df['y'], color=df['color'], alpha=0.5, edgecolor='k')
for i in centroids.keys():
    plt.scatter(*centroids[i], color=colmap[i])
plt.xlim(0, 80)
plt.ylim(0, 80)
# draw arrows from the old centroids to the new ones
for i in old_centroids.keys():
    old_x = old_centroids[i][0]
    old_y = old_centroids[i][1]
    dx = (centroids[i][0] - old_centroids[i][0]) * 0.75
    dy = (centroids[i][1] - old_centroids[i][1]) * 0.75
    ax.arrow(old_x, old_y, dx, dy, head_width=2, head_length=3, fc=colmap[i], ec=colmap[i])
plt.show()

df = assignment(df, centroids)
# Plot results
fig = plt.figure(figsize=(5, 5))
plt.scatter(df['x'], df['y'], color=df['color'], alpha=0.5, edgecolor='k')
for i in centroids.keys():
    plt.scatter(*centroids[i], color=colmap[i])
plt.xlim(0, 80)
plt.ylim(0, 80)
plt.show()

# repeat assignment and update until the cluster assignments stop changing
while True:
    closest_centroids = df['closest'].copy(deep=True)
    centroids = update(centroids)
    df = assignment(df, centroids)
    if closest_centroids.equals(df['closest']):
        break

fig = plt.figure(figsize=(5, 5))
plt.scatter(df['x'], df['y'], color=df['color'], alpha=0.5, edgecolor='k')
for i in centroids.keys():
    plt.scatter(*centroids[i], color=colmap[i])
plt.xlim(0, 80)
plt.ylim(0, 80)
plt.show()
