Machine Learning Lab
A) Linear Regression
This example uses only the first feature of the diabetes dataset, in order to illustrate this regression technique in a two-dimensional plot. The straight line in the plot shows how linear regression attempts to find the line that best minimizes the residual sum of squares between the observed responses in the dataset and the responses predicted by the linear approximation.
The coefficients, the residual sum of squares and the coefficient of determination are also
calculated.
Program:
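The program listing is not reproduced in this copy of the manual; the following is a minimal sketch of the standard scikit-learn diabetes example described above (the choice of feature column and the 20-sample hold-out are illustrative).

import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, r2_score

# Load the diabetes dataset and keep a single feature for a two-dimensional plot
diabetes_X, diabetes_y = datasets.load_diabetes(return_X_y=True)
diabetes_X = diabetes_X[:, np.newaxis, 0]

# Split the data and the targets into training and testing sets
diabetes_X_train, diabetes_X_test = diabetes_X[:-20], diabetes_X[-20:]
diabetes_y_train, diabetes_y_test = diabetes_y[:-20], diabetes_y[-20:]

# Fit the model and predict on the test set
regr = linear_model.LinearRegression()
regr.fit(diabetes_X_train, diabetes_y_train)
diabetes_y_pred = regr.predict(diabetes_X_test)

# Coefficients, residual sum of squares (reported as mean squared error)
# and coefficient of determination
print('Coefficients:', regr.coef_)
print('Mean squared error: %.2f' % mean_squared_error(diabetes_y_test, diabetes_y_pred))
print('Coefficient of determination: %.2f' % r2_score(diabetes_y_test, diabetes_y_pred))

# Plot the test points and the fitted straight line
plt.scatter(diabetes_X_test, diabetes_y_test, color='black')
plt.plot(diabetes_X_test, diabetes_y_pred, color='blue', linewidth=3)
plt.show()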
OUTPUT:
B) Logistic Regression
To start with a simple example, let’s say that your goal is to build a logistic regression model
in Python in order to determine whether candidates would get admitted to a prestigious
university.
Here, there are two possible outcomes: Admitted (represented by the value of ‘1’) vs.
Rejected (represented by the value of ‘0’).
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sn
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn import metrics

candidates = {'gmat': [780,750,690,710,680,730,690,720,740,690,610,690,710,680,770,610,580,650,540,590,620,600,550,550,570,670,660,580,650,660,640,620,660,660,680,650,670,580,590,690],
              'gpa': [4,3.9,3.3,3.7,3.9,3.7,2.3,3.3,3.3,1.7,2.7,3.7,3.7,3.3,3.3,3,2.7,3.7,2.7,2.3,3.3,2,2.3,2.7,3,3.3,3.7,2.3,3.7,3.3,3,2.7,4,3.3,3.3,2.3,2.7,3.3,1.7,3.7],
              'work_experience': [3,4,3,5,4,6,1,4,5,1,3,5,6,4,3,1,4,6,2,3,2,1,4,1,2,6,4,2,6,5,1,2,4,6,5,1,2,1,4,5],
              'admitted': [1,1,1,1,1,1,0,1,1,0,0,1,1,1,1,0,0,1,0,0,0,0,0,0,0,1,1,0,1,1,0,0,1,1,1,0,0,0,0,1]
              }
# Build the DataFrame from the dictionary
df = pd.DataFrame(candidates, columns=['gmat', 'gpa', 'work_experience', 'admitted'])
#print(df)
X = df[['gmat', 'gpa', 'work_experience']]
y = df['admitted']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
logistic_regression = LogisticRegression()
logistic_regression.fit(X_train, y_train)
y_pred = logistic_regression.predict(X_test)
# Confusion matrix of actual vs. predicted admissions on the test set
confusion_matrix = pd.crosstab(y_test, y_pred, rownames=['Actual'], colnames=['Predicted'])
sn.heatmap(confusion_matrix, annot=True)
plt.show()
From the confusion matrix shown in the heatmap:
TP = True Positives = 5
TN = True Negatives = 3
FP = False Positives = 2
FN = False Negatives = 0
The test records and the corresponding predictions can be printed with print(X_test) and print(y_pred).
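With these counts, the accuracy on the 10 test records is (TP + TN) / (TP + TN + FP + FN) = 8/10 = 0.8, which can be confirmed with the metrics module imported above:

print('Accuracy:', metrics.accuracy_score(y_test, y_pred))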
CODE:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

candidates = {'gmat': [780,750,690,710,680,730,690,720,740,690,610,690,710,680,770,610,580,650,540,590,620,600,550,550,570,670,660,580,650,660,640,620,660,660,680,650,670,580,590,690],
              'gpa': [4,3.9,3.3,3.7,3.9,3.7,2.3,3.3,3.3,1.7,2.7,3.7,3.7,3.3,3.3,3,2.7,3.7,2.7,2.3,3.3,2,2.3,2.7,3,3.3,3.7,2.3,3.7,3.3,3,2.7,4,3.3,3.3,2.3,2.7,3.3,1.7,3.7],
              'work_experience': [3,4,3,5,4,6,1,4,5,1,3,5,6,4,3,1,4,6,2,3,2,1,4,1,2,6,4,2,6,5,1,2,4,6,5,1,2,1,4,5],
              'admitted': [1,1,1,1,1,1,0,1,1,0,0,1,1,1,1,0,0,1,0,0,0,0,0,0,0,1,1,0,1,1,0,0,1,1,1,0,0,0,0,1]
              }
df = pd.DataFrame(candidates, columns=['gmat', 'gpa', 'work_experience', 'admitted'])
X = df[['gmat', 'gpa', 'work_experience']]
y = df['admitted']
# train is based on 75% of the dataset, test is based on 25% of the dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
logistic_regression = LogisticRegression()
logistic_regression.fit(X_train, y_train)
y_pred = logistic_regression.predict(X_test)
print(X_test)   # test dataset
print(y_pred)   # predicted values
The prediction was made for the 10 test records (where 1 = admitted and 0 = rejected). Comparing these against the actual labels in the dataset, the model got the correct result for 8 out of the 10 test records:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

candidates = {'gmat': [780,750,690,710,680,730,690,720,740,690,610,690,710,680,770,610,580,650,540,590,620,600,550,550,570,670,660,580,650,660,640,620,660,660,680,650,670,580,590,690],
              'gpa': [4,3.9,3.3,3.7,3.9,3.7,2.3,3.3,3.3,1.7,2.7,3.7,3.7,3.3,3.3,3,2.7,3.7,2.7,2.3,3.3,2,2.3,2.7,3,3.3,3.7,2.3,3.7,3.3,3,2.7,4,3.3,3.3,2.3,2.7,3.3,1.7,3.7],
              'work_experience': [3,4,3,5,4,6,1,4,5,1,3,5,6,4,3,1,4,6,2,3,2,1,4,1,2,6,4,2,6,5,1,2,4,6,5,1,2,1,4,5],
              'admitted': [1,1,1,1,1,1,0,1,1,0,0,1,1,1,1,0,0,1,0,0,0,0,0,0,0,1,1,0,1,1,0,0,1,1,1,0,0,0,0,1]
              }
df = pd.DataFrame(candidates, columns=['gmat', 'gpa', 'work_experience', 'admitted'])
X = df[['gmat', 'gpa', 'work_experience']]
y = df['admitted']
# As before, train on 75% of the dataset; in this case you could even train on the full
# dataset, and the predictions for the new candidates below should come out the same
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
logistic_regression = LogisticRegression()
logistic_regression.fit(X_train, y_train)
# A second DataFrame of new candidates to score; the original values are not shown in
# this copy of the manual, so the numbers below are illustrative
new_candidates = {'gmat': [590, 740, 680, 610, 710],
                  'gpa': [2, 3.7, 3.3, 2.3, 3],
                  'work_experience': [3, 4, 6, 1, 5]}
df2 = pd.DataFrame(new_candidates, columns=['gmat', 'gpa', 'work_experience'])
y_pred = logistic_regression.predict(df2)
print(df2)     # new candidates
print(y_pred)  # predicted admissions for the new candidates
C) Support Vector Machines (SVM)
As a motivating example of discriminative classification, consider the simple case of a classification task in which the two classes of points are well separated.
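The data-generation code is not shown here; a minimal sketch, assuming the data is produced with scikit-learn's make_blobs as in the usual example:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs

# Two well-separated clusters of points in two dimensions
X, y = make_blobs(n_samples=50, centers=2, random_state=0, cluster_std=0.60)
plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')
plt.show()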
A linear discriminative classifier would attempt to draw a straight line separating the two sets of data, and thereby create a model for classification. For two-dimensional data like that shown here, this is a task we could do by hand. But immediately we see a problem: there is more than one possible dividing line that can perfectly discriminate between the two classes!
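A sketch of this situation, continuing from the data generated above (the three candidate lines and the position of the "X" point are illustrative):

# Plot the data together with three candidate separating lines and a new point "X"
xfit = np.linspace(-1, 3.5)
plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')
plt.plot([0.6], [2.1], 'x', color='red', markeredgewidth=2, markersize=10)
for m, b in [(1, 0.65), (0.5, 1.6), (-0.2, 2.9)]:
    plt.plot(xfit, m * xfit + b, '-k')
plt.xlim(-1, 3.5)
plt.show()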
These are three very different separators which, nevertheless, perfectly discriminate between these samples. Depending on which you choose, a new data point (e.g., the one marked by the "X" in this plot) will be assigned a different label! Evidently our simple intuition of "drawing a line between classes" is not enough, and we need to think a bit deeper.

Support Vector Machines: Maximizing the Margin

Support vector machines offer one way to improve on this. The intuition is this: rather than simply drawing a zero-width line between the classes, we can draw around each line a margin of some width, up to the nearest point. Here is an example of how this might look:
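A rough sketch, continuing from the same data (the margin widths are illustrative):

xfit = np.linspace(-1, 3.5)
plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')
for m, b, d in [(1, 0.65, 0.33), (0.5, 1.6, 0.55), (-0.2, 2.9, 0.2)]:
    yfit = m * xfit + b
    plt.plot(xfit, yfit, '-k')
    # Shade a margin of half-width d around each candidate line
    plt.fill_between(xfit, yfit - d, yfit + d, edgecolor='none', color='#AAAAAA', alpha=0.4)
plt.xlim(-1, 3.5)
plt.show()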
In support vector machines, the line that maximizes this margin is the one we will choose as the optimal model. Support vector machines are an example of such a maximum margin estimator.

Fitting a support vector machine

Let's see the result of an actual fit to this data: we will use Scikit-Learn's support vector classifier to train an SVM model on this data. For the time being, we will use a linear kernel and set the C parameter to a very large number. To better visualize what's happening here, let's create a quick convenience function that will plot SVM decision boundaries for us:
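A minimal sketch of the fit and of such a plotting helper, continuing from the data above (plot_svc_decision_function below is a common convenience function, not part of Scikit-Learn itself):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC

# Fit a linear-kernel SVM with a very large C (an effectively hard margin)
model = SVC(kernel='linear', C=1E10)
model.fit(X, y)

def plot_svc_decision_function(model, ax=None, plot_support=True):
    # Plot the decision function, margins and support vectors of a fitted 2-D SVC
    if ax is None:
        ax = plt.gca()
    xlim = ax.get_xlim()
    ylim = ax.get_ylim()
    # Evaluate the decision function on a grid covering the current axes
    xg = np.linspace(xlim[0], xlim[1], 30)
    yg = np.linspace(ylim[0], ylim[1], 30)
    YY, XX = np.meshgrid(yg, xg)
    xy = np.vstack([XX.ravel(), YY.ravel()]).T
    P = model.decision_function(xy).reshape(XX.shape)
    # Decision boundary (level 0) and margins (levels -1 and +1)
    ax.contour(XX, YY, P, colors='k', levels=[-1, 0, 1],
               alpha=0.5, linestyles=['--', '-', '--'])
    if plot_support:
        ax.scatter(model.support_vectors_[:, 0], model.support_vectors_[:, 1],
                   s=300, linewidth=1, facecolors='none', edgecolors='k')
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)

plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')
plot_svc_decision_function(model)
plt.show()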
This is the dividing line that maximizes the margin between the two sets of points. Notice that a few of the training points just touch the margin: they are indicated by the black circles in this figure. These points are the pivotal elements of this fit, and are known as the support vectors; they give the algorithm its name. In Scikit-Learn, the identities of these points are stored in the support_vectors_ attribute of the classifier.
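For example, continuing from the fit above:

print(model.support_vectors_)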
A key to this classifier's success is that for the fit, only the positions of the support vectors matter; any points further from the margin which are on the correct side do not modify the fit!
Technically, this is because these points do not contribute to the loss function used to fit the
model, so their position and number do not matter so long as they do not cross the margin.
We can see this, for example, if we plot the model learned from the first 60 points and first
120 points of this dataset:
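A sketch of this comparison (it assumes a somewhat larger sample, e.g. 200 points, generated the same way as before, and reuses plot_svc_decision_function from above):

from sklearn.datasets import make_blobs
from sklearn.svm import SVC
import matplotlib.pyplot as plt

# A larger sample drawn the same way as before
X2, y2 = make_blobs(n_samples=200, centers=2, random_state=0, cluster_std=0.60)

def plot_svm(N, ax):
    # Fit and plot an SVM trained on only the first N points
    X_sub, y_sub = X2[:N], y2[:N]
    model = SVC(kernel='linear', C=1E10).fit(X_sub, y_sub)
    ax.scatter(X_sub[:, 0], X_sub[:, 1], c=y_sub, s=50, cmap='autumn')
    plot_svc_decision_function(model, ax)
    ax.set_title('N = {0}'.format(N))

fig, axes = plt.subplots(1, 2, figsize=(16, 6))
for ax_i, N in zip(axes, [60, 120]):
    plot_svm(N, ax_i)
plt.show()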
In the left panel, we see the model and the support vectors for 60 training points. In the right panel,
we have doubled the number of training points, but the model has not changed: the three support
vectors from the left panel are still the support vectors from the right panel. This insensitivity to the
exact behavior of distant points is one of the strengths of the SVM model.
Where SVM becomes extremely powerful is when it is combined with kernels. We have seen
a version of kernels before, in the basis function regressions of In Depth: Linear Regression.
There we projected our data into higher-dimensional space defined by polynomials and
Gaussian basis functions, and thereby were able to fit for nonlinear relationships with a linear
classifier.
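For instance, consider data that is not linearly separable, such as concentric circles; a sketch, assuming scikit-learn's make_circles is used to generate it:

from sklearn.datasets import make_circles
from sklearn.svm import SVC
import matplotlib.pyplot as plt

# Data in which one class forms a ring around the other
Xc, yc = make_circles(100, factor=0.1, noise=0.1, random_state=0)

clf = SVC(kernel='linear').fit(Xc, yc)
plt.scatter(Xc[:, 0], Xc[:, 1], c=yc, s=50, cmap='autumn')
plot_svc_decision_function(clf, plot_support=False)
plt.show()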
It is clear that no linear discrimination will ever be able to separate this data. But we can draw a
lesson from the basis function regressions in In Depth: Linear Regression, and think about how we
might project the data into a higher dimension such that a linear separator would be sufficient. For
example, one simple projection we could use would be to compute a radial basis function centered
on the middle clump.
We can visualize this extra data dimension using a three-dimensional plot:
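A sketch of the projection, continuing from the circles data above, with the radial basis function centered on the middle clump:

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d

# Radial basis function centered on the middle clump
r = np.exp(-(Xc ** 2).sum(1))

ax = plt.subplot(projection='3d')
ax.scatter3D(Xc[:, 0], Xc[:, 1], r, c=yc, s=50, cmap='autumn')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('r')
plt.show()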
We can see that with this additional dimension, the data becomes trivially linearly separable,
by drawing a separating plane at, say, r=0.7.
Here we had to choose and carefully tune our projection: if we had not centered our radial
basis function in the right location, we would not have seen such clean, linearly separable
results. In general, the need to make such a choice is a problem: we would like to somehow
automatically find the best basis functions to use.
One strategy to this end is to compute a basis function centered at every point in the dataset,
and let the SVM algorithm sift through the results. This type of basis function transformation
is known as a kernel transformation, as it is based on a similarity relationship (or kernel)
between each pair of points.
Projecting N points into N dimensions might seem computationally prohibitive, but thanks to a neat little procedure known as the kernel trick, a fit on kernel-transformed data can be done implicitly; that is, without ever building the full N-dimensional representation of the kernel projection. This kernel trick is built into the SVM, and is one of the reasons the method is so powerful.
In Scikit-Learn, we can apply kernelized SVM simply by changing our linear kernel to an
RBF (radial basis function) kernel, using the kernel model hyperparameter:
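A sketch, again on the circles data from above:

clf = SVC(kernel='rbf', C=1E6)
clf.fit(Xc, yc)

plt.scatter(Xc[:, 0], Xc[:, 1], c=yc, s=50, cmap='autumn')
plot_svc_decision_function(clf)
plt.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1],
            s=300, lw=1, facecolors='none', edgecolors='k')
plt.show()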
Tuning the SVM: Softening the Margin

So far we have assumed that the two classes can be separated perfectly; in real data the classes often overlap, and no perfect boundary exists. To handle this case, the SVM implementation has a bit of a fudge-factor which "softens" the margin: that is, it allows some of the points to creep into the margin if that allows a better fit. The hardness of the margin is controlled by a tuning parameter, most often known as C. For very large C, the margin is hard, and points cannot lie in it. For smaller C, the margin is softer, and can grow to encompass some points. The plot shown below gives a visual picture of how a changing C parameter affects the final fit, via the softening of the margin:
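A sketch comparing a large and a small C on overlapping data (the blob spread and the two C values are illustrative):

from sklearn.datasets import make_blobs
from sklearn.svm import SVC
import matplotlib.pyplot as plt

# Two overlapping classes
Xo, yo = make_blobs(n_samples=100, centers=2, random_state=0, cluster_std=1.2)

fig, axes = plt.subplots(1, 2, figsize=(16, 6))
for ax_i, C in zip(axes, [10.0, 0.1]):
    model = SVC(kernel='linear', C=C).fit(Xo, yo)
    ax_i.scatter(Xo[:, 0], Xo[:, 1], c=yo, s=50, cmap='autumn')
    plot_svc_decision_function(model, ax_i)
    ax_i.set_title('C = {0:.1f}'.format(C))
plt.show()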
Experiment-3:
Exploratory Data Analysis for Classification using Pandas and Matplotlib.
Exploratory Data Analysis (EDA) is a technique for analysing data using visual methods and summary statistics. We will learn how to apply these techniques before applying any machine learning models.
Loading Libraries:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from scipy.stats import trim_mean
Loading Data:
data = pd.read_csv("../input/statecsv/state.csv")
# Check the type of data
print ("Type : ", type(data), "\n\n")
# Printing Top 10 Records
print ("Head -- \n", data.head(10))
# Printing last 10 Records
print ("\n\n Tail -- \n", data.tail(10))
2: Data Description
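The code for this step is not shown in this copy; a minimal sketch:

# Summary statistics (count, mean, std, min, quartiles, max) for the numeric columns
print(data.describe())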
3 : Data Info
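Likewise, a minimal sketch:

# Column names, non-null counts, dtypes and memory usage
data.info()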
5 : Calculating Mean
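A sketch (trim_mean was imported above from scipy.stats; the column names follow the state.csv dataset, with the murder-rate column assumed to have been renamed to MurderRate):

# Plain mean and 10% trimmed mean of the state populations
print("Mean population:", data['Population'].mean())
print("Trimmed mean population:", trim_mean(data['Population'], 0.1))
# Population-weighted mean of the murder rate
print("Weighted mean murder rate:", np.average(data['MurderRate'], weights=data['Population']))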
6 : Median
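A sketch of the median step, under the same column-name assumptions:

print("Median population:", data['Population'].median())
print("Median murder rate:", data['MurderRate'].median())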
Bar plots of the murder rate by state follow (the first plot below is reconstructed, since its opening lines are missing from this copy; the plotted column is assumed to be Population):

fig, ax1 = plt.subplots(figsize=(15, 5))
ax1 = sns.barplot(x="State", y="Population", data=data.sort_values('MurderRate'), palette="Set2")
ax1.set(xlabel='States', ylabel='Population')
plt.xticks(rotation=-90)

fig, ax2 = plt.subplots(figsize=(15, 5))
ax2 = sns.barplot(x="State", y="MurderRate", data=data.sort_values('MurderRate', ascending=1), palette="husl")
ax2.set(xlabel='States', ylabel='Murder Rate per 100000')
ax2.set_title('Murder Rate by State', size=20)
plt.xticks(rotation=-90)
plt.show()
Experiment-6:
Write a program to demonstrate the working of the decision tree based
ID3 algorithm. Use an appropriate data set for building the decision
tree and apply this knowledge to classify a new sample.
import pandas as pd
df = pd.read_csv('../input/playtenniscsv/PlayTennis.csv')
print("\n Input Data Set is:\n", df)
t = df.keys()[-1]
print('Target Attribute is: ', t)
# Get the attribute names from input dataset
attribute_names = list(df.keys())
#Remove the target attribute from the attribute names list
attribute_names.remove(t)
print('Predicting Attributes: ', attribute_names)
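The entropy helper used by the fragment below is not reproduced in full in this copy; a minimal sketch of what it typically looks like (the name entropy_of_list matches the call made later):

import math
from collections import Counter

def entropy_of_list(ls, value=None):
    # Entropy of a list/Series of class labels; 'value' is only used for optional tracing
    cnt = Counter(x for x in ls)
    total = len(ls)
    probs = [c / total for c in cnt.values()]
    return sum(-p * math.log(p, 2) for p in probs)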
The listing continues inside an information-gain helper (the surrounding function definition is missing from this copy):

# Inside information_gain(df, split_attribute, target_attribute):
# df_split = df.groupby(split_attribute); glist = list of the group names
glist.reverse()
nobs = len(df.index) * 1.0
# Per-group entropy of the target attribute and per-group proportion of rows
df_agg1 = df_split.agg({target_attribute: lambda x: entropy_of_list(x, glist.pop())})
df_agg2 = df_split.agg({target_attribute: lambda x: len(x) / nobs})
df_agg1.columns = ['Entropy']
df_agg2.columns = ['Proportion']
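The remainder of the program (computing the information gain, building the tree recursively, and classifying a new sample, as the experiment statement asks) is also missing here; a minimal sketch of how it is usually completed, reusing entropy_of_list and Counter from above. The new-sample attribute names and values are illustrative and should match the columns of the CSV.

def information_gain(df, split_attribute, target_attribute):
    # Entropy of the target before the split minus the weighted entropy after it
    nobs = len(df.index) * 1.0
    new_entropy = 0.0
    for _, subset in df.groupby(split_attribute):
        new_entropy += (len(subset) / nobs) * entropy_of_list(subset[target_attribute])
    return entropy_of_list(df[target_attribute]) - new_entropy

def id3(df, target_attribute, attribute_names, default_class=None):
    # Stop if all remaining examples share one class, or nothing is left to split on
    cnt = Counter(df[target_attribute])
    if len(cnt) == 1:
        return next(iter(cnt))
    if df.empty or not attribute_names:
        return default_class
    default_class = max(cnt, key=cnt.get)
    # Choose the attribute with the highest information gain and recurse on each value
    gains = [information_gain(df, attr, target_attribute) for attr in attribute_names]
    best_attr = attribute_names[gains.index(max(gains))]
    tree = {best_attr: {}}
    remaining = [a for a in attribute_names if a != best_attr]
    for attr_val, subset in df.groupby(best_attr):
        tree[best_attr][attr_val] = id3(subset, target_attribute, remaining, default_class)
    return tree

def classify(instance, tree, default=None):
    # Walk the tree until a leaf (class label) is reached
    attribute = next(iter(tree))
    if instance[attribute] in tree[attribute]:
        result = tree[attribute][instance[attribute]]
        if isinstance(result, dict):
            return classify(instance, result, default)
        return result
    return default

tree = id3(df, t, attribute_names)
print('The resulting decision tree is:\n', tree)
# Classify a new sample (attribute names/values are illustrative; use your CSV's columns)
new_sample = {'Outlook': 'Sunny', 'Temperature': 'Cool', 'Humidity': 'High', 'Wind': 'Strong'}
print('Predicted class for the new sample:', classify(new_sample, tree))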
OUTPUT:
Experiment-7:
Build an Artificial Neural Network by implementing the Back propagation
algorithm and test the same using appropriate data sets.
Program:
from math import exp
from random import seed, random
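The network-initialization and activation helpers that the following fragments call (initialize_network, activate, transfer, transfer_derivative) are not reproduced in this copy; a minimal sketch of the usual definitions:

# Initialize a network: a list of layers, each layer a list of neurons,
# each neuron a dict holding its weights (the last weight acts as the bias)
def initialize_network(n_inputs, n_hidden, n_outputs):
    network = list()
    hidden_layer = [{'weights': [random() for _ in range(n_inputs + 1)]} for _ in range(n_hidden)]
    network.append(hidden_layer)
    output_layer = [{'weights': [random() for _ in range(n_hidden + 1)]} for _ in range(n_outputs)]
    network.append(output_layer)
    return network

# Weighted sum of the inputs plus the bias
def activate(weights, inputs):
    activation = weights[-1]
    for i in range(len(weights) - 1):
        activation += weights[i] * inputs[i]
    return activation

# Sigmoid transfer function and its derivative (expressed in terms of the output)
def transfer(activation):
    return 1.0 / (1.0 + exp(-activation))

def transfer_derivative(output):
    return output * (1.0 - output)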
# Forward-propagate an input row through the network, layer by layer
def forward_propagate(network, row):
    inputs = row
    for layer in network:
        new_inputs = []
        for neuron in layer:
            activation = activate(neuron['weights'], inputs)
            neuron['output'] = transfer(activation)
            new_inputs.append(neuron['output'])
        inputs = new_inputs
    return inputs
# Back-propagate the error and store a 'delta' in each neuron
def backward_propagate_error(network, expected):
    for i in reversed(range(len(network))):
        layer = network[i]
        errors = list()
        if i != len(network) - 1:
            for j in range(len(layer)):
                error = 0.0
                for neuron in network[i + 1]:
                    error += (neuron['weights'][j] * neuron['delta'])
                errors.append(error)
        else:
            for j in range(len(layer)):
                neuron = layer[j]
                errors.append(expected[j] - neuron['output'])
        for j in range(len(layer)):
            neuron = layer[j]
            neuron['delta'] = errors[j] * transfer_derivative(neuron['output'])
# The dataset shown in the OUTPUT below: [x1, x2, class label]
dataset = [[2.7810836, 2.550537003, 0], [1.465489372, 2.362125076, 0],
           [3.396561688, 4.400293529, 0], [1.38807019, 1.850220317, 0],
           [3.06407232, 3.005305973, 0], [7.627531214, 2.759262235, 1],
           [5.332441248, 2.088626775, 1], [6.922596716, 1.77106367, 1],
           [8.675418651, -0.242068655, 1], [7.673756466, 3.508563011, 1]]
n_inputs = len(dataset[0]) - 1
n_outputs = len(set(row[-1] for row in dataset))
print("The input Data Set :\n", dataset)
print("Number of Inputs :\n", n_inputs)
print("Number of Outputs :\n", n_outputs)
# Network Initialization
network = initialize_network(n_inputs, 2, n_outputs)
i = 1
for layer in network:
    j = 1
    for sub in layer:
        print("\n Layer[%d] Node[%d]:\n" % (i, j), sub)
        j = j + 1
    i = i + 1
OUTPUT:
The input Data Set :
[[2.7810836, 2.550537003, 0], [1.465489372, 2.362125076, 0], [3.396561688,
4.400293529, 0], [1.38807019, 1.850220317, 0], [3.06407232, 3.005305973,
0], [7.627531214, 2.759262235, 1], [5.332441248, 2.088626775, 1],
[6.922596716, 1.77106367, 1], [8.675418651, -0.242068655, 1], [7.673756466,
3.508563011, 1]]
Number of Inputs :
2
Number of Outputs :
2
Layer[1] Node[1]:
{'weights': [0.4560342718892494, 0.4478274870593494, -0.4434486322731913]}
Layer[1] Node[2]:
{'weights': [-0.41512800484107837, 0.33549887812944956, 0.2359699890685233]}
Layer[2] Node[1]:
{'weights': [0.1697304014402209, -0.1918635424108558, 0.10594416567846243]}
Layer[2] Node[2]:
{'weights': [0.10680173364083789, 0.08120401711200309, -0.3416171297451944]}
Layer[1] Node[1]:
{'weights': [0.8642508164347665, -0.8497601716670763, -0.8668929014392035], 'output': 0.9295587965836384, 'delta': 0.005645382825629247}
Layer[1] Node[2]:
{'weights': [-1.2934302410111025, 1.7109363237151507, 0.7125327507327329], 'output': 0.04760703296164151, 'delta': -0.005928559978815076}
Layer[2] Node[1]:
{'weights': [-1.3098359335096292, 2.16462207144596, -0.3079052288835876], 'output': 0.19895563952058462, 'delta': -0.03170801648036037}
Layer[2] Node[2]:
{'weights': [1.5506793402414165, -2.11315950446121, 0.1333585709422027], 'output': 0.8095042653312078, 'delta': 0.029375796661413225}
Predict
Making predictions with a trained neural network is easy enough. We have already seen how
to forward-propagate an input pattern to get an output. This is all we need to do to make a
prediction. We can use the output values themselves directly as the probability of a pattern
belonging to each output class. It may be more useful to turn this output back into a crisp
class prediction. We can do this by selecting the class value with the larger probability. This
is also called the arg max function. Below is a function named predict() that implements this
procedure. It returns the index in the network output that has the largest probability. It
assumes that class values have been converted to integers starting at 0.
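The predict() listing itself is missing here; a minimal sketch matching the description above (the training loop, train_network, is also not reproduced in this copy, so the example usage assumes the network has already been trained):

# Make a prediction with the network: forward-propagate and take the arg max
def predict(network, row):
    outputs = forward_propagate(network, row)
    return outputs.index(max(outputs))

# Example usage: predicted vs. expected class for each row of the dataset
for row in dataset:
    prediction = predict(network, row)
    print('Expected=%d, Got=%d' % (row[-1], prediction))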
OUTPUT:
Experiment-10:
Apply EM algorithm to cluster a Heart Disease Data Set. Use the same data
set for clustering using kMeans algorithm. Compare the results of these two
algorithms and comment on the quality of clustering. You can add
Java/Python ML library classes/API in the program.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import datasets

iris = datasets.load_iris()

# Store the inputs as a Pandas DataFrame and set the column names
X = pd.DataFrame(iris.data)
#print(X)
X.columns = ['Sepal_Length', 'Sepal_Width', 'Petal_Length', 'Petal_Width']
#print(X.columns)
y = pd.DataFrame(iris.target)
y.columns = ['Targets']
#print("X:", X)
#print("Y:", y)

# Create a colormap
colormap = np.array(['red', 'lime', 'black'])

# Plot Sepal
plt.subplot(1, 2, 1)
plt.scatter(X.Sepal_Length, X.Sepal_Width, c=colormap[y.Targets], s=40)
plt.title('Sepal')
plt.subplot(1, 2, 2)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y.Targets], s=40)
plt.title('Petal')
plt.show()
# Create a colormap
colormap = np.array(['red', 'lime', 'black'])
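The K-Means model-fitting and plotting code around this fragment appears to be missing; a minimal sketch of how the clustering is typically done and visualised, continuing from the data loaded above:

from sklearn.cluster import KMeans

# Fit K-Means with three clusters on the four iris features
model = KMeans(n_clusters=3, random_state=0)
model.fit(X)

# Plot the real classes next to the K-Means cluster assignments (petal features)
plt.figure(figsize=(14, 7))
plt.subplot(1, 2, 1)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y.Targets], s=40)
plt.title('Real Classification')
plt.subplot(1, 2, 2)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[model.labels_], s=40)
plt.title('K-Means Classification')
plt.show()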
The Fix: K-Means assigns arbitrary cluster numbers, so the cluster labels are re-mapped to match the actual class numbering before re-plotting and scoring.
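A sketch of the usual re-mapping (the permutation [0, 1, 2] below depends on which number K-Means happened to give each cluster, so it may need adjusting):

# Re-map the arbitrary cluster numbers to the class numbering used by the dataset
predY = np.choose(model.labels_, [0, 1, 2]).astype(np.int64)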
Re-plot
# Create a colormap
colormap = np.array(['red', 'lime', 'black'])
# Plot the Original classification alongside the re-mapped K-Means result
plt.subplot(1, 2, 1)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y.Targets], s=40)
plt.title('Real Classification')
plt.subplot(1, 2, 2)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[predY], s=40)
plt.title('K-Means Classification')
plt.show()
Performance Measures
Accuracy

from sklearn import metrics as sm
# Performance metric: accuracy of the re-mapped K-Means labels against the true classes
print(sm.accuracy_score(y, predY))
Confusion Matrix
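For example:

print(sm.confusion_matrix(y, predY))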
GMM:
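The scaling and model-fitting code that produces gmm and xs below is missing from this copy; a minimal sketch of the usual steps:

from sklearn import preprocessing
from sklearn.mixture import GaussianMixture

# Standardise the features before fitting the Gaussian mixture (EM)
scaler = preprocessing.StandardScaler()
scaler.fit(X)
xsa = scaler.transform(X)
xs = pd.DataFrame(xsa, columns=X.columns)

# EM clustering with three Gaussian components
gmm = GaussianMixture(n_components=3, random_state=0)
gmm.fit(xs)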
y_cluster_gmm = gmm.predict(xs)
y_cluster_gmm
plt.subplot(1, 2, 1)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y_cluster_gmm], s=40)
plt.title('GMM Classification')
Experiment-11:
Write a program to implement k-Nearest Neighbor algorithm to classify
the iris data set. Print both correct and wrong predictions.
Concept: KNN works on similarity measurements. For example, a mango is more similar to an apple than to a dog or a cat, so KNN will put it in the category of fruits rather than animals.
What is K in KNN?
In KNN, we train the model and then we want to classify new data (test data). To do that, we look at some number (K) of neighbouring examples around the new point and assign it the most common class among them.
K = 1 means the test point is given the same label as the closest example in the training set.
K = 4 means the labels of the four closest examples are checked and the most common class is assigned to the test point.
1. In this diagram we have two classes: one blue class and one red class.
2. Now we have a new green point, and we have to find out whether this point belongs to the red class or the blue class.
3. For this, we will define the value of K.
4. At K = 1, we look at the distance from the green point to the nearest points, select the point with the lowest distance, and classify the green point into that class; here it is red.
5. At K = 5, we calculate the distance from the green point to the nearest points, select the five points with the lowest distances, and classify the green point into the most common class among them, which is red here.
6. How do we choose the value of K? The value of K is not fixed; it depends on the case.
Lazy Learner
1. KNN is a simple classification algorithm, but that is not why it is called lazy.
2. KNN is a lazy learner because it does not learn a discriminative function from the training data but memorizes the training dataset instead.
KNN Algorithm
Let's understand the KNN algorithm with the iris flower problem.
Data: The dataset consists of 150 instances (samples), 4 features, and three classes (targets).
Problem: Using the four features, we have to classify which flower belongs to which category.
Importing Data-set
import sklearn
import pandas as pd
from sklearn.datasets import load_iris
iris=load_iris()
iris.keys()
df=pd.DataFrame(iris['data'])
print(df)
print(iris['target_names'])
iris['feature_names']
NOTE:
1. Now we need a target and data so that we can train the model.
2. We have to predict the class from the features we have.
3. With this logic, our target is the classes (0, 1, 2) and the data is in df.
Splitting Data
1. The data is split so that we can train the model on one part and test it on the remaining part to check how well the model performs.
2. To do this we have an inbuilt function in sklearn, train_test_split, as sketched below.
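A minimal sketch of the split, the KNN fit, and the printing of correct and wrong predictions (K = 3 and the 70/30 split are illustrative choices):

from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Features and targets from the iris data loaded above
X = iris['data']
y = iris['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Fit a K-Nearest Neighbour classifier
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

# Print both correct and wrong predictions, as the experiment statement asks
for sample, actual, predicted in zip(X_test, y_test, y_pred):
    label = 'Correct' if actual == predicted else 'Wrong'
    print(label, '- Actual:', iris['target_names'][actual],
          'Predicted:', iris['target_names'][predicted])
print('Accuracy:', knn.score(X_test, y_test))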
OUTPUT: