0% found this document useful (0 votes)

6 views56 pages

Machine Learning Lab

The document is a lab manual for machine learning experiments, detailing exercises on Linear and Logistic Regression. It includes practical examples using Python code to demonstrate how to implement these methods, analyze results, and calculate accuracy. Additionally, it introduces Support Vector Machines (SVM) and discusses their advantages, including the concept of maximizing margins and using kernel functions for non-linear classification.

Uploaded by

damarlanaganandini

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

6 views56 pages

Machine Learning Lab

Uploaded by

damarlanaganandini

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 56

Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

Experiment-1: Exercises to solve the real-world problems using the

following machine learning methods:
a) Linear Regression
b) Logistic Regression.

A) Linear Regression

This example uses the only the first feature of the diabetes dataset, in order to illustrate a
two-dimensional plot of this regression technique. The straight line can be seen in the plot,
showing how linear regression attempts to draw a straight line that will best minimize the
residual sum of squares between the observed responses in the dataset, and the responses
predicted by the linear approximation.

The coefficients, the residual sum of squares and the coefficient of determination are also
calculated.

Program:

1
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

OUTPUT:

2
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

b) Logistic Regression.

Step 1: Gather your data

To start with a simple example, let’s say that your goal is to build a logistic regression model
in Python in order to determine whether candidates would get admitted to a prestigious
university.

Here, there are two possible outcomes: Admitted (represented by the value of ‘1’) vs.
Rejected (represented by the value of ‘0’).

You can then build a logistic regression in Python, where:

∑ The dependent variable represents whether a person gets admitted; and

∑ The 3 independent variables are the GMAT score, GPA and Years of work experience

This is how the dataset would look like:

3
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

4
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn import metrics
import seaborn as sn

candidates = {'gmat':
[780,750,690,710,680,730,690,720,740,690,610,690,710,680,770,610,580,650,540,590,620
,600,550,550,570,670,660,580,650,660,640,620,660,660,680,650,670,580,590,690],
'gpa':
[4,3.9,3.3,3.7,3.9,3.7,2.3,3.3,3.3,1.7,2.7,3.7,3.7,3.3,3.3,3,2.7,3.7,2.7,2.3,3.3,2,2.3,2.7,3,3.3,
3.7,2.3,3.7,3.3,3,2.7,4,3.3,3.3,2.3,2.7,3.3,1.7,3.7],
'work_experience':
[3,4,3,5,4,6,1,4,5,1,3,5,6,4,3,1,4,6,2,3,2,1,4,1,2,6,4,2,6,5,1,2,4,6,5,1,2,1,4,5],
'admitted':
[1,1,1,1,1,1,0,1,1,0,0,1,1,1,1,0,0,1,0,0,0,0,0,0,0,1,1,0,1,1,0,0,1,1,1,0,0,0,0,1]
}

df = pd.DataFrame(candidates,columns= ['gmat', 'gpa','work_experience','admitted'])

#print (df)
X = df[['gmat', 'gpa','work_experience']]
y = df['admitted']
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.25,random_state=0)
logistic_regression= LogisticRegression()
logistic_regression.fit(X_train,y_train)
y_pred=logistic_regression.predict(X_test)
confusion_matrix = pd.crosstab(y_test, y_pred, rownames=['Actual'],
colnames=['Predicted'])
sn.heatmap(confusion_matrix, annot=True)

print ('Accuracy: ',metrics.accuracy_score(y_test, y_pred))

5
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

As can be observed from the matrix:

∑ TP = True Positives = 5
∑ TN = True Negatives = 3
∑ FP = False Positives = 2
∑ FN = False Negatives = 0

You can then also get the Accuracy using:

Accuracy = (TP+TN)/Total = (5+3)/10 = 0.8

The accuracy is therefore 80% for the test set.

Diving Deeper into the Results

Let’s now print two components in the python code:

∑ print (X_test)
∑ print (y_pred)

CODE:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

6
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

'work_experience':
[3,4,3,5,4,6,1,4,5,1,3,5,6,4,3,1,4,6,2,3,2,1,4,1,2,6,4,2,6,5,1,2,4,6,5,1,2,1,4,5],
'admitted':
[1,1,1,1,1,1,0,1,1,0,0,1,1,1,1,0,0,1,0,0,0,0,0,0,0,1,1,0,1,1,0,0,1,1,1,0,0,0,0,1]
}
df = pd.DataFrame(candidates,columns= ['gmat', 'gpa','work_experience','admitted'])
X = df[['gmat', 'gpa','work_experience']]
y = df['admitted']
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.25,random_state=0) #train is
based on 75% of the dataset, test is based on 25% of dataset
logistic_regression= LogisticRegression()
logistic_regression.fit(X_train,y_train)
y_pred=logistic_regression.predict(X_test)
print (X_test) #test dataset
print (y_pred) #predicted values

The prediction was also made for those 10 records (where 1 = admitted, while 0 = rejected):

In the actual dataset (from step-1), you’ll see that for the test data, we got the correct results 8
out of 10 times:

7
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

This is matching with the accuracy level of 80%

Checking the Prediction for a New Set of Data

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

candidates = {'gmat':
[780,750,690,710,680,730,690,720,740,690,610,690,710,680,770,610,580,650,540,590,620,
600,550,550,570,670,660,580,650,660,640,620,660,660,680,650,670,580,590,690],
'gpa':
[4,3.9,3.3,3.7,3.9,3.7,2.3,3.3,3.3,1.7,2.7,3.7,3.7,3.3,3.3,3,2.7,3.7,2.7,2.3,3.3,2,2.3,2.7,3,3.3,3.
7,2.3,3.7,3.3,3,2.7,4,3.3,3.3,2.3,2.7,3.3,1.7,3.7],
'work_experience':
[3,4,3,5,4,6,1,4,5,1,3,5,6,4,3,1,4,6,2,3,2,1,4,1,2,6,4,2,6,5,1,2,4,6,5,1,2,1,4,5],

8
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

'admitted':
[1,1,1,1,1,1,0,1,1,0,0,1,1,1,1,0,0,1,0,0,0,0,0,0,0,1,1,0,1,1,0,0,1,1,1,0,0,0,0,1]
}

df = pd.DataFrame(candidates,columns= ['gmat', 'gpa','work_experience','admitted'])

X = df[['gmat', 'gpa','work_experience']]
y = df['admitted']
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.25,random_state=0) #in this
case, you may choose to set the test_size=0. You should get the same prediction here
logistic_regression= LogisticRegression()
logistic_regression.fit(X_train,y_train)

new_candidates = {'gmat': [590,740,680,610,710],

'gpa': [2,3.7,3.3,2.3,3],
'work_experience': [3,4,6,1,5]
}

df2 = pd.DataFrame(new_candidates,columns= ['gmat', 'gpa','work_experience'])

y_pred=logistic_regression.predict(df2)

print (df2)
print (y_pred)

9
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

Experiment-2: Write a program to Implement Support Vector Machines.

Step 1: We begin with the standard imports

As an example of this, consider the simple case of a classification task, in which the two classes of
points are well separated

10
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

A linear discriminative classifier would attempt to draw a straight line separating the two sets
of data, and thereby create a model for classification. For two dimensional data like that
shown here, this is a task we could do by hand. But immediately we see a problem: there is
more than one possible dividing line that can perfectly discriminate between the two classes!

We can draw them as follows:

11
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

These are three very different separators which, nevertheless, perfectly discriminate between
these samples. Depending on which you choose, a new data point (e.g., the one marked by
the "X" in this plot) will be assigned a different label! Evidently our simple intuition of
"drawing a line between classes" is not enough, and we need to think a bit deeper. Support
Vector Machines: Maximizing the Margin

Support vector machines offer one way to improve on this. The intuition is this: rather than
simply drawing a zero-width line between the classes, we can draw around each line a margin
of some width, up to the nearest point. Here is an example of how this might look

12
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

In support vector machines, the line that maximizes this margin is the one we will choose as
the optimal model. Support vector machines are an example of such a maximum margin
estimator. Fitting a support vector machine

Let's see the result of an actual fit to this data: we will use Scikit-Learn's support vector
classifier to train an SVM model on this data. For the time being, we will use a linear kernel
and set the C parameter to a very large number.

To better visualize what's happening here, let's create a quick convenience function that will
plot SVM decision boundaries for us

13
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

This is the dividing line that maximizes the margin between the two sets of points. Notice that a few
of the training points just touch the margin: they are indicated by the black circles in this figure.
These points are the pivotal elements of this fit, and are known as the support vectors, and give the
14
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

algorithm its name. In Scikit-Learn, the identity of these points are stored in the support vectors_
attribute of the classifier

A key to this classifier's success is that for the fit, only the position of the support vectors
matter; any points further from the margin which are on the correct side do not modify the fit!
Technically, this is because these points do not contribute to the loss function used to fit the
model, so their position and number do not matter so long as they do not cross the margin.

We can see this, for example, if we plot the model learned from the first 60 points and first
120 points of this dataset:

15
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

In the left panel, we see the model and the support vectors for 60 training points. In the right panel,
we have doubled the number of training points, but the model has not changed: the three support
vectors from the left panel are still the support vectors from the right panel. This insensitivity to the
exact behavior of distant points is one of the strengths of the SVM model.

Beyond linear boundaries: Kernel SVM

Where SVM becomes extremely powerful is when it is combined with kernels. We have seen
a version of kernels before, in the basis function regressions of In Depth: Linear Regression.
There we projected our data into higher-dimensional space defined by polynomials and
Gaussian basis functions, and thereby were able to fit for nonlinear relationships with a linear
classifier.
16
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

It is clear that no linear discrimination will ever be able to separate this data. But we can draw a
lesson from the basis function regressions in In Depth: Linear Regression, and think about how we
might project the data into a higher dimension such that a linear separator would be sufficient. For
example, one simple projection we could use would be to compute a radial basis function centered
on the middle clump.

We can visualize this extra data dimension using a three-dimensional plot—if you are running this
notebook live, you will be able to use the sliders to rotate the plot

17
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

We can see that with this additional dimension, the data becomes trivially linearly separable,
by drawing a separating plane at, say, r=0.7.

Here we had to choose and carefully tune our projection: if we had not centered our radial
basis function in the right location, we would not have seen such clean, linearly separable
results. In general, the need to make such a choice is a problem: we would like to somehow
automatically find the best basis functions to use.

One strategy to this end is to compute a basis function centered at every point in the dataset,
and let the SVM algorithm sift through the results. This type of basis function transformation
is known as a kernel transformation, as it is based on a similarity relationship (or kernel)
between each pair of points.

A potential problem with this strategy—projecting N points into N dimensions—is that it

might become very computationally intensive as N grows large. However, because of a neat

18
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

little procedure known as the kernel trick, a fit on kernel-transformed data can be done
implicitly—that is, without ever building the full N

-dimensional representation of the kernel projection! This kernel trick is built into the SVM,
and is one of the reasons the method is so powerful.

In Scikit-Learn, we can apply kernelized SVM simply by changing our linear kernel to an
RBF (radial basis function) kernel, using the kernel model hyperparameter:

Tuning the SVM: Softening Margins

19
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

To handle this case, the SVM implementation has a bit of a fudge-factor which "softens" the
margin: that is, it allows some of the points to creep into the margin if that allows a better fit.
The hardness of the margin is controlled by a tuning parameter, most often known as C. For
very large C, the margin is hard, and points cannot lie in it. For smaller C, the margin is
softer, and can grow to encompass some points.

The plot shown below gives a visual picture of how a changing C parameter affects the final
fit, via the softening of the margin:

20
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

21
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

Experiment-3:
Exploratory Data Analysis for Classification using Pandas and Matplotlib.

Exploratory Data Analysis is a technique to analyse data with visual techniques and all statistical
results. We will learn about how to apply these techniques before applying any Machine Learning
Models.

Loading Libraries:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from scipy.stats import trim_mean

Loading Data:
data = pd.read_csv("../input/statecsv/state.csv")
# Check the type of data
print ("Type : ", type(data), "\n\n")
# Printing Top 10 Records
print ("Head -- \n", data.head(10))
# Printing last 10 Records
print ("\n\n Tail -- \n", data.tail(10))

22
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

1 : Adding Column to the dataframe

# Adding a new column with derived data
data['PopulationInMillions'] = data['Population']/1000000
# Changed data
print (data.head(5))

23
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

2: Data Description

3 : Data Info

24
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

4 : Renaming a column heading:

5 : Calculating Mean

25
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

6 : Median

7. Visualizing Population per Million

# Plot Population In Millions

fig, ax1 = plt.subplots()
fig.set_size_inches(15, 9)

ax1 = sns.barplot(x ="State", y ="Population",

26
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

data = data.sort_values('MurderRate'),
palette ="Set2")

ax1.set(xlabel ='States', ylabel ='Population In Millions')

ax1.set_title('Population in Millions by State', size = 20)

plt.xticks(rotation =-90)

8. Visualizing Murder Rate per Lakh

# Plot Murder Rate per 1, 00, 000
27
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

fig, ax2 = plt.subplots()

fig.set_size_inches(15, 9)

ax2 = sns.barplot(
x ="State", y ="MurderRate",
data = data.sort_values('MurderRate', ascending = 1),
palette ="husl")
ax2.set(xlabel ='States', ylabel ='Murder Rate per 100000')
ax2.set_title('Murder Rate by State', size = 20)

plt.xticks(rotation =-90)

28
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

Experiment-6:
Write a program to demonstrate the working of the decision tree based
ID3 algorithm. Use an appropriate data set for building the decision
tree and apply this knowledge to classify a new sample.

import pandas as pd
df = pd.read_csv('../input/playtenniscsv/PlayTennis.csv')
print("\n Input Data Set is:\n", df)

29
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

t = df.keys()[-1]
print('Target Attribute is: ', t)
# Get the attribute names from input dataset
attribute_names = list(df.keys())
#Remove the target attribute from the attribute names list
attribute_names.remove(t)
print('Predicting Attributes: ', attribute_names)

#Function to calculate the entropy of collection S

import math
def entropy(probs):
return sum( [-prob*math.log(prob, 2) for prob in probs])

#Function to calulate the entropy of the given Data Sets/List with

#respect to target attributes
def entropy_of_list(ls,value):
from collections import Counter
cnt = Counter(x for x in ls)# Counter calculates the propotion of class
print('Target attribute class count(Yes/No)=',dict(cnt))
total_instances = len(ls)
print("Total no of instances/records associated with {0} is:
{1}".format(value,total_instances ))

30
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

probs = [x / total_instances for x in cnt.values()] # x means no of

YES/NO
print("Probability of Class {0} is: {1:.4f}".format(min(cnt),min(probs)))
print("Probability of Class {0} is: {1:.4f}".format(max(cnt),max(probs)))
return entropy(probs) # Call Entropy

def information_gain(df, split_attribute, target_attribute,battr):

print("\n\n-----Information Gain Calculation of ",split_attribute, " --------")
df_split = df.groupby(split_attribute) # group the data based on attribute
values
glist=[]
for gname,group in df_split:
print('Grouped Attribute Values \n',group)
glist.append(gname)

glist.reverse()
nobs = len(df.index) * 1.0
df_agg1=df_split.agg({target_attribute:lambda x:entropy_of_list(x,
glist.pop())})
df_agg2=df_split.agg({target_attribute :lambda x:len(x)/nobs})

df_agg1.columns=['Entropy']
df_agg2.columns=['Proportion']

# Calculate Information Gain:

new_entropy = sum( df_agg1['Entropy'] * df_agg2['Proportion'])
if battr !='S':
old_entropy = entropy_of_list(df[target_attribute],'S-
'+df.iloc[0][df.columns.get_loc(battr)])
else:
old_entropy = entropy_of_list(df[target_attribute],battr)
return old_entropy - new_entropy

def id3(df, target_attribute, attribute_names,

default_class=None,default_attr='S'):

from collections import Counter

cnt = Counter(x for x in df[target_attribute])# class of YES /NO

## First check: Is this split of the dataset homogeneous?

if len(cnt) == 1:
31
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

return next(iter(cnt)) # next input data set, or raises StopIteration

when EOF is hit.

## Second check: Is this split of the dataset empty? if yes, return a

default value
elif df.empty or (not attribute_names):
return default_class # Return None for Empty Data Set

## Otherwise: This dataset is ready to be devied up!

else:
# Get Default Value for next recursive call of this function:
default_class = max(cnt.keys()) #No of YES and NO Class
# Compute the Information Gain of the attributes:
gainz=[]
for attr in attribute_names:
ig= information_gain(df, attr, target_attribute,default_attr)
gainz.append(ig)
print('Information gain of ',attr,' is : ',ig)

index_of_max = gainz.index(max(gainz)) # Index of Best

Attribute
best_attr = attribute_names[index_of_max] # Choose Best
Attribute to split on
print("\nAttribute with the maximum gain is: ", best_attr)
# Create an empty tree, to be populated in a moment
tree = {best_attr:{}} # Initiate the tree with best attribute as a node
remaining_attribute_names =[i for i in attribute_names if i != best_attr]

# Split dataset-On each split, recursively call this algorithm.Populate

the empty tree with subtrees, which
# are the result of the recursive call
for attr_val, data_subset in df.groupby(best_attr):
subtree = id3(data_subset,target_attribute,
remaining_attribute_names,default_class,best_attr)
tree[best_attr][attr_val] = subtree
return tree

from pprint import pprint

tree = id3(df,t,attribute_names)
print("\nThe Resultant Decision Tree is:")
pprint(tree)

32
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

OUTPUT:

-----Information Gain Calculation of Unnamed: 0 --------

Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
0 0 No Sunny Hot High Weak
Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
1 1 No Sunny Hot High Strong
Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
2 2 Yes Overcast Hot High Weak
Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
3 3 Yes Rain Mild High Weak
Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
4 4 Yes Rain Cool Normal Weak
Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
5 5 No Rain Cool Normal Strong
Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
6 6 Yes Overcast Cool Normal Strong
Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
7 7 No Sunny Mild High Weak
Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
8 8 Yes Sunny Cool Normal Weak
Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
9 9 Yes Rain Mild Normal Weak
Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
10 10 Yes Sunny Mild Normal Strong
Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
11 11 Yes Overcast Mild High Strong
Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
12 12 Yes Overcast Hot Normal Weak
Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
13 13 No Rain Mild High Strong
Target attribute class count(Yes/No)= {'Weak': 1}
Total no of instances/records associated with 0 is: 1
Probability of Class Weak is: 1.0000
Probability of Class Weak is: 1.0000
Target attribute class count(Yes/No)= {'Strong': 1}
Total no of instances/records associated with 1 is: 1
Probability of Class Strong is: 1.0000
Probability of Class Strong is: 1.0000
Target attribute class count(Yes/No)= {'Weak': 1}
Total no of instances/records associated with 2 is: 1
Probability of Class Weak is: 1.0000
Probability of Class Weak is: 1.0000
33
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

Target attribute class count(Yes/No)= {'Weak': 1}

Total no of instances/records associated with 3 is: 1
Probability of Class Weak is: 1.0000
Probability of Class Weak is: 1.0000
Target attribute class count(Yes/No)= {'Weak': 1}
Total no of instances/records associated with 4 is: 1
Probability of Class Weak is: 1.0000
Probability of Class Weak is: 1.0000
Target attribute class count(Yes/No)= {'Strong': 1}
Total no of instances/records associated with 5 is: 1
Probability of Class Strong is: 1.0000
Probability of Class Strong is: 1.0000
Target attribute class count(Yes/No)= {'Strong': 1}
Total no of instances/records associated with 6 is: 1
Probability of Class Strong is: 1.0000
Probability of Class Strong is: 1.0000
Target attribute class count(Yes/No)= {'Weak': 1}
Total no of instances/records associated with 7 is: 1
Probability of Class Weak is: 1.0000
Probability of Class Weak is: 1.0000
Target attribute class count(Yes/No)= {'Weak': 1}
Total no of instances/records associated with 8 is: 1
Probability of Class Weak is: 1.0000
Probability of Class Weak is: 1.0000
Target attribute class count(Yes/No)= {'Weak': 1}
Total no of instances/records associated with 9 is: 1
Probability of Class Weak is: 1.0000
Probability of Class Weak is: 1.0000
Target attribute class count(Yes/No)= {'Strong': 1}
Total no of instances/records associated with 10 is: 1
Probability of Class Strong is: 1.0000
Probability of Class Strong is: 1.0000
Target attribute class count(Yes/No)= {'Strong': 1}
Total no of instances/records associated with 11 is: 1
Probability of Class Strong is: 1.0000
Probability of Class Strong is: 1.0000
Target attribute class count(Yes/No)= {'Weak': 1}
Total no of instances/records associated with 12 is: 1
Probability of Class Weak is: 1.0000
Probability of Class Weak is: 1.0000
Target attribute class count(Yes/No)= {'Strong': 1}
Total no of instances/records associated with 13 is: 1
Probability of Class Strong is: 1.0000
Probability of Class Strong is: 1.0000
Target attribute class count(Yes/No)= {'Weak': 8, 'Strong': 6}
Total no of instances/records associated with S is: 14
Probability of Class Strong is: 0.4286
Probability of Class Weak is: 0.5714
Information gain of Unnamed: 0 is : 0.9852281360342516

-----Information Gain Calculation of PlayTennis --------

Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
0 0 No Sunny Hot High Weak
1 1 No Sunny Hot High Strong
5 5 No Rain Cool Normal Strong
7 7 No Sunny Mild High Weak
13 13 No Rain Mild High Strong
Grouped Attribute Values
34
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind

2 2 Yes Overcast Hot High Weak
3 3 Yes Rain Mild High Weak
4 4 Yes Rain Cool Normal Weak
6 6 Yes Overcast Cool Normal Strong
8 8 Yes Sunny Cool Normal Weak
9 9 Yes Rain Mild Normal Weak
10 10 Yes Sunny Mild Normal Strong
11 11 Yes Overcast Mild High Strong
12 12 Yes Overcast Hot Normal Weak
Target attribute class count(Yes/No)= {'Weak': 2, 'Strong': 3}
Total no of instances/records associated with No is: 5
Probability of Class Strong is: 0.4000
Probability of Class Weak is: 0.6000
Target attribute class count(Yes/No)= {'Weak': 6, 'Strong': 3}
Total no of instances/records associated with Yes is: 9
Probability of Class Strong is: 0.3333
Probability of Class Weak is: 0.6667
Target attribute class count(Yes/No)= {'Weak': 8, 'Strong': 6}
Total no of instances/records associated with S is: 14
Probability of Class Strong is: 0.4286
Probability of Class Weak is: 0.5714
Information gain of PlayTennis is : 0.04812703040826949

-----Information Gain Calculation of Outlook --------

Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
2 2 Yes Overcast Hot High Weak
6 6 Yes Overcast Cool Normal Strong
11 11 Yes Overcast Mild High Strong
12 12 Yes Overcast Hot Normal Weak
Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
3 3 Yes Rain Mild High Weak
4 4 Yes Rain Cool Normal Weak
5 5 No Rain Cool Normal Strong
9 9 Yes Rain Mild Normal Weak
13 13 No Rain Mild High Strong
Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
0 0 No Sunny Hot High Weak
1 1 No Sunny Hot High Strong
7 7 No Sunny Mild High Weak
8 8 Yes Sunny Cool Normal Weak
10 10 Yes Sunny Mild Normal Strong
Target attribute class count(Yes/No)= {'Weak': 2, 'Strong': 2}
Total no of instances/records associated with Overcast is: 4
Probability of Class Strong is: 0.5000
Probability of Class Weak is: 0.5000
Target attribute class count(Yes/No)= {'Weak': 3, 'Strong': 2}
Total no of instances/records associated with Rain is: 5
Probability of Class Strong is: 0.4000
Probability of Class Weak is: 0.6000
Target attribute class count(Yes/No)= {'Weak': 3, 'Strong': 2}
Total no of instances/records associated with Sunny is: 5
Probability of Class Strong is: 0.4000
Probability of Class Weak is: 0.6000
Target attribute class count(Yes/No)= {'Weak': 8, 'Strong': 6}
Total no of instances/records associated with S is: 14
35
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

Probability of Class Strong is: 0.4286

Probability of Class Weak is: 0.5714
Information gain of Outlook is : 0.005977711423774124

-----Information Gain Calculation of Temperature --------

Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
4 4 Yes Rain Cool Normal Weak
5 5 No Rain Cool Normal Strong
6 6 Yes Overcast Cool Normal Strong
8 8 Yes Sunny Cool Normal Weak
Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
0 0 No Sunny Hot High Weak
1 1 No Sunny Hot High Strong
2 2 Yes Overcast Hot High Weak
12 12 Yes Overcast Hot Normal Weak
Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
3 3 Yes Rain Mild High Weak
7 7 No Sunny Mild High Weak
9 9 Yes Rain Mild Normal Weak
10 10 Yes Sunny Mild Normal Strong
11 11 Yes Overcast Mild High Strong
13 13 No Rain Mild High Strong
Target attribute class count(Yes/No)= {'Weak': 2, 'Strong': 2}
Total no of instances/records associated with Cool is: 4
Probability of Class Strong is: 0.5000
Probability of Class Weak is: 0.5000
Target attribute class count(Yes/No)= {'Weak': 3, 'Strong': 1}
Total no of instances/records associated with Hot is: 4
Probability of Class Strong is: 0.2500
Probability of Class Weak is: 0.7500
Target attribute class count(Yes/No)= {'Weak': 3, 'Strong': 3}
Total no of instances/records associated with Mild is: 6
Probability of Class Strong is: 0.5000
Probability of Class Weak is: 0.5000
Target attribute class count(Yes/No)= {'Weak': 8, 'Strong': 6}
Total no of instances/records associated with S is: 14
Probability of Class Strong is: 0.4286
Probability of Class Weak is: 0.5714
Information gain of Temperature is : 0.03914867190307092

-----Information Gain Calculation of Humidity --------

Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
0 0 No Sunny Hot High Weak
1 1 No Sunny Hot High Strong
2 2 Yes Overcast Hot High Weak
3 3 Yes Rain Mild High Weak
7 7 No Sunny Mild High Weak
11 11 Yes Overcast Mild High Strong
13 13 No Rain Mild High Strong
Grouped Attribute Values
Unnamed: 0 PlayTennis Outlook Temperature Humidity Wind
4 4 Yes Rain Cool Normal Weak
5 5 No Rain Cool Normal Strong
6 6 Yes Overcast Cool Normal Strong
36
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

8 8 Yes Sunny Cool Normal Weak

9 9 Yes Rain Mild Normal Weak
10 10 Yes Sunny Mild Normal Strong
12 12 Yes Overcast Hot Normal Weak
Target attribute class count(Yes/No)= {'Weak': 4, 'Strong': 3}
Total no of instances/records associated with High is: 7
Probability of Class Strong is: 0.4286
Probability of Class Weak is: 0.5714
Target attribute class count(Yes/No)= {'Weak': 4, 'Strong': 3}
Total no of instances/records associated with Normal is: 7
Probability of Class Strong is: 0.4286
Probability of Class Weak is: 0.5714
Target attribute class count(Yes/No)= {'Weak': 8, 'Strong': 6}
Total no of instances/records associated with S is: 14
Probability of Class Strong is: 0.4286
Probability of Class Weak is: 0.5714
Information gain of Humidity is : 0.0

Attribute with the maximum gain is: Unnamed: 0

The Resultant Decision Tree is:

{'Unnamed: 0': {0: 'Weak',
1: 'Strong',
2: 'Weak',
3: 'Weak',
4: 'Weak',
5: 'Strong',
6: 'Strong',
7: 'Weak',
8: 'Weak',
9: 'Weak',
10: 'Strong',
11: 'Strong',
12: 'Weak',
13: 'Strong'}}

37
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

38
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

Experiment-7:
Build an Artificial Neural Network by implementing the Back propagation
algorithm and test the same using appropriate data sets.
Program:
import random
from math import exp
from random import seed

# Initialize a network

def initialize_network(n_inputs, n_hidden, n_outputs):

network = list()
hidden_layer = [{'weights':[random.uniform(-0.5,0.5) for i in
range(n_inputs + 1)]} for i in range(n_hidden)]
network.append(hidden_layer)
output_layer = [{'weights':[random.uniform(-0.5,0.5) for i in
range(n_hidden + 1)]} for i in range(n_outputs)]
network.append(output_layer)
i= 1
print("\n The initialised Neural Network:\n")
for layer in network:
j=1
for sub in layer:
print("\n Layer[%d] Node[%d]:\n" %(i,j),sub)
j=j+1
i=i+1
return network

# Calculate neuron activation (net) for an input

def activate(weights, inputs):

activation = weights[-1]
for i in range(len(weights)-1):
activation += weights[i] * inputs[i]
return activation

# Transfer neuron activation to sigmoid function

def transfer(activation):
return 1.0 / (1.0 + exp(-activation))

# Forward propagate input to a network output

def forward_propagate(network, row):
inputs = row
for layer in network:
39
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

new_inputs = []
for neuron in layer:
activation = activate(neuron['weights'], inputs)
neuron['output'] = transfer(activation)
new_inputs.append(neuron['output'])
inputs = new_inputs
return inputs

# Calculate the derivative of an neuron output

def transfer_derivative(output):
return output * (1.0 - output)

# Backpropagate error and store in neurons

def backward_propagate_error(network, expected):
for i in reversed(range(len(network))):
layer = network[i]
errors = list()

if i != len(network)-1:
for j in range(len(layer)):
error = 0.0
for neuron in network[i + 1]:
error += (neuron['weights'][j] * neuron['delta'])
errors.append(error)
else:
for j in range(len(layer)):
neuron = layer[j]
errors.append(expected[j] - neuron['output'])

for j in range(len(layer)):
neuron = layer[j]
neuron['delta'] = errors[j] * transfer_derivative(neuron['output'])

# Update network weights with error

def update_weights(network, row, l_rate):
for i in range(len(network)):
inputs = row[:-1]
if i != 0:
inputs = [neuron['output'] for neuron in network[i - 1]]
for neuron in network[i]:
for j in range(len(inputs)):
neuron['weights'][j] += l_rate * neuron['delta'] * inputs[j]
neuron['weights'][-1] += l_rate * neuron['delta']

# Train a network for a fixed number of epochs

def train_network(network, train, l_rate, n_epoch, n_outputs):
40
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

print("\n Network Training Begins:\n")

for epoch in range(n_epoch):

sum_error = 0
for row in train:
outputs = forward_propagate(network, row)
expected = [0 for i in range(n_outputs)]
expected[row[-1]] = 1
sum_error += sum([(expected[i]-outputs[i])**2 for i in
range(len(expected))])
backward_propagate_error(network, expected)
update_weights(network, row, l_rate)
print('>epoch=%d, lrate=%.3f, error=%.3f' % (epoch, l_rate, sum_error))

print("\n Network Training Ends:\n")

#Test training backprop algorithm

seed(2)
dataset = [[2.7810836,2.550537003,0],
[1.465489372,2.362125076,0],
[3.396561688,4.400293529,0],
[1.38807019,1.850220317,0],
[3.06407232,3.005305973,0],
[7.627531214,2.759262235,1],
[5.332441248,2.088626775,1],
[6.922596716,1.77106367,1],
[8.675418651,-0.242068655,1],
[7.673756466,3.508563011,1]]

print("\n The input Data Set :\n",dataset)

n_inputs = len(dataset[0]) - 1
print("\n Number of Inputs :\n",n_inputs)
n_outputs = len(set([row[-1] for row in dataset]))
print("\n Number of Outputs :\n",n_outputs)

#Network Initialization
network = initialize_network(n_inputs, 2, n_outputs)

# Training the Network

train_network(network, dataset, 0.5, 20, n_outputs)

print("\n Final Neural Network :")

i= 1
for layer in network:
41
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

j=1
for sub in layer:
print("\n Layer[%d] Node[%d]:\n" %(i,j),sub)
j=j+1
i=i+1

OUTPUT:
The input Data Set :
[[2.7810836, 2.550537003, 0], [1.465489372, 2.362125076, 0], [3.396561688,
4.400293529, 0], [1.38807019, 1.850220317, 0], [3.06407232, 3.005305973,
0], [7.627531214, 2.759262235, 1], [5.332441248, 2.088626775, 1],
[6.922596716, 1.77106367, 1], [8.675418651, -0.242068655, 1], [7.673756466,
3.508563011, 1]]

Number of Inputs :
2

Number of Outputs :
2

The initialised Neural Network:

Layer[1] Node[1]:
{'weights': [0.4560342718892494, 0.4478274870593494, -0.4434486322731913]}

Layer[1] Node[2]:
{'weights': [-0.41512800484107837, 0.33549887812944956,
0.2359699890685233]}

Layer[2] Node[1]:
{'weights': [0.1697304014402209, -0.1918635424108558,
0.10594416567846243]}

Layer[2] Node[2]:
{'weights': [0.10680173364083789, 0.08120401711200309, -
0.3416171297451944]}

Network Training Begins:

>epoch=0, lrate=0.500, error=5.278

>epoch=1, lrate=0.500, error=5.122
>epoch=2, lrate=0.500, error=5.006
>epoch=3, lrate=0.500, error=4.875
>epoch=4, lrate=0.500, error=4.700
>epoch=5, lrate=0.500, error=4.466
>epoch=6, lrate=0.500, error=4.176
>epoch=7, lrate=0.500, error=3.838
>epoch=8, lrate=0.500, error=3.469
>epoch=9, lrate=0.500, error=3.089
>epoch=10, lrate=0.500, error=2.716
>epoch=11, lrate=0.500, error=2.367
>epoch=12, lrate=0.500, error=2.054
>epoch=13, lrate=0.500, error=1.780
>epoch=14, lrate=0.500, error=1.546
>epoch=15, lrate=0.500, error=1.349

42
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

>epoch=16, lrate=0.500, error=1.184

>epoch=17, lrate=0.500, error=1.045
>epoch=18, lrate=0.500, error=0.929
>epoch=19, lrate=0.500, error=0.831

Network Training Ends:

Final Neural Network :

Layer[1] Node[1]:
{'weights': [0.8642508164347665, -0.8497601716670763, -
0.8668929014392035], 'output': 0.9295587965836384, 'delta':
0.005645382825629247}

Layer[1] Node[2]:
{'weights': [-1.2934302410111025, 1.7109363237151507, 0.7125327507327329],
'output': 0.04760703296164151, 'delta': -0.005928559978815076}

Layer[2] Node[1]:
{'weights': [-1.3098359335096292, 2.16462207144596, -0.3079052288835876],
'output': 0.19895563952058462, 'delta': -0.03170801648036037}

Layer[2] Node[2]:
{'weights': [1.5506793402414165, -2.11315950446121, 0.1333585709422027],
'output': 0.8095042653312078, 'delta': 0.029375796661413225}

Predict
Making predictions with a trained neural network is easy enough. We have already seen how
to forward-propagate an input pattern to get an output. This is all we need to do to make a
prediction. We can use the output values themselves directly as the probability of a pattern
belonging to each output class. It may be more useful to turn this output back into a crisp
class prediction. We can do this by selecting the class value with the larger probability. This
is also called the arg max function. Below is a function named predict() that implements this
procedure. It returns the index in the network output that has the largest probability. It
assumes that class values have been converted to integers starting at 0.

from math import exp

# Calculate neuron activation for an input

def activate(weights, inputs):
activation = weights[-1]
for i in range(len(weights)-1):
activation += weights[i] * inputs[i]
return activation

# Transfer neuron activation

def transfer(activation):
return 1.0 / (1.0 + exp(-activation))
43
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

# Make a prediction with a network

def predict(network, row):
outputs = forward_propagate(network, row)
return outputs.index(max(outputs))

# Test making predictions with the network

dataset = [[2.7810836,2.550537003,0],
[1.465489372,2.362125076,0],
[3.396561688,4.400293529,0],
[1.38807019,1.850220317,0],
[3.06407232,3.005305973,0],
[7.627531214,2.759262235,1],
[5.332441248,2.088626775,1],
[6.922596716,1.77106367,1],
[8.675418651,-0.242068655,1],
[7.673756466,3.508563011,1]]
#network = [[{'weights': [-1.482313569067226, 1.8308790073202204,
1.078381922048799]}, {'weights': [0.23244990332399884, 0.3621998343835864,
0.40289821191094327]}],
# [{'weights': [2.5001872433501404, 0.7887233511355132, -1.1026649757805829]},
{'weights': [-2.429350576245497, 0.8357651039198697, 1.0699217181280656]}]]
for row in dataset:
prediction = predict(network, row)
print('Expected=%d, Got=%d' % (row[-1], prediction))

OUTPUT:

44
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

Experiment-10:
Apply EM algorithm to cluster a Heart Disease Data Set. Use the same data
set for clustering using kMeans algorithm. Compare the results of these two
algorithms and comment on the quality of clustering. You can add
Java/Python ML library classes/API in the program.

import matplotlib.pyplot as plt

from sklearn import datasets
from sklearn.cluster import KMeans
import sklearn.metrics as sm
import pandas as pd
import numpy as np
%matplotlib inline

# import some data to play with

iris = datasets.load_iris()

#print("\n IRIS DATA :",iris.data);

#print("\n IRIS FEATURES :\n",iris.feature_names)
#print("\n IRIS TARGET :\n",iris.target)
#print("\n IRIS TARGET NAMES:\n",iris.target_names)

# Store the inputs as a Pandas Dataframe and set the column names
X = pd.DataFrame(iris.data)

#print(X)
X.columns = ['Sepal_Length','Sepal_Width','Petal_Length','Petal_Width']
#print(X.columns)
#print("X:",x)
#print("Y:",y)
45
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

y = pd.DataFrame(iris.target)
y.columns = ['Targets']

# Set the size of the plot

plt.figure(figsize=(14,7))

# Create a colormap
colormap = np.array(['red', 'lime', 'black'])

# Plot Sepal
plt.subplot(1, 2, 1)
plt.scatter(X.Sepal_Length,X.Sepal_Width, c=colormap[y.Targets], s=40)
plt.title('Sepal')

plt.subplot(1, 2, 2)
plt.scatter(X.Petal_Length,X.Petal_Width, c=colormap[y.Targets], s=40)
plt.title('Petal')

Build the K Means Model

46
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

Visualise the classifier results

# View the results

# Set the size of the plot
plt.figure(figsize=(14,7))

# Create a colormap
colormap = np.array(['red', 'lime', 'black'])

# Plot the Original Classifications

plt.subplot(1, 2, 1)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y.Targets], s=40)
plt.title('Real Classification')

# Plot the Models Classifications

plt.subplot(1, 2, 2)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[model.labels_], s=40)
plt.title('K Mean Classification')

47
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

The Fix

Re-plot

48
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

# View the results

# Set the size of the plot
plt.figure(figsize=(14,7))

# Create a colormap
colormap = np.array(['red', 'lime', 'black'])

# Plot Orginal
plt.subplot(1, 2, 1)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y.Targets], s=40)
plt.title('Real Classification')

# Plot Predicted with corrected values

plt.subplot(1, 2, 2)
plt.scatter(X.Petal_Length,X.Petal_Width, c=colormap[predY], s=40)
plt.title('K Mean Classification')

Performance Measures

Accuracy
# Performance Metrics sm.accuracy_score(y, predY)

49
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

Confusion Matrix

GMM:

from sklearn.mixture import GaussianMixture

gmm = GaussianMixture(n_components=3)
gmm.fit(xs)

50
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

y_cluster_gmm = gmm.predict(xs)
y_cluster_gmm

plt.subplot(1, 2, 1)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y_cluster_gmm],
s=40)
plt.title('GMM Classification')

51
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

Experiment-11:
Write a program to implement k-Nearest Neighbor algorithm to classify
the iris data set. Print both correct and wrong predictions.

K-Nearest Neighbour (KNN)

∑ KNN is simple supervised learning algorithm used for both regression and classification
problems.
∑ KNN is basically store all available cases and classify new cases based on similarities with
stored cases.

Concept: So the concept that KNN works on is Basically similarities measurements, for
example, if you look at Mango,it is more similar to Apple then dog or cat, then what
KNN will do is put it in the category of fruits not in the category of animals.

What is K in KNN

What happens in KNN,we trained the model and after that we want to test our model , means
we want to classify our new data (test-data),for that we will check some (K) classes around it
and assign the most common class to the test-data.

K- Number of nearest neighbors

K=1 means the testing data are given the same level as the closet example in training set.

K=4 means the labels of the four closet classes are check and most common class is assign to
the testing data.

How does KNN is work?

52
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

Let's understand it with the above given diagram

1. In this diagram we have 2 classes one blue class one red class
2. Now we have a new green point, we have to find out whether this point is in class red or
blue
3. For this, we will define the value of K
4. At K= 1, we will see the distance from the green point to the nearest points, and select the
point with lowest distance and classify the green point in that class, here it red.
5. At K=5 We will calculate the distance from the green point to the nearest points and select
the five points with the lowest distance and classify the green point to the most common
class, that is red here.
6. How to choose the value of K? The value of k is not defined, it depends on the cases.

Lazy Learner

1. KNN is simple algorithm for classification but that's not the reason
2. KNN is lazy learner because it doesn't learn a discriminative function from the training data
but memorizes the training dataset instead.

KNN Algorithm

let's understand the concept of KNN algorithm with iris flower problem

Data: This data consist of total 150 instances (samples) , 4 features , and three classes
(targets).

Problem: Using four features we have to classify which flower belongs to which category.

53
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

Importing Data-set

import sklearn
import pandas as pd
from sklearn.datasets import load_iris
iris=load_iris()
iris.keys()
df=pd.DataFrame(iris['data'])
print(df)
print(iris['target_names'])
iris['feature_names']

NOTE:

1. Now we need a target and data so that we can train the model
2. As we know that we have to find out the class from the features we have
3. With this logic,our target is classes (0,1,2) and data is in df.

54
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

Splitting Data

1. The data is split so that with some data we can train the model and from the remaining data
we can test the model and can check how well our model is
2. To do this we have an inbuilt function in sklearn

KNN Classifier and Training of the Model

Prediction and Accuracy

55
Machine Learning Lab Manual Dr.A.Veeraswamy, Associate Professor, SACET.

OUT PUT:

Econometrics 1st Edition K. Nirmal Ravi Kumar instant download
100% (5)
Econometrics 1st Edition K. Nirmal Ravi Kumar instant download
65 pages
OceanofPDF.com Data Analytics for Finance Using Python - Nitin Jaglal Untwal
100% (1)
OceanofPDF.com Data Analytics for Finance Using Python - Nitin Jaglal Untwal
138 pages
Machine Learning
100% (3)
Machine Learning
46 pages
Linear Regression Simple Technique For I
No ratings yet
Linear Regression Simple Technique For I
3 pages
Ml practicals
No ratings yet
Ml practicals
21 pages
Simple Linear Regression: Y ($) X ($) Y ($) X ($)
No ratings yet
Simple Linear Regression: Y ($) X ($) Y ($) X ($)
5 pages
ML Workshop
No ratings yet
ML Workshop
78 pages
Ezekiel John Decipulo OLLC Lesson 5.3 Simple Linear Regression Analysis Application
No ratings yet
Ezekiel John Decipulo OLLC Lesson 5.3 Simple Linear Regression Analysis Application
2 pages
Logistic Regression Example (1)
No ratings yet
Logistic Regression Example (1)
7 pages
Assignment 3: Logistic Regression (Individual Submission)
0% (1)
Assignment 3: Logistic Regression (Individual Submission)
3 pages
ML 8,9,10
No ratings yet
ML 8,9,10
3 pages
Does Aid Undermine The Mobilization of Domestic Resources?: Jean Marie W. Kébré, Idrissa M. Ouédraogo
No ratings yet
Does Aid Undermine The Mobilization of Domestic Resources?: Jean Marie W. Kébré, Idrissa M. Ouédraogo
11 pages
OPIM628
No ratings yet
OPIM628
4 pages
Lab Experiment 5
No ratings yet
Lab Experiment 5
5 pages
Lab 4 - Markdown Practical - Solution
No ratings yet
Lab 4 - Markdown Practical - Solution
5 pages
Critical Path - ITC5203-Full
No ratings yet
Critical Path - ITC5203-Full
6 pages
ML Lab Record_250625_105014
No ratings yet
ML Lab Record_250625_105014
29 pages
MM ZG515 Course Handout
No ratings yet
MM ZG515 Course Handout
3 pages
Machine Intelligence
No ratings yet
Machine Intelligence
24 pages
s10551-018-3977-0
No ratings yet
s10551-018-3977-0
19 pages
A1388404476 - 64039 - 23 - 2023 - Machine Learning II
No ratings yet
A1388404476 - 64039 - 23 - 2023 - Machine Learning II
10 pages
Sci ML Mock Exam 2023
No ratings yet
Sci ML Mock Exam 2023
8 pages
Log Reg Skimed.ipynb - Colab
No ratings yet
Log Reg Skimed.ipynb - Colab
10 pages
Aquif Ibrar 1212
No ratings yet
Aquif Ibrar 1212
9 pages
CS8082U4L02 - Locally Weighted Regression
No ratings yet
CS8082U4L02 - Locally Weighted Regression
13 pages
Agniva
No ratings yet
Agniva
16 pages
GLS+ WLS+ Ols
No ratings yet
GLS+ WLS+ Ols
25 pages
Unit Wise Questions
No ratings yet
Unit Wise Questions
14 pages
Performance Determinants of Business Income Tax Collection
No ratings yet
Performance Determinants of Business Income Tax Collection
12 pages
ML LAB 146
No ratings yet
ML LAB 146
50 pages
M.E MACHINE LEARNING -CP4252 LAB MANUAL4716718074353656238
No ratings yet
M.E MACHINE LEARNING -CP4252 LAB MANUAL4716718074353656238
26 pages
Analisis Pengaruh Motivasi Pelayanan Publik (Public Service Motivation) Guru Dan Tenaga Kependidikan Di Kecamatan Parungkuda - Sukabumi
No ratings yet
Analisis Pengaruh Motivasi Pelayanan Publik (Public Service Motivation) Guru Dan Tenaga Kependidikan Di Kecamatan Parungkuda - Sukabumi
10 pages
Commonly Used Machine Learning Algorithms
No ratings yet
Commonly Used Machine Learning Algorithms
27 pages
Assignment 3
No ratings yet
Assignment 3
5 pages
Unit-III Advanced Machine Learning
No ratings yet
Unit-III Advanced Machine Learning
8 pages
ASEGEDECH DUMESSA GURMESSA Assessment of Budget Utilization and Allocation Practice: In Jimma University, Ethiopia
No ratings yet
ASEGEDECH DUMESSA GURMESSA Assessment of Budget Utilization and Allocation Practice: In Jimma University, Ethiopia
65 pages
easy pract ml
No ratings yet
easy pract ml
7 pages
ML File - Merged
No ratings yet
ML File - Merged
24 pages
FYMCA IDSLab A6 Submission
No ratings yet
FYMCA IDSLab A6 Submission
9 pages
Machine Learning Lab Manual 06
100% (1)
Machine Learning Lab Manual 06
8 pages
Experiment No 3
No ratings yet
Experiment No 3
7 pages
UNIT 1,2,3
No ratings yet
UNIT 1,2,3
17 pages
Shamsuddin Et Al., 2008 - Demand Analyses of Rice in Malaysia
No ratings yet
Shamsuddin Et Al., 2008 - Demand Analyses of Rice in Malaysia
21 pages
Logistic Regression vs. SVMs - Solution
No ratings yet
Logistic Regression vs. SVMs - Solution
7 pages
Week 4 Logistic
No ratings yet
Week 4 Logistic
21 pages
Assignment II Machine Learning
No ratings yet
Assignment II Machine Learning
8 pages
ML Exp 8
No ratings yet
ML Exp 8
22 pages
Cheatsheet Supervised Learning
100% (1)
Cheatsheet Supervised Learning
4 pages
AIML PRACTICALS
No ratings yet
AIML PRACTICALS
22 pages
4. Logistic Regression
No ratings yet
4. Logistic Regression
21 pages
B24 ML Exp-1
No ratings yet
B24 ML Exp-1
10 pages
ML-Unit 4
No ratings yet
ML-Unit 4
29 pages
Mock Exams 2024
No ratings yet
Mock Exams 2024
81 pages
Machine Learning Lab
No ratings yet
Machine Learning Lab
43 pages
CH 1
No ratings yet
CH 1
24 pages
Machine Learning Hands-On
100% (1)
Machine Learning Hands-On
18 pages
AI XII (843) Units 1,2,3
No ratings yet
AI XII (843) Units 1,2,3
17 pages
Ml Cyber Lab
No ratings yet
Ml Cyber Lab
16 pages
ML RECORD - Merged
No ratings yet
ML RECORD - Merged
33 pages
Service Quality - A Study of The Luxury Hotels in Malaysia
100% (3)
Service Quality - A Study of The Luxury Hotels in Malaysia
11 pages
Machine Learning (Chapter1)
No ratings yet
Machine Learning (Chapter1)
8 pages
Model_learning_steps
No ratings yet
Model_learning_steps
12 pages
Morin2021 Sprint With GPS
No ratings yet
Morin2021 Sprint With GPS
4 pages
20MEMECH Part 3 - Classification
No ratings yet
20MEMECH Part 3 - Classification
49 pages
2-Machine Learning Algorithms
No ratings yet
2-Machine Learning Algorithms
16 pages
Super Cheatsheet Machine Learning
100% (1)
Super Cheatsheet Machine Learning
15 pages
Handbook of Statistics Vol 14 (Elsevier, 1996) WW
No ratings yet
Handbook of Statistics Vol 14 (Elsevier, 1996) WW
726 pages
Infotec Ai 1000 Program-hcia-Ai Lab Guide
No ratings yet
Infotec Ai 1000 Program-hcia-Ai Lab Guide
82 pages
ML Lab Programs (1)
No ratings yet
ML Lab Programs (1)
9 pages
Data Analytical Roadmap
No ratings yet
Data Analytical Roadmap
10 pages
School of Engineering: Lab Manual On Machine Learning Lab
No ratings yet
School of Engineering: Lab Manual On Machine Learning Lab
23 pages
Stata Tutorial: Updated For Version 16
No ratings yet
Stata Tutorial: Updated For Version 16
49 pages
ECONF241 GaussMarkov Theorem
No ratings yet
ECONF241 GaussMarkov Theorem
25 pages
Advanced Regression With JMP PRO Handout
No ratings yet
Advanced Regression With JMP PRO Handout
46 pages
Cheatsheet Supervised Learning
No ratings yet
Cheatsheet Supervised Learning
4 pages
Machine Learing Algorithms
No ratings yet
Machine Learing Algorithms
13 pages
Commonly Used Machine Learning Algorithms (With Python and R Codes)
No ratings yet
Commonly Used Machine Learning Algorithms (With Python and R Codes)
19 pages
20dit073 Jay Prajapati ML
No ratings yet
20dit073 Jay Prajapati ML
68 pages
Machine Learning The Basics
No ratings yet
Machine Learning The Basics
158 pages
Broadly, There Are 3 Types of Machine Learning Algorithms.
No ratings yet
Broadly, There Are 3 Types of Machine Learning Algorithms.
33 pages
â MS Preweek (B48)
No ratings yet
â MS Preweek (B48)
10 pages
Pytorch Tutorial For Beginner: Department of Computer Science & Engineering University of Washington
No ratings yet
Pytorch Tutorial For Beginner: Department of Computer Science & Engineering University of Washington
11 pages
LP III Lab Manual
100% (1)
LP III Lab Manual
8 pages
The Impact of Time Management On The Students' Academic Achievements
No ratings yet
The Impact of Time Management On The Students' Academic Achievements
8 pages
Commonly Used Machine Learning Algorithms
No ratings yet
Commonly Used Machine Learning Algorithms
38 pages
Unit-2 Solution
No ratings yet
Unit-2 Solution
22 pages
Essentials of Machine Learning Algorithms
No ratings yet
Essentials of Machine Learning Algorithms
15 pages
Cheet Sheet
No ratings yet
Cheet Sheet
47 pages
Random Sample Consensus: Robust Estimation in Computer Vision
From Everand
Random Sample Consensus: Robust Estimation in Computer Vision
Fouad Sabry
No ratings yet
DATA MINING AND MACHINE LEARNING. PREDICTIVE TECHNIQUES: REGRESSION, GENERALIZED LINEAR MODELS, SUPPORT VECTOR MACHINE AND NEURAL NETWORKS
From Everand
DATA MINING AND MACHINE LEARNING. PREDICTIVE TECHNIQUES: REGRESSION, GENERALIZED LINEAR MODELS, SUPPORT VECTOR MACHINE AND NEURAL NETWORKS
César Pérez López
No ratings yet