DEPARTMENT OF CSE (AI & ML)
LABORATORY MANUAL
MACHINE LEARNING (21AIL66)
SEMESTER - VI
(Effective from the academic year 2023-2024)
Prepared by:
DR. B. Kursheed
Assistant Professor
Department of CSE (AI & ML)
Vision
“To impart quality education to cater to the needs of Industries, Business Establishments, and Research and
Development Organizations, and to create knowledgeable and competent Engineers of global standard.”
Mission
“To create industry-enabled Engineers manifesting excellence with extraordinary progress, and to give a
bright and challenging future to deserving students who are underprivileged.”
Vision
“To promote partnership between the academic and industry sectors to bring in an innovative ecosystem
that is beneficial to the whole society.”
Mission
1. The Artificial Intelligence and Machine Learning department intends to provide major support to enable
2. Commit to openness and inclusiveness in a collaborative and constructive spirit to boldly attack
3. Initiate education and training, paying special attention to the necessary skills of
PEO1: Graduates will contribute to the development of software applications, keeping abreast of the latest
developments.
PEO2: Exhibit competence as an individual, in teams with leadership and managerial skills.
PEO3: By making optimal use of technology, graduates will be able to adopt lifelong learning.
PSO1: Apply the Mathematical tools, Electronics & Embedded Systems Knowledge, and
Programming Knowhow to develop software.
PSO2: Use Artificial Intelligence and Machine Learning, High-Performance Computing, Cloud Computing,
Network Security technologies, and Software Engineering to provide solutions to technological and
social needs.
PSO3: Work individually and in teams, ethically exhibiting managerial and leadership skills with concern
for a sustainable environment.
COURSE OUTCOMES:
CO 2. Demonstrate the working of various algorithms with respect to training and test data sets
CO 3. Illustrate and analyze the principles of Instance based and Reinforcement learning
techniques.
CO 4. Elicit the importance and applications of supervised and unsupervised machine learning.
CO 5. Compare and contrast the Bayes theorem principles and Q learning approach.
PROGRAM OUTCOMES (POs)
2. Problem analysis: Identify, formulate, review research literature, and analyze complex
engineering problems reaching substantiated conclusions using first principles of mathematics,
natural sciences, and engineering sciences.
3. Design / development of solutions: Design solutions for complex engineering problems and
design system components or processes that meet the specified needs with appropriate
consideration for the public health and safety, and the cultural, societal, and environmental
considerations.
5. Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern
engineering and IT tools including prediction and modeling to complex engineering activities
with an understanding of the limitations.
6. The engineer and society: Apply reasoning informed by the contextual knowledge to assess
societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to
the professional engineering practice.
8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities and
norms of the engineering practice.
9. Individual and team work: Function effectively as an individual, and as a member or leader in
diverse teams, and in multidisciplinary settings
10. Communication: Communicate effectively on complex engineering activities with the
engineering community and with society at large, such as, being able to comprehend and write
effective reports and design documentation, make effective presentations, and give and receive
clear instructions.
11. Project management and finance: Demonstrate knowledge and understanding of the
engineering and management principles and apply these to one’s own work, as a member and
leader in a team, to manage projects and in multi-disciplinary environments.
12. Life- long learning: Recognize the need for, and have the preparation and ability to engage in
independent and life-long learning in the broadest context of technological change.
(21AIL66) MACHINE LEARNING LAB
OBJECTIVES :
Compare and contrast the learning techniques like ANN approach, Bayesian learning and
reinforcement learning.
Able to solve and analyse problems on ANN and Instance-based learning.
Machine learning is a field of computer science that often uses statistical techniques to give computers
the ability to "learn" (i.e., progressively improve performance on a specific task) from data, without
being explicitly programmed.
In the past decade, machine learning has given us self-driving cars, practical speech recognition,
effective web search, and a vastly improved understanding of the human genome.
Machine learning tasks are typically classified into broad categories, depending on the nature of the
learning "signal" or "feedback" available to the learning system:
Supervised learning: The computer is presented with example inputs and their desired outputs, given by a
"teacher", and the goal is to learn a general rule that maps inputs to outputs. As special cases, the input
signal can be only partially available, or restricted to special feedback.
Semi-supervised learning: The computer is given only an incomplete training signal: a training set with
some (often many) of the target outputs missing.
Active learning: The computer can only obtain training labels for a limited set of instances (based on a
budget), and also has to optimize its choice of objects to acquire labels for. When used interactively,
these can be presented to the user for labeling.
Reinforcement learning: Training data (in the form of rewards and punishments) is given only as feedback
to the program's actions in a dynamic environment.
Supervised Learning: Find-S algorithm, Candidate elimination algorithm, Decision tree algorithm,
Back propagation algorithm, Naïve Bayes algorithm
Unsupervised Learning: EM algorithm, K-means algorithm
Semi-supervised Learning: Locally weighted Regression algorithm
In classification, inputs are divided into two or more classes, and the learner must produce a model that
assigns unseen inputs to one or more of these classes. Spam filtering is an example of classification,
where the inputs are email (or other) messages and the classes are "spam" and "not spam".
In regression, also a supervised problem, the outputs are continuous rather than discrete.
Topic modeling is a related problem, where a program is given a list of human language
documents and is tasked with finding out which documents cover similar topics.
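For example, the short sketch below (not one of the prescribed experiments; the toy data and the
scikit-learn estimators are purely illustrative) contrasts a classification task, where the target is a
discrete class, with a regression task, where the target is a continuous value:
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LinearRegression

# Classification: discrete target (e.g., 1 = spam, 0 = not spam)
X_cls = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
y_cls = np.array([1, 0, 1, 0])
clf = DecisionTreeClassifier().fit(X_cls, y_cls)
print("predicted class of [1, 0]:", clf.predict([[1, 0]]))

# Regression: continuous target (e.g., a price or a score)
X_reg = np.array([[1], [2], [3], [4]])
y_reg = np.array([2.0, 4.1, 5.9, 8.2])
reg = LinearRegression().fit(X_reg, y_reg)
print("predicted value at x = 5:", reg.predict([[5]]))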
Lab Experiments:
1. Implement and demonstrate the FIND-S algorithm for finding the most specific hypothesis
based on a given set of training data samples. Read the training data from a .CSV file.
import csv
with open('tennis.csv', 'r') as f:
    reader = csv.reader(f)
    your_list = list(reader)
h = [['0', '0', '0', '0', '0', '0']]
for i in your_list:
    print(i)
    if i[-1] == "true":
        j = 0
        for x in i:
            if x != "true":
                if x != h[0][j] and h[0][j] == '0':
                    h[0][j] = x
                elif x != h[0][j] and h[0][j] != '0':
                    h[0][j] = '?'
                else:
                    pass
                j = j + 1
print("maximally specific hypothesis is")
print(h)
INPUT :
'sunny','warm','normal','strong','warm','same',true
'sunny','warm','high','strong','warm','same',true
'rainy','cold','high','strong','warm','change',false
'sunny','warm','high','strong','cool','change',true
Output:
maximally specific hypothesis is
[["'sunny'", "'warm'", '?', "'strong'", '?', '?']]
Program 2: For a given set of training data examples stored in a .CSV file, implement and
demonstrate the Candidate-Elimination algorithm to output a description of
the set of all hypotheses consistent with the training examples.
import numpy as np
import pandas as pd
data=pd.DataFrame(data=pd.read_csv('tennis.csv'))
concepts=np.array(data.iloc[:,0:-1])
target=np.array(data.iloc[:,-1])
def learn(concepts, target):
    specific_h = concepts[0].copy()
    print("initialization of specific_h and general_h")
    print(specific_h)
    general_h = [["?" for i in range(len(specific_h))] for i in range(len(specific_h))]
    print(general_h)
    for i, h in enumerate(concepts):
        if target[i] == "yes":
            for x in range(len(specific_h)):
                if h[x] != specific_h[x]:
                    specific_h[x] = '?'
                    general_h[x][x] = '?'
        if target[i] == "no":
            for x in range(len(specific_h)):
                if h[x] != specific_h[x]:
                    general_h[x][x] = specific_h[x]
                else:
                    general_h[x][x] = '?'
        print("steps in candidate elimination algorithm\n", i + 1)
        print(specific_h)
        print(general_h)
    indices = [i for i, val in enumerate(general_h) if val == ['?', '?', '?', '?', '?', '?']]
    for i in indices:
        general_h.remove(['?', '?', '?', '?', '?', '?'])
    return specific_h, general_h
s_final,g_final=learn(concepts,target)
print("final specific_h:",s_final,sep="\n")
print("final general_h:",g_final,sep="\n")
Input:
'sunny','warm','normal','strong','warm','same',yes
'sunny','warm','high','strong','warm','same',yes
'rainy','cold','high','strong','warm','change',no
'sunny','warm','high','strong','cool','change',yes
Output:
Program 3: Write a program to demonstrate the working of the decision tree based ID3
algorithm. Use an appropriate data set for building the decision tree and apply this knowledge to classify a new sample.
import pandas as pd
import numpy as np
dataset = pd.read_csv('data3.csv', names=['outlook', 'temperature', 'humidity', 'wind', 'class'])

def entropy(target_col):
    elements, counts = np.unique(target_col, return_counts=True)
    entropy = np.sum([(-counts[i]/np.sum(counts))*np.log2(counts[i]/np.sum(counts)) for i in range(len(elements))])
    return entropy

def InfoGain(data, split_attribute_name, target_name="class"):
    total_entropy = entropy(data[target_name])
    vals, counts = np.unique(data[split_attribute_name], return_counts=True)
    Weighted_Entropy = np.sum([(counts[i]/np.sum(counts))*entropy(data.where(data[split_attribute_name] == vals[i]).dropna()[target_name]) for i in range(len(vals))])
    Information_Gain = total_entropy - Weighted_Entropy
    return Information_Gain

def ID3(data, originaldata, features, target_attribute_name="class", parent_node_class=None):
    if len(np.unique(data[target_attribute_name])) <= 1:
        return np.unique(data[target_attribute_name])[0]
    elif len(data) == 0:
        return np.unique(originaldata[target_attribute_name])[np.argmax(np.unique(originaldata[target_attribute_name], return_counts=True)[1])]
    elif len(features) == 0:
        return parent_node_class
    else:
        parent_node_class = np.unique(data[target_attribute_name])[np.argmax(np.unique(data[target_attribute_name], return_counts=True)[1])]
        # information gain values for the features in the dataset
        item_values = [InfoGain(data, feature, target_attribute_name) for feature in features]
        best_feature_index = np.argmax(item_values)
        best_feature = features[best_feature_index]
        tree = {best_feature: {}}
        features = [i for i in features if i != best_feature]
        for value in np.unique(data[best_feature]):
            sub_data = data.where(data[best_feature] == value).dropna()
            subtree = ID3(sub_data, dataset, features, target_attribute_name, parent_node_class)
            tree[best_feature][value] = subtree
        return tree

tree = ID3(dataset, dataset, dataset.columns[:-1])
print('\nDisplay Tree\n', tree)
Data loader program (run this first, then run the main program):
import csv
def read_data(filename):
    with open(filename) as f:   # minimal completion: return all rows of the CSV file as lists
        return list(csv.reader(f))
Input:
outlook,temperature,humidity,wind,target
sunny,hot,high,weak,no
sunny,hot,high,strong,no
overcast,hot,high,weak,yes
rain,mild,high,weak,yes
rain,cool,normal,weak,yes
rain,cool,normal,strong,no
overcast,cool,normal,strong,yes
sunny,mild,high,weak,no
sunny,cool,normal,weak,yes
rain,mild,normal,weak,yes
sunny,mild,normal,strong,yes
overcast,mild,high,strong,yes
overcast,hot,normal,weak,yes
rain,mild,high,strong,no
OUTPUT:
Display Tree
{'outlook': {'outlook': 'target', 'overcast': 'yes', 'rain': {'wind': {'strong': 'no', 'weak': 'yes'}}, 'sunny': {'humidity':
{'high': 'no', 'normal': 'yes'}}}}
Program 4: Build an Artificial Neural Network by implementing the Backpropagation algorithm and test
the same using appropriate data sets.
import numpy as np
X = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
Y = np.array(([92], [86], [89]), dtype=float)
X = X/np.amax(X, axis=0)   # normalise input features
Y = Y/100                  # normalise output

def sigmoid(x):            # activation function (missing in the original listing)
    return 1/(1 + np.exp(-x))

def derivatives_sigmoid(x):
    return x*(1 - x)

epoch = 7000
lr = 0.1
inputlayer_neurons = 2
hiddenlayer_neurons = 3
output_neurons = 1
wh = np.random.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))
bh = np.random.uniform(size=(1, hiddenlayer_neurons))
wout = np.random.uniform(size=(hiddenlayer_neurons, output_neurons))
bout = np.random.uniform(size=(1, output_neurons))

for i in range(epoch):
    # forward propagation
    hinp1 = np.dot(X, wh)
    hinp = hinp1 + bh
    hlayer_act = sigmoid(hinp)
    outinp1 = np.dot(hlayer_act, wout)
    outinp = outinp1 + bout
    output = sigmoid(outinp)
    # back propagation
    EO = Y - output
    outgrad = derivatives_sigmoid(output)
    d_output = EO*outgrad
    EH = d_output.dot(wout.T)
    hiddengrad = derivatives_sigmoid(hlayer_act)
    d_hiddenlayer = EH*hiddengrad
    wout += hlayer_act.T.dot(d_output)*lr
    wh += X.T.dot(d_hiddenlayer)*lr

print("input:\n" + str(X))
print("Actual output:\n" + str(Y))
print("predicted output:\n", output)
input:
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
Actual output:
[[0.92]
[0.86]
[0.89]]
predicted output:
[[0.89432484]
[0.88392017]
[0.89187563]]
Program 5: Write a program to implement the naïve Bayesian classifier for a sample training data set stored
as a .CSV file. Compute the accuracy of the classifier, considering a few test data sets.
import pandas as pd
df = pd.read_csv("5data.csv")
headers = df.columns.values
data = df.iloc[:,0:-1].values
target = df.iloc[:,-1].values
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(data, target, test_size=0.2, random_state = 1)
print("split {0} rows into: ".format(len(data)))
print("Number of training data: "+(repr(len(x_train))))
print("Number of test data: "+(repr(len(x_test))))
print("\nThe values assumed for concept learning attributes are: ")
print('''Outlook => sunny=1 overcast=2 rain=3
Temperature => Hot=1 mild=2 cool=3
Humidity => high=1 normal=2
Wind => weak=1 strong=2''')
print("Target cincept: Play Teenis => Yes=10 No=5")
print("The training set:")
for x, y in zip(x_train, y_train):
    print((x, y))
print("The test set:")
for x, y in zip(x_test, y_test):
    print((x, y))
from sklearn.naive_bayes import GaussianNB
gnb = GaussianNB()
gnb.fit(x_train, y_train)
y_pred = gnb.predict(x_test)
print("prediction for the given training set ",y_pred)
from sklearn import metrics
print("Gaussiannaivebayes odel accuracy (in %): ",metrics.accuracy_score(y_test,y_pred)*100)
INPUT:
1,1,1,1,5
1,1,1,2,5
2,1,1,2,10
3,2,1,1,10
3,3,2,1,10
3,3,2,2,5
2,3,2,2,10
1,2,1,1,5
1,3,2,1,10
3,2,2,2,10
1,2,2,2,10
2,2,1,2,10
2,1,2,1,10
3,2,1,2,5
1,2,1,2,
OUTPUT:
split 15 rows into:
Number of training data: 12
Number of test data: 3
Program 6: Assuming a set of documents that need to be classified, use the naïve Bayesian classifier model
to perform this task. Calculate the accuracy, precision, and recall for your data set.
import pandas as pd
msg=pd.read_csv('naivetext1.csv',names=['message','label'])
print('The dimensions of the dataset',msg.shape)
msg['labelnum']=msg.label.map({'pos':1,'neg':0})
X=msg.message
y=msg.labelnum
print(X)
print(y)
#splitting the dataset into train and test data
from sklearn.model_selection import train_test_split
xtrain,xtest,ytrain,ytest=train_test_split(X,y)
print(xtest.shape)
print(xtrain.shape)
print(ytest.shape)
print(ytrain.shape)
#output of count vectoriser is a sparse matrix
from sklearn.feature_extraction.text import CountVectorizer
count_vect = CountVectorizer()
xtrain_dtm = count_vect.fit_transform(xtrain)
xtest_dtm=count_vect.transform(xtest)
print(count_vect.get_feature_names())
df=pd.DataFrame(xtrain_dtm.toarray(),columns=count_vect.get_feature_names())
print(df)#tabular representation
print(xtrain_dtm) #sparse matrix representation
# Training Naive Bayes (NB) classifier on training data.
from sklearn.naive_bayes import MultinomialNB
clf = MultinomialNB().fit(xtrain_dtm,ytrain)
predicted = clf.predict(xtest_dtm)
#printing accuracy metrics
from sklearn import metrics
print('Accuracy metrics')
print('Accuracy of the classifier is', metrics.accuracy_score(ytest, predicted))
print('Confusion matrix')
print(metrics.confusion_matrix(ytest, predicted))
print('Recall and Precision')
print(metrics.recall_score(ytest,predicted))
print(metrics.precision_score(ytest,predicted))
'''docs_new = ['I like this place', 'My boss is not my saviour']
X_new_counts = count_vect.transform(docs_new)
predictednew = clf.predict(X_new_counts)
for doc, category in zip(docs_new, predictednew):
    print('%s -> %s' % (doc, category))'''
INPUT:
OUTPUT:
0 0 0 0 0 0 ... 0 0 0 0 0
1 0 1 0 0 0 ... 0 0 0 0 0
2 1 0 0 0 0 ... 1 0 0 0 0
3 0 0 0 0 0 ... 0 0 0 0 0
4 0 0 0 0 0 ... 0 0 0 0 0
5 0 0 0 0 0 ... 0 1 0 0 0
6 0 0 0 0 0 ... 0 0 0 0 0
7 0 0 0 0 0 ... 0 0 0 0 1
8 0 0 0 0 0 ... 0 0 1 0 0
9 0 0 0 0 0 ... 0 0 0 1 0
10 0 0 0 0 0 ... 0 0 0 0 0
11 0 1 0 0 1 ... 0 0 0 0 0
12 0 0 1 1 0 ... 0 0 0 0 0
[13 rows x 43 columns]
(0, 19) 1
(0, 34) 1
(0, 24) 1
(0, 31) 1
(0, 32) 1
(0, 20) 1
(0, 23) 1
(0, 10) 1
(1, 29) 1
(1, 35) 1
(1, 1) 1
(1, 34) 1
(1, 24) 1
(2, 5) 1
(2, 33) 1
(2, 0) 1
(2, 13) 1
(2, 38) 1
(2, 12) 1
(3, 11) 1
(3, 30) 1
(3, 22) 1
(3, 18) 1
(3, 15) 1
(4, 8) 1
: :
(7, 34) 1
(8, 16) 1
(8, 14) 1
(8, 40) 1
(9, 41) 1
(9, 9) 1
(9, 7) 1
(9, 34) 1
(10, 26) 1
(10, 34) 1
(10, 20) 1
(10, 23) 1
(10, 10) 1
(11, 25) 1
(11, 4) 1
(11, 28) 1
(11, 35) 1
(11, 1) 1
(11, 34) 1
(11, 24) 1
(12, 2) 1
(12, 3) 1
(12, 25) 1
(12, 18) 1
(12, 34) 1
Accuracy metrics
Accuracy of the classifier is 0.6
Confusion matrix
[[0 2]
[0 3]]
Recall and Precision
1.0
0.6
Program 7: Write a program to construct a Bayesian network considering medical data.
Use this model to demonstrate the diagnosis of heart patients using the standard Heart Disease Data Set.
You can use Java/Python ML library classes/API.
import bayespy as bp
import numpy as np
import csv
from colorama import init
from colorama import Fore, Back, Style
init()
ageEnum = {'SuperSeniorCitizen':0, 'SeniorCitizen':1, 'MiddleAged':2, 'Youth':3, 'Teen':4}
genderEnum = {'Male':0, 'Female':1}
familyHistoryEnum = {'Yes':0, 'No':1}
dietEnum = {'High':0, 'Medium':1, 'Low':2}
lifeStyleEnum = {'Athlete':0, 'Active':1, 'Moderate':2, 'Sedetary':3}
cholesterolEnum = {'High':0, 'BorderLine':1, 'Normal':2}
heartDiseaseEnum = {'Yes':0, 'No':1}
with open('heart_disease_data.csv') as csvfile:
    lines = csv.reader(csvfile)
    dataset = list(lines)

data = []
for x in dataset:
    data.append([ageEnum[x[0]], genderEnum[x[1]], familyHistoryEnum[x[2]], dietEnum[x[3]],
                 lifeStyleEnum[x[4]], cholesterolEnum[x[5]], heartDiseaseEnum[x[6]]])
data = np.array(data)
N = len(data)
p_age = bp.nodes.Dirichlet(1.0*np.ones(5))
age = bp.nodes.Categorical(p_age, plates=(N,))
age.observe(data[:,0])
p_gender = bp.nodes.Dirichlet(1.0*np.ones(2))
gender = bp.nodes.Categorical(p_gender, plates=(N,))
gender.observe(data[:,1])
p_familyhistory = bp.nodes.Dirichlet(1.0*np.ones(2))
familyhistory = bp.nodes.Categorical(p_familyhistory, plates=(N,))
familyhistory.observe(data[:,2])
p_diet = bp.nodes.Dirichlet(1.0*np.ones(3))
diet = bp.nodes.Categorical(p_diet, plates=(N,))
diet.observe(data[:,3])
p_lifestyle = bp.nodes.Dirichlet(1.0*np.ones(4))
lifestyle = bp.nodes.Categorical(p_lifestyle, plates=(N,))
lifestyle.observe(data[:,4])
p_cholesterol = bp.nodes.Dirichlet(1.0*np.ones(3))
cholesterol = bp.nodes.Categorical(p_cholesterol, plates=(N,))
cholesterol.observe(data[:,5])
p_heartdisease = bp.nodes.Dirichlet(np.ones(2), plates=(5, 2, 2, 3, 4, 3))
heartdisease = bp.nodes.MultiMixture([age, gender, familyhistory, diet, lifestyle, cholesterol],
bp.nodes.Categorical, p_heartdisease)
heartdisease.observe(data[:,6])
p_heartdisease.update()
m = 0
while m == 0:
    print("\n")
    res = bp.nodes.MultiMixture([int(input('Enter Age: ' + str(ageEnum))),
                                 int(input('Enter Gender: ' + str(genderEnum))),
                                 int(input('Enter FamilyHistory: ' + str(familyHistoryEnum))),
                                 int(input('Enter Diet: ' + str(dietEnum))),
                                 int(input('Enter LifeStyle: ' + str(lifeStyleEnum))),
                                 int(input('Enter Cholesterol: ' + str(cholesterolEnum)))],
                                bp.nodes.Categorical, p_heartdisease).get_moments()[0][heartDiseaseEnum['Yes']]
    print("Probability(HeartDisease) = " + str(res))
    m = int(input("Enter for Continue:0, Exit:1 "))
INPUT:
SuperSeniorCitizen,Male,Yes,Medium,Sedetary,High,Yes
SuperSeniorCitizen,Female,Yes,Medium,Sedetary,High,Yes
SeniorCitizen,Male,No,High,Moderate,BorderLine,Yes
Teen,Male,Yes,Medium,Sedetary,Normal,No
Youth,Female,Yes,High,Athlete,Normal,No
MiddleAged,Male,Yes,Medium,Active,High,Yes
Teen,Male,Yes,High,Moderate,High,Yes
SuperSeniorCitizen,Male,Yes,Medium,Sedetary,High,Yes
Youth,Female,Yes,High,Athlete,Normal,No
SeniorCitizen,Female,No,High,Athlete,Normal,Yes
Teen,Female,No,Medium,Moderate,High,Yes
Teen,Male,Yes,Medium,Sedetary,Normal,No
MiddleAged,Female,No,High,Athlete,High,No
MiddleAged,Male,Yes,Medium,Active,High,Yes
Youth,Female,Yes,High,Athlete,BorderLine,No
SuperSeniorCitizen,Male,Yes,High,Athlete,Normal,Yes
SeniorCitizen,Female,No,Medium,Moderate,BorderLine,Yes
Youth,Female,Yes,Medium,Athlete,BorderLine,No
Teen,Male,Yes,Medium,Sedetary,Normal,No
Program 8: Apply the EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set for
clustering using the k-Means algorithm. Compare the results of these two algorithms and comment on the
quality of clustering.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# load the driver data (file name assumed; use the CSV listed under Input below)
data = pd.read_csv('driverdata.csv')
print(data)
f1 = data['Distance_Feature'].values
f2 = data['Speeding_Feature'].values

X = np.matrix(list(zip(f1, f2)))
plt.plot()
plt.xlim([0, 100])
plt.ylim([0, 50])
plt.title('Dataset')
plt.ylabel('speeding_feature')
plt.xlabel('Distance_Feature')
plt.scatter(f1, f2)
plt.show()

# create new plot and data
plt.plot()
colors = ['b', 'g', 'r']
markers = ['o', 'v', 's']

# KMeans algorithm with K = 3
kmeans_model = KMeans(n_clusters=3).fit(X)
plt.plot()
for i, l in enumerate(kmeans_model.labels_):
    plt.plot(f1[i], f2[i], color=colors[l], marker=markers[l], ls='None')
plt.xlim([0, 100])
plt.ylim([0, 50])
plt.show()
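The experiment statement above also asks for clustering the same data with the EM algorithm and comparing
the two results; that part is not shown in the listing. A minimal sketch using scikit-learn's
GaussianMixture, which implements EM for Gaussian mixture models (the file name and the choice of 3
clusters are assumptions carried over from the k-Means code), is:
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

# assumed file name: the same driver data used by the k-Means code above
data = pd.read_csv('driverdata.csv')
X = data[['Distance_Feature', 'Speeding_Feature']].values

# EM clustering via a Gaussian mixture model with 3 components
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
em_labels = gmm.predict(X)

# k-Means with the same number of clusters, for comparison
km_labels = KMeans(n_clusters=3, random_state=0).fit_predict(X)

print("EM (GaussianMixture) labels:", em_labels)
print("k-Means labels             :", km_labels)
# the two label vectors can then be compared (visually, or with a measure such
# as the silhouette score) to comment on the quality of the clustering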
Input:
Driver_ID,Distance_Feature,Speeding_Feature
3423311935,71.24,28
3423313212,52.53,25
3423313724,64.54,27
3423311373,55.69,22
3423310999,54.58,25
3423313857,41.91,10
3423312432,58.64,20
3423311434,52.02,8
3423311328,31.25,34
3423312488,44.31,19
3423311254,49.35,40
3423312943,58.07,45
3423312536,44.22,22
3423311542,55.73,19
3423312176,46.63,43
3423314176,52.97,32
3423314202,46.25,35
3423311346,51.55,27
3423310666,57.05,26
3423313527,58.45,30
3423312182,43.42,23
3423313590,55.68,37
3423312268,55.15,18
OUTPUT:
Driver_ID Distance_Feature Speeding_Feature
0 3423311935 71.24 28
1 3423313212 52.53 25
2 3423313724 64.54 27
3 3423311373 55.69 22
4 3423310999 54.58 25
5 3423313857 41.91 10
6 3423312432 58.64 20
7 3423311434 52.02 8
8 3423311328 31.25 34
9 3423312488 44.31 19
10 3423311254 49.35 40
11 3423312943 58.07 45
12 3423312536 44.22 22
13 3423311542 55.73 19
14 3423312176 46.63 43
15 3423314176 52.97 32
16 3423314202 46.25 35
17 3423311346 51.55 27
18 3423310666 57.05 26
19 3423313527 58.45 30
20 3423312182 43.42 23
21 3423313590 55.68 37
22 3423312268 55.15 18
Program 9: Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set. Print
both correct and wrong predictions. Java/Python ML library classes can be used for this problem.
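A program listing for this experiment is not included in the manual; a minimal sketch using scikit-learn's
KNeighborsClassifier (the built-in iris loader, the 70/30 split, and k = 5 are assumptions) that produces
output of the kind shown below is:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn import metrics

iris = load_iris()
print(iris.data)     # feature matrix (sepal/petal lengths and widths)
print(iris.target)   # class labels 0, 1, 2

x_train, x_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.3)

classifier = KNeighborsClassifier(n_neighbors=5).fit(x_train, y_train)
y_pred = classifier.predict(x_test)

# print correct and wrong predictions for each test sample
for sample, actual, predicted in zip(x_test, y_test, y_pred):
    result = "Correct" if actual == predicted else "Wrong"
    print(result, "-> actual:", actual, "predicted:", predicted, "sample:", sample)

print("confusion matrix is as follows")
print(metrics.confusion_matrix(y_test, y_pred))
print("Accuracy metrics")
print(metrics.classification_report(y_test, y_pred))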
OUTPUT:
[[5.1 3.5 1.4 0.2]
[4.9 3. 1.4 0.2]
[4.7 3.2 1.3 0.2]
[4.6 3.1 1.5 0.2]
[5. 3.6 1.4 0.2]
[5.4 3.9 1.7 0.4]
[4.6 3.4 1.4 0.3]
[5. 3.4 1.5 0.2]
[4.4 2.9 1.4 0.2]
[4.9 3.1 1.5 0.1]
[5.4 3.7 1.5 0.2]
[4.8 3.4 1.6 0.2]
[4.8 3. 1.4 0.1]
[4.3 3. 1.1 0.1]
[5.8 4. 1.2 0.2]
[5.7 4.4 1.5 0.4]
[5.4 3.9 1.3 0.4]
[5.1 3.5 1.4 0.3]
[5.7 3.8 1.7 0.3]
[5.1 3.8 1.5 0.3]
[5.4 3.4 1.7 0.2]
[5.1 3.7 1.5 0.4]
[4.6 3.6 1. 0.2]
[5.1 3.3 1.7 0.5]
[4.8 3.4 1.9 0.2]
[5. 3. 1.6 0.2]
[5. 3.4 1.6 0.4]
[5.2 3.5 1.5 0.2]
[5.2 3.4 1.4 0.2]
[4.7 3.2 1.6 0.2]
[4.8 3.1 1.6 0.2]
[5.4 3.4 1.5 0.4]
[5.2 4.1 1.5 0.1]
[5.5 4.2 1.4 0.2]
[4.9 3.1 1.5 0.1]
[5. 3.2 1.2 0.2]
[5.5 3.5 1.3 0.2]
[4.9 3.1 1.5 0.1]
[4.4 3. 1.3 0.2]
[5.1 3.4 1.5 0.2]
[5. 3.5 1.3 0.3]
[4.5 2.3 1.3 0.3]
[4.4 3.2 1.3 0.2]
[5. 3.5 1.6 0.6]
[5.1 3.8 1.9 0.4]
[4.8 3. 1.4 0.3]
[5.1 3.8 1.6 0.2]
[4.6 3.2 1.4 0.2]
[5.3 3.7 1.5 0.2]
[5. 3.3 1.4 0.2]
[7. 3.2 4.7 1.4]
[6.4 3.2 4.5 1.5]
[6.9 3.1 4.9 1.5]
[5.5 2.3 4. 1.3]
[6.5 2.8 4.6 1.5]
[5.7 2.8 4.5 1.3]
[6.3 3.3 4.7 1.6]
[4.9 2.4 3.3 1. ]
[6.6 2.9 4.6 1.3]
[5.2 2.7 3.9 1.4]
[5. 2. 3.5 1. ]
[5.9 3. 4.2 1.5]
[6. 2.2 4. 1. ]
[6.1 2.9 4.7 1.4]
[5.6 2.9 3.6 1.3]
[6.7 3.1 4.4 1.4]
[5.6 3. 4.5 1.5]
[5.8 2.7 4.1 1. ]
[6.2 2.2 4.5 1.5]
[5.6 2.5 3.9 1.1]
[5.9 3.2 4.8 1.8]
[6.1 2.8 4. 1.3]
[6.3 2.5 4.9 1.5]
[6.1 2.8 4.7 1.2]
[6.4 2.9 4.3 1.3]
[6.6 3. 4.4 1.4]
[6.8 2.8 4.8 1.4]
[6.7 3. 5. 1.7]
[6. 2.9 4.5 1.5]
[5.7 2.6 3.5 1. ]
[5.5 2.4 3.8 1.1]
[5.5 2.4 3.7 1. ]
[5.8 2.7 3.9 1.2]
[6. 2.7 5.1 1.6]
[5.4 3. 4.5 1.5]
[6. 3.4 4.5 1.6]
[6.7 3.1 4.7 1.5]
[6.3 2.3 4.4 1.3]
[5.6 3. 4.1 1.3]
[5.5 2.5 4. 1.3]
[5.5 2.6 4.4 1.2]
[6.1 3. 4.6 1.4]
[5.8 2.6 4. 1.2]
[5. 2.3 3.3 1. ]
[5.6 2.7 4.2 1.3]
[5.7 3. 4.2 1.2]
[5.7 2.9 4.2 1.3]
[6.2 2.9 4.3 1.3]
[5.1 2.5 3. 1.1]
[5.7 2.8 4.1 1.3]
[6.3 3.3 6. 2.5]
[5.8 2.7 5.1 1.9]
[7.1 3. 5.9 2.1]
[6.3 2.9 5.6 1.8]
[6.5 3. 5.8 2.2]
[7.6 3. 6.6 2.1]
[4.9 2.5 4.5 1.7]
[7.3 2.9 6.3 1.8]
[6.7 2.5 5.8 1.8]
[7.2 3.6 6.1 2.5]
[6.5 3.2 5.1 2. ]
[6.4 2.7 5.3 1.9]
[6.8 3. 5.5 2.1]
[5.7 2.5 5. 2. ]
[5.8 2.8 5.1 2.4]
[6.4 3.2 5.3 2.3]
[6.5 3. 5.5 1.8]
[7.7 3.8 6.7 2.2]
[7.7 2.6 6.9 2.3]
[6. 2.2 5. 1.5]
[6.9 3.2 5.7 2.3]
[5.6 2.8 4.9 2. ]
[7.7 2.8 6.7 2. ]
[6.3 2.7 4.9 1.8]
[6.7 3.3 5.7 2.1]
[7.2 3.2 6. 1.8]
[6.2 2.8 4.8 1.8]
[6.1 3. 4.9 1.8]
[6.4 2.8 5.6 2.1]
[7.2 3. 5.8 1.6]
[7.4 2.8 6.1 1.9]
[7.9 3.8 6.4 2. ]
[6.4 2.8 5.6 2.2]
[6.3 2.8 5.1 1.5]
[6.1 2.6 5.6 1.4]
[7.7 3. 6.1 2.3]
[6.3 3.4 5.6 2.4]
[6.4 3.1 5.5 1.8]
[6. 3. 4.8 1.8]
[6.9 3.1 5.4 2.1]
[6.7 3.1 5.6 2.4]
[6.9 3.1 5.1 2.3]
[5.8 2.7 5.1 1.9]
[6.8 3.2 5.9 2.3]
[6.7 3.3 5.7 2.5]
[6.7 3. 5.2 2.3]
[6.3 2.5 5. 1.9]
[6.5 3. 5.2 2. ]
[6.2 3.4 5.4 2.3]
[5.9 3. 5.1 1.8]]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2]
confusion matrix is as follows
[[ 9 0 0]
[ 0 17 2]
[ 0 2 15]]
Accuracy metrics
precision recall f1-score support
Program 10: Implement the non-parametric Locally Weighted Regression algorithm in order to fit data points.
Select an appropriate data set for your experiment and draw graphs.
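A program listing for this experiment is likewise not included; a minimal sketch of locally weighted
regression fitted to the tips data listed below (the file name tips.csv, the bandwidth tau, and the choice
of Total_bill versus tip are assumptions) is:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def local_weighted_regression(x0, X, y, tau):
    # weight each training point by its closeness to the query point x0
    w = np.exp(-np.sum((X - x0)**2, axis=1) / (2 * tau**2))
    W = np.diag(w)
    # weighted least squares: beta = (X^T W X)^-1 X^T W y
    beta = np.linalg.pinv(X.T @ W @ X) @ X.T @ W @ y
    return x0 @ beta

data = pd.read_csv('tips.csv')          # assumed file name for the data listed below
bill = data['Total_bill'].values
tip = data['tip'].values

X = np.column_stack((np.ones(len(bill)), bill))   # add a bias column for the intercept
tau = 5.0                                          # bandwidth: smaller tau gives a more local fit

domain = np.linspace(bill.min(), bill.max(), 200)
preds = [local_weighted_regression(np.array([1.0, x]), X, tip, tau) for x in domain]

plt.scatter(bill, tip, color='green', label='data')
plt.plot(domain, preds, color='red', label='locally weighted fit')
plt.xlabel('Total_bill')
plt.ylabel('tip')
plt.legend()
plt.show()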
Total_bill,tip,sex,smoker,day,time,size
16.99,1.01,Female,No,Sun,Dinner,2
10.34,1.66,Male,No,Sun,Dinner,3
21.01,3.5,Male,No,Sun,Dinner,3
23.68,3.31,Male,No,Sun,Dinner,2
24.59,3.61,Female,No,Sun,Dinner,4
25.29,4.71,Male,No,Sun,Dinner,4
8.77,2,Male,No,Sun,Dinner,2
26.88,3.12,Male,No,Sun,Dinner,4
15.04,1.96,Male,No,Sun,Dinner,2
14.78,3.23,Male,No,Sun,Dinner,2
10.27,1.71,Male,No,Sun,Dinner,2
35.26,5,Female,No,Sun,Dinner,4
15.42,1.57,Male,No,Sun,Dinner,2
18.43,3,Male,No,Sun,Dinner,4
14.83,3.02,Female,No,Sun,Dinner,2
21.58,3.92,Male,No,Sun,Dinner,2
10.33,1.67,Female,No,Sun,Dinner,3
16.29,3.71,Male,No,Sun,Dinner,3
16.97,3.5,Female,No,Sun,Dinner,3
20.65,3.35,Male,No,Sat,Dinner,3
17.92,4.08,Male,No,Sat,Dinner,2
20.29,2.75,Female,No,Sat,Dinner,2
15.77,2.23,Female,No,Sat,Dinner,2
39.42,7.58,Male,No,Sat,Dinner,4
19.82,3.18,Male,No,Sat,Dinner,2
17.81,2.34,Male,No,Sat,Dinner,4
13.37,2,Male,No,Sat,Dinner,2
12.69,2,Male,No,Sat,Dinner,2
21.7,4.3,Male,No,Sat,Dinner,2
19.65,3,Female,No,Sat,Dinner,2
9.55,1.45,Male,No,Sat,Dinner,2
18.35,2.5,Male,No,Sat,Dinner,4
15.06,3,Female,No,Sat,Dinner,2
20.69,2.45,Female,No,Sat,Dinner,4
17.78,3.27,Male,No,Sat,Dinner,2
24.06,3.6,Male,No,Sat,Dinner,3
16.31,2,Male,No,Sat,Dinner,3
16.93,3.07,Female,No,Sat,Dinner,3
18.69,2.31,Male,No,Sat,Dinner,3
31.27,5,Male,No,Sat,Dinner,3
16.04,2.24,Male,No,Sat,Dinner,3
17.46,2.54,Male,No,Sun,Dinner,2
13.94,3.06,Male,No,Sun,Dinner,2
9.68,1.32,Male,No,Sun,Dinner,2
30.4,5.6,Male,No,Sun,Dinner,4
18.29,3,Male,No,Sun,Dinner,2
22.23,5,Male,No,Sun,Dinner,2
32.4,6,Male,No,Sun,Dinner,4
28.55,2.05,Male,No,Sun,Dinner,3
18.04,3,Male,No,Sun,Dinner,2
12.54,2.5,Male,No,Sun,Dinner,2
10.29,2.6,Female,No,Sun,Dinner,2
34.81,5.2,Female,No,Sun,Dinner,4
9.94,1.56,Male,No,Sun,Dinner,2
25.56,4.34,Male,No,Sun,Dinner,4
19.49,3.51,Male,No,Sun,Dinner,2
38.01,3,Male,Yes,Sat,Dinner,4
26.41,1.5,Female,No,Sat,Dinner,2
11.24,1.76,Male,Yes,Sat,Dinner,2
48.27,6.73,Male,No,Sat,Dinner,4
20.29,3.21,Male,Yes,Sat,Dinner,2
13.81,2,Male,Yes,Sat,Dinner,2
11.02,1.98,Male,Yes,Sat,Dinner,2
18.29,3.76,Male,Yes,Sat,Dinner,4
17.59,2.64,Male,No,Sat,Dinner,3
20.08,3.15,Male,No,Sat,Dinner,3
16.45,2.47,Female,No,Sat,Dinner,2
3.07,1,Female,Yes,Sat,Dinner,1
20.23,2.01,Male,No,Sat,Dinner,2
15.01,2.09,Male,Yes,Sat,Dinner,2
12.02,1.97,Male,No,Sat,Dinner,2
17.07,3,Female,No,Sat,Dinner,3
26.86,3.14,Female,Yes,Sat,Dinner,2
25.28,5,Female,Yes,Sat,Dinner,2
14.73,2.2,Female,No,Sat,Dinner,2
10.51,1.25,Male,No,Sat,Dinner,2
17.92,3.08,Male,Yes,Sat,Dinner,2
27.2,4,Male,No,Thur,Lunch,4
22.76,3,Male,No,Thur,Lunch,2
17.29,2.71,Male,No,Thur,Lunch,2
19.44,3,Male,Yes,Thur,Lunch,2
16.66,3.4,Male,No,Thur,Lunch,2
10.07,1.83,Female,No,Thur,Lunch,1
32.68,5,Male,Yes,Thur,Lunch,2
15.98,2.03,Male,No,Thur,Lunch,2
34.83,5.17,Female,No,Thur,Lunch,4
13.03,2,Male,No,Thur,Lunch,2
18.28,4,Male,No,Thur,Lunch,2
24.71,5.85,Male,No,Thur,Lunch,2
21.16,3,Male,No,Thur,Lunch,2
28.97,3,Male,Yes,Fri,Dinner,2
22.49,3.5,Male,No,Fri,Dinner,2
5.75,1,Female,Yes,Fri,Dinner,2
16.32,4.3,Female,Yes,Fri,Dinner,2
22.75,3.25,Female,No,Fri,Dinner,2
40.17,4.73,Male,Yes,Fri,Dinner,4
27.28,4,Male,Yes,Fri,Dinner,2
12.03,1.5,Male,Yes,Fri,Dinner,2
21.01,3,Male,Yes,Fri,Dinner,2
12.46,1.5,Male,No,Fri,Dinner,2
11.35,2.5,Female,Yes,Fri,Dinner,2
15.38,3,Female,Yes,Fri,Dinner,2
44.3,2.5,Female,Yes,Sat,Dinner,3
22.42,3.48,Female,Yes,Sat,Dinner,2
20.92,4.08,Female,No,Sat,Dinner,2
15.36,1.64,Male,Yes,Sat,Dinner,2
20.49,4.06,Male,Yes,Sat,Dinner,2
25.21,4.29,Male,Yes,Sat,Dinner,2
18.24,3.76,Male,No,Sat,Dinner,2
14.31,4,Female,Yes,Sat,Dinner,2
14,3,Male,No,Sat,Dinner,2
7.25,1,Female,No,Sat,Dinner,1
38.07,4,Male,No,Sun,Dinner,3
23.95,2.55,Male,No,Sun,Dinner,2
25.71,4,Female,No,Sun,Dinner,3
17.31,3.5,Female,No,Sun,Dinner,2
29.93,5.07,Male,No,Sun,Dinner,4
10.65,1.5,Female,No,Thur,Lunch,2
12.43,1.8,Female,No,Thur,Lunch,2
24.08,2.92,Female,No,Thur,Lunch,4
11.69,2.31,Male,No,Thur,Lunch,2
13.42,1.68,Female,No,Thur,Lunch,2
14.26,2.5,Male,No,Thur,Lunch,2
15.95,2,Male,No,Thur,Lunch,2
12.48,2.52,Female,No,Thur,Lunch,2
29.8,4.2,Female,No,Thur,Lunch,6
8.52,1.48,Male,No,Thur,Lunch,2
14.52,2,Female,No,Thur,Lunch,2
11.38,2,Female,No,Thur,Lunch,2
22.82,2.18,Male,No,Thur,Lunch,3
19.08,1.5,Male,No,Thur,Lunch,2
20.27,2.83,Female,No,Thur,Lunch,2
11.17,1.5,Female,No,Thur,Lunch,2
12.26,2,Female,No,Thur,Lunch,2
18.26,3.25,Female,No,Thur,Lunch,2
8.51,1.25,Female,No,Thur,Lunch,2
10.33,2,Female,No,Thur,Lunch,2
14.15,2,Female,No,Thur,Lunch,2
16,2,Male,Yes,Thur,Lunch,2
13.16,2.75,Female,No,Thur,Lunch,2
17.47,3.5,Female,No,Thur,Lunch,2
34.3,6.7,Male,No,Thur,Lunch,6
41.19,5,Male,No,Thur,Lunch,5
27.05,5,Female,No,Thur,Lunch,6
16.43,2.3,Female,No,Thur,Lunch,2
8.35,1.5,Female,No,Thur,Lunch,2
18.64,1.36,Female,No,Thur,Lunch,3
11.87,1.63,Female,No,Thur,Lunch,2
9.78,1.73,Male,No,Thur,Lunch,2
7.51,2,Male,No,Thur,Lunch,2
14.07,2.5,Male,No,Sun,Dinner,2
13.13,2,Male,No,Sun,Dinner,2
17.26,2.74,Male,No,Sun,Dinner,3
24.55,2,Male,No,Sun,Dinner,4
19.77,2,Male,No,Sun,Dinner,4
29.85,5.14,Female,No,Sun,Dinner,5
48.17,5,Male,No,Sun,Dinner,6
25,3.75,Female,No,Sun,Dinner,4
13.39,2.61,Female,No,Sun,Dinner,2
16.49,2,Male,No,Sun,Dinner,4
21.5,3.5,Male,No,Sun,Dinner,4
12.66,2.5,Male,No,Sun,Dinner,2
16.21,2,Female,No,Sun,Dinner,3
13.81,2,Male,No,Sun,Dinner,2
17.51,3,Female,Yes,Sun,Dinner,2
24.52,3.48,Male,No,Sun,Dinner,3
20.76,2.24,Male,No,Sun,Dinner,2
31.71,4.5,Male,No,Sun,Dinner,4
10.59,1.61,Female,Yes,Sat,Dinner,2
10.63,2,Female,Yes,Sat,Dinner,2
50.81,10,Male,Yes,Sat,Dinner,3
15.81,3.16,Male,Yes,Sat,Dinner,2
7.25,5.15,Male,Yes,Sun,Dinner,2
31.85,3.18,Male,Yes,Sun,Dinner,2
16.82,4,Male,Yes,Sun,Dinner,2
32.9,3.11,Male,Yes,Sun,Dinner,2
17.89,2,Male,Yes,Sun,Dinner,2
14.48,2,Male,Yes,Sun,Dinner,2
9.6,4,Female,Yes,Sun,Dinner,2
34.63,3.55,Male,Yes,Sun,Dinner,2
34.65,3.68,Male,Yes,Sun,Dinner,4
23.33,5.65,Male,Yes,Sun,Dinner,2
45.35,3.5,Male,Yes,Sun,Dinner,3
23.17,6.5,Male,Yes,Sun,Dinner,4
40.55,3,Male,Yes,Sun,Dinner,2
20.69,5,Male,No,Sun,Dinner,5
20.9,3.5,Female,Yes,Sun,Dinner,3
30.46,2,Male,Yes,Sun,Dinner,5
18.15,3.5,Female,Yes,Sun,Dinner,3
23.1,4,Male,Yes,Sun,Dinner,3
15.69,1.5,Male,Yes,Sun,Dinner,2
19.81,4.19,Female,Yes,Thur,Lunch,2
28.44,2.56,Male,Yes,Thur,Lunch,2
15.48,2.02,Male,Yes,Thur,Lunch,2
16.58,4,Male,Yes,Thur,Lunch,2
7.56,1.44,Male,No,Thur,Lunch,2
10.34,2,Male,Yes,Thur,Lunch,2
43.11,5,Female,Yes,Thur,Lunch,4
13,2,Female,Yes,Thur,Lunch,2
13.51,2,Male,Yes,Thur,Lunch,2
18.71,4,Male,Yes,Thur,Lunch,3
12.74,2.01,Female,Yes,Thur,Lunch,2
13,2,Female,Yes,Thur,Lunch,2
16.4,2.5,Female,Yes,Thur,Lunch,2
20.53,4,Male,Yes,Thur,Lunch,4
16.47,3.23,Female,Yes,Thur,Lunch,3
26.59,3.41,Male,Yes,Sat,Dinner,3
38.73,3,Male,Yes,Sat,Dinner,4
24.27,2.03,Male,Yes,Sat,Dinner,2
12.76,2.23,Female,Yes,Sat,Dinner,2
30.06,2,Male,Yes,Sat,Dinner,3
25.89,5.16,Male,Yes,Sat,Dinner,4
48.33,9,Male,No,Sat,Dinner,4
13.27,2.5,Female,Yes,Sat,Dinner,2
28.17,6.5,Female,Yes,Sat,Dinner,3
12.9,1.1,Female,Yes,Sat,Dinner,2
28.15,3,Male,Yes,Sat,Dinner,5
11.59,1.5,Male,Yes,Sat,Dinner,2
7.74,1.44,Male,Yes,Sat,Dinner,2
30.14,3.09,Female,Yes,Sat,Dinner,4
12.16,2.2,Male,Yes,Fri,Lunch,2
13.42,3.48,Female,Yes,Fri,Lunch,2
8.58,1.92,Male,Yes,Fri,Lunch,1
15.98,3,Female,No,Fri,Lunch,3
13.42,1.58,Male,Yes,Fri,Lunch,2
16.27,2.5,Female,Yes,Fri,Lunch,2
10.09,2,Female,Yes,Fri,Lunch,2
20.45,3,Male,No,Sat,Dinner,4
13.28,2.72,Male,No,Sat,Dinner,2
22.12,2.88,Female,Yes,Sat,Dinner,2
24.01,2,Male,Yes,Sat,Dinner,4
15.69,3,Male,Yes,Sat,Dinner,3
11.61,3.39,Male,No,Sat,Dinner,2
10.77,1.47,Male,No,Sat,Dinner,2
15.53,3,Male,Yes,Sat,Dinner,2
10.07,1.25,Male,No,Sat,Dinner,2
12.6,1,Male,Yes,Sat,Dinner,2
32.83,1.17,Male,Yes,Sat,Dinner,2
35.83,4.67,Female,No,Sat,Dinner,3
29.03,5.92,Male,No,Sat,Dinner,3
27.18,2,Female,Yes,Sat,Dinner,2
22.67,2,Male,Yes,Sat,Dinner,2
17.82,1.75,Male,No,Sat,Dinner,2
18.78,3,Female,No,Thur,Dinner,2
OUTPUT: