ML Lab Programs 1-10-Converted NAM COLLEGE PDF

The document describes implementations of several machine learning algorithms:
1. FIND-S algorithm to find the most specific hypothesis from training data read from a CSV file.
2. Candidate-Elimination algorithm to output hypotheses consistent with training examples stored in a CSV file.
3. ID3 decision tree algorithm demonstrated on a sample dataset to classify new samples.
4. Backpropagation neural network implemented and tested on sample data.
5. Naive Bayes classifier implemented to classify training and test data from a CSV file and compute accuracy.

Uploaded by

Pradyumna A Kubear

MACHINE LEARNING LABORATORY (15CS76)

1. Implement and demonstrate the FIND-S algorithm for finding the most specific hypothesis based
on a given set of training data samples. Read the training data from a .CSV file.

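import csv

# Read every row of the training file; the last column is the class label.
with open('data.csv', 'r') as f:
    reader = csv.reader(f)
    your_list = list(reader)

# Start from the most specific hypothesis: '0' means "no value accepted yet".
h = [['0', '0', '0', '0', '0', '0']]

for i in your_list:
    print(i)
    if i[-1] == "yes":                  # FIND-S ignores negative examples
        # Compare the attributes only (i[:-1]), never the label column.
        for j, x in enumerate(i[:-1]):
            if h[0][j] == '0':
                h[0][j] = x             # first positive example: copy the value
            elif x != h[0][j]:
                h[0][j] = '?'           # conflicting values: generalize
print("final hypothesis is")
print(h)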

Data Set: (file name: data.csv)

sunny,warm,normal,strong,warm,same,yes
sunny,warm,high,strong,warm,same,yes
rain,cold,high,strong,warm,change,no
sunny,warm,hgh,strong,cool,change,yes

output:

C:\Users\admin\PycharmProjects\1rr16cs181\venv\Scripts\python.exe
C:/Users/admin/PycharmProjects/1rr16cs181/finds.py
['sunny', 'warm', 'normal', 'strong', 'warm', 'same', 'yes']
['sunny', 'warm', 'high', 'strong', 'warm', 'same', 'yes']
['rain', 'cold', 'high', 'strong', 'warm', 'change', 'no']
['sunny', 'warm', 'hgh', 'strong', 'cool', 'change', 'yes']
final hypothesis is
[['sunny', 'warm', '?', 'strong', '?', '?']]

Process finished with exit code 0
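
The same FIND-S procedure can be written as a small function that works on rows already held in memory, which makes the result easy to check against the output recorded above. This is a minimal sketch rather than the lab's prescribed listing; the function name `find_s` and the inline `rows` list (a copy of data.csv) are assumptions introduced for illustration.

```python
def find_s(rows, positive_label="yes"):
    """Return the most specific hypothesis that covers every positive row."""
    hypothesis = None
    for *attributes, label in rows:
        if label != positive_label:
            continue  # FIND-S ignores negative examples
        if hypothesis is None:
            hypothesis = list(attributes)  # first positive example is copied as-is
        else:
            # keep a value only where it agrees with the new example
            hypothesis = [h if h == a else "?" for h, a in zip(hypothesis, attributes)]
    return hypothesis

# Copy of data.csv as listed above (including the 'hgh' entry)
rows = [
    ["sunny", "warm", "normal", "strong", "warm", "same", "yes"],
    ["sunny", "warm", "high", "strong", "warm", "same", "yes"],
    ["rain", "cold", "high", "strong", "warm", "change", "no"],
    ["sunny", "warm", "hgh", "strong", "cool", "change", "yes"],
]
print(find_s(rows))
```

Returning `None` when there are no positive examples is a design choice of this sketch; the listing above would leave the hypothesis as all '0' in that case.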

2. For a given set of training data examples stored in a .CSV file, implement and demonstrate the
Candidate-Elimination algorithm to output a description of the set of all hypotheses consistent with
the training examples.

(NOTE: Install numpy, pandas)

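import numpy as np
import pandas as pd

# NOTE: read_csv treats the first line of data.csv as a header row, so the first
# training example is not part of `concepts` (the recorded output was produced
# this way). Pass header=None to read_csv to include it.
data = pd.DataFrame(data=pd.read_csv('data.csv'))
concepts = np.array(data.iloc[:, 0:-1])   # attribute columns
target = np.array(data.iloc[:, -1])       # class label column

def learn(concepts, target):
    # Specific boundary starts as the first example; general boundary as all '?'.
    specific_h = concepts[0].copy()
    print("intilization of specific_h and general_h")
    print(specific_h)
    general_h = [["?" for i in range(len(specific_h))] for i in range(len(specific_h))]
    print(general_h)
    for i, h in enumerate(concepts):
        if target[i] == "yes":
            # Positive example: generalize specific_h where it disagrees.
            for x in range(len(specific_h)):
                if h[x] != specific_h[x]:
                    specific_h[x] = '?'
                    general_h[x][x] = '?'
        if target[i] == "no":
            # Negative example: specialize general_h on the differing attributes.
            for x in range(len(specific_h)):
                if h[x] != specific_h[x]:
                    general_h[x][x] = specific_h[x]
                else:
                    general_h[x][x] = '?'
        print("steps of candidate elimination algorithm", i + 1)
        print("specific_h", i + 1, "\n")
        print(specific_h)
        print("general_h", i + 1, "\n")
        print(general_h)

    # Drop the rows of general_h that are still fully general.
    indices = [i for i, val in enumerate(general_h) if val == ['?', '?', '?', '?', '?', '?']]
    for i in indices:
        general_h.remove(['?', '?', '?', '?', '?', '?'])
    return specific_h, general_h

s_final, g_final = learn(concepts, target)
print("Final specific_h:", s_final, sep="\n")
print("final general_h:", g_final, sep="\n")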

Dataset: (file name: data.csv)

sunny,warm,normal,strong,warm,same,yes
sunny,warm,high,strong,warm,same,yes
rain,cold,high,strong,warm,change,no
sunny,warm,hgh,strong,cool,change,yes
Output:

C:\Users\admin\PycharmProjects\1rr16cs181\venv\Scripts\python.exe
C:/Users/admin/PycharmProjects/1rr16cs181/candi.py

intilization of specific_h and general_h

['sunny' 'warm' 'high' 'strong' 'warm' 'same']

[['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'],
['?', '?', '?', '?', '?', '?']]

steps of candidate elimination algorithm 3

specific_h 3

['sunny' 'warm' '?' 'strong' '?' '?']

general_h 3

[['sunny', '?', '?', '?', '?', '?'], ['?', 'warm', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?',
'?', '?', '?'], ['?', '?', '?', '?', '?', '?']]

Final specific_h:

['sunny' 'warm' '?' 'strong' '?' '?']

final general_h:

[['sunny', '?', '?', '?', '?', '?'], ['?', 'warm', '?', '?', '?', '?']]

Process finished with exit code 0
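
The final boundaries reported above can be checked directly: a hypothesis is consistent with the training data when it covers every positive example and no negative one. The following is a minimal sketch of that check; the helper names `covers` and `is_consistent` are assumptions introduced here, and `rows` is a copy of data.csv.

```python
def covers(hypothesis, example):
    """True if every constraint is '?' or equals the example's value."""
    return all(h == "?" or h == e for h, e in zip(hypothesis, example))

def is_consistent(hypothesis, rows, positive_label="yes"):
    """True if the hypothesis covers exactly the positive examples."""
    return all(covers(hypothesis, row[:-1]) == (row[-1] == positive_label)
               for row in rows)

rows = [
    ["sunny", "warm", "normal", "strong", "warm", "same", "yes"],
    ["sunny", "warm", "high", "strong", "warm", "same", "yes"],
    ["rain", "cold", "high", "strong", "warm", "change", "no"],
    ["sunny", "warm", "hgh", "strong", "cool", "change", "yes"],
]

# Boundaries taken from the recorded output
specific_h = ["sunny", "warm", "?", "strong", "?", "?"]
general_h = [["sunny", "?", "?", "?", "?", "?"],
             ["?", "warm", "?", "?", "?", "?"]]

print(is_consistent(specific_h, rows))
print([is_consistent(g, rows) for g in general_h])
```

A hypothesis such as `["?", "?", "?", "strong", "?", "?"]` fails this check because it also covers the negative example.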

3. Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use an
appropriate data set for building the decision tree and apply this knowledge to classify a new
sample.
(NOTE: Install numpy)

import numpy as np
import math
from data_loader import read_data

class Node:
    def __init__(self, attribute):
        self.attribute = attribute   # attribute tested at this node
        self.children = []           # list of (attribute value, child Node)
        self.answer = ""             # class label, set only on leaf nodes
    #def __str__(self):
        #return self.attribute

def sub(data, col, delete):
    # Split data into one sub-array per distinct value found in column col.
    dict = {}
    items = np.unique(data[:, col])
    count = np.zeros(items.shape[0], dtype=np.int32)
    for x in range(items.shape[0]):
        for y in range(data.shape[0]):
            if data[y, col] == items[x]:
                count[x] += 1
    for x in range(items.shape[0]):
        # '|S32' stores byte strings, which is why the output shows b'...' values
        dict[items[x]] = np.empty((int(count[x]), data.shape[1]), dtype='|S32')
        pos = 0
        for y in range(data.shape[0]):
            if data[y, col] == items[x]:
                dict[items[x]][pos] = data[y]
                pos += 1
        if delete:
            dict[items[x]] = np.delete(dict[items[x]], col, 1)
    return items, dict

def entropy(s):
    items = np.unique(s)
    if items.size == 1:
        return 0
    counts = np.zeros(items.shape[0])
    sums = 0
    for x in range(items.shape[0]):
        counts[x] = sum(s == items[x]) / (s.size * 1.0)
    for count in counts:
        sums += -1 * count * math.log(count, 2)
    return sums

def gain(data, col):
    # Note: this returns the gain ratio (information gain divided by the
    # intrinsic value of the split); plain ID3 uses the information gain itself.
    items, dict = sub(data, col, delete=False)
    total_size = data.shape[0]
    entropies = np.zeros(items.shape[0])
    intrinsic = np.zeros(items.shape[0])
    for x in range(items.shape[0]):
        ratio = dict[items[x]].shape[0] / (total_size * 1.0)
        entropies[x] = ratio * entropy(dict[items[x]][:, -1])
        intrinsic[x] = ratio * math.log(ratio, 2)
    total_entropy = entropy(data[:, -1])
    iv = -1 * sum(intrinsic)
    for x in range(entropies.shape[0]):
        total_entropy -= entropies[x]
    return total_entropy / iv

def create(data, metadata):
    # All remaining rows share one label: make a leaf.
    if (np.unique(data[:, -1])).shape[0] == 1:
        node = Node("")
        node.answer = np.unique(data[:, -1])[0]
        return node

    # Otherwise split on the attribute with the highest score.
    gains = np.zeros(data.shape[1] - 1)
    for col in range(data.shape[1] - 1):
        gains[col] = gain(data, col)
    split = np.argmax(gains)
    node = Node(metadata[split])
    metadata = np.delete(metadata, split, 0)
    items, dict = sub(data, split, delete=True)

    for x in range(items.shape[0]):
        child = create(dict[items[x]], metadata)
        node.children.append((items[x], child))
    return node

def empty(size):
    # Indentation prefix for print_tree (three spaces per level)
    s = ""
    for x in range(size):
        s += "   "
    return s

def print_tree(node, level):
    if node.answer != "":
        print(empty(level), node.answer)
        return
    print(empty(level), node.attribute)
    for value, n in node.children:
        print(empty(level + 1), value)
        print_tree(n, level + 2)

metadata, traindata = read_data("data1.csv")
data = np.array(traindata)
node = create(data, metadata)
print_tree(node, 0)
data_loader.py [another supporting file]
import csv

def read_data(filename):
    with open(filename, 'r') as csvfile:
        datareader = csv.reader(csvfile, delimiter=',')
        headers = next(datareader)      # first line holds the column names
        metadata = []
        traindata = []
        for name in headers:
            metadata.append(name)
        for row in datareader:
            traindata.append(row)

    return (metadata, traindata)

Data set: (file name: data1.csv)

outlook,temperature,humidity,wind,playtennis
sunny,hot,high,weak,no
sunny,hot,high,strong,no
overcast,hot,high,weak,yes
rain,mild,high,weak,yes
rain,cool,normal,weak,yes
rain,cool,normal,strong,no
overcast,cool,normal,strong,yes
sunny,mild,high,weak,no
sunny,cool,normal,weak,yes
rain,mild,normal,weak,yes
sunny,mild,normal,strong,yes
overcast,mild,high,strong,yes
overcast,hot,normal,weak,yes
rain,mild,high,strong,no
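
To see why the tree splits on outlook first, the entropy of the label column and the information gain of each attribute can be computed directly from this data set. The sketch below uses plain information gain (the textbook ID3 criterion), not the gain ratio used in the listing; the function names and the inline copy of the rows are assumptions made for illustration. For this data the label entropy is about 0.940 and the gain for outlook about 0.247.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, col):
    """Entropy of the labels minus the weighted entropy after splitting on col."""
    labels = [row[-1] for row in rows]
    remainder = 0.0
    for value in set(row[col] for row in rows):
        subset = [row[-1] for row in rows if row[col] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder

# Copy of data1.csv without the header line
rows = [line.split(",") for line in """
sunny,hot,high,weak,no
sunny,hot,high,strong,no
overcast,hot,high,weak,yes
rain,mild,high,weak,yes
rain,cool,normal,weak,yes
rain,cool,normal,strong,no
overcast,cool,normal,strong,yes
sunny,mild,high,weak,no
sunny,cool,normal,weak,yes
rain,mild,normal,weak,yes
sunny,mild,normal,strong,yes
overcast,mild,high,strong,yes
overcast,hot,normal,weak,yes
rain,mild,high,strong,no
""".strip().splitlines()]

names = ["outlook", "temperature", "humidity", "wind"]
gains = [information_gain(rows, col) for col in range(len(names))]
for name, g in zip(names, gains):
    print(name, round(g, 4))
```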

Output:

C:\Users\admin\PycharmProjects\1rr16cs181\venv\Scripts\python.exe
C:/Users/admin/PycharmProjects/1rr16cs181/id3.py

outlook

overcast

b'yes'
rain

wind

b'strong'

b'no'

b'weak'

b'yes'

sunny

humidity

b'high'

b'no'

b'normal'

b'yes'

Process finished with exit code 0
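
The task also asks for a new sample to be classified, which the listing above does not do. One way to sketch that step is to walk the learned tree from the root to a leaf. The nested-dictionary representation, the `classify` helper and the sample below are assumptions introduced for illustration; the tree itself is transcribed from the recorded output.

```python
# Tree transcribed from the printed output: attribute -> value -> subtree or label
tree = {"outlook": {
    "overcast": "yes",
    "rain": {"wind": {"strong": "no", "weak": "yes"}},
    "sunny": {"humidity": {"high": "no", "normal": "yes"}},
}}

def classify(node, sample):
    """Follow the branch matching the sample's value until a leaf label is reached."""
    while isinstance(node, dict):
        attribute, branches = next(iter(node.items()))
        node = branches[sample[attribute]]
    return node

# Hypothetical new sample
sample = {"outlook": "sunny", "temperature": "cool", "humidity": "high", "wind": "strong"}
print(classify(tree, sample))
```

With the `Node` class from the listing, the same walk would compare the sample's value against the `(value, child)` pairs in `node.children` instead of dictionary keys.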

4. Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the
same using appropriate data sets.
(NOTE: Install numpy)

import numpy as np

x = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
y = np.array(([92], [86], [89]), dtype=float)
x = x / np.amax(x, axis=0)   # scale each input column to the range [0, 1]
y = y / 100                  # scale the targets to the range [0, 1]

def sigmoid(x):
    return (1 / (1 + np.exp(-x)))

def derivatives_sigmoid(x):
    # x must already be a sigmoid output, so the derivative is x * (1 - x)
    return x * (1 - x)

epoch = 7000
lr = 0.1
inputlayer_neuron = 2
hiddenlayer_neuron = 3
output_neuron = 1

# Random initial weights and biases
wh = np.random.uniform(size=(inputlayer_neuron, hiddenlayer_neuron))
bh = np.random.uniform(size=(1, hiddenlayer_neuron))
wout = np.random.uniform(size=(hiddenlayer_neuron, output_neuron))
bout = np.random.uniform(size=(1, output_neuron))

for i in range(epoch):
    # Forward pass
    hinp1 = np.dot(x, wh)
    hinp = hinp1 + bh
    hlayer_act = sigmoid(hinp)
    outinp1 = np.dot(hlayer_act, wout)
    outinp = outinp1 + bout
    output = sigmoid(outinp)

    # Backward pass: propagate the output error to the hidden layer
    EO = y - output
    outgrad = derivatives_sigmoid(output)
    d_output = EO * outgrad
    EH = d_output.dot(wout.T)
    hiddengrad = derivatives_sigmoid(hlayer_act)
    d_hiddenlayer = EH * hiddengrad

    # Weight updates (the biases bh and bout keep their initial values here)
    wout += hlayer_act.T.dot(d_output * lr)
    wh += x.T.dot(d_hiddenlayer) * lr

print("input:\n" + str(x))
print("actual output:\n" + str(y))
print("predicted output:\n", output)

OUTPUT

C:\Users\admin\PycharmProjects\1rr16cs181\venv\Scripts\python.exe
C:/Users/admin/PycharmProjects/1rr16cs181/nueral.py
input:
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
actual output:
[[0.92]
[0.86]
[0.89]]
predicted output:
[[0.89612636]
[0.87868519]
[0.89460368]]

Process finished with exit code 0
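
One detail of the listing that is easy to miss is that `derivatives_sigmoid` is applied to the activation (the sigmoid output), not to the pre-activation input. The sketch below checks this numerically with a central difference; the names `derivative_from_activation`, `z` and `eps` are assumptions introduced for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def derivative_from_activation(a):
    """Derivative of the sigmoid expressed through its output a = sigmoid(z)."""
    return a * (1.0 - a)

z = 0.5
a = sigmoid(z)

# Central-difference estimate of d(sigmoid)/dz at z
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)

print("from activation:", derivative_from_activation(a))
print("numeric estimate:", numeric)
print("wrong (applied to z):", derivative_from_activation(z))
```

The first two values agree closely, while applying the formula to `z` itself gives a different number, which is why the listing passes `output` and `hlayer_act` to `derivatives_sigmoid`.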

5. Write a program to implement the naïve Bayesian classifier for a sample training data set stored
as a .CSV file. Compute the accuracy of the classifier, considering a few test data sets.

print("\nNaive Bayes Classifier for concept learning problem")

import csv
import random
import math
import operator

def safe_div(x, y):
    # Division that returns 0 instead of raising when the divisor is 0
    if y == 0:
        return 0
    return x / y

def loadCsv(filename):
    # Read the CSV file and convert every field to float
    with open(filename) as csvfile:
        dataset = list(csv.reader(csvfile))
    for i in range(len(dataset)):
        dataset[i] = [float(x) for x in dataset[i]]
    return dataset

def splitDataset(dataset, splitRatio):
    # The first trainSize rows become the training set (random selection is
    # disabled); the remaining rows form the test set.
    trainSize = int(len(dataset) * splitRatio)
    trainSet = []
    copy = list(dataset)
    i = 0
    while len(trainSet) < trainSize:
        # index = random.randrange(len(copy))
        trainSet.append(copy.pop(i))
    return [trainSet, copy]

def separateByClass(dataset):
    # Group the rows by their class value (last column)
    separated = {}
    for i in range(len(dataset)):
        vector = dataset[i]
        if (vector[-1] not in separated):
            separated[vector[-1]] = []
        separated[vector[-1]].append(vector)
    return separated

def mean(numbers):
    return safe_div(sum(numbers), float(len(numbers)))

def stdev(numbers):
    # Sample standard deviation (divides by n - 1)
    avg = mean(numbers)
    variance = safe_div(sum([pow(x - avg, 2) for x in numbers]), float(len(numbers) - 1))
    return math.sqrt(variance)

def summarize(dataset):
    # (mean, stdev) per column; the last entry belongs to the class column and is dropped
    summaries = [(mean(attribute), stdev(attribute)) for attribute in zip(*dataset)]
    del summaries[-1]
    return summaries

def summarizeByClass(dataset):
    separated = separateByClass(dataset)
    summaries = {}
    for classValue, instances in separated.items():
        summaries[classValue] = summarize(instances)
    return summaries

def calculateProbability(x, mean, stdev):
    # Gaussian probability density of x for the given mean and standard deviation
    exponent = math.exp(-safe_div(math.pow(x - mean, 2), (2 * math.pow(stdev, 2))))
    final = safe_div(1, (math.sqrt(2 * math.pi) * stdev)) * exponent
    return final

def calculateClassProbabilities(summaries, inputVector):
    # Product of the per-attribute densities for every class
    probabilities = {}
    for classValue, classSummaries in summaries.items():
        probabilities[classValue] = 1
        for i in range(len(classSummaries)):
            mean, stdev = classSummaries[i]
            x = inputVector[i]
            probabilities[classValue] *= calculateProbability(x, mean, stdev)
    return probabilities

def predict(summaries, inputVector):
    # Return the class with the largest score
    probabilities = calculateClassProbabilities(summaries, inputVector)
    bestLabel, bestProb = None, -1
    for classValue, probability in probabilities.items():
        if bestLabel is None or probability > bestProb:
            bestProb = probability
            bestLabel = classValue
    return bestLabel

def getPredictions(summaries, testSet):
    predictions = []
    for i in range(len(testSet)):
        result = predict(summaries, testSet[i])
        predictions.append(result)
    return predictions

def getAccuracy(testSet, predictions):
    # Percentage of test rows whose predicted class equals the actual class
    correct = 0
    for i in range(len(testSet)):
        if testSet[i][-1] == predictions[i]:
            correct += 1
    accuracy = safe_div(correct, float(len(testSet))) * 100.0
    return accuracy

def main():
    filename = 'NaiveBayes ConceptLearning.csv'
    splitRatio = 0.75
    dataset = loadCsv(filename)
    trainingSet, testSet = splitDataset(dataset, splitRatio)
    print('Split {0} rows into'.format(len(dataset)))
    print('Number of Training data: ' + (repr(len(trainingSet))))
    print('Number of Test Data: ' + (repr(len(testSet))))
    print("\nThe values assumed for the concept learning attributes are\n")
    print("OUTLOOK=> Sunny=1 Overcast=2 Rain=3\nTEMPERATURE=> Hot=1 Mild=2 Cool=3\n"
          "HUMIDITY=> High=1 Normal=2\nWIND=> Weak=1 Strong=2")
    print("TARGET CONCEPT:PLAY TENNIS=> Yes=10 No=5")
    print("\nThe Training set are:")
    for x in trainingSet:
        print(x)
    print("\nThe Test data set are:")
    for x in testSet:
        print(x)
    print("\n")
    # prepare model
    summaries = summarizeByClass(trainingSet)
    # test model
    predictions = getPredictions(summaries, testSet)
    actual = []
    for i in range(len(testSet)):
        vector = testSet[i]
        actual.append(vector[-1])
    # Accuracy is the percentage of test rows whose predicted class matches the
    # actual class, so 100% means every test row was predicted correctly.
    print('Actual values: {0}%'.format(actual))
    print('Predictions: {0}%'.format(predictions))
    accuracy = getAccuracy(testSet, predictions)
    print('Accuracy: {0}%'.format(accuracy))

main()
Data Set: (file name: NaiveBayes ConceptLearning.csv)

1,1,1,1,5
1,1,1,2,5
2,1,1,2,10
3,2,1,1,10
3,3,2,1,10
3,3,2,2,5
2,3,2,2,10
1,2,1,1,5
1,3,2,1,10
3,2,2,2,10
1,2,2,2,10
2,2,1,2,10
2,1,2,1,10
3,2,1,2,5
1,2,1,2,10
1,2,1,2,5

OUTPUT

C:\Users\admin\PycharmProjects\ss\venv\Scripts\python.exe
"C:/Users/admin/Desktop/RNSIT_ML_LAB_PROGRAMS_JULY2018/RNSIT_ML_LAB_PROGRAM
S_JULY2018/5-naive-bayes/NaiveBayes ConceptLearning.py"

Naive Bayes Classifier for concept learning problem

Split 16 rows into
Number of Training data: 12
Number of Test Data: 4

The values assumed for the concept learning attributes are

OUTLOOK=> Sunny=1 Overcast=2 Rain=3

TEMPERATURE=> Hot=1 Mild=2 Cool=3
HUMIDITY=> High=1 Normal=2
WIND=> Weak=1 Strong=2
TARGET CONCEPT:PLAY TENNIS=> Yes=10 No=5

The Training set are:

[1.0, 1.0, 1.0, 1.0, 5.0]
[1.0, 1.0, 1.0, 2.0, 5.0]
[2.0, 1.0, 1.0, 2.0, 10.0]
[3.0, 2.0, 1.0, 1.0, 10.0]
[3.0, 3.0, 2.0, 1.0, 10.0]
[3.0, 3.0, 2.0, 2.0, 5.0]
[2.0, 3.0, 2.0, 2.0, 10.0]
[1.0, 2.0, 1.0, 1.0, 5.0]
[1.0, 3.0, 2.0, 1.0, 10.0]
[3.0, 2.0, 2.0, 2.0, 10.0]
[1.0, 2.0, 2.0, 2.0, 10.0]
[2.0, 2.0, 1.0, 2.0, 10.0]

The Test data set are:

[2.0, 1.0, 2.0, 1.0, 10.0]
[3.0, 2.0, 1.0, 2.0, 5.0]
[1.0, 2.0, 1.0, 2.0, 10.0]
[1.0, 2.0, 1.0, 2.0, 5.0]

Actual values: [10.0, 5.0, 10.0, 5.0]%

Predictions: [5.0, 10.0, 5.0, 5.0]%
Accuracy: 25.0%

Process finished with exit code 0
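
The listing scores each class by multiplying Gaussian densities and does not include the class prior. A common variant, sketched below, adds the prior and sums logarithms instead of multiplying probabilities to avoid underflow. This is not what the listing does; the function names, the two classes "A" and "B", their (mean, stdev) summaries, the priors and the sample are all hypothetical values chosen for illustration.

```python
import math

def gaussian_pdf(x, mean, stdev):
    """Normal density, the same quantity calculateProbability computes."""
    exponent = math.exp(-((x - mean) ** 2) / (2 * stdev ** 2))
    return exponent / (math.sqrt(2 * math.pi) * stdev)

def log_posterior(sample, summaries, prior):
    """log P(class) + sum of log densities, up to a constant shared by all classes."""
    score = math.log(prior)
    for x, (mean, stdev) in zip(sample, summaries):
        score += math.log(gaussian_pdf(x, mean, stdev))
    return score

# Hypothetical per-class summaries: one (mean, stdev) pair per attribute
summaries = {"A": [(1.0, 0.5), (2.0, 0.5)],
             "B": [(3.0, 0.5), (1.0, 0.5)]}
priors = {"A": 0.5, "B": 0.5}
sample = [1.2, 1.8]

scores = {label: log_posterior(sample, s, priors[label]) for label, s in summaries.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

In the listing, the priors would correspond to the fraction of training rows in each class returned by `separateByClass`.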

6. Assuming a set of documents that need to be classified, use the naïve Bayesian Classifier
model to perform this task. Built-in Java classes/API can be used to write the program.
Calculate the accuracy, precision, and recall for your data set.

(Note: Install pandas and sklearn)

import pandas as pd

msg = pd.read_csv('naivetext1.txt', names=['message', 'label'])
print('The dimensions of the dataset', msg.shape)
msg['labelnum'] = msg.label.map({'pos': 1, 'neg': 0})   # encode the labels as 1/0
X = msg.message
y = msg.labelnum
print(X)
print(y)

# splitting the dataset into train and test data
from sklearn.model_selection import train_test_split
xtrain, xtest, ytrain, ytest = train_test_split(X, y)
print(xtest.shape)
print(xtrain.shape)
print(ytest.shape)
print(ytrain.shape)

# output of count vectoriser is a sparse matrix
from sklearn.feature_extraction.text import CountVectorizer
count_vect = CountVectorizer()
xtrain_dtm = count_vect.fit_transform(xtrain)
xtest_dtm = count_vect.transform(xtest)
# get_feature_names() was removed in recent scikit-learn releases;
# get_feature_names_out() is its replacement.
feature_names = count_vect.get_feature_names_out().tolist()
print(feature_names)

df = pd.DataFrame(xtrain_dtm.toarray(), columns=feature_names)
print(df)          # tabular representation
print(xtrain_dtm)  # sparse matrix representation

# Training Naive Bayes (NB) classifier on training data.
from sklearn.naive_bayes import MultinomialNB
clf = MultinomialNB().fit(xtrain_dtm, ytrain)
predicted = clf.predict(xtest_dtm)

# printing accuracy metrics
from sklearn import metrics
print('Accuracy metrics')
print('Accuracy of the classifer is', metrics.accuracy_score(ytest, predicted))
print('Confusion matrix')
print(metrics.confusion_matrix(ytest, predicted))
print('Recall and Precison ')
print(metrics.recall_score(ytest, predicted))
print(metrics.precision_score(ytest, predicted))

'''docs_new = ['I like this place', 'My boss is not my saviour']

X_new_counts = count_vect.transform(docs_new)
predictednew = clf.predict(X_new_counts)
for doc, category in zip(docs_new, predictednew):
    print('%s->%s' % (doc, 'pos' if category == 1 else 'neg'))'''

Data Set:

Note: In Notepad save the input with filename: naivetext1.txt

I love this sandwich,pos

This is an amazing place,pos
I feel very good about these beers,pos
This is my best work,pos
What an awesome view,pos
I do not like this restaurant,neg
I am tired of this stuff,neg
I can't deal with this,neg
He is my sworn enemy,neg
My boss is horrible,neg
This is an awesome place,pos
I do not like the taste of this juice,neg
I love to dance,pos
I am sick and tired of this place,neg
What a great holiday,pos
That is a bad locality to stay,neg
We will have good fun tomorrow,pos
I went to my enemy's house today,neg

OUTPUT

C:\Users\admin\PycharmProjects\ss\venv\Scripts\python.exe
C:/Users/admin/pythonml/6.naivtext.py
The dimensions of the dataset (18, 2)
0 I love this sandwich
1 This is an amazing place
2 I feel very good about these beers
3 This is my best work
4 What an awesome view
5 I do not like this restaurant
6 I am tired of this stuff
7 I can't deal with this
8 He is my sworn enemy
9 My boss is horrible
10 This is an awesome place
11 I do not like the taste of this juice
12 I love to dance
13 I am sick and tired of this place
14 What a great holiday
15 That is a bad locality to stay
16 We will have good fun tomorrow
17 I went to my enemy's house today
Name: message, dtype: object
0 1
1 1
2 1
3 1
4 1
5 0
6 0
7 0
8 0
9 0
10 1
11 0
12 1
13 0
14 1
15 0
16 1
17 0
Name: labelnum, dtype: int64
(5,)
(13,)
(5,)
(13,)
['about', 'am', 'amazing', 'an', 'and', 'awesome', 'bad', 'beers', 'best', 'boss', 'can', 'deal',
'do', 'enemy', 'feel', 'good', 'great', 'he', 'holiday', 'horrible', 'house', 'is', 'juice', 'like',
'locality', 'my', 'not', 'of', 'place', 'restaurant', 'sick', 'stay', 'sworn', 'taste', 'that', 'the',
'these', 'this', 'tired', 'to', 'today', 'very', 'went', 'what', 'with', 'work']
about am amazing an and ... very went what with work
0 0 0 0 1 0 ... 0 0 0 0 0
1 0 0 0 0 0 ... 0 0 0 0 0
2 0 0 0 0 0 ... 0 0 0 0 0
3 0 0 0 0 0 ... 0 0 0 0 0
4 0 0 0 0 0 ... 0 0 0 0 0
5 0 0 0 0 0 ... 0 0 0 0 0
6 1 0 0 0 0 ... 1 0 0 0 0
7 0 0 0 0 0 ... 0 0 0 1 0
8 0 1 0 0 1 ... 0 0 0 0 0
9 0 0 1 1 0 ... 0 0 0 0 0
10 0 0 0 0 0 ... 0 0 0 0 1
11 0 0 0 0 0 ... 0 0 1 0 0
12 0 0 0 0 0 ... 0 1 0 0 0

[13 rows x 46 columns]

(0, 28) 1
(0, 5) 1
(0, 3) 1
(0, 21) 1
(0, 37) 1
(1, 22) 1
(1, 27) 1
(1, 33) 1
(1, 35) 1
(1, 23) 1
(1, 26) 1
(1, 12) 1
(1, 37) 1
(2, 19) 1
(2, 9) 1
(2, 25) 1
(2, 21) 1
(3, 29) 1
(3, 23) 1
(3, 26) 1
(3, 12) 1
(3, 37) 1
(4, 13) 1
(4, 32) 1
(4, 17) 1
: :
(8, 4) 1
(8, 30) 1
(8, 1) 1
(8, 27) 1
(8, 28) 1
(8, 37) 1
(9, 2) 1
(9, 28) 1
(9, 3) 1
(9, 21) 1
(9, 37) 1
(10, 45) 1
(10, 8) 1
(10, 25) 1
(10, 21) 1
(10, 37) 1
(11, 18) 1
(11, 16) 1
(11, 43) 1
(12, 40) 1
(12, 20) 1
(12, 42) 1
(12, 39) 1
(12, 13) 1
(12, 25) 1
Accuracy metrics
Accuracy of the classifer is 0.6
Confusion matrix
[[1 0]
[2 2]]
Recall and Precison
0.5
1.0

Process finished with exit code 0
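
The three figures in this output follow directly from the confusion matrix. In scikit-learn's layout the rows are the actual classes and the columns the predicted classes, both in the order 0 (neg), 1 (pos). The short sketch below recomputes the metrics by hand from the matrix printed above; the variable names are assumptions introduced for illustration.

```python
# Confusion matrix from the recorded output: rows = actual, columns = predicted
confusion = [[1, 0],
             [2, 2]]

(tn, fp), (fn, tp) = confusion

accuracy = (tp + tn) / (tp + tn + fp + fn)   # share of all test documents classified correctly
precision = tp / (tp + fp)                   # share of predicted positives that are truly positive
recall = tp / (tp + fn)                      # share of actual positives that were found

print(accuracy, precision, recall)
```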

7. Write a program to construct a Bayesian network considering medical data. Use this
model to demonstrate the diagnosis of heart patients using standard Heart Disease Data
Set. You can use Java/Python ML library classes/API.
(Note: Install bayespy, numpy, colorama)

import bayespy as bp
import numpy as np
import csv
from colorama import init
from colorama import Fore, Back, Style
init()

# Define Parameter Enum values

#Age
ageEnum = {'SuperSeniorCitizen':0, 'SeniorCitizen':1, 'MiddleAged':2, 'Youth':3, 'Teen':4}
# Gender
genderEnum = {'Male':0, 'Female':1}
# FamilyHistory
familyHistoryEnum = {'Yes':0, 'No':1}
# Diet(Calorie Intake)
dietEnum = {'High':0, 'Medium':1, 'Low':2}
# LifeStyle
lifeStyleEnum = {'Athlete':0, 'Active':1, 'Moderate':2, 'Sedetary':3}
# Cholesterol
cholesterolEnum = {'High':0, 'BorderLine':1, 'Normal':2}
# HeartDisease
heartDiseaseEnum = {'Yes':0, 'No':1}
#heart_disease_data.csv
with open('BBN heart_disease_data.csv') as csvfile:
    lines = csv.reader(csvfile)
    dataset = list(lines)
    data = []
    for x in dataset:
        # Encode every categorical field with its enum value
        data.append([ageEnum[x[0]], genderEnum[x[1]], familyHistoryEnum[x[2]], dietEnum[x[3]],
                     lifeStyleEnum[x[4]], cholesterolEnum[x[5]], heartDiseaseEnum[x[6]]])
# Encoded training data as a NumPy array
data = np.array(data)
N = len(data)

# Input data column assignment

p_age = bp.nodes.Dirichlet(1.0*np.ones(5))
age = bp.nodes.Categorical(p_age, plates=(N,))
age.observe(data[:,0])

p_gender = bp.nodes.Dirichlet(1.0*np.ones(2))
gender = bp.nodes.Categorical(p_gender, plates=(N,))
gender.observe(data[:,1])

p_familyhistory = bp.nodes.Dirichlet(1.0*np.ones(2))
familyhistory = bp.nodes.Categorical(p_familyhistory, plates=(N,))
familyhistory.observe(data[:,2])

p_diet = bp.nodes.Dirichlet(1.0*np.ones(3))
diet = bp.nodes.Categorical(p_diet, plates=(N,))
diet.observe(data[:,3])

p_lifestyle = bp.nodes.Dirichlet(1.0*np.ones(4))
lifestyle = bp.nodes.Categorical(p_lifestyle, plates=(N,))
lifestyle.observe(data[:,4])
p_cholesterol = bp.nodes.Dirichlet(1.0*np.ones(3))
cholesterol = bp.nodes.Categorical(p_cholesterol, plates=(N,))
cholesterol.observe(data[:,5])

# Prepare nodes and establish edges

# np.ones(2) ->HeartDisease has 2 options Yes/No
# plates(5, 2, 2, 3, 4, 3) -> corresponds to options present for domain values
p_heartdisease = bp.nodes.Dirichlet(np.ones(2), plates=(5, 2, 2, 3, 4, 3))
heartdisease = bp.nodes.MultiMixture([age, gender, familyhistory, diet, lifestyle, cholesterol],
bp.nodes.Categorical, p_heartdisease)
heartdisease.observe(data[:,6])
p_heartdisease.update()

# Sample Test with hardcoded values

#print("Sample Probability")
#print("Probability(HeartDisease|Age=SuperSeniorCitizen, Gender=Female, FamilyHistory=Yes, DietIntake=Medium, LifeStyle=Sedetary, Cholesterol=High)")
#print(bp.nodes.MultiMixture([ageEnum['SuperSeniorCitizen'], genderEnum['Female'],
#      familyHistoryEnum['Yes'], dietEnum['Medium'], lifeStyleEnum['Sedetary'],
#      cholesterolEnum['High']], bp.nodes.Categorical,
#      p_heartdisease).get_moments()[0][heartDiseaseEnum['Yes']])

# Interactive Test
m = 0
while m == 0:
    print("\n")
    res = bp.nodes.MultiMixture([int(input('Enter Age: ' + str(ageEnum))),
                                 int(input('Enter Gender: ' + str(genderEnum))),
                                 int(input('Enter FamilyHistory: ' + str(familyHistoryEnum))),
                                 int(input('Enter dietEnum: ' + str(dietEnum))),
                                 int(input('Enter LifeStyle: ' + str(lifeStyleEnum))),
                                 int(input('Enter Cholesterol: ' + str(cholesterolEnum)))],
                                bp.nodes.Categorical, p_heartdisease).get_moments()[0][heartDiseaseEnum['Yes']]
    print("Probability(HeartDisease) = " + str(res))
    #print(Style.RESET_ALL)
    m = int(input("Enter for Continue:0, Exit :1 "))

Data Set:

(filename: 'BBN heart_disease_data.csv')

SuperSeniorCitizen,Male,Yes,Medium,Sedetary,High,Yes
SuperSeniorCitizen,Female,Yes,Medium,Sedetary,High,Yes
SeniorCitizen,Male,No,High,Moderate,BorderLine,Yes
Teen,Male,Yes,Medium,Sedetary,Normal,No
Youth,Female,Yes,High,Athlete,Normal,No
MiddleAged,Male,Yes,Medium,Active,High,Yes
Teen,Male,Yes,High,Moderate,High,Yes
SuperSeniorCitizen,Male,Yes,Medium,Sedetary,High,Yes
Youth,Female,Yes,High,Athlete,Normal,No
SeniorCitizen,Female,No,High,Athlete,Normal,Yes
Teen,Female,No,Medium,Moderate,High,Yes
Teen,Male,Yes,Medium,Sedetary,Normal,No
MiddleAged,Female,No,High,Athlete,High,No
MiddleAged,Male,Yes,Medium,Active,High,Yes
Youth,Female,Yes,High,Athlete,BorderLine,No
SuperSeniorCitizen,Male,Yes,High,Athlete,Normal,Yes
SeniorCitizen,Female,No,Medium,Moderate,BorderLine,Yes
Youth,Female,Yes,Medium,Athlete,BorderLine,No
Teen,Male,Yes,Medium,Sedetary,Normal,No

OUTPUT

Enter Age: {'SuperSeniorCitizen': 0, 'SeniorCitizen': 1, 'MiddleAged': 2, 'Youth': 3, 'Teen': 4}1
Enter Gender: {'Male': 0, 'Female': 1}1
Enter FamilyHistory: {'Yes': 0, 'No': 1}1
Enter dietEnum: {'High': 0, 'Medium': 1, 'Low': 2}2
Enter LifeStyle: {'Athlete': 0, 'Active': 1, 'Moderate': 2, 'Sedetary': 3}2
Enter Cholesterol: {'High': 0, 'BorderLine': 1, 'Normal': 2}1
Probability(HeartDisease) = 0.5
Enter for Continue:0, Exit :1 0

Enter Age: {'SuperSeniorCitizen': 0, 'SeniorCitizen': 1, 'MiddleAged': 2, 'Youth': 3, 'Teen': 4}0
Enter Gender: {'Male': 0, 'Female': 1}0
Enter FamilyHistory: {'Yes': 0, 'No': 1}0
Enter dietEnum: {'High': 0, 'Medium': 1, 'Low': 2}0
Enter LifeStyle: {'Athlete': 0, 'Active': 1, 'Moderate': 2, 'Sedetary': 3}3
Enter Cholesterol: {'High': 0, 'BorderLine': 1, 'Normal': 2}0
Probability(HeartDisease) = 0.5
Enter for Continue:0, Exit :1
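
Both queries in this session return 0.5 because neither attribute combination occurs in the data set, so the uniform Dirichlet prior over Yes/No is left unchanged for them. The counting sketch below illustrates that reasoning with a simple posterior mean under a Dirichlet(1, 1) prior. It is an assumption-laden simplification, not bayespy's computation: bayespy uses variational message passing, and its values for combinations that do occur in the data may differ from the ones computed here. The function name `posterior_mean_yes` is hypothetical.

```python
# Copy of 'BBN heart_disease_data.csv' as listed above
rows = [line.split(",") for line in """
SuperSeniorCitizen,Male,Yes,Medium,Sedetary,High,Yes
SuperSeniorCitizen,Female,Yes,Medium,Sedetary,High,Yes
SeniorCitizen,Male,No,High,Moderate,BorderLine,Yes
Teen,Male,Yes,Medium,Sedetary,Normal,No
Youth,Female,Yes,High,Athlete,Normal,No
MiddleAged,Male,Yes,Medium,Active,High,Yes
Teen,Male,Yes,High,Moderate,High,Yes
SuperSeniorCitizen,Male,Yes,Medium,Sedetary,High,Yes
Youth,Female,Yes,High,Athlete,Normal,No
SeniorCitizen,Female,No,High,Athlete,Normal,Yes
Teen,Female,No,Medium,Moderate,High,Yes
Teen,Male,Yes,Medium,Sedetary,Normal,No
MiddleAged,Female,No,High,Athlete,High,No
MiddleAged,Male,Yes,Medium,Active,High,Yes
Youth,Female,Yes,High,Athlete,BorderLine,No
SuperSeniorCitizen,Male,Yes,High,Athlete,Normal,Yes
SeniorCitizen,Female,No,Medium,Moderate,BorderLine,Yes
Youth,Female,Yes,Medium,Athlete,BorderLine,No
Teen,Male,Yes,Medium,Sedetary,Normal,No
""".strip().splitlines()]

def posterior_mean_yes(rows, query, alpha=1.0):
    """Posterior mean of P(HeartDisease=Yes) for one attribute combination."""
    outcomes = [row[-1] for row in rows if tuple(row[:-1]) == tuple(query)]
    return (alpha + outcomes.count("Yes")) / (2 * alpha + len(outcomes))

# Second query of the recorded session: no matching row, so the prior mean remains
unseen = ("SuperSeniorCitizen", "Male", "Yes", "High", "Sedetary", "High")
# A combination that occurs twice in the data, both times with HeartDisease=Yes
seen = ("SuperSeniorCitizen", "Male", "Yes", "Medium", "Sedetary", "High")

print(posterior_mean_yes(rows, unseen))
print(posterior_mean_yes(rows, seen))
```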

8. Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set
for clustering using k-Means algorithm. Compare the results of these two algorithms and
comment on the quality of clustering. You can add Java/Python ML library classes/API in
the program.
(Note: Install numpy, pandas, matplotlib, sklearn)

from sklearn.cluster import KMeans
#from sklearn import metrics
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

data = pd.read_csv("kmeansdata.csv")
df1 = pd.DataFrame(data)
print(df1)
f1 = df1['Distance_Feature'].values
f2 = df1['Speeding_Feature'].values
X = np.column_stack((f1, f2))   # one row per driver: (distance, speeding)

# Plot the raw data
plt.xlim([0, 100])
plt.ylim([0, 50])
plt.title('Dataset')
plt.ylabel('Speeding_Feature')
plt.xlabel('Distance_Feature')
plt.scatter(f1, f2)
plt.show()

# One colour and marker per cluster
colors = ['b', 'g', 'r']
markers = ['o', 'v', 's']

# KMeans algorithm
# K = 3
kmeans_model = KMeans(n_clusters=3).fit(X)

# Plot every point with the colour and marker of its cluster
for i, l in enumerate(kmeans_model.labels_):
    plt.plot(f1[i], f2[i], color=colors[l], marker=markers[l], ls='None')
plt.xlim([0, 100])
plt.ylim([0, 50])
plt.show()

Data Set: ( filename: kmeansdata.csv)

Driver_ID,Distance_Feature,Speeding_Feature
3423311935,71.24,28
3423313212,52.53,25
3423313724,64.54,27
3423311373,55.69,22
3423310999,54.58,25
3423313857,41.91,10
3423312432,58.64,20
3423311434,52.02,8
3423311328,31.25,34
3423312488,44.31,19
3423311254,49.35,40
3423312943,58.07,45
3423312536,44.22,22
3423311542,55.73,19
3423312176,46.63,43
3423314176,52.97,32
3423314202,46.25,35
3423311346,51.55,27
3423310666,57.05,26
3423313527,58.45,30
3423312182,43.42,23
3423313590,55.68,37
3423312268,55.15,18

OUTPUT

C:\Users\admin\PycharmProjects\ss\venv\Scripts\python.exe C:/Users/admin/pythonml/8.kmeans.py.py
Driver_ID Distance_Feature Speeding_Feature
0 3423311935 71.24 28
1 3423313212 52.53 25
2 3423313724 64.54 27
3 3423311373 55.69 22
4 3423310999 54.58 25
5 3423313857 41.91 10
6 3423312432 58.64 20
7 3423311434 52.02 8
8 3423311328 31.25 34
9 3423312488 44.31 19
10 3423311254 49.35 40
11 3423312943 58.07 45
12 3423312536 44.22 22
13 3423311542 55.73 19
14 3423312176 46.63 43
15 3423314176 52.97 32
16 3423314202 46.25 35
17 3423311346 51.55 27
18 3423310666 57.05 26
19 3423313527 58.45 30
20 3423312182 43.42 23
21 3423313590 55.68 37
22 3423312268 55.15 18
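
The listing for this program runs k-Means only, while the task also asks for EM clustering and a comparison. One possible way to sketch the EM part is scikit-learn's `GaussianMixture`, which fits a Gaussian mixture model with the EM algorithm. The code below is an illustration under stated assumptions, not the lab's prescribed solution: the choice of three components, the fixed `random_state`, `n_init=10`, and the use of the adjusted Rand index to compare the two labelings are all choices made here. The data is the two feature columns of kmeansdata.csv copied inline.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score
from sklearn.mixture import GaussianMixture

# (Distance_Feature, Speeding_Feature) pairs copied from kmeansdata.csv
X = np.array([
    [71.24, 28], [52.53, 25], [64.54, 27], [55.69, 22], [54.58, 25],
    [41.91, 10], [58.64, 20], [52.02, 8], [31.25, 34], [44.31, 19],
    [49.35, 40], [58.07, 45], [44.22, 22], [55.73, 19], [46.63, 43],
    [52.97, 32], [46.25, 35], [51.55, 27], [57.05, 26], [58.45, 30],
    [43.42, 23], [55.68, 37], [55.15, 18],
])

# k-Means: hard assignment of each point to the nearest centroid
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# EM: fit a 3-component Gaussian mixture, then take the most probable component
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
em_labels = gmm.predict(X)

# Agreement between the two clusterings (1.0 means identical up to relabeling)
agreement = adjusted_rand_score(kmeans_labels, em_labels)

print("k-Means labels:", kmeans_labels)
print("EM (GMM) labels:", em_labels)
print("Adjusted Rand index:", agreement)
```

Cluster numbers are arbitrary in both methods, so the labelings should be compared with a permutation-invariant score such as the one used here rather than element by element.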
9. Write a program to implement k-Nearest Neighbour algorithm to classify the iris data
set. Print both correct and wrong predictions. Java/Python ML library classes can be used
for this problem.
KNN ALGORITHM

from sklearn.model_selection import train_test_split

from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report, confusion_matrix

from sklearn import datasets

iris=datasets.load_iris()
iris_data=iris.data
iris_labels=iris.target
print(iris_data)
print(iris_labels)
x_train, x_test, y_train, y_test=train_test_split(iris_data,iris_labels,test_size=0.30)

classifier=KNeighborsClassifier(n_neighbors=5)
classifier.fit(x_train,y_train)
y_pred=classifier.predict(x_test)
print('confusion matrix is as follows')
print(confusion_matrix(y_test,y_pred))
print('Accuracy metrics')
print(classification_report(y_test,y_pred))
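
The listing reports a confusion matrix and a classification report, whereas the task asks for the individual correct and wrong predictions. The sketch below shows one way to print them, using a from-scratch nearest-neighbour vote instead of scikit-learn's `KNeighborsClassifier` so that it runs with the standard library alone. The hand-picked `train` and `test` rows (taken from the iris data set listed below), the value k=3 and the helper name `knn_predict` are assumptions made for illustration.

```python
import math
from collections import Counter

# A few rows from the iris data set: four measurements followed by the class
train = [
    (5.1, 3.5, 1.4, 0.2, "Iris-setosa"),
    (4.9, 3.0, 1.4, 0.2, "Iris-setosa"),
    (4.7, 3.2, 1.3, 0.2, "Iris-setosa"),
    (7.0, 3.2, 4.7, 1.4, "Iris-versicolor"),
    (6.4, 3.2, 4.5, 1.5, "Iris-versicolor"),
    (6.9, 3.1, 4.9, 1.5, "Iris-versicolor"),
    (6.3, 3.3, 6.0, 2.5, "Iris-virginica"),
    (5.8, 2.7, 5.1, 1.9, "Iris-virginica"),
    (7.1, 3.0, 5.9, 2.1, "Iris-virginica"),
]
test = [
    (5.0, 3.6, 1.4, 0.2, "Iris-setosa"),
    (6.5, 2.8, 4.6, 1.5, "Iris-versicolor"),
    (6.5, 3.0, 5.8, 2.2, "Iris-virginica"),
    (4.9, 2.5, 4.5, 1.7, "Iris-virginica"),
]

def knn_predict(sample, train, k=3):
    """Majority class among the k training rows closest to sample (Euclidean distance)."""
    neighbours = sorted(train, key=lambda row: math.dist(sample, row[:4]))[:k]
    return Counter(row[4] for row in neighbours).most_common(1)[0][0]

predictions = []
for row in test:
    predicted = knn_predict(row[:4], train)
    predictions.append(predicted)
    status = "correct" if predicted == row[4] else "wrong"
    print(row[:4], "actual:", row[4], "predicted:", predicted, "->", status)
```

With the scikit-learn listing above, the same report could be produced by looping over `zip(y_test, y_pred)` and comparing the two values for each test sample.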

Data Set:

5.1,3.5,1.4,0.2,Iris-setosa,
4.9,3,1.4,0.2,Iris-setosa,
4.7,3.2,1.3,0.2,Iris-setosa,
4.6,3.1,1.5,0.2,Iris-setosa,
5,3.6,1.4,0.2,Iris-setosa,
5.4,3.9,1.7,0.4,Iris-setosa,
4.6,3.4,1.4,0.3,Iris-setosa,
5,3.4,1.5,0.2,Iris-setosa,
4.4,2.9,1.4,0.2,Iris-setosa,
4.9,3.1,1.5,0.1,Iris-setosa,
5.4,3.7,1.5,0.2,Iris-setosa,
4.8,3.4,1.6,0.2,Iris-setosa,
4.8,3,1.4,0.1,Iris-setosa,
4.3,3,1.1,0.1,Iris-setosa,
5.8,4,1.2,0.2,Iris-setosa,
5.7,4.4,1.5,0.4,Iris-setosa,
5.4,3.9,1.3,0.4,Iris-setosa,
5.1,3.5,1.4,0.3,Iris-setosa,
5.7,3.8,1.7,0.3,Iris-setosa,
5.1,3.8,1.5,0.3,Iris-setosa,
5.4,3.4,1.7,0.2,Iris-setosa,
5.1,3.7,1.5,0.4,Iris-setosa,
4.6,3.6,1,0.2,Iris-setosa,
5.1,3.3,1.7,0.5,Iris-setosa,
4.8,3.4,1.9,0.2,Iris-setosa,
5,3,1.6,0.2,Iris-setosa,
5,3.4,1.6,0.4,Iris-setosa,
5.2,3.5,1.5,0.2,Iris-setosa,
5.2,3.4,1.4,0.2,Iris-setosa,
4.7,3.2,1.6,0.2,Iris-setosa,
4.8,3.1,1.6,0.2,Iris-setosa,
5.4,3.4,1.5,0.4,Iris-setosa,
5.2,4.1,1.5,0.1,Iris-setosa,
5.5,4.2,1.4,0.2,Iris-setosa,
4.9,3.1,1.5,0.1,Iris-setosa,
5,3.2,1.2,0.2,Iris-setosa,
5.5,3.5,1.3,0.2,Iris-setosa,
4.9,3.1,1.5,0.1,Iris-setosa,
4.4,3,1.3,0.2,Iris-setosa,
5.1,3.4,1.5,0.2,Iris-setosa,
5,3.5,1.3,0.3,Iris-setosa,
4.5,2.3,1.3,0.3,Iris-setosa,
4.4,3.2,1.3,0.2,Iris-setosa,
5,3.5,1.6,0.6,Iris-setosa,
5.1,3.8,1.9,0.4,Iris-setosa,
4.8,3,1.4,0.3,Iris-setosa,
5.1,3.8,1.6,0.2,Iris-setosa,
4.6,3.2,1.4,0.2,Iris-setosa,
5.3,3.7,1.5,0.2,Iris-setosa,
5,3.3,1.4,0.2,Iris-setosa,
7,3.2,4.7,1.4,Iris-versicolor,
6.4,3.2,4.5,1.5,Iris-versicolor,
6.9,3.1,4.9,1.5,Iris-versicolor,
5.5,2.3,4,1.3,Iris-versicolor,
6.5,2.8,4.6,1.5,Iris-versicolor,
5.7,2.8,4.5,1.3,Iris-versicolor,
6.3,3.3,4.7,1.6,Iris-versicolor,
4.9,2.4,3.3,1,Iris-versicolor,
6.6,2.9,4.6,1.3,Iris-versicolor,
5.2,2.7,3.9,1.4,Iris-versicolor,
5,2,3.5,1,Iris-versicolor,
5.9,3,4.2,1.5,Iris-versicolor,
6,2.2,4,1,Iris-versicolor,
6.1,2.9,4.7,1.4,Iris-versicolor,
5.6,2.9,3.6,1.3,Iris-versicolor,
6.7,3.1,4.4,1.4,Iris-versicolor,
5.6,3,4.5,1.5,Iris-versicolor,
5.8,2.7,4.1,1,Iris-versicolor,
6.2,2.2,4.5,1.5,Iris-versicolor,
5.6,2.5,3.9,1.1,Iris-versicolor,
5.9,3.2,4.8,1.8,Iris-versicolor,
6.1,2.8,4,1.3,Iris-versicolor,
6.3,2.5,4.9,1.5,Iris-versicolor,
6.1,2.8,4.7,1.2,Iris-versicolor,
6.4,2.9,4.3,1.3,Iris-versicolor,
6.6,3,4.4,1.4,Iris-versicolor,
6.8,2.8,4.8,1.4,Iris-versicolor,
6.7,3,5,1.7,Iris-versicolor,
6,2.9,4.5,1.5,Iris-versicolor,
5.7,2.6,3.5,1,Iris-versicolor,
5.5,2.4,3.8,1.1,Iris-versicolor,
5.5,2.4,3.7,1,Iris-versicolor,
5.8,2.7,3.9,1.2,Iris-versicolor,
6,2.7,5.1,1.6,Iris-versicolor,
5.4,3,4.5,1.5,Iris-versicolor,
6,3.4,4.5,1.6,Iris-versicolor,
6.7,3.1,4.7,1.5,Iris-versicolor,
6.3,2.3,4.4,1.3,Iris-versicolor,
5.6,3,4.1,1.3,Iris-versicolor,
5.5,2.5,4,1.3,Iris-versicolor,
5.5,2.6,4.4,1.2,Iris-versicolor,
6.1,3,4.6,1.4,Iris-versicolor,
5.8,2.6,4,1.2,Iris-versicolor,
5,2.3,3.3,1,Iris-versicolor,
5.6,2.7,4.2,1.3,Iris-versicolor,
5.7,3,4.2,1.2,Iris-versicolor,
5.7,2.9,4.2,1.3,Iris-versicolor,
6.2,2.9,4.3,1.3,Iris-versicolor,
5.1,2.5,3,1.1,Iris-versicolor,
5.7,2.8,4.1,1.3,Iris-versicolor,
6.3,3.3,6,2.5,Iris-virginica,
5.8,2.7,5.1,1.9,Iris-virginica,
7.1,3,5.9,2.1,Iris-virginica,
6.3,2.9,5.6,1.8,Iris-virginica,
6.5,3,5.8,2.2,Iris-virginica,
7.6,3,6.6,2.1,Iris-virginica,
4.9,2.5,4.5,1.7,Iris-virginica,
7.3,2.9,6.3,1.8,Iris-virginica,
6.7,2.5,5.8,1.8,Iris-virginica,
7.2,3.6,6.1,2.5,Iris-virginica,
6.5,3.2,5.1,2,Iris-virginica,
6.4,2.7,5.3,1.9,Iris-virginica,
6.8,3,5.5,2.1,Iris-virginica,
5.7,2.5,5,2,Iris-virginica,
5.8,2.8,5.1,2.4,Iris-virginica,
6.4,3.2,5.3,2.3,Iris-virginica,
6.5,3,5.5,1.8,Iris-virginica,
7.7,3.8,6.7,2.2,Iris-virginica,
7.7,2.6,6.9,2.3,Iris-virginica,
6,2.2,5,1.5,Iris-virginica,
6.9,3.2,5.7,2.3,Iris-virginica,
5.6,2.8,4.9,2,Iris-virginica,
7.7,2.8,6.7,2,Iris-virginica,
6.3,2.7,4.9,1.8,Iris-virginica,
6.7,3.3,5.7,2.1,Iris-virginica,
7.2,3.2,6,1.8,Iris-virginica,
6.2,2.8,4.8,1.8,Iris-virginica,
6.1,3,4.9,1.8,Iris-virginica,
6.4,2.8,5.6,2.1,Iris-virginica,
7.2,3,5.8,1.6,Iris-virginica,
7.4,2.8,6.1,1.9,Iris-virginica,
7.9,3.8,6.4,2,Iris-virginica,
6.4,2.8,5.6,2.2,Iris-virginica,
6.3,2.8,5.1,1.5,Iris-virginica,
6.1,2.6,5.6,1.4,Iris-virginica,
7.7,3,6.1,2.3,Iris-virginica,
6.3,3.4,5.6,2.4,Iris-virginica,
6.4,3.1,5.5,1.8,Iris-virginica,
6,3,4.8,1.8,Iris-virginica,
6.9,3.1,5.4,2.1,Iris-virginica,
6.7,3.1,5.6,2.4,Iris-virginica,
6.9,3.1,5.1,2.3,Iris-virginica,
5.8,2.7,5.1,1.9,Iris-virginica,
6.8,3.2,5.9,2.3,Iris-virginica,
6.7,3.3,5.7,2.5,Iris-virginica,
6.7,3,5.2,2.3,Iris-virginica,
6.3,2.5,5,1.9,Iris-virginica,
6.5,3,5.2,2,Iris-virginica,
6.2,3.4,5.4,2.3,Iris-virginica,
5.9,3,5.1,1.8,Iris-virginica,
OUTPUT

C:\Users\admin\PycharmProjects\ss\venv\Scripts\python.exe
C:/Users/admin/pythonml/9.KNN.py
[[5.1 3.5 1.4 0.2]
[4.9 3. 1.4 0.2]
[4.7 3.2 1.3 0.2]
[4.6 3.1 1.5 0.2]
[5. 3.6 1.4 0.2]
[5.4 3.9 1.7 0.4]
[4.6 3.4 1.4 0.3]
[5. 3.4 1.5 0.2]
[4.4 2.9 1.4 0.2]
[4.9 3.1 1.5 0.1]
[5.4 3.7 1.5 0.2]
[4.8 3.4 1.6 0.2]
[4.8 3. 1.4 0.1]
[4.3 3. 1.1 0.1]
[5.8 4. 1.2 0.2]
[5.7 4.4 1.5 0.4]
[5.4 3.9 1.3 0.4]
[5.1 3.5 1.4 0.3]
[5.7 3.8 1.7 0.3]
[5.1 3.8 1.5 0.3]
[5.4 3.4 1.7 0.2]
[5.1 3.7 1.5 0.4]
[4.6 3.6 1. 0.2]
[5.1 3.3 1.7 0.5]
[4.8 3.4 1.9 0.2]
[5. 3. 1.6 0.2]
[5. 3.4 1.6 0.4]
[5.2 3.5 1.5 0.2]
[5.2 3.4 1.4 0.2]
[4.7 3.2 1.6 0.2]
[4.8 3.1 1.6 0.2]
[5.4 3.4 1.5 0.4]
[5.2 4.1 1.5 0.1]
[5.5 4.2 1.4 0.2]
[4.9 3.1 1.5 0.1]
[5. 3.2 1.2 0.2]
[5.5 3.5 1.3 0.2]
[4.9 3.1 1.5 0.1]
[4.4 3. 1.3 0.2]
[5.1 3.4 1.5 0.2]
[5. 3.5 1.3 0.3]
[4.5 2.3 1.3 0.3]
[4.4 3.2 1.3 0.2]
[5. 3.5 1.6 0.6]
[5.1 3.8 1.9 0.4]
[4.8 3. 1.4 0.3]
[5.1 3.8 1.6 0.2]
[4.6 3.2 1.4 0.2]
[5.3 3.7 1.5 0.2]
[5. 3.3 1.4 0.2]
[7. 3.2 4.7 1.4]
[6.4 3.2 4.5 1.5]
[6.9 3.1 4.9 1.5]
[5.5 2.3 4. 1.3]
[6.5 2.8 4.6 1.5]
[5.7 2.8 4.5 1.3]
[6.3 3.3 4.7 1.6]
[4.9 2.4 3.3 1. ]
[6.6 2.9 4.6 1.3]
[5.2 2.7 3.9 1.4]
[5. 2. 3.5 1. ]
[5.9 3. 4.2 1.5]
[6. 2.2 4. 1. ]
[6.1 2.9 4.7 1.4]
[5.6 2.9 3.6 1.3]
[6.7 3.1 4.4 1.4]
[5.6 3. 4.5 1.5]
[5.8 2.7 4.1 1. ]
[6.2 2.2 4.5 1.5]
[5.6 2.5 3.9 1.1]
[5.9 3.2 4.8 1.8]
[6.1 2.8 4. 1.3]
[6.3 2.5 4.9 1.5]
[6.1 2.8 4.7 1.2]
[6.4 2.9 4.3 1.3]
[6.6 3. 4.4 1.4]
[6.8 2.8 4.8 1.4]
[6.7 3. 5. 1.7]
[6. 2.9 4.5 1.5]
[5.7 2.6 3.5 1. ]
[5.5 2.4 3.8 1.1]
[5.5 2.4 3.7 1. ]
[5.8 2.7 3.9 1.2]
[6. 2.7 5.1 1.6]
[5.4 3. 4.5 1.5]
[6. 3.4 4.5 1.6]
[6.7 3.1 4.7 1.5]
[6.3 2.3 4.4 1.3]
[5.6 3. 4.1 1.3]
[5.5 2.5 4. 1.3]
[5.5 2.6 4.4 1.2]
[6.1 3. 4.6 1.4]
[5.8 2.6 4. 1.2]
[5. 2.3 3.3 1. ]
[5.6 2.7 4.2 1.3]
[5.7 3. 4.2 1.2]
[5.7 2.9 4.2 1.3]
[6.2 2.9 4.3 1.3]
[5.1 2.5 3. 1.1]
[5.7 2.8 4.1 1.3]
[6.3 3.3 6. 2.5]
[5.8 2.7 5.1 1.9]
[7.1 3. 5.9 2.1]
[6.3 2.9 5.6 1.8]
[6.5 3. 5.8 2.2]
[7.6 3. 6.6 2.1]
[4.9 2.5 4.5 1.7]
[7.3 2.9 6.3 1.8]
[6.7 2.5 5.8 1.8]
[7.2 3.6 6.1 2.5]
[6.5 3.2 5.1 2. ]
[6.4 2.7 5.3 1.9]
[6.8 3. 5.5 2.1]
[5.7 2.5 5. 2. ]
[5.8 2.8 5.1 2.4]
[6.4 3.2 5.3 2.3]
[6.5 3. 5.5 1.8]
[7.7 3.8 6.7 2.2]
[7.7 2.6 6.9 2.3]
[6. 2.2 5. 1.5]
[6.9 3.2 5.7 2.3]
[5.6 2.8 4.9 2. ]
[7.7 2.8 6.7 2. ]
[6.3 2.7 4.9 1.8]
[6.7 3.3 5.7 2.1]
[7.2 3.2 6. 1.8]
[6.2 2.8 4.8 1.8]
[6.1 3. 4.9 1.8]
[6.4 2.8 5.6 2.1]
[7.2 3. 5.8 1.6]
[7.4 2.8 6.1 1.9]
[7.9 3.8 6.4 2. ]
[6.4 2.8 5.6 2.2]
[6.3 2.8 5.1 1.5]
[6.1 2.6 5.6 1.4]
[7.7 3. 6.1 2.3]
[6.3 3.4 5.6 2.4]
[6.4 3.1 5.5 1.8]
[6. 3. 4.8 1.8]
[6.9 3.1 5.4 2.1]
[6.7 3.1 5.6 2.4]
[6.9 3.1 5.1 2.3]
[5.8 2.7 5.1 1.9]
[6.8 3.2 5.9 2.3]
[6.7 3.3 5.7 2.5]
[6.7 3. 5.2 2.3]
[6.3 2.5 5. 1.9]
[6.5 3. 5.2 2. ]
[6.2 3.4 5.4 2.3]
[5.9 3. 5.1 1.8]]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
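0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2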
2 2]
confusion matrix is as follows
[[21 0 0]
[ 0 11 2]
[ 0 0 11]]
Accuracy metrics
             precision    recall  f1-score   support

          0       1.00      1.00      1.00        21
          1       1.00      0.85      0.92        13
          2       0.85      1.00      0.92        11

avg / total       0.96      0.96      0.96        45

Process finished with exit code 0
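
The per-class figures in the report can be checked by hand against the confusion matrix. The sketch below is only an illustration and is not part of the lab program: it assumes the usual scikit-learn layout (rows are actual classes, columns are predicted classes), and the helper names per_class_metrics and accuracy are hypothetical.

```python
# Confusion matrix copied from the output above.
# Assumed layout: rows = actual class, columns = predicted class.
cm = [[21, 0, 0],
      [0, 11, 2],
      [0, 0, 11]]


def per_class_metrics(cm):
    """Return a (precision, recall, f1, support) tuple for each class."""
    n = len(cm)
    metrics = []
    for k in range(n):
        tp = cm[k][k]                                   # correctly predicted as k
        predicted_k = sum(cm[i][k] for i in range(n))   # column sum
        actual_k = sum(cm[k])                           # row sum (support)
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1, actual_k))
    return metrics


def accuracy(cm):
    """Fraction of samples on the main diagonal."""
    correct = sum(cm[k][k] for k in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


if __name__ == '__main__':
    for label, (p, r, f, s) in enumerate(per_class_metrics(cm)):
        print(label, round(p, 2), round(r, 2), round(f, 2), s)
    print("accuracy", round(accuracy(cm), 2))
```

For class 1, for example, recall is 11/13 because two of the 13 actual class-1 samples were predicted as class 2.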

10. Implement the non-parametric Locally Weighted Regression algorithm in order to fit
data points. Select an appropriate data set for your experiment and draw graphs.
LOCALLY WEIGHTED REGRESSION
(Note: Install numpy, scipy and matplotlib; pylab is provided by matplotlib)

from math import ceil

import numpy as np
from scipy import linalg

def lowess(x, y, f=2./3., iter=3):
    # f    : fraction of the data used for each local fit
    # iter : number of robustifying iterations
    n = len(x)
    r = int(ceil(f*n))
    # h[i] is the distance from x[i] to its r-th nearest neighbour (local bandwidth)
    h = [np.sort(np.abs(x - x[i]))[r] for i in range(n)]
    # tricube weights: (1 - |d|^3)^3 for scaled distance d clipped to [0, 1]
    w = np.clip(np.abs((x[:,None] - x[None,:]) / h), 0.0, 1.0)
    w = (1 - w**3)**3
    yest = np.zeros(n)
    delta = np.ones(n)
    for iteration in range(iter):
        for i in range(n):
            # weighted least-squares line fitted around x[i]
            weights = delta * w[:,i]
            b = np.array([np.sum(weights*y), np.sum(weights*y*x)])
            A = np.array([[np.sum(weights), np.sum(weights*x)],
                          [np.sum(weights*x), np.sum(weights*x*x)]])
            beta = linalg.solve(A, b)
            yest[i] = beta[0] + beta[1]*x[i]

        # bisquare robustness weights: points with large residuals count less
        residuals = y - yest
        s = np.median(np.abs(residuals))
        delta = np.clip(residuals / (6.0 * s), -1, 1)
        delta = (1 - delta**2)**2

    return yest

if __name__ == '__main__':
    import math
    n = 100
    x = np.linspace(0, 2 * math.pi, n)
    print("==========================values of x=====================")
    print(x)
    # noisy sine curve used as the data set
    y = np.sin(x) + 0.3*np.random.randn(n)
    print("================================Values of y===================")
    print(y)
    f = 0.25
    yest = lowess(x, y, f=f, iter=3)

    import pylab as pl
    pl.clf()
    pl.plot(x, y, label='y noisy')
    pl.plot(x, yest, label='y pred')
    pl.legend()
    pl.show()

OUTPUT
C:\Users\admin\PycharmProjects\ss\venv\Scripts\python.exe
C:/Users/admin/Desktop/RNSIT_ML_LAB_PROGRAMS_JULY2018/RNSIT_ML_LAB_PROGRAMS_JULY2018/10-regression/regression.py
==========================values of x=====================
[0. 0.06346652 0.12693304 0.19039955 0.25386607 0.31733259
0.38079911 0.44426563 0.50773215 0.57119866 0.63466518 0.6981317
0.76159822 0.82506474 0.88853126 0.95199777 1.01546429 1.07893081
1.14239733 1.20586385 1.26933037 1.33279688 1.3962634 1.45972992
1.52319644 1.58666296 1.65012947 1.71359599 1.77706251 1.84052903
1.90399555 1.96746207 2.03092858 2.0943951 2.15786162 2.22132814
2.28479466 2.34826118 2.41172769 2.47519421 2.53866073 2.60212725
2.66559377 2.72906028 2.7925268 2.85599332 2.91945984 2.98292636
3.04639288 3.10985939 3.17332591 3.23679243 3.30025895 3.36372547
3.42719199 3.4906585 3.55412502 3.61759154 3.68105806 3.74452458
3.8079911 3.87145761 3.93492413 3.99839065 4.06185717 4.12532369
4.1887902 4.25225672 4.31572324 4.37918976 4.44265628 4.5061228
4.56958931 4.63305583 4.69652235 4.75998887 4.82345539 4.88692191
4.95038842 5.01385494 5.07732146 5.14078798 5.2042545 5.26772102
5.33118753 5.39465405 5.45812057 5.52158709 5.58505361 5.64852012
5.71198664 5.77545316 5.83891968 5.9023862 5.96585272 6.02931923
6.09278575 6.15625227 6.21971879 6.28318531]
================================Values of y===================
[ 0.32536008 -0.0080573 0.11946369 0.41612046 0.45098579 0.43815367
0.10801193 0.68389606 0.86074625 0.04549917 0.68505644 0.60342634
1.17247156 0.88083937 0.71119685 0.95001511 0.54481781 0.7051224
1.25351458 0.8712536 0.92022204 0.7352142 0.88698095 0.91535147
0.83840992 0.7904273 1.75713902 0.9658919 0.39042121 0.66715723
0.82248617 1.16770788 1.62890879 0.55892447 1.66198264 0.02503305
0.79764264 0.55443527 1.21535481 1.09842121 0.94842294 0.73174791
0.684332 0.28964437 0.71744902 0.37907153 0.2530457 0.15897645
0.07088533 0.54206641 0.12110612 0.08384214 0.12731212 -0.53552899
0.11736083 -0.56747834 -0.21437779 -0.53090037 -0.02105477 -0.7363005
-0.43987103 -0.67372833 -0.38014677 -0.17410718 -0.67528673 -0.80375547
-0.62601973 -0.74283758 -0.75248483 -0.67113581 -1.20706585 -0.64311434
-1.59478696 -1.23125828 -0.8670961 -0.64860678 -0.9419199 -0.42584513
-0.78040914 -1.10565932 -0.990609 -0.89934155 -0.60020463 -0.38534216
-1.28563144 -0.71983964 -0.43870468 -1.03712938 -0.28325743 -0.63386377
-0.49045503 -0.45722592 -0.0669703 -0.47006542 -0.44179404 -0.66259661
-0.21934077 -0.51959973 -0.11584542 0.19354907]
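
The core step of locally weighted regression, a weighted straight-line fit around one query point, can be sketched on its own. The code below is a simplified illustration, not the lab program: it assumes a single fixed bandwidth instead of the per-point nearest-neighbour distance used above, it skips the robustifying iterations, and the names tricube and local_linear_fit are hypothetical.

```python
def tricube(u):
    """Tricube kernel: (1 - |u|^3)^3 for |u| < 1, otherwise 0."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def local_linear_fit(xs, ys, x0, bandwidth):
    """Fit a weighted least-squares line around x0 and evaluate it at x0.

    Assumption: one fixed bandwidth for every query point.
    """
    w = [tricube((x - x0) / bandwidth) for x in xs]
    # Weighted sums that form the 2x2 normal equations
    #   [sw   swx ] [b0]   [swy ]
    #   [swx  swxx] [b1] = [swxy]
    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, xs))
    swy = sum(wi * y for wi, y in zip(w, ys))
    swxx = sum(wi * x * x for wi, x in zip(w, xs))
    swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    det = sw * swxx - swx * swx
    if abs(det) < 1e-12:
        raise ValueError("too few points inside the bandwidth for a line fit")
    intercept = (swy * swxx - swx * swxy) / det
    slope = (sw * swxy - swx * swy) / det
    return intercept + slope * x0


if __name__ == '__main__':
    # Points that lie exactly on y = 2x + 1, so the local fit should recover the line.
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [2 * x + 1 for x in xs]
    print(local_linear_fit(xs, ys, x0=2.0, bandwidth=2.5))
```

Calling local_linear_fit once for every x value would give a smoothed curve comparable to a single pass of the inner loop in lowess.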

