18CSL76 Lab Manual
LAB MANUAL
Prepared By
A Rosline Mary, S Suma
Asst. Professor
Dept. of CSE
Mission
M1: To nurture a positive environment with state-of-the-art facilities conducive to deep learning and meaningful research and development.
M2: To enhance interaction with industry for promoting collaborative research in emerging technologies.
M3: To strengthen the learning experiences enabling the students to become ethical professionals with good
interpersonal skills, capable of working effectively in multi-disciplinary teams.
SYLLABUS
ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING LABORATORY
[As per Choice Based Credit System (CBCS) scheme]
(Effective from the academic year 2018 -2019)
SEMESTER – VII
Subject Code: 18CSL76 IA Marks: 40
Number of Lecture Hours/Week: 01I + 02P Exam Marks: 60
Total Number of Lecture Hours: 36 Exam Hours: 03
CREDITS – 02
Course objectives: This course will enable students to
Description:
Lab Experiments:
Implement A* Search algorithm.
Implement AO* Search algorithm.
For a given set of training data examples stored in a .CSV file, implement and demonstrate the Candidate-Elimination algorithm to output a description of the set of all hypotheses consistent with the training examples.
Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use an
appropriate data set for building the decision tree and apply this knowledge to classify a new
sample.
Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the
same using appropriate data sets.
Write a program to implement the naïve Bayesian classifier for a sample training data set stored as
a .CSV file. Compute the accuracy of the classifier, considering few test data sets.
Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set for
clustering using k-Means algorithm. Compare the results of these two algorithms and comment on
the quality of clustering. You can add Java/Python ML library classes/API in the program.
Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set. Print
both correct and wrong predictions. Java/Python ML library classes can be used for this problem.
Implement the non-parametric Locally Weighted Regression algorithm to fit data points. Select
appropriate data set for your experiment and draw graphs.
Table of Contents

Exp. No.  Experiment
1  Implement A* Search algorithm.
2  Implement AO* Search algorithm.
3  For a given set of training data examples stored in a .CSV file, implement and demonstrate the Candidate-Elimination algorithm to output a description of the set of all hypotheses consistent with the training examples.
4  Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use an appropriate data set for building the decision tree and apply this knowledge to classify a new sample.
5  Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the same using appropriate data sets.
6  Write a program to implement the naïve Bayesian classifier for a sample training data set stored as a .CSV file. Compute the accuracy of the classifier, considering few test data sets.
7  Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set for clustering using k-Means algorithm. Compare the results of these two algorithms and comment on the quality of clustering. You can add Java/Python ML library classes/API in the program.
8  Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set. Print both correct and wrong predictions. Java/Python ML library classes can be used for this problem.
9  Implement the non-parametric Locally Weighted Regression algorithm in order to fit data points. Select appropriate data set for your experiment and draw graphs.
1. Implement A* Search algorithm.

# INPUT 1
OUTPUT:
Path found: ['A', 'E', 'D', 'G']
# INPUT 2
def heuristic(n):
    H_dist = {
        'A': 11,
        'B': 6,
        'C': 5,
        'D': 7,
        'E': 3,
        'F': 6,
        'G': 5,
        'H': 3,
        'I': 1,
        'J': 0
    }
    return H_dist[n]
#Describe your graph here
Graph_nodes = {
'A': [('B', 6), ('F', 3)],
'B': [('A', 6), ('C', 3), ('D', 2)],
'C': [('B', 3), ('D', 1), ('E', 5)],
'D': [('B', 2), ('C', 1), ('E', 8)],
'E': [('C', 5), ('D', 8), ('I', 5), ('J', 5)],
'F': [('A', 3), ('G', 1), ('H', 7)],
'G': [('F', 1), ('I', 3)],
'H': [('F', 7), ('I', 2)],
'I': [('E', 5), ('G', 3), ('H', 2), ('J', 3)],
}
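The aStarAlgo function invoked on the next line is not included in this extract. A minimal sketch, written against the heuristic() and Graph_nodes definitions above (get_neighbors is a small helper introduced here, not part of the original listing); with the graph and heuristic as given it finds the same path reported in the output below.

def get_neighbors(v):
    # neighbours of v, each as a (node, edge cost) pair, from the adjacency list above
    return Graph_nodes.get(v, [])

def aStarAlgo(start_node, stop_node):
    open_set = {start_node}               # discovered nodes not yet expanded
    closed_set = set()                    # nodes already expanded
    g = {start_node: 0}                   # cost of the best known path from the start
    parents = {start_node: start_node}    # back-pointers for path reconstruction

    while open_set:
        # expand the open node with the lowest f(n) = g(n) + h(n)
        n = min(open_set, key=lambda v: g[v] + heuristic(v))
        if n == stop_node:
            path = []
            while parents[n] != n:        # walk the back-pointers to rebuild the path
                path.append(n)
                n = parents[n]
            path.append(start_node)
            path.reverse()
            print('Path found: {}'.format(path))
            return path
        open_set.remove(n)
        closed_set.add(n)
        for (m, weight) in get_neighbors(n):
            if m in closed_set:
                continue
            if m not in open_set or g[n] + weight < g[m]:
                g[m] = g[n] + weight      # better path to m found via n
                parents[m] = n
                open_set.add(m)
    print('Path does not exist!')
    return None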
aStarAlgo('A', 'J')
OUTPUT:
Path found: ['A', 'F', 'G', 'I', 'J']
2. Implement AO* Search algorithm.
class Graph:
    def __init__(self, graph, heuristicNodeList, startNode):
        # instantiate graph object with graph topology, heuristic values and start node
        self.graph = graph
        self.H = heuristicNodeList
        self.start = startNode
        self.parent = {}
        self.status = {}
        self.solutionGraph = {}

    def applyAOStar(self):            # starts the recursive AO* algorithm
        self.aoStar(self.start, False)

    def getNeighbors(self, v):        # gets the neighbors of a given node
        return self.graph.get(v, '')

    def getStatus(self, v):           # returns the status of a given node
        return self.status.get(v, 0)

    def setStatus(self, v, val):      # sets the status of a given node
        self.status[v] = val

    def getHeuristicNodeValue(self, n):
        return self.H.get(n, 0)       # returns the heuristic value of a given node

    def setHeuristicNodeValue(self, n, value):
        self.H[n] = value             # sets the revised heuristic value of a given node

    def printSolution(self):
        print("FOR GRAPH SOLUTION, TRAVERSE THE GRAPH FROM THE START NODE:", self.start)
        print(self.solutionGraph)
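The two methods that actually drive the search, computeMinimumCostChildNodes and aoStar, are not part of this extract. A hedged sketch of both, written as further methods of the Graph class above and matching the trace format printed below (heuristic table, solution graph, node being processed, then the revised cost with the chosen child list). The first trace below was produced for an input graph whose definition also does not appear in this extract; INPUT 2 further down is listed in full.

    def computeMinimumCostChildNodes(self, v):
        # pick the AND/OR child group of v with the smallest total of edge cost + heuristic
        minimumCost, minimumNodes = 0, []
        first = True
        for nodeInfoTupleList in self.getNeighbors(v):
            cost, nodeList = 0, []
            for c, weight in nodeInfoTupleList:
                cost = cost + self.getHeuristicNodeValue(c) + weight
                nodeList.append(c)
            if first or cost < minimumCost:
                minimumCost, minimumNodes = cost, nodeList
                first = False
        return minimumCost, minimumNodes

    def aoStar(self, v, backTracking):
        print("HEURISTIC VALUES :", self.H)
        print("SOLUTION GRAPH :", self.solutionGraph)
        print("PROCESSING NODE :", v)
        if self.getStatus(v) >= 0:                           # node not yet marked solved
            minimumCost, childNodeList = self.computeMinimumCostChildNodes(v)
            print(minimumCost, childNodeList)
            self.setHeuristicNodeValue(v, minimumCost)       # revise the cost estimate of v
            self.setStatus(v, len(childNodeList))
            solved = True
            for childNode in childNodeList:
                self.parent[childNode] = v
                if self.getStatus(childNode) != -1:
                    solved = False
            if solved:                                       # all chosen children solved => v solved
                self.setStatus(v, -1)
                self.solutionGraph[v] = childNodeList
            if v != self.start:                              # propagate the revised cost towards the start node
                self.aoStar(self.parent[v], True)
            if not backTracking:                             # otherwise expand the chosen children
                for childNode in childNodeList:
                    self.setStatus(childNode, 0)
                    self.aoStar(childNode, False)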
PROCESSING NODE : G
8 ['I']
HEURISTIC VALUES : {'A': 10, 'B': 6, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 8, 'H': 7, 'I': 7, 'J': 1}
SOLUTION GRAPH : {}
PROCESSING NODE : B
8 ['H']
HEURISTIC VALUES : {'A': 10, 'B': 8, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 8, 'H': 7, 'I': 7, 'J': 1}
SOLUTION GRAPH : {}
PROCESSING NODE : A
12 ['B', 'C']
HEURISTIC VALUES : {'A': 12, 'B': 8, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 8, 'H': 7, 'I': 7, 'J': 1}
SOLUTION GRAPH : {}
PROCESSING NODE : I
0 []
HEURISTIC VALUES : {'A': 12, 'B': 8, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 8, 'H': 7, 'I': 0, 'J': 1}
SOLUTION GRAPH : {'I': []}
PROCESSING NODE : G
1 ['I']
HEURISTIC VALUES : {'A': 12, 'B': 8, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 1, 'H': 7, 'I': 0, 'J': 1}
SOLUTION GRAPH : {'I': [], 'G': ['I']}
PROCESSING NODE : B
2 ['G']
HEURISTIC VALUES : {'A': 12, 'B': 2, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 1, 'H': 7, 'I': 0, 'J': 1}
SOLUTION GRAPH : {'I': [], 'G': ['I'], 'B': ['G']}
PROCESSING NODE : A
6 ['B', 'C']
HEURISTIC VALUES : {'A': 6, 'B': 2, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 1, 'H': 7, 'I': 0, 'J': 1}
SOLUTION GRAPH : {'I': [], 'G': ['I'], 'B': ['G']}
PROCESSING NODE : C
2 ['J']
HEURISTIC VALUES : {'A': 6, 'B': 2, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 1, 'H': 7, 'I': 0, 'J': 1}
SOLUTION GRAPH : {'I': [], 'G': ['I'], 'B': ['G']}
PROCESSING NODE : A
6 ['B', 'C']
HEURISTIC VALUES : {'A': 6, 'B': 2, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 1, 'H': 7, 'I': 0, 'J': 1}
SOLUTION GRAPH : {'I': [], 'G': ['I'], 'B': ['G']}
PROCESSING NODE : J
0 []
HEURISTIC VALUES : {'A': 6, 'B': 2, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 1, 'H': 7, 'I': 0, 'J': 0}
SOLUTION GRAPH : {'I': [], 'G': ['I'], 'B': ['G'], 'J': []}
PROCESSING NODE : C
1 ['J']
HEURISTIC VALUES : {'A': 6, 'B': 2, 'C': 1, 'D': 12, 'E': 2, 'F': 1, 'G': 1, 'H': 7, 'I': 0, 'J': 0}
SOLUTION GRAPH : {'I': [], 'G': ['I'], 'B': ['G'], 'J': [], 'C': ['J']}
PROCESSING NODE : A
5 ['B', 'C']
FOR GRAPH SOLUTION, TRAVERSE THE GRAPH FROM THE START NODE: A
{'I': [], 'G': ['I'], 'B': ['G'], 'J': [], 'C': ['J'], 'A': ['B', 'C']}
INPUT 2
print ("Graph - 2")
h2 = {'A': 1, 'B': 6, 'C': 12, 'D': 10, 'E': 4, 'F': 4, 'G': 5, 'H': 7} # Heuristic values of Nodes
graph2 = { # Graph of Nodes and Edges
'A': [[('B', 1), ('C', 1)], [('D', 1)]], # Neighbors of Node 'A', B, C & D with repective weights
'B': [[('G', 1)], [('H', 1)]], # Neighbors are included in a list of lists
'D': [[('E', 1), ('F', 1)]] # Each sublist indicate a "OR" node or "AND" nodes
}
G2 = Graph(graph2, h2, 'A') # Instantiate Graph object with graph, heuristic values and start Node
G2.applyAOStar() # Run the AO* algorithm
G2.printSolution() # Print the solution graph as output of the AO* algorithm search
OUTPUT:
Graph - 2
HEURISTIC VALUES : {'A': 1, 'B': 6, 'C': 12, 'D': 10, 'E': 4, 'F': 4, 'G': 5, 'H': 7}
SOLUTION GRAPH : {}
PROCESSING NODE : A
11 ['D']
HEURISTIC VALUES : {'A': 11, 'B': 6, 'C': 12, 'D': 10, 'E': 4, 'F': 4, 'G': 5, 'H': 7}
SOLUTION GRAPH : {}
PROCESSING NODE : D
10 ['E', 'F']
HEURISTIC VALUES : {'A': 11, 'B': 6, 'C': 12, 'D': 10, 'E': 4, 'F': 4, 'G': 5, 'H': 7}
SOLUTION GRAPH : {}
PROCESSING NODE : A
11 ['D']
HEURISTIC VALUES : {'A': 11, 'B': 6, 'C': 12, 'D': 10, 'E': 4, 'F': 4, 'G': 5, 'H': 7}
SOLUTION GRAPH : {}
PROCESSING NODE : E
0 []
HEURISTIC VALUES : {'A': 11, 'B': 6, 'C': 12, 'D': 10, 'E': 0, 'F': 4, 'G': 5, 'H': 7}
SOLUTION GRAPH : {'E': []}
PROCESSING NODE : D
6 ['E', 'F']
HEURISTIC VALUES : {'A': 11, 'B': 6, 'C': 12, 'D': 6, 'E': 0, 'F': 4, 'G': 5, 'H': 7}
SOLUTION GRAPH : {'E': []}
PROCESSING NODE : A
7 ['D']
HEURISTIC VALUES : {'A': 7, 'B': 6, 'C': 12, 'D': 6, 'E': 0, 'F': 4, 'G': 5, 'H': 7}
SOLUTION GRAPH : {'E': []}
PROCESSING NODE : F
0 []
HEURISTIC VALUES : {'A': 7, 'B': 6, 'C': 12, 'D': 6, 'E': 0, 'F': 0, 'G': 5, 'H': 7}
SOLUTION GRAPH : {'E': [], 'F': []}
PROCESSING NODE : D
2 ['E', 'F']
HEURISTIC VALUES : {'A': 7, 'B': 6, 'C': 12, 'D': 2, 'E': 0, 'F': 0, 'G': 5, 'H': 7}
SOLUTION GRAPH : {'E': [], 'F': [], 'D': ['E', 'F']}
PROCESSING NODE : A
3 ['D']
FOR GRAPH SOLUTION, TRAVERSE THE GRAPH FROM THE START NODE: A
{'E': [], 'F': [], 'D': ['E', 'F'], 'A': ['D']}
3. For a given set of training data examples stored in a .CSV file, implement and demonstrate the Candidate-Elimination algorithm to output a description of the set of all hypotheses consistent with the training examples.
import numpy as np
import pandas as pd

data = pd.read_csv('enjoysport.csv')
concepts = np.array(data.iloc[:, 0:-1])
print(concepts)
target = np.array(data.iloc[:, -1])
print(target)

specific_h = concepts[0].copy()
print(specific_h)
general_h = [['?' for _ in range(len(specific_h))] for _ in range(len(specific_h))]
print(general_h)

for i, h in enumerate(concepts):
    if target[i] == "yes":
        for x in range(len(specific_h)):
            if h[x] != specific_h[x]:
                specific_h[x] = '?'
                general_h[x][x] = '?'
    if target[i] == "no":
        for x in range(len(specific_h)):
            if h[x] != specific_h[x]:
                general_h[x][x] = specific_h[x]
            else:
                general_h[x][x] = '?'
    print(specific_h)
    print(general_h)
    print("\n")
    print("\n")

# Drop the rows of general_h that remained fully general
indices = [i for i, val in enumerate(general_h) if val == ['?', '?', '?', '?', '?', '?']]
for i in indices:
    general_h.remove(['?', '?', '?', '?', '?', '?'])

print("Final Specific_h:")
print(specific_h)
print("Final General_h:")
print(general_h)
OUTPUT:
Final Specific_h:
['sunny' 'warm' '?' 'strong' '?' '?']
Final General_h:
[['sunny', '?', '?', '?', '?', '?'], ['?', 'warm', '?', '?', '?', '?']]
4. Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use an
appropriate data set for building the decision tree and apply this knowledge to classify a new
sample.
df_tennis.keys()[0]
'PlayTennis'
Classes: ['YES']
Probabilities of Class YES is 1.0:
Probabilities of Class YES is 1.0:
Classes: Counter({'YES': 3, 'NO': 2})
Number of Instances of the Current Sub Class is 5:
[0.6, 0.4]
Classes: ['YES', 'NO']
Probabilities of Class NO is 0.4:
Probabilities of Class YES is 0.6:
Classes: Counter({'NO': 3, 'YES': 2})
Number of Instances of the Current Sub Class is 5:
[0.6, 0.4]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.4:
Probabilities of Class YES is 0.6:
Index(['entropy_of_list', '<lambda>'], dtype='object')
DFAGGENT entropy_of_list <lambda>
Outlook
overcast 0.000000 0.285714
rainy 0.970951 0.357143
sunny 0.970951 0.357143
Classes: Counter({'YES': 9, 'NO': 5})
Number of Instances of the Current Sub Class is 14:
[0.35714285714285715, 0.6428571428571429]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.35714285714285715:
Probabilities of Class YES is 0.6428571428571429:
Info-gain for Outlook is :0.2467498197744391
Information Gain Calculation of Humidity
split: <class 'pandas.core.groupby.groupby.DataFrameGroupBy'>
Name:
high
Group:
Humidity
high 0.985228 0.5
normal 0.591673 0.5
Classes: Counter({'YES': 9, 'NO': 5})
Number of Instances of the Current Sub Class is 14:
[0.35714285714285715, 0.6428571428571429]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.35714285714285715:
Probabilities of Class YES is 0.6428571428571429:
Info-gain for Humidity is: 0.15183550136234136
Information Gain Calculation of Windy
split: <class 'pandas.core.groupby.groupby.DataFrameGroupBy'>
Name:
strong
Group:
PlayTennis Outlook Temperature Humidity Windy
1 NO sunny hot high strong
5 NO rainy cool normal strong
6 YES overcast cool normal strong
10 YES sunny mild normal strong
11 YES overcast mild high strong
13 NO rainy mild high strong
Name:
weak
Group:
PlayTennis Outlook Temperature Humidity Windy
0 NO sunny hot high weak
2 YES overcast hot high weak
3 YES rainy mild high weak
4 YES rainy cool normal weak
7 NO sunny mild high weak
8 YES sunny cool normal weak
9 YES rainy mild normal weak
Number of Instances of the Current Sub Class is 6:
[0.6666666666666666, 0.3333333333333333]
Classes: ['YES', 'NO']
Probabilities of Class NO is 0.3333333333333333:
Probabilities of Class YES is 0.6666666666666666:
Index(['entropy_of_list', '<lambda>'], dtype='object')
ID3 Algorithm
def id3(df, target_attribute_name, attribute_names, default_class=None):
    ## Tally the target attribute:
    from collections import Counter
    cnt = Counter(x for x in df[target_attribute_name])    # counts of YES / NO
    ## First check: is this split of the dataset homogeneous?
    if len(cnt) == 1:
        return next(iter(cnt))          # only one class left, return it as the leaf label
    ## Second check: is this split empty, or are there no attributes left to split on?
    elif df.empty or (not attribute_names):
        return default_class
    ## Otherwise: pick the best attribute, split the dataset on it and recurse
    else:
        default_class = max(cnt, key=cnt.get)               # most common class as fallback
        gains = [information_gain(df, attr, target_attribute_name) for attr in attribute_names]
        best_attr = attribute_names[gains.index(max(gains))]
        tree = {best_attr: {}}                              # empty tree, to be populated below
        remaining_attribute_names = [a for a in attribute_names if a != best_attr]
        # On each split, recursively call this algorithm and
        # populate the empty tree with the returned subtrees
        for attr_val, data_subset in df.groupby(best_attr):
            subtree = id3(data_subset,
                          target_attribute_name,
                          remaining_attribute_names,
                          default_class)
            tree[best_attr][attr_val] = subtree
        return tree
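The entropy and information-gain helpers whose traces appear throughout this section (entropy_of_list and the "Information Gain Calculation of ..." outputs) are not included in this extract. A simplified sketch that the id3 function above can call; it computes the same quantities but omits the per-split trace printing seen in the outputs.

import math
from collections import Counter

def entropy(probs):
    # Shannon entropy of a list of probabilities
    return sum(-p * math.log(p, 2) for p in probs)

def entropy_of_list(a_list):
    # entropy of the class labels contained in a_list
    cnt = Counter(x for x in a_list)
    num_instances = len(a_list) * 1.0
    probs = [count / num_instances for count in cnt.values()]
    return entropy(probs)

def information_gain(df, split_attribute_name, target_attribute_name):
    # entropy of the whole set minus the weighted entropy of the subsets
    # obtained by splitting on split_attribute_name
    nobs = len(df.index) * 1.0
    new_entropy = 0.0
    for attr_val, subset in df.groupby(split_attribute_name):
        new_entropy += (len(subset) / nobs) * entropy_of_list(subset[target_attribute_name])
    return entropy_of_list(df[target_attribute_name]) - new_entropy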
Predicting Attributes
# Get Predictor Names (all but 'class')
attribute_names = list(df_tennis.columns)
print("List of Attributes:", attribute_names)
attribute_names.remove('PlayTennis') #Remove the class attribute
print("Predicting Attributes:", attribute_names)
List of Attributes: ['PlayTennis', 'Outlook', 'Temperature', 'Humidity', 'Windy']
# Run Algorithm:
from pprint import pprint
tree = id3(df_tennis,'PlayTennis',attribute_names)
print("\n\nThe Resultant Decision Tree is :\n")
#print(tree)
pprint(tree)
attribute = next(iter(tree))
print("Best Attribute :\n",attribute)
print("Tree Keys:\n",tree[attribute].keys())
Information Gain Calculation of Outlook
split: <class 'pandas.core.groupby.groupby.DataFrameGroupBy'> Name:
overcast Group:
PlayTennis Outlook Temperature Humidity Windy
Group:
Classes: ['YES']
[0.35714285714285715, 0.6428571428571429]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.35714285714285715:
Probabilities of Class YES is 0.6428571428571429:
Information Gain Calculation of Humidity
split: <class 'pandas.core.groupby.groupby.DataFrameGroupBy'>
Name:
high
Group:
PlayTennis Outlook Temperature Humidity Windy
0 NO sunny hot high weak
1 NO sunny hot high strong
2 YES overcast hot high weak
3 YES rainy mild high weak
7 NO sunny mild high weak
11 YES overcast mild high strong
13 NO rainy mild high strong
Name:
normal
Group:
PlayTennis Outlook Temperature Humidity Windy
4 YES rainy cool normal weak
5 NO rainy cool normal strong
6 YES overcast cool normal strong
8 YES sunny cool normal weak
9 YES rainy mild normal weak
10 YES sunny mild normal strong
12 YES overcast hot normal weak
Classes: Counter({'NO': 4, 'YES': 3})
Number of Instances of the Current Sub Class is 7:
[0.5714285714285714, 0.42857142857142855]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.42857142857142855:
[0.6, 0.4]
Classes: ['YES', 'NO']
Probabilities of Class NO is 0.4:
Probabilities of Class YES is 0.6:
Information Gain Calculation of Humidity
split: <class 'pandas.core.groupby.groupby.DataFrameGroupBy'>
Name:
high
Group:
PlayTennis Outlook Temperature Humidity Windy
3 YES rainy mild high weak
13 NO rainy mild high strong
Name:
normal
Group:
PlayTennis Outlook Temperature Humidity Windy
4 YES rainy cool normal weak
5 NO rainy cool normal strong
9 YES rainy mild normal weak
Classes: Counter({'YES': 1, 'NO': 1})
Number of Instances of the Current Sub Class is 2:
[0.5, 0.5]
Classes: ['YES', 'NO']
Probabilities of Class NO is 0.5:
Probabilities of Class YES is 0.5:
Classes: Counter({'YES': 2, 'NO': 1})
Number of Instances of the Current Sub Class is 3:
[0.6666666666666666, 0.3333333333333333]
Classes: ['YES', 'NO']
Probabilities of Class NO is 0.3333333333333333:
Probabilities of Class YES is 0.6666666666666666:
Index(['entropy_of_list', '<lambda>'], dtype='object')
DFAGGENT entropy_of_list <lambda>
Humidity
high 1.000000 0.4
normal 0.918296 0.6
Classes: Counter({'YES': 3, 'NO': 2})
Number of Instances of the Current Sub Class is 5:
[0.6, 0.4]
Classes: ['YES', 'NO']
Probabilities of Class NO is 0.4:
Probabilities of Class YES is 0.6:
Information Gain Calculation of Windy
split: <class 'pandas.core.groupby.groupby.DataFrameGroupBy'>
Name:
strong
Group:
PlayTennis Outlook Temperature Humidity Windy
5 NO rainy cool normal strong
13 NO rainy mild high strong
Name:
weak
Group:
PlayTennis Outlook Temperature Humidity Windy
3 YES rainy mild high weak
4 YES rainy cool normal weak
9 YES rainy mild normal weak
Classes: Counter({'NO': 2})
[1.0]
Classes: ['YES']
Probabilities of Class YES is 1.0:
Probabilities of Class YES is 1.0:
Index(['entropy_of_list', '<lambda>'], dtype='object')
DFAGGENT entropy_of_list <lambda>
Windy
strong 0.0 0.4
weak 0.0 0.6
Classes: Counter({'YES': 3, 'NO': 2})
Number of Instances of the Current Sub Class is 5:
[0.6, 0.4]
Classes: ['YES', 'NO']
Probabilities of Class NO is 0.4:
Probabilities of Class YES is 0.6:
Information Gain Calculation of Temperature
split: <class 'pandas.core.groupby.groupby.DataFrameGroupBy'>
Name:
cool
Group:
PlayTennis Outlook Temperature Humidity Windy
8 YES sunny cool normal weak
Name:
hot
Group:
PlayTennis Outlook Temperature Humidity Windy
0 NO sunny hot high weak
1 NO sunny hot high strong
Name:
mild
Group:
PlayTennis Outlook Temperature Humidity Windy
7 NO sunny mild high weak
Name:
high
Group:
PlayTennis Outlook Temperature Humidity Windy
0 NO sunny hot high weak
1 NO sunny hot high strong
7 NO sunny mild high weak
Name:
normal
Group:
PlayTennis Outlook Temperature Humidity Windy
8 YES sunny cool normal weak
10 YES sunny mild normal strong
Classes: Counter({'NO': 3})
Number of Instances of the Current Sub Class is 3:
[1.0]
Classes: ['NO']
Probabilities of Class NO is 1.0:
Probabilities of Class NO is 1.0:
Classes: Counter({'YES': 2})
Number of Instances of the Current Sub Class is 2:
[1.0]
Classes: ['YES']
Probabilities of Class YES is 1.0:
Probabilities of Class YES is 1.0:
Index(['entropy_of_list', '<lambda>'], dtype='object')
DFAGGENT entropy_of_list <lambda>
Humidity
high 0.0 0.6
normal 0.0 0.4
Classes: Counter({'NO': 3, 'YES': 2})
Number of Instances of the Current Sub Class is 5:
[0.6, 0.4]
# Decision tree learned above, and a new sample to classify
DT = {'Outlook': {'Overcast': 'Yes',
                  'Rain': {'Wind': {'Strong': 'No', 'Weak': 'Yes'}},
                  'Sunny': {'Humidity': {'High': 'No', 'Normal': 'Yes'}}}}
testsample = {'Outlook': 'Sunny', 'Temperature': 'Hot', 'Humidity': 'High', 'Wind': 'Strong'}
print("Test sample : ", testsample)

# Walk down the tree: at every internal node read off the attribute it tests,
# then follow the branch that matches the sample's value for that attribute
node = DT
while node not in ('Yes', 'No'):
    attribute = next(iter(node))
    node = node[attribute][testsample[attribute]]
print("The test sample is classified as", node)

Test sample : {'Outlook': 'Sunny', 'Temperature': 'Hot', 'Humidity': 'High', 'Wind': 'Strong'}
The test sample is classified as No
5. Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the same using appropriate data sets.
import numpy as np
X = np.array(([2, 9], [1, 5], [3, 6]), dtype=float) # Features ( Hrs Slept, Hrs Studied)
y = np.array(([92], [86], [89]), dtype=float) # Labels(Marks obtained)
c = np.amax(X, axis=0)   # column-wise maximum of each feature
print(c)
X = X/c                  # normalize features to the range [0, 1]
y = y/100
print(X)
print(y)
def sigmoid(x):
return 1/(1 + np.exp(-x))
def sigmoid_grad(x):
return x * (1 - x)
# Variable initialization
epoch=4 #Setting training iterations
eta =0.3 #Setting learning rate (eta)
input_neurons = 2    # number of features in the data set
hidden_neurons = 3   # number of neurons in the hidden layer
output_neurons = 1   # number of neurons in the output layer
# Weight and bias - Random initialization
wh=np.random.uniform(size=(input_neurons,hidden_neurons)) # 2x3
print(wh)
bh=np.random.uniform(size=(1,hidden_neurons)) # 1x3
print(bh)
wout=np.random.uniform(size=(hidden_neurons,output_neurons)) # 3x1
print(wout)
bout=np.random.uniform(size=(1,output_neurons))
print(bout)
for i in range(epoch):
    # Forward propagation
    h_ip = np.dot(X, wh) + bh       # dot product + bias at the hidden layer
    print(h_ip)
    h_act = sigmoid(h_ip)           # activation function
    o_ip = np.dot(h_act, wout) + bout
    output = sigmoid(o_ip)

    # Backpropagation
    # Error at the output layer
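    # (The remaining backpropagation steps are not present in this extract.
    #  A hedged completion, assuming plain gradient descent with learning rate eta.)
    EO = y - output                        # difference between target and prediction
    d_output = EO * sigmoid_grad(output)   # delta at the output layer

    # Error at the hidden layer
    EH = d_output.dot(wout.T)              # error propagated back to the hidden layer
    d_hidden = EH * sigmoid_grad(h_act)    # delta at the hidden layer

    # Weight updates (bias updates omitted in this sketch)
    wout += h_act.T.dot(d_output) * eta
    wh += X.T.dot(d_hidden) * eta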
Output:
[3. 9.]
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
[[0.92]
[0.86]
[0.89]]
[[0.34391748 0.26893556 0.69278977]
[0.27114375 0.50486066 0.36606859]]
[[0.01416985 0.30257255 0.49180958]]
[[0.87639985]
[0.62305314]
[0.48385416]]
[[0.84524982]]
[[0.51459193 0.98672358 1.31973802]
[0.27944443 0.67269588 0.92611094]
[0.53884983 0.90808188 1.42864508]]
[[0.51459193 0.98672358 1.31973802]
6. Write a program to implement the naïve Bayesian classifier for a sample training data set stored
as a .CSV file. Compute the accuracy of the classifier, considering few test data sets.
import csv
import random
import math
def loadCsv(filename):
    lines = csv.reader(open(filename, "r"))
    dataset = list(lines)
    for i in range(len(dataset)):
        dataset[i] = [float(x) for x in dataset[i]]
    return dataset

def separateByClass(dataset):
    separated = {}
    for i in range(len(dataset)):
        vector = dataset[i]
        if (vector[-1] not in separated):
            separated[vector[-1]] = []
        separated[vector[-1]].append(vector)
    return separated

def mean(numbers):
    return sum(numbers)/float(len(numbers))

def stdev(numbers):
    avg = mean(numbers)
    variance = sum([pow(x-avg, 2) for x in numbers])/float(len(numbers)-1)
    return math.sqrt(variance)

def summarize(dataset):
    summaries = [(mean(attribute), stdev(attribute)) for attribute in zip(*dataset)]
    del summaries[-1]
    return summaries

def summarizeByClass(dataset):
    separated = separateByClass(dataset)
    summaries = {}
    for classValue, instances in separated.items():
        summaries[classValue] = summarize(instances)
    return summaries
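main() below also calls splitDataset, getPredictions and getAccuracy, which are not included in this extract. A hedged sketch of those helpers (together with the Gaussian likelihood and prediction routines they rely on), written to match the way main() uses them and the random/math imports at the top of this program.

def splitDataset(dataset, splitRatio):
    # randomly move splitRatio of the rows into the training set; the rest form the test set
    trainSize = int(len(dataset) * splitRatio)
    trainSet = []
    copy = list(dataset)
    while len(trainSet) < trainSize:
        index = random.randrange(len(copy))
        trainSet.append(copy.pop(index))
    return [trainSet, copy]

def calculateProbability(x, mean, stdev):
    # Gaussian probability density of attribute value x
    exponent = math.exp(-(math.pow(x - mean, 2) / (2 * math.pow(stdev, 2))))
    return (1 / (math.sqrt(2 * math.pi) * stdev)) * exponent

def calculateClassProbabilities(summaries, inputVector):
    # multiply the per-attribute likelihoods for every class
    probabilities = {}
    for classValue, classSummaries in summaries.items():
        probabilities[classValue] = 1
        for i in range(len(classSummaries)):
            mean, stdev = classSummaries[i]
            probabilities[classValue] *= calculateProbability(inputVector[i], mean, stdev)
    return probabilities

def predict(summaries, inputVector):
    # choose the class with the largest likelihood product
    probabilities = calculateClassProbabilities(summaries, inputVector)
    bestLabel, bestProb = None, -1
    for classValue, probability in probabilities.items():
        if bestLabel is None or probability > bestProb:
            bestProb = probability
            bestLabel = classValue
    return bestLabel

def getPredictions(summaries, testSet):
    return [predict(summaries, testSet[i]) for i in range(len(testSet))]

def getAccuracy(testSet, predictions):
    correct = sum(1 for i in range(len(testSet)) if testSet[i][-1] == predictions[i])
    return (correct / float(len(testSet))) * 100.0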
def main():
    filename = 'pima-indians-diabetes.csv'
    splitRatio = 0.80
    dataset = loadCsv(filename)
    trainingSet, testSet = splitDataset(dataset, splitRatio)
    print('Split {0} rows into train={1} and test={2} rows'.format(len(dataset), len(trainingSet), len(testSet)))
    # prepare model
    summaries = summarizeByClass(trainingSet)
    # test model
    predictions = getPredictions(summaries, testSet)
    accuracy = getAccuracy(testSet, predictions)
    print('Accuracy: {0}%'.format(accuracy))

main()
OUTPUT:
Split 768 rows into train=614 and test=154 rows
Accuracy: 32.467532467532465%
7. Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set for
clustering using k-Means algorithm. Compare the results of these two algorithms and comment
on the quality of clustering. You can add Java/Python ML library classes/API in the program.
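The program for this experiment is not included in this extract. A minimal sketch, assuming the data sits in a local CSV whose last column holds the class label (the file name iris.csv is a placeholder) and using scikit-learn's KMeans and GaussianMixture (EM) classes, which the problem statement permits.

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

data = pd.read_csv('iris.csv')            # placeholder file name
X = data.iloc[:, :-1].values              # feature columns (last column assumed to be the label)

kmeans = KMeans(n_clusters=3, n_init=10).fit(X)    # k-Means clustering
gmm = GaussianMixture(n_components=3).fit(X)       # EM clustering via a Gaussian mixture
em_labels = gmm.predict(X)

plt.figure(figsize=(14, 7))
plt.subplot(1, 2, 1)
plt.title('k-Means clustering')
plt.scatter(X[:, 2], X[:, 3], c=kmeans.labels_)
plt.subplot(1, 2, 2)
plt.title('EM (Gaussian Mixture) clustering')
plt.scatter(X[:, 2], X[:, 3], c=em_labels)
plt.show()

Comparing the two scatter plots is usually enough to comment on clustering quality: the Gaussian mixture (EM) tends to follow the elongated, overlapping clusters more closely, while k-Means imposes roughly spherical clusters.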
OUTPUT:
24.0
<Figure size 1008x504 with 0 Axes>
8. Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set. Print
both correct and wrong predictions. Java/Python ML library classes can be used for this problem.
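The program for this experiment is not included in this extract. A minimal sketch using scikit-learn's bundled Iris data and KNeighborsClassifier, printing results in the same format as the output below; the split ratio and the value of k are assumptions.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2)

clf = KNeighborsClassifier(n_neighbors=3)   # k = 3 nearest neighbours
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

print("ACCURACY:", accuracy_score(y_test, y_pred))
for pred, actual in zip(y_pred, y_test):
    print("Prediction is", iris.target_names[pred], "Actual is", iris.target_names[actual])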
OUTPUT:
ACCURACY: 0.9473684210526315
Prediction is virginica Actual is virginica
Prediction is virginica Actual is virginica
Prediction is virginica Actual is virginica
Prediction is setosa Actual is setosa
Prediction is virginica Actual is virginica
Prediction is virginica Actual is virginica
Prediction is virginica Actual is virginica
Prediction is setosa Actual is setosa
Prediction is setosa Actual is setosa
Prediction is versicolor Actual is virginica
Prediction is virginica Actual is virginica
Prediction is virginica Actual is virginica
Prediction is setosa Actual is setosa
Prediction is versicolor Actual is versicolor
Prediction is setosa Actual is setosa
Prediction is versicolor Actual is versicolor
Prediction is versicolor Actual is versicolor
Prediction is setosa Actual is setosa
Prediction is setosa Actual is setosa
Prediction is virginica Actual is virginica
9. Implement the non-parametric Locally Weighted Regression algorithm in order to fit data
points. Select appropriate data set for your experiment and draw graphs.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def kernel(point, xmat, k):
    m, n = np.shape(xmat)
    weights = np.mat(np.eye(m))        # m x m identity matrix of weights
    for j in range(m):
        diff = point - xmat[j]
        weights[j, j] = np.exp(diff*diff.T/(-2.0*k**2))
    return weights
def localWeight(point, xmat, ymat, k):
    wei = kernel(point, xmat, k)
    W = (xmat.T*(wei*xmat)).I*(xmat.T*(wei*ymat.T))   # weighted normal equations
    return W

def localWeightRegression(xmat, ymat, k):
    m, n = np.shape(xmat)
    ypred = np.zeros(m)
    for i in range(m):
        ypred[i] = xmat[i]*localWeight(xmat[i], xmat, ymat, k)
    return ypred
data = pd.read_csv('10_data10_tips.csv')
bill = np.array(data.total_bill)
tip = np.array(data.tip)
mbill = np.mat(bill)
mtip = np.mat(tip)
m= np.shape(mbill)[1]
one = np.mat(np.ones(m))
X= np.hstack((one.T,mbill.T))
ypred = localWeightRegression(X,mtip,5)
SortIndex = X[:,1].argsort(0)
xsort = X[SortIndex][:,0]
fig = plt.figure()
ax = fig.add_subplot(1,1,1)
ax.scatter(bill,tip, color='blue')
ax.plot(xsort[:,1],ypred[SortIndex], color = 'red', linewidth=8)
plt.xlabel('Total bill')
plt.ylabel('Tip')
#plt.show()
OUTPUT: