
Academic Year 2021–2022 (ODD Semester)

LAB MANUAL

ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING


LABORATORY
18CSL76
VII Semester CS

Prepared By
A Rosline Mary, S Suma
Asst. Professor
Dept. of CSE


Vision and Mission of the Institute


Vision
To become a leading institute for quality technical education and research with ethical values.
Mission
M1: To continually improve a quality education system that produces thinking engineers having good technical capabilities with human values.
M2: To nurture a good ecosystem that encourages faculty and students to engage in meaningful research and development.
M3: To strengthen the industry-institute interface for promoting teamwork, internships and entrepreneurship.
M4: To enhance educational opportunities for the rural and weaker sections of society, equipping them with practical skills to face the challenges of life.

Vision and Mission of the Department


Vision
To become a leading department engaged in quality education and research in the field of computer science
and engineering.

Mission
M1: To nurture a positive environment with state-of-the-art facilities conducive to deep learning and meaningful research and development.

M2: To enhance interaction with industry for promoting collaborative research in emerging technologies.
M3: To strengthen the learning experiences enabling the students to become ethical professionals with good
interpersonal skills, capable of working effectively in multi-disciplinary teams.


PROGRAM EDUCATIONAL OBJECTIVES (PEOS)


PEO 1: Successful and ethical professionals in IT and ITES industries contributing to societal progress.
PEO 2: Engaged in life-long learning, adapting to changing technological scenarios.
PEO 3: Communicate and work effectively in diverse teams and exhibit leadership qualities.

PROGRAM SPECIFIC OUTCOMES (PSOs)


PSO 1: Analyze, design, implement and test innovative application software systems to meet the specified
requirements.
PSO 2: Understand and use systems software packages.
PSO 3: Understand the organization and architecture of digital computers, embedded systems and computer
networks.


SYLLABUS
ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING LABORATORY
[As per Choice Based Credit System (CBCS) scheme]
(Effective from the academic year 2018 -2019)
SEMESTER – VII
Subject Code: 18CSL76 IA Marks: 40
Number of Lecture Hours/Week: 01I + 02P Exam Marks: 60
Total Number of Lecture Hours: 36 Exam Hours: 03
CREDITS – 02
Course objectives: This course will enable students to

 Implement and evaluate AI and ML algorithms in Java and Python programming languages.


PART - A

Description:

 The programs can be implemented in either JAVA or Python.


 For Problems 1 to 6 and 10, programs are to be developed without using the built-in classes or
APIs of Java/Python.
 Data sets can be taken from standard repositories
(https://archive.ics.uci.edu/ml/datasets.html) or constructed by the students.

Lab Experiments:
 Implement A* Search algorithm.
 Implement AO* Search algorithm.
 For a given set of training data examples stored in a .CSV file, implement and demonstrate the
Candidate-Elimination algorithm to output a description of the set of all hypotheses consistent with
the training examples.
 Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use an
appropriate data set for building the decision tree and apply this knowledge to classify a new
sample.
 Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the
same using appropriate data sets.
 Write a program to implement the naïve Bayesian classifier for a sample training data set stored as
a .CSV file. Compute the accuracy of the classifier, considering few test data sets.


 Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set for
clustering using k-Means algorithm. Compare the results of these two algorithms and comment on
the quality of clustering. You can add Java/Python ML library classes/API in the program.
 Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set. Print
both correct and wrong predictions. Java/Python ML library classes can be used for this problem.
 Implement the non-parametric Locally Weighted Regression algorithm to fit data points. Select
appropriate data set for your experiment and draw graphs.

Conduction of Practical Examination:


 All laboratory experiments are to be included for practical examination.
 Students are allowed to pick one experiment from the lot.
 Strictly follow the instructions as printed on the cover page of the answer script.
 Marks distribution: Procedure + Execution + Viva-Voce: 15+70+15 = 100 Marks
Change of experiment is allowed only once, and the marks allotted to the procedure part will be made
zero.

Course outcomes: The students should be able to:


 Implement and demonstrate AI and ML algorithms.
 Evaluate different algorithms.


Table of contents

Exp. No.  Experiment
1  Implement A* Search algorithm.
2  Implement AO* Search algorithm.
3  For a given set of training data examples stored in a .CSV file, implement and demonstrate the Candidate-Elimination algorithm to output a description of the set of all hypotheses consistent with the training examples.
4  Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use an appropriate data set for building the decision tree and apply this knowledge to classify a new sample.
5  Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the same using appropriate data sets.
6  Write a program to implement the naïve Bayesian classifier for a sample training data set stored as a .CSV file. Compute the accuracy of the classifier, considering few test data sets.
7  Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set for clustering using k-Means algorithm. Compare the results of these two algorithms and comment on the quality of clustering. You can add Java/Python ML library classes/API in the program.
8  Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set. Print both correct and wrong predictions. Java/Python ML library classes can be used for this problem.
9  Implement the non-parametric Locally Weighted Regression algorithm in order to fit data points. Select appropriate data set for your experiment and draw graphs.


1. Implement A* Search algorithm.


def aStarAlgo(start_node, stop_node):
    open_set = {start_node}      # keeps track of unvisited nodes
    closed_set = set()           # keeps track of visited nodes
    g = {}                       # stores distance from the starting node
    parents = {}                 # parents contains an adjacency map of all nodes
    g[start_node] = 0            # distance of the starting node from itself is zero
    # start_node is the root node, i.e. it has no parent nodes,
    # so start_node is set as its own parent node
    parents[start_node] = start_node

    while len(open_set) > 0:
        n = None
        # the node with the lowest f() = g() + h() is found
        for v in open_set:
            if n == None or g[v] + heuristic(v) < g[n] + heuristic(n):
                n = v
        if n == stop_node or Graph_nodes[n] == None:
            pass
        else:
            for (m, weight) in get_neighbors(n):
                # nodes 'm' in neither the open nor the closed set are added to open,
                # and n is set as their parent
                if m not in open_set and m not in closed_set:
                    open_set.add(m)
                    parents[m] = n
                    g[m] = g[n] + weight
                # for each node m, compare its distance from start, i.e. g(m),
                # to the distance from start through node n
                else:
                    if g[m] > g[n] + weight:
                        # update g(m)
                        g[m] = g[n] + weight
                        # change parent of m to n
                        parents[m] = n
                        # if m is in the closed set, remove it and add it to open
                        if m in closed_set:
                            closed_set.remove(m)
                            open_set.add(m)
        if n == None:
            print('Path does not exist!')
            return None
        # if the current node is the stop_node,
        # then we begin reconstructing the path from it to the start_node
        if n == stop_node:
            path = []
            while parents[n] != n:
                path.append(n)
                n = parents[n]
            path.append(start_node)
            path.reverse()
            print('Path found: {}'.format(path))
            return path
        # remove n from the open set and add it to the closed set,
        # because all of its neighbors were inspected
        open_set.remove(n)
        closed_set.add(n)
    print('Path does not exist!')
    return None

# define a function to return the neighbors of the passed node
# together with their distances
def get_neighbors(v):
    if v in Graph_nodes:
        return Graph_nodes[v]
    else:
        return None

#INPUT 1
# for simplicity we'll consider the heuristic distances as given;
# this function returns the heuristic distance for all nodes
def heuristic(n):
    H_dist = {
        'A': 11,
        'B': 6,
        'C': 99,
        'D': 1,
        'E': 7,
        'G': 0,
    }
    return H_dist[n]

# describe your graph here
Graph_nodes = {
    'A': [('B', 2), ('E', 3)],
    'B': [('C', 1), ('G', 9)],
    'C': None,
    'E': [('D', 6)],
    'D': [('G', 1)],
}
aStarAlgo('A', 'G')


OUTPUT:
Path found: ['A', 'E', 'D', 'G']
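As a quick check against the graph above, the returned path A → E → D → G has total cost 3 + 6 + 1 = 10, whereas the alternative A → B → G costs 2 + 9 = 11, so the search correctly prefers the cheaper route even though B initially has the lower f() value.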

#INPUT 2
def heuristic(n):
    H_dist = {
        'A': 11,
        'B': 6,
        'C': 5,
        'D': 7,
        'E': 3,
        'F': 6,
        'G': 5,
        'H': 3,
        'I': 1,
        'J': 0
    }
    return H_dist[n]

# describe your graph here
Graph_nodes = {
    'A': [('B', 6), ('F', 3)],
    'B': [('A', 6), ('C', 3), ('D', 2)],
    'C': [('B', 3), ('D', 1), ('E', 5)],
    'D': [('B', 2), ('C', 1), ('E', 8)],
    'E': [('C', 5), ('D', 8), ('I', 5), ('J', 5)],
    'F': [('A', 3), ('G', 1), ('H', 7)],
    'G': [('F', 1), ('I', 3)],
    'H': [('F', 7), ('I', 2)],
    'I': [('E', 5), ('G', 3), ('H', 2), ('J', 3)],
}
aStarAlgo('A', 'J')


OUTPUT:
Path found: ['A', 'F', 'G', 'I', 'J']

2. Implement AO* Search algorithm.

AO STAR SEARCH:
class Graph:
    def __init__(self, graph, heuristicNodeList, startNode):
        # instantiate graph object with graph topology, heuristic values, start node
        self.graph = graph
        self.H = heuristicNodeList
        self.start = startNode
        self.parent = {}
        self.status = {}
        self.solutionGraph = {}

    def applyAOStar(self):          # starts the recursive AO* algorithm
        self.aoStar(self.start, False)

    def getNeighbors(self, v):      # gets the neighbors of a given node
        return self.graph.get(v, '')

    def getStatus(self, v):         # returns the status of a given node
        return self.status.get(v, 0)

    def setStatus(self, v, val):    # sets the status of a given node
        self.status[v] = val

    def getHeuristicNodeValue(self, n):
        return self.H.get(n, 0)     # always return the heuristic value of a given node

    def setHeuristicNodeValue(self, n, value):
        self.H[n] = value           # set the revised heuristic value of a given node

    def printSolution(self):
        print("FOR GRAPH SOLUTION, TRAVERSE THE GRAPH FROM THE START NODE:", self.start)
        print(" ")
        print(self.solutionGraph)
        print(" ")

    def computeMinimumCostChildNodes(self, v):
        # computes the minimum cost of the child nodes of a given node v
        minimumCost = 0
        costToChildNodeListDict = {}
        costToChildNodeListDict[minimumCost] = []
        flag = True
        for nodeInfoTupleList in self.getNeighbors(v):   # iterate over all sets of child node/s
            cost = 0
            nodeList = []
            for c, weight in nodeInfoTupleList:
                cost = cost + self.getHeuristicNodeValue(c) + weight
                nodeList.append(c)
            if flag == True:   # initialize minimum cost with the cost of the first set of child node/s
                minimumCost = cost
                costToChildNodeListDict[minimumCost] = nodeList   # set the minimum cost child node/s
                flag = False
            else:              # compare the current cost with the minimum cost found so far
                if minimumCost > cost:
                    minimumCost = cost
                    costToChildNodeListDict[minimumCost] = nodeList   # set the minimum cost child node/s
        return minimumCost, costToChildNodeListDict[minimumCost]
        # returns the minimum cost and the minimum cost child node/s

    def aoStar(self, v, backTracking):
        # AO* algorithm for a start node and a backTracking status flag
        print("HEURISTIC VALUES :", self.H)
        print("SOLUTION GRAPH :", self.solutionGraph)
        print("PROCESSING NODE :", v)
        print(" ")
        if self.getStatus(v) >= 0:
            # if the status of node v is >= 0, compute the minimum cost nodes of v
            minimumCost, childNodeList = self.computeMinimumCostChildNodes(v)
            print(minimumCost, childNodeList)
            self.setHeuristicNodeValue(v, minimumCost)
            self.setStatus(v, len(childNodeList))
            solved = True   # check whether the minimum cost nodes of v are solved
            for childNode in childNodeList:
                self.parent[childNode] = v
                if self.getStatus(childNode) != -1:
                    solved = solved & False
            if solved == True:
                # if the minimum cost nodes of v are solved, set the current node status to solved (-1)
                self.setStatus(v, -1)
                # update the solution graph with the solved nodes, which may be part of the solution
                self.solutionGraph[v] = childNodeList
            if v != self.start:
                # if the current node is not the start node, backtrack the current node value,
                # calling aoStar on the parent with the backtracking status set to true
                self.aoStar(self.parent[v], True)
            if backTracking == False:   # check that the current call is not for backtracking
                for childNode in childNodeList:      # for each minimum cost child node
                    self.setStatus(childNode, 0)     # set the status of the child node to 0 (needs exploration)
                    # the minimum cost child node is further explored with backtracking status as false
                    self.aoStar(childNode, False)

# for simplicity we'll consider the heuristic distances as given
print("Graph - 1")
h1 = {'A': 1, 'B': 6, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 5, 'H': 7, 'I': 7, 'J': 1}
graph1 = {
    'A': [[('B', 1), ('C', 1)], [('D', 1)]],
    'B': [[('G', 1)], [('H', 1)]],
    'C': [[('J', 1)]],
    'D': [[('E', 1), ('F', 1)]],
    'G': [[('I', 1)]]
}
G1 = Graph(graph1, h1, 'A')
G1.applyAOStar()
G1.printSolution()
print ("Graph - 2")
h2 = {'A': 1, 'B': 6, 'C': 12, 'D': 10, 'E': 4, 'F': 4, 'G': 5, 'H': 7} # Heuristic values of Nodes
graph2 = { # Graph of Nodes and Edges
'A': [[('B', 1), ('C', 1)], [('D', 1)]], # Neighbors of Node 'A', B, C & D with repective weights
'B': [[('G', 1)], [('H', 1)]], # Neighbors are included in a list of lists
'D': [[('E', 1), ('F', 1)]] # Each sublist indicate a "OR" node or "AND" nodes
}
G2 = Graph(graph2, h2, 'A') # Instantiate Graph object with graph, heuristic values and start Node
G2.applyAOStar() # Run the AO* algorithm
G2.printSolution() # Print the solution graph as output of the AO* algorithm search
OUTPUT:
Graph - 1
HEURISTIC VALUES : {'A': 1, 'B': 6, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 5, 'H': 7, 'I': 7, 'J': 1}
SOLUTION GRAPH : {}
PROCESSING NODE : A
10 ['B', 'C']
HEURISTIC VALUES : {'A': 10, 'B': 6, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 5, 'H': 7, 'I': 7, 'J': 1}
SOLUTION GRAPH : {}
PROCESSING NODE : B
6 ['G']
HEURISTIC VALUES : {'A': 10, 'B': 6, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 5, 'H': 7, 'I': 7, 'J': 1}
SOLUTION GRAPH : {}
PROCESSING NODE : A
10 ['B', 'C']
HEURISTIC VALUES : {'A': 10, 'B': 6, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 5, 'H': 7, 'I': 7, 'J': 1}
SOLUTION GRAPH : {}


PROCESSING NODE : G
8 ['I']
HEURISTIC VALUES : {'A': 10, 'B': 6, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 8, 'H': 7, 'I': 7, 'J': 1}
SOLUTION GRAPH : {}
PROCESSING NODE : B
8 ['H']
HEURISTIC VALUES : {'A': 10, 'B': 8, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 8, 'H': 7, 'I': 7, 'J': 1}
SOLUTION GRAPH : {}
PROCESSING NODE : A
12 ['B', 'C']
HEURISTIC VALUES : {'A': 12, 'B': 8, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 8, 'H': 7, 'I': 7, 'J': 1}
SOLUTION GRAPH : {}
PROCESSING NODE : I
0 []
HEURISTIC VALUES : {'A': 12, 'B': 8, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 8, 'H': 7, 'I': 0, 'J': 1}
SOLUTION GRAPH : {'I': []}
PROCESSING NODE : G
1 ['I']
HEURISTIC VALUES : {'A': 12, 'B': 8, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 1, 'H': 7, 'I': 0, 'J': 1}
SOLUTION GRAPH : {'I': [], 'G': ['I']}
PROCESSING NODE : B
2 ['G']
HEURISTIC VALUES : {'A': 12, 'B': 2, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 1, 'H': 7, 'I': 0, 'J': 1}
SOLUTION GRAPH : {'I': [], 'G': ['I'], 'B': ['G']}
PROCESSING NODE : A
6 ['B', 'C']
HEURISTIC VALUES : {'A': 6, 'B': 2, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 1, 'H': 7, 'I': 0, 'J': 1}
SOLUTION GRAPH : {'I': [], 'G': ['I'], 'B': ['G']}
PROCESSING NODE : C
2 ['J']
HEURISTIC VALUES : {'A': 6, 'B': 2, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 1, 'H': 7, 'I': 0, 'J': 1}
SOLUTION GRAPH : {'I': [], 'G': ['I'], 'B': ['G']}
PROCESSING NODE : A
6 ['B', 'C']
HEURISTIC VALUES : {'A': 6, 'B': 2, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 1, 'H': 7, 'I': 0, 'J': 1}
SOLUTION GRAPH : {'I': [], 'G': ['I'], 'B': ['G']}
PROCESSING NODE : J
0 []
HEURISTIC VALUES : {'A': 6, 'B': 2, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 1, 'H': 7, 'I': 0, 'J': 0}
SOLUTION GRAPH : {'I': [], 'G': ['I'], 'B': ['G'], 'J': []}


PROCESSING NODE : C
1 ['J']
HEURISTIC VALUES : {'A': 6, 'B': 2, 'C': 1, 'D': 12, 'E': 2, 'F': 1, 'G': 1, 'H': 7, 'I': 0, 'J': 0}
SOLUTION GRAPH : {'I': [], 'G': ['I'], 'B': ['G'], 'J': [], 'C': ['J']}
PROCESSING NODE : A
5 ['B', 'C']
FOR GRAPH SOLUTION, TRAVERSE THE GRAPH FROM THE START NODE: A
{'I': [], 'G': ['I'], 'B': ['G'], 'J': [], 'C': ['J'], 'A': ['B', 'C']}

INPUT 2
print ("Graph - 2")
h2 = {'A': 1, 'B': 6, 'C': 12, 'D': 10, 'E': 4, 'F': 4, 'G': 5, 'H': 7} # Heuristic values of Nodes
graph2 = { # Graph of Nodes and Edges
'A': [[('B', 1), ('C', 1)], [('D', 1)]], # Neighbors of Node 'A', B, C & D with repective weights
'B': [[('G', 1)], [('H', 1)]], # Neighbors are included in a list of lists
'D': [[('E', 1), ('F', 1)]] # Each sublist indicate a "OR" node or "AND" nodes
}

G2 = Graph(graph2, h2, 'A') # Instantiate Graph object with graph, heuristic values and start Node
G2.applyAOStar() # Run the AO* algorithm
G2.printSolution() # Print the solution graph as output of the AO* algorithm search

OUTPUT:
Graph - 2
HEURISTIC VALUES : {'A': 1, 'B': 6, 'C': 12, 'D': 10, 'E': 4, 'F': 4, 'G': 5, 'H': 7}
SOLUTION GRAPH : {}
PROCESSING NODE : A
11 ['D']
HEURISTIC VALUES : {'A': 11, 'B': 6, 'C': 12, 'D': 10, 'E': 4, 'F': 4, 'G': 5, 'H': 7}
SOLUTION GRAPH : {}
PROCESSING NODE : D
10 ['E', 'F']
HEURISTIC VALUES : {'A': 11, 'B': 6, 'C': 12, 'D': 10, 'E': 4, 'F': 4, 'G': 5, 'H': 7}
SOLUTION GRAPH : {}


PROCESSING NODE : A
11 ['D']
HEURISTIC VALUES : {'A': 11, 'B': 6, 'C': 12, 'D': 10, 'E': 4, 'F': 4, 'G': 5, 'H': 7}
SOLUTION GRAPH : {}
PROCESSING NODE : E
0 []
HEURISTIC VALUES : {'A': 11, 'B': 6, 'C': 12, 'D': 10, 'E': 0, 'F': 4, 'G': 5, 'H': 7}
SOLUTION GRAPH : {'E': []}
PROCESSING NODE : D
6 ['E', 'F']
HEURISTIC VALUES : {'A': 11, 'B': 6, 'C': 12, 'D': 6, 'E': 0, 'F': 4, 'G': 5, 'H': 7}
SOLUTION GRAPH : {'E': []}
PROCESSING NODE : A
7 ['D']
HEURISTIC VALUES : {'A': 7, 'B': 6, 'C': 12, 'D': 6, 'E': 0, 'F': 4, 'G': 5, 'H': 7}
SOLUTION GRAPH : {'E': []}
PROCESSING NODE : F
0 []
HEURISTIC VALUES : {'A': 7, 'B': 6, 'C': 12, 'D': 6, 'E': 0, 'F': 0, 'G': 5, 'H': 7}
SOLUTION GRAPH : {'E': [], 'F': []}
PROCESSING NODE : D
2 ['E', 'F']
HEURISTIC VALUES : {'A': 7, 'B': 6, 'C': 12, 'D': 2, 'E': 0, 'F': 0, 'G': 5, 'H': 7}
SOLUTION GRAPH : {'E': [], 'F': [], 'D': ['E', 'F']}
PROCESSING NODE : A
3 ['D']
FOR GRAPH SOLUTION, TRAVERSE THE GRAPH FROM THE START NODE: A
{'E': [], 'F': [], 'D': ['E', 'F'], 'A': ['D']}
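The final heuristic values above can be verified against the AND/OR cost rule cost(v) = sum over the cheapest child set of (h(child) + edge weight): once E and F are solved (h = 0), the AND arc D → {E, F} costs (0 + 1) + (0 + 1) = 2, and A's best arc A → D then costs 2 + 1 = 3, matching the values printed for D and A.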

3. For a given set of training data examples stored in a .CSV file, implement and demonstrate the Candidate-Elimination algorithm to output a description of the set of all hypotheses consistent with the training examples.
import numpy as np
import pandas as pd

data = pd.read_csv('enjoysport.csv')
concepts = np.array(data.iloc[:, 0:-1])
print(concepts)
target = np.array(data.iloc[:, -1])
print(target)

def learn(concepts, target):
    specific_h = concepts[0].copy()
    print("initialization of specific_h and general_h")
    print(specific_h)
    general_h = [["?" for i in range(len(specific_h))] for i in range(len(specific_h))]
    print(general_h)
    for i, h in enumerate(concepts):
        print("For Loop Starts")
        if target[i] == "yes":
            print("If instance is Positive ")
            for x in range(len(specific_h)):
                if h[x] != specific_h[x]:
                    specific_h[x] = '?'
                    general_h[x][x] = '?'
        if target[i] == "no":
            print("If instance is Negative ")
            for x in range(len(specific_h)):
                if h[x] != specific_h[x]:
                    general_h[x][x] = specific_h[x]
                else:
                    general_h[x][x] = '?'
        print(" steps of Candidate Elimination Algorithm", i + 1)
        print(specific_h)
        print(general_h)
        print("\n")
    # drop fully general hypotheses from the general boundary
    indices = [i for i, val in enumerate(general_h) if val == ['?', '?', '?', '?', '?', '?']]
    for i in indices:
        general_h.remove(['?', '?', '?', '?', '?', '?'])
    return specific_h, general_h

s_final, g_final = learn(concepts, target)
print("Final Specific_h:", s_final, sep="\n")
print("Final General_h:", g_final, sep="\n")

OUTPUT:

[['sunny' 'warm' 'normal' 'strong' 'warm' 'same']


['sunny' 'warm' 'high' 'strong' 'warm' 'same']
['rainy' 'cold' 'high' 'strong' 'warm' 'change']
['sunny' 'warm' 'high' 'strong' 'cool' 'change']]
['yes' 'yes' 'no' 'yes']
initialization of specific_h and general_h
['sunny' 'warm' 'normal' 'strong' 'warm' 'same']
[['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'],
['?', '?', '?', '?', '?', '?']]

For Loop Starts


If instance is Positive
steps of Candidate Elimination Algorithm 1


['sunny' 'warm' 'normal' 'strong' 'warm' 'same']


[['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'],
['?', '?', '?', '?', '?', '?']]

For Loop Starts


If instance is Positive
steps of Candidate Elimination Algorithm 2
['sunny' 'warm' '?' 'strong' 'warm' 'same']
[['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'],
['?', '?', '?', '?', '?', '?']]

For Loop Starts


If instance is Negative
steps of Candidate Elimination Algorithm 3
['sunny' 'warm' '?' 'strong' 'warm' 'same']
[['sunny', '?', '?', '?', '?', '?'], ['?', 'warm', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', 'same']]

For Loop Starts


If instance is Positive
steps of Candidate Elimination Algorithm 4
['sunny' 'warm' '?' 'strong' '?' '?']
[['sunny', '?', '?', '?', '?', '?'], ['?', 'warm', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?']]

Final Specific_h:
['sunny' 'warm' '?' 'strong' '?' '?']
Final General_h:
[['sunny', '?', '?', '?', '?', '?'], ['?', 'warm', '?', '?', '?', '?']]
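Reading the result: the final specific boundary S = ⟨sunny, warm, ?, strong, ?, ?⟩ and general boundary G = {⟨sunny, ?, ?, ?, ?, ?⟩, ⟨?, warm, ?, ?, ?, ?⟩} delimit the version space; every hypothesis at least as general as S and at least as specific as some member of G is consistent with all four training examples.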

4. Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use an
appropriate data set for building the decision tree and apply this knowledge to classify a new
sample.

Import Play Tennis Data


import pandas as pd
from pandas import DataFrame
df_tennis = pd.read_csv('PlayTennis.csv')   # DataFrame.from_csv is deprecated; use pd.read_csv
print("\n Given Play Tennis Data Set:\n\n", df_tennis)
Given Play Tennis Data Set:
PlayTennis Outlook Temperature Humidity Windy


0 NO sunny hot high weak


1 NO sunny hot high strong
2 YES overcast hot high weak
3 YES rainy mild high weak
4 YES rainy cool normal weak
5 NO rainy cool normal strong
6 YES overcast cool normal strong
7 NO sunny mild high weak
8 YES sunny cool normal weak
9 YES rainy mild normal weak
10 YES sunny mild normal strong
11 YES overcast mild high strong
12 YES overcast hot normal weak
13 NO rainy mild high strong

df_tennis.keys()[0]

'PlayTennis'

Entropy of the Training Data Set


# Function to calculate the entropy of a probability distribution of observations
#   entropy = sum(-p * log2(p))
def entropy(probs):
    import math
    return sum([-prob * math.log(prob, 2) for prob in probs])

# Function to calculate the entropy of the given data set/list with respect to the target attribute
def entropy_of_list(a_list):
    from collections import Counter
    cnt = Counter(x for x in a_list)   # Counter calculates the proportion of each class
    print("\nClasses:", cnt)
    num_instances = len(a_list)        # = 14 for the full data set
    print("\n Number of Instances of the Current Sub Class is {0}:".format(num_instances))
    probs = [x / num_instances for x in cnt.values()]   # x is the count of YES/NO
    print(probs)
    print("\n Classes:", list(cnt.keys()))
    print(" \n Probabilities of Class {0} is {1}:".format(min(cnt), min(probs)))
    print(" \n Probabilities of Class {0} is {1}:".format(max(cnt), max(probs)))
    return entropy(probs)   # call entropy

# The initial entropy of the YES/NO attribute for our data set.
print("\n INPUT DATA SET FOR ENTROPY CALCULATION:\n", df_tennis['PlayTennis'])
total_entropy = entropy_of_list(df_tennis['PlayTennis'])
print("\n Total Entropy of PlayTennis Data Set:", total_entropy)
INPUT DATA SET FOR ENTROPY CALCULATION:
0 NO
1 NO
2 YES
3 YES
4 YES
5 NO
6 YES
7 NO
8 YES
9 YES
10 YES
11 YES
12 YES
13 NO
Name: PlayTennis, dtype: object
Classes: Counter({'YES': 9, 'NO': 5})
Number of Instances of the Current Sub Class is 14:
[0.35714285714285715, 0.6428571428571429]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.35714285714285715:
Probabilities of Class YES is 0.6428571428571429:


Total Entropy of PlayTennis Data Set: 0.9402859586706309
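This value can be checked by hand. With 9 YES and 5 NO examples out of 14:

Entropy(S) = -(9/14) log2(9/14) - (5/14) log2(5/14) ≈ 0.940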

Information Gain of Attributes


def information_gain(df, split_attribute_name, target_attribute_name, trace=0):
    print("Information Gain Calculation of ", split_attribute_name)
    # split data by the possible values of the attribute:
    df_split = df.groupby(split_attribute_name)
    print("split:", type(df_split))
    for name, group in df_split:
        print("Name:\n", name)
        print("Group:\n", group)
    # calculate the entropy for the target attribute, as well as
    # the proportion of observations in each data split
    nobs = len(df.index)
    # define the aggregation based on the target attribute name
    df_agg_ent = df_split.agg({target_attribute_name: [entropy_of_list,
                               lambda x: len(x) / nobs]})[target_attribute_name]
    print(df_agg_ent.columns)
    print("DFAGGENT", df_agg_ent)
    df_agg_ent.columns = ['Entropy', 'PropObservations']
    # calculate information gain:
    new_entropy = sum(df_agg_ent['Entropy'] * df_agg_ent['PropObservations'])
    old_entropy = entropy_of_list(df[target_attribute_name])
    return old_entropy - new_entropy

print('Info-gain for Outlook is :' + str(information_gain(df_tennis, 'Outlook', 'PlayTennis')), "\n")
print('\n Info-gain for Humidity is: ' + str(information_gain(df_tennis, 'Humidity', 'PlayTennis')), "\n")
print('\n Info-gain for Windy is:' + str(information_gain(df_tennis, 'Windy', 'PlayTennis')), "\n")
print('\n Info-gain for Temperature is:' + str(information_gain(df_tennis, 'Temperature', 'PlayTennis')), "\n")
Information Gain Calculation of Outlook
split: <class 'pandas.core.groupby.groupby.DataFrameGroupBy'>
Name:
overcast
Group:
PlayTennis Outlook Temperature Humidity Windy
2 YES overcast hot high weak
6 YES overcast cool normal strong
11 YES overcast mild high strong
12 YES overcast hot normal weak
Name:
rainy
Group:
PlayTennis Outlook Temperature Humidity Windy
3 YES rainy mild high weak
4 YES rainy cool normal weak
5 NO rainy cool normal strong
9 YES rainy mild normal weak
13 NO rainy mild high strong
Name:
sunny
Group:
PlayTennis Outlook Temperature Humidity Windy
0 NO sunny hot high weak
1 NO sunny hot high strong
7 NO sunny mild high weak
8 YES sunny cool normal weak
10 YES sunny mild normal strong
Classes: Counter({'YES': 4})
Number of Instances of the Current Sub Class is 4:
[1.0]


Classes: ['YES']
Probabilities of Class YES is 1.0:
Probabilities of Class YES is 1.0:
Classes: Counter({'YES': 3, 'NO': 2})
Number of Instances of the Current Sub Class is 5:
[0.6, 0.4]
Classes: ['YES', 'NO']
Probabilities of Class NO is 0.4:
Probabilities of Class YES is 0.6:
Classes: Counter({'NO': 3, 'YES': 2})
Number of Instances of the Current Sub Class is 5:
[0.6, 0.4]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.4:
Probabilities of Class YES is 0.6:
Index(['entropy_of_list', '<lambda>'], dtype='object')
DFAGGENT entropy_of_list <lambda>
Outlook
overcast 0.000000 0.285714
rainy 0.970951 0.357143
sunny 0.970951 0.357143
Classes: Counter({'YES': 9, 'NO': 5})
Number of Instances of the Current Sub Class is 14:
[0.35714285714285715, 0.6428571428571429]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.35714285714285715:
Probabilities of Class YES is 0.6428571428571429:
Info-gain for Outlook is :0.2467498197744391
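This figure follows directly from the entropies printed above:

Gain(S, Outlook) = Entropy(S) - sum over values v of (|Sv|/|S|) * Entropy(Sv)
                 = 0.9403 - (0.2857*0.0000 + 0.3571*0.9710 + 0.3571*0.9710)
                 ≈ 0.2467

using the per-value entropies and proportions for overcast, rainy and sunny from the DFAGGENT table.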
Information Gain Calculation of Humidity
split: <class 'pandas.core.groupby.groupby.DataFrameGroupBy'>
Name:
high
Group:


PlayTennis Outlook Temperature Humidity Windy


0 NO sunny hot high weak
1 NO sunny hot high strong
2 YES overcast hot high weak
3 YES rainy mild high weak
7 NO sunny mild high weak
11 YES overcast mild high strong
13 NO rainy mild high strong
Name:
normal
Group:
PlayTennis Outlook Temperature Humidity Windy
4 YES rainy cool normal weak
5 NO rainy cool normal strong
6 YES overcast cool normal strong
8 YES sunny cool normal weak
9 YES rainy mild normal weak
10 YES sunny mild normal strong
12 YES overcast hot normal weak
Classes: Counter({'NO': 4, 'YES': 3})
Number of Instances of the Current Sub Class is 7:
[0.5714285714285714, 0.42857142857142855]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.42857142857142855:
Probabilities of Class YES is 0.5714285714285714:
Classes: Counter({'YES': 6, 'NO': 1})
Number of Instances of the Current Sub Class is 7:
[0.8571428571428571, 0.14285714285714285]
Classes: ['YES', 'NO']
Probabilities of Class NO is 0.14285714285714285:
Probabilities of Class YES is 0.8571428571428571:
Index(['entropy_of_list', '<lambda>'], dtype='object')
DFAGGENT entropy_of_list <lambda>


Humidity
high 0.985228 0.5
normal 0.591673 0.5
Classes: Counter({'YES': 9, 'NO': 5})
Number of Instances of the Current Sub Class is 14:
[0.35714285714285715, 0.6428571428571429]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.35714285714285715:
Probabilities of Class YES is 0.6428571428571429:
Info-gain for Humidity is: 0.15183550136234136
Information Gain Calculation of Windy
split: <class 'pandas.core.groupby.groupby.DataFrameGroupBy'>
Name:
strong
Group:
PlayTennis Outlook Temperature Humidity Windy
1 NO sunny hot high strong
5 NO rainy cool normal strong
6 YES overcast cool normal strong
10 YES sunny mild normal strong
11 YES overcast mild high strong
13 NO rainy mild high strong
Name:
weak
Group:
PlayTennis Outlook Temperature Humidity Windy
0 NO sunny hot high weak
2 YES overcast hot high weak
3 YES rainy mild high weak
4 YES rainy cool normal weak
7 NO sunny mild high weak
8 YES sunny cool normal weak
9 YES rainy mild normal weak


12 YES overcast hot normal weak


Classes: Counter({'NO': 3, 'YES': 3})
Number of Instances of the Current Sub Class is 6:
[0.5, 0.5]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.5:
Probabilities of Class YES is 0.5:
Classes: Counter({'YES': 6, 'NO': 2})
Number of Instances of the Current Sub Class is 8:
[0.25, 0.75]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.25:
Probabilities of Class YES is 0.75:
Index(['entropy_of_list', '<lambda>'], dtype='object')
DFAGGENT entropy_of_list <lambda>
Windy
strong 1.000000 0.428571
weak 0.811278 0.571429
Classes: Counter({'YES': 9, 'NO': 5})
Number of Instances of the Current Sub Class is 14:
[0.35714285714285715, 0.6428571428571429]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.35714285714285715:
Probabilities of Class YES is 0.6428571428571429:
Info-gain for Windy is:0.04812703040826927
Information Gain Calculation of Temperature
split: <class 'pandas.core.groupby.groupby.DataFrameGroupBy'>
Name:
cool
Group:
PlayTennis Outlook Temperature Humidity Windy
4 YES rainy cool normal weak
5 NO rainy cool normal strong


6 YES overcast cool normal strong


8 YES sunny cool normal weak
Name:
hot
Group:
PlayTennis Outlook Temperature Humidity Windy
0 NO sunny hot high weak
1 NO sunny hot high strong
2 YES overcast hot high weak
12 YES overcast hot normal weak
Name:
mild
Group:
PlayTennis Outlook Temperature Humidity Windy
3 YES rainy mild high weak
7 NO sunny mild high weak
9 YES rainy mild normal weak
10 YES sunny mild normal strong
11 YES overcast mild high strong
13 NO rainy mild high strong
Classes: Counter({'YES': 3, 'NO': 1})
Number of Instances of the Current Sub Class is 4:
[0.75, 0.25]
Classes: ['YES', 'NO']
Probabilities of Class NO is 0.25:
Probabilities of Class YES is 0.75:
Classes: Counter({'NO': 2, 'YES': 2})
Number of Instances of the Current Sub Class is 4:
[0.5, 0.5]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.5:
Probabilities of Class YES is 0.5:
Classes: Counter({'YES': 4, 'NO': 2})


Number of Instances of the Current Sub Class is 6:
[0.6666666666666666, 0.3333333333333333]
Classes: ['YES', 'NO']
Probabilities of Class NO is 0.3333333333333333:
Probabilities of Class YES is 0.6666666666666666:
Index(['entropy_of_list', '<lambda>'], dtype='object')
DFAGGENT     entropy_of_list  <lambda>
Temperature
cool                0.811278  0.285714
hot                 1.000000  0.285714
mild                0.918296  0.428571
Classes: Counter({'YES': 9, 'NO': 5})
Number of Instances of the Current Sub Class is 14:
[0.35714285714285715, 0.6428571428571429]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.35714285714285715:
Probabilities of Class YES is 0.6428571428571429:
Info-gain for Temperature is:0.029222565658954647

ID3 Algorithm
def id3(df, target_attribute_name, attribute_names, default_class=None):
    ## Tally target attribute:
    from collections import Counter
    cnt = Counter(x for x in df[target_attribute_name])   # counts of the YES/NO classes

    ## First check: Is this split of the dataset homogeneous?
    if len(cnt) == 1:
        return next(iter(cnt))   # the single remaining class becomes the leaf label

    ## Second check: Is this split of the dataset empty?
    # if yes, return a default value
    elif df.empty or (not attribute_names):
        return default_class   # return None for an empty data set

    ## Otherwise: This dataset is ready to be divided up!
    else:
        # Get the default value for the next recursive call of this function:
        default_class = max(cnt.keys())
        # Compute the information gain of the attributes:
        gainz = [information_gain(df, attr, target_attribute_name) for attr in attribute_names]
        index_of_max = gainz.index(max(gainz))   # index of the best attribute
        # Choose the best attribute to split on:
        best_attr = attribute_names[index_of_max]

        # Create an empty tree, to be populated in a moment
        tree = {best_attr: {}}   # initiate the tree with the best attribute as a node
        remaining_attribute_names = [i for i in attribute_names if i != best_attr]

        # Split the dataset; on each split, recursively call this algorithm
        # and populate the empty tree with subtrees, which are
        # the result of the recursive call
        for attr_val, data_subset in df.groupby(best_attr):
            subtree = id3(data_subset,
                          target_attribute_name,
                          remaining_attribute_names,
                          default_class)
            tree[best_attr][attr_val] = subtree
        return tree

Predicting Attributes
# Get Predictor Names (all but 'class')
attribute_names = list(df_tennis.columns)
print("List of Attributes:", attribute_names)
attribute_names.remove('PlayTennis') #Remove the class attribute
print("Predicting Attributes:", attribute_names)
List of Attributes: ['PlayTennis', 'Outlook', 'Temperature', 'Humidity', 'Windy']


Predicting Attributes: ['Outlook', 'Temperature', 'Humidity', 'Windy']

# Run Algorithm:
from pprint import pprint
tree = id3(df_tennis,'PlayTennis',attribute_names)
print("\n\nThe Resultant Decision Tree is :\n")
#print(tree)
pprint(tree)
attribute = next(iter(tree))
print("Best Attribute :\n",attribute)
print("Tree Keys:\n",tree[attribute].keys())
Information Gain Calculation of Outlook
split: <class 'pandas.core.groupby.groupby.DataFrameGroupBy'>
Name:
overcast
Group:
    PlayTennis   Outlook Temperature Humidity   Windy
2          YES  overcast         hot     high    weak
6          YES  overcast        cool   normal  strong
11         YES  overcast        mild     high  strong
12         YES  overcast         hot   normal    weak
Name:
rainy
Group:
    PlayTennis Outlook Temperature Humidity   Windy
3          YES   rainy        mild     high    weak
4          YES   rainy        cool   normal    weak
5           NO   rainy        cool   normal  strong
9          YES   rainy        mild   normal    weak
13          NO   rainy        mild     high  strong
Name:
sunny
Group:

PlayTennis Outlook Temperature Humidity Windy


0 NO sunny hot high weak
1 NO sunny hot high strong
7 NO sunny mild high weak
8 YES sunny cool normal weak
10 YES sunny mild normal strong

Classes: Counter({'YES': 4})

Number of Instances of the Current Sub Class is 4:


[1.0]

Classes: ['YES']

Probabilities of Class YES is 1.0:

Probabilities of Class YES is 1.0:

Classes: Counter({'YES': 3, 'NO': 2})

Number of Instances of the Current Sub Class is 5:


[0.6, 0.4]

Classes: ['YES', 'NO']

Probabilities of Class NO is 0.4:


Probabilities of Class YES is 0.6:
Classes: Counter({'NO': 3, 'YES': 2})
Number of Instances of the Current Sub Class is 5:
[0.6, 0.4]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.4:
Probabilities of Class YES is 0.6:


Index(['entropy_of_list', '<lambda>'], dtype='object')


DFAGGENT entropy_of_list <lambda>
Outlook
overcast 0.000000 0.285714
rainy 0.970951 0.357143
sunny 0.970951 0.357143
Classes: Counter({'YES': 9, 'NO': 5})
Number of Instances of the Current Sub Class is 14:
[0.35714285714285715, 0.6428571428571429]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.35714285714285715:
Probabilities of Class YES is 0.6428571428571429:
Information Gain Calculation of Temperature
split: <class 'pandas.core.groupby.groupby.DataFrameGroupBy'>
Name:
cool
Group:
PlayTennis Outlook Temperature Humidity Windy
4 YES rainy cool normal weak
5 NO rainy cool normal strong
6 YES overcast cool normal strong
8 YES sunny cool normal weak
Name:
hot
Group:
PlayTennis Outlook Temperature Humidity Windy
0 NO sunny hot high weak
1 NO sunny hot high strong
2 YES overcast hot high weak
12 YES overcast hot normal weak
Name:
mild
Group:


PlayTennis Outlook Temperature Humidity Windy


3 YES rainy mild high weak
7 NO sunny mild high weak
9 YES rainy mild normal weak
10 YES sunny mild normal strong
11 YES overcast mild high strong
13 NO rainy mild high strong
Classes: Counter({'YES': 3, 'NO': 1})
Number of Instances of the Current Sub Class is 4:
[0.75, 0.25]
Classes: ['YES', 'NO']
Probabilities of Class NO is 0.25:
Probabilities of Class YES is 0.75:
Classes: Counter({'NO': 2, 'YES': 2})
Number of Instances of the Current Sub Class is 4:
[0.5, 0.5]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.5:
Probabilities of Class YES is 0.5:
Classes: Counter({'YES': 4, 'NO': 2})
Number of Instances of the Current Sub Class is 6:
[0.6666666666666666, 0.3333333333333333]
Classes: ['YES', 'NO']
Probabilities of Class NO is 0.3333333333333333:
Probabilities of Class YES is 0.6666666666666666:
Index(['entropy_of_list', '<lambda>'], dtype='object')
DFAGGENT entropy_of_list <lambda>
Temperature
cool 0.811278 0.285714
hot 1.000000 0.285714
mild 0.918296 0.428571
Classes: Counter({'YES': 9, 'NO': 5})
Number of Instances of the Current Sub Class is 14:


[0.35714285714285715, 0.6428571428571429]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.35714285714285715:
Probabilities of Class YES is 0.6428571428571429:
Information Gain Calculation of Humidity
split: <class 'pandas.core.groupby.groupby.DataFrameGroupBy'>
Name:
high
Group:
PlayTennis Outlook Temperature Humidity Windy
0 NO sunny hot high weak
1 NO sunny hot high strong
2 YES overcast hot high weak
3 YES rainy mild high weak
7 NO sunny mild high weak
11 YES overcast mild high strong
13 NO rainy mild high strong
Name:
normal
Group:
PlayTennis Outlook Temperature Humidity Windy
4 YES rainy cool normal weak
5 NO rainy cool normal strong
6 YES overcast cool normal strong
8 YES sunny cool normal weak
9 YES rainy mild normal weak
10 YES sunny mild normal strong
12 YES overcast hot normal weak
Classes: Counter({'NO': 4, 'YES': 3})
Number of Instances of the Current Sub Class is 7:
[0.5714285714285714, 0.42857142857142855]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.42857142857142855:


Probabilities of Class YES is 0.5714285714285714:


Classes: Counter({'YES': 6, 'NO': 1})
Number of Instances of the Current Sub Class is 7:
[0.8571428571428571, 0.14285714285714285]
Classes: ['YES', 'NO']
Probabilities of Class NO is 0.14285714285714285:
Probabilities of Class YES is 0.8571428571428571:
Index(['entropy_of_list', '<lambda>'], dtype='object')
DFAGGENT entropy_of_list <lambda>
Humidity
high 0.985228 0.5
normal 0.591673 0.5
Classes: Counter({'YES': 9, 'NO': 5})
Number of Instances of the Current Sub Class is 14:
[0.35714285714285715, 0.6428571428571429]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.35714285714285715:
Probabilities of Class YES is 0.6428571428571429:
Information Gain Calculation of Windy
split: <class 'pandas.core.groupby.groupby.DataFrameGroupBy'>
Name:
strong
Group:
PlayTennis Outlook Temperature Humidity Windy
1 NO sunny hot high strong
5 NO rainy cool normal strong
6 YES overcast cool normal strong
10 YES sunny mild normal strong
11 YES overcast mild high strong
13 NO rainy mild high strong
Name:
weak
Group:


PlayTennis Outlook Temperature Humidity Windy


0 NO sunny hot high weak
2 YES overcast hot high weak
3 YES rainy mild high weak
4 YES rainy cool normal weak
7 NO sunny mild high weak
8 YES sunny cool normal weak
9 YES rainy mild normal weak
12 YES overcast hot normal weak
Classes: Counter({'NO': 3, 'YES': 3})
Number of Instances of the Current Sub Class is 6:
[0.5, 0.5]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.5:
Probabilities of Class YES is 0.5:
Classes: Counter({'YES': 6, 'NO': 2})
Number of Instances of the Current Sub Class is 8:
[0.25, 0.75]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.25:
Probabilities of Class YES is 0.75:
Index(['entropy_of_list', '<lambda>'], dtype='object')
DFAGGENT entropy_of_list <lambda>
Windy
strong 1.000000 0.428571
weak 0.811278 0.571429
Classes: Counter({'YES': 9, 'NO': 5})
Number of Instances of the Current Sub Class is 14:
[0.35714285714285715, 0.6428571428571429]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.35714285714285715:
Probabilities of Class YES is 0.6428571428571429:
Information Gain Calculation of Temperature


split: <class 'pandas.core.groupby.groupby.DataFrameGroupBy'>


Name:
cool
Group:
PlayTennis Outlook Temperature Humidity Windy
4 YES rainy cool normal weak
5 NO rainy cool normal strong
Name:
mild
Group:
PlayTennis Outlook Temperature Humidity Windy
3 YES rainy mild high weak
9 YES rainy mild normal weak
13 NO rainy mild high strong
Classes: Counter({'YES': 1, 'NO': 1})
Number of Instances of the Current Sub Class is 2:
[0.5, 0.5]
Classes: ['YES', 'NO']
Probabilities of Class NO is 0.5:
Probabilities of Class YES is 0.5:
Classes: Counter({'YES': 2, 'NO': 1})
Number of Instances of the Current Sub Class is 3:
[0.6666666666666666, 0.3333333333333333]
Classes: ['YES', 'NO']
Probabilities of Class NO is 0.3333333333333333:
Probabilities of Class YES is 0.6666666666666666:
Index(['entropy_of_list', '<lambda>'], dtype='object')
DFAGGENT entropy_of_list <lambda>
Temperature
cool 1.000000 0.4
mild 0.918296 0.6
Classes: Counter({'YES': 3, 'NO': 2})
Number of Instances of the Current Sub Class is 5:


[0.6, 0.4]
Classes: ['YES', 'NO']
Probabilities of Class NO is 0.4:
Probabilities of Class YES is 0.6:
Information Gain Calculation of Humidity
split: <class 'pandas.core.groupby.groupby.DataFrameGroupBy'>
Name:
high
Group:
PlayTennis Outlook Temperature Humidity Windy
3 YES rainy mild high weak
13 NO rainy mild high strong
Name:
normal
Group:
PlayTennis Outlook Temperature Humidity Windy
4 YES rainy cool normal weak
5 NO rainy cool normal strong
9 YES rainy mild normal weak
Classes: Counter({'YES': 1, 'NO': 1})
Number of Instances of the Current Sub Class is 2:
[0.5, 0.5]
Classes: ['YES', 'NO']
Probabilities of Class NO is 0.5:
Probabilities of Class YES is 0.5:
Classes: Counter({'YES': 2, 'NO': 1})
Number of Instances of the Current Sub Class is 3:
[0.6666666666666666, 0.3333333333333333]
Classes: ['YES', 'NO']
Probabilities of Class NO is 0.3333333333333333:
Probabilities of Class YES is 0.6666666666666666:
Index(['entropy_of_list', '<lambda>'], dtype='object')
DFAGGENT entropy_of_list <lambda>


Humidity
high 1.000000 0.4
normal 0.918296 0.6
Classes: Counter({'YES': 3, 'NO': 2})
Number of Instances of the Current Sub Class is 5:
[0.6, 0.4]
Classes: ['YES', 'NO']
Probabilities of Class NO is 0.4:
Probabilities of Class YES is 0.6:
Information Gain Calculation of Windy
split: <class 'pandas.core.groupby.groupby.DataFrameGroupBy'>
Name:
strong
Group:
PlayTennis Outlook Temperature Humidity Windy
5 NO rainy cool normal strong
13 NO rainy mild high strong
Name:
weak
Group:
PlayTennis Outlook Temperature Humidity Windy
3 YES rainy mild high weak
4 YES rainy cool normal weak
9 YES rainy mild normal weak
Classes: Counter({'NO': 2})

Number of Instances of the Current Sub Class is 2:


[1.0]
Classes: ['NO']
Probabilities of Class NO is 1.0:
Probabilities of Class NO is 1.0:
Classes: Counter({'YES': 3})
Number of Instances of the Current Sub Class is 3:


[1.0]
Classes: ['YES']
Probabilities of Class YES is 1.0:
Probabilities of Class YES is 1.0:
Index(['entropy_of_list', '<lambda>'], dtype='object')
DFAGGENT entropy_of_list <lambda>
Windy
strong 0.0 0.4
weak 0.0 0.6
Classes: Counter({'YES': 3, 'NO': 2})
Number of Instances of the Current Sub Class is 5:
[0.6, 0.4]
Classes: ['YES', 'NO']
Probabilities of Class NO is 0.4:
Probabilities of Class YES is 0.6:
Information Gain Calculation of Temperature
split: <class 'pandas.core.groupby.groupby.DataFrameGroupBy'>
Name:
cool
Group:
PlayTennis Outlook Temperature Humidity Windy
8 YES sunny cool normal weak
Name:
hot
Group:
PlayTennis Outlook Temperature Humidity Windy
0 NO sunny hot high weak
1 NO sunny hot high strong
Name:
mild
Group:
PlayTennis Outlook Temperature Humidity Windy
7 NO sunny mild high weak


10 YES sunny mild normal strong


Classes: Counter({'YES': 1})
Number of Instances of the Current Sub Class is 1:
[1.0]
Classes: ['YES']
Probabilities of Class YES is 1.0:
Probabilities of Class YES is 1.0:
Classes: Counter({'NO': 2})
Number of Instances of the Current Sub Class is 2:
[1.0]
Classes: ['NO']
Probabilities of Class NO is 1.0:
Probabilities of Class NO is 1.0:
Classes: Counter({'NO': 1, 'YES': 1})
Number of Instances of the Current Sub Class is 2:
[0.5, 0.5]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.5:
Probabilities of Class YES is 0.5:
Index(['entropy_of_list', '<lambda>'], dtype='object')
DFAGGENT entropy_of_list <lambda>
Temperature
cool 0.0 0.2
hot 0.0 0.4
mild 1.0 0.4
Classes: Counter({'NO': 3, 'YES': 2})
Number of Instances of the Current Sub Class is 5:
[0.6, 0.4]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.4:
Probabilities of Class YES is 0.6:
Information Gain Calculation of Humidity
split: <class 'pandas.core.groupby.groupby.DataFrameGroupBy'>


Name:
high
Group:
PlayTennis Outlook Temperature Humidity Windy
0 NO sunny hot high weak
1 NO sunny hot high strong
7 NO sunny mild high weak
Name:
normal
Group:
PlayTennis Outlook Temperature Humidity Windy
8 YES sunny cool normal weak
10 YES sunny mild normal strong
Classes: Counter({'NO': 3})
Number of Instances of the Current Sub Class is 3:
[1.0]
Classes: ['NO']
Probabilities of Class NO is 1.0:
Probabilities of Class NO is 1.0:
Classes: Counter({'YES': 2})
Number of Instances of the Current Sub Class is 2:
[1.0]
Classes: ['YES']
Probabilities of Class YES is 1.0:
Probabilities of Class YES is 1.0:
Index(['entropy_of_list', '<lambda>'], dtype='object')
DFAGGENT entropy_of_list <lambda>
Humidity
high 0.0 0.6
normal 0.0 0.4
Classes: Counter({'NO': 3, 'YES': 2})
Number of Instances of the Current Sub Class is 5:
[0.6, 0.4]


Classes: ['NO', 'YES']


Probabilities of Class NO is 0.4:
Probabilities of Class YES is 0.6:
Information Gain Calculation of Windy
split: <class 'pandas.core.groupby.groupby.DataFrameGroupBy'>
Name:
strong
Group:
PlayTennis Outlook Temperature Humidity Windy
1 NO sunny hot high strong
10 YES sunny mild normal strong
Name:
weak
Group:
PlayTennis Outlook Temperature Humidity Windy
0 NO sunny hot high weak
7 NO sunny mild high weak
8 YES sunny cool normal weak
Classes: Counter({'NO': 1, 'YES': 1})
Number of Instances of the Current Sub Class is 2:
[0.5, 0.5]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.5:
Probabilities of Class YES is 0.5:
Classes: Counter({'NO': 2, 'YES': 1})
Number of Instances of the Current Sub Class is 3:
[0.6666666666666666, 0.3333333333333333]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.3333333333333333:
Probabilities of Class YES is 0.6666666666666666:
Index(['entropy_of_list', '<lambda>'], dtype='object')
DFAGGENT entropy_of_list <lambda>
Windy


strong 1.000000 0.4
weak 0.918296 0.6
Classes: Counter({'NO': 3, 'YES': 2})
Number of Instances of the Current Sub Class is 5:
[0.6, 0.4]
Classes: ['NO', 'YES']
Probabilities of Class NO is 0.4:
Probabilities of Class YES is 0.6:

The Resultant Decision Tree is :

{'Outlook': {'overcast': 'YES',
             'rainy': {'Windy': {'strong': 'NO', 'weak': 'YES'}},
             'sunny': {'Humidity': {'high': 'NO', 'normal': 'YES'}}}}
Best Attribute :
 Outlook
Tree Keys:
 dict_keys(['overcast', 'rainy', 'sunny'])

DT = {'Outlook': {'Overcast': 'Yes',
                  'Rain': {'Wind': {'Strong': 'No', 'Weak': 'Yes'}},
                  'Sunny': {'Humidity': {'High': 'No', 'Normal': 'Yes'}}}}
testsample = {'Outlook': 'Sunny', 'Temperature': 'Hot', 'Humidity': 'High', 'Wind': 'Strong'}
dic = testsample
print("Test sample : ", dic)

# Walk down the tree: at each internal node, read the attribute being tested
# and follow the branch matching the test sample's value for that attribute,
# until a leaf label ('Yes' or 'No') is reached.
node = DT
while node != 'Yes' and node != 'No':
    attribute = next(iter(node))            # attribute tested at this node
    node = node[attribute][dic[attribute]]  # branch for the sample's value
print(" The test sample is classified as ", node)


Test sample :  {'Outlook': 'Sunny', 'Temperature': 'Hot', 'Humidity': 'High', 'Wind': 'Strong'}
 The test sample is classified as  No

5. Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the same using appropriate data sets.

import numpy as np
X = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)   # features (hrs slept, hrs studied)
y = np.array(([92], [86], [89]), dtype=float)         # labels (marks obtained)
c = np.amax(X, axis=0)   # column-wise maximum, used for normalization
print(c)
X = X / c                # normalize
y = y / 100
print(X)
print(y)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_grad(x):
    return x * (1 - x)

# Variable initialization
epoch = 4            # setting training iterations
eta = 0.3            # setting learning rate (eta)
input_neurons = 2    # number of features in the data set
hidden_neurons = 3   # number of hidden layer neurons
output_neurons = 1   # number of neurons at the output layer

# Weight and bias - random initialization
wh = np.random.uniform(size=(input_neurons, hidden_neurons))    # 2x3
print(wh)
bh = np.random.uniform(size=(1, hidden_neurons))                # 1x3
print(bh)
wout = np.random.uniform(size=(hidden_neurons, output_neurons)) # 3x1
print(wout)
bout = np.random.uniform(size=(1, output_neurons))
print(bout)

for i in range(epoch):
    # Forward propagation
    h_ip = np.dot(X, wh) + bh     # dot product + bias
    print(h_ip)
    h_act = sigmoid(h_ip)         # activation function
    o_ip = np.dot(h_act, wout) + bout
    output = sigmoid(o_ip)

    # Backpropagation
    # Error at the output layer
    Eo = y - output               # error at o/p
    outgrad = sigmoid_grad(output)
    d_output = Eo * outgrad       # Errj = Oj(1-Oj)(Tj-Oj)
    print("the d_output is", d_output)
    # Error at the hidden layer
    Eh = d_output.dot(wout.T)     # .T means transpose
    hiddengrad = sigmoid_grad(h_act)   # how much the hidden layer weights contributed to the error
    d_hidden = Eh * hiddengrad
    wout += h_act.T.dot(d_output) * eta   # dot product of next-layer error and current-layer output
    wh += X.T.dot(d_hidden) * eta

print("Normalized Input: \n" + str(X))
print("Actual Output: \n" + str(y))
print("Predicted Output: \n", output)

Output:

[3. 9.]
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
[[0.92]
[0.86]
[0.89]]
[[0.34391748 0.26893556 0.69278977]
[0.27114375 0.50486066 0.36606859]]
[[0.01416985 0.30257255 0.49180958]]
[[0.87639985]
[0.62305314]
[0.48385416]]
[[0.84524982]]
[[0.51459193 0.98672358 1.31973802]
[0.27944443 0.67269588 0.92611094]
[0.53884983 0.90808188 1.42864508]]
[[0.51459193 0.98672358 1.31973802]


[0.27944443 0.67269588 0.92611094]


[0.53884983 0.90808188 1.42864508]]
[[0.51459193 0.98672358 1.31973802]
[0.27944443 0.67269588 0.92611094]
[0.53884983 0.90808188 1.42864508]]
[[0.51459193 0.98672358 1.31973802]
[0.27944443 0.67269588 0.92611094]
[0.53884983 0.90808188 1.42864508]]
the d_output is [[ 0.00150286]
[-0.00302754]
[-0.0011524 ]]
Normalized Input:
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
Actual Output:
[[0.92]
[0.86]
[0.89]]
Predicted Output:
[[0.90286381]
[0.89123185]
[0.90317825]]
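
Four training iterations are deliberately few so that every intermediate matrix can be printed. As an illustrative extension (a sketch assuming the variables from the listing above, not part of the recorded run), the same updates can be repeated longer while tracking the mean squared error to watch it fall:

for i in range(5000):
    h_act = sigmoid(np.dot(X, wh) + bh)                     # forward pass
    output = sigmoid(np.dot(h_act, wout) + bout)
    d_output = (y - output) * sigmoid_grad(output)          # output-layer delta
    d_hidden = d_output.dot(wout.T) * sigmoid_grad(h_act)   # hidden-layer delta
    wout += h_act.T.dot(d_output) * eta
    wh += X.T.dot(d_hidden) * eta
    bout += np.sum(d_output, axis=0, keepdims=True) * eta
    bh += np.sum(d_hidden, axis=0, keepdims=True) * eta
    if i % 1000 == 0:
        print("MSE:", np.mean(np.square(y - output)))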

6. Write a program to implement the naïve Bayesian classifier for a sample training data set stored
as a .CSV file. Compute the accuracy of the classifier, considering few test data sets.

import csv
import random
import math

def loadCsv(filename):
    lines = csv.reader(open(filename, "r"))
    dataset = list(lines)
    for i in range(len(dataset)):
        dataset[i] = [float(x) for x in dataset[i]]
    return dataset

def splitDataset(dataset, splitRatio):
    trainSize = int(len(dataset) * splitRatio)
    trainSet = []
    copy = list(dataset)
    while len(trainSet) < trainSize:
        index = random.randrange(len(copy))
        trainSet.append(copy.pop(index))
    return [trainSet, copy]

def separateByClass(dataset):
    separated = {}
    for i in range(len(dataset)):
        vector = dataset[i]
        if (vector[-1] not in separated):
            separated[vector[-1]] = []
        separated[vector[-1]].append(vector)
    return separated

def mean(numbers):
    return sum(numbers)/float(len(numbers))

def stdev(numbers):
    avg = mean(numbers)
    variance = sum([pow(x-avg,2) for x in numbers])/float(len(numbers)-1)
    return math.sqrt(variance)

def summarize(dataset):
    summaries = [(mean(attribute), stdev(attribute)) for attribute in zip(*dataset)]
    del summaries[-1]        # drop the (mean, stdev) of the class column
    return summaries

def summarizeByClass(dataset):
    separated = separateByClass(dataset)
    summaries = {}
    for classValue, instances in separated.items():
        summaries[classValue] = summarize(instances)
    return summaries

def calculateProbability(x, mean, stdev):
    # Gaussian probability density of attribute value x
    exponent = math.exp(-(math.pow(x-mean,2)/(2*math.pow(stdev,2))))
    return (1 / (math.sqrt(2*math.pi) * stdev)) * exponent

def calculateClassProbabilities(summaries, inputVector):
    probabilities = {}
    for classValue, classSummaries in summaries.items():
        probabilities[classValue] = 1
        for i in range(len(classSummaries)):
            mean, stdev = classSummaries[i]
            x = inputVector[i]
            probabilities[classValue] *= calculateProbability(x, mean, stdev)
    return probabilities

def predict(summaries, inputVector):
    probabilities = calculateClassProbabilities(summaries, inputVector)
    bestLabel, bestProb = None, -1
    for classValue, probability in probabilities.items():
        if bestLabel is None or probability > bestProb:
            bestProb = probability
            bestLabel = classValue
    return bestLabel

def getPredictions(summaries, testSet):
    predictions = []
    for i in range(len(testSet)):
        result = predict(summaries, testSet[i])
        predictions.append(result)
    return predictions

def getAccuracy(testSet, predictions):
    correct = 0
    for i in range(len(testSet)):
        if testSet[i][-1] == predictions[i]:
            correct += 1
    return (correct/float(len(testSet))) * 100.0

def main():
    filename = 'pima-indians-diabetes.csv'
    splitRatio = 0.80
    dataset = loadCsv(filename)
    trainingSet, testSet = splitDataset(dataset, splitRatio)
    print('Split {0} rows into train={1} and test={2} rows'.format(len(dataset), len(trainingSet), len(testSet)))
    # prepare model
    summaries = summarizeByClass(trainingSet)
    # test model
    predictions = getPredictions(summaries, testSet)
    accuracy = getAccuracy(testSet, predictions)
    print('Accuracy: {0}%'.format(accuracy))

main()
OUTPUT:
Split 768 rows into train=614 and test=154 rows
Accuracy: 32.467532467532465%
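
For reference, calculateProbability evaluates the Gaussian density f(x) = (1 / (sqrt(2*pi) * stdev)) * exp(-(x - mean)^2 / (2 * stdev^2)), and calculateClassProbabilities multiplies these densities over all attributes of the input vector (the naive independence assumption). A quick sanity check with illustrative values (not taken from the dataset):

print(calculateProbability(71.5, 73.0, 6.2))   # hypothetical x, mean, stdev -> ~0.0625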

7. Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set for
clustering using k-Means algorithm. Compare the results of these two algorithms and comment
on the quality of clustering. You can add Java/Python ML library classes/API in the program.

import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.cluster import KMeans
import pandas as pd
import numpy as np
import sklearn.metrics as sm

iris = datasets.load_iris()
X = pd.DataFrame(iris.data)
X.columns = ['Sepal_Length','Sepal_Width','Petal_Length','Petal_Width']
y = pd.DataFrame(iris.target)
y.columns = ['Targets']

model = KMeans(n_clusters=3)
model.fit(X)
model.labels_            # cluster assignment (0/1/2) of every sample

plt.figure(figsize=(14,7))
colormap = np.array(['red', 'lime', 'black'])
plt.subplot(1, 2, 1)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y.Targets], s=40)
plt.title('Real (target) classes')
plt.subplot(1, 2, 2)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[model.labels_], s=40)
plt.title('K-Means clustering')

acc = sm.accuracy_score(y, model.labels_)   # sensitive to the arbitrary label permutation
print(acc * 100)

OUTPUT:
24.0
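
Note that the listing above colours the left subplot by the true iris targets and actually clusters only with k-Means; the EM step itself can be supplied by scikit-learn's GaussianMixture. A minimal sketch, assuming X, y, colormap, model and the imports from the listing:

from sklearn.mixture import GaussianMixture

gmm = GaussianMixture(n_components=3)        # mixture of 3 Gaussians fitted by EM
gmm_labels = gmm.fit(X).predict(X)
plt.figure(figsize=(7,7))
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[gmm_labels], s=40)
plt.title('EM (GaussianMixture) clustering')

# Cluster labels are arbitrary permutations of 0/1/2, so a raw accuracy
# score (like the 24.0 above) is not a reliable quality measure; a
# permutation-invariant index is more informative:
print(sm.adjusted_rand_score(y.Targets, model.labels_))  # k-Means vs truth
print(sm.adjusted_rand_score(y.Targets, gmm_labels))     # EM vs truth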

8. Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set. Print
both correct and wrong predictions. Java/Python ML library classes can be used for this problem.

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
import numpy as np

iris_dataset = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris_dataset["data"], iris_dataset["target"])
kn = KNeighborsClassifier()
kn.fit(X_train, y_train)
prediction = kn.predict(X_test)
print(f"ACCURACY: {kn.score(X_test, y_test)}")
target_names = iris_dataset.target_names
for pred, actual in zip(prediction, y_test):
    print(f"Prediction is {target_names[pred]} Actual is {target_names[actual]}")

OUTPUT:
ACCURACY: 0.9473684210526315
Prediction is virginica Actual is virginica
Prediction is virginica Actual is virginica
Prediction is virginica Actual is virginica
Prediction is setosa Actual is setosa
Prediction is virginica Actual is virginica
Prediction is virginica Actual is virginica
Prediction is virginica Actual is virginica
Prediction is setosa Actual is setosa
Prediction is setosa Actual is setosa
Prediction is versicolor Actual is virginica
Prediction is virginica Actual is virginica
Prediction is virginica Actual is virginica
Prediction is setosa Actual is setosa
Prediction is versicolor Actual is versicolor
Prediction is setosa Actual is setosa
Prediction is versicolor Actual is versicolor
Prediction is versicolor Actual is versicolor
Prediction is setosa Actual is setosa
Prediction is setosa Actual is setosa
Prediction is virginica Actual is virginica
Prediction is versicolor Actual is versicolor
Prediction is setosa Actual is setosa
Prediction is versicolor Actual is versicolor
Prediction is virginica Actual is virginica
Prediction is virginica Actual is virginica
Prediction is virginica Actual is versicolor
Prediction is setosa Actual is setosa
Prediction is versicolor Actual is versicolor
Prediction is versicolor Actual is versicolor
Prediction is setosa Actual is setosa
Prediction is setosa Actual is setosa
Prediction is versicolor Actual is versicolor
Prediction is virginica Actual is virginica
Prediction is versicolor Actual is versicolor
Prediction is setosa Actual is setosa
Prediction is virginica Actual is virginica
Prediction is versicolor Actual is versicolor
Prediction is setosa Actual is setosa
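
KNeighborsClassifier defaults to n_neighbors=5. As an illustrative extension (assuming the train/test split above), the effect of k can be checked with a short loop:

for k in [1, 3, 5, 7, 9]:
    knk = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    print(f"k={k} accuracy={knk.score(X_test, y_test):.4f}")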

9. Implement the non-parametric Locally Weighted Regression algorithm in order to fit data
points. Select appropriate data set for your experiment and draw graphs.

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

def kernel(point, xmat, k):
    m, n = np.shape(xmat)
    weights = np.mat(np.eye(m))
    for j in range(m):
        diff = point - xmat[j]       # use the parameter, not the global X
        weights[j, j] = np.exp(diff * diff.T / (-2.0 * k**2))
    return weights

def localWeight(point, xmat, ymat, k):
    wei = kernel(point, xmat, k)
    W = (xmat.T * (wei * xmat)).I * (xmat.T * (wei * ymat.T))
    return W

def localWeightRegression(xmat, ymat, k):
    m, n = np.shape(xmat)
    ypred = np.zeros(m)
    for i in range(m):
        ypred[i] = xmat[i] * localWeight(xmat[i], xmat, ymat, k)
    return ypred

data = pd.read_csv('10_data10_tips.csv')
bill = np.array(data.total_bill)
tip = np.array(data.tip)

mbill = np.mat(bill)
mtip = np.mat(tip)
m = np.shape(mbill)[1]
one = np.mat(np.ones(m))
X = np.hstack((one.T, mbill.T))      # design matrix: [1, total_bill] per row

ypred = localWeightRegression(X, mtip, 5)
SortIndex = X[:, 1].argsort(0)
xsort = X[SortIndex][:, 0]

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.scatter(bill, tip, color='blue')
ax.plot(xsort[:, 1], ypred[SortIndex], color='red', linewidth=8)
plt.xlabel('Total bill')
plt.ylabel('Tip')
plt.show()

OUTPUT:

[Figure] Regression with parameter k=3
[Figure] Regression with parameter k=8
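
For reference, each entry of ypred solves its own weighted least-squares problem. Matching the code above, for a query point x0 with bandwidth k:

    w_j = exp( -(x0 - x_j)(x0 - x_j)^T / (2k^2) )    (Gaussian kernel weight; diagonal of W, built in kernel)
    beta(x0) = (X^T W X)^(-1) X^T W y                (weighted normal equations, solved in localWeight)
    yhat(x0) = x0 . beta(x0)                         (the prediction computed in localWeightRegression)

A smaller bandwidth k weights only nearby points heavily and gives a wigglier, more local fit; a larger k approaches ordinary least squares, which is why the k=3 curve tracks the data more closely than the k=8 curve.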
