0% found this document useful (0 votes)

299 views21 pages

My ML Lab Manual

The document describes a Machine Learning Lab Manual for a university course. It outlines 10 experiments for students to complete related to machine learning algorithms. The first 3 experiments involve implementing the FIND-S algorithm, Candidate Elimination algorithm, and ID3 decision tree algorithm using appropriate datasets. The remaining experiments involve implementing algorithms like backpropagation neural networks, naive Bayes classifier, Bayesian networks, k-means clustering, k-NN classification and locally weighted regression.

Uploaded by

Anonymous qZP5Zyb2

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOC, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

299 views21 pages

My ML Lab Manual

Uploaded by

Anonymous qZP5Zyb2

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOC, PDF, TXT or read online on Scribd

You are on page 1/ 21

SJC Institute of Technology

Department of Computer Science & Engineering

MACHINE LEARNING LAB MANUAL

Designed and Compiled by: Prof. Harshavardhana Doddamani

MACHINE LEARNING LABORATORY
[As per Choice Based Credit System (CBCS) scheme]
(Effective from the academic year 2016 -2017)
SEMESTER – VII
Subject Code 15CSL76 IA Marks 20
Number of Lecture Hours/Week 01I + 02P Exam Marks 80
Total Number of Lecture Hours 40 Exam Hours 03
CREDITS – 02
Course objectives: This course will enable students to
1. Make use of Data sets in implementing the machine learning algorithms
2. Implement the machine learning concepts and algorithms in any suitable language of choice.
Description (If any):
1. The programs can be implemented in either JAVA or Python.
2. For Problems 1 to 6 and 10, programs are to be developed without using the built-in
classes or APIs of Java/Python.
3. Data sets can be taken from standard repositories (https://archive.ics.uci.edu/ml/datasets.html)
or constructed by the students.
Lab Experiments:
1. Implement and demonstrate the FIND-S algorithm for finding the most specific
hypothesis based on a given set of training data samples. Read the training data from a
.CSV file.
2. For a given set of training data examples stored in a .CSV file, implement and
demonstrate the Candidate-Elimination algorithm to output a description of the set of
all hypotheses consistent with the training examples.
3. Write a program to demonstrate the working of the decision tree based ID3 algorithm.
Use an appropriate data set for building the decision tree and apply this knowledge to
classify a new sample.
4. Build an Artificial Neural Network by implementing the Backpropagation algorithm
and test the same using appropriate data sets.
5. Write a program to implement the naïve Bayesian classifier for a sample training data set
stored as a .CSV file. Compute the accuracy of the classifier, considering few test data
sets.
6. Assuming a set of documents that need to be classified, use the naïve Bayesian
Classifier model to perform this task. Built-in Java classes/API can be used to write the
program. Calculate the accuracy, precision, and recall for your data set.
7. Write a program to construct a Bayesian network considering medical data. Use this
model to demonstrate the diagnosis of heart patients using standard Heart Disease Data
Set. You can use Java/Python ML library classes/API.
8. Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set
for clustering using k-Means algorithm. Compare the results of these two algorithms and
comment on the quality of clustering. You can add Java/Python ML library classes/API in
the program.
9. Write a program to implement k-Nearest Neighbour algorithm to classify the iris data
set. Print both correct and wrong predictions. Java/Python ML library classes can be used
for this problem.
10. Implement the non-parametric Locally Weighted Regression algorithm in order to fit
data points. Select appropriate data set for your experiment and draw graphs.
Study Experiment / Project:
NIL
Course outcomes: The students should be able to:
1. Understand the implementation procedures for the machine learning algorithms.
2. Design Java/Python programs for various Learning algorithms.
3. Apply appropriate data sets to the Machine Learning algorithms.
4. Identify and apply Machine Learning algorithms to solve real world problems.
Conduction of Practical Examination:
All laboratory experiments are to be included for practical examination.
Students are allowed to pick one experiment from the lot.
Strictly follow the instructions as printed on the cover page of answer script
Marks distribution: Procedure + Conduction + Viva:20 + 50 +10 (80)
Change of experiment is allowed only once and marks allotted to the procedure part to be
made zero.

Problem1: Implement and demonstrate the FIND-S algorithm for finding the
most specific hypothesis based on a given set of training data samples. Read the
training data from a .CSV file.

Algorithm:
1. Initialize h to the most specific hypothesis in H
2. For each positive training instance x
• For each attribute constraint ai in h
If the constraint ai in h is satisfied by x then do nothing
else replace ai in h by the next more general constraint that is satisfied by x
3. Output hypothesis h

Illustration:
Step1: Find S

Step2: Find S

Step2: Find S
Iteration 4 and Step 3: Find S

Source Code of the Program:

import random
import csv
attributes = [['Sunny','Rainy'],
['Warm','Cold'],
['Normal','High'],
['Strong','Weak'],
['Warm','Cool'],
['Same','Change']]

num_attributes = len (attributes)

print (" \n The most general hypothesis : ['?', '?', '?', '?', '?', '?']\n")
print ("\n The most specific hypothesis : ['0', '0', '0', '0', '0', '0']\n")

a=[]
print("\n The Given Training Data Set \n")

with open('C:\\Users\\hd\\Desktop\\Data\\tennis.csv', 'r') as csvFile:

reader = csv.reader (csvFile)
for row in reader:
a.append (row)
print (row)

print ("\n The initial value of hypothesis: ")

hypothesis = ['0'] * num_attributes
print(hypothesis)

# Comparing with First Training Example

for j in range(0, num_attributes):

hypothesis [j] = a[0] [j];

# Comparing with Remaining Training Examples of Given Data Set

print("\n Find S : Finding a Maximally Specific Hypothesis\n")

for i in range(0, len(a)) :

if a[i] [num_attributes] == 'Yes':

for j in range(0, num_attributes):

if a [i] [j] != hypothesis[j]:
hypothesis [j] = '?'
else :
hypothesis [j] = a [i] [j]
print(" For Training Example No : {0} the hypothesis is ".format (i) , hypothesis)
print("\n The Maximally Specific Hypothesis for a given Training Examples :\n")
print(hypothesis)
Output:

The most general hypothesis : ['?','?','?','?','?','?']

The most specific hypothesis : ['0','0','0','0','0','0']
The Given Training Data Set
['sunny', 'warm', 'normal', 'strong', 'warm', 'same', 'Yes']
['sunny', 'warm', 'high', 'strong', 'warm', 'same', 'Yes']
['rainy', 'cold', 'high', 'strong', 'warm', 'change', 'No']
['sunny', 'warm', 'high', 'strong', 'cool', 'change', 'Yes']
The initial value of hypothesis:
['0', '0', '0', '0', '0', '0']
Find S: Finding a Maximally Specific Hypothesis
For Training Example No :0 the hypothesis is ['sunny', 'warm', 'normal', 'strong', 'warm', 'same']
For Training Example No :1 the hypothesis is ['sunny', 'warm', '?', 'strong', 'warm', 'same']
For Training Example No :2 the hypothesis is ['sunny', 'warm', '?', 'strong', 'warm', 'same']
For Training Example No :3 the hypothesis is ['sunny', 'warm', '?', 'strong', '?', '?']
The Maximally Specific Hypothesis for a given Training Examples :
['sunny', 'warm', '?', 'strong', '?', '?']

Problem-2: For a given set of training data examples stored in a .CSV file,
implement and demonstrate the Candidate-Elimination algorithm to output a
description of the set of all hypotheses consistent with the training examples.
Trace – 1:
Trace – 2:

Trace – 3:
Final Version Space:

Source Code:
OUTPUT:
Problem – 3: Write a program to demonstrate the working of the decision tree
based ID3 algorithm. Use an appropriate data set for building the decision tree
and apply this knowledge to classify a new sample.

Algorithm:

Illustration: To illustrate the operation of ID3, let’s consider the learning task represented by the
below examples Compute the Gain and identify which attribute is the best as illustrated below
Day Outlook Temperature. Humidity Wind Play Tennis

D1 Sunny Hot High Weak No

D2 Sunny Hot High Strong No

D3 Overcast Hot High Weak Yes

D4 Rain Mild High Weak Yes

D5 Rain Cool Normal Weak Yes

D6 Rain Cool Normal Strong No

D7 Overcast Cool Normal Weak Yes

D8 Sunny Mild High Weak No

D9 Sunny Cold Normal Weak Yes

D10 Rain Mild Normal Strong Yes

D11 Sunny Mild Normal Strong Yes

D12 Overcast Mild High Strong Yes

D13 Overcast Hot Normal Weak Yes

D14 Rain Mild High Strong No

Which attribute to test at the root?

After first step:

Second step:

Second and third steps:

Source Code:
import pandas as pd
from pandas import DataFrame
df_tennis =
DataFrame.from_csv('C:\\Users\\HD\\Desktop\\Data\\PlayTennis.csv')
df_tennis

def entropy(probs): # Calulate the Entropy of given probability

import math
return sum( [-prob*math.log(prob, 2) for prob in probs] )
def entropy_of_list(a_list): # Entropy calculation of list of discrete
values (YES/NO)
from collections import Counter
cnt = Counter(x for x in a_list)
print("No and Yes Classes:",a_list.name,cnt)
num_instances = len(a_list)*1.0
probs = [x / num_instances for x in cnt.values()]
return entropy(probs) # Call Entropy:
# The initial entropy of the YES/NO attribute for our dataset.
#print(df_tennis['PlayTennis'])
total_entropy = entropy_of_list(df_tennis['PlayTennis'])
print("Entropy of given PlayTennis Data Set:",total_entropy)

Output :
No and Yes Classes : PlayTennis Counter({'Yes': 9, 'No': 5})
Entropy of given PlayTennis Data Set : 0.9402859586706309

Information Gain of Attributes

def information_gain(df, split_attribute_name, target_attribute_name,
trace=0):
print("Information Gain Calculation of ",split_attribute_name)
'''
Takes a DataFrame of attributes,and quantifies the entropy of a target
attribute after performing a split along the values of another attribute.
'''
# Split Data by Possible Vals of Attribute:
df_split = df.groupby(split_attribute_name)
#print(df_split.groups)
for name,group in df_split:
print(name)
print(group)
# Calculate Entropy for Target Attribute, as well as
# Proportion of Obs in Each Data-Split
nobs = len(df.index) * 1.0
#print("NOBS",nobs)
df_agg_ent = df_split.agg({target_attribute_name : [entropy_of_list,
lambda x: len(x)/nobs] })[target_attribute_name]
#print("DFAGGENT",df_agg_ent)
df_agg_ent.columns = ['Entropy', 'PropObservations']
#if trace: # helps understand what fxn is doing:
# print(df_agg_ent)
# Calculate Information Gain:
new_entropy = sum( df_agg_ent['Entropy'] *
df_agg_ent['PropObservations'] )
old_entropy = entropy_of_list(df[target_attribute_name])

return old_entropy - new_entropy

print('Info-gain for Outlook is :'+str( information_gain(df_tennis,
'Outlook', 'PlayTennis')),"\n")
print('\n Info-gain for Humidity is: ' + str( information_gain(df_tennis,
'Humidity', 'PlayTennis')),"\n")
print('\n Info-gain for Wind is:' + str( information_gain(df_tennis,
'Wind', 'PlayTennis')),"\n")
print('\n Info-gain for Temperature is:' +
str( information_gain(df_tennis, 'Temperature','PlayTennis')),"\n")

ID3 Algorithm
def id3(df, target_attribute_name, attribute_names, default_class=None):
## Tally target attribute:
from collections import Counter
cnt = Counter(x for x in df[target_attribute_name])# class of YES /NO
## First check: Is this split of the dataset homogeneous?
if len(cnt) == 1:
return next(iter(cnt))
## Second check: Is this split of the dataset empty?
# if yes, return a default value
elif df.empty or (not attribute_names):

return default_class
## Otherwise: This dataset is ready to be divvied up!
else:
# Get Default Value for next recursive call of this function:
default_class = max(cnt.keys()) #[index_of_max] # most common value of
target attribute in dataset
# Choose Best Attribute to split on:
gainz = [information_gain(df, attr, target_attribute_name) for attr in
attribute_names]
index_of_max = gainz.index(max(gainz))
best_attr = attribute_names[index_of_max]
# Create an empty tree, to be populated in a moment
tree = {best_attr:{}}
remaining_attribute_names = [i for i in attribute_names if i != best_attr]
# Split dataset
# On each split, recursively call this algorithm.
# populate the empty tree with subtrees, which
# are the result of the recursive call
for attr_val, data_subset in df.groupby(best_attr):
subtree = id3(data_subset,
target_attribute_name,
remaining_attribute_names,
default_class)
tree[best_attr][attr_val] = subtree
return tree

Predicting Attributes:
# Get Predictor Names (all but 'class')
attribute_names = list(df_tennis.columns)
print("List of Attributes:", attribute_names)
attribute_names.remove('PlayTennis') #Remove the class attribute
print("Predicting Attributes:", attribute_names)

Tree Construction:
# Run Algorithm:
from pprint import pprint
tree = id3(df_tennis,'PlayTennis',attribute_names)
print("\n\nThe Resultant Decision Tree is :\n")
pprint(tree)

Classification Accuracy:
def classify(instance, tree, default=None):
attribute = next(iter(tree))#tree.keys()[0]
if instance[attribute] in tree[attribute].keys():
result = tree[attribute][instance[attribute]]
if isinstance(result, dict): # this is a tree, delve deeper
return classify(instance, result)
else:
return result # this is a label
else:
return default

df_tennis['predicted'] = df_tennis.apply(classify, axis=1,

args=(tree,'No') )
# classify func allows for a default arg: when tree doesn't have answer
for a particular
# combitation of attribute-values, we can use 'no' as the default guess
print('Accuracy is:' +
str( sum(df_tennis['PlayTennis']==df_tennis['predicted'] ) /
(1.0*len(df_tennis.index)) ))
df_tennis[['PlayTennis', 'predicted']]

Structures of Feeling - Raymond Williams
100% (1)
Structures of Feeling - Raymond Williams
6 pages
Model of Human Behavior
100% (4)
Model of Human Behavior
2 pages
Power of Teamwork
100% (1)
Power of Teamwork
14 pages
Basic Research Approaches and Designs - An Overview - Amoud - 2020
100% (1)
Basic Research Approaches and Designs - An Overview - Amoud - 2020
18 pages
Giles 1987
No ratings yet
Giles 1987
37 pages
2.1.4. Basic Competencies (Integrated With 21st Century Skills) NC III
No ratings yet
2.1.4. Basic Competencies (Integrated With 21st Century Skills) NC III
74 pages
ML Lab Manual PDF
No ratings yet
ML Lab Manual PDF
9 pages
Psychology SET A PDF
100% (1)
Psychology SET A PDF
3 pages
Comparison Theories PDF
No ratings yet
Comparison Theories PDF
3 pages
Genene Abebe
No ratings yet
Genene Abebe
226 pages
St. Mary'S College of Bansalan, Inc.: of Education. While The A, B, and C Has Their Own Purposes
No ratings yet
St. Mary'S College of Bansalan, Inc.: of Education. While The A, B, and C Has Their Own Purposes
11 pages
Recruitment and Selection Questionnaire
100% (1)
Recruitment and Selection Questionnaire
2 pages
Correlative Study On Emotional Intelligence and Family Environment
No ratings yet
Correlative Study On Emotional Intelligence and Family Environment
7 pages
Fundamentals of Management mgt162 Group Assignment Report
No ratings yet
Fundamentals of Management mgt162 Group Assignment Report
20 pages
ML Lab Draft Manual
No ratings yet
ML Lab Draft Manual
46 pages
Unit 2 - Client Interviewing and Counselling (ChatGPT)
No ratings yet
Unit 2 - Client Interviewing and Counselling (ChatGPT)
10 pages
ML Record
No ratings yet
ML Record
19 pages
ML Lab Report
No ratings yet
ML Lab Report
8 pages
LC - G10 - PE - Q2 Week 1 2 1
No ratings yet
LC - G10 - PE - Q2 Week 1 2 1
4 pages
ML Lab Record
No ratings yet
ML Lab Record
30 pages
MLT Bcai 651 Lab Manual
No ratings yet
MLT Bcai 651 Lab Manual
42 pages
Machine Learning Lab Manual
No ratings yet
Machine Learning Lab Manual
12 pages
Cse Machine Learning Lab Manual
No ratings yet
Cse Machine Learning Lab Manual
22 pages
ML Lab Manual
No ratings yet
ML Lab Manual
46 pages
Bayesian Rationality The Probabilistic Approach To Human Reasoning Mike Oaksford Nick Chater Z-Lib
No ratings yet
Bayesian Rationality The Probabilistic Approach To Human Reasoning Mike Oaksford Nick Chater Z-Lib
317 pages
Machine Learning Techniques LAB FILE - KAI651
No ratings yet
Machine Learning Techniques LAB FILE - KAI651
16 pages
MLT Lab1
No ratings yet
MLT Lab1
27 pages
MANUAL
No ratings yet
MANUAL
34 pages
ML LAB Record
No ratings yet
ML LAB Record
35 pages
IT ML Lab
No ratings yet
IT ML Lab
35 pages
Lab Manual Final
No ratings yet
Lab Manual Final
34 pages
22K61A0618 - Removed - Lab Manual Sasi CLD
No ratings yet
22K61A0618 - Removed - Lab Manual Sasi CLD
25 pages
Original ML Lab Manual
No ratings yet
Original ML Lab Manual
22 pages
Machine Learning
No ratings yet
Machine Learning
27 pages
ABSTRACT-WPS Office
No ratings yet
ABSTRACT-WPS Office
10 pages
Learning Plateau Exam Questions
No ratings yet
Learning Plateau Exam Questions
6 pages
Problem Solving
No ratings yet
Problem Solving
3 pages
Learning Update
No ratings yet
Learning Update
6 pages
ML Lab Manual
No ratings yet
ML Lab Manual
70 pages
ML Experiments
No ratings yet
ML Experiments
22 pages
ML Lab Manual Devansh
No ratings yet
ML Lab Manual Devansh
57 pages
ML Ex1
No ratings yet
ML Ex1
12 pages
Machine Learning Laboratory
No ratings yet
Machine Learning Laboratory
44 pages
ML Lab Manual R20
No ratings yet
ML Lab Manual R20
77 pages
Discipline and Ideas Module 6 PDF
No ratings yet
Discipline and Ideas Module 6 PDF
5 pages
ML Record New Format
No ratings yet
ML Record New Format
48 pages
Ahmed Othman Report
No ratings yet
Ahmed Othman Report
6 pages
Quiet The Mind and The Soul Will Speak Exploring The Boundary Effects of Green Mindfulness and Spiritual Intelligence On University Students Green Entrepreneurial IntentionBehavior Link - 2023 - MDPI
No ratings yet
Quiet The Mind and The Soul Will Speak Exploring The Boundary Effects of Green Mindfulness and Spiritual Intelligence On University Students Green Entrepreneurial IntentionBehavior Link - 2023 - MDPI
21 pages
Machine Learning Techniques Lab: Session: 2023-24, Even Semester
No ratings yet
Machine Learning Techniques Lab: Session: 2023-24, Even Semester
20 pages
AD3461 - ML Lab Manual
No ratings yet
AD3461 - ML Lab Manual
54 pages
201CS240 Mllabmanual
No ratings yet
201CS240 Mllabmanual
20 pages
ML Lab Manual (1-9)
No ratings yet
ML Lab Manual (1-9)
37 pages
B.TECH Machine Learning-Lab
No ratings yet
B.TECH Machine Learning-Lab
99 pages
24CSPC212-PIC Lab Manual
No ratings yet
24CSPC212-PIC Lab Manual
45 pages
Jntuk R20 ML
No ratings yet
Jntuk R20 ML
43 pages
Nahw An Introduction To The Science of Arabic Grammar
No ratings yet
Nahw An Introduction To The Science of Arabic Grammar
2 pages
R20-21nm-Iii-I-Ml-Lab Manual
No ratings yet
R20-21nm-Iii-I-Ml-Lab Manual
38 pages
ML - LAB Record - Final
No ratings yet
ML - LAB Record - Final
39 pages
Outcome Based Lab Report
No ratings yet
Outcome Based Lab Report
22 pages
ML Lab
No ratings yet
ML Lab
7 pages
Machine Learning Lab Mannual CS 601
No ratings yet
Machine Learning Lab Mannual CS 601
30 pages
ML Lab R20
No ratings yet
ML Lab R20
42 pages
ML Priyesha - 778
No ratings yet
ML Priyesha - 778
23 pages
ML Lab
No ratings yet
ML Lab
45 pages
Spring 2020 Honors Beacon
No ratings yet
Spring 2020 Honors Beacon
14 pages
ML New Record
No ratings yet
ML New Record
51 pages
Ashin ML Record - Merged
No ratings yet
Ashin ML Record - Merged
53 pages
ML Lab
No ratings yet
ML Lab
49 pages
ML Lab Manual-17csl76
No ratings yet
ML Lab Manual-17csl76
43 pages
Machine Learning Lab
No ratings yet
Machine Learning Lab
31 pages
MLlab Manual LIET
No ratings yet
MLlab Manual LIET
52 pages
Fedal #5
No ratings yet
Fedal #5
33 pages
Lab Manual: Department of Computer Science and Engineering
No ratings yet
Lab Manual: Department of Computer Science and Engineering
30 pages
Desicion Making - Assignment
No ratings yet
Desicion Making - Assignment
3 pages
Machine Learninf File Final
No ratings yet
Machine Learninf File Final
45 pages
ML Lab Manual
No ratings yet
ML Lab Manual
26 pages
15CSL76
No ratings yet
15CSL76
3 pages
Machine Learning Lab Record: Dr. Sarika Hegde
No ratings yet
Machine Learning Lab Record: Dr. Sarika Hegde
23 pages
The Impact of Reward and Recognition Programs On Employees' Motivation and Satisfaction
No ratings yet
The Impact of Reward and Recognition Programs On Employees' Motivation and Satisfaction
12 pages
Experimental Psych Gustatory
No ratings yet
Experimental Psych Gustatory
5 pages
15CSL76
No ratings yet
15CSL76
35 pages
Updated ML LAB Manual-2020-21
No ratings yet
Updated ML LAB Manual-2020-21
57 pages
Hs Credits Taken in Ms Info
No ratings yet
Hs Credits Taken in Ms Info
1 page
Machine Learning Lab Manual (15CSL76)
No ratings yet
Machine Learning Lab Manual (15CSL76)
30 pages
Edison Rhet2
No ratings yet
Edison Rhet2
31 pages
Edited - Edited - Final ML Lab Manual Version11
No ratings yet
Edited - Edited - Final ML Lab Manual Version11
83 pages
Machine Learning Lab (17CSL76)
No ratings yet
Machine Learning Lab (17CSL76)
48 pages
Machine Learning Laboratory 18CSL76: Institute of Technology and Management
No ratings yet
Machine Learning Laboratory 18CSL76: Institute of Technology and Management
49 pages
Core Java Programming Book
From Everand
Core Java Programming Book
Manish Soni
No ratings yet
Advanced C++ Interview Questions You'll Most Likely Be Asked
From Everand
Advanced C++ Interview Questions You'll Most Likely Be Asked
Vibrant Publishers
No ratings yet