Machine Learning Laboratory 15CSL76

5. Write a program to implement the naïve Bayesian classifier for a sample training data set stored
as a .CSV file. Compute the accuracy of the classifier, considering a few test data sets.

Bayes’ Theorem is stated as:

$$P(h \mid D) = \frac{P(D \mid h)\,P(h)}{P(D)}$$

Where,
P(h|D) is the probability of hypothesis h given the data D. This is called the posterior
probability.
P(D|h) is the probability of the data D given that the hypothesis h is true.
P(h) is the probability of hypothesis h being true. This is called the prior probability of h.
P(D) is the probability of the data D. This is called the prior probability of D.

After calculating the posterior probability for a number of different hypotheses, we are
interested in finding the most probable hypothesis h ∈ H given the observed data D. Any such
maximally probable hypothesis is called a maximum a posteriori (MAP) hypothesis.

Using Bayes’ theorem to calculate the posterior probability of each candidate hypothesis,
h_MAP is a MAP hypothesis provided

$$h_{MAP} = \arg\max_{h \in H} P(h \mid D) = \arg\max_{h \in H} \frac{P(D \mid h)\,P(h)}{P(D)} = \arg\max_{h \in H} P(D \mid h)\,P(h)$$

(Ignoring P(D) since it is a constant independent of h)
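
As a small illustration of the MAP rule, the sketch below computes the posteriors for two hypotheses and picks the most probable one. The hypothesis names and the prior and likelihood values are made up for illustration only; they do not come from the lab data set.

```python
# Hypothetical priors P(h) and likelihoods P(D|h) for two hypotheses.
priors = {"h1": 0.3, "h2": 0.7}
likelihoods = {"h1": 0.8, "h2": 0.2}

# Unnormalized posteriors P(D|h) * P(h).
unnormalized = {h: likelihoods[h] * priors[h] for h in priors}

# P(D) is the sum over all hypotheses; it is the same for every h.
evidence = sum(unnormalized.values())
posteriors = {h: value / evidence for h, value in unnormalized.items()}

# The MAP hypothesis maximizes P(D|h) * P(h); dividing by P(D)
# does not change which hypothesis wins.
h_map = max(unnormalized, key=unnormalized.get)
print(posteriors, h_map)
```

Although h2 has the larger prior here, h1 explains the data better and ends up as the MAP hypothesis.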

1 Deepak D. Assistant Professor, Dept. of CS&E, Canara Engg. College

Gaussian Naive Bayes

Gaussian Naive Bayes is a variant of the Naïve Bayes algorithm that is used when the features
have continuous values. It assumes that, within each class, every feature follows a Gaussian
(i.e., normal) distribution.

Representation for Gaussian Naive Bayes

For categorical inputs, the probabilities of the input values for each class are estimated from
frequencies. With real-valued inputs, we instead calculate the mean and standard deviation of
the input values (x) for each class to summarize the distribution.

This means that in addition to the probabilities for each class, we must also store the mean and
standard deviation of each input variable for each class.

Gaussian Naive Bayes Model from Data

The probability density function of the normal distribution is defined by two parameters, the
mean and the standard deviation. Learning the model from data therefore amounts to calculating
the mean and standard deviation of each input variable (x) for each class value.
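
For reference, the density and the resulting decision rule can be written as below. The notation is introduced here for illustration: c is a class value, x_i is the i-th input variable, and μ_{i,c} and σ_{i,c} are the mean and standard deviation of x_i among the training examples of class c.

```latex
P(x_i \mid c) = \frac{1}{\sqrt{2\pi}\,\sigma_{i,c}}
  \exp\!\left(-\frac{(x_i - \mu_{i,c})^2}{2\sigma_{i,c}^2}\right),
\qquad
\hat{c} = \arg\max_{c} \; P(c) \prod_{i} P(x_i \mid c)
```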

Example: Refer to the link

http://chem-eng.utoronto.ca/~datamining/dmc/naive_bayesian.htm


Examples:
• The data set used in this program is the Pima Indians Diabetes problem.
• This data set comprises 768 observations of medical details for Pima Indian
  patients. The records describe instantaneous measurements taken from the patient, such
  as their age, the number of times pregnant and blood workup. All patients are women
  aged 21 or older. All attributes are numeric, and their units vary from attribute to
  attribute.
• The attributes are Pregnancies, Glucose, BloodPressure, SkinThickness, Insulin,
  BMI, DiabeticPedigreeFunction, Age, Outcome.
• Each record has a class value that indicates whether the patient suffered an onset of
  diabetes within 5 years of when the measurements were taken (1) or not (0).

Sample Examples:
Example  Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin  BMI   DiabeticPedigreeFunction  Age  Outcome
1        6            148      72             35             0        33.6  0.627                     50   1
2        1            85       66             29             0        26.6  0.351                     31   0
3        8            183      64             0              0        23.3  0.672                     32   1
4        1            89       66             23             94       28.1  0.167                     21   0
5        0            137      40             35             168      43.1  2.288                     33   1
6        5            116      74             0              0        25.6  0.201                     30   0
7        3            78       50             32             88       31    0.248                     26   1
8        10           115      0              0              0        35.3  0.134                     29   0
9        2            197      70             45             543      30.5  0.158                     53   1
10       8            125      96             0              0        0     0.232                     54   1
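
To make the per-class summary concrete, the following sketch computes the mean and standard deviation of the Glucose column for each Outcome value, using only the ten sample rows listed here. It uses Python's standard `statistics` module rather than the helper functions of the lab program, and the variable names are chosen for illustration.

```python
import statistics

# (Glucose, Outcome) pairs copied from the ten sample rows.
samples = [(148, 1), (85, 0), (183, 1), (89, 0), (137, 1),
           (116, 0), (78, 1), (115, 0), (197, 1), (125, 1)]

# Group the Glucose values by class label.
glucose_by_class = {}
for glucose, outcome in samples:
    glucose_by_class.setdefault(outcome, []).append(glucose)

# Mean and sample standard deviation (n - 1 denominator) per class.
glucose_summary = {
    outcome: (statistics.mean(values), statistics.stdev(values))
    for outcome, values in glucose_by_class.items()
}
for outcome, (mu, sigma) in sorted(glucose_summary.items()):
    print(outcome, round(mu, 2), round(sigma, 2))
```

A Gaussian Naive Bayes model stores one such (mean, standard deviation) pair for every attribute and every class.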


Program:

import csv
import random
import math

def loadcsv(filename):
    # Read every row of the CSV file; the file is assumed to have no header row
    with open(filename, "r") as f:
        dataset = list(csv.reader(f))
    for i in range(len(dataset)):
        # converting strings into numbers for processing
        dataset[i] = [float(x) for x in dataset[i]]
    return dataset

def splitdataset(dataset, splitratio):
    # splitratio = 0.67 gives a 67% training size
    trainsize = int(len(dataset) * splitratio)
    trainset = []
    remaining = list(dataset)
    while len(trainset) < trainsize:
        # pick a random index and move that row into the training data
        index = random.randrange(len(remaining))
        trainset.append(remaining.pop(index))
    # the rows left over form the test set
    return [trainset, remaining]

def separatebyclass(dataset):
    # creates a dictionary of classes 1 and 0 where the values are
    # the instances belonging to each class
    separated = {}
    for i in range(len(dataset)):
        vector = dataset[i]
        if vector[-1] not in separated:
            separated[vector[-1]] = []
        separated[vector[-1]].append(vector)
    return separated

def mean(numbers):
    return sum(numbers) / float(len(numbers))

def stdev(numbers):
    # sample standard deviation (n - 1 in the denominator)
    avg = mean(numbers)
    variance = sum([pow(x - avg, 2) for x in numbers]) / float(len(numbers) - 1)
    return math.sqrt(variance)


def summarize(dataset):
    # returns a list of (mean, stdev) tuples, one per attribute column
    summaries = [(mean(attribute), stdev(attribute)) for attribute in zip(*dataset)]
    del summaries[-1]  # exclude the class label column
    return summaries

def summarizebyclass(dataset):
    separated = separatebyclass(dataset)
    # summaries is a dictionary: class value -> list of (mean, stdev) tuples
    summaries = {}
    for classvalue, instances in separated.items():
        # summarize calculates the mean and stdev of every attribute
        summaries[classvalue] = summarize(instances)
    return summaries

def calculateprobability(x, mean, stdev):
    # Gaussian probability density of x for the given mean and stdev
    exponent = math.exp(-(math.pow(x - mean, 2) / (2 * math.pow(stdev, 2))))
    return (1 / (math.sqrt(2 * math.pi) * stdev)) * exponent

def calculateclassprobabilities(summaries, inputvector):
    # probabilities maps each class to the product of the attribute densities
    # for the test vector. Note: the class prior P(h) is not multiplied in,
    # so the comparison effectively assumes equal priors.
    probabilities = {}
    for classvalue, classsummaries in summaries.items():
        probabilities[classvalue] = 1
        for i in range(len(classsummaries)):
            # mean and stdev of attribute i for this class
            mean, stdev = classsummaries[i]
            x = inputvector[i]  # value of attribute i in the test vector
            probabilities[classvalue] *= calculateprobability(x, mean, stdev)
    return probabilities

def predict(summaries, inputvector):
    # summaries come from the training data; inputvector is one test row
    probabilities = calculateclassprobabilities(summaries, inputvector)
    bestLabel, bestProb = None, -1
    for classvalue, probability in probabilities.items():
        # assign the class which has the highest probability
        if bestLabel is None or probability > bestProb:
            bestProb = probability
            bestLabel = classvalue
    return bestLabel


def getpredictions(summaries, testset):
    # predict a class label for every row of the test set
    predictions = []
    for i in range(len(testset)):
        result = predict(summaries, testset[i])
        predictions.append(result)
    return predictions

def getaccuracy(testset, predictions):
    # percentage of test rows whose predicted label matches the actual label
    correct = 0
    for i in range(len(testset)):
        if testset[i][-1] == predictions[i]:
            correct += 1
    return (correct / float(len(testset))) * 100.0

def main():
    filename = 'naivedata.csv'
    splitratio = 0.67
    dataset = loadcsv(filename)

    trainingset, testset = splitdataset(dataset, splitratio)

    print('Split {0} rows into train={1} and test={2} rows'.format(
        len(dataset), len(trainingset), len(testset)))
    # prepare model
    summaries = summarizebyclass(trainingset)
    # test model: predict the test data using the training summaries
    predictions = getpredictions(summaries, testset)
    accuracy = getaccuracy(testset, predictions)
    print('Accuracy of the classifier is : {0}%'.format(accuracy))

main()

Output (the accuracy varies from run to run because the train/test split is random):

Split 768 rows into train=514 and test=254 rows

Accuracy of the classifier is : 71.65354330708661%
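
The program above multiplies only the attribute densities, so it does not include the class prior P(h) from the MAP rule, and a long product of small densities can underflow. The sketch below is one possible variant, not part of the original program: it adds the log of the prior and sums log densities instead. The names `log_gaussian`, `predict_with_priors`, the `priors` argument and the toy numbers are all hypothetical.

```python
import math

def log_gaussian(x, mean, stdev):
    # Logarithm of the normal density used by calculateprobability.
    return (-math.log(math.sqrt(2 * math.pi) * stdev)
            - (x - mean) ** 2 / (2 * stdev ** 2))

def predict_with_priors(summaries, priors, inputvector):
    # summaries: {class: [(mean, stdev), ...]} as built by summarizebyclass
    # priors: {class: P(class)}, a hypothetical extra argument
    best_label, best_score = None, -math.inf
    for classvalue, classsummaries in summaries.items():
        # log P(class) + sum of log P(x_i | class)
        score = math.log(priors[classvalue])
        for i, (mean, stdev) in enumerate(classsummaries):
            score += log_gaussian(inputvector[i], mean, stdev)
        if score > best_score:
            best_label, best_score = classvalue, score
    return best_label

# Toy, made-up summaries with a single attribute per class.
toy_summaries = {0.0: [(100.0, 15.0)], 1.0: [(145.0, 40.0)]}
toy_priors = {0.0: 0.65, 1.0: 0.35}
# The last element of the input row is the label slot and is ignored.
print(predict_with_priors(toy_summaries, toy_priors, [105.0, 0.0]))
```

In practice the priors could be estimated as the fraction of training rows in each class; whether this changes the accuracy on this data set is not something the original output shows.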
