
Naïve Bayes Classification

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Load dataset
data = {
    'Outlook': ['Sunny', 'Sunny', 'Overcast', 'Rain', 'Rain', 'Rain', 'Overcast',
                'Sunny', 'Sunny', 'Rain', 'Sunny', 'Overcast', 'Overcast', 'Rain'],
    'Temperature': ['Hot', 'Hot', 'Hot', 'Mild', 'Cool', 'Cool', 'Cool', 'Mild',
                    'Cool', 'Mild', 'Mild', 'Mild', 'Hot', 'Mild'],
    'Humidity': ['High', 'High', 'High', 'High', 'Normal', 'Normal', 'Normal',
                 'High', 'Normal', 'Normal', 'Normal', 'High', 'Normal', 'High'],
    'Windy': [False, True, False, False, False, True, True, False, False, False,
              True, True, False, True],
    'Play Tennis': ['No', 'No', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'No', 'Yes',
                    'Yes', 'Yes', 'Yes', 'Yes', 'No']
}
df = pd.DataFrame(data)

# Encode categorical variables with one-hot encoding
df_encoded = pd.get_dummies(df, columns=['Outlook', 'Temperature', 'Humidity', 'Windy'])

# Split data into features and target
X = df_encoded.drop(columns=['Play Tennis']).values
y = df_encoded['Play Tennis'].values

# Split data into train and test sets
def train_test_split(X, y, test_size=0.2, random_state=None):
    np.random.seed(random_state)
    indices = np.random.permutation(len(X))
    n_test = int(len(X) * test_size)  # number of test samples
    X_train = X[indices[:-n_test]]
    y_train = y[indices[:-n_test]]
    X_test = X[indices[-n_test:]]
    y_test = y[indices[-n_test:]]
    return X_train, X_test, y_train, y_test

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Naive Bayes Classifier (Gaussian likelihoods over the one-hot features)
class NaiveBayesClassifier:
    def fit(self, X, y):
        self.X = X
        self.y = y
        self.classes = np.unique(y)
        self.parameters = []
        # For each class, store the mean and variance of every feature column
        for i, c in enumerate(self.classes):
            X_c = X[y == c]
            self.parameters.append([])
            for col in X_c.T:
                parameters = {"mean": col.mean(), "var": col.var()}
                self.parameters[i].append(parameters)

    def _calculate_likelihood(self, mean, var, x):
        # Gaussian density; eps guards against zero variance
        eps = 1e-4
        coeff = 1.0 / np.sqrt(2.0 * np.pi * var + eps)
        exponent = np.exp(-(np.power(x - mean, 2) / (2 * var + eps)))
        return coeff * exponent

    def _calculate_prior(self, c):
        # Prior P(c) = fraction of training samples with class c
        frequency = np.mean(self.y == c)
        return frequency

    def _classify(self, sample):
        # Unnormalized posterior = prior * product of per-feature likelihoods
        posteriors = []
        for i, c in enumerate(self.classes):
            posterior = self._calculate_prior(c)
            for feature_value, params in zip(sample, self.parameters[i]):
                likelihood = self._calculate_likelihood(params["mean"], params["var"],
                                                        feature_value)
                posterior *= likelihood
            posteriors.append(posterior)
        # Pick the class with the highest posterior
        return self.classes[np.argmax(posteriors)]

    def predict(self, X):
        y_pred = [self._classify(sample) for sample in X]
        return np.array(y_pred)

# Train the Naive Bayes Classifier
nb_classifier = NaiveBayesClassifier()
nb_classifier.fit(X_train, y_train)

# Make predictions
y_pred = nb_classifier.predict(X_test)

# Calculate accuracy
accuracy = np.mean(y_pred == y_test)
print("Accuracy:", accuracy)

# Confusion Matrix
def confusion_matrix(y_true, y_pred, labels):
    cm = np.zeros((len(labels), len(labels)), dtype=int)
    label_map = {label: i for i, label in enumerate(labels)}
    for i in range(len(y_true)):
        cm[label_map[y_true[i]]][label_map[y_pred[i]]] += 1
    return cm

labels = np.unique(y)
cm = confusion_matrix(y_test, y_pred, labels)
print("Confusion Matrix:")
print(cm)

# Visualization of Confusion Matrix
plt.figure(figsize=(8, 6))
plt.imshow(cm, interpolation='nearest', cmap=plt.cm.Reds)
plt.title("Confusion Matrix")
plt.colorbar()
tick_marks = np.arange(len(labels))
plt.xticks(tick_marks, labels, rotation=45)
plt.yticks(tick_marks, labels)
plt.ylabel('True label')
plt.xlabel('Predicted label')
for i in range(len(labels)):
    for j in range(len(labels)):
        plt.text(j, i, cm[i, j], ha="center", va="center",
                 color="white" if cm[i, j] > cm.max() / 2.0 else "black")
plt.tight_layout()
plt.show()
***************
Output:
Accuracy: 0.5
Confusion Matrix:
[[0 0]
[1 1]]
Step by Step Explanation

Data Loading:
• The dataset is created as a Python dictionary whose keys are the features Outlook, Temperature, Humidity, and Windy, plus the target variable "Play Tennis".
• The dictionary is then converted into a pandas DataFrame, df.
Data Preprocessing:
• Categorical variables are encoded with one-hot encoding using pd.get_dummies(), which converts categorical columns into dummy/indicator variables.
• In df_encoded = pd.get_dummies(df, columns=['Outlook', 'Temperature', 'Humidity', 'Windy']), the columns argument lists the categorical columns to encode. For each category in those columns, get_dummies() creates a new binary (0 or 1) column indicating whether that category is present in the original column.
• After this line, df_encoded contains the untouched "Play Tennis" column plus the one-hot encoded columns, as the sketch below illustrates.
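
A quick way to see the result (an illustrative check, not part of the original listing; the exact column order can vary across pandas versions):

print(df_encoded.columns.tolist())
# e.g. ['Play Tennis', 'Outlook_Overcast', 'Outlook_Rain', 'Outlook_Sunny',
#       'Temperature_Cool', 'Temperature_Hot', 'Temperature_Mild',
#       'Humidity_High', 'Humidity_Normal', 'Windy_False', 'Windy_True']
print(df_encoded.iloc[0])
# First row encodes Sunny / Hot / High / Windy=False with target 'No'
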
Train-Test Split:
• A custom train_test_split() function splits the dataset into training and testing sets: it shuffles the row indices with a fixed random seed, then takes the last int(len(X) * test_size) rows as the test set and the rest as the training set (see the sanity check after this list).
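
To make the split concrete (an illustrative check, not in the original listing): with 14 rows and test_size=0.2, int(14 * 0.2) = 2, so 2 rows form the test set and the remaining 12 the training set.

print(X_train.shape, X_test.shape)   # expected: (12, 10) (2, 10)
print(len(y_train), len(y_test))     # expected: 12 2
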

Naive Bayes Classifier:
• The NaiveBayesClassifier class defines methods for fitting the model and making predictions.
• In the fit() method, the mean and variance of each feature within each class are calculated and stored in the parameters attribute.
• The predict() method calls _classify() on each test sample, which computes, for each class, the prior probability times the product of the per-feature Gaussian likelihoods; the class with the highest posterior probability is chosen as the predicted class (the formula is spelled out below).
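
Concretely, this is Gaussian Naive Bayes. The likelihood that _calculate_likelihood() implements for a feature value x, given the per-class statistics mean and var stored by fit(), is

    P(x | c) = 1 / sqrt(2 * pi * var) * exp(-(x - mean)^2 / (2 * var))

(the small eps in the code guards against zero variance, which arises easily with binary one-hot features), and the unnormalized posterior used by _classify() is

    P(c | x) ∝ P(c) * P(x_1 | c) * ... * P(x_n | c)

where P(c) is the class frequency returned by _calculate_prior().
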
Training the Classifier:
• An instance of NaiveBayesClassifier is created (nb_classifier) and trained on the training data (X_train and y_train) using the fit() method.
Making Predictions:
• The trained classifier (nb_classifier) predicts labels for the test data (X_test) using the predict() method.
Accuracy Calculation:
• The accuracy of the model is calculated by comparing the predicted labels (y_pred) with the true labels (y_test) using NumPy's element-wise array comparison; the mean of the resulting boolean array is the fraction of correct predictions (worked out below).
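
With only two test samples, each prediction moves the accuracy by 0.5, so the reported score is very coarse. For the output shown above, one of the two predictions matches:

accuracy = np.mean(y_pred == y_test)
# e.g. one match out of two: mean of [True, False] = 0.5
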
Confusion Matrix:
• A confusion matrix is generated to evaluate the performance of the classifier. The confusion_matrix() function builds it from the true and predicted labels: entry (i, j) counts the samples whose true label is labels[i] and whose predicted label is labels[j] (see the reading below).
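
Reading the printed matrix with labels sorted as ['No', 'Yes'] (rows are true labels, columns are predicted labels):

                 predicted No   predicted Yes
    true No            0              0
    true Yes           1              1

That is, both test samples were truly 'Yes'; one was misclassified as 'No' and one was classified correctly, which matches the 0.5 accuracy.
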
Confusion Matrix Visualization:
• The confusion matrix is visualized using matplotlib. Each cell shows the number of samples with the corresponding true and predicted labels, and the cell's color intensity reflects that count.
• Finally, the accuracy and confusion matrix are printed, and the confusion matrix plot is displayed.

**********************
