
10/8/21, 1:09 PM 20190802050_DS_Lab4

AI/ML LAB-4
Name: Pratik Jadhav

PRN: 20190802050

AIM: To implement two algorithms on a data set and compute the accuracy score of the predictions

Q1. Write a program to implement k-Nearest Neighbour algorithm to classify the iris data
set. Print both correct and wrong predictions. Java/Python ML library classes can be used
for this problem.

In [1]:
%matplotlib inline

import numpy as np

import pandas as pd

import matplotlib.pyplot as plt

In [2]:
iris_data = pd.read_csv("Iris.csv")

iris_data.head()

Out[2]: Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

0 1 5.1 3.5 1.4 0.2 Iris-setosa

1 2 4.9 3.0 1.4 0.2 Iris-setosa

2 3 4.7 3.2 1.3 0.2 Iris-setosa

3 4 4.6 3.1 1.5 0.2 Iris-setosa

4 5 5.0 3.6 1.4 0.2 Iris-setosa

In [3]:
len(iris_data)

Out[3]: 150

In [4]:
iris_data.isna().sum()

Out[4]:
Id               0
SepalLengthCm    0
SepalWidthCm     0
PetalLengthCm    0
PetalWidthCm     0
Species          0
dtype: int64

localhost:8888/nbconvert/html/20190802050_DS_Lab4.ipynb?download=false 1/5

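In [5]:
# Separate the features from the target. Note: the "Id" column is not dropped
# here, so it is passed to the classifier as a feature with the measurements.
X = iris_data.drop("Species", axis=1)

y = iris_data["Species"]

len(X), len(y)

Out[5]: (150, 150)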
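Because k-nearest neighbours compares rows by distance, every column left in X contributes to that distance, including Id. A minimal sketch of the effect, using rows 0 and 4 from the head() output above and assuming plain Euclidean distance (the default metric of KNeighborsClassifier); the helper name `euclidean` is illustrative and not part of the notebook:

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Rows 0 and 4 of iris_data.head():
# (Id, SepalLengthCm, SepalWidthCm, PetalLengthCm, PetalWidthCm)
row0 = (1, 5.1, 3.5, 1.4, 0.2)
row4 = (5, 5.0, 3.6, 1.4, 0.2)

with_id = euclidean(row0, row4)             # Id is treated as a feature
without_id = euclidean(row0[1:], row4[1:])  # measurements only

print(f"with Id: {with_id:.3f}, without Id: {without_id:.3f}")
```

For these two rows the Id difference accounts for almost all of the distance. Dropping it with iris_data.drop(["Id", "Species"], axis=1) would restrict the comparison to the four measurements; the scores reported in this notebook were obtained with Id included.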

In [6]:
from sklearn.neighbors import KNeighborsClassifier

from sklearn.model_selection import train_test_split

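# Hold out 20% of the rows for testing; random_state fixes the shuffle.
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.2,
                                                    random_state=1)

# Classify each test row by majority vote among its 3 nearest training rows.
clf = KNeighborsClassifier(n_neighbors=3)

clf.fit(X_train, y_train)

clf.score(X_test, y_test)

Out[6]: 0.9666666666666667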
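To make the classifier's behaviour concrete, here is a rough sketch of the k-nearest-neighbour rule in plain Python: measure the distance from a query point to every training point, keep the k closest, and return the most common label among them. It is not how scikit-learn implements KNeighborsClassifier; the function name `knn_predict` and the toy data are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Distance from the query to every training point, paired with that point's label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_X, train_y)
    ]
    # Keep the k closest points and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy training set (hypothetical values): petal length and petal width in cm.
train_X = [(1.4, 0.2), (1.3, 0.2), (1.5, 0.2), (4.7, 1.4), (4.5, 1.5), (4.9, 1.5)]
train_y = ["setosa", "setosa", "setosa", "versicolor", "versicolor", "versicolor"]

prediction = knn_predict(train_X, train_y, query=(1.6, 0.3))
print(prediction)
```

The sketch sorts all distances, which is fine for 150 rows; library implementations typically use tree or other index structures to avoid a full scan.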

In [7]:
y_preds = clf.predict(X_test)

y_preds[:10]

Out[7]:
array(['Iris-setosa', 'Iris-versicolor', 'Iris-versicolor', 'Iris-setosa',
       'Iris-virginica', 'Iris-versicolor', 'Iris-virginica',
       'Iris-setosa', 'Iris-setosa', 'Iris-virginica'], dtype=object)

In [8]:
y_preds_proba = clf.predict_proba(X_test)

y_preds_proba[:10]

Out[8]:
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 1., 0.],
       [1., 0., 0.],
       [0., 0., 1.],
       [0., 1., 0.],
       [0., 0., 1.],
       [1., 0., 0.],
       [1., 0., 0.],
       [0., 0., 1.]])
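With the default uniform weighting, each row of predict_proba is the share of the 3 neighbours that belong to each class, so the only possible values are 0, 1/3, 2/3 and 1. The ten rows shown above are all unanimous votes. A small sketch of that calculation; `vote_fractions` and the neighbour labels are hypothetical:

```python
from collections import Counter

CLASSES = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]

def vote_fractions(neighbour_labels, classes=CLASSES):
    """Return, for each class, the fraction of neighbours carrying that label."""
    counts = Counter(neighbour_labels)
    k = len(neighbour_labels)
    return [counts[c] / k for c in classes]

# All three neighbours agree, as in the rows printed above.
unanimous = vote_fractions(["Iris-setosa"] * 3)

# Two neighbours say versicolor, one says virginica.
split = vote_fractions(["Iris-versicolor", "Iris-virginica", "Iris-versicolor"])

print(unanimous)
print(split)
```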

In [9]:
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Note: scikit-learn's metric functions take (y_true, y_pred). Predictions are
# passed first here, so in the output below the rows of the confusion matrix
# are predicted labels and the precision and recall columns are interchanged.
# Accuracy is unaffected because it is symmetric in its arguments.
accuracy = accuracy_score(y_preds, y_test)

print(f"The accuracy of the ML model for iris data: {accuracy * 100:.2f}%\n")

print(f"Classification Report: {classification_report(y_preds, y_test)}\n")

print(f"Confusion Matrix: \n{confusion_matrix(y_preds, y_test)}")

The accuracy of the ML model for iris data: 96.67%

Classification Report:
                 precision    recall  f1-score   support

    Iris-setosa       1.00      1.00      1.00        11
Iris-versicolor       0.92      1.00      0.96        12
 Iris-virginica       1.00      0.86      0.92         7

       accuracy                           0.97        30
      macro avg       0.97      0.95      0.96        30
   weighted avg       0.97      0.97      0.97        30

Confusion Matrix:
[[11  0  0]
 [ 0 12  0]
 [ 0  1  6]]
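The per-class figures in the report can be recomputed from the confusion matrix alone. A minimal sketch in plain Python: because the cell above passes y_preds as the first argument, the rows of the printed matrix are taken here to be predicted labels and the columns true labels (scikit-learn's documented order is y_true first, then y_pred). The helper name `per_class_metrics` is illustrative.

```python
def per_class_metrics(matrix, labels):
    """Precision, recall and F1 per class.

    matrix[i][j] counts samples predicted as class i whose true class is j
    (rows = predicted, columns = true), matching the call order used above.
    """
    results = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        predicted_total = sum(matrix[i])            # row sum
        true_total = sum(row[i] for row in matrix)  # column sum
        precision = tp / predicted_total if predicted_total else 0.0
        recall = tp / true_total if true_total else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results[label] = (precision, recall, f1)
    return results

labels = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]
matrix = [[11, 0, 0],
          [0, 12, 0],
          [0, 1, 6]]

metrics = per_class_metrics(matrix, labels)
for label, (p, r, f1) in metrics.items():
    print(f"{label:16s} precision={p:.2f} recall={r:.2f} f1={f1:.2f}")

correct = sum(matrix[i][i] for i in range(len(labels)))
total = sum(sum(row) for row in matrix)
print(f"accuracy={correct / total:.4f}")
```

Under this reading, Iris-versicolor has precision 1.00 and recall 0.92 and Iris-virginica has precision 0.86 and recall 1.00, which is the printed report with the two columns exchanged. The single error is a versicolor flower predicted as virginica.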

In [10]:
from sklearn.model_selection import cross_val_score

# cv is left at its default; the output below shows one accuracy score for each of five folds.
cvs = cross_val_score(clf, X, y)

print(cvs)

print(f"Mean of each testing data set: {np.mean(cvs) * 100:.2f}%")

[0.66666667 1. 1. 1. 0.7 ]

Mean of each testing data set: 87.33%
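The mean above gives every fold the same weight. A short sketch of the arithmetic, assuming the 150 rows were split into five folds of 30 rows each (consistent with the five scores printed, although the fold sizes are not shown in the notebook):

```python
fold_scores = [0.66666667, 1.0, 1.0, 1.0, 0.7]
fold_size = 30  # assumption: 150 rows / 5 folds

# Unweighted mean of the per-fold accuracies, as in np.mean(cvs).
mean_accuracy = sum(fold_scores) / len(fold_scores)

# Equivalent view: correctly classified rows per fold and overall.
correct_per_fold = [round(score * fold_size) for score in fold_scores]
overall = sum(correct_per_fold) / (fold_size * len(fold_scores))

print(f"mean accuracy: {mean_accuracy * 100:.2f}%")
print(f"correct per fold: {correct_per_fold}")
print(f"overall: {overall * 100:.2f}%")
```

Under that assumption the first and last folds account for all 19 misclassified rows, so the 87.33% mean hides a large spread between folds.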

In [11]:
y_testing = pd.Series(y_test).reset_index().drop("index",axis=1)

y_predictions = pd.Series(y_preds)

In [12]:
predictions_df = pd.DataFrame(data={
    "Species": y_testing["Species"],
    "Predicted Species": y_predictions
})

In [13]:
predicts = []

for index, i in enumerate(y_testing["Species"]):
    if i == y_preds[index]:
        predicts.append("Correct")
    else:
        predicts.append("Wrong")

In [14]:
predictions_df["Correct or Wrong"] = pd.Series(predicts)

predictions_df.head()

Out[14]: Species Predicted Species Correct or Wrong

0 Iris-setosa Iris-setosa Correct

1 Iris-versicolor Iris-versicolor Correct

2 Iris-versicolor Iris-versicolor Correct

3 Iris-setosa Iris-setosa Correct

4 Iris-virginica Iris-virginica Correct

In [15]:
print(f"Total Correct or Wrong Predictions:\n\
{predictions_df['Correct or Wrong'].value_counts()}")

Total Correct or Wrong Predictions:
Correct    29
Wrong       1
Name: Correct or Wrong, dtype: int64
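The same correct/wrong tally can be produced without an explicit index loop by pairing the two label sequences. A sketch in plain Python with short hypothetical label lists (not the notebook's test set):

```python
from collections import Counter

# Hypothetical true and predicted labels for five flowers.
y_true = ["setosa", "versicolor", "versicolor", "virginica", "setosa"]
y_pred = ["setosa", "versicolor", "virginica", "virginica", "setosa"]

# Pair each true label with its prediction and mark the outcome.
outcomes = ["Correct" if t == p else "Wrong" for t, p in zip(y_true, y_pred)]
tally = Counter(outcomes)

# Print every row, so both correct and wrong predictions are visible.
for t, p, outcome in zip(y_true, y_pred, outcomes):
    print(f"{t:12s} {p:12s} {outcome}")

accuracy = tally["Correct"] / len(outcomes)
print(tally["Correct"], "correct,", tally["Wrong"], "wrong")
```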

Q2. Write a program to implement the naïve Bayesian classifier for a sample training data
set stored as a .CSV file. Compute the accuracy of the classifier, considering few test data
sets.

In [16]:
iris_data = pd.read_csv("Iris.csv")

iris_data.head()

Out[16]: Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

0 1 5.1 3.5 1.4 0.2 Iris-setosa

1 2 4.9 3.0 1.4 0.2 Iris-setosa

2 3 4.7 3.2 1.3 0.2 Iris-setosa

3 4 4.6 3.1 1.5 0.2 Iris-setosa

4 5 5.0 3.6 1.4 0.2 Iris-setosa

In [17]:
X = iris_data.drop("Species", axis=1)

y = iris_data["Species"]

len(X), len(y)

Out[17]: (150, 150)

In [18]:
from sklearn.naive_bayes import GaussianNB

from sklearn.model_selection import train_test_split

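# Hold out 30% of the rows for testing; random_state fixes the shuffle.
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.3,
                                                    random_state=1)

# Gaussian naive Bayes: each feature is modelled as normally distributed within each class.
gnb = GaussianNB()

gnb.fit(X_train, y_train)

gnb.score(X_test, y_test)

Out[18]: 1.0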
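For intuition about what the fitted model does, here is a rough sketch of Gaussian naive Bayes in plain Python: estimate a prior and a per-feature mean and variance for each class, then pick the class with the highest log posterior while treating the features as independent. It is a simplified illustration, not scikit-learn's GaussianNB implementation; the function names, the fixed variance floor and the toy data are all assumptions.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate a prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Small constant keeps the variance strictly positive (an assumption of this sketch).
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model

def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior for `row`."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            # Log of the normal density N(v; m, var); features are treated as independent.
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy data (hypothetical values): petal length and petal width in cm.
X_toy = [(1.4, 0.2), (1.3, 0.2), (1.5, 0.3), (4.7, 1.4), (4.5, 1.5), (4.9, 1.6)]
y_toy = ["setosa", "setosa", "setosa", "versicolor", "versicolor", "versicolor"]

model = fit_gaussian_nb(X_toy, y_toy)
prediction = predict_gaussian_nb(model, (1.6, 0.25))
print(prediction)
```

Working in log space avoids multiplying many small densities together, which would otherwise underflow for rows far from a class mean.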

In [19]:
y_preds = gnb.predict(X_test)

y_preds[:10]

Out[19]:
array(['Iris-setosa', 'Iris-versicolor', 'Iris-versicolor', 'Iris-setosa',
       'Iris-virginica', 'Iris-versicolor', 'Iris-virginica',
       'Iris-setosa', 'Iris-setosa', 'Iris-virginica'], dtype='<U15')

In [20]:
from sklearn.metrics import accuracy_score

accuracy = accuracy_score(y_preds, y_test)

print(f"The accuracy of the ML model for iris data: {accuracy * 100:.2f}%")

The accuracy of the ML model for iris data: 100.00%

In [21]:
from sklearn.model_selection import cross_val_score

cvs = cross_val_score(gnb, X, y)

print(cvs)

print(f"Mean of each testing data set: {np.mean(cvs) * 100:.2f}%")

[0.96666667 1. 1. 1. 1. ]

Mean of each testing data set: 99.33%

In [22]:
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

accuracy = accuracy_score(y_preds, y_test)

print(f"The accuracy of the ML model for iris data: {accuracy * 100:.2f}%\n")

print(f"Classification Report: {classification_report(y_preds, y_test)}\n")

print(f"Confusion Matrix: \n{confusion_matrix(y_preds, y_test)}")

The accuracy of the ML model for iris data: 100.00%

Classification Report:
                 precision    recall  f1-score   support

    Iris-setosa       1.00      1.00      1.00        14
Iris-versicolor       1.00      1.00      1.00        18
 Iris-virginica       1.00      1.00      1.00        13

       accuracy                           1.00        45
      macro avg       1.00      1.00      1.00        45
   weighted avg       1.00      1.00      1.00        45

Confusion Matrix:
[[14  0  0]
 [ 0 18  0]
 [ 0  0 13]]

Conclusion: We implemented the k-Nearest Neighbours and Gaussian Naive Bayes algorithms on the iris data set and evaluated the predictions with accuracy, a classification report and a confusion matrix. The k-Nearest Neighbours classifier reached an accuracy of 96.67% on the held-out test set and a mean accuracy of 87.33% across the cross-validation folds. The Naive Bayes classifier reached 100% on its held-out test set and a mean accuracy of 99.33% across the cross-validation folds.
