
6 Naive Bayes Classification Algorithm

The document contains a Python script for analyzing the Iris dataset using pandas and scikit-learn. It includes data loading, exploration, and preparation steps such as splitting the dataset into training and testing sets. The script also sets up a Gaussian Naive Bayes model for classification tasks.

Uploaded by

omkarmagdum818

cota12-6

March 25, 2025

[1]: '''name : Omkar Magdum


Rollno:COTC53'''
[1]: 'name : Omkar Magdum\n Rollno:COTC53'
[2]: import pandas as pd
     import matplotlib.pyplot as plt
[3]: data = pd.read_csv("iris.csv")

[4]: data.head()

[4]:    Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm      Species
     0   1            5.1           3.5            1.4           0.2  Iris-setosa
     1   2            4.9           3.0            1.4           0.2  Iris-setosa
     2   3            4.7           3.2            1.3           0.2  Iris-setosa
     3   4            4.6           3.1            1.5           0.2  Iris-setosa
     4   5            5.0           3.6            1.4           0.2  Iris-setosa

[5]: data.shape

[5]: (150, 6)

[6]: data.head()

[6]:    Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm      Species
     0   1            5.1           3.5            1.4           0.2  Iris-setosa
     1   2            4.9           3.0            1.4           0.2  Iris-setosa
     2   3            4.7           3.2            1.3           0.2  Iris-setosa
     3   4            4.6           3.1            1.5           0.2  Iris-setosa
     4   5            5.0           3.6            1.4           0.2  Iris-setosa

[7]: data.tail()

[7]:       Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm         Species
     145  146            6.7           3.0            5.2           2.3  Iris-virginica
     146  147            6.3           2.5            5.0           1.9  Iris-virginica
     147  148            6.5           3.0            5.2           2.0  Iris-virginica
     148  149            6.2           3.4            5.4           2.3  Iris-virginica
     149  150            5.9           3.0            5.1           1.8  Iris-virginica

[8]: data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 6 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   Id             150 non-null    int64
 1   SepalLengthCm  150 non-null    float64
 2   SepalWidthCm   150 non-null    float64
 3   PetalLengthCm  150 non-null    float64
 4   PetalWidthCm   150 non-null    float64
 5   Species        150 non-null    object
dtypes: float64(4), int64(1), object(1)
memory usage: 7.2+ KB

[9]: data.describe()

[9]:                Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm
     count  150.000000     150.000000    150.000000     150.000000    150.000000
     mean    75.500000       5.843333      3.054000       3.758667      1.198667
     std     43.445368       0.828066      0.433594       1.764420      0.763161
     min      1.000000       4.300000      2.000000       1.000000      0.100000
     25%     38.250000       5.100000      2.800000       1.600000      0.300000
     50%     75.500000       5.800000      3.000000       4.350000      1.300000
     75%    112.750000       6.400000      3.300000       5.100000      1.800000
     max    150.000000       7.900000      4.400000       6.900000      2.500000

[10]: data.isnull().sum()

[10] : Id 0
SepalLengthCm 0
SepalWidthCm 0
PetalLengthCm 0
PetalWidthCm 0
Species 0
dtype: int64
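
With no missing values reported, a natural next check before splitting is how many rows each species has. The following is a minimal sketch, not part of the original notebook. It assumes scikit-learn's bundled `load_iris` data as a stand-in for `iris.csv`, and the `Iris-` prefix is added only to mimic the labels shown in the tables above.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Stand-in for the 'Species' column of iris.csv (assumption: the bundled
# scikit-learn copy of the dataset is used instead of the CSV file).
iris = load_iris()
species = pd.Series(
    ["Iris-" + iris.target_names[t] for t in iris.target], name="Species"
)

# Count how many rows belong to each class.
class_counts = species.value_counts()
print(class_counts)
```

On the notebook's own frame the equivalent call would be `data['Species'].value_counts()`.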

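[11]: # Features: every column except the label. Note that 'Id' is still included.
      x = data.drop(['Species'], axis=1)
      # Target: dropping the four measurement columns leaves 'Id' and 'Species'.
      y = data.drop(['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm',
                     'PetalWidthCm'], axis=1)
      print(x)
      print(y)
      print(x.shape)
      print(y.shape)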
      Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm
0      1            5.1           3.5            1.4           0.2
1      2            4.9           3.0            1.4           0.2
2      3            4.7           3.2            1.3           0.2
3      4            4.6           3.1            1.5           0.2
4      5            5.0           3.6            1.4           0.2
..   ...            ...           ...            ...           ...
145  146            6.7           3.0            5.2           2.3
146  147            6.3           2.5            5.0           1.9
147  148            6.5           3.0            5.2           2.0
148  149            6.2           3.4            5.4           2.3
149  150            5.9           3.0            5.1           1.8

[150 rows x 5 columns]
      Id         Species
0      1     Iris-setosa
1      2     Iris-setosa
2      3     Iris-setosa
3      4     Iris-setosa
4      5     Iris-setosa
..   ...             ...
145  146  Iris-virginica
146  147  Iris-virginica
147  148  Iris-virginica
148  149  Iris-virginica
149  150  Iris-virginica

[150 rows x 2 columns]
(150, 5)
(150, 2)
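
The shapes above show that `Id` is carried into both the feature frame and the target frame. Because `Id` is only a row counter and the rows are ordered by species, a classifier could learn the label from it. One possible alternative, sketched here under assumptions, drops `Id` from the features and keeps the target as a single column. The helper `load_iris_frame` is hypothetical and rebuilds a frame shaped like `iris.csv` from scikit-learn's bundled data; it is not guaranteed to match the CSV value for value.

```python
import pandas as pd
from sklearn.datasets import load_iris


def load_iris_frame() -> pd.DataFrame:
    """Hypothetical helper: build a frame with the same columns as iris.csv."""
    iris = load_iris()
    frame = pd.DataFrame(
        iris.data,
        columns=["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"],
    )
    frame.insert(0, "Id", range(1, len(frame) + 1))
    frame["Species"] = ["Iris-" + iris.target_names[t] for t in iris.target]
    return frame


data = load_iris_frame()

# Features: the four measurements only (no 'Id', no label).
features = data.drop(columns=["Id", "Species"])
# Target: a one-dimensional Series holding the species name.
labels = data["Species"]

print(features.shape)
print(labels.shape)
```

With this selection the feature frame has four columns and the target is a one-dimensional Series, which is the form scikit-learn classifiers expect for labels.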
[12]: from sklearn.model_selection import train_test_split
      X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2,
                                                          shuffle=True)

      print(X_train.shape)
      print(X_test.shape)
      print(y_train.shape)
      print(y_test.shape)

(120, 5)
(30, 5)
(120, 2)
(30, 2)
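
The split above is shuffled without a fixed seed, so the rows that land in the test set change on every run. A minimal sketch of a reproducible, class-balanced variant follows. The `random_state=42` value and the use of `stratify` are illustrative choices, not something the original notebook does, and the bundled `load_iris` arrays stand in for the CSV.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Stand-in data: 150 rows, 4 measurement columns, integer class labels 0..2.
X, y = load_iris(return_X_y=True)

# random_state fixes the shuffle; stratify keeps the class proportions
# the same in the training and test parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, stratify=y, random_state=42
)

print(X_train.shape, X_test.shape)
# Number of test rows per class.
test_class_counts = np.bincount(y_test)
print(test_class_counts)
```

Stratifying matters more on small or imbalanced datasets. Here every class has 50 rows, so it mainly guarantees that each species contributes the same number of test rows.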
[14]: from sklearn.naive_bayes import GaussianNB

[15]: GaussianNB()

[15]: GaussianNB()
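
The notebook ends after instantiating `GaussianNB()` without fitting it. The remaining steps might look like the sketch below, which is an assumption about where the exercise was heading rather than the author's code. It uses scikit-learn's bundled `load_iris` arrays in place of `iris.csv`, a fixed `random_state` chosen only for repeatability, and the four measurements as features.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Stand-in data: four measurement columns and integer class labels.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Fit the model: GaussianNB estimates a per-class mean and variance
# for every feature from the training rows.
model = GaussianNB()
model.fit(X_train, y_train)

# Predict the held-out rows and compare with the true labels.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
matrix = confusion_matrix(y_test, y_pred)

print(f"accuracy: {accuracy:.3f}")
print(matrix)
```

To run this on the notebook's own variables, `fit` needs a one-dimensional label vector such as `y_train['Species']`, since `y_train` as built in cell 11 still has two columns.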
