SVM and Kmeans - Iris Dataset - Ipynb - Colab

Uploaded by

termp89

11/29/24, 9:30 PM SVM and Kmeans -Iris dataset.ipynb - Colab
https://colab.research.google.com/drive/1kDkVaGxeyPshe6mgQPxShanNabVTF1v_

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
import matplotlib.pyplot as plt
import seaborn as sns

!kaggle datasets download -d uciml/iris

Dataset URL: https://www.kaggle.com/datasets/uciml/iris
License(s): CC0-1.0
Downloading iris.zip to /content
0% 0.00/3.60k [00:00<?, ?B/s]
100% 3.60k/3.60k [00:00<00:00, 7.28MB/s]

Loading the dataset and creating a DataFrame

!unzip iris.zip

Archive: iris.zip
inflating: Iris.csv
inflating: database.sqlite

df = pd.read_csv('Iris.csv')
print(df.head())

   Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm      Species
0   1            5.1           3.5            1.4           0.2  Iris-setosa
1   2            4.9           3.0            1.4           0.2  Iris-setosa
2   3            4.7           3.2            1.3           0.2  Iris-setosa
3   4            4.6           3.1            1.5           0.2  Iris-setosa
4   5            5.0           3.6            1.4           0.2  Iris-setosa
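
If Kaggle API credentials are not configured, a similar frame can be built without any download. The sketch below assumes that scikit-learn's bundled copy of the Iris data is an acceptable stand-in for Iris.csv. The bundled data has no Id column and uses different headers, so the renaming map and the name df_alt are illustrative choices, not part of the original notebook. Species arrives here as integer codes rather than strings.

```python
from sklearn.datasets import load_iris

# Assumption: scikit-learn's bundled Iris data stands in for Kaggle's Iris.csv.
iris = load_iris(as_frame=True)
df_alt = iris.frame  # four measurement columns plus an integer 'target' column

# Rename to the Kaggle-style headers used in this notebook (illustrative mapping).
df_alt = df_alt.rename(columns={
    "sepal length (cm)": "SepalLengthCm",
    "sepal width (cm)": "SepalWidthCm",
    "petal length (cm)": "PetalLengthCm",
    "petal width (cm)": "PetalWidthCm",
    "target": "Species",
})
print(df_alt.shape)
print(list(df_alt.columns))
```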

Converting the categorical labels to numbers

df['Species'] = df['Species'].astype('category').cat.codes
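
The integer codes follow the sorted order of the category labels, so it can help to record which code stands for which species before the strings are overwritten. A minimal sketch on a hypothetical three-row sample (the names species and code_to_name are illustrative):

```python
import pandas as pd

# Hypothetical three-row sample; the notebook applies the same call to df['Species'].
species = pd.Series(["Iris-virginica", "Iris-setosa", "Iris-versicolor"])
as_category = species.astype("category")

# Categories are sorted, so the codes follow alphabetical order of the labels.
code_to_name = dict(enumerate(as_category.cat.categories))
codes = as_category.cat.codes.tolist()
print(code_to_name)  # {0: 'Iris-setosa', 1: 'Iris-versicolor', 2: 'Iris-virginica'}
print(codes)         # [2, 0, 1]
```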

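Selecting columns and assigning them to X and y

# Features: all columns except the last. This slice also keeps Id (a row
# counter, not a measurement), hence 5 columns below; df.iloc[:, 1:-1] would drop it.
X = df.iloc[:, :-1].values
y = df.iloc[:, -1].values  # target: the encoded Species column
# Hold out 20% of the rows for testing; random_state fixes the shuffle
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print("Training set shape:", X_train.shape)
print("Test set shape:", X_test.shape)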

Training set shape: (120, 5)
Test set shape: (30, 5)
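
The five columns reported above include Id. One way to leave it out is to select the measurement columns by name rather than by position. The sketch below runs on a hypothetical ten-row frame with the same headers as Iris.csv. The stratify argument is an added assumption, not something the original notebook uses; it keeps the class proportions equal in both splits.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical miniature frame with the same columns as Iris.csv.
sample = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "SepalLengthCm": [5.1, 4.9, 4.7, 4.6, 5.0, 7.0, 6.4, 6.9, 5.5, 6.5],
    "SepalWidthCm": [3.5, 3.0, 3.2, 3.1, 3.6, 3.2, 3.2, 3.1, 2.3, 2.8],
    "PetalLengthCm": [1.4, 1.4, 1.3, 1.5, 1.4, 4.7, 4.5, 4.9, 4.0, 4.6],
    "PetalWidthCm": [0.2, 0.2, 0.2, 0.2, 0.2, 1.4, 1.5, 1.5, 1.3, 1.5],
    "Species": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
})

# Name the measurement columns explicitly so Id can never slip into X.
feature_cols = ["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"]
X = sample[feature_cols].values
y = sample["Species"].values

# stratify=y keeps the class proportions the same in the train and test parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(X_train.shape, X_test.shape)
```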

Creating the SVM Model

svm_model = SVC(kernel='linear', C=1.0, random_state=42)

Model fitting and Prediction

svm_model.fit(X_train, y_train)

y_pred = svm_model.predict(X_test)

Evaluation Metrics

accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
print("\nClassification Report:")
print(classification_report(y_test, y_pred))

Accuracy: 1.0

Classification Report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      1.00      1.00         9
           2       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30
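
A score from a single 30-row test split can depend on which rows happen to land in it. One way to check this is k-fold cross-validation. The sketch below is not the notebook's method. It assumes scikit-learn's bundled Iris data (four measurements, no Id column) and adds a StandardScaler step, which the original model does not have.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: scikit-learn's bundled Iris data (four measurements, no Id column).
X, y = load_iris(return_X_y=True)

# The scaler sits inside the pipeline, so it is refit on each training fold
# and no statistics from the held-out fold leak into training.
model = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
scores = cross_val_score(model, X, y, cv=5)  # one accuracy value per fold
print(scores.mean(), scores.std())
```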

Confusion Matrix

conf_matrix = confusion_matrix(y_test, y_pred)
print("\nConfusion Matrix:")
print(conf_matrix)

Confusion Matrix:
[[10  0  0]
 [ 0  9  0]
 [ 0  0 11]]
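
The figures in the classification report can be recovered from this matrix directly. In scikit-learn's convention, rows are actual classes and columns are predicted classes. A short worked sketch using the matrix printed above:

```python
import numpy as np

# Confusion matrix from the output above: rows = actual, columns = predicted.
conf_matrix = np.array([[10, 0, 0],
                        [0, 9, 0],
                        [0, 0, 11]])

# Accuracy: correctly classified samples (the diagonal) over all samples.
accuracy = np.trace(conf_matrix) / conf_matrix.sum()
# Recall: diagonal over row sums (how many of each actual class were found).
recall = np.diag(conf_matrix) / conf_matrix.sum(axis=1)
# Precision: diagonal over column sums (how many predictions of each class were right).
precision = np.diag(conf_matrix) / conf_matrix.sum(axis=0)
print(accuracy, recall, precision)
```

The row sums (10, 9, 11) are the support values in the report.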

Heatmap

sns.heatmap(conf_matrix, annot=True, cmap="YlGnBu", fmt='g')
plt.title("Confusion Matrix")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.show()

K-Means Implementation

import numpy as np

df=pd.read_csv('/content/Iris.csv')
df.head()

   Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm      Species
0   1            5.1           3.5            1.4           0.2  Iris-setosa
1   2            4.9           3.0            1.4           0.2  Iris-setosa
2   3            4.7           3.2            1.3           0.2  Iris-setosa
3   4            4.6           3.1            1.5           0.2  Iris-setosa
4   5            5.0           3.6            1.4           0.2  Iris-setosa


df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 6 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Id 150 non-null int64
1 SepalLengthCm 150 non-null float64
2 SepalWidthCm 150 non-null float64
3 PetalLengthCm 150 non-null float64
4 PetalWidthCm 150 non-null float64
5 Species 150 non-null object
dtypes: float64(4), int64(1), object(1)
memory usage: 7.2+ KB

df.drop(['Id'] ,axis=1, inplace=True)

df.isnull().sum()

SepalLengthCm    0
SepalWidthCm     0
PetalLengthCm    0
PetalWidthCm     0
Species          0
dtype: int64

df.describe()

       SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm
count     150.000000    150.000000     150.000000    150.000000
mean        5.843333      3.054000       3.758667      1.198667
std         0.828066      0.433594       1.764420      0.763161
min         4.300000      2.000000       1.000000      0.100000
25%         5.100000      2.800000       1.600000      0.300000
50%         5.800000      3.000000       4.350000      1.300000
75%         6.400000      3.300000       5.100000      1.800000
max         7.900000      4.400000       6.900000      2.500000

df.head()

   SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm      Species
0            5.1           3.5            1.4           0.2  Iris-setosa
1            4.9           3.0            1.4           0.2  Iris-setosa
2            4.7           3.2            1.3           0.2  Iris-setosa
3            4.6           3.1            1.5           0.2  Iris-setosa
4            5.0           3.6            1.4           0.2  Iris-setosa


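from sklearn.cluster import KMeans

# Use the four measurement columns only (Species is excluded)
df_imp = df.iloc[:, 0:4]
k_meansclus = range(1, 10)
sse = []

for k in k_meansclus:
    # Fit K-Means with k clusters and record the within-cluster sum of squares
    km = KMeans(n_clusters=k, random_state=0)  # fixed seed so the curve is reproducible
    km.fit(df_imp)
    sse.append(km.inertia_)

plt.title('The Elbow Method')
plt.xlabel('Number of clusters (k)')
plt.ylabel('SSE (inertia)')
plt.plot(k_meansclus, sse)
plt.show()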
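
The quantity plotted on the elbow curve, inertia_, is the sum of squared distances from each sample to the centre of its assigned cluster. The sketch below recomputes it by hand as a check. It assumes scikit-learn's bundled Iris data in place of Iris.csv, and n_init=10 is set explicitly as an assumption because the default differs between scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Assumption: scikit-learn's bundled Iris measurements stand in for df_imp.
X, _ = load_iris(return_X_y=True)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Look up the centre each sample was assigned to, then sum the squared
# distances between samples and their centres.
assigned_centres = km.cluster_centers_[km.labels_]
sse_by_hand = ((X - assigned_centres) ** 2).sum()
print(sse_by_hand, km.inertia_)
```

Inertia falls as k grows, since more centres always fit the samples at least as closely. That is why the elbow method looks for the bend in the curve rather than its minimum.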


km1 = KMeans(n_clusters=3, max_iter=300, random_state=0)
# fit_predict fits the model and returns each row's cluster index in one call
y_means = km1.fit_predict(df_imp)

km1.cluster_centers_

array([[5.88360656, 2.74098361, 4.38852459, 1.43442623],
       [5.006     , 3.418     , 1.464     , 0.244     ],
       [6.85384615, 3.07692308, 5.71538462, 2.05384615]])
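
K-Means numbers its clusters arbitrarily, so index 0, 1 or 2 carries no species meaning on its own. When true labels are available, one way to name each cluster is a majority vote over a cross-tabulation. The sketch below assumes scikit-learn's bundled Iris data, whose species names lack the "Iris-" prefix. The name cluster_to_species is illustrative, and n_init=10 is an explicit assumption.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Assumption: scikit-learn's bundled Iris data stands in for Iris.csv.
iris = load_iris()
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(iris.data)
species = iris.target_names[iris.target]  # species name for every row

# Rows: cluster index; columns: species; cells: number of flowers.
table = pd.crosstab(clusters, species)
# Majority species per cluster (the column with the largest count in each row).
cluster_to_species = table.idxmax(axis=1).to_dict()
print(table)
print(cluster_to_species)
```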

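df_imp = np.array(df_imp)

# Petal length (column 2) against petal width (column 3), coloured by cluster.
# The species names in the legend are assigned by hand; K-Means only returns indices 0-2.
plt.scatter(df_imp[y_means == 0, 2], df_imp[y_means == 0, 3], color='g', label='Iris-versicolor')
plt.scatter(df_imp[y_means == 1, 2], df_imp[y_means == 1, 3], color='r', label='Iris-setosa')
plt.scatter(df_imp[y_means == 2, 2], df_imp[y_means == 2, 3], color='b', label='Iris-virginica')
plt.xlabel('PetalLengthCm')
plt.ylabel('PetalWidthCm')
plt.legend()
plt.show()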

# Sepal length (column 0) against sepal width (column 1), coloured by cluster
plt.scatter(df_imp[y_means == 0, 0], df_imp[y_means == 0, 1], color='g', label='Iris-versicolor')
plt.scatter(df_imp[y_means == 1, 0], df_imp[y_means == 1, 1], color='r', label='Iris-setosa')
plt.scatter(df_imp[y_means == 2, 0], df_imp[y_means == 2, 1], color='b', label='Iris-virginica')
plt.xlabel('SepalLengthCm')
plt.ylabel('SepalWidthCm')
plt.legend()
plt.show()
