ML 2.3 Prashant

This document summarizes an implementation of K-means clustering on the iris dataset. It loads the dataset, converts the categorical species column to numeric codes, extracts features and labels, fits a K-means model with 3 clusters, obtains the cluster centroids and labels, plots the unclustered and clustered data, and makes a prediction using the model.

Uploaded by

deadm2996

Machine Learning

Worksheet 2.3

Student Name: PRASHANT
UID: 22MCA20999
Branch: UIC - MCA
Section/Group: 3/B
Subject Code: 22CAP-702
Date of Performance: 18/10/2023

Question: Implementation of K-means clustering Algorithm.

Importing necessary libraries:

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

Answer:

1. Loading dataset:

iris = sns.load_dataset('iris')
labels = iris.species.unique()
iris.head()

sepal_length  sepal_width  petal_length  petal_width  species
5.1           3.5          1.4           0.2          setosa
4.9           3.0          1.4           0.2          setosa
4.7           3.2          1.3           0.2          setosa
4.6           3.1          1.5           0.2          setosa
5.0           3.6          1.4           0.2          setosa

2. Converting categorical data to numeric value:

iris["species"] = pd.Categorical(iris["species"])
iris["species"] = iris["species"].cat.codes
iris.head()

sepal_length  sepal_width  petal_length  petal_width  species
5.1           3.5          1.4           0.2          0
4.9           3.0          1.4           0.2          0
4.7           3.2          1.3           0.2          0
4.6           3.1          1.5           0.2          0
5.0           3.6          1.4           0.2          0
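
Because the integer codes overwrite the species names, it can help to keep the code-to-name mapping. The following is a minimal sketch on a small hand-made Series; the names `species` and `code_to_name` are illustrative and do not appear in the worksheet.

```python
import pandas as pd

# Illustrative stand-in for the iris "species" column (not the full dataset).
species = pd.Series(["setosa", "versicolor", "virginica", "setosa"])

cat = pd.Categorical(species)
codes = cat.codes                                # one integer code per row
code_to_name = dict(enumerate(cat.categories))   # code -> original name

print(codes.tolist())
print(code_to_name)
```

For string data the categories are the sorted unique values, so the codes follow alphabetical order of the names.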

3. Extracting feature and label data in form of X and Y:

X = iris[['sepal_length','sepal_width']].values
y = iris.species

4. Creating K-Means Clustering Model:

from sklearn.cluster import KMeans
model = KMeans(n_clusters = 3).fit(X)
centers = model.cluster_centers_
new_labels = model.labels_

print('Centroids :',centers)
print('\nLabels :',new_labels)

Output:

Centroids : [[5.77358491 2.69245283]
 [6.81276596 3.07446809]
 [5.006      3.428     ]]

Labels : [2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 0 1 0 1 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
 1 1 1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 1 0 1 1 1 1
 1 1 0 0 1 1 1 1 0 1 0 1 0 1 1 0 0 1 1 1 1 1 0 0 1 1 1 0 1 1 1 0 1 1 1 0 1
 1 0]

5. Plotting Unclustered and Clustered Data:

plt.figure(figsize=(8,8))
plt.scatter(X[:, 0], X[:, 1], c=y, s=60)
plt.xlabel('Sepal length', fontsize=18)
plt.ylabel('Sepal width', fontsize=18)
plt.title('Unclustered Data', fontsize=18)

Text(0.5, 1.0, 'Unclustered Data')

plt.figure(figsize=(8,6))
plt.scatter(X[:, 0], X[:, 1], c=new_labels, s=60)
plt.scatter(centers[:, 0], centers[:, 1], c='r', s=400, marker='*', zorder=10)
plt.xlabel('Sepal length', fontsize=18)
plt.ylabel('Sepal width', fontsize=18)
plt.title('Clustered Data', fontsize=18)

Text(0.5, 1.0, 'Clustered Data')

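6. Predicting Label Using Model:

y_pred = model.predict([[2.3,5.6]])
print("Result :",labels[y_pred[0]])

Output:

Result : virginica

Note: model.predict returns a cluster index, not a species code, and K-means numbers its clusters arbitrarily. In the run above the point falls in cluster 2, whose centroid is [5.006, 3.428] and which contains the first 50 (setosa) samples, so the name virginica printed here comes only from the index order of labels and should not be read as a species prediction.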
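
Since K-means assigns cluster indices in no particular order, one common way to attach a species name to a cluster is to take the most frequent known species among its members. The sketch below illustrates this with the standard library only; the function `cluster_to_majority` and the toy lists are hypothetical and are not part of the worksheet.

```python
from collections import Counter

def cluster_to_majority(cluster_ids, true_names):
    """Map each cluster index to the most common true label inside it."""
    members = {}
    for cid, name in zip(cluster_ids, true_names):
        members.setdefault(cid, Counter())[name] += 1
    return {cid: counts.most_common(1)[0][0] for cid, counts in members.items()}

# Made-up toy assignments, only to show the shape of the inputs.
toy_clusters = [2, 2, 2, 0, 0, 1, 1, 1, 0]
toy_species = ["setosa", "setosa", "setosa",
               "versicolor", "versicolor",
               "virginica", "virginica", "versicolor", "versicolor"]

mapping = cluster_to_majority(toy_clusters, toy_species)
print(mapping[2])
```

With the worksheet's variables, the inputs would be new_labels and the species names per row (for example labels[code] for each code in y, since the species column was overwritten with codes in step 2). A predicted cluster index could then be looked up in the resulting mapping instead of in labels directly.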
