K Means Algorithm

This document applies K-means clustering to the iris dataset. It loads and explores the data, uses the elbow method to choose the number of clusters, runs K-means with 3 clusters, and plots the clustered data points together with the cluster centers.

Uploaded by

ROHAN G A

K-means algorithm

Program by ROHAN G A
In [1]: import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

In [23]: df=pd.read_csv("IRIS.csv")
df.head(10)

Out[23]:
   sepal_length  sepal_width  petal_length
0           5.1          3.5           1.4
1           4.9          3.0           1.4
2           4.7          3.2           1.3
3           4.6          3.1           1.5
4           5.0          3.6           1.4
5           5.4          3.9           1.7
6           4.6          3.4           1.4
7           5.0          3.4           1.5
8           4.4          2.9           1.4
9           4.9          3.1           1.5

In [3]: df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   sepal_length  150 non-null    float64
 1   sepal_width   150 non-null    float64
 2   petal_length  150 non-null    float64
 3   petal_width   150 non-null    float64
 4   species       150 non-null    object
dtypes: float64(4), object(1)
memory usage: 6.0+ KB

In [4]: df.shape

Out[4]: (150, 5)

In [30]: df.describe()

Out[30]:
       sepal_length  sepal_width  petal_le
count    150.000000   150.000000   150.000
mean       5.843333     3.054000     3.758
std        0.828066     0.433594     1.764
min        4.300000     2.000000     1.000
25%        5.100000     2.800000     1.600
50%        5.800000     3.000000     4.350
75%        6.400000     3.300000     5.100
max        7.900000     4.400000     6.900

In [5]: df.columns

Out[5]: Index(['sepal_length', 'sepal_width', 'petal_length', 'petal_width',
       'species'],
      dtype='object')

In [6]: x=df.iloc[:,[0,1,2,3]].values

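In [24]: from sklearn.cluster import KMeans

# Elbow method: record the within-cluster sum of squares (WCSS) for k = 1..10
wcss = []
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i)  # remaining arguments ("i…") are cut off in the source
    kmeans.fit(x)
    wcss.append(kmeans.inertia_)
plt.plot(range(1, 11), wcss)
plt.title('the elbow method')
plt.xlabel('number of clusters')
plt.ylabel('wcss')
plt.show()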
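
The elbow loop above stores `kmeans.inertia_` for each candidate number of clusters. As a rough illustration of what that quantity measures, the sketch below computes the within-cluster sum of squares by hand with NumPy. The helper name `within_cluster_ss` and the four toy points are assumptions made for this example and are not part of the original notebook.

```python
import numpy as np

def within_cluster_ss(points, labels, centers):
    """Sum of squared Euclidean distances from each point to its assigned center."""
    diffs = points - centers[labels]  # centers[labels] picks each point's own center
    return float(np.sum(diffs ** 2))

# Hypothetical toy data: two tight pairs of points far apart from each other.
toy_points = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])

# One cluster: every point is assigned to the overall mean.
labels_one = np.zeros(len(toy_points), dtype=int)
centers_one = toy_points.mean(axis=0, keepdims=True)

# Two clusters: each pair is assigned to its own mean.
labels_two = np.array([0, 0, 1, 1])
centers_two = np.array([[0.0, 1.0], [10.0, 1.0]])

wcss_one = within_cluster_ss(toy_points, labels_one, centers_one)
wcss_two = within_cluster_ss(toy_points, labels_two, centers_two)
print(wcss_one, wcss_two)
```

The value drops sharply when the second cluster is added. The elbow plot looks for the point where adding more clusters stops producing drops of that size.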

In [20]: kmeans = KMeans(n_clusters=3, max_iter=300)  # other arguments (one beginning "n_i…") are cut off in the source
y_kmeans = kmeans.fit_predict(x)
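
To make the `fit_predict` step less of a black box, here is a minimal NumPy sketch of the basic assign-then-update loop (Lloyd's algorithm) that K-means is built on. It is not scikit-learn's implementation, which adds smarter initialisation and other refinements. The function name `kmeans_sketch`, the `seed` parameter, and the toy data are assumptions made for this illustration.

```python
import numpy as np

def assign_labels(points, centers):
    """Index of the nearest center for every point."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans_sketch(points, k, max_iter=300, seed=0):
    """Plain Lloyd iterations starting from k randomly chosen data points."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iter):
        labels = assign_labels(points, centers)
        # Move each center to the mean of its points; keep it in place if it has none.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break  # assignments can no longer change
        centers = new_centers
    return assign_labels(points, centers), centers

# Hypothetical toy data: two well-separated groups on a line.
toy = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [11.0, 0.0]])
toy_labels, toy_centers = kmeans_sketch(toy, k=2)
print(toy_labels, toy_centers)
```

Like the notebook's `y_kmeans`, the returned labels are arbitrary integers: which group is called 0 and which is called 1 depends on the starting centers.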

In [25]: plt.scatter(x[y_kmeans==0,0],x[y_…
    s=100,c='red',label='…
plt.scatter(x[y_kmeans==1,0],x[y_…
    s=100,c='blue',label=…
plt.scatter(x[y_kmeans==2,0],x[y_…
    s=100,c='yellow',labe…
plt.scatter(kmeans.cluster_cente…
    s=100,c='yellow',labe…
plt.legend()
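
Because the plotting cell is cut off in the source, the following is only a sketch of one way such a plot could be written, not a reconstruction of the original code. It plots the first two feature columns, one colour per cluster, and draws the centers in a contrasting colour so they stay visible. The function name `plot_clusters`, the colours, the labels, and the toy arrays are all assumptions. In the notebook, `x`, `y_kmeans`, and `kmeans.cluster_centers_` would take the place of the toy arrays.

```python
import matplotlib.pyplot as plt
import numpy as np

def plot_clusters(points, labels, centers, colors=("red", "blue", "green")):
    """Scatter the first two feature columns, one colour per cluster, plus the centers."""
    fig, ax = plt.subplots()
    for j, color in enumerate(colors):
        mask = labels == j
        ax.scatter(points[mask, 0], points[mask, 1], s=100, c=color, label=f"cluster {j}")
    # A contrasting colour and marker keep the centers distinct from every cluster.
    ax.scatter(centers[:, 0], centers[:, 1], s=200, c="black", marker="X", label="centroids")
    ax.legend()
    return ax

# Hypothetical stand-ins for x, y_kmeans and kmeans.cluster_centers_.
demo_points = np.array([[5.0, 3.4], [4.9, 3.0], [6.0, 2.8], [6.2, 2.9], [7.0, 3.1], [7.2, 3.2]])
demo_labels = np.array([0, 0, 1, 1, 2, 2])
demo_centers = np.array([demo_points[demo_labels == j].mean(axis=0) for j in range(3)])

ax = plot_clusters(demo_points, demo_labels, demo_centers)
```

Calling `plt.show()` afterwards displays the figure when running outside a notebook.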

Out[25]: <matplotlib.legend.Legend at 0x190b2df2b50>
