Kmeans

The document performs k-means clustering on a dataset containing individuals' ages and incomes. It loads and explores the data, scales the features, runs k-means clustering for different values of k, and plots the results. This includes plotting the clustered data, cluster centroids, and a graph of sum of squared errors for different k values to evaluate the optimal number of clusters.

Uploaded by

patil samrudhi

from sklearn.cluster import KMeans
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from matplotlib import pyplot as plt
%matplotlib inline

df = pd.read_csv("income.csv")
df.head()

        Name  Age  Income
0     Rutuja   27   70000
1  Samruddhi   29   90000
2  Shubhangi   29   61000
3  Pratiksha   28   60000
4      Mohan   42  150000

plt.scatter(df.Age,df['Income'])
plt.xlabel('Age')
plt.ylabel('Income')

Text(0, 0.5, 'Income')

km = KMeans(n_clusters=3)
y_predicted = km.fit_predict(df[['Age','Income']])
y_predicted

/lib/python3.11/site-packages/sklearn/cluster/_kmeans.py:870: FutureWarning: The default value of `n_init` will change from 10 to 'auto' in 1.4. Set the value of `n_init` explicitly to suppress the warning
  warnings.warn(

array([2, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 2])
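
The FutureWarning above goes away once `n_init` is passed explicitly, and fixing `random_state` keeps the cluster numbering stable between runs (the label values themselves are arbitrary). A minimal sketch, assuming scikit-learn and NumPy are installed; the nine points are hypothetical stand-ins, not rows from income.csv:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical, well-separated 2-D points (three groups of three);
# these are NOT rows from income.csv.
X = np.array([
    [0.05, 0.20], [0.10, 0.15], [0.15, 0.10],
    [0.90, 0.90], [0.95, 0.85], [0.85, 0.95],
    [0.80, 0.15], [0.85, 0.20], [0.90, 0.10],
])

# n_init is set explicitly (10 was the default when the warning was issued),
# and random_state makes repeated runs return the same labels.
km = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = km.fit_predict(X)
print(labels)
```

The same two arguments can be added to every `KMeans(...)` call in this notebook, including the one inside the elbow loop.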

df['cluster']=y_predicted

df.head()

        Name  Age  Income  cluster
0     Rutuja   27   70000        2
1  Samruddhi   29   90000        2
2  Shubhangi   29   61000        1
3  Pratiksha   28   60000        1
4      Mohan   42  150000        0

km.cluster_centers_

array([[3.82857143e+01, 1.50000000e+05],
[3.26000000e+01, 5.59500000e+04],
[3.16666667e+01, 8.00000000e+04]])
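
In the centroids above, the Income coordinates differ by tens of thousands while the Age coordinates differ by only a few years, so the Euclidean distance is decided almost entirely by Income. A minimal sketch of that effect, using the five rows shown by `df.head()` and the printed centroids; the helper name `nearest` is made up for illustration:

```python
# First five rows shown by df.head(): (Age, Income)
points = [(27, 70000), (29, 90000), (29, 61000), (28, 60000), (42, 150000)]

# Centroids as printed by km.cluster_centers_: (Age, Income)
centroids = [(38.2857143, 150000.0), (32.6, 55950.0), (31.6666667, 80000.0)]


def nearest(point, use_age=True):
    """Index of the closest centroid by squared Euclidean distance.

    With use_age=False the Age term is dropped and only Income is compared.
    """
    def sq_dist(c):
        d_income = (point[1] - c[1]) ** 2
        d_age = (point[0] - c[0]) ** 2 if use_age else 0.0
        return d_age + d_income
    return min(range(len(centroids)), key=lambda i: sq_dist(centroids[i]))


labels_both = [nearest(p) for p in points]
labels_income_only = [nearest(p, use_age=False) for p in points]

print(labels_both)
print(labels_income_only)
```

Both lists agree with the `cluster` column shown for these five rows, so ignoring Age changes nothing here. That is the motivation for scaling the features before clustering.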

df1 = df[df.cluster==0]
df2 = df[df.cluster==1]
df3 = df[df.cluster==2]
plt.scatter(df1.Age,df1['Income'],color='green')
plt.scatter(df2.Age,df2['Income'],color='red')
plt.scatter(df3.Age,df3['Income'],color='black')
plt.scatter(km.cluster_centers_[:,0],km.cluster_centers_[:,1],color='purple',marker='*',label='centroid')
plt.xlabel('Age')
plt.ylabel('Income')
plt.legend()

<matplotlib.legend.Legend at 0x658dfd0>

scaler = MinMaxScaler()

scaler.fit(df[['Income']])
df['Income'] = scaler.transform(df[['Income']])

scaler.fit(df[['Age']])
df['Age'] = scaler.transform(df[['Age']])

df.head()

        Name       Age    Income  cluster
0     Rutuja  0.058824  0.213675        2
1  Samruddhi  0.176471  0.384615        2
2  Shubhangi  0.176471  0.136752        1
3  Pratiksha  0.117647  0.128205        1
4      Mohan  0.941176  0.897436        0
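
`MinMaxScaler` maps each column to [0, 1] with x' = (x - min) / (max - min). The scaled values above are consistent with Age ranging from 26 to 43 and Income from 45000 to 162000. Those extremes are inferred from the printed rows, not read from income.csv, so treat them as assumptions. A small check in plain Python:

```python
# ASSUMPTION: column extremes inferred from the printed head() values,
# since income.csv itself is not available here.
AGE_MIN, AGE_MAX = 26, 43
INCOME_MIN, INCOME_MAX = 45000, 162000


def min_max(x, lo, hi):
    """Min-max scaling: maps lo to 0.0 and hi to 1.0."""
    return (x - lo) / (hi - lo)


# (Name, Age, Income) for the first five rows before scaling
rows = [
    ("Rutuja", 27, 70000),
    ("Samruddhi", 29, 90000),
    ("Shubhangi", 29, 61000),
    ("Pratiksha", 28, 60000),
    ("Mohan", 42, 150000),
]

scaled = [
    (name,
     round(min_max(age, AGE_MIN, AGE_MAX), 6),
     round(min_max(income, INCOME_MIN, INCOME_MAX), 6))
    for name, age, income in rows
]
for row in scaled:
    print(row)
```

Because the scaler works column by column, both features can also be scaled in one call, e.g. `df[['Age','Income']] = scaler.fit_transform(df[['Age','Income']])`, with the same result as the two separate fits above.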

plt.scatter(df.Age,df['Income'])

<matplotlib.collections.PathCollection at 0x658e4f0>

km = KMeans(n_clusters=3)
y_predicted = km.fit_predict(df[['Age','Income']])
y_predicted

array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2])
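
Plotting the clusters found on the scaled data follows the same pattern as the earlier scatter plot. A minimal sketch, assuming pandas and matplotlib are available; the nine points and their labels are hypothetical stand-ins for the scaled `df`, and the centroids are computed as per-cluster means rather than read from `km.cluster_centers_`:

```python
import pandas as pd
from matplotlib import pyplot as plt

# Hypothetical scaled data with cluster labels already assigned;
# in the notebook this would be df after df['cluster'] = y_predicted.
df_demo = pd.DataFrame({
    "Age":     [0.05, 0.10, 0.15, 0.90, 0.95, 0.85, 0.80, 0.85, 0.90],
    "Income":  [0.20, 0.15, 0.10, 0.90, 0.85, 0.95, 0.15, 0.20, 0.10],
    "cluster": [1, 1, 1, 0, 0, 0, 2, 2, 2],
})

fig, ax = plt.subplots()

# One scatter call per cluster, same colors as the unscaled plot
for label, color in zip([0, 1, 2], ["green", "red", "black"]):
    part = df_demo[df_demo.cluster == label]
    ax.scatter(part.Age, part["Income"], color=color)

# Stand-in for km.cluster_centers_: for a converged k-means fit,
# each centroid is the mean of the points assigned to it.
centers = df_demo.groupby("cluster")[["Age", "Income"]].mean()
ax.scatter(centers["Age"], centers["Income"],
           color="purple", marker="*", label="centroid")

ax.set_xlabel("Age")
ax.set_ylabel("Income")
ax.legend()
```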

plt.ylabel('Income')
plt.legend()

<matplotlib.legend.Legend at 0x67a30c0>

km.cluster_centers_

array([[0.72268908, 0.8974359 ],
[0.11029412, 0.12232906],
[0.8 , 0.17094017]])

sse = []
k_rng = range(1,10)
for k in k_rng:
    km = KMeans(n_clusters=k)
    km.fit(df[['Age','Income']])
    sse.append(km.inertia_)

/lib/python3.11/site-packages/sklearn/cluster/_kmeans.py:870: FutureWarning: The default value of `n_init` will change from 10 to 'auto' in 1.4. Set the value of `n_init` explicitly to suppress the warning
  warnings.warn(

plt.xlabel('K')
plt.ylabel('Sum of squared error')
plt.plot(k_rng,sse)

[<matplotlib.lines.Line2D at 0x6c45878>]

sse

[5.116391204288242,
1.8517482098747688,
0.4938829874100704,
0.36790934699671735,
0.2632376117287616,
0.1931260970868519,
0.13764115392926798,
0.10882448152285196,
0.07871979164270587]
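
The elbow is the value of K after which adding clusters stops reducing the SSE by much. One simple way to locate it numerically, offered here as a heuristic and not as the notebook's own method, is to compare each drop in SSE with the drop that follows it and pick the K with the largest ratio. The `sse` values are copied from the output above:

```python
# SSE values for K = 1..9, copied from the notebook output
sse = [
    5.116391204288242, 1.8517482098747688, 0.4938829874100704,
    0.36790934699671735, 0.2632376117287616, 0.1931260970868519,
    0.13764115392926798, 0.10882448152285196, 0.07871979164270587,
]
k_values = list(range(1, 10))

# drops[i] is the SSE reduction gained by going from K=i+1 to K=i+2
drops = [sse[i] - sse[i + 1] for i in range(len(sse) - 1)]

# For K = 2..8: (drop achieved by reaching K) / (drop from going one step further)
ratios = {k_values[i + 1]: drops[i] / drops[i + 1] for i in range(len(drops) - 1)}

elbow_k = max(ratios, key=ratios.get)
print(elbow_k)  # 3
```

For this data the ratio peaks at K = 3, which matches the three clusters used earlier.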
