AI&ML Lab-Ex.9corre

Uploaded by

kpramya19

Ex. No: 9 Implement clustering algorithms

Aim:

To write a program that implements the k-means algorithm to cluster a set of data.

K-Means Algorithm

1. Load the data set.
2. Choose the number of clusters k in advance; the data will be partitioned into k groups.
3. Select k points at random as the initial cluster centers.
4. Assign each object to its closest cluster center according to the
Euclidean distance function.
5. Recompute each cluster center as the centroid (mean) of all objects assigned to that cluster.
6. Repeat steps 4 and 5 until the same points are assigned to each cluster in
consecutive rounds.
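
The steps above can be sketched as follows. This is a minimal NumPy illustration, not the program used in this exercise; the function name kmeans and its parameters max_iter and seed are assumptions introduced here for the sketch.

```python
import numpy as np

def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means with Euclidean distance (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    # Step 3: pick k distinct data points at random as the initial centers.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = None
    for _ in range(max_iter):
        # Step 4: Euclidean distance from every point to every center,
        # then assign each point to the nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # Step 6: stop when the assignment is the same as in the previous round.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step 5: move each center to the mean of the points assigned to it.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers, labels
```

Calling kmeans(points, k=3) on an array of shape (n, 2) returns the final centers and one cluster label per point. Keeping the old center for an empty cluster is one simple choice among several; other implementations re-seed the empty cluster instead.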

Various distance metrics used:
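
The algorithm above names only the Euclidean distance. As a small illustration, the sketch below computes it next to the Manhattan distance; including Manhattan distance is an assumption made here because it is a commonly used alternative, and the function names euclidean and manhattan are hypothetical helpers.

```python
import math

def euclidean(a, b):
    # Square root of the sum of squared coordinate differences.
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def manhattan(a, b):
    # Sum of absolute coordinate differences.
    return sum(abs(ai - bi) for ai, bi in zip(a, b))

p, q = (0, 0), (3, 4)   # two illustrative points, not taken from the exercise
print(euclidean(p, q))  # 5.0
print(manhattan(p, q))  # 7
```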


No change was observed between iterations 3 and 4, so the algorithm has converged. Clustering
identified two groups: 15-28 and 35-65. The initial choice of centroids can affect the resulting
clusters, so the algorithm is often run several times with different starting centroids and the
results are compared.
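
The effect of the starting centroids can be checked by running k-means several times with different seeds and comparing the inertia (the within-cluster sum of squared distances) of each run. The sketch below uses scikit-learn; the toy points and the choice of five seeds are assumptions made for illustration and are not data from this exercise.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical toy data (not from the exercise): three small groups of 2-D points.
points = np.array([
    [1, 1], [2, 1], [1, 2], [2, 2],
    [8, 8], [9, 8], [8, 9], [9, 9],
    [1, 9], [2, 9], [1, 8], [2, 8],
], dtype=float)

results = []
for seed in range(5):
    # init="random" with n_init=1 performs exactly one run from random starting centers.
    km = KMeans(n_clusters=3, init="random", n_init=1, random_state=seed).fit(points)
    results.append((km.inertia_, seed))
    print(f"seed={seed}  inertia={km.inertia_:.2f}")

# The run with the lowest inertia is the tightest clustering that was found.
best_inertia, best_seed = min(results)
print("lowest inertia:", round(best_inertia, 2), "from seed", best_seed)
```

Setting n_init to a value greater than 1 asks scikit-learn to do this repetition internally and keep the run with the lowest inertia.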
Problem:

Cluster the following thirty points (with (x, y) representing locations) into three clusters:

'x': [25, 34, 22, 27, 33, 33, 31, 22, 35, 34, 67, 54, 57, 43, 50,
      57, 59, 52, 65, 47, 49, 48, 35, 33, 44, 45, 38, 43, 51, 46],

'y': [79, 51, 53, 78, 59, 74, 73, 57, 69, 75, 51, 32, 40, 47, 53,
      36, 35, 58, 59, 50, 25, 20, 14, 12, 20, 5, 29, 27, 8, 7]

Program:

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# The thirty (x, y) points from the problem statement.
data = {
    'x': [25, 34, 22, 27, 33, 33, 31, 22, 35, 34, 67, 54, 57, 43, 50,
          57, 59, 52, 65, 47, 49, 48, 35, 33, 44, 45, 38, 43, 51, 46],
    'y': [79, 51, 53, 78, 59, 74, 73, 57, 69, 75, 51, 32, 40, 47, 53,
          36, 35, 58, 59, 50, 25, 20, 14, 12, 20, 5, 29, 27, 8, 7],
}

df = pd.DataFrame(data)

# Fit k-means with three clusters; n_init=10 keeps the best of ten differently seeded runs.
kmeans = KMeans(n_clusters=3, n_init=10).fit(df)
centroids = kmeans.cluster_centers_
print(centroids)

# Plot the points coloured by cluster label, then overlay the centroids in red.
plt.scatter(df['x'], df['y'], c=kmeans.labels_.astype(float), s=50, alpha=0.5)
plt.scatter(centroids[:, 0], centroids[:, 1], c='red', s=50)
plt.show()

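Output (the order of the rows may differ between runs, because the cluster numbering depends on the random initialization):

[[29.6 66.8]
 [43.2 16.7]
 [55.1 46.1]]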
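
As a check on step 5 of the algorithm, the reported centroids should equal the mean of the points in each cluster. The sketch below recomputes those means with pandas for the thirty points of the problem; random_state=0 and n_init=10 are assumptions added here so that the run is repeatable, and the variable names labelled and cluster_means are introduced only for this illustration.

```python
import pandas as pd
from sklearn.cluster import KMeans

df = pd.DataFrame({
    'x': [25, 34, 22, 27, 33, 33, 31, 22, 35, 34, 67, 54, 57, 43, 50,
          57, 59, 52, 65, 47, 49, 48, 35, 33, 44, 45, 38, 43, 51, 46],
    'y': [79, 51, 53, 78, 59, 74, 73, 57, 69, 75, 51, 32, 40, 47, 53,
          36, 35, 58, 59, 50, 25, 20, 14, 12, 20, 5, 29, 27, 8, 7],
})

# random_state=0 is an assumption: it only fixes the seed for repeatability.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(df)

# Attach each point's cluster label, then average x and y within each cluster.
labelled = df.assign(cluster=kmeans.labels_)
cluster_means = labelled.groupby("cluster")[["x", "y"]].mean()

print(cluster_means.round(1))                            # per-cluster means
print(kmeans.cluster_centers_.round(1))                  # centroids reported by scikit-learn
print(labelled["cluster"].value_counts().sort_index())   # number of points per cluster
```

The first two printouts should agree row by row, since row i of cluster_centers_ belongs to cluster label i.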
