Practical 03
Description:
Data clustering is a powerful technique used to group similar data points together. Here’s a
practical guide to performing clustering using Python, specifically with the `scikit-learn` library.
2. Import Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
Visualizing the data can help understand the structure before clustering:
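A minimal sketch of this step, assuming synthetic data generated with `make_blobs`. The sample count, number of centers, spread, and seed below are illustrative choices, not values given in this practical; the resulting array is named `X` to match the code that follows.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs

# Assumed synthetic dataset: 300 two-dimensional points around 4 centers
X, y_true = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=42)

# Scatter plot of the raw, unlabeled points
plt.figure(figsize=(8, 4))
plt.scatter(X[:, 0], X[:, 1], s=30)
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('Generated Data')
plt.show()
```

If you work with your own data instead, load it into a two-dimensional array named `X` (rows are samples, columns are features) and plot two of its columns in the same way.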
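To choose the number of clusters, you can use the Elbow method: fit K-Means for a range of k values, plot the inertia (the within-cluster sum of squared distances) against k, and pick the k where the curve bends and further increases bring little improvement: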
inertia = []
K = range(1, 11)
for k in K:
    # Fit K-Means for each candidate k and record its inertia
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=42)
    kmeans.fit(X)
    inertia.append(kmeans.inertia_)
plt.figure(figsize=(8, 4))
plt.plot(K, inertia, 'bx-')
plt.xlabel('Number of clusters K')
plt.ylabel('Inertia')
plt.title('Elbow Method For Optimal K')
plt.show()
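Once the elbow plot suggests a value of k, the final model can be fitted and checked with the silhouette score imported earlier. The following is a sketch under stated assumptions rather than a definitive procedure: the data is regenerated with `make_blobs` so the block runs on its own, and `k = 4` is an assumed choice that matches the assumed number of centers, not a result from this practical.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Assumed synthetic dataset (same illustrative settings as a typical blob example)
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=42)

# Assumed k, read off the elbow plot
k = 4
kmeans = KMeans(n_clusters=k, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)  # cluster index for every sample

# Silhouette score lies in [-1, 1]; values closer to 1 mean better-separated clusters
score = silhouette_score(X, labels)
print(f"k={k}, silhouette score={score:.3f}")
```

Repeating this for several values of k and comparing the scores gives a second opinion alongside the elbow plot. The silhouette score needs at least two clusters, so start from k = 2.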
Conclusion
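In this practical, you learned how to perform clustering using K-Means in Python and how to choose the number of clusters with the Elbow method. Tuning parameters such as the number of clusters and preprocessing your data (for example, scaling the features) can improve clustering results.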