Clustering Code Explanation
Import Libraries:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
Sample Data:
data = {
    'Player': ['Player 1', 'Player 2', 'Player 3', 'Player 4', 'Player 5',
               'Player 6', 'Player 7', 'Player 8', 'Player 9', 'Player 10'],
    'Runs Scored': [350, 280, 420, 200, 320, 380, 240, 400, 310, 360],
    'Wickets Taken': [15, 10, 20, 5, 12, 18, 8, 17, 14, 16]
}
This is the sample data representing runs scored and wickets taken by cricket players.
Create DataFrame:
df = pd.DataFrame(data)
Select Features:
X = df[['Runs Scored', 'Wickets Taken']]
This selects the features 'Runs Scored' and 'Wickets Taken' for clustering.
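The two selected features sit on different scales (runs in the hundreds, wickets in the tens), so the runs column can dominate the distances K-means relies on. The sketch below is an optional extra step, not part of the original walkthrough: it assumes scikit-learn's StandardScaler is acceptable for this data and rebuilds the same feature table so it runs on its own. The name X_scaled is introduced here for illustration only.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Same feature values as the sample data above.
df = pd.DataFrame({
    'Runs Scored': [350, 280, 420, 200, 320, 380, 240, 400, 310, 360],
    'Wickets Taken': [15, 10, 20, 5, 12, 18, 8, 17, 14, 16],
})
X = df[['Runs Scored', 'Wickets Taken']]

# Rescale each column to mean 0 and unit variance so that neither
# feature dominates the Euclidean distance used by K-means.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)  # hypothetical name, not in the original code

print(X_scaled.mean(axis=0).round(6))
print(X_scaled.std(axis=0).round(6))
```

If X_scaled is passed to K-means instead of X, the cluster centers come back in scaled units; scaler.inverse_transform can map them back to runs and wickets.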
Visualize Data:
plt.scatter(df['Runs Scored'], df['Wickets Taken'])
plt.xlabel('Runs Scored')
plt.ylabel('Wickets Taken')
plt.show()
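Apply K-means Clustering:
k = 4  # Number of clusters
kmeans = KMeans(n_clusters=k, n_init=10, random_state=42)  # fixed seed for repeatable results
kmeans.fit(X)
Get Cluster Centers and Labels:
centroids = kmeans.cluster_centers_
labels = kmeans.labels_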
This retrieves the cluster centers and labels assigned to each data point.
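To make the relationship between centers and labels concrete, the following minimal sketch recomputes the assignment by hand: each player should carry the label of the closest center. It rebuilds the feature matrix as a NumPy array so it runs on its own; the names distances and nearest, and the choice of random_state=0, are assumptions made for this illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Feature matrix: one row per player, columns are runs and wickets.
X = np.column_stack([
    [350, 280, 420, 200, 320, 380, 240, 400, 310, 360],
    [15, 10, 20, 5, 12, 18, 8, 17, 14, 16],
]).astype(float)

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
centroids = kmeans.cluster_centers_
labels = kmeans.labels_

# Euclidean distance from every player to every center: shape (10, 4).
distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)

# Index of the closest center for each player.
nearest = distances.argmin(axis=1)

print(np.array_equal(nearest, labels))
```

The same nearest-center rule is what kmeans.predict applies to new rows, for example a player who was not in the original table.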
Add Cluster Labels to DataFrame:
df['Cluster'] = labels
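Once the labels are stored in the DataFrame, a per-cluster summary can help interpret what each group represents. The sketch below is one possible way to do this with pandas groupby, assuming the same sample data; the summary column names (players, mean_runs, mean_wickets) are made up for this example.

```python
import pandas as pd
from sklearn.cluster import KMeans

df = pd.DataFrame({
    'Player': [f'Player {n}' for n in range(1, 11)],
    'Runs Scored': [350, 280, 420, 200, 320, 380, 240, 400, 310, 360],
    'Wickets Taken': [15, 10, 20, 5, 12, 18, 8, 17, 14, 16],
})
X = df[['Runs Scored', 'Wickets Taken']]

# fit_predict fits the model and returns the label of each row in one call.
df['Cluster'] = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)

# Size and average statistics of each cluster.
summary = df.groupby('Cluster').agg(
    players=('Player', 'count'),
    mean_runs=('Runs Scored', 'mean'),
    mean_wickets=('Wickets Taken', 'mean'),
)
print(summary)
```

The mean_runs and mean_wickets columns should match the cluster centers, since a K-means center is the mean of the points assigned to it.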
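Visualize Clusters:
for i in range(k):
    cluster = df[df['Cluster'] == i]  # players assigned to cluster i
    plt.scatter(cluster['Runs Scored'], cluster['Wickets Taken'], label=f'Cluster {i + 1}')
plt.scatter(centroids[:, 0], centroids[:, 1], marker='x', color='black', s=100, label='Centroids')
plt.xlabel('Runs Scored')
plt.ylabel('Wickets Taken')
plt.legend()
plt.show()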
This visualizes the clusters along with their centroids on a scatter plot.
Print Cluster Centers:
print("Cluster Centers:")
print(centroids)
This prints the coordinates of each cluster center.
This code performs K-means clustering on the given dataset of cricket player statistics and visualizes
the resulting clusters. Adjustments can be made to the number of clusters (k) and the features
selected for clustering as needed.
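One common way to adjust k is the elbow heuristic: fit the model for several values of k and look for the point where the within-cluster sum of squares (exposed by scikit-learn as inertia_) stops dropping sharply. The sketch below is a rough illustration on the same sample data; the range of 1 to 6 clusters and random_state=0 are arbitrary choices for this example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Feature matrix: one row per player, columns are runs and wickets.
X = np.column_stack([
    [350, 280, 420, 200, 320, 380, 240, 400, 310, 360],
    [15, 10, 20, 5, 12, 18, 8, 17, 14, 16],
]).astype(float)

# Fit one model per candidate k and record its inertia
# (sum of squared distances from each point to its assigned center).
inertias = {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_

for k, value in inertias.items():
    print(f"k={k}: inertia={value:.1f}")
```

With only ten players the curve is coarse, so the elbow is a guide rather than a rule; plotting inertias against k with plt.plot makes the bend easier to see.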