Clustering
Example:
Let’s now take an example to understand how K-Means actually works:
We have these 8 points and we want to apply k-means to create clusters for
these points. Here’s how we can do it.
Here, the red and green circles represent the initial centroids of the two clusters.
Step 3: Assign all the points to the closest cluster centroid
Once we have initialized the centroids, we assign each point to the closest
cluster centroid:
Here you can see that the points which are closer to the red point are
assigned to the red cluster whereas the points which are closer to the green
point are assigned to the green cluster.
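The assignment step can be sketched in a few lines of Python. This is only an illustration: it assumes two-dimensional points and Euclidean distance, and the coordinates and the function name assign_points are made up for the example rather than taken from the figure.

```python
import math

def assign_points(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    labels = []
    for p in points:
        # Euclidean distance from this point to every centroid.
        distances = [math.dist(p, c) for c in centroids]
        # The point joins the cluster whose centroid is closest.
        labels.append(distances.index(min(distances)))
    return labels

# Hypothetical data: 8 points and 2 initial centroids (0 = "red", 1 = "green").
points = [(1, 1), (1.5, 2), (2, 1), (1, 0.5), (8, 8), (9, 9), (8, 9.5), (9, 8)]
centroids = [(1, 1), (9, 9)]
labels = assign_points(points, centroids)
```

Each entry of labels is the index of the centroid that the corresponding point was assigned to.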
Here, the red and green crosses are the new centroids.
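In standard k-means, each new centroid is the mean of the points currently assigned to that cluster. A small sketch under that assumption follows; update_centroids and the sample data are illustrative only.

```python
def update_centroids(points, labels, k):
    """Return the mean of the points assigned to each of the k clusters."""
    new_centroids = []
    for j in range(k):
        members = [p for p, label in zip(points, labels) if label == j]
        if not members:
            # This sketch does not try to repair empty clusters.
            raise ValueError(f"cluster {j} has no points")
        # Average each coordinate over the cluster members.
        new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return new_centroids

# Hypothetical data: two points per cluster.
points = [(1, 1), (1, 3), (9, 9), (11, 9)]
labels = [0, 0, 1, 1]
new_centroids = update_centroids(points, labels, k=2)
```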
We can stop the algorithm when the centroids of the newly formed clusters stop
changing. If we keep getting the same centroids for all the clusters from one
iteration to the next, the algorithm is no longer learning any new pattern,
which is a sign to stop the training.
Another clear sign that we should stop the training process is that the points
remain in the same clusters even after running the algorithm for further
iterations.
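Both stopping checks can be put into one loop. The following is a sketch rather than a reference implementation: the data and the name kmeans are hypothetical, and the max_iter cap is an extra safety limit added here as an assumption.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Iterate assignment and update until centroids or assignments stop changing."""
    labels = None
    for iteration in range(1, max_iter + 1):
        # Assignment step: index of the nearest centroid for every point.
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: mean of each cluster (an empty cluster keeps its old centroid).
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)
        # Stopping check 1: the centroids did not move.
        same_centroids = all(
            math.isclose(a, b)
            for old, new in zip(centroids, new_centroids)
            for a, b in zip(old, new)
        )
        # Stopping check 2: no point changed cluster.
        same_labels = new_labels == labels
        centroids, labels = new_centroids, new_labels
        if same_centroids or same_labels:
            break
    return centroids, labels, iteration

# Hypothetical data: 8 points, with both initial centroids placed in the same group.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9)]
final_centroids, final_labels, iterations = kmeans(points, [(1, 1), (2, 2)])
```

On data like this, the two checks usually trigger in the same iteration, since unchanged assignments produce unchanged means.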
In the picture below, you will notice that adding more clusters beyond 3
does not model the data much better. The first clusters add a lot of
information, but at some point the marginal gain starts to drop.
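One common way to produce such a curve is to run k-means for several values of k and record the within-cluster sum of squares (WCSS) each time. The sketch below assumes that measure and uses random restarts; the data, the name kmeans_wcss, and the parameters n_init and seed are all illustrative choices, not part of the original example.

```python
import math
import random

def kmeans_wcss(points, k, n_init=10, max_iter=100, seed=0):
    """Return the lowest within-cluster sum of squares over n_init random starts."""
    rng = random.Random(seed)
    best = math.inf
    for _ in range(n_init):
        # Pick k distinct data points as the initial centroids.
        centroids = rng.sample(points, k)
        for _ in range(max_iter):
            labels = [
                min(range(k), key=lambda j: math.dist(p, centroids[j]))
                for p in points
            ]
            new_centroids = []
            for j in range(k):
                members = [p for p, lab in zip(points, labels) if lab == j]
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                    if members else centroids[j]
                )
            if new_centroids == centroids:
                break  # centroids stopped changing
            centroids = new_centroids
        # Squared distance from each point to its nearest centroid.
        wcss = sum(min(math.dist(p, c) ** 2 for c in centroids) for p in points)
        best = min(best, wcss)
    return best

# Hypothetical data: three small, well-separated groups of points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8), (1, 8), (1, 9), (2, 9)]
wcss_by_k = {k: kmeans_wcss(points, k) for k in range(1, 7)}
```

Plotting wcss_by_k against k and looking for the point where the curve bends is one way to read off a reasonable number of clusters.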