IntroML 8 KmeanClustering
Machine Learning
(5 ECTS)
Overview lecture
• Clustering
• K-means Clustering (hard clustering)
Applications:
- Image classification
- Medical Diagnosis
- …
https://abeyon.com/how-do-machines-learn/
Applications:
- Image segmentation
- Dimensionality reduction
- Clustering
- …
https://abeyon.com/how-do-machines-learn/
Steps:
1- Randomly assign centres for the K clusters (µ(1), µ(2), µ(3), …, µ(K))
2- Calculate the distance of every point in the training data to each cluster centre
3- Assign each datapoint to the nearest cluster centre (c(1), c(2), c(3), …, c(m))
4- Update the centre of each cluster (the mean of the datapoints assigned to that cluster)
5- Repeat steps 2-4
6- Stop when the assignments no longer change
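A minimal NumPy sketch of these steps; the toy data, the function name, and the seed are assumptions for illustration, not from the lecture:

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: randomly pick k datapoints as the initial cluster centres
    centres = X[rng.choice(len(X), size=k, replace=False)]
    assignments = np.full(len(X), -1)
    for _ in range(max_iter):
        # Step 2: distance of every datapoint to every cluster centre
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        # Step 3: assign each datapoint to its nearest centre
        new_assignments = dists.argmin(axis=1)
        # Step 6: stop when the assignments no longer change
        if np.array_equal(new_assignments, assignments):
            break
        assignments = new_assignments
        # Step 4: update each centre to the mean of its assigned datapoints
        for j in range(k):
            if np.any(assignments == j):
                centres[j] = X[assignments == j].mean(axis=0)
    return centres, assignments

# Toy data: two well-separated 2-D blobs (illustrative only)
X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5])
centres, labels = kmeans(X, k=2)
```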
• The cost function is the average within-cluster variance: the mean squared distance between each datapoint and its assigned cluster centre
• Bad initialization of the cluster centres can lead to a poor clustering: repeat the algorithm with different random initializations of the centroids and keep the lowest-cost run
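A short scikit-learn sketch of the cost and the multiple-initialization fix; the placeholder data and the choice of 3 clusters are assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.random.randn(200, 2)  # placeholder data
# n_init=10 reruns K-means with 10 different random centroid initializations
# and keeps the run with the lowest cost. scikit-learn's inertia_ reports the
# summed (rather than averaged) squared distance to the cluster centres.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(km.inertia_)
```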
Solutions for the optimum number of clusters:
- Silhouette criterion
- Calinski-Harabasz criterion
- Gap criterion
https://www.mathworks.com/help/stats/evalclusters.html
Steps:
1- Determine a set of cluster numbers to evaluate; K={2,3,4,…,k}; 2<k<n (number of datapoints)
2- Apply the K-means algorithm until convergence for k clusters
3- Calculate the silhouette value for each datapoint and average over all datapoints
4- Repeat steps 2-3 for each candidate number of clusters
5- Select the number of clusters with the highest average silhouette coefficient
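One way to run these steps with scikit-learn; the make_blobs data and the candidate range 2-6 are assumptions:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)
scores = {}
for k in range(2, 7):  # step 1: candidate cluster numbers
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)  # step 2
    scores[k] = silhouette_score(X, labels)  # step 3: average silhouette value
best_k = max(scores, key=scores.get)         # step 5: highest silhouette
```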
Steps:
1- Determine a set of cluster numbers to evaluate; K={2,3,4,…,k}; 2<k<n (number of datapoints)
2- Apply the K-means algorithm until convergence for k clusters
3- Calculate the Calinski-Harabasz score for each number of clusters
4- Repeat steps 2-3 for each candidate number of clusters
5- Select the number of clusters with the highest Calinski-Harabasz score
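The same loop, scored with the Calinski-Harabasz index instead (data and candidate range again assumed):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import calinski_harabasz_score

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = calinski_harabasz_score(X, labels)
best_k = max(scores, key=scores.get)   # highest Calinski-Harabasz score
```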
Steps:
1- Determine a set of cluster numbers to evaluate; K={2,3,4,…,k}; 2<k<n (number of datapoints)
2- Apply the K-means algorithm until convergence for k clusters on the real data and record the within-cluster dispersion
3- Create B sets of random points drawn from a uniform distribution over the range of the data (i.e., fake datapoints)
4- Apply the K-means algorithm to each fake dataset and record its within-cluster dispersion
5- The gap value is the difference between the average (log) dispersion of the fake datasets and the (log) dispersion of the real data
6- Repeat steps 2-5 for each candidate number of clusters and select the number of clusters with the largest gap value
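A rough sketch of the gap idea, assuming B=10 uniform reference datasets and using K-means inertia as the dispersion measure; this is a simplified illustration, not the exact formulation in the MathWorks reference:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

def dispersion(X, k):
    # within-cluster dispersion: summed squared distance to the cluster centres
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)
rng = np.random.default_rng(0)
B = 10                                       # number of fake (reference) datasets
gaps = {}
for k in range(2, 7):
    log_w_real = np.log(dispersion(X, k))
    # fake datapoints: uniform over the bounding box of the real data
    log_w_fake = [np.log(dispersion(rng.uniform(X.min(axis=0), X.max(axis=0),
                                                size=X.shape), k))
                  for _ in range(B)]
    gaps[k] = np.mean(log_w_fake) - log_w_real
best_k = max(gaps, key=gaps.get)             # largest gap value
```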
We could either use other methods (e.g., Gaussian mixture models) or manipulate the features (e.g., rescaling them, or using other features altogether).
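For instance, a soft-clustering alternative with scikit-learn's GaussianMixture plus feature rescaling might look like this (data and number of components assumed):

```python
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
X = StandardScaler().fit_transform(X)        # rescale the features
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
probs = gmm.predict_proba(X)   # each row: probability of belonging to each cluster
hard = gmm.predict(X)          # hard labels, for comparison with K-means
```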
• Unsupervised
• Each datapoint can belong to more than one cluster (probability assignment)
• Slower than k-means
https://medium.com/geekculture/fuzzy-c-means-clustering-fcm-algorithm-in-machine-learning-c2e51e586fff
C-means Clustering (soft clustering)
How it works:
• Same as K-means, except that the centroid of each cluster is updated using the membership degree (probability) of each datapoint in that cluster
Centroid update: $c_k = \dfrac{\sum_x w_k(x)^m \, x}{\sum_x w_k(x)^m}$
m: fuzziness parameter
$w_k(x)$: degree (probability) of datapoint x belonging to cluster k
• Update degree values: $w_k(x_i) = \dfrac{P(x_i \mid c_k)}{\sum_j P(x_i \mid c_j)}$
• $P(x_i \mid c_k) = \dfrac{1}{\sqrt{2\pi\sigma_k^2}} \exp\!\left(-\dfrac{(x_i - \mu_k)^2}{2\sigma_k^2}\right)$
• $\mu_k = \dfrac{\sum_i w_k(x_i)\, x_i}{\sum_i w_k(x_i)}$ ; $\sigma_k^2 = \dfrac{\sum_i w_k(x_i)\,(x_i - \mu_k)^2}{\sum_i w_k(x_i)}$
https://www.youtube.com/watch?v=iQoXFmbXRJA
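A minimal NumPy sketch of fuzzy C-means using the membership weights w and fuzziness parameter m defined above; the toy data and m=2 are assumptions:

```python
import numpy as np

def fuzzy_cmeans(X, k, m=2.0, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # random initial membership degrees; each row sums to 1
    w = rng.random((len(X), k))
    w /= w.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # centroid update: weighted mean with weights w**m (cf. formula above)
        wm = w ** m
        centres = (wm.T @ X) / wm.sum(axis=0)[:, None]
        # membership update: inverse-distance rule of fuzzy C-means
        # (small epsilon avoids division by zero if a point sits on a centre)
        d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2) + 1e-12
        w = 1.0 / (d ** (2 / (m - 1)))
        w /= w.sum(axis=1, keepdims=True)
    return centres, w

X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 4])
centres, w = fuzzy_cmeans(X, k=2)   # w[i, j]: degree of point i in cluster j
```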