Unit IV
The task of grouping data points based on their similarity with each other is
called Clustering or Cluster Analysis.
This method falls under the branch of Unsupervised Learning, which aims at gaining insights from unlabelled data points; that is, unlike supervised learning, we do not have a target variable.
For example, in the graph given below, we can clearly see that there are 3 circular clusters forming on the basis of distance.
In another example graph, we can see that the clusters formed are not circular in shape.
Types of Clustering
Hard Clustering: In this type of clustering, each data point either belongs to a cluster completely or does not belong to it at all. For example, suppose there are four data points and we have to group them into two clusters; each data point will then belong to either cluster 1 (C1) or cluster 2 (C2).
Data Point    Cluster
A             C1
B             C2
C             C2
D             C1
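As a minimal sketch of hard assignment (assuming scikit-learn is available; the 2-D coordinates for A-D are made up purely for illustration), k-means gives each point exactly one cluster label:

import numpy as np
from sklearn.cluster import KMeans

# Made-up 2-D coordinates for the four points A, B, C, D.
points = np.array([[1.0, 1.0],   # A
                   [9.0, 9.0],   # B
                   [8.5, 9.5],   # C
                   [1.5, 0.5]])  # D

# Hard clustering: fit_predict returns exactly one cluster id per point.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)
print(labels)   # e.g. [0 1 1 0] -> A and D in one cluster, B and C in the other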
Soft Clustering: In this type of clustering, instead of assigning each data point to exactly one cluster, a probability or likelihood of that point belonging to each cluster is evaluated. For example, suppose there are four data points and we have to group them into two clusters; we then evaluate, for every data point, the probability of it belonging to each of the two clusters.
Data Point    Probability of C1    Probability of C2
A             0.91                 0.09
B             0.3                  0.7
C             0.17                 0.83
D             1                    0
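A minimal sketch of soft assignment (again with made-up coordinates and assuming scikit-learn): a Gaussian mixture model returns, for every point, a probability of belonging to each cluster:

import numpy as np
from sklearn.mixture import GaussianMixture

# Made-up coordinates for the four points A, B, C, D.
points = np.array([[1.0, 1.0],   # A
                   [6.0, 6.5],   # B
                   [8.5, 9.5],   # C
                   [0.5, 0.5]])  # D

# Soft clustering: predict_proba returns, for every point, the probability
# of belonging to each of the two clusters (each row sums to 1).
gm = GaussianMixture(n_components=2, random_state=0).fit(points)
print(gm.predict_proba(points).round(2))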
K-means Clustering
K-means partitions the data into k clusters by repeatedly assigning each point to its nearest centroid and then recomputing each centroid as the mean of the points assigned to it.
Example data points: A1(2, 10), A2(2, 5), A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4), A7(1, 2), A8(4, 9)
For instance, if cluster 1 currently contains A1(2, 10) and A8(4, 9), its updated centroid is C1(3, 9.5), the mean of those two points.
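Below is a minimal sketch of running k-means on these eight points (assuming scikit-learn; k = 3 and the random seed are illustrative choices, not taken from the original notes):

import numpy as np
from sklearn.cluster import KMeans

# The eight sample points A1..A8 from above.
X = np.array([[2, 10], [2, 5], [8, 4], [5, 8],
              [7, 5], [6, 4], [1, 2], [4, 9]], dtype=float)

# k = 3 and the random seed are illustrative assumptions.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("labels:", km.labels_)
print("centroids:")
print(km.cluster_centers_)
# Each centroid is the mean of the points assigned to it; for example the
# mean of A1(2, 10) and A8(4, 9) is (3, 9.5).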
K-medoids Clustering
Medoids as Centers: Unlike k-means, which uses the mean of the points in a cluster as the center, k-medoids selects actual data points as the centers (medoids). This makes the cluster centers more interpretable.
1. Choose k random points from the data and assign them as the initial medoids of the k clusters.
2. For each of the remaining data points, calculate the distance to every medoid and assign the point to the cluster of the nearest medoid.
3. Calculate the total cost (the sum of the distances from all the data points to their nearest medoids).
4. Select a random non-medoid point and swap it with one of the current medoids. Repeat steps 2 and 3.
5. If the total cost with the new medoid is less than with the previous medoid, keep the new medoid and repeat step 4.
6. If the total cost with the new medoid is greater than with the previous medoid, undo the swap and repeat step 4.
7. Continue the repetitions until no swap reduces the total cost, i.e., the medoids no longer change.
Cost((3, 4), (2, 6)) = |3 - 2| + |4 - 6| = 1 + 2 = 3 (Manhattan distance)
Total cost = 3 + 4 + 4 + 3 + 1 + 1 + 2 + 2 = 20
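Below is a naive sketch that implements the swap procedure described above with Manhattan distance (the function names and the reuse of the A1-A8 sample points are illustrative assumptions, not a library API):

import numpy as np

def total_cost(X, medoids):
    # Sum of Manhattan distances from every point to its nearest medoid.
    d = np.abs(X[:, None, :] - X[medoids][None, :, :]).sum(axis=2)
    return d.min(axis=1).sum()

def k_medoids(X, k, seed=0):
    rng = np.random.default_rng(seed)
    medoids = list(rng.choice(len(X), size=k, replace=False))   # step 1
    best = total_cost(X, medoids)                                # steps 2-3
    improved = True
    while improved:                                              # step 7
        improved = False
        for m in range(k):
            for p in range(len(X)):
                if p in medoids:
                    continue
                trial = medoids.copy()
                trial[m] = p                                     # step 4: swap in a non-medoid
                cost = total_cost(X, trial)
                if cost < best:                                  # step 5: keep the cheaper swap
                    best, medoids, improved = cost, trial, True
                # step 6: otherwise the swap is simply discarded
    labels = np.abs(X[:, None, :] - X[medoids][None, :, :]).sum(axis=2).argmin(axis=1)
    return medoids, labels, best

# Reusing the A1..A8 sample points purely for illustration.
X = np.array([[2, 10], [2, 5], [8, 4], [5, 8], [7, 5], [6, 4], [1, 2], [4, 9]], dtype=float)
print(k_medoids(X, k=2))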
Hierarchical Clustering
Types:
Agglomerative (Bottom-Up): Starts with each data point as its own cluster and
merges the closest pairs of clusters iteratively until all points are in a single
cluster or a stopping criterion is met.
Steps:
Consider each letter (A-F) as a single cluster and calculate the distance of each cluster from all the other clusters.
In the second step, the closest clusters are merged to form a single cluster. Let's say cluster (B) and cluster (C) are very similar to each other, so we merge them in this step; similarly, cluster (D) and cluster (E) are merged, and at last we get the clusters [(A), (BC), (DE), (F)].
We recalculate the proximity (which measures the similarity or dissimilarity between clusters) according to the algorithm and merge the two nearest clusters ([(DE), (F)]) together to form the new clusters [(A), (BC), (DEF)].
Repeating the same process, the clusters (DEF) and (BC) are now the closest and are merged to form a new cluster. We're now left with the clusters [(A), (BCDEF)].
At last, the two remaining clusters are merged together to form a single cluster
[(ABCDEF)].
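A minimal sketch of agglomerative clustering (assuming SciPy is available; the one-dimensional values standing in for points A-F are made up):

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Made-up 1-D values standing in for the points A..F.
points = np.array([[1.0], [5.0], [5.2], [9.0], [9.1], [11.0]])

# Bottom-up: repeatedly merge the two nearest clusters (single linkage).
Z = linkage(points, method='single')
print(Z)                                       # each row records one merge step
print(fcluster(Z, t=3, criterion='maxclust'))  # cut the hierarchy into 3 flat clusters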
Divisive (Top-Down): Begins with all data points in one cluster and recursively
splits them into smaller clusters.
Advantages: The number of clusters does not have to be specified in advance, and the resulting hierarchy (dendrogram) shows the clustering at every level of granularity.
Disadvantages: It is computationally expensive for large datasets, and a merge or split, once made, cannot be undone, so an early mistake propagates through the rest of the hierarchy.
Multi-view clustering
Multi-view clustering groups data points that are described by several different views (feature sets). Each view might contain different information about the data points, and combining these views can lead to more accurate and robust clustering results.
Challenges: One of the main challenges is how to effectively integrate and align these different views, especially when they have varying levels of noise and completeness.
Another challenge is balancing view consistency (ensuring the views agree with each other) and view specificity (capturing unique information from each view).
For example, some methods use graph learning to capture the relationships between data points across different views, while others use contrastive learning to align representations from different views.
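As a minimal sketch of one simple way to combine views (synthetic data, assuming scikit-learn; real multi-view methods such as graph learning or contrastive learning are far more sophisticated), we can average per-view affinity matrices and run spectral clustering on the fused affinity:

import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.cluster import SpectralClustering

# Synthetic data: 60 points with 3 underlying groups, described by two views.
rng = np.random.default_rng(0)
groups = np.repeat([0, 1, 2], 20)
view1 = groups[:, None] * 3.0 + rng.normal(size=(60, 2))   # view 1: 2 features
view2 = groups[:, None] * 2.0 + rng.normal(size=(60, 4))   # view 2: 4 features

# Fuse the views by averaging their RBF affinity matrices, then cluster
# the fused affinity with spectral clustering.
affinity = (rbf_kernel(view1) + rbf_kernel(view2)) / 2
pred = SpectralClustering(n_clusters=3, affinity='precomputed',
                          random_state=0).fit_predict(affinity)
print(pred)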