Clustering
LEARNING OBJECTIVES
• To understand the basics of clustering.
• To understand and appreciate the requirements of clustering.
• To introduce the concept of partitioning-based clustering (k-means and k-medoids).
• To learn how to create dendrograms using agglomerative-based clustering.
LEARNING OUTCOMES
• Students will be able to understand and appreciate clustering as an unsupervised learning method.
• Students will be able to solve numericals on partitioning-based clustering techniques.
• Students will be able to solve numericals on agglomerative-based clustering using single, complete, and average linkages.
Figure 13.1 Clusters C1, C2, and C3 of arbitrary shape.
asymmetric binary. In symmetric binary data, both values are equally important; for example, male and female in a gender attribute. In asymmetric binary data, the two values are not equally important; for example, pass and fail in an examination result. The clustering algorithm should also work for complex data types such as graphs, sequences, images, and documents.
3. Discovery of clusters with arbitrary shape: Generally, clustering algorithms tend to find spherical clusters. Owing to the characteristics and diverse nature of the data used, clusters may be of arbitrary shapes and can even be nested within one another. For example, the clusters of active and inactive volcanoes form chain-like patterns, as shown in Fig. 13.1.
Traditional clustering algorithms, such as k-means and k-medoids, fail to detect non-spherical
shapes. Thus, it is important to have clustering algorithms that can detect clusters of any arbitrary
shape.
4. Avoiding domain knowledge to determine input parameters: Many algorithms require domain knowledge, such as the desired number of clusters, in the form of input parameters. The clustering results may therefore become sensitive to these parameters, which are often hard to determine for high-dimensional data. The need for domain knowledge affects the quality of clustering and burdens the user.
For example, in the k-means algorithm, the metric used to compare results for different values of k is the mean distance between data points and their cluster centroid. Increasing the number of clusters always reduces this distance, to the extreme of reaching zero when k equals the number of data points, so the metric alone cannot be used to choose k. Instead, to roughly determine k, the mean distance to the centroid is plotted as a function of k, and the "elbow point", where the rate of decrease sharply shifts, is chosen. This is shown in Fig. 13.2 (a code sketch of this procedure is given after this list).
5. Handling noisy data: Real-world data, which form the input to clustering algorithms, are mostly affected by noise. This results in poor-quality clusters. Noise is an unavoidable problem that affects the data collection and data preparation processes. Therefore, the algorithms we use should be able to deal with noise. There are two types of noise:
• Attribute noise includes implicit errors introduced by measurement tools, such as those induced by different types of sensors.
• Random errors introduced by batch processes or by experts when the data is gathered; such noise can arise, for example, during the document digitization process.

Figure 13.2 Mean distance to the centroid plotted against K (no. of clusters); the elbow point indicates a suitable value of k.
6. Incremental clustering: The database used for clustering needs to be updated by adding new data (incremental updates). Some clustering algorithms cannot incorporate incremental updates but have to recompute a new clustering from scratch. The algorithms which can accommodate new data without reconstructing the clusters are called incremental clustering algorithms. It is more effective to use incremental clustering algorithms.
7. Insensitivity to input order: Some clustering algorithms are sensitive to the order in which data objects are presented. Such algorithms are not ideal because we often have little control over this order. Clustering algorithms should be insensitive to the input order of data objects.
8. Handling high-dimensional data: A dataset can contain numerous dimensions or attributes.
Generally, clustering algorithms are good at handling low-dimensional data such as datasets involving
only two or three dimensions. Clustering algorithms which can handle high-dimensional space are
more effective.
9. Handling constraints: Constrained clustering can be considered to contain a set of must-link constraints, cannot-link constraints, or both. In a must-link constraint, two instances in the must-link relation should be included in the same cluster. On the other hand, a cannot-link constraint specifies that the two instances cannot be in the same cluster. These sets of constraints act as guidelines to cluster the entire dataset. Some constrained clustering algorithms cancel the clustering process if they cannot form clusters which satisfy the specified constraints. Others try to minimize the amount of constraint violation if it is impossible to find a clustering which satisfies the constraints. Constraints can be used to select a clustering model to follow among different clustering methods. A challenging task is to find data groups with good clustering behavior that satisfy specified constraints.
10. Interpretability and usability: Users require the clustering results to be interpretable, usable, and comprehensive. Clustering is always tied to specific semantic interpretations and applications, which should be able to use the information retrieved after clustering in a useful manner.
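The elbow procedure described in requirement 4 can be sketched in a few lines of code. The snippet below is a minimal illustration, assuming scikit-learn and matplotlib are available; the array X is randomly generated placeholder data, and the model's inertia (sum of squared distances to the closest centroid) stands in for the mean distance discussed above.

```python
# Elbow method sketch: run k-means for several values of k and plot the
# within-cluster sum of squared distances; the "elbow" suggests a suitable k.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))            # placeholder data; replace with a real dataset

ks = range(1, 9)
inertias = []
for k in ks:
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(km.inertia_)         # sum of squared distances to the closest centroid

plt.plot(list(ks), inertias, marker="o")
plt.xlabel("K (no. of clusters)")
plt.ylabel("Sum of squared distances to centroid")
plt.title("Locating the elbow point")
plt.show()
```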
The objective function minimized by k-means is

J(V) = \sum_{i=1}^{c} \sum_{j=1}^{c_i} \left( \lVert x_i - v_j \rVert \right)^2        (13.1)

where \lVert x_i - v_j \rVert is the Euclidean distance between x_i and v_j, c is the number of cluster centers, and c_i is the number of data points in cluster i. Each cluster center is recomputed as the mean of the points assigned to it:

v_i = \left( \frac{1}{c_i} \right) \sum_{j=1}^{c_i} x_i        (13.2)
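To make the role of Eqs (13.1) and (13.2) concrete, the following is a minimal NumPy sketch of k-means: points are assigned to their nearest center, and each center is then recomputed as the mean of its assigned points. The function name and arguments are illustrative, not taken from any library.

```python
import numpy as np

def kmeans(X, init_centers, max_iter=100):
    """Minimal k-means: alternate nearest-center assignment and the centroid
    update of Eq. (13.2) until the assignment stops changing."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(init_centers, dtype=float)
    labels = None
    for _ in range(max_iter):
        # Euclidean distance of every point to every center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break                                   # converged: assignment unchanged
        labels = new_labels
        for j in range(len(centers)):               # Eq. (13.2): centroid update
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    # Eq. (13.1): sum of squared distances of points to their cluster centers
    objective = np.sum(np.linalg.norm(X - centers[labels], axis=1) ** 2)
    return centers, labels, objective
```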
Solution:
The initial cluster centers are given as C1(2), C2(16), and C3(38). Calculating the distance between each
data point and cluster centers, we get the following table.
Data Points    Distance from C1(2)    Distance from C2(16)    Distance from C3(38)
By assigning the data points to the cluster center whose distance from it is minimum of all the cluster
centers, we get the following table.
m1 = 2 m2 = 16 m3 = 38
{2, 3, 4, 6} {12, 14, 15, 16, 21, 23, 25} {31, 35, 38}
New cluster centers
m1 = 3.75 m2 = 18 m3 = 34.67
Similarly, using the new cluster centers we can calculate the distance from it and allocate clusters based
on minimum distance. It is found that there is no difference in the cluster formed and hence we stop this
procedure. The final clustering result is given in the following table.
m1 = 3.75 m2 = 18 m3 = 34.67
{2, 3, 4, 6} {12, 14, 15, 16, 21, 23, 25} {31, 35, 38}
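As a quick check of Solved Problem 13.1, the same data points (read off the cluster contents above) and the given initial centers can be fed to scikit-learn, assuming it is installed; the resulting centers and clusters match the table.

```python
import numpy as np
from sklearn.cluster import KMeans

data = np.array([2, 3, 4, 6, 12, 14, 15, 16, 21, 23, 25, 31, 35, 38], dtype=float)
X = data.reshape(-1, 1)                       # treat each value as a 1-D point
init = np.array([[2.0], [16.0], [38.0]])      # the given initial centers

km = KMeans(n_clusters=3, init=init, n_init=1).fit(X)
for j in range(3):
    print(f"m{j + 1} = {km.cluster_centers_[j, 0]:.2f}:", data[km.labels_ == j])
# m1 = 3.75, m2 = 18.00, m3 = 34.67 with the same cluster memberships as above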
Solution:
We solve the numerical by following the calculations carried out in solved problem 13.1. The result is
presented in the following table.
C1(80)                                                C2(250)
m1 = 80                                               m2 = 250
{23, 34, 56, 78, 90, 116, 117, 118, 123, 150}         {199, 234, 456}
New cluster centers
m1 = 90.5                                             m2 = 296.33
{23, 34, 56, 78, 90, 116, 117, 118, 123, 150}         {199, 234, 456}

Since the assignment does not change when the centers are updated to m1 = 90.5 and m2 = 296.33, the algorithm stops. The final clusters are {23, 34, 56, 78, 90, 116, 117, 118, 123, 150} and {199, 234, 456}.
Sample No.   X     Y
1            185   72
2            170   56
3            168   60
4            179   68
5            182   72
6            188   77
Solution:
Taking samples 1 and 2 as the initial cluster centers C1(185, 72) and C2(170, 56), we start with the assignment:

Sample No.   X     Y    Assignment
1            185   72   C1
2            170   56   C2
Sample 3 (168, 60) lies at a Euclidean distance of 20.81 from C1 and 4.47 from C2, so it is assigned to C2:

Sample No.   X     Y    Assignment
1            185   72   C1
2            170   56   C2
3            168   60   C2
4            179   68
5            182   72
6            188   77
Similarly,
1. Distance from C1 for (179, 68) = 7.21
Distance from C2 for (179, 68) = 15
Since C1 is closer to (179, 68), the sample belongs to C1.
2. Distance from C1 for (182, 72) = 3
Distance from C2 for (182, 72) = 20
Since C1 is closer to (182, 72), the sample belongs to C1.
3. Distance from C1 for (188, 77) = 5.83
Distance from C2 for (188, 77) = 27.66
Since C1 is closer to (188, 77), the sample belongs to C1.
The assignment after the first iteration is:

Sample No.   X     Y    Assignment
1            185   72   C1
2            170   56   C2
3            168   60   C2
4            179   68   C1
5            182   72   C1
6            188   77   C1

Recomputing the cluster centers as the means of the assigned samples and repeating the assignment gives:

Sample No.   X     Y    Assignment
1            185   72   C1
2            170   56   C2
3            168   60   C2
4            179   68   C1
5            182   72   C1
6            188   77   C1
After the second iteration, the assignment has not changed and hence the algorithm is stopped and the
points are clustered.
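The same kind of check works for this problem; a short sketch assuming scikit-learn is available, with samples 1 and 2 used as the initial centers C1 and C2:

```python
import numpy as np
from sklearn.cluster import KMeans

# (X, Y) values from the table; samples 1 and 2 serve as the initial centers.
samples = np.array([[185, 72], [170, 56], [168, 60],
                    [179, 68], [182, 72], [188, 77]], dtype=float)
init = samples[:2]                            # C1 = (185, 72), C2 = (170, 56)

km = KMeans(n_clusters=2, init=init, n_init=1).fit(samples)
print(km.labels_)              # [0 1 1 0 0 0]  ->  C1, C2, C2, C1, C1, C1
print(km.cluster_centers_)     # final centers after convergence
```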
13.3.2 k-Medoids
The k-medoids algorithm is a clustering algorithm very similar to the k-means algorithm. Both k-means
and k-medoids algorithms are partitional and try to minimize the distance between points and cluster
center. In contrast to the k-means algorithm, k-medoids chooses data points as centers and uses Manhattan
distance to define the distance between cluster centers and data points. This technique clusters a dataset of n objects into k clusters, where the number of clusters k is known a priori. It is more robust to noise and outliers than k-means because it minimizes a sum of pairwise dissimilarities instead of a sum of squared Euclidean distances. A medoid is defined as the object of a cluster whose average dissimilarity to all the objects in the cluster is minimal.
The Manhattan distance between two vectors in an n-dimensional real vector space is given by
Eq. (13.2). It is used in computing the distance between a data point and its cluster center.
d_1(p, q) = \lVert p - q \rVert_1 = \sum_{i=1}^{n} \lvert p_i - q_i \rvert        (13.2)
The most common algorithm for k-medoid clustering is the Partitioning Around Medoids (PAM) algorithm. PAM uses a greedy search, which is faster than an exhaustive search but may not find the optimum solution. It works as follows (a code sketch is given after the steps):
1. Initialize: select k of the n data points as the medoids.
2. Associate each data point with the closest medoid.
3. While the cost of the configuration decreases, for each medoid m and for each non-medoid data point o:
• Swap m and o, and recompute the cost (the sum of distances of points to their medoid).
• If the total cost of the configuration increased in the previous step, undo the swap.
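The swap loop above can be sketched in Python as follows. This is a simplified, unoptimized illustration of the greedy PAM idea using Manhattan distance; the function names are ours and are not part of any particular library.

```python
import numpy as np
from itertools import product

def manhattan(a, b):
    return np.abs(np.asarray(a, float) - np.asarray(b, float)).sum()

def total_cost(X, medoid_idx):
    # Each point contributes its distance to the nearest medoid.
    return sum(min(manhattan(x, X[m]) for m in medoid_idx) for x in X)

def pam(X, k, seed=0):
    """Greedy PAM sketch: start from k random medoids and keep applying the best
    cost-reducing (medoid, non-medoid) swap until no swap improves the cost."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    medoids = list(rng.choice(len(X), size=k, replace=False))
    best = total_cost(X, medoids)
    improved = True
    while improved:
        improved = False
        for pos, o in product(range(k), range(len(X))):
            if o in medoids:
                continue
            candidate = medoids.copy()
            candidate[pos] = o                     # swap the medoid at this position with o
            cost = total_cost(X, candidate)
            if cost < best:                        # keep the swap only if it lowers the cost
                medoids, best, improved = candidate, cost, True
    labels = [min(range(k), key=lambda j: manhattan(x, X[medoids[j]])) for x in X]
    return medoids, labels, best
```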
Data Point   X   Y
X1           2   6
X2           3   4
X3           3   8
X4           4   2
X5           6   2
X6           6   4
Solution:
Step 1: Two observations c1 = X2 = (3, 4) and c2 = X6 = (6, 4) are randomly selected as medoids (cluster
centers).
Step 2: Manhattan distances are calculated to each center to associate each data object to its nearest medoid.
Data Point    Distance from c1(3, 4)    Distance from c2(6, 4)
X1 (2, 6)     3                         6
X2 (3, 4)     0                         3
X3 (3, 8)     4                         7
X4 (4, 2)     3                         4
X5 (6, 2)     5                         2
X6 (6, 4)     3                         0
Cost          10                        2

Each point is assigned to its nearest medoid, so the total cost of this configuration is 10 + 2 = 12.
Step 3: We select one of the non-medoids, O′. Let us assume O′ = (6, 2), so the candidate medoids are c1(3, 4) and O′(6, 2). Taking c1 and O′ as the new medoids, we calculate the total cost involved.
Data Point    Distance from c1(3, 4)    Distance from O′(6, 2)
X1 (2, 6)     3                         8
X2 (3, 4)     0                         5
X3 (3, 8)     4                         9
X4 (4, 2)     3                         2
X5 (6, 2)     5                         0
X6 (6, 4)     3                         2
Cost          7                         4
So the total cost after swapping medoid c2 with O′ is 7 + 4 = 11. Since this is less than the previous cost of 12, it is considered a better cluster assignment and the swap is made.
Step 4: We select another non-medoid, O′. Let us assume O′ = (4, 2), so the candidate medoids are c1(3, 4) and O′(4, 2). Taking c1 and O′ as the new medoids, we calculate the total cost involved.
Data Point    Distance from c1(3, 4)    Distance from O′(4, 2)
X1 (2, 6)     3                         6
X2 (3, 4)     0                         3
X3 (3, 8)     4                         7
X4 (4, 2)     3                         0
X5 (6, 2)     5                         2
X6 (6, 4)     3                         4
Cost          10                        2
So the total cost after swapping with O′(4, 2) is 10 + 2 = 12. Since this is more than the current cost of 11, this cluster assignment is not considered and the swap is not done.
Thus, we try other non-medoid points to get the minimum cost; the assignment with the minimum cost is considered the best. For some applications, k-medoids shows better results than k-means. The most time-consuming part of the k-medoids algorithm is the calculation of the distances between objects. The distance matrix can be computed in advance to speed up the process, as in the sketch below.
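For example, with SciPy (assumed available) the Manhattan distance matrix for the six points of the solved problem can be computed once; the cost of any candidate medoid set then reduces to a table lookup, which reproduces the costs found in Steps 2 and 3 above.

```python
import numpy as np
from scipy.spatial.distance import cdist

# X1..X6 from the solved problem above
points = np.array([[2, 6], [3, 4], [3, 8], [4, 2], [6, 2], [6, 4]], dtype=float)

# Pairwise Manhattan (city-block) distances, computed once and reused for every swap
D = cdist(points, points, metric="cityblock")

def config_cost(D, medoid_idx):
    # Each point is charged its distance to the nearest medoid in the candidate set.
    return D[:, medoid_idx].min(axis=1).sum()

print(config_cost(D, [1, 5]))   # medoids X2(3, 4) and X6(6, 4): cost 12 (Step 2)
print(config_cost(D, [1, 4]))   # medoids X2(3, 4) and X5(6, 2): cost 11 (Step 3)
```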
Figure 13.3 Dendrogram of objects a, b, c, d, and e; merge levels l = 0 to l = 4 are shown against a similarity scale from 1.0 down to 0.0.
Maximum distance:

dist_max(C_i, C_j) = \max_{p \in C_i,\, p' \in C_j} \{ \lVert p - p' \rVert \}        (13.4)

Mean distance:

dist_mean(C_i, C_j) = \lVert m_i - m_j \rVert        (13.5)

Average distance:

dist_avg(C_i, C_j) = \frac{1}{n_i n_j} \sum_{p \in C_i,\, p' \in C_j} \lVert p - p' \rVert        (13.6)

where m_i and m_j are the means of clusters C_i and C_j, and n_i and n_j are the numbers of objects in them.
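As a small illustration of Eqs (13.4)-(13.6), together with the minimum distance used by single linkage, the helper below computes all four inter-cluster distances with NumPy/SciPy (assumed available); it is a sketch, not a library function.

```python
import numpy as np
from scipy.spatial.distance import cdist

def cluster_distances(Ci, Cj):
    """Inter-cluster distances between two clusters given as arrays of points."""
    Ci, Cj = np.asarray(Ci, float), np.asarray(Cj, float)
    D = cdist(Ci, Cj)                                            # all pairwise Euclidean distances
    return {
        "min": float(D.min()),                                   # single linkage
        "max": float(D.max()),                                   # Eq. (13.4), complete linkage
        "mean": float(np.linalg.norm(Ci.mean(0) - Cj.mean(0))),  # Eq. (13.5)
        "avg": float(D.mean()),                                  # Eq. (13.6), average linkage
    }

print(cluster_distances([[0, 0], [1, 0]], [[4, 0], [5, 0]]))
# {'min': 3.0, 'max': 5.0, 'mean': 4.0, 'avg': 4.0}
```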
When an algorithm uses the minimum distance, d_min(C_i, C_j), to measure the distance between clusters, it is called a nearest-neighbor clustering algorithm. If the clustering process is terminated when the distance between the nearest clusters exceeds a user-defined threshold, it is called a single-linkage algorithm. An agglomerative hierarchical clustering algorithm that uses the minimum distance measure is also called a minimal spanning tree algorithm, since a spanning tree of a graph is a tree that connects all vertices and a minimal spanning tree is the one with the least sum of edge weights.
An algorithm that uses the maximum distance, d_max(C_i, C_j), to measure the distance between clusters is called a farthest-neighbor clustering algorithm. If clustering is terminated when the maximum distance exceeds a user-defined threshold, it is called a complete-linkage algorithm.
The minimum and maximum measures tend to be sensitive to outliers or noisy data. The third method therefore takes the average distance between clusters, which reduces the effect of outliers. Another advantage is that it can handle categorical data as well.
Algorithm: The agglomerative algorithm is carried out in three steps and the flowchart is shown in Fig. 13.4.
1. Convert object attributes to distance matrix.
2. Set each object as a cluster (thus, if we have N objects, we will have N clusters at the beginning).
3. Repeat until number of clusters is one.
• Merge two closest clusters.
• Update distance matrix.
Figure 13.4 Flowchart of the agglomerative clustering algorithm: compute the distance matrix, set each object as a cluster, then repeatedly merge the two closest clusters and update the distance matrix until the number of clusters is one.
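In practice the merge loop shown in the flowchart is rarely coded by hand. A minimal sketch using SciPy's hierarchical clustering routines (assuming SciPy and matplotlib are installed) is given below; the array X is placeholder data, and the method argument selects single, complete, or average linkage.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

rng = np.random.default_rng(1)
X = rng.normal(size=(8, 2))                  # placeholder objects (rows) with two attributes

# method may be "single", "complete" or "average"
Z = linkage(X, method="single", metric="euclidean")
dendrogram(Z, labels=[f"P{i + 1}" for i in range(len(X))])
plt.ylabel("Merge distance")
plt.show()
```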
Sample No. X Y
P1 0.40 0.53
P2 0.22 0.38
P3 0.35 0.32
P4 0.26 0.19
P5 0.08 0.41
P6 0.45 0.30
Solution:
To compute the distance matrix, the Euclidean distance is used:

d[(x, y), (a, b)] = \sqrt{(x - a)^2 + (y - b)^2}

The resulting distance matrix is:

        P1     P2     P3     P4     P5     P6
P1      0
P2      0.23   0
P3      0.22   0.14   0
P4      0.37   0.19   0.13   0
P5      0.34   0.14   0.28   0.23   0
P6      0.24   0.24   0.10   0.22   0.39   0
Next, we merge the two closest clusters by finding the minimum element in the distance matrix.
Here the minimum value is 0.10 and hence we combine P3 and P6 into one cluster. To update the distance matrix (single linkage keeps the minimum distance):
min ((P3, P6), P1) = min((P3, P1), (P6, P1)) = min(0.22, 0.24) = 0.22
min ((P3, P6), P2) = min((P3, P2), (P6, P2)) = min(0.14, 0.24) = 0.14
min ((P3, P6), P4) = min((P3, P4), (P6, P4)) = min(0.13, 0.22) = 0.13
min ((P3, P6), P5) = min((P3, P5), (P6, P5)) = min(0.28, 0.39) = 0.28
          P1     P2     P3, P6   P4     P5
P1        0
P2        0.23   0
P3, P6    0.22   0.14   0
P4        0.37   0.19   0.13     0
P5        0.34   0.14   0.28     0.23   0
We again merge the two closest clusters by finding the minimum element in the distance matrix.
Here the minimum value is 0.13 and hence we merge the cluster (P3, P6) with P4. To update the distance matrix:
min (((P3, P6), P4), P1) = min(((P3, P6), P1), (P4, P1)) = min(0.22, 0.37) = 0.22
min (((P3, P6), P4), P2) = min(((P3, P6), P2), (P4, P2)) = min(0.14, 0.19) = 0.14
min (((P3, P6), P4), P5) = min(((P3, P6), P5), (P4, P5)) = min(0.28, 0.23) = 0.23
              P1     P2     P3, P6, P4   P5
P1            0
P2            0.23   0
P3, P6, P4    0.22   0.14   0
P5            0.34   0.14   0.23         0
We again merge the two closest clusters by finding the minimum element in the distance matrix.
Here the minimum value is 0.14 and hence we combine P2 and P5. To update the distance matrix:
min ((P2, P5), P1) = min((P2, P1), (P5, P1)) = min(0.23, 0.34) = 0.23
min ((P2, P5),(P3, P6, P4)) = min((P2, (P3, P6, P4)), (P5, (P3, P6, P4))) = min(0.14, 0.23) = 0.14
Here the minimum value is 0.14 and hence we combine (P2, P5) and (P3, P6, P4). To update the distance matrix:
min((P2, P5, P3, P6, P4), P1) = min(((P2, P5), P1), ((P3, P6, P4), P1)) = min(0.23, 0.22) = 0.22
Finally, P1 joins the remaining cluster at a distance of 0.22, which completes the single-linkage dendrogram.
Figure 13.5 Dendrogram obtained using single linkage; the leaves appear in the order P3, P6, P4, P2, P5, P1.
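The merge order obtained above can be cross-checked by feeding the same (rounded) distance matrix to SciPy's single-linkage routine; this is a sketch that assumes SciPy is installed and reuses the distance values given above.

```python
import numpy as np
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import linkage

# The distance matrix used in the worked example (rounded values as given above)
D = np.array([
    [0,    0.23, 0.22, 0.37, 0.34, 0.24],
    [0.23, 0,    0.14, 0.19, 0.14, 0.24],
    [0.22, 0.14, 0,    0.13, 0.28, 0.10],
    [0.37, 0.19, 0.13, 0,    0.23, 0.22],
    [0.34, 0.14, 0.28, 0.23, 0,    0.39],
    [0.24, 0.24, 0.10, 0.22, 0.39, 0   ],
])

Z = linkage(squareform(D), method="single")   # squareform() gives the condensed form
print(Z[:, 2])   # merge distances 0.10, 0.13, 0.14, 0.14, 0.22 -- as in the worked solution
```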
Solution:
To compute the distance matrix, the Euclidean distance is used:

d[(x, y), (a, b)] = \sqrt{(x - a)^2 + (y - b)^2}

The resulting distance matrix is:

       P1     P2     P3     P4     P5     P6
P1     0
P2     0.71   0
P3     5.66   4.95   0
P4     3.6    2.92   2.24   0
P5     4.24   3.53   1.41   1.0    0
P6     3.20   2.5    2.5    0.5    1.12   0
We merge the two closest clusters by finding the minimum element in the distance matrix.
Here the minimum value is 0.5 and hence we combine P4 and P6. To update the distance matrix (complete linkage keeps the maximum distance), for example:

max((P4, P6), P1) = max(d(P4, P1), d(P6, P1)) = max(3.6, 3.2) = 3.6

          P1     P2     P3     P4, P6   P5
P1        0
P2        0.71   0
P3        5.66   4.95   0
P4, P6    3.6    2.92   2.5    0
P5        4.24   3.53   1.41   1.12     0
Repeating the process: P1 and P2 are merged next (at a distance of 0.71); P5 then joins the cluster (P4, P6) at a complete-linkage distance of 1.12; P3 joins this cluster at 2.5; and finally the clusters (P1, P2) and (P3, P4, P5, P6) are merged at 5.66.
The final cluster formed can now be drawn as shown in Fig. 13.6.
Figure 13.6 Dendrogram obtained using complete linkage.
A B
P1 1 1
P2 1.5 1.5
P3 5 5
P4 3 4
P5 4 4
P6 3 3.5
Solution:
The distance matrix is:
P1 P2 P3 P4 P5 P6
P1 0
P2 0.71 0
P3 5.66 4.95 0
P4 3.6 2.92 2.24 0
P5 4.24 3.53 1.41 1.0 0
P6 3.20 2.5 2.5 0.5 1.12 0
We merge the two closest clusters by finding the minimum element in the distance matrix.
Here the minimum value is 0.5 and hence we combine P4 and P6. To update the distance matrix (average linkage uses the average distance), for example:

average((P4, P6), P1) = average(d(P4, P1), d(P6, P1)) = average(3.6, 3.2) = 3.4

          P1     P2     P3     P4, P6   P5
P1        0
P2        0.71   0
P3        5.66   4.95   0
P4, P6    3.4    2.71   2.37   0
P5        4.24   3.53   1.41   1.06     0
Repeating the process: P1 and P2 are merged next (at 0.71); P5 then joins the cluster (P4, P6) at an average distance of 1.06; P3 joins this cluster at about 2.05; and finally the two remaining clusters are merged.
The final cluster formed can now be drawn as shown in Fig. 13.7.
Figure 13.7 The final cluster formed merging all data points.
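Both dendrograms (complete and average linkage) for this dataset can be cross-checked with SciPy, assuming it is installed; the two worked examples share the same distance matrix, so the (A, B) coordinates from the table apply to both.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# P1..P6 from the table (columns A and B)
X = np.array([[1, 1], [1.5, 1.5], [5, 5], [3, 4], [4, 4], [3, 3.5]], dtype=float)

for method in ("complete", "average"):
    Z = linkage(X, method=method, metric="euclidean")
    print(method, np.round(Z[:, 2], 2))       # distances at which clusters are merged
# complete: [0.5  0.71 1.12 2.5  5.66]
# average : [0.5  0.71 1.06 2.05 3.83]
```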
Case Study
There are diverse applications using clustering. As discussed in previous sections, the concept of clustering is
one wherein the output is not known. In other words, the training dataset has only input data samples. So
based on some metrics of grouping or clustering, similar data items are grouped together in one cluster. Let
us now see some of the major areas wherein this concept is extensively used.
Let us take a small sample size of eight customers. Based on their duration of national and international
calls, a scatter plot is drawn as shown below.
Scatter plot of the eight customers: Av. Local Call Duration versus Av. International Call Duration.
Using the Euclidean distance metric to compute the centroids, the final clusters formed are shown in the figure below.
The clusters formed, plotted as Av. Local Call Duration versus Av. International Call Duration.
Based on the proximity of each customer to the centroids of the formed clusters, customer plans can be designed so that both the company and the customers benefit from the chosen plan.
Summary
• Clustering is the process of grouping together data objects into multiple sets or clusters, so that objects within a cluster have high similarity when compared to objects outside of it.
• Similarity is measured by distance metrics, the most common among them being the Euclidean distance metric.
• Clustering is also called data segmentation because clustering partitions large datasets into groups according to their similarity.
• Clustering is known as unsupervised learning because the class label information is not present.
• The applications of clustering are varied and include business intelligence, pattern recognition, image processing, biometrics, web technology, search engines, and text mining.
• The requirements of clustering include scalability, handling different numbers and types of attributes, discovering clusters of arbitrary shape, efficiency in handling noisy data and incremental data points added to existing clusters, handling high-dimensional data, and handling data with constraints.
• The basic types of clustering are hard clustering and soft clustering, depending on whether the data points belong to only one cluster or whether they can be shared among clusters.
• Clustering algorithms are classified based on partitioning, hierarchical, density-based, and grid-based clustering.
• Partitioning-based clustering algorithms are distance based. k-means and k-medoids are popular partition-based clustering algorithms. The number of clusters to be formed is initially specified.
• The result of hierarchical clustering is a tree-based representation of the objects, which is also known as a dendrogram.
• Density-based clustering algorithms find non-linear shaped clusters based on density. Density-based spatial clustering of applications with noise (DBSCAN) is the most widely used density-based algorithm. It uses the concepts of density reachability and density connectivity.
• The grid-based clustering approach differs from conventional clustering algorithms in that it is concerned not with the data points but with the value space that surrounds the data points.
Multiple-Choice Questions
1. Give an example of an application for the k-means clustering algorithm. Explain in brief.
2. Explain the different distance measures used for clustering.
3. Using k-means clustering, cluster the following data into two clusters. Show each step.
   {2, 4, 10, 12, 3, 20, 30, 11, 25}
4. Compare between single link, complete link, and average link based on distance formula.
5. Draw the flowchart of the k-means algorithm.
1. Compute the distance matrix for the x–y coordinates given in the following table.

   Point   x coordinate   y coordinate
   p1      0.4005         0.5306
   p2      0.2148         0.3854

3. Use the k-means algorithm to cluster the following dataset consisting of the scores of two variables on each of the seven individuals.

   Subject   A     B
   1         1.0   1.0

4. Use the k-means algorithm and Euclidean distance to cluster the following eight examples into three clusters: A1 = (2, 10), A2 = (2, 5), A3 = (8, 4), A4 = (5, 8), A5 = (7, 5), A6 = (6, 4), A7 = (1, 2), A8 = (4, 9). The distance matrix based on the Euclidean distance is given in the following table.

        A1   A2   A3   A4   A5   A6   A7   A8
   A1   0    25   36   13   50   52   65   5
   A2        0    37   18   25   17   10   20
   A3             0    25   2    2    53   41
   A4                  0    13   17   52   2
   A5                       0    2    45   25
   A6                            0    29   29
   A7                                 0    58
   A8                                      0

   Suppose the initial seeds (centers of each cluster) are A1, A4, and A7. Run the k-means algorithm for 1 epoch only. At the end of this epoch show:
   (a) The new clusters (that is, the examples belonging to each cluster).
   (b) The centers of the new clusters.

5. Use single and complete link agglomerative clustering to group the data given in the following distance matrix. Show the dendrograms.

        A   B   C   D
   A    0   1   4   5
   B        0   2   6
   C            0   3
   D                0
Review Questions
1. Use the k-means algorithm to create three clusters for the given set of values: {2, 3, 6, 8, 9, 12, 15, 18, 22}.
2. Apply the agglomerative clustering algorithm on the given data and draw the dendrogram. Show three clusters with their allocated points by using the single link method.

        a    b    c    d    e    f
   a    0    2    10   17   5    20
   b    2    0    8    3    1    18
   c    10   8    0    5    5    2
   d    17   3    5    0    2    3
   e    5    1    5    2    0    13
   f    20   18   2    3    13   0

3. Apply complete link agglomerative clustering techniques on the given data to find the prominent clusters.

        P1     P2     P3     P4     P5     P6
   P1   0      0.23   0.22   0.37   0.34   0.24
   P2   0.23   0      0.14   0.19   0.14   0.24
   P3   0.22   0.14   0      0.13   0.28   0.10
   P4   0.37   0.19   0.13   0      0.23   0.22
   P5   0.34   0.14   0.28   0.23   0      0.39
   P6   0.24   0.24   0.10   0.22   0.39   0

4. Explain the expectation–maximization algorithm.
5. What are the requirements for clustering?
6. What are the applications of clustering?
Answers
Multiple-Choice Answers
1. (a) 2. (d) 3. (a) 4. (b) 5. (b)