Problems

Exercise 1

Questions
1. Given the following points: 2, 4, 10, 12, 3, 20, 30, 11, 25, and k = 3, use the K-Means algorithm to
compute the clusters and update the means in each iteration. The initial means are:

µ1 = 2, µ2 = 4, µ3 = 6

Show the clusters obtained and the new means after each iteration.
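
A hand computation for Question 1 can be checked with a small script. The following is only a sketch: the function name `kmeans_1d`, the tie-breaking rule (a point equidistant from two means goes to the lower-numbered mean), and the choice to keep the old mean for an empty cluster are assumptions made for illustration, since the exercise does not specify them.

```python
def kmeans_1d(points, means, max_iter=100):
    """Basic K-Means on scalar data.

    Returns a list with one (clusters, new_means) pair per iteration.
    """
    history = []
    for _ in range(max_iter):
        clusters = [[] for _ in means]
        for p in points:
            # Assumption: ties are resolved in favour of the lowest-index mean.
            idx = min(range(len(means)), key=lambda i: abs(p - means[i]))
            clusters[idx].append(p)
        # Assumption: an empty cluster keeps its previous mean.
        new_means = [sum(c) / len(c) if c else m for c, m in zip(clusters, means)]
        history.append((clusters, new_means))
        if new_means == means:  # means unchanged, so the assignment is stable
            break
        means = new_means
    return history


data = [2, 4, 10, 12, 3, 20, 30, 11, 25]
for step, (clusters, means) in enumerate(kmeans_1d(data, [2, 4, 6]), start=1):
    print(f"iteration {step}: clusters={clusters} means={means}")
```

A different tie-breaking rule can change the intermediate clusters, so the printed iterations should be compared against a hand solution that uses the same rule.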

2. Compute the distance matrix Dij = dist(xi , xj ), using the Manhattan distance (i.e., L1 ), given the
data from the following table:
Point   Coordinate 1   Coordinate 2
x1       0              0
x2       1              0
x3       2              0
x4      −0.5           −1
x5       0.5           −1
x6       0             −1.5
Then, perform single-linkage, average-linkage and complete-linkage agglomerative clustering and
draw the dendrograms.
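
The distance matrix and merge order for Question 2 can be cross-checked with a short script. This is a minimal sketch using a naive agglomerative procedure; the helper names `manhattan`, `distance_matrix`, and `agglomerate` are hypothetical, and ties between equally close cluster pairs are broken by enumeration order, which is an assumption and may differ from a hand solution.

```python
from itertools import combinations
from statistics import mean


def manhattan(a, b):
    """L1 distance between two points given as coordinate tuples."""
    return sum(abs(u - v) for u, v in zip(a, b))


def distance_matrix(points):
    return [[manhattan(p, q) for q in points] for p in points]


def agglomerate(D, linkage):
    """Naive agglomerative clustering on a distance matrix D.

    linkage is min (single), max (complete), or mean (average).
    Returns the merges as (cluster_a, cluster_b, height) tuples of point indices.
    """
    clusters = [(i,) for i in range(len(D))]
    merges = []
    while len(clusters) > 1:
        def height(pair):
            a, b = pair
            return linkage([D[i][j] for i in a for j in b])

        # Assumption: the first closest pair in enumeration order is merged.
        a, b = min(combinations(clusters, 2), key=height)
        merges.append((a, b, height((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


points = [(0, 0), (1, 0), (2, 0), (-0.5, -1), (0.5, -1), (0, -1.5)]
D = distance_matrix(points)
for name, linkage in [("single", min), ("complete", max), ("average", mean)]:
    print(name, agglomerate(D, linkage))
```

Indices 0 to 5 correspond to x1 to x6. The merge heights are the values at which branches join in the dendrogram.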

3. Use single-link and complete-link divisive clustering to group the data described by the following
distance matrix. Show the dendrograms.

     A   B   C   D
A    0   1   4   5
B        0   2   6
C            0   3
D                0

4. Use the K-Means algorithm and Euclidean distance to cluster the following 8 examples into 3
clusters:

A1 = (2, 10), A2 = (2, 5), A3 = (8, 4), A4 = (5, 8), A5 = (7, 5), A6 = (6, 4), A7 = (1, 2), A8 = (4, 9)

The distance matrix is:

      A1    A2    A3    A4    A5    A6    A7    A8
A1    0     √25   √72   √13   √50   √52   √65   √5
A2    √25   0     √37   √18   √25   √17   √10   √20
A3    √72   √37   0     √25   √2    √4    √53   √41
A4    √13   √18   √25   0     √13   √17   √52   √2
A5    √50   √25   √2    √13   0     √2    √45   √25
A6    √52   √17   √4    √17   √2    0     √29   √29
A7    √65   √10   √53   √52   √45   √29   0     √58
A8    √5    √20   √41   √2    √25   √29   √58   0

Suppose the initial seeds (the centers of the clusters) are A1, A4, and A7. Run the algorithm for
4 epochs and answer the following for each epoch:
i. The new clusters (i.e., the examples belonging to each cluster).
ii. The centers of the new clusters.
iii. The sum of squared errors.
iv. How many more iterations are needed to converge? Draw the result for each epoch.
v. In a 10 by 10 space, plot all 8 points and show the clusters after the first epoch and the new
centroids.
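
One epoch of the procedure in Question 4 can be sketched as follows. The helper names `sq_dist` and `kmeans_epoch` are hypothetical, and two choices are assumptions because the exercise leaves them open: the sum of squared errors is measured against the updated centers, and a tie between centers goes to the lower-numbered one.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two coordinate tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_epoch(points, centers):
    """Run one assignment step and one update step.

    points maps a name to its coordinates. Returns (clusters, new_centers, sse).
    """
    clusters = [[] for _ in centers]
    for name, p in points.items():
        # Assumption: ties go to the lowest-index center.
        idx = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
        clusters[idx].append(name)

    new_centers = []
    for members, old in zip(clusters, centers):
        if members:
            coords = [points[m] for m in members]
            new_centers.append(tuple(sum(axis) / len(coords) for axis in zip(*coords)))
        else:
            new_centers.append(old)  # assumption: empty clusters keep their center

    # Assumption: SSE is computed with respect to the updated centers.
    sse = sum(
        sq_dist(points[m], c)
        for members, c in zip(clusters, new_centers)
        for m in members
    )
    return clusters, new_centers, sse


points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}
centers = [points["A1"], points["A4"], points["A7"]]
for epoch in range(1, 5):
    clusters, centers, sse = kmeans_epoch(points, centers)
    print(epoch, clusters, centers, sse)
```

Comparing squared distances gives the same assignments as comparing Euclidean distances, because the square root is monotonic.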

5. A binary classification model is evaluated on two datasets with different class ratios. Each dataset
contains 100 samples. The confusion matrices are as follows:

Scenario 1: Balanced Classes

Class 1 (Positive): 50 samples, Class 2 (Negative): 50 samples

                   Predicted: Class 1   Predicted: Class 2
Actual: Class 1           40                   10
Actual: Class 2           15                   35

Scenario 2: Imbalanced Classes

Class 1 (Positive): 20 samples, Class 2 (Negative): 80 samples

                   Predicted: Class 1   Predicted: Class 2
Actual: Class 1           15                    5
Actual: Class 2           20                   60

Perform the following for each scenario:

i. Compute Precision, Recall, F1-Score, and Accuracy.
ii. Discuss how class imbalance impacts each metric.
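
The four metrics in Question 5 follow directly from the confusion-matrix cells, with Class 1 treated as the positive class. The sketch below uses the standard definitions; the function name `binary_metrics` is hypothetical, and the code assumes no denominator is zero, which holds for both scenarios.

```python
def binary_metrics(tp, fn, fp, tn):
    """Standard binary metrics from confusion-matrix counts.

    tp, fn: actual positives predicted positive / negative.
    fp, tn: actual negatives predicted positive / negative.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


scenario_1 = binary_metrics(tp=40, fn=10, fp=15, tn=35)
scenario_2 = binary_metrics(tp=15, fn=5, fp=20, tn=60)
print("Scenario 1:", scenario_1)
print("Scenario 2:", scenario_2)
```

Printing both scenarios side by side makes it easier to see which metrics move when the class ratio changes.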
6. Consider a multiclass classification problem with four classes: C1, C2, C3, and C4. A model was
trained and tested on a dataset of 20 samples. The ground truth (Actual) and predicted labels are:

Sample  1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  17  18  19  20
Actual  C1  C1  C1  C2  C2  C2  C3  C3  C3  C1  C2  C3  C1  C4  C3  C4  C4  C4  C4  C4
Pred    C1  C2  C1  C2  C3  C2  C3  C1  C4  C1  C2  C3  C1  C4  C3  C4  C3  C2  C4  C1

i. Construct the confusion matrix for this multiclass problem.
ii. Compute the following metrics for each class:
• Precision
• Recall
• F1-Score
iii. Compute the overall accuracy of the model.
iv. Compute the macro-averaged and weighted-averaged F1-Score.
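
The steps in Question 6 can be sketched in code to check a hand solution. The helper names `confusion_matrix` and `per_class_scores` are hypothetical, rows are taken as actual classes and columns as predicted classes, and a metric with a zero denominator is set to 0.0; these conventions are assumptions, not something the exercise states.

```python
from collections import Counter


def confusion_matrix(actual, pred, labels):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, pred))
    return [[counts[(a, p)] for p in labels] for a in labels]


def per_class_scores(cm):
    """Return one (precision, recall, f1, support) tuple per class."""
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        predicted = sum(cm[r][k] for r in range(n))  # column total
        support = sum(cm[k])                         # row total
        precision = tp / predicted if predicted else 0.0
        recall = tp / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores.append((precision, recall, f1, support))
    return scores


labels = ["C1", "C2", "C3", "C4"]
actual = "C1 C1 C1 C2 C2 C2 C3 C3 C3 C1 C2 C3 C1 C4 C3 C4 C4 C4 C4 C4".split()
pred = "C1 C2 C1 C2 C3 C2 C3 C1 C4 C1 C2 C3 C1 C4 C3 C4 C3 C2 C4 C1".split()

cm = confusion_matrix(actual, pred, labels)
scores = per_class_scores(cm)
accuracy = sum(cm[k][k] for k in range(len(labels))) / len(actual)
macro_f1 = sum(f1 for _, _, f1, _ in scores) / len(scores)
weighted_f1 = sum(f1 * support for _, _, f1, support in scores) / len(actual)

for row in cm:
    print(row)
print(scores, accuracy, macro_f1, weighted_f1)
```

The macro average weights every class equally, while the weighted average scales each class's F1-Score by its number of actual samples.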
