Homework#6

Uploaded by

008133327

Homework#6 (Machine Learning)

a. What are the 3 clusters and their centers after one iteration? Show the detailed steps, here and in question b.

b. What are the 3 clusters and their centers after two iterations?
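
The two iterations asked for above can be sketched in code. This is a minimal illustration only: the homework's dataset and initial centers are not reproduced in this document, so the points and starting centers below are hypothetical, and `kmeans_step` is a name introduced here for the example.

```python
import math

def kmeans_step(points, centers):
    """One K-means iteration: assign each point to its nearest center,
    then move each center to the mean of its assigned points."""
    clusters = [[] for _ in centers]
    for p in points:
        # Index of the nearest center under Euclidean distance.
        nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        clusters[nearest].append(p)
    new_centers = []
    for members, old in zip(clusters, centers):
        if members:
            new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            new_centers.append(old)  # an empty cluster keeps its previous center
    return clusters, new_centers

# Hypothetical data and initial centers (NOT the homework's dataset).
points = [(1.0, 1.0), (1.0, 2.0), (5.0, 5.0), (6.0, 5.0), (9.0, 1.0), (9.0, 2.0)]
initial_centers = [(1.0, 1.0), (5.0, 5.0), (9.0, 1.0)]

clusters_1, centers_1 = kmeans_step(points, initial_centers)  # after one iteration
clusters_2, centers_2 = kmeans_step(points, centers_1)        # after two iterations
```

Each call performs the assignment step followed by the update step, so calling it twice gives the clusters and centers requested in parts a and b for whatever data is supplied.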
Dataset A (write A1 or A2; same for the following datasets)

_ In A1, a few points lie outside the circle, and the cluster centers do not sit at the average position of their points, as K-means would require. K-means was therefore applied to A2.

Dataset B

_ In B1, the far right side of the red cluster is nearer to the other clusters than to its own. The remaining clusters are marked with their centroids, so K-means was applied to B2.

Dataset C

_ The clusters in C1 are separate and smaller, whereas C2 contains a connected cluster that is larger than any cluster in C1. So K-means was applied to C2.

Dataset D
_ The colored points at the extreme ends are also nearer to other centroids than to their own centroid, so it can be said that K-means was applied to D1.

Dataset E

_ The line that separates the light blue cluster from the deep blue cluster in E1 has a negative (downward) slope. However, because the centroid of the light blue cluster is a little higher than that of the deep blue cluster, the dividing line produced by K-means must have a positive slope. K-means was therefore applied to E2.

Dataset F

_ The centroid of the blue cluster in F1 is clearly visible and lies near the red cluster, which means the clusters there are well separated. So K-means was applied to F2.
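
The reasoning used for datasets A to F rests on two properties of a converged K-means result: each centroid is the mean of its cluster, and no point is closer to another cluster's centroid than to its own. The sketch below checks those properties for a given labeling. It is only an illustration: the function name `is_kmeans_consistent` and the sample points are hypothetical, since the figures for the datasets are not available here.

```python
import math

def is_kmeans_consistent(points, labels, tol=1e-9):
    """Return True if every point is at least as close to the mean of its
    own cluster as to the mean of any other cluster."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    # Centroid of each cluster = coordinate-wise mean of its members.
    centroids = {lab: tuple(sum(c) / len(m) for c in zip(*m))
                 for lab, m in groups.items()}
    for p, lab in zip(points, labels):
        own = math.dist(p, centroids[lab])
        for other, c in centroids.items():
            if other != lab and math.dist(p, c) < own - tol:
                return False  # p would be reassigned by K-means
    return True

# Hypothetical 2-D points with two candidate labelings.
sample = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
plausible = is_kmeans_consistent(sample, [0, 0, 1, 1])    # could be a K-means result
implausible = is_kmeans_consistent(sample, [0, 0, 0, 1])  # (10, 0) is nearer the other centroid
```

A labeling that fails this check cannot be a converged K-means output, which is the kind of argument made for choosing between the two panels of each dataset.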

a. What is the distance between the two farthest members (max, or complete link)? Round to four decimal places here and in the next two problems.


b. What is the distance between the two closest members? (min or single link)

c. What is the average distance between all pairs?

= 5.6209 (rounded from 5.620910252)
d. What is the center distance between two clusters?

e. Among the four distances above, which one is robust to noise? Answer "complete", "single", "average", or "center".

_ Among the four distances above, the one most robust to noise is the "average" distance. The average distance over all pairs takes every point in both clusters into account, so it gives a more balanced measure that is less sensitive to outliers or extreme values.
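
The four inter-cluster distances in parts a to d can be computed as sketched below. The two clusters from the homework are not reproduced in this document, so the coordinates used here are hypothetical, and `linkage_distances` is a name introduced only for this example.

```python
import math
from itertools import product

def centroid(pts):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(c) / len(pts) for c in zip(*pts))

def linkage_distances(a, b):
    """Complete, single, average, and center distances between clusters a and b."""
    pairwise = [math.dist(p, q) for p, q in product(a, b)]
    return {
        "complete": max(pairwise),                  # two farthest members
        "single": min(pairwise),                    # two closest members
        "average": sum(pairwise) / len(pairwise),   # mean over all cross-cluster pairs
        "center": math.dist(centroid(a), centroid(b)),
    }

# Hypothetical clusters (NOT the homework's data).
cluster_a = [(0.0, 0.0), (0.0, 1.0)]
cluster_b = [(3.0, 0.0), (4.0, 0.0)]
distances = {name: round(d, 4) for name, d in linkage_distances(cluster_a, cluster_b).items()}
```

The complete and single distances each depend on one pair of points, which is why a single noisy point can change them sharply, while the average is taken over every cross-cluster pair.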



● Point 1: 7 points within ε

● Point 2: 9 points within ε

● Point 3: 7 points within ε


● Point 4: 5 points within ε

● Point 5: 6 points within ε

● Point 6: 5 points within ε

● Point 7: 6 points within ε

● Point 8: 6 points within ε

● Point 9: 4 points within ε

● Point 10: 4 points within ε

● Point 11: 3 points within ε

● Point 12: 4 points within ε

● Point 13: 7 points within ε

● Point 14: 7 points within ε

● Point 15: 3 points within ε

● Point 16: 7 points within ε

Core Points: Points 1, 2, 3, 4, 5, 6, 7, 8, 13, 14, 16

Border Points: None (all points within the ε-neighborhood of a core point are themselves core points)

Outliers: Points 9, 10, 11, 12, 15.
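
The classification above can be reproduced from the neighbor counts. The sketch below assumes MinPts = 5, which is not stated in this document but is the threshold that yields exactly the listed core points; it also assumes the counts are used as given. The general `classify` function and its sample coordinates are hypothetical, since the original point coordinates and ε are not available here.

```python
import math

# Neighbor counts copied from the list above (point number -> points within ε).
counts = {1: 7, 2: 9, 3: 7, 4: 5, 5: 6, 6: 5, 7: 6, 8: 6,
          9: 4, 10: 4, 11: 3, 12: 4, 13: 7, 14: 7, 15: 3, 16: 7}
MIN_PTS = 5  # assumption: not given in the text, inferred from the core-point list

core_points = sorted(p for p, n in counts.items() if n >= MIN_PTS)
non_core_points = sorted(p for p, n in counts.items() if n < MIN_PTS)

def classify(points, eps, min_pts):
    """Label each point 'core', 'border', or 'noise'.
    A point's neighborhood here includes the point itself (an assumption)."""
    neighbors = [[j for j, q in enumerate(points) if math.dist(p, q) <= eps]
                 for p in points]
    is_core = [len(n) >= min_pts for n in neighbors]
    labels = []
    for i, n in enumerate(neighbors):
        if is_core[i]:
            labels.append("core")
        elif any(is_core[j] for j in n):
            labels.append("border")  # not core, but within eps of a core point
        else:
            labels.append("noise")
    return labels

# Hypothetical coordinates to exercise the border/noise distinction.
sample = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (10, 10)]
sample_labels = classify(sample, eps=1.0, min_pts=4)
```

Whether a non-core point is a border point or an outlier depends on whether a core point lies within ε of it, which needs the actual neighborhoods and cannot be decided from the counts alone.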
