Machine Learning Homework 8

The document presents two exercises on clustering algorithms: K-means and DBSCAN. In Exercise 1, one epoch of K-means clustering is applied to 8 data points (A1 to A8), giving three clusters whose centroids are then recomputed. In Exercise 2, the DBSCAN algorithm labels core, border, and noise points, finding one cluster with ϵ = 2 and two clusters with ϵ = √10.

Uploaded by

Linh Phạm

Homework 08

Tri Dung Do - 22025501 - INT3405


October 24, 2024

Collaboration: None
Acknowledgments: None

Exercise 1.

Solution 1.
[Figure: Data points with three initial centroids. Scatter plot of A1 to A8 (axes x and y); legend: Initial Centroids, Data Points.]
The figure above plots the 8 data points (A1 to A8) in 2-D space.
We run one epoch of K-means clustering on these data points. The K-means clustering algorithm is shown below (Algorithm 1). Following the algorithm, we take the initial centroids given in the exercise: A1, A4, and A7.
Next, we form 3 clusters. To do this, we compute the Euclidean distance from each centroid to every non-centroid point. The results are shown in the table below.

Algorithm 1 K-means Clustering Algorithm
1: Select K points as the initial centroids.
2: repeat
3:     Form K clusters by assigning all points to the closest centroid.
4:     Recompute the centroid of each cluster.
5: until the centroids do not change
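As a minimal sketch, Algorithm 1 can be written in Python as follows. The function name `kmeans`, the `max_epochs` cap, the tie-breaking rule (lowest centroid index wins), and the handling of empty clusters are our own assumptions; the pseudocode above does not specify them.

```python
from math import dist

def kmeans(points, centroids, max_epochs=100):
    """Sketch of Algorithm 1 for 2-D points given as (x, y) tuples."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_epochs):
        # Step 3: assign every point to its closest centroid
        # (ties go to the centroid with the lowest index: an assumption).
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Step 4: recompute each centroid as the mean of its cluster.
        # An empty cluster keeps its old centroid (an assumption).
        new_centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c)) if c else old
            for c, old in zip(clusters, centroids)
        ]
        # Step 5: stop once the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return clusters, centroids
```

Calling `kmeans(points, initial_centroids, max_epochs=1)` performs a single epoch, which is the amount of work carried out in this solution.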

Euclidean distance from each non-centroid point (rows) to each initial centroid (columns):

Point    A1      A4      A7
A2       5       3√2     √10
A3       6√2     5       √53
A5       5√2     √13     3√5
A6       2√13    √17     √29
A8       √5      √2      √58
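The distances in the table can be checked with a short script. This is a sketch under one assumption: the coordinates below are not stated explicitly in this solution and are inferred from its centroid sums and distances. Working with squared distances keeps everything in integers, so an entry such as 3√2 can be verified exactly as 18.

```python
# Assumption: coordinates inferred from the centroid sums and distances in this solution.
points = {"A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
          "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9)}
centroid_names = ["A1", "A4", "A7"]

def squared_distance(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

# One row per non-centroid point, one column per initial centroid.
squared_table = {
    name: [squared_distance(xy, points[c]) for c in centroid_names]
    for name, xy in points.items() if name not in centroid_names
}
# The nearest centroid has the smallest squared distance.
nearest = {name: centroid_names[row.index(min(row))] for name, row in squared_table.items()}

for name, row in squared_table.items():
    print(name, row, "->", nearest[name])
```

Each printed value is the square of the corresponding table entry, followed by the centroid that the point is assigned to.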

The table shows that A2 is nearest to A7, while A3, A5, A6, and A8 are nearest to A4. No non-centroid point is nearest to A1, so A1 forms a cluster on its own. As a result, we have three clusters:
Cluster1 = [A1]
Cluster2 = [A3, A4, A5, A6, A8]
Cluster3 = [A2, A7]
Finally, we recompute the centroid of each cluster:

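$$\text{Centroid1} = (2;\ 10)$$

$$\text{Centroid2} = \left(\frac{4+5+7+6+8}{5};\ \frac{9+8+5+4+4}{5}\right) = (6;\ 6)$$

$$\text{Centroid3} = \left(\frac{2+1}{2};\ \frac{5+2}{2}\right) = \left(\frac{3}{2};\ \frac{7}{2}\right)$$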
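The centroid arithmetic can be reproduced exactly with Python's `fractions` module. This is only a sketch: the helper name `centroid` is ours, and the coordinates are an assumption inferred from the sums and distances in this solution.

```python
from fractions import Fraction

def centroid(cluster):
    """Mean of a list of (x, y) points, kept exact with Fraction."""
    n = len(cluster)
    return (Fraction(sum(x for x, _ in cluster), n),
            Fraction(sum(y for _, y in cluster), n))

# Assumption: coordinates inferred from the sums and distances in this solution.
cluster1 = [(2, 10)]                                  # A1
cluster2 = [(8, 4), (5, 8), (7, 5), (6, 4), (4, 9)]   # A3, A4, A5, A6, A8
cluster3 = [(2, 5), (1, 2)]                           # A2, A7

for name, c in [("Centroid1", cluster1), ("Centroid2", cluster2), ("Centroid3", cluster3)]:
    cx, cy = centroid(c)
    print(f"{name} = ({cx}; {cy})")
# Centroid1 = (2; 10)
# Centroid2 = (6; 6)
# Centroid3 = (3/2; 7/2)
```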
Additionally, we draw a new graph showing the clusters with their new centroids.

[Figure: Result after 1 epoch of K-means. Scatter plot of A1 to A8 (axes x and y); legend: Cluster 1, Cluster 2, Cluster 3; Centroid1, Centroid2, and Centroid3 are marked.]

Exercise 2.

Solution 2.

The DBSCAN algorithm is shown below:

Algorithm 2 DBSCAN Algorithm
1: Label all points as core, border, or noise points.
2: Eliminate noise points.
3: Put an edge between all core points that are within Eps of each other.
4: Make each group of connected core points into a separate cluster.
5: Assign each border point to one of the clusters with its associated core points.
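The five steps above can be sketched in Python as follows. The function name `dbscan`, the parameter names, and the choice to compare squared distances against ϵ² (to avoid floating-point square roots) are our own. The sketch also assumes that a point is core when at least `min_neighbors` other points lie within ϵ of it, with the point itself not counted and the boundary included.

```python
def dbscan(points, eps_sq, min_neighbors):
    """Sketch of Algorithm 2. `points` maps a label to an (x, y) tuple.

    Assumption: a point is core when at least `min_neighbors` OTHER points
    lie within eps of it (squared distance <= eps_sq, boundary included).
    """
    def sq_dist(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    neighbors = {
        a: [b for b in points if b != a and sq_dist(points[a], points[b]) <= eps_sq]
        for a in points
    }
    # Step 1: label points as core, border, or noise.
    core = {a for a in points if len(neighbors[a]) >= min_neighbors}
    border = {a for a in points if a not in core and any(b in core for b in neighbors[a])}
    # Step 2: noise points are set aside and join no cluster.
    noise = set(points) - core - border

    # Steps 3-4: groups of connected core points (edges join core points within eps).
    clusters, seen = [], set()
    for start in sorted(core):
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            a = stack.pop()
            if a in seen:
                continue
            seen.add(a)
            group.add(a)
            stack.extend(b for b in neighbors[a] if b in core and b not in seen)
        clusters.append(group)

    # Step 5: attach each border point to the cluster of one neighbouring core point.
    for a in sorted(border):
        owner = next(b for b in neighbors[a] if b in core)
        next(g for g in clusters if owner in g).add(a)
    return clusters, noise
```

Some library implementations count the point itself towards the threshold. scikit-learn's `min_samples` does, so "at least 2 other points" would correspond to `min_samples=3` there.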

Following the algorithm, we first determine which points are core, border, or noise points.
A core point must have at least 2 other points within radius ϵ = 2 of it (min sample = 2, not counting the point itself). The core points are therefore [A3, A5, A6]. A4 and A8 are within ϵ of each other (distance √2), but each has only that one neighbour, so neither is a core point.
There are no border points, since no non-core point lies within ϵ of a core point.
All remaining points, [A1, A2, A4, A7, A8], are noise points.
Connecting the three core points [A3, A5, A6] gives one cluster. So with ϵ = 2 and min sample = 2, DBSCAN discovers only 1 cluster, C = [A3, A5, A6]. The graph below shows this cluster.
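The core-point labelling for ϵ = 2 can be checked by listing each point's neighbours. This is a sketch under two assumptions: the coordinates are inferred from the K-means distance table and centroid sums, and neighbours are counted excluding the point itself.

```python
# Assumption: coordinates inferred from the K-means distance table and centroid sums.
points = {"A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
          "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9)}
EPS_SQ = 4         # eps = 2, compared on squared distances to stay in integers
MIN_NEIGHBORS = 2  # assumption: counted excluding the point itself

def neighbors_within(name):
    """Labels of all other points whose squared distance to `name` is at most EPS_SQ."""
    px, py = points[name]
    return [other for other, (qx, qy) in points.items()
            if other != name and (px - qx) ** 2 + (py - qy) ** 2 <= EPS_SQ]

neighbor_lists = {name: neighbors_within(name) for name in points}
core_points = [name for name, nbrs in neighbor_lists.items() if len(nbrs) >= MIN_NEIGHBORS]

for name, nbrs in neighbor_lists.items():
    print(name, nbrs)
print("core:", core_points)  # core: ['A3', 'A5', 'A6']
```

Under these assumptions A4 and A8 each list only the other as a neighbour, which is why they fall below the threshold.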

[Figure: Result of DBSCAN with ϵ = 2. Scatter plot of A1 to A8 (axes x and y); legend: Noise Point, Cluster.]

When ϵ is increased to √10, the core points become [A3, A5, A6, A8]. The border points are [A1, A4], as both lie in the neighbourhood of A8, a core point. The noise points are A2 and A7: they are exactly √10 apart, so each has at most one neighbour and neither is a core point.
We connect the core points A3, A5, and A6, since their pairwise distances are smaller than √10; these three points form a cluster. No other core point lies within √10 of A8, so A8 forms a cluster by itself.
The two border points A1 and A4 are in the neighbourhood of A8, so we assign them to the cluster created by A8. A2 and A7 are noise points, so they are not assigned to any cluster.
To conclude, DBSCAN with ϵ = √10 and min sample = 2 discovers two clusters:

Cluster1 = [A1, A4, A8]


Cluster2 = [A3, A5, A6]
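The split of the core points into two groups can be checked by listing the core-to-core edges. This is a sketch only: the coordinates are an assumption inferred from the K-means distance table and centroid sums, and squared distances are compared against ϵ² = 10 to stay in integers.

```python
from itertools import combinations

# Assumption: coordinates inferred from the K-means distance table and centroid sums.
core = {"A3": (8, 4), "A5": (7, 5), "A6": (6, 4), "A8": (4, 9)}
EPS_SQ = 10  # eps = sqrt(10), compared on squared distances

# Squared distance for every pair of core points.
squared = {
    (a, b): (core[a][0] - core[b][0]) ** 2 + (core[a][1] - core[b][1]) ** 2
    for a, b in combinations(core, 2)
}
# An edge joins two core points that are within eps of each other.
edges = [pair for pair, d2 in squared.items() if d2 <= EPS_SQ]

print(squared)
print(edges)  # [('A3', 'A5'), ('A3', 'A6'), ('A5', 'A6')]
```

No edge involves A8, since its squared distances to A3, A5, and A6 are 41, 25, and 29. It therefore starts its own cluster, which the border points A1 and A4 then join.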
The visualization is illustrated below:

[Figure: Result of DBSCAN with ϵ = √10. Scatter plot of A1 to A8 (axes x and y).]
