DBSCAN Clustering
DBSCAN Clustering
Defnition:
DBSCAN (density-Based Spatial Clustering of Applications with Noise) is a
density-based clustering algorithm that identifies clusters in data by grouping
points that are close together and marking points in low-density regions as
noise or outliers. It does not require specifying the number of clusters in
advance and works well for clusters of arbitrary shape.(handles nasted
clusters).
Border Points:
Noise Points:
DBSCAN Clustering 1
Points that are not core points and are not within the ϵ-neighborhood of any
core point.
Treated as outliers.
2. Parameters
Epsilon (ϵ):
MinPts:
Steps :
For each point, count how many points fall within its ϵ-neighborhood.
DBSCAN Clustering 2
A point is considered a core point if it has at least MinPts points
(including itself) within a given radius ε (epsilon)
2. Expand Clusters:
DBSCAN Clustering 3
Create a new cluster and include all points in its ϵ-neighborhood.
Recursively add all neighboring core points and their neighbors to the
cluster.
3. Classify Points:
Border points are added to the cluster of the nearest core point.( Non
core points) but we dont use it to ad to the cluster, meaning non core
points can only be added to the cluste , but we don’t use them to
expand it ( Ne9fou fih)
DBSCAN Clustering 4
Points not belonging to any cluster are classified as noise.
Advantages
1. No Need to Specify KKK:
DBSCAN Clustering 5
Can identify clusters of irregular shapes (e.g., spirals, concentric
circles).
3. Handles Noise:
Limitations
1. Parameter Sensitivity:
ϵthat is too small results in many small clusters or noise, while too large
may merge clusters.
2. Varying Densities:
3. High Dimensionality:
DBSCAN Clustering 6