
CLUSTERING GRID-BASED METHODS Elsayed Hemayed Data Mining Course

This document discusses density-based clustering methods. It provides background on density-based clustering and terminology. It explains that DBSCAN is a prominent density-based clustering algorithm that finds clusters by searching for core points with many nearby points and iteratively collecting density-reachable points to form clusters.

Uploaded by

amjad

CLUSTERING

DENSITY-BASED METHODS
Elsayed Hemayed
Data Mining Course
Outline

• Density-Based Clustering Methods
• Density-Based Clustering Background
• Terminology
• How does DBSCAN find clusters?
• DBSCAN

Density-based Clustering Methods


Clustering Methods

• Partitioning methods
  - K-Means
• Hierarchical methods
  - Agglomerative hierarchical clustering
  - Divisive hierarchical clustering
• Density-based methods
  - DBSCAN: Density-Based Spatial Clustering of Applications with Noise
• Grid-based methods
  - STING: A Statistical Information Grid Approach to Spatial Data Mining
• Model-based methods
  - Expectation-Maximization
  - Neural network approach
• High-dimensional data clustering
  - CLIQUE: A Dimension-Growth Subspace Clustering Method


Density-based Clustering Methods
DBSCAN


Density-Based Clustering Methods

• Clustering is based on density, for example on density-connected points, rather than on a distance criterion alone.
• Cluster = set of “density-connected” points.
• Major features:
  - Discover clusters of arbitrary shape
  - Handle noise
  - Need “density parameters” as a termination condition (a cluster stops growing when no new objects can be added to it)
• Examples:
  - DBSCAN (Ester et al., 1996)
  - OPTICS (Ankerst et al., 1999)
  - DENCLUE (Hinneburg & Keim, 1998)


Density-Based Clustering: Background

• Eps-neighborhood: the neighborhood within a radius Eps of a given object.
• MinPts: the minimum number of points required in the Eps-neighborhood of an object.
• Core object: an object whose Eps-neighborhood contains at least MinPts points.
• Directly density-reachable: a point p is directly density-reachable from a point q with respect to Eps and MinPts if
  1) p is within the Eps-neighborhood of q, and
  2) q is a core object.

(Figure: points p and q, with MinPts = 5 and Eps = 1.)
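
The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the function names, the sample coordinates, and the choice of Euclidean distance are assumptions, and the sketch counts a point as a member of its own Eps-neighborhood, which is a common convention but not stated on the slide.

```python
from math import dist  # Euclidean distance, Python 3.8+

def eps_neighborhood(points, i, eps):
    """Indices of all points within distance eps of points[i] (i itself included)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def is_core(points, i, eps, min_pts):
    """A core object has at least min_pts points in its Eps-neighborhood."""
    return len(eps_neighborhood(points, i, eps)) >= min_pts

def directly_density_reachable(points, p, q, eps, min_pts):
    """True if points[p] is directly density-reachable from points[q]."""
    return p in eps_neighborhood(points, q, eps) and is_core(points, q, eps, min_pts)

# Hypothetical data: a small clump around the origin plus one point on its edge.
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (-0.5, 0.0), (0.0, -0.5), (1.4, 0.0)]
EPS, MIN_PTS = 1.0, 5

origin_is_core = is_core(points, 0, EPS, MIN_PTS)
edge_is_core = is_core(points, 5, EPS, MIN_PTS)
edge_from_clump = directly_density_reachable(points, 5, 1, EPS, MIN_PTS)
clump_from_edge = directly_density_reachable(points, 1, 5, EPS, MIN_PTS)
```

With these sample points the relation is not symmetric: the edge point is directly density-reachable from a core object in the clump, but not the other way round, because the edge point is not a core object.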
Density Reachability and Density Connectivity

• M, P, O and R are core objects, since the Eps-neighborhood of each contains at least 3 points.

(Figure: MinPts = 3; Eps = radius of the circles.)
Directly density reachable

• Q is directly density-reachable from M.
• M is directly density-reachable from P, and vice versa.


Indirectly density reachable

• Q is indirectly density-reachable from P, since Q is directly density-reachable from M and M is directly density-reachable from P. However, P is not density-reachable from Q, since Q is not a core object.
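
The chain argument for P, M and Q can be reproduced with a small breadth-first search. The sketch below is illustrative only: the coordinates are invented (the slide's figure is not available), the extra point A is hypothetical, and a point is assumed to count as a member of its own Eps-neighborhood.

```python
from collections import deque
from math import dist

def neighbors(points, i, eps):
    """Indices of all points within distance eps of points[i] (i itself included)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def density_reachable_from(points, start, eps, min_pts):
    """Indices density-reachable from points[start]; empty if start is not core."""
    reached = set()
    queue = deque([start])
    while queue:
        q = queue.popleft()
        nbrs = neighbors(points, q, eps)
        if len(nbrs) < min_pts:
            continue  # q is not a core object, so no chain continues through q
        for p in nbrs:
            if p not in reached:
                reached.add(p)
                queue.append(p)
    return reached

# Hypothetical layout on a line: A - P - M - Q, one unit apart.
names = ["A", "P", "M", "Q"]
points = [(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
EPS, MIN_PTS = 1.2, 3

from_p = {names[i] for i in density_reachable_from(points, names.index("P"), EPS, MIN_PTS)}
from_q = {names[i] for i in density_reachable_from(points, names.index("Q"), EPS, MIN_PTS)}
```

In this layout P and M are core objects while Q is not, so Q appears in `from_p` (via M) but `from_q` is empty, matching the asymmetry described on the slide.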


Core, border, and noise points

• DBSCAN: Density-Based Spatial Clustering of Applications with Noise.
• Density = number of points within a specified radius (Eps).
• A core point has at least a specified number of points (MinPts) within Eps.
  - These are points in the interior of a cluster.
• A border point has fewer than MinPts points within Eps, but lies in the neighborhood of a core point.
• A noise point is any point that is neither a core point nor a border point.
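
The three point types can be computed directly from the neighborhood counts. The following is a minimal sketch under stated assumptions: Euclidean distance, a point counted in its own Eps-neighborhood, and made-up sample data; `classify` is a hypothetical helper name.

```python
from math import dist

def classify(points, eps, min_pts):
    """Label every point as 'core', 'border' or 'noise'."""
    # Eps-neighborhood of every point (each point includes itself).
    nbrs = [[j for j, q in enumerate(points) if dist(p, q) <= eps] for p in points]
    core = {i for i, n in enumerate(nbrs) if len(n) >= min_pts}
    labels = []
    for i, n in enumerate(nbrs):
        if i in core:
            labels.append("core")
        elif any(j in core for j in n):
            labels.append("border")  # not core, but inside a core point's neighborhood
        else:
            labels.append("noise")
    return labels

# Hypothetical data: four points on a line and one far-away outlier.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (10.0, 10.0)]
labels = classify(points, eps=1.2, min_pts=3)
```

Here the two interior points of the line are core points, the two end points are border points, and the outlier is noise.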


How does DBSCAN find clusters?

• DBSCAN searches for clusters by checking the Eps-neighborhood of each point in the database.
• If the Eps-neighborhood of a point p contains at least MinPts points, a new cluster with p as a core object is created.
• DBSCAN then iteratively collects directly density-reachable objects from these core objects, which may involve merging a few density-reachable clusters.
• The process terminates when no new point can be added to any cluster.


DBSCAN Algorithm

• Arbitrarily select a point p.
• Retrieve all points density-reachable from p with respect to Eps and MinPts.
• If p is a core point, a cluster is formed.
• If p is a border point, no points are density-reachable from p, and DBSCAN visits the next point of the database.
• Continue the process until all of the points have been processed.
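
The steps above can be sketched as a short, unoptimized Python function. This is one possible reading of the procedure, not a reference implementation: it uses a brute-force neighborhood search, Euclidean distance, and counts a point in its own Eps-neighborhood; the names `dbscan`, `region_query` and `NOISE` and the sample data are assumptions made for illustration.

```python
from math import dist

NOISE = -1  # label used here for points that end up in no cluster

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, NOISE otherwise."""
    def region_query(i):
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    labels = [None] * len(points)  # None means "not yet processed"
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = region_query(i)
        if len(nbrs) < min_pts:
            labels[i] = NOISE  # provisional: may later turn out to be a border point
            continue
        cluster_id += 1  # i is a core point, so a new cluster is formed
        labels[i] = cluster_id
        seeds = list(nbrs)
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue  # already assigned; do not expand again
            labels[j] = cluster_id
            j_nbrs = region_query(j)
            if len(j_nbrs) >= min_pts:
                seeds.extend(j_nbrs)  # j is core: keep collecting reachable points
    return labels

# Hypothetical data: two groups on a line and one distant outlier.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0),
          (10.0, 0.0), (11.0, 0.0), (12.0, 0.0), (50.0, 50.0)]
labels = dbscan(points, eps=1.2, min_pts=3)
```

The brute-force `region_query` makes this quadratic in the number of points; practical implementations typically use a spatial index for the neighborhood lookups.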


DBSCAN Summary

• DBSCAN is a density-based clustering method based on connected regions with sufficiently high density.
• The algorithm grows regions with sufficiently high density into clusters and discovers clusters of arbitrary shape in spatial databases with noise.
• It defines a cluster as a maximal set of density-connected points. Cluster membership is therefore decided by density-connectivity rather than by a distance criterion alone, unlike in hierarchical methods; a distance function is still needed to define the Eps-neighborhood.


Summary

• Density-Based Clustering Methods
• Density-Based Clustering Background
• Terminology
• How does DBSCAN find clusters?
• DBSCAN
