Clustering Density Based

The document discusses density-based clustering methods. It describes DBSCAN, which uses a density-based notion of clusters to find clusters of arbitrary shape and to handle noise. OPTICS is presented as an improvement over DBSCAN that produces a cluster ordering, which allows analysis over a range of parameter settings. DENCLUE is also summarized: it uses statistical density functions and identifies clusters through density attractors, the local maxima of the overall density function.

Uploaded by

AdamZain788

Chapter 10. Cluster Analysis: Basic Concepts and Methods
• Cluster Analysis: Basic Concepts
• Partitioning Methods
• Hierarchical Methods
• Density-Based Methods
• Grid-Based Methods
• Evaluation of Clustering
• Summary

Density-Based Clustering Methods
• Clustering based on density (a local cluster criterion), such as density-connected points
• Major features:
  ◦ Discover clusters of arbitrary shape
  ◦ Handle noise
  ◦ One scan
  ◦ Need density parameters as a termination condition
• Several interesting studies:
  ◦ DBSCAN: Ester et al. (KDD'96)
  ◦ OPTICS: Ankerst et al. (SIGMOD'99)
  ◦ DENCLUE: Hinneburg & Keim (KDD'98)
  ◦ CLIQUE: Agrawal et al. (SIGMOD'98) (more grid-based)

Density-Based Clustering: Basic Concepts
• Two parameters:
  ◦ Eps: maximum radius of the neighbourhood
  ◦ MinPts: minimum number of points in an Eps-neighbourhood of that point
• N_Eps(p) = {q belongs to D | dist(p, q) ≤ Eps}
• Directly density-reachable: a point p is directly density-reachable from a point q w.r.t. Eps, MinPts if
  ◦ p belongs to N_Eps(q)
  ◦ core point condition: |N_Eps(q)| ≥ MinPts
[Figure: point p inside the Eps-neighbourhood of core point q; MinPts = 5, Eps = 1 cm]

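The definitions above can be sketched in a few lines of Python. This is only an illustration: the function names, the tuple representation of points, and the sample data set `D` are assumptions, and the Eps-neighbourhood is found by a brute-force scan rather than a spatial index.

```python
from math import dist

def eps_neighbourhood(points, p, eps):
    """N_Eps(p): all points q of the data set with dist(p, q) <= eps (p itself included)."""
    return [q for q in points if dist(p, q) <= eps]

def is_core(points, q, eps, min_pts):
    """Core point condition: |N_Eps(q)| >= MinPts."""
    return len(eps_neighbourhood(points, q, eps)) >= min_pts

def directly_density_reachable(points, p, q, eps, min_pts):
    """p is directly density-reachable from q if p lies in N_Eps(q) and q is a core point."""
    return dist(p, q) <= eps and is_core(points, q, eps, min_pts)

# Hypothetical data: a small dense group plus one point on its edge.
D = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (-0.5, 0.0), (0.0, -0.5), (1.4, 0.0)]

print(is_core(D, (0.0, 0.0), eps=1.0, min_pts=5))
print(directly_density_reachable(D, (1.4, 0.0), (0.5, 0.0), eps=1.0, min_pts=5))
print(directly_density_reachable(D, (0.5, 0.0), (1.4, 0.0), eps=1.0, min_pts=5))
```

The last two calls show that the relation is not symmetric: the edge point can be reached from a core point, but nothing is directly density-reachable from the edge point because it is not a core point itself.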
Density-Reachable and Density-Connected
• Density-reachable:
  ◦ A point p is density-reachable from a point q w.r.t. Eps, MinPts if there is a chain of points p1, …, pn, with p1 = q and pn = p, such that pi+1 is directly density-reachable from pi
• Density-connected:
  ◦ A point p is density-connected to a point q w.r.t. Eps, MinPts if there is a point o such that both p and q are density-reachable from o w.r.t. Eps and MinPts
[Figure: a chain from q through p1 to p (density-reachable); p and q both reached from o (density-connected)]

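As a rough sketch of the two relations, the chain of directly density-reachable points can be followed with a breadth-first search that only expands core points. The helper names and the sample data are assumptions made for illustration, and a non-core starting point is treated as reaching nothing.

```python
from collections import deque
from math import dist

def neighbours(points, p, eps):
    """Brute-force Eps-neighbourhood of p (p itself included)."""
    return [q for q in points if dist(p, q) <= eps]

def reachable_set(points, q, eps, min_pts):
    """All points density-reachable from q; empty if q is not a core point."""
    if len(neighbours(points, q, eps)) < min_pts:
        return set()
    seen = {q}
    queue = deque([q])
    while queue:
        current = queue.popleft()
        nbrs = neighbours(points, current, eps)
        if len(nbrs) < min_pts:
            continue  # border point: the chain cannot continue through it
        for n in nbrs:
            if n not in seen:
                seen.add(n)
                queue.append(n)
    return seen

def density_reachable(points, p, q, eps, min_pts):
    """True if p is density-reachable from q."""
    return p in reachable_set(points, q, eps, min_pts)

def density_connected(points, p, q, eps, min_pts):
    """True if some point o exists from which both p and q are density-reachable."""
    for o in points:
        reached = reachable_set(points, o, eps, min_pts)
        if p in reached and q in reached:
            return True
    return False

# Hypothetical data: a line of points with border points at both ends and one outlier.
D = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0), (10.0, 0.0)]

print(density_reachable(D, (4.0, 0.0), (1.0, 0.0), eps=1.0, min_pts=3))
print(density_reachable(D, (1.0, 0.0), (4.0, 0.0), eps=1.0, min_pts=3))
print(density_connected(D, (0.0, 0.0), (4.0, 0.0), eps=1.0, min_pts=3))
```

In this data the two end points are border points, so neither is density-reachable from the other, yet they are density-connected through the core points in the middle.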
DBSCAN: Density-Based Spatial Clustering of Applications with Noise
• Relies on a density-based notion of cluster: a cluster is defined as a maximal set of density-connected points
• Discovers clusters of arbitrary shape in spatial databases with noise
[Figure: core, border, and outlier points; Eps = 1 cm, MinPts = 5]

DBSCAN: The Algorithm
• Arbitrarily select a point p
• Retrieve all points density-reachable from p w.r.t. Eps and MinPts
• If p is a core point, a cluster is formed
• If p is a border point, no points are density-reachable from p and DBSCAN visits the next point of the database
• Continue the process until all of the points have been processed

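The steps above can be sketched as a short, unoptimized Python function. It is a minimal illustration rather than the original implementation: the name `dbscan`, the `NOISE` label of -1, and the brute-force neighbourhood query (instead of a spatial index) are assumptions.

```python
from math import dist

NOISE = -1  # assumed label for points that end up in no cluster

def dbscan(points, eps, min_pts):
    """Return one cluster label per point (0, 1, ...), or NOISE."""
    labels = {}  # point index -> cluster id or NOISE

    def region(i):
        # Indices of all points in the Eps-neighbourhood of point i (i included).
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(len(points)):
        if i in labels:
            continue  # already processed
        nbrs = region(i)
        if len(nbrs) < min_pts:
            labels[i] = NOISE  # not a core point; may later become a border point
            continue
        # i is a core point: form a cluster from everything density-reachable from it.
        labels[i] = cluster
        seeds = list(nbrs)
        while seeds:
            j = seeds.pop()
            if labels.get(j) == NOISE:
                labels[j] = cluster  # previously marked noise, now a border point
            if j in labels:
                continue
            labels[j] = cluster
            j_nbrs = region(j)
            if len(j_nbrs) >= min_pts:
                seeds.extend(j_nbrs)  # j is a core point, so keep expanding
        cluster += 1
    return [labels[i] for i in range(len(points))]

# Hypothetical data: two small dense groups and one isolated point.
data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11),
        (5, 5)]
print(dbscan(data, eps=1.5, min_pts=3))
```

With these settings every point of the two groups is a core point, so each group becomes one cluster and the isolated point is labelled as noise.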
DBSCAN: Sensitive to Parameters

OPTICS: A Cluster-Ordering Method (1999)
• OPTICS: Ordering Points To Identify the Clustering Structure
  ◦ Ankerst, Breunig, Kriegel, and Sander (SIGMOD'99)
• Produces a special order of the database w.r.t. its density-based clustering structure
• This cluster ordering contains information equivalent to the density-based clusterings corresponding to a broad range of parameter settings
• Good for both automatic and interactive cluster analysis, including finding intrinsic clustering structure
• Can be represented graphically or using visualization techniques

OPTICS: Some Extension from DBSCAN
• Index-based:
  ◦ k = number of dimensions
  ◦ N = 20
  ◦ p = 75%
  ◦ M = N(1 − p) = 5
  ◦ Complexity: O(N log N)
• Core distance of a point: the minimum Eps such that the point is a core point
• Reachability distance of p from o: max(core-distance(o), d(o, p))
  ◦ Example with MinPts = 5 and ε = 3 cm: r(p1, o) = 2.8 cm, r(p2, o) = 4 cm
[Figure: points p1 and p2 around the core object o]

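A small sketch of the two distances follows. It assumes that an object counts as a member of its own neighbourhood (consistent with the N_Eps definition earlier) and uses `math.inf` to stand for "undefined"; the function names and the sample points are made up for illustration, with the points chosen so that the values mirror the 2.8 and 4 of the slide.

```python
from math import dist, inf

def core_distance(points, o, eps, min_pts):
    """Smallest radius at which o becomes a core point, or inf if o is not core within eps."""
    within = sorted(dist(o, q) for q in points if dist(o, q) <= eps)
    if len(within) < min_pts:
        return inf  # undefined: o is not a core point w.r.t. eps and min_pts
    return within[min_pts - 1]  # distance to the MinPts-th nearest object (o included)

def reachability_distance(points, p, o, eps, min_pts):
    """max(core-distance(o), d(o, p)), or inf if o is not a core point."""
    cd = core_distance(points, o, eps, min_pts)
    if cd == inf:
        return inf
    return max(cd, dist(o, p))

# Hypothetical layout: o at the origin, four neighbours within eps, one point farther out.
o = (0.0, 0.0)
p1 = (1.0, 0.0)
p2 = (4.0, 0.0)
D = [o, p1, (0.0, 2.0), (-2.5, 0.0), (0.0, -2.8), p2]

print(core_distance(D, o, eps=3.0, min_pts=5))
print(reachability_distance(D, p1, o, eps=3.0, min_pts=5))
print(reachability_distance(D, p2, o, eps=3.0, min_pts=5))
```

A point closer to o than the core distance (here p1) gets the core distance as its reachability distance, while a point farther away (here p2) gets its actual distance to o.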
[Figure: reachability plot. Vertical axis: reachability-distance (including an "undefined" level); horizontal axis: cluster-order of the objects]
Density-Based Clustering: OPTICS & Its Applications

DENCLUE: Using Statistical Density Functions
• DENsity-based CLUstEring by Hinneburg & Keim (KDD'98)
• Using statistical density functions:
  ◦ Influence of y on x: $f_{Gaussian}(x, y) = e^{-\frac{d(x, y)^2}{2\sigma^2}}$
  ◦ Total influence on x: $f^{D}_{Gaussian}(x) = \sum_{i=1}^{N} e^{-\frac{d(x, x_i)^2}{2\sigma^2}}$
  ◦ Gradient of x in the direction of x_i: $\nabla f^{D}_{Gaussian}(x, x_i) = \sum_{i=1}^{N} (x_i - x) \cdot e^{-\frac{d(x, x_i)^2}{2\sigma^2}}$
• Major features
  ◦ Solid mathematical foundation
  ◦ Good for data sets with large amounts of noise
  ◦ Allows a compact mathematical description of arbitrarily shaped clusters in high-dimensional data sets
  ◦ Significantly faster than existing algorithms (e.g., DBSCAN)
  ◦ But needs a large number of parameters

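The three Gaussian formulas of this slide can be written out directly. The sketch below is illustrative only: the function names and the two-point data set are assumptions, and the gradient follows the form on the slide, without any additional scaling factor.

```python
from math import dist, exp

def gaussian_influence(x, y, sigma):
    """Influence of y on x: exp(-d(x, y)^2 / (2 sigma^2))."""
    return exp(-dist(x, y) ** 2 / (2 * sigma ** 2))

def density(x, data, sigma):
    """Total influence on x: the sum of the influences of all data points."""
    return sum(gaussian_influence(x, xi, sigma) for xi in data)

def gradient(x, data, sigma):
    """Sum over all data points of (x_i - x) * influence(x, x_i), one component per dimension."""
    grad = [0.0] * len(x)
    for xi in data:
        w = gaussian_influence(x, xi, sigma)
        for k in range(len(x)):
            grad[k] += (xi[k] - x[k]) * w
    return grad

# Hypothetical data: two points one unit apart.
data = [(0.0, 0.0), (1.0, 0.0)]
print(density((0.0, 0.0), data, sigma=1.0))
print(gradient((0.0, 0.0), data, sigma=1.0))
print(gradient((0.5, 0.0), data, sigma=1.0))
```

At the first data point the gradient points toward the second one, and at the midpoint the two contributions cancel, which is where the density of this tiny data set peaks.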
Denclue: Technical Essence
• Uses grid cells, but only keeps information about grid cells that actually contain data points, and manages these cells in a tree-based access structure
• Influence function: describes the impact of a data point within its neighborhood
• Overall density of the data space can be calculated as the sum of the influence functions of all data points
• Clusters can be determined mathematically by identifying density attractors
• Density attractors are local maxima of the overall density function
• Center-defined clusters: assign to each density attractor the points density-attracted to it
• Arbitrarily shaped clusters: merge density attractors that are connected through paths of high density (> threshold)

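One way to picture a density attractor is as the end point of a hill climb on the overall density function. The sketch below assumes a Gaussian influence function, a fixed step length along the normalized gradient, and a stop rule that ends the climb as soon as the density would decrease; the function names, the `step` and `max_iter` parameters, and the sample data are all illustrative assumptions, and the grid-cell and tree structures of the slide are left out.

```python
from math import dist, exp, sqrt

def density(x, data, sigma):
    """Overall density at x: sum of Gaussian influences of all data points."""
    return sum(exp(-dist(x, xi) ** 2 / (2 * sigma ** 2)) for xi in data)

def gradient(x, data, sigma):
    """Sum over all data points of (x_i - x) * Gaussian influence of x_i on x."""
    grad = [0.0] * len(x)
    for xi in data:
        w = exp(-dist(x, xi) ** 2 / (2 * sigma ** 2))
        for k in range(len(x)):
            grad[k] += (xi[k] - x[k]) * w
    return grad

def find_attractor(x, data, sigma, step=0.1, max_iter=1000):
    """Climb the density surface from x; return the last position before density drops."""
    x = tuple(x)
    for _ in range(max_iter):
        g = gradient(x, data, sigma)
        norm = sqrt(sum(c * c for c in g))
        if norm == 0.0:
            break  # flat spot: nothing to follow
        nxt = tuple(xk + step * gk / norm for xk, gk in zip(x, g))
        if density(nxt, data, sigma) < density(x, data, sigma):
            break  # stepped past the local maximum
        x = nxt
    return x

# Hypothetical data: two tight groups of four points each.
data = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (0.2, 0.2),
        (5.0, 5.0), (5.2, 5.0), (5.0, 5.2), (5.2, 5.2)]

print(find_attractor((1.0, 1.0), data, sigma=0.5))
print(find_attractor((4.0, 4.5), data, sigma=0.5))
```

Starting points that climb to the same attractor would belong to the same center-defined cluster; here the two starts end near the centers of different groups. The result is only accurate to roughly one step length, so a smaller `step` trades speed for precision.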
Density Attractor
