Module-5 Clustering Algorithms
Fig: Cluster Visualization (scatter plot of Values vs. Samples, showing the sample points grouped into clusters)
Visual identification of clusters in the previous example is easy because it has only a few features. But when examples have many features, say 100, clustering cannot be done manually, and automatic clustering algorithms are required.
Every cluster is represented by its centroid. For example, if the input data points are (3, 3), (2, 6) and (7, 9), then the centroid is given as:
Centroid = ((3 + 2 + 7)/3, (3 + 6 + 9)/3) = (4, 6)
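The centroid computation above (the per-coordinate mean of the cluster's points) can be sketched as follows, using the points from the text:

```python
# Centroid of a cluster = per-coordinate mean of its points.
def centroid(points):
    n = len(points)
    # zip(*points) groups the x-coordinates and y-coordinates together
    return tuple(sum(coords) / n for coords in zip(*points))

print(centroid([(3, 3), (2, 6), (7, 9)]))  # (4.0, 6.0)
```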
The clusters should not overlap, and every cluster should represent only one class. Therefore, clustering algorithms use a trial-and-error method to form clusters that can be converted to labels.
Attribute matching table for two instances X and Y with binary attributes (a, b, c and d count the four attribute combinations):

            Y = 0    Y = 1
    X = 0     a        b
    X = 1     c        d
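Assuming the table counts attribute combinations of two binary vectors, a similarity such as the simple matching coefficient, (a + d) / (a + b + c + d), can be computed from it. A minimal sketch (function names are illustrative):

```python
# Counts a, b, c, d of attribute combinations for two binary vectors.
def matching_counts(x, y):
    a = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)
    b = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    c = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    d = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
    return a, b, c, d

# Simple matching coefficient: fraction of attributes that agree.
def simple_matching(x, y):
    a, b, c, d = matching_counts(x, y)
    return (a + d) / (a + b + c + d)

print(simple_matching([0, 1, 1, 0], [0, 1, 0, 0]))  # 0.75
```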
Step 3: Compute the mean of the initial clusters and assign each remaining sample to the closest cluster, based on the Euclidean distance (or any other distance measure) between the instance and the centroid of the cluster.
Cluster 1: (4, 6), (2, 4), (6, 8) with centroid (4, 6)
Cluster 2: (10, 4), (12, 4) with centroid (11, 4)
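The assignment step above can be sketched as follows, using the points and centroids from the example (assigning each point to its nearest centroid by Euclidean distance):

```python
import math

# Euclidean distance between two points of equal dimension.
def euclidean(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

# Assign every point to the cluster of its nearest centroid.
def assign(points, centroids):
    clusters = {c: [] for c in centroids}
    for p in points:
        nearest = min(centroids, key=lambda c: euclidean(p, c))
        clusters[nearest].append(p)
    return clusters

points = [(4, 6), (2, 4), (6, 8), (10, 4), (12, 4)]
print(assign(points, [(4, 6), (11, 4)]))
# {(4, 6): [(4, 6), (2, 4), (6, 8)], (11, 4): [(10, 4), (12, 4)]}
```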
Step 1: Define a set of grid cells and assign the given data points to the grid.
Step 2: Determine the dense and sparse cells. If the number of points in a cell exceeds the threshold value T, the cell is categorized as dense; sparse cells are removed from the list.
Step 3: Merge the dense cells if they are adjacent.
Step 4: Form a list of grid cells for every subspace as output.
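The four grid-based steps above can be sketched for 2-D points as follows. The function name, cell size and threshold T are illustrative assumptions, not from the source:

```python
from collections import defaultdict

def grid_clusters(points, cell_size=1.0, T=2):
    # Step 1: assign points to grid cells
    cells = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y))
    # Step 2: keep only dense cells (at least T points); sparse cells dropped
    dense = {c for c, pts in cells.items() if len(pts) >= T}
    # Step 3: merge adjacent dense cells (flood fill over 8-neighbourhoods)
    clusters, seen = [], set()
    for cell in dense:
        if cell in seen:
            continue
        stack, group = [cell], []
        while stack:
            c = stack.pop()
            if c in seen:
                continue
            seen.add(c)
            group.append(c)
            cx, cy = c
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    n = (cx + dx, cy + dy)
                    if n in dense and n not in seen:
                        stack.append(n)
        clusters.append(group)
    # Step 4: return the list of merged grid-cell groups
    return clusters
```

For example, grid_clusters([(0.1, 0.2), (0.5, 0.5), (5.1, 5.2), (5.4, 5.6)], T=2) yields two clusters, since the two dense cells are not adjacent.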
Algorithm: CLIQUE (CLustering In QUEst)
This algorithm works in two stages, as given below:
Stage 1:
Step 1: Identify the dense cells.
Step 2: Merge dense cells C1 and C2 if they share the same interval.
Step 3: Apply the Apriori-style rule to generate candidate (k+1)-dimensional cells from the k-dimensional dense cells, then check whether the number of points crosses the threshold. This step is repeated until no new dense cells are generated.
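The Apriori-style candidate generation in Step 3 can be sketched as follows. The representation of a unit as a frozenset of (dimension, interval) pairs is an illustrative assumption; the subsequent point-count pruning is left to the caller:

```python
from itertools import combinations

def generate_candidates(dense_k):
    # dense_k: set of frozensets, each holding k (dimension, interval) pairs
    # describing a k-dimensional dense unit.
    k = len(next(iter(dense_k)))
    candidates = set()
    for u, v in combinations(dense_k, 2):
        merged = u | v
        # Two k-dimensional units form a (k+1)-dimensional candidate only
        # when they differ in exactly one (dimension, interval) pair.
        if len(merged) == k + 1:
            candidates.add(merged)
    return candidates

dense_1 = {frozenset({('x', 0)}), frozenset({('y', 1)})}
print(generate_candidates(dense_1))
# {frozenset({('x', 0), ('y', 1)})}
```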
Stage 2: