Ox = (Ax + Bx + Cx)/3        Oy = (Ay + By + Cy)/3

Given A (4, 5), B (20, 25), C (30, 6):

Ox = (4 + 20 + 30)/3 = 54/3 = 18
Oy = (5 + 25 + 6)/3 = 36/3 = 12

Centroid coordinate: (18, 12) (this arithmetic is checked in the first code sketch at the end of this section).

Hierarchical Clustering in Data Mining

A hierarchical clustering method works by grouping data into a tree of clusters. Hierarchical clustering begins by treating every data point as a separate cluster. Then it repeatedly executes the following steps:

1. Identify the two clusters that are closest together, and
2. Merge those two most similar clusters.

These steps are repeated until all the clusters are merged together. The aim of hierarchical clustering is to produce a hierarchical series of nested clusters. This hierarchy is represented graphically by a dendrogram, a tree-like diagram that records the sequence of merges or splits. It is an inverted tree that describes the order in which points are merged (bottom-up view) or clusters are split (top-down view).

The two basic methods for generating a hierarchical clustering are:

1. Agglomerative: Initially consider every data point an individual cluster, and at every step merge the nearest pair of clusters (a bottom-up method). At first, every data point is an individual entity or cluster; at every iteration, clusters merge with other clusters until a single cluster remains. The algorithm for agglomerative hierarchical clustering is:

   1. Consider every data point an individual cluster.
   2. Calculate the similarity of each cluster with all the other clusters (the proximity matrix).
   3. Merge the clusters that are most similar or closest to each other.
   4. Recalculate the proximity matrix for each cluster.
   5. Repeat steps 3 and 4 until only a single cluster remains.

Let's look at a graphical representation of this algorithm using a dendrogram (see the SciPy sketch at the end of this section). Note: this is just a demonstration of how the algorithm works; no calculation has been performed, and all the proximities among the clusters are assumed. Say we have six data points A, B, C, D, E, F.

[Figure: Agglomerative hierarchical clustering — B and C merge first, then D and E; DE merges with F into DEF, BC merges with DEF into BCDEF, and finally A joins to give ABCDEF (steps 1-5, bottom-up).]

2. Divisive: Divisive hierarchical clustering is precisely the opposite of agglomerative hierarchical clustering. In divisive hierarchical clustering, we start with all of the data points in a single cluster, and in every iteration we separate out the data points that are least similar to the rest. In the end, we are left with N clusters, one per data point (a top-down method).

[Figure: Divisive hierarchical clustering — the same dendrogram read top-down: ABCDEF splits step by step back into the individual points A, B, C, D, E, F (steps 1-5).]
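As a quick check of the centroid arithmetic above, here is a minimal Python sketch; the points are the A, B, C from the worked example:

```python
import numpy as np

# Points from the worked example above: A(4, 5), B(20, 25), C(30, 6).
points = np.array([[4, 5], [20, 25], [30, 6]])

# The centroid is the per-coordinate mean: ((Ax+Bx+Cx)/3, (Ay+By+Cy)/3).
centroid = points.mean(axis=0)
print(centroid)  # -> [18. 12.]
```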
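Next, a minimal sketch of agglomerative clustering with SciPy, producing a dendrogram like the one in the figure. The coordinates for the six points A-F are assumed for illustration (the text names the points but gives no coordinates), so the exact merge order may differ from the figure:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

# Assumed coordinates; only the labels A-F come from the text.
points = np.array([
    [1.0, 1.0],   # A
    [2.0, 1.0],   # B
    [2.2, 1.3],   # C  (close to B, so B and C merge early)
    [6.0, 6.0],   # D
    [6.2, 6.3],   # E  (close to D, so D and E merge early)
    [7.0, 6.5],   # F
])
labels = ["A", "B", "C", "D", "E", "F"]

# linkage() performs the agglomerative steps: each row of Z records one
# merge (the two clusters joined, their distance, and the new size).
Z = linkage(points, method="single")  # single link = closest-pair distance

dendrogram(Z, labels=labels)
plt.title("Agglomerative hierarchical clustering")
plt.show()
```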
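The text does not give an algorithm for the divisive approach; one common way to implement it (an assumption here, not taken from the text) is bisecting: repeatedly split the current cluster in two, for example with 2-means. A rough sketch:

```python
import numpy as np
from sklearn.cluster import KMeans

def divisive_split(points, depth):
    """Top-down clustering: split each cluster in two, `depth` times."""
    if depth == 0 or len(points) < 2:
        return [points]
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
    left = points[km.labels_ == 0]
    right = points[km.labels_ == 1]
    return divisive_split(left, depth - 1) + divisive_split(right, depth - 1)

data = np.array([[1, 1], [2, 1], [2.2, 1.3],
                 [6, 6], [6.2, 6.3], [7, 6.5]])

# One split separates {A, B, C} from {D, E, F}; recursing further would
# eventually leave every point in its own cluster (N clusters).
for i, cluster in enumerate(divisive_split(data, depth=1)):
    print(f"cluster {i}: {cluster.tolist()}")
```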
