Cluster Analysis Unit 4.

Cluster analysis is an unsupervised machine learning technique that groups similar objects together without predefined labels. It works by computing the distances or similarities between objects to perform hierarchical clustering. Hierarchical clustering builds clusters iteratively by either merging the most similar clusters in an agglomerative approach or splitting the least similar clusters in a divisive approach. The results can be visualized in a dendrogram to show how the clusters are merged or split as the similarity threshold changes.


SHEETAL CHABUKSWAR (Assistant Professor)

CLUSTER ANALYSIS

Cluster analysis is a more primitive technique in that no assumptions are
made concerning the number of groups or the group structure. Grouping is done
on the basis of similarity or distance (dissimilarity). The required inputs are
similarities or distances, which can be computed as distance measures or
similarity coefficients for pairs of items.
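As a minimal sketch of this input, pairwise Euclidean distances between items can be arranged into a symmetric distance matrix (the three 2-D items below are hypothetical, not from the notes):

```python
import numpy as np

def distance_matrix(items):
    """Symmetric matrix D = (d_ik) of Euclidean distances
    between every pair of items (each item a feature vector)."""
    X = np.asarray(items, dtype=float)
    n = len(X)
    D = np.zeros((n, n))
    for i in range(n):
        for k in range(i + 1, n):
            # distance is symmetric, so fill both entries at once
            D[i, k] = D[k, i] = np.linalg.norm(X[i] - X[k])
    return D

D = distance_matrix([[0, 0], [3, 4], [6, 0]])
print(D)  # zero diagonal, d_12 = 5, d_13 = 6
```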

Suppose five individuals have the following characteristics: height, weight,
eye color, hair color, handedness, and gender.
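Characteristics like these are naturally coded as binary traits, and a simple similarity coefficient is the proportion of traits on which two individuals match. A sketch, assuming a hypothetical 0/1 coding of the six characteristics:

```python
def matching_coefficient(a, b):
    """Proportion of characteristics on which two individuals agree."""
    assert len(a) == len(b)
    return sum(ai == bi for ai, bi in zip(a, b)) / len(a)

# hypothetical 0/1 coding of the six characteristics:
# [tall, heavy, brown eyes, dark hair, right-handed, female]
person_1 = [1, 0, 1, 1, 1, 0]
person_2 = [1, 1, 1, 0, 1, 0]
print(matching_coefficient(person_1, person_2))  # 4 of 6 traits match
```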

Similarity and association measures for pairs of variables.



Hierarchical Clustering
The aim is to find "reasonable" clusters without having to examine all
possible configurations. Hierarchical clustering techniques proceed by either a
series of successive mergers or a series of successive divisions.
There are two types of hierarchical methods: agglomerative hierarchical
methods and divisive hierarchical methods.
Agglomerative methods start with the individual objects, so that initially
there are as many clusters as objects. The most similar objects are grouped
first, and these initial groups are then merged according to their similarities.
As the similarity threshold decreases, all subgroups are eventually merged into
a single cluster.
Divisive methods work in the opposite direction. An initial single group
of objects is divided into two subgroups such that the objects in one subgroup
are far from the objects in the other. These subgroups are further divided into
dissimilar subgroups; the process continues until there are as many subgroups as
objects, i.e. until each object forms its own group.
The results of both agglomerative and divisive methods may be displayed
in a dendrogram.
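As a sketch of how this looks in practice, SciPy's `scipy.cluster.hierarchy` module performs the successive agglomerative mergers and can draw the resulting dendrogram (the five 2-D points below are hypothetical):

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import pdist

# hypothetical 2-D measurements for five objects
X = np.array([[1.0, 1.0], [1.5, 2.0], [5.0, 5.0], [5.5, 4.5], [9.0, 1.0]])

d = pdist(X)                      # condensed vector of pairwise distances
Z = linkage(d, method="single")   # successive mergers, nearest-neighbour rule
print(Z)  # each row: the two clusters merged, the merge level, cluster size

# dendrogram(Z)  # draws the tree if matplotlib is available
```

With five objects there are exactly four mergers, one per row of `Z`.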

Steps in the agglomerative method (items and variables)

1) Start with N clusters, each containing a single entity, and an N × N
symmetric matrix of distances (or similarities) D = {dik}.
2) Search the distance matrix for the nearest (most similar) pair of clusters;
let the distance between the most similar clusters U and V be dUV.
3) Merge clusters U and V, and label the newly formed cluster (UV). Update the
entries in the distance matrix by
i) deleting the rows and columns corresponding to clusters U and V;
ii) adding a row and column giving the distances between cluster (UV)
and the remaining clusters.
4) Repeat steps 2 and 3 a total of (N − 1) times (all objects will be in a single
cluster at the termination of the algorithm). After the algorithm terminates,
record the identities of the clusters that are merged and the levels (distances
or similarities) at which the mergers take place.
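The four steps above can be sketched directly in code. This is a generic, illustrative implementation (the names and the example matrix are not from the notes); passing `min` or `max` as the linkage rule gives single or complete linkage respectively:

```python
def agglomerate(D, labels, link=min):
    """Agglomerative clustering on a full symmetric distance matrix D
    (list of lists). `link` combines distances when clusters merge:
    min gives single linkage, max gives complete linkage. Returns the
    merge history as (cluster_u, cluster_v, merge_distance) tuples."""
    D = [row[:] for row in D]                 # step 1: work on a copy
    clusters = [(lab,) for lab in labels]
    history = []
    while len(clusters) > 1:
        # step 2: find the nearest (most similar) pair of clusters
        u, v = min(
            ((i, k) for i in range(len(clusters)) for k in range(i + 1, len(clusters))),
            key=lambda p: D[p[0]][p[1]],
        )
        history.append((clusters[u], clusters[v], D[u][v]))
        # step 3: merge U and V into (UV) and update the distance matrix
        merged = clusters[u] + clusters[v]
        keep = [k for k in range(len(clusters)) if k not in (u, v)]
        new_row = [link(D[u][k], D[v][k]) for k in keep]  # distances to (UV)
        D = [[D[i][k] for k in keep] for i in keep]       # delete rows/cols U, V
        for i, dk in enumerate(new_row):                  # add row/col for (UV)
            D[i].append(dk)
        D.append(new_row + [0])
        clusters = [clusters[k] for k in keep] + [merged]
    return history                            # step 4 ran a total of N-1 times

merges = agglomerate(
    [[0, 2, 6, 10],
     [2, 0, 5, 9],
     [6, 5, 0, 4],
     [10, 9, 4, 0]],
    ["A", "B", "C", "D"],
)
print(merges)  # A and B merge at 2, then C and D at 4, then all four at 5
```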

Single linkage
At each step we find the smallest distance in D = {dik} and merge the
corresponding objects, say U and V, to get the cluster (UV). For step 3 of the
general algorithm, the distance between (UV) and any other cluster W is
computed by
d(UV)W = min {dUW, dVW}
Here the quantities dUW and dVW are the distances between the nearest neighbors
of clusters U and W, and of clusters V and W, respectively.
1) Consider the distance matrix for five objects.
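The matrix itself appears only as a figure in the source, so here is a sketch of the first single-linkage step on a hypothetical 5 × 5 distance matrix:

```python
# hypothetical symmetric distance matrix for five objects, stored as
# upper-triangle entries {(i, k): d_ik}
D = {
    (1, 2): 9, (1, 3): 3, (1, 4): 6, (1, 5): 11,
    (2, 3): 7, (2, 4): 5, (2, 5): 10,
    (3, 4): 9, (3, 5): 2,
    (4, 5): 8,
}

def d(i, k):
    """Look up d_ik regardless of the order of i and k."""
    return D[tuple(sorted((i, k)))]

# step 2: the nearest pair is merged first
u, v = min(D, key=D.get)
print((u, v), d(u, v))            # objects 3 and 5 merge at distance 2

# step 3, single-linkage rule: d_(UV)W = min(d_UW, d_VW)
for w in (1, 2, 4):
    print(w, min(d(u, w), d(v, w)))
```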

Single linkage dendrogram for distances between five objects

Complete linkage method
Complete linkage proceeds in much the same way as single linkage, with one
important difference: at each stage, the distance between clusters is determined
by the two elements, one from each cluster, that are farthest apart. For step 3
of the general algorithm,
d(UV)W = max {dUW, dVW}



Complete linkage dendrogram for distances between five objects
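A short sketch contrasting the two update rules on the same (hypothetical) pair of distances:

```python
def single_link(d_uw, d_vw):
    # nearest neighbours decide the merge level: d_(UV)W = min(d_UW, d_VW)
    return min(d_uw, d_vw)

def complete_link(d_uw, d_vw):
    # farthest members decide the merge level: d_(UV)W = max(d_UW, d_VW)
    return max(d_uw, d_vw)

print(single_link(3, 11), complete_link(3, 11))  # 3 versus 11
```

Because complete linkage uses the farthest members, it tends to produce compact clusters and merges at higher levels than single linkage on the same data.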



Suppose we measure two variables X1 and X2 for each of four items A, B, C, and
D. The data are given below.
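The data table is not reproduced in the source; with hypothetical (X1, X2) values, the pairwise Euclidean distances for the four items could be computed as:

```python
import math

# hypothetical (X1, X2) measurements for the four items
data = {"A": (5, 3), "B": (-1, 1), "C": (1, -2), "D": (-3, -2)}

names = list(data)
for i, p in enumerate(names):
    for q in names[i + 1:]:
        # Euclidean distance between items p and q
        print(f"d({p},{q}) = {math.dist(data[p], data[q]):.2f}")
```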
