SHEETAL CHABUKSWAR (Assistant professor)
CLUSTER ANALYSIS
Cluster analysis is a more primitive technique in that no assumptions are
made concerning the number of groups or the group structure. Grouping is done
on the basis of similarity or distance (dissimilarity). The required inputs are
similarities, which can be computed as distances or similarity coefficients for
pairs of items.
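As a sketch of how such a distance input might be computed, the Euclidean distance between two items measured on p variables can be written as follows (the item values are hypothetical, for illustration only):

```python
import math

def euclidean_distance(x, y):
    """Euclidean distance between two items measured on p variables."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

# Two hypothetical items, each measured on three variables.
item_a = [1.0, 2.0, 3.0]
item_b = [4.0, 6.0, 3.0]
print(euclidean_distance(item_a, item_b))  # 5.0
```

Computing this distance for every pair of items gives the symmetric distance matrix that the clustering methods below take as input.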
Suppose five individuals have the following characteristics: height, weight,
eye color, hair color, handedness, and gender.
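For qualitative characteristics such as these, one common similarity measure is the simple matching coefficient: the proportion of attributes on which two individuals agree. A minimal sketch, with entirely hypothetical individuals:

```python
def matching_coefficient(a, b):
    """Proportion of attributes on which two individuals agree."""
    matches = sum(1 for ai, bi in zip(a, b) if ai == bi)
    return matches / len(a)

# Hypothetical individuals described by the six characteristics above
# (height, weight, eye color, hair color, handedness, gender).
p1 = ("tall", 70, "brown", "black", "right", "F")
p2 = ("tall", 65, "brown", "brown", "right", "F")
print(matching_coefficient(p1, p2))  # agrees on 4 of 6 attributes
```

In practice continuous attributes such as height and weight would first be categorized (e.g. tall/short) before such a coefficient is applied.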
Similarity and association measures can likewise be defined for pairs of variables.
Hierarchical Clustering
The goal is to find "reasonable" clusters without having to examine all
configurations. Hierarchical clustering techniques proceed by either a series
of successive mergers or a series of successive divisions.
There are two types of hierarchical methods: agglomerative hierarchical
methods and divisive hierarchical methods.
Agglomerative methods start with the individual objects, so initially
there are as many clusters as objects. The most similar objects are grouped
first, and these initial groups are then merged according to their
similarities. As the similarity decreases, all subgroups are eventually merged
into a single cluster.
Divisive methods work in the opposite direction. An initial single group
of objects is divided into two subgroups such that the objects in one subgroup
are far from the objects in the other. These subgroups are then further divided
into dissimilar subgroups; the process continues until there are as many
subgroups as objects, i.e., until each object forms its own group.
The results of both agglomerative and divisive methods may be displayed
in a two-dimensional diagram known as a dendrogram.
Steps in the agglomerative method (for items or variables):
1) Start with N clusters, each containing a single entity, and an N × N
symmetric matrix of distances (or similarities) D = {dik}.
2) Search the distance matrix for the nearest (most similar) pair of clusters.
Let the distance between the most similar clusters U and V be dUV.
3) Merge clusters U and V, and label the newly formed cluster (UV). Update the
entries in the distance matrix by
i) deleting the rows and columns corresponding to clusters U and V, and
ii) adding a row and column giving the distances between cluster (UV)
and the remaining clusters.
4) Repeat steps 2 and 3 a total of (N − 1) times (all objects will be in a
single cluster at the termination of the algorithm). After the algorithm
terminates, record the identities of the clusters that are merged and the
levels (distances or similarities) at which the mergers take place.
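The steps above can be sketched in Python. The representation of clusters as sets of labels and the `linkage` parameter are illustrative choices, not part of the original notes; for the min and max rules, recomputing the cluster distance over all cross-pairs of members is equivalent to the matrix update in step 3 ii).

```python
def agglomerate(items, d, linkage=min):
    """Agglomerative hierarchical clustering (steps 1-4 above).

    items:   list of object labels (step 1: N singleton clusters).
    d:       dict mapping frozenset pairs of labels to distances.
    linkage: min gives single linkage, max gives complete linkage.
    Returns the merge history as (cluster U, cluster V, level) tuples.
    """
    clusters = [frozenset([i]) for i in items]

    def cdist(u, v):
        # Distance between two clusters under the chosen linkage rule.
        return linkage(d[frozenset([a, b])] for a in u for b in v)

    history = []
    while len(clusters) > 1:
        # Step 2: find the nearest (most similar) pair of clusters.
        u, v = min(((u, v) for i, u in enumerate(clusters)
                    for v in clusters[i + 1:]),
                   key=lambda pair: cdist(*pair))
        level = cdist(u, v)
        history.append((set(u), set(v), level))
        # Step 3: delete U and V, add the merged cluster (UV).
        clusters = [c for c in clusters if c not in (u, v)] + [u | v]
    return history  # step 4: N - 1 mergers recorded with their levels
```

For example, with three objects whose pairwise distances are d(A,B) = 2, d(A,C) = 6, d(B,C) = 3, single linkage merges A and B at level 2, then joins C at level 3.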
Single linkage
Initially, we find the smallest distance in D = {dik} and merge the
corresponding objects, say U and V, to get the cluster (UV). For step 3 of the
general algorithm, the distance between (UV) and any other cluster W is
computed by
d(UV)W = min {dUW, dVW}
Here the quantities dUW and dVW are the distances between the nearest neighbors
of clusters U and W and of clusters V and W, respectively.
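The single-linkage update rule can be checked directly; the distance values below are hypothetical:

```python
# Single-linkage update: the distance from the merged cluster (UV) to
# another cluster W is the smaller of d(U,W) and d(V,W).
d_uw, d_vw = 5.0, 9.0          # hypothetical distances
d_uv_w = min(d_uw, d_vw)
print(d_uv_w)  # 5.0
```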
1) Example: Consider the distance matrix for five objects.
Single-linkage dendrogram for the distances between the five objects.
Complete linkage method
Complete-linkage dendrogram for the distances between the five objects.
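Complete linkage proceeds in the same way as single linkage, except that the distance from a merged cluster to another cluster is the distance between their farthest members: d(UV)W = max {dUW, dVW}. With hypothetical distances:

```python
# Complete-linkage update: the distance from the merged cluster (UV) to
# another cluster W is the larger of d(U,W) and d(V,W), so two clusters
# join only when all their members are mutually close.
d_uw, d_vw = 5.0, 9.0          # hypothetical distances
d_uv_w = max(d_uw, d_vw)
print(d_uv_w)  # 9.0
```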
Suppose we measure two variables X1 and X2 for each of four items A, B, C, and
D. The data are given below.
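The data table is not reproduced in these notes, so the sketch below uses hypothetical (X1, X2) values purely to show how the 4 × 4 Euclidean distance matrix for the four items would be computed:

```python
import math

# Hypothetical (X1, X2) measurements for illustration only; the
# original data table is not reproduced here.
data = {"A": (5, 3), "B": (-1, 1), "C": (1, -2), "D": (-3, -2)}

labels = sorted(data)
# Pairwise Euclidean distances between the four items.
dist = {(i, j): math.dist(data[i], data[j])
        for i in labels for j in labels}

for i in labels:
    print(i, [round(dist[(i, j)], 2) for j in labels])
```

The resulting symmetric matrix (with zeros on the diagonal) is exactly the input D = {dik} required by step 1 of the agglomerative algorithm.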