Clustering
Single Prototypes
In this case, the patterns of each class tend to
cluster tightly about a typical or
representative pattern for that class. Under
these conditions, minimum-distance classifiers
can constitute a very effective approach to the
classification problem.
Contd..
Consider M pattern classes and assume that these classes are representable by prototype patterns Z_1, Z_2, ..., Z_M. The Euclidean distance between an arbitrary pattern X and the ith prototype is

D_i = ||X − Z_i|| = [(X − Z_i)^t (X − Z_i)]^{1/2} ………………….(1)
Contd..
A minimum-distance classifier computes the
distance from a pattern x of unknown
classification to the prototype of each class,
and assigns the pattern to the class to which it
is closest.
Contd..
In other words, X is assigned to class w_i if D_i < D_j for all j ≠ i. Ties are resolved arbitrarily.
Squaring both sides gives

D_i² = ||X − Z_i||² = (X − Z_i)^t (X − Z_i)
     = X^t X − 2 X^t Z_i + Z_i^t Z_i
     = X^t X − 2 (X^t Z_i − (1/2) Z_i^t Z_i)
Contd..
Choosing the minimum D_i² is equivalent to choosing the minimum D_i, since all distances are positive. Moreover, since the term X^t X is independent of i, choosing the minimum D_i² is equivalent to choosing the maximum (X^t Z_i − (1/2) Z_i^t Z_i). Consequently, we may define the decision functions:
Contd..
d_i(X) = X^t Z_i − (1/2) Z_i^t Z_i,  i = 1, 2, ..., M ……….(2)
where X is assigned to class w_i if d_i(X) > d_j(X) for all j ≠ i.
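The single-prototype rule can be sketched directly from Eq. (2). A minimal illustration follows; the prototype vectors and test patterns below are my own examples, not from the text.

```python
import numpy as np

def classify(X, prototypes):
    """Minimum-distance classifier via the linear decision functions
    d_i(X) = X'Z_i - (1/2) Z_i'Z_i; returns the index of the maximum d_i."""
    X = np.asarray(X, dtype=float)
    scores = [X @ Z - 0.5 * (Z @ Z) for Z in map(np.asarray, prototypes)]
    return int(np.argmax(scores))

# Illustrative prototypes Z_1 and Z_2 (class indices 0 and 1).
prototypes = [np.array([0.0, 0.0]), np.array([4.0, 4.0])]
print(classify([1.0, 0.5], prototypes))  # nearest prototype is Z_1 -> 0
print(classify([3.5, 4.2], prototypes))  # nearest prototype is Z_2 -> 1
```

Note that maximizing d_i(X) gives the same assignment as minimizing the Euclidean distance, without computing any square roots.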
When each class w_i is instead represented by multiple prototypes Z_i^l, l = 1, 2, ..., N_i, the distance from X to class w_i is

D_i = min_l ||X − Z_i^l||,  l = 1, 2, ..., N_i
Contd..
Following the development for the single prototype, the decision functions are

d_i(X) = max_l { X^t Z_i^l − (1/2) (Z_i^l)^t Z_i^l },  l = 1, 2, ..., N_i

where, as before, X is placed in class w_i if d_i(X) > d_j(X) for all j ≠ i.
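The multiprototype case only changes the inner step: take the maximum of the linear decision function over each class's prototype set. A short sketch, with made-up prototype sets:

```python
import numpy as np

def classify_multi(X, prototype_sets):
    """d_i(X) = max over prototypes Z_i^l of X'Z - (1/2) Z'Z;
    assign X to the class with the largest d_i."""
    X = np.asarray(X, dtype=float)
    d = [max(X @ Z - 0.5 * (Z @ Z) for Z in map(np.asarray, Zs))
         for Zs in prototype_sets]
    return int(np.argmax(d))

# Class 0 has two prototypes (near the origin and near (0, 5));
# class 1 has one prototype near (5, 0). Values are illustrative.
sets = [[[0.0, 0.0], [0.0, 5.0]], [[5.0, 0.0]]]
print(classify_multi([0.2, 4.6], sets))  # nearest prototype is (0, 5) -> 0
```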
A similarity measure between two patterns X and Z is

S(X, Z) = X^t Z / (X^t X + Z^t Z − X^t Z)
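This measure is straightforward to evaluate; a small sketch follows, with sample vectors of my own choosing.

```python
import numpy as np

def similarity(X, Z):
    """S(X, Z) = X'Z / (X'X + Z'Z - X'Z)."""
    X, Z = np.asarray(X, dtype=float), np.asarray(Z, dtype=float)
    xz = X @ Z
    return xz / (X @ X + Z @ Z - xz)

# For binary vectors this counts shared 1s over the union of 1s.
print(similarity([1, 1, 0, 1], [1, 0, 0, 1]))  # -> 0.666...
```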
Clustering Criteria
After a measure of pattern similarity has
been adopted, we are still faced with the
problem of specifying a procedure for
partitioning the given data into cluster
domains.
Contd..
The clustering criterion used may represent a
heuristic scheme, or it may be based on the
minimization (or maximization) of a certain
performance index. The heuristic approach is
guided by intuition and experience.
Contd..
A widely used performance index is the sum of squared errors,

J = Σ_{j=1}^{K} Σ_{X ∈ C_j} ||X − z_j||²

where z_j = (1/#C_j) Σ_{X ∈ C_j} X and #C_j represents the number of points in C_j.
Contd..
The number of ways of partitioning n patterns into K nonempty clusters is

S(n, K) = (1/K!) Σ_{j=1}^{K} (−1)^{K−j} (K choose j) j^n
This clearly indicates that exhaustive enumeration cannot lead to the required solution for most practical problems in reasonable computation time. For example, for the crude-oil data, the exact solution requires the examination of S(56, 3) ≈ 10^25 partitions.
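The formula above is easy to evaluate exactly with integer arithmetic, which also confirms the order of magnitude quoted for the crude-oil data:

```python
from math import comb, factorial

def stirling2(n, k):
    """S(n, k) = (1/k!) * sum_{j=1}^{k} (-1)^(k-j) * C(k, j) * j^n:
    the number of partitions of n patterns into k nonempty clusters."""
    return sum((-1) ** (k - j) * comb(k, j) * j ** n
               for j in range(1, k + 1)) // factorial(k)

print(stirling2(4, 2))   # 7 ways to split 4 patterns into 2 clusters
print(stirling2(56, 3))  # ~8.7e25: hopeless for exhaustive enumeration
```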
Contd..
Thus, approximate heuristic techniques seeking a compromise or looking for an acceptable solution have usually been adopted. One such method is Forgy's K-means algorithm.
Algorithm K-means
Step 1 : Select an initial cluster configuration.
Repeat
  Step 2 : Calculate the cluster centers z_j, j = 1, 2, ..., K of the existing groups.
  Step 3 : Redistribute the patterns among the clusters utilizing the minimum squared Euclidean distance classifier concept:

    x_i ∈ C_j if ||x_i − z_j||² ≤ ||x_i − z_l||² for all l ∈ {1, 2, ..., K}, l ≠ j

Until no pattern changes its cluster membership.
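The three steps above can be sketched in a few lines. This is a minimal illustration of the alternating center-update / reassignment loop; the toy data is my own, not the crude-oil data mentioned earlier.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Forgy-style K-means: initial centers are k random patterns (Step 1),
    then alternate reassignment (Step 3) and center update (Step 2)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # Step 1
    for _ in range(iters):
        # Step 3: assign each x_i to its nearest center (squared distance)
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Step 2: each center becomes the mean of its cluster
        # (assumes no cluster goes empty, which holds for this toy data)
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return centers, labels

X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
centers, labels = kmeans(X, 2)
print(labels)  # the two tight pairs land in separate clusters
```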
Step 6. Compute the overall average distance of the samples from their respective cluster centers:

D̄ = (1/N) Σ_{j=1}^{N_C} N_j D̄_j
Contd..
Step 7.
(a) If this is the last iteration, set θ_C = 0 and go to Step 11.
(b) If N_C ≤ K/2, go to Step 8.
(c) If this is an even-numbered iteration, or if N_C ≥ 2K, go to Step 11; otherwise, continue.
Contd..
Step 8. Find the standard deviation vector σ_j = (σ_1j, σ_2j, ..., σ_nj)^t for each sample subset, using the relation

σ_ij = [ (1/N_j) Σ_{X_k ∈ S_j} (x_ik − z_ij)² ]^{1/2},  i = 1, 2, ..., n; j = 1, 2, ..., N_C
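The relation in Step 8 is just a per-coordinate standard deviation about the cluster center; a quick sketch, with an illustrative cluster of my own:

```python
import numpy as np

def std_vector(S_j):
    """sigma_ij = sqrt( (1/N_j) * sum over X in S_j of (x_ik - z_ij)^2 ),
    computed for all coordinates i at once."""
    S_j = np.asarray(S_j, dtype=float)
    z_j = S_j.mean(axis=0)                       # cluster center z_j
    return np.sqrt(((S_j - z_j) ** 2).mean(axis=0))

S1 = [[4.0, 3.0], [5.0, 4.0], [6.0, 5.0]]
print(std_vector(S1))  # per-coordinate spread about the center (5, 4)
```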
Step 1. K = 2, θ_N = 1, θ_S = 1, θ_C = 4, L = 0, I = 4.
Contd..
If no a priori information on the data being
analyzed is available, these parameters are
arbitrarily chosen and then adjusted during
successive iterations through the algorithm.
Step 2. Since there is only one cluster center, S_1 = {X_1, X_2, ..., X_8} and N_1 = 8.
Step 5. Compute D̄_j:

D̄_1 = (1/N_1) Σ_{X ∈ S_1} ||X − Z_1|| = 2.26
Contd..
Step 6. Compute D̄; in this case, D̄ = D̄_1 = 2.26.
Step 7. Since N_C ≤ K/2, go to Step 8.
Splitting Z_1 gives the two cluster centers

Z_1^+ = (4.38, 2.75)^t,  Z_1^− = (2.38, 2.75)^t
Contd..
For convenience these two cluster centers are renamed Z_1 and Z_2, respectively. Also, N_C is increased by 1. Go to Step 2.
Step 2. The sample sets are now S_1 = {X_4, X_5, X_6, X_7, X_8}, S_2 = {X_1, X_2, X_3}, and N_1 = 5, N_2 = 3.
Step 3. Since both N_1 and N_2 are greater than θ_N, no subsets are discarded.
Contd..
Step 4. Update the cluster centers:

Z_1 = (1/N_1) Σ_{X ∈ S_1} X = (4.80, 3.80)^t,  Z_2 = (1/N_2) Σ_{X ∈ S_2} X = (1.00, 1.00)^t

Step 5. Compute D̄_j:

D̄_1 = (1/N_1) Σ_{X ∈ S_1} ||X − Z_1|| = 0.80,  D̄_2 = (1/N_2) Σ_{X ∈ S_2} ||X − Z_2|| = 0.94
Contd..
Step 6. Compute D̄:

D̄ = (1/N) Σ_{j=1}^{N_C} N_j D̄_j = (1/8)[(5)(0.80) + (3)(0.94)] = 0.85
Step 8. The standard deviation vectors are

σ_1 = (0.75, 0.75)^t,  σ_2 = (0.82, 0.82)^t
Contd..
Step 9. In this case σ_1max = 0.75 and σ_2max = 0.82.
Step 10. The conditions for splitting are not
satisfied. Therefore, we proceed to Step 11.
Step 11. We obtain the same result as in the
previous iteration:
D_12 = ||Z_1 − Z_2|| = 4.72
Distances between cluster centers:

Cluster Center   Z_1   Z_2   Z_3   Z_4    Z_5
Z_4                                0.0    49.3
Z_5                                       0.0
Cluster domain variances:

Cluster Domain   1     2     3     4
S_1              1.2   0.7   0.9   1.0
Note:
Since domain S_1 has very similar variances along all coordinate axes, it can be expected to be roughly spherical in nature. Cluster domain S_5, on the other hand, is significantly elongated along the third coordinate axis. A similar analysis can be carried out for the other domains.