K Mean Clustering
K Mean Clustering
CLUSTERING
INTRODUCTION-
What is clustering?
space.
Analgorithm for partitioning (or clustering) N
data points into K disjoint subsets Sj
containing data points so as to minimize the
sum-of-squares criterion
containing:
{1,2,3} and {4,5,6,7}.
Their new centroids are:
Step 3:
Now using these centroids
we compute the Euclidean
distance of each object, as
shown in table.
Therefore, there is no
change in the cluster.
Thus, the algorithm comes
to a halt here and final
result consist of 2 clusters
{1,2} and {3,4,5,6,7}.
PLOT
(with K=3)
Step 1 Step 2
PLOT
Real-Life Numerical Example
of K-Means Clustering
We have 4 medicines as our training data points object
and each medicine has 2 attributes. Each attribute
represents coordinate of the object. We have to
determine which medicines belong to cluster 1 and
which medicines belong to the other cluster.
Attribute1 (X): Attribute 2 (Y): pH
Object
weight index
Medicine A 1 1
Medicine B 2 1
Medicine C 4 3
Medicine D 5 4
Step 1:
Initial value of
centroids : Suppose
we use medicine A and
medicine B as the first
centroids.
Let and c and c
1 2
denote the coordinate
of the centroids, then
c1=(1,1) and c2=(2,1)
Objects-Centroids distance : we calculate the
distance between cluster centroid to each object.
Let us use Euclidean distance, then we have
distance matrix at iteration 0 is
iteration 1 is
Iteration-1, Objects
clustering:Based on the new
distance matrix, we move the
medicine B to Group 1 while
all the other objects remain.
The Group matrix is shown
below
Iteration 2, determine
centroids: Now we repeat step
4 to calculate the new centroids
coordinate based on the
clustering of previous iteration.
Group1 and group 2 both has
two members, thus the new
centroids are
and
Iteration-2, Objects-Centroids distances :
Repeat step 2 again, we have new distance
matrix at iteration 2 as
Iteration-2,
Objects clustering: Again, we
assign each object based on the minimum
distance.
H. Zha, C. Ding, M. Gu, X. He and H.D. Simon. "Spectral Relaxation for K-means
Clustering", Neural Information Processing Systems vol.14 (NIPS 2001). pp. 1057-
1064, Vancouver, Canada. Dec. 2001.
www.wikipedia.com