Analysis and Study of K Means Clustering Algorithm IJERTV2IS70648
ISSN: 2278-0181
Vol. 2 Issue 7, July - 2013
14. kk = {q}
15. K = K union {kk}
16. ck = q
17. C = C union {ck}

For example, let the number of times the first statement runs, with cost m1, be q (>= 1). For each q, the next statement, for i = 1, 2, ..., n, where n is the number of data objects, runs n+1 times with cost m2. For each q and for each n, the next statement runs k+1 times, where k is the number of clusters, with cost m3. The 4th statement runs one time for each q and for each n, with cost m4. Calculating the new mean for each cluster requires k+1 runs for each q, with cost m5.

The running time of the algorithm is the sum of the running times of each statement executed, i.e.

T(n) = m1*q + m2*∑(j=1..q)(n+1) + m3*∑(j=1..q)∑(i=1..n)(k+1) + m4*∑(j=1..q)∑(i=1..n)1 + m5*∑(j=1..q)(k+1)
     = m1*q + m2*q*(n+1) + m3*q*n*(k+1) + m4*q*n + m5*q*(k+1)

3.1. ADVANTAGES OF PROPOSED CLUSTERING

A review of the available literature indicates the following advantages of the proposed clustering over the K-means clustering algorithm:

1. In K-means clustering algorithms, the number of clusters (k) needs to be
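Summing each statement's cost over the q passes gives a closed form dominated by the m3 term, i.e. T(n) grows as O(q*n*k). A minimal sketch of this evaluation (the function name and the default unit costs m1..m5 are assumptions for illustration, not from the paper):

```python
# Illustrative sketch: evaluating the running-time expression T(n)
# for assumed constant statement costs m1..m5 (defaults of 1 are
# an assumption, not taken from the paper).
def running_time(q, n, k, m1=1, m2=1, m3=1, m4=1, m5=1):
    """Closed form of T(n) after summing each term over the q passes."""
    return (m1 * q                      # first statement: q runs in total
            + m2 * q * (n + 1)          # loop over n data objects: n+1 tests per pass
            + m3 * q * n * (k + 1)      # k+1 runs per object per pass (k clusters)
            + m4 * q * n                # 4th statement: once per object per pass
            + m5 * q * (k + 1))         # recomputing cluster means: k+1 runs per pass

print(running_time(q=5, n=100, k=3))  # → 3030, dominated by m3*q*n*(k+1) = 2000
```

The dominant term m3*q*n*(k+1) reflects the distance computations against every cluster centre, which is why this style of algorithm scales with the product of passes, data objects, and clusters.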
Test case 1:
11  15.18  2   1  11
10   9.14  2   4  12
 9   7.64  2   3  13
 8   6.22  2   6  12
 7   4.84  2   8  12
 6   3.78  2  11  12

Test case 2:
12  17.2   3   6   7
11  14.79  3   7   8
10   8.42  3  12   8
 9   6.9   3  11   9
 8   5.58  3  14   8
 7   4.35  3  14   9
 6   3.56  3  15  10

Test case 3:
12  17.21  4   6   7
11  14.13  4  10   7
10   7.49  4  18   6
 9   5.8   4  20   6
 8   5.32  4  17   7
 7   3.92  4  20   7
 6   2.78  4  27   6

Figure 2: Graph representing test case 2 (proposed clustering technique: threshold value, no. of clusters formed, square error x 100).
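The "square error" reported for these test cases is presumably the standard within-cluster sum of squared distances (SSE), the usual compactness measure for K-means-style clusterings. A minimal sketch, with the function name and sample data invented for illustration:

```python
# Illustrative sketch: within-cluster square error (SSE).
# The sample points, labels and centroids below are made up for demonstration.
def square_error(points, labels, centroids):
    """Sum of squared Euclidean distances from each point to the
    centroid of the cluster it is assigned to."""
    total = 0.0
    for (x, y), c in zip(points, labels):
        cx, cy = centroids[c]
        total += (x - cx) ** 2 + (y - cy) ** 2
    return total

points = [(0, 0), (1, 0), (10, 10), (11, 10)]
labels = [0, 0, 1, 1]                    # two tight clusters of two points each
centroids = [(0.5, 0.0), (10.5, 10.0)]   # the means of those clusters
print(square_error(points, labels, centroids))  # → 1.0
```

A lower value indicates more compact clusters, which matches the trend in the tables above: the square error shrinks as the threshold value decreases.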
Figure 3: Graph representing test case 3 (proposed clustering technique: threshold value, no. of clusters formed, square error x 100).

The above graph shows that:

1. As the threshold value decreases, the square error decreases. The lower the square error, the more compact the clusters, and the better separated they are. Hence as we decrease