
K Mean

The document explains the K-Means Clustering algorithm, detailing the mathematical steps involved in clustering data points into groups based on their distances to centroids. It describes the iterative process of recalculating centroids and assigning data points to clusters until convergence is achieved. The conclusion emphasizes the importance of using methods like the Elbow method to determine the optimal number of clusters in larger datasets.

Uploaded by

pdefindetudes
The Math Behind K-Means Clustering Algorithm

Source: https://medium.com/@draj0718/the-math-behind-k-means-clustering-4aa85532085e

K-Means Clustering groups data points into K clusters based on their distance to a set of cluster centers (centroids). Each point is assigned to its nearest centroid, each centroid is then recomputed as the mean of the points assigned to it, and the two steps repeat until the clusters stop changing.

The distance between a data point (x_1, y_1) and a centroid (x_2, y_2) is the Euclidean distance:

$$\text{Distance} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$

Example

Data points: P1(1,3), P2(2,2), P3(5,8), P4(8,5), P5(3,9), P6(10,7), P7(3,3), P8(9,4), P9(3,7).

We take K = 3 and choose P7(3,3), P9(3,7) and P8(9,4) as the initial centroids C1, C2 and C3.

Iteration 1

Calculate the distance between each data point and the centroids C1(3,3), C2(3,7), C3(9,4):

C1P1 => (3,3)(1,3) => sqrt[(1-3)^2 + (3-3)^2] => sqrt[4] => 2
C2P1 => (3,7)(1,3) => sqrt[(1-3)^2 + (3-7)^2] => sqrt[20] => 4.5
C3P1 => (9,4)(1,3) => sqrt[(1-9)^2 + (3-4)^2] => sqrt[65] => 8.1

C1P2 => (3,3)(2,2) => sqrt[(2-3)^2 + (2-3)^2] => sqrt[2] => 1.4
C2P2 => (3,7)(2,2) => sqrt[(2-3)^2 + (2-7)^2] => sqrt[26] => 5.1
C3P2 => (9,4)(2,2) => sqrt[(2-9)^2 + (2-4)^2] => sqrt[53] => 7.3

C1P3 => (3,3)(5,8) => sqrt[(5-3)^2 + (8-3)^2] => sqrt[29] => 5.4
C2P3 => (3,7)(5,8) => sqrt[(5-3)^2 + (8-7)^2] => sqrt[5] => 2.2
C3P3 => (9,4)(5,8) => sqrt[(5-9)^2 + (8-4)^2] => sqrt[32] => 5.7

Similarly for the other distances. Each point joins the cluster of its nearest centroid:

Data point | Centroid (3,3) | Centroid (3,7) | Centroid (9,4) | Cluster
P1(1,3)    | 2.0 | 4.5 | 8.1 | 1
P2(2,2)    | 1.4 | 5.1 | 7.3 | 1
P3(5,8)    | 5.4 | 2.2 | 5.7 | 2
P4(8,5)    | 5.4 | 5.4 | 1.4 | 3
P5(3,9)    | 6.0 | 2.0 | 7.8 | 2
P6(10,7)   | 8.1 | 7.0 | 3.2 | 3
P7(3,3)    | 0.0 | 4.0 | 6.1 | 1
P8(9,4)    | 6.1 | 6.7 | 0.0 | 3
P9(3,7)    | 4.0 | 0.0 | 6.7 | 2

Cluster 1 => P1(1,3), P2(2,2), P7(3,3)
Cluster 2 => P3(5,8), P5(3,9), P9(3,7)
Cluster 3 => P4(8,5), P6(10,7), P8(9,4)

Now we re-compute the cluster centers. The new center of a cluster is the mean of all the points contained in that cluster.

New center of Cluster 1 => (1+2+3)/3, (3+2+3)/3 => 2, 2.7
New center of Cluster 2 => (5+3+3)/3, (8+9+7)/3 => 3.7, 8
New center of Cluster 3 => (8+10+9)/3, (5+7+4)/3 => 9, 5.3

Iteration 1 is over. Now let us take the new center points and repeat the same steps: calculate the distance between the data points and the new center points with the Euclidean formula, and find the cluster groups.

Iteration 2

Calculate the distance between each data point and the centroids C1(2,2.7), C2(3.7,8), C3(9,5.3):

C1P1 => (2,2.7)(1,3) => sqrt[(1-2)^2 + (3-2.7)^2] => sqrt[1.09] => 1.0
C2P1 => (3.7,8)(1,3) => sqrt[(1-3.7)^2 + (3-8)^2] => sqrt[32.29] => 5.7
C3P1 => (9,5.3)(1,3) => sqrt[(1-9)^2 + (3-5.3)^2] => sqrt[69.29] => 8.3

Similarly for the other distances.

Data point | Centroid (2,2.7) | Centroid (3.7,8) | Centroid (9,5.3) | Cluster
P1(1,3)    | 1.0 | 5.7 | 8.3 | 1
P2(2,2)    | 0.7 | 6.2 | 7.7 | 1
P3(5,8)    | 6.1 | 1.3 | 4.8 | 2
P4(8,5)    | 6.4 | 5.2 | 1.0 | 3
P5(3,9)    | 6.4 | 1.2 | 7.0 | 2
P6(10,7)   | 9.1 | 6.4 | 2.0 | 3
P7(3,3)    | 1.0 | 5.0 | 6.4 | 1
P8(9,4)    | 7.1 | 6.6 | 1.3 | 3
P9(3,7)    | 4.4 | 1.2 | 6.2 | 2

Cluster 1 => P1(1,3), P2(2,2), P7(3,3)
Cluster 2 => P3(5,8), P5(3,9), P9(3,7)
Cluster 3 => P4(8,5), P6(10,7), P8(9,4)

Center of Cluster 1 => (1+2+3)/3, (3+2+3)/3 => 2, 2.7
Center of Cluster 2 => (5+3+3)/3, (8+9+7)/3 => 3.7, 8
Center of Cluster 3 => (8+10+9)/3, (5+7+4)/3 => 9, 5.3

We got the same centroids and cluster groups as in iteration 1, which indicates that the algorithm has converged and this dataset separates into 3 groups.
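The two iterations worked through by hand can be reproduced with a short script. This is a minimal sketch, not the author's implementation: the function names `assign`, `update` and `kmeans` are hypothetical, only the Python standard library is used, and it assumes no cluster ever becomes empty (which holds for these nine points).

```python
import math

# Data points and initial centroids taken from the worked example.
points = {
    "P1": (1, 3), "P2": (2, 2), "P3": (5, 8),
    "P4": (8, 5), "P5": (3, 9), "P6": (10, 7),
    "P7": (3, 3), "P8": (9, 4), "P9": (3, 7),
}
initial_centroids = [(3, 3), (3, 7), (9, 4)]  # P7, P9, P8


def assign(points, centroids):
    """Map each point name to the index of its nearest centroid."""
    return {
        name: min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
        for name, p in points.items()
    }


def update(points, labels, k):
    """Recompute each centroid as the mean of the points assigned to it."""
    centroids = []
    for c in range(k):
        members = [points[n] for n, lab in labels.items() if lab == c]
        # Assumption: every cluster keeps at least one point (true for this data).
        centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
    return centroids


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and update until the labels stop changing."""
    labels = None
    for iteration in range(1, max_iter + 1):
        new_labels = assign(points, centroids)
        if new_labels == labels:
            # Same groups as the previous pass: converged.
            return labels, centroids, iteration
        labels = new_labels
        centroids = update(points, labels, len(centroids))
    return labels, centroids, max_iter


labels, centroids, iterations = kmeans(points, initial_centroids)
for c, center in enumerate(centroids, start=1):
    members = [n for n, lab in labels.items() if lab == c - 1]
    print(f"Cluster {c}: {members}, center {tuple(round(v, 1) for v in center)}")
print("Stopped at iteration", iterations)
```

The script keeps full floating-point precision for the centroids, whereas the hand calculation rounds them to one decimal, so individual distances may differ slightly in the last digit while the cluster groups stay the same.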
K-Means clustering stops iterating once the same cluster groups repeat, so there is no need to continue; the result of the last iteration is taken as the final cluster groups for this dataset.

The graph below shows the difference between iterations 1 and 2. We can see that the centroids (green dots) changed in the 2nd iteration.

[Figure: two scatter plots, "Iteration 1" and "Iteration 2", with the centroids shown as green dots]

Conclusion

I hope the math steps behind K-Means Clustering are now clear. In this blog we used a small dataset, so we were given K = 3 and needed only 2 iterations. Real-world datasets are much larger and have many more features; in that case we should use the Elbow method (WCSS, the within-cluster sum of squares) to choose a suitable value of K (the number of cluster groups).

Written by Dharmaraj
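The Elbow method mentioned in the conclusion can be sketched as follows: run K-Means for several values of K, record the WCSS for each, and look for the K after which the WCSS stops dropping sharply. This is an illustrative sketch under stated assumptions, not code from the article: the helper names `kmeans` and `wcss` are hypothetical, and the first K data points are used as initial centroids purely to keep the run deterministic.

```python
import math

# The nine data points from the worked example.
points = [(1, 3), (2, 2), (5, 8), (8, 5), (3, 9), (10, 7), (3, 3), (9, 4), (3, 7)]


def kmeans(points, centroids, max_iter=100):
    """Plain K-Means: returns the final centroids and the label of each point."""
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments repeated, so the algorithm has converged
        labels = new_labels
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(a) / len(members) for a in zip(*members))
    return centroids, labels


def wcss(points, centroids, labels):
    """Within-cluster sum of squared distances to the assigned centroid."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


scores = {}
for k in range(1, 6):
    # Assumption: the first k points act as initial centroids (deterministic, not optimal).
    centroids, labels = kmeans(points, list(points[:k]))
    scores[k] = wcss(points, centroids, labels)
    print(k, round(scores[k], 1))
```

Plotting `scores` against K and picking the point where the curve bends gives the suggested number of clusters. Because the result of K-Means depends on the initial centroids, practical use would typically repeat each K with several initializations and keep the lowest WCSS.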