CAEd Home Work 02 Report

Uploaded by

Usama Aameer

Computer Applications in Engineering Design

Assignment 02
Abdur Rahman 180380
December 2, 2019

Implementation of K-mean Clustering Algorithm in Python


Submitted To: Dr. Habib

Contents
1 What is K-mean Clustering?
1.1 K-mean Algorithm
2 Implementation of K-mean Algorithm in Python
2.1 Flow Chart
2.2 Pseudo code

1 What is K-mean Clustering?
Clustering is a technique for finding the sub-groups within a given set of
data. The K-means algorithm is an iterative algorithm that tries to partition
the dataset into K pre-defined, distinct, non-overlapping subgroups (clusters),
where each data point belongs to only one group.

1.1 K-mean Algorithm:


• Specify the number of clusters K.

• Initialize the centroids by first shuffling the dataset and then randomly
selecting K data points for the centroids without replacement.

• Keep iterating over the three steps below until there is no change to the
centroids, i.e. the assignment of data points to clusters is no longer changing.

• Compute the sum of the squared distances between the data points and all
centroids.

• Assign each data point to the closest cluster (centroid).

• Compute the centroid of each cluster by taking the average of all the
data points that belong to that cluster.
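
The steps above can be sketched in Python roughly as follows. This is only a minimal illustration, assuming NumPy is available; the function name `kmeans` and the parameters `max_iter` and `seed` are hypothetical and not part of the report. Drawing K indices without replacement stands in for "shuffle, then pick K points", and keeping the old centroid when a cluster ends up empty is an added assumption.

```python
import numpy as np

def kmeans(points, k, max_iter=100, seed=0):
    """Illustrative K-means sketch; names and defaults are assumptions."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    # Initialise: pick K distinct data points as the starting centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iter):
        # Squared distance from every point to every centroid, shape (n, k).
        sq_dist = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        # Assign each point to its closest centroid.
        labels = sq_dist.argmin(axis=1)
        # New centroid = average of the points assigned to it.
        # Assumption: an empty cluster keeps its previous centroid.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Stop once the centroids no longer change.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels
```

A call such as `centroids, labels = kmeans(data, k=3)` would return one centroid per cluster and, for every data point, the index of the cluster it was assigned to.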

2 Implementation of K-mean Algorithm in Python
2.1 Flow Chart

2.2 Pseudo code


• Read data from text files into arrays.

• Concatenate the above arrays into one array variable.

• Get the maximum and minimum values of the given points.

• Generate k random clustering points in the range between the minimum and
maximum values measured in the above step.

• Get the distance of every data point in the given data from every clustering
point.

• Compute the average distance and assign each data point to the cluster of
the clustering point that lies at a smaller distance from it than the others.

• Plot these points and pause the system for a few seconds for visualization.

• Move the clustering points to their new positions in their respective
clusters.

• Measure the difference between the old and new positions of the clustering
points.

• Repeat the above process until this difference is very small.
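
As a rough illustration, the pseudo code above might translate into Python as shown below. This is a sketch under stated assumptions, not the report's actual program: it assumes NumPy, takes arrays that have already been read from the text files, takes the minimum and maximum per coordinate, and leaves out the plotting and pause step. The names `cluster_from_arrays`, `tol`, `max_iter` and `seed` are hypothetical, and a clustering point with no data points assigned to it is simply left where it is.

```python
import numpy as np

def cluster_from_arrays(arrays, k, tol=1e-6, max_iter=200, seed=0):
    """Illustrative sketch of the pseudo code; names are assumptions."""
    rng = np.random.default_rng(seed)
    # Concatenate the input arrays into one array of points, shape (n, d).
    data = np.concatenate([np.asarray(a, dtype=float) for a in arrays], axis=0)
    # Minimum and maximum of the given points (per coordinate, an assumption).
    lo, hi = data.min(axis=0), data.max(axis=0)
    # k random clustering points inside that range.
    centers = rng.uniform(lo, hi, size=(k, data.shape[1]))
    for _ in range(max_iter):
        # Distance of every data point from every clustering point.
        dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        # Each data point joins the cluster of its nearest clustering point.
        labels = dist.argmin(axis=1)
        # Move each clustering point to the mean of its cluster.
        new_centers = centers.copy()
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:  # assumption: empty clusters stay put
                new_centers[j] = members.mean(axis=0)
        # Difference between the old and new positions.
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        # Stop once the clustering points have essentially stopped moving.
        if shift < tol:
            break
    return centers, labels
```

Because the starting points are drawn at random from the data range rather than from the data itself, a clustering point can end up with no data points at all; running the function with several different `seed` values is one simple way to check how sensitive the result is to the starting positions.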
