Image Segmentation Using the K-Means Algorithm
Assignment
Image Segmentation
In computer vision, image segmentation is the process of partitioning an image into multiple segments.
The goal of segmenting an image is to change the representation of an image into something that is
more meaningful and easier to analyze. It is usually used for locating objects and creating boundaries.
It is not a great idea to process an entire image because many parts in an image may not contain any
useful information. Therefore, by segmenting the image, we can make use of only the important
segments for processing.
An image is basically a set of given pixels. In image segmentation, pixels which have similar attributes are
grouped together. Image segmentation creates a pixel-wise mask for objects in an image which gives us
a more comprehensive and granular understanding of the object.
Image Segmentation involves converting an image into a collection of regions of pixels that are
represented by a mask or a labeled image. By dividing an image into segments, you can process only the
important segments of the image instead of processing the entire image.
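As a small illustration of a labeled image and a mask, the sketch below uses NumPy on a made-up 4x4 grayscale image. The pixel values, the label layout, and the helper name `segment_mean` are assumptions for illustration only; they do not come from the assignment.

```python
import numpy as np

# Hypothetical 4x4 grayscale image (values invented for illustration).
image = np.array([
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10,  10,  10],
    [10, 10,  10,  10],
], dtype=np.float64)

# Labeled image: one integer label per pixel (0 = background, 1 = object).
labels = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
])


def segment_mean(image, labels, label):
    """Mean intensity of the pixels that carry the given label."""
    mask = labels == label       # boolean pixel-wise mask for one segment
    return image[mask].mean()    # only the pixels inside the mask are used


object_mask = labels == 1
print(object_mask.sum())               # number of pixels in the object segment
print(segment_mean(image, labels, 1))  # mean intensity of the object segment
```

Indexing with the boolean mask is what lets later processing touch only the segment of interest instead of the whole image.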
For example, to find out whether there is a chair or a person in an indoor image, we may need image
segmentation to separate the objects and analyze each one individually to check what it is. Image
segmentation usually serves as a pre-processing step before pattern recognition, feature extraction, and
compression of the image.
Clustering
Clustering is the process of organizing objects into groups such that data points in the same group are
similar to one another. A cluster is a collection of objects that are similar to each other and
dissimilar to the objects in other clusters.
K-Means
K-Means clustering is a type of unsupervised learning. The main goal of this algorithm is to find groups in
the data, where the number of groups is represented by K. It is an iterative procedure in which each data
point is assigned to one of the K groups based on feature similarity.
Algorithm
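The K-Means algorithm starts with initial estimates of the K centroids, which are randomly selected from the
dataset. It then alternates between assigning each data point to its nearest centroid and recomputing each
centroid as the average of the points assigned to it, until the assignments stop changing.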
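The iterative procedure can be sketched in a few lines of NumPy. This is a minimal illustration, not the assignment's own implementation: the function name `kmeans`, its parameters (`n_iters`, `seed`), the stopping rule, and the toy `data` array are all assumptions.

```python
import numpy as np


def kmeans(points, k, n_iters=100, seed=0):
    """Minimal K-Means sketch: returns (centroids, labels)."""
    points = np.asarray(points, dtype=np.float64)
    rng = np.random.default_rng(seed)
    # Initial estimates: K distinct data points chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    labels = None
    for _ in range(n_iters):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        # Stop when no point changes cluster.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Move each centroid to the mean of its assigned points.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids, labels


# Toy data (invented): two well-separated groups of three points each.
data = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                 [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
centroids, labels = kmeans(data, k=2)
print(labels)
```

Which group receives label 0 depends on the random initialization, so only the grouping itself is meaningful, not the label numbers.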
Uses:
1. Used in self-driving cars. Autonomous driving is not possible without object detection, which
involves segmentation.
2. Used in the healthcare industry. It helps in segmenting cancer cells and tumors, from which their
severity can be gauged.
K-Means is a clustering algorithm. Clustering algorithms are unsupervised, which means that no labelled
data is available. They are used to identify different classes or clusters in the given data based on how
similar the data points are. Data points in the same group are more similar to each other than to those
in other groups.
K-means clustering is one of the most commonly used clustering algorithms. Here, k represents the
number of clusters.
Observations
AB = Average of A, B
CD = Average of C, D
3. Calculate the squared Euclidean distance from every data point to the centroids AB and CD. For example,
the distance between A (2, 3) and AB (4, 2) is s = (2 - 4)² + (3 - 2)² = 5.
4. In the figure, the highlighted distance between A and CD is 4, which is less than the distance between
A and AB, which is 5. Since point A is closer to CD, we move A to the CD cluster.
5. Two clusters have been formed so far. Recompute their centroids, B and ACD, as in step 2.
ACD = Average of A, C, D
B = B
6. Because K-Means is an iterative procedure, we now calculate the distance from every point (A, B, C,
D) to the new centroids (B, ACD), as in step 3.
Clusters B, ACD
7. In the above picture, each point's distance to its own cluster is the minimum: for example, A is far
from cluster B and near cluster ACD. All data points are assigned to the clusters (B, ACD) based on their
minimum distance, so nothing changes and the iterative procedure ends here.
8. To conclude, we started with two centroids and ended up with two clusters, so K = 2.
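The steps above can be replayed numerically. In the sketch below, A = (2, 3) and AB = (4, 2) come from the text, and B = (6, 1) follows from AB being the average of A and B. The coordinates of C and D are not given in the text: the values used here are assumptions, chosen only so that the squared distance from A to CD equals 4 as stated in step 4. The helper names `sq_dist` and `assign` are likewise illustrative.

```python
import numpy as np

# A and B are taken or derived from the text. C and D are ASSUMED values,
# picked so that CD = (2, 5) lies at squared distance 4 from A (step 4).
points = {"A": (2.0, 3.0), "B": (6.0, 1.0), "C": (1.0, 5.0), "D": (3.0, 5.0)}
names = list(points)
X = np.array([points[n] for n in names])


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return float(((np.asarray(p) - np.asarray(q)) ** 2).sum())


def assign(X, centroids):
    """Index of the nearest centroid for every row of X."""
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


# Step 2: initial centroids AB and CD.
centroids = np.array([X[[0, 1]].mean(axis=0), X[[2, 3]].mean(axis=0)])
print(sq_dist(points["A"], centroids[0]))  # A to AB: 5.0
print(sq_dist(points["A"], centroids[1]))  # A to CD: 4.0

# Steps 3 to 7: reassign and recompute until the assignment stops changing.
labels = assign(X, centroids)
while True:
    centroids = np.array([X[labels == j].mean(axis=0) for j in range(2)])
    new_labels = assign(X, centroids)
    if np.array_equal(new_labels, labels):
        break
    labels = new_labels

clusters = [[n for n, l in zip(names, labels) if l == j] for j in range(2)]
print(clusters)  # [['B'], ['A', 'C', 'D']]
```

With these assumed coordinates the loop stops after one recomputation, matching the two final clusters B and ACD described in the steps.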
To sum up:
As we increase the value of k, the image becomes clearer and more distinct because the K-means algorithm
can classify more classes (clusters) of colors. K-means clustering works well on a small dataset: it can
segment objects in images and give good results. But when it is applied to large datasets (a greater
number of images), it looks at all the samples in every iteration, which takes a lot of time.
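To make the effect of k concrete, the sketch below clusters the pixel colors of an image and repaints every pixel with the mean color of its cluster. It is an illustration under stated assumptions: the function name `segment_colors`, its parameters, the fixed iteration count, and the random synthetic image (standing in for a real photo) are not from the assignment.

```python
import numpy as np


def segment_colors(image, k, n_iters=20, seed=0):
    """Cluster pixel colors with K-Means; repaint each pixel with its cluster mean."""
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(np.float64)
    rng = np.random.default_rng(seed)
    # Initial centroids: k pixels chosen at random.
    centroids = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(n_iters):
        # Every iteration compares all N pixels with all k centroids,
        # which is why the running time grows with the amount of data.
        d = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = pixels[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    segmented = centroids[labels].reshape(h, w, c)
    return segmented, labels.reshape(h, w)


# Synthetic 32x32 RGB image with random values (an assumption, not real data).
rng = np.random.default_rng(1)
image = rng.integers(0, 256, size=(32, 32, 3)).astype(np.float64)

for k in (2, 4, 8):
    segmented, labels = segment_colors(image, k)
    mse = ((image - segmented) ** 2).mean()
    print(k, round(float(mse), 1))  # reconstruction error for this k
```

The printed error typically shrinks as k grows, since more clusters can represent more colors, although K-means can settle in a local optimum, so this is a tendency rather than a guarantee.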