Clustering Image Unit 4
want to analyze what’s inside the image. For example, to find out whether there is a chair or a person
in an indoor image, we may need image segmentation to separate the objects and analyze each
one individually to check what it is. Image segmentation usually serves as a pre-processing step
before pattern recognition, feature extraction, and compression of the image.
Image segmentation is the classification of an image into different groups. Much research
has been done in the area of image segmentation using clustering. There are several
methods, and one of the most popular is the K-Means clustering algorithm. K-Means is
an unsupervised algorithm, and here it is used to segment the area of interest from the
background. It clusters, or partitions, the given data into K clusters based on K centroids.
Image segmentation is the process of partitioning a digital image into multiple distinct
regions, each a set of pixels (also known as a superpixel) with similar attributes.
The goal of image segmentation is to change the representation of an image into something
that is more meaningful and easier to analyze.
Image segmentation is typically used to locate objects and boundaries (lines, curves, etc.) in
images. More precisely, image segmentation is the process of assigning a label to every pixel in an
image such that pixels with the same label share certain characteristics.
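To make "assigning a label to every pixel" concrete, here is a minimal sketch in plain Python. It assumes a tiny grayscale image stored as a nested list and two cluster centers that are already known; the pixel values, the centroid values, and the helper name label_pixels are all made up for illustration.

```python
def label_pixels(image, centroids):
    """Return a label map: each pixel gets the index of its nearest centroid."""
    labels = []
    for row in image:
        label_row = []
        for value in row:
            # Distance from this pixel's intensity to every cluster center.
            distances = [abs(value - c) for c in centroids]
            label_row.append(distances.index(min(distances)))
        labels.append(label_row)
    return labels


# Hypothetical 3x3 grayscale image: dark background, bright object on the right.
image = [
    [12, 15, 200],
    [10, 198, 205],
    [14, 11, 210],
]
centroids = [12.0, 203.0]  # assumed centers: 0 = background, 1 = object

label_map = label_pixels(image, centroids)
for row in label_map:
    print(row)
```

Pixels that share a label here share one characteristic, similar intensity. A real segmentation would typically cluster on color (and sometimes position) rather than a single gray value.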
The K-Means algorithm is used when you have unlabeled data (i.e. data without defined categories or groups).
The goal is to find groups based on some kind of similarity in the data, with the number of
groups represented by K.
In the above figure, customers of a shopping mall have been grouped into 5 clusters based on their
income and spending score. Yellow dots represent the centroid of each cluster.
The objective of K-Means clustering is to minimize the sum of squared distances between each point
and the center of the cluster it is assigned to.
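Written as a formula, the objective can be expressed as below. The symbols are notation chosen here, not taken from the notes: C_k is the set of points in cluster k, mu_k is its centroid, and x_i is a data point.

```latex
\[
J \;=\; \sum_{k=1}^{K} \; \sum_{x_i \in C_k} \lVert x_i - \mu_k \rVert^{2},
\qquad
\mu_k \;=\; \frac{1}{|C_k|} \sum_{x_i \in C_k} x_i
\]
```

K-Means searches for the cluster assignments and centroids that make J as small as possible.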
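Steps in K-Means algorithm:
1. Choose the number of clusters K.
2. Select K points at random as the initial centroids.
3. Assign each data point to the closest centroid → that forms K clusters.
4. Compute and place the new centroid of each cluster.
5. Reassign each data point to the new closest centroid. If any reassignment took place, go to step 4;
otherwise, the model is ready.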
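The loop described in these steps can be sketched in plain Python as follows. This is a minimal illustration, not a production implementation: the function names, the toy 2-D points, the fixed random seed, and the choice to keep the old centroid when a cluster becomes empty are all assumptions made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Give every point the index of its closest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            new_centroids.append(old)  # assumption: an empty cluster keeps its centroid
            continue
        new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return new_centroids


def k_means(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)       # pick K initial centroids at random
    labels = assign(points, centroids)      # assign each point to the closest centroid
    for _ in range(max_iter):
        centroids = update(points, labels, centroids)  # recompute the centroids
        new_labels = assign(points, centroids)         # reassign every point
        if new_labels == labels:                       # no reassignment: done
            break
        labels = new_labels
    return labels, centroids


# Toy data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

Which group receives label 0 and which receives label 1 depends on the random initial centroids, so only the grouping itself is meaningful, not the label numbers.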
The basic idea behind partitioning methods, such as K-Means clustering, is to define clusters such
that the total intra-cluster variation, in other words the total within-cluster sum of squares (WCSS), is
minimized. The total WCSS measures the compactness of the clustering, and we want it to be as small
as possible.
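As a small illustration of how the total WCSS is computed for a given clustering, here is a sketch in plain Python. The function name wcss and the toy points, labels, and centroids are assumptions chosen so the arithmetic is easy to follow by hand.

```python
def wcss(points, labels, centroids):
    """Total within-cluster sum of squares for a given clustering."""
    total = 0.0
    for point, label in zip(points, labels):
        center = centroids[label]
        # Squared Euclidean distance from the point to its own cluster center.
        total += sum((p - c) ** 2 for p, c in zip(point, center))
    return total


# Toy clustering: two clusters of two points each.
points = [(1.0, 1.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0)]
labels = [0, 0, 1, 1]
centroids = [(1.5, 1.0), (8.5, 8.0)]  # each centroid is the mean of its cluster

total = wcss(points, labels, centroids)
print(total)  # each point is 0.5 from its center, so 4 * 0.25 = 1.0
```

A tighter clustering gives a smaller value; moving either centroid away from the mean of its cluster would increase the total.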
The Elbow method looks at the total WCSS as a function of the number of clusters: one should
choose the number of clusters so that adding another cluster does not substantially reduce the total
WCSS.
The location of a bend (the elbow, or knee) in the plot is generally considered an indicator of the
appropriate number of clusters.
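The Elbow method can be sketched by fitting K-Means for several values of K and recording the WCSS each time. The code below is a minimal plain-Python illustration under stated assumptions: the helper names, the toy data, the fixed seed, and the use of a few random restarts per K (keeping the best result) are choices made for this example, not part of the notes.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means_wcss(points, k, rng, max_iter=100):
    """Run one K-Means fit and return its total WCSS."""
    centroids = rng.sample(points, k)  # random initial centroids
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # no reassignment took place
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return sum(sq_dist(p, centroids[label]) for p, label in zip(points, labels))


def elbow_curve(points, max_k, restarts=5, seed=0):
    """Best WCSS found for each K from 1 to max_k."""
    rng = random.Random(seed)
    return {
        k: min(k_means_wcss(points, k, rng) for _ in range(restarts))
        for k in range(1, max_k + 1)
    }


# Toy data: two well-separated groups, so the bend is expected at K = 2.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
curve = elbow_curve(points, max_k=4)
for k, value in curve.items():
    print(k, round(value, 3))
```

For this toy data the drop in WCSS from K = 1 to K = 2 is far larger than any later drop, which is the bend the method looks for. On real data the bend is often less sharp and is usually read off a plot.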