
K-Means Clustering: CMPUT 615 Applications of Machine Learning in Image Analysis

K-means clustering is an unsupervised machine learning algorithm that attempts to classify data points into K number of clusters. It works by iteratively assigning each data point to the cluster with the nearest mean and then recalculating the cluster means. K-means clustering is commonly used for image segmentation but does not impose spatial coherence, resulting in noisy segmentations. Otsu's thresholding method finds the optimal threshold for binary segmentation by maximizing between-class variance and can be efficiently computed using recursive formulas on the image histogram.


K-Means Clustering

CMPUT 615
Applications of Machine Learning
in Image Analysis

K-means Overview
A clustering algorithm
An approximation to an NP-hard combinatorial optimization problem
It is unsupervised
K stands for the number of clusters; it is a user input to the algorithm
From a set of data points or observations (all numerical), K-means attempts to classify them into K clusters
The algorithm is iterative in nature

K-means Details

X1, ..., XN are data points, vectors, or observations

Each observation will be assigned to one and only one cluster

C(i) denotes cluster number for the ith observation

Dissimilarity measure: Euclidean distance metric

K-means minimizes within-cluster point scatter:


W(C) = \frac{1}{2} \sum_{k=1}^{K} \sum_{C(i)=k} \sum_{C(j)=k} \| x_i - x_j \|^2 = \sum_{k=1}^{K} N_k \sum_{C(i)=k} \| x_i - m_k \|^2

where
m_k is the mean vector of the kth cluster
N_k is the number of observations in the kth cluster
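As a sanity check, the two equivalent forms of the within-cluster scatter W(C) can be compared numerically. A minimal sketch in Python/NumPy (toy data with arbitrary cluster labels, not part of the lecture):

```python
import numpy as np

# Check that the pairwise form of W(C) equals the mean-centered form
# on random data with an arbitrary assignment into K = 3 clusters.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))          # 30 observations in R^2
C = rng.integers(0, 3, size=30)       # arbitrary cluster labels
K = 3

# Form 1: (1/2) * sum over within-cluster pairs of ||x_i - x_j||^2
W_pairs = 0.0
for k in range(K):
    Xk = X[C == k]
    diffs = Xk[:, None, :] - Xk[None, :, :]
    W_pairs += 0.5 * (diffs ** 2).sum()

# Form 2: sum_k N_k * sum_{C(i)=k} ||x_i - m_k||^2
W_means = 0.0
for k in range(K):
    Xk = X[C == k]
    mk = Xk.mean(axis=0)
    W_means += len(Xk) * ((Xk - mk) ** 2).sum()

assert np.isclose(W_pairs, W_means)
```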

K-means Algorithm
For a given assignment C, compute the cluster means m_k:

m_k = \frac{\sum_{i:\, C(i)=k} x_i}{N_k}, \quad k = 1, \dots, K.

For the current set of cluster means, assign each observation as:

C(i) = \arg\min_{1 \le k \le K} \| x_i - m_k \|^2, \quad i = 1, \dots, N.

Iterate the above two steps until convergence
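The two steps above can be sketched in a few lines of Python/NumPy. This is an illustrative re-implementation, not the lecture's code; for simplicity the initial means are supplied by the caller, which is an assumption of this sketch:

```python
import numpy as np

def kmeans(X, init_means, n_iter=100):
    """Minimal K-means sketch: alternate assignment and mean update
    until the assignments stop changing (convergence)."""
    m = np.array(init_means, dtype=float)
    C = np.full(len(X), -1)
    for _ in range(n_iter):
        # Step 1: assign each observation to the nearest mean (squared Euclidean)
        d2 = ((X[:, None, :] - m[None, :, :]) ** 2).sum(axis=2)
        C_new = d2.argmin(axis=1)
        if np.array_equal(C_new, C):      # converged: assignments unchanged
            break
        C = C_new
        # Step 2: recompute each cluster mean (skip empty clusters)
        for k in range(len(m)):
            if np.any(C == k):
                m[k] = X[C == k].mean(axis=0)
    return C, m

# Two well-separated blobs are split cleanly
X = np.vstack([np.zeros((10, 2)), 10 * np.ones((10, 2))])
C, m = kmeans(X, init_means=[[1.0, 1.0], [9.0, 9.0]])
assert list(C) == [0] * 10 + [1] * 10
```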

Image Segmentation Results

An image (I)

Three-cluster image (J) on the gray values of I

Matlab code:
I = double(imread('...'));  % filename truncated in the source
J = reshape(kmeans(I(:), 3), size(I));

Note that the K-means result is noisy

Summary
K-means converges, but it finds only a local minimum of the cost function
Works only for numerical observations (for categorical and mixed observations, K-medoids is an alternative clustering method)
Fine tuning is required when applied to image segmentation, mostly because the K-means algorithm imposes no spatial coherency
Often works as a starting point for more sophisticated image segmentation algorithms

Otsu's Thresholding Method (1979)

Based on the clustering idea: find the threshold that minimizes the weighted within-class point scatter.
This turns out to be the same as maximizing the between-class scatter.
Operates directly on the gray-level histogram [e.g. 256 numbers, P(i)], so it is fast (once the histogram is computed).

Otsu's Method
Assumes the histogram (and the image) are bimodal.
No use of spatial coherence, nor any other notion of object structure.
Assumes uniform illumination (implicitly), so the bimodal brightness behavior arises from object appearance differences only.

The weighted within-class variance is:

\sigma_w^2(t) = q_1(t)\,\sigma_1^2(t) + q_2(t)\,\sigma_2^2(t)

where the class probabilities are estimated as:

q_1(t) = \sum_{i=1}^{t} P(i), \qquad q_2(t) = \sum_{i=t+1}^{I} P(i)

and the class means are given by:

\mu_1(t) = \sum_{i=1}^{t} \frac{i\,P(i)}{q_1(t)}, \qquad \mu_2(t) = \sum_{i=t+1}^{I} \frac{i\,P(i)}{q_2(t)}

Finally, the individual class variances are:

\sigma_1^2(t) = \sum_{i=1}^{t} [i - \mu_1(t)]^2 \,\frac{P(i)}{q_1(t)}, \qquad \sigma_2^2(t) = \sum_{i=t+1}^{I} [i - \mu_2(t)]^2 \,\frac{P(i)}{q_2(t)}

(Here I is the number of gray levels, e.g. 256.)

Now, we could actually stop here. All we need to do is run through the full range of t values [1, 256] and pick the value that minimizes \sigma_w^2(t).
But the relationship between the within-class and between-class variances can be exploited to generate a recursion relation that permits a much faster calculation.
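The exhaustive search described above can be sketched as follows. This is an illustrative Python/NumPy version run on a toy 16-level histogram, not the lecture's code:

```python
import numpy as np

def otsu_brute_force(P):
    """Exhaustive Otsu sketch: evaluate the weighted within-class variance
    sigma_w^2(t) for every threshold t and return the minimizer."""
    L = len(P)
    i = np.arange(1, L + 1)               # gray levels 1..L, as in the slides
    best_t, best_var = None, np.inf
    for t in range(1, L):                 # class 1 = levels 1..t, class 2 = t+1..L
        q1, q2 = P[:t].sum(), P[t:].sum()
        if q1 == 0 or q2 == 0:            # skip thresholds with an empty class
            continue
        mu1 = (i[:t] * P[:t]).sum() / q1
        mu2 = (i[t:] * P[t:]).sum() / q2
        var1 = (((i[:t] - mu1) ** 2) * P[:t]).sum() / q1
        var2 = (((i[t:] - mu2) ** 2) * P[t:]).sum() / q2
        sigma_w2 = q1 * var1 + q2 * var2
        if sigma_w2 < best_var:
            best_t, best_var = t, sigma_w2
    return best_t

# Toy bimodal histogram with 16 gray levels: modes near levels 4 and 14
hist = np.zeros(16)
hist[[2, 3, 4]] = [1, 4, 1]
hist[[12, 13, 14]] = [1, 4, 1]
P = hist / hist.sum()
t = otsu_brute_force(P)
assert 5 <= t <= 12    # the chosen threshold falls between the two modes
```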

Finally...
Initialization:

q_1(1) = P(1); \qquad \mu_1(0) = 0

Recursion:

q_1(t+1) = q_1(t) + P(t+1)

\mu_1(t+1) = \frac{q_1(t)\,\mu_1(t) + (t+1)\,P(t+1)}{q_1(t+1)}

\mu_2(t+1) = \frac{\mu - q_1(t+1)\,\mu_1(t+1)}{1 - q_1(t+1)}

where \mu = \sum_i i\,P(i) is the total mean of the histogram.

After some algebra, we can express the total variance as:

\sigma^2 = \underbrace{\sigma_w^2(t)}_{\text{within-class, from before}} + \underbrace{q_1(t)\,[1 - q_1(t)]\,[\mu_1(t) - \mu_2(t)]^2}_{\text{between-class, } \sigma_B^2(t)}

Since the total is constant and independent of t, the effect of changing the threshold is merely to move the contributions of the two terms back and forth.
So, minimizing the within-class variance is the same as maximizing the between-class variance.
The nice thing about this is that we can compute the quantities in \sigma_B^2(t) recursively as we run through the range of t values.
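A sketch of the fast version in Python/NumPy: update q_1 and \mu_1 recursively and maximize \sigma_B^2(t) in a single pass over the histogram. This is illustrative, not the lecture's code; the guards against empty classes are an added assumption of the sketch:

```python
import numpy as np

def otsu_recursive(P):
    """One-pass Otsu sketch: update q1(t) and mu1(t) recursively and keep
    the t that maximizes sigma_B^2(t) = q1 (1 - q1) (mu1 - mu2)^2."""
    L = len(P)
    mu = float(sum((i + 1) * P[i] for i in range(L)))  # total mean, levels 1..L
    q1, mu1 = 0.0, 0.0
    best_t, best_sigma_b = None, -1.0
    for t in range(1, L):                  # class 1 = levels 1..t
        p = P[t - 1]                       # P(t), with 1-indexed levels
        if q1 + p > 0:
            mu1 = (q1 * mu1 + t * p) / (q1 + p)   # recursive class-1 mean
        q1 = q1 + p                        # q1(t) = q1(t-1) + P(t)
        if q1 <= 0 or 1.0 - q1 <= 1e-12:   # skip empty classes (added guard)
            continue
        mu2 = (mu - q1 * mu1) / (1.0 - q1)  # class-2 mean from the total mean
        sigma_b = q1 * (1.0 - q1) * (mu1 - mu2) ** 2
        if sigma_b > best_sigma_b:
            best_t, best_sigma_b = t, sigma_b
    return best_t

# Toy bimodal histogram with 16 gray levels: modes near levels 4 and 14
hist = np.zeros(16)
hist[[2, 3, 4]] = [1, 4, 1]
hist[[12, 13, 14]] = [1, 4, 1]
t = otsu_recursive(hist / hist.sum())
assert 5 <= t <= 12    # same split as the brute-force search would find
```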

Result of Otsu's Algorithm

An image

Binary image by Otsu's method

[Figure: gray-level histogram of the image; plot omitted]

Matlab code:
I = double(imread('...'));  % filename truncated in the source
I = (I - min(I(:))) / (max(I(:)) - min(I(:)));
J = I > graythresh(I);
