Fast Image Segmentation Based On K-Means Clustering With Histograms in HSV Color Space
I. INTRODUCTION
Fig. 1. Overview of the proposed algorithm.
Color image segmentation [1], whose purpose is to decompose an image into meaningful partitions, is widely applied in multimedia analysis. Depending on the application, various kinds of techniques, including feature-space methods and spatial-domain methods, are used for color image segmentation. Feature-space methods, such as clustering, classify pixels into groups in a pre-defined color space, whereas spatial-domain methods, such as region growing, manipulate pixels to form connected regions. Since feature-space methods and spatial-domain methods both have their strengths, algorithms that combine the two have been developed [2], [3]. Among them, K-Means clustering is often applied as an essential step in the segmentation process.

As a traditional clustering algorithm, K-Means is popular for its simplicity of implementation, and it is commonly applied for grouping pixels in images or video sequences. However, the quality of K-Means suffers from being confined to a fixed value of K rather than a dynamically adjusted value of K [4]. There are solutions that run K-Means many times to find a suitable value of K [2], [5], but they are time-consuming. In addition, random initialization of centroids (cluster centers) makes the results differ from run to run. Therefore, a fast and efficient algorithm for K-Means image segmentation is proposed to handle these problems.

On the other hand, the choice of color space may have a significant influence on the result of image segmentation. There are many kinds of color space, including RGB, YCbCr, YUV, HSV, CIE L*a*b*, and CIE L*u*v*. Although RGB, YCbCr, and YUV color spaces are commonly used in raw data and coding standards, they are not close to human perception. Besides, CIE color spaces are perceptually uniform but inconvenient since they require complicated computations. HSV (Hue, Saturation, Value) [6], which has been shown to give better results for image segmentation than RGB color space [5], [7], is capable of emphasizing human visual perception in hues and has an easily invertible transform from RGB [8]. Nevertheless, to the best of our knowledge, clustering in HSV color space is still an open question since there is no appropriate criterion for the separation or distance measurement of gray and color pixels. Based on these observations, a new method that combines gray and color histogram bins for clustering in HSV color space is developed in this paper.

In brief, the proposed technique has three main advantages for the segmentation process: first, it reduces the computational complexity of K-Means by using histogram quantization in HSV color space; second, it efficiently estimates the number of clusters of K-Means without testing each value of K; third, it considers gray and color histogram bins together in the clustering procedure. Furthermore, the algorithm integrates a filter to efficiently alleviate over-segmentation, and the results of segmentation are close to human perception. Due to these advantages, it is not only useful for processing large amounts of image data but also suitable for embedded systems which have few computational resources.

The paper is organized as follows. The proposed algorithm
is first described in Sec. II. Next, in Sec. III, the experimental results are shown. Finally, a short conclusion is given in Sec. IV.

II. PROPOSED ALGORITHM

An overview of the proposed algorithm is illustrated in Fig. 1; its steps are explained in the following sections.

A. Color Space Transform and Histogram Generation

HSV color space is chosen for the proposed algorithm, and the transform from RGB to HSV in [8] is adopted. Since the ranges of the three dimensions in HSV color space are not the same (hue: $[0, 360^\circ]$, saturation: $[0, 1]$, and value: $[0, 1]$), a quantization process is performed to normalize the values in each dimension:

$$h' = \mathrm{Hue} / h_Q, \tag{1}$$
$$s' = \mathrm{Saturation} / s_Q, \tag{2}$$
$$v' = \mathrm{Value} / v_Q, \tag{3}$$

where $(h', s', v')$ denotes the quantization index, and the quantization parameters $h_Q = 12^\circ$, $s_Q = 0.125$, $v_Q = 0.125$ are set empirically to accentuate the importance of hue and to save computational costs. Therefore, the HSV color space is divided into $30 \times 8 \times 8 = 1920$ partitions. However, the hue of pixels with low saturation is often meaningless since their color is close to gray, and it is suggested that color histogram bins and gray histogram bins should be separated for better color representation [5], [9]. $G(v)$ represents the number of pixels in the gray histogram bin with parameter $v$, whereas $B(h, s, v)$ represents the number of pixels in the color histogram bin with parameters $(h, s, v)$. The correspondence between these parameters and the quantization indices $(h', s', v')$ is summarized in Table I. In total, $N_G$ gray histogram bins and $N_B$ color histogram bins are generated, where $N_G = 8$ and $N_B = 30 \times 7 \times 7 = 1470$. The process of histogram generation and quantization is illustrated in Fig. 2.

Fig. 2. Histogram generation process in HSV color space. (Surviving labels: $s' \in [1, 7]$, $s = s'$; $v' \in [1, 7]$, $v = v'$.)

B. Maximin Initialization and Parameter Estimation

In the traditional K-Means algorithm, the cluster number K is often specified by users, and the initial centroid positions are chosen randomly. In the proposed method, the parameters, including the cluster number and the initial positions of the centroids, are all estimated systematically through the Maximin algorithm [3]. The three steps of the proposed Maximin algorithm, which are used in the preliminary stage for K-Means, are briefly stated as follows:

Step A: From the color histogram bins and gray histogram bins, find the bin which has the maximum number of pixels to be the first centroid.

Step B: For each remaining histogram bin, calculate the min distance, which is the distance between it and its nearest centroid. Then the bin which has the maximum value of min distance is chosen as the next centroid.

Step C: Repeat the process until the number of centroids equals $K_{Max}$ or the maximum value of the distance in Step B is smaller than a predefined threshold $Th_M$. The threshold $K_{Max} = 10$ is set based on the assumption that there should be no more than 10 dominant colors in one image for high-level image segmentation, and $Th_M = 25$ is set empirically according to the human perception of different colors in HSV color space. The distance measurements of histogram bins and centroid vectors are explained in Sec. II-C.

C. K-Means Clustering in HSV Color Space

The proposed K-Means clustering in HSV color space includes five steps, which are listed as follows:

Step 1: Estimate the parameters of K-Means, including a suitable value of K and the positions of the K initial centroids, with the Maximin algorithm in Sec. II-B.

Step 2: Two kinds of histogram bins are clustered together in this step. For color histogram bins, since the hue dimension is circular (e.g. $0^\circ = 360^\circ$), the numerical boundary should be considered in the distance measurement and in the centroid calculations. The distance measurement between a histogram bin vector $B_i = (h_i, s_i, v_i)$ and a cluster centroid vector $C_j^{(t)} = (h_j^{(t)}, s_j^{(t)}, v_j^{(t)})$ in the current iteration $t$ is defined in the form of the Euclidean distance (2-norm):

$$D^2(B_i, C_j^{(t)}) = D_h^2(h_i, h_j^{(t)}) + (s_i - s_j^{(t)})^2 + (v_i - v_j^{(t)})^2, \tag{4}$$

where

$$D_h^2(h_i, h_j^{(t)}) = \begin{cases} \left( \dfrac{360^\circ}{h_Q} - |h_i - h_j^{(t)}| \right)^2, & \text{if } |h_i - h_j^{(t)}| > \dfrac{180^\circ}{h_Q}, \\ (h_i - h_j^{(t)})^2, & \text{otherwise.} \end{cases} \tag{5}$$

Next, classify each color histogram bin to its nearest cluster centroid by the distance measurement. Thus the membership function of histogram bin $B_i$ is defined by

$$\phi^{(t)}(j \mid B_i) = \begin{cases} 1, & \text{if } j = \arg\min_k D^2(B_i, C_k^{(t)}), \\ 0, & \text{otherwise.} \end{cases} \tag{6}$$
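As a rough illustration of the circular hue distance in (4)-(5), the nearest-centroid rule in (6), and the Maximin steps of Sec. II-B, the following Python sketch may help. It is not the authors' implementation. It handles color bins only, expects the caller to pass only occupied bins, and compares the threshold against the squared distance $D^2$; these are all assumptions made here, and the function names are hypothetical.

```python
H_BINS = 30  # 360 / hQ with hQ = 12 degrees, as in Sec. II-A


def hue_dist_sq(h_i, h_j, h_bins=H_BINS):
    """Squared circular distance between two hue indices, following Eq. (5)."""
    d = abs(h_i - h_j)
    if d > h_bins / 2:      # |h_i - h_j| > 180/hQ: take the shorter way round
        d = h_bins - d
    return d * d


def color_dist_sq(b, c):
    """Squared distance between a bin b=(h, s, v) and a centroid c, Eq. (4)."""
    return hue_dist_sq(b[0], c[0]) + (b[1] - c[1]) ** 2 + (b[2] - c[2]) ** 2


def maximin_init(bins, counts, k_max=10, th_m=25):
    """Steps A-C of Sec. II-B on color bins only (simplification).

    Assumption: th_m is compared against the squared distance D^2.
    """
    # Step A: the most populated bin becomes the first centroid.
    first = max(range(len(bins)), key=lambda i: counts[i])
    centroids = [bins[first]]
    while len(centroids) < k_max:
        # Step B: for each bin, distance to its nearest centroid; keep the max.
        best_bin, best_d = None, -1.0
        for b in bins:
            d = min(color_dist_sq(b, c) for c in centroids)
            if d > best_d:
                best_bin, best_d = b, d
        # Step C: stop once the farthest bin is closer than the threshold.
        if best_d < th_m:
            break
        centroids.append(best_bin)
    return centroids


def assign(b, centroids):
    """Index j of the nearest centroid, i.e. the j for which Eq. (6) equals 1."""
    return min(range(len(centroids)),
               key=lambda j: color_dist_sq(b, centroids[j]))


bins = [(0, 4, 4), (29, 4, 4), (15, 4, 4), (1, 4, 4)]
counts = [100, 5, 50, 20]
centroids = maximin_init(bins, counts)
```

In this toy input, hue indices 0 and 29 are neighbors on the hue circle, so the wrap-around branch of (5) keeps them in the same cluster instead of treating them as 29 steps apart.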
On the other hand, gray histogram bins carry no hue information. Thus the distance measurement between a gray histogram bin $G_i = (v_i)$ and a cluster centroid vector $C_j^{(t)} = (h_j^{(t)}, s_j^{(t)}, v_j^{(t)})$ is different from (4):

$$D^2(G_i, C_j^{(t)}) = (s_j^{(t)})^2 + (v_i - v_j^{(t)})^2, \tag{7}$$

which means that the saturation values of gray histogram bins are all considered as zero, and the hue values can be arbitrary. Besides, the membership function of the gray histogram bin $G_i$ is of the same form as (6), where $B_i$ is replaced by $G_i$.

Step 3: Recalculate and update the K cluster centroids. Again, since the hue dimension is circular, the indices in the hue dimension should be considered not absolutely but relatively. An efficient method is introduced to calculate the relative hue index of the original hue index $h_i$ with respect to the centroid $C_j^{(t)} = (h_j^{(t)}, s_j^{(t)}, v_j^{(t)})$:

$$\tilde{h}_{i,j}^{(t)} = \begin{cases} h_i - \dfrac{360^\circ}{h_Q}, & \text{if } |h_i - h_j^{(t)}| > \dfrac{180^\circ}{h_Q} \text{ and } h_j^{(t)} < \dfrac{180^\circ}{h_Q}, \\ h_i + \dfrac{360^\circ}{h_Q}, & \text{if } |h_i - h_j^{(t)}| > \dfrac{180^\circ}{h_Q} \text{ and } h_j^{(t)} > \dfrac{180^\circ}{h_Q}, \\ h_i, & \text{otherwise,} \end{cases} \tag{8}$$

and the values in each dimension of all centroid vectors for the next iteration $C_j^{(t+1)}$ are updated according to the following equations:

$$h_j^{(t+1)} = \frac{\sum_{i=1}^{N_B} \tilde{h}_{i,j}^{(t)}\, \phi^{(t)}(j \mid B_i)\, B(B_i)}{\sum_{i=1}^{N_B} \phi^{(t)}(j \mid B_i)\, B(B_i)}, \tag{9}$$

$$s_j^{(t+1)} = \frac{\sum_{i=1}^{N_B} s_i\, \phi^{(t)}(j \mid B_i)\, B(B_i)}{\sum_{i=1}^{N_B} \phi^{(t)}(j \mid B_i)\, B(B_i) + \sum_{i=1}^{N_G} \phi^{(t)}(j \mid G_i)\, G(G_i)}, \tag{10}$$

$$v_j^{(t+1)} = \frac{\sum_{i=1}^{N_B} v_i\, \phi^{(t)}(j \mid B_i)\, B(B_i) + \sum_{i=1}^{N_G} v_i\, \phi^{(t)}(j \mid G_i)\, G(G_i)}{\sum_{i=1}^{N_B} \phi^{(t)}(j \mid B_i)\, B(B_i) + \sum_{i=1}^{N_G} \phi^{(t)}(j \mid G_i)\, G(G_i)}, \tag{11}$$

where $B(B_i)$ denotes the number of pixels in the color histogram bin with histogram bin vector $B_i$, and $G(G_i)$ denotes the number of pixels in the gray histogram bin with histogram bin vector $G_i$. Note that the hue value of the new centroid should be normalized to the range $[0, 360^\circ/h_Q)$.

Step 4: Check whether the clustering process has converged according to the total distortion measurement, which is the summation of distances between each histogram bin and its nearest cluster centroid:

$$\Delta^{(t)} = \sum_{i=1}^{N_B} \sum_{j=1}^{K} \phi^{(t)}(j \mid B_i)\, D^2(B_i, C_j^{(t)})\, B(B_i) + \sum_{i=1}^{N_G} \sum_{j=1}^{K} \phi^{(t)}(j \mid G_i)\, D^2(G_i, C_j^{(t)})\, G(G_i). \tag{12}$$

When the difference of total distortion measurement $|\Delta^{(t+1)} - \Delta^{(t)}|$ is smaller than a predefined threshold or when the maximum iteration number is reached, the iteration is terminated. Otherwise, $t$ is incremented and the process returns to Step 2.

Step 5: Image pixels are labeled with the index of the nearest centroid of their corresponding histogram bins. A labeled image $l(x, y)$ is obtained in this step, and K-Means clustering is finished. An example is shown in Fig. 3(a).

Fig. 3. Image kodim23: (a) labeled image, (b) filtered image, and (c) segmentation result.

Fig. 4. Image kodim06: (a) the original image, (b) the clustering result of the proposed method, and (c) the clustering result in [5], where color and gray pixels are separated deterministically and marked.

D. Post-Processing of Spatial Regions

To eliminate noise and unnecessary details in labeled images, an efficient statistical filter is introduced. A local histogram for a pixel at position $(x, y)$ in the labeled image is defined as

$$H(z \mid x, y) = \sum_{\substack{l(x', y') \in W(x, y) \\ l(x', y') = z}} 1, \tag{13}$$

where $W(x, y)$ is an $N \times N$ window centered at the spatial coordinate $(x, y)$. Then the processed labeled image $\hat{l}(x, y)$ is obtained by using the following equation:

$$\hat{l}(x, y) = \arg\max_z H(z \mid x, y). \tag{14}$$

The purpose of this filter is to replace each pixel in the labeled image with the label that occurs most often in its window. Afterwards, the spatial regions (8-connected) whose area is smaller than $Th_A$ are merged into the neighboring region with the maximum area to avoid over-segmentation. $Th_A$ is set to 0.05% of the total number of pixels. Two examples from this stage are shown in Fig. 3(b)(c).

III. EXPERIMENTAL RESULTS

The experiments, which contain four parts, are performed on a Pentium-D 2.66 GHz computer with 2 GB of memory. The first part is an algorithm comparison. For 25 images of size 768×512, the average execution time of the proposed method is 0.29 seconds, whereas the method in [5] requires 1.20 seconds. Also, the
Fig. 5. (a) Total distortion (×10^4) vs. iteration of K-Means with image kodim03 (curves: Random, Autonomous). (b) Number of regions vs. filter window size for four images (kodim01, kodim03, kodim07, kodim22).
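Fig. 5(b) varies the window size N of the statistical filter of Sec. II-D. As a rough illustration of what (13) and (14) compute, the Python sketch below replaces each label by the most frequent label in its N × N window. It is only a sketch under stated assumptions, not the authors' code: the window is clipped at the image border, ties are broken toward the smaller label, and the function name `mode_filter` is hypothetical. The subsequent merging of regions smaller than $Th_A$ is not covered.

```python
from collections import Counter


def mode_filter(labels, n=3):
    """Majority-label filter over an n x n window, following Eqs. (13)-(14).

    labels is a list of equally long rows of integer labels l(x, y).
    Assumptions: the window is clipped at the borders, and ties go to the
    smaller label.
    """
    rows, cols = len(labels), len(labels[0])
    r = n // 2
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            hist = Counter()  # local histogram H(z | x, y)
            for yy in range(max(0, y - r), min(rows, y + r + 1)):
                for xx in range(max(0, x - r), min(cols, x + r + 1)):
                    hist[labels[yy][xx]] += 1
            # arg max over z of H(z | x, y), smaller label on ties
            out[y][x] = min(hist, key=lambda z: (-hist[z], z))
    return out


noisy = [[0, 0, 0],
         [0, 1, 0],
         [0, 0, 0]]
filtered = mode_filter(noisy, n=3)
```

With a 3 × 3 window the isolated label 1 in the center is outvoted by its neighbors, which is the noise-removal effect the filter is meant to have. With n = 1 the window is the pixel itself and the image is returned unchanged.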