
Pattern Recognition Letters 26 (2005) 1139–1156

www.elsevier.com/locate/patrec

Seeded region growing: an extensive and comparative study


Jianping Fan a,*, Guihua Zeng b, Mathurin Body c, Mohand-Said Hacid c

a Department of Computer Science, University of North Carolina, 9201 University City Boulevard, Charlotte, NC 28223, USA
b Department of Electronic Engineering, Shanghai Jiaotong University, Shanghai 200030, China
c LISI-UFR d'Informatique, Universite Claude Bernard Lyon 1, France

Received 26 November 2003


Available online 7 December 2004

Abstract

The seeded region growing (SRG) algorithm is very attractive for semantic image segmentation because it can involve high-level knowledge of image components in the seed selection procedure. However, the SRG algorithm also suffers from the problems of pixel sorting orders for labeling and of automatic seed selection. An obvious way to improve the SRG algorithm is to provide a more effective pixel labeling technique and to automate the process of seed selection. To provide such a framework, we design an automatic SRG algorithm, along with a boundary-oriented parallel pixel labeling technique and an automatic seed selection method. Moreover, a seed tracking algorithm is proposed for automatic moving object extraction: the region seeds that are located inside the temporal change mask are selected for generating the regions of moving objects. Experimental evaluation shows the good performance of our technique on a relatively large variety of images without the need to adjust parameters.
© 2004 Elsevier B.V. All rights reserved.

Keywords: Seeded region growing; Automatic image segmentation; Seed tracking

1. Introduction

Automatic image segmentation is an essential process for most subsequent tasks, such as image description, recognition, retrieval and object-based image compression (Majunath et al., 2000; Kunt et al., 1987). Automatic image segmentation has also become a key point of the MPEG-4 and MPEG-7 standards for realizing object-based image coding and content-based image description and retrieval. The general image segmentation problem involves the partitioning of a given image into a number of homogeneous regions according to a given criterion. Thus, image segmentation can be considered as a pixel labeling process in the sense that all pixels that belong to the same homogeneous region are assigned the same label (Haris et al., 1998). The existing automatic image segmentation

* Corresponding author. E-mail address: [email protected] (J. Fan).

0167-8655/$ - see front matter © 2004 Elsevier B.V. All rights reserved.
doi:10.1016/j.patrec.2004.10.010

techniques can be classified into five approaches, namely, thresholding techniques (Lim and Lee, 1990; Sahoo et al., 1988; Pal and Pal, 1993), boundary-based methods (Kass et al., 1987; Palmer et al., 1996), region-based methods (Haralick and Shapiro, 1985; Chang and Li, 1994; Hijjatoleslami and Kittler, 1998; Adams and Bischof, 1994), hybrid techniques (Pavlidis and Liow, 1990; Haddon and Boyce, 1990; Chu and Aggarwal, 1993), and clustering-based techniques (Pappas, 1992; Shen et al., 1998).

Seeded region growing (SRG), introduced by Adams and Bischof (1994), is robust, rapid and free of tuning parameters. These characteristics allow the implementation of a very good algorithm that can be applied to a large variety of images. SRG is also very attractive for semantic image segmentation because it can involve the high-level knowledge of image components in the seed selection procedure. However, the SRG algorithm also suffers from the problems of automatic seed generation and pixel sorting orders for labeling (Mehnert and Jackway, 1997; Fan et al., 2001a). There are several potential approaches to improving the SRG algorithm:

• Scan order optimization: The original SRG algorithm uses a sequential sorted list as its data structure (Adams and Bischof, 1994). All pixels are put into the sequential sorted list according to their delta value. Mehnert and Jackway (1997) confirmed that a different order of processing pixels leads to different final segmentation results, and they noticed two types of order dependencies: inherent order dependencies and implementation order dependencies. Moreover, the unlabeled pixels may not be adjacent to any of the selected seeds, especially at the beginning of the SRG procedure, so the connection characteristics among adjacent pixels should be used in the pixel labeling procedure. The objective of image segmentation is to label adjacent (connected at the pixel level) similar pixels with the same symbol. Since the region boundaries are used for defining the boundaries of different image components, we propose a boundary-oriented technique to accelerate the seeded pixel labeling procedure. Our boundary-oriented pixel labeling technique can also support a parallel SRG procedure.

• Automatic seed selection: Fan et al. (2001a) developed an automatic edge-oriented seed generation technique to automate the SRG algorithm. Color edge detection is first performed to obtain the simplified geometric structures of a color image; the centroids of the neighboring labeled color edges are then taken as the initial seeds for region growing. However, the edge-oriented SRG algorithm may induce an oversegmentation problem because the color edges may be over-detected for texture images and thus result in redundant seeds. There are two reasonable approaches to improving this edge-oriented SRG algorithm: one is to perform a post-procedure of similarity-based region merging after the SRG procedure, and the other is to perform an image filtering procedure before the color edge detection procedure. In this paper, we propose an automatic edge-oriented seed generation technique via image filtering.

• Temporal seed tracking: SRG is very attractive for content-based image database applications because it can involve the high-level knowledge of image objects in the segmentation procedure. However, automatic semantic image segmentation is an ill-defined problem because semantic objects do not usually correspond to homogeneous regions in color or texture (Deng and Manjunath, 2001). Automatic moving object extraction via seed tracking may be one reasonable solution to this problem (Grinias and Tziritas, 2001). In this paper, we propose a seed tracking technique for automatic moving object extraction.

This paper is organized as follows. A brief review of the SRG technique is given in Section 2. In Section 3, we propose three automatic SRG techniques (their major steps are shown in Fig. 1), together with a comparative study of the three. Section 4 describes a seed tracking technique for automatic moving object extraction. Section 5 introduces our techniques

[Fig. 1 flowchart: from the image input, three pipelines are shown. (i) Regular image partition, center-oriented seed generation, seeded region growing, then similarity-based region merging. (ii) Color edge detection, edge-oriented seed generation, then seeded region growing. (iii) Image filtering, color edge detection, edge-oriented seed generation, then seeded region growing. All three lead to the final segmentation result.]

Fig. 1. The major steps for the three automatic SRG techniques.

for semantic-sensitive salient object detection. We conclude in Section 6.

2. Seeded region growing: brief review

The seeded region growing approach to image segmentation segments an image into regions with respect to a set of q seeds (Adams and Bischof, 1994). Given the set of seeds, S_1, S_2, \ldots, S_q, each step of SRG adds one additional pixel to one of the seed sets. Moreover, these initial seeds are further replaced by the centroids of the generated homogeneous regions, R_1, R_2, \ldots, R_q, as additional pixels are involved step by step. The pixels in the same region are labeled by the same symbol and the pixels in different regions are labeled by different symbols. All these labeled pixels are called the allocated pixels, and the others are called the unallocated pixels. Let H be the set of all unallocated pixels which are adjacent to at least one of the labeled regions:

H = \{ (x, y) \notin \bigcup_{i=1}^{q} R_i \mid N(x, y) \cap \bigcup_{i=1}^{q} R_i \neq \emptyset \}   (1)

where N(x, y) is the second-order neighborhood of the pixel (x, y), as shown in Fig. 2.

For the unlabeled pixel (x, y) \in H, if N(x, y) meets just one of the labeled image regions R_i, we define u(x, y) \in \{1, 2, \ldots, q\} to be that index such that N(x, y) \cap R_{u(x,y)} \neq \emptyset. The difference \delta(x, y, R_i) between the testing pixel at (x, y) and its adjacent labeled region R_i is calculated as

\delta(x, y, R_i) = \| g(x, y) - g(X_i^c, Y_i^c) \|   (2)

where g(x, y) indicates the values of the three color components of the testing pixel (x, y), and g(X_i^c, Y_i^c) represents the average values of the three color components of the homogeneous region R_i, with (X_i^c, Y_i^c) the centroid of R_i.
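To make the review concrete, the basic growing loop of Eqs. (1) and (2) can be sketched as follows. This is a minimal single-channel sketch on hypothetical toy data, not the authors' implementation: a 2D list of intensities stands in for the three color components g(x, y), region means play the role of the centroid averages, and the most similar pixel in H is always allocated first.

```python
# Minimal single-channel SRG sketch; `image` and `seeds` are hypothetical.
import heapq

def srg(image, seeds):
    """image: 2D list of intensities; seeds: list of (x, y) tuples."""
    h, w = len(image), len(image[0])
    label = [[0] * w for _ in range(h)]           # 0 = unallocated
    stats = {}                                     # region -> [sum, count]
    heap = []                                      # (delta, x, y, region)

    def neighbors(x, y):                           # second-order (8-connected)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if (dx or dy) and 0 <= x + dx < h and 0 <= y + dy < w:
                    yield x + dx, y + dy

    def push_frontier(x, y, r):
        mean = stats[r][0] / stats[r][1]           # region average, cf. Eq. (2)
        for nx, ny in neighbors(x, y):
            if label[nx][ny] == 0:                 # pixel enters the set H of Eq. (1)
                heapq.heappush(heap, (abs(image[nx][ny] - mean), nx, ny, r))

    for r, (x, y) in enumerate(seeds, start=1):
        label[x][y] = r
        stats[r] = [image[x][y], 1]
        push_frontier(x, y, r)

    while heap:                                    # smallest delta allocated first
        _, x, y, r = heapq.heappop(heap)
        if label[x][y]:
            continue                               # already allocated elsewhere
        label[x][y] = r
        stats[r][0] += image[x][y]                 # update the region average
        stats[r][1] += 1
        push_frontier(x, y, r)
    return label
```

Note one simplification: each delta is evaluated against the region mean at push time, so the sketch only approximates the continuously updated centroids of the full algorithm.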

[Fig. 2 graphic: a 3 x 3 grid of the coordinates (x-1, y-1) through (x+1, y+1) around the central pixel (x, y).]

Fig. 2. The second-order neighborhood N(x, y) of the current testing pixel at (x, y).

If N(x, y) meets two or more of the labeled regions, u(x, y) takes the value of i such that N(x, y) meets R_i and \delta(x, y, R_i) is minimized:

u(x, y) = \min_{(x, y) \in H} \{ \delta(x, y, R_j) \mid j \in \{1, \ldots, q\} \}   (3)

This seeded region growing procedure is repeated until all pixels in the image have been allocated to the corresponding regions. The definitions of Eqs. (1) and (3) ensure that the final partition of the image is a set of regions as homogeneous as possible on the basis of the given constraints. The SRG algorithm is robust, rapid and free of tuning parameters, and it is also very attractive for semantic image segmentation. However, the SRG algorithm also suffers from the problems of pixel sorting and automatic seed selection.

3. Automatic seeded region growing

An advantage of SRG is that the high-level knowledge of semantic image components can be exploited by selecting suitable seeds for growing more meaningful regions. This property is very attractive for content-based image database applications (Fan et al., 2001). The natural questions which arise from a demonstration of SRG are: how to manage the pixel labeling procedure more efficiently? How to select the seeds automatically? How critical is the seed selection to a good segmentation? A poor starting estimate of the region seeds or bad pixel sorting orders may result in an incorrect segmentation of an image (Mehnert and Jackway, 1997; Fan et al., 2001a). The obvious way to improve the SRG technique is to provide a more effective pixel labeling technique and to automate the process of seed selection. In this section, we propose three automatic seed selection algorithms and a boundary-oriented pixel labeling technique, and we give a performance comparison of the three automatic SRG techniques on a relatively large variety of images.

3.1. SRG via regular seed generation

The images can first be partitioned into a set of rectangular regions with fixed size, as shown in Fig. 3.

Fig. 3. The space partition of an image and the regular seed selection scheme.

A simple automatic SRG algorithm can then be realized by selecting the centers of these rectangular regions as the seeds. However, the traditional sequential pixel sorting technique may meet the unconnected problem at the beginning of the seeded region growing procedure. As shown in Fig. 4, it is hard to allocate the unlabeled pixel (x, y) to

any of the selected seeds, because the pixel (x, y) is not adjacent to any of the seeds. It is more reasonable to start the pixel labeling procedure from the selected seeds and involve new pixels step by step.

Fig. 4. An example of an unlabeled pixel at the beginning of seeded region growing: the pixel (x, y) is not adjacent to any of the initial seeds.

Based on this observation, we propose a boundary-oriented pixel labeling technique to accelerate the SRG procedure, where the region growing is realized by dilating the region boundaries step by step. Moreover, this boundary-oriented pixel labeling technique can support parallel region growing according to the following steps:

• The pixels in the same region are labeled by the same symbol, and adjacent regions connect via their common boundaries. Each region is represented by two parameters: one is the centroid of the region, and the other is the set of its boundary pixels. These two region description parameters are updated as new pixels are involved step by step. At the beginning of the SRG procedure, the data sets for the centroid and the boundary pixels of a region are the same, i.e., the seed of the corresponding region.

• The SRG procedure starts from all the seeds at the same time. For a seeded region R_i with the set of boundary pixels B_{R_i} = \{(x_l, y_l) \mid l \in [1, \ldots, L]\}, we test the second-order neighboring pixels (x_l \pm 1, y_l \pm 1) of each boundary pixel (x_l, y_l). If the unlabeled pixel at (x_l \pm 1, y_l \pm 1) is similar to the adjacent boundary pixel (x_l, y_l) of the region R_i, it is merged into the region R_i and replaces the boundary pixel (x_l, y_l) as the new boundary pixel of the region R_i. This similarity testing and labeling procedure can be performed for all the boundary pixels of the same region at the same time.

• The color similarity distance D(x_l, y_l, R_i) between the unlabeled pixel (x_l \pm 1, y_l \pm 1) and the current testing boundary pixel (x_l, y_l) of the region R_i is calculated as

D(x_l, y_l, R_i) = |I(x_l, y_l) - I(x_l \pm 1, y_l \pm 1)| + |u(x_l, y_l) - u(x_l \pm 1, y_l \pm 1)| + |v(x_l, y_l) - v(x_l \pm 1, y_l \pm 1)|   (4)

where I(x_l, y_l), u(x_l, y_l), and v(x_l, y_l) indicate the values of the three color components of the testing boundary pixel (x_l, y_l), and I(x_l \pm 1, y_l \pm 1), u(x_l \pm 1, y_l \pm 1), and v(x_l \pm 1, y_l \pm 1) represent the values of the three color components of the unlabeled pixel (x_l \pm 1, y_l \pm 1) which is adjacent to the boundary pixel (x_l, y_l). If the unlabeled pixel (x_l \pm 1, y_l \pm 1) meets two or more of the labeled boundary pixels of the region R_i, it is merged into the region R_i and replaces the most similar boundary pixel (x_m, y_m) of the region R_i:

D(x_m, y_m, R_i) = \min_{(x_l, y_l) \in B_{R_i}} \{ D(x_l, y_l, R_i) \mid l \in \{1, \ldots, L\} \}   (5)

• If the unlabeled pixel meets two or more boundary pixels from adjacent regions, it is merged into the region R_j which has the smallest similarity distance and replaces the most similar boundary pixel as the new boundary pixel of the region R_j:

D(x_k, y_k, R_j) = \min_{i \in \{1, \ldots, q\}} \{ D(x_m, y_m, R_i) \mid (x_m, y_m) \in B_{R_i} \}   (6)

• The parallel SRG procedure for each boundary pixel stops when the boundary pixels of neighboring regions are connected or the color similarity distance is above a predefined threshold.
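Under simplifying assumptions, the parallel boundary-dilation steps above can be sketched as follows: a single intensity channel stands in for the (I, u, v) triple of Eq. (4), every region dilates its current boundary ring by one pixel per pass, and a pixel claimed by several regions goes to the one with the smallest distance, as in Eq. (6). Names, data and the frontier bookkeeping are hypothetical, not the authors' implementation.

```python
# Boundary-oriented parallel region growing sketch (single channel, toy data).
def grow_by_boundary(image, seeds, threshold):
    h, w = len(image), len(image[0])
    label = [[0] * w for _ in range(h)]
    frontier = {}                                  # region -> set of boundary pixels
    for r, (x, y) in enumerate(seeds, start=1):
        label[x][y] = r
        frontier[r] = {(x, y)}                     # seed == initial boundary set

    def neighbors(x, y):                           # second-order neighborhood
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if (dx or dy) and 0 <= x + dx < h and 0 <= y + dy < w:
                    yield x + dx, y + dy

    grew = True
    while grew:                                    # one dilation pass per iteration
        grew = False
        claims = {}                                # pixel -> (distance, region)
        for r, boundary in frontier.items():       # all regions dilate "in parallel"
            for (bx, by) in boundary:
                for nx, ny in neighbors(bx, by):
                    if label[nx][ny]:
                        continue
                    d = abs(image[nx][ny] - image[bx][by])   # Eq. (4), one channel
                    if d <= threshold:
                        best = claims.get((nx, ny))
                        if best is None or d < best[0]:      # smallest distance, Eq. (6)
                            claims[(nx, ny)] = (d, r)
        new_frontier = {r: set() for r in frontier}
        for (x, y), (_, r) in claims.items():
            label[x][y] = r                        # claimed pixel becomes a boundary pixel
            new_frontier[r].add((x, y))
            grew = True
        frontier = new_frontier                    # growth stops when no pixel is claimed
    return label
```

The loop terminates exactly under the stopping rule above: either neighboring regions meet (no unlabeled neighbors remain) or every candidate distance exceeds the threshold.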

Since the seeds are selected regularly, without involving the spatial distribution of the image components, small image regions, whose sizes are less than the initial rectangular box, may be lost if the centers of the corresponding rectangular regions do not fall inside these small image regions. On the other hand, oversegmentation of image regions with large size is also induced, because several seeds may be used for the same large region. To solve the first problem, new seeds are added automatically if the small regions cannot be merged into their connected seed regions. To solve the second (oversegmentation) problem, a post-procedure of similarity-based region merging is performed on the adjacent regions.

In order to accelerate this similarity-based region merging procedure, a region adjacency graph (RAG) is built to express the relations among the regions (Tremeau and Colantoni, 2000). Each link of the RAG has a binary state, on or off, which determines the membership of a node in a given processing neighborhood. The RAG for an image is stored in a table, and this table is updated along with the neighboring relations when regions are merged (see Fig. 5). The region merging procedure is performed on the adjacent regions according to their color similarity. The color information of each image region is characterized by its color histogram (Swain and Ballard, 1991). The color similarity distance between two neighboring regions R_i and R_j can be defined as

d(R_i, R_j) = \sum_{l=0}^{N} \sum_{m=0}^{N} a_{lm} \, |H_i(l) - H_j(m)| \cdot |H_i(m) - H_j(l)|   (7)

where a_{lm} is a symmetric matrix of weights between 0 and 1 representing the similarity between bins l and m, H_i(l) and H_j(l) denote the color histogram bins for the lth color component, and N is the maximum number of bins in the color histogram.

The RAG for the image regions is stored as a table in our experiments, and the color similarity distances of any two neighboring regions are calculated and stored in a distance table. If the color similarity distance d(R_i, R_j) is less than a predefined threshold, the adjacent regions R_i and R_j are merged. The color histogram of the new region is then calculated, and the similarity distance table and the RAG table are updated as shown in Fig. 5. This region merging procedure stops when all the color similarity distances in the table are above the predefined threshold.

[Fig. 5 graphic: region adjacency graphs and on/off relationship tables for regions A-F, before and after regions A and B are merged into A'.]
Fig. 5. The table of the region adjacency graph for region merging: (a) region adjacency graph before merging; (b) region adjacency graph after merging; (c) relationship table before region merging; (d) relationship table after region merging.
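The merging loop over the RAG can be sketched as below, assuming identity bin weights (a_{lm} = 1 for l = m and 0 otherwise) in the distance of Eq. (7), and a plain set of adjacent region pairs standing in for the on/off table of Fig. 5. The region labels and threshold are hypothetical toy values.

```python
# Similarity-based region merging over a region adjacency graph (sketch).
def histogram_distance(hi, hj, a):
    """Eq. (7): weighted cross-bin distance between two histograms."""
    n = len(hi)
    return sum(a[l][m] * abs(hi[l] - hj[m]) * abs(hi[m] - hj[l])
               for l in range(n) for m in range(n))

def merge_regions(histograms, adjacency, a, threshold):
    """histograms: region id -> list of bins; adjacency: set of frozenset pairs."""
    merged = True
    while merged:
        merged = False
        for pair in sorted(adjacency, key=sorted):     # deterministic scan order
            ri, rj = sorted(pair)
            if histogram_distance(histograms[ri], histograms[rj], a) < threshold:
                # Merge rj into ri: add the histograms bin-wise ...
                histograms[ri] = [x + y for x, y in
                                  zip(histograms[ri], histograms[rj])]
                del histograms[rj]
                # ... and rewire the adjacency table, as in Fig. 5.
                adjacency = {frozenset({ri if r == rj else r for r in p})
                             for p in adjacency if p != pair}
                adjacency = {p for p in adjacency if len(p) == 2}
                merged = True
                break                                   # rescan with updated tables
    return histograms, adjacency
```

The procedure stops, as in the text, once every remaining pairwise distance is at or above the threshold.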

For evaluating the real performance of this automatic SRG algorithm, we have tested a relatively large variety of color images. Fig. 6(a) shows an original image of ''Akiyo'', and Fig. 6(b) is the automatic SRG result via regular seed selection. One can find that the image is oversegmented, because a large homogeneous region may span several rectangular boxes so that several seeds are used for it. Fig. 6(c) shows the SRG result after the post-procedure of similarity-based region merging, where the similar adjacent regions shown in Fig. 6(b) are merged to form more meaningful image regions. The boundaries of the small regions (initial segmentation results) are also shown in Fig. 6(c) in a low grey level, which helps to show what kind of small regions are merged to form meaningful large regions.

3.2. SRG via edge-oriented seed generation

The simplified geometric structures of the image regions can be obtained from their color edges. An automatic edge-oriented seed generation technique has been proposed in (Fan et al., 2001a), where the initial seeds for SRG are obtained automatically from the centroids of the color edges. The centroid (X_{i,j}^c, Y_{i,j}^c) between two adjacent labeled edge regions E_i and E_j, defined as the algebraic average of the edge pixels in the corresponding regions, is calculated as

X_{i,j}^c = \frac{\sum_{(x,y) \in \{E_i, E_j\}} x}{\sum_{(x,y) \in \{E_i, E_j\}} \delta(x, y)}, \qquad Y_{i,j}^c = \frac{\sum_{(x,y) \in \{E_i, E_j\}} y}{\sum_{(x,y) \in \{E_i, E_j\}} \delta(x, y)}   (8)

where \delta(x, y) = 1 if and only if (x, y) \in \{E_i, E_j\}, and \delta(x, y) = 0 otherwise.

The boundary of the same homogeneous region may be partitioned into several adjacent edge regions because the obtained color edges are normally discontinuous; thus the centroids of several adjacent edge regions may be very close while their colors are also very similar. Therefore, these neighboring similar centroids are merged into one. The refined centroids are then taken as the initial seeds S_1, S_2, \ldots, S_q for seeded region growing, and these seeds are updated step by step as new points are involved. The color similarity distance between the current testing pixel at (x, y) and its connected region R_i can be calculated via Eq. (4).

Fig. 6. The performance comparison of the three automatic seeded region growing algorithms on ''Akiyo'': (a) original image; (b) region boundaries obtained by regular seed selection; (c) region boundaries after region merging, with the boundaries of the small regions also shown in low grey levels; (d) color edges; (e) region boundaries obtained by edge-oriented seed selection; (f) region boundaries obtained by the improved edge-oriented seed generation technique.
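The seed generation of Eq. (8), together with the fusion of near-coincident centroids described above, can be sketched as follows. The edge regions here are hypothetical lists of (x, y) pixels, and the fusion uses a pure distance test (the paper also compares the colors of the centroids, which this sketch omits).

```python
# Edge-centroid seed generation sketch for Eq. (8) on hypothetical edge data.
def edge_centroid(edge_i, edge_j):
    """Seed between two adjacent labeled edge regions: the mean edge-pixel
    position. Since delta(x, y) = 1 on every listed edge pixel, the
    denominator of Eq. (8) is just the pixel count."""
    pixels = list(edge_i) + list(edge_j)
    n = len(pixels)
    xc = sum(x for x, _ in pixels) / n
    yc = sum(y for _, y in pixels) / n
    return xc, yc

def merge_close_centroids(centroids, min_dist):
    """Fuse centroids closer than min_dist, since discontinuous edges around
    one homogeneous region yield several nearly coincident centroids."""
    merged = []
    for c in centroids:
        for k, m in enumerate(merged):
            if (c[0] - m[0]) ** 2 + (c[1] - m[1]) ** 2 < min_dist ** 2:
                merged[k] = ((c[0] + m[0]) / 2, (c[1] + m[1]) / 2)
                break
        else:
            merged.append(c)
    return merged
```

The surviving centroids would then serve as the initial seeds S_1, ..., S_q for the boundary-oriented growing of Section 3.1.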

After the region seeds are obtained, the boundary-oriented pixel labeling technique described in Section 3.1 is used for managing the similarity-based SRG procedure. The SRG procedure can start from all the seeds at the same time, and it stops when the boundary pixels of neighboring regions are connected or the color similarity distance is above a predefined threshold. Fig. 6(d) shows the detected image edges. One can find that the simplified geometric structures of the images are given by the color edges even though the edges are discontinuous. By using the centroids of these adjacent edges as the seeds, the SRG result in Fig. 6(e) is obtained. One can find that the edge-oriented SRG technique provides more meaningful segmentation results than the SRG technique via regular seed selection, because the distribution of the image components (provided by the color edges) is used for seed selection. Since the color edges are over-detected for texture images, however, the segmentation results obtained by the edge-oriented SRG technique are not much better than those obtained by the SRG technique via regular seed selection.

3.3. SRG via improved edge-oriented seed generation

The images may be corrupted by additive noise or they may be textured, so the edge detection algorithm may induce over-detections of the color edges. The edge-oriented seed generation technique, which takes the centroids of these connected color edges as the seeds, may then induce redundant seeds and thus result in oversegmentation of the images. There are three potential approaches to solving this oversegmentation problem: the first is to perform a post-procedure of region merging as described in Section 3.1; the second is to perform an image filtering procedure to reduce the noise and smooth the texture regions before the color edge detection is performed; the third is to perform an edge smoothing and thinning procedure as done in the traditional boundary-based segmentation techniques (Kass et al., 1987).

In our current work, we use the second approach, via image filtering. The window of the image filter can be defined as

N_{m \times n}(x, y) = \{ (x + i, y + j) \mid -[m/2] \le i \le [m/2], \; -[n/2] \le j \le [n/2] \}   (9)

where m, n are odd, [m/2] and [n/2] denote the largest integer not greater than the argument, and N_{m \times n}(x, y) denotes an m-row, n-column image region. The image filtering procedure (image smoothing) tries to find the best trade-off between noise reduction and the loss of image details. The general form of this processing is

\bar{I}(x, y) = \frac{1}{mn} \sum_{i=-[m/2]}^{[m/2]} \sum_{j=-[n/2]}^{[n/2]} a_{x+i, y+j} \, I(x + i, y + j)   (10)

where a_{x+i, y+j} is a weighting coefficient, I(x + i, y + j) indicates the grey level value of the pixel at (x + i, y + j), and \bar{I}(x, y) is the average grey level value of the pixel at (x, y) after the filtering procedure. The grey level value of the pixel at (x, y) is then replaced by its corresponding average grey level value \bar{I}(x, y). A small window size can be used to reduce white Gaussian noise for color edge detection, while a large window size is useful for detecting texture boundaries. Multiple scales can be used for solving the problem of texture image filtering (Bouman and Shapiro, 1994). In order to avoid the estimation of texture model parameters, Deng and Manjunath (2001) have proposed a color quantization technique that smooths the image colors into several representative classes; this color quantization technique can also be selected for image filtering by quantizing a set of image pixels to the same color. The same image filtering procedure is also performed on the other two chrominance components.

The color edge detection procedure is then performed on the smoothed images, and the centroids of the neighboring labeled edges are taken as the seeds for automatic SRG as described in Section 3.2. Fig. 6(f) shows the segmentation result obtained by this improved edge-oriented SRG technique. One can find that its performance on a simple image like ''Akiyo'' is very similar to that of the traditional edge-oriented SRG technique.
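The averaging filter of Eqs. (9) and (10) can be sketched as follows, assuming uniform weights a = 1. One deviation from Eq. (10) is made for the sketch and noted in the comments: near the image border the sum is divided by the number of window pixels actually inside the image rather than by mn.

```python
# m x n mean filter sketch for Eqs. (9)-(10), uniform weights, toy data.
def smooth(image, m, n):
    h, w = len(image), len(image[0])
    hm, hn = m // 2, n // 2                      # [m/2], [n/2] for odd m, n
    out = [[0.0] * w for _ in range(h)]
    for x in range(h):
        for y in range(w):
            total, count = 0.0, 0
            for i in range(-hm, hm + 1):         # the window N_{m x n}(x, y) of Eq. (9)
                for j in range(-hn, hn + 1):
                    if 0 <= x + i < h and 0 <= y + j < w:
                        total += image[x + i][y + j]
                        count += 1
            # Border assumption: divide by the in-image count, not mn as in Eq. (10).
            out[x][y] = total / count
    return out
```

As the text observes, a small window mainly suppresses noise before color edge detection, while a larger window (or the same filter applied at multiple scales) smooths texture toward its region average.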

We have also tested a relatively large variety of color images from the Corel database, and parts of the results are shown in Figs. 7 and 8. We also find that our improved edge-oriented SRG technique is very attractive for medical image segmentation; parts of these test results are given in Figs. 9 and 10.

4. Moving object extraction via seed tracking

During the last decade, many approaches to automatic moving object extraction have been proposed. The existing moving object extraction techniques can be classified into three categories:

• Temporal segmentation: Temporal segmentation uses only motion information deduced from consecutive frames. A classical approach first consists in estimating a dense motion field and then partitioning the scene based only on the obtained motion information, where adjacent video components are merged to form meaningful video objects if they obey the same Hough or affine transformation motion model (Adiv, 1985; Wang and Adelson, 1994). However, dense field motion vectors are not very reliable for noisy data (Deng and Manjunath, 2001). In order to avoid the noise of optical flow, some techniques first include a change detector (Fan et al., 2001b; Diehl, 1991). However, the change detector induces holes in the uniform regions.

• Spatiotemporal segmentation: To make the boundaries of moving objects correspond accurately to their spatial feature variation, spatiotemporal video segmentation techniques have been proposed for moving object extraction via Bayesian frameworks or Markov random fields (Bouthemy and Francois, 1993; Moscheni et al., 1998; Gu et al., 1996; Alatan et al., 1998). However, their major drawbacks are their computational complexity and the fact that the number of objects to be found has to be specified.

• Temporal tracking: Since spatial segmentation can provide the accurate boundaries of the video objects with respect to color or texture, it can be integrated with a temporal tracking procedure for moving object extraction (Meier and Ngan, 1998). Moreover, semantic information can be involved in the spatial segmentation procedure via human-computer interaction or an object seed detection procedure (Gu and Lee, 1998; Fan and Elmagarmid, 2001). This temporal tracking approach is very attractive for content-based video database applications because the object extraction can in this case be performed off-line.

In this section, we propose an automatic moving object extraction technique via seed tracking.

Fig. 7. The segmentation results obtained by the improved edge-oriented SRG technique: original image, color edges, region of object.

Fig. 8. The segmentation results obtained by the improved edge-oriented SRG technique: original image, color edges, region of object.

Fig. 9. The segmentation results obtained by the improved edge-oriented SRG technique: original image, color edges, region of object.

In order to detect the moving objects in a video sequence, two critical parameters should be used: the spatial information of the region boundaries of the moving objects, and the temporal relationship information of the moving objects. Therefore, the temporal intensity changes can be used for selecting the suitable seeds for generating the regions which correspond to moving objects. These seeds, which are selected for moving object detection, should be part of the potential seeds that are obtained by the spatial segmentation procedure described in Section 3. For this seed tracking technique to work, the camera motion should be small with respect to the object motion; otherwise, camera motion compensation should be performed before the temporal change detection procedure (Fan and Elmagarmid, 2001). In this paper, we focus on moving object extraction from

Fig. 10. The segmentation results obtained by the improved edge-oriented SRG technique: original image, color edges without
filtering, color edges with filtering, region of object.

image sequences with a relatively small camera motion; thus camera motion compensation is not performed in the work reported here. Our seed tracking technique takes the following steps for automatic moving object extraction:

• Spatial segmentation: The scene cut detection technique in (Fan et al., 2000) is first performed on a video sequence to obtain the video shots. The automatic SRG technique described in Section 3.3 is performed only on the first frame of each video shot to obtain the boundaries of the object regions. In order to track these object regions among the frames within a video shot, the automatic seed selection procedure described in Section 3.3 is performed on each frame to extract all the potential seeds for generating the regions of moving objects.

• Change detection: Given two successive frames F_{t-1} and F_t of a video sequence, the motion measure between two coordinate pixels can be calculated as

\mathrm{FCON}(x, y) = \frac{ |I(x, y, t-1) - I(x, y, t)|^2 \sum_{(i,j) \in N(x,y)} |I(x, y, t) - I(x+i, y+j, t)| }{ \sum_{(i,j) \in N(x,y)} |I(x, y, t) - I(x+i, y+j, t)|^2 + C }   (11)

where N(x, y) indicates the second-order neighborhood of the pixel at (x, y) as shown in Fig. 2, and the constant C = 1 is used to avoid numerical instabilities. To reduce the influence of a stationary textured background on the intensity changes among frames, the motion measure feature FCON is normalized by the spatial intensity deviation. The temporal relationships between the coordinate pixels can then be classified into two opposite classes according to their motion strength, changed pixel versus stationary pixel:

\mathrm{FCON}(x, y) \ge T: changed pixel; \quad \mathrm{FCON}(x, y) < T: stationary pixel   (12)

where the threshold T is determined automatically by using the 1D entropic thresholding technique introduced in (Fan et al., 2001a; Cheng et al., 2000). Fig. 11(a) shows a reference image of ''Akiyo'' and Fig. 11(b) is its spatial segmentation result obtained by the improved edge-oriented SRG technique.
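Under the stated assumptions, the change-detection step can be sketched as follows. The FCON expression below follows the reconstructed reading of Eq. (11) (the squared temporal difference scaled by the local spatial contrast of the current frame, with C = 1 guarding the denominator), and a fixed threshold T stands in for the 1D entropic thresholding that the paper uses; the frame data are hypothetical.

```python
# Change-detection sketch for Eqs. (11)-(12) on toy single-channel frames.
def change_mask(prev, curr, T, C=1.0):
    h, w = len(curr), len(curr[0])
    mask = [[False] * w for _ in range(h)]
    for x in range(h):
        for y in range(w):
            contrast = C                           # denominator of Eq. (11)
            num_scale = 0.0                        # spatial sum in the numerator
            for dx in (-1, 0, 1):                  # second-order neighborhood N(x, y)
                for dy in (-1, 0, 1):
                    if (dx or dy) and 0 <= x + dx < h and 0 <= y + dy < w:
                        d = abs(curr[x][y] - curr[x + dx][y + dy])
                        num_scale += d
                        contrast += d * d
            fcon = (prev[x][y] - curr[x][y]) ** 2 * num_scale / contrast
            mask[x][y] = fcon >= T                 # changed vs. stationary, Eq. (12)
    return mask
```

Because the spatial contrast appears in the denominator, strong frame differences over a stationary textured background are damped, which is the stated purpose of the normalization.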

Fig. 11. The temporal segmentation results for ‘‘Akiyo’’: (a) reference image; (b) regions for reference image; (c) original test image; (d)
spatial segmentation regions for test image; (e) change pixels between test image and its reference; (f) moving object regions as
compared with the spatial regions, the boundaries for the temporal unchanged regions are also shown in low grey levels.

edge-oriented SRG technique. Fig. 11(c) shows a test image of ‘‘Akiyo’’ and Fig. 11(d) is its spatial segmentation result obtained by our improved SRG technique. By performing the temporal change detection, Fig. 11(e) gives the temporal change pixels between the reference frame shown in Fig. 11(a) and the current test frame shown in Fig. 11(c). We also provide the spatial segmentation results to show how many regions are changed among frames.

• Seed tracking: The improved edge-oriented seed generation technique (without the SRG procedure) described in Section 3.3 is then performed on the current frame to obtain all the potential seeds. We are interested only in the moving objects (or the regions of moving objects), and the moving objects should be located inside the temporally changed regions. The temporal change detection information is then used to select the suitable seeds from all the potential seeds. Only the seeds located in the temporally changed regions are selected for generating the moving regions of interest. If two connected regions are similar in color or texture, they are merged and considered to be the same moving object. Some moving regions, which are obtained by the color-based region growing according to the selected seeds, may extend beyond the temporal change mask. These regions are taken as uncovered background, because the moving objects should be fully located inside the temporal change mask. New objects, which are fully located inside the temporal change mask like the moving objects, can also be detected by this seed tracking procedure. Fig. 11(f) shows the moving objects obtained by this seed tracking technique. We also visualize the spatial segmentation results in low grey levels to show which regions are changed among frames. One can find that these moving objects (eyes) are not the semantic objects that video database users are concerned with.

The test results for ‘‘Salesman’’ and ‘‘Miss American’’ are given in Figs. 12 and 13. Since camera motion is included in these sequences, the temporal change regions cover some redundant seeds (seeds for the moving background induced by camera motion). Camera motion compensation may first be performed to remove the effect of camera motion on the temporal changes.
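The change detection step can be sketched in code. This is a minimal NumPy illustration, not the paper's implementation: the exponents follow our reading of Eq. (11), and a simple 1D entropy-maximizing (Kapur-style) threshold stands in for the entropic thresholding of (Fan et al., 2001a; Cheng et al., 2000); all function names are ours.

```python
import numpy as np

def fcon_map(prev, curr, C=1.0):
    """Motion measure of Eq. (11): squared temporal change, normalized
    by the spatial intensity deviation over the 8-neighborhood N(x, y)."""
    prev = prev.astype(float)
    curr = curr.astype(float)
    h, w = curr.shape
    pad = np.pad(curr, 1, mode="edge")
    s1 = np.zeros((h, w))   # sum of |spatial differences| over N(x, y)
    s2 = np.zeros((h, w))   # sum of squared spatial differences over N(x, y)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue
            d = np.abs(curr - pad[1 + di:1 + di + h, 1 + dj:1 + dj + w])
            s1 += d
            s2 += d ** 2
    return (prev - curr) ** 2 * s1 / (s2 + C)

def entropic_threshold(values, bins=256):
    """1D entropy-maximizing threshold (Kapur-style) used as T in Eq. (12)."""
    hist, edges = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    cum = np.cumsum(p)
    best_t, best_h = 1, -np.inf
    for t in range(1, bins):
        P1, P2 = cum[t - 1], 1.0 - cum[t - 1]
        if P1 <= 0 or P2 <= 0:
            continue
        p1, p2 = p[:t] / P1, p[t:] / P2
        h = -(p1[p1 > 0] * np.log(p1[p1 > 0])).sum() \
            - (p2[p2 > 0] * np.log(p2[p2 > 0])).sum()
        if h > best_h:
            best_h, best_t = h, t
    return edges[best_t]

def change_mask(prev, curr):
    """Changed-pixel mask of Eq. (12): FCON(x, y) >= T."""
    f = fcon_map(prev, curr)
    return f >= entropic_threshold(f.ravel())
```

A pixel with a large normalized temporal change is classified as changed; everything else is stationary.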

Fig. 12. The temporal segmentation results for ‘‘Salesman’’: (a) reference image; (b) test image 1; (c) changed pixels between test 1 and the reference; (d) object regions for test 1, with the boundaries of the temporally unchanged regions also shown in low grey levels; (e) test image 2; (f) changed pixels for test image 2; (g) object regions and uncovered background for test image 2, with the boundaries of the temporally unchanged regions also shown in low grey levels.

Fig. 13. The temporal segmentation results for ‘‘Miss American’’: (a) reference image; (b) test image 1; (c) changed pixels between test 1 and the reference; (d) object regions for test 1, with the boundaries of the temporally unchanged regions also shown in low grey levels; (e) test image 2; (f) changed pixels for test image 2; (g) object regions and uncovered background for test image 2, with the boundaries of the temporally unchanged regions also shown in low grey levels.
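The seed tracking rule above (keep only the candidate seeds that fall inside the temporal change mask, then decide whether each grown region is a moving object or uncovered background) can be sketched as follows; the 0.95 coverage tolerance and the function names are our assumptions, not the paper's.

```python
import numpy as np

def select_tracked_seeds(seeds, change_mask):
    """Keep only the candidate seeds lying inside the temporal change
    mask; seeds are (row, col) pairs from the edge-oriented seed
    generation step."""
    return [(r, c) for (r, c) in seeds if change_mask[r, c]]

def classify_region(region_mask, change_mask, coverage=0.95):
    """A region grown from a tracked seed counts as a moving object
    (or a new object) when it lies almost fully inside the temporal
    change mask; otherwise it is treated as uncovered background."""
    inside = np.logical_and(region_mask, change_mask).sum()
    if inside >= coverage * region_mask.sum():
        return "moving object"
    return "uncovered background"
```

Regions that pass the coverage test would then be merged with connected regions of similar color or texture to form whole moving objects.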

5. Salient object detection for image indexing

The salient objects are defined as the visually distinguishable image compounds. For example, the salient object ‘‘sky’’ is defined as the connected image regions with large sizes (i.e., dominant image regions) that are related to the human semantics ‘‘sky’’.

We have already implemented 32 functions to detect 32 types of salient objects in natural scenes, and each function is able to detect a certain type of these salient objects in the basic vocabulary. Each detection function consists of three parts: (a) automatic image segmentation by using the techniques introduced in Section 3; (b) image region classification by using the SVM classifiers with an optimal

model parameter search scheme; and (c) label-based region aggregation for automatic salient object generation.

We use our detection function for the salient object ‘‘sand field’’ as an example to show how we design our detection functions. As shown in Fig. 14, image regions with homogeneous color or texture are first obtained by using the seeded region growing technique of Section 3. Since the visual properties of a certain type of salient object may look different under different lighting and capturing conditions, using only one image is insufficient to represent its visual characteristics. Thus this automatic image segmentation procedure is performed on a set of training images which contain the salient object ‘‘sand field’’.

The homogeneous regions in the training images that are related to the salient object ‘‘sand field’’ are selected and labeled as the training samples by human interaction. Region-based low-level visual features, such as a 1-dimensional coverage ratio (i.e., density ratio) for a coarse shape representation, 6-dimensional region locations (i.e., 2 dimensions for the region center and 4 dimensions to indicate the rectangular box for a coarse shape representation), 7-dimensional LUV dominant colors and color variances, 14-dimensional Tamura texture, and 28-dimensional wavelet texture features, are extracted for characterizing the visual properties of these labeled image regions that are explicitly related to the salient object ‘‘sand field’’. The 6-dimensional region locations are used to determine the spatial contexts among different types of salient objects, to avoid the wrong detection of visually similar salient objects such as ‘‘beach sand’’ and ‘‘road sand’’.

We use the one-against-all rule to label the training samples {(X_l, L_j(X_l)) | l = 1, ..., N}: positive samples for the specific salient object ‘‘sand field’’ and negative samples. Each labeled training sample is a pair (X_l, L_j(X_l)) that consists of a set of region-based low-level visual features X_l and the semantic label L_j(X_l) for the corresponding labeled homogeneous image region.

The image region classifier is learned from these available labeled training samples. We use the well-known SVM classifiers for binary image region classification (Vapnik, 1999). Consider a binary classification problem with a linearly separable sample set {(X_l, L_j(X_l)) | l = 1, ..., N}, where the semantic label L_j(X_l) for the labeled homogeneous image region with the visual feature X_l is either +1 or -1. For the positive samples X_l with L_j(X_l) = +1, there exist transformation parameters K and b such that K · X_l + b > +1. Similarly, for the negative samples X_l with L_j(X_l) = -1, we have K · X_l + b < -1. The margin between these two supporting planes is 2/‖K‖₂. The SVM classifier is then designed to maximize the margin under the constraints K · X_l + b > +1 for the positive samples and K · X_l + b < -1 for the negative samples.

Given this training set, the margin maximization procedure is transformed into the following optimization problem:

  \arg\min_{K, b, \xi} \; \frac{1}{2} K^{T} K + C \sum_{l=1}^{N} \xi_l, \quad \text{subject to} \quad L_j (K \cdot \Phi(X_l) + b) \geq 1 - \xi_l        (13)

where \xi_l \geq 0 represents the training error, C > 0 is the penalty parameter that balances the training error against the regularization term K^{T} K / 2, \Phi(X_l) is the function that maps X_l into a higher-dimensional space (i.e., the feature dimensions plus the dimension of the response), and the kernel function is defined as \kappa(X_i, X_j) = \Phi(X_i)^{T} \Phi(X_j). In our current

Fig. 14. The flowchart for automatic salient object detection.



implementation, we select the radial basis function (RBF) kernel, \kappa(X_i, X_j) = \exp(-\gamma \|X_i - X_j\|^{2}), \gamma > 0.

We have developed an efficient search algorithm to determine the optimal model parameters (C, \gamma) for the SVM classifiers: (a) The labeled image regions are partitioned into m subsets of equal size, where m - 1 subsets are used for classifier training and the remaining one is used for classifier validation. (b) Our feature set for image region representation is first normalized, to avoid features with greater numeric ranges dominating those with smaller numeric ranges. Because the inner product is usually used to calculate the kernel values, this normalization procedure also avoids numerical problems. (c) The numeric ranges of the parameters C and \gamma are exponentially partitioned into small pieces, yielding M pairs. For each pair, m - 1 subsets are used to train the classifier model. When the M classifier models are available, cross-validation is then used to determine the underlying optimal parameter pair (C, \gamma). (d) Given the optimal parameter pair (C, \gamma), the final classifier model (i.e., the support vectors) is trained again by using the whole training data set. (e) The spatial contexts among different types of salient objects (i.e., the coherence among different types of salient objects) are also used to cope with the wrong-detection problem for visually similar salient objects.

Some results for our detection functions are shown in Figs. 15 and 16. From these experimental results, one can find that the salient objects are more representative than the homogeneous image regions, and that the major visual properties of the dominant image components are maintained by using the salient objects for image content representation. Thus using the salient objects for feature extraction can enhance the quality of the features and result in more

Fig. 15. The detection results of the salient object ‘‘water’’.
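The learning and parameter-search procedure of steps (a)-(d) above can be sketched as follows. This dependency-free sketch is not the paper's implementation: it trains the primal of Eq. (13) by stochastic subgradient descent with a linear kernel in place of the RBF kernel, and it searches an exponential grid over C only; the function names, learning rate, and epoch counts are our assumptions.

```python
import numpy as np

def normalize(X):
    """Step (b): scale each feature to [0, 1] so that features with
    large numeric ranges do not dominate the kernel values."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / (hi - lo + 1e-12)

def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=300):
    """Primal of Eq. (13) by stochastic subgradient descent with a
    linear kernel (Phi = identity): minimize (1/2)||K||^2 plus
    C * sum_l max(0, 1 - y_l (K.x_l + b))."""
    K, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xl, yl in zip(X, y):
            if yl * (xl @ K + b) < 1.0:     # hinge constraint violated
                K += lr * (C * yl * xl - K)
                b += lr * C * yl
            else:                           # only the regularizer acts
                K -= lr * K
    return K, b

def cross_validate_C(X, y, grid, m=3):
    """Steps (a), (c), (d): split into m equal folds, train on m - 1
    folds, validate on the held-out one, and keep the best C from an
    exponentially partitioned grid."""
    folds = np.array_split(np.arange(len(X)), m)
    best_C, best_acc = grid[0], -1.0
    for C in grid:
        accs = []
        for i in range(m):
            val = folds[i]
            trn = np.hstack([folds[j] for j in range(m) if j != i])
            K, b = train_linear_svm(X[trn], y[trn], C=C)
            accs.append(np.mean(np.sign(X[val] @ K + b) == y[val]))
        if np.mean(accs) > best_acc:
            best_acc, best_C = np.mean(accs), C
    return best_C
```

After the best parameter pair is found, the final model would be retrained on the whole training set, as in step (d).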



Fig. 16. The detection results of the salient object ‘‘sand field’’.

effective semantic image classification. In addition, the salient objects are visually distinguishable, and they are also semantically meaningful to human beings. Thus the keywords interpreting the salient objects can also be used to obtain annotations of the images at the content level. The average performance of some detection functions is given in Table 1. It is worth noting that the procedure for salient object detection is automatic; human interaction is involved only in labeling the training samples (i.e., homogeneous image regions) for learning the detection functions.

6. Conclusion

Automatic image segmentation has become the key point for realizing content-based image description and retrieval, and object-based image compression. The SRG algorithm, which is robust, rapid, and free of tuning parameters, is very attractive for semantic image segmentation. However, the traditional SRG algorithm suffers from the problems of pixel sorting orders for labeling and of automatic seed selection. An automatic SRG algorithm, along with a more effective pixel labeling technique and an automatic seed selection method, is presented in this paper. A seed tracking algorithm is also proposed for automatic moving object extraction, where the seeds located inside the temporal change mask are selected for generating the regions of moving objects. Our seed tracking technique is also attractive for detecting the uncovered background and new objects, which are likewise located inside the temporal change mask like the moving objects. Our future research will focus on handling the limitations of the algorithm, such as improving its performance for tex-

Table 1
The average performance of some detection functions (precision vs. recall)

Salient object     Precision   Recall
Brown horse        95.6%       100%
Grass              92.9%       94.8%
Purple flower      96.1%       95.2%
Red flower         87.8%       86.4%
Rock               98.7%       100%
Sand field         98.8%       96.6%
Water              86.7%       89.5%
Human skin         86.2%       85.4%
Sky                87.6%       94.5%
Snow               86.7%       87.5%
Sunset/sunrise     92.5%       95.2%
Waterfall          88.5%       87.1%
Yellow flower      87.4%       89.3%
Forest             85.4%       84.8%
Sail cloth         96.3%       94.9%
Elephant           85.3%       88.7%
Cat                90.5%       87.5%
Zebra              87.2%       85.4%
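For reference, the precision and recall figures reported in Table 1 follow the standard definitions over true positives (tp), false positives (fp), and false negatives (fn); this small helper is ours, not from the paper.

```python
def precision_recall(tp, fp, fn):
    """precision = tp / (tp + fp); recall = tp / (tp + fn)."""
    return tp / (tp + fp), tp / (tp + fn)
```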

ture images and video sequences with a relatively large camera motion.

References

Adams, R., Bischof, L., 1994. Seeded region growing. IEEE Trans. Pattern Anal. Machine Intell. 16, 641–647.
Adiv, G., 1985. Determining three-dimensional motion and structure from optical flow generated by several moving objects. IEEE Trans. Pattern Anal. Machine Intell. 7, 384–401.
Alatan, A.A., Onural, L., Wollborn, M., Mech, R., Tuncel, E., Sikora, T., 1998. Image sequence analysis for emerging interactive multimedia services—The European COST 211 framework. IEEE Trans. Circ. Syst. Vid. Technol. 8, 802–813.
Bouman, C.A., Shapiro, M., 1994. A multiscale random field model for Bayesian image segmentation. IEEE Trans. Image Process. 3, 162–177.
Bouthemy, P., Francois, E., 1993. Motion segmentation and qualitative dynamic scene analysis from an image sequence. Internat. J. Comput. Vision 10, 157–182.
Chang, Y.-L., Li, X., 1994. Adaptive image region-growing. IEEE Trans. Image Process. 3, 868–872.
Cheng, H.D., Chen, Y.H., Jiang, X.H., 2000. Thresholding using two-dimensional histogram and fuzzy entropy principle. IEEE Trans. Image Process. 9, 732–735.
Chu, C., Aggarwal, J.K., 1993. The integration of image segmentation maps using region and edge information. IEEE Trans. Pattern Anal. Machine Intell. 15, 1241–1252.
Deng, Y., Manjunath, B.S., 2001. Unsupervised segmentation of color–texture regions in image and video. IEEE Trans. Pattern Anal. Machine Intell. 23 (8), 800–810.
Diehl, N., 1991. Object-oriented motion estimation and segmentation in image sequence. Signal Process.: Image Comm. 3, 23–56.
Fan, J., Elmagarmid, A.K., 2001. An automatic algorithm for semantic object generation and temporal tracking. Signal Process.: Image Comm. 16.
Fan, J., Yau, D.K.Y., Aref, W.G., Rezgui, A., 2000. Adaptive motion-compensated video coding scheme toward content-based bit rate allocation. J. Electronic Imaging 9, 521–533.
Fan, J., Yau, D.K.Y., Elmagarmid, A.K., Aref, W.G., 2001a. Automatic image segmentation by integrating color-based extraction and seeded region growing. IEEE Trans. Image Process. 10 (10), 1454–1466.
Fan, J., Yu, J., Fujita, G., Onoye, T., Wu, L., Shirakawa, I., 2001b. Spatiotemporal segmentation for compact video representation. Signal Process.: Image Comm. 16, 553–566.

Grinias, I., Tziritas, G., 2001. A semi-automatic seeded region growing algorithm for video object localization and tracking. Signal Process.: Image Comm. 16, 977–986.
Gu, C., Lee, M.C., 1998. Semantic segmentation and tracking of semantic video objects. IEEE Trans. Circ. Syst. Vid. Technol. 8, 572–584.
Gu, H., Shirai, Y., Asada, M., 1996. MDL-based segmentation and motion modeling in a long image sequence of scene with multiple independently moving objects. IEEE Trans. Pattern Anal. Machine Intell. 18, 58–64.
Haddon, J., Boyce, J., 1990. Image segmentation by unifying region and boundary information. IEEE Trans. Pattern Anal. Machine Intell. 12, 929–948.
Haralick, R.M., Shapiro, L.G., 1985. Survey: Image segmentation techniques. Comput. Vision Graphics Image Process. 29, 100–132.
Haris, K., Efstratiadis, S.N., Maglaveras, N., Katsaggelos, A.K., 1998. Hybrid image segmentation using watersheds and fast region merging. IEEE Trans. Image Process. 7 (12), 1684–1699.
Hojjatoleslami, S.A., Kittler, J., 1998. Region growing: A new approach. IEEE Trans. Image Process. 7, 1079–1084.
Kass, M., Witkin, A., Terzopoulos, D., 1987. Snakes: Active contour models. In: Proc. 1st Internat. Conf. on Computer Vision, pp. 259–267.
Kunt, M., Benard, M., Leonardi, R., 1987. Recent results in high-compression image coding. IEEE Trans. Circ. Syst. 34, 1306–1336.
Lim, Y.W., Lee, S.U., 1990. On the color image segmentation algorithm based on the thresholding and the fuzzy C-means technique. Pattern Recognition 23 (9), 935–952.
Manjunath, B.S., Huang, T.S., Tekalp, A.M., Zhang, H.J., 2000. Special issue on image and video processing for digital libraries. IEEE Trans. Image Process. 9 (1).
Meier, T., Ngan, K.N., 1998. Automatic segmentation of moving objects for video object plane generation. IEEE Trans. Circ. Syst. Vid. Technol. 8, 525–538.
Mehnert, A., Jackway, P., 1997. An improved seeded region growing algorithm. Pattern Recognition Lett. 18, 1065–1071.
Moscheni, F., Bhattacharjee, S., Kunt, M., 1998. Spatiotemporal segmentation based on region merging. IEEE Trans. Pattern Anal. Machine Intell. 20, 897–914.
Pal, N., Pal, S., 1993. A review on image segmentation techniques. Pattern Recognition 26, 1277–1294.
Palmer, P.L., Dabis, H., Kittler, J., 1996. A performance measure for boundary detection algorithms. Computer Vision and Image Understanding 63, 476–494.
Pappas, T.N., 1992. An adaptive clustering algorithm for image segmentation. IEEE Trans. Signal Process. 40, 901–914.
Pavlidis, T., Liow, Y.T., 1990. Integrating region growing and edge detection. IEEE Trans. Pattern Anal. Machine Intell. 12, 225–233.
Sahoo, P.K., Soltani, S., Wong, A.K.C., 1988. A survey of thresholding techniques. Comput. Vision Graphics Image Process. 41, 233–260.
Shen, X., Spann, M., Nacken, P.F.M., 1998. Segmentation of 2D and 3D images through hierarchical clustering. Pattern Recognition 31, 1295–1320.
Swain, M., Ballard, D., 1991. Color indexing. Internat. J. Comput. Vision 7 (1).
Tremeau, A., Colantoni, P., 2000. Region adjacent graph applied to color image segmentation. IEEE Trans. Image Process. 9, 735–744.
Vapnik, V., 1999. The Nature of Statistical Learning Theory. Springer.
Wang, J., Adelson, E., 1994. Representing moving images with layers. IEEE Trans. Image Process. 3, 625–638.
