Superpixel Segmentation Using Linear Spectral Clustering
used to optimize the segmentation cost function defined by normalized cuts. Figure 1 shows some superpixel segmentation results of LSC. We will demonstrate the efficiency and effectiveness of LSC through further experiments.

The rest of this paper is organized as follows. In Section 2, we briefly review existing approaches for superpixel segmentation. Section 3 presents the proposed LSC method. Experimental results are demonstrated in Section 4. The last section concludes our work.
2. Related Works
In early studies, algorithms designed for image segmentation were directly used for generating superpixels, such as FH [8], mean shift [5] and quick shift [21]. In FH, each superpixel is represented by a minimum spanning tree, and two superpixels are merged if the maximum weight of edges inside the trees is larger than the minimum weight of edges connecting them. Mean shift and quick shift are two mode-seeking methods that attempt to maximize a density function by shifting pixels towards areas of higher density; pixels converging to the same mode form a superpixel. These algorithms offer no explicit control over the size and number of the superpixels, and compactness is not considered. Superpixels thus produced are usually of irregular sizes and shapes and tend to overlap with multiple objects.

Another widely known algorithm adopts the normalized cuts formulation [18]. However, the traditional eigen-based solution is of high computational complexity, which grows further as the number of eigenvectors to be computed increases. For superpixel segmentation, the number of eigenvectors equals the expected number of superpixels, which is usually much larger than the number of segments in traditional image segmentation. Therefore, to facilitate normalized cuts based superpixel segmentation, Ren and Malik proposed a two-step algorithm (Ncuts) [17], in which pixels are first grouped into large regions by eigen-based normalized cuts, and direct K-means clustering is then adopted to further partition these regions into small superpixels. Due to its heuristic nature, Ncuts is less effective compared to other methods when the number of superpixels grows.

Previous research shows that algorithms which do not consider spatial compactness usually lead to under-segmentation, especially when there is poor contrast or shadow [11]. Among the four algorithms mentioned above, Ncuts [17] is the only one that implicitly takes compactness into consideration. However, its high computational complexity has limited its applicability. To solve this problem, several other approaches have been proposed to generate compact and regular superpixels with relatively low computational complexity. The Turbopixel algorithm [11] generates highly uniform lattice-like superpixels by iteratively dilating regularly distributed seeds. However, due to the stability and efficiency issues of the level-set method, superpixels thus generated present relatively low adherence to boundaries, and the algorithm is slow in practice. Veksler et al. formulated superpixel segmentation as an energy optimization problem which was then solved using the min-cut/max-flow algorithm [4][3][10]. The authors further extended this algorithm into two variations (EneOpt0 and EneOpt1) by balancing between shape regularity and boundary adherence differently [20]. Moore et al. proposed an algorithm (Lattice) that preserves the topology of a regular lattice in superpixel segmentation [15][14]. Nevertheless, the quality of the superpixels relies on a pre-calculated boundary probability map. Liu et al. presented in [12] a clustering objective function that consists of the entropy rate (ERS) of a random walk and a balancing term which encourages the generation of superpixels with similar sizes. ERS is able to preserve jagged object boundaries which are likely to be smoothed by other algorithms. However, the irregular shape of ERS superpixels may become a potential drawback in feature extraction [1]. Bergh et al. proposed SEEDS in [2] by introducing an energy function that encourages color homogeneity and shape regularity; a hill-climbing algorithm is used for optimization. However, SEEDS also suffers from high shape irregularity, and the superpixel number is difficult to control. Achanta et al. proposed a linear clustering based algorithm (SLIC) which produces superpixels by iteratively applying simple K-means clustering in the combined five dimensional color and coordinate space [1]. In spite of its simplicity, SLIC has been proved to be effective in various computer vision applications [22]. Nevertheless, as a local feature based algorithm, the relationship between SLIC and global image properties is not clear.

Another work closely related to our proposed method was introduced in [7], in which Dhillon et al. proved that K-way normalized cuts in the original pixel space is identical to weighted K-means clustering in a high dimensional feature space, by rewriting weighted K-means clustering as a trace maximization problem. However, in [7], the high dimensional feature space is not explicitly defined, and the kernel trick has to be used. The generated kernel matrix can be very large in practice: a moderate size image with N = 10^5 pixels will produce a 30GB kernel matrix in case it is dense, leading to serious deterioration in both time and space complexity. Moreover, this kernel matrix has to be positive definite to guarantee the convergence of iterative weighted K-means. These problems have limited the application of this algorithm in spite of its solid theoretical foundation. We will reveal that these problems can be efficiently solved by investigating the relationship between the inner product in the feature space and the similarity between image pixels. Superpixel segmentation results of different algorithms are compared in Figure 2.

Figure 2. Comparison of different superpixel segmentation algorithms: (a) SEEDS, (b) Lattice, (c) Turbopixel, (d) EneOpt0, (e) EneOpt1, (f) quick shift, (g) Ncuts, (h) SLIC, (i) ERS and (j) LSC. The image [13] is segmented into 400/200 superpixels.

3. Linear Spectral Clustering Superpixel

In this section, we present the LSC superpixel segmentation algorithm, which not only produces superpixels with state-of-the-art boundary adherence but also captures global image properties. The LSC algorithm is based on an investigation of the relationship between the objective functions of normalized cuts and weighted K-means. We find that optimizing these two objective functions is equivalent if the similarity between two points in the input space equals the weighted inner product between the two corresponding vectors in an elaborately designed high dimensional feature space. As such, simple weighted K-means clustering in this feature space can replace the highly complex eigen-based method for minimizing the normalized cuts objective function. Compared to weighted kernel K-means clustering [7], LSC avoids the calculation of the large kernel matrix, and the convergence condition is naturally satisfied. By further limiting the search space of the weighted K-means, LSC achieves linear complexity while retaining the high quality of the generated superpixels.

To facilitate the deduction, we briefly revisit the problem definitions of weighted K-means clustering and normalized cuts. For clarity, we use lowercase letters, such as p, q, to represent the data points (pixels in our case) to be clustered in the input space. In weighted K-means clustering, each data point p is assigned a weight w(p). Let K be the number of clusters; \pi_k be the k-th (k = 1, 2, ..., K) cluster; and \phi denote the function that maps data points to a higher dimensional feature space for improving linear separability. The objective function of weighted K-means is defined in (1), in which m_k is the center of \pi_k as defined in (2). F_{k-m} can be efficiently minimized in an iterative manner.

    F_{k\text{-}m} = \sum_{k=1}^{K} \sum_{p \in \pi_k} w(p) \|\phi(p) - m_k\|^2    (1)

    m_k = \frac{\sum_{q \in \pi_k} w(q)\phi(q)}{\sum_{q \in \pi_k} w(q)}    (2)

In normalized cuts, each data point corresponds to a node in a graph G = (V, E, W), in which V is the set of all nodes, E is the set of edges, and W is a function characterizing similarity among data points. The K-way normalized cuts criterion is to maximize the objective function F_{Ncuts} defined in (3), in which W(p, q) stands for the similarity between two points p and q. Several solutions for this optimization problem have been proposed in [18][23][16]. All of them are based on the eigenvalue decomposition of the large affinity matrix and are therefore intrinsically computationally complex.

    F_{Ncuts} = \frac{1}{K} \sum_{k=1}^{K} \frac{\sum_{p \in \pi_k} \sum_{q \in \pi_k} W(p, q)}{\sum_{p \in \pi_k} \sum_{q \in V} W(p, q)}    (3)

By introducing a kernel matrix for mapping data points into a higher dimensional feature space, Dhillon et al. showed the connection between weighted K-means clustering and normalized cuts by rewriting the optimization of both F_{k-m} and F_{Ncuts} as the same matrix trace maximization problem [7]. Under such a formulation, the convergence of the iterative minimization of F_{k-m} can be guaranteed only when the kernel matrix is positive definite. However, this cannot always be ensured. To solve this problem and further reveal the relationship between F_{k-m} and F_{Ncuts}, we present the following corollary. Equations (4) and (5) can also be deduced from the results in [7].

Corollary 1. Optimization of the objective functions of weighted K-means and normalized cuts is mathematically equivalent if both (4) and (5) hold.

    \forall p, q \in V: \quad w(p)\phi(p) \cdot w(q)\phi(q) = W(p, q)    (4)

    \forall p \in V: \quad w(p) = \sum_{q \in V} W(p, q)    (5)

Equation (4) indicates that the weighted inner product of two vectors in the high dimensional feature space equals the similarity between the two corresponding points in the input space; (5) indicates that the weight of each point in weighted K-means clustering equals the total weight of the edges connecting the corresponding node to all other nodes in normalized cuts. To prove Corollary 1, we first rewrite F_{k-m} as (6), in which C = \sum_{k=1}^{K} \sum_{p \in \pi_k} w(p)\|\phi(p)\|^2 is a constant independent of the clustering result. A detailed derivation of (6) can be found in the supplementary material. Combining (4), (5) and (6), we have (7), from which it can be easily observed that minimizing F_{k-m} is strictly equivalent to maximizing F_{Ncuts}. In other words, by carefully constructing the high dimensional feature space defined by \phi, the partitioning result of normalized cuts should be identical to that of weighted K-means clustering at their optimum points. This conclusion serves as the foundation of our LSC algorithm.
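The identity underlying Corollary 1 can be checked numerically: whenever the similarity has the inner-product form of (4) and the weights are chosen per (5), the weighted K-means objective (1) differs from -K times the normalized cuts objective (3) only by the constant C of (6), for every partition, not just the optimal one. A minimal NumPy check (all variable names are illustrative, not from the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, K = 12, 4, 3

# Condition (4): W(p,q) = u(p).u(q) with u(p) = w(p)*phi(p);
# condition (5): w(p) = sum_q W(p,q).
U = rng.uniform(0.1, 1.0, (N, d))     # rows are the vectors u(p)
W = U @ U.T                           # similarity matrix, entries > 0
w = W.sum(axis=1)                     # point weights from (5)
phi = U / w[:, None]                  # phi(p) = u(p) / w(p)

labels = rng.permutation(np.arange(N) % K)   # an arbitrary K-way partition

F_km = 0.0                            # weighted K-means objective (1)
for k in range(K):
    idx = labels == k
    m_k = (w[idx, None] * phi[idx]).sum(axis=0) / w[idx].sum()   # center (2)
    F_km += (w[idx] * ((phi[idx] - m_k) ** 2).sum(axis=1)).sum()

F_ncuts = sum(W[np.ix_(labels == k, labels == k)].sum() / W[labels == k].sum()
              for k in range(K)) / K  # normalized cuts objective (3)

C = (w * (phi ** 2).sum(axis=1)).sum()        # the constant in (6)
assert np.isclose(F_km, C - K * F_ncuts)      # the identity (7)
```

Since the identity holds for every partition, a clustering step that decreases (1) necessarily increases (3), which is exactly why the eigen-decomposition can be replaced by iterative K-means.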
    F_{k\text{-}m} = \sum_{k=1}^{K} \sum_{p \in \pi_k} w(p) \left\| \phi(p) - \frac{\sum_{q \in \pi_k} w(q)\phi(q)}{\sum_{q \in \pi_k} w(q)} \right\|^2
                 = \sum_{k=1}^{K} \sum_{p \in \pi_k} w(p)\|\phi(p)\|^2 - \sum_{k=1}^{K} \frac{\left\| \sum_{p \in \pi_k} w(p)\phi(p) \right\|^2}{\sum_{p \in \pi_k} w(p)}
                 = C - \sum_{k=1}^{K} \frac{\sum_{p \in \pi_k} \sum_{q \in \pi_k} w(p)\phi(p) \cdot w(q)\phi(q)}{\sum_{p \in \pi_k} w(p)}    (6)

Among the two sufficient conditions of Corollary 1, (5) can be easily fulfilled by using the sum of edge weights in normalized cuts as the point weight in weighted K-means.

    F_{k\text{-}m} = C - \sum_{k=1}^{K} \frac{\sum_{p \in \pi_k} \sum_{q \in \pi_k} W(p, q)}{\sum_{p \in \pi_k} \sum_{q \in V} W(p, q)} = C - K \times F_{Ncuts}    (7)

Fulfilling (4), however, requires a careful selection of the similarity function W. Equation (4) can be rewritten as (8), in which the left-hand side is the inner product of two vectors in the high dimensional feature space. In fact, (8) can also be considered as defining a symmetric kernel function, indicating that W must satisfy the positivity condition [6]. Also, to avoid computing the kernel matrix, W must be separable so as to allow an explicit expression of the mapping function \phi.

    \phi(p) \cdot \phi(q) = \frac{W(p, q)}{w(p)w(q)}    (8)

In order to find a suitable form for W(p, q), we first investigate the widely used Euclidean distance based pixel similarity measurement. We represent each pixel of a color image by a five dimensional vector (l, \alpha, \beta, x, y), in which l, \alpha, \beta are its color component values in the CIELAB color space, and x, y are its vertical and horizontal coordinates in the image plane. Without loss of generality, the range of each component is linearly normalized to [0, 1] for simplicity. The CIELAB color space is adopted because the Euclidean distance is believed to be nearly perceptually uniform in this space [1]. Given two pixels p = (l_p, \alpha_p, \beta_p, x_p, y_p) and q = (l_q, \alpha_q, \beta_q, x_q, y_q), a similarity measurement between them can be defined as (9), in which \tilde{W}_c and \tilde{W}_s measure color similarity and space proximity respectively. Two parameters C_c and C_s control the relative significance of the color and spatial information. We multiply the first term of \tilde{W}_c(p, q) by 2.55^2 in order to be consistent with the standard CIELAB definition.

    \tilde{W}(p, q) = C_c^2 \cdot \tilde{W}_c(p, q) + C_s^2 \cdot \tilde{W}_s(p, q)
    \tilde{W}_c(p, q) = 2.55^2 \left[ 2 - (\alpha_p - \alpha_q)^2 - (\beta_p - \beta_q)^2 \right] + \left[ 1 - (l_p - l_q)^2 \right]
    \tilde{W}_s(p, q) = 2 - (x_p - x_q)^2 - (y_p - y_q)^2    (9)

Although \tilde{W}(p, q) has a very clear physical meaning in measuring pixel similarity, it cannot be directly used in our method because it does not satisfy the positivity condition [6] required by (8); a detailed explanation can be found in the supplementary material. To solve this problem, we seek a proper approximation of \tilde{W}(p, q). We rewrite (9) as (10) to show that \tilde{W}(p, q) is a nonnegative linear combination of instances of a single simple function g(t), which can be expanded as the uniformly convergent Fourier series shown in (11). The coefficients of this series converge to 0 very quickly, at a speed of 1/(2k+1)^3. Therefore, g(t) can be well approximated by the first term of the series, as expressed in (12).

    \tilde{W}(p, q) = C_s^2 \left[ g(x_p - x_q) + g(y_p - y_q) \right] + C_c^2 \left[ g(l_p - l_q) + 2.55^2 \left( g(\alpha_p - \alpha_q) + g(\beta_p - \beta_q) \right) \right], \quad g(t) = 1 - t^2, \; t \in [-1, 1]    (10)

    g(t) = \sum_{k=0}^{\infty} \frac{32(-1)^k}{[(2k+1)\pi]^3} \cos \frac{(2k+1)\pi t}{2}, \quad t \in [-1, 1]    (11)

    g(t) = 1 - t^2 \approx \frac{32}{\pi^3} \cos \frac{\pi t}{2}, \quad t \in [-1, 1]    (12)

Simply omitting the constant multiplier 32/\pi^3, \tilde{W}(p, q) can be approximated by W(p, q) as defined in (13). Unlike g(t), \cos(\pi t / 2) is positive definite, leading to the positivity of W(p, q). Moreover, according to the properties of the cosine function, W(p, q) can be directly written in the inner product form of (4), in which \phi and w are defined in (14).

    W(p, q) = C_s^2 \left[ \cos \frac{\pi}{2}(x_p - x_q) + \cos \frac{\pi}{2}(y_p - y_q) \right] + C_c^2 \left[ \cos \frac{\pi}{2}(l_p - l_q) + 2.55^2 \left( \cos \frac{\pi}{2}(\alpha_p - \alpha_q) + \cos \frac{\pi}{2}(\beta_p - \beta_q) \right) \right]    (13)

    \phi(p) = \frac{1}{w(p)} \Big( C_c \cos \frac{\pi}{2} l_p, \; C_c \sin \frac{\pi}{2} l_p, \; 2.55 C_c \cos \frac{\pi}{2} \alpha_p, \; 2.55 C_c \sin \frac{\pi}{2} \alpha_p, \; 2.55 C_c \cos \frac{\pi}{2} \beta_p, \; 2.55 C_c \sin \frac{\pi}{2} \beta_p, \; C_s \cos \frac{\pi}{2} x_p, \; C_s \sin \frac{\pi}{2} x_p, \; C_s \cos \frac{\pi}{2} y_p, \; C_s \sin \frac{\pi}{2} y_p \Big), \quad w(p) = \sum_{q \in V} W(p, q) = \sum_{q \in V} w(p)\phi(p) \cdot w(q)\phi(q)    (14)

Until now, we have explicitly defined a ten dimensional feature space in (14) such that weighted K-means clustering in this feature space is approximately equivalent to normalized cuts in the input space.
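A quick numeric sanity check of (13) and (14): by the identity cos(a - b) = cos a cos b + sin a sin b, the ten components of w(p)\phi(p) reproduce W(p, q) exactly as an inner product, which also makes the affinity matrix positive semidefinite. A sketch, with random stand-ins for the normalized pixel values and arbitrary C_c, C_s:

```python
import numpy as np

rng = np.random.default_rng(1)
Cc, Cs = 1.0, 0.075
h = np.pi / 2

def u_vec(px):
    """u(p) = w(p) * phi(p): the ten components listed in (14)."""
    l, a, b, x, y = px
    parts = [(Cc, l), (2.55 * Cc, a), (2.55 * Cc, b), (Cs, x), (Cs, y)]
    return np.array([f for c, v in parts
                     for f in (c * np.cos(h * v), c * np.sin(h * v))])

def W_sim(p, q):
    """The similarity function defined in (13)."""
    dl, da, db, dx, dy = np.subtract(p, q)
    return (Cs**2 * (np.cos(h * dx) + np.cos(h * dy))
            + Cc**2 * (np.cos(h * dl)
                       + 2.55**2 * (np.cos(h * da) + np.cos(h * db))))

pixels = rng.uniform(0, 1, (20, 5))           # rows: (l, alpha, beta, x, y)
U = np.array([u_vec(p) for p in pixels])
G = U @ U.T                                    # inner products u(p).u(q)
W = np.array([[W_sim(p, q) for q in pixels] for p in pixels])
assert np.allclose(G, W)                       # condition (4) holds exactly
assert np.linalg.eigvalsh(W).min() > -1e-9     # W is positive semidefinite
```

Because the map is explicit, the N x N affinity matrix never has to be materialized in the actual algorithm; only the ten dimensional vectors are stored.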
Note that, under the similarity function defined in (13), both the kernel matrix for weighted kernel K-means and the affinity matrix in normalized cuts will be highly dense, leading to high computational complexity when using existing methods. In contrast, by directly applying weighted K-means in the ten dimensional feature space, the objective function of normalized cuts can be efficiently optimized.

Based on the above analysis, we propose the LSC superpixel segmentation algorithm, which takes as input the desired number of superpixels, K. In LSC, image pixels are first mapped to weighted points in the ten dimensional feature space defined by (14). K seed pixels are then sampled uniformly over the whole image with horizontal and vertical intervals vx and vy, where vx/vy equals the aspect ratio of the image. After slight disturbances for avoiding noisy and boundary pixels [1], these seeds are used as the search centers, and their feature vectors are used as the initial weighted means of the corresponding clusters. Each pixel is then assigned to the cluster whose weighted mean is closest to the pixel's vector in the feature space. After pixel assignment, the weighted mean and search center of each cluster are updated accordingly. These two steps are performed iteratively until convergence. Pixels assigned to the same cluster form a superpixel.

Theoretically, the search space of each cluster should cover the whole image to satisfy Corollary 1. However, for superpixels, local compactness is a common prior. In other words, it may not be favorable to assign pixels far away from each other to the same superpixel in terms of human perception. Hence, we adopt the common practice [20][1] in superpixel segmentation of limiting the search space of each cluster to a region of size τvx × τvy, in which τ ≥ 1 is a parameter balancing local compactness and global optimality. We simply choose τ = 2 in our implementation.

The above process enforces no connectivity on the superpixels, meaning that there is no guarantee that pixels in the same cluster form a connected component. To address this problem, we empirically merge small isolated superpixels, whose sizes are less than one fourth of the expected superpixel size, into their large neighboring superpixels. When there is more than one candidate for merging, we choose the closest one in the ten dimensional feature space. The algorithm is summarized in Algorithm 1.

Algorithm 1 LSC Superpixel Segmentation
 1: Map each point p = (lp, αp, βp, xp, yp) to a ten dimensional vector ϕ(p) in the feature space.
 2: Sample K seeds over the image uniformly at fixed horizontal and vertical intervals vx and vy.
 3: Move each seed to its lowest gradient neighbor in its 3 × 3 neighborhood.
 4: Initialize the weighted mean mk and search center ck of each cluster using the corresponding seed.
 5: Set label L(p) = 0 for each point p.
 6: Set distance d(p) = ∞ for each point p.
 7: repeat
 8:   for each weighted mean mk and search center ck do
 9:     for each point p in the τvx × τvy neighborhood of ck in the image plane do
10:       D = Euclidean distance between ϕ(p) and mk in the feature space.
11:       if D < d(p) then
12:         d(p) = D
13:         L(p) = k
14:       end if
15:     end for
16:   end for
17:   Update the weighted means and search centers of all clusters.
18: until the weighted means of the K clusters converge
19: Merge small superpixels into their neighbors.

Suppose the number of image pixels is N. The complexity of the feature mapping is obviously O(N). By restricting the search space of each cluster, the complexity of pixel assignment is reduced from O(KN) to O(N) in each iteration. The complexity of updating the weighted means and search centers is also O(N). The merging step requires O(nz) operations, in which z represents the number of small isolated superpixels to be merged and n is the average number of their adjacent neighbors. As such, the overall complexity of LSC is O(κN + nz), in which κ is the number of iterations. In practice, nz ≪ N, and κ = 20 is enough for generating superpixels of state-of-the-art quality. Therefore, LSC is of linear complexity O(N), and experiments will show that LSC is among the fastest superpixel segmentation algorithms.

4. Experiments

We compare LSC to eight state-of-the-art superpixel segmentation algorithms, including SLIC [1], SEEDS [2], Ncuts [23], Lattice [15], ERS [12], Turbopixel [11], EneOpt1 and EneOpt0 [20]. For all eight algorithms, the implementations are based on publicly available code. Experiments are performed on the Berkeley Segmentation Database [13], consisting of three hundred test images with human-segmented ground truth. The boundary adherence of superpixels generated by the different algorithms is compared using three commonly used evaluation metrics in image segmentation: under-segmentation error (UE), boundary recall (BR) and achievable segmentation accuracy (ASA). Among the three metrics, UE measures the percentage of pixels that leak from the ground truth boundaries. It evaluates the quality of superpixel segmentation by penalizing superpixels overlapping with multiple objects. The definition of UE used in [1] is adopted here. A lower UE indicates that fewer superpixels straddle multiple objects.
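The UE computation just described can be sketched as follows. This follows one common formulation of the metric from [1], in which a superpixel is charged to every ground-truth segment it overlaps by more than a small fraction of its own area; the 5% tolerance is a typical choice in published benchmark code, assumed here rather than taken from the text:

```python
import numpy as np

def under_segmentation_error(gt, sp, frac=0.05):
    """gt, sp: integer label maps of the same shape.
    Charges each superpixel once per ground-truth segment it overlaps
    by more than `frac` of its own area, then normalizes (one variant
    of the UE metric; benchmark implementations differ in details)."""
    N = gt.size
    leak = 0
    for j in np.unique(sp):
        mask = sp == j
        area = mask.sum()
        _, cnt = np.unique(gt[mask], return_counts=True)
        leak += area * (cnt > frac * area).sum()
    return (leak - N) / N
```

A segmentation that perfectly respects the ground-truth partition scores 0; a single superpixel covering two equal-size segments scores 1.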
BR measures the fraction of ground truth boundaries correctly recovered by the superpixel boundaries. A true boundary pixel is regarded as correctly recovered if it falls within 2 pixels of at least one superpixel boundary point. A high BR indicates that very few true boundaries are missed. ASA is defined as the highest achievable object segmentation accuracy when utilizing superpixels as units [12]. By labeling each superpixel with the ground truth segment of the largest overlapping area, ASA is calculated as the fraction of labeled pixels that do not leak from the ground truth boundaries. A high ASA indicates that the superpixels comply well with the objects in the image. Figure 3 shows the experimental results, which are average values over all 300 test images in the Berkeley segmentation database.

Figure 3. Quantitative evaluation of different superpixel segmentation algorithms: (a) UE, (b) BR, (c) ASA, (d) Time.

Computational efficiency is also an important factor in evaluating the performance of superpixel segmentation algorithms. In our experiments, we calculate the average running time for the different algorithms, and the results are shown in Figure 3(d). All experiments are performed on a desktop PC equipped with an Intel 3.4 GHz dual core processor and 2GB memory. The time consumption of the Ncuts algorithm [23] is much higher than that of the other methods and is therefore omitted in Figure 3(d).

To be more clear, we also list the numeric values of the metrics when the number of superpixels K = 400 in Table 1, which also summarizes the computational complexity of the different algorithms. From Figure 3 and Table 1, it can be observed that in terms of boundary adherence, the proposed LSC is comparable to the state-of-the-art algorithms; for relatively large numbers of superpixels, LSC performs the best. Also, LSC is of linear complexity and is among the algorithms with the highest time efficiency. In addition, qualitative experiments demonstrate that LSC performs the best. We select the five algorithms (SEEDS, Ncuts, SLIC, ERS and LSC) that achieve the lowest UE values when K = 400 for visual comparison. According to Figure 3, these five algorithms generally outperform the remaining three algorithms in terms of UE, BR as well as ASA. Figure 4 shows some typical visual results of superpixel segmentation using these algorithms. Some detailed segmentation results are emphasized to facilitate close visual inspection. Intuitively, LSC has achieved the most perceptually satisfactory segmentation results for different kinds of images.

According to Figure 3, EneOpt0 performs the worst in terms of boundary adherence among the five selected algorithms, probably because it uses a variation of the minimum cut strategy, which may suffer from the bias of cutting out small sets of pixels, leading to under-segmentation errors in practice, as is shown in Figure 4(a). Actually, EneOpt0 indirectly controls the superpixel density by setting the upper bound of the superpixel size. However, it may be difficult to produce a desirable number of superpixels using EneOpt0, especially for small K, because large superpixels tend to be split into small patches at regions with high variabilities. As for Ncuts, a major drawback is its extremely low time efficiency. The two-step heuristic algorithm proposed in [17] for acceleration has caused Ncuts to become ineffective in terms of boundary adherence as K increases. Even so, Ncuts is still the slowest algorithm, as is shown in Table 1. As a local feature based method, SLIC is the second fastest among the selected algorithms according to our experimental results. The superpixels generated by SLIC are also perceptually satisfactory for most cases. However, compared to the proposed LSC algorithm, the boundary adherence of SLIC is less competitive according to Figure 3. Actually, the major difference between SLIC and LSC is that the iterative weighted K-means clustering is performed in different feature spaces. This difference is critical because, unlike SLIC, which relies on local features only, LSC successfully connects a local feature based operation with a global optimization objective function by introducing ϕ, so that the global image structure is implicitly utilized to generate more reasonable segmentation results. In terms of boundary adherence, ERS and SEEDS are very close to LSC, and SEEDS is probably the fastest existing superpixel segmentation algorithm. However, this is achieved by sacrificing the regularity and perceptual satisfaction of the generated superpixels, as is shown in Figure 4(d).

LSC uses two parameters Cs and Cc to control the relative significance of color similarity and space proximity in measuring the similarity between pixels. In fact, what is truly meaningful is their ratio rc = Cs/Cc.
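The two boundary-based metrics described above can be sketched as follows. The 2-pixel tolerance follows the text; how boundaries are extracted from a label map is an implementation choice of this sketch, so treat it as illustrative rather than as the exact benchmark script:

```python
import numpy as np

def boundary_map(seg):
    """Mark pixels whose right or bottom neighbor has a different label."""
    b = np.zeros(seg.shape, dtype=bool)
    b[:, :-1] |= seg[:, :-1] != seg[:, 1:]
    b[:-1, :] |= seg[:-1, :] != seg[1:, :]
    return b

def boundary_recall(gt, sp, tol=2):
    """Fraction of ground-truth boundary pixels within `tol` pixels
    (Chebyshev distance) of some superpixel boundary pixel."""
    gt_b, sp_b = boundary_map(gt), boundary_map(sp)
    pad = np.pad(sp_b, tol)
    hit = np.zeros_like(sp_b)
    for dy in range(-tol, tol + 1):      # dilate sp boundaries by tol
        for dx in range(-tol, tol + 1):
            hit |= pad[tol + dy: tol + dy + sp_b.shape[0],
                       tol + dx: tol + dx + sp_b.shape[1]]
    return (gt_b & hit).sum() / max(gt_b.sum(), 1)

def asa(gt, sp):
    """Label each superpixel with its largest-overlap GT segment and
    count the pixels that agree with that labeling."""
    total = 0
    for k in np.unique(sp):
        _, cnt = np.unique(gt[sp == k], return_counts=True)
        total += cnt.max()
    return total / gt.size
```

Both scores lie in [0, 1], with 1 meaning the superpixels perfectly respect the ground-truth partition.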
Table 1. Performance metrics of superpixel segmentation algorithms at K = 400

                                      EneOpt0   SEEDS    ERS              Lattices         Ncuts       SLIC     Turbo   LSC
Adherence to boundaries
  Under-segmentation error            0.230     0.197    0.198            0.303            0.220       0.213    0.277   0.190
  Boundary recall                     0.765     0.918    0.920            0.811            0.789       0.837    0.739   0.926
  Achievable segmentation accuracy    0.950     0.960    0.959            0.933            0.956       0.956    0.943   0.962
Segmentation speed
  Computational complexity            O(N²/K)   O(N)     O(N^(3/2) lg N)  O(N^(3/2) lg N)  O(N^(3/2))  O(N)     O(N)    O(N)
  Average time per image              8.22 s    0.213 s  2.88 s           0.748 s          273 s       0.314 s  20.2 s  0.919 s

Figure 4. Visual comparison of superpixel segmentation results when K = 400: (a) SEEDS, (b) Ncuts, (c) SLIC, (d) ERS, (e) LSC.
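For concreteness, the clustering core of Algorithm 1, whose accuracy and runtime are summarized in Table 1, can be sketched in NumPy as below: the feature mapping of (14), uniform seeding, and windowed weighted K-means. Seed perturbation toward the lowest-gradient neighbor and the final merging step are omitted, distances are reset every iteration, and the actual seed count is the nearest grid ky × kx to K; this is a simplified illustration, not the authors' implementation:

```python
import numpy as np

def lsc_features(lab, Cc=1.0, Cs=0.075):
    """Vectors u(p) = w(p)*phi(p) from (14) plus the weights w(p)."""
    H, Wd = lab.shape[:2]
    y, x = np.mgrid[0:H, 0:Wd].astype(float)
    chans = [(Cc, lab[..., 0]), (2.55 * Cc, lab[..., 1]),
             (2.55 * Cc, lab[..., 2]),
             (Cs, x / max(Wd - 1, 1)), (Cs, y / max(H - 1, 1))]
    h = np.pi / 2
    u = np.stack([t for c, v in chans
                  for t in (c * np.cos(h * v), c * np.sin(h * v))], -1)
    u = u.reshape(-1, 10)
    w = u @ u.sum(axis=0)     # w(p) = sum_q W(p,q), since W(p,q) = u(p).u(q)
    return u, w

def lsc_segment(lab, K=16, tau=2.0, iters=5):
    """lab: H x W x 3 image with channels normalized to [0, 1]."""
    H, Wd = lab.shape[:2]
    u, w = lsc_features(lab)
    phi = u / w[:, None]
    yy, xx = np.divmod(np.arange(H * Wd), Wd)
    ky = max(int(round(np.sqrt(K * H / Wd))), 1)
    kx = max(K // ky, 1)
    vy, vx = H / ky, Wd / kx                       # seed intervals (step 2)
    centers = [(int((i + 0.5) * vy), int((j + 0.5) * vx))
               for i in range(ky) for j in range(kx)]
    means = np.array([phi[a * Wd + b] for a, b in centers])
    labels = np.zeros(H * Wd, dtype=int)
    for _ in range(iters):
        dist = np.full(H * Wd, np.inf)
        for k, (a, b) in enumerate(centers):
            # steps 8-16: search only a (tau*vy) x (tau*vx) window around ck
            sel = np.where((np.abs(yy - a) <= tau * vy / 2 + 1) &
                           (np.abs(xx - b) <= tau * vx / 2 + 1))[0]
            d = ((phi[sel] - means[k]) ** 2).sum(axis=1)  # same argmin as L2
            upd = d < dist[sel]
            dist[sel[upd]] = d[upd]
            labels[sel[upd]] = k
        for k in range(len(centers)):   # step 17: update means and centers
            idx = labels == k
            if idx.any():
                means[k] = (w[idx, None] * phi[idx]).sum(0) / w[idx].sum()
                centers[k] = (int(yy[idx].mean()), int(xx[idx].mean()))
    return labels.reshape(H, Wd)
```

Per iteration, every pixel is examined by a bounded number of clusters, which is what makes the overall cost O(κN) rather than O(κKN).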
Generally, larger values of rc indicate a tendency towards generating superpixels with higher shape regularity, while smaller values of rc usually lead to better boundary adherence, as is shown in Figure 5. In our experiments, we set rc = 0.075. We have also verified the validity of approximating W̃(p, q) using W(p, q): in more than 98.8% of practical cases, the relative error caused by this approximation does not exceed 0.5%.

Figure 5. Superpixel segmentation results for different rc: (a) rc = 0.05, (b) rc = 0.075, (c) rc = 0.1, (d) rc = 0.15.

5. Conclusions

We present in this paper a novel superpixel segmentation algorithm, LSC, which produces compact and regularly shaped superpixels with linear time complexity and high memory efficiency. The most critical idea in LSC is to explicitly utilize the connection between the optimization objectives of weighted K-means and normalized cuts by introducing an elaborately designed high dimensional feature space. As such, LSC achieves both boundary adherence and global image structure preservation through simple local feature based operations. Experimental results show that LSC generally outperforms most state-of-the-art algorithms both quantitatively and qualitatively.

This work was supported by the Beijing Higher Education Young Elite Teacher Project (YETP0104), the Tsinghua University Initiative Scientific Research Program (20131089382), and the National Natural Science Foundation of China (61101152).

References

[1] R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Süsstrunk. SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans. on PAMI, 34(11):2274–2282, 2012.
[2] M. Van den Bergh, X. Boix, G. Roig, B. de Capitani, and L. Van Gool. SEEDS: superpixels extracted via energy-driven sampling. Proc. of ECCV, 7578:13–26, 2012.
[3] Y. Boykov and V. Kolmogorov. An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. IEEE Trans. on PAMI, 26(9):1124–1137, 2004.
[4] Y. Boykov, O. Veksler, and R. Zabih. Fast approximate energy minimization via graph cuts. IEEE Trans. on PAMI, 23(11):1222–1239, 2001.
[5] D. Comaniciu and P. Meer. Mean shift: a robust approach toward feature space analysis. IEEE Trans. on PAMI, 24(5):603–619, 2002.
[6] N. Cristianini and J. Shawe-Taylor. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, 2000.
[7] I. Dhillon, Y. Guan, and B. Kulis. Weighted graph cuts without eigenvectors: a multilevel approach. IEEE Trans. on PAMI, 29(11):1944–1957, 2007.
[8] P. Felzenszwalb and D. Huttenlocher. Efficient graph-based image segmentation. International Journal of Computer Vision, 59(2):167–181, 2004.
[9] D. Hoiem, A. Efros, and M. Hebert. Automatic photo pop-up. ACM Trans. on Graphics, 24(3):577–584, 2005.
[10] V. Kolmogorov and R. Zabih. What energy functions can be minimized via graph cuts? IEEE Trans. on PAMI, 26(2):147–159, 2004.
[11] A. Levinshtein, A. Stere, K. Kutulakos, D. Fleet, S. Dickinson, and K. Siddiqi. TurboPixels: fast superpixels using geometric flows. IEEE Trans. on PAMI, 31(12):2290–2297, 2009.
[12] M. Liu, O. Tuzel, S. Ramalingam, and R. Chellappa. Entropy rate superpixel segmentation. Proc. of CVPR, pages 2097–2104, 2011.
[13] D. Martin, C. Fowlkes, D. Tal, and J. Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. Proc. of ICCV, 2:416–423, 2001.
[14] A. Moore, S. Prince, and J. Warrell. Lattice cut: constructing superpixels using layer constraints. Proc. of CVPR, pages 2117–2124, 2010.
[15] A. Moore, S. Prince, J. Warrell, U. Mohammed, and G. Jones. Superpixel lattices. Proc. of CVPR, pages 1–8, 2008.
[16] A. Ng, M. Jordan, and Y. Weiss. On spectral clustering: analysis and an algorithm. Proc. of NIPS, pages 849–856, 2001.
[17] X. Ren and J. Malik. Learning a classification model for segmentation. Proc. of ICCV, 1:10–17, 2003.
[18] J. Shi and J. Malik. Normalized cuts and image segmentation. IEEE Trans. on PAMI, 22(8):888–905, 2000.
[19] J. Tighe and S. Lazebnik. SuperParsing: scalable nonparametric image parsing with superpixels. Proc. of ECCV, 5:352–365, 2010.
[20] O. Veksler, Y. Boykov, and P. Mehrani. Superpixels and supervoxels in an energy optimization framework. Proc. of ECCV, pages 211–224, 2010.
[21] A. Vedaldi and S. Soatto. Quick shift and kernel methods for mode seeking. Proc. of ECCV, pages 705–718, 2008.
[22] S. Wang, H. Lu, F. Yang, and M.-H. Yang. Superpixel tracking. Proc. of ICCV, 1:1323–1330, 2011.
[23] S. Yu and J. Shi. Multiclass spectral clustering. Proc. of ICCV, 1:313–319, 2003.