Zach2008 VMV Fast Global Labeling


Fast Global Labeling for Real-Time Stereo Using Multiple Plane Sweeps
Christopher Zach, David Gallup, Jan-Michael Frahm and Marc Niethammer
Department of Computer Science
University of North Carolina at Chapel Hill
Email: cmzach,gallup,jmf,[email protected]
August 26, 2008
Abstract
This work presents a real-time, data-parallel approach for global label assignment on
regular grids. The labels are selected according to a Markov random field energy with a Potts
prior term for binary interactions. We apply the proposed method to accelerate the clean-up
step of a real-time dense stereo method based on plane sweeping with multiple sweeping
directions, where the label set directly corresponds to the employed directions. In this setting
the Potts smoothness model is suitable, since the set of labels does not possess an intrinsic
metric or total order. The observed run-times are approximately 30 times faster than the
ones obtained by graph cut approaches.
1 Introduction
The best results in stereo have come from global methods. However, these methods are still too
computationally demanding to be used in real-time applications or other applications
where processing time is a critical resource. Gallup et al. [11] present a real-time stereo method
which uses plane-sweeping and local matching to quickly produce depthmaps for three
surface-aligned sweeping directions. The final depthmap is a per-pixel selection from the three
candidate depths, which is solved with regularization using a global energy. Although the problem
involves only three labels (namely the employed sweeping directions), optimization using graph cuts
still takes several seconds, making the final step unsuitable for real time. Thus, the authors
recommend a local best cost selection scheme for real-time applications.
In this paper we present a real-time solution to the global labeling problem. By relaxing the
energy to the continuous case, our method can compute the global minimum of the relaxed energy,
and since the computation is highly data-parallel, the solution can be computed efficiently using
graphics hardware. The relationship between the Potts discontinuity model and total variation
regularization enables the continuous formulation of the global labeling task.
We have applied our method to the three-label problem proposed for depth map clean-up
in [11], with slight modifications to the energy formulation to facilitate efficient, data-parallel
minimization. Despite these modifications, our results are comparable in quality to those presented
in previous work, and the computation is orders of magnitude faster. Thus, our work enables higher
quality global labeling results to be computed in real-time, which was not possible before. The
core of our approach is not restricted to dense stereo computation, and can be applied to a variety
of labeling problems.
The combination of plane sweep stereo using multiple directions with global label assignment
is interesting for the following reasons: (a) It allows the refinement and clean-up of the depth maps
without the huge computational costs that come with other global stereo approaches. (b) Since
the label set corresponds to dominant facade directions in urban scenes, the refined labels can be
used to assist a subsequent semantic analysis of the captured geometry.
(a) Reference image (b) Best cost labels
(c) Graph cut result (from [11]) (d) Proposed method
Figure 1: Plane sweep with multiple directions results. (a) shows the reference image used to
compute the best matching costs for multiple sweep directions. (b) shows the labels (represented
by the three color channels) corresponding to the directions with the lowest matching costs. (c)
depicts the cleaner label assignment using graph cuts as proposed in [11] (with enhanced colors,
2 s runtime); and (d) displays our result, which is visually very similar to (c) ($\lambda = 60$, 58 ms
runtime). This figure is best viewed in color.
2 Related Work
In this section we focus on previous work related to global label assignment. We refer to [21, 5]
for an overview and evaluation of stereo methods.
Binary labeling problems incorporating a matching cost term and a spatial smoothness prior
can be solved using network flow approaches [14]. Since the primary tool is the construction of
an appropriate directed graph and determining the minimum cut, the whole class of methods based
on this principle is usually referred to as graph cut methods. Label assignment with more than two
labels can be approximately addressed by a sequence of binary labeling methods, e.g. $\alpha$-expansion
and $\alpha$-$\beta$-swaps [3]. A lot of work has been done recently to address some shortcomings of graph
cuts for multi-label problems (e.g. [16, 13, 15]). Exact solutions for labeling problems with linear
discontinuity costs [2] and convex pairwise interactions [12] can be obtained by suitable graph
constructions.
Graph cut based approaches are mostly sequential algorithms, and all attempts to accelerate
these methods by highly parallel computations, in particular on GPUs, have had limited success
so far. Another method to optimize Markov random fields is loopy belief propagation based on
message passing [10, 22, 9]. The updates executed in message passing methods can be performed in
parallel and are therefore highly suitable for GPU implementation. The major problem with loopy
belief propagation is that inference (i.e. optimal label assignment) is only exact for tree graphs,
where the message passing algorithm is an extension of dynamic programming. Belief propagation
in loopy graphs (e.g. image grids with 4-neighborhoods) is only a heuristic optimization procedure.
Although loopy belief propagation often works well in practice, divergent and oscillating behaviour
may be observed.
Consequently, continuous methods for global labeling problems that provide globally optimal
results, at least once non-convex constraints are replaced by convex ones, are appealing for an
accelerated GPU implementation. In [19] a variational method is proposed to find the global optimum of
labeling problems with total variation (TV) regularization (in particular for dense depth estimation
from stereo images). This approach is only applicable if the set of labels has a natural distance
metric, since the employed TV regularization is equivalent to a linear discontinuity cost model
for the label values. In our application the set of labels (major sweep directions) has no natural
metric, and a different solution is required. In the following we derive a continuous method for
global labeling with the Potts discontinuity model.
3 Global Label Assignment and the Potts Model
In this section we propose a novel label assignment approach based on continuous energy
minimization. Typically, global label assignment maps pixel locations to labels,
$\Lambda : \Omega \to \mathcal{L}$, and it is formulated as an energy minimization problem. Here,
$\Omega$ denotes the typically rectangular image domain, and $\mathcal{L} = \{1, \dots, L\}$ is the
set of labels. The underlying energy functional accumulates the data cost for selecting label $l$
at pixel $x$, $D(x, l)$, and a spatial smoothness cost to regularize the resulting assignment.
Thus, the goal is to find the minimizer of
$$E(\Lambda) = \int_\Omega \lambda\, D(x, \Lambda(x))\, dx + V(\Lambda), \tag{1}$$
where $\lambda$ controls the importance of the data fidelity on the overall energy. For many computer
vision problems typical choices for $V(\Lambda)$ are the homogeneous regularization,
$V(\Lambda) = \int_\Omega |\nabla \Lambda|^2\, dx$, and the total variation,
$V(\Lambda) = \int_\Omega |\nabla \Lambda|\, dx$. Since in our application the labels representing plane
directions have no natural order, these gradient based regularization terms are not applicable.
If we focus on the data energy $\int_\Omega D(x, \Lambda(x))\, dx$ and ignore the smoothness cost for now,
determining the optimal data energy is just a point-wise minimization problem,
$$\min_{l \in \mathcal{L}} c_{x,l}, \tag{2}$$
where we substitute $c_{x,l} = D(x, l)$. By interpreting the optimization task above as a (rather
trivial) linear program, we can analyze its dual problem:
$$\min_{u_{x,l}} \sum_{l \in \mathcal{L}} c_{x,l}\, u_{x,l} \quad \text{s.t.} \quad u_{x,l} \ge 0, \quad \sum_{l \in \mathcal{L}} u_{x,l} = 1. \tag{3}$$
The interpretation of the unknowns $u_{x,l}$ is that $u_{x,l} \in [0, 1]$ is the continuous version of the
indicator function $\mathbb{1}(\Lambda(x) = l)$. Although $u_{x,l}$ is not enforced to be binary by the
constraints, the dual linear program will result in a unique binary solution if all data costs are
distinct for a pixel, i.e. $c_{x,l} \ne c_{x,l'}$ for $l \ne l'$. Otherwise a set of equally optimal
assignments for $u^*_{x,l}$ is obtained, but an optimal binary solution $u^*_{x,l}$ can be determined by
setting exactly one non-zero $u_{x,l}$ to 1, and all other variables to 0.
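As a small illustration of the point-wise linear program in Eq. 3, the following Python sketch builds one binary optimum for a single pixel by putting all weight on a cheapest label. The function name `binary_lp_solution` and the cost values are hypothetical and only serve the example; ties are broken by taking the first minimal label.

```python
def binary_lp_solution(costs):
    """Return a binary minimizer u of sum_l costs[l] * u[l] over the simplex."""
    best = min(range(len(costs)), key=lambda l: costs[l])  # first cheapest label
    return [1.0 if l == best else 0.0 for l in range(len(costs))]


costs = [0.7, 0.2, 0.5]          # hypothetical data costs c_{x,l} for one pixel
u = binary_lp_solution(costs)    # indicator vector of the cheapest label
energy = sum(c * w for c, w in zip(costs, u))
```

Any convex combination of the cheapest labels would give the same energy, which is why distinct costs are needed for the binary solution to be unique.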
In the following we will use the notation $u_l$ for the indicator function of label $l$, i.e.
$u_l(x) = u_{x,l}$. Further, let $u_x$ denote the vector $(u_{x,1}, \dots, u_{x,L})$ at location $x$, i.e.
$u_x(l) = u_{x,l}$. Thus, $u_l$ is a particular slice of the volume $\Omega \times \mathcal{L} \to \mathbb{R}$,
and $u_x$ is a specific column in the label direction. These notations are also used for other
mappings with domain $\Omega \times \mathcal{L}$.
The Potts Model. The Potts model for smoothness priors is appropriate if the numeric value
of a label has no particular meaningful interpretation, e.g. when labels have a purely symbolic
character. In the Potts model the smoothness cost is zero if neighboring locations are assigned
the same label. Otherwise a constant penalty (independent of the actual values of the labels)
is added to the overall energy. More formally,
$$P_\Lambda(x) = \begin{cases} 0 & \text{if } |\nabla \Lambda(x)| = 0 \\ 1 & \text{otherwise,} \end{cases} \tag{4}$$
where we assign a unit cost for discontinuities. Note that $\nabla \Lambda$ is understood on a discrete grid,
i.e. as finite differences. Further, we can employ the $L^2$ (Euclidean), $L^\infty$ (maximum norm) or the
$L^1$ norm for $|\nabla \Lambda|$ (resulting in different preferred orientations induced by the regularization).
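In the following, we denote the spatial gradient of $u$ by $\nabla_x u$, i.e. for 2-dimensional image
domains we have $\nabla_x u = (\partial u/\partial x, \partial u/\partial y)^T$. We can rewrite the
accumulated Potts discontinuity cost $\int_\Omega P_\Lambda\, dx$ in terms of $u$:
Proposition 1 If $u_{x,l} \in \{0, 1\}$ is the binary indicator function, $u_{x,l} = \mathbb{1}(\Lambda(x) = l)$, then
$$\int_\Omega P_\Lambda\, dx = \frac{1}{2} \sum_l \int_\Omega |\nabla_x u_l|\, dx. \tag{5}$$
(Recall that $u_l(x) = u_{x,l}$.)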
Proof: By the coarea formula for functions of bounded variation it is known that (for $l \in \mathcal{L}$)
$$\int_\Omega |\nabla_x u_l|\, dx = \mathrm{Per}(U_l), \tag{6}$$
where $U_l = \{x : u_{x,l} = 1\}$ is the set induced by the indicator function $u_l$. Note that the
boundary $\partial U_l$ is exactly the set of jumps involving label $l$. Since for every $x$ exactly one
$u_{x,l}$ is set (i.e. has value 1), every discontinuity in the label assignment (e.g. switching from
$l$ to $l'$) contributes to two such perimeters, namely $\mathrm{Per}(U_l)$ and $\mathrm{Per}(U_{l'})$.
Hence summing Eq. 6 over all labels counts every discontinuity of $\Lambda$ twice, i.e.
$\sum_l \int_\Omega |\nabla_x u_l|\, dx = 2 \int_\Omega P_\Lambda\, dx$, which is Eq. 5.
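The double counting in the proof can be checked numerically on a small label image. The sketch below is only an illustration under stated assumptions: it uses the $L^1$ norm with forward differences, and it counts discontinuities per pair of neighboring pixels (the usual pairwise form of the Potts cost) rather than per pixel. The function names and the 3 × 3 label image are made up for the example.

```python
def potts_edge_count(labels):
    """Count horizontally/vertically adjacent pixel pairs with different labels."""
    h, w = len(labels), len(labels[0])
    count = 0
    for y in range(h):
        for x in range(w):
            if x + 1 < w and labels[y][x] != labels[y][x + 1]:
                count += 1
            if y + 1 < h and labels[y][x] != labels[y + 1][x]:
                count += 1
    return count


def half_summed_tv(labels, num_labels):
    """0.5 * sum_l sum_x |grad u_l|_1 for the binary indicators u_l of each label."""
    h, w = len(labels), len(labels[0])
    total = 0.0
    for l in range(num_labels):
        u = [[1.0 if labels[y][x] == l else 0.0 for x in range(w)] for y in range(h)]
        for y in range(h):
            for x in range(w):
                if x + 1 < w:
                    total += abs(u[y][x + 1] - u[y][x])
                if y + 1 < h:
                    total += abs(u[y + 1][x] - u[y][x])
    return 0.5 * total


labels = [[0, 0, 1],
          [0, 2, 1],
          [2, 2, 1]]
print(potts_edge_count(labels), half_summed_tv(labels, 3))
```

Each label change between two neighbors shows up once in the indicator of the label that is left and once in the indicator of the label that is entered, which is where the factor 1/2 comes from.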
A Continuous Formulation. The results from the previous paragraphs can be combined to
obtain a continuous formulation for the labeling problem with relaxed constraints. Merging the
linear program in Eq. 3 with the relation from Proposition 1 leads to the following energy to
minimize over $u$:
$$E_1(u) = \int_\Omega \Big\{ \sum_l |\nabla_x u_l| + \lambda \sum_l c_{x,l}\, u_{x,l} \Big\}\, dx, \tag{7}$$
with the constraints $u_{x,l} \ge 0$ and $\sum_l u_{x,l} = 1$. Since Eq. 7 is difficult to optimize directly, we
decouple the regularization and data term by introducing an additional function $v$ linked to $u$ by
a quadratic approximation force [1, 4]:
$$E_2(u; v) = \int_\Omega \Big\{ \sum_l |\nabla_x u_l| + \sum_l \frac{1}{2\theta} (u_{x,l} - v_{x,l})^2 + \lambda \sum_l c_{x,l}\, v_{x,l} \Big\}\, dx, \tag{8}$$
subject to $v_{x,l} \ge 0$ and $\sum_l v_{x,l} = 1$. $\theta$ is a parameter that controls the influence of the
squared distance between $u$ and $v$ in Eq. 8. This technique of quadratic relaxation allows the
combination of TV-based smoothness costs with arbitrary data terms, and the utilization of
well-established and efficient methods for TV-regularization. If $\theta$ is set to a very small number,
then $u$ and $v$ are very close approximations of each other, but the speed of convergence is
substantially reduced. Typical choices for $\theta$ are 1/10 and 1/20. Now the (convex) optimization task
can be solved by alternating steps that minimize over either $u$ for constant $v$ or vice versa.
Optimizing $E_2$ with respect to $u$. We can omit the constant data fidelity term depending
only on $v$ in Eq. 8. Thus, the task is to solve
$$\min_u \int_\Omega \Big\{ \sum_l |\nabla_x u_l| + \sum_l \frac{1}{2\theta} (u_{x,l} - v_{x,l})^2 \Big\}\, dx. \tag{9}$$
This decomposes into $L$ independent image denoising problems (for $l \in \mathcal{L}$):
$$E^{ROF}_l = \min_{u_l} \int_\Omega \Big\{ |\nabla_x u_l| + \frac{1}{2\theta} (u_{x,l} - v_{x,l})^2 \Big\}\, dx, \tag{10}$$
which is known as the Rudin-Osher-Fatemi (ROF) model [20] and can be solved efficiently by
a gradient descent procedure [6, 7]. Here we only briefly sketch the procedure proposed in [7]:
$|\nabla_x u_l|$ can be rewritten as $\max_{p_l : |p_l| \le 1} \langle p_l, \nabla_x u_l \rangle$, thereby
introducing the dual vector-valued function $p_l$, which essentially removes the non-linearity
induced by $|\nabla_x u_l|$. Eq. 10 then reads as
$$\min_{u_l} \max_{|p_l| \le 1} \int_\Omega \Big\{ \langle p_l, \nabla_x u_l \rangle + \frac{1}{2\theta} (u_{x,l} - v_{x,l})^2 \Big\}\, dx. \tag{11}$$
Computing the functional derivatives of Eq. 11 with respect to the unknowns $u_l$ and $p_l$ yields the
following gradient descent/reprojection equations for $p_l$:
$$\tilde p_l^{(t+1)} = p_l^{(t)} + \tau\, \nabla_x u_l, \qquad p_l^{(t+1)} = \Pi_B\big(\tilde p_l^{(t+1)}\big), \tag{12}$$
where $\tau < \theta/4$ is the timestep, and $\Pi_B(\cdot)$ denotes the projection into the unit ball
$B = \{x : |x| \le 1\}$. The corresponding values of $u_l$ can be determined by
$$u_{x,l} = v_{x,l} + \theta\, \big(\nabla \cdot p_l\big)(x). \tag{13}$$
It turns out that the finite difference implementations of $\nabla u_l$ and $\nabla \cdot p_l$ must be dual
(in the sense of linear operators), e.g. if $\nabla u_l$ is approximated by forward differences,
$\nabla \cdot p_l$ is based on backward differences.
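The duality requirement can be made concrete with a small numerical check. The following sketch assumes a forward-difference gradient that is zero at the far image border and a backward-difference divergence with matching border handling; these boundary conventions and the helper names `grad`, `div` and `dot` are assumptions of this example, not taken from the text. It verifies that the gradient and the negative divergence are adjoint, $\langle \nabla u, p \rangle = -\langle u, \nabla \cdot p \rangle$.

```python
def grad(u):
    """Forward differences; the difference is set to zero at the far border."""
    h, w = len(u), len(u[0])
    gx = [[u[y][x + 1] - u[y][x] if x + 1 < w else 0.0 for x in range(w)] for y in range(h)]
    gy = [[u[y + 1][x] - u[y][x] if y + 1 < h else 0.0 for x in range(w)] for y in range(h)]
    return gx, gy


def div(px, py):
    """Backward differences, constructed as the negative adjoint of grad above."""
    h, w = len(px), len(px[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = (px[y][x] if x + 1 < w else 0.0) - (px[y][x - 1] if x > 0 else 0.0)
            dy = (py[y][x] if y + 1 < h else 0.0) - (py[y - 1][x] if y > 0 else 0.0)
            out[y][x] = dx + dy
    return out


def dot(a, b):
    """Sum of element-wise products of two equally sized 2D arrays."""
    return sum(a[y][x] * b[y][x] for y in range(len(a)) for x in range(len(a[0])))


u = [[1.0, 2.0, 4.0], [0.0, 3.0, 5.0]]      # arbitrary test image
px = [[1.0, -1.0, 2.0], [0.5, 1.0, -1.0]]   # arbitrary dual field, x component
py = [[2.0, 0.0, 1.0], [1.0, 1.0, 1.0]]     # arbitrary dual field, y component
gx, gy = grad(u)
lhs = dot(gx, px) + dot(gy, py)             # <grad u, p>
rhs = -dot(u, div(px, py))                  # -<u, div p>
```

If both operators used forward differences, `lhs` and `rhs` would in general differ, and the primal-dual updates would no longer optimize a single consistent energy.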
We did not specify the exact norm $|\cdot|$ appearing in the equations above. The Euclidean
norm $|\cdot|_2$ does not prefer certain directions and is the appropriate choice for many applications.
We observed that the $L^1$ norm, $|x|_1 = \sum_i |x_i|$, applied on $\nabla u_l$ gives visually more appealing
results. Note that using $|\nabla u_l|_1$ results in a preference for horizontal and vertical directions along
the discontinuities. Further, $|\nabla u_l|_1$ translates to using the maximum norm on the dual variables
$p_l$, since
$$|y|_1 = \max_{|p|_\infty \le 1} \langle p, y \rangle \tag{14}$$
for every $y \in \mathbb{R}^n$. Further, the unit ball $B = \{x : |x|_\infty \le 1\}$ is just the unit square, and
$\Pi_B(\tilde p_l)$ is obtained by clamping the components of $\tilde p_l$ to the range $[-1, 1]$.
Optimizing $E_2$ with respect to $v$. This task decouples into separate subproblems for every
pixel $x$, since the $v_{x,l}$ do not spatially interact with neighboring values. Hence, we are facing the
following (data-parallel) minimization problems for every position $x$:
$$\min_{v_{x,l}} \sum_{l \in \mathcal{L}} \Big\{ \frac{1}{2\theta} (u_{x,l} - v_{x,l})^2 + \lambda\, c_{x,l}\, v_{x,l} \Big\} \quad \text{s.t.} \quad \sum_{l \in \mathcal{L}} v_{x,l} = 1, \quad v_{x,l} \ge 0. \tag{15}$$
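We can rewrite Eq. 15 to obtain the equivalent formulation:
$$\min_{v_{x,l}} \sum_{l \in \mathcal{L}} \big( (u_{x,l} - \theta\lambda\, c_{x,l}) - v_{x,l} \big)^2 \quad \text{s.t.} \quad \sum_{l \in \mathcal{L}} v_{x,l} = 1, \quad v_{x,l} \ge 0, \tag{16}$$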
which means that the vector $v_x = (v_{x,1}, \dots, v_{x,L})$ is the closest point (in terms of the Euclidean
distance) to the vector $u_x - \theta\lambda\, c_x$ on the canonical simplex, i.e. the projection $\Pi_S(\cdot)$ on
the unit simplex. This problem is well-known in the literature, and a particularly simple algorithm
is an active set method based on successive projections and corrections [17], given as Algorithm 1.
Since the set $I$ of inactive inequality constraints ($v_{x,l} \ge 0$) is reduced by at least one element in
every iteration (if the current solution is still infeasible), the algorithm requires at most $L$
iterations. Since the label set $\mathcal{L}$ is very small in our application, we can manually unroll the
loop to obtain coherent parallel execution on GPUs (see Section 5).

Algorithm 1 Update procedure to minimize $E_2$ with respect to $v$
  $v_{x,l} \leftarrow u_{x,l} - \theta\lambda\, c_{x,l}$, $I = \{1, \dots, L\}$
  repeat
    Projection onto plane:
      $\bar v = \big( \sum_{l \in I} v_{x,l} \big) / |I|$
      $\forall l \in I : v_{x,l} \leftarrow v_{x,l} - \bar v + 1/|I|$
    Enforce inequality constraints:
      $I \leftarrow I \setminus \{l : v_{x,l} < 0\}$
      $\forall l \notin I : v_{x,l} \leftarrow 0$
  until $\sum_l v_{x,l} = 1$
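The active set procedure of Algorithm 1 can be sketched in a few lines of Python. This is an illustrative CPU version under the assumption that the input vector already holds the shifted values (the point to be projected); the function name `project_to_simplex` and the sample input are hypothetical.

```python
def project_to_simplex(w):
    """Project w onto {v : v_l >= 0, sum_l v_l = 1} following Algorithm 1."""
    v = list(w)
    inactive = set(range(len(v)))            # the set I of inactive constraints
    for _ in range(len(v)):                  # at most L iterations are needed
        mean = sum(v[l] for l in inactive) / len(inactive)
        for l in inactive:                   # projection onto the plane sum = 1
            v[l] = v[l] - mean + 1.0 / len(inactive)
        negative = {l for l in inactive if v[l] < 0.0}
        if not negative:                     # all inequality constraints hold
            break
        inactive -= negative                 # these constraints become active
        for l in negative:
            v[l] = 0.0
    return v


v = project_to_simplex([1.0, 0.5, -2.0])     # hypothetical input for one pixel
```

For this input the first pass produces a negative third component, which is fixed at zero; the second pass distributes the remaining mass over the first two labels and ends at approximately (0.75, 0.25, 0).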
Discussion. The Potts model for discontinuities formulated in terms of assigned labels is not
convex, but our formulation (Eqs. 7 and 8) based on soft indicator variables is. The constraints
of the original problem, $u_{x,l} \in \{0, 1\}$, are not convex and are essentially replaced by the bounds
$u_{x,l} \in [0, 1]$, hence $u_{x,l}$ can attain fractional values. Exactly this modification makes the problem
convex and allows a globally optimal solution to be determined efficiently. But this means that
assigning labels according to $\Lambda(x) = \arg\max_l u_{x,l}$ does not necessarily return a global optimum
of the original discrete problem. In certain cases the relaxed continuous formulation provides a
global optimum for the discrete problem, e.g. TV-$L^1$ denoising of binary input images results in
globally optimal solutions after thresholding for almost all threshold values [8, 18]. No such result
is known for minimizing Eq. 7, hence the obtained discrete label assignment will in general only be
an approximate (although usually very strong) solution. In practice we observed that the assigned
$u_{x,l}$ is binary for almost all pixels (after convergence). Note that the iterated graph cut approach
used in [11] only returns a strong approximate solution as well.
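The label extraction and the "binary for almost all pixels" observation can be expressed as a short sketch. The helper names, the tolerance `tol` and the sample values are assumptions made for illustration only.

```python
def extract_labels(u):
    """Pick arg max_l u[y][x][l] at every pixel (first maximum on ties)."""
    return [[max(range(len(px)), key=lambda l: px[l]) for px in row] for row in u]


def binary_fraction(u, tol=0.01):
    """Fraction of pixels whose largest indicator value is within tol of 1."""
    pixels = [px for row in u for px in row]
    return sum(1 for px in pixels if max(px) >= 1.0 - tol) / len(pixels)


# Hypothetical soft indicators for a 2x2 image with three labels.
u = [[[1.0, 0.0, 0.0], [0.0, 0.6, 0.4]],
     [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]]
labels = extract_labels(u)
fraction = binary_fraction(u)
```

Pixels with fractional values, like the second one here, are the places where thresholding by arg max may deviate from a discrete optimum.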
4 Application to Stereo with Multiple Sweeping Directions
Efficient solutions to the global labeling problem are of particular importance in large scale 3D
reconstruction from video. Due to the enormity of video data, processing time is a major concern.
Other applications such as mission planning, change detection, and robot navigation require results
in a timely fashion, if not immediately. Capturing even a small city from ground requires literally
millions of frames of video. For practical use, processing time must be comparable to the capture
time, thus real-time is an important goal. Note that this requirement excludes many global or
semi-global approaches using the full disparity range as potential labels.
In that regard, Gallup et al. [11] present a real-time stereo method tailored (but not limited) to
broad planar surfaces such as those found in urban environments. The fastest stereo methods are local
methods, which compute matching in windows in the image. In urban scenes acquired from street-level,
ground surfaces such as streets and sidewalks, and facade surfaces between buildings, are often
viewed at steep angles and thus appear highly slanted in the image. Such surfaces pose a problem
for window-based matching: while the center pixel is in correspondence, other pixels in the window,
especially outer pixels, are not, which can lead to mismatches in the final result. Preferably the
window should be aligned to the surface in 3D, in which case the correct match will feature all
pixels in correspondence.
Performing local stereo with cost windows aligned with an exhaustive set of surface normals
is not feasible, hence only promising sweeping directions are retained. In urban environments the
ground surface normal and two orthogonal facade normals are dominant, therefore three directions
are sufficient for city modeling. The main sweep directions can be determined from vanishing points
or from sparse reconstructions obtained by visual odometry methods.
Once the surface normals are found, a local best-cost plane-sweep is performed for each
direction. This produces one depthmap for each sweeping direction. The final depthmap can be
obtained simply by selecting per-pixel the depth with minimal matching cost. However, matching
scores are often somewhat noisy, which leads to errors in the selection. Hence, regularization with
spatial smoothness priors is inevitable.
The minimization method presented in this paper can solve the labeling task orders of magnitude
faster than graph cuts (which require a few seconds), making the high quality method possible
in real-time. In [11] two types of smoothness penalties are proposed: compatibility between labels,
and smoothness between depths (integrability cost). We found that the integrability penalty
has a minor contribution to the results, and it is difficult to optimize efficiently.
Figure 1 shows that our formulation produces nearly identical results to those obtained using
the more complex graph cut formulation proposed in [11]. The run-time for the minimization
step is 58 ms, and together with 30 ms for the plane-sweeps, the overall processing rate is about
11 Hz. The observed runtime for the graph cut implementation on the same image data is 2 s, i.e.
approximately 34 times slower.
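The quoted rates follow directly from the three timings given above; the short calculation below just makes the arithmetic explicit (the variable names are ours).

```python
labeling_ms = 58.0       # global labeling step (from the text)
plane_sweep_ms = 30.0    # plane sweeps (from the text)
graph_cut_ms = 2000.0    # graph cut labeling on the same data (from the text)

frame_rate_hz = 1000.0 / (labeling_ms + plane_sweep_ms)  # about 11.4 Hz
speedup = graph_cut_ms / labeling_ms                      # about 34.5
```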
5 Implementation
This section provides more details on our GPU-accelerated algorithm for the continuous
formulation of global label assignment. Although the CUDA programming paradigm is currently
considered the state-of-the-art approach for efficient GPU programming, we still employ the OpenGL
API and Cg shading language for the following reasons: (a) It allows the implementation to be
executed on a substantially larger range of graphics hardware from different vendors and on older
GPU generations as well. Note that in contrast to the scalar G80 architecture from NVidia, the
current generation of GPUs from AMD/ATI still uses vectorized processing units. In particular,
the shader-based specification of Algorithm 1 conveniently makes use of intrinsic vector operations.
(b) CUDA is usually only substantially faster than shader based methods if the proposed shared
memory programming model can be exploited, which is only the case to a limited extent for the
approach presented in this work.
Shader-based Implementation. Since the number of required labels for our applications is very
small (three labels in our setting), we can represent the data cost volume $c_{x,l}$, the soft indicator
functions $u_l$ and the respective dual variables $p_l$ by regular 2D textures with the respective number
of color channels. In practice, it is sufficient to represent $u_l$ and $p_l$ by 16-bit floating point
components, hence the required memory bandwidth can be reduced (which results in improved
runtimes). Since $p_x$ then comprises 6 components ($x$ and $y$ components for every label $l$),
$p_x$ can be represented by two textures. Updating $p_x$ in a single pass requires the ability to render
into multiple targets. Alternatively, the two 16-bit components of $p_{x,l}$ for a specific label can be
packed into one 32-bit floating point value on NVidia GPUs. Thus, the complete set of values for
$p_x$ can be encoded in one multi-channel floating point texture.
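To illustrate the packing idea at the bit level, the sketch below stores two IEEE half-precision values in a single 32-bit word using Python's `struct` module. It is a CPU-side illustration only: on the GPU the packing is done with hardware or shader intrinsics, the 32 bits are stored in a floating point texture channel rather than an integer, and the function names here are hypothetical.

```python
import struct


def pack_halves(a, b):
    """Store two half-precision values in one 32-bit word (little-endian)."""
    return struct.unpack("<I", struct.pack("<ee", a, b))[0]


def unpack_halves(word):
    """Recover the two half-precision values from a packed 32-bit word."""
    return struct.unpack("<ee", struct.pack("<I", word))


word = pack_halves(0.5, -0.25)       # e.g. the x and y components of p for one label
restored = unpack_halves(word)
```

The round trip is exact for values representable in half precision; other values are rounded to the nearest half-precision number when packed.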
The alternating optimization steps, Eq. 12 (and Eq. 13) and Algorithm 1, correspond directly
to a pair of shader programs, which are outlined in Eqs. 17–20:
1a. $v_x \leftarrow \Pi_S\big(u_x^{(t)} - \theta\lambda\, c_x\big)$ (17)
1b. $u_x^{(t+1)} \leftarrow v_x + \theta\, \nabla \cdot p_x^{(t)}$ (18)
2a. $\tilde p_x \leftarrow p_x^{(t)} + \tau\, \nabla u_x^{(t+1)}$ (19)
2b. $p_x^{(t+1)} \leftarrow \max\big(-1, \min(1, \tilde p_x)\big)$, (20)
where the min and max operators in 2b. are understood as component-wise application on the
input vector. $\tilde p_x$ is a temporary variable local to the update step. The gradient and the divergence
are computed by finite differencing neighboring pixels.
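The two passes can be put together in a compact CPU sketch. This is not the shader implementation: it is a plain Python illustration under several assumptions of ours, namely uniform initialization of the indicators, forward/backward differences with zero flux across the image border, the $L^1$ variant with component-wise clamping, and illustrative values for `lam`, `theta`, `tau` and the iteration count. All function names are hypothetical.

```python
def project_to_simplex(w):
    """Active set projection onto the unit simplex (cf. Algorithm 1)."""
    v, inactive = list(w), set(range(len(w)))
    for _ in range(len(v)):
        shift = (1.0 - sum(v[l] for l in inactive)) / len(inactive)
        for l in inactive:
            v[l] += shift
        negative = {l for l in inactive if v[l] < 0.0}
        if not negative:
            break
        inactive -= negative
        for l in negative:
            v[l] = 0.0
    return v


def clamp(c):
    return max(-1.0, min(1.0, c))


def run_labeling(cost, lam=1.0, theta=0.1, tau=0.02, iters=30):
    """Alternate the two passes; cost[y][x][l] holds the data costs."""
    h, w, n = len(cost), len(cost[0]), len(cost[0][0])
    u = [[[1.0 / n] * n for _ in range(w)] for _ in range(h)]
    v = [[[1.0 / n] * n for _ in range(w)] for _ in range(h)]
    p = [[[[0.0, 0.0] for _ in range(n)] for _ in range(w)] for _ in range(h)]
    for _ in range(iters):
        # Pass 1 (steps 1a, 1b): simplex projection, then u = v + theta * div p.
        for y in range(h):
            for x in range(w):
                v[y][x] = project_to_simplex(
                    [u[y][x][l] - theta * lam * cost[y][x][l] for l in range(n)])
                for l in range(n):
                    # Backward differences; p counts as zero across the border.
                    dx = (p[y][x][l][0] if x + 1 < w else 0.0) - (p[y][x - 1][l][0] if x > 0 else 0.0)
                    dy = (p[y][x][l][1] if y + 1 < h else 0.0) - (p[y - 1][x][l][1] if y > 0 else 0.0)
                    u[y][x][l] = v[y][x][l] + theta * (dx + dy)
        # Pass 2 (steps 2a, 2b): step on p along the forward-difference gradient, then clamp.
        for y in range(h):
            for x in range(w):
                for l in range(n):
                    gx = u[y][x + 1][l] - u[y][x][l] if x + 1 < w else 0.0
                    gy = u[y + 1][x][l] - u[y][x][l] if y + 1 < h else 0.0
                    p[y][x][l][0] = clamp(p[y][x][l][0] + tau * gx)
                    p[y][x][l][1] = clamp(p[y][x][l][1] + tau * gy)
    return u, v, p


def arg_max_labels(u):
    return [[max(range(len(px)), key=lambda l: px[l]) for px in row] for row in u]


# Hypothetical 3x3 cost image with three labels; the centre pixel prefers label 0.
cost = [[[1.0, 0.0, 1.0] for _ in range(3)] for _ in range(3)]
cost[1][1] = [0.0, 0.2, 1.0]
u, v, p = run_labeling(cost)
labels = arg_max_labels(u)
```

On a GPU each of the two loops over pixels becomes one fragment program, since every pixel only reads its own values and those of its direct neighbors from the previous pass.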
The projection $\Pi_S(\cdot)$ on the canonical simplex is achieved by unrolling the loop in Algorithm 1.
Essentially, the following Cg source fragment is repeated $L$ times:
// Number of constraints still in the inactive set I (I holds 0/1 flags).
cardI = (I.x + I.y + I.z);
// Shift v so that the components selected by I sum to 1.
v = v + (1.0 - dot(I, v)) / cardI;
// Remove labels with negative values from I and clamp them to zero.
I = (v < 0) ? half3(0) : I;
v = max(v, float3(0));
The binary vector I represents the set of inactive constraints. The first two statements move v onto the
respective plane by subtracting a modified mean over all non-zero elements. The third and fourth
statements determine the active inequality constraints and force v to be non-negative.
Coarse-to-fine Strategy. Since incorporating a global smoothness prior allows pixels to
communicate over the entire image, convergence to stable results can be slow. Figure 4 illustrates
an example where the initially assigned labels are revised again (in the lower left corner of the
image). Figure 4(a) shows the obtained result after 300 iterations without a multi-scale approach,
and Figure 4(b) depicts the labeling after 1000 iterations. There is still no clear assignment in
the indicated region, although most of the image appears already converged. In order to accelerate
the procedure we employ a multi-scale approach similar to the one proposed in [9] (see also
Figure 2). The positive influence of a coarse-to-fine method on the convergence rate can be seen
in Figure 4(c), where 4 levels, each with half the resolution of the previous one, are used. The obtained
result after 100 iterations on the base level is virtually identical to the fully converged result
(Figure 4(d)).
Figure 2: Illustration of the multi-scale approach (panels: level k+1 and level k). A group of
2 × 2 pixels is merged in the next level.
Some considerations about the data costs on coarser levels are required. Figure 5(a) shows
the result on the full resolution base level with $\lambda = 50$. In order to benefit from the multi-scale
approach, a suitable initialization from previous levels in the pyramid is required. Figure 5(b)
shows the desired result at level 2 (quarter resolution), but neither averaging nor accumulation
of data costs from the finer resolution levels provides the intended result if $\lambda$ is fixed for all levels
(Figure 5(c) and (d)).
Assume for now that all pixels in a 2 × 2 block as indicated in Figure 2 have the same data cost
$c_{x,l}$ in level $k$, and we use the average cost in the next level $k+1$. Then the overall contribution
of the data fidelity term, $\lambda \sum_l c_{x,l}\, u_{x,l}$, to the combined energy $E_1$ on level $k+1$ is one
quarter of the data energy at level $k$. But the set of discontinuities contributing to the smoothness
energy is only reduced by a factor of two, since they are one-dimensional level curves. Hence, the
correct value of $\lambda^{(k+1)}$ at level $k+1$ is $2\lambda^{(k)}$, with $\lambda^{(0)} = \lambda$, the given data weight.
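A minimal sketch of this pyramid construction, assuming cost averaging over 2 × 2 blocks combined with a doubled data weight per level. The function names, the requirement of even image sides and the sample cost image are assumptions of the example.

```python
def downsample_costs(cost):
    """Average the data costs of every 2x2 block (image sides assumed even)."""
    h, w, n = len(cost), len(cost[0]), len(cost[0][0])
    return [[[(cost[2 * y][2 * x][l] + cost[2 * y][2 * x + 1][l]
               + cost[2 * y + 1][2 * x][l] + cost[2 * y + 1][2 * x + 1][l]) / 4.0
              for l in range(n)]
             for x in range(w // 2)]
            for y in range(h // 2)]


def build_pyramid(cost, lam, levels):
    """Return [(cost_k, lambda_k)] with lambda_{k+1} = 2 * lambda_k."""
    pyramid = [(cost, lam)]
    for _ in range(levels - 1):
        cost, lam = downsample_costs(cost), 2.0 * lam
        pyramid.append((cost, lam))
    return pyramid


# Hypothetical 4x4 single-label cost image, just to show the bookkeeping.
cost = [[[float(4 * y + x)] for x in range(4)] for y in range(4)]
pyramid = build_pyramid(cost, 50.0, 3)
weights = [lam for _, lam in pyramid]
```

Optimization would then start on the coarsest entry of `pyramid` and pass the upsampled result to the next finer level as initialization.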
6 Results
In this section we provide timing and visual results for our method. The utilized PC hardware
is equipped with an NVidia Geforce 8800 Ultra GPU and a 3 GHz CPU. Run-times are measured
under a Linux OS using current OpenGL drivers. One issue with GPU-based iterative methods is
the stopping criterion, since this usually involves an expensive reduction operation, e.g. to compute
the current energy or the maximal update of the unknowns. We empirically found that 150
iterations on each level using the coarse-to-fine approach yield (visually) converged results.
The observed run-times for global label assignment are approximately 60 ms for 512 × 384 images,
and 45 ms for 384 × 288 pixels.
(a) Best cost labels (b) Global assignment
Figure 3: Local and global label assignment. (a) shows a lot of noise in the plane directions (labels)
selected by taking the minimal matching costs. (b) shows the much more consistent labeling result.
In terms of the dominant plane orientations the car in the foreground is a highly ambiguous object,
resulting in the non-uniform label assignment (which is less important for non-planar objects).
Figure 3 and Figure 6 illustrate the obtained labelings with local (best-cost) assignment and the
global approach. Incorporating a smoothness prior does not only result in cleaner label maps, but
reduces the noise in the 3D model as shown in Figure 6(c) and (d). The ability of global
optimization to reduce the noise in the final model is limited by early determining a small set of
possible depth values for every pixel. The refined labels provide a significantly improved
segmentation of the scene at low computational costs.
7 Conclusion
In this work we introduced a data-parallel approach to solve Markov random fields on regular grids
with a Potts smoothness prior. Using modern GPUs the observed performance is more than 30
times faster than a graph cut based approach. One suitable application demonstrated in this work
is the postprocessing and clean-up step for depth maps obtained by real-time stereo methods.
In this work we restricted ourselves to a uniform weighting between data costs and smoothness
priors. Future work needs to explore the applicability of weighted TV-norms, which yield
generalized Potts discontinuity models. Note that in this setting a slightly extended version of
Proposition 1 still holds. Additionally, incorporating the refined label assignment into a subsequent
semantic analysis procedure for urban environments is left as future work.
Acknowledgments: We gratefully acknowledge support from NVidia Corporation.
References
[1] J.-F. Aujol, G. Gilboa, T. Chan, and S. Osher. Structure-texture image decomposition
modeling, algorithms, and parameter selection. Int. Journal of Computer Vision, 67(1):111
136, 2006.
[2] Y. Boykov, O. Veksler, and R. Zabih. Markov random elds with ecient approximations.
In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 648655,
1998.
[3] Y. Boykov, O. Veksler, and R. Zabih. Fast approximate energy minimization via graph cuts.
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 23(11):12221239,
2001.
[4] X. Bresson, S. Esedoglu, P. Vandergheynst, J. Thiran, and S. Osher. Fast Global Minimization
of the Active Contour/Snake Model. Journal of Mathematical Imaging and Vision, 2007.
[5] M. Z. Brown, D. Burschka, and G. D. Hager. Advances in computational stereo. IEEE
Transactions on Pattern Analysis and Machine Intelligence, 25(8):9931008, 2003.
[6] A. Chambolle. An algorithm for total variation minimization and applications. Journal of
Mathematical Imaging and Vision, 20(12):8997, 2004.
[7] A. Chambolle. Total variation minimization and a class of binary MRF models. In Energy
Minimization Methods in Computer Vision and Pattern Recognition, pages 136152, 2006.
[8] T. F. Chan and S. Esedoglu. Aspects of total variation regularized L
1
function approximation.
SIAM Journal on Applied Mathematics, 65(5):18171837, 2004.
[9] P. F. Felzenszwalb and D. P. Huttenlocher. Ecient belief propagation for early vision.
In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 261268,
2004.
[10] W. T. Freeman, E. C. Pasztor, and O. T. Carmichael. Learning low-level vision. Int. Journal
of Computer Vision, 40(1):2547, 2000.
[11] D. Gallup, J.-M. Frahm, P. Mordohai, Q. Yang, and M. Pollefeys. Real-time plane-sweeping
stereo with multiple sweeping directions. In IEEE Conference on Computer Vision and
Pattern Recognition (CVPR), 2007.
[12] H. Ishikawa. Exact optimization for Markov random elds with convex priors. IEEE Trans-
actions on Pattern Analysis and Machine Intelligence (PAMI), 25(10):13331336, 2003.
[13] V. Kolmogorov. Convergent tree-reweighted message passing for energy minimization. IEEE
Transactions on Pattern Analysis and Machine Intelligence (PAMI), 28(10):15681583, 2006.
[14] V. Kolmogorov and R. Zabih. What energy functions can be minimized via graph cuts? IEEE
Transactions on Pattern Analysis and Machine Intelligence (PAMI), 26(2):147159, 2004.
10
[15] N. Komodakis, N. Paragios, and G. Tziritas. MRF optimization via dual decomposition:
Message-passing revisited. In IEEE International Conference on Computer Vision (ICCV),
2007.
[16] N. Komodakis and G. Tziritas. A new framework for approximate labeling via graph cuts.
In IEEE International Conference on Computer Vision (ICCV), 2005.
[17] C. Michelot. A nite algorithm for nding the projection of a point onto the canonical simplex
of
n
. Journal of Optimization Theory and Applications, 50(1):195200, 1986.
[18] M. Nikolova, S. Esedoglu, and T. F. Chan. Algorithms for nding global minimizers of image
segmentation and denoising models. SIAM Journal on Applied Mathematics, 66(5):16321648,
2006.
[19] T. Pock, T. Schoenemann, D. Cremers, and H. Bischof. A convex formulation of continuous
multi-label problems. In European Conference on Computer Vision (ECCV), 2008. to appear.
[20] L. I. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based noise removal algorithms.
Physica D, 60:259–268, 1992.
[21] D. Scharstein and R. Szeliski. A taxonomy and evaluation of dense two-frame stereo
correspondence algorithms. Int. Journal of Computer Vision, 47(1-3):7–42, 2002.
[22] M. F. Tappen and W. T. Freeman. Comparison of graph cuts with belief propagation for
stereo, using identical MRF parameters. In IEEE International Conference on Computer
Vision (ICCV), pages 900–907, 2003.
(a) Level 0, 300 iterations (b) Level 0, 1000 iterations (c) Level 4–0, 100 iterations (d) Converged result
(e) Level 0, 300 iterations (f) Level 0, 1000 iterations (g) Level 4–0, 100 iterations (h) Converged result
Figure 4: A coarse-to-fine approach speeds up convergence. Top row: $u_{x,l}$ obtained after the
specified number of iterations without (a–b) and with (c) multiple scales. (d) shows the
converged result after 5000 iterations. Bottom row: the labels obtained by selecting $\arg\max_l u_{x,l}$
at every pixel. Although the labeling in (f) is similar to the final result (h), (b) indicates that the
$u_l$ still do not induce a clear decision in the lower left region. This figure is best viewed in color.
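The label selection described in the caption of Figure 4 can be illustrated with a small sketch. It assumes the relaxed solution is stored as nested lists `u[y][x][l]`; the function names `select_labels` and `decision_margin` and the toy values are hypothetical and not taken from the paper. The margin (gap between the two largest values at a pixel) is only one possible way to quantify what the caption calls a clear decision.

```python
from typing import List

Scores = List[List[List[float]]]  # u[y][x][l]: relaxed value of label l at pixel (x, y)


def select_labels(u: Scores) -> List[List[int]]:
    """Hard labeling: pick the index of the largest u value at every pixel."""
    return [[max(range(len(px)), key=px.__getitem__) for px in row] for row in u]


def decision_margin(u: Scores) -> List[List[float]]:
    """Gap between the largest and second-largest u value at every pixel.

    A margin close to zero means the arg max is not yet a clear decision,
    even if the resulting label happens to match the converged one.
    """
    margins = []
    for row in u:
        out = []
        for px in row:
            ranked = sorted(px, reverse=True)
            out.append(ranked[0] - ranked[1])
        margins.append(out)
    return margins


# Toy 2x2 image with three labels (illustrative values only).
u = [
    [[0.90, 0.05, 0.05], [0.34, 0.33, 0.33]],
    [[0.10, 0.80, 0.10], [0.20, 0.20, 0.60]],
]
labels = select_labels(u)
margins = decision_margin(u)
```

In this toy input the second pixel of the first row receives label 0, but its margin is tiny, which mirrors the situation in panels (b) and (f): the hard labels can look reasonable while the underlying values are still undecided.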
(a) Full resolution (b) Level 2, correct cost scaling (c) Level 2, cost averaging (d) Level 2, cost accumulation
Figure 5: A coarse-to-fine approach requires the correct scaling of the cost values. (a) shows the
result at the full image resolution (512×384); (b) shows the result at quarter resolution (128×96)
with correct downsampling of the data term; in (c) the downscaled costs are the means of the
costs at the previous level, which yields oversmoothed results; and in (d) the costs are added, which
leads to less regularized label assignments. This figure is best viewed in color.
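The three downsampling variants compared in Figure 5 can be sketched as follows. The caption does not state the scaling factor, so the `"scaled"` mode below is an assumption rather than the authors' stated rule: if the data term sums over pixel area (four fine pixels per coarse pixel) while the smoothness term measures boundary length in pixels (which halves per level), then dividing the 2×2 block sum by 2 keeps the two terms balanced. This is consistent with the caption, since the mean (sum/4) weakens the data term and the plain sum strengthens it. The function name `downsample_cost` is hypothetical.

```python
from typing import List

Grid = List[List[float]]  # cost[y][x] for one label


def downsample_cost(cost: Grid, mode: str = "scaled") -> Grid:
    """Halve the resolution of one label's cost image by pooling 2x2 blocks.

    mode="mean":   average of the block (data term becomes too weak)
    mode="sum":    sum of the block (data term becomes too strong)
    mode="scaled": sum / 2, an ASSUMED balance between area and length scaling
    """
    divisor = {"mean": 4.0, "sum": 1.0, "scaled": 2.0}
    if mode not in divisor:
        raise ValueError(f"unknown mode: {mode}")
    h, w = len(cost) // 2, len(cost[0]) // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            block_sum = (cost[2 * y][2 * x] + cost[2 * y][2 * x + 1]
                         + cost[2 * y + 1][2 * x] + cost[2 * y + 1][2 * x + 1])
            row.append(block_sum / divisor[mode])
        out.append(row)
    return out


# A constant 4x4 cost image, reduced by two levels as in "Level 2".
full = [[1.0] * 4 for _ in range(4)]
level1 = downsample_cost(full, "scaled")
level2 = downsample_cost(level1, "scaled")
```

Under this assumption each level doubles the per-pixel cost of a constant image, so two levels give a factor of four relative to the mean and a factor of four below the accumulated sum.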
(a) Best cost labels (b) Global assignment (c) Best cost 3D model (d) Model with global assignment
Figure 6: The Begijnhof sequence (courtesy of Marc Pollefeys). (a) and (b) show the best cost
and global labeling results, respectively. Global labeling yields an almost perfect semantic
segmentation into the ground (green), fronto-parallel parts (blue) and orthogonal facades (red). (c)
and (d) depict the lit, but untextured facade in the left portion of the image without and with
global labeling. This figure is best viewed in color.