A Variable-Resolution Probabilistic Three-Dimensional Model for Change Detection
Daniel Crispell, Member, IEEE, Joseph Mundy, and Gabriel Taubin, Fellow, IEEE
Abstract—Given a set of high-resolution images of a scene, it is often desirable to predict the scene's appearance from viewpoints not present in the original data for purposes of change detection. When significant 3-D relief is present, a model of the scene geometry is necessary for accurate prediction to determine surface visibility relationships. In the absence of an a priori high-resolution model (such as those provided by LIDAR), scene geometry can be estimated from the imagery itself. These estimates, however, cannot, in general, be exact due to uncertainties and ambiguities present in image data. For this reason, probabilistic scene models and reconstruction algorithms are ideal due to their inherent ability to predict scene appearance while taking into account such uncertainties and ambiguities. Unfortunately, existing data structures used for probabilistic reconstruction do not scale well to large and complex scenes, primarily due to their dependence on large 3-D voxel arrays. The work presented in this paper generalizes previous probabilistic 3-D models in such a way that multiple orders of magnitude savings in storage are possible, making high-resolution change detection of large-scale scenes from high-resolution aerial and satellite imagery possible. Specifically, the inherent dependence on a discrete array of uniformly sized voxels is removed through the derivation of a probabilistic model which represents uncertain geometry as a density field, allowing implementations to efficiently sample the volume in a nonuniform fashion.

Index Terms—Computer vision, data structures, remote sensing.

I. INTRODUCTION

appearance of the scene given the previously observed images. Some change detection algorithms operate at large scales, typically indicating changes in land-cover type (e.g., forests, urban, and farmland). Due to the increasing availability of high-resolution imagery, however, interest in higher resolution and intraclass change detection is growing. The precise definition of "change" is application dependent in general, and in many cases, it is easier to define what does not constitute valid change [1], [2]. Typically, changes in appearance due to illumination conditions, atmospheric effects, viewpoint, and sensor noise are not desired to be reported. Various classes of methods for accomplishing this have been attempted, a survey of which was given by Radke et al. [1] in 2005 (which includes the joint histogram-based method [3] used for comparison by Pollard et al. [2]). One common assumption relied on by most of these methods, as well as more recent approaches [4], [5], is an accurate registration of pixel locations in the collected image to corresponding pixel locations in the base imagery.

When the scene being imaged is relatively flat or the 3-D structure is known a priori, "rubber sheeting" techniques can be used to accomplish this. When the scene contains significant 3-D relief viewed from disparate viewpoints, however, techniques based on this assumption fail due to their inability to predict occlusions and other viewpoint-dependent effects [2]. High-resolution imagery exacerbates this problem due to the increased visibility of small-scale 3-D structure (trees, small
Manuscript received December 1, 2010; revised March 18, 2011; accepted April 22, 2011. D. Crispell is with the National Geospatial-Intelligence Agency, Springfield, VA 22150 USA (e-mail: [email protected]). J. Mundy and G. Taubin are with Brown University, Providence, RI 02912 USA (e-mail: [email protected]; [email protected]). Digital Object Identifier 10.1109/TGRS.2011.2158439

0196-2892/$26.00 © 2011 IEEE

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

B. Probabilistic Models

Computing exact 3-D structure based on 2-D images is, in general, an ill-posed problem. Bhotika et al. [8] characterized the sources of difficulty as belonging to one of two classes: scene ambiguity and scene uncertainty. Scene ambiguity exists due to the existence of multiple possible photo-consistent scenes and is a problem even in the absence of any sensor noise or violations of assumptions built into the imaging and sensor model. In the absence of prior information regarding the scene structure, there is no reason to prefer one possible reconstruction over another. The term "scene uncertainty," on the other hand, is used to describe all other potential sources of error including sensor noise, violations of certain simplifying assumptions (e.g., Lambertian appearance), and calibration errors. The presence of scene uncertainty typically makes reconstruction of a perfectly photo-consistent scene impossible. Probabilistic models allow the scene ambiguity and uncertainty to be explicitly represented, which, in turn, allows the assignment of confidence values to visibility calculations, expected images, and other data extracted from the model. A probabilistic model can also be used to determine which areas of the scene require further data collection due to low confidence.

C. Contributions

The work presented in this paper generalizes previous probabilistic reconstruction models in such a way that multiple orders of magnitude savings in storage are possible, making precise representation and change detection of large-scale outdoor scenes possible. Specifically, the inherent dependence on a discrete array of uniformly sized voxels is removed through the derivation of a new probabilistic representation based on a density field model. The representation allows for implementations which nonuniformly sample the volume, providing high-resolution detail where needed (e.g., near surfaces) and coarser resolutions in areas containing little information (e.g., in empty space) (Fig. 1). Additionally, it represents the first probabilistic volumetric model to provide a principled way to take viewing ray/voxel intersection lengths into account, enabling higher accuracy modeling and rendering. The proposed model combined with the reconstruction and associated algorithms comprises a practical system capable of automatically detecting change and generating photo-realistic renderings of large and complex scenes from arbitrary viewpoints based on image data alone.

D. Outline

The remainder of this paper is laid out as follows. In Section II, a brief survey of related work in the field of 3-D reconstruction is given. Section III describes the theoretical foundations of the proposed model. Sections IV and V describe the implementation using an octree data structure and the reconstruction algorithms, respectively. The application of the reconstructed models for change detection is discussed in Section VI, followed by the paper's conclusion in Section VII.

Fig. 1. Variable-resolution models (b) such as octrees allow for high-resolution representation where needed (i.e., near surfaces) with far less data than required in (a) fixed grid models. Representing surface probabilities using the proposed method allows for variable-resolution models to be used for probabilistic 3-D modeling approaches to change detection.

II. RELATED WORK

There is a large body of previous work in the computer vision community involving the automatic reconstruction of 3-D models from imagery, a brief overview of which is given here. The bulk of the representations used are not probabilistic in nature and are discussed in Section II-A. Existing probabilistic methods are discussed in Section II-B.

A. Deterministic Methods

Three-dimensional reconstruction from images is one of the fundamental problems in the fields of computer vision and photogrammetry, the basic principles of which are discussed in many texts including [9]. Reconstruction methods vary both in the algorithms used and the type of output produced.

Traditional stereo reconstruction methods take as input two (calibrated) images and produce a depth (or height) map as output. A comprehensive review of the stereo reconstruction literature as of 2002 is given by Scharstein and Szeliski [10]. While highly accurate results are possible with recent methods [11], [12], the reconstruction results are limited to functions of the form f(x, y) and cannot completely represent general 3-D scenes on their own.

Many multiview methods are capable of computing 3-D point locations as well as camera calibration information simultaneously using the constraints imposed by feature matches across multiple images (so-called "structure from motion"). One example of such a method is presented by Pollefeys et al. [13], who use tracked Harris corner [14] features to establish correspondences across frames of a video sequence. Brown and Lowe [15] and Snavely et al. [16] use scale-invariant feature transform features [17] for the same purpose with considerable success. Snavely et al. have shown their system capable of successfully calibrating data sets consisting of hundreds of images taken from the Internet. The output of feature-based matching methods (at least in an initial phase) is a discrete and sparse set of 3-D elements which are not directly useful for the purpose of appearance prediction since some regions (e.g., those with homogeneous appearance) will be void of features and, thus, also void of reconstructed points. It is possible to estimate a full surface mesh based on the reconstructed features [18], [19], but doing so requires imposing regularizing constraints to fill in "holes" corresponding to featureless regions. Methods based on dense matching techniques avoid the hole-filling problem but are dependent on
Fig. 2. (Left) Three viewing rays pass through a voxel, which (right) is then subdivided to achieve higher resolution. Assuming that no further information about the voxel is obtained, the occlusion probabilities P(Q_A1), P(Q_B1), and P(Q_C1) should not depend on the level of subdivision and should be equal to P(Q_A2), P(Q_B2), and P(Q_C2), respectively.

B. Probabilistic Methods

As discussed in Section I-B, probabilistic 3-D models have the desirable quality of allowing a measure of uncertainty and ambiguity in the reconstructed model to be explicitly represented. Probabilistic methods are also capable of producing a complete representation of the modeled surfaces while making no assumptions about scene topology or regularizing constraints.

There exist in the literature several distinct methods for reconstructing a probabilistic volumetric scene model based on image data, all based on discrete voxel grid models. Although the methods vary in their approach, the goal is the same: to produce a volumetric representation of the 3-D scene, where each voxel is assigned a probability based on the likelihood of it being contained in the scene. The algorithms grew out of earlier "voxel coloring" algorithms [20]–[23] in which voxels are removed from the scene based on photometric consistency and visual hull constraints. Voxel coloring methods are prone to errors due to scene uncertainty; specifically, violations of the color consistency constraint often manifest themselves as incorrectly carved "holes" in the model [24]. To combat these errors, probabilistic methods do not "carve" voxels but rather assign each a probability of existing as part of the model. Broadhurst et al. [25] assign a probability to each voxel based on the likelihood that the image samples originated from a distribution with small variance rather than make a binary decision. Similarly, Bhotika et al. [8] carve each voxel with a probability based on the variance of the samples in each of a large number of runs. The final voxel probability is computed as the probability that the voxel exists in a given run.

In addition to uncertainty due to noise and other unmodeled phenomena, any reconstruction algorithm must also deal with scene ambiguity, the condition which exists when multiple photo-consistent reconstructions are possible given a set of collected images. If certain a priori information about the scene is available, the information may be used to choose the photo-consistent reconstruction which best agrees with the prior. The reconstruction algorithm presented in this paper is as general as possible and, thus, does not utilize any such prior. Another approach is to define a particular member of the set of photo-consistent reconstructions as "special" and aim to recover that member. This is the approach taken by Kutulakos and Seitz [21] and Bhotika et al. [8]. Kutulakos and Seitz define the photo hull as the tightest possible bound on the true scene geometry, i.e., the maximal photo-consistent reconstruction. They show that, under ideal conditions, the photo hull can be recovered exactly, while Bhotika et al. present a stochastic algorithm for probabilistic recovery of the photo hull in the presence of noise. The photo hull provides a maximal bound on the true scene geometry but does not contain any information about the distribution of possible scene surfaces within the hull. A third approach is to explicitly and probabilistically represent the full range of possible scene reconstructions. Broadhurst et al. [25] aim to reconstruct such a representation, as well as Pollard et al. [2] for the purpose of change detection. Pollard et al. use an online Bayesian method to update voxel probabilities with each observation. Because a model which fully represents both scene ambiguities and uncertainties and is capable of change detection is desired, the model and algorithms presented in this paper are based on this approach.

One quality that current volumetric probabilistic reconstruction methods all share is that the voxel representation is inherently incapable of representing the true continuous nature of surface location uncertainty. Using standard models, occlusion can only occur at voxel boundaries, since each voxel is modeled as being either occupied or empty. A side effect of this fact is that there is no principled way to take into account the length of viewing ray/voxel intersections when computing occlusion probabilities, which limits the accuracy of the computations. These limitations are typically handled in practice by the use of high-resolution voxel grids, which minimize the discretization effects of the model. Unfortunately, high-resolution voxel grids are prohibitively expensive, requiring O(n³) storage to represent scenes with linear resolution n.

1) Variable-Resolution Probabilistic Methods: The benefits of an adaptive variable-resolution representation are clear: In theory, a very high effective resolution can be achieved without the O(n³) storage requirements imposed by a regular voxel grid. One hurdle to overcome is the development of a probabilistic representation which is invariant to the local level of discretization. A simple example is informative.

Consider a single voxel pierced by three distinct viewing rays, as shown in Fig. 2 (left). After passing through the single voxel, the viewing rays have been occluded with probabilities P(Q_A1), P(Q_B1), and P(Q_C1), respectively. Given that no further information about the volume is obtained, the occlusion probabilities should not change if the voxel is subdivided to provide finer resolution, as shown in Fig. 2 (right). In other words, the occlusion probabilities should not be inherently tied to the level of discretization.

Using traditional probabilistic methods, P(Q_A1) = P(Q_B1) = P(Q_C1) = P(X ∈ S), where P(X ∈ S) is the probability that the voxel belongs to the set S of occupied voxels. Upon subdivision of the voxel, eight new voxels are created, each of which must be assigned a surface probability P(X_child ∈ S). Whatever the probability chosen, it is assumed to be constant among the eight "child" voxels since there is no reason for favoring one over any of the others. Given that
rays A, B, and C pass through four child voxels, two child voxels, and one child voxel, respectively, the new occlusion probabilities are computed as the probability that any of the voxels passed through belong to the set S of surface voxels. This is easily solved using De Morgan's laws by instead computing the complement of the probability that all voxels passed through are empty

P(Q_A2) = 1 − (1 − P(X_child ∈ S))⁴    (1)

P(Q_B2) = 1 − (1 − P(X_child ∈ S))²    (2)

P(Q_C2) = P(X_child ∈ S).    (3)

Obviously, the three occlusion probabilities cannot be equal to the original values, i.e., P(X ∈ S), except in the trivial cases P(X ∈ S) = P(X_child ∈ S) = 0 or P(X ∈ S) = P(X_child ∈ S) = 1. This simple example demonstrates the general impossibility of resampling a traditional voxel-based probabilistic model while maintaining the semantic meaning of the original. This presents a major hurdle to generalizing standard probabilistic 3-D models to variable-resolution representations.

The methods proposed in this paper solve the problems associated with resolution dependence by modeling surface probability as a density field rather than a set of discrete voxel probabilities. The density field is still represented discretely in practice, but the individual voxels can be arbitrarily subdivided without affecting occlusion probabilities since the density is a property of the points within the voxel and not the voxel itself. Occlusion probabilities are computed by integrating the density field along viewing rays, providing a principled way to take voxel/viewing ray intersection lengths into account. The derivation of this density field is presented in Section III.

III. OCCLUSION DENSITY

In order to offset the prohibitively large storage costs and discretization problems of the regular voxel grid on which traditional probabilistic methods are based, a novel representation of surface probability is proposed in the form of a scalar function termed the occlusion density. The occlusion density at a point in space can be thought of as a measure of the likelihood that the point occludes points behind it along the line of sight of a viewer, given that the point itself is unoccluded. More precisely, the occlusion density value at the point is a measure of occlusion probability per unit length of a viewing ray which is passing through the point.

If the occlusion density is defined over a volume, probabilistic visibility reasoning can be performed for any pair of points within the volume. In the case where surface geometry exists and is known completely [e.g., scenes defined by a surface mesh or digital elevation model (DEM)], the occlusion density is defined as infinite at the surface locations and zero elsewhere.

Given a ray in space defined by its origin point q and a unit direction vector r, the probability of each point x along the ray being visible from q may be computed. It is assumed here that q is the position of a viewing camera and r represents a viewing ray of the camera, but the assumption is not necessary in general. Points along the line of sight may be parameterized by s, the distance from q

x(s) = q + sr,  s ≥ 0.    (4)

Given the point q and viewing ray r, a proposition V_s may be defined as follows:

V_s ≡ "The point along r at distance s is visible from q."    (5)

The probability P(V_s) is a (monotonically decreasing) function of s and can be written as such using the notation vis(s)

vis(s) ≡ P(V_s).    (6)

Given a segment of r with arbitrary length ℓ beginning at the point with distance s from q, the segment occlusion probability P(Q_s^ℓ) is defined as the probability that the point at distance s + ℓ is not visible, given that the point at distance s is visible

P(Q_s^ℓ) = P(V̄_{s+ℓ} | V_s) = 1 − P(V_{s+ℓ} | V_s).    (7)

Using Bayes' theorem

P(Q_s^ℓ) = 1 − P(V_s | V_{s+ℓ}) P(V_{s+ℓ}) / P(V_s).    (8)

Substituting vis(s) for the visibility probability at distance s and recognizing that P(V_s | V_{s+ℓ}) = 1

P(Q_s^ℓ) = 1 − vis(s + ℓ)/vis(s) = (vis(s) − vis(s + ℓ))/vis(s).    (9)

If an infinitesimal segment length ℓ = ds is used, (9) can be written as

P(Q_s^ds) = −∂vis(s)/vis(s)    (10)

P(Q_s^ds)/ds = −vis′(s)/vis(s).    (11)

The left-hand side of (11) is a measure of occlusion probability per unit length and defines the occlusion density at point x(s). The estimation of the occlusion density value is discussed in Section V

α(x(s)) ≡ −vis′(s)/vis(s).    (12)

2) Visibility Probability Calculation: The visibility probability of the point at distance s along a viewing ray can be derived in terms of the occlusion density function along the
ray by integrating both sides of (12) with respect to a dummy variable of integration t

∫₀ˢ α(t) dt = ∫₀ˢ −∂vis(t)/vis(t)

−∫₀ˢ α(t) dt = [ln(vis(t)) + c]₀ˢ

−∫₀ˢ α(t) dt = ln(vis(s)) − ln(vis(0)).    (13)

Finally, by recognizing that vis(0) = 1 and placing both sides of (13) in an exponential

vis(s) = e^{−∫₀ˢ α(t) dt}.    (14)

Equation (14) gives a simple expression for the visibility probability in terms of the occlusion density values along the viewing ray.

An example of corresponding occlusion density and visibility functions is shown in Fig. 3, which depicts a camera ray piercing a theoretical volume for which the occlusion density is defined at each point within the volume. The value of the occlusion density α(s) as a function of distance along the camera ray is plotted, indicating two significant peaks in surface probability. The resulting visibility probability function is plotted directly below it.

3) Occlusion Probability: Substituting (14) back into (9), a simplified expression of a segment's occlusion probability is obtained

P(Q_s^ℓ) = 1 − e^{−∫_s^{s+ℓ} α(t) dt}.    (15)

4) Relationship With Discrete Voxel Probabilities: The key theoretical difference between the discrete voxel probabilities P(X) of existing methods and the preceding formulation of occlusion density is the interpretation of the probability values. Because existing methods effectively model the probability that a voxel boundary is occluding (whether they are defined as such or not), the path length of the viewing ray through the voxel is irrelevant. By contrast, the occlusion probabilities P(Q_s^ℓ) represent the probability that the viewing ray is occluded at any point on the interval [s, s + ℓ]. The path length becomes important when one moves away from high-resolution regular voxel grids to variable-resolution models because its value may vary greatly depending on the size of the voxel and the geometry of the ray–voxel intersection.

IV. IMPLEMENTATION: OCTREE REPRESENTATION

In order to make practical use of the probabilistic model described in Section III, a finite-sized representation which is able to associate both an occlusion density and appearance model with each point in the working volume is needed. Details are presented in this section of an octree-based implementation which approximates the underlying occlusion density and appearance functions as piecewise constant.

Most real-world scenes contain large slowly varying regions of low occlusion probability in areas of "open space" and high quickly varying occlusion probability near "surfacelike" objects. It therefore makes sense to sample α(x) in a nonuniform fashion. The proposed implementation approximates both α(x) and the appearance model as being piecewise constant, with each region of constant value represented by a cell of an adaptively refined octree. The implementation details of the underlying octree data structure itself are beyond the scope of this paper; the reader is referred to Samet's comprehensive treatment [26]. Fig. 4 shows a viewing ray passing through a volume which has been adaptively subdivided using an octree data structure and the finite-length ray segments that result from the intersections with the individual octree cells.

Each cell in the octree stores a single occlusion density value α and appearance distribution pA(i). The appearance distribution represents the probability density function of the pixel intensity value resulting from the imaging of the cell. The occlusion density value and appearance distribution are assumed to be constant within the cell. Note that this piecewise-constant assumption can be made arbitrarily accurate since, in theory, any cell in which the approximation is not acceptable can always be subdivided into smaller cells. In practice, however, the amount of useful resolution in the model is limited by the resolution of the input data used to construct it.
V. RECONSTRUCTION FROM IMAGERY

Pollard et al. [2] estimate the probability that each voxel X of the model belongs to the set S of "surface voxels." This "surface probability" is denoted as P(X ∈ S) or simply P(X) for convenience. The voxel surface probabilities are initialized with a predetermined prior and updated with each new observation using an online Bayesian update equation (16). The update equation determines the posterior surface probabilities of each of a series of voxels along a camera ray, given their prior probabilities P(X) and an observed image D

P(X|D) = P(X) P(D|X)/P(D).    (16)

The marginal (P(D)) and conditional (P(D|X)) probabilities of observing D can be expanded as a sum of probabilities along the viewing ray. In practice, a single camera ray is traversed for each pixel in image t, and all pixel values are assumed to be independent.

Rather than a camera ray intersecting a series of voxels, the equation can be generalized to a series of N intervals along a ray parameterized by s, the distance from the camera center. The ith interval is the result of the intersection of the viewing ray with the ith octree cell along the ray, and it has length ℓᵢ. (See Fig. 4.) The interval lengths resulting from the voxel–ray intersections are irrelevant in the formulation of Pollard et al. because the occlusion probabilities are fixed and equal to the discrete voxel surface probabilities P(Xᵢ). The surface probability of the ith voxel is replaced by P(Q_{sᵢ}^{ℓᵢ}), the occlusion

The term p_{Aᵢ}(c_D) is the probability density of the viewing ray's corresponding pixel intensity value c_D given by the appearance model of the ith octree cell along the ray. Equation (19) can then be generalized as

vis_∞ ≡ ∏_{i=0}^{N−1} (1 − P(Q_{sᵢ}^{ℓᵢ}))    (21)

P(Q_{sᵢ}^{ℓᵢ} | D) = P(Q_{sᵢ}^{ℓᵢ}) (preᵢ + vis(sᵢ) p_{Aᵢ}(c_D)) / (pre_∞ + vis_∞ p_{A_∞}(c_D))    (22)

where pre_∞ represents the total probability of the observation based on all voxels along the ray. The probability of the ray passing unoccluded through the model is represented by vis_∞ and is computed based on (21). The term p_{A_∞}(c_D) represents the probability density of the observed intensity given that the ray passes unoccluded through the volume and can be thought of as a "background" appearance model. In practice, portions of the scene not contained in the volume of interest may be visible in the image. In this case, the background appearance model represents the predicted intensity distribution of these points and is nominally set to a uniform distribution. Note that the denominator of (22) differs from the update equation of Pollard et al. [2] due to consolidation of the "pre" and "post" terms into pre_∞ and the addition of the "background" term vis_∞ p_{A_∞}(c_D).

Equation (22) provides a method for computing the posterior occlusion density of a viewing ray segment but must be related back to the cell's occlusion density value to be of use. This is
Fig. 5. Four representative frames from the “downtown” video sequence filmed over Providence, RI. Images courtesy of Brown University. Because of the short
duration of the sequence, moving objects such as vehicles dominate the changes from image to image.
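The per-segment Bayesian update of (22) can be sketched as follows. This is our own hedged illustration, not the authors' code: the cell appearance models are stubbed as Gaussians, all numbers are invented, and the "pre" accumulator follows the textual description of pre_∞ as the total appearance probability contributed by the segments along the ray (its formal definition lies in equations not reproduced in this excerpt).

```python
import math

def gauss(c, mean, sigma=0.1):
    """Stand-in per-cell appearance density pA_i(c); a real model would be
    learned from the imagery."""
    return math.exp(-0.5 * ((c - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

# One (alpha, length, appearance mean) triple per ray/cell intersection.
cells = [(0.02, 1.0, 0.2), (4.0, 0.3, 0.8), (0.02, 2.0, 0.5)]
c_D = 0.8          # observed pixel intensity for this viewing ray
pA_bg = 1.0        # uniform "background" model pA_inf(c_D)

# Forward pass: per-segment occlusion (15), visibility, and "pre" sums.
vis, pre, segs = 1.0, 0.0, []
for alpha, length, mean in cells:
    p_occ = 1.0 - math.exp(-alpha * length)
    segs.append((p_occ, vis, pre, gauss(c_D, mean)))
    pre += vis * p_occ * gauss(c_D, mean)     # assumed form of the pre_i terms
    vis *= 1.0 - p_occ
vis_inf, pre_inf = vis, pre                   # cf. (21)

# Posterior segment occlusion probabilities, (22).
posterior = [p_occ * (pre_i + vis_i * pA_i) / (pre_inf + vis_inf * pA_bg)
             for (p_occ, vis_i, pre_i, pA_i) in segs]
priors = [p for (p, _, _, _) in segs]
print(priors[1] < posterior[1])   # True: observation matches this cell's appearance
```

As expected, the segment whose appearance model agrees with the observed intensity has its occlusion probability raised by the update, while the others are lowered.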
The previous occlusion density value of each cell is replaced with the ᾱ value computed using all K viewing rays of the given image which pass through it.

The occlusion densities and appearance models of the refined cells are initialized to the value of their parent cell. This process is executed after each update to the model.
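The refinement step just described can be sketched as follows (a minimal illustration of our own, with invented class and field names): a subdivided cell's eight children simply inherit the parent's occlusion density and appearance model, which leaves occlusion probabilities along any viewing ray unchanged because α is a property of points rather than of cells.

```python
class Cell:
    """Hypothetical octree cell holding the two quantities stored per cell."""
    def __init__(self, alpha, appearance, depth=0):
        self.alpha = alpha            # occlusion density (per unit length)
        self.appearance = appearance  # e.g., parameters of pA(i)
        self.depth = depth
        self.children = None

    def refine(self):
        """Split into 8 octants, initializing each from the parent."""
        self.children = [Cell(self.alpha, self.appearance, self.depth + 1)
                         for _ in range(8)]
        return self.children

root = Cell(alpha=0.7, appearance=(0.5, 0.1))
kids = root.refine()
print(all(c.alpha == root.alpha for c in kids))        # True
print(all(c.appearance == root.appearance for c in kids))  # True
```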
Fig. 6. (Top) Ground truth segmentation used to evaluate the change detection
algorithms. Original image courtesy of Brown University. (Bottom) ROC
curves for change detection using the proposed 3-D model and a fixed grid
voxel model.
Fig. 8. Expected images generated from the viewpoint of the image used
to evaluate change detection. Because there is a low probability of a mov-
ing vehicle being at a particular position on the roads, they appear empty.
(Top) Image generated using the proposed 3-D model. (Bottom) Image gen-
erated using a fixed grid voxel model. The proposed variable-resolution model
allows for finer details to be represented, as is visible in the expanded crops.
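A hedged sketch of the per-pixel computation behind such expected images and the change masks: (27) expresses the predicted intensity density as a visibility-weighted mixture over the cells along the pixel's ray, and thresholding that density at the observed intensity yields the binary change decision. The cell values and the Gaussian appearance stub below are invented for illustration.

```python
import math

def gauss(c, mean, sigma=0.1):
    # Stand-in cell appearance density pA_i(c); illustrative only.
    return math.exp(-0.5 * ((c - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

# (alpha, length, appearance mean) per ray/cell intersection (invented values).
cells = [(0.02, 1.0, 0.3), (6.0, 0.2, 0.75), (0.02, 2.0, 0.5)]

def pixel_density(c):
    """Predicted intensity density in the form of (27):
    pA(c) = sum_i vis(s_i) * (1 - exp(-alpha_i * l_i)) * pA_i(c)."""
    vis, total = 1.0, 0.0
    for alpha, length, mean in cells:
        total += vis * (1.0 - math.exp(-alpha * length)) * gauss(c, mean)
        vis *= math.exp(-alpha * length)
    return total

# Change detection: observed intensities with low predicted density are
# "unexpected" and flagged as change (threshold tau chosen arbitrarily).
tau = 0.5
is_change = lambda c_obs: pixel_density(c_obs) < tau
print(is_change(0.75))   # False: matches the modeled surface appearance
print(is_change(0.05))   # True: unexpected intensity
```

Sweeping tau over its range and comparing the resulting masks to ground truth is what produces the ROC curves of Figs. 6, 11, and 12.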
Fig. 9. (Left) Training image of the Haifa Street region from 2002. (Middle) Ground truth changes marked in one of the test images from 2007. Most of the
changes are a result of moving vehicles on the roads. (Right) Changes detected using the proposed algorithm. Original image copyright Digital Globe.
Fig. 10. (Left) Training image of the region surrounding the construction of the U.S. embassy from 2002. (Middle) Ground truth changes marked in one of the
test images from 2007. There is a variety of changes resulting in the placement of large storage containers and other construction apparatus. (Right) Changes
detected using the proposed algorithm. Original image copyright Digital Globe.
be generated by integrating the appearance information of each voxel along the pixel's corresponding viewing ray R

p_A(c) = Σ_{i∈R} vis(sᵢ)(1 − e^{−αᵢℓᵢ}) p_{Aᵢ}(c).    (27)

Pixel values with a low probability density are "unexpected." A binary "change mask" can therefore be created by thresholding the image of pixel probability densities. The receiver operating characteristic (ROC) curves in Figs. 6, 11, and 12 show the rate of ground-truth change pixels correctly marked as change versus the rate of pixels incorrectly identified as change. The plotted curves indicate how these values vary as the threshold τ is varied.

Two distinct collection types are investigated: full motion video collected from an aerial platform and satellite imagery. The full motion video was collected using a high-definition (1280 × 720 pixels) video camera from a helicopter flying over Providence, RI, U.S. A few representative frames are shown in Fig. 5. The model was updated using 175 frames of a sequence (in random order) in which the helicopter made a full 360° loop around a few blocks of the downtown area, and the change detection algorithm was then run on an image (not used in the training set) that contains some moving vehicles (Fig. 7).

The ROC curve in Fig. 6 demonstrates the advantage that higher resolution 3-D modeling provides to the change detection algorithm. In order to better visualize the higher resolution capability of the proposed model, a synthetic image can be generated by computing the expected intensity value E[p_A(c)] of each pixel based on the computed probability density distributions of each pixel. Fig. 8 shows expected images generated using the proposed model and the fixed grid model. Small features such as individual building windows and rooftop air-conditioning units are visible using the proposed model but blend into the background using the fixed grid model. The resolution of the fixed grid model can, in theory, always be increased to match the capabilities of the variable-resolution model, but doing so quickly becomes impractical due to the O(n³) storage requirements. The fixed grid model of the "downtown" sequence is nearly 50% larger than the variable-resolution model while providing half the effective resolution. A fixed grid model equaling the effective resolution of the variable-resolution model would require approximately 18 GB (1000% larger than the variable-resolution model).

The satellite imagery used for experimentation was collected by Digital Globe's Quickbird satellite over Baghdad, Iraq, from 2002 to 2007 and comprises the same data sets used by Pollard et al. [2] in their change detection experiments. Two areas of interest are focused on: a region with some high-rise buildings along Haifa Street and the region surrounding the construction site of the new U.S. embassy building. A manually determined translation is applied to bring the supplied camera model to within approximately one pixel of reprojection error. The images have a nominal resolution of approximately 0.7-m GSD. Although the 3-D structure is less pronounced than in the video sequence, it is still sufficient to pose a challenge to 2-D change detection algorithms, as shown by Pollard et al. The "haifa" and "embassy" models were updated with each of
Fig. 11. Change detection ROC curves for the "haifa" data set after (a) one pass and (b) five passes of the 28 training images. The additional passes allow the octrees in the variable-resolution model time to reach their optimal subdivision levels.
Fig. 12. Change detection ROC curves for the "embassy" data set after (a) one pass and (b) five passes of the 25 training images. After the additional passes, the variable-resolution model has again reached the effective resolution of the fixed grid model.
Joseph Mundy received the B.S. and Ph.D. degrees in electrical engineering from Rensselaer Polytechnic Institute, Troy, NY, in 1963 and 1969, respectively.

He joined General Electric Global Research in 1963. In his early career at GE, he carried out research in solid state physics and integrated circuit devices. In the early 1970s, he formed a research group on computer vision with emphasis on industrial inspection. His group developed a number of inspection systems for GE's manufacturing divisions, including a system for the inspection of lamp filaments that exploited syntactic methods in pattern recognition. During the 1980s, his group moved toward more basic research in object recognition and geometric reasoning. In 1988, he was named a Coolidge Fellow, which awarded him a sabbatical at Oxford University, Oxford, U.K. At Oxford, he collaborated on the development of theory and application of geometric invariants. In 2002, he retired from GE Global Research and joined the School of Engineering, Brown University, Providence, RI. At Brown University, his research is in the area of video analysis and probabilistic computing.

Gabriel Taubin (M'86–F'01) received the Licenciado en Ciencias Matemáticas degree from the University of Buenos Aires, Buenos Aires, Argentina, and the Ph.D. degree in electrical engineering from Brown University, Providence, RI.

In 1990, he joined IBM, where during a 13-year career in the Research Division, he held various positions, including Research Staff Member and Research Manager. In 2003, he joined the School of Engineering, Brown University, as an Associate Professor of Engineering and Computer Science. While on sabbatical from IBM during the 2000–2001 academic year, he was appointed Visiting Professor of Electrical Engineering at the California Institute of Technology, Pasadena. While on sabbatical from Brown during the spring semester of 2010, he was appointed Visiting Associate Professor of Media Arts and Sciences at the Massachusetts Institute of Technology, Cambridge. He serves as a member of the editorial board of the Geometric Models journal. He has made significant theoretical and practical contributions to the field now called Digital Geometry Processing: to 3-D shape capturing and surface reconstruction and to geometric modeling, geometry compression, progressive transmission, signal processing, and display of discrete surfaces. The 3-D geometry compression technology that he developed with his group was incorporated into the MPEG-4 standard and became an integral part of IBM products.

Prof. Taubin is the current Editor-in-Chief of IEEE Computer Graphics and Applications and has served as Associate Editor of the IEEE Transactions on Visualization and Computer Graphics. He was named IEEE Fellow for his contributions to the development of 3-D geometry compression technology and multimedia standards, won the Eurographics 2002 Günter Enderle Best Paper Award, and was named IBM Master Inventor.