Generalized-ICP: Aleksandr V. Segal Dirk Haehnel Sebastian Thrun
Abstract: In this paper we combine the Iterative Closest Point ('ICP') and 'point-to-plane ICP' algorithms into a single probabilistic framework. We then use this framework to model locally planar surface structure from both scans instead of just the "model" scan as is typically done with the point-to-plane method. This can be thought of as 'plane-to-plane'. The new approach is tested with both simulated and real-world data and is shown to outperform both standard ICP and point-to-plane. Furthermore, the new approach is shown to be more robust to incorrect correspondences, and thus makes it easier to tune the maximum match distance parameter present in most variants of ICP. In addition to the demonstrated performance improvement, the proposed model allows for more expressive probabilistic models to be incorporated into the ICP framework. While maintaining the speed and simplicity of ICP, Generalized-ICP could also allow for the addition of outlier terms, measurement noise, and other probabilistic techniques to increase robustness.

I. INTRODUCTION

Over the last decade, range images have grown in popularity and found increasing applications in fields including medical imaging, object modeling, and robotics. Because of occlusion and limited sensor range, most of these applications require accurate methods of combining multiple range images into a single model. Particularly in mobile robotics, the availability of range sensors capable of quickly capturing an entire 3D scene has drastically improved the state of the art. A striking illustration of this is the fact that virtually all competitors in the DARPA Grand Challenge relied on fast-scanning laser range finders as the primary input method for obstacle avoidance, motion planning, and mapping. Although GPS and IMUs are often used to calculate approximate displacements, they are not accurate enough to reliably produce precise positioning. In addition, there are many situations (tunnels, parking garages, tall buildings) which obstruct GPS reception and further decrease accuracy. To deal with this shortcoming, most applications rely on scan-matching of range data to refine the localization. Despite such wide usage, the typical approach to solving the scan-matching problem has remained largely unchanged since its introduction.

II. SCANMATCHING

Originally applied to scan-matching in the early 90s, the ICP technique has had many variations proposed over the past decade and a half. Three papers published around the same time period outline what is still considered the state of the art solution for scan-matching. The most often cited analysis of the algorithm comes from Besl and McKay [1]. [1] directly addresses registration of 3D shapes described either geometrically or with point clouds. Chen and Medioni [7] considered the more specific problem of aligning range data for object modeling. Their approach takes advantage of the tendency of most range data to be locally planar and introduces the "point-to-plane" variant of ICP. Zhang [5] almost simultaneously describes ICP, but adds a robust method of outlier rejection in the correspondence selection phase of the algorithm.

Two more modern alternatives are Iterative Dual Correspondence (IDC) [15] and Metric-Based ICP (MbICP) [16]. IDC improves the point-matching process by maintaining two sets of correspondences. MbICP is designed to improve convergence with large initial orientation errors by explicitly including a measure of rotational error in the distance metric to be minimized.

The primary advantages of most ICP-based methods are simplicity and relatively quick performance when implemented with kd-trees for closest-point look up. The drawbacks include the implicit assumption of full overlap of the shapes being matched and the theoretical requirement that the points are taken from a known geometric surface rather than measured [1]. The first assumption is violated by partially overlapped scans (taken from different locations). The second causes problems because different discretizations of the physical surface make it impossible to get exact overlap of the individual points even after convergence. Point-to-plane, as suggested in [7], solves the discretization problem by not penalizing offsets along a surface. The full overlap assumption is usually handled by setting a maximum distance threshold in the correspondence.

Aside from point-to-plane, most ICP variations use a closed form solution to iteratively compute the alignment from the correspondences. This is typically done with [10] or similar techniques based on cross-correlation of the two data sets. Recently, there has been interest in the use of generic non-linear optimization techniques instead of the more specific closed form approaches [9]. These techniques are advantageous in that they allow for more generic minimization functions rather than just the sum of Euclidean distances. [9] uses non-linear optimization with robust statistics to show a wider basin of convergence.

We argue that among these, the probabilistic techniques are some of the best motivated due to the large amount of theoretical work already in place to support them. [2] applies a probabilistic model by assuming the second scan is generated from the first through a random process. [4] applies ray tracing techniques to maximize the probability of alignment.
[8] builds a set of compatible correspondences, and then maximizes probability of alignment over this distribution. [17] introduces a fully probabilistic framework which takes into account a motion model and allows estimates of registration uncertainty. An interesting aspect of the approach is that a sampled analog of the Generalized Hough Transform is used to compute alignment without explicit correspondences, taking both surface normals into account for 2D data sets.

There is also a large amount of literature devoted to solving the global alignment problem with multiple scans ([18] and many others). Many approaches to this ([18] in particular) use a pairwise matching algorithm as a basic component. This makes improvements in pairwise matching applicable to the global alignment problem as well.

Our approach falls somewhere between standard ICP and the fully probabilistic models. It is based on using MLE as the non-linear optimization step, and computing discrete correspondences using kd-trees. It is unique in that it provides symmetry and incorporates the structural assumptions of [7]. Because closest point look up is done with Euclidean distance, however, kd-trees can be used to achieve fast performance on large pointclouds. This is typically not possible with fully probabilistic methods as these require computing a MAP estimate over assignments. In contrast to [8], we argue that the data should be assumed to be locally planar since most environments sampled for range data are piecewise smooth surfaces. By giving the minimization process a probabilistic interpretation, we show that it is easy to extend the technique to include structural information from both scans, rather than just one as is typically done in "point-to-plane" ICP. We show that introducing this symmetry improves accuracy and decreases dependence on parameters.

Unlike the IDC [15] and MbICP [16] algorithms, our approach is designed to deal with large 3D pointclouds. Even more fundamentally, both of these approaches are somewhat orthogonal to our technique. Although MbICP suggests an alternative distance metric (as do we), our metric aims to take into account structure rather than orientation. Since our technique does not rely on any particular type (or number) of correspondences, it would likely be improved by incorporating a secondary set of correspondences as in IDC.

A key difference between our approach and [17] is the computational complexity involved. [17] is designed to deal with planar scan data; the Generalized Hough Transform suggested there requires comparing every point in one scan with every point in the other (or a proportional number of comparisons in the case of sampling). Our approach works with kd-trees for closest point look up and thus requires O(n log(n)) explicit point comparisons. It is not clear how to efficiently generalize the approach in [17] to the datasets considered in this paper. Furthermore, there are philosophical differences in the models.

This paper proceeds by summarizing the ICP and point-to-plane algorithms, and then introducing Generalized-ICP as a natural extension of these two standard approaches. Experimental results are then presented which highlight the advantages of Generalized-ICP.

A. ICP

The key concept of the standard ICP algorithm can be summarized in two steps:

1) compute correspondences between the two scans.
2) compute a transformation which minimizes distance between corresponding points.

Iteratively repeating these two steps typically results in convergence to the desired transformation. Because we are violating the assumption of full overlap, we are forced to add a maximum matching threshold $d_{\max}$. This threshold accounts for the fact that some points will not have any correspondence in the second scan (e.g. points which are outside the boundary of scan A). In most implementations of ICP, the choice of $d_{\max}$ represents a trade-off between convergence and accuracy. A low value results in bad convergence (the algorithm becomes "short sighted"); a large value causes incorrect correspondences to pull the final alignment away from the correct value. Standard ICP is listed as Alg. 1.

Algorithm 1: Standard ICP

    input : Two pointclouds: A = {a_i}, B = {b_i}
            An initial transformation: T_0
    output: The correct transformation, T, which aligns A and B
     1  T ← T_0;
     2  while not converged do
     3      for i ← 1 to N do
     4          m_i ← FindClosestPointInA(T · b_i);
     5          if ||m_i − T · b_i|| ≤ d_max then
     6              w_i ← 1;
     7          else
     8              w_i ← 0;
     9          end
    10      end
    11      T ← argmin_T { Σ_i w_i ||T · b_i − m_i||² };
    12  end

B. Point-to-plane

The point-to-plane variant of ICP improves performance by taking advantage of surface normal information. Originally introduced by Chen and Medioni [7], the technique has come into widespread use as a more robust and accurate variant of standard ICP when presented with 2.5D range data. Instead of minimizing $\sum_i \| T \cdot b_i - m_i \|^2$, the point-to-plane algorithm minimizes error along the surface normal (i.e. the projection of $(T \cdot b_i - m_i)$ onto the sub-space spanned by the surface normal). This improvement is implemented by changing line 11 of Alg. 1 as follows:

$$T \leftarrow \operatorname*{argmin}_{T} \Big\{ \sum_i w_i \, \| \eta_i \cdot (T \cdot b_i - m_i) \|^2 \Big\}$$

where $\eta_i$ is the surface normal at $m_i$.
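As a rough illustration of the loop in Alg. 1, the following Python sketch runs point-to-point ICP on 2D points. It is not the authors' implementation: the restriction to 2D (which admits a simple closed-form rotation), the brute-force search standing in for a kd-tree, the `(theta, tx, ty)` encoding of T, the convergence test, and all function names are assumptions made for brevity. The last function only evaluates the point-to-plane objective for a given T (in 2D the "plane" is a line); it does not minimize it.

```python
import math


def apply(T, p):
    """Apply a 2D rigid transform T = (theta, tx, ty) to point p."""
    theta, tx, ty = T
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1] + tx, s * p[0] + c * p[1] + ty)


def closest_point(A, q):
    """Brute-force stand-in for FindClosestPointInA (a kd-tree in practice)."""
    return min(A, key=lambda a: math.dist(a, q))


def best_rigid_2d(pairs):
    """Closed-form T minimizing sum ||T*b - m||^2 over (m, b) pairs (line 11)."""
    n = len(pairs)
    mx = sum(m[0] for m, _ in pairs) / n
    my = sum(m[1] for m, _ in pairs) / n
    bx = sum(b[0] for _, b in pairs) / n
    by = sum(b[1] for _, b in pairs) / n
    dot = cross = 0.0
    for m, b in pairs:
        ux, uy = b[0] - bx, b[1] - by  # centered point from B
        vx, vy = m[0] - mx, m[1] - my  # centered match from A
        dot += ux * vx + uy * vy
        cross += ux * vy - uy * vx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    return (theta, mx - (c * bx - s * by), my - (s * bx + c * by))


def icp_2d(A, B, T0=(0.0, 0.0, 0.0), d_max=1.0, max_iters=50, tol=1e-9):
    """Alternate matching and minimization, dropping matches beyond d_max."""
    T = T0
    for _ in range(max_iters):
        pairs = []
        for b in B:
            q = apply(T, b)
            m = closest_point(A, q)
            if math.dist(m, q) <= d_max:  # w_i = 1; otherwise w_i = 0
                pairs.append((m, b))
        if len(pairs) < 2:  # too few matches to estimate a transform
            break
        T_new = best_rigid_2d(pairs)
        converged = max(abs(x - y) for x, y in zip(T_new, T)) < tol
        T = T_new
        if converged:
            break
    return T


def point_to_plane_error(T, matches):
    """Sum of squared residuals along the unit normal eta_i at m_i.

    matches is a list of (m, b, eta) triples.
    """
    total = 0.0
    for m, b, eta in matches:
        q = apply(T, b)
        total += (eta[0] * (q[0] - m[0]) + eta[1] * (q[1] - m[1])) ** 2
    return total
```

Calling `icp_2d(A, B, d_max=1.0)` on two lists of `(x, y)` tuples returns the estimated `(theta, tx, ty)`. Lowering `d_max` discards more candidate matches per iteration, which mirrors the trade-off described in Section II-A.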
III. GENERALIZED-ICP

A. Derivation

Generalized-ICP is based on attaching a probabilistic model to the minimization step on line 11 of Alg. 1. The technique keeps the rest of the algorithm unchanged so as to reduce complexity and maintain speed. Notably, correspondences are still computed with the standard Euclidean distance rather than a probabilistic measure. This is done to allow for the use of kd-trees in the look up of closest points and hence maintain the principal advantages of ICP over other fully probabilistic techniques: speed and simplicity.

Since only line 11 is relevant, we limit the scope of the derivation to this context. To simplify notation, we assume that the closest point look up has already been performed and that the two point clouds, $A = \{a_i\}_{i=1,\dots,N}$ and $B = \{b_i\}_{i=1,\dots,N}$, are indexed according to their correspondences (i.e. $a_i$ corresponds with $b_i$). For the purpose of this section, we also assume all correspondences with $\| m_i - T \cdot b_i \| > d_{\max}$ have been removed from the data.

In the probabilistic model we assume the existence of an underlying set of points, $\hat{A} = \{\hat{a}_i\}$ and $\hat{B} = \{\hat{b}_i\}$, which generate A and B according to $a_i \sim \mathcal{N}(\hat{a}_i, C_i^A)$ and $b_i \sim \mathcal{N}(\hat{b}_i, C_i^B)$. In this case, $\{C_i^A\}$ and $\{C_i^B\}$ are covariance matrices associated with the measured points. If we assume perfect correspondences (geometrically consistent with no errors due to occlusion or sampling), and the correct transformation, $T^*$, we know that

In this case, (2) becomes

$$T = \operatorname*{argmin}_{T} \sum_i {d_i^{(T)}}^{T} d_i^{(T)} = \operatorname*{argmin}_{T} \sum_i \| d_i^{(T)} \|^2 \tag{3}$$

which is exactly the standard ICP update formula.

With the Generalized-ICP framework in place, however, we have more freedom in modeling the situation; we are free to pick any set of covariances for $\{C_i^A\}$ and $\{C_i^B\}$. As a motivating example, we note that the point-to-plane algorithm can also be thought of probabilistically.

The update step in point-to-plane ICP is performed as:

$$T = \operatorname*{argmin}_{T} \Big\{ \sum_i \| P_i \cdot d_i \|^2 \Big\} \tag{4}$$

where $P_i$ is the projection onto the span of the surface normal at $b_i$. This minimizes the distance of $T \cdot a_i$ from the plane defined by $b_i$ and its surface normal. Since $P_i$ is an orthogonal projection matrix, $P_i = P_i^2 = P_i^T$. This means $\| P_i \cdot d_i \|^2$ can be reformulated as a quadratic form:

$$\| P_i \cdot d_i \|^2 = (P_i \cdot d_i)^T \cdot (P_i \cdot d_i) = d_i^T \cdot P_i \cdot d_i$$

Looking at (4) in this format, we get:

$$T = \operatorname*{argmin}_{T} \Big\{ \sum_i d_i^T \cdot P_i \cdot d_i \Big\} \tag{5}$$
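The identity that turns (4) into (5) can be checked numerically. The sketch below builds P = η ηᵀ for a unit normal η, which is one standard way to write the orthogonal projection onto the span of η, and compares ||P d||² with dᵀ P d. The example vectors and all helper names are hypothetical and not taken from the paper.

```python
def projection_onto_normal(n):
    """P = n n^T, the orthogonal projection onto span(n) for a unit vector n."""
    return [[ni * nj for nj in n] for ni in n]


def matvec(M, v):
    """Matrix-vector product for nested lists."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in M]


def matmul(M, N):
    """Matrix-matrix product for square nested lists."""
    size = len(M)
    return [[sum(M[r][k] * N[k][c] for k in range(size)) for c in range(size)]
            for r in range(size)]


def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


def is_orthogonal_projection(P, tol=1e-12):
    """Check P = P^2 = P^T entry by entry."""
    P2 = matmul(P, P)
    size = len(P)
    return all(abs(P[r][c] - P2[r][c]) < tol and abs(P[r][c] - P[c][r]) < tol
               for r in range(size) for c in range(size))


def quadratic_forms(n, d):
    """Return (||P d||^2, d^T P d); the two values should agree."""
    P = projection_onto_normal(n)
    Pd = matvec(P, d)
    return dot(Pd, Pd), dot(d, Pd)


normal = (0.0, 0.6, 0.8)     # hypothetical unit surface normal
residual = (1.0, 2.0, -0.5)  # hypothetical residual d_i
norm_sq, quad = quadratic_forms(normal, residual)
```

Both returned values also equal (η · d)², the squared distance along the normal, which is why offsets within the plane contribute nothing to the point-to-plane cost.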
Fig. 2. Simulated 3D environments: (a) indoor scene; (b) outdoor scene.

positioning. Although (standard) ICP itself was used in the pairwise matching to generate the ground truth, the spacing of scans used for the SLAM approach was an order of magnitude smaller. In contrast, the scan pairs used for testing were extracted with much higher spacing (15-20+ meters) in order to pose a much more challenging problem. This is not a perfect method to generate ground truth, but we believe it provides a reasonable baseline to make comparisons between the algorithms.

To measure performance, all algorithms were run on pairs of scans from each of the three data sets. For each scan pair, the initial offset was set to the true offset with a uniformly generated error term added. The error term was set within ±1.5 m and ±15° along all axes. Performance was measured by averaging positioning error over all scan pairs for a particular algorithm. In all cases tested, rotational error was negligible.

As mentioned before, selection of $d_{\max}$ plays an important role in the convergence of ICP. Fig. 5 shows the average error for different values of $d_{\max}$; the plot shows average performance across all scan pairs. Fig. 9 shows the averages for individual scan pairs based on ideal values of $d_{\max}$; it demonstrates the distribution of error across the range of scan pairs. In contrast to Fig. 5, the large number of random initial offsets averaged into each data point of Fig. 9 serves to sample the space of possible offsets. For Fig. 5, the algorithms were run on each scan pair with 10 randomly generated starting positions. For the plots in Fig. 9, each data point was generated with 50 random initial poses using best-case values for $d_{\max}$. In all cases, error bars were computed as $\sigma / \sqrt{N}$.
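The evaluation protocol described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the rotation error is represented as three per-axis angles in degrees, and σ is taken to be the sample standard deviation (the text does not say whether the sample or population form was used).

```python
import math
import random
import statistics


def sample_initial_error(rng, max_trans_m=1.5, max_rot_deg=15.0):
    """Draw a uniform error term: per-axis translation [m] and rotation [deg]."""
    translation = tuple(rng.uniform(-max_trans_m, max_trans_m) for _ in range(3))
    rotation = tuple(rng.uniform(-max_rot_deg, max_rot_deg) for _ in range(3))
    return translation, rotation


def mean_and_error_bar(errors):
    """Return the mean positioning error and the error bar sigma / sqrt(N)."""
    n = len(errors)
    sigma = statistics.stdev(errors)  # assumption: sample standard deviation
    return statistics.mean(errors), sigma / math.sqrt(n)


rng = random.Random(0)  # fixed seed so the perturbations are repeatable
perturbations = [sample_initial_error(rng) for _ in range(50)]
```

Each perturbation would be added to the true offset of a scan pair to form the initial guess, and `mean_and_error_bar` would then be applied to the resulting list of final positioning errors.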
Fig. 3. Ray-traced scans: (a) indoor scene; (b) outdoor scene. Scan A is shown in green, scan B in red.

The plots in Fig. 5 show that the proposed algorithm is more robust to choice of the matching threshold and demonstrates better performance in general. This is to be expected since it more completely models the environment and will automatically discount many incorrect matches based on the structure of the scene. In particular, Fig. 5 shows that in the simulated environments, the accuracy of the algorithm is not sensitive to overestimated values of $d_{\max}$. For the real data, Generalized-ICP is still shown to be less sensitive due to the smaller slope of average error as $d_{\max} \to \infty$. The discrepancy between simulated and real data can be explained by the difference in their respective frequency profiles. Whereas the simulated environments only have high-level features modeled
[Plots: average error [m] versus max. match distance [m] (0 to 5 m) for Generalized ICP, Point-to-plane ICP, and Standard ICP. Panel titles: "Simulated Hallway with 15,000 points, 1cm noise"; "Simulated Outdoor with 15,000 points, 1cm noise"; one further panel without a recoverable title.]

Fig. 8. Velodyne scan pairs #31 and #45 shown in perspective to illustrate scene complexity

[Plots: average error [m] versus scan pair #. Panel "Simulated Hallway with 15,000 points, 1cm noise" (scan pairs 1-6): Generalized ICP, dmax=5.0m; Point-to-plane ICP, dmax=1.6m; Standard ICP, dmax=1.4m. Panel "Simulated Outdoor with 15,000 points, 1cm noise" (scan pairs 2-14): Generalized ICP, dmax=1.6m; Point-to-plane ICP, dmax=1.0m; Standard ICP, dmax=1.8m. Two further panels (scan pairs 5-25 and 30-50): Generalized ICP, dmax=1.2m; Point-to-plane ICP, dmax=1.0m; Standard ICP, dmax=1.0m.]

Fig. 9. average error with ideal values of dmax which minimize Fig. 5
by hand, the real-world data contains much more detailed, high-frequency data. This increases the chances of incorrect correspondences which share a common surface orientation, a situation which is not taken into account by our algorithm. Nonetheless, even when comparing worst-case values of $d_{\max}$ for Generalized-ICP with best-case values for point-to-plane, Generalized-ICP performs roughly as well.

As mentioned in Section II, $d_{\max}$ plays an important role in the performance of ICP. Setting a low value decreases the chance of convergence, but increases accuracy. Setting a value which is too high increases the radius of convergence, but decreases accuracy since more incorrect correspondences are made. The algorithm proposed in this paper substantially reduces the penalty of picking a large value of $d_{\max}$ by discounting the effect of incorrect correspondences. This makes it easier to get good performance in a wide range of environments without hand-picking a value of $d_{\max}$ for each one.

In addition to the increased accuracy, the new algorithm gives equal consideration to both scans when computing the transformation. Fig. 6 and Fig. 7 show two situations where using the structure of both scans removed local minima which were present with point-to-plane. These represent top-down views of Velodyne scans recorded approximately 30 meters apart and aligned. Fig. 8 shows some additional views of the same scan pairs to better illustrate the structure of the scene. The scans cover a range of 70-100 meters from the sensor in an outdoor environment as seen from a car driving on the road.

Because this minimization is still performed within the ICP framework, the approach combines the speed and simplicity of the standard algorithm with some of the advantages of fully probabilistic techniques such as EM. The theoretical framework also allows standard robustness techniques to be incorporated. For example, the Gaussian kernel can be mixed with a uniform distribution to model outliers. The Gaussian RVs can also be replaced by a distribution which takes into account a certain amount of slack in the matching to explicitly model the inexact correspondences (by assigning the distribution of $d_i^{(T)}$ a constant density on some region around 0). Although we have considered some of these variations, none of them have an obvious closed form which is easily minimized. This makes them too complex to include in the current work, but a good topic for future research.

V. CONCLUSION

In this paper we have proposed a generalization of the ICP algorithm which takes into account the locally planar structure of both scans in a probabilistic model. Most of the ICP framework is left unmodified so as to maintain the speed and simplicity which make this class of algorithms popular in practice; the proposed generalization only deals with the iterative computation of the transformation. We assume all measured points are drawn from Gaussians centered at the true points which are assumed to be in perfect correspondence. MLE is then used to iteratively estimate the transformation for aligning the scans. In a range of both simulated and real-world experiments, Generalized-ICP was shown to increase accuracy. At the same time, the use of structural information from both scans decreased the influence of incorrect correspondences. Consequently the choice of maximum matching distance as a parameter for the correspondence phase becomes less critical to performance. These modifications maintain the simplicity and speed of ICP, while improving performance and removing the trade-off typically associated with parameter selection.

ACKNOWLEDGMENT

This research was supported in part under subcontract through Raytheon Sarcos LLC with DARPA as prime sponsor, contract HR0011-04-C-0147.

REFERENCES

[1] P. Besl, N. McKay. "A Method for Registration of 3-D Shapes," IEEE Trans. on Pattern Analysis and Machine Intel., vol. 14, no. 2, pp. 239-256, 1992.
[2] P. Biber, S. Fleck, W. Strasser. "A Probabilistic Framework for Robust and Accurate Matching of Point Clouds," Pattern Recognition, Lecture Notes in Computer Science, vol. 3175/2004, pp. 280-487, 2004.
[3] N. Gelfand, L. Ikemoto, S. Rusinkiewicz, M. Levoy. "Geometrically Stable Sampling for the ICP Algorithm," Fourth International Conference on 3-D Digital Imaging and Modeling, p. 260, 2003.
[4] D. Haehnel, W. Burgard. "Probabilistic Matching for 3D Scan Registration," Proc. of the VDI-Conference Robotik, 2002.
[5] Z. Zhang. "Iterative Point Matching for Registration of Free-Form Curves," INRIA Rapports de Recherche, Programme 4: Robotique, Image et Vision, no. 1658, 1992.
[6] D. Hahnel, W. Burgard, S. Thrun. "Learning compact 3D models of indoor and outdoor environments with a mobile robot," Robotics and Autonomous Systems, vol. 44, pp. 15-27, 2003.
[7] Y. Chen, G. Medioni. "Object Modeling by Registration of Multiple Range Images," Proc. of the 1992 IEEE Intl. Conf. on Robotics and Automation, pp. 2724-2729, 1991.
[8] L. Montesano, J. Minguez, L. Montano. "Probabilistic Scan Matching for Motion Estimation in Unstructured Environments," IEEE Intl. Conf. on Intelligent Robots and Systems, pp. 3499-3504, 2005.
[9] A. Fitzgibbon. "Robust registration of 2D and 3D point sets," Image and Vision Computing, vol. 21, no. 13-14, pp. 1145-1153, 2003.
[10] B. Horn. "Closed-form solution of absolute orientation using unit quaternions," Journal of the Optical Society of America A, vol. 4, pp. 629-642, 1987.
[11] S. Rusinkiewicz, M. Levoy. "Efficient Variants of the ICP Algorithm," Third International Conference on 3-D Digital Imaging and Modeling, p. 145, 2001.
[12] G. Dalley, P. Flynn. "Pair-Wise Range Image Registration: A Study in Outlier Classification," Computer Vision and Image Understanding, vol. 87, pp. 104-115, 2002.
[13] S. Kim, Y. Hwang, H. Hong, M. Choi. "An Improved ICP Algorithm Based on the Sensor Projection for Automatic 3D Registration," Lecture Notes in Computer Science, vol. 2972/2004, pp. 642-651, 2004.
[14] J.-S. Gutmann, C. Schlegel. "AMOS: comparison of scan matching approaches for self-localization in indoor environments," 1st Euromicro Workshop on Advanced Mobile Robots (EUROBOT), p. 61, 1996.
[15] F. Lu, E. Milios. "Robot Pose Estimation in Unknown Environments by Matching 2D Range Scans," Journal of Intelligent Robotics Systems, vol. 18, pp. 249-275, 1997.
[16] J. Minguez, F. Lamiraux, L. Montesano. "Metric-Based Scan Matching Algorithms for Mobile Robot Displacement Estimation," Robotics and Automation, Proceedings of the 2005 IEEE International Conference on, pp. 3557-3563, 2005.
[17] A. Censi. "Scan matching in a probabilistic framework," Robotics and Automation, Proceedings of the 2006 IEEE International Conference on, pp. 2291-2296, 2006.
[18] K. Pulli. "Multiview Registration for Large Data Sets," 3-D Digital Imaging and Modeling, 1999. Proceedings. Second International Conference on, pp. 160-168, 1999.