ISPRS Workshop on Laser Scanning 2007 and SilviLaser 2007, Espoo, September 12-14, 2007, Finland
EXTENDING GENERALIZED HOUGH TRANSFORM TO DETECT 3D OBJECTS IN
LASER RANGE DATA
Kourosh Khoshelham
Optical and Laser Remote Sensing Research Group, Delft University of Technology, Kluyverweg 1, 2629 HS Delft, The
Netherlands
[email protected]
KEY WORDS: Laser scanning, Point cloud, Object Recognition, 3D Generalized Hough Transform, Automation
ABSTRACT:
Automated detection and 3D modelling of objects in laser range data is of great importance in many applications. Existing
approaches to object detection in range data are limited to either 2.5D data (e.g. range images) or simple objects with a parametric
form (e.g. spheres). This paper describes a new approach to the detection of 3D objects with arbitrary shapes in a point cloud. We
present an extension of the generalized Hough transform to 3D data, which can be used to detect instances of an object model in laser
range data, independent of the scale and orientation of the object. We also discuss the computational complexity of the method and
provide cost-reduction strategies that can be employed to improve the efficiency of the method.
1. INTRODUCTION

Automated extraction of objects from laser range data is of great importance in a wide range of applications. Reverse engineering, 3D visualisation, industrial design monitoring and environmental planning are a few examples of applications that require 3D models of objects extracted from images or laser range data. A 3D model provides an abstract description of the object, which can be processed and visualised more easily and efficiently. The process of object extraction consists of two main tasks. The first task is detection, in which the presence of an object in the data is verified and its approximate location is found (usually by labelling the data points that belong to the object). The second task is modelling, where the detected object is represented with a 3D geometric model that is most adequate in terms of such criteria as accuracy, compactness, the domain of the object and the application requirements. The detection step plays a key role in the successful modelling of the object. If the object is properly detected in the data, the modelling can be carried out more reliably and accurately.

Existing approaches to the detection of objects in range data can be divided into two major categories: data-driven approaches and model-driven approaches. Data-driven approaches are mainly based on segmentation (Khoshelham, 2006; Rottensteiner and Briese, 2003; Sithole, 2005), clustering (Filin, 2002; Vosselman, 1999) and classification (Forlani et al., 2006; Oude Elberink and Maas, 2000). While these methods have been commonly applied to laser range data of 2.5D surfaces, their application to more complex 3D scenes is not always possible. For instance, in laser range data of industrial installations many objects are partially occluded, and data-driven methods fail to correctly detect these objects in the data. Model-driven approaches, on the contrary, are more robust in the presence of partial occlusion, since they incorporate some form of knowledge about the shape of the object. The object model can be represented, among other representations, as a set of voxel templates (Greenspan and Boulanger, 1999) or spin images (Johnson and Hebert, 1999), which are matched against the data, or as a set of parameters that mathematically define the object. In the latter case, the Hough transform (Duda and Hart, 1972; Hough, 1962) has been used to determine the model parameters as well as the data points that belong to the object (Olson, 2001).

The application of the Hough transform is restricted to simple objects that can be represented with few parameters, such as planes, spheres and cylinders. Vosselman et al. (2004) describe a Hough-based method for the detection of planes and spheres in a point cloud. Rabbani (2006) developed an extension of this method that can be used for the detection of cylinders. Figure 1 demonstrates the application of the Hough transform to the detection of cylinders in a point cloud. As can be seen, the curved parts joining the cylinders have not been extracted, because these parts cannot be expressed in parametric forms with few parameters.

This paper concentrates on the detection of 3D objects with arbitrary shapes in a point cloud. The objective of this paper is to develop a new extension of the Hough transform, which can be used to detect instances of a complex object model in laser range data, independent of the scale and orientation of the object.

The paper has five sections. Section 2 provides an overview of the standard and generalized Hough transform as applied to 2D images. In Section 3, the principles of the 3D generalized Hough transform are described. A discussion of the computational complexity of the method is presented in Section 4. Conclusions appear in Section 5.
IAPRS Volume XXXVI, Part 3 / W52, 2007
Figure 1. Detection of cylinders in a point cloud using Hough transform (from Rabbani (2006)). The curved parts
joining the cylinders cannot be extracted using this method.
2. AN OVERVIEW OF THE STANDARD AND GENERALIZED HOUGH TRANSFORM

The Hough transform is a well-known method for the detection of objects in 2D intensity images. The standard Hough transform is applicable to objects with an analytical shape, such as straight lines, circles and ellipses, whereas with the generalized Hough transform any arbitrary curve can be detected in a 2D image. The following sections briefly describe the standard and generalized Hough transform.

2.1 The standard Hough transform

The idea of the Hough transform for detecting straight lines in images was first introduced by Hough (1962). In the original Hough transform, a straight line is parameterized as y = mx + b with two parameters m and b. According to the number of parameters, a 2D parameter space is formed in which every point in the image space corresponds to a line b = -xm + y. A set of image points that lie on the same line y = mx + b in image space corresponds to a number of lines in the parameter space, which intersect at the point (m, b). Finding this intersection point is, therefore, the basis for line detection with the Hough transform. The parameter space is realized in the form of a discrete accumulator array consisting of a number of bins that receive votes from edge pixels in the image space. The intersection point is determined by finding the bin that receives the maximum number of votes.

In addition to straight lines, the Hough transform has also been used to detect other analytical shapes, such as circles and ellipses, in 2D images. The underlying principle for the detection of other analytical shapes is the same as for straight line detection, and is based on constructing a duality between edge pixels in the image and object parameters in the parameter space. The dimensions of the parameter space, however, vary with the parameterization of the object.

2.2 The generalized Hough transform

Ballard (1981) proposed a generalization of the Hough transform to detect non-parametric objects with arbitrary shapes in 2D intensity images. In the generalized Hough transform, the object model is stored in a so-called R-table format. An arbitrary reference point is selected for the object, and for every pixel on the object boundary the gradient direction as well as the length and direction of a vector connecting the boundary pixel to the reference point are computed (Figure 2). The gradient directions, φ, serve as indices in the R-table to look up the length, r, and direction, β, of the connecting vectors. Table 1 illustrates the general form of an R-table.

Table 1: R-table

Point  φ     r
0      0     (r, β)01, (r, β)02, (r, β)03, …
1      ∆φ    (r, β)11, (r, β)12, (r, β)13, …
2      2∆φ   (r, β)21, (r, β)22, (r, β)23, …
…      …     …

The reconstruction of the object model from the R-table is straightforward:

xp = xc - r cos(β)
yp = yc - r sin(β)                                    (1)

where (xc, yc) and (xp, yp) are respectively the coordinates of the reference point and of a point on the boundary of the object. For the detection of the object model in the image, however, the coordinates of the reference point are not known. A 2D accumulator array is, therefore, constructed with the two parameters of the reference point as the axes. At every image edge pixel the gradient direction is obtained and then looked up in the R-table. The corresponding sets of r and β values are used to evaluate Equation 1, and the resulting xc and yc values indicate the accumulator array bins that should receive a vote. Once this process is complete for all edge pixels, the bin with the maximum vote indicates the reference point, and the edge pixels that cast votes for this bin belong to an instance of the object in the image.

Figure 2: Parameters involved in the generalized Hough transform.
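To make the R-table construction and the voting procedure of Section 2.2 concrete, the following minimal Python sketch implements the 2D case for a fixed orientation and scale. The binning scheme, function names and data layout are our own illustrative choices, not from the paper:

```python
import math
from collections import defaultdict

def build_r_table(boundary, ref, n_bins=36):
    """R-table: binned gradient direction phi -> list of (r, beta) pairs."""
    xc, yc = ref
    width = 2 * math.pi / n_bins
    table = defaultdict(list)
    for xp, yp, phi in boundary:                 # phi: gradient direction at (xp, yp)
        r = math.hypot(xc - xp, yc - yp)         # length of the connecting vector
        beta = math.atan2(yc - yp, xc - xp)      # direction of the connecting vector
        table[int(phi // width)].append((r, beta))
    return table

def detect(edges, table, n_bins=36):
    """Vote for candidate reference points by inverting Equation 1."""
    width = 2 * math.pi / n_bins
    acc = defaultdict(int)
    for xp, yp, phi in edges:
        for r, beta in table.get(int(phi // width), []):
            xc = xp + r * math.cos(beta)         # xp = xc - r cos(beta), solved for xc
            yc = yp + r * math.sin(beta)
            acc[(round(xc), round(yc))] += 1
    return max(acc, key=acc.get) if acc else None
```

For a circular model with radially oriented gradients, detecting a translated copy recovers the translated reference point. Adding a rotation angle and a scale factor, as described in the text, would expand the accumulator from 2D to 4D.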
The generalized Hough transform can also be used to detect a rotated and scaled version of a model in an image. This is achieved by supplementing Equation 1 with a scale factor and a rotation angle, so that the parameter space is expanded to a 4D accumulator array. The peak of the accumulator array then determines the scale and rotation parameters in addition to the coordinates of the reference point, although at the price of a higher computational expense.

Figure 3: Parameters involved in the 3D GHT method.

2.3 Modifications to the Hough transform

Several modified variations of the Hough transform have been proposed to improve the performance of the method. Illingworth and Kittler (1988) provide a survey of these methods. Duda and Hart (1972) suggested a modification of the standard Hough transform by substituting the original slope-intercept parameterization of straight lines with a polar, angle-radius, parameterization. The polar parameterization leads to a bounded parameter space, unlike the original parameterization, and is, consequently, more computationally efficient. They also showed that the standard Hough transform can be used to detect more general curves in an image. The gradient-weighted Hough transform, which appears in Ballard's generalization, was first introduced by O'Gorman and Clowes (1976). The derivation of edge orientation information imposes very little computational cost, but greatly increases the efficiency of the method. Other methods that have been shown to improve the performance of the Hough transform include the adaptive Hough transform (Illingworth and Kittler, 1987), the hierarchical Hough transform (Princen et al., 1990), and the randomized Hough transform (Xu et al., 1990).
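The polar parameterization mentioned above can be illustrated with a short sketch: each point votes along rho = x·cos(theta) + y·sin(theta) in a bounded (theta, rho) accumulator. The discretization and names below are our own:

```python
import math
from collections import defaultdict

def hough_lines(points, n_theta=180, rho_step=1.0):
    """Standard Hough transform with the polar (angle-radius) parameterization."""
    acc = defaultdict(int)
    for x, y in points:
        for i in range(n_theta):                 # theta sampled in [0, pi)
            theta = math.pi * i / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            acc[(i, round(rho / rho_step))] += 1  # bounded space, unlike slope-intercept
    return max(acc, key=acc.get)                 # (theta index, rho bin) of the peak
```

Collinear points along the line y = x, for example, accumulate at theta = 135 degrees (the direction of the line's normal) and rho = 0.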
3. EXTENSION OF GENERALIZED HOUGH TRANSFORM TO 3D DATA

In this section we present an extension of the generalized Hough transform to 3D data. The method will be referred to as 3D GHT in the subsequent parts of the paper. The 3D GHT follows the same principle as the generalized Hough transform outlined in Section 2.2. The main difference is that the gradient vector is replaced with a surface normal vector. The normal vectors can be obtained by triangulating the surface of the object or by fitting planar surfaces to small sets of points in a local neighbourhood. Vectors connecting each triangle to an arbitrary reference point are stored in the R-table as a function of the normal vector coordinates. A normal vector is constrained to be of unit length and is, therefore, defined by two orientation angles, φ and ψ, as depicted in Figure 3. A connecting vector is defined by two orientation angles, α and β, as well as its length r. These parameters can be derived from the coordinates of the reference point and the object boundary point:

r = [(xp - xc)^2 + (yp - yc)^2 + (zp - zc)^2]^(1/2)
α = arccos((zc - zp) / r)                             (2)
β = arccos((xc - xp) / (r sin(α)))

This formulation results in a 2D R-table where all the connecting vectors, r, are stored in cells whose coordinates are the orientation angles of the normal vectors. Figure 4 demonstrates how such a 2D R-table is constructed.

Figure 4: Storing r vectors in a 2D R-table.

The reconstruction of the object model from the R-table is carried out by extending Equation 1 to 3D:

xp = xc - r sin(α) cos(β)
yp = yc - r sin(α) sin(β)                             (3)
zp = zc - r cos(α)

where α and β denote the orientation angles of the vector that connects a point p to the reference point c. For the detection of the 3D object model in a point cloud, the three coordinates of the reference point are unknown parameters. Thus, the equations given in (3) are rearranged so as to express the unknown parameters as functions of the known variables:

xc = xp + r sin(α) cos(β)
yc = yp + r sin(α) sin(β)                             (4)
zc = zp + r cos(α)

Having obtained the object model in the form of the R-table, an algorithm for the detection of instances of this model in a point cloud can be outlined as follows:

1. Construct a 3D accumulator array with the three parameters of the reference point as the axes;
2. Compute the normal vector for every point in the point cloud and look up the r vectors at the coordinates (φ, ψ) of the 2D R-table;
3. Evaluate Equation (4) with the corresponding sets of r, α and β values to obtain xc, yc and zc;
4. Cast a vote (an increment) to the accumulator array bin corresponding to each set of xc, yc and zc values;
5. Repeat the voting process for all the points in the point cloud;
6. The bin with the maximum vote indicates the reference point, and the 3D points that cast votes for this bin belong to an instance of the object in the point cloud.

In practice, the object appears in range data with an arbitrary rotation and scale. To account for the additional rotation and scale parameters, Equation (4) is modified as:

c = p + s Mz My Mx r                                  (5)

where c = (xc, yc, zc)^T, p = (xp, yp, zp)^T, r = (r sin(α) cos(β), r sin(α) sin(β), r cos(α))^T, s is a scale factor, and Mx, My and Mz are rotation matrices around the x, y and z axes respectively. The incorporation of a scale factor and three rotation parameters results in an expansion of the Hough space to seven dimensions. To evaluate Equation 5 and cast votes for the accumulator bins, a 4D space covering the entire range of scale factors and rotation angles must be exhausted. This implies that a crude application of the 3D GHT method to object detection can be very expensive. Therefore, cost-reduction strategies such as adaptive, hierarchical and randomized voting schemes are of great importance in the 3D GHT algorithm.

4. IMPLEMENTATION ASPECTS

The 3D GHT method as described in Section 3 is computationally expensive when the object appears in the data with an arbitrary scale and rotation with respect to the model. The development of a cost-reduction strategy is thus the main challenge in the application of the 3D GHT. In general, the execution time of the Hough transform is dominated more by the voting process than by the search for a peak in the accumulator. In the absence of arbitrary scale and rotation, the number of required operations in the voting process is O(M), where M is the number of points in the dataset. Thus, a desirable cost-reduction strategy must aim to reduce the number of points that are involved in the voting process. Randomized (Xu et al., 1990) and probabilistic (Kiryati et al., 1991) variations of the Hough transform work on the basis of a random selection of a small number of data points, and are, therefore, suitable options for controlling the computational cost of the voting process.

In the presence of arbitrary scale and rotation, a 4D subset of the parameter space covering the entire range of scale factors and rotation angles is exhausted during the voting process. Consequently, the number of operations required in the voting process is O(M·N^4), where N is the number of intervals along each axis of the accumulator array. Clearly, a desirable cost-reduction strategy in this case must concentrate on the N^4 factor. The adaptive Hough transform (Illingworth and Kittler, 1987) reduces the number of intervals along the axes, since it begins with a coarse-resolution parameter space and increases the resolution only in the vicinity of the peak. The randomized Hough transform (Xu et al., 1990) also provides an efficient strategy to reduce the number of bins that receive votes in the parameter space. In randomized voting, instead of working with one point at a time, a number of points sufficient for the computation of all parameters are selected from the data. Once all the parameters are computed, only one bin in the accumulator array receives a vote. In the case of a 3D object with seven parameters, a set of three points must be selected from the data each time. These points, along with their respective r vectors, form nine equations of the form given in Equation 5, which can be solved for the seven parameters. Thus, for each randomly selected set only one vote is cast for a bin in the 7D accumulator array.

5. CONCLUSIONS

In this paper we presented an extension of the generalized Hough transform to detect arbitrary 3D objects in laser range data. The procedure of storing a 3D model in a 2D R-table was demonstrated, and a method for the detection of instances of the model in a point cloud, based on a voting process, was described. It was discussed that the voting process can be computationally expensive in the case that the object appears in the data with an arbitrary scale and rotation with respect to the model. The employment of a voting process based on the randomized Hough transform was, therefore, suggested to reduce the computational cost of the method.

REFERENCES

Ballard, D.H., 1981. Generalizing the Hough transform to detect arbitrary shapes. Pattern Recognition, 13(2): 111-122.
Duda, R.O. and Hart, P.E., 1972. Use of the Hough transformation to detect lines and curves in pictures. Communications of the ACM, 15: 11-15.
Filin, S., 2002. Surface clustering from airborne laser scanning data, Proceedings of the Photogrammetric Computer Vision, ISPRS Commission III Symposium, Graz, Austria, pp. 119-124.
Forlani, G., Nardinocchi, C., Scaioni, M. and Zingaretti, P., 2006. Complete classification of raw LIDAR data and 3D reconstruction of buildings. Pattern Analysis & Applications, 8(4): 357-374.
Greenspan, M. and Boulanger, P., 1999. Efficient and reliable template set matching for 3D object recognition, 2nd International Conference on 3-D Digital Imaging and Modeling, Ottawa, Canada, pp. 230-239.
Hough, P.V.C., 1962. Methods and means for recognizing complex patterns, US patent 3069654.
Illingworth, J. and Kittler, J., 1987. The adaptive Hough transform. IEEE Transactions on Pattern Analysis and Machine Intelligence, 9(5): 690-698.
Illingworth, J. and Kittler, J., 1988. A survey of the Hough transform. Computer Vision, Graphics and Image Processing, 44: 87-116.
Johnson, A.E. and Hebert, M., 1999. Using spin images for efficient object recognition in cluttered 3D scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(5): 433-449.
Khoshelham, K., 2006. Automated 3D modelling of buildings in suburban areas based on integration of image and height data, International Workshop on 3D Geoinformation (3DGeoInfo '06), Kuala Lumpur, Malaysia, pp. 381-393.
Kiryati, N., Eldar, Y. and Bruckstein, A.M., 1991. A probabilistic Hough transform. Pattern Recognition, 24(4): 303-316.
O'Gorman, F. and Clowes, M.B., 1976. Finding picture edges through collinearity of feature points. IEEE Transactions on Computers, 25(4): 449-456.
Olson, C.F., 2001. A general method for geometric feature matching and model extraction. International Journal of Computer Vision, 45(1): 39-54.
Oude Elberink, S. and Maas, H.G., 2000. The use of anisotropic height texture measures for the segmentation of airborne laser scanner data, International Archives of Photogrammetry and Remote Sensing, 33 (Part B3/2), pp. 678-684.
Princen, J., Illingworth, J. and Kittler, J., 1990. A hierarchical approach to line extraction based on the Hough transform. Computer Vision, Graphics and Image Processing, 52(1): 57-77.
Rabbani Shah, T., 2006. Automatic reconstruction of industrial installations using point clouds and images. PhD Thesis, Delft University of Technology, Delft, 154 pp.
Rottensteiner, F. and Briese, C., 2003. Automatic generation of building models from Lidar data and the integration of aerial images, International Archives of Photogrammetry and Remote Sensing, Volume XXXIV/3-W13, Dresden, pp. 170-184.
Sithole, G., 2005. Segmentation and classification of airborne laser scanner data. PhD Thesis, Delft University of Technology, Delft, 185 pp.
Vosselman, G., 1999. Building reconstruction using planar faces in very high density height data, ISPRS Conference on Automatic Extraction of GIS Objects from Digital Imagery, Munich, pp. 87-92.
Vosselman, G., Gorte, B.G., Sithole, G. and Rabbani Shah, T., 2004. Recognising structure in laser scanner point clouds, International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol. 46, Part 8/W2, Freiburg, Germany, pp. 33-38.
Xu, L., Oja, E. and Kultanen, P., 1990. A new curve detection method: randomized Hough transform (RHT). Pattern Recognition Letters, 11(5): 331-338.
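As a concluding illustration, the 3D GHT voting procedure outlined in Section 3 can be sketched for the translation-only case, i.e. known scale and orientation. The angular binning of the normals and all names are our own choices; the connecting vectors are stored in Cartesian form, which is equivalent to the (r, α, β) form of Equations 2 and 4:

```python
import math
from collections import defaultdict

def normal_bin(n, n_bins=18):
    """Bin a unit normal by its two orientation angles (phi, psi)."""
    phi = math.atan2(n[1], n[0])                     # azimuth of the normal
    psi = math.acos(max(-1.0, min(1.0, n[2])))       # polar angle of the normal
    width = math.pi / n_bins
    return (int((phi + math.pi) // width), int(psi // width))

def build_r_table(points, normals, ref, n_bins=18):
    """2D R-table: normal-angle bin -> connecting vectors to the reference point."""
    table = defaultdict(list)
    for p, n in zip(points, normals):
        r = tuple(c - q for c, q in zip(ref, p))     # vector from surface point to reference
        table[normal_bin(n, n_bins)].append(r)
    return table

def detect(points, normals, table, n_bins=18):
    """Cast votes for candidate reference points (Equation 4) and return the peak."""
    acc = defaultdict(int)
    for p, n in zip(points, normals):
        for r in table.get(normal_bin(n, n_bins), []):
            c = tuple(round(q + d) for q, d in zip(p, r))
            acc[c] += 1
    return max(acc, key=acc.get) if acc else None
```

For a model sampled on the unit sphere (where the normals equal the points), detecting a translated copy recovers the translated centre. Handling arbitrary rotation and scale would expand the accumulator to the seven dimensions of Equation 5, which is where the cost-reduction strategies of Section 4 become essential.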