Depth Object Recovery Using Radial Basis Functions
Abstract
The main task of robot vision systems is to recover the shape of the objects in a scene and use this information for recognition and classification. The fringe projection technique may be used to recover depth range data. To use this technique, we need to estimate the phase of the projected fringes (related to the object height) by using conventional phase demodulation techniques (phase shifting, phase-locked loop, or spatial synchronous detection, among others). The final step of the process is to recover the object surface using a calibration procedure. This calibration procedure uses a phase-to-depth conversion obtained from the experimental optical geometry. Set-up parameters to consider are the optical characteristics of the lenses used to project and acquire the fringe patterns, which are usually unknown. Some experimental factors arise when the object under analysis is located near the projector–camera system, such as the diverging fringe projection, the video camera perspective distortion and the use of crossed-optical-axes geometry. In this work, we present a neural network based on radial basis functions (RBF) to estimate actual depth data of the object from the phase recovered with the projected fringe pattern technique. The training set of the neural network is the phase recovered from several calibration planes whose locations in space are known. An advantage of our method is that knowledge of the optical parameters of the experiment is not explicitly required. © 1999 Published by Elsevier Science B.V. All rights reserved.
where the parameter τ is related to the bandwidth of the estimated phase; the loop behaves as a low-pass filter when τ is initialized with a small value. If I(x, y) is normalized in the range (0, 1), a typical value for τ is about 0.1. The high-pass filtered fringe pattern is obtained with the discretized version of the signal derivative with respect to x.

A second iteration is required, in a backward direction through the same line, to correct phase values that are wrong due to the settling time of the dynamic system. This backward scanning may begin by taking as initial condition the last term calculated in Eq. (4). The backward process can then be expressed as:

φ(x, y) = φ(x + 1, y) − τ [I(x + 1, y) − I(x, y)] sin[2π f_0 x + φ(x + 1, y)],   (x = 1, 2, ..., N − 1).   (5)

In a two-dimensional interferogram we are concerned with the x and y demodulation directions. If we assume a smooth phase, we can use as the starting phase for the (y + 1)-th line being demodulated the last term calculated by the backward process over the y-th line. In this way, the initial conditions used in the forward process are:

φ(1, 1) = 0,   (6)

and

φ(1, y + 1) = φ(1, y),   (y = 2, 3, ..., N − 1).   (7)

This method has the advantage of allowing us to find the phase map and the phase unwrapping simultaneously, which gives a fast and simple way to perform phase estimation given a carrier-frequency interferogram.
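As an illustration, the two-pass scan just described can be written compactly in code. The following Python sketch is ours, not the authors' implementation: the forward recursion of Eq. (4) is not reproduced in this excerpt, so the forward pass below is assumed to be the mirror image of Eq. (5), and all names (pll_demodulate, tau, f0) are hypothetical.

    import numpy as np

    def pll_demodulate(I, f0, tau=0.1):
        """Line-by-line PLL phase demodulation with forward and backward scans.

        I   : 2-D fringe pattern, normalized to the range (0, 1)
        f0  : carrier frequency along x (cycles per pixel)
        tau : loop gain; a small value acts as a low-pass filter on the phase
        """
        M, N = I.shape
        phi = np.zeros((M, N))
        for y in range(M):
            # Initial conditions: Eq. (6) for the first line (phi = 0),
            # Eq. (7) afterwards (start from the previous backward scan).
            if y > 0:
                phi[y, 0] = phi[y - 1, 0]
            # Forward scan (assumed mirror image of Eq. (5), cf. Eq. (4)).
            for x in range(1, N):
                phi[y, x] = phi[y, x - 1] - tau * (I[y, x] - I[y, x - 1]) \
                    * np.sin(2 * np.pi * f0 * x + phi[y, x - 1])
            # Backward scan, Eq. (5): corrects the settling-time transient.
            for x in range(N - 2, -1, -1):
                phi[y, x] = phi[y, x + 1] - tau * (I[y, x + 1] - I[y, x]) \
                    * np.sin(2 * np.pi * f0 * x + phi[y, x + 1])
        return phi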
3. Depth recovery in the ideal case

The object height distribution can be approximated from the estimated phase by applying trigonometric relations, provided there is no optical distortion and all the robot vision parameters are known. Obviously, this hypothetical circumstance where everything is known seldom occurs, but its analysis is important to understand the common problems present in the calibration system.

In Fig. 2 we show the optical configuration used for 3-D measurement. To make it easily understandable, the optical axes of the projector lens and the CCD camera lens are parallel and normal to the reference plane. The z-axis is aligned with the optical axis of the CCD camera lens. The grating to be projected has its lines normal to the reference plane, and a conjugate image of the projected grating is formed on the reference plane. At points P and C are located the centers of the entrance pupils of the projector and the CCD camera, respectively. They lie at a distance d_0 from the reference plane, where the object is located, are separated by a distance d_1, and are situated at the same z position.

We can consider the reference plane as a flat object with depth z(x, y) = 0 for all x and y. If the distance d_0 is large and d_1 is small, the projected grating observed through point C can be considered a regular grating pattern. Obviously, if the system is close to the reference plane (small d_0), the observed grating will be irregular, with frequency changes in the x direction. When an object is placed on the reference plane, we can calculate the phase distribution of the object.

If we analyze the phase φ_O at point O, it has the same value as the phase φ_A at point A. Besides, point O on the object and point B on the reference plane are imaged by the camera lens onto the same point B′ of the CCD detector array. Then we can express the distance AB as a function of phase:

AB = (φ_B − φ_O) / (2π f),   (8)

where f is the spatial frequency of the projected grating on the reference plane and φ_B is the phase value at point B. Noting that the triangles PCO and ABO are similar, we can write the height of point O as:

z(x, y) = (φ_B − φ_O) d_0 / [2π f d_1 + (φ_B − φ_O)].   (9)
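In code, Eq. (9) is a one-line conversion. The sketch below is our illustration, with hypothetical names; it maps a reference-plane phase and an object phase to height under the ideal parallel-axes geometry of Fig. 2:

    import numpy as np

    def ideal_height(phase_ref, phase_obj, d0, d1, f):
        """Phase-to-height conversion of Eq. (9) for the ideal geometry.

        phase_ref : phase phi_B on the reference plane
        phase_obj : phase phi_O measured on the object
        d0 : distance from the entrance pupils P, C to the reference plane
        d1 : separation between the projector and camera pupils
        f  : spatial frequency of the grating on the reference plane
        """
        dphi = phase_ref - phase_obj
        return dphi * d0 / (2.0 * np.pi * f * d1 + dphi)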
The formula for converting the recovered phase distribution into a physical height distribution given in Eq. (9) works well only in the immediate neighborhood of the optical axes of both the projector and the video camera; equivalently, the object under study must be far away from the projector–camera system (see Fig. 2). However, when the object under analysis is located near the projector–camera system, several problems arise. In this condition, there are three main factors that give rise to calibration problems: (1) the diverging fringe projection, (2) the video camera perspective distortion, and (3) the use of crossed-optical-axes geometry. These circumstances have the effect of producing a distorted phase φ(x, y).

In Section 4, we propose a neural network based on RBFs to overcome this drawback, which uses as input the phase distribution obtained from a set of planes inside the region where the depth data are being measured. To this end, a single calibrating plane is moved through several positions inside the volume being calibrated, as sketched below.
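To make the plane sweep concrete, the following Python sketch (ours; pll_demodulate is the routine sketched in Section 2, and all names, including the subsampling stride, are hypothetical) assembles one training sample per sampled pixel of each calibration plane, pairing the demodulated phase and the pixel coordinates with the plane's known height, as formalized in Section 4:

    import numpy as np

    def build_training_set(fringe_images, plane_heights, f0, tau=0.1, step=16):
        """Collect (feature, target) pairs from a swept calibration plane.

        fringe_images : fringe pattern acquired at each plane position
        plane_heights : known z position of each calibration plane
        step          : subsampling stride over the pixel grid
        """
        R, t = [], []
        for img, z in zip(fringe_images, plane_heights):
            phi = pll_demodulate(img, f0, tau)        # demodulated phase
            for y in range(0, img.shape[0], step):
                for x in range(0, img.shape[1], step):
                    R.append((phi[y, x], x, y))       # feature r = (phase, x, y)
                    t.append(z)                       # target: known plane height
        return np.asarray(R, dtype=float), np.asarray(t, dtype=float)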
4. RBFs applied to depth recovering

RBF neural networks [22–27] provide a powerful technique for generating multivariate nonlinear mappings. They are used for classification, pattern recognition and interpolation tasks because of their simple topological structure. Other neural networks used for these tasks are backpropagation, the Kohonen network and probabilistic neural networks [28–30]. The main advantage of RBF nets over other approaches is fast learning, because the learning algorithm corresponds to the solution of a linear problem.

RBF nets belong to the group of kernel-function nets, which utilize simple kernel functions distributed in different neighborhoods of the input space whose responses are essentially local in nature: the effect of each kernel is determined by its center and width. Its output is high when the input is close to the center, and it decreases rapidly to zero as the input's distance from the center increases. In the typical approach to RBF implementation, the basis function is chosen as a Gaussian and a two-layer topology is used (see Fig. 3), which consists of one hidden and one output layer, with the number of hidden units fixed a priori based on the properties of the input data. Each hidden node in an RBF net represents one of the kernel functions. An output node generally computes the weighted sum of the hidden-node outputs. This shallow architecture has a great advantage in terms of computing speed compared to multiple-hidden-layer nets. The design and training of an RBF net consist of: (1) determining how many kernel functions to use, (2) finding their centers and widths, and (3) finding the weights that connect them to an output node.

In this work, we use a linear combination of RBFs [22–27] to find the relationship between the object height z(x, y) and the detected phase φ(x, y). In our case, the adjustable parameters of the RBF are its weights, while the centers and spread are kept fixed. The centers are evenly spaced within the spatial range of interest, and the Gaussian spread is chosen such that half of each Gaussian response is located at the sites of the neighboring Gaussians. The reference detected phase and the training set are obtained from a number of carefully positioned calibration planes, where the phase presents geometrical distortion due to the diverging fringe projection and the perspective of the camera with crossed-optical-axes geometry. The calibrating planes must be positioned inside the working volume of the automated depth recovering system (see Fig. 4). The RBF network operates as an interpolation function from phase values to the object height in places other than the calibrating planes.

We preferred an RBF neural network with Gaussian processors over a multilayer-perceptron network with sigmoid processors under backpropagation training [28,30], since the 'learning' process in the RBF is linear, so that only the computation of an inverse matrix is required, as will be explained below, instead of a gradient descent minimization, which sometimes requires a lot of computer time.

Our technique calculates the depth of the object by adjusting the weights of a linear combination of Gaussian functions over the experimental reference planes, fixing the centers of the RBFs on vectors c_i sampled from the detected phases φ_P(x_i, y_i) of the P planes (see Fig. 4) at points (x_i, y_i). The function f that approximates the depth of the object can be expressed as:

f(r) = Σ_{i=1}^{I} w_i G_i(r),   (10)

where

G_i(r) = exp(−‖r − c_i‖² / σ²),

r = (φ(x, y), x, y)   and   c_i = (φ_P(x_i, y_i), x_i, y_i),

where r is the feature vector that contains the detected phase φ(x, y) at the point (x, y), I is the number of Gaussian processors with centers c_i obtained from the calibration planes, and σ is the standard deviation related to the width of the Gaussian kernel. Finally, the vector W = (w_1, w_2, ..., w_I) contains the parameters being estimated. Fig. 3 depicts the two-layer feed-forward RBF network used in this application. The RBF network inputs are thus the phase estimated with a phase demodulation technique and the points (x, y).
Fig. 4. An experimental set-up with crossed-optical-axes geometry and diverging fringe projection, showing the planes used in the calibration process.
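Eq. (10) with this Gaussian kernel can be evaluated in vectorized form. The following sketch is ours, with assumed array shapes (feature vectors as rows); it computes the hidden-layer responses and the weighted sum of the output node:

    import numpy as np

    def rbf_eval(R, centers, sigma, w):
        """Evaluate f(r) = sum_i w_i G_i(r) of Eq. (10) for a batch of inputs.

        R       : (Q, 3) feature vectors r = (phi(x, y), x, y)
        centers : (I, 3) fixed centers c_i taken from the calibration planes
        sigma   : fixed Gaussian spread
        w       : (I,) weight vector W
        """
        # Squared distances ||r - c_i||^2 between each input and each center.
        d2 = ((R[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        G = np.exp(-d2 / sigma**2)        # (Q, I) hidden-layer responses
        return G @ w                      # weighted sum at the output node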
We define a cost or energy function which should be optimized to obtain the vector W constituted by the values w_i; U is the quadratic error between the network output and the known heights t_q of the Q training samples:

U = Σ_{q=1}^{Q} [f(r_q) − t_q]².   (11)

Setting the derivatives of U with respect to each weight to zero gives

Σ_{q=1}^{Q} G_k(r_q) [f(r_q) − t_q] = 0,   (k = 1, 2, 3, ..., I).   (12)

In this way we obtain the following expression:

Λ W = V,   (13)

where

λ_{ki} = Σ_{q=1}^{Q} G_k(r_q) G_i(r_q),   (k, i = 1, 2, 3, ..., I),

W = (w_1, w_2, w_3, ..., w_I)ᵀ,   and   V_k = Σ_{q=1}^{Q} G_k(r_q) t_q,

so that the weights are obtained by solving this linear system.
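Assembling and solving Eq. (13) takes a few lines. This sketch (ours) repeats the kernel computation of rbf_eval above, with t holding the known heights t_q of the calibration samples:

    import numpy as np

    def rbf_train(R, t, centers, sigma):
        """Solve the linear system Lambda W = V of Eq. (13) for the weights.

        R : (Q, 3) training feature vectors from the calibration planes
        t : (Q,) known heights t_q of the training samples
        """
        d2 = ((R[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        G = np.exp(-d2 / sigma**2)        # (Q, I) responses G_i(r_q)
        Lam = G.T @ G                     # lambda_ki = sum_q G_k(r_q) G_i(r_q)
        V = G.T @ t                       # V_k = sum_q G_k(r_q) t_q
        return np.linalg.solve(Lam, V)    # the linear 'learning' step

Because training reduces to a single linear solve rather than an iterative descent, this step realizes the fast learning claimed for RBF nets above.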
5. Experiments

The fringe patterns were demodulated using Eqs. (4) and (5). The phase recovered from the plane z = 0 with the PLL technique is shown in Fig. 5. Distortion and tilt can be observed in the retrieved phase, due to miscalibration in the experimental set-up parameters.

We estimated the surface of a pyramidal object whose footprint is a square with an area of 81 cm² and whose height is 4.8 cm. The reflectance of the surface was Lambertian, to minimize irradiance variations caused by specular reflections. The base of the pyramid was placed on the plane z = 0. We calculated the object phase from the fringe pattern shown in Fig. 6 using the PLL technique (Fig. 7). Then we applied the RBF network to estimate the object topography in centimeters. The calibrated pyramid obtained is shown in Fig. 8.

Fig. 7. The phase recovered from the fringe pattern shown in Fig. 6.

The relative average error of the object depth recovered was 2.6% with respect to the measurements made using a contact measuring machine. This was calculated as:

Relative average error = (U / Q) × 100%,   (15)

where U is the cost function or quadratic error defined in Eq. (11) and Q is the size of the real data set. Fig. 9 shows a comparison between the RBF network approximation and the real size along a cross-section through the middle of the object.
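The figure of merit of Eq. (15), as printed, can be computed directly from the trained network. A small sketch (ours), reusing the rbf_eval routine from the previous section:

    def relative_average_error(R, t, centers, sigma, w):
        """Relative average error of Eq. (15): (U / Q) x 100%."""
        residual = rbf_eval(R, centers, sigma, w) - t
        U = (residual ** 2).sum()          # quadratic error U of Eq. (11)
        Q = len(t)
        return U / Q * 100.0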
6. Conclusions

We have proposed a calibration method for a robot vision system that applies an RBF neural network. The method is based on the structured-light technique, which uses a fringe projection system to recover the depth information of an object from the demodulated phase, and it implicitly takes into account the optical system's distortions. The RBF system corrects the geometrical distortion of the recovered phase caused by the diverging illumination of the slide projector, the perspective distortion of the optical system of the video camera and the non-linear calibration of the crossed-optical-axes geometry used. This calibrating–correcting system works inside a given working volume where the system is calibrated using several reference planes at well-defined positions.

Fig. 8. The surface approximation obtained by applying the RBF network technique.

The RBF network operates as an interpolation function to approximate the object height distribution from the phase recovered from the fringes projected onto the reference planes. This technique offers several advantages over a conventional calibration, where the optical parameters must be known. The geometrical distortions of the projector and the video capture system are minimized through the minimization of a squared cost function which considers the error between the established calibration planes and the RBF network's output. This technique can estimate the calibrated object topography from a 256 × 256 resolution image in 1.2 s on a Pentium 133 MHz PC.
Finally, an RBF network with fixed centers and spread was selected over a multilayer-perceptron network trained by the backpropagation algorithm. The reason is that its training process is linear, so that we can use fast inverse-matrix algorithms. As a consequence, this learning approach is much faster than using a gradient descent, whose error reduction is a slow process.

Acknowledgements

We are indebted to Dr. Mariano Rivera, Dr. Jose Luis Marroquin, Dr. Francisco Sanchez Marin and M.C. Ricardo Legarda for their enlightening and useful discussions during the development of this work. The authors wish to acknowledge the financial support for this research from the Consejo Nacional de Ciencia y Tecnología de México (CONACYT) under grants 0580P-E and 2608P-A.

References

[1] D. Weinshall, Comput. Vision Graph. Image Processing 49 (1990) 222.
[2] E. Grosso, M. Tistarelli, IEEE Trans. Patt. Anal. Mach. Intelligence 17 (1995) 868.
[3] F.J. Cardenas-Garcia, H. Yao, S. Zheng, Opt. Lasers Eng. 22 (1995) 192.
[4] B. Horn, Robot Vision, Chap. 13, MIT Press, McGraw-Hill, New York, 1986.
[5] H. Ohara et al., Appl. Opt. 35 (1996) 4476.
[6] Y.K. Ryu, H.S. Cho, Opt. Eng. 35 (1996) 1483.
[7] R.J. Woodham, Opt. Eng. 19 (1980) 139.
[8] E. Coleman, R. Jain, Comput. Graph. Image Processing 18 (1982) 309.
[9] X. Zhang, R. Mammone, Opt. Eng. 33 (1994) 4079.
[10] D. Zou, S. Ye, Ch. Wang, Opt. Eng. 34 (1995) 3040.
[11] J. Lin, X. Su, Opt. Eng. 34 (1995) 3297.
[12] J.H. Saldner, J. Huntley, Opt. Eng. 36 (1997) 610.
[13] J. Li, H. Su, X. Su, Appl. Opt. 36 (1997) 277.
[14] P. Sandoz et al., J. Mod. Opt. 43 (1997) 701.
[15] M. Servin, F.J. Cuevas, J. Mod. Opt. 42 (1995) 1853.
[16] J.E. Greivenkamp, J.H. Bruning, in: D. Malacara (Ed.), Optical Shop Testing, Wiley, New York, 1992, pp. 501–598.
[17] M. Takeda, K. Mutoh, Appl. Opt. 22 (1983) 3977.
[18] Y. Ichioka, M. Inuiya, Appl. Opt. 11 (1972) 1507.
[19] M. Servin, R. Rodriguez-Vera, J. Mod. Opt. 40 (1993) 2087.
[20] W.J. Smith, Modern Optical Engineering, McGraw-Hill, New York, 1993, pp. 57–79.
[21] B. Prescott, F. McLean, Graphics Models Image Processing 59 (1993) 39.
[22] M.J.D. Powell, Radial basis functions for multivariable interpolation: a review, in: M.G. Cox, J.C. Mason (Eds.), Algorithms for the Approximation, Clarendon Press, Oxford, 1987.
[23] D.S. Broomhead, D. Lowe, Complex Syst. 2 (1988) 321.
[24] M. Musavi, W. Ahmed, K. Chan, K. Faris, D. Hummels, Neural Networks 5 (1992) 595.
[25] A. Roy, S. Govil, R. Miranda, Neural Networks 8 (1995) 179.
[26] B. Mulgrew, IEEE Signal Processing Magazine, 1996, p. 50.
[27] M. Servin, F.J. Cuevas, Rev. Mex. Fis. 39 (1993) 235.
[28] P.D. Wasserman, Neural Computing, Van Nostrand-Reinhold, New York, 1989.
[29] T. Kohonen, Self-Organization and Associative Memory, 2nd edn., Springer-Verlag, New York, 1988.
[30] D.E. Rumelhart, G.E. Hinton, R.J. Williams, Parallel Distributed Processing, MIT Press, Cambridge, 1986, p. 318.