A Kalman-Filter-Based Method for Pose Estimation in Visual Servoing
Farrokh Janabi-Sharifi and Mohammed Marey

IEEE Transactions on Robotics, vol. 26, no. 5, October 2010

Abstract—The problem of estimating position and orientation (pose) of an object in real time constitutes an important issue for vision-based control of robots. Many vision-based pose-estimation schemes in robot control rely on an extended Kalman filter (EKF) that requires tuning of filter parameters. To obtain satisfactory results, EKF-based techniques rely on "known" noise statistics, initial object pose, and sufficiently high sampling rates for good approximation of measurement-function linearization. Deviations from such assumptions usually lead to degraded pose estimation during visual servoing. In this paper, a new algorithm, namely iterative adaptive EKF (IAEKF), is proposed by integrating mechanisms for noise adaptation and iterative-measurement linearization. The experimental results are provided to demonstrate the superiority of IAEKF in dealing with erroneous a priori statistics, poor pose initialization, variations in the sampling rate, and trajectory dynamics.

Index Terms—Adaptation, Kalman filter (KF), control, pose estimation, robotic manipulator, visual servoing.
I. INTRODUCTION
In computer vision, the problem of pose estimation is to determine the position and orientation (pose) of a camera with respect to an object's coordinate frame using the image information. The problem is also known as the extrinsic camera-calibration problem, with its solution playing a crucial role in the success of many computer-vision applications, such as object recognition [1], intelligent surveillance [2], and robotic visual servoing (RVS) [3]. Estimation of the camera displacement (CD) between the current and desired pose for RVS [4], [5] is also relevant to this problem. However, the focus of this study will be on pose estimation for RVS, where the relative pose between a camera and an object is used for real-time control of a robot motion [3].

In RVS, the control error can be calculated in the image space, Cartesian space, or both (hybrid) spaces [3], [6], [7]. While partial estimation of the pose vector (e.g., depth) is required for image-based and hybrid visual-servoing schemes [8], [9], an important class of
visual-servoing methods, namely the position-based visual-servoing
(PBVS) scheme, requires full pose estimation to calculate Cartesian
error of the relative pose between the endpoint and the object [10].
Two major difficulties with pose estimation for RVS are related to the
requirements for efficiency and robustness of pose estimation [11].
The solutions to the pose-estimation problem usually focus on using
sets of 2-D–3-D correspondences between geometric features and their
projections on the image plane. Although high-level geometric fea-
tures, such as lines and conics, have been proposed, point features
are typically used for pose estimation due to their ease of availability
in many objects [12]. Solutions for three points [13] and more than three points [14] have already been presented. However, exact and closed-form solutions are only available for three or four noncollinear points [15]. Such methods, although simple to implement, are often exposed to difficulty in point matching in crowded environments. Besides, point-based solutions are not robust and demonstrate high susceptibility to noise in the image coordinates [16]. For three-point solutions, it has been shown that the points configuration and noise in the points coordinates can drastically affect the output errors [13]. It has also been demonstrated that when the noise level exceeds a knee level or the number of points is below a knee level, least-squares-based methods, which are commonly used for points solutions, become unstable, leading to large errors [17]. The addition of more points would enhance pose-estimation robustness at the cost of increased computational expense. Nonlinear, iterative, and/or recursive methods are then recommended for more than four points as well as for high-level features.

The iterative approaches formulate the problem as a nonlinear least-squares problem. Such solutions offer more accuracy and robustness, yet they are computationally more intensive than closed-form approaches, and their accuracy depends on the quality of the initial pose estimates [18], [19]. The iterative methods usually rely on nonlinear optimization techniques, such as the Gauss–Newton method [1]. To reduce the problem complexity, approximate methods have also been proposed by simplifying the perspective camera model, e.g., relaxing the orthogonality constraint on the rotation matrix [19], [20]. Surveys of both exact and approximate pose-estimation methods can be found in the literature [15], [21]. In short, this class of methods exhibits convergence problems and does not effectively account for the orthonormal structure of rotation matrices [22]. Furthermore, with this class of techniques, noisy visual-servo images usually lead to poor individual pose estimates [23], thus requiring temporal filtering.

A class of recursive methods relies on temporal-filtering methods, and in particular Kalman-filtering techniques, to address the robustness and efficiency issues. Since a 3-D pose and its time rate constitute a 12-D state vector to be estimated in real time, many of these filtering methods, such as particle filters [24], can hardly model the true distribution in real time. A true 3-D pose estimation using a Kalman filter (KF) for RVS has been realized in [10]. With KFs, photogrammetric equations are formed by first mapping the object features into the camera frame and then projecting them onto the image plane. A KF is then applied to provide an implicit and recursive solution of the pose parameters. Since the filter output model for RVS is nonlinear in the system states, an extended KF (EKF) is usually applied, in which the output equations are linearized about the current state estimates. The use of a KF in RVS is motivated by its several advantages, including its recursive implementation, its capability to statistically combine redundant information (such as features) or sensors, temporal filtering, the possibility of using a lower number of features, and the possibility of changing the measurement set without disrupting the operation [3], [10]. For instance, an EKF-based platform has been proposed in [11] to integrate a range sensor with a vision sensor for robust pose estimation in RVS. Additionally, an EKF implementation facilitates dynamic windowing of the features of interest by providing estimation of the next time-step feature location. This allows only small window areas to be processed for image-parameter measurements and leads to a significant reduction in image-processing time. It has been shown that, in practice, an EKF provides near-optimal estimation [10].

Despite its advantages, there are a few issues with the application of EKF to pose estimation in RVS. First, a known object model is usually assumed to be available. Model-free approaches based on Euclidean reconstruction have been proposed for CD estimation [4], [5]. These approaches typically rely on fundamental, essential, and/or homography matrix estimation, e.g., in [5] and [25], and, hence, face the issue of degeneration of the epipolar geometry in some cases, thus leading to unstable estimation [4]. Despite some treatments [4], they remain susceptible to outliers. In addition, the majority of them require several images for reconstruction and, hence, are more appealing for postproduction applications [26]. The assumption of a known object model is not a major issue in many industrial setups since computer-aided-design (CAD) models of the objects are usually available. For uncertain environments with a poor (or unknown) model of the object, an EKF-based approach for real-time estimation of combined target model and pose has been proposed in [27] and [28]. Therefore, this issue will not be the subject of our focus. Second, while a KF provides an optimal solution under the assumption of zero-mean Gaussian noise for a linear problem, the EKF formulation may not provide optimal results. In fact, linearization can generate unstable filters when the assumption of local linearity is not met [29]. In the previous work, it has been recommended to take a sufficiently high sampling rate to enforce accuracy of the linearization over the sampling period [10]. However, in practice, the RVS-system bandwidth would limit the sampling rate for the filter. As has been shown in [30], an EKF-based system might easily diverge under fast and nonlinear trajectory dynamics, even with a relatively high sampling rate. Third, the statistics of the measurement and dynamic noise are assumed to be known in advance and to remain constant. Poor measurement and dynamic models or poor noise estimates would degrade the system performance and might even lead to filter divergence. In particular, while the measurement noise-covariance matrix can be tuned through experiments, the dynamic covariance matrix is difficult to tune [23]. This is because the dynamics of the object motion with respect to the camera cannot be accurately predicted in a dynamic environment. Fourth, the convergence of EKF depends on the choice of initial state estimate and the tuning of filter parameters. In many RVS applications, such as the assembly industry, the initial pose of the object with respect to the camera can be readily approximated. Yet, sufficiently good pose estimates cannot be initially available in unstructured and uncertain environments. This paper will contribute by formulating an EKF method to address the last two aforementioned issues.

Several methods have been proposed in the literature to deal with varying statistics and poor filter initialization of EKF for RVS systems. An adaptive EKF (AEKF) with a fixed set of image features has been formulated for the first time in [30] to update the dynamic-noise-covariance matrix in order to address the issue of varying and/or uncertain dynamic noise. The AEKF-based approach has later been extended in [31] to have a variable set of image features during the servoing for improving servoing robustness. Despite the adaptation capability of AEKF to unknown noise statistics, the presented AEKF methods do not provide robust and accurate pose estimation in the presence of poor filter initialization and camera calibration, particularly when tracking of a fast and nonlinear trajectory is desired. This aspect will be investigated experimentally in this paper. While tuning of the EKF noise-covariance matrices was addressed in the aforementioned AEKF-based approaches [30], [31], tuning and initialization of other EKF parameters and mechanisms to enhance output linearization for RVS did not receive much attention. To address tuning of other filter parameters and to facilitate its initialization, an initial proposal for iterative EKF (IEKF) use in RVS has been provided in [32]. As a matter of fact, Lefebvre et al. [33] have studied several modifications of KFs for general nonlinear systems. They have categorized all the different versions of KFs, such as the central difference filter (CDF), the unscented KF (UKF), and the divided difference filter (DD1), as linear regression KFs (LRKFs) and have compared them with EKF and IEKF [34]. They have concluded that EKF and IEKF generally outperform LRKFs, yet they require careful tuning. An interesting result of their study is that IEKF outperforms EKF, because it uses the
measurements to linearize the measurement function, whereas in EKF and LRKFs, the measurement is not used for the same purpose. Despite its advantages, the lack of an adaptive noise-estimation mechanism would degrade the performance of IEKF. In this paper, for the first time, an iterative AEKF (IAEKF) for RVS is proposed to overcome the limitations of IEKF and AEKF. The presented work is a continuation of the previous works on EKF for RVS [3], [11], [27], [28], [30], [32]. This study contributes a detailed formulation of IAEKF and an experimental comparison of EKF, AEKF, IEKF, and IAEKF for RVS.

II. FEATURE-POINT TRANSFORMATION

The commonly used perspective projection model of the camera is shown in Fig. 1. The image frame is located at F (i.e., the effective focal length) along the Z^C-axis, with its X^i- and Y^i-axes parallel to the X^C- and Y^C-axes of the camera frame, respectively. In this study, similar to many iterative methods, point features will be used for pose estimation. Let the relative pose of the object to the camera (or end-effector) frame be W = (T, Θ)^T, where T = [X, Y, Z]^T denotes the relative position vector of the object frame with respect to the camera frame, and Θ = [φ, α, ψ]^T is the relative orientation vector with roll, pitch, and yaw parameters, respectively. Let P_j^C = (X_j^C, Y_j^C, Z_j^C)^T and P_j^o = (X_j^o, Y_j^o, Z_j^o)^T represent the coordinate vectors of the jth object feature point in the camera and object frames, respectively (see Fig. 1). The vector P_j^o is available from the CAD model of the object or from measurements and can be described in the camera frame using the following transformation:

    P_j^C = T + R(φ, α, ψ) P_j^o    (1)

where the rotation matrix is given in [3] and [10]. For control-error calculations, the Euler angles can be approximately related to the total angles in a PBVS structure using a transition matrix [10]. The coordinates of the projection of a feature point on the image plane using a pin-hole camera model will be x_j^i and y_j^i, given by (see Fig. 1)

    [x_j^i, y_j^i]^T = (F / Z_j^C) [X_j^C / P_X, Y_j^C / P_Y]^T    (2)

where P_X and P_Y are the interpixel spacings along the X^i- and Y^i-axes of the image plane, respectively. This model assumes that the origin of the image coordinates is located at the principal point and that |Z_j^C| ≫ F. For short focal lengths, lens distortion can have a drastic effect on the feature-point locations; for details of the distortion model and its relation to the projection model, see [30] and [35]. The perspective projection model requires both intrinsic and extrinsic camera parameters.

III. EXTENDED KALMAN FILTER

For pose estimation, the state vector of the dynamic model is defined to include pose and velocity parameters, i.e.,

    x = [X, Ẋ, Y, Ẏ, Z, Ż, φ, φ̇, α, α̇, ψ, ψ̇]^T.    (3)

The relative target velocity is usually assumed to be constant during each sample period. This is a reasonably valid assumption for sufficiently small sample periods in RVS systems. A discrete dynamic model will then be

    x_k = A x_{k−1} + γ_k    (4)

with A being a block-diagonal matrix with 2 × 2 blocks of the form [1 T; 0 1], T being the sample period, k being the sample step, and γ_k being the disturbance noise vector described by a zero-mean Gaussian distribution with covariance Q_k, i.e.,

    E[γ_i] = q_i,   E[(γ_i − q_i)(γ_j − q_j)^T] = Q_i δ_ij    (5)

where q_i and Q_i are the true mean and true moments about the mean of the state-noise sequences, respectively, and δ is the Kronecker delta. The output model will be based on the projection model given by (1) and (2) and defines the image-feature locations in terms of the state vector x_k as follows:

    z_k = G(x_k) + ν_k    (6)

with measurements for p feature points

    z_k = [x_1^i, y_1^i, x_2^i, y_2^i, …, x_p^i, y_p^i]_k^T    (7)

and

    G(x_k) = F [X_1^C/(P_X Z_1^C), Y_1^C/(P_Y Z_1^C), …, X_p^C/(P_X Z_p^C), Y_p^C/(P_Y Z_p^C)]^T.    (8)

Here, X_j^C, Y_j^C, and Z_j^C are given by (1), and ν_k denotes the image-parameter measurement noise, which is assumed to be described by a zero-mean Gaussian distribution with covariance R_k, i.e.,

    E[ν_i] = r_i,   E[(ν_i − r_i)(ν_j − r_j)^T] = R_i δ_ij    (9)

where r_i and R_i are the true mean and true moments about the mean of the measurement-noise sequences, respectively. Since (6) is nonlinear, an optimal solution cannot be obtained through a KF implementation. Instead, an extension of KF (i.e., EKF) can be formulated by linearizing the output equation about the current state.
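The dynamic and output models of (1)–(8) can be made concrete with a short numerical sketch. The code below is illustrative only: the roll–pitch–yaw rotation convention, the focal length, and the interpixel spacings are assumed values (the paper takes the actual rotation matrix from [3] and [10]), and `make_A`, `rotation`, and `G` are hypothetical helper names.

```python
import numpy as np

def make_A(T):
    """Block-diagonal state-transition matrix of (4): six 2x2
    constant-velocity blocks [[1, T], [0, 1]]."""
    block = np.array([[1.0, T], [0.0, 1.0]])
    return np.kron(np.eye(6), block)   # 12x12

def rotation(phi, alpha, psi):
    """One common roll-pitch-yaw convention (assumption; the paper
    references [3], [10] for its exact rotation matrix)."""
    cph, sph = np.cos(phi), np.sin(phi)
    cal, sal = np.cos(alpha), np.sin(alpha)
    cps, sps = np.cos(psi), np.sin(psi)
    Rz = np.array([[cph, -sph, 0], [sph, cph, 0], [0, 0, 1]])
    Ry = np.array([[cal, 0, sal], [0, 1, 0], [-sal, 0, cal]])
    Rx = np.array([[1, 0, 0], [0, cps, -sps], [0, sps, cps]])
    return Rz @ Ry @ Rx

def G(x, P_obj, F=12.5e-3, PX=1e-5, PY=1e-5):
    """Output model of (6)-(8): transform the object points by (1),
    then project each by (2). x is the 12-D state of (3); P_obj is a
    (p, 3) array of object-frame feature points. PX and PY here are
    made-up interpixel spacings (meters/pixel)."""
    T_vec = x[[0, 2, 4]]                 # X, Y, Z
    phi, alpha, psi = x[[6, 8, 10]]
    Pc = (rotation(phi, alpha, psi) @ P_obj.T).T + T_vec   # eq. (1)
    z = []
    for Xc, Yc, Zc in Pc:                # eq. (2), per feature point
        z += [F * Xc / (PX * Zc), F * Yc / (PY * Zc)]
    return np.array(z)
```

With the state at the origin except Z = 1 m, a feature at the object origin projects to the principal point (0, 0), as the model in (2) predicts.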
Let x_k be the state at step k, let x̂_{k,k−1} denote the a priori state estimate at step k given the knowledge of the process or measurement at the end of step k − 1, and let x̂_{k,k} be the a posteriori state estimate at step k given the measurement z_k. Then, the a priori and a posteriori estimate errors and their corresponding covariances are defined as e_{k,k} = x_k − x̂_{k,k}, e_{k,k−1} = x_k − x̂_{k,k−1}, P_{k,k} = E[e_{k,k} e_{k,k}^T], and P_{k,k−1} = E[e_{k,k−1} e_{k,k−1}^T], respectively. It is well known that the recursive EKF algorithm consists of the two major parts of prediction and estimation, as follows.

Prediction:

    x̂_{k,k−1} = A x̂_{k−1,k−1}    (10)

    P_{k,k−1} = A P_{k−1,k−1} A^T + Q_{k−1}.    (11)

Linearization:

    H_k = ∂G(x)/∂x |_{x = x̂_{k,k−1}}.    (12)

Kalman gain update:

    K_k = P_{k,k−1} H_k^T (R_k + H_k P_{k,k−1} H_k^T)^{−1}.    (13)

Estimation updates:

    x̂_{k,k} = x̂_{k,k−1} + K_k (z_k − G(x̂_{k,k−1}))    (14)

    P_{k,k} = P_{k,k−1} − K_k H_k P_{k,k−1}.    (15)

Here, K_k is the Kalman gain matrix at step k. The measurement- and process-noise covariances Q_k and R_k are usually assumed to be constant during servoing and are obtained through tuning [10]. While R_k can be determined through experiments [23], the matrix Q_k is difficult to determine a priori due to unknown object and/or camera motions. In general, the aim of adaptive filtering in RVS is to estimate not only the state but also the time-varying statistical parameters given by Υ_i = {r_i, R_i, q_i, Q_i}. An AEKF has been introduced in [30] and [31] to estimate R_k and Q_k in real time. The adaptation capability of AEKF with poor initialization of the noise-covariance matrices has been demonstrated in our previous work [30]. However, the results also showed that under quicker changes of the pose, the error of AEKF will increase. This is mainly due to the time required by AEKF to react to such a sudden change. The linearization approximation in (12) cannot be treated by AEKF properly and is another source of errors, especially in tracking trajectories with faster and higher dynamics. Besides, the linearization-approximation errors would lead to high sensitivity to poor initialization and camera-calibration error. An IEKF has been proposed in our previous work [32] to alleviate this issue.

In the next section, adaptive and iterative mechanisms are combined to address the aforementioned issues simultaneously and to establish a robust framework for pose estimation in RVS.

IV. ITERATIVE ADAPTIVE EXTENDED KALMAN FILTER

The proposed approach combines the advantages of AEKF and IEKF. After the initialization and prediction stages, the iteration is started for m iterations by first setting x̂_k^0 = x̂_{k,k−1}, i.e., for i = 0, and then

    H_k^i = ∂G(x)/∂x |_{x = x̂_k^i}    (16)

    r̂_k^i ≡ z_k − G(x̂_k^i)    (17)

    Γ_k^i ≡ H_k^i P_{k,k−1} (H_k^i)^T    (18)

    r̄_k^i = r̄_{k−1} + (1/N)(r̂_k^i − r̂_{k−N})    (19)

    R_k^i = R_{k−1} + (1/(N−1)) [ (r̂_k^i − r̄_k^i)(r̂_k^i − r̄_k^i)^T − (r̂_{k−N} − r̄_k^i)(r̂_{k−N} − r̄_k^i)^T + (1/N)(r̂_k^i − r̂_{k−N})(r̂_k^i − r̂_{k−N})^T + ((N−1)/N)(Γ_{k−N} − Γ_k^i) ]    (20)

    K_k^i = P_{k,k−1} (H_k^i)^T [ R_k^i + H_k^i P_{k,k−1} (H_k^i)^T ]^{−1}    (21)

    x̂_k^{i+1} = x̂_k^i + K_k^i (z_k − G(x̂_k^i)).    (22)

At the end of the iterations, the iteration output is propagated as follows:

    x̂_{k,k} = x̂_k^m,  r̄_k = r̄_k^m,  R_k = R_k^m,  Γ_k = Γ_k^m,  K_k = K_k^m    (23)

and the a posteriori error-covariance estimate is updated according to (15). Here, a window of past measurements of size N is selected for the adaptation of R_k. The observation-noise sample r̂_j is assumed to be representative of ν_j and, for j = k − N → k, to be independent and identically distributed.

Finally, the state-noise statistics are estimated adaptively as follows:

    q̂_j = x̂_{j,j−1} − A x̂_{j−1,j−1}    (24)

which, for j = k − N → k, is assumed to be independent and identically distributed. In addition, let

    Δ_k ≡ A P_{k−1,k−1} A^T − P_{k,k}.    (25)

Then, the process-noise-covariance matrix will be updated according to

    q̄_k = q̄_{k−1} + (1/N)(q̂_k − q̂_{k−N})    (26)

    Q_k = Q_{k−1} + (1/(N−1)) [ (q̂_k − q̄_k)(q̂_k − q̄_k)^T − (q̂_{k−N} − q̄_k)(q̂_{k−N} − q̄_k)^T + (1/N)(q̂_k − q̂_{k−N})(q̂_k − q̂_{k−N})^T + ((N−1)/N)(Δ_{k−N} − Δ_k) ]    (27)

followed by the prediction stage, which is represented by (10) and (11). However, it must be noted that the above algorithm is computationally intensive when compared with EKF, IEKF, and AEKF. In order to improve computing time, the adaptation steps are performed outside the iterations. After the initialization and prediction steps, the limited-memory filter algorithm [30] for estimating the measurement-noise statistics is applied first to find R_k before the iterations (using (16)–(20) without the index i). Next, the iteration is established for m cycles to obtain the Kalman gain and estimation updates according to (16), (21), and (22). The state-noise statistics are estimated outside the iterations, according to (24)–(27). To ensure positive definiteness of R_k and Q_k, the diagonal elements of the covariance estimators are reset to their absolute values. In addition, a fading-memory approach is applied to give low weights to the initial (i.e., less reliable) noise samples and growing weights to successive noise samples through a fading factor of the form [30]

    ρ_k = (k − 1)(k − 2) ⋯ (k − η)/k^η,  if k ≥ η    (28)

with the property lim_{k→∞} ρ_k = 1.
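The baseline recursion (10)–(15) is compact enough to express directly in code. The sketch below is generic, not the authors' implementation: `G` is any measurement function, and the analytic Jacobian of (12) is replaced by a finite-difference approximation for illustration.

```python
import numpy as np

def num_jacobian(G, x, eps=1e-6):
    """Finite-difference stand-in for H_k = dG/dx of (12)."""
    z0 = G(x)
    H = np.zeros((z0.size, x.size))
    for j in range(x.size):
        dx = np.zeros_like(x)
        dx[j] = eps
        H[:, j] = (G(x + dx) - z0) / eps
    return H

def ekf_step(x_est, P, z, A, Q, R, G):
    """One EKF cycle: prediction (10)-(11), linearization (12),
    gain (13), and estimation updates (14)-(15)."""
    x_pred = A @ x_est                          # (10)
    P_pred = A @ P @ A.T + Q                    # (11)
    H = num_jacobian(G, x_pred)                 # (12)
    S = R + H @ P_pred @ H.T
    K = P_pred @ H.T @ np.linalg.inv(S)         # (13)
    x_new = x_pred + K @ (z - G(x_pred))        # (14)
    P_new = P_pred - K @ H @ P_pred             # (15)
    return x_new, P_new
```

Run on a 2-D constant-velocity state with position-only measurements, the filter recovers the unmeasured velocity after a few dozen steps, which is the same observability property the 12-D pose filter relies on for the velocity entries of (3).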
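The fast IAEKF variant described above (adapt R_k once per step, then iterate the relinearization and state updates) can be sketched as follows. This is a simplified illustration, not the authors' code: the recursive window statistics of (19)–(20) are replaced by a direct sample covariance over a stored residue window (which is what the recursion maintains), the fading-memory weighting of (28) and the Q_k adaptation of (24)–(27) are omitted, and `fd_jac` stands in for the analytic H_k^i of (16).

```python
import numpy as np
from collections import deque

def fd_jac(G, x, eps=1e-6):
    """Finite-difference stand-in for the Jacobian H_k^i of (16)."""
    z0 = G(x)
    H = np.zeros((z0.size, x.size))
    for j in range(x.size):
        dx = np.zeros_like(x)
        dx[j] = eps
        H[:, j] = (G(x + dx) - z0) / eps
    return H

def iaekf_update(x_pred, P_pred, z, G, R0, window, m=10, N=20):
    """Iterated, adaptive measurement update in the spirit of
    (16)-(23): estimate R from a sliding window of innovation
    residues, then relinearize m times about the latest iterate."""
    # R adaptation outside the iterations (fast variant); a crude
    # sample-covariance stand-in for the recursive update (20).
    if len(window) >= 2:
        R = R0 + np.cov(np.array(window).T)
    else:
        R = R0
    x_i = x_pred.copy()
    for _ in range(m):
        H = fd_jac(G, x_i)                     # (16)
        r = z - G(x_i)                         # (17)
        S = R + H @ P_pred @ H.T
        K = P_pred @ H.T @ np.linalg.inv(S)    # (21)
        x_i = x_i + K @ r                      # (22)
    # Propagate the iteration output (23); a posteriori P by (15).
    H = fd_jac(G, x_i)
    P_new = P_pred - K @ H @ P_pred
    window.append(z - G(x_i))                  # grow the residue window
    if len(window) > N:
        window.popleft()
    return x_i, P_new
```

Because each cycle relinearizes about the newest iterate, the update behaves like a damped Newton step on the measurement equation; for a locally invertible G and small R, the iterate is driven toward G(x) = z, which is exactly why IEKF-style updates tolerate poorer initialization than a single EKF step.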
TABLE I. COMPUTATIONAL COST FOR POSE ESTIMATION WITH p = 5, N = 10, m = 10 (20, 30)
V. EXPERIMENTAL RESULTS

Extensive simulations and experiments were conducted to investigate and compare the performance of various Kalman-filtering approaches for pose estimation.

The default filter parameters were as follows: R_0 is a diagonal matrix with diagonal elements of 0.01 (in pixels squared) measured through the experiments; P_{0,0} is a block-diagonal matrix with 2 × 2 blocks of the form diag[0.02, 0.01] (in meters squared and (meters per second) squared for the translational pairs, and in degrees squared and (degrees per second) squared for the rotational pairs); N = 20, m = 30, η = 5, and p = 5.

To evaluate the accuracy of estimation, the results of the estimations were compared with the relative-pose calculations obtained through the robot forward kinematics using the joint encoders. Another measure of accuracy was the inspection of the Kalman-estimate output errors, i.e., the errors between the true image-feature locations and those obtained from the KF estimates (the filter residues): z_k − G(x̂_{k,k−1}).

A 6-degree-of-freedom (DOF) Cartesian manipulator, i.e., AFMA-6, with an endpoint-mounted AVT-MARLIN F-033C CCD camera (at IRISA-INRIA, Rennes) and the target object shown in Fig. 2 were used. The robot was calibrated and operated under Linux with the visual-servoing software ViSP [39]. The camera images were sent at 50 fps (frames/s) to the host PC with an Intel Core 2 (2.93 GHz) running under Linux, on which frame grabbers had been installed. The images had a size of 128 × 182 pixels, and the effective focal length was F = 12.5 mm. The image processing and control computations were carried out on the host, and then the control output was transmitted to the robot controller via a PCI-VME bus-adapter board. About 10 ms was required for the control action. The camera parameters, namely the image center and interpixel spacings, were obtained from the calibration program. The initial estimate of the pose was obtained using DeMenthon's method [19]. The sampling period was T = 0.06325 s.

The robot was commanded to travel through a predefined trajectory over a stationary object. The maximum velocity of the AFMA-6 endpoint was set through ViSP. Therefore, for a given set of nodal points, different trajectories with various dynamics were designed. The good estimation power of a tuned EKF in relatively slow motion has already been shown [3], [10]. Therefore, EKF formed the comparison base. The purpose of experiment 1 was to compare the performance of the various KF-based methods under an accurately calibrated robot framework. The maximum-velocity components of the endpoint trajectory were set to 50 mm/s and 5°/s for the translational and rotational coordinates, respectively, to generate moderate motion dynamics. A null state-noise-covariance matrix was initially introduced to simulate the case of poorly tuned KF-based estimators for a variety of trajectories. The endpoint relative trajectory was designed to incorporate sudden velocity changes and significant nonlinearities. The purpose was to investigate the adaptation capability of AEKF and IAEKF to deviations from the constant-velocity assumption of the KF process model, and to evaluate the iterative performance of IEKF and IAEKF in approximating the output-model linearization. The inspection of the results (see Figs. 3–6 and Tables II and III) indicates that the estimation accuracy of all algorithms is better in X, Y, and roll than in the depth parameters Z, pitch, and yaw. The results also show that sudden changes in the velocity lead to divergence of EKF (see Fig. 3). This is due to the assumption of constant velocity in the state model. However, both AEKF and IAEKF were able to adapt to velocity changes (see Figs. 4 and 5). Fig. 5 shows that, although IEKF performance is superior to that of EKF, the lack of a noise-adaptation mechanism in IEKF leads to significant errors and divergence toward the end of the relative pose trajectory. Table II shows pose-estimate-error statistics

Fig. 3. Dynamic performance of pose estimation by EKF and forward kinematics estimators (experiment 1).
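The default filter parameters quoted above can be assembled directly; a small sketch with illustrative variable names:

```python
import numpy as np

p = 5      # number of feature points
N = 20     # adaptation window size
m = 30     # measurement-linearization iterations
eta = 5    # fading-memory parameter

# R0: diagonal, 0.01 pixel^2 per image coordinate (2 per feature point).
R0 = 0.01 * np.eye(2 * p)

# P00: block diagonal, one 2x2 block diag[0.02, 0.01] per pose coordinate
# (position/velocity or angle/angular-velocity pairs of the state in (3)).
P00 = np.kron(np.eye(6), np.diag([0.02, 0.01]))
```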
TABLE II. POSE-ERROR STATISTICS FOR DIFFERENT KF-BASED ESTIMATORS WHEN COMPARED WITH KINEMATIC ESTIMATOR (EXPERIMENT 1)

TABLE III. IMAGE-PLANE-ERROR VARIANCE FOR DIFFERENT KF-BASED ESTIMATORS IN PIXELS SQUARED (EXPERIMENT 1)

Fig. 4. Dynamic performance of pose estimation by AEKF and forward kinematics estimators (experiment 1).

TABLE IV. POSE-ERROR STATISTICS FOR DIFFERENT KF-BASED ESTIMATORS WHEN COMPARED WITH KINEMATIC ESTIMATOR (EXPERIMENT 3)

TABLE V. IMAGE-PLANE-ERROR VARIANCE FOR DIFFERENT KF-BASED ESTIMATORS IN PIXELS SQUARED (EXPERIMENT 3)

Fig. 7. Dynamic performance of pose estimation by AEKF and forward kinematics estimators (experiment 2a).
400 mm in all position coordinates. While IEKF and IAEKF provided [13] R. M. Haralick, C. Lee, K. Ottenberg, and M. Nolle, “Review and analysis
acceptable results upto 200 mm deviation from the initial position, of solutions of the three point perspective pose estimation,” Int. J. Comput.
other methods failed at 100 mm deviation. The mean-error values for Vis., vol. 12, no. 3, pp. 331–356, 1994.
[14] O. Faugeras, Three-Dimensional Computer Vision. Cambridge, MA:
IAEKF and IEKF remained within +10% of those reported in Table II MIT Press, 1993.
(i.e., almost-perfect pose initialization). [15] D. DeMenthon and L. S. Davis, “Exact and approximate solutions of the
perspective-three point problem,” IEEE Trans. Pattern Analys. Mach.
Intell., vol. 14, no. 11, pp. 1100–1105, Nov. 1992.
VI. CONCLUSION
Different KF-based methods of pose estimation have been discussed. A new pose-estimation method, namely the IAEKF algorithm, has also been introduced. All methods have been compared for their performance under different experimental conditions. It has been shown that the mechanisms of noise adaptation and iterative measurement linearization can be integrated within a novel IAEKF algorithm to obtain superior performance in comparison with other KF-based methods. In particular, the robustness of the IAEKF has been established through experiments, and it has been demonstrated that the IAEKF can improve pose-estimation performance in the presence of erroneous a priori statistics, nonlinear measurement functions and fast-tracking trajectories, slow sampling rates, and erroneous pose initialization. The improvements have been obtained at an additional computational cost, which is, in general, modest given current PC technology and when compared with feature-selection and image-processing times in RVS.
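The two mechanisms combined in the IAEKF can be illustrated with a minimal sketch. This is not the authors' implementation: the update equations follow the standard iterated EKF, and the innovation-based forgetting-factor rule for adapting the measurement-noise covariance R (with hypothetical parameters `n_iter` and `alpha`) is a simplified stand-in for the adaptation scheme described in the paper.

```python
import numpy as np

def iterated_adaptive_update(x, P, z, h, H_jac, R, n_iter=3, alpha=0.95):
    """One IAEKF-style measurement update (illustrative sketch only).

    x, P   : prior state estimate and covariance
    z      : measurement vector
    h      : nonlinear measurement function h(x)
    H_jac  : function returning the Jacobian of h at a given state
    R      : current measurement-noise covariance estimate
    n_iter : number of relinearization iterations
    alpha  : forgetting factor for the noise-adaptation rule (assumed form)
    """
    xi = x.copy()
    for _ in range(n_iter):
        # iterative measurement linearization: relinearize about the iterate xi
        H = H_jac(xi)
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        # standard iterated-EKF state update about the current iterate
        xi = x + K @ (z - h(xi) - H @ (x - xi))
    P_new = (np.eye(len(x)) - K @ H) @ P
    # noise adaptation: update R from the post-fit residual (simplified rule)
    nu = z - h(xi)
    R_new = alpha * R + (1 - alpha) * (np.outer(nu, nu) + H @ P_new @ H.T)
    return xi, P_new, R_new

# Toy usage: estimate a 2-D position from a range/bearing measurement.
x0, P0 = np.array([1.0, 1.0]), np.eye(2)
h = lambda x: np.array([np.hypot(x[0], x[1]), np.arctan2(x[1], x[0])])
H_jac = lambda x: np.array(
    [[x[0] / np.hypot(x[0], x[1]), x[1] / np.hypot(x[0], x[1])],
     [-x[1] / (x[0]**2 + x[1]**2), x[0] / (x[0]**2 + x[1]**2)]])
z = h(np.array([1.2, 0.9]))  # measurement generated from a "true" state
x1, P1, R1 = iterated_adaptive_update(x0, P0, z, h, H_jac, 0.01 * np.eye(2))
```

Repeating the linearization about the refined iterate reduces the error that a single linearization about the prior would leave in a strongly nonlinear measurement function, while the residual-driven update of R lets the filter recover from erroneous a priori noise statistics.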
ACKNOWLEDGMENT

The authors would like to thank the Lagadic staff, particularly F. Chaumette and F. Spindler, for useful discussions and their assistance with the experiments during his visit to INRIA-IRISA. The authors also acknowledge the assistance of A. Vakanski in the simulations and efficiency calculations.
IEEE TRANSACTIONS ON ROBOTICS, VOL. 26, NO. 5, OCTOBER 2010 947