Combined visual odometry and visual compass for off-road mobile robots localization
Ramón González, Francisco Rodríguez, José Luis Guzmán, Cédric Pradalier and Roland Siegwart
available at http://www.ual.es/personal/rgonzalez/videosVO.htm.

The paper is organized as follows. Section 2 describes the methodology to estimate the robot location, combining the position obtained using visual odometry and the orientation obtained using the visual compass method. Section 3 is devoted to implementation issues. Physical experiments are discussed in Section 4. Finally, conclusions and a discussion of the physical experiments are detailed in Section 5.

2. Methodology
Visual odometry constitutes a straightforward and cheap method to estimate the robot location.9, 16, 17 A single consumer-grade camera can replace a typically expensive sensor suite (encoders, IMU, GPS, etc.). It is especially appropriate for off-road applications, since the visual information is used to estimate the actual velocity of the robot, thereby minimizing slip phenomena.17 The main limitations of vision-based techniques are related to the light and imaging conditions (i.e., terrain appearance, camera parameters, etc.) and to the computational cost.

Generally, there are two ways to estimate the location of a mobile robot using the visual odometry paradigm. The most popular method is called optical flow.13, 18, 19 It is based on tracking distinctive features between successively acquired images.18 In this case, an image is matched with the previous one by individually comparing each feature on them and finding candidate matching features based on the Euclidean distance of their feature vectors. Afterwards, the velocity vector between these pairs of points is calculated and the displacement is obtained from these vectors.18 Optical flow is especially advisable for textured scenarios, such as urban and rough environments.8, 11, 20 This approach has been tested using single,16 stereo,21 and omnidirectional cameras.20

A slightly different approach is the template matching method.22–24 It avoids the problem of finding and tracking features, and instead looks at the change in the appearance of the world (images). For that purpose, it takes a template or patch from an image and tries to match it in the previous image. The main difference with optical flow is that no identification or tracking of features is involved, and there is no need to measure image velocities at different locations.25 The appearance-based method has been successfully applied employing single26, 27 and omnidirectional cameras.24

The main difference between the optical flow and template matching approaches is that when the scene is low-textured, the number of detected and tracked features (single patterns) is low, which can lead to a poor motion estimate.8 This means that optical flow can fail on almost featureless scenarios (such as sandy soils, urban floors, etc.), where images with few high gradients are grabbed. On the other hand, the template matching approach works properly in low-texture scenarios, since a larger pattern (template) is employed and, therefore, the probability of a successful match is increased.24

The previous discussion motivates why the template matching method has been selected in this work. However, it is important to remark that if the matching process fails (false matches), the robot motion estimate can become degraded. In order to minimize this shortcoming, which is especially undesirable when estimating the robot orientation, a second camera is added. This solution is inspired by two recent works, in which a method called visual compass was proposed to estimate rotational information from omnidirectional cameras.20, 24 The visual compass technique is based on a camera mounted vertically on a mobile robot: a pure rotation about the vertical axis results in a single column-wise shift of the appearance in the opposite direction. In this way, the rotation angle is retrieved by matching a template between the current image (after rotation) and the previous one (before rotation).20

In this section, the steps carried out to estimate the robot location using visual odometry based on the template matching method are explained. Our strategy takes two image sequences as input. One image sequence comes from a single standard camera pointing at the ground under the robot, and the second one comes from a camera looking at the environment. The former is employed to estimate the robot longitudinal displacement (Section 2.2), and the latter is employed to estimate the robot orientation (Section 2.3). Firstly, the mathematical formulation of template matching is briefly described in the following subsection.

2.1. Template matching
The template matching method is defined as the process of locating the position of a sub-image inside a larger image. The sub-image is called the template and the larger image is called the search area.22, 23 This process involves shifting the template over the search area and computing the similarity between the template and a window in the search area. This is achieved by calculating the integral of their product; when the template matches, the value of the integral is maximized. There are several methods to address template matching; see refs. [28, 29] for a review. Here, the cross-correlation solution has been implemented (several methods were compared and the best result, i.e., the fewest false matches, was obtained with the cross-correlation approach). It is based on calculating an array of dimensionless coefficients for every image position (s, v) as22, 29

R(s, v) = \sum_{i=0}^{h-1} \sum_{j=0}^{w-1} \big( T(i, j) - \bar{T}(i, j) \big) \big( I(i+s, j+v) - \bar{I}(i+s, j+v) \big),   (1)

where h ∈ ℝ⁺ and w ∈ ℝ⁺ are the height and the width of the template, respectively, T(i, j) and I(i, j) are the pixel values at location (i, j) of the template and the current search area, respectively, and T̄(i, j) and Ī(i, j) are the mean values of the template and current search area, respectively. These mean values are calculated as

\bar{T}(i, j) = \frac{1}{wh} \sum_{a=0}^{h-1} \sum_{c=0}^{w-1} T(a, c),   (2)
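As a concrete illustration of Eqs. (1) and (2), the following sketch computes the correlation array for a grayscale template and search area. Python and NumPy are used here purely for illustration (the implementation reported later in the paper relies on the OpenCV library28); the function and variable names are ours, and the double loop mirrors the equations rather than an optimized implementation.

import numpy as np

def correlation_array(template: np.ndarray, search: np.ndarray) -> np.ndarray:
    """Correlation coefficients of Eq. (1) for every shift (s, v).

    `template` is the h x w patch T and `search` is the larger search area I;
    both are grayscale images (converted to float).
    """
    h, w = template.shape
    H, W = search.shape
    t_centred = template - template.mean()        # T - T_bar, with T_bar as in Eq. (2)
    R = np.empty((H - h + 1, W - w + 1), dtype=np.float64)
    for s in range(H - h + 1):
        for v in range(W - w + 1):
            window = search[s:s + h, v:v + w]     # window of I anchored at (s, v)
            R[s, v] = np.sum(t_centred * (window - window.mean()))
    return R

In practice the coefficient is normalized so that it lies between −1 and +1 (the array R̃ referred to below); OpenCV's cv2.matchTemplate(search, template, cv2.TM_CCOEFF_NORMED) computes a normalized correlation coefficient of this kind.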
Fig. 1. (Colour online) Visual odometry based on template matching using a camera pointing at the terrain under the robot.
Notice that the value of R̃ ranges between −1 and +1, and the closer R̃ is to +1, the more similar the template and the current image are. The best match is therefore defined as

\tilde{R}^{M} = \max\big( \tilde{R}(s, v) \big),   (6)

where R̃^M is the maximum value of the array R̃ and (s^M, v^M) is the position of that point.

2.2. Estimating robot displacement
This subsection focuses on the estimation of the robot longitudinal displacement using the images taken by the camera pointing at the ground.

As shown in Fig. 1, at sampling instant t = τ − 1, the robot takes a picture of the ground under it. At the following sampling instant, t = τ, the template matching approach is employed to find a defined template from the previous image

where x ∈ ℝ and y ∈ ℝ are the camera longitudinal and lateral displacements in physical world units, respectively, Z ∈ ℝ⁺ is the height of the camera above the ground (see Assumption 1), and f_x^g ∈ ℝ, f_y^g ∈ ℝ are the focal lengths of the camera pointing at the ground.

Assumption 1. It is assumed that the distance between the camera and the ground is almost constant (see Remark 1).

Remark 1. We would like to remark that although on non-smooth surfaces the distance between the downward camera and the ground is not fixed due to vibrations, it is reasonable to assume this variation to be zero or zero-mean, since such little oscillations are cancelled out during the experiment.30 Notice that on a rougher surface an IMU sensor or a laser sensor should be used to estimate the height of the camera, leading to a 3D localization.27 Recently, a novel approach consists in using telecentric cameras.31 These cameras are electronically modified in such a way that the lens keeps the
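The best match of Eq. (6) and its conversion into a metric displacement can be sketched as follows. The pinhole-style conversion using the camera height Z and the focal lengths f_x^g, f_y^g is our assumption based on the variables described above — the paper's own conversion equations fall on pages not reproduced here — so treat this as an illustrative sketch rather than the authors' exact formula.

import numpy as np

def best_match(R_norm: np.ndarray):
    """Eq. (6): maximum of the normalized correlation array and its position (s^M, v^M)."""
    s_M, v_M = np.unravel_index(np.argmax(R_norm), R_norm.shape)
    return R_norm[s_M, v_M], int(s_M), int(v_M)

def pixel_shift_to_metric(ds_px: float, dv_px: float, Z: float, fx: float, fy: float):
    """Assumed pinhole conversion: a pixel shift seen by the ground camera scales
    with the camera height Z divided by the focal length (in pixels)."""
    x = Z * ds_px / fx   # longitudinal displacement (world units)
    y = Z * dv_px / fy   # lateral displacement (world units)
    return x, y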
(5) Translate from the camera plane to the world plane using the camera calibration parameters by means of Eq. (8).
(6) Compute the rotation angle using the visual compass method, Eq. (13).
(7) Estimate the robot location using the translation information given by the camera pointing at the ground, Eq. (9), and the rotation angle given by the camera looking at the environment, Eq. (14).
(8) Repeat from Step 1.

3. Implementation Issues
In this section, the computational aspects of template matching (Section 3.1) and the selection of the search area and template sizes (Section 3.2) are discussed.

3.1. Computational aspects of template matching
This subsection discusses some experiments carried out to select the most appropriate template/search area size for a satisfactory performance of the correlation algorithm and a proper computation time.

The main drawback of the template matching approach is its computation cost, since the template has to be slid over the whole search area. In the general case, the cost of detecting a single template T_{m×m} within an image I_{n×n} by means of a matching process is O = m²(n − m + 1)².22 For that reason, two important issues to be investigated are the template size and the search area size.

Notice that here the possibility of speeding up the matching process from an algorithmic point of view is not considered. This subsection only deals with determining the template/search area size to reach a trade-off between performance and computation time.

Firstly, the proper template size is studied, and later on, a way to reduce the search area is analyzed. The template is obtained as a reduced square window of the image taken at sampling instant τ − 1. The template origin has been established at the image center, and the top left corner of the template is located at27

T^{q}(s) = \frac{W^{q}}{2} - \frac{T^{q}_{\text{size}}}{2}, \qquad T^{q}(v) = \frac{H^{q}}{2} - \frac{T^{q}_{\text{size}}}{2},   (15)

where T^q(s, v) is the top left corner of the template, q refers to the images taken by the camera pointing at the ground (q = g) and to the camera looking at the environment (q = e), W^q ∈ ℝ⁺ and H^q ∈ ℝ⁺ are the width and height of the original image, respectively, and

T^{q}_{\text{size}} = \frac{1}{\rho^{q}} H^{q},   (16)

is the template size, ρ^q ≥ 1 being a reduction factor tuned experimentally (it is explained subsequently).

Notice that the larger the template, the smaller the probability that it is matched in the search area. This means that if too large a template is selected, it may not be possible to find it in the following image. On the contrary, the smaller the template, the higher the probability of falling into false matches; that is, if too small a template is selected, several areas of the following image can match that template.

As commented previously, the second way to speed up the correlation matching process consists in using a reduced window of the original image instead of the whole image. Such a reduced search area is given by

\text{Win}^{q}_{w} = \frac{1}{\lambda^{q}} W^{q}, \qquad \text{Win}^{q}_{h} = \frac{1}{\lambda^{q}} H^{q},   (17)

where Win^q is the size of the new reduced image and λ^q ≥ 1 is a reduction factor tuned experimentally (it is explained subsequently). The reduced image will start at the point Win^q(s, v) and it will have a size of Win^q_w × Win^q_h. The top left corner of the new image is

\text{Win}^{q}(s) = \text{Win}^{q}_{w} - \frac{\text{Win}^{q}_{w}}{\lambda^{q}}, \qquad \text{Win}^{q}(v) = \text{Win}^{q}_{h} - \frac{\text{Win}^{q}_{h}}{\lambda^{q}}.   (18)

In this way the computation time is decreased, since the correlation process is carried out over a smaller image, as shown in the following subsection.

3.2. Selection of the search area and template sizes
Before carrying out physical experiments, the effect of the template and image sizes on the computation time has been analyzed. Image sequences taken during physical experiments are also employed here (see Fig. 7). Notice that the experiments have been carried out on an Intel Core 2 Duo 2.5 GHz computer with 3.5 GB RAM using OpenCV (Version 1.1).28

Figure 3 shows the resulting computation time for varying template and image sizes (“Mean” is the mean computation time over the sequence of images and “Std” denotes the standard deviation). Here the template and image sizes of the image sequence employed by the visual compass method are fixed to ρ^e = 4 and λ^e = 1.7 in Eqs. (16) and (17), respectively. As observed, a larger template size (smaller ρ^g) implies a lower computation time. When the matching process is applied over a smaller search area (larger λ^g), the computation time also decreases. The computation time when the images do not undergo any reduction (black triangles) is also displayed. From this analysis, the reduction factors ρ^g = 3 and λ^g = 1.2 have been selected, since they constitute a compromise between a suitable computation time (< 0.2 s) and success in the matching process. It is important to remark that although the search area could be reduced further, such smaller search areas lead to an unfeasible matching process. This means that, for the experiments carried out in this work, if smaller search areas were considered, the number of false matches would increase to unsuitable values and the robot location could not be reliably estimated.
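The bookkeeping of Eqs. (15)–(18) — a centred template of side H^q/ρ^q and a search window reduced by the factor λ^q — can be summarized in a few lines of Python (the function and variable names are ours; only the equations above are used).

def template_and_window(W: int, H: int, rho: float, lam: float):
    """Top-left corners and sizes of the centred template (Eqs. (15)-(16))
    and of the reduced search window (Eqs. (17)-(18)) for a W x H image."""
    t_size = int(H / rho)                      # Eq. (16): template side
    t_s = W // 2 - t_size // 2                 # Eq. (15): template top-left corner
    t_v = H // 2 - t_size // 2
    win_w, win_h = int(W / lam), int(H / lam)  # Eq. (17): reduced window size
    win_s = win_w - int(win_w / lam)           # Eq. (18): window top-left corner
    win_v = win_h - int(win_h / lam)
    return (t_s, t_v, t_size), (win_s, win_v, win_w, win_h)

# 640 x 480 ground-camera frames (Section 4.1) with the factors selected above:
template, window = template_and_window(640, 480, rho=3.0, lam=1.2)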
Fig. 3. (Colour online) Analysis of template and image sizes on computation time (images from camera pointing at the ground). The size
of the images from the Pancam is fixed.
4. Results
In this section, the physical experiments carried out to
localize a tracked mobile robot using the suggested visual
odometry approach are discussed. In this case, the robot was teleoperated on a sunlit off-road terrain. For comparison purposes, we collected vision data (cameras), global
position (DGPS), odometry data (encoders), and absolute
orientation (magnetic compass). The frames were grabbed
at 5 Hz and the robot velocity ranged between 0.4 m/s and
0.5 m/s. Notice that for the kind of applications for which our mobile robot is intended (greenhouse tasks),14 these are considered an appropriate sampling period and velocity
range. Here the DGPS and the magnetic compass data
are considered as ground-truth for position and orientation,
respectively. Notice that the position obtained using the DGPS is translated to a relative position. For this purpose, the global position (latitude/longitude) was converted to the Universal Transverse Mercator (UTM) grid system.7
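As an illustration of this conversion, the snippet below projects latitude/longitude fixes onto UTM coordinates with the pyproj library and subtracts the first fix, so that the DGPS track becomes a relative (x, y) trajectory in metres. The library choice and the UTM zone (30N, which covers Almería) are ours, not the paper's.

from pyproj import Transformer

# WGS84 latitude/longitude -> UTM zone 30N (EPSG:32630, covering Almeria, Spain).
to_utm = Transformer.from_crs("EPSG:4326", "EPSG:32630", always_xy=True)

def dgps_to_relative_xy(lons, lats):
    """Convert DGPS fixes to metres relative to the first fix."""
    easting, northing = to_utm.transform(lons, lats)
    return ([e - easting[0] for e in easting],
            [n - northing[0] for n in northing])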
We have tried several experiments. In this case, we firstly present a physical experiment in which the robot was driven along a squared trajectory where the total travelled distance was close to 160 m. After that, we discuss an S-shaped trajectory with a total travelled distance close to 290 m. Finally, we show a circular trajectory in which the total travelled distance was 65 m.

4.1. Testbed
The robot available at the University of Almería (Spain) is a tracked mobile robot called Fitorobot (see Fig. 4).14 The mobile robot has a mass of 500 kg and its dimensions are 1.5 m long × 0.7 m wide. It is driven by a 20-HP gasoline engine.

Fig. 4. (Colour online) Tracked mobile robot Fitorobot at the experiment site. Observe the position of the two cameras on the robot.

We have employed two consumer-grade cameras, Logitech 2-Mpixel QuickCam Sphere AF webcams with a maximum frame rate of 30 fps. In this case, a resolution of 640 × 480 has been employed. For calibration purposes, Matlab's camera calibration toolbox has been used.34

The rest of the sensors were one magnetic compass (C100, KVH Industries Inc.), two incremental encoders (DRS61, SICK AG), and one DGPS (R100, Hemisphere). The accuracy of the DGPS under motion is about 0.20 m. The resolution of the magnetic compass is 0.1°.
Fig. 5. (Colour online) Images taken by the camera pointing at the ground (velocities > 1 m/s): (a) Blur effect (flat terrain); (b) vibrations
effect (bumpy terrain).
Notice the position of the cameras for visual odometry (circles) in Fig. 4. The camera looking at the environment was mounted on the top center of the robot. The camera pointing at the ground in front of the robot is in the middle of both tracks at a height of 0.49 m, and the distance between the camera and the robot center is 0.9 m.

4.2. Preliminary experiments: shadows and blur phenomenon
From physical experiments, it was noticed that when the robot moves at velocities greater than 1 m/s on flat terrains, the blur phenomenon corrupts the images taken by the camera pointing at the ground (see Fig. 5(a)). Blur occurs when an image is captured while the camera is moving during the exposure or shutter time.35 This phenomenon is difficult to remove, and elaborate solutions have to be considered to minimize its influence. For instance, in ref. [36], the authors formulate a learning policy as a trade-off between the localization accuracy and the robot velocity. In ref. [35], the authors propose to carry out a preprocessing step before detecting features in the image. In this work, preprocessing of the images, such as an enhancing filter, is not appropriate, since it would increase the computation time assigned to the vision algorithm. Bounding the robot velocity can be a successful solution; however, it would entail a certain degree of conservativeness for the motion controllers.

In relation to the image shown in Fig. 5(b), it is also interesting to remark that the blur phenomenon is stressed by the vibrations affecting the mobile robot. This is a difficult issue to remove, since the tracked mobile robot employed for the physical experiments has a limited suspension mechanism that produces unavoidable vibrations on the robot structure. In conclusion, as a first approach in this work, visual odometry was employed when the robot was moving at velocities lower than 1 m/s.

Another important issue observed from the outdoor physical experiments is the problem found in environments with changing lighting conditions, which can lead to shadows in the images taken by the camera pointing at the ground (see Fig. 6). After analyzing many experiments, it was concluded that when there are shadows in the images, the risk of false matches increases considerably. For this reason, this phenomenon has been studied in depth and two approaches to minimize its effect have been proposed.
Fig. 6. (Colour online) Position and image acquired by groundcam with shadows. (a) Height of the groundcam. (b) Shadows in the
groundcam.
Fig. 7. (Colour online) Result of template matching in the experiment site (gravel soil). (a) Panoramic view. (b) Ground view.
First, the position and the height of the camera pointing at the ground were studied carefully. In this case, the camera was mounted in front of the robot between both tracks at a height of 0.49 m, see Fig. 6(a). This distance was obtained as a trade-off between shadow reduction and template matching performance: a larger distance leads to more features, but shadows can appear; on the contrary, a shorter distance corresponds to a smaller field of view, where the probability of shadows in the images is reduced, but it can lead to featureless images.

Secondly, a threshold filter has been tuned. It compares the current pixel displacement with the previous ones; if the difference is greater than the threshold (experimentally selected), then the current value is considered an outlier. In this way, these peaks or outliers, due to false matches, are partially compensated. As shown in the following subsection, this filter works properly and requires a small computation time.
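A minimal sketch of such an outlier filter is shown below. The description above fixes only the comparison against the previous displacement and an experimentally selected threshold; reusing the last accepted value when a sample is rejected is our assumption about how the outliers are compensated.

def filter_pixel_displacement(current: float, previous: float, threshold: float) -> float:
    """Reject pixel displacements that jump too far from the previous one.

    If |current - previous| exceeds the experimentally selected threshold, the
    current value is treated as an outlier (false match) and the previously
    accepted displacement is kept instead (our assumption, see above).
    """
    if abs(current - previous) > threshold:
        return previous
    return current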
4.3. Physical experiments in off-road conditions
Several trajectories were tested to check the performance of the suggested localization approach. In this case, three experiments were selected. In the first one, the robot was driven along a rectangular trajectory approximately 55 m long and 20 m wide. The total travelled distance was close to 160 m. In the second experiment, the robot was driven along an S-shaped trajectory with three paths parallel to the x-axis of 80 m and two perpendicular paths of 20 m. The total travelled distance was close to 290 m. Finally, a circular trajectory was selected; the total travelled distance was close to 65 m. Notice that trajectories similar to those selected here are usually employed in off-road mobile robotics, see, for instance, refs. [12, 20, 27]. The sampling period was Ts = 0.2 s and the robot velocity ranged between 0.4 m/s and 0.5 m/s.

The compared localization techniques are visual odometry with visual compass (denoted as “VO + VC” in the figures), visual odometry using only the downward camera (see Remark 2; referred to as “VO”), and wheel-based odometry (denoted as “Odo”). The position ground-truth comes from a DGPS (labelled as “DGPS”), and the orientation ground-truth comes from a magnetic compass (denoted as “Compass”).

4.3.1. Experiment 1. Rectangular trajectory. In this experiment, the robot was manually driven on a sunlit gravel terrain following a rectangular trajectory. In this case, the lighting conditions did not produce any significant shadows during the experiment.

Figure 7 shows two frames employed by the vision-based localization technique during this experiment. The pixel displacement is marked by the green line and the red circle, the template is labelled by the blue rectangle, and the black rectangle marks the reduced area in which the matching process is carried out.

Figure 8 shows the resulting trajectories. It is observed that the visual odometry with visual compass trajectory closely follows the ground-truth, while the wheel-based odometry estimate diverges largely from the ground-truth; particularly, odometry fails at turns. The trajectory obtained using the image sequence from the camera pointing at the ground to estimate orientation is also plotted, and it has a similar result to that obtained using the approach combining information from both cameras (visual odometry with visual compass).

A deeper analysis is obtained by looking at Fig. 9. Here, the error between each localization method and the ground-truth is shown quantitatively. In this case, the Euclidean distance between the initial and the final positions of the robot in the four parts of the trajectory is calculated, that is, two paths parallel to the x-axis (Parts 1 and 3) and two perpendicular ones (Parts 2 and 4). From these data, it is observed that the visual odometry with visual compass approach achieves the smallest error. The other vision-based technique also achieves an admissible error. The relative mean errors with respect to the total travelled distance are 1.45% for visual odometry with visual compass, 2.33% for visual odometry alone (using only the downward camera), and 16% for wheel-based odometry.

Figure 10 displays the orientations. Here, it is checked that the orientations obtained through the visual odometry-based approaches properly follow the ground-truth. The mean orientation errors are 8.2° for visual odometry with visual compass, 14.1° for visual odometry, and 39.37° for wheel-based odometry.
[Figures 8–11 (plots): Fig. 8, the resulting trajectories in the x–y plane (m) for VO, VO + VC, DGPS and Odo; Fig. 9, the Euclidean distance (m) to the ground-truth for each part of the trajectory (VO, VO + VC, Odo); Fig. 10, the orientations (deg) versus travelled distance (m) for VO, VO + VC, Compass and Odo; Fig. 11, Δs (pixel, groundcam) versus Δu (pixel, pancam).]
In Fig. 10, it is possible to observe the unavoidable error-growth phenomenon of odometry-based solutions, that is, the deviation between the ground-truth and the rest of the techniques increases along the travelled distance (integration of the noise and the error over time).

In Fig. 11, the longitudinal (Δs) and lateral (Δu) pixel displacement values related to the visual odometry with visual compass approach are shown. Notice that values close to zero mean small displacements (low velocity), and high values mean large displacements (high velocity). In this plot, it is observed that the points are aligned in two directions, this effect being due to the pixel displacements during straight motions (Δs component) and during turns (Δu component). It is checked that template matching is highly robust, with few outliers (false or unsuccessful matches). It is important to point out three interesting conclusions from this plot. Firstly, since the robot always turns in the same sense (to the left side), the lateral pixel displacement (Δu) is also aligned in one direction. Secondly, when the robot is turning, it does not move forward, since, as observed, Δs is close to zero at turns. Finally, note that in the range Δs = (−20, −40) pixel, Δu is zero, which corresponds to the moment in which the robot is stopping before turning.

4.3.2. Experiment 2. S-shaped trajectory. In this experiment, a longer trajectory, in which the robot changed direction several times, was tested. Furthermore, in some parts of the experiment site, the lighting conditions produced shadows that affected the performance of the vision-based localization strategies.

Figure 12 shows the resulting trajectories. It is observed that the visual odometry with visual compass trajectory does not follow the ground-truth accurately, mainly for one reason: as checked during the first path perpendicular to the x-axis and the second parallel path, the trajectory is shorter than the ground-truth. This fact is due to false matches obtained from the camera pointing at the ground caused by shadows (see Fig. 15(b)). This erroneous behavior is worst in the case of visual odometry alone, since now both the longitudinal displacement and the orientation are obtained from the camera pointing at the ground.
Fig. 12. (Colour online) Experiment 2. S-shaped trajectory.
Fig. 14. (Colour online) Experiment 2. Orientations.
The largest deviation is obtained during the first path perpendicular to the x-axis. Again, the wheel-based odometry diverges largely from the ground-truth; particularly, odometry fails at turns.

In Fig. 13, the error between each localization technique and the ground-truth is displayed. In this case, the Euclidean distance between the initial and the final position of the robot is calculated in five parts, that is, three paths parallel to the x-axis (Parts 1, 3, and 5) and two perpendicular ones (Parts 2 and 4). As expected, the visual odometry with visual compass approach obtains an admissible error except during the second and the third paths. The relative mean errors with respect to the total travelled distance are 2.46% for visual odometry with visual compass, 7.60% for wheel-based odometry, and 19.50% for visual odometry.

In Fig. 14, the orientations are plotted with respect to the travelled distance. Here the erroneous behavior of the visual odometry approach during the first path parallel to the x-axis is noticed. The visual odometry with visual compass approach estimates the orientation properly and follows the ground-truth. The mean orientation errors are 4.8° for visual odometry with visual compass, 10.2° for wheel-based odometry, and 148.2° for visual odometry. The mean orientation error for the case of visual odometry cannot be considered a comparable value, since it has a large standard deviation.
Fig. 13. (Colour online) Experiment 2. Comparison of the Euclidean distance with respect to the ground-truth.
Fig. 15. (Colour online) Experiment 2. Template matching using both cameras. (a) Lateral and longitudinal displacement. (b) Longitudinal
displacement (groundcam).
In Fig. 15(a), the longitudinal (Δs) and lateral (Δu) pixel displacement values are shown. In contrast to the previous experiment, the pixels related to the lateral displacement (Δu) are now aligned in two directions (right and left turns). Notice that the turns to the right side were carried out at a higher linear velocity than the turns to the left side. In this case, there are some outliers when the robot moved in a straight line (Δs < −40 pixel), which can explain the small deviation obtained at the end of the first path parallel to the x-axis (see Fig. 12). On the other hand, in Fig. 15(b), the longitudinal pixel displacements (Δs) with respect to the acquired images are displayed. As noticed, during the samples (1200, 1600) there is an erroneous behavior (false matches). This behavior explains why the trajectories obtained with the vision-based approaches are shorter than the ground-truth during the first path perpendicular to the x-axis. The false matches found in the interval (2300, 2800) explain why the trajectories are shorter than the ground-truth during the second path parallel to the x-axis.

A deeper understanding of the erroneous behavior of the visual odometry approach is obtained by analyzing Fig. 16. Recall that, for the case of visual odometry alone, the robot orientation comes from the lateral pixel displacement obtained from the camera pointing at the ground (see Remark 2). As checked in Fig. 16(a), many outliers appear in the Δv component; compare it with the visual compass pixel displacement (Δu) obtained from the camera looking at the environment in Fig. 15(a). In Fig. 16(b), notice that these outliers occur within two intervals in which false matches appeared due to shadows (see Fig. 15(b)).

4.3.3. Experiment 3. Circular trajectory. Finally, a circular trajectory was tested in order to check the performance of the proposed localization strategies when estimating the robot orientation. The most challenging issue about this experiment is that the robot is always turning and, hence, there are always
Fig. 16. (Colour online) Experiment 2. Template matching process using groundcam. (a) Lateral and longitudinal displacement. (b) Lateral
displacement (groundcam).
Fig. 17. (Colour online) Experiment 3. Orientations.
Fig. 18. (Colour online) Experiment 3. Circular trajectory.
Table I. Summary of localization methods.
(Columns: Feature; Trajectory — Rectangular, S-shaped, Circular.)
have confirmed the appropriate behavior of the proposed scheme, with a mean error lower than 3% with respect to the total travelled distance. Furthermore, an acceptable computation time (< 0.17 s) has been achieved, taking into account the purposes of our testbed. However, it is important to remark that the current code can be highly optimized. Reducing the size of the search area during the matching process can be considered an incipient approach to decrease the computation time. The best improvement, on which we are currently working, consists of using the robot motion to reduce the image size accordingly and to estimate the position of the template during the matching process. We are also considering it in terms of a multi-objective problem (success of the matching process, reduction of the template size, and reduction of the image size).

The problem with shadows and the blur phenomenon, and hence with false matches, constitutes the most important shortcoming of the vision-based localization techniques. In this work, a deep analysis has been carried out to minimize these issues: the height of the downward camera was carefully selected and a threshold filter was tuned. Currently, we are investigating two ways to minimize such undesirable effects: firstly, mounting the downward camera just under the vehicle and using an artificial uniform source to light the ground (shadows issue), and secondly, acquiring a new camera with a shorter exposure time (blur issue).

On the other hand, probabilistic techniques, such as the Kalman filter or the particle filter, will be employed to fuse the orientations obtained using the visual compass and an absolute orientation sensor. In this way, we will reduce the unavoidable error growth of relative localization techniques. Finally, we will study how to improve the planar motion assumption (fixed height of the camera pointing at the ground) by employing an IMU sensor or a telecentric camera.

References
1. S. Thrun, S. Thayer, W. Whittaker, C. Baker, W. Burgard, D. Ferguson, D. Haehnel, M. Montemerlo, A. Morris, Z. Omohundro, C. Reverte and W. Whittaker, “Autonomous exploration and mapping of abandoned mines,” IEEE Robot. Autom. Mag. 11(1), 79–91 (2004).
2. J. Borenstein, “The CLAPPER: A Dual-Drive Mobile Robot with Internal Correction of Dead-Reckoning Errors,” In: Proceedings of IEEE Conference on Robotics and Automation, IEEE, San Diego, CA, USA (May 8–13, 1994) pp. 3085–3090.
3. O. Horn and M. Kreutner, “Smart wheelchair perception using odometry, ultrasound sensors, and camera,” Robotica 27(2), 303–310 (2009).
4. H. Beom and H. Cho, “Mobile robot localization using a single rotating sonar and two passive cylindrical beacons,” Robotica 13(3), 243–252 (1995).
5. S. Cho and J. Lee, “Localization of a high-speed mobile robot using global features,” Robotica 29(5), 757–765 (2010). Available on CJO.
6. R. Siegwart and I. Nourbakhsh, Introduction to Autonomous Mobile Robots, 1st ed., A Bradford Book (The MIT Press, Cambridge, MA, USA, 2004).
7. B. Hofmann-Wellenhof, H. Lichtenegger and J. Collins, Global Positioning System: Theory and Practice, 5th ed. (Springer, Germany, 2001).
8. A. Johnson, S. Goldberg, C. Yang and L. Matthies, “Robust and Efficient Stereo Feature Tracking for Visual Odometry,” In: Proceedings of IEEE International Conference on Robotics and Automation, IEEE, Pasadena, USA (May 19–23, 2008) pp. 39–46.
9. L. Matthies, M. Maimone, A. Johnson, Y. Cheng, R. Willson, C. Villalpando, S. Goldberg and A. Huertas, “Computer vision on Mars,” Int. J. Comput. Vis. 75(1), 67–92 (2007).
10. C. Olson, L. Matthies, M. Schoppers and M. Maimone, “Rover navigation using stereo ego-motion,” Robot. Auton. Syst. 43(4), 215–229 (2003).
11. I. Parra, M. Sotelo, D. Llorca and M. Ocaña, “Robust visual odometry for vehicle localization in urban environments,” Robotica 28(3), 441–452 (2010).
12. J. Campbell, R. Sukthankar, I. Nourbakhsh and A. Pahwa, “A Robust Visual Odometry and Precipice Detection System Using Consumer-grade Monocular Vision,” In: Proceedings of IEEE International Conference on Robotics and Automation, IEEE, Barcelona, Spain (Apr. 18–22, 2005) pp. 3421–3427.
13. L. Matthies, “Dynamic Stereo Vision,” Ph.D. Thesis (Pittsburgh, USA: Carnegie Mellon University, 1989).
14. R. González, F. Rodríguez, J. Sánchez-Hermosilla and J. Donaire, “Navigation techniques for mobile robots in greenhouses,” Appl. Eng. Agr. 25(2), 153–165 (2009).
15. R. González, “Localization of the CRAB Rover Using Visual Odometry,” Technical Report (Autonomous Systems Lab, ETH Zürich, Switzerland), available at: http://www.ual.es/personal/rgonzalez/english/publications.htm (2009). (Accessed September 2011).
16. D. Nistér, O. Naroditsky and J. Bergen, “Visual odometry for ground vehicle applications,” J. Field Robot. 23(1), 3–20 (2006).
17. A. Angelova, L. Matthies, D. Helmick and P. Perona, “Learning and prediction of slip from visual information,” J. Field Robot. 24(3), 205–231 (2007).
18. D. Lowe, “Distinctive image features from scale-invariant keypoints,” Int. J. Comput. Vis. 60(2), 91–110 (2004).
19. B. Lucas and T. Kanade, “An Iterative Image Registration Technique with an Application to Stereo Vision,” In: Proceedings of the DARPA Image Understanding Workshop, DARPA, Monterey, USA (Aug. 24–28, 1981) pp. 121–130.
20. D. Scaramuzza, “Omnidirectional Vision: From Calibration to Robot Motion Estimation,” Ph.D. Thesis (Zürich, Switzerland: Swiss Federal Institute of Technology, 2008).
21. V. Matiukhin, “Trajectory Stabilization of Wheeled System,” In: Proceedings of IFAC World Congress, IFAC, Seoul, Korea (2008) pp. 1177–1182.
22. R. Brunelli, Template Matching Techniques in Computer Vision: Theory and Practice (John Wiley, New Jersey, USA, 2009).
23. A. Goshtasby, S. Gage and J. Bartholic, “A two-stage correlation approach to template matching,” IEEE Trans. Pattern Anal. Mach. Intell. 6(3), 374–378 (1984).
24. F. Labrosse, “The visual compass: Performance and limitations of an appearance-based method,” J. Field Robot. 23(10), 913–941 (2006).
25. M. Srinivasan, “An image-interpolation technique for the computation of optic flow and egomotion,” J. Biol. Cybern. 71(5), 401–415 (1994).
26. S. Kim and S. Lee, “Robust mobile robot velocity estimation using a polygonal array of optical mice,” Int. J. Inf. Acquis. 5(4), 321–330 (2008).
27. N. Nourani-Vatani, J. Roberts and M. Srinivasan, “Practical Visual Odometry for Car-like Vehicles,” In: Proceedings of IEEE International Conference on Robotics and Automation, IEEE, Kobe, Japan (May 12–17, 2009) pp. 3551–3557.
28. G. Bradski and A. Kaehler, Learning OpenCV: Computer Vision with the OpenCV Library (O'Reilly, Sebastopol, CA, USA, 2008).
29. J. Rodgers and W. Nicewander, “Thirteen ways to look at the correlation coefficient,” Am. Stat. 42(1), 59–66 (1988).
30. M. Elmadany and Z. Abduljabbar, “On the statistical performance of active and semi-active car suspension systems,” Comput. Struct. 33(3), 785–790 (1989).
31. K. Nagatani, A. Ikeda, G. Ishigami, K. Yoshida and I. Nagai, “Development of a visual odometry system for a wheeled robot on loose soil using a telecentric camera,” Adv. Robot. 24(8–9), 1149–1167 (2010).
32. J. Montiel and A. Davison, “A Visual Compass Based on SLAM,” In: Proceedings of IEEE International Conference on Robotics and Automation, IEEE, Orlando, USA (May 15–19, 2006) pp. 1917–1922.
33. J. Sturm and A. Visser, “An appearance-based visual compass for mobile robots,” Robot. Auton. Syst. 57(5), 536–545 (2009).
34. J. Bouguet, “Camera Calibration Toolbox for Matlab,” available at: http://www.vision.caltech.edu/bouguetj/calib_doc/ (2008). (Accessed September 2011).
35. A. Pretto, E. Menegatti, M. Bennewitz, W. Burgard and E. Pagello, “A Visual Odometry Framework Robust to Motion Blur,” In: Proceedings of IEEE International Conference on Robotics and Automation, IEEE, Kobe, Japan (May 12–17, 2009) pp. 1685–1692.
36. A. Hornung, M. Bennewitz and H. Strasdat, “Efficient vision-based navigation. Learning about the influence of motion blur,” Auton. Robots 29(2), 137–149 (2010).