16 - Simultaneous Localization and Mapping - A Survey of Current Trends in Autonomous Driving
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TIV.2017.2749181, IEEE
Transactions on Intelligent Vehicles
IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, VOL. XX, NO. X, XX 201X 1
Abstract—In this article, we propose a survey of the Simultaneous Localization And Mapping field when considering the recent evolution of autonomous driving. The growing interest regarding self-driving cars has given new directions to localization and mapping techniques. In this survey, we give an overview of the different branches of SLAM before going into the details of specific trends that are of interest when considered with autonomous applications in mind. We first present the limits of classical approaches for autonomous driving and discuss the criteria that are essential for this kind of application. We then review the methods where the identified challenges are tackled. We mostly focus on approaches building and reusing long-term maps in various conditions (weather, season, etc.). We also go through the emerging domain of multi-vehicle SLAM and its link with self-driving cars. We survey the different paradigms of that field (centralized and distributed) and the existing solutions. Finally, we conclude by giving an overview of the various large-scale experiments that have been carried out until now and discuss the remaining challenges and future orientations.

Index Terms—SLAM, localization, mapping, autonomous vehicle, drift, place recognition, multi-vehicle, survey.

All authors are with Institut VEDECOM, Versailles, France, e-mail: [email protected]. Zayed Alsayed is also with Inria Paris-Rocquencourt. Li Yu is also with Mines ParisTech. Sébastien Glaser is also with IFSTTAR.

2379-8858 (c) 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

Simultaneous Localization And Mapping (SLAM) has been a hugely popular topic among the mobile robotics community for more than 25 years now. The success of this field is tightly bound to the fact that “solving” the SLAM problem, that is localizing a robot thanks to a map of the surroundings built incrementally, has numerous applications ranging from spatial exploration to autonomous driving. The recent spotlight put on intelligent vehicles has pushed the research effort further, with the contribution of car manufacturers.

One could think of GNSS (Global Navigation Satellite System) as a solution to this localization problem, but it was quickly shown not to be sufficient by itself. Even though the accuracy limits of classical GNSS solutions are lifted when perfectly positioned base stations are used (Real-Time Kinematic GNSS), availability remains an issue. Satellite signals are affected by atmospheric conditions that are difficult to predict. Moreover, the infrastructure can block the direct reception of signals and generate multipath interference or non-line-of-sight reception, which has disastrous consequences on the provided localization. This kind of signal degradation is difficult to detect and usually causes a loss of integrity from which recovery can be tricky. These problems are more common in dense urban areas where tall buildings can mask satellites. On open roads, GNSS usually performs better.

Another classic approach to localization is to take advantage of the road infrastructure (road markings or roadway detection) in order to guide a vehicle in a lane. Advanced Driver-Assistance Systems (ADAS) already integrate lane marking detection in commercialized cars. While this kind of approach mostly constrains the lateral position of a vehicle, it is sufficient for environments where the roadway is easily identifiable, such as highways. On the other hand, more complex environments (mostly urban, with intersections, curved roads, etc.) do not always provide enough road information to localize a vehicle. Moreover, the position accuracy needed along the longitudinal axis is more important than in straight, highway-like environments. In any case, redundancy is needed in order to build a safe system and ensure a consistent behavior on the road. As such, different localization means should be considered.

In general, localizing a vehicle, be it in a global or a local frame, is an essential functionality for performing any other perception or planning task. Predicting the evolution of other obstacles on the road and choosing the most appropriate maneuver require knowing exactly where the ego-vehicle is located and how it will evolve in the coming seconds. The SLAM framework gives an answer to this problem while still being general enough to allow the use of any sensor or estimation technique that suits the prerequisite of estimating both the localization of the vehicle and the map at the same time. The map is of prime interest when autonomous driving is considered as a whole, as it offers a first level of perception that is needed in order to make appropriate decisions.

The SLAM problem is considered as one of the keys towards truly autonomous robots, and as such is an essential aspect of self-driving cars. However, many issues are still preventing the use of SLAM algorithms with vehicles that should be able to drive for hundreds of kilometers in very different conditions. This last statement encompasses the two main problems arising when dealing with SLAM for autonomous vehicles: localization tends to drift over time and maps are not necessarily viable in every driving condition. The former problem is well-known in the SLAM community. The local and incremental position estimate given by SLAM algorithms tends to diverge from the true trajectory as the traveled distance increases. Without prior knowledge or absolute information, it thus becomes almost impossible to ensure a proper localization for several kilometers. This leads to the second problem, namely having maps that are sufficient for the localization task no matter the conditions. The mapping aspect has gained a lot of attention lately with the objective
of providing the necessary information to locate a vehicle in different seasons, weather or traffic conditions. Many solutions have been envisaged to solve these two problems, like building a map with a careful selection of distinctive information with the objective of reusing it later, or taking advantage of new communication systems in order to share and enhance the maps built by other road users, for instance.

SLAM is a broad field and involves many topics ranging from sensor extraction to state estimation. In this article, we propose a survey focusing on the current trends in the SLAM community regarding the emergence of the autonomous vehicle market. To be clear, throughout this paper, we will refer to SLAM as approaches that are composed of at least an odometry and a mapping module in order to cover all the techniques that are of interest for autonomous driving. We will first give a general introduction to SLAM by reviewing the most commonly used estimation techniques, discuss the different existing benchmarks and data sets and point to the relevant surveys covering aspects that are not reviewed in this article. This will be the object of Section II. Then, Section III will concern the limits of classical methods for autonomous driving and more especially the impact of the drift. The problem will first be stated and we will then focus on the solutions proposed to avoid or correct this drift and on the general reliability of the exploited information. We will end this section by discussing the criteria that should be respected for a SLAM approach to be viable for autonomous driving. Section IV will concern the techniques that tackle some of the challenges of SLAM for autonomous cars, namely building and exploiting long-term maps. A survey of the methods that take advantage of previous knowledge, coming from the SLAM algorithm itself or from another resource (Geographic Information Systems for instance), will be proposed as it is a key aspect to achieve true autonomy for self-driving cars. Section V will give an overview of multi-vehicle SLAM systems. This field offers a solution to both the problems mentioned earlier: constraining drift and enhancing maps. The different design choices of such applications will be exposed and motivated with the related state of the art. Finally, Section VI will expose the large-scale experiments that have been carried out so far with relation to the localization means used. It will serve as a basis to discuss the future orientations and the remaining challenges in Section VII.

SLAM methods require a wide panel of algorithms to ensure the robustness of the provided localization. As such, sensor data extraction, primitive search [9], data association [10] or map storage and updates [11] are part of the topics which concern SLAM as well. However, in this literature review, we will first focus on the main estimation methods that exist before going through the current trends in the autonomous vehicle application field.

The SLAM problem is usually formalized in a probabilistic fashion. The whole objective is to be able to estimate at the same time the state of the vehicle and the map being built. The vehicle state can be defined differently depending on the application: 2D position and orientation, 6D pose, speed, acceleration, etc. We denote $x_k$ the vehicle pose estimate at time $k$ and $m$ the map of the environment. To estimate these variables, it is possible to take advantage of what we call control inputs $u_k$, which represent an estimate of the motion between $k-1$ and $k$. They usually come from wheel encoders or any sensor able to give a first idea of the displacement. The particularity of SLAM approaches is to take into consideration measurements coming from sensor readings, denoted $z_k$. They help to build and improve the map and, indirectly, to estimate the vehicle pose.

The SLAM problem can be formulated in two ways. In the first one, the goal is to estimate the whole trajectory of the vehicle and the map given all the control inputs and all the measurements. A graphical representation can be seen in Figure 1. This problem, known as full SLAM, computes the joint posterior over all the poses and the map based on the entirety of sensor data:

$$\mathrm{bel}(x_{0:k}, m) = p(x_{0:k}, m \mid z_{0:k}, u_{0:k}) \qquad (1)$$
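The probabilistic formulation of SLAM, with a pose $x_k$, a control input $u_k$ and a measurement $z_k$, can be made concrete with a toy recursive estimator. The sketch below is only an illustration under strong assumptions: the map is known and fixed (so it performs localization rather than full SLAM), the pose is a cell index on a circular 1D grid, and the noise values, the names `predict` and `update`, and the toy map are all hypothetical.

```python
import numpy as np

def predict(belief, shift, p_exact=0.8):
    """Prediction step on a circular 1D grid.

    The control input `shift` (in cells) plays the role of u_k. The motion
    noise model (mass leaking to the two neighbouring cells) is an assumption.
    """
    p_off = (1.0 - p_exact) / 2.0
    return (p_exact * np.roll(belief, shift)
            + p_off * np.roll(belief, shift - 1)
            + p_off * np.roll(belief, shift + 1))

def update(belief, likelihood):
    """Correction step: weight each cell by p(z_k | x_k, m), then normalize."""
    posterior = belief * likelihood
    return posterior / posterior.sum()

# Toy map m: 1 marks a cell where a landmark is visible (hypothetical data).
landmark_map = np.array([0, 1, 0, 0, 0])
belief = np.array([1.0, 0.0, 0.0, 0.0, 0.0])  # vehicle known to start in cell 0

belief = predict(belief, shift=1)             # u_k: move one cell forward
z = 1                                         # z_k: a landmark is observed
likelihood = np.where(landmark_map == z, 0.9, 0.1)
belief = update(belief, likelihood)
```

After one prediction and one correction, most of the probability mass sits on the cell that is consistent with both the commanded motion and the observation. A SLAM estimator would additionally carry the map in the estimated state.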
$$\mathrm{bel}(x_k, m) = p(x_k, m \mid z_{0:k}, u_{0:k}) \propto p(z_k \mid x_k, m) \int_{x_{k-1}} p(x_k \mid x_{k-1}, u_k)\, \mathrm{bel}(x_{k-1}, m)\, dx_{k-1} \qquad (2)$$

Fig. 2: Graphical representation of the online SLAM problem at time k + 2

The estimation techniques can be separated into two main categories: filter-based approaches and optimization-based methods. The former correspond to iterative processes that are thus suited to online SLAM. The latter group methods performing batch processing and, as such, are usually applied to solve the full SLAM problem, even if this trend has changed during the last ten years.

A. Filter-based SLAM

Filter-based methods derive from Bayesian filtering and work as two-step iterative processes. In the first step, a prediction of the vehicle and the map states is made using an evolution model and the control inputs $u_k$. In the second step, the current observation, $z_k$, coming from sensor data, is matched against the map in order to provide a correction of the previously predicted state. The model that relates the observation to the map is called an observation model. These two steps iterate and so incrementally integrate sensor data to estimate the vehicle pose and the map.

1) Extended Kalman Filter: The first branch of the filter-based methods concerns derivatives of the Kalman Filter (KF) [12]. KFs assume that data are affected by Gaussian noise, which is not necessarily true in our case. In their basic form, KFs are designed to handle linear systems and while they have great convergence properties [13][14], they are rarely used for SLAM. On the other hand, the Extended Kalman Filter (EKF) [15] is a common tool in non-linear filtering and as such in SLAM. The EKF adds a linearization step for non-linear models. The linearization is performed around the current estimate by a first order Taylor expansion. The optimality of the EKF has been demonstrated as long as the linearization is made around the true value of the state vector. In practice, it is the value to estimate and is thus not available. This can cause consistency issues: the true value can be outside of the estimated uncertainty [16][17].

However, estimates are most of the time sufficiently close to the truth to allow the use of the EKF. Sensors that provide range information, like laser scanners for instance, are particularly well suited [18][19]. Sonars were among the first EKF SLAM approaches for underwater applications [20][21]. Both sensors have been combined in an EKF SLAM approach in [22]. A coupling of vision and laser has also been proposed in [23]. Monocular approaches have also largely been studied. In [24], the landmarks composing the map are inserted in the EKF only when sufficiently accurate. In [25], a specific landmark parametrization is proposed. In [26], the authors studied the impact of the Kalman gain on an update in order to constrain linearization errors.

The continuously growing map size makes the EKF unable to support large-scale SLAM, as the update time depends quadratically on the size of the state vector. To overcome this issue, the notion of submaps has been created. Each time a map becomes too large (various criteria can be used to decide so), a new blank map replaces the old one. A higher-level map keeps track of the links between the submaps so as not to lose information. Among the first submap-based approaches, we can cite [27] with the Constrained Relative Submap Filter, where submaps are decorrelated from one another but where loop closure (drift correction based on the recognition of a previously visited place) is difficult to perform. The constant-time SLAM [28] and the Network Coupled Features Maps [29] work in a similar fashion, except that landmarks common between submaps are kept to ease the change from one to another, but these methods ignore correlated data. The Atlas framework [30] takes advantage of a graph structure where nodes are submaps and edges the transformations between two submaps. Loop closures can only be applied offline. Estrada et al. solve this problem by maintaining two high-level maps [31] but still use landmarks in multiple submaps. Conditionally independent submaps were proposed in [32] as a solution to this issue. The idea is to marginalize the information that is not common to two submaps in order to make them independent given the common part. A different approach is chosen in [33]. A divide and conquer method is proposed to join the local maps created so as to recover an exact global map. New criteria to decide when to create submaps have also been proposed, like the number of simultaneously observable landmarks in [34] or the correlation between landmarks in [35]. An alternative to submaps, the Compressed EKF SLAM, has been presented in [19]. In this work, the state vector is divided into an active (and updated) part and another one which is compressed into a light auxiliary coefficient matrix.

2) Unscented Kalman Filter: To compensate for the weaknesses of the EKF with highly non-linear systems, Julier et al. introduced the Unscented Kalman Filter (UKF) [36], which avoids the computation of the Jacobians. The idea is to sample particles, called sigma points, which are weighted around the expected value thanks to a likelihood function. These sigma points are then passed to the non-linear function and the estimate is recomputed. The major drawback of this method is its computational time. Most of the works using the UKF
took place at the beginning of the 2000s [37][38]. A real-time application to a monocular context has been demonstrated in [39].

3) Information Filter: Another variant of the Kalman Filter is the Information Filter (IF) [40], which is the inverse form of the Kalman Filter. Its particularity is to define the information matrix as the inverse of the covariance matrix. One main advantage is that the update step becomes additive and is not dependent on the order in which the observations are integrated [41]. It is also possible to make the information matrix sparser by breaking the weak links between data [42], which ensures a near constant-time update [43]. The IF is not as popular as the EKF in mono-vehicle SLAM, despite some applications in [44][45][46], because it is necessary to convert every measurement to its inverse form, which can be costly. However, the IF has been more exploited for multi-vehicle SLAM (see Section V).

4) Particle Filter: The second major branch in filtering SLAM algorithms is based on Particle Filters (PF). Their principle is the following: the state is sampled with a set of particles according to its probability density. Then, as with every filter, a prediction of the displacement of each particle is accomplished and an update, depending on the observation, is performed. In the update phase, particles are weighted according to their likelihood given the measurements. The most likely particles are kept, the others are eliminated and new ones are generated [3]. The direct application of this method to SLAM is intractable because it requires a set of particles per landmark. Variations of PFs have then appeared, like the Distributed Particle approaches DP-SLAM [47] and DP-SLAM 2.0 [48], which proposed to use a minimal ancestry tree as a data storage structure. It enables fast updates by guiding the PF while reducing the number of iterations of the latter. However, the most famous PF algorithm is FastSLAM [49], which has been greatly influenced by previous works [50][51] on the subject. Each landmark is estimated with an EKF and particles are only used for the trajectory. FastSLAM has been applied in real-time in [52]. The major advantage of PFs is that they do not require a Gaussian noise assumption and can accommodate any distribution. Nevertheless, PFs also suffer from long-term inconsistency [53]. In [54], this problem was tackled by combining FastSLAM with an IF, but the result is computationally heavy. In [55], FastSLAM was applied to laser data where the matching gives an odometry measure which is then used to weight particles in the resampling phase. Still regarding FastSLAM, its Bayesian foundations were extended to the Transferable Belief Model framework (TBM) in [56], thus allowing the representation of conflict in the employed grid map.

Filter-based approaches now tend to rely on 3D points when it comes to vision sensors [24] and 2D occupancy grids with laser data. The latter, introduced in [57] and [58], are particularly suited to SLAM as the discretization of the space due to the grid itself allows for a finite number of position candidates to test during the update step. Landmark uncertainties are represented by the occupancy probability of the cell, which makes updates on map parts possible. During the update step, the classical approach is to maximize the similarity between the measurement and the map, like in [59] in the Bayes formalism or in [60] in the TBM context. Not mentioned yet, the use of RADARs for SLAM has been demonstrated with filter-based approaches in [61][62][63]. However, their use remains limited in large-scale experiments due to the noisy nature of the signal. They are usually employed for obstacle detection.

B. Optimization-based SLAM

Optimization-oriented SLAM approaches generally consist of two subsystems, as in filter-based SLAM. The first one identifies the constraints of the problem based on sensor data by finding a correspondence between new observations and the map. The second subsystem computes or refines the vehicle pose (and past poses) and the map given the constraints so as to have a coherent whole. As for filters, we can divide these methods into two main branches: bundle adjustment and graph SLAM.

1) Bundle Adjustment: Bundle adjustment is a vision technique that jointly optimizes a 3D structure and the camera parameters (pose). Most of the early works focused on 3D reconstruction [64] but it has since then been applied to SLAM. The main idea is to optimize, usually using the Levenberg-Marquardt algorithm [65], an objective function. The latter minimizes the reprojection error (distance between observations in the image and reprojected past features), giving the best camera and landmark positions. However, the core bundle adjustment algorithm can be computationally heavy as it considers all the variables at once to optimize over.

Since then, many approaches have been proposed to perform local optimizations. In [66][67], a hierarchical method working on smaller chunks is presented. The partial 3D models obtained are then merged in a hierarchical fashion. To reduce the complexity, two virtual key frames are chosen or created among all the frames to represent a given portion. This lessens the number of variables to optimize. A similar method has been applied to autonomous driving in [68] with accurate localization results but an offline map building.

In [69], an incremental method optimizing only over the new information is proposed. In the worst case, it is equivalent to a full bundle adjustment. In [70], the authors use a sliding window over key frame triplets to locally minimize the reprojection errors. Points common between two views are considered in the optimization phase. The principle of a triplet of images is common in the bundle adjustment community. As with every optimization technique, bundle adjustment works well when a good rough estimate is given. In that sense it is important to filter outliers. Nistér et al. [71] proposed a selection method based on a preemptive RANSAC [72]. Even though it allows for a real-time application (13 Hz), results are less accurate than in [68]. A local bundle adjustment has been proposed in [11]. The objective is to locally optimize the last n camera positions based on the 2D reprojections of the points in the last N frames (N ≥ n). These parameters can be tuned to obtain a real-time approach with good results. An integration of inertial measures has been proposed in [73]. A bi-objective function (vision and inertial objective functions) is weighted with coefficients set thanks to a machine learning approach.
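The reprojection error at the heart of bundle adjustment can be written down in a few lines. The sketch below assumes an ideal pinhole camera without distortion and evaluates the residuals for a single camera only. The function names (`project`, `reprojection_residuals`, `reprojection_cost`) and the parameter layout are hypothetical and are not taken from any of the cited works.

```python
import numpy as np

def project(K, R, t, points_world):
    """Project Nx3 world points to pixel coordinates with a pinhole camera.

    K is the 3x3 intrinsic matrix; R (3x3) and t (3,) map world
    coordinates into the camera frame.
    """
    points_cam = points_world @ R.T + t
    uvw = points_cam @ K.T
    return uvw[:, :2] / uvw[:, 2:3]

def reprojection_residuals(K, R, t, points_world, observed_px):
    """Stacked pixel differences between predicted and observed features."""
    return (project(K, R, t, points_world) - observed_px).ravel()

def reprojection_cost(K, R, t, points_world, observed_px):
    """Sum of squared reprojection errors, the quantity to be minimized."""
    r = reprojection_residuals(K, R, t, points_world, observed_px)
    return float(r @ r)
```

In an actual bundle adjustment, residuals of this kind would be stacked over all cameras and landmarks and handed to a non-linear least-squares solver, with poses and 3D points as the unknowns. A generic routine such as `scipy.optimize.least_squares` could play that role in a prototype.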
Another alternative is to optimize only a set of frames in order to solve loop closing problems. This is the idea presented in [74] where the authors keep skeleton frames and marginalize out most of the features and frames in the process. This method is very close to graph SLAM techniques.

2) Graph SLAM: The graphical representation of Bayesian SLAM (see Figure 2) is particularly well adapted to a resolution via optimization methods. An example, coming from [75], is shown in Figure 3. Based on this graphical representation, a matrix describing the relationships between landmarks and vehicle poses can easily be built and used in an optimization framework.

[...] This parametrization takes the form of a tree structure that defines and updates local regions at each iteration. A different idea is to consider not a Euclidean space but a manifold. It is the basis of the algorithm HOG-MAN [77] where a hierarchical optimization on the manifold is proposed. The lowest level represents the original data while the highest levels capture the structural information of the environment. In g2o [78], a similar representation is adopted. g2o uses the structure of the Hessian matrix to reduce the complexity of the system to optimize. COP-SLAM [79] optimizes a pose graph. The latter considers the displacement and the associated uncertainty to build a chain of poses. In a different approach, TreeMap [80] exploits a tree structure for the map and makes topological groups of landmarks so as to make the information matrix sparser and so quicken the processing (O(log n) for n landmarks). Even though not exploiting a graph structure, iSAM (incremental Smoothing And Mapping) [81][82] simplifies the information matrix to speed up the underlying optimization. Here, the objective is the QR factorization of this sparse information matrix.
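To illustrate how a graph of poses and constraints turns into a linear system built on an information matrix, here is a minimal sketch of a one-dimensional pose graph with a single loop closure. It is not the method of any of the cited systems: the function name `solve_pose_graph_1d`, the constraint format and the weights are assumptions made for illustration, and the matrix is stored densely for brevity whereas the approaches above rely on its sparsity.

```python
import numpy as np

def solve_pose_graph_1d(num_poses, constraints, anchor_weight=1e6):
    """Least-squares estimate of 1D poses from relative constraints.

    constraints: iterable of (i, j, offset, weight), meaning that
    x[j] - x[i] should be close to `offset`, with `weight` acting as
    the inverse variance of that measurement.
    """
    H = np.zeros((num_poses, num_poses))  # information matrix
    b = np.zeros(num_poses)               # information vector
    for i, j, offset, weight in constraints:
        # The residual x[j] - x[i] - offset has Jacobian -1 at i and +1 at j.
        H[i, i] += weight
        H[j, j] += weight
        H[i, j] -= weight
        H[j, i] -= weight
        b[i] -= weight * offset
        b[j] += weight * offset
    H[0, 0] += anchor_weight  # pin pose 0 at the origin to remove the free offset
    return np.linalg.solve(H, b)

# Three odometry steps of 1.0 each, plus a loop closure stating that
# pose 3 is only 2.7 away from pose 0 (hypothetical values).
constraints = [
    (0, 1, 1.0, 1.0),
    (1, 2, 1.0, 1.0),
    (2, 3, 1.0, 1.0),
    (0, 3, 2.7, 1.0),
]
poses = solve_pose_graph_1d(4, constraints)
```

The loop closure disagrees with the accumulated odometry by 0.3, and the solution spreads that discrepancy over the whole chain instead of leaving it on the last pose. Each constraint only touches four entries of the matrix, which is why such systems stay sparse as the graph grows.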
The comparison of filter-based techniques and optimization
approaches for a SLAM application is difficult as they are
usually considered in different scopes. This comparison effort
has been proposed in [83] and then extended in [84] for
monocular approaches. The outcome is that optimization tends
to give better results than filters which are more subject
to linearization issues. However, the authors conclude that
“filter-based SLAM frameworks might be beneficial if a small
processing budget is available, but that BA optimization is
superior elsewhere that with limited resources” [83].
[...] [92], present the challenges of Visual SLAM for driverless cars: building long-life maps, how to share maps between vehicles and the necessity to work on high-level features to ease recognition. A very complete survey on visual place recognition has also been published lately [93]. The authors [...] again, we will insist on the key topics with regards to self-driving cars and the experimental results of current state-of-the-art approaches. Finally, in [94], the authors review [...] proposed by [98]. [...] mitigated.

[Table of data sets; legible column headers: Year, GPS, IMU, Images, Path length. The cell values could not be recovered from the extraction; one legible entry is "The Marulan data sets [102]".]

[...] AUTONOMOUS DRIVING

A. Problem identification
it is not the case in the global map that keeps track of how the submaps are connected. It is thus more of a local solution to the problem.

In a similar fashion, robot-centered approaches greatly reduce the divergence [115]. Instead of having a fixed world frame, the estimates are always given with relation to the vehicle position. Inconsistency problems are less frequent because they do not accumulate as is usually the case. The estimation in this frame allows for a better consistency than in classical EKF SLAM [116]. Nevertheless, as with submaps, the divergence is not entirely resolved as landmarks can diverge with only several measures. A few SLAM approaches have used a robot-centered frame with success [117][118]. Landmarks are forgotten as soon as they are not visible, which relates these methods to visual odometry and is as such totally appropriate.

Other practical causes can create drift and inconsistency, such as the presence of mobile obstacles when a static world is expected. This problem has been tackled in SLAMMOT (Mobile Object Tracking) or SLAM-DATMO (Detection And Tracking of Moving Objects) approaches [119][120][121]. The idea is to take advantage of the map building process to directly detect and track obstacles by analyzing observations which are not coherent with the vehicle displacement. Alternatively, credibilist approaches [60], which represent ambiguous information, can indirectly deal with dynamic obstacles. They do not detect or track these obstacles but instead treat them as conflicts, which allows the algorithm to assign a very low weight to these observations when estimating the vehicle displacement.

More generally, the quality and number of selected landmarks in the SLAM process have a clear impact on how the system behaves. Ambiguities can cause matching errors and so the inconsistency of the produced estimates. Related to this issue, Fault Detection and Isolation (FDI) systems propose to exploit information redundancy to measure the coherence of different sources (sensors, models, etc.) and detect errors or failures of one of the sources. By doing so, it becomes possible to know if the system is in a situation prone to errors and so reject spurious measures to avoid drift. In [122], the authors propose to analyze the residuals of a bank of Kalman Filters using thresholds to determine if one filter is diverging. A similar approach is described in [123] but the decision part is given to a Neural Network. Sundvall and Jensfelt, in [124],

sensors to compensate each other. According to Luo et al. [128], three levels of information fusion can be identified: low-level with raw data, mid-level with features and high-level with objects. In [129], the authors propose to couple laser with camera data. The fusion is done at a landmark level (laser estimation followed by camera refinement). In [130], camera, laser and GPS are fused based on information coherency. Already discussed before, the method described in [105] combines vision and 3D laser data to produce a low-drift algorithm. However, even if coupling information can benefit the accuracy and consistency of an algorithm, it does not ensure that it will not drift. It only partially corrects the drift induced by one sensor.

Even if these methods have been proven to reduce or partially avoid the drift, they do not allow for a drift-free estimation during long periods of time. Correcting the drift in a reliable manner requires that constraints about where a vehicle is with relation to a previously known, local or absolute, reference be regularly taken into account.

C. Evaluation criteria for SLAM in autonomous driving

While the previously described approaches can be successfully applied for autonomous indoor mobile robots with a dedicated exploration scheme, they are not sufficient for the large-scale environments of the autonomous driving setting. It means that it is necessary to rely on previous knowledge (absolute or local information) or to be able to improve the built map over time until it is accurate enough (loop closing, for instance). As such, the mapping aspect is central in self-driving vehicles and raises important challenges about how to build or use compact, relevant, reliable and evolving maps.

We have identified 6 criteria that we think must be fulfilled by a SLAM approach to be viable for autonomous driving. They are described below:
• Accuracy: refers to how accurate the vehicle localization is, be it in world coordinates or with relation to an existing map. Ideally, the accuracy should be below a threshold (usually around 20 centimeters [109]) at all times. Of course, in straight lines, the longitudinal localization can be less accurate without major consequences.
• Scalability: refers to the capacity of the vehicle to handle large-scale autonomous driving. The SLAM algorithm should be able to work in constant time and with a
add a coherence measure between different estimations. In constant memory usage. It implies the use of a map
[125], Fault Detection is applied to GPS using the Normalized manager to load data when needed. The second aspect is
Innovation Squared (NIS) test. Neural Networks are used in that the map built and/or used has to be light (or stored
[126] to detect sensor failures by making measure predictions on a distant server and retrieved on the fly) so as for the
using approximators. In [127], the authors propose a visual approach to be easily generalizable over long distances.
odometry where photometric camera calibration is taken into • Availability: refers to the immediate possibility for a
account in the minimization step in order to reduce the drift SLAM algorithm to provide an adequate localization
induced by effects like lens attenuation, gamma correction, that could be used right away for autonomous driving if
etc. The quality of the observations is increased compared to sufficiently accurate. In other words, no first passage of
simpler camera model. the algorithm is needed to build a map of the environment
An aspect that has often been considered to counter drift in and it implies that the approach is able to leverage
SLAM algorithms is the use and fusion of multiple sources. existing map resources (or integrate absolute information
Conversely to the previously described FDI methods, the more generally). This criterion is particularly important
idea is not to reject spurious measurements but for various not to restrain where a self-driving vehicle can operate
(a world-wide mapping process is costly and requires dedicated means).
• Recovery: refers to the ability to localize the vehicle inside a large-scale map. Initially, the vehicle does not know where it is and a dedicated process is, most of the time, needed to have a rough idea of its position in a map. It is also a way to recover from a failure (kidnapped robot problem).
• Updatability: refers to the identification of permanent changes between the map and the current observation. It also includes the update policy that is needed to integrate these lasting changes and not the temporary ones. Long-term autonomous driving requires the automation of map updates.
• Dynamicity: refers to how the SLAM approach is able to handle dynamic environments and changes. This includes dynamic obstacles that can distort the localization estimation. It also covers weather conditions that can vary as well as seasonal changes (trees losing leaves, etc.). One of the challenges of long-term SLAM is to find sufficiently discriminative features or methods in order to be robust against those changes.

We will now go through existing methods for both single and multi-vehicle SLAM and compare them with regard to the previously defined criteria so as to have an overview of the maturity of SLAM approaches for autonomous driving.

IV. SINGLE-VEHICLE SLAM

All the challenges that arise from the criteria detailed above point to the necessity of building better maps, and thus to environment representation and recognition at both a metric and a topological level. To cover this broad topic, we divide the works into three categories which reflect how the SLAM community usually considers their contributions. The first one concerns loop closure (recognizing a previously mapped place), which is an essential part of SLAM as it allows correcting maps and making them coherent. An interesting aspect is that these algorithms are usually applicable to relocalize a vehicle inside its environment and so provide an answer to the recovery criterion. The second category deals with the localization inside a previously built map. As previously discussed, it is a direct way to constrain the drift. Theoretically, every SLAM approach could reuse its map but we will focus here on the articles that explicitly do so and on those which address the long-term challenges of reusing maps. A third part focuses on localization approaches that leverage existing data so as to avoid the first passage of a specifically equipped vehicle. For each part of this section, we will provide a synthesis of the discussed approaches, how they respond to the previously established criteria and briefly discuss the remaining challenges.

A. Relocalization and loop closure

Recognizing a previously mapped place, and thus reducing the drift induced by SLAM algorithms, is considered an essential part of SLAM. The main difficulty comes from the fact that the estimation process is inconsistent and cannot be trusted to find those loops. It means that, most of the time, this problem is solved by a dedicated algorithm that runs at all times, independently from the estimation process. This dedicated algorithm is also often used to relocalize the vehicle inside a previously built map (kidnapped robot problem), which is of prime interest when considering autonomous vehicles.

Most of the approaches use cameras to find loops because of the richness of the visual information. Williams et al. [131] have identified 3 categories into which these algorithms can be separated:
• Image-to-image methods [132]
• Map-to-map methods [133]
• Image-to-map methods [134]

1) Image-to-image methods: In image-to-image methods, the loop detection takes place in the sensor space. Bag-of-words approaches based on visual clues [132][135] belong to this category. The idea is to build a dictionary where each word represents similar descriptors. SIFT [136] is often used to find descriptors and compare them due to its good robustness to scale, rotation or viewpoint changes in images. Many visual descriptors can be considered in a redundant fashion (SURF, CenSurE, etc.) so as to have the largest database possible and a representative dictionary. Once the latter is built, it can be used to check if some images can be described with the same words and thus be qualified as representing the same place.

Algorithms like FAB-MAP (Fast Appearance-Based Mapping) [99] (and its extension FAB-MAP 2.0 [137]) and PIRF-Nav (Position Invariant Robust Feature Navigation) [138] are two examples of SLAM solutions based on landmark appearance for loop closures. In the first one, information having a strong dependency is extracted so as to avoid false recognitions which often affect loop closing algorithms. The approach has been evaluated on a 1000-km outdoor data set and shows that, using a dictionary built offline, the algorithm can provide proper topological loop closures and make maps more coherent (the gain in accuracy is not evaluated). In [138], the objective is the same but features that are robust against dynamic changes are favored (PIRF). A major difference with FAB-MAP is that the whole process is online and incremental. However, it does not scale well (quadratic complexity for image retrieval) and thus has not been applied to vehicles, contrary to FAB-MAP.

Still to reduce the number of false positives, SeqSLAM [139] proposes to be less discriminative on single images but to analyze sequences in order to ensure a proper matching. Results on 22-km and 8-km data sets demonstrate the ability of SeqSLAM to handle day/night matching in real time (even though the computing time scales linearly with the size of the data set) with different weather conditions under which FAB-MAP is unable to properly operate. SeqSLAM is demonstrated as a pure place recognition approach and does not apply loop closure to correct previous measurements even though it is possible. SeqSLAM has even been extended to integrate motion estimation between matches by using a graph structure representing the potential roads and a particle filter to maintain the consistency of the positioning [140]. This approach, SMART PF, shows better results than SeqSLAM. Another interesting approach is shown in [141] where the
algorithm is tested to find loop closures across different seasons. To do so, the problem is also solved by using image sequences. A flow network is built (a directed graph with start and end-of-traversal nodes) and is used to formulate the image matching as a minimum cost flow problem. On the proposed data set (summer/winter matching), the approach outperforms SeqSLAM. It requires a few seconds to process each image. An evolution of this method has been presented in [142]. The authors replaced the HOG features used to represent images with a global image feature map from a Deep Convolutional Neural Network. They are able to obtain better performance. The found loops are taken into account in a graph-based approach (g2o [78]) with odometry constraints without giving insights about the reached accuracy. The approach runs on a GPU and is thus faster than the previous implementation (real time with a data set of 48,000 images) even though its speed still depends on the data set size.

2) Map-to-map methods: Map-to-map methods are solely based on information contained in the map. It is the case of the GCBB algorithm (Geometric Constraint Branch and Bound) proposed in [143]. The principle is to define geometric constraints between landmark pairs in the map and the current observation. An association tree is established so as to find the most likely association hypothesis that respects the geometric organization of the landmarks in the map. It has been applied successfully in [133] with a hierarchical approach in outdoor experiments even if the accuracy of the resulting map is not measured.

Li et al., in [144] and [145], proposed to merge occupancy grids by using a genetic algorithm to find the best alignment possible. A low-cost GPS is used to constrain the search space to smaller regions. Similarly to the previous example, the consistency of the map is drastically improved but its accuracy is not measured. In the case that the drift can be approximately estimated, the JCBB algorithm (Joint Compatibility Branch and Bound) [146] selects compatible landmarks according to their uncertainty.

Map-to-map methods are difficult to apply without any hint of where to look for loops as maps are either too sparse to be sufficiently distinctive (monocular SLAM for instance) or too complex for a complete exploration in real time. In such cases, the use of a GPS, like in [145], makes sense as it allows the system to considerably reduce the exploration space.

3) Image-to-map methods: The last category, called image-to-map (or more generally sensor-to-map), extracts information from the sensor space and compares it directly with the map. The approach described in [134] is based on classifiers trained on the detected landmarks. The loop closing stage uses these classifiers on the image to check if there is a match. Then, a RANSAC algorithm takes the corresponding 3D positions to compute the new pose of the vehicle. Results with a single camera show that the map becomes coherent in an outdoor scenario.

Another possibility to close the loop or relocalize a vehicle is to use hierarchical techniques to speed up the processing. A low-resolution map first serves to identify a rough positioning of the vehicle which is then refined as the map resolution increases. These methods are particularly suited to occupancy grids and have been applied successfully to automated vehicles with laser sensors in [147]. In this example, a coarse-to-fine approach is used to relocalize the vehicle inside a map and not to perform loop closure. Experiments show that the method works well but that the size of the map is a crucial aspect.

4) Discussion: We have focused on how loop closure is applied in SLAM algorithms. Recent years have seen a clear focus on how to deal with seasonal or weather changes and we refer our readers to the survey [93] for a detailed view of this field, independently of loop closure.

Even if it may seem like image-to-image methods are favored, recent approaches tend to couple different methods to ensure that a place has been properly recognized. As an example, an application of FAB-MAP with a visual odometry can be found in [148]. The authors rely on dense stereo mapping and use FAB-MAP to indicate loop closures. Their metric integration is done within a graph formulation and is calculated using the Iterative Closest Point (ICP) algorithm [149] between the observation and the previously built map. Even if the accuracy is not directly measured, the built map indicates a consistent mapping. In [150], a bag-of-words method is first employed to generate candidates that are then checked using Conditional Random Fields (CRF) in which a constraint ensures a geometric consistency between the landmarks. The whole approach is applied in a visual SLAM framework. In Table II, we give a brief overview of the principal methods that have been described here in relation to the autonomous driving problem.

Based on the established criteria, we can see that these approaches propose a solution to the recovery problem. However, loop closure was initially seen as a way to correct drift. While it produces more coherent maps, the articles cited here do not clearly show the gain in accuracy. In [151], the authors explain that closing the loop can counter the drift inside the loop but that the result will always be overconfident. Another approach was proposed in [152] where errors are redistributed in a probabilistic fashion around the past trajectory when a loop is identified. It avoids the previously mentioned overconfidence problem but does not guarantee the consistency of the localization.

It also explains why the community does not necessarily focus on improving the accuracy but on addressing difficult conditions and building consistent maps. Even if the results are impressive, especially with seasonal changes and sunny/rainy conditions, the approaches cited here, when considered for autonomous driving, can be seen as a better GPS but within a given zone. As such, it gives two main challenges for the future: can these approaches be generalized by using already available information (maps, images, etc.) to avoid the zone restriction? Can these methods be extended and integrated into SLAM frameworks to give very resilient approaches that could be viable for autonomous driving?

B. Localization in a previously built map

The localization of a vehicle in a previously built map is tightly linked to the methods presented in the previous section about loop closing and relocalization. Indeed, the first
necessary action is to globally identify where the vehicle is located in the map. Once this first step is accomplished, more classical data association algorithms can be used. If countering the drift is not entirely possible with loop closure, constraining the localization inside a given map is a viable solution for autonomous driving. However, building a map that scales well, that can be updated or that can work whatever the conditions is not a trivial task. We will go through existing methods that have shown that reusing a map is possible without or with a dedicated map building process.

1) Identical method between first mapping and on-the-fly localization: The logical extension of any SLAM algorithm is to reuse a built map in order to constrain the localization. While it still requires a first passage, the map can be used right away and does not necessitate a dedicated offline processing. Nevertheless, not all methods have demonstrated that they are viable under such circumstances as it involves continuously building or enriching maps.

Map management is widely covered by all the approaches that involve submaps, which allows SLAM algorithms to work with a near-constant time and memory consumption [28]. However, reusing the maps is not necessarily considered in these methods. In [59], PML-SLAM is proposed to tackle these problems (map loading and unloading between RAM and hard drive based on the vehicle position, etc.) using laser scanners and occupancy grids. This approach has been proven to be viable for autonomous driving in [153] in certain areas. The map can also be continuously updated but no dedicated process handles it, meaning that temporary changes will also be integrated.

In [154], the authors describe a long-term mapping process where new measurements improve the map. The approach is based on vision and inertial sensors and uses a pose graph representation to integrate new data. Scalability over known areas is achieved thanks to the reduction of the pose graph. No check is made on whether or not an update corresponds to a permanent change. Already mentioned before, the work of McDonald et al. in [150] uses anchor nodes to link together pose graphs acquired during different sessions. This way, the produced visual SLAM maps can all be taken into account. Similarly to [154], all maps are integrated without distinction.

The authors of [155] propose a monocular approach that uses a low number of landmarks and their visual patches to characterize them. Landmarks are reprojected in the image and matched thanks to their patches with Normalized Cross-Correlation [156]. Even if this approach is very resource-light, no dedicated mechanism is able to handle map portions that are no longer needed. The use of a memory of images is quite common [157][158][159]. The principle is to compare the current image with a reference database stored in memory, which can be costly and a limiting factor for a map. Data association can also be a problem in such cases. In [160], the authors propose data association graphs as a way to model and solve this problem.

Building maps that are able to evolve and follow permanent changes is a challenge that has mainly been tackled in indoor environments. The Dynamic Pose Graph [161] maintains two maps: an active and a dynamic map. The laser scan points are labeled in order to identify mobile obstacles and then both maps can be updated accordingly. The pose graph representation allows the system to remove inactive nodes and keep a more tractable representation. In [162], a visual odometry approach is coupled with a place recognition mechanism in order to stitch maps acquired at different times. Old views are deleted when they are no longer relevant, allowing the system to maintain an up-to-date map even though temporary changes are integrated as well.

A 3-month outdoor experiment is proposed in [163] in which the authors introduce the concept of plastic maps as a way to integrate visual experiences over time. The idea is that the visual odometry tries to relate to a past experience. If none exists, a new one is created. Experiences can be partial along the trajectory in order to only store changes on dedicated portions and not on the whole map. The authors show through experiments that the number of required experiences tends to be bounded over time. The main advantage of this concept is that it allows the detection of sudden and long-term changes in a unified method. Gathering all the needed experiences requires many traversals of the same road, which can only be done at a world level with probe cars.

2) Dedicated map-building process: The increasing capacity of computers, coupled with the fact that a first traversal is needed for autonomous driving with SLAM, has led researchers to focus on how to build the best maps possible for online exploitation.

Place recognition approaches, most of the time, also fall in this category as they require a previous passage to build specific databases that are then exploited. In [164], one SVM per feature is trained across several images. The robustness is ensured by discarding the detectors that are not accurate or unique enough across the neighboring images. These classifiers and their temporal connection can be seen as the map. The authors demonstrate a great reduction in localization failures but do not directly assess the accuracy of the method. The
PoseNet algorithm, in [165], trains a Convolutional Network to associate images with the corresponding position and orientation. The CNN is then applied to locate an image in real time. These approaches do localize the vehicle inside a map but do not ensure continuity in the localization service as with more classical methods.

Many approaches have also chosen to improve maps over time. In [166], the authors consider that, in the future, maps will come from centralized collection services. As such, they propose a system where maps, constructed using vision by optimizing over 2D-3D associations, serve to feed a database. An offline process takes all this information to produce summary maps where all the meaningful data are contained (most seen landmarks). The localization accuracy is under 30 cm in various conditions. The problem of the scalability of such an approach is mentioned but not addressed. In [167], the authors consider an initial metric and semantic map and propose an unsupervised method to make them evolve through time in a parking context. The metric map uses a pose graph relaxation algorithm in order to take into account multiple passages. The semantic part is updated thanks to machine learning techniques. Multiple timescales are maintained in [168] in order to choose the one that best fits the current observation. A first run is performed to obtain an initial map. After that, local maps are maintained online with a short-term memory while an offline update allows the system to build more consistent global maps. Indoor experiments show that the map slowly adapts to long-term changes.

A different approach is proposed in [169] and [170] where spherical images (intensity, depth, saliency and sampling spheres) are built from several images. The main advantage of a sphere is to cover a given area and not only one position. An online registration method based on monocular inputs serves to localize the vehicle. In [171], a two-step process is proposed. During the first phase (teach), a database is built using SURF keypoints and submapping techniques. Then, the repeat phase localizes the robot according to the constructed map in order for it to follow the same path as previously. Similarly, in [172], a visual database is first built offline using a hierarchical bundle adjustment method. Then, in real time, the vehicle localizes itself inside the map. Neither approach allows the system to modify the map once built. In [172] the results show an impressive accuracy (around 2 centimeters with the exact same path). However, the map remains quite heavy (around 100 Mb for a kilometer). Still related to vision, in [173], a geo-referenced visual map approach is proposed. The map is first built offline using SURF features and GPS constraints within a graph-based optimization framework. Online, the GPS is not used and only the map and stereovision are employed. The map memory is one of the problems cited by the authors (500,000 landmarks for 1 km) for an average accuracy of 30 centimeters.

A topometric localization is presented in [174]. During a first passage, a GPS is coupled with vision and range sensors to create a database of compact features (a descriptor per image and a range and standard deviation value between each image). A Bayesian filter is then used for online localization. Extensive experiments (over 4 months) have been carried out to show the resilience of the features to various conditions. The average localization error is around 1 meter. Still in a multi-sensor context, Maximally Stable Extremal Regions (MSER) [175] are extracted from both images and laser grid maps [176]. Coupled with a GPS-RTK, they serve to establish a database during a first passage. A particle filter then keeps track of the vehicle pose online. 2574 landmarks are required for 7 km for an average error below 50 centimeters. Finally, an interesting approach, first proposed in [177] then extended in [178], focuses on building high-quality road surface maps using a 3D laser. A calibration method for reflectivity values is proposed and maps are then computed using a graph approach with inertial and GPS constraints. A histogram filter is used for the online localization inside the map. Results have been demonstrated in autonomous driving for more than 6 kilometers with an accuracy of less than 10 centimeters. A map manager ensures a constant memory usage. 10 MB are required for 1.6 km. The authors indicate that the reliance on the built map could have inappropriate consequences in complicated weather settings. Instead of a laser-built map, Napier and Newman, in [179], construct orthographic images of the road surface from a visual odometry approach. During the online localization phase, a synthetic image, based on the predicted pose, is generated and compared with the map using mutual information so as to refine the localization. The approach does not work in real time at the moment and its accuracy has not been directly assessed.

3) Discussion: An overview of the main approaches discussed in this section and how they fulfill our autonomous driving criteria is presented in Table III.

We can see that almost all approaches propose a recovery system (except [173] where it is not mentioned), which makes sense with regard to the localization in a previously built map. Vision alone, to be sufficient, requires dense representations or a high number of landmarks. It is also worth noting that experimental conditions are not the same between all these approaches and as such one method might not be able to reach a proper accuracy in a different setting. Building long-term maps remains a difficult task and it is not always clear if always updating the map is the right strategy. We also notice that this kind of method has the disadvantage of not limiting the size of the maps (even if this effect is limited), which can be problematic in the long run. Many approaches do not propose a specific mechanism to store partial maps as the scope of the experiments does not necessarily require it. However, the information density needed for most approaches makes a world-wide deployment difficult to envisage at the moment. Direct availability of maps is the main problem but it is worth noting that [166] mentions probe cars (multiple sensor-equipped vehicles) as a future way to reduce this problem. This will be further discussed in Section V.

From this short analysis, it appears that the problem of building life-long maps that can take into account permanent changes as well as seasons and weather in a bounded manner remains a challenge. Regarding accuracy, even though some impressive results are shown, it is difficult to predict how one map representation will work in a different environment. As there is no clear way to evaluate this, more tests in various
conditions are needed. The final, bigger problem is: how can the creation of these maps be
generalized at a world level? At the moment, it is not clear if it will happen, as world-wide raw
sensor data may never be available.

C. Localization in existing maps

Even without world-wide raw sensor data, there exist many different sources already furnishing
large-scale information. In the prospect of facilitating the deployment of autonomous vehicles,
many researchers have proposed approaches to leverage these geographic information sources. In
this part, we will first focus on new (or potentially new) map formats and their applications, as
they could drive how information is collected in the near future. Then, we will discuss
localization methods that already integrate existing widely available data to build prior maps.

1) Building and using future maps: In recent years, researchers have proposed custom map formats
in order to respond to the need for prior knowledge in autonomous driving. Some have considered
the practical challenges of building world-wide maps in an automatic way.

In [180], the authors propose a custom format, Emap (enhanced map), usable for lane-level
localization. Lanes are represented as a series of straight lines, circles or clothoids based on
GNSS and dead reckoning measurements. This map format has been utilized in [181] in a 30-minute
experiment where lane accuracy was achieved (error below 1 meter most of the time). An extension
of the Route Network Definition File format (RNDF), initially specified by DARPA, is discussed in
[182]. This new format, RNDFGraph, overcomes some of the limitations of the original definition by
including lane relationships and lane change information. This is made possible by using a graph
representation. Splines are also generated based on waypoints in order to ensure a smooth
trajectory. This format has been used on German highways for path planning but not directly for
localization purposes. Accurate lane-level map generation is also the objective pursued in [183].
Here, the authors combine line-segment features extracted from a 3D laser and a graph SLAM
approach with an OpenStreetMap map. A particle filter is used to obtain a lane estimation that is
then integrated inside the map. The authors show that they are able to reach an average accuracy
of 5 cm. The map utilization inside a localization algorithm is not proposed in the article. Still
regarding automatic lane-level map generation, Guo et al. in [184] (and extended in [185]) propose
a low-cost approach based on an OSM map, an INS, a GPS and orthographic images generated from a
camera. The idea is that such a system could be used by probe cars and could generalize the
map-building process. First, the INS and GNSS measurements are optimized together to obtain the
vehicle localization. A second optimization using visual odometry is performed. The image is then
aligned according to the local map segment extracted from OSM. The lanes are finally extracted
from the orthographic image. Still centered on a path planning use, lanelets have been proposed in
[186]. The format is proposed with tools to manually create maps from satellite views. This map
representation has been used in [109] but dedicated maps for localization were built as well.
Finally, in [187], a system based on a high-precision GNSS, a 3D laser and a camera pointing
downwards is proposed to build maps. Bird's-eye view images are generated and localized according
to the GNSS. Lane markings and curbs are then extracted from them and are manually reviewed. A
localization application, LaneLoc, is proposed where the map is reprojected in the images (no
laser is used in the localization phase) using the estimated vehicle position. It eases the
subsequent extraction of the lane markings and of the curbs. The localization is able to reach the
map accuracy (around 10 centimeters) most of the time.

All the previously cited methods bring geometric information to, for now, topologic-only maps. As
such, approaches making use of this kind of prior information might be viable in the near future.
A map-aided localization is proposed in [188] that takes advantage of prior knowledge about lanes
and stop lines. A vision system that detects these lanes and compares them with the map is
implemented inside a particle filter that also integrates IMU and GPS measurements. This light
system is able to reach a 50-cm accuracy. In [189], the authors use only road segments and
integrate them inside FastSLAM. The idea of this road-constrained approach is to limit the lateral
deviation as well as angular errors by matching them with the expected value computed from the
map. The latter is built using a Differential GPS. Lane-level accuracy is reached with an average
error of 1.4 m. Similarly, an accurate digital map of the lane markings is built and used in
[190]. Fused with GPS and proprioceptive information, the lane detection allows
the localization to be constrained to 10 centimeters along the lateral axis. However,
longitudinally, the error is around 1 meter. In general, lane approaches are difficult to apply in
more complex settings like intersections and roundabouts where the accuracy is an important
concern.

Instead of lanes, the authors of [191] use a map containing the walls of the surroundings. A
Bayesian network decides the most appropriate wall to detect using a laser scanner in order to
reach a defined accuracy objective in a top-down manner. A 20-cm accuracy is attained using a
precise map of the walls composing the scene. A previously-built pole database serves as a
reference map in [192]. The accuracy depends on the frequency of the poles but is on average
around 1 meter. In [193], the authors exploit the entire pole-like infrastructure. A map is first
built using stereovision and a high-precision DGPS/INS combination. During the localization phase,
stereo matching with the map, along with odometry and GPS, are all integrated inside a particle
filter. The accuracy is not directly measured but the authors claim a lateral accuracy of around
20 centimeters.

2) Leveraging current map resources: An impressive amount of data has already been gathered around
the world (topological maps, panoramas, etc.) and can be used to create specific maps that do not
require a prior passage of a specific vehicle. However, leveraging the available resources at hand
to produce high-quality maps is not easy.

A few approaches are starting to use Google Street View images (see Figure 5) or equivalent. In
[194], an aerial vehicle takes advantage of Street Views to localize itself. Artificial views are
created to overcome the difference in viewpoints and are then compared using the Approximate
Nearest Neighbors on extracted features. The objective is to resolve the place recognition problem
in urban environments with an aerial vehicle. In [195], SIFT descriptors are extracted from Street
View images and indexed in tree structures which are then browsed with a nearest-neighbor
algorithm from the descriptors of the current image. A vote across the candidates is then
performed to choose the closest image. This work has been extended in [196] with a Bayesian
tracking filter in order to ensure the continuity of the localization. Even though the filter
allows for a smoother trajectory, sudden jumps around the position still occur. This city-scale
localization has an accuracy that ranges from more than 1 meter to 12 meters. In [197], visual
bag-of-words methods are employed to build two dictionaries from Street View images using SIFT and
MSER (Maximally Stable Extremal Regions) detectors so as to have both local and regional feature
descriptors. Based on these, the closest Street View from a real image can be recovered. The
relationships between physically close panoramas serve to speed up the matching process. For all
these approaches, the difficult task is to find sufficiently discriminating features to increase
the matching ratio as explained in [198]. As such, these methods share strong relationships with
the new descriptors that have emerged for the urban context [199][200][201]. Another way to
improve the results is to process sequences that can then be matched against the topology of the
environment as proposed in [202]. A better performance is obtained by considering visual words
from a query image across multiple images rather than just one. The authors formulate the problem
as a regression on an image graph. The approach is only evaluated by matching Street Views
together and according to the capacity of the algorithm to retrieve the right image.

Fig. 5: Available information from Google: panorama, depth map and road topology (each circle
indicates a panorama/depth map).

The mentioned approaches solve the place recognition problem inside Street Views. However, they do
not manage to achieve a sufficient accuracy for autonomous driving. This issue is more difficult
due to the sparse nature of the Street View panoramas. In [203], the authors propose to create
synthetic views to improve the continuity of the matching and thus of the localization. The depth
associated with Street Views is used in a 3D-to-2D matching that is injected inside an
optimization framework to minimize the reprojection error. They are able to obtain an accuracy of
approximately 60 cm at best. In [204], the pose estimation is performed by comparing features from
the current image with the best Street Views (highest number of matches with the query image).
Proper correspondences are identified by evaluating the residual of the corresponding
transformation over all the matches. The pose is then estimated between the best two reference
views. The accuracy of the system ranges from less than 2 meters to 16 meters. A method with two
separate modules is described in [205]. In the first one, a visual odometry algorithm is applied
from feature points tracked for short periods of time. In the second one, the transformation
linking these points to Street View panoramas is computed so as to obtain a global localization
with an accuracy below 1.5 m. In an original approach, the authors of [206] propose a text-based
geo-localization method that extracts textual contents with a camera from available shop signs. It
is then compared with an annotated map built from Street View panoramas in order to estimate the
pose of the camera for an average error of around 10 meters.

Other than Street View imagery, in [207], the authors show that it is possible to consider traffic
signs as geo-referenced landmarks coming from existing maps. 3D models of these traffic signs are
matched in images and the position of the vehicle is optimized inside a Bundle Adjustment
approach. Results are below 0.5 m most of the time but depend on the number of traffic signs.
However, the database that is used needs to be quite precise and is built by a mobile mapping
system. With a more traditional traffic sign database, the authors of [208] are able to reach a
5-meter accuracy by combining vision, GPS and an INS inside a Bayes filter. The structure of the
road graph can also serve to roughly localize a vehicle as proposed in [209]. In this paper, a
particle filter, only fed by odometric information, progressively converges toward the true
location of the vehicle by matching the trajectory followed against the map. The result is an
average error of 5.8 m in a large network. The authors of [210] combine a visual odometry system
along with features (lane markings) extracted from satellite imagery. The aerial images are
pre-processed offline using an SVM and a clustering algorithm to identify road markings. The
latter are then used online to constrain the localization. The accuracy of the system is not
measured but the authors show that they are able to align satellite image features with the
current image.

3) Discussion: We provide a synthesis of the main approaches covered here in Table IV with
relation to the established criteria.

The current state of localization methods based on existing maps shows that they are not yet
viable for autonomous driving. The accuracy is not sufficient or requires more precise maps than
the ones currently available. An interesting observation is that even the algorithms that utilize
more precise lane maps (built manually) do not reach the critical accuracy (around 20 cm [109])
for automated driving. Of course, things could evolve in the near future depending on the
information available in the upcoming high-precision maps. Nevertheless, leveraging current
databases is an evolving trend that can be an interesting prospect. However, even if scalability
can theoretically be achieved with these methods, it will require dedicated pre-processing
algorithms to transform data in the right format (for instance, creating bags of words according
to zones). All of the methods depicted in Table IV are able to recover their pose inside the map,
most of the time using a standard GNSS, which might prove more difficult in dense urban
environments.

The main challenge for the localization methods presented in this part is how to improve the
accuracy by building more relevant, accurate maps. One solution might come naturally with new
world-wide sources (new maps, etc.). A key aspect remains the information density of the sources.
Street Views, even though interesting, are, for now, too physically spaced to be viable alone.
This is why generating synthetic data might be a possible answer. Authors tend to focus on a
specific kind of data; finding a framework and an estimation method that take advantage of all the
available resources at once might be an interesting prospect. Knowing in advance which map
elements must be sought should also be considered, so that a vehicle can evaluate beforehand,
without having directly experienced it, whether, with regard to its capacity, a sufficient
accuracy can be reached for autonomous driving. The approaches mentioned in this Section do not
consider the update of these shared resources. While it is not a trivial task to update inaccurate
maps with an approximate vehicle position, it would clearly contribute to progressively improving
these maps until they are viable for autonomous driving.

V. MULTI-VEHICLE SLAM

The complexity of creating or exploiting maps for localization purposes at a world scale indicates
that cooperation between vehicles might be needed in order to improve existing maps or ease the
large-scale collection of data for appropriate map construction. However, doing so is not trivial
as the collaboration of several vehicles has an impact on how to design an effective approach. One
important initial distinction concerns how the cooperation is handled. Two systems exist: the
centralized approaches and the decentralized ones. In the first case, communications are all
directed towards one entity that agglomerates and fuses all data before sending the result back to
the vehicles. Decentralized systems assume that each vehicle is capable of building its own
decentralized map while communicating with the other vehicles of the fleet. It means that
information flows must be controlled to avoid bandwidth explosions and estimation problems which
typically happen in such cases (double counting a measurement, for instance). It is worth noting
that this field is still fairly recent and concerns mostly mobile robotics for now. We will
discuss both systems (centralized and decentralized) and how they can be applied to autonomous
driving.

A. Centralized SLAM

Centralized SLAM may seem like a natural extension of single-vehicle SLAM algorithms where parts
of the computation are offloaded to a distant server. However, there are many ways to share the
SLAM task in a centralized approach depending on the objective. The main distinction that can be
made is whether the centralized part should be running in real time or in an offline manner.

1) Online centralized SLAM: This organization makes the extension of SLAM algorithms quite
immediate. One of the first works on the subject was presented by Fenwick et al. in [211]. An
Extended Kalman filter is utilized to integrate the state vectors (pose and landmarks) of all the
vehicles. The paper covers more the theoretical aspects and only shows simulation results. In
[212], submaps are built individually by robots using cylindrical object features detected by a
laser scanner. All these submaps are fused in a centralized fashion by a nearby server. The
relative locations of the robots must be known. The approach is demonstrated with a very simple
experiment with two robots. The work presented in [213] was the first, to our knowledge, to
propose a multi-robot visual SLAM. Observations and visual descriptors are sent to a central agent
which builds a map shared among all the robots. The initial positions of the vehicles must be
approximately known so as to localize everyone in this common map. The estimation of the different
trajectories within this unique map is performed by a Rao-Blackwellized particle filter. The
approach has only been tested in simulation and the results show that the processing power needed
to compute a proper map and the localization of the vehicles is too high to meet real-time
constraints.

A recent trend has moved this processing from one vehicle or close entity to the cloud in order to
take advantage of the available processing power. One well-known example can be
found in [214]. In this article, the authors present a cloud framework, C2TAM, which shares the
workload between a cloud server and the robot. All the demanding tasks are moved to the cloud and
only the part where a high frequency is required is executed on the robot. The proposed
application performs all the mapping aspects of an RGB-D SLAM on a distant computer and only the
tracking of the pose is done locally. Even if the approach is proposed as a cloud framework, the
experiments use a desktop computer with a wireless connection. They demonstrate the ability of the
system to perform cooperative updates on the map as well as online map merging. In [215], the
DAvinCi architecture is presented. Its objective is to offload all the computations on a cloud
system. The software architecture enables heterogeneous agents to share and upload common data on
the cloud. A grid-based FastSLAM has been adapted to fit the needs of this cloud approach. The
particles responsible for the pose estimation can be run on separate nodes. The experiments
demonstrate a faster execution time for a single-vehicle FastSLAM but the authors do not consider
the delays or latencies induced by these outsourced computations on a real-time approach. Rapyuta
[216] is another framework designed for cloud robotics. Its use is demonstrated with an RGB-D
odometry. Different settings are evaluated: a complete offloading of the computation to the cloud,
a combination where only the mapping process is offloaded and a collaborative mapping by two
robots. Keyframes, sent to the cloud, are compressed in order to reduce the needed bandwidth. In
their experiments, the Amazon cloud service was used. The complete offloading of information
requires too much bandwidth despite the compression. However, hybrid approaches, where only a part
of the computations is done on the cloud, are viable. The authors do not, however, discuss the
impact of the delay on the map building in these experiments.

Cloud-based robotics applications are fairly new and have a great potential to, at least, reduce
the computational requirements inside autonomous vehicles in the future. Interested readers can
refer to [217] for a short survey of this recent practice and to [218] for a more general review
of cloud approaches in robotics.

2) Offline centralized SLAM: The delays and latencies induced by cloud computations can prevent
the use of the previously cited methods in autonomous cars. However, with the aim of easing the
creation of world-wide maps, offline computations of data gathered during the day by a fleet of
vehicles can be a way to build consistent large-scale maps. Offline computation of SLAM data can
be seen as a natural extension of multisession SLAM algorithms. Multisession SLAM is the
possibility for a SLAM algorithm to take into account several passages in overlapping areas and so
to extend the map initially built. The identification of common grounds between two maps usually
involves place recognition algorithms that provide loop-closing constraints to compute more
coherent maps. Approaches proposing multisession mapping are based on a graph representation which
has the advantage of being flexible when it comes to adding new constraints and nodes to optimize.
We refer the readers to the description of the following works in Section IV which could be
utilized by several vehicles in a cloud computation: [148] [150] [154] [161] [163] [166] [167]
[168]. While these approaches are only applied to a single vehicle, the extension to a fleet is
straightforward. In [166], the authors mention that the objective in the long term is the
deployment of such a system in probe cars that would gather data and fuse them in the cloud to
then provide maps as a service. The main difficulties are the necessity to divide maps by sectors
so that they remain optimizable in a reasonable time and to limit how maps can grow to avoid
having an intractable environment representation. The first problem has already been addressed in
single-vehicle SLAM with submapping techniques and the second is still an open challenge. Finally,
a practical problem is to have common representations within a set of probe cars to be able to
build and exploit the same map.

The resources involved to roll out such an application make it more an industrial challenge than a
research problem. This is the objective of the map creator HERE which intends to build HAD (Highly
Automated Driving) maps by collecting data from probe cars. The data serve to feed different maps,
which are computed offline in the cloud, for specific services. A noticeable research effort can
be found in [219] where a cloud service is proposed in which data can be collected, stored and
shared. Most of the previously cited cloud-based approaches are connected to this service and it
could be a way to experiment with large-scale multi-vehicle offline map building in the future.

B. Decentralized SLAM

The increasing connectivity capacities of the autonomous vehicle with the infrastructure (Vehicle
To Infrastructure, V2I) and other cars (Vehicle To Vehicle, V2V) bring the question of how
localization methods can take advantage of them.
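
To make the double counting of a measurement mentioned at the start of this section more concrete,
the following toy sketch may help. It is not taken from any of the surveyed works: the scalar
landmark, the prior, the two measurements and all variable names are assumptions chosen for
illustration. Two vehicles share the same prior and each adds one independent measurement. Summing
their two estimates naively counts the shared prior twice and yields an overconfident variance,
whereas removing the common information once recovers what a central node would compute.

```python
# Toy 1-D example of double counting in decentralized fusion.
# All numbers are illustrative assumptions, not values from the surveyed papers.


def to_information(mean, var):
    """Convert (mean, variance) to information form (y = mean / var, Y = 1 / var)."""
    return mean / var, 1.0 / var


def to_moments(y, Y):
    """Convert information form back to (mean, variance)."""
    return y / Y, 1.0 / Y


# Prior on a landmark position, known to both vehicles before they exchange data.
y0, Y0 = to_information(0.0, 4.0)

# One independent measurement per vehicle: (value, noise variance).
z_a, r_a = 1.0, 1.0
z_b, r_b = 2.0, 1.0

# Local posteriors: in information form a measurement update is a simple addition.
ya, Ya = y0 + z_a / r_a, Y0 + 1.0 / r_a
yb, Yb = y0 + z_b / r_b, Y0 + 1.0 / r_b

# Reference: a central node that uses the prior once and both measurements.
central = to_moments(y0 + z_a / r_a + z_b / r_b, Y0 + 1.0 / r_a + 1.0 / r_b)

# Naive exchange: adding the two posteriors counts the shared prior twice.
naive = to_moments(ya + yb, Ya + Yb)

# Corrected exchange: subtract the common information once.
fixed = to_moments(ya + yb - y0, Ya + Yb - Y0)

print("central (mean, var):", central)
print("naive   (mean, var):", naive)
print("fixed   (mean, var):", fixed)
```

In this sketch the corrected fusion matches the centralized result, while the naive one reports a
smaller variance than the data justify. Keeping track of what two vehicles already have in common
is one simple way to avoid the problem; it is shown here only as an illustration, not as the
approach of any cited paper.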
two map pieces coming from two different vehicles. In [236], the authors analyze the distribution
of potential multi-vehicle correspondences. Inliers, unlike outliers, are assumed to produce the
same transformation between two vehicles. A clustering algorithm is used to find initial
candidates that can then be further analyzed with more classical methods. The works of Li et al.
[144][145] solve the initial alignment of the maps from different vehicles using a low-cost GPS
and a genetic algorithm in the restricted search space.

3) Communication issues: The bandwidth needs are often neglected in decentralized SLAM. For
instance, in [225], submaps are directly sent. In [237], graphs (nodes comprise raw data) are
exchanged, which is only suitable for small vehicle fleets. In [238], the authors propose to
exchange maps only when vehicles can detect each other, which means that large amounts of data
will be sent in occasional bursts. Methods communicating only topological maps, like in [226] and
[227], avoid this problem by providing very light maps.

The exchange strategy should also be able to cope with potential losses and delays. When these are
not considered, a temporary saturation or failure can lead to information being permanently lost
[237][225]. The algorithms built around interactions with neighbors are usually capable of
identifying and asking for missing data [227]. Another possibility, described in [239], is the
Lazy Belief Propagation which integrates observations in a Particle Filter that does not depend on
the temporal order in which data are added to the filter. Thus, missing or late information can
still be integrated later without a problem. However, it is necessary to be able to identify
missing values. More generally, all the approaches based on the inverse form of the Kalman Filter,
namely the Information Filter, handle delay natively. Indeed, the update becomes additive and does
not depend on any temporal order. It has become a common choice in the multi-vehicle community
[234][240][241].

4) Experimental results: The experiments carried out in the previously cited articles give a clear
view of the potential deployment of such methods for autonomous driving. Indeed, most methods are
only demonstrated in simulated experiments [220][225][227][235][239][240]. The main reason is that
addressing all the issues mentioned in this section in order to conduct real data experiments can
be difficult as they are not all directly connected to the estimation problem. Some approaches
have been demonstrated indoors in real time [242][233]. In [226], the authors demonstrate an
air/ground cooperation in an outdoor environment. The map is built in a cooperative fashion with
vision landmarks. The accuracy has not been measured. In [113], outdoor experiments with two or
three vehicles are performed. The method relies on the exchange of visual landmarks. The accuracy
ranges from 50 cm to a few meters depending on the capacity of the vehicle to identify common
landmarks. Finally, in [145], 2 vehicles equipped with lasers merge their maps together in an
outdoor experiment. The method is able to reach a 3-meter error on average.

This short survey of recent results shows that we are still far from generalizing the use of
decentralized SLAM for autonomous driving. However, the communication means themselves continue to
improve and bandwidth limitations might be less of a problem in the coming years. Still,
large-scale demonstrations of the capabilities of decentralized SLAM are yet to come and applying
the previously described approaches to autonomous driving is one of the upcoming challenges of
SLAM.

C. Discussion

This overview of multi-robot SLAM approaches has shown that the maturity of such methods for
autonomous driving differs vastly depending on the way data are handled (in a centralized or
decentralized fashion). Online centralized SLAM systems have been demonstrated on small-scale
experiments with either a classic computer that serves as a hub to collect observations or with
cloud processing. The extension from mobile robots to driverless cars raises the question of the
bandwidth capacity that will be available in tomorrow's cars but most importantly of the
availability of an Internet connection in the vehicle. Completely offloading critical processing
onto the cloud will probably not be safe enough. However, delegating the task of updating maps
gathered from several vehicles in a reasonable amount of time is an interesting prospect.
Real-time updates for map portions that might have changed could help driverless cars to adapt in
a quicker way. Moreover, environmental conditions that have a low dynamicity could be signaled
beforehand in order for the vehicle to anticipate by choosing more adapted features, etc. To
attain this goal, large-scale experiments, involving a realistic number of vehicles, are
necessary. It also means that a major prospect remains the design of software architectures able
to cope with data flows and to properly segment updates. One key element to make online
cloud-based methods viable is the choice of a map representation able to integrate all incoming
information.

In that sense, fully decentralized methods share similar goals. The main difference comes from the
fact that, by receiving the information from nearby cars, the ego-vehicle can select which
information to integrate in its map. An interesting challenge thus becomes to establish criteria
in order to evaluate the information gain that maps coming from different vehicles can bring with
relation to the goal of the ego-vehicle (destination, energetic constraints, etc.). However, all
the practical implications of decentralized methods are the first challenge to tackle. Contrary to
cloud approaches, where processing power is less of a limit, the tractability of the final
solution will also be an important factor to consider. Large-scale experiments are also an
important milestone to reach in order to clearly exhibit the benefits of decentralized SLAM for
autonomous vehicles.

Offline centralized methods might be the most mature for autonomous driving mainly because this
challenge is very close to multisession SLAM. While future HAD maps may not provide all the needed
information to reliably localize a vehicle in various conditions, probe cars might be a way to
quickly gather enough data to build such maps. It is also an interesting perspective regarding the
capacity of vehicles to detect long-term changes. Indeed, the agglomeration of different
viewpoints on a situation should ease this process. Being
able to limit the growth of these maps remains a blocking point. Finally, a considerable challenge is to build sufficiently robust software architectures to integrate and process all these data. The cost of such an infrastructure directs large-scale experiments with such methods towards industrial companies rather than small research teams.

VI. LARGE-SCALE EXPERIMENTS FOR AUTONOMOUS VEHICLES

In recent years, the scope of the experiments used to validate localization algorithms has broadened with the availability of large-scale data sets. Moreover, many research teams now have a dedicated platform to test their algorithms over many kilometers. In this section, we give an overview of the large-scale experiments that have involved fully autonomous vehicles. We will, of course, focus on the localization algorithms and the context in which they have been used.

The first large-scale experiments took place at the beginning of the 1990s and were mostly on highways [243][244]. In both approaches, localization was performed using lane detection with vision sensors to laterally position the vehicle on the road. Lane detection algorithms have long been the default localization system in autonomous driving demonstrations, as they are well suited to a highway context. The DARPA Grand Challenge [245], with its 244-km race across the desert, changed things with a new setting. In [246], a GPS is coupled with elevation maps built using laser scanners and a road-finding algorithm in order to avoid cliffs, rocks and the like. A similar approach is chosen by the winner of the challenge [247]. The GPS is coupled with an IMU and an algorithm to detect the road. In [248], a terrain reconstruction with lasers was also used. In mostly empty territories, the use of a GPS alone was sufficient for navigation. Obstacles and the difficult terrain added the necessity to build maps that can provide a safe corridor in which vehicles could evolve.

The DARPA Urban Challenge that followed focused on urban environments with a 97-km autonomy test in city-like traffic [249]. It has to be noted that the environment was still sufficiently open for a Differential GPS to operate properly. The reliance on maps was more important this time. The Route Network Definition File (RNDF) format was specially designed for this challenge in order to furnish the topology and geometry of the road. Junior, the vehicle proposed by Stanford [250], used a GPS and the RNDF map to position the vehicle. The accuracy was improved with a laser system performing lane and curb detection. In [251], a visual lane detection algorithm combined with the RNDF was responsible for the vehicle localization. The authors pointed out the difficulty of obtaining a reliable localization with a vision-only system because of environmental conditions such as shadows. The winner of this challenge [252] combined a GPS along with an IMU and the RNDF map. Localization was improved with a laser-based lane detection. The authors mention the necessity of prior road models for an efficient localization. Semantic information is also referred to as an important point in order to disambiguate situations.

Another interesting challenge, conducted in 2011, is the Grand Cooperative Driving Challenge (GCDC) [253]. The idea was to evaluate how collaboration in platooning can help to reduce congestion. The winner of the GCDC challenge in 2011 [254] used a previously built map in which received positions of the other fleet members were matched and coupled with on-board radar detection. This showed that sharing simple localization information can still benefit autonomous driving by agreeing on speed or anticipating maneuvers.

A very large-scale experiment has been conducted by the VisLab team [255] with a 13,000-km trip from Italy to China. Technical details about the embedded technologies can be found in [256]. The localization was performed using a lane keeping algorithm (stereo and monocular) and a leader following system. A laser terrain mapping algorithm was also implemented for off-road driving but not used in this experiment. The leader following is an interesting possibility for autonomous driving if vehicles agree on their destination (or at least, a part of the trip). It still implies that one vehicle should be able to have a full localization system. For instance, cars with less expensive sensors could use platooning-like formations.

Automatic guidance of a vehicle was demonstrated in [257] for the Stadtpilot project. This work can be seen as an update of the results of the Urban Challenge [249]. During a 15-km experiment on open roads, a combination of DGPS, IMU and lane keeping algorithm was used. Prior digital maps were also employed. Based on their experience in DARPA, the authors clearly indicate that GPS and IMU are not sufficient for autonomous driving in urban environments. An evolution of [255] was presented in [258]. In a 13-km experiment in different environments and real traffic, VisLab demonstrated the possibility of using lane or leader following based on vision, IMU and DGPS. An interesting aspect is that a map of the environment was used to trigger the appropriate perception module.

The V-Charge project aims at providing automated valet parking with close-to-market sensors. In [259], the authors evaluate their localization approach in a real-world scenario. First, the map is built using SURF keypoints extraction from fisheye cameras. Loop closures are defined manually and the map is optimized using global Bundle Adjustment. Localization is then performed against the known map still using SURF features. The algorithm is evaluated 0, 1 and 2 months after the map creation with a sufficient accuracy. However, the authors state that map update is needed over long periods of time to reflect changes. Another open question raised here but not addressed is the map portability between different vehicles. Still in the V-Charge project, [260] completed this system with height maps built from stereovision. A semantic map and a road graph were also needed to identify parking places and how to navigate between them.

Large-scale outdoor experiments with mobile robots have also been carried out in real-world environments. In [261], a robot traveled 20 km in crowded streets. The map was built beforehand with a graph-based laser SLAM. The on-the-fly localization was based on a particle filter and a GPS for initialization. The system was able to perform reinitialization
when lost. The authors indicate that mobile obstacles masking the map were a source of localization losses. In [262], a robot traveled a 1-km road several times at different periods of the year. The map was built beforehand as well and the localization was based on road boundaries detected by laser and integrated inside an EKF. One of the problems cited by the authors is the recovery when the robot is lost and the GPS is not available. Appearance changes, like leaves on the ground, also proved to be difficult to handle. In [263], a 6-km experiment in pedestrian walkways is discussed. A 3D laser map of edges was built beforehand with 2D lasers. The localization system compared the map with the current observation to weight the particles of a particle filter. An important difficulty faced by the authors is the presence of glass windows that disturbed the laser-based localization system. They conclude by saying that 2D lasers are not sufficient for this kind of experiment.

Shuttle demonstrations in cities are also becoming more frequent as the mapping is confined to a dedicated area. In [264], the authors discuss the lessons learned after 1,500 km of a vision-guided shuttle in a private site. The experiments took place over 3 months and the localization was performed using a hierarchical Bundle Adjustment method over a previously built map. This approach could be used with either a front or a back camera. The main difficulties were the lighting conditions despite the use of front and back cameras. The dynamic range of vision sensors was not sufficient to handle daytime changes, which the authors judged to have a greater impact on the localization than the 3-month gap. The CityMobil2 project has accumulated 26,000 km in various cities over several months [265]. The localization differed depending on the experiments but was based on pre-built maps with laser or vision and GPS. The weather and the general reliability of the software had a significant impact on the results.

As already discussed, Levinson et al. showed an update of Stanford’s vehicle from the Urban Challenge in [266]. Ground maps were generated offline using precise GPS, IMU and a 64-layer 3D laser from multiple passages. The localization inside this map was then performed in real-time with a 2D histogram filter for a 10-cm accuracy over many kilometers. The approach of Ulm University is presented in [267]. A map, composed of MSER features coming from vision and of a laser grid, is built beforehand and geo-referenced using an RTK-GPS. This map representation is light and allowed for a 10-cm accuracy on average during the 5-km test. Using both sensors for localization improved the results but the need for a highly precise map built in advance is seen as a problem by the authors.

BMW, in [268], gives an interesting overview of their experience with autonomous driving over thousands of kilometers on public roads (highways). Their approach relies on lane marking detection using vision and laser, as well as odometry and a DGPS. Road boundaries were also detected using laser and radar. A high-precision map is needed for a proper localization. Their map integrates semantic and geometric information (lane models, connectivity, etc.) as well as localization data (lane marking and road boundary positions) with different layers. The difficulties presented by the authors were the necessity to remap the environment from time to time for a proper online exploitation. This process should be automated according to the authors. Large-scale maps are also a concern and they should be broken down into submaps. The standardization of these maps is also an important aspect that needs to be addressed to be able to use probe cars and crowd-source the required information for digital maps.

Daimler experimented with their autonomous vehicle over 103 km in mixed environments (urban, highways, etc.) with close-to-market sensors [109]. Their system is a combination of vision, radar and accurate digital maps. The localization fuses lane detection with feature-based localization for a 20-cm accuracy on average. All the lanes and features were acquired during a first passage and fused in a UKF. Among the topics of importance cited by the authors, the scalability of the maps is crucial. Two other major aspects that need to be addressed for a commercial use are the reliance on an up-to-date digital map as well as its accuracy. For the authors, the sensor setup should be improved in order to be less dependent on the map.

Among the industrial works in autonomous driving, Waymo (formerly Google Car project) was a pioneer. The project started following the DARPA Urban Challenge on the foundation of the winner [252]. While the algorithms are not specifically known, the localization relies on dedicated 3D maps built beforehand and corrected by hand. The capacity of Google makes that solution viable and more than 40,000 km per week are said to be driven by Waymo. Even with these impressive results, the disengagement reports published each year [269] [270] indicate that drivers must occasionally take back the wheel due to software failures, and more frequently in urban environments. In contrast to Google, Tesla’s strategy is to collect data from their own vehicles in order to improve the capabilities of their vehicles [271]. This method offers, by definition, less control on the quality of the produced outputs. In any case, the impact of this philosophy on the localization is not known.

VII. DISCUSSION AND CONCLUSION

In this survey, we have focused on the individual challenges that should be considered depending on the approach chosen. To conclude this paper, we will now discuss how some more general aspects could have an impact on the localization of autonomous vehicles.

During the 23rd ITS World Congress, a localization competition was proposed. Based on low-resolution voxel maps of above-the-ground objects and lanes furnished by HERE, participants were asked to propose an accurate localization algorithm. The competition ended with no winner as the targeted accuracy was not reached. It is not yet clear if HAD maps, that are going to be available in the coming years, will provide a strong enough prior knowledge for localization algorithms. Depending on the outcome, the community will surely adapt what kind of prior knowledge is used in SLAM. These maps are also intended to be used as part of the Local Dynamic Map (LDM) that gathers static and dynamic information to which SLAM could contribute. As such, standardization is needed [272].
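
To make the LDM notion more concrete, the following sketch shows one possible way to store static and dynamic information side by side so that a localization module only reads the static part as prior knowledge. It is a minimal illustration under stated assumptions, not the format discussed in [272]: the four-layer split is an assumption borrowed from the layering commonly used to describe the LDM, and every class, field and key name (Layer, Entry, LocalDynamicMap, "lane_12", etc.) is hypothetical.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict, List, Tuple


class Layer(IntEnum):
    """Assumed four-layer split, ordered from most static to most dynamic."""
    PERMANENT_STATIC = 1   # e.g. road geometry taken from a HAD map
    TRANSIENT_STATIC = 2   # e.g. landmarks contributed by a SLAM system
    TRANSIENT_DYNAMIC = 3  # e.g. roadworks, congestion, weather
    HIGHLY_DYNAMIC = 4     # e.g. positions of nearby vehicles


@dataclass
class Entry:
    key: str                        # hypothetical unique identifier
    layer: Layer
    position: Tuple[float, float]   # (x, y) in a shared map frame, in meters
    timestamp: float                # seconds


@dataclass
class LocalDynamicMap:
    entries: Dict[str, Entry] = field(default_factory=dict)

    def update(self, entry: Entry) -> None:
        """Insert an entry, or overwrite it if the new one is not older."""
        old = self.entries.get(entry.key)
        if old is None or entry.timestamp >= old.timestamp:
            self.entries[entry.key] = entry

    def prior_for_localization(self) -> List[Entry]:
        """Return only the two static layers, usable as a localization prior."""
        return [e for e in self.entries.values()
                if e.layer <= Layer.TRANSIENT_STATIC]

    def expire(self, now: float, max_age: float) -> None:
        """Drop dynamic entries older than max_age seconds; keep static ones."""
        self.entries = {
            k: e for k, e in self.entries.items()
            if e.layer <= Layer.TRANSIENT_STATIC or now - e.timestamp <= max_age
        }
```

In such a design, a SLAM system would write to the static layers and read them back as a prior, while exchanged vehicle positions would only ever live in the dynamic layers and age out quickly.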
An aspect that is not often discussed is sensor placement. It can have a tremendous impact on the performance of a localization system. For instance, a laser on the car roof is going to have a clear view of the infrastructure and avoid most of the mobile obstacles while one in the undercarriage might be affected by masked information. Similarly, the minimal set of sensors necessary for localization is not clearly defined yet (even if lasers and cameras seem to be favored). Such a definition should be made jointly with the map representation that is going to serve as prior knowledge. Of course, this aspect is also tightly linked to the cost of those sensors. An interesting perspective in that sense is multi-modality where a map could be built by an expensive 3D laser but then exploited by a simple camera as proposed in [273]. Another interesting challenge is to build flexible architectures in which a decision system can choose which sensors, detectors or maps to favor depending on the context.

Context by itself is an important part of autonomous driving and its understanding would help place recognition algorithms or could even be directly integrated to make SLAM more robust. For now, most of the works carried out have concerned indoor localization. They have been initiated by Chatila et al. in [274] where semantic (object) maps are used for high-level decision-making. SLAM methods have since then been frequently considered for semantic map building. In [275] and [276], the authors take advantage of a 3D laser to identify the ceiling, the floor and objects. In [277], a conceptual space representation is proposed based on three different maps: metric (from a SLAM system), navigation (free space) and topological (connections between door-separated areas). The conceptual map allows the robot to have a semantic representation for rooms and objects that helps the interactions with users. In [278], the authors integrate well-known objects inside a monocular SLAM. A similar approach is followed in SLAM++ [279] where objects are detected and optimized in a common localization framework. In these last two works, context and relations between objects are not directly used to reason upon. In that sense, semantic SLAM in outdoor environments is still a challenge that needs to be tackled.

However, semantic mapping from images alone is a field that has grown lately [280][281]. Deep Convolutional Neural Networks have largely contributed to this trend and methods like SegNet [281] show impressive results. Their application in localization methods remains an open challenge but a direct use would be to remove moving, or temporarily static, obstacles that can disturb the proper behavior of a SLAM algorithm. More generally, CNNs could change how place recognition is performed by using feature maps coming from these networks as in [142]. The impact and the use of CNNs for localization in autonomous driving will surely evolve in the coming years. A recent example like [282] shows that localization might not even be needed. In this paper, Nvidia demonstrates the possibility to directly learn the steering angle that should be applied from camera clues. Nevertheless, the scalability of such methods has yet to be demonstrated.

In recent years, experiments have broadened their scope and autonomous driving for several kilometers and over long periods of time is more common. Shuttle experiments are also starting in various cities. Even though, in that case, SLAM approaches can use a previously built map without scalability issues, it is a good way to test them over long periods of time and confront them with the environment variability. In most roadmaps [283], the automation of transit systems is actually foreseen at a shorter term than other vehicles as it has the advantage of restricting the covered area.

Finally, the safety of localization algorithms is an important issue to consider. Multiple sources should be envisaged and strategies to safely switch among them must be designed. Taking into account the failures and their impacts on the localization system could help in creating degraded localization modes that ensure that a vehicle can reach a stopping spot safely.

REFERENCES

[1] R. Smith and P. Cheeseman, “On the Representation and Estimation of Spatial Uncertainty,” The International Journal of Robotics Research, vol. 5, no. 4, pp. 56–68, 1986.
[2] J. J. Leonard and H. Durrant-Whyte, “Simultaneous Map Building and Localization for an Autonomous Mobile Robot,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 1991, pp. 1442–1447.
[3] F. Dellaert, D. Fox, W. Burgard, and S. Thrun, “Monte Carlo Localization for Mobile Robots,” in IEEE International Conference on Robotics and Automation, vol. 2, 1999, pp. 1322–1328.
[4] T. Bailey and H. Durrant-Whyte, “Simultaneous Localization and Mapping (SLAM): Part II,” IEEE Robotics and Automation Magazine, vol. 13, no. 3, pp. 108–117, 2006.
[5] H. Durrant-Whyte and T. Bailey, “Simultaneous Localization and Mapping: Part I,” IEEE Robotics and Automation Magazine, vol. 13, no. 2, pp. 99–110, 2006.
[6] S. Thrun, W. Burgard, and D. Fox, Probabilistic Robotics. MIT Press Cambridge, 2005, ch. 3, pp. 279–484.
[7] C. Stachniss, J. J. Leonard, and S. Thrun, Springer Handbook of Robotics. Springer International Publishing, 2016, ch. 46, pp. 1153–1176.
[8] C. Cadena, L. Carlone, H. Carrillo, Y. Latif, D. Scaramuzza, J. Neira, I. Reid, and J. J. Leonard, “Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age,” IEEE Transactions on Robotics, vol. 32, no. 6, pp. 1309–1332, 2016.
[9] C. Schmid, R. Mohr, and C. Bauckhage, “Evaluation of Interest Point Detectors,” International Journal of Computer Vision, vol. 37, no. 2, pp. 151–172, 2000.
[10] J. Civera, O. G. Grasa, A. J. Davison, and J. M. M. Montiel, “1-Point RANSAC for EKF Filtering. Application to Real-Time Structure from Motion and Visual Odometry,” Journal of Field Robotics, vol. 27, no. 5, pp. 609–631, 2010.
[11] E. Mouragnon, M. Lhuillier, M. Dhome, F. Dekeyser, and P. Sayd, “Real Time Localization and 3D Reconstruction,” in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2006, pp. 363–370.
[12] R. E. Kalman, “A New Approach to Linear Filtering and Prediction Problems,” Journal of Basic Engineering, vol. 82, no. 1, pp. 35–45, 1960.
[13] M. W. M. G. Dissanayake, P. Newman, H. F. Durrant-Whyte, S. Clark, and M. Csorba, “An Experimental and Theoretical Investigation into Simultaneous Localization and Map Building,” in The Sixth International Symposium on Experimental Robotics VI, 2000, pp. 265–274.
[14] M. W. M. G. Dissanayake, P. Newman, H. F. Durrant-Whyte, S. Clark, and M. Csorba, “A Solution to the Simultaneous Localization and Map Building (SLAM) Problem,” IEEE Transactions on Robotics and Automation, vol. 17, no. 3, pp. 229–241, 2001.
[15] R. E. Kalman and R. S. Bucy, “New Results in Linear Filtering and Prediction Theory,” Journal of Basic Engineering, vol. 83, no. 3, pp. 95–108, 1961.
[16] S. J. Julier and J. K. Uhlmann, “A Counter Example to the Theory of Simultaneous Localization and Map Building,” in IEEE International Conference on Robotics and Automation, 2001, pp. 4238–4243.
[17] Y. Bar-Shalom, X. R. Li, and T. Kirubarajan, Estimation with Applications to Tracking and Navigation. Wiley-Interscience, 2001.
[18] J. Guivant, E. Nebot, and S. Baiker, “Autonomous Navigation and Map Building Using Laser Range Sensors in Outdoor Applications,” Journal of Robotic Systems, vol. 17, no. 10, pp. 565–583, 2000.
[19] J. Guivant and E. Nebot, “Optimization of the Simultaneous Localization and Map Building Algorithm for Real-Time Implementation,” IEEE Transactions on Robotics and Automation, vol. 17, no. 3, pp. 242–257, 2001.
[20] J. J. Leonard and H. J. S. Feder, “A Computationally Efficient Method for Large-Scale Concurrent Mapping and Localization,” in International Symposium on Robotics Research, vol. 9, 2000, pp. 169–178.
[21] P. Newman and J. J. Leonard, “Pure Range-Only Sub-Sea SLAM,” in IEEE International Conference on Robotics and Automation, vol. 2, 2003, pp. 1921–1926.
[22] P. Newman, J. Leonard, J. D. Tardós, and J. Neira, “Explore and Return: Experimental Validation of Real-Time Concurrent Mapping and Localization,” in IEEE International Conference on Robotics and Automation, vol. 2, 2002, pp. 1802–1809.
[23] P. Newman, D. Cole, and K. Ho, “Outdoor SLAM Using Visual Appearance and Laser Ranging,” in IEEE International Conference on Robotics and Automation, 2006, pp. 1180–1187.
[24] A. J. Davison, “Real-Time Simultaneous Localisation and Mapping with a Single Camera,” in IEEE International Conference on Computer Vision, 2003, pp. 1403–1410.
[25] J. Montiel, J. Civera, and A. J. Davison, “Unified Inverse Depth Parametrization for Monocular SLAM,” in Robotics: Science and Systems, 2006.
[26] G. Bresson, T. Féraud, R. Aufrère, P. Checchin, and R. Chapuis, “Real Time Monocular SLAM with Low Memory Requirements,” IEEE Transactions on Intelligent Transportation Systems, 2015.
[27] S. B. Williams, G. Dissanayake, and H. Durrant-Whyte, “An Efficient Approach to the Simultaneous Localisation and Mapping Problem,” in IEEE International Conference on Robotics and Automation, 2002, pp. 406–411.
[28] J. J. Leonard and P. Newman, “Consistent, Convergent and Constant-Time SLAM,” in International Joint Conferences on Artificial Intelligence, 2003, pp. 1143–1150.
[29] T. Bailey, “Mobile Robot Localisation and Mapping in Extensive Outdoor Environments,” Ph.D. dissertation, Australian Centre for Field Robotics - The University of Sydney, 2002.
[30] M. Bosse, P. Newman, J. Leonard, M. Soika, W. Feiten, and S. Teller, “An Atlas Framework for Scalable Mapping,” in IEEE International Conference on Robotics and Automation, vol. 2, 2003, pp. 1899–1906.
[31] C. Estrada, J. Neira, and J. D. Tardós, “Hierarchical SLAM: real-time accurate mapping of large environments,” IEEE Transactions on Robotics, vol. 21, no. 4, pp. 588–596, 2005.
[32] P. Piniés and J. D. Tardós, “Large Scale SLAM Building Conditionally Independent Local Maps: Application to Monocular Vision,” IEEE Transactions on Robotics, vol. 24, no. 5, pp. 1094–1106, 2008.
[33] L. M. Paz, J. D. Tardós, and J. Neira, “Divide and Conquer: EKF SLAM in O(n),” IEEE Transactions on Robotics, vol. 24, no. 5, pp. 1107–1120, 2008.
[34] J. L. Blanco, J. González, and J.-A. Fernández-Madrigal, “Subjective Local Maps for Hybrid Metric-Topological SLAM,” Robotics and Autonomous Systems, vol. 57, no. 1, pp. 64–74, 2009.
[35] M. Chli and A. J. Davison, “Automatically and Efficiently Inferring the Hierarchical Structure of Visual Maps,” in IEEE International Conference on Robotics and Automation, 2009, pp. 387–394.
[36] S. J. Julier and J. K. Uhlmann, “New Extension of the Kalman Filter to Nonlinear Systems,” in AeroSense’97, 1997, pp. 182–193.
[37] E. A. Wan and R. V. D. Merwe, “The Unscented Kalman Filter for Nonlinear Estimation,” in Adaptive Systems for Signal Processing, Communications and Control Symposium, 2000, pp. 153–158.
[38] S. J. Julier and J. K. Uhlmann, “Unscented Filtering and Nonlinear Estimation,” Proceedings of the IEEE, vol. 92, no. 3, pp. 401–422, 2004.
[39] D. Chekhlov, M. Pupilli, W. Mayol-Cuevas, and A. Calway, “Real-Time and Robust Monocular SLAM using Predictive Multi-Resolution Descriptors,” Advances in Visual Computing, vol. 4292, pp. 276–285, 2006.
[40] P. S. Maybeck, Stochastic Models, Estimation and Control. Elsevier, 1982.
[41] Y. Liu and S. Thrun, “Results for Outdoor-SLAM Using Sparse Extended Information Filters,” in IEEE International Conference on Robotics and Automation, vol. 1, 2003, pp. 1227–1233.
[42] S. Thrun, Y. Liu, D. Koller, A. Y. Ng, Z. Ghahramani, and H. Durrant-Whyte, “Simultaneous Localization and Mapping with Sparse Extended Information Filters,” The International Journal of Robotics Research, vol. 23, no. 7-8, pp. 693–716, 2004.
[43] R. Eustice, M. Walter, and J. J. Leonard, “Sparse Extended Information Filters: Insights Into Sparsification,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2005, pp. 3281–3288.
[44] I. Mahon, S. B. Williams, O. Pizarro, and M. Johnson-Roberson, “Efficient View-Based SLAM Using Visual Loop Closures,” IEEE Transactions on Robotics, vol. 24, no. 5, pp. 1002–1014, 2008.
[45] M. R. Walter, R. M. Eustice, and J. J. Leonard, “Exactly Sparse Extended Information Filters for Feature-Based SLAM,” The International Journal of Robotics Research, vol. 26, no. 4, pp. 335–359, 2007.
[46] J.-S. Gutmann, E. Eade, P. Fong, and M. E. Munich, “A Constant-Time Algorithm for Vector Field SLAM Using an Exactly Sparse Extended Information Filter,” in Robotics: Science and Systems, 2010.
[47] A. Eliazar and R. Parr, “DP-SLAM: Fast, robust simultaneous localization and mapping without predetermined landmarks,” in International Joint Conference on Artificial Intelligence, vol. 3, 2003, pp. 1135–1142.
[48] A. Eliazar and R. Parr, “DP-SLAM 2.0,” in IEEE International Conference on Robotics and Automation, 2004, pp. 1314–1320.
[49] M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit, “FastSLAM: A Factored Solution to the Simultaneous Localization And Mapping Problem,” in AAAI/IAAI, 2002, pp. 593–598.
[50] K. P. Murphy, “Bayesian Map Learning in Dynamic Environments,” in Neural Information Processing Systems, 1999, pp. 1015–1021.
[51] S. Thrun, W. Burgard, and D. Fox, “A Real-Time Algorithm for Mobile Robot Mapping With Applications to Multi-Robot and 3D Mapping,” in IEEE International Conference on Robotics and Automation, vol. 1, 2000, pp. 321–328.
[52] E. Eade and T. Drummond, “Scalable Monocular SLAM,” in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, 2006, pp. 469–476.
[53] T. Bailey, J. Nieto, and E. Nebot, “Consistency of the FastSLAM Algorithm,” in IEEE International Conference on Robotics and Automation, 2006, pp. 424–429.
[54] M. Mohan and K. K. Madhava, “Mapping Large Scale Environments by Combining Particle Filter and Information Filter,” in International Conference on Control, Automation, Robotics and Vision, 2010, pp. 1000–1005.
[55] D. Hahnel, W. Burgard, D. Fox, and S. Thrun, “An efficient FastSLAM algorithm for generating maps of large-scale cyclic environments from raw laser range measurements,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, vol. 1, 2003, pp. 206–211.
[56] T. Reineking and J. Clemens, “Evidential FastSLAM for Grid Mapping,” in 16th International Conference on Information Fusion, 2013, pp. 789–796.
[57] H. P. Moravec, “Sensor Fusion in Certainty Grids for Mobile Robots,” AI Magazine, vol. 9, no. 2, p. 61, 1988.
[58] A. Elfes, “Using Occupancy Grids for Mobile Robot Perception and Navigation,” Computer, vol. 22, no. 6, pp. 46–57, 1989.
[59] Z. Alsayed, G. Bresson, F. Nashashibi, and A. Verroust-Blondet, “PML-SLAM: a solution for localization in large-scale urban environments,” in IEEE/RSJ International Conference on Intelligent Robots and Systems Workshop on Perception and Navigation for Autonomous Vehicles in Human Environment, 2015.
[60] G. Trehard, Z. Alsayed, E. Pollard, B. Bradai, and F. Nashashibi, “Credibilist Simultaneous Localization and Mapping with a LIDAR,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2014, pp. 2699–2706.
[61] R. Rouveure, P. Faure, and M. Monod, “Radar-based SLAM without odometric sensor,” in ROBOTICS 2010: International workshop of Mobile Robotics for environment/agriculture, 2010.
[62] D. Vivet, P. Checchin, and R. Chapuis, “Localization and Mapping Using Only a Rotating FMCW Radar Sensor,” Sensors, pp. 4527–4552, 2013.
[63] E. Jose and M. D. Adams, “An Augmented State SLAM formulation for Multiple Line-of-Sight Features with Millimetre Wave RADAR,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2005, pp. 3087–3092.
[64] B. Triggs, P. F. McLauchlan, R. I. Hartley, and A. W. Fitzgibbon, “Bundle Adjustment - A Modern Synthesis,” in Vision Algorithms: Theory and Practice. Springer, 2000, pp. 298–372.
[65] W. Press, S. Teukolsky, W. Vetterling, and B. Flannery, “Levenberg-Marquardt Method,” Numerical Recipes in C: The Art of Scientific Computation, pp. 542–547, 1992.
[66] H.-Y. Shum, Q. Ke, and Z. Zhang, “Efficient Bundle Adjustment with Virtual Key Frames: A Hierarchical Approach to Multi-frame Structure from Motion,” in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1999.
[67] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision. Cambridge Univ Press, 2000, ch. 18, pp. 434–457.
[68] E. Royer, M. Lhuillier, M. Dhome, and T. Chateau, “Localization in Urban Environments: Monocular Vision Compared to a Differential GPS Sensor,” in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2005, pp. 114–121.
[69] D. Steedly and I. Essa, “Propagation of Innovative Information in Non-Linear Least-Squares Structure from Motion,” in IEEE International Conference on Computer Vision, 2001, pp. 223–229.
[70] Z. Zhang and Y. Shan, “Incremental Motion Estimation through Modified Bundle Adjustment,” in International Conference on Image Processing, 2003, pp. 343–346.
[71] D. Nistér, O. Naroditsky, and J. Bergen, “Visual Odometry for Ground Vehicle Applications,” Journal of Field Robotics, vol. 23, no. 1, pp. 3–20, 2006.
[72] M. A. Fischler and R. C. Bolles, “Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography,” Communications of the ACM, vol. 24, no. 6, pp. 381–395, 1981.
[73] J. Michot, A. Bartoli, and F. Gaspard, “Bi-Objective Bundle Adjustment With Application to Multi-Sensor SLAM,” in 5th International Symposium on 3D Data Processing, Visualization and Transmission, 2010.
[74] K. Konolige and M. Agrawal, “FrameSLAM: from Bundle Adjustment to Real-Time Visual Mapping,” IEEE Transactions on Robotics, pp. 1066–1077, 2008.
[75] S. Thrun and J. J. Leonard, “Simultaneous Localization And Mapping,” in Springer Handbook of Robotics. Springer, 2008, pp. 871–889.
[76] G. Grisetti, S. Grzonka, C. Stachniss, P. Pfaff, and W. Burgard, “Efficient Estimation of Accurate Maximum Likelihood Maps in 3D,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2007, pp. 3472–3478.
[77] G. Grisetti, R. Kummerle, C. Stachniss, U. Frese, and C. Hertzberg, “Hierarchical Optimization on Manifolds for Online 2D and 3D Mapping,” in IEEE International Conference on Robotics and Automation, 2010, pp. 273–278.
[78] R. Kummerle, G. Grisetti, H. Strasdat, K. Konolige, and W. Burgard, “g2o: A General Framework for Graph Optimization,” in IEEE International Conference on Robotics and Automation, 2011, pp. 3607–3613.
[79] G. Dubbelman and B. Browning, “Closed-form Online Pose-chain SLAM,” in IEEE International Conference on Robotics and Automation, 2013, pp. 5190–5197.
[80] U. Frese and L. Schroder, “Closing a Million-Landmarks Loop,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2006, pp. 5032–5039.
[81] M. Kaess, A. Ranganathan, and F. Dellaert, “iSAM: Incremental Smoothing and Mapping,” IEEE Transactions on Robotics, vol. 24, no. 6, pp. 1365–1378, 2008.
[82] M. Kaess and F. Dellaert, “Covariance Recovery from a Square Root Information Matrix for Data Association,” Robotics and Autonomous Systems, vol. 57, no. 12, pp. 1198–1210, 2009.
[83] H. Strasdat, J. M. M. Montiel, and A. J. Davison, “Real-time Monocular SLAM: Why Filter?” in IEEE International Conference on Robotics and Automation, 2010, pp. 2657–2664.
[84] ——, “Visual SLAM: Why Filter?” Image and Vision Computing, vol. 30, no. 2, pp. 65–77, 2012.
[85] J. Aulinas, Y. Petillot, J. Salvi, and X. Lladó, “The SLAM problem: a survey,” in 11th International Conference of the Catalan Association for Artificial Intelligence, 2008, pp. 363–371.
[86] G. Dissanayake, S. Huang, Z. Wang, and R. Ranasinghe, “A Review of Recent Developments in Simultaneous Localization and Mapping,” in IEEE 6th International Conference on Industrial and Information Systems, 2011, pp. 477–482.
[87] S. Huang and G. Dissanayake, “A critique of current developments in simultaneous localization and mapping,” International Journal of Advanced Robotic Systems, vol. 13, no. 5, pp. 1–13, 2016.
[88] Z. Chen, J. Samarabandu, and R. Rodrigo, “Recent advances in simultaneous localization and map-building using computer vision,” Advanced Robotics, vol. 21, no. 3-4, pp. 233–265, 2007.
[89] D. Scaramuzza and F. Fraundorfer, “Visual Odometry: Part I: The First 30 Years and Fundamentals,” IEEE Robotics & Automation Magazine, vol. 18, no. 4, pp. 80–92, 2011.
[90] F. Fraundorfer and D. Scaramuzza, “Visual Odometry: Part II: Matching, Robustness, Optimization, and Applications,” IEEE Robotics & Automation Magazine, vol. 19, no. 2, pp. 78–90, 2012.
[91] J. Fuentes-Pacheco, J. Ruiz-Ascencio, and J. M. Rendón-Mancha, “Visual simultaneous localization and mapping: a survey,” Artificial Intelligence Review, vol. 43, no. 1, pp. 55–81, 2012.
[92] G. Ros, A. D. Sappa, D. Ponsa, and A. M. Lopez, “Visual SLAM for Driverless Cars: A Brief Survey,” in IEEE Intelligent Vehicles Symposium Workshops, 2012.
[93] S. Lowry, N. Sünderhauf, P. Newman, J. J. Leonard, D. Cox, P. Corke, and M. J. Milford, “Visual Place Recognition: A Survey,” IEEE Transactions on Robotics, vol. 32, no. 1, pp. 1–19, 2016.
[94] S. Saeedi, M. Trentini, M. Seto, and H. Li, “Multiple-robot Simultaneous Localization and Mapping - A Review,” Journal of Field Robotics, vol. 33, no. 1, pp. 3–46, 2016.
[95] A. Bonarini, W. Burgard, G. Fontana, M. Matteucci, D. G. Sorrenti, and J. D. Tardós, “RAWSEEDS: Robotics Advancement through Web-publishing of Sensorial and Elaborated Extensive Data Sets,” in IEEE/RSJ International Conference on Intelligent Robots and Systems Workshop on Benchmarks in Robotics Research, 2006.
[96] M. Smith, I. Baldwin, W. Churchill, R. Paul, and P. Newman, “The New College Vision and Laser Data Set,” The International Journal of Robotics Research, vol. 28, no. 5, pp. 595–599, 2009.
[97] A. Geiger, P. Lenz, and R. Urtasun, “Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite,” in IEEE Conference on Computer Vision and Pattern Recognition, 2012, pp. 3354–3361.
[98] J.-L. Blanco-Claraco, F.-A. Moreno-Duenas, and J. González-Jiménez, “The Malaga Urban Dataset: High-rate Stereo and Lidars in a realistic urban scenario,” The International Journal of Robotics Research, vol. 33, no. 2, pp. 207–214, 2014.
[99] M. Cummins and P. Newman, “FAB-MAP: Probabilistic Localization and Mapping in the Space of Appearance,” The International Journal of Robotics Research, vol. 27, no. 6, pp. 647–665, 2008.
[100] J.-L. Blanco, F.-A. Moreno, and J. Gonzalez, “A Collection of Outdoor Robotic Datasets with centimeter-accuracy Ground Truth,” Autonomous Robots, vol. 27, no. 4, pp. 327–351, 2009.
[101] A. S. Huang, M. Antone, E. Olson, L. Fletcher, D. Moore, S. Teller, and J. Leonard, “A high-rate, heterogeneous data set from the DARPA Urban Challenge,” The International Journal of Robotics Research, vol. 29, no. 13, pp. 1595–1601, 2010.
[102] T. Peynot, S. Scheding, and S. Terho, “The Marulan Data Sets: Multi-Sensor Perception in Natural Environment with Challenging Conditions,” The International Journal of Robotics Research, vol. 29, no. 13, pp. 1602–1607, 2010.
[103] A. Geiger, J. Ziegler, and C. Stiller, “StereoScan: Dense 3d Reconstruction in Real-time,” in IEEE Intelligent Vehicle Symposium, 2011, pp. 963–968.
[104] G. Pandey, J. R. McBride, and R. M. Eustice, “Ford Campus Vision and Lidar Data Set,” The International Journal of Robotics Research, vol. 30, no. 13, pp. 1543–1552, 2011.
[105] J. Zhang and S. Singh, “Visual-lidar Odometry and Mapping: Low-drift, Robust, and Fast,” in IEEE International Conference on Robotics and Automation, 2015.
[106] I. Cvišić and I. Petrović, “Stereo odometry based on careful feature selection and tracking,” in 2015 European Conference on Mobile Robotics, 2015, pp. 1–6.
[107] M. Buczko and V. Willert, “How to distinguish inliers from outliers in visual odometry for high-speed automotive applications,” in IEEE Intelligent Vehicle Symposium, 2016, pp. 478–483.
[108] M. Persson, T. Piccini, M. Felsberg, and R. Mester, “Robust stereo visual odometry from monocular techniques,” in IEEE Intelligent Vehicle Symposium, 2015, pp. 686–691.
[109] J. Ziegler, P. Bender, M. Schreiber, H. Lategahn, T. Strauss, C. Stiller, T. Dang, U. Franke, N. Appenrodt, C. G. Keller, E. Kaus, R. G. Herrtwich, C. Rabe, D. Pfeiffer, F. Lindner, F. Stein, F. Erbs, M. Enzweiler, C. Knoppel, J. Hipp, M. Haueis, M. Trepte, C. Brenk, A. Tamke, M. Ghanaat, M. Braun, A. Joos, H. Fritz, H. Mock, M. Hein, and E. Zeeb, “Making Bertha Drive - An Autonomous Journey on a Historic Route,” IEEE Intelligent Transportation Systems Magazine, vol. 6, no. 2, 2014.
[110] S. Huang and G. Dissanayake, “Convergence and Consistency Analysis for Extended Kalman Filter Based SLAM,” IEEE Transactions on Robotics, vol. 23, no. 5, pp. 1036–1049, 2007.
[111] T. Bailey, J. Nieto, J. Guivant, M. Stevens, and E. Nebot, “Consistency of the EKF-SLAM Algorithm,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2006, pp. 3562–3568.
[112] U. Frese, “A Discussion of Simultaneous Localization and Mapping,” Autonomous Robots, vol. 20, no. 1, pp. 25–42, 2006.
[113] G. Bresson, R. Aufrère, and R. Chapuis, “A General Consistent Decentralized SLAM Solution,” Robotics and Autonomous Systems, no. 74, pp. 128–147, 2015.
[114] S. Huang, Y. Lai, U. Frese, and G. Dissanayake, “How far is SLAM from a linear least squares problem?” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010, pp. 3011–3016.
[115] J. A. Castellanos, J. Neira, and J. D. Tardós, “Limits to the Consistency of EKF-Based SLAM,” in 5th IFAC Symposium on Intelligent Autonomous Vehicles, 2004.
[116] J. A. Castellanos, R. Martinez-Cantin, J. D. Tardós, and J. Neira, “Robocentric Map Joining: Improving the Consistency of EKF-SLAM,” Robotics and Autonomous Systems, vol. 55, no. 1, pp. 21–29, 2007.
[117] J. Civera, O. G. Grasa, A. J. Davison, and J. M. M. Montiel, “1-Point RANSAC for EKF-Based Structure from Motion,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009, pp. 3498–3504.
[118] B. P. Williams and I. D. Reid, “On Combining Visual SLAM and Visual Odometry,” in IEEE International Conference on Robotics and Automation, 2010, pp. 3494–3500.
[119] T.-D. Vu, “Vehicle perception: Localization, mapping with detection, classification and tracking of moving objects,” Ph.D. dissertation, Institut National Polytechnique de Grenoble-INPG, 2009.
[120] C.-C. Wang, C. Thorpe, and A. Suppe, “Ladar-based detection and tracking of moving objects from a ground vehicle at high speeds,” in IEEE Intelligent Vehicles Symposium, 2003, pp. 416–421.
[121] C.-C. Wang, C. Thorpe, and S. Thrun, “Online simultaneous localization and mapping with detection and tracking of moving objects: Theory and results from a ground vehicle in crowded urban areas,” in IEEE International Conference on Robotics and Automation, 2003, pp. 842–849.
[122] S. I. Roumeliotis, G. S. Sukhatme, and G. A. Bekey, “Sensor fault detection and identification in a mobile robot,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 1998, pp. 1383–1388.
[123] P. Goel, G. Dedeoglu, S. I. Roumeliotis, and G. S. Sukhatme, “Fault detection and identification in a mobile robot using multiple model estimation and neural network,” in IEEE International Conference on Robotics and Automation, 2000, pp. 2302–2309.
[124] P. Sundvall and P. Jensfelt, “Fault detection for mobile robots using redundant positioning systems,” in IEEE International Conference on Robotics and Automation, 2006, pp. 3781–3786.
[125] Y. Morales, E. Takeuchi, and T. Tsubouchi, “Vehicle localization in outdoor woodland environments with sensor fault detection,” in IEEE International Conference on Robotics and Automation, 2008, pp. 449–454.
[126] A. Jabbari, R. Jedermann, and W. Lang, “Application of computational intelligence for sensor fault detection and isolation,” World academy of science, engineering and technology, vol. 33, pp. 265–270, 2007.
[127] J. Engel, V. Koltun, and D. Cremers, “Direct Sparse Odometry,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017.
[128] R. C. Luo, C.-C. Yih, and K. L. Su, “Multisensor fusion and integration: approaches, applications, and future research directions,” IEEE Sensors Journal, vol. 2, no. 2, pp. 107–119, 2002.
[129] J. A. Castellanos, J. Neira, and J. D. Tardós, “Multisensor fusion for simultaneous localization and map building,” IEEE Transactions on Robotics and Automation, vol. 17, no. 6, pp. 908–914, 2001.
[130] L. Wei, C. Cappelle, and Y. Ruichek, “Camera/laser/GPS fusion method for vehicle positioning under extended NIS-based sensor validation,” IEEE Transactions on Instrumentation and Measurement, vol. 62, no. 11, pp. 3110–3122, 2013.
[131] B. Williams, M. Cummins, J. Neira, P. Newman, I. Reid, and J. Tardós, “A comparison of loop closing techniques in monocular SLAM,” Robotics and Autonomous Systems, vol. 57, no. 12, pp. 1188–1197, 2009.
[132] E. Eade and T. Drummond, “Unified Loop Closing and Recovery for Real Time Monocular SLAM,” in British Machine Vision Conference, 2008.
[133] L. A. Clemente, A. J. Davison, I. D. Reid, J. Neira, and J. D. Tardós, “Mapping Large Loops with a Single Hand-Held Camera,” in Robotics: Science and Systems, 2007.
[134] B. Williams, M. Cummins, J. Neira, P. Newman, I. Reid, and J. Tardós, “An image-to-map Loop Closing Method for Monocular SLAM,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2008, pp. 2053–2059.
[135] J. Sivic and A. Zisserman, “Efficient Visual Search of Videos Cast as Text Retrieval,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 4, pp. 591–606, 2009.
[136] D. G. Lowe, “Distinctive Image Features from Scale-Invariant Keypoints,” International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004.
[137] M. Cummins and P. Newman, “Appearance-only SLAM at large scale with FAB-MAP 2.0,” The International Journal of Robotics Research, vol. 30, no. 9, pp. 1100–1123, 2011.
[138] A. Kawewong, N. Tongprasit, S. Tangruamsub, and O. Hasegawa, “Online and Incremental Appearance-Based SLAM in Highly Dynamic Environments,” The International Journal of Robotics Research, vol. 30, no. 1, pp. 33–55, 2010.
[139] M. J. Milford and G. F. Wyeth, “SeqSLAM: Visual Route-Based Navigation for Sunny Summer Days and Stormy Winter Nights,” in IEEE International Conference on Robotics and Automation, 2012, pp. 1643–1649.
[140] E. Pepperell, P. Corke, and M. Milford, “Routed roads: Probabilistic vision-based place recognition for changing conditions, split street and varied viewpoints,” The International Journal of Robotics Research, 2016.
[141] T. Naseer, L. Spinello, W. Burgard, and C. Stachniss, “Robust Visual Robot Localization Across Seasons Using Network Flows,” in AAAI Conference on Artificial Intelligence, 2014, pp. 2564–2570.
[142] T. Naseer, M. Ruhnke, C. Stachniss, L. Spinello, and W. Burgard, “Robust Visual SLAM Across Seasons,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2015, pp. 2529–2535.
[143] J. Neira, J. D. Tardós, and J. A. Castellanos, “Linear time vehicle relocation in SLAM,” in IEEE International Conference on Robotics and Automation, vol. 1, 2003, pp. 427–433.
[144] H. Li and F. Nashashibi, “A new method for occupancy grid maps merging: Application to multi-vehicle cooperative local mapping and moving object detection in outdoor environment,” in International Conference on Control, Automation, Robotics and Vision, 2012, pp. 632–637.
[145] ——, “Multi-vehicle cooperative localization using indirect vehicle-to-vehicle relative pose estimation,” in IEEE International Conference on Vehicular Electronics and Safety, 2012, pp. 267–272.
[146] J. Neira and J. D. Tardós, “Data Association in Stochastic Mapping Using the Joint Compatibility Test,” IEEE Transactions on Robotics and Automation, vol. 17, no. 6, pp. 890–897, 2001.
[147] J. Xie, F. Nashashibi, M. Parent, and O. G. Favrot, “A Real-Time Robust Global Localization for Autonomous Mobile Robots in Large Environments,” in International Conference on Control, Automation, Robotics and Vision, 2010, pp. 1397–1402.
[148] P. Newman, G. Sibley, M. Smith, M. Cummins, A. Harrison, C. Mei, I. Posner, R. Shade, D. Schroeter, D. Cole, and I. Reid, “Navigating, Recognising and Describing Urban Spaces With Vision and Laser,” The International Journal of Robotics Research, vol. 28, no. 11-12, pp. 1406–1433, 2009.
[149] P. Besl and N. McKay, “Method for registration of 3-D shapes,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 2, pp. 239–256, 1992.
[150] J. McDonald, M. Kaess, C. Cadena, J. Neira, and J. J. Leonard, “6-DOF Multi-session Visual SLAM using Anchor Nodes,” in European Conference on Mobile Robotics, 2011, pp. 69–76.
[151] A. Martinelli, N. Tomatis, and R. Siegwart, “Some Results on SLAM and the Closing the Loop Problem,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2005, pp. 2917–2922.
[152] D. M. Cole and P. M. Newman, “Using Laser Range Data for 3D SLAM in Outdoor Environments,” in IEEE International Conference on Robotics and Automation, 2006, pp. 1556–1563.
[153] G. Bresson, M.-C. Rahal, D. Gruyer, M. Revilloud, and Z. Alsayed, “A Cooperative Fusion Architecture for Robust Localization: Application to Autonomous Driving,” in IEEE 19th International Conference on Intelligent Transportation Systems, 2016.
[154] H. Johannsson, M. Kaess, M. Fallon, and J. J. Leonard, “Temporally scalable visual SLAM using a reduced pose graph,” in IEEE International Conference on Robotics and Automation, 2013, pp. 54–61.
[155] T. Féraud, P. Checchin, R. Aufrère, and R. Chapuis, “Communicating Vehicles in Convoy and Monocular Vision-based Localization,” in 7th Symposium on Intelligent Autonomous Vehicles, vol. 7, 2010.
[156] J. P. Lewis, “Fast Normalized Cross-Correlation,” in Vision Interface, 1995, pp. 120–123.
[157] E. Menegatti, T. Maeda, and H. Ishiguro, “Image-based memory for robot navigation using properties of omnidirectional images,” Robotics and Autonomous Systems, vol. 47, no. 4, pp. 251–267, 2004.
[158] A. Remazeilles, F. Chaumette, and P. Gros, “Robot motion control from a visual memory,” in IEEE International Conference on Robotics and Automation, 2004, pp. 4695–4700.
[159] M. Jogan and A. Leonardis, “Robust localization using panoramic view-based recognition,” in 15th International Conference on Pattern Recognition, 2000, pp. 136–139.
[160] O. Vysotska and C. Stachniss, “Lazy Data Association for Image Sequences Matching Under Substantial Appearance Change,” IEEE Robotics and Automation Letters, vol. 1, no. 1, pp. 213–220, 2016.
[161] A. Walcott-Bryant, M. Kaess, H. Johannsson, and J. J. Leonard, “Dynamic Pose Graph SLAM: Long-term Mapping in Low Dynamic Environments,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2012, pp. 1871–1878.
[162] K. Konolige and J. Bowman, “Towards lifelong visual maps,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009, pp. 1156–1163.
[163] W. Churchill and P. Newman, “Practice makes perfect? managing and leveraging visual experiences for lifelong navigation,” in IEEE International Conference on Robotics and Automation, 2012, pp. 4525–4532.
[164] C. Linegar, W. Churchill, and P. Newman, “Made to Measure: Bespoke Landmarks for 24-Hour, All-Weather Localisation with a Camera,” in IEEE International Conference on Robotics and Automation, 2016, pp. 787–794.
[165] A. Kendall, M. Grimes, and R. Cipolla, “PoseNet: A convolutional network for real-time 6-DOF camera relocalization,” in IEEE International Conference on Computer Vision, 2015, pp. 2938–2946.
[166] P. Muhlfellner, M. Bürki, M. Bosse, W. Derendarz, R. Philippsen, and P. Furgale, “Summary Maps for Lifelong Visual Localization,” Journal of Field Robotics, vol. 33, no. 5, pp. 561–590, 2015.
[167] H. Grimmett, M. Buerki, L. Paz, P. Pinies, P. Furgale, I. Posner, and P. Newman, “Integrating Metric and Semantic Maps for Vision-Only Automated Parking,” in IEEE International Conference on Robotics and Automation, 2015, pp. 2159–2166.
[168] P. Biber and T. Duckett, “Experimental analysis of sample-based maps for long-term SLAM,” The International Journal of Robotics Research, vol. 28, no. 1, pp. 20–33, 2009.
[169] A. I. Comport, M. Meilland, and P. Rives, “An asymmetric real-time dense visual localisation and mapping system,” in IEEE International Conference on Computer Vision Workshops, 2011, pp. 700–703.
[170] M. Meilland, A. I. Comport, and P. Rives, “Dense visual mapping of large scale environments for real-time localisation,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2011, pp. 4242–4248.
[171] P. Furgale and T. D. Barfoot, “Visual Teach and Repeat for Long-Range Rover Autonomy,” Journal of Field Robotics, vol. 27, no. 5, pp. 537–560, 2010.
[172] E. Royer, M. Lhuillier, M. Dhome, and J.-M. Lavest, “Monocular vision for mobile robot localization and autonomous navigation,” International Journal of Computer Vision, vol. 74, no. 3, pp. 237–260, 2007.
[173] H. Lategahn and C. Stiller, “City GPS using Stereo Vision,” in IEEE International Conference on Vehicular Electronics and Safety, 2012, pp. 1–6.
[174] H. Badino, D. Huber, and T. Kanade, “Real-Time Topometric Localization,” in IEEE International Conference on Robotics and Automation, 2012, pp. 1635–1642.
[175] J. Matas, O. Chum, M. Urban, and T. Pajdla, “Robust wide-baseline stereo from maximally stable extremal regions,” Image and Vision Computing, vol. 22, no. 10, pp. 761–767, 2004.
[176] H. Deusch, J. Wiest, S. Reuter, D. Nuss, M. Fritzsche, and K. Dietmayer, “Multi-Sensor Self-Localization based on Maximally Stable Extremal Regions,” in IEEE Intelligent Vehicles Symposium, 2014, pp. 555–560.
[177] J. Levinson, M. Montemerlo, and S. Thrun, “Map-Based Precision Vehicle Localization in Urban Environments,” in Robotics: Science and Systems, vol. 4, 2007.
[178] J. Levinson and S. Thrun, “Robust Vehicle Localization in Urban Environments Using Probabilistic Maps,” in IEEE International Conference on Robotics and Automation, 2010, pp. 4372–4378.
[179] A. Napier and P. Newman, “Generation and exploitation of local orthographic imagery for road vehicle localisation,” in IEEE Intelligent Vehicles Symposium, 2012, pp. 590–596.
[180] D. Betaille and R. Toledo-Moreo, “Creating enhanced maps for lane-level vehicle navigation,” IEEE Transactions on Intelligent Transportation Systems, vol. 11, no. 4, pp. 786–798, 2010.
[181] R. Toledo-Moreo, D. Betaille, and F. Peyret, “Lane-level integrity provision for navigation and map matching with GNSS, dead reckoning, and enhanced maps,” IEEE Transactions on Intelligent Transportation Systems, vol. 11, no. 1, pp. 100–112, 2010.
[182] P. Czerwionka, M. Wang, and F. Wiesel, “Optimized route network graph as map reference for autonomous cars operating on German autobahn,” in IEEE 5th International Conference on Automation, Robotics and Applications, 2011, pp. 78–83.
[183] A. Joshi and M. R. James, “Generation of accurate lane-level maps from coarse prior maps and lidar,” IEEE Intelligent Transportation Systems Magazine, vol. 7, no. 1, pp. 19–29, 2015.
[184] C. Guo, J.-i. Meguro, Y. Kojima, and T. Naito, “Automatic lane-level map generation for advanced driver assistance systems using low-cost sensors,” in IEEE International Conference on Robotics and Automation, 2014, pp. 3975–3982.
[185] C. Guo, K. Kidono, J. Meguro, Y. Kojima, M. Ogawa, and T. Naito, “A Low-Cost Solution for Automatic Lane-Level Map Generation Using Conventional In-Car Sensors,” IEEE Transactions on Intelligent Transportation Systems, vol. 17, no. 8, pp. 2355–2366, 2016.
[186] P. Bender, J. Ziegler, and C. Stiller, “Lanelets: Efficient map representation for autonomous driving,” in IEEE Intelligent Vehicles Symposium, 2014, pp. 420–425.
[187] M. Schreiber, C. Knoppel, and U. Franke, “LaneLoc: Lane Marking based Localization using Highly Accurate Maps,” in IEEE Intelligent Vehicles Symposium, 2013, pp. 449–454.
[188] I. Miller, M. Campbell, and D. Huttenlocher, “Map-aided localization in sparse global positioning system environments using vision and particle filtering,” Journal of Field Robotics, vol. 28, no. 5, pp. 619–643, 2011.
[189] K. K. Lee, S. Wijesoma, and J. Ibanez-Guzmán, “A constrained SLAM approach to robust and accurate localisation of autonomous ground vehicles,” Robotics and Autonomous Systems, vol. 55, no. 7, pp. 527–540, 2007.
[190] Z. Tao, P. Bonnifait, V. Frémont, and J. Ibanez-Guzmán, “Mapping and localization using GPS, lane markings and proprioceptive sensors,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013, pp. 406–412.
[191] L. Delobel, C. Aynaud, R. Aufrère, R. Chapuis, T. Chateau, and C. Bernay-Angeletti, “Robust localization using a top-down approach with several LIDAR sensors,” in IEEE International Conference on Robotics and Biomimetics, 2015, pp. 2374–2376.
[192] C. Brenner, “Vehicle Localization Using Landmarks Obtained by a LIDAR Mobile Mapping System,” The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 38, pp. 139–144, 2010.
[193] R. Spangenberg, D. Goehring, and R. Rojas, “Pole-based localization for autonomous vehicles in urban scenarios,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2016, pp. 2161–2166.
[194] A. Majdik, Y. Albers-Schoenberg, and D. Scaramuzza, “MAV Urban Localization from Google Street View Data,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013, pp. 3979–3986.
[195] A. R. Zamir and M. Shah, “Accurate image localization based on google maps street view,” in 11th European Conference on Computer Vision, 2010, pp. 255–268.
[196] G. Vaca-Castano, A. R. Zamir, and M. Shah, “City scale geo-spatial trajectory estimation of a moving camera,” in IEEE Conference on Computer Vision and Pattern Recognition, 2012, pp. 1186–1193.
[197] L. Yu, C. Joly, G. Bresson, and F. Moutarde, “Monocular Urban Localization using Street View,” in The 14th International Conference on Control, Automation, Robotics and Vision, 2016.
[198] G. Schindler, M. Brown, and R. Szeliski, “City-scale location recognition,” in IEEE Conference on Computer Vision and Pattern Recognition, 2007, pp. 1–7.
[199] G. Baatz, K. Köser, D. Chen, R. Grzeszczuk, and M. Pollefeys, “Leveraging 3D city models for rotation invariant place-of-interest recognition,” International Journal of Computer Vision, vol. 96, no. 3, pp. 315–334, 2012.
[200] T. Yeh, K. Tollmar, and T. Darrell, “Searching the web with mobile images for location recognition,” in IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, 2004, pp. II–76.
[201] G. Fritz, C. Seifert, M. Kumar, and L. Paletta, “Building detection from mobile imagery using informative SIFT descriptors,” in Image Analysis. Springer, 2005, pp. 629–638.
[202] A. Torii, J. Sivic, and T. Pajdla, “Visual localization by linear combination of image descriptors,” in IEEE International Conference on Computer Vision Workshops, 2011, pp. 102–109.
[203] L. Yu, C. Joly, G. Bresson, and F. Moutarde, “Improving Robustness of Monocular Urban Localization using Augmented Street View,” in IEEE
19th International Conference on Intelligent Transportation Systems, [226] T. A. Vidal-Calleja, C. Berger, J. Solà, and S. Lacroix, “Large Scale
2016. Multiple Robot Visual Mapping with Heterogeneous Landmarks in
[204] W. Zhang and J. Kosecka, “Image based localization in urban envi- Semi-structured Terrain,” Robotics and Autonomous Systems, vol. 59,
ronments,” in Third International Symposium on 3D Data Processing, no. 9, pp. 654–674, 2011.
Visualization, and Transmission, 2006, pp. 33–40. [227] A. Cunningham, M. Paluri, and F. Dellaert, “DDF-SAM: Fully Dis-
[205] P. Agarwal, W. Burgard, and L. Spinello, “Metric Localization using tributed SLAM using Constrained Factor Graphs,” in IEEE/RSJ Inter-
Google Street View,” Computing Research Repository (CoRR), 2015. national Conference on Intelligent Robots and Systems, 2010.
[206] N. Radwan, G. D. Tipaldi, L. Spinello, and W. Burgard, “Do you see [228] R. Aragues, J. Cortes, and C. Sagues, “Dynamic Consensus for
the Bakery? Leveraging Geo-Referenced Texts for Global Localization Merging Visual Maps under Limited Communications,” in IEEE In-
in Public Maps,” in IEEE International Conference on Robotics and ternational Conference on Robotics and Automation, 2010, pp. 3032–
Automation, 2016, pp. 4837–4842. 3037.
[207] X. Qu, B. Soheilian, and N. Paparoditis, “Vehicle localization using mono-camera and geo-referenced traffic signs,” in IEEE Intelligent Vehicles Symposium, 2015, pp. 605–610.
[208] A. Welzel, P. Reisdorf, and G. Wanielik, “Improving Urban Vehicle Localization with Traffic Sign Recognition,” in IEEE International Conference on Intelligent Transportation Systems, 2015, pp. 2728–2732.
[209] P. Merriaux, Y. Dupuis, P. Vasseur, and X. Savatier, “Fast and robust vehicle positioning on graph-based representation of drivable maps,” in IEEE International Conference on Robotics and Automation, 2015, pp. 2787–2793.
[210] O. Pink, F. Moosmann, and A. Bachmann, “Visual features for vehicle localization and ego-motion estimation,” in IEEE Intelligent Vehicles Symposium, 2009, pp. 254–260.
[211] J. W. Fenwick, P. M. Newman, and J. J. Leonard, “Cooperative Concurrent Mapping and Localization,” in IEEE International Conference on Robotics and Automation, vol. 2, 2002, pp. 1810–1817.
[212] T. Tao, Y. Huang, J. Yuan, F. Sun, and X. Wu, “Cooperative Simultaneous Localization and Mapping for Multi-robot: Approach & Experimental Validation,” in 8th World Congress on Intelligent Control and Automation, 2010, pp. 2888–2893.
[213] A. Gil, O. Reinoso, M. Ballesta, and M. Juliá, “Multi-robot visual SLAM using a Rao-Blackwellized particle filter,” Robotics and Autonomous Systems, vol. 58, no. 1, pp. 68–80, 2010.
[214] L. Riazuelo, J. Civera, and J. M. M. Montiel, “C2TAM: A Cloud framework for cooperative tracking and mapping,” Robotics and Autonomous Systems, vol. 62, no. 4, pp. 401–413, 2014.
[215] R. Arumugam, V. R. Enti, L. Bingbing, W. Xiaojun, K. Baskaran, F. F. Kong, A. S. Kumar, K. D. Meng, and G. W. Kit, “DAvinCi: A Cloud Computing Framework for Service Robots,” in IEEE International Conference on Robotics and Automation, 2010, pp. 3084–3089.
[216] G. Mohanarajah, D. Hunziker, R. D’Andrea, and M. Waibel, “Rapyuta: A Cloud Robotics Platform,” IEEE Transactions on Automation Science and Engineering, vol. 12, no. 2, pp. 481–493, 2015.
[217] R. Doriya, P. Sao, V. Payal, V. Anand, and P. Chakraborty, “A Review on Cloud Robotics Based Frameworks to Solve Simultaneous Localization and Mapping (SLAM) Problem,” International Journal of Advances in Computer Science and Cloud Computing, vol. 3, pp. 40–45, 2015.
[218] B. Kehoe, S. Patil, P. Abbeel, and K. Goldberg, “A Survey of Research on Cloud Robotics and Automation,” IEEE Transactions on Automation Science and Engineering, vol. 12, no. 2, pp. 398–409, 2015.
[219] M. Waibel, M. Beetz, J. Civera, R. D’Andrea, J. Elfring, D. Galvez-Lopez, K. Haussermann, R. Janssen, J. Montiel, A. Perzylo, B. Schieble, M. Tenorth, O. Zweigle, and R. van de Molengraft, “RoboEarth,” IEEE Robotics and Automation Magazine, vol. 18, no. 2, pp. 69–82, 2011.
[220] E. Nettleton, S. Thrun, H. Durrant-Whyte, and S. Sukkarieh, “Decentralised SLAM with Low-Bandwidth Communication for Teams of Vehicles,” in Field and Service Robotics, 2006, pp. 179–188.
[221] E. Nettleton, H. Durrant-Whyte, and S. Sukkarieh, “A Robust Architecture for Decentralised Data Fusion,” in International Conference on Advanced Robotics, 2003.
[222] H. Durrant-Whyte, “A Beginner’s Guide to Decentralised Data Fusion,” Australian Centre for Field Robotics, The University of Sydney, Australia, Tech. Rep., 2000.
[223] S. P. McLaughlin, R. J. Evans, and B. Krishnamurthy, “Data Incest Removal in Survivable Estimation Fusion Architecture,” in International Conference on Information Fusion, vol. 1, 2003, pp. 229–236.
[224] M. Hua, T. Bailey, P. Thompson, and H. Durrant-Whyte, “Decentralised Solutions to the Cooperative Multi-Platform Navigation Problem,” IEEE Transactions on Aerospace and Electronic Systems, 2010.
[225] S. B. Williams, G. Dissanayake, and H. Durrant-Whyte, “Towards Multi-Vehicle Simultaneous Localisation and Mapping,” in IEEE International Conference on Robotics and Automation, 2002, pp. 2743–2748.
[229] R. Aragues, E. Montijano, and C. Sagues, “Consistent Data Association in Multi-Robot Systems with Limited Communications,” in Robotics: Science and Systems, 2010, pp. 51–58.
[230] H. Li and F. Nashashibi, “Cooperative multi-vehicle localization using split covariance intersection filter,” IEEE Intelligent Transportation Systems Magazine, vol. 5, no. 2, pp. 33–44, 2013.
[231] S. Julier and J. Uhlmann, General Decentralized Data Fusion with Covariance Intersection, D. Hall and J. Llinas, Eds. CRC Press, 2001, chapter 12.
[232] W. Burgard, M. Moors, C. Stachniss, and F. Schneider, “Coordinated Multi-Robot Exploration,” IEEE Transactions on Robotics, vol. 21, no. 3, pp. 376–386, 2005.
[233] X. S. Zhou and S. I. Roumeliotis, “Multi-robot SLAM with Unknown Initial Correspondence: The Robot Rendezvous Case,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2006, pp. 1785–1792.
[234] S. Thrun and Y. Liu, “Multi-Robot SLAM With Sparse Extended Information Filters,” in 11th International Symposium of Robotics Research, 2003, pp. 254–266.
[235] A. Cunningham and F. Dellaert, “Large Scale Experimental Design for Decentralized SLAM,” in Unmanned Systems Technology XIV, 2012.
[236] V. Indelman, E. Nelson, N. Michael, and F. Dellaert, “Multi-Robot Pose Graph Localization and Data Association from Unknown Initial Relative Poses via Expectation Maximization,” in IEEE International Conference on Robotics and Automation, 2014, pp. 593–600.
[237] M. Pfingsthorn, B. Slamet, and A. Visser, “A Scalable Hybrid Multi-Robot SLAM Method for Highly Detailed Maps,” in RoboCup 2007: Robot Soccer World Cup XI, 2007, pp. 457–464.
[238] H. J. Chang, C. S. G. Lee, Y. C. Hu, and Y.-H. Lu, “Multi-Robot SLAM with Topological/Metric Maps,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2007, pp. 1467–1472.
[239] A. Martin and M. R. Emami, “Just-in-time Cooperative Simultaneous Localization and Mapping,” in International Conference on Control, Automation, Robotics and Vision, 2010, pp. 479–484.
[240] J. V. Diosdado and I. T. Ruiz, “Decentralised Simultaneous Localisation and Mapping for AUVs,” in 2nd SEAS DTC Technical Conference, 2007, p. A14.
[241] R. H. Deaves, D. Nicholson, D. W. Gough, L. A. Binns, P. Vangasse, and P. Greenway, “Multiple Robot System for Decentralized SLAM Investigations,” in Sensor Fusion and Decentralized Control in Robotic Systems III, vol. 4196, 2000, pp. 360–369.
[242] H. S. Lee and K. M. Lee, “Multi-Robot SLAM Using Ceiling Vision,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009, pp. 912–917.
[243] E. D. Dickmanns, B. Mysliwetz, and T. Christians, “An Integrated Spatio-Temporal Approach to Automatic Visual Guidance of Autonomous Vehicles,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 20, no. 6, pp. 1273–1284, 1990.
[244] D. Pomerleau and T. Jochem, “Rapidly Adapting Machine Vision for Automated Vehicle Steering,” IEEE Expert, vol. 11, no. 2, pp. 19–27, 1996.
[245] G. Seetharaman, A. Lakhotia, and E. P. Blasch, “Unmanned Vehicles Come of Age: The DARPA Grand Challenge,” Computer, vol. 39, no. 12, pp. 26–29, 2006.
[246] L. B. Cremean, T. B. Foote, J. H. Gillula, G. H. Hines, D. Kogan, K. L. Kriechbaum, J. C. Lamb, J. Leibs, L. Lindzey, C. E. Rasmussen, A. D. Stewart, J. W. Burdick, and R. M. Murray, “Alice: An Information-Rich Autonomous Vehicle for High-Speed Desert Navigation,” Journal of Field Robotics, vol. 23, no. 9, pp. 777–810, 2006.
[247] S. Thrun, M. Montemerlo, H. Dahlkamp, D. Stavens, A. Aron, J. Diebel, P. Fong, J. Gale, M. Halpenny, G. Hoffmann, K. Lau, C. Oakley, M. Palatucci, V. Pratt, P. Stang, S. Strohband, C. Dupont, L.-E. Jendrossek, C. Koelen, C. Markey, C. Rummel, J. van Niekerk, E. Jensen, P. Alessandrini, G. Bradski, B. Davies, S. Ettinger, A. Kaehler, A. Nefian, and P. Mahoney, “Stanley: The Robot that Won
the DARPA Grand Challenge,” Journal of Field Robotics, vol. 23, no. 9, pp. 661–692, 2006.
[248] C. Urmson, C. Ragusa, D. Ray, J. Anhalt, D. Bartz, T. Galatali, A. Gutierrez, J. Johnston, S. Harbaugh, H. Y. Kato, W. Messner, N. Miller, K. Peterson, B. Smith, J. Snider, S. Spiker, J. Z. W. R. Whittaker, M. Clark, P. Koon, A. Mosher, and J. Struble, “A Robust Approach to High-Speed Navigation for Unrehearsed Desert Terrain,” Journal of Field Robotics, vol. 23, no. 8, pp. 467–508, 2006.
[249] M. Buehler, K. Iagnemma, and S. Singh, The DARPA Urban Challenge: Autonomous Vehicles in City Traffic. Springer, 2009, vol. 56.
[250] M. Montemerlo, J. Becker, S. Bhat, H. Dahlkamp, D. Dolgov, S. Ettinger, D. Haehnel, T. Hilden, G. Hoffmann, B. Huhnke, D. Johnston, S. Klumpp, D. Langer, A. Levandowski, J. Levinson, J. Marcil, D. Orenstein, J. Paefgen, I. Penny, A. Petrovskaya, M. Pflueger, G. Stanek, D. Stavens, A. Vogt, and S. Thrun, “Junior: The Stanford Entry in the Urban Challenge,” Journal of Field Robotics, vol. 25, no. 9, pp. 569–597, 2008.
[251] F. W. Rauskolb, K. Berger, C. Lipski, M. Magnor, K. Cornelsen, J. Effertz, T. Form, F. Graefe, S. Ohl, W. Schumacher, J.-M. Wille, P. Hecker, T. Nothdurft, M. Doering, K. Homeier, J. Morgenroth, L. Wolf, C. Basarke, C. Berger, T. Gulke, F. Klose, and B. Rumpe, “Caroline: An Autonomously Driving Vehicle for Urban Environments,” Journal of Field Robotics, vol. 25, no. 9, pp. 674–724, 2008.
[252] C. Urmson, J. Anhalt, D. Bagnell, C. Baker, R. Bittner, J. D. M. N. Clark, D. Duggins, T. Galatali, C. Geyer, M. Gittleman, S. Harbaugh, M. Hebert, T. M. Howard, S. Kolski, A. Kelly, M. Likhachev, M. McNaughton, N. Miller, K. Peterson, B. Pilnick, R. Rajkumar, P. Rybski, B. Salesky, Y.-W. Seo, S. Singh, J. Snider, A. Stentz, W. R. Whittaker, Z. Wolkowicki, J. Ziglar, H. Bae, T. Brown, D. Demitrish, B. Litkouhi, J. Nickolaou, V. Sadekar, W. Zhang, J. Struble, M. Taylor, M. Darms, and D. Ferguson, “Autonomous Driving in Urban Environments: Boss and the Urban Challenge,” Journal of Field Robotics, vol. 25, no. 8, pp. 425–466, 2008.
[253] J. Ploeg, S. Shladover, H. Nijmeijer, and N. van de Wouw, “Introduction to the Special Issue on the 2011 Grand Cooperative Driving Challenge,” IEEE Transactions on Intelligent Transportation Systems, vol. 13, no. 3, pp. 989–993, 2012.
[254] A. Geiger, M. Lauer, F. Moosmann, B. Ranft, H. Rapp, C. Stiller, and J. Ziegler, “Team AnnieWAY’s entry to the Grand Cooperative Driving Challenge 2011,” IEEE Transactions on Intelligent Transportation Systems, vol. 13, no. 3, pp. 1008–1017, 2012.
[255] M. Bertozzi, L. Bombini, A. Broggi, M. Buzzoni, E. Cardarelli, S. Cattani, P. Cerri, A. Coati, S. Debattisti, A. Falzoni, R. I. Fedriga, M. Felisa, L. Gatti, A. Giacomazzo, P. Grisleri, M. C. Laghi, L. Mazzei, P. Medici, M. Panciroli, P. P. Porta, P. Zani, and P. Versari, “VIAC: an Out of Ordinary Experiment,” in IEEE Intelligent Vehicles Symposium, 2011, pp. 175–180.
[256] A. Broggi, M. Buzzoni, S. Debattisti, P. Grisleri, M. C. Laghi, P. Medici, and P. Versari, “Extensive Tests of Autonomous Driving Technologies,” IEEE Transactions on Intelligent Transportation Systems, vol. 14, no. 3, pp. 1403–1415, 2013.
[257] F. Saust, J. M. Wille, B. Lichte, and M. Maurer, “Autonomous Vehicle Guidance on Braunschweig’s Inner Ring Road within the Stadtpilot Project,” in IEEE Intelligent Vehicles Symposium, 2011, pp. 169–174.
[258] A. Broggi, P. Cerri, S. Debattisti, M. C. Laghi, P. Medici, M. Panciroli, and A. Prioletti, “PROUD-Public ROad Urban Driverless test: Architecture and Results,” in IEEE Intelligent Vehicles Symposium, 2014, pp. 648–654.
[259] P. Muehlfellner, P. Furgale, W. Derendarz, and R. Philippsen, “Evaluation of Fisheye-Camera Based Visual Multi-Session Localization in a Real-World Scenario,” in IEEE Intelligent Vehicles Symposium, 2013, pp. 57–62.
[260] P. Furgale, U. Schwesinger, M. Rufli, W. Derendarz, H. Grimmett, P. Muhlfellner, S. Wonneberger, J. Timpner, S. Rottmann, B. Li, B. Schmidt, T. N. Nguyen, E. Cardarelli, S. Cattani, S. Bruning, S. Horstmann, M. Stellmacher, H. Mielenz, K. Koser, M. Beermann, C. Hane, L. Heng, G. H. Lee, F. Fraundorfer, R. Iser, R. Triebel, I. Posner, P. Newman, L. Wolf, M. Pollefeys, S. Brosig, J. Effertz, C. Pradalier, and R. Siegwart, “Toward automated driving in cities using close-to-market sensors: An overview of the V-Charge Project,” in IEEE Intelligent Vehicles Symposium, 2013, pp. 809–816.
[261] R. Kummerle, M. Ruhnke, B. Steder, C. Stachniss, and W. Burgard, “Autonomous Robot Navigation in Highly Populated Pedestrian Zones,” Journal of Field Robotics, vol. 32, no. 4, pp. 565–589, 2015.
[262] Y. Morales, A. Carballo, E. Takeuchi, A. Aburadani, and T. Tsubouchi, “Autonomous robot navigation in outdoor pedestrian walkways,” Journal of Field Robotics, vol. 26, no. 8, pp. 609–635, 2009.
[263] E. Trulls, A. Corominas Murtra, J. Perez-Ibarz, G. Ferrer, D. Vasquez, J. M. Mirats-Tur, and A. Sanfeliu, “Autonomous Navigation for Mobile Service Robots in Urban Pedestrian Environments,” Journal of Field Robotics, vol. 28, no. 3, pp. 329–354, 2011.
[264] E. Royer, F. Marmoiton, S. Alizon, D. Ramadasan, M. Slade, A. Nizard, M. Dhome, B. Thuilot, and F. Bonjean, “Lessons learned after more than 1000 km in an autonomous shuttle guided by vision,” in IEEE International Conference on Intelligent Transportation Systems, 2016, pp. 2248–2253.
[265] C. project, “Citymobil2 experience and recommendations,” 2016.
[266] J. Levinson, J. Askeland, J. Becker, J. Dolson, D. Held, S. Kammel, J. Z. Kolter, D. Langer, O. Pink, V. Pratt, M. Sokolsky, G. Stanek, D. Stavens, A. Teichman, M. Werling, and S. Thrun, “Towards Fully Autonomous Driving: Systems and Algorithms,” in IEEE Intelligent Vehicles Symposium, 2011, pp. 163–168.
[267] F. Kunz, D. Nuss, J. Wiest, H. Deusch, S. Reuter, F. Gritschneder, A. Scheel, M. Stubler, M. Bach, P. Hatzelmann, C. Wild, and K. Dietmayer, “Autonomous Driving at Ulm University: A Modular, Robust, and Sensor-Independent Fusion Approach,” in IEEE Intelligent Vehicles Symposium, 2015, pp. 666–673.
[268] M. Aeberhard, S. Rauch, M. Bahram, G. Tanzmeister, J. Thomas, Y. Pilat, F. Homm, W. Huber, and N. Kaempchen, “Experience, Results and Lessons Learned from Automated Driving on Germany’s Highways,” IEEE Intelligent Transportation Systems Magazine, vol. 7, no. 1, pp. 42–57, 2015.
[269] DMV, “Autonomous vehicle disengagement reports 2015,” 2015. [Online]. Available: https://www.dmv.ca.gov/portal/wcm/connect/dff67186-70dd-4042-bc8c-d7b2a9904665/GoogleDisengagementReport2014-15.pdf?MOD=AJPERES
[270] DMV, “Autonomous vehicle disengagement reports 2016,” 2016. [Online]. Available: https://www.dmv.ca.gov/portal/wcm/connect/946b3502-c959-4e3b-b119-91319c27788f/GoogleAutoWaymo_disengage_report_2016.pdf?MOD=AJPERES
[271] S. Coast, “Tesla maps and the exploding future of map data,” October 2015. [Online]. Available: http://stevecoast.com/2015/10/15/tesla-maps-and-the-exploding-future-of-map-data/
[272] ETSI, “102 863 V1.1.1 Local Dynamic Map (LDM)-Rationale for and guidance on standardization,” ETSI, Tech. Rep., 2011. [Online]. Available: http://www.etsi.org/deliver/etsi_tr/102800_102899/102863/01.01.01_60/tr_102863v010101p.pdf
[273] R. W. Wolcott and R. M. Eustice, “Visual Localization within LIDAR Maps for Automated Urban Driving,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2014, pp. 176–183.
[274] R. Chatila and J.-P. Laumond, “Position referencing and consistent world modeling for mobile robots,” in IEEE International Conference on Robotics and Automation, 1985, pp. 138–145.
[275] A. Nüchter, O. Wulf, K. Lingemann, J. Hertzberg, B. Wagner, and H. Surmann, “3D mapping with semantic knowledge,” in RoboCup 2005: Robot Soccer World Cup IX. Springer, 2006, pp. 335–346.
[276] A. Nüchter and J. Hertzberg, “Towards semantic maps for mobile robots,” Robotics and Autonomous Systems, vol. 56, no. 11, pp. 915–926, 2008.
[277] H. Zender, O. Martínez Mozos, P. Jensfelt, G.-J. Kruijff, and W. Burgard, “Conceptual spatial representations for indoor mobile robots,” Robotics and Autonomous Systems, vol. 56, no. 6, pp. 493–502, 2008.
[278] J. Civera, D. Gálvez-López, L. Riazuelo, J. D. Tardós, and J. M. M. Montiel, “Towards Semantic SLAM using a Monocular Camera,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2011, pp. 1277–1284.
[279] R. F. Salas-Moreno, R. A. Newcombe, H. Strasdat, P. H. J. Kelly, and A. J. Davison, “SLAM++: Simultaneous Localisation and Mapping at the Level of Objects,” in IEEE Conference on Computer Vision and Pattern Recognition, 2013, pp. 1352–1359.
[280] F. Bernuy and J. R. del Solar, “Semantic Mapping of Large-Scale Outdoor Scenes for Autonomous Off-Road Driving,” in IEEE International Conference on Computer Vision Workshop, 2015, pp. 124–130.
[281] V. Badrinarayanan, A. Kendall, and R. Cipolla, “SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation,” CoRR, 2015.
[282] M. Bojarski, D. D. Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Muller, J. Zhang, X. Zhang, J. Zhao, and K. Zieba, “End to End Learning for Self-Driving Cars,” CoRR, 2016.
[283] ERTRAC, “Automated driving roadmap,” ERTRAC - Task Force Connectivity and Automated Driving, Tech. Rep., 07 2015.