SLAM Integrated Mobile Mapping System in Complex Urban Environments
SLAM Integrated Mobile Mapping System in Complex Urban Environments
Keywords: In this paper, a SLAM-integrated MMS with the capability of efficient, consistent and robust mapping in complex
Mobile mapping environments where GNSS signal is intermittently lost is presented. The system is developed to take the feedback
SLAM from mapping module to the localization module into account, which significantly improves the smoothness of
Factor graph optimization the 6 DOF trajectory estimation. In particular, the complex mapping problem is modeled as a hierarchical factor
Seamless positioning
graph to improve the computation efficiency of the optimization back-end. Moreover, GNSS signal is modeled as
Data fusion
a plug-and-play constrained factor to facilitate the mapping process in complex urban environments. We eval-
uate the performance of the proposed method on KITTI benchmark, and compare the result with both the GNNS/
INS system and SuMa, which represents the traditional MMS and the typical laser SLAM system respectively. The
results confirm that the proposed method enables normal and accurate mapping even when the inavailability of
GNSS is up to 60%. It outperforms the traditional MMS in terms of map consistency with an increase of 24.7% on
average MME, and outperforms laser SLAM in terms of the absolute accuracy with 5.9 times increase on average
RMSE.
⁎
Corresponding author at: Department of Geospatial Information, PLA Information Engineering University, 450001 Zhengzhou, China.
E-mail address: [email protected] (S. Li).
https://doi.org/10.1016/j.isprsjprs.2020.05.012
Received 22 February 2020; Received in revised form 8 May 2020; Accepted 15 May 2020
Available online 29 June 2020
0924-2716/ © 2020 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.
S. Li, et al. ISPRS Journal of Photogrammetry and Remote Sensing 166 (2020) 316–332
Odometry) (Yang et al., 2018), but neither of them validated the fea- consistent map of this environment while simultaneously determining
sibility in complex urban environments where GNSS outage. Qu et al. its location within this map. (Leonard and Durrant-Whyte, 2002). SLAM
(2018) integrated SLAM with geo-referenced landmarks to reduce the has been formulated as a theoretical problem that is a joint conditional
drift. However, the sparse feature map cannot be used for navigation probability distribution of robot poses and observed landmarks given
and perception, thus the mapping module was still required to re- all measurement noise models (Thrun et al., 2006). The generality of
construct a dense scene. probabilistic formulation of SLAM essentially makes it become a multi-
In this paper, we establish a SLAM-integrated MMS with the cap- source data fusion technique. From the perspective of probability, the
ability of efficient, consistent and robust mapping in complex en- estimation of robot poses can be transformed as maximizing the con-
vironments where GNSS signal is intermittently lost. We first derived ditional distribution of robot pose, given the noise model of all on-
the probabilistic model for SLAM-integrated MMS, which abstractly board sensors. In practice, the solver of MAP (Maximum A Posteriori)
proved the advances of adding the feedback from the mapping module could be classified into two categories(Durrant-Whyte and Bailey, 2006;
to the localization module. Then we describe details of the hierarchical Bailey and Durrant-Whyte, 2006), (1) filtering methods, e.g. Extended
factor graph optimization structure, which is the core of the proposed Kalman Filter (EKF), and Particle Filter (PF), (2) and smoothing
method. In particular, point cloud registration is performed on GPU methods. With the increasing improvement of computing power and
(Graphics Processing Unit) using surfel-based approach to speed up. the exploiting of matrix sparsity, smoothing approaches become the
The hierarchical optimization structure helps to construct a essential main stream in terms of solving SLAM problem (Strasdat et al., 2012).
factor graph, which dumps all redundant factors to keep the global Full smoothing is a brutal optimization strategy that optimizes entire
graph concise and tight. Finally, the iSAM (incremental smoothing and history of states by solving a large nonlinear optimization problem
mapping) algorithm is employed off-line to efficiently solve the global (Thrun and Montemerlo, 2006). Although it could achieve higher ac-
optimization. curacy, its complexity is also the highest. Therefore, frames are always
discarded unless it is selected as the reference frame (Konolige and
2. Related work Agrawal, 2008). Fixed-lag smoothing approach is a compromise of
filter approaches and full smoothing approaches. It only optimizes
2.1. Traditional MMS states falling within a given time window, while marginalize out old
states falling out of the window by Schur Complement (SC) algorithm
The crux of the construction of HD maps is the accurate 6 DOF (Konolige et al., 2010). The accuracy of fixed-lag smoothing approaches
(Degree of Freedom) pose estimation of LiDAR, including 3D position is higher than that of filter approaches, since they re-linearize past
and 3D rotation. In most cases, the POS (Position and Orientation measurements. The disadvantage is that the marginalization of old
measurement System) outputs high accuracy 6 DOF attitude and posi- states destructs the sparsity of hessian matrix, which decreases the ef-
tion estimation in the world frame, e.g. WGS-84 by combining INS and ficiency of optimization. iSAM (Incremental Smoothing and Map-
GNSS measurements into a fusion filter. It is the most common used ping) (Kaess et al., 2008; Kaess et al., 2012) is the state-of-art optimi-
localization technology in MMS. Meanwhile, driven by the LiDAR zation approach which expresses constraints between states as
technique which is well-known by its capability of dense, accurate and constrained factors, and only optimizes a small subset of variables
convenient surrounding points acquisition, a typical raw data collection identified as affected nodes by the new measurement. Factor graph
system for HD maps is usually a car equipped with several 3D LiDARs supports incremental optimization in Plug-and-Play (PNP) mode, i.e.
and POS (Takai et al., 2013). Many companies, such as TOPCON, the solver makes the most use of all measurements it received, re-
TRIMBLE and REGEL have established their commercial MMS produc- gardless of whether some of on-board sensors are disabled, which is
tions. A general review of the different solutions available on the well-suited for multi-source data fusion. The great superiority of iSAM
market and a comparison of their technical specifications can be found attracts researchers’ attentions, and SLAM is one of the most-used areas.
in Puente et al. (2013). Toschi et al. (2015) presented a methodology to Many studies, including VO and SLAM, employ iSAM in their systems as
evaluate the precision and the accuracy of MMS map with advanced the data fusion processor or optimization back-end, e.g., Zhang et al.
statistical methods. However, these evaluations are incomplete, since (2017),Forster et al. (2016), Shan and Englot (2018),Shuai-Xin et al.
the localization module and mapping module were evaluated in- (2020), Behley et al. (2018),Yang et al. (2018).
dividually on the condition that the observation environments were
good, while the performance of mapping in GNSS-denied area was not 2.2.2. LiDAR-based SLAM
mentioned. Due to the localization component heavily reply on GNSS, Although many SLAM approaches utilize camera as the environ-
inaccurate GNSS measurements in the urban canyon can lead to error in mental perception sensor for its cheap price and rich texture informa-
the trajectory estimation (Gressin et al., 2012). Many works try to solve tion in the image, LiDAR is an indispensable sensor for large-scale
the absent of GNSS by the automatic registration of point cloud map mapping tasks. Moosmann and Stiller (2011) proposed Velodyne SLAM
from multiple passes (Ding et al., 2007; Levinson et al., 2007; Takai which carried out Iterative Closest Point (ICP) algorithm to align re-
et al., 2013). However, the approaches requires data from multiple ceived laser scan with the accumulated map to estimate the pose of
passes of the same scene, which cannot always be fulfilled. Moreover, LiDAR. However, the classical point-based ICP registration is a free-
these methods even require manual adjustment to achieve the desired form correspondence algorithm. On the one hand, it costs too much on
accuracy. Hussnain et al. (2018) extracted the known landmarks in the closest points searching. On the other hand, the algorithm may get
environment as control points. Then the INS is utilized to provide po- stuck into the local optimum, and obtain a false positive transformation
sitioning information, while utilized the geo-referenced control points estimation when a bad initial is provided. The novel LiDAR Odometry
to mitigate the drift. Accordingly, lots of surveying and mapping works and Mapping (LOAM) approach proposed by Zhang and Singh (2017) is
need to be done in advance, which goes against the origin intention of the state-of-art solution. It divided complex SLAM prblem into the
mobile mapping for unknown environments. odometry and the mapping thread respectively, and performed feature-
based scan-to-scan registration on odometry thread at high frequency,
2.2. Maximum a posteriori estimation and LiDAR-based SLAM while preformed feature-based scan-to-map registration on the parallel
mapping thread at lower frequency to keep the real-time performance.
2.2.1. Maximum a posteriori estimation On the downside, LOAM lacks an optimization back-end, which makes
SLAM problem was first proposed and studied in 1980s to ask if it is error accumulation inevitable as the trajectory grows over time. Shan
possible for a mobile robot to be placed at an unknown location in an and Englot (2018) added an optimization back-end for LOAM to in-
unknown environment and for the robot to incrementally build a crease the accuracy and enable further refinement when loop closure is
317
S. Li, et al. ISPRS Journal of Photogrammetry and Remote Sensing 166 (2020) 316–332
detected. However, the LiDAR-only approach suffers from motion de- control points, Qu et al. (2018) utilizes surveyed road signs, while
generation in structureless environment or under rapid rotation. In- Jende et al. (2018) utilizes the control points by the triangulation of
spired by VIO (Visual Inertial Odometry), the IMU-LiDAR tightly cou- correspondences on aerial images. Moreover, all of these methods are
pled approach inertial-LOAM was proposed by Shuai-Xin et al. (2020), proposed aiming to the implementation for GNSS-denied environments
and the evaluation showed that it was more robust under rapid without considering their flexibility to be converted into a practicable
movement. SuMa (Surfel Mapping) is a surfel-based SLAM approach implementation in GNSS-affluent area.
proposed by Behley et al. (2018), in which a global surfel-based map When the availability of GNSS is sporadic, the multi-sensor fusion
was constructed and maintained. The surfel-based ICP algorithm is a algorithm has to be improved to support the seamless switch between
variation of classical ICP. It project points as the depth image, so that GNSS-denied area and GNSS-affluent area. It has been extensively stu-
correspondences can be directly found by indexing the same pixel in a died during the past several years, and some of them are related to our
pair of successive depth images, which avoids the costly searching and work. To adapt the shift of availability of GNSS measurements,
increase the convergency of the estimation iteration. The changes of Shamseldin et al. (2018) propose a SLAM-based Pseudo-GNSS/INS
poses were estimated by exploiting the projective data associations system for the localization component in MMS, which virtually mimics
between current laser scan and a rendered model view from the global GNSS/INS system using SLAM estimation result. The key of this is
point cloud map. To improve the registration accuracy in dynamic transforming the pose estimation by SLAM to Pseudo-GNSS signals. To
environments, Chen et al. (2019) incorporated the deep learning improve the efficiency of the mapping optimization, Elbahnasawy et al.
method RangeNet++ (Milioto et al., 2019) with SuMa to remove the (2018) presents a hierarchical filter along with smoothers. The first
moving objects in the point cloud and build a semantic map. The stage is a tightly coupled Kalman Filter (KF) followed by a smoother
method is fast and able to work in city-scale environments, which is using GNSS/INS measurements as inputs. The second stage is a loosely
quite suitable to be a substitution of POS in harsh environments. coupled KF utilizing the output from the first stage and a vision-based
trajectory to aid trajectory refinement during GNSS outages. Moreover,
Roh et al. (2016) uses both GNSS/INS and odometry to navigate the
2.3. SLAM integrated with MMS mapping car, and models the structure lines, e.g., building facade and
road curb, leveraging publicly available digital maps. Then the line
With the development of SLAM techniques, many studies have been points from the 2D on-board LiDAR is aligned with the constructed
done focusing on integrating SLAM with MMS to solve the positioning model to build the spatial constraint, and use a graph-based back-end to
problem during GNSS outages. Directly substitute the localization correct the navigation error. Similarly, in Chang et al. (2019), the na-
component by SLAM technique in GNSS-denied area to provide stable vigation results are obtained by the information fusion of the GNSS
pose estimations for the mapping module is intuitive. Many excellent position, IMU preintegration and the relative pose from the LiDAR
works have been done following the idea to extend MMSs availability in odometry. The fixed lag smoothing is adopted to ensure that the com-
indoor environments, e.g. Wen et al. (2018),Markom et al. (2016). In- putational load of the graph optimization does not increase with time.
door environments are always well-structured, i.e. indoor free space is However, although these methods are designed for complex urban en-
flat and full of lines and surfaces (Shamseldin et al., 2018), which en- vironments, the validations of them are incomplete, and the perfor-
ables 2D SLAM solutions such as Gmapping (Grisetti et al., 2007) and mance when the availability of GNSS is sporadic is not evaluated
Cartographer (Hess et al., 2016) to provide accurate and reliable 3 DOF quantitatively.
pose and could be augmented to 6 DOF by integrating with an IMU. In
forests and caves, where structural features are insufficient, other in-
dependent navigation device, e.g. IMU, is necessary. Pierzchała et al. 2.4. Our strategy
(2018) and Zlot and Bosse (2014) use IMU to provide a transformation
prior for the point cloud registration. Then the 3D map was generated In this paper, to extend the availability of MMS to complex city area
solely from laser scans, first by relying on laser odometry and then by (GNSS signals may be blocked some time) and increase the consistency
refining it with robust graph optimization after loop closures, which is of HD maps, we present a SLAM integrated MMS capable of auto-
the core of the algorithm. However, these methods cannot construct a matically and efficiently constructing globally consistent and accurate
HD map in the geo-referenced frame as SLAM problem is defined in a HD maps in complex city environments (see Fig. 1). A hierarchical
local frame which is the first frame of the on-board sensor. To transform factor graph based optimization scheme is proposed to fuse measure-
the map result into the geo-referenced frame, other control landmarks ments from a GNSS receiver, an IMU and a 3D LiDAR, which adapts to
should be incorporated. For instance, Zlot and Bosse (2014) utilizes the fact that different on-board sensors own different sampling
318
S. Li, et al. ISPRS Journal of Photogrammetry and Remote Sensing 166 (2020) 316–332
frequencies, and construct an essential global factor graph. We select 3. Notations and preliminaries
SuMa (Behley et al., 2018) as the to be integrated LiDAR SLAM ap-
proach, and based on which IMU (Inertial Measurement Unit) is cou- 3.1. Notations
pled in GNSS-denied environments. Besides, we use iSAM (Kaess et al.,
2008; Kaess et al., 2012) as the optimization solver, since it makes a In this section, we define notations and frames used in the paper. We
great improvements in efficiency. In the end, we summarize the main consider w as the world frame, whose origin is defined as the start
contributions of this paper as follow: position of the INS/GNSS combined navigation system POS, and z axis
is aligned with the direction of gravity w g . b denotes the body frame
• First, a novel localization module taking the feedback from mapping representing the frame of the mapping vehicle, whose origin is as the
module into account is proposed to improve the smoothness of same as that of POS, and is rigid attached with the vehicle. l denotes the
trajectory and the consistency of the map. LiDAR frame, and Tlb represents a transformation that transforms a
• Second, we model the complex mapping problem as a hierarchical point in l into b. The direction of b and l,as well as their relative
transformation could be seen in Fig. 1. A rotation matrix R representing
optimization problem, which can be efficiently and automatically
solved by the hierarchical factor graph. a 3D rotation is described as the Special Orthogonal Group:
• Third, the GNSS signal is modeled as a plug-and-play constraint to
SO (3) {R 3 × 3: RT R = I , det(R) = 1}, (1)
facilitate the mapping process in the complex environments.
• Finally, extensive experiments are conducted using open source which forms a smooth manifold. The tangent space to the manifold is
dataset to demonstrate the accuracy, consistency and robustness of denoted as Lie algebra:
our approach in complex environments, and compare the localiza-
tion and mapping results with those from traditional MMS and so (3) {Log(R) = : 3 × 1},
(2)
SLAM approach.
which coincides with the space of 3 × 3 skew symmetric matrix. In the
The remainder of this paper is organized as follows: We discuss paper, for transformation matrix, the right superscript denotes the
relevant literature in Section 2 and give an overview of the system pi- target frame, while the right subscript denotes the source, i.e.
peline in Section 3. In Section 4, we derive the probabilistic model of Tbw = {Rbw , tbw } is the transformation from the body frame to the world
the SLAM integrated approach and compare it with the traditional MMS frame. For point, the left subscript denotes the reference frame in which
to illustrate the reasonability. The proposed hierarchical optimization the translation is projected, i.e. w p is a point in the world frame. The
structure for SLAM integrated MMS is presented in Section 5. The pose of the POS and LiDAR is described by the transformation
Processing steps for raw data and local factor graph based incremental Tbw = {Rbw , tbw } and Tlw = {Rlw , tlw } respectively, which maps a point from
smoothing at the first level of optimization are presented in Section 5.1, the sensor frame to the world frame. Similarly, transformation is de-
while global factor graph based incremental smoothing at the second scribed as the Special Euclidean Group:
level of optimization is presented in Section 5.2. Detailed experiments
and experimental results are shown in Section 6. Finally, the paper is
concluded with a discussion and future research direction in Section 7.
SE (3) { T= R t
0 1
4 × 4: R SO (3), t }
3 .
(3)
319
S. Li, et al. ISPRS Journal of Photogrammetry and Remote Sensing 166 (2020) 316–332
Fig. 3. A general example of MMS. Light green circle: states variable; Light
4. Probabilistic model for MMS
yellow circle: control data; Light blue circle: environmental observations; Light
pink circle: map variable. (For interpretation of the references to colour in this
figure legend, the reader is referred to the web version of this article.)
Consider a general case: a MMS vehicle moves through an en-
vironment, taking observations by POS to locate itself, while collecting
point cloud of the environment using an on-board LiDAR to build the
se (3) = Log(T ) = : , 3×1 . traversed environmental map, as shown in Fig. 3. At a time instant k,
(4) state of MMS vehicle is Xk . A GNSS/INS measurement Uk denoting the
k is defined as the sampling time instant, i.e. bk is the body frame while motion control at time k drives the vehicle to a new state Xk + 1. Zk
sampling the k-th measurement. We denote ( · ) as the measurement describes the measurement of landmarks Mi observed by LiDAR at k.
with Gaussian noise, ( · ) as the optimized estimate of a parameter and Since the presence of observation noise, both Uk and Zk are Gaussian
( ¯· ) as the prediction of a parameter before refined. In the paper, a random variables, i.e. Uk ~N (uk , uk ) and Zk ~N (z k , z k ) . From the
complete state X to be estimated is defined as a 9 × 1 vector, including probabilistic point of view, the real observation is a sampling action of
attitude , position , and velocity v : the event.
(1) The system starts with receiving raw data (green blocks in Fig. 2) by It is a two-step process in which the localization module estimates
on-board LiDAR and POS, after which the received data is pre- the poses of the vehicle first and mapping module joints the map ac-
processed (pink blocks in Fig. 2). Specifically, IMU and GNSS cording to the pose estimations. The state transition in Eq. (6) (the first
measurements are coupled to estimate the pose in an EKF filter. If row) is assumed to be a Markov chain, which means that the next state
GNSS signals are unavailable, IMU measurements would be used Xk + 1 only depends on the immediately preceding state Xk and the ap-
alone. Relative transformation between two sampling timestamps plied control Uk . Its obvious that Xk + 1 is independent of both the ob-
of LiDAR is also used as motion prior to de-skew the distortion of servations and the map. Thus, the state transition can be extended as:
raw point cloud caused by continuous movement during scanning. k
(2) The first level of optimizer receives constrained factors, including P (Xk U0: k 1, X0 ) = P (X0 )· P (Xt Ut 1, Xt 1).
IMU preintegrated factors or odometry factors, scan-to-map regis- t=1 (7)
tration factors and small loop factors, and refines poses of each laser
The most often used estimator for unknown variables in Eq. (7) is
scans to joint a consistent submap. Specifically, IMU preintegrated
the maximum a posterior (MAP) algorithm:
factors are created by preintegrating IMU measurements (Forster
et al., 2016; Shuai-Xin et al., 2020), while odometry factors are Xk = arg max(P (Xk Uk 1, Xk 1)).
generated from GNSS/IMU filter when GNSS signals locked. Both Xk (8)
registration and loop closure factors use the same surfel-based ap- Since P (X 0 )·
k 1
Ut 1, Xt 1) is a constant, it can be dropped in
P (Xt
proach as SuMa. Note that the first level of optimizer is triggered t=1
Eq. (8). In practical implementations, there are many solvers can be
once when it meets the requirements for build up a submap. Due to employed, i.e. EKF and Gaussian Particle Filter (PF). After that, map Mk
the insertion of registration factors and small loop factors, mis- can also be updated.
matching of laser scans caused by positional error of odometry es-
timates can be mitigated, which makes the local map more con-
4.2. SLAM integrated MMS formulation
sistent. (yellow blocks in Fig. 2)
(3) Generally, the relative transformation between a pair of submaps is
The main difference between SLAM integrated MMS with its tradi-
estimated using the same surfel-based registration method as be-
tional counterpart is the usage of map Mk in localization module.
fore. However, when GNSS signal is lost, and the construction time
Except control measurements Uk 1 and the preceding state Xk 1, the
of a submap span a long period of time, surfel-based approach may
estimation of the next vehicle pose also requires the measurements of
fail. If that, we change to use point-based registration approach
the environment landmarks Zk 1. In SLAM problem, it is an estimator
instead though it may cost more computation resources. Moreover,
for the joint posterior density of the vehicle states, control
320
S. Li, et al. ISPRS Journal of Photogrammetry and Remote Sensing 166 (2020) 316–332
measurements and map measurements, rather than the estimator for factor graph based structure as shown in Fig. 4. The second level is a
vehicle states and control measurements only. Thus, Eq. (6) can be global factor graph GGlobal consisting of many submaps MN in different
rewritten as: referenced states X Nref which are the basic unit of the global optimi-
zation. Factors in the global graph include loop factors FLoop , successive
P (Xk U0: k 1, Z0: k , M0: k 1, X 0 ),
registration factors FReckon , map-to-map registration factors FReg and
P (M0: k Z0: k , X 0: k ). (9) GNSS factors FGNSS . Specifically, both FLoop and FReg impose spatial
It can be found that the state transition in Eq. (9) is related to map constraints from registering a pair of corresponding submaps with each
and landmarks measurements which are not incorporated in the tra- other by surfel-based registration approach. The only difference is that
ditional MMS. Following the principle of conditional probability dis- FReg is the constraint between the latest and the last submap, while
tribution, the state transition can be extended as: FLoop represents constraint that the vehicle backs to the place it already
traveled. The role of FReckon is to keep the changes of estimated poses
P (Xk U0:k 1, Z0:k , M0:k 1, X0 ) smooth, and that of FGNSS is to provide optimizer an absolute reference
k to decrease the accumulation of registration errors. (see the right graph
= P (X 0 )· [P (Xt Ut 1, Zt , Mt 1, Xt 1)· P (Mt 1 Zt 1, Xt 1)]. in Fig. 4).
t=1 (10) Each submap aggregates multiple consecutive laser scans S Ni refer-
After plugging Eq. (10) into Eq. (9), the problem Eq. (9) is trans- ring to different states X Ni . A single submap represents the vehicles
formed as an recursive problem. The state transition can also be solved vicinity in a given time window WT (WT = 5 in our experiment). To
by MAP algorithm: build a consistency submap and to preserve the estimated trajectory not
drifting, the first level optimization based on the local factor graph
Xk = arg max(P (Xk Uk 1, Zk , Mk 1, Xk 1)). GLocal is employed. Factors in the local factor graph include odometry
Xk (11)
factor FOdom coming from GNSS/INS filter, scan-to-map registration
With the comparison of Eq. (11) and Eq. (8), the difference of lo- factors FReg by registering the latest laser scan with so-far submap using
calization modules in traditional MMS and SLAM incorporated MMS is surfel-based registration approach, and loop closure factor FLoop de-
obvious. The latter self-locates not only based on the measurements scribing the overlap of view in the WT . Note that, if GNSS is lost, FOdom
from POS but also on the measurements from environment landmarks would be replaced as the IMU preintegration factors FPrein , and the
and so far map. In other words, the latter is a recursive process, i.e. relative motion prediction would be reckoned using IMU measurements
localization module further adjusts its estimation by evaluating the only. (see the left graph in Fig. 4).
variance of landmarks measurements and so-far map, while the former For the optimization of local and global factor graph, we use gtsam
is just a one-way process without the use of a map. Intuitively, SLAM library (Kaess et al., 2008; Kaess et al., 2012) and optimize the pose
integrated MMS is more reasonable than traditional MMS which omits graph using Levenberg Marquardt (L-M) algorithm with a maximum of
constraints between landmarks measurements and the map. For Eq. (8), 100 iterations, yielding maximum likelihood estimates of the refer-
Xk is the optimal estimation based on specific Uk 1 and Xk 1, i.e. en- enced states of S Ni and MN , which is variable X Ni and X Nref . Once the
vironment measurements at Xk may not fit well with the local map states of each laser scan X Ni are optimized through the first level op-
leading to the misalignment of point clouds of the same scene, which timization, it means that the relative transformation between S Ni and its
are scanned from different scannings. For instance, considering a wall is reference MN can be determined. Consequently, when X Nref are refined
scanned from different views by a MMS vehicle, poses of which are through the second level optimization according to spatial constraints
estimated only by Uk 1. The inaccuracy of each Xk would incur the between submaps, all of X Ni can be also refined. In order to increase the
misalignment of corresponding point clouds, leading to an inconsistent consistency of the map and avoid the adverse impact of poor registra-
local point cloud map. However, for Eq. (11), Xk is the optimal esti- tions, we also employ the idea by Graham et al. (2015) and optimize
mation based on Uk 1, Xk 1 , Mk 1, and Zk , which guarantees that the local factor graphs with consistency checking.
current scan Zk in good conformity with the so-far submap Mk 1.
Moreover, once one of the measurements Uk 1 or Zk is unavailable, the 5.1. Local refinement
other one still enables the vehicle to self-locates without apparent
harmful effects, which is so called seamless localization in complex In a given time window WT , we aggregate all consecutive laser scans
environments. S Ni to generate a local submap MN of the traversed environment. We
employ a surfel-based map for the representation of submap, since it
5. Hierarchical optimization structure enables to represent even large-scale environments and maintain dense,
detailed geometric information of the point clouds at the same time.
We model the problem of SLAM-integrated MMS in a hierarchical Moreover, a synthetic view can be rendered at the last pose X Ni 1
Fig. 4. The hierarchical optimization structure. Left: variable nodes and factor nodes in the local factor graph. Right: variable nodes and factor nodes in the global
factor graph. A local factor graph is one of a constituent of the global factor graph.
321
S. Li, et al. ISPRS Journal of Photogrammetry and Remote Sensing 166 (2020) 316–332
Nref1 1
Ni ( ¯Nref ¯ Ni )
[EOdom ]i = ,
¯N 1 Nref
( ¯ Ni ¯ Nref ) Ni
ref
[ Odom ]i (12)
Fig. 7. Average of relative translation and rotation errors using training data of where represents the multiplication operator of Lie algebra.
KITTI. Blue: Our method; Pink: Traditional MMS; Green: SuMa. (For inter- Conversely, if GNSS signal is unavailable, IMU measurement would
pretation of the references to colour in this figure legend, the reader is referred be used only for the preintegration. Virtual IMU preintegration mea-
to the web version of this article.) surements are derived as:
322
S. Li, et al. ISPRS Journal of Photogrammetry and Remote Sensing 166 (2020) 316–332
Table 1
The RMSE of the proposed MMS, GNSS/INS odometry and SuMa.
Seq. No. Length & Env. & Speed Method Ave. Positioning RMSE [m] Ave. Rotation RMSE [°]
#00 7m34s br City + Loop Our method 1.735 1.639 1.775 2.975 0.612 0.494 0.031 0.787
Normal GNSS/INS 1.743 1.638 1.775 2.978 0.619 0.504 0.057 0.802
SuMa 20.718 19.308 6.042 28.958 1.283 1.163 7.546 7.741
#01 1m50s Our method 3.271 6.528 14.039 15.825 0.641 0.705 0.065 0.955
Highway GNSS/INS 3.269 6.414 14.084 15.817 0.607 0.751 0.069 0.968
Fast SuMa 42.386 52.726 48.284 83.114 2.172 3.094 2.670 4.624
#02 7m46s Our method 2.427 2.724 2.163 3.241 0.570 0.507 0.016 0.763
City + Loop GNSS/INS 2.460 2.659 2.222 3.250 0.653 0.464 0.069 0.802
Fast SuMa 74.055 46.155 16.046 88.724 1.788 2.154 17.309 17.533
#04 27s Our method 1.204 2.301 0.748 2.702 0.848 0.115 0.046 0.857
Highway GNSS/INS 1.160 2.307 0.729 2.683 0.854 0.275 0.132 0.907
Normal SuMa 1.883 2.038 1.614 3.210 1.163 0.630 0.258 1.348
#05 4m36s Our method 1.636 1.352 1.018 2.354 0.619 0.458 0.074 0.773
City + Loop GNSS/INS 1.638 1.363 1.014 2.359 0.854 0.533 0.636 1.192
Normal SuMa 5.585 5.240 1.417 7.788 0.814 0.539 2.693 2.859
#06 1m50s Our method 0.743 1.082 0.866 1.374 0.286 0.366 0.619 0.773
City + Loop GNSS/INS 0.743 1.084 0.869 1.379 0.286 0.349 0.567 0.751
Normal SuMa 0.917 2.1548 1.2751 2.754 0.894 0.533 1.146 1.547
#07 1m50s Our method 1.455 1.023 0.689 1.907 0.561 0.441 0.063 0.716
City GNSS/INS 1.450 1.043 0.689 1.915 0.596 0.413 0.526 0.896
Normal SuMa 2.259 1.316 0.453 2.654 0.693 0.338 1.684 1.851
#08 8m37s Our method 1.875 2.179 1.708 3.344 0.630 0.430 0.092 0.768
City + Loop GNSS/INS 1.855 2.234 1.615 3.323 1.049 0.529 0.018 1.175
Normal SuMa 7.788 8.955 13.366 17.874 1.089 0.877 4.171 4.400
#09 2m39s Our method 2.300 2.702 0.961 2.676 0.613 0.395 0.074 0.733
City GNSS/INS 2.250 2.710 0.9524 2.669 0.722 0.378 0.728 1.094
Normal SuMa 16.987 18.202 3.032 25.081 1.106 0.716 4.847 5.025
#10 2m Our method 1.790 2.960 1.600 2.989 0.493 0.533 0.127 0.737
City GNSS/INS 1.803 2.975 1.580 2.992 0.670 0.493 0.063 0.831
Normal SuMa 5.803 11.421 4.684 13.515 0.561 0.619 1.398 1.632
b
Rbii+ 1 = Exp(([w]k [ g b]k ) t ), distortion of raw laser scans S Ni by transforming each point to the same
k [i, i + 1] time based on the assumption of uniform motion model. In the paper,
bi we choose the first points timestamp as the reference.
v bbkk+1 = R bi + 1 ([a ]k [ ab]k ) t ,
After de-skewing, we register the undistorted laser scan with the so-
k [i, i + 1] N
1 bi far submap to estimate the relative pose in the submap frame X Niref by
pbbkk+ 1 = v bbii+ 1 t + 2
Rbi +1 ([a ]k [ ab]k ) t 2, surfel-based ICP algorithm. First, the preprocessed laser scan S Ni is
k [i, i + 1] (13) projected as a depth image VS which can be exploited to generate a
where (a , w ) is the measurement of IMU describing linear acceleration corresponding normal image NS . Next, the depth image VM and normal
and angular velocity respectively; t is the sampling interval of IMU. image VM are rendered from the local surfel map. Then, we align VS
N
Although preintegration measurements are variable of IMU bias with VM and VM to estimate scan pose X Niref . The surfel-based laser scan
( ab,g b) , we regard them as constants which can be calibrated in ad- representation enables a fast surfel-to-surfel assocaition, which avoid
vance for a industrial grade INS. Then, we can set preintegration the cost for the closet point researching in the traditional ICP algorithm.
b Registration result is checked by aligning current vertex image with the
measurement as factor FPre = (Log( R bii+ 1 )) ( v bbii+1 ) ( pbbii+1 ) N
one rendered at the view of estimated pose X Niref . An example could be
with covariance Pre and insert it into the local graph. The residual
seen in Fig. 6. The residual error ES 2M can be derived based on the
errors EPre can be written as:
minimization of the distance from point in vS VS to its association in
[EPre ]i vM VM along the direction of nM NM :
bi 1 1
R bi + 1 ( ¯bi ¯b
i + 1)
1
ES 2M =
N
nM · TNiref vS vM .
= ¯b 1
(v¯ bi +1 v¯ bi g t bbii+ 1 ) v bbii+ 1 . vS VS (15)
i
¯b
i
1
(¯ bi + 1 ¯ bi v¯ bi t bbii+ 1
1
2
g( t bbii+ 1 )2 ) pbbii+ 1
[ Pre ]i
Note that, although associations can be easily searched by the in-
dices of surfels on VM , they also need to be reconfirmed at each itera-
(14) tion of the ICP. The nonlinear L-M iteration on the manifold starts from
the odometry prediction, and is solved using Lie algebra.
5.1.2. De-skewing and scan-to-map registration factor
Nref
We consider the just obtained spatial constraint T Ni as scan-to-map
From the view of LiDAR mechanical driving mechanism, laser re- Nref
registration factor FReg = ( Ni ) (
Nref
Ni ) . Similar to FOdom , we
ceivers receive reflected laser signals emitted from a column of spinning
laser emitters. Accordingly, the exact sampling time of each point is denote EReg as residual errors:
different because of the continuous movement during mobile sensing.
Nref 1 1
Specifically, the exact time of each point in the laser scan is varied ( Ni ) ( ¯Nref ¯N )
i
according to its order in the scan, while the acquisition of a full scan can [EReg ]i = Nref
.
¯N 1 ( ¯ Ni ¯ Nref )
take multiple milliseconds. We utilize relative transformation predic-
Ni
(16)
ref
[ Reg ]i
tions T NNii+1 from INS/GNSS filter or INS integration to de-skew the
323
S. Li, et al. ISPRS Journal of Photogrammetry and Remote Sensing 166 (2020) 316–332
Fig. 8. Plots of the trajectories with parts of details. Orange line: the ground truth; Green line: SuMa; Pink line: Our Method; Blue line: Traditional MMS; Green circle:
start point; Red point: end point. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
324
S. Li, et al. ISPRS Journal of Photogrammetry and Remote Sensing 166 (2020) 316–332
(19)
However, if the time spanning of a submap is too long, the points
which far away from the referenced origin may drop out, due to the
further points may be blocked. If that, it would be hard to find valid
associations based on the rendered depth image and .
Therefore, we switch to point-based registration approach which may
take more time, but the accuracy can be guaranteed. The only differ-
ence between the two registration approach is how to search associa-
tions, i.e. the former searches the corresponding point cloud in a point-
Fig. 9. Absolute translation and rotation errors of sequence #4. Blue: Our based KD-tree by the closest distance, while the latter directly indexes
method; Pink: Traditional MMS; Green: SuMa. (For interpretation of the re- the corresponding pixels on the depth image.
ferences to colour in this figure legend, the reader is referred to the web version N 1
Similarly, we regard the estimated relative motion TNref ref between
of this article.)
submaps as map-to-map registration factor
erating loop factors. Considering the special structure of the designed submap, the re-
lative transformation of two serial submaps can also be reckoned by
serial scan-to-map registration. The successive registration constraints
5.2. Global refinement
are extremely conductive to improving the smoothness of the trajec-
tory:
At the second level of optimization, consecutive submaps are con-
strained by map-to-map registration factors, while remote subamps 1 Nref 1 Nref
( ¯Nref ¯N + 1 ) (( Nlast ) )
with the mutually visual zone are constrained by loop closure factors. [EReckon ]N =
ref Nfirst
.
Nref
For both two kinds of factors, a surfel-based ICP registration is em- ( Nlast )
1 ( ¯ Nlast ¯ Nfirst ) Nlast
Nfirst
ployed for the fast correspondences association. We first search the loop [ Reckon ]N
Fig. 10. MME of all sequences using different approaches. Pink line and marker: Our method; Blue line and marker: Traditional MMS; Green line and marker: SuMa.
(For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
325
S. Li, et al. ISPRS Journal of Photogrammetry and Remote Sensing 166 (2020) 316–332
Fig. 11. MME of sequence #9 over optimization iterations. The first point at 0 represents the map without the global optimization.
5.2.2. Loop factor and GNSS factor 5.2.3. Loop factor and GNSS factor
For loop closure, we follow the loop detection and verification We perform optimization using the same incremental optimization
methods in SuMa. The main difference in the proposed system is how to strategy as the first level refinement. Only after a long period of GNSS
determine the spatial constraint. As for loops among submaps, since the losing, or when the loop closure between submaps is detected, or when
target of loop detection is local surfel maps, instead of laser scans, the the global optimizer is called, the global optimizer is triggered to run
loop constraint is determined by map-to-map registration. If we detect once. Poses of all local maps M0: n can be optimized in a batch:
loops using a scan in the indexed submap, it may construct more than
one loop factor (see Fig. 1) between the indexed submap and the can- X Nref = argmin ([EGNSS ]N + [EReckon ]N + [EReg ]N + [ELoop ]i ), N [0, n].
X Nref
didate submap, which is redundant. Moreover, a single laser scan is too
sparse to fully express the geometric features and structure of a scene, (25)
while an accumulated submap is much denser. Consequently, rather Since the relative poses of each laser scan to their corresponding
than rendering a local view from global map for the indexed scan to referenced frame is also determined by the first level of refinement, all
verify the detected loop, we verify loops by registering the indexed map laser scans can be updated.
with the candidate directly. As for loops in the submap, we set a smaller
time interval of triggering loop detection to enable loops in a submap
6. Experimental evaluations
could be detected, especially when a sharp turn occurs. All recording
maps closer than specified threshold Dthr are selected as candidates.
We evaluate our approach on raw KITTI datasets (Geiger et al.,
Then, the one that is the most compatible with current submap is se-
2013), in which point clouds of Velodyne HDL-64E S2, poses and IMU
lected as the loop closure submap. So do the loops in submap, but select
measurements of OXTS GNSS/IMU localization unit are recorded at the
laser scans as candidates.
same rate of 10 Hz. All on-board sensors are triggered synchronically by
The loop factor a time synchronization controller, and the ground truths of trajectories
are also provided. The recording scenes include inner city, highway and
is identical to registration factor, which imposes registration of a local residential area. Our SLAM integrated MMS is carried out on an Intel i5-
map MN or a laser scan S Ni with the loop closure candidate MM or S Nj to [email protected] GHz laptop with 8 GB RAM, and an Nvidia Geforce GTX
M
estimate the relative motion T Nrefref . The error equation of loop factor is 960 M with 2 GB RAM. To make a comprehensive evaluation on the
also a relative pose constraint, i.e.: performance of the proposed MMS, we evaluate the accuracy, con-
sistency and robustness of the system.
Mref 1 1
( Nref ) ( ¯Nref ¯M )
ref
1
[Eloop ]N = Mref
, 6.1. Accuracy of positioning
¯M1 ( ¯ Mref ¯ Nref ) Nref
(22)
ref
[ loop]N
We input the OXTS outputs into the filter to obtain a reliable 6 DOF
or poses, and regard them as initialization of pose estimation. Because of
the lack of POS data in sequence #03, all sequences #00-#10 except
Nj 1 N ¯NNref ) #03 are used as testing data in this experiment. We first compare the
( Ni ) (( ¯Njref ) 1
[Eloop ]N =
i
, proposed approach with both of the GNSS/INS odometry and SuMa
N
( ¯Njref ) 1 N
( ¯Njref
N
¯ Niref )
Nj
Ni (briefly denote as PA, GO and SM respectively in the follow), which
[ loop ]N (23) represent the traditional MMS and the SLAM method respectively, The
average of relative measures (relative translation error Erel (t ) and re-
where M and j are the index of the loop closure submap and laser scan
lative rotation error Erel (R) ) is shown in Fig. 7. The measures are
respectively.
provided by KITTI that computes in fixed distances over total se-
GNSS factors [EGNSS ]N are positional constraints for nodes in the
quences. The Erel (t ) of all three method are smaller than 2.0%, and the
graph. It will give nodes prior information to limit them not to change Erel (R) of them are smaller than 0.02°. It can witnesses a slight increase
much during the optimization, and it gives the estimated pose a world
of positioning error in SM as the length growing, while an inverse
frame reference. Essentially, both [Eloop ]N and [EGNSS ]N provide linked
tendency for both PA and GO. In general, the relative accuracy of GO is
nodes with more stably external reference, which effectively reduce the
slightly better than that of PA, and both of them are better than SM.
divergence of the trajectory:
Then we use the Root Mean Square Error (RMSE) to evaluate the
absolute accuracy Eabs of the estimated poses, as shown in Table 1. We
[EGNSS ]N = ( ¯ Nref ) Nref [ GNSS ]N . (24) focus more on the Eabs instead of Erel since the absolute accuracy is the
real focus in terms of mobile sensing. From the RMSEs in the table, it
326
S. Li, et al. ISPRS Journal of Photogrammetry and Remote Sensing 166 (2020) 316–332
Fig. 12. Visualized map results and local details. Left: reconstructed maps colorized by height. Right: comparison of reconstructed objects, where each pair contains
ours (left) and traditional GNSS/INS localization module (right). Right-top: top view of local roads; Right-middle: view of local scenes, etc. parks and yards. Right
bottom: detailed objects, etc. cars and swings.
can be seen that the absolute accuracy of GNSS/ INS localization unit Its obvious in the Fig. 8 that the Eabs of SM is much larger than that
and our proposed system is meter-level, while laser SLAM approach is of PA and GO. Especially, for some long-time sequences, e.g. #00, #02
dozens of meters. The reason why the absolute accuracy is relatively and #08, or when the vehicle moves fast, e.g. #01, the accuracy of SM
low is that we set the GNSS/INS positioning result as the initial, which is poor, at around dozens of meters. Only for those sequences that loop
means that the accuracy is totally dependent on the accuracy of GNSS/ closures could be successfully detected, e.g. #5 and #6, SM presents a
INS device. In other words, it is too hard to improve the accuracy by decent performance. In the combination with Erel (t ) and Erel (R) , we
one order of magnitude than the accuracy of the initial. can draw a conclusion that the LiDAR-only SLAM can hardly be used for
327
S. Li, et al. ISPRS Journal of Photogrammetry and Remote Sensing 166 (2020) 316–332
A
A
Fig. 14. Trajectories of sequence #07 after/before the refinement with different lengths of GNSS losing. Different color represents different lengths of lost time; Full
line: after the refinement; Dash line: before the refinement; Green square: GNSS losing point. (For interpretation of the references to colour in this figure legend, the
reader is referred to the web version of this article.)
328
S. Li, et al. ISPRS Journal of Photogrammetry and Remote Sensing 166 (2020) 316–332
Table 2
The RMSE of estimated poses after/before the refinement (a/r and b/r) with different lengths of GNSS losing.
Approach Ave. Positioning RMSE [m] Ave. Rotation RMSE [°]
w/o lost 1.455 1.023 0.689 1.907 0.561 0.441 0.063 0.716
5s a/r 1.457 1.011 0.688 1.902 0.598 0.416 0.473 0.869
b/r 1.467 1.000 0.690 1.905 0.622 0.444 0.571 0.954
10s a/r 1.461 0.994 0.690 1.897 0.597 0.431 0.544 0.916
b/r 1.472 1.011 0.680 1.914 0.567 0.536 0.739 1.075
15s a/r 1.474 1.002 0.689 1.911 0.591 0.483 0.541 0.935
b/r 1.452 1.159 0.689 1.981 0.733 0.474 0.622 1.072
20s a/r 1.495 1.016 0.697 1.937 0.595 0.470 0.543 0.932
b/r 1.446 1.295 0.691 2.060 0.702 0.480 0.667 1.081
25s a/r 1.505 1.074 0.700 1.977 0.590 0.487 0.520 0.925
b/r 1.459 1.383 0.698 2.128 0.647 0.484 0.889 1.202
30s a/r 1.467 1.159 0.702 1.997 0.561 0.432 0.546 0.895
b/r 1.543 1.383 0.711 2.191 0.584 0.498 0.927 1.203
40s a/r 1.509 1.264 0.808 2.128 0.591 0.492 0.497 0.915
b/r 1.659 1.392 0.703 2.277 0.672 0.579 1.247 1.531
50s a/r 1.548 1.278 0.743 2.140 0.595 0.495 0.482 0.912
b/r 1.665 1.496 0.659 2.333 0.806 0.677 1.238 1.625
60s a/r 1.734 1.345 0.559 2.264 0.595 0.495 0.460 0.900
b/r 1.667 1.608 0.661 2.408 0.803 0.686 1.221 1.614
generating HD map without the support of GNSS and INS, though the consistent map than GO. In addition, comparison with SM shows that
relative accuracy of it is satisfied. As for PA and GO, it can be found that SM has a similar performance with PA illustrating that the importance
they did well on most of sequences except #1, and the accuracy of PA of the utilization of laser scan measurements in the point cloud regis-
matches that of GO in all sequences. However, if the total quality of tration front-end and the correction of loop closures, i.e. the sig-
GNSS is poor, such as #1, the accuracy of PA may be also affected. nificance of feedback from mapping module to the localization module.
Therefore, good GNSS/INS odometer is the key to enable MMS to build To further explore the affection of our proposed hierarchical opti-
accurate HD maps,i.e. the accuracy of MMS is dependent on that of the mization structure on the quality of map, we show the resulting MMME
GNSS/INS odometer. of the best sequence #09, plotted against the number of iterations when
It cannot show the superiority of PA just from the numbers in the running the refinement as post-process step. Fig. 11 shows that the
table, while the plots of the trajectories with details visualize the dif- MMME decrease dramatically over the first few iterations, and stabilize
ferences more clearly (see Fig. 8). We randomly select some ranges of at around −1.577.
interest (ROI) in each graph to show the detail. As it shown in the plots The map results of 3 sequences with medium length and without
of the trajectories, the estimated poses of SM drift away from the loop closures, #7, #9, #10, as well as the one with loop closure, #5, is
ground truth, while those of PA and GO do not. From the detailed ROI, shown in Fig. 12. The point cloud is colorized by height, and down-
it can be obviously found that the trajectories of GO wave affected by sampled using voxelgrid with the size (0.1 0.1 0.1). From the top views
the noise of GNSS and INS, but those of PA are much smoother. of local roads in different sequences (see Fig. 12(a)), they show that
To demonstrate the advantage of the integration of 3D visual walls in PA are much thinner than that in GO indicating the better
matching further, we select the shortest sequence #04 to compare the consistency of PA. From the local views of roads and crossroads in the
Erel (t ) and Erel (R) in each direction, as shown in Fig. 9. From the line right-middle of each figure (see Fig. 12(b)), we can observe that the
chart of Erel (t ) , the much better continuity of the error can be observed results of PA are much clearer than that of GO which seems to be
clearly, especially in the direction of forward motion x. covered up by mosaic. We also compare the detailed objects in the two
point cloud maps as shown in Fig. 12(c) and (d). In the map of PA, we
6.2. Consistency of mapping can clearly distinguish the cars, roadside lightings, trees with branches,
fences and upstairs in the yards, and even the swing in the city park, but
In this experiment, we first calculate the mean map entropy (MME) in that of GO cannot. The experimental results show the superiority of
(Razlaw et al., 2015) MMME of all the points PA comparison with GO in terms of the consistency of mapping. But the
in the resulting map to measure map’s quality, as shown in Fig. 10. The good results rely on the correct registrations which may be invalid af-
MME represents the sharpness of a map, i.e. lower entropy measures fected by moving objects.
correspond to higher map quality. The entropy h for a single map point
pk is calculated by:
6.3. Robustness in complex environment
1
h (pk ) = ln |2 e ·S (pk )|,
2 (26) To evaluate the performance of PA in weak GNSS environments,
where signals could be blocked or effected by multi-path effect, we
where S (pk ) is the covariance of sampled points in the radius r around
select #07 as the training data and manually control a period of time as
pk ;|·| is the operator of determinant. In the paper, we set r = 0.5. The
mean map entropy H (P ) is the average of all points in the resulting GNSS invalid to simulate a complex outdoor environment. In the ex-
map: periment, we set the time periods of GNSS-denied as 5s, 10s, 15s, 20s,
25s, 30s, 40s, 50s and 60s respectively. The figure of our mapping
1
P
system at the scene where the GNSS signal relocked after 60s lost is
H (P ) = h (pk ).
shown in Fig. 13. It can be seen that before the refinement (see the left
P k=1 (27)
figure in Fig. 13) the new submap reference is far from the end scan of
The figure shows that the MMME of each map result built by PA is the last submap. This is because the last submap constructed by INS
smaller than its counterpart built by GO, i.e. PA could build a more odometry and scan-to-map registration is drifted, while the new
329
S. Li, et al. ISPRS Journal of Photogrammetry and Remote Sensing 166 (2020) 316–332
Fig. 15. Changes of estimated poses of sequence #07 after the refinement with different lengths GNSS losing. Lines colorized by the same colors as that in Fig. 14
have the same meaning.
submap reference is initialized based on no-drifted GNSS pose prior. and a GNSS/INS coupled device, POS. We utilize the POS provided pose
After the second level refinement, the poses of submaps are refined. as the initialization for the system, while it is regarded as the final pose
Trajectories are visualized in Fig. 14, and the detailed Eabs of them, estimation in the traditional system.
as well as their unoptimized counterparts are listed in Table 2. From the Three main contributions are introduced in this paper. First, a novel
figure of trajectories of sequence #07, we can qualitatively learn that it localization took the feedback from mapping module into account is
has no effect on the performance of the proposed mapping system proposed to improve the smoothness of trajectory and the consistency
within a half minute of GNSS losing, while it could still work normally of the map. The proposed method is evaluated on KITTI benchmark and
within 60% of GNSS losing time. To quantitatively prove it, we plot the compared with the state of art SLAM approach. The accuracy of pro-
changes between the poses with a period of GNSS losing and those posed MMS is close to the result of GNSS/INS device, but the smooth-
without (see Fig. 15). It shows that the maximum of pose change is no ness is much better which guarantees the consistency of point cloud
more than 1 m and 1.8° representing the validity of the system when map. Besides, the map consistency of proposed MMS is better than both
GNSS intermittently lost in the simulation environment. of the compared counterparts. The main contribution is to keep the
accuracy of POS, while enabling an obvious quantitative improvement
7. Conclusion and future work on map consistency.
Second, the complex mapping problem is modeled as a hierarchical
This paper proposed a robust SLAM-integrated MMS using hier- optimization problem, which can be efficiently solved by the hier-
archical optimization structure. The mapping system uses 3D LiDAR archical factor graph. On the one hand, the local factor graph
330
S. Li, et al. ISPRS Journal of Photogrammetry and Remote Sensing 166 (2020) 316–332
contributes to construct a concise but tight essential graph by dumping in gps/ins integration. J. Navigat. 60, 517–529.
redundant factors. On the other hand, the factor graph could be effi- Durrant-Whyte, H., Bailey, T., 2006. Simultaneous localization and mapping: part i. IEEE
Robot. Automat. Magaz. 13, 99–110.
ciently solved by the incremental smoothing approach, which greatly Elbahnasawy, M., Shamseldin, T., Habib, A., 2018. Image-assisted gnss/ins navigation for
shortened the time cost on global optimization. Third, the GNSS signal uav-based mobile mapping systems during gnss outages. In: 2018 IEEE/ION Position,
is modeled as a plug-and-play constraint to facilitate the mapping Location and Navigation Symposium (PLANS). IEEE, pp. 417–425.
Forster, C., Carlone, L., Dellaert, F., Scaramuzza, D., 2016. On-manifold preintegration for
process in the complex environments. Compared to the result without real-time visual–inertial odometry. IEEE Trans. Rob. 33, 1–21.
GNSS lost, the proposed method controls the drift under 1m and 1.8° Geiger, A., Lenz, P., Stiller, C., Urtasun, R., 2013. Vision meets robotics: The kitti dataset.
with the GNSS-denied time up to one minute. Int. J. Robot. Res. 32, 1231–1237.
George, M., Sukkarieh, S., 2005. Tightly coupled ins/gps with bias estimation for uav
However, there are many significant improvements still need to be applications. In: Proceedings of Australiasian Conference on Robotics and
made before it becomes a mature application. In future work, we will Automation (ACRA).
focus on the following weaknesses of the proposed system: Graham, M.C., How, J.P., Gustafson, D.E., 2015. Robust incremental slam with con-
sistency-checking. In: 2015 IEEE/RSJ International Conference on Intelligent Robots
• In the next place, the extrinsic calibration result should be in- Liang, X., Liu, J., Wang, Y., et al., 2015. Accuracy of kinematic positioning using
global satellite navigation systems under forest canopies. Forests 6, 3218–3236.
corporated into the estimated state. Advance off-line calibration is Kaess, M., Johannsson, H., Roberts, R., Ila, V., Leonard, J.J., Dellaert, F., 2012. isam2:
necessary which even need to be repeated every day after recordings Incremental smoothing and mapping using the bayes tree. Int. J. Robot. Res. 31,
216–235.
to avoid drift. It is tedious and requires lots of labours. An intuitive
Kaess, M., Ranganathan, A., Dellaert, F., 2008. isam: Incremental smoothing and map-
idea is treating the extrinsic parameters as estimated states and re- ping. IEEE Trans. Rob. 24, 1365–1378.
cursively estimate it along with poses. Moreover, if the accuracy of Konolige, K., Agrawal, M., 2008. Frameslam: From bundle adjustment to real-time visual
laser odometry at the first few scans can be guaranteed, it can even mapping. IEEE Trans. Rob. 24, 1066–1077.
Konolige, K., Grisetti, G., Kümmerle, R., Burgard, W., Limketkai, B., Vincent, R., 2010.
achieve full-auto mapping with any sensors rig. Efficient sparse pose adjustment for 2d mapping. In: 2010 IEEE/RSJ International
• The automatical recognition of valid GNSS information is also a key Conference on Intelligent Robots and Systems. IEEE, pp. 22–29.
Leonard, J.J., Durrant-Whyte, H.F., 2002. Mobile robot localization by tracking geometric
point. In the complex environments, GNSS signals could also be
beacons. IEEE Trans. Robot. Automat. 7, 376–382.
received, but affected by the multi-path effect, it may provide a false Levinson, J., Montemerlo, M., Thrun, S., 2007. Map-based precision vehicle localization
position result. Based on the proposed feedback network from in urban environments. In: Robotics: Science and Systems, Citeseer. p. 1.
mapping module to the localization module, the detection of false Lhuillier, M., 2012. Incremental fusion of structure-from-motion and gps using con-
strained bundle adjustments. IEEE Trans. Pattern Anal. Machine Intell. 34,
positive signals could be realized by some specially designed filter. 2489–2495.
Li, S., Li, G., Wang, L., Zhou, Y., Peng, Y., Fu, J., 2017. A three-dimensional robust ridge
Declaration of Competing Interest estimation positioning method for uwb in a complex environment. Adv. Space Res.
60, 2763–2775.
Markom, M.A., Shukor, S.A.A., Adom, A.H., Tan, E., Shakaff, A.Y.M., 2016. Indoor
The authors declare that they have no known competing financial scanning and mapping using mobile robot and rp lidar. In International Journal of
Advances in Mechanical and Automobile Engg. (IJAMAE) 3.
interests or personal relationships that could have appeared to influ-
Milioto, A., Vizzo, I., Behley, J., Stachniss, C., 2019. Rangenet++: Fast and accurate
ence the work reported in this paper. lidar semantic segmentation. In: Proc. of the IEEE/RSJ Intl. Conf. on Intelligent
Robots and Systems (IROS).
Acknowledgements Moosmann, F., Stiller, C., 2011. Velodyne slam. In: 2011 IEEE Intelligent Vehicles
Symposium (IV), IEEE. pp. 393–398.
Nerurkar, E.D., Wu, K.J., Roumeliotis, S.I., 2014. C-klam: Constrained keyframe-based
This work is supported by State Key Laboratory of Geo-Information localization and mapping. In: 2014 IEEE international conference on robotics and
Engineering(SKLGIE2018-M-3-1), National Key Research and automation (ICRA), IEEE. pp. 3638–3643.
Pierzchała, M., Giguère, P., Astrup, R., 2018. Mapping forests using an unmanned ground
Development Project 2017YFF0206001, and National Natural Science vehicle with 3d lidar and graph-slam. Comput. Electron. Agric. 145, 217–225.
Foundation of China 41501491. Puente, I., González-Jorge, H., Martínez-Sánchez, J., Arias, P., 2013. Review of mobile
mapping and surveying technologies. Measurement 46, 2127–2145.
Qu, X., Soheilian, B., Paparoditis, N., 2018. Landmark based localization in urban en-
References vironment. ISPRS J. Photogramm. Remote Sens. 140, 90–103.
Razlaw, J., Droeschel, D., Holz, D., Behnke, S., 2015. Evaluation of registration methods
Bailey, T., Durrant-Whyte, H., 2006. Simultaneous localization and mapping (slam): Part for sparse 3d laser scans. In: 2015 European Conference on Mobile Robots (ECMR).
ii. IEEE Robot. Automat. Magaz. 13, 108–117. IEEE, pp. 1–7.
Behley, J., Stachniss, C., 2018. Efficient surfel-based slam using 3d laser range data in Roh, H., Jeong, J., Cho, Y., Kim, A., 2016. Accurate mobile urban mapping via digital
urban environments. In: Robotics: Science and Systems. map-based slam. Sensors 16, 1315.
Cadena, C., Carlone, L., Carrillo, H., Latif, Y., Scaramuzza, D., Neira, J., Reid, I., Leonard, Shamseldin, T., Manerikar, A., Elbahnasawy, M., Habib, A., 2018. Slam-based pseudo-
J.J., 2016. Past, present, and future of simultaneous localization and mapping: gnss/ins localization system for indoor lidar mobile mapping systems. In: 2018 IEEE/
Toward the robust-perception age. IEEE Trans. Robot. 32, 1309–1332. ION Position, Location and Navigation Symposium (PLANS). IEEE, pp. 197–208.
Chang, L., Niu, X., Liu, T., Tang, J., Qian, C., 2019. Gnss/ins/lidar-slam integrated na- Shan, T., Englot, B., 2018. Lego-loam: Lightweight and ground-optimized lidar odometry
vigation system based on graph optimization. Remote Sensing 11, 1009. and mapping on variable terrain. In: 2018 IEEE/RSJ International Conference on
Chen, X., Milioto, A., Palazzolo, E., Giguère, P., Behley, J., Stachniss, C., 2019. Suma++: Intelligent Robots and Systems (IROS). IEEE, pp. 4758–4765.
Efficient lidar-based semantic slam. In: 2019 IEEE/RSJ International Conference on Shuai-Xin, L., Guang-Yun, L., Wang, L., Yang, X.T.L., 2020. Imu tightly coupled real-time
Intelligent Robots and Systems (IROS). IEEE, pp. 4530–4537. localiza-tion method. Acta Automatica Sinica 46, 1–13.
Dellaert, F., Kaess, M., 2006. Square root sam: Simultaneous localization and mapping via Strasdat, H., Montiel, J.M., Davison, A.J., 2012. Visual slam: why filter? Image Vis.
square root information smoothing. Int. J. Robot. Res. 25, 1181–1203. Comput. 30, 65–77.
Ding, W., Wang, J., Rizos, C., Kinlyside, D., 2007. Improving adaptive kalman estimation Takai, S., Date, H., Kanai, S., Niina, Y., Oda, K., Ikeda, T., 2013. Accurate registration of
331
S. Li, et al. ISPRS Journal of Photogrammetry and Remote Sensing 166 (2020) 316–332
mms point clouds of urban areas using trajectory. ISPRS Ann. Photogramm. Remote optimization with control network constraint for mobile mapping. Sensors 18, 3668.
Sens. Spat. Inf. Sci, 277–282. Yang, S., Zhu, X., Nian, X., Feng, L., Qu, X., Mal, T., 2018. A robust pose graph approach
Thrun, S., Burgard, W., Fox, D., 2006. Probabilistic Robotics, vol. 1 MIT press, Cambridge. for city scale lidar mapping. In: 2018 IEEE/RSJ International Conference on
Thrun, S., Montemerlo, M., 2006. The graph slam algorithm with applications to large- Intelligent Robots and Systems (IROS). IEEE, pp. 1175–1182.
scale mapping of urban structures. Int. J. Robot. Res. 25, 403–429. Zhan, H., Weerasekera, C.S., Bian, J., Reid, I., 2019. Visual odometry revisited: What
Toschi, I., Rodríguez-Gonzálvez, P., Remondino, F., Minto, S., Orlandini, S., Fuller, A., should be learnt? arXiv preprint arXiv:1909.09803.
2015. Accuracy evaluation of a mobile mapping system with advanced statistical Zhang, J., Kaess, M., Singh, S., 2017. A real-time method for depth enhanced visual
methods. Int. Arch. Photogramm., Remote Sensing Spatial Informat. Sci. 40, 245. odometry. Autonomous Robots 41, 31–43.
Wei, L., Cappelle, C., Ruichek, Y., Zann, F., 2011. Gps and stereovision-based visual Zhang, J., Singh, S., 2017. Low-drift and real-time lidar odometry and mapping.
odometry: Application to urban scene mapping and intelligent vehicle localization. Autonomous Robots 41, 401–416.
Int. J. Vehicul. Technol. 2011. Zlot, R., Bosse, M., 2014. Efficient large-scale three-dimensional mobile mapping for
Wen, J., Qian, C., Tang, J., Liu, H., Ye, W., Fan, X., 2018. 2d lidar slam back-end underground mines. J. Field Robot. 31, 758–779.
332