Autonomous Navigation Framework For Intelligent
Abstract: Humans have an innate ability to model, perceive, and plan within their environment while simultaneously performing tasks; however, this capability remains a challenging problem in the study of robotic cognition. We address this issue by proposing a neuro-inspired cognitive navigation
framework, which is composed of three major components: semantic modeling framework (SMF),
semantic information processing (SIP) module, and semantic autonomous navigation (SAN) module
to enable the robot to perform cognitive tasks. The SMF creates an environment database using
Triplet Ontological Semantic Model (TOSM) and builds semantic models of the environment.
The environment maps from these semantic models are generated in an on-demand database and downloaded by the SIP and SAN modules when required by the robot. The SIP module contains active environment perception components for recognition and localization and feeds the relevant perception information to the behavior planner so the task can be performed safely. The SAN module uses
a behavior planner that is connected with a knowledge base and behavior database for querying
during action planning and execution. The main contributions of our work are the development
of the TOSM, the integration of the SMF, SIP, and SAN modules into a single framework, and the interaction
between these components based on the findings of cognitive science. We deploy our cognitive
navigation framework on a mobile robot platform, considering implicit and explicit constraints for
autonomous robot navigation in a real-world environment. The robotic experiments demonstrate the
validity of our proposed framework.
Keywords: intelligent robot; autonomous navigation framework; triplet ontological semantic model;
environment modeling; on-demand database; knowledge-based recognition; hierarchical planning
1. Introduction
Environment modeling, recognition, and planning are fundamental components for intelligent
systems, such as mobile robots, that enable them to understand and perceive the complex environment
in the same way a human does and perform the task reliably. When mobile robots are deployed in
the real world, numerous challenges arise: acquiring spatial models of a continuously changing environment, which leads to the robotic mapping problem; radical changes in scene appearance, one of the major causes of visual recognition failure [1]; and the dynamic nature of the operating environment, a critical challenge for robot planning [2]. Many approaches have
been proposed to handle these issues [3]. Among them, a growing worldwide trend is the introduction of semantic information into robotic systems using ontologies [4], which endows mobile robots with human-like mapping [5], recognition [6], and planning [7] capabilities. Inspired by these recent advances, our framework exploits semantic knowledge by defining a Triplet Ontological
Semantic Model (TOSM).
Recent developments in robotics have heightened the need for semantic information processing
techniques to design object and place recognition systems based on the findings of cognitive science.
Semantic recognition systems based on Convolutional Neural Network (CNN) models have made
groundbreaking advances to tackle scene understanding tasks with an emphasis on object and place
recognition [8]. The complex representation of visual information and its processing by the biologically inspired design of CNN models have brought unprecedented advances in visual recognition. The layered structure of a CNN model closely reflects the lower areas of the human visual cortex [9]. Motivated by these developments, we combine the strengths of a deep learning-based CNN
model with an on-demand database for semantic object-based place recognition and robot localization.
When planning a task, an agent requires a description of the environment and of itself,
containing the possible states, actions, and events. Classical planners, such as Planning Domain
Definition Language (PDDL) planners [10], make use of a set of predicates and actions to generate
a valid action sequence. Those planners have been widely used by the research community and integrated into powerful frameworks [11]. However, they tend not to perform well in unpredictable and dynamic environments, which are usually described by Partially Observable Markov Decision Processes (POMDPs). Aiming to solve POMDPs in a variety of domains, reinforcement learning approaches use large amounts of data to approximate a policy capable of obtaining an optimal action
based solely on the current state. Those approaches have shown promising results, but they still
underperform when dealing with long-term scenarios. Our framework tries to overcome those
shortcomings by combining classical PDDL planners, sampling-based planners, and reinforcement
learning algorithms into a hierarchical scheme.
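To make this hierarchy concrete, the following Python sketch illustrates one way the three layers could hand control to one another; the interfaces (symbolic_planner, waypoint_planner, rl_policy, robot) are hypothetical placeholders, not the framework's actual API.

# A minimal sketch (not the framework's actual API) of how the hierarchical scheme
# hands control from the symbolic planner down to the learned policy.
def navigate(mission_goal, symbolic_planner, waypoint_planner, rl_policy, robot):
    """Mission -> Behavior -> Motion layering."""
    # 1. A classical (PDDL-style) planner returns a coarse behavior sequence.
    behaviors = symbolic_planner.plan(robot.current_state(), mission_goal)
    for behavior in behaviors:
        # 2. A sampling-based planner refines each behavior into metric waypoints.
        waypoints = waypoint_planner.plan(robot.pose(), behavior.target_pose)
        # 3. The reinforcement-learning policy drives the robot between waypoints,
        #    reacting to obstacles observed in the local sensor data.
        for wp in waypoints:
            while not robot.reached(wp):
                robot.execute(rl_policy(robot.observation(), wp))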
Autonomous navigation frameworks based on knowledge bases have become essential for complex
applications of robotic systems in real-world environments. In this direction, the Ontology-Based
Unified Robot Knowledge (OUR-K) framework [12] emphasized robot intelligence for performing navigation, recognition, planning, and manipulation tasks using knowledge description (for contexts,
features, objects, spaces, and actions) and knowledge association for establishing relationships
between descriptions by supporting inference. The OUR-K framework considered small static
indoor environments such as an office room; however, our framework concentrates on robot
navigation in large dynamic environments using the TOSM-based architecture. The RoboEarth
framework [13] designed a knowledge-based system to generate, share, and reuse data using a cloud
database. In contrast, our framework uses the on-demand database to ensure real-time capabilities.
Furthermore, it integrates semantic modeling, information processing, and autonomous navigation
modules to perform a task. Our prior work [14] motivated us to introduce the navigation framework presented in this study.
Our framework makes major contributions to the semantic robotics research area, namely the development of the TOSM, the integration of the SMF, SIP, and SAN modules into a single framework, and the interaction between these components based on the findings of cognitive science.
The rest of this paper is organized as follows: Section 2 gives a brief overview of related work on other frameworks, environment modeling, recognition, and planning. Section 3 explains the overall framework and describes each component in detail. Section 4
illustrates experiments and discusses the results. Finally, conclusions and future directions follow in
Section 5.
2. Related Work
In this section, we first discuss the related work for knowledge-based navigation frameworks
(Section 2.1). After that, we review the literature that deals with the individual modules of our
autonomous navigation framework, divided into environment modeling for robots (Section 2.2),
recognition (Section 2.3), and planning (Section 2.4).
maps. Thrun et al. [20] presented follow-up research on the offline SLAM problem by introducing GraphSLAM, which represented the log-likelihood of the data and generated maps with more than 10^8 features in urban environments. Ref. [21] used selective resampling operations to reduce the number of particles in the Rao-Blackwellized [22] particle filter for building an accurate map of the environment. Ref. [23] addressed the motion estimation and distortion problem in lidar
cloud by proposing a real-time approach divided into two algorithms that run in parallel. The lidar
odometry algorithm estimated the transform between consecutive frames at a higher frequency, while
the lidar mapping algorithm performed fine matching and registration of the point clouds at a lower
frequency. Subsequently, Parallel Tracking and Mapping (PTAM) [24] was considered the most popular method for implementing vision-based SLAM because of its ability to handle a large number of 3D points. Later, Mur-Artal et al. [25] extended the adaptability of PTAM to environments that were previously intractable by
designing a new feature-based monocular ORB-SLAM system from scratch.
The semantic mapping problem has attracted the researchers’ attention in recent years.
Pronobis et al. [26] proposed a probabilistic system for extracting semantic information from different
heterogeneous modalities and combined it with common-sense conceptual knowledge. It made the
semantic maps more descriptive based on the concept of spatial properties that were represented by
the chain graph. Kostavelis et al. [27] gave a qualitative description to enhance the representation of the environment and enabled the robot to perceive its surroundings similarly to the way a human does. McCormac et al. [28] used CNNs and the dense SLAM system ElasticFusion to provide dense
correspondence from the 2D frame into the 3D map. The semantic predictions from CNN were
obtained from multiple viewpoints and then fused into a dense map. Yang et al. [29] proposed an
incremental semantic 3D mapping system with a scrolling occupancy grid map while CNN was
used for segmentation and computation of pixel label distribution. Nakajima et al. [30] presented
an approach for incrementally building dense 3D maps that were semantically annotated. The class
probabilities were assigned to each region of the 3D map and segmented using geometric-based
techniques, while a CNN enhanced frame-wise semantic segmentation. Xiao et al. [31] presented Dynamic-SLAM, which takes advantage of a deep learning-based recognition technique, SSD, to recognize dynamic objects at the semantic level, while the recall rate was improved by a proposed missed-detection compensation algorithm.
and general knowledge with activity observations knowledge for environmental map building.
Crespo et al. [36] proposed a semantic relational model based on a physical and conceptual
representation of real-world objects, places, and semantic relationships among them to endow the
robot with the capabilities of making queries about its surroundings for planning and navigation tasks.
2.3. Recognition
In this section, we focus on reviewing the object and place recognition approaches, along with
localization techniques described in the literature.
Schönberger et al. [51] proposed an approach based on 3D geometric and semantic understanding of the real-world environment by training a generative model on the semantic scene to
improve the descriptor learning for missing observations. Garg et al. [52] developed a set of semantics
and appearance-based methods for efficient VPR in a challenging scenario by extracting convolutional
features from a dense semantic segmentation network using the proposed Local Semantic Tensor (LoST)
descriptor of images. Chen et al. [53] generated condition and viewpoint-invariant features by training
two CNN models to recognize specific places with a multi-scale feature encoding method. Ref. [54] used a CNN to recognize revisited locations under different environmental variations, mapping visual information to a low-dimensional space in which Euclidean distance corresponds to place dissimilarity.
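As an illustration of the idea behind Ref. [54], the following sketch computes place dissimilarity as the Euclidean distance between CNN feature embeddings; the embedding database and the acceptance threshold are hypothetical, introduced only for this example.

import numpy as np

def place_dissimilarity(embedding_a, embedding_b):
    """Euclidean distance between two CNN place embeddings."""
    return float(np.linalg.norm(np.asarray(embedding_a) - np.asarray(embedding_b)))

def recognize_place(query_embedding, database, threshold=0.5):
    """Return the database place closest to the query embedding, or None when every
    distance exceeds the (hypothetical) acceptance threshold."""
    best_name, best_dist = None, float("inf")
    for name, embedding in database.items():
        dist = place_dissimilarity(query_embedding, embedding)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist < threshold else None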
2.3.3. Localization
The robotics research community is making steady progress on autonomous robot localization with respect to the environment using recognition techniques. Mitsuhashi et al. [55]
proposed a system for mobile robot localization with an appearance-based place recognition approach
in conjunction with a dead reckoning and gyrodometry model. Ref. [56] presented a vision-based localization approach for outdoor environments against a known map by combining a place-dependent feature detector with prior experience of the place. Instead of representing edges
and corners, these features represented mid-level patches of different elements such as windows. Ref. [57] presented the Visual Landmark Sequence-based Indoor Localization (VLSIL) approach, addressing environmental variations using semantic information about steady indoor objects (landmarks) while performing localization using the objects’ occurrence order in the video. Zhong et al. [58] proposed
a mobile robot’s self-localization method using recognition of artificial visual landmarks which were
identified using a bilayer recognition algorithm. The geometric relation was established between image
and environmental frames to get the orientation and position of the camera for robot localization.
2.4. Planning
Planning is one of the main study areas in the robotics field, especially after robots started
performing central roles in modern industry and our daily life [59,60]. Moreover, the Defense Advanced
Research Projects Agency (DARPA) challenge [61] provided real-world conditions and pushed the
boundaries of Self-Driving Vehicle (SDV) research. It also showed that sophisticated planning solutions were paramount for SDVs to handle a broad span of driving scenarios.
Planning refers to finding a sequence of actions necessary to accomplish a goal, given an initial state. To complete such a goal, the agent requires a description of the environment, its actions, and its capabilities. A planner can perform different roles depending on how general it is. There are many ways of dividing those roles, such as Global/Local planner or Mission/Behavior/Motion planner. The global planner, generally run offline, generates a high-level, low-resolution sequence of actions, based on the a priori known environment, to accomplish the final goal. The local planner, on the other hand, generates a low-level, high-resolution sequence of actions, based on online sensor data, that covers a portion of the global path [62]. In this paper, however, the Mission/Behavior/Motion
division is explored [59].
the ROSPlan framework [11] integrating PDDL 2.1 with the Robot Operating System (ROS), which is
widely used by the robotics research community. By using a Knowledge Base, ROSPlan is able to build the initial domain state automatically, validate the generated plan, and decide when to re-plan.
Estivill-Castro et al. [66] showed that PDDL planners can also work in dynamic environments when other agents follow a deterministic behavior. Researchers have been attempting to integrate
Hierarchical Task Networks (HTNs) with task planning using different techniques. Dvorak et al. [67] proposed the first Flexible ANML Actor and Planner (FAPE), which combined most of the Action Notation Modeling Language (ANML) [68] with HTN task decomposition. It also integrated acting that followed Procedural Reasoning System (PRS) refinements, plan repair, and re-planning. However, it used a non-recursive decomposition hierarchy, which decreased the expressiveness of HTN planning.
Nonetheless, classical planners generally do not perform well in dynamic and unpredictable environments due to their linear nature, which makes them unable to reason about multiple outcomes. Decision-theoretic planning, on the other hand, tries to maximize the probability of success (also known as the maximum expected utility, MEU) over every possible action outcome from any state
the robot reaches during the plan execution. It can solve Markov Decision Processes (MDP) and
even POMDPs, but its high computational cost makes decision-theoretic approaches prohibitive.
Most of the research done in this field was trying to overcome this issue. Boutilier et al. [69] tried to
solve an approximated optimal policy by identifying which propositions are relevant and using this
information to reduce the state space. In a similar fashion, Dean et al. [70] used a known initial state
to model the world nondeterminism and restrict the planner to focus only on states that might be
encountered during the plan execution.
Some newer approaches tried to combine ontology and semantic knowledge into planning.
Such ontological knowledge eases semantic, temporal, and spatial representations of the environment [4]. Galindo et al. [71] used semantic maps to extend the planner's reasoning capabilities, aiding the understanding of high-level tasks. KnowRob [72] is a knowledge processing framework that
combines episodic memory, mental simulation, and a generalized ontology to help a robot to reason,
plan, and learn. This work, however, focused only on manipulation tasks [73].
In cell decomposition methods, the environment’s free space is divided into small cells, and the
output is a sequence of adjacent cells that leads the initial position to the goal while avoiding obstacles.
Arney [78] used approximate cell decomposition, which does not set a fixed cell shape and size,
to navigate efficiently. Lingelbach [79] combined cell decomposition with probabilistic sampling to improve its
scalability. Cell decomposition approaches, however, suffer from limited granularity, combinatorial
explosion, and frequent generation of infeasible solutions [80]. In potential field methods, an attractive
potential is attached to goals while a repulsive one is attached to obstacles, and the robot follows
the gradient direction generated by the combination of those fields. Both Pradhan et al. [81] and
Kang et al. [82] showed that such an approach is suitable for multi-robot navigation. Potential fields,
however, suffer from often converging to local minima, trapping the robot midway [80].
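For reference, a single potential-field step might look like the following sketch; the gains k_att and k_rep and the influence distance d0 are illustrative values, not parameters taken from the cited works.

import numpy as np

def potential_field_step(robot, goal, obstacles, k_att=1.0, k_rep=100.0, d0=2.0, step=0.1):
    """One gradient step on the combined attractive/repulsive potential field."""
    robot, goal = np.asarray(robot, float), np.asarray(goal, float)
    force = k_att * (goal - robot)                      # attraction toward the goal
    for obs in obstacles:                               # repulsion from nearby obstacles
        diff = robot - np.asarray(obs, float)
        d = np.linalg.norm(diff)
        if 1e-6 < d < d0:
            force += k_rep * (1.0 / d - 1.0 / d0) * diff / d**3
    return robot + step * force / (np.linalg.norm(force) + 1e-9)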
Sampling-based algorithms have attracted considerable attention due to their ability to scale well while also being probabilistically complete, meaning that as the number of samples goes to infinity, the probability of finding a solution converges to 1. The most notable sampling-based methods (SBMs) are Probabilistic Roadmaps (PRM) and Rapidly-exploring Random Trees (RRT) [83]. Despite both being based on connecting randomly sampled points, the way the two algorithms connect those points differs considerably. PRM [84] maintains multiple graph expansions simultaneously and has been shown to perform well in high-dimensional spaces, while RRT [85] rapidly explores a single graph, making it more suitable for local planning on small maps. Karaman et al. [86] showed that, under mild conditions, the original PRM and RRT algorithms returned non-optimal values most of the time. The authors then
introduced asymptotically optimal versions of both methods, namely PRM* and RRT*. This advantage,
however, is overshadowed by their relatively slow convergence, which hinders their usage in real-time
scenarios. They also provided no theoretical guarantees for obtaining an optimal solution [87].
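The following minimal Python sketch shows the basic PRM mechanics discussed above (random sampling, neighbour connection, graph search); the collision checks and parameter values are assumptions for illustration, not the implementation evaluated later in this paper.

import heapq, math, random

def edge_is_free(a, b, free, steps=20):
    """Collision-check a straight segment by sampling intermediate points."""
    return all(free((a[0] + (b[0] - a[0]) * t / steps,
                     a[1] + (b[1] - a[1]) * t / steps)) for t in range(steps + 1))

def prm(free, bounds, start, goal, n_samples=200, radius=3.0):
    """Sample collision-free points, connect neighbours within `radius`,
    then run Dijkstra from start (node 0) to goal (node 1)."""
    (xmin, xmax), (ymin, ymax) = bounds
    nodes = [start, goal]
    while len(nodes) < n_samples + 2:
        p = (random.uniform(xmin, xmax), random.uniform(ymin, ymax))
        if free(p):
            nodes.append(p)
    dist = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
    edges = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            if dist(nodes[i], nodes[j]) <= radius and edge_is_free(nodes[i], nodes[j], free):
                edges[i].append(j)
                edges[j].append(i)
    best, prev, queue = {0: 0.0}, {}, [(0.0, 0)]
    while queue:
        d, u = heapq.heappop(queue)
        if u == 1:
            break
        if d > best.get(u, math.inf):
            continue
        for v in edges[u]:
            nd = d + dist(nodes[u], nodes[v])
            if nd < best.get(v, math.inf):
                best[v], prev[v] = nd, u
                heapq.heappush(queue, (nd, v))
    if 1 not in prev:
        return None                              # disconnected roadmap: no path found
    path, u = [nodes[1]], 1
    while u != 0:
        u = prev[u]
        path.append(nodes[u])
    return path[::-1]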
Figure 1. The proposed framework consists of semantic modeling framework (SMF) (blue blocks),
semantic information processing (SIP) module (orange blocks), and semantic autonomous navigation
(SAN) module (green blocks).
Each model of TOSM is described based on classes, object properties, and datatype properties,
which are terminologies defined in the OWL reference [94]. In the following tables, we explain the
environment modeling based on TOSM using the OWL terminologies.
We use classes to group resources with similar characteristics for providing an abstraction
mechanism. Table 1 lists examples of hierarchical classes for environmental elements. Object, Place, and Robot are defined as subclasses of EnvironmentalElement, and each class is defined as the superclass of the classes listed in its right-hand column (e.g., Door is a superclass of AutomaticDoor).
Subclasses inherit the characteristics of superclasses, and the relation between two classes can
be defined as “is-a.” For example, AutomaticDoor inherits all the characteristics of Door, Object,
and EnvironmentalElement; the relationship between AutomaticDoor and Door can be described as
“An AutomaticDoor is a Door.” The classes are used to determine domains and ranges of object
properties and datatype properties. Therefore, subclasses can define more specific properties than their superclasses; for example, AutomaticDoor describes Door in more detail. In the case of subclasses for
Robot, we use the robot description ontology model as defined in [95].
Table 1. Classes.

EnvironmentalElement
  Object
    Door: AutomaticDoor, HingedDoor, ElevatorDoor
    Sign: GuideSign, SafetySign
    Device: VendingMachine, WaterPurifier
    Furniture: Table, Chair
  Person: Occupant, Pedestrian, Driver
  Place
    IndoorPlace: Corridor, Doorway, Room, Elevator, Staircase, Floor, Cafe
    OutdoorPlace: Building, Road, Sidewalk
  Robot
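A minimal sketch of how the is-a hierarchy in Table 1 can be encoded and queried; the dictionary covers only part of the table and is purely illustrative, not the stored database format.

# Hypothetical encoding of part of Table 1 as parent links; is_a() walks the chain.
SUPERCLASS = {
    "AutomaticDoor": "Door", "HingedDoor": "Door", "ElevatorDoor": "Door",
    "Door": "Object", "Sign": "Object", "Device": "Object", "Furniture": "Object",
    "IndoorPlace": "Place", "OutdoorPlace": "Place",
    "Object": "EnvironmentalElement", "Person": "EnvironmentalElement",
    "Place": "EnvironmentalElement", "Robot": "EnvironmentalElement",
}

def is_a(cls, ancestor):
    """True when `cls` inherits from `ancestor` through the class hierarchy."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUPERCLASS.get(cls)
    return False

assert is_a("AutomaticDoor", "Door") and is_a("AutomaticDoor", "EnvironmentalElement")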
Object properties are used to represent relations between environmental elements corresponding
to the implicit model; in the OWL, object properties provide relations between individuals that
are members of the classes. The object properties can be defined using object property hierarchy,
descriptions, and characteristics. The object property hierarchy in Table 2 expresses the nodes
corresponding to the object properties in a tree-like structure; the right child node is a subproperty of
the left parent node. We use SubPropertyOf, domains, and ranges among the various descriptions of
object properties. SubPropertyOf is used to represent the subproperty relations between nodes in the
object property hierarchy (e.g., isFrontOf SubPropertyOf spatialRelation). The object properties can set
domains and ranges of each property using the classes. The individual using the property is inferred
as an instance of the domain and range classes. For example, if “CIR752 isLookingAt Table34”, and Robot and Object are the domain and range of isLookingAt, respectively, then CIR752 is inferred to be a Robot and Table34 an Object. We use the symmetric and transitive characteristics of the object properties. “The property is symmetric” means that the domain and range of the property can be exchanged. For example, “B isNextTo A” can be inferred from “A isNextTo B” because isNextTo is symmetric. In the case of isInsideOf,
the property is transitive because we can infer “A isInsideOf C” from “A isInsideOf B” and “B isInsideOf C”.
Table 2. Object properties.
We divide relations between objects, places, and robots into spatialRelation and robotRelation.
SpatialRelation means the spatial relations between objects and places that constitute an environment.
In particular, the properties, where Object specifies both domains and ranges, express relative position
relations between objects within places. The properties can be used by robots to infer the relations
between objects. Robots can also use a database consisting of isInsideOf and isConnectedTo to understand the environment similarly to the way a human does. We explain spatialRelation in detail
with an example in Section 3.1.2. robotRelation is used to define the properties that are determined by
the robot’s current state. When the robot is in operation, it uses isLookingAt and isLocatedAt properties
to comprehend the current state in real-time by associating it with objects and space stored in the
database. We explain robotRelation in detail in Section 3.2.
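A toy sketch of how the symmetric and transitive characteristics can be exploited by forward-chaining over (subject, predicate, object) triples; the triple-store representation is an assumption made for illustration only.

SYMMETRIC = {"isNextTo"}        # exchanging domain and range is valid
TRANSITIVE = {"isInsideOf"}     # A-B and B-C imply A-C

def infer(triples):
    """Expand (subject, predicate, object) triples with their symmetric/transitive closure."""
    facts = set(triples)
    changed = True
    while changed:
        new = set()
        for (s, p, o) in facts:
            if p in SYMMETRIC and (o, p, s) not in facts:
                new.add((o, p, s))
            if p in TRANSITIVE:
                for (s2, p2, o2) in facts:
                    if p2 == p and s2 == o and s != o2 and (s, p, o2) not in facts:
                        new.add((s, p, o2))
        changed = bool(new)
        facts |= new
    return facts

facts = infer({("table1", "isNextTo", "chair2"),
               ("chair2", "isInsideOf", "corridor2"),
               ("corridor2", "isInsideOf", "floor1")})
assert ("chair2", "isNextTo", "table1") in facts
assert ("chair2", "isInsideOf", "floor1") in facts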
We use datatype properties to define the TOSM of environmental elements; in the OWL,
data properties are relations between instances of classes and RDF literals or XML Schema datatypes.
The datatype property hierarchy, similar to object property hierarchy, represents the subproperty
relations between datatype properties. Additionally, classes determine domains of the datatype
properties, and datatypes determine ranges.
Table 3 lists definitions of the symbolic, explicit, and implicit model for Object. The symbolic model
consists of name and ID to symbolize individuals of Object. For the sake of clarity, we describe ranges of
pose, velocity, size, and color in the explicit model as above. The ranges of these properties are defined as
an array of the datatype (e.g., float array) in the database. isKeyObject in the implicit model is utilized
to represent crucial objects used for localization and place recognition. Furthermore, the properties,
whose domains are the subclasses of Object, are used to define a more specific model of each subclass.
In the case of Door, isOpen means whether it is open or not, and canBeOpen means whether it can be
opened if it is closed. Robots can use this information when planning and performing tasks.
The definitions of the TOSM for Place and Robot using datatype properties are shown in
Tables 4 and 5, respectively. The definitions of the symbolic models are the same as for Object. We select a boundary composed of polygons for the explicit model of Place. In the implicit model, complexity describes the state of Place. capability and purpose in the implicit model of Robot are the main components when building the on-demand database.
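As a rough illustration (not the actual database schema), the symbolic/explicit/implicit split for an Object individual could be mirrored by a container such as the following; field names follow Tables 3-5, and the default values are placeholders.

from dataclasses import dataclass, field
from typing import List

@dataclass
class ObjectModel:
    # symbolic model
    name: str
    id: str
    # explicit model (stored as float arrays in the database)
    pose: List[float] = field(default_factory=lambda: [0.0] * 6)
    velocity: List[float] = field(default_factory=lambda: [0.0] * 3)
    size: List[float] = field(default_factory=lambda: [0.0] * 3)
    color: List[float] = field(default_factory=lambda: [0.0] * 3)
    # implicit model
    isKeyObject: bool = False

@dataclass
class DoorModel(ObjectModel):
    # subclass-specific implicit properties
    isOpen: bool = False
    canBeOpen: bool = True

door = DoorModel(name="automaticDoor3", id="obj-0042", isKeyObject=True)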
We have selected the third version of YOLO [42] for object recognition because it presents a good
trade-off between execution time and accuracy, compared with other object detectors. For example,
its accuracy is similar to that of RetinaNet [96], yet it is four times faster. Moreover, it performs much better than SSD variants, and its Darknet-53 backbone network architecture is 1.5 times faster than ResNet-101 [42].
First, we train the state-of-the-art one-stage CNN-based model YOLOv3 [42] for object recognition
with our own data set. When a mobile robot navigates in the environment, visual data captured by
a camera is passed to the CNN model; it produces a sequence of layered activations, which generate
feature vectors that carry a deeply learned representation of high-level features. We perform a pre-processing step to remove noise before passing the visual data captured by the robot’s camera
to the CNN model. Both the raw-data and the CNN model are hosted by short-term memory (STM).
Raw-data does not have access to the on-demand database, while the CNN model, which holds the
recognition components, has access to the on-demand database. Object and Place models are initially
stored in this database. We download these semantically enriched models of recognized objects from
the on-demand database when required by the robot.
Our YOLO-based recognition model consists of two phases: the processing phase and the
retrieval phase.
The processing phase is also known as an offline or training stage in which we train the YOLOv3
model for object recognition using our dataset. The training phase involves three major steps: (i) dataset annotation: we label objects in each image using the YOLO-Mark tool; (ii) convolution weights: we use YOLOv3 weights pre-trained on ImageNet for the convolutional layers; (iii) configuration set-up: we prepare a model configuration file that loads 64 images per iteration during the training process. The model updates the weights by backpropagation after the gradients are calculated. Thus, for each iteration, our model uses a batch size of 64 with 16 mini-batches, so four images are loaded for each mini-batch. In the configuration file, we set the filter size to 60 and the maximum number of iterations to 30,000 for 15 classes. YOLOv3 has three YOLO layers; therefore, we generate nine anchors to recognize small, medium, and large objects at those three layers. Finally, when the
training phase is completed, we get new weights for our dataset.
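The configuration values quoted above are related by simple arithmetic, summarized in the following sanity-check sketch.

num_classes = 15
boxes_per_scale = 3                               # YOLOv3 predicts 3 boxes per grid cell
yolo_layers = 3                                   # small / medium / large object scales
total_anchors = boxes_per_scale * yolo_layers     # 9 anchors
filters = boxes_per_scale * (num_classes + 5)     # 5 = (x, y, w, h, objectness)
batch, mini_batches = 64, 16
images_per_mini_batch = batch // mini_batches     # 4 images per mini-batch
assert (total_anchors, filters, images_per_mini_batch) == (9, 60, 4)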
The retrieval phase is known as the online or testing stage. In this phase, the weights generated during model training are used to recognize objects in a real-world environment.
In our SIP module, after object recognition using the CNN model, inference for semantic
information is performed at run-time by integrating the recognized objects with their semantic
information that contains spatial relationships between objects, described in Table 2. Through these
spatial relationships among recognized objects, we infer the robot's relationship information by leveraging the semantic objects and spaces stored in the on-demand database as follows:
Assume x is an object recognized in the environment by the CNN model, while X is a subclass of EnvironmentalElement (e.g., Furniture, Door, or Corridor). Then the class of object x is defined by a class definition function IsA(x, X), which indicates that object x is of class X. As shown in Figure 4, there are two objects in the environment: a table and a chair. The class definition functions IsA(table, Table) and IsA(chair, Chair) state that the class of the table is Table and the class of the chair is Chair. When the class is defined, we can get the datatype properties of the object from the database based on the robot's pose and the relative pose of the object. In other words, Chair is specialized to chair2 (we call this Xn), an instance of Chair, by a database query such as the one illustrated in the gray table in Figure 4.
We also use a relationship inference function SR = rel(Xn, Xm) to define relationships between recognized objects, other objects, and places in the database. According to the class definition function, Xn indicates an instance of class X. SR describes a spatial relationship between Xn and Xm, such that “Xn isNextTo/isInsideOf/isInFrontOf Xm” (e.g., “table1 isNextTo chair2”). Similarly, we can infer the robot's relationship with the environment; in that case, Xn is an instance of Robot.
Thus, when the robot navigates in a real-world environment and finds the two objects, a table and a chair, shown in Figure 4, it recognizes them using the CNN model and infers these objects
with their semantic information that is saved in the database. As a result, we get spatial relationships
such as “table1 isNextTo chair2”, “table1 isInsideOf corridor2”, “robot isLookingAt chair2”, and “robot
isLocatedAt corridor2”.
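A compact sketch of this inference step, using one plausible rule for isLocatedAt consistent with the example above (the robot is located at the place that a recognized object is inside of); the triple representation is an assumption made for illustration.

def infer_robot_relations(robot_id, recognized_instances, facts):
    """Derive the robot's relations from recognized instances and stored triples.
    `facts` is a set of (subject, predicate, object) triples from the on-demand database."""
    inferred = set()
    for inst in recognized_instances:
        inferred.add((robot_id, "isLookingAt", inst))
        for (s, p, o) in facts:
            # the robot is located at whatever place a recognized object is inside of
            if s == inst and p == "isInsideOf":
                inferred.add((robot_id, "isLocatedAt", o))
    return inferred

facts = {("table1", "isNextTo", "chair2"), ("chair2", "isInsideOf", "corridor2")}
relations = infer_robot_relations("cir752", ["table1", "chair2"], facts)
# yields ("cir752", "isLookingAt", "chair2") and ("cir752", "isLocatedAt", "corridor2"), among others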
We can also see that the recognition model and the TOSM-based on-demand database share a human-like memory architecture when working together for visual semantic recognition. The SIP
module is processed in working memory (Short Term Memory, STM), while prior semantic knowledge
stored in the on-demand database is hosted by Long-Term Memory (LTM), as shown in Figure 5.
When the mobile robot navigates in a real-world environment, it sends the sensory data to SIP,
which passes the pre-processed data to the recognition model stored in working memory and receives periodic visual updates as the robot navigates.
used to feed the map data into the knowledge base. After that, the behavior sequence is generated
through the Partial Order Planning Forwards (POPF) planner [97]. Although the current behavior database does not include temporal actions, we plan to support them in the future. Therefore, we opted for the POPF planner due to its ability to deal with both simple and durative actions. ROSPLAN also sends the current behavior to the Behavior Planner via the Plan Dispatcher node.
Figure 6. The CIR robot is equipped with a continuous track and extension arms, enabling it to go up and down stairs. The 02 robot is equipped with sensors and manipulators that allow elevator operation.
In our example, the 02 robot shown in Figure 6b receives the multi-floor navigation mission.
The robot then receives the on-demand database generated by the SMF based on the robot’s implicit
information (i.e., its capabilities). This on-demand database contains the building floors and the
elevators, but no stairs, as the robot cannot traverse them. Moreover, a domain file containing all the possible robot actions is sent to the ROSPLAN knowledge base. An example action, which sends a robot to a waypoint on the same floor, is shown in Listing 1. The predicates are general and shared between all actions stored in the database, which makes it possible to generate the domain file simply by querying which actions the robot can perform. Subsequently, the environment TOSM data are used to populate the ROSPLAN knowledge base environment state, adding connected waypoints, elevator, robot, and goal predicates. After running the POPF planner, the behavior sequence shown in Listing 2 is generated and sent by the dispatcher node to the Behavior Planner module.
(:action goto_waypoint
  :parameters (?v - robot ?from ?to - waypoint ?fl - floor)
  :precondition (and
    (outside_elevator ?v)
    (robot_at ?v ?from)
    (waypoint_at_floor ?from ?fl)
    (waypoint_at_floor ?to ?fl)
    (hallway_waypoint ?from)
    (hallway_waypoint ?to)
    (connected_to ?from ?to)
  )
  :effect (and
    (robot_at ?v ?to)
    (not (robot_at ?v ?from))
  )
)
Figure 7. A Finite State Machine diagram showing a simplified version of the Enter Elevator behavior
stored on the Behavior Database.
meshes for common objects such as doors, chairs, and tables. If the robot encounters a real-world
object for which no 3D model is available, a placeholder is automatically generated using the perceived shape and color of the object. This can be seen in Figure 8, where the simulated door
uses a 3D mesh, while the vending-machine and the beverage storage are represented by automatically
generated shapes.
where r_completion, r_time, r_closer, and r_collision are positive integers defined trivially.
As the training step only needs to sample the experience tuples, it can be performed either on-board or using a central cloud service, provided that such a structure is accessible to the robot.
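For illustration, the two mechanisms that make this possible, an experience replay buffer and the Polyak (soft) update of the DDPG target networks, can be sketched as follows; the capacity, batch size, and tau are typical choices rather than the exact values used in our training.

import random
from collections import deque

class ReplayBuffer:
    """Stores (s, a, r, s', done) tuples; the learner only needs this interface,
    whether it runs on-board or on a central cloud service."""
    def __init__(self, capacity=100000):
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=64):
        return random.sample(self.buffer, batch_size)

def soft_update(target_params, online_params, tau=0.001):
    """Polyak averaging of the DDPG target-network parameters."""
    return [tau * w + (1.0 - tau) * t for w, t in zip(online_params, target_params)]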
Figure 9. The Deep Deterministic Policy Gradients (DDPG) structure used to approximate
an autonomous navigation policy.
4. Experiments
In this section, we present our experiments using the TOSM-based autonomous navigation
framework. The objective of the experiments is to demonstrate the framework’s validity and introduce
a possible use case.
The environmental database, illustrated in Figure 11, for the first floor in the convention center
consisted of 23 instances from three subclasses of Object and 16 instances from nine subclasses of Place.
Although there were many objects on the first floor, we selected only static objects that can serve as dominant features for recognizing the environment; we defined this property as isKeyObject in the implicit
model for objects. In the case of objects that were not key objects (e.g., pedestrians), we kept them only in short-term memory for the sake of memory efficiency. By doing so, the database consumed less than 0.1 MB of memory, even though it covered a large environment.
Second, the SAN module set the goals “inspected kdjHall1” and “robot_at cir752 base1” to divide the mission into six modular behaviors using ROSPlan, which moved the robot to the working area, performed the inspection, and returned to the robot base; the behaviors are described in Listings 1 and 3.
Each waypoint-following behavior can be expanded, as shown in Figure 12, where the PRM algorithm
used randomly sampled points to generate a 2D graph. This graph was traversed to generate the
shortest path to the goal. Afterward, each desired point in the graph was sent sequentially to the
Action planner, which then used the trained policy to navigate safely.
(:action inspect
  :parameters (?v - robot ?wp - waypoint ?fl - floor)
  :precondition (and
    (robot_at ?v ?wp)
    (waypoint_at_floor ?wp ?fl)
  )
  :effect (inspected ?wp)
)
Figure 12 describes the moment of executing the second behavior, “goto_waypoint cir752 c2 kdjHall1 fl1”, which meant that cir752 should go to kdjHall1 from c2 inside floor1. When the robot started executing the behavior, it received sensory inputs to recognize the environment. The SIP module recognized objects and places from the sensory inputs using the CNN model together with the semantic knowledge in the on-demand database. In this case, it could infer “cir752 isLookingAt column12, column16, and informationCenter1”, “cir752 isLocatedAt corridor2 that isInsideOf floor1”, and “column12 isNextTo column16”. In addition, it retrieved the datatype properties of each instance. Finally, when the robot successfully executed the final behavior, goto_waypoint base1, it accomplished the mission.
Place Name    Total Mapped Area (m²)    Free Space Ratio    Free Space Area (m²)
floor1        39,759.4                  0.123               4890.41
corridor1     13,632.0                  0.258               3517.06
corridor2     4672.0                    0.391               1826.75
corridor3     3124.0                    0.271               846.60
We chose the set of sample counts {50, 100, 200, 300, 400, 500} heuristically and ran our PRM algorithm 50 times for each n value on each map. The most important performance measures for our application were the time taken to build the PRM graph, the time taken to find the optimal path, the length of the final path, and whether or not the algorithm was able to find a path between N_p and N_g. The average values for each of those measurements are shown in Figure 13. While the
time taken to build the PRM graph grew quadratically with the number of samples, the time taken to
find the optimal path only grew linearly. As our approach generated the graph only when receiving a
new map, the effect of building the PRM graph can be mitigated. Smaller places showed higher times for both graph building and path finding when using the same number of samples as larger places,
because it became progressively harder for the algorithm to find free spaces when the map was nearly
filled with samples. In addition, graphs with a high density of nodes per square meter tended to have
a higher variety of possible paths, which could slow down the path search.
Additionally, we were able to show that the number of samples barely influenced the final path
length. However, it had a considerable influence on the ability to find a valid path. When the number of nodes was too small compared to the map size, some nodes could not find valid neighbors,
which led to disconnected graphs. This situation is not desirable when performing a mission, as it
would require the robot to re-generate the PRM graph whenever it encounters such an issue, which can lead to unexpected delays and even deadlocks.
Figure 13. Influence of the number of samples (x-axis) on the PRM algorithm: (a) graph build time (ms); (b) path search time (ms).
Aiming to select an optimal number of nodes based on the free space information, we performed
a linear regression while trying to minimize n subject to two constraints: the path search time should take no more than 15 ms, and the success rate should be above 90%. We obtained the following coefficients:
where ceil represents the rounding-up operation. Using Equation (1), the number of samples needed for each of the aforementioned maps is displayed in Table 7. We ran the algorithm 50 more times, this time using the optimal n values, and both constraints were satisfied when using the proposed formula.
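Assuming the fitted relation is linear in the free-space area, the selection rule takes the following shape; a and b stand for the regression coefficients of Equation (1) and are not reproduced here.

import math

def sample_count(free_space_area_m2, a, b):
    """n = ceil(a * free_space_area + b); a and b denote the regression coefficients
    of Equation (1), which are not reproduced in this sketch."""
    return math.ceil(a * free_space_area_m2 + b)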
Figure 14. Reward obtained by the robot on each training episode. The green line represents a moving average over a 100-episode window. The red line shows the cumulative average up to the current episode.
To demonstrate the advantage of our hybrid approach when compared to pure reinforcement
learning navigation methods, we performed the experiment shown in Figure 15. The robot started at the red arrow, and its goal was to reach the red circle within 10 min while avoiding collisions with any obstacle. The simulated environment was generated by the mental simulation building algorithm and was based on a narrow corridor sequence inside a university building. The total path length was 65.6 m. We performed two different experiments ten times each. The first used only the learned policy to navigate the robot directly to the goal. The second used the PRM algorithm
to generate a waypoint sequence such as the one displayed in Figure 15b, which would be used as goals
for the navigation policy. The results of this experiment are summarized in Table 8. It can be seen
that the hybrid approach arrived at the goal 40% faster than the navigation done purely by using
the learned policy. When the goal was too far away, the robot kept turning itself toward the goal point,
going in the direction of a wall, then turning back to avoid a collision. This behavior slowed down the
navigation and was less predictable than the path followed by the PRM-RL approach. Both methods
had two failures. The RL method timed out after going back and forth in the first corridor. This failure
happened depending on how close the robot was to the trashcan when approaching the corner. If the
robot was too close, the laser scanner would not be able to see the second corridor’s entrance, believe it
was a dead end, and turn back. In the case of the PRM-RL approach, the failures occurred because the PRM algorithm created a sample too close to the trashcan, which led to a collision. We believe this
could be solved by using a more robust sampling algorithm than the one currently used.
The experimental results show that our hybrid approach added an extra layer of robustness to
autonomous navigation based on reinforcement learning policies. It was able to perform a navigation
task 40% faster than the baseline while maintaining the success rate.
5. Conclusions
In this paper, we have introduced our cognitive science inspired navigation framework and its
three composing parts: semantic modeling framework (SMF), semantic information processing (SIP)
module, and semantic autonomous navigation (SAN) module. We also showed how we modeled
the environment employing the triplet ontological semantic model (TOSM), which allowed the three
modules mentioned above to utilize the same modeled data. The on-demand database allowed
different robots to share a common database by selectively querying the environment data based on
each robot's intrinsic capabilities. The SIP module performed object detection and localization using a deep learning-based approach, while inferring the robot's relationship information based on the semantic properties of recognized objects derived from the on-demand database. Moreover, the SAN
structure showed that a hybrid planning approach, which combines classical and sampling-based
planners with reinforcement learning algorithms, is able to mitigate each planner’s shortcomings
leading to a more robust planning algorithm. The experiments showed that our framework can be applied to real-world scenarios and that navigation tasks in large dynamic environments can be
performed seamlessly with the aid of TOSM data.
Immediate future work is focused on expanding the framework to other environments,
more specifically outdoor environments. A modeled outdoor environment can also be used on different
platforms, such as self-driving vehicles. Moreover, a large variety of TOSM semantic information will be used by the SIP module when performing object recognition and localization. In addition, we are
planning to add a large variety of possible behaviors to enable higher-level missions to be performed.
A more robust sampling algorithm can also be used to improve both the graph build time and the
success rate of navigation tasks. Finally, another line of research is using the mental simulation to
check the plan validity before its execution. For example, the feasibility of following a sequence of
points generated by the PRM algorithm can be tested on simulation. In case of failure, a new sampled
graph can be generated.
Author Contributions: Conceptualization, S.-H.J., S.M., Y.G.R., and T.-Y.K.; methodology, S.-H.J., S.M., and Y.G.R.;
software, S.-H.J., S.M., and Y.G.R.; validation, S.-H.J.; formal analysis, S.-H.J.; investigation, S.-H.B. and K.-H.L.;
data curation, S.-H.B. and K.-H.L.; writing—original draft preparation, S.-H.J., S.M., and Y.G.R.; writing—review
and editing, S.-H.J., S.M., and M.K.; visualization, S.-H.J. and S.M. All authors have read and agreed to the
published version of the manuscript.
Funding: This research was supported by Korea Evaluation Institute of Industrial Technology (KEIT) funded by
the Ministry of Trade, Industry and Energy (MOTIE) (No. 1415162366).
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Lowry, S.; Sünderhauf, N.; Newman, P.; Leonard, J.J.; Cox, D.; Corke, P.; Milford, M.J. Visual place
recognition: A survey. IEEE Trans. Robot. 2015, 32, 1–19. [CrossRef]
2. Alterovitz, R.; Koenig, S.; Likhachev, M. Robot planning in the real world: Research challenges and
opportunities. Ai Mag. 2016, 37, 76–84. [CrossRef]
3. Pendleton, S.D.; Andersen, H.; Du, X.; Shen, X.; Meghjani, M.; Eng, Y.H.; Rus, D.; Ang, M.H. Perception,
planning, control, and coordination for autonomous vehicles. Machines 2017, 5, 6. [CrossRef]
4. Gayathri, R.; Uma, V. Ontology based knowledge representation technique, domain modeling languages
and planners for robotic path planning: A survey. ICT Express 2018, 4, 69–74.
5. Nüchter, A.; Hertzberg, J. Towards semantic maps for mobile robots. Robot. Auton. Syst. 2008, 56, 915–926.
[CrossRef]
6. Ruiz-Sarmiento, J.R.; Galindo, C.; Gonzalez-Jimenez, J. Exploiting semantic knowledge for robot object
recognition. Knowl.-Based Syst. 2015, 86, 131–142. [CrossRef]
7. Galindo, C.; Fernández-Madrigal, J.A.; González, J.; Saffiotti, A. Using semantic information for improving
efficiency of robot task planning. In Proceedings of the ICRA Workshop: Semantic Information in Robotics,
Rome, Italy, 10 April 2007.
8. Abdi, L.; Meddeb, A. Semantic recognition: Unified framework for joint object detection and semantic
segmentation. In Proceedings of the Symposium on Applied Computing, Marrakech, Morocco,
3–7 April 2017; pp. 83–88.
9. Horikawa, T.; Kamitani, Y. Generic decoding of seen and imagined objects using hierarchical visual features.
Nat. Commun. 2017, 8, 1–15. [CrossRef]
10. Fox, M.; Long, D. PDDL2.1: An extension to PDDL for expressing temporal planning domains. J. Artif.
Intell. Res. 2003, 20, 61–124. [CrossRef]
11. Cashmore, M.; Fox, M.; Long, D.; Magazzeni, D.; Ridder, B.; Carrera, A.; Palomeras, N.; Hurtos, N.;
Carreras, M. Rosplan: Planning in the robot operating system. In Proceedings of the Twenty-Fifth
International Conference on Automated Planning and Scheduling, Jerusalem, Israel, 7–11 June 2015.
12. Lim, G.H.; Suh, I.H.; Suh, H. Ontology-based unified robot knowledge for service robots in indoor
environments. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 2010, 41, 492–509. [CrossRef]
13. Waibel, M.; Beetz, M.; D'Andrea, R.; Janssen, R.; Tenorth, M.; Civera, J.; Elfring, J.; Gálvez-López, D.;
Häussermann, K.; Montiel, J.; et al. RoboEarth-A world wide web for robots. Robot. Autom. Mag. 2011,
18, 69–82.
14. Joo, S.H.; Manzoor, S.; Rocha, Y.G.; Lee, H.U.; Kuc, T.Y. A realtime autonomous robot navigation framework
for human like high-level interaction and task planning in global dynamic environment. arXiv 2019,
arXiv:1905.12942.
15. Schlenoff, C.; Prestes, E.; Madhavan, R.; Goncalves, P.; Li, H.; Balakirsky, S.; Kramer, T.; Miguelanez, E.
An IEEE standard ontology for robotics and automation. In Proceedings of the 2012 IEEE/RSJ International
Conference on Intelligent Robots and Systems, Vilamoura, Portugal, 7–12 October 2012; pp. 1337–1342.
16. Suh, I.H.; Lim, G.H.; Hwang, W.; Suh, H.; Choi, J.H.; Park, Y.T. Ontology-based multi-layered robot
knowledge framework (OMRKF) for robot intelligence. In Proceedings of the 2007 IEEE/RSJ International
Conference on Intelligent Robots and Systems, San Diego, CA, USA, 29 October–2 November 2007;
pp. 429–436.
17. Bratko, I. Prolog Programming for Artificial Intelligence; Pearson Education: Toronto, ON, Canada, 2011.
18. Tenorth, M.; Perzylo, A.C.; Lafrenz, R.; Beetz, M. Representation and exchange of knowledge about actions,
objects, and environments in the roboearth framework. IEEE Trans. Autom. Sci. Eng. 2013, 10, 643–651.
[CrossRef]
19. Tenorth, M.; Beetz, M. KnowRob: A knowledge processing infrastructure for cognition-enabled robots. Int. J.
Robot. Res. 2013, 32, 566–590. [CrossRef]
20. Thrun, S.; Montemerlo, M. The graph SLAM algorithm with applications to large-scale mapping of urban
structures. Int. J. Robot. Res. 2006, 25, 403–429. [CrossRef]
21. Grisetti, G.; Stachniss, C.; Burgard, W. Improved techniques for grid mapping with rao-blackwellized
particle filters. IEEE Trans. Robot. 2007, 23, 34–46. [CrossRef]
22. Murphy, K.; Russell, S. Rao-Blackwellised particle filtering for dynamic Bayesian networks. In Sequential
Monte Carlo Methods in Practice; Springer: New York, NY, USA, 2001; pp. 499–515.
23. Zhang, J.; Singh, S. LOAM: Lidar Odometry and Mapping in Real-time. In Proceedings of the Robotics:
Science and Systems Conference, Berkeley, CA, USA, 12–16 July 2014.
24. Klein, G.; Murray, D. Parallel tracking and mapping for small AR workspaces. In Proceedings of
the 2007 6th IEEE and ACM International Symposium on Mixed and Augmented Reality, Nara, Japan,
13–16 November 2007; pp. 225–234.
25. Mur-Artal, R.; Montiel, J.M.M.; Tardos, J.D. ORB-SLAM: A versatile and accurate monocular SLAM system.
IEEE Trans. Robot. 2015, 31, 1147–1163. [CrossRef]
26. Pronobis, A.; Jensfelt, P. Large-scale semantic mapping and reasoning with heterogeneous modalities.
In Proceedings of the 2012 IEEE International Conference on Robotics and Automation, Saint Paul, MN,
USA, 14–18 May 2012; pp. 3515–3522.
27. Kostavelis, I.; Gasteratos, A. Semantic mapping for mobile robotics tasks: A survey. Robot. Auton. Syst. 2015,
66, 86–103. [CrossRef]
28. McCormac, J.; Handa, A.; Davison, A.; Leutenegger, S. Semanticfusion: Dense 3d semantic mapping with
convolutional neural networks. In Proceedings of the 2017 IEEE International Conference on Robotics and
automation (ICRA), Singapore, Singapore, 29 May–3 June 2017; pp. 4628–4635.
29. Yang, S.; Huang, Y.; Scherer, S. Semantic 3D occupancy mapping through efficient high order CRFs.
In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS),
Vancouver, BC, Canada, 24–28 September 2017; pp. 590–597.
30. Nakajima, Y.; Tateno, K.; Tombari, F.; Saito, H. Fast and accurate semantic mapping through geometric-based
incremental segmentation. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 385–392.
31. Xiao, L.; Wang, J.; Qiu, X.; Rong, Z.; Zou, X. Dynamic-SLAM: Semantic monocular visual localization and
mapping based on deep learning in dynamic environment. Robot. Auton. Syst. 2019, 117, 1–16. [CrossRef]
32. Riazuelo, L.; Tenorth, M.; Di Marco, D.; Salas, M.; Gálvez-López, D.; Mösenlechner, L.; Kunze, L.; Beetz, M.;
Tardós, J.D.; Montano, L.; et al. RoboEarth semantic mapping: A cloud enabled knowledge-based approach.
IEEE Trans. Autom. Sci. Eng. 2015, 12, 432–443. [CrossRef]
33. Tenorth, M.; Perzylo, A.C.; Lafrenz, R.; Beetz, M. The roboearth language: Representing and exchanging
knowledge about actions, objects, and environments. In Proceedings of the 2012 IEEE International
Conference on Robotics and Automation, Saint Paul, MN, USA, 14–18 May 2012; pp. 1284–1289.
34. Johnston, B.; Yang, F.; Mendoza, R.; Chen, X.; Williams, M.A. Ontology based object categorization for
robots. In International Conference on Practical Aspects of Knowledge Management; Springer: Berlin/Heidelberg,
Germany, 2008; pp. 219–231.
35. Tenorth, M.; Kunze, L.; Jain, D.; Beetz, M. Knowrob-map-knowledge-linked semantic object maps.
In Proceedings of the 2010 10th IEEE-RAS International Conference on Humanoid Robots, Nashville,
TN, USA, 6–8 December 2010; pp. 430–435.
36. Crespo, J.; Barber, R.; Mozos, O. Relational model for robotic semantic navigation in indoor environments.
J. Intell. Robot. Syst. 2017, 86, 617–639. [CrossRef]
37. Jiao, L.; Zhang, F.; Liu, F.; Yang, S.; Li, L.; Feng, Z.; Qu, R. A Survey of Deep Learning-Based Object Detection.
IEEE Access 2019, 7, 128837–128868. [CrossRef]
38. Zou, Z.; Shi, Z.; Guo, Y.; Ye, J. Object detection in 20 years: A survey. arXiv 2019, arXiv:1905.05055.
39. Roldan, S.M. Object recognition in mental representations: Directions for exploring diagnostic features
through visual mental imagery. Front. Psychol. 2017, 8, 833. [CrossRef] [PubMed]
40. Luo, Y.; Gao, Y.; Liu, L.; Huang, X. A novel object recognition system for cognitive robot. In Proceedings of
the 2012 IEEE International Conference on Information and Automation, Shenyang, China, 6–8 June 2012;
pp. 680–685.
41. Cichy, R.M.; Kaiser, D. Deep neural networks as scientific models. Trends Cogn. Sci. 2019, 23, 305–317.
[CrossRef]
42. Redmon, J.; Farhadi, A. Yolov3: An incremental improvement. arXiv 2018, arXiv:1804.02767.
43. Weiss, I. A Dual-hierarchy Semantic Graph for Robust Object Recognition. arXiv 2019, arXiv:1909.06867.
44. Zhang, H.; Xu, T.; Elhoseiny, M.; Huang, X.; Zhang, S.; Elgammal, A.; Metaxas, D. Spda-cnn: Unifying
semantic part detection and abstraction for fine-grained recognition. In Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1143–1152.
45. Wang, J.; Xie, C.; Zhang, Z.; Zhu, J.; Xie, L.; Yuille, A. Detecting semantic parts on partially occluded objects.
arXiv 2017, arXiv:1707.07819.
46. Choi, J.H.; Park, Y.T.; Lim, G.H.; Lee, S. Ontology-Based Semantic Context Modeling for Object Recognition
of Intelligent Mobile Robots. In Recent Progress in Robotics: Viable Robotic Service to Human; Springer:
Berlin/Heidelberg, Germany, 2007; pp. 399–408.
47. Maillot, N.E.; Thonnat, M. Ontology based complex object recognition. Image Vis. Comput. 2008, 26, 102–113.
[CrossRef]
48. Allani, O.; Zghal, H.B.; Mellouli, N.; Akdag, H. A knowledge-based image retrieval system integrating
semantic and visual features. Procedia Comput. Sci. 2016, 96, 1428–1436. [CrossRef]
49. Lee, S.; Naguib, A.M.; Islam, N.U. 3D deep object recognition and semantic understanding for
visually-guided robotic service. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 903–910.
50. Chen, Y.; Gan, W.; Zhang, L.; Liu, C.; Wang, X. A Survey on Visual Place Recognition for Mobile Robots
Localization. In Proceedings of the 2017 14th Web Information Systems and Applications Conference (WISA),
Liuzhou, China, 11–12 November 2017; pp. 187–192.
51. Schönberger, J.L.; Pollefeys, M.; Geiger, A.; Sattler, T. Semantic visual localization. In Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018;
pp. 6896–6906.
52. Garg, S.; Suenderhauf, N.; Milford, M. Lost? appearance-invariant place recognition for opposite viewpoints
using visual semantics. arXiv 2018, arXiv:1804.05526.
53. Chen, Z.; Jacobson, A.; Sünderhauf, N.; Upcroft, B.; Liu, L.; Shen, C.; Reid, I.; Milford, M. Deep learning
features at scale for visual place recognition. In Proceedings of the 2017 IEEE International Conference on
Robotics and Automation (ICRA), Singapore, Singapore, 29 May–3 June 2017; pp. 3223–3230.
54. Gomez-Ojeda, R.; Lopez-Antequera, M.; Petkov, N.; Gonzalez-Jimenez, J. Training a convolutional neural
network for appearance-invariant place recognition. arXiv 2015, arXiv:1505.07428.
55. Mitsuhashi, M.; Kuroda, Y. Mobile robot localization using place recognition in outdoor environments with
similar scenes. In Proceedings of the 2011 IEEE/ASME International Conference on Advanced Intelligent
Mechatronics (AIM), Budapest, Hungary, 3–7 July 2011; pp. 930–935.
56. McManus, C.; Upcroft, B.; Newman, P. Learning place-dependant features for long-term vision-based
localisation. Auton. Robot. 2015, 39, 363–387. [CrossRef]
57. Zhu, J.; Li, Q.; Cao, R.; Sun, K.; Liu, T.; Garibaldi, J.M.; Li, Q.; Liu, B.; Qiu, G. Indoor topological localization
using a visual landmark sequence. Remote. Sens. 2019, 11, 73. [CrossRef]
58. Zhong, X.; Zhou, Y.; Liu, H. Design and recognition of artificial landmarks for reliable indoor self-localization
of mobile robots. Int. J. Adv. Robot. Syst. 2017, 14, 1729881417693489. [CrossRef]
59. Simmons, R.; Goodwin, R.; Haigh, K.Z.; Koenig, S.; O’Sullivan, J. A layered architecture for office delivery
robots. In Proceedings of the first international conference on Autonomous Agents, Marina del Rey, CA,
USA, 5–8 February 1997; pp. 245–252.
60. Choset, H.M.; Hutchinson, S.; Lynch, K.M.; Kantor, G.; Burgard, W.; Kavraki, L.E.; Thrun, S. Principles of
Robot Motion: Theory, Algorithms, and Implementation; MIT Press: Boston, MA, USA, 2005.
61. Buehler, M.; Iagnemma, K.; Singh, S. The DARPA Urban Challenge: Autonomous Vehicles in City Traffic; Springer:
New York, NY, USA, 2009; Volume 56.
62. Mac, T.T.; Copot, C.; Tran, D.T.; De Keyser, R. Heuristic approaches in robot path planning: A survey. Robot.
Auton. Syst. 2016, 86, 13–28. [CrossRef]
63. Muñoz, P.; R-Moreno, M.D.; Barrero, D.F. Unified framework for path-planning and task-planning for
autonomous robots. Robot. Auton. Syst. 2016, 82, 1–14. [CrossRef]
64. Fikes, R.E.; Nilsson, N.J. STRIPS: A new approach to the application of theorem proving to problem solving.
Artif. Intell. 1971, 2, 189–208. [CrossRef]
65. Nilsson, N.J. Shakey the Robot; Technical Report; SRI International: Menlo Park, CA, USA, 1984.
66. Estivill-Castro, V.; Ferrer-Mestres, J. Path-finding in dynamic environments with PDDL-planners.
In Proceedings of the 2013 16th International Conference on Advanced Robotics, ICAR 2013, Montevideo,
Uruguay, 25–29 November 2013; pp. 1–7. [CrossRef]
67. Dvorak, F.; Bit-Monnot, A.; Ingrand, F.; Ghallab, M. A Flexible ANML Actor and Planner in Robotics.
In Proceedings of the Planning and Robotics (PlanRob) Workshop (ICAPS), Portsmouth, NH, USA,
21–26 June 2014.
68. Smith, D.E.; Frank, J.; Cushing, W. The ANML language. In Proceedings of the ICAPS-08 Workshop on
Knowledge Engineering for Planning and Scheduling (KEPS), Sydney, Australia, 14–18 September 2008.
69. Boutilier, C.; Dearden, R. Using abstractions for decision-theoretic planning with time constraints.
In Proceedings of the AAAI, Seattle, WA, USA, 31 July–4 August 1994; pp. 1016–1022.
70. Dean, T.L.; Kaelbling, L.P.; Kirman, J.; Nicholson, A.E. Planning With Deadlines in Stochastic Domains.
In Proceedings of the AAAI, Washington, DC, USA, 11–15 July 1993; Volume 93, pp. 574–579.
71. Galindo, C.; Fernández-Madrigal, J.A.; González, J.; Saffiotti, A. Robot task planning using semantic maps.
Robot. Auton. Syst. 2008, 56, 955–966. [CrossRef]
72. Beetz, M.; Beßler, D.; Haidu, A.; Pomarlan, M.; Bozcuoğlu, A.K.; Bartels, G. KnowRob 2.0—A 2nd
generation knowledge processing framework for cognition-enabled robotic agents. In Proceedings of the 2018
IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, 21–25 May 2018;
pp. 512–519.
73. Beßler, D.; Pomarlan, M.; Beetz, M. OWL-enabled assembly planning for robotic agents. In Proceedings of
the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS, Stockholm,
Sweden, 10–15 July 2018; Volume 3, pp. 1684–1692.
74. Lau, M.; Kuffner, J.J. Behavior planning for character animation. In Proceedings of the 2005 ACM
SIGGRAPH/Eurographics Symposium on Computer Animation, Los Angeles, CA, USA, 29–31 July 2005;
pp. 271–280.
75. Koo, T.; Hoffmann, F.; Shim, H.; Sinopoli, B.; Sastry, S. Hybrid control of an autonomous helicopter.
In Proceedings of the IFAC Workshop on Motion Control, Grenoble, France, 21–23 September 1998;
pp. 285–290.
76. Shi, X.; Wang, F.Y.; Lever, P. Experimental results of robotic excavation using fuzzy behavior control.
Control Eng. Pract. 1996, 4, 145–152. [CrossRef]
77. Wei, J.; Snider, J.M.; Gu, T.; Dolan, J.M.; Litkouhi, B. A behavioral planning framework for autonomous
driving. In Proceedings of the 2014 IEEE Intelligent Vehicles Symposium Proceedings, Dearborn, MI, USA,
8–11 June 2014; pp. 458–464.
78. Arney, T. An efficient solution to autonomous path planning by approximate cell decomposition.
In Proceedings of the 2007 Third International Conference on Information and Automation for Sustainability,
Melbourne, Australia, 4–6 December 2007; pp. 88–93.
79. Lingelbach, F. Path planning using probabilistic cell decomposition. In Proceedings of the 2004 IEEE
International Conference on Robotics and Automation (ICRA '04), New Orleans, LA, USA,
26 April–1 May 2004; Volume 1, pp. 467–472.
80. Šeda, M. Roadmap methods vs. cell decomposition in robot motion planning. In Proceedings of the 6th
WSEAS International Conference on Signal Processing, Robotics and Automation, Corfu Island, Greece,
16–19 February 2007; pp. 127–132.
81. Pradhan, S.K.; Parhi, D.R.; Panda, A.K.; Behera, R.K. Potential field method to navigate several mobile
robots. Appl. Intell. 2006, 25, 321–333. [CrossRef]
82. Kang, Y.H.; Lee, M.C.; Kim, C.Y.; Yoon, S.M.; Noh, C.B. A study of cluster robots line formatted navigation
using potential field method. In Proceedings of the 2011 IEEE International Conference on Mechatronics
and Automation, Beijing, China, 7–10 August 2011; pp. 1723–1728.
83. Lee, J.; Kwon, O.; Zhang, L.; Yoon, S.E. A selective retraction-based RRT planner for various environments.
IEEE Trans. Robot. 2014, 30, 1002–1011. [CrossRef]
84. Kavraki, L.; Latombe, J.C. Randomized preprocessing of configuration space for fast path planning. In Proceedings
of the 1994 IEEE International Conference on Robotics and Automation, San Diego, CA, USA, 8–13 May 1994;
pp. 2138–2145.
85. LaValle, S.M. Rapidly-Exploring Random Trees: A New Tool for Path Planning; Research Report 98-11; Department
of Computer Science, Iowa State University: Ames, IA, USA, 1998.
86. Karaman, S.; Frazzoli, E. Sampling-based algorithms for optimal motion planning. Int. J. Robot. Res. 2011,
30, 846–894. [CrossRef]
87. Elbanhawi, M.; Simic, M. Sampling-based robot motion planning: A review. IEEE Access 2014, 2, 56–77.
[CrossRef]
88. Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.;
Fidjeland, A.K.; Ostrovski, G.; et al. Human-level control through deep reinforcement learning. Nature 2015,
518, 529–533. [CrossRef]
89. Shah, P.; Fiser, M.; Faust, A.; Kew, J.C.; Hakkani-Tur, D. Follownet: Robot navigation by following natural
language directions with deep reinforcement learning. arXiv 2018, arXiv:1805.06150.
90. Lillicrap, T.P.; Hunt, J.J.; Pritzel, A.; Heess, N.; Erez, T.; Tassa, Y.; Silver, D.; Wierstra, D. Continuous control
with deep reinforcement learning. arXiv 2015, arXiv:1509.02971.
91. Tai, L.; Paolo, G.; Liu, M. Virtual-to-real deep reinforcement learning: Continuous control of mobile robots
for mapless navigation. In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots
and Systems (IROS), Vancouver, BC, Canada, 24–28 September 2017; pp. 31–36.
92. Faust, A.; Oslund, K.; Ramirez, O.; Francis, A.; Tapia, L.; Fiser, M.; Davidson, J. PRM-RL: Long-range robotic
navigation tasks by combining reinforcement learning and sampling-based planning. In Proceedings
of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia,
21–25 May 2018; pp. 5113–5120.
93. Gabrieli, J.D. Cognitive neuroscience of human memory. Annu. Rev. Psychol. 1998, 49, 87–115. [CrossRef]
94. Bechhofer, S.; Van Harmelen, F.; Hendler, J.; Horrocks, I.; McGuinness, D.L.; Patel-Schneider, P.F.; Stein, L.A.
OWL Web Ontology Language Reference; W3C Recommendation, 10 February 2004.
95. Rocha, Y.G.; Joo, S.H.; Kim, E.J.; Kuc, T.Y. Automatic Generation of a Simulated Robot from
an Ontology-Based Semantic Description. In Proceedings of the 2019 19th International Conference on
Control, Automation and Systems (ICCAS), Jeju, Korea, 15–18 October 2019; pp. 1340–1343. [CrossRef]
96. Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the
IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988.
97. Coles, A.J.; Coles, A.I.; Fox, M.; Long, D. Forward-chaining partial-order planning. In Proceedings of
the Twentieth International Conference on Automated Planning and Scheduling, Toronto, ON, Canada,
12–16 May 2010.
98. Dijkstra, E.W. A note on two problems in connexion with graphs. Numer. Math. 1959, 1, 269–271. [CrossRef]
99. Kahneman, D.; Tversky, A. The Simulation Heuristic; Technical Report; Department of Psychology,
Stanford University: Stanford, CA, USA, 1981.
100. Rocha, Y.G.; Kuc, T.Y. Mental simulation for autonomous learning and planning based on triplet ontological
semantic model. CEUR Workshop Proc. 2019, 2487, 65–73.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).